Re: [HACKERS] Re: Doc patch making firm recommendation for setting the value of commit_delay

2013-03-18 Thread Bruce Momjian
On Fri, Mar 15, 2013 at 05:47:30PM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  I'm marking this patch Ready for Committer, qualified with a recommendation to
  adopt only the wal.sgml changes.
 
 I've committed this along with some further wordsmithing.  I kept
 Peter's change to pg_test_fsync's default -s value; I've always felt
 that 2 seconds was laughably small.  It might be all right for very
 quick-and-dirty tests, but as a default value, it seems like a poor
 choice, because it's at the very bottom of the credible range of
 choices.

Agreed, 2 seconds was at the bottom.  The old behavior was very slow so
I went low.  Now that we are using it, 5 secs makes sense.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Doc patch making firm recommendation for setting the value of commit_delay

2013-03-15 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 I'm marking this patch Ready for Committer, qualified with a recommendation to
 adopt only the wal.sgml changes.

I've committed this along with some further wordsmithing.  I kept
Peter's change to pg_test_fsync's default -s value; I've always felt
that 2 seconds was laughably small.  It might be all right for very
quick-and-dirty tests, but as a default value, it seems like a poor
choice, because it's at the very bottom of the credible range of
choices.

regards, tom lane


Re: [HACKERS] Re: Doc patch making firm recommendation for setting the value of commit_delay

2013-01-28 Thread Noah Misch
On Mon, Jan 28, 2013 at 04:48:56AM +, Peter Geoghegan wrote:
 On 28 January 2013 03:34, Noah Misch n...@leadboat.com wrote:
  Would you commit to the same git repository the pgbench-tools data for the
  graphs appearing in that blog post?  I couldn't readily tell what was
  happening below 16 clients due to the graphed data points blending together.
 
 I'm afraid that I no longer have that data. Of course, I could fairly
 easily recreate it, but I don't think I'll have time tomorrow. Is it
 important? Are you interested in both the insert and tpc-b cases?

No need, then.


Re: [HACKERS] Re: Doc patch making firm recommendation for setting the value of commit_delay

2013-01-28 Thread Noah Misch
On Mon, Jan 28, 2013 at 04:29:12AM +, Peter Geoghegan wrote:
 On 28 January 2013 03:34, Noah Misch n...@leadboat.com wrote:
  On the EBS configuration with volatile fsync timings, the variability didn't
  go away with 15s runs.  On systems with stable fsync times, 15s was no better
  than 2s.  Absent some particular reason to believe 5s is better than 2s, I
  would leave it alone.
 
 I'm not recommending doing so because I thought you'd be likely to get
 better numbers on EBS; obviously the variability you saw there likely
 had a lot to do with the fact that the underlying physical machines
 have multiple tenants. It has just been my observation that more
 consistent figures can be obtained (on my laptop) by using a
 pg_test_fsync --secs-per-test of about 5. That being the case, why
 take the chance with 2 seconds?

I can't get too excited about it either way.

 It isn't as if people run
 pg_test_fsync every day, or that they cannot set --secs-per-test to
 whatever they like themselves. On the other hand, the cost of setting
 it too low could be quite high now, because the absolute values (and
 not just how different wal_sync_methods compare) are now important.

True.  You'd actually want to run the tool with a short interval to select a
wal_sync_method, then test the chosen method for a longer period to get an
accurate reading for commit_delay.
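
A minimal sketch of the arithmetic behind that second step, assuming the "half of fsync time" rule of thumb used elsewhere in this thread and an ops/sec figure taken from a longer pg_test_fsync run. The function name and the sample reading are hypothetical; this is an illustration, not part of the patch.

```python
# commit_delay is expressed in microseconds and accepts values from 0 to 100000.
MAX_COMMIT_DELAY_US = 100_000


def commit_delay_from_flush_rate(ops_per_sec: float) -> int:
    """Return a candidate commit_delay (microseconds): half the mean flush time."""
    if ops_per_sec <= 0:
        raise ValueError("ops_per_sec must be positive")
    mean_flush_us = 1_000_000 / ops_per_sec
    return min(round(mean_flush_us / 2), MAX_COMMIT_DELAY_US)


# Hypothetical reading: a flush that takes about 70 ms (roughly 14.3 ops/sec).
print(commit_delay_from_flush_rate(1_000_000 / 70_000))
```

The result is only a starting point; the benchmarks in this thread suggest verifying it against the actual workload.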


Re: [HACKERS] Re: Doc patch making firm recommendation for setting the value of commit_delay

2013-01-27 Thread Peter Geoghegan
Hi Noah,

On 27 January 2013 02:31, Noah Misch n...@leadboat.com wrote:
 I did a few more benchmarks along the spectrum.

 So that's a nice 27-53% improvement, fairly similar to the pattern for your
 laptop pgbench numbers.

I presume that this applies to a tpc-b benchmark (the pgbench
default). Note that the really compelling numbers that I reported in
that blog post (where there is an increase of over 80% in transaction
throughput at lower client counts) occur with an insert-based
benchmark (i.e. a maximally commit-bound workload).

 Next, based on your comment about the possible value
 for cloud-hosted applications

 -clients-   -tps@commit_delay=0-   -tps@commit_delay=500-
 32          1224,1391,1584         1175,1229,1394
 64          1553,1647,1673         1544,1546,1632
 128         1717,1833,1900         1621,1720,1951
 256         1664,1717,1918         1734,1832,1918

 The numbers are all over the place, but there's more loss than gain.

I suspected that the latency of cloud storage might be relatively
poor. Since that is evidently not actually the case with Amazon EBS,
it makes sense that commit_delay isn't compelling there. I am not
disputing whether or not Amazon EBS should be considered
representative of such systems in general - I'm sure that it should
be.

 There was no appreciable
 performance advantage from setting commit_delay=0 as opposed to relying on
 commit_siblings to suppress the delay.  That's good news.

Thank you for doing that research; I had checked for myself that the fastpath
in MinimumActiveBackends() works well, but it's useful to have my
findings verified.

 On the GNU/Linux VM, pg_sleep() achieves precision on the order of 10us.
 However, the sleep was consistently around 70us longer than requested.  A
 300us request yielded a 370us sleep, and a 3000us request gave a 3080us sleep.
 Mac OS X was similarly precise for short sleeps, but it could oversleep a full
 1000us on a 35000us sleep.

Ugh.

 The beginning of this paragraph still says commit_delay causes a delay just
 before a synchronous commit attempts to flush WAL to disk.  Since it now
 applies to every WAL flush, that should be updated.

Agreed.

 There's a similar problem at the beginning of this paragraph; it says
 specifically, The commit_delay parameter defines for how many microseconds
 the server process will sleep after writing a commit record to the log with
 LogInsert but before performing a LogFlush.

Right.

 As a side note, if we're ever going to recommend a fire-and-forget method for
 setting commit_delay, it may be worth detecting whether the host sleep
 granularity is limited like this.  Setting commit_delay = 20 for your SSD and
 silently getting commit_delay = 10000 would make for an unpleasant surprise.

Yes, it would. Note on possible oversleeping added.

 !   <para>
 !    Since the purpose of <varname>commit_delay</varname> is to allow
 !    the cost of each flush operation to be more effectively amortized
 !    across concurrently committing transactions (potentially at the
 !    expense of transaction latency), it is necessary to quantify that
 !    cost when altering the setting.  The higher that cost is, the more
 !    effective <varname>commit_delay</varname> is expected to be in
 !    increasing transaction throughput.  The

 That's true for spinning disks, but I suspect it does not hold for storage
 with internal parallelism, notably virtualized storage.  Consider an iSCSI
 configuration with high bandwidth and high latency.  When network latency is
 the limiting factor, will sending larger requests less often still help?

Well, I don't like to speculate about things like that, because it's
just too easy to be wrong. That said, it doesn't immediately occur to
me why the statement that you've highlighted wouldn't be true of
virtualised storage that has the characteristics you describe. Any
kind of latency at flush time means that clients idle, which means
that the CPU is potentially not kept fully busy for a greater amount
of wall time, where it might otherwise be kept more busy.

 One would be foolish to run a performance-sensitive workload like those in
 question, including the choice to have synchronous_commit=on, on spinning
 disks with no battery-backed write cache.  A cloud environment is more
 credible, but my benchmark showed no gain there.

In an everyday sense you are correct. It would typically be fairly
senseless to run an application that was severely limited by
transaction throughput like this, when a battery-backed cache could be
used at the cost of a couple of hundred dollars. However, it's quite
possible to imagine a scenario in which the economics favoured using
commit_delay instead. For example, I am aware that at Facebook, a
similar Facebook-flavoured-MySQL setting (sync_binlog_timeout_usecs)
is used. Furthermore, it might not be obvious that fsync speed is an
issue in practice. Setting commit_delay to 4,000 has seemingly no
downside on my laptop - it *positively* affects 

Re: [HACKERS] Re: Doc patch making firm recommendation for setting the value of commit_delay

2013-01-27 Thread Noah Misch
On Mon, Jan 28, 2013 at 12:16:24AM +, Peter Geoghegan wrote:
 On 27 January 2013 02:31, Noah Misch n...@leadboat.com wrote:
  I did a few more benchmarks along the spectrum.
 
  So that's a nice 27-53% improvement, fairly similar to the pattern for your
  laptop pgbench numbers.
 
 I presume that this applies to a tpc-b benchmark (the pgbench
 default). Note that the really compelling numbers that I reported in
 that blog post (where there is an increase of over 80% in transaction
 throughput at lower client counts) occur with an insert-based
 benchmark (i.e. a maximally commit-bound workload).

Correct.  The pgbench default workload is already rather friendly toward
commit_delay, so I wanted to stay away from even-friendlier tests.

Would you commit to the same git repository the pgbench-tools data for the
graphs appearing in that blog post?  I couldn't readily tell what was
happening below 16 clients due to the graphed data points blending together.

  !   <para>
  !    Since the purpose of <varname>commit_delay</varname> is to allow
  !    the cost of each flush operation to be more effectively amortized
  !    across concurrently committing transactions (potentially at the
  !    expense of transaction latency), it is necessary to quantify that
  !    cost when altering the setting.  The higher that cost is, the more
  !    effective <varname>commit_delay</varname> is expected to be in
  !    increasing transaction throughput.  The
 
  That's true for spinning disks, but I suspect it does not hold for storage
  with internal parallelism, notably virtualized storage.  Consider an iSCSI
  configuration with high bandwidth and high latency.  When network latency is
  the limiting factor, will sending larger requests less often still help?
 
 Well, I don't like to speculate about things like that, because it's
 just too easy to be wrong. That said, it doesn't immediately occur to
 me why the statement that you've highlighted wouldn't be true of
 virtualised storage that has the characteristics you describe. Any
 kind of latency at flush time means that clients idle, which means
 that the CPU is potentially not kept fully busy for a greater amount
 of wall time, where it might otherwise be kept more busy.

On further reflection, I retract the comment.  Regardless of internal
parallelism of the storage, PostgreSQL issues WAL fsyncs serially.

  One would be foolish to run a performance-sensitive workload like those in
  question, including the choice to have synchronous_commit=on, on spinning
  disks with no battery-backed write cache.  A cloud environment is more
  credible, but my benchmark showed no gain there.
 
 In an everyday sense you are correct. It would typically be fairly
 senseless to run an application that was severely limited by
 transaction throughput like this, when a battery-backed cache could be
 used at the cost of a couple of hundred dollars. However, it's quite
 possible to imagine a scenario in which the economics favoured using
 commit_delay instead. For example, I am aware that at Facebook, a
 similar Facebook-flavoured-MySQL setting (sync_binlog_timeout_usecs)
 is used. Furthermore, it might not be obvious that fsync speed is an
 issue in practice. Setting commit_delay to 4,000 has seemingly no
 downside on my laptop - it *positively* affects both average and
 worst-case transaction latency - so with spinning disks, it probably
 would actually be sensible to set it and forget it, regardless of
 workload.

I agree that commit_delay is looking like a safe bet for spinning disks.

 I attach a revision that I think addresses your concerns. I've
 polished it a bit further too - in particular, my elaborations about
 commit_delay have been concentrated at the end of wal.sgml, where they
 belong. I've also removed the reference to XLogInsert, because, since
 all XLogFlush call sites are now covered by commit_delay, XLogInsert
 isn't particularly relevant.

I'm happy with this formulation.

 I have also increased the default time that pg_test_fsync runs - I
 think that the kind of variability commonly seen in its output, that
 you yourself have reported, justifies doing so in passing.

On the EBS configuration with volatile fsync timings, the variability didn't
go away with 15s runs.  On systems with stable fsync times, 15s was no better
than 2s.  Absent some particular reason to believe 5s is better than 2s, I
would leave it alone.

I'm marking this patch Ready for Committer, qualified with a recommendation to
adopt only the wal.sgml changes.

Thanks,
nm


Re: [HACKERS] Re: Doc patch making firm recommendation for setting the value of commit_delay

2013-01-27 Thread Peter Geoghegan
On 28 January 2013 03:34, Noah Misch n...@leadboat.com wrote:
 On the EBS configuration with volatile fsync timings, the variability didn't
 go away with 15s runs.  On systems with stable fsync times, 15s was no better
 than 2s.  Absent some particular reason to believe 5s is better than 2s, I
 would leave it alone.

I'm not recommending doing so because I thought you'd be likely to get
better numbers on EBS; obviously the variability you saw there likely
had a lot to do with the fact that the underlying physical machines
have multiple tenants. It has just been my observation that more
consistent figures can be obtained (on my laptop) by using a
pg_test_fsync --secs-per-test of about 5. That being the case, why
take the chance with 2 seconds? It isn't as if people run
pg_test_fsync every day, or that they cannot set --secs-per-test to
whatever they like themselves. On the other hand, the cost of setting
it too low could be quite high now, because the absolute values (and
not just how different wal_sync_methods compare) are now important.

-- 
Regards,
Peter Geoghegan


Re: [HACKERS] Re: Doc patch making firm recommendation for setting the value of commit_delay

2013-01-27 Thread Peter Geoghegan
On 28 January 2013 03:34, Noah Misch n...@leadboat.com wrote:
 Would you commit to the same git repository the pgbench-tools data for the
 graphs appearing in that blog post?  I couldn't readily tell what was
 happening below 16 clients due to the graphed data points blending together.

I'm afraid that I no longer have that data. Of course, I could fairly
easily recreate it, but I don't think I'll have time tomorrow. Is it
important? Are you interested in both the insert and tpc-b cases?

-- 
Regards,
Peter Geoghegan


[HACKERS] Re: Doc patch making firm recommendation for setting the value of commit_delay

2013-01-26 Thread Noah Misch
Hi Peter,

I took a look at this patch and the benchmarks you've furnished:

https://github.com/petergeoghegan/commit_delay_benchmarks
http://pgeoghegan.blogspot.com/2012/06/towards-14000-write-transactions-on-my.html

On Wed, Nov 14, 2012 at 08:44:26PM +, Peter Geoghegan wrote:

 Attached is a doc-patch that makes recommendations that are consistent
 with my observations about what works best here. I'd like to see us
 making *some* recommendation - for sympathetic cases, setting
 commit_delay appropriately can make a very large difference to
 transaction throughput. Such sympathetic cases - many small write
 transactions - are something that tends to be seen relatively
 frequently with web applications, that disproportionately use cloud
 hosting. It isn't at all uncommon for these cases to be highly bound
 by their commit rate, and so it is compelling to try to amortize the
 cost of a flush as effectively as possible there. It would be
 unfortunate if no one was even aware that commit_delay is now useful
 for these cases, since the setting allows cloud hosting providers to
 help these cases quite a bit, without having to do something like
 compromise durability, which in general isn't acceptable.

Your fast-fsync (SSD, BBWC) benchmarks show a small loss up to 8 clients and a
10-20% improvement at 32 clients.  That's on a 4-core/8-thread CPU, assuming
HT was left enabled.  Your slow-fsync (laptop) benchmarks show a 40-100%
improvement in the 16-64 client range.

I did a few more benchmarks along the spectrum.  First, I used a Mac, also
4-core/8-thread, with fsync_writethrough; half of fsync time gave commit_delay
= 35000.  I used pgbench, scale factor 100, 4-minute runs, three trials each:

-clients-   -tps@commit_delay=0-   -tps@commit_delay=35000-
8           51,55,63               82,84,86
16          98,100,107             130,134,143
32          137,148,157            192,200,201
64          199,201,214            249,256,258

So that's a nice 27-53% improvement, fairly similar to the pattern for your
laptop pgbench numbers.  Next, based on your comment about the possible value
for cloud-hosted applications, I tried a cc2.8xlarge (16 core, 32 thread),
GNU/Linux EC2 instance with a data directory on a standard EBS volume, ext4
filesystem.  Several 15s pg_test_fsync runs could not agree on an fsync time;
I saw results from 694us to 1904us.  Ultimately I settled on trying
commit_delay=500, scale factor 300:

-clients-   -tps@commit_delay=0-   -tps@commit_delay=500-
32          1224,1391,1584         1175,1229,1394
64          1553,1647,1673         1544,1546,1632
128         1717,1833,1900         1621,1720,1951
256         1664,1717,1918         1734,1832,1918

The numbers are all over the place, but there's more loss than gain.  Amit
Kapila also measured small losses in tps@-c16:

http://www.postgresql.org/message-id/000701cd6ff0$013a6210$03af2630$@kap...@huawei.com
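
One way to summarize the two tables above is to compare the median of the three trials at each client count. The sketch below does that, assuming the median is the statistic behind the quoted percentage range; the dictionary and function names are hypothetical, and the tps values are copied from the tables.

```python
from statistics import median

# clients -> (trials at commit_delay=0, trials at the nonzero commit_delay)
MAC_DELAY_35000 = {
    8: ((51, 55, 63), (82, 84, 86)),
    16: ((98, 100, 107), (130, 134, 143)),
    32: ((137, 148, 157), (192, 200, 201)),
    64: ((199, 201, 214), (249, 256, 258)),
}
EBS_DELAY_500 = {
    32: ((1224, 1391, 1584), (1175, 1229, 1394)),
    64: ((1553, 1647, 1673), (1544, 1546, 1632)),
    128: ((1717, 1833, 1900), (1621, 1720, 1951)),
    256: ((1664, 1717, 1918), (1734, 1832, 1918)),
}


def median_change_pct(results):
    """Map each client count to the percent change in median tps."""
    return {
        clients: 100.0 * (median(delayed) / median(baseline) - 1.0)
        for clients, (baseline, delayed) in results.items()
    }


for label, results in (("Mac, commit_delay=35000", MAC_DELAY_35000),
                       ("EBS, commit_delay=500", EBS_DELAY_500)):
    for clients, pct in median_change_pct(results).items():
        print(f"{label}: {clients} clients: {pct:+.1f}%")
```

On the Mac figures the medians give gains between roughly 27% and 53%; on the EBS figures three of the four client counts show a loss.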


I was curious about the cost of the MinimumActiveBackends() call when relying
on commit_siblings to skip the delay.  I ran a similar test with an extra 500
idle backends, clients=8, commit_siblings=20 (so the delay would never be
used), and either a zero or nonzero commit_delay.  There was no appreciable
performance advantage from setting commit_delay=0 as opposed to relying on
commit_siblings to suppress the delay.  That's good news.
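
The condition being exercised in that test can be modeled roughly as follows. This is a simplified sketch based on the documented behavior (no sleep when commit_delay is zero, when fsync is disabled, or when fewer than commit_siblings other sessions are in active transactions), not the server's C code; the function and parameter names are hypothetical.

```python
def should_sleep_before_flush(commit_delay_us: int,
                              fsync_enabled: bool,
                              other_active_sessions: int,
                              commit_siblings: int) -> bool:
    """Return True if a flush would be preceded by the commit_delay sleep."""
    return (
        commit_delay_us > 0
        and fsync_enabled
        and other_active_sessions >= commit_siblings
    )


# The test above: 8 clients, so at most 7 other active sessions, with
# commit_siblings=20. The idle backends do not count as active.
print(should_sleep_before_flush(500, True, 7, 20))
```

With those settings the delay is never taken, which is why the comparison isolates the cost of counting active backends.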

On the GNU/Linux VM, pg_sleep() achieves precision on the order of 10us.
However, the sleep was consistently around 70us longer than requested.  A
300us request yielded a 370us sleep, and a 3000us request gave a 3080us sleep.
Mac OS X was similarly precise for short sleeps, but it could oversleep a full
1000us on a 35000us sleep.
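
A rough way to check sleep overshoot on a given host is sketched below. Note the assumption: this times Python's time.sleep(), not pg_sleep() or the backend's own sleep call, so interpreter overhead is included and the numbers are only indicative. The function name is hypothetical.

```python
import time


def measure_oversleep_us(requested_us: int, trials: int = 5) -> float:
    """Return the mean time slept beyond the request, in microseconds."""
    extra = []
    for _ in range(trials):
        start = time.perf_counter()
        time.sleep(requested_us / 1_000_000)
        elapsed_us = (time.perf_counter() - start) * 1_000_000
        extra.append(elapsed_us - requested_us)
    return sum(extra) / len(extra)


# The same request sizes mentioned above.
for requested in (300, 3000, 35000):
    oversleep = measure_oversleep_us(requested)
    print(f"requested {requested}us, mean oversleep {oversleep:.0f}us")
```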


 diff doc/src/sgml/wal.sgml
 index fc5c3b2..92619dd
 *** a/doc/src/sgml/wal.sgml
 --- b/doc/src/sgml/wal.sgml
 ***************
 *** 375,382 ****
       just before a synchronous commit attempts to flush
       <acronym>WAL</acronym> to disk, in the hope that a single flush
       executed by one such transaction can also serve other transactions
 !     committing at about the same time.  Setting <varname>commit_delay</varname>
 !     can only help when there are many concurrently committing transactions.
      </para>

    </sect1>
 --- 375,397 ----

The beginning of this paragraph still says commit_delay causes a delay just
before a synchronous commit attempts to flush WAL to disk.  Since it now
applies to every WAL flush, that should be updated.

 ***************
 *** 560,570 ****
       is not enabled, or if fewer than <xref linkend="guc-commit-siblings">
       other sessions are currently in active transactions; this avoids
       sleeping when it's unlikely that any other session will commit soon.
 !     Note that on most platforms, the resolution of a sleep request is
       ten milliseconds, so that any nonzero <varname>commit_delay</varname>
       setting between 1 and 10000 microseconds would have the same effect.
 !     Good values for these parameters are not yet clear;

[HACKERS] Re: Doc patch making firm recommendation for setting the value of commit_delay

2013-01-21 Thread Noah Misch
On Mon, Jan 21, 2013 at 12:23:21AM +, Peter Geoghegan wrote:
 On 19 January 2013 20:38, Noah Misch n...@leadboat.com wrote:
  staticloud.com seems to be gone.  Would you repost these?
 
 I've pushed these to a git repo, hosted on github.
 
 https://github.com/petergeoghegan/commit_delay_benchmarks
 
 I'm sorry that I didn't take the time to make the html benchmarks
 easily viewable within a browser on this occasion.

That's plenty convenient; thanks.

What filesystem did you use for testing?  Would you also provide /proc/cpuinfo
or a rough description of the system's CPUs?


[HACKERS] Re: Doc patch making firm recommendation for setting the value of commit_delay

2013-01-19 Thread Noah Misch
On Wed, Nov 14, 2012 at 08:44:26PM +, Peter Geoghegan wrote:
 http://commit-delay-results-ssd-insert.staticloud.com
 http://commit-delay-stripe-insert.staticloud.com
 http://commit-delay-results-stripe-tpcb.staticloud.com
 http://commit-delay-results-ssd-insert.staticloud.com/19/pg_settings.txt

staticloud.com seems to be gone.  Would you repost these?

