Re: [HACKERS] checkpointer continuous flushing

2016-03-22 Thread Fabien COELHO
To emphasize potential bad effects without having to build too large a host and involve too many table spaces, I would suggest to reduce significantly the "checkpoint_flush_after" setting while running these tests. Meh, that completely distorts the test. Yep, I agree. The point would be to

Re: [HACKERS] checkpointer continuous flushing

2016-03-22 Thread Fabien COELHO
My impression is that we actually know what we need to know anyway? Sure, the overall summary is "it is much better with the patch" on this large SSD test, which is good news because the patch was really designed to help with HDDs. -- Fabien. -- Sent via pgsql-hackers mailing list

Re: [HACKERS] checkpointer continuous flushing

2016-03-22 Thread Fabien COELHO
You took 5% of the tx on two 12 hours runs, totaling say 85M tx on one and 100M tx on the other, so you get 4.25M tx from the first and 5M from the second. OK I'm saying that the percentile should be computed on the largest one (5M), so that you get a curve like the following, with both
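The arithmetic behind this suggestion can be sketched as follows. This is only an illustration, assuming scaled-down sample sizes (85 and 100 values standing in for the 4.25M and 5M sampled transactions); the function name `relative_percentiles` is hypothetical and not part of any script posted in the thread.

```python
def relative_percentiles(latencies, largest_sample_size):
    """Return (x, latency) points where x = rank / size of the LARGEST sample.

    The largest run spans 0..1 on the x axis, while a smaller run stops
    early (here at 0.85), which makes the missing transactions visible.
    """
    ordered = sorted(latencies)
    return [((rank + 1) / largest_sample_size, value)
            for rank, value in enumerate(ordered)]


# Scaled-down stand-ins (assumption): 85 values for the unpatched run,
# 100 values for the patched run.
unpatched = [float(i) for i in range(85)]
patched = [float(i) for i in range(100)]
largest = max(len(unpatched), len(patched))

unpatched_curve = relative_percentiles(unpatched, largest)
patched_curve = relative_percentiles(patched, largest)

# Last x position of each curve: 0.85 for the smaller run, 1.0 for the largest.
print(unpatched_curve[-1][0], patched_curve[-1][0])
```

With this scaling both curves share the same x axis, so a given x value corresponds to the same number of transactions in each run.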

Re: [HACKERS] checkpointer continuous flushing

2016-03-22 Thread Andres Freund
On 2016-03-22 10:52:55 +0100, Fabien COELHO wrote: > To emphasize potential bad effects without having to build too large a host > and involve too many table spaces, I would suggest to reduce significantly > the "checkpoint_flush_after" setting while running these tests. Meh, that completely

Re: [HACKERS] checkpointer continuous flushing

2016-03-22 Thread Fabien COELHO
WRT tablespaces: What I'm planning to do, unless somebody has a better proposal, is to basically rent two big amazon instances, and run pgbench in parallel over N tablespaces. Once with local SSD and once with local HDD storage. Ok. Not sure how to control that table spaces are actually on

Re: [HACKERS] checkpointer continuous flushing

2016-03-22 Thread Andres Freund
On 2016-03-22 10:48:20 +0100, Tomas Vondra wrote: > Hi, > > On 03/22/2016 10:44 AM, Fabien COELHO wrote: > > > > > 1) regular-latency.png > >>> > >>>I'm wondering whether it would be clearer if the percentiles > >>>were relative to the largest sample, not to itself, so that the > >>>figures

Re: [HACKERS] checkpointer continuous flushing

2016-03-22 Thread Tomas Vondra
Hi, On 03/22/2016 10:44 AM, Fabien COELHO wrote: 1) regular-latency.png I'm wondering whether it would be clearer if the percentiles were relative to the largest sample, not to itself, so that the figures from the largest one would still be between 0 and 1, but the other (unpatched) one

Re: [HACKERS] checkpointer continuous flushing

2016-03-22 Thread Fabien COELHO
1) regular-latency.png I'm wondering whether it would be clearer if the percentiles were relative to the largest sample, not to itself, so that the figures from the largest one would still be between 0 and 1, but the other (unpatched) one would go between 0 and 0.85, that is would be cut

Re: [HACKERS] checkpointer continuous flushing

2016-03-22 Thread Andres Freund
Hi, On 2016-03-21 18:46:58 +0100, Tomas Vondra wrote: > I've repeated the tests, but this time logged details for 5% of the > transactions (instead of aggregating the data for each second). I've also > made the tests shorter - just 12 hours instead of 24, to reduce the time > needed to complete

Re: [HACKERS] checkpointer continuous flushing

2016-03-22 Thread Tomas Vondra
Hi, On 03/22/2016 07:35 AM, Fabien COELHO wrote: Hello Tomas, Thanks again for these interesting benches. Overall, this means ~300M transactions in total for the un-throttled case, so sample with ~15M transactions available when computing the following charts. Still a very sizable run!

Re: [HACKERS] checkpointer continuous flushing

2016-03-22 Thread Fabien COELHO
Hello Tomas, Thanks again for these interesting benches. Overall, this means ~300M transactions in total for the un-throttled case, so sample with ~15M transactions available when computing the following charts. Still a very sizable run! The results (including scripts for generating the

Re: [HACKERS] checkpointer continuous flushing

2016-03-21 Thread Tomas Vondra
Hi, I've repeated the tests, but this time logged details for 5% of the transactions (instead of aggregating the data for each second). I've also made the tests shorter - just 12 hours instead of 24, to reduce the time needed to complete the benchmark. Overall, this means ~300M transactions

Re: [HACKERS] checkpointer continuous flushing

2016-03-19 Thread Fabien COELHO
Hello Tomas, Thanks for these great measures. * 4 x CPU E5-4620 (2.2GHz) 4*8 = 32 cores / 64 threads. * 256GB of RAM Wow! * 24x SSD on LSI 2208 controller (with 1GB BBWC) Wow! RAID configuration ? The patch is designed to fix very big issues on HDD, but it is good to see that the

Re: [HACKERS] checkpointer continuous flushing

2016-03-19 Thread Tomas Vondra
Hi, On 03/17/2016 10:14 PM, Fabien COELHO wrote: ... I would have suggested using the --latency-limit option to filter out very slow queries, otherwise if the system is stuck it may catch up later, but then this is not representative of "sustainable" performance. When pgbench is running

Re: [HACKERS] checkpointer continuous flushing

2016-03-19 Thread Fabien COELHO
Hello Tomas, But I do think it's a very useful tool when it comes to measuring the consistency of behavior over time, assuming you're asking questions about the intervals and not the original transactions. For a throttled run, I think it is better to check whether or not the system could

Re: [HACKERS] checkpointer continuous flushing

2016-03-19 Thread Fabien COELHO
Is it possible to run tests with distinct table spaces on those many disks? Nope, that'd require reconfiguring the system (and then back), and I don't have access to that system (just SSH). Ok. Also, I don't quite see what would that tell us? Currently the flushing context is shared

Re: [HACKERS] checkpointer continuous flushing

2016-03-19 Thread Tomas Vondra
Hi, On 03/11/2016 02:34 AM, Andres Freund wrote: Hi, I just pushed the two major remaining patches in this thread. Let's see what the buildfarm has to say; I'd not be surprised if there's some lingering portability problem in the flushing code. There's one remaining issue we definitely want

Re: [HACKERS] checkpointer continuous flushing

2016-03-19 Thread Tomas Vondra
Hi, On 03/17/2016 06:36 PM, Fabien COELHO wrote: Hello Tomas, Thanks for these great measures. * 4 x CPU E5-4620 (2.2GHz) 4*8 = 32 cores / 64 threads. Yep. I only used 32 clients though, to keep some of the CPU available for the rest of the system (also, HT does not really double the

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-13 Thread Jim Nasby
On 3/13/16 6:30 PM, Peter Geoghegan wrote: On Sat, Mar 12, 2016 at 5:21 PM, Jeff Janes wrote: Would the wiki be a good place for such tips? Not as formal as the documentation, and more centralized (and editable) than a collection of blog posts. That general direction

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-13 Thread Peter Geoghegan
On Sat, Mar 12, 2016 at 5:21 PM, Jeff Janes wrote: > Would the wiki be a good place for such tips? Not as formal as the > documentation, and more centralized (and editable) than a collection > of blog posts. That general direction makes sense, but I'm not sure if the Wiki

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-12 Thread Jeff Janes
On Thu, Mar 10, 2016 at 11:25 PM, Peter Geoghegan wrote: > On Thu, Mar 10, 2016 at 11:18 PM, Fabien COELHO wrote: >> I can only concur! >> >> The "Performance Tips" chapter (II.14) is more user/query oriented. The >> "Server Administration" book (III) does

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Peter Geoghegan
On Thu, Mar 10, 2016 at 11:18 PM, Fabien COELHO wrote: > I can only concur! > > The "Performance Tips" chapter (II.14) is more user/query oriented. The > "Server Administration" book (III) does not discuss this much. That's definitely one area in which the docs are lacking

Re: [HACKERS] checkpointer continuous flushing

2016-03-10 Thread Fabien COELHO
I just pushed the two major remaining patches in this thread. Hurray! Nine months to get this baby out :-) -- Fabien.

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Fabien COELHO
As you wish. I thought that understanding the underlying performance model with sequential writes written in chunks is important for the admin, and as this guc would have an impact on performance it should be hinted about, including the limits of its effect where large bases will converge to

Re: [HACKERS] checkpointer continuous flushing

2016-03-10 Thread Andres Freund
Hi, I just pushed the two major remaining patches in this thread. Let's see what the buildfarm has to say; I'd not be surprised if there's some lingering portability problem in the flushing code. There's one remaining issue we definitely want to resolve before the next release: Right now we

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Andres Freund
On 2016-03-11 00:23:56 +0100, Fabien COELHO wrote: > As you wish. I thought that understanding the underlying performance model > with sequential writes written in chunks is important for the admin, and as > this guc would have an impact on performance it should be hinted about, > including the

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Fabien COELHO
[...] If the default is in pages, maybe you could state it and afterwards translate it into a size. Hm, I think that's more complicated for users than it's worth. As you wish. I liked the number of pages you used initially because it really gives a hint of how many random IOs are avoided when

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Fabien COELHO
Hello Andres, I'm not sure I've seen these performance... If you have hard evidence, please feel free to share it. Man, are you intentionally trying to be hard to work with? Sorry, I do not understand this remark. You were referring to some latency measures in your answer, and I was just

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Andres Freund
On 2016-03-10 23:43:46 +0100, Fabien COELHO wrote: > > > > >Whenever more than bgwriter_flush_after bytes have > >been written by the bgwriter, attempt to force the OS to issue these > >writes to the underlying storage. Doing so will limit the amount of > >

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Fabien COELHO
Whenever more than bgwriter_flush_after bytes have been written by the bgwriter, attempt to force the OS to issue these writes to the underlying storage. Doing so will limit the amount of dirty data in the kernel's page cache, reducing the likelihood of
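The "flush after N bytes" behaviour described in this documentation excerpt can be modelled with a small counter. This is a toy sketch only, not PostgreSQL's implementation: the class name `WritebackCounter` and the `flush` callback are hypothetical, and the callback merely stands in for whatever request is made to the operating system.

```python
class WritebackCounter:
    """Toy model: request a flush once more than `flush_after` bytes are pending."""

    def __init__(self, flush_after, flush):
        self.flush_after = flush_after  # threshold in bytes
        self.flush = flush              # hypothetical stand-in for the OS writeback request
        self.pending = 0                # bytes written since the last flush request

    def record_write(self, nbytes):
        self.pending += nbytes
        # "Whenever more than ... bytes have been written": strictly greater than.
        if self.pending > self.flush_after:
            self.flush(self.pending)
            self.pending = 0


flushed = []
counter = WritebackCounter(flush_after=64 * 1024, flush=flushed.append)
for _ in range(20):
    counter.record_write(8192)  # one 8 kB page per write

print(flushed, counter.pending)
```

Bounding `pending` this way is what limits the amount of dirty data left in the kernel's page cache between flush requests.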

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Andres Freund
On 2016-03-10 23:38:38 +0100, Fabien COELHO wrote: > I'm not sure I've seen these performance... If you have hard evidence, > please feel free to share it. Man, are you intentionally trying to be hard to work with? To quote the email you responded to: > My current plan is to commit this with

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Fabien COELHO
[...] I had originally kept it with one context per tablespace after refactoring this, but found that it gave worse results in rate limited loads even over only two tablespaces. That's on SSDs though. Might just mean that a smaller context size is better on SSD, and it could still be

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Andres Freund
On 2016-03-10 17:33:33 -0500, Robert Haas wrote: > On Thu, Mar 10, 2016 at 5:24 PM, Andres Freund wrote: > > On 2016-02-21 09:49:53 +0530, Robert Haas wrote: > >> I think there might be a semantic distinction between these two terms. > >> Doesn't writeback mean writing pages

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Robert Haas
On Thu, Mar 10, 2016 at 5:24 PM, Andres Freund wrote: > On 2016-02-21 09:49:53 +0530, Robert Haas wrote: >> I think there might be a semantic distinction between these two terms. >> Doesn't writeback mean writing pages to disk, and flushing mean making >> sure that they are

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Andres Freund
On 2016-02-21 09:49:53 +0530, Robert Haas wrote: > I think there might be a semantic distinction between these two terms. > Doesn't writeback mean writing pages to disk, and flushing mean making > sure that they are durably on disk? So for example when the Linux > kernel thinks there is too much

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Andres Freund
On 2016-03-08 09:28:15 +0100, Fabien COELHO wrote: > > >>>Now I cannot see how having one context per table space would have a > >>>significant negative performance impact. > >> > >>The 'dirty data' etc. limits are global, not per block device. By having > >>several contexts with unflushed dirty

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-08 Thread Fabien COELHO
Now I cannot see how having one context per table space would have a significant negative performance impact. The 'dirty data' etc. limits are global, not per block device. By having several contexts with unflushed dirty data the total amount of dirty data in the kernel increases. Possibly,

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-07 Thread Fabien COELHO
Hello Andres, Now I cannot see how having one context per table space would have a significant negative performance impact. The 'dirty data' etc. limits are global, not per block device. By having several contexts with unflushed dirty data the total amount of dirty data in the kernel

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-07 Thread Andres Freund
On 2016-03-07 21:10:19 +0100, Fabien COELHO wrote: > Now I cannot see how having one context per table space would have a > significant negative performance impact. The 'dirty data' etc. limits are global, not per block device. By having several contexts with unflushed dirty data the total amount
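The point about several contexts can be made concrete with a rough upper bound. The sketch below is an assumption-laden illustration: it supposes each context may hold up to its flush threshold in unflushed pages, uses a threshold of 32 pages and 16 tablespaces (figures that appear elsewhere in this thread), and the function name is hypothetical.

```python
PAGE_SIZE = 8192  # PostgreSQL's default block size, in bytes


def worst_case_unflushed_bytes(contexts, flush_after_pages, page_size=PAGE_SIZE):
    """Upper bound on written-but-unflushed data if every context is nearly full."""
    return contexts * flush_after_pages * page_size


one_context = worst_case_unflushed_bytes(contexts=1, flush_after_pages=32)
sixteen_contexts = worst_case_unflushed_bytes(contexts=16, flush_after_pages=32)

# 256 kB with a single shared context, 4 MB with one context per tablespace.
print(one_context // 1024, "kB vs", sixteen_contexts // 1024, "kB")
```

Because the kernel's dirty-data limits are global rather than per block device, the bound grows with the number of contexts, whichever devices they write to.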

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-07 Thread Fabien COELHO
Hello Andres, (1) with 16 tablespaces (1 per table) on 1 disk : 680.0 tps per second avg, stddev [ min q1 median q3 max ] <=300tps 679.6 ± 750.4 [0.0, 317.0, 371.0, 438.5, 2724.0] 19.5% (2) with 1 tablespace on 1 disk : 956.0 tps per second avg, stddev [ min q1 median q3 max ]

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-07 Thread Andres Freund
On 2016-02-22 20:44:35 +0100, Fabien COELHO wrote: > > >>Random updates on 16 tables which total to 1.1GB of data, so this is in > >>buffer, no significant "read" traffic. > >> > >>(1) with 16 tablespaces (1 per table) on 1 disk : 680.0 tps > >>per second avg, stddev [ min q1 median q3 max ]

Re: [HACKERS] checkpointer continuous flushing - V16

2016-03-07 Thread Andres Freund
On 2016-03-07 09:41:51 -0800, Andres Freund wrote: > > Due to the difference in amount of RAM, each machine used different scales - > > the goal is to have small, ~50% RAM, >200% RAM sizes: > > > > 1) Xeon: 100, 400, 6000 > > 2) i5: 50, 200, 3000 > > > > The commits actually tested are > > > >

Re: [HACKERS] checkpointer continuous flushing - V16

2016-03-07 Thread Andres Freund
On 2016-03-01 16:06:47 +0100, Tomas Vondra wrote: > 1) HP DL380 G5 (old rack server) > - 2x Xeon E5450, 16GB RAM (8 cores) > - 4x 10k SAS drives in RAID-10 on H400 controller (with BBWC) > - RedHat 6 > - shared_buffers = 4GB > - min_wal_size = 2GB > - max_wal_size = 6GB > > 2) workstation with i5

Re: [HACKERS] checkpointer continuous flushing - V16

2016-03-01 Thread Fabien COELHO
Hello Tomas, One of the goals of this thread (as I understand it) was to make the overall behavior smoother - eliminate sudden drops in transaction rate due to bursts of random I/O etc. One way to look at this is in terms of how much the tps fluctuates, so let's see some charts. I've

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-22 Thread Fabien COELHO
Random updates on 16 tables which total to 1.1GB of data, so this is in buffer, no significant "read" traffic. (1) with 16 tablespaces (1 per table) on 1 disk : 680.0 tps per second avg, stddev [ min q1 median q3 max ] <=300tps 679.6 ± 750.4 [0.0, 317.0, 371.0, 438.5, 2724.0] 19.5%
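A summary line of this shape (average, standard deviation, five-number summary, share of slow seconds) could be computed from per-second tps figures roughly as below. This is a sketch under assumptions: the thread does not say which standard deviation or quartile method was used, the last column is read here as the percentage of seconds at or below 300 tps, and the function name `summarize_tps` is hypothetical.

```python
import statistics


def summarize_tps(per_second_tps, threshold=300.0):
    """Summarize per-second tps: avg, stddev, [min, q1, median, q3, max], % slow."""
    avg = statistics.mean(per_second_tps)
    stddev = statistics.pstdev(per_second_tps)  # population stddev (assumption)
    q1, median, q3 = statistics.quantiles(per_second_tps, n=4, method="inclusive")
    slow = sum(1 for tps in per_second_tps if tps <= threshold)
    return {
        "avg": avg,
        "stddev": stddev,
        "five_numbers": [min(per_second_tps), q1, median, q3, max(per_second_tps)],
        "pct_at_or_below_threshold": 100.0 * slow / len(per_second_tps),
    }


# Made-up data for illustration only, not taken from the benchmark.
summary = summarize_tps([0.0, 100.0, 200.0, 300.0, 400.0])
print(summary)
```

A large standard deviation relative to the average, together with a minimum of 0.0, is the signature of the stalls this thread is trying to eliminate.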

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-22 Thread Andres Freund
On 2016-02-22 11:05:20 -0500, Tom Lane wrote: > Andres Freund writes: > > Interesting. That doesn't reflect my own tests, even on rotating media, > > at all. I wonder if it's related to: > >

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-22 Thread Tom Lane
Andres Freund writes: > Interesting. That doesn't reflect my own tests, even on rotating media, > at all. I wonder if it's related to: > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=23d0127096cb91cb6d354bdc71bd88a7bae3a1d5 > If you use your 12.04

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-22 Thread Andres Freund
On 2016-02-22 14:11:05 +0100, Fabien COELHO wrote: > > >I did a quick & small test with random updates on 16 tables with > >checkpoint_flush_after=16 checkpoint_timeout=30 > > Another run with more "normal" settings and over 1000 seconds, so less > "quick & small" than the previous one. > >

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-22 Thread Fabien COELHO
I did a quick & small test with random updates on 16 tables with checkpoint_flush_after=16 checkpoint_timeout=30 Another run with more "normal" settings and over 1000 seconds, so less "quick & small" than the previous one. checkpoint_flush_after = 16 checkpoint_timeout = 5min # default

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-22 Thread Fabien COELHO
Hallo Andres, AFAICR I used a "flush context" for each table space in some version I submitted, because I do think that this whole writeback logic really does make sense *per table space*, which suggests that there should be as many writeback contexts as table spaces, otherwise the positive

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-21 Thread Fabien COELHO
ISTM that "progress" and "progress_slice" only depend on num_scanned and per-tablespace num_to_scan and total num_to_scan, so they are somehow redundant and the progress could be recomputed from the initial figures when needed. They don't cause much space usage, and we access the values

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-21 Thread Fabien COELHO
[...] I do think that this whole writeback logic really does make sense *per table space*, Leads to less regular IO, because if your tablespaces are evenly sized (somewhat common) you'll sometimes end up issuing sync_file_range's shortly after each other. For latency outside checkpoints it's

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-21 Thread Andres Freund
Hi, On 2016-02-21 10:52:45 +0100, Fabien COELHO wrote: > * CkptSortItem: > > I think that allocating 20 bytes per buffer in shared memory is a little on > the heavy side. Some compression can be achieved: sizeof(ForkNumber) is 4 bytes > to hold 4 values, could be one byte or even 2 bits somewhere.

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-21 Thread Andres Freund
On 2016-02-21 08:26:28 +0100, Fabien COELHO wrote: > >>In the discussion in the wal section, I'm not sure about the effect of > >>setting writebacks on SSD, [...] > > > >Yea, that paragraph needs some editing. I think we should basically > >remove that last sentence. > > Ok, fine with me. Does

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-21 Thread Fabien COELHO
Hallo Andres, Here is a review for the second patch. For 0002 I've recently changed: * Removed the sort timing information, we've proven sufficiently that it doesn't take a lot of time. I put it there initially to demonstrate that there was no cache performance issue when sorting on just

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-21 Thread Fabien COELHO
Hallo Andres, [...] I do think that this whole writeback logic really does make sense *per table space*, Leads to less regular IO, because if your tablespaces are evenly sized (somewhat common) you'll sometimes end up issuing sync_file_range's shortly after each other. For latency outside

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-20 Thread Fabien COELHO
Hallo Andres, In some previous version I think a warning was shown if the feature was requested but not available. I think we should either silently ignore it, or error out. Warnings somewhere in the background aren't particularly meaningful. I like "ignoring with a warning" in the log

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-20 Thread Robert Haas
On Sun, Feb 21, 2016 at 3:37 AM, Andres Freund wrote: >> The documentation seems to use "flush" but the code talks about "writeback" >> or "flush", depending. I think one vocabulary, whichever it is, should be >> chosen and everything should stick to it, otherwise everything

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-20 Thread Andres Freund
Hi, On 2016-02-20 20:56:31 +0100, Fabien COELHO wrote: > >* Currently *_flush_after can be set to a nonzero value, even if there's > > no support for flushing on that platform. Imo that's ok, but perhaps > > other people's opinions differ. > > In some previous version I think a warning was shown

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-20 Thread Fabien COELHO
Hello Andres, For 0001 I've recently changed: * Don't schedule writeback after smgrextend() - that defeats Linux's delayed allocation mechanism, increasing fragmentation noticeably. * Add docs for the new GUC variables * comment polishing * BackendWritebackContext now isn't dynamically

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-19 Thread Michael Paquier
On Sat, Feb 20, 2016 at 5:08 AM, Fabien COELHO wrote: >> Kernel 3.2 is extremely bad for PostgreSQL, as the VM seems to amplify IO >> somehow. The difference to 3.13 (the latest LTS kernel for 12.04) is huge. >> >> >>

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-19 Thread Andres Freund
On 2016-02-19 22:46:44 +0100, Fabien COELHO wrote: > > Hello Andres, > > >Here's the next two (the most important) patches of the series: > >0001: Allow to trigger kernel writeback after a configurable number of > >writes. > >0002: Checkpoint sorting and balancing. > > I will look into these

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-19 Thread Fabien COELHO
Hello Andres, Here's the next two (the most important) patches of the series: 0001: Allow to trigger kernel writeback after a configurable number of writes. 0002: Checkpoint sorting and balancing. I will look into these two in depth. Note that I would have ordered them in reverse because

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-19 Thread Andres Freund
On 2016-02-04 16:54:58 +0100, Andres Freund wrote: > Hi, > > Fabien asked me to post a new version of the checkpoint flushing patch > series. While this isn't entirely ready for commit, I think we're > getting closer. > > I don't want to post a full series right now, but my working state is >

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-19 Thread Fabien COELHO
Hallo Patric, Kernel 3.2 is extremely bad for PostgreSQL, as the VM seems to amplify IO somehow. The difference to 3.13 (the latest LTS kernel for 12.04) is huge.

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-19 Thread Patric Bechtel
Hi Fabien, Fabien COELHO wrote on 19.02.2016 at 16:04: > >>> [...] Ubuntu 12.04 LTS (precise) >> >> That's with 12.04's standard kernel? > > Yes. Kernel 3.2 is extremely bad for PostgreSQL, as the VM seems to amplify IO somehow. The difference

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-19 Thread Fabien COELHO
Hello. Based on these results I think 32 will be a good default for checkpoint_flush_after? There's a few cases where 64 showed to be beneficial, and some where 32 is better. I've seen 64 perform a bit better in some cases here, but the differences were not too big. Yes, these many runs show

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-19 Thread Andres Freund
Hi, On 2016-02-19 10:16:41 +0100, Fabien COELHO wrote: > Below the results of a lot of tests with pgbench to exercise checkpoints on > the above version when fetched. Wow, that's a great test series. > Overall comments: > - sorting & flushing is basically always a winner > - benchmarking

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-19 Thread Fabien COELHO
Hello Andres, I don't want to post a full series right now, but my working state is available on http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/checkpoint-flush git://git.postgresql.org/git/users/andresfreund/postgres.git checkpoint-flush Below

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-18 Thread Andres Freund
On 2016-02-18 09:51:20 +0100, Fabien COELHO wrote: > I've looked at these patches, especially the whole bunch of explanations and > comments which is a good source for understanding what is going on in the > WAL writer, a part of pg I'm not familiar with. > > When reading the patch 0002

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-18 Thread Andres Freund
On 2016-02-11 19:44:25 +0100, Andres Freund wrote: > The first two commits of the series are pretty close to being ready. I'd > welcome review of those, and I plan to commit them independently of the > rest as they're beneficial independently. The most important bits are > the comments and docs

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-18 Thread Fabien COELHO
Hello Andres, 0001: Make SetHintBit() a bit more aggressive, afaics that fixes all the potential regressions of 0002 0002: Fix the overaggressive flushing by the wal writer, by only flushing every wal_writer_delay ms or wal_writer_flush_after bytes. I've looked at these
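The rule stated for patch 0002 (flush only every wal_writer_delay ms or wal_writer_flush_after bytes) amounts to a two-condition check. The sketch below is an illustration rather than the WAL writer's actual code: the function name `should_flush` is hypothetical, and the default values used (200 ms and 1 MB) are assumptions chosen for the example.

```python
def should_flush(ms_since_last_flush, bytes_since_last_flush,
                 wal_writer_delay_ms=200,
                 wal_writer_flush_after_bytes=1024 * 1024):
    """Flush when enough time has passed OR enough WAL has accumulated.

    Otherwise the WAL would only be written, with the flush deferred,
    which avoids flushing in a too granular fashion.
    """
    return (ms_since_last_flush >= wal_writer_delay_ms
            or bytes_since_last_flush >= wal_writer_flush_after_bytes)


print(should_flush(50, 4096))             # little time, little WAL
print(should_flush(250, 4096))            # delay elapsed
print(should_flush(50, 2 * 1024 * 1024))  # byte threshold exceeded
```

Either condition alone is enough to trigger a flush, so the time limit bounds how long WAL stays unflushed while the byte limit bounds how much accumulates.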

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 1:44 PM, Andres Freund wrote: > On 2016-02-04 16:54:58 +0100, Andres Freund wrote: >> Fabien asked me to post a new version of the checkpoint flushing patch >> series. While this isn't entirely ready for commit, I think we're >> getting closer. >> >> I

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-11 Thread Andres Freund
On 2016-02-04 16:54:58 +0100, Andres Freund wrote: > Fabien asked me to post a new version of the checkpoint flushing patch > series. While this isn't entirely ready for commit, I think we're > getting closer. > > I don't want to post a full series right now, but my working state is > available

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-09 Thread Fabien COELHO
I think I would appreciate comments to understand why/how the ringbuffer is used, and more comments in general, so it is fine if you improve this part. I'd suggest to leave out the ringbuffer/new bgwriter parts. Ok, so the patch would only include the checkpointer stuff. I'll look at this

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-09 Thread Andres Freund
On February 9, 2016 10:46:34 AM GMT+01:00, Fabien COELHO wrote: > >>> I think I would appreciate comments to understand why/how the >>> ringbuffer is used, and more comments in general, so it is fine if >you >>> improve this part. >> >> I'd suggest to leave out the

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-08 Thread Andres Freund
Hi Fabien, On 2016-02-04 16:54:58 +0100, Andres Freund wrote: > I don't want to post a full series right now, but my working state is > available on > http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/checkpoint-flush >

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-08 Thread Fabien COELHO
Hello Andres, Any comments before I spend more time polishing this? I'm running tests on various settings, I'll send a report when it is done. Up to now the performance seems as good as with the previous version. I'm currently updating docs and comments to actually describe the current

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-08 Thread Andres Freund
On 2016-02-08 19:52:30 +0100, Fabien COELHO wrote: > I think I would appreciate comments to understand why/how the ringbuffer is > used, and more comments in general, so it is fine if you improve this part. I'd suggest to leave out the ringbuffer/new bgwriter parts. I think they'd be committed

Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-04 Thread Andres Freund
Hi, Fabien asked me to post a new version of the checkpoint flushing patch series. While this isn't entirely ready for commit, I think we're getting closer. I don't want to post a full series right now, but my working state is available on

Re: [HACKERS] checkpointer continuous flushing

2016-02-01 Thread Alvaro Herrera
This patch got its fair share of reviewer attention this commitfest. Moving to the next one. Andres, if you want to commit ahead of time you're of course encouraged to do so. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training

Re: [HACKERS] checkpointer continuous flushing

2016-01-27 Thread Robert Haas
On Wed, Jan 20, 2016 at 9:02 AM, Andres Freund wrote: > Chatting on IM with Heikki, I noticed that we're pretty pessimistic in > SetHintBits(). Namely we don't set the bit if XLogNeedsFlush(commitLSN), > because we can't easily set the LSN. But, it's actually fairly common >

Re: [HACKERS] checkpointer continuous flushing

2016-01-21 Thread Andres Freund
On 2016-01-21 11:33:15 +0530, Amit Kapila wrote: > On Wed, Jan 20, 2016 at 9:07 PM, Andres Freund wrote: > > I don't think it's strongly related - the contention here is on read > > access to the clog, not on write access. > > Aren't reads on clog contended with parallel

Re: [HACKERS] checkpointer continuous flushing

2016-01-20 Thread Amit Kapila
On Wed, Jan 20, 2016 at 9:07 PM, Andres Freund wrote: > > On 2016-01-20 12:16:24 -0300, Alvaro Herrera wrote: > > Andres Freund wrote: > > > > > The relevant thread is at > > >

Re: [HACKERS] checkpointer continuous flushing

2016-01-20 Thread Alvaro Herrera
Andres Freund wrote: > The relevant thread is at > http://archives.postgresql.org/message-id/CA%2BTgmoaCr3kDPafK5ygYDA9mF9zhObGp_13q0XwkEWsScw6h%3Dw%40mail.gmail.com > what I didn't remember is that I voiced concern back then about exactly this: >

Re: [HACKERS] checkpointer continuous flushing

2016-01-20 Thread Andres Freund
On 2016-01-20 12:16:24 -0300, Alvaro Herrera wrote: > Andres Freund wrote: > > > The relevant thread is at > > http://archives.postgresql.org/message-id/CA%2BTgmoaCr3kDPafK5ygYDA9mF9zhObGp_13q0XwkEWsScw6h%3Dw%40mail.gmail.com > > what I didn't remember is that I voiced concern back then about

Re: [HACKERS] checkpointer continuous flushing

2016-01-20 Thread Andres Freund
On 2016-01-19 22:43:21 +0100, Andres Freund wrote: > On 2016-01-19 12:58:38 -0500, Robert Haas wrote: > > This seems like a problem with the WAL writer quite independent of > > anything else. It seems likely to be inadvertent fallout from this > > patch: > > > > Author: Simon Riggs

Re: [HACKERS] checkpointer continuous flushing

2016-01-20 Thread Andres Freund
On 2016-01-20 11:13:26 +0100, Andres Freund wrote: > On 2016-01-19 22:43:21 +0100, Andres Freund wrote: > > On 2016-01-19 12:58:38 -0500, Robert Haas wrote: > > I think the problem isn't really that it's flushing too much WAL in > > total, it's that it's flushing WAL in a too granular fashion. I

Re: [HACKERS] checkpointer continuous flushing

2016-01-19 Thread Fabien COELHO
I measured it in a different number of cases, both on SSDs and spinning rust. I just reproduced it with: postgres-ckpt14 \ -D /srv/temp/pgdev-dev-800/ \ -c maintenance_work_mem=2GB \ -c fsync=on \ -c synchronous_commit=off \ -c

Re: [HACKERS] checkpointer continuous flushing

2016-01-19 Thread Andres Freund
On 2016-01-19 10:27:31 +0100, Fabien COELHO wrote: > Also, the performance level is around 160 tps on HDDs, which makes sense to > me for a 7200 rpm HDD capable of about x00 random writes per second. It > seems to me that you reported much better performance on HDD, but I cannot > really see how

Re: [HACKERS] checkpointer continuous flushing

2016-01-19 Thread Andres Freund
On 2016-01-19 12:58:38 -0500, Robert Haas wrote: > This seems like a problem with the WAL writer quite independent of > anything else. It seems likely to be inadvertent fallout from this > patch: > > Author: Simon Riggs > Branch: master Release: REL9_2_BR [4de82f7d7]

Re: [HACKERS] checkpointer continuous flushing

2016-01-19 Thread Robert Haas
On Mon, Jan 18, 2016 at 11:39 AM, Andres Freund wrote: > On 2016-01-16 10:01:25 +0100, Fabien COELHO wrote: >> Hello Andres, >> >> >I measured it in a different number of cases, both on SSDs and spinning >> >rust. I just reproduced it with: >> > >> >postgres-ckpt14 \ >> >

Re: [HACKERS] checkpointer continuous flushing

2016-01-19 Thread Fabien COELHO
synchronous_commit = off does make a significant difference. Sure, but I had thought about that and kept this one... But why are you then saying this is fundamentally limited to 160 xacts/sec? I'm just saying that the tested load generates mostly random IOs (probably on average over 1

Re: [HACKERS] checkpointer continuous flushing

2016-01-19 Thread Fabien COELHO
synchronous_commit = off does make a significant difference. Sure, but I had thought about that and kept this one... I think I found one possible culprit: I automatically wrote 300 seconds for checkpoint_timeout, instead of 30 seconds in your settings. I'll have to rerun the tests with

Re: [HACKERS] checkpointer continuous flushing

2016-01-19 Thread Andres Freund
On 2016-01-19 13:34:14 +0100, Fabien COELHO wrote: > > >synchronous_commit = off does make a significant difference. > > Sure, but I had thought about that and kept this one... But why are you then saying this is fundamentally limited to 160 xacts/sec? > I think I found one possible culprit: I

Re: [HACKERS] checkpointer continuous flushing

2016-01-18 Thread Andres Freund
On 2016-01-16 10:01:25 +0100, Fabien COELHO wrote: > > Hello Andres, > > >I measured it in a different number of cases, both on SSDs and spinning > >rust. I just reproduced it with: > > > >postgres-ckpt14 \ > > -D /srv/temp/pgdev-dev-800/ \ > > -c maintenance_work_mem=2GB \ > >

Re: [HACKERS] checkpointer continuous flushing

2016-01-16 Thread Fabien COELHO
Hello Andres, Hello Tomas. Ooops, sorry Andres, I mixed up the thread in my head so was not clear who was asking the questions to whom. I was/am using ext4, and it turns out that, when enabling flushing, the results are hugely dependent on barriers=on/off, with the latter making flushing

Re: [HACKERS] checkpointer continuous flushing

2016-01-16 Thread Fabien COELHO
Hello Andres, I measured it in a different number of cases, both on SSDs and spinning rust. I just reproduced it with: postgres-ckpt14 \ -D /srv/temp/pgdev-dev-800/ \ -c maintenance_work_mem=2GB \ -c fsync=on \ -c synchronous_commit=off \ -c

Re: [HACKERS] checkpointer continuous flushing

2016-01-15 Thread Andres Freund
Hi Fabien, On 2016-01-11 14:45:16 +0100, Andres Freund wrote: > I measured it in a different number of cases, both on SSDs and spinning > rust. I just reproduced it with: > > postgres-ckpt14 \ > -D /srv/temp/pgdev-dev-800/ \ > -c maintenance_work_mem=2GB \ > -c fsync=on \

Re: [HACKERS] checkpointer continuous flushing

2016-01-15 Thread Fabien COELHO
Hi Fabien, Hello Tomas. On 2016-01-11 14:45:16 +0100, Andres Freund wrote: I measured it in a different number of cases, both on SSDs and spinning rust. I just reproduced it with: postgres-ckpt14 \ -D /srv/temp/pgdev-dev-800/ \ -c maintenance_work_mem=2GB \ -c
