Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-13 Thread Jim Nasby
On 3/13/16 6:30 PM, Peter Geoghegan wrote: On Sat, Mar 12, 2016 at 5:21 PM, Jeff Janes wrote: Would the wiki be a good place for such tips? Not as formal as the documentation, and more centralized (and editable) than a collection of blog posts. That general direction

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-13 Thread Peter Geoghegan
On Sat, Mar 12, 2016 at 5:21 PM, Jeff Janes wrote: > Would the wiki be a good place for such tips? Not as formal as the > documentation, and more centralized (and editable) than a collection > of blog posts. That general direction makes sense, but I'm not sure if the Wiki

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-12 Thread Jeff Janes
On Thu, Mar 10, 2016 at 11:25 PM, Peter Geoghegan wrote: > On Thu, Mar 10, 2016 at 11:18 PM, Fabien COELHO wrote: >> I can only concur! >> >> The "Performance Tips" chapter (II.14) is more user/query oriented. The >> "Server Administration" book (III) does

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Peter Geoghegan
On Thu, Mar 10, 2016 at 11:18 PM, Fabien COELHO wrote: > I can only concur! > > The "Performance Tips" chapter (II.14) is more user/query oriented. The > "Server Administration" book (III) does not discuss this much. That's definitely one area in which the docs are lacking

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Fabien COELHO
As you wish. I thought that understanding the underlying performance model with sequential writes written in chunks is important for the admin, and as this guc would have an impact on performance it should be hinted about, including the limits of its effect where large bases will converge to

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Andres Freund
On 2016-03-11 00:23:56 +0100, Fabien COELHO wrote: > As you wish. I thought that understanding the underlying performance model > with sequential writes written in chunks is important for the admin, and as > this guc would have an impact on performance it should be hinted about, > including the

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Fabien COELHO
[...] If the default is in pages, maybe you could state it and afterwards translate it in size. Hm, I think that's more complicated for users than it's worth. As you wish. I liked the number of pages you used initially because it really gives a hint of how many random I/Os are avoided when

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Fabien COELHO
Hello Andres, I'm not sure I've seen these performance... If you have hard evidence, please feel free to share it. Man, are you intentionally trying to be hard to work with? Sorry, I do not understand this remark. You were referring to some latency measures in your answer, and I was just

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Andres Freund
On 2016-03-10 23:43:46 +0100, Fabien COELHO wrote: > > > > >Whenever more than bgwriter_flush_after bytes have > >been written by the bgwriter, attempt to force the OS to issue these > >writes to the underlying storage. Doing so will limit the amount of > >

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Fabien COELHO
Whenever more than bgwriter_flush_after bytes have been written by the bgwriter, attempt to force the OS to issue these writes to the underlying storage. Doing so will limit the amount of dirty data in the kernel's page cache, reducing the likelihood of
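The behaviour described in the proposed wording can be illustrated with a small sketch. This is not the patch's code: the `WritebackContext` class, its field names and the 8192-byte block size are assumptions made for illustration, and `os.fsync` is used only as a portable stand-in for a writeback request (it is stronger, since it also waits for durability, whereas the thread elsewhere mentions `sync_file_range`).

```python
import os
import tempfile

BLOCK_SIZE = 8192  # assumed page size, for illustration only


class WritebackContext:
    """Hypothetical helper: counts bytes written and asks the OS to
    write them out once more than flush_after_bytes are pending."""

    def __init__(self, fd, flush_after_bytes):
        self.fd = fd
        self.flush_after_bytes = flush_after_bytes
        self.pending_bytes = 0
        self.flush_calls = 0

    def write(self, data):
        written = os.write(self.fd, data)
        self.pending_bytes += written
        # A threshold of 0 disables the forced writeback in this sketch.
        if 0 < self.flush_after_bytes < self.pending_bytes:
            self.flush()

    def flush(self):
        # Stand-in for a writeback hint; fsync also waits for durability.
        os.fsync(self.fd)
        self.flush_calls += 1
        self.pending_bytes = 0


with tempfile.TemporaryFile() as tmp:
    ctx = WritebackContext(tmp.fileno(), flush_after_bytes=16 * BLOCK_SIZE)
    for _ in range(40):
        ctx.write(b"\0" * BLOCK_SIZE)
    flushes, leftover = ctx.flush_calls, ctx.pending_bytes
```

Because the trigger is "more than" the threshold, a 16-block setting flushes on every 17th block here, so the amount of unflushed data stays bounded regardless of how much is written in total.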

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Andres Freund
On 2016-03-10 23:38:38 +0100, Fabien COELHO wrote: > I'm not sure I've seen these performance... If you have hard evidence, > please feel free to share it. Man, are you intentionally trying to be hard to work with? To quote the email you responded to: > My current plan is to commit this with

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Fabien COELHO
[...] I had originally kept it with one context per tablespace after refactoring this, but found that it gave worse results in rate limited loads even over only two tablespaces. That's on SSDs though. Might just mean that a smaller context size is better on SSD, and it could still be

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Andres Freund
On 2016-03-10 17:33:33 -0500, Robert Haas wrote: > On Thu, Mar 10, 2016 at 5:24 PM, Andres Freund wrote: > > On 2016-02-21 09:49:53 +0530, Robert Haas wrote: > >> I think there might be a semantic distinction between these two terms. > >> Doesn't writeback mean writing pages

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Robert Haas
On Thu, Mar 10, 2016 at 5:24 PM, Andres Freund wrote: > On 2016-02-21 09:49:53 +0530, Robert Haas wrote: >> I think there might be a semantic distinction between these two terms. >> Doesn't writeback mean writing pages to disk, and flushing mean making >> sure that they are

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Andres Freund
On 2016-02-21 09:49:53 +0530, Robert Haas wrote: > I think there might be a semantic distinction between these two terms. > Doesn't writeback mean writing pages to disk, and flushing mean making > sure that they are durably on disk? So for example when the Linux > kernel thinks there is too much

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-10 Thread Andres Freund
On 2016-03-08 09:28:15 +0100, Fabien COELHO wrote: > > >>>Now I cannot see how having one context per table space would have a > >>>significant negative performance impact. > >> > >>The 'dirty data' etc. limits are global, not per block device. By having > >>several contexts with unflushed dirty

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-08 Thread Fabien COELHO
Now I cannot see how having one context per table space would have a significant negative performance impact. The 'dirty data' etc. limits are global, not per block device. By having several contexts with unflushed dirty data the total amount of dirty data in the kernel increases. Possibly,

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-07 Thread Fabien COELHO
Hello Andres, Now I cannot see how having one context per table space would have a significant negative performance impact. The 'dirty data' etc. limits are global, not per block device. By having several contexts with unflushed dirty data the total amount of dirty data in the kernel

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-07 Thread Andres Freund
On 2016-03-07 21:10:19 +0100, Fabien COELHO wrote: > Now I cannot see how having one context per table space would have a > significant negative performance impact. The 'dirty data' etc. limits are global, not per block device. By having several contexts with unflushed dirty data the total amount

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-07 Thread Fabien COELHO
Hello Andres, (1) with 16 tablespaces (1 per table) on 1 disk : 680.0 tps per second avg, stddev [ min q1 median q3 max ] <=300tps 679.6 ± 750.4 [0.0, 317.0, 371.0, 438.5, 2724.0] 19.5% (2) with 1 tablespace on 1 disk : 956.0 tps per second avg, stddev [ min q1 median q3 max ]

Re: [HACKERS] checkpointer continuous flushing - V18

2016-03-07 Thread Andres Freund
On 2016-02-22 20:44:35 +0100, Fabien COELHO wrote: > > >>Random updates on 16 tables which total to 1.1GB of data, so this is in > >>buffer, no significant "read" traffic. > >> > >>(1) with 16 tablespaces (1 per table) on 1 disk : 680.0 tps > >>per second avg, stddev [ min q1 median q3 max ]

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-22 Thread Fabien COELHO
Random updates on 16 tables which total to 1.1GB of data, so this is in buffer, no significant "read" traffic. (1) with 16 tablespaces (1 per table) on 1 disk : 680.0 tps per second avg, stddev [ min q1 median q3 max ] <=300tps 679.6 ± 750.4 [0.0, 317.0, 371.0, 438.5, 2724.0] 19.5%
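For readers unfamiliar with this result layout, the following sketch shows how such a line could be derived from a per-second tps series: the average, the standard deviation, the bracketed five-number summary, and the share of seconds at or below 300 tps. The function name, the sample values, the quartile method and the choice of population standard deviation are all assumptions; the thread does not say how the original figures were computed.

```python
import statistics


def summarize_tps(per_second_tps, threshold=300.0):
    """Hypothetical summary of a per-second tps series."""
    data = sorted(per_second_tps)
    # Quartile method is an assumption; other methods give other q1/q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    slow_seconds = sum(1 for x in data if x <= threshold)
    return {
        "avg": statistics.fmean(data),
        "stddev": statistics.pstdev(data),
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "pct_le_threshold": 100.0 * slow_seconds / len(data),
    }


# Made-up sample, not data from the thread.
sample = [0.0, 150.0, 320.0, 380.0, 410.0, 450.0, 900.0, 2400.0]
summary = summarize_tps(sample)
```

A large standard deviation relative to the average, together with a high share of slow seconds, is what signals irregular throughput even when the average looks acceptable.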

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-22 Thread Andres Freund
On 2016-02-22 11:05:20 -0500, Tom Lane wrote: > Andres Freund writes: > > Interesting. That doesn't reflect my own tests, even on rotating media, > > at all. I wonder if it's related to: > >

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-22 Thread Tom Lane
Andres Freund writes: > Interesting. That doesn't reflect my own tests, even on rotating media, > at all. I wonder if it's related to: > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=23d0127096cb91cb6d354bdc71bd88a7bae3a1d5 > If you use your 12.04

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-22 Thread Andres Freund
On 2016-02-22 14:11:05 +0100, Fabien COELHO wrote: > > >I did a quick & small test with random updates on 16 tables with > >checkpoint_flush_after=16 checkpoint_timeout=30 > > Another run with more "normal" settings and over 1000 seconds, so less > "quick & small" than the previous one. > >

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-22 Thread Fabien COELHO
I did a quick & small test with random updates on 16 tables with checkpoint_flush_after=16 checkpoint_timeout=30 Another run with more "normal" settings and over 1000 seconds, so less "quick & small" than the previous one. checkpoint_flush_after = 16 checkpoint_timeout = 5min # default

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-22 Thread Fabien COELHO
Hallo Andres, AFAICR I used a "flush context" for each table space in some version I submitted, because I do think that this whole writeback logic really does make sense *per table space*, which suggests that there should be as many writeback contexts as table spaces, otherwise the positive

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-21 Thread Fabien COELHO
ISTM that "progress" and "progress_slice" only depend on num_scanned and per-tablespace num_to_scan and total num_to_scan, so they are somehow redundant and the progress could be recomputed from the initial figures when needed. They don't cause much space usage, and we access the values
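The redundancy argument can be made concrete with a sketch in which only the counters are stored and both derived values are recomputed on demand. The formulas are an assumption for illustration (slice = total buffers to scan divided by the tablespace's own count, progress = buffers scanned times slice), as are the class and function names; the thread snippet does not spell them out.

```python
from dataclasses import dataclass


@dataclass
class TablespaceStatus:
    """Hypothetical per-tablespace checkpoint counters."""
    name: str
    num_to_scan: int
    num_scanned: int = 0


def progress_slice(ts, total_to_scan):
    # Assumed definition: each written buffer advances by this much.
    return total_to_scan / ts.num_to_scan


def progress(ts, total_to_scan):
    # Recomputed from the counters instead of being stored.
    return ts.num_scanned * progress_slice(ts, total_to_scan)


def balanced_order(statuses):
    """Always write next from the tablespace that is least advanced."""
    total = sum(ts.num_to_scan for ts in statuses)
    remaining = [ts for ts in statuses if ts.num_to_scan > 0]
    order = []
    while remaining:
        ts = min(remaining, key=lambda t: progress(t, total))
        ts.num_scanned += 1
        order.append(ts.name)
        if ts.num_scanned == ts.num_to_scan:
            remaining.remove(ts)
    return order


order = balanced_order([TablespaceStatus("a", 4), TablespaceStatus("b", 2)])
```

Recomputing costs a division per comparison, which is the trade-off against keeping the two extra fields.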

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-21 Thread Fabien COELHO
[...] I do think that this whole writeback logic really does make sense *per table space*, Leads to less regular IO, because if your tablespaces are evenly sized (somewhat common) you'll sometimes end up issuing sync_file_range's shortly after each other. For latency outside checkpoints it's

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-21 Thread Andres Freund
Hi, On 2016-02-21 10:52:45 +0100, Fabien COELHO wrote: > * CkptSortItem: > > I think that allocating 20 bytes per buffer in shared memory is a little on > the heavy side. Some compression can be achieved: sizeof(ForkNum) is 4 bytes > to hold 4 values, could be one byte or even 2 bits somewhere.
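A rough size calculation helps weigh the suggestion. The layout below is an assumption (five 4-byte fields: two identifiers, the fork number, a block number and a buffer id), chosen only because it adds up to the 20 bytes quoted above; the 1 GiB buffer pool with 8 KiB pages is likewise a made-up example. Python's `struct` format strings stand in for the C struct.

```python
import struct

# Assumed layout: five 4-byte fields, 20 bytes in total.
current = struct.calcsize("=IIIIi")

# Shrinking the fork number to one byte, keeping field order and
# native alignment ("@"): padding before the next 4-byte field
# is expected to eat the saving on common platforms.
narrow_aligned = struct.calcsize("@IIBIi")

# Same fields with no alignment padding ("="), i.e. a packed layout.
narrow_packed = struct.calcsize("=IIBIi")

# Hypothetical pool: 1 GiB of 8 KiB buffers, one sort item each.
n_buffers = (1 << 30) // 8192
total_bytes = n_buffers * current
```

Under these assumptions the sort array costs 2.5 MiB per GiB of buffers, and a one-byte fork number only helps if the struct is also packed or reordered so that the saved bytes are not lost to padding.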

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-21 Thread Andres Freund
On 2016-02-21 08:26:28 +0100, Fabien COELHO wrote: > >>In the discussion in the wal section, I'm not sure about the effect of > >>setting writebacks on SSD, [...] > > > >Yea, that paragraph needs some editing. I think we should basically > >remove that last sentence. > > Ok, fine with me. Does

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-21 Thread Fabien COELHO
Hallo Andres, Here is a review for the second patch. For 0002 I've recently changed: * Removed the sort timing information, we've proven sufficiently that it doesn't take a lot of time. I put it there initially to demonstrate that there was no cache performance issue when sorting on just

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-21 Thread Fabien COELHO
Hallo Andres, [...] I do think that this whole writeback logic really does make sense *per table space*, Leads to less regular IO, because if your tablespaces are evenly sized (somewhat common) you'll sometimes end up issuing sync_file_range's shortly after each other. For latency outside

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-20 Thread Fabien COELHO
Hallo Andres, In some previous version I think a warning was shown if the feature was requested but not available. I think we should either silently ignore it, or error out. Warnings somewhere in the background aren't particularly meaningful. I like "ignoring with a warning" in the log

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-20 Thread Robert Haas
On Sun, Feb 21, 2016 at 3:37 AM, Andres Freund wrote: >> The documentation seems to use "flush" but the code talks about "writeback" >> or "flush", depending. I think one vocabulary, whichever it is, should be >> chosen and everything should stick to it, otherwise everything

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-20 Thread Andres Freund
Hi, On 2016-02-20 20:56:31 +0100, Fabien COELHO wrote: > >* Currently *_flush_after can be set to a nonzero value, even if there's > > no support for flushing on that platform. Imo that's ok, but perhaps > > other people's opinions differ. > > In some previous version I think a warning was shown

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-20 Thread Fabien COELHO
Hello Andres, For 0001 I've recently changed: * Don't schedule writeback after smgrextend() - that defeats Linux's delayed allocation mechanism, increasing fragmentation noticeably. * Add docs for the new GUC variables * comment polishing * BackendWritebackContext now isn't dynamically

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-19 Thread Andres Freund
On 2016-02-19 22:46:44 +0100, Fabien COELHO wrote: > > Hello Andres, > > >Here's the next two (the most important) patches of the series: > >0001: Allow to trigger kernel writeback after a configurable number of > >writes. > >0002: Checkpoint sorting and balancing. > > I will look into these

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-19 Thread Fabien COELHO
Hello Andres, Here's the next two (the most important) patches of the series: 0001: Allow to trigger kernel writeback after a configurable number of writes. 0002: Checkpoint sorting and balancing. I will look into these two in depth. Note that I would have ordered them in reverse because

Re: [HACKERS] checkpointer continuous flushing - V18

2016-02-19 Thread Andres Freund
On 2016-02-04 16:54:58 +0100, Andres Freund wrote: > Hi, > > Fabien asked me to post a new version of the checkpoint flushing patch > series. While this isn't entirely ready for commit, I think we're > getting closer. > > I don't want to post a full series right now, but my working state is >