On Thu, Feb 23, 2012 at 3:17 AM, Greg Smith <g...@2ndquadrant.com> wrote:
> I think an even bigger factor now is that the BGW writes can disturb write
> ordering/combining done at the kernel and storage levels. It's painfully
> obvious now how much PostgreSQL relies on that to get good performance. All
> sorts of things break badly if we aren't getting random writes scheduled to
> optimize seek times, in as many contexts as possible. It doesn't seem
> unreasonable that background writer writes can introduce some delay into the
> checkpoint writes, just by adding more random components to what is already
> a difficult to handle write/sync series. That's what I think what these
> results are showing is that background writer writes can deoptimize other
> forms of write.

How hard would it be to dummy up a bgwriter which, every time it wakes
up, forks off a child process to actually do the write, while the real
one just waits for the child to exit? If it didn't have to correctly
handle signals, SINVAL, and such, it should be just a few lines of code,
but I don't know how much we can ignore signals and such even just for
testing purposes. My thought here is that the kernel is getting in a
snit over one process doing all the writing on the system, and is
punishing that process in a way that ruins things for everyone.

> A second fact that's visible from the TPS graphs over the test run, and
> obvious if you think about it, is that BGW writes force data to physical
> disk earlier than they otherwise might go there.

On a busy system like the one you are testing, the BGW should only be
writing out data a fraction of a second before the backends would
otherwise be doing it, unless the "2 minutes to circle the buffer pool"
logic is in control rather than the bgwriter_lru_multiplier and
bgwriter_lru_maxpages logic.
(From the data reported, we can see how many buffer allocations there
are, but not how many circles of the pool it took to find them.)

It doesn't seem likely that small shifts in timing are having that
effect, compared to the possible effect of who is doing the writing. If
the timing is truly the issue, lowering bgwriter_delay might smooth the
timing out and bring it closer to what the backends would do for
themselves.

Cheers,

Jeff
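
For reference, the fork-per-wakeup experiment described above could be sketched roughly as follows. This is a standalone illustration, not PostgreSQL code: the name write_via_child is hypothetical, it uses a plain write() on an already-open file descriptor, and it deliberately ignores signals, SINVAL, and everything to do with shared buffers, which is exactly the part that may not be safe to ignore.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for one bgwriter wakeup: hand the write to a
 * short-lived child so the kernel sees a fresh PID issuing it, while the
 * caller (the "real" bgwriter) only waits for the child to exit.
 * Returns 0 if the child wrote all len bytes, -1 otherwise.
 */
int
write_via_child(int fd, const void *buf, size_t len)
{
    pid_t   pid;
    int     status;

    assert(buf != NULL);

    pid = fork();
    if (pid < 0)
        return -1;              /* fork failed */

    if (pid == 0)
    {
        /* Child: do the write, then exit without running atexit hooks. */
        ssize_t n = write(fd, buf, len);

        _exit(n == (ssize_t) len ? 0 : 1);
    }

    /* Parent: wait for the child, retrying if a signal interrupts us. */
    while (waitpid(pid, &status, 0) < 0)
    {
        if (errno != EINTR)
            return -1;
    }

    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A test build would call something like this at the point where the bgwriter currently issues its own write. Because the child inherits the descriptor, the write lands in the same file; only the identity of the writing process changes, which is the variable the experiment is meant to isolate.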