On Tuesday, July 16, 2013 10:16 PM Ants Aasma wrote:
> On Jul 14, 2013 9:46 PM, "Greg Smith" <g...@2ndquadrant.com> wrote:
> > I updated and re-reviewed that in 2011:
> > http://www.postgresql.org/message-id/4d31ae64.3000...@2ndquadrant.com
> > and commented on why I think the improvement was difficult to reproduce
> > back then. The improvement didn't follow for me either. It would take
> > a really amazing bit of data to get me to believe write sorting code is
> > worthwhile after that. On large systems capable of dirtying enough
> > blocks to cause a problem, the operating system and RAID controllers
> > are already sorting block. And *that* sorting is also considering
> > concurrent read requests, which are a lot more important to an
> > efficient schedule than anything the checkpoint process knows about.
> > The database doesn't have nearly enough information yet to compete
> > against OS level sorting.
>
> That reasoning makes no sense. OS level sorting can only see the
> writes in the time window between PostgreSQL write, and being forced
> to disk. Spread checkpoints sprinkles the writes out over a long
> period and the general tuning advice is to heavily bound the amount of
> memory the OS willing to keep dirty. This makes probability of
> scheduling adjacent writes together quite low, the merging window
> being limited either by dirty_bytes or dirty_expire_centisecs. The
> checkpointer has the best long term overview of the situation here, OS
> scheduling only has the short term view of outstanding read and write
> requests. By sorting checkpoint writes it is much more likely that
> adjacent blocks are visible to OS writeback at the same time and will
> be issued together.
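To make the argument quoted above concrete, here is a toy model in Python.
It is only a sketch, not PostgreSQL code: the names sort_checkpoint_writes
and count_write_requests, the (file id, block number) tags, and the
fixed-size "window" standing in for what the OS can see before dirty_bytes
or dirty_expire_centisecs forces writeback are all my own assumptions.

```python
from typing import List, Tuple

# Hypothetical buffer tag: (relation file id, block number).
BufferTag = Tuple[int, int]


def sort_checkpoint_writes(dirty: List[BufferTag]) -> List[BufferTag]:
    """Order dirty buffers by file and block before the checkpoint issues them."""
    return sorted(dirty)


def count_write_requests(order: List[BufferTag], window: int) -> int:
    """Count I/O requests under a simplified OS merge window.

    Assumption: the OS only sees `window` consecutive writes at a time and
    merges blocks that are adjacent within that same window.
    """
    requests = 0
    for start in range(0, len(order), window):
        batch = sorted(order[start:start + window])
        prev = None
        for tag in batch:
            adjacent = (
                prev is not None
                and tag[0] == prev[0]
                and tag[1] == prev[1] + 1
            )
            if not adjacent:
                requests += 1  # starts a new physical write
            prev = tag
    return requests


# Eight contiguous blocks of one file, in an interleaved buffer-pool order.
dirty = [(1, 0), (1, 4), (1, 1), (1, 5), (1, 2), (1, 6), (1, 3), (1, 7)]

unsorted_requests = count_write_requests(dirty, window=2)
sorted_requests = count_write_requests(sort_checkpoint_writes(dirty), window=2)
print(unsorted_requests, sorted_requests)  # prints "8 4"
```

In this model the unsorted order never puts two adjacent blocks in the same
window, so every block becomes its own request, while the sorted order lets
each window merge a pair. It says nothing about concurrent reads, which is
the other half of the disagreement above.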

I think Oracle also uses a similar concept to make writes efficient, and
they also hold a patent on this technology, which you can find at the
link below:
http://www.google.com/patents/US7194589?dq=645987&hl=en&sa=X&ei=kn7mUZ-PIsWqrAe99oDgBw&sqi=2&pjf=1&ved=0CEcQ6AEwAw

Oracle has a different concept for performing checkpoint writes, but I
thought I should share the above link so that we do not unknowingly go
down the wrong path.

AFAIK, instead of depending on OS buffers, they use direct I/O, and in
fact in the patent above they use a temporary buffer (Claim 3) to sort
the writes. As far as I can understand from reading the above thread,
that is not the same idea as the one discussed here.

With Regards,
Amit Kapila.

--
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers