On 12/20/06, Takayuki Tsunakawa <[EMAIL PROTECTED]> wrote:
[Conclusion] I believe that the problem cannot be solved in a real sense by avoiding fsync/fdatasync(). We can't ignore what commercial databases have done so far. The kernel does as much as he likes when PostgreSQL requests him to fsync().
I am new to the community and am very interested in the tests that you have done. I am also working on resolving the sudden IO spikes at checkpoint time. I agree with you that fsync() is the core issue here. Being a new member I was wondering if someone on this list has done testing with O_DIRECT and/or O_SYNC for datafiles as that seems to be the most logical way of dealing with fsync() flood at checkpoint time. If so, I'll be very interested in the results. As mentioned in this thread that a single bgwriter with O_DIRECT will not be able to keep pace with cleaning effort causing backend writes. I think (i.e. IMHO) multiple bgwriters and/or AsyncIO with O_DIRECT can resolve this issue. Talking of bgwriter_* parameters I think we are missing a crucial internal counter i.e. number of dirty pages. How much work bgwriter has to do at each wakeup call should be a function of total buffers and currently dirty buffers. Relying on both these values instead of just one static NBuffers should allow bgwriter to adapt more quickly to workload changes and ensure that not much work is accumulated for checkpoint. -- Inaam Rana EnterpriseDB http://www.enterprisedb.com