Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Where am I wrong?
>
> I don't think any of this is relevant.  There are a certain number of
> blocks we have to get down to disk before we can declare a transaction
> committed, and there are a certain number that we have to get down to
> disk before we can declare a checkpoint complete.  You are focusing too
> much on the question of whether a particular process performs an fsync
> operation, and ignoring the fact that ultimately it's got to wait for
> I/O to complete --- directly or indirectly.  If it blocks waiting for
> some other process to declare a buffer clean, rather than writing for
> itself, what's the difference?

The difference is two-fold.  First, there might be 10 other backends
asking for writes, so it isn't clear that asking someone else to do the
write is as fast.  Second, that other writer is doing fsync, so it is
100x or 1000x slower than our current setup.

> Sure, fsync serializes the particular process that's doing it, but we
> can deal with that by spreading the fsyncs across multiple processes,
> and trying to ensure that they are mostly background processes rather
> than foreground ones.

How many?  That was my point: it might require 1000 backend processes
_and_ it would be slower because we are doing write/fsync rather than
just write.  However, I think we could fix that by doing the write and
returning OK to the backend, then doing the fsync whenever we want ---
perhaps that was already your plan.

> I don't claim that immediate-fsync-on-write is the only answer, but
> I cannot follow your reasoning for dismissing it out of hand ... and I
> certainly cannot buy *any* logic that says that sync() is a good answer
> to any of these issues.  AFAICS sync() means that we abandon
> responsibility.

sync() means we group the fsyncs into one massive one that syncs all
other processes' I/O too --- clearly bad, but I am hoping for something
as good as what we currently have, because that sync hopefully happens
only every few minutes.

> > Do we know that having the background writer fsync a file that was
> > written by a backend cause all the data to fsync?  I think I could write
> > a program to test this by timing each of these tests:
>
> That might prove something about the particular platform you tested it
> on; but it would not speak to the real problem, which is what we can
> assume is true on every platform...

Yes, it would only be per platform.  I was thinking we could have a
platform test and enable this behavior on platforms that support it
(all?) and use sync on the others.

Also, let me say I am glad we are delving into this.
Our buffer system has needed an overhaul for a while, and right now we
already have some major improvements for 7.5, and this discussion is just
a continuation of those improvements.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
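
For what it's worth, the kind of per-platform timing test discussed above
could be sketched roughly as follows.  This is only an illustration: the
list of tests in the quoted message was cut off, so the three cases here
(write only, write plus fsync on every block, and write through one
descriptor followed by a single fsync through a second descriptor) are my
assumption of what such a test might cover.  The function names and the
use of Python are hypothetical and not from the thread.  Timing can only
hint at behavior; a drive with a write cache may report completion before
the data is actually durable.

```python
import os
import tempfile
import time


def timed_write(path, nblocks, block, fsync_each):
    """Write nblocks copies of block to path and return elapsed seconds.

    If fsync_each is true, fsync after every write (the write/fsync case);
    otherwise only write() is called (the plain write case).
    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(nblocks):
            os.write(fd, block)
            if fsync_each:
                os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


def timed_fsync_via_second_descriptor(path, nblocks, block):
    """Write through one descriptor, then fsync through a freshly opened one.

    Returns (write_seconds, fsync_seconds).  A noticeably long fsync time
    suggests, but does not prove, that the second descriptor's fsync
    flushed the data written through the first.
    """
    wfd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(nblocks):
            os.write(wfd, block)
        write_s = time.perf_counter() - start
    finally:
        os.close(wfd)

    # Opened read/write because some platforms refuse fsync on a
    # read-only descriptor.
    sfd = os.open(path, os.O_RDWR)
    try:
        start = time.perf_counter()
        os.fsync(sfd)
        fsync_s = time.perf_counter() - start
    finally:
        os.close(sfd)
    return write_s, fsync_s


if __name__ == "__main__":
    block = b"\0" * 8192  # 8 kB, PostgreSQL's default block size
    nblocks = 100         # arbitrary count chosen for the sketch
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "fsync_test.dat")
        print("write only:          %.6f s" % timed_write(path, nblocks, block, False))
        print("write + fsync each:  %.6f s" % timed_write(path, nblocks, block, True))
        w, f = timed_fsync_via_second_descriptor(path, nblocks, block)
        print("write, other fd fsync: write %.6f s, fsync %.6f s" % (w, f))
```

Comparing the first two numbers gives a rough feel for the write versus
write/fsync gap on a given machine.  The third case uses two descriptors
in one process; a closer match to the backend and background-writer
question would do the write and the fsync in separate processes.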