On Mon, Dec 12, 2011 at 2:40 PM, Greg Troxel <g...@ir.bbn.com> wrote:
>
> Andy Ruhl <acr...@gmail.com> writes:
>
>> If solving your problem depends on sync frequency, I don't see why
>> this shouldn't be managed by some knob to twiddle. Given that the
>> crash scenario doesn't get worse depending on where the knob is or if
>> the crash happens while the knob is working. If it does, it's
>> pointless.
>
> My sense is that Donald isn't complaining about why is the sync
> frequency 30s instead of 60s;

That's right. The only thing I'm *really* complaining about is people
who don't read what seems to me to be plain English (I exclude from my
complaint those for whom English is not their native language).

> it's more bafflement at waiting 10-15
> minutes with an idle disk and having the data not synced at all.
> There's a historical period of 30s, and that seems both not often enough
> not to cause trouble and often enough to not boggle users.

That's certainly an issue with NetBSD that David Holland, correctly in
my view, identified as a bug. OpenBSD, per the experiments I've already
described, does not exhibit this behavior.

Note that this bug *may* not worsen the probability of recovery after a
crash. It might even increase it! Think about it. If you boot NetBSD and
mount a filesystem async, it is going to be correctly structured (or
deemed to be by fsck) at boot time, or the system wouldn't mount it.
Assuming the system is happy with it, if you then make changes to the
filesystem, but, because of this bug they are all in the buffer cache
and never get written out, and then the system crashes --- you've got
the filesystem you started with.

This bug more importantly affects, in my view, the amount of stuff you
might lose in the event of a crash. If the system has been up for N
hours and you've been working away, making changes, dutifully hitting
ctrl-s in gnumeric to write out changes because people have told you
that changes to a gnumeric spreadsheet aren't in the filesystem until
saved, and the system crashes, you are in for a big surprise. Chances
are good that you will not lose the filesystem, but chances are great
that you will lose your N hours of work.

>
> It may also make sense to have a syncer behavior that is low rate, to
> not overwhelm asked-for IO, and to use most of the disk bandwidth when
> it is on, and to let it be otherwise, for laptops.
> But a basic
> correctness property is almost certainly that if the disk is spun up and
> is not in heavy use and lots of time passes, dirty buffers (data and
> metadata) are written to disk.

Yep. Now, knowing about this bug, a simple sync-sleep loop takes care of
it. But it should be fixed in the system, so the user doesn't have to
remember to do this, or to install such a loop in one of the init-time
files.

/Don
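
For anyone who wants the workaround spelled out, here is a minimal sketch of such a sync-sleep loop in Python. The function name `sync_loop` and its parameters are made up for illustration, and the 30-second default is only an assumption borrowed from the historical period mentioned above. It relies on `os.sync()`, which is available on Unix with Python 3.3 or later.

```python
import os
import time


def sync_loop(interval_seconds=30.0, iterations=None, sleep=time.sleep, sync=None):
    """Call sync() once per interval.

    Hypothetical helper: runs forever when iterations is None,
    otherwise stops after that many sync calls. Returns the number
    of sync calls made.
    """
    if sync is None:
        # os.sync() asks the kernel to write out all dirty buffers.
        sync = os.sync
    count = 0
    while iterations is None or count < iterations:
        sync()
        count += 1
        # Wait before the next flush; 30s is an assumed default.
        sleep(interval_seconds)
    return count
```

Calling `sync_loop()` with no arguments runs until interrupted, so it would have to be started in the background from whichever init-time file is used. This only papers over the problem; it is not a fix for the syncer itself.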