On Wednesday 17 June 2009 16:45:42 Cathey, Jim wrote:
> That particular system wasn't using fsync() (or
> any equivalent) if the -f was given, it sync'd
> the _filesystem_ upon which the given file lay.
> (That would, of course, catch the file as well.)

I was asking why you need -f to indicate you're supplying a command line 
argument when you could list files on the command line directly, since the 
current command doesn't take any.

What the implementation chooses to do with that information is a separate 
issue.  My point was "what does the -f accomplish?"

> I only mentioned it as a data point, since there
> was life before GNU or POSIX.  And systems in that
> day weren't as fast as now, and things like rfs
> existed where a full-blown sync could indeed be
> very expensive.

It can still be hideously expensive.

The sync code in Linux has been screwed up for years, and despite adding 
multiple I/O schedulers and ionice and such, they're still finding brown 
paper bag bugs like the one in ext3 that just got fixed in 2.6.30:

  http://lwn.net/Articles/328363/

I note that when I back up stuff onto my terabyte USB drive (I use rsync for 
this, as a normal priority command run by my normal login user), things like 
vi become unusable (it calls fsync() about every 100 characters you type, and 
yes, cursoring around in the file counts), because any attempt to sync 
anything will hang until the backup to the USB drive _completes_.  Yes, I've 
timed it, and it's waited longer than 15 minutes.  Yes, the processes will be 
stuck in D state and unkillable until the backup completes.

Perhaps that was the above ext3 bug, dunno yet.

> (I seem to recall that the database
> volume had a sync -f hooked to the powerfail interrupt,
> followed by a full sync that it would finish if it
> had time before power fully failed.)

These days that's mostly handled with journaling and state checkpointing.  A 
busy server with 8 gigabytes of RAM that's dirtying a lot of pages can take 15 
seconds to flush its dirty buffers, even with fast disks and no new activity.

The point of fsync is notification and error handling.  You don't acknowledge 
the transaction until it's committed to disk locally (email servers put huge 
amounts of effort into this, because it's _vitally_ important that the spam 
filter be the thing to have a false positive and discard your email, rather 
than the network transaction losing it), and fsync lets you know when that 
happened.  It also might let you know if there was an error, although it only 
flushes the file contents and not the metadata, so you have to open the 
directory containing it and fsync that too if you really care.  Plus, the way 
caching works, the kernel usually doesn't try to write your data to disk 
until after you call close(), so historically real physical disk errors 
happened when nobody was listening; the lower layers weren't very good about 
propagating them up, and the errors just wound up in dmesg instead...

Anyway, being able to specify paths on the sync command line seems like a 
reasonable extension, but not if there isn't agreement on what sync should 
_do_ with the extra information.  (Calling it fsync doesn't change that.  What 
does the SUSE one do?)

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox