On Sat, Mar 5, 2011 at 4:22 PM, Martin Steigerwald <[email protected]> wrote:

> Since after losing uptime records once again due to a crash while testing
> kernels I am so through with it that it is not even funny anymore,
> Theodore Ts'o said that one should not fear the fsync() [1] - especially
> not with Ext4 - and I prefer not losing uptime records over and over and
> over again I built a test version using fsync() at the location this bug
> report is about:
>

As far as I understand it, upstream isn't willing to consider an fsync()
patch, which is /bad/ for all the previously mentioned reasons, which the
post you quote only marginally addresses. Uptimed needs neither atomicity nor
durability: it keeps a *backup* of its previous database. Also, to put
things in a little more perspective: uptimed doesn't only run on Linux. It
runs on a variety of other platforms, where fsync() may have a greater cost
than what Ted suggests. And finally, and probably more to the point:

FSYNC(2)                    BSD System Calls Manual                   FSYNC(2)

[...]

     Note that while fsync() will flush all data from the host to the drive
     (i.e. the "permanent storage device"), the drive itself may not physi-
     cally write the data to the platters for quite some time and it may be
     written in an out-of-order sequence.

     Specifically, if the drive loses power or the OS crashes, the application
     may find that only some or none of their data was written.  The disk
     drive may also re-order the data so that later writes may be present,
     while earlier writes are not.

This explains that even fsync() cannot /guarantee/ that the data will hit the
disk in the event of a crash (this is especially true with today's larger
disk caches).

I'm absolutely against such a patch, which is the wrong solution to this
problem (and no, I'm not going to add a patch to tune for a specific
filesystem either - not everyone uses ext4 - especially not to work around
system crashes, which, *again*, do not constitute a "normal use of the
system").

As stated before, the correct solution would be to add another layer of
checks during daemon startup, which would verify that the file it's reading
is valid (i.e., to begin with, "not empty" and "has parseable data"), and fall
back to the backup copy otherwise. This, by design, is the correct approach
and has /none/ of the drawbacks of your fsync() patch.
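
To make the idea concrete, here is a rough, untested-against-uptimed sketch of
what such a startup check could look like. The function names
(records_file_is_valid, choose_records_file) and the assumed
"uptime:boottime:description" line format are purely illustrative, not taken
from the actual source; a real patch would reuse the existing parser.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Return 1 if the file at 'path' can be opened, contains at least one
 * record, and every line starts with two colon-separated unsigned
 * numbers. Return 0 otherwise.
 *
 * ASSUMPTION: one record per line, "uptime:boottime:description".
 * Lines longer than the buffer are split by fgets() and the remainder
 * will normally fail to parse, so the file is treated as invalid.
 */
int records_file_is_valid(const char *path)
{
    FILE *f;
    char line[512];
    unsigned long uptime, boottime;
    int nrecords = 0;

    assert(path != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return 0;               /* missing or unreadable */

    while (fgets(line, sizeof line, f) != NULL) {
        if (sscanf(line, "%lu:%lu:", &uptime, &boottime) != 2) {
            fclose(f);
            return 0;           /* garbage, blank, or NUL-filled line */
        }
        nrecords++;
    }
    fclose(f);

    return nrecords > 0;        /* an empty file is not valid */
}

/*
 * Pick the file the daemon should load at startup: the primary
 * database if it is valid, else the backup copy, else NULL (start
 * with no records).
 */
const char *choose_records_file(const char *primary, const char *backup)
{
    if (records_file_is_valid(primary))
        return primary;
    if (records_file_is_valid(backup))
        return backup;
    return NULL;
}
```

The daemon would call choose_records_file() once at startup with its two
paths and read whatever comes back. Nothing changes in the write path, so
there is no extra cost on any platform.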

I would gladly review such a patch.

HTH

-- 
Thibaut VARENE
http://www.parisc-linux.org/~varenet/
