On Thu, Aug 15, 2013 at 10:31:54PM +0200, richard -rw- weinberger wrote:
> On Thu, Aug 15, 2013 at 8:59 PM, Richard W.M. Jones <rjo...@redhat.com> wrote:
> > On Thu, Aug 15, 2013 at 08:41:53PM +0200, richard -rw- weinberger wrote:
> >> Meant regarding fsync()...
> >
> > Ah right, see what you mean :-)
> >
> > It calls sync(2).  Then it opens each /dev/ubdX device and calls fsync
> > on the file descriptor:
> >
> > https://github.com/libguestfs/libguestfs/blob/master/daemon/sync.c#L54
> >
> > The reason for this is a bit complicated, but has to do with write
> > barriers.  I believe this has been fixed since then.
> >
> > https://github.com/libguestfs/libguestfs/commit/c0a3c9ce70b98171e737e49e6dccc4457963f2ec
> >
> > In any case, we're calling sync & fsync in the guest, and that ain't
> > causing the host cache to be flushed.
> 
> Hmm, AFAIK a fsync() after sync() is pointless.

That's definitely what you'd expect, but Linux was buggy until last
year: sync(2) only synced mounted filesystems, not block devices in
general.

This was fixed:

  commit a8c7176b6ded413d5044a00f1d05477b95a6d7ad
  Author: Jan Kara <j...@suse.cz>
  Date:   Tue Jul 3 16:45:32 2012 +0200

    vfs: Make sys_sync writeout also block device inodes

but in libguestfs we can't rely on people using a new enough kernel,
so we do the safe thing.

> > It's pretty easy to show this with libguestfs:
> >
> > LIBGUESTFS_BACKEND=uml LIBGUESTFS_HV=~/d/linux-um/linux \
> > time ./run ./fish/guestfish -N fs:ext2:1G -m /dev/sda1 fallocate64 /data 
> > 800M
> > time sync
> >
> > The first command finishes in 8 seconds, with barely any disk
> > activity.  The sync afterwards takes 9 seconds with the disk light on
> > the whole time, while it actually writes the whole 800 MB of data.
> 
> I think you don't see the desired effect because UML does not open
> the backing file with O_SYNC.
> But you can tell UML to do so.
> e.g. write ubdas=/foo instead of ubda=/foo.

I didn't know about these flags.

I modified libguestfs to use the 's' flag, and reran the test above:

  UML (with O_SYNC):           598 seconds
  UML (unsafe):                  9 seconds
  UML + explicit host sync:     20 seconds
  KVM:                          20 seconds

However, using O_SYNC is not the same as properly handling write
barriers.  UML is buggy in this respect: a user in the guest who runs
'sync' (and sees it return to the prompt) doesn't expect that a power
failure immediately afterwards would lose all their data.  With a
journalled filesystem they would also risk unrecoverable corruption
during general usage.

KVM and LVM both went through this a few years ago and have fixed
these issues (mind you, KVM only does the right thing if you use the
right cache=... option).

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
