Re: [Qemu-devel] Re: [RFC][PATCH] make sure disk writes actually hit disk

2006-07-31 Thread Jens Axboe
On Fri, Jul 28 2006, Rik van Riel wrote:
> Anthony Liguori wrote:
> 
> >Right now Fabrice is working on rewriting the block API to be
> >asynchronous.  There's been quite a lot of discussion about why using
> >threads isn't a good idea for this
> 
> Agreed, AIO is the way to go in the long run.
> 
> >With a proper async API, is there any reason why we would want this to be
> >tunable?  I don't think there's much of a benefit of prematurely claiming
> >a write is complete especially once the SCSI emulation can support
> >multiple simultaneous requests.
> 
> You're right.  This O_SYNC bandaid should probably stay in place
> to prevent data corruption, until the AIO framework is ready to
> be used.

O_SYNC is horrible, it'll totally kill performance. QEMU is basically
just a write-cache-enabled disk and it supports disk flushes as well. So
essentially it's the OS on top of QEMU that needs to take care of
flushing data out, like using barriers on the file system and
propagating fsync() properly down.

-- 
Jens Axboe
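
[Editor's sketch] The two strategies contrasted in this message can be illustrated roughly as follows. This is a minimal Python sketch, not QEMU code: the function names `write_osync` and `write_cached_then_flush` are hypothetical, and only `os.open`, `os.write`, `os.fsync` and the `O_SYNC` flag are real interfaces. It assumes a POSIX-like host; where `O_SYNC` is missing the flag falls back to 0 and the first function behaves like an ordinary buffered write.

```python
import os

# O_SYNC is not defined on every platform; fall back to 0 (no extra flag) there.
O_SYNC = getattr(os, "O_SYNC", 0)


def write_osync(path, blocks):
    """O_SYNC approach: each write() returns only once the data is on stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_SYNC, 0o600)
    try:
        for block in blocks:
            os.write(fd, block)  # the caller waits for the device on every block
    finally:
        os.close(fd)


def write_cached_then_flush(path, blocks):
    """Write-cache approach: writes go to the host page cache, fsync() acts as the flush."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for block in blocks:
            os.write(fd, block)  # returns once the data is in the page cache
        os.fsync(fd)  # stands in for the flush the guest OS would request
    finally:
        os.close(fd)
```

Both functions leave the same bytes in the file; the difference is when the caller has to wait. In the second variant durability is paid for once, at the point the guest asks for it, which is the behaviour argued for above.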



___
Qemu-devel mailing list
Qemu-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/qemu-devel


Re: [Qemu-devel] Re: [RFC][PATCH] make sure disk writes actually hit disk

2006-07-28 Thread Paul Brook
> > Have you measured the impact of O_SYNC? I wouldn't be surprised if it was
> > significant.
>
> I suspect it'll be horrific in the qemu codebase (blocking execution
> of the guest OS until disk IO is complete), but it's fine in the Xen
> qemu-dm situation, where IO completion happens asynchronously.
>
> The recent commit message on the Xen side did not suggest there was
> that much of a difference between both qemu code bases.  Obviously
> I was wrong, and the O_SYNC bandaid should probably be kept out for
> now.

Ah, ok. I didn't realise they'd diverged that much either.

Paul




Re: [Qemu-devel] Re: [RFC][PATCH] make sure disk writes actually hit disk

2006-07-28 Thread Rik van Riel

Paul Brook wrote:

> > > With a proper async API, is there any reason why we would want this to be
> > > tunable?  I don't think there's much of a benefit of prematurely claiming
> > > a write is complete especially once the SCSI emulation can support
> > > multiple simultaneous requests.
> >
> > You're right.  This O_SYNC bandaid should probably stay in place
> > to prevent data corruption, until the AIO framework is ready to
> > be used.
>
> It's arguable whether O_SYNC is needed at all. Qemu doesn't claim data is
> written to disk, and provides facilities for the guest OS to flush the cache,
> just like real hardware does.

Nice.  Another difference between the qemu codebase and the qemu-dm
codebase used by Xen.

With the bdrv_flush stuff in place, it should even be easy for qemu
to actually do something when the guest OS switches disk write caching
off (currently that is a noop in the qemu code base).

> Have you measured the impact of O_SYNC? I wouldn't be surprised if it was
> significant.

I suspect it'll be horrific in the qemu codebase (blocking execution
of the guest OS until disk IO is complete), but it's fine in the Xen
qemu-dm situation, where IO completion happens asynchronously.

The recent commit message on the Xen side did not suggest there was
that much of a difference between both qemu code bases.  Obviously
I was wrong, and the O_SYNC bandaid should probably be kept out for
now.

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
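
[Editor's sketch] The idea mentioned in this message, letting the emulated disk react when the guest switches write caching off, could look roughly like the toy model below. It is only an illustration under stated assumptions: the class `CachedDisk`, its methods and the `flushes` counter are hypothetical names and are not QEMU's `bdrv_flush` or IDE emulation code. The cache is assumed to start enabled.

```python
import os


class CachedDisk:
    """Toy model of a disk whose write cache the guest can switch on or off."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self.write_cache = True  # assumption: cache enabled by default
        self.flushes = 0         # counts fsync() calls, for illustration only

    def set_write_cache(self, enabled):
        # When the cache is switched off, push out anything still cached first.
        if self.write_cache and not enabled:
            self.flush()
        self.write_cache = enabled

    def write(self, offset, data):
        os.lseek(self.fd, offset, os.SEEK_SET)
        os.write(self.fd, data)
        if not self.write_cache:
            self.flush()  # write-through: report completion only after fsync()

    def flush(self):
        # Corresponds to the guest issuing a cache flush command.
        os.fsync(self.fd)
        self.flushes += 1

    def close(self):
        os.close(self.fd)
```

With the cache on, `write()` costs no `fsync()` until the guest calls `flush()`; with the cache off, every write is followed by one, so the guest chooses the trade-off instead of the host imposing O_SYNC on everything.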




Re: [Qemu-devel] Re: [RFC][PATCH] make sure disk writes actually hit disk

2006-07-28 Thread Paul Brook
> > With a proper async API, is there any reason why we would want this to be
> > tunable?  I don't think there's much of a benefit of prematurely claiming
> > a write is complete especially once the SCSI emulation can support
> > multiple simultaneous requests.
>
> You're right.  This O_SYNC bandaid should probably stay in place
> to prevent data corruption, until the AIO framework is ready to
> be used.

It's arguable whether O_SYNC is needed at all. Qemu doesn't claim data is 
written to disk, and provides facilities for the guest OS to flush the cache, 
just like real hardware does.

Have you measured the impact of O_SYNC? I wouldn't be surprised if it was 
significant.

Paul




Re: [Qemu-devel] Re: [RFC][PATCH] make sure disk writes actually hit disk

2006-07-28 Thread Rik van Riel

Anthony Liguori wrote:

> Right now Fabrice is working on rewriting the block API to be
> asynchronous.  There's been quite a lot of discussion about why using
> threads isn't a good idea for this

Agreed, AIO is the way to go in the long run.

> With a proper async API, is there any reason why we would want this to be
> tunable?  I don't think there's much of a benefit of prematurely claiming
> a write is complete especially once the SCSI emulation can support
> multiple simultaneous requests.

You're right.  This O_SYNC bandaid should probably stay in place
to prevent data corruption, until the AIO framework is ready to
be used.

No sense investing too much time in a fancier band-aid.

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan




[Qemu-devel] Re: [RFC][PATCH] make sure disk writes actually hit disk

2006-07-28 Thread Anthony Liguori
On Fri, 28 Jul 2006 15:54:30 -0400, Rik van Riel wrote:

> This is the simple approach to making sure that disk writes actually hit
> disk before we tell the guest OS that IO has completed.  Thanks to
> DMA_MULTI_THREAD the performance still seems to be adequate.

Hi Rik,

Right now Fabrice is working on rewriting the block API to be
asynchronous.  There's been quite a lot of discussion about why using
threads isn't a good idea for this (I wish Xen wouldn't use this patch but
that's another conversation :-)).

The async block API will allow the use of different kinds of async
"backends".  The default (on Linux) will be posix-aio.  I'm currently
working on an HTTP backend and will also write a linux-aio backend (which,
of course, will be using O_DIRECT).

> A fancier solution would be to make the sync/non-sync behaviour of the
> qemu disk backing store tunable from the guest OS, by tuning the IDE disk
> write cache on/off with hdparm, and having hw/ide.c call ->fsync functions
> in the block backends.

With a proper async API, is there any reason why we would want this to be
tunable?  I don't think there's much of a benefit of prematurely claiming
a write is complete especially once the SCSI emulation can support
multiple simultaneous requests.

I was hoping to just make linux-aio the default if it was available...

Regards,

Anthony Liguori

> I'm willing to code up the fancy solution if people prefer that.
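
[Editor's sketch] To make the point about an async block API concrete, here is a rough model of a completion-callback interface in which a write is reported as done only after the data has been flushed. Everything in it is an assumption for illustration: `AsyncBlockBackend`, `submit_write` and `poll` are hypothetical names, not the API being written for QEMU, and the stand-in backend simply performs the IO inside `poll()` where a real one would hand requests to posix-aio or linux-aio.

```python
import os
from collections import deque


class AsyncBlockBackend:
    """Hypothetical interface: submit_write() queues a request, poll() completes it."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self.pending = deque()

    def submit_write(self, offset, data, on_complete):
        # Returns immediately, so the guest keeps running with requests in flight.
        self.pending.append((offset, data, on_complete))

    def poll(self):
        """Called from the main loop: carry out queued writes, then report them."""
        if not self.pending:
            return 0
        batch = list(self.pending)
        self.pending.clear()
        for offset, data, _ in batch:
            os.lseek(self.fd, offset, os.SEEK_SET)
            os.write(self.fd, data)
        os.fsync(self.fd)  # one flush covers every request in the batch
        for offset, _, on_complete in batch:
            on_complete(offset)  # completion is signalled only after the flush
        return len(batch)

    def close(self):
        os.close(self.fd)
```

Because several requests can be outstanding at once, the cost of making them durable is shared across the batch, which is why claiming completion early buys little once the emulated controller supports multiple simultaneous requests.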






[Qemu-devel] Re: [RFC][PATCH] make sure disk writes actually hit disk

2006-07-28 Thread Rik van Riel

Rik van Riel wrote:

> This is the simple approach to making sure that disk writes actually
> hit disk before we tell the guest OS that IO has completed.  Thanks
> to DMA_MULTI_THREAD the performance still seems to be adequate.


Hah, and of course that bit is only found in Xen's qemu-dm. Doh!

I knew I should have also checked some of the files my patch didn't
touch :)

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

