Jamie Lokier wrote:
> Avi Kivity wrote:
>   
>>> And video streaming on some embedded devices with no MMU!  (Because
>>> the page cache heuristics work poorly with no MMU, sustained reliable
>>> streaming is done with O_DIRECT, with the app managing the cache
>>> itself (like a database), and that needs AIO to keep the request
>>> queue busy.  At least, that's the theory.)
>>>       
>> Could use threads as well, no?
>>     
>
> Perhaps.  This raises another point about AIO vs. threads:
>
> If I submit sequential O_DIRECT reads with aio_read(), will they enter
> the device read queue in the same order, and reach the disk in that
> order (allowing for reordering by the elevator when worthwhile)?
>   

Yes, unless the implementation in the kernel (or glibc) is threaded.
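
For illustration, here is a minimal sketch of how a batch of sequential
O_DIRECT reads can be handed to the kernel in one go with Linux native
AIO (the libaio wrapper: io_setup()/io_prep_pread()/io_submit()).  A
single io_submit() passes the iocbs in array order, so they enter the
request queue in the intended order.  The filename, block size and
queue depth are invented, and error handling is omitted:

  #define _GNU_SOURCE              /* for O_DIRECT */
  #include <libaio.h>
  #include <fcntl.h>
  #include <stdlib.h>

  #define NR_REQS 8
  #define BLK     (1 << 20)        /* 1 MiB per request */

  int main(void)
  {
      io_context_t ctx = 0;
      struct iocb cbs[NR_REQS], *cbp[NR_REQS];
      struct io_event events[NR_REQS];
      void *buf;
      int i, fd = open("stream.dat", O_RDONLY | O_DIRECT);

      io_setup(NR_REQS, &ctx);
      for (i = 0; i < NR_REQS; i++) {
          posix_memalign(&buf, 512, BLK);    /* O_DIRECT needs alignment */
          io_prep_pread(&cbs[i], fd, buf, BLK, (long long)i * BLK);
          cbp[i] = &cbs[i];
      }
      io_submit(ctx, NR_REQS, cbp);          /* one call, array order */
      io_getevents(ctx, NR_REQS, NR_REQS, events, NULL);
      io_destroy(ctx);
      return 0;
  }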

> With threads this isn't guaranteed and scheduling makes it quite
> likely to issue the parallel synchronous reads out of order, and for
> them to reach the disk out of order because the elevator doesn't see
> them simultaneously.
>   

If the disk is busy, it doesn't matter.  The requests will queue and the 
elevator will sort them out.  So it's just the first few requests that 
may get to disk out of order.

> With AIO (not glibc's, and not kthread-based) it might be better at
> keeping the intended issue order, but I'm not sure.
>
> It is highly desirable: O_DIRECT streaming performance depends on
> avoiding seeks (no reordering) and on keeping the request queue
> non-empty (no gap).
>
> I read a man page for some other Unix describing AIO as better than
> threaded parallel reads for reading tape drives because of this (tape
> seeks are very expensive).  The rest of the man page didn't say
> anything more, and unfortunately I don't remember where I read it.  I
> have no idea whether AIO submission order is nearly always preserved
> in general, or expected to be.
>   

I haven't considered tape, but this is a good point indeed.  I expect it 
doesn't make much of a difference for a loaded disk.

>   
>> I'm at fault here.  I just assumed that because it's easy to do aio
>> in a thread pool efficiently, that's what glibc does.
>>
>> Unfortunately the code does some ridiculous things, like not servicing
>> multiple requests on a single fd in parallel.  I see absolutely no
>> reason for it (the code says "fight for resources").
>>     
>
> Ouch.  Perhaps that relates to my thought above, about multiple
> requests to the same file causing seek storms when thread scheduling
> is unlucky?
>   

My first thought on seeing this is that it relates to a deficiency in
older kernels when servicing multiple requests on a single fd (i.e. a
per-file lock).  I don't know whether such a deficiency ever existed, though.
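
If the serialization really is just glibc keying its internal run queue
on the descriptor (each fd's requests handled by one thread, in order),
then a crude workaround is to open the file more than once and spread
the aio_read()s across the descriptors.  A sketch only, under that
assumption, with an invented filename and sizes, no error handling, and
linked with -lrt:

  #include <aio.h>
  #include <errno.h>
  #include <fcntl.h>
  #include <string.h>

  #define BLK (1 << 20)

  static struct aiocb cb[4];
  static char bufs[4][BLK];

  int main(void)
  {
      int i, fds[2];
      fds[0] = open("stream.dat", O_RDONLY);
      fds[1] = open("stream.dat", O_RDONLY); /* second fd, second queue */

      for (i = 0; i < 4; i++) {
          memset(&cb[i], 0, sizeof cb[i]);
          cb[i].aio_fildes = fds[i & 1];     /* alternate descriptors */
          cb[i].aio_buf    = bufs[i];
          cb[i].aio_nbytes = BLK;
          cb[i].aio_offset = (off_t)i * BLK;
          aio_read(&cb[i]);
      }
      for (i = 0; i < 4; i++) {              /* wait for each request */
          const struct aiocb *one[1] = { &cb[i] };
          while (aio_error(&cb[i]) == EINPROGRESS)
              aio_suspend(one, 1, NULL);
          aio_return(&cb[i]);
      }
      return 0;
  }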

>   
>> It could and should.  It probably doesn't.
>>
>> A simple thread pool implementation could come within 10% of Linux aio 
>> for most workloads.  It will never be "exactly", but for small numbers 
>> of disks, close enough.
>>     
>
> I would wait for benchmark results for I/O patterns like sequential
> reading and writing, because of the potential for seeks caused by
> request reordering, before being confident of that.
>
>   

I did have measurements (and a test rig) at a previous job (where I did
a lot of I/O work); IIRC the performance of a tuned thread pool was not
far behind aio, for both seeky and sequential workloads.  It was a while
back, though.
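
For reference, the shape of such a thread pool is roughly the
following.  This is only a sketch with invented names, no error
handling, and the completion notification elided; note that once
several workers pull from the queue, nothing stops their pread()s
reaching the elevator out of submission order, which is exactly the
reordering concern above:

  #include <pthread.h>
  #include <stdlib.h>
  #include <unistd.h>

  struct req {                    /* one queued read request */
      int fd;
      void *buf;
      size_t len;
      off_t off;
      struct req *next;
  };

  static struct req *head, *tail;
  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
  static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;

  void submit(struct req *r)      /* called by the I/O issuer */
  {
      pthread_mutex_lock(&lock);
      r->next = NULL;
      if (tail) tail->next = r; else head = r;
      tail = r;
      pthread_cond_signal(&cond);
      pthread_mutex_unlock(&lock);
  }

  static void *worker(void *unused)
  {
      for (;;) {
          struct req *r;
          pthread_mutex_lock(&lock);
          while (!head)
              pthread_cond_wait(&cond, &lock);
          r = head;
          head = r->next;
          if (!head) tail = NULL;
          pthread_mutex_unlock(&lock);

          pread(r->fd, r->buf, r->len, r->off);
          /* completion notification would go here */
          free(r);
      }
      return NULL;
  }

  void start_pool(int nthreads)   /* tuning the count is the hard part */
  {
      pthread_t tid;
      while (nthreads--)
          pthread_create(&tid, NULL, worker, NULL);
  }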


-- 
error compiling committee.c: too many arguments to function

