On 18 February 2014 10:28, John Baldwin <j...@freebsd.org> wrote:
> On Monday, February 17, 2014 6:24:21 am David Chisnall wrote:
>> P.S. If aio() is creating a new thread per request, rather than scheduling
>> them from a pool, then that is also likely a bug.  The aio APIs were designed
>> so that systems with DMA controllers could issue DMA requests in the syscall
>> and return immediately, then trigger the notification in response to the DMA-
>> finished interrupt.  There shouldn't need to be any kernel threads created to
>> do this...
>
> AIO uses a pool, but the requests are all done synchronously from that
> pool.  While our low-level disk routines are async (e.g. GEOM etc.),
> the filesystem code above that generally is not.  The aio code does have
> some special gunk in place for sockets (and I believe raw disk I/O) to
> make it truly async, but aio for files uses synchronous I/O from a pool
> of worker threads.

Just to expand on John's response - which is absolutely correct:

* The I/O strategy routines these days do indeed complete via
callbacks, so no AIO worker threads are required for that path.
* However, any blocking in the completion path makes the disk I/O
rate drop dramatically, so there's still a single AIO completion
thread involved in posting the kqueue notifications - posting them
directly from the strategy completion callback doesn't work well
because of lock contention on the kqueue and related locks (see the
sketch just below this list).
* The filesystem code above that is still blocking - especially
metadata reads for things like directory traversal.
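For anyone wanting to poke at this from userland, here's roughly what
that notification path looks like from the consumer side.  It's a
minimal sketch only - it assumes a FreeBSD box with the aio module
available (kldload aio on releases where it isn't built in), a single
request, and it trims most of the error handling:

/*
 * Minimal sketch: one asynchronous read whose completion is delivered
 * as an EVFILT_AIO kevent rather than a signal.
 */
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <aio.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

int
main(int argc, char *argv[])
{
        static char buf[65536];
        struct aiocb cb;
        struct kevent ev;
        int fd, kq;

        if (argc < 2)
                errx(1, "usage: %s file", argv[0]);
        if ((fd = open(argv[1], O_RDONLY)) < 0)
                err(1, "open");
        if ((kq = kqueue()) < 0)
                err(1, "kqueue");

        memset(&cb, 0, sizeof(cb));
        cb.aio_fildes = fd;
        cb.aio_buf = buf;
        cb.aio_nbytes = sizeof(buf);
        cb.aio_offset = 0;
        /* Deliver the completion as a kevent on kq instead of a signal. */
        cb.aio_sigevent.sigev_notify = SIGEV_KEVENT;
        cb.aio_sigevent.sigev_notify_kqueue = kq;

        if (aio_read(&cb) != 0)
                err(1, "aio_read");

        /* Block until the AIO completion is posted to the kqueue. */
        if (kevent(kq, NULL, 0, &ev, 1, NULL) != 1)
                err(1, "kevent");

        printf("read %zd bytes\n", aio_return(&cb));
        return (0);
}

The EVFILT_AIO kevent (whose ident field is the aiocb pointer) is
posted when the request completes - which is exactly the work that
single completion thread ends up doing.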

I don't know if Scott is working on the async directory stuff or not,
but nailing down a fully async path for filesystem strategy() calls on
an arbitrary file would really help high-throughput AIO-based systems.
We'd then be able to do zero-copy disk I/O for both reads and writes.
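To make the "high throughput" part concrete: the usual userland pattern
is to keep a deep queue of requests in flight and drain completions in
batches from a single kqueue.  Again, just an illustrative sketch under
the same assumptions as above (aio loaded, fixed request size, DEPTH
and REQSZ values picked out of thin air):

/*
 * Sketch of a high-throughput AIO reader: keep up to DEPTH reads in
 * flight against one file and reap completions in batches.
 */
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <sys/stat.h>
#include <aio.h>
#include <err.h>
#include <fcntl.h>
#include <string.h>

#define DEPTH   32
#define REQSZ   (128 * 1024)

static struct aiocb cbs[DEPTH];
static char bufs[DEPTH][REQSZ];

static void
submit(int kq, int fd, int slot, off_t off)
{
        struct aiocb *cb = &cbs[slot];

        memset(cb, 0, sizeof(*cb));
        cb->aio_fildes = fd;
        cb->aio_buf = bufs[slot];
        cb->aio_nbytes = REQSZ;
        cb->aio_offset = off;
        cb->aio_sigevent.sigev_notify = SIGEV_KEVENT;
        cb->aio_sigevent.sigev_notify_kqueue = kq;
        if (aio_read(cb) != 0)
                err(1, "aio_read");
}

int
main(int argc, char *argv[])
{
        struct kevent evs[DEPTH];
        struct stat sb;
        off_t next = 0;
        int fd, i, kq, n, inflight = 0;

        if (argc < 2)
                errx(1, "usage: %s file", argv[0]);
        if ((fd = open(argv[1], O_RDONLY)) < 0)
                err(1, "open");
        if (fstat(fd, &sb) != 0)
                err(1, "fstat");
        if ((kq = kqueue()) < 0)
                err(1, "kqueue");

        /* Prime the queue with up to DEPTH outstanding reads. */
        for (i = 0; i < DEPTH && next < sb.st_size; i++, next += REQSZ) {
                submit(kq, fd, i, next);
                inflight++;
        }

        while (inflight > 0) {
                /* A single kevent() call can return many completions. */
                if ((n = kevent(kq, NULL, 0, evs, DEPTH, NULL)) < 0)
                        err(1, "kevent");
                for (i = 0; i < n; i++) {
                        /* For EVFILT_AIO, ident is the aiocb pointer. */
                        struct aiocb *cb = (struct aiocb *)evs[i].ident;
                        int slot = cb - cbs;

                        (void)aio_return(cb);
                        inflight--;
                        if (next < sb.st_size) {
                                submit(kq, fd, slot, next);
                                inflight++;
                                next += REQSZ;
                        }
                }
        }
        return (0);
}

With a fully async strategy() path underneath, each of those in-flight
slots would run all the way down to GEOM and back without tying up an
AIO worker thread in a blocking read.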


-a