> Basically speaking - there needs to be some sort of strategy for
> bypassing the ARC or even parts of the ARC for applications that
> may need to advise the filesystem of either:
> 1) the delicate nature of imposing additional buffering for their
> data flow
> 2) already well optimized applications that need more adaptive
> cache in the application instead of the underlying filesystem or
> volume manager

This advice can't be sensibly delivered to ZFS via a Direct I/O
mechanism.  Anton's characterization of Direct I/O as, "an optimization
which allows data to be transferred directly between user data buffers
and disk, without a memory-to-memory copy," is concise and accurate.
Trying to intuit advice from this is unlikely to be useful.  It would be
better to develop a separate mechanism for delivering advice about the
application to the filesystem.  (fadvise, perhaps?)

A DIO implementation for ZFS is more complicated than UFS and adversely
impacts well optimized applications.

I looked into this late last year when we had a customer who was
suffering from too much bcopy overhead.  Billm found another workaround
instead of bypassing the ARC.

The challenge for implementing DIO for ZFS is in dealing with access to
the pages mapped by the user application.  Since ZFS has to checksum all
of its data, the user's pages that are involved in the direct I/O cannot
be written to by another thread during the I/O.  If this policy isn't
enforced, it is possible for the data written to or read from disk to be
different from their checksums.

In order to protect the user pages while a DIO is in progress, we want
support from the VM that isn't presently implemented.  To prevent a page
from being accessed by another thread, we have to unmap the TLB/PTE
entries and lock the page.  There's a cost associated with this, as it
may be necessary to cross-call other CPUs.  Any thread that accesses the
locked pages will block.  While it's possible lock pages in the VM
today, there isn't a neat set of interfaces the filesystem can use to
maintain the integrity of the user's buffers.  Without an experimental
prototype to verify the design, it's impossible to say whether overhead
of manipulating the page permissions is more than the cost of bypassing
the cache.

What do you see as potential use cases for ZFS Direct I/O?  I'm having a
hard time imagining a situation in which this would be useful to a
customer.  The application would probably have to be single-threaded,
and if not, it would have to be pretty careful about how its threads
access buffers involved in I/O.

-j
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Reply via email to