On Wed, Sep 09, 2009 at 04:02:15PM -0500, Rich.Brown at sun.com wrote:
> == Introduction/Background ==
>
> Zero-copy (copy avoidance) is essentially buffer sharing
> among multiple modules that pass data between the modules.
> This proposal avoids the data copy in the READ/WRITE path
> of filesystems, by providing a mechanism to share data buffers
> between the modules. It is intended to be used by network file
> sharing services like NFS, CIFS or others.
>
> Although the buffer sharing can be achieved through a few different
> solutions, any such solution must work with File Event Monitors
> (FEM monitors)[1] installed on the files. The solution must
> allow the underlying filesystem to maintain any existing file
> range locking in the filesystem.
>
> The proposed solution provides extensions to the existing VOP
> interface to request and return buffers from a filesystem. The
> buffers are then used with existing VOP_READ/VOP_WRITE calls with
> minimal changes.
>
>
> == Proposed Changes ==

<...>

> == Using the New VOP Interfaces for Zero-copy ==
>
> VOP_REQZCBUF()/VOP_RETZCBUF() are expected to be used in conjunction with
> VOP_READ() or VOP_WRITE() to implement zero-copy read or write.
>
> a. Read
>
> In a normal read, the consumer allocates the data buffer and passes it to
> VOP_READ(). The provider initiates the I/O, and copies the data from its
> own cache buffer to the consumer supplied buffer.
>
> To avoid the copy (initiating a zero-copy read), the consumer
> first calls VOP_REQZCBUF() to inform the provider to prepare to
> loan out its cache buffer. It then calls VOP_READ(). After the
> call returns, the consumer has direct access to the cache buffer
> loaned out by the provider. After processing the data, the
> consumer calls VOP_RETZCBUF() to return the loaned cache buffer to
> the provider.

<...>

> b. Write
>
> In a normal write, the consumer allocates the data buffer, loads the data,
> and passes the buffer to VOP_WRITE(). The provider copies the data from
> the consumer supplied buffer to its own cache buffer, and starts the I/O.
>
> To initiate a zero-copy write, the consumer first calls VOP_REQZCBUF() to
> grab a cache buffer from the provider. It loads the data directly to
> the loaned cache buffer, and calls VOP_WRITE(). After the call returns,
> the consumer calls VOP_RETZCBUF() to return the loaned cache buffer to
> the provider.

Just for clarification: this interface only affects pages mapped in the
kernel, correct? I'm trying to understand if this is just for reducing
the number of in-kernel copies, or if this is a userland <-> kernel
zero-copy interface.

Thanks,

-j
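
The call ordering in the quoted read and write descriptions can be sketched
as a small user-space model. This is only an illustration of the
request/use/return sequence, not the proposed kernel interface: the types
provider_t and zc_buf_t, the enum zc_rw, and the functions zc_reqbuf(),
zc_read(), zc_write() and zc_retbuf() are all hypothetical names made up
for this sketch, and the real VOP signatures are not shown in the quoted
text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; none of these names are from the proposal. */
enum zc_rw { ZC_READ, ZC_WRITE };

typedef struct provider {
    char   cache[64];  /* stands in for the filesystem's cache buffer */
    size_t len;        /* number of valid bytes in cache */
    int    loans;      /* number of outstanding loaned buffers */
} provider_t;

typedef struct zc_buf {
    char  *data;       /* points into the provider's cache while loaned */
    size_t len;        /* bytes available through data */
    int    loaned;     /* nonzero between request and return */
} zc_buf_t;

/* Models VOP_REQZCBUF(): the provider prepares to loan its cache buffer.
 * For a write, the buffer is handed over right away so it can be filled. */
static int zc_reqbuf(provider_t *p, zc_buf_t *b, enum zc_rw rw)
{
    b->data = (rw == ZC_WRITE) ? p->cache : NULL;
    b->len = (rw == ZC_WRITE) ? sizeof(p->cache) : 0;
    b->loaned = 1;
    p->loans++;
    return 0;
}

/* Models VOP_READ() after a request: exposes the cache buffer, no memcpy. */
static int zc_read(provider_t *p, zc_buf_t *b)
{
    if (!b->loaned)
        return -1;
    b->data = p->cache;
    b->len = p->len;
    return 0;
}

/* Models VOP_WRITE() after a request: the data is already in the cache
 * buffer, so the provider only records how many bytes are valid. */
static int zc_write(provider_t *p, zc_buf_t *b, size_t n)
{
    if (!b->loaned || b->data != p->cache || n > sizeof(p->cache))
        return -1;
    p->len = n;
    return 0;
}

/* Models VOP_RETZCBUF(): the consumer returns the loaned buffer. */
static void zc_retbuf(provider_t *p, zc_buf_t *b)
{
    if (b->loaned) {
        b->data = NULL;
        b->len = 0;
        b->loaned = 0;
        p->loans--;
    }
}
```

In this model a read is zc_reqbuf(ZC_READ), zc_read(), use of b.data, then
zc_retbuf(); a write is zc_reqbuf(ZC_WRITE), filling b.data, zc_write(),
then zc_retbuf(). The point of the sketch is that b.data aliases the
provider's cache rather than a consumer-allocated copy.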