On Tue, Jan 15, 2013 at 4:29 AM, Raghavendra Gowdappa <rgowd...@redhat.com> wrote:
>
>
> ----- Original Message -----
> > From: "Anand Avati" <aav...@redhat.com>
> > To: "Amar Tumballi" <atumb...@redhat.com>
> > Cc: bhar...@linux.vnet.ibm.com, gluster-devel@nongnu.org,
> >     "Raghavendra Gowdappa" <rgowd...@redhat.com>
> > Sent: Thursday, January 10, 2013 12:20:09 PM
> > Subject: Re: [Gluster-devel] zero-copy readv
> >
> > On 01/09/2013 10:37 PM, Amar Tumballi wrote:
> > >
> > >>
> > >> - On the read side things are a little more complicated. In
> > >> rpc-transport/socket, there is a call to iobuf_get() to create a
> > >> new iobuf for reading in the readv reply data from the server. We
> > >> will need a framework changes where, if the readv request (of the
> > >> xid for which readv reply is being handled) happened to be a
> > >> "direct" variant (i.e, zero-copy), then the "special iobuf around
> > >> user's memory" gets picked up and read() from socket is performed
> > >> directly into user's memory. Similar, but equivalent, changes will
> > >> have to be done in RDMA (Raghavendra on CC can help). Since the
> > >> goal is to avoid memory copy, this data will be bypassing io-cache
> > >> (and purging pre-cached data of those regions along the way).
> > >>
> > >
> > > On the read side too, our client protocol is designed to handle
> > > 0-copy already, ie, if the fop comes with an iobuf/iobref, then the
> > > same buffer is used for copying the received data from network.
> > > (client_submit_request() is designed to handle this). [1]
> > >
> > > We made all these changes to make RDMA 0-copy a possibility, so
> > > even RDMA transport should be already 0-copy friendly.
> > >
> > > Thats my understanding.
> > >
> > > Regards,
> > > Amar
> > >
> > > [1] - recent patches to handle RPC read-ahead may involve small
> > > data copy from header to data buffer, but surely not very high.
> > >
> >
> > Amar - note that the current infrastructure present for 0-copy RDMA
> > might not be sufficient for GFAPI's 0-copy. A glfs_readv() request
> > from the app can come as a vector of memory pointers (and not a
> > contiguous iobuf) and therefore require storing an iovec/count as
> > well. This might also mean we need to exercise the scatter-gather
> > aspects of the verbs API.
>
> If we pass user supplied vectors as write chunks to server, it will do
> rdma-writes to memory regions pointed by those vectors. So, I think there
> are no major changes required to rdma as well.

I wasn't sure if the client-side interface b/w protocol/client and
rpc-transport/rdma was doing everything right even though the rdma
transport itself had the capability. I guess that is probably what you
mentioned as "If we pass user supplied vectors..".

Avati
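
For anyone following the scatter-gather point in this thread, here is a
minimal sketch of reading a reply straight into several caller-supplied
buffers instead of one contiguous buffer. It is written in Python rather
than GlusterFS's C, and uses the standard socket.recvmsg_into() call over a
local socketpair as a stand-in for the client/server connection. The names
scatter_read_exact() and demo() are hypothetical, chosen only for
illustration; this is not GlusterFS code, it does not use the verbs API,
and it does not model RDMA write chunks.

```python
import socket


def scatter_read_exact(sock, buffers):
    """Fill the caller's buffers, in order, directly from the socket.

    `buffers` plays the role of the iovec/count pair mentioned in the
    thread: a list of separate, writable memory regions owned by the
    caller. Returns the total number of bytes received.
    """
    views = [memoryview(b) for b in buffers]
    total = 0
    while views:
        # recvmsg_into() scatters incoming bytes across the given buffers
        # without an intermediate copy through a temporary bytes object.
        n = sock.recvmsg_into(views)[0]
        if n == 0:
            break  # peer closed the connection before all buffers filled
        total += n
        # Drop buffers that are now full; trim a partially filled one so
        # the next call continues where this one stopped.
        while views and n >= len(views[0]):
            n -= len(views[0])
            views.pop(0)
        if views and n:
            views[0] = views[0][n:]
    return total


def demo():
    # Hypothetical stand-in for the client <-> brick connection.
    server, client = socket.socketpair()
    try:
        server.sendall(b"HEADERdata-for-the-app")
        # Two non-contiguous "user" regions, as an application might
        # pass to a vectored read call.
        user_vec = [bytearray(6), bytearray(16)]
        n = scatter_read_exact(client, user_vec)
        return n, bytes(user_vec[0]), bytes(user_vec[1])
    finally:
        server.close()
        client.close()
```

Calling demo() fills the two separate bytearrays from one incoming stream.
The point of the sketch is only that the receive path has to carry a list
of regions plus their lengths, rather than a single buffer pointer.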
_______________________________________________
Gluster-devel mailing list
Gluster-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/gluster-devel