On Wed, 11 Jan 2017, Jason Dillaman wrote:
> +1
> 
> I'd be happy to tweak the internals of librbd to support pass-through
> of C buffers all the way to librados. librbd clients like QEMU use the
> C API and this currently results in several extra copies (in librbd
> and librados).

+1 from me too.

The caveat is that we have to be very careful with buffers that are 
provided by users.  Currently the userspace messenger code doesn't provide 
a way to manage the provenance of references to the buffer::raw_static 
buffers, which means that even after the write has completed, an MOSDOp 
that references that memory may still be alive.

Either (1) we have to audit the code to be sure that by the time the 
Objecter request completes we know that all messages and their bufferlists 
are cleared (tricky/fragile), or (2) introduce some buffer management 
interface in librados so that the buffer lifecycle is independent of the 
request.  I would prefer (2), but it means the interfaces would be 
something like

 rados_buffer_create(...)
 copy your data into that buffer
 rados_write(...) or whatever
 rados_buffer_release(...)

and then rados can do the proper refcounting and only deallocate the 
memory when all refs have gone away.  Unfortunately, I suspect that there 
is a largish category of users where this isn't sufficient... e.g., if 
some existing C user has its own buffer and it isn't practical to 
allocate/release via rados_buffer_* calls instead of malloc/free (or 
whatever).
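
To make option (2) concrete, here is a rough sketch of the refcounting 
idea in plain C.  None of this is existing librados API: the sketch_* 
names, the struct layout, and the live-buffer counter are all made up for 
illustration, and a real implementation would presumably hook into 
bufferlist rather than use malloc directly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted buffer; NOT part of librados. */
typedef struct sketch_buffer {
    atomic_int refs;   /* number of outstanding references */
    size_t len;        /* usable length of data */
    char *data;        /* payload owned by the buffer object */
} sketch_buffer;

/* Demo-only bookkeeping (not thread-safe): buffers not yet freed. */
static int sketch_live_buffers = 0;

/* Allocate a buffer with one reference held by the caller. */
sketch_buffer *sketch_buffer_create(size_t len) {
    sketch_buffer *b = malloc(sizeof(*b));
    if (!b) return NULL;
    b->data = malloc(len ? len : 1);
    if (!b->data) { free(b); return NULL; }
    atomic_init(&b->refs, 1);
    b->len = len;
    sketch_live_buffers++;
    return b;
}

/* Take an extra reference, e.g. on behalf of an in-flight message. */
void sketch_buffer_get(sketch_buffer *b) {
    assert(b != NULL);
    atomic_fetch_add(&b->refs, 1);
}

/* Drop one reference; the memory is freed only when the last one goes. */
void sketch_buffer_release(sketch_buffer *b) {
    assert(b != NULL);
    if (atomic_fetch_sub(&b->refs, 1) == 1) {
        free(b->data);
        free(b);
        sketch_live_buffers--;
    }
}

/* Simulates a message that outlives the caller's release.
 * Returns 0 if the buffer stayed valid until the last reference dropped. */
int sketch_demo(void) {
    sketch_buffer *b = sketch_buffer_create(5);
    if (!b) return -1;
    memcpy(b->data, "hello", 5);

    sketch_buffer_get(b);      /* the "message" takes its own reference */
    sketch_buffer_release(b);  /* caller is done with the buffer */

    int alive_after_caller = sketch_live_buffers;
    int intact = memcmp(b->data, "hello", 5) == 0;

    sketch_buffer_release(b);  /* the "message" is finally destroyed */

    return (alive_after_caller == 1 && intact && sketch_live_buffers == 0)
               ? 0 : 1;
}
```

The point is only that the caller's release does not free the memory 
while something else (here, the simulated message) still holds a 
reference.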

Jason, where does librbd fall?

sage


> 
> On Wed, Jan 11, 2017 at 11:44 AM, Piotr Dałek <piotr.da...@corp.ovh.com> 
> wrote:
> > Hello,
> >
> > As the subject says - are there any users/consumers of the librados C API? I'm
> > asking because we're researching whether this PR:
> > https://github.com/ceph/ceph/pull/12216 will actually be beneficial for
> > a larger group of users. This PR adds a bunch of new APIs that perform object
> > writes without an intermediate data copy, which will reduce CPU and memory load
> > on clients. If you're using the librados C API for object writes, feel free to
> > comment here or in the pull request.
> >
> >
> > --
> > Piotr Dałek
> >
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> -- 
> Jason
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 