Thanks for your answer!
I will definitely look into these allocators to see if I can get
`serialization_buffer<>` to work with memory which has been pinned both
for the network and for the CUDA runtime.
However, I fear I don't have enough knowledge yet to start hacking on
the HPX internals.
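
For what it's worth, here is roughly the kind of allocator I had in
mind. It is only a sketch, and the name is my own: the CUDA-side
pinning uses cudaHostRegister, but the network-side registration is
just a placeholder comment, since I don't know yet where the real hook
(ibv_reg_mr for ibverbs, fi_mr_reg for libfabric, or the allocator/pool
base class you mention below) would plug in:

#include <cuda_runtime.h>
#include <cstddef>
#include <new>

// Sketch of an allocator that pins every block it hands out for the
// CUDA runtime and leaves a hook for the network registration.
template <typename T>
struct dual_pinned_allocator
{
    using value_type = T;

    dual_pinned_allocator() = default;
    template <typename U>
    dual_pinned_allocator(dual_pinned_allocator<U> const&) {}

    T* allocate(std::size_t n)
    {
        void* p = ::operator new(n * sizeof(T));
        // Pin for the CUDA runtime so host<->device copies can use DMA.
        if (cudaHostRegister(p, n * sizeof(T), cudaHostRegisterDefault)
            != cudaSuccess)
        {
            ::operator delete(p);
            throw std::bad_alloc();
        }
        // Placeholder: register the same region with the network here
        // (ibv_reg_mr, fi_mr_reg, or whatever the pool base class
        // exposes) so the block is pinned for both APIs.
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept
    {
        // Placeholder: deregister from the network first.
        cudaHostUnregister(p);
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(dual_pinned_allocator<T> const&,
                dual_pinned_allocator<U> const&) { return true; }
template <typename T, typename U>
bool operator!=(dual_pinned_allocator<T> const&,
                dual_pinned_allocator<U> const&) { return false; }

If something along these lines is sound, a serialization buffer built
over memory from such an allocator would already satisfy both pinning
requirements before either the parcelport or the CUDA runtime sees it.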
On Tuesday, 8 August 2017 at 22:16 +0000, Biddiscombe, John A. wrote:
> > BTW, are you sure about the units on fig. 3? 4 seconds to serialize
> > 20 KB of data is not especially High Performance... unless of course
> > we're running HPX on a toaster :-)
> Crap! The units should not be seconds. I will fix my local copy and
> contact the publishers in case there is time to have it changed.
> Regarding the use of different allocators - there is a virtual base
> class for the allocator/pool that provides the memory, so that we can
> use different allocators for the ibverbs and libfabric
> implementations. It ought therefore to be ok to provide an
> hpx::compute::gpu::allocator and allow it to hand out memory for the
> RMA objects. Provided the received data (over the network?) isn't
> passed to a compute host with different pinning requirements, all is
> fine. If received data is in a network RMA buffer and then passed
> directly to the GPU transfer, we might need a way to pin the memory
> using both APIs, network and GPU, but that can all be handled in the
> allocator abstraction itself, which is generic and easy to extend.
> Once I work on this again, I'll let you know, and if you have specific
> features you want to try out, I can work with you - sorry if my
> current deadlines push this out too far for you.
> Should you want to have a go at it - there is an rma_object branch on
> github that I need to get merged into master sooner rather than later.
hpx-users mailing list