Hi Hartmut,

Thanks a lot for your suggestion of using `serialize_buffer`. I will try 
to implement it using a custom allocator, which allocates the correct 
type of pinned memory depending on the capabilities of the node (i.e. 
using cudaMallocHost() if there is a GPU and malloc() + mlock() 
otherwise). Modifying the `InputData` type is not a problem here.
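
For concreteness, here is a minimal sketch of the kind of allocator I have 
in mind. It is only an illustration under a few assumptions: the names 
`pinned_allocator` and `USE_CUDA_PINNED` are made up for this example, the 
GPU check is a compile-time switch rather than the per-node runtime check 
described above, and the wiring into `serialize_buffer` is not shown.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

#include <sys/mman.h>    // mlock(), munlock() (POSIX)

// Hypothetical build flag: define it on nodes that have a GPU and the CUDA
// runtime available. A real version would probably decide at runtime.
#ifdef USE_CUDA_PINNED
#include <cuda_runtime.h>
#endif

// Minimal standard-conforming allocator that hands out pinned host memory.
template <typename T>
struct pinned_allocator
{
    using value_type = T;

    pinned_allocator() = default;

    template <typename U>
    pinned_allocator(pinned_allocator<U> const&) noexcept {}

    T* allocate(std::size_t n)
    {
        std::size_t const bytes = n * sizeof(T);
        void* p = nullptr;
#ifdef USE_CUDA_PINNED
        // Page-locked memory registered with the CUDA driver.
        if (cudaMallocHost(&p, bytes) != cudaSuccess)
            throw std::bad_alloc();
#else
        // Plain heap memory, then locked into RAM. mlock() can fail if the
        // process exceeds its RLIMIT_MEMLOCK limit.
        p = std::malloc(bytes);
        if (p == nullptr)
            throw std::bad_alloc();
        if (mlock(p, bytes) != 0)
        {
            std::free(p);
            throw std::bad_alloc();
        }
#endif
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t n) noexcept
    {
#ifdef USE_CUDA_PINNED
        (void) n;
        cudaFreeHost(p);
#else
        // Caveat: munlock() works on whole pages, so it may also unlock a
        // neighbouring allocation that shares a page with this one. A
        // page-aligned allocation would avoid that.
        munlock(p, n * sizeof(T));
        std::free(p);
#endif
    }
};

// The allocator is stateless, so all instances compare equal.
template <typename T, typename U>
bool operator==(pinned_allocator<T> const&, pinned_allocator<U> const&) noexcept
{
    return true;
}

template <typename T, typename U>
bool operator!=(pinned_allocator<T> const&, pinned_allocator<U> const&) noexcept
{
    return false;
}
```

With this, something like `std::vector<double, pinned_allocator<double>>` 
would keep its storage in pinned memory on either kind of node.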

John's `rma_object<>` seems very interesting as well, and is probably 
the way to go in the long term. I will try to dig into the code and look 
at his implementation of a pinned allocator to see how I could adapt it 
to my use case.

If you are interested, I can keep you updated once I have a working 
prototype (even if it is not zero-copy yet).

Best regards,
hpx-users mailing list
