Hi all,

I am currently trying to use HPX to offload computationally intensive tasks to 
remote GPU nodes. In idiomatic HPX, this would typically be done by invoking a 
remote action:

    OutputData compute(InputData input_data)
    {
        OutputData results;
        /* Asynchronously copy `input_data` to the device using DMA */
        /* Do work on the GPU */
        /* Copy the results back to the host */
        return results;
    }

    HPX_PLAIN_ACTION(compute, compute_action);

    // In sender code
    auto fut = hpx::async(compute_action(), remote_locality_with_gpu, 
std::move(input_data));

So far, so good.

However, an important requirement is that the memory allocated for the input 
data on the receiver end be pinned, so that the copy between the host and the 
GPU can run asynchronously. This can of course always be done by copying the 
argument `input_data` into pinned memory within the function body, but I would 
prefer to avoid any superfluous copies in order to minimize the overhead.
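For concreteness, here is roughly what that copy-based fallback looks like. The allocator below is only a stand-in sketch of my own (not an HPX or Thrust type); in the real code its allocate/deallocate would forward to `cudaMallocHost` / `cudaFreeHost`:

```cpp
#include <cstdlib>
#include <new>
#include <vector>

// Stand-in for a pinned-memory allocator; in the real implementation,
// allocate/deallocate would call cudaMallocHost / cudaFreeHost.
template <typename T>
struct pinned_allocator
{
    using value_type = T;

    pinned_allocator() = default;
    template <typename U>
    pinned_allocator(pinned_allocator<U> const&) {}

    T* allocate(std::size_t n)
    {
        // cudaMallocHost(&p, n * sizeof(T)) in the real implementation
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t)
    {
        // cudaFreeHost(p) in the real implementation
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(pinned_allocator<T> const&, pinned_allocator<U> const&)
{ return true; }
template <typename T, typename U>
bool operator!=(pinned_allocator<T> const&, pinned_allocator<U> const&)
{ return false; }

template <typename T>
using pinned_vector = std::vector<T, pinned_allocator<T>>;

// The "extra copy" variant: the action receives input_data in ordinary
// pageable memory and stages it into pinned memory before touching the GPU.
pinned_vector<float> stage_to_pinned(std::vector<float> const& input)
{
    return pinned_vector<float>(input.begin(), input.end());
}
```

It works, but every byte crosses host memory twice, which is exactly the overhead I would like to eliminate.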

Do you know if it is possible to control within HPX where the memory for the 
input data will be allocated (on the receiver end)? I tried using the 
`pinned_allocator` from the Thrust library for the data members of `InputData`, 
and although it did its job as expected, it also requires allocating pinned 
memory on the sender side (for the construction of the object), as well as the 
presence of the Thrust library and the CUDA runtime on both machines. This led 
me to think that there should be a better way.

Ideally, I would be able to deserialize the incoming data directly into pinned 
memory. Do you know if there is a way to do this, or something similar, in 
HPX? If not, do you think it is possible to emulate such functionality by 
directly using the low-level constructs / internals of HPX? This is for a 
prototype, so it is okay to use unstable / undocumented code as long as it 
allows me to prove the feasibility of the approach.
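To make the request concrete, here is a rough sketch (HPX aside; all names are invented for illustration) of the kind of receive path I have in mind: the deserialization routine is parameterized on an allocator, so the bytes coming off the wire land directly in pinned memory rather than passing through an intermediate pageable buffer:

```cpp
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical receive path: instead of deserializing into a
// default-allocated buffer and copying afterwards, the caller supplies
// the allocator, so the payload can be placed directly in (e.g.)
// pinned memory by passing a pinned allocator here.
template <typename Alloc>
std::vector<float, Alloc> deserialize_payload(
    std::vector<std::uint8_t> const& wire, Alloc alloc)
{
    std::size_t n = wire.size() / sizeof(float);
    std::vector<float, Alloc> out(n, 0.0f, alloc);
    std::memcpy(out.data(), wire.data(), wire.size());
    return out;
}
```

If HPX exposed (or could be coaxed into exposing) a hook of this general shape at the point where action arguments are deserialized, the extra staging copy would disappear entirely.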

I would greatly appreciate any input / suggestions on how to approach this 
issue. If anyone has experience using HPX with GPUs or on heterogeneous 
clusters, I would be very interested in hearing about it as well.

Best regards,
Jean-Loup Tastet
_______________________________________________
hpx-users mailing list
[email protected]
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users