Hi,

We have some rough ideas about applying Flight in HPC (high-performance computing) and would like to hear your comments.

HPC infrastructure normally uses RDMA for fast data transfer between storage nodes and compute nodes. Computation tasks are dispatched to the compute nodes whose resources fit best.

Concretely, we are investigating porting UCX as a Flight transport layer. UCX is a communication framework for modern networks [1]. Besides HPC usage, many projects (Spark, Dask, BlazingSQL, etc.) also adopt UCX to accelerate network transmission [2][3].

I see a recent discussion about decoupling Flight from gRPC. It looks like this is also what we should do first to adapt UCX to Flight.
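
To make the decoupling idea concrete, here is a minimal sketch of what a pluggable transport boundary could look like. All names (Transport, InMemoryTransport, put_batches, get_batches) are hypothetical and are not part of the Flight API; the in-memory backend only stands in for where a gRPC or UCX backend would go.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Deque, Iterator, List


class Transport(ABC):
    """Hypothetical frame-oriented interface; not part of Arrow Flight."""

    @abstractmethod
    def send(self, frame: bytes) -> None:
        """Queue one opaque frame for the peer."""

    @abstractmethod
    def receive(self) -> Iterator[bytes]:
        """Yield frames in the order they were sent."""


class InMemoryTransport(Transport):
    """Stand-in backend; a gRPC or UCX backend would implement the same methods."""

    def __init__(self) -> None:
        self._frames: Deque[bytes] = deque()

    def send(self, frame: bytes) -> None:
        self._frames.append(frame)

    def receive(self) -> Iterator[bytes]:
        # Drain the queue in FIFO order.
        while self._frames:
            yield self._frames.popleft()


def put_batches(transport: Transport, batches: List[bytes]) -> None:
    # RPC-level logic depends only on the abstract interface.
    for batch in batches:
        transport.send(batch)


def get_batches(transport: Transport) -> List[bytes]:
    return list(transport.receive())
```

The point of the sketch is only that the RPC-level functions never name a concrete backend, so swapping gRPC for UCX would not touch them.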

Another point is that HPC may transfer commands together with the data payload, to be executed by the receiving compute node. Flight SQL looks like it serves a similar purpose, though HPC normally has more flexible computation tasks; it may even transfer an executable binary to run on the target [4].
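
As a rough illustration of shipping a command together with its data, here is a small sketch of a length-prefixed envelope. The framing and the function names (pack_task, unpack_task) are assumptions made up for this example; they are not a Flight or UCX wire format.

```python
import struct
from typing import Tuple

# Hypothetical framing: 4-byte big-endian command length, command bytes, payload.
_HEADER = struct.Struct("!I")


def pack_task(command: bytes, payload: bytes) -> bytes:
    """Concatenate an opaque command and its data payload into one message."""
    return _HEADER.pack(len(command)) + command + payload


def unpack_task(message: bytes) -> Tuple[bytes, bytes]:
    """Split a message produced by pack_task back into (command, payload)."""
    (cmd_len,) = _HEADER.unpack_from(message, 0)
    start = _HEADER.size
    end = start + cmd_len
    if end > len(message):
        raise ValueError("truncated message")
    return message[start:end], message[end:]
```

The receiving node would decode the command first and then decide how to interpret the payload, which is where the extra flexibility compared with a fixed SQL command set would come from.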

[1] https://openucx.org/documentation/
[2] https://github.com/openucx/sparkucx
[3] https://blog.dask.org/2019/06/09/ucx-dgx
[4] https://arxiv.org/pdf/2108.02253.pdf
