cyb70289 opened a new pull request #12196:
URL: https://github.com/apache/arrow/pull/12196


   This patch decouples the Flight RPC data plane from gRPC so that we can
   leverage optimized data transfer libraries.
   
   The basic idea is to replace the gRPC stream with a data plane stream for
   FlightData transmission in DoGet/DoPut/DoExchange. The current Flight
   client and server implementations change very little: a wrapper is added
   that supports both the gRPC stream and the data plane stream. By default
   the gRPC stream is used, which follows the current gRPC-based code path.
   If a data plane is enabled (currently through an environment variable),
   the Flight payload goes through the data plane stream instead. See
   client.cc and server.cc to review the changes.
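   
   To illustrate the wrapper idea, here is a minimal sketch of a stream
   interface with a gRPC-backed default and a data plane alternative, chosen
   from the `FLIGHT_DATAPLANE` environment variable. All class and function
   names (`PayloadStream`, `MakePayloadStream`, and so on) are hypothetical
   and are not the names used in this patch; the `Write` bodies are
   placeholders for the real transports.
   
   ```cpp
   #include <cassert>
   #include <cstdlib>
   #include <memory>
   #include <string>
   #include <utility>
   #include <vector>
   
   // Hypothetical stand-in for one serialized FlightData payload.
   using Payload = std::string;
   
   // Common interface that client/server code would program against.
   class PayloadStream {
    public:
     virtual ~PayloadStream() = default;
     virtual bool Write(const Payload& payload) = 0;
     virtual const char* name() const = 0;
   };
   
   // Default path: payloads travel on the existing gRPC stream.
   class GrpcPayloadStream : public PayloadStream {
    public:
     bool Write(const Payload& payload) override {
       sent_.push_back(payload);  // placeholder for a gRPC stream write
       return true;
     }
     const char* name() const override { return "grpc"; }
   
    private:
     std::vector<Payload> sent_;
   };
   
   // Alternative path: payloads travel on a data plane stream.
   class DataPlanePayloadStream : public PayloadStream {
    public:
     explicit DataPlanePayloadStream(std::string driver)
         : driver_(std::move(driver)) {
       assert(!driver_.empty());
     }
     bool Write(const Payload& payload) override {
       sent_.push_back(payload);  // placeholder for a data plane write
       return true;
     }
     const char* name() const override { return driver_.c_str(); }
   
    private:
     std::string driver_;
     std::vector<Payload> sent_;
   };
   
   // Picks the stream type from a driver name; an empty name selects gRPC.
   std::unique_ptr<PayloadStream> MakePayloadStream(const std::string& driver) {
     if (driver.empty()) return std::make_unique<GrpcPayloadStream>();
     return std::make_unique<DataPlanePayloadStream>(driver);
   }
   
   // Reads FLIGHT_DATAPLANE; an unset variable falls back to gRPC.
   std::unique_ptr<PayloadStream> MakePayloadStreamFromEnv() {
     const char* value = std::getenv("FLIGHT_DATAPLANE");
     return MakePayloadStream(value == nullptr ? "" : value);
   }
   ```
   
   With this shape, callers hold only a `PayloadStream` and never need to
   know which transport was selected, which is why the client and server
   code paths can stay largely unchanged.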
   
   **About data plane implementation**
   
   - data_plane/types.{h,cc}
     Defines the client/server data plane and data plane stream interfaces.
     It is the only API exported to other components ({client,server}.cc).
   - data_plane/serialize.{h,cc}
     Serializes and deserializes FlightData manually, since we bypass gRPC.
     Luckily, we already implemented the related functions to support
     zero-copy payloads.
   - shm.cc
     A shared memory driver to verify the data plane approach. The code may
     be a bit hard to read; it is better to focus on the data plane interface
     implementations first, before diving into details such as the shared
     memory, IPC and buffer management code.
     Please note that there are still many caveats in the current code; see
     the TODO and XXX comments in shm.cc for details.
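   
   As a rough illustration of what manual (de)serialization involves once
   gRPC no longer frames the messages, the sketch below writes the four
   FlightData fields as length-prefixed byte strings and reads them back.
   This is an assumed framing for illustration only: it is not the wire
   format used by this patch, the names (`FlightDataFields`, `Serialize`,
   `Deserialize`) are hypothetical, and unlike the real helpers it copies
   the body rather than being zero-copy.
   
   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <string>
   
   // Hypothetical in-memory view of the FlightData fields.
   struct FlightDataFields {
     std::string descriptor;
     std::string data_header;
     std::string app_metadata;
     std::string data_body;
   };
   
   // Appends a 4-byte little-endian length followed by the field bytes.
   void AppendField(const std::string& field, std::string* out) {
     assert(field.size() <= UINT32_MAX);
     const uint32_t size = static_cast<uint32_t>(field.size());
     for (int shift = 0; shift < 32; shift += 8) {
       out->push_back(static_cast<char>((size >> shift) & 0xFF));
     }
     out->append(field);
   }
   
   // Reads one length-prefixed field at *offset and advances *offset.
   // Returns false if the input is truncated.
   bool ReadField(const std::string& in, size_t* offset, std::string* field) {
     if (in.size() - *offset < 4) return false;
     uint32_t size = 0;
     for (int i = 0; i < 4; ++i) {
       const auto byte = static_cast<unsigned char>(in[*offset + i]);
       size |= static_cast<uint32_t>(byte) << (8 * i);
     }
     *offset += 4;
     if (in.size() - *offset < size) return false;
     field->assign(in, *offset, size);
     *offset += size;
     return true;
   }
   
   // Concatenates the four fields in a fixed order.
   std::string Serialize(const FlightDataFields& msg) {
     std::string out;
     AppendField(msg.descriptor, &out);
     AppendField(msg.data_header, &out);
     AppendField(msg.app_metadata, &out);
     AppendField(msg.data_body, &out);
     return out;
   }
   
   // Parses the four fields; fails on truncated input or trailing bytes.
   bool Deserialize(const std::string& in, FlightDataFields* msg) {
     size_t offset = 0;
     return ReadField(in, &offset, &msg->descriptor) &&
            ReadField(in, &offset, &msg->data_header) &&
            ReadField(in, &offset, &msg->app_metadata) &&
            ReadField(in, &offset, &msg->data_body) &&
            offset == in.size();
   }
   ```
   
   A round trip through `Serialize` and `Deserialize` returns the original
   fields, and a message cut short by even one byte is rejected.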
   
   **To evaluate this patch**
   
   I tested the shared memory data plane on Linux (x86, Arm) and macOS (Arm).
   Build with `-DARROW_FLIGHT_DP_SHM=ON` to enable the shared memory data
   plane. Set the `FLIGHT_DATAPLANE=shm` environment variable to run the unit
   tests and benchmarks with the shared memory data plane enabled.
   
   ```
   Build: cmake -DARROW_FLIGHT_DP_SHM=ON -DARROW_FLIGHT=ON ....
   Test:  FLIGHT_DATAPLANE=shm release/arrow-flight-test
   Bench: FLIGHT_DATAPLANE=shm release/arrow-flight-benchmark \
          -num_streams=1|2|4 -num_threads=1|2|4
   ```
   
   Benchmark results (throughput, latency) on a Xeon Gold 5218.
   Test case: DoGet, batch size = 128 KiB
   
   | streams | grpc over unix socket | shared memory data plane |
   | ------- | --------------------- | ------------------------ |
   | 1       |  3324 MB/s,  35 us    |  7045 MB/s,  16 us       |
   | 2       |  6289 MB/s,  38 us    | 13311 MB/s,  17 us       |
   | 4       | 10037 MB/s,  44 us    | 25012 MB/s,  17 us       |


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org