[ 
https://issues.apache.org/jira/browse/ARROW-10351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293795#comment-17293795
 ] 

David Li commented on ARROW-10351:
----------------------------------

FWIW [~yibo], I actually tried pipelining the Flight client and server: 
[https://github.com/lidavidm/arrow/tree/arrow-10351-async-compression]

This way we would do I/O-bound (gRPC read/write) and CPU-bound (Arrow record 
batch encoding/decoding) work on separate threads with readahead on the I/O 
side.
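
For anyone following along, the shape of that pipeline is roughly the following. This is only a minimal Python sketch of the general readahead pattern, not the code in the branch above: `readahead`, `fake_grpc_stream`, `decode` and `run_pipeline` are hypothetical names, and the stream and decoder are stand-ins rather than Flight or gRPC calls.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def readahead(source, depth=2):
    """Yield items from `source`, reading up to `depth` items ahead on a background thread."""
    buf = queue.Queue(maxsize=depth)

    def reader():
        try:
            for item in source:
                buf.put(item)  # blocks once `depth` items are waiting
        finally:
            buf.put(_DONE)  # always unblock the consumer

    threading.Thread(target=reader, daemon=True).start()
    while True:
        item = buf.get()
        if item is _DONE:
            return
        yield item


def fake_grpc_stream(n):
    # Hypothetical stand-in for blocking reads from a gRPC stream.
    for i in range(n):
        yield bytes([i]) * 4


def decode(raw):
    # Hypothetical stand-in for CPU-bound record batch decoding.
    return sum(raw)


def run_pipeline(n=5):
    # The I/O side runs on the reader thread; decoding stays on the caller's thread.
    return [decode(raw) for raw in readahead(fake_grpc_stream(n))]
```

Note that in Python the two sides only overlap when one of them releases the GIL, and this sketch does not propagate reader errors to the consumer.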

In our tests it did not have any benefit. I didn't test the actual async gRPC 
APIs. However, unless they are implemented very differently on the gRPC side, 
I doubt they'd help by themselves unless they unlock some opportunity to 
parallelize/pipeline work. But if you are investigating, we'd be curious to see 
the results! Async APIs could definitely make Flight more ergonomic and/or open 
a path to asyncio support in the Python bindings. They might also improve 
latency rather than throughput (our tests have focused on throughput).

> [C++][Flight] See if reading/writing to gRPC get/put streams asynchronously 
> helps performance
> ---------------------------------------------------------------------------------------------
>
>                 Key: ARROW-10351
>                 URL: https://issues.apache.org/jira/browse/ARROW-10351
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, FlightRPC
>            Reporter: Wes McKinney
>            Assignee: David Li
>            Priority: Major
>
> We don't use any asynchronous concepts in the way that Flight is implemented 
> now, i.e. IPC deconstruction/reconstruction (which may include compression!) 
> is not performed concurrently with moving FlightData objects through the gRPC 
> machinery, which may yield suboptimal performance. 
> It might be better to apply an actor-type approach where a dedicated thread 
> retrieves and prepares the next raw IPC message (within a Future) while the 
> current IPC message is being processed -- that way reading/writing to/from 
> the gRPC stream is not blocked on the IPC code doing its thing. 
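
The Future-based prefetch described in the quoted issue could look roughly like the sketch below. It is a Python illustration under stated assumptions, not Flight's implementation: `process_with_prefetch` and `make_reader` are hypothetical names, `read_next` is assumed to return `None` at end of stream, and a single worker thread plays the role of the dedicated reader.

```python
from concurrent.futures import ThreadPoolExecutor


def process_with_prefetch(read_next, process):
    """Process each message while the next one is being fetched in a Future."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(read_next)  # fetch the first message
        while True:
            msg = pending.result()  # wait for the prefetched message
            if msg is None:  # assumed end-of-stream marker
                break
            pending = pool.submit(read_next)  # start fetching the next message
            results.append(process(msg))  # overlaps with the fetch above
    return results


def make_reader(messages):
    # Hypothetical stand-in for reading raw IPC messages off a gRPC stream.
    it = iter(messages)
    return lambda: next(it, None)
```

For example, `process_with_prefetch(make_reader([b"ab", b"cde"]), len)` processes two messages while at most one read is in flight at a time.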



