[jira] [Commented] (ARROW-10351) [C++][Flight] See if reading/writing to gRPC get/put streams asynchronously helps performance

David Li (Jira) Mon, 03 May 2021 12:46:28 -0700


    [ https://issues.apache.org/jira/browse/ARROW-10351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338585#comment-17338585 ]


David Li commented on ARROW-10351:
----------------------------------

And between two EC2 t3.xlarge instances:

Without compression:
{noformat}
Testing method: DoPut
Using standalone TCP server
Server host: ip-172-31-73-63.ec2.internal
Server port: 31337
Number of perf runs: 4
Number of concurrent gets/puts: 1
Batch size: 5265782
Batches written: 3456
Bytes written: 18198543232
Nanos: 36150890880
Speed: 480.085 MB/s
Throughput: 95.5993 batches/s
Latency mean: 8485 us
Latency quantile=0.5: 8692 us
Latency quantile=0.95: 9233 us
Latency quantile=0.99: 10627 us
Latency max: 13944 us
{noformat}
flight-poc, with sync compression:
{noformat}
Testing method: DoPut
Using standalone TCP server
Server host: ip-172-31-73-63.ec2.internal
Server port: 31337
Number of perf runs: 4
Number of concurrent gets/puts: 1
Batch size: 5265782
Batches written: 3456
Bytes written: 18198543232
Nanos: 38743831916
Speed: 447.955 MB/s
Throughput: 89.2013 batches/s
Latency mean: 9305 us
Latency quantile=0.5: 9312 us
Latency quantile=0.95: 9736 us
Latency quantile=0.99: 10097 us
Latency max: 11723 us
{noformat}
flight-poc, with async compression:
{noformat}
Testing method: DoPut
Using standalone TCP server
Server host: ip-172-31-73-63.ec2.internal
Server port: 31337
Number of perf runs: 4
Number of concurrent gets/puts: 1
Batch size: 5265782
Batches written: 3456
Bytes written: 18198543232
Nanos: 36706487822
Speed: 472.818 MB/s
Throughput: 94.1523 batches/s
Latency mean: 8739 us
Latency quantile=0.5: 8726 us
Latency quantile=0.95: 9258 us
Latency quantile=0.99: 9793 us
Latency max: 12832 us
{noformat}
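Async compression recovers most of the cost of sync compression (472.8 MB/s 
vs. 448.0 MB/s) but is still slightly slower than no compression at all 
(480.1 MB/s), so it still doesn't seem very beneficial. Maybe it would help 
with a very compressible dataset, and/or with a tuned compressor?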

> [C++][Flight] See if reading/writing to gRPC get/put streams asynchronously 
> helps performance
> ---------------------------------------------------------------------------------------------
>
>                 Key: ARROW-10351
>                 URL: https://issues.apache.org/jira/browse/ARROW-10351
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, FlightRPC
>            Reporter: Wes McKinney
>            Priority: Major
>
> We don't use any asynchronous concepts in the way that Flight is implemented 
> now, i.e. IPC deconstruction/reconstruction (which may include compression!) 
> is not performed concurrently with moving FlightData objects through the gRPC 
> machinery, which may yield suboptimal performance.
> It might be better to apply an actor-type approach where a dedicated thread 
> retrieves and prepares the next raw IPC message (within a Future) while the 
> current IPC message is being processed -- that way reading/writing to/from 
> the gRPC stream is not blocked on the IPC code doing its thing.



