[
https://issues.apache.org/jira/browse/IMPALA-9134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong resolved IMPALA-9134.
-----------------------------------
Fix Version/s: Impala 3.4.0
Resolution: Fixed
> Parallelise flush in data stream sender
> ---------------------------------------
>
> Key: IMPALA-9134
> URL: https://issues.apache.org/jira/browse/IMPALA-9134
> Project: IMPALA
> Issue Type: Sub-task
> Components: Distributed Exec
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Labels: perf, scalability
> Fix For: Impala 3.4.0
>
>
> The data stream sender currently closes each channel with a synchronous RPC:
> https://github.com/apache/impala/blob/d4648e8/be/src/runtime/krpc-data-stream-sender.cc#L565
> This is suboptimal because it serializes the network round trips. Closing
> all channels takes sum(RTT) over all destinations in the best case, where no
> data needs to be flushed, or 2 * sum(RTT) in the worst case, where every
> channel needs to flush data.
> If the RPCs were issued asynchronously and overlapped with each other, we
> could get this down to 2 * max(RTT) in the worst case.
> I'm including this as a subtask of multi-threading because the serialized
> close will scale poorly as the number of fragment instances increases.
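
A minimal sketch of the latency arithmetic in the description above, written in Python for brevity and not taken from the Impala code base. The function names and the sample RTT values are hypothetical. It assumes that closing a channel costs one round trip, plus one more when buffered data has to be flushed first, and that overlapped RPCs to different destinations do not slow each other down.

```python
from typing import Sequence


def channel_close_cost(rtt: float, needs_flush: bool) -> float:
    """Round-trip cost of closing one channel (assumed model).

    One round trip for the close itself, plus one more if buffered
    data must be flushed before the close can be sent.
    """
    return rtt * (2 if needs_flush else 1)


def serial_close_latency(rtts: Sequence[float], needs_flush: Sequence[bool]) -> float:
    """Channels are closed one after another, so the costs add up."""
    return sum(channel_close_cost(r, f) for r, f in zip(rtts, needs_flush))


def overlapped_close_latency(rtts: Sequence[float], needs_flush: Sequence[bool]) -> float:
    """All channels are closed concurrently, so the slowest one dominates."""
    return max(
        (channel_close_cost(r, f) for r, f in zip(rtts, needs_flush)),
        default=0.0,
    )


if __name__ == "__main__":
    # Hypothetical round-trip times in milliseconds for three destinations.
    rtts = [1.0, 2.0, 4.0]
    all_flush = [True] * len(rtts)
    no_flush = [False] * len(rtts)

    print("serial, no flush:     ", serial_close_latency(rtts, no_flush))
    print("serial, all flush:    ", serial_close_latency(rtts, all_flush))
    print("overlapped, all flush:", overlapped_close_latency(rtts, all_flush))
```

Under this model the serial cost grows with the number of destinations, while the overlapped cost is bounded by twice the largest RTT regardless of how many channels there are.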