Michael Ho created IMPALA-6766:
----------------------------------
Summary: Resource management for network
Key: IMPALA-6766
URL: https://issues.apache.org/jira/browse/IMPALA-6766
Project: IMPALA
Issue Type: Improvement
Components: Distributed Exec
Affects Versions: Impala 3.0, Impala 2.12.0
Reporter: Michael Ho
There is no way to manage the network bandwidth usage of a query. As a result,
a query which shuffles a huge amount of data can slow down other concurrent
queries. We should consider extending the resource pool concept to also manage
network usage. The profiles below show the observed bandwidth of a query when
it's run alone (good case) and when it's run with another query which shuffles
a lot of data across the network (bad case).
Good case:
DataStreamSender (dst_id=4)
- BytesSent: 828.3 MiB (868564531)
- InactiveTotalTime: 0ns (0)
- NetworkThroughput(*): 706.4 MiB/s (740751383)
Bad case:
DataStreamSender (dst_id=4)
- BytesSent: 828.3 MiB (868564531)
- InactiveTotalTime: 0ns (0)
- NetworkThroughput(*): 182.3 MiB/s (191106930)
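
To put the two profiles side by side, the short sketch below computes the
slowdown from the raw counters quoted above. The function names are
hypothetical helpers for illustration, not part of Impala. The "implied send
time" assumes the reported throughput was sustained over all of BytesSent,
which the profile does not state directly. With these numbers the bad case
comes out roughly 3.9x slower, a throughput drop of about 74%.

```python
# Raw counters copied from the two DataStreamSender (dst_id=4) profiles.
BYTES_SENT = 868564531        # bytes, identical in both runs
GOOD_THROUGHPUT = 740751383   # bytes/s, query run alone
BAD_THROUGHPUT = 191106930    # bytes/s, run alongside a shuffle-heavy query


def slowdown(good: float, bad: float) -> float:
    """How many times slower the bad case is than the good case."""
    return good / bad


def throughput_drop_pct(good: float, bad: float) -> float:
    """Percentage of throughput lost relative to the good case."""
    return 100.0 * (good - bad) / good


def implied_send_seconds(bytes_sent: int, throughput: float) -> float:
    """Assumption: throughput was sustained for the whole transfer."""
    return bytes_sent / throughput


if __name__ == "__main__":
    print(f"slowdown: {slowdown(GOOD_THROUGHPUT, BAD_THROUGHPUT):.2f}x")
    print(f"drop: {throughput_drop_pct(GOOD_THROUGHPUT, BAD_THROUGHPUT):.1f}%")
    for label, tput in (("good", GOOD_THROUGHPUT), ("bad", BAD_THROUGHPUT)):
        secs = implied_send_seconds(BYTES_SENT, tput)
        print(f"{label} case implied send time: {secs:.2f}s")
```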