Sahil Takiar created IMPALA-9458:
------------------------------------
Summary: Improve runtime profile counters for slow IO from remote stores
Key: IMPALA-9458
URL: https://issues.apache.org/jira/browse/IMPALA-9458
Project: IMPALA
Issue Type: Improvement
Reporter: Sahil Takiar
Remote storage systems (e.g. cloud stores like S3 and ABFS) often have long
tail latencies. Most I/O finishes relatively quickly, but some calls may take
significantly longer. Even for HDFS, this is an issue (e.g. hedged reads were
developed to help mitigate tail latencies, although no such feature exists for
cloud storage connectors).
Currently, scan nodes just track the total amount of time spent reading data.
It would be good to have a summary stats counter that tracks the min, avg, and
max time spent reading data. This should at least allow us to identify when
calls to remote storage services are taking longer than usual.
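
To illustrate the kind of counter proposed above, here is a minimal, language-neutral sketch in Python. It is not Impala's actual implementation or API: the class name `SummaryStatsCounter`, its fields, and the nanosecond unit are all assumptions made for this example. It only shows why keeping min, avg, and max makes a single slow read visible where a running total would hide it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SummaryStatsCounter:
    """Hypothetical counter tracking min, max, and mean of per-read latencies."""

    count: int = 0
    total_ns: int = 0
    min_ns: Optional[int] = None  # None until the first sample arrives
    max_ns: Optional[int] = None

    def update(self, value_ns: int) -> None:
        """Record the duration of one read call, in nanoseconds."""
        self.count += 1
        self.total_ns += value_ns
        self.min_ns = value_ns if self.min_ns is None else min(self.min_ns, value_ns)
        self.max_ns = value_ns if self.max_ns is None else max(self.max_ns, value_ns)

    @property
    def avg_ns(self) -> float:
        """Mean read duration; 0.0 if no reads were recorded."""
        return self.total_ns / self.count if self.count else 0.0


# Made-up sample latencies: two fast reads and one slow outlier.
read_latency = SummaryStatsCounter()
for sample_ns in (2_000_000, 3_000_000, 100_000_000):
    read_latency.update(sample_ns)
```

With only the total available, the three reads above look like 105 ms of I/O time. With the summary counter, the gap between the maximum and the average shows that one call dominated.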