[
https://issues.apache.org/jira/browse/IMPALA-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17050734#comment-17050734
]
Tim Armstrong commented on IMPALA-9458:
---------------------------------------
I did a couple of things recently that might cover or overlap with this. Linked
the JIRAs.
> Improve runtime profile counters for slow IO from remote stores
> ---------------------------------------------------------------
>
> Key: IMPALA-9458
> URL: https://issues.apache.org/jira/browse/IMPALA-9458
> Project: IMPALA
> Issue Type: Improvement
> Reporter: Sahil Takiar
> Priority: Major
>
> Remote storage systems (e.g. cloud stores like S3 and ABFS) often have long
> tail latencies. Most I/O finishes relatively quickly, but some calls may
> take significantly longer. Even for HDFS, this is an issue (e.g. hedged reads
> were developed to help mitigate tail latencies, although no such feature
> exists for cloud storage connectors).
> Currently, scan nodes just track the total amount of time spent reading data.
> It would be good to have a summary stats counter that tracks the min, avg,
> and max time spent reading data. This should at least allow us to identify
> when calls to remote storage services are taking longer than usual.