[ 
https://issues.apache.org/jira/browse/IMPALA-12046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yida Wu reassigned IMPALA-12046:
--------------------------------

    Assignee: Yida Wu  (was: Quanlong Huang)

> Add profile counter for scan range queueing time on disk queues
> ---------------------------------------------------------------
>
>                 Key: IMPALA-12046
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12046
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>            Reporter: Quanlong Huang
>            Assignee: Yida Wu
>            Priority: Critical
>              Labels: observability, supportability
>
> I saw a profile in which the total time of a ScanNode is dominated by 
> {{ScannerIoWaitTime}}. However, the HDFS open-file time and read time are 
> both small. No other counter explains why {{ScannerIoWaitTime}} is so long.
> {code:java}
> - DecompressionTime: 964.648ms
> - InactiveTotalTime: 0.000ns
> - MaterializeTupleTime: 2s132ms
> - ScannerIoWaitTime: 11s641ms          <-- Dominates the total time
> - TotalRawHdfsOpenFileTime: 14.501ms
> - TotalRawHdfsReadTime: 1s374ms
> - TotalReadThroughput: 29.94 MB/sec
> - TotalTime: 15s865ms{code}
> After some debugging, I realized that the time is spent queueing in the disk queue. 
> If the scanner consumes data faster than the disk queue threads can read it, 
> scan ranges wait in the disk queues. This queueing time is not 
> counted in either TotalRawHdfsOpenFileTime or TotalRawHdfsReadTime, but it is 
> counted in ScannerIoWaitTime. We should add a profile counter for the queueing 
> time on disk queues to better explain ScannerIoWaitTime.
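To illustrate the gap described above, here is a minimal toy model in Python. It is not Impala code: the names `RangeTiming`, `simulate_disk_queue`, and `summarize` are hypothetical, and it assumes a single disk thread serving scan ranges in FIFO order with a fixed read cost. It only sketches how time spent waiting in a queue can dominate what the consumer sees while the measured read time stays small.

```python
from dataclasses import dataclass


@dataclass
class RangeTiming:
    """Timestamps (in ms) for one scan range in the toy model."""
    enqueue_ms: float  # when the range was placed on the disk queue
    start_ms: float    # when the disk thread picked it up
    finish_ms: float   # when the read completed

    @property
    def queue_ms(self) -> float:
        # Time spent waiting in the queue; not part of the read time.
        return self.start_ms - self.enqueue_ms

    @property
    def read_ms(self) -> float:
        # Time the disk thread actually spent reading.
        return self.finish_ms - self.start_ms


def simulate_disk_queue(enqueue_times_ms, read_cost_ms):
    """Serve ranges FIFO on one hypothetical disk thread."""
    timings = []
    thread_free_at = 0.0
    for enq in sorted(enqueue_times_ms):
        start = max(enq, thread_free_at)  # wait until the thread is free
        finish = start + read_cost_ms
        timings.append(RangeTiming(enq, start, finish))
        thread_free_at = finish
    return timings


def summarize(timings):
    """Split the total wait seen by the consumer into read and queue parts."""
    read = sum(t.read_ms for t in timings)
    queue = sum(t.queue_ms for t in timings)
    return {"read_ms": read, "queue_ms": queue, "wait_ms": read + queue}


if __name__ == "__main__":
    # Four ranges issued at once, each taking 100 ms to read.
    print(summarize(simulate_disk_queue([0, 0, 0, 0], 100)))
```

In this toy run the read time alone accounts for less than half of the total wait; a separate queueing counter is what would make the remainder visible.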



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

