[ 
https://issues.apache.org/jira/browse/IMPALA-12046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17834657#comment-17834657
 ] 

Quanlong Huang commented on IMPALA-12046:
-----------------------------------------

I think this is helpful. Do you still plan to work on this? [~baggio000] 

> Add profile counter for scan range queueing time on disk queues
> ---------------------------------------------------------------
>
>                 Key: IMPALA-12046
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12046
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>            Reporter: Quanlong Huang
>            Assignee: Yida Wu
>            Priority: Critical
>              Labels: observability, supportability
>
> I saw a profile showing that the total time of a ScanNode is dominated by 
> {{ScannerIoWaitTime}}. However, the hdfs openFileTime and readTime are 
> both small. No other counters can explain why {{ScannerIoWaitTime}} is long.
> {code:java}
> - DecompressionTime: 964.648ms
> - InactiveTotalTime: 0.000ns
> - MaterializeTupleTime: 2s132ms
> - ScannerIoWaitTime: 11s641ms          <-- Dominates the total time
> - TotalRawHdfsOpenFileTime: 14.501ms
> - TotalRawHdfsReadTime: 1s374ms
> - TotalReadThroughput: 29.94 MB/sec 
> - TotalTime: 15s865ms{code}
> After some debugging, I realized the time is spent queueing in the disk queue. 
> If the scanner is consuming data faster than the disk queue threads can read, 
> scan ranges will queue up in the disk queues. The queueing time is not 
> counted in either TotalRawHdfsOpenFileTime or TotalRawHdfsReadTime, but is 
> counted in ScannerIoWaitTime. We should add a profile counter for the queueing 
> time on disk queues to better explain ScannerIoWaitTime.
> We may also consider adding metrics in the scanner that record which queues 
> it reads data from and how much data it reads from each.
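
The proposed counter can be illustrated with a minimal sketch, assuming a simple FIFO queue: stamp each scan range when it is enqueued, and add the elapsed time to a counter when a disk thread dequeues it. This is not Impala's implementation; ScanRange, DiskQueue, queue_wait_ns, and FakeClock are hypothetical names used only for illustration.

```python
import collections
import dataclasses


@dataclasses.dataclass
class ScanRange:
    """Hypothetical stand-in for a scan range waiting on a disk queue."""
    name: str
    enqueue_time_ns: int = 0


class DiskQueue:
    """Toy FIFO queue that accumulates how long scan ranges waited in it."""

    def __init__(self, clock):
        self._clock = clock        # callable returning the current time in ns
        self._ranges = collections.deque()
        self.queue_wait_ns = 0     # hypothetical "queueing time" counter

    def enqueue(self, scan_range):
        # Stamp the range so the wait can be measured at dequeue time.
        scan_range.enqueue_time_ns = self._clock()
        self._ranges.append(scan_range)

    def dequeue(self):
        # A disk thread picks up the oldest range; the time it spent
        # queued is added to the counter before any open or read starts.
        scan_range = self._ranges.popleft()
        self.queue_wait_ns += self._clock() - scan_range.enqueue_time_ns
        return scan_range


class FakeClock:
    """Deterministic clock for demonstration purposes."""

    def __init__(self):
        self.now_ns = 0

    def __call__(self):
        return self.now_ns

    def advance_ms(self, ms):
        self.now_ns += ms * 1_000_000
```

Outside of a demonstration, time.monotonic_ns could be passed as the clock. Because the wait is measured before any open or read begins, it would account for time that neither TotalRawHdfsOpenFileTime nor TotalRawHdfsReadTime covers.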



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
