[
https://issues.apache.org/jira/browse/IMPALA-7095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16802954#comment-16802954
]
ASF subversion and git services commented on IMPALA-7095:
---------------------------------------------------------
Commit 0f76e7c380a22f828acfd8d82d57b8c4166b5d93 in impala's branch
refs/heads/2.x from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0f76e7c ]
IMPALA-7095: clean up scan node profiles
Add counters to scan node implementations where they make sense but were
missing (e.g. row batch queue counters for multithreaded Kudu scans) and
remove them where they don't make sense (e.g. scanner thread counters
for non-multithreaded scans).
Refactors the multithreaded Kudu and HDFS scans to share logic via
composition (single inheritance doesn't work for this case),
which enables the same set of counters to be maintained with shared
code. The row batch queueing and thread tracking are now shared. I looked
at combining the logic around 'status_', 'lock_' and 'done_' between the
two but the details were different enough that it didn't seem worth
abstracting.
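To illustrate the composition approach described above, here is a minimal
sketch, not the actual Impala code. All class and method names
(SharedScanStateSketch, HdfsScanBaseSketch, KuduScanBaseSketch, ProduceBatch,
and so on) are hypothetical. It assumes each multithreaded scan already
derives from its own engine-specific base class, so the shared queue and
counter logic lives in a member object rather than a common parent.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical shared helper: owns the row batch queue and its counters.
class SharedScanStateSketch {
 public:
  void EnqueueBatch(std::string batch) {
    std::lock_guard<std::mutex> l(lock_);
    queue_.push_back(std::move(batch));
    ++batches_enqueued_;
    if (queue_.size() > peak_queue_size_) peak_queue_size_ = queue_.size();
  }

  // Returns false if the queue is empty; otherwise moves the oldest batch out.
  bool DequeueBatch(std::string* out) {
    assert(out != nullptr);
    std::lock_guard<std::mutex> l(lock_);
    if (queue_.empty()) return false;
    *out = std::move(queue_.front());
    queue_.pop_front();
    return true;
  }

  std::size_t batches_enqueued() const {
    std::lock_guard<std::mutex> l(lock_);
    return batches_enqueued_;
  }

  std::size_t peak_queue_size() const {
    std::lock_guard<std::mutex> l(lock_);
    return peak_queue_size_;
  }

 private:
  mutable std::mutex lock_;
  std::deque<std::string> queue_;
  std::size_t batches_enqueued_ = 0;
  std::size_t peak_queue_size_ = 0;
};

// Hypothetical engine-specific bases; each scan can inherit from only one.
struct HdfsScanBaseSketch { virtual ~HdfsScanBaseSketch() = default; };
struct KuduScanBaseSketch { virtual ~KuduScanBaseSketch() = default; };

// Both multithreaded scans hold the same helper as a member (composition),
// so the counters are maintained by one piece of code.
class MtHdfsScanSketch : public HdfsScanBaseSketch {
 public:
  void ProduceBatch(std::string b) { state_.EnqueueBatch(std::move(b)); }
  SharedScanStateSketch& state() { return state_; }
 private:
  SharedScanStateSketch state_;
};

class MtKuduScanSketch : public KuduScanBaseSketch {
 public:
  void ProduceBatch(std::string b) { state_.EnqueueBatch(std::move(b)); }
  SharedScanStateSketch& state() { return state_; }
 private:
  SharedScanStateSketch state_;
};
```

Because the helper is a member, either scan can report identical counters
without the two scans sharing a parent class.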
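Adds a PeakScannerThreadConcurrency counter. This answers a common
question: how many scanner threads were running at once at the peak
(NumScannerThreadsStarted is often mistaken for this).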
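The difference between a "threads started" total and a peak concurrency
value can be sketched as follows. This is an illustration only, not Impala's
counter implementation; the class ThreadConcurrencySketch and its methods are
hypothetical names. It assumes the peak is kept as a high-water mark that is
raised with a compare-and-swap loop whenever the active thread count exceeds
it.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical tracker: 'started' only ever grows, 'active' goes up and down,
// and 'peak' records the largest value 'active' has reached.
class ThreadConcurrencySketch {
 public:
  void ThreadStarted() {
    started_.fetch_add(1);
    const std::int64_t now = active_.fetch_add(1) + 1;
    std::int64_t prev = peak_.load();
    // Raise the high-water mark; on failure 'prev' is reloaded and rechecked.
    while (prev < now && !peak_.compare_exchange_weak(prev, now)) {
    }
  }

  void ThreadFinished() {
    const std::int64_t before = active_.fetch_sub(1);
    assert(before > 0);
    (void)before;  // Avoids an unused-variable warning when NDEBUG is defined.
  }

  std::int64_t started() const { return started_.load(); }
  std::int64_t active() const { return active_.load(); }
  std::int64_t peak() const { return peak_.load(); }

 private:
  std::atomic<std::int64_t> started_{0};
  std::atomic<std::int64_t> active_{0};
  std::atomic<std::int64_t> peak_{0};
};
```

If three threads are started over the life of a scan but at most two overlap,
started() reports 3 while peak() reports 2, which is the distinction the new
counter makes visible.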
Fixes RowsRead for data source scans.
Fix some of the comments to be more accurate/useful.
Testing:
Ran exhaustive tests. Ran various types of scans (HDFS, Kudu, HBase,
Data source) and inspected the profile output manually.
Change-Id: I77286282d42e7764bfdf94c7ec47cec9d743f787
Reviewed-on: http://gerrit.cloudera.org:8080/10810
Reviewed-by: Tim Armstrong <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-on: http://gerrit.cloudera.org:8080/12848
> Improve scanner thread counters in HDFS and Kudu scans
> ------------------------------------------------------
>
> Key: IMPALA-7095
> URL: https://issues.apache.org/jira/browse/IMPALA-7095
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.0
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Labels: observability
> Fix For: Impala 3.1.0
>
>
> There are a few deficiencies here:
> * We don't track the peak number of scanner threads. Consumers of the profile
> often confuse NumScannerThreadsStarted with the peak.
> * Kudu scans are missing some metrics, e.g. AverageScannerThreadConcurrency.
> We should make sure that Kudu and HDFS are consistent.
> We should clean this up, and maybe refactor the code so that less logic is
> duplicated.