Qifan Chen has uploaded a new patch set (#10).
( http://gerrit.cloudera.org:8080/16474 )
Change subject: IMPALA-10178 Run-time profile shall report skews
......................................................................
IMPALA-10178 Run-time profile shall report skews
This fix addresses a limitation in the runtime profile: skew in
certain operator counters, such as the rows read counter (RowsRead)
in the scan operators, is not reported. Skew exists when the number
of rows processed differs substantially across the instances of an
operator, and it can be detected through the standard deviation
(stddev) of the per-instance values. A high stddev (say > 5) usually
implies the existence of skew.
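The detection idea above can be sketched as follows. This is only a minimal illustration, not the code in this patch: the function names PopulationStddev and HasSkew are hypothetical, and the default threshold of 5 is taken from the "say > 5" remark above. Dividing by the number of instances (population stddev) is an assumption, chosen because it reproduces the stddev values shown in the examples further down.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Population standard deviation (divide by N) of the per-instance values
// of one counter, e.g. RowsRead across all instances of a scan node.
// Hypothetical helper; the patch's own stddev routine may differ.
double PopulationStddev(const std::vector<int64_t>& values) {
  assert(!values.empty());
  double mean = 0.0;
  for (int64_t v : values) mean += static_cast<double>(v);
  mean /= values.size();

  double sum_sq = 0.0;
  for (int64_t v : values) {
    const double d = static_cast<double>(v) - mean;
    sum_sq += d * d;
  }
  return std::sqrt(sum_sq / values.size());
}

// Flags a counter as skewed when its stddev exceeds a threshold.
// The default of 5.0 is illustrative only (assumption).
bool HasSkew(const std::vector<int64_t>& values, double threshold = 5.0) {
  return PopulationStddev(values) > threshold;
}
```

Under these assumptions, PopulationStddev({16904, 17750, 19197}) evaluates to about 946.77, and HasSkew returns true for that input.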
With the fix, such skew is detected in the averaged fragment profile
for the following counters:
1. RowsRead in the HDFS_SCAN_NODE profile;
2. ProbeRows and BuildRows in the HASH_JOIN_NODE profile;
3. RowsReturned in the GroupingAggregator profile;
and is reported as follows:
1. In the skew summary section, which lists the names of the
operators with skew;
2. In each corresponding operator, with the names of the skewed
counters and the corresponding stddev values.
Examples of skew reported for a hash join and an HDFS scan:
Averaged Fragment F00:(Total: 1s075ms, non-child: 26.919ms, ...
... ...
num instances: 3
skew(s) found at: HASH_JOIN_NODE (id=4), HDFS_SCAN_NODE (id=0)
HASH_JOIN_NODE (id=4):(Total: 1s204ms, non-child: 2.166ms, ...
Skew details: ProbeRows ([16904, 17750, 19197], stddev=946.77)
... ...
HDFS_SCAN_NODE (id=0):(Total: 1s032ms, non-child: 1s032ms, ...
Skew details: RowsRead ([913887, 917913, 1048604], stddev=62578.85)
TODO:
1. Add unit tests;
2. Run core tests.
Change-Id: I91041f2856eef8293ea78f1721f97469062589a1
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/util/CMakeLists.txt
A be/src/util/runtime-profile-counters.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M be/src/util/stat-util.h
7 files changed, 149 insertions(+), 8 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/16474/10
--
To view, visit http://gerrit.cloudera.org:8080/16474
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1
Gerrit-Change-Number: 16474
Gerrit-PatchSet: 10
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>