Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16474 )

Change subject: IMPALA-10178 Run-time profile shall report skews
......................................................................


Patch Set 21:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16474/21/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/16474/21/be/src/util/runtime-profile.cc@1928
PS21, Line 1928:   if (stddev > 5.0) {
> In the past, stddev with a threshold of 5 served the purpose well.

I would be interested in seeing what evidence we have to support this. It might
be worth running this logic against some larger runtime profiles to see what
comes out.

With Parquet encodings, Parquet filter pushdown, page skipping, runtime
filters, etc., I wouldn't expect the number of rows read by a scan node to be
that close together.

I just want to make sure that the skew flag doesn't start popping up on every
runtime profile we get, in which case folks will start to ignore it.

Another benchmark might be to see how many TPC-DS or TPC-H profiles the skew
flag pops up on. Maybe 30 GB would be enough scale; I'm not sure.
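
To make the concern concrete, here is a minimal sketch of the kind of check I
have in mind. It assumes that `stddev` at line 1928 is the raw population
standard deviation of per-instance row counts; if the patch normalizes it
first, the numbers below do not apply. The helper names (PopulationStddev,
ExceedsStddevThreshold, CountFlagged) are hypothetical and are not taken from
the patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Population standard deviation of a set of per-instance row counts.
// Returns 0 for an empty input.
double PopulationStddev(const std::vector<int64_t>& values) {
  if (values.empty()) return 0.0;
  double sum = 0.0;
  for (int64_t v : values) sum += static_cast<double>(v);
  const double mean = sum / values.size();
  double sum_sq_dev = 0.0;
  for (int64_t v : values) {
    const double dev = static_cast<double>(v) - mean;
    sum_sq_dev += dev * dev;
  }
  return std::sqrt(sum_sq_dev / values.size());
}

// Hypothetical stand-in for the check under review: flags the counter as
// skewed when the raw stddev exceeds a fixed threshold (5.0 in PS21).
bool ExceedsStddevThreshold(const std::vector<int64_t>& rows, double threshold) {
  assert(threshold >= 0.0);
  return PopulationStddev(rows) > threshold;
}

// Counts how many counters in a corpus (for example, one entry per scan node
// extracted from a set of profiles) would be flagged at the given threshold.
int CountFlagged(const std::vector<std::vector<int64_t>>& corpus,
                 double threshold) {
  int flagged = 0;
  for (const auto& rows : corpus) {
    if (ExceedsStddevThreshold(rows, threshold)) ++flagged;
  }
  return flagged;
}
```

Under that assumption, four instances reading 1000000, 1000020, 999980 and
1000000 rows have a stddev of about 14 and would be flagged, even though the
counts differ by only 0.002%. Running something like CountFlagged over the
counters from real TPC-DS or TPC-H profiles would show how often the flag
fires in practice.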



--
To view, visit http://gerrit.cloudera.org:8080/16474
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91041f2856eef8293ea78f1721f97469062589a1
Gerrit-Change-Number: 16474
Gerrit-PatchSet: 21
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Thu, 24 Sep 2020 20:32:43 +0000
Gerrit-HasComments: Yes
