Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17306 )

Change subject: IMPALA-10445: Adjust NDV's scale with query option
......................................................................


Patch Set 9:

(6 comments)

Looks good to me!

http://gerrit.cloudera.org:8080/#/c/17306/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17306/9//COMMIT_MSG@12
PS9, Line 12: estimation by setting larger `scale`. That scale is decided by SQL
in SQL function NDV(<expr>, <scale>)


http://gerrit.cloudera.org:8080/#/c/17306/9//COMMIT_MSG@11
PS9, Line 11: Since IMPALA-2658, we can trade memory for more accurate
            : estimation by setting larger `scale`. That scale is decided by SQL
            : writers. However, it is a bumpy road for cluster admins to allow 
for
            : larger scales. Here lies 2 reasons:
May rewrite the sentence as follows.

Since IMPALA-2658, we can trade memory for more accurate
estimation by setting larger NDV scale in SQL function NDV(<expr>, <scale>). 
However, the use of larger NDV scale requires the modification of the NDV 
function in queries which may not be practical in certain applications.


http://gerrit.cloudera.org:8080/#/c/17306/9//COMMIT_MSG@24
PS9, Line 24: In this commit, we introduced a new Query Option 
`DEFAULT_NDV_SCALE`.
            : During to the advantage of query option, Cluster admins can 
either tune
            : 1 desired query, or influence upcoming queries by placing a 
default
            : query option in a dynamic resource pool.
This paragraph can be rewritten as follows.

In this commit, we introduced a new Query Option `DEFAULT_NDV_SCALE` with the 
following semantics.

1. The allowed value is in the range [2..10];
2. The value of the query option is the default NDV scale for the SQL function 
of form NDV(<expr>). Previously, the NDV scale used in such functions is fixed 
at 2;
3. It does not influence the NDV scale for SQL function NDV(<expr>, <scale>) in 
which the NDV scale is provided by the 2nd argument <scale>.


http://gerrit.cloudera.org:8080/#/c/17306/9/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/17306/9/common/thrift/ImpalaService.thrift@655
PS9, Line 655: , make it easier to change NDV's scale
may change to " to SQL NDV function NDV(<expr>.


http://gerrit.cloudera.org:8080/#/c/17306/9/fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java
File fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java:

http://gerrit.cloudera.org:8080/#/c/17306/9/fe/src/main/java/org/apache/impala/rewrite/CountDistinctToNdvRule.java@58
PS9, Line 58: Do function name substitution
Probably should say "Create the substituted form".


http://gerrit.cloudera.org:8080/#/c/17306/9/fe/src/main/java/org/apache/impala/rewrite/DefaultNdvScaleRule.java
File fe/src/main/java/org/apache/impala/rewrite/DefaultNdvScaleRule.java:

http://gerrit.cloudera.org:8080/#/c/17306/9/fe/src/main/java/org/apache/impala/rewrite/DefaultNdvScaleRule.java@38
PS9, Line 38: DefaultNdvScaleRule
I wonder if this rule can be merged to the CountDistinctToNdvRule. That is, if 
APPX_COUNT_DISTINCT query option is set, we fire that rule. Within that rule, 
we check for query option DEFAULT_NDV_SCALE and form NDV() function in two 
ways: NDV(<expr>) if the value of DEFAULT_NDV_SCALE is 2, or NDV(<expr>, 
<scale>) otherwise.



--
To view, visit http://gerrit.cloudera.org:8080/17306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1669858a6e8252e167b464586e8d0b6cb0d0bd50
Gerrit-Change-Number: 17306
Gerrit-PatchSet: 9
Gerrit-Owner: fifteencai <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Tue, 13 Apr 2021 17:40:26 +0000
Gerrit-HasComments: Yes

Reply via email to