Qifan Chen has uploaded this change for review. (
http://gerrit.cloudera.org:8080/15997 )


Change subject: Fix JIRA IMPALA-2658: Extend the NDV function to accept a 
precision
......................................................................

Fix JIRA IMPALA-2658: Extend the NDV function to accept a precision

 This work addresses a limitation of the NDV function by extending
 it to accept an optional 2nd argument, which must be an integer
 literal in the range of 1 to 10. This argument is an abstract
 value that is mapped to the real precision value used in the HLL
 algorithm for the function.

 Front end work:
  1. Allow the template ndv function in the builtin db to take a variable
     number of arguments;
  2. When necessary, extend the argument types of NDV() with one extra
     argument of integer type;
  3. Verify that the 2nd argument of an NDV() call is an integer literal
     in [1,10];
  4. Add a new method that maps the abstract value to the HLL precision
     (the final mapping is TBD);
  5. Compute the length of the intermediate data type from the actual
     HLL precision. When the 2nd argument is missing, the length is 1024,
     as in the current implementation;
  6. Carry the 2nd argument, if present, all the way to the BE.

 Back end work:
  1. Remove the hardcoded precision (10) from these methods:
       AggregateFunctions::HllInit(),
       AggregateFunctions::HllUpdate(),
       AggregateFunctions::HllMerge(),
       AggregateFunctions::HllFinalEstimate(),
       AggregateFunctions::HllFinalize();
  2. Instead, compute the actual precision from the length of the
     intermediate data type as log2(hll_len);
  3. Verify that the length of the intermediate data type is
     consistent with the 2nd argument (if present).
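
 As an illustration of back end item 2, the sketch below shows one way
 the precision could be recovered from the intermediate length. It is
 not taken from the patch: the helper names HllPrecisionFromLength and
 HllLengthFromPrecision are hypothetical, and the sketch assumes that
 hll_len is always an exact power of two, which is consistent with the
 default length of 1024 for precision 10.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): recover the HLL precision from
// the length of the intermediate value, i.e. log2(hll_len).
// Assumption: a valid length is an exact power of two. Returns -1 otherwise.
int HllPrecisionFromLength(int64_t hll_len) {
  // A positive power of two has exactly one bit set.
  if (hll_len <= 0 || (hll_len & (hll_len - 1)) != 0) return -1;
  int precision = 0;
  while (hll_len > 1) {
    hll_len >>= 1;
    ++precision;
  }
  return precision;
}

// Hypothetical inverse: the intermediate length implied by a precision.
int64_t HllLengthFromPrecision(int precision) {
  assert(precision >= 0 && precision < 63);
  return int64_t{1} << precision;
}
```

 With these definitions, HllPrecisionFromLength(1024) yields 10, the
 value that was previously hardcoded, and a length that is not a power
 of two is rejected rather than silently rounded.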

 Work TBD:
  1. Add unit tests;
  2. Determine the final mapping of the 10 abstract values to
     10 possible HLL precisions.

Change-Id: I48a4517bd0959f7021143073d37505a46c551a58
---
M be/src/common/logging.h
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
M be/src/exprs/expr.h
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
9 files changed, 173 insertions(+), 36 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/15997/1
--
To view, visit http://gerrit.cloudera.org:8080/15997
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I48a4517bd0959f7021143073d37505a46c551a58
Gerrit-Change-Number: 15997
Gerrit-PatchSet: 1
Gerrit-Owner: Qifan Chen <[email protected]>
