Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-2373: Extrapolate row counts for HDFS tables.
......................................................................


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/6840/2/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

PS2, Line 77: TTableStats tableStats_
nit: a bit safer to initialize this here and set now rows to -1.


PS2, Line 196: /**
             :    * Returns the value of the ROW_COUNT constant, or -1 if not 
found.
             :    */
Remove, not needed.


http://gerrit.cloudera.org:8080/#/c/6840/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

PS2, Line 128: Input cardinality based on the partition row counts and based on 
extrapolation
             :   // using the table-level row count and raw data size. -1 if 
invalid.
Remove the part that explains where this is derived from; doesn't really help. 
Simply state what these two variables represent.


PS2, Line 664: Chose
Choose (typo)


PS2, Line 664:  Chose between the extrapolated row count and the one based on 
stored stats.
I think that requires some further explanation. Something along the lines of 
"if table stats (row count, bytes) are available, use that information to 
estimate cardinality using extrapolation, otherwise use the exact row counts 
from table/partition stats". Also, I would put L664-705 in a separate function 
responsible solely for estimating and setting cardinality, but feel free to 
ignore this if you don't like it.


PS2, Line 672: if (!partitions_.isEmpty() && numPartitionsMissingStats_ == 
partitions_.size()
             :         && extrapolatedNumRows_ == -1) {
             :       // if none of the partitions knew its number of rows, and 
extrapolation was
             :       // not possible, we fall back on the table stats
             :       cardinality_ = tbl_.getNumRows();
             :     }
I think it might help readability if you put this inside the if block in 
L666-668. Also, do we have a test for this case?


http://gerrit.cloudera.org:8080/#/c/6840/2/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test:

PS2, Line 268: 61
Is this ok that this changed?


-- 
To view, visit http://gerrit.cloudera.org:8080/6840
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I972c8a03ed70211734631a7dc9085cb33622ebc4
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-HasComments: Yes

Reply via email to