Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13993 )

Change subject: IMPALA-8834: Short-circuit partition key scan
......................................................................


Patch Set 14:

(5 comments)

A few minor comments. Looks good overall.

http://gerrit.cloudera.org:8080/#/c/13993/14//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13993/14//COMMIT_MSG@11
PS14, Line 11: It can therefore be safely enabled by default.
I suppose the enabling by default will be a separate patch ?


http://gerrit.cloudera.org:8080/#/c/13993/14//COMMIT_MSG@55
PS14, Line 55: Testing:
Thanks for adding the detailed tests :)


http://gerrit.cloudera.org:8080/#/c/13993/14/be/src/exec/hdfs-scan-node-mt.cc
File be/src/exec/hdfs-scan-node-mt.cc:

http://gerrit.cloudera.org:8080/#/c/13993/14/be/src/exec/hdfs-scan-node-mt.cc@98
PS14, Line 98:   row_batch->limit_capacity(1);
Should this limiting be under an if(is_partition_key_scan_) check ?


http://gerrit.cloudera.org:8080/#/c/13993/14/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/13993/14/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1134
PS14, Line 1134:       long numRangesAdjusted = numRanges * 100;
I can see why the adjustment would be needed  .. to ensure a distributed scan 
is done.  Presumably, for smaller number of partitions, we would still be ok to 
run on a single node ?
Since this is multiplying a long value, it could overflow in a pathological 
case, so suggest capping it with max long value.


http://gerrit.cloudera.org:8080/#/c/13993/14/testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test:

http://gerrit.cloudera.org:8080/#/c/13993/14/testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test@21
PS14, Line 21:    HDFS partitions=24/24 files=24 size=478.45KB
it wasn't obvious to me why the HDFS prefix got added based on the changes in 
this patch since the new item added to the Explain plan is 'partition key scan'.



--
To view, visit http://gerrit.cloudera.org:8080/13993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26c87525a4f75ffeb654267b89948653b2e1ff8c
Gerrit-Change-Number: 13993
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Tue, 21 Apr 2020 00:45:57 +0000
Gerrit-HasComments: Yes

Reply via email to