Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 )

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
......................................................................


Patch Set 13:

(4 comments)

Thanks, that looks simpler and clearer now. Just some minor things.

http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/8523/13/be/src/scheduling/scheduler.cc@302
PS13, Line 302:         for (const TScanRangeLocationList& range : entry.second.concrete_range) {
              :           expanded_locations.push_back(range);
              :         }
That could be just
expanded_locations.insert(expanded_locations.end(),
    entry.second.concrete_range.begin(), entry.second.concrete_range.end())?
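
A minimal sketch of the range-insert form, for illustration only. It assumes expanded_locations is a std::vector (the push_back above suggests so) and uses a stand-in struct instead of the Thrift-generated TScanRangeLocationList; AppendRanges is a hypothetical helper name. Note that std::vector::insert takes the insertion position as its first argument.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the Thrift-generated TScanRangeLocationList
// (hypothetical, only so the sketch compiles on its own).
struct ScanRangeLocationList {
  int id;
};

// Appends every element of 'src' to the end of 'dst' with a single
// range insert, replacing an element-by-element push_back loop.
void AppendRanges(const std::vector<ScanRangeLocationList>& src,
                  std::vector<ScanRangeLocationList>* dst) {
  dst->insert(dst->end(), src.begin(), src.end());
}
```

The range insert keeps the existing elements and preserves the order of the appended ones, so behavior should match the loop.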


http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift
File common/thrift/Planner.thrift:

http://gerrit.cloudera.org:8080/#/c/8523/13/common/thrift/Planner.thrift@108
PS13, Line 108: concrete_range
Use the plural since the field is a list: concrete_ranges (and likewise split_specs).


http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@224
PS13, Line 224: scanRangeSpecs_.getSplit_specSize()
This seems wrong: a single spec may result in multiple hosts, no?
Though I guess for HBase this won't be set, so in practice it doesn't matter.
But should we instead just assert that the split_spec list size is 0?


http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8523/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1126
PS13, Line 1126: scanRangeSpecs_.getSplit_specSize();
Shouldn't that be calculated from the file size, the block size, and whether
the file is splittable? I.e., after your change we'll get a different number
for numRemoteRanges when running on S3, right?
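
A rough sketch of the kind of calculation meant here, written in C++ purely for illustration (the file under review is Java). EstimateNumRanges is a hypothetical helper, and the rule it encodes is an assumption rather than what HdfsScanNode does: a splittable file yields one range per block-sized chunk, rounded up, and a non-splittable file yields a single range.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: estimates how many scan ranges one file produces.
// Assumptions (not taken from the patch): an empty file produces no ranges,
// a non-splittable file produces exactly one, and a splittable file
// produces one range per block-sized chunk.
int64_t EstimateNumRanges(int64_t file_size, int64_t block_size,
                          bool splittable) {
  if (file_size <= 0) return 0;
  if (!splittable || block_size <= 0) return 1;
  // Ceiling division: a trailing partial block still needs its own range.
  return (file_size + block_size - 1) / block_size;
}
```

Summing this per file would give a count that can exceed the number of split specs, which is the discrepancy in question.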



--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 13
Gerrit-Owner: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Fri, 18 May 2018 23:09:26 +0000
Gerrit-HasComments: Yes
