Gabor Kaszab has posted comments on this change.

Change subject: IMPALA-5412 Fix scan result with partitions on same file
......................................................................


Patch Set 1:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

Line 90:           context->partition_descriptor()->id(),stream_->filename()));
> nit:space after ,
Done


PS1, Line 163: ,s
> space
Done


PS1, Line 309:     
> 4 spaces
Done


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 613:     context_->partition_descriptor()->id(), filename());
> 4 spaces
Done


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

PS1, Line 266: ,
> space
Done


PS1, Line 266: string
> Do we need to call the string constructor explicitly? Seems a bit weird.
Done


PS1, Line 266: string
> remove string constructor here
Done


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

Line 197:   /// Allocate a new scan range object, stored in the runtime state's 
object pool. For
> This comment needs updating since partition_id is now required.
Done


PS1, Line 237:   /// Returns nullptr if the search doesn't find the descriptor.
> not true; it has a dcheck
Done


PS1, Line 360: pair_hash
> misleading name because the hash is computed on the value.
apparently, unordered_map doesn't accept pair as key by default, only if a 
custom hash is provided.


Line 360:   struct pair_hash {
> We have some utilities like this already in util/container-util.h, let's mo
Done


Line 363:         return std::hash<T2>{}(p.second);
> Let's hash both values in the pair and combine them with boost::hash_combin
Done


PS1, Line 369: pair<int64_t, std::string>
> Can you typedef this and using that here and in the .cc file ?
Done


PS1, Line 381: pair<int64_t, std::string>
> same
Done


http://gerrit.cloudera.org:8080/#/c/7625/1/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

Line 71:     data = self.execute_scalar("select sum(i), sum(j) from %s" % 
FQ_TBL_NAME)
> Can we run this query with num_nodes=1 to verify that the same bug doesn't 
Done


PS1, Line 81: ad
> as?
Done


PS1, Line 81: imapala
> Impala
Done


PS1, Line 109: output
> 'output' isn't clear (it doesn't match up with the queries). say this is wh
Done


Line 110:     # (note, that shortcoming on avro writer would result the 
inserted value as NULL,
> How about we only query the second column to avoid this complication?
Done


PS1, Line 122: output:
             :     # [NULL, 1] 3 times
             :     # [NULL, 2] 3 times
> again, this is confusing
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7625
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <laszlo.g...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to