Amogh Margoor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materilization of columns for filtered out 
rows in Parquet table.
......................................................................


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/parquet/hdfs-parquet-scanner.cc@253
PS6, Line 253:   conjuncts.reserve(scan_node_->conjuncts().size() +
> Pass reserve to ctor instead.
Avoiding it for the reason mentioned in another similar comment.


http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/parquet/hdfs-parquet-scanner.cc@274
PS6, Line 274:     const vector<ScalarExpr*>& conjuncts) const {
> Pass reserve to ctor instead.
Passing size to ctor would not only reserve the memory it would also initialise 
it with some values (0 for int, default constructor for user defined class etc 
). So using push_back and insert after that would give a different result: 
https://onlinegdb.com/uyvYdBNZ1


http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/parquet/parquet-column-chunk-reader.h
File be/src/exec/parquet/parquet-column-chunk-reader.h:

http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/parquet/parquet-column-chunk-reader.h@124
PS6, Line 124:   Status ReadNextDataPage(
> Probably not necessary/good to templatize page-level functions like this un
I agree. Since it is at the page level we can avoid it. Removed it now.


http://gerrit.cloudera.org:8080/#/c/17860/5/be/src/exec/parquet/parquet-column-chunk-reader.h
File be/src/exec/parquet/parquet-column-chunk-reader.h:

http://gerrit.cloudera.org:8080/#/c/17860/5/be/src/exec/parquet/parquet-column-chunk-reader.h@125
PS5, Line 125:       bool* eos, uint8_t** data, int* data_size, bool read_data 
= true);
> What was the motivation for inlining this (page-level) function?
inlining was side effect of using template; since definition for templated 
function has to be visible everywhere it is used, it was pulled into header. 
But since I removed template I have removed inlining too.


http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/scratch-tuple-batch-test.cc
File be/src/exec/scratch-tuple-batch-test.cc:

http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/scratch-tuple-batch-test.cc@64
PS6, Line 64:   scoped_ptr<ScratchTupleBatch> scratch_batch(
> line too long (100 > 90)
Done



--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 7
Gerrit-Owner: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 12 Oct 2021 13:24:29 +0000
Gerrit-HasComments: Yes

Reply via email to