Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17860 )

Change subject: IMPALA-9873: Avoid materialization of columns for filtered out 
rows in Parquet table.
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@253
PS3, Line 253:   conjuncts.reserve(scan_node_->conjuncts().size() +
             :     scan_node_->filter_exprs().size());
             :   conjuncts.insert(std::end(conjuncts), std::begin(scan_node_->conjuncts()),
> I tested this for Min/Max filters at row level and this code handles it as
Great! Consider adding a test for the following query. Since the filter is on
a.wr_item_sk, late materialization should save work on a.wr_reason_sk, which
would only need to be materialized for rows that survive the filter.

select a.wr_reason_sk, a.wr_item_sk from web_returns a, web_returns b where
a.wr_item_sk = b.wr_item_sk and
b.wr_return_ship_cost < 10;
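
To illustrate what such a test would exercise, here is a minimal standalone sketch of late materialization. It is not the code in hdfs-parquet-scanner.cc: Row, ScanStats, LateMaterialize and the keep_item_sk predicate are hypothetical names, and the predicate stands in for whatever runtime filter the join produces on wr_item_sk. The counter records how many wr_reason_sk values were actually materialized, which is the kind of quantity a test could assert on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical output row; field names mirror the query, not Impala's tuple layout.
struct Row {
  int64_t wr_item_sk;
  int64_t wr_reason_sk;
};

// Hypothetical counter for values of the non-filtered column that were materialized.
struct ScanStats {
  std::size_t reason_sk_materialized = 0;
};

// Two-pass scan over a pair of equally sized columns.
// Pass 1 evaluates the filter using only the filter column (wr_item_sk).
// Pass 2 materializes wr_reason_sk only for the rows that survived pass 1.
std::vector<Row> LateMaterialize(const std::vector<int64_t>& item_sk_col,
                                 const std::vector<int64_t>& reason_sk_col,
                                 const std::function<bool(int64_t)>& keep_item_sk,
                                 ScanStats* stats) {
  assert(item_sk_col.size() == reason_sk_col.size());

  // Pass 1: collect the indices of rows that pass the filter.
  std::vector<std::size_t> selected;
  for (std::size_t i = 0; i < item_sk_col.size(); ++i) {
    if (keep_item_sk(item_sk_col[i])) selected.push_back(i);
  }

  // Pass 2: touch the second column only at the selected indices.
  std::vector<Row> out;
  out.reserve(selected.size());
  for (std::size_t i : selected) {
    out.push_back(Row{item_sk_col[i], reason_sk_col[i]});
    ++stats->reason_sk_materialized;
  }
  return out;
}
```

In a real Parquet reader the saving comes from skipping the decode of filtered-out values rather than from skipping a vector lookup, so the sketch only shows the control flow, not the cost model.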


http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2330
PS3, Line 2330:
> There is no recheck happening once the batch is formed even if they have fe
Okay. That makes sense.



--
To view, visit http://gerrit.cloudera.org:8080/17860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 7
Gerrit-Owner: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Wed, 13 Oct 2021 17:53:28 +0000
Gerrit-HasComments: Yes
