Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16720 )

Change subject: [WIP] IMPALA-10325: Parquet scan should use min/max statistics 
to skip pages based on equi-join predicate
......................................................................


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16720/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16720/12//COMMIT_MSG@9
PS12, Line 9: This patch adds the logic to utilize min/max stats
> On the scope of the work.
I think getting the row group/page filtering working (with all data types, etc) 
is a good end-point for this patch.

My understanding is that the row/partition filtering will get enabled 
automatically once the filters are generated and I just want to understand the 
implications of that:

* It looks like the min-max filters are ordered after the bloom filters for 
evaluation purposes.
* It looks like the min-max filters don't count towards the 
MAX_NUM_RUNTIME_FILTERS limit: 
https://impala.apache.org/docs/build/html/topics/impala_max_num_runtime_filters.html#max_num_runtime_filters
  This means we may get some new source/destination pairs, which might 
change the runtime behaviour of some plans.

I suspect this is all a net win, since the min-max filters should be relatively 
cheap to construct and will get automatically disabled if they're ineffective 
in the scan, but there is a bit of overhead added.
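
To make the trade-off concrete, here is a minimal sketch of the kind of 
overlap check and self-disabling behaviour I have in mind. It is not Impala's 
actual code: the MinMaxFilter class, the scan_pages helper, and the 
min_checked / min_skip_ratio thresholds are all hypothetical names and values 
chosen for illustration, and it assumes closed [min, max] ranges on integer 
keys.

```python
from dataclasses import dataclass


@dataclass
class MinMaxFilter:
    """Hypothetical min/max filter built from the join build-side keys."""
    lo: int
    hi: int
    enabled: bool = True
    checked: int = 0   # pages tested against the filter so far
    skipped: int = 0   # pages the filter allowed us to skip

    def may_overlap(self, page_min: int, page_max: int) -> bool:
        # Two closed ranges overlap unless one ends before the other starts.
        return page_max >= self.lo and page_min <= self.hi


def scan_pages(pages, flt, min_checked=4, min_skip_ratio=0.1):
    """Return the (min, max) page stats that still have to be read.

    The thresholds are assumptions: once at least min_checked pages have
    been tested and fewer than min_skip_ratio of them were skipped, the
    filter is treated as ineffective and switched off.
    """
    kept = []
    for page_min, page_max in pages:
        if flt.enabled:
            flt.checked += 1
            if not flt.may_overlap(page_min, page_max):
                flt.skipped += 1
                continue  # page cannot contain a matching join key
            if (flt.checked >= min_checked
                    and flt.skipped / flt.checked < min_skip_ratio):
                flt.enabled = False  # stop paying the per-page check
        kept.append((page_min, page_max))
    return kept
```

With a selective filter most pages are skipped cheaply; with a filter whose 
range covers everything, the only cost is the handful of comparisons made 
before it is disabled. That is the overhead the benchmarks would need to 
quantify.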

So I think we want to run some benchmarks to make sure there are no 
regressions before changing the default. Probably TPC-DS, since it is heavy 
on runtime filters.



--
To view, visit http://gerrit.cloudera.org:8080/16720

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691
Gerrit-Change-Number: 16720
Gerrit-PatchSet: 12
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Mon, 30 Nov 2020 18:31:48 +0000
Gerrit-HasComments: Yes
