Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16720 )
Change subject: [WIP] IMPALA-10325: Parquet scan should use min/max statistics to skip pages based on equi-join predicate
......................................................................

Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16720/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16720/12//COMMIT_MSG@9
PS12, Line 9: This patch adds the logic to utilize min/max stats
> On the scope of the work.

I think getting the row group/page filtering working (with all data types, etc.) is a good end-point for this patch.

My understanding is that the row/partition filtering will get enabled automatically once the filters are generated, and I just want to understand the implications of that:

* It looks like the min-max filters are ordered after the bloom filters for evaluation purposes.
* It looks like the min-max filters don't count towards the MAX_NUM_RUNTIME_FILTERS limit - https://impala.apache.org/docs/build/html/topics/impala_max_num_runtime_filters.html#max_num_runtime_filters.

So this means we will maybe get some new source/destination pairs, which might change the runtime behaviour of some plans. I suspect this is all a net win, since the min-max filters should be relatively cheap to construct and will get automatically disabled if they're ineffective in the scan, but there is a bit of overhead added. So I think we want to run some benchmarks to make sure there are no regressions before changing the default. Probably TPC-DS, since it is heavy on the filters.
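To make the cost/benefit point concrete, here is a minimal sketch of the idea under discussion, not Impala's actual implementation. It assumes integer join keys, and the names `PageStats`, `MinMaxFilter`, `pages_to_read`, and `is_effective`, as well as the 10% skip threshold, are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageStats:
    """Min/max statistics for one Parquet page (hypothetical structure)."""
    min_value: int
    max_value: int


@dataclass
class MinMaxFilter:
    """Range of join-key values observed on the build side (hypothetical)."""
    min_value: int
    max_value: int

    def overlaps(self, stats: PageStats) -> bool:
        # A page can hold matching rows only if its value range
        # intersects the filter's range.
        return (stats.max_value >= self.min_value
                and stats.min_value <= self.max_value)


def pages_to_read(pages: List[PageStats], flt: MinMaxFilter) -> List[int]:
    """Return the indexes of the pages that cannot be skipped."""
    return [i for i, page in enumerate(pages) if flt.overlaps(page)]


def is_effective(num_pages: int, num_kept: int,
                 min_skip_ratio: float = 0.1) -> bool:
    """Assumed heuristic: keep the filter only if it skips enough pages."""
    if num_pages == 0:
        return True
    return (num_pages - num_kept) / num_pages >= min_skip_ratio


pages = [PageStats(0, 9), PageStats(10, 19), PageStats(20, 29)]
flt = MinMaxFilter(12, 21)
kept = pages_to_read(pages, flt)
```

The overlap check is a pair of comparisons per page, which is why the filter should be cheap relative to a bloom filter probe. A filter whose range covers every page skips nothing, which is the case where disabling it in the scan avoids paying the overhead for no benefit.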
-- 
To view, visit http://gerrit.cloudera.org:8080/16720
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691
Gerrit-Change-Number: 16720
Gerrit-PatchSet: 12
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Mon, 30 Nov 2020 18:31:48 +0000
Gerrit-HasComments: Yes
