Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17706 )

Change subject: IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans
......................................................................


Patch Set 20:

(8 comments)

Did a first round. The change looks really nice and promising!

http://gerrit.cloudera.org:8080/#/c/17706/20//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17706/20//COMMIT_MSG@9
PS20, Line 9: patches
patch


http://gerrit.cloudera.org:8080/#/c/17706/20//COMMIT_MSG@10
PS20, Line 10: one row
and only one column, right?


http://gerrit.cloudera.org:8080/#/c/17706/20//COMMIT_MSG@30
PS20, Line 30: InertFor
InsertFor


http://gerrit.cloudera.org:8080/#/c/17706/20/be/src/exec/nested-loop-join-builder.h
File be/src/exec/nested-loop-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/17706/20/be/src/exec/nested-loop-join-builder.h@104
PS20, Line 104: if (build_filters) {
Probably the compiler is smart enough to do that, but this 'if' could be moved out of the FOREACH_ROW:

 if (build_filters) {
   FOREACH_ROW(...
 }


http://gerrit.cloudera.org:8080/#/c/17706/20/be/src/exec/nested-loop-join-builder.cc
File be/src/exec/nested-loop-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/17706/20/be/src/exec/nested-loop-join-builder.cc@157
PS20, Line 157: To be optimized*
Is it a TODO for the current CR? If not, could you please open a Jira ticket?


http://gerrit.cloudera.org:8080/#/c/17706/20/be/src/exec/nested-loop-join-builder.cc@259
PS20, Line 259: void NljBuilder::PublishRuntimeFilters(int64_t num_build_rows) {
nit: There is some code duplication with partitioned-hash-join-builder.cc. Is it possible to move some parts to a common place?
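
To illustrate the nested-loop-join-builder.h@104 comment above, here is a minimal standalone sketch of hoisting a loop-invariant flag out of a per-row loop. It is not the code from the patch: MinMaxSketch, AddRowsBranchInside and AddRowsBranchHoisted are hypothetical names, and a plain range-for stands in for FOREACH_ROW.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for a min/max runtime filter (not Impala's class).
struct MinMaxSketch {
  int64_t min = std::numeric_limits<int64_t>::max();
  int64_t max = std::numeric_limits<int64_t>::min();

  void Insert(int64_t v) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
};

// Shape being commented on: the flag never changes inside the loop,
// yet it is tested once per row.
inline void AddRowsBranchInside(const std::vector<int64_t>& rows,
                                bool build_filters, MinMaxSketch* filter) {
  assert(filter != nullptr);
  for (int64_t v : rows) {
    if (build_filters) filter->Insert(v);
  }
}

// Suggested shape: test the flag once, then run the loop only if needed.
inline void AddRowsBranchHoisted(const std::vector<int64_t>& rows,
                                 bool build_filters, MinMaxSketch* filter) {
  assert(filter != nullptr);
  if (build_filters) {
    for (int64_t v : rows) filter->Insert(v);
  }
}
```

Both variants leave the filter in the same state. An optimizing compiler may perform this transformation on its own (loop unswitching), so the hoisted form is mostly about making the intent explicit.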


http://gerrit.cloudera.org:8080/#/c/17706/20/be/src/util/min-max-filter-ir.cc
File be/src/util/min-max-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17706/20/be/src/util/min-max-filter-ir.cc@115
PS20, Line 115: DCHECK(false) << "StringMinMaxFilter::InsertForLE() is not supported";
Maybe we could use TruncateUp/TruncateDown:
https://github.com/apache/impala/blob/master/be/src/util/string-util.cc#L34:8

And when they don't work maybe we could just disable the filter by setting AlwaysTrue?


http://gerrit.cloudera.org:8080/#/c/17706/20/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/17706/20/fe/src/main/java/org/apache/impala/planner/PlanNode.java@1073
PS20, Line 1073: .produceOneValueLogically(analyzer, expr);
Can we use isScalarSubquery() instead:
https://github.com/apache/impala/blob/954eb5c85d329af7690698cdc4d0f409260e6d18/fe/src/main/java/org/apache/impala/analysis/Expr.java#L1486:18

This would enable this optimization for more cases, e.g. SELECT without FROM, select with LIMIT 1.



-- 
To view, visit http://gerrit.cloudera.org:8080/17706
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c2bb5baad622051d1002c9c162c672d428e5446
Gerrit-Change-Number: 17706
Gerrit-PatchSet: 20
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Fri, 06 Aug 2021 18:27:31 +0000
Gerrit-HasComments: Yes
