Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17706 )

Change subject: IMPALA-3430: Runtime filter : Extend runtime filter to support 
Min/Max values for HDFS scans
......................................................................


Patch Set 25: Code-Review+1

(8 comments)

I ran over the code one more time before I go on vacation.

I had some comments, but the code looks great overall. Feel free to carry my +1 
once those comments are resolved.

http://gerrit.cloudera.org:8080/#/c/17706/25//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17706/25//COMMIT_MSG@42
PS25, Line 42: only for sorted or partitioned
             : join columns.
I think it would make sense to at least turn this on at the file/row 
group/page. Probably the overhead of building the filter is negligible in this 
case.


http://gerrit.cloudera.org:8080/#/c/17706/25/be/src/exec/nested-loop-join-builder.h
File be/src/exec/nested-loop-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/17706/25/be/src/exec/nested-loop-join-builder.h@131
PS25, Line 131:   /// This is replaced at runtime with code generated by 
CodegenInsertRuntimeFilters().
Is this true?

PartitionedHashJoinBuilder has some mechanism for codegen, e.g.:
https://github.com/apache/impala/blob/5a9dcd108d8a1c6f3ea0062d8de750b6e41fb635/be/src/exec/partitioned-hash-join-builder.cc#L1406

But I don't see that for NestedLoopJoinBuilder. Probably we should remove this, 
or add TODO with a new Jira ticket?

But since we insert only one value from the scalar subquery it's probably not a 
big deal.


http://gerrit.cloudera.org:8080/#/c/17706/25/be/src/exec/nested-loop-join-builder.h@149
PS25, Line 149: initFilterContexts
nit: InitFilterContexts (Upper Camel Case)


http://gerrit.cloudera.org:8080/#/c/17706/25/be/src/exec/nested-loop-join-builder.cc
File be/src/exec/nested-loop-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/17706/25/be/src/exec/nested-loop-join-builder.cc@191
PS25, Line 191:     PublishRuntimeFilters(
Don't we want to publish the runtime filters for separate builds as well?


http://gerrit.cloudera.org:8080/#/c/17706/25/be/src/util/min-max-filter-ir.cc
File be/src/util/min-max-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17706/25/be/src/util/min-max-filter-ir.cc@130
PS25, Line 130:   CopyToBuffer(&min_buffer_, &min_, min_.len);
Do we always need to invoke CopyToBuffer() for the min_ value?


http://gerrit.cloudera.org:8080/#/c/17706/25/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/17706/25/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@410
PS25, Line 410: miax
max()


http://gerrit.cloudera.org:8080/#/c/17706/25/testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test
File 
testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test:

http://gerrit.cloudera.org:8080/#/c/17706/25/testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test@409
PS25, Line 409:  <
Could you please also add a test with '='?


http://gerrit.cloudera.org:8080/#/c/17706/25/testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test@429
PS25, Line 429: # String data type is not supported as it is impossible to 
represent the min() and max()
              : # value for the data type.
This seems to be supported now.



--
To view, visit http://gerrit.cloudera.org:8080/17706
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c2bb5baad622051d1002c9c162c672d428e5446
Gerrit-Change-Number: 17706
Gerrit-PatchSet: 25
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Wed, 11 Aug 2021 15:36:48 +0000
Gerrit-HasComments: Yes

Reply via email to