Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18141 )

Change subject: IMPALA-10898: Add runtime IN-list filters for ORC tables
......................................................................


Patch Set 14:

(14 comments)

Thank Qifan for the detailed review!

http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter-ir.cc
File be/src/util/in-list-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter-ir.cc@26
PS13, Line 26:   if (UNLIKELY(val == nullptr)) {
> UNLIKELY
Done


http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter-ir.cc@30
PS13, Line 30:   if (UNLIKELY(values_.size() >= entry_limit_ || 
str_values_.size() >= entry_limit_)) {
> UNLIKELY
Done


http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter-ir.cc@55
PS13, Line 55: if (UNLIKELY(s->ptr == nullptr)) {
             :         contains_null_ = true
> nit. should we check null-ness again? See line at 26.
The default constructor of StringValue creates a null 'ptr'. I think we'd 
better check this.
https://github.com/apache/impala/blob/fe04c50/be/src/runtime/string-value.h#L51


http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter-ir.cc@58
PS13, Line 58: str_total
> nit. Probably should be named as str_total_size_.
Done


http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter.h
File be/src/util/in-list-filter.h:

http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter.h@39
PS13, Line 39:   InListFilter(ColumnType type, uint32_t entry_limit, bool 
contains_null = false);
> Include contains_null and column type here.
Done


http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter.cc
File be/src/util/in-list-filter.cc:

http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter.cc@63
PS13, Line 63: retur
> "return false" here helps with release code.
Done


http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter.cc@71
PS13, Line 71: if (type.type == TYPE
> nit. it is better to supply the type in the cstr.
Done


http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter.cc@78
PS13, Line 78:
             : InListFilter* InListFilter::Create(const InListFilterPB& 
protobuf, ColumnType type,
             :     uint32_t entry_limit, ObjectPool* pool) {
> nit. probably should inited in the cstr.
Done


http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter.cc@100
PS13, Line 100: break;
> return null?
Done


http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter.cc@112
PS13, Line 112:
              :
              : void InListFilter::ToProtobuf(const InListFilter* filter, 
InListFilterPB* protobuf) {
              :   DCHECK(protobuf != nullptr);
              :   i
> Other fields that are not copied: type_, contains_null_,  str_size_ and ent
Oops, we don't need this method. Removed it.


http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter.cc@142
PS13, Line 142:   if (type_ == TYPE_STRING || type_ == TYPE_VARCHAR || type_ == 
TYPE_CHAR) {
> Same comment for Copy method: missing fields.
Done


http://gerrit.cloudera.org:8080/#/c/18141/13/be/src/util/in-list-filter.cc@174
PS13, Line 174:
> should handle null case.
Oops! Done.


http://gerrit.cloudera.org:8080/#/c/18141/13/testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-query-options.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-query-options.test:

http://gerrit.cloudera.org:8080/#/c/18141/13/testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-query-options.test@787
PS13, Line 787: broadcast
> May repeat this test with partition HJ to verify that in-list filters is no
Done. I change the test to verify distributed plan in order to show the 
partitioned join. Otherwise it's getting the single node plan by default.


http://gerrit.cloudera.org:8080/#/c/18141/13/testdata/workloads/functional-query/queries/QueryTest/in_list_filters.test
File testdata/workloads/functional-query/queries/QueryTest/in_list_filters.test:

http://gerrit.cloudera.org:8080/#/c/18141/13/testdata/workloads/functional-query/queries/QueryTest/in_list_filters.test@127
PS13, Line 127:
> may add a test on date column type.
Sure. The ORC date_tbl is corrupted and need to wait for 
https://gerrit.cloudera.org/c/18262/ to be merged.



--
To view, visit http://gerrit.cloudera.org:8080/18141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I25080628233799aa0b6be18d5a832f1385414501
Gerrit-Change-Number: 18141
Gerrit-PatchSet: 14
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Wed, 23 Feb 2022 07:33:09 +0000
Gerrit-HasComments: Yes

Reply via email to