Norbert Luksa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15403 )

Change subject: IMPALA-6505: Min-Max predicate push down in ORC scanner
......................................................................


Patch Set 1:

(6 comments)

Thanks Quanlong and Csaba for reviewing. Addressed some comments, will do the 
testing with the next patch.
Also, I will play with different predicates since now I see huge regression for 
some queries. Eg. the following query takes around twice as much as before:
select * from lineitem where l_suppkey = 10

http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@875
PS1, Line 875: UtcToUnixTime
> Looks like it's an ORC bug that the ORC writers (both Java and C++ versions
Changed to FloorUtcToUnixTimeMillis.
I see Csaba opened ORC-611 regarding the bug.


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@888
PS1, Line 888:     case TYPE_VARCHAR:
             :     case TYPE_CHAR: {
> I am not sure about the correctness in case of CHAR(N) and VARCHAR(N)
Added padding.
Maybe we should wait here with ORC-612?


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@902
PS1, Line 902: type.GetByteSize()
> I think this should be "literal_expr->type().GetByteSize()" since it's cons
Done


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@914
PS1, Line 914: static_cast<uint64_t>((dv16->value() << 64) >> 64)
> Can we cast int128_t to uint64_t directly?
Looks like we can, done.


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@959
PS1, Line 959:     orc::Literal literal =
             :         GetLiteralSearchArguments(eval, slot_desc->type(), 
&predicate_type);
             : 
             :     if (fn_name == "lt") {
> This logic is not enough if there is some mismatch between Impala's and ORC
Same as above, wait for ORC-612?


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@984
PS1, Line 984:       
row_reader_options_.setSearchArgument(std::move(final_sarg));
> Could you add a VLOG_FILE level logging for 'final_sarg'? It would be helpf
Done



--
To view, visit http://gerrit.cloudera.org:8080/15403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I136622413db21e0941d238ab6aeea901a6464845
Gerrit-Change-Number: 15403
Gerrit-PatchSet: 1
Gerrit-Owner: Norbert Luksa <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Norbert Luksa <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Tue, 17 Mar 2020 12:55:39 +0000
Gerrit-HasComments: Yes

Reply via email to