Norbert Luksa has posted comments on this change. ( http://gerrit.cloudera.org:8080/15403 )
Change subject: IMPALA-6505: Min-Max predicate push down in ORC scanner ...................................................................... Patch Set 1: (6 comments) Thanks Quanlong and Csaba for reviewing. Addressed some comments, will do the testing with the next patch. Also, I will play with different predicates since now I see huge regression for some queries. Eg. the following query takes around twice as much as before: select * from lineitem where l_suppkey = 10 http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@875 PS1, Line 875: UtcToUnixTime > Looks like it's an ORC bug that the ORC writers (both Java and C++ versions Changed to FloorUtcToUnixTimeMillis. I see Csaba opened ORC-611 regarding the bug. http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@888 PS1, Line 888: case TYPE_VARCHAR: : case TYPE_CHAR: { > I am not sure about the correctness in case of CHAR(N) and VARCHAR(N) Added padding. Maybe we should wait here with ORC-612? http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@902 PS1, Line 902: type.GetByteSize() > I think this should be "literal_expr->type().GetByteSize()" since it's cons Done http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@914 PS1, Line 914: static_cast<uint64_t>((dv16->value() << 64) >> 64) > Can we cast int128_t to uint64_t directly? Looks like we can, done. http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@959 PS1, Line 959: orc::Literal literal = : GetLiteralSearchArguments(eval, slot_desc->type(), &predicate_type); : : if (fn_name == "lt") { > This logic is not enough if there is some mismatch between Impala's and ORC Same as above, wait for ORC-612? http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@984 PS1, Line 984: row_reader_options_.setSearchArgument(std::move(final_sarg)); > Could you add a VLOG_FILE level logging for 'final_sarg'? It would be helpf Done -- To view, visit http://gerrit.cloudera.org:8080/15403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I136622413db21e0941d238ab6aeea901a6464845 Gerrit-Change-Number: 15403 Gerrit-PatchSet: 1 Gerrit-Owner: Norbert Luksa <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Norbert Luksa <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Tue, 17 Mar 2020 12:55:39 +0000 Gerrit-HasComments: Yes
