Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15403 )

Change subject: IMPALA-6505: Min-Max predicate push down in ORC scanner
......................................................................


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@875
PS1, Line 875: UtcToUnixTime
> According to ORC spec the timestamp stats are stored as millisecs (which is
Looks like it's an ORC bug that the ORC writers (both Java and C++ versions) 
don't handle the rounding issue of max:

https://github.com/apache/orc/blob/fea154436c37c81a16b13d879b510096cfaa2946/java/core/src/java/org/apache/orc/impl/writer/TimestampTreeWriter.java#L108
https://github.com/apache/orc/blob/fea154436c37c81a16b13d879b510096cfaa2946/c%2B%2B/src/ColumnWriter.cc#L1800


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@902
PS1, Line 902: type.GetByteSize()
I think this should be "literal_expr->type().GetByteSize()" since it's 
consistent with the type of "val".


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@914
PS1, Line 914: static_cast<uint64_t>((dv16->value() << 64) >> 64)
Can we cast int128_t to uint64_t directly?


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@984
PS1, Line 984:       
row_reader_options_.setSearchArgument(std::move(final_sarg));
Could you add a VLOG_FILE level logging for 'final_sarg'? It would be helpful 
for debugging.


http://gerrit.cloudera.org:8080/#/c/15403/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/15403/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@543
PS1, Line 543:     } else if (op == BinaryPredicate.Operator.EQ) {
ORC lib supports pushing down EQ predicates. I think we don't need to transform 
them into LE+GE predicates. But I'm ok leaving this as a further optimization.



--
To view, visit http://gerrit.cloudera.org:8080/15403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I136622413db21e0941d238ab6aeea901a6464845
Gerrit-Change-Number: 15403
Gerrit-PatchSet: 1
Gerrit-Owner: Norbert Luksa <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Norbert Luksa <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Mon, 16 Mar 2020 08:17:50 +0000
Gerrit-HasComments: Yes

Reply via email to