Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15403 )
Change subject: IMPALA-6505: Min-Max predicate push down in ORC scanner ...................................................................... Patch Set 1: (1 comment) The change looks good so far (much tidier than the Parquet implementation). I will look through the testing part a bit later - I think that the existing tests are not enough to cover every case we want to. http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@888 PS1, Line 888: case TYPE_VARCHAR: : case TYPE_CHAR: { I am not sure about the correctness in case of CHAR(N) and VARCHAR(N) For CHAR(N): If a string is shorter than N, then it is padded with spaces. Is val already padded? Lets assume an "s CHAR(2) column and a predicate s="a " If the columns consists of only values "a", then we should not drop them, as after padding they will satisfy the predicate, but min-max logic can drop them, as "a " is greater than the max element. For VARCHAR(N): (also applies to CHAR(N)) The strings should be truncated before checking predicates, Lets assume an "s VARCHAR(1) column and a predicate s="a" If the column consists of only values "aa", then we should not drop them, as after truncation they'l become "a", but min-max filter may drop them a "a" is less than the min value. -- To view, visit http://gerrit.cloudera.org:8080/15403 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I136622413db21e0941d238ab6aeea901a6464845 Gerrit-Change-Number: 15403 Gerrit-PatchSet: 1 Gerrit-Owner: Norbert Luksa <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Norbert Luksa <[email protected]> Gerrit-Comment-Date: Wed, 11 Mar 2020 14:31:22 +0000 Gerrit-HasComments: Yes
