Panagiotis Garefalakis created ORC-673:
------------------------------------------
Summary: PPD: LTE Point equality comparison is wrong when RG MIN==MAX
Key: ORC-673
URL: https://issues.apache.org/jira/browse/ORC-673
Project: ORC
Issue Type: Bug
Reporter: Panagiotis Garefalakis
Assignee: Panagiotis Garefalakis
Currently, LESS_THAN_EQUALS predicate pushdown (PPD) evaluation does not
properly handle the equality corner case where a row group (RG) contains a
single repeated value, so that MIN == MAX.
As part of the range-to-point comparison, the compare method returns MIN
when the predicate literal equals that single value:
https://github.com/apache/orc/blob/2f98b1a555850051b5081105262c1744dcc14906/java/core/src/java/org/apache/orc/impl/RecordReaderImpl.java#L359
Since the evaluatePredicateMinMax method does not explicitly account for that
scenario, it returns YES_NO, and the row group ends up being selected
(even though reading it could be avoided):
https://github.com/apache/orc/blob/2f98b1a555850051b5081105262c1744dcc14906/java/core/src/java/org/apache/orc/impl/RecordReaderImpl.java#L658
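To make the corner case concrete, here is a minimal, self-contained sketch of the behaviour described above. It is not the actual RecordReaderImpl code: the class LtePpdSketch, the simplified enums, the long-only signatures, and the handleSingleValue flag are all hypothetical names introduced for illustration, and null handling is omitted.

```java
public class LtePpdSketch {
    // Hypothetical, simplified stand-ins for the enums used during PPD evaluation.
    public enum Location { BEFORE, MIN, MIDDLE, MAX, AFTER }
    public enum TruthValue { YES, NO, YES_NO }

    // Locate the predicate literal (point) relative to the RG range [min, max].
    // Equality with min is tested first, so when min == max the result is MIN, never MAX.
    public static Location compare(long point, long min, long max) {
        if (point < min) return Location.BEFORE;
        if (point == min) return Location.MIN;
        if (point < max) return Location.MIDDLE;
        if (point == max) return Location.MAX;
        return Location.AFTER;
    }

    // Evaluate "column <= point" against the RG statistics (nulls ignored for brevity).
    // handleSingleValue is an assumed switch that toggles one possible fix.
    public static TruthValue evaluateLte(long point, long min, long max,
                                         boolean handleSingleValue) {
        Location loc = compare(point, min, max);
        if (loc == Location.AFTER || loc == Location.MAX
                || (handleSingleValue && loc == Location.MIN && min == max)) {
            return TruthValue.YES;      // every value in the RG is <= point
        } else if (loc == Location.BEFORE) {
            return TruthValue.NO;       // no value in the RG is <= point
        }
        return TruthValue.YES_NO;       // undecided, so the RG must be read
    }

    public static void main(String[] args) {
        // RG holding only the value 1, predicate "fld <= 1".
        System.out.println(evaluateLte(1, 1, 1, false)); // YES_NO
        System.out.println(evaluateLte(1, 1, 1, true));  // YES
    }
}
```

With a definite YES for the single-value row group, a negated predicate such as NOT (fld <= 1) evaluates to NO and the row group can be skipped. With YES_NO the negation is still YES_NO, so the row group is read.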
Steps to repro on Hive:
{code:sql}
create table tbl2 (fld int, fld1 int) stored as ORC
tblproperties('transactional'='true');
insert into tbl2 values (1,1);
insert into tbl2 values (2,2);
insert into tbl2 values (3,3);
select * from tbl2 where fld > 1 and fld < 3;
{code}