Prasanth Jayachandran created ORC-135:
-----------------------------------------

             Summary: PPD for timestamp is wrong when reader and writer timezones are different
                 Key: ORC-135
                 URL: https://issues.apache.org/jira/browse/ORC-135
             Project: Orc
          Issue Type: Bug
    Affects Versions: 1.3.0, 1.2.0, 1.1.0, 1.0.0
            Reporter: Prasanth Jayachandran
            Assignee: Prasanth Jayachandran
            Priority: Critical


When the reader and writer timezones are different, predicate pushdown (PPD) 
evaluation does not offset the timezone when reading the min and max values. 
This can result in wrong PPD evaluation and hence incorrect results.
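
The mismatch can be illustrated with a minimal sketch. This is not ORC's code: the helper names (to_epoch_millis, offset_millis, may_contain) are hypothetical, the reader timezone (US/Pacific) is an assumption because the report does not name it, and fixed UTC offsets stand in for full timezone rules. It assumes the min and max are stored as instants derived from the writer's wall-clock values, and that the fix amounts to shifting them by the writer/reader offset difference before comparing.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets in effect on 2007-08-01 (daylight saving time).
US_EASTERN = timezone(timedelta(hours=-4))  # writer timezone, from the report
US_PACIFIC = timezone(timedelta(hours=-7))  # reader timezone: an assumption

WALL_CLOCK = datetime(2007, 8, 1, 0, 0, 0)  # the value stored in every row


def to_epoch_millis(wall_clock, tz):
    """Interpret a naive wall-clock timestamp in tz; return epoch milliseconds."""
    return int(wall_clock.replace(tzinfo=tz).timestamp() * 1000)


def offset_millis(tz):
    """UTC offset of a fixed-offset timezone, in milliseconds."""
    return int(tz.utcoffset(None).total_seconds() * 1000)


def may_contain(literal, stat_min, stat_max):
    """Min/max check: can a row equal to literal exist in this range?"""
    return stat_min <= literal <= stat_max


# Statistics as recorded by the writer (all rows hold the same value).
stat_min = stat_max = to_epoch_millis(WALL_CLOCK, US_EASTERN)

# The reader converts the predicate literal using its own timezone.
literal = to_epoch_millis(WALL_CLOCK, US_PACIFIC)

# Without offsetting, the literal falls outside [min, max] and rows are skipped.
unadjusted = may_contain(literal, stat_min, stat_max)

# Shift the statistics by the writer/reader offset difference, then compare.
shift = offset_millis(US_EASTERN) - offset_millis(US_PACIFIC)
adjusted = may_contain(literal, stat_min + shift, stat_max + shift)

print(unadjusted, adjusted)
```

Under these assumptions the unadjusted check rejects the data even though every row matches, which is consistent with the empty result shown in the example below.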

Example:
The table was written in the US/Eastern timezone. All values in this table are 
"2007-08-01 00:00:00.0".
{code:title=PPD disabled}
hive> set hive.optimize.index.filter=false;
hive> select ORDER_DATE from ORDER_FACT_small where ORDER_DATE='2007-08-01 00:00:00.0' limit 1;
2007-08-01 00:00:00.0
OK
{code}

{code:title=PPD enabled}
set hive.optimize.index.filter=true;
select ORDER_DATE from ORDER_FACT_small where ORDER_DATE='2007-08-01 00:00:00.0' limit 1;
OK
{code}
No rows are returned when PPD is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)