Zoltan Ivanfi created PARQUET-1628:
--------------------------------------

             Summary: Accept local timestamps annotated with the legacy 
timestamp types
                 Key: PARQUET-1628
                 URL: https://issues.apache.org/jira/browse/PARQUET-1628
             Project: Parquet
          Issue Type: Task
          Components: parquet-mr
            Reporter: Zoltan Ivanfi
            Assignee: Nandor Kollar


The rules for TIMESTAMP forward-compatibility were created based on the 
assumption that TIMESTAMP_MILLIS and TIMESTAMP_MICROS have so far only been 
used with instant (a.k.a. UTC-normalized) semantics.

From this false premise it followed that TIMESTAMPs with local semantics were 
a new type and did not need to be annotated with the old types to maintain 
compatibility. In fact, annotating them with the old types was considered to 
be harmful, since it would have misled older readers into thinking that they 
can read TIMESTAMPs with local semantics, when in reality they would have 
misinterpreted them as TIMESTAMPs with instant semantics. This would have led 
to a difference of several hours, corresponding to the time zone offset.
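The misinterpretation described above can be illustrated with a minimal 
sketch that uses only java.time and no Parquet code. The class name, the 
method names, the sample date and the +02:00 reader offset are all 
hypothetical and chosen only for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical demo class, not part of parquet-mr.
public class TimestampSemanticsDemo {

    // Local semantics: store the wall-clock value as milliseconds counted
    // from 1970-01-01T00:00, without applying any time zone.
    static long encodeLocalMillis(LocalDateTime wallClock) {
        return wallClock.toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    // Correct read for local semantics: recover the same wall-clock value.
    static LocalDateTime readAsLocal(long millis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC);
    }

    // Misread with instant semantics: treat the stored value as a UTC
    // instant and render it in the reader's time zone.
    static LocalDateTime readAsInstant(long millis, ZoneId readerZone) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), readerZone);
    }

    public static void main(String[] args) {
        LocalDateTime written = LocalDateTime.of(2019, 7, 1, 12, 0); // arbitrary sample
        ZoneId readerZone = ZoneOffset.ofHours(2);                   // assumed reader offset

        long stored = encodeLocalMillis(written);
        LocalDateTime correct = readAsLocal(stored);
        LocalDateTime misread = readAsInstant(stored, readerZone);

        System.out.println("written:  " + written);
        System.out.println("as local: " + correct);
        System.out.println("misread:  " + misread);
        // The shift equals the reader's time zone offset.
        System.out.println("shift:    " + Duration.between(correct, misread));
    }
}
```

With a fixed +02:00 reader offset, the misread value is two hours later than 
the value that was written, which is the kind of shift the rules tried to 
avoid.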

In reality, however, this misinterpretation of timestamps has already been 
going on for a while, since Arrow annotates local timestamps with 
TIMESTAMP_MILLIS or TIMESTAMP_MICROS.

To maintain forward compatibility of local timestamps, Arrow annotates them with 
the legacy timestamp logical types. However, the Java library considers these 
logical types to be incompatible and discards the new type in favour of the 
legacy ones (since doing it the other way around would change the behaviour). 
parquet-mr should be updated so that it accepts this combination of new and old 
logical types.
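
The requested change can be modelled roughly as follows. This is only a 
sketch of the rule, not parquet-mr code: the class, the enums and both 
methods are hypothetical, and the isAdjustedToUTC field merely mirrors the 
flag of the new TIMESTAMP logical type.

```java
// Hypothetical model of the compatibility rule, not the parquet-mr API.
public class TimestampAnnotationCheck {

    enum LegacyType { TIMESTAMP_MILLIS, TIMESTAMP_MICROS }

    enum Unit { MILLIS, MICROS, NANOS }

    // Simplified stand-in for the new TIMESTAMP logical type.
    static final class NewTimestamp {
        final boolean isAdjustedToUTC; // true = instant, false = local semantics
        final Unit unit;

        NewTimestamp(boolean isAdjustedToUTC, Unit unit) {
            this.isAdjustedToUTC = isAdjustedToUTC;
            this.unit = unit;
        }
    }

    // A legacy type can only describe a timestamp of the same precision;
    // NANOS has no legacy counterpart.
    static boolean unitsMatch(Unit unit, LegacyType legacy) {
        return (unit == Unit.MILLIS && legacy == LegacyType.TIMESTAMP_MILLIS)
            || (unit == Unit.MICROS && legacy == LegacyType.TIMESTAMP_MICROS);
    }

    // Strict rule: a legacy annotation is compatible only with
    // UTC-normalized timestamps, so a local one would be rejected.
    static boolean strictCompatible(NewTimestamp t, LegacyType legacy) {
        return t.isAdjustedToUTC && unitsMatch(t.unit, legacy);
    }

    // Relaxed rule asked for in this issue: local timestamps annotated
    // with a matching legacy type are accepted as well.
    static boolean relaxedCompatible(NewTimestamp t, LegacyType legacy) {
        return unitsMatch(t.unit, legacy);
    }

    public static void main(String[] args) {
        NewTimestamp local = new NewTimestamp(false, Unit.MICROS);
        System.out.println("strict:  " + strictCompatible(local, LegacyType.TIMESTAMP_MICROS));
        System.out.println("relaxed: " + relaxedCompatible(local, LegacyType.TIMESTAMP_MICROS));
    }
}
```

Under the relaxed rule the new type would be kept when it is local and the 
units agree, instead of being discarded in favour of the legacy one.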



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
