[
https://issues.apache.org/jira/browse/ORC-340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441164#comment-16441164
]
Owen O'Malley commented on ORC-340:
-----------------------------------
I don't understand what is going wrong.
The current "timestamp" type is logically a time without any timezone
information. If the user puts in "2010-04-05 12:34:56" they should get the same
back regardless of which timezone they are currently in. As you point out, the
Java Timestamp uses the local timezone. Unfortunately, the
TimestampColumnVector was defined using the fields from Java Timestamp,
including using the local timezone. *sigh*
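To make the local-timezone dependence concrete, here is a minimal sketch (not ORC code; the class and method names are made up for illustration). It renders one fixed instant with java.sql.Timestamp.toString() under two different JVM default timezones:

```java
import java.sql.Timestamp;
import java.util.TimeZone;

// Hypothetical demo class, not part of ORC.
public class TimestampZoneDemo {
    // Renders the given epoch-millisecond instant with Timestamp.toString()
    // while zoneId is the JVM default timezone, then restores the old default.
    static String render(long epochMillis, String zoneId) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return new Timestamp(epochMillis).toString();
        } finally {
            TimeZone.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        long instant = 1270470896000L; // 2010-04-05 12:34:56 in UTC
        System.out.println(render(instant, "UTC"));
        System.out.println(render(instant, "America/Los_Angeles"));
    }
}
```

The two printed strings differ even though the underlying millisecond value is the same, which is why a value written as "2010-04-05 12:34:56" can come back looking different on a reader in another timezone.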
We should probably add a way of marking the TimestampColumnVector as stored in
UTC rather than local with an option to the reader to use the new switch. But
that is a different request.
So is the conversion from timestamp to string doing a double conversion?
> ConvertTreeReaderFactory does not take into account timezone information
> ------------------------------------------------------------------------
>
> Key: ORC-340
> URL: https://issues.apache.org/jira/browse/ORC-340
> Project: ORC
> Issue Type: Bug
> Components: evolution
> Reporter: Jesus Camacho Rodriguez
> Priority: Critical
>
> When converting timestamp/date to string group, {{toString}} is called on
> {{java.sql.Date}} and {{java.sql.Timestamp}}, which pick up the default time
> zone to represent the date/time as a String. This can lead to the wrong
> String representation. See
> {{ConvertTreeReaderFactory.StringGroupFromTimestampTreeReader}} or
> {{ConvertTreeReaderFactory.StringGroupFromDateTreeReader}}.
> {{StringGroupFromTimestampTreeReader}} should apply shifting, similar to
> {{TreeReaderFactory.TimestampTreeReader}}.
> {{StringGroupFromDateTreeReader}} should use a date formatter in UTC.
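For reference, a rough sketch of what formatting without the default timezone could look like, using java.time rather than java.sql.Date/Timestamp.toString(). This is only an illustration under assumptions: the class and method names are hypothetical, it is not the actual ORC fix, and it assumes the date arrives as days since the epoch and the timestamp as epoch milliseconds that should be shown in UTC.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of ORC.
public class UtcFormatSketch {
    // Formatter pinned to UTC, so the JVM default timezone is never consulted.
    private static final DateTimeFormatter TS_FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    // Assumes the date is a count of days since 1970-01-01.
    // LocalDate carries no timezone, so the result is the same everywhere.
    static String formatDate(long daysSinceEpoch) {
        return LocalDate.ofEpochDay(daysSinceEpoch).toString();
    }

    // Assumes the timestamp is epoch milliseconds to be displayed in UTC.
    static String formatTimestampUtc(long epochMillis) {
        return TS_FORMAT.format(Instant.ofEpochMilli(epochMillis));
    }
}
```

With these helpers, formatDate(14704) and formatTimestampUtc(1270470896000L) give the same strings no matter which timezone the reader is running in.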
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)