Github user xndai commented on the issue:
https://github.com/apache/orc/pull/233
Sorry, I am confused after reading the discussion above. The key question
I have is: do we implement ORC TIMESTAMP as SQL "TIMESTAMP with Timezone" or
"TIMESTAMP without Timezone"? It seems to me that we implement it as the
latter. That's why we went for a rather complicated design that involves a
local epoch and logic to handle DST when moving between time zones. But
@majetideepak's comment above stated that, with ORC-10, we wanted to implement
TIMESTAMP as TIMESTAMP with Timezone. This contradicts what I saw in the Java
reader implementation, in which the timestamp value is adjusted to the
reader's time zone (TreeReaderFactory.java lines 987 to 992).
So if my understanding is correct, and TIMESTAMP should be implemented as
TIMESTAMP without Timezone, then the current C++ reader has a bug: it always
adjusts to GMT rather than to the reader's time zone (ColumnReader.cc lines
339 and 340).
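To make the two interpretations concrete, here is a minimal Python sketch. It
is illustrative only and is not ORC code: the function names, the two time
zones, and the sample value are all assumptions chosen for the example.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Hypothetical writer and reader zones, picked only for illustration.
WRITER_TZ = ZoneInfo("America/Los_Angeles")
READER_TZ = ZoneInfo("America/New_York")


def read_as_instant(written: datetime, reader_tz: ZoneInfo) -> datetime:
    """'With time zone' reading: keep the same point in time,
    so the wall-clock fields change with the reader's zone."""
    return written.astimezone(reader_tz)


def read_as_wall_clock(written: datetime, reader_tz: ZoneInfo) -> datetime:
    """'Without time zone' reading: keep the same wall-clock fields,
    so the underlying point in time changes with the reader's zone."""
    return written.replace(tzinfo=reader_tz)


# The writer stores noon, local time, on a date when both zones observe DST.
written = datetime(2018, 3, 15, 12, 0, 0, tzinfo=WRITER_TZ)

as_instant = read_as_instant(written, READER_TZ)
as_wall_clock = read_as_wall_clock(written, READER_TZ)

print("instant semantics:   ", as_instant.strftime("%Y-%m-%d %H:%M:%S"))
print("wall-clock semantics:", as_wall_clock.strftime("%Y-%m-%d %H:%M:%S"))
```

Under the first reading the New York reader sees 15:00:00, and under the
second it sees 12:00:00. Which of the two ORC intends is exactly the question.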
@omalley is probably the best person to answer this question...
---