[jira] [Commented] (ORC-346) Bug in TimestampColumnReader (or Writer)
[ https://issues.apache.org/jira/browse/ORC-346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444664#comment-16444664 ]

ASF GitHub Bot commented on ORC-346:

GitHub user rip-nsk opened a pull request:

    https://github.com/apache/orc/pull/253

    ORC-346: [C++] Add one second when writing negative Timestamp's with non zero nanos …

    …to match the reader code.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rip-nsk/orc ORC-346

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/orc/pull/253.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #253

commit 0507d1f970c613fd7b12ee814fb9e040bfa30d24
Author: rip-nsk
Date:   2018-04-19T19:22:09Z

    Add one second when writing negative Timestamp's with non zero nanos to match the reader code.

> Bug in TimestampColumnReader (or Writer)
>
> Key: ORC-346
> URL: https://issues.apache.org/jira/browse/ORC-346
> Project: ORC
> Issue Type: Bug
> Components: C++
> Reporter: rip.nsk
> Priority: Critical
>
> void TimestampColumnReader::next(ColumnVectorBatch& rowBatch, uint64_t numValues, char *notNull)
> has the following code:
> c++\src\ColumnReader.cc:338
>     int64_t writerTime = secsBuffer[i] + epochOffset;
>     secsBuffer[i] = writerTimezone.convertToUTC(writerTime);
>     if (secsBuffer[i] < 0 && nanoBuffer[i] != 0) {
>       secsBuffer[i] -= 1;
>     }
> which likely leads to reading the wrong seconds value for PRE_1970 dates

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
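The interaction between the quoted reader branch and the writer-side change named in the pull request title can be sketched in isolation. This is only an illustration under stated assumptions: `Ts`, `readerAdjust`, `writerAdjust`, and `roundTrips` are hypothetical names that do not come from the ORC sources, the epoch offset and time zone conversion are left out, and `writerAdjust` is a guess at the shape of the fix based on the PR title, not the actual patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical value type for this sketch: `seconds` is the floor of the
// time since 1970-01-01 in seconds, `nanos` is in [0, 1e9).
struct Ts {
  int64_t seconds;
  int64_t nanos;
};

// Reader-side branch quoted in the issue, with the epoch offset and the
// time zone conversion omitted for brevity.
inline Ts readerAdjust(Ts stored) {
  if (stored.seconds < 0 && stored.nanos != 0) {
    stored.seconds -= 1;
  }
  return stored;
}

// Writer-side compensation as described by the pull request title
// (assumed shape, not the actual patch).
inline Ts writerAdjust(Ts value) {
  if (value.seconds < 0 && value.nanos != 0) {
    value.seconds += 1;
  }
  return value;
}

// Returns true when a value survives a write followed by a read.
inline bool roundTrips(Ts value, bool compensate) {
  const Ts stored = compensate ? writerAdjust(value) : value;
  const Ts back = readerAdjust(stored);
  return back.seconds == value.seconds && back.nanos == value.nanos;
}
```

In this sketch, `roundTrips(Ts{-14210775, 1000}, false)` is false because the reader subtracts a second the writer never added, while the same value with `compensate` set to true round-trips. One limitation of the simplified scheme: a value with `seconds == -1` and non-zero nanos is stored as second 0, so the reader branch does not fire and the value does not round-trip.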
[jira] [Updated] (ORC-346) Bug in TimestampColumnReader (or Writer)
[ https://issues.apache.org/jira/browse/ORC-346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rip.nsk updated ORC-346:

    Summary: Bug in TimestampColumnReader (or Writer)  (was: Bug in TimestampColumnReader)
[jira] [Comment Edited] (ORC-346) Bug in TimestampColumnReader
[ https://issues.apache.org/jira/browse/ORC-346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1685#comment-1685 ]

rip.nsk edited comment on ORC-346 at 4/19/18 6:01 PM:

TEST(Writer, writeTimestamp) failed with the following change:

+c++/test/TestWriter.cc:573
--- time_t currTime = std::time(nullptr);
+++ time_t currTime = -14210715; // 1969-07-20 12:34:45

was (Author: rip@gmail.com):

TEST(Writer, writeTimestamp) failed with the following change:

diff --git a/c++/test/TestWriter.cc b/c++/test/TestWriter.cc
@@ -573,7 +573,7 @@ namespace orc {
- time_t currTime = std::time(nullptr);
+ time_t currTime = -14210715; // 1969-07-20 12:34:45
[jira] [Comment Edited] (ORC-346) Bug in TimestampColumnReader
[ https://issues.apache.org/jira/browse/ORC-346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1685#comment-1685 ]

rip.nsk edited comment on ORC-346 at 4/19/18 6:01 PM:

TEST(Writer, writeTimestamp) failed with the following change:

+c++/test/TestWriter.cc:573
-- time_t currTime = std::time(nullptr);
+ time_t currTime = -14210715; // 1969-07-20 12:34:45

was (Author: rip@gmail.com):

TEST(Writer, writeTimestamp) failed with the following change:

+c++/test/TestWriter.cc:573
--- time_t currTime = std::time(nullptr);
+++ time_t currTime = -14210715; // 1969-07-20 12:34:45
[jira] [Comment Edited] (ORC-346) Bug in TimestampColumnReader
[ https://issues.apache.org/jira/browse/ORC-346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1685#comment-1685 ]

rip.nsk edited comment on ORC-346 at 4/19/18 6:00 PM:

TEST(Writer, writeTimestamp) failed with the following change:

diff --git a/c++/test/TestWriter.cc b/c++/test/TestWriter.cc
@@ -573,7 +573,7 @@ namespace orc {
- time_t currTime = std::time(nullptr);
+ time_t currTime = -14210715; // 1969-07-20 12:34:45

was (Author: rip@gmail.com):

TEST(Writer, writeTimestamp) failed with the following change:

diff --git a/c++/test/TestWriter.cc b/c++/test/TestWriter.cc
index c61d184..40d9243 100644
--- a/c++/test/TestWriter.cc
+++ b/c++/test/TestWriter.cc
@@ -570,7 +570,7 @@ namespace orc {
     std::vector times(rowCount);
     for (uint64_t i = 0; i < rowCount; ++i) {
-      time_t currTime = std::time(nullptr);
+      time_t currTime = -14210715; // 1969-07-20 12:34:45
       times[i] = static_cast(currTime) - static_cast(i * 60);
       tsBatch->data[i] = times[i];
       tsBatch->nanoseconds[i] = static_cast(i * 1000);
[jira] [Commented] (ORC-346) Bug in TimestampColumnReader
[ https://issues.apache.org/jira/browse/ORC-346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1685#comment-1685 ]

rip.nsk commented on ORC-346:

TEST(Writer, writeTimestamp) failed with the following change:

diff --git a/c++/test/TestWriter.cc b/c++/test/TestWriter.cc
index c61d184..40d9243 100644
--- a/c++/test/TestWriter.cc
+++ b/c++/test/TestWriter.cc
@@ -570,7 +570,7 @@ namespace orc {
     std::vector times(rowCount);
     for (uint64_t i = 0; i < rowCount; ++i) {
-      time_t currTime = std::time(nullptr);
+      time_t currTime = -14210715; // 1969-07-20 12:34:45
       times[i] = static_cast(currTime) - static_cast(i * 60);
       tsBatch->data[i] = times[i];
       tsBatch->nanoseconds[i] = static_cast(i * 1000);
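The constant in the diff can be cross-checked against its comment, and the rows the test loop generates can be checked against the condition in the reader branch. The following is a small sketch, not code from the ORC test suite: `daysFromCivil`, `utcEpochSeconds`, `Row`, `testRow`, and `hitsReaderBranch` are hypothetical helpers, and the date in the comment is assumed to be in UTC.

```cpp
#include <cassert>
#include <cstdint>

// Days from 1970-01-01 to a proleptic Gregorian date, using the common
// era-based "days from civil" arithmetic (years counted from March).
inline int64_t daysFromCivil(int64_t y, int64_t m, int64_t d) {
  y -= (m <= 2) ? 1 : 0;
  const int64_t era = (y >= 0 ? y : y - 399) / 400;
  const int64_t yoe = y - era * 400;  // year of era, [0, 399]
  const int64_t doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
  const int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
  return era * 146097 + doe - 719468;
}

// Seconds since the Unix epoch for a UTC wall-clock time.
inline int64_t utcEpochSeconds(int64_t y, int64_t m, int64_t d,
                               int64_t hh, int64_t mm, int64_t ss) {
  return daysFromCivil(y, m, d) * 86400 + hh * 3600 + mm * 60 + ss;
}

// Mirrors the values in the quoted loop: row i is i minutes earlier than
// the base time and carries i * 1000 nanoseconds.
struct Row {
  int64_t seconds;
  int64_t nanos;
};

inline Row testRow(int64_t base, uint64_t i) {
  return Row{base - static_cast<int64_t>(i * 60),
             static_cast<int64_t>(i * 1000)};
}

// Condition of the reader branch quoted in the issue description.
inline bool hitsReaderBranch(const Row& r) {
  return r.seconds < 0 && r.nanos != 0;
}
```

Under the UTC assumption, `utcEpochSeconds(1969, 7, 20, 12, 34, 45)` equals -14210715, matching the comment in the diff. With that base, row 0 has zero nanoseconds and skips the reader branch, while every later row is negative with non-zero nanoseconds and takes it.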
[jira] [Updated] (ORC-346) Bug in TimestampColumnReader
[ https://issues.apache.org/jira/browse/ORC-346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rip.nsk updated ORC-346:

    Summary: Bug in TimestampColumnReader  (was: Probably bug in TimestampColumnReader)
[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer
[ https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444388#comment-16444388 ]

ASF GitHub Bot commented on ORC-341:

Github user wgtmac commented on the issue:

    https://github.com/apache/orc/pull/249

    @jcamachor You are right. WriterOptions/WriterContext are ideal places to set this kind of value.

> Support time zone as a parameter for Java reader and writer
>
> Key: ORC-341
> URL: https://issues.apache.org/jira/browse/ORC-341
> Project: ORC
> Issue Type: Improvement
> Reporter: Jesus Camacho Rodriguez
> Priority: Major
>
> Currently, time zone is hardcoded as the system default time zone and ORC
> applies displacement between timestamp values read/written based on time zone.
> This issue aims at adding the option to pass the time zone as a parameter to
> the reader/writer.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer
[ https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444251#comment-16444251 ]

ASF GitHub Bot commented on ORC-341:

Github user jcamachor commented on the issue:

    https://github.com/apache/orc/pull/249

    Pushed a new commit with the changes. We would still need a storage-api release for the ```TimestampColumnVector``` changes.
[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer
[ https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444106#comment-16444106 ]

ASF GitHub Bot commented on ORC-341:

Github user jcamachor commented on the issue:

    https://github.com/apache/orc/pull/249

    @wgtmac , thanks for the feedback. Please bear with me for a bit, as it is the first time I am touching the ORC code base.

    OK, I think ```TypeDescription``` is not a problem then, since we set the value at the reader / writer, independently of the default that we use at creation time.

    For the reader, everything seems easy. However, for the writer, it is a bit trickier since the stripe footer stores the information about the time zone, hence it should be set beforehand using, e.g., the context or options objects. Does that seem reasonable?