[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-05-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466186#comment-16466186
 ] 

ASF GitHub Bot commented on ORC-341:


Github user asfgit closed the pull request at:

https://github.com/apache/orc/pull/249


> Support time zone as a parameter for Java reader and writer
> ---
>
> Key: ORC-341
> URL: https://issues.apache.org/jira/browse/ORC-341
> Project: ORC
>  Issue Type: Improvement
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Currently, time zone is hardcoded as the system default time zone and ORC 
> applies displacement between timestamp values read/written based on time zone.
> This issue aims at adding the option to pass the time zone as a parameter to 
> the reader/writer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464595#comment-16464595
 ] 

ASF GitHub Bot commented on ORC-341:


Github user jcamachor commented on the issue:

https://github.com/apache/orc/pull/249
  
@wgtmac, @omalley, thanks for the feedback. I think I have addressed all 
your points with the last two commits. Could you take another look? Thanks.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464124#comment-16464124
 ] 

ASF GitHub Bot commented on ORC-341:


Github user omalley commented on a diff in the pull request:

https://github.com/apache/orc/pull/249#discussion_r186133762
  
--- Diff: java/core/src/java/org/apache/orc/OrcFile.java ---
@@ -320,6 +321,16 @@ public ReaderOptions fileMetadata(final FileMetadata metadata) {
 public FileMetadata getFileMetadata() {
   return fileMetadata;
 }
+
+public ReaderOptions useUTCTimestamp(boolean value) {
+  useUTCTimestamp = value;
+  return this;
+}
+
+public boolean isUseUTCTimestamp() {
--- End diff --

This should be getUseUTCTimestamp.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464125#comment-16464125
 ] 

ASF GitHub Bot commented on ORC-341:


Github user omalley commented on a diff in the pull request:

https://github.com/apache/orc/pull/249#discussion_r186137062
  
--- Diff: java/core/src/java/org/apache/orc/impl/writer/WriterImplV2.java ---
@@ -373,7 +379,11 @@ private void flushStripe() throws IOException {
   OrcProto.StripeFooter.Builder builder =
   OrcProto.StripeFooter.newBuilder();
   if (writeTimeZone) {
-builder.setWriterTimezone(TimeZone.getDefault().getID());
+if (useUTCTimeZone) {
+  builder.setWriterTimezone(TimeZone.getTimeZone("UTC").getID());
--- End diff --

I'd be tempted to just use setWriterTimezone("UTC"), because we'll already 
fail if UTC is called something else.
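
The equivalence behind this suggestion can be checked with a minimal sketch. The class name `UtcIdCheck` is hypothetical and not part of ORC; the only library call used is `java.util.TimeZone.getTimeZone(String)`, which on a standard JDK resolves the ID "UTC" to a zone whose `getID()` is the same string.

```java
import java.util.TimeZone;

// Hypothetical demo class for illustration; not part of ORC.
final class UtcIdCheck {
  // Returns the ID that the lookup-based call in the diff would pass to the builder.
  static String lookedUpId() {
    return TimeZone.getTimeZone("UTC").getID();
  }

  public static void main(String[] args) {
    // Prints the looked-up ID so it can be compared with the literal "UTC".
    System.out.println(lookedUpId());
  }
}
```

If the two values match, writing the literal string avoids a lookup without changing what ends up in the stripe footer.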




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464123#comment-16464123
 ] 

ASF GitHub Bot commented on ORC-341:


Github user omalley commented on a diff in the pull request:

https://github.com/apache/orc/pull/249#discussion_r186134015
  
--- Diff: java/core/src/java/org/apache/orc/OrcFile.java ---
@@ -320,6 +321,16 @@ public ReaderOptions fileMetadata(final FileMetadata metadata) {
 public FileMetadata getFileMetadata() {
   return fileMetadata;
 }
+
+public ReaderOptions useUTCTimestamp(boolean value) {
--- End diff --

This should also cause the TimestampStatistics to use UTC.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464127#comment-16464127
 ] 

ASF GitHub Bot commented on ORC-341:


Github user omalley commented on a diff in the pull request:

https://github.com/apache/orc/pull/249#discussion_r186134142
  
--- Diff: java/core/src/java/org/apache/orc/OrcFile.java ---
@@ -761,6 +782,10 @@ public boolean getWriteVariableLengthBlocks() {
 public HadoopShims getHadoopShims() {
   return shims;
 }
+
+public boolean isUseUTCTimestamp() {
--- End diff --

Rename this to getUseUTCTimestamp.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-05-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464126#comment-16464126
 ] 

ASF GitHub Bot commented on ORC-341:


Github user omalley commented on a diff in the pull request:

https://github.com/apache/orc/pull/249#discussion_r186140770
  
--- Diff: java/core/src/java/org/apache/orc/impl/writer/TimestampTreeWriter.java ---
@@ -54,9 +57,20 @@ public TimestampTreeWriter(int columnId,
 if (rowIndexPosition != null) {
   recordPosition(rowIndexPosition);
 }
-this.localTimezone = TimeZone.getDefault();
-// for unit tests to set different time zones
-this.baseEpochSecsLocalTz = Timestamp.valueOf(BASE_TIMESTAMP_STRING).getTime() / MILLIS_PER_SECOND;
+if (writer.isUseUTCTimestamp()) {
+  this.localTimezone = TimeZone.getTimeZone("UTC");
+} else {
+  this.localTimezone = TimeZone.getDefault();
+}
+this.localDateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
--- End diff --

It sucks that there isn't a simpler way in Java to do this.
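
For context, the pattern being lamented looks roughly like the sketch below: to turn a wall-clock string into epoch seconds for a chosen zone with the pre-`java.time` API, one has to build a `SimpleDateFormat`, set its zone, and parse. The class name, the value of `BASE_TIMESTAMP_STRING`, and the "yyyy-MM-dd HH:mm:ss" pattern are assumptions made for illustration; the thread does not show the constant's value.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Hypothetical demo class; constants below are assumed, not taken from the patch.
final class BaseEpochDemo {
  static final String BASE_TIMESTAMP_STRING = "2015-01-01 00:00:00"; // assumed value
  static final long MILLIS_PER_SECOND = 1000L;

  // Interprets the base timestamp string as a wall-clock time in the given zone
  // and returns the corresponding epoch seconds.
  static long baseEpochSecs(TimeZone zone) {
    SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    format.setTimeZone(zone);
    try {
      return format.parse(BASE_TIMESTAMP_STRING).getTime() / MILLIS_PER_SECOND;
    } catch (ParseException e) {
      // The string is a constant, so a parse failure indicates a programming error.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(baseEpochSecs(TimeZone.getTimeZone("UTC")));
    System.out.println(baseEpochSecs(TimeZone.getDefault()));
  }
}
```

The same string yields different epoch seconds depending on the zone passed in, which is why the writer has to pick the zone before computing its base value.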




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461891#comment-16461891
 ] 

ASF GitHub Bot commented on ORC-341:


Github user wgtmac commented on a diff in the pull request:

https://github.com/apache/orc/pull/249#discussion_r185691194
  
--- Diff: java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java ---
@@ -990,6 +1007,10 @@ public void nextVector(ColumnVector previousVector,
   TimestampColumnVector result = (TimestampColumnVector) previousVector;
   super.nextVector(previousVector, isNull, batchSize);
 
+  if (context.isUseUTCTimestamp()) {
+result.setIsUTC(true);
--- End diff --

result.setIsUTC(context.isUseUTCTimestamp());

Just in case result is in UTC but context.isUseUTCTimestamp() is false.
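
The difference between the two variants shows up when a vector is reused across reads. The sketch below uses hypothetical stand-in types (`Vector` for `TimestampColumnVector`, a plain boolean for the reader context option); it is an illustration of the review comment, not the code in the patch.

```java
// Hypothetical stand-ins; not the real ORC or storage-api classes.
final class IsUtcPropagationDemo {
  static final class Vector {
    private boolean isUTC;
    void setIsUTC(boolean value) { isUTC = value; }
    boolean isUTC() { return isUTC; }
  }

  // Variant from the diff: only ever switches the flag on.
  static void conditional(Vector result, boolean useUTCTimestamp) {
    if (useUTCTimestamp) {
      result.setIsUTC(true);
    }
  }

  // Suggested variant: the flag always mirrors the reader option.
  static void unconditional(Vector result, boolean useUTCTimestamp) {
    result.setIsUTC(useUTCTimestamp);
  }

  public static void main(String[] args) {
    Vector reused = new Vector();
    reused.setIsUTC(true);               // flag left over from an earlier UTC read
    conditional(reused, false);
    System.out.println(reused.isUTC());  // the stale flag is still set
    unconditional(reused, false);
    System.out.println(reused.isUTC());  // the flag now matches the option
  }
}
```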




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461890#comment-16461890
 ] 

ASF GitHub Bot commented on ORC-341:


Github user wgtmac commented on a diff in the pull request:

https://github.com/apache/orc/pull/249#discussion_r185690848
  
--- Diff: java/core/src/java/org/apache/orc/impl/writer/TimestampTreeWriter.java ---
@@ -54,9 +57,20 @@ public TimestampTreeWriter(int columnId,
 if (rowIndexPosition != null) {
   recordPosition(rowIndexPosition);
 }
-this.localTimezone = TimeZone.getDefault();
-// for unit tests to set different time zones
-this.baseEpochSecsLocalTz = Timestamp.valueOf(BASE_TIMESTAMP_STRING).getTime() / MILLIS_PER_SECOND;
+if (writer.isUseUTCTimestamp()) {
+  this.localTimezone = TimeZone.getTimeZone("UTC");
--- End diff --

We'd better change its name to this.writeTimezone to avoid confusion in the 
future.
Same for localDateFormat and baseEpochSecsLocalTz below.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461892#comment-16461892
 ] 

ASF GitHub Bot commented on ORC-341:


Github user wgtmac commented on a diff in the pull request:

https://github.com/apache/orc/pull/249#discussion_r185691649
  
--- Diff: java/core/src/java/org/apache/orc/impl/writer/TimestampTreeWriter.java ---
@@ -28,7 +28,9 @@
 import org.apache.orc.impl.SerializationUtils;
 
 import java.io.IOException;
-import java.sql.Timestamp;
+import java.text.DateFormat;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
 import java.util.TimeZone;
 
 public class TimestampTreeWriter extends TreeWriterBase {
--- End diff --

We should also change the writeBatch function below.

The input vector.isUTC may be true while writer.isUseUTCTimestamp() is 
false, or vice versa. In that case, we need to convert the values to the 
correct writer timezone.
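
One way to picture the conversion asked for here is as a shift that keeps the wall-clock reading fixed while changing the zone it is interpreted in. The sketch below uses `java.time` for clarity; the class and method names are hypothetical, and ORC's own conversion helper may work differently.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical helper for illustration; not the conversion used in the patch.
final class WallClockShiftDemo {
  // Takes millis whose wall clock is meant in `zone` and returns millis that
  // show the same wall clock when interpreted in UTC.
  static long toUtcWallClock(long millis, ZoneId zone) {
    LocalDateTime wallClock = Instant.ofEpochMilli(millis).atZone(zone).toLocalDateTime();
    return wallClock.toInstant(ZoneOffset.UTC).toEpochMilli();
  }

  // Inverse direction: millis whose wall clock is meant in UTC are re-expressed
  // so that the same wall clock is shown in `zone`.
  static long fromUtcWallClock(long millis, ZoneId zone) {
    LocalDateTime wallClock = LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC);
    return wallClock.atZone(zone).toInstant().toEpochMilli();
  }

  public static void main(String[] args) {
    ZoneId zone = ZoneId.of("America/Los_Angeles");
    long localNoon = Instant.parse("2018-05-04T19:00:00Z").toEpochMilli(); // 12:00 in Los Angeles
    System.out.println(Instant.ofEpochMilli(toUtcWallClock(localNoon, zone)));
  }
}
```

A writer that receives a batch flagged differently from its own setting would apply one of these two directions before encoding the values.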




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461644#comment-16461644
 ] 

ASF GitHub Bot commented on ORC-341:


Github user jcamachor commented on the issue:

https://github.com/apache/orc/pull/249
  
I have been testing the patch from Hive and everything seems to be working 
as expected.

I have rebased the patch and merged both commits. Also, I had to extend my 
changes to the newly created ```WriterImplV2```.

@omalley, @wgtmac, could you take a final look and merge if it is OK? 
Thanks




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-05-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460058#comment-16460058
 ] 

ASF GitHub Bot commented on ORC-341:


Github user jcamachor commented on the issue:

https://github.com/apache/orc/pull/249
  
I have just updated the patch now that we have moved to the new storage-api 
version. I will run some tests with Hive locally asap and will get back 
confirming that everything is working as expected.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457098#comment-16457098
 ] 

ASF GitHub Bot commented on ORC-341:


Github user jcamachor commented on a diff in the pull request:

https://github.com/apache/orc/pull/249#discussion_r184811998
  
--- Diff: java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java ---
@@ -975,6 +992,8 @@ public void nextVector(ColumnVector previousVector,
   TimestampColumnVector result = (TimestampColumnVector) previousVector;
   super.nextVector(previousVector, isNull, batchSize);
 
+  // TODO: If context.isUseUTCTimestamp(), set TimestampColumnVector.useUTC to true
--- End diff --

There is a vote going on for a storage-api release. Next week I can 
rebase the patch to consume it, and hopefully we can check it in. Thanks, @wgtmac!




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457070#comment-16457070
 ] 

ASF GitHub Bot commented on ORC-341:


Github user wgtmac commented on a diff in the pull request:

https://github.com/apache/orc/pull/249#discussion_r184807989
  
--- Diff: java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java ---
@@ -975,6 +992,8 @@ public void nextVector(ColumnVector previousVector,
   TimestampColumnVector result = (TimestampColumnVector) previousVector;
   super.nextVector(previousVector, isNull, batchSize);
 
+  // TODO: If context.isUseUTCTimestamp(), set TimestampColumnVector.useUTC to true
--- End diff --

I think it is better to set storage-api to 3.0.0 and fix this TODO in this 
patch as well.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444388#comment-16444388
 ] 

ASF GitHub Bot commented on ORC-341:


Github user wgtmac commented on the issue:

https://github.com/apache/orc/pull/249
  
@jcamachor You are right. WriterOptions/WriterContext are ideal places to 
set this kind of value.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444251#comment-16444251
 ] 

ASF GitHub Bot commented on ORC-341:


Github user jcamachor commented on the issue:

https://github.com/apache/orc/pull/249
  
Pushed a new commit with the changes.

We would still need a storage-api release for the 
```TimestampColumnVector``` changes.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444106#comment-16444106
 ] 

ASF GitHub Bot commented on ORC-341:


Github user jcamachor commented on the issue:

https://github.com/apache/orc/pull/249
  
@wgtmac, thanks for the feedback. Please bear with me for a bit, as this is 
the first time I am touching the ORC code base.
OK, I think ```TypeDescription``` is not a problem then, since we set the 
value at the reader/writer, independently of the default that we use at creation 
time. For the reader, everything seems easy. However, for the writer, it is a bit 
trickier since the stripe footer stores the information about the time zone, 
hence it should be set beforehand using, e.g., the context or options objects. 
Does that seem reasonable?




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16443051#comment-16443051
 ] 

ASF GitHub Bot commented on ORC-341:


Github user wgtmac commented on the issue:

https://github.com/apache/orc/pull/249
  
For the reader, can we set useUTCTimestamp in the function 
TimestampTreeReader::nextVector? For the writer, it is the caller's 
responsibility to set useUTCTimestamp before calling 
TimestampTreeWriter::writeBatch. Does this help? @jcamachor




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16442771#comment-16442771
 ] 

ASF GitHub Bot commented on ORC-341:


Github user jcamachor commented on the issue:

https://github.com/apache/orc/pull/249
  
@omalley , I have been trying to add the Boolean ```useUTCTimestamp``` as 
suggested. Making it work with the reader/writer does not seem to be a problem, 
since I can pass the information through the context. However, we also create 
column vectors in the ```TypeDescription``` class, where we do not seem to have 
any context information, just the type string representation. It seems that 
unless we pass the information through that representation, we cannot know the 
value for the boolean when we create the column over there, and I do not think 
we want to go in that direction. Any ideas?

If we do not go in that direction, I could change the current 
patch to use a ```boolean``` instead of the ```TimeZone``` itself (but without 
storing it).

Please, let me know what you think.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441500#comment-16441500
 ] 

ASF GitHub Bot commented on ORC-341:


Github user wgtmac commented on the issue:

https://github.com/apache/orc/pull/249
  
@jcamachor Yes, I just meant that ORC has no problem dealing with 
timestamps itself. Using UTC everywhere definitely makes things much easier.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441423#comment-16441423
 ] 

ASF GitHub Bot commented on ORC-341:


Github user jcamachor commented on the issue:

https://github.com/apache/orc/pull/249
  
@omalley, it seems like a good idea; let me explore it and refresh the PR. 
I will adapt HIVE-19226 to these new changes too.

@wgtmac, I understand you are suggesting that this can be fixed from the 
Hive side only? The problem is that existing ORC files should still be read 
properly, hence you would need to distinguish old from new ORC files. In 
addition, you would apply the displacement twice when reading/writing, in Hive 
and in ORC. It seems to me the cleaner solution is simply to be able to tell 
ORC, from the Java reader/writer, that timestamp data is in UTC. FWIW, the 
change to stringify in TimestampColumnVector is indeed needed.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441286#comment-16441286
 ] 

ASF GitHub Bot commented on ORC-341:


Github user omalley commented on the issue:

https://github.com/apache/orc/pull/249
  
Also note that the C++ reader already uses UTC for its 
TimestampColumnVector. :)




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441285#comment-16441285
 ] 

ASF GitHub Bot commented on ORC-341:


Github user omalley commented on the issue:

https://github.com/apache/orc/pull/249
  
@jcamachor I'd suggest a much simpler API:

- Instead of passing in the reader timezone, make a boolean option to 
useUtcForTimestamp.
- Extend TimestampColumnVector to have a boolean isUTC field.
- The TimestampTreeWriter can use the isUTC in the ColumnVector to 
determine if it is in UTC.
- The reader can set isUTC appropriately based on the option.

Thoughts?
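
The four points above can be sketched as follows. Every type here is a hypothetical stand-in written for illustration (`ReaderOptions`, `TimestampVector`, and the helper methods are not the real ORC or storage-api classes), and the accessor name is only one possible choice.

```java
import java.util.TimeZone;

// Hypothetical sketch of the proposed API shape; not the actual ORC classes.
final class UtcOptionSketch {
  // Point 1: a boolean option instead of a full reader time zone.
  static final class ReaderOptions {
    private boolean useUTCTimestamp;
    ReaderOptions useUTCTimestamp(boolean value) {
      this.useUTCTimestamp = value;
      return this;
    }
    boolean getUseUTCTimestamp() {
      return useUTCTimestamp;
    }
  }

  // Point 2: the timestamp vector carries a boolean isUTC field.
  static final class TimestampVector {
    boolean isUTC;
  }

  // Point 4: the reader sets isUTC on the batch based on the option.
  static void readInto(TimestampVector batch, ReaderOptions options) {
    batch.isUTC = options.getUseUTCTimestamp();
  }

  // Point 3: the writer inspects the vector to decide which zone applies.
  static String writerZoneFor(TimestampVector batch) {
    return batch.isUTC ? "UTC" : TimeZone.getDefault().getID();
  }

  public static void main(String[] args) {
    TimestampVector batch = new TimestampVector();
    readInto(batch, new ReaderOptions().useUTCTimestamp(true));
    System.out.println(writerZoneFor(batch));
  }
}
```

Carrying the flag on the vector means a batch read with the UTC option can be handed to a writer without any side channel describing how its values should be interpreted.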




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441241#comment-16441241
 ] 

ASF GitHub Bot commented on ORC-341:


Github user wgtmac commented on the issue:

https://github.com/apache/orc/pull/249
  
AFAIK, ORC is not at fault in HIVE-12192. What ORC guarantees is that we
always get the same wall-clock time representation, regardless of time zone.
The current Java implementation relies on java.sql.Timestamp, which uses the
local time zone; that is why the writer and reader always use timestamp values
in the local time zone. Your change cannot be adopted here unless we add a new
TimestampColumnVector that enforces the UTC time zone.
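
To illustrate the java.sql.Timestamp behavior described above, here is a small sketch using only the JDK. It does not touch ORC; the class and helper names are made up for the example. The same wall-clock string is parsed under two different JVM default time zones and yields two different instants.

```java
import java.sql.Timestamp;
import java.util.TimeZone;

class LocalZoneTimestampDemo {
    /**
     * Parses a wall-clock string while the given zone is the JVM default
     * and returns the resulting epoch milliseconds. The previous default
     * is restored afterwards.
     */
    static long epochMillisIn(String wallClock, String zoneId) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            // Timestamp.valueOf interprets the string in the default zone.
            return Timestamp.valueOf(wallClock).getTime();
        } finally {
            TimeZone.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        String wallClock = "2018-04-17 12:00:00";
        long utc = epochMillisIn(wallClock, "UTC");
        long tokyo = epochMillisIn(wallClock, "Asia/Tokyo");
        // Asia/Tokyo is UTC+9, so noon there is an earlier instant than noon UTC.
        System.out.println((utc - tokyo) / 3_600_000L + " hours apart");
    }
}
```

The wall-clock text is identical in both calls, but the stored instant shifts with the default zone.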




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440446#comment-16440446
 ] 

ASF GitHub Bot commented on ORC-341:


Github user jcamachor commented on the issue:

https://github.com/apache/orc/pull/249
  
@wgtmac, see the discussion in
https://issues.apache.org/jira/browse/HIVE-12192 for more context.




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440131#comment-16440131
 ] 

ASF GitHub Bot commented on ORC-341:


Github user wgtmac commented on the issue:

https://github.com/apache/orc/pull/249
  
Why do we need this change?




[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-14 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438320#comment-16438320
 ] 

Jesus Camacho Rodriguez commented on ORC-341:
-

[~owen.omalley], could you review it? Thanks



[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-14 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438319#comment-16438319
 ] 

Jesus Camacho Rodriguez commented on ORC-341:
-

PR in https://github.com/apache/orc/pull/249



[jira] [Commented] (ORC-341) Support time zone as a parameter for Java reader and writer

2018-04-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438318#comment-16438318
 ] 

ASF GitHub Bot commented on ORC-341:


GitHub user jcamachor opened a pull request:

https://github.com/apache/orc/pull/249

[ORC-341] Support time zone as a parameter for Java reader and writer



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jcamachor/orc ORC-341

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/orc/pull/249.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #249


commit 1653d3640d89e404a9e65b7a18fae3a68535a6f0
Author: Jesus Camacho Rodriguez 
Date:   2018-04-14T11:31:49Z

[ORC-341] Support time zone as a parameter for Java reader and writer



