[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2023-11-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789880#comment-17789880
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli merged PR #900:
URL: https://github.com/apache/parquet-mr/pull/900




> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2023-11-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789881#comment-17789881
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli commented on PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1827000412

   Merged




> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2023-11-26 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789870#comment-17789870
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

mwong38 commented on PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1826944950

   > I just noticed this PR and sorry to see it does not check in. @mwong38 
Could you try rebase it one last time? Thanks!
   
   Done. I really hope it's the last time.




> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2023-10-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776574#comment-17776574
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

wgtmac commented on PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1768050027

   I just noticed this PR and sorry to see it does not check in. @mwong38 Could 
you try rebase it one last time? Thanks!




> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2023-10-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776140#comment-17776140
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

zhaolihe commented on PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1766145091

any reason why this PR hasn't been merged yet?




> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-10-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619125#comment-17619125
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

sheinbergon commented on PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1281551940

   @shangxinli any reason why this PR hasn't been merged yet?




> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-09-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600051#comment-17600051
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

mwong38 commented on PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1236318657

   > I think we are close to merge this PR. Resolve the conflict and use the 
imports , then we can merge.
   
   Done




> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-07-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17570544#comment-17570544
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli commented on PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1193386419

   I think we are close to merge this PR. Resolve the conflict and use the 
imports , then we can merge. 




> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-07-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17570543#comment-17570543
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli commented on code in PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r928306582


##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java:
##
@@ -427,6 +485,218 @@ public void addBinary(Binary binary) {
 
   }
 
+  final class ProtoTimestampConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+final LogicalTypeAnnotation.TimestampLogicalTypeAnnotation 
logicalTypeAnnotation;
+
+public ProtoTimestampConverter(ParentValueContainer parent, 
LogicalTypeAnnotation.TimestampLogicalTypeAnnotation logicalTypeAnnotation) {
+  this.parent = parent;
+  this.logicalTypeAnnotation = logicalTypeAnnotation;
+}
+
+@Override
+public void addLong(long value) {
+  switch (logicalTypeAnnotation.getUnit()) {
+case MICROS:
+  parent.add(Timestamps.fromMicros(value));
+  break;
+case MILLIS:
+  parent.add(Timestamps.fromMillis(value));
+  break;
+case NANOS:
+  parent.add(Timestamps.fromNanos(value));
+  break;
+  }
+}
+  }
+
+  final class ProtoDateConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoDateConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addInt(int value) {
+  LocalDate localDate = LocalDate.ofEpochDay(value);
+  com.google.type.Date date = com.google.type.Date.newBuilder()

Review Comment:
   If there is no collision on the imports, we should use imports generally. 





> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-07-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17570542#comment-17570542
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli commented on code in PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r928306114


##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java:
##
@@ -97,6 +127,46 @@ public MessageType convert(Class 
protobufClass) {
 
   private  Builder>, GroupBuilder> 
addField(FieldDescriptor descriptor, final GroupBuilder builder) {
 if (descriptor.getJavaType() == JavaType.MESSAGE) {
+  if (unwrapProtoWrappers) {
+String typeName = descriptor.getMessageType().getFullName();
+if (typeName.equals(PROTOBUF_TIMESTAMP_TYPE)) {
+  return builder.primitive(INT64, 
getRepetition(descriptor)).as(timestampType(true, TimeUnit.NANOS));

Review Comment:
   @emkornfield  Do you still have a comment on this? 





> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-07-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17562101#comment-17562101
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

mwong38 commented on code in PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r912817641


##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java:
##
@@ -97,6 +127,46 @@ public MessageType convert(Class 
protobufClass) {
 
   private  Builder>, GroupBuilder> 
addField(FieldDescriptor descriptor, final GroupBuilder builder) {
 if (descriptor.getJavaType() == JavaType.MESSAGE) {
+  if (unwrapProtoWrappers) {
+String typeName = descriptor.getMessageType().getFullName();
+if (typeName.equals(PROTOBUF_TIMESTAMP_TYPE)) {
+  return builder.primitive(INT64, 
getRepetition(descriptor)).as(timestampType(true, TimeUnit.NANOS));

Review Comment:
   Isn't that outside the scope of ProtoSchemaConverter/ParquetWriter? I don't 
think it should go into the business of doing transformations. If the point of 
`ProtoSchemaConverter` is to give the closest representation of the Protobuf 
object in Parquet, then it should be NANOS and nothing more.





> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-07-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17561973#comment-17561973
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli commented on code in PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r912586665


##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java:
##
@@ -97,6 +127,46 @@ public MessageType convert(Class 
protobufClass) {
 
   private  Builder>, GroupBuilder> 
addField(FieldDescriptor descriptor, final GroupBuilder builder) {
 if (descriptor.getJavaType() == JavaType.MESSAGE) {
+  if (unwrapProtoWrappers) {
+String typeName = descriptor.getMessageType().getFullName();
+if (typeName.equals(PROTOBUF_TIMESTAMP_TYPE)) {
+  return builder.primitive(INT64, 
getRepetition(descriptor)).as(timestampType(true, TimeUnit.NANOS));

Review Comment:
   In that case, you can use the default 'TimeUnit.NANOS' while having that 
configured. 





> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-07-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17561948#comment-17561948
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

mwong38 commented on code in PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r912547741


##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java:
##
@@ -97,6 +127,46 @@ public MessageType convert(Class 
protobufClass) {
 
   private  Builder>, GroupBuilder> 
addField(FieldDescriptor descriptor, final GroupBuilder builder) {
 if (descriptor.getJavaType() == JavaType.MESSAGE) {
+  if (unwrapProtoWrappers) {
+String typeName = descriptor.getMessageType().getFullName();

Review Comment:
   That's a good point. I'll change it to compare the `Descriptor` directly to 
the Wrapper's Descriptor rather than the name.





> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-07-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17561946#comment-17561946
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

mwong38 commented on code in PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r912545610


##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java:
##
@@ -97,6 +127,46 @@ public MessageType convert(Class 
protobufClass) {
 
   private  Builder>, GroupBuilder> 
addField(FieldDescriptor descriptor, final GroupBuilder builder) {
 if (descriptor.getJavaType() == JavaType.MESSAGE) {
+  if (unwrapProtoWrappers) {
+String typeName = descriptor.getMessageType().getFullName();
+if (typeName.equals(PROTOBUF_TIMESTAMP_TYPE)) {
+  return builder.primitive(INT64, 
getRepetition(descriptor)).as(timestampType(true, TimeUnit.NANOS));

Review Comment:
   I don't think it's worth complicating the API. The Timestamp common Proto 
stores time in nanoseconds. There's no good reason to deviate from that or to 
truncate the resolution. If the user wishes to do more manipulation, it can be 
done downstream.





> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-07-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17561945#comment-17561945
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

mwong38 commented on code in PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r912545477


##
parquet-protobuf/pom.xml:
##
@@ -57,6 +58,16 @@
   protobuf-java
   ${protobuf.version}
 
+
+  com.google.protobuf
+  protobuf-java-util
+  ${protobuf.version}
+
+
+  com.google.api.grpc
+  proto-google-common-protos

Review Comment:
   If you want to pre-packaged common Proto classes, you need to include this.





> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-06-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553412#comment-17553412
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

sheinbergon commented on PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1153562823

   @mwong38 anyway I can help with finalizing this PR?




> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-04-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525061#comment-17525061
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli commented on PR #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1104043913

   @mwong38 Can you address the feedback from @emkornfield before we can merge? 




> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17513131#comment-17513131
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

emkornfield commented on a change in pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r836044690



##
File path: 
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java
##
@@ -97,6 +127,46 @@ public MessageType convert(Class 
protobufClass) {
 
   private  Builder>, GroupBuilder> 
addField(FieldDescriptor descriptor, final GroupBuilder builder) {
 if (descriptor.getJavaType() == JavaType.MESSAGE) {
+  if (unwrapProtoWrappers) {
+String typeName = descriptor.getMessageType().getFullName();

Review comment:
   is there a reason you are using type equality vs descriptor equality?  
(I haven't checked but I would think .Equals would work?  Maybe add a comment 
if this is intentional.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17513128#comment-17513128
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

emkornfield commented on a change in pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r836042758



##
File path: 
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java
##
@@ -35,22 +35,49 @@
 
 import java.util.List;
 
+import static org.apache.parquet.schema.LogicalTypeAnnotation.TimeUnit;
+import static org.apache.parquet.schema.LogicalTypeAnnotation.dateType;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.enumType;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.listType;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.mapType;
 import static org.apache.parquet.schema.LogicalTypeAnnotation.stringType;
-import static org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName.*;
+import static org.apache.parquet.schema.LogicalTypeAnnotation.timeType;
+import static org.apache.parquet.schema.LogicalTypeAnnotation.timestampType;
+import static org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName.BINARY;
+import static 
org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName.BOOLEAN;
+import static org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName.DOUBLE;
+import static org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName.FLOAT;
+import static org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName.INT32;
+import static org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName.INT64;
 
 /**
  * Converts a Protocol Buffer Descriptor into a Parquet schema.
  */
 public class ProtoSchemaConverter {
 
+  public static final String PROTOBUF_TIMESTAMP_TYPE = 
"google.protobuf.Timestamp";
+  public static final String PROTOBUF_DATE_TYPE = "google.type.Date";
+  public static final String PROTOBUF_TIME_TYPE = "google.type.TimeOfDay";
+  public static final String PROTOBUF_DOUBLE_TYPE = 
"google.protobuf.DoubleValue";
+  public static final String PROTOBUF_FLOAT_TYPE = 
"google.protobuf.FloatValue";
+  public static final String PROTOBUF_INT64_TYPE = 
"google.protobuf.Int64Value";
+  public static final String PROTOBUF_UINT64_TYPE = 
"google.protobuf.UInt64Value";
+  public static final String PROTOBUF_INT32_TYPE = 
"google.protobuf.Int32Value";
+  public static final String PROTOBUF_UINT32_TYPE = 
"google.protobuf.UInt32Value";
+  public static final String PROTOBUF_BOOL_TYPE = "google.protobuf.BoolValue";
+  public static final String PROTOBUF_STRING_TYPE = 
"google.protobuf.StringValue";
+  public static final String PROTOBUF_BYTES_TYPE = 
"google.protobuf.BytesValue";
+
   private static final Logger LOG = 
LoggerFactory.getLogger(ProtoSchemaConverter.class);
   private final boolean parquetSpecsCompliant;
+  private final boolean unwrapProtoWrappers;
 
   public ProtoSchemaConverter() {
-this(false);
+this(false, false);
+  }
+
+  public ProtoSchemaConverter(boolean parquetSpecsCompliant) {

Review comment:
   please add docs for all new publicly facingin parameters.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17513127#comment-17513127
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

emkornfield commented on a change in pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r836040766



##
File path: 
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoSchemaConverter.java
##
@@ -97,6 +127,46 @@ public MessageType convert(Class 
protobufClass) {
 
   private  Builder>, GroupBuilder> 
addField(FieldDescriptor descriptor, final GroupBuilder builder) {
 if (descriptor.getJavaType() == JavaType.MESSAGE) {
+  if (unwrapProtoWrappers) {
+String typeName = descriptor.getMessageType().getFullName();
+if (typeName.equals(PROTOBUF_TIMESTAMP_TYPE)) {
+  return builder.primitive(INT64, 
getRepetition(descriptor)).as(timestampType(true, TimeUnit.NANOS));

Review comment:
   should TimeUnit be something configurable?  I know protobuf timestamps 
support nanosecond resolution but might not be what applications want to store.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17513126#comment-17513126
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

emkornfield commented on a change in pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r836039431



##
File path: parquet-protobuf/pom.xml
##
@@ -57,6 +58,16 @@
   protobuf-java
   ${protobuf.version}
 
+
+  com.google.protobuf
+  protobuf-java-util
+  ${protobuf.version}
+
+
+  com.google.api.grpc
+  proto-google-common-protos

Review comment:
   why this dependency, i thought well known types shipped with the proto 
package?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512548#comment-17512548
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli commented on pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1079368200


   Will have another look soon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512378#comment-17512378
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

sheinbergon commented on pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1079025532


   @shangxinli any news about merging this version? Are there still any 
blockers?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17510486#comment-17510486
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

dossett commented on pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1075164744






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509743#comment-17509743
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

mwong38 commented on a change in pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r830941957



##
File path: 
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java
##
@@ -427,6 +485,218 @@ public void addBinary(Binary binary) {
 
   }
 
+  final class ProtoTimestampConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+final LogicalTypeAnnotation.TimestampLogicalTypeAnnotation 
logicalTypeAnnotation;
+
+public ProtoTimestampConverter(ParentValueContainer parent, 
LogicalTypeAnnotation.TimestampLogicalTypeAnnotation logicalTypeAnnotation) {
+  this.parent = parent;
+  this.logicalTypeAnnotation = logicalTypeAnnotation;
+}
+
+@Override
+public void addLong(long value) {
+  switch (logicalTypeAnnotation.getUnit()) {
+case MICROS:
+  parent.add(Timestamps.fromMicros(value));
+  break;
+case MILLIS:
+  parent.add(Timestamps.fromMillis(value));
+  break;
+case NANOS:
+  parent.add(Timestamps.fromNanos(value));
+  break;
+  }
+}
+  }
+
+  final class ProtoDateConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoDateConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addInt(int value) {
+  LocalDate localDate = LocalDate.ofEpochDay(value);
+  com.google.type.Date date = com.google.type.Date.newBuilder()

Review comment:
   I think in this particular case it's more clear to be explicit that this 
is the Google Protobuf Date, rather than, say, the Java built-in Date.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509744#comment-17509744
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

mwong38 commented on a change in pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r830942166



##
File path: 
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java
##
@@ -427,6 +485,218 @@ public void addBinary(Binary binary) {
 
   }
 
+  final class ProtoTimestampConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+final LogicalTypeAnnotation.TimestampLogicalTypeAnnotation 
logicalTypeAnnotation;
+
+public ProtoTimestampConverter(ParentValueContainer parent, 
LogicalTypeAnnotation.TimestampLogicalTypeAnnotation logicalTypeAnnotation) {
+  this.parent = parent;
+  this.logicalTypeAnnotation = logicalTypeAnnotation;
+}
+
+@Override
+public void addLong(long value) {
+  switch (logicalTypeAnnotation.getUnit()) {
+case MICROS:
+  parent.add(Timestamps.fromMicros(value));
+  break;
+case MILLIS:
+  parent.add(Timestamps.fromMillis(value));
+  break;
+case NANOS:
+  parent.add(Timestamps.fromNanos(value));
+  break;
+  }
+}
+  }
+
+  final class ProtoDateConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoDateConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addInt(int value) {
+  LocalDate localDate = LocalDate.ofEpochDay(value);
+  com.google.type.Date date = com.google.type.Date.newBuilder()
+.setYear(localDate.getYear())
+.setMonth(localDate.getMonthValue())
+.setDay(localDate.getDayOfMonth())
+.build();
+  parent.add(date);
+}
+  }
+
+  final class ProtoTimeConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+final LogicalTypeAnnotation.TimeLogicalTypeAnnotation 
logicalTypeAnnotation;
+
+public ProtoTimeConverter(ParentValueContainer parent, 
LogicalTypeAnnotation.TimeLogicalTypeAnnotation logicalTypeAnnotation) {
+  this.parent = parent;
+  this.logicalTypeAnnotation = logicalTypeAnnotation;
+}
+
+@Override
+public void addLong(long value) {
+  LocalTime localTime;
+  switch (logicalTypeAnnotation.getUnit()) {
+case MILLIS:
+  localTime = LocalTime.ofNanoOfDay(value * 1_000_000);
+  break;
+case MICROS:
+  localTime = LocalTime.ofNanoOfDay(value * 1_000);
+  break;
+case NANOS:
+  localTime = LocalTime.ofNanoOfDay(value);
+  break;
+default:
+  throw new IllegalArgumentException("Unrecognized TimeUnit: " + 
logicalTypeAnnotation.getUnit());
+  }
+  com.google.type.TimeOfDay timeOfDay = 
com.google.type.TimeOfDay.newBuilder()
+.setHours(localTime.getHour())
+.setMinutes(localTime.getMinute())
+.setSeconds(localTime.getSecond())
+.setNanos(localTime.getNano())
+.build();
+  parent.add(timeOfDay);
+}
+  }
+
+  final class ProtoDoubleValueConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoDoubleValueConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addDouble(double value) {
+  parent.add(DoubleValue.of(value));
+}
+  }
+
+  final class ProtoFloatValueConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoFloatValueConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addFloat(float value) {
+  parent.add(FloatValue.of(value));
+}
+  }
+
+  final class ProtoInt64ValueConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoInt64ValueConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addLong(long value) {
+  parent.add(Int64Value.of(value));
+}
+  }
+
+  final class ProtoUInt64ValueConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoUInt64ValueConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addLong(long value) {
+  parent.add(UInt64Value.of(value));
+}
+  }
+
+  final class ProtoInt32ValueConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoInt32ValueConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addInt(int value) {
+  parent.add(Int32Value.of(value));
+}
+  }
+
+  final class ProtoUInt32ValueConverter extends 

[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509738#comment-17509738
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

sheinbergon commented on pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1073708633


   @mwong38 let me know if you want me to help in any way


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509502#comment-17509502
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli commented on a change in pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r830653898



##
File path: 
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java
##
@@ -427,6 +485,218 @@ public void addBinary(Binary binary) {
 
   }
 
+  final class ProtoTimestampConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+final LogicalTypeAnnotation.TimestampLogicalTypeAnnotation 
logicalTypeAnnotation;
+
+public ProtoTimestampConverter(ParentValueContainer parent, 
LogicalTypeAnnotation.TimestampLogicalTypeAnnotation logicalTypeAnnotation) {
+  this.parent = parent;
+  this.logicalTypeAnnotation = logicalTypeAnnotation;
+}
+
+@Override
+public void addLong(long value) {
+  switch (logicalTypeAnnotation.getUnit()) {
+case MICROS:
+  parent.add(Timestamps.fromMicros(value));
+  break;
+case MILLIS:
+  parent.add(Timestamps.fromMillis(value));
+  break;
+case NANOS:
+  parent.add(Timestamps.fromNanos(value));
+  break;
+  }
+}
+  }
+
+  final class ProtoDateConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoDateConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addInt(int value) {
+  LocalDate localDate = LocalDate.ofEpochDay(value);
+  com.google.type.Date date = com.google.type.Date.newBuilder()
+.setYear(localDate.getYear())
+.setMonth(localDate.getMonthValue())
+.setDay(localDate.getDayOfMonth())
+.build();
+  parent.add(date);
+}
+  }
+
+  final class ProtoTimeConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+final LogicalTypeAnnotation.TimeLogicalTypeAnnotation 
logicalTypeAnnotation;
+
+public ProtoTimeConverter(ParentValueContainer parent, 
LogicalTypeAnnotation.TimeLogicalTypeAnnotation logicalTypeAnnotation) {
+  this.parent = parent;
+  this.logicalTypeAnnotation = logicalTypeAnnotation;
+}
+
+@Override
+public void addLong(long value) {
+  LocalTime localTime;
+  switch (logicalTypeAnnotation.getUnit()) {
+case MILLIS:
+  localTime = LocalTime.ofNanoOfDay(value * 1_000_000);
+  break;
+case MICROS:
+  localTime = LocalTime.ofNanoOfDay(value * 1_000);
+  break;
+case NANOS:
+  localTime = LocalTime.ofNanoOfDay(value);
+  break;
+default:
+  throw new IllegalArgumentException("Unrecognized TimeUnit: " + 
logicalTypeAnnotation.getUnit());
+  }
+  com.google.type.TimeOfDay timeOfDay = 
com.google.type.TimeOfDay.newBuilder()
+.setHours(localTime.getHour())
+.setMinutes(localTime.getMinute())
+.setSeconds(localTime.getSecond())
+.setNanos(localTime.getNano())
+.build();
+  parent.add(timeOfDay);
+}
+  }
+
+  final class ProtoDoubleValueConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoDoubleValueConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addDouble(double value) {
+  parent.add(DoubleValue.of(value));
+}
+  }
+
+  final class ProtoFloatValueConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoFloatValueConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addFloat(float value) {
+  parent.add(FloatValue.of(value));
+}
+  }
+
+  final class ProtoInt64ValueConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoInt64ValueConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addLong(long value) {
+  parent.add(Int64Value.of(value));
+}
+  }
+
+  final class ProtoUInt64ValueConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoUInt64ValueConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addLong(long value) {
+  parent.add(UInt64Value.of(value));
+}
+  }
+
+  final class ProtoInt32ValueConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoInt32ValueConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addInt(int value) {
+  parent.add(Int32Value.of(value));
+}
+  }
+
+  final class ProtoUInt32ValueConverter extends 

[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509501#comment-17509501
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli commented on a change in pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#discussion_r830652740



##
File path: 
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java
##
@@ -427,6 +485,218 @@ public void addBinary(Binary binary) {
 
   }
 
+  final class ProtoTimestampConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+final LogicalTypeAnnotation.TimestampLogicalTypeAnnotation 
logicalTypeAnnotation;
+
+public ProtoTimestampConverter(ParentValueContainer parent, 
LogicalTypeAnnotation.TimestampLogicalTypeAnnotation logicalTypeAnnotation) {
+  this.parent = parent;
+  this.logicalTypeAnnotation = logicalTypeAnnotation;
+}
+
+@Override
+public void addLong(long value) {
+  switch (logicalTypeAnnotation.getUnit()) {
+case MICROS:
+  parent.add(Timestamps.fromMicros(value));
+  break;
+case MILLIS:
+  parent.add(Timestamps.fromMillis(value));
+  break;
+case NANOS:
+  parent.add(Timestamps.fromNanos(value));
+  break;
+  }
+}
+  }
+
+  final class ProtoDateConverter extends PrimitiveConverter {
+
+final ParentValueContainer parent;
+
+public ProtoDateConverter(ParentValueContainer parent) {
+  this.parent = parent;
+}
+
+@Override
+public void addInt(int value) {
+  LocalDate localDate = LocalDate.ofEpochDay(value);
+  com.google.type.Date date = com.google.type.Date.newBuilder()

Review comment:
   Can we use import? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509336#comment-17509336
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli commented on pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1073074540


   Can you squash all the commits? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502398#comment-17502398
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli commented on pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1060883376


   I don't see a reason why this cannot be merge. I will have a look soon. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501969#comment-17501969
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

sheinbergon commented on pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1059969038


   @shangxinli is there any reason this work shouldn't be merged, given 
@mwong38 explanation?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501970#comment-17501970
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

sheinbergon edited a comment on pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1059969038


   @shangxinli is there any reason this work shouldn't be merged, given 
@mwong38 explanation? Is there something I can help with?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501903#comment-17501903
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

mwong38 commented on pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1059918680


   After proto3 made everything optional, there is no way to know whether a 
primitive has been set or not. That is, you could no longer represent a 
"nullable" primitive. (They later brought `optional` keyword back, but the 
damage is done). The solution was to wrap a primitive in a message. For 
example, to represent a nullable `double`, there is the 
[DoubleValue](https://developers.google.com/protocol-buffers/docs/reference/google.protobuf#doublevalue).
 It's analogous to Java's boxed `Double`. 
   
   Parquet supports nullable primitives natively. If we were to represent these 
well-known Protobuf types directly, it will be nested inside a deeper data 
structure; very inconvenient and a waste of space. What I did here is "unwrap" 
these primitives and make use of Parquet's nullaibility.
   
   Additionally, I convert other well-known types such as Timestamp and Date to 
Parquet's. Again, rather than represent the data structure in its raw nested 
form (seconds/nanos or year, month, day, etc), they are converted to Parquet's 
native or logical representation of Timestamp and Date. You can turn on or off 
this features in the `ProtoSchemaConverter` constructor.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2022-03-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501733#comment-17501733
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

sheinbergon commented on pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-1059757563


   @shangxinli @mwong38  any way I can help with merge? Currently, 
proto-to-parquet conversions do not support proper timestamp handling, and it 
looks like this work addresses the issue.
   
   Let me know.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2021-11-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17442869#comment-17442869
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

shangxinli commented on pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900#issuecomment-967285427


   @mwong38, can you put more information in the Jira on why/what is changed? 
This is pretty big change and it would help people to review your code. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (PARQUET-2042) Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay

2021-05-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338228#comment-17338228
 ] 

ASF GitHub Bot commented on PARQUET-2042:
-

mwong38 opened a new pull request #900:
URL: https://github.com/apache/parquet-mr/pull/900


   … logical Timestamps, Date, TimeOfDay
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [X] My PR addresses the following [Parquet 
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references 
them in the PR title. For example, "PARQUET-1234: My Parquet PR"
 - https://issues.apache.org/jira/browse/PARQUET-XXX
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Tests
   
   - [X] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [X] My commits all reference Jira issues in their subject lines. In 
addition, my commits follow the guidelines from "[How to write a good git 
commit message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [X] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain Javadoc that 
explain what it does
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unwrap common Protobuf wrappers and logical Timestamps, Date, TimeOfDay
> ---
>
> Key: PARQUET-2042
> URL: https://issues.apache.org/jira/browse/PARQUET-2042
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-protobuf
>Reporter: Michael Wong
>Priority: Major
>
> Related to https://issues.apache.org/jira/browse/PARQUET-1595



--
This message was sent by Atlassian Jira
(v8.3.4#803005)