[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-09-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17605374#comment-17605374
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

dossett commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1248173412

   @stejskal This doesn't directly answer your question, but I resolved the 
JIRA ticket and marked it for the 1.13.0 release.  I'm not aware of the 
timeline for that release though.




> Add support for Dynamic Messages in parquet-protobuf
> 
>
> Key: PARQUET-1020
> URL: https://issues.apache.org/jira/browse/PARQUET-1020
> Project: Parquet
>  Issue Type: New Feature
>  Components: parquet-protobuf
>Reporter: Alex Buck
>Assignee: Alex Buck
>Priority: Major
> Fix For: 1.13.0
>
>
> Hello. We would like to pass in a DynamicMessage rather than using the 
> generated protobuf classes to allow us to make our job very generic. 
> I think this could be achieved by setting the descriptor upfront, similarly 
> to how there is a ProtoParquetOutputFormat today.
> In ProtoWriteSupport in the init method it could then generate the parquet 
> schema created by ProtoSchemaConverter using the passed in descriptor, rather 
> than taking it from the generated proto class.
> Would there be interest in incorporating this change? If so does the approach 
> above sound sensible? I am happy to do a pull request
> initial PR here: https://github.com/apache/parquet-mr/pull/414
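
The proposal quoted above amounts to deriving the Parquet schema from a descriptor that is only known at runtime, instead of from a generated class. The following is a minimal sketch of that idea using only the Java standard library. `SketchField`, `RuntimeSchemaSketch`, and the type mapping are hypothetical stand-ins invented for illustration; they are not the protobuf `Descriptor` API and do not reproduce the actual rules of `ProtoSchemaConverter`.

```java
import java.util.List;

/** Hypothetical stand-in for one field of a runtime protobuf descriptor. */
final class SketchField {
  final String name;
  final String protoType; // e.g. "int64", "string", "bool", "double"
  final boolean repeated;

  SketchField(String name, String protoType, boolean repeated) {
    this.name = name;
    this.protoType = protoType;
    this.repeated = repeated;
  }
}

final class RuntimeSchemaSketch {
  /** Simplified, assumed mapping from protobuf scalar types to Parquet primitive names. */
  static String parquetPrimitive(String protoType) {
    switch (protoType) {
      case "int32":  return "int32";
      case "int64":  return "int64";
      case "bool":   return "boolean";
      case "double": return "double";
      case "string": return "binary";
      default:
        throw new IllegalArgumentException("unsupported type in this sketch: " + protoType);
    }
  }

  /** Builds a Parquet-style message schema string from fields known only at runtime. */
  static String toParquetSchema(String messageName, List<SketchField> fields) {
    StringBuilder sb = new StringBuilder("message ").append(messageName).append(" {\n");
    for (SketchField f : fields) {
      sb.append("  ")
        .append(f.repeated ? "repeated" : "optional").append(' ')
        .append(parquetPrimitive(f.protoType)).append(' ')
        .append(f.name).append(";\n");
    }
    return sb.append("}").toString();
  }
}
```

Because the field list is plain data, a generic job could build it from whatever schema source it has at hand, which is the property the issue asks for.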



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-09-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17604981#comment-17604981
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

stejskal commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1247314834

   Is there any idea when this will be released?






[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-07-25 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17570689#comment-17570689
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

guillaume-fetter commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1193643505

   Thank you very much!






[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-07-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566350#comment-17566350
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

dossett commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1183301646

   Terrific, thank you!






[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-07-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566347#comment-17566347
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

shangxinli commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1183298278

   Merged. Thanks again! 






[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-07-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566346#comment-17566346
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

shangxinli merged PR #963:
URL: https://github.com/apache/parquet-mr/pull/963






[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-07-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566326#comment-17566326
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

dossett commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1183261724

   Thank you @shangxinli !  Do you want to merge it now or closer to the next 
release?






[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-07-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17561769#comment-17561769
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

shangxinli commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1172925425

   Sorry for the late response and thank you @guillaume-fetter and @dossett for 
the contribution. Yeah, it seems low risk and LGTM. 






[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-06-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554279#comment-17554279
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

dossett commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1155680245

   I tested this locally and it works beautifully. Thank you, @guillaume-fetter.
   
   @shangxinli @gszadovszky 



[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-06-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553622#comment-17553622
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

dossett commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1154013785

   @guillaume-fetter I see what you mean, that makes sense. I think for my use 
case (reading protobuf data from kafka via the confluent schema registry and 
then writing to parquet) I won't get tripped up by the serializability issue. 
This will be a nice parquet enhancement!






[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-06-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553618#comment-17553618
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

guillaume-fetter commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1154004113

   @dossett Depends on your use case. If you are running a simple program that 
does data processing on a single host, then you're good. If you are using a big 
data processing tool (Flink, in my case), you can't pass a DynamicMessage 
instance from one task to another, or at least I did not find a way to make it 
work.
   For unrelated reasons, we are using the SelfDescribingMessage design pattern 
(https://developers.google.com/protocol-buffers/docs/techniques#self-description),
which is a specific message type and therefore serializable. From there we wrote 
a parquet writer that converts the SelfDescribingMessage to a DynamicMessage 
and then writes it using this upgraded ProtoWriteSupport.
   
   It's clearly convoluted unless you are already using a SelfDescribingMessage 
or equivalent.
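
The workaround described in this comment can be sketched as follows: ship a serializable envelope that carries schema and payload as bytes, and rebuild the non-serializable message only inside the writer. This is a minimal illustration under stated assumptions, using only the Java standard library. `SelfDescribingEnvelope`, `SketchDynamicMessage`, and the comma-separated encoding are hypothetical stand-ins, not the protobuf types or wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical envelope: schema and payload travel as plain bytes, so it can be serialized. */
final class SelfDescribingEnvelope implements Serializable {
  private static final long serialVersionUID = 1L;
  final byte[] schema;  // stand-in for a serialized descriptor set
  final byte[] payload; // stand-in for the serialized message body

  SelfDescribingEnvelope(String fieldNames, String fieldValues) {
    this.schema = fieldNames.getBytes(StandardCharsets.UTF_8);
    this.payload = fieldValues.getBytes(StandardCharsets.UTF_8);
  }
}

/** Stand-in for DynamicMessage: deliberately NOT Serializable. */
final class SketchDynamicMessage {
  final Map<String, String> fields = new LinkedHashMap<>();
}

final class EnvelopeDemo {
  /** Simulates passing the envelope between tasks via Java serialization. */
  static SelfDescribingEnvelope ship(SelfDescribingEnvelope envelope) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
        out.writeObject(envelope);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
        return (SelfDescribingEnvelope) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("envelope could not be shipped", e);
    }
  }

  /** Runs inside the writer: rebuilds the non-serializable message from the envelope. */
  static SketchDynamicMessage toDynamic(SelfDescribingEnvelope envelope) {
    String[] names = new String(envelope.schema, StandardCharsets.UTF_8).split(",");
    String[] values = new String(envelope.payload, StandardCharsets.UTF_8).split(",");
    if (names.length != values.length) {
      throw new IllegalArgumentException("schema and payload do not line up");
    }
    SketchDynamicMessage message = new SketchDynamicMessage();
    for (int i = 0; i < names.length; i++) {
      message.fields.put(names[i], values[i]);
    }
    return message;
  }
}
```

With the real library, the rebuild step would presumably use something like `DynamicMessage.parseFrom(descriptor, bytes)` once the descriptor has been reconstructed from the envelope; the sketch only shows where that conversion sits relative to the serialization boundary.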






[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-06-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553557#comment-17553557
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

dossett commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1153856375

   Oh, that's interesting, @guillaume-fetter. So you can't just write out a 
dynamic message into parquet without jumping through more hoops?






[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-06-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553445#comment-17553445
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

guillaume-fetter commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1153626738

   Just a heads-up (because I have run into that issue), DynamicMessage is not 
serializable. 
   So this means that this use-case is for local-only instances of a 
DynamicMessage. In my use case I need to build the DynamicMessage from another 
object which is serializable and do so directly in the writer, which is a bit 
convoluted.
   






[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-06-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553434#comment-17553434
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

guillaume-fetter commented on code in PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#discussion_r895442880


##
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoWriteSupport.java:
##
@@ -115,27 +120,32 @@ public void prepareForWrite(RecordConsumer recordConsumer) {
   public WriteContext init(Configuration configuration) {
 
     // if no protobuf descriptor was given in constructor, load descriptor from configuration (set with setProtobufClass)
-    if (protoMessage == null) {
-      Class<? extends Message> pbClass = configuration.getClass(PB_CLASS_WRITE, null, Message.class);
-      if (pbClass != null) {
-        protoMessage = pbClass;
-      } else {
-        String msg = "Protocol buffer class not specified.";
-        String hint = " Please use method ProtoParquetOutputFormat.setProtobufClass(...) or other similar method.";
-        throw new BadConfigurationException(msg + hint);
+    if (descriptor == null) {
+      if (protoMessage == null) {
+        Class<? extends Message> pbClass = configuration.getClass(PB_CLASS_WRITE, null, Message.class);
+        if (pbClass != null) {
+          protoMessage = pbClass;
+        } else {
+          String msg = "Protocol buffer class or descriptor not specified.";
+          String hint = " Please use method ProtoParquetOutputFormat.setProtobufClass(...) or other similar method.";
+          throw new BadConfigurationException(msg + hint);
+        }
       }
+      descriptor = Protobufs.getMessageDescriptor(protoMessage);
+    } else {
+      //Assume no specific Message extending class, so use DynamicMessage
+      protoMessage = DynamicMessage.class;
Review Comment:
   Yes, I agree. In the end I set it just for the sake of having it set, but you 
are right, it will be more confusing than useful.
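
The precedence in the hunk under review (explicit descriptor first, then message class, then configuration, otherwise an error) can be sketched in isolation as below. This is a standard-library-only illustration, not the code from the PR: `SketchDescriptor`, `SchemaSourceResolver`, the string-typed class name, and the configuration key value are hypothetical stand-ins, and `IllegalStateException` stands in for `BadConfigurationException`.

```java
import java.util.Map;

/** Hypothetical stand-in for a protobuf descriptor: just a message type name. */
final class SketchDescriptor {
  final String fullName;

  SketchDescriptor(String fullName) {
    this.fullName = fullName;
  }
}

final class SchemaSourceResolver {
  /** Placeholder key for this sketch; the real constant's value is not shown in the diff. */
  static final String PB_CLASS_WRITE = "sketch.proto.writeClass";

  /**
   * Resolves the descriptor to build the schema from.
   * Order: explicit descriptor, then explicit class name, then configuration.
   */
  static SketchDescriptor resolve(
      SketchDescriptor descriptor, String messageClass, Map<String, String> configuration) {
    if (descriptor != null) {
      // A descriptor passed up front wins; no generated class is needed.
      return descriptor;
    }
    if (messageClass == null) {
      // Fall back to the class configured on the job.
      messageClass = configuration.get(PB_CLASS_WRITE);
    }
    if (messageClass == null) {
      throw new IllegalStateException("Protocol buffer class or descriptor not specified.");
    }
    // Stands in for Protobufs.getMessageDescriptor(protoMessage) in the diff.
    return new SketchDescriptor(messageClass);
  }
}
```

Keeping the descriptor check outermost means existing callers that only set a class behave as before, which fits the later "low risk" assessment in this thread.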







[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17552383#comment-17552383
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

dossett commented on PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1151481731

   +1 (non-binding) for this change.  `DynamicMessage` is quite useful in 
protobuf and support here would be great; I ran into a need for it just today.
   cc @belugabehr in case they have thoughts. There aren't any active 
protobuf-parquet committers AFAICT.








[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2022-04-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529469#comment-17529469
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

guillaume-fetter opened a new pull request, #963:
URL: https://github.com/apache/parquet-mr/pull/963

   ### Jira
   
   - [X] My PR addresses the following [Parquet 
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references 
them in the PR title:
 - https://issues.apache.org/jira/browse/PARQUET-1020
   
   
   ### Tests
   
   - [X] My PR adds the following unit test:
 - testProto3SimplestDynamicMessage in 
parquet-protobuf/src/test/java/org/apache/parquet/proto/ProtoWriteSupportTest.java
   
   ### Commits
   
   - [X] My commits all reference Jira issues in their subject lines. In 
addition, my commits follow the guidelines from "[How to write a good git 
commit message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and classes in the PR contain Javadoc that 
explains what they do
   
   
   This is sort of a resubmission of 
https://github.com/apache/parquet-mr/pull/414 as the PR has been left open for 
quite some time, and the branch has diverged a bit.
   Please tell me if this is okay.






[jira] [Commented] (PARQUET-1020) Add support for Dynamic Messages in parquet-protobuf

2020-06-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17134247#comment-17134247
 ] 

ASF GitHub Bot commented on PARQUET-1020:
-

alexcardell commented on pull request #414:
URL: https://github.com/apache/parquet-mr/pull/414#issuecomment-643289745


   This PR seems to have stagnated, but this is exactly what we're looking for. 
If I fork and fix those conflicts, can we reignite the discussion?


