[
https://issues.apache.org/jira/browse/PARQUET-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17553434#comment-17553434
]
ASF GitHub Bot commented on PARQUET-1020:
-----------------------------------------
guillaume-fetter commented on code in PR #963:
URL: https://github.com/apache/parquet-mr/pull/963#discussion_r895442880
##########
parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoWriteSupport.java:
##########
@@ -115,27 +120,32 @@ public void prepareForWrite(RecordConsumer
recordConsumer) {
public WriteContext init(Configuration configuration) {
// if no protobuf descriptor was given in constructor, load descriptor
from configuration (set with setProtobufClass)
- if (protoMessage == null) {
- Class<? extends Message> pbClass =
configuration.getClass(PB_CLASS_WRITE, null, Message.class);
- if (pbClass != null) {
- protoMessage = pbClass;
- } else {
- String msg = "Protocol buffer class not specified.";
- String hint = " Please use method
ProtoParquetOutputFormat.setProtobufClass(...) or other similar method.";
- throw new BadConfigurationException(msg + hint);
+ if (descriptor == null) {
+ if (protoMessage == null) {
+ Class<? extends Message> pbClass =
configuration.getClass(PB_CLASS_WRITE, null, Message.class);
+ if (pbClass != null) {
+ protoMessage = pbClass;
+ } else {
+ String msg = "Protocol buffer class or descriptor not specified.";
+ String hint = " Please use method
ProtoParquetOutputFormat.setProtobufClass(...) or other similar method.";
+ throw new BadConfigurationException(msg + hint);
+ }
}
+ descriptor = Protobufs.getMessageDescriptor(protoMessage);
+ } else {
+ //Assume no specific Message extending class, so use DynamicMessage
+ protoMessage = DynamicMessage.class;
Review Comment:
Yes I agree. In the end I set it just for the sake of having it set, but you
are right it will be more confusing than useful.
> Add support for Dynamic Messages in parquet-protobuf
> ----------------------------------------------------
>
> Key: PARQUET-1020
> URL: https://issues.apache.org/jira/browse/PARQUET-1020
> Project: Parquet
> Issue Type: New Feature
> Components: parquet-protobuf
> Reporter: Alex Buck
> Assignee: Alex Buck
> Priority: Major
>
> Hello. We would like to pass in a DynamicMessage rather than using the
> generated protobuf classes to allow us to make our job very generic.
> I think this could be achieved by setting the descriptor upfront, similarly
> to how there is a ProtoParquetOutputFormat today.
> In ProtoWriteSupport in the init method it could then generate the parquet
> schema created by ProtoSchemaConverter using the passed in descriptor, rather
> than taking it from the generated proto class.
> Would there be interest in incorporating this change? If so does the approach
> above sound sensible? I am happy to do a pull request
> initial PR here: https://github.com/apache/parquet-mr/pull/414
--
This message was sent by Atlassian Jira
(v8.20.7#820007)