[
https://issues.apache.org/jira/browse/PARQUET-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17581510#comment-17581510
]
ASF GitHub Bot commented on PARQUET-1711:
-----------------------------------------
matthieun opened a new pull request, #988:
URL: https://github.com/apache/parquet-mr/pull/988
If proto definitions have circular dependencies, the proto schema converter
now breaks the cycle and logs a warning instead of failing with a
`StackOverflowError`.
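The cycle-breaking idea can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `ProtoCycleSketch`, `MessageType`, `register`, `convert`, and the `<cycle:...>` placeholder are hypothetical names chosen for illustration, and message types are modeled as plain name-to-field maps rather than real protobuf descriptors. The sketch tracks the message types currently being expanded and stops descending when it meets one of them again, recording a warning instead of recursing until the stack overflows.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from parquet-mr.
class ProtoCycleSketch {

    /** Stand-in for a message descriptor: an ordered map of field name to type name. */
    static final class MessageType {
        final String name;
        final Map<String, String> fields = new LinkedHashMap<>();

        MessageType(String name) {
            this.name = name;
        }

        MessageType field(String fieldName, String typeName) {
            fields.put(fieldName, typeName);
            return this;
        }
    }

    final Map<String, MessageType> registry = new LinkedHashMap<>();
    final List<String> warnings = new ArrayList<>();

    void register(MessageType type) {
        registry.put(type.name, type);
    }

    /** Converts a type to a textual schema, cutting any circular reference. */
    String convert(String typeName) {
        return convert(typeName, new ArrayDeque<>());
    }

    private String convert(String typeName, Deque<String> path) {
        MessageType type = registry.get(typeName);
        if (type == null) {
            // Unregistered names are treated as primitive types.
            return typeName;
        }
        if (path.contains(typeName)) {
            // The type is already being expanded higher up: break the cycle here.
            warnings.add("Breaking circular dependency at " + typeName);
            return "<cycle:" + typeName + ">";
        }
        path.push(typeName);
        StringBuilder sb = new StringBuilder(typeName).append(" {");
        for (Map.Entry<String, String> f : type.fields.entrySet()) {
            sb.append(' ').append(f.getKey()).append(": ")
              .append(convert(f.getValue(), path)).append(';');
        }
        path.pop();
        return sb.append(" }").toString();
    }

    public static void main(String[] args) {
        // Simplified model of the mutually recursive Struct/Value/ListValue types.
        ProtoCycleSketch sketch = new ProtoCycleSketch();
        sketch.register(new MessageType("Struct").field("fields", "Value"));
        sketch.register(new MessageType("Value")
            .field("string_value", "string")
            .field("struct_value", "Struct")
            .field("list_value", "ListValue"));
        sketch.register(new MessageType("ListValue").field("values", "Value"));

        System.out.println(sketch.convert("Struct"));
        sketch.warnings.forEach(System.out::println);
    }
}
```

Tracking only the current expansion path, rather than every type seen so far, means a type that legitimately appears in two sibling fields is still expanded in both places. Only a true back-reference is cut.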
### Jira
- [x] My PR addresses the following [Parquet
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references
them in the PR title. For example, "PARQUET-1234: My Parquet PR"
- https://issues.apache.org/jira/browse/PARQUET-1711
### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for
this extremely good reason:
- Proto definitions with circular dependencies tested in
`ProtoSchemaConverterTest`
### Commits
- [x] My commits all reference Jira issues in their subject lines. In
addition, my commits follow the guidelines from "[How to write a good git
commit message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
1. Subject is limited to 50 characters (not including Jira issue reference)
1. Subject does not end with a period
1. Subject uses the imperative mood ("add", not "adding")
1. Body wraps at 72 characters
1. Body explains "what" and "why", not "how"
### Documentation
- [ ] In case of new functionality, my PR adds documentation that describes
how to use it.
- All the public functions and classes in the PR contain Javadoc that
explains what they do
> [parquet-protobuf] stack overflow when work with well known json type
> ---------------------------------------------------------------------
>
> Key: PARQUET-1711
> URL: https://issues.apache.org/jira/browse/PARQUET-1711
> Project: Parquet
> Issue Type: Bug
> Affects Versions: 1.10.1
> Reporter: Lawrence He
> Priority: Major
>
> Writing the following protobuf message as a Parquet file is not possible:
> {code:java}
> syntax = "proto3";
> import "google/protobuf/struct.proto";
> package test;
> option java_outer_classname = "CustomMessage";
> message TestMessage {
>   map<string, google.protobuf.ListValue> data = 1;
> }
> {code}
> Protobuf introduced "well-known JSON types" such as
> [ListValue|https://developers.google.com/protocol-buffers/docs/reference/google.protobuf#listvalue]
> to work around JSON schema conversion.
> However, writing the above message traps the Parquet writer in an infinite loop due
> to the "general type" support in protobuf. The current implementation keeps
> referencing the 6 possible types defined in protobuf (null, bool, number, string,
> struct, list) and enters an infinite loop when referencing "struct".
> {code:java}
> java.lang.StackOverflowError
>   at java.base/java.util.Arrays$ArrayItr.<init>(Arrays.java:4418)
>   at java.base/java.util.Arrays$ArrayList.iterator(Arrays.java:4410)
>   at java.base/java.util.Collections$UnmodifiableCollection$1.<init>(Collections.java:1044)
>   at java.base/java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1043)
>   at org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:64)
>   at org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
>   at org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:66)
>   at org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
>   at org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:66)
>   at org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
>   at org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:66)
>   at org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
>   at org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:66)
>   at org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
> {code}