Lawrence He created PARQUET-1711:
------------------------------------

             Summary: [parquet-protobuf] stack overflow when work with well 
known json type
                 Key: PARQUET-1711
                 URL: https://issues.apache.org/jira/browse/PARQUET-1711
             Project: Parquet
          Issue Type: Bug
    Affects Versions: 1.10.1
            Reporter: Lawrence He


Writing following protobuf message as parquet file is not possible: 
{code:java}
syntax = "proto3";

import "google/protobuf/struct.proto";

package test;
option java_outer_classname = "CustomMessage";

message TestMessage {
    map<string, google.protobuf.ListValue> data = 1;
} {code}
Protobuf introduced "well known json type" such like 
[ListValue|https://developers.google.com/protocol-buffers/docs/reference/google.protobuf#listvalue]
 to work around json schema conversion. 

However writing above messages traps parquet writer into an infinite loop due 
to the "general type" support in protobuf. Current implementation will keep 
referencing 6 possible types defined in protobuf (null, bool, number, string, 
struct, list) and entering infinite loop when referencing "struct".
{code:java}
java.lang.StackOverflowErrorjava.lang.StackOverflowError at 
java.base/java.util.Arrays$ArrayItr.<init>(Arrays.java:4418) at 
java.base/java.util.Arrays$ArrayList.iterator(Arrays.java:4410) at 
java.base/java.util.Collections$UnmodifiableCollection$1.<init>(Collections.java:1044)
 at 
java.base/java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1043)
 at 
org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:64)
 at 
org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
 at 
org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:66)
 at 
org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
 at 
org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:66)
 at 
org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
 at 
org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:66)
 at 
org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
 at 
org.apache.parquet.proto.ProtoSchemaConverter.convertFields(ProtoSchemaConverter.java:66)
 at 
org.apache.parquet.proto.ProtoSchemaConverter.addField(ProtoSchemaConverter.java:96)
 {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to