[ 
https://issues.apache.org/jira/browse/PARQUET-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Wu resolved PARQUET-2425.
------------------------------
    Fix Version/s: 1.14.0
         Assignee: Claire McGinty
       Resolution: Fixed

> AvroSchemaConverter doesn't support non-grouped repeated fields
> ---------------------------------------------------------------
>
>                 Key: PARQUET-2425
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2425
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: Claire McGinty
>            Assignee: Claire McGinty
>            Priority: Major
>             Fix For: 1.14.0
>
>
> Currently AvroSchemaConverter#convert does not support Parquet-to-Avro 
> conversions where the Parquet schema contains a non-grouped repeated type. 
> For example, this operation:
>  
> new AvroSchemaConverter()
>    .convert(MessageTypeParser.parseMessageType(
>      "message MySchema { repeated int32 repeatedField; }"
>    ))
>  
> triggers an UnsupportedOperationException("REPEATED not supported outside 
> LIST or MAP"): 
> https://github.com/apache/parquet-mr/blob/apache-parquet-1.13.1/parquet-avro/src/main/java/org/apache/parquet/avro/AvroSchemaConverter.java#L292
>  
> However, if I'm interpreting the format spec correctly 
> (https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#nested-types),
> ungrouped repeated types should be treated as required lists of required 
> elements:
> > This does not affect repeated fields that are not annotated: A repeated 
> > field that is neither contained by a LIST- or MAP-annotated 
> > group nor annotated by LIST or MAP should be interpreted as a 
> > required list of required elements where the element type is the type of 
> > the field.
> If this interpretation is correct, can we update AvroSchemaConverter to 
> handle this use case? I'll put up a PR demonstrating it.
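
The rule quoted above can be sketched in a few lines of plain Java. This is only an illustration of the mapping the description asks for, not the parquet-avro implementation: the class RepeatedFieldSketch, the Repetition enum, and the helper avroTypeFor are hypothetical names, and the helper simply builds the Avro schema JSON that each Parquet repetition would correspond to for a primitive field outside any LIST or MAP group.

```java
public class RepeatedFieldSketch {
    // Parquet field repetitions (hypothetical stand-in for the real Parquet type).
    enum Repetition { REQUIRED, OPTIONAL, REPEATED }

    // Hypothetical helper: returns the Avro schema JSON for a primitive Parquet
    // field that is NOT inside a LIST- or MAP-annotated group.
    static String avroTypeFor(String avroPrimitive, Repetition repetition) {
        String element = "\"" + avroPrimitive + "\"";
        switch (repetition) {
            case REQUIRED:
                // A required field maps to the bare element type.
                return element;
            case OPTIONAL:
                // An optional field maps to a union with null.
                return "[\"null\"," + element + "]";
            case REPEATED:
                // Per the quoted spec text: a required list of required elements,
                // i.e. a non-nullable array whose items are non-nullable.
                return "{\"type\":\"array\",\"items\":" + element + "}";
            default:
                throw new IllegalArgumentException("Unknown repetition: " + repetition);
        }
    }

    public static void main(String[] args) {
        // "repeated int32 repeatedField" from the example schema above.
        System.out.println(avroTypeFor("int", Repetition.REPEATED));
        // prints {"type":"array","items":"int"}
    }
}
```

Under this reading, the repeatedField from the example schema would surface in Avro as a non-nullable array of int rather than triggering an exception.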



