[
https://issues.apache.org/jira/browse/FLINK-33759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cai Liuyang updated FLINK-33759:
--------------------------------
Description: When we use the flink-parquet format to write a `Map<String, String>[]`
type (which is then read by a Spark job), we encounter an exception:
`org.apache.parquet.io.ParquetEncodingException: empty fields are illegal, the
field should be ommited completely instead`. After reviewing the code, we found
that flink-parquet does not support writing nested array or map types, because
[ArrayWriter|https://github.com/apache/flink/blob/master/flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/row/ParquetRowDataWriter.java#L437]
and
[MapWriter|https://github.com/apache/flink/blob/master/flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/row/ParquetRowDataWriter.java#L391]
do not implement the `public void write(ArrayData arrayData, int ordinal)`
method.
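To illustrate the gap, here is a minimal standalone Java sketch — NOT Flink's actual code. The names `FieldWriter`, `ArrayWriter`, and `MapWriter` merely mirror the classes mentioned in the issue, plain `List`/`Map` stand in for Flink's `ArrayData`/`MapData`, and the hypothetical `writeElement` stands in for the missing `write(ArrayData arrayData, int ordinal)` overload: an element writer that only accepts a top-level row has no entry point an enclosing array writer can call per element.

```java
import java.util.List;
import java.util.Map;

// Standalone sketch only: these toy types mimic the shape of the writers
// described in the issue; they are NOT Flink's real FieldWriter classes.
class NestedWriteSketch {

    /** Two entry points: per-row (existing) and per-array-element (missing). */
    interface FieldWriter {
        void write(List<Object> row, int ordinal);          // existing row path
        void writeElement(List<Object> array, int ordinal); // overload FLINK-33759 asks for
    }

    /** Toy map writer that records what it emitted. */
    static final class MapWriter implements FieldWriter {
        final StringBuilder out = new StringBuilder();

        @Override
        public void write(List<Object> row, int ordinal) {
            out.append(row.get(ordinal)).append(';');
        }

        @Override
        public void writeElement(List<Object> array, int ordinal) {
            out.append(array.get(ordinal)).append(';');
        }
    }

    /** A nested-array writer must hand each element to its element writer. */
    static final class ArrayWriter {
        private final FieldWriter elementWriter;

        ArrayWriter(FieldWriter elementWriter) {
            this.elementWriter = elementWriter;
        }

        @SuppressWarnings("unchecked")
        void write(List<Object> row, int ordinal) {
            List<Object> array = (List<Object>) row.get(ordinal);
            for (int i = 0; i < array.size(); i++) {
                // Without an element-level overload on FieldWriter, this call
                // has no valid target — the gap the issue reports.
                elementWriter.writeElement(array, i);
            }
        }
    }

    public static void main(String[] args) {
        MapWriter mapWriter = new MapWriter();
        ArrayWriter arrayWriter = new ArrayWriter(mapWriter);
        // A row whose single column models Map<String, String>[].
        List<Object> row = List.of(List.of(Map.of("k", "v"), Map.of("a", "b")));
        arrayWriter.write(row, 0);
        System.out.println(mapWriter.out); // prints: {k=v};{a=b};
    }
}
```

In Flink's real writer hierarchy the element writer additionally has to interact with Parquet's `RecordConsumer` repetition/definition protocol, which this sketch deliberately omits.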
> flink parquet writer support write nested array or map type
> -----------------------------------------------------------
>
> Key: FLINK-33759
> URL: https://issues.apache.org/jira/browse/FLINK-33759
> Project: Flink
> Issue Type: Improvement
> Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
> Reporter: Cai Liuyang
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)