Krzysztof Chmielewski created FLINK-31202:
---------------------------------------------
Summary: Add support for reading Parquet files containing Arrays
with complex types.
Key: FLINK-31202
URL: https://issues.apache.org/jira/browse/FLINK-31202
Project: Flink
Issue Type: New Feature
Affects Versions: 1.16.1, 1.16.0, 1.17.0, 1.16.2, 1.17.1
Reporter: Krzysztof Chmielewski
Reading complex types from Parquet has been possible since Flink 1.16, after
https://issues.apache.org/jira/browse/FLINK-24614 was implemented.
However, that implementation lacks support for reading arrays whose elements
are themselves complex types, such as
* Array<Array>
* Array<Map>
* Array<Row>
This ticket is about adding support for reading the types listed above from
Parquet files.
Currently, when trying to read a Parquet file containing a column of such a type,
the exception below is thrown:
{code:java}
Caused by: java.lang.RuntimeException: Unsupported type in the list: ROW<`f1` INT>
    at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readPrimitiveTypedRow(ArrayColumnReader.java:175)
    at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.fetchNextValue(ArrayColumnReader.java:113)
    at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readToVector(ArrayColumnReader.java:81)
{code}
OR:
{code:java}
Caused by: java.lang.RuntimeException: Unsupported type in the list: ARRAY<INT>
    at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readPrimitiveTypedRow(ArrayColumnReader.java:175)
    at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.fetchNextValue(ArrayColumnReader.java:113)
    at org.apache.flink.formats.parquet.vector.reader.ArrayColumnReader.readToVector(ArrayColumnReader.java:81)
{code}
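To illustrate the failure mode, here is a minimal, self-contained sketch of the difference between an array reader that accepts only primitive elements and one that recurses into nested element types. It is a toy model, not Flink code: the names NestedArraySketch, Kind, Type, describe, readFlat and readNested are all hypothetical, and the real ArrayColumnReader may be structured quite differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model for illustration only; none of these are Flink classes.
class NestedArraySketch {

    /** Kinds of logical types in this toy model; INT stands in for any primitive. */
    enum Kind { INT, ARRAY, MAP, ROW }

    /** A type is a kind plus its child types (no children for a primitive). */
    static final class Type {
        final Kind kind;
        final List<Type> children;

        Type(Kind kind, Type... children) {
            this.kind = kind;
            this.children = Arrays.asList(children);
        }

        boolean isPrimitive() {
            return children.isEmpty();
        }
    }

    /** Renders a type recursively, e.g. ARRAY<ROW<INT>>. */
    static String describe(Type type) {
        if (type.isPrimitive()) {
            return type.kind.name();
        }
        List<String> parts = new ArrayList<>();
        for (Type child : type.children) {
            parts.add(describe(child));
        }
        return type.kind.name() + "<" + String.join(", ", parts) + ">";
    }

    /** Models the reported behaviour: any non-primitive array element is rejected. */
    static int readFlat(Type arrayType) {
        Type element = arrayType.children.get(0);
        if (!element.isPrimitive()) {
            throw new RuntimeException("Unsupported type in the list: " + describe(element));
        }
        return 1; // one primitive leaf column to read
    }

    /** Models the requested behaviour: recurse and count the primitive leaves to read. */
    static int readNested(Type type) {
        if (type.isPrimitive()) {
            return 1;
        }
        int leaves = 0;
        for (Type child : type.children) {
            leaves += readNested(child);
        }
        return leaves;
    }

    public static void main(String[] args) {
        Type intType = new Type(Kind.INT);
        Type arrayOfArray = new Type(Kind.ARRAY, new Type(Kind.ARRAY, intType));

        System.out.println(describe(arrayOfArray) + " leaves: " + readNested(arrayOfArray));
        try {
            readFlat(arrayOfArray);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch readFlat fails for Array<Array>, Array<Map> and Array<Row> alike, because it inspects only the immediate element type, whereas readNested handles all three by descending until it reaches primitive leaves.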
--
This message was sent by Atlassian Jira
(v8.20.10#820010)