Nishanth created ARROW-17452:
--------------------------------
Summary: Arrow-Parquet c++ errors out OSError: Malformed levels
min: 3 max: 3 out of range. Max Level: 2
Key: ARROW-17452
URL: https://issues.apache.org/jira/browse/ARROW-17452
Project: Apache Arrow
Issue Type: Bug
Components: C++
Affects Versions: 9.0.0
Reporter: Nishanth
Attachments: athena_struct.gz.parquet
The current Arrow Parquet C++ reader fails on some files with the following error:
{code:java}
OSError: Malformed levels min: 3 max: 3 out of range. Max Level: 2{code}
The failure shows up mainly in Parquet columns with nested data structures.
The exception comes from a check that the minimum and maximum levels in the
data stay within the maximum level defined for the column:
[https://github.com/apache/arrow/blob/master/cpp/src/parquet/column_reader.cc#L177|http://example.com/]
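A simplified sketch of such a range check is shown below, written in Python for brevity. It is not the code in column_reader.cc: the function name `check_levels` and the representation of levels as a list of integers are assumptions made for illustration, and the message format is copied from the error above.

```python
def check_levels(levels, max_level):
    """Raise OSError if any level falls outside [0, max_level].

    Hypothetical helper: `levels` stands in for the definition or
    repetition levels read for a column, `max_level` for the maximum
    level that the column's schema allows.
    """
    if not levels:
        return  # nothing to validate
    lo, hi = min(levels), max(levels)
    if lo < 0 or hi > max_level:
        raise OSError(
            f"Malformed levels min: {lo} max: {hi} out of range. "
            f"Max Level: {max_level}"
        )
```

Calling `check_levels([3], 2)` raises an error with the same text as the one reported here, while `check_levels([0, 1, 2], 2)` passes silently.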
The Parquet files were created in Athena with the following statements and
read with the Arrow Parquet C++ reader.
{code:sql}
create table struct_athena (int1 int, struct1 struct<field1: string, field2: string>)
LOCATION 's3://'
TBLPROPERTIES (
'table_type'='ICEBERG',
'format'='parquet'
);
insert into struct_athena VALUES (1, (CAST(ROW('one', 'two') AS ROW(field1 varchar, field2 varchar))));
{code}
The generated Parquet file (athena_struct.gz.parquet) is attached to this issue.
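For context, the maximum definition level of a Parquet column is the number of optional or repeated fields on the path from the schema root to the leaf. The sketch below computes it for `struct1.field1`, assuming (this is not stated in the report) that Athena writes both the struct and its fields as optional. `Node` and `max_definition_level` are hypothetical helpers for illustration, not Arrow or Parquet APIs.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One field on the path from the schema root to a leaf column."""
    name: str
    repetition: str  # "required", "optional", or "repeated"


def max_definition_level(path):
    """Count the fields on the path that are not required."""
    return sum(1 for node in path if node.repetition != "required")


# Assumed schema path for struct1.field1; both fields taken as optional.
field1_path = [Node("struct1", "optional"), Node("field1", "optional")]
expected_max = max_definition_level(field1_path)
```

Under that assumption `expected_max` is 2, which would be consistent with the `Max Level: 2` in the error. A level of 3 in the file is then outside the range the reader derives from the schema.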