Wes McKinney created ARROW-7952:
-----------------------------------

             Summary: [C++][Parquet] Error when failing to read original Arrow schema from Parquet metadata
                 Key: ARROW-7952
                 URL: https://issues.apache.org/jira/browse/ARROW-7952
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++, Python
            Reporter: Wes McKinney

I experienced the following failure

{code}
~/code/arrow/python/pyarrow/_parquet.pyx in pyarrow._parquet.ParquetReader.open()

~/code/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowInvalid: Tried reading schema message, was null or length 0
In ../src/parquet/arrow/reader_internal.cc, line 596, code: ::arrow::ipc::ReadSchema(&input, &dict_memo, out)
In ../src/parquet/arrow/reader_internal.cc, line 672, code: GetOriginSchema(metadata, &manifest->schema_metadata, &manifest->origin_schema)
{code}

when reading the following file

https://github.com/wesm/vldb-2019-apache-arrow-workshop/raw/1e9cf24bd6b8ae03e419e15ebc78b2e8135b8e7a/fec-2012.parquet

I don't know whether this file is malformed (it was generated from a development version of Arrow), so this may not actually be a problem, but this mode of failure was unexpected and so I would like to understand why it happened.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)