[
https://issues.apache.org/jira/browse/ARROW-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wes McKinney closed ARROW-7952.
-------------------------------
Resolution: Not A Problem
I started investigating and answered my own question; see
https://github.com/apache/arrow/commit/4fe330aa4ed4564c9502733e25fc2b762e1002bf
The base64 encoding of the metadata was implemented after this file was
generated. The non-base64-encoded version was never released, so this old
file should simply be overwritten.
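
For anyone hitting the same error with another old file, here is a minimal
sketch of how one might check whether a stored metadata value is base64 text
or raw bytes. It uses only the Python standard library. The helper name
decode_if_base64 and the sample bytes are hypothetical, and the assumption
that the serialized schema sits under the "ARROW:schema" key of the Parquet
key-value metadata is mine, not something stated in this issue.

```python
import base64
import binascii


def decode_if_base64(value: bytes):
    """Return the decoded bytes if `value` is strictly valid base64, else None.

    Hypothetical helper: `value` is assumed to be the raw bytes stored under
    the schema key of the Parquet key-value metadata.
    """
    try:
        # validate=True rejects any byte outside the base64 alphabet, so a
        # raw binary payload fails here instead of being silently truncated.
        return base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return None


# Made-up stand-in for a raw (non-base64) serialized schema payload.
raw_payload = b"\xff\xff\xff\xff\x10\x00\x00\x00schema"
encoded_payload = base64.b64encode(raw_payload)

assert decode_if_base64(encoded_payload) == raw_payload
assert decode_if_base64(raw_payload) is None
```

A None result would suggest the value was written before the base64 change,
in which case regenerating the file is the simplest fix. Note that an empty
value decodes to empty bytes rather than None, so that case needs its own
check.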
> [C++][Parquet] Error when failing to read original Arrow schema from Parquet
> metadata
> -------------------------------------------------------------------------------------
>
> Key: ARROW-7952
> URL: https://issues.apache.org/jira/browse/ARROW-7952
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, Python
> Reporter: Wes McKinney
> Priority: Major
>
> I experienced the following failure:
> {code}
> ~/code/arrow/python/pyarrow/_parquet.pyx in pyarrow._parquet.ParquetReader.open()
> ~/code/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
> ArrowInvalid: Tried reading schema message, was null or length 0
> In ../src/parquet/arrow/reader_internal.cc, line 596, code: ::arrow::ipc::ReadSchema(&input, &dict_memo, out)
> In ../src/parquet/arrow/reader_internal.cc, line 672, code: GetOriginSchema(metadata, &manifest->schema_metadata, &manifest->origin_schema)
> {code}
> when reading the following file:
> https://github.com/wesm/vldb-2019-apache-arrow-workshop/raw/1e9cf24bd6b8ae03e419e15ebc78b2e8135b8e7a/fec-2012.parquet
> I don't know whether this file is malformed (it was generated from a
> development version of Arrow), so this may not actually be a problem.
> However, this mode of failure was unexpected, so I would like to understand
> why it happened.