[ https://issues.apache.org/jira/browse/ARROW-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17499585#comment-17499585 ]
Joris Van den Bossche commented on ARROW-15790:
-----------------------------------------------
Currently, if you want to preserve Arrow's field metadata in a Parquet
roundtrip, you need to set the {{parquet::ArrowWriterProperties::store_schema}}
option to true. This stores a serialized Arrow schema in the Parquet
FileMetaData, and the reader restores the field metadata from it.

For actual support for reading and writing field / column-level metadata in
Parquet itself, see ARROW-15548.
> field's metadata is not write into Parquet file
> -----------------------------------------------
>
> Key: ARROW-15790
> URL: https://issues.apache.org/jira/browse/ARROW-15790
> Project: Apache Arrow
> Issue Type: Bug
> Environment: Ubuntu
> Reporter: Sifang Li
> Priority: Blocker
>
> I used this code to test writing metadata to a Parquet file and reading it
> back:
> [https://gist.github.com/dantrim/33f9f14d0b2d3ec45c022aa05f7a45ee]
>
> The generated file has no metadata when I read it back with the code below
> and print the schema:
>
> {quote}std::shared_ptr<arrow::io::ReadableFile> infile;
> PARQUET_ASSIGN_OR_THROW(infile,
>     arrow::io::ReadableFile::Open("./test.parquet",
>                                   arrow::default_memory_pool()));
> std::unique_ptr<parquet::arrow::FileReader> reader;
> PARQUET_THROW_NOT_OK(
>     parquet::arrow::OpenFile(infile, arrow::default_memory_pool(), &reader));
> std::shared_ptr<arrow::Table> table;
> PARQUET_THROW_NOT_OK(reader->ReadTable(&table));
> EXPECT_EQ(frameCount, table->num_rows());
> // no meta shown
> std::cout << "===" << table->schema()->ToString(true) << std::endl;{quote}
> Here is the version info:
> libparquet-dev/focal,now 7.0.0-1 amd64 [installed]
> libparquet-glib-dev/focal,now 7.0.0-1 amd64 [installed]
> libparquet-glib700/focal,now 7.0.0-1 amd64 [installed,automatic]
> libparquet700/focal,now 7.0.0-1 amd64 [installed,automatic]