Correct me if I'm wrong, but the writer version here refers to the data page version [1], which is recorded in the data page header [2] and is distinct from the format version [3].
The format version can be obtained directly via the show metadata data command suggested by Micah. Although parquet-mr has a command to print page metadata [4], unfortunately it does not print the data page version. The CLI may need extra work to print it.

[1] https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/README.md?plain=1#L130
[2] https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L668
[3] https://github.com/apache/parquet-format/blob/master/CHANGES.md
[4] https://github.com/apache/parquet-mr/blob/master/parquet-cli/README.md?plain=1#L84

Best,
Gang

On Sun, Apr 23, 2023 at 5:02 AM Micah Kornfield <[email protected]> wrote:

> I'm not familiar with it, but I would think the show metadata data command
> would work to get general metadata. Please note the version field is not
> entirely helpful, as some implementations always hard-code it to a certain
> value. The application/created by field is generally a better way to
> determine the writer.
>
> Another way of doing this is with pyarrow [1]
>
> [1]
> https://arrow.apache.org/docs/python/generated/pyarrow.parquet.read_metadata.html
>
> On Thu, Apr 20, 2023 at 6:42 AM Simhadri G <[email protected]> wrote:
>
> > Hi everyone,
> >
> > I have a question regarding WRITER_VERSION = “parquet.writer.version”.
> >
> > I understand that the writer version can have one of the following 2
> > values [1]:
> >
> > PARQUET_1_0 ("v1"),
> > PARQUET_2_0 ("v2");
> >
> > I currently have a parquet file and I would like to determine the parquet
> > writer version used to write this file. I have tried to obtain the
> > metadata/dump using parquet-tools, but unfortunately, this did not include
> > the information I needed.
> >
> > Therefore, I would be most grateful if someone could please help me out by
> > advising where I can find the writer version information. Thank you very
> > much for your time and assistance.
> >
> > Thanks,
> > Simhadri G
> >
> > [1]
> > https://github.com/apache/parquet-mr/blob/master/parquet-column/src/main/java/org/apache/parquet/column/ParquetProperties.java#L69
