ryan-williams commented on issue #39399:
URL: https://github.com/apache/arrow/issues/39399#issuecomment-1873563192
Why do you say the Windows and Ubuntu Parquet files are invalid?
Reading all three with pandas works for me:
```python
import pandas as pd
print(pd.read_parquet('https://github.com/runsascoded/parquet-diff-test/raw/ubuntu/out/pyarrow/snappy/empty.parquet'))
# Empty DataFrame
# Columns: [a]
# Index: []
print(pd.read_parquet('https://github.com/runsascoded/parquet-diff-test/raw/windows/out/pyarrow/snappy/empty.parquet'))
# Empty DataFrame
# Columns: [a]
# Index: []
print(pd.read_parquet('https://github.com/runsascoded/parquet-diff-test/raw/macos/out/pyarrow/snappy/empty.parquet'))
# Empty DataFrame
# Columns: [a]
# Index: []
```
Various tools ([parquet-tools](https://github.com/hangxie/parquet-tools),
[parquet2json](https://github.com/jupiter/parquet2json),
`pyarrow.parquet.ParquetFile.metadata.to_dict()`) have no problem reading
the metadata in each Parquet file; it's just that the bytes are different on
macOS.
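
To narrow down where the bytes differ, here is a minimal stdlib-only sketch that splits a Parquet file into its body and its footer metadata and compares each region separately. It relies only on the documented file layout (a leading `PAR1` magic, then the data, then the footer, a 4-byte little-endian footer length, and a trailing `PAR1`). The helper names `split_parquet` and `compare_parquet_bytes` are hypothetical and not part of pyarrow or any other library.

```python
import struct
from typing import Dict, Tuple

MAGIC = b"PAR1"  # Parquet files start and end with this 4-byte magic


def split_parquet(data: bytes) -> Tuple[bytes, bytes]:
    """Split raw Parquet bytes into (body, footer metadata).

    Hypothetical helper for illustration; it does not decode the footer.
    """
    if len(data) < 12 or data[:4] != MAGIC or data[-4:] != MAGIC:
        raise ValueError("not a Parquet file (missing PAR1 magic)")
    # The 4 bytes before the trailing magic hold the footer length.
    (footer_len,) = struct.unpack("<I", data[-8:-4])
    footer_start = len(data) - 8 - footer_len
    if footer_start < 4:
        raise ValueError("footer length exceeds file size")
    return data[4:footer_start], data[footer_start:-8]


def compare_parquet_bytes(a: bytes, b: bytes) -> Dict[str, bool]:
    """Report which regions of two Parquet files are byte-identical."""
    body_a, footer_a = split_parquet(a)
    body_b, footer_b = split_parquet(b)
    return {
        "body_equal": body_a == body_b,
        "footer_equal": footer_a == footer_b,
    }
```

Assuming the three `empty.parquet` files have been downloaded locally, passing `pathlib.Path(...).read_bytes()` for two of them to `compare_parquet_bytes` should show whether the macOS difference sits in the footer or in the data that precedes it.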