ryan-williams commented on issue #39399:
URL: https://github.com/apache/arrow/issues/39399#issuecomment-1873563192

   Why do you say the Windows and Ubuntu Parquet files are invalid?
   
   This works for me:
   ```python
   import pandas as pd
   
   print(pd.read_parquet('https://github.com/runsascoded/parquet-diff-test/raw/ubuntu/out/pyarrow/snappy/empty.parquet'))
   # Empty DataFrame
   # Columns: [a]
   # Index: []
   
   print(pd.read_parquet('https://github.com/runsascoded/parquet-diff-test/raw/windows/out/pyarrow/snappy/empty.parquet'))
   # Empty DataFrame
   # Columns: [a]
   # Index: []
   
   print(pd.read_parquet('https://github.com/runsascoded/parquet-diff-test/raw/macos/out/pyarrow/snappy/empty.parquet'))
   # Empty DataFrame
   # Columns: [a]
   # Index: []
   ```
   
   Various tools ([parquet-tools](https://github.com/hangxie/parquet-tools),
   [parquet2json](https://github.com/jupiter/parquet2json),
   `pyarrow.parquet.ParquetFile.metadata.to_dict()`) have no problem reading
   the metadata in each Parquet file. All three files are valid; the only
   difference is that the macOS file's bytes differ from the other two.
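
One way to separate "the bytes differ" from "the file is invalid" is to compare a hash of the raw bytes with a summary of the footer metadata for each file. The sketch below is only an illustration: the helper names `fetch`, `byte_digest`, `logical_summary`, and `compare` are hypothetical, and it assumes `pyarrow` is installed and the URLs above are reachable.

```python
import hashlib
import io
import urllib.request


def fetch(url: str) -> bytes:
    """Download a file and return its raw bytes (hypothetical helper)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def byte_digest(data: bytes) -> str:
    """SHA-256 hex digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def logical_summary(data: bytes) -> dict:
    """Row counts and schema read from the Parquet footer."""
    # Imported lazily so byte_digest stays usable without pyarrow.
    import pyarrow.parquet as pq

    pf = pq.ParquetFile(io.BytesIO(data))
    schema = pf.schema_arrow
    return {
        "num_rows": pf.metadata.num_rows,
        "num_row_groups": pf.metadata.num_row_groups,
        "columns": schema.names,
        "types": [str(t) for t in schema.types],
    }


def compare(files: dict) -> None:
    """Print a short byte digest and the logical summary for each named file."""
    for name, data in files.items():
        print(name, byte_digest(data)[:12], logical_summary(data))
```

Calling `compare({"ubuntu": fetch(ubuntu_url), "windows": fetch(windows_url), "macos": fetch(macos_url)})` with the three URLs from the snippet above prints one line per file. If the summaries match while a digest differs, the files hold the same logical content in different byte encodings.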

