pitrou commented on PR #13938: URL: https://github.com/apache/arrow/pull/13938#issuecomment-1222530533
So, it seems this is a capability that should be preserved. The problem is the new dataset implementation doesn't allow reading the file back:

```python
>>> pq.read_table('file.parquet', use_legacy_dataset=False)
Traceback (most recent call last):
  [...]
ArrowInvalid: Multiple matches for FieldRef.Name(a) in a: int64
a: int64
__fragment_index: int32
__batch_index: int32
__last_in_fragment: bool
__filename: string

>>> pq.read_table('file.parquet', use_legacy_dataset=True)
<ipython-input-12-6eeebe64658f>:1: FutureWarning: Passing 'use_legacy_dataset=True' to get the legacy behaviour is deprecated as of pyarrow 8.0.0, and the legacy implementation will be removed in a future version.
  pq.read_table('file.parquet', use_legacy_dataset=True)
pyarrow.Table
a: int64
a: int64
----
a: [[4,5,6]]
a: [[1,2,3]]
```
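To illustrate why the name-based lookup fails on this schema, here is a minimal pure-Python sketch. It is not Arrow's implementation: `duplicate_names` and `resolve_by_name` are hypothetical helpers that only mimic the idea that a reference by name needs exactly one match, whereas positional access does not.

```python
from collections import Counter


def duplicate_names(names):
    """Return the column names that occur more than once, in first-seen order."""
    counts = Counter(names)
    return [name for name in dict.fromkeys(names) if counts[name] > 1]


def resolve_by_name(names, wanted):
    """Return the index of `wanted`; raise if it matches zero or several columns.

    Hypothetical stand-in for a name-based field reference, not Arrow's code.
    """
    matches = [i for i, name in enumerate(names) if name == wanted]
    if len(matches) != 1:
        raise ValueError(f"{len(matches)} matches for {wanted!r} in {names}")
    return matches[0]


# Same column names as the table in the example above.
names = ["a", "a"]
print(duplicate_names(names))

try:
    resolve_by_name(names, "a")
except ValueError as exc:
    print("lookup by name failed:", exc)

# Positional access stays unambiguous even with duplicated names.
for index, name in enumerate(names):
    print(index, name)
```

A check like `duplicate_names` could be run on a table's column names before writing, to spot files that a name-based reader may refuse to load.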