milesgranger opened a new pull request, #13938: URL: https://github.com/apache/arrow/pull/13938
Without this patch, the following is possible:

```python
import pyarrow as pa
import pyarrow.parquet as pq

t = pa.Table.from_pydict({'a': [1, 2, 3]})
t = t.add_column(0, 'a', pa.array([4, 5, 6]))  # Adding column with same field name
pq.write_table(t, 'file.parquet')  # OK
pq.read_table('file.parquet')  # Error
...
ArrowInvalid: Multiple matches for FieldRef.Name(a) in a: int64
a: int64
__fragment_index: int32
__batch_index: int32
__last_in_fragment: bool
__filename: string
```

This patch will prevent `pq.write_table(...)` from writing a table with duplicate field names:

```python
pq.write_table(t, 'file.parquet')
...
ArrowInvalid: Cannot write parquet table with duplicate field names: a
```
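
For anyone on a version without this patch, the duplicate-name check can be approximated on the caller's side before writing. The following is a minimal sketch, not the code in this PR: the helper names `find_duplicate_field_names` and `check_unique_field_names` are hypothetical, and the sketch raises a plain `ValueError` rather than `ArrowInvalid`. It works on a plain list of names so it runs without pyarrow; with pyarrow, the list would come from `t.schema.names`.

```python
from collections import Counter


def find_duplicate_field_names(names):
    """Return the field names that occur more than once, in first-seen order."""
    counts = Counter(names)
    return [name for name in counts if counts[name] > 1]


def check_unique_field_names(names):
    """Raise ValueError if any field name is repeated (hypothetical helper)."""
    duplicates = find_duplicate_field_names(names)
    if duplicates:
        raise ValueError(
            "Cannot write parquet table with duplicate field names: "
            + ", ".join(duplicates)
        )


# Assumption: with pyarrow installed, pass `t.schema.names` here instead
# of a literal list, then call pq.write_table(t, ...) only if this passes.
check_unique_field_names(["a", "b"])  # unique names, no exception
```

Calling `check_unique_field_names(["a", "a"])` raises a `ValueError` naming `a`, which mirrors the message this patch adds to `pq.write_table(...)`.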