kangakum36 opened a new pull request, #46638: URL: https://github.com/apache/arrow/pull/46638
### Rationale for this change The documentation for [pyarrow.Table.combine_chunks](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.combine_chunks) and [Table::CombineChunks](https://arrow.apache.org/docs/cpp/api/table.html#_CPPv4NK5arrow5Table13CombineChunksEP10MemoryPool) states: All the underlying chunks in the ChunkedArray of each column are concatenated into zero or one chunk. However, [this comment](https://github.com/apache/arrow/blob/d7015bd6e610b6cd6752f6cd543509bd5f8853ff/cpp/src/arrow/table.cc#L567) indicates that binary columns can be combined into multiple chunks. Multiple chunks are produced when combining into one chunk would result in a buffer overflow. A reproducible example is [here](https://github.com/apache/arrow/issues/46633#issuecomment-2918122485). ### What changes are included in this PR? Change `Table::CombineChunks` and `pyarrow.Table.combine_chunks` documentation to specify that binary columns can be combined into multiple chunks. ### Are these changes tested? No, they are only documentation changes. ### Are there any user-facing changes? Yes, documentation changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
