kangakum36 opened a new pull request, #46638:
URL: https://github.com/apache/arrow/pull/46638

   ### Rationale for this change
   
   The documentation for 
[pyarrow.Table.combine_chunks](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html#pyarrow.Table.combine_chunks)
 and 
[Table::CombineChunks](https://arrow.apache.org/docs/cpp/api/table.html#_CPPv4NK5arrow5Table13CombineChunksEP10MemoryPool)
 states: "All the underlying chunks in the ChunkedArray of each column are 
concatenated into zero or one chunk."
   
   However, [this 
comment](https://github.com/apache/arrow/blob/d7015bd6e610b6cd6752f6cd543509bd5f8853ff/cpp/src/arrow/table.cc#L567)
 indicates that binary columns can be combined into multiple chunks. This 
happens when concatenating everything into one chunk would result in a buffer 
overflow (binary arrays use 32-bit offsets, which limits how much data a 
single chunk can hold).
   
   A reproducible example is 
[here](https://github.com/apache/arrow/issues/46633#issuecomment-2918122485).
   
   ### What changes are included in this PR?
   
   Update the `Table::CombineChunks` and `pyarrow.Table.combine_chunks` 
documentation to state that binary columns can be combined into multiple 
chunks.
   
   ### Are these changes tested?
   
   No, they are only documentation changes.
   
   ### Are there any user-facing changes?
   
   Yes, documentation changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
