pyarrow.ChunkedArray.combine_chunks is a method which is documented as "Flatten this ChunkedArray into a single non-chunked array."
Incidentally, it happens to *always* copy the underlying chunk data - even if the ChunkedArray is composed of just a single contiguous chunk which could be returned directly.

That has a major performance impact for my particular application, which calls `combine_chunks` on all ChunkedArrays to compact them. When there is one chunk, this copy is unnecessary, but my application spends about 5% to 15% of its total runtime just on these copies! A workaround is trivial to implement, but this seems like an unnecessary footgun.

However, the point has been raised that perhaps the incidental copy that `combine_chunks` does is actually part of its API, since users might depend on that copy. This was brought up in a PR [0] and an issue [1].

My discussion topic: is this side effect part of the `combine_chunks` API? If it is, I think it should be documented as such, opening the space for a new method which avoids the unnecessary copy. If not, I think we should improve its performance.

---

[0]: "Optimize combine_chunks when there is only one chunk" https://github.com/apache/arrow/pull/37319
[1]: "Concatenating a single array is a compaction utility" https://github.com/apache/arrow/issues/37878