pitrou commented on PR #41700: URL: https://github.com/apache/arrow/pull/41700#issuecomment-2299076111
Note that there's something inherently suboptimal in this PR: we're trading the concatenation of the chunked values (essentially allocating a new values array) against the resolution of many chunked indices (essentially allocating _two_ new indices arrays). This is only beneficial if the value width is quite large (say a 256-byte FSB) _or_ the number of indices is much smaller than the number of values. In the end, perhaps we want to guard this with a heuristic based on total byte size of values and length of indices.
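To make the trade-off concrete, here is a minimal sketch of what such a guard could look like. It is not code from the PR: the function name `prefer_index_resolution`, the 8-byte index width, and the plain "compare allocated bytes" cost model are all assumptions made for illustration.

```python
def prefer_index_resolution(values_total_bytes: int,
                            num_indices: int,
                            index_width_bytes: int = 8) -> bool:
    """Hypothetical guard: True if resolving the chunked indices is
    estimated to allocate less memory than concatenating the values.

    Assumed cost model (not taken from the PR):
      - concatenation allocates one new values array of
        `values_total_bytes` bytes;
      - resolution allocates two new index arrays, each holding
        `num_indices` entries of `index_width_bytes` bytes.
    """
    concat_bytes = values_total_bytes
    resolve_bytes = 2 * num_indices * index_width_bytes
    return resolve_bytes < concat_bytes


# Wide values: 1000 values of a 256-byte FSB, 1000 indices.
wide_fsb = prefer_index_resolution(256 * 1000, 1000)

# Narrow values: 1000 int32 values, 1000 indices.
narrow_same_len = prefer_index_resolution(4 * 1000, 1000)

# Narrow values, but far fewer indices than values.
narrow_few_indices = prefer_index_resolution(4 * 1_000_000, 1000)
```

Under these assumptions the guard picks index resolution for the wide FSB case and for the case with few indices, and falls back to concatenation when narrow values are taken with about as many indices as values. A real heuristic would probably need a tuned threshold rather than a strict byte comparison.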
