Dandandan edited a comment on pull request #9036: URL: https://github.com/apache/arrow/pull/9036#issuecomment-752122091
I think part of a further speedup could come from moving the construction of the left / build-side `Vec<&ArrayData>` out of `build_batch_from_indices`, so that it is created only once instead of once per right batch. Currently, when the batch size is made smaller, the build-side Vec is built more times and also contains more (smaller) batches itself, which could explain (part of) the big / exponential slowdown on smaller batches.
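
To illustrate the idea, here is a minimal sketch in plain Rust. It does not use the real arrow crate: `ArrayData` and `Batch` are simplified stand-ins, and `take_rebuilding`, `take_with`, and `demo` are hypothetical names, not the functions in the PR. It only shows the difference between collecting the build-side references on every call and collecting them once up front.

```rust
// Hypothetical stand-in for arrow's `ArrayData`; it only holds i32 values.
struct ArrayData {
    values: Vec<i32>,
}

// Hypothetical stand-in for a build-side record batch with a single column.
struct Batch {
    column: ArrayData,
}

/// Gathers values through prebuilt references.
/// Each index is a (build batch, row) pair.
fn take_with(arrays: &[&ArrayData], indices: &[(usize, usize)]) -> Vec<i32> {
    indices.iter().map(|&(b, r)| arrays[b].values[r]).collect()
}

/// Per-call variant: collects the build-side references again on every
/// invocation, i.e. once per right (probe) batch.
fn take_rebuilding(build: &[Batch], indices: &[(usize, usize)]) -> Vec<i32> {
    let arrays: Vec<&ArrayData> = build.iter().map(|b| &b.column).collect();
    take_with(&arrays, indices)
}

/// Hoisted variant: collects the references once, then reuses them for
/// every probe batch.
fn demo() -> Vec<Vec<i32>> {
    let build = vec![
        Batch { column: ArrayData { values: vec![10, 11] } },
        Batch { column: ArrayData { values: vec![20, 21] } },
    ];
    // Indices produced by two probe batches (made-up data).
    let probe_batches = vec![vec![(0, 1), (1, 0)], vec![(1, 1)]];

    let arrays: Vec<&ArrayData> = build.iter().map(|b| &b.column).collect();
    probe_batches
        .iter()
        .map(|indices| take_with(&arrays, indices))
        .collect()
}
```

In this sketch the per-call variant does work proportional to the number of build batches for every probe batch. If both sides are split with the same batch size, shrinking the batch size by a factor of k would multiply that overhead by roughly k², which may account for at least part of the slowdown described above.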
