Dandandan commented on issue #523: URL: https://github.com/apache/arrow-datafusion/issues/523#issuecomment-856483448
> I agree that Python should just merge them in the tests. I was a bit surprised that we split the data even with such a small number of entries, which seems odd to me. There could be heuristics / optimizations that skip partitioning for small datasets (when the size is known upfront). For example, with a hash join, skipping it can be beneficial when the left side is very small compared to the right side: hash partitioning the right side could then be slower than building the left side in a single thread / worker.
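
A rough sketch of the kind of heuristic described in the comment, assuming row-count estimates for both join inputs are available at planning time. Everything here is hypothetical and for illustration only: `JoinMode`, `choose_join_mode`, and the `small_threshold` and `ratio` cutoffs are made-up names and values, not DataFusion APIs or tuned defaults.

```python
from enum import Enum
from typing import Optional


class JoinMode(Enum):
    """Hypothetical execution strategies for a hash join."""
    COLLECT_LEFT = "collect_left"  # build the left side once, in a single thread / worker
    PARTITIONED = "partitioned"    # hash partition both sides before joining


def choose_join_mode(
    left_rows: Optional[int],
    right_rows: Optional[int],
    small_threshold: int = 10_000,  # hypothetical cutoff for "small"
    ratio: int = 100,               # hypothetical "very small compared to" factor
) -> JoinMode:
    """Pick a join strategy from row-count estimates (None means unknown)."""
    # Sizes not known upfront: keep the partitioned plan.
    if left_rows is None or right_rows is None:
        return JoinMode.PARTITIONED

    # The whole dataset is small: partitioning is unlikely to pay off.
    if left_rows + right_rows <= small_threshold:
        return JoinMode.COLLECT_LEFT

    # The left side is small in absolute terms and much smaller than the
    # right side: building it once may beat hash partitioning the right side.
    if left_rows <= small_threshold and left_rows * ratio <= right_rows:
        return JoinMode.COLLECT_LEFT

    return JoinMode.PARTITIONED
```

For instance, `choose_join_mode(1_000, 10_000_000)` selects `JoinMode.COLLECT_LEFT`, while `choose_join_mode(None, 1_000)` stays with `JoinMode.PARTITIONED` because the left size is unknown. Real cutoffs would have to come from benchmarks.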
