mslapek commented on PR #5745:
URL: 
https://github.com/apache/arrow-datafusion/pull/5745#issuecomment-1489737150

   @mingmwang Thank you for the thorough explanation! 🙂 It makes perfect sense.
   
   Yeah. Union is only a concatenated list of partitions... by itself it has no 
performance impact - that's what I'd forgotten! 😬
   
   So the only performance question is: when can `RepartitionExec(4)` be avoided? 
When all branches have equally-sized partitions, and the number of partitions 
allows for sensible parallelization, the reshuffle could be avoided.
   
   But:
   1. it is not specific to the union, it's a general issue
   2. it would require data from some Metastore
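
To make the condition above concrete, here is a rough sketch of the check I have in mind. Everything in it is hypothetical: `can_skip_repartition`, `target_partitions` and `max_skew` are illustrative names, not DataFusion APIs, and the per-partition row counts are assumed to come from an external source such as a Metastore. "Equally-sized" is approximated as a bounded max/min ratio, and "sensible parallelization" as having at least `target_partitions` partitions; both are my assumptions.

```rust
/// Hypothetical helper, NOT part of DataFusion: decides whether the reshuffle
/// above a union could be skipped, given per-partition row counts for each
/// union branch (assumed to be supplied by some external statistics source).
fn can_skip_repartition(
    branch_partition_rows: &[Vec<u64>],
    target_partitions: usize,
    max_skew: f64,
) -> bool {
    // A union only concatenates the partitions of its inputs,
    // so the output partitions are simply all input partitions.
    let sizes: Vec<u64> = branch_partition_rows.iter().flatten().copied().collect();

    // Too few partitions: there is not enough parallelism without a reshuffle.
    if sizes.is_empty() || sizes.len() < target_partitions {
        return false;
    }

    let min = *sizes.iter().min().unwrap();
    let max = *sizes.iter().max().unwrap();

    // An empty partition next to non-empty ones counts as skewed
    // (an assumption of this sketch; all-empty inputs also land here).
    if min == 0 {
        return false;
    }

    // "Equally-sized" is approximated as a bounded max/min ratio.
    (max as f64) / (min as f64) <= max_skew
}
```

With a target of 4 partitions and `max_skew = 1.5`, two branches with row counts `[100, 110]` and `[95, 105]` would pass the check, while `[1000, 10]` and `[95, 105]` would not, and a single branch with only two partitions would not either.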
   
   ---
   
   To sum up, it looks like this PR should be closed without merging. 😭
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
