viirya commented on PR #8991:
URL: 
https://github.com/apache/arrow-datafusion/pull/8991#issuecomment-1915602228

   > wouldn't it actually make more sense to compute the expressions prior to the networked shuffle so only 2 columns of data (`lcol_1 + lcol_2` and `rcol_1 + rcol_2`) need to be sent, rather than the 4 original columns 🤔
   
   Hmm, apart from the join keys, I think you can still list other columns 
(e.g., the original 4 columns) in the selection list? So they can't always 
be removed from the shuffle, I think?
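
To make the point concrete, here is a rough Python sketch of the bookkeeping. It is not DataFusion code: the helper `columns_sent_through_shuffle` and the derived column name `join_key` are made up for illustration, and it only counts which columns one side of the join would have to ship, assuming the join key is an expression over `lcol_1` and `lcol_2`.

```python
def columns_sent_through_shuffle(side_columns, key_columns, selected_columns, precompute_key):
    """Return the columns one join side must send through the shuffle.

    Hypothetical helper for illustration only; not part of DataFusion.
    """
    # Original columns referenced by the selection list must always be shipped.
    needed = [c for c in side_columns if c in selected_columns]
    if precompute_key:
        # Key expression evaluated before the shuffle: ship one derived column.
        needed.append("join_key")
    else:
        # Key expression evaluated after the shuffle: ship its inputs as well.
        needed += [c for c in key_columns if c not in needed]
    return needed


left = ["lcol_1", "lcol_2"]

# Selection list uses none of the original columns: only the key is shipped.
only_key = columns_sent_through_shuffle(left, left, set(), precompute_key=True)

# Selection list still names lcol_1 and lcol_2: they cannot be dropped.
with_select = columns_sent_through_shuffle(left, left, {"lcol_1", "lcol_2"}, precompute_key=True)
without_precompute = columns_sent_through_shuffle(left, left, {"lcol_1", "lcol_2"}, precompute_key=False)

print(only_key)            # ['join_key']
print(with_select)         # ['lcol_1', 'lcol_2', 'join_key']
print(without_precompute)  # ['lcol_1', 'lcol_2']
```

Under these assumptions, precomputing the key saves a column only when the selection list does not also reference the original columns; otherwise the derived key is sent in addition to them.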

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
