Kontinuation opened a new pull request, #1605:
URL: https://github.com/apache/datafusion-comet/pull/1605

   ## Which issue does this PR close?
   
   Closes #1589.
   
   ## Rationale for this change
   
   `CometBroadcastExchangeExec` did not implement the `outputPartitioning` 
method, which prevented `CometBroadcastExchangeExec` from being generated 
correctly during AQE optimization. This patch fixes the problem so that AQE 
can optimize shuffled equi-joins into `CometBroadcastHashJoin`.
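   
   To illustrate why a missing `outputPartitioning` matters, here is a minimal,
   self-contained sketch. It is a toy model, not the Spark or Comet classes:
   `Plan`, `ExchangeWithoutOverride`, `ExchangeWithOverride` and
   `satisfiesBroadcast` are hypothetical names. It only mirrors the fact that
   Spark's `SparkPlan` defaults to `UnknownPartitioning(0)`, while Spark's
   `BroadcastExchangeExec` reports `BroadcastPartitioning(mode)`. How AQE
   actually consumes the partitioning is simplified to a single check here.
   
   ```scala
   object OutputPartitioningSketch {
     // Toy stand-ins for Spark's partitioning types (assumption: simplified).
     sealed trait Partitioning
     final case class UnknownPartitioning(numPartitions: Int) extends Partitioning
     final case class BroadcastPartitioning(mode: String) extends Partitioning

     trait Plan {
       // Mirrors the SparkPlan default: unknown unless an operator overrides it.
       def outputPartitioning: Partitioning = UnknownPartitioning(0)
     }

     // A broadcast exchange that forgets to override outputPartitioning.
     final class ExchangeWithoutOverride(val mode: String) extends Plan

     // A broadcast exchange that reports its partitioning explicitly.
     final class ExchangeWithOverride(val mode: String) extends Plan {
       override def outputPartitioning: Partitioning = BroadcastPartitioning(mode)
     }

     // Hypothetical planner check: does this plan already provide broadcast output?
     def satisfiesBroadcast(plan: Plan, requiredMode: String): Boolean =
       plan.outputPartitioning match {
         case BroadcastPartitioning(mode) => mode == requiredMode
         case _                           => false
       }
   }
   ```
   
   In this model, only the exchange that overrides `outputPartitioning` is
   recognized as already producing broadcast output.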
   
   ## What changes are included in this PR?
   
   This PR contains 2 fixes to make AQE broadcast join optimization work 
correctly for Comet:
   
   1. Implement the `outputPartitioning` method of `CometBroadcastExchangeExec`, 
and fix other places that prevented the AQE optimization from happening.
   2. Implement the `doExecuteBroadcast` method of `CometColumnarToRowExec`. The 
parent of `CometBroadcastExchangeExec` may change to Spark's 
`BroadcastHashJoinExec` during AQE optimization. This requires inserting a 
`CometColumnarToRowExec` above `CometBroadcastExchangeExec` so that the data 
is broadcast as rows instead of column batches.
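   
   The columnar-to-row step in item 2 can be pictured with the following
   sketch. It is illustrative only and does not reflect how
   `CometColumnarToRowExec.doExecuteBroadcast` is implemented: `ColumnBatch`,
   `Row`, `toRows` and `broadcastAsRows` are hypothetical names, and the
   "broadcast" is reduced to collecting the converted rows in one place.
   
   ```scala
   object ColumnarToRowSketch {
     // Hypothetical column batch: one vector per column, all of equal length.
     final case class ColumnBatch(columns: Vector[Vector[Any]]) {
       def numRows: Int = columns.headOption.map(_.length).getOrElse(0)
     }

     // Hypothetical row: one value per column.
     type Row = Vector[Any]

     // Transpose a batch: row i takes the i-th value of every column.
     def toRows(batch: ColumnBatch): Vector[Row] =
       Vector.tabulate(batch.numRows)(i => batch.columns.map(_(i)))

     // Convert every batch to rows so that a row-based consumer can read the result.
     def broadcastAsRows(batches: Seq[ColumnBatch]): Vector[Row] =
       batches.toVector.flatMap(toRows)
   }
   ```
   
   A row-based parent operator can then read the result without knowing about
   column batches, which is the role the inserted node plays in this model.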
   
   ## How are these changes tested?
   
   1. Added unit tests
   2. Tested using TPC-H SF=100 locally
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

