cloud-fan commented on pull request #31941:
URL: https://github.com/apache/spark/pull/31941#issuecomment-807018185


   > I think it is better to make use of the AQE framework to reuse the 
broadcast exchange or newQueryStage.
   
   I agree, and I think this PR does it?
   
   When planning the DPP filter, the broadcast plan can be in one of two 
states:
   1. It's already submitted as a query stage, which means it's available in 
the stage cache. Whether it's running or completed, we will create a 
`ReusedQueryStage` for the DPP filter.
   2. It's not submitted yet and not available in the stage cache. We should 
create a fresh `QueryStage` for the DPP filter and put it in the stage cache, 
so that the AQE framework can reuse it later.
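
The two cases above can be sketched roughly as follows. This is a toy model in Python, not Spark's actual code: `plan_dpp_filter`, the dict-based `stage_cache`, and the string `plan_key` are hypothetical names, and the two dataclasses only stand in for the real stage nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryStage:
    """Stand-in for a freshly created query stage (hypothetical model)."""
    plan_key: str


@dataclass(frozen=True)
class ReusedQueryStage:
    """Stand-in for a node that points at an already registered stage."""
    original: QueryStage


def plan_dpp_filter(stage_cache: dict, plan_key: str):
    """Pick the stage for the DPP filter's broadcast plan.

    stage_cache maps a plan key to the stage registered for it.
    """
    existing = stage_cache.get(plan_key)
    if existing is not None:
        # Case 1: already submitted (running or completed), so reuse it.
        return ReusedQueryStage(existing)
    # Case 2: not submitted yet. Create a fresh stage and register it so
    # that a later lookup for the same plan can reuse it.
    fresh = QueryStage(plan_key)
    stage_cache[plan_key] = fresh
    return fresh
```

Calling `plan_dpp_filter` twice with the same key on one cache returns a `QueryStage` the first time and a `ReusedQueryStage` wrapping it the second time. The sketch is single-threaded and ignores concurrent access.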
   
   Case 2 is a bit tricky because of a possible race condition: the DPP filter 
and the AQE framework may both try to create a fresh query stage for the same 
plan at the same time. We should double-check that this is handled.
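
One generic way to close this kind of race is to make the lookup and the registration a single atomic step, so only one caller ever creates the stage. The sketch below is a toy Python model under that assumption; `StageCache`, `get_or_create`, and `race` are hypothetical names, and it does not claim to show how this PR or the AQE framework handles it.

```python
import threading


class StageCache:
    """Toy stage cache whose check-then-create step is atomic (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._stages = {}

    def get_or_create(self, plan_key, create_stage):
        """Return (stage, created); created is True for exactly one caller."""
        with self._lock:
            stage = self._stages.get(plan_key)
            if stage is not None:
                return stage, False
            stage = create_stage(plan_key)
            self._stages[plan_key] = stage
            return stage, True


def race(cache, plan_key, n_threads=8):
    """Let n_threads callers request the same plan at the same moment."""
    results = []
    results_lock = threading.Lock()
    barrier = threading.Barrier(n_threads)

    def worker():
        barrier.wait()  # release all callers together to provoke the race
        outcome = cache.get_or_create(plan_key, lambda key: object())
        with results_lock:
            results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With this shape, the one caller that gets `created == True` would own the fresh stage, and every other caller would wrap the returned stage as a reused one.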


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



