zhongyu09 commented on pull request #31084:
URL: https://github.com/apache/spark/pull/31084#issuecomment-757995580


   > I reverted it for now for RC preparation. Let's make a PR clarifying 
which cases it doesn't cover, and why this is a partial fix.
   
   For a partial fix, it is difficult to write a stable UT. I would rather 
provide a stable fix. I see two possible directions:
   1. Make sure the broadcast job is submitted before the shuffle map job: the 
call to materialize() for a non-broadcast query stage should wait until all the 
broadcast jobs have been submitted. 
   2. Exclude the scheduling time of the broadcast job when we calculate the 
timeout. This is very hard to measure. As a fallback, perhaps we can measure the 
time of the pure broadcast, that is, total time minus collect time. But this also 
requires big changes, including changes to non-AQE.
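
   To make direction 2 concrete, here is a rough sketch of the arithmetic only. 
It is not Spark code: the function names and the three timestamps are 
hypothetical, and it assumes the moment the job actually starts running is 
observable, which is exactly the part that is hard to measure in practice.

```python
def effective_elapsed(submitted_at, started_at, now):
    """Elapsed time of a broadcast job with scheduling delay removed.

    All arguments are hypothetical timestamps in seconds. `started_at` is
    None while the job is still waiting to be scheduled.
    """
    if started_at is None:
        # Still queued: everything so far counts as scheduling delay.
        scheduling_delay = now - submitted_at
    else:
        scheduling_delay = started_at - submitted_at
    return (now - submitted_at) - scheduling_delay


def timed_out(submitted_at, started_at, now, timeout_s):
    """True only if the time spent actually running exceeds the timeout."""
    return effective_elapsed(submitted_at, started_at, now) > timeout_s
```

   For example, a job submitted at t=0 that only starts at t=250 and is 
checked at t=400 has waited 400 seconds in total but run for only 150, so with 
a 300 second timeout it would not be reported as timed out under this scheme.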
   
   I prefer #1: it behaves more like non-AQE, it is this PR's original 
intention, and it will have less impact on non-AQE.
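
   The ordering idea in #1 could look roughly like the sketch below. This is a 
toy model with threads, not the AQE implementation: `SubmissionBarrier`, 
`run_stages`, and the inner `materialize` function are hypothetical names, and 
appending to a list stands in for submitting a job.

```python
import threading


class SubmissionBarrier:
    """Lets non-broadcast stages wait until every broadcast job is submitted."""

    def __init__(self, n_broadcast):
        self._remaining = n_broadcast
        self._cond = threading.Condition()

    def broadcast_submitted(self):
        # Called once per broadcast stage, right after its job is submitted.
        with self._cond:
            self._remaining -= 1
            if self._remaining <= 0:
                self._cond.notify_all()

    def wait_for_broadcasts(self, timeout=None):
        # Returns immediately when there are no pending broadcast stages.
        with self._cond:
            return self._cond.wait_for(lambda: self._remaining <= 0, timeout)


def run_stages(stages):
    """stages: list of (name, is_broadcast). Returns the submission order."""
    order = []
    order_lock = threading.Lock()
    barrier = SubmissionBarrier(sum(1 for _, is_bc in stages if is_bc))

    def materialize(name, is_broadcast):
        if not is_broadcast:
            # Shuffle map stages block until all broadcast jobs are submitted.
            barrier.wait_for_broadcasts()
        with order_lock:
            order.append(name)  # stands in for submitting the job
        if is_broadcast:
            barrier.broadcast_submitted()

    threads = [threading.Thread(target=materialize, args=s) for s in stages]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order
```

   Whatever order the threads start in, every broadcast stage appears in the 
result before any non-broadcast stage, which is the property #1 asks for.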
   
        


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


