ivoson commented on PR #54136:
URL: https://github.com/apache/spark/pull/54136#issuecomment-3850928592

   > Is it possible to switch the initialization order between BlockManager 
initialization and ShuffleManager initialization?
   > 
   > If that is impossible, I think we could add a new flag in 
BlockManagerInfo (or, for better memory utilization, a pending list of the 
initial BlockManagers) to indicate whether the executor is ready for 
shuffle migration. The executor would send an RPC signal (similar to 
`LaunchedExecutor`) after the ShuffleManager is initialized.
   > 
   > TBH, the current fix looks a bit complex to me.
   
   I have thought about the other options:
   It is hard to reorder the initialization steps, since we need to register 
the BlockManager to get the blockManagerId before the executor heartbeat 
starts. And moving the executor heartbeat after shuffle manager initialization 
may cause heartbeat timeouts.
   
   Adding a new flag in `BlockManagerInfo` would introduce a new stage and a 
new RPC protocol between the driver and the executor. I would like to avoid 
that, since the issue only affects shuffle migration under a race condition 
that should be pretty rare.
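
For illustration only, the flag-based alternative discussed above could be modeled roughly as follows. This is a toy Python sketch, not Spark code: `DriverRegistry`, `shuffle_ready`, `mark_shuffle_ready`, and `migration_peers` are hypothetical names, and the real `BlockManagerInfo` has no such field. It only shows the extra state and the extra executor-to-driver signal that the approach would require.

```python
from dataclasses import dataclass, field


@dataclass
class BlockManagerInfo:
    """Toy stand-in for the driver-side record of one executor's BlockManager."""
    executor_id: str
    shuffle_ready: bool = False  # hypothetical flag, not an existing Spark field


@dataclass
class DriverRegistry:
    """Toy model of the driver's view of registered block managers."""
    infos: dict = field(default_factory=dict)

    def register_block_manager(self, executor_id):
        # Registration happens early, before the executor heartbeat starts,
        # so the driver knows the executor before its ShuffleManager is ready.
        self.infos[executor_id] = BlockManagerInfo(executor_id)

    def mark_shuffle_ready(self, executor_id):
        # Hypothetical extra RPC, sent once the ShuffleManager is initialized.
        self.infos[executor_id].shuffle_ready = True

    def migration_peers(self):
        # Only executors that reported readiness are offered as migration targets.
        return sorted(e for e, info in self.infos.items() if info.shuffle_ready)


driver = DriverRegistry()
driver.register_block_manager("exec-1")
driver.register_block_manager("exec-2")
driver.mark_shuffle_ready("exec-1")   # exec-2 is registered but not yet ready
print(driver.migration_peers())
```

In this model, an executor that has registered but not yet signaled readiness is never chosen as a shuffle migration target, which closes the race at the cost of one more message and one more state per executor.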


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

