wzhfy edited a comment on pull request #29580: URL: https://github.com/apache/spark/pull/29580#issuecomment-686214396
> > Then other threads can not process messages in that inbox, which causes the endpoint to hang
>
> Other than when the inbox has been stopped, this would not happen.
> Are you referring to this? Or any other cases?

@mridulm In our case, messages for `DriverEndpoint` couldn't get processed after an OOM happened in a dispatcher thread. Our cluster runs Spark 2.3, but I think the same problem exists in 2.4, and for other endpoints in 3.0 (`DriverEndpoint` becomes an `IsolatedRpcEndpoint` instead of a `ThreadSafeRpcEndpoint` in 3.x).

IIUC, an inbox is stopped only when it's unregistered. When a dispatcher thread is processing messages in an inbox and a fatal error (e.g. OOM) happens, it just throws the error. I can't find any place where the inbox is stopped in this case. Please correct me if I'm wrong, thanks!
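
The failure mode described above can be illustrated with a toy model. This is only a sketch, not Spark's `Inbox` code: `ToyInbox`, `FatalError`, and the `release_on_fatal` flag are hypothetical names, and it assumes the inbox guards processing with an active-thread counter that is decremented only on the normal exit path.

```python
import threading
from collections import deque


class FatalError(BaseException):
    """Hypothetical stand-in for a JVM fatal error such as OutOfMemoryError."""


class ToyInbox:
    """Toy inbox (not Spark's): at most one thread may process it at a time."""

    def __init__(self, handler):
        self._lock = threading.Lock()
        self._messages = deque()
        self._active_threads = 0  # threads currently inside process()
        self._handler = handler
        self.processed = []

    def post(self, message):
        with self._lock:
            self._messages.append(message)

    def process(self, release_on_fatal=False):
        """Drain the inbox. Return False if it looks busy or is empty."""
        with self._lock:
            if self._active_threads != 0 or not self._messages:
                return False
            message = self._messages.popleft()
            self._active_threads += 1
        try:
            while True:
                self._handler(message)
                self.processed.append(message)
                with self._lock:
                    if not self._messages:
                        self._active_threads -= 1  # normal exit path
                        return True
                    message = self._messages.popleft()
        except FatalError:
            # Assumed fix: give the inbox back before the thread dies.
            if release_on_fatal:
                with self._lock:
                    self._active_threads -= 1
            raise


def handler(message):
    if message == "oom":
        raise FatalError("simulated OOM")


def run(release_on_fatal):
    inbox = ToyInbox(handler)
    inbox.post("oom")
    try:
        inbox.process(release_on_fatal)
    except FatalError:
        pass  # the dispatcher thread would die here
    inbox.post("hello")
    return inbox.process(release_on_fatal), inbox.processed


stuck = run(release_on_fatal=False)
recovered = run(release_on_fatal=True)
```

In this model, `stuck` shows the later message never being handled because the counter stays at 1 after the fatal error, while `recovered` shows that releasing the counter on the error path lets another thread pick the inbox up again.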
