SaurabhChawla100 edited a comment on pull request #29413: URL: https://github.com/apache/spark/pull/29413#issuecomment-672918107
> so one question, here, you are seeing events dropped that cause hangs in the application code? What queue were they in?

We have seen events dropped in the executorManagement queue, the appStatus queue, and the eventLog queue. There are scenarios with long-running jobs in Zeppelin and Jupyter notebooks, where the Spark application keeps running until the notebook is stopped, or until an added validation decides the application is idle based on whether any job is running. If the jobEnd event is dropped from the appStatus queue, the notebook thinks Spark is still running the job, so the Spark job appears hung even though nothing has been running for a long period of time.

We have also seen the impact when stage end, stage start, and task info events are dropped from the executorManagement queue, which affects the decisions for upscaling and downscaling executors.
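To make the jobEnd scenario concrete, here is a minimal sketch of how a dropped event can leave an idle check stuck. It is a toy simulation and not Spark code: `BoundedEventQueue`, `IdleTracker`, and `simulate` are hypothetical names, and the queue simply rejects events once it is full.

```python
from collections import deque


class BoundedEventQueue:
    """Hypothetical stand-in for a bounded listener queue that drops events when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()
        self.dropped = 0

    def post(self, event):
        # Drop the event instead of blocking the producer when the queue is full.
        if len(self.events) >= self.capacity:
            self.dropped += 1
            return False
        self.events.append(event)
        return True

    def drain(self, listener):
        # Deliver every queued event to the listener in arrival order.
        while self.events:
            listener.on_event(self.events.popleft())


class IdleTracker:
    """Hypothetical listener that treats the application as idle when no job is active."""

    def __init__(self):
        self.active_jobs = set()

    def on_event(self, event):
        kind, job_id = event
        if kind == "jobStart":
            self.active_jobs.add(job_id)
        elif kind == "jobEnd":
            self.active_jobs.discard(job_id)

    def is_idle(self):
        return not self.active_jobs


def simulate(capacity):
    """Post a burst of events before the listener drains, then report (is_idle, dropped)."""
    queue = BoundedEventQueue(capacity)
    tracker = IdleTracker()
    queue.post(("jobStart", 1))
    for _ in range(5):
        queue.post(("taskEnd", 1))
    queue.post(("jobEnd", 1))  # dropped if the burst already filled the queue
    queue.drain(tracker)
    return tracker.is_idle(), queue.dropped


print(simulate(capacity=3))
print(simulate(capacity=10))
```

With the small capacity the jobEnd event never reaches the tracker, so it keeps reporting an active job although nothing is running. With enough capacity the same burst leaves the tracker idle.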
