SaurabhChawla100 edited a comment on pull request #29413:
URL: https://github.com/apache/spark/pull/29413#issuecomment-672918107


   > so one question, here, you are seeing events dropped that cause hangs in 
the application code? What queue were they in?
   
   We have seen events dropped from the executorManagement queue, the appStatus 
queue, and the eventLog queue. 
   
   There are scenarios with long-running jobs in Zeppelin and Jupyter notebooks, 
where the Spark application keeps running until the notebook is stopped, or 
where an added validation checks whether the application is idle based on 
whether any job is running. If the jobEnd event is dropped from the appStatus 
queue, the notebook thinks Spark is still running the job, so the Spark job 
appears to be hung even though nothing has been running for a long period of 
time.
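
   To make the failure mode concrete, here is a minimal sketch in plain Python. It is not Spark's listener bus code: `BoundedEventQueue`, `IdleTracker`, and `simulate` are hypothetical names, and the idle check is an assumed stand-in for the notebook-side validation. It only shows that once a bounded queue drops the jobEnd event, a consumer that tracks running jobs from events never sees the job finish.

```python
from collections import deque


class BoundedEventQueue:
    """Toy bounded queue: events posted while the queue is full are dropped."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()
        self.dropped = 0

    def post(self, event):
        # Drop the event instead of blocking the producer when the queue is full.
        if len(self.events) >= self.capacity:
            self.dropped += 1
            return False
        self.events.append(event)
        return True

    def drain(self):
        # Deliver queued events to the consumer in FIFO order.
        while self.events:
            yield self.events.popleft()


class IdleTracker:
    """Hypothetical idle check: idle means no job is known to be running."""

    def __init__(self):
        self.active_jobs = set()

    def on_event(self, event):
        kind, job_id = event
        if kind == "jobStart":
            self.active_jobs.add(job_id)
        elif kind == "jobEnd":
            self.active_jobs.discard(job_id)

    def is_idle(self):
        return not self.active_jobs


def simulate(capacity):
    """Post a burst of events before the consumer drains, then check idleness."""
    queue = BoundedEventQueue(capacity)
    tracker = IdleTracker()
    burst = [("jobStart", 1), ("taskEnd", 1), ("taskEnd", 1), ("jobEnd", 1)]
    for event in burst:
        queue.post(event)
    for event in queue.drain():
        tracker.on_event(event)
    return tracker.is_idle(), queue.dropped
```

   In this sketch, `simulate(4)` delivers every event and the tracker reports idle, while `simulate(3)` drops the final jobEnd and the tracker keeps reporting job 1 as running indefinitely.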
   
   We have also seen stage end, stage start, and task info events dropped from 
the executorManagement queue, which affects the decisions for upscaling and 
downscaling executors. 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


