AngersZhuuuu commented on pull request #31437:
URL: https://github.com/apache/spark/pull/31437#issuecomment-773369627


   > OK, please update the description with those details.
   > Maybe cluster mode doesn't matter here because the application master would 
be killed anyway.
   > The original change, I believe, was from before we could handle the 
application master being killed and restarted. I think that is handled OK now. So 
just to verify with this change: the application master gets preempted and killed, 
the application master gets restarted, and the driver process continues, correct?
   > 
   > The only thing better would be if YARN told us this was a preempt case. Did 
you look at that at all? It's been a while since I looked into the YARN code.
   
   From the error stack, it only tells us that the attempt can't be found. I found 
the root cause in YARN's log. Since we can't change YARN's code on the Spark side, 
what we can do here is just retry. I am still looking into YARN's code these days.
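
   To illustrate the "just retry" idea, here is a minimal, hypothetical sketch of 
a retry wrapper around a call that can fail transiently (e.g. when the attempt 
report is not yet available after AM preemption). The names `retry` and 
`maxAttempts` are illustrative assumptions, not the actual code in this PR:

```scala
// Hypothetical retry helper: re-invoke `op` up to `maxAttempts` times,
// backing off briefly between failures. Real Spark code would log the
// failure cause and likely only retry on specific YARN exceptions.
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

object RetrySketch {
  @tailrec
  def retry[T](maxAttempts: Int)(op: => T): T =
    Try(op) match {
      case Success(v) => v
      case Failure(_) if maxAttempts > 1 =>
        Thread.sleep(100) // simple fixed backoff for illustration
        retry(maxAttempts - 1)(op)
      case Failure(e) => throw e
    }
}
```

   In practice the retry would be scoped to the specific "attempt not found" 
error rather than all failures, so genuine errors still surface immediately.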


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


