AngersZhuuuu opened a new pull request #34366:
URL: https://github.com/apache/spark/pull/34366


   ### What changes were proposed in this pull request?
   
   Some yarn-cluster applications fail with the following exception:
   ```
   21/10/20 03:31:55 ERROR Client: Application diagnostics message: Application application_1632999510150_2163647 failed 1 times (global limit =8; local limit is =1) due to AM Container for appattempt_1632999510150_2163647_000001 exited with  exitCode: 0
   Failing this attempt.Diagnostics: For more detailed output, check the application tracking page: http://ip-xx-xx-xx-xx.idata-server.shopee.io:8088/cluster/app/application_1632999510150_2163647 Then click on links to logs of each attempt.
   . Failing the application.
   Exception in thread "main" org.apache.spark.SparkException: Application application_1632999510150_2163647 finished with failed status
   ```
   
   This is caused by the following sequence:
   1. In yarn-cluster mode, the application's user code finishes and the AM shutdown hook is triggered.
   2. The AM calls unregister on the RM, but the call times out. The AM shutdown hook wraps the call in a try/catch and does not rethrow the exception, so the AM container exits with code 0 (the user code ran successfully).
   3. Because the RM has lost its connection to the AM, it marks this container with a failed final status.
   4. The client then receives an application report whose final status is failed, even though the AM container exited with code 0. The client treats the application as failed and retries.
   
   
   This retry is unnecessary and can be avoided.
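
The sequence above can be modelled with a small, self-contained Python sketch. It is only an illustration of the failure mode, not Spark or YARN code: `AppReport`, `run_am_shutdown_hook`, `rm_final_status`, `simulate`, and `should_retry` are all hypothetical names, and the retry rule in `should_retry` is one assumed way a client could tell this case apart, not necessarily what this PR implements.

```python
from dataclasses import dataclass


@dataclass
class AppReport:
    """Hypothetical stand-in for the application report seen by the client."""
    final_status: str   # "SUCCEEDED" or "FAILED", as recorded by the RM
    am_exit_code: int   # exit code of the AM container


def run_am_shutdown_hook(unregister) -> int:
    """Step 2: the hook swallows the unregister failure, so the exit code stays 0."""
    try:
        unregister()
    except TimeoutError:
        pass  # not rethrown, mirroring the try/catch described above
    return 0


def rm_final_status(unregistered: bool) -> str:
    """Step 3: without a successful unregister, the RM records a failure."""
    return "SUCCEEDED" if unregistered else "FAILED"


def simulate(unregister_times_out: bool) -> AppReport:
    """Run steps 1-3 and return the report the client would see in step 4."""
    state = {"unregistered": False}

    def unregister():
        if unregister_times_out:
            raise TimeoutError("unregister from RM timed out")
        state["unregistered"] = True

    exit_code = run_am_shutdown_hook(unregister)
    return AppReport(rm_final_status(state["unregistered"]), exit_code)


def should_retry(report: AppReport) -> bool:
    """Assumed rule: retry only when the AM itself exited with a non-zero code."""
    return report.final_status == "FAILED" and report.am_exit_code != 0
```

With `simulate(True)`, the report has a failed final status and an AM exit code of 0, which is exactly the combination in step 4. Under the assumed rule, `should_retry` returns `False` for that report, while a failed report with a non-zero exit code would still be retried.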
   
   
   ### Why are the changes needed?
   Avoid an unnecessary retry.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   
   Manually tested.

