Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14916#discussion_r77202121
  
    --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
    @@ -222,7 +222,9 @@ private[spark] class ApplicationMaster(
     
             if (!unregistered) {
               // we only want to unregister if we don't want the RM to retry
    -          if (finalStatus == FinalApplicationStatus.SUCCEEDED || isLastAttempt) {
    +          if (finalStatus == FinalApplicationStatus.SUCCEEDED ||
    +            exitCode == ApplicationMaster.EXIT_EARLY ||
    +            exitCode == ApplicationMaster.EXIT_EXCEPTION_USER_CLASS || isLastAttempt) {
    --- End diff ---
    
    You can't do this.  There are various reasons these exit codes can occur, and if
any of them are retryable by YARN, unregistering here now prevents that retry from
happening.  A kill may produce these codes, but other things can too:
EXIT_EXCEPTION_USER_CLASS covers any throwable thrown from the user code, and
EXIT_EARLY means the cause is unknown, so in both cases we would want YARN to retry.
    
    I'm fine with adding something in if we know it was a kill, but I think that's
hard here because YARN doesn't tell us.  Ideally we would have a Spark command to
kill the application gracefully, and then we could do the cleanup ourselves.
    
    The client should try to clean this up if it sees the application was killed,
assuming the client is still running.
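
    A rough sketch of what I mean by client-side cleanup (a hypothetical helper, not
Spark's actual Client code; `KilledAppCleanup`, `cleanupIfKilled`, and the
`stagingDir` parameter are made-up names): the client asks the RM for the
application's state and, if it was killed, removes the staging directory itself,
since the AM never got the chance.

    ```scala
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.yarn.api.records.{ApplicationId, YarnApplicationState}
    import org.apache.hadoop.yarn.client.api.YarnClient
    import org.apache.hadoop.yarn.conf.YarnConfiguration

    object KilledAppCleanup {

      // Hypothetical helper: ask the RM for the application's state and, if the
      // application was killed, delete the staging directory the submitter created,
      // since the AM never got a chance to clean up or unregister.
      def cleanupIfKilled(appId: ApplicationId, stagingDir: Path): Unit = {
        val conf = new YarnConfiguration(new Configuration())
        val yarnClient = YarnClient.createYarnClient()
        yarnClient.init(conf)
        yarnClient.start()
        try {
          val state = yarnClient.getApplicationReport(appId).getYarnApplicationState
          if (state == YarnApplicationState.KILLED) {
            val fs = stagingDir.getFileSystem(conf)
            fs.delete(stagingDir, true)  // recursive delete of the staging dir
          }
        } finally {
          yarnClient.stop()
        }
      }
    }
    ```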


