Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/921#discussion_r13544073
  
    --- Diff: 
yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala 
---
    @@ -252,16 +252,12 @@ class ApplicationMaster(args: 
ApplicationMasterArguments, conf: Configuration,
         try {
           logInfo("Allocating " + args.numExecutors + " executors.")
           // Wait until all containers have finished
    -      // TODO: This is a bit ugly. Can we make it nicer?
    -      // TODO: Handle container failure
           yarnAllocator.addResourceRequests(args.numExecutors)
           // Exits the loop if the user thread exits.
           while (yarnAllocator.getNumExecutorsRunning < args.numExecutors && 
userThread.isAlive) {
    -        if (yarnAllocator.getNumExecutorsFailed >= maxNumExecutorFailures) 
{
    -          finishApplicationMaster(FinalApplicationStatus.FAILED,
    -            "max number of executor failures reached")
    -        }
             yarnAllocator.allocateResources()
    +        checkNumExecutorsFailed()
    +        allocateMissingExecutor()
    --- End diff --
    
    You are correct that it isn't updated until that is called and in this has 
it really doesn't matter to much since we are in while loop and then after this 
we will just go to the reporter thread to continue to loop.  I just find it odd 
to add in missing executors but then not call allocate immediately afterwards 
(we sleep and then we possibly break from the loop).   For that reason I would 
prefer these to be moved up.  Yes we have one extra unneeded call to them, but 
I think it flows better and will be more resilient to other code changes in the 
future.  
    Also note that launchReporterThread runs the logic the same way so right 
now you are doing the failed check and and allocateMissing twice in a row on 
the first time it goes into the reporter thread.
    
    If you really have objections to that then we could make one call of 
allocateResources() outside of the while loop and then do the checkFailed, 
allocateMissing, and then allocateResources in side the while loop.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to