Anindya Sinha created AURORA-413:
------------------------------------

             Summary: aurora update fails if update results in a pending job
                 Key: AURORA-413
                 URL: https://issues.apache.org/jira/browse/AURORA-413
             Project: Aurora
          Issue Type: Bug
          Components: Client
    Affects Versions: 0.5.0
            Reporter: Anindya Sinha


Assume I have a job running on the cluster with 2 instances. Let us say both 
tasks are in RUNNING state.

At this point, I update my job to bump the number of instances from 2 to 4 and 
run aurora update. As expected, it leaves the 2 running instances intact and 
attempts to start instances 3 and 4. Assume instance 3 starts up fine and is in 
RUNNING state, but the cluster is in such a state that instance 4 cannot be 
scheduled immediately (i.e. it stays in PENDING state). In that case, the 
update fails if the task does not move to RUNNING state within 30-45 seconds 
of the launch attempt. Since the update fails, it kills instance 3 as well.

aurora update should not fail if a task is in PENDING state, since that task 
may get scheduled when some other job finishes or fails. Also, if I were to do 
an aurora create instead with # of instances = 4, it keeps 3 of them in RUNNING 
state while the 4th instance stays in PENDING state. So the behavior differs 
depending on whether we do an "aurora create" or an "aurora update", which 
ideally should not be the case.
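
To make the difference concrete, here is a minimal sketch of the two policies 
described above. It is not the Aurora client's code: State, update_outcome and 
the tolerate_pending flag are hypothetical names used only for illustration. 
It assumes the updater decides the outcome from the states of the newly added 
instances once the watch window has elapsed.

```python
from enum import Enum


class State(Enum):
    # Hypothetical subset of task states, named after those in the report.
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    FAILED = "FAILED"


def update_outcome(states, tolerate_pending):
    """Decide whether an update succeeds or is rolled back.

    states: states of the newly added instances after the watch window.
    tolerate_pending: if False, a PENDING instance counts as a failure
    (the behavior reported here); if True, only FAILED instances do
    (the behavior requested here).
    """
    for state in states:
        if state is State.FAILED:
            return "rollback"
        if state is State.PENDING and not tolerate_pending:
            return "rollback"
    return "ok"


# Instance 3 is RUNNING, instance 4 is still PENDING.
new_instances = [State.RUNNING, State.PENDING]
print(update_outcome(new_instances, tolerate_pending=False))
print(update_outcome(new_instances, tolerate_pending=True))
```

With tolerate_pending=False the sketch rolls back, which also takes down the 
healthy instance 3. With tolerate_pending=True the update is kept and the 
PENDING instance is left for the scheduler, matching what aurora create does.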



--
This message was sent by Atlassian JIRA
(v6.2#6252)
