Youjie Chen created SLIDER-939:
----------------------------------

             Summary: flex down does not cancel the outstanding request
                 Key: SLIDER-939
                 URL: https://issues.apache.org/jira/browse/SLIDER-939
             Project: Slider
          Issue Type: Bug
          Components: core
    Affects Versions: Slider 0.80
         Environment: Hadoop 2.7.1 
Slider 0.80.0
            Reporter: Youjie Chen
             Fix For: Slider 0.81


I ran a Slider app on a 6-node cluster. To ensure there is only one 
component (worker) instance on each node, I set yarn.memory to 51% of each 
node's total memory. 
Then I flexed up to 7 workers. This leaves one outstanding worker request 
that will never be met, which is expected.
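
The arithmetic behind this setup can be sketched as follows. This is only an 
illustration: the node size of 10000 MB and the function names are 
assumptions, not values from my cluster or anything in Slider.

```python
def max_instances_per_node(node_memory_mb, container_memory_mb):
    """How many containers of the given size fit on one node by memory alone."""
    return node_memory_mb // container_memory_mb


def outstanding_requests(nodes, node_memory_mb, container_memory_mb, desired):
    """Requests that can never be placed when every node is already full."""
    capacity = nodes * max_instances_per_node(node_memory_mb, container_memory_mb)
    return max(0, desired - capacity)


# Hypothetical node size; the worker asks for 51% of it.
NODE_MB = 10000
WORKER_MB = 5100

per_node = max_instances_per_node(NODE_MB, WORKER_MB)
stuck_at_7 = outstanding_requests(6, NODE_MB, WORKER_MB, desired=7)
stuck_at_6 = outstanding_requests(6, NODE_MB, WORKER_MB, desired=6)
print(per_node, stuck_at_7, stuck_at_6)
```

Because each worker needs more than half of a node, only one fits per node, 
so the 7th request on 6 nodes stays outstanding.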

Then I flexed back down to 6 workers, and every container request from any job 
was blocked even though there was plenty of memory and cores available. The RM 
log shows this message repeating continuously:
capacity.CapacityScheduler 
(CapacityScheduler.java:allocateContainersToNode(1240)) - Skipping scheduling 
since node test.example.com:45454 is reserved by application 
appattempt_1442384698868_0008_000001

It seems the outstanding requests are not actually cancelled in the 
container request queue but keep being requested.
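
The behaviour I would expect on flex down can be sketched like this. It is a 
simplified model, not Slider's actual code: RoleState and flex are 
hypothetical names, and the counters stand in for the real request tracking. 
In a YARN application master, the cancel step would roughly correspond to 
calling removeContainerRequest on the AM's AMRMClient.

```python
from dataclasses import dataclass


@dataclass
class RoleState:
    """Hypothetical per-component bookkeeping inside the AM."""
    desired: int = 0
    running: int = 0
    outstanding: int = 0  # requests sent to the RM but not yet allocated


def flex(role, new_desired):
    """Return (to_request, to_cancel, to_release) for the new desired count."""
    role.desired = new_desired
    delta = new_desired - (role.running + role.outstanding)
    if delta > 0:
        # Flex up: ask the RM for the missing containers.
        role.outstanding += delta
        return (delta, 0, 0)
    excess = -delta
    # Flex down: cancel outstanding requests first, so no reservation lingers.
    to_cancel = min(excess, role.outstanding)
    role.outstanding -= to_cancel
    # Only release running containers for whatever excess remains.
    to_release = excess - to_cancel
    role.running -= to_release
    return (0, to_cancel, to_release)


workers = RoleState(desired=6, running=6)
up_to_7 = flex(workers, 7)    # one request that cannot be placed
down_to_6 = flex(workers, 6)  # should cancel that request
down_to_5 = flex(workers, 5)  # should release one running container
print(up_to_7, down_to_6, down_to_5)
```

In this model, flexing from 7 back to 6 cancels the one outstanding request 
and releases nothing, which is the step that appears to be missing.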

After I flexed down to 5 workers, the other blocked jobs were able to run.
This is related to JIRA https://issues.apache.org/jira/browse/SLIDER-490



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
