Shrijeet Paliwal created SLIDER-828:
---------------------------------------

             Summary: Redundant container request from slider causing high load on busy cluster
                 Key: SLIDER-828
                 URL: https://issues.apache.org/jira/browse/SLIDER-828
             Project: Slider
          Issue Type: Bug
          Components: appmaster
    Affects Versions: Slider 0.61
            Reporter: Shrijeet Paliwal


Context:

We were seeing very aggressive preemption by the Fair Scheduler, and 98% of the 
preemption activity was triggered by the Slider queue's needs. The Slider queue 
is a stable queue, i.e. its containers don't churn, and it has been given a 
fair share guarantee larger than it needs (high weight & a min share double its 
steady-state needs). So it was puzzling to see it triggering preemption. When I 
turned on debug logging in the Fair Scheduler, I noticed the scheduler's demand 
update thread reporting unusually high demand from the Slider queue. 

My initial thought was a bug in the scheduler, but I later concluded that it's 
Slider's problem, caused not by its own code but by AMRMClient code. I can 
deterministically reproduce the issue on my laptop running a pseudo YARN+Slider 
setup. I traced it to an open issue: 
https://issues.apache.org/jira/browse/YARN-3020. 

The problem: 

1. A region server fails for the first time. Slider notices it and registers a 
request with the RM via AMRMClient for a new container. At this point 
AMRMClient caches the allocation request with the 'Resource' (a data structure 
with memory, cpu & priority) as the key. (Source: AMRMClientImpl.java; the 
cache is remoteRequestsTable.)
2. A region server fails again. Slider notices it and again registers a request 
with the RM via AMRMClient for one new container. AMRMClient finds a matching 
Resource request in its cache (the memory, cpu and priority for a region server 
obviously don't change) and adds 1 to the cached container count before putting 
the request on the wire. NOTE: Slider needed only 1 container but ends up 
receiving 2. When the containers are allocated, Slider keeps one and discards 
the other. 
3. As explained in YARN-3020, with each subsequent failure we keep asking for 
more and more containers when in reality we only ever need one. 
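
The three steps above can be illustrated with a toy model. This is only a 
sketch of the bookkeeping as described in this report, not the real 
AMRMClientImpl: the class AskTableModel, its String key and its method body are 
hypothetical stand-ins for remoteRequestsTable and the Resource key.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the cached request count; NOT the real AMRMClientImpl. */
final class AskTableModel {
    // Outstanding container count per request key
    // (hypothetical stand-in for remoteRequestsTable).
    private final Map<String, Integer> outstanding = new HashMap<>();

    /**
     * Registers a request for one more container and returns the container
     * count that would be sent over the wire for this key.
     */
    int addContainerRequest(String key) {
        // A matching entry is never cleared here, so the count only grows.
        return outstanding.merge(key, 1, Integer::sum);
    }

    public static void main(String[] args) {
        AskTableModel client = new AskTableModel();
        // Each failure needs exactly one replacement container.
        for (int failure = 1; failure <= 3; failure++) {
            int asked = client.addContainerRequest("regionserver");
            System.out.println(
                "failure " + failure + ": needed 1, asked for " + asked);
        }
    }
}
```

In this model the third failure asks for 3 containers although only 1 is 
needed, because nothing ever removes the entries left by earlier requests.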

For the Fair Scheduler this means demand keeps going up. It doesn't know that 
Slider ends up discarding the surplus containers, so in order to satisfy the 
demand it kills containers mercilessly. Needless to say, this is not triggered 
only by container failure; even flexing should trigger it. 

The fix: 

Rumor is that AMRMClient doesn't have a bug and that this is intended behaviour 
(source: comments in YARN-3020). The claim is that on receiving a container the 
client should clear the cached request by calling a method called 
'removeContainerRequest'. Slider isn't following the protocol correctly; in 
Slider's defense, the protocol is not well defined. 
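
A minimal sketch of that protocol, again as a toy model rather than the real 
YARN client: the class AskTableWithRemoval and its String key are hypothetical, 
and only the method name removeContainerRequest is taken from the discussion 
above. It assumes the caller removes one cached request for every container it 
receives.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the add/remove protocol; NOT the real AMRMClient API. */
final class AskTableWithRemoval {
    // Outstanding container count per request key (hypothetical stand-in).
    private final Map<String, Integer> outstanding = new HashMap<>();

    /** Registers one more container and returns the count sent on the wire. */
    int addContainerRequest(String key) {
        return outstanding.merge(key, 1, Integer::sum);
    }

    /** Called by the application once a container for this key has arrived. */
    void removeContainerRequest(String key) {
        // Decrement the count; drop the entry entirely when it reaches zero.
        outstanding.computeIfPresent(key, (k, n) -> n > 1 ? n - 1 : null);
    }

    /** Returns the number of containers still outstanding for this key. */
    int outstandingFor(String key) {
        return outstanding.getOrDefault(key, 0);
    }

    public static void main(String[] args) {
        AskTableWithRemoval client = new AskTableWithRemoval();
        for (int failure = 1; failure <= 3; failure++) {
            int asked = client.addContainerRequest("regionserver");
            // Container allocated: clear the satisfied request from the cache.
            client.removeContainerRequest("regionserver");
            System.out.println(
                "failure " + failure + ": needed 1, asked for " + asked);
        }
    }
}
```

With the removal in place, every failure in this model asks for exactly 1 
container, so the demand seen by the scheduler would stay flat.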



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
