Hi

If you check out the develop branch from the ASF git repository, I believe it 
now contains a fix for this.

It also contains SLIDER-799 (AM-managed placement escalation) and all but one 
subtask of SLIDER-611; that is, all the enhancements for placement planned for 
Slider 0.80-incubating.

Shrijeet, can you grab this branch, do a local build, and see if the problem you 
are seeing is now fixed?

> On 23 Mar 2015, at 17:15, Shrijeet Paliwal <[email protected]> wrote:
> 
> Hello,
> 
> *Context:*
> 
> We were seeing very aggressive preemption by the Fair Scheduler, and 98% of
> the preemption activity was triggered by the Slider queue's needs. The Slider
> queue is a stable queue, i.e. its containers don't churn, and it has been
> given a fair-share guarantee of more than it needs (high weight and a min
> share double its steady-state needs). So it was puzzling to see it triggering
> preemption. When I turned on debug logging for the Fair Scheduler, I noticed
> the scheduler's demand-update thread reporting unusually high demand from the
> Slider queue.
> 
> My initial thought was a bug in the scheduler, but I later concluded it is
> Slider's problem, though not due to Slider's own code but due to the
> AMRMClient code. I can deterministically reproduce the issue on my laptop
> running a pseudo-distributed YARN + Slider setup. I traced it to an open
> issue: https://issues.apache.org/jira/browse/YARN-3020.
> 
> *The problem: *
> 
> 1. A region server fails for the first time. Slider notices it and registers
> a request with the RM via AMRMClient for a new container. At this point
> AMRMClient caches the allocation request keyed by its 'Resource' (a data
> structure with memory, CPU & priority).
> (source: AMRMClientImpl.java; the cache is remoteRequestsTable)
> 2. A region server fails again. Slider notices it and registers a request
> with the RM again via AMRMClient for a (one) new container. AMRMClient finds
> the matching Resource request in its cache (the memory, CPU and priority for
> an RS obviously don't change) and adds +1 to the container count before
> putting it over the wire (see the sketch after this list). *NOTE*: Slider
> didn't need 2 containers, but ends up receiving 2. When the containers are
> allocated, Slider keeps one and discards the other.
> 3. As explained in YARN-3020, with each subsequent failure we keep asking
> for more and more containers when in reality we only ever need one.
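> 
> To make the accumulation concrete, here is a minimal Java sketch against the
> AMRMClient API (not Slider's actual code; the class name, resource sizing and
> priority are illustrative) showing how two addContainerRequest calls with an
> identical Resource and Priority leave a combined ask of 2 in the client's
> cache:
> 
> import org.apache.hadoop.yarn.api.records.Priority;
> import org.apache.hadoop.yarn.api.records.Resource;
> import org.apache.hadoop.yarn.client.api.AMRMClient;
> import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
> 
> public class EscalatingAskSketch {
>   // 'amrmClient' is assumed to be an initialised, started client whose AM
>   // has already registered with the RM.
>   static void requestReplacementTwice(AMRMClient<ContainerRequest> amrmClient) {
>     Resource rsResource = Resource.newInstance(4096, 2); // illustrative RS sizing
>     Priority rsPriority = Priority.newInstance(1);       // illustrative priority
> 
>     // Failure #1: ask for one replacement container.
>     amrmClient.addContainerRequest(
>         new ContainerRequest(rsResource, null, null, rsPriority));
> 
>     // Failure #2: same Resource and Priority, so AMRMClient bumps the cached
>     // ResourceRequest's numContainers to 2 instead of replacing the entry.
>     amrmClient.addContainerRequest(
>         new ContainerRequest(rsResource, null, null, rsPriority));
> 
>     // The next allocate() heartbeat now asks the RM for 2 containers,
>     // even though only 1 is actually needed.
>   }
> }
> 
> This growing ask is exactly the "unusually high demand" that showed up in the
> Fair Scheduler's debug logging.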
> 
> For the Fair Scheduler this means demand keeps going up. It doesn't know
> that Slider ends up discarding the surplus containers, so in order to satisfy
> the demand it kills mercilessly. Needless to say, this is not triggered only
> by container failures; flexing should trigger it as well.
> 
> *The fix: *
> 
> Rumor has it that this is not a bug in AMRMClient; it is intended behaviour
> (source: comments in YARN-3020). The claim is that on receiving a container
> the client should clear the cached ask by calling a method named
> 'removeContainerRequest'. Slider isn't following the protocol correctly,
> though in Slider's defense the protocol is not well defined.
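> 
> A hedged sketch of that protocol (the class name and values are illustrative,
> not Slider's actual code): on every allocation the AM cancels a matching
> outstanding request so that AMRMClient decrements its cached ask.
> 
> import java.util.List;
> import org.apache.hadoop.yarn.api.records.Container;
> import org.apache.hadoop.yarn.api.records.Priority;
> import org.apache.hadoop.yarn.api.records.Resource;
> import org.apache.hadoop.yarn.client.api.AMRMClient;
> import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
> 
> public class RequestBookkeeping {
>   private final AMRMClient<ContainerRequest> amrmClient;
>   private final Resource rsResource = Resource.newInstance(4096, 2); // illustrative
>   private final Priority rsPriority = Priority.newInstance(1);       // illustrative
> 
>   public RequestBookkeeping(AMRMClient<ContainerRequest> amrmClient) {
>     this.amrmClient = amrmClient;
>   }
> 
>   // Invoke from the AM's allocation handling, e.g. an
>   // AMRMClientAsync.CallbackHandler#onContainersAllocated implementation.
>   public void onContainersAllocated(List<Container> containers) {
>     for (Container container : containers) {
>       // Cancel the outstanding ask this allocation satisfies, so the cached
>       // numContainers count comes back down and the RM stops seeing phantom
>       // demand from the queue.
>       amrmClient.removeContainerRequest(
>           new ContainerRequest(rsResource, null, null, rsPriority));
>       // ... then launch the container, or release it if it is surplus.
>     }
>   }
> }
> 
> If this is indeed the intended protocol, doing this bookkeeping for every
> allocation received should keep AMRMClient's cached demand in line with what
> the AM actually needs.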
> 
> Thoughts?
> --
> Shrijeet
