Hi,

If you check out the develop branch from the ASF git repository, I believe it now contains a fix for this.
It also contains SLIDER-799: AM-managed placement escalation, and all but one subtask of SLIDER-611, that is, all the enhancements for placement planned for Slider 0.80-incubating.

Shrijeet, can you grab this branch, do a local build, and see if the problem you are seeing is now fixed?

> On 23 Mar 2015, at 17:15, Shrijeet Paliwal <[email protected]> wrote:
>
> Hello,
>
> *Context:*
>
> We were seeing very aggressive preemption by the Fair Scheduler, and 98% of
> the preemption activity was triggered by the Slider queue's needs. The
> Slider queue is a stable queue, i.e. its containers don't churn, and it has
> been given a fair-share guarantee of more than it needs (a high weight and
> a min share double its steady-state needs), so it was puzzling to see it
> triggering preemption. When I turned on debug logging for the Fair
> Scheduler, I noticed the scheduler's demand-update thread reporting
> unusually high demand from the Slider queue.
>
> My initial thought was a bug in the scheduler, but I later concluded it is
> Slider's problem, caused not by Slider's own code but by the AMRMClient
> code. I can deterministically reproduce the issue on my laptop running a
> pseudo YARN + Slider setup. I traced it to an open issue:
> https://issues.apache.org/jira/browse/YARN-3020.
>
> *The problem:*
>
> 1. A region server fails for the first time. Slider notices it and
> registers a request for a new container with the RM via AMRMClient. At this
> point AMRMClient caches the allocation request, keyed on the 'Resource' (a
> data structure with memory, CPU, and priority).
> (source: AMRMClientImpl.java; the cache is remoteRequestsTable)
> 2. A region server fails again. Slider notices it and again registers a
> request for one new container with the RM via AMRMClient. AMRMClient finds
> the matching Resource request in its cache (the memory, CPU, and priority
> for a region server obviously don't change) and adds +1 to the container
> count before putting it over the wire. *NOTE*: Slider didn't need 2
> containers, but ends up receiving 2. When the containers are allocated,
> Slider keeps one and discards the other.
> 3. As explained in YARN-3020, with each subsequent failure we keep asking
> for more and more containers when in reality we always need one.
>
> For the Fair Scheduler this means demand keeps going up. It doesn't know
> that Slider ends up discarding the surplus containers, so in order to
> satisfy the demand it kills mercilessly. Needless to say, this will not be
> triggered only by container failures; even flexing should trigger it.
>
> *The fix:*
>
> Rumor is that AMRMClient doesn't have a bug; this is its intended behaviour
> (source: comments on YARN-3020). The claim is that on receiving a
> container, the client should clear the cached request by calling a method
> named 'removeContainerRequest'. Slider isn't following the protocol
> correctly, though in Slider's defense the protocol is not well defined.
>
> Thoughts?
> --
> Shrijeet
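
To make steps 1-3 of the quoted problem description concrete, here is a minimal sketch against the Hadoop 2.x AMRMClient API. The class name, resource sizing, and priority are made up for illustration (in Slider they come from the resource spec); the point is that the (capability, priority) pair is the cache key, so repeated identical asks accumulate:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class EscalatingAskSketch {
      public static void main(String[] args) {
        Configuration conf = new YarnConfiguration();
        AMRMClient<ContainerRequest> amrmClient = AMRMClient.createAMRMClient();
        amrmClient.init(conf);
        amrmClient.start();
        // (registerApplicationMaster(...) and the heartbeat loop are elided)

        // Hypothetical region-server sizing. The (capability, priority)
        // pair is the key into AMRMClientImpl's remoteRequestsTable.
        Resource rsCapability = Resource.newInstance(1024, 1); // 1 GB, 1 vcore
        Priority rsPriority = Priority.newInstance(1);

        // Failure #1: a new entry is cached with a container count of 1.
        amrmClient.addContainerRequest(
            new ContainerRequest(rsCapability, null, null, rsPriority));

        // Failure #2: same key, so the cached entry's count is bumped to 2
        // and sent on the next heartbeat. The RM now sees demand for two
        // containers although only one region server is down, and the Fair
        // Scheduler preempts other queues to meet that demand.
        amrmClient.addContainerRequest(
            new ContainerRequest(rsCapability, null, null, rsPriority));
      }
    }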
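
And the protocol the YARN-3020 comments describe: every time the RM hands back a container, the AM should decrement the cached ask by calling removeContainerRequest with a request that matches the original key, whether or not it keeps the container. A sketch of that bookkeeping; the helper class and method names here are mine, not Slider's:

    import java.util.List;
    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

    public class AskBookkeeping {
      private final AMRMClient<ContainerRequest> amrmClient;

      public AskBookkeeping(AMRMClient<ContainerRequest> amrmClient) {
        this.amrmClient = amrmClient;
      }

      /**
       * Call this for every allocated container, including any the AM
       * discards as surplus, so the cached request count is decremented
       * and never snowballs across failures.
       */
      public void onContainersAllocated(List<Container> allocated) {
        for (Container container : allocated) {
          // Rebuild a request matching the (capability, priority) key of
          // the original ask so the right remoteRequestsTable entry is
          // decremented.
          amrmClient.removeContainerRequest(new ContainerRequest(
              container.getResource(), null, null, container.getPriority()));
        }
      }
    }

With that in place, the ask the scheduler sees tracks the real outstanding need, so the Fair Scheduler demand should stop climbing.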
