More evidence:

Spark is also affected: https://issues.apache.org/jira/browse/SPARK-2687
One more relevant yarn jira: https://issues.apache.org/jira/browse/YARN-1902

--
Shrijeet

On Mon, Mar 23, 2015 at 10:15 AM, Shrijeet Paliwal <[email protected]> wrote:

> Hello,
>
> *Context:*
>
> We were seeing very aggressive preemption by the Fair Scheduler, and 98%
> of the preemption activity was triggered by the slider queue's needs. The
> Slider queue is a stable queue, i.e. its containers don't churn, and it
> has been given a fair-share guarantee of more than it needs (a high
> weight & a min share double its steady-state needs). So it was puzzling
> to see it triggering preemption. When I turned on debug logging for the
> Fair Scheduler, I noticed the scheduler's demand-update thread reporting
> unusually high demand from the Slider queue.
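>
> For reference, the queue was configured along these lines (illustrative
> fair-scheduler.xml values, not our exact numbers):
>
>   <queue name="slider">
>     <weight>4.0</weight>
>     <!-- roughly double the queue's steady-state usage -->
>     <minResources>16384 mb, 8 vcores</minResources>
>   </queue>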
>
> My initial thought was a bug in the scheduler, but later I concluded it's
> Slider's problem, though not due to its own code but due to AMRMClient
> code. I can deterministically reproduce the issue on my laptop running a
> pseudo yarn+slider setup. I traced it to an open issue:
> https://issues.apache.org/jira/browse/YARN-3020.
>
> *The problem:*
>
> 1. A region server fails for the first time. Slider notices it and
> registers a request with the RM via AMRMClient for a new container. At
> this time AMRMClient caches this allocation request with the 'Resource'
> (a data structure with memory, cpu & priority) as the key.
> (source: AMRMClientImpl.java, the cache is remoteRequestsTable)
> 2. A region server fails again. Slider notices it and registers a request
> with the RM again via AMRMClient for a (one) new container. AMRMClient
> finds the similar Resource request (the memory, cpu and priority for an
> RS obviously don't change) in its cache and adds +1 to the container
> count before putting it over the wire (sketched below). *NOTE*: Slider
> didn't need 2 containers, but ends up receiving 2. When the containers
> are allocated, Slider keeps one and discards one.
> 3. As explained in YARN-3020, with subsequent failures we will keep
> asking for more and more containers when in reality we always need one.
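>
> To make the accounting concrete, here's a minimal sketch of the caching
> behaviour described in step 2 (my simplification, assuming a single-key
> cache; the real remoteRequestsTable in AMRMClientImpl.java is also keyed
> by priority and resource name):
>
>   import java.util.HashMap;
>   import java.util.Map;
>   import org.apache.hadoop.yarn.api.records.Resource;
>   import org.apache.hadoop.yarn.api.records.ResourceRequest;
>   import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
>
>   public class RequestCacheSketch {
>     private final Map<Resource, ResourceRequest> remoteRequestsTable =
>         new HashMap<>();
>
>     public void addContainerRequest(ContainerRequest req) {
>       ResourceRequest cached = remoteRequestsTable.get(req.getCapability());
>       if (cached == null) {
>         cached = ResourceRequest.newInstance(
>             req.getPriority(), ResourceRequest.ANY, req.getCapability(), 0);
>         remoteRequestsTable.put(req.getCapability(), cached);
>       }
>       // Every identical request bumps the outstanding count; nothing here
>       // ever decrements it, which is why repeated RS failures inflate the
>       // demand the scheduler sees.
>       cached.setNumContainers(cached.getNumContainers() + 1);
>     }
>   }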
>
> For the Fair Scheduler this means demand keeps going up. It doesn't know
> that Slider ends up discarding the surplus containers. In order to
> satisfy the demand it kills mercilessly. Needless to say, this won't be
> triggered only by container failure; even flexing should trigger it.
>
> *The fix:*
>
> Rumor is that AMRMClient doesn't have a bug; it's intended behaviour
> (source: comments in YARN-3020). The claim is that on receiving a
> container, the client should clear the cache entry by calling a method
> called 'removeContainerRequest'. Slider isn't following the protocol
> correctly; in Slider's defense, the protocol is not well defined.
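>
> A rough sketch of what following that protocol might look like on the
> Slider side (my assumption of the intended usage, not Slider's actual
> code; findMatchingRequest is a hypothetical helper for bookkeeping we
> would have to do ourselves, e.g. via AMRMClient#getMatchingRequests):
>
>   import java.util.List;
>   import org.apache.hadoop.yarn.api.records.Container;
>   import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
>   import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;
>
>   class AllocationHandler {
>     private final AMRMClientAsync<ContainerRequest> amrmClient;
>
>     AllocationHandler(AMRMClientAsync<ContainerRequest> amrmClient) {
>       this.amrmClient = amrmClient;
>     }
>
>     void onContainersAllocated(List<Container> containers) {
>       for (Container c : containers) {
>         ContainerRequest satisfied = findMatchingRequest(c);
>         if (satisfied != null) {
>           // The call Slider is missing: decrement the cached container
>           // count inside AMRMClient before the next allocate() heartbeat.
>           amrmClient.removeContainerRequest(satisfied);
>         }
>         // ... hand the container to whichever role asked for it ...
>       }
>     }
>
>     // Hypothetical: match the container's priority/capability against
>     // the ContainerRequests recorded when calling addContainerRequest.
>     private ContainerRequest findMatchingRequest(Container c) {
>       return null; // placeholder for that bookkeeping
>     }
>   }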
>
> Thoughts?
> --
> Shrijeet
>
