[ https://issues.apache.org/jira/browse/SLIDER-828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378039#comment-14378039 ]
Steve Loughran commented on SLIDER-828:
---------------------------------------
Assuming that the commentary in YARN-1902 is correct and this behaviour is "as
defined", then Slider needs to cancel the request for every allocated container.
This is going to force us to actually track the individual allocation requests;
currently this is done only for the placed requests, as part of the escalation
process.
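A minimal sketch of that per-request tracking, in plain Java with hypothetical
names (the real implementation would store AMRMClient.ContainerRequest objects
per role and call AMRMClient.removeContainerRequest on the matched one):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch: remember every individual container request per role so that,
 * when a container is allocated, the matching request can be looked up
 * and cancelled against the AMRMClient. All names here are hypothetical.
 */
class OutstandingRequests {
  // role name -> ids of requests still waiting for a container
  private final Map<String, Deque<Long>> outstanding = new HashMap<>();
  private long nextId = 0;

  /** Record a new request; the id stands in for a ContainerRequest. */
  public long add(String role) {
    long id = nextId++;
    outstanding.computeIfAbsent(role, r -> new ArrayDeque<>()).add(id);
    return id;
  }

  /**
   * Called once per allocated container. Returns the request to cancel
   * (via AMRMClient.removeContainerRequest in real code), or null if the
   * allocation was unsolicited.
   */
  public Long onAllocated(String role) {
    Deque<Long> queue = outstanding.get(role);
    return (queue == null || queue.isEmpty()) ? null : queue.poll();
  }

  /** Requests still outstanding for a role. */
  public int pending(String role) {
    Deque<Long> queue = outstanding.get(role);
    return queue == null ? 0 : queue.size();
  }
}
```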
> Redundant container request from slider causing high load on busy cluster
> -------------------------------------------------------------------------
>
> Key: SLIDER-828
> URL: https://issues.apache.org/jira/browse/SLIDER-828
> Project: Slider
> Issue Type: Bug
> Components: appmaster
> Affects Versions: Slider 0.61, Slider 0.70
> Reporter: Shrijeet Paliwal
> Priority: Blocker
> Fix For: Slider 0.80
>
>
> Context:
> We were seeing very aggressive preemption by the Fair Scheduler, and 98% of
> the preemption activity was triggered by the Slider queue's needs. The Slider
> queue is a stable queue, i.e. its containers don't churn, and it has been
> given a fair-share guarantee of more than it needs (a high weight and a min
> share double its steady-state needs). So it was puzzling to see it triggering
> preemption. When I turned on debug logging for the Fair Scheduler, I noticed
> the scheduler's demand-update thread reporting unusually high demand from the
> Slider queue.
> My initial thought was a bug in the scheduler, but I later concluded it is
> Slider's problem, caused not by Slider's own code but by the AMRMClient code.
> I can deterministically reproduce the issue on my laptop running a pseudo
> yarn+slider setup. I traced it to an open issue:
> https://issues.apache.org/jira/browse/YARN-3020
> The problem:
> 1. A region server fails for the first time. Slider notices it and registers
> a request with the RM, via AMRMClient, for a new container. At this point
> AMRMClient caches the allocation request, keyed by its 'Resource' (a data
> structure with memory, cpu & priority). (Source: AMRMClientImpl.java; the
> cache is remoteRequestsTable.)
> 2. A region server fails again. Slider notices it and again registers a
> request with the RM, via AMRMClient, for a (one) new container. AMRMClient
> finds the matching Resource request in its cache (the memory, cpu and
> priority for a region server obviously don't change) and adds +1 to the
> container count before putting it on the wire. NOTE: Slider didn't need 2
> containers, but ends up receiving 2. When the containers are allocated,
> Slider keeps one and discards the other.
> 3. As explained in YARN-3020, with each subsequent failure we keep asking
> for more and more containers, when in reality we always need just one.
> For the Fair Scheduler this means demand keeps going up. It doesn't know that
> Slider ends up discarding the surplus containers, so in order to satisfy the
> demand it preempts mercilessly. Needless to say, this is not triggered only
> by container failures; even flexing should trigger it.
> The fix:
> Rumor is that AMRMClient doesn't have a bug; this is its intended behaviour
> (source: comments in YARN-3020). The claim is that on receiving a container
> the client should clear the cache entry by calling a method called
> 'removeContainerRequest'. Slider isn't following the protocol correctly; in
> Slider's defense, the protocol is not well defined.
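The accumulation described above can be modelled in a few lines of plain Java.
This is a simplified stand-in for the ask table inside AMRMClientImpl
(remoteRequestsTable), not the real class: asks with an identical resource
profile collapse into one entry whose count only grows unless the client
explicitly removes satisfied requests.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Simplified model of AMRMClient's cached asks: requests with the same
 * (priority, memory, vcores) profile share one entry and one counter.
 */
class AskTable {
  // key: "priority/memoryMb/vcores" -> number of containers asked for
  private final Map<String, Integer> asks = new HashMap<>();

  private static String key(int priority, int memoryMb, int vcores) {
    return priority + "/" + memoryMb + "/" + vcores;
  }

  /** What addContainerRequest does: +1 on the cached ask count. */
  public void add(int priority, int memoryMb, int vcores) {
    asks.merge(key(priority, memoryMb, vcores), 1, Integer::sum);
  }

  /** The missing protocol step: -1 on the ask once a container arrives. */
  public void remove(int priority, int memoryMb, int vcores) {
    asks.computeIfPresent(key(priority, memoryMb, vcores),
        (k, n) -> n > 1 ? n - 1 : null);
  }

  /** The demand the scheduler sees for this resource profile. */
  public int pending(int priority, int memoryMb, int vcores) {
    return asks.getOrDefault(key(priority, memoryMb, vcores), 0);
  }
}
```

Three region-server failures without the remove() step leave the table asking
for three containers when only one replacement is needed at a time; calling
remove() after each allocation keeps the visible demand at the real need.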
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)