[ 
https://issues.apache.org/jira/browse/HADOOP-13029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245854#comment-15245854
 ] 

Daryn Sharp commented on HADOOP-13029:
--------------------------------------

I've been pondering this very question since upgrading to 2.7.  I'm not sure 
it's a good idea unless it is well thought out, since the current 
implementation suffers from some unexpected behaviors.

In our experience, an abusive user floods the 1st queue and backoff kicks in 
for everyone.  Agreed, that is less than desirable.  The bad user then decays 
to a lower queue and probably floods it too.  Depending on job width, all 
queues may fill, but decay acts as a dampener.  Allowing simple spillover for 
all callers would let the bad user fill the queues rapidly.  On the flip side, 
without spillover, nobody was going to get those queue slots anyway.

Additionally, the total call count is driven up by the calls that backoff 
rejects.  This generally shrinks the call percentages of all other users, 
moving them into the highest priority queue.  Once the bad users' tasks are 
all blocked in queues, the incoming call rate appears lower.  They decay back 
into a lower queue and clog it back up again.
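
To make the call-percentage effect concrete, here is a rough sketch of how a 
share-based scheduler could map a caller to a priority level.  It is not the 
actual scheduler code: the class name, the method, the threshold values and 
the call counts are all assumptions picked for illustration.

```java
/** Hypothetical sketch: maps a caller's share of all counted calls to a priority level. */
class CallShareSketch {
    // Assumed share thresholds for four priority levels (illustrative values only).
    static final double[] THRESHOLDS = {0.125, 0.25, 0.5};

    /** Returns 0 for the highest priority level, up to THRESHOLDS.length for the lowest. */
    static int priorityFor(long userCalls, long totalCalls) {
        if (totalCalls == 0) {
            return 0;
        }
        double share = (double) userCalls / totalCalls;
        // Walk from the largest threshold down; the first one met decides the level.
        for (int i = THRESHOLDS.length - 1; i >= 0; i--) {
            if (share >= THRESHOLDS[i]) {
                return i + 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Before the flood: a regular user makes 30 of 100 counted calls.
        System.out.println(priorityFor(30, 100));
        // During the flood: rejected retries from the abuser inflate the total to 1000,
        // so the same 30 calls are now a much smaller share.
        System.out.println(priorityFor(30, 1000));
        // The abuser, with 940 of 1000 counted calls.
        System.out.println(priorityFor(940, 1000));
    }
}
```

With these made-up numbers the regular user's level changes only because the 
total grew, not because their own call rate did.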

> Have FairCallQueue try all lower priority sub queues before backoff
> -------------------------------------------------------------------
>
>                 Key: HADOOP-13029
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13029
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Ming Ma
>
> Currently if FairCallQueue and backoff are enabled, backoff will kick in as 
> soon as the assigned sub queue is filled up.
> {noformat}
>   /**
>    * Put and offer follow the same pattern:
>    * 1. Get the assigned priorityLevel from the call by scheduler
>    * 2. Get the nth sub-queue matching this priorityLevel
>    * 3. delegate the call to this sub-queue.
>    *
>    * But differ in how they handle overflow:
>    * - Put will move on to the next queue until it lands on the last queue
>    * - Offer does not attempt other queues on overflow
>    */
> {noformat}
> It seems better to try lower priority sub queues when the assigned sub 
> queue is full, just like the case when backoff is disabled. This will give 
> regular users more opportunities and allow the cluster to be configured with 
> a smaller call queue length. [~chrili], [~arpitagarwal], what do you think?
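
The difference between the current offer behavior and the proposed one can be 
sketched as follows.  This is a minimal illustration built on plain 
java.util.concurrent queues, not the FairCallQueue source: the class and 
method names (SpilloverSketch, offerNoSpillover, offerWithSpillover) are 
hypothetical, and calls are modeled as strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch of a multi-level queue; index 0 is the highest priority. */
class SpilloverSketch {
    private final List<BlockingQueue<String>> queues = new ArrayList<>();

    SpilloverSketch(int levels, int capacityPerLevel) {
        for (int i = 0; i < levels; i++) {
            queues.add(new ArrayBlockingQueue<>(capacityPerLevel));
        }
    }

    /** Behavior described in the quoted comment: only the assigned sub-queue is tried. */
    boolean offerNoSpillover(String call, int priority) {
        return queues.get(priority).offer(call);
    }

    /**
     * Proposed behavior: try the assigned sub-queue, then each lower priority
     * sub-queue in turn. A false return is where backoff would kick in.
     */
    boolean offerWithSpillover(String call, int priority) {
        for (int i = priority; i < queues.size(); i++) {
            if (queues.get(i).offer(call)) {
                return true;
            }
        }
        return false;
    }

    int size(int level) {
        return queues.get(level).size();
    }

    public static void main(String[] args) {
        SpilloverSketch q = new SpilloverSketch(3, 1);
        System.out.println(q.offerNoSpillover("a", 0));   // level 0 has room
        System.out.println(q.offerNoSpillover("b", 0));   // level 0 is full, no other level tried
        System.out.println(q.offerWithSpillover("b", 0)); // falls through to level 1
    }
}
```

In this sketch a caller assigned to level 0 can consume slots in every level 
before it is told to back off, which is the trade-off under discussion.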



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
