[ 
https://issues.apache.org/jira/browse/HADOOP-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665659#action_12665659
 ] 

Matei Zaharia commented on HADOOP-5075:
---------------------------------------

By the way, here is why I think the break is correct. 
oldSlots will equal slotsLeft after the floor loop only if the number of slots 
we had left to distribute was so small that no job had its share of the total 
above 0.5. In this case, the ceil loop will just add +1 slot to the first few 
jobs in order of weight and then run out of slots to distribute, so the outer 
loop would have exited anyway with slotsLeft = 0. Remember that slotsLeft is 
the number of guaranteed slots for the pool that we still have to distribute, 
so it is bounded and much smaller than the number of slots a job could 
potentially consume (a job might get some of those too, but as fair share 
rather than guaranteed share). Also, the jobs list that we iterate over 
contains only jobs that are not at full capacity and can thus all use an extra 
slot, so we don't "lose" slots by walking the list and finding that some jobs 
can't use any.
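
To make the argument above concrete, here is a minimal, self-contained sketch 
of the floor pass, the ceil pass and the break. It is not the scheduler's 
actual code: the class MinSlotsSketch, the distribute method, the array-based 
job model and the giveMinSlots signature are all assumptions made for 
illustration, and jobs are assumed to be pre-sorted by weight.

```java
import java.util.Arrays;

// Hypothetical model: job i has weights[i] and can use at most demand[i] slots.
class MinSlotsSketch {

    // Gives job i up to `share` slots, capped by its unmet demand and by the
    // slots still available. Returns the updated slotsLeft.
    static int giveMinSlots(int[] assigned, int[] demand, int i,
                            int slotsLeft, int share) {
        int give = Math.min(Math.min(share, demand[i] - assigned[i]), slotsLeft);
        assigned[i] += give;
        return slotsLeft - give;
    }

    // Distributes slotsLeft guaranteed slots; returns the slots given per job.
    static int[] distribute(double[] weights, int[] demand, int slotsLeft) {
        int n = weights.length;
        int[] assigned = new int[n];
        while (slotsLeft > 0) {
            // Only jobs that can still use a slot take part in this round.
            double totalWeight = 0;
            for (int i = 0; i < n; i++) {
                if (assigned[i] < demand[i]) totalWeight += weights[i];
            }
            if (totalWeight == 0) break; // nobody can use more slots

            int oldSlots = slotsLeft;
            // Floor pass: each job gets the floor of its weighted share.
            for (int i = 0; i < n; i++) {
                if (assigned[i] >= demand[i]) continue;
                int share = (int) Math.floor(oldSlots * weights[i] / totalWeight);
                slotsLeft = giveMinSlots(assigned, demand, i, slotsLeft, share);
            }
            if (slotsLeft == oldSlots) {
                // The floor pass assigned nothing, so hand out the remaining
                // slots with ceil, in list (weight) order.
                for (int i = 0; i < n; i++) {
                    if (assigned[i] >= demand[i]) continue;
                    int share = (int) Math.ceil(oldSlots * weights[i] / totalWeight);
                    slotsLeft = giveMinSlots(assigned, demand, i, slotsLeft, share);
                }
                break; // the break under discussion: never spin on no progress
            }
        }
        return assigned;
    }

    public static void main(String[] args) {
        // Two slots, three equal-weight jobs: the floor pass assigns nothing,
        // the ceil pass gives one slot each to the first two jobs.
        int[] result = distribute(new double[] {1, 1, 1}, new int[] {10, 10, 10}, 2);
        System.out.println(Arrays.toString(result)); // prints [1, 1, 0]
    }
}
```

In this sketch every job that enters the ceil pass has unmet demand and 
positive weight, so each one takes a slot until slotsLeft reaches 0 or the 
list ends, which is the situation described above.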

> Potential infinite loop in updateMinSlots
> -----------------------------------------
>
>                 Key: HADOOP-5075
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5075
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/fair-share
>            Reporter: Matei Zaharia
>            Priority: Blocker
>             Fix For: 0.19.1, 0.20.0, 0.21.0
>
>         Attachments: hadoop-5075-v2.patch, hadoop-5075-v3.patch, 
> hadoop-5075.patch
>
>
> We ran into a problem at Facebook where the updateMinSlots loop in the 
> scheduler was repeating infinitely. This might happen if, due to rounding, we 
> are unable to assign the last few slots in a pool. This patch adds a break 
> statement to ensure that the loop exits if it hasn't managed to assign any 
> slots.

