jchenjc opened a new pull request #4110:
URL: https://github.com/apache/hadoop/pull/4110


   ### Description of PR
   When running a Hive job in a low-capacity queue on an otherwise idle cluster, 
preemption kicks in and preempts the job's containers even though no other job 
is running or competing for resources. 
   
   Let's take this scenario as an example:
   ```
   cluster resource : <Memory:168GB, VCores:48>
   queue_low: min_capacity 1%
   queue_mid: min_capacity 19%
   queue_high: min_capacity 80%
   CapacityScheduler with DRF
   ```
   During the FIFO preemption candidate selection process, the 
preemptableAmountCalculator first needs to "computeIdealAllocation", which 
depends on each queue's guaranteed/min capacity. A queue's guaranteed capacity 
is currently calculated as "Resources.multiply(totalPartitionResource, 
absCapacity)", so the guaranteed capacity of queue_low is:
   `queue_low: <Memory: (168*0.01)GB, VCores:(48*0.01)> = <Memory:1.68GB, 
VCores:0.48>`. However, since the Resource object stores only long values, 
these double values are cast (truncated) to long, and the final result becomes 
`<Memory:1GB, VCores:0>`.
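   
   The truncation can be reproduced with a plain narrowing cast. This is a 
minimal sketch of the behavior described above; `guaranteed` is an illustrative 
helper, not the actual `Resources.multiply` implementation:
   ```java
   public class CapacityTruncation {
       // Multiply the total partition resource by the queue's absolute
       // capacity, then narrow to long, as the Resource object requires.
       static long guaranteed(long total, double absCapacity) {
           return (long) (total * absCapacity);
       }
   
       public static void main(String[] args) {
           // queue_low: 1% of <Memory:168GB, VCores:48>
           System.out.println(guaranteed(168, 0.01)); // 1  (1.68 truncated)
           System.out.println(guaranteed(48, 0.01));  // 0  (0.48 truncated)
       }
   }
   ```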
   
   Because the guaranteed capacity of queue_low is truncated to 0, its 
normalized guaranteed capacity over the active queues is also 0 under the 
current algorithm in "resetCapacity". This eventually leads to continuous 
preemption of the job containers running in queue_low. 
   
   To work around this corner case, "resetCapacity" needs to consider a couple 
of new scenarios: 
   
   1. If the sum of the absoluteCapacity/minCapacity values of all active 
queues is zero, we should normalize their guaranteed capacities evenly: 
`1.0f / num_of_queues`.
   
   2. If the sum of the pre-normalized guaranteed capacity values (MB or 
VCores) of all active queues is zero, meaning several queues like queue_low may 
have had their capacity values truncated to 0, we should normalize evenly as in 
the first scenario (if they are all tiny, the difference is negligible, e.g. 
1% vs 1.2%).
   
   3. If one of the active queues has a zero pre-normalized guaranteed capacity 
value but its absoluteCapacity/minCapacity is non-zero, we should normalize 
based on the weight of the configured absoluteCapacity/minCapacity: 
`minCapacity / (sum_of_min_capacity_of_active_queues)`. This ensures queue_low 
gets a small but fair normalized value when queue_mid is also active.
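   
   The scenarios above could be sketched roughly as follows. All names here are 
illustrative; this is a simplified model of the proposed normalization, not the 
actual `resetCapacity` code:
   ```java
   import java.util.Arrays;
   
   public class NormalizeSketch {
       // guaranteed:  pre-normalized guaranteed values (already cast to long)
       // minCapacity: each queue's configured absoluteCapacity/minCapacity
       static double[] normalize(long[] guaranteed, double[] minCapacity) {
           int n = guaranteed.length;
           double sumMin = Arrays.stream(minCapacity).sum();
           long sumGuaranteed = Arrays.stream(guaranteed).sum();
           double[] normalized = new double[n];
           if (sumMin == 0 || sumGuaranteed == 0) {
               // Scenarios 1 and 2: the sums are zero -> split evenly.
               Arrays.fill(normalized, 1.0 / n);
           } else if (anyZero(guaranteed)) {
               // Scenario 3: a queue's guaranteed value was truncated to 0
               // but its configured min capacity is non-zero -> weight by
               // configured min capacity.
               for (int i = 0; i < n; i++) {
                   normalized[i] = minCapacity[i] / sumMin;
               }
           } else {
               // Default: weight by the pre-normalized guaranteed values.
               for (int i = 0; i < n; i++) {
                   normalized[i] = (double) guaranteed[i] / sumGuaranteed;
               }
           }
           return normalized;
       }
   
       static boolean anyZero(long[] values) {
           for (long v : values) {
               if (v == 0) {
                   return true;
               }
           }
           return false;
       }
   }
   ```
   For example, with queue_low (min 1%, guaranteed VCores truncated to 0) and 
queue_mid (min 19%) both active, scenario 3 gives queue_low a normalized share 
of 0.01 / 0.20 = 0.05 rather than 0.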
   
   
   ### How was this patch tested?
   
   
   ### For code changes:
   
   - [ ] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
