Peter Bacsko created YUNIKORN-2574:
--------------------------------------
Summary: totalPartitionResource should not be mutated with
AddTo/SubFrom
Key: YUNIKORN-2574
URL: https://issues.apache.org/jira/browse/YUNIKORN-2574
Project: Apache YuniKorn
Issue Type: Bug
Components: core - scheduler
Affects Versions: 1.5.0, 1.4.0
Reporter: Peter Bacsko
Assignee: Peter Bacsko
There is a potential data race in PartitionContext: the field
"totalPartitionResource" is mutated in place. The problem is that the method
{{GetTotalPartitionResource()}} does not clone it.
{noformat}
func (pc *PartitionContext) GetTotalPartitionResource() *resources.Resource {
pc.RLock()
defer pc.RUnlock()
return pc.totalPartitionResource
}
{noformat}
In general, we should prefer the immutable approach for variables like this,
just like in {{objects.Queue}}:
{noformat}
func (sq *Queue) IncAllocatedResource(alloc *resources.Resource, nodeReported
bool) error {
// check this queue: failure stops checks if the allocation is not part
of a node addition
newAllocated := resources.Add(sq.allocatedResource, alloc)
[ ... removed ... ]
sq.Lock()
defer sq.Unlock()
// all OK update this queue
sq.allocatedResource = newAllocated
sq.updateAllocatedResourceMetrics()
return nil
}
// incPendingResource increments pending resource of this queue and its parents.
func (sq *Queue) incPendingResource(delta *resources.Resource) {
// update the parent
if sq.parent != nil {
sq.parent.incPendingResource(delta)
}
// update this queue
sq.Lock()
defer sq.Unlock()
sq.pending = resources.Add(sq.pending, delta)
sq.updatePendingResourceMetrics()
}
{noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]