wilfred-s commented on code in PR #458:
URL: https://github.com/apache/yunikorn-core/pull/458#discussion_r1036719716
##########
pkg/scheduler/objects/application.go:
##########
@@ -1093,6 +1097,9 @@ func (sa *Application) tryReservedAllocate(headRoom
*resources.Resource, nodeIte
// Do we need a specific node?
if ask.GetRequiredNode() != "" {
+ if ask.HasTriggeredPreemption() ||
time.Since(ask.GetLastPreemptionAttempt()) < sa.preemptionAttemptInterval {
Review Comment:
I think @craigcondit was right: once we triggered preemption the kills will
happen and cannot be reversed. We do not re-trigger *ever* in any case not even
after a short period of time.
The idea is to find a victim list and if we have a list:
- mark the ask as preemption triggered
- kill all victims
- mark all victims as preempted
Next scheduling cycle if the ask is still there and has the preemption
triggered flag skip it and do not re-trigger. The node is reserved for the ask
so it will get scheduled on that node. It could take up to the grace time for
the pod to shutdown, which could be loooong in scheduling times.
##########
pkg/scheduler/objects/allocation_ask.go:
##########
@@ -245,3 +247,31 @@ func (aa *AllocationAsk) GetAllocationLog()
[]*AllocationLogEntry {
}
return res
}
+
+// SetTriggeredPreemption sets whether preemption has been triggered for this
ask
+func (aa *AllocationAsk) SetTriggeredPreemption(triggered bool) {
+ aa.Lock()
+ defer aa.Unlock()
+ aa.preemptionTriggered = triggered
+}
+
+// HasTriggeredPreemption returns whether this ask has triggered preemption
+func (aa *AllocationAsk) HasTriggeredPreemption() bool {
+ aa.RLock()
+ defer aa.RUnlock()
+ return aa.preemptionTriggered
+}
+
+// SetLastPreemptionAttempt sets the time which tells when the last preemption
attempt was performed for this ask.
+func (aa *AllocationAsk) SetLastPreemptionAttempt(attempt time.Time) {
Review Comment:
I would expect that the method would generate the time by calling
`time.Now()` instead of passing in the pre-created object.
If this was done to allow resetting the time during tests we should add a
method that allows clearing both time and the flag in one call. There could
even be an argument for not having the boolean: if the time is set the
preemption has been triggered.
##########
pkg/scheduler/objects/allocation.go:
##########
@@ -356,6 +357,20 @@ func (a *Allocation) GetAllocatedResource()
*resources.Resource {
return a.allocatedResource
}
+// SetPreempted marks the allocation as preempted.
+func (a *Allocation) SetPreempted(preempted bool) {
Review Comment:
Should we log something at a `Debug` or even `Error` level if we try to set
this flag when already set? I would consider it a big problem.
##########
pkg/scheduler/objects/allocation.go:
##########
@@ -356,6 +357,20 @@ func (a *Allocation) GetAllocatedResource()
*resources.Resource {
return a.allocatedResource
}
+// SetPreempted marks the allocation as preempted.
+func (a *Allocation) SetPreempted(preempted bool) {
Review Comment:
I would highly doubt that we would ever unset this flag.
When we trigger preemption we ask the shim to kill the pod. That is not
going to be undone.
##########
pkg/scheduler/objects/application.go:
##########
@@ -1742,3 +1754,10 @@ func (sa *Application)
SetTimedOutPlaceholder(taskGroupName string, timedOut int
sa.placeholderData[taskGroupName].TimedOut = timedOut
}
}
+
+// test only
+func (sa *Application) SetPreemptionAttemptInterval(interval time.Duration) {
Review Comment:
See earlier comment I think this is not needed at all.
##########
pkg/scheduler/objects/application.go:
##########
@@ -1128,6 +1135,7 @@ func (sa *Application) tryReservedAllocate(headRoom
*resources.Resource, nodeIte
}
func (sa *Application) tryPreemption(reserve *reservation, ask *AllocationAsk)
bool {
+ ask.SetLastPreemptionAttempt(time.Now())
Review Comment:
We should not set this unless we have really preempted otherwise it looks
like we did but we really did not. See comment above.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]