1996fanrui commented on code in PR #25218:
URL: https://github.com/apache/flink/pull/25218#discussion_r1772439538
##########
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/allocator/StateLocalitySlotAssigner.java:
##########
@@ -139,6 +154,36 @@ public Collection<SlotAssignment> assignSlots(
return assignments;
}
+ /**
+ * The sorting principle and strategy here are very similar to {@link
Review Comment:
> The user explicitly set local state recovery in the scenario where
`StateLocalitySlotAssigner` is used (see
[execution.state-recovery.from-local](https://github.com/apache/flink/blob/7adeecd3445947f42d3e3d1e2961b9464e910236/flink-core/src/main/java/org/apache/flink/configuration/StateRecoveryOptions.java#L108)),
i.e. the user might value keeping the state on the machine in that case. 🤔

Yes, you are right. Another scenario is that users want the benefit of
local recovery (recovering state quickly) without wasting
resources.
If both cases are needed, it seems we need an additional option to
control this: when `StateLocalitySlotAssigner` is used, does local recovery
take priority, or do resources?
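
To make the question concrete, here is a rough sketch of what such an option could mean for slot selection. Everything in it is hypothetical: `SlotPrioritySketch`, `Priority`, and `Candidate` do not exist in Flink, and it assumes "saving resources" means preferring slots on TaskManagers that are already mostly used, so that idle ones can be released.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch only; none of these types are part of Flink. */
class SlotPrioritySketch {

    /** Hypothetical option values: what the assigner should favor first. */
    enum Priority { LOCAL_STATE, RESOURCES }

    /** Hypothetical candidate slot with the two properties being traded off. */
    static final class Candidate {
        final String taskManagerId;
        final long localStateBytes;       // state already present on this TaskManager
        final int freeSlotsOnTaskManager; // fewer free slots = TaskManager already well used

        Candidate(String taskManagerId, long localStateBytes, int freeSlotsOnTaskManager) {
            this.taskManagerId = taskManagerId;
            this.localStateBytes = localStateBytes;
            this.freeSlotsOnTaskManager = freeSlotsOnTaskManager;
        }
    }

    /** Builds an ordering whose first element is the preferred candidate. */
    static Comparator<Candidate> comparatorFor(Priority priority) {
        // Candidates holding more local state come first.
        Comparator<Candidate> mostLocalStateFirst =
                (a, b) -> Long.compare(b.localStateBytes, a.localStateBytes);
        // Candidates on fuller TaskManagers come first (assumed meaning of "resources").
        Comparator<Candidate> fullestTaskManagerFirst =
                (a, b) -> Integer.compare(a.freeSlotsOnTaskManager, b.freeSlotsOnTaskManager);

        // The option only decides which criterion wins; the other one breaks ties.
        return priority == Priority.LOCAL_STATE
                ? mostLocalStateFirst.thenComparing(fullestTaskManagerFirst)
                : fullestTaskManagerFirst.thenComparing(mostLocalStateFirst);
    }

    /** Picks the preferred candidate under the given priority, if any exist. */
    static Optional<Candidate> pick(List<Candidate> candidates, Priority priority) {
        return candidates.stream().min(comparatorFor(priority));
    }
}
```

With one candidate on a mostly idle TaskManager that holds local state and another on a nearly full TaskManager without state, `LOCAL_STATE` would pick the former and `RESOURCES` the latter, which is exactly the conflict described above.
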
Anyway, after our discussion, I think it's better to update only the
`DefaultSlotAssigner` in this PR. The strategy of `StateLocalitySlotAssigner` can be
discussed in a separate JIRA or on the mailing list.
WDYT?

> > The state locality only take effect during the job recovery, it's an
optimization.
>
> Why would that only have an affect during job recovery (i.e. when the
Dispatcher recovers the job)? Every rescale operation recovers from a
checkpoint in the end. Or am I misunderstanding you here?

Sorry, I didn't express myself clearly. I meant that `job recovery` happens
whenever a job starts or rescales.