[
https://issues.apache.org/jira/browse/FLINK-26547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chesnay Schepler updated FLINK-26547:
-------------------------------------
Description:
To allow recovered TMs to eagerly re-offer their slots we allowed the
registration of slots without a matching requirement if the job is currently
restarting.
All slots that the pool accepts are mapped to a certain requirement, in order
to determine whether sufficient slots have been received yet. If a slot is reserved
for a requirement that does not coincide with the mapping the pool came up
with, then the mapping and requirements are changed accordingly to ensure we
still request sufficient slots.
This leads to issues with slots that were accepted without a matching
requirement. Those were mapped to the actual resource profile of the slot (to
fit into the book-keeping). With the above logic in place this could lead to a
specific resource requirement being added, which the remaining JM components
are not aware of (and thus will never get rid of).
was:To allow recovered TMs to eagerly re-offer their slots we allowed the
registration of slots without a matching requirement if the job is currently
restarting.
> Accepting slots without a matching requirement leads to unfulfillable
> requirements
> ----------------------------------------------------------------------------------
>
> Key: FLINK-26547
> URL: https://issues.apache.org/jira/browse/FLINK-26547
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.15.0
> Reporter: Chesnay Schepler
> Assignee: Chesnay Schepler
> Priority: Blocker
> Fix For: 1.15.0
>
>
> To allow recovered TMs to eagerly re-offer their slots we allowed the
> registration of slots without a matching requirement if the job is currently
> restarting.
> All slots that the pool accepts are mapped to a certain requirement, in order
> to determine whether sufficient slots have been received yet. If a slot is
> reserved for a requirement that does not coincide with the mapping the pool
> came up with, then the mapping and requirements are changed accordingly to
> ensure we still request sufficient slots.
> This leads to issues with slots that were accepted without a matching
> requirement. Those were mapped to the actual resource profile of the slot (to
> fit into the book-keeping). With the above logic in place this could lead to
> a specific resource requirement being added, which the remaining JM
> components are not aware of (and thus will never get rid of).
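
The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not Flink's actual slot pool code: the class SlotPoolSketch, its methods, and the string profiles "cpu=2" and "UNKNOWN" are all hypothetical, and requirements are matched by exact profile equality for simplicity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the book-keeping described in the issue.
class SlotPoolSketch {
    // Declared requirements: resource profile -> number of slots requested.
    final Map<String, Integer> requirements = new HashMap<>();
    // Accepted slot id -> requirement profile the slot is currently mapped to.
    final Map<String, String> slotToRequirement = new HashMap<>();
    boolean restarting;

    void declare(String profile, int count) {
        requirements.merge(profile, count, Integer::sum);
    }

    // Accepts a slot if it fills an open requirement, or unconditionally
    // while the job is restarting (the eager re-offer case).
    boolean offerSlot(String slotId, String slotProfile) {
        long alreadyMapped = slotToRequirement.values().stream()
                .filter(slotProfile::equals)
                .count();
        boolean matches = requirements.getOrDefault(slotProfile, 0) > alreadyMapped;
        if (!matches && !restarting) {
            return false;
        }
        // Without a matching requirement the slot is mapped to its own profile.
        slotToRequirement.put(slotId, slotProfile);
        return true;
    }

    // Reserving for a profile other than the mapped one updates the mapping
    // and shifts one unit of requirement back to the previously mapped profile.
    void reserve(String slotId, String requiredProfile) {
        String mapped = slotToRequirement.get(slotId);
        if (!mapped.equals(requiredProfile)) {
            slotToRequirement.put(slotId, requiredProfile);
            requirements.merge(requiredProfile, -1, Integer::sum);
            requirements.merge(mapped, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        SlotPoolSketch pool = new SlotPoolSketch();
        pool.restarting = true;
        pool.offerSlot("s1", "cpu=2");   // accepted without a matching requirement
        pool.restarting = false;
        pool.declare("UNKNOWN", 1);      // the only requirement anyone declared
        pool.reserve("s1", "UNKNOWN");
        // "cpu=2" is now required although no component ever declared it.
        System.out.println("cpu=2 required: " + pool.requirements.get("cpu=2"));
        System.out.println("UNKNOWN required: " + pool.requirements.get("UNKNOWN"));
    }
}
```

In this model the adjustment in reserve() is reasonable for slots that were matched to a declared requirement, but for a slot accepted during a restart it creates a requirement for "cpu=2" that no other component knows about and therefore nobody ever removes.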
--
This message was sent by Atlassian Jira
(v8.20.1#820001)