[ 
https://issues.apache.org/jira/browse/FLINK-19142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297082#comment-17297082
 ] 

Zhu Zhu edited comment on FLINK-19142 at 3/8/21, 6:59 AM:
----------------------------------------------------------

[~trohrmann] do you mean a case like this? 
 - Previously bulk_1 used 2 slots \{a1, a2\} and bulk_2 used 2 slots \{b1, b2\}, 
and these were the only 4 slots in the JM slot pool. Later slots a1 and b1 get 
lost, so bulk_1 and bulk_2 need to restart. bulk_1 cannot use b2, because b2 is 
a previous slot of bulk_2, so it has to request one new slot.
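
A minimal sketch of this case, assuming slots are identified by plain strings. 
The class and method names (SlotHijackScenario, usableSlots, newSlotsNeeded) 
are hypothetical and are not Flink APIs; they only model the counting argument.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the scenario above; not Flink code.
public class SlotHijackScenario {

    // Slots a bulk may take: available slots minus the previous slots of other bulks.
    static Set<String> usableSlots(Set<String> available, Set<String> previousOfOtherBulks) {
        Set<String> usable = new TreeSet<>(available);
        usable.removeAll(previousOfOtherBulks);
        return usable;
    }

    // Number of slots that still have to be requested from outside the pool.
    static int newSlotsNeeded(int required, Set<String> usable) {
        return Math.max(0, required - usable.size());
    }

    public static void main(String[] args) {
        // a1 and b1 are lost, so only a2 and b2 remain in the pool.
        Set<String> available = new TreeSet<>(Arrays.asList("a2", "b2"));
        Set<String> bulk2Previous = new TreeSet<>(Arrays.asList("b1", "b2"));

        Set<String> usable = usableSlots(available, bulk2Previous);
        // Prints: bulk_1 can reuse [a2], new slots needed: 1
        System.out.println("bulk_1 can reuse " + usable
                + ", new slots needed: " + newSlotsNeeded(2, usable));
    }
}
```

By symmetry bulk_2 can only reuse b2, so both bulks end up waiting for one new 
slot each.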

If so, I think it is not a problem for streaming jobs because the job is 
expected to acquire all required slots at the same time sooner or later.
For batch jobs, it is a problem because resource deadlocks can happen if the 
new slot requirements cannot be fulfilled. However, as far as I know, local 
recovery, in which case PreviousAllocationSlotSelectionStrategy is used, is not 
expected for batch jobs. So maybe we can let the job use 
LocationPreferenceSlotSelectionStrategy if it is a batch job, even if local 
recovery is enabled?
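
The proposal could be sketched roughly as below. The enums and the choose 
method are hypothetical stand-ins for illustration, not the actual Flink 
factory code; the enum constants only name the two strategies mentioned above.

```java
// Hypothetical sketch of the proposed selection rule; not Flink code.
public class StrategyChoice {

    enum JobType { STREAMING, BATCH }

    // Stand-ins for PreviousAllocationSlotSelectionStrategy and
    // LocationPreferenceSlotSelectionStrategy.
    enum Strategy { PREVIOUS_ALLOCATION, LOCATION_PREFERENCE }

    static Strategy choose(JobType jobType, boolean localRecoveryEnabled) {
        // Only streaming jobs with local recovery try to get their previous slots back.
        if (localRecoveryEnabled && jobType == JobType.STREAMING) {
            return Strategy.PREVIOUS_ALLOCATION;
        }
        // Batch jobs use location preference even if local recovery is enabled.
        return Strategy.LOCATION_PREFERENCE;
    }

    public static void main(String[] args) {
        System.out.println(choose(JobType.BATCH, true));
        System.out.println(choose(JobType.STREAMING, true));
    }
}
```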


was (Author: zhuzh):
[~trohrmann] do you mean a case like this? 
 - Previously bulk_1 used 2 slots \{a1, a2\} and bulk_2 used 2 slots \{b1, b2\}, 
and these were the only 4 slots in the JM slot pool. Later slots a1 and b1 get 
lost, so bulk_1 and bulk_2 need to restart. bulk_1 cannot use b2, because b2 is 
a previous slot of bulk_2, so it has to request one new slot.

If so, I think it is not a problem for streaming jobs because the job is 
expected to acquire all required slots at the same time sooner or later.
For batch jobs, it is a problem because resource deadlocks can happen if the 
new slot requirements cannot be fulfilled. However, as far as I know, local 
recovery, in which case PreviousAllocationSlotSelectionStrategy is used, is not 
expected for batch jobs. So maybe we can stick to 
LocationPreferenceSlotSelectionStrategy if it is a batch job, even if local 
recovery is enabled?

> Investigate slot hijacking from preceding pipelined regions after failover
> --------------------------------------------------------------------------
>
>                 Key: FLINK-19142
>                 URL: https://issues.apache.org/jira/browse/FLINK-19142
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.12.0
>            Reporter: Andrey Zagrebin
>            Assignee: Zhu Zhu
>            Priority: Major
>             Fix For: 1.13.0
>
>
> The ticket originates from [this PR 
> discussion|https://github.com/apache/flink/pull/13181#discussion_r481087221].
> The previous AllocationIDs are used by 
> PreviousAllocationSlotSelectionStrategy to schedule subtasks into the slot 
> where they were previously executed before a failover. If the previous slot 
> (AllocationID) is not available, we do not want subtasks to take previous 
> slots (AllocationIDs) of other subtasks.
> The MergingSharedSlotProfileRetriever gets all previous AllocationIDs of the 
> bulk from SlotSharingExecutionSlotAllocator but only from the current bulk. 
> The previous AllocationIDs of other bulks stay unknown. Therefore, the 
> current bulk can potentially hijack the previous slots from the preceding 
> bulks. On the other hand, the previous AllocationIDs of other tasks should be 
> taken if the other tasks are not going to run at the same time, e.g. when 
> there are not enough resources after failover or the other bulks are done.
> One way to do this may be to give MergingSharedSlotProfileRetriever all 
> previous AllocationIDs of the bulks which are going to run at the same time.
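
The idea in the last paragraph of the description could look roughly like the 
sketch below. It assumes bulks and AllocationIDs are plain strings; the class 
ReservedAllocations and the method reservedFor are hypothetical and do not 
exist in Flink.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; not the MergingSharedSlotProfileRetriever implementation.
public class ReservedAllocations {

    // Previous allocations of all OTHER bulks that will run at the same time.
    // The current bulk should not take these slots.
    static Set<String> reservedFor(String currentBulk,
                                   Map<String, Set<String>> previousByBulk,
                                   Set<String> concurrentBulks) {
        Set<String> reserved = new TreeSet<>();
        for (String bulk : concurrentBulks) {
            if (!bulk.equals(currentBulk)) {
                reserved.addAll(previousByBulk.getOrDefault(bulk, Collections.emptySet()));
            }
        }
        return reserved;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> previous = new HashMap<>();
        previous.put("bulk_1", new TreeSet<>(Arrays.asList("a1", "a2")));
        previous.put("bulk_2", new TreeSet<>(Arrays.asList("b1", "b2")));

        // bulk_2 runs concurrently: its previous slots are off limits for bulk_1.
        System.out.println(reservedFor("bulk_1", previous,
                new TreeSet<>(Arrays.asList("bulk_1", "bulk_2"))));
        // bulk_2 is done: nothing is reserved, so bulk_1 may take b1 and b2.
        System.out.println(reservedFor("bulk_1", previous,
                new TreeSet<>(Arrays.asList("bulk_1"))));
    }
}
```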



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
