[ 
https://issues.apache.org/jira/browse/FLINK-13555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-13555:
-------------------------------------

    Assignee: Xintong Song

> Failures of slot requests requiring unfulfillable managed memory should not 
> be ignored.
> ---------------------------------------------------------------------------------------
>
>                 Key: FLINK-13555
>                 URL: https://issues.apache.org/jira/browse/FLINK-13555
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.9.0
>            Reporter: Xintong Song
>            Assignee: Xintong Song
>            Priority: Blocker
>             Fix For: 1.9.0
>
>         Attachments: flink-unk-standalonesession-0-u-home.log, 
> flink-unk-taskexecutor-0-u-home.log
>
>
> Currently, SlotPool ignores failures of slot requests to the ResourceManager 
> for all batch slot requests. The idea is to let batch slot requests stay 
> pending in the SlotPool while they wait for other tasks to finish and 
> release slots. A slot request fails only if it is not fulfilled within 
> its timeout.
> However, a slot request to the RM can fail for two different reasons:
>  # The RM has no available slots. All slots are in use at the moment, but 
> they might become available later when the currently running tasks finish.
>  # The slot request requires more resources than any slot in the cluster 
> (available or not) can provide. The request is unlikely to be fulfilled 
> later either.
> For the second kind of failure, it doesn't make sense to wait for the 
> timeout. We should fail the job immediately, with a proper error message 
> describing the problem and suggesting that the user tune the job or cluster 
> configuration.
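
The distinction between the two kinds of failure described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not Flink's actual implementation: the class SlotRequestFailureSketch, the SlotSpec type, the Decision enum, and the decide method are all hypothetical names, and a slot is reduced to a single managed-memory figure in megabytes.

```java
import java.util.Arrays;
import java.util.List;

public class SlotRequestFailureSketch {

    /** Hypothetical, simplified slot description: managed memory only, in MB. */
    static final class SlotSpec {
        final int managedMemoryMb;

        SlotSpec(int managedMemoryMb) {
            this.managedMemoryMb = managedMemoryMb;
        }

        /** True if a slot of this size could satisfy the given requirement. */
        boolean canFulfill(SlotSpec required) {
            return managedMemoryMb >= required.managedMemoryMb;
        }
    }

    /** Hypothetical outcome of handling a failed batch slot request. */
    enum Decision { KEEP_PENDING, FAIL_IMMEDIATELY }

    /**
     * Decides what to do with a failed batch slot request.
     * allSlots is assumed to list every slot in the cluster, busy or free.
     * Assumption: the cluster is static, so an empty or too-small cluster
     * can never fulfill the request later.
     */
    static Decision decide(SlotSpec requested, List<SlotSpec> allSlots) {
        for (SlotSpec slot : allSlots) {
            if (slot.canFulfill(requested)) {
                // Kind 1: some slot is large enough, it is just in use right now.
                return Decision.KEEP_PENDING;
            }
        }
        // Kind 2: no slot in the cluster can ever satisfy the request.
        return Decision.FAIL_IMMEDIATELY;
    }

    public static void main(String[] args) {
        List<SlotSpec> cluster = Arrays.asList(new SlotSpec(128), new SlotSpec(256));
        System.out.println(decide(new SlotSpec(200), cluster));
        System.out.println(decide(new SlotSpec(1024), cluster));
    }
}
```

In this sketch, a request for 200 MB stays pending because the 256 MB slot could serve it once released, while a request for 1024 MB is failed at once because no slot in the cluster is large enough.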



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
