[
https://issues.apache.org/jira/browse/FLINK-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhu Zhu updated FLINK-13241:
----------------------------
Description:
When a job allocates a few slots first and, after a while, allocates some more,
the YarnResourceManager seems to receive but ignore the later slot requests.
To reproduce this issue, we can create a job with 2 vertices in different slot
sharing groups, as shown below:
!17_37_05__07_12_2019.jpg|width=433,height=127!
Slot allocation for the map2 vertex happens only after the source vertex has
acquired its slots and its location is known, so that map2 can meet its input
constraints.
The YarnResourceManager receives the slot requests for map2 but does not seem
to handle them, and the job hangs waiting for resources.
In my observation, this issue does not happen on Flink (Version: 1.9-SNAPSHOT,
Rev:3bc322a, Date:26.06.2019 @ 17:28:51 CST), so it should be a new issue
introduced after that revision.
was:
When a job allocates a few slots first and, after a while, allocates some more,
the YarnResourceManager seems to receive but ignore the later slot requests.
To reproduce this issue, we can create a job with 2 vertices in different slot
sharing groups, as shown below:
!17_37_05__07_12_2019.jpg|width=433,height=127!
The map2 vertex needs to wait for the source vertex to decide its location, in
order to meet the input constraints.
The YarnResourceManager receives the slot requests for map2 but does not seem
to handle them, and the job hangs waiting for resources.
In my observation, this issue does not happen on Flink (Version: 1.9-SNAPSHOT,
Rev:3bc322a, Date:26.06.2019 @ 17:28:51 CST), so it should be a new issue
introduced after that revision.
> YarnResourceManager does not handle slot allocations in certain cases
> ---------------------------------------------------------------------
>
> Key: FLINK-13241
> URL: https://issues.apache.org/jira/browse/FLINK-13241
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.9.0
> Reporter: Zhu Zhu
> Priority: Major
> Attachments: 17_37_05__07_12_2019.jpg
>
>
> When a job allocates a few slots first and, after a while, allocates some
> more, the YarnResourceManager seems to receive but ignore the later slot
> requests.
> To reproduce this issue, we can create a job with 2 vertices in different
> slot sharing groups, as shown below:
> !17_37_05__07_12_2019.jpg|width=433,height=127!
> Slot allocation for the map2 vertex happens only after the source vertex has
> acquired its slots and its location is known, so that map2 can meet its input
> constraints.
> The YarnResourceManager receives the slot requests for map2 but does not seem
> to handle them, and the job hangs waiting for resources.
> In my observation, this issue does not happen on Flink (Version: 1.9-SNAPSHOT,
> Rev:3bc322a, Date:26.06.2019 @ 17:28:51 CST), so it should be a new issue
> introduced after that revision.
>