[jira] [Commented] (FLINK-13162) Default value of slot.idle.timeout is too large for batch job

Xintong Song (JIRA) Wed, 10 Jul 2019 01:00:28 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-13162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881813#comment-16881813
 ]


Xintong Song commented on FLINK-13162:
--------------------------------------

[~zjffdu], [~fly_in_gis],

The slot idle timeout in the slot pool could also have some effect on batch 
jobs that use the Blink runner in Flink 1.9.

In Flink 1.9, slot managed memory for batch jobs is fine-grained and 
dynamically allocated. That means if a task requests a slot with 100MB of 
managed memory, it will get a slot with 100MB of managed memory. After the task 
finishes, the slot can only be reused by other tasks whose managed memory 
requirements are less than 100MB. If subsequent tasks need more managed memory 
and the cluster does not have other available resources, they may need to wait 
for the previous slot to time out.
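
To illustrate the reuse rule, here is a minimal Python sketch. It is not Flink 
code: IdleSlot, find_reusable_slot and wait_before_release_s are hypothetical 
names, and the sketch assumes an idle slot can serve any request that does not 
exceed its managed memory. The 50-second value is the default slot.idle.timeout 
quoted in the issue description.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IdleSlot:
    """Hypothetical model of an idle slot still held by the slot pool."""
    managed_memory_mb: int
    idle_since_s: float  # time at which the previous task finished


def find_reusable_slot(idle_slots: List[IdleSlot],
                       requested_mb: int) -> Optional[IdleSlot]:
    """Return the first idle slot whose managed memory covers the request.

    Assumption: a slot is reusable when the request does not exceed the
    managed memory the slot was originally allocated with.
    """
    for slot in idle_slots:
        if requested_mb <= slot.managed_memory_mb:
            return slot
    return None


def wait_before_release_s(slot: IdleSlot, now_s: float,
                          idle_timeout_s: float = 50.0) -> float:
    """Seconds left until the idle slot times out and can be released."""
    return max(0.0, slot.idle_since_s + idle_timeout_s - now_s)


# A finished task leaves behind one idle slot with 100MB of managed memory.
slot = IdleSlot(managed_memory_mb=100, idle_since_s=0.0)

small = find_reusable_slot([slot], requested_mb=80)   # fits, slot is reused
large = find_reusable_slot([slot], requested_mb=150)  # does not fit

# With no other resources in the cluster, the 150MB task has to wait for
# the idle slot to time out before a larger slot can be allocated.
remaining = wait_before_release_s(slot, now_s=10.0)
```

In this sketch the 80MB request reuses the idle slot immediately, while the 
150MB request finds no match and, 10 seconds after the first task finished, 
still has 40 seconds to wait under the default timeout.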

This should not affect streaming jobs or any job using the Flink runner, 
because those jobs do not have fine-grained resource requirements.

> Default value of slot.idle.timeout is too large for batch job
> -------------------------------------------------------------
>
>                 Key: FLINK-13162
>                 URL: https://issues.apache.org/jira/browse/FLINK-13162
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Configuration, Runtime / Coordination
>    Affects Versions: 1.9.0
>            Reporter: Jeff Zhang
>            Priority: Major
>         Attachments: image-2019-07-10-09-19-34-513.png
>
>
> The default value of slot.idle.timeout is 50 seconds, which is too large for 
> batch jobs.
> It prevents a downstream vertex from starting even after the upstream vertex 
> has finished: the downstream vertex has to wait for 50 seconds. This default 
> value does not make sense for batch jobs.



