[ https://issues.apache.org/jira/browse/FLINK-23403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381829#comment-17381829 ]
Yang Wang commented on FLINK-23403:
-----------------------------------

I am afraid that decreasing the heartbeat timeout will have a major impact on production Flink workloads. For example:
* A full GC can take longer than 10s.
* Even though our internal network bandwidth at Alibaba is 10 Gb, we have still seen heartbeat timeouts when network pressure or the TCP retransmission rate is high. AFAIK, the network environment of self-built IDCs is no better than this.

> Decrease default values for heartbeat timeout and interval
> ----------------------------------------------------------
>
> Key: FLINK-23403
> URL: https://issues.apache.org/jira/browse/FLINK-23403
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Configuration, Runtime / Coordination
> Affects Versions: 1.14.0
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.14.0
>
>
> In order to speed up failure detection I suggest to decrease the default
> values for the heartbeat timeout and interval from 50s/10s to 15s/3s.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
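
For deployments that share the concerns above (long full-GC pauses, congested networks), the 50s/10s values mentioned in the issue could be pinned explicitly so that a change in the defaults does not affect them. A minimal sketch, assuming the settings go into flink-conf.yaml and that the `heartbeat.interval` and `heartbeat.timeout` options take values in milliseconds:

```yaml
# Pin the heartbeat settings to the 50s/10s values from the issue description.
# Values are assumed to be in milliseconds.
heartbeat.interval: 10000
heartbeat.timeout: 50000
```

Whether 50s is still the right timeout for a given cluster depends on its observed GC pauses and network behavior; the numbers here only restore the values the issue proposes to change.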