[ 
https://issues.apache.org/jira/browse/FLINK-19069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189926#comment-17189926
 ] 

Till Rohrmann commented on FLINK-19069:
---------------------------------------

I think this is indeed a valid problem which we should fix. The timeout that 
[~wind_ljy] describes occurs when a user requests information from the 
{{JobMaster}} while its main thread is blocked. The blocked main thread will 
also affect the heartbeats and, thus, the stability of the system.

We actually also run {{initializeOnMaster}}, the counterpart of 
{{finalizeOnMaster}}, on the main thread. At the moment this is not a 
problem because we do this before the {{JobMaster}} starts working.
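
To illustrate the general idea (this is only a sketch in plain 
{{java.util.concurrent}}, not Flink code and not necessarily how the fix will 
look): a blocking finalization step can run on a separate I/O executor, with 
the follow-up state change hopping back to the single main thread. All names 
below ({{OffloadFinalizeSketch}}, {{slowFinalize}}, {{runDemo}}, 
{{mainThread}}, {{ioExecutor}}) are hypothetical, and the 500 ms sleep stands 
in for a long file commit.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OffloadFinalizeSketch {

    /** Hypothetical stand-in for a slow finalizeOnMaster, e.g. committing many files. */
    static void slowFinalize() {
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /**
     * Runs the slow finalization on ioExecutor while the single main thread keeps
     * serving "heartbeat" tasks. Returns the number of heartbeats that were
     * submitted and answered before the finalization pipeline was seen as done.
     */
    static int runDemo() throws Exception {
        // Assumption: one single-threaded executor models the main thread.
        ExecutorService mainThread = Executors.newSingleThreadExecutor();
        ExecutorService ioExecutor = Executors.newCachedThreadPool();
        try {
            // Offload the blocking work, then hop back to the main thread
            // for the (cheap) state update.
            CompletableFuture<String> done = CompletableFuture
                    .runAsync(OffloadFinalizeSketch::slowFinalize, ioExecutor)
                    .thenApplyAsync(ignored -> "FINISHED", mainThread);

            int heartbeats = 0;
            while (!done.isDone()) {
                // Each heartbeat is a task on the main thread. It returns quickly
                // because the main thread is not blocked by slowFinalize().
                mainThread.submit(() -> { }).get(100, TimeUnit.MILLISECONDS);
                heartbeats++;
                Thread.sleep(50);
            }
            done.get(); // surface any failure from the finalization
            return heartbeats;
        } finally {
            mainThread.shutdownNow();
            ioExecutor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("heartbeats served during finalization: " + runDemo());
    }
}
```

If {{slowFinalize}} were instead submitted to {{mainThread}} directly, the 
heartbeat {{get}} with its 100 ms timeout would fail, which mirrors the client 
timeout described in the issue.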

> finalizeOnMaster takes too much time and client timeouts
> --------------------------------------------------------
>
>                 Key: FLINK-19069
>                 URL: https://issues.apache.org/jira/browse/FLINK-19069
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination, Runtime / Task
>    Affects Versions: 1.9.0
>            Reporter: Jiayi Liao
>            Priority: Critical
>             Fix For: 1.12.0, 1.11.2, 1.10.3
>
>
> Currently we execute {{finalizeOnMaster}} in the JM's main thread, which may 
> block the JM for a very long time so that the client eventually times out. 
> For example, we'd like to write data to HDFS and commit files on the JM, which 
> takes more than ten minutes when committing tens of thousands of files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
