[jira] [Commented] (FLINK-23654) Allow configurable number of jobmanager-future threads

Till Rohrmann (Jira) Tue, 10 Aug 2021 05:59:20 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-23654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17396663#comment-17396663
 ]


Till Rohrmann commented on FLINK-23654:
---------------------------------------

Thanks for reporting this issue and for the analysis, [~raganico] and 
[~Thesharing]. I think you are right that it is not good to use the same 
thread pool for short-lived future callback executions and heavy I/O 
operations. Splitting the thread pool and introducing advanced configuration 
options to make the number of threads configurable makes a lot of sense. Do 
either of you have time to work on this issue?
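
To illustrate why splitting the pool helps, here is a minimal sketch using only 
java.util.concurrent. It is not Flink code: the class and method names are 
hypothetical, and the single-thread pool sizes are chosen only to make the 
effect visible. A simulated blocking I/O task occupies the I/O pool while a 
short callback still completes on the separate future pool.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not part of Flink.
final class SplitExecutorsSketch {

    /** Returns true if a short callback completes while the I/O pool is still blocked. */
    static boolean callbackRunsWhileIoBlocked() {
        // Assumption: one thread per pool, just to keep the demonstration small.
        ExecutorService futureExecutor = Executors.newFixedThreadPool(1);
        ExecutorService ioExecutor = Executors.newFixedThreadPool(1);
        CountDownLatch releaseIo = new CountDownLatch(1);
        try {
            // Simulated blocking I/O: holds the only I/O thread until released.
            ioExecutor.submit(() -> {
                releaseIo.await();
                return null;
            });
            // The short-lived callback runs on its own pool and is not queued
            // behind the blocked I/O task.
            CompletableFuture<String> callback =
                    CompletableFuture.supplyAsync(() -> "ok", futureExecutor);
            return "ok".equals(callback.get(5, TimeUnit.SECONDS));
        } catch (Exception e) {
            return false;
        } finally {
            releaseIo.countDown();
            futureExecutor.shutdown();
            ioExecutor.shutdown();
        }
    }
}
```

If both tasks were submitted to one single-threaded pool instead, the callback 
would wait until the blocking task finished.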

> Allow configurable number of jobmanager-future threads
> ------------------------------------------------------
>
>                 Key: FLINK-23654
>                 URL: https://issues.apache.org/jira/browse/FLINK-23654
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / REST
>    Affects Versions: 1.14.0, 1.12.5, 1.13.2
>            Reporter: Nicolas Raga
>            Priority: Critical
>             Fix For: 1.14.0
>
>
> The JobManagerSharedServices futureExecutor is used for asynchronous requests 
> in multiple Flink components. When the JobMaster creates the execution graph, 
> it passes the *scheduledExecutorService* (which is the 
> jobManagerSharedServices.getScheduledExecutorService) as both the 
> *futureExecutor* and the *ioExecutor*. In the ExecutionGraph, the 
> *ioExecutor* is the executor used to execute blocking I/O operations. It is 
> also passed to the *CheckpointCoordinator*, which uses it for asynchronous 
> calls such as disposing of pending checkpoints, cleaning up failed 
> checkpoints, etc. The *futureExecutor* is even passed on to the *Execution* 
> class, where it is used to dispatch callbacks from futures and asynchronous 
> RPC calls from within vertices! Lastly, this executor is also used to process 
> asynchronous requests from the Flink REST endpoint.
>  
> Hence, when the endpoint is used for monitoring while large checkpoints or 
> blocking I/O operations run on the same thread pool, the endpoint's 
> performance degrades. We have already verified in testing that increasing 
> this thread count allows faster responses to incoming requests. We can begin 
> by simply exposing a *jobmanager.future-thread.factor* option that applies a 
> multiplier to the number of CPUs. Afterwards, we can consider a dedicated 
> thread pool for blocking I/O that won't degrade the performance of the REST 
> endpoint.
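
As a rough sketch of how a factor such as the proposed 
*jobmanager.future-thread.factor* could translate into a pool size: the class 
and method names below are hypothetical, not Flink API, and rounding up with a 
floor of one thread is an assumption, since the issue does not specify how the 
factor is applied.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

// Hypothetical helper, not part of Flink.
final class FutureThreadSizing {

    /** Computes the pool size as cpuCores * factor, rounded up, with at least one thread. */
    static int futureThreads(int cpuCores, double factor) {
        if (cpuCores < 1 || factor <= 0) {
            throw new IllegalArgumentException("cpuCores must be >= 1 and factor must be > 0");
        }
        // Assumption: round up so that fractional factors never yield zero threads.
        return Math.max(1, (int) Math.ceil(cpuCores * factor));
    }

    /** Builds a scheduled executor sized from the CPU count of the current machine. */
    static ScheduledExecutorService newFutureExecutor(double factor) {
        int cores = Runtime.getRuntime().availableProcessors();
        return Executors.newScheduledThreadPool(futureThreads(cores, factor));
    }
}
```

With these assumptions, 4 cores and a factor of 2.0 give 8 threads, and 
1 core with a factor of 0.5 still gives 1 thread.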



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
