[jira] [Commented] (FLINK-13249) Distributed Jepsen test fails with blocked TaskExecutor

Till Rohrmann (JIRA) Tue, 16 Jul 2019 01:18:08 -0700


    [ https://issues.apache.org/jira/browse/FLINK-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16885928#comment-16885928 ]


Till Rohrmann commented on FLINK-13249:
---------------------------------------

This is a very good question. The downside of the global executor is that it is 
not well defined who else is using it. Suppose another, low-priority user runs 
a long-running task on it. Then the partition request would again not be 
retriggered.
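
To make the concern concrete, here is a minimal sketch in plain Java. It is not 
Flink code: the class name, the method name and the single-threaded pool that 
stands in for a saturated shared executor are all assumptions made for 
illustration. It submits a blocking task for the "other user" and then a 
retrigger action, and reports whether the retrigger ran before a timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Flink.
class SharedExecutorStarvation {

    /**
     * Runs a blocking task on blockerPool, then submits a retrigger action to
     * retriggerPool. Returns true if the retrigger action ran within
     * timeoutMillis while the blocking task was still occupying its thread.
     */
    static boolean retriggerRuns(
            ExecutorService blockerPool,
            ExecutorService retriggerPool,
            long timeoutMillis) throws InterruptedException {
        CountDownLatch blockerStarted = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch retriggered = new CountDownLatch(1);

        // The "other user": a long-running task that holds a pool thread.
        blockerPool.execute(() -> {
            blockerStarted.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        blockerStarted.await();

        // Stand-in for retriggering the partition request.
        retriggerPool.execute(retriggered::countDown);
        boolean ran = retriggered.await(timeoutMillis, TimeUnit.MILLISECONDS);

        // Unblock the long-running task so the pools can drain.
        release.countDown();
        return ran;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption: one thread models a shared pool whose threads are all busy.
        ExecutorService shared = Executors.newSingleThreadExecutor();
        ExecutorService dedicated = Executors.newSingleThreadExecutor();
        try {
            System.out.println("retrigger on shared executor ran:    "
                    + retriggerRuns(shared, shared, 500));
            System.out.println("retrigger on dedicated executor ran: "
                    + retriggerRuns(shared, dedicated, 500));
        } finally {
            shared.shutdown();
            dedicated.shutdown();
        }
    }
}
```

With both tasks on the shared pool, the retrigger action should stay queued 
behind the blocking task until the timeout expires. With a dedicated executor 
it should run promptly, regardless of what other users submit to the shared one.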

If we don't have a concrete and immediate plan for refactoring this code, I 
would be slightly against using the global executor, because I fear that its 
use will stay in place for quite some time and could bite us later.

> Distributed Jepsen test fails with blocked TaskExecutor
> -------------------------------------------------------
>
>                 Key: FLINK-13249
>                 URL: https://issues.apache.org/jira/browse/FLINK-13249
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Network
>    Affects Versions: 1.9.0
>            Reporter: Till Rohrmann
>            Assignee: Stefan Richter
>            Priority: Blocker
>              Labels: test-stability
>             Fix For: 1.9.0
>
>         Attachments: jstack_25661_YarnTaskExecutorRunner
>
>
> The distributed Jepsen test which kills {{JobMasters}} started to fail 
> recently. At first glance, it looks as if the {{TaskExecutor's}} main 
> thread is blocked by some operation. Further investigation is required.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
