[
https://issues.apache.org/jira/browse/FLINK-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955058#comment-16955058
]
Zili Chen commented on FLINK-14434:
-----------------------------------
Thanks for your reply [~trohrmann]. I found a possibly better option 3 for this
issue. The diff is tiny and expressive.
https://github.com/TisonKun/flink/commit/baacecb92f11cd367dc89bc48744fea6de94670b
In short, the problem described above arises mainly because, when the "job
manager runner result future" calls back, the "job manager runner created
future" is conceptually finished but not yet completed. Revisiting the
semantics of {{#createJobManagerRunner}}, we can simply return a future that
represents the creation and let the caller take care of the start.
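A minimal, self-contained sketch of the difference (this is not the code from
the commit linked above). {{Runner}} is a hypothetical stand-in for
{{JobManagerRunner}}, and the {{registered}} reference plays the role of the
entry in {{jobManagerRunnerFutures}}. To keep the interleaving deterministic,
the result callback fires inline from {{start()}}; in the Dispatcher it would
run on the MainThread while the creating thread is still inside the start step.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

final class CreationFutureSketch {

    /** Hypothetical stand-in for JobManagerRunner: start() fires the result callback at once. */
    static final class Runner {
        private final Runnable onResult;

        Runner(Runnable onResult) {
            this.onResult = onResult;
        }

        void start() {
            onResult.run();
        }
    }

    /** Start chained into the registered future: it completes only after start() returns. */
    static boolean runnerVisibleWhenStartIsChained() {
        AtomicReference<CompletableFuture<Runner>> registered = new AtomicReference<>();
        AtomicReference<Runner> seen = new AtomicReference<>();
        CompletableFuture<Runner> created = new CompletableFuture<>();

        registered.set(created.thenApply(runner -> {
            runner.start(); // the callback runs here, before this function returns
            return runner;
        }));

        // The callback looks the runner up the way the result handler would.
        created.complete(new Runner(() -> seen.set(registered.get().getNow(null))));
        return seen.get() != null;
    }

    /** Creation future registered as-is; the caller starts the runner as a separate action. */
    static boolean runnerVisibleWhenCallerStarts() {
        AtomicReference<CompletableFuture<Runner>> registered = new AtomicReference<>();
        AtomicReference<Runner> seen = new AtomicReference<>();
        CompletableFuture<Runner> created = new CompletableFuture<>();

        registered.set(created);            // the future represents creation only
        created.thenAccept(Runner::start);  // the start is the caller's follow-up action

        created.complete(new Runner(() -> seen.set(registered.get().getNow(null))));
        return seen.get() != null;
    }
}
```

In the first method the lookup observes {{null}}, which is the situation that
leads to the misleading "newer JobManagerRunner" report; in the second the
creation future is already complete when the callback runs, so the lookup sees
the runner.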
Compared with option 2, this approach has clearer semantics and one subtle
difference: it starts the {{JobManagerRunner}} in an akka-dispatcher thread,
not in the MainThread. Internally we have an issue if starting the JM runner
happens in the Dispatcher MainThread[1], but that issue doesn't exist in the
community codebase.
I'm glad to help with FLINK-11843 and FLINK-11719 on the review side. For this
issue I'd like to send this tiny patch as a pull request so that you can
coordinate patches depending on your schedule.
[1] FYI: It is an interesting case, but it deviates a bit from the community
codebase. We moved the job registry entirely into the Dispatcher, so when a job
manager runner is granted leadership it sends an RPC to the Dispatcher to query
the current job scheduling status. Our fork is currently based on 1.7, where
solution option 2 above leads to a deadlocking execution order:
1. the job manager runner's {{#start}} is called in the Dispatcher MainThread
2. the job manager runner's leader election service starts and, if it is
standalone (non-HA), directly calls grantLeadership
3. on being granted leadership, the job manager runner sends an RPC to the
Dispatcher to query the status and waits for the result
4. because {{#start}} still occupies the MainThread, that RPC cannot be
processed
We can work around this case in many ways, such as dispatching the action a
bit differently, but it suggests that if we can schedule an action outside the
Dispatcher MainThread without needing the synchronization that single-threaded
execution provides, we had better do so.
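The four steps above can be reproduced with plain executors. This is only a
sketch under stated assumptions: a single-thread {{ExecutorService}} stands in
for the Dispatcher MainThread, the nested task returning {{"RUNNING"}} stands
in for the status query RPC, and a timeout replaces the indefinite wait so the
example terminates. All names are hypothetical and none of this is Flink API.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class MainThreadDeadlockSketch {

    /**
     * Runs a simulated start() on startThread. The start is granted leadership inline,
     * queries the "dispatcher" (mainThread) and blocks for the answer.
     * Returns true if the query was answered before the timeout.
     */
    static boolean startRunnerOn(
            ExecutorService startThread, ExecutorService mainThread, long timeoutMillis) {
        Future<Boolean> start = startThread.submit(() -> {
            // Step 3: the status query must be processed by the main thread.
            Future<String> query = mainThread.submit(() -> "RUNNING");
            try {
                query.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                // Step 4: the main thread is busy running this very task.
                return false;
            }
        });
        try {
            return start.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns {answered when started on the main thread, answered when started elsewhere}. */
    static boolean[] demo() {
        ExecutorService mainThread = Executors.newSingleThreadExecutor();
        ExecutorService otherThread = Executors.newSingleThreadExecutor();
        try {
            boolean onMain = startRunnerOn(mainThread, mainThread, 200);
            boolean offMain = startRunnerOn(otherThread, mainThread, 5000);
            return new boolean[] {onMain, offMain};
        } finally {
            mainThread.shutdownNow();
            otherThread.shutdownNow();
        }
    }
}
```

When the start runs on the main thread, the query can only time out; when it
runs on another thread, the main thread is free to answer.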
> Dispatcher#createJobManagerRunner should returns on creation succeed, not
> after startJobManagerRunner
> -----------------------------------------------------------------------------------------------------
>
> Key: FLINK-14434
> URL: https://issues.apache.org/jira/browse/FLINK-14434
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.10.0
> Reporter: Zili Chen
> Assignee: Zili Chen
> Priority: Major
> Fix For: 1.10.0
>
> Attachments: patch.diff
>
>
> In an edge case, let's say
> 1) the job finishes nearly immediately
> 2) the Dispatcher is suspended in {{#startJobManagerRunner}} after
> {{jobManagerRunner.start();}} but before {{return jobManagerRunner;}}
> Because
> 1) the future we put into {{jobManagerRunnerFutures}} completes only after
> {{#startJobManagerRunner}} has finished, and
> 2) the creation of the JobManagerRunner doesn't happen in the MainThread,
> the following execution order is possible:
> 1) the JobManagerRunner is created in an akka-dispatcher thread
> 2) that thread then applies {{Dispatcher#startJobManagerRunner}}
> 3) it runs up to {{jobManagerRunner.start();}} but not yet {{return jobManagerRunner;}}
> 4) the thread is suspended
> 5) the job finishes and its callback executes on the MainThread
> 6) {{jobManagerRunnerFutures.get(jobID).getNow(null)}} returns {{null}}
> because the akka-dispatcher thread has not yet reached {{return jobManagerRunner;}}
> 7) the callback reports {{There is a newer JobManagerRunner for the job}}
> although there is none.
> **Solution**
> There are two approaches, and we can even apply both.
> 1. return {{jobManagerRunnerFuture}} from {{#createJobManagerRunner}} and make
> {{#startJobManagerRunner}} a separate action
> 2. once the JobManagerRunner is created, execute {{#startJobManagerRunner}} in
> the MainThread.
> CC [~trohrmann]