[ 
https://issues.apache.org/jira/browse/FLINK-12887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869086#comment-16869086
 ] 

Xiaogang Shi commented on FLINK-12887:
--------------------------------------

Hi [~till.rohrmann], we currently use many unfenced asynchronous operations in 
the Yarn RM to process notifications from Yarn. Otherwise, the Yarn RM would miss 
notifications that arrive before it has been granted leadership.

Another case is the timers that release stuck containers. When a Yarn RM 
restarts, it recovers containers from previous attempts. Some of these containers 
may be stuck, and we should kill them to release their resources. We now use timers 
to monitor the recovered containers and kill those whose task 
managers do not register in time. The timers must be unfenced because the Yarn 
RM may not have been granted leadership yet when it recovers the containers.
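
To make the timer case concrete, here is a minimal sketch under simplifying 
assumptions. The class and all of its method names are hypothetical and only 
model fenced versus unfenced delivery; they are not Flink's actual API. A 
fenced timeout scheduled before leadership is granted carries no valid token 
and is dropped, while an unfenced timeout still releases the container.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified, hypothetical model of fenced vs. unfenced delivery; not Flink's actual API.
final class RecoveredContainerTimeoutSketch {
    // null means leadership has not been granted yet.
    private String fencingToken = null;
    private final Set<String> pendingContainers = new HashSet<>();
    private final Set<String> releasedContainers = new HashSet<>();

    void grantLeadership(String token) {
        fencingToken = token;
    }

    // Called when a container from a previous attempt is recovered.
    void recoverContainer(String containerId) {
        pendingContainers.add(containerId);
    }

    void onTaskManagerRegistered(String containerId) {
        pendingContainers.remove(containerId);
    }

    // Fenced delivery: the action runs only if the token captured at scheduling
    // time matches the current fencing token. Otherwise the message is dropped.
    boolean deliverFenced(String tokenAtScheduling, Runnable action) {
        if (fencingToken == null || !fencingToken.equals(tokenAtScheduling)) {
            return false;
        }
        action.run();
        return true;
    }

    // Unfenced delivery: the action always runs, regardless of leadership.
    void deliverUnfenced(Runnable action) {
        action.run();
    }

    // Timeout action: release the container if its task manager never registered.
    void releaseIfUnregistered(String containerId) {
        if (pendingContainers.remove(containerId)) {
            releasedContainers.add(containerId);
        }
    }

    boolean isReleased(String containerId) {
        return releasedContainers.contains(containerId);
    }

    String currentToken() {
        return fencingToken;
    }
}
```

In this model, a timeout scheduled during recovery captures a null token. If 
it were delivered as a fenced message after leadership is granted, the token 
check would fail and the stuck container would never be released.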


> Schedule UnfencedMessage would lost envelope info 
> --------------------------------------------------
>
>                 Key: FLINK-12887
>                 URL: https://issues.apache.org/jira/browse/FLINK-12887
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.9.0
>            Reporter: TisonKun
>            Priority: Major
>
> We provide {{runAsync}}, {{callAsync}} and {{scheduleRunAsync}} for 
> {{MainThreadExecutable}}, while providing {{runAsyncWithoutFencing}} and 
> {{callAsyncWithoutFencing}} additionally for {{FencedMainThreadExecutable}}.
> Consider the case where we want to schedule an unfenced runnable or any 
> other unfenced message (currently we have no such code path, but it is 
> semantically valid):
> 1. {{FencedAkkaRpcActor}} receives an unfenced runnable with a delay.
> 2. It extracts the runnable from the unfenced message and calls 
> {{super.handleRpcMessage}}.
> 3. {{AkkaRpcActor}} envelopes the message and schedules it at 
> {{AkkaRpcActor#L410}}.
> However, {{FencedAkkaRpcActor#envelopeSelfMessage}} is called to build the 
> envelope, so the unfenced message now becomes a fenced message.
> We could still implement {{scheduleRunAsyncWithoutFencing}} to schedule an 
> unfenced message directly via {{actorsystem.scheduler.scheduleOnce(..., 
> dispatcher)}}, but in the current codebase I notice that {{RunAsync}} has a 
> weird {{atTimeNanos}} (i.e., delay) property. Ideally, how a message is 
> scheduled should be determined by the parameters that ScheduleExecutorService 
> is called with. At the very least, we should not extract an unfenced message, 
> envelop it into a fenced message, and then schedule it, because that has the 
> wrong semantics.
> cc [~till.rohrmann]
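
The three steps in the description can be reproduced with a small model. This 
is only a sketch: {{EnvelopeSketch}}, {{BaseActor}}, {{FencedActor}} and the 
message classes are hypothetical stand-ins, not the real {{AkkaRpcActor}} or 
{{FencedAkkaRpcActor}}, and the scheduler is replaced by a plain list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the three steps above; these are not Flink's real classes.
final class EnvelopeSketch {
    static final class UnfencedMessage {
        final Object payload;

        UnfencedMessage(Object payload) {
            this.payload = payload;
        }
    }

    static final class FencedMessage {
        final String fencingToken;
        final Object payload;

        FencedMessage(String fencingToken, Object payload) {
            this.fencingToken = fencingToken;
            this.payload = payload;
        }
    }

    // Stands in for the base actor: delayed messages are enveloped, then scheduled.
    static class BaseActor {
        final List<Object> scheduled = new ArrayList<>();

        Object envelopeSelfMessage(Object message) {
            return message;
        }

        void handleRpcMessage(Object message) {
            // Step 3: the virtual call picks the subclass override, if any.
            scheduled.add(envelopeSelfMessage(message));
        }
    }

    // Stands in for the fenced actor: every self message gets the fencing token.
    static class FencedActor extends BaseActor {
        private final String fencingToken;

        FencedActor(String fencingToken) {
            this.fencingToken = fencingToken;
        }

        @Override
        Object envelopeSelfMessage(Object message) {
            return new FencedMessage(fencingToken, message);
        }

        void receive(Object message) {
            if (message instanceof UnfencedMessage) {
                // Step 2: unwrap the payload and delegate to the base handling.
                handleRpcMessage(((UnfencedMessage) message).payload);
            } else {
                handleRpcMessage(message);
            }
        }
    }

    // Step 1: send an unfenced runnable and return what ends up scheduled.
    static Object scheduleUnfencedRunnable() {
        FencedActor actor = new FencedActor("token-1");
        Runnable runnable = () -> { };
        actor.receive(new UnfencedMessage(runnable));
        return actor.scheduled.get(0);
    }
}
```

In this model the scheduled object is a {{FencedMessage}}, because the base 
class calls the overridden {{envelopeSelfMessage}} after the unfenced wrapper 
has already been stripped. The information that the message was unfenced is 
lost at that point.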



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
