[jira] [Commented] (FLINK-26719) Rethink the default reschedule reconcile loop

Aitozi (Jira) Sat, 19 Mar 2022 02:35:07 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-26719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509231#comment-17509231
 ]


Aitozi commented on FLINK-26719:
--------------------------------

> If we do not want to provide stronger resiliency/guarantees than the Flink 
> native integration in itself then I guess we do not need to check, or it's 
> enough to check at larger intervals.

I understand the general idea now. In other words, we use the reconcile loop 
to do the periodic check and plan to produce ERROR events from it, right? 

I think it's an interesting feature to explore. It could become a monitoring 
or self-healing capability of the operator. The monitoring could use either 
polling or an informer-based technique.

Thanks, everyone, for the explanation. Let's see how this ability 
evolves :).

> Rethink the default reschedule reconcile loop
> ---------------------------------------------
>
>                 Key: FLINK-26719
>                 URL: https://issues.apache.org/jira/browse/FLINK-26719
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Aitozi
>            Priority: Major
>
> When I tested locally, I found that the operator reschedules and reconciles 
> at the interval set by {{operator.reconciler.reschedule.interval.sec}}. I 
> wonder why we need this. I think we only need to reconcile when
>  # waiting for the status change
>  # receiving a new event
>  # waiting for the savepoint result
> So when JobManagerDeploymentStatus is Ready, we do not have to trigger the 
> reconcile except when waiting for the savepoint result.
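
For illustration, the rule proposed in the description above could be 
sketched roughly as follows. This is only a minimal sketch and not the 
operator's actual code: the class, the enum, and the method nextReconcile 
are hypothetical names, and the savepointInProgress flag stands in for 
whatever state the operator really tracks. New events are assumed to 
trigger a reconcile on their own, so only the timed reschedule is modeled.

```java
import java.time.Duration;
import java.util.Optional;

public class RescheduleSketch {

    // Hypothetical stand-in for JobManagerDeploymentStatus; only "Ready"
    // is named in the issue, DEPLOYING is an assumed not-ready state.
    enum DeploymentStatus { DEPLOYING, READY }

    /**
     * Returns the delay before the next timed reconcile, or empty when the
     * operator can simply wait for the next event.
     */
    static Optional<Duration> nextReconcile(
            DeploymentStatus status, boolean savepointInProgress, Duration interval) {
        // Case 1: still waiting for the status to change, keep polling.
        if (status != DeploymentStatus.READY) {
            return Optional.of(interval);
        }
        // Case 3: ready, but a savepoint result is still pending.
        if (savepointInProgress) {
            return Optional.of(interval);
        }
        // Ready and nothing pending: no timed reschedule (case 2, a new
        // event, is assumed to trigger the reconcile by itself).
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed value for operator.reconciler.reschedule.interval.sec.
        Duration interval = Duration.ofSeconds(60);
        System.out.println(nextReconcile(DeploymentStatus.DEPLOYING, false, interval).isPresent());
        System.out.println(nextReconcile(DeploymentStatus.READY, true, interval).isPresent());
        System.out.println(nextReconcile(DeploymentStatus.READY, false, interval).isPresent());
    }
}
```

Under this sketch, a periodic health check as discussed in the comment 
would amount to returning a (possibly larger) interval in the last branch 
instead of an empty result.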



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
