[ 
https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752036#comment-17752036
 ] 

Fang Yong commented on FLINK-32667:
-----------------------------------

[~chesnay] [~mapohl] Currently the HA service consists of two parts:

Part 1: The Flink cluster registers its dispatcher address and REST port with 
the HA service (zk, or a ConfigMap on k8s), so that the client can look up this 
information and submit jobs to the dispatcher via REST. This is needed in the 
OLAP scenario.

Part 2: The dispatcher validates, saves and recovers jobs using the 
JobGraphStore and JobResultStore, which is required for failover.

In the OLAP scenario, Part 2 stores jobs in external systems (zk/s3), which may 
cause latency jitter in OLAP queries, so we only need Part 1, not Part 2. My 
original idea was to add an option to the HA service so that it can use the 
embedded JobGraphStore and the in-memory JobResultStore; this may be the more 
suitable solution. The drawback is that a job will then not be recovered when 
the JM crashes, even if the job is configured with failover.
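
To make the first idea concrete, here is a minimal sketch of what such an 
option could look like. Every name in it (HaStoreOptionSketch, JobStoreMode, 
HaServices, create, recoversJobsAfterJmCrash) is hypothetical and is not 
Flink's actual API; it only models the trade-off described above, in which 
Part 1 stays enabled while Part 2 is switched to in-memory stores.

```java
// Hypothetical sketch only: these types are illustrative, not Flink classes.
public class HaStoreOptionSketch {

    /** Where job graphs and job results are kept (Part 2). */
    enum JobStoreMode { EXTERNAL, IN_MEMORY }

    /** A toy model of the two parts of the HA service. */
    static final class HaServices {
        final boolean registersDispatcher; // Part 1: address and REST port lookup
        final JobStoreMode jobStoreMode;   // Part 2: job graph and job result stores

        HaServices(boolean registersDispatcher, JobStoreMode jobStoreMode) {
            this.registersDispatcher = registersDispatcher;
            this.jobStoreMode = jobStoreMode;
        }
    }

    /** Assumed cluster-level option: persistJobs=false keeps jobs in memory only. */
    static HaServices create(boolean persistJobs) {
        JobStoreMode mode = persistJobs ? JobStoreMode.EXTERNAL : JobStoreMode.IN_MEMORY;
        // Part 1 is always kept so that clients can still find the dispatcher.
        return new HaServices(true, mode);
    }

    /** Jobs survive a JM crash only if they were written to an external store. */
    static boolean recoversJobsAfterJmCrash(HaServices services) {
        return services.jobStoreMode == JobStoreMode.EXTERNAL;
    }

    public static void main(String[] args) {
        HaServices olap = create(false);
        System.out.println("registersDispatcher=" + olap.registersDispatcher);
        System.out.println("jobStoreMode=" + olap.jobStoreMode);
        System.out.println("recoversJobsAfterJmCrash=" + recoversJobsAfterJmCrash(olap));
    }
}
```

With the option turned off, every job in the session cluster loses recovery, 
regardless of its own restart configuration, which is the drawback noted above.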

The second idea is to not store jobs that have no failover in the HA service. 
However, currently we can only determine this from the restart strategy, and 
the issues mentioned by [~chesnay] above also apply. 
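
A rough sketch of the second idea, again with purely hypothetical names 
(PerJobStoreSketch, RestartStrategy, shouldPersist) that are not Flink's API. 
It assumes the restart strategy is the only available signal for whether a job 
needs failover, which is exactly the limitation mentioned above.

```java
// Hypothetical sketch only: these types are illustrative, not Flink classes.
public class PerJobStoreSketch {

    /** Simplified stand-in for a job's configured restart strategy. */
    enum RestartStrategy { NONE, FIXED_DELAY, EXPONENTIAL_DELAY, FAILURE_RATE }

    /**
     * Decide per job whether it is written to the external HA store.
     * Assumption: a job without a restart strategy does not need failover.
     */
    static boolean shouldPersist(RestartStrategy strategy) {
        return strategy != RestartStrategy.NONE;
    }

    public static void main(String[] args) {
        for (RestartStrategy strategy : RestartStrategy.values()) {
            System.out.println(strategy + " -> persist=" + shouldPersist(strategy));
        }
    }
}
```

Here OLAP jobs submitted with no restart strategy skip the external store, 
while jobs with any restart strategy keep today's behaviour.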

What do you think? Is the first idea (adding an option to the HA service) 
feasible? Looking forward to your feedback, thanks.





> Use standalone store and embedding writer for jobs with no-restart-strategy 
> in session cluster
> ----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-32667
>                 URL: https://issues.apache.org/jira/browse/FLINK-32667
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.18.0
>            Reporter: Fang Yong
>            Assignee: Fang Yong
>            Priority: Major
>              Labels: pull-request-available
>
> When a Flink session cluster uses the zk or k8s high availability service, it 
> stores jobs in zk or in a ConfigMap. When we submit Flink OLAP jobs to the 
> session cluster, they always turn off the restart strategy. These jobs with 
> no-restart-strategy should not be stored in zk or in a ConfigMap on k8s.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
