[ 
https://issues.apache.org/jira/browse/FLINK-17598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256246#comment-17256246
 ] 

hayden zhou commented on FLINK-17598:
-------------------------------------

[~fly_in_gis] I am trying to deploy Flink on K8s in HA mode, using a PV as the 
HA `storageDir`. Leader election appears to succeed, but when I submit a job I 
get the error "Service temporarily unavailable due to an ongoing leader 
election. Please refresh."

> Implement FileSystemHAServices for native K8s setups
> ----------------------------------------------------
>
>                 Key: FLINK-17598
>                 URL: https://issues.apache.org/jira/browse/FLINK-17598
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Deployment / Kubernetes, Runtime / Coordination
>            Reporter: Canbin Zheng
>            Priority: Major
>
> At the moment we use Zookeeper as a distributed coordinator for implementing 
> JobManager high availability services. But in cloud-native environments, 
> more and more users prefer to use *Kubernetes* as the underlying scheduler 
> backend and *object storage* as the storage medium, neither of which 
> requires a Zookeeper deployment.
> As a result, in K8s setups, people have to deploy and maintain their own 
> Zookeeper clusters just to solve the JobManager SPOF. This ticket proposes to 
> provide a simplified FileSystem HA implementation with leader election 
> removed, which saves the effort of deploying Zookeeper.
> To achieve this, we plan to:
> # Introduce a {{FileSystemHaServices}} which implements the 
> {{HighAvailabilityServices}}.
> # Replace Deployment with StatefulSet to ensure *at most one* semantics, 
> preventing potential concurrent access to the underlying FileSystem.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
