[ 
https://issues.apache.org/jira/browse/FLINK-27357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus H Christ updated FLINK-27357:
-----------------------------------
    Description: 
Flink on Kubernetes has High Availability services which build an external 
service for the web ui access.

[https://sourcegraph.com/github.com/apache/flink@master/-/blob/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/decorators/ExternalServiceDecorator.java]

When several JobManagers run, for example in reactive mode or as standbys that 
avoid the 15s or so between JobManager restarts in K8S High Availability 
services, a new JobManager leader can be elected. In this case I think the 
service no longer points to the correct JobManager, because any of the 
JobManager pods can have the jobmanager label.

It might help to use the endpoint of one specific JobManager pod for the 
external web UI service, similar to how TaskManagers use JobManager IPs for 
High Availability.

It might also help to update the external service so that it points to the new 
leader's IP or endpoint whenever the ConfigMap for JobManager leader election 
is updated:

[https://sourcegraph.com/github.com/apache/flink/-/blob/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/highavailability/KubernetesLeaderRetrievalDriverFactory.java]
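
To make the concern concrete, here is a minimal sketch in plain Java. It is not 
Flink or Kubernetes client code: the Pod class, the component=jobmanager label, 
the pod names and the IPs are all assumptions made up for illustration. It only 
models the difference between a service that selects pods by label and one that 
is pinned to the elected leader's IP.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class LeaderServiceSketch {

    /** Hypothetical, simplified view of a pod: name, IP and labels. */
    static final class Pod {
        final String name;
        final String ip;
        final Map<String, String> labels;

        Pod(String name, String ip, Map<String, String> labels) {
            this.name = name;
            this.ip = ip;
            this.labels = labels;
        }
    }

    /** Label-selector semantics: a pod matches when it carries every selector entry. */
    static List<String> backendsForSelector(List<Pod> pods, Map<String, String> selector) {
        List<String> ips = new ArrayList<>();
        for (Pod pod : pods) {
            boolean matches = true;
            for (Map.Entry<String, String> entry : selector.entrySet()) {
                if (!entry.getValue().equals(pod.labels.get(entry.getKey()))) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                ips.add(pod.ip);
            }
        }
        return ips;
    }

    /** Alternative from this issue: expose only the pod whose IP is the leader's IP. */
    static List<String> backendsForLeader(List<Pod> pods, String leaderIp) {
        List<String> ips = new ArrayList<>();
        for (Pod pod : pods) {
            if (pod.ip.equals(leaderIp)) {
                ips.add(pod.ip);
            }
        }
        return ips;
    }

    public static void main(String[] args) {
        // Assumed label; both JobManager pods carry it, leader or not.
        Map<String, String> selector = Map.of("component", "jobmanager");
        List<Pod> pods = List.of(
                new Pod("jm-0", "10.0.0.1", selector),
                new Pod("jm-1", "10.0.0.2", selector));

        // The label selector cannot tell the leader from a standby.
        System.out.println(backendsForSelector(pods, selector)); // [10.0.0.1, 10.0.0.2]

        // Pinning to the leader's IP (assumed here to be jm-1) yields one backend.
        System.out.println(backendsForLeader(pods, "10.0.0.2")); // [10.0.0.2]
    }
}
```

In a real cluster the leader's address would have to come from the leader 
election ConfigMap mentioned above; how to read it and how to apply it to the 
service is left open here.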

  was:
Flink on Kubernetes has High Availability services which build an external 
service for the web ui access.

[https://sourcegraph.com/github.com/apache/flink@master/-/blob/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/decorators/ExternalServiceDecorator.java]

In the case of multiple job managers in reactive mode or to avoid 15s or so 
between job manager restarts in K8S High Availability services, a new job 
manager leader can be elected. In this case I think the service no longer 
points to the correct job manager as any of the job manager pods can have the 
jobmanager label.

It might help to use the endpoint of the JobManager for the external service 
web UI, similar to how TaskManagers do for High Availability.

 

It might also help to update the service with the new IP or endpoint somehow 
for the external service to point to when updating the ConfigMap for JobManager 
leader election:

[https://sourcegraph.com/github.com/apache/flink/-/blob/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/highavailability/KubernetesLeaderRetrievalDriverFactory.java]


> In Flink HA Service on K8S, Web UI External Service should point to elected 
> Job Manager leader's IP
> ---------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-27357
>                 URL: https://issues.apache.org/jira/browse/FLINK-27357
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Jesus H Christ
>            Priority: Minor
>
> Flink on Kubernetes has High Availability services which build an external 
> service for the web ui access.
> [https://sourcegraph.com/github.com/apache/flink@master/-/blob/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/decorators/ExternalServiceDecorator.java]
> When several JobManagers run, for example in reactive mode or as standbys 
> that avoid the 15s or so between JobManager restarts in K8S High 
> Availability services, a new JobManager leader can be elected. In this case 
> I think the service no longer points to the correct JobManager, because any 
> of the JobManager pods can have the jobmanager label.
> It might help to use the endpoint of one specific JobManager pod for the 
> external web UI service, similar to how TaskManagers use JobManager IPs for 
> High Availability.
>
> It might also help to update the external service so that it points to the 
> new leader's IP or endpoint whenever the ConfigMap for JobManager leader 
> election is updated:
> [https://sourcegraph.com/github.com/apache/flink/-/blob/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/highavailability/KubernetesLeaderRetrievalDriverFactory.java]



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
