[jira] [Created] (FLINK-20249) Rethink the necessity of the k8s internal Service even on non-HA mode

jiang7chengzitc (Jira) Thu, 19 Nov 2020 05:56:09 -0800

jiang7chengzitc created FLINK-20249:
---------------------------------------


             Summary: Rethink the necessity of the k8s internal Service even on 
non-HA mode
                 Key: FLINK-20249
                 URL: https://issues.apache.org/jira/browse/FLINK-20249
             Project: Flink
          Issue Type: Improvement
          Components: Deployment / Kubernetes
    Affects Versions: 1.11.0
            Reporter: jiang7chengzitc
             Fix For: 1.11.3
         Attachments: flink internal service.pdf

In non-HA mode, an internal Service is created on k8s that routes the
communication from the TaskManager Pods to the JobManager Pod, so that TM Pods
can re-register with the new JM Pod once a JM Pod failover occurs.

However, I recently ran an experiment and found that, once a JM Pod failover
occurs, k8s first creates new TM Pods and then destroys the old TM Pods after a
period of time (note: the new JM Pod's IP has changed). The job is then
rescheduled by the JM onto the new TM Pods, which means the new TMs have
registered with the JM.

The internal Service stays active throughout this process, but I think it is
not necessary to keep it. In other words, we can remove the internal Service
and have the TM Pods use the JM Pod IP to communicate with the JM Pod. This
would be consistent with HA mode.
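
To make the two addressing schemes concrete, here is a minimal sketch. The
helper name, the "<cluster-id>.<namespace>" form of the Service DNS name, and
the example values are assumptions for illustration only, not Flink code; 6123
is the default value of jobmanager.rpc.port.

```python
from typing import Optional

JM_RPC_PORT = 6123  # default jobmanager.rpc.port


def jobmanager_rpc_endpoint(cluster_id: str, namespace: str,
                            jm_pod_ip: Optional[str] = None) -> str:
    """Return the host:port a TaskManager would use to reach the JobManager.

    Hypothetical helper. Without a Pod IP it models the current non-HA
    behaviour (address = internal Service DNS name, assumed here to be
    "<cluster-id>.<namespace>"). With a Pod IP it models the proposal,
    where the TM talks to the JM Pod directly.
    """
    if jm_pod_ip is None:
        host = f"{cluster_id}.{namespace}"  # stable name, survives JM failover
    else:
        host = jm_pod_ip  # changes whenever the JM Pod is recreated
    return f"{host}:{JM_RPC_PORT}"


# Current non-HA mode: address goes through the internal Service.
via_service = jobmanager_rpc_endpoint("my-flink", "default")

# Proposed: address is the JM Pod IP (example value).
via_pod_ip = jobmanager_rpc_endpoint("my-flink", "default", "10.0.0.12")
```

With the Pod IP variant, TM Pods started after a failover would have to be
given the new JM Pod IP, which matches what the experiment observed: the TM
Pods are recreated after the JM Pod anyway.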

Finally, the related experiments are in the attachment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
