[
https://issues.apache.org/jira/browse/FLINK-24947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17446370#comment-17446370
]
liuzhuo commented on FLINK-24947:
---------------------------------
As you said, HostNetWork mode requires some changes to how ports are
generated. In the k8s environment, we expose ports through two Services:
InternalService (port 6123, used by the TaskManager to access the JobManager)
and ExternalService (port 8080, used by the client to access the JobManager).
Both services are currently created by the client. In HostNetWork mode, the
ports must be generated randomly, so the creation of these two services needs
to be postponed until after the JobManager has started, because the values of
these two ports are only known once the JobManager has started successfully.
Note that these two services need to be created/updated every time the
JobManager is started.
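
A minimal sketch of this ordering, assuming the JobManager binds to port 0 so
that the OS picks a free port, and only then describes the Service. The helper
names, the Service name "flink-internal", the selector label, and the choice to
keep 6123 as the Service port while pointing targetPort at the random port are
all assumptions for illustration, not Flink's actual implementation.

```python
import socket


def bind_random_port(host="127.0.0.1"):
    """Bind to port 0 so the OS assigns a free port; the socket stays open."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen()
    return sock


def build_service(name, service_port, target_port, selector):
    """Build a Service manifest as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": selector,
            "ports": [{"port": service_port, "targetPort": target_port}],
        },
    }


# Step 1: the process starts and learns its real port.
rpc_socket = bind_random_port()
rpc_port = rpc_socket.getsockname()[1]

# Step 2: only now can the Service be described (assumed names and labels).
internal_service = build_service(
    "flink-internal", 6123, rpc_port, {"component": "jobmanager"}
)
rpc_socket.close()
```

The socket is kept open between the two steps so that no other process on the
host can take the port before the Service is published.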
>1. How the TaskManager could find the leader JobManager address after the
>JobManager failover without HA?
In non-HA mode, after the JobManager fails over and restarts, it updates the
InternalService with its new RPC port (the one that replaces 6123), so that
the TaskManager can still obtain the correct JobManager address through the
InternalService.
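
The create-or-update behaviour can be sketched with an in-memory stand-in for
the Kubernetes Service API. FakeServiceApi, on_jobmanager_started, the Service
name "flink-internal", and the example port numbers are hypothetical; a real
implementation would call the Kubernetes API instead.

```python
class FakeServiceApi:
    """In-memory stand-in for the Kubernetes Service API (hypothetical)."""

    def __init__(self):
        self.services = {}

    def create_or_update(self, name, port, target_port):
        """Create the Service if it is missing, otherwise overwrite it."""
        created = name not in self.services
        self.services[name] = {"port": port, "targetPort": target_port}
        return "created" if created else "updated"

    def resolve(self, name):
        """Return the port that traffic to the Service is forwarded to."""
        return self.services[name]["targetPort"]


def on_jobmanager_started(api, rpc_port):
    """Publish the freshly allocated RPC port under a stable Service name."""
    return api.create_or_update("flink-internal", 6123, rpc_port)


api = FakeServiceApi()
first = on_jobmanager_started(api, 40001)    # initial start
port_before = api.resolve("flink-internal")
second = on_jobmanager_started(api, 40517)   # restart after failover
port_after = api.resolve("flink-internal")
```

The TaskManager only ever looks up the stable name, so it follows the
JobManager to the new port without any configuration change.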
>2.How the Flink client could find the leader JobManager address without HA?
This is indeed the harder part, because the client has no way to get the
JobManager's real REST port (the one that replaces 8080) at submission time.
Maybe the client can wait for a certain period of time until the
ExternalService has been published, and then read the actual port from it.
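
A rough sketch of such a client-side wait, assuming a lookup callable that
returns None until the ExternalService exists. wait_for_rest_port, its timeout
parameters, and the simulated lookup are hypothetical.

```python
import time


def wait_for_rest_port(lookup, timeout_s=5.0, interval_s=0.1):
    """Poll lookup() until it returns a port, or raise after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while True:
        port = lookup()
        if port is not None:
            return port
        if time.monotonic() >= deadline:
            raise TimeoutError("ExternalService was not published in time")
        time.sleep(interval_s)


# Simulated lookup: the Service is missing on the first two polls.
answers = iter([None, None, 38081])
rest_port = wait_for_rest_port(
    lambda: next(answers), timeout_s=2.0, interval_s=0.01
)
```

A bounded timeout matters here: if the JobManager never starts, the client
should fail with a clear error instead of blocking forever.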
In fact, we internally use Ingress to solve this problem. When the client
builds the Deployment, we add an IngressDecorator, so that the JobManager can
be accessed through this Ingress. The Ingress is not reachable right after it
is created, because the corresponding ExternalService does not exist yet; it
takes effect once the JobManager has created the ExternalService. This is a
better approach, but it requires an IngressController. I am not sure if this
is suitable for everyone.
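
For illustration, an Ingress that refers to the ExternalService by name could
look roughly like the manifest built below. The internal IngressDecorator is
not shown in this thread, so build_ingress, the host name, and the Service
name "flink-rest" are assumptions; only the networking.k8s.io/v1 field layout
is standard Kubernetes.

```python
def build_ingress(name, host, service_name, service_port):
    """Build an Ingress manifest as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                # The backend is referenced by name only, so
                                # the Ingress can exist before the Service.
                                "backend": {
                                    "service": {
                                        "name": service_name,
                                        "port": {"number": service_port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ]
        },
    }


ingress = build_ingress("flink-ingress", "flink.example.com", "flink-rest", 8080)
backend = ingress["spec"]["rules"][0]["http"]["paths"][0]["backend"]["service"]
```

Because the backend is only a name and a port number, the client can create
the Ingress at submission time and let it start routing once the JobManager
has published the Service.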
>3.Should we update the internal/external K8s service when the JobManager has
>allocated a dynamic port
Yes, as described above, the JobManager needs to update the InternalService
and ExternalService every time it starts.
I just realized that we are using session mode internally. For application
mode, I may need to investigate further.
> Flink on k8s support HostNetWork model
> --------------------------------------
>
> Key: FLINK-24947
> URL: https://issues.apache.org/jira/browse/FLINK-24947
> Project: Flink
> Issue Type: New Feature
> Components: Deployment / Kubernetes
> Reporter: liuzhuo
> Priority: Minor
>
> For the use of flink on k8s, for performance considerations, it is important
> to choose a CNI plug-in. Usually we have two environments: Managed and
> UnManaged.
> Managed: Cloud vendors usually provide very efficient CNI plug-ins, we
> don’t need to care about network performance issues
> UnManaged: On self-built K8s clusters, CNI plug-ins are usually optional,
> such as Flannel and Calico, but such software network cards usually lose
> some performance or require some additional network strategies.
> For an unmanaged environment, if we also want to achieve the best network
> performance, should we support the *HostNetWork* model?
> Use the host network to achieve the best performance