[
https://issues.apache.org/jira/browse/FLINK-24819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440330#comment-17440330
]
Yang Wang commented on FLINK-24819:
-----------------------------------
cc [~yittg], could you please also have a look at this ticket?
> Higher cpu load after using SharedIndexInformer replaced naked Kubernetes
> watch
> -------------------------------------------------------------------------------
>
> Key: FLINK-24819
> URL: https://issues.apache.org/jira/browse/FLINK-24819
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / Kubernetes
> Affects Versions: 1.14.0
> Reporter: Yang Wang
> Priority: Major
>
> In FLINK-22054, Flink started using a shared informer for ConfigMaps to
> replace the naked K8s watch. Since then, each Flink JVM process (JM/TM) needs
> only one connection to the APIServer for ConfigMap watching. The goal was to
> reduce the network pressure on the K8s APIServer.
>
> However, in our recent tests, we found that the CPU and memory cost of the
> APIServer doubled while running the same Flink workloads. After digging
> into the details of K8s, I think the root cause might be that ETCD does not
> have indexes for labels. This means the APIServer needs to pull all the events
> from ETCD for each watch and then filter them internally by the specified
> labels (e.g.
> app=xxx,type=flink-native-kubernetes,configmap-type=high-availability).
> Before FLINK-22054, we started a dedicated connection for each ConfigMap
> watch, and it seems that the APIServer only needed to pull the events for the
> specified ConfigMap name.
>
> Watch URL example (Before):
> [https://kubernetes.default:6443/api/v1/namespaces/vvp-workload/configmaps?metadata.name=job-009d4f51-ca02-4793-a49b-a3344538719b-resourcemanager-leader&watch=true]
>
> Watch URL example (After):
> [https://kubernetes.default:6443/api/v1/namespaces/vvp-workload/configmaps?labelSelector=app%3Dk8s-ha-app-1-1636077491-23461%2Ctype%3Dflink-native-kubernetes%2Cconfigmap-type%3Dhigh-availability&resourceVersion=1153687321&watch=true]
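
To make the difference between the two watch styles concrete, here is a minimal Python sketch that builds both kinds of watch URL. The helper name `configmap_watch_url` is hypothetical and is not part of Flink or of any Kubernetes client. It also assumes that the per-name watch is expressed with the standard Kubernetes `fieldSelector` query parameter, whereas the "Before" URL in the ticket shows `metadata.name=...` directly and may simply be abbreviated.

```python
from urllib.parse import urlencode

# Host taken from the example URLs in the ticket.
API_SERVER = "https://kubernetes.default:6443"


def configmap_watch_url(namespace, *, name=None, labels=None, resource_version=None):
    """Build a ConfigMap watch URL (hypothetical helper, for illustration only).

    name   -> watch a single ConfigMap via fieldSelector (assumed form of "Before")
    labels -> watch every ConfigMap matching the labels ("After", shared informer)
    """
    params = {}
    if name is not None:
        params["fieldSelector"] = f"metadata.name={name}"
    if labels:
        params["labelSelector"] = ",".join(f"{k}={v}" for k, v in labels.items())
    if resource_version is not None:
        params["resourceVersion"] = str(resource_version)
    params["watch"] = "true"
    # urlencode percent-encodes "=" and "," inside the selector values.
    return f"{API_SERVER}/api/v1/namespaces/{namespace}/configmaps?{urlencode(params)}"


# "Before": one dedicated watch per ConfigMap name.
before_url = configmap_watch_url(
    "vvp-workload",
    name="job-009d4f51-ca02-4793-a49b-a3344538719b-resourcemanager-leader",
)

# "After": a single label-selected watch shared by the whole JVM process.
after_url = configmap_watch_url(
    "vvp-workload",
    labels={
        "app": "k8s-ha-app-1-1636077491-23461",
        "type": "flink-native-kubernetes",
        "configmap-type": "high-availability",
    },
    resource_version=1153687321,
)
```

With the label values from the ticket, `after_url` comes out identical to the "After" URL quoted above. The sketch only shows how the two requests differ on the wire. It does not demonstrate the suspected cost on the APIServer side, which remains the hypothesis of this ticket.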
--
This message was sent by Atlassian Jira
(v8.20.1#820001)