[ 
https://issues.apache.org/jira/browse/FLINK-24819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Tang resolved FLINK-24819.
------------------------------
    Resolution: Fixed

> Higher APIServer cpu load after using SharedIndexInformer replaced naked 
> Kubernetes watch
> -----------------------------------------------------------------------------------------
>
>                 Key: FLINK-24819
>                 URL: https://issues.apache.org/jira/browse/FLINK-24819
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes
>    Affects Versions: 1.14.0
>            Reporter: Yang Wang
>            Priority: Major
>
> In FLINK-22054, Flink replaced the naked K8s watch with a shared informer for 
> ConfigMaps. Since then, each Flink JVM process (JM/TM) needs only one 
> connection to the APIServer for ConfigMap watching. The change aimed to reduce 
> the network pressure on the K8s APIServer.
>  
> However, in our recent tests, we found that the CPU and memory cost of the 
> APIServer doubled while running the same Flink workloads. After digging 
> into the K8s details, I think the root cause might be that ETCD does not 
> have indexes for labels. This means the APIServer needs to pull all the events 
> from ETCD for each watch and then filter them internally by the specified 
> labels (e.g. 
> app=xxx,type=flink-native-kubernetes,configmap-type=high-availability). 
> Before FLINK-22054, we started a dedicated connection for each ConfigMap 
> watch, and it seems that the APIServer only needed to pull the events for the 
> specified ConfigMap name.
>  
> Watch URL example (Before):
> https://kubernetes.default:6443/api/v1/namespaces/vvp-workload/configmaps?metadata.name=job-009d4f51-ca02-4793-a49b-a3344538719b-resourcemanager-leader&watch=true
>  
> Watch URL example (After):
> https://kubernetes.default:6443/api/v1/namespaces/vvp-workload/configmaps?labelSelector=app%3Dk8s-ha-app-1-1636077491-23461%2Ctype%3Dflink-native-kubernetes%2Cconfigmap-type%3Dhigh-availability&resourceVersion=1153687321&watch=true
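To make the difference between the two watch styles concrete, here is a minimal sketch that builds both kinds of watch URL. It is not Flink code: the function names are hypothetical, and the name-based variant uses the standard Kubernetes `fieldSelector=metadata.name=...` query parameter, which is an assumption about what the abbreviated "Before" URL stands for. The host, namespace, and label values are taken from the examples above.

```python
from urllib.parse import urlencode

# Host taken from the example URLs in the issue description.
API_SERVER = "https://kubernetes.default:6443"


def watch_url_by_name(namespace: str, name: str) -> str:
    """Hypothetical helper: watch one ConfigMap, selected by its name.

    Assumes the standard Kubernetes fieldSelector parameter; the issue's
    "Before" URL shows this in an abbreviated form.
    """
    query = urlencode({"fieldSelector": f"metadata.name={name}", "watch": "true"})
    return f"{API_SERVER}/api/v1/namespaces/{namespace}/configmaps?{query}"


def watch_url_by_labels(namespace: str, labels: dict, resource_version: str) -> str:
    """Hypothetical helper: watch every ConfigMap matching a label selector."""
    selector = ",".join(f"{key}={value}" for key, value in labels.items())
    query = urlencode(
        {
            "labelSelector": selector,
            "resourceVersion": resource_version,
            "watch": "true",
        }
    )
    return f"{API_SERVER}/api/v1/namespaces/{namespace}/configmaps?{query}"


if __name__ == "__main__":
    print(
        watch_url_by_name(
            "vvp-workload",
            "job-009d4f51-ca02-4793-a49b-a3344538719b-resourcemanager-leader",
        )
    )
    print(
        watch_url_by_labels(
            "vvp-workload",
            {
                "app": "k8s-ha-app-1-1636077491-23461",
                "type": "flink-native-kubernetes",
                "configmap-type": "high-availability",
            },
            "1153687321",
        )
    )
```

With the label values shown, the second call reproduces the "After" URL from the description. The first URL targets a single object by name, while the second asks the APIServer to match labels across all ConfigMaps in the namespace, which is the filtering work the reporter suspects is responsible for the extra load.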



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
