wangyang0918 commented on a change in pull request #11323: [FLINK-16439][k8s] Make KubernetesResourceManager starts workers using WorkerResourceSpec requested by SlotManager
URL: https://github.com/apache/flink/pull/11323#discussion_r394772038
##########
File path: flink-kubernetes/src/main/java/org/apache/flink/kubernetes/KubernetesResourceManager.java
##########
@@ -81,10 +80,8 @@
 	private final FlinkKubeClient kubeClient;
-	private final ContaineredTaskManagerParameters taskManagerParameters;
-
-	/** The number of pods requested, but not yet granted. */
-	private int numPendingPodRequests = 0;
+	/** Map from pod name to worker resource. */
+	private final Map<String, WorkerResourceSpec> podWorkerResources;

Review comment:
IIUC, `podWorkerResources` stores the mapping from pod name to worker resource spec. When we receive new pods, it is used to decrease the number of pending workers. Also, when a pod terminates exceptionally, a new pod with the same resource spec is allocated.

There are two problems:
1. We never clean up or remove any entries, so the map keeps growing. This could be a problem for a long-running streaming job or a batch job with many pods.
2. When the JobManager fails over, the map cannot be recovered.

Is this the by-design behavior?
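To make the first concern concrete, here is a minimal sketch of how entries could be dropped once a pod terminates. It is not the code in this PR: `SpecSketch` is a hypothetical stand-in for `WorkerResourceSpec`, and `PodWorkerResourceTracker` with its method names is invented purely for illustration. It does not address the second concern (recovery after JobManager failover).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.Set;

/** Hypothetical stand-in for WorkerResourceSpec; the fields are illustrative only. */
final class SpecSketch {
    final double cpuCores;
    final int memoryMb;

    SpecSketch(double cpuCores, int memoryMb) {
        this.cpuCores = cpuCores;
        this.memoryMb = memoryMb;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SpecSketch)) {
            return false;
        }
        SpecSketch other = (SpecSketch) o;
        return cpuCores == other.cpuCores && memoryMb == other.memoryMb;
    }

    @Override
    public int hashCode() {
        return Objects.hash(cpuCores, memoryMb);
    }
}

/** Hypothetical tracker (an assumption, not Flink API) that cleans up after terminated pods. */
final class PodWorkerResourceTracker {
    /** Map from pod name to worker resource, as in the reviewed field. */
    private final Map<String, SpecSketch> podWorkerResources = new HashMap<>();
    /** Names of pods that were requested but not yet granted. */
    private final Set<String> pendingPods = new HashSet<>();
    /** Number of pending workers per resource spec. */
    private final Map<SpecSketch, Integer> pendingWorkers = new HashMap<>();

    void onPodRequested(String podName, SpecSketch spec) {
        podWorkerResources.put(podName, spec);
        pendingPods.add(podName);
        pendingWorkers.merge(spec, 1, Integer::sum);
    }

    /** The pod was granted: it is no longer pending, but its spec is still tracked. */
    void onPodAdded(String podName) {
        SpecSketch spec = podWorkerResources.get(podName);
        if (spec != null && pendingPods.remove(podName)) {
            decrementPending(spec);
        }
    }

    /**
     * The pod terminated: remove its entry so the map does not grow without bound.
     * Returns the spec so the caller can request a replacement with the same resources.
     */
    Optional<SpecSketch> onPodTerminated(String podName) {
        SpecSketch spec = podWorkerResources.remove(podName);
        if (spec != null && pendingPods.remove(podName)) {
            decrementPending(spec);
        }
        return Optional.ofNullable(spec);
    }

    int pendingCount(SpecSketch spec) {
        return pendingWorkers.getOrDefault(spec, 0);
    }

    int trackedPods() {
        return podWorkerResources.size();
    }

    private void decrementPending(SpecSketch spec) {
        // Returning null from the remapping function removes the key once the count hits zero.
        pendingWorkers.computeIfPresent(spec, (k, n) -> n > 1 ? Integer.valueOf(n - 1) : null);
    }
}
```

With something along these lines, the map size would be bounded by the number of live and pending pods rather than by the total number of pods ever requested.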