wangyang0918 commented on issue #10089: [FLINK-12342][yarn] Remove container requests in order to reduce excess containers
URL: https://github.com/apache/flink/pull/10089#issuecomment-550589824

@tillrohrmann

1. You are right. The removal of the container requests and the upload of files both run in the `YarnResourceManager`'s main thread. I believe it is a single thread, so if we receive 100 containers in the first heartbeat, it will take about 5000 ms to launch them (50 ms each). Even if we receive all of the remaining 900 containers in the second heartbeat, `removeContainerRequest` cannot be called in time, so we still end up with excess containers. I agree that this can be addressed in a future optimization:
   (1) Do not upload taskmanager-conf.yaml to HDFS; use dynamic properties instead.
   (2) Use `NMClientAsync` instead of `NMClient`.

2. I started some other jobs to occupy most of the resources of the YARN cluster. After submitting the Flink cluster, I killed some of them one by one, so the required 1000 containers were allocated over two or more heartbeats.
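
The timing argument in point 1 can be checked with a rough back-of-the-envelope sketch. It is plain arithmetic, not Flink or YARN code: the class and method names are hypothetical, and the 500 ms arrival time of the second heartbeat response is an assumption chosen only for illustration. The 100 containers and 50 ms per launch are the figures from the comment.

```java
// Hypothetical sketch: models one thread that launches containers sequentially
// and therefore delays any callback that arrives while it is still busy.
final class MainThreadBacklogSketch {

    /** Time a single thread needs to launch containers one after another. */
    static long launchTimeMs(int containers, long perContainerMs) {
        return containers * perContainerMs;
    }

    /**
     * How long a callback arriving at arrivalMs has to wait when the
     * single thread stays busy until busyUntilMs.
     */
    static long callbackDelayMs(long arrivalMs, long busyUntilMs) {
        return Math.max(0L, busyUntilMs - arrivalMs);
    }

    public static void main(String[] args) {
        int firstBatch = 100;          // containers received in the first heartbeat (from the comment)
        long perContainerMs = 50;      // launch cost per container (from the comment)
        long secondHeartbeatMs = 500;  // ASSUMPTION: when the second heartbeat response arrives

        long busyUntil = launchTimeMs(firstBatch, perContainerMs);
        long delay = callbackDelayMs(secondHeartbeatMs, busyUntil);

        System.out.println("main thread busy for " + busyUntil + " ms");   // 5000 ms
        System.out.println("second callback delayed by " + delay + " ms"); // 4500 ms
    }
}
```

Under these assumptions the callback for the second heartbeat waits roughly 4500 ms before `removeContainerRequest` can run for the remaining containers, which is the window in which excess containers can still be allocated. Moving the launch off the main thread (for example with `NMClientAsync`) would shrink that window.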
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
