Hi,

I create a job with the following parameters:

    org.apache.flink.configuration.Configuration {
        yarn.containers.vcores=2
        yarn.appmaster.vcores=1
    }

    ClusterSpecification {
        taskManagerMemoryMB=1024
        slotsPerTaskManager=1
    }

After I launch the job programmatically, yarn node -list -showDetails reports:

    Configured Resources : <memory:8192, vCores:8>
    Allocated Resources : <memory:1250, vCores:1>

I suppose the allocated resources were created for the JobManager. But in the logs I see 3 requests of the form "Requesting new TaskExecutor container with resources <memory:2048, vCores:2>".

Here is a log fragment:

    JobManager successfully registered at ResourceManager, leader id: 00000000000000000000000000000000.
    org.apache.flink.yarn.YarnResourceManager - Requesting new TaskExecutor container with resources <memory:2048, vCores:2>. Number pending requests 1.
    org.apache.flink.yarn.YarnResourceManager - Request slot with profile ResourceProfile{UNKNOWN} for job 64080d7889797133215e501e72b23a74 with allocation id a1c9ff2b7ec9ad662108b8a2b2301fcf.
    org.apache.flink.yarn.YarnResourceManager - Requesting new TaskExecutor container with resources <memory:2048, vCores:2>. Number pending requests 2.
    org.apache.flink.yarn.YarnResourceManager - Request slot with profile ResourceProfile{UNKNOWN} for job 64080d7889797133215e501e72b23a74 with allocation id 21f57b4324bdd50dd293547bc4b19ce2.
    org.apache.flink.yarn.YarnResourceManager - Requesting new TaskExecutor container with resources <memory:2048, vCores:2>. Number pending requests 3.
    Close ResourceManager connection
    Shut down cluster because application is in FAILED, diagnostics null.

Here are the things I would like to clarify:

1. Why are there 3 requests to create a TaskExecutor instead of 1?
2. Why is no TaskExecutor created even though I have 7 cores and about 7 GB of RAM free?
3. What is ResourceProfile{UNKNOWN}?
4. What does "diagnostics null" mean?

When I change ClusterSpecification.slotsPerTaskManager to 1, I get:

    "Cannot serve slot request, no ResourceManager connected"
    "Could not allocate the required slot within slot request timeout. Please make sure that the cluster has enough resources"

Why isn't the ResourceManager created even though I request even fewer resources in this case?

Regards,
Vitaliy
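
P.S. To make the arithmetic behind my second question explicit, here is a minimal sketch in plain Java. It uses no Flink or YARN APIs; the class and method names (ResourceCheck, free, fits) are made up for illustration. It assumes a single NodeManager and that only the memory and vCores figures quoted above matter; it ignores any scheduler-level limits or rounding of container sizes, which may well be where the real answer lies.

```java
// Sanity check of the numbers quoted in this post.
// Hypothetical helper: not part of Flink or YARN.
class ResourceCheck {

    // Resources still unallocated on the node.
    static int free(int configured, int allocated) {
        return configured - allocated;
    }

    // True if `requests` containers of size `perRequest` fit into `free`.
    static boolean fits(int requests, int perRequest, int free) {
        return requests * perRequest <= free;
    }

    public static void main(String[] args) {
        // From "yarn node -list -showDetails".
        int freeMemoryMb = free(8192, 1250);
        int freeVcores = free(8, 1);

        // From the log: 3 pending requests of <memory:2048, vCores:2>.
        int pendingRequests = 3;
        boolean memoryFits = fits(pendingRequests, 2048, freeMemoryMb);
        boolean vcoresFit = fits(pendingRequests, 2, freeVcores);

        System.out.println("free memory (MB): " + freeMemoryMb);
        System.out.println("free vCores: " + freeVcores);
        System.out.println("all pending requests fit: " + (memoryFits && vcoresFit));
    }
}
```

With the figures above this gives 6942 MB and 7 vCores free against 6144 MB and 6 vCores requested in total, so on paper even all 3 pending containers should fit on the node.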