Hi,

I have a Flink 1.9.0 cluster deployed on AWS ECS. The cluster is running, but metrics are not showing in the UI.

The other services (RPC / Data) work because their connections are initiated from the TM to the JM through a load balancer. Metrics do not work, because there the JM tries to initiate a connection to the TMs.

Currently, Flink uses the *taskmanager.host* setting as both the bind address and the advertised address. When a TM starts, it binds to the internal Docker IP, which is not reachable from the JM. In addition, the TM's *metrics.internal.query-service.port* is set to a specific port, which ECS dynamically maps to a random host port. It seems that I need separate settings for the bind address/port and the advertised address/port.

I saw several discussions of this issue, also for Kubernetes:
https://issues.apache.org/jira/browse/FLINK-11127

There was also an attempt to solve this using Akka configuration:
https://hub.docker.com/r/lzaugg/flink-taskmanager/

Can someone suggest a solution for this issue on AWS ECS?

Would appreciate your help.

Thanks,
Rafi