Hi, I've recently redeployed Myriad in our Mesos cluster.
However, the NodeManagers fail because they are trying to connect to an invalid ResourceManager IP. Below is part of the log from one of the Mesos agents that attempts to launch a NodeManager.

15/09/22 15:41:52 INFO webapp.WebApps: Web app /node started at 8042
15/09/22 15:41:52 INFO webapp.WebApps: Registered webapp guice modules
15/09/22 15:41:52 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8031
15/09/22 15:41:52 INFO nodemanager.NodeStatusUpdaterImpl: Sending out 0 NM container statuses: []
15/09/22 15:41:52 INFO nodemanager.NodeStatusUpdaterImpl: Registering with RM using containers :[]
15/09/22 15:41:54 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/09/22 15:41:55 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/09/22 15:41:56 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/09/22 15:41:57 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

You can see that it attempts to connect to 0.0.0.0:8031, even though the active ResourceManager is running at a different address.

I've followed the instructions here:
https://github.com/mesos/myriad/blob/phase1/docs/myriad-dev.md

Which configuration do I need to recheck to get this right?

Thanks in advance.
-zhongyue

--
*Intel SSG/STO/BDT*
880 Zixing Road, Zizhu Science Park, Minhang District, 200241, Shanghai, China
+862161166500
