Hi,

It looks like there is an issue with the NM launch. The task should go from the
STAGING to the ACTIVE state.
Please check the NM logs for the issue.
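
As a rough sketch of that check, assuming the NM log uses the log4j layout
quoted further down this thread (date, time, level, class), something like the
following would list the non-INFO lines. The function name and the log path are
placeholders of mine, not anything Myriad or YARN provides.

```shell
# Hypothetical helper: print WARN/ERROR/FATAL lines from a NodeManager log.
# Assumes log4j lines shaped like "15/09/22 15:41:52 INFO webapp.WebApps: ...".
nm_log_problems() {
  # $1 is the path to the NM log file (placeholder; use the real location).
  grep -E '^[0-9/]+ [0-9:]+ (WARN|ERROR|FATAL) ' "$1"
}
```

Calling it as `nm_log_problems /path/to/nodemanager.log` (path is hypothetical)
prints only the matching lines; if nothing prints, the log has no entries above
INFO in that layout.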

Thank you 
Aashreya


> On Sep 22, 2015, at 7:41 PM, Zhongyue Luo <[email protected]> wrote:
> 
> Thanks Sarjeet, it works.
> 
> However, this seems very strange. Shouldn't the RM's IP be included in the
> task info so that the executor injects the IP when launching the NM?
> 
> Also, I can see in the RM web UI that the default NM has registered with the
> RM, but the task status in the Mesos web UI is still "STAGING".
> Is this normal?
> 
> On Tue, Sep 22, 2015 at 11:19 PM, Sarjeet Singh <[email protected]>
> wrote:
> 
>> Zhongyue,
>> 
>> You can specify the RM's IP on the command line when starting the RM, or you
>> can set the following property in yarn-site.xml:
>> 
>> <property>
>>   <name>yarn.resourcemanager.hostname</name>
>>   <value>RM IP</value>
>> </property>
>> 
>> OR
>> 
>> From the command line:
>> 
>> YARN_RESOURCEMANAGER_OPTS="-Dyarn.resourcemanager.hostname=<RM_IP>" yarn resourcemanager
>> 
>> ===========================
>> 
>> Try one of the above and see if it works.
>> 
>> -Sarjeet
>> 
>> On Tue, Sep 22, 2015 at 1:04 AM, Zhongyue Luo <[email protected]>
>> wrote:
>> 
>>> Hi,
>>> 
>>> I've recently redeployed Myriad in our Mesos cluster.
>>> 
>>> However, the node managers fail because they are trying to connect to an
>>> invalid Resource Manager IP.
>>> 
>>> Below is part of the log from one of the Mesos agents that attempts to
>>> launch a node manager.
>>> 
>>> 15/09/22 15:41:52 INFO webapp.WebApps: Web app /node started at 8042
>>> 15/09/22 15:41:52 INFO webapp.WebApps: Registered webapp guice modules
>>> 15/09/22 15:41:52 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8031
>>> 15/09/22 15:41:52 INFO nodemanager.NodeStatusUpdaterImpl: Sending out 0 NM container statuses: []
>>> 15/09/22 15:41:52 INFO nodemanager.NodeStatusUpdaterImpl: Registering with RM using containers :[]
>>> 15/09/22 15:41:54 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>>> 15/09/22 15:41:55 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>>> 15/09/22 15:41:56 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>>> 15/09/22 15:41:57 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>>> 
>>> You can see that it attempts to connect to 0.0.0.0:8031, even though the
>>> active resource manager is at a different address.
>>> 
>>> I've followed the instructions here.
>>> https://github.com/mesos/myriad/blob/phase1/docs/myriad-dev.md
>>> 
>>> Which configuration do I need to recheck to get this right?
>>> 
>>> Thanks in advance.
>>> 
>>> -zhongyue
>>> 
>>> --
>>> *Intel SSG/STO/BDT*
>>> 880 Zixing Road, Zizhu Science Park, Minhang District, 200241, Shanghai,
>>> China
>>> +862161166500
> 
> 
> 
> -- 
> *Intel SSG/STO/BDT*
> 880 Zixing Road, Zizhu Science Park, Minhang District, 200241, Shanghai,
> China
> +862161166500
