RE: [jira] [Created] (MAPREDUCE-3030) RM is not processing heartbeat and continuously giving the message 'Node not found rebooting'

Devaraj K Mon, 19 Sep 2011 05:11:35 -0700

Hi Vinod,

   As I commented in the issue, this is caused by the equals method in
NodeId.java. It can be fixed as part of this defect.
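
For illustration only, here is a minimal sketch of one way an equals method can produce this symptom, assuming the RM looks up registered nodes in a hash map keyed by node id. The classes below (IdentityNodeId, ValueNodeId, heartbeatFindsNode) are hypothetical and are not the actual NodeId.java code; the sketch only shows why a node id built again from the same host and port on each heartbeat must compare equal by value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these classes are NOT the actual Hadoop NodeId code.
class NodeIdEqualsSketch {

    // Node id without value-based equals/hashCode: two instances built from
    // the same host and port are never equal (Object identity is used).
    static final class IdentityNodeId {
        final String host;
        final int port;

        IdentityNodeId(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    // Node id with value-based equals and a matching hashCode.
    static final class ValueNodeId {
        final String host;
        final int port;

        ValueNodeId(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof ValueNodeId)) {
                return false;
            }
            ValueNodeId other = (ValueNodeId) obj;
            return port == other.port && host.equals(other.host);
        }

        @Override
        public int hashCode() {
            return 31 * host.hashCode() + port;
        }
    }

    // Assumed RM-side behaviour: register a node under one id object, then
    // look it up with a second id object rebuilt from the heartbeat.
    static boolean heartbeatFindsNode(Object registeredId, Object heartbeatId) {
        Map<Object, String> nodes = new HashMap<>();
        nodes.put(registeredId, "registered");
        return nodes.containsKey(heartbeatId);
    }

    public static void main(String[] args) {
        // Same host and port, but the lookup fails: "node not found".
        System.out.println(heartbeatFindsNode(
                new IdentityNodeId("10.18.52.124", 45454),
                new IdentityNodeId("10.18.52.124", 45454)));
        // With value-based equals/hashCode the lookup succeeds.
        System.out.println(heartbeatFindsNode(
                new ValueNodeId("10.18.52.124", 45454),
                new ValueNodeId("10.18.52.124", 45454)));
    }
}
```

With the identity-based id the lookup fails on every heartbeat even though the node registered successfully, which matches the pattern in the logs; whether the real defect has this exact shape is an assumption.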



REBOOT command handling can be taken care of as part of another issue.

Devaraj K 

-----Original Message-----
From: Vinod Kumar Vavilapalli [mailto:[email protected]] 
Sent: Monday, September 19, 2011 4:59 PM
To: [email protected]
Subject: Re: [jira] [Created] (MAPREDUCE-3030) RM is not processing
heartbeat and continuously giving the message 'Node not found rebooting'

This sure is a bug on the NM side which doesn't handle the REBOOT command
from the RM.

But can you upload the RM side logs related to this node so that we are sure
there aren't any bugs in RM? Thanks!
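
To make the NM-side point concrete, here is a rough sketch of a heartbeat loop that reacts to a REBOOT reply. Everything in it is hypothetical (the Action enum, FakeResourceManager, FakeNodeManager), and it assumes that handling REBOOT means registering with the RM again; it is not the actual NodeStatusUpdaterImpl code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; names are illustrative, not the actual NM/RM code.
class RebootHandlingSketch {

    // Assumed set of replies the RM can send back on a heartbeat.
    enum Action { NORMAL, REBOOT }

    // Stand-in for the RM: nodes it does not know are told to reboot.
    static final class FakeResourceManager {
        private final Set<String> registered = new HashSet<>();

        void register(String nodeAddress) {
            registered.add(nodeAddress);
        }

        Action heartbeat(String nodeAddress) {
            return registered.contains(nodeAddress) ? Action.NORMAL : Action.REBOOT;
        }
    }

    // Stand-in for the NM: on REBOOT it registers again instead of
    // continuing to send heartbeats that the RM keeps rejecting.
    static final class FakeNodeManager {
        private final FakeResourceManager rm;
        private final String address;
        int registrations = 0;

        FakeNodeManager(FakeResourceManager rm, String address) {
            this.rm = rm;
            this.address = address;
        }

        void register() {
            rm.register(address);
            registrations++;
        }

        // Sends one heartbeat and returns the action the RM replied with.
        Action heartbeatOnce() {
            Action action = rm.heartbeat(address);
            if (action == Action.REBOOT) {
                register(); // assumption: REBOOT is handled by re-registering
            }
            return action;
        }
    }

    public static void main(String[] args) {
        FakeResourceManager rm = new FakeResourceManager();
        FakeNodeManager nm = new FakeNodeManager(rm, "10.18.52.124:45454");
        System.out.println(nm.heartbeatOnce()); // RM does not know the node yet
        System.out.println(nm.heartbeatOnce()); // after re-registration
    }
}
```

Note that if the RM-side lookup itself is broken, re-registering would not stop the REBOOT replies, which is why the RM logs are still worth checking.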

On Mon, Sep 19, 2011 at 3:46 PM, Devaraj K (JIRA) <[email protected]> wrote:

> RM is not processing heartbeat and continuously giving the message 'Node
> not found rebooting'
>
>
> ---------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3030
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3030
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 0.24.0
>            Reporter: Devaraj K
>            Assignee: Devaraj K
>            Priority: Blocker
>
>
> {code:title=Node Manager Logs|borderStyle=solid}
> 2011-09-19 13:39:29,816 INFO  webapp.WebApps (WebApps.java:start(162)) -
> Registered webapp guice modules
> 2011-09-19 13:39:29,817 INFO  service.AbstractService
> (AbstractService.java:start(61)) -
> Service:org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer is
> started.
> 2011-09-19 13:39:29,818 INFO  service.AbstractService
> (AbstractService.java:start(61)) - Service:Dispatcher is started.
> 2011-09-19 13:39:29,819 INFO  nodemanager.NodeStatusUpdaterImpl
> (NodeStatusUpdaterImpl.java:start(133)) - Configured ContainerManager
> Address is 10.18.52.124:45454
> 2011-09-19 13:39:29,819 INFO  ipc.YarnRPC (YarnRPC.java:create(47)) -
> Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
> 2011-09-19 13:39:29,822 INFO  ipc.HadoopYarnRPC
> (HadoopYarnProtoRPC.java:getProxy(49)) - Creating a HadoopYarnProtoRpc proxy
> for protocol interface org.apache.hadoop.yarn.server.api.ResourceTracker
> 2011-09-19 13:39:29,862 INFO  nodemanager.NodeStatusUpdaterImpl
> (NodeStatusUpdaterImpl.java:registerWithRM(165)) - Connected to
> ResourceManager at 0.0.0.0:8025
> 2011-09-19 13:39:30,369 INFO  nodemanager.NodeStatusUpdaterImpl
> (NodeStatusUpdaterImpl.java:registerWithRM(189)) - Registered with
> ResourceManager as 10.18.52.124:45454 with total resource of memory: 8192,
> 2011-09-19 13:39:30,369 INFO  service.AbstractService
> (AbstractService.java:start(61)) -
> Service:org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl is
> started.
> 2011-09-19 13:39:30,371 INFO  service.AbstractService
> (AbstractService.java:start(61)) -
> Service:org.apache.hadoop.yarn.server.nodemanager.NodeManager is started.
> {code}
>
>
>
> {code:title=Resource Manager Logs|borderStyle=solid}
> 2011-09-19 14:01:03,238 INFO  resourcemanager.ResourceTrackerService
> (ResourceTrackerService.java:nodeHeartbeat(201)) - Node not found rebooting
> 10.18.52.124:45454
> Call:
> protocol=org.apache.hadoop.yarn.proto.ResourceTracker$ResourceTrackerService$BlockingInterface,
> method=nodeHeartbeat
> 2011-09-19 14:01:04,240 INFO  resourcemanager.ResourceTrackerService
> (ResourceTrackerService.java:nodeHeartbeat(201)) - Node not found rebooting
> 10.18.52.124:45454
> Call:
> protocol=org.apache.hadoop.yarn.proto.ResourceTracker$ResourceTrackerService$BlockingInterface,
> method=nodeHeartbeat
> 2011-09-19 14:01:05,242 INFO  resourcemanager.ResourceTrackerService
> (ResourceTrackerService.java:nodeHeartbeat(201)) - Node not found rebooting
> 10.18.52.124:45454
> Call:
> protocol=org.apache.hadoop.yarn.proto.ResourceTracker$ResourceTrackerService$BlockingInterface,
> method=nodeHeartbeat
> 2011-09-19 14:01:06,244 INFO  resourcemanager.ResourceTrackerService
> (ResourceTrackerService.java:nodeHeartbeat(201)) - Node not found rebooting
> 10.18.52.124:45454
> Call:
> protocol=org.apache.hadoop.yarn.proto.ResourceTracker$ResourceTrackerService$BlockingInterface,
> method=nodeHeartbeat
> 2011-09-19 14:01:07,246 INFO  resourcemanager.ResourceTrackerService
> (ResourceTrackerService.java:nodeHeartbeat(201)) - Node not found rebooting
> 10.18.52.124:45454
> Call:
> protocol=org.apache.hadoop.yarn.proto.ResourceTracker$ResourceTrackerService$BlockingInterface,
> method=nodeHeartbeat
> 2011-09-19 14:01:08,247 INFO  resourcemanager.ResourceTrackerService
> (ResourceTrackerService.java:nodeHeartbeat(201)) - Node not found rebooting
> 10.18.52.124:45454
> {code}
>
> The Node Manager is registered with the Resource Manager, but for every
> heartbeat the Resource Manager prints the above message.
>
> --
> This message is automatically generated by JIRA.
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>
>
