Hi Tim,

If it's just hung, jps should still show it as a running process.

In that case, see if you can get a stack trace using "jstack <pid>".
Paste that trace and we may be able to figure out where it's hung up.
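Something along these lines should do it (the grep pattern and the output
file name are just illustrative):

  jps -l | grep NameNode          # find the NameNode's JVM pid
  jstack <pid> > nn-stack.txt     # dump all thread stacks for that pid

Run it as the same user that started the daemon. If jstack can't attach
to the hung process, "jstack -F <pid>" may still work.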

-Todd

On Thu, Oct 15, 2009 at 12:12 AM, tim robertson
<[email protected]> wrote:
> Thanks for the info Todd,
>
>> I have also seen this error, but it's not fatal.
> Ok, I'll ignore it for the time being.
>
>> Is this log from just the NameNode or did you tail multiple logs? It
>> seems odd that your namenode would be trying to make an IPC client to
>> itself (port 8020).
>
> Just the name node.
>
>> After you see these messages, does your namenode shut down?
>
> It just hangs, really (logging "Already tried 9 time(s)", "Already tried
> 10 time(s)", etc.)
>
>> Does jps show it running? Is the web interface available at port 50070?
>
> No, it's never available.
>
> I'll keep hunting for some config error.
>
> Cheers
> Tim
>
>
>>
>> -Todd
>>
>>
>> On Wed, Oct 14, 2009 at 2:33 AM, tim robertson
>> <[email protected]> wrote:
>>> Hi all,
>>>
>>> Using Hadoop 0.20.1, I am seeing the following on namenode startup.
>>>
>>> 2009-10-14 11:09:54,232 INFO org.apache.hadoop.ipc.Server: Error
>>> register getProtocolVersion
>>> java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
>>>
>>> Could someone please point me in the right direction for diagnosing
>>> where I have gone wrong?
>>>
>>> Thanks
>>> Tim
>>>
>>>
>>>
>>> /************************************************************
>>> STARTUP_MSG: Starting NameNode
>>> STARTUP_MSG:   host = cluma.gbif.org/192.38.28.77
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 0.20.1
>>> STARTUP_MSG:   build =
>>> http://svn.apache.org/repos/asf/hadoop/common/tags/release-0.20.1-rc1
>>> -r 810220; compiled by 'oom' on Tue Sep  1 20:55:56 UTC 2009
>>> ************************************************************/
>>> 2009-10-14 11:09:53,010 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>>> Initializing RPC Metrics with hostName=NameNode, port=8020
>>> 2009-10-14 11:09:53,013 INFO
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>> cluma.gbif.clu/192.168.76.254:8020
>>> 2009-10-14 11:09:53,015 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>> 2009-10-14 11:09:53,019 INFO
>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>> Initializing NameNodeMeterics using context
>>> object:org.apache.hadoop.metrics.spi.NullContext
>>> 2009-10-14 11:09:53,056 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> fsOwner=root,root,bin,daemon,sys,adm,disk,wheel
>>> 2009-10-14 11:09:53,057 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> supergroup=supergroup
>>> 2009-10-14 11:09:53,057 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> isPermissionEnabled=true
>>> 2009-10-14 11:09:53,064 INFO
>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>> Initializing FSNamesystemMetrics using context
>>> object:org.apache.hadoop.metrics.spi.NullContext
>>> 2009-10-14 11:09:53,065 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>> FSNamesystemStatusMBean
>>> 2009-10-14 11:09:53,090 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1
>>> 2009-10-14 11:09:53,094 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Number of files under
>>> construction = 0
>>> 2009-10-14 11:09:53,094 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Image file of size 94
>>> loaded in 0 seconds.
>>> 2009-10-14 11:09:53,094 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Edits file
>>> /hadoop/name/current/edits of size 4 edits # 0 loaded in 0 seconds.
>>> 2009-10-14 11:09:53,098 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Image file of size 94
>>> saved in 0 seconds.
>>> 2009-10-14 11:09:53,113 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading
>>> FSImage in 72 msecs
>>> 2009-10-14 11:09:53,114 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Total number of
>>> blocks = 0
>>> 2009-10-14 11:09:53,114 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of invalid
>>> blocks = 0
>>> 2009-10-14 11:09:53,114 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
>>> under-replicated blocks = 0
>>> 2009-10-14 11:09:53,114 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
>>> over-replicated blocks = 0
>>> 2009-10-14 11:09:53,114 INFO org.apache.hadoop.hdfs.StateChange:
>>> STATE* Leaving safe mode after 0 secs.
>>> 2009-10-14 11:09:53,114 INFO org.apache.hadoop.hdfs.StateChange:
>>> STATE* Network topology has 0 racks and 0 datanodes
>>> 2009-10-14 11:09:53,114 INFO org.apache.hadoop.hdfs.StateChange:
>>> STATE* UnderReplicatedBlocks has 0 blocks
>>> 2009-10-14 11:09:53,198 INFO org.mortbay.log: Logging to
>>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>> org.mortbay.log.Slf4jLog
>>> 2009-10-14 11:09:53,242 INFO org.apache.hadoop.http.HttpServer: Port
>>> returned by webServer.getConnectors()[0].getLocalPort() before open()
>>> is -1. Opening the listener on 50070
>>> 2009-10-14 11:09:53,243 INFO org.apache.hadoop.http.HttpServer:
>>> listener.getLocalPort() returned 50070
>>> webServer.getConnectors()[0].getLocalPort() returned 50070
>>> 2009-10-14 11:09:53,243 INFO org.apache.hadoop.http.HttpServer: Jetty
>>> bound to port 50070
>>> 2009-10-14 11:09:53,243 INFO org.mortbay.log: jetty-6.1.14
>>> 2009-10-14 11:09:53,433 INFO org.mortbay.log: Started
>>> [email protected]:50070
>>> 2009-10-14 11:09:53,434 INFO
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at:
>>> 0.0.0.0:50070
>>> 2009-10-14 11:09:53,434 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> Responder: starting
>>> 2009-10-14 11:09:53,435 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> listener on 8020: starting
>>> 2009-10-14 11:09:53,438 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 0 on 8020: starting
>>> 2009-10-14 11:09:53,438 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 8020: starting
>>> 2009-10-14 11:09:53,438 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 2 on 8020: starting
>>> 2009-10-14 11:09:53,438 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 3 on 8020: starting
>>> 2009-10-14 11:09:53,438 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 4 on 8020: starting
>>> 2009-10-14 11:09:53,439 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 6 on 8020: starting
>>> 2009-10-14 11:09:53,439 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 7 on 8020: starting
>>> 2009-10-14 11:09:53,439 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 8 on 8020: starting
>>> 2009-10-14 11:09:53,441 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 9 on 8020: starting
>>> 2009-10-14 11:09:53,444 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 5 on 8020: starting
>>> 2009-10-14 11:09:54,232 INFO org.apache.hadoop.ipc.Server: Error
>>> register getProtocolVersion
>>> java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
>>>        at org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
>>>        at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:89)
>>>        at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
>>>        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
>>>        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>>>        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>>>        at java.security.AccessController.doPrivileged(Native Method)
>>>        at javax.security.auth.Subject.doAs(Subject.java:396)
>>>        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>>> 2009-10-14 11:09:54,667 INFO org.apache.hadoop.hdfs.StateChange:
>>> BLOCK* NameSystem.registerDatanode: node registration from
>>> 192.168.76.3:50010 storage
>>> DS-551038728-192.168.76.3-50010-1255511394664
>>> 2009-10-14 11:09:54,668 INFO org.apache.hadoop.net.NetworkTopology:
>>> Adding a new node: /default-rack/192.168.76.3:50010
>>> 2009-10-14 11:09:54,673 INFO org.apache.hadoop.hdfs.StateChange:
>>> BLOCK* NameSystem.registerDatanode: node registration from
>>> 192.168.76.2:50010 storage
>>> DS-681115827-192.168.76.2-50010-1255511394673
>>> 2009-10-14 11:09:54,674 INFO org.apache.hadoop.net.NetworkTopology:
>>> Adding a new node: /default-rack/192.168.76.2:50010
>>> 2009-10-14 11:09:54,689 INFO org.apache.hadoop.hdfs.StateChange:
>>> BLOCK* NameSystem.registerDatanode: node registration from
>>> 192.168.76.7:50010 storage
>>> DS-1731752631-192.168.76.7-50010-1255511394688
>>> 2009-10-14 11:09:54,689 INFO org.apache.hadoop.net.NetworkTopology:
>>> Adding a new node: /default-rack/192.168.76.7:50010
>>> 2009-10-14 11:09:54,701 INFO org.apache.hadoop.hdfs.StateChange:
>>> BLOCK* NameSystem.registerDatanode: node registration from
>>> 192.168.76.6:50010 storage
>>> DS-338163265-192.168.76.6-50010-1255511394698
>>> 2009-10-14 11:09:54,701 INFO org.apache.hadoop.net.NetworkTopology:
>>> Adding a new node: /default-rack/192.168.76.6:50010
>>> 2009-10-14 11:09:59,688 INFO org.apache.hadoop.hdfs.StateChange:
>>> BLOCK* NameSystem.registerDatanode: node registration from
>>> 192.168.76.1:50010 storage
>>> DS-988595128-192.168.76.1-50010-1255511399688
>>> 2009-10-14 11:09:59,689 INFO org.apache.hadoop.net.NetworkTopology:
>>> Adding a new node: /default-rack/192.168.76.1:50010
>>> 2009-10-14 11:10:04,277 INFO org.apache.hadoop.hdfs.StateChange:
>>> BLOCK* NameSystem.registerDatanode: node registration from
>>> 192.168.76.9:50010 storage
>>> DS-1096719686-192.168.76.9-50010-1255511404276
>>> 2009-10-14 11:10:04,278 INFO org.apache.hadoop.net.NetworkTopology:
>>> Adding a new node: /default-rack/192.168.76.9:50010
>>> 2009-10-14 11:10:04,289 INFO org.apache.hadoop.hdfs.StateChange:
>>> BLOCK* NameSystem.registerDatanode: node registration from
>>> 192.168.76.5:50010 storage
>>> DS-1405051412-192.168.76.5-50010-1255511404288
>>> 2009-10-14 11:10:04,289 INFO org.apache.hadoop.net.NetworkTopology:
>>> Adding a new node: /default-rack/192.168.76.5:50010
>>> 2009-10-14 11:10:05,961 INFO org.apache.hadoop.hdfs.StateChange:
>>> BLOCK* NameSystem.registerDatanode: node registration from
>>> 192.168.76.8:50010 storage
>>> DS-1138617755-192.168.76.8-50010-1255511405960
>>> 2009-10-14 11:10:05,961 INFO org.apache.hadoop.net.NetworkTopology:
>>> Adding a new node: /default-rack/192.168.76.8:50010
>>> 2009-10-14 11:10:06,154 INFO org.apache.hadoop.hdfs.StateChange:
>>> BLOCK* NameSystem.registerDatanode: node registration from
>>> 192.168.76.4:50010 storage
>>> DS-569548170-192.168.76.4-50010-1255511406153
>>> 2009-10-14 11:10:06,154 INFO org.apache.hadoop.net.NetworkTopology:
>>> Adding a new node: /default-rack/192.168.76.4:50010
>>> 2009-10-14 11:10:14,480 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: cluma.gbif.clu/192.168.76.254:8020. Already tried 0
>>> time(s).
>>> 2009-10-14 11:10:35,483 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: cluma.gbif.clu/192.168.76.254:8020. Already tried 1
>>> time(s).
>>> 2009-10-14 11:10:56,486 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: cluma.gbif.clu/192.168.76.254:8020. Already tried 2
>>> time(s).
>>> 2009-10-14 11:11:17,489 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: cluma.gbif.clu/192.168.76.254:8020. Already tried 3
>>> time(s).
>>> 2009-10-14 11:11:38,492 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: cluma.gbif.clu/192.168.76.254:8020. Already tried 4
>>> time(s).
>>> 2009-10-14 11:11:59,495 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: cluma.gbif.clu/192.168.76.254:8020. Already tried 5
>>> time(s).
>>> 2009-10-14 11:12:20,498 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: cluma.gbif.clu/192.168.76.254:8020. Already tried 6
>>> time(s).
>>> 2009-10-14 11:12:41,501 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: cluma.gbif.clu/192.168.76.254:8020. Already tried 7
>>> time(s).
>>> 2009-10-14 11:13:02,504 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: cluma.gbif.clu/192.168.76.254:8020. Already tried 8
>>> time(s).
>>> 2009-10-14 11:13:23,506 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: cluma.gbif.clu/192.168.76.254:8020. Already tried 9
>>> time(s).
>>> etc.
>>>
>>
>
