Anyone? Anyone at all?
I figured out the issue with the jobtracker, but I still have the
errors:

* Error register getProtocolVersion
* File (..) could only be replicated to 0 nodes, instead of 1

as explained in my first mail.

The 2nd error can appear without _any_ errors in _any_ of the datanode
or tasktracker logs, and the NameNode web interface even tells me all
nodes are live and none are dead.

This is effectively blocking me from using the cluster; I'm completely
in the dark and find it very frustrating. :(
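Since the web interface claims all nodes are live, it may help to compare that against what the namenode reports on the command line. A sketch (stock Hadoop 0.20 commands, run on the namenode from the install directory — paths assumed from the tarball setup described below):

```shell
# Ask the namenode for its view of each datanode: configured capacity,
# remaining space, and last-contact time. A node can show as "live" yet
# report 0 remaining bytes, which also produces "replicated to 0 nodes".
bin/hadoop dfsadmin -report

# Block-level health check of the whole filesystem
bin/hadoop fsck / -files -blocks -locations
```

If `dfsadmin -report` shows 0 remaining space or stale last-contact times, that narrows things down considerably even when the logs are clean.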

Thank you,

Dieter

On Mon, 4 Apr 2011 18:45:49 +0200
Dieter Plaetinck <[email protected]> wrote:

> Hi,
> I have a cluster of 4 Debian Squeeze machines; on all of them I
> installed the same version (hadoop-0.20.2.tar.gz).
> 
> I have: n-0 as namenode, n-1 as jobtracker, and n-{0,1,2,3} as slaves,
> but you can see all my configs in more detail @
> http://pastie.org/1754875
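For quick reference, the two properties that wire the daemons together normally look like the fragment below (the values are my assumptions from the n-0/n-1 layout above; the pastie has the exact configs):

```xml
<!-- core-site.xml: the namenode address datanodes and clients connect to -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://n-0:9000</value>
</property>

<!-- mapred-site.xml: the jobtracker address tasktrackers connect to -->
<property>
  <name>mapred.job.tracker</name>
  <value>n-1:9001</value>
</property>
```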
> 
> The machines have 3 GiB of RAM.
> I don't think disk space is an issue anywhere, but FWIW:
> /var/bomvl (which houses the HDFS data) is a 140 GB filesystem on
> all nodes, and they all have 1.8 GB free on /.
> 
> 
> so, I invoke the cluster on n-0:
> 
> $ bin/hadoop namenode -format
> 11/04/04 18:27:33 INFO namenode.NameNode: STARTUP_MSG: 
> /************************************************************
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = n-0.map-reduce.search.wall2.test/10.2.1.90
> STARTUP_MSG:   args = [-format]
> STARTUP_MSG:   version = 0.20.2
> STARTUP_MSG:   build =
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20
> -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
> ************************************************************/
> 11/04/04 18:27:33 INFO namenode.FSNamesystem: fsOwner=dplaetin,Search,root
> 11/04/04 18:27:33 INFO namenode.FSNamesystem: supergroup=supergroup
> 11/04/04 18:27:33 INFO namenode.FSNamesystem: isPermissionEnabled=true
> 11/04/04 18:27:33 INFO common.Storage: Image file of size 98 saved in 0 seconds.
> 11/04/04 18:27:34 INFO common.Storage: Storage directory /var/bomvl/name has been successfully formatted.
> 11/04/04 18:27:34 INFO namenode.NameNode: SHUTDOWN_MSG: 
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at n-0.map-reduce.search.wall2.test/10.2.1.90
> ************************************************************/
> 
> $ bin/start-all.sh
> starting namenode, logging to /var/log/hadoop/hadoop-dplaetin-namenode-n-0.map-reduce.search.wall2.test.out
> n-0: starting datanode, logging to /var/log/hadoop/hadoop-dplaetin-datanode-n-0.map-reduce.search.wall2.test.out
> n-2: starting datanode, logging to /var/log/hadoop/hadoop-dplaetin-datanode-n-2.map-reduce.search.wall2.test.out
> n-3: starting datanode, logging to /var/log/hadoop/hadoop-dplaetin-datanode-n-3.map-reduce.search.wall2.test.out
> n-1: starting datanode, logging to /var/log/hadoop/hadoop-dplaetin-datanode-n-1.map-reduce.search.wall2.test.out
> localhost: starting secondarynamenode, logging to /var/log/hadoop/hadoop-dplaetin-secondarynamenode-n-0.map-reduce.search.wall2.test.out
> starting jobtracker, logging to /var/log/hadoop/hadoop-dplaetin-jobtracker-n-0.map-reduce.search.wall2.test.out
> n-2: starting tasktracker, logging to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-2.map-reduce.search.wall2.test.out
> n-3: starting tasktracker, logging to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-3.map-reduce.search.wall2.test.out
> n-0: starting tasktracker, logging to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-0.map-reduce.search.wall2.test.out
> n-1: starting tasktracker, logging to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-1.map-reduce.search.wall2.test.out
> 
> 
> However, in
> n-0:/var/log/hadoop/hadoop-dplaetin-namenode-n-0.map-reduce.search.wall2.test.log,
> I see this:
> 
> 2011-04-04 18:27:34,760 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG: 
> /************************************************************
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = n-0.map-reduce.search.wall2.test/10.2.1.90
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
> ************************************************************/
> 2011-04-04 18:27:34,846 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=9000
> 2011-04-04 18:27:34,851 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: n-0-lan0/10.1.1.2:9000
> 2011-04-04 18:27:34,853 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
> 2011-04-04 18:27:34,854 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
> 2011-04-04 18:27:34,893 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=dplaetin,Search,root
> 2011-04-04 18:27:34,893 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
> 2011-04-04 18:27:34,893 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
> 2011-04-04 18:27:34,899 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
> 2011-04-04 18:27:34,900 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
> 2011-04-04 18:27:34,927 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1
> 2011-04-04 18:27:34,931 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
> 2011-04-04 18:27:34,931 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 98 loaded in 0 seconds.
> 2011-04-04 18:27:34,932 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /var/bomvl/name/current/edits of size 4 edits # 0 loaded in 0 seconds.
> 2011-04-04 18:27:34,971 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 98 saved in 0 seconds.
> 2011-04-04 18:27:35,146 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 263 msecs
> 2011-04-04 18:27:35,147 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Total number of blocks = 0
> 2011-04-04 18:27:35,147 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of invalid blocks = 0
> 2011-04-04 18:27:35,147 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of under-replicated blocks = 0
> 2011-04-04 18:27:35,147 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of over-replicated blocks = 0
> 2011-04-04 18:27:35,148 INFO org.apache.hadoop.hdfs.StateChange: STATE* Leaving safe mode after 0 secs.
> 2011-04-04 18:27:35,148 INFO org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
> 2011-04-04 18:27:35,148 INFO org.apache.hadoop.hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
> 2011-04-04 18:27:40,247 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2011-04-04 18:27:40,299 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
> 2011-04-04 18:27:40,300 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50070 webServer.getConnectors()[0].getLocalPort() returned 50070
> 2011-04-04 18:27:40,300 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50070
> 2011-04-04 18:27:40,300 INFO org.mortbay.log: jetty-6.1.14
> 2011-04-04 18:27:40,518 INFO org.mortbay.log: Started [email protected]:50070
> 2011-04-04 18:27:40,518 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: 0.0.0.0:50070
> 2011-04-04 18:27:40,521 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2011-04-04 18:27:40,521 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 9000: starting
> 2011-04-04 18:27:40,522 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000: starting
> 2011-04-04 18:27:40,522 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000: starting
> 2011-04-04 18:27:40,522 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000: starting
> 2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9000: starting
> 2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000: starting
> 2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000: starting
> 2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000: starting
> 2011-04-04 18:27:40,527 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000: starting
> 2011-04-04 18:27:40,528 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9000: starting
> 2011-04-04 18:27:40,527 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9000: starting
> 2011-04-04 18:27:40,550 INFO org.apache.hadoop.ipc.Server: Error register getProtocolVersion
> java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
>         at org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
>         at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:89)
>         at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
> [the identical "Error register getProtocolVersion" entry and stack trace appear twice more at 18:27:40,550]
> 2011-04-04 18:29:20,803 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.1.1.5:50010 storage DS-1679058086-10.2.0.80-50010-1301934560735
> 2011-04-04 18:29:20,806 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.1.5:50010
> 2011-04-04 18:29:32,077 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.1.1.3:50010 storage DS-195369122-10.2.1.91-50010-1301934572032
> 2011-04-04 18:29:32,077 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.1.3:50010
> 2011-04-04 18:29:39,321 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.1.1.4:50010 storage DS-564465257-10.2.1.79-50010-1301934579273
> 2011-04-04 18:29:39,321 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.1.4:50010
> 2011-04-04 18:30:58,007 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.1.1.2:50010 storage DS-166622494-10.2.1.90-50010-1301934658005
> 2011-04-04 18:30:58,007 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.1.2:50010
> 2011-04-04 18:33:13,788 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.1.1.2
> 2011-04-04 18:33:13,788 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
> 2011-04-04 18:33:14,091 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll FSImage from 10.1.1.2
> 2011-04-04 18:33:14,092 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 1 SyncTimes(ms): 33
> 
> 
> Does anyone know what the issue could be? The suggestions I could
> find in other threads are "check disk space" and "make sure the
> versions are the same"; I'm good on the former, and the latter is
> not an issue here.
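Both standard suggestions can be double-checked across all nodes in one sweep; a sketch (hostnames from my setup above; the hadoop path is a placeholder to adjust to the actual install location):

```shell
# Verify free space on the HDFS data filesystem and the hadoop build
# on every node, so a single odd node can't hide.
for h in n-0 n-1 n-2 n-3; do
  echo "== $h =="
  ssh "$h" 'df -h / /var/bomvl; /path/to/hadoop/bin/hadoop version | head -n 1'
done
```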
> 
> When I try to put a file on HDFS, I get:
> 
> 2011-04-04 17:10:28,306 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call addBlock(/tmp/hadoop-dplaetin/mapred/system/jobtracker.info, DFSClient_371874060) from 10.1.1.3:47491: error: java.io.IOException: File /tmp/hadoop-dplaetin/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
> java.io.IOException: File /tmp/hadoop-dplaetin/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
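"Could only be replicated to 0 nodes" means the namenode found no datanode it was willing to hand the block to. Since the datanodes did register, one thing worth ruling out is that the writing client can't actually reach their data ports. A sketch (datanode IPs taken from the namenode log above; assumes a netcat with `-z`/`-w`):

```shell
# From the machine where the write fails, probe each datanode's
# data-transfer port (50010) and IPC port (50020).
for ip in 10.1.1.2 10.1.1.3 10.1.1.4 10.1.1.5; do
  for port in 50010 50020; do
    nc -z -w 2 "$ip" "$port" && echo "$ip:$port open" || echo "$ip:$port CLOSED"
  done
done
```

If any of these show CLOSED from the client but open from the namenode, a firewall or the dual-interface setup (10.1.x vs 10.2.x addresses both appear in the logs) would be the place to look.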
> 
> 
> On n-1, hadoop-dplaetin-datanode-n-1.map-reduce.search.wall2.test.log looks good:
> 
> $ cat hadoop-dplaetin-datanode-n-1.map-reduce.search.wall2.test.log
> 2011-04-04 18:27:35,886 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: 
> /************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = n-1.map-reduce.search.wall2.test/10.2.1.91
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
> ************************************************************/
> 2011-04-04 18:27:40,535 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /var/bomvl/data is not formatted.
> 2011-04-04 18:27:40,535 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
> 2011-04-04 18:27:40,828 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
> 2011-04-04 18:27:40,829 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 50010
> 2011-04-04 18:27:40,831 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
> 2011-04-04 18:27:45,876 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2011-04-04 18:27:45,940 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075
> 2011-04-04 18:27:45,940 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075
> 2011-04-04 18:27:45,940 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
> 2011-04-04 18:27:45,940 INFO org.mortbay.log: jetty-6.1.14
> 2011-04-04 18:29:27,006 INFO org.mortbay.log: Started [email protected]:50075
> 2011-04-04 18:29:27,011 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
> 2011-04-04 18:29:32,026 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=DataNode, port=50020
> 2011-04-04 18:29:32,029 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2011-04-04 18:29:32,029 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
> 2011-04-04 18:29:32,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
> 2011-04-04 18:29:32,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
> 2011-04-04 18:29:32,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting
> 2011-04-04 18:29:32,031 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(n144-17a.wall2.test:50010, storageID=, infoPort=50075, ipcPort=50020)
> 2011-04-04 18:29:32,061 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: New storage id DS-195369122-10.2.1.91-50010-1301934572032 is assigned to data-node 10.1.1.3:50010
> 2011-04-04 18:29:32,061 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.1.1.3:50010, storageID=DS-195369122-10.2.1.91-50010-1301934572032, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/var/bomvl/data/current'}
> 2011-04-04 18:29:32,062 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
> 2011-04-04 18:29:32,070 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks got processed in 2 msecs
> 2011-04-04 18:29:32,071 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block scanner.
> 
> However, this doesn't seem very right:
> 
> $ head -n 40 hadoop-dplaetin-tasktracker-n-1.map-reduce.search.wall2.test.log
> 2011-04-04 18:27:39,041 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG: 
> /************************************************************
> STARTUP_MSG: Starting TaskTracker
> STARTUP_MSG:   host = n-1.map-reduce.search.wall2.test/10.2.1.91
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
> ************************************************************/
> 2011-04-04 18:27:44,176 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2011-04-04 18:27:44,263 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
> 2011-04-04 18:27:44,269 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
> 2011-04-04 18:27:44,269 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
> 2011-04-04 18:27:44,269 INFO org.mortbay.log: jetty-6.1.14
> 2011-04-04 18:28:05,858 INFO org.mortbay.log: Started [email protected]:50060
> 2011-04-04 18:28:05,863 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=TaskTracker, sessionId=
> 2011-04-04 18:28:05,872 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=TaskTracker, port=42914
> 2011-04-04 18:28:05,908 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2011-04-04 18:28:05,908 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 42914: starting
> 2011-04-04 18:28:05,909 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 42914: starting
> 2011-04-04 18:28:05,909 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 42914: starting
> 2011-04-04 18:28:05,909 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 42914: starting
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 42914: starting
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 42914: starting
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 42914: starting
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 42914: starting
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 42914: starting
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:42914
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_n144-17a.wall2.test:localhost/127.0.0.1:42914
> 2011-04-04 18:28:06,933 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 0 time(s).
> 2011-04-04 18:28:07,934 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 1 time(s).
> 2011-04-04 18:28:08,934 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 2 time(s).
> 2011-04-04 18:28:09,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 3 time(s).
> 2011-04-04 18:28:10,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 4 time(s).
> 2011-04-04 18:28:11,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 5 time(s).
> 2011-04-04 18:28:12,936 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 6 time(s).
> 2011-04-04 18:28:13,936 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 7 time(s).
> 2011-04-04 18:28:14,937 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 8 time(s).
> 2011-04-04 18:28:15,937 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 9 time(s).
> 2011-04-04 18:28:15,938 INFO org.apache.hadoop.ipc.RPC: Server at n-1/10.1.1.3:9001 not available yet, Zzzzz...
> 2011-04-04 18:28:17,939 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 0 time(s).
> 
> Indeed, the jobtracker does not seem to be running (although that's
> probably another problem):
> 
> $ sudo netstat -n -l -p | grep 9001
> $ 
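If the jobtracker simply never came up on this node, it can be started by hand and verified directly; a sketch (run from the hadoop install directory on the node that is supposed to host the jobtracker):

```shell
# Start just the jobtracker daemon, then confirm the process and the port.
bin/hadoop-daemon.sh start jobtracker
jps                                  # should now list a JobTracker process
sudo netstat -n -l -p | grep 9001    # something should be LISTENing on 9001
```

Starting it this way also writes any startup exception straight into the jobtracker .log file, which is the first place to look if it still dies.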
> 
> Thanks for any help,
> Dieter
