Hello,

I downloaded Hadoop 0.20.0 and used the src/contrib/ec2/bin scripts to launch a Hadoop cluster on Amazon EC2. To do so, I modified the bundled scripts for my EC2 account and then created my own Hadoop 0.20.0 AMI. The steps I followed for creating the AMI and launching the EC2 Hadoop cluster are the same ones I have used for over a year with Hadoop 0.18.* and 0.19.*.
I launched an instance with my new Hadoop 0.20.0 AMI, then logged in and ran the following to launch a new cluster:

    root(/vol/hadoop-0.20.0)> bin/launch-hadoop-cluster hadoop-test 2

After the usual EC2 wait, one master and two slave instances were launched on EC2, as expected. When I ssh'ed into the instances, here is what I found:

    Slaves: DataNode and NameNode are running
    Master: Only NameNode is running

I could use HDFS commands (using the $HADOOP_HOME/bin/hadoop scripts) without any problems, from both the master and the slaves. However, since the JobTracker is not running, I cannot run map-reduce jobs.

I verified that the port for fs.default.name is set to 50001, that the port for mapred.job.tracker is set to 50002, and that nothing else conflicts with these ports.

I checked the JobTracker logs in /vol/hadoop-0.20.0/logs, reproduced below:

-----------------------------------------------
<<<
2009-07-20 16:56:30,273 WARN org.apache.hadoop.conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
2009-07-20 16:56:30,320 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG:   host = domU-12-31-39-04-30-16/10.240.55.228
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009
************************************************************/
2009-07-20 16:56:31,332 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=JobTracker, port=50002
2009-07-20 16:56:31,603 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2009-07-20 16:56:31,900 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50030
2009-07-20 16:56:31,900 INFO org.mortbay.log: jetty-6.1.14
2009-07-20 16:56:33,461 INFO org.mortbay.log: Started [email protected]:50030
2009-07-20 16:56:33,462 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
2009-07-20 16:56:33,531 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 50002
2009-07-20 16:56:33,532 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
2009-07-20 16:56:51,554 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
2009-07-20 16:56:53,060 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /mnt/hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:739)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy4.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy4.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873)
...
...
2009-07-20 16:56:55,878 WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping /mnt/hadoop/mapred/system/jobtracker.info retries left 1
2009-07-20 16:56:59,082 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /mnt/hadoop/mapred/system/jobtracker.info could only replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
...
...
2009-07-20 16:57:00,092 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding to domU-12-31-39-04-30-16.compute-1.internal/10.240.55.228:50002 : Address already in use
        at org.apache.hadoop.ipc.Server.bind(Server.java:190)
        at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:253)
        at org.apache.hadoop.ipc.Server.<init>(Server.java:1026)
        at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:488)
        at org.apache.hadoop.ipc.RPC.getServer(RPC.java:450)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1537)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:174)
        at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3528)
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at org.apache.hadoop.ipc.Server.bind(Server.java:188)
        ... 7 more

2009-07-20 16:57:00,093 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down JobTracker at domU-12-31-39-04-30-16/10.240.55.228
************************************************************/
>>>
-----------------------------------------------

So it looks like the JobTracker launched, but then died while trying to replicate the jobtracker.info file to one or more slaves.

I would appreciate any help with this.

Thanks a lot,
jp
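
For reference, a port check like the one mentioned above could be repeated on the master with a minimal sketch such as the following. It assumes Python is available on the instance; `port_in_use` is a hypothetical helper written for this post, not part of Hadoop or the EC2 scripts. It tries to bind a TCP socket to the given address, which is the same operation that fails with "Address already in use" in the FATAL entry of the log.

```python
import errno
import socket


def port_in_use(host: str, port: int) -> bool:
    """Return True if binding a TCP socket to (host, port) fails with EADDRINUSE.

    Hypothetical helper for illustration only; it is not a Hadoop API.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # any other bind failure is reported unchanged
        return False


if __name__ == "__main__":
    # Ports taken from the post: fs.default.name and mapred.job.tracker.
    for name, port in (("fs.default.name", 50001), ("mapred.job.tracker", 50002)):
        state = "in use" if port_in_use("0.0.0.0", port) else "free"
        print(f"{name} port {port}: {state}")
```

Run on the master while the NameNode is up and the JobTracker is down, a check of this kind would be expected to report 50001 as in use. If 50002 also shows as in use, some other process is holding it; if it shows as free, the conflict may only arise during the JobTracker's own startup.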
