Hi,

Sorry, I'm also a newbie with Whirr. I just did that and can access HDFS from my local machine now.
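
In case it helps, this is roughly what it looks like on my side (the cluster name and conf directory come from my own setup, so adjust them to yours):

$ sh ~/.whirr/myhadoopcluster/hadoop-proxy.sh             # in a separate terminal; keeps the SOCKS tunnel up
$ hadoop --config /etc/hadoop-0.20/conf.whirr fs -ls /    # or point HADOOP_CONF_DIR at that directory instead

The proxy has to stay running the whole time, because the hadoop-site.xml that Whirr writes routes all client RPC through the SOCKS tunnel on localhost:6666.
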
On Apr 16, 2012, at 1:58 AM, Đỗ Hoàng Khiêm wrote:

> Hi,
>
> I did it following the Whirr documentation:
>
> $ cp -r /etc/hadoop-0.20/conf.empty /etc/hadoop-0.20/conf.whirr
> $ rm -f /etc/hadoop-0.20/conf.whirr/*-site.xml
> $ cp ~/.whirr/myhadoopcluster/hadoop-site.xml /etc/hadoop-0.20/conf.whirr
>
> Just another (naive) question: can you explain the role of the local
> Hadoop installation in this setup? Why do we need to configure the local
> instance to work with the cluster? And what is the main advantage of Whirr
> over the EC2 scripts in the Hadoop src/contrib/ec2?
>
> Thanks for your reply.
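
As far as I understand it, the local Hadoop install doesn't run any daemons in this setup; it only acts as a client (the hadoop command-line tools and job submission). The hadoop-site.xml that Whirr generates is what ties that client to the cluster; the key entries are the same values that show up in the configuration={...} map at the end of your launch output further down:

fs.default.name=hdfs://ec2-50-17-128-123.compute-1.amazonaws.com:8020/
mapred.job.tracker=ec2-50-17-128-123.compute-1.amazonaws.com:8021
hadoop.socks.server=localhost:6666
hadoop.rpc.socket.factory.class.default=org.apache.hadoop.net.SocksSocketFactory

The first two point the local client at the remote namenode and jobtracker, and the last two make every connection go through the SOCKS tunnel that hadoop-proxy.sh opens. That is why the local instance has to be configured at all, and why the proxy must be running whenever you use it.
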
> On Mon, Apr 16, 2012 at 3:02 PM, Huanchen Zhang <[email protected]> wrote:
> Hi,
>
> How did you configure the local Hadoop?
>
> I simply copied ~/.whirr/myhadoopcluster/hadoop-site.xml to my local
> Hadoop config folder, and it works.
>
> Best,
> Huanchen
>
> On Apr 16, 2012, at 12:12 AM, Đỗ Hoàng Khiêm wrote:
>
>> Hi, I am new to Whirr and I'm trying to set up a Hadoop cluster on EC2 with
>> Whirr. I have followed the Cloudera tutorial at
>> https://ccp.cloudera.com/display/CDHDOC/Whirr+Installation
>>
>> Before installing Whirr, I installed Hadoop (0.20.2-cdh3u3) and then Whirr
>> (0.5.0-cdh3u3) on my local machine (running Linux Mint 11).
>>
>> Here's my cluster config file:
>>
>> whirr.cluster-name=large-cluster
>> whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker
>> whirr.provider=aws-ec2
>> whirr.identity=XXXXXXXXXXXXXXX
>> whirr.credential=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
>> whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
>> whirr.hadoop-install-function=install_cdh_hadoop
>> whirr.hadoop-configure-function=configure_cdh_hadoop
>> whirr.hardware-id=m1.large
>> whirr.image-id=us-east-1/ami-da0cf8b3
>> whirr.location-id=us-east-1
>>
>> The cluster launch looks normal:
>>
>> khiem@master ~ $ whirr launch-cluster --config large-hadoop.properties
>> Bootstrapping cluster
>> Configuring template
>> Starting 1 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
>> Configuring template
>> Starting 1 node(s) with roles [hadoop-jobtracker, hadoop-namenode]
>> Nodes started: [[id=us-east-1/i-9aa01dfd, providerId=i-9aa01dfd,
>> group=large-cluster, name=null, location=[id=us-east-1a, scope=ZONE,
>> description=us-east-1a, parent=us-east-1, iso3166Codes=[US-VA],
>> metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null,
>> family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true,
>> description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml],
>> state=RUNNING, loginPort=22, privateAddresses=[10.196.142.64],
>> publicAddresses=[107.20.64.97], hardware=[id=m1.large, providerId=m1.large,
>> name=null, processors=[[cores=2.0, speed=2.0]], ram=7680, volumes=[[id=null,
>> type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true],
>> [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false,
>> isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc,
>> durable=false, isBootDevice=false]], supportsImage=is64Bit()],
>> loginUser=ubuntu, userMetadata={}]]
>> Nodes started: [[id=us-east-1/i-0aa31e6d, providerId=i-0aa31e6d,
>> group=large-cluster, name=null, location=[id=us-east-1a, scope=ZONE,
>> description=us-east-1a, parent=us-east-1, iso3166Codes=[US-VA],
>> metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null,
>> family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true,
>> description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml],
>> state=RUNNING, loginPort=22, privateAddresses=[10.85.130.43],
>> publicAddresses=[50.17.128.123], hardware=[id=m1.large, providerId=m1.large,
>> name=null, processors=[[cores=2.0, speed=2.0]], ram=7680, volumes=[[id=null,
>> type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true],
>> [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false,
>> isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc,
>> durable=false, isBootDevice=false]], supportsImage=is64Bit()],
>> loginUser=ubuntu, userMetadata={}]]
>> Authorizing firewall ingress to [Instance{roles=[hadoop-jobtracker,
>> hadoop-namenode], publicIp=50.17.128.123, privateIp=10.85.130.43,
>> id=us-east-1/i-0aa31e6d, nodeMetadata=[id=us-east-1/i-0aa31e6d,
>> providerId=i-0aa31e6d, group=large-cluster, name=null,
>> location=[id=us-east-1a, scope=ZONE, description=us-east-1a,
>> parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null,
>> imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04,
>> arch=paravirtual, is64Bit=true,
>> description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml],
>> state=RUNNING, loginPort=22, privateAddresses=[10.85.130.43],
>> publicAddresses=[50.17.128.123], hardware=[id=m1.large, providerId=m1.large,
>> name=null, processors=[[cores=2.0, speed=2.0]], ram=7680, volumes=[[id=null,
>> type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true],
>> [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false,
>> isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc,
>> durable=false, isBootDevice=false]], supportsImage=is64Bit()],
>> loginUser=ubuntu, userMetadata={}]}] on ports [50070, 50030] for
>> [116.96.138.41/32]
>> Authorizing firewall ingress to [Instance{roles=[hadoop-jobtracker,
>> hadoop-namenode], publicIp=50.17.128.123, privateIp=10.85.130.43,
>> id=us-east-1/i-0aa31e6d, nodeMetadata=[id=us-east-1/i-0aa31e6d,
>> providerId=i-0aa31e6d, group=large-cluster, name=null,
>> location=[id=us-east-1a, scope=ZONE, description=us-east-1a,
>> parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null,
>> imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04,
>> arch=paravirtual, is64Bit=true,
>> description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml],
>> state=RUNNING, loginPort=22, privateAddresses=[10.85.130.43],
>> publicAddresses=[50.17.128.123], hardware=[id=m1.large, providerId=m1.large,
>> name=null, processors=[[cores=2.0, speed=2.0]], ram=7680, volumes=[[id=null,
>> type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true],
>> [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false,
>> isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc,
>> durable=false, isBootDevice=false]], supportsImage=is64Bit()],
>> loginUser=ubuntu, userMetadata={}]}] on ports [8020, 8021] for
>> [50.17.128.123/32]
>> Running configuration script
>> Configuration script run completed
>> Running configuration script
>> Configuration script run completed
>> Completed configuration of large-cluster
>> Namenode web UI available at http://ec2-50-17-128-123.compute-1.amazonaws.com:50070
>> Jobtracker web UI available at http://ec2-50-17-128-123.compute-1.amazonaws.com:50030
>> Wrote Hadoop site file /home/khiem/.whirr/large-cluster/hadoop-site.xml
>> Wrote Hadoop proxy script /home/khiem/.whirr/large-cluster/hadoop-proxy.sh
>> Wrote instances file /home/khiem/.whirr/large-cluster/instances
>> Started cluster of 2 instances
>> Cluster{instances=[Instance{roles=[hadoop-datanode, hadoop-tasktracker],
>> publicIp=107.20.64.97, privateIp=10.196.142.64, id=us-east-1/i-9aa01dfd,
>> nodeMetadata=[id=us-east-1/i-9aa01dfd, providerId=i-9aa01dfd,
>> group=large-cluster, name=null, location=[id=us-east-1a, scope=ZONE,
>> description=us-east-1a, parent=us-east-1, iso3166Codes=[US-VA],
>> metadata={}], uri=null, imageId=us-east-1/ami-da0cf8b3, os=[name=null,
>> family=ubuntu, version=10.04, arch=paravirtual, is64Bit=true,
>> description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml],
>> state=RUNNING, loginPort=22, privateAddresses=[10.196.142.64],
>> publicAddresses=[107.20.64.97], hardware=[id=m1.large, providerId=m1.large,
>> name=null, processors=[[cores=2.0, speed=2.0]], ram=7680, volumes=[[id=null,
>> type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true],
>> [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false,
>> isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc,
>> durable=false, isBootDevice=false]], supportsImage=is64Bit()],
>> loginUser=ubuntu, userMetadata={}]}, Instance{roles=[hadoop-jobtracker,
>> hadoop-namenode], publicIp=50.17.128.123, privateIp=10.85.130.43,
>> id=us-east-1/i-0aa31e6d, nodeMetadata=[id=us-east-1/i-0aa31e6d,
>> providerId=i-0aa31e6d, group=large-cluster, name=null,
>> location=[id=us-east-1a, scope=ZONE, description=us-east-1a,
>> parent=us-east-1, iso3166Codes=[US-VA], metadata={}], uri=null,
>> imageId=us-east-1/ami-da0cf8b3, os=[name=null, family=ubuntu, version=10.04,
>> arch=paravirtual, is64Bit=true,
>> description=ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20101020.manifest.xml],
>> state=RUNNING, loginPort=22, privateAddresses=[10.85.130.43],
>> publicAddresses=[50.17.128.123], hardware=[id=m1.large, providerId=m1.large,
>> name=null, processors=[[cores=2.0, speed=2.0]], ram=7680, volumes=[[id=null,
>> type=LOCAL, size=10.0, device=/dev/sda1, durable=false, isBootDevice=true],
>> [id=null, type=LOCAL, size=420.0, device=/dev/sdb, durable=false,
>> isBootDevice=false], [id=null, type=LOCAL, size=420.0, device=/dev/sdc,
>> durable=false, isBootDevice=false]], supportsImage=is64Bit()],
>> loginUser=ubuntu, userMetadata={}]}],
>> configuration={hadoop.job.ugi=root,root,
>> mapred.job.tracker=ec2-50-17-128-123.compute-1.amazonaws.com:8021,
>> hadoop.socks.server=localhost:6666,
>> fs.s3n.awsAccessKeyId=AKIAIGXAURLAB7CQE77A,
>> fs.s3.awsSecretAccessKey=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,
>> fs.s3.awsAccessKeyId=AZIAIGXIOPLAB7CQE77A,
>> hadoop.rpc.socket.factory.class.default=org.apache.hadoop.net.SocksSocketFactory,
>> fs.default.name=hdfs://ec2-50-17-128-123.compute-1.amazonaws.com:8020/,
>> fs.s3n.awsSecretAccessKey=dWDRq2z0EQegdPrbbL8Dab3eCu98O32r3gOrIbOK}}
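
Two things I notice in that launch output. First, the firewall rules Whirr created only open 50070/50030 to 116.96.138.41/32 (the public IP you launched from) and 8020/8021 to 50.17.128.123/32 (the namenode itself), so the namenode RPC port is never directly reachable from your laptop; every hadoop command has to go through the hadoop-proxy.sh SOCKS tunnel. Second, if your public IP has changed since the launch, even the web UI on 50070 will stop loading. A quick reachability check from your machine, assuming netcat is installed:

$ nc -zv ec2-50-17-128-123.compute-1.amazonaws.com 50070   # web UI port, authorized only for 116.96.138.41/32

If that cannot connect, it is either the security group (your IP changed) or nothing listening on the port at all.
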
>> I've also started the proxy and updated the local Hadoop configuration
>> following the Cloudera tutorial, but when I try to test HDFS with
>>
>> hadoop fs -ls /
>>
>> the terminal prints connection errors:
>>
>> 12/04/12 11:54:43 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found
>> in the classpath. Usage of hadoop-site.xml is deprecated. Instead use
>> core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of
>> core-default.xml, mapred-default.xml and hdfs-default.xml respectively
>> 12/04/12 11:54:43 INFO security.UserGroupInformation: JAAS Configuration
>> already set up for Hadoop, not re-installing.
>> 12/04/12 11:54:45 INFO ipc.Client: Retrying connect to server:
>> ec2-50-17-128-123.compute-1.amazonaws.com/50.17.128.123:8020. Already tried 0 time(s).
>> 12/04/12 11:54:46 INFO ipc.Client: Retrying connect to server:
>> ec2-50-17-128-123.compute-1.amazonaws.com/50.17.128.123:8020. Already tried 1 time(s).
>> 12/04/12 11:54:48 INFO ipc.Client: Retrying connect to server:
>> ec2-50-17-128-123.compute-1.amazonaws.com/50.17.128.123:8020. Already tried 2 time(s).
>> 12/04/12 11:54:49 INFO ipc.Client: Retrying connect to server:
>> ec2-50-17-128-123.compute-1.amazonaws.com/50.17.128.123:8020. Already tried 3 time(s)
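
That retry loop suggests the client side is wired up correctly: it is resolving the right namenode host and trying port 8020 through the tunnel. Combined with the "Connection refused" lines in your proxy terminal below, it most likely means the SOCKS tunnel reaches the namenode machine but nothing is listening on 8020 there, so the problem is on the cluster side rather than in your local configuration. You can still sanity-check the local half with something like:

$ ps aux | grep [h]adoop-proxy                                            # is the tunnel process still alive?
$ grep -A1 fs.default.name /etc/hadoop-0.20/conf.whirr/hadoop-site.xml    # which namenode the client targets

(The conf.whirr path is just where I keep the Whirr-generated config; use whatever directory your client actually reads.)
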
>> In the proxy terminal:
>>
>> Running proxy to Hadoop cluster at ec2-50-17-128-123.compute-1.amazonaws.com. Use Ctrl-c to quit.
>> Warning: Permanently added 'ec2-50-17-128-123.compute-1.amazonaws.com,50.17.128.123' (RSA) to the list of known hosts.
>> channel 2: open failed: connect failed: Connection refused
>> channel 2: open failed: connect failed: Connection refused
>> channel 2: open failed: connect failed: Connection refused
>> channel 2: open failed: connect failed: Connection refused
>> channel 2: open failed: connect failed: Connection refused
>>
>> The namenode web UI (port 50070) is also not available. I can ssh to the
>> namenode, but inside it there doesn't seem to be any Hadoop or Java
>> installation at all. Isn't that strange?
>>
>> Any comment is appreciated.
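
That last observation is probably the real problem: the AMI you are using is a stock Ubuntu 10.04 image, and it is Whirr's install_cdh_hadoop/configure_cdh_hadoop functions that are supposed to put Java and CDH on it. If you ssh in and find neither, the install phase most likely failed even though the launch reported success. I would poke around on the namenode roughly like this (paths assume the CDH3 package layout, so they may differ on your nodes):

$ ssh ubuntu@ec2-50-17-128-123.compute-1.amazonaws.com       # log in as the image's default user, then on the node:
$ which java && java -version                                # did the Java install run?
$ ls /usr/lib/hadoop-0.20 /etc/hadoop-0.20                   # are the CDH packages there at all?
$ ps aux | grep -i namenode                                  # is a namenode process running?
$ sudo netstat -plnt | grep -E '8020|50070'                  # is anything listening on the RPC / web UI ports?
$ sudo tail -n 50 /var/log/hadoop*/hadoop-*-namenode-*.log   # namenode logs, if the install got that far

There should also be a whirr.log in the directory where you ran launch-cluster on your local machine; that is usually the first place to look for errors from the bootstrap and configure scripts.
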
