I am happy it works for you. One thing you need to keep in mind is that there is no firewall for clusters started on Rackspace.
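The launch log below does print "Authorizing firewall ingress" messages, but since nothing is enforced on Rackspace, closing the service ports is up to you if exposure matters. A minimal sketch of that idea with iptables, run on each node; the trusted address and private range are placeholders rather than values taken from this cluster, and you may also need to allow the nodes' own public addresses if the daemons talk to each other over them:

    # Sketch only: allow SSH and the Hadoop web UIs from one trusted address,
    # keep intra-cluster traffic on the private network, and drop the rest.
    TRUSTED=203.0.113.10        # placeholder: your workstation's public IP
    CLUSTER_NET=10.183.0.0/16   # placeholder: the private network the nodes use
    iptables -A INPUT -i lo -j ACCEPT
    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    iptables -A INPUT -s "$CLUSTER_NET" -j ACCEPT
    iptables -A INPUT -p tcp -s "$TRUSTED" --dport 22 -j ACCEPT
    iptables -A INPUT -p tcp -s "$TRUSTED" -m multiport --dports 50030,50070 -j ACCEPT
    iptables -P INPUT DROP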
Cheers,

-- Andrei Savu / axemblr.com / Tools for Clouds

On Thu, Jun 21, 2012 at 3:49 PM, Paritosh Ranjan <[email protected]> wrote:

> Thanks a lot Andrei. This answers all my queries.
>
> The cluster would be used as a dev env for a deduplication engine, and
> yes, it involves a bit of machine learning (GA and Naive Bayes).
> Again, thanks for all the help.
>
> Regards,
> Paritosh
> ------------------------------
> *From:* Andrei Savu [[email protected]]
> *Sent:* Thursday, June 21, 2012 2:39 PM
> *To:* [email protected]
> *Subject:* Re: Whirr 0.7.1 : CDH : Rackspace : Problem
>
> Great!
>
> You have to either install Hadoop on your local machine (the same version
> as the one running on the cluster) or ssh into the namenode to be able to
> use the hadoop command.
>
> We are not shipping the Hadoop binaries with Whirr.
>
> BTW, how are you planning to use the cluster? Machine learning?
>
> -- Andrei Savu / axemblr.com / Tools for Clouds
>
> On Thu, Jun 21, 2012 at 3:36 PM, Paritosh Ranjan <[email protected]> wrote:
>
>> Thanks Andrei. Now the cluster is up.
>> However, I have one more problem left.
>>
>> I am not able to use the hadoop commands. I ran hadoop-proxy.sh and set
>> HADOOP_CONF_DIR as mentioned on the quick start guide page.
>> Do I need to do anything more?
>>
>> Please see the logs:
>>
>> hadoepje@hadoepgrote:~/whirr/whirr-0.7.1/bin$ . ~/.whirr/humaninference_rackspace/hadoop-proxy.sh &
>> [1] 11179
>> hadoepje@hadoepgrote:~/whirr/whirr-0.7.1/bin$ Running proxy to Hadoop cluster at 184-106-176-156.static.cloud-ips.com. Use Ctrl-c to quit.
>> Warning: Permanently added '184-106-176-156.static.cloud-ips.com,184.106.176.156' (RSA) to the list of known hosts.
>>
>> hadoepje@hadoepgrote:~/whirr/whirr-0.7.1/bin$ hadoop
>> bash: hadoop: command not found
>> hadoepje@hadoepgrote:~/whirr/whirr-0.7.1/bin$ export HADOOP_CONF_DIR=~/.whirr/humaninference_rackspace
>> hadoepje@hadoepgrote:~/whirr/whirr-0.7.1/bin$ hadoop fs -ls /
>> bash: hadoop: command not found
>>
>> ------------------------------
>> *From:* Andrei Savu [[email protected]]
>> *Sent:* Thursday, June 21, 2012 1:56 PM
>> *To:* [email protected]
>> *Subject:* Re: Whirr 0.7.1 : CDH : Rackspace : Problem
>>
>> Try replacing:
>>
>> whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker
>>
>> with:
>>
>> whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,1 hadoop-datanode+hadoop-tasktracker
>>
>> The role order is important.
>>
>> -- Andrei Savu / axemblr.com / Tools for Clouds
>>
>> On Thu, Jun 21, 2012 at 2:52 PM, Paritosh Ranjan <[email protected]> wrote:
>>
>>> Sometimes the namenode is up but the jobtracker has never come up.
>>> I have tried it 4-5 times.
>>> ------------------------------
>>> *From:* Andrei Savu [[email protected]]
>>> *Sent:* Thursday, June 21, 2012 1:49 PM
>>> *To:* [email protected]
>>> *Subject:* Re: Whirr 0.7.1 : CDH : Rackspace : Problem
>>>
>>> Have you checked the servers? I see no serious error. The file
>>> permissions issue should be harmless.
>>>
>>> By default Whirr deploys CDH3U4. Is this what you need?
>>>
>>> -- Andrei Savu / axemblr.com / Tools for Clouds
>>>
>>> On Thu, Jun 21, 2012 at 2:43 PM, Paritosh Ranjan <[email protected]> wrote:
>>>
>>>> This is the complete log, if it can help.
>>>> ./whirr launch-cluster --config hadoop.properties
>>>> Bootstrapping cluster
>>>> Configuring template
>>>> Configuring template
>>>> Starting 1 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
>>>> Starting 1 node(s) with roles [hadoop-jobtracker, hadoop-namenode]
>>>> Nodes started: [[id=20933521, providerId=20933521, group=humaninferencerackspace, name=humaninferencerackspace-af5, location=[id=e90923cb40144ee405a900686d6cd631, scope=HOST, description=e90923cb40144ee405a900686d6cd631, parent=cloudservers-us, iso3166Codes=[], metadata={}], uri=null, imageId=112, os=[name=null, family=ubuntu, version=10.04, arch=null, is64Bit=true, description=Ubuntu 10.04 LTS], state=RUNNING, loginPort=22, hostname=humaninferencerackspace-af5, privateAddresses=[10.183.34.9], publicAddresses=[50.57.166.152], hardware=[id=4, providerId=4, name=2GB server, processors=[[cores=8.0, speed=1.0]], ram=2048, volumes=[[id=null, type=LOCAL, size=80.0, device=null, durable=true, isBootDevice=true]], supportsImage=ALWAYS_TRUE, tags=[]], loginUser=root, userMetadata={}, tags=[]]]
>>>> Nodes started: [[id=20933520, providerId=20933520, group=humaninferencerackspace, name=humaninferencerackspace-d4b, location=[id=c948bbc4457a5a9a1d135260d0f5ec01, scope=HOST, description=c948bbc4457a5a9a1d135260d0f5ec01, parent=cloudservers-us, iso3166Codes=[], metadata={}], uri=null, imageId=112, os=[name=null, family=ubuntu, version=10.04, arch=null, is64Bit=true, description=Ubuntu 10.04 LTS], state=RUNNING, loginPort=22, hostname=humaninferencerackspace-d4b, privateAddresses=[10.183.34.3], publicAddresses=[50.57.166.89], hardware=[id=4, providerId=4, name=2GB server, processors=[[cores=8.0, speed=1.0]], ram=2048, volumes=[[id=null, type=LOCAL, size=80.0, device=null, durable=true, isBootDevice=true]], supportsImage=ALWAYS_TRUE, tags=[]], loginUser=root, userMetadata={}, tags=[]]]
>>>> Authorizing firewall ingress to [20933520] on ports [50030] for [108.171.174.174/32]
>>>> Authorizing firewall ingress to [20933520] on ports [8021] for [50.57.166.89/32]
>>>> Authorizing firewall ingress to [20933520] on ports [50070] for [108.171.174.174/32]
>>>> Authorizing firewall ingress to [20933520] on ports [8020, 8021] for [50.57.166.89/32]
>>>> Authorizing firewall ingress to [20933520] on ports [50030] for [108.171.174.174/32]
>>>> Authorizing firewall ingress to [20933520] on ports [8021] for [50.57.166.89/32]
>>>> Starting to run scripts on cluster for phase configureinstances: 20933521
>>>> Starting to run scripts on cluster for phase configureinstances: 20933520
>>>> Running configure phase script on: 20933521
>>>> Running configure phase script on: 20933520
>>>> configure phase script run completed on: 20933521
>>>> configure phase script run completed on: 20933520
>>>>
>>>> Successfully executed configure script: [output= at java.io.FileOutputStream.openAppend(Native Method)
>>>> at java.io.FileOutputStream.<init>(FileOutputStream.java:207)
>>>> at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
>>>> at org.apache.log4j.FileAppender.setFile(FileAppender.java:290)
>>>> at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:164)
>>>> at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:216)
>>>> at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:257)
>>>> at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:133)
>>>> hadoop-0.20-namenode.
>>>> Safe mode is OFF
>>>> , error=12/06/21 10:58:38 INFO namenode.FSNamesystem: supergroup=supergroup
>>>> 12/06/21 10:58:38 INFO namenode.FSNamesystem: isPermissionEnabled=true
>>>> 12/06/21 10:58:38 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=1000
>>>> 12/06/21 10:58:38 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
>>>> 12/06/21 10:58:38 INFO common.Storage: Image file of size 110 saved in 0 seconds.
>>>> 12/06/21 10:58:38 INFO common.Storage: Storage directory /data/hadoop/hdfs/name has been successfully formatted.
>>>> 12/06/21 10:58:38 INFO namenode.NameNode: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down NameNode at 50-57-166-89.static.cloud-ips.com/50.57.166.89
>>>> ************************************************************/
>>>> , exitCode=0]
>>>> Successfully executed configure script: [output=java.io.FileNotFoundException: /var/log/hadoop/logs/SecurityAuth.audit (Permission denied)
>>>> at java.io.FileOutputStream.openAppend(Native Method)
>>>> at java.io.FileOutputStream.<init>(FileOutputStream.java:207)
>>>> at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
>>>> at org.apache.log4j.FileAppender.setFile(FileAppender.java:290)
>>>> at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:164)
>>>> at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:216)
>>>> at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:257)
>>>> at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:133)
>>>> hadoop-0.20-tasktracker.
>>>> , error=dpkg-preconfigure: unable to re-open stdin:
>>>> update-rc.d: warning: hadoop-0.20-datanode start runlevel arguments (2 3 4 5) do not match LSB Default-Start values (3 5)
>>>> update-rc.d: warning: hadoop-0.20-datanode stop runlevel arguments (0 1 6) do not match LSB Default-Stop values (0 1 2 4 6)
>>>> dpkg-preconfigure: unable to re-open stdin:
>>>> update-rc.d: warning: hadoop-0.20-tasktracker start runlevel arguments (2 3 4 5) do not match LSB Default-Start values (3 5)
>>>> update-rc.d: warning: hadoop-0.20-tasktracker stop runlevel arguments (0 1 6) do not match LSB Default-Stop values (0 1 2 4 6)
>>>> , exitCode=0]
>>>> Finished running configure phase scripts on all cluster instances
>>>> Completed configuration of humaninference_rackspace role hadoop-jobtracker
>>>> Jobtracker web UI available at http://50-57-166-89.static.cloud-ips.com:50030
>>>> Completed configuration of humaninference_rackspace role hadoop-namenode
>>>> Namenode web UI available at http://50-57-166-89.static.cloud-ips.com:50070
>>>> Wrote Hadoop site file /home/hadoepje/.whirr/humaninference_rackspace/hadoop-site.xml
>>>> Wrote Hadoop proxy script /home/hadoepje/.whirr/humaninference_rackspace/hadoop-proxy.sh
>>>> Completed configuration of humaninference_rackspace role hadoop-datanode
>>>> Completed configuration of humaninference_rackspace role hadoop-tasktracker
>>>> Wrote instances file /home/hadoepje/.whirr/humaninference_rackspace/instances
>>>> Started cluster of 2 instances
>>>> Cluster{instances=[Instance{roles=[hadoop-datanode, hadoop-tasktracker], publicIp=50.57.166.152, privateIp=10.183.34.9, id=20933521, nodeMetadata=[id=20933521, providerId=20933521, group=humaninferencerackspace, name=humaninferencerackspace-af5, location=[id=e90923cb40144ee405a900686d6cd631, scope=HOST, description=e90923cb40144ee405a900686d6cd631, parent=cloudservers-us, iso3166Codes=[], metadata={}], uri=null, imageId=112, os=[name=null, family=ubuntu, version=10.04, arch=null, is64Bit=true, description=Ubuntu 10.04 LTS], state=RUNNING, loginPort=22, hostname=humaninferencerackspace-af5, privateAddresses=[10.183.34.9], publicAddresses=[50.57.166.152], hardware=[id=4, providerId=4, name=2GB server, processors=[[cores=8.0, speed=1.0]], ram=2048, volumes=[[id=null, type=LOCAL, size=80.0, device=null, durable=true, isBootDevice=true]], supportsImage=ALWAYS_TRUE, tags=[]], loginUser=root, userMetadata={}, tags=[]]},
>>>> Instance{roles=[hadoop-jobtracker, hadoop-namenode], publicIp=50.57.166.89, privateIp=10.183.34.3, id=20933520, nodeMetadata=[id=20933520, providerId=20933520, group=humaninferencerackspace, name=humaninferencerackspace-d4b, location=[id=c948bbc4457a5a9a1d135260d0f5ec01, scope=HOST, description=c948bbc4457a5a9a1d135260d0f5ec01, parent=cloudservers-us, iso3166Codes=[], metadata={}], uri=null, imageId=112, os=[name=null, family=ubuntu, version=10.04, arch=null, is64Bit=true, description=Ubuntu 10.04 LTS], state=RUNNING, loginPort=22, hostname=humaninferencerackspace-d4b, privateAddresses=[10.183.34.3], publicAddresses=[50.57.166.89], hardware=[id=4, providerId=4, name=2GB server, processors=[[cores=8.0, speed=1.0]], ram=2048, volumes=[[id=null, type=LOCAL, size=80.0, device=null, durable=true, isBootDevice=true]], supportsImage=ALWAYS_TRUE, tags=[]], loginUser=root, userMetadata={}, tags=[]]}],
>>>> configuration={fs.default.name=hdfs://50-57-166-89.static.cloud-ips.com:8020/, mapred.job.tracker=50-57-166-89.static.cloud-ips.com:8021, hadoop.job.ugi=root,root, hadoop.rpc.socket.factory.class.default=org.apache.hadoop.net.SocksSocketFactory, hadoop.socks.server=localhost:6666}}
>>>> You can log into instances using the following ssh commands:
>>>> 'ssh -i /home/hadoepje/.ssh/id_rsa -o "UserKnownHostsFile /dev/null" -o StrictHostKeyChecking=no [email protected]'
>>>> 'ssh -i /home/hadoepje/.ssh/id_rsa -o "UserKnownHostsFile /dev/null" -o StrictHostKeyChecking=no [email protected]'
>>>> hadoepje@hadoepgrote:~/whirr/whirr-0.7.1/bin$ /var/log/hadoop/logs/SecurityAuth.audit (Permission denied)
>>>> bash: syntax error near unexpected token `Permission'
>>>>
>>>> ------------------------------
>>>> *From:* Paritosh Ranjan [[email protected]]
>>>> *Sent:* Thursday, June 21, 2012 1:21 PM
>>>> *To:* [email protected]
>>>> *Subject:* Whirr 0.7.1 : CDH : Rackspace : Problem
>>>>
>>>> Hi,
>>>>
>>>> I am switching to Whirr 0.7.1 from 0.6-someversion.
>>>> I am facing an issue while installing the CDH distribution on Rackspace
>>>> (I am able to install the Apache distribution).
>>>>
>>>> I have followed the instructions on this page:
>>>> http://whirr.apache.org/docs/0.7.1/quick-start-guide.html.
>>>>
>>>> This is my hadoop.properties file:
>>>>
>>>> whirr.cluster-name=<theclustername>
>>>> whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker
>>>> whirr.provider=cloudservers-us
>>>> whirr.identity=<theidentity :)>
>>>> whirr.credential=<thepassword :)>
>>>> whirr.hardware-id=4
>>>> whirr.image=49
>>>> whirr.private-key-file=/home/hadoepje/.ssh/id_rsa
>>>> whirr.public-key-file=/home/hadoepje/.ssh/id_rsa.pub
>>>> whirr.hadoop.install-function=install_cdh_hadoop
>>>> whirr.hadoop.configure-function=configure_cdh_hadoop
>>>>
>>>> However, I am getting this error. Can someone help?
>>>>
>>>> Successfully executed configure script: [output= at java.io.FileOutputStream.openAppend(Native Method)
>>>> at java.io.FileOutputStream.<init>(FileOutputStream.java:207)
>>>> at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
>>>> at org.apache.log4j.FileAppender.setFile(FileAppender.java:290)
>>>> at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:164)
>>>> at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:216)
>>>> at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:257)
>>>> at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:133)
>>>> hadoop-0.20-namenode.
>>>> Safe mode is OFF
>>>> , error=12/06/21 10:58:38 INFO namenode.FSNamesystem: supergroup=supergroup
>>>> 12/06/21 10:58:38 INFO namenode.FSNamesystem: isPermissionEnabled=true
>>>> 12/06/21 10:58:38 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=1000
>>>> 12/06/21 10:58:38 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
>>>> 12/06/21 10:58:38 INFO common.Storage: Image file of size 110 saved in 0 seconds.
>>>> 12/06/21 10:58:38 INFO common.Storage: Storage directory /data/hadoop/hdfs/name has been successfully formatted.
>>>> 12/06/21 10:58:38 INFO namenode.NameNode: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down NameNode at 50-57-166-89.static.cloud-ips.com/50.57.166.89
>>>> ************************************************************/
>>>> , exitCode=0]
>>>> Successfully executed configure script: [output=java.io.FileNotFoundException: /var/log/hadoop/logs/SecurityAuth.audit (Permission denied)
>>>> at java.io.FileOutputStream.openAppend(Native Method)
>>>> at java.io.FileOutputStream.<init>(FileOutputStream.java:207)
>>>> at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
>>>> at org.apache.log4j.FileAppender.setFile(FileAppender.java:290)
>>>> at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:164)
>>>> at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:216)
>>>> at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:257)
>>>> at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:133)
>>>> hadoop-0.20-tasktracker.
>>>>
>>>> Thanks and Regards,
>>>> Paritosh
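On the "hadoop: command not found" part earlier in the thread: hadoop-proxy.sh only starts the SOCKS tunnel, so the hadoop client itself still has to be present on the workstation (or the commands have to run on the namenode instead). A rough sketch of both options, assuming an Ubuntu client machine that already has Cloudera's CDH3 apt repository configured; the hadoop-0.20 package name is inferred from the hadoop-0.20-* services in the log above and is worth verifying:

    # Option 1: no local install, run the command on the namenode itself
    # (root is the login user reported in the node metadata above).
    ssh -i ~/.ssh/id_rsa [email protected] 'hadoop fs -ls /'

    # Option 2: install a matching CDH3 client locally, then go through the proxy
    # using the configuration directory that Whirr wrote.
    sudo apt-get install hadoop-0.20
    . ~/.whirr/humaninference_rackspace/hadoop-proxy.sh &
    export HADOOP_CONF_DIR=~/.whirr/humaninference_rackspace
    hadoop fs -ls /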

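Putting the role-order fix from earlier in the thread into the original file gives the hadoop.properties below. Two things differ from the version quoted above: the namenode role now comes before the jobtracker in whirr.instance-templates, and the image property is spelled whirr.image-id as in the Whirr configuration guide (the original whirr.image=49 line appears to have been ignored, since the log shows imageId=112). The identity and credential values remain placeholders.

    # Relaunch with the corrected template order (a sketch, not verified end to end).
    cat > hadoop.properties <<'EOF'
    whirr.cluster-name=<theclustername>
    whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,1 hadoop-datanode+hadoop-tasktracker
    whirr.provider=cloudservers-us
    whirr.identity=<theidentity>
    whirr.credential=<thepassword>
    whirr.hardware-id=4
    whirr.image-id=49
    whirr.private-key-file=/home/hadoepje/.ssh/id_rsa
    whirr.public-key-file=/home/hadoepje/.ssh/id_rsa.pub
    whirr.hadoop.install-function=install_cdh_hadoop
    whirr.hadoop.configure-function=configure_cdh_hadoop
    EOF
    ./whirr launch-cluster --config hadoop.properties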