Thanks a lot Andrei. This answers all my queries.

The cluster would be used as a dev environment for a deduplication engine, and yes, it involves a bit of machine learning (GA and Naive Bayes).
Again, thanks for all the help.

Regards,
Paritosh
________________________________
From: Andrei Savu [[email protected]]
Sent: Thursday, June 21, 2012 2:39 PM
To: [email protected]
Subject: Re: Whirr 0.7.1 : CDH : Rackspace : Problem

Great!

You have to either install Hadoop on your local machine (the same version as the one running on the cluster) or SSH into the namenode to be able to use the hadoop command.

We are not shipping the Hadoop binaries with Whirr.
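
For example, a rough sketch of both options (assuming a Debian/Ubuntu local machine with the CDH3 apt repository configured; the hadoop-0.20 client package name matches what the configure logs further down show being installed on the cluster, and the paths are taken from your own log):

# Option 1: install a matching Hadoop client locally and point it at the Whirr-generated config
sudo apt-get install hadoop-0.20
. ~/.whirr/humaninference_rackspace/hadoop-proxy.sh &
export HADOOP_CONF_DIR=~/.whirr/humaninference_rackspace
hadoop fs -ls /

# Option 2: skip the local install and run the command on the namenode itself
ssh -i ~/.ssh/id_rsa [email protected]
hadoop fs -ls /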

BTW how are you planning to use the cluster? Machine learning?

-- Andrei Savu / axemblr.com / Tools for Clouds

On Thu, Jun 21, 2012 at 3:36 PM, Paritosh Ranjan <[email protected]> wrote:
Thanks Andrei. Now the cluster is up.
However, I have one more problem left.

I am not able to use hadoop commands. I ran hadoop-proxy.sh and set HADOOP_CONF_DIR as described on the quick-start guide page.
Do I need to do anything more?

Please see the logs:

hadoepje@hadoepgrote:~/whirr/whirr-0.7.1/bin$ . 
~/.whirr/humaninference_rackspace/hadoop-proxy.sh &
[1] 11179
hadoepje@hadoepgrote:~/whirr/whirr-0.7.1/bin$ Running proxy to Hadoop cluster at 184-106-176-156.static.cloud-ips.com. Use Ctrl-c to quit.
Warning: Permanently added '184-106-176-156.static.cloud-ips.com,184.106.176.156' (RSA) to the list of known hosts.

hadoepje@hadoepgrote:~/whirr/whirr-0.7.1/bin$ hadoop
bash: hadoop: command not found
hadoepje@hadoepgrote:~/whirr/whirr-0.7.1/bin$ export 
HADOOP_CONF_DIR=~/.whirr/humaninference_rackspace
hadoepje@hadoepgrote:~/whirr/whirr-0.7.1/bin$ hadoop fs -ls /
bash: hadoop: command not found

________________________________
From: Andrei Savu [[email protected]]
Sent: Thursday, June 21, 2012 1:56 PM

To: [email protected]
Subject: Re: Whirr 0.7.1 : CDH : Rackspace : Problem

Try replacing:

whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 
hadoop-datanode+hadoop-tasktracker

with:

whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,1 
hadoop-datanode+hadoop-tasktracker

The role order is important.
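
If it helps, a sketch of the full sequence (destroy-cluster is the standard Whirr counterpart to launch-cluster; the launch command is the one from your log):

./whirr destroy-cluster --config hadoop.properties    # tear down the half-configured cluster
# in hadoop.properties, list the master roles first:
# whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,1 hadoop-datanode+hadoop-tasktracker
./whirr launch-cluster --config hadoop.properties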

-- Andrei Savu / axemblr.com / Tools for Clouds


On Thu, Jun 21, 2012 at 2:52 PM, Paritosh Ranjan <[email protected]> wrote:
Sometimes the namenode is up, but the jobtracker has never come up.
I have tried it 4-5 times.
________________________________
From: Andrei Savu [[email protected]]
Sent: Thursday, June 21, 2012 1:49 PM
To: [email protected]
Subject: Re: Whirr 0.7.1 : CDH : Rackspace : Problem

Have you checked the servers? I see no serious error. The file permissions 
issue should be harmless.

By default Whirr deploys CDH3u4. Is that what you need?

-- Andrei Savu / axemblr.com / Tools for Clouds

On Thu, Jun 21, 2012 at 2:43 PM, Paritosh Ranjan <[email protected]> wrote:
This is the complete log, in case it helps.

 ./whirr launch-cluster --config hadoop.properties
Bootstrapping cluster
Configuring template
Configuring template
Starting 1 node(s) with roles [hadoop-datanode, hadoop-tasktracker]
Starting 1 node(s) with roles [hadoop-jobtracker, hadoop-namenode]
Nodes started: [[id=20933521, providerId=20933521, 
group=humaninferencerackspace, name=humaninferencerackspace-af5, 
location=[id=e90923cb40144ee405a900686d6cd631, scope=HOST, 
description=e90923cb40144ee405a900686d6cd631, parent=cloudservers-us, 
iso3166Codes=[], metadata={}], uri=null, imageId=112, os=[name=null, 
family=ubuntu, version=10.04, arch=null, is64Bit=true, description=Ubuntu 10.04 
LTS], state=RUNNING, loginPort=22, hostname=humaninferencerackspace-af5, 
privateAddresses=[10.183.34.9], publicAddresses=[50.57.166.152], 
hardware=[id=4, providerId=4, name=2GB server, processors=[[cores=8.0, 
speed=1.0]], ram=2048, volumes=[[id=null, type=LOCAL, size=80.0, device=null, 
durable=true, isBootDevice=true]], supportsImage=ALWAYS_TRUE, tags=[]], 
loginUser=root, userMetadata={}, tags=[]]]
Nodes started: [[id=20933520, providerId=20933520, 
group=humaninferencerackspace, name=humaninferencerackspace-d4b, 
location=[id=c948bbc4457a5a9a1d135260d0f5ec01, scope=HOST, 
description=c948bbc4457a5a9a1d135260d0f5ec01, parent=cloudservers-us, 
iso3166Codes=[], metadata={}], uri=null, imageId=112, os=[name=null, 
family=ubuntu, version=10.04, arch=null, is64Bit=true, description=Ubuntu 10.04 
LTS], state=RUNNING, loginPort=22, hostname=humaninferencerackspace-d4b, 
privateAddresses=[10.183.34.3], publicAddresses=[50.57.166.89], hardware=[id=4, 
providerId=4, name=2GB server, processors=[[cores=8.0, speed=1.0]], ram=2048, 
volumes=[[id=null, type=LOCAL, size=80.0, device=null, durable=true, 
isBootDevice=true]], supportsImage=ALWAYS_TRUE, tags=[]], loginUser=root, 
userMetadata={}, tags=[]]]
Authorizing firewall ingress to [20933520] on ports [50030] for [108.171.174.174/32]
Authorizing firewall ingress to [20933520] on ports [8021] for [50.57.166.89/32]
Authorizing firewall ingress to [20933520] on ports [50070] for [108.171.174.174/32]
Authorizing firewall ingress to [20933520] on ports [8020, 8021] for [50.57.166.89/32]
Authorizing firewall ingress to [20933520] on ports [50030] for [108.171.174.174/32]
Authorizing firewall ingress to [20933520] on ports [8021] for [50.57.166.89/32]
Starting to run scripts on cluster for phase configureinstances: 20933521
Starting to run scripts on cluster for phase configureinstances: 20933520
Running configure phase script on: 20933521
Running configure phase script on: 20933520
configure phase script run completed on: 20933521
configure phase script run completed on: 20933520

Successfully executed configure script: [output=        at 
java.io.FileOutputStream.openAppend(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:207)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
        at org.apache.log4j.FileAppender.setFile(FileAppender.java:290)
        at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:164)
        at 
org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:216)
        at 
org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:257)
        at 
org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:133)
hadoop-0.20-namenode.
Safe mode is OFF
, error=12/06/21 10:58:38 INFO namenode.FSNamesystem: supergroup=supergroup
12/06/21 10:58:38 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/06/21 10:58:38 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=1000
12/06/21 10:58:38 INFO namenode.FSNamesystem: isAccessTokenEnabled=false 
accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/06/21 10:58:38 INFO common.Storage: Image file of size 110 saved in 0 
seconds.
12/06/21 10:58:38 INFO common.Storage: Storage directory /data/hadoop/hdfs/name 
has been successfully formatted.
12/06/21 10:58:38 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at 50-57-166-89.static.cloud-ips.com/50.57.166.89
************************************************************/
, exitCode=0]
Successfully executed configure script: [output=java.io.FileNotFoundException: 
/var/log/hadoop/logs/SecurityAuth.audit (Permission denied)
        at java.io.FileOutputStream.openAppend(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:207)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
        at org.apache.log4j.FileAppender.setFile(FileAppender.java:290)
        at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:164)
        at 
org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:216)
        at 
org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:257)
        at 
org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:133)
hadoop-0.20-tasktracker.
, error=dpkg-preconfigure: unable to re-open stdin:
update-rc.d: warning: hadoop-0.20-datanode start runlevel arguments (2 3 4 5) 
do not match LSB Default-Start values (3 5)
update-rc.d: warning: hadoop-0.20-datanode stop runlevel arguments (0 1 6) do 
not match LSB Default-Stop values (0 1 2 4 6)
dpkg-preconfigure: unable to re-open stdin:
update-rc.d: warning: hadoop-0.20-tasktracker start runlevel arguments (2 3 4 
5) do not match LSB Default-Start values (3 5)
update-rc.d: warning: hadoop-0.20-tasktracker stop runlevel arguments (0 1 6) 
do not match LSB Default-Stop values (0 1 2 4 6)
, exitCode=0]
Finished running configure phase scripts on all cluster instances
Completed configuration of humaninference_rackspace role hadoop-jobtracker
Jobtracker web UI available at http://50-57-166-89.static.cloud-ips.com:50030
Completed configuration of humaninference_rackspace role hadoop-namenode
Namenode web UI available at http://50-57-166-89.static.cloud-ips.com:50070
Wrote Hadoop site file 
/home/hadoepje/.whirr/humaninference_rackspace/hadoop-site.xml
Wrote Hadoop proxy script 
/home/hadoepje/.whirr/humaninference_rackspace/hadoop-proxy.sh
Completed configuration of humaninference_rackspace role hadoop-datanode
Completed configuration of humaninference_rackspace role hadoop-tasktracker
Wrote instances file /home/hadoepje/.whirr/humaninference_rackspace/instances
Started cluster of 2 instances
Cluster{instances=[Instance{roles=[hadoop-datanode, hadoop-tasktracker], 
publicIp=50.57.166.152, privateIp=10.183.34.9, id=20933521, 
nodeMetadata=[id=20933521, providerId=20933521, group=humaninferencerackspace, 
name=humaninferencerackspace-af5, 
location=[id=e90923cb40144ee405a900686d6cd631, scope=HOST, 
description=e90923cb40144ee405a900686d6cd631, parent=cloudservers-us, 
iso3166Codes=[], metadata={}], uri=null, imageId=112, os=[name=null, 
family=ubuntu, version=10.04, arch=null, is64Bit=true, description=Ubuntu 10.04 
LTS], state=RUNNING, loginPort=22, hostname=humaninferencerackspace-af5, 
privateAddresses=[10.183.34.9], publicAddresses=[50.57.166.152], 
hardware=[id=4, providerId=4, name=2GB server, processors=[[cores=8.0, 
speed=1.0]], ram=2048, volumes=[[id=null, type=LOCAL, size=80.0, device=null, 
durable=true, isBootDevice=true]], supportsImage=ALWAYS_TRUE, tags=[]], 
loginUser=root, userMetadata={}, tags=[]]}, Instance{roles=[hadoop-jobtracker, 
hadoop-namenode], publicIp=50.57.166.89, privateIp=10.183.34.3, id=20933520, 
nodeMetadata=[id=20933520, providerId=20933520, group=humaninferencerackspace, 
name=humaninferencerackspace-d4b, 
location=[id=c948bbc4457a5a9a1d135260d0f5ec01, scope=HOST, 
description=c948bbc4457a5a9a1d135260d0f5ec01, parent=cloudservers-us, 
iso3166Codes=[], metadata={}], uri=null, imageId=112, os=[name=null, 
family=ubuntu, version=10.04, arch=null, is64Bit=true, description=Ubuntu 10.04 
LTS], state=RUNNING, loginPort=22, hostname=humaninferencerackspace-d4b, 
privateAddresses=[10.183.34.3], publicAddresses=[50.57.166.89], hardware=[id=4, 
providerId=4, name=2GB server, processors=[[cores=8.0, speed=1.0]], ram=2048, 
volumes=[[id=null, type=LOCAL, size=80.0, device=null, durable=true, 
isBootDevice=true]], supportsImage=ALWAYS_TRUE, tags=[]], loginUser=root, 
userMetadata={}, tags=[]]}], 
configuration={fs.default.name=hdfs://50-57-166-89.static.cloud-ips.com:8020/, mapred.job.tracker=50-57-166-89.static.cloud-ips.com:8021, hadoop.job.ugi=root,root, hadoop.rpc.socket.factory.class.default=org.apache.hadoop.net.SocksSocketFactory, hadoop.socks.server=localhost:6666}}
You can log into instances using the following ssh commands:
'ssh -i /home/hadoepje/.ssh/id_rsa -o "UserKnownHostsFile /dev/null" -o StrictHostKeyChecking=no [email protected]'
'ssh -i /home/hadoepje/.ssh/id_rsa -o "UserKnownHostsFile /dev/null" -o StrictHostKeyChecking=no [email protected]'
hadoepje@hadoepgrote:~/whirr/whirr-0.7.1/bin$ 
/var/log/hadoop/logs/SecurityAuth.audit (Permission denied)
bash: syntax error near unexpected token `Permission'

________________________________
From: Paritosh Ranjan [[email protected]]
Sent: Thursday, June 21, 2012 1:21 PM
To: [email protected]
Subject: Whirr 0.7.1 : CDH : Rackspace : Problem

Hi,

I am switching to Whirr 0.7.1 from a 0.6.x version.
I am facing an issue while installing the CDH distribution on Rackspace (I am able to install the Apache distribution).

I have followed the instructions on this page 
http://whirr.apache.org/docs/0.7.1/quick-start-guide.html.

This is my hadoop.properties file:

whirr.cluster-name=<theclustername>
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 
hadoop-datanode+hadoop-tasktracker
whirr.provider=cloudservers-us
whirr.identity=<theidentity :)>
whirr.credential=<thepassword :)>
whirr.hardware-id=4
whirr.image=49
whirr.private-key-file=/home/hadoepje/.ssh/id_rsa
whirr.public-key-file=/home/hadoepje/.ssh/id_rsa.pub
whirr.hadoop.install-function=install_cdh_hadoop
whirr.hadoop.configure-function=configure_cdh_hadoop


However, I am getting this error. Can someone help?

Successfully executed configure script: [output=        at 
java.io.FileOutputStream.openAppend(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:207)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
        at org.apache.log4j.FileAppender.setFile(FileAppender.java:290)
        at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:164)
        at 
org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:216)
        at 
org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:257)
        at 
org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:133)
hadoop-0.20-namenode.
Safe mode is OFF
, error=12/06/21 10:58:38 INFO namenode.FSNamesystem: supergroup=supergroup
12/06/21 10:58:38 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/06/21 10:58:38 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=1000
12/06/21 10:58:38 INFO namenode.FSNamesystem: isAccessTokenEnabled=false 
accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/06/21 10:58:38 INFO common.Storage: Image file of size 110 saved in 0 
seconds.
12/06/21 10:58:38 INFO common.Storage: Storage directory /data/hadoop/hdfs/name 
has been successfully formatted.
12/06/21 10:58:38 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at 50-57-166-89.static.cloud-ips.com/50.57.166.89
************************************************************/
, exitCode=0]
Successfully executed configure script: [output=java.io.FileNotFoundException: 
/var/log/hadoop/logs/SecurityAuth.audit (Permission denied)
        at java.io.FileOutputStream.openAppend(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:207)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
        at org.apache.log4j.FileAppender.setFile(FileAppender.java:290)
        at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:164)
        at 
org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:216)
        at 
org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:257)
        at 
org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:133)
hadoop-0.20-tasktracker.

Thanks and Regards,
Paritosh


