We are using Whirr to set up a Rackspace cluster to run Hadoop jobs, with the
Cloudera Hadoop distribution. Below is our hadoop.properties:
whirr.cluster-name=our_cluster
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,6 hadoop-datanode+hadoop-tasktracker
whirr.provider=cloudservers-us
whirr.identity=${env:RACKSPACE_USERNAME}
whirr.credential=${env:RACKSPACE_API_KEY}
whirr.hardware-id=6
whirr.image=49
whirr.login-user=user
whirr.private-key-file=/home/user/.ssh/id_rsa_whirr
whirr.public-key-file=/home/user/.ssh/id_rsa_whirr.pub
whirr.hadoop-install-function=install_cdh_hadoop
whirr.hadoop-configure-function=configure_cdh_hadoop
Everything works fine. Now we want to change the Hadoop configuration on the
nodes: specifically, we want to increase the heap space available to Hadoop
(HADOOP_HEAPSIZE), which means changing the hadoop-env.sh file on each node.
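To make the goal concrete, this is roughly the line we would like to end up
with in hadoop-env.sh on every node. The value 2000 is only an illustrative
assumption, not something we have settled on; HADOOP_HEAPSIZE is given in MB,
and Hadoop falls back to 1000 when it is unset.

```shell
# hadoop-env.sh (sketch; 2000 is an example value, not a recommendation)
# Maximum heap size, in MB, for the Hadoop daemons.
export HADOOP_HEAPSIZE=2000
```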
My question is: how can we do that? Do we need to open
lib/whirr-cdh-0.6.0-incubating.jar, tweak the contents of that jar, and then
repackage it?
I hope somebody can share some knowledge on this. Thanks!