Joy,

The ordering of those 2 blocks would indeed destroy the initial ganglia setup.
Thanks for reporting this. We'll get a fix up for it ASAP.

Gary

On Fri, Nov 19, 2010 at 9:33 PM, Saptarshi Guha <[email protected]> wrote:

> I logged into the master
>
> 1. In hbase-ec2-init-remote.sh, the block is evaluated (on master)
>
> if [ "$IS_MASTER" = "true" ]; then
>   sed -i -e "s|\( *mcast_join *=.*\)|#\1|" \
>          -e "s|\( *bind *=.*\)|#\1|" \
>          -e "s|\( *mute *=.*\)| mute = yes|" \
>          -e "s|\( *location *=.*\)| location = \"master-node\"|" \
>          /etc/gmond.conf
>   mkdir -p /mnt/ganglia/rrds
>   chown -R ganglia:ganglia /mnt/ganglia/rrds
>   rm -rf /var/lib/ganglia; cd /var/lib; ln -s /mnt/ganglia ganglia; cd
>   service gmond start
>   service gmetad start
>   apachectl start
>
> but /mnt/ganglia/rrds is not present (maybe because mnt is not mounted? or
> see [2])
> gmetad is not running, so
>
> apachectl stop
> service gmetad stop ## was not running in the first place
> service gmond stop
>
> mkdir -p /mnt/ganglia/rrds
> chown -R ganglia:ganglia /mnt/ganglia/rrds
> rm -rf /var/lib/ganglia; cd /var/lib; ln -s /mnt/ganglia ganglia; cd
> service gmond start
> service gmetad start
> apachectl start
>
> and ganglia now works.
>
> [2] This block in hbase-ec2-int-remote-sh might undo everything in 1.
>
> # Reformat sdb as xfs
> umount /mnt
> mkfs.xfs -f /dev/sdb
> mount -o noatime /dev/sdb /mnt
>
> Cheers
>
> Joy
>
> On Fri, Nov 19, 2010 at 1:03 PM, Lars George <[email protected]> wrote:
>
> > Let us know what you find. Thanks!
> >
> > On Fri, Nov 19, 2010 at 6:49 PM, Saptarshi Guha <[email protected]> wrote:
> > > Yes, messages is the right place. Saw this
> > > Nov 19 12:25:26 ip-10-98-154-214 /usr/sbin/gmetad[1293]: Unable to
> > > mkdir(/var/lib/ganglia/rrds/unspecified): No such file or directory
> > > Cheers
> > > J
> > >
> > > On Fri, Nov 19, 2010 at 2:15 AM, Lars George <[email protected]> wrote:
> > >>
> > >> Yeah, this will be superseded by WHIRR-25 over the next month or two.
> > >> The "root" name was simply a choice, no reason not to change it. As
> > >> for Ganglia, do you see the Ganglia daemon run on each node? If not,
> > >> please have a look into the logs on the servers, the user scripts
> > >> usually log their process in the /var/log/messages or so.
> > >>
> > >> Please feel free to add any idea to the above WHIRR-25
> > >> (https://issues.apache.org/jira/browse/WHIRR-25) so that we can
> > >> include it.
> > >>
> > >> Lars
> > >>
> > >> On Fri, Nov 19, 2010 at 8:55 AM, Saptarshi Guha <[email protected]> wrote:
> > >> > Hello,
> > >> >
> > >> > Thanks to apurtells github repo of the hbase-ec2 i managed to start an
> > >> > hbase cluster.
> > >> > Everything works nicely, I can check the uis of the JT/NN and Hbase
> > >> > Master.
> > >> >
> > >> > What I cant see are the ganglia metrics despite the url provided by the
> > >> > proxy
> > >> >
> > >> > http://ec2-a-b-c-d.compute-1.amazonaws.com/ganglia
> > >> >
> > >> > There was an error collecting ganglia data (127.0.0.1:8652): fsockopen
> > >> > error: Connection refused
> > >> >
> > >> > 1) How can i check if even ganglia is running?
> > >> >
> > >> > On a side note, in the hbase-ec2 scripts, all the code to run an
> > >> > instance uses -k root.
> > >> > I changed this to -k $EC2_ROOT_PAIR_NAME where pair name is the name
> > >> > of the key I created, thus I dont
> > >> > have to restrict myself to -k root. This appears to work for me.
> > >> >
> > >> > Thanks
> > >> >
> > >> > Joy
> > >> >
> > >> > HBASE_VERSION=0.20.4
> > >> > HADOOP_VERSION=0.20.2
> > >> > https://github.com/apurtell/hbase-ec2
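
For anyone hitting the same problem before a fix lands: the thread suggests the Ganglia directories have to be created after /mnt is reformatted and remounted, not before. The following is only a rough sketch of the workaround steps reported above, wrapped in a function so it can be rerun safely. The function name `link_ganglia_rrds` and its two arguments are hypothetical and do not exist in hbase-ec2-init-remote.sh, and the `chown` is treated as best-effort on the assumption that the `ganglia` user may not be present on every machine.

```shell
#!/bin/sh
# Sketch only: link_ganglia_rrds is a hypothetical helper, not part of
# the hbase-ec2 scripts. It repeats the manual workaround from the thread.

link_ganglia_rrds() {
  mnt_root="$1"   # e.g. /mnt, once the xfs reformat and remount are done
  lib_root="$2"   # e.g. /var/lib

  # Refuse empty arguments so the rm -rf below can never touch "/ganglia".
  if [ -z "$mnt_root" ] || [ -z "$lib_root" ]; then
    echo "usage: link_ganglia_rrds MNT_ROOT LIB_ROOT" >&2
    return 1
  fi

  # Create the rrd directory on the (already mounted) data volume.
  mkdir -p "$mnt_root/ganglia/rrds" || return 1

  # Hand the directory to the ganglia user when that user exists.
  if id ganglia >/dev/null 2>&1; then
    chown -R ganglia:ganglia "$mnt_root/ganglia/rrds" \
      || echo "warning: could not chown $mnt_root/ganglia/rrds" >&2
  fi

  # Replace whatever is at LIB_ROOT/ganglia with a symlink to the volume.
  rm -rf "$lib_root/ganglia"
  ln -s "$mnt_root/ganglia" "$lib_root/ganglia"
}
```

On the master this would presumably be called as `link_ganglia_rrds /mnt /var/lib` after the `mount -o noatime /dev/sdb /mnt` step and before `service gmond start` and `service gmetad start`, which matches the order that worked when the commands were run by hand.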
