I spent a lot of time trying to figure this out, but I did not find a solution. The logs pointed me to possible bugs in the rrdupdate tool, so I tried different versions of ganglia and rrdtool, but the error is the same. When I run gmetad in debug mode, the segmentation fault appears after the following lines:

"Created rrd /var/lib/ganglia/rrds/hdcluster/xxx/metricssystem.MetricsSystem.publish_max_time.rrd"
"Created rrd /var/lib/ganglia/rrds/hdcluster/xxx/metricssystem.MetricsSystem.snapshot_max_time.rrd"

which I suppose are generated from MetricsSystemImpl.java. (Is there any way to disable just these two metrics?)

In /var/log/messages there are a lot of errors:

"xxx gmetad[15217]: RRD_update (/var/lib/ganglia/rrds/hdc/xxx/metricssystem.MetricsSystem.publish_imax_time.rrd): converting '4.9E-324' to float: Numerical result out of range"
"xxx gmetad[15217]: RRD_update (/var/lib/ganglia/rrds/hdc/xxx/metricssystem.MetricsSystem.snapshot_imax_time.rrd): converting '4.9E-324' to float: Numerical result out of range"

so there are probably some conversion issues? Where should I look for the solution? Would you rather suggest using ganglia 3.0.x with the old protocol and leaving versions >3.1 for later releases?

Any help is really appreciated...

On 1 February 2012 04:04, Merto Mertek <[email protected]> wrote:

> I would be glad to hear that too.. I've set up the following:
>
> Hadoop 0.20.205
> Ganglia Front 3.1.7
> Ganglia Back *(gmetad)* 3.1.7
> RRDTool <http://www.rrdtool.org/> 1.4.5. -> i had some troubles
> installing 1.4.4
>
> Ganglia works only when hadoop is not running, so metrics are not
> published to the gmetad node (conf with new hadoop-metrics2.properties). When
> hadoop is started, a segmentation fault appears in the gmetad daemon:
>
> sudo gmetad -d 2
> .......
> Updating host xxx, metric dfs.FSNamesystem.BlocksTotal
> Updating host xxx, metric bytes_in
> Updating host xxx, metric bytes_out
> Updating host xxx, metric metricssystem.MetricsSystem.publish_max_time
> Created rrd
> /var/lib/ganglia/rrds/hdcluster/hadoopmaster/metricssystem.MetricsSystem.publish_max_time.rrd
> Segmentation fault
>
> And some info from the apache log <http://pastebin.com/nrqKRtKJ>..
>
> Can someone suggest a ganglia version that is tested with hadoop 0.20.205?
> I will try to sort it out, however it seems a not so trivial problem..
>
> Thank you
>
> On 2 December 2011 12:32, praveenesh kumar <[email protected]> wrote:
>
>> or Do I have to apply some hadoop patch for this ?
>>
>> Thanks,
>> Praveenesh

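A side note on the '4.9E-324' in the RRD_update errors: that is exactly how Java prints Double.MIN_VALUE, the smallest positive (subnormal) double. One plausible reading, an unverified guess rather than something checked against the Hadoop or rrdtool sources, is that the metrics system sends this value as a placeholder "max" before any sample exists, and that a strtod-style conversion on the receiving side flags such a tiny value as out of range. The Python sketch below only checks the arithmetic; is_subnormal and sanitize are hypothetical helpers for illustration and are not part of ganglia, rrdtool or Hadoop.

```python
import math
import sys

# Value copied from the RRD_update errors in /var/log/messages.
reported = float("4.9E-324")

# Smallest positive subnormal double, 2**-1074 (Java's Double.MIN_VALUE).
smallest_subnormal = math.ldexp(1.0, -1074)


def is_subnormal(x: float) -> bool:
    """True if x is nonzero but below the smallest normal double in magnitude."""
    return x != 0.0 and abs(x) < sys.float_info.min


def sanitize(x: float) -> float:
    """Hypothetical helper: flush subnormal values to 0.0 before passing them on."""
    return 0.0 if is_subnormal(x) else x


print(reported == smallest_subnormal)  # True
print(is_subnormal(reported))          # True
print(sanitize(reported))              # 0.0
```

If the guess is right, the two *_max_time metrics would only carry this value until the first real sample arrives, so dropping or zeroing them before they reach rrdupdate might be enough to get past the error.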