Hello,
We tried to use ganglia on diskless nodes whose system lives in a ramdisk. The
nodes run a 2.5.30 kernel, and gmond dies after having sent one sample of
its metrics.
Here is the log:
=================
set_metric_value() got metric key 20
set_metric_value() exec'd mem_free_func (20)
mcast_value() mcasting mem_free value
encoded 8 XDR bytes
XDR data successfully sent
set_metric_value() got metric key 21
set_metric_value() exec'd mem_shared_func (21)
mcast_value() mcasting mem_shared value
encoded 8 XDR bytes
XDR data successfully sent
set_metric_value() got metric key 22
set_metric_value() exec'd mem_buffers_func (22)
Segmentation fault
An strace does not give any additional information (I can send the strace log
if requested).
Everything (libraries, ...) is identical to the "master node", where gmond
runs fine, except:
- the master runs kernel 2.4.18
- the /usr directory is read-only on the nodes.
We tried both ganglia 2.4.1 and 2.5.
I noticed that the contents of /proc/stat differ between the two kernels, so I
wonder whether this could cause gmond to crash "badly":
Node (kernel 2.5.30)
************************* oscar_cluster *************************
processing node node01.tour1
--------- node01.tour1---------
cpu 152060 6 12405 8089102
cpu0 152060 6 12405 8089102
page 7446 18246
swap 0 0
intr 84250671 82535754 2 0 0 0 0 0 0 0 0 0 1714915 0 0 0 0
disk_io:
pageallocs 6662731
pagefrees 6632641
pageactiv 7247
pagedeact 0
pagefault 15571178
majorfault 833
pagescan 0
pagesteal 0
pageoutrun 0
allocstall 0
ctxt 2009358
btime 1032817204
processes 119263
Master (kernel 2.4.18)
[EMAIL PROTECTED] root]# cat /proc/stat
cpu 1701148 219838 666892 181036471
cpu0 1701148 219838 666892 181036471
page 9091489 51558917
swap 40293 49304
intr 262331317 183624349 28077 0 15878 3 21714 2 0 1 0 2472287 70258953 160026
0 5750027 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
disk_io: (3,0):(5761797,811852,18119716,4949945,102072642)
ctxt 802893759
btime 1031063501
processes 242532
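
To make the difference between the two outputs concrete, here is a minimal
sketch in Python. It is not gmond's code (gmond is written in C); the function
name parse_proc_stat and the trimmed samples are made up for illustration. It
only shows that on 2.5.30 the "disk_io:" line carries no values and that new
keys such as "pageallocs" appear, which a parser expecting the 2.4 layout
might not handle. Whether this is really what crashes gmond is just a guess.

```python
def parse_proc_stat(text):
    """Parse /proc/stat-style text into {key: [values]}.

    Numeric tokens become ints; anything else is kept as a raw string.
    A key with no values (e.g. an empty "disk_io:") maps to an empty list
    instead of raising an error.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        key = fields[0].rstrip(":")
        values = []
        for token in fields[1:]:
            try:
                values.append(int(token))
            except ValueError:
                values.append(token)  # e.g. "(3,0):(5761797,...)"
        stats[key] = values
    return stats


# Trimmed copies of the two outputs quoted above (illustration only).
NODE_2_5_30 = """cpu 152060 6 12405 8089102
page 7446 18246
swap 0 0
disk_io:
pageallocs 6662731
ctxt 2009358
btime 1032817204
processes 119263
"""

MASTER_2_4_18 = """cpu 1701148 219838 666892 181036471
page 9091489 51558917
swap 40293 49304
disk_io: (3,0):(5761797,811852,18119716,4949945,102072642)
ctxt 802893759
btime 1031063501
processes 242532
"""

node = parse_proc_stat(NODE_2_5_30)
master = parse_proc_stat(MASTER_2_4_18)

# Keys that exist only on the 2.5.30 node, and keys that have no values there.
only_on_node = sorted(set(node) - set(master))
empty_on_node = sorted(key for key, values in node.items() if not values)

print("only on node:", only_on_node)
print("empty on node:", empty_on_node)
```

A reader that assumes every key is followed by at least one value would have
to special-case the empty "disk_io:" entry on the 2.5.30 node.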
--
Benoit des Ligneris Etudiant au Doctorat -- Ph. D. Student
Web : http://benoit.des.ligneris.net/
President du - GULUS - president http://www.gulus.org/
Mydynaweb Developpe(u)r: http://mydynaweb.sf.net/
Thin-OSCAR http://thin-oscar.sourceforge.net/
GPG/PGP Key http://cyrus.physique.usherb.ca/gpg_ben.php