[Ganglia-general] Unable to get nvidia module working

2016-04-19 Thread Jeff White
I'm trying to get this nvidia module working on CentOS 7: https://github.com/ganglia/gmond_python_modules/tree/master/gpu/nvidia

I did the following:
* Installed CUDA on node but NOT on the Ganglia server
* Installed nvidia-ml-py-7.352.0.tar.gz on both node and server
* put nvidia.py into

Re: [Ganglia-general] Unable to get nvidia module working

2016-04-19 Thread Jesse Becker
Do you know if the metrics are actually being collected? An easy way to test is to use netcat or telnet to connect to the compute node with the NVIDIA card:

    nc node123 8649

That should dump a bunch of XML, and you can search that for the metrics generated by the plugin. If you find them
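The netcat check above can also be scripted. Below is a minimal sketch, not part of the original message, that reads the XML gmond sends on connect and lists metrics whose names start with a given prefix. The helper names fetch_gmond_xml and find_metrics are hypothetical, and the "gpu" prefix is an assumption about how the nvidia module names its metrics; adjust it to whatever names the plugin actually reports.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump that gmond sends to any client that connects."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection when the dump is done
                break
            chunks.append(data)
    return b"".join(chunks)


def find_metrics(xml_bytes, prefix):
    """Return (host, metric name, value) for metric names starting with prefix."""
    root = ET.fromstring(xml_bytes)
    found = []
    for host in root.iter("HOST"):
        for metric in host.iter("METRIC"):
            name = metric.get("NAME", "")
            if name.startswith(prefix):
                found.append((host.get("NAME"), name, metric.get("VAL")))
    return found


# Example usage (node123 is the host name from the message above;
# the "gpu" prefix is an assumption):
#
#   for host, name, val in find_metrics(fetch_gmond_xml("node123"), "gpu"):
#       print(host, name, val)
```

If the list comes back empty, gmond on that node is not reporting any metric with that prefix.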

[Ganglia-general] [Ganglia-General] node page shows only "1"

2016-04-19 Thread Kristoff Isserstedt
Hi! We changed our Ganglia server to use rrdcache, and on the source page everything is OK. But when you switch to a node page, there is only a "1" between the header and footer. Versions: ganglia 3.7.2, ganglia web 3.7.1, rrdtool 1.4.7, dwoo 1.1.1