[Ganglia-general] Gmond Python module for monitoring NVIDIA GPUs

2011-06-17 Thread Bernard Li
Dear all: Just a quick note letting you guys know that we now have a python module for monitoring NVIDIA GPUs using the newly released Python bindings for NVML: https://github.com/ganglia/gmond_python_modules/tree/master/gpu/nvidia If you are running a cluster with NVIDIA GPUs, please download

[Ganglia-general] how to integrate nvdia gpu monitoring

2015-10-20 Thread Hridyesh Kumar
. For more information on what metrics are supported on what models, please refer to the NVML documentation. After following the above procedure and restarting the respective services (gmond and gmetad), I could not get the GPU metrics in Ganglia. Thanks & Regards, Hridyesh kumar System Engi

[Ganglia-general] Gmond Python Module for Monitoring NVIDIA GPU

2012-02-11 Thread Gowtham
/pypi/nvidia-ml-py/ requires Python to be newer than 2.4 - following Phil's instructions in a recent email, I got Python 2.7 and 3.x installed, and used them to install these Python bindings for NVML. I then followed the instructions in 'Ganglia/gmond python modules' page https

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Bernard Li
be a good idea. My schedule is mostly open next week. When are others free? I will brush up on sflow by then. NVML and the Python metric module are tested at NVIDIA on Windows and Linux, but not within Cygwin. The process will be easier/faster on the NVML side if we keep Cygwin out of the loop

Re: [Ganglia-general] how to integrate nvdia gpu monitoring

2015-10-20 Thread Hridyesh Kumar
all metrics that the management library could detect for your GPU are collected. For more information on what metrics are supported on what models, please refer to the NVML documentation. After following the above procedure and restarting the respective services (gmond and gmetad), I could not get the GPU metrics in Ganglia.

[Ganglia-general] Gmond Python Module for Monitoring NVIDIA GPU

2012-02-14 Thread Gowtham
I'm trying to implement the instructions given here http://developer.nvidia.com/ganglia-monitoring-system on one of our Rocks 5.4.2 clusters that has 2 GPU cards in every compute node. Part #1: Python bindings for the NVML http://pypi.python.org/pypi/nvidia-ml-py/ This requires Python

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Peter Phaal
Hi Robert, sFlow is a very simple protocol - an sFlow agent periodically sends XDR encoded structures over UDP. Each structure has a tag and a length, making the protocol extensible. In the short term, it would make sense to define an sFlow structure to carry the current NVML metrics and tag
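The tag-and-length framing described in this message can be sketched in Python. This is only an illustration of the idea, not the sFlow wire format: the tag values, the payload layout (utilisation and temperature), and the function names are all hypothetical, and a real sFlow datagram carries additional header fields that are omitted here. The one thing taken from the message is the mechanism itself: each structure is prefixed with a tag and a length, so a receiver can skip structures it does not recognise.

```python
import struct

# XDR integers are 4 bytes, big-endian. Each structure starts with (tag, length).
HEADER = struct.Struct(">II")

# Hypothetical tag values, invented for this sketch only.
HYPOTHETICAL_GPU_TAG = 0x7FFF0001
OTHER_TAG = 0x7FFF0002


def encode_structure(tag, payload):
    """Prefix an already XDR-encoded payload with its tag and byte length."""
    if len(payload) % 4:
        raise ValueError("XDR payloads are padded to a multiple of 4 bytes")
    return HEADER.pack(tag, len(payload)) + payload


def decode_structures(buf, known_tags):
    """Return (tag, payload) pairs for known tags, skipping all others."""
    offset = 0
    decoded = []
    while offset < len(buf):
        tag, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        if tag in known_tags:
            decoded.append((tag, buf[offset:offset + length]))
        # The length field lets us step over structures we do not understand.
        offset += length
    return decoded


# Hypothetical GPU payload: utilisation (%) and temperature (C) as two uint32s.
gpu_payload = struct.pack(">II", 87, 64)
datagram = (
    encode_structure(OTHER_TAG, struct.pack(">I", 1))
    + encode_structure(HYPOTHETICAL_GPU_TAG, gpu_payload)
)

records = decode_structures(datagram, {HYPOTHETICAL_GPU_TAG})
for tag, body in records:
    utilisation, temperature = struct.unpack(">II", body)
    print(hex(tag), utilisation, temperature)
```

A receiver that only knows the GPU tag still parses the datagram correctly, because the unknown structure is stepped over using its length. That is the sense in which the framing keeps the protocol extensible.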

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Robert Alexander
Hey, A meeting may be a good idea. My schedule is mostly open next week. When are others free? I will brush up on sflow by then. NVML and the Python metric module are tested at NVIDIA on Windows and Linux, but not within Cygwin. The process will be easier/faster on the NVML side if we

[Ganglia-general] Aggregating all GPU metrics into single graph.

2013-04-21 Thread Lee, Wayne
node within our Linux cluster may each have 4, 8 or 16 GPUs. I'm currently using the NVML Python Nvidia module to gather various metrics for each GPU for each of the 500 nodes in our cluster. Therefore within my /var/lib/ganglia/rrds/Dell_group/node1, you would find the following rrd files

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Nigel LEACH
take your point about re-using the existing GPU module and gmetric; unfortunately, I don't have experience with Python. My plan is to write something in C to export the nvml metrics, with various output options. We will then decide whether to call this new code from existing gmond 3.1 via gmetric

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Ivan Lozgachev
, unfortunately I don't have experience with Python. My plan is to write something in C to export the nvml metrics, with various output options. We will then decide whether to call this new code from existing gmond 3.1 via gmetric, new (if we get it working) gmond 3.4, or one of our existing

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Nigel LEACH
metrics. I take your point about re-using the existing GPU module and gmetric; unfortunately, I don't have experience with Python. My plan is to write something in C to export the nvml metrics, with various output options. We will then decide whether to call this new code from existing gmond 3.1

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Bernard Li
the nvml metrics, with various output options. We will then decide whether to call this new code from existing gmond 3.1 via gmetric, new (if we get it working) gmond 3.4, or one of our existing third party tools - ITRS Geneos. As regards your list of metrics, they are pretty definitive, but I

[Ganglia-general] Sample/example gmetad.conf, gmond.conf, conf.php, etc for multiple grids, one web server, nfs mounted rrds area?

2011-10-05 Thread Lee, Wayne
all stored on an NFS filesystem /nfs/data/ganglia/rrds. - A single Apache web server running a single gmetad daemon which collects data from 4 different clusters. - Installed NVIDIA GPU Python Ganglia module plugin. This requires the NVIDIA NVML Python binding nvidia-ml-py

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-10 Thread Peter Phaal
: https://github.com/ganglia/gmond_python_modules/tree/master/gpu/nvidia https://github.com/ganglia/ganglia_contrib Longer term, it would make sense to extend Host sFlow to use the C-based NVML API to extract and export metrics. This would be straightforward - the Host sFlow agent uses native C APIs

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-10 Thread Bernard Li
/ganglia_contrib Longer term, it would make sense to extend Host sFlow to use the C-based NVML API to extract and export metrics. This would be straightforward - the Host sFlow agent uses native C APIs on the platforms it supports to extract metrics. What would take some thought is developing standard

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-10 Thread Robert Alexander
/gpu/nvidia https://github.com/ganglia/ganglia_contrib Longer term, it would make sense to extend Host sFlow to use the C-based NVML API to extract and export metrics. This would be straightforward - the Host sFlow agent uses native C APIs on the platforms it supports to extract metrics

Re: [Ganglia-general] Ganglia-general Digest, Vol 61, Issue 14

2011-06-17 Thread Guo Star
Dear all: Just a quick note letting you guys know that we now have a python module for monitoring NVIDIA GPUs using the newly released Python bindings for NVML: https://github.com/ganglia/gmond_python_modules/tree/master/gpu/nvidia If you are running a cluster with NVIDIA GPUs, please download