Re: [Ganglia-general] GSoC 2014 Project Interest

2014-03-07 Thread Dirk Luo
Hi, Bernard: Today when I was searching for information about "hardcoded limit of supporting 4 GPUs", I found a booklet named "NVML API REFERENCE MANUAL Version 3.295.45". In the first chapter, it lists all products the NVML API supports. Unfortunately, the graphics cards in

Re: [Ganglia-general] GSoC 2014 Project Interest

2014-03-07 Thread Bernard Li
Us", I found a booklet named "NVML API REFERENCE MANUAL > Version 3.295.45". In the first chapter, it lists all products the > NVML API supports. Unfortunately, the graphics cards in my cluster are > under limited support. I doubt if this limitation would stop me from > wor

Re: [Ganglia-general] GSOC-2014 Project Interest

2014-03-05 Thread Bernard Li
Hi Md: Thanks for your email. You already have my email address so feel free to send me questions there. I am proposing that students work on the following: - Update plugin to support new metrics that can be collected by new version of NVML - Update web interface to support summarizing GPU

[Ganglia-general] Gmond Python module for monitoring NVIDIA GPUs

2011-06-16 Thread Bernard Li
Dear all: Just a quick note letting you guys know that we now have a python module for monitoring NVIDIA GPUs using the newly released Python bindings for NVML: https://github.com/ganglia/gmond_python_modules/tree/master/gpu/nvidia If you are running a cluster with NVIDIA GPUs, please download
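For readers unfamiliar with how a gmond Python metric module is put together, here is a minimal sketch of the general module shape (a `metric_init` function that returns metric descriptors, plus `metric_cleanup`). It is not the code of the module announced above: the metric names, the `gpu_count` parameter, and the placeholder reader are all assumptions made for illustration, and a real module would query NVML through its Python bindings instead of returning a constant.

```python
# Sketch of a gmond Python metric module skeleton.
# Assumptions (not from the announced module): the metric names, the
# "gpu_count" parameter, and the placeholder reader below are hypothetical.

def read_gpu_metric(name):
    """Placeholder callback; a real module would query NVML here."""
    return 0


def metric_init(params):
    """Called once by gmond; returns a list of metric descriptor dicts."""
    gpu_count = int(params.get("gpu_count", 1))  # hypothetical parameter
    descriptors = []
    for index in range(gpu_count):
        descriptors.append({
            "name": "gpu%d_util" % index,       # hypothetical metric name
            "call_back": read_gpu_metric,        # invoked with the metric name
            "time_max": 90,
            "value_type": "uint",
            "units": "%",
            "slope": "both",
            "format": "%u",
            "description": "GPU %d utilization (placeholder)" % index,
            "groups": "gpu",
        })
    return descriptors


def metric_cleanup():
    """Called once when gmond shuts down; nothing to release in this sketch."""
    pass


if __name__ == "__main__":
    # Stand-alone check outside gmond: list each metric and its current value.
    for d in metric_init({"gpu_count": "2"}):
        print(d["name"], d["call_back"](d["name"]))
```

Running the file directly lists the descriptors without gmond, which can be a convenient way to check a module before deploying it.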

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Bernard Li
may be a good idea. My schedule is mostly open next week. When > are others free? I will brush up on sflow by then. > > NVML and the Python metric module are tested at NVIDIA on Windows and > Linux, but not within Cygwin. The process will be easier/faster on the > NVML side if w

[Ganglia-general] how to integrate nvdia gpu monitoring

2015-10-20 Thread Hridyesh Kumar
detect for your GPU are collected. For more information on what metrics are supported on what models, please refer to NVML documentation After following the above procedure respective services gmond and gmetad restart could not get the GPU metrics in Ganglia. Thanks & Regards, Hr

[Ganglia-general] Gmond Python Module for Monitoring NVIDIA GPU

2012-02-14 Thread Gowtham
I'm trying to implement the instructions given here http://developer.nvidia.com/ganglia-monitoring-system on one of our Rocks 5.4.2 clusters that has 2 GPU cards in every compute node. Part #1: Python bindings for the NVML http://pypi.python.org/pypi/nvidia-ml-py/ This requires P

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Peter Phaal
Hi Robert, sFlow is a very simple protocol - an sFlow agent periodically sends XDR encoded structures over UDP. Each structure has a tag and a length, making the protocol extensible. In the short term, it would make sense to define an sFlow structure to carry the current NVML metrics and tag
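The tag-plus-length idea described above can be sketched in a few lines. This is a simplified illustration of tagged, length-prefixed XDR-style records, not the actual sFlow wire format: the tag value and the two-field GPU payload are hypothetical, and only the general properties of XDR (big-endian 32-bit units, padding to a multiple of 4 bytes) are relied on.

```python
import struct

# Hypothetical tag value for an illustrative "GPU metrics" structure;
# it is not an assigned sFlow tag.
HYPOTHETICAL_GPU_TAG = 9999


def pack_record(tag, payload):
    """Encode one record as: 32-bit tag, 32-bit length, padded payload."""
    pad = (-len(payload)) % 4          # XDR data is aligned to 4 bytes
    body = payload + b"\x00" * pad
    return struct.pack(">II", tag, len(body)) + body


def unpack_records(buf):
    """Walk a buffer of records; the length lets unknown tags be skipped."""
    records = []
    offset = 0
    while offset + 8 <= len(buf):
        tag, length = struct.unpack_from(">II", buf, offset)
        offset += 8
        records.append((tag, buf[offset:offset + length]))
        offset += length
    return records


if __name__ == "__main__":
    # Hypothetical payload: utilization (%) and temperature (C) as uint32.
    gpu_payload = struct.pack(">II", 87, 64)
    datagram = pack_record(1, b"abc") + pack_record(HYPOTHETICAL_GPU_TAG, gpu_payload)
    for tag, data in unpack_records(datagram):
        if tag == HYPOTHETICAL_GPU_TAG:
            print(struct.unpack(">II", data))
        # Records with other tags are skipped using their length field.
```

Because every record carries its own length, a receiver that does not recognise a tag can step over it, which is what makes this kind of format extensible.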

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-16 Thread Robert Alexander
on support? Thanks, Bernard On Thursday, July 12, 2012, Robert Alexander wrote: Hey, A meeting may be a good idea. My schedule is mostly open next week. When are others free? I will brush up on sflow by then. NVML and the Python metric module are tested at NVIDIA on Windows and Linux, but

Re: [Ganglia-general] GSoC 2014 Project Interest

2014-03-06 Thread Bernard Li
Hi Dirk: On Thursday, 6 March 2014, Dirk Luo wrote: To my knowledge, the NVML plug-in provides a variety of GPU metrics. > With these metrics, the RRDtool/graphite draws graphs as defined by > the parameters supplied on the command line. The parameters supplied > are defined in a fil

Re: [Ganglia-general] how to integrate nvdia gpu monitoring

2015-10-20 Thread Hridyesh Kumar
metrics that the management library could detect for your GPU are collected. For more information on what metrics are supported on what models, please refer to NVML documentation After following the above procedure respective services gmond and gmetad restart could not get the GPU metrics in

[Ganglia-general] Gmond Python Module for Monitoring NVIDIA GPU

2012-02-11 Thread Gowtham
n.org/pypi/nvidia-ml-py/ requires Python to be newer than 2.4 - following Phil's instructions in a recent email, I got Python 2.7 and 3.x to install; and used that to get these Python bindings for NVML to install. I then followed the instructions in 'Ganglia/gmond python modules

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Robert Alexander
Hey, A meeting may be a good idea. My schedule is mostly open next week. When are others free? I will brush up on sflow by then. NVML and the Python metric module are tested at NVIDIA on Windows and Linux, but not within Cygwin. The process will be easier/faster on the NVML side if we

Re: [Ganglia-general] Starting with GSoC 2014, help needed

2014-03-05 Thread Bernard Li
Hi Praful: Thanks for your email. For the GPU project, I propose the following work to be done: - Update plugin to support new metrics that can be collected by new version of NVML - Update web interface to support summarizing GPU graphs under Host Overview - Update web interface to better

[Ganglia-general] Aggregating all GPU metrics into single graph.

2013-04-21 Thread Lee, Wayne
"HP group", and the third would be the "Appro group". Each node within our Linux cluster may have 4, 8 or 16 GPUs. I'm currently using the NVML Python Nvidia module to gather various metrics for each GPU for each of the 500 nodes in our cluster. Therefore with

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Nigel LEACH
s. I take your point about re-using the existing GPU module and gmetric, unfortunately I don't have experience with Python. My plan is to write something in C to export the nvml metrics, with various output options. We will then decide whether to call this new code from existing gmond 3.1 v

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Ivan Lozgachev
d gmetric, > unfortunately I don't have experience with Python. My plan is to write > something in C to export the nvml metrics, with various output options. We > will then decide whether to call this new code from existing gmond 3.1 via > gmetric, new (if we get it working) gmond 3.

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Nigel LEACH
metrics. > > I take your point about re-using the existing GPU module and gmetric, > unfortunately I don't have experience with Python. My plan is to write > something in C to export the nvml metrics, with various output options. We > will then decide whether to call this new

Re: [Ganglia-general] GSoC 2014 Project Interest

2014-03-06 Thread Dirk Luo
Hi, Bernard: I am interested in working on the GPU project. To my knowledge, the NVML plug-in provides a variety of GPU metrics. With these metrics, the RRDtool/graphite draws graphs as defined by the parameters supplied on the command line. The parameters supplied are defined in a file similar

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Bernard Li
ygwin, my latest errors >> are parsing gm_protocol_xdr.c. I don't know whether we should follow this >> up, it would be nice to have a Windows gmond, but my only reason for >> upgrading are the GPU metrics. >> >> I take your point about re-using the existing

[Ganglia-general] Sample/example gmetad.conf, gmond.conf, conf.php, etc for multiple grids, one web server, nfs mounted rrds area?

2011-10-05 Thread Lee, Wayne
1.7. Ganglia 3.1.7 Apache Web server = - OS Version: RedHat 5.5 - RRDs files all stored on an NFS filesystem /nfs/data/ganglia/rrds. - A single Apache web server running a single gmetad daemon which collects data from 4 different clusters. -

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-10 Thread Peter Phaal
: https://github.com/ganglia/gmond_python_modules/tree/master/gpu/nvidia https://github.com/ganglia/ganglia_contrib Longer term, it would make sense to extend Host sFlow to use the C-based NVML API to extract and export metrics. This would be straightforward - the Host sFlow agent uses native C APIs

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-10 Thread Bernard Li
//github.com/ganglia/ganglia_contrib > > Longer term, it would make sense to extend Host sFlow to use the > C-based NVML API to extract and export metrics. This would be > straightforward - the Host sFlow agent uses native C APIs on the > platforms it supports to extract metrics. >

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-10 Thread Robert Alexander
ganglia/gmond_python_modules/tree/master/gpu/nvidia > https://github.com/ganglia/ganglia_contrib > > Longer term, it would make sense to extend Host sFlow to use the > C-based NVML API to extract and export metrics. This would be > straightforward - the Host sFlow agent uses nativ

Re: [Ganglia-general] Ganglia-general Digest, Vol 61, Issue 14

2011-06-16 Thread Guo Star
fig > eth0 Link encap:Ethernet HWaddr B8:AC:6F:14:20:09 > inet6 addr: fe80::baac:6fff:fe14:2009/64 Scope:Link > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 > RX packets:30202937055 errors:0 dropped:48 overruns:0 frame:0 > TX pac