Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Nigel LEACH
Thanks for the updates, Peter and Bernard.

I have been unable to get gmond 3.4 working under Cygwin; my latest errors are 
in parsing gm_protocol_xdr.c. I don't know whether we should follow this up - it 
would be nice to have a Windows gmond, but my only reason for upgrading is the 
GPU metrics.

I take your point about re-using the existing GPU module and gmetric; 
unfortunately I don't have experience with Python. My plan is to write 
something in C to export the NVML metrics, with various output options 
(sketched below). We will then decide whether to call this new code from the 
existing gmond 3.1 via gmetric, from the new gmond 3.4 (if we get it working), 
or from one of our existing third-party tools - ITRS Geneos.

As regards your list of metrics, it is pretty definitive, but I will probably 
also export:

* total ECC errors - nvmlDeviceGetTotalEccErrors
* individual ECC errors - nvmlDeviceGetDetailedEccErrors
* active compute processes - nvmlDeviceGetComputeRunningProcesses
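
For illustration, a first cut of that exporter might look something like the 
sketch below. This is only a sketch, not the final tool: it assumes the NVML 
headers and library are available to the compiler, that the ECC call still 
takes NVML_SINGLE_BIT_ECC / NVML_VOLATILE_ECC style arguments, and the gmetric 
call at the end is just one of the output options under consideration.

/* gpu_export.c - hedged sketch of the planned C exporter, not the final tool.
   Assumed build: gcc -std=gnu99 gpu_export.c -o gpu_export -lnvidia-ml */
#include <stdio.h>
#include <stdlib.h>
#include <nvml.h>

int main(void)
{
    unsigned int i, count, temp;
    unsigned long long ecc;
    char cmd[256];

    if (nvmlInit() != NVML_SUCCESS)
        return 1;
    if (nvmlDeviceGetCount(&count) != NVML_SUCCESS) {
        nvmlShutdown();
        return 1;
    }

    for (i = 0; i < count; i++) {
        nvmlDevice_t dev;
        if (nvmlDeviceGetHandleByIndex(i, &dev) != NVML_SUCCESS)
            continue;

        /* total single-bit ECC errors since the last counter reset */
        if (nvmlDeviceGetTotalEccErrors(dev, NVML_SINGLE_BIT_ECC,
                                        NVML_VOLATILE_ECC, &ecc) == NVML_SUCCESS)
            printf("gpu%u_ecc_sbe %llu\n", i, ecc);

        if (nvmlDeviceGetTemperature(dev, NVML_TEMPERATURE_GPU, &temp) == NVML_SUCCESS) {
            printf("gpu%u_temp %u\n", i, temp);
            /* one output option: hand the value to the existing 3.1 gmond
               via the gmetric command line tool */
            snprintf(cmd, sizeof cmd,
                     "gmetric --name gpu%u_temp --value %u --type uint16 --units Celsius",
                     i, temp);
            system(cmd);
        }
    }
    nvmlShutdown();
    return 0;
}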

Regards
Nigel  

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Ivan Lozgachev
Hi all,

Maybe this will be of interest. Some time ago I successfully compiled
gmond 3.0.7 and 3.1.2 under Cygwin. If you need them I can upload the
gmond and third-party sources plus the compilation script somewhere.
Also, I have gmetad 3.0.7 compiled for Windows. In addition, I developed
(just for fun) my own implementation of gmetad 3.1.2 using .NET and C#.

P.S. I do not know whether it is possible to use these gmond versions
to collect statistics from GPUs.

--
Best regards,
Ivan.

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Nigel LEACH
Thanks Ivan, but we have 3.0 and 3.1 gmond running under Cygwin (and using 
APR); the problem is with the 3.4 spin.

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Bernard Li
Hi Nigel:

Technically you only need a 3.1 gmond to have support for the Python
metric module.  But I'm not sure whether we have ever tested this
under Windows.

Peter and Robert: How quickly can we get hsflowd to support GPU
metrics collection internally?  Should we set up a meeting to discuss
this?

Thanks,

Bernard

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Robert Alexander
Hey,

A meeting may be a good idea.  My schedule is mostly open next week.  When are 
others free?  I will brush up on sFlow by then.

NVML and the Python metric module are tested at NVIDIA on Windows and Linux, 
but not within Cygwin.  The process will be easier/faster on the NVML side if 
we keep Cygwin out of the loop.

-Robert

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Bernard Li
Hi Robert:

When you said you tested the Python metric modules, did you just test the
Python scripts under Windows, or did you somehow get gmond compiled natively
under Windows with Python support?

Thanks,

Bernard

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-12 Thread Peter Phaal
Hi Robert,

sFlow is a very simple protocol - an sFlow agent periodically sends
XDR-encoded structures over UDP. Each structure has a tag and a
length, making the protocol extensible.

In the short term, it would make sense to define an sFlow structure
to carry the current NVML metrics and tag it with NVIDIA's
IANA-assigned vendor number (5703). Something along the lines of:

/* NVML statistics */
/* opaque = counter_data; enterprise = 5703, format = 1 */
struct nvml_gpu_counters {
  unsigned int device_count;
  unsigned int mem_total;
  unsigned int mem_util;
  ...
}

Additional examples are in the sFlow Host Structures specification
(http://www.sflow.org/sflow_host.txt); these are the structures
currently being exported by the Host sFlow agent.

Extending the Windows Host sFlow agent to export these metrics would
involve adding a routine to populate and serialize this structure -
pretty straightforward; if you look at the Host sFlow agent source
code you will see examples of how the existing structures are handled
(a sketch follows below). For Ganglia to support the new counters, we
would need to add a decoder to gmond for the new structure - also
straightforward.
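
As a rough illustration (my sketch, not existing agent code): the tag on each
counter record is (enterprise << 12) | format, and XDR encodes each unsigned
int as a four-byte big-endian word, so the populate-and-serialize routine
would be little more than:

/* Sketch: XDR-serialize the proposed nvml_gpu_counters record
   (enterprise 5703, format 1). Field list abbreviated to match the
   struct above. */
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl */

struct nvml_gpu_counters {
    uint32_t device_count;
    uint32_t mem_total;
    uint32_t mem_util;
};

static uint8_t *xdr_put_u32(uint8_t *p, uint32_t v)
{
    uint32_t be = htonl(v);      /* XDR is big-endian */
    memcpy(p, &be, 4);
    return p + 4;
}

size_t encode_nvml_counters(uint8_t *buf, const struct nvml_gpu_counters *c)
{
    uint8_t *p = buf;
    p = xdr_put_u32(p, (5703u << 12) | 1u);  /* data_format tag */
    p = xdr_put_u32(p, 3 * 4);               /* opaque length in bytes */
    p = xdr_put_u32(p, c->device_count);
    p = xdr_put_u32(p, c->mem_total);
    p = xdr_put_u32(p, c->mem_util);
    return (size_t)(p - buf);
}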

Are per-device metrics important, or can we roll up the metrics across
all the GPUs on a server? With sFlow we generally roll up metrics for
each node where possible - the goal is to provide enough detail that
the operations team can tell whether a node is healthy or not, but not
so much as to overwhelm the monitoring system and limit scalability.
Once a problem is detected, detailed troubleshooting and diagnostics
can be performed using point tools on the host.

The metrics currently exposed by the NVML API could be improved -
everything appears to be a 1-second gauge. A more robust model is to
maintain monotonic counters, so that they can be polled at different
frequencies and still produce meaningful results. Counters are also
more robust when sending metrics over an unreliable transport like
UDP: the receiver calculates the deltas and can easily compensate for
lost packets (sketched below).
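
For example, the receiver-side calculation under that model is just a couple
of lines (a sketch; unsigned arithmetic keeps the delta correct across a
32-bit counter wrap):

/* Sketch: turn a monotonic 32-bit counter into a rate at the receiver. */
#include <stdint.h>

typedef struct {
    uint32_t last_value;
    int      initialized;
} counter_state;

/* Returns events per second since the previous sample,
   or -1.0 on the first call when no delta exists yet. */
double counter_rate(counter_state *s, uint32_t value, double interval_sec)
{
    uint32_t delta;

    if (!s->initialized) {
        s->initialized = 1;
        s->last_value = value;
        return -1.0;
    }
    delta = value - s->last_value;   /* modular arithmetic handles wrap */
    s->last_value = value;
    return delta / interval_sec;
}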

Longer term it would be useful to have a discussion about which
metrics best characterize operational performance and are feasible to
implement. Counters such as the number of threads started, number of
busy ticks, number of idle ticks, etc. are the type of measurement you
want in order to calculate utilizations. Some kind of load average
based on the thread run queue would also be interesting.

My calendar is pretty open next week - I am based in San Francisco, so
8am-5pm PST works best.

Peter

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-10 Thread Nigel LEACH
Hi Neil, Many thanks for the swift reply.

I want to take a look at sFlow, but it isn't a prerequisite.

Anyway, I disabled sFlow, and (separately) included the patch you sent. Both 
fixes appeared successful. For now I am going with your patch, and sFlow 
enabled.

I say "appeared successful", as make was error free and a gmond.exe was 
created. However, it doesn't appear to work out of the box. I created a default 
gmond.conf

./gmond --default_config > /usr/local/etc/gmond.conf

and then simply ran gmond. It started a process, but port 8649 was never 
opened. Running in debug mode I get this:

$ ./gmond -d 10
loaded module: core_metrics
loaded module: cpu_module
loaded module: disk_module
loaded module: load_module
loaded module: mem_module
loaded module: net_module
loaded module: proc_module
loaded module: sys_module


and nothing further.

I have done little investigation yet, so unless there is anything obvious I am 
missing, I'll continue to troubleshoot.

Regards
Nigel


From: neil.mckee...@gmail.com [mailto:neil.mckee...@gmail.com]
Sent: 09 July 2012 18:15
To: Nigel LEACH
Cc: ganglia-general@lists.sourceforge.net
Subject: Re: [Ganglia-general] Gmond Compilation on Cygwin

You could try adding --disable-sflow as another configure option.   (Or were 
you planning to use sFlow agents such as hsflowd?).

Neil


On Jul 9, 2012, at 3:50 AM, Nigel LEACH wrote:


Ganglia 3.4.0
Windows 2008 R2 Enterprise
Cygwin 1.5.25
IBM iDataPlex dx360 with Tesla M2070
Confuse 2.7

I'm trying to use the Ganglia Python modules to monitor a Windows-based GPU 
cluster, but am having problems getting gmond to compile. This 'configure' 
completes successfully:

./configure --with-libconfuse=/usr/local --without-libpcre --enable-static-build

but 'make' fails; this is the tail of the standard output:

mv -f .deps/g25_config.Tpo .deps/g25_config.Po
gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I.. -DCYGWIN -I/usr/include/apr-1 -I/usr/include/apr-1 -I../lib -I../include/ -I../libmetrics -D_LARGEFILE64_SOURCE -DSFLOW -g -O2 -I/usr/local/include -fno-strict-aliasing -Wall -MT core_metrics.o -MD -MP -MF .deps/core_metrics.Tpo -c -o core_metrics.o core_metrics.c
mv -f .deps/core_metrics.Tpo .deps/core_metrics.Po
gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I.. -DCYGWIN -I/usr/include/apr-1 -I/usr/include/apr-1 -I../lib -I../include/ -I../libmetrics -D_LARGEFILE64_SOURCE -DSFLOW -g -O2 -I/usr/local/include -fno-strict-aliasing -Wall -MT sflow.o -MD -MP -MF .deps/sflow.Tpo -c -o sflow.o sflow.c
sflow.c: In function `process_struct_JVM':
sflow.c:1033: warning: comparison is always true due to limited range of data type
sflow.c:1034: warning: comparison is always true due to limited range of data type
sflow.c:1035: warning: comparison is always true due to limited range of data type
sflow.c:1036: warning: comparison is always true due to limited range of data type
sflow.c:1037: warning: comparison is always true due to limited range of data type
sflow.c:1038: warning: comparison is always true due to limited range of data type
sflow.c:1039: warning: comparison is always true due to limited range of data type
sflow.c: In function `processCounterSample':
sflow.c:1169: warning: unsigned int format, uint32_t arg (arg 4)
sflow.c:1169: warning: unsigned int format, uint32_t arg (arg 4)
sflow.c: In function `process_sflow_datagram':
sflow.c:1348: error: `AF_INET6' undeclared (first use in this function)
sflow.c:1348: error: (Each undeclared identifier is reported only once
sflow.c:1348: error: for each function it appears in.)
make[3]: *** [sflow.o] Error 1
make[3]: Leaving directory `/var/tmp/ganglia-3.4.0/gmond'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/var/tmp/ganglia-3.4.0/gmond'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/var/tmp/ganglia-3.4.0'
make: *** [all] Error 2

Has anyone come across this before?

Many Thanks
Nigel



Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-10 Thread Bernard Li
Hi Nigel:

Perhaps other developers could chime in, but I'm not sure if the latest
version can be compiled under Windows; at least I am not aware of any
testing done.

Going forward I would like to encourage users to use hsflowd under Windows.
I'm talking to the developers to see if we can add support for GPU
monitoring.  Do you have any other requirements besides that?

Thanks,

Bernard


Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-10 Thread Nigel LEACH
Hello Bernard, I was coming to that conclusion; I've been trying to compile on 
various combinations of Cygwin, Windows, and hardware this afternoon, but 
without success yet. I've still got a few more tests to do though.

The GPU plugin is my only reason for upgrading from our current 3.1.7, and 
there is nothing else esoteric we use. We do have Linux blades, but all of our 
Teslas are hosted on Windows. The entire estate is quite large, so we would 
need to ensure sFlow scales; no reason to think it won't, but I have little 
experience with it.

Regards
Nigel

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-10 Thread Peter Phaal
Nigel,

A simple option would be to use Host sFlow agents to export the core
metrics from your Windows servers and use gmetric to add the GPU
metrics.

You could combine code from the python GPU module and gmetric
implementations to produce a self-contained script for exporting GPU
metrics:

https://github.com/ganglia/gmond_python_modules/tree/master/gpu/nvidia
https://github.com/ganglia/ganglia_contrib

Longer term, it would make sense to extend Host sFlow to use the
C-based NVML API to extract and export metrics. This would be
straightforward - the Host sFlow agent uses native C APIs on the
platforms it supports to extract metrics.

What would take some thought is developing a standard set of summary
metrics to characterize GPU performance. Once the set of metrics is
agreed on, adding them to the sFlow agent is pretty trivial.

Currently the Ganglia python module exports the following metrics -
are they the right set? Is anything missing? It would be great to get
involvement from the broader Ganglia community to capture best
practice from anyone running large GPU clusters, as well as input
from NVIDIA about the key metrics.

* gpu_num
* gpu_driver
* gpu_type
* gpu_uuid
* gpu_pci_id
* gpu_mem_total
* gpu_graphics_speed
* gpu_sm_speed
* gpu_mem_speed
* gpu_max_graphics_speed
* gpu_max_sm_speed
* gpu_max_mem_speed
* gpu_temp
* gpu_util
* gpu_mem_util
* gpu_mem_used
* gpu_fan
* gpu_power_usage
* gpu_perf_state
* gpu_ecc_mode

As far as scalability is concerned, you should find that moving to
sFlow as the measurement transport reduces network traffic, since all
the metrics for a node are transported in a single UDP datagram
(rather than a datagram per metric when using gmond as the agent). The
other consideration is that sFlow is unicast, so if you are using a
multicast Ganglia setup this involves re-structuring your
configuration.

You still need to have at least one gmond instance, but it acts as an
sFlow aggregator and is mute (a possible configuration fragment is
sketched below):
http://blog.sflow.com/2011/07/ganglia-32-released.html
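
For illustration, a minimal gmond.conf fragment for such a mute aggregator
might look like this sketch. The sflow section and the 6343 collector port
are my assumptions based on the 3.2 release notes linked above, so check
them against your version:

/* sketch only - assumed gmond.conf fragment for a mute sFlow aggregator */
globals {
  mute = yes        /* aggregate incoming metrics, never send its own */
}
udp_recv_channel {
  port = 8649       /* still accepts gmetric/gmond traffic */
}
sflow {
  udp_port = 6343   /* assumed default sFlow collector port */
}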

Peter

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-10 Thread Bernard Li
Adding Robert Alexander to the list, since he and I worked together on
the NVIDIA plug-in.

Thanks,

Bernard

On Tue, Jul 10, 2012 at 12:06 PM, Peter Phaal peter.ph...@gmail.com wrote:
 [snip]

 On Tue, Jul 10, 2012 at 8:36 AM, Nigel LEACH
 nigel.le...@uk.bnpparibas.com wrote:
 Hello Bernard, I was coming to that conclusion; I’ve been trying to compile
 on various combinations of Cygwin, Windows, and hardware this afternoon, but
 without success yet. I’ve still got a few more tests to do though.

 The GPU plugin is my only reason for upgrading from our current 3.1.7, and
 there is nothing else esoteric we use. We do have Linux blades, but all of
 our Teslas are hosted on Windows. The entire estate is quite large, so we
 would need to ensure sFlow scales; no reason to think it won’t, but I have
 little experience with it.

 Regards

 Nigel



 From: bern...@vanhpc.org [mailto:bern...@vanhpc.org]
 Sent: 10 July 2012 16:19
 To: Nigel LEACH
 Cc: neil.mckee...@gmail.com; ganglia-general@lists.sourceforge.net
 Subject: Re: [Ganglia-general] Gmond Compilation on Cygwin



 Hi Nigel:

 Perhaps other developers could chime in, but I'm not sure whether the latest
 version can be compiled under Windows; at least, I'm not aware of any testing
 having been done.

 Going forward I would like to encourage users to use hsflowd under Windows.
 I'm talking to the developers to see if we can add support for GPU
 monitoring. Do you have any other requirements besides that?

 Thanks,

 Bernard

 On Tuesday, July 10, 2012, Nigel LEACH wrote:

 Hi Neil, Many thanks for the swift reply.



 I want to take a look at sFlow, but it isn’t a prerequisite.



 Anyway, I disabled sFlow, and (separately) included the patch you sent.

 [snip]

Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-10 Thread Robert Alexander
Hey Nigel,

I would be happy to help where I can.  I think Peter's approach is a good start.

We are updating the Ganglia plug-in with a few more metrics.  My dev branch on 
GitHub has some updates not yet in the trunk:
https://github.com/ralexander/gmond_python_modules/tree/master/gpu/nvidia

In terms of metrics, I can help explain what each means.  I expect the 
usefulness of each to vary based on installation, so hopefully others can 
contribute their thoughts.

* gpu_num - Useful indirectly.
* gpu_driver - Useful when different machines may have different installed 
driver versions.

* gpu_type - Marketing name of the GPU.
* gpu_uuid - Globally unique, immutable ID for the GPU chip.  This is NVIDIA's 
preferred identifier when software interfaces with a GPU.  On a multi-GPU 
board, each GPU has a unique UUID.
* gpu_pci_id - How the GPU appears on the PCI bus.
+ gpu_serial - For Tesla GPUs there is a serial number printed on the board.  
Note that when there are multiple GPU chips on a single board, they share a 
common board serial number.  When a human needs to grab a particular board, 
this number works well.

* gpu_mem_total
* gpu_mem_used
Useful for high-level application profiling.

* gpu_graphics_speed
+ gpu_max_graphics_speed
* gpu_sm_speed
+ gpu_max_sm_speed
* gpu_mem_speed
+ gpu_max_mem_speed
These are the various clock speeds; faster clocks mean higher performance.

* gpu_perf_state
Similar to CPU P-states.  P0 is the highest performance state.  When the 
pstate is above P0 (P1, P2, ...), clock speeds and PCIe bandwidth can be 
reduced.

* gpu_util
* gpu_mem_util
The % of time over the last second that the GPU SMs or GPU memory were busy.  
This is a very coarse-grained way to monitor GPU usage, i.e. if only one SM 
is busy, but it is busy for the entire second, then gpu_util = 100.
* gpu_fan
* gpu_temp
Some GPUs support these.  Useful to see how well the GPU is cooled.

* gpu_power_usage
+ gpu_power_man_mode
+ gpu_power_man_limit
GPU power draw.  Some GPUs support configurable power limits via power 
management mode.

* gpu_ecc_mode
Describes whether GPU memory error checking and correction (ECC) is on or 
off.  Useful to ensure all GPUs are configured the same.

If you are only concerned with coarse-grained GPU performance, then GPU 
performance state, utilization, and % memory used may work well.
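
Since a small C exporter around NVML was mentioned earlier in the thread, 
here is a minimal sketch of how those coarse metrics can be polled. This is 
illustrative only: error handling is trimmed, the printed labels are ad hoc 
rather than the Ganglia metric names, and it assumes the NVML header and 
library are installed (build with something like: gcc gpu_poll.c -o gpu_poll 
-lnvidia-ml).

/* gpu_poll.c - illustrative NVML polling sketch, not the Ganglia module */
#include <stdio.h>
#include <nvml.h>

int main(void)
{
    unsigned int i, count;

    if (nvmlInit() != NVML_SUCCESS)
        return 1;
    if (nvmlDeviceGetCount(&count) != NVML_SUCCESS) {
        nvmlShutdown();
        return 1;
    }
    for (i = 0; i < count; i++) {
        nvmlDevice_t dev;
        nvmlUtilization_t util;   /* gpu_util / gpu_mem_util */
        nvmlPstates_t pstate;     /* gpu_perf_state */
        nvmlMemory_t mem;         /* gpu_mem_used / gpu_mem_total */

        if (nvmlDeviceGetHandleByIndex(i, &dev) != NVML_SUCCESS)
            continue;
        /* % of the last second the SMs / memory controller were busy */
        if (nvmlDeviceGetUtilizationRates(dev, &util) == NVML_SUCCESS)
            printf("gpu%u util=%u%% mem_util=%u%%\n", i, util.gpu, util.memory);
        /* current performance state; P0 is the fastest */
        if (nvmlDeviceGetPerformanceState(dev, &pstate) == NVML_SUCCESS)
            printf("gpu%u perf_state=P%d\n", i, (int)pstate);
        /* framebuffer memory in bytes */
        if (nvmlDeviceGetMemoryInfo(dev, &mem) == NVML_SUCCESS)
            printf("gpu%u mem_used=%llu mem_total=%llu\n", i,
                   (unsigned long long)mem.used,
                   (unsigned long long)mem.total);
    }
    nvmlShutdown();
    return 0;
}

The same device handles can also feed nvmlDeviceGetTotalEccErrors and the 
other calls mentioned earlier in the thread, so one loop can cover the whole 
metric list.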

Bernard, thanks for the heads up.

Hope that helps,
Robert Alexander
NVIDIA CUDA Tools Software Engineer

-Original Message-
From: Bernard Li [mailto:bern...@vanhpc.org] 
Sent: Tuesday, July 10, 2012 12:32 PM
To: Peter Phaal
Cc: Nigel LEACH; ganglia-general@lists.sourceforge.net; Robert Alexander
Subject: Re: [Ganglia-general] Gmond Compilation on Cygwin

[snip]

[Ganglia-general] Gmond Compilation on Cygwin

2012-07-09 Thread Nigel LEACH
Ganglia 3.4.0
Windows 2008 R2 Enterprise
Cygwin 1.5.25
IBM iDataPlex dx360 with Tesla M2070
Confuse 2.7

I'm trying to use the Ganglia Python modules to monitor a Windows-based GPU 
cluster, but I am having problems getting gmond to compile. This 'configure' 
completes successfully

./configure --with-libconfuse=/usr/local --without-libpcre --enable-static-build

but 'make' fails; this is the tail of the standard output

mv -f .deps/g25_config.Tpo .deps/g25_config.Po
gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I.. -DCYGWIN -I/usr/include/apr-1 
-I/usr/include/apr-1 -I../lib -I../include/ -I../libmetrics 
-D_LARGEFILE64_SOURCE -DSFLOW -g -O2 -I/usr/local/include 
-fno-strict-aliasing -Wall -MT core_metrics.o -MD -MP -MF 
.deps/core_metrics.Tpo -c -o core_metrics.o core_metrics.c
mv -f .deps/core_metrics.Tpo .deps/core_metrics.Po
gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I.. -DCYGWIN -I/usr/include/apr-1 
-I/usr/include/apr-1 -I../lib -I../include/ -I../libmetrics 
-D_LARGEFILE64_SOURCE -DSFLOW -g -O2 -I/usr/local/include 
-fno-strict-aliasing -Wall -MT sflow.o -MD -MP -MF .deps/sflow.Tpo -c -o 
sflow.o sflow.c
sflow.c: In function `process_struct_JVM':
sflow.c:1033: warning: comparison is always true due to limited range of data 
type
sflow.c:1034: warning: comparison is always true due to limited range of data 
type
sflow.c:1035: warning: comparison is always true due to limited range of data 
type
sflow.c:1036: warning: comparison is always true due to limited range of data 
type
sflow.c:1037: warning: comparison is always true due to limited range of data 
type
sflow.c:1038: warning: comparison is always true due to limited range of data 
type
sflow.c:1039: warning: comparison is always true due to limited range of data 
type
sflow.c: In function `processCounterSample':
sflow.c:1169: warning: unsigned int format, uint32_t arg (arg 4)
sflow.c:1169: warning: unsigned int format, uint32_t arg (arg 4)
sflow.c: In function `process_sflow_datagram':
sflow.c:1348: error: `AF_INET6' undeclared (first use in this function)
sflow.c:1348: error: (Each undeclared identifier is reported only once
sflow.c:1348: error: for each function it appears in.)
make[3]: *** [sflow.o] Error 1
make[3]: Leaving directory `/var/tmp/ganglia-3.4.0/gmond'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/var/tmp/ganglia-3.4.0/gmond'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/var/tmp/ganglia-3.4.0'
make: *** [all] Error 2

Has anyone come across this before?

Many Thanks
Nigel




Re: [Ganglia-general] Gmond Compilation on Cygwin

2012-07-09 Thread Neil Mckee
You could try adding --disable-sflow as another configure option.   (Or were 
you planning to use sFlow agents such as hsflowd?).

Neil
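
If the blocker is just that the Cygwin 1.5 headers never define AF_INET6, 
another possibility - an untested sketch, not a confirmed fix - is to supply 
the definition locally near the top of sflow.c, since newer Cygwin (and 
Winsock) headers use the value 23:

#include <sys/socket.h>
#ifndef AF_INET6
#define AF_INET6 23   /* untested: value taken from newer Cygwin/Winsock headers */
#endif

Upgrading to a current Cygwin release, which does define AF_INET6, would be 
the cleaner route.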


On Jul 9, 2012, at 3:50 AM, Nigel LEACH wrote:

 [snip]
