[Ganglia-developers] RFC: patch to gmond to send metrics to InfluxDB

2016-04-29 Thread Jesse Becker
I have been working recently on a branch of Ganglia/monitor-core that
allows gmond to send metrics directly to an InfluxDB database.  It can be
found here:

  https://github.com/hawson/monitor-core/tree/influxdb


This is purely a change to the gmond agent.  Other programs (e.g. gmetad,
gmetric) and components (the web UI) are not changed.  However, a logical
next phase could be to rework the WebUI and Gmetad to use InfluxDB as a
backend.


I've tried to keep the changes as isolated as possible, creating a new
lib/influxdb.c file for the new functionality, and hooking into gmond as
part of the existing Ganglia_collection_group_send() function. Thus, when a
packet would normally be sent to another gmond, it can also be sent to an
influxdb channel at the same time.  There are, of course, various other
changes sprinkled about to other files, mostly to add new gmond.conf
options.

The gmond.conf documentation and the default configuration file (from 'gmond
-t') have also been updated to cover the two new configuration options.

The first new option is an influxdb_send_channel stanza.  It is fairly
simple, with three options.

  influxdb_send_channel {
    host = myinfluxdb.example.com
    port = 8089
    default_tags = zone=us-east,host_class=hpc  // optional tags sent with each metric
  }

The "host" and "port" attributes are required, and their purpose should be
obvious.  The "default_tags" attribute is optional.  Influxdb permits tags
associated with each time/key/value tuple; this is how hostnames are
stored, for example.  This attribute allows default tags to be associated
with every metric sent, for example to identify an HPC cluster, or AWS
zone, or other useful bit of metadata.
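
To make the role of default_tags concrete, here is a rough sketch of how a
comma-separated tag string could be merged with a per-host tag and rendered
as one InfluxDB line-protocol point.  This is not the code in the branch: the
helper names, the "hostname" tag key, and the choice of the metric name as
the field key are assumptions for illustration, and escaping of special
characters is left out.

```python
def parse_default_tags(raw):
    """Split a 'k1=v1,k2=v2' string into a dict of tag names to values."""
    tags = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition("=")
        tags[key.strip()] = value.strip()
    return tags


def format_point(measurement, tags, field, value, timestamp_ns):
    """Render one point as: measurement,tag=value,... field=value timestamp."""
    # Tags are emitted in sorted key order so the output is deterministic.
    tag_part = "".join(f",{key}={tags[key]}" for key in sorted(tags))
    return f"{measurement}{tag_part} {field}={value} {timestamp_ns}"


# Hypothetical values: the tag string from the stanza above plus a host tag.
defaults = parse_default_tags("zone=us-east,host_class=hpc")
tags = dict(defaults, hostname="node01.example.com")
line = format_point("load", tags, "load_one", 0.42, 1461931200000000000)
print(line)
```

Every point sent through the channel would carry the same default tags, so
queries can later filter on zone or host_class without gmond knowing anything
about how the data is used.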

The other change to gmond.conf is also optional, but strongly recommended.
Every collection_group stanza may now have an optional "measurement"
attribute.  An example for some of the system load metrics:

  collection_group {
    collect_every = 20
    time_threshold = 90
    measurement = "load"  // <<<<<<<<<<<<<<<  new attribute
    metric {
      name = "load_one"
      title = "One Minute Load Average"
    }
    metric {
      name = "load_five"
      title = "Five Minute Load Average"
    }
    [...]
  }

This attribute is used to assist InfluxDB in organizing metrics into groups
of "measurements".  Measurements are similar in function to an SQL table
(InfluxDB is not an SQL database, and the analogy is not perfect).  Since
most metrics in a collection group tend to be similar (all CPU stats are
collected at the same time, network stats at another, etc), adding this
attribute at the collection_group level seems to make the most sense.  If a
collection_group does not have a measurement attribute, the metric name
(e.g. "load_one") is used instead; this is not recommended.  You may use the
same measurement name in different collection_group stanzas.  This is
appropriate if there are similar metrics with different collection and
send intervals. The new example gmond.conf file has a few examples of this.
Note that adding this support did require some minor reorganization of the
default gmond.conf file.
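
The fallback rule described above is small enough to sketch.  The dictionary
layout and the function name are made up for illustration and do not mirror
the data structures in gmond:

```python
def measurement_name(group, metric_name):
    """Return the group's measurement if set, otherwise the metric name."""
    return group.get("measurement") or metric_name


# Two hypothetical collection groups: one with the new attribute, one without.
load_group = {"measurement": "load", "metrics": ["load_one", "load_five"]}
legacy_group = {"metrics": ["load_one"]}

for group in (load_group, legacy_group):
    for metric in group["metrics"]:
        print(metric, "->", measurement_name(group, metric))
```

With the attribute set, load_one and load_five land in a single "load"
measurement; without it, each metric becomes its own measurement, which is
why leaving it out is not recommended.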

I know of several improvements that could be made, but believe that the code
is fit for general review.

Comments, questions and corrections are all welcome.  The Github fork has
the "Issues" feature enabled, and things can be posted there.

-- 
Jesse Becker
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] Gmetad bottlenecks

2014-01-15 Thread Jesse Becker
On Wed, Jan 15, 2014 at 8:41 AM, Nicholas Satterly nfsatte...@gmail.com wrote:
 If we are to look at redoing the XML parsing next then the two contenders
 that come to mind are gzipped JSON and Google Protocol Buffers.

 PB is meant to be very efficient and therefore faster, however it seems
 people have gotten comparable results with gzipped JSON. An obvious
 advantage of gzipped JSON is that it would be simple to make the output
 human readable though we could easily develop a CLI tool that allowed us to
 query and decode ganglia PB data for testing.

I think there are large advantages to using standard and widely
adopted formats, so JSON gets my vote there.  That's not to say PB
isn't widely used, but I suspect there are a lot more tools and
programming language bindings to read and process JSON data.

Or...why not just use XDR for bulk data transport, instead of
introducing a third format (XML, XDR, and JSON/PB)?   I'm not hugely
familiar with XDR, so apologies if I'm missing an obvious reason why
it wouldn't work here.
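
For anyone who wants to try the gzipped JSON idea, a minimal round trip needs
only the Python standard library.  The host and metric names below are
invented test data, not a proposed schema, and no claim is made about how the
sizes compare with XML or PB on real clusters:

```python
import gzip
import json


def encode(hosts):
    """Serialize a dict of per-host metrics to compact, gzipped JSON bytes."""
    text = json.dumps(hosts, separators=(",", ":"), sort_keys=True)
    return gzip.compress(text.encode("utf-8"))


def decode(blob):
    """Reverse encode(): gunzip the bytes and parse the JSON."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Invented data: 200 hosts with two metrics each.
hosts = {
    f"node{i:03d}": {"load_one": 0.1 * i, "cpu_user": 1.5 * i}
    for i in range(200)
}
blob = encode(hosts)
raw_size = len(json.dumps(hosts).encode("utf-8"))
print(f"plain JSON: {raw_size} bytes, gzipped: {len(blob)} bytes")
```

The decoded result is an ordinary dictionary, and the plain JSON text stays
human readable with nothing more than gunzip.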



-- 
Jesse Becker



Re: [Ganglia-developers] Has anyone considered using php accelerators such as eaccelerator

2013-07-08 Thread Jesse Becker
I always figured that most of the time spent on rendering a Ganglia
page is due to the many calls to rrdtool to render the images.  I
don't know how much a PHP accelerator would help in that case.  That
said, I don't have any actual numbers or testing to back that up.

On Mon, Jul 8, 2013 at 10:27 AM, Chris Burroughs
chris.burrou...@gmail.com wrote:
 I'm not aware of anyone that has tried and published their results.

 Does eaccelerator improve both time and memory use, or make time/memory
 tradeoffs in favor of the cpu?  My largest server side gweb performance
 gripe is that I have needed to bump php.ini's memory_limit several
 times, currently at 512M.


 On 07/06/2013 10:11 AM, Nikhil wrote:
 Hi,

 I was wondering if anyone has considered using any php accelerator for the
 ganglia web to improve the serving performance. I am not sure if the
 ganglia web code memory print is minimalistic, but I do tend to believe
 with the addition of php accelerator such as eaccelerator for apache + php
 ganglia web can improve upon results. I do want to know if any of others in
 the group has tried their hands at this and have been successful in doing
 so.

 Thanks,
 Nikhil







-- 
Jesse Becker



Re: [Ganglia-developers] Ganglia not recognizing other nodes?

2012-07-21 Thread Jesse Becker
What does netstat or lsof say about gmond interface binding?

(in haste, sorry for the brevity)

On Sat, Jul 21, 2012 at 10:50 AM, Jeff Layton layto...@att.net wrote:
 Good morning,

 Apologies for the simple question. I've got a simple cluster
 with a master node and one compute node. I installed the
 latest Ganglia on the master node (3.4.0) - libganglia,
 ganglia-gmond, ganglia-metad, ganglia-web (3.5.1). I can
 use ganglia-web to see the master node with no problems.

 I'm using Warewulf for the compute node and I installed
 libganglia and ganglia-gmond in the VNFS and rebooted the
 node. When the node comes back up, I tested ganglia via
 gstat -all on the compute node and it seems to work
 correctly.

 However, ganglia-web doesn't display anything for the compute
 node even though I've added it to the data_source line in
 gmetad.conf:


 data_source "my cluster" 10.1.0.250 10.1.0.1:8649


 I also checked if the master node could access the data from
 the compute node via gstat -all and I only get data from the
 master node (i.e. no compute node).

 I checked the Ethernet interfaces on both nodes and both
 are listed as MULTICAST. iptables on the master node and the
 compute node are off and the services are not running (checked
 that 3 times).

 There is a simple Netgear GigE switch between the nodes
 (unmanaged). I don't think that's a problem.

 One thing I think is interesting is that the master node has
 eth0 with an IP of 192.168.1.250 which is to the outside world
 and eth1 is 10.1.0.250 which is the cluster network. The compute
 node has eth0 as 10.1.0.1. But when I go to http://localhost/ganglia
 I can only access the master node as 192.168.1.250, not
 10.1.0.250 (i.e. the list of nodes is only 192.168.1.250).

 Otherwise I can log in to the compute node, ping it, etc. It works
 fine but somehow I'm missing a configuration piece for ganglia.

 TIA!

 Jeff







-- 
Jesse Becker



Re: [Ganglia-developers] Ganglia not recognizing other nodes?

2012-07-21 Thread Jesse Becker
I should add that there's a bind_interface (and similar) setting in
gmond.conf, should gmond, in fact, not have bound to the proper
interface.
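
Another quick check is to ask a gmond directly which hosts it knows about.
gmond dumps its state as XML to anyone who connects to its TCP accept channel
(port 8649 by default), so a few lines of Python can list the HOST elements.
This is only a sketch: the sample XML is hand-written and trimmed, and the
fetch helper is shown for completeness but not exercised here.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the full XML dump from a gmond TCP accept channel as bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_seen(xml_data):
    """Return (name, ip) for every HOST element in a gmond XML dump."""
    root = ET.fromstring(xml_data)
    return [(h.get("NAME"), h.get("IP")) for h in root.iter("HOST")]


# Hand-written, trimmed sample; a real dump carries many more attributes.
SAMPLE = """<GANGLIA_XML VERSION="3.4.0" SOURCE="gmond">
<CLUSTER NAME="my cluster">
<HOST NAME="master" IP="192.168.1.250"/>
</CLUSTER>
</GANGLIA_XML>"""

print(hosts_seen(SAMPLE))
```

Running hosts_seen(fetch_gmond_xml("10.1.0.250")) on the master and seeing
only the 192.168.1.250 entry would suggest that gmond is talking on the
outside interface rather than the cluster network.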

On Sat, Jul 21, 2012 at 11:29 AM, Jesse Becker haw...@gmail.com wrote:
 What does netstat or lsof say about gmond interface binding?

 (in haste, sorry for the brevity)

 On Sat, Jul 21, 2012 at 10:50 AM, Jeff Layton layto...@att.net wrote:
 Good morning,

 Apologies for the simple question. I've got a simple cluster
 with a master node and one compute node. I installed the
 latest Ganglia on the master node (3.4.0) - libganglia,
 ganglia-gmond, ganglia-metad, ganglia-web (3.5.1). I can
 use ganglia-web to see the master node with no problems.

 I'm using Warewulf for the compute node and I installed
 libganglia and ganglia-gmond in the VNFS and rebooted the
 node. When the node comes back up, I tested ganglia via
 gstat -all on the compute node and it seems to work
 correctly.

 However, ganglia-web doesn't display anything for the compute
 node even though I've added it to the data_source line in
 gmetad.conf:


 data_source "my cluster" 10.1.0.250 10.1.0.1:8649


 I also checked if the master node could access the data from
 the compute node via gstat -all and I only get data from the
 master node (i.e. no compute node).

 I checked the Ethernet interfaces on both nodes and both
 are listed as MULTICAST. iptables on the master node and the
 compute node are off and the services are not running (checked
 that 3 times).

 There is a simple Netgear GigE switch between the nodes
 (unmanaged). I don't think that's a problem.

 One thing I think is interesting is that the master node has
 eth0 with an IP of 192.168.1.250 which is to the outside world
 and eth1 is 10.1.0.250 which is the cluster network. The compute
 node has eth0 as 10.1.0.1. But when I go to http://localhost/ganglia
 I can only access the master node as 192.168.1.250, not
 10.1.0.250 (i.e. the list of nodes is only 192.168.1.250).

 Otherwise I can log in to the compute node, ping it, etc. It works
 fine but somehow I'm missing a configuration piece for ganglia.

 TIA!

 Jeff







 --
 Jesse Becker



-- 
Jesse Becker



Re: [Ganglia-developers] add extras parsing to json graph-definitions

2012-07-19 Thread Jesse Becker
On Thu, Jul 19, 2012 at 8:54 AM, Jochen Hein joc...@jochen.org wrote:
 Vladimir Vuksan vli...@veus.hr writes:

 I would define a scaling factor or some other variable. I do want to
 steer away from having tool specific options unless absolutely
 necessary.

 I agree that would be a useful goal, I just have no idea what options
 I may need for my special problem [see my mail to ganglia-general].

 I've had a look at the other PHP-reports. A couple of them pass the
 option '--rigid' to rrdtool. Other used options are --logarithmic and
 --lower-limit. I've no idea how that could be mapped into json and keep
 the syntax and the parsing simple.

Speaking as the one who wrote[1] a good chunk of the original modular
report code, the idea for the 'extras' variable is to allow the
report writer to make whatever changes to make the graph look the way
they want.  This includes overriding the default settings.  The
various options cited, such as --rigid and --logarithmic, are perfect
examples.

Quoting from graph.php:

/* The $extras variable is used for other arguments that may not
 * fit nicely for other reasons.  Complicated requests for --color, or
 * adding --rigid, for example.
 * It is simply a way for the graph writer to add arbitrary options
 * when calling rrdtool, and to forcibly override other settings,
 * since rrdtool will use the last version of an option passed.
 */

Note that this *predates* JSON support, and when I wrote the code,
there had been zero mention of any other backends other than rrdtool.
It's worth noting that a lot, perhaps even most, of the graphing
code is rrdtool specific.

However, I don't see any particular reason why the JSON code in
graph.php should not make use of the 'extras' graph variable.  Short
of re-writing the graphing API (so to speak), this seems a
reasonable solution in the short term.  I agree with Vladimir that
something generic is preferable, but I expect that something
sufficiently generic won't be much use.  We'll have a data structure
that includes a handful of values for title, x/y label, and a
massive string or array of
all_the_other_options_for_this_graph_type.
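
To illustrate the shape I have in mind, here is a sketch of a JSON-style graph
definition with a free-form "extras" string that is appended after the default
options, so that a repeated option overrides the earlier one.  It is written
in Python rather than PHP purely for brevity; the key names and the function
are hypothetical and are not what graph.php does today.

```python
import shlex


def build_rrdtool_args(graph):
    """Assemble an rrdtool argument list with 'extras' after the defaults."""
    args = [
        "rrdtool", "graph", "-",
        "--title", graph["title"],
        "--vertical-label", graph["vertical_label"],
        "--lower-limit", "0",
    ]
    # Extras come after the defaults so that a repeated option wins.
    args += shlex.split(graph.get("extras", ""))
    args += graph["series"]
    return args


# Hypothetical definition, as it might look once decoded from JSON.
graph = {
    "title": "Load",
    "vertical_label": "procs",
    "series": ["DEF:a=load_one.rrd:sum:AVERAGE", "LINE2:a#FF0000:1-min"],
    "extras": "--rigid --lower-limit 1",
}
args = build_rrdtool_args(graph)
print(" ".join(shlex.quote(arg) for arg in args))
```

The handful of named keys stay generic, while everything rrdtool-specific is
confined to the one extras string.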



[1] Or screwed up horribly, take your pick.


-- 
Jesse Becker



Re: [Ganglia-developers] [Ganglia-general] [SECURITY] [IMPORTANT] Security issue in Ganglia Web

2012-07-15 Thread Jesse Becker
On Sun, Jul 15, 2012 at 2:48 PM, Bernard Li bern...@vanhpc.org wrote:
 Hi Daniel:

 If you want to start a wiki page for that, that's fine.  But in my
 experience these pages get stale pretty quickly ;-)

While true, stale != inaccurate or even useless.  I've written
information on (internal) wiki pages that is 5 years old, with nary a
change.  The information is still accurate and useful to this day.

-- 
Jesse Becker



Re: [Ganglia-developers] ganglia-web 3.4.x into Debian

2012-05-17 Thread Jesse Becker
On Thu, May 17, 2012 at 4:52 PM, Daniel Pocock dan...@pocock.com.au wrote:
 To get this into Debian, a couple of things are needed:

 - copyright - I've started a debian/copyright file listing the authors
 of each piece of work (everything has to be listed, for all original code
 and every piece of javascript that has been copied, etc)

What do you need, specifically, from the code authors?

-- 
Jesse Becker



Re: [Ganglia-developers] releasing 3.3.2 today?

2012-03-20 Thread Jesse Becker
On Tue, Mar 20, 2012 at 13:36, Daniel Pocock dan...@pocock.com.au wrote:
 On 20/03/2012 17:34, Bernard Li wrote:
 On Tue, Mar 20, 2012 at 10:03 AM, Daniel Pocock dan...@pocock.com.au wrote:

 I agree with that approach, with a slight variation - I'll tag it as
 3.3.3dp1 (after adding the ChangeLog file)

 Quick question -- does this prevent RPM upgrading? i.e. 3.3.3dp1 -> 3.3.3?


 It is just a tag to help us keep track of what we test, it is not
 intended for versioning a binary package

 However, if we want to be able to version binary packages using release
 candidates, we may need to look at the problem more closely

It matters very much for RPM packages, and I suspect debian packages
as well (although I don't know the details of building those).

RPMs inherently understand version numbers, and that 3.3.2 is a more
recent release than 3.3.1.  There are some further syntactic bits
added in, such as package release numbers (which are essentially, the
least significant digit in the version), but the version number
matters a great deal.  I think that versions such as 3.3.3pre1,
3.3.3pre2 and 3.3.3 are handled correctly.

These pages detail the issue:
http://fedoraproject.org/wiki/Tools/RPM/VersionComparison
http://fedoraproject.org/wiki/Packaging:NamingGuidelines#Version_Tag

There are some very ugly ways to work around insane versioning schemes,
but we don't have that problem (and shouldn't get into hacky
workarounds).
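
For reference, a simplified sketch of the segment-by-segment comparison those
pages describe.  It is not rpm's code: epochs, release tags, and newer
operators such as the tilde are left out.  Under this rule a version with
trailing letters (3.3.3pre1) sorts as newer than the bare 3.3.3, so it is
worth confirming the behaviour against a real rpm before relying on suffixed
tags.

```python
import re


def split_segments(version):
    """Break a version into runs of digits or letters; separators are dropped."""
    return re.findall(r"\d+|[A-Za-z]+", version)


def rpm_style_compare(a, b):
    """Return 1 if a is newer than b, -1 if older, 0 if they compare equal."""
    seg_a, seg_b = split_segments(a), split_segments(b)
    for x, y in zip(seg_a, seg_b):
        x_num, y_num = x.isdigit(), y.isdigit()
        if x_num and y_num:
            # Numeric segments compare as integers, so 10 beats 9.
            if int(x) != int(y):
                return 1 if int(x) > int(y) else -1
        elif x_num != y_num:
            # A numeric segment is considered newer than an alphabetic one.
            return 1 if x_num else -1
        elif x != y:
            return 1 if x > y else -1
    if len(seg_a) == len(seg_b):
        return 0
    # All shared segments matched: the version with segments left over wins.
    return 1 if len(seg_a) > len(seg_b) else -1


for left, right in [("3.3.2", "3.3.1"), ("3.3.3pre1", "3.3.3pre2"),
                    ("3.3.3pre1", "3.3.3"), ("3.3.10", "3.3.9")]:
    print(left, right, rpm_style_compare(left, right))
```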


-- 
Jesse Becker



Re: [Ganglia-developers] releasing 3.3.2 today?

2012-03-20 Thread Jesse Becker
On Tue, Mar 20, 2012 at 14:52, Daniel Pocock dan...@pocock.com.au wrote:

 On 20/03/12 19:27, Bernard Li wrote:
 I don't really want to make a big deal out of this but I thought it
 was long agreed that we would tag a release (eg. 3.3.2) and that would
 potentially be our Release Candidate.  If everything is fine, we
 will just release as is otherwise we will discard 3.3.2, bump the
 version to 3.3.3 and repeat the cycle.


 I remember that discussion too, and I think I was pushing that same
 argument - that it is easier to burn release numbers than to worry about
 suffixes

I agree.  So long as the numbers only increase, the minor release
number is basically irrelevant.

 That discussion was held in the days of SVN, when making a tag was quite
 painful

 Now we have git,
 - people can make local tags (almost like bookmarks?) whenever they like
 - you can make two tags on a single commit, because tags are like
 symlinks (e.g. 3.3.3rc1 and 3.3.3 both point to the same commit)

 This comes back to my earlier comments: the tags I have made today (e.g.
 3.3.3dp1) are not intended for packaging, it is just a helpful reminder
 for me to know how I built the tarball for people to test.  I think it
 is a useful phase in the release process.

 Once we get to the point where people want to test proper versioned
 RPMs, then we use a real tag (e.g. 3.3.3) and if the RPMs are proved to
 be dodgy after that tag is made, then we burn the version number and try
 3.3.4

Now, with RPM releases, it may not be that bad.  RPMs inherently
support a release (in the RPM lingo), which is the least significant
digit in the complete version number.  If we have a ganglia release of
X.Y.Z, the RPM release could go through several changes with its
own release number.  The first RPM-release for a new upstream is 1,
and each change increments it.  So the first official binary release
would be something like X.Y.Z-1, then -2, -3, etc.  When Ganglia
X.Y.(Z+1) is released, the RPM starts over:  X.Y.(Z+1)-1  (and not,
say, X.Y.(Z+1)-4)
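
The bookkeeping is simple enough to write down.  A toy sketch, with a made-up
function name, of how the package release moves as upstream versions and
packaging changes arrive:

```python
def next_package_release(previous, upstream_version):
    """Return (version, release) for the next package build.

    previous is the (version, release) of the last package, or None.
    """
    if previous is None or previous[0] != upstream_version:
        # New upstream version: the package release starts over at 1.
        return (upstream_version, 1)
    # Same upstream source, new packaging change: bump the release.
    return (upstream_version, previous[1] + 1)


package = None
for upstream in ["3.3.3", "3.3.3", "3.3.3", "3.3.4"]:
    package = next_package_release(package, upstream)
    print(f"ganglia-{package[0]}-{package[1]}")
```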

If we do make a policy of tagging pre-releases for testing, I suggest
that the tag include something obvious, such as a pre1 sort of
suffix.




-- 
Jesse Becker



Re: [Ganglia-developers] releasing 3.3.2 today?

2012-03-20 Thread Jesse Becker
On Tue, Mar 20, 2012 at 15:08, Daniel Pocock dan...@pocock.com.au wrote:
 On 20/03/12 19:59, Jesse Becker wrote:
 On Tue, Mar 20, 2012 at 14:52, Daniel Pocock dan...@pocock.com.au wrote:

 On 20/03/12 19:27, Bernard Li wrote:

 I don't really want to make a big deal out of this but I thought it
 was long agreed that we would tag a release (eg. 3.3.2) and that would
 potentially be our Release Candidate.  If everything is fine, we
 will just release as is otherwise we will discard 3.3.2, bump the
 version to 3.3.3 and repeat the cycle.



 I remember that discussion too, and I think I was pushing that same
 argument - that it is easier to burn release numbers than to worry about
 suffixes

 I agree.  So long as the numbers only increase, the minor release
 number is basically irrelevant.


 That discussion was held in the days of SVN, when making a tag was quite
 painful

 Now we have git,
 - people can make local tags (almost like bookmarks?) whenever they like
 - you can make two tags on a single commit, because tags are like
 symlinks (e.g. 3.3.3rc1 and 3.3.3 both point to the same commit)

 This comes back to my earlier comments: the tags I have made today (e.g.
 3.3.3dp1) are not intended for packaging, it is just a helpful reminder
 for me to know how I built the tarball for people to test.  I think it
 is a useful phase in the release process.

 Once we get to the point where people want to test proper versioned
 RPMs, then we use a real tag (e.g. 3.3.3) and if the RPMs are proved to
 be dodgy after that tag is made, then we burn the version number and try
 3.3.4

 Now, with RPM releases, it may not be that bad.  RPMs inherently
 support a release (in the RPM lingo), which is the least significant
 digit in the complete version number.  If we have a ganglia release of


 What is the intention of that release number though?

Yeah, the nomenclature gets confusing here. :-)

 Is that intended to be maintained by upstream?

 Or is it reserved for the packager?

The RPM version tag is maintained by upstream.  Only in rare cases
should an outside packager assign a version number.  And when they
do, IMO, it should be strictly date based--such as 20120320--instead
of an arbitrary version 1.0.

The RPM release, on the other hand, is strictly the purview of the
packager, and indicates changes to the package that is distributed.
This can include updated metadata for the package, a fix to the build
or install process used in creating the package, fixing permissions,
or even adding code patches to the pristine upstream source code.  All
of those are legitimate.


 If the Ganglia community releases a tarball called, 3.3.3-2.tar.gz, for
 example, then someone building RPM packages might release 3.3.3-2-1

If the Ganglia project really did release 3.3.3-2.tar.gz, we should
have our heads examined. :)

But yes, the resulting RPM could potentially appear as you described,
and it would be a mess.


 The next day, the packager decides to modify a file location within his
 spec file.  He is still using the same upstream tarball.  He bumps his
 release number to 2, so it is 3.3.3-2-2.rpm

Yep.

 In this case, the release number is used to distinguish different
 versions of a spec file maintained outside Ganglia.

That's exactly right.

 The same type of thing happens with Debian - the Debian maintainers keep
 their own artefacts in a repository, and they add a suffix to create the
 version numbers of their packages.

We maintain an RPM repository at $day_job of programs that come from
various researchers--many of whom wouldn't know proper software
engineering processes if they were hit over the head with a printed
copy of the CMM.  Suffice it to say that we have a lot of
interesting version and number schemes to deal with. :-/

 One other comment: when I did the MSI packages (with WiX), I discovered
 the nasty world of Windows packaging, where you can only have a 4 byte
 version number, basically like an IP address, written as A.B.C.D where
 each value is between 0-255.  Does anyone still build MSI packages?

Does anyone care? :)

The most complicated, non-contrived version/releases I've seen are in
the kernel RPM packages.  Here are some examples from a production
CentOS 5 host.  First, the full package is shown, including the name,
version, release and arch of the package:

  $ rpm -q kernel
  kernel-2.6.18-194.32.1.el5.centos.plus.x86_64
  kernel-2.6.18-238.9.1.el5.centos.plus.x86_64

Now, just showing the version and release:

  $ rpm -q --qf '%{version}  %{release}\n' kernel
  2.6.18  194.32.1.el5.centos.plus
  2.6.18  238.9.1.el5.centos.plus

This works correctly because of the leading numbers in the release
tag--even though there is a bunch of extra non-numeric content as
well.
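
The same split can be done on the package string itself, which makes it
obvious where the version ends and the release begins.  A small sketch (the
function name is made up) that assumes the usual name-version-release.arch
layout shown above:

```python
def split_nvra(package):
    """Split 'name-version-release.arch' into its four parts."""
    # The arch follows the last dot; version and release follow the last
    # two dashes, so dashes inside the package name are left alone.
    rest, arch = package.rsplit(".", 1)
    name, version, release = rest.rsplit("-", 2)
    return name, version, release, arch


for package in ["kernel-2.6.18-194.32.1.el5.centos.plus.x86_64",
                "kernel-2.6.18-238.9.1.el5.centos.plus.x86_64"]:
    print(split_nvra(package))
```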



-- 
Jesse Becker


Re: [Ganglia-developers] Protocol Efficiency Ideas

2012-01-27 Thread Jesse Becker
On Thu, Jan 26, 2012 at 14:34, Dave Rawks d...@pandora.com wrote:
 Hey All,
        We've been talking about adding json in addition to xml for the tcp
 listen port exchange format. And I was curious if the EXTRA_DATA
 subtree to the XML ever contains something aside from EXTRA_ELEMENTS
 and if the EXTRA_ELEMENTS ever have attributes aside from NAME and
 VAL.

<snip interesting JSON discussion>

 On a relatively small cluster with a dozen metrics and a handful of
 hosts the savings are minor. However on a cluster of hundreds of hosts
 with perhaps dozens of metrics the savings would equate to MBs of data
 per tcp fetch. And the parse speed of the json /should/ be much faster
 as well.

On a slightly different note, extending the gmond protocol itself to
support more than one metric per packet may be useful as well.
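
As a point of reference for the JSON discussion, here is a rough sketch of
flattening one metric's XML, including its EXTRA_DATA children, into a plain
dictionary.  The sample XML is hand-written, the element name EXTRA_ELEMENT
and the output layout are assumptions, and nothing here reflects an agreed
format.

```python
import json
import xml.etree.ElementTree as ET

# Hand-written sample of a single metric; real gmond output has more attributes.
METRIC_XML = """<METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=" ">
<EXTRA_DATA>
<EXTRA_ELEMENT NAME="GROUP" VAL="load"/>
<EXTRA_ELEMENT NAME="TITLE" VAL="One Minute Load Average"/>
</EXTRA_DATA>
</METRIC>"""


def metric_to_dict(xml_text):
    """Turn one METRIC element into a dict, folding NAME/VAL pairs into 'extra'."""
    metric = ET.fromstring(xml_text)
    result = {key.lower(): value for key, value in metric.attrib.items()}
    result["extra"] = {
        element.get("NAME"): element.get("VAL")
        for element in metric.iter("EXTRA_ELEMENT")
    }
    return result


compact = json.dumps(metric_to_dict(METRIC_XML), separators=(",", ":"))
print(len(METRIC_XML), "bytes of XML ->", len(compact), "bytes of JSON")
print(compact)
```

Folding the extras into a name-to-value map only works if the elements never
carry attributes other than NAME and VAL, which is exactly the question Dave
raised.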

-- 
Jesse Becker



Re: [Ganglia-developers] O'Reilly eBook on Ganglia

2011-12-12 Thread Jesse Becker
On Fri, Dec 9, 2011 at 19:51, Matt Massie m...@massie.us wrote:
 We're in the process of pulling together a team to write an O'Reilly eBook
 on Ganglia.

 Here's a rough idea of some of the topics we could cover

 Ganglia's components and overall architecture
 Typical deployment configurations including simple steps for verifying an
 installation (e.g. unicast/multicast, single cluster/multiple distributed
 clusters/datacenter)
 Navigating and using the new web interface
 Tips for extending ganglia's functionality (e.g. gmetric, modules)
 Common integration points (e.g. Hadoop metrics, Nagios)
 A simple step-by-step checklist for debugging common ganglia issues with
 pointers to our web site, mailing lists, irc channel, etc.
 Supported platforms and core metrics (e.g. Ganglia on AIX, Linux Power
 systems)
 Scaling to clusters > 1000 nodes
 Using Ganglia in mixed environments
 Ganglia in the enterprise
 Development of custom modules

 What are the things you would be most interested in?  Are there other topics
 you'd like to see covered?

An answer to the question "how do you remove dead hosts?" would be
welcome.  I think emails about that constitute about 50% of the list
traffic. ;-)

A brief explanation of the protocols and XML schemas would be useful.  I
don't think that I've ever seen a clear description of gmond packets
anywhere.

-- 
Jesse Becker



Re: [Ganglia-developers] Ganglia REST interface (was:Re: Gauging interest in writing a Ganglia eBook)

2011-12-05 Thread Jesse Becker
.  Who is interested in helping write the book?








-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)



Re: [Ganglia-developers] Investigating feasibility of moving repo to Github

2011-07-11 Thread Jesse Becker
On Mon, Jul 11, 2011 at 12:42, Carlo Marcelo Arenas Belon
care...@sajinet.com.pe wrote:
 On Sun, Jul 10, 2011 at 09:27:28PM -0400, Jesse Becker wrote:

 My only concern is with the import process itself.

 any import process that I know of from svn to git, should at least preserve
 the history, what is your concern specifically here?

I did some experimenting specifically with the Ganglia repository, and
ran into some trouble during the limited time I had to focus on this.
As I recall, the simple import will lose certain information
regarding branches and tags, although the commits and log information
themselves are preserved.


 There is a lot of important metadata in the existing SVN repository.

 are you referring to which files are executable or ASCII and stuff like that?
 tools should be able to translate them most likely into their corresponding
 git flags

No, the various non-code files (documentation, logs, etc), I'm not
concerned about.  I expect git to handle those flawlessly.  I refer to
the metadata in the repository itself--tags, logs, code attributions,
etc.

 if you are talking about the external dependency to web/dwoo that was added
 in trunk and therefore now also in 3.2, that would need to be translated as
 well, but git submodules allows for that.

I'd forgotten about dwoo, actually, but again am not terribly worried
about it either.  I don't consider that our code; we just happen to
have a copy of it for convenience.

 I believe that
 this should be completely preserved, either directly within the git
 repository, or as a separate standalone (and frozen) SVN repository.
 The commit logs, test branches, and history are too important to lose.

 the test branches that are no longer open (because they were already merged
 back) wouldn't need to be migrated IMHO, as for the other branches that were
 open but never merged back, they should probably be migrated over as well as
 topic branches but later weeded out after their good parts had been merged
 back, to avoid confusion.

Closed branches can remain closed, but I still think they should be
kept as a record, if nothing else.

 git allows you to have infinite number of local branches on your repository
 anyway, for all topics you would feel like.

 Carlo




-- 
Jesse Becker



Re: [Ganglia-developers] Investigating feasibility of moving repo to Github

2011-07-11 Thread Jesse Becker
On Mon, Jul 11, 2011 at 06:22:47PM -0400, Carlo Marcelo Arenas Belon wrote:
On Mon, Jul 11, 2011 at 01:05:47PM -0400, Jesse Becker wrote:
 On Mon, Jul 11, 2011 at 12:42, Carlo Marcelo Arenas Belon
 care...@sajinet.com.pe wrote:
  On Sun, Jul 10, 2011 at 09:27:28PM -0400, Jesse Becker wrote:

  I believe that
  this should be completely preserved, either directly within the git
  repository, or as a separate standalone (and frozen) SVN repository.
  The commit logs, test branches, and history are too important to lose.
 
  the test branches that are no longer open (because they were already merged
  back) wouldn't need to be migrated IMHO, as for the other branches that were
  open but never merged back, they should probably be migrated over as well as
  topic branches but later weeded out after their good parts had been merged
  back, to avoid confusion.

 Closed branches can remain closed, but I still think they should be
 kept as a record, if nothing else.

For keeping a record of them, it would be easier to keep svn around in a
read only way, as it is (nearly) impossible to reconstruct the merges and
the full history in svn anyway, as it was only recently that metadata was
added for keeping track of the merges.

And I've said before that keeping SVN around, read-only, is perfectly fine.

I still have no objections to migrating to git.  I think that a
'best-effort' import of the existing commits is worthwhile, and that we
should keep the SVN repository files around in case someone needs to refer
back to them.


-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)



Re: [Ganglia-developers] Investigating feasibility of moving repo to Github

2011-07-10 Thread Jesse Becker
On Sun, Jul 10, 2011 at 16:28, Vladimir Vuksan vli...@veus.hr wrote:
 As most know, a couple of months ago we created a number of repositories on
 Github for people to contribute their gmetric scripts, python modules etc.
 This has been a huge success as we received tons of good contributions. We
 have had a number of conversations on the #ganglia Freenode channel about
 moving the Ganglia repository to Github as that provides us with a better
 way of accepting user contributions and makes development more social. I
 have made an import of the monitor-core trunk into Github as an experiment
 and it converted flawlessly. You can look at the results here:

 https://github.com/ganglia/monitor-core

 Any thoughts on why we shouldn't make Github our primary repository?

+1 from me as well, but with a caveat:

My only concern is with the import process itself.  There is a lot of
important metadata in the existing SVN repository.  I believe that
this should be completely preserved, either directly within the git
repository, or as a separate standalone (and frozen) SVN repository.
The commit logs, test branches, and history are too important to lose.

Otherwise, I think it's a good idea.



-- 
Jesse Becker



Re: [Ganglia-developers] Web 2.0 UI code freeze

2011-04-11 Thread Jesse Becker
I think that we should punt authentication to other systems/modules
that are dedicated to doing so.  Apache (et al) have lots of ways of
handling authentication, and I don't see *any* reason to reinvent the
wheel here, and a lot of reasons not to.

If we want to provide hooks to allow for access based on a username,
that's fine (if (valid_auth(user) and (user in list)) { display
chart } sort of thing).  This is even a useful trick, and could be
extended to groups (from /etc/group, AD, NIS, etc), if we so choose.
But punt the authentication stuff to something else entirely.
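
A minimal sketch of such a hook, in Python for brevity rather than the
frontend's PHP.  It assumes the web server (e.g. Apache) has already
authenticated the user and exposes the name as REMOTE_USER; the
ALLOWED_USERS and ALLOWED_GROUPS sets and the may_view_chart() function
are hypothetical names for illustration, not existing Ganglia code.

```python
# Sketch only: authentication is assumed to be done by the web server,
# which hands us the username via REMOTE_USER.  We only do authorization.
# ALLOWED_USERS and ALLOWED_GROUPS are hypothetical, illustrative values.

ALLOWED_USERS = {"alice", "bob"}
ALLOWED_GROUPS = {"hpc-admins"}


def may_view_chart(environ, user_groups=()):
    """Return True if the already-authenticated user may see the chart."""
    user = environ.get("REMOTE_USER")
    if not user:
        # Nobody was authenticated by the web server; we never try to.
        return False
    if user in ALLOWED_USERS:
        return True
    # Group membership (from /etc/group, AD, NIS, ...) is looked up by the
    # caller and passed in; this function only compares names.
    return any(group in ALLOWED_GROUPS for group in user_groups)
```

The point of the split is that this code never sees a password: it only
decides what an already-identified user may look at.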

If a user wants to store custom views and such, push it into a cookie,
and store it on the browser side.  Under *NO* circumstances should we
allow a user to write data to the server through Ganglia.  Having a
canned view, created by the admin, stored in a config file on the
webserver is okay, so long as it is not possible to edit the file
through the web pages.
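
As a rough illustration of the cookie approach (again in Python rather
than PHP), a view can be serialized to JSON and handed to the browser,
so nothing is ever written on the server.  The cookie name
"ganglia_view" and both helper functions are assumptions made up for
this sketch.

```python
import json
from http.cookies import SimpleCookie
from urllib.parse import quote, unquote

COOKIE_NAME = "ganglia_view"  # hypothetical cookie name


def view_to_cookie_header(view):
    """Serialize a view description into a 'name=value' cookie string."""
    cookie = SimpleCookie()
    # Percent-encode the JSON so the value contains only cookie-safe chars.
    cookie[COOKIE_NAME] = quote(json.dumps(view, sort_keys=True))
    return cookie[COOKIE_NAME].OutputString()


def view_from_cookie_header(header):
    """Read the view back from a Cookie header; None if absent or invalid."""
    cookie = SimpleCookie()
    cookie.load(header)
    if COOKIE_NAME not in cookie:
        return None
    try:
        return json.loads(unquote(cookie[COOKIE_NAME].value))
    except ValueError:
        # Client-side data is untrusted; treat garbage as "no saved view".
        return None
```

Because the cookie comes from the client, whatever is read back should
be treated as untrusted input and validated before it is used to build
a graph.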



On Mon, Apr 11, 2011 at 22:38, Alex Dean a...@crackpot.org wrote:

 On Apr 11, 2011, at 9:07 PM, Vladimir Vuksan wrote:

 I am not sure what the right approach is. We could provide optional 
 authentication but this may be better addressed by people implementing 
 access controls in Apache ie. adding Basic auth to particular URLs. You 
 could certainly easily disable writing by setting proper permissions on 
 conf/ directory. This may be a non-issue for lots of people who are behind 
 firewalls and do not want an extra level of authentication. Perhaps we 
 should just document some of these approaches instead of reinventing the 
 authentication.

 Thoughts ?

 I agree about not wanting to overly complicate things.

 If we had some is_writable() checks, and didn't display links to actions 
 which required write access (like updating a view) when that access wasn't 
 available, that would probably be enough to implement a basic level of 
 security.  It seems a little clunky to have to chmod the filesystem, make 
 some changes, and chmod it back, but it may be good enough for now.  If we go 
 that route, I think our Makefile ought to set the conf/ directory as 
 read-only by default.

 I like the idea of using Apache's authentication mechanisms, but they may not 
 be fine-grained enough in some cases.  For example, view.php is used both to 
 display a view and to modify it.  How would you make it read-only for some 
 users, but allow admins to edit views?  You might be able to use a 
 LocationMatch directive, but that seems likely to become a mess in a hurry.

 I think it would be pretty straightforward to take the concept already in 
 auth.php, and add a distinction between 'view' (for private clusters) and 
 'edit' (for actions which change config) permissions.  Collecting username and 
 password from the user could still be done via HTTP auth as is already done 
 in auth.php, but you'd need to change the file to distinguish between those 
 who can edit and those who can view.

 This doesn't feel overly complex to me, but I'm interested in what others 
 have to say.  I don't want to hold up a release if I'm the only one who's 
 concerned.  Any other opinions out there?

 alex




-- 
Jesse Becker



Re: [Ganglia-developers] $conf array

2011-02-24 Thread Jesse Becker
+1 to that too

On Wed, Feb 23, 2011 at 23:49, Bernard Li bern...@vanhpc.org wrote:
 +1 from me as well.

 I guess we should probably check it into both monitor-web-2.0 and trunk.

 Cheers,

 Bernard

 On Wed, Feb 23, 2011 at 7:35 PM, Jesse Becker haw...@gmail.com wrote:
 +1

 On Wed, Feb 23, 2011 at 21:27, Alex Dean a...@crackpot.org wrote:
 One of my gripes with the current PHP frontend code is how hard it can be 
 to recall which variables are configuration, which come from user 
 input, and which are just local variables.  As one step toward fixing this 
 issue, I think it would be nice to place all configuration values (mainly 
 in conf.php currently) into a $conf array.  The benefit is that it's 
 immediately clear in any code which uses these values that you're dealing 
 with configuration values.  There's no danger of name collisions with other 
 variables.

 I'm just wondering if others feel the same way, and would support a change 
 like this.  It's pretty straightforward to do, but would obviously touch a 
 lot of different code.  Before I go ahead with making all those changes, I 
 guess I'd just like to know if there are any huge objections out there to 
 this idea.  Take a look, let me know what you think.  I'd like to do 
 something similar for user input as well, maybe $user?

 http://pastie.org/1600587

 thanks,
 alex







 --
 Jesse Becker






-- 
Jesse Becker



Re: [Ganglia-developers] $conf array

2011-02-23 Thread Jesse Becker
+1

On Wed, Feb 23, 2011 at 21:27, Alex Dean a...@crackpot.org wrote:
 One of my gripes with the current PHP frontend code is how hard it can be to 
 recall which variables are configuration, which come from user input, 
 and which are just local variables.  As one step toward fixing this issue, I 
 think it would be nice to place all configuration values (mainly in conf.php 
 currently) into a $conf array.  The benefit is that it's immediately clear in 
 any code which uses these values that you're dealing with configuration 
 values.  There's no danger of name collisions with other variables.

 I'm just wondering if others feel the same way, and would support a change 
 like this.  It's pretty straightforward to do, but would obviously touch a lot 
 of different code.  Before I go ahead with making all those changes, I guess 
 I'd just like to know if there are any huge objections out there to this 
 idea.  Take a look, let me know what you think.  I'd like to do something 
 similar for user input as well, maybe $user?

 http://pastie.org/1600587

 thanks,
 alex







-- 
Jesse Becker



Re: [Ganglia-developers] Vertical label in metric.php

2011-02-11 Thread Jesse Becker
As I recall, it forces the graphs within the images to align between
images that include labels, and those that do not.  One of Ganglia's
strengths is allowing for easy data/time correlations.  This is easy
only if the graphs actually have the same timescale (generally true),
and line up appropriately (which this patch tries to help with).

At least, that's how I remember it.


On Fri, Feb 11, 2011 at 07:22:34PM -0500, Bernard Li wrote:
Hey Jesse:

Just trying to understand this commit:

https://sourceforge.net/apps/trac/ganglia/changeset/2356/trunk/monitor-core/web/graph.d/metric.php

Why are we setting the vertical label to the metricname?

Thanks,

Bernard


-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)



Re: [Ganglia-developers] 3.1 branch backport proposals

2011-01-26 Thread Jesse Becker
+1 from me on both.  r2458

On Tue, Jan 25, 2011 at 15:29, Bernard Li bern...@vanhpc.org wrote:
 Hi all:

 Could someone please vote on the following two backport proposals for 3.1?

  * build: Install manpages in appropriate locations when `make install` is run
    http://sourceforge.net/apps/trac/ganglia/changeset/2299
    http://sourceforge.net/apps/trac/ganglia/changeset/2301
    +1: bernardli

  * build: Include BUGS file to distribution tarball
    http://sourceforge.net/apps/trac/ganglia/changeset/2455
    +1: bernardli
    bernardli: depends on Install manpages in appropriate locations
 when `make install` is run

 Thanks!

 Bernard





-- 
Jesse Becker



Re: [Ganglia-developers] send_metadata_interval

2011-01-07 Thread Jesse Becker
On Fri, Jan 7, 2011 at 15:25, Bernard Li bern...@vanhpc.org wrote:
 Hi all:

 Since the release of Ganglia 3.1, we have introduced the new
 configuration option send_metadata_interval in gmond.conf.  This is
 set to 0 by default, and the user must set this to a sane number if
 using unicast; otherwise, if gmonds are restarted, hosts may appear to
 be offline (this is documented in the release notes).  A bug has
 already been filed:

 http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=242

 We have recently had a lot of users hit this issue, and Vladimir
 recommended that we just set a sane number as the default and be done
 with it, since we end up spending a lot of time on IRC/mailing-list to
 solve the same problem over and over again.

 Since there have been some commits to the 3.1 branch since tagging
 3.1.7, I propose we just copy the 3.1.7 tag, update send_metadata_interval
 in the configuration file and release that as 3.1.8.

 This is not the normal procedure for making a release, so I'd like to
 get some feedback from other developers.

 BTW I am thinking of setting send_metadata_interval to 30 seconds.
 Also, does anybody know if this setting affects multicast setups in
 any way?

I think that it's fine to set this to a non-zero value, but I wonder
if 30 seconds is too frequent.  I did a quick check of the
actual packets that are sent--and specifically the metadata packets.
I haven't been able to really delve into the code to figure out exactly
what's going on (this part of the code isn't terribly transparent to
me), but I *think* that they are really large--on the order of several
KB when fully assembled, as compared to less than 100-120 bytes for a
typical metric packet.  I think that size will increase with the
number of metrics stored, since each one must be described in full XML
each time.

The reason for the large size is that an entire XML description of the
metrics appears to be sent each time.  Metadata packets also appear to
go over TCP, not UDP.

My testing was pretty simple:
1) set up a gmond (from SVN, well after 3.1 came out) in unicast mode.
2) set 'send_metadata_interval' to 1
3) disable all modules, except for 'mod_core'
4) remove all collection groups.
5) start gmond, and run tcpdump.

On a large cluster, with lots of metrics per host, I can see problems
if the metadata packets are sent too frequently.  I have hosts that
send well over 300 metrics (lots of CPU cores makes for lots of
metrics...).  Each of these needs to be described in the metadata
packets.

So I think that setting a non-zero default is fine.  But I think that
something like 300 or 600 seconds would be preferable.
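
To put rough numbers on the concern, here is a back-of-the-envelope
sketch in Python.  Everything in it is an assumption for illustration:
the cluster size, and especially the ~200 bytes of metadata per metric,
which is a guess and not something measured in the test above.

```python
def metadata_bytes_per_second(hosts, metrics_per_host, bytes_per_metric,
                              interval_s):
    """Average metadata traffic if every host re-sends metadata for every
    metric once per interval.  All inputs are assumptions, not measurements.
    """
    if interval_s <= 0:
        # An interval of 0 disables periodic re-sending in this model.
        return 0.0
    return hosts * metrics_per_host * bytes_per_metric / interval_s


if __name__ == "__main__":
    # Hypothetical cluster: 1000 hosts, 300 metrics each, ~200 bytes of
    # metadata per metric (an illustrative guess).
    for interval in (30, 300, 600):
        rate = metadata_bytes_per_second(1000, 300, 200, interval)
        print(f"{interval:>4}s interval: {rate / 1024:.0f} KiB/s")
```

Under these assumed inputs the aggregate metadata traffic scales
inversely with the interval, so going from 30 to 300 seconds cuts it by
a factor of ten.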


-- 
Jesse Becker



Re: [Ganglia-developers] sFlow counters in Ganglia

2010-10-26 Thread Jesse Becker
On Tue, Oct 26, 2010 at 03:47, Bernard Li bern...@vanhpc.org wrote:
 Perhaps we should start a separate thread to discuss Kostas' idea of
 having a pluggable interface for metric sources.  With sFlow support,
 gmond can now handle both gmond sources and sFlow sources.  With a
 pluggable interface, we could in the future be able to aggregate data
 collected by other tools, such as collectd, collectl, etc.  The
 possibilities are endless.

 What do you guys think?

I think both ideas (a separate thread and modular collection methods)
are great.

-- 
Jesse Becker



Re: [Ganglia-developers] [Ganglia-general] IRC chat on Ganglia Web Frontend re-write 10/13/2010 (Wed) 9-10am PDT

2010-10-13 Thread Jesse Becker
I have a log that I will try to clean up and post later today.

On Wed, Oct 13, 2010 at 14:46, Dave Josephsen d...@dbg.com wrote:
 Hey all,

 Did anyone take minutes?  I wasn't able to attend but am interested in 
 hearing about the chat.

 Thanks

 -dave

 - Original Message -
 From: Bernard Li bern...@vanhpc.org
 To: ganglia-developers@lists.sourceforge.net, Ganglia 
 ganglia-gene...@lists.sourceforge.net
 Sent: Thursday, October 7, 2010 1:55:26 PM GMT -06:00 US/Canada Central
 Subject: [Ganglia-general] IRC chat on Ganglia Web Frontend re-write 
 10/13/2010 (Wed) 9-10am PDT

 Dear all:

 I've been talking to people on and off about doing a web frontend
 re-write, in fact I have been thinking about it since almost three
 years ago when I started the wishlist thread:

 http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg03070.html

 I've managed to gather a group of developers and users who are
 interested in this re-write effort and we are planning to have a
 discussion forum on irc.freenode.net #ganglia on 10/13/2010 (next
 Wednesday) from 9-10am PDT.

 I would like to extend this invitation to anybody interested in participating.

 Here's the agenda:

 Introductions: Briefly introduce yourself

 Discussion points:
 - Use case of Ganglia
 - Language: PHP? Ruby on Rails? Framework?
 - Technologies: graphite, canvas, SVG, javascriptrrd, flot, jQuery, node.js
 - Modular by design, extensible by users
 - State: Login, different views for administrator, managers, operators
 - Browsing (server-side) vs Customization/Interactive mode (client-side)
 - Customization: custom graphs, overlay graphs, compare arbitrary graphs
 - Interactivity: Allow users to interact with graphs, zoom in/out, etc.
 - Allow arbitrary grouping of hosts
 - URL pretty names: eg. http://ganglia.info/grid/cluster/host
 - Flexible URL API for graph generation (turn legends on/off, titles,
 height, width, etc.)
 - Metric query backend (JSON?)
 - Search for host/metric/graphs (aka Ganglia Instant)
 - Identified by unique identifiers (as opposed to hostnames)
 - New logos, icons, etc.

 Hope to see you all there.

 Cheers,

 Bernard

 ___
 Ganglia-general mailing list
 ganglia-gene...@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/ganglia-general





-- 
Jesse Becker



Re: [Ganglia-developers] Dead code in web/meta_view.php

2010-08-25 Thread Jesse Becker
Looks dead to me too.  Nuke away.

On Wed, Aug 25, 2010 at 20:27, Bernard Li bern...@vanhpc.org wrote:
 Hi guys:

 Have a look at lines 147 and 148 of web/meta_view.php in trunk:

 http://sourceforge.net/apps/trac/ganglia/browser/trunk/monitor-core/web/meta_view.php#L147

 there seems to be a syntax error with the $tpl->assign() call.  That
 line essentially is no-op since there is no variable specified for the
 assignment.

 Unless I am missing something, I am going to remove the 2 lines.

 Thanks,

 Bernard





-- 
Jesse Becker



Re: [Ganglia-developers] [Ganglia-general] Spoofing functionality in 3.1.x branch...

2010-06-24 Thread Jesse Becker
+1 from me.  I'm all for more and better examples/documentation.

On Thu, Jun 24, 2010 at 18:37, Bernard Li bern...@vanhpc.org wrote:
 Hi Brad:

 I am doing some cleanup in the trunk repo and found that the
 spfexample (python module) was not included in the distribution
 tarball and/or backported to the 3.1.x branch.  Should it be?  I'm not
 saying we should activate it by default, but should just include it
 much like the other example python module.

 Thanks,

 Bernard

 On Thu, Dec 4, 2008 at 9:14 AM, Brad Nicholes bnicho...@novell.com wrote:

 For those that are interested in the module based spoofing feature, all of 
 the functionality should be complete and has been backported to the 3.1.x 
 branch.  I have also added some spoofing module examples to trunk that can 
 be downloaded from monitor-core/gmond/python_modules/example/spfexample.py 
 in the trunk repository.  There  is also a small .pyconf file in 
 monitor-core/gmond/python_modules/conf.d/spfexample.pyconf.  This example 
 module should give you enough guidance so that you can build your own 
 spoofing module.  Please let me know if anything is missing.

 Brad






-- 
Jesse Becker
Every cloud has a silver lining, except for the mushroom-shaped ones,
which come lined with strontium-90.



Re: [Ganglia-developers] RFE suggestions for Ganglia 3.1.7

2010-06-01 Thread Jesse Becker
On Tue, Jun 1, 2010 at 11:52, Art Peck arthurp...@aol.com wrote:

 I am very impressed with Ganglia 3.1.7. I would like to create an addon 
 package to facilitate monitoring of the Oracle Sun Ray Server Software and 
 associated desktop devices.

 I've already created one Python module and integrated it into gmond and 
 gmetad. For the most part, it is working as I wanted. However, I would really 
 like to be able to manipulate the formatting of the resulting graph(s). For 
 example, I would greatly prefer a line graph to an area and I really need to 
 STACK several metrics on a graph. So I have the following RFEs:

Writing custom graphing modules has been more easily supported for
several releases now.  If you have need to create customized charts
(like many of us do), there is a well documented framework for doing
so.  Take a look at the various *_report.php files in the web/graph.d
directory of your ganglia installation.  I've specifically written a
storage report script that uses stacked graphs.

 (1) Extend the descriptor dictionary to include key=value pairs that get 
 passed to gmetad and the web frontend allowing for specification of more of 
 the rrdgraph formatting options. Maybe something like 'graph_type' = 'Line', 
 'line_color' = 'Red', 'background_color' = 'Lt Blue', etc

I believe that you could fake this already using string metrics.

 (2) Extend gmond/gmetad and the web frontend to interpret a descriptor that 
 is a collection of dictionaries as meaning that the metrics described  should 
 be stacked on one graph.

Gmond has nothing directly to do with drawing the graphs, and it will
almost certainly remain that way; its sole job is to collect and send
metrics.  Gmetad has several tasks (collect/receive metrics, write RRD
files, spew XML), and only one of them is tangentially related to
drawing graphs (spewing XML).

Almost all of the work for graph drawing is done by the various .php
files, as mentioned above.  The proper place to interpret such
metadata (line color information, etc.) as you proposed is in the
reporting step.  This could, in theory, be pulled out of the XML stream
from gmetad.


 (3) Allow for collection of strings that are not reported by the web frontend 
 but rather stored by rrdtool for other reporting tasks. For example, 
 configuration management, in this case simple firmware version reporting, and 
 inventory management.

RRD files, so far as I am aware, cannot store this sort of
data.  Gmond is able to collect this sort of information, and Gmetad
is able to report on it via the XML stream.  I suggest that you parse
out the various bits of configuration information and store them in
something appropriate of your choosing.
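
A minimal sketch of that suggestion, assuming the gmetad XML stream has
already been read into a string (fetching it from gmetad is left out).
The sample XML is hand-written for illustration and the firmware_version
metric is hypothetical; only the HOST and METRIC element names and the
NAME, VAL and TYPE attributes are taken from Ganglia's XML output.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; a real gmetad dump carries many more attributes.
SAMPLE_XML = """\
<GANGLIA_XML VERSION="3.1.7" SOURCE="gmetad">
 <GRID NAME="example">
  <CLUSTER NAME="sunray">
   <HOST NAME="dtu01.example.com" IP="10.0.0.1">
    <METRIC NAME="firmware_version" VAL="4.2_77" TYPE="string"/>
    <METRIC NAME="load_one" VAL="0.12" TYPE="float"/>
   </HOST>
  </CLUSTER>
 </GRID>
</GANGLIA_XML>
"""


def string_metrics(xml_text):
    """Return {host: {metric: value}} for string-typed metrics only."""
    inventory = {}
    root = ET.fromstring(xml_text)
    for host in root.iter("HOST"):
        for metric in host.iter("METRIC"):
            if metric.get("TYPE") == "string":
                per_host = inventory.setdefault(host.get("NAME"), {})
                per_host[metric.get("NAME")] = metric.get("VAL")
    return inventory


print(string_metrics(SAMPLE_XML))
```

The resulting dictionary can then be written to whatever inventory store
is convenient; numeric metrics are skipped because gmetad already puts
those into RRD files.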

This is a good idea--don't get me wrong--and a lot of people could
make use of a program that did something like this.  Please post any
modules or add-ons that you come up with.



--
Jesse Becker
Every cloud has a silver lining, except for the mushroom-shaped ones,
which come lined with strontium-90.



Re: [Ganglia-developers] [PATCH] Weird system/idle cpu metrics (ppc64)

2010-05-06 Thread Jesse Becker
Committed as r2295 to trunk.

On Thu, May 6, 2010 at 08:38, Rafael Xavier de Souza
rxavi...@br.ibm.com wrote:

 Hi Jesse,

 Here it is.

 Thank you so much!

 Rafael Xavier de Souza
 Linux Technology Center Software Engineer
 IBM Systems  Technology Group
 rxavi...@br.ibm.com
 MM17 Hortolândia-SP, Brazil

 Em 05-05-2010 19:33, Jesse Becker escreveu:

 Could you please re-base this patch off of trunk?  Once done, I'll test and 
 commit it.

 Thanks for the patch!

 On Wed, May 5, 2010 at 09:47, Rafael Xavier de Souza rxavi...@br.ibm.com 
 wrote:

 This patch fixes CPU system and idle metrics. Ganglia uses floating-point 
 variables (double) to store the jiffies and the jiffies sums. But floating-point 
 numbers have limited precision, and Ganglia loses accuracy in its calculations 
 with the large values seen on ppc64. This patch changes 
 libmetrics/linux/metrics.c to use integers (unsigned long long) to store the 
 jiffies and jiffies sums, and floats (double) to store only the calculated numbers.

 Signed off by: Rafael Xavier rxavi...@br.ibm.com

 OBS:
 1) This patch has been sniff tested, but not extensively.
 2) It fixes it for linux only. The ideal approach would be to make the same 
 changes for other OSes as well.
 --

 Rafael Xavier de Souza
 Linux Technology Center Software Engineer
 IBM Systems  Technology Group
 rxavi...@br.ibm.com
 MM17 Hortolândia-SP, Brazil





 --
 Jesse Becker
 Every cloud has a silver lining, except for the mushroom-shaped ones, which 
 come lined with strontium-90.




--
Jesse Becker
Every cloud has a silver lining, except for the mushroom-shaped ones,
which come lined with strontium-90.



Re: [Ganglia-developers] [PATCH] Weird system/idle cpu metrics (ppc64)

2010-05-05 Thread Jesse Becker
I also looked at similar code for several other operating systems.  At least
some of them already store the counters as 'unsigned long long', and do a
conversion right before returning the data.
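
To illustrate why keeping the counters as integers matters, here is a
small sketch. It is not the patch itself: Python floats are IEEE-754
doubles, so they stand in for the C double, and Python ints stand in
for unsigned long long. The counter values are chosen only to make the
effect visible and are not taken from the ppc64 report.

```python
# A double has a 53-bit significand, so above 2**53 not every integer
# can be represented exactly.
BIG = 2**53


def delta_as_double(prev, curr):
    """Difference taken after both counters were stored as doubles."""
    return float(curr) - float(prev)


def delta_as_int(prev, curr):
    """Difference taken on exact integers, converted only at the end."""
    return float(curr - prev)


prev, curr = BIG, BIG + 1  # the counter advanced by exactly one tick
print(delta_as_double(prev, curr))  # the tick is lost to rounding
print(delta_as_int(prev, curr))     # the tick survives
```

Doing the subtraction on integers and converting once, right before the
value is returned, is the same pattern described above for the other
operating systems.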

On Wed, May 5, 2010 at 18:33, Jesse Becker haw...@gmail.com wrote:

 Could you please re-base this patch off of trunk?  Once done, I'll test and
 commit it.

 Thanks for the patch!

 On Wed, May 5, 2010 at 09:47, Rafael Xavier de Souza 
 rxavi...@br.ibm.com wrote:

 This patch fixes CPU system and idle metrics. Ganglia uses floating-point
 variables (double) to store the jiffies and the jiffies sums. But floating-point
 numbers have limited precision, and Ganglia loses accuracy in its calculations
 with the large values seen on ppc64. This patch changes
 libmetrics/linux/metrics.c to use integers (unsigned long long) to store the
 jiffies and jiffies sums, and floats (double) to store only the calculated
 numbers.


 Signed off by: Rafael Xavier rxavi...@br.ibm.com

 OBS:
 1) This patch has been sniff tested, but not extensively.
 2) It fixes it for linux only. The ideal approach would be to make the
 same changes for other OSes as well.

 --

 Rafael Xavier de Souza
 Linux Technology Center Software Engineer
 IBM Systems  Technology Group
 rxavi...@br.ibm.com
 MM17 Hortolândia-SP, Brazil






 --
 Jesse Becker
 Every cloud has a silver lining, except for the mushroom-shaped ones, which
 come lined with strontium-90.




-- 
Jesse Becker
Every cloud has a silver lining, except for the mushroom-shaped ones, which
come lined with strontium-90.


Re: [Ganglia-developers] ganglia with cacti style

2010-03-30 Thread Jesse Becker
On Fri, Mar 26, 2010 at 12:40:31PM -0400, Ramon Bastiaans wrote:
Hi Bernard,

It's here now:

  * http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=256

Have a nice weekend.

Committed (with minor whitespace tweaks) as r2294.

Thanks for the patch!


-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)



Re: [Ganglia-developers] lib64 on Linux

2010-02-10 Thread Jesse Becker
On Wed, Feb 10, 2010 at 11:32, Daniel Pocock dan...@pocock.com.au wrote:


 A while back, I introduced a patch to configure.in that determines
 whether or not lib directories should be called lib or lib64

 This is needed when trying to find things like libconfuse

 lib64 is not a universal phenomenon though - it is not the norm on Debian.

 However, I note that the Linux FHS does mandate lib64,

I must admit I find this amusing. :-)

 http://www.pathname.com/fhs/pub/fhs-2.3.html#LIB64

 so I may reverse the logic, the new logic will be to use lib64 on any
 x86_64-*-linux except for those on an exception list (e.g. Debian)

+1 from me.  Using lib64 for 64bit libs is certainly the expected
behavior for the various Red Hat derived distributions.


-- 
Jesse Becker
Every cloud has a silver lining, except for the mushroom-shaped ones,
which come lined with strontium-90.



Re: [Ganglia-developers] lib64 on Linux

2010-02-10 Thread Jesse Becker
On Wed, Feb 10, 2010 at 12:02, Daniel Pocock dan...@pocock.com.au wrote:
 Jesse Becker wrote:
 On Wed, Feb 10, 2010 at 11:32, Daniel Pocock dan...@pocock.com.au wrote:

 A while back, I introduced a patch to configure.in that determines
 whether or not lib directories should be called lib or lib64

 This is needed when trying to find things like libconfuse

 lib64 is not a universal phenomenon though - it is not the norm on Debian.

 However, I note that the Linux FHS does mandate lib64,


 I must admit I find this amusing. :-)


 http://www.pathname.com/fhs/pub/fhs-2.3.html#LIB64

 so I may reverse the logic, the new logic will be to use lib64 on any
 x86_64-*-linux except for those on an exception list (e.g. Debian)


 +1 from me.  Using lib64 for 64bit libs is certainly the expected
 behavior for the various Red Hat derived distributions.


 Actually, the FHS does go further though:

 - on many other architectures, lib64 is also expected, e.g. PPC64,
 s390x, sparc64

 - however, on ia64, lib64 should NOT be used (what does RHEL do on ia64?)

I'm not sure (but may be able to find out if it really matters).  I
*suspect* that it just uses  /lib.  On ia64 systems, there is no (or
at least less) expectation of ever running 32bit code.  Yes, yes,
there are emulators and such, but that is a special case.  This is as
opposed to x86_64 systems where there is both the ability and
expectation that 32bit and 64bit code will coexist.

 Maybe a specific autoconf macro is needed to encapsulate the FHS
 requirements and the real-world variations between platforms.

Possibly.  I would not be surprised if something like this already
exists; we can't be the first project to deal with this problem.
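
As a way to pin down the rules discussed in this thread, here is a
sketch of the decision in Python rather than in autoconf. The function
name, the architecture set and the distro argument are all hypothetical;
this is not the logic in configure.in, only a restatement of the
proposal (lib64 on 64-bit Linux targets, except ia64 and except distros
on an exception list such as Debian).

```python
# Architectures for which the thread says lib64 is expected on Linux.
# "powerpc64" is included as an assumed spelling alongside "ppc64".
LIB64_ARCHES = {"x86_64", "ppc64", "powerpc64", "s390x", "sparc64"}

# Distributions that keep 64-bit libraries in plain "lib".
LIB_ONLY_DISTROS = {"debian"}


def libdir_name(host_triplet, distro=""):
    """Return "lib" or "lib64" for a triplet such as x86_64-redhat-linux-gnu."""
    arch, _, rest = host_triplet.partition("-")
    if "linux" not in rest:
        return "lib"   # the rule is only about Linux
    if arch == "ia64":
        return "lib"   # ia64 is the stated exception
    if arch in LIB64_ARCHES and distro.lower() not in LIB_ONLY_DISTROS:
        return "lib64"
    return "lib"


for triplet, distro in [
    ("x86_64-redhat-linux-gnu", "rhel"),
    ("x86_64-pc-linux-gnu", "Debian"),
    ("ia64-unknown-linux-gnu", "rhel"),
]:
    print(triplet, distro, libdir_name(triplet, distro))
```

Keeping the exceptions in data rather than in nested conditionals makes
it easier to add the real-world variations as they are reported.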


-- 
Jesse Becker
Every cloud has a silver lining, except for the mushroom-shaped ones,
which come lined with strontium-90.



Re: [Ganglia-developers] gmetad and rrdtool scalability

2009-12-20 Thread Jesse Becker
On Sun, Dec 20, 2009 at 10:49, Spike Spiegel fsm...@gmail.com wrote:
 On Mon, Dec 14, 2009 at 2:00 AM, Vladimir Vuksan vli...@veus.hr wrote:
 I think you guys are complicating much :-). Can't you simply have multiple
 gmetads in different sites poll a single gmond. That way if one gmetad fails
 data is still available and updated on the other gmetads. That is what we
 used to do.

 Would you mind explaining to me why having multiple gmetads in different
 colos pulling from the same gmond is simpler than the infrastructure I

Conceptually, it may be simpler since the two gmetad instances can
be considered 100% independent of each other; they just happen to have
the same polling targets.  There's no need, under normal
circumstances, for the two installs to deal with each other.  The
catch is when there is a failure, and you need to bring back one of
the two instances.  You trade simplicity in up-front configuration for
complexity during the recovery.

(Not trying to speak for Vladimir, just tossing in a few comments of my own.)

I do not claim that this is the proper solution for all places, but it
is *a* solution that is good enough for some.


-- 
Jesse Becker



Re: [Ganglia-developers] gmetad and rrdtool scalability

2009-12-20 Thread Jesse Becker
On Sun, Dec 20, 2009 at 11:02, Spike Spiegel fsm...@gmail.com wrote:
 On Mon, Dec 14, 2009 at 10:28 AM, Carlo Marcelo Arenas Belon
 care...@sajinet.com.pe wrote:
 a) you are only concerned with redundancy and not looking for
 scalability - when I say scalability, I refer to the idea of maybe 3 or
 more gmetads running in parallel collecting data from huge numbers of agents

 what is the bottleneck here?, CPUs for polling or IO?, if IO using memory
 would be most likely all you really need (specially considering RAM is really
 cheap and RRDs are very small), if CPUs then there might be somethings we
 can do to help with that, but vertical scalability is what gmetad has, and
 for that usually means going to a bigger box if you hit the limit on the
 current one.

 IME CPU isn't really a problem; the big load is I/O, and indeed moving
 the rrds to a ramdisk is the most common solution with pretty decent
 results.

I concur, for the moment. ;-)

If gmetad takes on more duties, in terms of more sophisticated
interactive access, built-in trickery for improving disk IO, etc, then
CPU could become an issue.  However, that's a really big if, and a
problem for the future.

 I think there's a middle ground here that'd be interesting to explore,
 altho that's a different thread, but for kicks this is the gist: the
 common pattern for rrd storage is hour/day/month/year and I've always
 found it bogus. In many cases I've needed higher resolution (down to
 the second) for the last 5-20 minutes, then intervals of an hr to a
 couple hrs, then a day to three days and then a week to 3 weeks etc
 etc, which increases your storage requirements, but  is imho not an
 abuse of rrd and still retains the many advantages of rrd over having
 to maintain a RDBMs.

The d/w/m/y split is a good *starting point*.  Ganglia needs to ship
with some sort of sensible default configuration that essentially
works for many/most people.  You (singular or plural) are free to
customize your RRD configuration as policy and storage capacity
require and permit.  Ganglia officially supports this via the RRD
config line in gmetad.conf.  In
the ideal world, you keep all data, at the highest resolution,
forever, but that usually isn't practical.
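
To make the trade-off concrete, here is a small sketch that turns a
"keep this much history at this resolution" policy into rrdtool RRA
definitions of the form RRA:CF:xff:steps:rows. The 15 second base step
and the three tiers are assumptions picked for illustration, not
Ganglia's shipped defaults, and the helper name is hypothetical.

```python
def rra_spec(step, resolution, retention, cf="AVERAGE", xff=0.5):
    """Build an RRA definition.

    step       -- base RRD step in seconds
    resolution -- seconds covered by one stored row
    retention  -- seconds of history to keep at that resolution
    """
    steps_per_row = resolution // step
    rows = retention // resolution
    return "RRA:%s:%s:%d:%d" % (cf, xff, steps_per_row, rows)


STEP = 15  # assumed base step, in seconds
TIERS = [
    (15, 3600),            # full resolution for one hour
    (60, 86400),           # one-minute rows for one day
    (3600, 365 * 86400),   # one-hour rows for one year
]

total_rows = 0
for resolution, retention in TIERS:
    spec = rra_spec(STEP, resolution, retention)
    total_rows += int(spec.rsplit(":", 1)[1])
    print(spec)

# Each row holds one 8-byte double per data source; file headers are ignored.
print("approx bytes per metric:", total_rows * 8)
```

Multiplying the per-metric figure by the number of metrics gives a rough
idea of what a higher-resolution policy costs in storage.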


-- 
Jesse Becker



Re: [Ganglia-developers] [RFC] status update for removing ganglia release names from the code

2009-12-03 Thread Jesse Becker
On Thu, Dec 3, 2009 at 07:44, Carlo Marcelo Arenas Belon
care...@sajinet.com.pe wrote:
 Jesse

 There is a backport request for 3.1 labeled build: remove ganglia release
 name from the code and that has a veto from you which I would like to see
 reconsidered.

I've removed my objection (r2136).  It's a minor point, and as Martin
pointed out, there are bigger problems with 3.1.  Something this
trivial should not block a release.  If it stays, that's fine with me.
If it goes, also fine with me.

-- 
Jesse Becker



Re: [Ganglia-developers] template-based metric definition with PCRE

2009-11-28 Thread Jesse Becker
On Sat, Nov 28, 2009 at 08:42, Daniel Pocock dan...@pocock.com.au wrote:

 For those following trunk, you may need to bootstrap again, and make
 sure you have pcre available.

 I've linked gmond with libpcre so that it can dynamically match the
 metric names

 E.g., for the multicpu module, this is the only metric definition that
 needs to be given to enable all metrics on all cores:

  metric {
    name_match = "multicpu_([a-z]+)([0-9]+)"
    value_threshold = 1.0
    title = "CPU-\\2 \\1"
  }

Oh, that's cool. +1 for me.

 I'd be interested in any feedback on the PCRE dependency.  If necessary,
 the feature can be made into a compile time option so that gmond can
 build without it.

Yes, an optional compile time option is the way to do this.  Use it if
present, but continue on without it if not present.
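
For readers who want to see what the name_match and title lines above
do, here is a sketch using Python's re module in place of libpcre. It
assumes the pattern has to match the whole metric name and that \1 and
\2 in the title refer to the capture groups; how gmond anchors the
match is not stated in this thread.

```python
import re

# Same pattern and title template as in the metric stanza above.
NAME_MATCH = re.compile(r"multicpu_([a-z]+)([0-9]+)")
TITLE_TEMPLATE = r"CPU-\2 \1"


def title_for(metric_name):
    """Return the expanded title, or None if the name does not match."""
    match = NAME_MATCH.fullmatch(metric_name)
    if match is None:
        return None
    return match.expand(TITLE_TEMPLATE)


for name in ("multicpu_user0", "multicpu_idle12", "load_one"):
    print(name, "->", title_for(name))
```

One stanza therefore covers every per-core metric the module exports,
whatever the number of cores.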


-- 
Jesse Becker



Re: [Ganglia-developers] fix for monitor-core/web/graph.d/metric.php

2009-11-13 Thread Jesse Becker
Thanks for the patch.  Applied as r2105 to trunk.

On Fri, Nov 13, 2009 at 12:00, Greg Bruno greg.br...@gmail.com wrote:
 there are two fixes i would like to submit for consideration to the file:

    monitor-core/web/graph.d/metric.php

 the first is, 'jobstart_color' should be added to the global
 declarations. that is, the following code:

         global $context,
               $default_metric_color,
               $hostname,
               $jobstart,
               $load_color,
               $max,
               $meta_designator,
               $metricname,
               $metrictitle,
               $min,
               $range,
               $rrd_dir,
               $size,
               $summary,
               $value,
               $vlabel,
               $strip_domainname;

 should be changed to:

         global $context,
               $default_metric_color,
               $hostname,
               $jobstart,
               $load_color,
               $max,
               $meta_designator,
               $metricname,
               $metrictitle,
               $min,
               $range,
               $rrd_dir,
               $size,
               $summary,
               $value,
               $vlabel,
               $jobstart_color,
               $strip_domainname;

 ===

 the second fix is to add a space between the " and ' characters. that
 is, change the line:

    $series .= "'VRULE:$jobstart#$jobstart_color' ";

 to:

    $series .= " 'VRULE:$jobstart#$jobstart_color' ";


  - gb





-- 
Jesse Becker



Re: [Ganglia-developers] Storing Ganglia data in MySql.

2009-11-12 Thread Jesse Becker
On Thu, Nov 12, 2009 at 06:23:07AM -0500, Daniel Pocock wrote:
Bernard Li wrote:
 Hi Daniel:

 On Tue, Nov 10, 2009 at 8:30 AM, Daniel Pocock dan...@pocock.com.au wrote:

   
 Which gmetad is intended to be on the future roadmap?

 For a large site, do you believe it is fair to say that the C
 implementation is best for performance?
 

 I think if possible, we should perform a side-by-side performance test.

 It should be possible to run both gmetad and gmetad-python polling the
 same sets of gmonds for data.  The gmetads can reside on different but
 similar servers.
   
I've been meaning to do something like this anyway, with a view to 
finding out how scalable Ganglia is when handling a large number of 
hosts and metrics at short polling intervals.

I can say that the IO system really matters a lot when it comes to
scaling, probably moreso than the number of hosts.  That said, I'd love
to see the results of any tests that you do.

I recently moved from a 15 second interval to 30 seconds to reduce load
on the gmetad server.  I'm only dealing with about 5,300 metrics over
70 hosts, with about 70 metrics on average for each host (some more,
some less).  This cut the load a fair bit, but I lost resolution in the
rrd files.  After trying this for about a day, I moved all RRD files
over to a ramdisk, which made the IO issues go away.
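
A back-of-the-envelope sketch of why the interval change helped,
assuming one RRD update per metric per polling interval (summary RRDs
and filesystem overhead are ignored). The metric count is the one
quoted above; the function name is hypothetical.

```python
def updates_per_second(metric_count, interval_seconds):
    """Average RRD updates per second for a given polling interval."""
    return metric_count / interval_seconds


METRICS = 5300
for interval in (15, 30):
    rate = updates_per_second(METRICS, interval)
    print("%2d s interval: about %.1f updates/s" % (interval, rate))
```

Doubling the interval halves the write rate, at the cost of the lost
resolution mentioned above; the ramdisk removes the disk from the path
altogether.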


One related issue for me is distributing the load across multiple gmetad 
servers dynamically - is something like this on the roadmap for either 
of the gmetad versions?  At the moment, the administrator has to manage 
the load and move things around, but with users feeding their own 
metrics in, the need for Ganglia to adapt dynamically and distribute the 
load becomes more important.

My first thoughts on the matter involved some kind of shared storage 
containing all the RRDs, and using DLM to make sure that only one gmetad 
can write the RRDs for a particular host or cluster.




-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)



Re: [Ganglia-developers] [PATCH] graph stats avg/min/max

2009-11-04 Thread Jesse Becker
On Wed, Nov 04, 2009 at 04:43:30AM -0500, Ramon Bastiaans wrote:
Jesse,

I fixed one more bug, seems I made a copy/paste typo with the cached mem
values.

Do you want me to resend it, or is there no interest in this patch?

Please send it along.

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)



Re: [Ganglia-developers] Using git for ganglia source code management (was Re: [Ganglia-general] Ganglia 3.1.4 beta ready for testing)

2009-11-03 Thread Jesse Becker
On Tue, Nov 3, 2009 at 03:26, Carlo Marcelo Arenas Belon
care...@sajinet.com.pe wrote:
 I notice it can do some handy tricks, like generating version numbers that
 reflect the tag you are building in

 it can also do some more interesting tricks, like pushing/pulling from a
 subversion server and so there is really no need to force anyone to migrate
 either.

So basically, if you (plural) want to use git, go right ahead.

 As a project, it may need a bit more discussion, and at least some
setup work.  I would suggest having both systems run in parallel for a
short while before cutting over.  Pick a revision in the SVN tree as a
base for the import/export to git, and make sure that all of the
history, tagging, branching, etc is preserved.  Then for a short while
(few weeks?  a month?) run both systems in parallel, and make sure the
developers make *TWO* checkins.  One for SVN, one for git.  Yes, it's
a hassle--that's why the transition time is short. :)

-- 
Jesse Becker



Re: [Ganglia-developers] bootstrapping ganglia with modern autotool versions for release (was Re: Ganglia 3.1.4 beta ready for testing)

2009-11-03 Thread Jesse Becker
On Tue, Nov 3, 2009 at 02:33, Carlo Marcelo Arenas Belon
care...@sajinet.com.pe wrote:
 On Fri, Oct 30, 2009 at 12:28:03PM -0700, Bernard Li wrote:

 I have a Fedora 9 VM that I can use to bootstrap in the future --
 would the autotools that come with that version work?

 something with libtool 2.2 probably better, as well as something
 that is still getting updates (in case there are bugs that need
 to be fixed).

 Fedora 12 is going to be released in a couple of weeks and
 therefore Fedora 10 will go out of support a month after that,
 leaving Fedora 9 EOL for more than 3 months already :

  https://www.redhat.com/archives/fedora-announce-list/2009-July/msg4.html

While Fedora 9 may be EOL, there are older distributions, such as
CentOS4.x that are not.  Is there an official policy on what distros
are supported?  I can't find a list on the website or wiki.  The
'ganglia.pod' file claims:

  Ganglia runs on Linux (i386, ia64, sparc, alpha, powerpc, m68k,
mips, arm, hppa, s390),
  FreeBSD, NetBSD, OpenBSD, DragonflyBSD, MacOS X, Solaris, AIX, IRIX,
Tru64, HPUX and
  Windows NT/XP/2000/2003/2008 making it as portable as it is scalable.

Is that still true?  The file was last updated in r1708 (Aug 2008), but
the vast bulk of the content is from r225 when the file was created
back in 2003.  When was the last time someone built ganglia on IRIX,
or Linux-hppa (shudder)?  This of course ignores various
distributions, which seems to be the problem at hand.

This all said, I strongly suspect that a build that works on
RHEL4/CentOS4 will work with Fedora9 (if there's a desire to support
that distribution).  Similarly, a build that works for older SuSE
releases would be useful as well.  And let's not forget the ancient
Debian systems out there. ;-)

-- 
Jesse Becker



Re: [Ganglia-developers] Graph.php patch for rrdtool error

2009-11-03 Thread Jesse Becker
 (a bit late in my reply--sorry!)

On Mon, Oct 12, 2009 at 04:01, Jeroen Nijhof jeroen.nij...@sara.nl wrote:
 Hi,

 I noticed that when rrdtool doesn't create an image because of some error,
 you get a broken image.
 I've written a small patch which checks for rrdtool errors and, if one is
 found, sends a transparent 1-pixel GIF instead.
 Maybe instead of a transparent pixel we can create a nice N/A image or
 something, but for now this will do the job.

Thanks for the patch.

 This patch is against the latest stable release.

This will go against the development release to be included in 3.2.x.
If there is a desire, it could be backported to the 3.1.x release as
well.

One minor issue:  Internet Explorer did not support inline images
until IE8.  So this patch will not work with those older versions of IE.
This could, theoretically, be an issue for some users.  According to
W3Schools [1], IE 6 and 7 still account for about 25% of browser hits;
however, this number is steadily falling.


[1] http://www.w3schools.com/browsers/browsers_stats.asp
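
For context, a sketch of what an inline image looks like, assuming
"inline images" here means data: URIs. It is written in Python for
illustration and is not the PHP from the patch; the byte string is the
commonly used 43-byte, 1x1 transparent GIF, and the helper name is
hypothetical.

```python
import base64

# 1x1 transparent GIF89a, spelled out field by field.
TRANSPARENT_GIF = bytes.fromhex(
    "474946383961"          # signature "GIF89a"
    "01000100800000"        # 1x1 screen, global colour table present
    "000000ffffff"          # two-entry colour table
    "21f9040100000000"      # graphic control extension, index 0 transparent
    "2c000000000100010000"  # image descriptor for the single pixel
    "0202440100"            # LZW-compressed pixel data
    "3b"                    # trailer
)


def data_uri(payload, mime="image/gif"):
    """Wrap raw image bytes in a data: URI usable as an <img> src."""
    encoded = base64.b64encode(payload).decode("ascii")
    return "data:%s;base64,%s" % (mime, encoded)


print('<img src="%s" alt="N/A">' % data_uri(TRANSPARENT_GIF))
```

Browsers without data: URI support would still show a broken image, so
serving the same bytes from a URL with an image/gif content type would
be the fallback.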


-- 
Jesse Becker



Re: [Ganglia-developers] Feeble attempt at gmond aliasing

2009-10-22 Thread Jesse Becker
On Thu, Oct 22, 2009 at 10:58, Brad Nicholes bnicho...@novell.com wrote:
 Unless I am misunderstanding the issue, a missing configuration option
 shouldn't be a problem for libconfuse.

That directly contradicts the libconfuse docs; however, those docs are
limited, at best, and I suppose could be wrong or misleading.

 Follow the 'Title' configuration directive on a metric.  Every metric can
 optionally have a title that is ultimately passed up through the XML.  The
 code in gmond.c asks libconfuse for the title when the metric definition is
 read.  If no title has been given in the configuration file, then the return
 from libconfuse, when asked for the title, is NULL.

I'd tried doing something like this in some test code, and hit
libconfuse errors (along the lines of "all config options must be
defined, exiting!").  I didn't have a good example to work from
though, and I'll check out the 'Title' code as a reference.

Thanks for the pointers.


-- 
Jesse Becker



Re: [Ganglia-developers] Feeble attempt at gmond aliasing

2009-10-21 Thread Jesse Becker
Minor update on this:

It appears that libconfuse is completely unable to handle
missing/default values for configuration options[1].  So adding an
'alias' option to gmond will mean that every gmond.conf file has to be
updated to include an alias= line.

The libconfuse documentation is...limited.  Could someone more
familiar with it than I am offer suggestions as to how to set a
default value and handle the case where the alias= line is not
present?

[1] This is a really stupid design decision, IMO. :-(

On Thu, Oct 1, 2009 at 22:08, Jesse Becker haw...@gmail.com wrote:
 Here's my poor attempt at a patch to add aliasing to gmond, in an
 effort to stimulate some discussion on the topic.  The patch is
 against trunk.  I've done some basic testing (e.g. no immediate core
 dumps), but that's it for the moment.

 Comments?  Improvements?

 Index: lib/libgmond.c
 ===
 --- lib/libgmond.c      (revision 2093)
 +++ lib/libgmond.c      (working copy)
 @@ -66,6 +66,7 @@
   CFG_BOOL("gexec", 0, CFGF_NONE),
   CFG_INT("send_metadata_interval", 0, CFGF_NONE),
   CFG_STR("module_dir", NULL, CFGF_NONE),
 +  CFG_STR("alias", NULL, CFGF_NONE),
   CFG_END()
  };

 Index: gmond/gmond.c
 ===
 --- gmond/gmond.c       (revision 2093)
 +++ gmond/gmond.c       (working copy)
 @@ -301,6 +301,18 @@
  }

  static void
 +handle_alias( void ) {
 +       cfg_t *tmp = cfg_getsec( config_file, "globals");
 +       char *tmp_myname;
 +       /* Allow for hostname aliases */
 +       tmp_myname = cfg_getstr(tmp, "alias");
 +       if (tmp_myname) {
 +               strncpy(myname, tmp_myname, APRMAXHOSTLEN);
 +               debug_msg("Aliasing hostname to [%s]", myname);
 +       }
 +}
 +
 +static void
  daemonize_if_necessary( char *argv[] )
  {
   int should_daemonize;
 @@ -2630,6 +2642,8 @@

   gmond_argv = argv;

 +  myname[0] = '\0';
 +
   if (cmdline_parser (argc, argv, &args_info) != 0)
       exit(1) ;

 @@ -2658,6 +2672,7 @@
     }

   process_configuration_file();
 +  handle_alias();

   if(args_info.metrics_flag)
     {
 @@ -2686,7 +2701,8 @@
   load_metric_modules();

   /* Collect my hostname */
 -  apr_gethostname( myname, APRMAXHOSTLEN+1, global_context);
 +  if (!*myname)
 +    apr_gethostname( myname, APRMAXHOSTLEN+1, global_context);

   apr_signal( SIGPIPE, SIG_IGN );
   apr_signal( SIGINT, sig_handler );


 --
 Jesse Becker




-- 
Jesse Becker



Re: [Ganglia-developers] Saving energy in Ganglia clusters

2009-10-20 Thread Jesse Becker
On Tue, Oct 20, 2009 at 02:19, Prashanth Mohan prashmo...@gmail.com wrote:
 Hello,

 I am looking at mechanisms to save energy on ganglia clusters. I was
 wondering if there is any ongoing or previous work in this regard. The

Always an interesting and useful subject.  Have you seen project
Hedeby from Sun?  It sounds like a similar sort of idea:
http://hedeby.sunsource.net/

I keep meaning to try putting it on my cluster, but lack of time...  sigh

 idea is essentially to put Idle machines to sleep and wake them up
 (perhaps using wake-on-lan) when there is a job to be run on the node

IPMI is frequently a better choice, but isn't available everywhere.
Wake-on-LAN can be flaky, but may be more common.


 (using Gexec). While GMond will make the decision as to whether (and
 when) to put the machine to sleep, Gexec will perform appropriate
 scheduling based on the power-performance tradeoff that the user
 provides.

This is very much in line with what Hedeby does.

 Also, where can I find statistics about the largest Ganglia clusters?
 This is basically for knowing whether Ganglia production clusters are
 large enough so that we can add scheduling logic that appropriately
 turns off machines which are spatially together, so that the HVAC
 resources for the particular area can also be turned off (further
 saving energy).

There was a thread about this on the list some months ago.  I suggest
trying to find that as a start.  There are definitely some large systems
out there (thousands of nodes).

You can also look for (some) pages that link back to Ganglia, which
will include a lot of large clusters:
http://www.google.com/search?q=link:http://ganglia.sourceforge.net/



-- 
Jesse Becker



Re: [Ganglia-developers] Feeble attempt at gmond aliasing

2009-10-09 Thread Jesse Becker
On Fri, Oct 9, 2009 at 15:52, Spike Spiegel fsm...@gmail.com wrote:
 On Fri, Oct 2, 2009 at 9:59 PM, Jesse Becker haw...@gmail.com wrote:
 On Fri, Oct 2, 2009 at 10:35, Brad Nicholes bnicho...@novell.com wrote:
 How well does this fit into the previous discussions of using a GUID to 
 identify a box rather than an IP or FQDN?  Are aliasing and GUID 
 identifiers related or are they two separate issues?

 I think that is a separate, but related, discussion.  Perhaps I'm
 wrong, but there doesn't seem to be a clear consensus about using
 GUIDs vs. FQDN vs. IPs vs. something else (again, someone correct me
 if I'm wrong).  Maybe we should open that discussion again?

 why a separate discussion? You're adding a config option which you're
 free to set to whatever you think and that to me covers all cases, you
 could set it to the hostname, an ip or a GUID. Personally I find that
 in large infrastructure naming machines meaningfully is a lost game,
 the host itself is more or less irrelevant and what matters is the
 service associated to it, so I'd assign a GUID myself and maintain the
 association with the service somewhere else, maybe as a metric itself.
 On the other hand for the small shop host names are a pretty decent
 approach to map your infrastructure so they would prolly want to use
 that as an identifier. Either way having it as an option is a safe way
 of handling it and avoids surprises at the gmetad end (I don't like
 this thing that the receiver resolves the ip of the sender to decide
 its name).

The GUID discussion I referred to was about whether gmond/gmetad should be
rewritten, top-to-bottom, to use GUIDs instead of relying on DNS/IP
addresses.  My understanding is that everything would have to use them,
including the .rrd files underneath.  That is, IMO, a big overhaul.

Adding aliasing is theoretically a smaller change, that I think works
within the existing code.  This is what I'm proposing to
add--something simple, and inexpensive to implement, but hopefully
useful to many people.

Thus, I see it as separate, but perhaps complementary/related.

-- 
Jesse Becker



Re: [Ganglia-developers] Feeble attempt at gmond aliasing

2009-10-09 Thread Jesse Becker
On Fri, Oct 9, 2009 at 17:05, Spike Spiegel fsm...@gmail.com wrote:
 As to Rick's comments I believe they are only valid if we assume that
 the string representing a host should be its ip or the fqdn resolving
 to it, which I think is one of the many problems this alias patch is
 meant to solve (instances on EC2 or with multiple interfaces are a
 pita if things rely on ips/PTR for identification).

I'm a *huge* fan of giving admins powerful tools that do specifically
what they ask for.  The admins should be clueful enough to know when
something is appropriate and when it isn't.  If that means they break
their own system due to a misconfiguration, so be it.  The default
setting, of course, should be to try to do the right thing--which
Ganglia does most of the time.

Several people have asked for this feature in the last year or two,
and I (dimly) recall them all having decent, if not very good, reasons
for wanting aliasing.

 what do we need next? people compiling gmond with this patch and testing?

Yes. (myself included)

 [1] I've seen that discussion coming up in several instances on the
 rrd ML and never go anywhere because of some big change that
 apparently would be necessary to implement that feature correctly.

Hmm.  Pity. :-(

-- 
Jesse Becker



Re: [Ganglia-developers] [PATCH] graph stats avg/min/max

2009-10-07 Thread Jesse Becker
On Wed, Oct 7, 2009 at 06:03, Ramon Bastiaans ramon.bastia...@sara.nl wrote:
 Hi Jesse,

 Here is the new patch. I rewrote it a bit, since my previous patch could
 give some formatting/alignment errors in certain situations.

I took a quick look this morning.

 This new patch now handles formatting/alignment properly for everything.

 Let me know if you get around to review it or have any feedback.

There are still a few alignment issues with the chart sizes; they are
related to the different number of lines of text in each chart (yes,
it's a pain, I know...).  It's possible that they could be addressed
more easily with newer versions of rrdtool (which can specify the size
of the image, and scale the chart to fit).

A few notes:

* There are a few places where I see negative values, that I think
should be caught by the RPN calculations.  Here's an example:
http://bayimg.com/EAEdDaACF
* Are the full Now/Max/Avg/Min stats really needed for all of the
metrics?  I am thinking mostly of metrics that are generally
static--such as node count and total RAM installed.
* The graph sizes when $graphreport_stats = false are misaligned.  I
think that the code is resizing the graphs, regardless of the number
of lines of text used in the legend.

And a general question to the list:
* When $graphreport_stats = false, should the behavior be to remove
all stats from the graph and have a minimalist view of the data, or to
show the more limited condensed view as currently in trunk?


-- 
Jesse Becker



Re: [Ganglia-developers] [PATCH] graph stats avg/min/max

2009-10-07 Thread Jesse Becker
On Wed, Oct 7, 2009 at 10:22, Jesse Becker haw...@gmail.com wrote:
 On Wed, Oct 7, 2009 at 06:03, Ramon Bastiaans ramon.bastia...@sara.nl wrote:
 And a general question to the list:
 * When $graphreport_stats = false, should the behavior be to remove
 all stats from the graph and have a minimalist view of the data, or to
 show the more limited condensed view as currently in trunk?

Of course, the current 3.1.x stable code doesn't show any stats, which
some people may like.  How do people feel about that option as well?
Essentially, conf.php would have a $show_stats option, with possible
values of {no, yes, extended}.  These would correspond respectively to
the behavior of 3.1.x, trunk, and Ramon's patch.

The obvious downside to this is an increase in code complexity.
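
To make the three-way option concrete, here is a small sketch of how
such a setting could be validated.  It is written in C only for
illustration (the real option would live in conf.php), and the enum and
function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the proposal only lists the three values. */
enum show_stats {
    SHOW_STATS_NO,        /* 3.1.x behavior: no stats on the graph */
    SHOW_STATS_YES,       /* trunk behavior: condensed stats */
    SHOW_STATS_EXTENDED,  /* Ramon's patch: full Now/Max/Avg/Min */
    SHOW_STATS_INVALID    /* anything else */
};

/* Map the configured string onto one of the three modes. */
static enum show_stats
parse_show_stats(const char *value)
{
    if (value == NULL)
        return SHOW_STATS_INVALID;
    if (strcmp(value, "no") == 0)
        return SHOW_STATS_NO;
    if (strcmp(value, "yes") == 0)
        return SHOW_STATS_YES;
    if (strcmp(value, "extended") == 0)
        return SHOW_STATS_EXTENDED;
    return SHOW_STATS_INVALID;
}
```

The extra complexity is mostly in the graph code, which would need one
branch per mode; the option parsing itself stays small.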

-- 
Jesse Becker



Re: [Ganglia-developers] Feeble attempt at gmond aliasing

2009-10-02 Thread Jesse Becker
On Fri, Oct 2, 2009 at 01:43, Rick Cobb rc...@quantcast.com wrote:
 Well, as far as generating discussion goes, I think we're better off
 only aliasing/spoofing IP addresses @ the gmond level, and resolving
 all names with gmetad.   That removes all issues of, e.g., whether the
 host thinks it should send a FQDN or just a basename, or how well
 dns / resolv.conf is set up on every machine in every cluster, etc.
 Only the gmetad servers need to have well-configured resolvers, and
 there are orders of magnitude fewer of those in many networks.
 Besides: fewer system calls on the boxes that are doing the real work
 our clusters are built to do.

All good points.  Sending only the IP address could also potentially
make the packets slightly smaller, as an IPv4 address fits into
32 bits total, instead of one byte per character.  (Of course,
this nicely avoids the whole IPv6 and wide-char hostname issue.)
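
A quick illustration of the size difference, as a sketch only: it
compares the raw binary address with the raw text and ignores whatever
framing or padding gmond's wire format adds.  The helper names are
hypothetical.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Bytes needed for a dotted-quad address once converted to binary
 * form, or 0 if the string is not a valid IPv4 address. */
static size_t
ipv4_binary_size(const char *dotted)
{
    struct in_addr addr;

    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return 0;
    return sizeof addr.s_addr;
}

/* Bytes needed to send the same identifier as text, one per character. */
static size_t
text_size(const char *name)
{
    return strlen(name);
}
```

For "192.168.100.200" this is 4 bytes against 15, and a typical FQDN is
longer still.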

In my (again feeble) defense, there's also nothing stopping anyone
from setting IP addresses in the alias= field.

There are, it seems, two issues related to this.  The first is that many
people have requested aliasing abilities for gmond for various
reasons.  The other is a broader shift in what gmond actually reports
(i.e. sending FQDN or just IP).  Fixing the first issue doesn't
prevent fixing the second; we can do it in stages.

 Never did get that patch finished, though, so I probably should stay
 out of the discussion :-)

Incorrect!  :-)  Finish your patch, and let's see it.  I'm not deeply
attached to what I posted.

-- 
Jesse Becker



Re: [Ganglia-developers] graph report cacti style min/max/average/etc

2009-10-02 Thread Jesse Becker
On Fri, Oct 2, 2009 at 05:36, Ramon Bastiaans ramon.bastia...@sara.nl wrote:
 Hi Jesse,

 I was unaware there was something similar in trunk already. However, I
 took a look and what is in trunk is not exactly what I had in mind.

 My version is now configurable through conf.php and automatically
 resizes/aligns the legend font according to the graph size.

 I will send the patch and then you guys can see if you like it or not.

Great! We look forward to it.


-- 
Jesse Becker



Re: [Ganglia-developers] Feeble attempt at gmond aliasing

2009-10-02 Thread Jesse Becker
On Fri, Oct 2, 2009 at 10:35, Brad Nicholes bnicho...@novell.com wrote:
 How well does this fit into the previous discussions of using a GUID to 
 identify a box rather than an IP or FQDN?  Are aliasing and GUID identifiers 
 related or are they two separate issues?

I think that is a separate, but related, discussion.  Perhaps I'm
wrong, but there doesn't seem to be a clear consensus about using
GUIDs vs. FQDN vs. IPs vs. something else (again, someone correct me
if I'm wrong).  Maybe we should open that discussion again?

One advantage to adding aliasing, like this patch tries to do, is that
it doesn't require changing any other parts of the system.  This is a
fairly non-intrusive patch regarding code.  Now, it's entirely
possible that this will have more subtle ramifications elsewhere
though.  In fact, that's another point in favor of adding aliasing:
it should help flush out other subtle dependencies on hostnames and IP
addresses.

-- 
Jesse Becker



[Ganglia-developers] Feeble attempt at gmond aliasing

2009-10-01 Thread Jesse Becker
Here's my poor attempt at a patch to add aliasing to gmond, in an
effort to stimulate some discussion on the topic.  The patch is
against trunk.  I've done some basic testing (e.g. no immediate core
dumps), but that's it for the moment.

Comments?  Improvements?

Index: lib/libgmond.c
===
--- lib/libgmond.c  (revision 2093)
+++ lib/libgmond.c  (working copy)
@@ -66,6 +66,7 @@
   CFG_BOOL("gexec", 0, CFGF_NONE),
   CFG_INT("send_metadata_interval", 0, CFGF_NONE),
   CFG_STR("module_dir", NULL, CFGF_NONE),
+  CFG_STR("alias", NULL, CFGF_NONE),
   CFG_END()
 };

Index: gmond/gmond.c
===
--- gmond/gmond.c   (revision 2093)
+++ gmond/gmond.c   (working copy)
@@ -301,6 +301,18 @@
 }

 static void
+handle_alias( void ) {
+   cfg_t *tmp = cfg_getsec( config_file, "globals");
+   char *tmp_myname;
+   /* Allow for hostname aliases */
+   tmp_myname = cfg_getstr(tmp, "alias");
+   if (tmp_myname) {
+       strncpy(myname, tmp_myname, APRMAXHOSTLEN);
+       debug_msg("Aliasing hostname to [%s]", myname);
+   }
+}
+
+static void
 daemonize_if_necessary( char *argv[] )
 {
   int should_daemonize;
@@ -2630,6 +2642,8 @@

   gmond_argv = argv;

+  myname[0] = '\0';
+
   if (cmdline_parser (argc, argv, &args_info) != 0)
       exit(1) ;

@@ -2658,6 +2672,7 @@
 }

   process_configuration_file();
+  handle_alias();

   if(args_info.metrics_flag)
 {
@@ -2686,7 +2701,8 @@
   load_metric_modules();

   /* Collect my hostname */
-  apr_gethostname( myname, APRMAXHOSTLEN+1, global_context);
+  if (!*myname)
+    apr_gethostname( myname, APRMAXHOSTLEN+1, global_context);

   apr_signal( SIGPIPE, SIG_IGN );
   apr_signal( SIGINT, sig_handler );


-- 
Jesse Becker



Re: [Ganglia-developers] Another interface for Ganglia stats

2009-09-26 Thread Jesse Becker
On Sat, Sep 26, 2009 at 18:57, Vladimir Vuksan vli...@veus.hr wrote:
 Question is where do we go next :-) ?

Code.

-- 
Jesse Becker



Re: [Ganglia-developers] Three .spec file issues

2009-09-18 Thread Jesse Becker
On Fri, Sep 18, 2009 at 06:40,  daniel.poc...@barclayscapital.com wrote:

 2) The 'Release' macro is not properly set in the specfile.
 This is because it is defined by the @REL@ autoconf variable,
 which is hard-coded to 1 in configure.in.  I propose that
 we change this.
 The simplest thing would be to remove @REL@ from
 ganglia.spec.in, and update the Release macro manually.
 The other option is to remember to update REL in configure.in
 each time a change is made to the ganglia.spec.in file.

 I actually introduced that variable, thinking that it would be useful to
 have it in a single place (configure.in) and use it for generating both
 the spec file and the Solaris pkginfo file.
 Obviously, some changes won't impact both platforms (e.g. a change
 within the spec file doesn't impact Solaris), but maybe someone will
 rebuild the packages for all platforms with no code changes but a
 different configure option and they will use @REL@ to show that
 something has changed - does that seem reasonable?

 Therefore, I'm not keen to go back to the manual solution, but some
 alternative to @REL@...

Fair enough.  I didn't realize it was a recent addition, and given the
long %changelog, was surprised to see it still at 1.  I'll add a
note in the specfile about updating it, and bump the version in trunk.

-- 
Jesse Becker



Re: [Ganglia-developers] Three .spec file issues

2009-09-18 Thread Jesse Becker
On Fri, Sep 18, 2009 at 08:58,  daniel.poc...@barclayscapital.com wrote:
 Fair enough.  I didn't realize it was a recent addition, and
 given the long %changelog, was surprised to see it still at
 1.  I'll add a note in the specfile about updating it, and
 bump the version in trunk.

 I've noticed different ways of doing this, too

 Sometimes I see:

 1.0-1
 1.0-2
 1.1-1 (release number reverts back to 1 on new version of code)
 1.1-2

 and other projects do something like:

 1.0-1
 1.0-2
 1.1-2 (nothing changed in spec file, release number unchanged)
 1.1-3

 Is there any preference for how this should work in Ganglia?  Can it be done 
 in a unified way across packaging systems, or should we avoid any such 
 unification?

I prefer a monotonically non-decreasing release number (the 2nd option).
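
The difference between the two schemes comes down to what happens to
the release number when the upstream version changes.  A tiny sketch,
with hypothetical names, of that one decision:

```c
/* Hypothetical names for the two conventions described above. */
enum release_policy {
    RELEASE_RESETS_ON_NEW_VERSION,  /* 1.0-2 is followed by 1.1-1 */
    RELEASE_NEVER_DECREASES         /* 1.0-2 is followed by 1.1-2 */
};

/* Release number to use for the first package of a new upstream
 * version, given the release number of the previous package. */
static int
release_for_new_version(int previous_release, enum release_policy policy)
{
    if (policy == RELEASE_RESETS_ON_NEW_VERSION)
        return 1;
    return previous_release;
}
```

Under the second policy the release number only moves when the
packaging itself changes, so it never goes backwards.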

-- 
Jesse Becker



[Ganglia-developers] Three .spec file issues

2009-09-17 Thread Jesse Becker
1)  I've made a few small changes to the ganglia.spec.in file (r2087
and r2088) to use the %{version} tag instead of @VERSION@ in a few
places.  This is considered to be 'correct' according to RPM package
building, and allows for easier upgrades if a user needs to take the
.spec file and update it themselves.  The initial './bootstrap' will
still have to be run, since @VERSION@ does need to be expanded at
least once to define an RPM macro, and set a few items in the
comments.

2) The 'Release' macro is not properly set in the specfile.  This is
because it is defined by the @REL@ autoconf variable, which is
hard-coded to 1 in configure.in.  I propose that we change this.
The simplest thing would be to remove @REL@ from ganglia.spec.in, and
update the Release macro manually.  The other option is to remember
to update REL in configure.in each time a change is made to the
ganglia.spec.in file.

3) There are several sub-packages created by the specfile:
  libganglia-3_1_0
  ganglia-python
  ganglia-web (noarch)
  ganglia-devel
  ganglia-gmond
  ganglia-gmetad

The ganglia-devel package includes a /usr/lib/libganglia.so *symlink*
that points to the actual .so included in the libganglia package.  I
think that this symlink should be in the libganglia package.  Is there
a particular reason it is in ganglia-devel instead?

-- 
Jesse Becker



Re: [Ganglia-developers] Undefined variable: rrd_options in functions.php

2009-09-16 Thread Jesse Becker
On Wed, Sep 16, 2009 at 08:41:29AM -0400, Daniel Pocock wrote:
 I'm a little surprised that we have to check the existence of the 
 variable every time it is used - I looked at index.php and graph.php, 
 and both of them appear to read eval_config.php before reading functions.php

I agree.  There was some discussion a while back about having a
centralized 'validation' and 'sanitation' routine through which all
variables would be passed before use.  To me, the problem seems to be that
PHP doesn't treat an undefined value as false in quite the same way as
Perl[1].  However, the code is written, in many places, as though it *is*
treated the same.  So we get lots of these undef warnings in the logs.
The code functions, but is noisy.


 If the variable's existence is not known at this point in functions 
 (despite a default value in eval_config.php), then aren't we running the 
 risk that its value might have been lost somewhere?

I think that may be the case.  I was testing this out a little bit
yesterday in trunk, and couldn't get the eval_config.php bit to work
correctly.  Thus, my patch[2].

I'm slightly concerned about adding ad-hoc things like this, and wonder
if we need a more general solution to initializing all of the variables
that may be needed.  Obviously, there are lots of things in conf.php
that should be included at all times.  There are also lots of things in
get_context.php.  And just to make things interesting, there's
the issue of conf.php being generated from conf.php.in, instead of just
included directly.

For the conf.php problem, perhaps this idea would work?  Leave conf.php
alone, as a generated file to allow for install-time customization by
the user.  Create a new config file that has options that shouldn't
usually (but could) be changed by the local admin.  Essentially, a "don't
change this stuff unless you know what you are doing" kind of file.  Yes,
changes to this file will make upgrades harder, but the user was warned.
If they know what they are doing, then they can presumably manage to
patch the file during an upgrade as well.


[1] I won't argue that either way is good or bad.  Not here, at least. ;-)
[2] If there is a better solution, feel free to revert/replace this patch.

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)



Re: [Ganglia-developers] Undefined variable: rrd_options in functions.php

2009-09-15 Thread Jesse Becker
On Thu, Sep 10, 2009 at 13:28, Daniel Pocock dan...@pocock.com.au wrote:
 The reason I didn't put $rrd_options in conf.php is because its value
 is meant to be derived from other variables, in other words, the
 administrator should not change $rrd_options itself.

 However, if get_context.php is not always read, then maybe there is
 another approach: I can create a `do not touch/advanced users only'
 section at the bottom of conf.php, and things like $rrd_options can go
 there.


 I haven't seen any other comments on this, so it's now at the  bottom of
 conf.php.in


 The only problem that I can think is that since conf.php doesn't get
 updated automatically by packaging tools (you don't want to overwrite
 the admins changes) a minor update is now more complicated than it
 should be. I'll prefer to see not admin changeable variables somewhere
 else than conf.php if they are required.

 Good point - I've now created eval_config.php, and that pulls in
 conf.php before creating the derived values

This still doesn't seem to work in trunk.  I'm still getting undefined
errors from line 267 in functions.php.  However, it seems we may be
making this much harder than needed.


[becker...@saturn ~/ganglia/versions/trunk/monitor-core/web]$ svn diff -x -w -c 2068
Index: functions.php
===
--- functions.php   (revision 2067)
+++ functions.php   (revision 2068)
@@ -223,6 +223,10 @@
   {
  $out = array();

+ # If $rrd_options isn't set from conf.php or eval_config.php
+ if (!isset($rrd_options))
+$rrd_options = '';
+
  $rrd_dir = "$rrds/$clustername/$host";
  if (file_exists("$rrd_dir/$metricname.rrd")) {
     $command = RRDTOOL . " graph /dev/null $rrd_options ".
@@ -263,6 +267,10 @@
else
   $sum_dir = "$rrds/$clustername/__SummaryInfo__";

+# If $rrd_options isn't set from conf.php or eval_config.php
+if (!isset($rrd_options))
+$rrd_options = '';
+
 $command = RRDTOOL . " graph /dev/null $rrd_options ".
   "--start $start --end $end ".
   "DEF:avg='$sum_dir/$metricname.rrd':'sum':AVERAGE ".


-- 
Jesse Becker



[Ganglia-developers] Undefined variable: rrd_options in functions.php

2009-09-02 Thread Jesse Becker
I think that $rrd_options might be defined in the wrong spot?

I'm getting lots of "Undefined variable: rrd_options in functions.php" error
messages in trunk.  It is used in functions.php in two places, as of r2049.
Digging a bit, I see that $rrd_options is defined at the top of
get_context.php (also from r2049).  get_context.php includes functions.php,
but not vice-versa.

It seems that the logical place for this variable is actually in conf.php, and
then functions.php could call 'include_once("conf.php")'.

Does this sound okay, or am I missing something?

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)



Re: [Ganglia-developers] fix for bug 232

2009-07-16 Thread Jesse Becker
On Thu, Jul 16, 2009 at 10:48, Brad Nicholes <bnicho...@novell.com> wrote:
 On 7/16/2009 at 8:07 AM, in message <4a5f3430.20...@pocock.com.au>, Daniel
 Pocock <dan...@pocock.com.au> wrote:


 I tried to attach this solution to the bug report, but I get this error:

 You did not enter a valid attachment number.


 Anyhow, this is a solution for bug 232:

 http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=232

 As a consequence of applying this patch:
 - whenever an RRD is created/updated, the hostname directory name will
 be converted to lowercase
 - any capitalization can be used with the `h' parameter to the web interface
 - whenever gmetad receives a hostname in the XML, it will use a
 non-case-sensitive comparison to decide if it already has data for that host
 - the XML emitted by gmetad will show the capitalization that was
 received in the XML, not the lowercase version

 Anyone applying this patch needs to rename all their hostname
 directories to lowercase.

 Regards,

 Daniel

 This patch seems reasonable to me.  The only part that bothers me is the fact 
 that an upgrade from a previous version might break existing installs unless 
 they rename all of their rrd directories.  That could be a problem for some 
 users that have a large number of monitored boxes.

Perhaps we should include a small script to help with this in the
contrib/ directory?
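
A minimal sketch of what such a helper might look like, written in Python
for illustration.  It assumes the <rrd_root>/<cluster>/<host> layout seen in
functions.php; the function name, the dry_run flag, and the decision to skip
__SummaryInfo__ are assumptions of this sketch, not part of the patch:

```python
import os

# Directories that are not hostnames and must keep their capitalization.
SKIP = {"__SummaryInfo__"}

def lowercase_host_dirs(rrd_root, dry_run=True):
    """Rename <rrd_root>/<cluster>/<Host> directories to lowercase.

    Returns the (old_path, new_path) pairs that were, or would be, renamed.
    """
    renamed = []
    for cluster in sorted(os.listdir(rrd_root)):
        cluster_dir = os.path.join(rrd_root, cluster)
        if not os.path.isdir(cluster_dir):
            continue
        for host in sorted(os.listdir(cluster_dir)):
            old = os.path.join(cluster_dir, host)
            new = os.path.join(cluster_dir, host.lower())
            if host in SKIP or host == host.lower() or not os.path.isdir(old):
                continue
            if os.path.exists(new):
                # Both "Node1" and "node1" exist; merging RRDs needs a human.
                print("skipping %s: %s already exists" % (old, new))
                continue
            renamed.append((old, new))
            if not dry_run:
                os.rename(old, new)
    return renamed
```

Running it with dry_run=True first lists what would change without touching
anything.  On a case-insensitive filesystem the "already exists" check would
skip every directory, so this sketch only makes sense on a case-sensitive one.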


-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Disk IO as gmond core metric

2009-07-10 Thread Jesse Becker
On Fri, Jul 10, 2009 at 14:07, Brad Nicholes <bnicho...@novell.com> wrote:

 Anyway, everything else worked just fine.  I have a couple of suggestions. You 
 should probably include a sample .conf file so that the user doesn't have to 
 go figure everything out like all of the module information.  I have attached 
 the one that I used.  This way they can just drop the .conf file in the 
 conf.d directory and everything just works.  You also might want to update 
 the COPYING and INSTALL files to reflect the current state of things.  The 
 INSTALL file should contain information about building and installing the 
 module.

Would it be possible, in the future--I know you can't do this now--to
allow for module configuration directly in gmond.conf?

I absolutely think it should be added to the wiki.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Metric modules for Perl and Ruby (was:Re: Disk IO as gmond core metric)

2009-07-10 Thread Jesse Becker
On Fri, Jul 10, 2009 at 11:21, Brad Nicholes <bnicho...@novell.com> wrote:
 On 7/9/2009 at 5:43 PM, in message
 <8121824c0907091643od6832c5y3c4ffa37696e4...@mail.gmail.com>, JB Kim
 <jbremn...@gmail.com> wrote:

 Lastly, is anyone already working on a perl equivalent module of mod_python?
 With the 3.1.x gmond framework, it would be definitely possible to
 further extend DSO functionality
 by running embedded interpreters like perl and R.



 Not that I know of but this is something that we have been talking about 
 since the introduction of mod_python.  In fact one of the reasons why the 
 python interface was embedded this way was to allow for other interpreters to 
 do the same.  The intention was to someday write a mod_perl, mod_ruby, 
 mod_whatever in order to support other languages.

Not official, but there is the embeddedgmetric project.  It's grown
quite a bit since I last checked:

  http://code.google.com/p/embeddedgmetric/



-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Metric modules for Perl and Ruby (was:Re: Disk IO as gmond core metric)

2009-07-10 Thread Jesse Becker
On Fri, Jul 10, 2009 at 16:37, JB Kim <jung.bae@morganstanley.com> wrote:
 Hi guys,

 Thanks for the feedback. I think the embeddedgmetric is a bit different
 from what brad and I were discussing.

 The embeddedgmetric proj seems like it attempts to provide programmatic API
 interface to inject metrics into gmond network from multitude of languages,
 for folks who want to avoid running gmetric from the command line.

Yeah, I know that it's different, but it also has a perl (and other)
implementation.  It's the barest, smallest, minimal bit of a start for
a Perl library, but it's the *only* start I know of. ;-)

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Metric modules for Perl and Ruby (was:Re: Disk IO as gmond core metric)

2009-07-10 Thread Jesse Becker
On Fri, Jul 10, 2009 at 16:44, JB Kim <jung.bae@morganstanley.com> wrote:

 Well, I was thinking about tackling mod_perl or mod_R on the side as time
 permits if no one's already working on it.

A pure-perl implementation has been on my to-do list for a long
time, but I've never had time to work on it.

I'm happy to help test and do minor work, but my time is short--as
always.  I've a number of perl scripts that exist only to call
gmetric.  Keeping it within Perl would be quite nice.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] PATCH: ensure string metrics are null terminated for python-based user metrics

2009-07-09 Thread Jesse Becker
Greg Bruno wrote:
 for ganglia v3.1.2, the patch below ensures that all 'string' metrics
 are null terminated for python-based user metrics. without this patch,
 any string metric that is longer than MAX_G_STRING_SIZE (currently 32
 bytes), will not be null terminated and will corrupt the value in the
 metric.

Committed to trunk in r2000, and proposed for backporting to the 3.1.x tree. 
Thanks for the patch.
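
For readers unfamiliar with the strncpy() pitfall, here is a small model of
it in Python.  The helper names and the shrunken buffer size are assumptions
for illustration only; the real code is the C in mod_python.c.  The model
mimics the C rule that strncpy() pads with NULs only when the source is
shorter than the limit:

```python
def strncpy(dst, src, n):
    """Model C strncpy: copy at most n bytes, NUL-pad only if src is shorter."""
    chunk = src[:n]
    dst[:n] = chunk + b"\0" * (n - len(chunk))

def c_string(buf):
    """Read buf as C would: up to the first NUL, or the whole buffer if none."""
    end = buf.find(b"\0")
    return bytes(buf if end < 0 else buf[:end])

LEN = 8                       # stand-in for MAX_G_STRING_SIZE, kept tiny
bfr = bytearray(b"#" * LEN)   # pretend this is uninitialized memory

strncpy(bfr, b"a-long-metric-value", LEN)
unterminated = b"\0" not in bfr   # True: no room was left for a terminator

bfr[LEN - 1] = 0                  # the fix from the patch: bfr[len-1] = '\0'
print(c_string(bfr))              # the value is truncated but well-formed
```

In real C the unterminated case is worse than this model suggests, because a
reader keeps going past the end of the buffer until it happens to hit a NUL.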

 also, regarding MAX_G_STRING_SIZE, would it be possible to increase it
 in future releases? i've currently set it to 128 in
 include/gm_value.h.

I don't object, although I'm not the most familiar with this part of the code. 
  I'm curious:  what metrics are you trying to use that are that long?

 
  - gb
 
 
 =
 *** ganglia-3.1.2/gmond/modules/python/mod_python.c   2009-01-28
 15:23:20.0 -0800
 --- /tmp/patch-files/gmond/modules/python/mod_python.c2009-07-08
 13:04:40.0 -0700
 ***
 *** 127,134 
 --- 127,138 
   }
   else if (PyString_Check(dv)) {
   char* p = PyString_AsString(dv);
   strncpy(bfr, p, len);
 + /*
 +  * ensure bfr is null terminated
 +  */
 + bfr[len-1] = '\0';
   }
   else if (PyFloat_Check(dv)) {
   double v = PyFloat_AsDouble(dv);
   snprintf(bfr, len, "%f", v);
 ***
 *** 422,430 
   gmi->name = apr_pstrdup (pool, minfo->mname);
   gmi->tmax = minfo->tmax;
   if (!strcasecmp(minfo->vtype, "string")) {
   gmi->type = GANGLIA_VALUE_STRING;
 ! gmi->msg_size = UDP_HEADER_SIZE+32;
   }
   else if (!strcasecmp(minfo->vtype, "uint")) {
   gmi->type = GANGLIA_VALUE_UNSIGNED_INT;
   gmi->msg_size = UDP_HEADER_SIZE+8;
 --- 426,434 
   gmi->name = apr_pstrdup (pool, minfo->mname);
   gmi->tmax = minfo->tmax;
   if (!strcasecmp(minfo->vtype, "string")) {
   gmi->type = GANGLIA_VALUE_STRING;
 ! gmi->msg_size = UDP_HEADER_SIZE+MAX_G_STRING_SIZE;
   }
   else if (!strcasecmp(minfo->vtype, "uint")) {
   gmi->type = GANGLIA_VALUE_UNSIGNED_INT;
   gmi->msg_size = UDP_HEADER_SIZE+8;
 =
 


-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)



Re: [Ganglia-developers] multicpu module on Cygwin or static builds

2009-06-05 Thread Jesse Becker
Got a patch? :)

On Thu, Jun 4, 2009 at 18:34, Daniel Pocock <dan...@pocock.com.au> wrote:



 Is there a specific reason why the multicpu module should not be used on
 Cygwin builds (or any other static build)?

 I found that only some trivial changes were needed to make it compile
 and link:

 - the static modifier is needed in front of timely_file proc_stat,
 otherwise it conflicts with a variable of the same name in another
 module when linking

 - it needs to be added to GLDADD in gmond/Makefile.am

 - gmond/modules/cpu/Makefile.am needs to be tweaked to build it in a
 static build

 Maybe this can be enabled by default from the next release?

 Regards,

 Daniel






-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] [PATCH] isblank is undefined without ISOC99

2009-02-21 Thread Jesse Becker
committed to trunk.  r1967

On Sat, Feb 21, 2009 at 15:54, Justin Bronder <jsbron...@gentoo.org> wrote:
 I get the following warning when compiling 3.1.1 and 3.1.2.  Following the way
 that __USE_GNU is defined in libmetrics/linux/metrics.c, the attached patch
 removes this warning.

 if /bin/sh ../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I. 
 -I..-I../../lib -I../../include -march=nocona -O2 -pipe -Wall -MT 
 metrics.lo -MD -MP -MF .deps/metrics.Tpo -c -o metrics.lo metrics.c; \
then mv -f .deps/metrics.Tpo .deps/metrics.Plo; else rm -f 
 .deps/metrics.Tpo; exit 1; fi
  gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../../lib -I../../include -march=nocona 
 -O2 -pipe -Wall -MT metrics.lo -MD -MP -MF .deps/metrics.Tpo -c metrics.c  
 -fPIC -DPIC -o .libs/metrics.o
 metrics.c: In function 'update_ifdata':
 metrics.c:204: warning: implicit declaration of function 'isblank'


 diff -urN a/ganglia-3.1.1/libmetrics/linux/metrics.c 
 b/ganglia-3.1.1/libmetrics/linux/metrics.c
 --- a/ganglia-3.1.1/libmetrics/linux/metrics.c  2008-08-25 13:44:57.0 
 -0400
 +++ b/ganglia-3.1.1/libmetrics/linux/metrics.c  2008-11-18 21:33:01.370635031 
 -0500
 @@ -3,6 +3,9 @@
  #ifndef __USE_GNU
  #define __USE_GNU
  #endif
 +#ifndef __USE_ISOC99
 +#define __USE_ISOC99
 +#endif
  #include <string.h>
  #include <time.h>
  #include <unistd.h>


 --
 Justin Bronder






-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] gmond reload and execve

2009-02-20 Thread Jesse Becker
On Fri, Feb 20, 2009 at 05:56, <daniel.poc...@barclayscapital.com> wrote:
 There was previously some discussion about a wrapper binary for
 re-starting gmond after a config change.

Well, the usual way to do this is to catch a HUP signal, and do
whatever processing is needed.  I know that some programs will simply
call the read_config_file() routine, instead of fully exec()ing
again.
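
The re-read-on-HUP pattern can be sketched as follows.  This is an
illustration in Python, not gmond code; the class name, the loader callable,
and the poll() method are all assumptions of the sketch.  The handler only
records that a reload was requested, and the main loop does the real work:

```python
import signal

class ReloadableConfig:
    """Holds a config and re-reads it when a HUP has been received."""

    def __init__(self, loader):
        self._loader = loader      # callable that returns a fresh config
        self._pending = False
        self.config = loader()

    def on_hup(self, signum, frame):
        # Keep the signal handler tiny: just note that a reload is wanted.
        self._pending = True

    def poll(self):
        """Call from the main loop; returns True if the config was re-read."""
        if not self._pending:
            return False
        self._pending = False
        self.config = self._loader()
        return True

def install(cfg):
    """Register the handler so that 'kill -HUP <pid>' requests a reload."""
    signal.signal(signal.SIGHUP, cfg.on_hup)
```

A daemon would call install(cfg) once at startup and cfg.poll() on each pass
through its main loop.  Whether this is preferable to a full execve() depends
on how much state would have to be rebuilt by hand after the re-read.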

 Is it potentially safe for gmond to just execve another gmond?  The new
 process will use the same PID, same effective UID, same file
 descriptors.


Yeah, it's probably safe.  As I mentioned, I've seen other long-running
daemons do this.  It's worth a try, in any event.

 If gmond is already running with reduced privileges (post-setuid), does
 it need to gain root privileges again to restart in this way?

No.  No extra privileges are needed.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF and network overload + [Feature] multiple requests per conn on interactive port

2009-01-14 Thread Jesse Becker
.






-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Avg Utilization

2008-12-10 Thread Jesse Becker
On Wed, Dec 10, 2008 at 03:51,  [EMAIL PROTECTED] wrote:
  IMO the graphs are already too cluttered for a quick overview
 and I have ideas for adding even more clutter (that matters
 to me, like the date the graph was produced :-). But if
 clutter doesn't matter, why not do both? I do not believe
 that the additional overhead is such a problem.

 I believe the graphs should be template driven, somewhat like the
 template pages.

While this would be nice, it would be very difficult using the current web
front end.  The HTML template system currently in use is basically a
one-way deal:  the various PHP scripts process various things (.rrd
files, talking to gmetad, etc), and stuff the results into pre-existing
HTML template files.  The templates themselves do not do anything
besides presentation.  Specifically, nothing in the templates is used
as input for these calculations.

 However, the choice of templates would replace the choice of graph sizes
 - so the user would be able to choose which template they want.

 Each template could specify:
 - graph size

This is only possible because of clever (ab)use of HTML forms and URL
parameter processing.  The other ideas you mentioned (show stats,
fonts, hostnames) cannot be driven explicitly at the template level,
given the current code.

 There are probably other things that could be configured too.

Of course! :)

 One related issue is that the HTML pages should specify the height and
 width in the IMG tags for graphs - maybe a function is needed to
 generate IMG tags.

This is slightly circular, since rrdtool will take height and width
arguments for the size of the chart--*not* including the border,
title, legend, etc--and return the total image size on the command
line.  However, since we use passthru(), this information is currently
lost.
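
As a sketch of what capturing that output could look like: rrdtool graph
prints the final image size as a WIDTHxHEIGHT line, so a caller that reads
the output instead of passing it straight through can feed the numbers into
the IMG tag.  The Python below is illustrative only; the function names are
assumptions and nothing here is taken from the existing PHP frontend:

```python
import html
import re

def parse_rrdtool_size(output):
    """Return (width, height) from the 'WxH' line in rrdtool's output, or None."""
    for line in output.splitlines():
        match = re.fullmatch(r"\s*(\d+)x(\d+)\s*", line)
        if match:
            return int(match.group(1)), int(match.group(2))
    return None

def img_tag(src, size):
    """Build an IMG tag that carries explicit width and height attributes."""
    width, height = size
    return '<img src="%s" width="%d" height="%d">' % (
        html.escape(src, quote=True), width, height)

size = parse_rrdtool_size("497x179\n")
print(img_tag("graph.php?m=load_one", size))
```

The catch described above still applies: the size is only known after the
graph has been rendered, so the page that emits the IMG tag would need the
result of an earlier rendering pass.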

A further complication is that crafting the various reports is not
always easy.  Trying to make them flexible enough to be truly template
driven will make the code much more complicated.

Now, all of that said, I actually don't object to the idea of making
the web frontend more flexible.  In fact, I think that it is a great
idea.  But there's very much a tradeoff between making/keeping the
code sane, and doing everything that we'd like.


I propose something of a middle ground:  make the various parts of the
frontend generic enough that building a new template is
straightforward.  Specifically, make graph creation simpler, and in such a
way that it is easy to specify a number of these different
parameters--sizes, titles, etc.  If things are set in the conf.php
file, perhaps we could let a different .php file override some of the
settings on a per-template basis?
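
One way to picture the override idea, sketched in Python rather than PHP.
The setting names and values are hypothetical and do not correspond to
anything in conf.php; the point is only that a template may change known
settings but not invent new ones:

```python
# Hypothetical site-wide defaults, standing in for values set in conf.php.
DEFAULTS = {"graph_width": 400, "graph_height": 100, "show_stats": False}

def effective_settings(defaults, template_overrides):
    """Return defaults with a template's overrides applied on top.

    Unknown keys are rejected so a typo in a template file fails loudly.
    """
    unknown = set(template_overrides) - set(defaults)
    if unknown:
        raise KeyError("unknown setting(s): %s" % ", ".join(sorted(unknown)))
    merged = dict(defaults)
    merged.update(template_overrides)
    return merged

# A hypothetical "compact" template that only shrinks the graphs.
compact = effective_settings(DEFAULTS, {"graph_width": 200, "graph_height": 50})
print(compact)
```

Keeping the defaults in one place and the overrides small is what would keep
new templates easy to write.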

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Avg Utilization

2008-12-10 Thread Jesse Becker
On Wed, Dec 10, 2008 at 10:22,  [EMAIL PROTECTED] wrote:
 This is slightly circular, since rrdtool will take height and
 width arguments for the size of the chart--*not* including
 the border, title, legend, etc--and return the total image
 size on the command line.  However, since we use passthru(),
 this information is currently lost.

 That suggests that rrdtool needs some extra command line option telling
 it that the dimensions are for the image size.

Newer versions of rrdtool have this.  I meant to mention that previously.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] use of find_avg() in cluster_view.php

2008-12-08 Thread Jesse Becker
On Mon, Dec 8, 2008 at 11:26,  [EMAIL PROTECTED] wrote:
 Looking more closely at how find_avg() is used by cluster_view.php:

  $avg_cpu_num = find_avg($clustername, "", "cpu_num");
  if ($avg_cpu_num == 0) $avg_cpu_num = 1;
  $cluster_util = sprintf("%.0f", ((double) find_avg($clustername, "",
 "load_one") / $avg_cpu_num ) * 100);

 Basically, the code above takes the quotient of two averages:

  util = average(load_one) / average(cpu_num)

 Is that really the same as taking the average of the quotient?

  X = set of values for each t, (load_one(t) / cpu_num(t))
  util = average(X)


Computing the value globally versus per system gives wildly
different answers.

Consider the two systems shown at this URL:
http://spreadsheets.google.com/pub?key=pLqhPr4caFmQ3g_G7cwUCSQ

It shows a cluster made of two dissimilar systems: one with 4 CPUs
and a load of 4, and one with one CPU and a load of 0.  Is your cluster
utilization 80% or 50%?

I (currently) believe that the current method, which is to divide the
average(load) by the average(cpu_num), is the correct behavior.  Computing
multiple per-system utilization figures, then averaging them, is less
immediately useful.  (This is not to say useless, since these
per-system utilization numbers could help to give you an idea of what
rackspace is being efficiently used.)
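
The two-host example works out as follows.  This is a small Python sketch
for the arithmetic only; the function names are made up here, and the
metric names are the load_one and cpu_num used by cluster_view.php:

```python
def quotient_of_averages(hosts):
    """Cluster-wide view: average(load_one) / average(cpu_num), in percent."""
    avg_load = sum(h["load_one"] for h in hosts) / len(hosts)
    avg_cpus = sum(h["cpu_num"] for h in hosts) / len(hosts)
    return 100.0 * avg_load / avg_cpus

def average_of_quotients(hosts):
    """Per-system view: average of each host's load_one / cpu_num, in percent."""
    ratios = [h["load_one"] / h["cpu_num"] for h in hosts]
    return 100.0 * sum(ratios) / len(ratios)

hosts = [
    {"cpu_num": 4, "load_one": 4.0},   # fully loaded 4-CPU box
    {"cpu_num": 1, "load_one": 0.0},   # idle single-CPU box
]

print(round(quotient_of_averages(hosts)))   # 80
print(round(average_of_quotients(hosts)))   # 50
```

The first figure says that 4 of the cluster's 5 CPUs are busy.  The second
weights the idle single-CPU box the same as the 4-CPU box, which is why it
comes out lower.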

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



[Ganglia-developers] backport proposal for graph statistics, bugzilla 206

2008-10-18 Thread Jesse Becker
Just a note that I've added a backport proposal for bugzilla ID#206
into the 3.1.x branch STATUS file.  This is a split from #193.  I've
consolidated several patches from trunk into a single patch, and
posted it.  Please review, test and vote.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] backport proposal for graph statistics, bugzilla 206

2008-10-18 Thread Jesse Becker
On Sat, Oct 18, 2008 at 13:33, Carlo Marcelo Arenas Belon
[EMAIL PROTECTED] wrote:
 On Sat, Oct 18, 2008 at 01:14:32PM -0400, Jesse Becker wrote:
 I've
 consolidated several patches from trunk into a single patch, and
 posted it.  Please review, test and vote.

 was just looking at that and I have to admit I am not sure why a consolidated
 patch that diverts from trunk will be needed.

 couldn't just a merge from all relevant patches in trunk be used for
 backport?, if there are few minor textual changes why not include them as
 well to avoid having later conflicts when trying to merge further stuff from
 trunk?

Because I figured it would be easier to review and test applying one
patch, instead of the 4-5 that it takes otherwise.  The single patch
was produced directly from a diff against trunk; no new code is
included.

 if the changes that were skipped were not good for 3.1, then they are not good
 for trunk either and that could be fixed with further patches in trunk than
 then could be added to the list of patches from 3.1 for backport.

Nothing was skipped.  In fact, given that new stuff goes into trunk
*before* it goes into 3.1, I don't really see how it could have been
skipped.  As I said, there's no new code here.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Ganglia Wiki organization

2008-10-07 Thread Jesse Becker
A bit late in replying...

On Thu, Sep 25, 2008 at 20:17, Bernard Li wrote:
 I have created a new navigation link called Development in the Ganglia Wiki:

 http://ganglia.wiki.sourceforge.net/development

 I envision to place all development-related resources under this page.

Great idea.

 Going through the other navigation links, I propose that we move How
 the Project Works, How to Participate, Project Administration and
 Wish List under this subcategory -- thoughts?

Agreed, with one minor change:  leave the "How to Participate" link on
the sidebar.  I think that it is important not to bury that link in a
sub-page.

 By the same token, I am thinking of creating another subcategory
 called User resources and place some of the other pages there.

That's reasonable.  What would go there?

For my part, I think that the following wiki pages should *remain* in
the sidebar, or somewhere obvious on each page:
  * home
  * A link to the documentation, including installation
  * How to participate
  * Release Notes
  * Link to gmetric/module repos.

The above list is not sorted.  I also don't have a problem with pages
being linked from multiple places.  For example, a documentation link
on the sidebar does not preclude having a second documentation link on
the developer page(s).

One of the guidelines for Wikipedia editors is "Be Bold."[1]  So have
at it!  We can always change/revert the pages if it doesn't work out.

[1] http://en.wikipedia.org/wiki/Wikipedia:Be_bold

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] STATUS file etiquette (ignore previousemail please)

2008-09-18 Thread Jesse Becker
On Wed, Sep 17, 2008 at 10:45, Brad Nicholes [EMAIL PROTECTED] wrote:
 I'm not sure I like any of those options.  How about if we use a
 modification of your first suggestion.

<snipping of good suggestions>

 The voting history or basically what happened during the proposal
 review is all captured in a series of brief comments in the STATUS file.
  The original author may also decide to include a link to an email
 thread where a more detailed discussion is or has taken place.  In this
 process nobody messes with anybody else's votes and the original author
 of the proposal remains in control of the proposal.

This seems reasonable to me.  I think that it would be worth trying
this procedure to see how it works out.


-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] STATUS file etiquette

2008-09-16 Thread Jesse Becker
On Mon, Sep 15, 2008 at 19:59, Brad Nicholes [EMAIL PROTECTED] wrote:
 In general, I am on board with this as you outlined it.  Let me make a
 couple more suggestions (inline)

Thanks for the comments.  A few things I thought of after sending the
1st email are also below.


 Suggestion 2)  Don't mess with other people's votes.


 +1, even when a proposal is modified, the existing votes need to
 remain.  However we do need to somehow put a procedure in place that
 allows for a re-review.  I don't have any suggestions for that yet.

Perhaps an SVN commit that does nothing but remove *all* votes for a
given stanza, and make sure the log entry indicates what is going on?
Alternately, if a proposal is reworked, and needs re-review, add that
to the notes about the backport.  That's...cumbersome, but I can't
think of a better idea.




 Suggestion 4)  When a backport has been accepted, move the entire
 stanza into a CHANGES file (or something similar), along with a date
 stamp..  This should be done when the changes are actually committed
 to that branch, not when the votes are cast.  This also means that a
 CHANGES file needs to be created.


 +1 again, another suggestion would be to just move the stanza to lower
 section with the same STATUS file.  But I am good with this either way.

I think that I prefer moving them to another file, for two reasons:
1)  It keeps the STATUS file from growing without bound.
2)  The revision diffs should make it very obvious what is happening,
what with a big chunk of text removed from STATUS, and appearing in
CHANGES.




-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] STATUS file etiquette

2008-09-16 Thread Jesse Becker
On Tue, Sep 16, 2008 at 05:39, Carlo Marcelo Arenas Belon
[EMAIL PROTECTED] wrote:
 On Mon, Sep 15, 2008 at 07:22:27PM -0400, Jesse Becker wrote:

 A few of us (Bernard, Brad Nicholes, and myself) were musing on IRC
 about the etiquette of STATUS file edits.

 since I did 55.4% of the commits to the 3.1 STATUS file and all the
 other committers except for Martin (2.7%) were part of that IRC meeting,
 guess this was directed at me.

Actually, no.  It stemmed more from confusion on my part as to the
proper way of doing things.  "Etiquette" isn't the right word; perhaps
"procedure" is better (I couldn't think of it yesterday).

 We decided that this is
 better discussed on the wider list, and I was volunteered to broach
 the issue (lucky me).

 in our wiki under the section of communication it is suggested that
 decisions made on IRC be summarized to the list as shown by :

  http://ganglia.wiki.sourceforge.net/ganglia_project

No decisions were made, as I pointed out.  And you'll note that this
issue specifically was brought to the wider list *because* there were
only three people involved.

 So to get things started, here are a few things
 that *I* think would be useful.  These are suggestions only--I'm in no
 position to dictate anything to anyone.

 Suggestion 1)  The +1, +0 and -1 votes get one line apiece, for a
 total of 3 lines. See below for an example.

 +1

 funny you mention it was your idea though since that was the way that
 it was documented to work before as reflected by the template until this
 commit reformatted all entries differently :

 
 r1716 | hawson | 2008-08-22 21:15:21 -0700 (Fri, 22 Aug 2008) | 3 lines

 * Added reviews to a number of proposed backports.
 * split a few +1 lines that contained multiple users into multiple +1 lines

Yep.  I'm aware of that.  I actually still prefer multiple lines for
two reasons:  1)  it makes the number of votes that have been cast more
obvious, as well as the relative number of +1, +0, -1.  2)  It
makes the diffs cleaner when looking to see what votes were
added/changed.  However, this is such a minor thing that it isn't
worth arguing over.  I suggested single lines since that seems to be
what other people prefer.

 Suggestion 2)  Don't mess with other people's votes.

 -1

 votes are attached to patch proposals and so if the proposal changes the
 vote has to be recast (indeed we talked about this in our ganglia meeting
 in groundworks)

I agree that new patches require new votes, but there needs to be more
communication when this happens.  Deleting the votes and comments
removes that information about previous patches, and potentially why
it was rejected (or not) in the first place.  This information is
useful, and should be preserved somehow.  Perhaps we could do
something like this (all revisions and patch numbers are 100% bogus):

* gmond: recover from interface reconfiguration gracefully
   http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=38
   http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=1478
   http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=1632
   -1: carenas
   carenas: not a fix but a workaround to the problem



 if the proponent finds a problem with his original proposal and changes it,
 keeping the +1 votes on it will be incorrect as what was verified was
 something different to what the vote is attached to.  the only logical
 course of action should be to remove the vote so that it can be voted again.

 Suggestion 3)  A vote of +1 does not need comment.  If committing a
 vote of +0 or -1, a comment as to why is *strongly* encouraged.

 +1 with comments

 the wording used slightly contradicts the Decision Making section on our
 wiki for Project Administration

  http://ganglia.wiki.sourceforge.net/ganglia_project

 -1 MUST have a comment that explains clearly why the current proposal
 needs more work before backported and ideally the missing pieces or an
 alternative proposal contributed.

 it is also important to note that a backport rejection that says something
 like I don't like this, I think we should do it differently should also
 take into consideration that trunk is already changed to what the proposal
 was suggesting and so an alternative proposal MUST be contributed and
 hopefully a broad discussion started.

A fair point.  But if there are objections to the backport patch, this
may mean that the upstream patch to trunk should be reviewed and
possibly amended.

 most of the time though, I would expect discussion will be started
 directly in trunk and even before the patch is proposed for backport,
 after all trunk is under CTR rules and the R there means REVIEW.


 The comment can be either on the voting line, or immediately after the
 stanza.  See below for an example.

 -1

 as explained above this could be problematic for keeping the votes in
 only 3 lines so it might

Re: [Ganglia-developers] STATUS file etiquette (ignore previous email please)

2008-09-16 Thread Jesse Becker
Crud... Gmail sent the last email before I was done.  (PEBKAC, really)
Please ignore the previous email, and read this one instead.

On Tue, Sep 16, 2008 at 05:39, Carlo Marcelo Arenas Belon
[EMAIL PROTECTED] wrote:
 On Mon, Sep 15, 2008 at 07:22:27PM -0400, Jesse Becker wrote:

 A few of us (Bernard, Brad Nicholes, and myself) were musing on IRC
 about the etiquette of STATUS file edits.

 since I did 55.4% of the commits to the 3.1 STATUS file and all the
 other committers except for Martin (2.7%) were part of that IRC meeting,
 guess this was directed at me.

Actually, no.  It stemmed more from confusion on my part as to the
proper way of doing things.  "Etiquette" isn't the right word; perhaps
"procedure" is better (I couldn't think of it yesterday).

 We decided that this is
 better discussed on the wider list, and I was volunteered to broach
 the issue (lucky me).

 in our wiki under the section of communication it is suggested that
 decisions made on IRC be summarized to the list as shown by :

  http://ganglia.wiki.sourceforge.net/ganglia_project

No decisions were made, as I pointed out.  And you'll note that this
issue specifically was brought to the wider list *because* there were
only three people involved.

 So to get things started, here are a few things
 that *I* think would be useful.  These are suggestions only--I'm in no
 position to dictate anything to anyone.

 Suggestion 1)  The +1, +0 and -1 votes get one line apiece, for a
 total of 3 lines. See below for an example.

 +1

 funny you mention it was your idea though since that was the way that
 it was documented to work before as reflected by the template until this
 commit reformatted all entries differently :

 
 r1716 | hawson | 2008-08-22 21:15:21 -0700 (Fri, 22 Aug 2008) | 3 lines

 * Added reviews to a number of proposed backports.
 * split a few +1 lines that contained multiple users into multiple +1 lines

Yep.  I'm aware of that.  Consensus seems to be to use single lines,
so I suggested it.  I actually prefer multiple lines for
two reasons:  1)  it makes the number of votes that have been cast more
obvious, as well as the relative number of +1, +0, -1.  2)  It
makes the diffs cleaner when looking to see what votes were
added/changed.

 Suggestion 2)  Don't mess with other people's votes.

 -1

 votes are attached to patch proposals and so if the proposal changes the
 vote has to be recast (indeed we talked about this in our ganglia meeting
 in groundworks)

I agree that new patches require new votes, but there needs to be more
communication when this happens.  At the very least, a note to -devel
that a new patch is present and the issue in question needs a revote.
Furthermore deleting the votes and comments removes information about
previous patches, and potentially why it was rejected (or not) in the
first place.  This information is
useful, and should be preserved somehow.  Perhaps we could do
something like this (all revisions and patch numbers are 100% bogus)

* gmond: report CPU color
   http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=12345
   http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=100
   -1: hawson
   hawson: doesn't actually work due to changes in r150.
   http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=200
   +1: hawson:  actually works

This is a bit cumbersome, but does have the advantage of keeping a
timeline of sorts, as well as comments and vote history.

Another possibility would be to add a date stamp of some sort:

* gmond: report CPU color (proposed 2008-06-01)
   http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=12345
   http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=100
   -1: hawson (2008-07-01)
   hawson: doesn't actually work due to changes in r150.
   http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=200
   +1: hawson:  actually works (2008-08-01)


Or perhaps starting with:

* gmond: report CPU color (proposed 2008-06-01)
   http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=12345
   http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=100
   -1: hawson
   hawson: doesn't actually work due to changes in r150.

and once a new patch is out, clearing votes and comments but adding a
note that a revote is needed:

* gmond: report CPU color (proposed 2008-06-01)
   http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=12345
   http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=100
   http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=200
   (REVOTE NEEDED)



 it is also important to note that a backport rejection that says something
 like I don't like this, I think we should do it differently should also
 take into consideration that trunk is already changed to what the proposal
 was suggesting and so an alternative proposal MUST be contributed

[Ganglia-developers] STATUS file etiquette

2008-09-15 Thread Jesse Becker
A few of us (Bernard, Brad Nicholes, and myself) were musing on IRC
about the etiquette of STATUS file edits.  We decided that this is
better discussed on the wider list, and I was volunteered to broach
the issue (lucky me).  So to get things started, here are a few things
that *I* think would be useful.  These are suggestions only--I'm in no
position to dictate anything to anyone.

Suggestion 1)  The +1, +0 and -1 votes get one line apiece, for a
total of 3 lines. See below for an example.

Suggestion 2)  Don't mess with other people's votes.

Suggestion 3)  A vote of +1 does not need comment.  If committing a
vote of +0 or -1, a comment as to why is *strongly* encouraged.  The
comment can be either on the voting line, or immediately after the
stanza.  See below for an example.

Suggestion 4)  When a backport has been accepted, move the entire
stanza into a CHANGES file (or something similar), along with a date
stamp.  This should be done when the changes are actually committed
to that branch, not when the votes are cast.  This also means that a
CHANGES file needs to be created.


So, any thoughts or comments?  These are, as I said, suggestions
only.  I'm quite willing to be convinced of other behavior.


Example for #1 and #3 above:

  * gmond: Frobnicate the quazzle before inducticating the bibblebop.
http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=-
+1:  hawson
-1: Tom, Dick, Harry
+0:  Mr. Bill
Tom:  Inducticating should happen before the frobnication.
Harry:  The quazzle should be checked for extra widgets before frobnication.
Mr. Bill:  Oh no!
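
One side benefit of a fixed vote layout like the example above is that it becomes easy to tally mechanically.  What follows is only a rough sketch, assuming the stanza format shown in this email; the function name tally_votes and the regular expression are made up for illustration and are not part of any existing Ganglia tooling.

```python
import re

# Hypothetical pattern for a vote line such as "+1:  hawson" or "-1: Tom, Dick".
# Comment lines ("Tom:  ...") and URLs do not start with a vote, so they are skipped.
VOTE_RE = re.compile(r"^\s*([+-][01]):\s*(.*)$")


def tally_votes(stanza):
    """Return a dict mapping '+1', '+0' and '-1' to the list of voters."""
    votes = {"+1": [], "+0": [], "-1": []}
    for line in stanza.splitlines():
        match = VOTE_RE.match(line)
        if match is None:
            continue  # title, URL, or free-form comment line
        kind, names = match.groups()
        if kind not in votes:
            continue  # e.g. a stray "-0", which is not a valid vote here
        votes[kind].extend(n.strip() for n in names.split(",") if n.strip())
    return votes


if __name__ == "__main__":
    example = """\
  * gmond: Frobnicate the quazzle before inducticating the bibblebop.
+1:  hawson
-1: Tom, Dick, Harry
+0:  Mr. Bill
Tom:  Inducticating should happen before the frobnication.
Mr. Bill:  Oh no!
"""
    for kind, names in tally_votes(example).items():
        print(kind, len(names), ", ".join(names))
```

With one line per vote type, a diff of the STATUS file also shows at a glance which of the three lists changed.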


-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] printing output with rrdtool

2008-09-12 Thread Jesse Becker
On Fri, Sep 12, 2008 at 13:33, Carlo Marcelo Arenas Belon
[EMAIL PROTECTED] wrote:
 Jessie,

Jesse, actually. :-)

 the following commit (r1754 in ganglia's svn) seems to be patching the fix
 proposed by Jason as part of BUG37 and that was committed in r1595 and has
 been left inconsistent (as not all uses of this feature have been converted
 to use /dev/null).

Then the other instances should be converted, IMO.  I looked at this
one specifically since it was causing trouble on a local install
running trunk.  Specifically, the error was:

PHP Notice:  Undefined offset:  1 in
/home/beckerjes/ganglia/versions/trunk/monitor-core/web-working/functions.php
on line 266, referer:
http://saturn/ganglia-jb/?m=load_one&r=hour&s=descending&hc=4&mc=2

Tracing this back, you get this function:
   249  #---
   250  #
   251  # Finds the avg of the given cluster & metric from the summary rrds.
   252  #
   253  function find_avg($clustername, $hostname, $metricname)
   254  {
   255     global $rrds, $start, $end;
   256     $avg = 0;
   257
   258     if ($hostname)
   259       $sum_dir = "$rrds/$clustername/$hostname";
   260     else
   261       $sum_dir = "$rrds/$clustername/__SummaryInfo__";
   262     $command = RRDTOOL . " graph /dev/null --start $start --end $end " .
   263       "DEF:avg='$sum_dir/$metricname.rrd':'sum':AVERAGE " .
   264       "PRINT:avg:AVERAGE:%.2lf ";
   265     exec($command, $out);
   266     $avg = $out[1];
   267     #echo "$sum_dir: avg($metricname)=$avg<br>\n";
   268     return $avg;
   269  }

Adding a one-line print let me see exactly the rrdtool command
getting called (rrdtool 1.2.26):

  /usr/bin/rrdtool graph '' --start -3600 --end N
DEF:avg='/long/path/to/rrds/__SummaryInfo__/cpu_num.rrd':'sum':AVERAGE
PRINT:avg:AVERAGE:%.2lf

(Or something to that effect, the test system is at home, and I'm not.
 I'm building this on the fly from the code.)

Sure enough, no output at all, and this causes line 266 to throw the
error.  I can reproduce this using rrdtool 1.2.23 and 1.2.26.  This
was specifically in the meta-view, but not in the cluster- or
host-views.

 I had been unable to reproduce any problem with the original patch using
 several different versions of rrdtool, but your comment seems to imply you
 had observed the problem somehow, could you elaborate on that?

I tested three cases:
  rrdtool -
  rrdtool ''
  rrdtool /dev/null

The only one that I could get to work consistently was 'rrdtool
/dev/null'.  This trick of using /dev/null is actually suggested in
a number of rrdtool mailing list threads for situations where you want
rrdtool to calculate a value, but suppress the generation of a graph.
I'd also like to note that there is no difference in syscalls between
using "rrdtool /dev/null" and "rrdtool ''", so there should be no
additional cost between the two methods.  There are some between using
a dash and the other two, but I think that's because nothing is written
to STDOUT at all.

Of course, in the case of graph.php, we want 'rrdtool -', since the
graphs are generated in-line.

 but the solution proposed by Jason has the advantage of being cross-platform
 (/dev/null doesn't exist in windows), so if you see no problem with the
 original patch I'd suggest you revert yours.

As I said, this was the only one of the three that I could get working
consistently.  I also happen to think that "rrdtool /dev/null blah"
is more obvious than "rrdtool '' blah".  So there's a minor argument
to be made in favor of that as well.  In the case of Windows, there is
an equivalent NUL file that could be used.

Regardless, I do *not* think that the patch should simply get
reverted.  Arguably, the problem stems from a lack of proper error
handling on the exec() call.  This exists in r1753 (and persists in r1754,
for that matter), so a simple revert won't help matters any.  Instead,
we need code to handle empty output from exec(), in addition to an
rrdtool invocation that can reliably suppress graph generation but still
do the computations we want.
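
To illustrate the sort of guard I mean, here is a rough sketch.  It is written in Python rather than the frontend's PHP, purely to show the logic, and it assumes (as the $out[1] indexing implies) that the first output line is something other than the value and the PRINT result is on the second line.  The function name parse_print_value is hypothetical.

```python
import math


def parse_print_value(lines, index=1):
    """Return the numeric PRINT value at lines[index], or None if unusable.

    `lines` is the captured output of the rrdtool call, one string per line.
    None is returned when the output is empty or too short, when the line
    is not a number, or when it is NaN.
    """
    if len(lines) <= index:
        return None  # no output at all, or fewer lines than expected
    try:
        value = float(lines[index].strip())
    except ValueError:
        return None  # something other than a number was printed
    if math.isnan(value):
        return None  # treat NaN as "no data" (an assumption of this sketch)
    return value


if __name__ == "__main__":
    print(parse_print_value([]))               # the failing case described above
    print(parse_print_value(["0x0", "1.25"]))  # a well-formed two-line result
```

The caller can then fall back to 0 (or skip the host) when None comes back, instead of indexing into an empty array.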

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Bugzilla Bug 193: Avg Load percentages and overall cluster utilization.

2008-08-19 Thread Jesse Becker
On Tue, Aug 19, 2008 at 20:33, Bernard Li [EMAIL PROTECTED] wrote:

 I currently have an incomplete fix, but I need to get consensus as to
 what average utilization really means for grid of grids: should
 average utilization for a grid be load average divided by the number
 of cpus for the *entire* meta-grid or just over the grid in question?

For both the grid and meta-grid I think the average utilization
should be counted for all hosts.  Consider a meta-grid that gets data
from two grids A and B.  Grid A has 100 hosts and Grid B has 20.
If 15 machines in each grid are running full-tilt (e.g. 100% utilization
on 15 different hosts), then the Grid utilization figures are quite
different.

For Grid A, there's a utilization of 15%.  For Grid B, it's 75%.

If we weight the two grids equally, we get an average utilization of
45% ((15% + 75%) / 2), even though there are 90 idle systems.

On the other hand, if you take into account the number of hosts, you
get a different figure:   (15+15)/(100+20) = 25% average utilization.
To me, this seems to be a more accurate representation of the overall
usage.
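
For anyone who wants to check the arithmetic, here is a small sketch of the two ways of averaging.  The grid sizes are the ones from the example above; the function names are made up for illustration and do not correspond to anything in gmetad or the web frontend.

```python
# (busy hosts, total hosts) for each grid in the example.
GRIDS = {"A": (15, 100), "B": (15, 20)}


def per_grid_utilization(grids):
    """Utilization of each grid on its own, as a fraction between 0 and 1."""
    return {name: busy / total for name, (busy, total) in grids.items()}


def unweighted_mean(grids):
    """Average of the per-grid figures, treating every grid as equal."""
    figures = per_grid_utilization(grids)
    return sum(figures.values()) / len(figures)


def host_weighted_mean(grids):
    """Busy hosts divided by all hosts across the whole meta-grid."""
    busy = sum(b for b, _ in grids.values())
    total = sum(t for _, t in grids.values())
    return busy / total


if __name__ == "__main__":
    print("per grid:      ", per_grid_utilization(GRIDS))
    print("grids equal:   ", unweighted_mean(GRIDS))
    print("host-weighted: ", host_weighted_mean(GRIDS))
```

The host-weighted figure is the (15+15)/(100+20) calculation; the unweighted one is pulled upward by the small, busy grid.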

Now, this does *not* take into account things like relative CPU
speeds, or multi-core systems.  100% utilization on my ancient Celeron
is quite a bit different than 100% on the latest quad-core Opteron.  I
don't think that we need to delve quite that far into comparing


 Alternatively, we can rollback this backport and punt it until 3.1.2.

Probably the cleanest solution for now.

 On a related note, I think we should distinguish between a Grid and
 a Meta-Grid (i.e. a grid of grids) in the Front End -- do people
 care?

Yes, it would be good to distinguish between them, even if all it does
is say "Meta-Grid" instead of "Grid".

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Ready for 3.1.1...?

2008-08-15 Thread Jesse Becker
On Fri, Aug 15, 2008 at 13:59, Bernard Li [EMAIL PROTECTED] wrote:
 Hi Brad:

 On Fri, Aug 15, 2008 at 10:48 AM, Brad Nicholes [EMAIL PROTECTED] wrote:

 I know it seems like just yesterday that we shipped Ganglia 3.1.0 (well,
 probably because it was ;)  But there have been some significant patches
 added to 3.1.1 including a fix for a segfault in gmetad.  Due mainly to the
 segfault patch, I am proposing that we tag and roll a testing tarball of
 3.1.1 within the next week with a goal of shipping 3.1.1 two weeks after the
 tag.  There are still a number of backports
 Comments?  Questions?  Feedback?

 I am okay to roll new testing tarball as soon as we are ready -- just
 give the word.

Ditto.  There's nothing in the backports proposal that absolutely must
go into 3.1.1 (although anything that's been approved may as well get
released).

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Gmetad scalability tests

2008-08-01 Thread Jesse Becker
On Fri, Aug 1, 2008 at 09:14,  [EMAIL PROTECTED] wrote:
 Has anyone written any kind of simulator for testing gmetad, e.g. a
 gmond that reports thousands of metrics for gmetad to log?

I've not heard specifically of a simulator, although a very large
cluster would basically do the same thing. :-)

 With gmond 3.1, a simulator could probably be written as a plugin that
 creates thousands of metrics.

That could be very useful.  Of course, it's not hard to fire off
gmetric multiple times in a very tight loop either. :-)  It would be
interesting to see such a module, and what bottlenecks it exposes.
There are some known issues with gmetad and updating the .rrd
files--this is why many people store them on a ramdisk.
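
As a rough illustration of the "gmetric in a tight loop" approach, something like the sketch below would do.  It only builds the command lines (a dry run), so it can be inspected before anything is sent.  The metric names, the value pattern, and the function name gmetric_commands are invented for this example; the only gmetric options used are --name, --value, --type and --units.

```python
def gmetric_commands(count, prefix="sim_metric"):
    """Build `count` gmetric invocations, one made-up metric each.

    Each command is a list suitable for subprocess.run(); nothing is
    executed here.
    """
    commands = []
    for i in range(count):
        commands.append([
            "gmetric",
            "--name=%s_%d" % (prefix, i),   # hypothetical metric name
            "--value=%d" % (i % 100),       # arbitrary test value
            "--type=uint32",
            "--units=count",
        ])
    return commands


if __name__ == "__main__":
    # Print the first few commands instead of running them.
    for cmd in gmetric_commands(3):
        print(" ".join(cmd))
```

Passing each list to subprocess.run() on a host with gmond configured would actually send the metrics; raising the count is then a crude way to see where gmetad starts to struggle.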


-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Backport proposals for typo fixes...

2008-07-31 Thread Jesse Becker
On Thu, Jul 31, 2008 at 17:49, Bernard Li [EMAIL PROTECTED] wrote:
 Hi all:

 I just fixed a typo in one of the configuration files in trunk, now I
 want this backported to the 3.1.x branch.  Since this does not touch
 any actual code (just comments), can I get a free pass and not have to
 go through a backport proposal?

+1

 Perhaps we can outline some minor fixes that do not require to go
 through this peer review process and document this list in the Wiki?

I'd suggest that minor typo changes, minor rewording to things like
error messages, and any documentation are fair game for this.


-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Auto configuration ideas

2008-07-29 Thread Jesse Becker
On Tue, Jul 29, 2008 at 11:35,  [EMAIL PROTECTED] wrote:
 -Original Message-
 From: Jesse Becker [mailto:[EMAIL PROTECTED]
 [EMAIL PROTECTED] wrote:
  I've been looking at how we currently deploy Ganglia configuration
  files in our organisation, and whether the process can be improved.

 snip

  - allowing a central configuration server to override some, but not
  all, of the values specified in the config file

 This is possible with cfengine, and other configuration
 management tools.  I have several groups of machines, and

 There are some more details that were not in my original email:

 - it is a multi-platform deployment, including Linux, Solaris and an OS
 that requires Cygwin

Cfengine works on all of those.  Puppet works on the first two, and
probably via Cygwin (Puppet is based on Ruby, and probably runs
wherever Ruby does).

 - it is envisaged that some users will be able to change configuration
 settings through a web interface, and that those changes will be
 propagated to the nodes/clusters that the users who chosenfind

It should be straightforward to convert webpage input into various
rules files for cfengine/puppet/etc.  These rules would then be used
to push the various settings around as needed.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] What to do with the contrib directory???

2008-07-10 Thread Jesse Becker
Just to chime in here...

On Thu, Jul 10, 2008 at 13:17, Carlo Marcelo Arenas Belon
[EMAIL PROTECTED] wrote:
 On Thu, Jul 10, 2008 at 08:46:11AM -0600, Brad Nicholes wrote:
 At the developers meeting last Feb. we talked about what to do with new
 module and other types of contributions such as utility scripts.  A new
 contrib/ directory was created by hawson in March.

 that was I think unrelated with the debate about a module repository that we
 had at the summit and more linked to the immediate need of distributing useful
 user provided stuff like the SMF profiles needed to run ganglia in Solaris 10
 as you can see by (also linked into the STATUS file) :

  
 http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg03807.html

Correct.  I don't think that contrib/ was ever intended to be the
proper location for user contributed gmond modules.  I see it as
more for miscellaneous other stuff--like SMF startup scripts, or
routines for copying and restoring files from tmpfs on a regular
basis, or other sorts of infrastructure tasks (for lack of a better
term).

 I don't think we should just be dumping anything that lands in contrib/
 into our builds.

Agreed, but the threshold need not be high for inclusion.  I am *not*
suggesting that we take any and every submission.  However, if it's
useful to someone, it's probably useful to someone else, and at least
worth considering.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2

-
Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
Studies have shown that voting for your favorite open source project,
along with a healthy diet, reduces your potential for chronic lameness
and boredom. Vote Now at http://www.sourceforge.net/community/cca08
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] [Ganglia-svn] SF.net SVN: ganglia: [1538]trunk/monitor-core/Makefile.am

2008-07-10 Thread Jesse Becker
On Thu, Jul 10, 2008 at 13:15, Brad Nicholes [EMAIL PROTECTED] wrote:
  I was specially interested in getting ganglia-rrd-modify.pl somehow

snip

 I think adding contrib makes probably more sense (as is usually done in
 other
 opensource packages), and as far as we clear the distribution of all those
 contributions of course with some nice looking legalese (which I think has
 been implicitly granted through the process of getting those files publicly
 to the list)

Agreed.

 I'm OK with it either way.  If we add contrib/ to the package, then we should 
 still have someplace where we put stuff that we like and think is valuable, 
 but haven't approved yet.  Does that make sense?  However a download page on 
 the wiki or some other kind of web directory listing might make it easier to 
 reference for the user.

This is exactly what contrib directories are for:  things that are
useful and worth distributing as a courtesy, but are *not* directly
supported by the main development team.  If something is ever
promoted/taken over by main developers, then it gets removed from
contrib/, and added into the proper location elsewhere in the
project.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Stacked Graphs

2008-06-30 Thread Jesse Becker
I took a quick look at the patch, although I've not applied it yet.

A few comments:

1)  the HSV_TO_RGB and get_col functions should, IMO, be moved out to
the functions.php file.
2)  Chart size is fixed to 400x300.  This should be based on the other
charts (either pie chart, or other plots, depending on placement).
3)  A few mostly minor coding inefficiencies (needless recalculation
of an array length inside a loop, for example), but nothing major.

I'll try out the patch tomorrow if I get a chance (tonight is out)

On Mon, Jun 30, 2008 at 15:34, Bernard Li [EMAIL PROTECTED] wrote:
 On Thu, Jun 19, 2008 at 3:07 PM, Brad Nicholes [EMAIL PROTECTED] wrote:

 The patch looks good.  Now we just need somebody with a lot more PHP web
 frontend experience than me to review the patch and determine if it should be
 committed to trunk.  Jesse, Bernard... I'll leave it up to you or anybody
 else with commit rights looking for some code to review. :)

 I have reviewed the patch and updated the bugzilla entry.  Waiting for
 response right now.

 The sooner we can get this checked in, the better.  Otherwise other
 code changes may start to conflict with this patch.

 Cheers,

 Bernard

 -
 Check out the new SourceForge.net Marketplace.
 It's the best place to buy or sell services for
 just about anything Open Source.
 http://sourceforge.net/services/buy/index.php
 ___
 Ganglia-developers mailing list
 Ganglia-developers@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/ganglia-developers




-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] subtitle still clovered for very long hostnames using 1.2.27

2008-06-24 Thread Jesse Becker
I just committed r1460 to trunk.  This adds the $use_fqdn_hostname
setting to conf.php.  The default behavior is to *not* show the FQDN.
The patch adds a small utility function to remove everything from the
first "." character onward in a hostname, and then adds checks for
$use_fqdn_hostname in the various graphing functions.

It is possible to turn this option on, so that the FQDN is shown, but
hosts that report themselves using only a short hostname will still be
displayed as such, since we can't magically conjure an arbitrary
domain for hosts that lack them.
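
The behavior described above amounts to something like the following sketch.  It is Python for illustration only, not the PHP that was committed, and display_hostname is a made-up name; only the $use_fqdn_hostname setting comes from the actual patch.

```python
def display_hostname(hostname, use_fqdn_hostname=False):
    """Return the name to show in graph titles.

    With use_fqdn_hostname off (the default), everything from the first
    "." onward is dropped.  With it on, the name is returned unchanged,
    so a host that only reports a short name stays short.
    """
    if use_fqdn_hostname:
        return hostname
    return hostname.split(".", 1)[0]


if __name__ == "__main__":
    print(display_hostname("node1.example.com"))
    print(display_hostname("node1.example.com", use_fqdn_hostname=True))
    print(display_hostname("node1", use_fqdn_hostname=True))
```

Note that this simple version would also truncate a dotted IP address at its first octet; the sketch makes no attempt to detect that case.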

On Tue, Jun 24, 2008 at 13:27, Bernard Li [EMAIL PROTECTED] wrote:
 Hi Carlo:

 I saw the comment in the STATUS file in 3.1 branch.

 Can you post a screenshot of what the graph looks like?

 We could probably get around it by having a user-configurable option
 $use_fqdn_hostname in conf.php to determine whether we use FQDN or
 short hostname when rendering the graphs.  This has been a feature
 request and could potentially get around the issue.  Of course if your
 short hostname is actually long -- then there is nothing we can really
 do to help.

 Cheers,

 Bernard





-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Trunk r1438: web/graph.php

2008-06-24 Thread Jesse Becker
On Tue, Jun 24, 2008 at 01:48, Carlo Marcelo Arenas Belon
[EMAIL PROTECTED] wrote:
 On Mon, Jun 23, 2008 at 11:36:02AM -0700, Bernard Li wrote:
 Hi Carlo:

 http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=1438

 Do we really want the debug message to be text/plain?

 Previously it was text/html (default)

 the default content type is a webserver configuration setting; I presume
 that in your setup it is text/html, which is why you were not able to see
 the bug.  it is especially annoying if you happen to have something like
 application/octet-stream instead, and I am pretty sure it would be amusing
 if it happened to be audio/x-wav.

That seems like a reasonable reason to force the type ourselves.  We
know what the output is, and should properly declare it as such.

 and the text was wrapped such that you could
 see the entire command on the screen.  Now there is no text wrapping,
 which makes it hard to see the entire command in one go.

 you also couldn't see the bug because your browser was nice enough to format
 and wrap an invalid HTML file but that is also specific to your setup.

 if you want to have it formatted and wrapped I would recommend doing that in
 the code instead.  if that is the case HTML might be better, as it will also
 allow for other formatting aids like fonts, bold and colors, but I think the
 currently proposed setup is more practical for a debugging flag.

A middle ground would be to do *minor* preprocessing on the command to
make the lines wrap cleanly, then wrap it in <pre> tags.
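
A rough sketch of that middle ground, in Python rather than the frontend's
PHP.  The function name, the 80-column width, and the sample command are
all assumptions for illustration, not anything taken from graph.php.

```python
import html
import textwrap


def format_debug_command(command: str, width: int = 80) -> str:
    """Wrap a long command at whitespace and return it inside <pre> tags."""
    wrapped = textwrap.fill(
        command,
        width=width,
        break_long_words=False,   # never split an argument in the middle
        break_on_hyphens=False,   # keep options such as --title intact
        subsequent_indent="    ",
    )
    # Escape &, < and > so the command cannot be misread as markup.
    return "<pre>" + html.escape(wrapped) + "</pre>"


sample = "rrdtool graph - --title 'a<b' DEF:x=/tmp/x.rrd:sum:AVERAGE LINE2:x#FF0000"
print(format_debug_command(sample, width=40))
```

The wrapping is for display only: a quoted argument that contains spaces
may be broken across lines, so the output is not meant to be pasted back
into a shell unchanged.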

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] [RFT] libmetrics: freebsd: avoid unitialized values and invalid casts for cpu_speed

2008-04-23 Thread Jesse Becker
On Wed, Apr 23, 2008 at 6:33 AM, Carlo Marcelo Arenas Belon
[EMAIL PROTECTED] wrote:
 the following proposed patch for stable (3.0 and 3.1) removes a floating cast
  and the use of an uninitialized buffer which could result in high cpu_speed
  values when the size of the buffer used by the call to sysctlbyname on
  machdep.tsc.freq was smaller than the one proposed (8 bytes).

  attached original fix committed as r1293

+1

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2

-
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] relicensing the web frontend as GNU GPL v2

2008-04-22 Thread Jesse Becker
On Tue, Apr 22, 2008 at 11:05 AM,  [EMAIL PROTECTED] wrote:

 Quoting Brad Nicholes [EMAIL PROTECTED]:

  On 4/19/2008 at 10:48 AM, in message
  [EMAIL PROTECTED], Jesse Becker
  [EMAIL PROTECTED] wrote:
  On Sat, Apr 19, 2008 at 11:37 AM, Brad Nicholes
  [EMAIL PROTECTED] wrote:
   Apparently there are a lot of choices for replacing TemplatePower with
   some other templating class.  Check out
   http://www.whenpenguinsattack.com/2006/07/19/php-template-engine-roundup/
   We just need to find one that isn't GPL.  Preferably BSD or MIT.  Apache
   license would be good also.
 
  LGPL?  (I'm looking through lists on freshmeat.net...)
 
 
  LGPL would be OK if we can't find something licensed under BSD, MIT
  or another more liberal license.
 
  Brad

 http://www.smarty.net/

 Smarty is probably the most common templating framework for PHP, and
 is LGPL.  Maybe a bit heavyweight for what's needed in the Ganglia
 frontend.

I spent a little time looking at xtemplate.  It's dual-licensed BSD
and GPL, somewhat similar in several ways to the current template
system, and appears to be fairly small.

http://www.phpxtemplate.org/HomePage


-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] relicensing the web frontend as GNU GPL v2

2008-04-22 Thread Jesse Becker
On Tue, Apr 22, 2008 at 2:00 PM, Brad Nicholes [EMAIL PROTECTED] wrote:
 If phpxtemplate will work for us, then this sounds like a good way to go from 
 a licensing perspective.  It is dual licensed under both LGPL and BSD.  We 
 would obviously take it under the BSD license.  In either case, the Ganglia 
 project source code could remain completely under the BSD license.  How hard 
 would it be to replace what we have today with phpxtemplate?


It didn't look too hard.  I spent a little time looking into the
conversion over the weekend.  I can't say I got it to work, but I
think that's because:
1)  There are lots of places that need to be changed
2) I haven't yet come up with a good automated search and replace.

The syntax is similar in spirit and format to the current system, but
different enough that a simple search and replace won't actually work.
It shouldn't be hard, I just need a few free moments--or someone else can
take a crack at it, of course.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] relicensing the web frontend as GNU GPL v2

2008-04-19 Thread Jesse Becker
On Sat, Apr 19, 2008 at 3:14 AM, Carlo Marcelo Arenas Belon
[EMAIL PROTECTED] wrote:
 most likely just a formality, as the web frontend templating system was based
  on the GPLv2+ TemplatePower class from the very beginning (at least as
  shown from the history in svn).

  a quick line count from the files involved says the contributors that will
  need to consent will be (including number of lines committed from all files
  in the web directory, including non-PHP files, which could just as well be
  discarded as an alternative):

  38 bnicholes
  87 carenas
 410 knobi1
 426 bernardli
 686 hawson
 830 sacerdoti
3940 massie

Fine with me.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Platform experts needed (was:Re: Ganglia 3.1.x stable branch has been created...)

2008-04-18 Thread Jesse Becker
On Fri, Apr 18, 2008 at 12:04 PM, Brad Nicholes [EMAIL PROTECTED] wrote:
So here is another request to all you platform experts out there.  The 
 Ganglia project will be rolling alpha tarballs of the Ganglia 3.1 version.  
 If the tarball does not work on your platform, please fix it and submit a 
 patch back to the project.  Ganglia 3.0.x already works on a variety of 
 platforms and we would like to see 3.1.x work on an equal or greater number.  
 But we need platform experts to make this happen.  Here is your chance to 
 jump in and help make Ganglia 3.1 the best release ever.

I just did a clean checkout of trunk/, and it seems to have built just
fine on OpenBSD 4.1.

This is a change from the recent past, where there were some bootstrap
problems on this platform.


-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Platform experts needed (was:Re: Ganglia 3.1.x stable branch has been created...)

2008-04-18 Thread Jesse Becker
On Fri, Apr 18, 2008 at 2:56 PM, Bernard Li [EMAIL PROTECTED] wrote:
I just did a clean checkout of trunk/, and it seems to have built just
fine on OpenBSD 4.1.

  Just to be sure, I would suggest you check out the 3.1 branch, even
  though it is probably not much different from trunk.

The problems I had recently were during the bootstrap stage, plus a few
minor issues with ./configure.  I just tried using r1256 from
branches/monitor-core-3.1, and it looks fine.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2



Re: [Ganglia-developers] Historical Data

2008-04-16 Thread Jesse Becker
On Wed, Apr 16, 2008 at 5:43 PM, Witham, Timothy D
[EMAIL PROTECTED] wrote:
  I don't have space for it since my grids are too huge, but it would be
  easier to just keep more detail in the RRDs which is decided at create
  time.  I haven't yet tried it, but gmetad/conf.c implies that the data
  retention policy could be changed in the config file (I don't see this
  option in the man page though; is that a bug?):

This is correct.  You can change the *initial* settings for an RRD
file *when it is created*.  If you already have RRD files, changing
the settings in the config file will have no effect.  If you want to
change your current files, then you need to export the existing data
('rrdtool dump'), create new files with new settings, then import the
old data ('rrdtool restore').
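
A minimal sketch of the export/import round trip, assuming the rrdtool
binary is on the PATH.  'rrdtool restore' reads the XML that 'rrdtool dump'
writes to stdout; the helper names and file paths are made up for the
example, and the step where the retention settings actually get changed is
deliberately left out.

```python
import subprocess


def dump_command(rrd_path: str) -> list:
    """Command that writes the contents of an RRD file to stdout as XML."""
    return ["rrdtool", "dump", rrd_path]


def restore_command(xml_path: str, new_rrd_path: str) -> list:
    """Command that builds a new RRD file from a dumped XML file."""
    return ["rrdtool", "restore", xml_path, new_rrd_path]


def dump_and_restore(rrd_path: str, xml_path: str, new_rrd_path: str) -> None:
    """Export one RRD to XML, then rebuild it as a new file."""
    with open(xml_path, "w") as xml_file:
        subprocess.run(dump_command(rrd_path), stdout=xml_file, check=True)
    # Changing the RRA layout would happen here, between the two steps.
    # That part is not shown in this sketch.
    subprocess.run(restore_command(xml_path, new_rrd_path), check=True)


# Hypothetical paths, printed rather than executed.
print(dump_command("/tmp/example/load_one.rrd"))
print(restore_command("/tmp/example/load_one.xml", "/tmp/example/load_one.new.rrd"))
```

Stop gmetad (or copy the files aside) before doing this for real, so the
file is not updated between the dump and the restore.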


  But maybe your parenthetical comment means you did that already but had
  too much waitIO?  And that's why you went to a cron job?  If so, you are
  storing the RRDs in tmpfs, right?

And read this paper:
  http://www.usenix.org/events/lisa07/tech/plonka.html

Not related to Ganglia, but it does discuss optimizing performance
with RRD files.  In short:  turn off read-ahead, and upgrade to a
recent version of rrdtool.

Also, I am curious as to how the performance of rrdtool would be
affected if we were to store related metrics in a single rrd file:
e.g., we could group cpu_(user,system,idle,wio,nice) in a single file,
which I think would reduce the resource usage of gmetad significantly.

  I have wondered that too.  Since RRD is random access, it seems like it
  should be at least as efficient and probably more efficient since there
  would be fewer files open.  But it would be difficult to change.  Now

The cost of calling open() is fairly low, and even on huge clusters,
I'd be surprised if this is hugely significant.  RRD files have a
short header section that stores information about the RRAs, the DSs,
and offsets as to the current pointer for the RRAs within the file.
With multiple metrics in a single file, you reduce the number of
open() calls, but increase the number of calls to seek() within a
single file.

One of the main points of the paper I mentioned above is that
read-ahead is almost entirely wasted on RRD files.  In order to
read/write a single value in an RRA, the OS will open the file, read
the header (which is short) plus many other blocks on disk because of
read-ahead settings.  Next, rrdtool must seek to the proper location
in the RRD file, read a bunch of blocks (which we don't care about),
then write the new data.   Repeat this seek/readahead/write pattern
for each RRA that needs to be updated.
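
For what it's worth, an application can also tell the kernel not to bother
with read-ahead on a particular file.  The sketch below uses
posix_fadvise() with POSIX_FADV_RANDOM for that, from Python on a Unix
system; it is only an illustration of the idea, not a description of what
gmetad or rrdtool actually does, and the function name is invented.

```python
import os


def read_at(path: str, offset: int, length: int) -> bytes:
    """Read `length` bytes at `offset`, hinting that access is random."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Advise the kernel that reads on this descriptor will be scattered,
        # so speculative read-ahead is not worth doing.  Skipped on
        # platforms where Python does not expose posix_fadvise.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
        # pread() seeks and reads in one call without moving the file offset.
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)
```

The hint is advisory: the kernel is free to ignore it, so it is worth
measuring before and after rather than assuming a gain.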


  each RRD is simple with DS:sum and DS:num for summaries; the metric is
  in the filename itself.  To change, you would need to put the metric
  names in the RRDs: DS:cpu_user_sum, etc. and I think you would have to
  update all metrics with one rrd_update call.  Of course this would work
  only for the standard metrics and extra metrics would still need to be
  in their own files.  Or, perhaps with the new metric groupings, each
  group could be an RRD file of related metrics.  And then you'd have to
  change the PHP to understand all this...

Yep.  You could only consolidate a few sets of metrics, since not all
systems support all of them.
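
As a sketch of what a single update for a grouped file could look like:
'rrdtool update' accepts a --template listing the DS names, followed by
one N:value:value... argument, and "U" marks a value as unknown.  The DS
names, the file path, and the helper below are hypothetical; nothing like
this exists in gmetad today.

```python
# Hypothetical DS names for a consolidated CPU file.
CPU_GROUP = ("cpu_user", "cpu_system", "cpu_idle", "cpu_wio", "cpu_nice")


def grouped_update_args(rrd_path: str, values: dict, ds_names=CPU_GROUP) -> list:
    """Build one 'rrdtool update' command covering every DS in the group."""
    template = ":".join(ds_names)
    # A metric the host does not report is written as U (unknown).
    fields = ":".join(
        "U" if values.get(name) is None else str(values[name])
        for name in ds_names
    )
    return ["rrdtool", "update", rrd_path, "--template", template, "N:" + fields]


# Example: a host that does not report cpu_wio.
print(grouped_update_args(
    "/tmp/example/cpu.rrd",
    {"cpu_user": 1.5, "cpu_system": 0.5, "cpu_idle": 97.0, "cpu_nice": 0},
))
```

Writing "U" for missing metrics keeps the file layout identical across
platforms, at the cost of storing columns that some hosts never fill.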

However, ideas for improving the FE are welcome.  Once we get 3.1 (or
3.2) out the door, I'd like to work on a new FE, perhaps with things
like consolidated RRD files in mind.


-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2


