Re: [Ganglia-developers] 3.1.0 wishlist
Bernard Li wrote:
> On 10/17/07, Matthias Blankenhaus [EMAIL PROTECTED] wrote:
>> Yes, this is straightforward. However, I was thinking that core 2.6
>> metrics should be in the standard set of metrics, no?
>
> Well I don't know -- what if folks are still running a 2.4 kernel ;-)

Then they get a slightly different set of metrics. :)

Here's a recent kernel from CentOS:

2.6.9-55.0.9.plus.c4smp
$ cat /proc/meminfo
MemTotal:       1025076 kB
MemFree:          69740 kB
Buffers:          46044 kB
Cached:          385280 kB
SwapCached:           0 kB
Active:          592224 kB
Inactive:        234664 kB
HighTotal:            0 kB
HighFree:             0 kB
LowTotal:       1025076 kB
LowFree:          69740 kB
SwapTotal:      2040244 kB
SwapFree:       2039964 kB
Dirty:                8 kB
Writeback:            0 kB
Mapped:          455388 kB
Slab:             90276 kB
CommitLimit:    2552780 kB
Committed_AS:    634552 kB
PageTables:       10732 kB
VmallocTotal: 536870911 kB
VmallocUsed:     298016 kB
VmallocChunk: 536572431 kB
HugePages_Total:      0
HugePages_Free:       0
Hugepagesize:      2048 kB

And one from an old RH 7.3 box:

2.4.20-24.7smp
$ cat meminfo
        total:       used:      free:  shared:   buffers:     cached:
Mem:  2114387968 2014195712 100192256        0 391311360 1422262272
Swap: 2089177088   81911808 2007265280
MemTotal:       2064832 kB
MemFree:          97844 kB
MemShared:            0 kB
Buffers:         382140 kB
Cached:         1381500 kB
SwapCached:        7428 kB
Active:         1461564 kB
ActiveAnon:       13412 kB
ActiveCache:    1448152 kB
Inact_dirty:      25004 kB
Inact_laundry:   262828 kB
Inact_clean:      28056 kB
Inact_target:    355488 kB
HighTotal:      1179584 kB
HighFree:         54924 kB
LowTotal:        885248 kB
LowFree:          42920 kB
SwapTotal:      2040212 kB
SwapFree:       1960220 kB

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)

This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now http://get.splunk.com/
_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
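To make the "slightly different set of metrics" concrete, here is a minimal Python sketch of a parser that copes with both layouts. It is an illustration only, not code from Ganglia: the function names are made up, the sample strings are a few lines taken from the two listings above, and it assumes that only the "Name: value kB" lines matter (the 2.4-only "Mem:" and "Swap:" byte-count rows and the HugePages counters are skipped).

```python
import re

# Matches lines such as "MemFree:   69740 kB"; the 2.4-style summary rows
# ("Mem: 2114387968 ...") and unit-less counters do not match and are skipped.
_KB_LINE = re.compile(r"^(\w+):\s+(\d+)\s+kB\s*$")


def parse_meminfo(text):
    """Return a dict mapping field name to its value in kB."""
    fields = {}
    for line in text.splitlines():
        match = _KB_LINE.match(line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def common_fields(a, b):
    """Field names that two parsed meminfo dicts have in common."""
    return sorted(set(a) & set(b))


SAMPLE_26 = """MemTotal: 1025076 kB
MemFree: 69740 kB
Dirty: 8 kB
HugePages_Total: 0
"""

SAMPLE_24 = """        total:    used:    free:  shared: buffers:  cached:
Mem:  2114387968 2014195712 100192256 0 391311360 1422262272
MemTotal: 2064832 kB
MemFree: 97844 kB
MemShared: 0 kB
"""

if __name__ == "__main__":
    k26 = parse_meminfo(SAMPLE_26)
    k24 = parse_meminfo(SAMPLE_24)
    print("shared fields:", common_fields(k26, k24))
    print("2.6 only:", sorted(set(k26) - set(k24)))
    print("2.4 only:", sorted(set(k24) - set(k26)))
```

A collector built this way reports whatever fields the running kernel exposes, so a 2.4 host simply ends up with a different metric set than a 2.6 host.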
Re: [Ganglia-developers] gmetric going away in 3.1 ?
john allspaw wrote:
> Hey all -
>
> While I love the idea of the python module, I'd like the plain-old
> gmetric binary to stay as-is. I'm not sure if there are/were plans on
> getting rid of it in favor of the new custom framework stuff. Some of
> the reasons why I think it should stay:

I also think that it should stay. Having a basic command line program that throws data into ganglia is incredibly useful. Python bindings are great (and Perl bindings would be nifty too), but not all sites have Python installed. With a compiled version of gmetric, if you can run gmond for basic stats collection, you can then run any custom data collection you want.

> Thoughts ?

Keep gmetric, and provide (or accept contributions for) bindings for other programming languages.

Just my $0.02...

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)
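As an illustration of why the standalone binary is handy, any script that can spawn a process can publish a metric. The Python sketch below only assembles and runs a gmetric command line; the helper names are hypothetical, and it assumes gmetric is on the PATH and accepts the long options --name, --value, --type and --units.

```python
import subprocess

# Value types accepted by gmetric's --type option.
VALID_TYPES = {
    "string", "int8", "uint8", "int16", "uint16",
    "int32", "uint32", "float", "double",
}


def gmetric_argv(name, value, type_, units=""):
    """Build the argument list for one gmetric invocation."""
    if type_ not in VALID_TYPES:
        raise ValueError(f"unsupported gmetric type: {type_!r}")
    argv = ["gmetric", f"--name={name}", f"--value={value}", f"--type={type_}"]
    if units:
        argv.append(f"--units={units}")
    return argv


def send_metric(name, value, type_, units=""):
    """Run gmetric; raises CalledProcessError if it exits non-zero."""
    subprocess.run(gmetric_argv(name, value, type_, units), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # on machines without Ganglia installed.
    print(" ".join(gmetric_argv("room_temp", 21.5, "float", "Celsius")))
```

The same few lines translate directly to shell, Perl, or anything else, which is the point made above: no language bindings are required on the monitored host.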
Re: [Ganglia-developers] ideas for PHP frontend improvements
[EMAIL PROTECTED] wrote:
> I've been working on a list of ways to improve the PHP web frontend.
> Just curious what everyone else thinks of these.
>
> - add support for a caching layer for generated graphs
>
>   Add code that allows caching of generated graphs, either on the
>   filesystem or in a memcache cache. When generating a graph, serve the
>   cached version if the refresh time for that metric has not passed.

The rrdtool program already has a --lazy option that will (quoting from the rrdgraph man page):

  "[o]nly generate the graph if the current graph is out of date or not existent."

While not a complete in-ganglia solution, it should be an easy change.

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)
Re: [Ganglia-developers] ideas for PHP frontend improvements
Matthew Chambers wrote:
> I don't know if I would call it a difficult change, but it's a different
> method of generating the graph. Currently rrdtool writes to standard
> output and that gets sent straight to the client (after the HTTP
> header).

I'd forgotten that the graphs were generated on the fly. That does change things a bit, and makes it a bit more complicated to change.

> The --lazy option obviously only works for writing to a file, which
> means the frontend will need a place to store those graphs. That's
> do-able, but it means additional I/O for the server which the current
> solution avoids - not to mention the additional time to generate the
> graphs due to waiting for I/O. I'm a bit perplexed about where to

But consider that currently each file is generated on the fly, and there is no caching done at all. The webserver not only has to generate the image, including the IO to read the RRD file, but also serve the bits over the network.

As a test, I just requested a specific image from one of my Ganglia installations twice in under a second. According to the apache logs for this host, it sent the full 10 KB both times, and presumably had to generate the image from scratch each time. The --lazy option would, I think, stat() the current on-disk graph file, stat() the corresponding RRD file(s), and generate a new graph only if needed. Of course, you have to generate a unique filename for each graph, but I don't see that as too hard.

It looks like the current method (dynamic graph generation) has read IO with every request. If things were changed to use --lazy, I *think* that there would be read and write IO to generate the graph, but subsequent requests would only create a new chart if there is new data, and allow the webserver to make use of various caching mechanisms. Currently, the images are explicitly not cached at all, so that would have to change as well.

Firefox says that the small graphs are almost all under 7 KB in size, while the medium graphs are less than 16 KB. I've got about 30 medium size images per host, plus a few small images. So in my case, it's about half a MB per host (30 x 16 KB is roughly 480 KB). Dunno how applicable that would be to other sites.

I suppose the real question is what's the bottleneck? Is it the graph generation? The network? IO reading files off the disk?

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)
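The stat()-based check and the unique filename described above can be sketched in a few lines. This is a Python illustration of the idea only, not how rrdtool implements --lazy and not frontend code; both function names and the choice of a SHA-1 hash over the request parameters are assumptions made for the example.

```python
import hashlib
import os


def cache_filename(params):
    """Derive a stable cache file name from the graph request parameters."""
    # Sort the keys so the same request always maps to the same name.
    key = "&".join(f"{k}={params[k]}" for k in sorted(params))
    return hashlib.sha1(key.encode("utf-8")).hexdigest() + ".png"


def graph_is_stale(graph_path, rrd_paths):
    """True if the cached graph is missing or older than any source RRD."""
    try:
        graph_mtime = os.stat(graph_path).st_mtime
    except FileNotFoundError:
        return True
    return any(os.stat(p).st_mtime > graph_mtime for p in rrd_paths)


if __name__ == "__main__":
    request = {"h": "node01", "m": "load_one", "z": "medium", "r": "hour"}
    print(cache_filename(request))
```

With a check like this the cost of a repeat request drops to a couple of stat() calls plus serving a static file, at the price of one write per regenerated graph.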
Re: [Ganglia-developers] OpenBSD tweaks
Carlo Marcelo Arenas Belon wrote:
> On Thu, Nov 29, 2007 at 09:56:14AM -0500, Jesse Becker wrote:
>> I've been trying to build the trunk version of Ganglia on an OpenBSD
>> 4.1 box recently, and it isn't for the faint of heart.
>
> sadly true, building trunk isn't that easy if you are not using Fedora,
> indeed I didn't even think that doing a bootstrap in OpenBSD was
> possible.

Surprises never cease. ;-)

> what about expat? 2.0.0?

No problems with expat, but it's version expat-2.0.0.

> if not using the python plugins then it would probably be easier to use
> the included libraries (specially considering that after OpenBSD 4.2
> expat will be part of base)
>
>   ./configure --enable-static-build

I've tried that, but with mixed success.

>> --- configure.in.orig	Wed Nov 28 16:34:39 2007
>> +++ configure.in	Wed Nov 28 16:53:16 2007
>> @@ -339,7 +339,7 @@
>>  echo "Checking for confuse"
>>  if test x"$libconfusepath" != x && test x"$libconfusepath" != xyes ; then
>>    CFLAGS="$CFLAGS -I$libconfusepath/include"
>> -  LDFLAGS="$LDFLAGS -L$libconfusepath/lib"
>> +  LDFLAGS="$LDFLAGS -L$libconfusepath/lib -lintl -liconv"
>>    echo "Added -I$libconfusepath/include to CFLAGS"
>>    echo "Added -L$libconfusepath/lib to LDLAGS"
>>  fi
>
> we are going to have to probably put that conditionally for OpenBSD to
> avoid pulling extra dependencies for other platforms that might not
> need it.

Sure. A quick check on a CentOS 4 system shows that it does *not* directly need libintl or libiconv.

One clarification on this, the dependency tree is:

  libconfuse --> libintl --> libiconv

Not:

  libconfuse --> libintl
  libconfuse --> libiconv

i.e. libconfuse only requires libintl, not both.

> the problem being of course (and I didn't check that, so I might be
> wrong) that there might be some specific flavor for the libconfuse port
> which might not need that as well, and will fail to build.
>
> did you try autoconf-2.51p1?

I didn't. I can try to test that if there's a need.

> will take a look at this here and see if I can get autoconf to behave.

Thanks.

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)
Re: [Ganglia-developers] Static ganglia builds (was OpenBSD tweaks)
Brad Nicholes wrote:
> Correct. Both the gmond binary and the metric modules have to be
> running the same APR code in the same process. If both are statically
> linked to a libapr they would both have to initialize their own APR
> library, create their own memory pools, sockets, etc. They would not be
> able to share anything such as registering a module cleanup routine on
> a gmond owned memory pool. This is the reason why it was necessary to
> move to a dynamically loaded libapr for 3.1 and why
> --enable-static-build disables all of the loadable module
> functionality. The only way around it would be to not allow a loadable
> metric module to use any APR functions or be passed any APR created
> data. This would greatly restrict the functionality of a loadable
> module. So building ganglia 3.1.x statically, no matter which version of
> APR you use, basically gets you ganglia 3.0.5. This doesn't discount
> the idea of delivering a separate tarball that contains the external
> libraries, it just doesn't solve the dynamic vs static linking issue.

There are two issues here:

1) supporting statically compiled binaries
2) having the library code in-tree or provided separately

Static binaries are wonderful in some cases, and have saved my bacon a number of times (although not with ganglia specifically). I've seen other projects provide a -static option that creates static libraries, based on whatever versions are found. This could either replace, or be in addition to, --enable-static-build (although the difference should be explained in the configure --help options).

Could the dynamically loaded metric modules be statically linked? Obviously, you lose the dynamic part, but would retain the ability to monitor things. I've also heard that it is possible to link some libraries statically, but leave others dynamic. Perhaps that is also an option?

If a separate tarball of apr/expat/confuse is provided, then I think it would be appropriate to include a short bit of documentation along the lines of "compile these three libraries as follows... Run the main ganglia ./configure script with --foo and --bar to use them".

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)
Re: [Ganglia-developers] Automatic storage handling for gmetad's RRDs
Matthias Blankenhaus wrote:
> Howdy !

Hello.

> After tossing several ideas around we ended up with the following
> proposal. The idea is to control the RRD archives solely via
> gmetad.conf, more specifically via the 'rrd_rootdir' variable. It
> points by default to '/var/lib/ganglia/rrds'. Now, whenever the user
> modifies this value to something like '/dev/shm/rrds' or 'tmpfs' and
> then restarts gmetad via '/etc/init.d/gmetad restart' we essentially
> move the RRD archive around and potentially alter fstab and issue
> necessary mount commands.

While I love the idea of integrating tmpfs (et al) support into Ganglia, I would vote *against* having ganglia modify /etc/fstab, run mount, etc. Performing these tasks requires root access, and I don't see a need to add this.

Instead, I would suggest something a bit more general. Essentially, provide two new options: rrd_backupdir and rrd_backuptime. If rrd_backupdir is set, and is not equal to rrd_rootdir, then gmetad can do a periodic copy according to the frequency set in rrd_backuptime. When gmetad starts, it will try to restore the rrd files from rrd_backupdir into rrd_rootdir (which lives on a ramdisk), and when it shuts down (cleanly!), it will force a copy to non-ramdisk files.

> In summary, when gmetad starts up we restore the RRD DB from a FS to
> memory. When gmetad stops we back it up from memory to a FS. It is
> debatable whether a cron job should once in a while back up the RRDs in
> addition to the gmetad start / stop scenario.

Perhaps have gmetad force a copy when it receives a SIGUSR1, or just implement a simple timer that reads rrd_backuptime?

And is there any reason why a simple cp -r /dev/shm/rrds /var/lib/ganglia/ won't work? (I'm trying to think of failure modes, and whether there is any special file locking that would have to be done.)

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)
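The backup and restore steps proposed above can be sketched as one copy routine used in both directions. This Python version is only an illustration: rrd_backupdir and rrd_backuptime are the options proposed in this thread, not existing gmetad settings, and the function names are made up. Each file is copied to a temporary name and then renamed, so an interrupted run never leaves a truncated .rrd in the destination; it does nothing about a file being updated while it is read, which is the locking question raised above.

```python
import os
import shutil
from pathlib import Path


def copy_rrds(src_dir, dst_dir):
    """Copy every .rrd file under src_dir into dst_dir; return the count."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    copied = 0
    for src in src_dir.rglob("*.rrd"):
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        tmp = dst.with_name(dst.name + ".tmp")
        shutil.copy2(src, tmp)   # copy data and timestamps to a temp name
        os.replace(tmp, dst)     # atomic rename within one filesystem
        copied += 1
    return copied


def backup_due(last_backup, now, rrd_backuptime):
    """True once at least rrd_backuptime seconds have passed."""
    return now - last_backup >= rrd_backuptime


if __name__ == "__main__":
    # Restore at startup, back up on a timer or at clean shutdown:
    #   copy_rrds(rrd_backupdir, rrd_rootdir)
    #   copy_rrds(rrd_rootdir, rrd_backupdir)
    print(backup_due(last_backup=0, now=900, rrd_backuptime=600))
```

The same routine could be triggered from a signal handler or a cron job, so the choice between SIGUSR1, a timer, and cron stays open.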
Re: [Ganglia-developers] Ganglia 3.0.x security fix
On Dec 10, 2007 1:44 PM, Bernard Li [EMAIL PROTECTED] wrote:
> The latest snapshot of the 3.0.x branch with the fix is available here:
>
> http://www.ganglia.info/snapshots/3.0.x/
>
> We would like to make an official release of 3.0.6 ASAP to address
> this security issue so we would really appreciate it if the community
> could help us test the snapshot to confirm that everything is working
> fine.

I can confirm that the snapshot functions (gmond, gmetad, and the web FE) on OpenBSD:

  OpenBSD mowett.localdomain 4.1 GENERIC#1435 i386

I have not done any checking on the actual security vulnerability.
[Ganglia-developers] PNG vs SVG
There was a discussion on IRC recently about PNG vs. SVG output files. The questions we had revolved around how hard it is to generate each format, and to decode it as well. Lacking any hard numbers, I decided to make up my own. :)

All tests were run on my laptop, a Pentium-M 1.6GHz box (cpu at full-throttle), using rrdtool 1.2.23. The .rrd files came from a recent Ganglia build from the SVN trunk.

I copied an .rrd file into a ramdisk, and ran a shell script to benchmark creating and rendering PNG and SVG files. Creating the files was pretty simple: I just ran these two commands a thousand times (grin):

  rrdtool graph /dev/null -a PNG DEF:load=$FILE:sum:AVERAGE LINE1:load#ff:load
  rrdtool graph /dev/null -a SVG DEF:load=$FILE:sum:AVERAGE LINE1:load#ff:load

Before the loops, I ran 'dd' on the various input files to make sure they were cached, although since the tests were run on a RAM disk, that shouldn't matter much.

Generating 1,000 PNG files took:
  real 28.57
  user 21.89
  sys 2.29

Generating 1,000 SVG files took much less time:
  real 4.18
  user 2.34
  sys 1.03

This makes quite a bit of sense. Since SVG is just XML, all that really needs to be done is wrap some XML stuff around the datapoints. Making the PNG actually requires plotting points and generating the bitmap.

Once we make the image files, we need to decode them. This is a bit harder, and I couldn't think of a way to easily benchmark Firefox's SVG rendering code. Instead, I used the 'convert' program from ImageMagick to convert the output SVG and PNG files into the next closest thing: a PPM file. I realize that this does not tell us anything about browser rendering, and might be completely useless. I hope, however, that we can get a rough idea of how hard it is for a browser to render an SVG file relative to a PNG file. Thus, I ran these commands 1,000 times each:

  convert -depth 8 +antialias test.png ppm:- > /dev/null
  convert -depth 8 +antialias test.svg ppm:- > /dev/null

The input files are the files created by rrdtool, with no other manipulation. It is interesting to note that the image dimensions on the .svg file are slightly smaller than the png:

  test.png PNG 481x168 481x168+0+0 DirectClass 8-bit 15.2695kb
  test.svg SVG 470x168 470x168+0+0 DirectClass 16-bit 6.28906kb

Also note that the .svg file is less than half the size of the .png.

The numbers:

Decoding PNG files:
  real 47.50
  user 37.94
  sys 3.79

Decoding SVG files:
  real 149.44
  user 119.52
  sys 7.34

So it looks like generating SVG files is clearly faster than generating PNG files, by a factor of about 9 in user CPU time (about 7 in wall-clock time). On the other hand, decoding with the 'convert' program was clearly slower for SVG files, by a factor of about 3.

The source RRD file and script are available if anyone wants to play with them on their own.

Comments?

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
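For anyone who wants to redo the arithmetic on the timings above, a small Python sketch follows. The numbers are copied from the post (seconds per 1,000 runs); the dictionary layout and function names are just one convenient way to hold them.

```python
# Seconds for 1,000 runs, copied from the measurements in the post.
TIMINGS = {
    "generate": {
        "png": {"real": 28.57, "user": 21.89, "sys": 2.29},
        "svg": {"real": 4.18, "user": 2.34, "sys": 1.03},
    },
    "decode": {
        "png": {"real": 47.50, "user": 37.94, "sys": 3.79},
        "svg": {"real": 149.44, "user": 119.52, "sys": 7.34},
    },
}


def png_over_svg(stage, kind):
    """How many times longer PNG took than SVG for a stage and timer."""
    return TIMINGS[stage]["png"][kind] / TIMINGS[stage]["svg"][kind]


def round_trip_ms(fmt):
    """Wall-clock milliseconds to generate plus decode one image."""
    total = TIMINGS["generate"][fmt]["real"] + TIMINGS["decode"][fmt]["real"]
    return total / 1000 * 1000  # seconds per 1,000 runs equals ms per run


if __name__ == "__main__":
    for stage in ("generate", "decode"):
        for kind in ("real", "user"):
            print(f"{stage:8s} {kind}: PNG/SVG = {png_over_svg(stage, kind):.2f}")
    for fmt in ("png", "svg"):
        print(f"{fmt}: {round_trip_ms(fmt):.2f} ms generate + decode")
```

Adding the two stages together, the cheaper generation on the server does not make up for the slower decode in this test, so which format wins depends on where the bottleneck is.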
Re: [Ganglia-developers] PNG vs SVG
Matt Ryan wrote:
> I had heard once that there were browser compatibility issues with SVG -
> specifically, that they didn't work properly on browsers other than
> IE. Perhaps I misunderstood??

Firefox can render SVG files. There are some minor syntactic hoops you need to jump through if you want to have the images inline, but it's not difficult. You can't use the usual IMG tag, but instead use EMBED. Now, as to browsers rendering this correctly in the first place... that's a different story. Apparently, you can also do completely inline SVG as well, but I'm not sure about that.

I think that this is something that could be enabled or disabled in conf.php, on a per-site basis, and the default should be to generate PNG files -- the current behavior.

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)
[Ganglia-developers] Ganglia.info download page needs to be updated.
The main Ganglia download page still points to 3.0.5:

http://ganglia.info/?page_id=55

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
[Ganglia-developers] Modular graph.php
Attached are several files that break graph.php into several smaller and more modular components. Each chart is now generated from a specific .php file, and graph.php now serves as a frontend to these. There is one file per report, and a generic metric.php for reporting individual metrics.

Adding a new report should be very straightforward: add a new report file named namehere_report.php to the new web/graphs/ directory, and add the report name to the $optional_graphs variable in conf.php.

I'm including only one actual patch file (to functions.php) for this change. Everything else is either a new file (all *_report.php files), or the diff is large enough that following it may not make much sense. I've tested this against r919 and updated it for r920.

Here are the files:

functions.php-add_sanitize.patch
  Short patch to add a sanitize function, no other changes. Better to have this here, and usable everywhere in the web FE, than just for graphs.

graph.php
  A *replacement* for the current graph.php. I can create a formal patch file, but it is, I think, easier to just replace the file completely (then make your own diff if you want). There are a number of changes in this file. The largest is the removal of all code to actually generate the chart series for rrdtool. The huge if {} elseif {} block is gone, with the contents pulled out into multiple *_report.php files. Values from $_GET are checked in a slightly more consistent manner, and a fair bit of documentation and commenting has been added as well.

graphs/*
  This is a new directory, with all new files. This directory needs to be created; then place all *_report.php files and metric.php in this location. A more commented version of the CPU report is also included as sample_report.php to help with writing your own custom reports.

With only a few minor cosmetic exceptions, the actual charts are the same code as from the monolithic graph.php. None of the reports have changed, just the structure around them.

I've not heard anything about progress on the proposed improvements for 3.1.0 to the front end, and so far as things go, this isn't a huge change. However, it's a start, and should help make future changes to graphs a bit more self-contained. I'm sure that there can be some improvements made, so right now I'd just like help testing, and to get comments and feedback on the code.

So...any comments or criticisms?

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2

Attachments: functions.php-add_sanitize.patch, graph.php, cpu_report.php, load_report.php, mem_report.php, metric.php, network_report.php, packet_report.php, sample_report.php
Re: [Ganglia-developers] Ganglia automatic snapshots (was Re:[Ganglia-general] Ganglia rrdtool problem?)
Bernard Li wrote:
> I can build and post newer 3.1.x snapshots, but I don't think I'll have
> time to test them before posting. They would be svn snapshots.

I'd expect all of the usual disclaimers to apply: code won't work, may break, etc, etc. Use at your own risk.

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)
[Ganglia-developers] Patch to fix value figures in per-host metrics
Another small patch. This one fixes an error introduced in r919 that breaks the proper display of current metric values in the charts. Specifically, the "(now 0.23)" and similar text.

The bug is that the 'v' entity is run through clean_number, which *only* handles integers; floating point numbers are rejected out of hand. The patch adds a new function called clean_float, which will properly pass floating point numbers (including those in scientific notation).

Looking over the code, it should be possible to use clean_float everywhere clean_number is currently called, or to add support for floating point numbers to clean_number, since we really want to check for valid numerical data, not just a string of pure digits.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2

Attachment: value_floating_point.patch
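The actual patch is PHP and is not reproduced in this message, so the following is only a Python sketch of the kind of check described: accept plain integers, decimals and scientific notation, reject everything else. The regular expression and the choice to return None on rejection are assumptions made for the illustration, not the behaviour of the real clean_float.

```python
import re

# Optional sign, digits with an optional fraction (or a bare fraction),
# and an optional exponent such as e-3 or E+10.
_FLOAT_RE = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def clean_float(value):
    """Return value unchanged if it is a well-formed number, else None."""
    if isinstance(value, str) and _FLOAT_RE.fullmatch(value):
        return value
    return None


if __name__ == "__main__":
    for candidate in ["0.23", "42", "1.5e-3", "abc", "0.23<script>"]:
        print(repr(candidate), "->", clean_float(candidate))
```

Using a full match rather than a prefix match is what keeps trailing junk, such as markup appended to a number, from slipping through.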
[Ganglia-developers] rrdtool 1.0.x/1.2.x font differences
I noticed recently that two common versions of rrdtool have different font choices, which directly affects Ganglia's graphs. I specifically noticed this in version 1.0.49 and version 1.2.x; both of these are versions commonly shipped by various vendors.

The problem is that the default font used by the two versions is different. I've posted examples here:

http://pliernose.ath.cx/ganglia/fonts/

Note the difference in font size, and specifically that there are two lines of text in the 1.0.49 image, but three lines of text in the 1.2.x version. This directly affects the fudge values used in the various charts.

I did a little digging, and I think the font change happened around rrdtool version 1.2.2, but I'm not completely sure. In any event, it was probably years ago. Newer versions of rrdtool accept font specifications, but 1.0.49 does not.

So, the question to the list: which font to support? I'd guess the 1.2.x series, but there are still a lot of copies of rrdtool-1.0.49 running out there.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
[Ganglia-developers] Patch for modular graph.php.
Below is a link to a patch against the current SVN trunk (r931) to split graphing into discrete parts. This is a rehash of a post from early January[1], but with a number of fixes, and a lot of updated documentation in the code. I would post it inline, but it's too large for the mailing list.[2]

The patch can be found here:

http://pliernose.ath.cx/ganglia/patches/modular-graph.patch

As always, comments are welcome.

As before, each chart is generated from a specific .php file, with graph.php acting as a frontend of sorts. There is one file per report (Load, CPU, Network, etc), a generic metric.php for reporting individual metrics, and a heavily commented sample_report.php file. Gone is the huge if/else block for all of the different graph types. Various sanitation and check routines are, I hope, more clearly laid out.

Adding a new report should now be straightforward:

1) Add a new report file named namehere_report.php in web/graph.d/
2) Add the report name to the $optional_graphs variable in conf.php.

This should also help down the road during the hypothetical UI overhaul. Since the graphs are split more cleanly, it should be easier for the front end code (and users) to request specific graphs.

For people who like this sort of thing, here's a diffstat of the patch:

  conf.php                   |   3
  functions.php              |   5
  graph.d/cpu_report.php     |  77
  graph.d/load_report.php    |  48
  graph.d/mem_report.php     |  58
  graph.d/metric.php         | 127
  graph.d/network_report.php |  35
  graph.d/packet_report.php  |  35
  graph.d/sample_report.php  | 137
  graph.php                  | 389
  10 files changed, 652 insertions(+), 262 deletions(-)

This nicely reflects the complexity of the various reports. The individual metrics are, by far, the most complicated, as there is different handling depending on context. The sample_report is actually just the CPU report, but has about 50 lines of comments on how the graphs are created.

I have been using this patch in production for the last two weeks on a moderate sized cluster, and it has worked well.

[1] http://sf.net/mailarchive/message.php?msg_id=dbdc3b250801072023g70116de4lf7bb2751ad9eaba2%40mail.gmail.com
[2] List moderators: feel free to reject that other email to the list. :-)

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
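The one-file-per-report layout lends itself to a very small dispatcher. The patch itself is PHP and is only linked, not shown, so the Python sketch below is a guess at the general shape rather than a description of graph.php: the function name, the rule that a report called "cpu" maps to cpu_report.php, and the fallback to metric.php are all assumptions for illustration.

```python
import re
from pathlib import Path

# Only letters, digits and underscores: no dots or slashes, so a request
# cannot point outside the graph directory.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def resolve_graph_file(graph_dir, name):
    """Map a requested graph name to the file that should render it."""
    if not _SAFE_NAME.fullmatch(name):
        raise ValueError(f"unsafe graph name: {name!r}")
    graph_dir = Path(graph_dir)
    report = graph_dir / f"{name}_report.php"
    if report.is_file():
        return report                # a dedicated report exists
    return graph_dir / "metric.php"  # otherwise treat it as a single metric


if __name__ == "__main__":
    print(resolve_graph_file("web/graph.d", "cpu"))
    print(resolve_graph_file("web/graph.d", "load_one"))
```

With a lookup of this kind, adding a report is a matter of dropping a new file into the directory, which is the property the two steps above are after.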
Re: [Ganglia-developers] Ganglia issue with RRDTool 1.3 beta3
On Feb 5, 2008 10:08 AM, Jarod Wilson [EMAIL PROTECTED] wrote:
> On Tuesday 05 February 2008 10:04:00 am Jesse Becker wrote:
>> On Feb 5, 2008 9:57 AM, Jarod Wilson [EMAIL PROTECTED] wrote:
>>> http://koji.fedoraproject.org/packages/rrdtool/1.3/0.6.beta3.fc9/
>>>
>>> (despite the fc9 tag, it'll run just fine on Fedora 8 too)
>>
>> How about RHEL4 or 5?
>
> Less certain... For RHEL4 and RHEL5, I'd suggest perhaps doing a rebuild
> of the package, but the rrdtool already available for 'em in EPEL
> shouldn't have this memory leak problem.

Rebuilding is fine. Those are just the types of systems that I have available for testing.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Ganglia issue with RRDTool 1.3 beta3
On Feb 5, 2008 9:57 AM, Jarod Wilson [EMAIL PROTECTED] wrote:
> http://koji.fedoraproject.org/packages/rrdtool/1.3/0.6.beta3.fc9/
>
> (despite the fc9 tag, it'll run just fine on Fedora 8 too)

How about RHEL4 or 5?

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] *** attempt to put segment in horiz list twice
On Feb 4, 2008 8:16 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Have been getting the following message in apache logs:
>
> *** attempt to put segment in horiz list twice
>
> I think this started cropping up after the xss-fixes in both 3.0.x branch and trunk. I think these are benign messages, but it would be nice to get rid of them instead of taking up space. Anybody else can confirm they are getting these messages?

I'm not seeing them on my production install at work, either from a 3.0.5 frontend or from a trunk frontend. I'm using rrdtool 1.2.23 and libart-2.3.6 on a CentOS 4.6 box.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Python modules error logging
On Feb 7, 2008 6:01 PM, Bernard Li [EMAIL PROTECTED] wrote:
> For one of my Python modules, I forgot to specify the value type, and when I start gmond, I got the following messages in /var/log/messages:
>
> GMOND[13687]: [PYTHON] No value type given. Using uint.
>
> Is it possible for it to log *which* module is causing this warning?

I took a quick look, and this is thrown by mod_python.c:342. At this point in the code, I'm not sure that the specific module information is available. The function fill_metric_info looks like it parses minfo, a pointer to py_metric_init_t (defined at the top of the same file). I don't see any information about the module in that structure. Probably the best you could do is print as much other information about the metric as you can, and work backwards. :-/

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Semi-serious bug in 3.0.6
On Feb 8, 2008 3:44 PM, Bernard Li [EMAIL PROTECTED] wrote:
> In my books -- this is quite a bad bug and I think we need to do a release (either as 3.0.6.1 or 3.0.7). Thoughts?

I'd vote for 3.0.7 instead of 3.0.x.y.

> BTW, given the fact that 3.1.0 will break compatibility with 3.0.x, I have a feeling that a lot of folks probably won't want to upgrade (unless we can work out the upgrade path for past data). So, perhaps we might need to maintain the 3.0.x branch after all...

Was there a change in the .rrd file configuration? I haven't noticed one, and I've got several web frontends (3.0.5, 3.0.6, and trunk) all running off the same set of .rrd files. One ganglia install has 3.0.5 and the web FE from trunk working off .rrd files created by 3.0.4 (gmetad is from 3.0.5). Another system has frontends from 3.0.5, 3.0.6 and trunk, using gmetad from trunk.

Also, many people aren't going to upgrade unless $VENDOR issues a patch. It's just that simple, and many places *can't* upgrade *unless* there's an official patch (even if they want to upgrade). Rocks is probably one of the largest vendors, and they are still on 3.0.4. So the 3.0.x line is probably here to stay for a while, alas.

Heck, Debian is still on 2.5.7 for the sarge, etch, lenny, and sid releases (I'm not suggesting that the 2.5.x line still be supported).

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Semi-serious bug in 3.0.6
Here is a small patch against version 3.0.6 (*NOT* against trunk) that should fix the "now 0.00" problem. It is simple enough:

---- cut here ----
--- functions.php.orig Fri Dec 14 18:42:42 2007
+++ functions.php Fri Feb 8 22:47:39 2008
@@ -422,13 +422,8 @@
 #---
 # If arg is entirely numeric, return it. Otherwise, return null.
-function clean_number( $digit )
-{
-  $return_value = null;
-  if( ctype_digit( $digit ) ) {
-    $return_value = $digit;
-  }
-  return $return_value;
+function clean_number( $digit ) {
+  return is_numeric($digit) ? $digit : null;
 }
 #---
---- cut here ----

The original problem is that ctype_digit is literal in what it checks for: digits. So "4" would pass, but "4.0" would not, since that pesky non-digit "." character is there. We actually want valid numbers, not just digits. is_numeric should work here, or some fancy regex (but that's probably slower and not needed).

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
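To make the difference concrete, here is a minimal sketch that runs the two checks side by side. clean_number_old() is a hypothetical name for the pre-patch behaviour (in the real file both versions are called clean_number), and the sample inputs are only illustrative.

```php
<?php
// Pre-patch behaviour (hypothetical name): accepts only strings that
// consist entirely of the digits 0-9.
function clean_number_old($digit) {
    return ctype_digit($digit) ? $digit : null;
}

// Patched behaviour: accepts any numeric string, including decimals.
function clean_number($digit) {
    return is_numeric($digit) ? $digit : null;
}

// Print how each version treats a few sample query-string values.
foreach (array('4', '4.0', '-1.5', 'abc') as $input) {
    printf("%-5s old=%-6s new=%s\n",
        $input,
        var_export(clean_number_old($input), true),
        var_export(clean_number($input), true));
}
```

With these inputs the old version returns null for 4.0 and -1.5, while the patched one returns them unchanged. Note that is_numeric() is broader than plain decimals: it also accepts exponent forms such as 1e5, which may or may not be wanted for a given parameter.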
[Ganglia-developers] Vertical label patch
Since we seem to be tossing out minor web front end patches tonight, here's one that fixes the persistent "0 - 0.00" that shows up on all of the per-metric graphs. The patch applies against 3.0.6 and trunk.

The new behavior is to do the following:

* If an explicit vertical label is passed, use it.
* If a valid maximum or minimum value has been passed, via the 'x' or 'n' URL parameters respectively, use them.
* If neither is passed, suppress printing a vertical label altogether.

Note that there should be *something* printed as a label for alignment purposes. Otherwise, the sizes of the graphs are off by a few pixels, and it looks really bad.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2

vert-label.patch Description: Binary data
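The three rules in the message above can be sketched roughly as follows. This is not the contents of vert-label.patch: the function name choose_vertical_label(), its signature, the requirement that both limits be present, the treatment of a 0 - 0 range as invalid, and the single-space placeholder are all assumptions made for illustration.

```php
<?php
// Hypothetical helper illustrating the label-selection rules.
function choose_vertical_label($vlabel, $max, $min) {
    // Rule 1: an explicit vertical label always wins.
    if (is_string($vlabel) && $vlabel !== '') {
        return $vlabel;
    }
    // Rule 2: otherwise use the range, but only when both limits are
    // numeric and the range is not the meaningless 0 - 0 (assumption).
    if (is_numeric($min) && is_numeric($max)
        && ((float)$min != 0 || (float)$max != 0)) {
        return "$min - $max";
    }
    // Rule 3: nothing useful to show. A single space is returned so that
    // a label is still emitted and graph sizes stay aligned (assumption).
    return ' ';
}

echo '[', choose_vertical_label(null, '0.00', '0'), "]\n";
echo '[', choose_vertical_label(null, '8.0', '0'), "]\n";
```

Returning a blank placeholder instead of omitting the label is one way to satisfy the alignment note; whether rrdtool lays out a blank label identically to a real one would need to be checked against the rrdtool version in use.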
Re: [Ganglia-developers] Semi-serious bug in 3.0.6
On Feb 8, 2008 11:33 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Hi Jesse:
>
> On 2/8/08, Jesse Becker [EMAIL PROTECTED] wrote:
>> Here is a small patch against version 3.0.6 (*NOT* against trunk) that should fix the now 0.00 problem. It is simple enough:
>
> I thought the fix for the now 0.00 problem was in changeset 926 (as I mentioned in my original email), and that cleanly applies to the branch. Am I missing something here?

Slightly different approaches. The r926 patch adds a new subroutine in functions.php. The one I just posted fixes an existing one. Take your pick. I actually think that using is_numeric() is a better solution than using a regex.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] clean_float() vs clean_number()
On Feb 11, 2008 5:23 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Hi Jesse:
>
> On 2/11/08, Jesse Becker [EMAIL PROTECTED] wrote:
>> Depends. There are cases where we need to distinguish between floats and integers. For example, the start time for graphs should be integers (only), while other things can be either floats or ints (the 'vl' URL parameter, for example).
>
> Okay, in that case I would prefer that the revised patch contains two functions: clean_float() and clean_int(), and that they are used accordingly.

There are two issues here: the immediate problem with clean_float not working correctly, and a more general input validation problem. I'd suggest applying the current patch we have to fix the first one, and then going back to review all of the validation routines and making changes accordingly to handle the second. It'd be good to gather the checks into one place as well (get_context.php, or even before that?). This will be a somewhat broad change, since this logic is scattered here and there.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
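A possible shape for the two functions proposed above, as a hedged sketch only. The names clean_int() and clean_float() come from the thread; the bodies, the decision to accept a leading minus sign, and the choice to return the original string rather than a converted value are assumptions.

```php
<?php
// Accept only an optional minus sign followed by digits (assumption:
// negative values are allowed). Returns the input string or null.
function clean_int($value) {
    return (is_string($value) && preg_match('/^-?\d+\z/', $value) === 1)
        ? $value : null;
}

// Accept any numeric string, integer or decimal. Returns the input
// string or null.
function clean_float($value) {
    return (is_string($value) && is_numeric($value)) ? $value : null;
}

var_dump(clean_int('1202745600'));  // an integer-looking start time
var_dump(clean_int('4.0'));         // rejected: not an integer
var_dump(clean_float('4.0'));       // accepted as a number
```

The \z anchor is used instead of $ so that a value with a trailing newline is not accepted by the integer check.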
Re: [Ganglia-developers] Semi-serious bug in 3.0.6
On Feb 11, 2008 2:00 PM, Bernard Li [EMAIL PROTECTED] wrote:
> On 2/11/08, Brad Nicholes [EMAIL PROTECTED] wrote:
>> One of the items on the 3.1.x wishlist is to change the units of metrics to the base unit, i.e. if the unit used to be GB, we change it to Bytes and let RRDtool deal with the scaling. If we decide to go ahead with this, it will potentially change the granularity of the data being stored in the RRD.

Actually, I thought it was the other way around. Things were being stored in bytes, and some users were hitting rrdtool limits. So the idea was for Ganglia to store KB or MB instead of raw B.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] clean_float() vs clean_number()
On Feb 11, 2008 5:12 PM, Bernard Li [EMAIL PROTECTED] wrote:
> http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=178
>
> So basically after applying the patch, there will only be one function left, i.e. clean_float() that could handle both float and integers. If that's the case, shouldn't we just name it clean_number?

Depends. There are cases where we need to distinguish between floats and integers. For example, the start time for graphs should be integers (only), while other things can be either floats or ints (the 'vl' URL parameter, for example).

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Vertical label patch
On Feb 11, 2008 5:47 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Hi Jesse:
>
> On 2/8/08, Jesse Becker [EMAIL PROTECTED] wrote:
>> Since we seem to be tossing out minor web front end patches tonight, here's one that fixes the persistent 0 - 0.00 that shows up on all of the per-metric graphs. The patch applies against 3.0.6 and trunk. The new behavior is to do the following:
>> * If an explicit vertical label is passed, use it.
>> * If a valid maximum or minimum value has been passed, via the 'x' or 'n' URL parameters respectively, use them.
>> * If neither are passed, suppress printing a vertical label alltogether.
>> Note that there should be *something* printed as a label for alignment purposes. Otherwise, the sizes of the graphs are off by a few pixels, and it looks really bad.
>
> Not sure if this is the correct way to fix this, see:
> http://ganglia01.slac.stanford.edu:8080/ganglia/glast/?c=glastlnxm=r=hours=descendinghc=4
>
> In older releases, the vertical label actually did something -- with your patch, it simply removed it (for the load graph).

Nothing is printed if nothing is given to be printed. They are passing explicit, non-zero ranges for the chart. The patch should handle this case. The problem comes from ranges of 0 - 0. This is nonsensical, and worth suppressing. If there is a valid range or vertical label passed, then by all means print it.

(You'll also note that they are having problems with the "now X.XX" values too, as well as a strange escaping problem with a parenthesis.)

> In the graphs from SLAC, it is labelled 0 - 8.0, however, I'm not sure whether this information is simply redundant or not.

It is redundant, in this case. RRDTool will fill in the ranges by default. The vert-label is more useful for indicating units and similar metadata. If your chart already indicates 0-8, why print it again?

One other note about this: the way the limits are found (if not explicitly passed) is apparently via find_limits() in functions.php. This function might be broken, since that is where $min and $max are set. This function also uses rrdtool to get that information, which means that there will be at least two system() calls to rrdtool for each graph. This probably won't help performance much.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Overlapping labels in graphs
On Feb 12, 2008 7:06 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Some times I get graphs such as the attached one. The top and bottom labels are overlapping -- any ideas why? (This graph was generated by the code in trunk)

I think that's from rrdtool. What version are you using? Do you have the parameters used to create the chart?

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
[Ganglia-developers] ideas on sanitizing variables.
Attached is a bunch of variable sanitization checking. It isn't done, but I wanted to throw it out for comments before I go too far down some hole and can't dig myself out.

This is not a patch, since there are only two include_once lines (in index.php and graph.php) for existing files. The rest of the patch would be to add this file. Just imagine that this file gets run after conf.php is sourced, but before get_context.php is read.

The idea is to take $_GET (and later $_COOKIE), and check to make sure that their contents have valid information. Invalid information is *discarded*, and cannot be used by the rest of the code. Thus, if $_GET['st'] has bogus data (non-integral data, to be precise), then it is deleted from the array. This is pretty harsh, but should make problems obvious very quickly.

There are two main sections to the code. The first is the large array near the top of the file. This defines what parameters we care about, and how they should be used. Anything that is not in this array, or doesn't match the datatype requested, will be discarded. Any new parameters that are added (for example, to indicate which metric groups should be collapsed) should be added to this array.

The second part is the foreach{} loop at the end. It runs through all variables in $_GET (and later, $_COOKIE, etc), and checks if we care about each one at all (i.e. it is in the large array I just mentioned). Keys that we want and that are valid will be kept, but everything else will be pitched.

As I said, it isn't done yet, although it is basically functional. I'm looking for general comments, not specific bug reports yet. So, comments and suggestions welcome.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2

sanitize.php Description: Binary data
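The attached sanitize.php is not reproduced in the archive, so the following is only a sketch of the two-part structure described above: a whitelist array followed by a loop that discards anything unknown or invalid. The whitelist contents, the type names, and the function names valid_parameter() and sanitize() are assumptions; only the 'st' parameter and its integer requirement come from the message.

```php
<?php
// Part 1: whitelist of accepted parameters and their expected types.
// Only 'st' is taken from the message; the rest is illustrative.
$allowed = array(
    'st' => 'int',     // graph start time, epoch seconds
    'x'  => 'number',  // graph maximum (assumed entry)
    'n'  => 'number',  // graph minimum (assumed entry)
);

// Check one raw value against a declared type. Never modifies the value.
function valid_parameter($type, $value) {
    if (!is_string($value)) {
        return false;
    }
    switch ($type) {
        case 'int':
            return preg_match('/^\d+\z/', $value) === 1;
        case 'number':
            return is_numeric($value);
        default:
            return false;
    }
}

// Part 2: keep only keys that are whitelisted and valid; drop the rest.
function sanitize(array $input, array $allowed) {
    foreach ($input as $key => $value) {
        if (!isset($allowed[$key])
            || !valid_parameter($allowed[$key], $value)) {
            unset($input[$key]);
        }
    }
    return $input;
}

// Simulated query string: bogus 'st', valid 'x', unknown 'evil'.
$get = sanitize(
    array('st' => 'ABC123', 'x' => '8.0', 'evil' => '<script>'),
    $allowed
);
print_r($get);
```

In a real front-end script the call would presumably be $_GET = sanitize($_GET, $allowed); so that later code sees only the surviving keys. Here only 'x' survives.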
Re: [Ganglia-developers] 3.0.7 release
On Feb 12, 2008 7:57 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Guys:
>
> I plan to check-in the following patch and release 3.0.7:
>
> -GANGLIA_RELEASE_NAME=Foss
> +GANGLIA_RELEASE_NAME=Fossett

Nice touch.

> The following issues are fixed:
> 1) (now 0.00)
> 2) Show Hosts toggle stopped working

Both look fixed.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] clean_float() vs clean_number()
On Feb 12, 2008 11:43 AM, [EMAIL PROTECTED] wrote:
> Quoting Jesse Becker [EMAIL PROTECTED]:
>> In the meantime, I started on a patch to put all of the variable checks and sanitization in one place.
>
> Do you mean move the checks out of get_context.php and graph.php?

Yes, pretty much all of it.

>> brings up the question: are we checking or cleaning?
>
> Good point. clean_string() will actually change data, escaping unwanted HTML which might be present (cleaning). clean_number() will return NULL if the input value is not numeric (checking). I see the potential for 2 groups of functions: is_valid_type() and clean_type().

The is_valid* set are, for the most part, already part of the PHP core in the form of is_numeric, is_bool, etc. If we want our own wrappers around them for the sake of consistency, that's not difficult. However, there are some datatypes that require manual validation: the checks for valid hex colors and image sizes come to mind. These all return true or false, and do not change the data.

Cleaning is, IMO, an inherently different operation, and implies making changes to the data. I've traced a few of the display and undefined index problems to strings that return NULL, with no other warning.

I also have things set up so that any data that is not explicitly defined is *thrown away* during the sanitization process. This is to keep something from sneaking in unchecked, either by mistake or malintent. I wish PHP had taint checking.

> I agree the 'check' approach is less error prone. If input seems to be malformed, it's a bad idea to try to guess what the user intended. Perhaps we ought to add some logic after the input validations whereby if any validations failed, the script exits with an error message.

There will be plenty of errors thrown in case of invalid data. I'm inherently distrustful of all data that isn't explicitly set; this is especially true for data coming from URL query-strings.

> Other thoughts on validations and filtering: There are some input values (like $sort) that currently have escapeshellcmd() run on them, but are never used in a shell command. We could save a bit by only running escapeshellcmd() when it's actually needed, just before the shell call to rrdtool. For most of these, escapeshellarg() would be more appropriate, since they are arguments not actual commands.

Sure. It is somewhat simpler to just run everything through, though, instead of having to distinguish which variables are used for what.

> Putting all user-supplied input into some type of container, like an array called $user, would add clarity to the code. Wherever this

I'll think about this. Right now, I'm checking the variables in-place ($_GET and $_COOKIE), and throwing away ones that are invalid or unknown. Since so much of the code pulls directly from $_GET, this makes for fewer changes elsewhere in the code. However, if you used $_GET directly, it's pretty obvious where it came from. :)

> input is used, it's obvious where the value originally came from. We'd need to decide if this stores the raw input, a filtered/validated value, or both. I'd favor doing the same thing for all values set in conf.php: put them in a $conf array. When reading through a script, this makes it much easier to identify the original source of whatever value is being used.

I like the idea of $conf[], and use that concept a lot in my own programs.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
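The checking/cleaning split discussed above might look something like this sketch. is_valid_hexcolor() is a hypothetical validator (the six-hex-digit format is an assumption), clean_string() here is a stand-in built on htmlspecialchars() rather than Ganglia's actual function, and rrdtool_title_arg() is an invented helper showing escapeshellarg() applied at the point of use.

```php
<?php
// Checking: answers true/false and never alters the data.
// Assumes colors are exactly six hex digits with no leading '#'.
function is_valid_hexcolor($value) {
    return is_string($value)
        && preg_match('/^[0-9A-Fa-f]{6}\z/', $value) === 1;
}

// Cleaning: returns a modified copy that is safe to embed in HTML.
// Stand-in implementation, not the one in functions.php.
function clean_string($value) {
    return htmlspecialchars($value, ENT_QUOTES);
}

// Shell quoting applied only where the value becomes a command-line
// argument, just before the rrdtool call. Hypothetical helper.
function rrdtool_title_arg($title) {
    return '--title ' . escapeshellarg($title);
}

var_dump(is_valid_hexcolor('FF00aa'));
echo clean_string('<b>load</b>'), "\n";
echo rrdtool_title_arg('load; rm -rf /'), "\n";
```

escapeshellarg() wraps the whole value in quotes so it reaches the program as a single argument, which is why it suits values like a title better than escapeshellcmd(), which is meant for an entire command string.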
Re: [Ganglia-developers] clean_float() vs clean_number()
On Feb 11, 2008 6:46 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Hi Alex:
>
> BTW, I am going to check in the patches for trunk, however, I will rename clean_float() to clean_number() since the function name and the comment seems a bit misleading based on its call to is_numeric (i.e. it is not really checking whether it is a valid float, just a valid number).

In the meantime, I started on a patch to put all of the variable checks and sanitization in one place. The various clean_* checks are not directly part of it, and could be removed if not needed.

This brings up the question: are we checking or cleaning? For example, if the query string has the key/value pair "st=ABC123", what do we want to do? The st variable is for the graph start time, in epoch seconds, so it should always be an integer. Do we want the validation routines to warn/fail because it isn't of the desired type, or should they try to clean the data and return what they can (in this case, 123)? The former is more strict, and less likely to cause strange problems due to malformed data (a graph from 123 seconds after the epoch to now probably isn't what you want).

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
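The two options for st=ABC123 can be compared with a small sketch. Both function names are hypothetical, and "cleaning" is interpreted here as stripping every non-digit character, which is only one possible reading.

```php
<?php
// Strict check: anything that is not purely digits is rejected.
function check_start_time($value) {
    return (is_string($value) && ctype_digit($value)) ? (int)$value : null;
}

// Lenient clean: strip non-digits and keep whatever remains
// (assumed interpretation of "cleaning").
function clean_start_time($value) {
    $digits = preg_replace('/\D/', '', (string)$value);
    return $digits === '' ? null : (int)$digits;
}

var_dump(check_start_time('ABC123'));  // NULL
var_dump(clean_start_time('ABC123'));  // int(123)
```

The strict version makes the bad input visible, while the lenient one silently produces a start time of 123 seconds after the epoch. Note also that PHP's own (int) cast would turn ABC123 into 0 rather than 123, so "cleaning" has to be defined explicitly either way.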
Re: [Ganglia-developers] 3.0.7 release
On Feb 13, 2008 6:00 AM, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote:
> On Tue, Feb 12, 2008 at 04:57:25PM -0800, Bernard Li wrote:
>> Guys: I plan to check-in the following patch and release 3.0.7:
>
> can we use the same exact code already committed in trunk (including spaces and other details) so that there are no unnecessary divergences in the maintenance branch?

I'd suggest that any whitespace changes be done as completely separate patches, and marked as such in the SVN log.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] ideas on sanitizing variables.
On Feb 13, 2008 11:46 AM, [EMAIL PROTECTED] wrote:
> Quoting Jesse Becker [EMAIL PROTECTED]:
>
> You said the only patching would be to include this script in get_context.php and graph.php. So, I take that to mean that get_context.php and graph.php would still access values in $_GET like they currently do, but these would be modified by sanitize.php.

Yes and no. This file will be included by anything that is front-facing. So index.php and graph.php like I mentioned, but also the host_view, cluster_view, etc files. Not a big deal though.

Right now, the idea is to *NOT* change any data, just accept or reject based on some basic type checking. $_GET (et al) remain available to the other scripts (index.php, et al), but non-validated data is removed. Essentially, something exists in $_GET if, and only if, it has explicitly been declared okay.

> I think it'd be clearer to do everything in one place. Then, the only script that touches $_GET/$_COOKIE is sanitize.php, and all other scripts use variables that it makes available. That way you have only 1 place where user input is touched, and $_GET continues to be what you'd expect: raw user input.

Exactly. Now, if we decide to do any sort of cleanup, that could be either a separate stage (maybe clean.php), or left to the relevant sections of the other code.

> In the foreach() loop, if a $_GET value is found to be valid (the if() condition is true), put the value in a $clean array, using the same key as $_GET/$_COOKIE. All other code should use these values from the $clean array rather than the current local variables. When reading other scripts, this convention makes it clear that a variable is user input, and that it has been put through appropriate checks.

That's a thought, and not difficult to do. It would mean changing all other code that currently uses $_GET to use $clean. Not a big deal--just a big search and replace. (Options like this are why I'm asking for input--I thought of doing it both ways, but wasn't sure which would be preferred in this case.)

> If a value is invalid, maybe create the key in $clean with a sane default value, or a null if no sensible default is possible. Code that wanted to use something in $clean then wouldn't need to do lots of isset() checking. This is pretty much what get_context.php and graph.php already do.

Default values are an issue. They should be handled in a different step though. This could be part of a clean stage, to ensure that everything that needs to be set is, in fact, set.

> Won't the INT, FLOAT, and NUMBER checks in valid_parameter() always be true? (float)$value would always be a float.

Hmm...true. I was thinking of the case where you have something like:

$float_var="cow"; is_float($float_var);

What is the floating point representation of a bovine? I'll need to go back and check on type casting. I was trying to avoid using regexes for generic validation tests, but it looks like that might be the way to go.

> The BOOL check is allowing a 1 or 0, not an actual boolean. If I saw a value being validated as BOOL, I'd expect it to be a boolean, not a 1 or a 0. Mostly it doesn't matter since PHP is so loose with types, but the broken 'Show Hosts' bug in 3.0.6 shows that the difference does matter in some cases.

This is an issue with the is_bool() PHP function. It actually wants a true boolean value of true or false, not a string that is "0" or "1". Most/all of the data in $_GET is ultimately handled as string values. For things like the show hosts flag, we aren't given real boolean values; we are given a string. More regex fodder...

> error_log calls should identify where they are coming from. Alternately, you could trigger_error( message, E_USER_WARNING ). That automatically adds information about file and line number where the error occurred.

Noted. Will add.

> Hope that's helpful,

Very much so. Thanks.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
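The casting and boolean pitfalls raised in this exchange can be demonstrated with a short sketch. The helper names is_float_string(), is_bool_string() and reject_parameter() are hypothetical, and the regex is just one possible definition of a decimal number.

```php
<?php
// Casting before checking defeats the check: the cast always yields a
// float (here 0.0), so is_float() can never fail.
var_dump(is_float((float)'cow'));   // bool(true)

// Query-string values arrive as strings, so the type predicates are
// false even for values that look right.
var_dump(is_float('4.0'));          // bool(false)
var_dump(is_bool('1'));             // bool(false)

// String-based checks avoid both problems.
function is_float_string($value) {
    return is_string($value)
        && preg_match('/^-?\d+(\.\d+)?\z/', $value) === 1;
}

function is_bool_string($value) {
    return $value === '0' || $value === '1';
}

// Reporting a rejected key through trigger_error() lets PHP attach the
// file and line number, as suggested in the thread.
function reject_parameter($key) {
    trigger_error("sanitize: rejected parameter '$key'", E_USER_WARNING);
}
```

Matching against the string form keeps the check honest: the value is judged exactly as it arrived, and any conversion to a float or boolean can happen afterwards, once it has passed.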
Re: [Ganglia-developers] Memory leak in gmond
On Feb 15, 2008 5:42 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Sure, please update us after the weekend, we'll likely release 3.0.7 then.

Running valgrind on the ganglia-3.0.6.200802141157.tar.gz tarball you posted for testing:

==2590== 5,554 bytes in 1,282 blocks are definitely lost in loss record 13 of 16
==2590==    at 0x4904A06: malloc (vg_replace_malloc.c:149)
==2590==    by 0x3DCEC707E1: strndup (in /lib64/tls/libc-2.3.4.so)
==2590==    by 0x407F74: bytes_out_func (in /usr/sbin/gmond)
==2590==    by 0x404A54: Ganglia_collection_group_collect (in /usr/sbin/gmond)
==2590==    by 0x404CB0: process_collection_groups (in /usr/sbin/gmond)
==2590==    by 0x405190: main (in /usr/sbin/gmond)
==2590==
==2590== LEAK SUMMARY:
==2590==    definitely lost: 5,554 bytes in 1,282 blocks.
==2590==      possibly lost: 0 bytes in 0 blocks.
==2590==    still reachable: 415,977 bytes in 998 blocks.
==2590==         suppressed: 0 bytes in 0 blocks.

I modified my gmond.conf to report much more aggressively than usual so that the test time would be shorter. However, with *this configuration*, it works out to about 770 bytes per minute.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Memory leak in gmond
I'm not sure if this is right--I've only taken a really quick look in libmetrics/linux/metrics.c, and my C-fu is rusty.

It looks like strndup() is called in linux/metrics.c:hash_lookup (about line 131) to duplicate an interface name, which is included in the stats structure as stats->name. The net_dev_stats function will return this struct. The function is called in a number of places: pkts_in_func, pkts_out_func, bytes_out_func and bytes_in_func. The variable *ns is assigned the output of hash_lookup (i.e. the struct). Since the 'name' element is malloc()ed, but not explicitly freed, it will not go away when *ns goes out of scope.

This is the leak, isn't it? All four of these functions are very similar, and need to be fixed if this is the case. Or did I miss something obvious? :)

On Feb 19, 2008 4:54 PM, Bernard Li [EMAIL PROTECTED] wrote:
> On 2/19/08, Jesse Becker [EMAIL PROTECTED] wrote:
>> I modified my gmond.conf to report much more aggressively than usual so that the test time would be shorter. However, with *this configuration*, it works out to about 770 bytes per minute.
>
> So did we want to hunt this other memory leak down prior to 3.0.7 release?
>
> Cheers, Bernard

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Memory leak in gmond
On Feb 19, 2008 7:39 PM, Martin Knoblauch [EMAIL PROTECTED] wrote:
> - Original Message
> From: Jesse Becker [EMAIL PROTECTED]
> To: Ganglia Developers ganglia-developers@lists.sourceforge.net
> Sent: Tuesday, February 19, 2008 11:25:54 PM
> Subject: Re: [Ganglia-developers] Memory leak in gmond
>
>> I'm not sure if this is right--I've only taken a really quick look in libmetrics/linux/metrics.c, and my C-fu is rusty. It looks like strndup() is called in linux/metrics.c:hash_lookup (about line 131) to duplicate an interface name, which is included in the stats structure as stats->name. The net_dev_stats function will return this struct. The function is called in a number of places: pkts_in_func, pkts_out_func, bytes_out_func and bytes_in_func. The variable *ns is assigned the output of hash_lookup (i.e. the struct). Since the 'name' element is malloc()ed, but not explicitly freed, it will not go away when *ns goes out of scope. This is the leak, isn't it? All four of these functions are very similar, and need to be fixed if this is the case. Or did I miss something obvious? :)
>
> Lines 137, 148 and 159 ? :-)

I saw those. :-P I meant that after the struct has been returned, outside the function, the memory is never freed. Inside that function, it's okay.

> The memory allocated in line 151 is never freed, indeed. But it is only allocated once per interface and stays alive for the entire lifetime of the gmond process. So, it is not leaked.

Ah, that makes more sense, especially if those variables exist for the lifetime of the program.

So, I've just run gmond under valgrind and duma (a fork of the old Electric Fence memory debugger), and I can't seem to reproduce the problem now. Neither one of them is showing any obvious leaks, at least not in the 15 minute tests I've run. The test system(s) are CentOS 4.6 boxes.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] svn woes for ganglia repository
On Sat, Feb 23, 2008 at 5:44 PM, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote: On Sat, Feb 23, 2008 at 01:13:44PM -0500, Jesse Becker wrote: On Sat, Feb 23, 2008 at 1:07 PM, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote: anyone knows what happened here? It's probably a server-side issue. I can duplicate the problem, and there are a number of people having trouble of various sorts: http://sourceforge.net/tracker/?func=browsegroup_id=1atid=21 and all tickets closed since with svn back in service but no comments on what happened. Someone probably let /tmp get full. :-) sorta off-topic but 2 loosely related questions : 1) do we still need the monitor-core-XDR-refactor branch? can it be deleted? Was that rolled into trunk? 2) anyone interested in using instead a distributed RCS? What did you have in mind? -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] svn woes for ganglia repository
On Mon, Feb 25, 2008 at 1:49 PM, Bernard Li [EMAIL PROTECTED] wrote: On 2/23/08, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote: using something like git which will not go down and block anyone to do And where do we plan to host this? Well, I think that part of the reason to use git is that you don't *need* a central repository like you do with SVN and CVS. Everyone has a repository, and shares patches back and forth, so in effect, everyone hosts it. Lots of people could pull from your version of the tree, or my version, or Carlo's version, or someone else's tree. Now, you can have a central repository as well, of course: http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#public-repositories -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] 3.0.7?
Ack for Centos 4.6. On Tue, Feb 26, 2008 at 8:22 AM, Jesse Becker [EMAIL PROTECTED] wrote: Seems to work on OpenBSD 4.1. Will try on Centos4.6 later today. On Tue, Feb 26, 2008 at 2:35 AM, Martin Knoblauch [EMAIL PROTECTED] wrote: Hi Bernard, as I said, all my stuff can wait for 3.0.8. As for the ACKs - ACK ACK ACK ACK :-) Cheers Martin -- Martin Knoblauch email: k n o b i AT knobisoft DOT de www: http://www.knobisoft.de - Original Message From: Bernard Li [EMAIL PROTECTED] To: Martin Knoblauch [EMAIL PROTECTED] Cc: ganglia-developers@lists.sourceforge.net Sent: Monday, February 25, 2008 8:06:45 PM Subject: Re: 3.0.7? Hi Martin: On 2/25/08, Martin Knoblauch wrote: what are your plans for 3.0.7? Any time now ? :-) If not, I would like to commit a small patch to enable syslogging error messages for gmond. But it can wait for 3.0.8. To be honest I am waiting for more ACKs. But either way it will get released either tomorrow or Wednesday so please wait until then to check in the patch for syslogging. Thanks, Bernard -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] [patch] change privateclusters auth header to include clustername
On Thu, Mar 6, 2008 at 10:57 AM, Brad Nicholes [EMAIL PROTECTED] wrote: -1 for now. The concern that I have is that injecting the name of the cluster as it is pulled from the query string seems a little dangerous. This would allow the realm to be altered in any way by just modifying the query string. Not sure if that is a real issue or not, but it seems dangerous. Can anybody else clarify this more? It seems that the issue is that different clusters should exist in different authentication realms. Currently, they do not. IMO, this is both reasonable and desirable. I think that this patch would probably be okay, if there were some additional checking logic. Specifically, something to compare the value of $clustername against a list of valid NAME attributes in the CLUSTER tags. This way, if someone requests a cluster they know exists, it's okay, but they can't arbitrarily try against a non-existent realm. Of course, does that matter? To pass HTTP auth, you have to have a valid triplet of information in the form of realm:username:password (at least, that's my understanding of it). On the assumption that Apache does the right thing in the case of a bogus realm (cause authentication to fail), then I don't see much of a problem with this patch. The one other thing to double-check is that $clustername is properly escaped, since it will be displayed back to the user. So, a +0 from me. :-) -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
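The check suggested above (compare the requested cluster name against the names that actually exist before letting it anywhere near the realm string) can be sketched in shell. This shows the logic only and is not the PHP patch under discussion; the function name `is_known_cluster` and the sample cluster names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the requested name exactly matches
# one of the known cluster names passed as the remaining arguments.
is_known_cluster() {
    requested=${1:-}
    shift
    for name in "$@"; do
        # Exact string comparison; the user-supplied value is never
        # treated as a pattern.
        if [ "$requested" = "$name" ]; then
            return 0
        fi
    done
    return 1
}

# Example: only use the name in the realm when it is on the list.
if is_known_cluster "web" "web" "compute" "storage"; then
    echo "realm may include cluster name: web"
else
    echo "unknown cluster, fall back to a fixed realm"
fi
```

An allow-list like this also sidesteps most of the escaping concern, since only names already known to the server are ever echoed back.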
Re: [Ganglia-developers] [patch] change privateclusters auth header to include clustername
One question: why the difference between Ganglia Private Cluster and Private Ganglia Cluster: ? :) Looks fine otherwise +1 (my tally sheet now has two +1 scores, and one -1 from Brad). On Fri, Mar 7, 2008 at 7:06 AM, Ramon Bastiaans [EMAIL PROTECTED] wrote: Ignore my previous one, I sent the wrong patch. This is the correct patch! Ramon Bastiaans wrote: I agree, I guess it was in theory possible to trick auth.php into switching the realm. Didn't think of that. What about this one then? Now it checks if someone is not trying to change the cluster context and if the cluster is one of the private clusters. - Ramon. -- ing. R. Bastiaans Systems Programmer / High Performance Computing Visualisation / SARA Computing and Networking Services Kruislaan 415 PO Box 194613 1098 SJ Amsterdam 1090 GP Amsterdam P.+31 (0)20 592 3000 F.+31 (0)20 668 3167 --- There are really only three types of people: Those who make things happen, those who watch things happen and those who say, What happened? -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Patch for modular graph.php.
On Wed, Mar 12, 2008 at 5:23 AM, Ramon Bastiaans [EMAIL PROTECTED] wrote: I like this setup a lot, has anyone considered this? Well, I have, but that doesn't count. Doesn't seem to have made its way into svn and I saw no more replies on this topic. I don't think that it applies cleanly to trunk anymore, since there have been some interim fixes for a few things. I'll fix it up, and re-post when I get a chance (crazy busy recently). -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Patch for modular graph.php.
I just committed the modular-graph code to trunk: r1051. Hopefully, it doesn't break anything horribly for anyone. Rehashing this from the original email I posted a while back: Each chart is generated from a specific .php file, with graph.php acting as a gatekeeper for the specific graphing files. There is one file per report (Load, CPU, Network, etc), a generic metric.php for reporting individual metrics, and a heavily commented sample_report.php file. Gone is the huge if/else block for all of the different graph types. Various sanitization and check routines are, I hope, more clearly laid out as well. I've mentioned work to help with sanitizing all _GET/_COOKIE (et al) variables, and that code is similar to this section of the modular-graph patch. Adding a new report should now be straightforward: 1) Add a new report file named namehere_report.php in web/graphs.d/ 2) Add the report name to the $optional_graphs variable in conf.php. This should also help down the road during the hypothetical UI overhaul. Since the graphs are split more cleanly, it should be easier for the front end code (and users) to request specific graphs. This nicely reflects the complexity of the various reports. The individual metrics are, by far, the most complicated, as there is different handling depending on context. The sample_report is actually just the CPU_report, but has about 50 lines of comments on how the graphs are created. I have been using this patch in production for quite a while now, and it works for me. -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
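As a rough illustration of the naming convention described above, the following shell sketch derives a report name from a report file name. It assumes, without confirmation from the patch itself, that the name to put in $optional_graphs is the file name with the `_report.php` suffix removed; the helper `report_name` and the sample paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: map a report file path to its report name by
# stripping the directory and the "_report.php" suffix.
# Assumption: "load_report.php" corresponds to a report called "load".
report_name() {
    basename "$1" _report.php
}

# Example with made-up file names following the namehere_report.php pattern.
for f in web/graphs.d/load_report.php web/graphs.d/packet_report.php; do
    echo "report: $(report_name "$f")"
done
```

A loop like this over the real web/graphs.d/ directory would list the candidate names for $optional_graphs.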
Re: [Ganglia-developers] Time to create the 3.1.x stable branch...
On Thu, Mar 13, 2008 at 3:42 PM, Brad Nicholes [EMAIL PROTECTED] wrote: Are there SPEC file changes that need to go in to support the modular web frontend Not any more. (r1061) -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Time to create the 3.1.x stable branch...
On Thu, Mar 13, 2008 at 3:42 PM, Brad Nicholes [EMAIL PROTECTED] wrote: I think that with the removal of the srclib directory from the SVN trunk repository, we have completed everything that we thought needed to be done before creating the 3.1.x stable branch. The only other thing that I know of is testing to make sure that an older 3.0.x gmetad can consume the XML data from a newer 3.1.x gmond. Has anybody had a chance to test this? Were we going to try and make gmond-3.1.x backwards compatible with gmond-3.0.x? -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Patch for modular graph.php.
That's a side-effect of different versions of RRDTool using different fonts. I mentioned this before: http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg03447.html On Fri, Mar 14, 2008 at 8:30 PM, Bernard Li [EMAIL PROTECTED] wrote: Hi Jesse: On 3/12/08, Jesse Becker [EMAIL PROTECTED] wrote: I just committed the modular-graph code to trunk: r1051. Hopefully, it doesn't break anything horribly for anyone. Rehashing this from the original email I posted a while back: Each chart is generated from a specific .php file, with graph.php acting as a gatekeeper for the specific graphing files. There is one file per report (Load, CPU, Network, etc), a generic metric.php for reporting individual metrics, and a heavily commented sample_report.php file. Gone is the huge if/else block for all of the different graph types. Various sanitization and check routines are, I hope, more clearly laid out as well. I've mentioned work to help with sanitizing all _GET/_COOKIE (et al) variables, and that code is similar to this section of the modular-graph patch. Adding a new report should now be straightforward: 1) Add a new report file named namehere_report.php in web/graphs.d/ 2) Add the report name to the $optional_graphs variable in conf.php. This should also help down the road during the hypothetical UI overhaul. Since the graphs are split more cleanly, it should be easier for the front end code (and users) to request specific graphs. This nicely reflects the complexity of the various reports. The individual metrics are, by far, the most complicated, as there is different handling depending on context. The sample_report is actually just the CPU_report, but has about 50 lines of comments on how the graphs are created. I have been using this patch in production for quite a while now, and it works for me. 
After updating ganglia-web to SVN r1063, I noticed that the summary graphs (Load, CPU, Memory, Network Last [Hour, ...]) no longer have the same size (specifically the height varies) -- was this intentional? Regards, Bernard -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
[Ganglia-developers] OpenBSD build failure on r1067
Now that srclib/ has been yanked, I tried building a clean copy of trunk on my OpenBSD box, and hit a few snags. First, running ./bootstrap fails:

--snip--
[EMAIL PROTECTED] ~/Ganglia/versions/newtrunk/monitor-core $ ./bootstrap
Bootstrapping libmetrics
Running aclocal
/usr/local/share/aclocal/speex.m4:10: warning: underquoted definition of XIPH_PATH_SPEEX
  run info '(automake)Extending aclocal'
  or see http://sources.redhat.com/automake/automake.html#Extending-aclocal
/usr/local/share/aclocal/libgcrypt.m4:23: warning: underquoted definition of AM_PATH_LIBGCRYPT
/usr/local/share/aclocal/audiofile.m4:12: warning: underquoted definition of AM_PATH_AUDIOFILE
/usr/local/share/aclocal/ao.m4:9: warning: underquoted definition of XIPH_PATH_AO
Running autoheader
Running automake
configure.in: installing `build/install-sh'
configure.in: installing `build/missing'
aix/Makefile.am: installing `build/depcomp'
configure.in:12: installing `build/config.guess'
configure.in:12: installing `build/config.sub'
Makefile.am: installing `./INSTALL'
configure.in:16: required file `build/ltmain.sh' not found
Create distribution timestamp
Running aclocal
/usr/local/share/aclocal/speex.m4:10: warning: underquoted definition of XIPH_PATH_SPEEX
  run info '(automake)Extending aclocal'
  or see http://sources.redhat.com/automake/automake.html#Extending-aclocal
/usr/local/share/aclocal/libgcrypt.m4:23: warning: underquoted definition of AM_PATH_LIBGCRYPT
/usr/local/share/aclocal/audiofile.m4:12: warning: underquoted definition of AM_PATH_AUDIOFILE
/usr/local/share/aclocal/ao.m4:9: warning: underquoted definition of XIPH_PATH_AO
Running autoheader
Running automake
configure.in: installing `config/install-sh'
configure.in: installing `config/missing'
gmetad/Makefile.am: installing `config/depcomp'
configure.in:109: installing `config/config.guess'
configure.in:109: installing `config/config.sub'
Makefile.am: installing `./INSTALL'
configure.in:130: required file `config/ltmain.sh' not found
To begin installation, run ./configure now
--snip--

This fails to create the actual configure script, so it's hard to get much farther.

Second, and probably related to #1, there are some libtool issues. So far as I can tell, Ganglia uses its own copy of libtool, instead of a system version. This libtool isn't created, so things fail to build. This begs the question: why don't we use the system libtool?

Third, on a different copy of trunk I tried rebuilding from the ./configure stage, and found that it fails on the check for libconfuse. The error is a little misleading:

--snip--
Checking for confuse
Added -I/usr/local/include to CFLAGS
Added -L/usr/local/lib to LDLAGS
checking for cfg_parse in -lconfuse... no
libconfuse not found
--snip--

Libconfuse was, in fact, found, but the compile failed due to two linking issues. It appears that on OpenBSD, libconfuse requires libintl, which in turn requires libiconv. Adding the options -lintl -liconv appears to fix this problem.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] OpenBSD build failure on r1067
On Sun, Mar 16, 2008 at 3:03 PM, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote: On Sun, Mar 16, 2008 at 12:18:05PM -0400, Jesse Becker wrote: Now that srclib/ has been yanked, I tried building a clean copy of trunk on my OpenBSD box, and hit a few snags. First, running ./bootstrap fails: haven't tried bootstrap ('cause bootstrapping only works well in linux anyway) but you should be able to use a snapshot :

So then the build process for trunk is basically broken for non-Linux systems?

http://www.sajinet.com.pe/ganglia/ganglia-3.1.0.1066.tar.gz Second, and probably related to #1, there are some libtool issues. So far as I can tell, Ganglia uses its own copy of libtool, instead of a system version. This libtool isn't created, so things fail to build. This begs the question: why don't we use the system libtool? the whole point of bootstrapping is to create a libtool to use to build the rest of the application in a portable way. the system libtool should only be used for bootstrapping and is not needed after.

Err... I'm confused then. If the whole point of libtool is to handle libraries portably, why are we duplicating the job of something already installed? (Obviously, if libtool *isn't* installed already, that's a different problem.) But, if there's a need to use our own libtool, and specifically to create ltmain.sh, then we should do that properly. So far as I can tell, this is not happening, and is a bug. I notice that two different ltmain.sh files were removed from SVN in r1066. The one in libmetrics/ is what is causing this build failure.

Third, on a different copy of trunk I tried rebuilding from the ./configure stage, and found that it fails on the check for libconfuse. The error is a little misleading:

--snip--
Checking for confuse
Added -I/usr/local/include to CFLAGS
Added -L/usr/local/lib to LDLAGS
checking for cfg_parse in -lconfuse... no
libconfuse not found
--snip--

./configure --with-libconfuse=/usr/local will workaround that.

I have that already, actually. My full configure invocation is:

./configure \
    CC='ccache gcc' CXX='ccache g++' \
    CFLAGS='-O2 -I/usr/local/include -I/usr/X11R6/include' \
    LDFLAGS='-L/usr/local/lib -L/usr/X11R6/lib -lintl -liconv' \
    --with-gmetad --enable-status \
    --with-libapr=/usr/local \
    --with-libconfuse=/usr/local \
    --with-libexpat=/usr/local \
    --with-librrd=/usr/local \
    --prefix=/home/jbecker/Ganglia/install/

I have to add -lintl and -liconv to LDFLAGS for configure to succeed.

I have ganglia 3.1.0 running in my OpenBSD 4.3 (beta) test box with all dependencies from ports (including expat from base as used in 4.3, in 4.2

Using 4.1 here, using the official packages for libconfuse, libiconv, and gettext.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] OpenBSD build failure on r1067
On Sun, Mar 16, 2008 at 8:26 PM, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote: On Sun, Mar 16, 2008 at 06:27:00PM -0400, Jesse Becker wrote: So then the build process for trunk is basically broken for non-Linux systems? it always was; before it used to work reliably only in fedora or centos AFAIK which is what we had been using to do bootstraps and releases for all the 3.0 series.

Until recently it worked fine on OpenBSD, and now it doesn't. For the sake of argument, I'll say that recently covers the last 3 weeks or so, since I've only tested it in the last day or so. Given that ltmain.sh was recently removed, that could

If this needs some time to stabilize, that's fine, but bootstrapping should work, IMO, on all operating systems if possible, but at *least* Linux, *BSD, Solaris, and AIX (as those seem to be the most common platforms).

for now, and as I mentioned before, your best bet is to do a bootstrap in centos 4.6 and use a snapshot. Second, and probably related to #1, there are some libtool issues. So far as I can tell, Ganglia uses its own copy of libtool, instead of a system version. This libtool isn't created, so things fail to build. This begs the question: why don't we use the system libtool? the whole point of bootstrapping is to create a libtool to use to build the rest of the application in a portable way. the system libtool should only be used for bootstrapping and is not needed after. Err... I'm confused then. If the whole point of libtool is to handle libraries portably, why are we duplicating the job of something already installed? (Obviously, if libtool *isn't* installed already, that's a different problem.) yes, you are confused.

Nothing new there. But that doesn't actually answer the question. :)

ltmain.sh IS generated by libtool at bootstrap time

If that's the intended behavior, fine. However, the problem is that ltmain.sh is *not* generated, and this is a bug--as we've established.

and bootstrap time is the only time that libtool should be needed in a system.

Right.

that is, by design the system libtool is never used at configure time, and that is the way it is used everywhere (not only in ganglia) I notice that two different ltmain.sh files were removed from SVN in r1066. The one in libmetrics/ is what is causing this build failure. because libtoolize --copy --force is missing in bootstrap to regenerate them and in your system autoreconf is not in /usr/bin

Well that's an easy fix:

I have that already, actually. My full configure invocation is:

./configure \
    CC='ccache gcc' CXX='ccache g++' \
    CFLAGS='-O2 -I/usr/local/include -I/usr/X11R6/include' \
    LDFLAGS='-L/usr/local/lib -L/usr/X11R6/lib -lintl -liconv' \
    --with-gmetad --enable-status \
    --with-libapr=/usr/local \
    --with-libconfuse=/usr/local \
    --with-libexpat=/usr/local \
    --with-librrd=/usr/local \
    --prefix=/home/jbecker/Ganglia/install/

I have to add -lintl and -liconv to LDFLAGS for configure to succeed. you shouldn't need to, and therefore there is another bug somewhere there. the libtool library (libconfuse.la) should instruct libtool about the extra dependencies required for you.

Well, given the current libtool issues, it's reasonable to let this one slide for now. I can't say that I really trust libtool to do its job at the moment. ;-)

libiconv and gettext are not ganglia's dependencies, you need

Correct, but libconfuse needs them at link-time, and the configure scripts aren't currently catching this (which they should). There are two unresolved symbols in libconfuse.so that come from libintl.so:

# nm /usr/local/lib/libconfuse.so.0.0 |grep intl
U libintl_bindtextdomain
U libintl_dgettext
# nm /usr/local/lib/libintl.so.3.0 |grep libintl_bindtextdomain
1df8 T libintl_bindtextdomain

And libintl.so has a similar dependency on gettext. I can see this also as being a problem related to libtool as well.

for expat (expat-2.0.0), libconfuse (libconfuse-2.5p0) and apr (1.2.7) for OpenBSD 4.1 (without gmetad) are you using those packages when you refer to official packages?

Yep, I've got those versions exactly, all from Ports. As I mentioned, this worked fine until recently.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
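The nm check quoted in this message can be turned into a small filter. The sketch below assumes nm-style output in which undefined symbols carry the type letter U; the function name `undefined_symbols` is hypothetical, and the sample lines are the ones from the message rather than fresh output.

```shell
#!/bin/sh
# Hypothetical filter: print the names of undefined symbols (type "U")
# from nm-style output on stdin. Defined symbols have an address in the
# first field, so their first field is never the bare letter "U".
undefined_symbols() {
    awk '$1 == "U" { print $2 }'
}

# Sample input taken from the nm output quoted above: two undefined
# symbols, plus one defined symbol that the filter should skip.
printf '%s\n' \
    '         U libintl_bindtextdomain' \
    '         U libintl_dgettext' \
    '1df8 T libintl_bindtextdomain' | undefined_symbols
```

On a real system the input would come from running nm on the library in question and piping it through the filter; any libintl_* names that show up indicate that -lintl is needed at link time.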
Re: [Ganglia-developers] OpenBSD build failure on r1067
On Mon, Mar 17, 2008 at 3:57 AM, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote: Until recently it worked fine on OpenBSD, and now it doesn't. Wrong. Just because it didn't error out, doesn't mean it worked fine; because I could build from SVN previously, and lost the ability to do so. Yes, I can work around it, but it's worth reporting and fixing (I've done the former, and will try to assist with the latter).

recently has a version number (r1065); where this failure is (as described before) a bug because autoreconf wasn't configured YET to work in OpenBSD and the old bootstrap was missing the libtoolize call to create ltmain.sh and the rest of the libtool generated files.

Fair enough.

the changes started in r1044 enable more platforms to have a correct bootstrap as you suggested.

Noted.

because libtoolize --copy --force is missing in bootstrap to regenerate them and in your system autoreconf is not in /usr/bin Well that's an easy fix: Haven't yet seen a patch, so Committed revision 1069

Here's the patch: if there's an autoreconf in $PATH, use it.

Index: bootstrap
===
--- bootstrap (revision 1075)
+++ bootstrap (working copy)
@@ -1,6 +1,7 @@
 #!/bin/sh
 # $Id$
-if [ -x /usr/bin/autoreconf ]; then {
+which autoreconf
+if [ 0 = $? ]; then {
 echo Bootstrapping libmetrics cd libmetrics autoreconf --verbose --install --make cd ..

I have that already, actually. My full configure invocation is:

./configure \
    CC='ccache gcc' CXX='ccache g++' \
    CFLAGS='-O2 -I/usr/local/include -I/usr/X11R6/include' \
    LDFLAGS='-L/usr/local/lib -L/usr/X11R6/lib -lintl -liconv' \
    --with-gmetad --enable-status \
    --with-libapr=/usr/local \

it is better if you let apr configure itself with apr-1-config here. this is an incorrect use of --with-libapr because apr from ports is not installed with its headers in /usr/local/include

There was an issue with apr-1-config, and I had to specify the path directly. It may have been the failure of configure to find it; I think it was looking for apr1-config, or something along those lines. Since this doesn't seem to be the case anymore (yay!), I can remove that.

-- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
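The patch above looks for autoreconf with `which` and then tests `$?`. As a hedged alternative, the same "is it in $PATH" test can be written with the POSIX `command -v` built-in, which avoids depending on an external `which` whose exit status is not standardized and which prints the path to stdout. This is only a sketch of the idea, not the committed bootstrap script; the helper name `have_tool` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named tool can be found via $PATH.
# "command -v" is specified by POSIX; its output is discarded so that
# only the exit status matters.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool autoreconf; then
    echo "autoreconf found in PATH"
else
    echo "autoreconf not found; fall back to running the tools by hand"
fi
```

In a bootstrap script the first branch would run autoreconf in each source directory, and the second would run aclocal, autoheader, automake, libtoolize and autoconf individually.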
Re: [Ganglia-developers] Patch for modular graph.php.
On Tue, Mar 18, 2008 at 7:01 PM, Bernard Li [EMAIL PROTECTED] wrote: Hi Jesse: On 3/14/08, Jesse Becker [EMAIL PROTECTED] wrote: That's a side-effect of different versions of RRDTool using different fonts. I mentioned this before: http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg03447.html Okay -- but this wasn't the case (at least for me) before the re-work for the modular graph.php. Can this be fixed? FYI, I am using RRDTool 1.2.23. The various $fudge values were probably changed between now and then. Fixing this for 1.2.23 will make the sizes wrong for 1.0.x. We *could* try checking the version of rrdtool somehow, and act accordingly. Calling 'rrdtool --version' for each graph is probably a bad idea, although we could possibly have a conf.php setting. Also, In Cluster View, the host graphs are gone and I got the following error in httpd.log: ERROR: Problems reading database name Can't help you there. That looks like an rrdtool error, and I can't reproduce the problem. There's one mention of it on Google: http://www.groundworkopensource.com/community/forums/viewtopic.php?t=1537 -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
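One way to act on the version-check idea without calling rrdtool for every graph is to detect the version once (for example at install time) and store the result as a setting. The sketch below only covers the parsing step. It assumes the first line of the tool's banner looks like "RRDtool 1.2.23 Copyright ...", which is an assumption about the output format; the function name `rrd_major_minor` is hypothetical.

```shell
#!/bin/sh
# Hypothetical parser: read an rrdtool banner on stdin and print the
# major.minor part of the version (e.g. "1.2" for "RRDtool 1.2.23 ...").
# Prints nothing if no line matches the assumed banner format.
rrd_major_minor() {
    sed -n 's/^RRDtool \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example using a banner line in the assumed format.
printf '%s\n' 'RRDtool 1.2.23  Copyright 1997-2007 by Tobias Oetiker' | rrd_major_minor
```

The result could then be written into a configuration value that the graph code consults when choosing its size adjustments, so the external command runs once rather than once per graph.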
Re: [Ganglia-developers] [Ganglia-general] Ganglia SMF Manifests
On Wed, Mar 19, 2008 at 2:46 AM, Ben Rockwood [EMAIL PROTECTED] wrote: Given the request for RHEL/Centos chkconfig (bleh) scripts, I thought I'd post my Ganglia SMF Manifests. I'm willing to share my Solaris/X86 build with anyone interested, although building Ganglia on Solaris is a dream. Both manifests assume that Ganglia was installed with the prefix /opt/ganglia, and you'd load them like so: svccfg import ganglia_gmond.xml Any objections to including these in the monitor-core/cool-stuff directory? (With notes about proper credits for the files, and possibly a note about not being officially supported)? -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Ganglia SMF Manifests
On Wed, Mar 19, 2008 at 1:07 PM, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote: On Wed, Mar 19, 2008 at 11:30:31AM -0400, Jesse Becker wrote: On Wed, Mar 19, 2008 at 2:46 AM, Ben Rockwood [EMAIL PROTECTED] wrote: Given the request for RHEL/Centos chkconfig (bleh) scripts, I thought I'd post my Ganglia SMF Manifests. I'm willing to share my Solaris/X86 Any objections to including these in the monitor-core/cool-stuff directory? (With notes about proper credits for the files, and possibly a note about not being officially supported)? -1 cool-stuff doesn't mean anything (and doesn't exist in trunk anymore either), Guess I need to remove my copy of cool-stuff then. the SMF manifests are equivalent to the SysV init scripts in gmond and gmetad directories and if committed should go there. they can't be committed as-is either because the assumptions made are configurable and users will expect that they match (regardless of how many notes about not being supported are added) but will be useful if ever make is expanded to make a target to build [open]solaris packages. What about creating a contrib directory? This is common in many different projects, and frequently used for files such as this. I've yet to see a complaint in any project about something in contrib/ not working properly. It seems a shame to have something potentially useful and not distribute it. -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] GANGLIA_NANO_VERSION in tarball
On Thu, Mar 20, 2008 at 5:25 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Hi Carlo: Instead of trying to solve this automatically, I would suggest the following instead:
> - set GANGLIA_NANO_VERSION to something like "test" by default
> - when we want to build a real snapshot to be consumed by users/developers, you replace "test" with the real SVN version manually in configure.in
> This has the added option of allowing the person doing the snapshot release to add additional tags to the SVN version (eg. GANGLIA_NANO_VERSION = 1090-perlmodules) -- the reason why this may be useful is I may be testing some changes I have made in my local repository (and not checked in), yet with the current scheme, the SVN version will be the same. Thoughts?

Seems reasonable, and at least worth trying out. If there is going to be a manual aspect to this, make sure that it gets written down in some sort of README.snapshot or README.release file that has all of these small little things in it (things like "update the ganglia.info download page to point to the new release").

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
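The naming scheme proposed above can be sketched in a few lines of Python. This is only an illustration of the rule as described in the message ("test" by default, the SVN revision for a real snapshot, an optional tag appended with a hyphen); the function name `nano_version` and its parameters are hypothetical and are not part of the Ganglia build system.

```python
def nano_version(svn_revision=None, tag=None):
    """Return a GANGLIA_NANO_VERSION-style string (hypothetical helper).

    No revision given  -> "test" (the proposed default for ordinary builds)
    Revision only      -> "1090"
    Revision plus tag  -> "1090-perlmodules"
    """
    if svn_revision is None:
        return "test"
    version = str(svn_revision)
    if tag:
        version += "-" + tag
    return version


if __name__ == "__main__":
    print(nano_version())                     # default developer build
    print(nano_version(1090))                 # plain snapshot
    print(nano_version(1090, "perlmodules"))  # snapshot with a local tag
```

Whoever cuts the snapshot would still paste the resulting string into configure.in by hand, which is exactly the manual step the reply asks to have documented.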
Re: [Ganglia-developers] Patch for modular graph.php.
On Fri, Mar 21, 2008 at 10:03 PM, Bernard Li [EMAIL PROTECTED] wrote:
> BTW, was the modular patch done against an old revision? It looks like my work with using the title for metric graphs have been undone -- I am not sure what else is missing...

Except for perhaps the very first version, the patch has always been against the current trunk revision. I don't see any difference between the titles of metric graphs across 3.0.5, 3.0.6, 3.0.7 and trunk. What specifically are you referring to?

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Heterogeneous metrics
Bernard Li wrote:
> Testing trunk, I noticed this behaviour: Let's say I have 2 hosts, one host with a custom metric temp (which measures the temperature of the motherboard, for instance), and the other host does not have this metric.

I have a similar configuration, with similar metrics.

> I go to the Cluster Report pull-down menu, and select temp which results in only one host graph (the one which has this metric). The other host doesn't have a graph but simply have a text link to the host and an error is outputted to httpd's error_log saying that the corresponding temp.rrd file doesn't exist. Does anybody remember what the 3.0.x behaviour is for this situation (or perhaps this wasn't allowed previously)?

On 3.0.7, I have the same symptoms, except that I don't see text links to the missing graphs. Looking at the HTML for the page, the links are present, just not rendered (Firefox 2.0.0.12). There are messages in the error_log about missing rrds.

> And going forward, how should we solve this?

I'm not sure how much of a failure this really is. A metric was requested, the data isn't there, and Ganglia is gracefully handling the issue. I think that having the gaps even has some value: it makes it very obvious which hosts don't have that metric.

One possible improvement would be to reduce the spurious messages in error_log. Maybe an "ignore missing .rrd files" config option? Another would be to replace the text links you see, or the blank spaces I see, with a placeholder message of some sort reading "Metric FOO not available for host BAR".

A more subtle problem is when the .rrd files are present, but haven't been updated recently. You just get blank charts, with zeros everywhere. Not really sure how to handle that one though...

--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
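The two cases discussed in this message (a missing .rrd file, and a file that exists but has gone stale) can be told apart with a simple file check before any graph is requested. The sketch below is in Python rather than the frontend's PHP, and it rests on assumptions: the directory layout `<rrd_root>/<cluster>/<host>/<metric>.rrd`, the 300-second staleness threshold, and the names `rrd_status` and `placeholder` are all illustrative, not taken from the Ganglia code.

```python
import os
import time


def rrd_status(rrd_root, cluster, host, metric, max_age=300, now=None):
    """Classify a metric's RRD file as "ok", "stale" or "missing".

    Assumes files live at <rrd_root>/<cluster>/<host>/<metric>.rrd and
    that a file untouched for more than max_age seconds is stale.
    """
    path = os.path.join(rrd_root, cluster, host, metric + ".rrd")
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return "missing"
    if now is None:
        now = time.time()
    return "stale" if now - mtime > max_age else "ok"


def placeholder(rrd_root, cluster, host, metric):
    """Return placeholder text for the page, or None if a graph can be drawn."""
    status = rrd_status(rrd_root, cluster, host, metric)
    if status == "missing":
        return "Metric %s not available for host %s" % (metric, host)
    if status == "stale":
        return "Metric %s on host %s has not been updated recently" % (metric, host)
    return None
```

Checking before drawing would also avoid the error_log noise, since the graph script would never be asked for a file that is not there.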
[Ganglia-developers] http://ganglia.info troubles.
The ganglia.info website appears offline, and returns this error: Fatal error: Unknown function: classybody() in /home/groups/g/ga/ganglia/htdocs/wp-content/themes/GangliaTheme/header.php on line 35 -- Jesse Becker GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] http://ganglia.info troubles.
On Tue, Mar 25, 2008 at 5:40 PM, Bernard Li [EMAIL PROTECTED] wrote:
> On Tue, Mar 25, 2008 at 2:25 PM, Matt Massie [EMAIL PROTECTED] wrote:
>> i'm confused. i've done nothing to the site lately. i just tried to upgrade to 2.3.3 (the latest) to see if that fixes it. it doesn't. i'll try to take a look a little later when i get a chance.. i've just swamped now.
> I haven't touched it either... I wonder if it's a SF.net issue or we got hacked? One question though -- do you remember if pages like Community, Downloads are supposed to be pages and not posts? I think if they are converted to pages, they might work...? Not sure.

Well, it's still not working this morning, whatever the problem.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Host graph in cluster view has duplicated hostname in title
On Tue, Apr 1, 2008 at 8:17 PM, Bernard Li [EMAIL PROTECTED] wrote:
> In current trunk (r1185), when you are in Cluster view, your host graphs' titles would have duplicated hostnames eg. server1.ganglia.info server1.ganglia.info last hour. The following patch fixes that and brings the behaviour closer to what it was before:

And reverting actually breaks a few other things, in similar ways. I was looking at this a while last night and will again tonight.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Host graph in cluster view has duplicated hostname in title
Fixed, I think, in r1190.

On Tue, Apr 1, 2008 at 8:25 PM, Jesse Becker [EMAIL PROTECTED] wrote:
> On Tue, Apr 1, 2008 at 8:17 PM, Bernard Li [EMAIL PROTECTED] wrote:
>> In current trunk (r1185), when you are in Cluster view, your host graphs' titles would have duplicated hostnames eg. server1.ganglia.info server1.ganglia.info last hour. The following patch fixes that and brings the behaviour closer to what it was before:
> And reverting actually breaks a few other things, in similar ways. I was looking at this a while last night and will again tonight.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Install locations of gmetric and gstat
On Wed, Apr 2, 2008 at 2:37 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Currently gmetric and gstat are installed in /usr/bin, whereas gmond and gmetad are installed in /usr/sbin. IMHO I think all binaries should be installed to /usr/sbin. One might argue that maybe gstat should be made available to users, but I think gmetric should definitely be confined to /usr/sbin.

The gmond and gmetad programs certainly belong in /usr/sbin. I think gstat should stay in /usr/bin. For gmetric, I don't have a good argument *for* either directory. Why should it be in /usr/sbin? However, an argument *against* moving it is that moving gmetric to /usr/sbin will break any script that has /usr/bin/ hard coded. This is probably more common than anyone would like to admit. Based on that, I'd say leave things where they are.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
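Scripts that hard-code /usr/bin/gmetric are the ones a move would break. A wrapper that searches a short list of directories avoids the problem whichever way the decision goes. The following is a minimal Python sketch; the helper name `find_gmetric` and the directory list are assumptions chosen for illustration (the /opt/ganglia prefix is the one mentioned in the SMF manifest thread).

```python
import os
import shutil

# Candidate install locations; an illustrative list, adjust for the local site.
DEFAULT_DIRS = ["/usr/bin", "/usr/sbin", "/usr/local/bin", "/opt/ganglia/bin"]


def find_gmetric(search_path=None):
    """Return the full path of an executable named gmetric, or None.

    search_path is an os.pathsep-separated string of directories; when it
    is omitted, DEFAULT_DIRS is searched in order.
    """
    if search_path is None:
        search_path = os.pathsep.join(DEFAULT_DIRS)
    return shutil.which("gmetric", path=search_path)


if __name__ == "__main__":
    location = find_gmetric()
    if location is None:
        print("gmetric not found in any of: " + ", ".join(DEFAULT_DIRS))
    else:
        print("using " + location)
```

A cron job written this way keeps working if gmetric moves from /usr/bin to /usr/sbin, at the cost of one extra lookup per run.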
Re: [Ganglia-developers] Host graph in cluster view has duplicated hostname in title
On Wed, Apr 2, 2008 at 2:41 PM, Bernard Li [EMAIL PROTECTED] wrote:
> On Tue, Apr 1, 2008 at 7:02 PM, Jesse Becker [EMAIL PROTECTED] wrote:
>> Fixed, I think, in r1190.
> Almost. The per-host metric graphs' titles have hostnames before the metrictitle. Now that we are using the long description for the titles, you are almost guaranteed to run out of space in the graph. Previously, per-host metric graphs simply have the metrictitle/metricname as the title -- hostnames weren't there before.

Which is inconsistent behavior relative to the rest of the charts. Everywhere else, the device being plotted (a host, cluster, etc) is listed as the title, and metric being plotted is listed in the legend. Thus, the change to make per-host metrics the same. A large number of my hosts have fairly long names, and in some cases there is occasional clipping when using the smallest chart size. It isn't as bad as you'd expect though.

> What do you think about shortening the hostnames for display purposes (and only display purposes in these charts?

In all charts, or just the small chart size? Do you think that removing everything after the first "." would be sufficient? There's only so much that we can do about long hostnames though--users are free to name their hosts however they wish. this-is-compute-node-001.compute-cluster.domain.com is a perfectly valid hostname, although perhaps a bit silly.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
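"Removing everything after the first dot" is a one-line operation, with one trap: a host that reports as a bare IP address would be cut down to its first octet. A small Python sketch of the idea follows; `short_hostname` is a hypothetical helper for display purposes only, not existing frontend code.

```python
import ipaddress


def short_hostname(name):
    """Return the label before the first dot, for display purposes only.

    IP addresses are returned unchanged so that "10.0.0.1" does not
    collapse to "10".
    """
    try:
        ipaddress.ip_address(name)
        return name
    except ValueError:
        return name.split(".", 1)[0]


if __name__ == "__main__":
    print(short_hostname("this-is-compute-node-001.compute-cluster.domain.com"))
    print(short_hostname("10.0.0.1"))
```

This shortens the title but does nothing for a first label that is itself long, which is the limit the message points out.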
Re: [Ganglia-developers] Install locations of gmetric and gstat
On Wed, Apr 2, 2008 at 3:01 PM, Bernard Li [EMAIL PROTECTED] wrote:
> On Wed, Apr 2, 2008 at 11:48 AM, Jesse Becker [EMAIL PROTECTED] wrote:
> Gmetric injects metrics to the collection framework which gmond/gmetad belongs to, so to quote Martin, by logic, they should belong in the same location.

Well, both ssh and sshd are part of a secure communications framework. Would you put ssh in /usr/sbin? :-) I'll quote the FHS:

    /usr/sbin : Non-essential standard system binaries
    /usr/bin  : Most user commands

Based on that, I'll buy the "gmetric in /usr/sbin" argument.

>> moving gmetric to /usr/sbin will break any script that has /usr/bin/ hard coded. This is probably more common than anyone would like to admit. Based on that, I'd say leave things where they are.
> But isn't that a good incentive for folks to replace all their gmetric cronjobs with Python/C modules? :-)

Haven't we had this discussion already? :)

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Host graph in cluster view has duplicated hostname in title
On Wed, Apr 2, 2008 at 3:23 PM, Bernard Li [EMAIL PROTECTED] wrote:
> On Wed, Apr 2, 2008 at 12:08 PM, Jesse Becker [EMAIL PROTECTED] wrote:
>> What do you think about shortening the hostnames for display purposes (and only display purposes in these charts? In all charts, or just the small
> I have thought about that, by simply using the short hostname. I think this is fine if all your servers belong to one domain, but what if your cluster contains hosts from different domains having the same short hostname...?

And that's exactly why I didn't do this in the first place...

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
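The collision raised here (two hosts in different domains sharing a short name) can be handled by shortening only the names that stay unique within the cluster. The Python sketch below shows one way to do that; `display_names` is a hypothetical helper and the example.com hostnames are made up for the illustration.

```python
from collections import Counter


def display_names(hostnames):
    """Map each hostname to a display name.

    A host is shown by its short name (text before the first dot) only
    when no other host in the list shares that short name; otherwise
    the full name is kept so the titles stay unambiguous.
    """
    shorts = [name.split(".", 1)[0] for name in hostnames]
    counts = Counter(shorts)
    return {
        name: (short if counts[short] == 1 else name)
        for name, short in zip(hostnames, shorts)
    }


if __name__ == "__main__":
    cluster = ["web1.a.example.com", "web1.b.example.com", "db1.a.example.com"]
    for full, shown in display_names(cluster).items():
        print("%-22s -> %s" % (full, shown))
```

The trade-off is that a title can change length when a colliding host joins or leaves the cluster, which may be more surprising than always showing the full name.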
Re: [Ganglia-developers] Host graph in cluster view has duplicated hostname in title
On Wed, Apr 2, 2008 at 6:09 PM, Bernard Li [EMAIL PROTECTED] wrote:
> On Tue, Apr 1, 2008 at 7:02 PM, Jesse Becker [EMAIL PROTECTED] wrote:
>> Fixed, I think, in r1190.
> Looks like it's still broken.

But in a different way. That's progress! Sorta.

> If my gmetad is aggregating another gmetad, the Grid summary graph has the gridname, but the Cluster summary graph does not (so instead of "gridname Load last hour" it is just "Load last hour"). Here's the rrdgraph command via debug. Production is a remote gmetad I am aggregating:

Can you please post a URL for the graph in question? It doesn't have to be a full URL, just all the ugly CGI variables. Also, in r1202, there's a one-line addition to try and handle the grid context. I don't have an easy way of testing gmetad polling gmetad, so I can't easily test it.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] web httpd error log Graph [x] in [context y]
On Thu, Apr 3, 2008 at 5:51 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Instead of this change: http://ganglia.svn.sourceforge.net/viewvc/ganglia/trunk/monitor-core/web/graph.php?r1=1205r2=1204pathrev=1205 Perhaps, we can enable this when debugging is on? And perhaps further

That shouldn't have been committed. Rather, it shouldn't have been committed when uncommented. I see that it's already been re-commented though. I agree that a sort of 'debug' mode would be good. There is some code to handle it, but only in some cases, and not terribly consistently.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Host graph in cluster view has duplicated hostname in title
On Thu, Apr 3, 2008 at 6:17 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Hi Jesse:
> On Wed, Apr 2, 2008 at 7:50 PM, Jesse Becker [EMAIL PROTECTED] wrote:
>> Can you please post a URL for the graph in question? It doesn't have to be a full URL, just all the ugly CGI variables.
> graph.php?G=grid1m=r=hours=descendinghc=4st=1207260060g=load_reportz=mediumr=hour

Thanks.

>> Also, in r1202, there's a one-line addition to try and handle the grid context. I don't have an easy way of testing gmetad polling gmetad, so I can't easily test it.
> I don't see any different behaviour in the frontend.

Ah well, worth a shot. It shouldn't affect anything else, but I'll revert it in a bit.

> BTW, In your gmetad.conf, I wonder if you can have a data_source like:
>     data_source "blah" localhost:8651
> to mimic a meta-grid configuration...?

I'll give that a try, but doesn't the "blah" have to match the cluster name on the gmond? I've had poor luck when those didn't match.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
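For anyone experimenting with the suggested data_source line, it helps to see how the pieces break down: a quoted name, an optional polling interval, then one or more host[:port] entries. The Python sketch below parses that shape. It is an illustration only, assuming the usual gmetad.conf form `data_source "name" [interval] host[:port] ...`; the function `parse_data_source` is hypothetical, reports a missing interval or port as None rather than guessing a default, and does not handle IPv6 literals.

```python
import shlex


def parse_data_source(line):
    """Parse one gmetad.conf data_source line into a dict (sketch).

    Returns {"name": str, "interval": int or None,
             "sources": [(host, port or None), ...]}.
    """
    tokens = shlex.split(line, comments=True)
    if len(tokens) < 3 or tokens[0] != "data_source":
        raise ValueError("not a data_source line: %r" % line)
    name, rest = tokens[1], tokens[2:]
    interval = None
    if rest[0].isdigit():          # optional polling interval in seconds
        interval = int(rest[0])
        rest = rest[1:]
    if not rest:
        raise ValueError("data_source %r lists no addresses" % name)
    sources = []
    for item in rest:
        host, sep, port = item.rpartition(":")
        sources.append((host, int(port)) if sep else (item, None))
    return {"name": name, "interval": interval, "sources": sources}


if __name__ == "__main__":
    print(parse_data_source('data_source "blah" localhost:8651'))
```

Whether the name has to match the remote cluster or grid name is the open question in the message; the parser only shows where that name sits in the line.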
Re: [Ganglia-developers] Host name in graph titles...
On Fri, Apr 4, 2008 at 2:37 PM, Brad Nicholes [EMAIL PROTECTED] wrote:
> I know that this was discussed somewhere on the list, but I can't find where right now. Why is the host name of the machine being added to the title of all of the graphs? Isn't this redundant since you have already drilled down to that host? Shouldn't the graphs only contain the user friendly title just to distinguish one graph from another?

I agree that it's redundant. I had originally removed it as part of the modular graph changes, but I think someone requested that I add it back in. One reason to include the hostname is that you can then link/copy the image directly, and there is no ambiguity as to what the chart refers to. However, there's no reason at all to have the information duplicated on the same chart--having the hostname listed twice, for example.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Ganglia web page load time
On Mon, Apr 7, 2008 at 2:02 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Hi all: I have a Ganglia page which takes quite a bit of time to load and I was wondering if anybody have any PHP code that will allow me to measure the load time and print this in the main page (much like the "Downloading and parsing ganglia's XML tree took 0.0073s." line).

    $start = microtime(true);  // float seconds; plain microtime() returns a "msec sec" string
    // ... generate the page ...
    $end = microtime(true);
    $delta = round($end - $start, 4);
    print "Generating this page took $delta seconds";

Have to make sure that this isn't called for every single graph though, since that's a waste. It should give one tenth of a millisecond precision (from the round(..., 4)) to match the XML output. Should be fairly simple to add.

> I guess I could use some third-party plugins for Firefox or something for this, but thought it might be of general interest to put this in the main page...

Firebug will tell you.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Ganglia web page load time
On Mon, Apr 7, 2008 at 2:28 PM, Bernard Li [EMAIL PROTECTED] wrote:
> On Mon, Apr 7, 2008 at 11:14 AM, Jesse Becker [EMAIL PROTECTED] wrote:
>> $start = microtime(true); $end = microtime(true); $delta = round($end - $start, 4); print "Generating this page took $delta seconds";
> How can you tell when $start and $end is? For instance I don't think you can pin that on header and footer since footer may be loaded *before* all the graphs are drawn.

There isn't much else you can do that I can think of. From the point of view of the PHP code, once index.php has finished processing the footer information, that transaction is done. The graphs are all separate HTTP transactions, and requesting those is the duty of the web client. I can imagine a complicated set of checks to make sure that graphs are requested in due course, and report that time, but I don't think it would be pretty.

I don't think that the pure PHP parts of Ganglia are all that slow--they don't do anything terribly complicated. If I'm reading the output from Firebug correctly, generating the HTML took something like 20 milliseconds. The slow parts are in calling graph.php umpteen times, at times ranging from 150ms to almost a full second each. I expect that the delay is mostly from the actual rrdtool calls, as opposed to the processing done by PHP.

>> Firebug will tell you.
> Will try that out.

Also try the YSlow addon for Firebug, it is an interesting, and sometimes even useful, addition.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
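The point that the graphs, not the HTML, dominate the load time can be made concrete with a rough estimate. The Python sketch below uses the figures quoted in the message (about 20 ms for the HTML, 150 ms or more per graph.php call); the number of graphs on the page and the number of parallel browser connections are assumptions picked for illustration, and the scheduling model (each graph goes to whichever connection frees up first) is a simplification of real browser behaviour.

```python
import heapq


def estimate_page_time(html_seconds, graph_seconds, parallel_requests=2):
    """Estimate seconds until the last graph arrives (rough model).

    html_seconds      -- time to generate the HTML page
    graph_seconds     -- iterable with one duration per graph request
    parallel_requests -- assumed number of simultaneous browser connections
    """
    if parallel_requests < 1:
        raise ValueError("parallel_requests must be at least 1")
    # One entry per connection: the time at which it next becomes free.
    finish_times = [0.0] * parallel_requests
    heapq.heapify(finish_times)
    for seconds in graph_seconds:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + seconds)
    return html_seconds + max(finish_times)


if __name__ == "__main__":
    # Hypothetical page: 40 graphs at the fast end of the quoted range.
    total = estimate_page_time(0.020, [0.150] * 40, parallel_requests=2)
    print("Estimated time until the last graph arrives: %.2f s" % total)
```

Even with every graph at the fast end of the range, the estimate comes out at roughly three seconds, of which the PHP-generated HTML is well under one percent. A timer inside index.php would therefore report a small number on a page that still feels slow.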
Re: [Ganglia-developers] gmond.conf upgrade from 3.0.x - 3.1.x
On Fri, Apr 11, 2008 at 7:03 PM, Bernard Li [EMAIL PROTECTED] wrote:
> Right now I am working on the upgrade path from 3.0.x to 3.1.x. Since there are diffs between 3.0.x gmond and 3.1.x gmond, I am proposing the following changes in the spec file during upgrade:
>
>     %postun
>     gmond -t > /tmp/gmond.conf.old    # generate default conf from 3.0.x
>
>     %post
>     gmond -t > /tmp/gmond.conf        # generate default conf from 3.1.x
>     diff -ru /tmp/gmond.conf.old /tmp/gmond.conf > /tmp/gmond.conf.patch
>     cd /etc/ganglia
>     cp gmond.conf gmond.conf.3.0.x
>     patch -p2 < /tmp/gmond.conf.patch
>
> This *should* correctly update the old conf to the new conf. I guess the real question is whether we should automate this, or let the user manually fix this themselves... Thoughts?

How about doing it both ways? If there is an existing /etc/gmond.conf, then create a new patched file as something like /etc/gmond.conf.upgrade_from_3.0.x. This leaves the existing file alone, reducing the risk of breaking something. A prominent message should also be displayed, telling the admin that to finish the upgrade they need to move the new file into place. For a new install (i.e. there's no /etc/gmond.conf file), just install the file, of course.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
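The "both ways" behaviour suggested in the reply comes down to one decision at install time: never overwrite an existing config, write the new one alongside it, and tell the admin. The Python sketch below shows that decision only; the real package would do this in the RPM scriptlets, and the function name `install_gmond_conf` and the wording of the messages are hypothetical. The `.upgrade_from_3.0.x` suffix is the one proposed in the message.

```python
import os


def install_gmond_conf(new_conf_text, conf_path, suffix=".upgrade_from_3.0.x"):
    """Write a new gmond.conf without clobbering an existing one.

    Returns (path_written, message). If conf_path already exists, the
    new text goes to conf_path + suffix and the old file is left alone.
    """
    if os.path.exists(conf_path):
        target = conf_path + suffix
        message = (
            "NOTICE: existing %s was left untouched. Review %s and move it "
            "into place to finish the upgrade." % (conf_path, target)
        )
    else:
        target = conf_path
        message = "Installed new %s" % conf_path
    with open(target, "w") as handle:
        handle.write(new_conf_text)
    return target, message
```

Here `new_conf_text` stands for whatever the diff/patch step produced; generating it is a separate concern from deciding where it may safely be written.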
Re: [Ganglia-developers] Historical Data
On Wed, Apr 16, 2008 at 5:43 PM, Witham, Timothy D [EMAIL PROTECTED] wrote:
> I don't have space for it since my grids are too huge, but it would be easier to just keep more detail in the RRDs which is decided at create time.
> I haven't yet tried it, but gmetad/conf.c implies that the data retention policy could be changed in the config file (I don't see this option in the man page though; is that a bug?):

This is correct. You can change the *initial* settings for an RRD file *when it is created*. If you already have RRD files, changing the settings in the config file will have no effect. If you want to change your current files, then you need to export the existing data ('rrdtool dump'), create new files with new settings, then import the old data ('rrdtool restore').

> But maybe your parenthetical comment means you did that already but had too much waitIO? And that's why you went to a cron job? If so, you are storing the RRDs in tmpfs, right?

And read this paper: http://www.usenix.org/events/lisa07/tech/plonka.html

Not related to Ganglia, but it does discuss optimizing performance with RRD files. In short: turn off read-ahead, and upgrade to a recent version of rrdtool.

> Also, I am curious as to how the performance of rrdtool would be affected if we were to store related metrics in a single rrd file: e.g., we could group cpu_(user,system,idle,wio,nice) in a single file, which I think would reduce the resource usage of gmetad significantly.
> I have wondered that too. Since RRD is random access, it seems like it should be at least as efficient and probably more efficient since there would be less files open. But it would be difficult to change. Now

The cost of calling open() is fairly low, and even on huge clusters, I'd be surprised if this is hugely significant. RRD files have a short header section that stores information about the RRAs, the DSs, and offsets as to the current pointer for the RRAs within the file. With multiple metrics in a single file, you reduce the number of open() calls, but increase the number of calls to seek() within a single file.

One of the main points of the paper I mentioned above is that read-ahead is almost entirely wasted on RRD files. In order to read/write a single value in an RRA, the OS will open the file, read the header (which is short) plus many other blocks on disk because of read-ahead settings. Next, rrdtool must seek to the proper location in the RRD file, read a bunch of blocks (which we don't care about), then write the new data. Repeat this seek/readahead/write pattern for each RRA that needs to be updated.

> each RRD is simple with DS:sum and DS:num for summaries; the metric is in the filename itself. To change, you would need to put the metric names in the RRDs: DS:cpu_user_sum, etc. and I think you would have to update all metrics with one rrd_update call. Of course this would work only for the standard metrics and extra metrics would still need to be in their own files. Or, perhaps with the new metric groupings, each group could be an RRD file of related metrics. And then you'd have to change the PHP to understand all this...

yep. You could only consolidate a few sets of metrics, since not all systems support all of them. However, ideas for improving the FE are welcome. Once we get 3.1 (or 3.2) out the door, I'd like to work on a new FE, perhaps with things like consolidated RRD files in mind.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
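To make the consolidation idea concrete: if several metrics share one RRD file, the metric name has to move from the filename into the data source (DS) name, as in DS:cpu_user_sum. The Python sketch below builds such DS definitions for a group of metrics. It is a sketch under stated assumptions, not gmetad code: the GAUGE type, the heartbeat of 120 seconds and the helper name `grouped_ds_definitions` are illustrative, and the check reflects the DS naming rule given in the rrdcreate documentation (1 to 19 characters from [a-zA-Z0-9_]).

```python
import re

# DS naming rule as documented for rrdcreate: 1-19 chars of [a-zA-Z0-9_].
DS_NAME_RE = re.compile(r"^[A-Za-z0-9_]{1,19}$")


def grouped_ds_definitions(metrics, heartbeat=120, summary=True):
    """Return rrdtool DS definitions for metrics sharing one RRD file.

    With summary=True each metric gets a <metric>_sum and <metric>_num
    data source, mirroring the sum/num pair used for summary RRDs.
    """
    names = []
    for metric in metrics:
        if summary:
            names.extend([metric + "_sum", metric + "_num"])
        else:
            names.append(metric)
    for name in names:
        if not DS_NAME_RE.match(name):
            raise ValueError("invalid or too long DS name: %r" % name)
    return ["DS:%s:GAUGE:%d:U:U" % (name, heartbeat) for name in names]


if __name__ == "__main__":
    cpu_group = ["cpu_user", "cpu_system", "cpu_idle", "cpu_wio", "cpu_nice"]
    for definition in grouped_ds_definitions(cpu_group):
        print(definition)
```

The length limit is one practical reason consolidation would suit only the standard metrics: a user-defined metric with a long name plus the _sum suffix can exceed 19 characters, so such metrics would still need their own files or a naming scheme of their own.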
Re: [Ganglia-developers] Platform experts needed (was:Re: Ganglia 3.1.x stable branch has been created...)
On Fri, Apr 18, 2008 at 12:04 PM, Brad Nicholes [EMAIL PROTECTED] wrote:
> So here is another request to all you platform experts out there. The Ganglia project will be rolling alpha tarballs of the Ganglia 3.1 version. If the tarball does not work on your platform, please fix it and submit a patch back to the project. Ganglia 3.0.x already works on a variety of platforms and we would like to see 3.1.x work on an equal or greater number. But we need platform experts to make this happen. Here is your chance to jump in and help make Ganglia 3.1 the best release ever.

I just did a clean checkout of trunk/, and it seems to have built just fine on OpenBSD 4.1. This is a change from the recent past, where there were some bootstrap problems on this platform.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Platform experts needed (was:Re: Ganglia 3.1.x stable branch has been created...)
On Fri, Apr 18, 2008 at 2:56 PM, Bernard Li [EMAIL PROTECTED] wrote:
>> I just did a clean checkout of trunk/, and it seems to have built just fine on OpenBSD 4.1.
> Just to be sure, I would suggest you check out the 3.1 branch, even though it is probably not much different from trunk.

The problems I had recently were during the bootstrap stage, plus a few minor issues with ./configure. I just tried using r1256 from branches/monitor-core-3.1, and it looks fine.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] relicensing the web frontend as GNU GPL v2
On Sat, Apr 19, 2008 at 3:14 AM, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote:
> most likely just a formality, as the web frontend templating system was based on the GPLv2+ TemplatePower class from the very beginning (at least as shown from the history in svn). a quick line count from the files involved says the contributors that will need to consent will be (including number of lines committed from all files in the web directory including non php files which could be as well discarded as an alternative) :
>
>       38 bnicholes
>       87 carenas
>      410 knobi1
>      426 bernardli
>      686 hawson
>      830 sacerdoti
>     3940 massie

Fine with me.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] relicensing the web frontend as GNU GPL v2
On Tue, Apr 22, 2008 at 11:05 AM, [EMAIL PROTECTED] wrote:

> Quoting Brad Nicholes [EMAIL PROTECTED]:
> On 4/19/2008 at 10:48 AM, in message [EMAIL PROTECTED], Jesse Becker [EMAIL PROTECTED] wrote:
> On Sat, Apr 19, 2008 at 11:37 AM, Brad Nicholes [EMAIL PROTECTED] wrote:
>
> Apparently there are a lot of choices for replacing TemplatePower with some other templating class. Check out http://www.whenpenguinsattack.com/2006/07/19/php-template-engine-roundup/ We just need to find one that isn't GPL. Preferably BSD or MIT. Apache license would be good also.
>
> LGPL? (I'm looking through lists on freshmeat.net...)
>
> LGPL would be OK if we can't find something licensed under BSD, MIT or another more liberal license.
>
> Brad
>
> http://www.smarty.net/
> Smarty is probably the most common templating framework for PHP, and is LGPL. Maybe a bit heavyweight for what's needed in the Ganglia frontend.

I spent a little time looking at xtemplate. It's dual-licensed BSD and GPL, somewhat similar in several ways to the current template system, and appears to be fairly small.

http://www.phpxtemplate.org/HomePage

-- Jesse Becker
Re: [Ganglia-developers] relicensing the web frontend as GNU GPL v2
On Tue, Apr 22, 2008 at 2:00 PM, Brad Nicholes [EMAIL PROTECTED] wrote:

> If phpxtemplate will work for us, then this sounds like a good way to go from a licensing perspective. It is dual licensed under both LGPL and BSD. We would obviously take it under the BSD license. In either case, the Ganglia project source code could remain completely under the BSD license. How hard would it be to replace what we have today with phpxtemplate?

It didn't look too hard. I spent a little time looking into the conversion over the weekend. I can't say I got it to work, but I think that's because:

1) There are lots of places that need to be changed.
2) I haven't yet come up with a good automated search and replace. The syntax is similar in spirit, and format, to the current system, but different enough that a simple search and replace won't actually work.

It shouldn't be hard, I just need a few free moments--or someone else can take a crack at it, of course.

-- Jesse Becker
Re: [Ganglia-developers] [RFT] libmetrics: freebsd: avoid unitialized values and invalid casts for cpu_speed
On Wed, Apr 23, 2008 at 6:33 AM, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote:

> the following proposed patch for stable (3.0 and 3.1) removes a floating cast and the use of an uninitialized buffer which could result in high cpu_speed values when the size of the buffer used by the call to sysctlbyname on machdep.tsc.freq was smaller than the one proposed (8 bytes).
>
> attached original fix committed as r1293

+1

-- Jesse Becker
Re: [Ganglia-developers] subtitle still clovered for very long hostnames using 1.2.27
I just committed r1460 to trunk. This adds the $use_fqdn_hostname setting to conf.php. The default behavior is to *not* show the FQDN.

The patch adds a small utility function to remove everything after the first "." character in a hostname, and then adds checks for $use_fqdn_hostname in the various graphing functions.

It is possible to turn this option on, so that the FQDN is shown, but hosts that report themselves using only a short hostname will still be displayed as such, since we can't magically conjure an arbitrary domain for hosts that lack one.

On Tue, Jun 24, 2008 at 13:27, Bernard Li [EMAIL PROTECTED] wrote:

> Hi Carlo:
>
> I saw the comment in the STATUS file in 3.1 branch. Can you post a screenshot of what the graph looks like?
>
> We could probably get around it by having a user-configurable option $use_fqdn_hostname in conf.php to determine whether we use FQDN or short hostname when rendering the graphs. This has been a feature request and could potentially get around the issue. Of course if your short hostname is actually long -- then nothing we could really do can help.
>
> Cheers,
> Bernard

-- Jesse Becker
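The short-hostname behavior described above can be sketched in a few lines. This is only an illustration in Python, not the PHP committed in r1460: the function name `display_hostname` is hypothetical, and the guard that leaves IP addresses untouched is an added assumption, not something the message says the patch does.

```python
import ipaddress


def display_hostname(hostname: str, use_fqdn_hostname: bool = False) -> str:
    """Return the name to show in a graph title (hypothetical helper)."""
    if use_fqdn_hostname:
        # Show whatever the host reported; a short name stays short.
        return hostname
    try:
        # Assumption: a bare IP address should not be cut at its first dot.
        ipaddress.ip_address(hostname)
        return hostname
    except ValueError:
        pass
    # Drop everything after the first "." character.
    return hostname.split(".", 1)[0]


if __name__ == "__main__":
    print(display_hostname("node01.cluster.example.org"))
    print(display_hostname("node01.cluster.example.org", use_fqdn_hostname=True))
```

A host that reports only `node01` comes back unchanged in both modes, which matches the caveat that a domain cannot be invented for hosts that lack one.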
Re: [Ganglia-developers] Trunk r1438: web/graph.php
On Tue, Jun 24, 2008 at 01:48, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote:

> On Mon, Jun 23, 2008 at 11:36:02AM -0700, Bernard Li wrote:
> Hi Carlo:
> http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=1438
> Do we really want the debug message to be text/plain? Previously it was text/html (default)
>
> the default content type is a webserver configuration setting; I presume in your setup it is probably text/html, which is why you were not able to see the bug. it is especially annoying if you happen to have something like application/octet-stream instead, and I am pretty sure it has to be amusing if it happens to be audio/x-wav.

That seems like a reasonable reason to force the type ourselves. We know what the output is, and should properly declare it as such.

> and the text was wrapped such that you could see the entire command on the screen. Now there is no text-wrapping, which makes it hard to see the entire command in one go.
>
> you also couldn't see the bug because your browser was nice enough to format and wrap an invalid HTML file, but that is also specific to your setup. if you want to have it formatted and wrapped I would recommend doing that in the code instead. if that is the case HTML might be better, as it will also allow for other formatting aids like fonts, bold and colors, but I think the currently proposed setup is more practical for a debugging flag.

A middle ground would be to do *minor* preprocessing on the command, to make the lines wrap cleanly, then wrap it in <pre> tags.

-- Jesse Becker
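One way to picture the "minor preprocessing, then <pre>" middle ground is the sketch below. It is written in Python purely for illustration (graph.php is PHP), and the function name `render_debug_command` and the default width are assumptions, not anything taken from r1438.

```python
import html
import textwrap


def render_debug_command(command: str, width: int = 80) -> str:
    """Wrap a long command at whitespace and return it as an HTML <pre> block."""
    wrapped = textwrap.fill(
        command,
        width=width,
        break_long_words=False,   # never split a DEF:/PRINT: argument
        break_on_hyphens=False,   # keep options such as --start intact
        subsequent_indent="    ",
    )
    # Escape after wrapping so the width applies to what the reader sees.
    return "<pre>" + html.escape(wrapped) + "</pre>"


if __name__ == "__main__":
    cmd = ("/usr/bin/rrdtool graph - --start -3600 --end N "
           "DEF:avg='/path/to/cpu_num.rrd':'sum':AVERAGE PRINT:avg:AVERAGE:%.2lf")
    print("Content-Type: text/html")
    print()
    print(render_debug_command(cmd, width=60))
```

The design point is that the page declares its own content type and does its own wrapping, so the result no longer depends on the web server default or on how forgiving a particular browser is.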
Re: [Ganglia-developers] Stacked Graphs
I took a quick look at the patch, although I've not applied it yet. A few comments:

1) The HSV_TO_RGB and get_col functions should, IMO, be moved out to the functions.php file.
2) Chart size is fixed to 400x300. This should be based on the other charts (either pie chart, or other plots, depending on placement).
3) A few mostly minor coding inefficiencies (needless recalculation of an array length inside a loop, for example), but nothing major.

I'll try out the patch tomorrow if I get a chance (tonight is out).

On Mon, Jun 30, 2008 at 15:34, Bernard Li [EMAIL PROTECTED] wrote:

> On Thu, Jun 19, 2008 at 3:07 PM, Brad Nicholes [EMAIL PROTECTED] wrote:
> The patch looks good. Now we just need somebody with a lot more PHP web frontend experience than me to review the patch and determine if it should be committed to trunk. Jesse, Bernard... I'll leave it up to you or anybody else with commit rights looking for some code to review. :)
>
> I have reviewed the patch and updated the bugzilla entry. Waiting for response right now.
>
> The sooner we can get this checked in, the better. Otherwise other code changes may start to conflict with this patch.
>
> Cheers,
> Bernard

-- Jesse Becker
Re: [Ganglia-developers] What to do with the contrib directory???
Just to chime in here...

On Thu, Jul 10, 2008 at 13:17, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote:

> On Thu, Jul 10, 2008 at 08:46:11AM -0600, Brad Nicholes wrote:
> At the developers meeting last Feb. we talked about what to do with new module and other types of contributions such as utility scripts. A new contrib/ directory was created by hawson in March.
>
> that was I think unrelated with the debate about a module repository that we had at the summit and more linked to the immediate need of distributing useful user provided stuff like the SMF profiles needed to run ganglia in Solaris 10 as you can see by (also linked into the STATUS file) :
> http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg03807.html

Correct. I don't think that contrib/ was ever intended to be the proper location for user contributed gmond modules. I see it as more for miscellaneous other stuff--like SMF startup scripts, or routines for copying and restoring files from tmpfs on a regular basis, or other sorts of infrastructure tasks (for lack of a better term).

> I don't think we should just be dumping anything that lands in contrib/ into our builds.

Agreed, but the threshold need not be high for inclusion. I am *not* suggesting that we take any and every submission. However, if it's useful to someone, it's probably useful to someone else, and at least worth considering.

-- Jesse Becker
Re: [Ganglia-developers] [Ganglia-svn] SF.net SVN: ganglia: [1538]trunk/monitor-core/Makefile.am
On Thu, Jul 10, 2008 at 13:15, Brad Nicholes [EMAIL PROTECTED] wrote:

> I was specially interested in getting ganglia-rrd-modify.pl somehow
> [snip]
> I think adding contrib makes probably more sense (as is usually done in other opensource packages), and as far as we clear the distribution of all those contributions of course with some nice looking legalese (which I think has been implicitly granted through the process of getting those files publicly to the list)
>
> Agreed. I'm OK with it either way. If we add contrib/ to the package, then we should still have someplace where we put stuff that we like and think is valuable, but haven't approved yet. Does that make sense? However a download page on the wiki or some other kind of web directory listing might make it easier to reference for the user.

This is exactly what contrib directories are for: things that are useful and worth distributing as a courtesy, but are *not* directly supported by the main development team. If something is ever promoted/taken over by main developers, then it gets removed from contrib/, and added into the proper location elsewhere in the project.

-- Jesse Becker
Re: [Ganglia-developers] Auto configuration ideas
On Tue, Jul 29, 2008 at 11:35, [EMAIL PROTECTED] wrote:

> -Original Message-
> From: Jesse Becker [mailto:[EMAIL PROTECTED]
> [EMAIL PROTECTED] wrote:
> I've been looking at how we currently deploy Ganglia configuration files in our organisation, and whether the process can be improved.
> [snip]
> - allowing a central configuration server to override some, but not all, of the values specified in the config file
>
> This is possible with cfengine, and other configuration management tools. I have several groups of machines, and
>
> There are some more details that were not in my original email:
>
> - it is a multi-platform deployment, including Linux, Solaris and an OS that requires Cygwin

Cfengine works on all of those. Puppet works on the first two, and probably via Cygwin (Puppet is based on Ruby, and probably runs wherever Ruby does).

> - it is envisaged that some users will be able to change configuration settings through a web interface, and that those changes will be propagated to the nodes/clusters that the users who chosenfind

It should be straightforward to convert webpage input into various rules files for cfengine/puppet/etc. These rules would then be used to push the various settings around as needed.

-- Jesse Becker
Re: [Ganglia-developers] Backport proposals for typo fixes...
On Thu, Jul 31, 2008 at 17:49, Bernard Li [EMAIL PROTECTED] wrote:

> Hi all:
>
> I just fixed a typo in one of the configuration files in trunk, now I want this backported to the 3.1.x branch. Since this does not touch any actual code (just comments), can I get a free pass and not have to go through a backport proposal?

+1

> Perhaps we can outline some minor fixes that do not need to go through this peer review process and document this list in the Wiki?

I'd suggest that minor typo changes, minor rewording to things like error messages, and any documentation are fair game for this.

-- Jesse Becker
Re: [Ganglia-developers] Gmetad scalability tests
On Fri, Aug 1, 2008 at 09:14, [EMAIL PROTECTED] wrote:

> Has anyone written any kind of simulator for testing gmetad, e.g. a gmond that reports thousands of metrics for gmetad to log?

I've not heard specifically of a simulator, although a very large cluster would basically do the same thing. :-)

> With gmond 3.1, a simulator could probably be written as a plugin that creates thousands of metrics.

That could be very useful. Of course, it's not hard to fire off gmetric multiple times in a very tight loop either. :-)

It would be interesting to see such a module, and what bottlenecks it exposes. There are some known issues with gmetad and updating the .rrd files--this is why many people store them on a ramdisk.

-- Jesse Becker
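The "gmetric in a tight loop" idea could look roughly like the Python sketch below. It is not an existing simulator: the metric names, the `sim_metric` prefix and the helper names are made up for illustration, and it assumes a `gmetric` binary on the PATH that accepts the `--name`, `--value` and `--type` options. By default it only builds the command lines (a dry run), so nothing is sent unless asked.

```python
import subprocess
from typing import Iterator, List


def gmetric_commands(count: int, prefix: str = "sim_metric") -> Iterator[List[str]]:
    """Yield one gmetric argument list per synthetic metric (names are made up)."""
    for i in range(count):
        yield [
            "gmetric",
            "--name", f"{prefix}_{i}",
            "--value", str(i % 100),
            "--type", "uint32",
        ]


def fire(count: int, dry_run: bool = True) -> int:
    """Send (or, in a dry run, merely count) `count` synthetic metrics."""
    sent = 0
    for argv in gmetric_commands(count):
        if not dry_run:
            # Assumes gmetric is installed and gmond is reachable.
            subprocess.run(argv, check=True)
        sent += 1
    return sent


if __name__ == "__main__":
    print(fire(5000), "commands prepared (dry run)")
```

Spawning one process per metric is slow, which is part of why a gmond plugin that generates the metrics in-process, as suggested in the thread, would probably stress gmetad harder.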
Re: [Ganglia-developers] Ready for 3.1.1...?
On Fri, Aug 15, 2008 at 13:59, Bernard Li [EMAIL PROTECTED] wrote:

> Hi Brad:
>
> On Fri, Aug 15, 2008 at 10:48 AM, Brad Nicholes [EMAIL PROTECTED] wrote:
> I know it seems like just yesterday that we shipped Ganglia 3.1.0 (well probably because it was ;) But there have been some significant patches added to 3.1.1 including a fix for a segfault in gmetad. Due mainly to the segfault patch, I am proposing that we tag and roll a testing tarball of 3.1.1 within the next week with a goal of shipping 3.1.1 two weeks after the tag. There are still a number of backports
> Comments? Questions? Feedback?
>
> I am okay to roll new testing tarball as soon as we are ready -- just give the word.

Ditto. There's nothing in the backports proposal that absolutely must go into 3.1.1 (although anything that's been approved may as well get released).

-- Jesse Becker
Re: [Ganglia-developers] Bugzilla Bug 193: Avg Load percentages and overall cluster utilization.
On Tue, Aug 19, 2008 at 20:33, Bernard Li [EMAIL PROTECTED] wrote:

> I currently have an incomplete fix, but I need to get consensus as to what average utilization really means for grid of grids: should average utilization for a grid be load average divided by the number of cpus for the *entire* meta-grid or just over the grid in question?

For both the grid and the meta-grid, I think the average utilization should be counted over all hosts.

Consider a meta-grid that gets data from two grids, A and B. Grid A has 100 hosts and Grid B has 20. If 15 machines in each grid are running full-tilt (e.g. 100% utilization on 15 different hosts), then the Grid utilization figures are quite different. For Grid A, there's a utilization of 15%. For Grid B, it's 75%. If we weight the two grids equally, we get an average utilization of 45% ((15% + 75%) / 2), even though there are 90 idle systems.

On the other hand, if you take into account the number of hosts, you get a different figure: (15+15)/(100+20) = 25% average utilization. To me, this seems to be a more accurate representation of the overall usage.

Now, this does *not* take into account things like relative CPU speeds, or multi-core systems. 100% utilization on my ancient Celeron is quite a bit different than 100% on the latest quad-core Opteron. I don't think that we need to delve quite that far into comparing

> Alternatively, we can rollback this backport and punt it until 3.1.2.

Probably the cleanest solution for now.

> On a related note, I think we should distinguish between a Grid and a Meta-Grid (i.e. a grid of grids) in the Front End -- do people care?

Yes, it would be good to distinguish between them, even if all it does is say "Meta-Grid" instead of "Grid".
-- Jesse Becker
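The two ways of averaging in the Grid A / Grid B example can be checked with a few lines of Python. This is only a worked version of the arithmetic, not frontend code; the function names are made up, and each grid is represented as a hypothetical (total hosts, fully busy hosts) pair. With the figures from the example, weighting the grids equally gives (15% + 75%) / 2 = 45%, while weighting by host count gives 30/120 = 25%.

```python
from typing import List, Tuple

# Each grid is (total_hosts, hosts_at_100_percent); the layout is an assumption.
Grid = Tuple[int, int]


def equal_weight_utilization(grids: List[Grid]) -> float:
    """Average the per-grid utilization figures, ignoring grid size."""
    per_grid = [busy / hosts for hosts, busy in grids]
    return sum(per_grid) / len(per_grid)


def host_weighted_utilization(grids: List[Grid]) -> float:
    """Divide all busy hosts by all hosts across the whole meta-grid."""
    total_hosts = sum(hosts for hosts, _ in grids)
    total_busy = sum(busy for _, busy in grids)
    return total_busy / total_hosts


if __name__ == "__main__":
    meta_grid = [(100, 15), (20, 15)]  # Grid A, Grid B from the example
    print(f"equal weight:  {equal_weight_utilization(meta_grid):.1%}")
    print(f"host weighted: {host_weighted_utilization(meta_grid):.1%}")
```

As the message notes, neither figure accounts for CPU speed or core count; both treat every host as one equal unit.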
Re: [Ganglia-developers] printing output with rrdtool
On Fri, Sep 12, 2008 at 13:33, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote:

> Jessie,

Jesse, actually. :-)

> the following commit (r1754 in ganglia's svn) seems to be patching the fix proposed by Jason as part of BUG37 and that was committed in r1595 and has been left inconsistent (as not all uses of this feature have been converted to use /dev/null).

Then the other instances should be converted, IMO. I looked at this one specifically since it was causing trouble on a local install running trunk. Specifically, the error was:

    PHP Notice: Undefined offset: 1 in /home/beckerjes/ganglia/versions/trunk/monitor-core/web-working/functions.php on line 266, referer: http://saturn/ganglia-jb/?m=load_one&r=hour&s=descending&hc=4&mc=2

Tracing this back, you get this function:

    249 #---
    250 #
    251 # Finds the avg of the given cluster metric from the summary rrds.
    252 #
    253 function find_avg($clustername, $hostname, $metricname)
    254 {
    255     global $rrds, $start, $end;
    256     $avg = 0;
    257
    258     if ($hostname)
    259         $sum_dir = "$rrds/$clustername/$hostname";
    260     else
    261         $sum_dir = "$rrds/$clustername/__SummaryInfo__";
    262     $command = RRDTOOL . " graph /dev/null --start $start --end $end " .
    263         "DEF:avg='$sum_dir/$metricname.rrd':'sum':AVERAGE " .
    264         "PRINT:avg:AVERAGE:%.2lf ";
    265     exec($command, $out);
    266     $avg = $out[1];
    267     #echo "$sum_dir: avg($metricname)=$avg<br>\n";
    268     return $avg;
    269 }

Adding a one-line print let me see exactly the rrdtool command getting called (rrdtool 1.2.26):

    /usr/bin/rrdtool graph '' --start -3600 --end N DEF:avg='/long/path/to/rrds/__SummaryInfo__/cpu_num.rrd':'sum':AVERAGE PRINT:avg:AVERAGE:%.2lf

(Or something to that effect; the test system is at home, and I'm not. I'm building this on the fly from the code.)

Sure enough, no output at all, and this causes line 266 to throw the error. I can reproduce this using rrdtool 1.2.23 and 1.2.26. This was specifically in the meta-view, but not in the cluster- or host-views.
> I had been unable to reproduce any problem with the original patch using several different versions of rrdtool, but your comment seems to imply you had observed the problem somehow, could you elaborate on that?

I tested three cases:

    rrdtool -
    rrdtool ''
    rrdtool /dev/null

The only one that I could get to work consistently was 'rrdtool /dev/null'. This trick of using /dev/null is actually suggested in a number of rrdtool mailing list threads for situations where you want rrdtool to calculate a value, but suppress the generation of a graph.

I'd also like to note that there are no differences in syscalls between using /dev/null and rrdtool '', so there should be no additional cost between the two methods. There are some between using a dash and the other two, but I think that's because nothing is written to STDOUT at all. Of course, in the case of graph.php, we want 'rrdtool -', since the graphs are generated in-line.

> but the solution proposed by Jason has the advantage of being cross-platform (/dev/null doesn't exist in windows), so if you see no problem with the original patch I'd suggest you revert yours.

As I said, this was the only one of the three that I could get working consistently. I also happen to think that "rrdtool /dev/null blah" is more obvious than "rrdtool '' blah". So there's a minor argument to be made in favor of that as well. In the case of Windows, there is an equivalent NUL file that could be used.

Regardless, I do *not* think that the patch should simply get reverted. Arguably, the problem stems from a lack of proper error handling on the exec() call. This exists in r1753 (and persists in r1754, for that matter), so a simple revert won't help matters any. Instead, we need code to handle NULL output from exec(), in addition to an rrdtool invocation that can reliably suppress graph generation but still do the computations we want.
-- Jesse Becker
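The two points made above (pick a null device that exists on the platform, and never index into command output that may be empty) can be sketched as follows. This is Python for illustration only, not a patch to functions.php; the helper names are hypothetical, and treating the first line that parses as a number as the PRINT result is an assumption about rrdtool's output rather than documented behavior.

```python
import math
import os
import subprocess
from typing import Iterable, Optional


def parse_print_value(lines: Iterable[str]) -> Optional[float]:
    """Return the first line that parses as a finite number, else None."""
    for line in lines:
        try:
            value = float(line.strip())
        except ValueError:
            continue  # e.g. a size line such as "0x0", or a blank line
        if math.isfinite(value):
            return value
    return None


def rrd_average(rrdtool: str, rrd_path: str, start: int, end: str = "N") -> Optional[float]:
    """Ask rrdtool for an average without keeping a graph; None on any failure."""
    argv = [
        rrdtool, "graph", os.devnull,  # "/dev/null" on POSIX, "nul" on Windows
        "--start", str(start), "--end", end,
        f"DEF:avg={rrd_path}:sum:AVERAGE",
        "PRINT:avg:AVERAGE:%.2lf",
    ]
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, check=False)
    except OSError:
        return None  # rrdtool missing or not executable
    if proc.returncode != 0:
        return None
    return parse_print_value(proc.stdout.splitlines())


if __name__ == "__main__":
    print(parse_print_value(["0x0", "12.50"]))
    print(parse_print_value([]))
```

The caller then has to decide what an absent value means (skip the host, show "n/a", and so on) instead of hitting an undefined-offset notice.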
[Ganglia-developers] STATUS file etiquette
A few of us (Bernard, Brad Nicholes, and myself) were musing on IRC about the etiquette of STATUS file edits. We decided that this is better discussed on the wider list, and I was volunteered to broach the issue (lucky me).

So to get things started, here are a few things that *I* think would be useful. These are suggestions only--I'm in no position to dictate anything to anyone.

Suggestion 1) The +1, +0 and -1 votes get one line apiece, for a total of 3 lines. See below for an example.

Suggestion 2) Don't mess with other people's votes.

Suggestion 3) A vote of +1 does not need comment. If committing a vote of +0 or -1, a comment as to why is *strongly* encouraged. The comment can be either on the voting line, or immediately after the stanza. See below for an example.

Suggestion 4) When a backport has been accepted, move the entire stanza into a CHANGES file (or something similar), along with a date stamp. This should be done when the changes are actually committed to that branch, not when the votes are cast. This also means that a CHANGES file needs to be created.

So, any thoughts or comments? These are, as I said, suggestions only. I'm quite willing to be convinced of other behavior.

Example for #1 and #3 above:

    * gmond: Frobnicate the quazzle before inducticating the bibblebop.
      http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=-
      +1: hawson
      -1: Tom, Dick, Harry
      +0: Mr. Bill
      Tom: Inducticating should happen before the frobnication.
      Harry: The quazzle should be checked for extra widgets before frobnication.
      Mr. Bill: Oh no!
-- Jesse Becker
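One practical upside of keeping each vote type on its own line (Suggestion 1) is that a stanza becomes trivial to tally mechanically. The Python sketch below shows the idea against the example stanza; it is a hypothetical helper, not an existing Ganglia tool, and it deliberately applies no acceptance rule, since the thread does not define one.

```python
from typing import Dict, List


def tally_votes(stanza: str) -> Dict[str, List[str]]:
    """Collect voter names from '+1:', '+0:' and '-1:' lines of one stanza."""
    votes: Dict[str, List[str]] = {"+1": [], "+0": [], "-1": []}
    for raw_line in stanza.splitlines():
        line = raw_line.strip()
        for kind in votes:
            if line.startswith(kind + ":"):
                names = line[len(kind) + 1:].split(",")
                votes[kind].extend(n.strip() for n in names if n.strip())
    return votes


EXAMPLE = """\
* gmond: Frobnicate the quazzle before inducticating the bibblebop.
  +1: hawson
  -1: Tom, Dick, Harry
  +0: Mr. Bill
  Tom: Inducticating should happen before the frobnication.
"""

if __name__ == "__main__":
    for kind, names in tally_votes(EXAMPLE).items():
        print(kind, len(names), ", ".join(names))
```

Comment lines such as "Tom: ..." are ignored because they do not start with a vote marker, so comments can sit immediately after the stanza as Suggestion 3 proposes.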
Re: [Ganglia-developers] STATUS file etiquette
On Mon, Sep 15, 2008 at 19:59, Brad Nicholes [EMAIL PROTECTED] wrote:

> In general, I am on board with this as you outlined it. Let me make a couple more suggestions (inline)

Thanks for the comments. A few things I thought of after sending the 1st email are also below.

> Suggestion 2) Don't mess with other people's votes.
>
> +1, even when a proposal is modified, the existing votes need to remain. However we do need to somehow put a procedure in place that allows for a re-review. I don't have any suggestions for that yet.

Perhaps an SVN commit that does nothing but remove *all* votes for a given stanza, and make sure the log entry indicates what is going on? Alternately, if a proposal is reworked, and needs re-review, add that to the notes about the backport. That's...cumbersome, but I can't think of a better idea.

> Suggestion 4) When a backport has been accepted, move the entire stanza into a CHANGES file (or something similar), along with a date stamp. This should be done when the changes are actually committed to that branch, not when the votes are cast. This also means that a CHANGES file needs to be created.
>
> +1 again, another suggestion would be to just move the stanza to a lower section within the same STATUS file. But I am good with this either way.

I think that I prefer moving them to another file, for two reasons:

1) It keeps the STATUS file from growing without bound.
2) The revision diffs should be very obvious as to what is happening, what with a big chunk of text removed from STATUS, and appearing in CHANGES.
-- Jesse Becker
Re: [Ganglia-developers] STATUS file etiquette
On Tue, Sep 16, 2008 at 05:39, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote:

> On Mon, Sep 15, 2008 at 07:22:27PM -0400, Jesse Becker wrote:
> A few of us (Bernard, Brad Nicholes, and myself) were musing on IRC about the etiquette of STATUS file edits.
>
> since I did 55.4% of the commits to the 3.1 STATUS file and all the other committers except for Martin (2.7%) were part of that IRC meeting, guess this was directed at me.

Actually, no. It stemmed more from a confusion on my part as to the proper way of doing things. Etiquette isn't the right word, perhaps procedure is better (I couldn't think of it yesterday).

> We decided that this is better discussed on the wider list, and I was volunteered to broach the issue (lucky me).
>
> in our wiki under the section of communication it is suggested that decisions made on IRC be summarized to the list as shown by :
> http://ganglia.wiki.sourceforge.net/ganglia_project

No decisions were made, as I pointed out. And you'll note that this issue specifically was brought on the wider list *because* there were only three people involved.

> So to get things started, here are a few things that *I* think would be useful. These are suggestions only--I'm in no position to dictate anything to anyone.
>
> Suggestion 1) The +1, +0 and -1 votes get one line apiece, for a total of 3 lines. See below for an example.
>
> +1
>
> funny you mention it was your idea though since that was the way that it was documented to work before as reflected by the template until this commit reformatted all entries differently :
>
> r1716 | hawson | 2008-08-22 21:15:21 -0700 (Fri, 22 Aug 2008) | 3 lines
> * Added reviews to a number of proposed backports.
> * split a few +1 lines that contained multiple users into multiple +1 lines

Yep. I'm aware of that. I actually still prefer multiple lines for two reasons:

1) It makes it more obvious how many votes have been cast, as well as the relative number of +1, +0, -1.
2) It makes the diffs cleaner when looking to see what votes were added/changed. However, this is such a minor thing, that it isn't worth arguing over. I suggested single lines since that seems to be what other people prefer. Suggestion 2) Don't mess with other people's votes. -1 votes are attached to patch proposals and so if the proposal changes the vote has to be recasted (indeed we talked about this in our ganglia meeting in groundworks) I agree that new patches require new votes, but there needs to be more communication when this happens. Deleteing the votes and comments removes that information about previous patches, and potentially why it was rejected (or not) in the first place. This information is useful, and should be preserved somehow. Perhaps we could do something like this (all revisions and patch numbers are 100% bogus) * gmond: recover from interface reconfiguration gracefully http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=38 http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=revrevision=1478 http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=revrevision=1632 228 -1: carenas 229 carenas: not a fix but a workaround to the problem if the proponent finds a problem with his original proposal and changes it, keeping the +1 votes on it will be incorrect as what was verified was something different to what the vote is attached to. the only logical course of action should be to remove the vote so that it can be voted again. Suggestion 3) A vote of +1 does not need comment. If committing a vote of +0 or -1, a comment as to why is *strongly* encouraged. +1 with comments the wording used slightly contradicts the Decision Making section on our wiki for Project Administration http://ganglia.wiki.sourceforge.net/ganglia_project -1 MUST have a comment that explains clearly why the current proposal needs more work before backported and ideally the missing pieces or an alternative proposal contributed. 
it is also important to note that a backport rejection that says something like I don't like this, I think we should do it differently should also take into consideration that trunk is already changed to what the proposal was suggesting and so an alternative proposal MUST be contributed and hopefully a broad discussion started. A fair point. But if there are objections to the backport patch, this may mean that the upstream patch to trunk should be reviewed and possibly amended. most of the time though, I would expect discussion will be started directly in trunk and even before the patch is proposed for backport, after all trunk is under CTR rules and the R there means REVIEW. The comment can be either on the voting line, or immediately after the stanza. See below for an example. -1 as explained above this could be problematic for keeping the votes in only 3 lines so it might
Re: [Ganglia-developers] STATUS file etiquette (ignore previous email please)
Crud... Gmail sent the last email before I was done. (PEBKAC, really) Please ignore the previous email, and read this one instead.

On Tue, Sep 16, 2008 at 05:39, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote:

On Mon, Sep 15, 2008 at 07:22:27PM -0400, Jesse Becker wrote:

A few of us (Bernard, Brad Nicholes, and myself) were musing on IRC about the etiquette of STATUS file edits.

since I did 55.4% of the commits to the 3.1 STATUS file and all the other committers except for Martin (2.7%) were part of that IRC meeting, guess this was directed at me.

Actually, no. It stemmed more from confusion on my part as to the proper way of doing things. "Etiquette" isn't the right word; perhaps "procedure" is better (I couldn't think of it yesterday).

We decided that this is better discussed on the wider list, and I was volunteered to broach the issue (lucky me).

in our wiki under the section of communication it is suggested that decisions made on IRC be summarized to the list as shown by: http://ganglia.wiki.sourceforge.net/ganglia_project

No decisions were made, as I pointed out. And you'll note that this issue specifically was brought to the wider list *because* there were only three people involved.

So to get things started, here are a few things that *I* think would be useful. These are suggestions only--I'm in no position to dictate anything to anyone.

Suggestion 1) The +1, +0 and -1 votes get one line apiece, for a total of 3 lines. See below for an example.

+1

funny you mention it was your idea though since that was the way that it was documented to work before as reflected by the template until this commit reformatted all entries differently:

    r1716 | hawson | 2008-08-22 21:15:21 -0700 (Fri, 22 Aug 2008) | 3 lines

    * Added reviews to a number of proposed backports.
    * split a few +1 lines that contained multiple users into multiple +1 lines

Yep. I'm aware of that. Consensus seems to be to use single lines, so I suggested it. I actually prefer multiple lines for two reasons:

1) It makes it more obvious how many votes have been cast, as well as the relative number of +1, +0, -1.
2) It makes the diffs cleaner when looking to see what votes were added/changed.

Suggestion 2) Don't mess with other people's votes.

-1

votes are attached to patch proposals and so if the proposal changes the vote has to be recast (indeed we talked about this in our ganglia meeting in groundworks)

I agree that new patches require new votes, but there needs to be more communication when this happens. At the very least, there should be a note to -devel that a new patch is present and the issue in question needs a revote. Furthermore, deleting the votes and comments removes information about previous patches, and potentially why a patch was rejected (or not) in the first place. This information is useful, and should be preserved somehow. Perhaps we could do something like this (all revisions and patch numbers are 100% bogus):

    * gmond: report CPU color
      http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=12345
      http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=100
      -1: hawson
      hawson: doesn't actually work due to changes in r150.
      http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=200
      +1: hawson: actually works

This is a bit cumbersome, but does have the advantage of keeping a timeline of sorts, as well as comments and vote history. Another possibility would be to add a date stamp of some sort:

    * gmond: report CPU color (proposed 2008-06-01)
      http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=12345
      http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=100
      -1: hawson (2008-07-01)
      hawson: doesn't actually work due to changes in r150.
      http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=200
      +1: hawson: actually works (2008-08-01)

Or perhaps starting with:

    * gmond: report CPU color (proposed 2008-06-01)
      http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=12345
      http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=100
      -1: hawson
      hawson: doesn't actually work due to changes in r150.

and once a new patch is out, clearing votes and comments but adding a note that a revote is needed:

    * gmond: report CPU color (proposed 2008-06-01)
      http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=12345
      http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=100
      http://ganglia.svn.sourceforge.net/viewvc/ganglia?view=rev&revision=200
      (REVOTE NEEDED)

it is also important to note that a backport rejection that says something like "I don't like this, I think we should do it differently" should also take into consideration that trunk is already changed to what the proposal was suggesting and so an alternative proposal MUST be contributed
Re: [Ganglia-developers] STATUS file etiquette (ignore previous email please)
On Wed, Sep 17, 2008 at 10:45, Brad Nicholes [EMAIL PROTECTED] wrote:

I'm not sure I like any of those options. How about if we use a modification of your first suggestion.

[snipping of good suggestions]

The voting history or basically what happened during the proposal review is all captured in a series of brief comments in the STATUS file. The original author may also decide to include a link to an email thread where a more detailed discussion is or has taken place. In this process nobody messes with anybody else's votes and the original author of the proposal remains in control of the proposal.

This seems reasonable to me. I think that it would be worth trying this procedure to see how it works out.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] Ganglia Wiki organization
A bit late in replying...

On Thu, Sep 25, 2008 at 20:17, Bernard Li wrote:

I have created a new navigation link called Development in the Ganglia Wiki: http://ganglia.wiki.sourceforge.net/development I envision to place all development-related resources under this page.

Great idea.

Going through the other navigation links, I propose that we move How the Project Works, How to Participate, Project Administration and Wish List under this subcategory -- thoughts?

Agreed, with one minor change: leave the "How to Participate" link on the sidebar. I think that it is important not to bury that link in a sub-page.

By the same token, I am thinking of creating another subcategory called User resources and place some of the other pages there.

That's reasonable. What would go there? For my part, I think that the following wiki pages should *remain* in the sidebar, or somewhere obvious on each page:

* Home
* A link to the documentation, including installation
* How to Participate
* Release Notes
* Link to gmetric/module repos

The above list is not sorted. I also don't have a problem with pages being linked from multiple places. For example, a documentation link on the sidebar does not preclude having a second documentation link on the developer page(s).

One of the guidelines for Wikipedia editors is "Be Bold".[1] So have at it! We can always change or revert the pages if it doesn't work out.

[1] http://en.wikipedia.org/wiki/Wikipedia:Be_bold

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
[Ganglia-developers] backport proposal for graph statistics, bugzilla 206
Just a note that I've added a backport proposal for bugzilla ID#206 into the 3.1.x branch STATUS file. This is a split from #193. I've consolidated several patches from trunk into a single patch, and posted it. Please review, test and vote.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] backport proposal for graph statistics, bugzilla 206
On Sat, Oct 18, 2008 at 13:33, Carlo Marcelo Arenas Belon [EMAIL PROTECTED] wrote:

On Sat, Oct 18, 2008 at 01:14:32PM -0400, Jesse Becker wrote:

I've consolidated several patches from trunk into a single patch, and posted it. Please review, test and vote.

was just looking at that and I have to admit I am not sure why a consolidated patch that diverges from trunk will be needed. couldn't just a merge from all relevant patches in trunk be used for backport? if there are a few minor textual changes why not include them as well to avoid having later conflicts when trying to merge further stuff from trunk?

Because I figured it would be easier to review and test one patch, instead of the 4-5 that it takes otherwise. The single patch was produced directly from a diff against trunk; no new code is included.

if the changes that were skipped were not good for 3.1, then they are not good for trunk either and that could be fixed with further patches in trunk that then could be added to the list of patches from 3.1 for backport.

Nothing was skipped. In fact, given that new stuff goes into trunk *before* it goes into 3.1, I don't really see how it could have been skipped. As I said, there's no new code here.

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
Re: [Ganglia-developers] use of find_avg() in cluster_view.php
On Mon, Dec 8, 2008 at 11:26, [EMAIL PROTECTED] wrote:

Looking more closely at how find_avg() is used by cluster_view.php:

    $avg_cpu_num = find_avg($clustername, "", "cpu_num");
    if ($avg_cpu_num == 0) $avg_cpu_num = 1;
    $cluster_util = sprintf("%.0f", ((double) find_avg($clustername, "", "load_one") / $avg_cpu_num ) * 100);

Basically, the code above takes the quotient of two averages:

    util = average(load_one) / average(cpu_num)

Is that really the same as taking the average of the quotient?

    X = set of values for each t, (load_one(t) / cpu_num(t))
    util = average(X)

Computing the values with a global focus or with a per-system focus gives wildly different answers. Consider the two systems shown at this URL: http://spreadsheets.google.com/pub?key=pLqhPr4caFmQ3g_G7cwUCSQ

It shows a cluster made of two dissimilar systems: one with 4 CPUs and a load of 4, and one with one CPU and a load of 0. Is your cluster utilization 80% or 50%?

I (currently) believe that the current method, which is to divide average(load) by average(cpu_num), is the correct behavior. Computing multiple per-system utilization figures, then averaging them, is less immediately useful. (This is not to say useless, since these per-system utilization numbers could help to give you an idea of which rackspace is being used efficiently.)

--
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2
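To make the 80% versus 50% question concrete, here is a small sketch of the two calculations for the two-host example described in the message. It is written in Python rather than PHP purely for illustration, and the `hosts` list and both function names are hypothetical; none of this is taken from the Ganglia source.

```python
# Hypothetical two-host cluster from the example in the message:
# one host with 4 CPUs at load 4, one host with 1 CPU at load 0.
hosts = [
    {"load_one": 4.0, "cpu_num": 4},
    {"load_one": 0.0, "cpu_num": 1},
]


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def quotient_of_averages(hosts):
    """average(load_one) / average(cpu_num), as a percentage."""
    avg_cpu = mean(h["cpu_num"] for h in hosts)
    if avg_cpu == 0:
        avg_cpu = 1  # same divide-by-zero guard as the quoted PHP
    return mean(h["load_one"] for h in hosts) / avg_cpu * 100


def average_of_quotients(hosts):
    """average(load_one / cpu_num) over hosts, as a percentage.

    Assumes every host reports cpu_num > 0.
    """
    return mean(h["load_one"] / h["cpu_num"] for h in hosts) * 100


print(f"{quotient_of_averages(hosts):.0f}%")   # cluster-wide view
print(f"{average_of_quotients(hosts):.0f}%")   # per-system view
```

The first figure weights each host by its CPU count (total load over total CPUs), which is why the busy 4-CPU host dominates; the second treats every host equally regardless of size.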