Re: [Ganglia-developers] Dynamically resizable buffer for slurpfile()
On Wed, Feb 23, 2011 at 05:12:03PM -0800, Bernard Li wrote:
> I tested under EL5 and EL6 and it wasn't able to get past the initial buffer size. I believe what I did was:
>
> Correction. It works on EL6, but not on EL5:

most likely the test is just giving inconsistent results, and that is why it now works in EL5, while it didn't before.

> [CentOS 5.5 x86_64 with kernel 2.6.18-194.32.1.el5]
>
>     read(3, "2.6.18-194.32.1.", 16) = 16
>     read(3, "", 16) = 0
>
> [RHEL6b2 x86_64 with kernel 2.6.32-37.el6.x86_64]
>
>     read(3, "2.6.32-37.el6.x8", 16) = 16
>     read(3, "6_64\n", 16) = 5
>
> The issue may be specific to files in /proc/sys, because I tried reading /proc/stat on CentOS 5.5 and it worked fine.

very unlikely, and considering that this is a code modification you made and that only you have, the problem is most likely in your code anyway (maybe even miscompiled)

Carlo

--
Free Software Download: Index, Search & Analyze Logs and other IT data in Real-Time with Splunk. Collect, index and harness all the fast moving IT data generated by your applications, servers and devices whether physical, virtual or in the cloud. Deliver compliance at lower cost and gain new business insights.
http://p.sf.net/sfu/splunk-dev2dev
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
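For anyone who wants to repeat the comparison in this message without strace, here is a minimal sketch in C that reads a file in fixed-size chunks and records what each read() call returns. The name trace_reads and its parameters are made up for this illustration and are not part of ganglia; the 64-byte internal buffer is an arbitrary assumption.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not ganglia code): read `path` in chunks of `chunk`
 * bytes and store every read() return value in `results`, stopping after
 * the first return value <= 0 or once `max_results` values are recorded.
 * Returns the number of recorded values, or -1 if the file cannot be
 * opened or `chunk` is 0 or larger than the internal buffer.
 */
int trace_reads(const char *path, size_t chunk, ssize_t *results,
                int max_results)
{
    char buf[64];
    int fd;
    int n = 0;

    if (chunk == 0 || chunk > sizeof(buf))
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    while (n < max_results) {
        ssize_t r = read(fd, buf, chunk);

        if (r < 0 && errno == EINTR)
            continue;           /* interrupted by a signal, just retry */
        results[n++] = r;
        if (r <= 0)
            break;              /* 0 means end of data, <0 means error */
    }

    close(fd);
    return n;
}
```

Called with a 16-byte chunk on /proc/sys/kernel/osrelease, this should record the same sequence of return values that the strace output above shows for the kernel it runs on; on a regular file it keeps returning data until the file is exhausted.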
Re: [Ganglia-developers] Dynamically resizable buffer for slurpfile()
On Wed, Feb 23, 2011 at 09:42:56AM -0800, Bernard Li wrote:
>> what second pass?
>>
>>     dummy = proc_sys_kernel_osrelease;
>>     rval.int32 = slurpfile("/proc/sys/kernel/osrelease", &dummy,
>>                            MAX_G_STRING_SIZE);
>>
>> why would anyone call slurpfile in a loop anyway? slurpfile doesn't call itself recursively; it just reads as much data as it can into the buffer provided (second parameter).
>
> Sorry I wasn't clear, I meant the goto read loop:
>
>     read:
>       read_len = read(fd, db, buflen);
>       if (read_len <= 0)
>         {
>           if (errno == EINTR)
>             goto read;
>           err_ret("slurpfile() read() error on file %s", filename);
>           close(fd);
>           return SYNAPSE_FAILURE;
>         }

this code is not relevant, as the goto there is only taken when EINTR is received because a signal interrupted the read call (very unlikely). The second conditional after that code is used to continue reading into the buffer after it is resized, if that is possible, and that works fine as shown by your tests.

>     if (read_len == buflen)
>       {
>         if (dynamic) {
>           dynamic += buflen;
>           db = realloc(*buffer, dynamic);
>           *buffer = db;
>           db = *buffer + dynamic - buflen;
>           goto read;
>         } else {
>           --read_len;
>           err_msg("slurpfile() read() buffer overflow on file %s", filename);
>         }
>       }
>
> When I straced the process, the first read() was able to read up to MAX_G_STRING, however, the second read() returns 0. However, if I read a regular file (not in /proc filesystem), it was able to read the rest of the string in the second pass just fine.

this just sounds too strange, but I was able to replicate it after a lot of guessing in a CentOS 5 VM (both 32bit and 64bit), as shown by:

    # strace -e read dd if=/proc/sys/kernel/osrelease bs=16 > /dev/null
    read(0, "2.6.18-164.9.1.e", 16) = 16
    read(0, "", 16) = 0

so not a ganglia problem, just a problem with the way you were trying to use slurpfile and the way that specific sysctl handler is implemented in that version of the kernel.

it makes sense anyway to not worry about partial reads from a value that is meant to be used whole, but interestingly enough, and as you reported later, it no longer works that way with newer kernels.

> Regarding this particular bug -- how should we fix this? There are currently two issues:
>
> 1) The OS release is truncated in the web frontend

and that is to protect the gmond process against crashes

> 2) The warning "slurpfile() read() buffer overflow on file /proc/sys/kernel/osrelease" is displayed multiple times during RPM installation (possibly because gmond was called to generate conf files etc.)

that was meant to be mostly informative, but the message might need to be reworked to be more effective.

> Can we potentially increase MAX_G_STRING or have proc_sys_kernel_osrelease buffer size resize dynamically?

no

Carlo
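To make the "resize and keep reading" branch discussed in this message easier to follow, here is a stand-alone sketch in C. It is not the ganglia slurpfile(): the name slurp_growing, the step parameter and the error handling are assumptions made for illustration. It also shows why growing the buffer would not help on the CentOS 5 kernel above: when the second read() returns 0, the loop ends with only the first chunk.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the ganglia slurpfile()): read all of `path`
 * into a malloc'ed, NUL-terminated buffer that grows by `step` bytes each
 * time it fills up. On success stores the buffer in *out (caller frees it)
 * and returns the number of bytes read; returns -1 on any error.
 */
ssize_t slurp_growing(const char *path, size_t step, char **out)
{
    size_t cap = step;
    size_t len = 0;
    char *buf;
    char *tmp;
    int fd;

    if (step == 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    buf = malloc(cap + 1);              /* +1 for the terminating NUL */
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    for (;;) {
        ssize_t r = read(fd, buf + len, cap - len);

        if (r < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            free(buf);
            close(fd);
            return -1;
        }
        if (r == 0)
            break;                      /* EOF, or a handler that stops early */

        len += (size_t)r;
        if (len == cap) {               /* buffer full: grow, then read on */
            tmp = realloc(buf, cap + step + 1);
            if (tmp == NULL) {
                free(buf);
                close(fd);
                return -1;
            }
            buf = tmp;
            cap += step;
        }
    }

    close(fd);
    buf[len] = '\0';
    *out = buf;
    return (ssize_t)len;
}
```

Unlike the quoted snippet, this sketch checks the realloc() result before overwriting the old pointer and tracks the fill level separately from the capacity; whether either change is wanted in ganglia itself is a separate question.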
Re: [Ganglia-developers] $conf array
+1 to that too

On Wed, Feb 23, 2011 at 23:49, Bernard Li bern...@vanhpc.org wrote:
> +1 from me as well. I guess we should probably check it into both monitor-web-2.0 and trunk.
>
> Cheers,
> Bernard
>
> On Wed, Feb 23, 2011 at 7:35 PM, Jesse Becker haw...@gmail.com wrote:
>> +1
>>
>> On Wed, Feb 23, 2011 at 21:27, Alex Dean a...@crackpot.org wrote:
>>> One of my gripes with the current PHP frontend code is how hard it can be to recall which variables are configuration, which come from user input, and which are just local variables.
>>>
>>> As one step toward fixing this issue, I think it would be nice to place all configuration values (mainly in conf.php currently) into a $conf array. The benefit is that it's immediately clear in any code which uses these values that you're dealing with configuration values. There's no danger of name collisions with other variables.
>>>
>>> I'm just wondering if others feel the same way, and would support a change like this. It's pretty straightforward to do, but would obviously touch a lot of different code. Before I go ahead with making all those changes, I guess I'd just like to know if there are any huge objections out there to this idea.
>>>
>>> Take a look, let me know what you think. I'd like to do something similar for user input as well, maybe $user?
>>>
>>> http://pastie.org/1600587
>>>
>>> thanks,
>>> alex

--
Jesse Becker
Re: [Ganglia-developers] hsflowd for Windows + Ganglia webfrontend
Hi Neil:

I finally had a chance to test out the patch. Didn't run into any major issue on my end, so +1 from me.

On Thu, Feb 17, 2011 at 2:22 PM, Neil McKee neil.mc...@inmon.com wrote:
> There are three metrics to draw particular attention to:
>
> 1. System UUID

I noticed that on my Windows host, UUID is ----. I guess UUID generation on Windows is not supported yet? I tested with hsflowd 1.12 on Windows XP.

> The Datasource ID and Parent Datasource ID can be treated as opaque strings that the UI could use to capture and represent the containment hierarchy.

Perhaps you could explain a bit about the format of Datasource ID?

I've granted you SVN access, so please feel free to check the code into trunk. But perhaps Brad might want to review the code quickly before you do so :-)

Can you also modify the manpage for gmond.conf plus add it to the default configuration? I'm okay with accept_all_physical = yes as the default.

BTW, are you interested in implementing UUID for gmond? We've been talking about using UUID instead of hostname/IP as host identifiers because those things could change, so I think this would be a great feature to be added to our code base.

Thanks,
Bernard
Re: [Ganglia-developers] hsflowd for Windows + Ganglia webfrontend
On Feb 24, 2011, at 3:07 PM, Bernard Li wrote:
> Hi Neil:
>
> I finally had a chance to test out the patch. Didn't run into any major issue on my end, so +1 from me.
>
> On Thu, Feb 17, 2011 at 2:22 PM, Neil McKee neil.mc...@inmon.com wrote:
>> There are three metrics to draw particular attention to:
>>
>> 1. System UUID
>
> I noticed that on my Windows host, UUID is ----. I guess UUID generation on Windows is not supported yet? I tested with hsflowd 1.12 on Windows XP.

The Windows UUID appears for me, but that's from a Windows 7 OS. Maybe we would need to look in a different place on XP. (The Linux port falls back on the UUID of the first local disk if it can't get a UUID for the whole system, so maybe something similar would be acceptable as a workaround.) Please raise this question on host-sflow-discuss.

>> The Datasource ID and Parent Datasource ID can be treated as opaque strings that the UI could use to capture and represent the containment hierarchy.
>
> Perhaps you could explain a bit about the format of Datasource ID?

The specs on sflow.org cover this, but basically it consists of {IPAddress,dsClass,dsIndex}, where the IPAddress can be a v4 or v6 address, and the dsClass tells you whether the dsIndex refers to an interface, a physical entity or a logical entity (such as a VM or application). Conceptually I think of each datasource as being one observation point in the system. From ganglia's perspective it's probably best to treat it as an opaque string, and just use it to know, for example, that a particular VM is running on a particular hypervisor.

> I've granted you SVN access, so please feel free to check the code into trunk. But perhaps Brad might want to review the code quickly before you do so :-)

OK. I'll wait for Brad to comment.

> Can you also modify the manpage for gmond.conf plus add it to the default configuration? I'm okay with accept_all_physical = yes as the default.

OK.

> BTW, are you interested in implementing UUID for gmond? We've been talking about using UUID instead of hostname/IP as host identifiers because those things could change, so I think this would be a great feature to be added to our code base.

I really don't know my way around gmond, and that sounds like it might be a far-reaching change.

Neil

> Thanks,
> Bernard
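As a rough illustration of the "treat it as an opaque string" advice in this message, the sketch below links a VM to its hypervisor purely by string comparison of the two IDs, without parsing the {IPAddress,dsClass,dsIndex} parts. The struct ds_host layout and the find_parent name are assumptions made for this example; the thread does not show how ganglia actually stores these values.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical record for this illustration only; ganglia's real host
 * structures are not shown in this thread.
 */
struct ds_host {
    const char *name;          /* display name of the host or VM */
    const char *ds_id;         /* opaque Datasource ID */
    const char *parent_ds_id;  /* opaque Parent Datasource ID, NULL if none */
};

/*
 * Return the entry whose Datasource ID equals the child's Parent
 * Datasource ID, or NULL when the child has no parent or the parent is
 * not among the `n` known hosts. The IDs are only compared, never parsed.
 */
const struct ds_host *find_parent(const struct ds_host *hosts, size_t n,
                                  const struct ds_host *child)
{
    size_t i;

    if (child->parent_ds_id == NULL)
        return NULL;

    for (i = 0; i < n; i++) {
        if (hosts[i].ds_id != NULL &&
            strcmp(hosts[i].ds_id, child->parent_ds_id) == 0)
            return &hosts[i];
    }
    return NULL;
}
```

A frontend could call find_parent() for every VM it knows about to group VMs under their hypervisor; a linear scan is only reasonable for small host lists.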
Re: [Ganglia-developers] $conf array
I wrote a script to read in a conf.php file and convert it to use a $conf array. I don't think we have a place for utility scripts like this right now. Where should it go?

https://github.com/alexdean/ganglia-stuff/blob/master/reformat_conf.php

After you use this tool, but before all code is updated to use $conf, we can maintain backwards-compatibility with existing code with something like this:

https://gist.github.com/843349

I've taken a quick tour through my local ganglia using a converted conf.php and not seen any obvious problems, but of course the non-obvious ones are probably there somewhere.

alex

On Feb 24, 2011, at 7:01 AM, Jesse Becker wrote:
> +1 to that too
>
> On Wed, Feb 23, 2011 at 23:49, Bernard Li bern...@vanhpc.org wrote:
>> +1 from me as well. I guess we should probably check it into both monitor-web-2.0 and trunk.
>>
>> Cheers,
>> Bernard
>>
>> On Wed, Feb 23, 2011 at 7:35 PM, Jesse Becker haw...@gmail.com wrote:
>>> +1
>>>
>>> On Wed, Feb 23, 2011 at 21:27, Alex Dean a...@crackpot.org wrote:
>>>> One of my gripes with the current PHP frontend code is how hard it can be to recall which variables are configuration, which come from user input, and which are just local variables.
>>>>
>>>> As one step toward fixing this issue, I think it would be nice to place all configuration values (mainly in conf.php currently) into a $conf array. The benefit is that it's immediately clear in any code which uses these values that you're dealing with configuration values. There's no danger of name collisions with other variables.
>>>>
>>>> I'm just wondering if others feel the same way, and would support a change like this. It's pretty straightforward to do, but would obviously touch a lot of different code. Before I go ahead with making all those changes, I guess I'd just like to know if there are any huge objections out there to this idea.
>>>>
>>>> Take a look, let me know what you think. I'd like to do something similar for user input as well, maybe $user?
>>>>
>>>> http://pastie.org/1600587
>>>>
>>>> thanks,
>>>> alex
>
> --
> Jesse Becker
Re: [Ganglia-developers] gengetopt and SYSCONFDIR
This should be a better patch against current trunk, which does not add gengetopt as a build dependency (unless you need to modify cmdline.sh and re-generate the files):

Index: gmetric/Makefile.am
===================================================================
--- gmetric/Makefile.am (revision 2490)
+++ gmetric/Makefile.am (working copy)
@@ -1,3 +1,5 @@
+include $(top_srcdir)/ganglia.inc
+
 if STATIC_BUILD
 GCFLAGS =
 GLDADD =
@@ -8,13 +10,19 @@
 GLDFLAGS =
 endif
 
-AM_CFLAGS = -I../lib -I../include $(GCFLAGS) -DSYSCONFDIR='"$(sysconfdir)"'
+AM_CFLAGS = -I../lib -I../include $(GCFLAGS)
 
 bin_PROGRAMS = gmetric
-gmetric_SOURCES = gmetric.c cmdline.c cmdline.h
+
+cmdline.c: cmdline.c.in $(FIXCONFIG)
+	$(FIXCONFIG) cmdline.c.in
+
+gmetric_SOURCES = gmetric.c cmdline.c.in cmdline.c cmdline.h
 gmetric_LDADD = $(top_builddir)/lib/libganglia.la \
                 $(top_builddir)/lib/libgetopthelper.a \
                 $(top_builddir)/libmetrics/libmetrics.la \
                 $(GLDADD)
 
 gmetric_LDFLAGS = $(GLDFLAGS)
+
+CLEANFILES = cmdline.c
Index: gmetric/cmdline.sh
===================================================================
--- gmetric/cmdline.sh (revision 2490)
+++ gmetric/cmdline.sh (working copy)
@@ -5,7 +5,7 @@
 purpose "The Ganglia Metric Client (gmetric) announces a metric
 on the list of defined send channels defined in a configuration file"
 
-option "conf" c "The configuration file to use for finding send channels" string default="/etc/ganglia/gmond.conf" no
+option "conf" c "The configuration file to use for finding send channels" string default="@sysconfdir@/gmond.conf" no
 option "name" n "Name of the metric" string no
 option "value" v "Value of the metric" string no
 option "type" t "Either string|int8|uint8|int16|uint16|int32|uint32|float|double" string no
@@ -13,6 +13,9 @@
 option "slope" s "Either zero|positive|negative|both" string default="both" no
 option "tmax" x "The maximum time in seconds between gmetric calls" int default="60" no
 option "dmax" d "The lifetime in seconds of this metric" int default="0" no
+option "group" g "Group of the metric" string no
+option "desc" D "Description of the metric" string no
+option "title" T "Title of the metric" string no
 option "spoof" S "IP address and name of host/device (colon separated) we are spoofing" string default="" no
 option "heartbeat" H "spoof a heartbeat message (use with spoof option)" no

Notes:

- gengetopt is called via:

      gengetopt --c-extension=c.in --input cmdline.sh

  which generates cmdline.c.in; this will then have @sysconfdir@ replaced by the correct value to generate cmdline.c
- cmdline.c is included in the distribution tarball and will have a bogus sysconfdir (usually /usr/local/etc), but it will get replaced when the user does ./configure
- make clean would delete cmdline.c

If this looks good to everyone, I will check this into trunk and update the backport proposal for adding groups/desc/title to gmetric for the 3.1 branch.

Thanks,
Bernard

On Wed, Sep 1, 2010 at 11:56 PM, Bernard Li bern...@vanhpc.org wrote:
> Hi all:
>
> I'm trying to get the backport proposal for adding group, description and title metadata tags to gmetric approved, and am in the process of fixing the auto-generated files that were patched in the changesets outlined below:
>
> http://sourceforge.net/apps/trac/ganglia/browser/branches/monitor-core-3.1/STATUS#L209
>
> In the process, I found that additional auto-generated files have been patched:
>
> http://sourceforge.net/apps/trac/ganglia/changeset/2021/
>
> In order to fix that, I'll need to modify cmdline.sh and add SYSCONFDIR inside the default clause. Try as I may, it doesn't appear that I can include double quotes, and escaping with \ has the generated cmdline.c showing up as \" SYSCONFDIR \", which doesn't work.
>
> To get around the problem, I propose that we delete cmdline.c and cmdline.h from the gmond, gmetric and gmetad subdirectories, move cmdline.sh -> cmdline.sh.in, and update the Makefile.am targets such that fixconfig is called on cmdline.sh.in to generate cmdline.sh. cmdline.c and cmdline.h will need to be generated on the fly (so gengetopt will be an additional build dependency).
>
> The patch will look something like this:
>
> Index: gmetric/Makefile.am
> ===================================================================
> --- gmetric/Makefile.am (revision 2322)
> +++ gmetric/Makefile.am (working copy)
> @@ -1,3 +1,5 @@
> +include $(top_srcdir)/ganglia.inc
> +
>  if STATIC_BUILD
>  GCFLAGS =
>  GLDADD =
> @@ -8,9 +10,16 @@
>  GLDFLAGS =
>  endif
>  
> -AM_CFLAGS = -I../lib -I../include $(GCFLAGS) -DSYSCONFDIR='"$(sysconfdir)"'
> +AM_CFLAGS = -I../lib -I../include $(GCFLAGS)
>  
>  bin_PROGRAMS = gmetric
> +
> +cmdline.sh: cmdline.sh.in $(FIXCONFIG)
> +	$(FIXCONFIG) cmdline.sh.in
> +
> +cmdline.c cmdline.h: cmdline.sh
> +	gengetopt --input ./cmdline.sh
> +
>  gmetric_SOURCES = gmetric.c cmdline.c cmdline.h
>  gmetric_LDADD = $(top_builddir)/lib/libganglia.la \
>                  $(top_builddir)/lib/libgetopthelper.a \
> Index: