Re: [Ganglia-developers] Dynamically resizable buffer for slurpfile()

2011-02-24 Thread Carlo Marcelo Arenas Belon
On Wed, Feb 23, 2011 at 05:12:03PM -0800, Bernard Li wrote:
 
  I tested under EL5 and EL6 and it wasn't able to get past the initial
  buffer size.  I believe what I did was:
 
 Correction.  It works on EL6, but not on EL5:

most likely the test is just giving inconsistent results, and that is why
it now works in EL5, while it didn't before.

 [CentOS 5.5 x86_64 with kernel 2.6.18-194.32.1.el5]
 
 read(3, "2.6.18-194.32.1.", 16) = 16
 read(3, "", 16) = 0
 
 [RHEL6b2 x86_64 with kernel 2.6.32-37.el6.x86_64]
 
 read(3, "2.6.32-37.el6.x8", 16) = 16
 read(3, "6_64\n", 16)   = 5
 
 The issue may be specific to files in /proc/sys, because I tried
 reading /proc/stat on CentOS 5.5 and it worked fine.

very unlikely, and considering that this is a code modification you
made and that only you have, the problem is most likely in your code anyway
(maybe it was even miscompiled)

Carlo

--
Free Software Download: Index, Search  Analyze Logs and other IT data in 
Real-Time with Splunk. Collect, index and harness all the fast moving IT data 
generated by your applications, servers and devices whether physical, virtual
or in the cloud. Deliver compliance at lower cost and gain new business 
insights. http://p.sf.net/sfu/splunk-dev2dev 
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] Dynamically resizable buffer for slurpfile()

2011-02-24 Thread Carlo Marcelo Arenas Belon
On Wed, Feb 23, 2011 at 09:42:56AM -0800, Bernard Li wrote:
 
  what second pass?
 
    dummy = proc_sys_kernel_osrelease;
    rval.int32 = slurpfile("/proc/sys/kernel/osrelease", dummy,
                           MAX_G_STRING_SIZE);
 
  why would anyone call slurpfile in a loop anyway?, and slurpfile
  doesn't call itself recursively but just reads as much data as it
  can into the buffer provided (second parameter).
 
 Sorry I wasn't clear, I meant the goto read loop:
 
   read:
      read_len = read(fd, db, buflen);
      if (read_len <= 0)
         {
            if (errno == EINTR)
               goto read;
            err_ret("slurpfile() read() error on file %s", filename);
            close(fd);
            return SYNAPSE_FAILURE;
         }

this code is not relevant as it is only called when EINTR is received
because a signal interrupts the read call (very unlikely)

the second conditional after that code is used to continue reading the
buffer after it is resized if that is possible and that works fine as
shown by your tests

    if (read_len == buflen)
       {
          if (dynamic) {
             dynamic += buflen;
             db = realloc(*buffer, dynamic);
             *buffer = db;
             db = *buffer + dynamic - buflen;
             goto read;
          } else {
             --read_len;
             err_msg("slurpfile() read() buffer overflow on file %s", filename);
          }
       }

 When I straced the process, the first read() was able to read up to
 MAX_G_STRING, however, the second read() returns 0.  However, if I
 read a regular file (not in /proc filesystem), it was able to read the
 rest of the string in the second pass just fine.

this just sounds too strange, but I was able to replicate it after a lot of
guessing in a CentOS 5 VM (both 32bit and 64bit) as shown by:

# strace -e read dd if=/proc/sys/kernel/osrelease bs=16 > /dev/null
read(0, "2.6.18-164.9.1.e", 16) = 16
read(0, "", 16) = 0

so not a ganglia problem, and just a problem with the way you were trying
to use slurpfile and the way that specific sysctl handler is implemented
in that version of the kernel.

makes sense anyway to not worry about partial reads from a value that is
meant to be used whole anyway, but interestingly enough and as you reported
later it is no longer working that way with newer kernels.
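
As a rough illustration of the behaviour discussed above, here is a minimal C
sketch of a chunked read loop that grows its buffer until read() returns 0.
It is not the actual slurpfile() code: the function name read_all_chunked and
its parameters are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not the real slurpfile()): reads fd until read()
 * returns 0, using chunk-sized read() calls and growing the buffer as needed.
 * Returns a NUL-terminated malloc'd buffer (caller frees), or NULL on error. */
char *read_all_chunked(int fd, size_t chunk, size_t *out_len)
{
    assert(chunk > 0);

    size_t cap = chunk + 1;   /* one spare byte for the terminator */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        if (cap - len - 1 < chunk) {
            /* Grow by one chunk; keep the old pointer until realloc succeeds. */
            char *tmp = realloc(buf, cap + chunk);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap += chunk;
        }

        ssize_t n = read(fd, buf + len, chunk);
        if (n < 0) {
            if (errno == EINTR)
                continue;     /* interrupted by a signal: retry */
            free(buf);
            return NULL;
        }
        if (n == 0)
            break;            /* EOF, or a handler that returns nothing more */
        len += (size_t)n;
    }

    buf[len] = '\0';
    if (out_len != NULL)
        *out_len = len;
    return buf;
}
```

A loop like this cannot recover data from a handler that returns 0 on the
second read(), as in the CentOS 5 strace output above; there the value has to
fit in the first read.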

 Regarding this particular bug -- how should we fix this?  There are
 currently two issues:
 
 1) The OS release is truncated in the web frontend

and that is to protect the gmond process against crashes

 2) The warning "slurpfile() read() buffer overflow on file
 /proc/sys/kernel/osrelease" is displayed multiple times during RPM
 installation (possibly because gmond was called to generate conf files
 etc.)

that was meant to be mostly informative, but the message might need to
be reworked to be more effective.

 Can we potentially increase MAX_G_STRING or have
 proc_sys_kernel_osrelease buffer size resize dynamically?

no

Carlo



Re: [Ganglia-developers] $conf array

2011-02-24 Thread Jesse Becker
+1 to that too

On Wed, Feb 23, 2011 at 23:49, Bernard Li bern...@vanhpc.org wrote:
 +1 from me as well.

 I guess we should probably check it into both monitor-web-2.0 and trunk.

 Cheers,

 Bernard

 On Wed, Feb 23, 2011 at 7:35 PM, Jesse Becker haw...@gmail.com wrote:
 +1

 On Wed, Feb 23, 2011 at 21:27, Alex Dean a...@crackpot.org wrote:
 One of my gripes with the current PHP frontend code is how hard it can be 
 to recall which variables are configuration, which come from user 
 input, and which are just local variables.  As one step toward fixing this 
 issue, I think it would be nice to place all configuration values (mainly 
 in conf.php currently) into a $conf array.  The benefit is that it's 
 immediately clear in any code which uses these values that you're dealing 
 with configuration values.  There's no danger of name collisions with other 
 variables.

 I'm just wondering if others feel the same way, and would support a change 
 like this.  It's pretty straightforward to do, but would obviously touch a 
 lot of different code.  Before I go ahead with making all those changes, I 
 guess I'd just like to know if there are any huge objections out there to 
 this idea.  Take a look, let me know what you think.  I'd like to do 
 something similar for user input as well, maybe $user?

 http://pastie.org/1600587

 thanks,
 alex







 --
 Jesse Becker






-- 
Jesse Becker



Re: [Ganglia-developers] hsflowd for Windows + Ganglia webfrontend

2011-02-24 Thread Bernard Li
Hi Neil:

I finally had a chance to test out the patch.  Didn't run into any
major issue on my end, so +1 from me.

On Thu, Feb 17, 2011 at 2:22 PM, Neil McKee neil.mc...@inmon.com wrote:

 There are three metrics to draw particular attention to:

 1. System UUID

I noticed that on my Windows host, UUID is
----.  I guess UUID generation on
Windows is not supported yet?  I tested with hsflowd 1.12 on Windows
XP.

 The Datasource ID and Parent Datasource ID can be treated as opaque 
 strings that the UI could use to capture and represent the containment 
 hierarchy.

Perhaps you could explain a bit about the format of Datasource ID?

I've granted you SVN access, so please feel free to check the code
into trunk.  But perhaps Brad might want to review the code quickly
before you do so :-)

Can you also modify the manpage for gmond.conf plus add it to the
default configuration?  I'm okay with accept_all_physical = yes as
the default.

BTW, are you interested in implementing UUID for gmond?  We've been
talking about using UUID instead of hostname/IP as host identifiers
because those things could change, so I think this would be a great
feature to be added to our code base.

Thanks,

Bernard



Re: [Ganglia-developers] hsflowd for Windows + Ganglia webfrontend

2011-02-24 Thread Neil McKee

On Feb 24, 2011, at 3:07 PM, Bernard Li wrote:

 Hi Neil:
 
 I finally had a chance to test out the patch.  Didn't run into any
 major issue on my end, so +1 from me.
 
 On Thu, Feb 17, 2011 at 2:22 PM, Neil McKee neil.mc...@inmon.com wrote:
 
 There are three metrics to draw particular attention to:
 
 1. System UUID
 
 I noticed that on my Windows host, UUID is
 ----.  I guess UUID generation on
 Windows is not supported yet?  I tested with hsflowd 1.12 on Windows
 XP.
 

The Windows UUID appears for me, but that's from a Windows 7 OS.  Maybe we
would need to look in a different place on XP.  (The Linux port falls back on
the UUID of the first local disk if it can't get a UUID for the whole system,
so maybe something similar would be acceptable as a workaround.)
Please raise this question on host-sflow-discuss.


 The Datasource ID and Parent Datasource ID can be treated as opaque 
 strings that the UI could use to capture and represent the containment 
 hierarchy.
 
 Perhaps you could explain a bit about the format of Datasource ID?

The specs on sflow.org cover this, but basically it consists of
{IPAddress,dsClass,dsIndex} where the IPAddress can be a v4 or v6 address, and
the dsClass tells you if the dsIndex is referring to an interface, a physical
entity or a logical entity (such as a VM or application).  Conceptually I
think of each datasource as being one observation point in the system.  From
ganglia's perspective it's probably best to treat it as an opaque string, and
just use it to know, for example, that a particular VM is running on a
particular hypervisor.
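
To make the {IPAddress,dsClass,dsIndex} idea concrete, here is a small C
sketch of flattening such a tuple into an opaque key and comparing keys. All
of the names, the enum values and the "ip/class:index" key layout are
assumptions made up for this example; they are not the sFlow encoding or any
existing ganglia or hsflowd API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative classes only; the numeric values are NOT the sFlow encoding. */
enum ds_class { DS_INTERFACE, DS_PHYSICAL, DS_LOGICAL };

struct datasource_id {
    char agent_ip[46];        /* textual v4 or v6 address */
    enum ds_class cls;        /* what kind of thing the index refers to */
    unsigned int ds_index;    /* index within that class */
};

static const char *ds_class_name(enum ds_class c)
{
    switch (c) {
    case DS_INTERFACE: return "interface";
    case DS_PHYSICAL:  return "physical";
    case DS_LOGICAL:   return "logical";
    }
    return "unknown";
}

/* Writes a hypothetical "ip/class:index" key into out.
 * Returns 0 on success, -1 if the key did not fit. */
int datasource_key(const struct datasource_id *ds, char *out, size_t outlen)
{
    assert(ds != NULL && out != NULL);
    int n = snprintf(out, outlen, "%s/%s:%u",
                     ds->agent_ip, ds_class_name(ds->cls), ds->ds_index);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}

/* A consumer only ever compares keys; it never parses them. */
int same_datasource(const char *key_a, const char *key_b)
{
    return strcmp(key_a, key_b) == 0;
}
```

With something like this, a frontend could store a VM's parent key next to
its own key and group VMs by hypervisor using same_datasource() alone.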

 
 I've granted you SVN access, so please feel free to check the code
 into trunk.  But perhaps Brad might want to review the code quickly
 before you do so :-)

OK.  I'll wait for Brad to comment.

 
 Can you also modify the manpage for gmond.conf plus add it to the
 default configuration?  I'm okay with accept_all_physical = yes as
 the default.
 

OK.

 BTW, are you interested in implementing UUID for gmond?  We've been
 talking about using UUID instead of hostname/IP as host identifiers
 because those things could change, so I think this would be a great
 feature to be added to our code base.
 

I really don't know my way around gmond,  and that sounds like it might be a 
far-reaching change.

Neil



 Thanks,
 
 Bernard




Re: [Ganglia-developers] $conf array

2011-02-24 Thread Alex Dean
I wrote a script to read in a conf.php file and convert it to use a $conf
array.  I don't think we have a place for utility scripts like this right now.
Where should it go?
https://github.com/alexdean/ganglia-stuff/blob/master/reformat_conf.php

After you use this tool, but before all code is updated to use $conf, we can 
maintain backwards-compatibility with existing code with something like this:
https://gist.github.com/843349

I've taken a quick tour through my local ganglia using a converted conf.php and 
not seen any obvious problems, but of course the non-obvious ones are probably 
there somewhere.

alex



Re: [Ganglia-developers] gengetopt and SYSCONFDIR

2011-02-24 Thread Bernard Li
This should be a better patch against current trunk, which does not
add gengetopt as a build dependency (unless you need to modify
cmdline.sh and re-generate the files):

Index: gmetric/Makefile.am
===================================================================
--- gmetric/Makefile.am (revision 2490)
+++ gmetric/Makefile.am (working copy)
@@ -1,3 +1,5 @@
+include $(top_srcdir)/ganglia.inc
+
 if STATIC_BUILD
 GCFLAGS =
 GLDADD =
@@ -8,13 +10,19 @@
 GLDFLAGS =
 endif

-AM_CFLAGS = -I../lib -I../include $(GCFLAGS) -DSYSCONFDIR='"$(sysconfdir)"'
+AM_CFLAGS = -I../lib -I../include $(GCFLAGS)

 bin_PROGRAMS = gmetric
-gmetric_SOURCES =  gmetric.c cmdline.c cmdline.h
+
+cmdline.c: cmdline.c.in $(FIXCONFIG)
+   $(FIXCONFIG) cmdline.c.in
+
+gmetric_SOURCES =  gmetric.c cmdline.c.in cmdline.c cmdline.h
 gmetric_LDADD   =  $(top_builddir)/lib/libganglia.la \
                    $(top_builddir)/lib/libgetopthelper.a \
                    $(top_builddir)/libmetrics/libmetrics.la \
                    $(GLDADD)

 gmetric_LDFLAGS = $(GLDFLAGS)
+
+CLEANFILES = cmdline.c
Index: gmetric/cmdline.sh
===================================================================
--- gmetric/cmdline.sh  (revision 2490)
+++ gmetric/cmdline.sh  (working copy)
@@ -5,7 +5,7 @@
 purpose "The Ganglia Metric Client (gmetric) announces a metric
 on the list of defined send channels defined in a configuration file"

-option "conf" c "The configuration file to use for finding send channels" string default="/etc/ganglia/gmond.conf" no
+option "conf" c "The configuration file to use for finding send channels" string default="@sysconfdir@/gmond.conf" no
 option "name" n "Name of the metric" string no
 option "value" v "Value of the metric" string no
 option "type" t "Either string|int8|uint8|int16|uint16|int32|uint32|float|double" string no
@@ -13,6 +13,9 @@
 option "slope" s "Either zero|positive|negative|both" string default="both"  no
 option "tmax" x "The maximum time in seconds between gmetric calls" int default="60" no
 option "dmax" d "The lifetime in seconds of this metric" int default="0" no
+option "group" g "Group of the metric" string no
+option "desc" D "Description of the metric" string no
+option "title" T "Title of the metric" string no
 option "spoof" S "IP address and name of host/device (colon separated) we are spoofing" string default="" no
 option "heartbeat" H "spoof a heartbeat message (use with spoof option)" no

Notes:
- gengetopt is called via "gengetopt --c-extension=c.in --input
cmdline.sh", which generates cmdline.c.in; @sysconfdir@ in that file is
then replaced by the correct value to generate cmdline.c
- cmdline.c is included in the distribution tarball and will have a
bogus sysconfdir (usually /usr/local/etc), but it will get replaced
when the user runs ./configure
- make clean will delete cmdline.c

If this looks good to everyone, I will check this into trunk and
update the backport proposal for adding groups/desc/title to gmetric
for 3.1 branch.

Thanks,

Bernard

On Wed, Sep 1, 2010 at 11:56 PM, Bernard Li bern...@vanhpc.org wrote:
 Hi all:

 I'm trying to get the backport proposal for adding group, description
 and title metadata tags to gmetric approved and am in the process of
 fixing the auto-generated files that were patched in the changesets
 outlined below:

 http://sourceforge.net/apps/trac/ganglia/browser/branches/monitor-core-3.1/STATUS#L209

 In the process, I found that additional auto-generated files have been 
 patched:

 http://sourceforge.net/apps/trac/ganglia/changeset/2021/

 In order to fix that, I'll need to modify cmdline.sh and add
 " SYSCONFDIR " inside the default clause.  Try as I might, it doesn't
 appear that I can include double quotes, and escaping with \" has the
 generated cmdline.c showing up as \" SYSCONFDIR \", which doesn't
 work.
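
 For context, the trick being attempted relies on C joining adjacent string
 literals at compile time, so that a SYSCONFDIR macro passed in by the build
 can be spliced into a default path. A minimal sketch follows; the fallback
 value and the names default_conf and default_conf_path are assumptions for
 illustration, not what the generated cmdline.c contains.

```c
#include <assert.h>
#include <string.h>

/* Normally supplied by the build, e.g. -DSYSCONFDIR='"/etc/ganglia"'.
 * The fallback below is an assumption used only for this example. */
#ifndef SYSCONFDIR
#define SYSCONFDIR "/etc/ganglia"
#endif

/* Adjacent string literals are concatenated by the compiler, so this
 * expands to a single literal such as "/etc/ganglia/gmond.conf". */
static const char default_conf[] = "" SYSCONFDIR "/gmond.conf";

/* Hypothetical accessor for the compiled-in default path. */
const char *default_conf_path(void)
{
    assert(strlen(default_conf) > strlen("/gmond.conf"));
    return default_conf;
}
```

 If the quotes around SYSCONFDIR arrive escaped, the macro name stays inside
 one literal and is never expanded, which matches the failure described above.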

 To get around the problem, I propose that we delete cmdline.c and
 cmdline.h from the gmond, gmetric and gmetad subdirectories, move
 cmdline.sh -> cmdline.sh.in, and update the Makefile.am targets such that
 fixconfig is called on cmdline.sh.in to generate cmdline.sh.
 cmdline.c and cmdline.h will need to be generated on the fly (so
 gengetopt will be an additional build dependency).

 The patch will look something like this:

 Index: gmetric/Makefile.am
 ===================================================================
 --- gmetric/Makefile.am (revision 2322)
 +++ gmetric/Makefile.am (working copy)
 @@ -1,3 +1,5 @@
 +include $(top_srcdir)/ganglia.inc
 +
  if STATIC_BUILD
  GCFLAGS =
  GLDADD =
 @@ -8,9 +10,16 @@
  GLDFLAGS =
  endif

 -AM_CFLAGS = -I../lib -I../include $(GCFLAGS) -DSYSCONFDIR='"$(sysconfdir)"'
 +AM_CFLAGS = -I../lib -I../include $(GCFLAGS)

  bin_PROGRAMS = gmetric
 +
 +cmdline.sh: cmdline.sh.in $(FIXCONFIG)
 +       $(FIXCONFIG) cmdline.sh.in
 +
 +cmdline.c cmdline.h: cmdline.sh
 +       gengetopt --input ./cmdline.sh
 +
  gmetric_SOURCES =  gmetric.c cmdline.c cmdline.h
  gmetric_LDADD   =  $(top_builddir)/lib/libganglia.la \
                    $(top_builddir)/lib/libgetopthelper.a \
 Index: