Here is the patch diff in SVN:

http://ganglia.svn.sourceforge.net/viewvc/ganglia/trunk/monitor-core/libmetrics/linux/metrics.c?r1=860&r2=933


I haven't looked at any of the platforms other than Linux.  Do we have the 
same problem there?


Brad

>>> On 2/14/2008 at 6:29 AM, in message
<[EMAIL PROTECTED]>, Martin Knoblauch
<[EMAIL PROTECTED]> wrote:
> Hi,
> 
>  Maybe the attached patch (based on 3.0.4) can fix the leak. The daemon runs and 
> reports metrics. It is, of course, too early to say whether the leak is gone.
> 
>  When looking at the Linux metrics file, I just realized how much code 
> duplication there is. Basically, all function groups that grok the same
> /proc/xxx files should be rewritten to use common code. This is true for 
> cpu, load and network. Maybe others.
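
To illustrate the "common code" idea for the network metrics, here is a rough
sketch of one shared parser for a /proc/net/dev line that fills in all four
counters at once. This is not the Ganglia code: net_counters and
parse_net_dev_line are names made up for this example. It copies the device
name into a fixed buffer, so there is no heap allocation to forget about.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record holding the counters the four network metrics need. */
typedef struct {
    char name[32];
    unsigned long long rx_bytes, rx_packets, tx_bytes, tx_packets;
} net_counters;

/* Parse one data line of /proc/net/dev ("  eth0: <16 counters>").
 * Returns 0 on success, -1 for header or malformed lines. */
static int parse_net_dev_line(const char *line, net_counters *out)
{
    assert(line != NULL && out != NULL);

    const char *colon = strchr(line, ':');
    if (colon == NULL)
        return -1;                      /* the two header lines have no colon */

    while (*line == ' ')
        line++;                         /* skip the leading padding */

    size_t n = (size_t)(colon - line);
    if (n == 0 || n >= sizeof out->name)
        return -1;
    memcpy(out->name, line, n);         /* fixed buffer, nothing to free */
    out->name[n] = '\0';

    /* Receive fields 1-2 are bytes and packets; transmit bytes and packets
     * follow after the six remaining receive fields. */
    int got = sscanf(colon + 1,
                     "%llu %llu %*s %*s %*s %*s %*s %*s %llu %llu",
                     &out->rx_bytes, &out->rx_packets,
                     &out->tx_bytes, &out->tx_packets);
    return got == 4 ? 0 : -1;
}
```

Each of the four metric functions could then call this one helper and pick the
field it needs, instead of repeating the parsing.
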
> 
> Cheers
> Martin
> ------------------------------------------------------
> Martin Knoblauch
> email: k n o b i AT knobisoft DOT de
> www:   http://www.knobisoft.de 
> 
> ----- Original Message ----
>> From: Martin Knoblauch <[EMAIL PROTECTED]>
>> To: Kumar Vaibhav <[EMAIL PROTECTED]>; Carlo Marcelo Arenas Belon 
> <[EMAIL PROTECTED]>
>> Cc: [email protected] 
>> Sent: Thursday, February 14, 2008 11:36:37 AM
>> Subject: Re: [Ganglia-developers] Memory leak in gmond
>> 
>> Hi,
>> 
>>  After looking at one of my employer's customer installations, it definitely
>> seems that metrics-collecting/non-mute "gmond"s are growing (substantially)
>> over time. Pure listeners seem to be unaffected.
>> 
>>  If I remember correctly, Kumar's valgrind traces found that "strndup" might
>> allocate memory that is later leaked. Looking at the 3.0.4
>> libmetrics/linux/metrics.c, I have the strong feeling that all four network
>> functions are careless about the memory allocated by strndup:
>> 
>> 217:           char *devname, *src;
>> 228:           devname = strndup(src, n);
>> 238:                 net_dev_stats *ns = hash_lookup(devname, 1,
>> 
>> 305:           char *devname, *src;
>> 316:           devname = strndup(src, n);
>> 326:                 net_dev_stats *ns = hash_lookup(devname, 1,
>> 
>> 393:           char *devname, *src;
>> 404:           devname = strndup(src, n);
>> 414:                 net_dev_stats *ns = hash_lookup(devname, 1,
>> 
>> 481:           char *devname, *src;
>> 492:           devname = strndup(src, n);
>> 502:                 net_dev_stats *ns = hash_lookup(devname, 1,
>> 
>> 
>>  Have to look at it some more.
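
For readers following along, the pattern described above can be sketched as
follows. This is only an illustration, not the actual patch: dev_stats,
lookup_stats and update_from_line are hypothetical stand-ins, and the real
hash_lookup call is only partially visible in the excerpt. The sketch assumes
the lookup copies the name and does not keep the caller's pointer, so the
caller has to free what strndup returned.

```c
#define _POSIX_C_SOURCE 200809L         /* for strndup */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-device statistics table. */
typedef struct { char name[32]; unsigned long rbytes; } dev_stats;

static dev_stats table[8];
static int table_len = 0;

/* Hypothetical lookup-or-create. It copies the name into the entry,
 * so ownership of `name` stays with the caller (an assumption). */
static dev_stats *lookup_stats(const char *name)
{
    for (int i = 0; i < table_len; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    if (table_len == 8)
        return NULL;
    dev_stats *ns = &table[table_len++];
    snprintf(ns->name, sizeof ns->name, "%s", name);
    ns->rbytes = 0;
    return ns;
}

/* Record the first counter of a "  eth0: 1234 ..." line.
 * Returns 0 on success, -1 on a malformed line or a full table. */
static int update_from_line(const char *line)
{
    assert(line != NULL);

    const char *src = line;
    while (*src == ' ')
        src++;
    const char *colon = strchr(src, ':');
    if (colon == NULL)
        return -1;

    char *devname = strndup(src, (size_t)(colon - src));  /* heap copy */
    if (devname == NULL)
        return -1;

    dev_stats *ns = lookup_stats(devname);
    free(devname);      /* without this, every call leaks one small block */

    if (ns == NULL)
        return -1;
    ns->rbytes = strtoul(colon + 1, NULL, 10);
    return 0;
}
```

If the free() is left out, a collector that runs this once per device on every
polling interval loses a few bytes each time, which would fit the slow growth
seen on the metrics-collecting gmonds.
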
>> 
>> Cheers
>>  Martin
>> ------------------------------------------------------
>> Martin Knoblauch
>> email: k n o b i AT knobisoft DOT de
>> www:   http://www.knobisoft.de 
>> 
>> ----- Original Message ----
>> > From: Kumar Vaibhav 
>> > To: Carlo Marcelo Arenas Belon 
>> > Cc: [email protected] 
>> > Sent: Saturday, February 9, 2008 8:59:18 AM
>> > Subject: Re: [Ganglia-developers] Memory leak in gmond
>> > 
>> > Carlo Marcelo Arenas Belon wrote:
>> > > On Tue, Jan 22, 2008 at 04:17:07PM +0530, Kumar Vaibhav wrote:
>> > >> I am using ganglia-3.0.5 on a Woodcrest processor cluster, and I see 
>> > >> that after running for weeks the memory consumption of the gmond 
>> > >> process is about 400 MB.
>> > > 
>> > > Did you check what the size was 1 hour after all gmond processes in your
>> > > cluster were started? If you are using multicast and have a large number
>> > > of nodes/metrics, then that is most likely the amount of memory needed to
>> > > hold all those metrics from all nodes.
>> > I checked it. The memory size increases with time. I tried ps -eo 
>> > cmd,rss and can see the size of gmond increase with time.
>> > > 
>> > >> ==2381== LEAK SUMMARY:
>> > >> ==2381==    definitely lost: 69 bytes in 16 blocks.
>> > >> ==2381==      possibly lost: 0 bytes in 0 blocks.
>> > > 
>> > > that means there is no memory leak (except for 69 bytes)
>> > This is because I had run it for only a few minutes.
>> > > 
>> > >> ==2381==    still reachable: 1,446,276 bytes in 1,463 blocks.
>> > > 
>> > > that is the RSS of your process
>> > by memory I mean RSS only.
>> > 
>> > 
>> > Here are some new tests I have done.
>> > 
>> > I isolated two nodes of the cluster by changing their multicast address. 
>> > On one I ran gmond in mute mode and on the other in deaf mode. The RSS of 
>> > gmond on the deaf node continues to increase, but the RSS of gmond on the 
>> > mute node stabilises after some time and didn't increase for a week.
>> > 
>> > Hope this will help you to solve the problem.
>> > > 
>> > > Carlo
>> > 
>> > Vaibhav
>> > 
>> > -------------------------------------------------------------------------
>> > This SF.net email is sponsored by: Microsoft
>> > Defy all challenges. Microsoft(R) Visual Studio 2008.
>> > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ 
>> > _______________________________________________
>> > Ganglia-developers mailing list
>> > [email protected] 
>> > https://lists.sourceforge.net/lists/listinfo/ganglia-developers 
>> > 
>> > 
>> 
>> 
>> 
>> 
>> 


