David,
After looking at CALC_NETSTAT, I see no *type* problems here:
#define CALC_NETSTAT(type) (double) \
  ((cur_ninfo->type<last_ninfo->type)? \
  -1:(cur_ninfo->type - last_ninfo->type)/timediff)
cur_ninfo->type and last_ninfo->type are of the same type, so the
macro just returns a double: either -1 or a non-negative rate.
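
To illustrate, here is a minimal sketch of the same logic written as a C
function. It is not the code from metrics.c: calc_rate and
calc_rate_unguarded are hypothetical names, uint64_t stands in for
u_longlong_t, and timediff is assumed to be a double. It shows what the
comparison protects against: without it, a counter that goes backwards
makes the unsigned subtraction wrap modulo 2^64.

```c
#include <assert.h>
#include <stdint.h>

/* Same logic as CALC_NETSTAT for a single counter.
 * Assumption: timediff is a positive number of seconds held in a double. */
static double calc_rate(uint64_t cur, uint64_t last, double timediff)
{
    assert(timediff > 0.0);
    if (cur < last)
        return -1.0;                        /* counter went backwards */
    return (double)(cur - last) / timediff; /* non-negative rate */
}

/* The same subtraction without the guard, for comparison only.
 * If cur < last, (cur - last) wraps modulo 2^64 and the "rate" is huge. */
static double calc_rate_unguarded(uint64_t cur, uint64_t last, double timediff)
{
    assert(timediff > 0.0);
    return (double)(cur - last) / timediff;
}
```

With cur=400, last=1000 and timediff=60, calc_rate returns -1, while the
unguarded version returns a value on the order of 1e17.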
It would be interesting to see the values of cur_ninfo->type,
last_ninfo->type and timediff when you observe the petabyte
performance. Can you add some debug statements around lines 873-876?
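
Something along these lines would do as a debug statement. This is only a
sketch: format_netstat_debug is a hypothetical helper, not part of Ganglia,
uint64_t stands in for u_longlong_t, and timediff is assumed to be a double.
It formats the three values into a buffer so they can be written to stderr
or a log file.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format one counter pair and the time difference.
 * Returns the number of characters snprintf would have written
 * (excluding the terminating NUL), as snprintf itself does. */
static int format_netstat_debug(char *buf, size_t len, const char *field,
                                uint64_t cur, uint64_t last, double timediff)
{
    assert(buf != NULL && field != NULL);
    return snprintf(buf, len,
                    "%s: cur=%" PRIu64 " last=%" PRIu64 " timediff=%.3f\n",
                    field, cur, last, timediff);
}
```

It could be called once per field, for example
format_netstat_debug(buf, sizeof buf, "ibytes", cur_ninfo->ibytes,
last_ninfo->ibytes, timediff), followed by fputs(buf, stderr).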
Cheers
Martin
--- David Wong <[EMAIL PROTECTED]> wrote:
> I don't write much code nowadays, so I'm going to need a lot of help
> with this.
>
> I dug through the ganglia code and I found this interesting tidbit in
> libmetrics/aix/metrics.c which may be indicative of the problem.
>
> There's an assignment from cur_ninfo->ibytes to cur_net_stat.ibytes, but
> the types of the two variables are different.
>
> net_stat::ibytes is a double:
>
> struct net_stat{
> double ipackets;
> double opackets;
> double ibytes;
> double obytes;
> } cur_net_stat;
>
> and we have *ninfo declared here:
>
> perfstat_netinterface_total_t ninfo[2],*last_ninfo, *cur_ninfo ;
>
> libperfstat.h has perfstat_netinterface_total_t::ibytes as u_longlong_t.
>
> Does this code try to do what I think it is doing, i.e. assign an
> unsigned 64-bit integer to a signed 64-bit integer?
>
> I'm willing to test the code if someone who's more adept at coding and
> building will take on the challenge.
>
> It looks to me as if the type mismatch will have to be fixed in a few
> places, such as CALC_NETSTAT, and we'll have to add an unsigned long
> long to g_val_t too. Those are the ones I can see so far.
>
> David Wong
> Senior Systems Engineer
> Management Dynamics, Inc.
> Phone: 201-804-6127
> [EMAIL PROTECTED]
>
> -----Original Message-----
> From: Martin Knoblauch [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, March 28, 2007 12:00 PM
> To: David Wong; [email protected]
> Subject: Re: [Ganglia-general] Help! I have a petabyte/s network
>
> David,
>
> as far as I remember, the AIX metrics code had an overflow/wrap-around
> problem prior to 3.0.4. Maybe the fixes are not thorough enough.
>
> The packets/sec are of course less affected.
>
> Cheers
> Martin
>
> --- David Wong <[EMAIL PROTECTED]> wrote:
>
> > Ganglia is reporting that I'm pushing up to 200 Petabytes/s through my
> > network. Nobody tell the network admin!
> >
> > I'm running Ganglia 3.0.4 with the Power5 add-ons on AIX 5.3.
> >
> > Bytes in and out statistics generally appear to have the right values.
> > However, at random times I get spikes in the petabytes/s range.
> >
> > Here's a dump of the bytes_in database. At first, I suspected that
> > these coincide with some counters getting reset, but they don't occur at
> > regular intervals.
> >
> > <!-- 2007-03-27 20:42:00 GMT / 1175028120 -->
> > <row><v> 1.9268390706e+05 </v></row>
> > <!-- 2007-03-27 20:48:00 GMT / 1175028480 -->
> > <row><v> 1.5833184975e+05 </v></row>
> > <!-- 2007-03-27 20:54:00 GMT / 1175028840 -->
> > <row><v> 1.6838302753e+05 </v></row>
> > <!-- 2007-03-27 21:00:00 GMT / 1175029200 -->
> > <row><v> 1.3766069592e+05 </v></row>
> > <!-- 2007-03-27 21:06:00 GMT / 1175029560 -->
> > <row><v> 2.1711888414e+05 </v></row>
> > <!-- 2007-03-27 21:12:00 GMT / 1175029920 -->
> > <row><v> 4.9959709273e+16 </v></row>
> > <!-- 2007-03-27 21:18:00 GMT / 1175030280 -->
> > <row><v> 1.7401339783e+05 </v></row>
> > <!-- 2007-03-27 21:24:00 GMT / 1175030640 -->
> > <row><v> 2.0955720861e+05 </v></row>
> > <!-- 2007-03-27 21:30:00 GMT / 1175031000 -->
> > <row><v> 1.9032255300e+05 </v></row>
> > <!-- 2007-03-27 21:36:00 GMT / 1175031360 -->
> > <row><v> 1.9162727036e+05 </v></row>
> > <!-- 2007-03-27 21:42:00 GMT / 1175031720 -->
> > <row><v> 1.2703790825e+05 </v></row>
> >
> > Can anyone shed light on what might be happening? Any pointers for
> > debugging?
> >
> > David Wong
> > Senior Systems Engineer
> > Management Dynamics, Inc.
> > Phone: 201-804-6127
> > [EMAIL PROTECTED]
> >
> > _______________________________________________
> > Ganglia-general mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/ganglia-general
> >
> >
>
>
> ------------------------------------------------------
> Martin Knoblauch
> email: k n o b i AT knobisoft DOT de
> www: http://www.knobisoft.de
>
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de