On Tue, Mar 29, 2011 at 02:30:23PM -0400, Vladimir Vuksan wrote:
>I see it all the time :-(. According to Bernard this is due to problem
>with some of the Broadcom cards. Perhaps Bernard can offer more insight.

you also get PB/s values if you fail over an IP to a different interface, e.g. from 10gige to a backup gige. there are possibly other common cases too, such as bringing up new or old interfaces with zeroed or pre-existing counters.

I think some sort of generic 'is this an insane value' limiter in the core code would be the best idea. limiters are easy to apply if you know what the physical limits of the interface are, e.g. reject <0 or >1gbit/s on a gige link. it's not quite so easy for things like pkts/s.

we implemented (external) limiters because switch chip resets on our InfiniBand fabric cause the 64bit hardware byte and pkt counters on each port of the chip to go back to zero. it's a 40gbit/s fabric (3.2Gbyte/s of data) with fast cpus, so I impose limiters of >0 and <3Gbyte/s and <10Mpkt/s on this data to make sure it is sane before spoofing it into ganglia.

even though the firmware that was probing the switch chips and causing resets is fixed now, the limiter is still good to have to protect ganglia data from other unforeseen problems. it's a pain to have to go in and edit rrd files.

cheers,
robin
--
Dr Robin Humble, HPC Systems Analyst, NCI National Facility

_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general
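
for illustration, the kind of external limiter described above might look roughly like the sketch below. this is not the actual script used on the fabric: the function name `sane_rate`, the constant names and the sampling scheme are all hypothetical, and only the 3Gbyte/s and 10Mpkt/s ceilings come from the message. it treats a rate of exactly zero as sane (an idle port), which is an assumption.

```python
from typing import Optional

# Ceilings quoted in the message; the names are made up for this sketch.
MAX_BYTES_PER_SEC = 3e9    # < 3 Gbyte/s on the 40gbit/s fabric
MAX_PKTS_PER_SEC = 10e6    # < 10 Mpkt/s


def sane_rate(prev: int, curr: int, interval_s: float,
              max_rate: float) -> Optional[float]:
    """Turn two raw counter samples into a per-second rate.

    Returns None when the value is not believable, so the caller can
    skip the sample instead of recording a spike.
    """
    if interval_s <= 0:
        return None  # no usable time base
    rate = (curr - prev) / interval_s
    # A counter reset to zero shows up as a negative delta; a failover
    # or freshly created interface can show up as an impossibly large one.
    if rate < 0 or rate > max_rate:
        return None
    return rate


if __name__ == "__main__":
    # Normal sample: 1.5 GB moved in 15 s.
    print(sane_rate(0, 1_500_000_000, 15.0, MAX_BYTES_PER_SEC))
    # Counter went back to zero after a chip reset: rejected.
    print(sane_rate(10**12, 0, 15.0, MAX_BYTES_PER_SEC))
```

the caller would simply not submit a sample when `sane_rate` returns None, which leaves a gap in the graph rather than a PB/s spike that has to be edited out of the rrd file later.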

