On Wed, 2007-06-27 at 17:37, Mark Seger wrote:
> Jason Gunthorpe wrote:
> > On Wed, Jun 27, 2007 at 05:13:40PM -0400, Hal Rosenstock wrote:
> >
> >>> - The kernel periodically fetches the performance stats and aggregates
> >>>   them into a 64 bit wrapping counter. The kernel sends PMA mads into
> >>>   the mellanox firmware to read and reset the counters
> >>> - The new 64 bit stats are exported via sysfs/proc/whatever as
> >>>   wrapping counters
> >>> - When a PMA packet comes in the kernel services it rather than
> >>>   passing it on to the chip firmware.
> >>>
> >> In this way, both 32 and 64 bit counters could be presented by the PMA
> >> but how would it know when a counter has maxed out in terms of the
> >> PMA and how would a remote clear be handled?
> >>
> >
> > Each time the counter is cleared the kernel would store the 64 bit
> > value as the 'last PMA counter'. Then the calculation is just
> >
> >    if ((current - stored) >= saturation)
> >        return saturation;
> >    return current - stored;
> >
> > After 2**64 counts the saturation computation will stop working. It
> > would take 24 years of constant maxed out data transfer for a 12x QDR
> > link to wrap a 64 bit dword byte counter.
> >
> > A nice side benefit would be that linux drivers could present a
> > consistent PMA interface with new extended 64 bit counters even with
> > older hardware.
> >
> I agree for 64 bit counters but for 32 bit ones it gets a little more
> complicated because they can max out in under a minute! Since it's
> tough to decide when a counter has maxed out you therefore HAVE to clear
> it every time! This means your monitoring utility will need to examine
> the /proc counters within that 'max-out' window or the counters will
> latch on you. If you wait too long to look you're screwed and now we're
> back to the fact that the counters don't wrap.
>
> What I'd like to hear is the sense of the community whether or not
> something like this would be acceptable. If it is, that means nobody is
> allowed to clear counters on their own

Per the IBA spec, I don't think you can legislate this away. IB supports
a standard way to remotely clear counters (and the various Performance
Managers or other similar tools utilize this clearing feature).

-- Hal

> AND that the single source for counter information then becomes /proc.
>
> -mark
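
For reference, the saturating-read scheme Jason describes in the quoted text could be sketched roughly as below. This is only an illustration under stated assumptions: the names pma_shadow, pma_accumulate, pma_read, pma_clear and PMA_SAT32 are made up for the example and do not correspond to any existing kernel or driver interface, and the 32 bit saturation value is assumed rather than taken from a particular counter definition.

```c
#include <stdint.h>

/* Hypothetical per-counter shadow state; not a real kernel structure. */
struct pma_shadow {
    uint64_t current; /* 64 bit wrapping aggregate maintained by the kernel */
    uint64_t stored;  /* value of 'current' at the last PMA clear */
};

/* Assumed saturation point for a 32 bit PMA counter. */
#define PMA_SAT32 0xFFFFFFFFull

/* Fold one hardware reading into the aggregate. The caller is assumed to
 * have read and reset the hardware counter before it latched. */
static void pma_accumulate(struct pma_shadow *s, uint32_t hw_delta)
{
    s->current += hw_delta; /* wraps modulo 2**64 by design */
}

/* Value to report to a PMA Get: counts since the last clear, pinned at
 * 'saturation'. Unsigned subtraction stays correct across a single wrap
 * of 'current'; after 2**64 counts since the last clear it no longer is. */
static uint64_t pma_read(const struct pma_shadow *s, uint64_t saturation)
{
    uint64_t delta = s->current - s->stored;

    if (delta >= saturation)
        return saturation;
    return delta;
}

/* A remote PMA clear only moves the baseline; the wrapping aggregate that
 * would be exported via sysfs or /proc is left untouched. */
static void pma_clear(struct pma_shadow *s)
{
    s->stored = s->current;
}
```

In this sketch a remote clear never disturbs the 64 bit aggregate, so a local monitoring tool reading the wrapping value and a remote Performance Manager issuing clears would not interfere with each other. It still relies on the kernel polling the hardware often enough that the underlying 32 bit counter never latches, which is the window Mark points out.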
