Hal Rosenstock wrote:
On Wed, 2007-07-11 at 10:15, Mark Seger wrote:
My basic philosophy, and I suspect there are those who might disagree,
is that you can't use the network to monitor the network, at least not
in times of trouble.
Right, in times of certain troubles.
And that is the key. Since you can't know a priori when you're about to
have troubles, you need to be collecting the data locally before they occur.
That's why I insist on having to query the HCAs
directly since I can't always be sure the network is there and/or
reliable. If you are willing to concede that this can indeed happen,
then the question becomes one of how you reliably get data from an
HCA, and that's the basis for my (re)starting this discussion.
The reliability comes from timeout/retry mechanisms. If performance data
cannot be obtained on an IB network, the problem needs to be troubleshot
at a lower level (by SMPs).
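
A minimal sketch of the timeout/retry idea, purely for illustration. The
names `query_with_retry` and `query` are hypothetical and stand in for
whatever call actually fetches the counters; nothing here is an existing
OpenFabrics API. Each failed attempt is retried with a longer timeout, and
the last error is raised if every attempt times out.

```python
def query_with_retry(query, retries=3, timeout_s=0.5, backoff=2.0):
    """Call query(timeout_s) up to `retries` times, growing the timeout.

    `query` is a hypothetical callable that returns counter data or
    raises TimeoutError when no response arrives in time.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for _ in range(retries):
        try:
            return query(timeout_s)
        except TimeoutError as err:
            last_error = err
            timeout_s *= backoff  # wait longer on the next attempt
    # Every attempt timed out: the caller has to look at a lower level.
    raise last_error
```

When the final exception is raised, that is the point at which the problem
would have to be chased at a lower level.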
In any case, a rearchitecture of the PMA was proposed and seems
reasonable to me in that it can accommodate either approach. All that is
needed now is for someone to step up and champion an implementation of
this. Unfortunately, I do not have time to do so.
I don't know if what I've been proposing requires any rearchitecting, as
I see it as something local to each node. Specifically, and there is
already an implementation of this in an earlier Voltaire stack, the idea
is to export wrapping HCA counters to /proc. The module that does this
reads and clears the counters on every access, but since no local
applications are accessing the counters directly, clearing them doesn't
hurt anyone.
Alas, anyone else who wants to query the counters will find them reset.
The other side benefit of exporting these counters in such a way is that
lots of others can now collect/report this info. In other words, if someone
chose to add IB stats to sar, it would become very easy to do!
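
A rough sketch of how a local, sar-style collector might consume counters
that are cleared on every read. The /proc path and the "name value" line
format are assumptions made up for illustration; they are not the actual
layout used by the Voltaire module. Because each read returns only what
accumulated since the previous read, the collector keeps its own running
totals.

```python
def parse_counters(text):
    """Parse 'name value' lines into a dict of ints (assumed format)."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            counters[parts[0]] = int(parts[1])
    return counters


class LocalCollector:
    """Keeps running totals from counters that reset to zero on each read."""

    def __init__(self, read_text):
        self.read_text = read_text  # callable returning the file contents
        self.totals = {}

    def sample(self):
        # Each read yields the delta since the last read, so just add it up.
        deltas = parse_counters(self.read_text())
        for name, delta in deltas.items():
            self.totals[name] = self.totals.get(name, 0) + delta
        return deltas


def read_proc(path="/proc/hypothetical_ib_counters"):
    """Read the (hypothetical) counter file exported by the module."""
    with open(path) as f:
        return f.read()
```

A collector would call `LocalCollector(read_proc).sample()` once per
interval. Note that only one such reader can exist per node, since any
second reader would see counters already cleared by the first.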
If this is the type of thing people are interested in, I might be able
to supply some code to do it.
As for querying the switch for counters, what do you do on a very large
network, say 10s of thousands of nodes, if you want to get performance
data every second? I also realize this is an extreme situation today
(the node count, not the frequency of monitoring), but I'm sure everyone
would agree systems of these sizes are not that far off.
You have a distributed performance manager to handle this. A hierarchy
of performance managers has been discussed on the list before.
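
To make the scaling argument concrete, here is a small sketch of what a
hierarchy buys you. This is not the design discussed on the list; the
function names and the fan-out figure are assumptions for illustration
only. Each manager polls a bounded number of children and passes a summed
report upward, so no single manager has to query every node each second.

```python
def levels_needed(nodes, fan_out):
    """Tiers of managers needed so none polls more than fan_out children."""
    if nodes < 1 or fan_out < 2:
        raise ValueError("need nodes >= 1 and fan_out >= 2")
    levels = 0
    remaining = nodes
    while remaining > 1:
        remaining = -(-remaining // fan_out)  # ceiling division
        levels += 1
    return levels


def aggregate(reports):
    """Sum per-counter values from child reports into one parent report."""
    total = {}
    for report in reports:
        for name, value in report.items():
            total[name] = total.get(name, 0) + value
    return total
```

For example, with an assumed fan-out of 64, `levels_needed(20000, 64)`
gives 3 tiers for a 20,000-node fabric.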
ahh, I see.
-mark