Hi Hal,

> >
> > > Second, I have run some tests querying the fabric of our large
> > > clusters here (~500 nodes) and the results were promising for a
> > > single node implementation.
> > > I don't recall the numbers as this was a while ago but it was on the
> > > order of <2 sec and I think <1 but I don't want to be misquoted.
> >
> > Does PerfMgr query switch ports ?
>
> Yes (of course it does).
>
> > If it does I am surprised by the short sweep time you got.
> >
> > Does it have >1 query on the wire at a given time?
>
> Yes, Default appears to be 500 currently (maybe that needs
> dialing back a bit) but is settable via
> perfmgr_max_outstanding_queries in options file.

This explains some.

> > If not then I am even more surprised.
> >
> > Was the cluster running a job at the time of the query ?
>
> Is this question related to VL0 contention ?

Yes

>
> -- Hal
>
> > Thanks
> >
> > Eitan Zahavi
> > Senior Engineering Director, Software Architect Mellanox Technologies LTD
> > Tel:+972-4-9097208
> > Fax:+972-4-9593245
> > P.O. Box 586 Yokneam 20692 ISRAEL
> >
> > > -----Original Message-----
> > > From: Ira Weiny [mailto:[EMAIL PROTECTED]
> > > Sent: Tuesday, July 10, 2007 7:47 PM
> > > To: Eitan Zahavi
> > > Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
> > > [email protected]; [EMAIL PROTECTED]
> > > Subject: Re: [ofa-general] IB performance stats (revisited)
> > >
> > > On Thu, 28 Jun 2007 10:24:59 +0300
> > > "Eitan Zahavi" <[EMAIL PROTECTED]> wrote:
> > >
> > > > > On Wed, 2007-06-27 at 14:23, Eitan Zahavi wrote:
> > > > > > In the last months it is the second time I hear people complaining the
> > > > > > current monitoring solution in OFA is integrated with OpenSM.
> > > > >
> > > > > I must have missed this both times (didn't see this in Mark's
> > > > > post) and the statement itself is somewhat inaccurate as well.
> > > > Private talks - I hope they will speak up for themselves now...
> > > > >
> > > > > > These people do not use OpenSM but do use OFED.
> > > > >
> > > > > I'm not sure I'm following what you mean here.
> > > > >
> > > > > If you mean that some people want to run PerfMgr without the SM/SA
> > > > > aspects (so that they can run a vendor based SM), that is the next
> > > > > thing we are adding to the implementation.
> > > > Exactly. OK when is that coming?
> > >
> > > There is very little which ties the current PerfMgr to OpenSM.
> > > Basically it just gets the current fabric topology.
> > > As Hal has said changes are coming.
> > >
> > > > >
> > > > > > Another drawback is that
> > > > > > no naming is provided and the reporting uses GUIDs.
> > > > >
> > > > > Naming is provided via NodeDescription.
> > > > This might be good for hosts but is not covering switches ...
> > >
> > > It does include switches. However, since most systems have the same
> > > name for multiple switches this becomes ineffective.
> > > I have queried Voltaire for a way to change the NodeDescription for
> > > switches, but at the time I asked, there was no way to do it.
> > > Perhaps there is now? What about other vendors? This is why
> > > ibnetdiscover and other diags have "switch map" support. (A
> > > GUID->name mapping to override the default NodeDescription.) Nothing
> > > would please me more than to be able to remove that for a more
> > > "automatic" solution.
> > >
> > > > >
> > > > > > I also can't hold myself from saying again I think you are going
> > > > > > to hit the wall with the concept of doing the PMA from a single node.
> > > > >
> > > > > If you are referring to the fact the PerfMgr is currently not
> > > > > distributed, that will be done as has been stated before.
> > > > Good. When is it expected? Will it be OFED 1.3?
> > >
> > > When Hal first sent out the PerfMgr design I thought we should jump
> > > right to the distributed model as well.
> > > But now I am glad we have gone the way we did.
> > > First off, we have something which "works" and from which we can
> > > expand.
> > > Second, I have run some tests querying the fabric of our large
> > > clusters here (~500 nodes) and the results were promising for a
> > > single node implementation.
> > > I don't recall the numbers as this was a while ago but it was on the
> > > order of <2 sec and I think <1 but I don't want to be misquoted.
> > >
> > > For sure, a distributed model offers many advantages and we will get
> > > there. But for many the current single node approach should work
> > > just fine.
> > >
> > > Thanks,
> > > Ira
> > >
> > > > >
> > > > > > Thanks
> > > > >
> > > > > -- Hal
> > > > >
> > > > > > Eitan Zahavi
> > > > > > Senior Engineering Director, Software Architect Mellanox Technologies LTD
> > > > > > Tel:+972-4-9097208
> > > > > > Fax:+972-4-9593245
> > > > > > P.O. Box 586 Yokneam 20692 ISRAEL
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Hal Rosenstock
> > > > > > > Sent: Wednesday, June 27, 2007 8:12 PM
> > > > > > > To: Mark Seger
> > > > > > > Cc: Finn, Ed; [email protected]
> > > > > > > Subject: Re: [ofa-general] IB performance stats (revisited)
> > > > > > >
> > > > > > > On Wed, 2007-06-27 at 13:07, Mark Seger wrote:
> > > > > > > > >The performance managers deal with the counter stickiness (by
> > > > > > > > >resetting them when they think they need to). They typically export
> > > > > > > > >their data although this is not specified by IBA so it is in a vendor
> > > > > > > > >proprietary manner.
> > > > > > > >
> > > > > > > > so I guess these guys are poor citizens as well...
> > > > > > >
> > > > > > > Not sure what you mean.
> > > > > > >
> > > > > > > > the real issue as I see it then means nobody can trust the data if
> > > > > > > > random tools randomly reset the counters. a real shame...
> > > > > > >
> > > > > > > I consider this to be a real rather than random app for this.
> > > > > > > Guess it depends on what one considers random.
> > > > > > >
> > > > > > > -- Hal
> > > > > > >
> > > > > > > > -mark
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > general mailing list
> > > > > > > [email protected]
> > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> > > > > > >
> > > > > > > To unsubscribe, please visit
> > > > > > > http://openib.org/mailman/listinfo/openib-general
