Hi Bernd,

On Sun, 2008-04-06 at 18:05 +0200, Bernd Schubert wrote:
> Hello Hal,
>
> On Sat, Apr 05, 2008 at 06:19:43AM -0700, Hal Rosenstock wrote:
> > Hi Bernd,
> >
> > On Sat, 2008-04-05 at 00:12 +0200, Bernd Schubert wrote:
> > > Hello,
> > >
> > > after I upgraded one of our clusters to opensm-3.2.1, it seems to have
> > > gotten much better there: at least no further RcvSwRelayErrors, even
> > > when the cluster is in an idle state, and so far also no SymbolErrors,
> > > which we have also seen before.
> > >
> > > However, after I just started a Lustre stress test on 50 clients
> > > (against a Lustre storage system with 20 OSS servers and 60 OSTs),
> > > ibcheckerrors reports about 9000 XmtDiscards within 30 minutes.
> > >
> > > Searching for this error, I find "This is a symptom of congestion and
> > > may require tweaking either HOQ or switch lifetime values".
> > > Well, I have to admit I neither know what HOQ is, nor do I know how to
> > > tweak it. I also have no idea how to set switch lifetime values. I
> > > guess this isn't related to the opensm timeout option, is it?
> > >
> > > Hmm, I just found a Cisco PDF describing how to set the lifetime on
> > > those switches, but is this also possible on Flextronics switches?
> >
> > What routing algorithm are you using? Rather than play with those
> > switch values, if you are not using up/down, could you try that to see
> > if it helps with the congestion you are seeing?
>
> I now configured up/down, but still got XmtDiscards, though only on one port:
>
> Error check on lid 205 (SW_pfs1_leaf2) port all:  FAILED
> #warn: counter XmtDiscards = 6213        (threshold 100) lid 205 port 1
> Error check on lid 205 (SW_pfs1_leaf2) port 1:  FAILED
> #warn: counter RcvSwRelayErrors = 1431   (threshold 100) lid 205 port 13
> Error check on lid 205 (SW_pfs1_leaf2) port 13:  FAILED
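[For readers hitting the same question: the HOQ (head-of-queue) and switch
lifetime values referred to above are subnet-manager settings, and with opensm
they live in its options file (the location varies by build, e.g.
/var/cache/opensm/opensm.opts). A sketch of the relevant parameters follows;
the hex values are illustrations of the encoding, not tuning recommendations.]

```
# Head-of-queue lifetime on switch ports that connect to other switches.
head_of_queue_lifetime 0x12

# Head-of-queue lifetime on switch ports that connect to end nodes (CAs).
leaf_head_of_queue_lifetime 0x10

# Switch lifetime (SwitchInfo LifeTimeValue): maximum time a packet may
# live inside a switch before being discarded (counted as XmtDiscards).
packet_life_time 0x12
```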
Are you running IPoIB? If so, RcvSwRelayErrors are not necessarily indicative
of a "real" issue, due to the fact that multicasts reflected on the same port
are mistakenly counted.

> I'm also not sure whether up/down is the optimal algorithm for a fabric
> with only two switches.
>
> Since describing the connections in words is a bit difficult, I have
> uploaded a drawing here:
>
> http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/ib/Interswitch-cabling.pdf
>
> The root GUID for the up/down algorithm is leaf-5 of the small switch. But
> I'm still not sure about up/down at all. Doesn't up/down need at least
> three switches, something like the ASCII graphic below?
>
>        root-switch
>        /         \
>       /           \
>    Sw-1 ---------- Sw-2

Doesn't your chassis switch have many switches in it? You did say it was 144
ports, so it's made up of a number of switches. You may need to choose a
"better" root than up/down automatically determines.

-- Hal

> Thanks for your help,
> Bernd
>
> PS: These RcvSwRelayErrors are also back again. I think they occur on some
> Lustre operations. Even if these RcvSwRelayErrors are not critical, they
> are still a bit annoying, since they make it hard to find other errors in
> the output of ibcheckerrors.
> If we can really ignore these errors, I will write a patch to not display
> them by default.

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
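[As I understand the InfiniBand spec, both the HOQ life and the switch
LifeTimeValue are 5-bit encoded times: the actual time is 4.096 µs × 2^value,
with large values (above 19) treated as effectively infinite, i.e. no
discard. A quick sketch to decode the numbers, assuming that encoding:]

```python
def lifetime_seconds(encoded):
    """Decode an InfiniBand LifeTimeValue/HoQLife field to seconds.

    The time is 4.096 us * 2**encoded; values above 19 mean
    "effectively infinite" (packets are never discarded on this basis).
    """
    if encoded > 19:
        return float("inf")
    return 4.096e-6 * (2 ** encoded)

# A value of 0x12 (18) decodes to roughly 1.07 seconds:
print(lifetime_seconds(0x12))
```

So small changes to the encoded value double or halve how long a packet may
wait at the head of a queue before the switch discards it and bumps
XmtDiscards.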

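[On choosing a "better" root for up/down: opensm can be told explicitly which
switch GUIDs to treat as roots via a root GUID file, rather than letting the
updn engine pick one. A command sketch, where the GUID is a placeholder — take
the real node GUID of the intended spine switch from `ibswitches` output:]

```
echo 0x0002c90200000001 > /etc/opensm/root_guids.txt   # placeholder GUID
opensm -R updn -a /etc/opensm/root_guids.txt
```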