Bob Ciotti wrote:
> Sorry to bounce this off the list, should it be too remedial. I promise
> that I've been consuming a lot of the spec and OFA code. Maybe you consider
> that a promise or a warning that we will be more active :|
>
> Our configuration is >6000 CAs in a mix of InfiniHost III/ConnectX and
> Longbow extenders and >800 24-port switches on a single subnet (SGI ICE
> with lots of other stuff plugged in). It's DDR everywhere except across the
> Longbows. Hosts range from a few different generations of x86 Xeon, x86
> Opteron and Itanium. We use Lustre but have the SRP traffic on a separate
> subnet.
>
> A few weeks ago connection setup times were mentioned on this list along
> with ARP and path record lookups not being scalable. We experience these
> problems as well and need to address these scalability issues. I have quite
> a bit of test data and a few different ideas to bounce off the list RE path
> records, once I am a little more versed in the spec. There has already been
> some work done to limit ARP traffic.
>
>
> Today's question has to do with SM errors.
> We have been seeing lots of these - sometimes more than others. Digging
> around some, it appears that the 6777 represents the number of duplicates?
> This value fluctuates some, but not a lot. Comments in the code
> indicate that any value >1 is a problem. Question is, is this
> OK to be happening, and how does it occur?
>
> We will probably do an update to the 1.4 or 1.4.1 SM in the next few days.
> We are currently running a pre-1.4 top-of-tree pull from back in Dec.
>
> bob
>
This may be the path record query triggered by ipoib (ib_sa_path_rec_get). It uses METHOD_GET, and if there is more than one path record the query fails. Using METHOD_GET_TABLE should solve it (the fix is in the ib_sa module). If you enable debugging in ipoib and see path record query failures, this is probably it.

--Yossi.
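To make the difference between the two methods concrete, here is a toy model of the behaviour described above. It is only a sketch and not the ib_sa or OpenSM code: `PathRecord`, `sa_get`, `sa_get_table`, the exception names and the GID/LID values are all hypothetical, and it assumes a Get-style query is rejected whenever more than one record matches, while a GetTable-style query returns every match.

```python
# Toy model of Get vs. GetTable semantics for path record queries.
# Every name and value here is hypothetical; this is not the ib_sa or OpenSM API.
from dataclasses import dataclass


@dataclass(frozen=True)
class PathRecord:
    sgid: str
    dgid: str
    slid: int
    dlid: int


class NoRecords(Exception):
    """No record matched the query."""


class TooManyRecords(Exception):
    """A Get-style query matched more than one record."""


def sa_get_table(records, sgid, dgid):
    """GetTable-style query: return every record that matches."""
    return [r for r in records if r.sgid == sgid and r.dgid == dgid]


def sa_get(records, sgid, dgid):
    """Get-style query: succeed only if exactly one record matches."""
    matches = sa_get_table(records, sgid, dgid)
    if not matches:
        raise NoRecords(f"no path from {sgid} to {dgid}")
    if len(matches) > 1:
        # Assumed behaviour: the query fails and the match count is reported.
        raise TooManyRecords(len(matches))
    return matches[0]


# Made-up fabric: two paths between gid-a and gid-b, one between gid-a and gid-c.
RECORDS = [
    PathRecord("gid-a", "gid-b", slid=1, dlid=8),
    PathRecord("gid-a", "gid-b", slid=1, dlid=9),
    PathRecord("gid-a", "gid-c", slid=1, dlid=12),
]

if __name__ == "__main__":
    print(sa_get_table(RECORDS, "gid-a", "gid-b"))  # both paths are returned
    try:
        sa_get(RECORDS, "gid-a", "gid-b")
    except TooManyRecords as exc:
        print("Get failed, matching records:", exc.args[0])
```

In this model the requester that uses the GetTable-style query simply picks one of the returned paths, which is the idea behind switching ipoib's query from METHOD_GET to METHOD_GET_TABLE.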