On 09:58 Fri 29 Feb, Albert Chu wrote:
>
> What you're trying to do is calculate "ignore_existing_lfts" when the port
> trap is received rather than during routing later on?

Not a trap, but when PortInfo is received during the subnet discovery
phase of the sweep (before routing configuration).

> Logically it looks fine. I tried to make a fix from the "trap side"
> instead of the "routing side" initially too, but I didn't see a clean
> way to do it (obviously I don't know the code as well). I'll try it out
> when I get a chance.
>
> (FYI, I noticed
> +	if (p_physp->need_update)
> should probably be:
> +	if (p_physp->need_update && p_node->sw)
> given the code a few lines above?
> )

Yeah, this should be similar, but I don't yet understand why the
p_node->sw check is really needed a few lines above: for switches,
PortInfo is queried only after SwitchInfo is received, which is where
p_node->sw is initialized. We can probably just remove this check here.

> > Regardless to this it also could be useful to add to the console a
> > command to set p_subn->ignore_existing_lfts up manually.
>
> Yeah, like you said above, this would especially be needed when a new
> switch is added to the network. I'll work with Ira on this.

Thanks.

> > Hmm, interesting... Are you running mpibench during heavy sweep? If so
> > could the degradation be due to a fact of path migration and potential
> > packet drops?
>
> Afraid not, it was after the heavy sweeps. I ran opensm in the foreground
> and saw nothing going on besides the occasional lite sweep.
>
> I've seen similar "inconsistencies" on performance when I've run ~120 node
> jobs on this cluster. So I personally think the tests are due to
> randomness of the nodes selected. I don't know if anything can be
> definitive until a 140+ node job is run (which I don't know if I can :-().

Ok.

Sasha
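
P.S. To make the check under discussion concrete, here is a minimal
standalone sketch of the guarded version. It is not the actual patch:
the sketch_* struct types and the sketch_port_info_received() function
are hypothetical stand-ins invented for illustration, and only the field
names need_update, sw and ignore_existing_lfts come from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real structures; only the field
 * names mentioned in this thread are kept. */
struct sketch_switch {
	int unused;
};

struct sketch_node {
	struct sketch_switch *sw;	/* NULL when the node is not a switch */
};

struct sketch_physp {
	bool need_update;		/* port was flagged as needing an update */
};

struct sketch_subn {
	bool ignore_existing_lfts;	/* when set, existing LFTs are not reused */
};

/* Assumed hook: called when a PortInfo response arrives during the
 * discovery phase of a sweep, before routing is configured. */
static void sketch_port_info_received(struct sketch_subn *p_subn,
				      const struct sketch_node *p_node,
				      const struct sketch_physp *p_physp)
{
	assert(p_subn != NULL && p_node != NULL && p_physp != NULL);

	/* Guard suggested in the thread: set the flag only when the port
	 * needs an update and its node has switch data attached. */
	if (p_physp->need_update && p_node->sw != NULL)
		p_subn->ignore_existing_lfts = true;
}
```

If p_node->sw is always initialized by the time PortInfo arrives for a
switch, as argued above, the second half of the condition only filters
out non-switch nodes in this sketch.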
