On 16:50 Mon 19 Mar, Yevgeny Kliteynik wrote:
>
> In __osm_ucast_mgr_process_neighbor(), there is the following assertion:
>
>   CL_ASSERT( hops <= osm_switch_get_hop_count( p_sw, lid_ho, port_num ) );
>
> This assertion fails, since the hop count becomes inconsistent.

This is not a big problem IMO, we just need to not deal with non-existing
LIDs there (so the __osm_ucast_mgr_process_neighbor() code should be
improved in this direction and this assertion removed). And the LFT
generation code doesn't try to build entries for non-existing LIDs, so
"old" min hop vectors will be ignored there.

But I think we could have a problem when the port (switch with master) is
reconnected at a different location. Then old/invalid hop counts will be
counted again, and if one of them "wins" we can get unexpected routing
paths.

So obviously hop matrix cleanup is the simplest fix - agreed.

> >>I'm not sure about the trunk though.
> >>Sasha,
> >>Can you please check that you latest improvements to the
> >>routing don't have this problem?
> >
> >With disconnecting switches should be similar behavior I guess.
>
> Right, I checked it - same problem.

Interesting. This function is different in the master and doesn't scan
LIDs from 1 up to max anymore; instead it scans only the switches existing
at the moment. Could you provide more details about the master? Are you
able to see the problem with just switch disconnections? What is the test
case?

Sasha
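
To illustrate the reconnection case described above, here is a minimal toy sketch of how a stale hop count can "win" and how clearing the hop matrix before a re-sweep avoids that. It is not OpenSM code: toy_switch_t, toy_clear_hops(), toy_set_hops(), toy_least_hops() and toy_resweep() are hypothetical names, and the matrix sizes and the LID/port/hop values are made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model; these are NOT the OpenSM data structures. */
#define TOY_MAX_LID   8
#define TOY_NUM_PORTS 4
#define TOY_NO_PATH   0xFF

typedef struct toy_switch {
    /* hops[lid][port]: hop count to 'lid' when leaving through 'port' */
    uint8_t hops[TOY_MAX_LID + 1][TOY_NUM_PORTS];
} toy_switch_t;

/* Mark every (LID, port) pair as unreachable. */
static void toy_clear_hops(toy_switch_t *sw)
{
    memset(sw->hops, TOY_NO_PATH, sizeof(sw->hops));
}

/* Record a hop count only if it improves on the stored value. */
static void toy_set_hops(toy_switch_t *sw, unsigned lid, unsigned port,
                         uint8_t hops)
{
    if (hops < sw->hops[lid][port])
        sw->hops[lid][port] = hops;
}

/* Smallest hop count to 'lid' over all ports. */
static uint8_t toy_least_hops(const toy_switch_t *sw, unsigned lid)
{
    uint8_t best = TOY_NO_PATH;
    for (unsigned p = 0; p < TOY_NUM_PORTS; p++)
        if (sw->hops[lid][p] < best)
            best = sw->hops[lid][p];
    return best;
}

/*
 * Scenario: LID 5 is first seen 1 hop away through port 2, then it is
 * reconnected elsewhere and is now 3 hops away through port 1.
 * Returns the least hop count to LID 5 after the second sweep.
 */
static uint8_t toy_resweep(int clear_first)
{
    toy_switch_t sw;

    toy_clear_hops(&sw);
    toy_set_hops(&sw, 5, 2, 1);   /* first sweep */

    if (clear_first)
        toy_clear_hops(&sw);      /* the cleanup discussed above */
    toy_set_hops(&sw, 5, 1, 3);   /* second sweep, after reconnection */

    return toy_least_hops(&sw, 5);
}
```

Without the cleanup, the old 1-hop entry on port 2 survives the second sweep and beats the real 3-hop path; with the cleanup, only the current entry remains.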
