On 14:20 Sun 23 Nov, Yevgeny Kliteynik wrote:
>>> One immediate outcome of this bug is opensm.fdbs file - when it
>>> is dumped from the switch LFT (and not from lft_buf),
>> Why this bug is triggered only now?
>
> I had sometimes errors in simulations, and after some analysis
> I decided that they are timing problems with the tests.
> Now that I did some stress testing of ucast cache, I started
> to see more of these errors.

If you are sure that these are simulator or test problems, then just close
#1406 as invalid. Obviously we don't need such a patch then.

>>> it sometimes
>>> doesn't match the lst file.
>> What this "sometimes" mean? I think the case should be investigated
>> deeper. By such patch we are just trying to hide a possible issue.
>> As far as I understand opensm.fdbs (and other routing dump) are
>> generated only after all LinFwdTbl responses are arrived, when some of
>> them failed 'subnet_initialization_error' flag is up and OpenSM will
>> resweep. If so why is 'opensm.fdbs' broken? It is not immediately
>> clear for me.
>
> I didn't see 'subnet_initialization_error' in such cases.
> Anyway, here's what I can do: at the end of each ucast_mgr_process
> I'll compare lft and lft_buf (something that the other patch is
> doing, the one that frees lft_buf), and if there is a difference,
> then we have a problem. If not - then I'll look for the cause
> elsewhere.

Yes, seems deeper investigation is needed here. Thanks.

Sasha
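
For reference, the check proposed in the quoted text (comparing lft against lft_buf at the end of each ucast_mgr_process) could look roughly like the sketch below. It assumes each table is a flat array holding one port byte per LID. The function name and signature are hypothetical and are not taken from the OpenSM sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of OpenSM.
 * Assumption: lft and lft_buf are flat arrays with one output-port
 * byte per LID, and both hold at least `size` entries.
 * Returns the first LID at which the tables differ, or -1 if they
 * match over the first `size` entries.
 */
static long lft_first_mismatch(const uint8_t *lft, const uint8_t *lft_buf,
                               size_t size)
{
    assert(lft != NULL && lft_buf != NULL);

    for (size_t lid = 0; lid < size; lid++) {
        if (lft[lid] != lft_buf[lid])
            return (long)lid;
    }
    return -1;
}
```

Returning the first differing LID rather than a plain yes/no makes it possible to log which entry diverged, which may help tell a timing problem in the tests apart from a real mismatch.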
