On Thu, Apr 07, 2005 at 03:32:24PM -0400, Hal Rosenstock wrote:
...
> Assumption:
>
> The proposed solution assumes that the ignore GUIDs file option of
> OpenSM only impacts the routing algorithm (path counting) and should not
> be extended for bad port handling.
>
> Proposed Solution:
>
> The OpenSM will implement a configurable policy (some number of
> consecutive lack of responses to SM requests). At the point of
> exhaustion of the timeout/retry strategy, that port will be marked as
> "bad" by OpenSM.

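To make the quoted proposal concrete, here is a rough sketch of the detection half only: count consecutive unanswered SM requests per port and mark the port "bad" once a configurable threshold is reached. This is not OpenSM code. The class name, the `record` method, and the default of 3 are all made up for illustration, and what to do about a bad port is left to a separate policy.

```python
from dataclasses import dataclass, field


@dataclass
class BadPortDetector:
    """Hypothetical helper: tracks consecutive unanswered SM requests per port."""

    # Assumed default; the proposal only says this is configurable.
    max_consecutive_misses: int = 3
    misses: dict = field(default_factory=dict)
    bad_ports: set = field(default_factory=set)

    def record(self, port_guid: int, responded: bool) -> bool:
        """Record one request outcome; return True if the port is marked bad."""
        if responded:
            # A response resets the streak. Clearing an existing "bad" mark
            # is a recovery-policy decision, so it is not done here.
            self.misses[port_guid] = 0
        else:
            self.misses[port_guid] = self.misses.get(port_guid, 0) + 1
            if self.misses[port_guid] >= self.max_consecutive_misses:
                self.bad_ports.add(port_guid)
        return port_guid in self.bad_ports
```

The detector only reports; anything that acts on `bad_ports` (rerouting, alerting, retrying later) would live elsewhere.
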
Generally speaking, separating recovery "policy" from "detection" is
a good thing.

...
> Is there a need to store these "bad" ports persistently (and ignore them
> on startup) ?

If opensm can see the physical link is ok, I would think it should save
any state. It's possible a system just hasn't loaded whatever SW is
necessary to talk to the SM and might require operator intervention
to kick that off (e.g. none of my systems auto-reboot unless I'm testing
a specific customer environment).

I expect it's a separate policy on how long to save information after
the physical link has been dropped - similar to DHCP.

grant
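
For illustration, the DHCP-like retention idea could look roughly like the sketch below: state is kept as long as the physical link is up, and once the link drops it is kept only for a configurable period. Everything here (the class, method names, and the `retention_seconds` parameter) is hypothetical and not part of OpenSM.

```python
class PortStateCache:
    """Hypothetical cache: keeps per-port state, expiring it after link loss."""

    def __init__(self, retention_seconds: float):
        # Assumed knob: how long to remember a port after its link drops.
        self.retention_seconds = retention_seconds
        # port_guid -> (state, time the link went down, or None if link is up)
        self._entries = {}

    def link_up(self, port_guid, state):
        """Store state for a port whose physical link is up."""
        self._entries[port_guid] = (state, None)

    def link_down(self, port_guid, now):
        """Start the retention clock the first time the link is seen down."""
        entry = self._entries.get(port_guid)
        if entry is not None and entry[1] is None:
            self._entries[port_guid] = (entry[0], now)

    def lookup(self, port_guid, now):
        """Return saved state, or None if unknown or past the retention period."""
        entry = self._entries.get(port_guid)
        if entry is None:
            return None
        state, down_since = entry
        if down_since is not None and now - down_since > self.retention_seconds:
            del self._entries[port_guid]
            return None
        return state
```

Passing `now` in explicitly keeps the retention policy independent of how the SM measures time.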
