On Fri, 2006-06-09 at 17:18, Sean Hefty wrote:
> Hal Rosenstock wrote:
> > What does mesh mean in this instance ? How do you know the multicast
> > routing tables are indeed valid and that the SM didn't corrupt them ?
> > (Why did the SM need restarting ?)
>
> I meant that the values agree with each other, and there are no conflicts.

How are conflicts determined ? The SA has no way of querying the end
nodes for their multicast information; it currently is the other way
around.

> > The MLID is supplied by the SA in response to a group request from the
> > end node, not the other way around. The end node doesn't tell the SA
> > what MLID to use for a group.
>
> One of the ideas is for the end nodes to provide this data, even if that
> means extending the architecture.

OK. What if the SM already put the MLID to use for something else ?

> The problem is that the SA lost its state, but the network is working fine.

How does the SM know that the network is working fine ?

> The end nodes know which groups they have joined and the mapping of MGIDs
> to MLIDs. And the switches are already programmed correctly.

I'm not sure what constitutes a correctness criterion here.

> Even if we have the ability for an SM to transparently fail over to another
> SM, because of the architecture, the end nodes are being forced to assume
> that all multicast group information has been lost.

In the case of an SM which replicated its database, it would replicate
the registrations which include multicast so this reregistration
shouldn't be necessary. But I don't know of a way that the end node
knows whether the SM is doing this database replication.

> How about this? What if the end nodes only re-joined their groups on
> LID_CHANGE or CLIENT_REREGISTER events? That is, an SM_CHANGE would not
> result in clients needing to rejoin any groups. This puts the burden on
> the SM to generate a CLIENT_REREGISTER event only if it's needed. SMs that
> can fail over and maintain multicast state in the process would be able to
> do so.

I think more than this is needed.

-- Hal

> - Sean
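
For reference, the policy Sean proposes above could be sketched roughly as
follows. This is only an illustration of the proposal as worded in this
thread, not code from any existing stack: the enum, its constants, and
should_rejoin_groups() are hypothetical names made up for this sketch. It
also does not address the objection that more than this may be needed.

```c
#include <stdbool.h>

/* Hypothetical event codes, named after the events discussed in this
 * thread. They are assumptions for illustration, not a real API. */
enum port_event {
    EV_SM_CHANGE,
    EV_LID_CHANGE,
    EV_CLIENT_REREGISTER,
};

/* Returns true if, under the proposal, an end node would rejoin its
 * multicast groups in response to the given event. */
bool should_rejoin_groups(enum port_event ev)
{
    switch (ev) {
    case EV_LID_CHANGE:
    case EV_CLIENT_REREGISTER:
        /* The SM asked for re-registration, or our LID changed. */
        return true;
    case EV_SM_CHANGE:
    default:
        /* An SM change alone would no longer force a rejoin; the SM is
         * expected to raise CLIENT_REREGISTER if it lost state. */
        return false;
    }
}
```

With this split, an SM that fails over with its multicast state intact would
simply not raise the re-register event, and end nodes would keep their
existing group memberships.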
