On Fri, 2007-04-13 at 09:14, Michael S. Tsirkin wrote:
> > Quoting Hal Rosenstock <[EMAIL PROTECTED]>:
> > Subject: Re: multicast join failed for...
> >
> > On Thu, 2007-04-12 at 10:08, Michael S. Tsirkin wrote:
> > > > Quoting Hal Rosenstock <[EMAIL PROTECTED]>:
> > > > Subject: Re: multicast join failed for...
> > > >
> > > > On Wed, 2007-04-11 at 23:38, Michael S. Tsirkin wrote:
> > > > > > Quoting Hal Rosenstock <[EMAIL PROTECTED]>:
> > > > > > Subject: Re: multicast join failed for...
> > > > > >
> > > > > > On Wed, 2007-04-11 at 15:47, Michael S. Tsirkin wrote:
> > > > > > > > Quoting Hal Rosenstock <[EMAIL PROTECTED]>:
> > > > > > > > Subject: Re: multicast join failed for...
> > > > > > > >
> > > > > > > > On Wed, 2007-04-11 at 14:12, Michael S. Tsirkin wrote:
> > > > > > > > > > > If yes, I'm actually not too happy with this.
> > > > > > > > > > >
> > > > > > > > > > > Would something like the following heuristic work better?
> > > > > > > > > > > - select the max rate between all participants
> > > > > > > > > >
> > > > > > > > > > The issue is that one doesn't know all the participants in a group as they are joined dynamically.
> > > > > > > > > >
> > > > > > > > > > (I think we've been over this aspect on the list several times in the past.)
> > > > > > > > >
> > > > > > > > > That's why I suggest the fix, so that the rate is adapted dynamically.
> > > > > > > > >
> > > > > > > > > > > - when a host with lower rate joins, destroy the group
> > > > > > > > > >
> > > > > > > > > > I don't think a group can be destroyed like this "underneath" its existing members.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Of course it can. That's what happens when SM is restarted.
> > > > > > > >
> > > > > > > > Client reregistration ? I don't like using that big hammer as a solution to this.
> > > > > > > > Seems a little harsh to me.
> > > > > > >
> > > > > > > I think it's not too bad
> > > > > >
> > > > > > It requires all subscriptions to reregister. This affects more things than just multicast or even the groups affected which might not be all of the multicast groups. Hence BIG hammer.
> > > > >
> > > > > Changing an option in opensm config requires restarting opensm. Isn't that right?
> > > >
> > > > Yes but that doesn't have to be the case going forward in terms of OpenSM reconfig.
> > > > >
> > > > > So it's an even bigger hammer.
> > > >
> > > > Restarting opensm is a slightly bigger hammer right now (than client reregistration) in the case the admin wants it "dynamic" but I suspect this only needs to be done once.
> > >
> > > I think you forgot that currently one has to edit the config file,
> > > just restarting opensm isn't enough :).
> > > Let the user decide for us is a *HUGE* hammer - it usually solves all problems, but at what cost?
> >
> > Doesn't the admin "plan" his network ? This is part of the installation and bringup IMO.
>
> I agree the admin must plan the network.
> But I disagree this should necessarily involve editing config files.

Why doesn't it include editing config files when some non default is needed ?

> > There are a couple of ways to avoid having the admin decide but they all involve penalizing the more normal use cases (pushing the admin burden to them). I'm ambivalent about whether that's a better choice.
>
> I don't think what I propose penalizes normal use.
> It just turns what used to be an error into working configuration.

I was referring to the existing static rate approach with a default, not to the proposed dynamic approach.

> > > > > > There could be a more graceful way to deal with this. I don't like using client reregister unless absolutely needed.
> > > > >
> > > > > What are the other options that have the same functionality?
> > > >
> > > > Perhaps a spec enhancement is possible to make this better.
> > >
> > > Sure. Meanwhile, opensm will have to support legacy networks too so I think we can start with the reregister solution.
> >
> > OK; it could be another option. Would you propose this being the default option ?
>
> No, I expect if node supports an ability to reregister specific mcast
> groups, this capability can be advertised somehow, and SM
> will use it if available, and plain reregister if not.

I wasn't referring to the mechanisms underneath which accomplish the dynamic rate adjustment here. I was asking if you propose dynamic rate being the default rate for multicast groups (and any specific static rate would need to be configured).

> > > > > > > - previously we had some client failing join
> > > > > > > which is worse.
> > > > > >
> > > > > > Maybe not. Maybe that's what the admin wants (to keep the higher rate rather than degrade the group due to some link issue).
> > > > >
> > > > > Rate could be an option, but I think generally people prefer things working even if at a slower rate.
> > > >
> > > > I think it's a coin flip.
> > >
> > > I disagree.
> > > I think people that want the join to fail basically just want to make debugging easy. We can help them without failing joins.
> > >
> > > > I've seen it both ways and either way there are support questions.
> > >
> > > I think we can solve this relatively easily: compare the bcast group rate with local rate and have IPoIB produce a warning in log if these do not match.
> > >
> > > This is similar to what we have with USB2.0 device in USB slot, people seem to be happy.
> > >
> > > > In the current scenario, it is join failures. In the proposed scenario, it is more subtle: performance implications and perhaps SA network storms.
> > >
> > > I don't believe we'll see network storms: rate has to drop from DDR to SDR only once.
> >
> > Frequency appears low (but I'm sure we'll hit some oscillating case down the road) but impacts all multicast groups whether or not this node affects them as well as other subscriptions. Client reregister is a storm IMO and should only be used when there is absolutely no other choice.
>
> I agree it might be useful to give opensm a way to detect
> that a set of mcast groups belongs to a specific application,

Certainly, one can detect IPv4 groups and IPv6 groups but not sure about the level of granularity needed here.

> and a way to force re-registration.

more gracefully.

-- Hal

_______________________________________________
general mailing list
[EMAIL PROTECTED]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
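
For readers following the thread, the dynamic-rate heuristic proposed above (the group takes the rate of its first member, drops to a slower rate when a slower port joins instead of failing the join, and IPoIB warns when the group rate differs from the local rate) can be sketched roughly as below. This is only an illustration, not OpenSM or IPoIB code: the McastGroup class, the join() and rate_mismatch_warning() functions, and the Gb/s values used for 4x SDR and DDR links are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumed signalling rates in Gb/s for 4x links (illustrative values only).
RATE_GBPS = {"SDR": 10, "DDR": 20}


@dataclass
class McastGroup:
    """Hypothetical model of one multicast group as a subnet manager might track it."""
    rate: Optional[int] = None                      # group rate in Gb/s, unset until first join
    members: Dict[str, int] = field(default_factory=dict)  # member name -> port rate
    recreations: int = 0                            # how often the group had to be rebuilt


def join(group: McastGroup, member: str, port_rate: int) -> bool:
    """Admit a member and return True if existing members must rejoin.

    With a static group rate, a port slower than the group rate would have
    its join rejected. Here the group rate is lowered instead, which stands
    in for "destroy the group and let the members rejoin at the new rate".
    """
    must_rejoin = False
    if group.rate is None:
        # First member defines the initial group rate.
        group.rate = port_rate
    elif port_rate < group.rate:
        # Slower port joins: lower the rate; current members have to rejoin.
        must_rejoin = bool(group.members)
        group.rate = port_rate
        group.recreations += 1
    group.members[member] = port_rate
    return must_rejoin


def rate_mismatch_warning(group_rate: int, local_rate: int) -> Optional[str]:
    """Return a log message if the group rate differs from the local port rate."""
    if group_rate == local_rate:
        return None
    return (f"multicast group rate {group_rate} Gb/s does not match "
            f"local port rate {local_rate} Gb/s")


if __name__ == "__main__":
    g = McastGroup()
    join(g, "ddr-host-1", RATE_GBPS["DDR"])
    print(join(g, "sdr-host", RATE_GBPS["SDR"]))   # existing member must rejoin
    print(rate_mismatch_warning(g.rate, RATE_GBPS["DDR"]))
```

In this sketch the rate only ever moves down, which matches the remark that the rate has to drop from DDR to SDR only once. How the rejoin is triggered (full client reregistration or something more targeted) is exactly the open question in the thread and is left out here.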
