Hi Renee,

On 20/05/2009 01:21, Renee Danson wrote:
> While thinking about the dhcp_wait_time property, and whether or not we
> still need it in phase 1, I came up with some more questions about how
> we deal with NCU priority group mode.
>
> For NCUs with activation-mode set to prioritized: each NCU is assigned
> a priority group number; one or more links may have the same number.
> Each prioritized link also has a priority group mode (exclusive, shared,
> all). The mode determines how many of the group members must be available
> in order for the group to be considered available: exclusive means that
> one member must be available, and at most one will be enabled; shared
> means that one member must be available, and any that are available will
> be enabled; all means that all members must be available and will be
> enabled.
>
> So far, so good. The question is: how do we define "available"? The
> most obvious answer is the link is up. This works for ethernet devices
> that report link state, but falls down with wireless devices, which
> aren't really up until connected. So when deciding which NCUs should
> be enabled, we consider wireless links to always be available. If we
> are unable to complete a connection for any reason, we must then fall
> back to the next choice on our priority list.
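To make the three group modes quoted above concrete, here is a minimal sketch of the rule as described. It is not nwamd code: the names `GroupMode` and `evaluate_group` are hypothetical, and picking the first available member in exclusive mode is an assumption, since the text only says that at most one member will be enabled.

```python
from enum import Enum


class GroupMode(Enum):
    EXCLUSIVE = "exclusive"
    SHARED = "shared"
    ALL = "all"


def evaluate_group(mode, availability):
    """Return (group_available, members_to_enable).

    availability maps a member name to True/False, listed in
    preference order (an assumption made for this sketch).
    """
    available = [name for name, up in availability.items() if up]

    if mode is GroupMode.ALL:
        # Every member must be available; then every member is enabled.
        ok = bool(availability) and len(available) == len(availability)
        return ok, (available if ok else [])

    if not available:
        # exclusive and shared both need at least one available member.
        return False, []

    if mode is GroupMode.EXCLUSIVE:
        # At most one member is enabled; taking the first is an assumption.
        return True, available[:1]

    # SHARED: every available member is enabled.
    return True, available
```

For example, `evaluate_group(GroupMode.SHARED, {"bge0": False, "bge1": True})` reports the group as available with only `bge1` enabled. What counts as "available" for each member is exactly the open question in the rest of this thread.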
Certainly the iwh and ath drivers don't seem to flag a LINK as UP unless
they are actually connected - but maybe not all wireless drivers are the
same.

Alan added a test at the LINK level (IIRC) that checks if a wireless
device is connected; this would seem to be the correct thing to use to
make the decision w.r.t. the wireless LINK NCU and whether it's UP or not.

I think that assuming all wireless links are available is a mistake.

From a *user* perspective, the network (as a whole) is not available
until they have an IP address and are able to ping/browse www.google.com ;)

This also applies - and will probably be an RFE we need to address at
some point - where any application needs to be able to "phone home" and
wants to ask the question "Do I have a network connection?" and, if so,
take some action.

An example where this is currently relevant would be with IPS and its
automatic check for updates - this needs a way to know when it's on-line
(and maybe whether it's connected to the internet or an internal LAN?)...

>
> But there's another catch. The phase 0/0.5 policy conflates link and
> interface configuration, so if, for example, you have a wired link
> that's up, but cannot obtain a dhcp address on it, nwamd will (after a
> timeout) fall back to the next available device. But in our priority
> group scheme, with the default policy (which is supposed to match that
> of phase 0/0.5) in place, as long as one ethernet device has link, we
> consider that priority group available and active. No need to try
> anything else.
>
> To resolve this, I think we need to make our NCU condition checking a
> little more complicated, unfortunately. I think we need to extend
> the state check of a link NCU to include the state of the associated
> interface NCU. In other words, both the link:bge0 and interface:bge0
> must be online in order for the link:bge0 component of its priority
> group to be considered online.
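The difference between the link-only check and the proposed link-plus-interface check can be sketched as follows. This is only an illustration under assumed names: the `states` dictionary, the `"online"`/`"offline"` strings and both helper functions are hypothetical and do not reflect nwamd's actual data structures.

```python
ONLINE = "online"


def link_only_available(states, device):
    """Link-only check: the device counts as available once its link is online."""
    return states.get(("link", device)) == ONLINE


def pair_available(states, device):
    """Proposed check: both the link NCU and the interface NCU must be online."""
    return (states.get(("link", device)) == ONLINE
            and states.get(("interface", device)) == ONLINE)


# bge0 has link but could not obtain a DHCP address (hypothetical state).
states = {
    ("link", "bge0"): "online",
    ("interface", "bge0"): "offline",
}

print(link_only_available(states, "bge0"))  # True
print(pair_available(states, "bge0"))       # False
```

With the link-only check, the group containing bge0 would be treated as available and nothing else would be tried; with the paired check it is not, so the next priority group can be considered.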
This is certainly how the GUI presents things to the user - since the GUI
combines the LINK and IP NCUs into a single representation (that's what
a user expects) - so we view a specific device as not fully available
unless both the LINK and IP NCUs are ready - it certainly would be good
if nwamd considered this too...

>
> We do need to allow time for this to happen, though. So I suspect
> we still need something like the dhcp_wait_time value (which probably
> needs to be a tunable property) to bound the time we'll wait on the
> link/interface pair to become online. If after the timeout one or
> the other is still in the offline* state, we should leave it there,
> but move on to the next priority group and start trying to bring up
> links there.

I agree whole-heartedly about this approach.

As for the time to allow - why does it need to be any different to the
dhcp_wait_time value? Surely this is the same thing in the end, or is it
that this value is too long for this usage?

If it's the latter, then it would make sense (and I believe this is the
impl, but I'm not 100% sure) that all the interfaces in a group are
brought up in parallel so that the switch between groups is as short as
possible, i.e. not some multiple of the timeout value.

Thanks,

Darren.
