Alan Maguire wrote:
> Anurag S. Maskey wrote:
>>
>>> Stepping back a bit, I think our big concern
>>> is ending up in the wrong state - we're processing
>>> event A, and as a consequence move into
>>> state X, but meanwhile sitting in the event
>>> queue is a state change event moving us
>>> into state Y. Have I got this right?
>> Yep, that's exactly right. Furthermore, state X may not be reachable
>> from state Y, which I think is the main problem.
>>
>>> Could someone provide an example of the sort
>>> of problem that can occur - I'm a bit confused
>>> I'm afraid, and a concrete example might help.
>>> Thanks!
>> 11103 is sort of an example of this. The ncu is already online.
>> When dhcp times out, we originally moved to offline*. My fix checks
>> to make sure the ncu is not already online. If it is online, don't
>> create the state change event, otherwise create the event that
>> changes to offline*.
>>
>> This fix is still not complete. The ncu could be in offline* state
>> and the online state event could be in the queue. In this case, the
>> timed out event is still enqueued. The NCU changes to online and
>> then offline* state.
>>
> Okay, got it, thanks! So in this case, a solution might
> be (and I think this is what you were suggesting as
> a more long-term solution) to validate the state change
> when we dequeue and consume the state change event
> (rather than use the current state to determine whether
> we enqueue the state change event). So in this case
> specifically, we'd enqueue the "offline*/dhcp timed out"
> state change unconditionally, but reject it in
> nwamd_ncu_handle_state_event() if we are already
> online as an invalid state transition.

Right. This keeps all the logic of valid transitions in one place.
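
To make the dequeue-time check concrete, here is a minimal sketch in C. It is not the nwamd code: the types and the names transition_allowed() and handle_state_event() are hypothetical, the state set is cut down to three states, and the only rule encoded is the 11103 case above (a DHCP timeout must not pull an online NCU back to offline*).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states; the real nwamd state set is larger. */
typedef enum {
	ST_OFFLINE,
	ST_OFFLINE_STAR,	/* stands in for "offline*" */
	ST_ONLINE
} ncu_state_t;

/* Why a state change was requested (hypothetical reasons). */
typedef enum {
	WHY_DHCP_TIMED_OUT,
	WHY_ADDR_ACQUIRED,
	WHY_LINK_DOWN
} reason_t;

/* A queued request to move to a target state. */
typedef struct {
	ncu_state_t target;
	reason_t why;
} state_event_t;

typedef struct {
	ncu_state_t state;
} ncu_t;

/*
 * The one place that decides whether a transition is valid.
 * Assumed rule: once online, a DHCP timeout is stale and is rejected.
 */
static bool
transition_allowed(ncu_state_t cur, const state_event_t *ev)
{
	if (cur == ST_ONLINE && ev->target == ST_OFFLINE_STAR &&
	    ev->why == WHY_DHCP_TIMED_OUT)
		return (false);
	return (true);
}

/*
 * Called when a state event is dequeued. The event was enqueued
 * unconditionally; validity is judged against the state at this point.
 * Returns true if the state event was applied.
 */
static bool
handle_state_event(ncu_t *ncu, const state_event_t *ev)
{
	assert(ncu != NULL && ev != NULL);
	if (!transition_allowed(ncu->state, ev))
		return (false);		/* drop the invalid transition */
	ncu->state = ev->target;
	return (true);
}
```

With the NCU in offline* and the queue holding "online" followed by "offline*/dhcp timed out", the first event is applied and the second is dropped, so the NCU stays online.
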
>
> I think that would work in this case - what worries
> me though are situations where we enqueue
> state change X if in state A or state change Y
> if in state B. So in other words, the actual desired
> state in the state change event we enqueue is
> dependent on the state we are in. I think that type
> of scenario is more complex, and I think there's
> probably a few examples of this sort of thing
> scattered around the nwamd code.

Hmm ... what if the actual events/triggers were passed to the state
machine, rather than the state events? These would be things like
IFF_UP, IFF_DOWN, DHCP_TIMED_OUT, NEWADDR, DELADDR, etc. (basically the
same triggers that currently lead to the nwamd_object_set_state()
calls). Then the state machine can handle the issue you mentioned
above: it checks the current state and the event/trigger, and then
transitions to either state A or state B.

Anurag

>
> Michael's suggestion to make state changes
> immediate might help localize state change
> processing in time a bit more closely. In practical
> terms, this could be implemented by designating
> state change events as prioritized - they still
> get added to the event queue in order, but
> they're added to the event queue ahead of other
> types of event.
>
> I think it would help in evaluating which of these
> approaches (and I can see arguments for doing
> both, and I don't even think they're mutually exclusive) we utilize if
> we could come up with a rough rule
> of thumb which helps evaluate possibly problematic
> state changes like the one above. I've tried to
> look at all the cases where we call
> nwamd_object_set_state() (there's 50 or so)
> but I keep getting bogged down in the details.
> What do you think?
>
> Alan
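
The trigger-based idea in Anurag's reply (pass IFF_UP, DHCP_TIMED_OUT and so on to the state machine, and let it choose the target state from the current state) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the TRIG_ names only mirror the triggers listed in the mail, the three-state set is simplified, and the transition rules are invented rather than taken from nwamd.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified states; not the real nwamd state set. */
typedef enum {
	ST_OFFLINE,
	ST_OFFLINE_STAR,	/* stands in for "offline*" */
	ST_ONLINE
} ncu_state_t;

/* Raw triggers queued in place of precomputed state changes. */
typedef enum {
	TRIG_IFF_UP,
	TRIG_IFF_DOWN,
	TRIG_DHCP_TIMED_OUT,
	TRIG_NEWADDR,
	TRIG_DELADDR
} trigger_t;

/*
 * The target state depends on both the trigger and the state at
 * dequeue time. The rules here are illustrative assumptions only.
 */
static ncu_state_t
next_state(ncu_state_t cur, trigger_t trig)
{
	switch (trig) {
	case TRIG_IFF_UP:
		/* Link up: wait for an address unless already online. */
		return ((cur == ST_ONLINE) ? ST_ONLINE : ST_OFFLINE_STAR);
	case TRIG_IFF_DOWN:
		return (ST_OFFLINE);
	case TRIG_NEWADDR:
		return (ST_ONLINE);
	case TRIG_DELADDR:
		/* Losing an address only matters if we were online. */
		return ((cur == ST_ONLINE) ? ST_OFFLINE_STAR : cur);
	case TRIG_DHCP_TIMED_OUT:
		/* A timeout is ignored once the NCU is online. */
		return ((cur == ST_ONLINE) ? ST_ONLINE : ST_OFFLINE_STAR);
	}
	return (cur);
}

/* Feed a queue of triggers through the machine in order. */
static ncu_state_t
run_triggers(ncu_state_t start, const trigger_t *triggers, size_t n)
{
	ncu_state_t cur = start;

	assert(triggers != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		cur = next_state(cur, triggers[i]);
	return (cur);
}
```

Under these assumed rules, an NCU in offline* ends up online whether NEWADDR is queued before or after DHCP_TIMED_OUT, because the enqueuer never has to guess the state it will be consumed in.
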
