On Thu, 24 Sep 2009 15:41:25 -0400
"Anurag S. Maskey" <Anurag.Maskey at Sun.COM> wrote:

> 
> 
> Michael Hunter wrote:
> >>>
> >>> That still doesn't answer how you know what is in the event stream that
> >>> hasn't been processed that doesn't take the ncu down.  You are
> >>> comparing information from the state now to determine how to process an
> >>> event that will be operating on a possibly different state.
> >>>   
> >>>       
> >> Isn't this true for all calls to nwamd_object_set_state() and 
> >> nwamd_object_set_state_timed() calls? None of these calls care for 
> >> what's already in the event queue. Some of the abnormalities that we 
> >> have seen in the past are related to this. We transition to a state 
> >> without actually caring for what state we are currently in.
> >>     
> >
> > s/in/in or will be in/
> >
> > Correct.  That is bad.  By mixing things we do immediately with things
> > that we queue we act on things without having processed all the
> > information.
> >
> > The problem here is that we could have an ncu down in the event queue
> > to be processed which would mean that we really want to transition to
> > off*/TOd.  Your change will break in that case.
> >   
> I am not introducing the brokenness. It already existed; in fact, my 
> changes slightly reduce it because if the ncu is online and there is an 
> offline*/TOd in the event queue, the state is not changed (before my 
> fix, the state was being changed anyway).

Whether you slightly fix it or just move it around is debatable.  But
neither is acceptable.

> 
> You are leading me to think that all our state machines are broken 
> because we never care about what is in the queue. Every 
> nwamd_object_set_state() is questionable because by the time that 
> particular state event is processed, the world may be different than 
> when the state event was created.

I was thinking about this some more.  nwamd_object_set_state()
shouldn't be an event.  It should act directly on the object.  The
things that are events should be external state changes we receive and
keep ordered.  Decisions we make based on those events should take
effect immediately.
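
A rough C sketch of that split, to make it concrete.  Every name here
(ext_event_t, object_t, event_queue_t, queue_push, queue_drain,
object_set_state) is hypothetical and not the real nwamd code; the only
point is that external events are queued in arrival order while a state
decision is written to the object directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the real nwamd definitions. */
typedef enum { EV_LINK_UP, EV_LINK_DOWN } ext_event_t;
typedef enum { ST_OFFLINE, ST_ONLINE } obj_state_t;

typedef struct {
	obj_state_t state;
} object_t;

#define	QUEUE_MAX	16

typedef struct {
	ext_event_t ev[QUEUE_MAX];
	size_t head;
	size_t len;
} event_queue_t;

/* Only external events are enqueued, and they keep their arrival order. */
void
queue_push(event_queue_t *q, ext_event_t e)
{
	assert(q->len < QUEUE_MAX);
	q->ev[(q->head + q->len) % QUEUE_MAX] = e;
	q->len++;
}

/* A decision acts on the object right away; it is never queued. */
void
object_set_state(object_t *o, obj_state_t s)
{
	o->state = s;
}

/*
 * Drain events oldest first.  Each decision is applied before the next
 * event is looked at, so no decision is made against a stale state.
 */
void
queue_drain(event_queue_t *q, object_t *o)
{
	while (q->len > 0) {
		ext_event_t e = q->ev[q->head];

		q->head = (q->head + 1) % QUEUE_MAX;
		q->len--;
		object_set_state(o, e == EV_LINK_UP ? ST_ONLINE : ST_OFFLINE);
	}
}
```

With a link up followed by a link down in the queue, draining leaves
the object offline, because the later external event is the last one
acted on.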

> 
> > Well, in the end it is all an FSM ;)  Choosing to be more formal about
> > our core logic because it is complex in a structured kind of way would
> > help.
> >
> > FWIW I don't know how to fix the bug in your code off the top of my
> > head.  You might be able to argue it will never happen or you can
> > detect when it does/will happen.  Or you might be able to stash some
> > state away that you check in other places.  The first seems unlikely.
> > The second seems hard to get right and ultimately leads us to belts and
> > braces types of complexity.
> >   
> "It will never happen" is not the case because no one knows what will 
> happen to the links. Stashing state may work, but it will require every 
> state transition to check the stashed state, in which case we can write 
> what I call the "correct" solution.

We need to make the correct decision based on the information we've
received so far and effect changes based on that information.

> 
> I think the solution here is for the state event handler (i.e., all 
> nwamd_*_handle_state_event() functions) to make sure that the new state 
> and aux state are reachable from the current state and aux state 
> according to the state transitions that are possible (we'll have to 
> create a complete state diagram to achieve this). The combination of 
> state and aux state increases the number of checks, but there's no way 
> around it. I don't know how feasible this change is at this point.

I'm not sure I quite follow.
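
If the idea is a reachability check in each state event handler, it
might look roughly like the C sketch below.  The state names, aux
values, table contents and function names are all made up for
illustration; the table is nowhere near a complete state diagram, and
handle_state_event() is not one of the real nwamd_*_handle_state_event()
functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states and aux states; the real sets are larger. */
typedef enum {
	S_OFFLINE,
	S_OFFLINE_TO_ONLINE,
	S_ONLINE,
	S_ONLINE_TO_OFFLINE
} state_t;

typedef enum { AUX_NONE, AUX_DOWN, AUX_UP } aux_t;

typedef struct {
	state_t state;
	aux_t aux;
} full_state_t;

typedef struct {
	full_state_t from;
	full_state_t to;
} transition_t;

/* Assumed, deliberately incomplete table of legal transitions. */
static const transition_t allowed[] = {
	{ { S_OFFLINE, AUX_DOWN },           { S_OFFLINE_TO_ONLINE, AUX_NONE } },
	{ { S_OFFLINE_TO_ONLINE, AUX_NONE }, { S_ONLINE, AUX_UP } },
	{ { S_OFFLINE_TO_ONLINE, AUX_NONE }, { S_OFFLINE, AUX_DOWN } },
	{ { S_ONLINE, AUX_UP },              { S_ONLINE_TO_OFFLINE, AUX_NONE } },
	{ { S_ONLINE_TO_OFFLINE, AUX_NONE }, { S_OFFLINE, AUX_DOWN } },
};

/* True if (from.state, from.aux) -> (to.state, to.aux) is in the table. */
bool
transition_allowed(full_state_t from, full_state_t to)
{
	size_t i;

	for (i = 0; i < sizeof (allowed) / sizeof (allowed[0]); i++) {
		if (allowed[i].from.state == from.state &&
		    allowed[i].from.aux == from.aux &&
		    allowed[i].to.state == to.state &&
		    allowed[i].to.aux == to.aux)
			return (true);
	}
	return (false);
}

/* Apply a requested state only if it is reachable from the current one. */
bool
handle_state_event(full_state_t *cur, full_state_t next)
{
	if (!transition_allowed(*cur, next))
		return (false);	/* stale or impossible request: ignore it */
	*cur = next;
	return (true);
}
```

A check like this only rejects requests that are illegal from the
current state; it says nothing about events still waiting in the queue.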

                mph

> 
> Anurag
> 
