Re: [networking-discuss] nemo mac_notify change (webrev)

Garrett D'Amore Fri, 07 Sep 2007 12:05:22 -0700

On Fri, 2007-09-07 at 11:52 -0700, Thirumalai Srinivasan wrote:
> In your scheme what is the point in time at which the framework can be sure
> that a mac endpoint has been quiesced and there won't be any further
> upcalls/notifies from the driver on this endpoint ? You seem to have 
> postponed
> this point of time right up to driver's detach -> mac_destroy(). Whether and
> when exactly the driver's detach routine is called is decided by the
> DDI framework.
> 
> Separating mac_unregister from mac_destroy is not a big issue, but the
> framework needs a well defined way by which it can quiesce a mac endpoint.
> Note that with polling and direct call features, the driver and the 
> framework
> have opaque pointers to each other's data structures (such as rings) and we
> need a positive mechanism for quiescing a mac endpoint to be sure that there
> isn't a pending upcall or downcall thread with a stale pointer.
> 
> For example when all mac clients have closed a mac end point, the
> mac framework would like to quiesce the end point and reclaim / cleanup
> anything associated with that mac endpoint. The last client close -> 
> mac_stop()
> -> xxx_m_stop is a point where this synchronization can happen.


I don't think m_stop is the right time to do this, because the GLDv3
driver still has a reference to the mac_impl_t.

When mac_unregister succeeds, then it is known that the upper layers are
quiesced.  A simple flag (mip->mi_disabled) can indicate that up-calls
from the driver can be ignored until the driver has destroyed the mac.

I think the problem here is that you need to quiesce both upcalls from
below, and downcalls from above.  You can't do it all at once.

> 
>  From the driver side, it has to have the internal mechanisms to quiesce 
> itself
> in response to a request from above say via the xxx_m_stop.

There is a level of quiesce associated with m_stop, but there is a
secondary level of resource destruction associated with detach.  m_stop
does *not* mean that you won't see an m_start happen again almost
instantaneously (in the driver), so its not really a good idea for the
driver to rely too much on m_stop as a quiesce against detatch.

*UNLESS* there is a guarantee that the framework will not call m_start()
between the time the driver's detach() entry point is called and the
time mac_unregister() completes.  Right now I think that there is no
such guarantee.

And there is still the problem, to my mind, of a stream that is holding
the instance (mac_open) but has not yet called mac_start.  In this case,
what happens if mac_unregister is called?  It fails.  But if we've torn
down the resources, then we have to be prepared to reinstate them on
failure.  Its ugly.

>  If it is hard to
> do a synchronous xxx_m_stop then may be a notification when it is completed
> is another possibility. From the framework side, it must make sure that
> xxx_m_* entry points are single threaded on a given mac endpoint. So we
> won't have a xxx_m_start concurrently coming down with a xxx_m_stop.

I'm not worried about that.  I'm worried about async races between the
driver's detach() and m_stop routines.

It seems a lot of heartache here just to avoid an extra function call in
the drivers.  I don't understand the objection to just doing the simple,
and correct, thing of separating out the mac unregistration from the mac
destruction.

        -- Garrett
> 
> Thirumalai
>  
> Garrett D'Amore wrote:
> 
> >On Fri, 2007-09-07 at 09:11 -0700, Thirumalai Srinivasan wrote:
> >  
> >
> >>>There are further complexities, because the driver cannot know if any
> >>>upstream callers are still busy until it calls mac_unregister().  And
> >>>then, because mac_unregister *also* frees the structure, its too late.
> >>> 
> >>>
> >>>      
> >>>
> >>If there are indeed upstream callers, mi_ref is non-zero and 
> >>mac_register will fail. So it should be sufficient, if the driver 
> >>quiesces the chip, and become single threaded on the mac endpoint before 
> >>calling mac_unregister to avoid the race.
> >>    
> >>
> >
> >You're missing a big point.  That is that the effort to become single
> >threaded, *before* this point, may be a significant PITA, particularly
> >since mac_unregister may fail.
> >
> >  
> >
> >> Another question, what does 
> >>xxx_m_stop really mean ? Why can't the driver quiesce everything at this 
> >>point itself ? If mac_unregister() succeeds, it implies that an 
> >>xxx_m_stop has happened before that. It doesn't seem right if we can 
> >>still have upcall threads and notifications going on even after 
> >>mac_stop() has finished successfully.
> >>    
> >>
> >
> >Well, perhaps mac_stop says that we shouldn't upcall.  I'm not sure.  If
> >so, then this would solve the race.  Right now I *think* pretty much
> >most drivers do shutdown the chip (though I'm not sure all of them do)
> >in mac_stop, to the point of even dropping the link.
> >
> >I do know that there are cyclics, etc. that some of these drivers
> >register in attach(), and don't tear down until detach(), and that they
> >would *prefer* to do that after unregistration is complete.
> >
> >One question here... can mac_start() (or any other mac entry point) be
> >called *while* running the DDI detach code?  It isn't clear to me that
> >DDI detach is single threaded from MAC's view until after mac_unregister
> >is called.
> >
> >In other words, imagine this... 
> >
> >
> >m_stop() {
> >... statep->STARTED = B_FALSE;
> >}
> >
> >m_start() {
> >... statep->STARTED = B_TRUE;
> >}
> >
> >detach() {
> >...
> >     if (statep->STARTED)
> >             return (DDI_FAILURE);
> >     /* race point */
> >     mac_unregister(...);
> >}
> >
> >Now what happens if, at "race point" above, mac_start() is called?
> >
> >One solution to this is to add some complexity to the in the drivers,
> >like this:
> >
> >m_start() {
> >     mutex_enter(&statep->LOCK);
> >     if (statep->DETACHING) {
> >             return B_FALSE;
> >     }
> >     statep->STARTED = B_TRUE;
> >     mutex_exit(&statep->LOCK);
> >}
> >
> >detach() {
> >     mutex_enter(&statep->LOCK);
> >     if (statep->STARTED) {
> >             mutex_exit(&statep->LOCK);
> >             return (DDI_FAILURE);
> >     }
> >     statep->DETACHING = B_TRUE;
> >     mutex_exit(&statep->LOCK);
> >     return (DDI_SUCCESS);
> >}
> >
> >So that adds complexity to the driver.  Furthermore, it has a potential
> >semantic change that I'm uncomfortable with.  Specifically, it isn't
> >clear to me that a stream in DL_ATTACHED, but not DL_BOUND, state, will
> >have statep->STARTED set to true.
> >
> >In otherwords, does the very act of calling mac_open() with for a node
> >(which holds the device busy from DR) *also* arrange to call
> >mac_start()?  And does it do so atomically, so that there is no race
> >where DDI detach can be called between mac_open and mac_start?
> >
> >All these questions of atomicity are confusing, and lead to a
> >substantial increase in complexity.  I still maintain the simplest
> >solution is to separate mac destruction from unregistration from the
> >framework.
> >
> >Here's an sample implementation that I think is cleaner:
> >
> >mac_start() {
> >     ... /* no need to set or check any flags */
> >}
> >
> >mac_stop() {
> >     /* no need to set or check any flags */
> >}
> >
> >detach() {
> >     if (mac_unregister(mh) == B_FALSE)
> >             return (DDI_FAILURE);
> >     /* do any thing else required to go single threaded, etc. */
> >     mac_destroy(mh);
> >}
> >
> >     -- Garrett
> >  
> >
> >>Thirumalai
> >>
> >>Garrett D'Amore wrote:
> >>
> >>    
> >>
> >>>On Fri, 2007-09-07 at 08:25 -0400, Sebastien Roy wrote:
> >>> 
> >>>
> >>>      
> >>>
> >>>>Paul Durrant wrote:
> >>>>   
> >>>>
> >>>>        
> >>>>
> >>>>>On 06/09/07, Garrett D'Amore <[EMAIL PROTECTED]> wrote:
> >>>>>     
> >>>>>
> >>>>>          
> >>>>>
> >>>>>>(Obviously mac_alloc() here would return "mac_t *", but under the covers
> >>>>>>it is really allocating a whole mac_impl_t.  And mac_free() would be
> >>>>>>free()'ing a a whole mac_impl_t, though it is taking a mac_t as its
> >>>>>>argument.)
> >>>>>>
> >>>>>>If we want to pursue this course of action, I can put together a webrev
> >>>>>>with the changes... it will take me a while because there are a *lot* of
> >>>>>>drivers to change... but I want to get this change in *before* we make
> >>>>>>GLDv3 public.
> >>>>>>       
> >>>>>>
> >>>>>>            
> >>>>>>
> >>>>>FWIW, I find the current mac_register_t scheme ugly; I don't see why a
> >>>>>statically defined mac_register_t cannot simply be passed into
> >>>>>mac_register(). I don't see why we need separate allocator functions
> >>>>>there.
> >>>>>     
> >>>>>
> >>>>>          
> >>>>>
> >>>>It's done this way so that the structure can be expanded without 
> >>>>affecting binary compatibility.  We can add a new field to the structure, 
> >>>>and the system won't crash and burn when an existing driver registers. 
> >>>>The framework will allocate an appropriately sized structure, but the 
> >>>>older driver simply won't fill in the new fields it doesn't know about.
> >>>>
> >>>>   
> >>>>
> >>>>        
> >>>>
> >>>>>I also don't see that you can't break this race using reference
> >>>>>counting. There's no need for mac_unregister() to actually free the
> >>>>>mac_impl_t; it could just drop a ref. and the last man out frees.
> >>>>>Let's try not to complicate the interface because of implementation
> >>>>>artifacts.
> >>>>>     
> >>>>>
> >>>>>          
> >>>>>
> >>>>I don't believe a reference counting scheme makes a difference in the 
> >>>>scenario that Thiru described.  He's describing a scenario where the 
> >>>>mac_impl_t is freed by mac_unregister() before another thread gets a 
> >>>>chance to examine that mac_impl_t to verify its contents (either for a 
> >>>>flag or a reference count, it doesn't matter, the end result is still the 
> >>>>same; boom!)
> >>>>
> >>>>IMO, this case is borderline a driver bug.  The driver has called some 
> >>>>MAC function (which hasn't returned yet), and decides to call 
> >>>>mac_unregister().  mac_unregister() is documented to invalidate the MAC 
> >>>>handle, so drivers shouldn't knowingly call mac_unregister() while using 
> >>>>that handle, or use the handle during or after calls to mac_unregister().
> >>>>
> >>>>To me, this is a similar problem to calling free() on a buffer that an 
> >>>>application knows that it's still using...  Synchronization on the 
> >>>>allocation and deallocation of that buffer can't be done by the memory 
> >>>>framework, it has to be done by the application.  The framework can't 
> >>>>possibly know what the application is going to do with that buffer once 
> >>>>the free() function has been called.
> >>>>   
> >>>>
> >>>>        
> >>>>
> >>>There are further complexities, because the driver cannot know if any
> >>>upstream callers are still busy until it calls mac_unregister().  And
> >>>then, because mac_unregister *also* frees the structure, its too late.
> >>>
> >>>Forcing drivers to work around this will incur substantial code
> >>>complexity across every Nemo driver.  (Think an extra lock, flag, and
> >>>condition variable used with *every* place that a driver calls up into
> >>>Nemo.)  I think that is a poor decision.
> >>>
> >>>   -- Garrett
> >>>
> >>>
> >>> 
> >>>
> >>>      
> >>>
> >
> >  
> >
> 

_______________________________________________
networking-discuss mailing list
[email protected]

Re: [networking-discuss] nemo mac_notify change (webrev)

Reply via email to