On Thu, Jan 12, 2006 at 02:29:52PM +1100, Keith Owens wrote:
> John Hesterberg (on Wed, 11 Jan 2006 15:39:10 -0600) wrote:
> >On Wed, Jan 11, 2006 at 01:02:10PM -0800, Matt Helsley wrote:
> >>    Have you looked at Alan Stern's notifier chain fix patch? Could that be
> >> used in task_notify?
> >
> >I have two concerns about an all-tasks notification interface.
> >First, we want this to scale, so don't want more global locks.
> >One unique part of the task notify is that it doesn't use locks.
> 
> Neither does Alan Stern's atomic notifier chain.  Indeed it cannot use
> locks, because the atomic notifier chains can be called from anywhere,
> including non maskable interrupts.  The downside is that Alan's atomic
> notifier chains require RCU.
> 
> An alternative patch that requires no locks and does not even require
> RCU is in http://marc.theaimsgroup.com/?l=linux-kernel&m=113392370322545&w=2

Interesting!  I missed this the first time around...

But doesn't notifier_call_chain_lockfree() need to either disable
preemption or use atomic operations to update notifier_chain_lockfree_inuse[]
in order to avoid problems with preemption?  If I understand the
code, one such problem could be caused by the following sequence
of events:

1.      Task A enters notifier_call_chain_lockfree(), gets a copy
        of the current CPU in local variable "cpu", snapshots the
        (initially zero) value of notifier_chain_lockfree_inuse[cpu]
        into local variable "nested", then is preempted.

2.      Task B enters notifier_call_chain_lockfree(), gets a copy
        of the current CPU in local variable "cpu", snapshots the
        (still zero) value of notifier_chain_lockfree_inuse[cpu]
        into local variable "nested", sets the value of
        notifier_chain_lockfree_inuse[cpu] to 1.

3.      Task A runs again, perhaps because Task B's priority dropped,
        perhaps because some other CPU became available.  It also
        sets the value of notifier_chain_lockfree_inuse[cpu] to 1.
        It then gains a reference to a notifier_block (call it Fred).

4.      Task B completes running through the notifier chain and sets
        notifier_chain_lockfree_inuse[cpu] = nested, which is zero.

5.      Task C invokes notifier_chain_unregister_lockfree() in order
        to remove Fred.  Task C finds all notifier_chain_lockfree_inuse[cpu]
        entries equal to zero, so removes Fred while Task A is still
        referencing it, which I believe is exactly the situation the
        scheme was meant to prevent.
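
To make the interleaving above concrete, here is a minimal userspace
sketch that replays steps 1-5 in a fixed order.  It is a model built only
from the description above, not the code from the patch: the array
inuse[], the struct task, the helper names, and NR_CPUS = 4 are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* Stands in for notifier_chain_lockfree_inuse[] (model only). */
static int inuse[NR_CPUS];

/* Per-task locals, named as in the description above. */
struct task {
	int cpu;
	int nested;
};

/* Entry, first half: pick the CPU and snapshot its in-use flag. */
static void call_chain_enter(struct task *t, int cpu)
{
	t->cpu = cpu;
	t->nested = inuse[cpu];
}

/* Entry, second half: mark the chain in use with a plain store. */
static void call_chain_mark(struct task *t)
{
	inuse[t->cpu] = 1;
}

/* Exit: restore the value snapshotted on entry. */
static void call_chain_exit(struct task *t)
{
	inuse[t->cpu] = t->nested;
}

/* What the unregister path would check: every entry equal to zero? */
static bool chain_looks_idle(void)
{
	for (int i = 0; i < NR_CPUS; i++)
		if (inuse[i])
			return false;
	return true;
}

/*
 * Replays steps 1-5.  Returns true if the chain looks idle to Task C
 * while Task A still holds a reference to Fred.
 */
static bool replay_race(void)
{
	struct task a, b;
	bool a_holds_fred;

	for (int i = 0; i < NR_CPUS; i++)
		inuse[i] = 0;

	call_chain_enter(&a, 0);	/* step 1: A snapshots 0, is preempted */
	call_chain_enter(&b, 0);	/* step 2: B snapshots 0 ...           */
	call_chain_mark(&b);		/*         ... and sets the flag to 1  */
	call_chain_mark(&a);		/* step 3: A also sets the flag to 1   */
	assert(inuse[0] == 1);
	a_holds_fred = true;		/*         A now references Fred       */
	call_chain_exit(&b);		/* step 4: B restores its snapshot (0) */

	return a_holds_fred && chain_looks_idle();	/* step 5 */
}
```

In this model replay_race() returns true: Task B's restore wipes out the
flag that Task A is still relying on, so the unregister path would see
nothing in use.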

If one updates notifier_chain_lockfree_inuse[cpu] using atomics,
then one could imagine a sequence of calls to notifier_call_chain_lockfree()
and preemptions that prevented one of the notifier_chain_lockfree_inuse[]
elements from ever reaching zero (though maybe this is being overly
paranoid).  If one disables preemption, then latency might become
excessive.
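
For comparison, the atomic variant mentioned above might look roughly
like the following userspace sketch, which swaps the snapshot/restore
flag for a per-CPU counter updated with C11 atomics.  This is my
assumption about what "using atomics" could mean, not anything taken
from the patch; inuse_count[], the helper names, and MODEL_NR_CPUS are
hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for this model */

/* Per-CPU count of tasks currently walking the chain (model only). */
static atomic_int inuse_count[MODEL_NR_CPUS];

/* Entry: atomically bump the counter for the CPU we started on. */
static int counted_enter(int cpu)
{
	atomic_fetch_add(&inuse_count[cpu], 1);
	return cpu;
}

/* Exit: drop the count on the CPU recorded at entry. */
static void counted_exit(int cpu)
{
	atomic_fetch_sub(&inuse_count[cpu], 1);
}

/* Unregister-side check: are all counters zero? */
static bool counted_chain_idle(void)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		if (atomic_load(&inuse_count[i]) != 0)
			return false;
	return true;
}

/*
 * Replays the same interleaving as before with counters.  Returns true
 * if the chain is seen as busy while Task A is still inside and as idle
 * only after Task A has left.
 */
static bool replay_with_counters(void)
{
	bool idle_while_a_inside;
	int a_cpu, b_cpu;

	for (int i = 0; i < MODEL_NR_CPUS; i++)
		atomic_store(&inuse_count[i], 0);

	a_cpu = counted_enter(0);	/* Task A enters, then is preempted */
	b_cpu = counted_enter(0);	/* Task B enters on the same CPU    */
	counted_exit(b_cpu);		/* Task B finishes                  */
	idle_while_a_inside = counted_chain_idle();
	counted_exit(a_cpu);		/* Task A finally finishes          */

	return !idle_while_a_inside && counted_chain_idle();
}
```

The counter no longer loses Task A's entry, but it also shows the other
concern: the count only returns to zero once every overlapping caller
has exited, so a steady stream of preempted callers could hold an
element nonzero for a long time.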

So what am I missing?

                                                Thanx, Paul


_______________________________________________
ckrm-tech mailing list
https://lists.sourceforge.net/lists/listinfo/ckrm-tech
