On Fri, Feb 03, 2012 at 08:46:50AM -0800, Bhanu Prakash Gollapudi wrote:
> On 2/3/2012 3:57 AM, Neil Horman wrote:
> >On Thu, Feb 02, 2012 at 04:59:54PM -0800, Bhanu Prakash Gollapudi wrote:
> >>On 2/2/2012 1:03 PM, Neil Horman wrote:
> >>>Fix it by adding an ema_lock to list accesses.  Make it an rcu lock so we 
> >>>don't
> >>>get Warnings when we try to tx a frame with a spinlock held
> 
> 
> >>>
> >>
> >>
> >>>  /**
> >>>@@ -2166,6 +2182,7 @@ struct fc_exch_mgr_anchor *fc_exch_mgr_add(struct 
> >>>fc_lport *lport,
> >>>                                      bool (*match)(struct fc_frame *))
> >>>  {
> >>>   struct fc_exch_mgr_anchor *ema;
> >>>+  unsigned long flags;
> >>>
> >>>   ema = kmalloc(sizeof(*ema), GFP_ATOMIC);
> >>>   if (!ema)
> >>>@@ -2174,8 +2191,10 @@ struct fc_exch_mgr_anchor *fc_exch_mgr_add(struct 
> >>>fc_lport *lport,
> >>>   ema->mp = mp;
> >>>   ema->match = match;
> >>>   /* add EM anchor to EM anchors list */
> >>>-  list_add_tail(&ema->ema_list,&lport->ema_list);
> >>>+  spin_lock_irqsave(&lport->ema_lock, flags);
> >>
> >>Neil, why do we need spin_lock_irqsave() for ema_lock?
> >>spin_lock_bh() is good enough I think. Probably you're getting the
> >>warning because you are using irqsave.
> >>
> >spin_lock_bh might be sufficient; I figured we could start with irqsave and
> >convert individual paths to bh as they were cleared.  There's really very
> >little performance difference, and both are only held for very short periods
> >while the list is added to/removed from.  The longer traversals are
> >rcu_read_lock protected.
> >
> >As for the warning, there is none; you're confusing this patch with my
> >earlier error, in which I held an exchange's ex_lock with irqs disabled.
> >This lock protects the ema_list from these potentially concurrent paths:
> >
> >1) traversals of the list in fc_find_ema from softirq context in the receive
> >path for fcoe (called from
> >fcoe_rcv->fcoe_recv_frame->fc_exch_recv->fc_find_ema)
> >
> >2) traversals of the list in fc_exch_alloc, fc_seq_assign, and fc_exch_lookup
> >that may occur in the receive path, running in softirq context on other cpus
> >in parallel
> >
> >3) list_add and list_del operations that can be initiated via userspace calls
> >through sysfs.
> >
> >There's no warning, because the lock is only held during the short period
> >when we add or remove an entry.  The read side is protected via
> >rcu_read_lock.
> 
> 
> >>> Fix it by adding an ema_lock to list accesses.  Make it an rcu
> lock so we don't
> >>> get Warnings when we try to tx a frame with a spinlock held
> 
> Is this a different warning you were referring to? If it is the same
> one belonging to the other patch, the above statement can be removed
> to avoid confusion.
> 
> Thanks.
> 
No, this patch has nothing to do with the warning noted in the other patch.
This is just a fix for the observation that it's possible to concurrently
access the ema_list of a port from two or more contexts, leading to list
corruption.  That's all.  We discussed this here:
https://bugzilla.redhat.com/show_bug.cgi?id=771251

I made note of the problem in comment 66 and Kiran concurred with it in
comment 82, I expanded on it in comment 86, and in comment 87 you seemed to
understand that we were talking about two different things.  I'm sorry if I
wasn't clear, but this patch has nothing to do with the previously mentioned
warning.  That was an error attributed to some changes that erroneously got
left behind while I was tracking down the workqueue deadlock; this is
something completely different.
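
To recap the locking scheme in question, the pattern looks roughly like this
(a sketch of the intent, not the literal patch; the exact function bodies and
the choice of the _rcu list primitives on the writer side are my shorthand
here, the quoted hunks above only show part of the change):

```c
/* Writers (fc_exch_mgr_add/fc_exch_mgr_del, reachable from sysfs in
 * process context) serialize against each other with ema_lock and use the
 * RCU list primitives so concurrent readers see a consistent list.
 */
spin_lock_irqsave(&lport->ema_lock, flags);
list_add_tail_rcu(&ema->ema_list, &lport->ema_list);
spin_unlock_irqrestore(&lport->ema_lock, flags);

/* Removal: unlink under the lock, then wait out in-flight readers
 * before freeing the anchor.
 */
spin_lock_irqsave(&lport->ema_lock, flags);
list_del_rcu(&ema->ema_list);
spin_unlock_irqrestore(&lport->ema_lock, flags);
synchronize_rcu();
kfree(ema);

/* Readers (the softirq receive path, e.g. fc_find_ema) never take the
 * spinlock; rcu_read_lock is sufficient for a safe traversal.
 */
rcu_read_lock();
list_for_each_entry_rcu(ema, &lport->ema_list, ema_list) {
	if (ema->mp == mp)
		break;
}
rcu_read_unlock();
```

Since readers never block on the writers' lock, the short irqsave critical
sections on the add/del paths cost essentially nothing in the hot receive
path.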
Thanks
Neil

_______________________________________________
devel mailing list
devel@open-fcoe.org
https://lists.open-fcoe.org/mailman/listinfo/devel
