Re: [Open-FCoE] [PATCH] fcoe: provide exclusion to concurrent ema_list access (v2)

Bhanu Prakash Gollapudi Fri, 03 Feb 2012 09:23:39 -0800

On 2/3/2012 9:00 AM, Neil Horman wrote:

On Fri, Feb 03, 2012 at 08:46:50AM -0800, Bhanu Prakash Gollapudi wrote:

On 2/3/2012 3:57 AM, Neil Horman wrote:

On Thu, Feb 02, 2012 at 04:59:54PM -0800, Bhanu Prakash Gollapudi wrote:

On 2/2/2012 1:03 PM, Neil Horman wrote:

Fix it by adding an ema_lock to list accesses.  Make it an rcu lock so we don't
get Warnings when we try to tx a frame with a spinlock held

  /**
@@ -2166,6 +2182,7 @@ struct fc_exch_mgr_anchor *fc_exch_mgr_add(struct 
fc_lport *lport,
                                           bool (*match)(struct fc_frame *))
  {
        struct fc_exch_mgr_anchor *ema;
+       unsigned long flags;

        ema = kmalloc(sizeof(*ema), GFP_ATOMIC);
        if (!ema)
@@ -2174,8 +2191,10 @@ struct fc_exch_mgr_anchor *fc_exch_mgr_add(struct 
fc_lport *lport,
        ema->mp = mp;
        ema->match = match;
        /* add EM anchor to EM anchors list */
-       list_add_tail(&ema->ema_list,&lport->ema_list);
+       spin_lock_irqsave(&lport->ema_lock, flags);


Neil, why do we need spin_lock_irqsave() for ema_lock?
spin_lock_bh() is good enough I think. Probably you're getting the
warning because you are using irqsave.

spin_lock_bh might be sufficient, I figured we could start with irqsave and
convert individual paths to bh as they were cleared.  Theres really very little
performance difference, and both are only held for very short periods while the
list is added to/removed from.  The longer traversals are rcu_read_lock
protected.

As for the warning, there is none, you're confusing this patch with my earlier
error in which I held and exchanges ex_lock with irqs disabled.  This lock
protects the ema_list from these potentially concurrent paths:

1) traversals of the list in fc_find_ema from softirq context in the recieve
path for fcoe (called from fcoe_rcv->fcoe_recv_frame->fc_exch_recv->fc_find_ema)

2) traversals of the list that occur during parallel bottom halves in
fc_exch_alloc, fc_seq_assign, and fc_exch_lookup that may occur in the receive
path operating in softirq on other cpus in parallel

3) list_add and list_del operations that can be intiated via userspace calls
through sysfs.

Theres no warning because the lock is only held during the short period when we
add or remove an entry.  The read side is protected via rcu_read_lock.

Fix it by adding an ema_lock to list accesses.  Make it an rcu

lock so we don't

get Warnings when we try to tx a frame with a spinlock held


Is this a different warning you were referring? If it is the same
one belonging to the other patch, the above statement can be removed
to avoid confusion.

Thanks.

No, this patch has nothing to do with the warning noted in the other patch.
This is just a fix for the observation that its possible to concurrently access
the ema_list of a port from two or more contexts, leading to list corruption.
Thats all.  We discussed this here:
https://bugzilla.redhat.com/show_bug.cgi?id=771251

I made note of the problem in comment 66 and kiran concurred with it in comment
82, I expanded on it ni comment 86, and in comment 87 you seemed to understand
that we were talking about two different things.  I'm sorry if I wasn't clear,
but this patch has nothing to do with the previously mentioned warning.  That
was an error attributed to some changes that erroneously got left behind while I
was tracking down the workqueue deadlock, this is something completely
different.
Thanks
Neil

Thanks Neil.

_______________________________________________
devel mailing list
devel@open-fcoe.org
https://lists.open-fcoe.org/mailman/listinfo/devel

Re: [Open-FCoE] [PATCH] fcoe: provide exclusion to concurrent ema_list access (v2)

Reply via email to