On Wed, Feb 08, 2012 at 05:42:12PM -0800, Vasu Dev wrote:
> On Wed, 2012-02-08 at 19:56 -0500, Neil Horman wrote:
> > On Wed, Feb 08, 2012 at 12:35:07PM -0800, Vasu Dev wrote:
> > <snip>

> > 3) We can make the rx queue flush work properly by guaranteeing that
> > fcoe_recv_frame will always be called in the context of a receiver thread.
> > That's a workable alternative, but if, as you've said, you don't want to
> > introduce additional locking, it's a lousy idea; having to call
> > spin_lock_bh/spin_unlock_bh on each received frame is an enormous overhead
> > above and beyond taking an rcu_read_lock on a list traversal.
> 
> I agree with all.
> 
I concur. I hate having to constantly lock in the fast path, but we can look at
ways to remove the rx list lock (or mitigate it) another day.  Given that, the
more I look at fcoe_recv_frame, the more problems I find with it getting called
in softirq context, so I think this is the best way to cover all the problems.
I'll have a patch for you later today.
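To illustrate the overhead argument, here is a rough kernel-style sketch of the
two approaches being weighed.  All of the names (my_port, my_ema, ema_lookup,
etc.) are illustrative stand-ins, not the real fcoe/libfc symbols, and the
fragment is not meant to compile standalone:

```
/* Costly: every received frame serializes on a single queue lock. */
static void recv_frame_locked(struct my_port *port, struct my_frame *fp)
{
	spin_lock_bh(&port->rx_lock);	/* taken once per frame */
	__handle_frame(port, fp);
	spin_unlock_bh(&port->rx_lock);
}

/*
 * Cheap for readers: rcu_read_lock() is (nearly) free on the read
 * side, while writers mutate the list with list_add_rcu()/
 * list_del_rcu() followed by synchronize_rcu().  The caller must
 * ensure the returned object outlives the lookup, e.g. via the
 * lport teardown ordering.
 */
static struct my_ema *ema_lookup(struct list_head *ema_list, u16 xid)
{
	struct my_ema *ema;

	rcu_read_lock();
	list_for_each_entry_rcu(ema, ema_list, list) {
		if (xid >= ema->min_xid && xid <= ema->max_xid) {
			rcu_read_unlock();
			return ema;
		}
	}
	rcu_read_unlock();
	return NULL;
}
```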


> > 
> > 
> > I should also point out there is precedent for what I'm doing here: the
> > lport->vports list is accessed in both the add/delete execution paths and
> > in the softirq path, and is protected by a mutex (which is a bug in and of
> > itself, given that calling fc_vport_id_lookup in softirq context will give
> > you a big might_sleep warning, or a sleeping-in-interrupt panic if the
> > mutex is already held).  The point here being, however, that you have
> > other lists that are already protected for what appear to be the exact
> > same reasons.  Although this does speak to the fact that, regardless of
> > this conversation, we should not currently ever try to call
> > fcoe_recv_frame in softirq context.  I'll take care of that tomorrow.
> > That will have the pleasant side effect of taking care of this bug as
> > well, since it will make the rx thread purge work properly in
> > fcoe_if_destroy.
> 
> Looks okay to me.  The flows I was calling out in 2) should get covered by
> that, and without having a lock here.  It will always take the hit of a
> context switch, but I guess that's much better than holding a lock.
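For anyone following along, the pattern being agreed on here — frames handled
only by a dedicated receiver thread, so the destroy path can stop the thread
and then purge whatever is left on the queue race-free — can be sketched as a
userspace pthread analogue.  All names (rx_queue, rx_thread, rx_destroy, etc.)
are hypothetical and the synchronization is deliberately simplified; this is
not the actual fcoe.ko code:

```c
#include <pthread.h>
#include <stdlib.h>

struct frame {
	struct frame *next;
};

struct rx_queue {
	pthread_mutex_t lock;
	pthread_cond_t  wake;
	struct frame   *head;
	int             stopping;
	int             processed;	/* frames handled by the rx thread */
	int             purged;		/* frames dropped at destroy time  */
};

static void rx_queue_init(struct rx_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->wake, NULL);
	q->head = NULL;
	q->stopping = 0;
	q->processed = 0;
	q->purged = 0;
}

/* Enqueue from any context (stands in for the netdev rx hook). */
static void rx_enqueue(struct rx_queue *q, struct frame *f)
{
	pthread_mutex_lock(&q->lock);
	f->next = q->head;
	q->head = f;
	pthread_cond_signal(&q->wake);
	pthread_mutex_unlock(&q->lock);
}

/* The receiver thread: the only place frames are ever processed. */
static void *rx_thread(void *arg)
{
	struct rx_queue *q = arg;

	pthread_mutex_lock(&q->lock);
	while (!q->stopping) {
		while (q->head) {
			struct frame *f = q->head;
			q->head = f->next;
			/* Drop the lock while "processing", as a real
			 * fcoe_recv_frame call would run unlocked. */
			pthread_mutex_unlock(&q->lock);
			free(f);
			pthread_mutex_lock(&q->lock);
			q->processed++;
		}
		if (!q->stopping)
			pthread_cond_wait(&q->wake, &q->lock);
	}
	pthread_mutex_unlock(&q->lock);
	return NULL;
}

/* Destroy path: stop the thread first, then purging is race-free. */
static void rx_destroy(struct rx_queue *q, pthread_t tid)
{
	pthread_mutex_lock(&q->lock);
	q->stopping = 1;
	pthread_cond_signal(&q->wake);
	pthread_mutex_unlock(&q->lock);
	pthread_join(tid, NULL);

	while (q->head) {		/* no concurrent consumer left */
		struct frame *f = q->head;
		q->head = f->next;
		free(f);
		q->purged++;
	}
}
```

Because the thread is joined before the purge loop runs, every queued frame is
accounted for exactly once, either as processed or as purged.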
> 
> >  
> > 
> > I had a bunch more comments on the rest of your thread, but seeing that
> > fcoe_recv_frame can cause lots more problems, I think this is all moot. :)
> > 
> 
> We might need the patch I sent out to avoid freeing the master lport until
> the fcoe interface cleanup is done.
> 
> > Anywho, I hope the above clarifies what the problem here is.  Basically,
> > what you think is preventing mutual access of the ema_list, isn't.  But
> > the fact that fcoe_recv_frame isn't a path that can handle softirq context
> > is the bigger problem here, and fixing that will fix the ema_list at the
> > same time.  I'll have a new patch in the morning.
> 
> Great, thanks.  I appreciate your detailed mail, and I guess we converged
> on the issues requiring a fix before the ema_list change.
> 
Thank you for putting up with me on this issue.  I'm glad we could find a way to
fix this (and several other potential problems at the same time :)). I'll have a
fix posted later today.

Regards
Neil

_______________________________________________
devel mailing list
devel@open-fcoe.org
https://lists.open-fcoe.org/mailman/listinfo/devel
