On Wed, 2012-02-08 at 19:56 -0500, Neil Horman wrote:
> On Wed, Feb 08, 2012 at 12:35:07PM -0800, Vasu Dev wrote:
> > On Tue, 2012-02-07 at 06:59 -0500, Neil Horman wrote:
> > > On Mon, Feb 06, 2012 at 03:34:00PM -0800, Vasu Dev wrote:
> > > > On Thu, 2012-02-02 at 16:03 -0500, Neil Horman wrote:
> > > > > fc_port->ema_list is being accessed in multiple parallel
> > > > > contexts with no locking around it, leading to potential
> > > > > corruption.
> > > >
> > > > I don't see any possibility of ema_list corruption here, and if
> > > > there is
> > >
> > > I do, and I've tried to explain it here. It seems I've not been
> > > clear enough.
> > >
> > > > any such possible case/s then we should fix such case/s at the
> > > > source instead of ensuring list sanity by any unnecessary
> > > > additional locking, as
> > >
> > > What do you think I'm trying to do here?
> >
> > The patch adds additional ema_list locking while it is not required:
> > as I explained, the scenario described in the patch description for
> > the additional locking is not possible, and neither do the newly
> > described flows from your latest response appear to be issues; more
> > details below along with them.
> >
> > The point is to fix unintended access to the ema_list instead of
> > adding additional locks to the list, since list addition/deletion is
> > already protected and fcoe create/destroy processing should be
> > ensuring no access to the ema_list:
> >
> > 1) prior to its setup being fully done: all frames are ignored until
> > its setup is done, and that prevents any attempt to access an empty
> > or half-initialized ema_list.
> > 2) while the list is getting removed, or after it is removed: all
> > pending exches, IOs, and receive flows are either already flushed or
> > their frames are ignored at this point, and that prevents any further
> > access to the ema_list while it is getting empty or is already empty.
>
> I really don't see how you can say this.
I described the flows that avoid hitting an empty ema_list, as I stated
in the heading "create/destroy processing should be ensuring no access
to ema_list". Your detailed analysis below shows that case 2) is not
fully covered, and I agree with that, due to possible races here, since
we are trying to avoid global locking on the fast path for the lport
state check.

The current code has a big question mark due to having no locking here
for the disabled-lport state check; that is the loose point, but we
want it that way on the fast path, as you stated below:

	/* lport lock ? */
	if (!lport || lport->state == LPORT_ST_DISABLED) {
		FC_LPORT_DBG(lport, "Receiving frames for an lport that "
			     "has not been initialized correctly\n");
		fc_frame_free(fp);
		return;
	}

> There is absolutely no reason that one or more cpus can't be midway
> through handling a frame on receive in softirq context, past the point
> where they've checked and confirmed the lport state and any other bits
> in the lport struct which might clear them to start using that data
> (including ema_list). Say, for example, this occurs:

Yes, that is a problem due to having no locking around the lport state
check on the fast path. However, we disable the lport early on, before
doing all the flushing and cleanup, which goes through various
workqueue sync flushes, and I guess that adds enough delay; that is
probably why we have not seen the issue we are talking about here.
However, I agree it is not foolproof, as you also described below in
detail.

> CPU0 is in fc_exch_recv, and has just finished checking lport->state.
> Say the state is LPORT_ST_READY, so it passes that check. The next
> thing it will do is call fc_find_ema, which traverses the ema_list.
> Note that in fcoe_rcv, if the right conditions are met, you can get
> into fcoe_rcv_frame in softirq context without the rx_list lock held.
>
> At that moment CPU0 gets an interrupt, so it goes off to service that.
>
> While CPU0 is handling the interrupt, CPU1 handles a write request
> from user space, requesting to destroy the interface from which CPU0
> received the frame it is processing. It manages to log off the fabric,
> flush the tx queue (which is the only thing it can flush), and
> destroys the lport. It gets to the point where it calls
> fc_exch_mgr_free and starts deleting items from the lport's ema_list.
>
> At that moment CPU0 returns from its interrupt handler and continues
> processing that frame: it calls fc_find_ema, using an lport that is
> now in some halfway-dead state, and starts traversing a list that is
> actively being manipulated by CPU1 at the same time.
>
> The bottom line is, there's a direct path to traverse the ema_list in
> softirq context, and that makes your rx queue flush useless. That in
> turn makes it entirely possible to have a delete operation on the
> ema_list happen in parallel with a list traversal.

Yeah, passing a disabled lport is the source of the problem here, due
to there being no locking around the lport state check.

> I'll buy that this works well on add, because you're guaranteed no
> frames are being received until you call dev_add_pack, making your add
> safe from softirq operation. The same is not true on release.
>
> So we can do a few things about this:
>
> 1) We can try to do a dev_remove_pack prior to calling
> fc_fabric_logoff so as to prevent incoming frames from traversing the
> receive path while we're doing a delete. This is impractical, as we
> don't have a way to ensure that all softirq contexts that are handling
> fcoe frames are completed.

Yes, I agree, and indeed we had it that way; the order was then changed
by some fixes, to do the fcoe interface cleanup after fcoe_if_destroy.
> 2) We can do the ema_list rcu lock like I'm proposing

That won't fix the problem at the source: what if a new exch gets
created after we are done with the exch cleanup, during the destroy?
Then we will hit the exch mgr free with some active exches; basically,
the detailed race you described above is not covered.

> 3) We can make the rx queue flush work properly by guaranteeing that
> fcoe_recv_frame will always be called in the context of a receiver
> thread. That's a workable alternative, but if, as you've said, you
> don't want to introduce additional locking, it's a lousy idea; having
> to call spin_lock_bh/spin_unlock_bh on each received frame is an
> enormous overhead above and beyond taking an rcu_read_lock on a list
> traversal.

I agree with all of that.

> I should also point out there is precedent for what I'm doing here:
> the lport->vports list is accessed in both the add/delete execution
> paths and in the softirq path, and is protected by a mutex (which is a
> bug in and of itself, given that calling fc_vport_id_lookup in softirq
> context will give you a big might_sleep warning, or a
> sleeping-in-interrupt panic, if the mutex is already held). The point
> here being, however, that you have other lists that are already
> protected for what appear to be the exact same reasons. Although this
> does speak to the fact that, regardless of this conversation, we
> should not currently ever try to call fcoe_recv_frame in softirq
> context. I'll take care of that tomorrow. That will have the pleasant
> side effect of taking care of this bug as well, since it will make the
> rx thread purge work properly in fcoe_if_destroy.

Looks okay to me, and the flows I was calling out in 2) should get
covered with that, and without having a lock here too; it will always
take the hit of a context switch, but I guess that is much better than
having a lock.

> I had a bunch more comments on the rest of your thread, but seeing
> that fcoe_recv_frame can cause lots more problems, I think this is all
> moot.
:)

We might need the patch I sent out to not free the master lport until
we are done with the fcoe interface cleanup.

> Anywho, I hope the above clarifies what the problem here is.
> Basically, what you think is preventing mutual access of the ema_list,
> isn't. But the fact that fcoe_recv_frame isn't a path that can handle
> softirq context is the bigger problem here, and fixing that will fix
> the ema_list at the same time. I'll have a new patch in the morning.

Great, thanks. I appreciate your detailed mail, and I guess we have
converged on the issues requiring a fix before ema_list.

Regards,
Vasu

> regards.
> Neil
>
_______________________________________________
devel mailing list
devel@open-fcoe.org
https://lists.open-fcoe.org/mailman/listinfo/devel