On Tue, Dec 04, 2012 at 05:45:27PM +0000, Parikh, Neerav wrote:
> Hi Neil,
>
> I'm not sure if you got to look at this RFC.
> This is related to the NPIV destroy patch that I'd earlier posted.
>
> Thanks,
> Neerav
>
Sorry for the reply-with-CC here, but I seem to have deleted the
original message. This patch looks good to me, however. Nice work,
Neerav. Repost it with the RFC tag removed and I'll ack it.
Neil

P.S. I presume removing the config_mutex will be the subject of some
follow-on work? Or do we still need it to prevent multiple user
processes from accessing these paths via sysfs?

Neil

> > -----Original Message-----
> > From: devel-boun...@open-fcoe.org [mailto:devel-boun...@open-fcoe.org]
> > On Behalf Of Neerav Parikh
> > Sent: Friday, November 30, 2012 2:17 PM
> > To: devel@open-fcoe.org
> > Subject: [Open-FCoE] [RFC PATCH] fcoe: Fix deadlock while deleting FCoE
> > interface with NPIV ports
> >
> > This patch fixes the following deadlock, caused by destroying an FCoE
> > interface with active NPIV ports on that interface.
> >
> > Call Trace:
> > [<ffffffff814b7e88>] schedule+0x64/0x66
> > [<ffffffff814b6b4f>] schedule_timeout+0x36/0xe3
> > [<ffffffff81070c55>] ? update_curr+0xd6/0x110
> > [<ffffffff81071f6b>] ? hrtick_update+0x1b/0x4d
> > [<ffffffff81072405>] ? dequeue_task_fair+0x1ca/0x1d9
> > [<ffffffff8106a369>] ? need_resched+0x1e/0x28
> > [<ffffffff814b7d14>] wait_for_common+0x9b/0xf1
> > [<ffffffff8106e7be>] ? try_to_wake_up+0x1e0/0x1e0
> > [<ffffffff814b7e22>] wait_for_completion+0x1d/0x1f
> > [<ffffffff8105ae82>] flush_workqueue+0x116/0x2a1
> > [<ffffffff8105b357>] drain_workqueue+0x66/0x14c
> > [<ffffffff8105b8ef>] destroy_workqueue+0x1a/0xcf
> > [<ffffffffa009211e>] fc_remove_host+0x154/0x17f [scsi_transport_fc]
> > [<ffffffffa00edbb8>] fcoe_if_destroy+0x184/0x1c9 [fcoe]
> > [<ffffffffa00edc28>] fcoe_destroy_work+0x2b/0x44 [fcoe]
> > [<ffffffff8105a82a>] process_one_work+0x1a8/0x2a4
> > [<ffffffffa00edbfd>] ? fcoe_if_destroy+0x1c9/0x1c9 [fcoe]
> > [<ffffffff8105c396>] worker_thread+0x1db/0x268
> > [<ffffffff810604a3>] ? wake_up_bit+0x2a/0x2a
> > [<ffffffff8105c1bb>] ? manage_workers.clone.16+0x1f6/0x1f6
> > [<ffffffff8105ffd6>] kthread+0x6f/0x77
> > [<ffffffff814c0304>] kernel_thread_helper+0x4/0x10
> > [<ffffffff8105ff67>] ? kthread_freezable_should_stop+0x4b/0x4b
> >
> > Call Trace:
> > [<ffffffff814b7e88>] schedule+0x64/0x66
> > [<ffffffff814b8041>] schedule_preempt_disabled+0xe/0x10
> > [<ffffffff814b70a1>] __mutex_lock_common.clone.5+0x117/0x17a
> > [<ffffffff814b7117>] __mutex_lock_slowpath+0x13/0x15
> > [<ffffffff814b6f76>] mutex_lock+0x23/0x37
> > [<ffffffff8125b890>] ? list_del+0x11/0x30
> > [<ffffffffa00edc84>] fcoe_vport_destroy+0x43/0x5f [fcoe]
> > [<ffffffffa009130a>] fc_vport_terminate+0x48/0x110 [scsi_transport_fc]
> > [<ffffffffa00913ef>] fc_vport_sched_delete+0x1d/0x79 [scsi_transport_fc]
> > [<ffffffff8105a82a>] process_one_work+0x1a8/0x2a4
> > [<ffffffffa00913d2>] ? fc_vport_terminate+0x110/0x110 [scsi_transport_fc]
> > [<ffffffff8105c396>] worker_thread+0x1db/0x268
> > [<ffffffff8105c1bb>] ? manage_workers.clone.16+0x1f6/0x1f6
> > [<ffffffff8105ffd6>] kthread+0x6f/0x77
> > [<ffffffff814c0304>] kernel_thread_helper+0x4/0x10
> > [<ffffffff8105ff67>] ? kthread_freezable_should_stop+0x4b/0x4b
> > [<ffffffff814c0300>] ? gs_change+0x13/0x13
> >
> > A prior version of this fix was posted here:
> > http://lists.open-fcoe.org/pipermail/devel/2012-October/012318.html
> > Based on feedback and discussion with Neil Horman, it seems that the
> > above patch may have a case where fcoe_vport_destroy() and
> > fcoe_destroy_work() can race; hence this RFC, which tries to solve
> > the same problem in a different way.
> >
> > In the current approach, instead of removing the fcoe_config_mutex
> > from the vport_delete callback function, I've chosen to first delete
> > all the NPIV ports on a given root lport before continuing with the
> > removal of the root lport itself.
> >
> > Signed-off-by: Neerav Parikh <neerav.par...@intel.com>
> > CC: Neil Horman <nhor...@tuxdriver.com>
> > ---
> >  drivers/scsi/fcoe/fcoe.c | 23 +++++++++++++++++++++++
> >  1 files changed, 23 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
> > index 666b7ac..040f59b 100644
> > --- a/drivers/scsi/fcoe/fcoe.c
> > +++ b/drivers/scsi/fcoe/fcoe.c
> > @@ -2139,8 +2139,31 @@ static void fcoe_destroy_work(struct work_struct *work)
> >  {
> >  	struct fcoe_port *port;
> >  	struct fcoe_interface *fcoe;
> > +	struct Scsi_Host *shost;
> > +	struct fc_host_attrs *fc_host;
> > +	unsigned long flags;
> > +	struct fc_vport *vport;
> > +	struct fc_vport *next_vport;
> >
> >  	port = container_of(work, struct fcoe_port, destroy_work);
> > +	shost = port->lport->host;
> > +	fc_host = shost_to_fc_host(shost);
> > +
> > +	/* Loop through all the vports and mark them for deletion */
> > +	spin_lock_irqsave(shost->host_lock, flags);
> > +	list_for_each_entry_safe(vport, next_vport, &fc_host->vports, peers) {
> > +		if (vport->flags & (FC_VPORT_DEL | FC_VPORT_CREATING)) {
> > +			continue;
> > +		} else {
> > +			vport->flags |= FC_VPORT_DELETING;
> > +			queue_work(fc_host_work_q(shost),
> > +				   &vport->vport_delete_work);
> > +		}
> > +	}
> > +	spin_unlock_irqrestore(shost->host_lock, flags);
> > +
> > +	flush_workqueue(fc_host_work_q(shost));
> > +
> >  	mutex_lock(&fcoe_config_mutex);
> >
> >  	fcoe = port->priv;

_______________________________________________
devel mailing list
devel@open-fcoe.org
https://lists.open-fcoe.org/mailman/listinfo/devel
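A side note on the failure mode, for readers skimming the two traces
above: this is a lock-versus-flush inversion. fcoe_destroy_work()
acquires fcoe_config_mutex and then, via fc_remove_host(), flushes the
fc_host workqueue; meanwhile the queued fcoe_vport_destroy() work item
needs that same mutex before it can complete, so neither side can make
progress. The patch avoids this by queuing and flushing all the
vport-delete work before taking the mutex. Below is a minimal userspace
sketch of the two orderings; it uses pthreads and hypothetical names
(vport_delete_work, broken_destroy, fixed_destroy) as stand-ins for the
kernel primitives, not the actual driver code:

/* Build with: gcc -Wall -pthread deadlock_sketch.c */
#include <pthread.h>
#include <stdio.h>

/* Stand-in for fcoe_config_mutex. */
static pthread_mutex_t config_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the queued fc_vport_sched_delete() ->
 * fcoe_vport_destroy() work, which takes the config mutex. */
static void *vport_delete_work(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&config_mutex);
	puts("vport destroyed");
	pthread_mutex_unlock(&config_mutex);
	return NULL;
}

/* Old ordering, as in the first trace: the mutex is held across the
 * "flush" (pthread_join stands in for flush_workqueue()), so the
 * queued delete work can never run to completion -- deadlock. */
static void broken_destroy(pthread_t worker)
{
	pthread_mutex_lock(&config_mutex);
	pthread_join(worker, NULL);	/* waits forever */
	pthread_mutex_unlock(&config_mutex);
}

/* Patched ordering: drain the vport-delete work first, then take the
 * mutex for the remainder of the teardown. */
static void fixed_destroy(pthread_t worker)
{
	pthread_join(worker, NULL);	/* all vports gone ... */
	pthread_mutex_lock(&config_mutex);	/* ... now safe to lock */
	puts("root lport destroyed");
	pthread_mutex_unlock(&config_mutex);
}

int main(void)
{
	pthread_t worker;

	pthread_create(&worker, NULL, vport_delete_work, NULL);
	fixed_destroy(worker);	/* swap in broken_destroy(worker) to see the hang */
	return 0;
}

Joining the worker before locking mirrors what the patch does with
flush_workqueue(fc_host_work_q(shost)) ahead of
mutex_lock(&fcoe_config_mutex): once every vport-delete work item has
finished, nothing left on the queue can want the mutex.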