Vasu Dev wrote:
> In case of i/f destroy all exches must be freed before its EM
> mempool destroyed but currently some exches could be still
> releasing in their scheduled work while EM mempool destroy called.
>
> Fixing this issue by calling flush_scheduled_work to complete all
> pending exches related work before EM mempool destroyed during
> i/f destroy.
>
> The cancel_delayed_work_sync cannot be called during final
> fc_exch_reset to complete all exch work due to lport locking
> orders, so removes related comment block not relevant any more.
>
> More details on this issue is discussed in this email thread
> http://www.open-fcoe.org/pipermail/devel/2009-August/003439.html
>
> RFC notes:-
>
> Now I'm running into another issue with added flush_scheduled_work,
> this forces all system work q flushed and that includes
> fc_host work for fc_rport_final_delete and that threads hangs
> with three locks held fc_host->work_q_name, rport->rport_delete_work
> and shost->scan_mutex. I don't see any of these locks held when
> added flush_scheduled_work called and I suppose this issue must
> have got fixed by Joe's pending rport deletion related fixes.
> Also I couldn't reproduce this issue here before this patch also,
> looks like rare race.

I just posted a report of a similar hang on linux-scsi, and there's
another similar hang that I reported in July. These are the two postings:

http://marc.info/?l=linux-scsi&m=124966844805471&w=2
http://marc.info/?l=linux-scsi&m=124856078500527&w=2

I've been talking to Mike Christie about possible ways to fix the latter.

> So Joe could you please test this fix in your setup with your
> rport deletion related fix applied ?

Yes. I'll do my tests with this.

>
> Signed-off-by: Vasu Dev <[email protected]>
> ---
>
>  drivers/scsi/libfc/fc_exch.c |    7 +------
>  1 files changed, 1 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/scsi/libfc/fc_exch.c b/drivers/scsi/libfc/fc_exch.c
> index b51db15..9c754d5 100644
> --- a/drivers/scsi/libfc/fc_exch.c
> +++ b/drivers/scsi/libfc/fc_exch.c
> @@ -1446,12 +1446,6 @@ static void fc_exch_reset(struct fc_exch *ep)
>  
>  	spin_lock_bh(&ep->ex_lock);
>  	ep->state |= FC_EX_RST_CLEANUP;
> -	/*
> -	 * we really want to call del_timer_sync, but cannot due
> -	 * to the lport calling with the lport lock held (some resp
> -	 * functions can also grab the lport lock which could cause
> -	 * a deadlock).
> -	 */
>  	if (cancel_delayed_work(&ep->timeout_work))
>  		atomic_dec(&ep->ex_refcnt); /* drop hold for timer */
>  	resp = ep->resp;
> @@ -1898,6 +1892,7 @@ void fc_exch_mgr_free(struct fc_lport *lport)
>  {
>  	struct fc_exch_mgr_anchor *ema, *next;
>  
> +	flush_scheduled_work();
>  	list_for_each_entry_safe(ema, next, &lport->ema_list, ema_list)
>  		fc_exch_mgr_del(ema);
>  }
>
> _______________________________________________
> devel mailing list
> [email protected]
> http://www.open-fcoe.org/mailman/listinfo/devel
