Vasu Dev wrote:
> On i/f destroy, all exches must be freed before the EM mempool
> is destroyed, but currently some exches may still be releasing
> in their scheduled work when the EM mempool destroy is called.
> 
> Fix this by calling flush_scheduled_work() to complete all
> pending exch-related work before the EM mempool is destroyed
> during i/f destroy.
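
The ordering described above (drain pending work, then tear down the pool) can be illustrated with a small user-space sketch. It is only a model under stated assumptions: it is single-threaded, and `struct pool`, `struct workq`, `schedule_release()`, `flush_work()` and `demo_teardown()` are hypothetical names invented for illustration, not libfc or kernel workqueue APIs.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORK 8

/* Hypothetical stand-in for an exchange allocated from the EM pool. */
struct exch { int in_use; };

/* Hypothetical stand-in for the EM mempool: counts live objects. */
struct pool {
    struct exch slots[MAX_WORK];
    int outstanding;
};

/* Hypothetical stand-in for a shared work queue of deferred releases. */
struct workq {
    struct pool *pool[MAX_WORK];
    struct exch *item[MAX_WORK];
    size_t count;
};

static struct exch *pool_alloc(struct pool *p)
{
    for (size_t i = 0; i < MAX_WORK; i++) {
        if (!p->slots[i].in_use) {
            p->slots[i].in_use = 1;
            p->outstanding++;
            return &p->slots[i];
        }
    }
    return NULL;
}

static void pool_free(struct pool *p, struct exch *e)
{
    e->in_use = 0;
    p->outstanding--;
}

/* Queue a deferred release, as a scheduled work item would. */
static void schedule_release(struct workq *q, struct pool *p, struct exch *e)
{
    assert(q->count < MAX_WORK);
    q->pool[q->count] = p;
    q->item[q->count] = e;
    q->count++;
}

/* Run every pending work item to completion (the "flush" step). */
static void flush_work(struct workq *q)
{
    for (size_t i = 0; i < q->count; i++)
        pool_free(q->pool[i], q->item[i]);
    q->count = 0;
}

/*
 * Returns how many objects are still live at the point where the pool
 * would be destroyed. Non-zero means teardown ran ahead of pending work.
 */
int demo_teardown(int flush_first)
{
    struct pool p = {0};
    struct workq q = {0};

    schedule_release(&q, &p, pool_alloc(&p));
    schedule_release(&q, &p, pool_alloc(&p));

    if (flush_first)
        flush_work(&q);

    return p.outstanding;
}
```

In this model, `demo_teardown(1)` finds no live objects at teardown, while `demo_teardown(0)` finds two still outstanding, which corresponds to the case the patch is trying to close.
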
> 
> cancel_delayed_work_sync() cannot be called during the final
> fc_exch_reset() to complete all exch work because of lport lock
> ordering, so remove the related comment block, which is no longer
> relevant.
> 
> More details on this issue are discussed in this email thread:
> http://www.open-fcoe.org/pipermail/devel/2009-August/003439.html
> 
> RFC notes:-
> 
>   I'm now running into another issue with the added
> flush_scheduled_work(): it forces the whole system work queue to be
> flushed, including the fc_host work for fc_rport_final_delete, and
> that thread hangs with three locks held: fc_host->work_q_name,
> rport->rport_delete_work and shost->scan_mutex. I don't see any of
> these locks held when the added flush_scheduled_work() is called, and
> I suppose this issue must have been fixed by Joe's pending rport
> deletion fixes. Also, I couldn't reproduce this issue here before
> this patch; it looks like a rare race.

I just posted a report of a similar hang on linux-scsi,
and there's another similar hang that I reported in July.
These are the two postings:

http://marc.info/?l=linux-scsi&m=124966844805471&w=2
http://marc.info/?l=linux-scsi&m=124856078500527&w=2

I've been talking to Mike Christie about possible ways to fix the latter.

>    So Joe, could you please test this fix in your setup with your
> rport deletion fix applied?

Yes.  I'll do my tests with this.


> 
> Signed-off-by: Vasu Dev <[email protected]>
> ---
> 
>  drivers/scsi/libfc/fc_exch.c |    7 +------
>  1 files changed, 1 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/scsi/libfc/fc_exch.c b/drivers/scsi/libfc/fc_exch.c
> index b51db15..9c754d5 100644
> --- a/drivers/scsi/libfc/fc_exch.c
> +++ b/drivers/scsi/libfc/fc_exch.c
> @@ -1446,12 +1446,6 @@ static void fc_exch_reset(struct fc_exch *ep)
>  
>       spin_lock_bh(&ep->ex_lock);
>       ep->state |= FC_EX_RST_CLEANUP;
> -     /*
> -      * we really want to call del_timer_sync, but cannot due
> -      * to the lport calling with the lport lock held (some resp
> -      * functions can also grab the lport lock which could cause
> -      * a deadlock).
> -      */
>       if (cancel_delayed_work(&ep->timeout_work))
>               atomic_dec(&ep->ex_refcnt);     /* drop hold for timer */
>       resp = ep->resp;
> @@ -1898,6 +1892,7 @@ void fc_exch_mgr_free(struct fc_lport *lport)
>  {
>       struct fc_exch_mgr_anchor *ema, *next;
>  
> +     flush_scheduled_work();
>       list_for_each_entry_safe(ema, next, &lport->ema_list, ema_list)
>               fc_exch_mgr_del(ema);
>  }
> 
> _______________________________________________
> devel mailing list
> [email protected]
> http://www.open-fcoe.org/mailman/listinfo/devel
