Bhanu Gollapudi wrote:
> When a target sends multiple back-to-back LOGO's, we try to process
> all of them irrespective of the status of processing of prior LOGO
> which results in oops in fc_rport_work :
> general protection fault: 0000 [#1] SMP
> last sysfs file: /sys/block/sdao/dev
> CPU 0
> Pid: 6870, comm: fc_rport_eq Not tainted 2.6.33-rc4-fcoe-next-00286-g07cca55
> RIP: 0010:[<ffffffffa031aaf4>] [<ffffffffa031aaf4>]
> fc_rport_work+0x288/0x3d4 [libfc]
> RSP: 0018:ffff88004695fe00 EFLAGS: 00010202
> RAX: dead000000200200 RBX: ffff880046958000 RCX: ffff8800469580e0
> RDX: dead000000100100 RSI: dead000000200200 RDI: dead000000100100
> RBP: ffff88004695fe70 R08: ffff88004695fc40 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000292cef
> R13: 0000000000000004 R14: 0000000000000000 R15: ffff880046958048
> FS: 0000000000000000(0000) GS:ffff880001e00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000041bb7a08 CR3: 0000000078d8f000 CR4: 00000000000006f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process fc_rport_eq (pid: 6870, threadinfo ffff88004695e000, task
> ffff88005d78afa0)
> Stack:
> 0000000000000000 ffff880046958458 ffff880046958400 000000007f7e2fa0
> <0> ffff88005d78b328 00000001000c55d4 ffff88004695fe40 ffffffff810a12e1
> <0> ffff88004695fe60 ffff880046e99d30 ffff8800469580f0 ffffe8ffffc12a80
> Call Trace:
> [<ffffffff810a12e1>] ? trace_hardirqs_off+0x9/0x20
> [<ffffffffa031a86c>] ? fc_rport_work+0x0/0x3d4 [libfc]
> [<ffffffff81056d14>] worker_thread+0x149/0x1e5
> [<ffffffff81059c27>] ? autoremove_wake_function+0x0/0x3d
> [<ffffffff81056bcb>] ? worker_thread+0x0/0x1e5
> [<ffffffff810598da>] kthread+0x6e/0x76
> [<ffffffff81003a14>] kernel_thread_helper+0x4/0x10
> [<ffffffff814a4369>] ? restore_args+0x0/0x30
> [<ffffffff8105986c>] ? kthread+0x0/0x76
> [<ffffffff81003a10>] ? kernel_thread_helper+0x0/0x10
>
> To avoid this drop all LOGO's received while a prior LOGO is being processed
>
> Signed-off-by: Bhanu Prakash Gollapudi <[email protected]>
> ---
> drivers/scsi/libfc/fc_rport.c | 18 ++++++++++++------
> 1 files changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/scsi/libfc/fc_rport.c b/drivers/scsi/libfc/fc_rport.c
> index f179ffc..fba5218 100644
> --- a/drivers/scsi/libfc/fc_rport.c
> +++ b/drivers/scsi/libfc/fc_rport.c
> @@ -1621,14 +1621,20 @@ static void fc_rport_recv_logo_req(struct fc_lport
> *lport,
> FC_RPORT_DBG(rdata, "Received LOGO request while in state %s\n",
> fc_rport_state(rdata));
>
> - fc_rport_enter_delete(rdata, RPORT_EV_LOGO);
> -
> /*
> - * If the remote port was created due to discovery, set state
> - * to log back in. It may have seen a stale RSCN about us.
> + * Drop all further LOGO's while a prior LOGO is being
> + * processed
> */
> - if (rdata->disc_id)
> - fc_rport_state_enter(rdata, RPORT_ST_RESTART);
> + if (rdata->rp_state != RPORT_ST_RESTART) {
> + fc_rport_enter_delete(rdata, RPORT_EV_LOGO);
> +
> + /*
> + * If the remote port was created due to discovery,
> + * set state to log back in. It may have seen a
> + * stale RSCN about us.
> + */
> + if (rdata->disc_id)
> + fc_rport_state_enter(rdata, RPORT_ST_RESTART);
> + }
> mutex_unlock(&rdata->rp_mutex);
> } else
> FC_RPORT_ID_DBG(lport, sid,
I've seen this symptom, too, and I agree this is one way its caused.
A more complete fix is to not set the rdata->event to RPORT_EV_NONE in
fc_rport_work() if the rdata is removed from the list. That prevents
fc_rport_enter_delete() from rescheduling work on the rdata that's about
to be freed.
I think that makes the above patch unnecessary. Do you agree that that fixes
it?
Here's my patch, cut & pasted so it may not apply, just for illustration, I'll
send to the alias separately:
--- a/drivers/scsi/libfc/fc_rport.c
+++ b/drivers/scsi/libfc/fc_rport.c
@@ -307,11 +307,11 @@ static void fc_rport_work(struct work_struct *work)
*/
mutex_lock(&lport->disc.disc_mutex);
mutex_lock(&rdata->rp_mutex);
- if (rdata->rp_state == RPORT_ST_RESTART)
+ if (rdata->rp_state == RPORT_ST_RESTART) {
restart = 1;
- else
+ rdata->event = RPORT_EV_NONE;
+ } else
list_del(&rdata->peers);
- rdata->event = RPORT_EV_NONE;
mutex_unlock(&rdata->rp_mutex);
mutex_unlock(&lport->disc.disc_mutex);
---
Thanks,
Joe
_______________________________________________
devel mailing list
[email protected]
http://www.open-fcoe.org/mailman/listinfo/devel