Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
Hey Chris and Guilherme, I'm indeed not responsive under this email address.

>> Thanks for the testing, looks like you have the magic target to reproduce
>> this.
>>
>> I think this verifies Mike's idea of what was going wrong, and we're way
>> overdue to get this fixed upstream. Thanks to IBM for pushing this, I don't
>> think any major distro is shipping this patch and we don't want to keep
>> having to back it out.
>>
>> The options look like
>> 1) back out the session lock changes that split it into two locks
>> 2) add in the additional locking from this test patch
>> 3) some other fix for the issue of targets that complete tasks oddly
>>
>> I'm leaning to #1, as I don't want to keep adding more locks for this.
>
> Thanks Chris! IIRC, the lock changes from Shlomo/Or are not on RHEL, SLES
> and Ubuntu anymore, as you mentioned. We requested them to revert the
> patch, and it was accepted.
>
> On the other hand, your patch is great and a cool fix for this. If we have
> any good numbers and/or reasons to keep their patch, I guess alternative #2
> is cool too. I can perform more testing if you plan to send this (or a
> similar) patch to the iscsi list.
>
>> Sagi, Or, Shlomo? You pushed to keep this from being backed out before.
>> Here's your cause, any better ideas on fixing it? I also tried to go back
>> in the mailing list archives, but I don't see any real numbers for the
>> performance gains.
>
> I'll loop Sagi here based on the email I see he's using on the NVMe list
> currently - it seems to be different from the one shown in the header of
> this message.

IIRC, this was brought up more than two years ago? It's been a while now.

The motivation for the fine-grained locking from Shlomo was to address the
submission/completion inter-locking scheme that was not needed for iser. In
iser, task completions are triggered from soft-irq only for task responses;
the data transfer is driven in HW, so we don't need the inter-locking
between submissions and task management or error handling.

My recollection is that this scheme solved a contention point we had back
then; if I'm not mistaken it was as much as a 50% improvement in IOPs
scalability in some scenarios. Now, this was all pre block-mq.

So I think the correct solution for iscsi (iser, tcp and offloads) is to use
block-mq facilities for task pre-allocations (the scsi host tagset) and have
iscsi tcp take care of its own locking instead of imposing it inherently in
libiscsi. We can have LOGIN, LOGOUT, NOOP_OUT, TEXT and TMR as reserved
tags, and queue_depth with max session cmds.

I had a prototype for that back when I experimented with the scsi-mq
conversion (way back...), but kinda got stuck trying to figure out how to
convert the offload drivers qla4xxx, bnx2i and cxgbi, which seemed to rely
heavily on the task pools.

If people are more interested in improving iscsi locking schemes we can
discuss approaches for it.
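[Editor's note] The reserved-tag scheme described above can be sketched in userspace. This is a hedged illustration in Python, not the blk-mq API or any proposed kernel code: the `TagSet` class and all names are hypothetical. The idea is that a handful of tags are set aside for LOGIN/LOGOUT/NOOP_OUT/TEXT/TMR PDUs so management traffic can still allocate a task even when the SCSI command pool (sized by the session queue depth) is exhausted:

```python
# Userspace model (hypothetical names, not kernel code) of reserved tags:
# low tags are held back for management PDUs, the rest map to SCSI commands.
RESERVED_PDUS = ["LOGIN", "LOGOUT", "NOOP_OUT", "TEXT", "TMR"]

class TagSet:
    def __init__(self, queue_depth):
        self.nr_reserved = len(RESERVED_PDUS)
        self.free_reserved = set(range(self.nr_reserved))
        self.free_normal = set(range(self.nr_reserved,
                                     self.nr_reserved + queue_depth))

    def get_tag(self, reserved=False):
        pool = self.free_reserved if reserved else self.free_normal
        if not pool:
            return None  # caller must back off and retry later
        return pool.pop()

    def put_tag(self, tag):
        if tag < self.nr_reserved:
            self.free_reserved.add(tag)
        else:
            self.free_normal.add(tag)

tags = TagSet(queue_depth=128)
nop = tags.get_tag(reserved=True)              # NOOP_OUT ping
cmds = [tags.get_tag() for _ in range(128)]    # fill the data path
```

With the data-path pool exhausted, a management PDU can still get a tag, which is exactly why the mgmtqueue/cmdqueue lists (and their locking) could go away under such a scheme.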
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 06/02/2017 15:27, Chris Leech wrote:
> ----- Original Message -----
>> On 09/11/2016 03:21, Chris Leech wrote:
>>> On Mon, Nov 07, 2016 at 04:23:10PM -0200, Guilherme G. Piccoli wrote:
>>>>
>>>> Sure! Count on us to test any patches. I guess the first step is to
>>>> reproduce on upstream right? We haven't tested specifically this
>>>> scenario for long time. Will try to reproduce on 4.9-rc4 and update here.
>>>
>>> Great, I'm looking forward to hearing the result.
>>>
>>> Assuming it reproduces, I don't think this level of fine grained locking
>>> is necessarily the best solution, but it may help confirm the problem.
>>> Especially if the WARN_ONCE I slipped in here triggers.
>>
>> Chris, sorry for my huge delay.
>> Finally I was able to perform tests and I have good news - seems your
>> patch below fixed the issue.
>
> Thanks for the testing, looks like you have the magic target to reproduce
> this.
>
> I think this verifies Mike's idea of what was going wrong, and we're way
> overdue to get this fixed upstream. Thanks to IBM for pushing this, I don't
> think any major distro is shipping this patch and we don't want to keep
> having to back it out.
>
> The options look like
> 1) back out the session lock changes that split it into two locks
> 2) add in the additional locking from this test patch
> 3) some other fix for the issue of targets that complete tasks oddly
>
> I'm leaning to #1, as I don't want to keep adding more locks for this.

Thanks Chris! IIRC, the lock changes from Shlomo/Or are not on RHEL, SLES
and Ubuntu anymore, as you mentioned. We requested them to revert the patch,
and it was accepted.

On the other hand, your patch is great and a cool fix for this. If we have
any good numbers and/or reasons to keep their patch, I guess alternative #2
is cool too. I can perform more testing if you plan to send this (or a
similar) patch to the iscsi list.

> Sagi, Or, Shlomo? You pushed to keep this from being backed out before.
> Here's your cause, any better ideas on fixing it?
> I also tried to go back in the mailing list archives, but I don't see any
> real numbers for the performance gains.

I'll loop Sagi here based on the email I see he's using on the NVMe list
currently - it seems to be different from the one shown in the header of
this message.

Thanks,
Guilherme

> - Chris
>
>> Firstly, I was able to reproduce with kernel 4.10-rc6. See the file
>> repro.out - it's a dump from xmon, the kernel debugger from PowerPC.
>> With this tool we can dump the exception details, registers, PACA
>> (Processor Address Communication Area, ppc specific structure) and
>> dmesg. It took me less than 15 minutes to reproduce.
>>
>> Then, I applied your patch on the top of this kernel and the benchmark
>> was able to successfully complete, in about 3 hours. We can see the
>> WARN() you added was reached; the attached file dmesg-cleech_patch shows
>> the kernel log with your patch.
>>
>> The workload was the FIO benchmark doing both reads and writes to the
>> remote storage via iSCSI, connected over ethernet direct cabling (10Gb
>> speed). Distro was Ubuntu 16.04.1.
>>
>> Any more tests or info you need, please let me know!
>> Cheers,
>>
>> Guilherme
>>
>>> - Chris
>>>
>>> ---
>>>
>>> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
>>> index f9b6fba..fbd18ab 100644
>>> --- a/drivers/scsi/libiscsi.c
>>> +++ b/drivers/scsi/libiscsi.c
>>> @@ -560,8 +560,12 @@ static void iscsi_complete_task(struct iscsi_task *task, int state)
>>>  	WARN_ON_ONCE(task->state == ISCSI_TASK_FREE);
>>>  	task->state = state;
>>>
>>> -	if (!list_empty(&task->running))
>>> +	spin_lock_bh(&conn->taskqueuelock);
>>> +	if (!list_empty(&task->running)) {
>>> +		WARN_ONCE(1, "iscsi_complete_task while task on list");
>>>  		list_del_init(&task->running);
>>> +	}
>>> +	spin_unlock_bh(&conn->taskqueuelock);
>>>
>>>  	if (conn->task == task)
>>>  		conn->task = NULL;
>>> @@ -783,7 +787,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
>>>  		if (session->tt->xmit_task(task))
>>>  			goto free_task;
>>>  	} else {
>>> +		spin_lock_bh(&conn->taskqueuelock);
>>>  		list_add_tail(&task->running, &conn->mgmtqueue);
>>> +		spin_unlock_bh(&conn->taskqueuelock);
>>>  		iscsi_conn_queue_work(conn);
>>>  	}
>>> @@ -1474,8 +1480,10 @@ void iscsi_requeue_task(struct iscsi_task *task)
>>>  	 * this may be on the requeue list already if the xmit_task callout
>>>  	 * is handling the r2ts while we are adding new ones
>>>  	 */
>>> +	spin_lock_bh(&conn->taskqueuelock);
>>>  	if (list_empty(&task->running))
>>>  		list_add_tail(&task->running, &conn->requeue);
>>> +	spin_unlock_bh(&conn->taskqueuelock);
>>>  	iscsi_conn_queue_work(conn);
>>>  }
>>>  EXPORT_SYMBOL_GPL(iscsi_requeue_task);
>>> @@ -1512,22 +1520,26 @@ static int iscsi_data_xmit(struct iscsi_conn
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
----- Original Message -----
> On 09/11/2016 03:21, Chris Leech wrote:
> > On Mon, Nov 07, 2016 at 04:23:10PM -0200, Guilherme G. Piccoli wrote:
> >>
> >> Sure! Count on us to test any patches. I guess the first step is to
> >> reproduce on upstream right? We haven't tested specifically this
> >> scenario for long time. Will try to reproduce on 4.9-rc4 and update here.
> >
> > Great, I'm looking forward to hearing the result.
> >
> > Assuming it reproduces, I don't think this level of fine grained locking
> > is necessarily the best solution, but it may help confirm the problem.
> > Especially if the WARN_ONCE I slipped in here triggers.
>
> Chris, sorry for my huge delay.
> Finally I was able to perform tests and I have good news - seems your
> patch below fixed the issue.

Thanks for the testing, looks like you have the magic target to reproduce
this.

I think this verifies Mike's idea of what was going wrong, and we're way
overdue to get this fixed upstream. Thanks to IBM for pushing this; I don't
think any major distro is shipping this patch and we don't want to keep
having to back it out.

The options look like
1) back out the session lock changes that split it into two locks
2) add in the additional locking from this test patch
3) some other fix for the issue of targets that complete tasks oddly

I'm leaning to #1, as I don't want to keep adding more locks for this.

Sagi, Or, Shlomo? You pushed to keep this from being backed out before.
Here's your cause, any better ideas on fixing it? I also tried to go back in
the mailing list archives, but I don't see any real numbers for the
performance gains.

- Chris

> Firstly, I was able to reproduce with kernel 4.10-rc6. See the file
> repro.out - it's a dump from xmon, the kernel debugger from PowerPC.
> With this tool we can dump the exception details, registers, PACA
> (Processor Address Communication Area, ppc specific structure) and
> dmesg. It took me less than 15 minutes to reproduce.
> Then, I applied your patch on the top of this kernel and the benchmark
> was able to successfully complete, in about 3 hours. We can see the
> WARN() you added was reached, the attached file dmesg-cleech_patch shows
> the kernel log with your patch.
>
> The workload was FIO benchmark doing both reads and writes to the remote
> storage via iSCSI, connected over ethernet direct cabling (10Gb speed).
> Distro was Ubuntu 16.04.1 .
>
> Any more tests or info you need, please let me know!
> Cheers,
>
> Guilherme
>
> > - Chris
> >
> > ---
> >
> > diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> > index f9b6fba..fbd18ab 100644
> > --- a/drivers/scsi/libiscsi.c
> > +++ b/drivers/scsi/libiscsi.c
> > @@ -560,8 +560,12 @@ static void iscsi_complete_task(struct iscsi_task *task, int state)
> >  	WARN_ON_ONCE(task->state == ISCSI_TASK_FREE);
> >  	task->state = state;
> >
> > -	if (!list_empty(&task->running))
> > +	spin_lock_bh(&conn->taskqueuelock);
> > +	if (!list_empty(&task->running)) {
> > +		WARN_ONCE(1, "iscsi_complete_task while task on list");
> >  		list_del_init(&task->running);
> > +	}
> > +	spin_unlock_bh(&conn->taskqueuelock);
> >
> >  	if (conn->task == task)
> >  		conn->task = NULL;
> > @@ -783,7 +787,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
> >  		if (session->tt->xmit_task(task))
> >  			goto free_task;
> >  	} else {
> > +		spin_lock_bh(&conn->taskqueuelock);
> >  		list_add_tail(&task->running, &conn->mgmtqueue);
> > +		spin_unlock_bh(&conn->taskqueuelock);
> >  		iscsi_conn_queue_work(conn);
> >  	}
> > @@ -1474,8 +1480,10 @@ void iscsi_requeue_task(struct iscsi_task *task)
> >  	 * this may be on the requeue list already if the xmit_task callout
> >  	 * is handling the r2ts while we are adding new ones
> >  	 */
> > +	spin_lock_bh(&conn->taskqueuelock);
> >  	if (list_empty(&task->running))
> >  		list_add_tail(&task->running, &conn->requeue);
> > +	spin_unlock_bh(&conn->taskqueuelock);
> >  	iscsi_conn_queue_work(conn);
> >  }
> >  EXPORT_SYMBOL_GPL(iscsi_requeue_task);
> > @@ -1512,22 +1520,26 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
> >  	 * only have one nop-out as a ping from us and targets should not
> >  	 * overflow us with nop-ins
> >  	 */
> > +	spin_lock_bh(&conn->taskqueuelock);
> >  check_mgmt:
> >  	while (!list_empty(&conn->mgmtqueue)) {
> >  		conn->task = list_entry(conn->mgmtqueue.next,
> >  					struct iscsi_task, running);
> >  		list_del_init(&conn->task->running);
> > +		spin_unlock_bh(&conn->taskqueuelock);
> >  		if (iscsi_prep_mgmt_task(conn, conn->task)) {
> >  			/* regular RX path uses back_lock */
> >  			spin_lock_bh(&conn->session->back_lock);
> >  			__iscsi_put_ta
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 11/09/2016 03:21 AM, Chris Leech wrote:
> On Mon, Nov 07, 2016 at 04:23:10PM -0200, Guilherme G. Piccoli wrote:
>>
>> Sure! Count on us to test any patches. I guess the first step is to
>> reproduce on upstream right? We haven't tested specifically this
>> scenario for long time. Will try to reproduce on 4.9-rc4 and update here.
>
> Great, I'm looking forward to hearing the result.
>
> Assuming it reproduces, I don't think this level of fine grained locking
> is necessarily the best solution, but it may help confirm the problem.
> Especially if the WARN_ONCE I slipped in here triggers.
>
> - Chris

Chris, I'm not able to reproduce right now - environment was misconfigured,
and now I'm leaving the office for 20 days. Will test as soon as I'm back!

Thanks,
Guilherme

> ---
>
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index f9b6fba..fbd18ab 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -560,8 +560,12 @@ static void iscsi_complete_task(struct iscsi_task *task, int state)
>  	WARN_ON_ONCE(task->state == ISCSI_TASK_FREE);
>  	task->state = state;
>
> -	if (!list_empty(&task->running))
> +	spin_lock_bh(&conn->taskqueuelock);
> +	if (!list_empty(&task->running)) {
> +		WARN_ONCE(1, "iscsi_complete_task while task on list");
>  		list_del_init(&task->running);
> +	}
> +	spin_unlock_bh(&conn->taskqueuelock);
>
>  	if (conn->task == task)
>  		conn->task = NULL;
> @@ -783,7 +787,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
>  		if (session->tt->xmit_task(task))
>  			goto free_task;
>  	} else {
> +		spin_lock_bh(&conn->taskqueuelock);
>  		list_add_tail(&task->running, &conn->mgmtqueue);
> +		spin_unlock_bh(&conn->taskqueuelock);
>  		iscsi_conn_queue_work(conn);
>  	}
> @@ -1474,8 +1480,10 @@ void iscsi_requeue_task(struct iscsi_task *task)
>  	 * this may be on the requeue list already if the xmit_task callout
>  	 * is handling the r2ts while we are adding new ones
>  	 */
> +	spin_lock_bh(&conn->taskqueuelock);
>  	if (list_empty(&task->running))
>  		list_add_tail(&task->running, &conn->requeue);
> +	spin_unlock_bh(&conn->taskqueuelock);
>  	iscsi_conn_queue_work(conn);
>  }
>  EXPORT_SYMBOL_GPL(iscsi_requeue_task);
> @@ -1512,22 +1520,26 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
>  	 * only have one nop-out as a ping from us and targets should not
>  	 * overflow us with nop-ins
>  	 */
> +	spin_lock_bh(&conn->taskqueuelock);
>  check_mgmt:
>  	while (!list_empty(&conn->mgmtqueue)) {
>  		conn->task = list_entry(conn->mgmtqueue.next,
>  					struct iscsi_task, running);
>  		list_del_init(&conn->task->running);
> +		spin_unlock_bh(&conn->taskqueuelock);
>  		if (iscsi_prep_mgmt_task(conn, conn->task)) {
>  			/* regular RX path uses back_lock */
>  			spin_lock_bh(&conn->session->back_lock);
>  			__iscsi_put_task(conn->task);
>  			spin_unlock_bh(&conn->session->back_lock);
>  			conn->task = NULL;
> +			spin_lock_bh(&conn->taskqueuelock);
>  			continue;
>  		}
>  		rc = iscsi_xmit_task(conn);
>  		if (rc)
>  			goto done;
> +		spin_lock_bh(&conn->taskqueuelock);
>  	}
>
>  	/* process pending command queue */
> @@ -1535,19 +1547,24 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
>  		conn->task = list_entry(conn->cmdqueue.next, struct iscsi_task,
>  					running);
>  		list_del_init(&conn->task->running);
> +		spin_unlock_bh(&conn->taskqueuelock);
>  		if (conn->session->state == ISCSI_STATE_LOGGING_OUT) {
>  			fail_scsi_task(conn->task, DID_IMM_RETRY);
> +			spin_lock_bh(&conn->taskqueuelock);
>  			continue;
>  		}
>  		rc = iscsi_prep_scsi_cmd_pdu(conn->task);
>  		if (rc) {
>  			if (rc == -ENOMEM || rc == -EACCES) {
> +				spin_lock_bh(&conn->taskqueuelock);
>  				list_add_tail(&conn->task->running,
>  					      &conn->cmdqueue);
>  				conn->task = NULL;
> +				spin_unlock_bh(&conn->taskqueuelock);
>  				goto done;
>  			} else
>  				fail_scsi_task(conn->task, DID_ABORT);
> +			spin_lock_bh(&conn->taskqueuelock);
>  			continue;
>  		}
>  		rc = iscsi_xmit_task(conn);
>
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On Mon, Nov 07, 2016 at 04:23:10PM -0200, Guilherme G. Piccoli wrote:
>
> Sure! Count on us to test any patches. I guess the first step is to
> reproduce on upstream right? We haven't tested specifically this
> scenario for long time. Will try to reproduce on 4.9-rc4 and update here.

Great, I'm looking forward to hearing the result.

Assuming it reproduces, I don't think this level of fine grained locking
is necessarily the best solution, but it may help confirm the problem.
Especially if the WARN_ONCE I slipped in here triggers.

- Chris

---

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index f9b6fba..fbd18ab 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -560,8 +560,12 @@ static void iscsi_complete_task(struct iscsi_task *task, int state)
 	WARN_ON_ONCE(task->state == ISCSI_TASK_FREE);
 	task->state = state;
 
-	if (!list_empty(&task->running))
+	spin_lock_bh(&conn->taskqueuelock);
+	if (!list_empty(&task->running)) {
+		WARN_ONCE(1, "iscsi_complete_task while task on list");
 		list_del_init(&task->running);
+	}
+	spin_unlock_bh(&conn->taskqueuelock);
 
 	if (conn->task == task)
 		conn->task = NULL;
@@ -783,7 +787,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
 		if (session->tt->xmit_task(task))
 			goto free_task;
 	} else {
+		spin_lock_bh(&conn->taskqueuelock);
 		list_add_tail(&task->running, &conn->mgmtqueue);
+		spin_unlock_bh(&conn->taskqueuelock);
 		iscsi_conn_queue_work(conn);
 	}
@@ -1474,8 +1480,10 @@ void iscsi_requeue_task(struct iscsi_task *task)
 	 * this may be on the requeue list already if the xmit_task callout
 	 * is handling the r2ts while we are adding new ones
 	 */
+	spin_lock_bh(&conn->taskqueuelock);
 	if (list_empty(&task->running))
 		list_add_tail(&task->running, &conn->requeue);
+	spin_unlock_bh(&conn->taskqueuelock);
 	iscsi_conn_queue_work(conn);
 }
 EXPORT_SYMBOL_GPL(iscsi_requeue_task);
@@ -1512,22 +1520,26 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
 	 * only have one nop-out as a ping from us and targets should not
 	 * overflow us with nop-ins
 	 */
+	spin_lock_bh(&conn->taskqueuelock);
 check_mgmt:
 	while (!list_empty(&conn->mgmtqueue)) {
 		conn->task = list_entry(conn->mgmtqueue.next,
 					struct iscsi_task, running);
 		list_del_init(&conn->task->running);
+		spin_unlock_bh(&conn->taskqueuelock);
 		if (iscsi_prep_mgmt_task(conn, conn->task)) {
 			/* regular RX path uses back_lock */
 			spin_lock_bh(&conn->session->back_lock);
 			__iscsi_put_task(conn->task);
 			spin_unlock_bh(&conn->session->back_lock);
 			conn->task = NULL;
+			spin_lock_bh(&conn->taskqueuelock);
 			continue;
 		}
 		rc = iscsi_xmit_task(conn);
 		if (rc)
 			goto done;
+		spin_lock_bh(&conn->taskqueuelock);
 	}
 
 	/* process pending command queue */
@@ -1535,19 +1547,24 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
 		conn->task = list_entry(conn->cmdqueue.next, struct iscsi_task,
 					running);
 		list_del_init(&conn->task->running);
+		spin_unlock_bh(&conn->taskqueuelock);
 		if (conn->session->state == ISCSI_STATE_LOGGING_OUT) {
 			fail_scsi_task(conn->task, DID_IMM_RETRY);
+			spin_lock_bh(&conn->taskqueuelock);
 			continue;
 		}
 		rc = iscsi_prep_scsi_cmd_pdu(conn->task);
 		if (rc) {
 			if (rc == -ENOMEM || rc == -EACCES) {
+				spin_lock_bh(&conn->taskqueuelock);
 				list_add_tail(&conn->task->running,
 					      &conn->cmdqueue);
 				conn->task = NULL;
+				spin_unlock_bh(&conn->taskqueuelock);
 				goto done;
 			} else
 				fail_scsi_task(conn->task, DID_ABORT);
+			spin_lock_bh(&conn->taskqueuelock);
 			continue;
 		}
 		rc = iscsi_xmit_task(conn);
@@ -1558,6 +1575,7 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
 		 * we need to check the mgmt queue for nops that need to
 		 * be sent to aviod starvation
 		 */
+		spin_lock_bh(&conn->taskqueuelock);
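[Editor's note] The locking discipline the test patch applies in iscsi_data_xmit - hold conn->taskqueuelock only across the list manipulation, drop it before calling into the transport, retake it before touching the list again - can be modelled in a few lines. This is a userspace Python sketch of the pattern, not the kernel code; the names mirror the patch but the implementation is illustrative:

```python
import threading
from collections import deque

taskqueuelock = threading.Lock()   # stands in for conn->taskqueuelock
mgmtqueue = deque()                # stands in for conn->mgmtqueue
xmitted = []
lock_held_during_xmit = []

def xmit_task(task):
    # transport callout: may sleep or take other locks, so the
    # queue lock must NOT be held here (recorded for verification)
    lock_held_during_xmit.append(taskqueuelock.locked())
    xmitted.append(task)

def data_xmit():
    taskqueuelock.acquire()
    while mgmtqueue:
        task = mgmtqueue.popleft()  # ~ list_del_init(&task->running)
        taskqueuelock.release()     # drop before calling the transport
        xmit_task(task)
        taskqueuelock.acquire()     # retake before touching the list
    taskqueuelock.release()

with taskqueuelock:                 # ~ __iscsi_conn_send_pdu queueing
    mgmtqueue.extend(["nop_out", "text", "login"])
data_xmit()
```

The drop/retake dance is exactly why option #2 adds so many lock calls to the xmit path, which is Chris's stated reason for preferring option #1.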
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 11/07/2016 04:15 PM, Chris Leech wrote:
> Hi,
>
> I'm kicking this old thread because I don't think this ever got
> resolved. I wish I had more info, but it seems to involve target
> specific behavior that hasn't come up in our test labs.

Thanks very much for reopening this thread! We have Or's patch reverted in
multiple distros, so the issue is not likely to happen to IBM's customers,
but the upstream kernel never saw a proper fix for this.

> So what can I do at this point to help resolve this?
>
> On Sun, Nov 15, 2015 at 12:10:48PM +0200, Or Gerlitz wrote:
>> Mike, Chris
>>
>> After the locking change, adding a task to any of the connection
>> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>>
>> Removing tasks from any of these lists in iscsi_data_xmit is under
>> the session forward lock and **before** calling down to the transport
>> to handle the task.
>>
>> The iscsi_complete_task helper was added by Mike's commit
>> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
>> and is indeed typically called under the backward lock && has this section
>>
>> + if (!list_empty(&task->running))
>> +     list_del_init(&task->running);
>>
>> which per my reading of the code never comes into play, can you comment?
>>
>> Lets address this area before we move to the others claims made on the patch.
>
> This bit in particular is where I see a cause for concern. If that
> list_del_init call ever races against other list operations, there's a
> potential corruption. It's presumably there for a reason, and Mike
> explained a case where some targets have been known to send a check
> condition at unexpected times that would hit it.
>
> I don't like having known list locking violations hanging around, based
> on an expectation that we'll never hit that path with well behaved
> targets.
>
> If we can get a fix worked up for the list locking here, can we get any
> testing on it from the original reporters at IBM? That was very helpful
> in testing a full reversion patch.

Sure! Count on us to test any patches. I guess the first step is to
reproduce on upstream right? We haven't tested specifically this scenario
for long time. Will try to reproduce on 4.9-rc4 and update here.

Cheers,
Guilherme

> - Chris Leech

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
Hi,

I'm kicking this old thread because I don't think this ever got resolved.
I wish I had more info, but it seems to involve target specific behavior
that hasn't come up in our test labs.

So what can I do at this point to help resolve this?

On Sun, Nov 15, 2015 at 12:10:48PM +0200, Or Gerlitz wrote:
> Mike, Chris
>
> After the locking change, adding a task to any of the connection
> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>
> Removing tasks from any of these lists in iscsi_data_xmit is under
> the session forward lock and **before** calling down to the transport
> to handle the task.
>
> The iscsi_complete_task helper was added by Mike's commit
> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
> and is indeed typically called under the backward lock && has this section
>
> + if (!list_empty(&task->running))
> +     list_del_init(&task->running);
>
> which per my reading of the code never comes into play, can you comment?
>
> Lets address this area before we move to the others claims made on the patch.

This bit in particular is where I see a cause for concern. If that
list_del_init call ever races against other list operations, there's a
potential corruption. It's presumably there for a reason, and Mike explained
a case where some targets have been known to send a check condition at
unexpected times that would hit it.

I don't like having known list locking violations hanging around, based on
an expectation that we'll never hit that path with well behaved targets.

If we can get a fix worked up for the list locking here, can we get any
testing on it from the original reporters at IBM? That was very helpful in
testing a full reversion patch.

- Chris Leech
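[Editor's note] The corruption Chris describes can be made concrete with a userspace model of the kernel's list.h doubly-linked list. This Python sketch is an illustration of the interleaving, not kernel code: one context runs iscsi_complete_task()'s list_del_init under only the back lock while another adds to the same cmdqueue under the frwd lock, so nothing serializes the two. Splitting the delete at its pointer reads shows how the stale writes silently empty the list:

```python
# Userspace model of include/linux/list.h semantics (illustrative only).
class Node:
    def __init__(self, name):
        self.name = name
        self.prev = self.next = self

def list_add_tail(node, head):
    prev, nxt = head.prev, head       # ~ __list_add(node, head->prev, head)
    nxt.prev = node
    node.next = nxt
    node.prev = prev
    prev.next = node

def list_del_init_steps(node):
    # list_del_init() split at its pointer reads so another context
    # can be interleaved between them, as an unserialized race allows
    prev, nxt = node.prev, node.next
    yield                             # <-- other CPU runs here
    nxt.prev = prev
    prev.next = nxt
    node.prev = node.next = node

cmdqueue = Node("cmdqueue")           # ~ conn->cmdqueue list head
old = Node("task_with_r2t")
list_add_tail(old, cmdqueue)

back_path = list_del_init_steps(old)  # completion path, back lock only
next(back_path)                       # it has read old's neighbours...
new = Node("new_task")
list_add_tail(new, cmdqueue)          # ...queuecommand adds under frwd lock
for _ in back_path:                   # ...then the stale writes land
    pass

visible = []
n = cmdqueue.next
while n is not cmdqueue:              # walk what the xmit path would see
    visible.append(n.name)
    n = n.next
```

After the interleaving the just-queued task has vanished from the list while its node still points into it - exactly the kind of state that later fires list_del debug checks like the "prev->next should be ..." oops in this thread.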
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 01/22/2016 10:50 AM, Brian King wrote:
> On 11/11/2015 11:05 PM, mchri...@redhat.com wrote:
>> From: Mike Christie
>>
>> This patch fixes this oops report by Guilherme Piccol:
>>
>> list_del corruption. prev->next should be c00f3da2b080, but was
>> c00f3da2c080
>
> Hi Mike! I haven't seen any follow ups on this for a while. What is the
> plan for this one?

Hey, I ended up sending Guilherme a patch for the problem I described here;
they were not hitting that problem. We checked if we were hitting a
different regression in the abort path; they were not hitting that either.
Or was still debugging it.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 11/11/2015 11:05 PM, mchri...@redhat.com wrote:
> From: Mike Christie
>
> This patch fixes this oops report by Guilherme Piccol:
>
> list_del corruption. prev->next should be c00f3da2b080, but was
> c00f3da2c080

Hi Mike! I haven't seen any follow ups on this for a while. What is the plan
for this one?

Thanks,
Brian

--
Brian King
Power Linux I/O
IBM Linux Technology Center
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 11/18/15, 5:30 AM, Or Gerlitz wrote:
> On Mon, Nov 16, 2015 at 7:30 PM, Michael Christie wrote:
>> On Nov 15, 2015, at 4:10 AM, Or Gerlitz wrote:
>>> On Fri, Nov 13, 2015 at 6:51 PM, Mike Christie wrote:
>>>> On 11/13/2015 09:06 AM, Or Gerlitz wrote:
>>>
>>> After the locking change, adding a task to any of the connection
>>> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>>>
>>> Removing tasks from any of these lists in iscsi_data_xmit is under
>>> the session forward lock and **before** calling down to the transport
>>> to handle the task.
>>>
>>> The iscsi_complete_task helper was added by Mike's commit
>>> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
>>> and is indeed typically called under the backward lock && has this section
>>>
>>> + if (!list_empty(&task->running))
>>> +     list_del_init(&task->running);
>>>
>>> which per my reading of the code never comes into play, can you comment?
>>
>> I had sent this to Sagi and your mellanox email the other day:
>> The bug occurs when a target completes a command while we are still
>> processing it. If we are doing a WRITE and the iscsi_task is on the
>> cmdqueue because we are handling a R2T.
>
> Mike, I failed to find how an iscsi_task can be added again to the
> cmdqueue list, nor how it can be added to the requeue list without the
> right locking, nor how an iscsi_task can be on either of these lists when
> iscsi_complete_task is invoked.

Are you only considering normal execution or also the cc case I mentioned in
the last mail? For normal execution we are ok.

> Specifically, my reading of the code says that these are the events over
> time:
>
> t1. queuecommand :: we put the task on the cmdqueue list (libiscsi.c:1741)
> - under fwd lock
>
> t2. iscsi_data_xmit :: we remove the task from the cmdqueue list
> (libiscsi.c:1537) - under fwd lock
>
> when the R2T flow happens, we do the following
>
> t3. iscsi_tcp_hdr_dissect --> iscsi_tcp_r2t_rsp --> iscsi_requeue_task ::
> put the task on the requeue list -- the call to iscsi_tcp_r2t_rsp is under
> the fwd lock

If the target were to send a CC at this point, we are adding the task under
the frwd lock, but the completion path would only have the back lock. The
list_empty, list_add and list_del calls then race and we could end up in a
bad state, right?

> t4. iscsi_data_xmit :: we remove the task from the requeue list
> (libiscsi.c:1578) - under fwd lock

We could also get the bad CC at this time and end up doing 2 list_dels at
the same time: the t4 one under the frwd lock and the cc handling completion
one under the back lock, like in t3.

> Do you agree to t1...t4 being what's possible for a given task? or I
> missed something?
>
>> The target shouldn't send a Check Condition at this time, but some do.
>> If that happens, then iscsi_queuecommand could be adding a new task to
>> the cmdqueue, while the recv path is handling the CC for the task with
>> the outstanding R2T. The recv path iscsi_complete_task call sees that
>> task is on the cmdqueue and deletes it from the list at the same time
>> iscsi_queuecommand is adding a new task.
>>
>> This should not happen per the iscsi spec. There is some wording about
>> waiting to finish the sequence in progress, but targets goof this up.
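[Editor's note] Or's t1..t4 timeline can be restated as executable steps. This is a userspace Python model, not kernel code: plain lists stand in for the kernel list heads, the function names are only mnemonic, and the comments note which lock each step runs under in the real driver:

```python
# Model of the t1..t4 lifecycle of a WRITE with an R2T.
cmdqueue, requeue = [], []          # ~ conn->cmdqueue / conn->requeue

def queuecommand(task):             # t1: runs under the frwd (fwd) lock
    cmdqueue.append(task)

def data_xmit_pop_cmd():            # t2: frwd lock, before the transport call
    return cmdqueue.pop(0)

def r2t_rsp(task):                  # t3: iscsi_requeue_task, frwd lock
    if task not in requeue:         # ~ the list_empty(&task->running) check
        requeue.append(task)

def data_xmit_pop_requeue():        # t4: frwd lock
    return requeue.pop(0)

task = "write_with_r2t"
queuecommand(task)
assert task in cmdqueue             # state after t1
assert data_xmit_pop_cmd() == task  # t2 pulls it off for transmission
r2t_rsp(task)
assert task in requeue              # state after t3
# A misbehaving target's Check Condition landing here would run
# iscsi_complete_task()'s list_del_init under the *back* lock while
# t3/t4 manipulate the same list under the *frwd* lock -- no common lock.
assert data_xmit_pop_requeue() == task  # t4 under normal execution
```

Under normal execution every transition is under the frwd lock, which is Or's point; the race only opens when the completion path (back lock) touches the list between t3 and t4, which is Mike's point.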
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On Mon, Nov 16, 2015 at 7:30 PM, Michael Christie wrote:
>> On Nov 15, 2015, at 4:10 AM, Or Gerlitz wrote:
>>> On Fri, Nov 13, 2015 at 6:51 PM, Mike Christie wrote:
>>>> On 11/13/2015 09:06 AM, Or Gerlitz wrote:
>>
>> After the locking change, adding a task to any of the connection
>> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>>
>> Removing tasks from any of these lists in iscsi_data_xmit is under
>> the session forward lock and **before** calling down to the transport
>> to handle the task.
>>
>> The iscsi_complete_task helper was added by Mike's commit
>> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
>> and is indeed typically called under the backward lock && has this section
>>
>> + if (!list_empty(&task->running))
>> +     list_del_init(&task->running);
>>
>> which per my reading of the code never comes into play, can you comment?
>
> I had sent this to Sagi and your mellanox email the other day:
> The bug occurs when a target completes a command while we are still
> processing it. If we are doing a WRITE and the iscsi_task
> is on the cmdqueue because we are handling a R2T.

Mike, I failed to find how an iscsi_task can be added again to the cmdqueue
list, nor how it can be added to the requeue list without the right locking,
nor how an iscsi_task can be on either of these lists when
iscsi_complete_task is invoked.

Specifically, my reading of the code says that these are the events over
time:

t1. queuecommand :: we put the task on the cmdqueue list (libiscsi.c:1741)
- under fwd lock

t2. iscsi_data_xmit :: we remove the task from the cmdqueue list
(libiscsi.c:1537) - under fwd lock

when the R2T flow happens, we do the following

t3. iscsi_tcp_hdr_dissect --> iscsi_tcp_r2t_rsp --> iscsi_requeue_task ::
put the task on the requeue list -- the call to iscsi_tcp_r2t_rsp is under
the fwd lock

t4. iscsi_data_xmit :: we remove the task from the requeue list
(libiscsi.c: 1578) - under fwd lock

Do you agree to t1...t4 being what's possible for a given task? or I missed
something?

>> The target shouldn't send a Check Condition at this time, but some do.
>> If that happens, then iscsi_queuecommand could be adding a new task to
>> the cmdqueue, while the recv path is handling the CC for the task with
>> the outstanding R2T. The recv path iscsi_complete_task call sees that
>> task is on the cmdqueue and deletes it from the list at the same time
>> iscsi_queuecommand is adding a new task.
>>
>> This should not happen per the iscsi spec. There is some wording about
>> waiting to finish the sequence in progress, but targets goof this up.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On Mon, Nov 16, 2015 at 7:30 PM, Michael Christie wrote:
>> On Nov 15, 2015, at 4:10 AM, Or Gerlitz wrote:
>> After the locking change, adding a task to any of the connection
>> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>>
>> Removing tasks from any of these lists in iscsi_data_xmit is under
>> the session forward lock and **before** calling down to the transport
>> to handle the task.
>>
>> The iscsi_complete_task helper was added by Mike's commit
>> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
>> and is indeed typically called under the backward lock && has this section
>>
>> + if (!list_empty(&task->running))
>> +         list_del_init(&task->running);
>>
>> which per my reading of the code never comes into play, can you comment?

>> The bug occurs when a target completes a command while we are still
>> processing it. Say we are doing a WRITE and the iscsi_task
>> is on the cmdqueue because we are handling an R2T. The target shouldn't
>> send a Check Condition at this time, but some do. If that happens, then
>> iscsi_queuecommand could be adding a new task to the cmdqueue, while the
>> recv path is handling the CC for the task with the outstanding R2T. The
>> recv path iscsi_complete_task call sees that the task is on the cmdqueue and
>> deletes it from the list at the same time iscsi_queuecommand is adding a new
>> task.

So we're now a bit beyond a trivial bug, and

>> This should not happen per the iscsi spec. There is some wording about
>> waiting to finish the sequence in progress, but targets goof this up.

we have target(s) that violate the spec. This is life, but it can explain
why it took 18 months to get a bug report.

Can you provide a few pointers into the relevant code pieces that one needs
to look at to realize what's going on there? Was this code added before or
after the patch?

Or.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
> On Nov 15, 2015, at 4:10 AM, Or Gerlitz wrote:
> On Fri, Nov 13, 2015 at 6:51 PM, Mike Christie wrote:
>> On 11/13/2015 09:06 AM, Or Gerlitz wrote:
>>>> The patch has caused multiple regressions, did not even compile when
>>>> sent to me, and was poorly reviewed and I have not heard from you guys
>>>> in a week. Given the issues the patch has had and the current time, I do
>>>> not feel comfortable with it anymore. I want to re-review it and fix it
>>>> up when there is more time.
>>> Mike (Hi),
>>>
>>> It's a complex patch that touches all the iscsi transports, and yes,
>>> when it was sent to you the 1st time, there was a build error on one of
>>> the offload transports (bad! but happens) and yes, as you pointed out, one
>>> static checker fix + one bug fix for it went upstream after this had
>>> been merged -- happens too.
>>
>> A patch should not cause this many issues.
>>
>>> What makes you say it was poorly reviewed?
>>
>> I just did not do a good job at looking at the patch. I should have
>> caught all of these issues.
>>
>> - The bnx2i cleanup_task bug should have been obvious, especially for me
>> because I had commented about the back lock and the abort path.
>>
>> - This oops was so basic. Incorrect locking around a linked list being
>> accessed from 2 threads is really one of those 1st year kernel
>> programmer things.
>
> Mike, Chris
>
> After the locking change, adding a task to any of the connection
> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>
> Removing tasks from any of these lists in iscsi_data_xmit is under
> the session forward lock and **before** calling down to the transport
> to handle the task.
>
> The iscsi_complete_task helper was added by Mike's commit
> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
> and is indeed typically called under the backward lock && has this section
>
> + if (!list_empty(&task->running))
> +         list_del_init(&task->running);
>
> which per my reading of the code never comes into play, can you comment?

I had sent this to Sagi and your mellanox email the other day:

> The bug occurs when a target completes a command while we are still
> processing it. Say we are doing a WRITE and the iscsi_task
> is on the cmdqueue because we are handling an R2T. The target shouldn't
> send a Check Condition at this time, but some do. If that happens, then
> iscsi_queuecommand could be adding a new task to the cmdqueue, while the
> recv path is handling the CC for the task with the outstanding R2T. The
> recv path iscsi_complete_task call sees that the task is on the cmdqueue and
> deletes it from the list at the same time iscsi_queuecommand is adding a new
> task.
>
> This should not happen per the iscsi spec. There is some wording about
> waiting to finish the sequence in progress, but targets goof this up.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On Fri, Nov 13, 2015 at 6:51 PM, Mike Christie wrote:
> On 11/13/2015 09:06 AM, Or Gerlitz wrote:
>>> The patch has caused multiple regressions, did not even compile when
>>> sent to me, and was poorly reviewed and I have not heard from you guys
>>> in a week. Given the issues the patch has had and the current time, I do
>>> not feel comfortable with it anymore. I want to re-review it and fix it
>>> up when there is more time.
>> Mike (Hi),
>>
>> It's a complex patch that touches all the iscsi transports, and yes,
>> when it was sent to you the 1st time, there was a build error on one of
>> the offload transports (bad! but happens) and yes, as you pointed out, one
>> static checker fix + one bug fix for it went upstream after this had
>> been merged -- happens too.
>
> A patch should not cause this many issues.
>
>> What makes you say it was poorly reviewed?
>
> I just did not do a good job at looking at the patch. I should have
> caught all of these issues.
>
> - The bnx2i cleanup_task bug should have been obvious, especially for me
> because I had commented about the back lock and the abort path.
>
> - This oops was so basic. Incorrect locking around a linked list being
> accessed from 2 threads is really one of those 1st year kernel
> programmer things.

Mike, Chris

After the locking change, adding a task to any of the connection
mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.

Removing tasks from any of these lists in iscsi_data_xmit is under
the session forward lock and **before** calling down to the transport
to handle the task.

The iscsi_complete_task helper was added by Mike's commit
3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
and is indeed typically called under the backward lock && has this section

+ if (!list_empty(&task->running))
+         list_del_init(&task->running);

which per my reading of the code never comes into play, can you comment?

Let's address this area before we move to the other claims made about the patch.

Again, the patch has been around for ~18 months, since 3.15, with no deep
complaints so far -- let's not jump to conclusions.

Or.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 11/13/2015 09:06 AM, Or Gerlitz wrote:
>>> The patch has caused multiple regressions, did not even compile when
>>> sent to me, and was poorly reviewed and I have not heard from you guys
>>> in a week. Given the issues the patch has had and the current time, I do
>>> not feel comfortable with it anymore. I want to re-review it and fix it
>>> up when there is more time.
> Mike (Hi),
>
> It's a complex patch that touches all the iscsi transports, and yes,
> when it was sent to you the 1st time, there was a build error on one of
> the offload transports (bad! but happens) and yes, as you pointed out, one
> static checker fix + one bug fix for it went upstream after this had
> been merged -- happens too.

A patch should not cause this many issues.

> What makes you say it was poorly reviewed?

I just did not do a good job at looking at the patch. I should have
caught all of these issues.

- The bnx2i cleanup_task bug should have been obvious, especially for me
because I had commented about the back lock and the abort path.

- This oops was so basic. Incorrect locking around a linked list being
accessed from 2 threads is really one of those 1st year kernel
programmer things.

- The iscsi_xmit_task static checker locking issue was really simple too.

- There was the issue I emailed you guys about offlist where I started to
try and fix/review it yesterday, and I found another race in the abort and
completion path that can result in a null pointer dereference.

- I have not had time to fully review it again, but I think the
reset/recovery code looks shady too.

> And now after a few years upstream a possibly real bug was found
> (happens), why rush and revert? Let's see if we can fix it.

Send patches.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On Thu, Nov 12, 2015 at 10:58 PM, Mike Christie wrote:
> On 11/12/2015 06:03 AM, Sagi Grimberg wrote:
>>> The bug is caused by this patch:
>>>
>>> 659743b02c411075b26601725947b21df0bb29c8
>>>
>>> which allowed the task lists to be manipulated under different locks
>>> in the xmit and completion path.
>>> To fix the oops this patch just reverts that patch. It also reverts
>>> these 2 patches for regressions that were also a result of that patch:
>> Whoa now Mike, this is a major change. Can we take a step back and think
>> about this for a second?
> The issue has been on the open-iscsi list for a week! You are on the
> list still right? Or is even ccd on the thread.

The email you sent me a week ago also cc-ed open-iscsi, hence a rule in my
mailer routed it to my open-iscsi subscription folder, which I don't visit
much nowadays. Only when you posted to linux-scsi did we see it, and then we
responded within a few hours.

>> Can you provide a more detailed analysis of why this list corruption is
>> triggered? What scenario was not honoring the locking policy?
> Basic locking around a linked list bug. iscsi_queuecommand adds it under
> the frwd lock and iscsi_complete_task was taking it off under the back_lock.
>> This code has been running reliably in our labs for a long time now
>> (both iser and tcp).
> The patch has caused multiple regressions, did not even compile when
> sent to me, and was poorly reviewed and I have not heard from you guys
> in a week. Given the issues the patch has had and the current time, I do
> not feel comfortable with it anymore. I want to re-review it and fix it
> up when there is more time.

Mike (Hi),

It's a complex patch that touches all the iscsi transports, and yes, when it
was sent to you the 1st time, there was a build error on one of the offload
transports (bad! but happens) and yes, as you pointed out, one static checker
fix + one bug fix for it went upstream after this had been merged -- happens
too.

What makes you say it was poorly reviewed?

And now, after a few years upstream, a possibly real bug was found (happens).
Why rush and revert? Let's see if we can fix it.

Or.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On Thu, Nov 12, 2015 at 4:03 AM, Sagi Grimberg wrote:
>
>> The bug is caused by this patch:
>>
>> 659743b02c411075b26601725947b21df0bb29c8
>>
>> which allowed the task lists to be manipulated under different locks
>> in the xmit and completion path.
>>
>> To fix the oops this patch just reverts that patch. It also reverts
>> these 2 patches for regressions that were also a result of that patch:
>
> Whoa now Mike, this is a major change. Can we take a step back and think
> about this for a second?
>
> My understanding is that the kfifo circular buffer design allows a
> writer (e.g. the producer) and a reader (e.g. the consumer) to avoid
> extra locking when accessing the kfifo buffer.

I don't think the problem is with the kfifo cmdpool, but rather with the
connection mgmtqueue/cmdqueue linked lists. iscsi_task structs are linked
in using the task->running list_head.

- Chris
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 11/12/2015 06:03 AM, Sagi Grimberg wrote:
>
>> The bug is caused by this patch:
>>
>> 659743b02c411075b26601725947b21df0bb29c8
>>
>> which allowed the task lists to be manipulated under different locks
>> in the xmit and completion path.
>>
>> To fix the oops this patch just reverts that patch. It also reverts
>> these 2 patches for regressions that were also a result of that patch:
>
> Whoa now Mike, this is a major change. Can we take a step back and think
> about this for a second?

The issue has been on the open-iscsi list for a week! You are on the
list still right? Or is even ccd on the thread.

> My understanding is that the kfifo circular buffer design allows a
> writer (e.g. the producer) and a reader (e.g. the consumer) to avoid
> extra locking when accessing the kfifo buffer.

For the next feature window I am working on a patch that makes the api
easier to use (the cleanup_task locking is bad, as can be seen from the
bnx2i regression the patch also caused) and also uses kfifos for the fast
path.

> From include/linux/kfifo.h:
> /*
>  * Note about locking : There is no locking required until only one reader
>  * and one writer is using the fifo and no kfifo_reset() will be called
>  * kfifo_reset_out() can be safely used, until it will be only called
>  * in the reader thread.
>  * For multiple writer and one reader there is only a need to lock the writer.
>  * And vice versa for only one writer and multiple reader there is only
>  * a need to lock the reader.
>  */
>
> The patch by Shlomo implements the locking policy documented above:
> - multiple readers: multiple threads entering queuecommand reading the
> kfifo (kfifo_out) are mutually excluded by the frwd_lock.
> - multiple writers: completion contexts writing to the kfifo (kfifo_in)
> are mutually excluded by the back_lock.
>
> Can you provide a more detailed analysis of why this list corruption is
> triggered? What scenario was not honoring the locking policy?

Basic locking around a linked list bug. iscsi_queuecommand adds the task
under the frwd lock and iscsi_complete_task was taking it off under the
back_lock.

> This code has been running reliably in our labs for a long time now
> (both iser and tcp).

The patch has caused multiple regressions, did not even compile when
sent to me, and was poorly reviewed and I have not heard from you guys
in a week. Given the issues the patch has had and the current time, I do
not feel comfortable with it anymore. I want to re-review it and fix it
up when there is more time.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
> The bug is caused by this patch:
>
> 659743b02c411075b26601725947b21df0bb29c8
>
> which allowed the task lists to be manipulated under different locks
> in the xmit and completion path.
>
> To fix the oops this patch just reverts that patch. It also reverts
> these 2 patches for regressions that were also a result of that patch:

Whoa now Mike, this is a major change. Can we take a step back and think
about this for a second?

My understanding is that the kfifo circular buffer design allows a
writer (e.g. the producer) and a reader (e.g. the consumer) to avoid
extra locking when accessing the kfifo buffer.

From include/linux/kfifo.h:
/*
 * Note about locking : There is no locking required until only one reader
 * and one writer is using the fifo and no kfifo_reset() will be called
 * kfifo_reset_out() can be safely used, until it will be only called
 * in the reader thread.
 * For multiple writer and one reader there is only a need to lock the writer.
 * And vice versa for only one writer and multiple reader there is only
 * a need to lock the reader.
 */

The patch by Shlomo implements the locking policy documented above:
- multiple readers: multiple threads entering queuecommand reading the
kfifo (kfifo_out) are mutually excluded by the frwd_lock.
- multiple writers: completion contexts writing to the kfifo (kfifo_in)
are mutually excluded by the back_lock.

Can you provide a more detailed analysis of why this list corruption is
triggered? What scenario was not honoring the locking policy?

This code has been running reliably in our labs for a long time now
(both iser and tcp). Going back to a single lock restores the contention
point between queuecommand and complete_pdu.

P.S. While we're on the subject, I'd like to see block tags replace the
kfifo for iscsi tasks. It's difficult because sw and offload drivers map
hosts/sessions differently. It can be done if we move the tasks to the
iscsi_host and have reserved (unique) task tags for the session
login/logout/nop_in/nop_out. With this we won't need locks around our
task arrays.

Sagi.
[PATCH 1/1] iscsi: fix regression caused by session lock patch
From: Mike Christie

This patch fixes this oops reported by Guilherme Piccoli:

list_del corruption. prev->next should be c00f3da2b080, but was c00f3da2c080
[ cut here ]
WARNING: at lib/list_debug.c:59
CPU: 48 PID: 12033 Comm: fio-2.2.7 Not tainted 3.18.22-354.el7_1.pkvm3_1_0.3600.1.ppc64le #1
task: c00f25827000 ti: c00fff5f8000 task.ti: c00f057e4000
NIP: c04d15a4 LR: c04d15a0 CTR: c04bdfd0
REGS: c00fff5faf10 TRAP: 0700 Not tainted (3.18.22-354.el7_1.pkvm3_1_0.3600.1.ppc64le)
MSR: 90029033 CR: 28082842 XER: 2000
CFAR: c0988dbc SOFTE: 1
NIP [c04d15a4] __list_del_entry+0xb4/0xe0
LR [c04d15a0] __list_del_entry+0xb0/0xe0
Call Trace:
[c00fff5fb190] [c04d15a0] __list_del_entry+0xb0/0xe0 (unreliable)
[c00fff5fb1f0] [d00010fb0c8c] iscsi_complete_task+0x8c/0x190 [libiscsi]
[c00fff5fb270] [d00010fb24c8] __iscsi_complete_pdu+0x508/0xaa0 [libiscsi]
[c00fff5fb350] [d00010fb2ab4] iscsi_complete_pdu+0x54/0xa0 [libiscsi]
[c00fff5fb3a0] [d000110121ac] iscsi_tcp_hdr_recv_done+0x38c/0x1250 [libiscsi_tcp]
[c00fff5fb480] [d00011011a48] iscsi_tcp_recv_skb+0xd8/0x4b0 [libiscsi_tcp]
[c00fff5fb570] [d00011071708] iscsi_sw_tcp_recv+0x98/0x170 [iscsi_tcp]
[c00fff5fb610] [c0882794] tcp_read_sock+0xe4/0x2d0
[c00fff5fb680] [d00011071590] iscsi_sw_tcp_data_ready+0x70/0x150 [iscsi_tcp]
[c00fff5fb720] [c08921c8] tcp_rcv_established+0x3f8/0x6f0
[c00fff5fb780] [c089d75c] tcp_v4_do_rcv+0x19c/0x420
[c00fff5fb7e0] [c08a15bc] tcp_v4_rcv+0xa6c/0xb20
[c00fff5fb8c0] [c08700f8] ip_local_deliver_finish+0x178/0x350
[c00fff5fb910] [c0870424] ip_rcv_finish+0x154/0x420
[c00fff5fb990] [c0816bf4] __netif_receive_skb_core+0x7a4/0x9c0
[c00fff5fba80] [c081a0fc] netif_receive_skb_internal+0x4c/0xf0
[c00fff5fbac0] [c081b17c] napi_gro_receive+0xfc/0x1a0
[c00fff5fbb00] [d8373280] bnx2x_rx_int+0xb30/0x1650 [bnx2x]
[c00fff5fbc90] [d8374440] bnx2x_poll+0x310/0x440 [bnx2x]
[c00fff5fbd40] [c081a8e8] net_rx_action+0x2d8/0x3e0
[c00fff5fbe10] [c00cfc44] __do_softirq+0x174/0x3a0
[c00fff5fbf00] [c00d01f8] irq_exit+0xc8/0x100
[c00fff5fbf20] [c0010d40] __do_irq+0xa0/0x1c0
[c00fff5fbf90] [c0024aac] call_do_irq+0x14/0x24
[c00f057e7dd0] [c0010f00] do_IRQ+0xa0/0x120
[c00f057e7e30] [c00025e8] hardware_interrupt_common+0x168/0x180
Instruction dump:
0fe0 4bd8 3c62ff91 386304c0 484b77ad 6000 0fe0 4bc0 3c62ff91 38630480 484b7795 6000 <0fe0> 4ba8 3c62ff91 7d254b78
---[ end trace 005d2749e6bab12f ]---

The bug is caused by this patch:

659743b02c411075b26601725947b21df0bb29c8

which allowed the task lists to be manipulated under different locks
in the xmit and completion path.

To fix the oops this patch just reverts that patch. It also reverts
these 2 patches for regressions that were also a result of that patch:

72b9740201d5f0e24b0b8326a4949786a30ff628
35843048e7e979df3b7b9f2ad49e21797a11386b

Cc: Guilherme Piccoli
Signed-off-by: Mike Christie
---
 drivers/scsi/be2iscsi/be_main.c  |  26 ++---
 drivers/scsi/bnx2i/bnx2i_hwi.c   |  46 -
 drivers/scsi/bnx2i/bnx2i_iscsi.c |  10 +-
 drivers/scsi/iscsi_tcp.c         |  22 ++--
 drivers/scsi/libiscsi.c          | 214 +--
 drivers/scsi/libiscsi_tcp.c      |  28 ++---
 drivers/scsi/qla4xxx/ql4_isr.c   |   4 +-
 include/scsi/libiscsi.h          |  17 +---
 include/scsi/libiscsi_tcp.h      |   2 -
 9 files changed, 161 insertions(+), 208 deletions(-)

diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c
index a4a5d6d..987cd60 100644
--- a/drivers/scsi/be2iscsi/be_main.c
+++ b/drivers/scsi/be2iscsi/be_main.c
@@ -232,20 +232,20 @@ static int beiscsi_eh_abort(struct scsi_cmnd *sc)
 	cls_session = starget_to_session(scsi_target(sc->device));
 	session = cls_session->dd_data;
-	spin_lock_bh(&session->frwd_lock);
+	spin_lock_bh(&session->lock);
 	if (!aborted_task || !aborted_task->sc) {
 		/* we raced */
-		spin_unlock_bh(&session->frwd_lock);
+		spin_unlock_bh(&session->lock);
 		return SUCCESS;
 	}
 	aborted_io_task = aborted_task->dd_data;
 	if (!aborted_io_task->scsi_cmnd) {
 		/* raced or invalid command */
-		spin_unlock_bh(&session->frwd_lock);
+		spin_unlock_bh(&session->lock);
 		return SUCCESS;
 	}
-	spin_unlock_bh(&session->frwd_lock);
+	spin_unlock_bh(&session->lock);
 	/* Invalidate WRB Posted for this Task */
 	AMAP_SET_BITS(struct amap_iscsi_wrb, invld,
 		      aborted_io_task->pwrb_handle->pwrb,
@@ -310,9 +310,9 @@ static int beiscsi_eh_d