Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
Hey Chris and Guilherme, I'm indeed not responsive under this email address.

>> Thanks for the testing, looks like you have the magic target to reproduce
>> this.
>>
>> I think this verifies Mike's idea of what was going wrong, and we're way
>> overdue to get this fixed upstream. Thanks to IBM for pushing this, I don't
>> think any major distro is shipping this patch and we don't want to keep
>> having to back it out.
>>
>> The options look like
>> 1) back out the session lock changes that split it into two locks
>> 2) add in the additional locking from this test patch
>> 3) some other fix for the issue of targets that complete tasks oddly
>>
>> I'm leaning to #1, as I don't want to keep adding more locks for this.
>
> Thanks Chris! IIRC, the lock changes from Shlomo/Or are not on RHEL, SLES
> and Ubuntu anymore, as you mentioned. We requested them to revert the
> patch, and it was accepted.
>
> On the other hand, your patch is great and a cool fix for this. If we have
> any good numbers and/or reasons to keep their patch, I guess alternative #2
> is cool too. I can perform more testing if you plan to send this (or a
> similar) patch to the iscsi list.
>
>> Sagi, Or, Shlomo? You pushed to keep this from being backed out before.
>> Here's your cause, any better ideas on fixing it? I also tried to go back
>> in the mailing list archives, but I don't see any real numbers for the
>> performance gains.
>
> I'll loop Sagi here based on the email I see he's using on the NVMe list
> currently - it seems to be different from the one shown in the header of
> this message.

IIRC, this was brought up more than two years ago? It's been a while now.

The motivation for the fine-grained locking from Shlomo was to address the
submission/completion inter-locking scheme that was not needed for iser. In
iser, task completions are triggered from soft-irq only for task responses;
the data transfer is driven in HW, so we don't need the inter-locking
between submissions and task management or error handling.

My recollection is that this scheme solved a contention point we had back
then; if I'm not mistaken it was as much as a 50% improvement in IOPs
scalability in some scenarios. Now, this was all pre block-mq.

So I think the correct solution for iscsi (iser, tcp and offloads) is to use
block-mq facilities for task pre-allocations (the scsi host tagset) and have
iscsi tcp take care of its own locking instead of imposing it inherently in
libiscsi. We can have LOGIN, LOGOUT, NOOP_OUT, TEXT and TMR as reserved
tags, and queue_depth with max session cmds.

I had a prototype for that back when I experimented with the scsi-mq
conversion (way back...), but kinda got stuck trying to figure out how to
convert the offload drivers qla4xxx, bnx2i and cxgbi, which seemed to rely
heavily on the task pools.

If people are more interested in improving iscsi locking schemes we can
discuss approaches for it.
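[Editor's note] The reserved-tag scheme described above can be sketched in userspace. This is a hedged illustration in Python, not the blk-mq API or any proposed kernel code: the `TagSet` class and all names are hypothetical. The idea is that a handful of tags are set aside for LOGIN/LOGOUT/NOOP_OUT/TEXT/TMR PDUs so management traffic can still allocate a task even when the SCSI command pool (sized by the session queue depth) is exhausted:

```python
# Userspace model (hypothetical names, not kernel code) of reserved tags:
# low tags are held back for management PDUs, the rest map to SCSI commands.
RESERVED_PDUS = ["LOGIN", "LOGOUT", "NOOP_OUT", "TEXT", "TMR"]

class TagSet:
    def __init__(self, queue_depth):
        self.nr_reserved = len(RESERVED_PDUS)
        self.free_reserved = set(range(self.nr_reserved))
        self.free_normal = set(range(self.nr_reserved,
                                     self.nr_reserved + queue_depth))

    def get_tag(self, reserved=False):
        pool = self.free_reserved if reserved else self.free_normal
        if not pool:
            return None  # caller must back off and retry later
        return pool.pop()

    def put_tag(self, tag):
        if tag < self.nr_reserved:
            self.free_reserved.add(tag)
        else:
            self.free_normal.add(tag)

tags = TagSet(queue_depth=128)
nop = tags.get_tag(reserved=True)              # NOOP_OUT ping
cmds = [tags.get_tag() for _ in range(128)]    # fill the data path
```

With the data-path pool exhausted, a management PDU can still get a tag, which is exactly why the mgmtqueue/cmdqueue lists (and their locking) could go away under such a scheme.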
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 06/02/2017 15:27, Chris Leech wrote:
> ----- Original Message -----
>> On 09/11/2016 03:21, Chris Leech wrote:
>>> On Mon, Nov 07, 2016 at 04:23:10PM -0200, Guilherme G. Piccoli wrote:
>>>>
>>>> Sure! Count on us to test any patches. I guess the first step is to
>>>> reproduce on upstream right? We haven't tested specifically this
>>>> scenario for long time. Will try to reproduce on 4.9-rc4 and update here.
>>>
>>> Great, I'm looking forward to hearing the result.
>>>
>>> Assuming it reproduces, I don't think this level of fine grained locking
>>> is necessarily the best solution, but it may help confirm the problem.
>>> Especially if the WARN_ONCE I slipped in here triggers.
>>
>> Chris, sorry for my huge delay.
>> Finally I was able to perform tests and I have good news - seems your
>> patch below fixed the issue.
>
> Thanks for the testing, looks like you have the magic target to reproduce
> this.
>
> I think this verifies Mike's idea of what was going wrong, and we're way
> overdue to get this fixed upstream. Thanks to IBM for pushing this, I don't
> think any major distro is shipping this patch and we don't want to keep
> having to back it out.
>
> The options look like
> 1) back out the session lock changes that split it into two locks
> 2) add in the additional locking from this test patch
> 3) some other fix for the issue of targets that complete tasks oddly
>
> I'm leaning to #1, as I don't want to keep adding more locks for this.

Thanks Chris! IIRC, the lock changes from Shlomo/Or are not on RHEL, SLES
and Ubuntu anymore, as you mentioned. We requested them to revert the patch,
and it was accepted.

On the other hand, your patch is great and a cool fix for this. If we have
any good numbers and/or reasons to keep their patch, I guess alternative #2
is cool too. I can perform more testing if you plan to send this (or a
similar) patch to the iscsi list.

> Sagi, Or, Shlomo? You pushed to keep this from being backed out before.
> Here's your cause, any better ideas on fixing it?
> I also tried to go back in the mailing list archives, but I don't see any
> real numbers for the performance gains.

I'll loop Sagi here based on the email I see he's using on the NVMe list
currently - it seems to be different from the one shown in the header of
this message.

Thanks,
Guilherme

> - Chris
>
>> Firstly, I was able to reproduce with kernel 4.10-rc6. See the file
>> repro.out - it's a dump from xmon, the kernel debugger from PowerPC.
>> With this tool we can dump the exception details, registers, PACA
>> (Processor Address Communication Area, ppc specific structure) and
>> dmesg. It took me less than 15 minutes to reproduce.
>>
>> Then, I applied your patch on the top of this kernel and the benchmark
>> was able to successfully complete, in about 3 hours. We can see the
>> WARN() you added was reached; the attached file dmesg-cleech_patch shows
>> the kernel log with your patch.
>>
>> The workload was the FIO benchmark doing both reads and writes to the
>> remote storage via iSCSI, connected over ethernet direct cabling (10Gb
>> speed). Distro was Ubuntu 16.04.1.
>>
>> Any more tests or info you need, please let me know!
>> Cheers,
>>
>> Guilherme
>>
>>> - Chris
>>>
>>> ---
>>>
>>> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
>>> index f9b6fba..fbd18ab 100644
>>> --- a/drivers/scsi/libiscsi.c
>>> +++ b/drivers/scsi/libiscsi.c
>>> @@ -560,8 +560,12 @@ static void iscsi_complete_task(struct iscsi_task *task, int state)
>>>  	WARN_ON_ONCE(task->state == ISCSI_TASK_FREE);
>>>  	task->state = state;
>>>
>>> -	if (!list_empty(&task->running))
>>> +	spin_lock_bh(&conn->taskqueuelock);
>>> +	if (!list_empty(&task->running)) {
>>> +		WARN_ONCE(1, "iscsi_complete_task while task on list");
>>>  		list_del_init(&task->running);
>>> +	}
>>> +	spin_unlock_bh(&conn->taskqueuelock);
>>>
>>>  	if (conn->task == task)
>>>  		conn->task = NULL;
>>> @@ -783,7 +787,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
>>>  		if (session->tt->xmit_task(task))
>>>  			goto free_task;
>>>  	} else {
>>> +		spin_lock_bh(&conn->taskqueuelock);
>>>  		list_add_tail(&task->running, &conn->mgmtqueue);
>>> +		spin_unlock_bh(&conn->taskqueuelock);
>>>  		iscsi_conn_queue_work(conn);
>>>  	}
>>> @@ -1474,8 +1480,10 @@ void iscsi_requeue_task(struct iscsi_task *task)
>>>  	 * this may be on the requeue list already if the xmit_task callout
>>>  	 * is handling the r2ts while we are adding new ones
>>>  	 */
>>> +	spin_lock_bh(&conn->taskqueuelock);
>>>  	if (list_empty(&task->running))
>>>  		list_add_tail(&task->running, &conn->requeue);
>>> +	spin_unlock_bh(&conn->taskqueuelock);
>>>  	iscsi_conn_queue_work(conn);
>>>  }
>>>  EXPORT_SYMBOL_GPL(iscsi_requeue_task);
>>> @@ -1512,22 +1520,26 @@ static int iscsi_data_xmit(struct iscsi_conn
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
----- Original Message -----
> On 09/11/2016 03:21, Chris Leech wrote:
> > On Mon, Nov 07, 2016 at 04:23:10PM -0200, Guilherme G. Piccoli wrote:
> >>
> >> Sure! Count on us to test any patches. I guess the first step is to
> >> reproduce on upstream right? We haven't tested specifically this
> >> scenario for long time. Will try to reproduce on 4.9-rc4 and update here.
> >
> > Great, I'm looking forward to hearing the result.
> >
> > Assuming it reproduces, I don't think this level of fine grained locking
> > is necessarily the best solution, but it may help confirm the problem.
> > Especially if the WARN_ONCE I slipped in here triggers.
>
> Chris, sorry for my huge delay.
> Finally I was able to perform tests and I have good news - seems your
> patch below fixed the issue.

Thanks for the testing, looks like you have the magic target to reproduce
this.

I think this verifies Mike's idea of what was going wrong, and we're way
overdue to get this fixed upstream. Thanks to IBM for pushing this; I don't
think any major distro is shipping this patch and we don't want to keep
having to back it out.

The options look like
1) back out the session lock changes that split it into two locks
2) add in the additional locking from this test patch
3) some other fix for the issue of targets that complete tasks oddly

I'm leaning to #1, as I don't want to keep adding more locks for this.

Sagi, Or, Shlomo? You pushed to keep this from being backed out before.
Here's your cause, any better ideas on fixing it? I also tried to go back in
the mailing list archives, but I don't see any real numbers for the
performance gains.

- Chris

> Firstly, I was able to reproduce with kernel 4.10-rc6. See the file
> repro.out - it's a dump from xmon, the kernel debugger from PowerPC.
> With this tool we can dump the exception details, registers, PACA
> (Processor Address Communication Area, ppc specific structure) and
> dmesg. It took me less than 15 minutes to reproduce.
> Then, I applied your patch on the top of this kernel and the benchmark
> was able to successfully complete, in about 3 hours. We can see the
> WARN() you added was reached, the attached file dmesg-cleech_patch shows
> the kernel log with your patch.
>
> The workload was FIO benchmark doing both reads and writes to the remote
> storage via iSCSI, connected over ethernet direct cabling (10Gb speed).
> Distro was Ubuntu 16.04.1 .
>
> Any more tests or info you need, please let me know!
> Cheers,
>
> Guilherme
>
> > - Chris
> >
> > ---
> >
> > diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> > index f9b6fba..fbd18ab 100644
> > --- a/drivers/scsi/libiscsi.c
> > +++ b/drivers/scsi/libiscsi.c
> > @@ -560,8 +560,12 @@ static void iscsi_complete_task(struct iscsi_task *task, int state)
> >  	WARN_ON_ONCE(task->state == ISCSI_TASK_FREE);
> >  	task->state = state;
> >
> > -	if (!list_empty(&task->running))
> > +	spin_lock_bh(&conn->taskqueuelock);
> > +	if (!list_empty(&task->running)) {
> > +		WARN_ONCE(1, "iscsi_complete_task while task on list");
> >  		list_del_init(&task->running);
> > +	}
> > +	spin_unlock_bh(&conn->taskqueuelock);
> >
> >  	if (conn->task == task)
> >  		conn->task = NULL;
> > @@ -783,7 +787,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
> >  		if (session->tt->xmit_task(task))
> >  			goto free_task;
> >  	} else {
> > +		spin_lock_bh(&conn->taskqueuelock);
> >  		list_add_tail(&task->running, &conn->mgmtqueue);
> > +		spin_unlock_bh(&conn->taskqueuelock);
> >  		iscsi_conn_queue_work(conn);
> >  	}
> > @@ -1474,8 +1480,10 @@ void iscsi_requeue_task(struct iscsi_task *task)
> >  	 * this may be on the requeue list already if the xmit_task callout
> >  	 * is handling the r2ts while we are adding new ones
> >  	 */
> > +	spin_lock_bh(&conn->taskqueuelock);
> >  	if (list_empty(&task->running))
> >  		list_add_tail(&task->running, &conn->requeue);
> > +	spin_unlock_bh(&conn->taskqueuelock);
> >  	iscsi_conn_queue_work(conn);
> >  }
> >  EXPORT_SYMBOL_GPL(iscsi_requeue_task);
> > @@ -1512,22 +1520,26 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
> >  	 * only have one nop-out as a ping from us and targets should not
> >  	 * overflow us with nop-ins
> >  	 */
> > +	spin_lock_bh(&conn->taskqueuelock);
> >  check_mgmt:
> >  	while (!list_empty(&conn->mgmtqueue)) {
> >  		conn->task = list_entry(conn->mgmtqueue.next,
> >  					struct iscsi_task, running);
> >  		list_del_init(&conn->task->running);
> > +		spin_unlock_bh(&conn->taskqueuelock);
> >  		if (iscsi_prep_mgmt_task(conn, conn->task)) {
> >  			/* regular RX path uses back_lock */
> >  			spin_lock_bh(&conn->session->back_lock);
> >  			__iscsi_put_ta
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 11/09/2016 03:21 AM, Chris Leech wrote:
> On Mon, Nov 07, 2016 at 04:23:10PM -0200, Guilherme G. Piccoli wrote:
>>
>> Sure! Count on us to test any patches. I guess the first step is to
>> reproduce on upstream right? We haven't tested specifically this
>> scenario for long time. Will try to reproduce on 4.9-rc4 and update here.
>
> Great, I'm looking forward to hearing the result.
>
> Assuming it reproduces, I don't think this level of fine grained locking
> is necessarily the best solution, but it may help confirm the problem.
> Especially if the WARN_ONCE I slipped in here triggers.
>
> - Chris

Chris, I'm not able to reproduce right now - environment was misconfigured,
and now I'm leaving the office for 20 days. Will test as soon as I'm back!

Thanks,
Guilherme

> ---
>
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index f9b6fba..fbd18ab 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -560,8 +560,12 @@ static void iscsi_complete_task(struct iscsi_task *task, int state)
>  	WARN_ON_ONCE(task->state == ISCSI_TASK_FREE);
>  	task->state = state;
>
> -	if (!list_empty(&task->running))
> +	spin_lock_bh(&conn->taskqueuelock);
> +	if (!list_empty(&task->running)) {
> +		WARN_ONCE(1, "iscsi_complete_task while task on list");
>  		list_del_init(&task->running);
> +	}
> +	spin_unlock_bh(&conn->taskqueuelock);
>
>  	if (conn->task == task)
>  		conn->task = NULL;
> @@ -783,7 +787,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
>  		if (session->tt->xmit_task(task))
>  			goto free_task;
>  	} else {
> +		spin_lock_bh(&conn->taskqueuelock);
>  		list_add_tail(&task->running, &conn->mgmtqueue);
> +		spin_unlock_bh(&conn->taskqueuelock);
>  		iscsi_conn_queue_work(conn);
>  	}
> @@ -1474,8 +1480,10 @@ void iscsi_requeue_task(struct iscsi_task *task)
>  	 * this may be on the requeue list already if the xmit_task callout
>  	 * is handling the r2ts while we are adding new ones
>  	 */
> +	spin_lock_bh(&conn->taskqueuelock);
>  	if (list_empty(&task->running))
>  		list_add_tail(&task->running, &conn->requeue);
> +	spin_unlock_bh(&conn->taskqueuelock);
>  	iscsi_conn_queue_work(conn);
>  }
>  EXPORT_SYMBOL_GPL(iscsi_requeue_task);
> @@ -1512,22 +1520,26 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
>  	 * only have one nop-out as a ping from us and targets should not
>  	 * overflow us with nop-ins
>  	 */
> +	spin_lock_bh(&conn->taskqueuelock);
>  check_mgmt:
>  	while (!list_empty(&conn->mgmtqueue)) {
>  		conn->task = list_entry(conn->mgmtqueue.next,
>  					struct iscsi_task, running);
>  		list_del_init(&conn->task->running);
> +		spin_unlock_bh(&conn->taskqueuelock);
>  		if (iscsi_prep_mgmt_task(conn, conn->task)) {
>  			/* regular RX path uses back_lock */
>  			spin_lock_bh(&conn->session->back_lock);
>  			__iscsi_put_task(conn->task);
>  			spin_unlock_bh(&conn->session->back_lock);
>  			conn->task = NULL;
> +			spin_lock_bh(&conn->taskqueuelock);
>  			continue;
>  		}
>  		rc = iscsi_xmit_task(conn);
>  		if (rc)
>  			goto done;
> +		spin_lock_bh(&conn->taskqueuelock);
>  	}
>
>  	/* process pending command queue */
> @@ -1535,19 +1547,24 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
>  		conn->task = list_entry(conn->cmdqueue.next, struct iscsi_task,
>  					running);
>  		list_del_init(&conn->task->running);
> +		spin_unlock_bh(&conn->taskqueuelock);
>  		if (conn->session->state == ISCSI_STATE_LOGGING_OUT) {
>  			fail_scsi_task(conn->task, DID_IMM_RETRY);
> +			spin_lock_bh(&conn->taskqueuelock);
>  			continue;
>  		}
>  		rc = iscsi_prep_scsi_cmd_pdu(conn->task);
>  		if (rc) {
>  			if (rc == -ENOMEM || rc == -EACCES) {
> +				spin_lock_bh(&conn->taskqueuelock);
>  				list_add_tail(&conn->task->running,
>  					      &conn->cmdqueue);
>  				conn->task = NULL;
> +				spin_unlock_bh(&conn->taskqueuelock);
>  				goto done;
>  			} else
>  				fail_scsi_task(conn->task, DID_ABORT);
> +			spin_lock_bh(&conn->taskqueuelock);
>  			continue;
>  		}
>  		rc = iscsi_xmit_task(conn);
>
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On Mon, Nov 07, 2016 at 04:23:10PM -0200, Guilherme G. Piccoli wrote:
>
> Sure! Count on us to test any patches. I guess the first step is to
> reproduce on upstream right? We haven't tested specifically this
> scenario for long time. Will try to reproduce on 4.9-rc4 and update here.

Great, I'm looking forward to hearing the result.

Assuming it reproduces, I don't think this level of fine grained locking
is necessarily the best solution, but it may help confirm the problem.
Especially if the WARN_ONCE I slipped in here triggers.

- Chris

---

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index f9b6fba..fbd18ab 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -560,8 +560,12 @@ static void iscsi_complete_task(struct iscsi_task *task, int state)
 	WARN_ON_ONCE(task->state == ISCSI_TASK_FREE);
 	task->state = state;
 
-	if (!list_empty(&task->running))
+	spin_lock_bh(&conn->taskqueuelock);
+	if (!list_empty(&task->running)) {
+		WARN_ONCE(1, "iscsi_complete_task while task on list");
 		list_del_init(&task->running);
+	}
+	spin_unlock_bh(&conn->taskqueuelock);
 
 	if (conn->task == task)
 		conn->task = NULL;
@@ -783,7 +787,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
 		if (session->tt->xmit_task(task))
 			goto free_task;
 	} else {
+		spin_lock_bh(&conn->taskqueuelock);
 		list_add_tail(&task->running, &conn->mgmtqueue);
+		spin_unlock_bh(&conn->taskqueuelock);
 		iscsi_conn_queue_work(conn);
 	}
@@ -1474,8 +1480,10 @@ void iscsi_requeue_task(struct iscsi_task *task)
 	 * this may be on the requeue list already if the xmit_task callout
 	 * is handling the r2ts while we are adding new ones
 	 */
+	spin_lock_bh(&conn->taskqueuelock);
 	if (list_empty(&task->running))
 		list_add_tail(&task->running, &conn->requeue);
+	spin_unlock_bh(&conn->taskqueuelock);
 	iscsi_conn_queue_work(conn);
 }
 EXPORT_SYMBOL_GPL(iscsi_requeue_task);
@@ -1512,22 +1520,26 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
 	 * only have one nop-out as a ping from us and targets should not
 	 * overflow us with nop-ins
 	 */
+	spin_lock_bh(&conn->taskqueuelock);
 check_mgmt:
 	while (!list_empty(&conn->mgmtqueue)) {
 		conn->task = list_entry(conn->mgmtqueue.next,
 					struct iscsi_task, running);
 		list_del_init(&conn->task->running);
+		spin_unlock_bh(&conn->taskqueuelock);
 		if (iscsi_prep_mgmt_task(conn, conn->task)) {
 			/* regular RX path uses back_lock */
 			spin_lock_bh(&conn->session->back_lock);
 			__iscsi_put_task(conn->task);
 			spin_unlock_bh(&conn->session->back_lock);
 			conn->task = NULL;
+			spin_lock_bh(&conn->taskqueuelock);
 			continue;
 		}
 		rc = iscsi_xmit_task(conn);
 		if (rc)
 			goto done;
+		spin_lock_bh(&conn->taskqueuelock);
 	}
 
 	/* process pending command queue */
@@ -1535,19 +1547,24 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
 		conn->task = list_entry(conn->cmdqueue.next, struct iscsi_task,
 					running);
 		list_del_init(&conn->task->running);
+		spin_unlock_bh(&conn->taskqueuelock);
 		if (conn->session->state == ISCSI_STATE_LOGGING_OUT) {
 			fail_scsi_task(conn->task, DID_IMM_RETRY);
+			spin_lock_bh(&conn->taskqueuelock);
 			continue;
 		}
 		rc = iscsi_prep_scsi_cmd_pdu(conn->task);
 		if (rc) {
 			if (rc == -ENOMEM || rc == -EACCES) {
+				spin_lock_bh(&conn->taskqueuelock);
 				list_add_tail(&conn->task->running,
 					      &conn->cmdqueue);
 				conn->task = NULL;
+				spin_unlock_bh(&conn->taskqueuelock);
 				goto done;
 			} else
 				fail_scsi_task(conn->task, DID_ABORT);
+			spin_lock_bh(&conn->taskqueuelock);
 			continue;
 		}
 		rc = iscsi_xmit_task(conn);
@@ -1558,6 +1575,7 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
 		 * we need to check the mgmt queue for nops that need to
 		 * be sent to aviod starvation
 		 */
+		spin_lock_bh(&conn->taskqueuelock);
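[Editor's note] The locking discipline the test patch applies in iscsi_data_xmit - hold conn->taskqueuelock only across the list manipulation, drop it before calling into the transport, retake it before touching the list again - can be modelled in a few lines. This is a userspace Python sketch of the pattern, not the kernel code; the names mirror the patch but the implementation is illustrative:

```python
import threading
from collections import deque

taskqueuelock = threading.Lock()   # stands in for conn->taskqueuelock
mgmtqueue = deque()                # stands in for conn->mgmtqueue
xmitted = []
lock_held_during_xmit = []

def xmit_task(task):
    # transport callout: may sleep or take other locks, so the
    # queue lock must NOT be held here (recorded for verification)
    lock_held_during_xmit.append(taskqueuelock.locked())
    xmitted.append(task)

def data_xmit():
    taskqueuelock.acquire()
    while mgmtqueue:
        task = mgmtqueue.popleft()  # ~ list_del_init(&task->running)
        taskqueuelock.release()     # drop before calling the transport
        xmit_task(task)
        taskqueuelock.acquire()     # retake before touching the list
    taskqueuelock.release()

with taskqueuelock:                 # ~ __iscsi_conn_send_pdu queueing
    mgmtqueue.extend(["nop_out", "text", "login"])
data_xmit()
```

The drop/retake dance is exactly why option #2 adds so many lock calls to the xmit path, which is Chris's stated reason for preferring option #1.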
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 11/07/2016 04:15 PM, Chris Leech wrote:
> Hi,
>
> I'm kicking this old thread because I don't think this ever got
> resolved. I wish I had more info, but it seems to involve target
> specific behavior that hasn't come up in our test labs.

Thanks very much for reopening this thread! We have Or's patch reverted in
multiple distros, so the issue is not likely to happen to IBM's customers,
but the upstream kernel never saw a proper fix for this.

> So what can I do at this point to help resolve this?
>
> On Sun, Nov 15, 2015 at 12:10:48PM +0200, Or Gerlitz wrote:
>> Mike, Chris
>>
>> After the locking change, adding a task to any of the connection
>> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>>
>> Removing tasks from any of these lists in iscsi_data_xmit is under
>> the session forward lock and **before** calling down to the transport
>> to handle the task.
>>
>> The iscsi_complete_task helper was added by Mike's commit
>> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
>> and is indeed typically called under the backward lock && has this section
>>
>> + if (!list_empty(&task->running))
>> +     list_del_init(&task->running);
>>
>> which per my reading of the code never comes into play, can you comment?
>>
>> Lets address this area before we move to the others claims made on the patch.
>
> This bit in particular is where I see a cause for concern. If that
> list_del_init call ever races against other list operations, there's a
> potential corruption. It's presumably there for a reason, and Mike
> explained a case where some targets have been known to send a check
> condition at unexpected times that would hit it.
>
> I don't like having known list locking violations hanging around, based
> on an expectation that we'll never hit that path with well behaved
> targets.
>
> If we can get a fix worked up for the list locking here, can we get any
> testing on it from the original reporters at IBM? That was very helpful
> in testing a full reversion patch.

Sure! Count on us to test any patches. I guess the first step is to
reproduce on upstream right? We haven't tested specifically this scenario
for long time. Will try to reproduce on 4.9-rc4 and update here.

Cheers,
Guilherme

> - Chris Leech

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
Hi,

I'm kicking this old thread because I don't think this ever got resolved.
I wish I had more info, but it seems to involve target specific behavior
that hasn't come up in our test labs.

So what can I do at this point to help resolve this?

On Sun, Nov 15, 2015 at 12:10:48PM +0200, Or Gerlitz wrote:
> Mike, Chris
>
> After the locking change, adding a task to any of the connection
> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>
> Removing tasks from any of these lists in iscsi_data_xmit is under
> the session forward lock and **before** calling down to the transport
> to handle the task.
>
> The iscsi_complete_task helper was added by Mike's commit
> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
> and is indeed typically called under the backward lock && has this section
>
> + if (!list_empty(&task->running))
> +     list_del_init(&task->running);
>
> which per my reading of the code never comes into play, can you comment?
>
> Lets address this area before we move to the others claims made on the patch.

This bit in particular is where I see a cause for concern. If that
list_del_init call ever races against other list operations, there's a
potential corruption. It's presumably there for a reason, and Mike explained
a case where some targets have been known to send a check condition at
unexpected times that would hit it.

I don't like having known list locking violations hanging around, based on
an expectation that we'll never hit that path with well behaved targets.

If we can get a fix worked up for the list locking here, can we get any
testing on it from the original reporters at IBM? That was very helpful in
testing a full reversion patch.

- Chris Leech
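[Editor's note] The corruption Chris describes can be made concrete with a userspace model of the kernel's list.h doubly-linked list. This Python sketch is an illustration of the interleaving, not kernel code: one context runs iscsi_complete_task()'s list_del_init under only the back lock while another adds to the same cmdqueue under the frwd lock, so nothing serializes the two. Splitting the delete at its pointer reads shows how the stale writes silently empty the list:

```python
# Userspace model of include/linux/list.h semantics (illustrative only).
class Node:
    def __init__(self, name):
        self.name = name
        self.prev = self.next = self

def list_add_tail(node, head):
    prev, nxt = head.prev, head       # ~ __list_add(node, head->prev, head)
    nxt.prev = node
    node.next = nxt
    node.prev = prev
    prev.next = node

def list_del_init_steps(node):
    # list_del_init() split at its pointer reads so another context
    # can be interleaved between them, as an unserialized race allows
    prev, nxt = node.prev, node.next
    yield                             # <-- other CPU runs here
    nxt.prev = prev
    prev.next = nxt
    node.prev = node.next = node

cmdqueue = Node("cmdqueue")           # ~ conn->cmdqueue list head
old = Node("task_with_r2t")
list_add_tail(old, cmdqueue)

back_path = list_del_init_steps(old)  # completion path, back lock only
next(back_path)                       # it has read old's neighbours...
new = Node("new_task")
list_add_tail(new, cmdqueue)          # ...queuecommand adds under frwd lock
for _ in back_path:                   # ...then the stale writes land
    pass

visible = []
n = cmdqueue.next
while n is not cmdqueue:              # walk what the xmit path would see
    visible.append(n.name)
    n = n.next
```

After the interleaving the just-queued task has vanished from the list while its node still points into it - exactly the kind of state that later fires list_del debug checks like the "prev->next should be ..." oops in this thread.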
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 01/22/2016 10:50 AM, Brian King wrote:
> On 11/11/2015 11:05 PM, mchri...@redhat.com wrote:
>> From: Mike Christie
>>
>> This patch fixes this oops report by Guilherme Piccol:
>>
>> list_del corruption. prev->next should be c00f3da2b080, but was
>> c00f3da2c080
>
> Hi Mike! I haven't seen any follow ups on this for a while. What is the
> plan for this one?

Hey, I ended up sending Guilherme a patch for the problem I described here;
they were not hitting that problem. We checked if we were hitting a
different regression in the abort path; they were not hitting that either.
Or was still debugging it.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 11/11/2015 11:05 PM, mchri...@redhat.com wrote:
> From: Mike Christie
>
> This patch fixes this oops report by Guilherme Piccol:
>
> list_del corruption. prev->next should be c00f3da2b080, but was
> c00f3da2c080

Hi Mike! I haven't seen any follow ups on this for a while. What is the plan
for this one?

Thanks,
Brian

--
Brian King
Power Linux I/O
IBM Linux Technology Center
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 11/18/15, 5:30 AM, Or Gerlitz wrote:
> On Mon, Nov 16, 2015 at 7:30 PM, Michael Christie wrote:
>> On Nov 15, 2015, at 4:10 AM, Or Gerlitz wrote:
>>> On Fri, Nov 13, 2015 at 6:51 PM, Mike Christie wrote:
>>>> On 11/13/2015 09:06 AM, Or Gerlitz wrote:
>>>
>>> After the locking change, adding a task to any of the connection
>>> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>>>
>>> Removing tasks from any of these lists in iscsi_data_xmit is under
>>> the session forward lock and **before** calling down to the transport
>>> to handle the task.
>>>
>>> The iscsi_complete_task helper was added by Mike's commit
>>> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
>>> and is indeed typically called under the backward lock && has this section
>>>
>>> + if (!list_empty(&task->running))
>>> +     list_del_init(&task->running);
>>>
>>> which per my reading of the code never comes into play, can you comment?
>>
>> I had sent this to Sagi and your mellanox email the other day:
>> The bug occurs when a target completes a command while we are still
>> processing it. If we are doing a WRITE and the iscsi_task is on the
>> cmdqueue because we are handling a R2T.
>
> Mike, I failed to find how an iscsi_task can be added again to the
> cmdqueue list, nor how it can be added to the requeue list without the
> right locking, nor how an iscsi_task can be on either of these lists when
> iscsi_complete_task is invoked.

Are you only considering normal execution or also the cc case I mentioned in
the last mail? For normal execution we are ok.

> Specifically, my reading of the code says that these are the events over
> time:
>
> t1. queuecommand :: we put the task on the cmdqueue list (libiscsi.c:1741)
> - under fwd lock
>
> t2. iscsi_data_xmit :: we remove the task from the cmdqueue list
> (libiscsi.c:1537) - under fwd lock
>
> when the R2T flow happens, we do the following
>
> t3. iscsi_tcp_hdr_dissect --> iscsi_tcp_r2t_rsp --> iscsi_requeue_task ::
> put the task on the requeue list -- the call to iscsi_tcp_r2t_rsp is under
> the fwd lock

If the target were to send a CC at this point, we are adding the task under
the frwd lock, but the completion path would only have the back lock. The
list_empty, list_add and list_del calls then race and we could end up in a
bad state, right?

> t4. iscsi_data_xmit :: we remove the task from the requeue list
> (libiscsi.c:1578) - under fwd lock

We could also get the bad CC at this time and end up doing 2 list_dels at
the same time: the t4 one under the frwd lock and the cc handling completion
one under the back lock, like in t3.

> Do you agree to t1...t4 being what's possible for a given task? or I
> missed something?
>
>> The target shouldn't send a Check Condition at this time, but some do.
>> If that happens, then iscsi_queuecommand could be adding a new task to
>> the cmdqueue, while the recv path is handling the CC for the task with
>> the outstanding R2T. The recv path iscsi_complete_task call sees that
>> task is on the cmdqueue and deletes it from the list at the same time
>> iscsi_queuecommand is adding a new task.
>>
>> This should not happen per the iscsi spec. There is some wording about
>> waiting to finish the sequence in progress, but targets goof this up.
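[Editor's note] Or's t1..t4 timeline can be restated as executable steps. This is a userspace Python model, not kernel code: plain lists stand in for the kernel list heads, the function names are only mnemonic, and the comments note which lock each step runs under in the real driver:

```python
# Model of the t1..t4 lifecycle of a WRITE with an R2T.
cmdqueue, requeue = [], []          # ~ conn->cmdqueue / conn->requeue

def queuecommand(task):             # t1: runs under the frwd (fwd) lock
    cmdqueue.append(task)

def data_xmit_pop_cmd():            # t2: frwd lock, before the transport call
    return cmdqueue.pop(0)

def r2t_rsp(task):                  # t3: iscsi_requeue_task, frwd lock
    if task not in requeue:         # ~ the list_empty(&task->running) check
        requeue.append(task)

def data_xmit_pop_requeue():        # t4: frwd lock
    return requeue.pop(0)

task = "write_with_r2t"
queuecommand(task)
assert task in cmdqueue             # state after t1
assert data_xmit_pop_cmd() == task  # t2 pulls it off for transmission
r2t_rsp(task)
assert task in requeue              # state after t3
# A misbehaving target's Check Condition landing here would run
# iscsi_complete_task()'s list_del_init under the *back* lock while
# t3/t4 manipulate the same list under the *frwd* lock -- no common lock.
assert data_xmit_pop_requeue() == task  # t4 under normal execution
```

Under normal execution every transition is under the frwd lock, which is Or's point; the race only opens when the completion path (back lock) touches the list between t3 and t4, which is Mike's point.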
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On Mon, Nov 16, 2015 at 7:30 PM, Michael Christie wrote:
>> On Nov 15, 2015, at 4:10 AM, Or Gerlitz wrote:
>>> On Fri, Nov 13, 2015 at 6:51 PM, Mike Christie wrote:
>>>> On 11/13/2015 09:06 AM, Or Gerlitz wrote:
>>
>> After the locking change, adding a task to any of the connection
>> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>>
>> Removing tasks from any of these lists in iscsi_data_xmit is under
>> the session forward lock and **before** calling down to the transport
>> to handle the task.
>>
>> The iscsi_complete_task helper was added by Mike's commit
>> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
>> and is indeed typically called under the backward lock && has this section
>>
>> + if (!list_empty(&task->running))
>> +     list_del_init(&task->running);
>>
>> which per my reading of the code never comes into play, can you comment?
>
> I had sent this to Sagi and your mellanox email the other day:
> The bug occurs when a target completes a command while we are still
> processing it. If we are doing a WRITE and the iscsi_task
> is on the cmdqueue because we are handling a R2T.

Mike, I failed to find how an iscsi_task can be added again to the cmdqueue
list, nor how it can be added to the requeue list without the right locking,
nor how an iscsi_task can be on either of these lists when
iscsi_complete_task is invoked.

Specifically, my reading of the code says that these are the events over
time:

t1. queuecommand :: we put the task on the cmdqueue list (libiscsi.c:1741)
- under fwd lock

t2. iscsi_data_xmit :: we remove the task from the cmdqueue list
(libiscsi.c:1537) - under fwd lock

when the R2T flow happens, we do the following

t3. iscsi_tcp_hdr_dissect --> iscsi_tcp_r2t_rsp --> iscsi_requeue_task ::
put the task on the requeue list -- the call to iscsi_tcp_r2t_rsp is under
the fwd lock

t4. iscsi_data_xmit :: we remove the task from the requeue list
(libiscsi.c: 1578) - under fwd lock

Do you agree to t1...t4 being what's possible for a given task? or I missed
something?

>> The target shouldn't send a Check Condition at this time, but some do.
>> If that happens, then iscsi_queuecommand could be adding a new task to
>> the cmdqueue, while the recv path is handling the CC for the task with
>> the outstanding R2T. The recv path iscsi_complete_task call sees that
>> task is on the cmdqueue and deletes it from the list at the same time
>> iscsi_queuecommand is adding a new task.
>>
>> This should not happen per the iscsi spec. There is some wording about
>> waiting to finish the sequence in progress, but targets goof this up.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On Mon, Nov 16, 2015 at 7:30 PM, Michael Christie wrote:
>> On Nov 15, 2015, at 4:10 AM, Or Gerlitz wrote:
>> After the locking change, adding a task to any of the connection
>> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>>
>> Removing tasks from any of these lists in iscsi_data_xmit is under
>> the session forward lock and **before** calling down to the transport
>> to handle the task.
>>
>> The iscsi_complete_task helper was added by Mike's commit
>> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
>> and is indeed typically called under the backward lock && has this section
>>
>> + if (!list_empty(&task->running))
>> +         list_del_init(&task->running);
>>
>> which per my reading of the code never comes into play, can you comment?

>> The bug occurs when a target completes a command while we are still
>> processing it. Say we are doing a WRITE and the iscsi_task
>> is on the cmdqueue because we are handling an R2T. The target shouldn't
>> send a Check Condition at this time, but some do. If that happens, then
>> iscsi_queuecommand could be adding a new task to the cmdqueue, while the
>> recv path is handling the CC for the task with the outstanding R2T. The
>> recv path iscsi_complete_task call sees that the task is on the cmdqueue and
>> deletes it from the list at the same time iscsi_queuecommand is adding a new
>> task.

So we're now a bit beyond a trivial bug, and

>> This should not happen per the iscsi spec. There is some wording about
>> waiting to finish the sequence in progress, but targets goof this up.

we have target(s) that violate the spec. This is life, but it can explain
why it took 18 months to get a bug report.

Can you provide a few pointers into the relevant code pieces that one needs
to look at to realize what's going on there? Was this code added before or
after the patch?

Or.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
> On Nov 15, 2015, at 4:10 AM, Or Gerlitz wrote:
> On Fri, Nov 13, 2015 at 6:51 PM, Mike Christie wrote:
>> On 11/13/2015 09:06 AM, Or Gerlitz wrote:
>>>> The patch has caused multiple regressions, did not even compile when
>>>> sent to me, and was poorly reviewed and I have not heard from you guys
>>>> in a week. Given the issues the patch has had and the current time, I do
>>>> not feel comfortable with it anymore. I want to re-review it and fix it
>>>> up when there is more time.
>>> Mike (Hi),
>>>
>>> It's a complex patch that touches all the iscsi transports, and yes,
>>> when it was sent to you the 1st time, there was a build error on one of
>>> the offload transports (bad! but happens) and yes, as you pointed out, one
>>> static checker fix + one bug fix for it went upstream after this had
>>> been merged -- happens too.
>>
>> A patch should not cause this many issues.
>>
>>> What makes you say it was poorly reviewed?
>>
>> I just did not do a good job at looking at the patch. I should have
>> caught all of these issues.
>>
>> - The bnx2i cleanup_task bug should have been obvious, especially for me
>> because I had commented about the back lock and the abort path.
>>
>> - This oops was so basic. Incorrect locking around a linked list being
>> accessed from 2 threads is really one of those 1st year kernel
>> programmer things.
>
> Mike, Chris
>
> After the locking change, adding a task to any of the connection
> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>
> Removing tasks from any of these lists in iscsi_data_xmit is under
> the session forward lock and **before** calling down to the transport
> to handle the task.
>
> The iscsi_complete_task helper was added by Mike's commit
> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
> and is indeed typically called under the backward lock && has this section
>
> + if (!list_empty(&task->running))
> +         list_del_init(&task->running);
>
> which per my reading of the code never comes into play, can you comment?

I had sent this to Sagi and your mellanox email the other day:

> The bug occurs when a target completes a command while we are still
> processing it. Say we are doing a WRITE and the iscsi_task
> is on the cmdqueue because we are handling an R2T. The target shouldn't
> send a Check Condition at this time, but some do. If that happens, then
> iscsi_queuecommand could be adding a new task to the cmdqueue, while the
> recv path is handling the CC for the task with the outstanding R2T. The
> recv path iscsi_complete_task call sees that the task is on the cmdqueue and
> deletes it from the list at the same time iscsi_queuecommand is adding a new
> task.
>
> This should not happen per the iscsi spec. There is some wording about
> waiting to finish the sequence in progress, but targets goof this up.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On Fri, Nov 13, 2015 at 6:51 PM, Mike Christie wrote:
> On 11/13/2015 09:06 AM, Or Gerlitz wrote:
>>> The patch has caused multiple regressions, did not even compile when
>>> sent to me, and was poorly reviewed and I have not heard from you guys
>>> in a week. Given the issues the patch has had and the current time, I do
>>> not feel comfortable with it anymore. I want to re-review it and fix it
>>> up when there is more time.
>> Mike (Hi),
>>
>> It's a complex patch that touches all the iscsi transports, and yes,
>> when it was sent to you the 1st time, there was a build error on one of
>> the offload transports (bad! but happens) and yes, as you pointed out, one
>> static checker fix + one bug fix for it went upstream after this had
>> been merged -- happens too.
>
> A patch should not cause this many issues.
>
>> What makes you say it was poorly reviewed?
>
> I just did not do a good job at looking at the patch. I should have
> caught all of these issues.
>
> - The bnx2i cleanup_task bug should have been obvious, especially for me
> because I had commented about the back lock and the abort path.
>
> - This oops was so basic. Incorrect locking around a linked list being
> accessed from 2 threads is really one of those 1st year kernel
> programmer things.

Mike, Chris

After the locking change, adding a task to any of the connection
mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.

Removing tasks from any of these lists in iscsi_data_xmit is under
the session forward lock and **before** calling down to the transport
to handle the task.

The iscsi_complete_task helper was added by Mike's commit
3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
and is indeed typically called under the backward lock && has this section

+ if (!list_empty(&task->running))
+         list_del_init(&task->running);

which per my reading of the code never comes into play, can you comment?

Let's address this area before we move to the other claims made about the patch.

Again, the patch has been around for ~18 months, since 3.15, with no deep
complaints so far -- let's not jump to conclusions.

Or.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 11/13/2015 09:06 AM, Or Gerlitz wrote:
>>> The patch has caused multiple regressions, did not even compile when
>>> sent to me, and was poorly reviewed and I have not heard from you guys
>>> in a week. Given the issues the patch has had and the current time, I do
>>> not feel comfortable with it anymore. I want to re-review it and fix it
>>> up when there is more time.
> Mike (Hi),
>
> It's a complex patch that touches all the iscsi transports, and yes,
> when it was sent to you the 1st time, there was a build error on one of
> the offload transports (bad! but happens) and yes, as you pointed out, one
> static checker fix + one bug fix for it went upstream after this had
> been merged -- happens too.

A patch should not cause this many issues.

> What makes you say it was poorly reviewed?

I just did not do a good job at looking at the patch. I should have
caught all of these issues.

- The bnx2i cleanup_task bug should have been obvious, especially for me
because I had commented about the back lock and the abort path.

- This oops was so basic. Incorrect locking around a linked list being
accessed from 2 threads is really one of those 1st year kernel
programmer things.

- The iscsi_xmit_task static checker locking issue was really simple too.

- There was the issue I emailed you guys about offlist where I started to
try and fix/review it yesterday, and I found another race in the abort and
completion path that can result in a null pointer dereference.

- I have not had time to fully review it again, but I think the
reset/recovery code looks shady too.

> And now after a few years upstream a possibly real bug was found
> (happens), why rush and revert? Let's see if we can fix it.

Send patches.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On Thu, Nov 12, 2015 at 10:58 PM, Mike Christie wrote:
> On 11/12/2015 06:03 AM, Sagi Grimberg wrote:
>>> The bug is caused by this patch:
>>>
>>> 659743b02c411075b26601725947b21df0bb29c8
>>>
>>> which allowed the task lists to be manipulated under different locks
>>> in the xmit and completion path.
>>> To fix the oops this patch just reverts that patch. It also reverts
>>> these 2 patches for regressions that were also a result of that patch:
>> Whoa now Mike, this is a major change. Can we take a step back and think
>> about this for a second?
> The issue has been on the open-iscsi list for a week! You are on the
> list still right? Or is even ccd on the thread.

The email you sent me a week ago also cc-ed open-iscsi, hence a rule in my
mailer routed it to my open-iscsi subscription folder, which I don't visit
much nowadays. Only when you posted to linux-scsi did we see it, and then we
responded within a few hours.

>> Can you provide a more detailed analysis of why this list corruption is
>> triggered? What scenario was not honoring the locking policy?
> Basic locking around a linked list bug. iscsi_queuecommand adds it under
> the frwd lock and iscsi_complete_task was taking it off under the back_lock.
>> This code has been running reliably in our labs for a long time now
>> (both iser and tcp).
> The patch has caused multiple regressions, did not even compile when
> sent to me, and was poorly reviewed and I have not heard from you guys
> in a week. Given the issues the patch has had and the current time, I do
> not feel comfortable with it anymore. I want to re-review it and fix it
> up when there is more time.

Mike (Hi),

It's a complex patch that touches all the iscsi transports, and yes, when it
was sent to you the 1st time, there was a build error on one of the offload
transports (bad! but happens) and yes, as you pointed out, one static checker
fix + one bug fix for it went upstream after this had been merged -- happens
too.

What makes you say it was poorly reviewed?

And now, after a few years upstream, a possibly real bug was found (happens).
Why rush and revert? Let's see if we can fix it.

Or.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On Thu, Nov 12, 2015 at 4:03 AM, Sagi Grimberg wrote:
>
>> The bug is caused by this patch:
>>
>> 659743b02c411075b26601725947b21df0bb29c8
>>
>> which allowed the task lists to be manipulated under different locks
>> in the xmit and completion path.
>>
>> To fix the oops this patch just reverts that patch. It also reverts
>> these 2 patches for regressions that were also a result of that patch:
>
> Whoa now Mike, this is a major change. Can we take a step back and think
> about this for a second?
>
> My understanding is that the kfifo circular buffer design allows a
> writer (e.g. the producer) and a reader (e.g. the consumer) to avoid
> extra locking when accessing the kfifo buffer.

I don't think the problem is with the kfifo cmdpool, but rather with the
connection mgmtqueue/cmdqueue linked lists. iscsi_task structs are linked
in using the task->running list_head.

- Chris
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
On 11/12/2015 06:03 AM, Sagi Grimberg wrote:
>
>> The bug is caused by this patch:
>>
>> 659743b02c411075b26601725947b21df0bb29c8
>>
>> which allowed the task lists to be manipulated under different locks
>> in the xmit and completion path.
>>
>> To fix the oops this patch just reverts that patch. It also reverts
>> these 2 patches for regressions that were also a result of that patch:
>
> Whoa now Mike, this is a major change. Can we take a step back and think
> about this for a second?

The issue has been on the open-iscsi list for a week! You are on the
list still right? Or is even ccd on the thread.

> My understanding is that the kfifo circular buffer design allows a
> writer (e.g. the producer) and a reader (e.g. the consumer) to avoid
> extra locking when accessing the kfifo buffer.

For the next feature window I am working on a patch that makes the api
easier to use (the cleanup_task locking is bad, as can be seen from the
bnx2i regression the patch also caused) and also uses kfifos for the fast
path.

> From include/linux/kfifo.h:
> /*
>  * Note about locking : There is no locking required until only one reader
>  * and one writer is using the fifo and no kfifo_reset() will be called
>  * kfifo_reset_out() can be safely used, until it will be only called
>  * in the reader thread.
>  * For multiple writer and one reader there is only a need to lock the writer.
>  * And vice versa for only one writer and multiple reader there is only
>  * a need to lock the reader.
>  */
>
> The patch by Shlomo implements the locking policy documented above:
> - multiple readers: multiple threads entering queuecommand reading the
> kfifo (kfifo_out) are mutually excluded by the frwd_lock.
> - multiple writers: completion contexts writing to the kfifo (kfifo_in)
> are mutually excluded by the back_lock.
>
> Can you provide a more detailed analysis of why this list corruption is
> triggered? What scenario was not honoring the locking policy?

Basic locking around a linked list bug. iscsi_queuecommand adds the task
under the frwd lock and iscsi_complete_task was taking it off under the
back_lock.

> This code has been running reliably in our labs for a long time now
> (both iser and tcp).

The patch has caused multiple regressions, did not even compile when
sent to me, and was poorly reviewed and I have not heard from you guys
in a week. Given the issues the patch has had and the current time, I do
not feel comfortable with it anymore. I want to re-review it and fix it
up when there is more time.
Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch
> The bug is caused by this patch:
>
> 659743b02c411075b26601725947b21df0bb29c8
>
> which allowed the task lists to be manipulated under different locks
> in the xmit and completion path.
>
> To fix the oops this patch just reverts that patch. It also reverts
> these 2 patches for regressions that were also a result of that patch:

Whoa now Mike, this is a major change. Can we take a step back and think
about this for a second?

My understanding is that the kfifo circular buffer design allows a
writer (e.g. the producer) and a reader (e.g. the consumer) to avoid
extra locking when accessing the kfifo buffer.

From include/linux/kfifo.h:
/*
 * Note about locking : There is no locking required until only one reader
 * and one writer is using the fifo and no kfifo_reset() will be called
 * kfifo_reset_out() can be safely used, until it will be only called
 * in the reader thread.
 * For multiple writer and one reader there is only a need to lock the writer.
 * And vice versa for only one writer and multiple reader there is only
 * a need to lock the reader.
 */

The patch by Shlomo implements the locking policy documented above:
- multiple readers: multiple threads entering queuecommand reading the
kfifo (kfifo_out) are mutually excluded by the frwd_lock.
- multiple writers: completion contexts writing to the kfifo (kfifo_in)
are mutually excluded by the back_lock.

Can you provide a more detailed analysis of why this list corruption is
triggered? What scenario was not honoring the locking policy?

This code has been running reliably in our labs for a long time now
(both iser and tcp). Going back to a single lock restores the contention
point between queuecommand and complete_pdu.

P.S. While we're on the subject, I'd like to see block tags replace the
kfifo for iscsi tasks. It's difficult because sw and offload drivers map
hosts/sessions differently. It can be done if we move the tasks to the
iscsi_host and have reserved (unique) task tags for the session
login/logout/nop_in/nop_out. With this we won't need locks around our
task arrays.

Sagi.
[PATCH 1/1] iscsi: fix regression caused by session lock patch
From: Mike Christie

This patch fixes this oops reported by Guilherme Piccoli:

list_del corruption. prev->next should be c00f3da2b080, but was c00f3da2c080
[ cut here ]
WARNING: at lib/list_debug.c:59
CPU: 48 PID: 12033 Comm: fio-2.2.7 Not tainted 3.18.22-354.el7_1.pkvm3_1_0.3600.1.ppc64le #1
task: c00f25827000 ti: c00fff5f8000 task.ti: c00f057e4000
NIP: c04d15a4 LR: c04d15a0 CTR: c04bdfd0
REGS: c00fff5faf10 TRAP: 0700 Not tainted (3.18.22-354.el7_1.pkvm3_1_0.3600.1.ppc64le)
MSR: 90029033 CR: 28082842 XER: 2000
CFAR: c0988dbc SOFTE: 1
NIP [c04d15a4] __list_del_entry+0xb4/0xe0
LR [c04d15a0] __list_del_entry+0xb0/0xe0
Call Trace:
[c00fff5fb190] [c04d15a0] __list_del_entry+0xb0/0xe0 (unreliable)
[c00fff5fb1f0] [d00010fb0c8c] iscsi_complete_task+0x8c/0x190 [libiscsi]
[c00fff5fb270] [d00010fb24c8] __iscsi_complete_pdu+0x508/0xaa0 [libiscsi]
[c00fff5fb350] [d00010fb2ab4] iscsi_complete_pdu+0x54/0xa0 [libiscsi]
[c00fff5fb3a0] [d000110121ac] iscsi_tcp_hdr_recv_done+0x38c/0x1250 [libiscsi_tcp]
[c00fff5fb480] [d00011011a48] iscsi_tcp_recv_skb+0xd8/0x4b0 [libiscsi_tcp]
[c00fff5fb570] [d00011071708] iscsi_sw_tcp_recv+0x98/0x170 [iscsi_tcp]
[c00fff5fb610] [c0882794] tcp_read_sock+0xe4/0x2d0
[c00fff5fb680] [d00011071590] iscsi_sw_tcp_data_ready+0x70/0x150 [iscsi_tcp]
[c00fff5fb720] [c08921c8] tcp_rcv_established+0x3f8/0x6f0
[c00fff5fb780] [c089d75c] tcp_v4_do_rcv+0x19c/0x420
[c00fff5fb7e0] [c08a15bc] tcp_v4_rcv+0xa6c/0xb20
[c00fff5fb8c0] [c08700f8] ip_local_deliver_finish+0x178/0x350
[c00fff5fb910] [c0870424] ip_rcv_finish+0x154/0x420
[c00fff5fb990] [c0816bf4] __netif_receive_skb_core+0x7a4/0x9c0
[c00fff5fba80] [c081a0fc] netif_receive_skb_internal+0x4c/0xf0
[c00fff5fbac0] [c081b17c] napi_gro_receive+0xfc/0x1a0
[c00fff5fbb00] [d8373280] bnx2x_rx_int+0xb30/0x1650 [bnx2x]
[c00fff5fbc90] [d8374440] bnx2x_poll+0x310/0x440 [bnx2x]
[c00fff5fbd40] [c081a8e8] net_rx_action+0x2d8/0x3e0
[c00fff5fbe10] [c00cfc44] __do_softirq+0x174/0x3a0
[c00fff5fbf00] [c00d01f8] irq_exit+0xc8/0x100
[c00fff5fbf20] [c0010d40] __do_irq+0xa0/0x1c0
[c00fff5fbf90] [c0024aac] call_do_irq+0x14/0x24
[c00f057e7dd0] [c0010f00] do_IRQ+0xa0/0x120
[c00f057e7e30] [c00025e8] hardware_interrupt_common+0x168/0x180
Instruction dump:
0fe0 4bd8 3c62ff91 386304c0 484b77ad 6000 0fe0 4bc0 3c62ff91 38630480 484b7795 6000 <0fe0> 4ba8 3c62ff91 7d254b78
---[ end trace 005d2749e6bab12f ]---

The bug is caused by this patch:

659743b02c411075b26601725947b21df0bb29c8

which allowed the task lists to be manipulated under different locks
in the xmit and completion path.

To fix the oops this patch just reverts that patch. It also reverts
these 2 patches for regressions that were also a result of that patch:

72b9740201d5f0e24b0b8326a4949786a30ff628
35843048e7e979df3b7b9f2ad49e21797a11386b

Cc: Guilherme Piccoli
Signed-off-by: Mike Christie
---
 drivers/scsi/be2iscsi/be_main.c  |  26 ++---
 drivers/scsi/bnx2i/bnx2i_hwi.c   |  46 -
 drivers/scsi/bnx2i/bnx2i_iscsi.c |  10 +-
 drivers/scsi/iscsi_tcp.c         |  22 ++--
 drivers/scsi/libiscsi.c          | 214 +--
 drivers/scsi/libiscsi_tcp.c      |  28 ++---
 drivers/scsi/qla4xxx/ql4_isr.c   |   4 +-
 include/scsi/libiscsi.h          |  17 +---
 include/scsi/libiscsi_tcp.h      |   2 -
 9 files changed, 161 insertions(+), 208 deletions(-)

diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c
index a4a5d6d..987cd60 100644
--- a/drivers/scsi/be2iscsi/be_main.c
+++ b/drivers/scsi/be2iscsi/be_main.c
@@ -232,20 +232,20 @@ static int beiscsi_eh_abort(struct scsi_cmnd *sc)
 	cls_session = starget_to_session(scsi_target(sc->device));
 	session = cls_session->dd_data;
-	spin_lock_bh(&session->frwd_lock);
+	spin_lock_bh(&session->lock);
 	if (!aborted_task || !aborted_task->sc) {
 		/* we raced */
-		spin_unlock_bh(&session->frwd_lock);
+		spin_unlock_bh(&session->lock);
 		return SUCCESS;
 	}
 	aborted_io_task = aborted_task->dd_data;
 	if (!aborted_io_task->scsi_cmnd) {
 		/* raced or invalid command */
-		spin_unlock_bh(&session->frwd_lock);
+		spin_unlock_bh(&session->lock);
 		return SUCCESS;
 	}
-	spin_unlock_bh(&session->frwd_lock);
+	spin_unlock_bh(&session->lock);
 	/* Invalidate WRB Posted for this Task */
 	AMAP_SET_BITS(struct amap_iscsi_wrb, invld,
 		      aborted_io_task->pwrb_handle->pwrb,
@@ -310,9 +310,9 @@ static int beiscsi_eh_d