Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch

2017-02-06 Thread Sagi Grimberg

Hey Chris and Guilherme,

Indeed, I'm not responsive at that old email address anymore.


Thanks for the testing; it looks like you have the magic target to reproduce this.

I think this verifies Mike's idea of what was going wrong, and we're way
overdue to get this fixed upstream.  Thanks to IBM for pushing this; I don't
think any major distro is shipping this patch, and we don't want to keep having
to back it out.

The options look like
1) back out the session lock changes that split it into two locks
2) add in the additional locking from this test patch
3) some other fix for the issue of targets that complete tasks oddly

I'm leaning to #1, as I don't want to keep adding more locks for this.


Thanks Chris! IIRC, the lock changes from Shlomo/Or are no longer in RHEL,
SLES, or Ubuntu, as you mentioned. We asked for the patch to be reverted,
and the request was accepted.

On the other hand, your patch is great and a cool fix for this. If we
have any good numbers and/or reasons to keep their patch, I guess
alternative #2 is cool too. I can perform more testing if you plan to
send this (or a similar) patch to the iscsi list.



Sagi, Or, Shlomo?  You pushed to keep this from being backed out before.
Here's your cause; any better ideas on fixing it?  I also tried to go back
through the mailing list archives, but I don't see any real numbers for the
performance gains.


I'll loop Sagi in here based on the email I see he's using on the NVMe list
currently - it seems different from the one shown in the header of
this message.


IIRC, this was brought up more than two years ago? It's been
a while now.

The motivation for the fine-grained locking from Shlomo was
to address the submission/completion inter-locking
scheme, which is not needed for iser.

In iser, task completions are triggered from soft-irq only for
task responses; the data transfer is driven in HW, so we don't need
the inter-locking between submissions and task management or error
handling.

My recollection is that this scheme solved a contention point we had
back then; if I'm not mistaken, it was as much as a 50% improvement in
IOPs scalability in some scenarios.

Now, this was all pre block-mq. So I think the correct solution for
iscsi (iser, tcp and offloads) is to use block-mq facilities for
task pre-allocations (scsi host tagset) and have iscsi_tcp take care
of its own locking instead of imposing it inherently in libiscsi.

We can have LOGIN, LOGOUT, NOOP_OUT, TEXT, TMR as reserved tags,
and queue_depth set to the max session cmds - something like the
sketch below. I had a prototype for that back when I experimented
with the scsi-mq conversion (way back...), but kinda got stuck
trying to figure out how to convert the offload drivers qla4xxx,
bnx2i and cxgbi, which seemed to rely heavily on the task pools.
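
Just to sketch the shape of the idea (ISCSI_MGMT_TAGS and iscsi_mq_ops
are hypothetical names for illustration, not an existing libiscsi API,
and this is not the actual prototype):

#include <linux/blk-mq.h>
#include <scsi/libiscsi.h>

/* Hypothetical: one reserved tag each for LOGIN, LOGOUT, NOOP_OUT, TEXT, TMR */
#define ISCSI_MGMT_TAGS	5

/* Placeholder for the queue_rq/complete callbacks a real conversion would
 * provide; nothing like this exists in libiscsi today. */
extern const struct blk_mq_ops iscsi_mq_ops;

static struct blk_mq_tag_set iscsi_tag_set;	/* static, so zero-initialized */

static int iscsi_setup_tagset(unsigned int session_cmds_max)
{
	iscsi_tag_set.ops		= &iscsi_mq_ops;
	iscsi_tag_set.nr_hw_queues	= 1;
	/* queue depth covers the SCSI commands plus the reserved mgmt PDUs */
	iscsi_tag_set.queue_depth	= session_cmds_max + ISCSI_MGMT_TAGS;
	iscsi_tag_set.reserved_tags	= ISCSI_MGMT_TAGS;
	/* per-request iscsi_task pre-allocated by the block layer, replacing
	 * the driver-private task pools */
	iscsi_tag_set.cmd_size		= sizeof(struct iscsi_task);
	iscsi_tag_set.numa_node		= NUMA_NO_NODE;
	return blk_mq_alloc_tag_set(&iscsi_tag_set);
}

/* A mgmt PDU (e.g. a NOP-OUT ping) then draws from the reserved pool, so it
 * can always make progress even when the regular tags are exhausted. */
static struct request *iscsi_alloc_mgmt_request(struct request_queue *q)
{
	return blk_mq_alloc_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);
}

With the task embedded in the request, iscsi_tcp could then serialize its
own queues however it likes, without imposing the scheme in libiscsi.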

If people are more interested in improving iscsi locking schemes we
can discuss approaches for it.



Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch

2017-02-06 Thread Guilherme G. Piccoli
On 06/02/2017 15:27, Chris Leech wrote:
> - Original Message -
>> On 09/11/2016 03:21, Chris Leech wrote:
>>> On Mon, Nov 07, 2016 at 04:23:10PM -0200, Guilherme G. Piccoli wrote:

 Sure! Count on us to test any patches. I guess the first step is to
 reproduce on upstream, right? We haven't tested this specific
 scenario in a long time. Will try to reproduce on 4.9-rc4 and update here.
>>>
>>> Great, I'm looking forward to hearing the result.
>>>
>>> Assuming it reproduces, I don't think this level of fine-grained locking
>>> is necessarily the best solution, but it may help confirm the problem.
>>> Especially if the WARN_ONCE I slipped in here triggers.
>>
>> Chris, sorry for my huge delay.
>> Finally I was able to perform the tests and I have good news - it seems
>> your patch below fixed the issue.
> 
> Thanks for the testing; it looks like you have the magic target to reproduce
> this.
> 
> I think this verifies Mike's idea of what was going wrong, and we're way
> overdue to get this fixed upstream.  Thanks to IBM for pushing this; I don't
> think any major distro is shipping this patch, and we don't want to keep
> having to back it out.
> 
> The options look like
> 1) back out the session lock changes that split it into two locks
> 2) add in the additional locking from this test patch
> 3) some other fix for the issue of targets that complete tasks oddly
> 
> I'm leaning to #1, as I don't want to keep adding more locks for this.

Thanks Chris! IIRC, the lock changes from Shlomo/Or are no longer in RHEL,
SLES, or Ubuntu, as you mentioned. We asked for the patch to be reverted,
and the request was accepted.

On the other hand, your patch is great and a cool fix for this. If we
have any good numbers and/or reasons to keep their patch, I guess
alternative #2 is cool too. I can perform more testing if you plan to
send this (or a similar) patch to the iscsi list.


> Sagi, Or, Shlomo?  You pushed to keep this from being backed out before.
> Here's your cause; any better ideas on fixing it?  I also tried to go back
> through the mailing list archives, but I don't see any real numbers for the
> performance gains.

I'll loop Sagi in here based on the email I see he's using on the NVMe list
currently - it seems different from the one shown in the header of
this message.


Thanks,



Guilherme

> 
> - Chris
> 
>> Firstly, I was able to reproduce with kernel 4.10-rc6. See the file
>> repro.out - it's a dump from xmon, the kernel debugger on PowerPC.
>> With this tool we can dump the exception details, registers, PACA
>> (Processor Address Communication Area, a ppc-specific structure) and
>> dmesg. It took me less than 15 minutes to reproduce.
>>
>> Then I applied your patch on top of this kernel, and the benchmark
>> completed successfully in about 3 hours. We can see the WARN() you
>> added was reached; the attached file dmesg-cleech_patch shows the
>> kernel log with your patch.
>>
>> The workload was the FIO benchmark doing both reads and writes to the
>> remote storage via iSCSI, connected over direct Ethernet cabling (10Gb).
>> The distro was Ubuntu 16.04.1.
>>
>> Any more tests or info you need, please let me know!
>> Cheers,
>>
>>
>> Guilherme
>>
>>
>>> - Chris
>>>
>>> ---
>>>
>>> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
>>> index f9b6fba..fbd18ab 100644
>>> --- a/drivers/scsi/libiscsi.c
>>> +++ b/drivers/scsi/libiscsi.c
>>> @@ -560,8 +560,12 @@ static void iscsi_complete_task(struct iscsi_task *task, int state)
>>>  	WARN_ON_ONCE(task->state == ISCSI_TASK_FREE);
>>>  	task->state = state;
>>>  
>>> -	if (!list_empty(&task->running))
>>> +	spin_lock_bh(&conn->taskqueuelock);
>>> +	if (!list_empty(&task->running)) {
>>> +		WARN_ONCE(1, "iscsi_complete_task while task on list");
>>>  		list_del_init(&task->running);
>>> +	}
>>> +	spin_unlock_bh(&conn->taskqueuelock);
>>>  
>>>  	if (conn->task == task)
>>>  		conn->task = NULL;
>>> @@ -783,7 +787,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
>>>  		if (session->tt->xmit_task(task))
>>>  			goto free_task;
>>>  	} else {
>>> +		spin_lock_bh(&conn->taskqueuelock);
>>>  		list_add_tail(&task->running, &conn->mgmtqueue);
>>> +		spin_unlock_bh(&conn->taskqueuelock);
>>>  		iscsi_conn_queue_work(conn);
>>>  	}
>>>  
>>> @@ -1474,8 +1480,10 @@ void iscsi_requeue_task(struct iscsi_task *task)
>>>  	 * this may be on the requeue list already if the xmit_task callout
>>>  	 * is handling the r2ts while we are adding new ones
>>>  	 */
>>> +	spin_lock_bh(&conn->taskqueuelock);
>>>  	if (list_empty(&task->running))
>>>  		list_add_tail(&task->running, &conn->requeue);
>>> +	spin_unlock_bh(&conn->taskqueuelock);
>>>  	iscsi_conn_queue_work(conn);
>>>  }
>>>  EXPORT_SYMBOL_GPL(iscsi_requeue_task);
>>> @@ -1512,22 +1520,26 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
>>>  	 * only have one nop-out as a ping from us and targets should not

Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch

2017-02-06 Thread Chris Leech
----- Original Message -----
> On 09/11/2016 03:21, Chris Leech wrote:
> > On Mon, Nov 07, 2016 at 04:23:10PM -0200, Guilherme G. Piccoli wrote:
> >>
> >> Sure! Count on us to test any patches. I guess the first step is to
> >> reproduce on upstream, right? We haven't tested this specific
> >> scenario in a long time. Will try to reproduce on 4.9-rc4 and update here.
> > 
> > Great, I'm looking forward to hearing the result.
> > 
> > Assuming it reproduces, I don't think this level of fine-grained locking
> > is necessarily the best solution, but it may help confirm the problem.
> > Especially if the WARN_ONCE I slipped in here triggers.
> 
> Chris, sorry for my huge delay.
> Finally I was able to perform the tests and I have good news - it seems
> your patch below fixed the issue.

Thanks for the testing; it looks like you have the magic target to reproduce this.

I think this verifies Mike's idea of what was going wrong, and we're way
overdue to get this fixed upstream.  Thanks to IBM for pushing this; I don't
think any major distro is shipping this patch, and we don't want to keep having
to back it out.

The options look like
1) back out the session lock changes that split it into two locks
2) add in the additional locking from this test patch
3) some other fix for the issue of targets that complete tasks oddly

I'm leaning to #1, as I don't want to keep adding more locks for this.

Sagi, Or, Shlomo?  You pushed to keep this from being backed out before.
Here's your cause; any better ideas on fixing it?  I also tried to go back
through the mailing list archives, but I don't see any real numbers for the
performance gains.

- Chris

> Firstly, I was able to reproduce with kernel 4.10-rc6. See the file
> repro.out - it's a dump from xmon, the kernel debugger on PowerPC.
> With this tool we can dump the exception details, registers, PACA
> (Processor Address Communication Area, a ppc-specific structure) and
> dmesg. It took me less than 15 minutes to reproduce.
> 
> Then I applied your patch on top of this kernel, and the benchmark
> completed successfully in about 3 hours. We can see the WARN() you
> added was reached; the attached file dmesg-cleech_patch shows the
> kernel log with your patch.
> 
> The workload was the FIO benchmark doing both reads and writes to the
> remote storage via iSCSI, connected over direct Ethernet cabling (10Gb).
> The distro was Ubuntu 16.04.1.
> 
> Any more tests or info you need, please let me know!
> Cheers,
> 
> 
> Guilherme
> 
> 
> > - Chris
> > 
> > ---
> > 
> > diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> > index f9b6fba..fbd18ab 100644
> > --- a/drivers/scsi/libiscsi.c
> > +++ b/drivers/scsi/libiscsi.c
> > @@ -560,8 +560,12 @@ static void iscsi_complete_task(struct iscsi_task *task, int state)
> >  	WARN_ON_ONCE(task->state == ISCSI_TASK_FREE);
> >  	task->state = state;
> >  
> > -	if (!list_empty(&task->running))
> > +	spin_lock_bh(&conn->taskqueuelock);
> > +	if (!list_empty(&task->running)) {
> > +		WARN_ONCE(1, "iscsi_complete_task while task on list");
> >  		list_del_init(&task->running);
> > +	}
> > +	spin_unlock_bh(&conn->taskqueuelock);
> >  
> >  	if (conn->task == task)
> >  		conn->task = NULL;
> > @@ -783,7 +787,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
> >  		if (session->tt->xmit_task(task))
> >  			goto free_task;
> >  	} else {
> > +		spin_lock_bh(&conn->taskqueuelock);
> >  		list_add_tail(&task->running, &conn->mgmtqueue);
> > +		spin_unlock_bh(&conn->taskqueuelock);
> >  		iscsi_conn_queue_work(conn);
> >  	}
> >  
> > @@ -1474,8 +1480,10 @@ void iscsi_requeue_task(struct iscsi_task *task)
> >  	 * this may be on the requeue list already if the xmit_task callout
> >  	 * is handling the r2ts while we are adding new ones
> >  	 */
> > +	spin_lock_bh(&conn->taskqueuelock);
> >  	if (list_empty(&task->running))
> >  		list_add_tail(&task->running, &conn->requeue);
> > +	spin_unlock_bh(&conn->taskqueuelock);
> >  	iscsi_conn_queue_work(conn);
> >  }
> >  EXPORT_SYMBOL_GPL(iscsi_requeue_task);
> > @@ -1512,22 +1520,26 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
> >  	 * only have one nop-out as a ping from us and targets should not
> >  	 * overflow us with nop-ins
> >  	 */
> > +	spin_lock_bh(&conn->taskqueuelock);
> >  check_mgmt:
> >  	while (!list_empty(&conn->mgmtqueue)) {
> >  		conn->task = list_entry(conn->mgmtqueue.next,
> >  					 struct iscsi_task, running);
> >  		list_del_init(&conn->task->running);
> > +		spin_unlock_bh(&conn->taskqueuelock);
> >  		if (iscsi_prep_mgmt_task(conn, conn->task)) {
> >  			/* regular RX path uses back_lock */
> >  			spin_lock_bh(&conn->session->back_lock);
> >  			__iscsi_put_task(conn->task);
> >  			spin_unlock_bh(&conn->session->back_lock);
> >  			conn->task = NULL;

Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch

2016-11-11 Thread Guilherme G. Piccoli
On 11/09/2016 03:21 AM, Chris Leech wrote:
> On Mon, Nov 07, 2016 at 04:23:10PM -0200, Guilherme G. Piccoli wrote:
>>
>> Sure! Count on us to test any patches. I guess the first step is to
>> reproduce on upstream, right? We haven't tested this specific
>> scenario in a long time. Will try to reproduce on 4.9-rc4 and update here.
> 
> Great, I'm looking forward to hearing the result.
> 
> Assuming it reproduces, I don't think this level of fine-grained locking
> is necessarily the best solution, but it may help confirm the problem.
> Especially if the WARN_ONCE I slipped in here triggers.
> 
> - Chris

Chris, I'm not able to reproduce right now - the environment was
misconfigured, and now I'm leaving the office for 20 days.

Will test as soon as I'm back!
Thanks,


Guilherme


> 
> ---
> 
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index f9b6fba..fbd18ab 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -560,8 +560,12 @@ static void iscsi_complete_task(struct iscsi_task *task, int state)
>  	WARN_ON_ONCE(task->state == ISCSI_TASK_FREE);
>  	task->state = state;
>  
> -	if (!list_empty(&task->running))
> +	spin_lock_bh(&conn->taskqueuelock);
> +	if (!list_empty(&task->running)) {
> +		WARN_ONCE(1, "iscsi_complete_task while task on list");
>  		list_del_init(&task->running);
> +	}
> +	spin_unlock_bh(&conn->taskqueuelock);
>  
>  	if (conn->task == task)
>  		conn->task = NULL;
> @@ -783,7 +787,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
>  		if (session->tt->xmit_task(task))
>  			goto free_task;
>  	} else {
> +		spin_lock_bh(&conn->taskqueuelock);
>  		list_add_tail(&task->running, &conn->mgmtqueue);
> +		spin_unlock_bh(&conn->taskqueuelock);
>  		iscsi_conn_queue_work(conn);
>  	}
>  
> @@ -1474,8 +1480,10 @@ void iscsi_requeue_task(struct iscsi_task *task)
>  	 * this may be on the requeue list already if the xmit_task callout
>  	 * is handling the r2ts while we are adding new ones
>  	 */
> +	spin_lock_bh(&conn->taskqueuelock);
>  	if (list_empty(&task->running))
>  		list_add_tail(&task->running, &conn->requeue);
> +	spin_unlock_bh(&conn->taskqueuelock);
>  	iscsi_conn_queue_work(conn);
>  }
>  EXPORT_SYMBOL_GPL(iscsi_requeue_task);
> @@ -1512,22 +1520,26 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
>  	 * only have one nop-out as a ping from us and targets should not
>  	 * overflow us with nop-ins
>  	 */
> +	spin_lock_bh(&conn->taskqueuelock);
>  check_mgmt:
>  	while (!list_empty(&conn->mgmtqueue)) {
>  		conn->task = list_entry(conn->mgmtqueue.next,
>  					 struct iscsi_task, running);
>  		list_del_init(&conn->task->running);
> +		spin_unlock_bh(&conn->taskqueuelock);
>  		if (iscsi_prep_mgmt_task(conn, conn->task)) {
>  			/* regular RX path uses back_lock */
>  			spin_lock_bh(&conn->session->back_lock);
>  			__iscsi_put_task(conn->task);
>  			spin_unlock_bh(&conn->session->back_lock);
>  			conn->task = NULL;
> +			spin_lock_bh(&conn->taskqueuelock);
>  			continue;
>  		}
>  		rc = iscsi_xmit_task(conn);
>  		if (rc)
>  			goto done;
> +		spin_lock_bh(&conn->taskqueuelock);
>  	}
>  
>  	/* process pending command queue */
> @@ -1535,19 +1547,24 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
>  		conn->task = list_entry(conn->cmdqueue.next, struct iscsi_task,
>  					running);
>  		list_del_init(&conn->task->running);
> +		spin_unlock_bh(&conn->taskqueuelock);
>  		if (conn->session->state == ISCSI_STATE_LOGGING_OUT) {
>  			fail_scsi_task(conn->task, DID_IMM_RETRY);
> +			spin_lock_bh(&conn->taskqueuelock);
>  			continue;
>  		}
>  		rc = iscsi_prep_scsi_cmd_pdu(conn->task);
>  		if (rc) {
>  			if (rc == -ENOMEM || rc == -EACCES) {
> +				spin_lock_bh(&conn->taskqueuelock);
>  				list_add_tail(&conn->task->running,
>  					      &conn->cmdqueue);
>  				conn->task = NULL;
> +				spin_unlock_bh(&conn->taskqueuelock);
>  				goto done;
>  			} else
>  				fail_scsi_task(conn->task, DID_ABORT);
> +			spin_lock_bh(&conn->taskqueuelock);
>  			continue;
>  		}
>  		rc = iscsi_xmit_task(conn);
> @@ -1558,6 +1575,7 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
>  		 * we need to check the mgmt queue for nops that need to
>  		 * be sent to aviod starvation

Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch

2016-11-08 Thread Chris Leech
On Mon, Nov 07, 2016 at 04:23:10PM -0200, Guilherme G. Piccoli wrote:
>
> Sure! Count on us to test any patches. I guess the first step is to
> reproduce on upstream, right? We haven't tested this specific
> scenario in a long time. Will try to reproduce on 4.9-rc4 and update here.

Great, I'm looking forward to hearing the result.

Assuming it reproduces, I don't think this level of fine-grained locking
is necessarily the best solution, but it may help confirm the problem.
Especially if the WARN_ONCE I slipped in here triggers.

- Chris

---

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index f9b6fba..fbd18ab 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -560,8 +560,12 @@ static void iscsi_complete_task(struct iscsi_task *task, int state)
 	WARN_ON_ONCE(task->state == ISCSI_TASK_FREE);
 	task->state = state;
 
-	if (!list_empty(&task->running))
+	spin_lock_bh(&conn->taskqueuelock);
+	if (!list_empty(&task->running)) {
+		WARN_ONCE(1, "iscsi_complete_task while task on list");
 		list_del_init(&task->running);
+	}
+	spin_unlock_bh(&conn->taskqueuelock);
 
 	if (conn->task == task)
 		conn->task = NULL;
@@ -783,7 +787,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
 		if (session->tt->xmit_task(task))
 			goto free_task;
 	} else {
+		spin_lock_bh(&conn->taskqueuelock);
 		list_add_tail(&task->running, &conn->mgmtqueue);
+		spin_unlock_bh(&conn->taskqueuelock);
 		iscsi_conn_queue_work(conn);
 	}
 
@@ -1474,8 +1480,10 @@ void iscsi_requeue_task(struct iscsi_task *task)
 	 * this may be on the requeue list already if the xmit_task callout
 	 * is handling the r2ts while we are adding new ones
 	 */
+	spin_lock_bh(&conn->taskqueuelock);
 	if (list_empty(&task->running))
 		list_add_tail(&task->running, &conn->requeue);
+	spin_unlock_bh(&conn->taskqueuelock);
 	iscsi_conn_queue_work(conn);
 }
 EXPORT_SYMBOL_GPL(iscsi_requeue_task);
@@ -1512,22 +1520,26 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
 	 * only have one nop-out as a ping from us and targets should not
 	 * overflow us with nop-ins
 	 */
+	spin_lock_bh(&conn->taskqueuelock);
 check_mgmt:
 	while (!list_empty(&conn->mgmtqueue)) {
 		conn->task = list_entry(conn->mgmtqueue.next,
 					 struct iscsi_task, running);
 		list_del_init(&conn->task->running);
+		spin_unlock_bh(&conn->taskqueuelock);
 		if (iscsi_prep_mgmt_task(conn, conn->task)) {
 			/* regular RX path uses back_lock */
 			spin_lock_bh(&conn->session->back_lock);
 			__iscsi_put_task(conn->task);
 			spin_unlock_bh(&conn->session->back_lock);
 			conn->task = NULL;
+			spin_lock_bh(&conn->taskqueuelock);
 			continue;
 		}
 		rc = iscsi_xmit_task(conn);
 		if (rc)
 			goto done;
+		spin_lock_bh(&conn->taskqueuelock);
 	}
 
 	/* process pending command queue */
@@ -1535,19 +1547,24 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
 		conn->task = list_entry(conn->cmdqueue.next, struct iscsi_task,
 					running);
 		list_del_init(&conn->task->running);
+		spin_unlock_bh(&conn->taskqueuelock);
 		if (conn->session->state == ISCSI_STATE_LOGGING_OUT) {
 			fail_scsi_task(conn->task, DID_IMM_RETRY);
+			spin_lock_bh(&conn->taskqueuelock);
 			continue;
 		}
 		rc = iscsi_prep_scsi_cmd_pdu(conn->task);
 		if (rc) {
 			if (rc == -ENOMEM || rc == -EACCES) {
+				spin_lock_bh(&conn->taskqueuelock);
 				list_add_tail(&conn->task->running,
 					      &conn->cmdqueue);
 				conn->task = NULL;
+				spin_unlock_bh(&conn->taskqueuelock);
 				goto done;
 			} else
 				fail_scsi_task(conn->task, DID_ABORT);
+			spin_lock_bh(&conn->taskqueuelock);
 			continue;
 		}
 		rc = iscsi_xmit_task(conn);
@@ -1558,6 +1575,7 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
 		 * we need to check the mgmt queue for nops that need to
 		 * be sent to aviod starvation
 		 */
+		spin_lock_bh(&conn->taskqueuelock);
 		if (!list_empty(&conn->mgmtqueue))
 			goto check_mgmt;
 	}
@@ -1577,12 +1595,15 @@ static int iscsi_data_xmit(struct iscsi_conn *conn)
 		conn->task

Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch

2016-11-07 Thread Guilherme G. Piccoli
On 11/07/2016 04:15 PM, Chris Leech wrote:
> Hi,
> 
> I'm kicking this old thread because I don't think this ever got
> resolved.  I wish I had more info, but it seems to involve target-specific
> behavior that hasn't come up in our test labs.

Thanks very much for reopening this thread! We have Or's patch
reverted in multiple distros, so the issue is not likely to hit IBM's
customers, but the upstream kernel never saw a proper fix for this.


> 
> So what can I do at this point to help resolve this?
> 
> On Sun, Nov 15, 2015 at 12:10:48PM +0200, Or Gerlitz wrote:
>> Mike, Chris
>>
>> After the locking change, adding a task to any of the connection
>> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
>>
>> Removing tasks from any of these lists in iscsi_data_xmit is under
>> the session forward lock and **before** calling down to the transport
>> to handle the task.
>>
>> The iscsi_complete_task helper was added by Mike's commit
>> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
>> and is indeed typically called under the backward lock && has this section
>>
>> +	if (!list_empty(&task->running))
>> +		list_del_init(&task->running);
>>
>> which per my reading of the code never comes into play, can you comment?
>>
>> Let's address this area before we move on to the other claims made about the patch.
> 
> This bit in particular is where I see a cause for concern.  If that
> list_del_init call ever races against other list operations, there's a
> potential corruption.  It's presumably there for a reason, and Mike
> explained a case where some targets have been known to send a check
> condition at unexpected times that would hit it.
> 
> I don't like having known list locking violations hanging around, based
> on an expectation that we'll never hit that path with well behaved
> targets.
> 
> If we can get a fix worked up for the list locking here, can we get any
> testing on it from the original reports at IBM?  That was very helpful
> in testing a full reversion patch.

Sure! Count on us to test any patches. I guess the first step is to
reproduce on upstream, right? We haven't tested this specific
scenario in a long time. Will try to reproduce on 4.9-rc4 and update here.

Cheers,



Guilherme


> 
> - Chris Leech
> 



Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch

2016-11-07 Thread Chris Leech
Hi,

I'm kicking this old thread because I don't think this ever got
resolved.  I wish I had more info, but it seems to involve target-specific
behavior that hasn't come up in our test labs.

So what can I do at this point to help resolve this?

On Sun, Nov 15, 2015 at 12:10:48PM +0200, Or Gerlitz wrote:
> Mike, Chris
> 
> After the locking change, adding a task to any of the connection
> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
> 
> Removing tasks from any of these lists in iscsi_data_xmit is under
> the session forward lock and **before** calling down to the transport
> to handle the task.
> 
> The iscsi_complete_task helper was added by Mike's commit
> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
> and is indeed typically called under the backward lock && has this section
> 
> +	if (!list_empty(&task->running))
> +		list_del_init(&task->running);
> 
> which per my reading of the code never comes into play, can you comment?
> 
> Let's address this area before we move on to the other claims made about the patch.

This bit in particular is where I see a cause for concern.  If that
list_del_init call ever races against other list operations, there's a
potential corruption.  It's presumably there for a reason, and Mike
explained a case where some targets have been known to send a check
condition at unexpected times that would hit it.

I don't like having known list locking violations hanging around, based
on an expectation that we'll never hit that path with well behaved
targets.

If we can get a fix worked up for the list locking here, can we get any
testing on it from the original reports at IBM?  That was very helpful
in testing a full reversion patch.

- Chris Leech



Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch

2015-11-18 Thread Mike Christie

On 11/18/15, 5:30 AM, Or Gerlitz wrote:

On Mon, Nov 16, 2015 at 7:30 PM, Michael Christie  wrote:

On Nov 15, 2015, at 4:10 AM, Or Gerlitz  wrote:
On Fri, Nov 13, 2015 at 6:51 PM, Mike Christie  wrote:

On 11/13/2015 09:06 AM, Or Gerlitz wrote:



After the locking change, adding a task to any of the connection
mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.

Removing tasks from any of these lists in iscsi_data_xmit is under
the session forward lock and **before** calling down to the transport
to handle the task.

The iscsi_complete_task helper was added by Mike's commit
3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
and is indeed typically called under the backward lock && has this section

+	if (!list_empty(&task->running))
+		list_del_init(&task->running);

which per my reading of the code never comes into play, can you comment?



I had sent this to Sagi and your mellanox email the other day:



The bug occurs when a target completes a command while we are still
processing it. Say we are doing a WRITE and the iscsi_task
is on the cmdqueue because we are handling an R2T.


Mike,

I failed to find how an iscsi_task can be added again to the cmdqueue list,
nor how it can be added to the requeue list without the right locking, nor how
an iscsi_task can be on either of these lists when iscsi_complete_task
is invoked.


Are you only considering normal execution, or also the CC case I
mentioned in the last mail? For normal execution we are ok.




Specifically, my reading of the code says that these are the events over time:

t1. queuecommand :: we put the task on the cmdqueue list
(libiscsi.c:1741) - under fwd lock

t2. iscsi_data_xmit :: we remove the task from the cmdqueue list
(libiscsi.c:1537) - under fwd lock

when the R2T flow happens, we do the following

t3. iscsi_tcp_hdr_dissect --> iscsi_tcp_r2t_rsp --> iscsi_requeue_task ::
put the task on the requeue list -- the call to iscsi_tcp_r2t_rsp is
under the fwd lock


If the target were to send a CC at this point, we are adding the task
under the frwd lock, but the completion path would only have the back
lock. The list_empty, list_add and list_del calls then race, and we could
end up in a bad state, right?




t4. iscsi_data_xmit :: we remove the task from the requeue list
(libiscsi.c: 1578) - under fwd lock


We could also get the bad CC at this time and end up doing two list_dels
at the same time: the t4 one under the frwd lock and the CC-handling
completion one under the back lock, as in t3.




Do you agree that t1...t4 is what's possible for a given task? Or did I
miss something?


The target shouldn't
send a Check Condition at this time, but some do. If that happens, then
iscsi_queuecommand could be adding a new task to the cmdqueue while the
recv path is handling the CC for the task with the outstanding R2T.  The
recv path's iscsi_complete_task call sees that the task is on the cmdqueue and
deletes it from the list at the same time iscsi_queuecommand is adding a new
task.




This should not happen per the iscsi spec. There is some wording about
waiting to finish the sequence in progress, but targets goof this up.
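
To make the window concrete, here is a minimal userspace sketch of the
pattern (illustrative C with pthread mutexes standing in for the session
frwd/back locks, not libiscsi code): two different locks cannot serialize
updates to the same list.

/* Build: gcc -pthread cc_race_sketch.c -o cc_race_sketch */
#include <pthread.h>
#include <stdio.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static struct list_head cmdqueue = { &cmdqueue, &cmdqueue };
static pthread_mutex_t frwd_lock = PTHREAD_MUTEX_INITIALIZER; /* submission side */
static pthread_mutex_t back_lock = PTHREAD_MUTEX_INITIALIZER; /* completion side */

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = n;
}

/* t1: queuecommand adds a new task under the frwd lock only. */
static void *submit_thread(void *arg)
{
	pthread_mutex_lock(&frwd_lock);
	list_add_tail(arg, &cmdqueue);
	pthread_mutex_unlock(&frwd_lock);
	return NULL;
}

/* Unexpected CC: the completion path deletes under the back lock only. */
static void *complete_thread(void *arg)
{
	pthread_mutex_lock(&back_lock);
	list_del_init(arg);	/* races with list_add_tail above */
	pthread_mutex_unlock(&back_lock);
	return NULL;
}

int main(void)
{
	struct list_head t_new = { &t_new, &t_new };
	struct list_head t_r2t = { &t_r2t, &t_r2t };
	pthread_t a, b;

	/* The WRITE with the outstanding R2T is already on the queue... */
	list_add_tail(&t_r2t, &cmdqueue);

	/* ...when a new command is queued while the bad CC completes it.
	 * Different locks, same list: the pointer updates in the two
	 * helpers can interleave, leaving cmdqueue corrupt. */
	pthread_create(&a, NULL, submit_thread, &t_new);
	pthread_create(&b, NULL, complete_thread, &t_r2t);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	printf("done (the list may now be corrupt)\n");
	return 0;
}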



Re: [PATCH 1/1] iscsi: fix regression caused by session lock patch

2015-11-16 Thread Michael Christie

> On Nov 15, 2015, at 4:10 AM, Or Gerlitz  wrote:
> 
> On Fri, Nov 13, 2015 at 6:51 PM, Mike Christie  wrote:
>> On 11/13/2015 09:06 AM, Or Gerlitz wrote:
>>>> The patch has caused multiple regressions, did not even compile when
>>>> sent to me, and was poorly reviewed and I have not heard from you guys
>>>> in a week. Given the issues the patch has had and the current time, I do
>>>> not feel comfortable with it anymore. I want to re-review it and fix it
>>>> up when there is more time.
>>> Mike (Hi),
>>> 
>>> It's a complex patch that touches all the iscsi transports, and yes,
>>> when it was send to you the 1st time, there was build error on one of
>>> the offload transports (bad! but happens) and yes, as you pointed, one
>>> static checker fix + one bug fix for it went upstream after this has
>>> been merged, happens too.
>> 
>> A patch should not cause this many issues.
>> 
>>> What makes you say it was poorly reviewed?
>> 
>> I just did not do a good job at looking at the patch. I should have
>> caught all of these issues.
>> 
>> - The bnx2i cleanup_task bug should have been obvious, especially for me
>> because I had commented about the back lock and the abort path.
>> 
>> - This oops was so basic. Incorrect locking around a linked list being
>> accessed from 2 threads is really one of those 1st year kernel
>> programmer things.
> 
> Mike, Chris
> 
> After the locking change, adding a task to any of the connection
> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock.
> 
> Removing tasks from any of these lists in iscsi_data_xmit is under
> the session forward lock and **before** calling down to the transport
> to handle the task.
> 
> The iscsi_complete_task helper was added by Mike's commit
> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races"
> and is indeed typically called under the backward lock && has this section
> 
> +	if (!list_empty(&task->running))
> +		list_del_init(&task->running);
> 
> which per my reading of the code never comes into play, can you comment?


I had sent this to Sagi and your mellanox email the other day:


> The bug occurs when a target completes a command while we are still
> processing it. Say we are doing a WRITE and the iscsi_task
> is on the cmdqueue because we are handling an R2T. The target shouldn't
> send a Check Condition at this time, but some do. If that happens, then
> iscsi_queuecommand could be adding a new task to the cmdqueue while the
> recv path is handling the CC for the task with the outstanding R2T.  The
> recv path's iscsi_complete_task call sees that the task is on the cmdqueue and
> deletes it from the list at the same time iscsi_queuecommand is adding a new
> task.
> 
> This should not happen per the iscsi spec. There is some wording about
> waiting to finish the sequence in progress, but targets goof this up.



