Re: [PATCH] sd_zbc: Write unlock zone from sd_uninit_cmnd()

2017-08-10 Thread Bart Van Assche
On Thu, 2017-08-10 at 11:14 +0900, Damien Le Moal wrote:
> I am currently trying different approaches for this. In the mean time, I
> would like to see the unlock change patch be applied to fix the deadlock
> problem.

Hello Damien,

That approach sounds fine to me.

Bart.

Re: [PATCH] sd_zbc: Write unlock zone from sd_uninit_cmnd()

2017-08-09 Thread Damien Le Moal


On 8/10/17 11:14, Damien Le Moal wrote:
> Bart,
> 
> On 8/10/17 00:24, Bart Van Assche wrote:
>> On Wed, 2017-08-09 at 14:15 +0900, Damien Le Moal wrote:
>>> Overall, yes, that is possible. However, considering only write commands
>>> to a single zone, since at any time only one command is dequeued (the
>>> one holding the zone lock), having the "lock+dequeue" and
>>> "unlock+requeue" both atomic prevent any reordering of write commands
>>> directed to that zone. The first command in the queue for the zone will
>>> either (1) get the zone lock and be dequeued OR (2) stay in the queue.
>>> In case (1), the next write for the zone will hit case (2) until the
>>> dispatch command completes. If the dispatch command is requeued (at the
>>> dispatch queue head), we hit back again the (1) or (2) cases, without
>>> any change in the order. Isnt't it ?
>>
>> Hello Damien,
>>
>> Commands that get requeued are not reinserted at their original position
>> in the request queue but usually at a different position. This is what makes
>> requeueing cause request reordering. Anyway, can you check whether replacing
>> patch "sd_zbc: Write unlock zone from sd_uninit_cmnd()" with the (untested)
>> patch below does not trigger any lockups and prevents request reordering?
>> Since that patch keeps the zone lock across requeuing attempts it should
>> prevent request reordering.
> 
> Your patch will indeed prevent deadlock for the case where the write
> command that acquired the lock gets re-queued before running/completing.
> If after re-queueing the command end ups being *before* any following
> sequential write to the same zone, it will be fine as the flag indicates
> that the zone lock is already acquired. So the command prep will not
> cause deadlocking.
> 
> However, the failure pattern I have seen most of the time is a write
> command never tried before ending up at the head of the dispatch queue,
> trying to acquire the zone lock and failing to do so because the lock
> owning command is behind in the queue.
> 
> Ex: The dispatch queue has sequential write commands A at the head and
> write B following.
> (1) A is dequeued and prep-ed, the zone lock is acquired.
> (2) For whatever reason, A gets requeued, still holding the zone lock.
> (3) Between (1) and (2), another context tries to push command B and
> dequeues it. Zone lock fails and (B) goes for requeue (using a
> workqueue, so different context)
> (4) If the requeue work for A runs before the one for B (instead of both
> A and B ending up in the same work), then B ends up at the head of the
> queue.
> (5) From here, B fails systematically. Requests behind B are not tried
> since the device is marked busy. So B stays at the head and the failure
> persists.

Correction: At step (3), since command B is not issued to the HBA, it
goes for requeue in the same context as scsi_queue_rq() call.
At step (4), since A was issued, requeue is done with the requeue list
workqueue. But A can still end up behind B because initially, B is only
requeued at the head of the dispatch local list, which is itself spliced
into the dispatch queue after that. A requeue may still be done before B
goes back in the dispatch queue.

> 
> The patch I sent is not about solving the ordering problem. Never was.
> It just avoids all possible deadlock situations.
> 
> There are more reordering situations possible higher up in the stack.
> Ex: without a scheduler, blk-mq simply moves all submitted requests from
> the CPU context queues into the hctx dispatch queue, in CPU order. This
> means that a thread cleanly writing sequentially to a zone but moved
> from one CPU to another half-way through the sequence can see its write
> reordered in the dispatch queue. Same for the actual dispatch code
> calling scsi_queue_rq(). This code loops over a local list, not the hctx
> queue. So this code running from 2 different contexts with the second
> one starting half way through our well behaving thread sequence will end
> up splitting the sequence into 2 different local lists and so loosing
> the sequential ordering.
> 
> The no-scheduler case is a little special but needs handling. As for the
> blk-mq dispatch code, we need that serialized/single threaded for zoned
> block devices. I am inclined to say that this should be the case for any
> HDD anyway as performance can only be better. Ming's series already
> brings improvements, but no ordering guarantees for zoned devices.
> 
> I am currently trying different approaches for this. In the mean time, I
> would like to see the unlock change patch be applied to fix the deadlock
> problem.
> 
> Best.

Re: [PATCH] sd_zbc: Write unlock zone from sd_uninit_cmnd()

2017-08-09 Thread Damien Le Moal
Bart,

On 8/10/17 00:24, Bart Van Assche wrote:
> On Wed, 2017-08-09 at 14:15 +0900, Damien Le Moal wrote:
>> Overall, yes, that is possible. However, considering only write commands
>> to a single zone, since at any time only one command is dequeued (the
>> one holding the zone lock), having the "lock+dequeue" and
>> "unlock+requeue" both atomic prevent any reordering of write commands
>> directed to that zone. The first command in the queue for the zone will
>> either (1) get the zone lock and be dequeued OR (2) stay in the queue.
>> In case (1), the next write for the zone will hit case (2) until the
>> dispatch command completes. If the dispatch command is requeued (at the
>> dispatch queue head), we hit back again the (1) or (2) cases, without
>> any change in the order. Isnt't it ?
> 
> Hello Damien,
> 
> Commands that get requeued are not reinserted at their original position
> in the request queue but usually at a different position. This is what makes
> requeueing cause request reordering. Anyway, can you check whether replacing
> patch "sd_zbc: Write unlock zone from sd_uninit_cmnd()" with the (untested)
> patch below does not trigger any lockups and prevents request reordering?
> Since that patch keeps the zone lock across requeuing attempts it should
> prevent request reordering.

Your patch will indeed prevent deadlock for the case where the write
command that acquired the lock gets re-queued before running/completing.
If after re-queueing the command end ups being *before* any following
sequential write to the same zone, it will be fine as the flag indicates
that the zone lock is already acquired. So the command prep will not
cause deadlocking.

However, the failure pattern I have seen most of the time is a write
command never tried before ending up at the head of the dispatch queue,
trying to acquire the zone lock and failing to do so because the lock
owning command is behind in the queue.

Ex: The dispatch queue has sequential write commands A at the head and
write B following.
(1) A is dequeued and prep-ed, the zone lock is acquired.
(2) For whatever reason, A gets requeued, still holding the zone lock.
(3) Between (1) and (2), another context tries to push command B and
dequeues it. Zone lock fails and (B) goes for requeue (using a
workqueue, so different context)
(4) If the requeue work for A runs before the one for B (instead of both
A and B ending up in the same work), then B ends up at the head of the
queue.
(5) From here, B fails systematically. Requests behind B are not tried
since the device is marked busy. So B stays at the head and the failure
persists.

The patch I sent is not about solving the ordering problem. Never was.
It just avoids all possible deadlock situations.

There are more reordering situations possible higher up in the stack.
Ex: without a scheduler, blk-mq simply moves all submitted requests from
the CPU context queues into the hctx dispatch queue, in CPU order. This
means that a thread cleanly writing sequentially to a zone but moved
from one CPU to another half-way through the sequence can see its write
reordered in the dispatch queue. Same for the actual dispatch code
calling scsi_queue_rq(). This code loops over a local list, not the hctx
queue. So this code running from 2 different contexts with the second
one starting half way through our well behaving thread sequence will end
up splitting the sequence into 2 different local lists and so loosing
the sequential ordering.

The no-scheduler case is a little special but needs handling. As for the
blk-mq dispatch code, we need that serialized/single threaded for zoned
block devices. I am inclined to say that this should be the case for any
HDD anyway as performance can only be better. Ming's series already
brings improvements, but no ordering guarantees for zoned devices.

I am currently trying different approaches for this. In the mean time, I
would like to see the unlock change patch be applied to fix the deadlock
problem.

Best.

> 
> Thanks,
> 
> Bart.
> 
> ---
>  drivers/scsi/sd_zbc.c| 8 
>  include/scsi/scsi_cmnd.h | 3 +++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
> index 96855df9f49d..6d18b1bd3b26 100644
> --- a/drivers/scsi/sd_zbc.c
> +++ b/drivers/scsi/sd_zbc.c
> @@ -290,10 +290,15 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
>* threads running on different CPUs write to the same
>* zone (with a synchronized sequential pattern).
>*/
> + if (cmd->flags & SCMD_IN_PROGRESS)
> + return BLKPREP_OK;
> +
>   if (sdkp->zones_wlock &&
>   test_and_set_bit(zno, sdkp->zones_wlock))
>   return BLKPREP_DEFER;
>  
> + cmd->flags |= SCMD_

Re: [PATCH] sd_zbc: Write unlock zone from sd_uninit_cmnd()

2017-08-09 Thread Bart Van Assche
On Wed, 2017-08-09 at 14:15 +0900, Damien Le Moal wrote:
> Overall, yes, that is possible. However, considering only write commands
> to a single zone, since at any time only one command is dequeued (the
> one holding the zone lock), having the "lock+dequeue" and
> "unlock+requeue" both atomic prevent any reordering of write commands
> directed to that zone. The first command in the queue for the zone will
> either (1) get the zone lock and be dequeued OR (2) stay in the queue.
> In case (1), the next write for the zone will hit case (2) until the
> dispatch command completes. If the dispatch command is requeued (at the
> dispatch queue head), we hit back again the (1) or (2) cases, without
> any change in the order. Isnt't it ?

Hello Damien,

Commands that get requeued are not reinserted at their original position
in the request queue but usually at a different position. This is what makes
requeueing cause request reordering. Anyway, can you check whether replacing
patch "sd_zbc: Write unlock zone from sd_uninit_cmnd()" with the (untested)
patch below does not trigger any lockups and prevents request reordering?
Since that patch keeps the zone lock across requeuing attempts it should
prevent request reordering.

Thanks,

Bart.

---
 drivers/scsi/sd_zbc.c| 8 
 include/scsi/scsi_cmnd.h | 3 +++
 2 files changed, 11 insertions(+)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 96855df9f49d..6d18b1bd3b26 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -290,10 +290,15 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
 * threads running on different CPUs write to the same
 * zone (with a synchronized sequential pattern).
 */
+   if (cmd->flags & SCMD_IN_PROGRESS)
+   return BLKPREP_OK;
+
if (sdkp->zones_wlock &&
test_and_set_bit(zno, sdkp->zones_wlock))
return BLKPREP_DEFER;
 
+   cmd->flags |= SCMD_IN_PROGRESS;
+
return BLKPREP_OK;
 }
 
@@ -302,6 +307,9 @@ void sd_zbc_write_unlock_zone(struct scsi_cmnd *cmd)
struct request *rq = cmd->request;
struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
 
+   WARN_ON_ONCE(!(cmd->flags & SCMD_IN_PROGRESS));
+   cmd->flags &= ~SCMD_IN_PROGRESS;
+
if (sdkp->zones_wlock) {
unsigned int zno = sd_zbc_zone_no(sdkp, blk_rq_pos(rq));
WARN_ON_ONCE(!test_bit(zno, sdkp->zones_wlock));
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index a1266d318c85..f18c812094b5 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -57,6 +57,9 @@ struct scsi_pointer {
 /* for scmd->flags */
 #define SCMD_TAGGED(1 << 0)
 #define SCMD_UNCHECKED_ISA_DMA (1 << 1)
+#ifdef CONFIG_BLK_DEV_ZONED
+#define SCMD_IN_PROGRESS   (1 << 2)
+#endif
 
 struct scsi_cmnd {
struct scsi_request req;


Re: [PATCH] sd_zbc: Write unlock zone from sd_uninit_cmnd()

2017-08-08 Thread Damien Le Moal


On 8/9/17 13:15, Bart Van Assche wrote:
> On Wed, 2017-08-09 at 13:07 +0900, Damien Le Moal wrote:
>>
>> On 8/9/17 12:57, Bart Van Assche wrote:
>>> On Wed, 2017-08-09 at 11:47 +0900, Damien Le Moal wrote:
 On 8/9/17 11:19, Damien Le Moal wrote:
> On 8/9/17 00:07, Bart Van Assche wrote:
>> On Tue, 2017-08-08 at 13:17 +0900, Damien Le Moal wrote:
>>> acquired the lock completes can cause deadlocks with scsi-mq due to
>>> potential queue reordering if the lock owning request is requeued and
>>> not executed.
>>
>> Please note that even with scsi-sq requeueing can cause request 
>> reordering.
>
> I am painfully aware of this fact... I will update the commit message to
> reflect this.

 Correction: checking the code again, I just realized that the prep_fn
 queue function, which calls sd_init_cmnd() and causes a zone to be
 write-locked if the prepared command is a write, is called within
 blk_peek_request(). That function is called in scsi_request_fn() with
 the queue lock held. So for the legacy scsi path, this means that for a
 write command to a zone, the operation "lock zone and dequeue command"
 is atomic. Thanks to the zone lock, further dequeue of write commands
 for the same (locked) zone cannot happen and so there is no reordering
 possible for write commands directed at the same zone. Reordering for
 other commands is possible, but similarly to regular disks, this is not
 a problem with ZBC/ZAC drives.
>>>
>>> Hello Damien,
>>>
>>> The locking approach in the upstream code indeed prevents reordering of
>>> write requests that apply to the same zone. However, this patch modifies
>>> the locking approach. Are you sure that with this patch applied no request
>>> reordering will occur if requests are requeued since this patch causes the
>>> zone lock to be released before requeuing occurs?
>>
>> scsi_requeue_command() calls blk_unprep_request() and
>> blk_requeue_request under the queue lock. So atomically.
>> I think that suppresses any possibility of write requests to the same
>> zone to be reordered.
> 
> The queue lock is released after the request is requeued until the queue is
> rerun. Another request can get executed before the requeued request gets
> reexecuted. Or in other words, requeuing can change the order in which
> requests are executed.

Overall, yes, that is possible. However, considering only write commands
to a single zone, since at any time only one command is dequeued (the
one holding the zone lock), having the "lock+dequeue" and
"unlock+requeue" both atomic prevent any reordering of write commands
directed to that zone. The first command in the queue for the zone will
either (1) get the zone lock and be dequeued OR (2) stay in the queue.
In case (1), the next write for the zone will hit case (2) until the
dispatch command completes. If the dispatch command is requeued (at the
dispatch queue head), we hit back again the (1) or (2) cases, without
any change in the order. Isnt't it ?

-- 
Damien Le Moal,
Western Digital


Re: [PATCH] sd_zbc: Write unlock zone from sd_uninit_cmnd()

2017-08-08 Thread Bart Van Assche
On Wed, 2017-08-09 at 13:07 +0900, Damien Le Moal wrote:
> 
> On 8/9/17 12:57, Bart Van Assche wrote:
> > On Wed, 2017-08-09 at 11:47 +0900, Damien Le Moal wrote:
> > > On 8/9/17 11:19, Damien Le Moal wrote:
> > > > On 8/9/17 00:07, Bart Van Assche wrote:
> > > > > On Tue, 2017-08-08 at 13:17 +0900, Damien Le Moal wrote:
> > > > > > acquired the lock completes can cause deadlocks with scsi-mq due to
> > > > > > potential queue reordering if the lock owning request is requeued 
> > > > > > and
> > > > > > not executed.
> > > > > 
> > > > > Please note that even with scsi-sq requeueing can cause request 
> > > > > reordering.
> > > > 
> > > > I am painfully aware of this fact... I will update the commit message to
> > > > reflect this.
> > > 
> > > Correction: checking the code again, I just realized that the prep_fn
> > > queue function, which calls sd_init_cmnd() and causes a zone to be
> > > write-locked if the prepared command is a write, is called within
> > > blk_peek_request(). That function is called in scsi_request_fn() with
> > > the queue lock held. So for the legacy scsi path, this means that for a
> > > write command to a zone, the operation "lock zone and dequeue command"
> > > is atomic. Thanks to the zone lock, further dequeue of write commands
> > > for the same (locked) zone cannot happen and so there is no reordering
> > > possible for write commands directed at the same zone. Reordering for
> > > other commands is possible, but similarly to regular disks, this is not
> > > a problem with ZBC/ZAC drives.
> > 
> > Hello Damien,
> > 
> > The locking approach in the upstream code indeed prevents reordering of
> > write requests that apply to the same zone. However, this patch modifies
> > the locking approach. Are you sure that with this patch applied no request
> > reordering will occur if requests are requeued since this patch causes the
> > zone lock to be released before requeuing occurs?
> 
> scsi_requeue_command() calls blk_unprep_request() and
> blk_requeue_request under the queue lock. So atomically.
> I think that suppresses any possibility of write requests to the same
> zone to be reordered.

The queue lock is released after the request is requeued until the queue is
rerun. Another request can get executed before the requeued request gets
reexecuted. Or in other words, requeuing can change the order in which
requests are executed.

Bart.

Re: [PATCH] sd_zbc: Write unlock zone from sd_uninit_cmnd()

2017-08-08 Thread Damien Le Moal


On 8/9/17 12:57, Bart Van Assche wrote:
> On Wed, 2017-08-09 at 11:47 +0900, Damien Le Moal wrote:
>> On 8/9/17 11:19, Damien Le Moal wrote:
>>> On 8/9/17 00:07, Bart Van Assche wrote:
 On Tue, 2017-08-08 at 13:17 +0900, Damien Le Moal wrote:
> acquired the lock completes can cause deadlocks with scsi-mq due to
> potential queue reordering if the lock owning request is requeued and
> not executed.

 Please note that even with scsi-sq requeueing can cause request reordering.
>>>
>>> I am painfully aware of this fact... I will update the commit message to
>>> reflect this.
>>
>> Correction: checking the code again, I just realized that the prep_fn
>> queue function, which calls sd_init_cmnd() and causes a zone to be
>> write-locked if the prepared command is a write, is called within
>> blk_peek_request(). That function is called in scsi_request_fn() with
>> the queue lock held. So for the legacy scsi path, this means that for a
>> write command to a zone, the operation "lock zone and dequeue command"
>> is atomic. Thanks to the zone lock, further dequeue of write commands
>> for the same (locked) zone cannot happen and so there is no reordering
>> possible for write commands directed at the same zone. Reordering for
>> other commands is possible, but similarly to regular disks, this is not
>> a problem with ZBC/ZAC drives.
> 
> Hello Damien,
> 
> The locking approach in the upstream code indeed prevents reordering of
> write requests that apply to the same zone. However, this patch modifies
> the locking approach. Are you sure that with this patch applied no request
> reordering will occur if requests are requeued since this patch causes the
> zone lock to be released before requeuing occurs?

scsi_requeue_command() calls blk_unprep_request() and
blk_requeue_request under the queue lock. So atomically.
I think that suppresses any possibility of write requests to the same
zone to be reordered.

Best regards.

-- 
Damien Le Moal,
Western Digital


Re: [PATCH] sd_zbc: Write unlock zone from sd_uninit_cmnd()

2017-08-08 Thread Bart Van Assche
On Wed, 2017-08-09 at 11:47 +0900, Damien Le Moal wrote:
> On 8/9/17 11:19, Damien Le Moal wrote:
> > On 8/9/17 00:07, Bart Van Assche wrote:
> > > On Tue, 2017-08-08 at 13:17 +0900, Damien Le Moal wrote:
> > > > acquired the lock completes can cause deadlocks with scsi-mq due to
> > > > potential queue reordering if the lock owning request is requeued and
> > > > not executed.
> > > 
> > > Please note that even with scsi-sq requeueing can cause request 
> > > reordering.
> > 
> > I am painfully aware of this fact... I will update the commit message to
> > reflect this.
> 
> Correction: checking the code again, I just realized that the prep_fn
> queue function, which calls sd_init_cmnd() and causes a zone to be
> write-locked if the prepared command is a write, is called within
> blk_peek_request(). That function is called in scsi_request_fn() with
> the queue lock held. So for the legacy scsi path, this means that for a
> write command to a zone, the operation "lock zone and dequeue command"
> is atomic. Thanks to the zone lock, further dequeue of write commands
> for the same (locked) zone cannot happen and so there is no reordering
> possible for write commands directed at the same zone. Reordering for
> other commands is possible, but similarly to regular disks, this is not
> a problem with ZBC/ZAC drives.

Hello Damien,

The locking approach in the upstream code indeed prevents reordering of
write requests that apply to the same zone. However, this patch modifies
the locking approach. Are you sure that with this patch applied no request
reordering will occur if requests are requeued since this patch causes the
zone lock to be released before requeuing occurs?

Bart.

Re: [PATCH] sd_zbc: Write unlock zone from sd_uninit_cmnd()

2017-08-08 Thread Damien Le Moal
Bart,

On 8/9/17 11:19, Damien Le Moal wrote:
> Bart,
> 
> On 8/9/17 00:07, Bart Van Assche wrote:
>> On Tue, 2017-08-08 at 13:17 +0900, Damien Le Moal wrote:
>>> Releasing the write lock of a zone when the write commnand that
>> 
>> command?
>>> acquired the lock completes can cause deadlocks with scsi-mq due to
>>> potential queue reordering if the lock owning request is requeued and
>>> not executed.
>>
>> Please note that even with scsi-sq requeueing can cause request reordering.
> 
> I am painfully aware of this fact... I will update the commit message to
> reflect this.

Correction: checking the code again, I just realized that the prep_fn
queue function, which calls sd_init_cmnd() and causes a zone to be
write-locked if the prepared command is a write, is called within
blk_peek_request(). That function is called in scsi_request_fn() with
the queue lock held. So for the legacy scsi path, this means that for a
write command to a zone, the operation "lock zone and dequeue command"
is atomic. Thanks to the zone lock, further dequeue of write commands
for the same (locked) zone cannot happen and so there is no reordering
possible for write commands directed at the same zone. Reordering for
other commands is possible, but similarly to regular disks, this is not
a problem with ZBC/ZAC drives.

This is good, but a little suboptimal as request dispatch gets stalled
every time there are multiple write commands to the same zone queued.
This could be optimized to let other command go through and so operate
the drive at higher queue depth and reduce latency for operations like
reads.

The patch "block: Zoned block device single-threaded submission" that I
sent previously and that Christoph Nack-ed is thus in fact not necessary
at all.

>>> Since sd_uninit_cmnd() is always called when a request is requeued,
>>> call sd_zbc_write_unlock_zone() from that function for write requests
>>> that acquired a zone lock. Acquisition of a zone lock by a write command
>>> is indicated using the new command flag SCMD_ZONE_WRITE_LOCK.
>>
>> Hello Damien,
>>
>> Since this patch differs slightly from the version that was posted a few
>> days ago, should a "v2" label have been added to the subject and should a
>> changelog have been included? Anyway:
> 
> Yes, forgot to do that. Reposting.
> 
>> Reviewed-by: Bart Van Assche 
> 
> Thanks.
> 

-- 
Damien Le Moal,
Western Digital


Re: [PATCH] sd_zbc: Write unlock zone from sd_uninit_cmnd()

2017-08-08 Thread Damien Le Moal
Bart,

On 8/9/17 00:07, Bart Van Assche wrote:
> On Tue, 2017-08-08 at 13:17 +0900, Damien Le Moal wrote:
>> Releasing the write lock of a zone when the write commnand that
> 
> command?
>> acquired the lock completes can cause deadlocks with scsi-mq due to
>> potential queue reordering if the lock owning request is requeued and
>> not executed.
> 
> Please note that even with scsi-sq requeueing can cause request reordering.

I am painfully aware of this fact... I will update the commit message to
reflect this.

>> Since sd_uninit_cmnd() is always called when a request is requeued,
>> call sd_zbc_write_unlock_zone() from that function for write requests
>> that acquired a zone lock. Acquisition of a zone lock by a write command
>> is indicated using the new command flag SCMD_ZONE_WRITE_LOCK.
> 
> Hello Damien,
> 
> Since this patch differs slightly from the version that was posted a few
> days ago, should a "v2" label have been added to the subject and should a
> changelog have been included? Anyway:

Yes, forgot to do that. Reposting.

> Reviewed-by: Bart Van Assche 

Thanks.

-- 
Damien Le Moal,
Western Digital


Re: [PATCH] sd_zbc: Write unlock zone from sd_uninit_cmnd()

2017-08-08 Thread Bart Van Assche
On Tue, 2017-08-08 at 13:17 +0900, Damien Le Moal wrote:
> Releasing the write lock of a zone when the write commnand that

command?
> acquired the lock completes can cause deadlocks with scsi-mq due to
> potential queue reordering if the lock owning request is requeued and
> not executed.

Please note that even with scsi-sq requeueing can cause request reordering.

> Since sd_uninit_cmnd() is always called when a request is requeued,
> call sd_zbc_write_unlock_zone() from that function for write requests
> that acquired a zone lock. Acquisition of a zone lock by a write command
> is indicated using the new command flag SCMD_ZONE_WRITE_LOCK.

Hello Damien,

Since this patch differs slightly from the version that was posted a few
days ago, should a "v2" label have been added to the subject and should a
changelog have been included? Anyway:

Reviewed-by: Bart Van Assche 



[PATCH] sd_zbc: Write unlock zone from sd_uninit_cmnd()

2017-08-07 Thread Damien Le Moal
Releasing the write lock of a zone when the write commnand that
acquired the lock completes can cause deadlocks with scsi-mq due to
potential queue reordering if the lock owning request is requeued and
not executed.

Since sd_uninit_cmnd() is always called when a request is requeued,
call sd_zbc_write_unlock_zone() from that function for write requests
that acquired a zone lock. Acquisition of a zone lock by a write command
is indicated using the new command flag SCMD_ZONE_WRITE_LOCK.

Signed-off-by: Damien Le Moal 
Reviewed-by: Christoph Hellwig 
---
 drivers/scsi/sd.c| 3 +++
 drivers/scsi/sd_zbc.c| 9 +
 include/scsi/scsi_cmnd.h | 1 +
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index bea36adeee17..e2647f2d4430 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1277,6 +1277,9 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
 {
struct request *rq = SCpnt->request;
 
+   if (SCpnt->flags & SCMD_ZONE_WRITE_LOCK)
+   sd_zbc_write_unlock_zone(SCpnt);
+
if (rq->rq_flags & RQF_SPECIAL_PAYLOAD)
__free_page(rq->special_vec.bv_page);
 
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 96855df9f49d..8aa54779aac1 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -294,6 +294,9 @@ int sd_zbc_write_lock_zone(struct scsi_cmnd *cmd)
test_and_set_bit(zno, sdkp->zones_wlock))
return BLKPREP_DEFER;
 
+   WARN_ON_ONCE(cmd->flags & SCMD_ZONE_WRITE_LOCK);
+   cmd->flags |= SCMD_ZONE_WRITE_LOCK;
+
return BLKPREP_OK;
 }
 
@@ -302,9 +305,10 @@ void sd_zbc_write_unlock_zone(struct scsi_cmnd *cmd)
struct request *rq = cmd->request;
struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
 
-   if (sdkp->zones_wlock) {
+   if (sdkp->zones_wlock && cmd->flags & SCMD_ZONE_WRITE_LOCK) {
unsigned int zno = sd_zbc_zone_no(sdkp, blk_rq_pos(rq));
WARN_ON_ONCE(!test_bit(zno, sdkp->zones_wlock));
+   cmd->flags &= ~SCMD_ZONE_WRITE_LOCK;
clear_bit_unlock(zno, sdkp->zones_wlock);
smp_mb__after_atomic();
}
@@ -335,9 +339,6 @@ void sd_zbc_complete(struct scsi_cmnd *cmd,
case REQ_OP_WRITE_ZEROES:
case REQ_OP_WRITE_SAME:
 
-   /* Unlock the zone */
-   sd_zbc_write_unlock_zone(cmd);
-
if (result &&
sshdr->sense_key == ILLEGAL_REQUEST &&
sshdr->asc == 0x21)
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index a1266d318c85..6af198d8120b 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -57,6 +57,7 @@ struct scsi_pointer {
 /* for scmd->flags */
 #define SCMD_TAGGED(1 << 0)
 #define SCMD_UNCHECKED_ISA_DMA (1 << 1)
+#define SCMD_ZONE_WRITE_LOCK   (1 << 2)
 
 struct scsi_cmnd {
struct scsi_request req;
-- 
2.13.3