Re: [PATCH 1/4] blk-mq-tag: check for NULL rq when iterating tags

2017-08-03 Thread Jens Axboe
On 08/03/2017 02:50 PM, Bart Van Assche wrote:
> On Thu, 2017-08-03 at 14:40 -0600, Jens Axboe wrote:
>> On 08/03/2017 02:35 PM, Jens Axboe wrote:
>>>> I agree with what you wrote in the description of this patch.
>>>> However, since I have not yet found the code that clears tags->rqs[],
>>>> would it be possible to show me that code?
>>>
>>> Since it's been a month since I wrote this code, I went and looked
>>> too.  My memory was that we set/clear it dynamically since we added
>>> scheduling, but looks like we don't clear it. The race is still valid
>>> for when someone runs a tag check in parallel with someone allocating
>>> a tag, since there's a window of time where the tag bit is set, but
>>> ->rqs[tag] isn't set yet. That's probably the race I hit, not the
>>> completion race mentioned in the change log.
>>
>> Rewrote the commit message:
>>
>> http://git.kernel.dk/cgit/linux-block/commit/?h=mq-inflight=1908e43118e688e41ac8656edcf3e7a150f3f5081
> 
> Hello Jens,
> 
> This is what I found in the updated commit:
> 
> blk-mq-tag: check for NULL rq when iterating tags
> 
> Since we introduced blk-mq-sched, the tags->rqs[] array has been
> dynamically assigned. So we need to check for NULL when iterating,
> since there's a window of time where the bit is set, but we haven't
> dynamically assigned the tags->rqs[] array position yet.
> 
> This is perfectly safe, since the memory backing of the request is
> never going away while the device is alive.
> 
> Does this mean that blk_mq_tagset_busy_iter() can skip requests that it
> shouldn't skip and also that blk_mq_tagset_busy_iter() can pass a pointer to
> the previous request that was associated with a tag instead of the current
> request to its busy_tag_iter_fn argument? Shouldn't these races be fixed,
> e.g. by swapping the order in which the tag bit is set and tags->rqs[] is
> assigned such that the correct request pointer is passed to the
> busy_tag_iter_fn argument?

We can't swap them. We need to reserve a bit first, which entails
setting that bit. Once that's set, we can assign ->rqs[]. The race isn't
a big deal, and I don't want to add any code to prevent it, since that
would mean locking. Since we have the luxury of the request itself
always being valid memory, we can deal with stale info or having a NULL
because of the ordering. It's a conscious trade off.
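
To make the ordering concrete, here is a minimal userspace sketch of it.
It is not the blk-mq code: reserve_tag(), assign_rq(), free_tag(),
peek_tag(), NR_TAGS, tag_bits and the arrays are hypothetical names, and
C11 atomics stand in for the kernel's sbitmap and memory-ordering
primitives. It only models the two steps (set the bit, then assign the
pointer) and what a lockless reader can observe in between.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_TAGS 4				/* hypothetical tag-space size */

struct request { int tag; };			/* stand-in for struct request */

static struct request static_rqs[NR_TAGS];	/* preallocated, never freed */
static atomic_ulong tag_bits;			/* bit n set: tag n is reserved */
static struct request *_Atomic rqs[NR_TAGS];	/* tag -> request mapping */

/* Step 1: reserve a free bit. Returns the tag, or -1 if none is free. */
static int reserve_tag(void)
{
	for (int tag = 0; tag < NR_TAGS; tag++) {
		unsigned long mask = 1UL << tag;

		if (!(atomic_fetch_or(&tag_bits, mask) & mask))
			return tag;
	}
	return -1;
}

/* Step 2: only after the bit is owned, assign the request pointer. */
static void assign_rq(int tag, struct request *rq)
{
	assert(tag >= 0 && tag < NR_TAGS);
	rq->tag = tag;
	atomic_store(&rqs[tag], rq);
}

/* Freeing clears the bit only; rqs[tag] keeps its old pointer. */
static void free_tag(int tag)
{
	assert(tag >= 0 && tag < NR_TAGS);
	atomic_fetch_and(&tag_bits, ~(1UL << tag));
}

/*
 * Lockless reader. For a reserved tag it may return NULL (nothing has
 * been assigned yet) or a stale pointer (left over from the previous
 * owner). Either is safe to handle because the request memory is static.
 */
static struct request *peek_tag(int tag)
{
	assert(tag >= 0 && tag < NR_TAGS);
	if (!(atomic_load(&tag_bits) & (1UL << tag)))
		return NULL;
	return atomic_load(&rqs[tag]);
}
```

In this model, calling peek_tag() between reserve_tag() and assign_rq()
returns NULL the first time a tag is used, and the previous owner's
request on later reuse. Closing that window would require the reader and
the allocator to synchronize, which is the locking cost mentioned above.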

-- 
Jens Axboe



Re: [PATCH 1/4] blk-mq-tag: check for NULL rq when iterating tags

2017-08-03 Thread Bart Van Assche
On Thu, 2017-08-03 at 14:40 -0600, Jens Axboe wrote:
> On 08/03/2017 02:35 PM, Jens Axboe wrote:
> > > I agree with what you wrote in the description of this patch.
> > > However, since I have not yet found the code that clears tags->rqs[],
> > > would it be possible to show me that code?
> > 
> > Since it's been a month since I wrote this code, I went and looked
> > too.  My memory was that we set/clear it dynamically since we added
> > scheduling, but looks like we don't clear it. The race is still valid
> > for when someone runs a tag check in parallel with someone allocating
> > a tag, since there's a window of time where the tag bit is set, but
> > ->rqs[tag] isn't set yet. That's probably the race I hit, not the
> > completion race mentioned in the change log.
> 
> Rewrote the commit message:
> 
> http://git.kernel.dk/cgit/linux-block/commit/?h=mq-inflight=1908e43118e688e41ac8656edcf3e7a150f3f5081

Hello Jens,

This is what I found in the updated commit:

blk-mq-tag: check for NULL rq when iterating tags

Since we introduced blk-mq-sched, the tags->rqs[] array has been
dynamically assigned. So we need to check for NULL when iterating,
since there's a window of time where the bit is set, but we haven't
dynamically assigned the tags->rqs[] array position yet.

This is perfectly safe, since the memory backing of the request is
never going away while the device is alive.

Does this mean that blk_mq_tagset_busy_iter() can skip requests that it
shouldn't skip and also that blk_mq_tagset_busy_iter() can pass a pointer to
the previous request that was associated with a tag instead of the current
request to its busy_tag_iter_fn argument? Shouldn't these races be fixed,
e.g. by swapping the order in which the tag bit is set and tags->rqs[] is
assigned such that the correct request pointer is passed to the
busy_tag_iter_fn argument?

Thanks,

Bart.

Re: [PATCH 1/4] blk-mq-tag: check for NULL rq when iterating tags

2017-08-03 Thread Jens Axboe
On 08/03/2017 02:35 PM, Jens Axboe wrote:
>> I agree with what you wrote in the description of this patch.
>> However, since I have not yet found the code that clears tags->rqs[],
>> would it be possible to show me that code?
> 
> Since it's been a month since I wrote this code, I went and looked
> too.  My memory was that we set/clear it dynamically since we added
> scheduling, but looks like we don't clear it. The race is still valid
> for when someone runs a tag check in parallel with someone allocating
> a tag, since there's a window of time where the tag bit is set, but
> ->rqs[tag] isn't set yet. That's probably the race I hit, not the
> completion race mentioned in the change log.

Rewrote the commit message:

http://git.kernel.dk/cgit/linux-block/commit/?h=mq-inflight=1908e43118e688e41ac8656edcf3e7a150f3f5081

-- 
Jens Axboe



Re: [PATCH 1/4] blk-mq-tag: check for NULL rq when iterating tags

2017-08-03 Thread Jens Axboe
On 08/03/2017 02:29 PM, Bart Van Assche wrote:
> On Thu, 2017-08-03 at 14:01 -0600, Jens Axboe wrote:
>> Since we introduced blk-mq-sched, the tags->rqs[] array has been
>> dynamically assigned. So we need to check for NULL when iterating,
>> since we could be racing with completion.
>>
>> This is perfectly safe, since the memory backing of the request is
>> never going away while the device is alive. Only the pointer in
>> ->rqs[] may be reset.
>>
>> Signed-off-by: Jens Axboe 
>> ---
>>  block/blk-mq-tag.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
>> index d0be72ccb091..b856b2827157 100644
>> --- a/block/blk-mq-tag.c
>> +++ b/block/blk-mq-tag.c
>> @@ -214,7 +214,7 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
>>  		bitnr += tags->nr_reserved_tags;
>>  	rq = tags->rqs[bitnr];
>>  
>> -	if (rq->q == hctx->queue)
>> +	if (rq && rq->q == hctx->queue)
>>  		iter_data->fn(hctx, rq, iter_data->data, reserved);
>>  	return true;
>>  }
>> @@ -249,8 +249,8 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
>>  	if (!reserved)
>>  		bitnr += tags->nr_reserved_tags;
>>  	rq = tags->rqs[bitnr];
>> -
>> -	iter_data->fn(rq, iter_data->data, reserved);
>> +	if (rq)
>> +		iter_data->fn(rq, iter_data->data, reserved);
>>  	return true;
>>  }
> 
> Hello Jens,
> 
> I agree with what you wrote in the description of this patch. However, since
> I have not yet found the code that clears tags->rqs[], would it be possible
> to show me that code?

Since it's been a month since I wrote this code, I went and looked too.
My memory was that we set/clear it dynamically since we added
scheduling, but looks like we don't clear it. The race is still valid
for when someone runs a tag check in parallel with someone allocating a
tag, since there's a window of time where the tag bit is set, but
->rqs[tag] isn't set yet. That's probably the race I hit, not the
completion race mentioned in the change log.
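
For illustration, the window can be modelled in a few lines of userspace
C. This is only a sketch under assumptions, not the kernel iterator:
struct tags, busy_iter(), count_rq() and NR_TAGS are hypothetical
stand-ins, and a plain unsigned long replaces the sbitmap. It shows an
iterator that walks the set bits and skips any tag whose ->rqs[] slot has
not been assigned yet, which is what the added NULL check does.

```c
#include <assert.h>
#include <stddef.h>

#define NR_TAGS 8			/* hypothetical tag-space size */

struct request { int id; };		/* stand-in for struct request */

struct tags {
	unsigned long bitmap;		/* bit n set: tag n is reserved */
	struct request *rqs[NR_TAGS];	/* assigned after the bit is set */
};

typedef void (*busy_fn)(struct request *rq, void *data);

/*
 * Call fn for every reserved tag that already has a request assigned.
 * Returns the number of requests visited.
 */
static int busy_iter(struct tags *t, busy_fn fn, void *data)
{
	int visited = 0;

	assert(t && fn);
	for (unsigned int tag = 0; tag < NR_TAGS; tag++) {
		struct request *rq;

		if (!(t->bitmap & (1UL << tag)))
			continue;	/* tag not reserved */
		rq = t->rqs[tag];
		if (!rq)
			continue;	/* bit set, pointer not assigned yet */
		fn(rq, data);
		visited++;
	}
	return visited;
}

/* Example callback: count the requests the iterator hands over. */
static void count_rq(struct request *rq, void *data)
{
	(void)rq;
	(*(int *)data)++;
}
```

With two bits set but only one slot assigned, busy_iter() visits a single
request instead of dereferencing a NULL pointer for the other tag.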

-- 
Jens Axboe



Re: [PATCH 1/4] blk-mq-tag: check for NULL rq when iterating tags

2017-08-03 Thread Bart Van Assche
On Thu, 2017-08-03 at 14:01 -0600, Jens Axboe wrote:
> Since we introduced blk-mq-sched, the tags->rqs[] array has been
> dynamically assigned. So we need to check for NULL when iterating,
> since we could be racing with completion.
> 
> This is perfectly safe, since the memory backing of the request is
> never going away while the device is alive. Only the pointer in
> ->rqs[] may be reset.
> 
> Signed-off-by: Jens Axboe 
> ---
>  block/blk-mq-tag.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index d0be72ccb091..b856b2827157 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -214,7 +214,7 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
>  		bitnr += tags->nr_reserved_tags;
>  	rq = tags->rqs[bitnr];
>  
> -	if (rq->q == hctx->queue)
> +	if (rq && rq->q == hctx->queue)
>  		iter_data->fn(hctx, rq, iter_data->data, reserved);
>  	return true;
>  }
> @@ -249,8 +249,8 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
>  	if (!reserved)
>  		bitnr += tags->nr_reserved_tags;
>  	rq = tags->rqs[bitnr];
> -
> -	iter_data->fn(rq, iter_data->data, reserved);
> +	if (rq)
> +		iter_data->fn(rq, iter_data->data, reserved);
>  	return true;
>  }

Hello Jens,

I agree with what you wrote in the description of this patch. However, since
I have not yet found the code that clears tags->rqs[], would it be possible
to show me that code?

Thanks,

Bart.