Re: [dm-devel] [PATCH] block: Improve io_opt limit stacking

2020-05-25 Thread Keith Busch
On Fri, May 22, 2020 at 09:36:18AM -0400, Martin K. Petersen wrote:
> 
> >>> + if (t->io_opt & (t->physical_block_size - 1))
> >>> + t->io_opt = lcm(t->io_opt, t->physical_block_size);
> >
> >> Any comment on this patch ?  Note: the patch the patch "nvme: Fix
> >> io_opt limit setting" is already queued for 5.8.
> >
> > Setting io_opt to the physical block size is not correct.
> 
> Oh, missed the lcm(). But I'm still concerned about twiddling io_opt to
> a value different than the one reported by an underlying device.
>
> Setting io_opt to something that's less than a full stripe width in a
> RAID, for instance, doesn't produce the expected result. So I think I'd
> prefer not to set io_opt at all if it isn't consistent across all the
> stacked devices.

We already modify the stacked io_opt value if the two request_queues
report different io_opt's. If, however, only one reports an io_opt value,
and it happens to not align with the other's physical block size, the
code currently rejects stacking those devices. Damien's patch should
just set a larger io_opt value that aligns with both, so if one io_opt
is a RAID stripe size, the stacked result will be multiple stripes.

I think that makes sense, but please do let us know if you think such
mismatched devices just shouldn't stack.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [PATCH] block: Improve io_opt limit stacking

2020-05-22 Thread Martin K. Petersen


>>> +   if (t->io_opt & (t->physical_block_size - 1))
>>> +   t->io_opt = lcm(t->io_opt, t->physical_block_size);
>
>> Any comment on this patch ?  Note: the patch the patch "nvme: Fix
>> io_opt limit setting" is already queued for 5.8.
>
> Setting io_opt to the physical block size is not correct.

Oh, missed the lcm(). But I'm still concerned about twiddling io_opt to
a value different than the one reported by an underlying device.

Setting io_opt to something that's less than a full stripe width in a
RAID, for instance, doesn't produce the expected result. So I think I'd
prefer not to set io_opt at all if it isn't consistent across all the
stacked devices.

Let me chew on it for a bit...

-- 
Martin K. Petersen  Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [PATCH] block: Improve io_opt limit stacking

2020-05-22 Thread Martin K. Petersen


Damien,

>> +if (t->io_opt & (t->physical_block_size - 1))
>> +t->io_opt = lcm(t->io_opt, t->physical_block_size);

> Any comment on this patch ?  Note: the patch the patch "nvme: Fix
> io_opt limit setting" is already queued for 5.8.

Setting io_opt to the physical block size is not correct.

-- 
Martin K. Petersen  Oracle Linux Engineering

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [PATCH] block: Improve io_opt limit stacking

2020-05-22 Thread Damien Le Moal
On 2020/05/14 15:58, Damien Le Moal wrote:
> When devices with different physical sector sizes are stacked, the
> largest value is used as the stacked device physical sector size. For
> the optimal IO size, the lowest common multiple (lcm) of the underlying
> devices is used for the stacked device. In this scenario, if only one of
> the underlying device reports an optimal IO size, that value is used as
> is for the stacked device but that value may not be a multiple of the
> stacked device physical sector size. In this case, blk_stack_limits()
> returns an error resulting in warnings being printed on device mapper
> startup (observed with dm-zoned dual drive setup combining a 512B
> sector SSD with a 4K sector HDD).
> 
> To fix this, rather than returning an error, the optimal IO size limit
> for the stacked device can be adjusted to the lowest common multiple
> (lcm) of the stacked physical sector size and optimal IO size, resulting
> in a value that is a multiple of the physical sector size while still
> being an optimal size for the underlying devices.
> 
> This patch is complementary to the patch "nvme: Fix io_opt limit
> setting" which prevents the nvme driver from reporting an optimal IO
> size equal to a namespace sector size for a device that does not report
> an optimal IO size.
> 
> Suggested-by: Keith Busch 
> Signed-off-by: Damien Le Moal 
> ---
>  block/blk-settings.c | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 9a2c23cd9700..9a2b017ff681 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -561,11 +561,8 @@ int blk_stack_limits(struct queue_limits *t, struct 
> queue_limits *b,
>   }
>  
>   /* Optimal I/O a multiple of the physical block size? */
> - if (t->io_opt & (t->physical_block_size - 1)) {
> - t->io_opt = 0;
> - t->misaligned = 1;
> - ret = -1;
> - }
> + if (t->io_opt & (t->physical_block_size - 1))
> + t->io_opt = lcm(t->io_opt, t->physical_block_size);
>  
>   t->raid_partial_stripes_expensive =
>   max(t->raid_partial_stripes_expensive,
> 

Jens,

Any comment on this patch ?
Note: the patch the patch "nvme: Fix io_opt limit setting" is already queued for
5.8.

-- 
Damien Le Moal
Western Digital Research



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel



[dm-devel] [PATCH] block: Improve io_opt limit stacking

2020-05-14 Thread Damien Le Moal
When devices with different physical sector sizes are stacked, the
largest value is used as the stacked device physical sector size. For
the optimal IO size, the lowest common multiple (lcm) of the underlying
devices is used for the stacked device. In this scenario, if only one of
the underlying device reports an optimal IO size, that value is used as
is for the stacked device but that value may not be a multiple of the
stacked device physical sector size. In this case, blk_stack_limits()
returns an error resulting in warnings being printed on device mapper
startup (observed with dm-zoned dual drive setup combining a 512B
sector SSD with a 4K sector HDD).

To fix this, rather than returning an error, the optimal IO size limit
for the stacked device can be adjusted to the lowest common multiple
(lcm) of the stacked physical sector size and optimal IO size, resulting
in a value that is a multiple of the physical sector size while still
being an optimal size for the underlying devices.

This patch is complementary to the patch "nvme: Fix io_opt limit
setting" which prevents the nvme driver from reporting an optimal IO
size equal to a namespace sector size for a device that does not report
an optimal IO size.

Suggested-by: Keith Busch 
Signed-off-by: Damien Le Moal 
---
 block/blk-settings.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 9a2c23cd9700..9a2b017ff681 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -561,11 +561,8 @@ int blk_stack_limits(struct queue_limits *t, struct 
queue_limits *b,
}
 
/* Optimal I/O a multiple of the physical block size? */
-   if (t->io_opt & (t->physical_block_size - 1)) {
-   t->io_opt = 0;
-   t->misaligned = 1;
-   ret = -1;
-   }
+   if (t->io_opt & (t->physical_block_size - 1))
+   t->io_opt = lcm(t->io_opt, t->physical_block_size);
 
t->raid_partial_stripes_expensive =
max(t->raid_partial_stripes_expensive,
-- 
2.25.4

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel