On Tue, 21 Jan 2025, Christoph Hellwig wrote:
> On Mon, Jan 20, 2025 at 04:16:26PM +0100, Mikulas Patocka wrote:
> > Some SATA SSDs and most NVMe SSDs report physical block size 512 bytes,
> > but they use 4K remapping table internally and they do slow
> > read-modify-write cycle for requests that are not aligned on 4K boundary.
> > Therefore, io_opt should be aligned on 4K.
>
> Not really. I mean it's always smart to not do tiny unaligned I/O
> unless you have to. So we're not just going to cap an exported value
> to a magic number because of something.
The purpose of this patch is to avoid issuing I/O that is not aligned on a
4k boundary.
The 512-byte value that some SSDs report is just a lie.
> > Signed-off-by: Mikulas Patocka <mpato...@redhat.com>
> > Fixes: a23634644afc ("block: take io_opt and io_min into account for
> > max_sectors")
> > Fixes: 9c0ba14828d6 ("blk-settings: round down io_opt to
> > physical_block_size")
>
> Please explain how this actually is a fix.
Some USB-SATA bridges report an optimal I/O size of 33553920 bytes (that is
512*65535). If you connect a SATA SSD that reports a 512-byte physical
sector size to this kind of USB-SATA bridge, the kernel will believe that
33553920 is a valid optimal I/O size and will attempt to align I/O to this
boundary. The result is that most of the I/O will not be aligned on 4k,
causing performance degradation.
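As a quick check of the arithmetic (a sketch only; the variable names are
illustrative and are not kernel identifiers), the reported value leaves a
non-zero remainder modulo 4096, and because 65535 is odd only every 8th
multiple of it lands on a 4k boundary:

```python
# Illustrative arithmetic only; these names are hypothetical, not kernel code.
SECTOR = 512
FOUR_K = 4096

io_opt = SECTOR * 65535            # value reported by the USB-SATA bridge
remainder = io_opt % FOUR_K        # non-zero means io_opt is not 4k-aligned
rounded_down = io_opt - remainder  # largest 4k multiple not exceeding io_opt

# Among the first 8 multiples of io_opt, find those that fall on a 4k boundary.
aligned = [k for k in range(1, 9) if (k * io_opt) % FOUR_K == 0]

print(io_opt, remainder, rounded_down, aligned)
```

Here rounded_down is simply what rounding the value down to a 4k multiple
would give.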
An optimal I/O size of 33553920 may also confuse userspace tools (lvm,
cryptsetup) into aligning logical volumes on a 33553920-byte boundary.
Mikulas