On 06.10.25 17:57, Stefan Hajnoczi wrote:
On Fri, Oct 03, 2025 at 10:55:09AM +0300, Vladimir Sementsov-Ogievskiy wrote:
On 02.10.25 21:39, Stefan Hajnoczi wrote:
Linux block devices require write zeroes alignment whereas files do not.
It may come as a surprise that block devices opened in buffered I/O mode
require the alignment although regular read/write requests do not.
Therefore it is necessary to populate the pwrite_zeroes_alignment field.
Signed-off-by: Stefan Hajnoczi <[email protected]>
---
block/file-posix.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/block/file-posix.c b/block/file-posix.c
index 8c738674ce..05c92c824d 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1602,6 +1602,23 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
             bs->bl.pdiscard_alignment = dalign;
         }
+
+#ifdef __linux__
+        /*
+         * When request_alignment > 1, pwrite_zeroes_alignment does not need to
+         * be set explicitly. When request_alignment == 1, it must be set
+         * explicitly because Linux requires logical block size alignment.
+         */
+        if (bs->bl.request_alignment == 1) {
+            ret = probe_logical_blocksize(s->fd,
+                                          &bs->bl.pwrite_zeroes_alignment);
+            if (ret < 0) {
+                error_setg_errno(errp, -ret,
+                                 "Failed to probe logical block size");
Isn't this too restrictive? Could we treat a failed probe as permission to
proceed without write-zeroes alignment? In raw_probe_alignment(), we fall back
to guessing request_alignment from memalign.
The logical block size alignment is required for write zeroes, otherwise
write zeroes will fail with EINVAL (not ENOTSUP).
There is no way to probe in the !needs_alignment case since read
requests don't require alignment and write zeroes would be destructive.
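To make the constraint concrete, here is a minimal sketch of the check that a write zeroes request must pass on a Linux block device. The helper name write_zeroes_is_aligned() is hypothetical and not part of QEMU; it assumes the logical block size is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU code): returns true when both the offset
 * and the length are multiples of 'align'. 'align' is assumed to be a
 * non-zero power of two, as logical block sizes are in practice.
 */
static bool write_zeroes_is_aligned(uint64_t offset, uint64_t bytes,
                                    uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    /* OR-ing offset and length lets a single mask test cover both. */
    return ((offset | bytes) & (uint64_t)(align - 1)) == 0;
}
```

With align == 1 every request passes, which matches the buffered I/O case for regular files; with align == 512 a request at offset 100 is rejected.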
Theoretically, if we also implemented some kind of automatic handling of
unaligned tails (as we do for read/write request_alignment) to support a
"required write-zeroes alignment", we could postpone probing until the first
write-zeroes operation. But that seems like too much work (and complex logic
to maintain in the future) for nothing.
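For illustration only, the "unaligned tails" handling mentioned above could look roughly like the following split of a request into an unaligned head, an aligned body, and an unaligned tail. The type ZeroSplit and the function split_write_zeroes() are hypothetical names, not existing QEMU code, and the sketch assumes a power-of-two alignment and no overflow of offset + bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: byte lengths of the three parts of a request. */
typedef struct {
    uint64_t head;  /* unaligned bytes before the first aligned boundary */
    uint64_t body;  /* aligned middle part, eligible for write zeroes */
    uint64_t tail;  /* unaligned bytes after the last aligned boundary */
} ZeroSplit;

/*
 * Split [offset, offset + bytes) at 'align' boundaries. Head and tail
 * would have to be zeroed with ordinary writes; only the body could be
 * passed to an aligned write zeroes operation.
 */
static ZeroSplit split_write_zeroes(uint64_t offset, uint64_t bytes,
                                    uint32_t align)
{
    ZeroSplit s = { 0, 0, 0 };
    uint64_t mask = (uint64_t)align - 1;
    uint64_t end = offset + bytes;
    uint64_t start_up = (offset + mask) & ~mask;  /* round start up */
    uint64_t end_down = end & ~mask;              /* round end down */

    assert(align != 0 && (align & mask) == 0);

    if (start_up >= end_down) {
        /* No whole aligned block inside the request: treat it all as head. */
        s.head = bytes;
        return s;
    }

    s.head = start_up - offset;
    s.body = end_down - start_up;
    s.tail = end - end_down;
    return s;
}
```

For a request at offset 100 of 1000 bytes with 512-byte alignment, this yields a 412-byte head, a 512-byte body, and a 76-byte tail.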
I think it's preferable to fail here. This should never happen on a
Linux kernel because BLKSSZGET has been there since the initial git
import in 2005.
Agreed.
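As a rough sketch of what such a probe involves, the following queries the logical block size with the BLKSSZGET ioctl and reports failure as a negative errno value. The function name probe_logical_blocksize_sketch() is made up for this example; QEMU's actual probe_logical_blocksize() may differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* BLKSSZGET */

/*
 * Sketch only, Linux-specific. Returns 0 and stores the logical block
 * size in *sector_size on success, or -errno on failure. On failure
 * *sector_size is left untouched.
 */
static int probe_logical_blocksize_sketch(int fd, unsigned int *sector_size)
{
    int size = 0;

    assert(sector_size != NULL);

    /* BLKSSZGET writes the logical block size into an int. */
    if (ioctl(fd, BLKSSZGET, &size) < 0) {
        return -errno;
    }

    *sector_size = (unsigned int)size;
    return 0;
}
```

A caller following the approach in the patch would treat any negative return value as a hard error rather than falling back to a guessed alignment.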
+                return;
+            }
+        }
+#endif /* __linux__ */
     }
     raw_refresh_zoned_limits(bs, &st, errp);
--
Best regards,
Vladimir