On 11.04.25 04:04, Eric Blake wrote:
The 'want_zero' parameter to raw_co_block_status() was added so that
we can avoid potentially time-consuming lseek(SEEK_DATA) calls
throughout the file (working around poor filesystems that have O(n)
rather than O(1) extent probing).  But when it comes to learning if a
file is completely sparse (for example, it was just created), always
claiming that a file is all data without even checking offset 0 breaks
what would otherwise be attempts at useful optimizations for a
known-zero mirror destination.

Note that this allows file-posix to report a file as completely zero
if it was externally created (such as via 'truncate --size=$n file')
as entirely sparse; however, it does NOT work for files created
internally by blockdev-create.  That's because blockdev-create
intentionally does a sequence of truncate(0), truncate(size),
allocate_first_block(), in order to make it possible for gluster on
XFS to probe the sector size for direct I/O (which doesn't work if the
first block is sparse).  That will be addressed in a later patch.

Signed-off-by: Eric Blake <ebl...@redhat.com>
---
  block/file-posix.c | 9 ++++++++-
  1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 56d1972d156..67e83528cf5 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3217,7 +3217,14 @@ static int coroutine_fn 
raw_co_block_status(BlockDriverState *bs,
          return ret;
      }

-    if (!want_zero) {
+    /*
+     * If want_zero is clear, then the caller wants speed over
+     * accuracy, and the only place where SEEK_DATA should be
+     * attempted is at the start of the file to learn if the file has
+     * any data at all (anywhere else, just blindly claim the entire
+     * file is data).
+     */
+    if (!want_zero && offset) {
          *pnum = bytes;
          *map = offset;
          *file = bs;

Looks like a hack. So we have bdrv_co_is_zero_fast() which do pass 
want_zero=false to block-status. But in case of mirror, which want to check the 
whole disk, we actually want want_zero=true, and detect it by offset=0..

Isn't it better to add a kind of bdrv_is_zero_middle_speed() (which means, 
don't try to read the data to check, but be free to use suboptimal lseek call 
or something like this), which will pass want_zero=true, and use it from 
mirror? Mirror case differs from usage in qcow2 exactly by the fact that we 
call it only once.


Another doubt (really weak): can this one extra lseek be so slow, that mirror 
becomes worse?
Den, is it right, that problems about slow lseek (that we experienced several 
years ago) were about qcow2-internals, and nothing related to mirror itself? 
May one lseek call on mirror target break something?


--
Best regards,
Vladimir


Reply via email to