On 5/31/23 18:04, Eric Blake wrote:
> On Wed, May 31, 2023 at 01:29:30PM +0200, Laszlo Ersek wrote:
>>>> Putting alignment aside, I don't understand why reducing "count" to
>>>> uint16_t would be reasonable. With the current 32-bit-only block
>>>> descriptor, we already need to write loops in libnbd clients, because we
>>>> can't cover the entire remote image in one API call [*]. If I understood
>>>> Eric right earlier, the 64-bit extensions were supposed to remedy that
>>>> -- but as it stands, clients will still need loops ("chunking") around
>>>> block status fetching; is that right?
>>>
>>> While the larger extents reduce the need for looping, they do not
>>> entirely eliminate it. For example, just because the server can now
>>> tell you that an image is entirely data in just one reply does not
>>> mean that it will actually do so -- qemu in particular limits block
>>> status of a qcow2 file to reporting just one cluster at a time for
>>> consistency reasons, where even if you use the maximum cluster size
>>> of 2M, you can never get more than (2M/16)*2M = 256G of status
>>> reported in a single request.
>>
>> I don't understand the calculation. I can imagine the following
>> interpretation:
>>
>> - QEMU never sends more than 128K block descriptors, and each descriptor
>>   covers one 2MB sized cluster --> 256 GB of the disk covered in one go.
>>
>> But I don't understand where the (2M/16) division comes from, even
>> though the quotient is 128K.
>
> Ah, I need to provide more backstory on the qcow2 format. A qcow2
> image has a fixed cluster size, chosen between 512 bytes and 2M. A
> smaller cluster size has less wasted space for small images, but uses
> more overhead. Each cluster has to be stored in an L1 map, where pages
> of the map are also a cluster in length, with 16 bytes per map entry.
> So if you pick a cluster size of 512, you get 512/16 or 32 entries
> per L1 page; if you pick a cluster size of 2M, you get 2M/16 or 128k
> entries per L1 page. When reporting block status, qemu reads at most
> one L1 page and then says how each cluster referenced from that page
> is mapped.
>
> https://gitlab.com/qemu-project/qemu/-/blob/master/docs/interop/qcow2.txt#L491
>
>>
>> I can connect the constant "128K", and
>> <https://github.com/NetworkBlockDevice/nbd/commit/926a51df>, to your
>> paragraph [*] above, but not the division.
>
> In this case, the qemu limit on reporting block status of at most one
> L1 map page at a time happens to have no relationship to the NBD
> constant that limits block status replies to no more than 1M extents
> (8M bytes) in a single reply, nor to the fact that qemu picked a cap
> of 1M bytes (128k extents) on its NBD reply regardless of whether the
> underlying image is qcow2 or some other format.
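To make the (2M/16)*2M = 256G arithmetic concrete, here is a minimal sketch of the calculation Eric describes (the helper names are mine, not from qemu): one map page is itself a cluster, each map entry is 16 bytes and describes one cluster, so one page of entries bounds how much status a single request can cover.

```python
# Sketch of the qcow2 map-page arithmetic from the thread above.
# Assumptions (hypothetical helper names): a map page is one cluster
# in size, and each 16-byte entry describes one cluster of the image.

ENTRY_SIZE = 16  # bytes per map entry, per the qcow2 discussion above

def entries_per_page(cluster_size: int) -> int:
    # A map page is itself cluster-sized, so it holds this many entries.
    return cluster_size // ENTRY_SIZE

def max_status_per_request(cluster_size: int) -> int:
    # Each entry maps one cluster, so one page describes this many bytes.
    return entries_per_page(cluster_size) * cluster_size

MiB = 1024 * 1024
GiB = 1024 * MiB

# 512-byte clusters: 512/16 = 32 entries per page.
assert entries_per_page(512) == 32
# 2M clusters: 2M/16 = 128k entries per page.
assert entries_per_page(2 * MiB) == 128 * 1024
# (2M/16) * 2M = 256G of status reportable in one request.
assert max_status_per_request(2 * MiB) == 256 * GiB
```

This also shows why the quotient 128K matching qemu's 128k-extent NBD reply cap is a coincidence, as Eric notes: the two limits come from unrelated sources.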
Thanks!
[...]
Laszlo

_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs