Hi,

I'd like to understand the reasoning behind the buffer size used in the `qemu-img convert` command. Currently, `IO_BUF_SIZE` is defined as 2MB in `qemu-img.c`. Based on performance observations, I would like to propose a patch that increases this to the current upper limit of 16MB and makes the value configurable, once I understand the reasoning behind the existing values.
The motivation stems from analysing perf reports for image conversions run with different negotiated sizes against an nbdkit daemon, while investigating performance slowdowns with virt-v2v. The reports revealed small, inefficient network transfers. This buffer size is what qemu-img negotiates with nbdkit, as validated through some debug logging. Increasing the buffer size drastically improved performance, especially when the source image is accessed over the internet.

A quick look at bandwidth utilisation with the larger buffer size shows a significant improvement. Using nload as an example, an otherwise-idle system shows very low bandwidth utilisation with a 2MB buffer:

Using 2MB buffer sizes
======================
Incoming:                    Outgoing:
Curr: 33.44 MBit/s           Curr: 1.02 MBit/s
Avg:  35.19 MBit/s           Avg:  1.09 MBit/s
Min:  1.13 MBit/s            Min:  34.52 kBit/s
Max:  50.19 MBit/s           Max:  3.39 MBit/s
Ttl:  473.11 GByte           Ttl:  9.26 GByte

On the same setup, a larger buffer size of 16MB yields much better results, hitting peaks of ~267 MBit/s, as opposed to ~50 MBit/s with a 2MB buffer. This is a significant improvement and therefore warranted some closer analysis.

Using 16MB buffer sizes
=======================
Incoming:                    Outgoing:
Curr: 267.08 MBit/s          Curr: 7.89 MBit/s
Avg:  117.76 MBit/s          Avg:  3.18 MBit/s
Min:  4.20 kBit/s            Min:  47.87 kBit/s
Max:  267.30 MBit/s          Max:  9.40 MBit/s
Ttl:  369.93 GByte           Ttl:  9.34 GByte

It is also important to note that this problem is visible for a specific combination: qemu-img used alongside virt-v2v (virt-v2v 1.42, as is the case on Oracle Linux 8). While the internals of virt-v2v are out of scope for this forum, what is relevant here is that qemu-img uses this buffer size when negotiating receive window sizes with nbdkit during initialisation, which bottlenecks its own performance.

Can anybody explain the reasoning behind keeping `IO_BUF_SIZE` at 2MB?
Are there specific constraints (e.g., memory usage, compatibility, or other platform-specific limitations) that prevent increasing it to 16MB (or higher)?

I'd appreciate any insight into the reasoning behind this value. I can provide a patch if the discussion deems it relevant.

Regards,
Akash Kulhalli
