On 05/22/2017 10:24 PM, Eric Blake wrote: > On 05/19/2017 04:34 AM, Anton Nefedov wrote: >> If COW area of the newly allocated cluster is zeroes, there is no reason >> to write zero sectors in perform_cow() again now as whole clusters are >> zeroed out in single chunks by handle_alloc_space(). > But that's only true if you can guarantee that handle_alloc_space() > succeeded at ensuring the cluster reads as zeroes. If you silently > ignore errors (which is what patch 1/13 does), you risk assuming that > the cluster reads as zeroes when in reality it does not, and then you > have corrupted data. > > The idea of avoiding a COW of areas that read as zero at the source when > the destination also already reads as zeroes makes sense, but I'm not > convinced that this patch is safe as written. we will recheck error path. OK.
>> Introduce QCowL2Meta field "reduced", since the existing fields >> (offset and nb_bytes) still has to keep other write requests from >> simultaneous writing in the area >> >> iotest 060: >> write to the discarded cluster does not trigger COW anymore. >> so, break on write_aio event instead, will work for the test >> (but write won't fail anymore, so update reference output) >> >> iotest 066: >> cluster-alignment areas that were not really COWed are now detected >> as zeroes, hence the initial write has to be exactly the same size for >> the maps to match >> >> performance tests: === >> >> qemu-io, >> results in seconds to complete (less is better) >> random write 4k to empty image, no backing >> HDD >> 64k cluster >> 128M over 128M image: 160 -> 160 ( x1 ) >> 128M over 2G image: 86 -> 84 ( x1 ) >> 128M over 8G image: 40 -> 29 ( x1.4 ) >> 1M cluster >> 32M over 8G image: 58 -> 23 ( x2.5 ) >> >> SSD >> 64k cluster >> 2G over 2G image: 71 -> 38 ( x1.9 ) >> 512M over 8G image: 85 -> 8 ( x10.6 ) >> 1M cluster >> 128M over 32G image: 314 -> 2 ( x157 ) > At any rate, the benchmark numbers show that there is merit to pursuing > the idea of reducing I/O when partial cluster writes can avoid writing > COW'd zeroes on either side of the data. > yes! This is exactly the point and also with this approach we would allow sequential non-aligned to cluster writes, which is also very good. Den