On 05.11.2013 09:51, Stefan Hajnoczi wrote:
> On Sat, Oct 26, 2013 at 03:03:09PM +0200, Max Reitz wrote:
>> On 20.09.2013 12:32, Stefan Hajnoczi wrote:
>>> On Thu, Sep 19, 2013 at 05:07:56PM +0200, Max Reitz wrote:
>>>> As far as I understand, the I/O speed (the duration of an I/O
>>>> operation) should be pretty much the same for all scenarios,
>>>> however, the latency is the value in question (since the overlap
>>>> checks should affect the latency only).
>>> The other value to look at is the host CPU consumption per I/O. In
>>> other words, the CPU overhead added by performing the extra checks:
>>>
>>> efficiency = avg throughput / avg cpu utilization
>>>
>>> Once CPU consumption reaches 100% the workload is CPU-bound and we have
>>> a bottleneck.
>>>
>>> Hopefully the efficiency doesn't change noticeably either, then we know
>>> there is no big impact from the extra checks.
>>>
>>> Stefan
>> Okay, after fixing the VM state in qcow2, I was now finally able to
>> actually perform the CPU benchmark. On second thought, it wasn't really
>> necessary, since I performed most of the tests in RAM anyway, so the
>> CPU was already the bottleneck for these tests.
>>
>> I ran bonnie++ (bonnie++ -s 4g -n 0 -x 16) from an arch live CD ISO on a
>> 5 GB qcow2 image formatted as ext4, both residing in /tmp; I prepared
>> the VM state to the point where I just had to press Enter to perform the
>> test and shut down the VM. I then performed a snapshot and used this
>> image as the basis for two tests, one with no overlap checks enabled and
>> one with all of them enabled.
>>
>> The time output for both qemu instances was respectively:
>>
>> echo 'sendkey ret' | time $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64
>> -cdrom arch.iso -drive file=base.qcow2,overlap-check=none -enable-kvm
>> -vga std -m 512 -loadvm 0 -monitor stdio
>> d 294.42s user 117.72s system 98% cpu 6:58.00 total
>>
>> echo 'sendkey ret' | time $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64
>> -cdrom arch.iso -drive file=base.qcow2,overlap-check=all -enable-kvm
>> -vga std -m 512 -loadvm 0 -monitor stdio
>> d 298.87s user 119.55s system 100% cpu 6:56.37 total
>>
>> So, as you can see, the CPU time differs only marginally (using all
>> overlap checks instead of none took 1.52 % more CPU time).
> Good, looks like the impact isn't very noticeable.
>
> I wonder if that 1.52% is reproducible or just noise, did you run the
> benchmark multiple times?
>
> Stefan
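
For reference, the 1.52 % quoted above can be reproduced from the two
`time` outputs if it is taken as the relative increase in total CPU time
(user + system). That reading is an assumption on my part, but it matches
the quoted figure. A minimal sketch; the variable and function names are
purely illustrative:

```python
# CPU times (user, system) in seconds, copied from the quoted `time` output.
cpu_none = (294.42, 117.72)  # overlap-check=none
cpu_all = (298.87, 119.55)   # overlap-check=all


def relative_overhead_percent(baseline, candidate):
    """Relative increase of total CPU time (user + system), in percent."""
    base_total = sum(baseline)
    cand_total = sum(candidate)
    return (cand_total - base_total) / base_total * 100.0


overhead = relative_overhead_percent(cpu_none, cpu_all)
print(f"CPU time overhead: {overhead:.2f} %")
```

Note that this compares CPU time only; the wall-clock totals of that
single run (6:58.00 vs. 6:56.37) point in the opposite direction.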
I just ran three tests for each (alternating between the modes); the
results are as follows (comparing the total time):

overlap-check=none: 421.29 s, 412.37 s, 414.60 s
Average: 416.09 s
Standard deviation: 3.79 s

overlap-check=all: 420.02 s, 415.11 s, 423.37 s
Average: 419.50 s
Standard deviation: 3.39 s

So, using all overlap checks is slower on average (by 3.41 s, or about
0.8 %); however, that difference is about one standard deviation.

There is a difference, but first of all, it is pretty much unremarkable,
and second, remember that all tests were run in tmpfs, where no real disk
I/O hides the CPU cost of the checks. This should therefore be about the
maximum slowdown we'll ever experience.

Max
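
For anyone who wants to recheck the numbers, the averages and standard
deviations can be recomputed from the six run times with a short script.
This is only a sketch: the dictionary layout and names are illustrative,
and it assumes the population standard deviation (`statistics.pstdev`),
which is the variant that matches the values given above.

```python
import statistics

# Total run times in seconds, three runs per mode, as listed above.
runs = {
    "overlap-check=none": [421.29, 412.37, 414.60],
    "overlap-check=all": [420.02, 415.11, 423.37],
}

summary = {}
for mode, times in runs.items():
    mean = statistics.mean(times)
    # Population standard deviation (divides by n, not n - 1).
    stdev = statistics.pstdev(times)
    summary[mode] = (mean, stdev)
    print(f"{mode}: average {mean:.2f} s, standard deviation {stdev:.2f} s")

# Absolute and relative slowdown of "all" compared to "none".
diff = summary["overlap-check=all"][0] - summary["overlap-check=none"][0]
relative = diff / summary["overlap-check=none"][0] * 100.0
print(f"difference: {diff:.2f} s ({relative:.2f} %)")
```

With only three runs per mode, the sample standard deviation
(`statistics.stdev`) would come out larger, which makes the difference
between the modes look even less significant.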