On 12/04/2017 18:19, Dustin Wenz wrote:
> I'm starting a new thread based on the previous discussion in "bhyve uses all 
> available memory during IO-intensive operations" relating to size inflation 
> of bhyve data stored on zvols. I've done some experimenting with this, and I 
> think it will be useful for others.
> The zvols listed here were created with this command:
>       zfs create -o volmode=dev -o volblocksize=Xk -V 30g 
> vm00/chyves/guests/myguest/diskY
> The zvols were created on a raidz1 pool of four disks. For each zvol, I 
> created a basic zfs filesystem in the guest using all default tuning (128k 
> recordsize, etc). I then copied the same 8.2GB dataset to each filesystem.
>       volblocksize    size amplification
>       512B            11.7x
>       4k              1.45x
>       8k              1.45x
>       16k             1.5x
>       32k             1.65x
>       64k             1x
>       128k            1x
> The worst case is with a 512B volblocksize, where the space used is more than 
> 11 times the size of the data stored within the guest. The size efficiency 
> gains are non-linear as I continue from 4k and double the block sizes; 32k 
> blocks being the second-worst. The amount of wasted space was minimized by 
> using 64k and 128k blocks.
> It would appear that 64k is a good choice for volblocksize if you are using a 
> zvol to back your VM, and the VM is using the virtual device for a zpool. 
> Incidentally, I believe this is the default when creating VMs in FreeNAS.
>       - .Dustin

As I explained a bit in the other thread, this depends a lot on your
VDEV configuration.

Allocations on RAID-Z* must be padded out to a multiple of 1+p, where p
is the parity level. So on RAID-Z1, every allocation must occupy a
number of sectors divisible by 2.

Of course, any record smaller than 4k on drives with 4k sectors is
rounded up to a full sector as well.

So, with volblocksize=512, you end up using one 4k sector for data and
one 4k sector for parity: 8k to store 512 bytes, a waste factor of
almost 16x.

4k is a bit better:
Z1: 1 data + 1 parity + 0 padding = 2x
Z2: 1 data + 2 parity + 0 padding = 3x
Z3: 1 data + 3 parity + 0 padding = 4x

8k can be worse, since the RAID-Z padding comes into play:
Z1: 2 data + 1 parity + 1 padding = 2x (would be 1.5x without padding)
Z2: 2 data + 2 parity + 2 padding = 3x (would be 2x)
Z3: 2 data + 3 parity + 3 padding = 4x (would be 2.5x)
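The arithmetic above can be sketched in Python. This is a simplified
model only (it ignores compression, gang blocks, and blocks spanning
multiple rows of the vdev); the function name and the 6-wide vdev used
for the Z2/Z3 cases are illustrative assumptions, not from the original
post:

```python
import math

def raidz_alloc_sectors(block_bytes, sector_bytes, width, parity):
    """Sectors consumed storing one logical block on a RAID-Z vdev.

    Simplified model:
      - data rounds up to whole sectors,
      - `parity` parity sectors per stripe of (width - parity) data sectors,
      - the total is padded up to a multiple of (parity + 1).
    """
    data = max(1, math.ceil(block_bytes / sector_bytes))
    par = parity * math.ceil(data / (width - parity))
    total = data + par
    pad_to = parity + 1
    return math.ceil(total / pad_to) * pad_to

# 512B block on 4k sectors, 4-disk RAID-Z1 (the pool from the test above):
sectors = raidz_alloc_sectors(512, 4096, 4, 1)
print(f"512B block: {sectors} x 4k sectors = {sectors * 4096 / 512:.0f}x")  # 16x

# 8k blocks on an assumed 6-wide vdev at each parity level:
for p in (1, 2, 3):
    sectors = raidz_alloc_sectors(8192, 4096, 6, p)
    print(f"Z{p}: {sectors} sectors -> {sectors / 2:.0f}x")  # 2x / 3x / 4x
```

Running it reproduces the 2x / 3x / 4x factors worked out above.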

Finally, all of these nice even numbers can be thrown out once you
enable compression, since some blocks will compress better than others:
an 8k record may fit into a single 4k sector, and so on.

Also consider that 'zfs' commands report sizes after accounting for the
expected RAID-Z parity space consumption, but do not account for losses
to padding, whereas the numbers reported by the 'zpool' command reflect
raw actual storage.

Allan Jude
freebsd-virtualization@freebsd.org mailing list