I just realized that I forgot to send this message to zfs-discuss back
in May when I fixed this bug.  Sorry for the delay.

The putback of the following bug fix to Solaris Nevada build 42 and
Solaris 10 update 3 build 3 (and coinciding with the change to ZFS
on-disk version 3) changes the behavior of space accounting when using
pools with raid-z:

        6288488 du reports misleading size on RAID-Z

The old behavior is that on raidz vdevs, the space used and available
includes the space used to store the data redundantly (ie. the parity
blocks).  On mirror vdevs, and in all other products' RAID-4/5
implementations, it does not, leading to confusion.  Customers are
accustomed to the redundant space not being reported, so this change
makes zfs do the same for raid-z vdevs as well.
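
As an illustration (a hedged sketch; the pool name "tank" and the disk
names are hypothetical), the difference shows up in what the
filesystems report:

        # create a raid-z pool (here 4 data disks + 1 parity)
        zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0

        # old behavior: USED/AVAIL (and 'du') counted the parity blocks;
        # new behavior: only the parity-free space is reported, as on mirrors
        zfs list -o name,used,available tank
        du -sh /tank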

The new behavior applies to:
(a) newly created pools (with version 3 or later)
(b) old (version 1 or 2) pools that had no raid-z vdevs at the time they
were 'zpool upgrade'-ed, but have since had a raid-z vdev 'zpool add'-ed

Note that the space accounting behavior will never change on old raid-z
pools.  If the new behavior is desired, these pools must be backed up,
destroyed, and re-'zpool create'-ed.
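
For example (a hedged sketch; pool and device names are hypothetical),
you can check the pool version and, if you want the new accounting on
an old raid-z pool, recreate it:

        # show the on-disk version of each pool
        zpool upgrade

        # after backing up the data (eg. with 'zfs send' or tar):
        zpool destroy tank
        zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0
        # ...then restore the data into the new pool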

The 'zpool list' output is unchanged (ie. it still includes the space
used for parity information).  This is bug 6308817 "discrepancy between
zfs and zpool space accounting".
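
In other words (a rough sketch; the pool name is hypothetical), the two
commands will keep disagreeing on a raid-z pool:

        # includes the space used for parity (bug 6308817)
        zpool list tank

        # excludes parity on pools with the new behavior
        zfs list -o name,used,available tank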

The reported space used may be slightly larger than the parity-free size
because the amount of space used to store parity with RAID-Z varies
somewhat with blocksize (eg. even small blocks need at least 1 sector of
parity).  On most workloads[*], the overwhelming majority of space is
stored in 128k blocks, so this effect is typically not very pronounced.
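
To make the rounding concrete, here is a sketch assuming 512-byte
sectors and the usual raid-z layout of one parity sector per stripe
row; compare a full 128k block with a tiny 512-byte block on a 9-disk
raidz1 group (8 data + 1 parity):

        # parity sectors per block = roundup(data sectors / data disks)
        echo $(( (256 + 7) / 8 ))   # 128k block: 256 data sectors -> 32 parity (1/8 overhead)
        echo $(( (  1 + 7) / 8 ))   # 512b block:   1 data sector  ->  1 parity (100% overhead)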

--matt

[*] One workload where this effect can be noticeable is when the
'recordsize' property has been decreased, eg. for a database or zvol.
However, in this situation the "rounding error space" can be completely
eliminated by using an appropriate number of disks in the raid-z group,
according to the following table:

        exact               optimal num. disks
      recordsize          raidz1          raidz2
          8k+           3, 5 or 9       6, 10 or 18
          4k            3 or 5          6 or 10
          2k            3               6
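
For instance (a hypothetical sketch; pool, filesystem, and device names
are made up), a database filesystem with an 8k recordsize could sit on
a 5-disk raidz1 group, where each 8k record divides evenly across the
4 data disks:

        zpool create tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0
        zfs create tank/db
        zfs set recordsize=8k tank/db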
