On 2014-03-22 21:32, Ask Bjørn Hansen wrote:

On Mar 22, 2014, at 10:43 AM, Gabriele Bulfon <[email protected]> wrote:

NAME    SIZE  ALLOC   FREE  EXPANDSZ  CAP  DEDUP  HEALTH  ALTROOT
data2  1.62T   998G   658G         -  60%  1.00x  ONLINE  -

zfs list shows

NAME             USED  AVAIL  REFER  MOUNTPOINT
data2            793G   293G  40.0K  /data2
data2/myvolume   792G   421G   664G  -

664G (or 792G) is more than 658G. According to Jim’s explanation ZFS wants to 
make sure you can replace every byte on the zvol and you don’t have room for 
that while keeping the snapshot.

Note that the 658G is the zpool free space, which counts raw blocks
not yet allocated for data or parity (in the raidz case; mirrors are
accounted in a more intuitive way). So, for Gabriele's raidz of 3
disks, the 421G * 1.5 = 631.5G of unallocated space roughly matches
the free pool space (with some overheads, the 1/64 reservation, etc.
in play). Likewise, the 664G of user data written to the pool (in its
one dataset) translates to 996G with parity (x1.5), which also matches
the zpool numbers adequately.
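
To double-check that arithmetic (plain math, not a ZFS command; the
x1.5 factor assumes a 3-disk raidz1):

# echo "421 * 1.5" | bc    # unallocated user space -> raw blocks
631.5
# echo "664 * 1.5" | bc    # written user data -> raw blocks incl. parity
996.0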

Another thing to note is that the pool's available space is 293G,
while the zvol's available space is 421G. This 128G difference is the
space reserved to guarantee that the dataset can be filled up to its
792G size (792 - 664 = 128). These 128G are not yet allocated (per the
"zfs refer" or "zpool alloc" columns), but they are already counted as
"used" and are unavailable to any dataset other than this volume.

I think these 128G might be available for snapshots of this volume,
since the space is unused at the time of snapshot, and the reservation
for a complete rewrite of the data would be around 664G rather than
the full 792G.
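
Gabriele can check this accounting directly on his box; the dataset
name below is taken from his listing, and the exact figures his system
reports may of course differ somewhat from my estimates:

# zfs get used,available,referenced,volsize,refreservation,usedbyrefreservation data2/myvolume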

> very strange that snapshotting zvol would require usage space:
> isn't one of the best zfs options zero usage space snapshot at
> creation time?

It is not "usage": unlike some other systems, snapshots and clones do
not allocate a separate copy of the data. What you see here is a
matter of reservations made by default to guarantee that you can
completely (re)write the live branch of the dataset, a logical setting
that can easily be changed (at a risk).

You can tweak the (ref)reservation and/or some other zvol dataset
attributes to disable these guarantees, e.g. if you only need the
snapshot in order to send out a zfs-send stream. Of course, then you
would no longer be guaranteed the ability to completely rewrite the
dataset contents, so an end-user might be surprised by an out-of-space
or some other I/O error while writing to a VMDK container with
seemingly lots of free space (according to its format headers), until
you destroy the snapshot and release the blocks referenced only by it
(those overwritten in the newer branches) back into the available space.
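
For instance, a possible sequence for a one-off zfs-send from such a
zvol could look like the sketch below (the destination host and
dataset names are just placeholders); it saves the old reservation so
it can be put back afterwards:

# OLDRESV=$(zfs get -H -o value refreservation data2/myvolume)
# zfs set refreservation=none data2/myvolume
# zfs snapshot data2/myvolume@tosend
# zfs send data2/myvolume@tosend | ssh backuphost zfs receive -F backup/myvolume
# zfs destroy data2/myvolume@tosend
# zfs set refreservation="$OLDRESV" data2/myvolume

Note that the last step may itself fail with an out-of-space error if
the pool can no longer guarantee a full rewrite at that moment.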

See also the "zfs" man page, e.g.:

     volsize=size

         For volumes, specifies the logical size of  the  volume.
         By  default, creating a volume establishes a reservation
         of equal size. For storage pools with a  version  number
         of  9  or  higher,  a refreservation is set instead...

         The reservation is kept equal to  the  volume's  logical
         size  to  prevent  unexpected  behavior  for  consumers.
         Without the reservation, the volume  could  run  out  of
         space,  resulting  in undefined behavior or data corrup-
         tion, depending on how the volume is used....

         Though not recommended, a "sparse volume" (also known as
         "thin provisioning") can be created by specifying the -s
         option to the zfs create -V command, or by changing  the
         reservation after the volume has been created. A "sparse
         volume" is a volume where the reservation is  less  then
         the  volume  size.   Consequently,  writes  to  a sparse
         volume can fail with ENOSPC when  the  pool  is  low  on
         space.  For  a sparse volume, changes to volsize are not
         reflected in the reservation.
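
That is, a thin-provisioned zvol could be created from the start with
something along these lines (pool and volume names here are made up
for illustration):

# zfs create -s -V 10g data2/sparsevol
# zfs get volsize,refreservation data2/sparsevol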


And now a simple example, though a bit long. A picture is worth more
than words, though a few words of commentary are offered as well :)

# zfs list rpool
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  44.1G  14.5G    50K  /rpool

# zfs create -V5g rpool/testvol
# zfs list rpool rpool/testvol
NAME            USED  AVAIL  REFER  MOUNTPOINT
rpool          49.3G  9.31G    50K  /rpool
rpool/testvol  5.16G  14.5G    16K  -

# zfs snapshot rpool/testvol@0
# zfs list rpool ; zfs list -tall -r rpool/testvol
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  49.3G  9.31G    50K  /rpool
NAME              USED  AVAIL  REFER  MOUNTPOINT
rpool/testvol    5.16G  14.5G    16K  -
rpool/testvol@0      0      -    16K  -

So (on an oi_151a8 at least) the zvol snapshot does not use space
initially.
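
If in doubt, this can also be confirmed via the accounting properties
(output not shown; at this point it should report essentially zero):

# zfs get usedbysnapshots rpool/testvol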

Now I fill up some of the volume:

# dd if=/dev/zero bs=65536 count=8192 of=/dev/zvol/rdsk/rpool/testvol
8192+0 records in
8192+0 records out
# zfs list rpool ; zfs list -tall -r rpool/testvol
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  49.3G  9.31G    50K  /rpool
NAME              USED  AVAIL  REFER  MOUNTPOINT
rpool/testvol    5.16G  14.0G   513M  -
rpool/testvol@0    15K      -    16K  -

# zfs snapshot rpool/testvol@1
# zfs list rpool ; zfs list -tall -r rpool/testvol
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  49.8G  8.81G    50K  /rpool
NAME              USED  AVAIL  REFER  MOUNTPOINT
rpool/testvol    5.66G  14.0G   513M  -
rpool/testvol@0    15K      -    16K  -
rpool/testvol@1      0      -   513M  -

Both the pool's available space and the space available to the live
branch of the volume dataset have decreased.

Now I add data to some other range:

# dd if=/dev/zero bs=65536 count=8192 seek=10240 of=/dev/zvol/rdsk/rpool/testvol
8192+0 records in
8192+0 records out

# zfs list rpool ; zfs list -tall -r rpool/testvol
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  49.8G  8.81G    50K  /rpool
NAME              USED  AVAIL  REFER  MOUNTPOINT
rpool/testvol    5.66G  13.5G  1.00G  -
rpool/testvol@0    15K      -    16K  -
rpool/testvol@1    17K      -   513M  -
# zfs snapshot rpool/testvol@2

# zfs list rpool ; zfs list -tall -r rpool/testvol
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  50.3G  8.31G    50K  /rpool
NAME              USED  AVAIL  REFER  MOUNTPOINT
rpool/testvol    6.16G  13.5G  1.00G  -
rpool/testvol@0    15K      -    16K  -
rpool/testvol@1    17K      -   513M  -
rpool/testvol@2      0      -  1.00G  -

Another half a gig has been chomped off the available space of both
the pool and the zvol, and added to the "refer" of the volume and,
later, of its snapshot.

Now I overwrite the same locations:

# dd if=/dev/zero bs=65536 count=4096 of=/dev/zvol/rdsk/rpool/testvol
4096+0 records in
4096+0 records out

# zfs list rpool ; zfs list -tall -r rpool/testvol
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  50.3G  8.31G    50K  /rpool
NAME              USED  AVAIL  REFER  MOUNTPOINT
rpool/testvol    6.16G  13.2G  1.00G  -
rpool/testvol@0    15K      -    16K  -
rpool/testvol@1    17K      -   513M  -
rpool/testvol@2    17K      -  1.00G  -

Available space for the zvol has gone down (though this is within the
reservation and no longer affects the pool's available space), while
the used space (including reservations and snapshots) stays in place...

# zfs snapshot rpool/testvol@3
# zfs list rpool ; zfs list -tall -r rpool/testvol
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  50.5G  8.06G    50K  /rpool
NAME              USED  AVAIL  REFER  MOUNTPOINT
rpool/testvol    6.41G  13.2G  1.00G  -
rpool/testvol@0    15K      -    16K  -
rpool/testvol@1    17K      -   513M  -
rpool/testvol@2    17K      -  1.00G  -
rpool/testvol@3      0      -  1.00G  -

Seemingly, all is the same: the live dataset and each snapshot
reference 1G. However, the "used" space has gone up, and the available
space for the pool has gone down, being reserved so that the blocks
now held by the snapshot can still be rewritten later on.

Here are some size-stats:

# zfs get all rpool/testvol | grep G
rpool/testvol  used                            6.41G          -
rpool/testvol  available                       13.2G          -
rpool/testvol  referenced                      1.00G          -
rpool/testvol  volsize                         5G             local
rpool/testvol  refreservation                  5.16G          local
rpool/testvol  usedbydataset                   1.00G          -
rpool/testvol  usedbyrefreservation            5.16G          -
rpool/testvol  logicalused                     1.25G          -
rpool/testvol  logicalreferenced               1.00G          -
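
A more compact view of the same breakdown is available with the
predefined "space" column set, which prints the AVAIL, USED, USEDSNAP,
USEDDS, USEDREFRESERV and USEDCHILD columns (output not shown):

# zfs list -o space rpool/testvol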

Here is an example of overriding this reservation safety net (you can
of course use some value other than zero; e.g. if you expect to write
no more than a certain amount of data while your zfs-send is in
progress, you can cater for that with a limited reservation, as
sketched after the listings below):

# zfs set refreservation=0g rpool/testvol

# zfs list rpool ; zfs list -tall -r rpool/testvol
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  45.4G  13.2G    50K  /rpool
NAME              USED  AVAIL  REFER  MOUNTPOINT
rpool/testvol    1.25G  13.2G  1.00G  -
rpool/testvol@0    15K      -    16K  -
rpool/testvol@1    17K      -   513M  -
rpool/testvol@2    17K      -  1.00G  -
rpool/testvol@3      0      -  1.00G  -

So, the pool's "used" space has decreased and the space available to
other datasets has increased. The space available in the zvol remains
the same, while its "used" value now matches the sum of the actual
allocations in the snapshots and the live branch (which has had
nothing new written since the last snapshot).

# zfs get all rpool/testvol | grep G
rpool/testvol  used                            1.25G          -
rpool/testvol  available                       13.2G          -
rpool/testvol  referenced                      1.00G          -
rpool/testvol  volsize                         5G             local
rpool/testvol  usedbydataset                   1.00G          -
rpool/testvol  logicalused                     1.25G          -
rpool/testvol  logicalreferenced               1.00G          -
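
(And as mentioned above, instead of dropping the reservation entirely,
a limited one could be kept; for example, to still guarantee room for
roughly a gigabyte of rewrites while the snapshots pin the current
blocks, one might set something like the line below. The listings that
follow continue with the reservation left at zero.)

# zfs set refreservation=1g rpool/testvol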

A new snapshot does not override your explicit choice to go without
the reservation:

# zfs snapshot rpool/testvol@4
# zfs list rpool ; zfs list -tall -r rpool/testvol
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  45.4G  13.2G    50K  /rpool
NAME              USED  AVAIL  REFER  MOUNTPOINT
rpool/testvol    1.25G  13.2G  1.00G  -
rpool/testvol@0    15K      -    16K  -
rpool/testvol@1    17K      -   513M  -
rpool/testvol@2    17K      -  1.00G  -
rpool/testvol@3      0      -  1.00G  -
rpool/testvol@4      0      -  1.00G  -

This stays valid if you add data and snapshots as well:

# dd if=/dev/zero bs=65536 count=4096 of=/dev/zvol/rdsk/rpool/testvol
4096+0 records in
4096+0 records out

# zfs list rpool ; zfs list -tall -r rpool/testvol
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  45.6G  13.0G    50K  /rpool
NAME              USED  AVAIL  REFER  MOUNTPOINT
rpool/testvol    1.50G  13.0G  1.00G  -
rpool/testvol@0    15K      -    16K  -
rpool/testvol@1    17K      -   513M  -
rpool/testvol@2    17K      -  1.00G  -
rpool/testvol@3      0      -  1.00G  -
rpool/testvol@4      0      -  1.00G  -

# zfs snapshot rpool/testvol@5
# zfs list rpool ; zfs list -tall -r rpool/testvol
NAME    USED  AVAIL  REFER  MOUNTPOINT
rpool  45.6G  13.0G    50K  /rpool
NAME              USED  AVAIL  REFER  MOUNTPOINT
rpool/testvol    1.50G  13.0G  1.00G  -
rpool/testvol@0    15K      -    16K  -
rpool/testvol@1    17K      -   513M  -
rpool/testvol@2    17K      -  1.00G  -
rpool/testvol@3      0      -  1.00G  -
rpool/testvol@4      0      -  1.00G  -
rpool/testvol@5      0      -  1.00G  -




Hope this explains things,
//Jim Klimov

