I attached an updated case with the following changes:

Dave Pacheco wrote:
> In section A, paragraph 4: I still find the terminology of saying that
> "zfs send -b" is used to "restore properties" from the original
> dataset confusing, since it's really just generating a stream that
> will later be restored. For customers who understand that a NDMP
> backup = "zfs send" and NDMP restore = "zfs restore" (which it is),
> this is an important distinction. I would just change "restore" to
> "backup" in that sentence.
>

I changed that to

  "This case also adds a -b (restore from backup) option to the 'zfs send'
  command to generate a stream that, when received, restores properties
  from the original dataset ... "

> The header of section C.1.1 is similarly confusing.
>

I changed that from
    C.1.1. Backup with 'receive -o', restore with 'send -b'
to
    C.1.1. 'zfs receive -o' and 'zfs send -b'


> Section B.1: For clarification in the first sentence, you might say
> "_subsequent_ incremental 'zfs send' streams".
>

Done.

> You mention the "-t" aliases in sections A, B, and the overview in C,
> but not in the detailed solutions part of C or the manpage diffs.
>

Nice catch. I meant to include that in the manpage diffs, but I had
nothing further to say about them in the detailed solutions part of C.

C.3.5. zfs list -t

          -t type

              A comma-separated list of types  to  display,  where
              type  is  one  of  filesystem, snapshot , volume, or
              all. For example, specifying  -t  snapshot  displays
-             only snapshots.
+             only snapshots. The following aliases can be used in
+             place of the type specifiers: fs (filesystem), snap
+             (snapshot), and vol (volume).

Thanks,
Tom
Template Version: @(#)sac_nextcase 1.70 03/30/10 SMI
This information is Copyright (c) 2010, Oracle and/or its affiliates.
All rights reserved.
1. Introduction
    1.1. Project/Component Working Name:
     ZFS backup options
    1.2. Name of Document Author/Supplier:
     Author:  Thomas Erickson
    1.3. Date of This Document:
     26 May, 2010
4. Technical Description

ZFS backup options 

A. SUMMARY

Dataset properties can interfere with the use of 'zfs send' and 'zfs
receive' as a backup solution because the properties of the original
dataset may not work in the context of the backup dataset, or they may
not be the right settings in terms of how you want to back up your data.
Local properties on the backup dataset also interfere with restores,
since there is no way to tell 'zfs send' to send the properties received
from the original rather than local settings that may apply only to the
backup.

This case adds a -o option to the 'zfs receive' command that works like
the existing 'zfs create -o property=value' to specify the initial
properties of a newly created dataset, making it possible to override
properties received from a non-incremental 'zfs send' stream.

Similar to -o, a -x option for 'zfs receive' is added to override
received properties without specifying a value, simply to ensure that
they have no effect on the behavior of the received dataset (as if they
had been excluded from the send stream), even though the received values
can still be restored.

This case also adds a -b (restore from backup) option to the 'zfs send'
command to generate a stream that, when received, restores properties
from the original dataset rather than local settings specified on the
backup dataset with 'zfs set' or 'zfs receive -o'.

For the convenience of related administrator actions, this case adds a
-r option to 'zfs set' similar to 'zfs inherit -r' and new aliases for
use with 'zfs list -t'.

B. PROBLEM

B.1. Backup

Today we can set properties locally with 'zfs set' to override
properties received from subsequent incremental 'zfs send' streams, but
this is no help with non-incremental streams because there is no
opportunity to run 'zfs set' on a dataset before it is newly created by
'zfs receive'. This is particularly awkward when a sent dataset has a
non-default mountpoint, since the receive has no place to replicate the
sent data when it fails to mount the received dataset on a directory
that may not exist or that is already in use by the sent dataset. For
example:

% zfs create -o mountpoint=/tank/mnt tank/fs/mnt
% zfs snapshot  tank/fs/m...@1
% zfs create  tank/foo
% zfs send -R  tank/fs/m...@1 | zfs recv -d tank/foo
cannot mount 'tank/foo/fs/mnt': mountpoint or dataset is busy
%

Similar problems can interfere with the use of send/receive as a backup
solution when there is a property in the source dataset whose validity
on the target depends on inherited settings different from those of the
source. For example, the following interference from the source quota
prevents a successful receive when the target inherits copies
differently:

% zfs create tank/a
% zfs create tank/z
% zfs set copies=3 tank/z
% zfs create tank/a/b
% zfs set quota=2m tank/a/b
% mkfile 1m /tank/a/b/file
% zfs snapshot tank/a/b...@1
% zfs send -R tank/a/b...@1 | zfs recv -e tank/z
cannot receive new filesystem stream: destination tank/z space quota exceeded
%

The lack of a receive option to set initial property values also hinders
administrators from applying settings like compression and deduplication
to backups, since local settings that differ from those in the send
stream can only be applied after the fact and therefore only apply to
incremental updates.

Sometimes the administrator has a value in mind to override the received
property, but other times the administrator may simply want the default.
In that case we don't want to task the administrator with finding out
the default in order to set it explicitly, so an option to override a
received property without providing an explicit value is more helpful
(and preserves inheritance).

B.2. Restore

Restoring properties from a backup is also troublesome because 'zfs
send' assumes that the effective property values are the ones to include
in the send stream (if specified on the sent dataset and not inherited).
Since local settings override received properties, the administrator
will not get back the original settings received on the backup dataset
because 'zfs send' will choose the effective local settings instead.
Local settings on the backup likely specify how the administrator wants
to store backups and have nothing to do with the restored dataset being
like the original.

To clarify the problem of restoring properties from a backup, it helps
to think of the effective value of a property as the topmost value in
the following stack:

    local
    received
    ---------
    inherited
    default

Each level of the stack overrides the one below it. If a property does
not have a specified value at the topmost level, ZFS resorts to the
value at the next lower level, if it exists, and so on. (The default
value always exists.) The first value it finds determines the effective
value of the property. This is how ZFS already works and this case does
nothing to change that.
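
The lookup described above can be sketched in a few lines of Python.
This is only an illustrative model, not ZFS code: the function name
effective_value and the use of None for "no value at this level" are
assumptions made for the example.

```python
# Illustrative model of the property stack; not actual ZFS code.
# None means "no value specified at this level" (an assumption of this sketch).
STACK_ORDER = ("local", "received", "inherited", "default")


def effective_value(levels):
    """Return (value, source) for the topmost level that has a value."""
    for source in STACK_ORDER:
        value = levels.get(source)
        if value is not None:
            return value, source
    raise ValueError("the default value must always exist")


# A local setting overrides the received value.
print(effective_value({"local": "on", "received": "off", "default": "off"}))
# With no local or received value, the inherited value applies.
print(effective_value({"inherited": "gzip", "default": "off"}))
```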

Values above the dashed line (local and received) are explicit and
included in a 'zfs send' stream unless overridden. Values below the
dashed line (inherited and default) are implicit and never included in a
'zfs send' stream. If you say nothing about a dataset property, neither
by setting it nor by receiving it, your dataset inherits the property
from its parent. If the parent does not have a set or received value and
neither do any of its ancestors, then your dataset inherits the default.
Inheriting properties is the way to get the behavior you want without
including those settings in a 'zfs send' stream. Again, this case does
not change that.

The reason that sending the effective value is a problem for backups is
best understood by considering the outcome of a replication chain.
Suppose that we send from dataset A and receive to dataset B, then send
from dataset B and receive to dataset C. Clearly we want A->B to
replicate the effective properties of A, but what about B->C? If B
overrides a property locally, should C get the original value from A or
the local value on B? Currently 'zfs send' can only replicate the
effective properties of B. It cannot pass along the original properties
from A without interference from local settings on B, thereby finishing
the relay of A->B->C with the same properties it had at the start. The
semantics of restoring from a backup are not possible unless you can get
back the properties you started with.

B.3. Set recursively

In the case of a recursive 'zfs send' stream, the option to set the
initial values of specific properties needs to apply those properties
recursively. For consistency and convenience, 'zfs set' should also
support this ability to apply a local setting recursively so it can
override a property received in a recursive incremental update.

B.4. Query

It's inconvenient to spell out the full names of the dataset types
"filesystem", "volume", and "snapshot" when querying received datasets
with 'zfs list -t'.

C. PROPOSED SOLUTION

C.1. Overview

The proposed solution adds a -o option to 'zfs receive' to specify the
initial values of any number of properties as '-o property=value', just
like we do with 'zfs create'. To avoid the unwanted effects of a
property in a 'zfs send' stream without specifying a value for that
property, a similar '-x property' option overrides the named property in
a way that preserves the received value for a possible future restore
(or inherit -S) but has no effect on the effective value of the property
on the received dataset, as if the named property had been excluded from
the send stream.

Both of these receive options, -o and -x, work with recursive send
streams by overriding the received property recursively. For
convenience, they also work with incremental streams, even though an
administrator can use 'zfs set' and/or 'zfs inherit' to override
received properties before receiving an incremental update. For
consistency with 'zfs inherit -r' as well as the -o and -x receive
options, a '-r' option added to 'zfs set' makes it possible to apply a
property setting recursively before receiving a recursive, incremental
update.

A new -b option added to 'zfs send' specifies received rather than
effective property values to be included in the send stream, making it
possible to restore the original, backed-up settings without
interference from local settings.

Finally, the proposed solution adds the type aliases "fs" for
"filesystem", "snap" for "snapshot", and "vol" for "volume" to use with
'zfs list -t' when querying datasets.
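
As a rough illustration of the alias handling, the following Python
sketch expands a comma-separated type list. The function name
expand_types is hypothetical and the code is a model of the described
behavior, not the actual 'zfs list' implementation.

```python
# Illustrative sketch of alias expansion for 'zfs list -t'; not actual ZFS code.
TYPE_ALIASES = {"fs": "filesystem", "snap": "snapshot", "vol": "volume"}
VALID_TYPES = {"filesystem", "snapshot", "volume", "all"}


def expand_types(type_list):
    """Expand a comma-separated -t argument, mapping aliases to full names."""
    expanded = []
    for name in type_list.split(","):
        full = TYPE_ALIASES.get(name, name)
        if full not in VALID_TYPES:
            raise ValueError("invalid type: " + name)
        expanded.append(full)
    return expanded


print(expand_types("fs,snap"))
print(expand_types("volume,snap"))
```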

C.1.1. 'zfs receive -o' and 'zfs send -b'

To support the use of send/receive as a backup solution, ZFS will set a
property's local value to that specified by receive -o, whereas its
received value (overridden by the local value) will be set to the value
obtained from the send stream (if any). For example:

% zfs set compression=off a/fs
% zfs snapshot a/f...@1
% zfs send -R a/f...@1 | zfs recv -o compression=on -d z
% zfs get -o all compression z/fs
NAME  PROPERTY     VALUE     RECEIVED  SOURCE
z/fs  compression  on        off       local
%

This way, the backup dataset created by 'zfs receive' can use whatever
settings the administrator specifies, but the original settings are not
lost and can still be restored from the backup. To do that requires a
new -b option for 'zfs send' that tells ZFS to send received properties
regardless of local settings. This is not the default 'zfs send'
behavior because it might be confusing to send property values that are
not currently effective on the source dataset.

The following table describes what 'zfs send -b' actually sends compared
to 'zfs send' without -b, depending on whether the source property has a
local value, a received value, both, or neither:

local  received | sent with -b       sent without -b
-----  -------- | -----------------  ---------------
yes    no       | nothing            local value
no     yes      | received value     received value
yes    yes      | received value     local value
no     no       | nothing            nothing
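
The table can also be read as a small decision function. The Python
sketch below is only a model of the table (the name sent_value and the
use of None for "nothing" are assumptions of the example), and it covers
a dataset that has already been received.

```python
# Illustrative model of the table above; not actual ZFS code.
# None stands for "no value" on input and "nothing sent" on output.
def sent_value(local, received, use_b):
    """Return the property value placed in the send stream, or None."""
    if use_b:
        # With -b only the received value is sent, whatever the local setting.
        return received
    # Without -b the local value wins, falling back to the received value.
    return local if local is not None else received


for local, received in [("L", None), (None, "R"), ("L", "R"), (None, None)]:
    print(local, received, sent_value(local, received, True),
          sent_value(local, received, False))
```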

An internal "$hasrecvd" property tracks whether a dataset was received
on or after pool version 22 (received properties). If that internal
property is not found, 'zfs send' treats the dataset as the start of a
replication chain by sending its effective property values, as if the
dataset was sent without -b. This behavior is consistent in the sense
that 'send -b' always preserves original properties. In the replication
chain A->B->C, the initial replication A->B is a special case in which
the original properties are the effective and not the received
properties, so the column under "sent without -b" in the above table
applies. The alternative was to send no properties, since there are no
received properties if the dataset has never been received and this is
the initial replication. While that may be the more consistent behavior,
it lacks utility. The chosen behavior potentially frees backup
applications from deciding when to use -b since they can use -b by
default and get the desired behavior in both the A->B and B->C cases. In
the case where a received dataset needs to become the start of a new
replication chain, that can be indicated by a user property.
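
To make the A->B->C behavior concrete, here is a hedged Python sketch of
the chain. The dictionary layout, the has_received flag (standing in for
the internal "$hasrecvd" property), and the function names are all
assumptions of this model rather than ZFS internals.

```python
# Illustrative model of 'send -b' across a replication chain; not ZFS code.
def properties_to_send(dataset, use_b):
    """Return {property: value} as it would appear in the send stream."""
    # -b only takes effect if the dataset has ever been received.
    received_only = use_b and dataset["has_received"]
    stream = {}
    for name, levels in dataset["props"].items():
        local, received = levels.get("local"), levels.get("received")
        if received_only:
            value = received
        else:
            value = local if local is not None else received
        if value is not None:
            stream[name] = value
    return stream


def receive(stream, overrides=None):
    """Model 'zfs receive -o': stream values are received, overrides are local."""
    overrides = overrides or {}
    props = {}
    for name in set(stream) | set(overrides):
        props[name] = {"local": overrides.get(name), "received": stream.get(name)}
    return {"has_received": True, "props": props}


a = {"has_received": False,
     "props": {"compression": {"local": "off", "received": None}}}
b = receive(properties_to_send(a, use_b=True), overrides={"compression": "on"})
c = receive(properties_to_send(b, use_b=True))
print(c["props"]["compression"])
```

In this model a backup application can pass use_b=True for both hops:
the first hop falls back to the effective values of A, and the second
hop skips the local override on B.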

C.1.2. Recursive streams and 'zfs set -r'

If the sent stream is recursive (i.e., 'zfs send -R'), that is, if it
includes one or more levels of descendant datasets, then 'zfs recv -o
property=value' ensures that the effective value of the property is the
same in all received datasets. It does this in the most sensible way it
can, depending on the property. If the property is inheritable, it sets
the named property to the specified value only on the top level dataset
and inherits the named property on all the datasets below that. That
way, all received datasets get the setting from one place. The received
values, however, reflect the settings of the source datasets. For
example:

% zfs create a/b
% zfs create a/b/c
% zfs create a/b/c/d
% zfs set compression=off a/b
% zfs set compression=gzip a/b/c/d
% zfs get compression a/b a/b/c a/b/c/d
NAME     PROPERTY     VALUE     SOURCE
a/b      compression  off       local
a/b/c    compression  off       inherited from a/b
a/b/c/d  compression  gzip      local
% zfs snapshot -r a/b...@1
% zfs send -R a/b...@1 | zfs recv -o compression=on -d z
% zfs get -o all compression z/b z/b/c z/b/c/d
NAME     PROPERTY     VALUE     RECEIVED  SOURCE
z/b      compression  on        off       local
z/b/c    compression  on        -         inherited from z/b
z/b/c/d  compression  on        gzip      inherited from z/b
%

Note that the value "gzip" in the RECEIVED column for z/b/c/d reflects
the local setting of the source dataset, a/b/c/d.

If the property is not inheritable, 'zfs recv -o property=value' instead
sets the property recursively. For example, if we override canmount
instead of compression:

% zfs create a/b
% zfs create a/b/c
% zfs create a/b/c/d
% zfs set canmount=off a/b
% zfs set canmount=noauto a/b/c/d
% zfs get canmount a/b a/b/c a/b/c/d
NAME     PROPERTY  VALUE     SOURCE
a/b      canmount  off       local
a/b/c    canmount  on        default
a/b/c/d  canmount  noauto    local
% zfs snapshot -r a/b...@1
% zfs send -R a/b...@1 | zfs recv -o canmount=on -d z
% zfs get -o all canmount z/b z/b/c z/b/c/d
NAME     PROPERTY  VALUE     RECEIVED  SOURCE
z/b      canmount  on        off       local
z/b/c    canmount  on        -         local
z/b/c/d  canmount  on        noauto    local
%

The new 'zfs set -r' option uses the same algorithm to make the setting
effective throughout the subtree of child datasets. If the property is
inheritable, it sets the property on the top-level dataset only and
inherits the setting the rest of the way down the subtree. In the above
example involving compression, receiving without -o and correcting after
the fact with 'zfs set -r compression=on z/b' would still result in the
same property values:

% zfs send -R a/b...@1 | zfs recv -d z
% zfs set -r compression=on z/b
% zfs get -o all compression z/b z/b/c z/b/c/d
NAME     PROPERTY     VALUE     RECEIVED  SOURCE
z/b      compression  on        off       local
z/b/c    compression  on        -         inherited from z/b
z/b/c/d  compression  on        gzip      inherited from z/b
%

The difference is that in the case of 'zfs recv -o compression=on' the
initial received data is actually compressed, whereas 'set -r' only
applies to future updates. Without -o to specify compression, 'zfs
receive' uses the received property values ("off" and "gzip" in this
case) to determine how the initial receive should or should not compress
data:

% zfs send -R a/b...@1 | zfs recv -d z
% zfs get -o all compression z/b z/b/c z/b/c/d
NAME     PROPERTY     VALUE     RECEIVED  SOURCE
z/b      compression  off       off       received
z/b/c    compression  off       -         inherited from z/b
z/b/c/d  compression  gzip      gzip      received
%

That's how the data is actually received before compression is
overridden locally with 'set -r'.

Either way, the final outcome is exactly the same as if the
administrator had typed the commands

% zfs inherit -r compression z/b
% zfs set compression=on z/b

If the property is not inheritable, as in the example involving canmount
above, then 'zfs set -r' sets the property all the way down the subtree
rooted at the specified dataset.

A subset of non-inheritable properties, like quota and reservation, take
a size value. In that case it does not make sense to set the property
recursively, since the size already applies to the entire subtree, so
'zfs receive -o' would only set the property on the top-level dataset.
For example:

% zfs create a/b
% zfs create a/b/c
% zfs create a/b/c/d
% zfs set quota=1g a/b
% zfs set quota=1m a/b/c/d
% zfs get quota a/b a/b/c a/b/c/d
NAME     PROPERTY  VALUE  SOURCE
a/b      quota     1G     local
a/b/c    quota     none   default
a/b/c/d  quota     1M     local
% zfs snapshot -r a/b...@1
% zfs send -R a/b...@1 | zfs recv -o quota=5g -d z
% zfs get -o all quota z/b z/b/c z/b/c/d
NAME     PROPERTY  VALUE  RECEIVED  SOURCE
z/b      quota     5G     1G        local
z/b/c    quota     none   -         default
z/b/c/d  quota     1M     1M        received
%

However, in the case of overriding received quota with quota=none, it
does make sense to set the property recursively, so the above example
turns out somewhat differently:

% zfs send -R a/b...@1 | zfs recv -o quota=none -d z
% zfs get -o all quota z/b z/b/c z/b/c/d
NAME     PROPERTY  VALUE  RECEIVED  SOURCE
z/b      quota     none   1G        local
z/b/c    quota     none   -         local
z/b/c/d  quota     none   1M        local
%

'zfs set -r' treats non-inheritable size properties like quota in
exactly the same way.

Regardless of the property, 'zfs receive -o' of a recursive stream and
'zfs set -r' both use the same algorithm to do what makes the most sense
depending on the property. The only potential difference is when a
snapshot is missing from the send stream. In that case, 'zfs send'
prints a warning about the missing snapshot and 'zfs receive -o' does
not update the property in the target dataset corresponding to the
missing snapshot. In the normal case, however, 'zfs receive -o' visits
the entire subtree of child datasets and the outcome (as far as
properties are concerned) is the same as that of 'zfs set -r'.
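
The per-property choices made in this section can be summarized with a
short Python sketch. It is a model only: the function name
plan_recursive_set, the action tuples, and the property sets (which list
just the properties named in the text) are assumptions of the example.

```python
# Illustrative model of the recursive algorithm; not actual ZFS code.
# The property sets below list only properties named in the text.
INHERITABLE = {"compression"}
NON_INHERITABLE = {"canmount"}
NON_INHERITABLE_SIZE = {"quota", "reservation"}


def plan_recursive_set(prop, value, top, descendants):
    """Map each dataset to the action taken on it; untouched ones are omitted."""
    if prop in INHERITABLE:
        # Set once at the top and inherit everywhere below.
        plan = {top: ("set", value)}
        plan.update({name: ("inherit",) for name in descendants})
        return plan
    if prop in NON_INHERITABLE_SIZE and value != "none":
        # A size already applies to the whole subtree: top level only.
        return {top: ("set", value)}
    if prop in NON_INHERITABLE or prop in NON_INHERITABLE_SIZE:
        # Not inheritable (or size "none"): set on every dataset.
        return {name: ("set", value) for name in [top, *descendants]}
    raise ValueError("property not covered by this sketch: " + prop)


for prop, value in [("compression", "on"), ("canmount", "on"),
                    ("quota", "5g"), ("quota", "none")]:
    print(prop, value,
          plan_recursive_set(prop, value, "z/b", ["z/b/c", "z/b/c/d"]))
```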

C.1.3. 'zfs receive -x'

The idea of 'receive -x' is to ensure that the received property does
not determine the effective value of the property on the received
dataset; that is, to get the behavior that you would have gotten if the
property had been excluded from the send stream, and to do that
recursively in the case of a recursive stream. How it accomplishes that
depends on the property. If the property is inheritable, 'receive -x'
explicitly inherits the property to override the received value. If the
send stream is recursive, the effect is the same as 'zfs inherit -r'. For
example:

% zfs create a/b
% zfs create a/b/c
% zfs create a/b/c/d
% zfs set compression=off a/b
% zfs set compression=gzip a/b/c/d
% zfs get compression a/b a/b/c a/b/c/d
NAME     PROPERTY     VALUE     SOURCE
a/b      compression  off       local
a/b/c    compression  off       inherited from a/b
a/b/c/d  compression  gzip      local
% zfs snapshot -r a/b...@1
% zfs send -R a/b...@1 | zfs recv -x compression -d z
% zfs get -o all compression z/b z/b/c z/b/c/d
NAME     PROPERTY     VALUE     RECEIVED  SOURCE
z/b      compression  off       off       default
z/b/c    compression  off       -         default
z/b/c/d  compression  off       gzip      default
%

Explicit inheritance causes ZFS to ignore the received value and resort
to the parent if there is no local setting. In this case, since the
parent pool z does not specify a value, all the children inherit the
default.

If the property is not inheritable, 'receive -x' sets the default value
recursively. Since the default canmount value happens to be "on", the
canmount example in C.1.2 above turns out the same when we replace

% zfs send -R a/b...@1 | zfs recv -o canmount=on -d z

with

% zfs send -R a/b...@1 | zfs recv -x canmount -d z

In the case of a non-inheritable size property like quota, the default
value "none" is set recursively (since it makes sense to do so), so
'-x quota' is the same as '-o quota=none' for non-incrementals. In the
case of 'receive -x volsize' where there is no default value, the
command fails with an error message unless the send stream is an
incremental update.

In the case of an incremental update, 'zfs receive -x' does nothing if
the received property is already overridden by explicit inheritance or a
local setting. Checking for an existing setting and updating the
property is atomic.

If the property is not present in the send stream, -x does nothing.
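
For a single dataset in a non-incremental receive, the rules above might
be modeled as follows. This Python sketch is an assumption-laden
illustration: the function name exclude_action and the action tuples are
hypothetical, only properties named in the text are listed, and
incremental updates are deliberately left out.

```python
# Illustrative model of 'receive -x' for a non-incremental stream only;
# not actual ZFS code. Only properties named in the text are listed.
INHERITABLE = {"compression"}
NON_INHERITABLE_DEFAULTS = {"canmount": "on", "quota": "none"}
NO_DEFAULT = {"volsize"}


def exclude_action(prop, in_stream):
    """Return the action 'receive -x prop' takes on one received dataset."""
    if not in_stream:
        return ("nothing",)
    if prop in INHERITABLE:
        return ("inherit",)
    if prop in NO_DEFAULT:
        raise ValueError(prop + " has no default value to fall back on")
    if prop in NON_INHERITABLE_DEFAULTS:
        return ("set", NON_INHERITABLE_DEFAULTS[prop])
    raise KeyError("property not covered by this sketch: " + prop)


print(exclude_action("compression", in_stream=True))
print(exclude_action("quota", in_stream=True))
print(exclude_action("canmount", in_stream=False))
```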

C.1.4. Uneditable, set-once, and special properties

Specifying an uneditable property with 'receive -o' or 'receive -x'
fails the command and prints an error message. Even set-once properties
normally settable by 'zfs create -o' fail with an error message when
specified with 'zfs receive -o' because they are bound to the sent data.
These include

    normalization
    casesensitivity
    utf8only
    volblocksize

A set-once property that is independent of the sent data might be
specifiable with 'receive -o' or 'receive -x'.

The following property is editable, but modifications to the property
only affect subsequent writes, not subsequent receives:

    recordsize

Specifying recordsize with 'receive -o' or 'receive -x' (default is
128K) succeeds without a warning message and has no effect on received
data.
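
The property categories in this section could be checked along the
following lines. This is a sketch, not ZFS code: the function name
check_receive_property and the returned status strings are hypothetical,
and the short set of read-only properties is an example added for
illustration rather than a list taken from this case.

```python
# Illustrative validation sketch; not actual ZFS code.
BOUND_TO_SENT_DATA = {"normalization", "casesensitivity", "utf8only",
                      "volblocksize"}
NO_EFFECT_ON_RECEIVED_DATA = {"recordsize"}
# Examples of read-only properties; this set is an assumption of the sketch
# and is not listed in the text above.
UNEDITABLE_EXAMPLES = {"used", "available", "referenced"}


def check_receive_property(prop):
    """Return a status string, or raise ValueError if the option must fail."""
    if prop in UNEDITABLE_EXAMPLES:
        raise ValueError(prop + " is not editable")
    if prop in BOUND_TO_SENT_DATA:
        raise ValueError(prop + " is bound to the sent data")
    if prop in NO_EFFECT_ON_RECEIVED_DATA:
        return "accepted, no effect on received data"
    return "accepted"


print(check_receive_property("compression"))
print(check_receive_property("recordsize"))
```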

C.2. Version Compatibility

To get the full benefit of the options described in this case, the ZFS
pool needs to be at version 22 (received properties) or later. Before
that, the following options behave differently:

  receive -o  If the specified property is present in the send stream,
              it is replaced by the value specified after the -o option,
              since the sent value cannot be overridden locally.

  receive -x  If the specified property is present in the send stream, 
              it is simply ignored, since it cannot be overridden
              locally.

     send -b  This option is ignored. Received properties cannot be sent
              in favor of local settings because the two cannot be
              distinguished. All datasets appear to have never been
              received and their settings are sent as original
              properties.

Otherwise the options described in this case work the same on all pool
versions.
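
The version dependence of 'receive -o' and 'send -b' can be summarized
in a small Python model. The names used here are hypothetical and the
sketch deliberately ignores 'receive -x', whose outcome also depends on
the property.

```python
# Illustrative model of the pool-version dependence; not actual ZFS code.
RECEIVED_PROPERTIES_VERSION = 22


def receive_with_override(stream_value, override, pool_version):
    """Return (effective value, retained received value) for 'receive -o'."""
    if pool_version >= RECEIVED_PROPERTIES_VERSION:
        # The override is local; the stream value is kept as received.
        return override, stream_value
    # Older pools cannot keep both: the stream value is simply replaced.
    return override, None


def send_b_is_honored(pool_version):
    """'send -b' is ignored on pools without received properties."""
    return pool_version >= RECEIVED_PROPERTIES_VERSION


print(receive_with_override("off", "on", 22))
print(receive_with_override("off", "on", 21))
print(send_b_is_honored(21))
```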

C.3. Manpage diffs

C.3.1. zfs receive

< zfs receive [-vnFu] filesystem|volume|snapshot
> zfs receive [-vnFu] [[-o property=value] | [-x property]] ...
>      filesystem|volume|snapshot

< zfs receive [-vnFu] [-d | -e] filesystem
> zfs receive [-vnFu] [[-o property=value] | [-x property]] ...
>      [-d | -e] filesystem                                       

<     zfs receive [-vnFu] filesystem|volume|snapshot
<     zfs receive [-vnFu] [-d | -e] filesystem
>     zfs receive [-vnFu] [[-o property=value] | [-x property]] ...
>          filesystem|volume|snapshot
>     zfs receive [-vnFu] [[-o property=value] | [-x property]] ...
>          [-d | -e] filesystem
     
...  
         When a snapshot replication package stream that is  gen-
         erated by using the zfs send -R command is received, any
         snapshots that do not exist on the sending location  are
<        destroyed by using the zfs destroy -d command.
>        destroyed by using the zfs destroy -d command. If -o
>        property=value or -x property is specified, it applies to the
>        effective value of the property throughout the entire subtree
>        of replicated datasets. Effective property values may be set or
>        inherited, depending on the property and whether the dataset is
>        the topmost in the replicated subtree. Received properties are
>        retained in spite of being overridden and may be restored with
>        zfs inherit -rS or zfs send -Rb.


...

>         -o property=value
>
>             Sets the specified property as if  the  command  zfs
>             set  property=value is invoked at the same time the
>             received dataset is created from the non-incremental send
>             stream or updated from the incremental send stream. Any
>             editable ZFS  property  can also  be  set  at receive
>             time. Set-once properties bound to the received data, such
>             as normalization and casesensitivity, cannot be set at
>             receive time even when the datasets are newly created by
>             zfs receive. Multiple -o options can be specified. An
>             error results if the same property is specified in
>             multiple -o or -x options.
>
>         -x property
>
>             Ensures that the effective value of the specified property
>             after the receive is unaffected by the value of that
>             property in the send stream (if any), as if the property
>             had been excluded from the send stream. If the specified
>             property is not present in the send stream, this option
>             does nothing. If a received property needs to be
>             overridden, the effective value may be set or inherited,
>             depending on the property. In the case of an incremental
>             update, -x leaves any existing local setting or explicit
>             inheritance unchanged (since the received property is
>             already overridden). All -o restrictions apply equally to
>             -x.

C.3.2. zfs send

< zfs send [-vR] [-[iI] snapshot] snapshot
> zfs send [-vRb] [-[iI] snapshot] snapshot

<     zfs send [-vR] [-[iI] snapshot] snapshot
>     zfs send [-vRb] [-[iI] snapshot] snapshot

...

>         -b
>
>             Sends only received property values whether or not they
>             are overridden by local settings, but only if the dataset
>             has ever been received. Use this option when you want 'zfs
>             receive' to restore received properties backed up on the
>             sent dataset and to avoid sending local settings that may
>             have nothing to do with the source dataset, but only with
>             how the data is backed up.

C.3.3. zfs set

< zfs set property=value filesystem|volume|snapshot ...
> zfs set [-r] property=value filesystem|volume|snapshot ...

<    zfs set property=value filesystem|volume|snapshot ...
>    zfs set [-r] property=value filesystem|volume|snapshot ...

...

>        -r
>
>            Recursively apply the effective value of the setting
>            throughout the subtree of child datasets. The effective
>            value may be set or inherited, depending on the property.

C.3.4. Native Properties: recordsize

         Changing the file system's recordsize affects only files
<        created afterward; existing files are unaffected.
>        created afterward; existing files and received data are
>        unaffected.

C.3.5. zfs list -t

         -t type

             A comma-separated list of types  to  display,  where
             type  is  one  of  filesystem, snapshot , volume, or
             all. For example, specifying  -t  snapshot  displays
<            only snapshots.
>            only snapshots. The following aliases can be used in
>            place of the type specifiers: fs (filesystem), snap
>            (snapshot), and vol (volume).

Stability

This case requests patch/micro release binding.  The new interfaces are
committed. 


6. Resources and Schedule
    6.4. Steering Committee requested information
    6.4.1. Consolidation C-team Name:
        ON
    6.5. ARC review type: FastTrack
    6.6. ARC Exposure: open


_______________________________________________
opensolaris-arc mailing list