That looks fine.  I would suggest adding a comment explaining that for -R
(sd->recursive), the stream needs to have info about all snapshots, so that
the receiver can change their properties to match, and delete the
nonexistent snapshots; but if this is just -p, we only need info about the
snapshots being sent.
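Something like this, as a sketch (wording is just a suggestion, placed above the new !sd->recursive check):

```c
	/*
	 * For a recursive send (-R), the stream must describe all
	 * snapshots of the filesystem, so that the receiver can bring
	 * their properties into sync and destroy snapshots that no
	 * longer exist on the sender.  For a plain -p send, we only
	 * need info about the snapshots that are actually being sent.
	 */
```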

--matt


On Tue, Dec 10, 2013 at 5:14 AM, Andriy Gapon <[email protected]> wrote:

> on 09/12/2013 20:12 Matthew Ahrens said the following:
> > I can see several ways to improve this performance:
> >
> > 1. When receiving, only set properties on snapshots that are not already
> > set (as received properties).
> >
> > 2. Once channel programs
> > <http://www.open-zfs.org/wiki/Projects/ZFS_Channel_Programs> is available,
> > we will be able to set all the props on all the snapshots in one txg.
> >
> > 3. It would probably be fine to change -p to only send properties of the
> > filesystems & snapshots that are being sent.  It was implemented that way
> > primarily for -R, which *does* need to tell the receiving system about all
> > the snapshots that exist on the sending system (& their props).  The
> > implementation of -p kind of came along for the ride.
> >
>
> Thanks!
> #2 would be perfect, of course.
> For now we decided to go with #3, and the following patch seems to work
> for us.
> How does it look to you?
>
> commit 91043718d5865d5d39f772edfbcee657c914963e
> Author: Andriy Gapon <[email protected]>
> Date:   Mon Dec 9 16:03:39 2013 +0200
>
>     zfs send -p: send properties only for snapshots that are actually sent
>
>     ... as opposed to sending properties of all snapshots of the relevant
>     filesystem.  The previous behavior results in properties being set on
>     all snapshots on the receiving side, which is quite slow.
>     More details are here:
>       http://thread.gmane.org/gmane.comp.file-systems.openzfs.devel/346
>
>     Behavior of zfs send -R is not changed.
>
> diff --git a/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c b/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
> index 1fc46c2..8eb4276 100644
> --- a/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
> +++ b/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
> @@ -564,6 +564,8 @@ typedef struct send_data {
>         const char *fromsnap;
>         const char *tosnap;
>         boolean_t recursive;
> +       boolean_t seenfrom;
> +       boolean_t seento;
>
>         /*
>          * The header nvlist is of the following format:
> @@ -598,20 +600,39 @@ send_iterate_snap(zfs_handle_t *zhp, void *arg)
>         uint64_t guid = zhp->zfs_dmustats.dds_guid;
>         char *snapname;
>         nvlist_t *nv;
> +       boolean_t isfromsnap, istosnap;
>
>         snapname = strrchr(zhp->zfs_name, '@')+1;
> +       isfromsnap = (sd->fromsnap != NULL &&
> +           strcmp(sd->fromsnap, snapname) == 0);
> +       istosnap = (sd->tosnap != NULL && (strcmp(sd->tosnap, snapname) == 0));
>
> -       VERIFY(0 == nvlist_add_uint64(sd->parent_snaps, snapname, guid));
>         /*
>          * NB: if there is no fromsnap here (it's a newly created fs in
>          * an incremental replication), we will substitute the tosnap.
>          */
> -       if ((sd->fromsnap && strcmp(snapname, sd->fromsnap) == 0) ||
> -           (sd->parent_fromsnap_guid == 0 && sd->tosnap &&
> -           strcmp(snapname, sd->tosnap) == 0)) {
> +       if (isfromsnap || (sd->parent_fromsnap_guid == 0 && istosnap)) {
>                 sd->parent_fromsnap_guid = guid;
>         }
>
> +       if (!sd->recursive) {
> +               if (!sd->seenfrom && isfromsnap) {
> +                       sd->seenfrom = B_TRUE;
> +                       zfs_close(zhp);
> +                       return (0);
> +               }
> +
> +               if (sd->seento || !sd->seenfrom) {
> +                       zfs_close(zhp);
> +                       return (0);
> +               }
> +
> +               if (istosnap)
> +                       sd->seento = B_TRUE;
> +       }
> +
> +       VERIFY(0 == nvlist_add_uint64(sd->parent_snaps, snapname, guid));
> +
>         VERIFY(0 == nvlist_alloc(&nv, NV_UNIQUE_NAME, 0));
>         send_iterate_prop(zhp, nv);
>         VERIFY(0 == nvlist_add_nvlist(sd->snapprops, snapname, nv));
>
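For what it's worth, the non-recursive filter above can be modeled as a small stand-alone state machine (hypothetical names, not the actual libzfs code; it assumes seenfrom starts out true for a full send, where there is no fromsnap):

```c
#include <string.h>

/*
 * Stand-alone model of the patch's non-recursive snapshot filter.
 * Snapshots are visited in creation order, as send_iterate_snap would
 * see them.  Returns 1 if the snapshot's properties belong in the
 * stream, 0 if it should be skipped.
 */
struct send_state {
	const char *fromsnap;	/* may be NULL for a full send */
	const char *tosnap;
	int seenfrom;		/* pre-set to 1 when fromsnap == NULL */
	int seento;
};

static int
include_snap(struct send_state *st, const char *snapname)
{
	int isfromsnap = (st->fromsnap != NULL &&
	    strcmp(st->fromsnap, snapname) == 0);
	int istosnap = (st->tosnap != NULL &&
	    strcmp(st->tosnap, snapname) == 0);

	/* The fromsnap itself is not sent; it only opens the window. */
	if (!st->seenfrom && isfromsnap) {
		st->seenfrom = 1;
		return (0);
	}

	/* Skip everything before fromsnap and everything after tosnap. */
	if (st->seento || !st->seenfrom)
		return (0);

	if (istosnap)
		st->seento = 1;

	return (1);
}
```

Visiting snapshots in creation order, this skips everything up to and including fromsnap, includes everything through tosnap, and skips the rest, so only the snapshots actually being sent end up in the snaps/snapprops nvlists.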
>
> > On Mon, Dec 9, 2013 at 1:21 AM, Andriy Gapon <[email protected]> wrote:
> >
> >
> >     We are using zfs send -I for a kind of replication that we need to do
> >     in our system.  Recently we started using a user-defined property to
> >     associate certain metadata with each snapshot, and so we switched to
> >     generating streams (stream packages) using zfs send -p -I.  After that
> >     we noticed that the performance of zfs recv became orders of magnitude
> >     worse than it was before.  The performance seems to get worse as the
> >     number of snapshots grows.
> >
> >     So, it seems that when the -p option is used along with -I, the
> >     resulting stream package contains a special nvlist that is not present
> >     when -p is not used.  That nvlist has a nested nvlist named "snaps"
> >     that lists all the snapshots of the filesystem being sent.  I'd like
> >     to emphasize that it is not a list of the snapshots being sent, but a
> >     list of all snapshots that exist on the source system.  Also, there is
> >     a "snapprops" nvlist that contains custom properties of all those
> >     snapshots.
> >
> >     When such a stream package is received, the libzfs code goes over the
> >     list of all local snapshots of the filesystem and checks whether each
> >     snapshot has corresponding entries in the snaps and snapprops nvlists.
> >     If it does, the properties are set on the snapshot.
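A rough stand-alone model of that receive-side pass (hypothetical names, with flat string lists standing in for the nvlists):

```c
#include <string.h>

/*
 * Model of the receive-side pass: for each local snapshot, check
 * whether the stream's "snaps" list has an entry for it; if so, its
 * properties get (re-)set.  Returns the number of property-set
 * operations such a pass would perform.
 */
static int
prop_set_pass(const char **local, int nlocal,
    const char **stream_snaps, int nstream)
{
	int ops = 0;

	for (int i = 0; i < nlocal; i++) {
		for (int j = 0; j < nstream; j++) {
			if (strcmp(local[i], stream_snaps[j]) == 0) {
				ops++;	/* props set on this snapshot */
				break;
			}
		}
	}
	return (ops);
}
```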
> >
> >     Essentially, the above means that each time a stream generated with
> >     -p -I is received, the code sets / re-sets properties on all snapshots
> >     that are common to both the sending and receiving systems.  Also, it
> >     seems that this is done twice: before acting on the actual packaged
> >     streams and after that.
> >
> >     For example, if both systems have 100 snapshots and then we send
> >     another, say, 2 snapshots, then properties will be set 100 (before) +
> >     102 (after) == 202 times.  I am not sure if this behavior is by design
> >     or by accident...
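A trivial model of that count (hypothetical function, just restating the arithmetic):

```c
/*
 * Properties are set once per common snapshot in the pass before the
 * packaged streams are processed, and once per (common + newly
 * received) snapshot in the pass after.
 */
static int
prop_set_count(int common, int sent)
{
	return (common + (common + sent));
}
```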
> >
> >     The problem is further aggravated by the fact that the properties are
> >     set on each snapshot one by one using the synctask mechanism.  As
> >     such, each action generates some extra I/O (writing the MOS,
> >     re-writing vdev labels, etc.) and has a delay that is not
> >     insignificant.  On some systems we observe that delay to be in the
> >     hundreds of milliseconds.  Multiplied by hundreds of snapshots, this
> >     results in many minutes spent receiving a trivially sized stream.
> >
> >     I would like to inquire what semantics are expected for such -p -I
> >     streams.  I would appreciate any pointers on how to fix / improve the
> >     performance of handling such streams.
> >
> >     Thanks a lot!
> >
> >     P.S.
> >     The code of interest is recv_incremental_replication() and zfs_send()
> >     in libzfs_sendrecv.c.
> >
> >     --
> >     Andriy Gapon
> >
> >
> >
>
>
> --
> Andriy Gapon
>
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
