That looks fine. I would suggest adding a comment explaining that for -R (sd->recursive), the stream needs to have info about all snapshots, so that the receiver can change their properties to match and delete snapshots that don't exist on the sending system; but if this is just -p, we only need info about the snapshots being sent.
--matt

On Tue, Dec 10, 2013 at 5:14 AM, Andriy Gapon <[email protected]> wrote:

> on 09/12/2013 20:12 Matthew Ahrens said the following:
> > I can see several ways to improve this performance:
> >
> > 1. When receiving, only set properties on snapshots that are not
> > already set (as received properties).
> >
> > 2. Once channel programs
> > <http://www.open-zfs.org/wiki/Projects/ZFS_Channel_Programs> is
> > available, we will be able to set all the props on all the snapshots
> > in one txg.
> >
> > 3. It would probably be fine to change -p to only send properties of
> > the filesystems & snapshots that are being sent. It was implemented
> > that way primarily for -R, which *does* need to tell the receiving
> > system about all the snapshots that exist on the sending system
> > (& their props). The implementation of -p kind of came along for
> > the ride.
>
> Thanks!
> #2 would be perfect, of course.
> For now we decided to go with #3 and the following patch seems to work
> for us. How does it look to you?
>
> commit 91043718d5865d5d39f772edfbcee657c914963e
> Author: Andriy Gapon <[email protected]>
> Date:   Mon Dec 9 16:03:39 2013 +0200
>
>     zfs send -p: send properties only for snapshots that are actually sent
>
>     ... as opposed to sending properties of all snapshots of the relevant
>     filesystem. The previous behavior results in properties being set on
>     all snapshots on the receiving side, which is quite slow.
>     More details are here:
>     http://thread.gmane.org/gmane.comp.file-systems.openzfs.devel/346
>
>     Behavior of zfs send -R is not changed.
>
> diff --git a/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c b/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
> index 1fc46c2..8eb4276 100644
> --- a/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
> +++ b/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
> @@ -564,6 +564,8 @@ typedef struct send_data {
>      const char *fromsnap;
>      const char *tosnap;
>      boolean_t recursive;
> +    boolean_t seenfrom;
> +    boolean_t seento;
>
>      /*
>       * The header nvlist is of the following format:
> @@ -598,20 +600,39 @@ send_iterate_snap(zfs_handle_t *zhp, void *arg)
>      uint64_t guid = zhp->zfs_dmustats.dds_guid;
>      char *snapname;
>      nvlist_t *nv;
> +    boolean_t isfromsnap, istosnap;
>
>      snapname = strrchr(zhp->zfs_name, '@')+1;
> +    isfromsnap = (sd->fromsnap != NULL &&
> +        strcmp(sd->fromsnap, snapname) == 0);
> +    istosnap = (sd->tosnap != NULL && (strcmp(sd->tosnap, snapname) == 0));
>
> -    VERIFY(0 == nvlist_add_uint64(sd->parent_snaps, snapname, guid));
>      /*
>       * NB: if there is no fromsnap here (it's a newly created fs in
>       * an incremental replication), we will substitute the tosnap.
>       */
> -    if ((sd->fromsnap && strcmp(snapname, sd->fromsnap) == 0) ||
> -        (sd->parent_fromsnap_guid == 0 && sd->tosnap &&
> -        strcmp(snapname, sd->tosnap) == 0)) {
> +    if (isfromsnap || (sd->parent_fromsnap_guid == 0 && istosnap)) {
>          sd->parent_fromsnap_guid = guid;
>      }
>
> +    if (!sd->recursive) {
> +        if (!sd->seenfrom && isfromsnap) {
> +            sd->seenfrom = B_TRUE;
> +            zfs_close(zhp);
> +            return (0);
> +        }
> +
> +        if (sd->seento || !sd->seenfrom) {
> +            zfs_close(zhp);
> +            return (0);
> +        }
> +
> +        if (istosnap)
> +            sd->seento = B_TRUE;
> +    }
> +
> +    VERIFY(0 == nvlist_add_uint64(sd->parent_snaps, snapname, guid));
> +
>      VERIFY(0 == nvlist_alloc(&nv, NV_UNIQUE_NAME, 0));
>      send_iterate_prop(zhp, nv);
>      VERIFY(0 == nvlist_add_nvlist(sd->snapprops, snapname, nv));
>
>
> On Mon, Dec 9, 2013 at 1:21 AM, Andriy Gapon <[email protected]> wrote:
> >
> > We are using zfs send -I for a kind of replication that we need to do
> > in our system. Recently we started using a user-defined property to
> > associate certain metadata with each snapshot, and so we switched to
> > generating streams (stream packages) using zfs send -p -I. After that
> > we noticed that the performance of zfs recv became orders of magnitude
> > worse than it was before. The performance seems to get worse as the
> > number of snapshots grows.
> >
> > So, it seems that when the -p option is used along with -I, the
> > resulting stream package contains a special nvlist that is not present
> > when -p is not used. That nvlist has a nested nvlist named "snaps"
> > that lists all the snapshots of the filesystem being sent. I'd like
> > to emphasize that it is not a list of the snapshots being sent, but a
> > list of all snapshots that exist on the source system. Also, there is
> > a "snapprops" nvlist that contains the custom properties of all those
> > snapshots.
> >
> > When such a stream package is received, the libzfs code goes over the
> > list of all local snapshots of the filesystem and checks whether each
> > snapshot has corresponding entries in the snaps and snapprops nvlists.
> > If it does, then the properties are set on the snapshot.
> >
> > Essentially, the above means that each time a stream generated with
> > -p -I is received, the code sets / re-sets properties on all snapshots
> > that are common to both the sending and receiving systems. Also, it
> > seems that this is done twice - before acting on the actual packaged
> > streams and after that.
> >
> > For example, if both systems have 100 snapshots and then we send
> > another, say, 2 snapshots, then properties will be set 100 (before) +
> > 102 (after) == 202 times. I am not sure if this behavior is by design
> > or by accident...
> >
> > The problem is further aggravated by the fact that the properties are
> > set on each snapshot one by one using the synctask mechanism. As such,
> > each action generates some extra I/O (writing the MOS, re-writing vdev
> > labels, etc.) and has a delay that is not insignificant. On some
> > systems we observe that delay to be in the hundreds of milliseconds.
> > Multiplied by hundreds of snapshots, this results in many minutes
> > spent receiving a trivially sized stream.
> >
> > I would like to inquire what semantics are expected for such -p -I
> > streams. I would appreciate any pointers on how to fix / improve the
> > performance of handling such streams.
> >
> > Thanks a lot!
> >
> > P.S.
> > The code of interest is recv_incremental_replication() and zfs_send()
> > in libzfs_sendrecv.c.
> >
> > --
> > Andriy Gapon
>
>
> --
> Andriy Gapon
>
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
