on 09/12/2013 20:12 Matthew Ahrens said the following:
> I can see several ways to improve this performance:
>
> 1. When receiving, only set properties on snapshots that are not
> already set (as received properties).
>
> 2. Once channel programs
> <http://www.open-zfs.org/wiki/Projects/ZFS_Channel_Programs> are
> available, we will be able to set all the props on all the snapshots
> in one txg.
>
> 3. It would probably be fine to change -p to only send properties of
> the filesystems & snapshots that are being sent.  It was implemented
> that way primarily for -R, which *does* need to tell the receiving
> system about all the snapshots that exist on the sending system
> (& their props).  The implementation of -p kind of came along for the
> ride.
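[For reference, option 1 could be modeled roughly as below.  This is an
illustrative stand-alone sketch only, assuming a flat name/value table;
the type and function names are hypothetical and are not the real libzfs
nvlist or property interfaces.]

```c
#include <stddef.h>
#include <string.h>

/*
 * Rough model of option 1: before (re-)setting a received property on a
 * snapshot, compare it with what is already set and skip the no-op, so
 * an unchanged snapshot costs no synctask.  'struct prop' and
 * prop_needs_set() are illustrative names, not libzfs interfaces.
 */
struct prop {
	const char *name;
	const char *value;
};

/* Return nonzero iff setting 'want' would actually change anything. */
int
prop_needs_set(const struct prop *have, size_t nhave,
    const struct prop *want)
{
	for (size_t i = 0; i < nhave; i++) {
		if (strcmp(have[i].name, want->name) == 0)
			return (strcmp(have[i].value, want->value) != 0);
	}
	return (1);	/* property not present yet, must be set */
}
```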
Thanks!
#2 would be perfect, of course.
For now we decided to go with #3 and the following patch seems to work
for us.  How does it look to you?

commit 91043718d5865d5d39f772edfbcee657c914963e
Author: Andriy Gapon <[email protected]>
Date:   Mon Dec 9 16:03:39 2013 +0200

    zfs send -p: send properties only for snapshots that are actually sent

    ... as opposed to sending properties of all snapshots of the relevant
    filesystem.  The previous behavior results in properties being set on
    all snapshots on the receiving side, which is quite slow.
    More details are here:
    http://thread.gmane.org/gmane.comp.file-systems.openzfs.devel/346

    Behavior of zfs send -R is not changed.

diff --git a/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c b/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
index 1fc46c2..8eb4276 100644
--- a/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
+++ b/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_sendrecv.c
@@ -564,6 +564,8 @@ typedef struct send_data {
 	const char *fromsnap;
 	const char *tosnap;
 	boolean_t recursive;
+	boolean_t seenfrom;
+	boolean_t seento;
 
 	/*
 	 * The header nvlist is of the following format:
@@ -598,20 +600,39 @@ send_iterate_snap(zfs_handle_t *zhp, void *arg)
 	uint64_t guid = zhp->zfs_dmustats.dds_guid;
 	char *snapname;
 	nvlist_t *nv;
+	boolean_t isfromsnap, istosnap;
 
 	snapname = strrchr(zhp->zfs_name, '@')+1;
+	isfromsnap = (sd->fromsnap != NULL &&
+	    strcmp(sd->fromsnap, snapname) == 0);
+	istosnap = (sd->tosnap != NULL && (strcmp(sd->tosnap, snapname) == 0));
 
-	VERIFY(0 == nvlist_add_uint64(sd->parent_snaps, snapname, guid));
 	/*
 	 * NB: if there is no fromsnap here (it's a newly created fs in
 	 * an incremental replication), we will substitute the tosnap.
 	 */
-	if ((sd->fromsnap && strcmp(snapname, sd->fromsnap) == 0) ||
-	    (sd->parent_fromsnap_guid == 0 && sd->tosnap &&
-	    strcmp(snapname, sd->tosnap) == 0)) {
+	if (isfromsnap || (sd->parent_fromsnap_guid == 0 && istosnap)) {
 		sd->parent_fromsnap_guid = guid;
 	}
 
+	if (!sd->recursive) {
+		if (!sd->seenfrom && isfromsnap) {
+			sd->seenfrom = B_TRUE;
+			zfs_close(zhp);
+			return (0);
+		}
+
+		if (sd->seento || !sd->seenfrom) {
+			zfs_close(zhp);
+			return (0);
+		}
+
+		if (istosnap)
+			sd->seento = B_TRUE;
+	}
+
+	VERIFY(0 == nvlist_add_uint64(sd->parent_snaps, snapname, guid));
+
 	VERIFY(0 == nvlist_alloc(&nv, NV_UNIQUE_NAME, 0));
 	send_iterate_prop(zhp, nv);
 	VERIFY(0 == nvlist_add_nvlist(sd->snapprops, snapname, nv));

> On Mon, Dec 9, 2013 at 1:21 AM, Andriy Gapon <[email protected]
> <mailto:[email protected]>> wrote:
>
> We are using zfs send -I for a kind of replication that we need to do
> in our system.  Recently we started using a user defined property to
> associate certain metadata with each snapshot, and so we switched to
> generating streams (stream packages) using zfs send -p -I.  After that
> we noticed that the performance of zfs recv became orders of magnitude
> worse than it was before.  The performance seems to get worse as the
> number of snapshots grows.
>
> So, it seems that when the -p option is used along with -I, the
> resulting stream package contains a special nvlist that is not present
> when -p is not used.  That nvlist has a nested nvlist named "snaps"
> that lists all the snapshots of the filesystem being sent.  I'd like
> to emphasize that it is not a list of the snapshots being sent, but a
> list of all snapshots that exist on the source system.  Also, there is
> a "snapprops" nvlist that contains custom properties of all the
> snapshots.
>
> When such a stream package is received, the libzfs code goes over the
> list of all local snapshots of the filesystem and checks whether each
> snapshot has corresponding entries in the snaps and snapprops nvlists.
> If it does, then the properties would be set on the snapshot.
>
> Essentially, the above means that each time a stream generated with
> -p -I is received, the code sets / re-sets properties on all snapshots
> that are common to both the sending and receiving systems.  Also, it
> seems that this is done twice - before acting on the actual packaged
> streams and after that.
>
> For example, if both systems have 100 snapshots and then we send
> another, say, 2 snapshots, then properties will be set 100 (before) +
> 102 (after) == 202 times.  I am not sure if this behavior is by design
> or by accident...
>
> The problem is further aggravated by the fact that the properties are
> set on each snapshot one by one using the synctask mechanism.  As
> such, each action generates some extra I/O (writing the MOS,
> re-writing vdev labels, etc) and has a delay that is not
> insignificant.  On some systems we observe that delay to be in the
> hundreds of milliseconds.  Multiplied by hundreds of snapshots, this
> results in many minutes spent receiving a trivially sized stream.
>
> I would like to inquire what semantics are expected for such -p -I
> streams.  I would appreciate any pointers on how to fix / improve the
> performance of handling such streams.
>
> Thanks a lot!
>
> P.S.
> The code of interest is recv_incremental_replication() and zfs_send()
> in libzfs_sendrecv.c.
>
> --
> Andriy Gapon
>
> _______________________________________________
> developer mailing list
> [email protected] <mailto:[email protected]>
> http://lists.open-zfs.org/mailman/listinfo/developer
>

--
Andriy Gapon

_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
