We are using zfs send -I for a kind of replication that we need to do in our
system.  Recently we started using a user-defined property to associate certain
metadata with each snapshot, and so we switched to generating streams (stream
packages) using zfs send -p -I.  After that we noticed that the performance of
zfs recv became orders of magnitude worse than it was before.  The performance
seems to get worse as the number of snapshots grows.

So, it seems that when the -p option is used along with -I, the resulting stream
package contains a special nvlist that is not present when -p is not used.  That
nvlist has a nested nvlist named "snaps" that lists all the snapshots of the
filesystem being sent.  I'd like to emphasize that it is not a list of the
snapshots being sent, but a list of all snapshots that exist on the source
system.  Also, there is a "snapprops" nvlist that contains the custom properties
of all the snapshots.
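
To make the shape of that header concrete, here is a rough model of it as a
Python dict.  This is only a sketch of what is described above, not the actual
on-wire nvlist layout: the snapshot names, the GUID values and the property
name com.example:meta are all invented for illustration.

```python
# Hypothetical model of the extra header in a `zfs send -p -I` stream package.
# Snapshot names, GUIDs and the property name are invented for illustration.
source_snapshots = ["snap1", "snap2", "snap3", "snap4"]  # all exist on source
sent_snapshots = ["snap3", "snap4"]  # the increments actually in the stream

stream_header = {
    # every snapshot on the source, mapped to an (invented) GUID
    "snaps": {name: 1000 + i for i, name in enumerate(source_snapshots)},
    # custom properties of every source snapshot
    "snapprops": {name: {"com.example:meta": "value-" + name}
                  for name in source_snapshots},
}

# Snapshots that are listed in the header although they are not being sent.
not_sent = sorted(set(stream_header["snaps"]) - set(sent_snapshots))
print(not_sent)
```

The point of the model is only that the header describes the whole source
snapshot list, so its size depends on the history of the filesystem rather than
on the size of the increment.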

When such a stream package is received, the libzfs code goes over the list of
all local snapshots of the filesystem and checks whether each snapshot has
corresponding entries in the snaps and snapprops nvlists.  If it does, then the
properties are set on the snapshot.
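
The matching step described above can be sketched roughly as follows.  This is
a simplified model and not the libzfs code: apply_snapshot_props and the
set_props callback are hypothetical names, and plain dicts stand in for the
snaps and snapprops nvlists.

```python
def apply_snapshot_props(local_snaps, snaps, snapprops, set_props):
    """One pass over the local snapshots (simplified model, not libzfs).

    Calls set_props for every local snapshot that has entries in both
    `snaps` and `snapprops`, and returns the names that were touched.
    """
    touched = []
    for name in local_snaps:
        if name in snaps and name in snapprops:
            set_props(name, snapprops[name])  # one property update per snapshot
            touched.append(name)
    return touched


calls = []
touched = apply_snapshot_props(
    local_snaps=["a", "b", "local-only"],
    snaps={"a": 1, "b": 2, "c": 3},
    snapprops={
        "a": {"com.example:meta": "x"},
        "b": {"com.example:meta": "y"},
        "c": {"com.example:meta": "z"},
    },
    set_props=lambda name, props: calls.append((name, props)),
)
print(touched)
```

Note that the loop has no notion of which snapshots arrived in the current
stream; every snapshot common to both sides gets its properties set again.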

Essentially, the above means that each time a stream generated with -p -I is
received the code sets / re-sets properties on all snapshots that are common to
both the sending and receiving systems.  Also, it seems that this is done twice
- before acting on the actual packaged streams and after that.

For example, if both systems have 100 snapshots and then we send another, say, 2
snapshots, then properties will be set 100 (before) + 102 (after) == 202 times.
I am not sure if this behavior is by design or by accident...
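
The arithmetic of that example can be checked with a few lines of set
manipulation.  The snapshot names are made up, and the two-pass structure is
modelled only as described above (one pass before the packaged streams are
applied, one pass after).

```python
# 102 snapshots on the source once the two new ones exist (names invented).
source_all = {f"snap{i}" for i in range(1, 103)}
# 100 snapshots already present on the receiving side.
local_before = {f"snap{i}" for i in range(1, 101)}
# The two snapshots carried by the incremental stream package.
sent = {"snap101", "snap102"}
local_after = local_before | sent

before = len(local_before & source_all)  # pass before receiving the streams
after = len(local_after & source_all)    # pass after receiving the streams
total = before + after
print(before, after, total)
```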

The problem is further aggravated by the fact that the properties are set on
each snapshot one by one using the synctask mechanism.  As such, each action
generates some extra I/O (writing MOS, re-writing vdev labels, etc) and has a
delay that is not insignificant.  On some systems we observe that delay to be in
the hundreds of milliseconds.  Multiplied by hundreds of snapshots this results
in many minutes spent receiving a trivially sized stream.
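
As a back-of-the-envelope illustration of how this adds up, the sketch below
multiplies the number of property updates by a per-update delay.  The 300 ms
delay and the snapshot counts are assumed round numbers chosen for the example,
not measurements, and estimated_overhead_ms is a hypothetical helper.

```python
def estimated_overhead_ms(common_before, newly_received, delay_ms):
    """Estimated time spent only on setting snapshot properties.

    Models two passes: one over the snapshots common to both systems before
    the receive, and one over those plus the newly received snapshots after.
    """
    updates = common_before + (common_before + newly_received)
    return updates * delay_ms


# Assumed numbers: 300 ms per update, 2 new snapshots in the stream.
small = estimated_overhead_ms(100, 2, 300)  # 100 common snapshots
large = estimated_overhead_ms(500, 2, 300)  # 500 common snapshots
print(small, "ms")
print(large, "ms")
```

With these assumed figures, 100 common snapshots already cost about a minute of
property updates, and 500 cost about five minutes, regardless of how little
data the stream itself carries.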

I would like to inquire what semantics are expected for such -p -I streams.
I would appreciate any pointers on how to fix / improve the performance of
handling such streams.

Thanks a lot!

P.S.
The code of interest is recv_incremental_replication() and zfs_send() in
libzfs_sendrecv.c.

-- 
Andriy Gapon
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
