[zfs-discuss] scripting incremental replication data streams
I send a replication data stream from one host to another. (and receive.)

I discovered that after receiving, I need to remove the auto-snapshot property on the receiving side, and set the readonly property on the receiving side, to prevent accidental changes (including auto-snapshots).

Question #1: Actually, do I need to remove the auto-snapshot on the receiving side? Or is it sufficient to simply set the readonly property? Will the readonly property prevent auto-snapshots from occurring?

So then, sometime later, I want to send an incremental replication stream. I need to name an incremental source snap on the sending side... which needs to be the latest matching snap that exists on both sides.

Question #2: What's the best way to find the latest matching snap on both the source and destination? At present, it seems, I'll have to build a list of sender snaps and a list of receiver snaps, and parse and search them until I find the latest one that exists in both. For shell scripting, this is very non-trivial.

Thanks...

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] scripting incremental replication data streams
On Sep 12, 2012, at 12:44 PM, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) wrote:

> I send a replication data stream from one host to another. (and receive.)
> I discovered that after receiving, I need to remove the auto-snapshot
> property on the receiving side, and set the readonly property on the
> receiving side, to prevent accidental changes (including auto-snapshots).
>
> Question #1: Actually, do I need to remove the auto-snapshot on the
> receiving side?

Yes

> Or is it sufficient to simply set the readonly property?

No

> Will the readonly property prevent auto-snapshots from occurring?

No

> So then, sometime later, I want to send an incremental replication stream. I
> need to name an incremental source snap on the sending side... which needs
> to be the latest matching snap that exists on both sides.
>
> Question #2: What's the best way to find the latest matching snap on both
> the source and destination? At present, it seems, I'll have to build a list
> of sender snaps, and a list of receiver snaps, and parse and search them,
> till I find the latest one that exists in both. For shell scripting, this is
> very non-trivial.

Actually, it is quite easy. You will notice that "zfs list -t snapshot" shows the list in creation time order. If you are more paranoid, you can get the snapshot's creation time from the "creation" property. For convenience, "zfs get -p creation ..." will return the time as a number. Something like this:

    for i in $(zfs list -t snapshot -H -o name); do
        echo $(zfs get -p -H -o value creation $i) $i
    done | sort -n

 -- richard

--
illumos Day & ZFS Day, Oct 1-2, 2012 San Francisco www.zfsday.com
richard.ell...@richardelling.com +1-760-896-4422
Re: [zfs-discuss] scripting incremental replication data streams
When I wrote a script for this, I used separate snapshots, with a different naming convention, to use as the endpoints for the incremental send. With this, it becomes easier: find the newest snapshot with that naming convention on the sending side, and check that it exists on the receiving side. This way, you don't have to deal with the latest frequent/hourly on the target side having been removed from the source side since the last backup.

Tim

On Wed, Sep 12, 2012 at 2:44 PM, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) <opensolarisisdeadlongliveopensola...@nedharvey.com> wrote:

> I send a replication data stream from one host to another. (and receive.)
>
> I discovered that after receiving, I need to remove the auto-snapshot
> property on the receiving side, and set the readonly property on the
> receiving side, to prevent accidental changes (including auto-snapshots).
>
> Question #1: Actually, do I need to remove the auto-snapshot on the
> receiving side? Or is it sufficient to simply set the readonly property?
> Will the readonly property prevent auto-snapshots from occurring?
>
> So then, sometime later, I want to send an incremental replication stream.
> I need to name an incremental source snap on the sending side... which
> needs to be the latest matching snap that exists on both sides.
>
> Question #2: What's the best way to find the latest matching snap on
> both the source and destination? At present, it seems, I'll have to
> build a list of sender snaps, and a list of receiver snaps, and parse and
> search them, till I find the latest one that exists in both. For shell
> scripting, this is very non-trivial.
>
> Thanks...
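[Editor's sketch of the naming-convention approach described above, on sample data. The pool name, snapshot names, and the "backup-" prefix are all hypothetical; in practice the two lists would come from `zfs list -t snapshot -H -o name` run locally and over ssh.]

```shell
# Sample stand-ins for the sender's and receiver's snapshot lists, in
# creation order as "zfs list -t snapshot -H -o name" would print them.
# The "backup-" prefix is a hypothetical naming convention.
sender_snaps='tank/fs@zfs-auto-snap_frequent-1
tank/fs@backup-2012-09-10
tank/fs@zfs-auto-snap_hourly-1
tank/fs@backup-2012-09-12'
receiver_snaps='tank/fs@backup-2012-09-10
tank/fs@backup-2012-09-12'

# Newest backup snapshot on the sender: last matching line in creation order.
newest=$(printf '%s\n' "$sender_snaps" | grep '@backup-' | tail -n 1)

# It is a safe incremental source only if the receiver also has it.
if printf '%s\n' "$receiver_snaps" | grep -qFx "$newest"; then
    echo "incremental source: $newest"
else
    echo "no common backup snapshot; full send needed"
fi
```

Because only the dedicated snapshots are considered, the auto-snapshot churn on either side never enters the comparison.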
Re: [zfs-discuss] scripting incremental replication data streams
On 09/13/12 07:44 AM, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) wrote:

> I send a replication data stream from one host to another. (and receive.)
>
> I discovered that after receiving, I need to remove the auto-snapshot
> property on the receiving side, and set the readonly property on the
> receiving side, to prevent accidental changes (including auto-snapshots).
>
> Question #1: Actually, do I need to remove the auto-snapshot on the
> receiving side? Or is it sufficient to simply set the readonly property?
> Will the readonly property prevent auto-snapshots from occurring?
>
> So then, sometime later, I want to send an incremental replication stream.
> I need to name an incremental source snap on the sending side... which
> needs to be the latest matching snap that exists on both sides.
>
> Question #2: What's the best way to find the latest matching snap on both
> the source and destination? At present, it seems, I'll have to build a list
> of sender snaps, and a list of receiver snaps, and parse and search them,
> till I find the latest one that exists in both. For shell scripting, this is
> very non-trivial.

That's pretty much how I do it. Get the two (sorted) sets of snapshots, remove those that only exist on the remote end (ageing) and send those that only exist locally. The first incremental pair will be the last common snapshot and the first unique local snapshot.

I haven't tried this in a script, but it's quite straightforward in C++ using the standard library set container and algorithms.

--
Ian.
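[Editor's sketch of the set comparison described above, done in shell with comm(1) rather than C++. It assumes snapshot names sort in creation order, which holds for timestamp-based naming but not in general. The lists are sample stand-ins for `zfs list -t snapshot -H -o name` output on each host.]

```shell
local_snaps='tank/fs@snap1
tank/fs@snap2
tank/fs@snap3'
remote_snaps='tank/fs@snap0
tank/fs@snap1
tank/fs@snap2'

# comm(1) requires sorted input.
l=$(mktemp); r=$(mktemp)
printf '%s\n' "$local_snaps"  | sort > "$l"
printf '%s\n' "$remote_snaps" | sort > "$r"

only_remote=$(comm -13 "$l" "$r")  # only on the remote: candidates for ageing out
only_local=$(comm -23 "$l" "$r")   # only local: still need to be sent
common=$(comm -12 "$l" "$r")       # shared history

# First incremental pair: last common snapshot -> first unique local snapshot.
from=$(printf '%s\n' "$common" | tail -n 1)
to=$(printf '%s\n' "$only_local" | head -n 1)
echo "zfs send -I $from $to"

rm -f "$l" "$r"
```

The three `comm` outputs map directly onto Ian's description: age out `only_remote`, send `only_local`, and anchor the incremental at the tail of `common`.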
Re: [zfs-discuss] scripting incremental replication data streams
> From: Richard Elling [mailto:richard.ell...@gmail.com]
>
> > Question #2: What's the best way to find the latest matching snap on both
> > the source and destination? At present, it seems, I'll have to build a
> > list of sender snaps, and a list of receiver snaps, and parse and search
> > them, till I find the latest one that exists in both. For shell
> > scripting, this is very non-trivial.
>
> Actually, it is quite easy. You will notice that "zfs list -t snapshot"
> shows the list in creation time order.

Actually, I already knew that. But at the time of the initial send, the latest snap is most likely a "frequent," which will most likely not exist at a later time to be the base of the incremental. Which means, during the incremental, an arbitrary number of the latest snaps on both the sender and receiver probably need to be ignored. In other words, I can't just use a fixed tail or awk command, and I can't sort alphabetically. In shell programming land (unless I want to use Python or something) I'll have to nest a for-loop in a for-loop:

    latestmatch=""
    sendersnaps=$(ssh otherhost "zfs list -t snapshot" | grep "$FILESYSTEM" | sed 's/ .*//')
    receiversnaps=$(zfs list -t snapshot | grep "$FILESYSTEM" | sed 's/ .*//')
    for sendersnap in $sendersnaps ; do
        for receiversnap in $receiversnaps ; do
            if [ "$sendersnap" = "$receiversnap" ] ; then
                latestmatch=$sendersnap
            fi
        done
    done
    if [ -z "$latestmatch" ] ; then
        echo "No matching snaps, can't send incremental."
        # Do a full send, or abort
    else
        echo "Doing incremental"
        ssh otherhost "zfs send -I $latestmatch ... etc etc...
    fi
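[Editor's note: since "zfs list -t snapshot" prints names in creation order, the nested loop above can be collapsed: filter the sender's creation-ordered list down to names that also appear in the receiver's list, and take the last survivor. A sketch on sample data; in practice the two lists would be captured via zfs list and ssh as in the script above.]

```shell
# Sample stand-ins, each in creation order as zfs list would print them.
sendersnaps='tank/fs@a
tank/fs@b
tank/fs@c
tank/fs@d'
receiversnaps='tank/fs@a
tank/fs@b
tank/fs@c'

recvfile=$(mktemp)
printf '%s\n' "$receiversnaps" > "$recvfile"

# Keep only sender snaps that also exist on the receiver (-F fixed strings,
# -x whole-line match, -f patterns from file), preserving the sender's
# creation order, then take the newest survivor.
latestmatch=$(printf '%s\n' "$sendersnaps" | grep -Fx -f "$recvfile" | tail -n 1)
echo "$latestmatch"

rm -f "$recvfile"
```

This keeps creation order intact without any alphabetic sorting, so the "arbitrary number of unmatched newest snaps" on either side simply falls out of the intersection.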
Re: [zfs-discuss] scripting incremental replication data streams
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
>
> Question #2: What's the best way to find the latest matching snap on both
> the source and destination? At present, it seems, I'll have to build a
> list of sender snaps, and a list of receiver snaps, and parse and search
> them, till I find the latest one that exists in both. For shell scripting,
> this is very non-trivial.

Someone replied to me off-list and said:

Try http://blog.infrageeks.com/auto-replicate/

Does everything you're looking for in a portable shell script. The auto-backup adds in zfs holds.
Re: [zfs-discuss] scripting incremental replication data streams
Unless I'm missing something, they didn't solve the "matching snapshots" thing yet. From their site:

"To Do: Additional error handling for mismatched snapshots (last destination snap no longer exists on the source): walk backwards through the remote snaps until a common snapshot is found and destroy non-matching remote snapshots"

As more evidence, the script doesn't contain any for loops, and has comments that indicate it just uses the latest snapshot on the target. It also doesn't appear that the script uses "zfs hold" anywhere, so I don't think it will stop the auto-snapshots from disappearing on the source. (I'm also not sure how the auto-snap service will respond to a failure to destroy a held snapshot; I've had it go into maintenance mode on me for just changing the timezone.)

Why are you trying to solve this harder problem, rather than using your own snapshots that the auto-snapshot service won't destroy? If you make a snapshot that doesn't start with zfs-auto-snap, then use it (them) as the endpoint(s) for the transfer, adding holds if desired, this entire mess of matching snapshots that may or may not still exist just goes away.

Tim

On Wed, Sep 12, 2012 at 5:05 PM, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) <opensolarisisdeadlongliveopensola...@nedharvey.com> wrote:

> Someone replied to me off-list and said:
>
> Try http://blog.infrageeks.com/auto-replicate/
>
> Does everything you're looking for in a portable shell script.
>
> The auto-backup adds in zfs holds
Re: [zfs-discuss] scripting incremental replication data streams
On 09/13/12 10:23 AM, Timothy Coalson wrote:

> Unless I'm missing something, they didn't solve the "matching snapshots"
> thing yet. From their site:
>
> "To Do: Additional error handling for mismatched snapshots (last
> destination snap no longer exists on the source): walk backwards through
> the remote snaps until a common snapshot is found and destroy non-matching
> remote snapshots"

That's what I do as part of my "destroy snapshots not on the source" check. Over many years of managing various distributed systems, I've discovered the apparently simple tends to get complex!

--
Ian.
Re: [zfs-discuss] scripting incremental replication data streams
On Wed, Sep 12, 2012 at 7:16 PM, Ian Collins wrote:

> That's what I do as part of my "destroy snapshots not on the source"
> check. Over many years of managing various distributed systems, I've
> discovered the apparently simple tends to get complex!

I tricked the auto-snapshot service into doing that bit for me on the receiving end, so I can just use zfs send -I. However, due to when it decides to do the cleanup for each category (it won't clean up any old daily snapshots unless the auto-snapshot property is true when it is time to take a daily snap), adding it to cron requires more care than I'd like. Since I am only replicating between two hosts, though, I can deal with it.

I have been tempted to write a cron script to replace the auto-snap service, to remove its naming classes, cleanup oddities, etc., but haven't had a good reason to yet.

Tim
Re: [zfs-discuss] scripting incremental replication data streams
> I send a replication data stream from one host to another. (and receive.)

Have you looked at snapsend?

http://labs.omniti.com/labs/tools/browser/trunk/snapsend.sh

We've used this for a while in production for our backups and replication. We adapted it a bit so it runs every minute without overlapping (through a daemon process) and sends email alerts when errors occur.

We'll be updating it some more in the next week or so. We want the process to run on the remote host instead of the primary, and have it save x days of snaps for backups as well. This way it will save the last 60 minutes of snapshots (which are sent every minute or less) as well as one snapshot per day for 30 days. All snapshots are created on the primary host and pulled to the remote host to avoid issues.

If you are interested we can push the code to github once we test it.

-Chris