Re: [zfs-discuss] scripting incremental replication data streams

2012-09-19 Thread Karl Wagner

Hi Edward,

My own personal view on this is that the simplest option is the best.

In your script, create a new snapshot using one of two names. Let's call 
them SNAPSEND_A and SNAPSEND_B. You can decide which one to create by 
checking which currently exists.


As manual setup, on the first run, create SNAPSEND_A and send it to 
your target. This can, obviously, be done incrementally from your last 
replication/last common snapshot.


Now in your script, you would:
* Check your source dataset for the existence of SNAPSEND_A and 
SNAPSEND_B. Let's assume this is the first run after manual setup, so 
SNAPSEND_A will exist.

* Create SNAPSEND_B. Replicate this over to your receiving dataset.
* Remove SNAPSEND_A on both sides. This will leave all intermediate 
snapshots.


Next run, it will create SNAPSEND_A again, and remove B when finished.
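
The alternation described above could be sketched roughly like this. It is
only an illustration under stated assumptions: the helper name next_snapsend,
the dataset tank/data and the host backuphost are made up, and the zfs
commands are left as comments so nothing is changed by loading the function.

```sh
#!/bin/sh
# Decide which of the two alternating snapshot names to create next.
# stdin: snapshot names, one per line (e.g. from
#        "zfs list -H -t snapshot -o name -d 1 <dataset>").
# Prints the name to create; fails if neither or both markers exist,
# because that situation needs manual attention.
next_snapsend() {
    awk -F@ '
        $2 == "SNAPSEND_A" { a = 1 }
        $2 == "SNAPSEND_B" { b = 1 }
        END {
            if (a && !b)      print "SNAPSEND_B"
            else if (b && !a) print "SNAPSEND_A"
            else              exit 1
        }'
}

# How one run might use it (placeholders: tank/data, backuphost):
#   fs=tank/data
#   new=$(zfs list -H -t snapshot -o name -d 1 "$fs" | next_snapsend) || exit 1
#   case $new in SNAPSEND_A) old=SNAPSEND_B ;; *) old=SNAPSEND_A ;; esac
#   zfs snapshot "$fs@$new"
#   zfs send -I "$fs@$old" "$fs@$new" | ssh backuphost zfs receive "$fs" &&
#       zfs destroy "$fs@$old" &&
#       ssh backuphost zfs destroy "$fs@$old"
```

Refusing to continue when both or neither marker exists is a design choice
of this sketch, not something from the message above: it stops an
interrupted run from silently destroying the only common snapshot.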

Hope this helps.
Karl

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] scripting incremental replication data streams

2012-09-12 Thread Richard Elling
On Sep 12, 2012, at 12:44 PM, Edward Ned Harvey 
(opensolarisisdeadlongliveopensolaris) 
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

 I send a replication data stream from one host to another. (and receive).
 I discovered that after receiving, I need to remove the auto-snapshot 
 property on the receiving side, and set the readonly property on the 
 receiving side, to prevent accidental changes (including auto-snapshots.)
  
 Question #1:  Actually, do I need to remove the auto-snapshot on the 
 receiving side?  

Yes

 Or is it sufficient to simply set the readonly property?  

No

 Will the readonly property prevent auto-snapshots from occurring?

No
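
Taken together, the three answers above mean both properties have to be
handled on the receiving side. A minimal sketch, assuming the auto-snapshot
service is driven by the user property com.sun:auto-snapshot (check which
property your service actually reads) and using a made-up helper name and
dataset. It only prints the commands so they can be reviewed first.

```sh
#!/bin/sh
# Print (rather than run) the commands that lock down a received dataset.
# Assumption: auto-snapshots are controlled by com.sun:auto-snapshot.
# $1 = dataset on the receiving side (placeholder name in the example).
lockdown_cmds() {
    fs=$1
    echo "zfs set com.sun:auto-snapshot=false $fs"
    echo "zfs set readonly=on $fs"
}

# Example: review the output, then run it on the receiving host.
#   lockdown_cmds tank/backup
```

Since the original poster found this necessary after receiving, it would
presumably be repeated after each receive.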

  
 So then, sometime later, I want to send an incremental replication stream.  I 
 need to name an incremental source snap on the sending side...  which needs 
 to be the latest matching snap that exists on both sides.
  
 Question #2:  What's the best way to find the latest matching snap on both 
 the source and destination?  At present, it seems, I'll have to build a list 
 of sender snaps, and a list of receiver snaps, and parse and search them, 
 till I find the latest one that exists in both.  For shell scripting, this is 
 very non-trivial.

Actually, it is quite easy. You will notice that zfs list -t snapshot shows 
the list in creation time order. If you are more paranoid, you can get the 
snapshot's creation time from the creation property. For convenience, 
zfs get -p creation ... will return the time as a number. Something like this:

for i in $(zfs list -t snapshot -H -o name); do
  echo $(zfs get -p -H -o value creation $i) $i
done | sort -n

 -- richard

--
illumos Day & ZFS Day, Oct 1-2, 2012 San Francisco 
www.zfsday.com
richard.ell...@richardelling.com
+1-760-896-4422










Re: [zfs-discuss] scripting incremental replication data streams

2012-09-12 Thread Timothy Coalson
When I wrote a script for this, I used separate snapshots, with a different
naming convention, to use as the endpoints for the incremental send.  With
this, it becomes easier: find the newest snapshot with that naming
convention on the sending side, and check that it exists on the receiving
side.  This way, you don't have to deal with the latest frequent/hourly on
the target side having been removed from the source side since the last
backup.
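
A rough sketch of that check, assuming a made-up prefix "repl-" for the
replication snapshots and hypothetical helper names. Both helpers only read
lists of snapshot names, so they can be tried without touching a pool.

```sh
#!/bin/sh
# Print the newest snapshot whose name (the part after @) starts with
# the prefix given as $1.
# stdin: snapshot names, one per line, oldest first.
newest_with_prefix() {
    awk -F@ -v p="$1" '
        index($2, p) == 1 { last = $0 }
        END { if (last != "") print last }'
}

# Succeed if the snapshot name $1 appears as a whole line on stdin.
has_snapshot() {
    grep -Fqx -- "$1"
}

# Possible use (placeholders: tank/data, backuphost, repl-):
#   base=$(zfs list -H -t snapshot -o name -d 1 tank/data |
#          newest_with_prefix repl-)
#   ssh backuphost zfs list -H -t snapshot -o name -d 1 tank/data |
#       has_snapshot "$base" && echo "incremental base: $base"
```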

Tim

On Wed, Sep 12, 2012 at 2:44 PM, Edward Ned Harvey
(opensolarisisdeadlongliveopensolaris) 
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

 I send a replication data stream from one host to another. (and receive).

 I discovered that after receiving, I need to remove the auto-snapshot
 property on the receiving side, and set the readonly property on the
 receiving side, to prevent accidental changes (including auto-snapshots.)

 Question #1:  Actually, do I need to remove the auto-snapshot on the
 receiving side?  Or is it sufficient to simply set the readonly property?
 Will the readonly property prevent auto-snapshots from occurring?

 So then, sometime later, I want to send an incremental replication stream.
 I need to name an incremental source snap on the sending side...  which
 needs to be the latest matching snap that exists on both sides.

 Question #2:  What's the best way to find the latest matching snap on
 both the source and destination?  At present, it seems, I'll have to
 build a list of sender snaps, and a list of receiver snaps, and parse and
 search them, till I find the latest one that exists in both.  For shell
 scripting, this is very non-trivial.

 Thanks...



Re: [zfs-discuss] scripting incremental replication data streams

2012-09-12 Thread Ian Collins
On 09/13/12 07:44 AM, Edward Ned Harvey 
(opensolarisisdeadlongliveopensolaris) wrote:


I send a replication data stream from one host to another. (and receive).

I discovered that after receiving, I need to remove the auto-snapshot 
property on the receiving side, and set the readonly property on the 
receiving side, to prevent accidental changes (including auto-snapshots.)


Question #1: Actually, do I need to remove the auto-snapshot on the 
receiving side? Or is it sufficient to simply set the readonly 
property? Will the readonly property prevent auto-snapshots from occurring?


So then, sometime later, I want to send an incremental replication 
stream. I need to name an incremental source snap on the sending 
side... which needs to be the latest matching snap that exists on both 
sides.


Question #2: What's the best way to find the latest matching snap on 
both the source and destination? At present, it seems, I'll have to 
build a list of sender snaps, and a list of receiver snaps, and parse 
and search them, till I find the latest one that exists in both. For 
shell scripting, this is very non-trivial.




That's pretty much how I do it.  Get the two (sorted) sets of snapshots, 
remove those that only exist on the remote end (ageing) and send those 
that only exist locally.  The first incremental pair will be the last 
common snapshot and the first unique local snapshot.


I haven't tried this in a script, but it's quite straightforward in C++ 
using the standard library set container and algorithms.
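
The same set differences can be approximated in shell. This is a sketch
using grep in place of the C++ set algorithms mentioned above, with
hypothetical function names. It assumes both lists are non-empty and hold
comparable names (for example only the part after the @ if the datasets are
named differently on each side).

```sh
#!/bin/sh
# Both functions take two files of snapshot names, one per line, in
# creation order: $1 = local (source) list, $2 = remote (target) list.
# grep -F -x -f treats every line of the pattern file as a fixed string
# that must match a whole line; -v inverts the selection.

# Snapshots that exist only on the remote end (aged out locally):
# candidates to destroy there.
remote_only() {
    grep -Fxv -f "$1" "$2"
}

# Snapshots that exist only locally: candidates to send, oldest first.
local_only() {
    grep -Fxv -f "$2" "$1"
}

# Possible use: the first line of local_only is the first unique local
# snapshot, i.e. the second half of the first incremental pair.
#   local_only local.txt remote.txt | head -n 1
```

Unlike sorting both lists for comm, this keeps the creation order that
zfs list produced.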


--
Ian.



Re: [zfs-discuss] scripting incremental replication data streams

2012-09-12 Thread Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)
 From: Richard Elling [mailto:richard.ell...@gmail.com]
 
 Question #2:  What's the best way to find the latest matching snap on both
 the source and destination?  At present, it seems, I'll have to build a list
 of sender snaps, and a list of receiver snaps, and parse and search them,
 till I find the latest one that exists in both.  For shell scripting, this
 is very non-trivial.
 
 Actually, it is quite easy. You will notice that zfs list -t snapshot shows
 the list in creation time order.

Actually, I already knew that.  But at the time of initial send, the latest 
snap is most likely a frequent, which will most likely not exist at a later 
time to be the base of the incremental.  Which means, during the incremental, 
an arbitrary number of the latest snaps on both the sender and receiver 
probably need to be ignored.  In other words, I can't just use a fixed tail or 
awk command...  And I can't sort alphabetically...  

In shell programming land (unless I want to use Python or something) I'll 
have to nest a for-loop in a for-loop.

latestmatch=
# -H drops the header line and -o name prints only the snapshot names
sendersnaps=`ssh otherhost zfs list -H -t snapshot -o name | grep "$FILESYSTEM"`
receiversnaps=`zfs list -H -t snapshot -o name | grep "$FILESYSTEM"`
# Sender snaps are listed oldest first, so the last match found is the latest
for sendersnap in $sendersnaps ; do
  for receiversnap in $receiversnaps ; do
    if [ "$sendersnap" = "$receiversnap" ] ; then
      latestmatch=$sendersnap
    fi
  done
done

if [ -z "$latestmatch" ] ; then
  echo "No matching snaps, can't send incremental."
  # Do a full send, or abort
else
  echo "Doing incremental"
  ssh otherhost zfs send -I "$latestmatch"
  # ... etc etc...
fi
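
For comparison, the nested loop can be collapsed into a single pass with
awk. This is a sketch rather than a drop-in replacement: the function name
is made up, and it reads the two snapshot lists from files instead of shell
variables.

```sh
#!/bin/sh
# Print the latest snapshot name that appears in both lists.
# $1 = file with receiver snapshot names, $2 = file with sender snapshot
# names, each one per line, oldest first.
latest_common_snap() {
    awk '
        NR == FNR  { have[$0] = 1; next }  # first file: remember receiver snaps
        $0 in have { last = $0 }           # second file: keep the newest match
        END        { if (last != "") print last }
    ' "$1" "$2"
}

# Possible use (otherhost is the sender, as in the script above):
#   zfs list -H -t snapshot -o name | grep "$FILESYSTEM" > /tmp/recv.$$
#   ssh otherhost zfs list -H -t snapshot -o name |
#       grep "$FILESYSTEM" > /tmp/send.$$
#   latestmatch=$(latest_common_snap /tmp/recv.$$ /tmp/send.$$)
```

Because the sender list is read in creation order, the last match seen is
the latest common snapshot, however many newer frequent snaps follow it on
either side.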




Re: [zfs-discuss] scripting incremental replication data streams

2012-09-12 Thread Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
 
 Question #2:  What's the best way to find the latest matching snap on both
 the source and destination?  At present, it seems, I'll have to build a list
 of sender snaps, and a list of receiver snaps, and parse and search them,
 till I find the latest one that exists in both.  For shell scripting, this
 is very non-trivial.

Someone replied to me off-list and said:

Try http://blog.infrageeks.com/auto-replicate/

Does everything you're looking for in a portable shell script.

The auto-backup adds in zfs holds



Re: [zfs-discuss] scripting incremental replication data streams

2012-09-12 Thread Timothy Coalson
Unless I'm missing something, they didn't solve the matching snapshots
thing yet.  From their site:

To Do:

Additional error handling for mismatched snapshots (last destination snap
no longer exists on the source) walk backwards through the remote snaps
until a common snapshot is found and destroy non-matching remote snapshots

As more evidence, the script doesn't contain any for loops, and has
comments that indicate it just uses the latest snapshot on the target.  It
also doesn't appear that the script uses zfs hold anywhere, so I don't
think it will stop the auto-snapshots from disappearing on the source (I'm
also not sure how the auto-snap service will respond to a failure to
destroy a held snapshot; I've had it go into maintenance mode on me for
just changing the timezone).

Why are you trying to solve this harder problem, rather than using your own
snapshots that the auto-snapshot service won't destroy?  If you make a
snapshot that doesn't start with zfs-auto-snap, then use it (them) as the
endpoint(s) for the transfer, adding holds if desired, this entire mess of
matching snapshots that may or may not still exist just goes away.
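
One way the holds could fit into such a step, sketched with made-up names
(the hold tag "repl", the helper repl_step_cmds, the dataset and the host).
The helper only prints the commands for review; it is an illustration, not
the script discussed in this thread.

```sh
#!/bin/sh
# Print the commands for one replication step that uses our own snapshots
# as endpoints and protects the new endpoint with a hold.
# $1 = dataset, $2 = previous replication snapshot, $3 = new replication
# snapshot, $4 = target host. The tag "repl" is a placeholder.
repl_step_cmds() {
    fs=$1 old=$2 new=$3 host=$4
    echo "zfs snapshot $fs@$new"
    echo "zfs hold repl $fs@$new"
    echo "zfs send -I $fs@$old $fs@$new | ssh $host zfs receive $fs"
    echo "zfs release repl $fs@$old"
    echo "zfs destroy $fs@$old"
}

# Example:
#   repl_step_cmds tank/data repl-1 repl-2 backuphost
```

The old endpoint is released and destroyed only after the send line, so a
failed transfer would still leave a common snapshot to retry from if the
steps are run one at a time and stopped on error.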

Tim

On Wed, Sep 12, 2012 at 5:05 PM, Edward Ned Harvey
(opensolarisisdeadlongliveopensolaris) 
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

  From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
  boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
 
  Question #2:  What's the best way to find the latest matching snap on both
  the source and destination?  At present, it seems, I'll have to build a
  list of sender snaps, and a list of receiver snaps, and parse and search
  them, till I find the latest one that exists in both.  For shell
  scripting, this is very non-trivial.

 Someone replied to me off-list and said:

 Try http://blog.infrageeks.com/auto-replicate/

 Does everything you're looking for in a portable shell script.

 The auto-backup adds in zfs holds



Re: [zfs-discuss] scripting incremental replication data streams

2012-09-12 Thread Ian Collins

On 09/13/12 10:23 AM, Timothy Coalson wrote:
Unless i'm missing something, they didn't solve the matching 
snapshots thing yet, from their site:


To Do:

Additional error handling for mismatched snapshots (last destination 
snap no longer exists on the source) walk backwards through the remote 
snaps until a common snapshot is found and destroy non-matching remote 
snapshots




That's what I do as part of my "destroy snapshots not on the source" 
check.  Over many years of managing various distributed systems, I've 
discovered the apparently simple tends to get complex!


--
Ian.



Re: [zfs-discuss] scripting incremental replication data streams

2012-09-12 Thread Timothy Coalson
On Wed, Sep 12, 2012 at 7:16 PM, Ian Collins i...@ianshome.com wrote:

 On 09/13/12 10:23 AM, Timothy Coalson wrote:

 Unless i'm missing something, they didn't solve the matching snapshots
 thing yet, from their site:

 To Do:

 Additional error handling for mismatched snapshots (last destination snap
 no longer exists on the source) walk backwards through the remote snaps
 until a common snapshot is found and destroy non-matching remote snapshots


 That's what I do as part of my "destroy snapshots not on the source"
 check.  Over many years of managing various distributed systems, I've
 discovered the apparently simple tends to get complex!


I tricked the auto-snapshot service into doing that bit for me on the
receiving end, so I can just use zfs send -I.  However, because of when it
decides to do the cleanup for each category (it won't clean up any old
daily snapshots unless the auto-snapshot property is true when it is time
to take a daily snap), adding it to cron requires more care than I'd like.
Since I am only replicating between two hosts, though, I can deal with it.
I have been tempted to write a cron script to replace the auto-snap
service, to remove its naming classes, cleanup oddities, etc, but haven't
had a good reason to yet.

Tim


Re: [zfs-discuss] scripting incremental replication data streams

2012-09-12 Thread Chris Nagele
 I send a replication data stream from one host to another. (and receive).

Have you looked at snapsend?

  http://labs.omniti.com/labs/tools/browser/trunk/snapsend.sh

We've used this for a while in production for our backups and
replication. We adapted it a bit so it runs every minute without
overlapping (through a daemon process) and sends email alerts when
errors occur. We'll be updating it in the next week or so some more.
We want the process to run on the remote host instead of the primary
and have it save x days of snaps for backups as well. This way it will
save the last 60 minutes of snapshots (which are sent every minute or
less) as well as one snapshot per day for 30 days. All snapshots are
created on the primary host and pulled to the remote host to avoid
issues.
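
For anyone who wants the "every minute without overlapping" behaviour from
plain cron rather than a daemon, a lock directory is one common substitute.
The sketch below is not taken from snapsend or from the adaptation
described above; the function name and lock path are made up.

```sh
#!/bin/sh
# Run a command only if no earlier run still holds the lock.
# mkdir is atomic, so two runs cannot both create the directory.
# $1 = lock directory (placeholder path), remaining args = command to run.
run_exclusive() {
    lock=$1
    shift
    if mkdir "$lock" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$lock"
        return $status
    fi
    echo "previous run still active, skipping" >&2
    return 75
}

# Possible crontab use, once a minute (paths are placeholders):
#   run_exclusive /var/run/snapsend.lock /usr/local/bin/snapsend.sh
```

A run that is killed before rmdir leaves a stale lock behind, so a real
deployment would also want a way to detect and clear that.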

If you are interested we can push the code to github once we test it.

-Chris