Re: Musings on ZFS Backup strategies

2013-03-07 Thread George Kontostanos
I have found that the use of mbuffer really speeds up the differential
transfer process:

#!/bin/sh
export PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin

pool=zroot
destination=tank
host=1.2.3.4
type=daily    # snapshot-name prefix (left undefined in the original posting)

today=`date "+$type-%Y-%m-%d"`
yesterday=`date -v -1d "+$type-%Y-%m-%d"`

# create today's snapshot
snapshot_today=$pool@$today
# look for a snapshot with this name
if zfs list -H -o name -t snapshot | sort | grep "$snapshot_today$" > /dev/null; then
    echo "snapshot $snapshot_today already exists"
    exit 1
else
    echo "taking today's snapshot $snapshot_today" | sendmail root
    zfs snapshot -r $snapshot_today
fi

# look for yesterday's snapshot
snapshot_yesterday=$pool@$yesterday

if zfs list -H -o name -t snapshot | sort | grep "$snapshot_yesterday$" > /dev/null; then

    echo "yesterday's snapshot $snapshot_yesterday exists, proceeding with backup"

    zfs send -R -i $snapshot_yesterday $snapshot_today | \
        mbuffer -q -v 0 -s 128k -m 1G | \
        ssh root@$host "mbuffer -s 128k -m 1G | zfs receive -Fd $destination" > /dev/null

    echo "backup complete, destroying yesterday's snapshot" | sendmail root
    zfs destroy -r $snapshot_yesterday
    echo "Backup done" | sendmail root
    exit 0
else
    echo "missing yesterday's snapshot $snapshot_yesterday, aborting"
    exit 1
fi


-- 
George Kontostanos
---
http://www.aisecure.net


Re: Musings on ZFS Backup strategies

2013-03-04 Thread Volodymyr Kostyrko

02.03.2013 03:12, David Magda:


On Mar 1, 2013, at 12:55, Volodymyr Kostyrko wrote:


Yes, I work with backups the same way. I wrote a simple script that 
synchronizes two filesystems between distant servers. I also use the same 
script to synchronize bushy filesystems (with hundreds of thousands of files) 
where rsync creates too big a load for synchronizing.

https://github.com/kworr/zfSnap/commit/08d8b499dbc2527a652cddbc601c7ee8c0c23301


There are quite a few scripts out there:

http://www.freshports.org/search.php?query=zfs


A lot of them require python or ruby, and none of them manages 
synchronizing snapshots over the network.



For file level copying, where you don't want to walk the entire tree, here is the 
zfs diff command:


zfs diff [-FHt] snapshot [snapshot|filesystem]

 Describes differences between a snapshot and a successor dataset. The
 successor dataset can be a later snapshot or the current filesystem.

 The changed files are displayed including the change type. The change
 type is displayed using a single character. If a file or directory
 was renamed, the old and the new names are displayed.


http://www.freebsd.org/cgi/man.cgi?query=zfs

This allows one to get a quick list of files and directories, then use 
tar/rsync/cp/etc. to do the actual copy (where the destination does not have to 
be ZFS: e.g., NFS, ext4, Lustre, HDFS, etc.).
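
For instance, a minimal sketch along those lines (the dataset, snapshot names,
and destination are purely illustrative): list the paths that were added (+)
or modified (M) between two snapshots, then let rsync copy just those files to
a non-ZFS destination.

  # Collect changed paths from the scripted (-H, tab-separated) zfs diff output.
  zfs diff -H tank/data@monday tank/data@tuesday | \
      awk -F'\t' '$1 == "+" || $1 == "M" { print $2 }' > /tmp/changed-files
  # --files-from treats the listed paths as relative to the source ("/"),
  # so the directory structure is preserved under the destination.
  rsync -a --files-from=/tmp/changed-files / backuphost:/export/file-copy/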


I know that, but I see no reason to revert to file-based sync if I 
can do block-based.


--
Sphinx of black quartz, judge my vow.


Re: Musings on ZFS Backup strategies

2013-03-04 Thread David Magda
On Mon, March 4, 2013 11:07, Volodymyr Kostyrko wrote:
 02.03.2013 03:12, David Magda:
 There are quite a few scripts out there:

  http://www.freshports.org/search.php?query=zfs

 A lot of them require python or ruby, and none of them manages
 synchronizing snapshots over the network.

Yes, but I think it is worth considering the creation of snapshots, and
the transfer of snapshots, as two separate steps. By treating them
independently (perhaps in two different scripts), it helps prevent the
breakage in one from affecting the other.

Snapshots are not backups (IMHO), but they are handy for users and
sysadmins for the simple situations of accidentally deleted files. If your
network access / copying breaks or is slow for some reason, at least you
still have copies locally. Similarly if you're having issues with the
machine that keeps your remote pool.

By keeping the snapshots going separately, once any problems with the
network or remote server are solved, you can use them to incrementally
sync up the remote pool. You can simply run the remote-sync scripts more
often to do the catch up.

It's just an idea, and everyone has different needs. I often find it handy
to keep different steps in different scripts that are loosely coupled.
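
As a purely hypothetical illustration of that decoupling (the script names and
schedule are invented, not taken from any particular tool), the two steps can
simply run as independent cron jobs:

  # /etc/crontab -- snapshot creation and off-host replication run separately,
  # so a network or remote-server outage never blocks local snapshots.
  0   *  *  *  *  root  /usr/local/sbin/make-snapshots.sh
  30  *  *  *  *  root  /usr/local/sbin/sync-snapshots-to-remote.sh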

 This allows one to get a quick list of files and directories, then use
 tar/rsync/cp/etc. to do the actual copy (where the destination does not
 have to be ZFS: e.g., NFS, ext4, Lustre, HDFS, etc.).

 I know that but I see no reason in reverting to file-based synch if I
 can do block-based.

Sure. I just thought I'd mention it in the thread in case others do need
that functionality and were not aware of zfs diff. Not everyone does or
can do pool-to-pool backups.




Re: Musings on ZFS Backup strategies

2013-03-04 Thread Volodymyr Kostyrko

04.03.2013 19:04, David Magda:

On Mon, March 4, 2013 11:07, Volodymyr Kostyrko wrote:

02.03.2013 03:12, David Magda:

There are quite a few scripts out there:

http://www.freshports.org/search.php?query=zfs


A lot of them require python or ruby, and none of them manages
synchronizing snapshots over the network.


Yes, but I think it is worth considering the creation of snapshots, and
the transfer of snapshots, as two separate steps. By treating them
independently (perhaps in two different scripts), it helps prevent the
breakage in one from affecting the other.


Exactly. My script is just an addition to zfSnap or any other tool that 
manages snapshots. Currently it does nothing more than comparing the lists of 
available snapshots and doing the network transfer.



Snapshots are not backups (IMHO), but they are handy for users and
sysadmins for the simple situations of accidentally deleted files. If your
network access / copying breaks or is slow for some reason, at least you
still have copies locally. Similarly if you're having issues with the
machine that keeps your remote pool.


Yes, I addressed that specifically by adding the ability to restart the 
transfer from any point, or simply not to care - once initiated, the 
process is autonomous and, in case of failure, everything is rolled 
back to the last known good snapshot. I also added the possibility to 
compress and rate-limit the traffic.



By keeping the snapshots going separately, once any problems with the
network or remote server are solved, you can use them to incrementally
sync up the remote pool. You can simply run the remote-sync scripts more
often to do the catch up.

It's just an idea, and everyone has different needs. I often find it handy
to keep different steps in different scripts that are loosely coupled.


I just tried to give another use for snapshots. Or at least a way to 
simplify things in one specific situation.


--
Sphinx of black quartz, judge my vow.


Re: Musings on ZFS Backup strategies

2013-03-02 Thread Ronald Klop
On Fri, 01 Mar 2013 21:34:39 +0100, Daniel Eischen deisc...@freebsd.org  
wrote:



On Fri, 1 Mar 2013, Ben Morrow wrote:


Quoth Daniel Eischen deisc...@freebsd.org:


Yes, we still use a couple of DLT autoloaders and have nightly
incrementals and weekly fulls.  This is the problem I have with
converting to ZFS.  Our typical recovery is when a user says
they need a directory or set of files from a week or two ago.
Using dump from tape, I can easily extract *just* the necessary
files.  I don't need a second system to restore to, so that
I can then extract the file.


As Karl said originally, you can do that with snapshots without having
to go to your backups at all. With the right arrangements (symlinks to
the .zfs/snapshot/* directories, or just setting the snapdir property to
'visible') you can make it so users can do this sort of restore
themselves without having to go through you.


It wasn't clear that snapshots were traversable as a normal
directory structure.  I was thinking it was just a blob
that you had to roll back to in order to get anything out
of it.


That is the main benefit of snapshots. :-) You can also very easily diff  
files between them.

Most data is static, so it does not cost a lot to keep snapshots.
There are a lot of scripts online and in ports which implement a nice retention  
policy, e.g. 7 daily snapshots, 8 weekly, 12 monthly, 2 yearly. See  
below for (an incomplete list of) what I keep of my homedir at home.



Under our current scheme, we would remove snapshots
after the next (weekly) full zfs send (nee dump), so
it wouldn't help unless we kept snapshots around a
lot longer.


Why not?


Am I correct in assuming that one could:

   # zfs send -R snapshot | dd obs=10240 of=/dev/rst0

to archive it to tape instead of another [system:]drive?


Yes, you are correct. The manual page about zfs send says: 'The format of  
the stream is committed. You will be able to receive your streams on  
future versions of ZFS.'
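
The restore direction would then be the mirror image - a sketch only, with an
illustrative tape device and pool name:

  # Read the stream back off tape (matching the 10240-byte blocking used when
  # writing) and recreate the dataset(s) on the pool.
  dd if=/dev/rst0 ibs=10240 | zfs receive -Fd tank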



Ronald.



NAME                                      USED  AVAIL  REFER  MOUNTPOINT
tank/home                                 115G  65.6G  53.6G  /home
tank/home@auto-2011-10-25_19.00.yearly   16.3G      -  56.8G  -
tank/home@auto-2012-06-06_22.00.yearly   5.55G      -  53.3G  -
tank/home@auto-2012-09-02_20.00.monthly  2.61G      -  49.3G  -
tank/home@auto-2012-10-15_06.00.monthly  2.22G      -  49.9G  -
tank/home@auto-2012-11-26_13.00.monthly  2.47G      -  50.2G  -
tank/home@auto-2013-01-07_13.00.monthly  2.56G      -  51.5G  -
tank/home@auto-2013-01-21_13.00.weekly   1.06G      -  52.4G  -
tank/home@auto-2013-01-28_13.00.weekly    409M      -  52.3G  -
tank/home@auto-2013-02-04_13.00.monthly   625M      -  52.5G  -
tank/home@auto-2013-02-11_13.00.weekly    689M      -  52.5G  -
tank/home@auto-2013-02-16_13.00.weekly   17.7M      -  52.5G  -
tank/home@auto-2013-02-17_13.00.daily    17.7M      -  52.5G  -
tank/home@auto-2013-02-18_13.00.daily    17.9M      -  52.5G  -



Re: Musings on ZFS Backup strategies

2013-03-02 Thread Ronald Klop
On Fri, 01 Mar 2013 18:55:22 +0100, Volodymyr Kostyrko c.kw...@gmail.com  
wrote:



01.03.2013 16:24, Karl Denninger:

Dabbling with ZFS now, and giving some thought to how to handle backup
strategies.

ZFS' snapshot capabilities have forced me to re-think the way that I've
handled this.  Previously near-line (and offline) backup was focused on
being able to handle both disasters (e.g. RAID adapter goes nuts and
scribbles on the entire contents of the array), a double-disk (or worse)
failure, or the obvious (e.g. fire, etc) along with the aw crap, I just
rm -rf'd something I'd rather not!

ZFS makes snapshots very cheap, which means you can resolve the aw
crap situation without resorting to backups at all.  This turns the
backup situation into a disaster recovery one.

And that in turn seems to say that the ideal strategy looks more like:

Take a base snapshot immediately and zfs send it to offline storage.
Take an incremental at some interval (appropriate for disaster recovery)
and zfs send THAT to stable storage.

If I then restore the base and snapshot, I get back to where I was when
the latest snapshot was taken.  I don't need to keep the incremental
snapshot for longer than it takes to zfs send it, so I can do:

zfs snapshot pool/some-filesystem@unique-label
zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
zfs destroy pool/some-filesystem@unique-label

and that seems to work (and restore) just fine.


Yes, I work with backups the same way. I wrote a simple script that  
synchronizes two filesystems between distant servers. I also use the  
same script to synchronize bushy filesystems (with hundreds of thousands of



Your filesystems grow a lot of hair? :-)





files) where rsync creates too big a load for synchronizing.

https://github.com/kworr/zfSnap/commit/08d8b499dbc2527a652cddbc601c7ee8c0c23301

I left it where it was, but I was also planning to write a purger for  
snapshots that would automatically remove them when the pool gets low on  
space. I've never hit that yet.



Am I looking at this the right way here?  Provided that the base backup
and incremental are both readable, it appears that I have the disaster
case covered, and the online snapshot increments and retention are
easily adjusted and cover the oops situations without having to resort
to the backups at all.

This in turn means that keeping more than two incremental dumps offline
has little or no value; the second merely being taken to insure that
there is always at least one that has been written to completion without
error to apply on top of the base.  That in turn makes the backup
storage requirement based only on entropy in the filesystem and not time
(where the tower of Hanoi style dump hierarchy imposed both a time AND
entropy cost on backup media.)


Well, snapshots can have value over a longer timeframe, depending on the  
data. Being able to restore a file accidentally deleted two months ago  
already saved $2k for one of our customers.



Re: Musings on ZFS Backup strategies

2013-03-02 Thread David Magda
On Mar 1, 2013, at 21:14, Ben Morrow wrote:

 But since ZFS doesn't support POSIX.1e ACLs that's not terribly
 useful... I don't believe bsdtar/libarchive supports NFSv4 ACLs yet.

Ah yes, just noticed that. Thought it did.

https://github.com/libarchive/libarchive/wiki/TarNFS4ACLs



Re: Musings on ZFS Backup strategies

2013-03-02 Thread Peter Jeremy
On 2013-Mar-01 08:24:53 -0600, Karl Denninger k...@denninger.net wrote:
If I then restore the base and snapshot, I get back to where I was when
the latest snapshot was taken.  I don't need to keep the incremental
snapshot for longer than it takes to zfs send it, so I can do:

zfs snapshot pool/some-filesystem@unique-label
zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
zfs destroy pool/some-filesystem@unique-label

and that seems to work (and restore) just fine.

This gives you an incremental since the base snapshot - which will
probably grow in size over time.  If you are storing the ZFS send
streams on (eg) tape, rather than receiving them, you probably still
want the Towers of Hanoi style backup hierarchy to control your
backup volume.  It's also worth noting that whilst the stream will
contain the compression attributes of the filesystem(s) in it, the
actual data in the stream is uncompressed.

This in turn means that keeping more than two incremental dumps offline
has little or no value; the second merely being taken to insure that
there is always at least one that has been written to completion without
error to apply on top of the base.

This is quite a critical point with this style of backup: The ZFS send
stream is not intended as an archive format.  It includes error
detection but no error correction and any error in a stream renders
the whole stream unusable (you can't retrieve only part of a stream).
If you go this way, you probably want to wrap the stream in a FEC
container (eg based on ports/comms/libfec) and/or keep multiple copies.

The recommended approach is to do zfs send | zfs recv and store a
replica of your pool (with whatever level of RAID that meets your
needs).  This way, you immediately detect an error in the send stream
and can repeat the send.  You then use scrub to verify (and recover)
the replica.
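
A minimal sketch of that recommended approach (the host, pool, and snapshot
names here are invented):

  # Replicate into a real pool on the backup host, then scrub the replica so
  # every block's checksum is verified end-to-end.
  zfs snapshot -r tank@backup-2013-03-02
  zfs send -R tank@backup-2013-03-02 | ssh backuphost zfs receive -Fdu backuppool
  ssh backuphost zpool scrub backuppool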

(Yes, I know, I've been a ZFS resister ;-))

Resistance is futile. :-)

On 2013-Mar-01 15:34:39 -0500, Daniel Eischen deisc...@freebsd.org wrote:
It wasn't clear that snapshots were traversable as a normal
directory structure.  I was thinking it was just a blob
that you had to roll back to in order to get anything out
of it.

Snapshots appear in a .zfs/snapshot/SNAPSHOT_NAME directory at each
mountpoint and are accessible as a normal read-only directory
hierarchy below there.  OTOH, the send stream _is_ a blob.

Am I correct in assuming that one could:

   # zfs send -R snapshot | dd obs=10240 of=/dev/rst0

to archive it to tape instead of another [system:]drive?

Yes.  The output from zfs send is a stream of bytes that you can treat
as you would any other stream of bytes.  But this approach isn't
recommended.

-- 
Peter Jeremy




Re: Musings on ZFS Backup strategies

2013-03-02 Thread Karl Denninger

On 3/2/2013 4:14 PM, Peter Jeremy wrote:
 On 2013-Mar-01 08:24:53 -0600, Karl Denninger k...@denninger.net wrote:
 If I then restore the base and snapshot, I get back to where I was when
 the latest snapshot was taken.  I don't need to keep the incremental
 snapshot for longer than it takes to zfs send it, so I can do:

 zfs snapshot pool/some-filesystem@unique-label
 zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
 zfs destroy pool/some-filesystem@unique-label

 and that seems to work (and restore) just fine.
 This gives you an incremental since the base snapshot - which will
 probably grow in size over time.  If you are storing the ZFS send
 streams on (eg) tape, rather than receiving them, you probably still
 want the Towers of Hanoi style backup hierarchy to control your
 backup volume.  It's also worth noting that whilst the stream will
 contain the compression attributes of the filesystem(s) in it, the
 actual data is the stream in uncompressed
I noted that.  The script I wrote to do this looks at the compression
status in the filesystem and, if enabled, pipes the data stream through
pbzip2 on the way to storage.  The only problem with this presumption is
that for database filesystems, best practice says that you
should set the recordsize to the underlying page size of the
DBMS (e.g. 8k for PostgreSQL) for best performance and NOT enable
compression.

Reality however is that the on-disk format of most database files is
EXTREMELY compressible (often WELL better than 2:1), so I sacrifice
there.  I think the better option is to stuff a user parameter into the
filesystem attribute table (which apparently I can do without restriction)
telling the script whether or not to compress on output so it's not tied
to the filesystem's compression setting.
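
ZFS user properties (any property name containing a colon) would serve for
that; a rough sketch, where backup:compress is a made-up property name rather
than anything built in, and the dataset and snapshot names are illustrative:

  # Tag the filesystem once, then let the backup script decide how to send it.
  zfs set backup:compress=on tank/db/pgsql
  if [ "`zfs get -H -o value backup:compress tank/db/pgsql`" = "on" ]; then
      zfs send tank/db/pgsql@dump | pbzip2 > /backup/pgsql-dump.zfs.bz2
  else
      zfs send tank/db/pgsql@dump > /backup/pgsql-dump.zfs
  fi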

I'm quite-curious, in fact, as to whether the best practices really
are in today's world.  Specifically, for a CPU-laden machine with lots
of compute power I wonder if enabling compression on the database
filesystems and leaving the recordsize alone would be a net performance
win due to the reduction in actual I/O volume.  This assumes you have
the CPU available, of course, but that has gotten cheaper much faster
than I/O bandwidth has.

 This in turn means that keeping more than two incremental dumps offline
 has little or no value; the second merely being taken to insure that
 there is always at least one that has been written to completion without
 error to apply on top of the base.
 This is quite a critical point with this style of backup: The ZFS send
 stream is not intended as an archive format.  It includes error
 detection but no error correction and any error in a stream renders
 the whole stream unusable (you can't retrieve only part of a stream).
 If you go this way, you probably want to wrap the stream in a FEC
 container (eg based on ports/comms/libfec) and/or keep multiple copies.
That's no more of a problem than it is for a dump file saved on a disk
though, is it?  While restore can (putatively) read past errors on a
tape, in reality if the storage is a disk and part of the file is
unreadable the REST of that particular archive is unreadable.  Skipping
unreadable records does sorta work for tapes, but it rarely if ever
does for storage onto a spinning device within the boundary of the
impacted file.

In practice I attempt to cover this by (1) saving the stream to local
disk and then (2) rsync'ing the first disk to a second in the same
cabinet.  If the file I just wrote is unreadable I should discover it at
(2), which hopefully is well before I actually need it in anger.  Disk
#2 then gets rotated out to an offsite vault on a regular schedule in
case the building catches fire or similar.  My exposure here is to
time-related bitrot which is a non-zero risk but I can't scrub a disk
that's sitting in a vault, so I don't know that there's a realistic
means around this risk other than a full online hotsite that I can
ship the snapshots to (which I don't have the necessary bandwidth or
storage to cover.)

If I change the backup media (currently UFS formatted) to ZFS formatted
and dump directly there via a zfs send/receive I could run both drives
as a mirror instead of rsync'ing from one to the other after the first
copy is done, then detach the mirror to rotate the drive out and attach
the other one, causing a resilver.  That's fine EXCEPT if I have a
controller go insane I now probably lose everything other than the
offsite copy since everything is up for write during the snapshot
operation.  That ain't so good and that's a risk I've had turn into
reality twice in 20 years.  On the upside if the primary has an error on
it I catch it when I try to resilver as that operation will fail since
the entire data structure that's on-disk and written has to be traversed
and the checksums should catch any silent corruption. If that happens I
know I'm naked (other than the vault copy which I hope is good!) until I
replace the 

Re: Musings on ZFS Backup strategies

2013-03-02 Thread Steven Hartland
- Original Message - 
From: Karl Denninger k...@denninger.net

Reality however is that the on-disk format of most database files is
EXTREMELY compressible (often WELL better than 2:1), so I sacrifice
there.  I think the better option is to stuff a user parameter into the
filesystem attribute table (which apparently I can do without boundary)
telling the script whether or not to compress on output so it's not tied
to the filesystem's compression setting.

I'm quite-curious, in fact, as to whether the best practices really
are in today's world.  Specifically, for a CPU-laden machine with lots
of compute power I wonder if enabling compression on the database
filesystems and leaving the recordsize alone would be a net performance
win due to the reduction in actual I/O volume.  This assumes you have
the CPU available, of course, but that has gotten cheaper much faster
than I/O bandwidth has.


We've been using ZFS compression on mysql filesystems for quite some
time and have had good success with it. It is dependent on the HW, as
you say, though, so you need to know where the bottleneck is in your
system: cpu or disk.

mysql 5.6 also added better recordsize support which could be interesting.

Also be aware of the additional latency the compression can add. I'm
also not 100% sure that the compression in ZFS scales beyond one core;
it's been something I've meant to look into / test but not got round
to.

   Regards
   Steve





Re: Musings on ZFS Backup strategies

2013-03-02 Thread John
The recommended approach is to do zfs send | zfs recv and store a
replica of your pool (with whatever level of RAID that meets your
needs).  This way, you immediately detect an error in the send stream
and can repeat the send.  You then use scrub to verify (and recover)
the replica.

I do zfs send | zfs recv from several machines to a backup server in a
different building. Each day an incremental send is done using the previous
day's incremental send as the base. One reason for this approach is to minimize
the amount of bandwidth required since one of the machines is across a T1.

This technique requires keeping a record of the current base snapshot for each
filesystem, and a system in place to keep from destroying the base snapshot.
I learned the latter the hard way when a machine went down for several days,
and when it came back up the script that destroys out-of-date snapshots deleted
the incremental base snapshot.

I'm running 9.1-stable with zpool features on my machines, and with this upgrade
came zfs hold and zfs release. This allows you to lock a snapshot so it can't
be destroyed until it's released. With this feature, I do the following for
each filesystem:

zfs send -i yesterdays_snapshot todays_snapshot | ssh backup_server zfs recv
on success:
  zfs hold todays_snapshot
  zfs release yesterdays_snapshot
  ssh backup_server zfs hold todays_snapshot
  ssh backup_server zfs release yesterdays_snapshot
  update zfs_send_dates file with filesystem and snapshot name
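
Spelled out with concrete syntax (zfs hold and zfs release take a tag name;
the hosts, pool, snapshot names, the "backup" tag, and the record-file path
below are all illustrative), one iteration of that cycle might look like:

  #!/bin/sh
  # One incremental cycle, with holds protecting the current base snapshot on
  # both ends so a cleanup script cannot destroy it.
  prev=tank/data@2013-03-01
  cur=tank/data@2013-03-02
  if zfs send -i "$prev" "$cur" | ssh backup_server zfs recv -d backuppool; then
      zfs hold backup "$cur"
      zfs release backup "$prev"
      ssh backup_server zfs hold backup backuppool/data@2013-03-02
      ssh backup_server zfs release backup backuppool/data@2013-03-01
      echo "tank/data tank/data@2013-03-02" >> /var/db/zfs_send_dates
  fi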


John Theus
TheUsGroup.com


Re: Musings on ZFS Backup strategies

2013-03-02 Thread Karl Denninger
Quoth Ben Morrow:
 I don't know what medium you're backing up to (does anyone use tape any
 more?) but when backing up to disk I much prefer to keep the backup in
 the form of a filesystem rather than as 'zfs send' streams. One reason
 for this is that I believe that new versions of the ZFS code are more
 likely to be able to correctly read old versions of the filesystem than
 old versions of the stream format; this may not be correct any more,
 though.

 Another reason is that it means I can do 'rolling snapshot' backups. I
 do an initial dump like this

 # zpool is my working pool
 # bakpool is a second pool I am backing up to

 zfs snapshot -r zpool/fs@dump
 zfs send -R zpool/fs@dump | zfs recv -vFd bakpool

 That pipe can obviously go through ssh or whatever to put the backup on
 a different machine. Then to make an increment I roll forward the
 snapshot like this

 zfs rename -r zpool/fs@dump dump-old
 zfs snapshot -r zpool/fs@dump
 zfs send -R -I @dump-old zpool/fs@dump | zfs recv -vFd bakpool
 zfs destroy -r zpool/fs@dump-old
 zfs destroy -r bakpool/fs@dump-old

 (Notice that the increment starts at a snapshot called @dump-old on the
 send side but at a snapshot called @dump on the recv side. ZFS can
 handle this perfectly well, since it identifies snapshots by UUID, and
 will rename the bakpool snapshot as part of the recv.)

 This brings the filesystem on bakpool up to date with the filesystem on
 zpool, including all snapshots, but never creates an increment with more
 than one backup interval's worth of data in. If you want to keep more
 history on the backup pool than the source pool, you can hold off on
 destroying the old snapshots, and instead rename them to something
 unique. (Of course, you could always give them unique names to start
 with, but I find it more convenient not to.)

Uh, I see a potential problem here.

What if the zfs send | zfs recv command fails for some reason before
completion?  I have noted that zfs recv is atomic -- if it fails for any
reason the entire receive is rolled back like it never happened.

But you then destroy the old snapshot, and the next time this runs the
new gets rolled down.  It would appear that there's an increment
missing, never to be seen again.

What gets lost in that circumstance?  Anything changed between the two
times -- and silently at that? (yikes!)

-- 
-- Karl Denninger
/The Market Ticker ®/ http://market-ticker.org
Cuda Systems LLC


Re: Musings on ZFS Backup strategies

2013-03-02 Thread Ben Morrow
Quoth Karl Denninger k...@denninger.net:
 Quoth Ben Morrow:
  I don't know what medium you're backing up to (does anyone use tape any
  more?) but when backing up to disk I much prefer to keep the backup in
  the form of a filesystem rather than as 'zfs send' streams. One reason
  for this is that I believe that new versions of the ZFS code are more
  likely to be able to correctly read old versions of the filesystem than
  old versions of the stream format; this may not be correct any more,
  though.
 
  Another reason is that it means I can do 'rolling snapshot' backups. I
  do an initial dump like this
 
  # zpool is my working pool
  # bakpool is a second pool I am backing up to
 
  zfs snapshot -r zpool/fs@dump
  zfs send -R zpool/fs@dump | zfs recv -vFd bakpool
 
  That pipe can obviously go through ssh or whatever to put the backup on
  a different machine. Then to make an increment I roll forward the
  snapshot like this
 
  zfs rename -r zpool/fs@dump dump-old
  zfs snapshot -r zpool/fs@dump
  zfs send -R -I @dump-old zpool/fs@dump | zfs recv -vFd bakpool
  zfs destroy -r zpool/fs@dump-old
  zfs destroy -r bakpool/fs@dump-old
 
  (Notice that the increment starts at a snapshot called @dump-old on the
  send side but at a snapshot called @dump on the recv side. ZFS can
  handle this perfectly well, since it identifies snapshots by UUID, and
  will rename the bakpool snapshot as part of the recv.)
 
  This brings the filesystem on bakpool up to date with the filesystem on
  zpool, including all snapshots, but never creates an increment with more
  than one backup interval's worth of data in. If you want to keep more
  history on the backup pool than the source pool, you can hold off on
  destroying the old snapshots, and instead rename them to something
  unique. (Of course, you could always give them unique names to start
  with, but I find it more convenient not to.)
 
 Uh, I see a potential problem here.
 
 What if the zfs send | zfs recv command fails for some reason before
 completion?  I have noted that zfs recv is atomic -- if it fails for any
 reason the entire receive is rolled back like it never happened.
 
 But you then destroy the old snapshot, and the next time this runs the
 new gets rolled down.  It would appear that there's an increment
 missing, never to be seen again.

No, if the recv fails my backup script aborts and doesn't delete the old
snapshot. Cleanup then means removing the new snapshot and renaming the
old back on the source zpool; in my case I do this by hand, but it could
be automated given enough thought. (The names of the snapshots on the
backup pool don't matter; they will be cleaned up by the next successful
recv.)
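
Using the snapshot names from the example above, that by-hand cleanup is
roughly:

  # After a failed recv: drop the new snapshot and put the old name back, so
  # the next run again starts from the last snapshot the backup pool has seen.
  zfs destroy -r zpool/fs@dump
  zfs rename -r zpool/fs@dump-old zpool/fs@dump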

 What gets lost in that circumstance?  Anything changed between the two
 times -- and silently at that? (yikes!)

It's impossible to recv an incremental stream on top of the wrong
snapshot (identified by UUID, not by its current name), so nothing can
get silently lost. A 'zfs recv -F' will find the correct starting
snapshot on the destination filesystem (assuming it's there) regardless
of its name, and roll forward to the state as of the end snapshot. If a
recv succeeds you can be sure nothing up to that point has been missed.

The worst that can happen is if you mistakenly delete the snapshot on
the source pool that marks the end of the last successful recv on the
backup pool; in that case you have to take an increment from further
back (which will therefore be a larger incremental stream than it needed
to be). The very worst case is if you end up without any snapshots in
common between the source and backup pools, and you have to start again
with a full dump.

Ben



Re: Musings on ZFS Backup strategies

2013-03-02 Thread Karl Denninger

On 3/2/2013 10:23 PM, Ben Morrow wrote:
 Quoth Karl Denninger k...@denninger.net:
 Quoth Ben Morrow:
 I don't know what medium you're backing up to (does anyone use tape any
 more?) but when backing up to disk I much prefer to keep the backup in
 the form of a filesystem rather than as 'zfs send' streams. One reason
 for this is that I believe that new versions of the ZFS code are more
 likely to be able to correctly read old versions of the filesystem than
 old versions of the stream format; this may not be correct any more,
 though.

 Another reason is that it means I can do 'rolling snapshot' backups. I
 do an initial dump like this

 # zpool is my working pool
 # bakpool is a second pool I am backing up to

 zfs snapshot -r zpool/fs@dump
 zfs send -R zpool/fs@dump | zfs recv -vFd bakpool

 That pipe can obviously go through ssh or whatever to put the backup on
 a different machine. Then to make an increment I roll forward the
 snapshot like this

 zfs rename -r zpool/fs@dump dump-old
 zfs snapshot -r zpool/fs@dump
 zfs send -R -I @dump-old zpool/fs@dump | zfs recv -vFd bakpool
 zfs destroy -r zpool/fs@dump-old
 zfs destroy -r bakpool/fs@dump-old

 (Notice that the increment starts at a snapshot called @dump-old on the
 send side but at a snapshot called @dump on the recv side. ZFS can
 handle this perfectly well, since it identifies snapshots by UUID, and
 will rename the bakpool snapshot as part of the recv.)

 This brings the filesystem on bakpool up to date with the filesystem on
 zpool, including all snapshots, but never creates an increment with more
 than one backup interval's worth of data in. If you want to keep more
 history on the backup pool than the source pool, you can hold off on
 destroying the old snapshots, and instead rename them to something
 unique. (Of course, you could always give them unique names to start
 with, but I find it more convenient not to.)
 Uh, I see a potential problem here.

 What if the zfs send | zfs recv command fails for some reason before
 completion?  I have noted that zfs recv is atomic -- if it fails for any
 reason the entire receive is rolled back like it never happened.

 But you then destroy the old snapshot, and the next time this runs the
 new gets rolled down.  It would appear that there's an increment
 missing, never to be seen again.
 No, if the recv fails my backup script aborts and doesn't delete the old
 snapshot. Cleanup then means removing the new snapshot and renaming the
 old back on the source zpool; in my case I do this by hand, but it could
 be automated given enough thought. (The names of the snapshots on the
 backup pool don't matter; they will be cleaned up by the next successful
 recv.)
I was concerned that if the snapshot you rolled to old gets destroyed
without the backup being successful, then you're screwed, as you've lost
the context.  I presume that zfs recv properly sets the exit code
non-zero if something's wrong (I would hope so!)
 What gets lost in that circumstance?  Anything changed between the two
 times -- and silently at that? (yikes!)
 It's impossible to recv an incremental stream on top of the wrong
 snapshot (identified by UUID, not by its current name), so nothing can
 get silently lost. A 'zfs recv -F' will find the correct starting
 snapshot on the destination filesystem (assuming it's there) regardless
 of its name, and roll forward to the state as of the end snapshot. If a
 recv succeeds you can be sure nothing up to that point has been missed.
Ah, ok.  THAT I did not understand.  So the zfs recv process checks what
it's about to apply the delta against, and if it can't find a consistent
place to start it barfs rather than screwing you.  That's good.  As long as
it gets caught I can live with it.  Recovery isn't a terrible pain in
the butt so long as it CAN be recovered.  It's the potential for silent
failures that scare the bejeezus out of me for all the obvious reasons.
 The worst that can happen is if you mistakenly delete the snapshot on
 the source pool that marks the end of the last successful recv on the
 backup pool; in that case you have to take an increment from further
 back (which will therefore be a larger incremental stream than it needed
 to be). The very worst case is if you end up without any snapshots in
 common between the source and backup pools, and you have to start again
 with a full dump.

 Ben
Got it.

That's not great in that it could force a new full copy, but it's also
not the end of the world.  In my case I am already automatically taking
daily and 4-hour snaps, keeping a week's worth around, which is more
than enough time to be able to obtain a consistent place to go from. 
That should be ok then.

I think I'm going to play with this and see what I think of it.  One
thing that is very attractive to this design is to have the receiving
side be a mirror, then to rotate to the vault copy run a scrub (to
insure that both members are consistent at a checksum level), break the
mirror and put one in the vault, replacing it with the drive coming FROM
the vault, then do a zpool replace and allow it to resilver into the
other drive.  You now have the two in consistent state again locally if
the pool pukes and one in the vault in the event of a fire or other
entire facility is toast event.

Re: Musings on ZFS Backup strategies

2013-03-02 Thread Phil Regnauld
Karl Denninger (karl) writes:
 
 I think I'm going to play with this and see what I think of it.  One
 thing that is very attractive to this design is to have the receiving
 side be a mirror, then to rotate to the vault copy run a scrub (to
 insure that both members are consistent at a checksum level), break the
 mirror and put one in the vault, replacing it with the drive coming FROM
 the vault, then do a zpool replace and allow it to resilver into the
 other drive.  You now have the two in consistent state again locally if
 the pool pukes and one in the vault in the event of a fire or other
 entire facility is toast event.

That's one solution.

 The only risk that makes me uncomfortable doing this is that the pool is
 always active when the system is running.  With UFS backup disks it's
 not -- except when being actually written to they're unmounted, and this
 materially decreases the risk of an insane adapter scribbling the
 drives, since there is no I/O at all going to them unless mounted. 
 While the backup pool would be nominally idle it is probably
 more-exposed to a potential scribble than the UFS-mounted packs would be.

Could zpool export in between syncs on the target, assuming that's not
your root pool :)

Cheers,
Phil


Re: Musings on ZFS Backup strategies

2013-03-02 Thread Ben Morrow
Quoth Phil Regnauld regna...@x0.dk:
 
  The only risk that makes me uncomfortable doing this is that the pool is
  always active when the system is running.  With UFS backup disks it's
  not -- except when being actually written to they're unmounted, and this
  materially decreases the risk of an insane adapter scribbling the
  drives, since there is no I/O at all going to them unless mounted. 
  While the backup pool would be nominally idle it is probably
  more-exposed to a potential scribble than the UFS-mounted packs would be.
 
   Could zpool export in between syncs on the target, assuming that's not
   your root pool :)

If I were feeling paranoid I might be tempted to not only keep the pool
exported when not in use, but to 'zpool offline' one half of the mirror
while performing the receive, then put it back online and allow it to
resilver before exporting the whole pool again. I'm not sure if there's
any way to wait for the resilver to finish except to poll 'zpool
status', though.
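
One crude way to do that wait (a sketch; the pool name is illustrative) is
simply to poll the status output:

  # Block until zpool status no longer reports a resilver in progress.
  while zpool status bakpool | grep -q 'resilver in progress'; do
      sleep 60
  done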

Ben



Musings on ZFS Backup strategies

2013-03-01 Thread Karl Denninger
Dabbling with ZFS now, and giving some thought to how to handle backup
strategies.

ZFS' snapshot capabilities have forced me to re-think the way that I've
handled this.  Previously near-line (and offline) backup was focused on
being able to handle both disasters (e.g. RAID adapter goes nuts and
scribbles on the entire contents of the array), a double-disk (or worse)
failure, or the obvious (e.g. fire, etc) along with the aw crap, I just
rm -rf'd something I'd rather not!

ZFS makes snapshots very cheap, which means you can resolve the aw
crap situation without resorting to backups at all.  This turns the
backup situation into a disaster recovery one.

And that in turn seems to say that the ideal strategy looks more like:

Take a base snapshot immediately and zfs send it to offline storage.
Take an incremental at some interval (appropriate for disaster recovery)
and zfs send THAT to stable storage.

If I then restore the base and snapshot, I get back to where I was when
the latest snapshot was taken.  I don't need to keep the incremental
snapshot for longer than it takes to zfs send it, so I can do:

zfs snapshot pool/some-filesystem@unique-label
zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
zfs destroy pool/some-filesystem@unique-label

and that seems to work (and restore) just fine.

Am I looking at this the right way here?  Provided that the base backup
and incremental are both readable, it appears that I have the disaster
case covered, and the online snapshot increments and retention are
easily adjusted and cover the oops situations without having to resort
to the backups at all.

This in turn means that keeping more than two incremental dumps offline
has little or no value; the second merely being taken to insure that
there is always at least one that has been written to completion without
error to apply on top of the base.  That in turn makes the backup
storage requirement based only on entropy in the filesystem and not time
(where the tower of Hanoi style dump hierarchy imposed both a time AND
entropy cost on backup media.)

Am I missing something here?

(Yes, I know, I've been a ZFS resister ;-))

-- 
-- Karl Denninger
/The Market Ticker ®/ http://market-ticker.org
Cuda Systems LLC


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Ronald Klop
On Fri, 01 Mar 2013 15:24:53 +0100, Karl Denninger k...@denninger.net  
wrote:



Dabbling with ZFS now, and giving some thought to how to handle backup
strategies.

ZFS' snapshot capabilities have forced me to re-think the way that I've
handled this.  Previously near-line (and offline) backup was focused on
being able to handle both disasters (e.g. RAID adapter goes nuts and
scribbles on the entire contents of the array), a double-disk (or worse)
failure, or the obvious (e.g. fire, etc) along with the aw crap, I just
rm -rf'd something I'd rather not!

ZFS makes snapshots very cheap, which means you can resolve the aw
crap situation without resorting to backups at all.  This turns the
backup situation into a disaster recovery one.

And that in turn seems to say that the ideal strategy looks more like:

Take a base snapshot immediately and zfs send it to offline storage.
Take an incremental at some interval (appropriate for disaster recovery)
and zfs send THAT to stable storage.

If I then restore the base and snapshot, I get back to where I was when
the latest snapshot was taken.  I don't need to keep the incremental
snapshot for longer than it takes to zfs send it, so I can do:

zfs snapshot pool/some-filesystem@unique-label
zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
zfs destroy pool/some-filesystem@unique-label

and that seems to work (and restore) just fine.

Am I looking at this the right way here?  Provided that the base backup
and incremental are both readable, it appears that I have the disaster
case covered, and the online snapshot increments and retention are
easily adjusted and cover the oops situations without having to resort
to the backups at all.

This in turn means that keeping more than two incremental dumps offline
has little or no value; the second merely being taken to insure that
there is always at least one that has been written to completion without
error to apply on top of the base.  That in turn makes the backup
storage requirement based only on entropy in the filesystem and not time
(where the tower of Hanoi style dump hierarchy imposed both a time AND
entropy cost on backup media.)

Am I missing something here?

(Yes, I know, I've been a ZFS resister ;-))



I do the same. I only use zfs send -I (capital i) so I have all the  
snapshots on the backup also.

That way the data survives an oops (rm -r) and a fire at the same time. :-)
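
For example (names illustrative), an incremental with -I carries every
intermediate snapshot along:

  # Sends all snapshots between @base and @2013-03-01, so the backup pool
  # keeps the full snapshot history as well.
  zfs send -I tank/home@base tank/home@2013-03-01 | \
      ssh backuphost zfs receive -du backuppool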

Ronald.


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Royce Williams
On Fri, Mar 1, 2013 at 6:06 AM, Ronald Klop ronald-freeb...@klop.yi.org wrote:
 On Fri, 01 Mar 2013 15:24:53 +0100, Karl Denninger k...@denninger.net
 wrote:

 Dabbling with ZFS now, and giving some thought to how to handle backup
 strategies.

 ZFS' snapshot capabilities have forced me to re-think the way that I've
 handled this.  Previously near-line (and offline) backup was focused on
 being able to handle both disasters (e.g. RAID adapter goes nuts and
 scribbles on the entire contents of the array), a double-disk (or worse)
 failure, or the obvious (e.g. fire, etc) along with the aw crap, I just
 rm -rf'd something I'd rather not!

 ZFS makes snapshots very cheap, which means you can resolve the aw
 crap situation without resorting to backups at all.  This turns the
 backup situation into a disaster recovery one.

 And that in turn seems to say that the ideal strategy looks more like:

 Take a base snapshot immediately and zfs send it to offline storage.
 Take an incremental at some interval (appropriate for disaster recovery)
 and zfs send THAT to stable storage.

 If I then restore the base and snapshot, I get back to where I was when
 the latest snapshot was taken.  I don't need to keep the incremental
 snapshot for longer than it takes to zfs send it, so I can do:

 zfs snapshot pool/some-filesystem@unique-label
 zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
 zfs destroy pool/some-filesystem@unique-label

 and that seems to work (and restore) just fine.

 Am I looking at this the right way here?  Provided that the base backup
 and incremental are both readable, it appears that I have the disaster
 case covered, and the online snapshot increments and retention are
 easily adjusted and cover the oops situations without having to resort
 to the backups at all.

 This in turn means that keeping more than two incremental dumps offline
 has little or no value; the second merely being taken to insure that
 there is always at least one that has been written to completion without
 error to apply on top of the base.  That in turn makes the backup
 storage requirement based only on entropy in the filesystem and not time
 (where the tower of Hanoi style dump hierarchy imposed both a time AND
 entropy cost on backup media.)

 Am I missing something here?

 (Yes, I know, I've been a ZFS resister ;-))


 I do the same. I only use zfs send -I (capital i) so I have all the
 snapshots on the backup also.
 That way the data survives an oops (rm -r) and a fire at the same time. :-)

Concur.  There are disasters that are not obvious until some time
has passed -- such as security breaches, application problems that
cause quiet data corruption, etc.

I do not know how a live ZFS filesystem could be manipulated by an
intruder, but the possibility is there.

-- 
Royce Williams


Re: Musings on ZFS Backup strategies

2013-03-01 Thread dweimer

On 03/01/2013 8:24 am, Karl Denninger wrote:

Dabbling with ZFS now, and giving some thought to how to handle backup
strategies.

ZFS' snapshot capabilities have forced me to re-think the way that 
I've
handled this.  Previously near-line (and offline) backup was focused 
on

being able to handle both disasters (e.g. RAID adapter goes nuts and
scribbles on the entire contents of the array), a double-disk (or 
worse)
failure, or the obvious (e.g. fire, etc) along with the aw crap, I 
just

rm -rf'd something I'd rather not!

ZFS makes snapshots very cheap, which means you can resolve the aw
crap situation without resorting to backups at all.  This turns the
backup situation into a disaster recovery one.

And that in turn seems to say that the ideal strategy looks more like:

Take a base snapshot immediately and zfs send it to offline storage.
Take an incremental at some interval (appropriate for disaster 
recovery)

and zfs send THAT to stable storage.

If I then restore the base and snapshot, I get back to where I was 
when

the latest snapshot was taken.  I don't need to keep the incremental
snapshot for longer than it takes to zfs send it, so I can do:

zfs snapshot pool/some-filesystem@unique-label
zfs send -i pool/some-filesystem@base 
pool/some-filesystem@unique-label

zfs destroy pool/some-filesystem@unique-label

and that seems to work (and restore) just fine.

Am I looking at this the right way here?  Provided that the base 
backup

and incremental are both readable, it appears that I have the disaster
case covered, and the online snapshot increments and retention are
easily adjusted and cover the oops situations without having to 
resort

to the backups at all.

This in turn means that keeping more than two incremental dumps 
offline

has little or no value; the second merely being taken to insure that
there is always at least one that has been written to completion 
without

error to apply on top of the base.  That in turn makes the backup
storage requirement based only on entropy in the filesystem and not 
time
(where the tower of Hanoi style dump hierarchy imposed both a time 
AND

entropy cost on backup media.)

Am I missing something here?

(Yes, I know, I've been a ZFS resister ;-))


I briefly did something like this between two FreeNAS boxes. It seemed 
to work well, but my secondary box wasn't quite up to par hardware-wise.  
Combine that with the lack of the necessary internet bandwidth to a second 
physical location in case of something really disastrous, like a tornado 
or fire destroying my house, and I ended up just using an eSATA drive dock 
and Bacula, with a few external drives rotated regularly into my office 
at work, rather than upgrading the secondary box.


If you have a secondary box that is adequate, and either offsite 
backups aren't a concern or you have a big enough pipe to a secondary 
location that houses the backup, this should work.


I would recommend testing your incremental snapshot rotation. I never 
tested a restore from anything but the most recent set of data when I 
was running my setup; I did, however, save a week's worth of hourly 
snapshots on a couple of the more rapidly changing data sets.


--
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Karl Denninger

On 3/1/2013 9:36 AM, dweimer wrote:
 On 03/01/2013 8:24 am, Karl Denninger wrote:
 Dabbling with ZFS now, and giving some thought to how to handle backup
 strategies.

 ZFS' snapshot capabilities have forced me to re-think the way that I've
 handled this.  Previously near-line (and offline) backup was focused on
 being able to handle both disasters (e.g. RAID adapter goes nuts and
 scribbles on the entire contents of the array), a double-disk (or worse)
 failure, or the obvious (e.g. fire, etc) along with the aw crap, I just
 rm -rf'd something I'd rather not!

 ZFS makes snapshots very cheap, which means you can resolve the aw
 crap situation without resorting to backups at all.  This turns the
 backup situation into a disaster recovery one.

 And that in turn seems to say that the ideal strategy looks more like:

 Take a base snapshot immediately and zfs send it to offline storage.
 Take an incremental at some interval (appropriate for disaster recovery)
 and zfs send THAT to stable storage.

 If I then restore the base and snapshot, I get back to where I was when
 the latest snapshot was taken.  I don't need to keep the incremental
 snapshot for longer than it takes to zfs send it, so I can do:

 zfs snapshot pool/some-filesystem@unique-label
 zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
 zfs destroy pool/some-filesystem@unique-label

 and that seems to work (and restore) just fine.

 Am I looking at this the right way here?  Provided that the base backup
 and incremental are both readable, it appears that I have the disaster
 case covered, and the online snapshot increments and retention are
 easily adjusted and cover the oops situations without having to resort
 to the backups at all.

 This in turn means that keeping more than two incremental dumps offline
 has little or no value; the second merely being taken to insure that
 there is always at least one that has been written to completion without
 error to apply on top of the base.  That in turn makes the backup
 storage requirement based only on entropy in the filesystem and not time
 (where the tower of Hanoi style dump hierarchy imposed both a time AND
 entropy cost on backup media.)

 Am I missing something here?

 (Yes, I know, I've been a ZFS resister ;-))

 I briefly did something like this between two FreeNAS boxes, it seemed
 to work well, but my secondary Box wasn't quite up to par hardware. 
 Combine that with the lack of necessary internet bandwidth with a
 second physical location in case of something really disastrous, like
 a tornado or fire destroying my house.  I ended up just using an eSATA
 drive dock and Bacula, with a few external drives rotated regularly
 into my office at work, rather than upgrading the secondary box.

 If you have the secondary box that is adequate, and either offsite
 backups aren't a concern or you have a big enough pipe to a secondary
 location that houses the backup this should work.

 I would recommend testing your incremental snapshot rotation, I never
 did test a restore from anything but the most recent set of data when
 I was running my setup, I did however save a weeks worth of hourly
 snapshots on a couple of the more rapidly changing data sets.

I rotate the disaster disks out to a safe-deposit box at the bank, and
they're geli-encrypted, so if stolen they're worthless to the thief
(other than their cash value as a drive) and if the building goes poof
I have the ones in the vault to recover from.  There's the potential for
loss up to the rotation time of course but that is the same risk I had
with all UFS filesystems.
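
As a rough sketch of that kind of setup (the device name, pool name, and the
choice of a passphrase-only GELI provider are assumptions, not details from
the post):

  # One-time: put a GELI layer on the backup disk (prompts for a passphrase)
  # and build the pool on the encrypted provider.
  geli init -s 4096 /dev/da0
  geli attach /dev/da0
  zpool create backup01 da0.eli
  # Before the disk goes back to the vault: export the pool and detach the
  # encrypted layer (reverse the steps -- attach, then import -- on its return).
  zpool export backup01
  geli detach da0.eli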

I've tested the restores onto a spare box and it appears to work as
expected...

Thanks for the comments!

-- 
-- Karl Denninger
/The Market Ticker ®/ http://market-ticker.org
Cuda Systems LLC


Re: Musings on ZFS Backup strategies

2013-03-01 Thread dweimer

On 03/01/2013 9:45 am, Karl Denninger wrote:


I briefly did something like this between two FreeNAS boxes, and it seemed
to work well, but my secondary box wasn't quite up to par, hardware-wise.
Combine that with the lack of the necessary internet bandwidth to a
second physical location in case of something really disastrous, like
a tornado or fire destroying my house, and I ended up just using an eSATA
drive dock and Bacula, with a few external drives rotated regularly
into my office at work, rather than upgrading the secondary box.

If you have a secondary box that is adequate, and either offsite
backups aren't a concern or you have a big enough pipe to a secondary
location that houses the backup, this should work.

I would recommend testing your incremental snapshot rotation; I never
did test a restore from anything but the most recent set of data when
I was running my setup.  I did, however, save a week's worth of hourly
snapshots on a couple of the more rapidly changing data sets.


I rotate the disaster disks out to a safe-deposit box at the bank, and
they're geli-encrypted, so if stolen they're worthless to the thief
(other than their cash value as a drive) and if the building goes poof
I have the ones in the vault to recover from.  There's the potential for
loss up to the rotation time of course but that is the same risk I had
with all UFS filesystems.

I've tested the restores onto a spare box and it appears to work as
expected...

Thanks for the comments!


Yes, good point on the geli encryption; I do that as well on my
external backup drives, but didn't think to mention it in the last post.
I have considered the safe-deposit box as well, but our office building
at work is fairly well secured, seeing as it houses the main data center
for our company, with doors locked 24 hours a day and electronic locks that
log all entries.  It's also an old brick and concrete building that
survived a direct tornado hit about 15 years ago with only very minor
cosmetic exterior damage to the awning over the front stairs and the
company logo above it.  I feel fairly secure keeping the disk drives
there, and if I ever need my offsite backup at 3:00 am I can go get it
rather than be stuck waiting for the bank to open.


--
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Karl Denninger

On 3/1/2013 10:08 AM, dweimer wrote:
 On 03/01/2013 9:45 am, Karl Denninger wrote:

 I briefly did something like this between two FreeNAS boxes, and it seemed
 to work well, but my secondary box wasn't quite up to par, hardware-wise.
 Combine that with the lack of the necessary internet bandwidth to a
 second physical location in case of something really disastrous, like
 a tornado or fire destroying my house, and I ended up just using an eSATA
 drive dock and Bacula, with a few external drives rotated regularly
 into my office at work, rather than upgrading the secondary box.

 If you have a secondary box that is adequate, and either offsite
 backups aren't a concern or you have a big enough pipe to a secondary
 location that houses the backup, this should work.

 I would recommend testing your incremental snapshot rotation; I never
 did test a restore from anything but the most recent set of data when
 I was running my setup.  I did, however, save a week's worth of hourly
 snapshots on a couple of the more rapidly changing data sets.

 I rotate the disaster disks out to a safe-deposit box at the bank, and
 they're geli-encrypted, so if stolen they're worthless to the thief
 (other than their cash value as a drive) and if the building goes poof
 I have the ones in the vault to recover from.  There's the potential for
 loss up to the rotation time of course but that is the same risk I had
 with all UFS filesystems.

 I've tested the restores onto a spare box and it appears to work as
 expected...

 Thanks for the comments!

 Yes, good point on the geli encryption; I do that as well on my
 external backup drives, but didn't think to mention it in the last
 post.  I have considered the safe-deposit box as well, but our office
 building at work is fairly well secured, seeing as it houses the main
 data center for our company, with doors locked 24 hours a day and
 electronic locks that log all entries.  It's also an old brick and
 concrete building that survived a direct tornado hit about 15
 years ago with only very minor cosmetic exterior damage to the awning
 over the front stairs and the company logo above it.  I feel fairly
 secure keeping the disk drives there, and if I ever need my offsite
 backup at 3:00 am I can go get it rather than be stuck waiting for the
 bank to open.

I keep two copies on-site (rsync'd from one to the other), both offline
when not actively being written to, and rotate the second with one in
the vault.  When the vault copy is rotated on the next cycle it is
sync'd automatically.
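
(Purely as a sketch, with made-up mount points for the two copies, that sync
step amounts to something like:

rsync -aH --delete /backup/copy1/ /backup/copy2/

where -a preserves permissions and times, -H preserves hard links, and
--delete drops files from the second copy that were removed from the first.)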

So I have two shots at a restore on-site all the time; the last-chance
one is in the vault in the event the building is destroyed, and if that
happens the delay until the bank opens is probably the least of my problems.

-- 
-- Karl Denninger
/The Market Ticker ®/ http://market-ticker.org
Cuda Systems LLC
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Ben Morrow
Quoth Karl Denninger k...@denninger.net:
 Dabbling with ZFS now, and giving some thought to how to handle backup
 strategies.
[...]
 
 Take a base snapshot immediately and zfs send it to offline storage.
 Take an incremental at some interval (appropriate for disaster recovery)
 and zfs send THAT to stable storage.
 
 If I then restore the base and snapshot, I get back to where I was when
 the latest snapshot was taken.  I don't need to keep the incremental
 snapshot for longer than it takes to zfs send it, so I can do:
 
 zfs snapshot pool/some-filesystem@unique-label
 zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
 zfs destroy pool/some-filesystem@unique-label
 
 and that seems to work (and restore) just fine.

For backup purposes it's worth using the -R and -I options to zfs send
rather than -i. This will preserve the other snapshots, which can be
important.

 Am I looking at this the right way here?  Provided that the base backup
 and incremental are both readable, it appears that I have the disaster
 case covered, and the online snapshot increments and retention are
 easily adjusted and cover the oops situations without having to resort
 to the backups at all.
 
 This in turn means that keeping more than two incremental dumps offline
 has little or no value; the second merely being taken to insure that
 there is always at least one that has been written to completion without
 error to apply on top of the base.  That in turn makes the backup
 storage requirement based only on entropy in the filesystem and not time
 (where the tower of Hanoi style dump hierarchy imposed both a time AND
 entropy cost on backup media.)

No, that's not true. Since you keep taking successive increments from a
fixed base, the size of those increments will increase over time (each
increment will include all net filesystem activity since the base
snapshot). In UFS terms, it's equivalent to always taking level 1 dumps.
Unlike with UFS, the @base snapshot will also start using increasing
amounts of space in the source zpool.

I don't know what medium you're backing up to (does anyone use tape any
more?) but when backing up to disk I much prefer to keep the backup in
the form of a filesystem rather than as 'zfs send' streams. One reason
for this is that I believe that new versions of the ZFS code are more
likely to be able to correctly read old versions of the filesystem than
old versions of the stream format; this may not be correct any more,
though.

Another reason is that it means I can do 'rolling snapshot' backups. I
do an initial dump like this

# zpool is my working pool
# bakpool is a second pool I am backing up to

zfs snapshot -r zpool/fs@dump
zfs send -R zpool/fs@dump | zfs recv -vFd bakpool

That pipe can obviously go through ssh or whatever to put the backup on
a different machine. Then to make an increment I roll forward the
snapshot like this

zfs rename -r zpool/fs@dump dump-old
zfs snapshot -r zpool/fs@dump
zfs send -R -I @dump-old zpool/fs@dump | zfs recv -vFd bakpool
zfs destroy -r zpool/fs@dump-old
zfs destroy -r bakpool/fs@dump-old

(Notice that the increment starts at a snapshot called @dump-old on the
send side but at a snapshot called @dump on the recv side. ZFS can
handle this perfectly well, since it identifies snapshots by UUID, and
will rename the bakpool snapshot as part of the recv.)

This brings the filesystem on bakpool up to date with the filesystem on
zpool, including all snapshots, but never creates an increment with more
than one backup interval's worth of data in. If you want to keep more
history on the backup pool than the source pool, you can hold off on
destroying the old snapshots, and instead rename them to something
unique. (Of course, you could always give them unique names to start
with, but I find it more convenient not to.)
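
As a sketch of that variation (the dated name is arbitrary), the last two
lines of the cycle above would become

zfs destroy -r zpool/fs@dump-old
zfs rename -r bakpool/fs@dump-old bakpool/fs@dump-2013-02-22

so that between runs the source pool only carries @dump while bakpool
accumulates one uniquely-named snapshot per backup interval.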

Ben

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Daniel Eischen

On Fri, 1 Mar 2013, Ben Morrow wrote:


Quoth Karl Denninger k...@denninger.net:

Dabbling with ZFS now, and giving some thought to how to handle backup
strategies.

[...]


Take a base snapshot immediately and zfs send it to offline storage.
Take an incremental at some interval (appropriate for disaster recovery)
and zfs send THAT to stable storage.

If I then restore the base and snapshot, I get back to where I was when
the latest snapshot was taken.  I don't need to keep the incremental
snapshot for longer than it takes to zfs send it, so I can do:

zfs snapshot pool/some-filesystem@unique-label
zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
zfs destroy pool/some-filesystem@unique-label

and that seems to work (and restore) just fine.


For backup purposes it's worth using the -R and -I options to zfs send
rather than -i. This will preserve the other snapshots, which can be
important.


Am I looking at this the right way here?  Provided that the base backup
and incremental are both readable, it appears that I have the disaster
case covered, and the online snapshot increments and retention are
easily adjusted and cover the oops situations without having to resort
to the backups at all.

This in turn means that keeping more than two incremental dumps offline
has little or no value; the second merely being taken to insure that
there is always at least one that has been written to completion without
error to apply on top of the base.  That in turn makes the backup
storage requirement based only on entropy in the filesystem and not time
(where the tower of Hanoi style dump hierarchy imposed both a time AND
entropy cost on backup media.)


No, that's not true. Since you keep taking successive increments from a
fixed base, the size of those increments will increase over time (each
increment will include all net filesystem activity since the base
snapshot). In UFS terms, it's equivalent to always taking level 1 dumps.
Unlike with UFS, the @base snapshot will also start using increasing
amounts of space in the source zpool.

I don't know what medium you're backing up to (does anyone use tape any
more?) but when backing up to disk I much prefer to keep the backup in
the form of a filesystem rather than as 'zfs send' streams. One reason
for this is that I believe that new versions of the ZFS code are more
likely to be able to correctly read old versions of the filesystem than
old versions of the stream format; this may not be correct any more,
though.


Yes, we still use a couple of DLT autoloaders and have nightly
incrementals and weekly fulls.  This is the problem I have with
converting to ZFS.  Our typical recovery is when a user says
they need a directory or set of files from a week or two ago.
Using dump from tape, I can easily extract *just* the necessary
files.  I don't need a second system to restore to, so that
I can then extract the file.

dump (and ufsdump for our Solaris boxes) _just work_, and we
can go back many many years and they will still work.  If we
convert to ZFS, I'm guessing we'll have to do nightly
incrementals with 'tar' instead of 'dump' as well as doing
ZFS snapshots for fulls.

This topic is very interesting to me, as we're at the point
now (with Solaris 11 refusing to even boot from anything but
ZFS) that we have to consider ZFS.

--
DE
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Volodymyr Kostyrko

01.03.2013 16:24, Karl Denninger:

Dabbling with ZFS now, and giving some thought to how to handle backup
strategies.

ZFS' snapshot capabilities have forced me to re-think the way that I've
handled this.  Previously near-line (and offline) backup was focused on
being able to handle both disasters (e.g. RAID adapter goes nuts and
scribbles on the entire contents of the array), a double-disk (or worse)
failure, or the obvious (e.g. fire, etc) along with the "aw crap, I just
rm -rf'd something I'd rather not!"

ZFS makes snapshots very cheap, which means you can resolve the "aw
crap" situation without resorting to backups at all.  This turns the
backup situation into a disaster recovery one.

And that in turn seems to say that the ideal strategy looks more like:

Take a base snapshot immediately and zfs send it to offline storage.
Take an incremental at some interval (appropriate for disaster recovery)
and zfs send THAT to stable storage.

If I then restore the base and snapshot, I get back to where I was when
the latest snapshot was taken.  I don't need to keep the incremental
snapshot for longer than it takes to zfs send it, so I can do:

zfs snapshot pool/some-filesystem@unique-label
zfs send -i pool/some-filesystem@base pool/some-filesystem@unique-label
zfs destroy pool/some-filesystem@unique-label

and that seems to work (and restore) just fine.


Yes, I'm working with backups the same way; I wrote a simple script that
synchronizes two filesystems between distant servers. I also use the
same script to synchronize bushy filesystems (with hundreds of thousands of
files) where rsync produces too big a load for synchronizing.


https://github.com/kworr/zfSnap/commit/08d8b499dbc2527a652cddbc601c7ee8c0c23301

I left it where it was, but I was also planning to write a purger for
snapshots that would automatically destroy old snapshots when the pool gets
low on space. I've never hit that yet.
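
Something along these lines would probably do as a starting point (untested
sketch; the pool name and the 90% threshold are arbitrary, and it assumes
the oldest snapshots are the right ones to drop):

#!/bin/sh
pool=zroot
# keep destroying the oldest snapshot until the pool drops below the threshold
while [ "$(zpool list -H -o capacity $pool | tr -d %)" -ge 90 ]; do
oldest=$(zfs list -H -t snapshot -o name -s creation -r $pool | head -n 1)
[ -n "$oldest" ] || break
echo "purging $oldest"
zfs destroy "$oldest"
done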



Am I looking at this the right way here?  Provided that the base backup
and incremental are both readable, it appears that I have the disaster
case covered, and the online snapshot increments and retention are
easily adjusted and cover the oops situations without having to resort
to the backups at all.

This in turn means that keeping more than two incremental dumps offline
has little or no value; the second merely being taken to insure that
there is always at least one that has been written to completion without
error to apply on top of the base.  That in turn makes the backup
storage requirement based only on entropy in the filesystem and not time
(where the tower of Hanoi style dump hierarchy imposed both a time AND
entropy cost on backup media.)


Well, snapshots can have value over a longer timeframe, depending on the
data. Being able to restore a file accidentally deleted two months ago
has already saved $2k for one of our customers.


--
Sphinx of black quartz, judge my vow.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Ben Morrow
Quoth Daniel Eischen deisc...@freebsd.org:
 
 Yes, we still use a couple of DLT autoloaders and have nightly
 incrementals and weekly fulls.  This is the problem I have with
 converting to ZFS.  Our typical recovery is when a user says
 they need a directory or set of files from a week or two ago.
 Using dump from tape, I can easily extract *just* the necessary
 files.  I don't need a second system to restore to, so that
 I can then extract the file.

As Karl said originally, you can do that with snapshots without having
to go to your backups at all. With the right arrangements (symlinks to
the .zfs/snapshot/* directories, or just setting the snapdir property to
'visible') you can make it so users can do this sort of restore
themselves without having to go through you.
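
A minimal sketch of the self-service case (dataset, snapshot and file names
made up):

zfs set snapdir=visible tank/home
ls /tank/home/.zfs/snapshot/
cp /tank/home/.zfs/snapshot/daily-2013-02-28/alice/report.odt /tank/home/alice/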

Ben

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread dweimer

On 03/01/2013 1:25 pm, kpn...@pobox.com wrote:

On Fri, Mar 01, 2013 at 09:45:32AM -0600, Karl Denninger wrote:
I rotate the disaster disks out to a safe-deposit box at the bank, and
they're geli-encrypted, so if stolen they're worthless to the thief
(other than their cash value as a drive) and if the building goes poof
I have the ones in the vault to recover from.  There's the potential for
loss up to the rotation time of course but that is the same risk I had
with all UFS filesystems.

What do you do about geli keys? Encrypted backups aren't much use if
you can't unencrypt them.


In my case I set them up with a pass-phrase only, so I can mount them on
any FreeBSD system using geli attach ... and then enter the pass-phrase when
prompted. It is less secure than the key method (just because the
pass-phrase is far shorter than a key would be), but it ensures that as long
as I can remember the pass-phrase I can access the data.  However, the
backups in my case are personal data; the worst-case scenario is someone
steals my identity, personal photos, and iTunes library.  My bank
accounts don't have enough money in them to make it worth someone going
through the time and effort to get the data off the disks.  The
pass-phrase I picked uses all the good practices of mixed case and special
characters, and it's not something easy to guess even by people who know
me well.  It would be far easier to break into my house and get the data
that way than to break the encryption on the external backup media.
If I were, say, backing up corporate data with this method and my
company did defense research, I would probably use both a
pass-phrase and key combination and store an offsite copy of the key in
a separate secure location from the media.
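
A pass-phrase-only setup along those lines is roughly (sketch only; the
device name, sector size, and the choice of a pool on top are assumptions):

geli init -s 4096 /dev/da0      # prompts for a pass-phrase, no key file
geli attach /dev/da0            # prompts again, creates /dev/da0.eli
zpool create backup /dev/da0.eli
# ... fill it (zfs receive, rsync, Bacula volumes, whatever) ...
zpool export backup
geli detach da0.eli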


--
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Daniel Eischen

On Fri, 1 Mar 2013, Ben Morrow wrote:


Quoth Daniel Eischen deisc...@freebsd.org:


Yes, we still use a couple of DLT autoloaders and have nightly
incrementals and weekly fulls.  This is the problem I have with
converting to ZFS.  Our typical recovery is when a user says
they need a directory or set of files from a week or two ago.
Using dump from tape, I can easily extract *just* the necessary
files.  I don't need a second system to restore to, so that
I can then extract the file.


As Karl said originally, you can do that with snapshots without having
to go to your backups at all. With the right arrangements (symlinks to
the .zfs/snapshot/* directories, or just setting the snapdir property to
'visible') you can make it so users can do this sort of restore
themselves without having to go through you.


It wasn't clear that snapshots were traversable as a normal
directory structure.  I was thinking it was just a blob
that you had to roll back to in order to get anything out
of it.

Under our current scheme, we would remove snapshots
after the next (weekly) full zfs send (nee dump), so
it wouldn't help unless we kept snapshots around a
lot longer.

Am I correct in assuming that one could:

  # zfs send -R snapshot | dd obs=10240 of=/dev/rst0

to archive it to tape instead of another [system:]drive?

--
DE
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Karl Denninger

On 3/1/2013 2:34 PM, Daniel Eischen wrote:
 On Fri, 1 Mar 2013, Ben Morrow wrote:

 Quoth Daniel Eischen deisc...@freebsd.org:

 Yes, we still use a couple of DLT autoloaders and have nightly
 incrementals and weekly fulls.  This is the problem I have with
 converting to ZFS.  Our typical recovery is when a user says
 they need a directory or set of files from a week or two ago.
 Using dump from tape, I can easily extract *just* the necessary
 files.  I don't need a second system to restore to, so that
 I can then extract the file.

 As Karl said originally, you can do that with snapshots without having
 to go to your backups at all. With the right arrangements (symlinks to
 the .zfs/snapshot/* directories, or just setting the snapdir property to
 'visible') you can make it so users can do this sort of restore
 themselves without having to go through you.

 It wasn't clear that snapshots were traversable as a normal
 directory structure.  I was thinking it was just a blob
 that you had to roll back to in order to get anything out
 of it.

 Under our current scheme, we would remove snapshots
 after the next (weekly) full zfs send (nee dump), so
 it wouldn't help unless we kept snapshots around a
 lot longer.

 Am I correct in assuming that one could:

   # zfs send -R snapshot | dd obs=10240 of=/dev/rst0

 to archive it to tape instead of another [system:]drive?

Yes.
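
A sketch of the round trip with made-up names (on FreeBSD the SCSI tape
device is typically /dev/sa0, or /dev/nsa0 for the non-rewinding one,
rather than rst0):

zfs send -R pool/fs@weekly | dd obs=10240 of=/dev/nsa0
# and later, to read it back into some pool:
dd if=/dev/nsa0 ibs=10240 | zfs receive -vFd pool2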

-- 
-- Karl Denninger
/The Market Ticker ®/ http://market-ticker.org
Cuda Systems LLC
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Daniel Eischen

On Fri, 1 Mar 2013, kpn...@pobox.com wrote:


On Fri, Mar 01, 2013 at 12:23:31PM -0500, Daniel Eischen wrote:

Yes, we still use a couple of DLT autoloaders and have nightly
incrementals and weekly fulls.  This is the problem I have with
converting to ZFS.  Our typical recovery is when a user says
they need a directory or set of files from a week or two ago.
Using dump from tape, I can easily extract *just* the necessary
files.  I don't need a second system to restore to, so that
I can then extract the file.

dump (and ufsdump for our Solaris boxes) _just work_, and we
can go back many many years and they will still work.  If we
convert to ZFS, I'm guessing we'll have to do nightly
incrementals with 'tar' instead of 'dump' as well as doing
ZFS snapshots for fulls.


What about extended attributes? ACLs? Are those saved by tar?


I think tar (as root or -p) will attempt to preserve those.
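
A rough sketch with bsdtar (paths are placeholders; whether ACLs actually
survive depends on the tar implementation and the ACL flavor, as the
following messages discuss):

tar --format pax -cf /backup/home-full.tar -C /pool/home .
# restore as root; -p restores permissions, flags and any stored ACLs
tar -xpf /backup/home-full.tar -C /restore/home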

--
DE
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Karl Denninger

On 3/1/2013 1:25 PM, kpn...@pobox.com wrote:
 On Fri, Mar 01, 2013 at 09:45:32AM -0600, Karl Denninger wrote:
 I rotate the disaster disks out to a safe-deposit box at the bank, and
 they're geli-encrypted, so if stolen they're worthless to the thief
 (other than their cash value as a drive) and if the building goes poof
 I have the ones in the vault to recover from.  There's the potential for
 loss up to the rotation time of course but that is the same risk I had
 with all UFS filesystems.
 What do you do about geli keys? Encrypted backups aren't much use if
 you can't unencrypt them.
I keep them in my head.  Even my immediate family could not guess it;
one of the things I mastered many years ago was algorithmic and very
long passwords that are easy to remember but impossible for someone to
guess other than by brute force, and if long enough that becomes
prohibitive for the guesser.

If I needed even better security I'd keep the (random part of the) composite key
on an external thing (e.g. a thumbdrive) that is only stuffed in the box
to boot and attach the drives, then removed and stored separately under
separate and high security.

There is no point to using a composite key IF THE RANDOM PART CAN BE
STOLEN; you then are back to the security of the typed password (if
any), so if you want the better level of security you need to deal with
the physical security of the random portion and make sure it is NEVER on
an unencrypted part of the disk itself.

If you're not going to do that then a strong and long password is just
as good.

I can mount my backup volumes on any FreeBSD machine that has the geli
framework.
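
In other words, roughly (device and pool name made up):

geli attach /dev/da0    # prompts for the pass-phrase
zpool import backup     # -f may be needed if the pool was last used elsewhere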

-- 
-- Karl Denninger
/The Market Ticker ®/ http://market-ticker.org
Cuda Systems LLC
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread David Magda
On Mar 1, 2013, at 12:23, Daniel Eischen wrote:

 dump (and ufsdump for our Solaris boxes) _just work_, and we
 can go back many many years and they will still work.  If we
 convert to ZFS, I'm guessing we'll have to do nightly
 incrementals with 'tar' instead of 'dump' as well as doing
 ZFS snapshots for fulls.

Keep some snapshots, and send stuff to tape after a certain amount of time. 
Most (though not all) restores are usually within x weeks, where x is a 
different value for each organization. (Things will be generally asymptotic 
though.)

So if you keep 1 week's worth of snapshots, you'll probably end up being able to
service (say) 25% of restore requests: the file can usually be grabbed from
yesterday's snapshot. If you keep 2 weeks' worth of snapshots, probably catch 
50% of requests. 4 weeks will give you 80%; 6 weeks, 90%; 8 weeks, 95%. 

Of course the more snapshots, the more spinning disk you need (using power and 
generating heat).

Most articles describing backup/restore best practices I've read in the last 
few years have stated you want to use disk first (snapshots, VTLs, etc.), and 
then clone to tape after a certain amount of time (x weeks). Or rather: disk 
AND tape, then clone to another tape (so you have two) and purge the disk copy 
after x.

So in this instance, keep snapshots around for a little while, and keep doing
your tape backups for long-term storage. Also inform people about the
.zfs/snapshot/ directory so they can possibly do some self-service in case they
fat-finger something (quicker for them, and less hassle for help desk/IT).
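
As a sketch of the "keep x weeks on disk" part (the naming scheme, dataset
and 8-week cutoff are all assumptions):

#!/bin/sh
fs=tank/home
cutoff=$(date -v -8w +%Y%m%d)
# prune weekly-YYYY-MM-DD snapshots older than the cutoff
zfs list -H -t snapshot -o name -r $fs | grep "@weekly-" | while read snap; do
stamp=$(echo "${snap##*@weekly-}" | tr -d -)
if [ "$stamp" -lt "$cutoff" ]; then
echo "pruning $snap"
zfs destroy "$snap"
fi
done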

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread David Magda

On Mar 1, 2013, at 15:39, Daniel Eischen wrote:

 On Fri, 1 Mar 2013, kpn...@pobox.com wrote:
 
 What about extended attributes? ACLs? Are those saved by tar?
 
 I think tar (as root or -p) will attempt to preserve those.

Specifically bsdtar (with libarchive) and star:

https://github.com/libarchive/libarchive/wiki/TarPosix1eACLs
http://www.freshports.org/archivers/star/

GNUtar is a bit tricky: older versions don't handle ACLs at all so you have to 
check version numbers on your creation and extraction hosts.

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread David Magda

On Mar 1, 2013, at 12:55, Volodymyr Kostyrko wrote:

 Yes, I'm working with backups the same way, I wrote a simple script that 
 synchronizes two filesystems between distant servers. I also use the same 
 script to synchronize bushy filesystems (with hundred thousands of files) 
 where rsync produces a too big load for synchronizing.
 
 https://github.com/kworr/zfSnap/commit/08d8b499dbc2527a652cddbc601c7ee8c0c23301

There are quite a few scripts out there:

http://www.freshports.org/search.php?query=zfs

For file level copying, where you don't want to walk the entire tree, here is 
the zfs diff command:

 zfs diff [-FHt] snapshot [snapshot|filesystem]
 
Describes differences between a snapshot and a successor dataset. The
successor dataset can be a later snapshot or the current filesystem.
 
The changed files are displayed including the change type. The change
type is displayed using a single character. If a file or directory
was renamed, the old and the new names are displayed.

http://www.freebsd.org/cgi/man.cgi?query=zfs

This allows one to get a quick list of files and directories, then use 
tar/rsync/cp/etc. to do the actual copy (where the destination does not have to 
be ZFS: e.g., NFS, ext4, Lustre, HDFS, etc.).
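
A rough sketch of that workflow (dataset and snapshot names made up):

zfs snapshot tank/home@today
# -H gives tab-separated, script-friendly output; -F adds the file type
zfs diff -FH tank/home@yesterday tank/home@today > /tmp/changed-files
# then feed the paths in /tmp/changed-files to rsync/tar/cp as appropriate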

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Musings on ZFS Backup strategies

2013-03-01 Thread Ben Morrow
Quoth David Magda dma...@ee.ryerson.ca:
 On Mar 1, 2013, at 15:39, Daniel Eischen wrote:
  On Fri, 1 Mar 2013, kpn...@pobox.com wrote:
  
  What about extended attributes? ACLs? Are those saved by tar?
  
  I think tar (as root or -p) will attempt to preserve those.
 
 Specifically bsdtar (with libarchive) and star:
 
 https://github.com/libarchive/libarchive/wiki/TarPosix1eACLs

But since ZFS doesn't support POSIX.1e ACLs that's not terribly
useful... I don't believe bsdtar/libarchive supports NFSv4 ACLs yet.

Ben

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org