Re: [zfs-discuss] Announce: zfsdump

2010-07-05 Thread Tristram Scott
 At this point, I will repeat my recommendation about using
 zpool-in-files as a backup (staging) target.  Depending where you
 host, and how you combine the files, you can achieve these scenarios
 without clunkery, and with all the benefits a zpool provides.
 

This is another good scheme.

I see a number of points to consider when choosing amongst the various 
suggestions for backing up zfs file systems.  In no particular order, I have 
these:

1. Does it work in place, or need an intermediate copy on disk?
2. Does it respect ACLs?
3. Does it respect zfs snapshots?
4. Does it allow random access to files, or only full file system restore?
5. Can it (mostly) survive partial data corruption?
6. Can it handle file systems larger than a single tape?
7. Can it stream to multiple tapes in parallel?
8. Does it understand the concept of incremental backups?

I still see this as a serious gap in the offering of zfs.  Clearly so do many 
other people, as there are a lot of methods offered to handle at least some of 
the above.


Re: [zfs-discuss] Announce: zfsdump

2010-07-05 Thread Joerg Schilling
Tristram Scott tristram.sc...@quantmodels.co.uk wrote:

 I see a number of points to consider when choosing amongst the various 
 suggestions for backing up zfs file systems.  In no particular order, I have 
 these:

Let me fill this out for star ;-)

 1. Does it work in place, or need an intermediate copy on disk?

Yes (it works in place).

 2. Does it respect ACLs?

Not yet (because of missing interest from Sun).
If people show interest, a ZFS ACL implementation would not take much time,
as there is already UFS ACL support in star.

 3. Does it respect zfs snapshots?

Yes.
Star recommends running incrementals on snapshots. Star incrementals
will work correctly if the snapshot just creates a new filesystem ID but
leaves inode numbers identical (this is how it works with UFS snapshots).
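
For example, a minimal sketch of archiving from a snapshot so the source
stays static (the dataset and snapshot names are illustrative, and this is
plain star usage rather than any ZFS-specific support):

    # take a snapshot, archive its .zfs path, then drop it
    zfs snapshot tank/home@nightly
    star -c f=/dev/rmt/1ln /tank/home/.zfs/snapshot/nightly
    zfs destroy tank/home@nightly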

 4. Does it allow random access to files, or only full file system restore?

Yes (individual files can be restored).

 5. Can it (mostly) survive partial data corruption?

Yes for data corruption in the archive; for data corruption in ZFS, see ZFS itself.

 6. Can it handle file systems larger than a single tape?

Yes

 7. Can it stream to multiple tapes in parallel?

There is hardware for this task (look for tape RAID).

 8. Does it understand the concept of incremental backups?

Yes

And regarding the speed for incrementals:

A scan on a Sunfire X 4540 with a typical mix of small and large files (1.5 TB 
of filesystem data in 7.7 million files) takes 20 minutes. There seems to be a 
performance problem in the ZFS implementation: the data is made from four copies 
of identical file sets, each 370 GB in size, and performance degrades after 
some time. While parsing the first set of files, performance is 4x higher, 
so this 1.5 TB test could have finished in 5 minutes.
This test was done with an empty cache. With a populated cache, the incremental
scan is much faster and takes only 4 minutes.

It seems that incrementals at the user-space level are still feasible.

Jörg

-- 
 EMail: jo...@schily.isdn.cs.tu-berlin.de (home)  Jörg Schilling  D-13353 Berlin
        j...@cs.tu-berlin.de (uni)
        joerg.schill...@fokus.fraunhofer.de (work)
 Blog:  http://schily.blogspot.com/
 URL:   http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily


Re: [zfs-discuss] Announce: zfsdump

2010-07-03 Thread Daniel Carosone
On Wed, Jun 30, 2010 at 12:54:19PM -0400, Edward Ned Harvey wrote:
 If you're talking about streaming to a bunch of separate tape drives (or
 whatever) on a bunch of separate systems because the recipient storage is
 the bottleneck instead of the network ... then split probably isn't the
 most useful way to distribute those streams.  Because split is serial.
 You would really want to stripe your data to all those various
 destinations, so they could all be writing simultaneously.  But this seems
 like a very specialized scenario, that I think is probably very unusual.

At this point, I will repeat my recommendation about using
zpool-in-files as a backup (staging) target.  Depending where you
host, and how you combine the files, you can achieve these scenarios
without clunkery, and with all the benefits a zpool provides.

 1 - Create a bunch of files, sized appropriately for your eventual backup
 media unit (e.g. tape).  

 2 - make a zpool out of them, in whatever vdev arrangement suits your
 space and error tolerance needs (plain stripe or raidz or both).
 Set compression, dedup etc (encryption, one day) as suits you, too.

 3 - zfs send | zfs recv into this pool-of-files.  rsync from non-zfs
 hosts, too, if you like.

 4 - scrub, if you like

 5 - write the files to tape, or into whatever file-oriented backup
 solution you prefer (perhaps at a less frequent schedule than
 sends).

 6 - goto 3 (incremental sends for later updates)
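
A minimal sketch of steps 1-3, assuming 36GB backing files sized for
DAT72-class media (paths, sizes and pool names are all illustrative):

    # 1. create backing files sized for the eventual tape
    mkfile 36g /backing/t0 /backing/t1 /backing/t2
    # 2. build a pool over them; raidz adds error tolerance across files
    zpool create bpool raidz /backing/t0 /backing/t1 /backing/t2
    zfs set compression=on bpool
    # 3. replicate into the pool-of-files
    zfs send -R tank@today | zfs recv -d bpool
    # 4. optionally verify
    zpool scrub bpool
    # 5. later, write /backing/t* to tape with a file-oriented tool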

I came up with this scheme when zpool was the only forwards-compatible
format, before the send stream format was a committed interface too.
However, there are still several other reasons why this is preferable
to backing up send streams directly.

--
Dan.



Re: [zfs-discuss] Announce: zfsdump

2010-07-01 Thread Edward Ned Harvey
 From: Asif Iqbal [mailto:vad...@gmail.com]
 
 currently to speed up the zfs send | zfs recv I am using mbuffer. It
 moves the data a lot faster than using netcat (or ssh) as the
 transport method

Yup, this works because network and disk latency can both be variable.  So
without buffering, your data stream must instantaneously go at the speed of
whichever is slower: the disk or the network.

But when you use buffering, you're able to go as fast as the network at all
times.  You remove the effect of transient disk latency.
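
A minimal sketch of the buffered transfer, assuming mbuffer's network mode
(the host name, port and buffer sizes are illustrative):

    # on the receiver
    mbuffer -s 128k -m 1G -I 9090 | zfs recv -d tank

    # on the sender
    zfs send tank/fs@snap | mbuffer -s 128k -m 1G -O recvhost:9090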


 that is why I thought maybe transporting it the way axel does would
 work better than wget.
 axel lets you create multiple pipes, so you get the data multiple times
 faster than with wget.

If you're using axel to download something from the internet, the reason
it's faster than wget is that your data stream is competing against all
the other users of the internet to get something from that server across
some WAN.  Inherently, all the routers and servers on the internet will
treat each data stream fairly (except when explicitly configured to be
unfair).  So when you axel some file from the internet using multiple
threads, instead of wget'ing with a single thread, you're unfairly hogging
the server and WAN bandwidth between your site and the remote site, slowing
down everyone else on the internet who is running with only 1 thread each.

Assuming your zfs send backup is going local, on a LAN, you almost certainly
do not want to do that.

If your zfs send is going across the WAN ... maybe you do want to
multithread the datastream.  But you had better ensure it's encrypted.
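
A minimal sketch of keeping a WAN transfer encrypted, at the cost of ssh
overhead (the host and dataset names are illustrative):

    zfs send -i tank@sun tank@mon | ssh backup.example.com zfs recv -d tank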



Re: [zfs-discuss] Announce: zfsdump

2010-06-30 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Asif Iqbal
 
 would be nice if i could pipe the zfs send stream to a split and then
 send those split streams over the
 network to a remote system. it would help send it over to the remote
 system quicker. can your tool do that?

Does that make sense?  I assume the network is the bottleneck; the only way
multiple streams would go any faster than a single stream is if you're
multithreading and hogging all the bandwidth for yourself, instead of
sharing fairly with the httpd or whatever other server is trying to use
the bandwidth.

If you're talking about streaming to a bunch of separate tape drives (or
whatever) on a bunch of separate systems because the recipient storage is
the bottleneck instead of the network ... then split probably isn't the
most useful way to distribute those streams.  Because split is serial.
You would really want to stripe your data to all those various
destinations, so they could all be writing simultaneously.  But this seems
like a very specialized scenario, that I think is probably very unusual.



Re: [zfs-discuss] Announce: zfsdump

2010-06-30 Thread Asif Iqbal
On Wed, Jun 30, 2010 at 12:54 PM, Edward Ned Harvey
solar...@nedharvey.com wrote:
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Asif Iqbal

 would be nice if i could pipe the zfs send stream to a split and then
 send those split streams over the
 network to a remote system. it would help send it over to the remote
 system quicker. can your tool do that?

 Does that make sense?  I assume the network is the bottleneck; the only way
 multiple streams would go any faster than a single stream is if you're
 multithreading and hogging all the bandwidth for yourself, instead of
 sharing fairly with the httpd or whatever other server is trying to use
 the bandwidth.

currently to speed up the zfs send | zfs recv I am using mbuffer. It
moves the data a lot faster than using netcat (or ssh) as the
transport method.

that is why I thought maybe transporting it the way axel does would work
better than wget.
axel lets you create multiple pipes, so you get the data multiple times faster
than with wget.



 If you're talking about streaming to a bunch of separate tape drives (or
 whatever) on a bunch of separate systems because the recipient storage is
 the bottleneck instead of the network ... then split probably isn't the
 most useful way to distribute those streams.  Because split is serial.
 You would really want to stripe your data to all those various
 destinations, so they could all be writing simultaneously.  But this seems
 like a very specialized scenario, that I think is probably very unusual.





-- 
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


Re: [zfs-discuss] Announce: zfsdump

2010-06-29 Thread Tristram Scott
 
 would be nice if i could pipe the zfs send stream to a split and then
 send those split streams over the network to a remote system. it would
 help send it over to the remote system quicker. can your tool do that?
 
 something like this
 
                  s | - | j
    zfs send     p | - | o     zfs recv
    (local)      l | - | i     (remote)
                  i | - | n
                  t
 
  copy from the fifos to tape(s).
 

 Asif Iqbal

I did look at doing this, with the intention of allowing simultaneous streams 
to multiple tape drives, but put the idea to one side.   

I thought of providing interleaved streams, but wasn't happy with the idea that 
the whole process would block when one of the pipes stalled.

I also contemplated dividing the stream into several large chunks, but for them 
to run simultaneously that seemed to require several reads of the original dump 
stream.  Besides the expense of this approach,  I am not certain that repeated 
zfs send streams have exactly the same byte content.

I think that probably the best approach would be the interleaved streams.

That said, I am not sure how this would necessarily help with the situation you 
describe.  Isn't the limiting factor going to be the network bandwidth between 
remote machines?  Won't you end up with four streams running at quarter speed?


Re: [zfs-discuss] Announce: zfsdump

2010-06-29 Thread Asif Iqbal
On Tue, Jun 29, 2010 at 8:17 AM, Tristram Scott
tristram.sc...@quantmodels.co.uk wrote:

 would be nice if i could pipe the zfs send stream to a split and then
 send those split streams over the network to a remote system. it would
 help send it over to the remote system quicker. can your tool do that?

 something like this

                  s | - | j
    zfs send     p | - | o     zfs recv
    (local)      l | - | i     (remote)
                  i | - | n
                  t

  copy from the fifos to tape(s).


 Asif Iqbal

 I did look at doing this, with the intention of allowing simultaneous streams 
 to multiple tape drives, but put the idea to one side.

 I thought of providing interleaved streams, but wasn't happy with the idea 
 that the whole process would block when one of the pipes stalled.

 I also contemplated dividing the stream into several large chunks, but for 
 them to run simultaneously that seemed to require several reads of the 
 original dump stream.  Besides the expense of this approach,  I am not 
 certain that repeated zfs send streams have exactly the same byte content.

 I think that probably the best approach would be the interleaved streams.

 That said, I am not sure how this would necessarily help with the situation 
 you describe.  Isn't the limiting factor going to be the network bandwidth 
 between remote machines?  Won't you end up with four streams running at 
 quarter speed?

if, for example, the network pipe is bigger than one unsplit stream
of zfs send | zfs recv, then splitting it into multiple streams should
optimize the network bandwidth, shouldn't it?






-- 
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


Re: [zfs-discuss] Announce: zfsdump

2010-06-29 Thread Kyle McDonald

On 6/28/2010 10:30 PM, Edward Ned Harvey wrote:
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Tristram Scott

 If you would like to try it out, download the package from:
 http://www.quantmodels.co.uk/zfsdump/
 
 I haven't tried this yet, but thank you very much!
 
 Other people have pointed out bacula is able to handle multiple tapes, and
 individual file restores.  However, the disadvantage of
 bacula/tar/cpio/rsync etc is that they all have to walk the entire
 filesystem searching for things that have changed.

A compromise here might be to feed those tools the output from the new
ZFS diff command (which diffs two snapshots), when it arrives.

That might get something close to the best of both worlds.
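
A hypothetical sketch, assuming the command lands roughly as discussed (the
output format, dataset and snapshot names here are all assumptions):

    # list paths added or modified between two snapshots, as input
    # for a file-oriented backup tool
    zfs diff tank/home@monday tank/home@tuesday | \
        awk '$1 == "+" || $1 == "M" { print $2 }' > changed-files.list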

 -Kyle

 
 The advantage of zfs send (assuming incremental backups) is that it
 already knows what's changed, and it can generate a continuous datastream
 almost instantly.  Something like 1-2 orders of magnitude faster per
 incremental backup.
 


Re: [zfs-discuss] Announce: zfsdump

2010-06-29 Thread Tristram Scott
 
 if, for example, the network pipe is bigger than one unsplit stream
 of zfs send | zfs recv, then splitting it into multiple streams
 should optimize the network bandwidth, shouldn't it?
 

Well, I guess so.  But I wonder what the bottleneck is here.  If it is the 
rate at which zfs send can stream data, there is a good chance that is limited 
by disk reads.  If we split it into four pipes, I still think you are going to 
see four quarter-rate reads.


Re: [zfs-discuss] Announce: zfsdump

2010-06-29 Thread Tristram Scott

evik wrote:


Reading this list for a while made it clear that zfs send is not a
backup solution. It can be used for cloning the filesystem to a backup
array if you are consuming the stream with zfs receive, so you get
notified immediately about errors. Even one bitflip will render the
stream unusable, and you will lose all data, not just part of your
backup, because zfs receive will restore the whole filesystem or nothing
at all depending on the correctness of the stream.

You can use par2 or something similar to try to protect the stream
against bit flips, but that would require a lot of free storage space
to recover from errors.

e


The all or nothing aspect does make me nervous, but there are things 
which can be done about it.  The first step, I think, is to calculate a 
checksum of the data stream(s).


 -k chkfile.
  Calculates MD5 checksums for  each  tape  and  for  the
  stream  as a whole. These are written to chkfile, or if
  specified as -, then to stdout.

Run the dump stream back through digest -a md5 and verify that it is intact.
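
A minimal sketch of that verification pass, assuming the checksums were saved
with -k at dump time (the device and block size are illustrative):

    dd if=/dev/rmt/1ln bs=1024k | digest -a md5   # recompute the tape's checksum
    cat chkfile                                   # compare with the saved value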

Certainly, using an error correcting code could help us out, but at 
additional expense, both computational and storage.


Personally, for disaster recovery purposes, I think that verifying the 
data after writing to tape is good enough.  What I am looking to guard 
against is the unlikely event that I have a hardware (or software) 
failure, or serious human error.  This is okay with the zfs send stream, 
unless, of course, we get a data corruption on the tape.  I think the 
correlation between hardware failure today and tape corruption since 
yesterday / last week when I last backed up must be pretty small.


In the event that I reach for the tape and find it corrupted, I go back 
a week to the previous full dump stream.


Clearly the strength of the backup solution needs to match the value of 
the data, and especially the cost of not having the data.  For our large 
database applications we mirror to a remote location, and use tape 
backup.  But still, I find the ability to restore the zfs filesystem 
with all its snapshots very useful, which is why I choose to work with 
zfs send.


Tristram





[zfs-discuss] Announce: zfsdump

2010-06-28 Thread Tristram Scott
For quite some time I have been using zfs send -R fsname@snapname | dd 
of=/dev/rmt/1ln to make a tape backup of my zfs file system.  A few weeks back 
the size of the file system grew larger than would fit on a single DAT72 
tape, and I once again searched for a simple solution to allow dumping of a zfs 
file system to multiple tapes.  Once again I was disappointed...

I expect there are plenty of other ways this could have been handled, but none 
leapt out at me.  I didn't want to pay large sums of cash for a commercial 
backup product, and I didn't see that Amanda would be an easy thing to fit into 
my existing scripts.  In particular (and I could well be reading this 
incorrectly), it seems that the commercial products, Amanda, and star are all 
dumping the zfs file system file by file (with or without ACLs).  I found none 
which would allow me to dump the file system and its snapshots, unless I used 
zfs send to a scratch disk, and dumped to tape from there.  But, of course, 
that assumes I have a scratch disk large enough.

So, I have implemented zfsdump as a ksh script.  The method is as follows:
1. Make a bunch of fifos.
2. Pipe the stream from zfs send to split, with split writing to the fifos (in 
sequence).
3. Use dd to copy from the fifos to tape(s).

When the first tape is complete, zfsdump returns.  One then calls it again, 
specifying that the second tape is to be used, and so on.
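
A minimal sketch of the mechanics (not the packaged script; the fifo names and
chunk size are illustrative):

    # 1. make fifos where split will look for its output files
    mkfifo /tmp/zd.aa /tmp/zd.ab
    # 2. split the send stream across the fifos, 36GB per chunk
    zfs send -R tank@Tues | split -b 36864m - /tmp/zd. &
    # 3. copy each fifo to tape in turn, swapping tapes in between
    dd if=/tmp/zd.aa of=/dev/rmt/1ln bs=1024k
    dd if=/tmp/zd.ab of=/dev/rmt/1ln bs=1024k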

From the man page:

 Example 1.  Dump the @Tues snapshot of the  tank  filesystem
 to  the  non-rewinding,  non-compressing  tape,  with a 36GB
 capacity:

  zfsdump -z tank@Tues -a -R -f /dev/rmt/1ln  -s  36864 -t 0

 For the second tape:

  zfsdump -z tank@Tues -a -R -f /dev/rmt/1ln  -s  36864 -t 1

If you would like to try it out, download the package from:
http://www.quantmodels.co.uk/zfsdump/

I have packaged it up, so do the usual pkgadd stuff to install.

Please, though, *try this out with caution*.  Build a few test file 
systems, and see that it works for you. 
*It comes without warranty of any kind.*


Tristram


Re: [zfs-discuss] Announce: zfsdump

2010-06-28 Thread Brian Kolaci

I use Bacula, which works very well (much better than Amanda did).
You may be able to customize it to do direct zfs send/receive; however, I find 
that although they are great for copying file systems to other machines, they 
are inadequate for backups unless you always intend to restore the whole file 
system.  Most people want to restore a file or directory tree of files, not a 
whole file system.  In the past 25 years of backups and restores, I've never 
had to restore a whole file system.  I get requests for a few files, or 
somebody's mailbox or somebody's http document root.
You can directly install it from CSW (or blastwave).

On 6/28/2010 11:26 AM, Tristram Scott wrote:

[snip]




Re: [zfs-discuss] Announce: zfsdump

2010-06-28 Thread Tristram Scott
 I use Bacula which works very well (much better than
 Amanda did).
 You may be able to customize it to do direct zfs
 send/receive, however I find that although they are
 great for copying file systems to other machines,
 they are inadequate for backups unless you always
 intend to restore the whole file system.  Most people
 want to restore a file or directory tree of files,
 not a whole file system.  In the past 25 years of
 backups and restores, I've never had to restore a
 whole file system.  I get requests for a few files,
 or somebody's mailbox or somebody's http document
 root.
 You can directly install it from CSW (or blastwave).

Thanks for your comments, Brian.  I should look at Bacula in more detail.

As for full restore versus ad hoc requests for files I just deleted, my 
experience is mostly similar to yours, although I have had need for full system 
restore more than once.

For the restore of a few files here and there, I believe this is now well 
handled with zfs snapshots.  I have always found these requests to be down to 
human actions.  The need for full system restore has (almost) always been 
hardware failure. 

If the file was there an hour ago, or yesterday, or last week, or last month, 
then we have it in a snapshot.
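
For example (the paths and snapshot name are illustrative), yesterday's copy
of a file can be pulled straight out of the hidden snapshot directory:

    cp /tank/home/.zfs/snapshot/daily-2010-06-27/docs/report.txt /tank/home/docs/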

If the disk died horribly during a power outage (grrr!) then it would be very 
nice to be able to restore not only the full file system but also the 
snapshots.  The only way I know of achieving that is by using zfs send etc.

 
 On 6/28/2010 11:26 AM, Tristram Scott wrote:
[snip]

 
  Tristram
 



Re: [zfs-discuss] Announce: zfsdump

2010-06-28 Thread Brian Kolaci

On Jun 28, 2010, at 12:26 PM, Tristram Scott wrote:

 [snip]
 
 Thanks for your comments, Brian.  I should look at Bacula in more detail.
 
 As for full restore versus ad hoc requests for files I just deleted, my 
 experience is mostly similar to yours, although I have had need for full 
 system restore more than once.
 
 For the restore of a few files here and there, I believe this is now well 
 handled with zfs snapshots.  I have always found these requests to be down to 
 human actions.  The need for full system restore has (almost) always been 
 hardware failure. 
 
 If the file was there an hour ago, or yesterday, or last week, or last month, 
 then we have it in a snapshot.
 
 If the disk died horribly during a power outage (grrr!) then it would be very 
 nice to be able to restore not only the full file system, but also the 
 snapshots too.  The only way I know of achieving that is by using zfs send 
 etc.  
 

I like snapshots when I'm making a major change to the system or for cloning.  
So to me, snapshots are good for transaction-based operations, such as stopping 
and flushing a database, taking a snapshot, then resuming the database.  
Then you can back up the snapshot with Bacula and destroy the snapshot when the 
backup is complete.  I have Bacula configured with pre-backup and post-backup 
scripts to do just that.  When you do the restore, it will create something 
that looks like a snapshot from the file system perspective, but isn't really 
one.
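
A minimal sketch of such a hook script (the dataset name is illustrative; it
would be wired in through Bacula's ClientRunBeforeJob and ClientRunAfterJob
directives, with Bacula then backing up the .zfs/snapshot path):

    #!/bin/ksh
    # pre/post hook: "create" before the job runs, "destroy" after it finishes
    case "$1" in
    create)  zfs snapshot tank/db@bacula ;;
    destroy) zfs destroy  tank/db@bacula ;;
    esac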

But if you're looking for a copy of a file from a specific date, Bacula retains 
that.  In fact, you specify the retention period you want, and you'll have access 
to any/all individual files on a per-date basis.  You can retain the files for 
months or years if you like, and you specify in the Bacula config file how long 
you want to keep the tapes around.  So it really comes down to your use-case.




Re: [zfs-discuss] Announce: zfsdump

2010-06-28 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Tristram Scott
 
 If you would like to try it out, download the package from:
 http://www.quantmodels.co.uk/zfsdump/

I haven't tried this yet, but thank you very much!

Other people have pointed out bacula is able to handle multiple tapes, and
individual file restores.  However, the disadvantage of
bacula/tar/cpio/rsync etc is that they all have to walk the entire
filesystem searching for things that have changed.

The advantage of zfs send (assuming incremental backups) is that it
already knows what's changed, and it can generate a continuous datastream
almost instantly.  Something like 1-2 orders of magnitude faster per
incremental backup.
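
For example (the snapshot names are illustrative), an incremental send
transmits only the blocks changed between the two snapshots:

    zfs send -i tank@sunday tank@monday | dd of=/dev/rmt/1ln bs=1024k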



Re: [zfs-discuss] Announce: zfsdump

2010-06-28 Thread Asif Iqbal
On Mon, Jun 28, 2010 at 11:26 AM, Tristram Scott
tristram.sc...@quantmodels.co.uk wrote:
 For quite some time I have been using zfs send -R fsname@snapname | dd 
 of=/dev/rmt/1ln to make a tape backup of my zfs file system.  A few weeks 
 back the size of the file system grew larger than would fit on a single 
 DAT72 tape, and I once again searched for a simple solution to allow dumping 
 of a zfs file system to multiple tapes.  Once again I was disappointed...

 I expect there are plenty of other ways this could have been handled, but 
 none leapt out at me.  I didn't want to pay large sums of cash for a 
 commercial backup product, and I didn't see that Amanda would be an easy 
 thing to fit into my existing scripts.  In particular, (and I could well be 
 reading this incorrectly) it seems that the commercial products, Amanda, 
 star, all are dumping the zfs file system file by file (with or without 
 ACLs).  I found none which would allow me to dump the file system and its 
 snapshots, unless I used zfs send to a scratch disk, and dumped to tape from 
 there.  But, of course, that assumes I have a scratch disk large enough.

 So, I have implemented zfsdump as a ksh script.  The method is as follows:
 1. Make a bunch of fifos.
 2. Pipe the stream from zfs send to split, with split writing to the fifos 
 (in sequence).

would be nice if i could pipe the zfs send stream to a split and then
send those split streams over the
network to a remote system. it would help send it over to the remote
system quicker. can your tool do that?

something like this

                 s | - | j
   zfs send     p | - | o     zfs recv
   (local)      l | - | i     (remote)
                 i | - | n
                 t


 3. Use dd to copy from the fifos to tape(s).

 [snip]




-- 
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?