Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-12 Thread Jürgen Keil
 The problem was with the shell.  For whatever reason,
 /usr/bin/ksh can't rejoin the files correctly.  When
 I switched to /sbin/sh, the rejoin worked fine, the
 cksums matched, ...
 
 The ksh I was using is:
 
 # what /usr/bin/ksh
 /usr/bin/ksh:
 Version M-11/16/88i
 SunOS 5.10 Generic 118873-04 Aug 2006
 
 So, is this a bug in the ksh included with Solaris 10? 

Are you able to reproduce the issue with a script like this
(needs ~200 gigabytes of free disk space)?  I can't...

==
% cat split.sh
#!/bin/ksh

bs=1k
count=`expr 57 \* 1024 \* 1024`
split_bs=8100m

set -x

dd if=/dev/urandom of=data.orig bs=${bs} count=${count}
split -b ${split_bs} data.orig data.split.
ls -l data.split.*
cat data.split.a[a-z] > data.join
cmp -l data.orig data.join
==


On SX:CE / OpenSolaris the same version of /bin/ksh = /usr/bin/ksh
is present:

% what /usr/bin/ksh
/usr/bin/ksh:
Version M-11/16/88i
SunOS 5.11 snv_104 November 2008

I ran the script in a directory on an uncompressed zfs filesystem:

% ./split.sh 
+ dd if=/dev/urandom of=data.orig bs=1k count=59768832
59768832+0 records in
59768832+0 records out
+ split -b 8100m data.orig data.split.
+ ls -l data.split.aa data.split.ab data.split.ac data.split.ad data.split.ae 
data.split.af data.split.ag data.split.ah
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:31 data.split.aa
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:35 data.split.ab
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:39 data.split.ac
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:43 data.split.ad
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:48 data.split.ae
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:53 data.split.af
-rw-r--r--   1 jk   usr  8493465600 Feb 12 18:58 data.split.ag
-rw-r--r--   1 jk   usr  1749024768 Feb 12 18:58 data.split.ah
+ cat data.split.aa data.split.ab data.split.ac data.split.ad data.split.ae 
data.split.af data.split.ag data.split.ah
+ 1> data.join
+ cmp -l data.orig data.join
2002.33u 2302.05s 1:51:06.85 64.5%


As expected, it works without problems.  The files are
bit-for-bit identical after splitting and joining.

To me this looks more as if your hardware is broken:
http://opensolaris.org/jive/thread.jspa?messageID=338148

A single bad bit (!) in the middle of the joined file is very suspicious...


Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-11 Thread Michael McKnight
Thanks to John K. and Richard E. for an answer that would have never, ever 
occurred to me...

The problem was with the shell.  For whatever reason, /usr/bin/ksh can't rejoin 
the files correctly.  When I switched to /sbin/sh, the rejoin worked fine, the 
cksums matched, and the zfs recv worked without a hitch.

The ksh I was using is:

# what /usr/bin/ksh
/usr/bin/ksh:
Version M-11/16/88i
SunOS 5.10 Generic 118873-04 Aug 2006

So, is this a bug in the ksh included with Solaris 10?  Should I file a bug 
report with Sun?  If so, how?  I don't have a support contract or anything.

Anyway, I'd like to thank you all for your valuable input and assistance in 
helping me work through this issue.

-Michael


Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-10 Thread Michael McKnight
Hi again everyone,

OK... I'm even more confused about what is happening here when I try to rejoin
the split 'zfs send' file...

When I cat the split files and pipe through cksum, I get the same cksum as the 
original (unsplit) zfs send snapshot:

#cat mypictures.zfssnap.split.a[a-d] |cksum
2375397256  27601696744

#cksum mypictures.zfssnap
2375397256  27601696744

But when I cat them into a file and then run cksum on the file, it results in a 
different cksum:

#cat mypictures.zfssnap.split.a[a-d] > testjoin3
#cksum testjoin3
3408767053  27601696744 testjoin3

I am at a loss as to what on Earth is happening here!  The resulting file size
is the same as the original, but why does cat produce a different cksum when
piped vs. redirected to a file?

In each case where I have run 'cmp -l' on the resulting file, there is a single 
byte that has the wrong value.  What could cause this?  

Any ideas would be greatly appreciated.

Thanks (again) to all in advance,
-Michael


Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-06 Thread Miles Nordin
 re == Richard Elling richard.ell...@gmail.com writes:

re The reason is that zfs send/recv has very good application,
re even in the backup space.  There are, in fact, many people
re using it.

[...]

re ZFS send is not an archival solution.  You should use an
re archival method which is appropriate for your business
re requirements.  Note: method above, not product or command.

well, I think most backups are archival.  If we start arguing about
words, I think everyone's lost interest long ago.  But I do think to
protect oneself from bad surprises it would be good to never archive
the output of 'zfs send', only use it to move data from one place to
another.

yes, backup ``method'', moving data from one place to another is often
part of backup and can be done safely with 'zfs send | zfs recv', but
without a specific warning people will imagine something's safe which
isn't, when you say the phrase ``use zfs send for backup''.

re CR 6764193 was fixed in b105

 I read someone ran into it when importing a pool, too, not just
 when using 'zfs send'.  so hopefully that fix came for free at
 the same time.

re Perhaps your memory needs to be using checksum=sha256 :-) I do
re not recall such a conversation or bug.

fine, here you go:

http://mail.opensolaris.org/pipermail/zfs-discuss/2008-December/053894.html




Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-06 Thread Richard Elling
my last contribution to this thread (and there was much rejoicing!)

Miles Nordin wrote:
 re == Richard Elling richard.ell...@gmail.com writes:
 

 re The reason is that zfs send/recv has very good application,
 re even in the backup space.  There are, in fact, many people
 re using it.

 [...]

 re ZFS send is not an archival solution.  You should use an
 re archival method which is appropriate for your business
 re requirements.  Note: method above, not product or command.

 well, I think most backups are archival.

Disagree.  Archives tend to not be overwritten, ever.  Backups have all
sorts of management schemes to allow the backup media to be reused.

   If we start arguing about
 words, I think everyone's lost interest long ago.  But I do think to
 protect oneself from bad surprises it would be good to never archive
 the output of 'zfs send', only use it to move data from one place to
 another.

 yes, backup ``method'', moving data from one place to another is often
 part of backup and can be done safely with 'zfs send | zfs recv', but
 without a specific warning people will imagine something's safe which
 isn't, when you say the phrase ``use zfs send for backup''.

 re CR 6764193 was fixed in b105

  I read someone ran into it when importing a pool, too, not just
  when using 'zfs send'.  so hopefully that fix came for free at
  the same time.

 re Perhaps your memory needs to be using checksum=sha256 :-) I do
 re not recall such a conversation or bug.

 fine, here you go:

 http://mail.opensolaris.org/pipermail/zfs-discuss/2008-December/053894.html
   

Bzzt.  Thanks for playing.  That is:
re CR 6764193 was fixed in b105
re http://bugs.opensolaris.org/view_bug.do?bug_id=6764193 Is
re there another?

 -- richard



Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-05 Thread Joerg Schilling
Miles Nordin car...@ivy.net wrote:

  tt == Toby Thain t...@telegraphics.com.au writes:

 tt I know this was discussed a while back, but in what sense does
 tt tar do any of those things? I understand that it is unlikely
 tt to barf completely on bitflips, but won't tar simply silently
 tt de-archive bad data?

 yeah, I just tested it, and you're right.  I guess the checksums are
 only for headers.  However, cpio does store checksums for files'
 contents, so maybe it's better to use cpio than tar.  Just be careful
 how you invoke it, because there are different cpio formats just like
 there are different tar formats, and some might have no or weaker
 checksum.

cpio is a deprecated archive format. As it is hard to enhance the features of 
cpio without breaking archive compatibility, POSIX defines a standard archive 
format that is based on tar and made very extensible.

BTW: if you are on ZFS, ZFS should prevent flipping bits in archives ;-)

Jörg

-- 
 EMail: jo...@schily.isdn.cs.tu-berlin.de (home)  Jörg Schilling  D-13353 Berlin
        j...@cs.tu-berlin.de (uni)
        joerg.schill...@fokus.fraunhofer.de (work)
 Blog:  http://schily.blogspot.com/
 URL:   http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily


Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-05 Thread CLF
 I use the following command to convert them back into a single file:
 #cat mypictures.zfssnap.split.a[a-g] > testjoin
 

Maybe I'm missing the point, but this command won't give you what you're
after - in bash you want:

# cat mypictures.zfssnap.split.a{a..g} > testjoin

Chris



Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-05 Thread Casper . Dik

 I use the following command to convert them back into a single file:
 #cat mypictures.zfssnap.split.a[a-g] > testjoin
 

Maybe I'm missing the point, but this command won't give you what you're
after - in bash you want:

# cat mypictures.zfssnap.split.a{a..g} > testjoin


The first should work (unless they really broke the shell).
(Yes, I tested it, and yes, it works.)
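
A quick way to convince yourself, using throwaway files rather than the real pieces: both patterns expand to the same ordered list.  The only difference is that {a..g} needs bash (or ksh93) and expands whether or not the files exist.

% touch piece.aa piece.ab piece.ac piece.ad piece.ae piece.af piece.ag
% echo piece.a[a-g]
piece.aa piece.ab piece.ac piece.ad piece.ae piece.af piece.ag
% bash -c 'echo piece.a{a..g}'
piece.aa piece.ab piece.ac piece.ad piece.ae piece.af piece.ag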

Casper



Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-05 Thread David Dyer-Bennet

On Thu, February 5, 2009 06:39, casper@sun.com wrote:

 I use the following command to convert them back into a single file:
 #cat mypictures.zfssnap.split.a[a-g] > testjoin


Maybe I'm missing the point, but this command won't give you what you're
after - in bash you want:

# cat mypictures.zfssnap.split.a{a..g} > testjoin


 The first should work (unless they really broke the shell).
 (Yes, I tested it, and yes, it works.)

Good, because that's a syntax I still remember and use.  And it has indeed
worked for me recently as well.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-05 Thread Miles Nordin
 re == Richard Elling richard.ell...@gmail.com writes:

re Indeed, but perhaps you'll find the grace to file an
re appropriate RFE?

for what?  The main problem I saw was with the wiki not warning people
away from archiving 'zfs send' emphatically enough, for example by
comparing its archival characteristics to tar (or checksummed cpio)
files and explaining that 'zfs send's output needs to be ephemeral.

This is RFE-worthy:

 * unresolved bugs.  ``poisonous streams'' causing kernel panics
 when you receive them,
 http://www.opensolaris.org/jive/thread.jspa?threadID=81613&tstart=0

but I'm not having the problem, so I won't file it when I can't
provide information.

 * stream format is not guaranteed to be forward compatible

re Backward compatibility is achieved.

I've read complaints where the zfs filesystem version has to match.
People _have_ reported compatibility problems.  Maybe it is true that
a newer system can always receive an older stream, but not vice-versa.
I'd not wish for more, and that removes this (but not other)
objections to archiving 'zfs send'.

not entirely though---When you archive it you care about whether
you'll be able to read it years from now.  Suppose there IS some
problem receiving an old stream on a new system.  Even if there's not
supposed to be, and even if there isn't right now, a bug may appear
later.  I think it's less likely to get fixed than a bug importing an
old zpool.  so, archive the zpool, not 'zfs send' output.

re An enterprising community member could easily put together a
re utility to do a verification.  All of the necessary code is
re readily available.

fine, but (a) what CAN be written doesn't change the fact that the
tool DOES NOT EXIST NOW, and the possibility of writing one isn't
enough to make archiving 'zfs send' streams a better idea which is
what I'm discussing, and (b) it's my opinion a thorough tool is not
possible, because as I said, a bunch of kernel code is implicated in
the zfs recv which is full of assertions itself.  'zfs recv' is
actually panicing boxes.  so I'd not have faith in some userspace
tool's claim that a stream is good, since it's necessarily using
different code than the actual extraction.  'tar t', 'cpio -it', and I
think 'zpool scrub' don't use separate code paths for verification.

 * supposed to be endian-independent, but isn't.
 

re CR 6764193 was fixed in b105
re http://bugs.opensolaris.org/view_bug.do?bug_id=6764193 Is
re there another?

no, no other, that is what I remember reading.  I read someone ran
into it when importing a pool, too, not just when using 'zfs send'.
so hopefully that fix came for free at the same time.

re I suggest you consider an Enterprise Backup Solution.

I prefer Free Software, especially for archival.  But I will consider
the advice I gave: backup to another zpool, or to a tar/cpio file.

I do not have a problem with the way 'zfs send' works.  For
replication-like incremental backups, rolling back the entire recv for
one flipped bit is quite defensible.  the lazy panics aren't, but the
architectural decision to trash a whole stream and all its descendant
incrementals for one flipped bit DOES make sense to me.  but 'zfs
send' shouldn't be archived!  That is what I'm saying, not 
``zfs send | zfs recv sucks'', just that it shouldn't be archived.




Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-05 Thread Michael McKnight
Hi everyone,

I appreciate the discussion on the practicality of archiving ZFS sends, but 
right now I don't know of any other options.  I'm a home user, so 
Enterprise-level solutions aren't available and as far as I know, tar, cpio, 
etc. don't capture ACL's and other low-level filesystem attributes.   Plus, 
they are all susceptible to corruption while in storage, making recovery no 
more likely than with a zfs send.

The checksumming capability is a key factor to me.  I would rather not be able 
to restore the data than to unknowingly restore bad data.  This is the biggest 
reason I started using ZFS in the first place.  Too many cases of invisible file
corruption.  Admittedly, it would be nicer if zfs recv would flag individual 
files with checksum problems rather than completely failing the restore.

What I need is a complete snapshot of the filesystem (i.e. ufsdump) and, correct 
me if I'm wrong, but zfs send/recv is the closest (only) thing we have.  And I 
need to be able to break up this complete snapshot into pieces small enough to 
fit onto a DVD-DL.

So far, using ZFS send/recv works great as long as the files aren't split.  I 
have seen suggestions on using something like 7z (?) instead of split as an 
option.  Does anyone else have any other ideas on how to successfully break up 
a send file and join it back together?
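
One thing that would at least localize a failure later is a per-piece cksum manifest burned alongside the pieces (a sketch, reusing the filenames from the earlier posts):

split -b 8100m mypictures.zfssnap mypictures.zfssnap.split.
cksum mypictures.zfssnap.split.* > mypictures.cksums     # burn this file too

# after copying the pieces back from the DVDs:
cksum mypictures.zfssnap.split.* | diff - mypictures.cksums \
    && cat mypictures.zfssnap.split.a[a-z] > testjoin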

Thanks again,
Michael


Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-05 Thread David Dyer-Bennet

On Thu, February 5, 2009 14:15, Michael McKnight wrote:

 I appreciate the discussion on the practicality of archiving ZFS sends,
 but right now I don't know of any other options.  I'm a home user, so
 Enterprise-level solutions aren't available and as far as I know, tar,
 cpio, etc. don't capture ACL's and other low-level filesystem attributes.
  Plus, they are all susceptible to corruption while in storage, making
 recovery no more likely than with a zfs send.

Your big constraint is using optical disks.  Certainly there are arguments
for single-use media for a backup, but a series of optical disks
containing a data stream gives rise to a nasty probability that *one* disk
in the set won't be readable, which will render everything after that
unrecoverable too.   .99 ^ 56 = .57, which is not a probability *I* want
to see of fully recovering my data.  (.99 is probably pessimistic, though.
 I hope.)  (56 disks is how many my backup would take on DVD-DL disks, and
is why I don't do it that way.)
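
For reference, the arithmetic behind that figure, plus the same calculation for a hypothetical 99.9%-reliable disc, just to show how sensitive the result is:

% awk 'BEGIN { printf "0.99^56 = %.2f   0.999^56 = %.2f\n", 0.99^56, 0.999^56 }'
0.99^56 = 0.57   0.999^56 = 0.95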

External hard drives give you a lot more options.  I'm formatting external
USB drives as a ZFS pool, and then rsyncing data to them.  I can scrub
them for verification, and I can easily access individual files.  I create
snapshots on them so that I can have generations of backup accessible
without duplicating data that hasn't changed.  I'm currently updating them
via rsync, which doesn't propagate ACLs, but I could and should be using
send/receive instead, which would.  I believe I've figured out the logic,
but haven't updated the script.  If you do it with send/receive, you get a
snapshot on the backup drive that's identical (modulo ZFS bugs) with the
original, and which you can scrub to verify when you want, etc. 
Furthermore, I don't have to be physically present to change and label and
file 56 DVD-DL disks.
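
Roughly, the rsync variant looks like this (a sketch, not my actual script; the pool and path names are made up):

#!/bin/ksh
# one-time setup (the USB disk shows up as some c#t#d#):
#   zpool create backuppool c5t0d0
POOL=backuppool
SRC=/export/home
STAMP=`date +%Y%m%d`

zpool import $POOL                     # pool was exported after the last run
rsync -a --delete $SRC/ /$POOL/home/   # file contents only, no ACLs
zfs snapshot $POOL@$STAMP              # keep one generation per run
zpool scrub $POOL                      # start end-to-end verification
# wait for 'zpool status' to show the scrub done, then: zpool export $POOL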

Looks like DL disks are of similar price (per GB) to external USB drives
-- and external drives can be used for more than one backup.  (Rather
similar meaning within a factor of two either way; I only checked prices
one place.)

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-05 Thread Miles Nordin
 mm == Michael McKnight michael_mcknigh...@yahoo.com writes:

mm as far as I know, tar, cpio, etc. don't capture ACL's and
mm other low-level filesystem attributes.

Take another look with whatever specific ACLs you're using.  Some of
the cpio formats will probably work; I think there was a thread in here
about ACL copying working in cpio but not pax.  You have to try it.

mm Plus, they are all susceptible to corruption while in storage,

yes, of course there are no magic beans.

mm making recovery no more likely than with a zfs send.

nonsense.  With 'zfs send', recovery is impossible with any corruption.
With tar/cpio, partial recovery is the rule, not the exception.  This
is a difference.  A big one.

And I am repeating myself, over and over.  I am baffled as to why this
is so disputable.

mm The checksumming capability is a key factor to me.

Follow the thread.  cpio does checksumming, at least with some of the
stream formats, and I showed an example of how to check that the
checksums are working, and prove they are missing from tar.

mm I would rather not be able to restore the data than to
mm unknowingly restore bad data.

I suppose that makes sense, but only for certain really specific kinds
of data that most peopple don't have.  Of course being warned would be
nice, but I've rarely wanted to be warned by losing everything, even
files far away from the bit flip.  I'd rather not be warned than get
that kind of warning, most of the time.  especially for a backup.

OTOH if you're hauling the data from one place to another and throwing
away the DVDR when you get it there, then maybe zfs send is
appropriate.  In that case you are not archiving the zfs send stream,
but rather the expanded zpool in the remote location, which is how
it's meant to be used.

mm it would be nicer if zfs recv would flag individual files
mm with checksum problems rather than completely failing the
mm restore.

It would be nice, but I suspect it's hard to do this and preserve the
incremental dump feature.  There are too many lazy panics as is
without wishing for incrementals to roll forward from a corrupt base.

Also, I think, architecturally, replication and storage should not be
mixed because the goals when errors occur are so different.  Fixing
this problem at the cost of making replication jobs less reliable
would be a bad thing, so I like separate tools, and unstorable zfs
send.

mm What I need is a complete snapshot of the filesystem
mm (ie. ufsdump) and, correct me if I'm wrong, but zfs send/recv
mm is the closest (only) thing we have.

Using 'zfs send | zfs recv' to replicate one zpool into another zpool
is a second option---store the destination pool on DVDR, not the
stream.  If you have enough space to store disk images of the second
zpool, which it sounds like you do, then once you get 'split' working
you can split it up and write it to DVDR, too.  Or you can let ZFS do
the splitting, and make DVD-size vdev's, export the pool, and burn
them.  It's not as robust as a split cpio when faced with a lost
DVD, but it's worlds better than a split 'zfs send'.
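
A sketch of that DVD-sized-vdev idea (sizes and names are made up, and note that the file vdevs stripe, so every burned disc is needed to import the pool again):

# four ~8 GB file vdevs, one per DVD-DL
for n in 1 2 3 4; do
    mkfile 8100m /export/scratch/dvd.$n
done
zpool create dvdpool /export/scratch/dvd.1 /export/scratch/dvd.2 \
    /export/scratch/dvd.3 /export/scratch/dvd.4
zfs send pool_01/pictures@backup | zfs recv dvdpool/pictures
zpool export dvdpool
# burn /export/scratch/dvd.1 ... dvd.4, one file per disc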

for your 'split' problem, I know I have used 'split' in the way you
want, but I would have been using GNU split.  Bob suggested beware of
split's line-orientedness (be sure to use -b).  A couple other people
suggested using bash's {a..z} syntax rather than plain globbing to
make sure you're combining the pieces in the right order.  There is
/usr/gnu/bin/split and /usr/5bin/split on my system in addition to
/usr/bin/split so you've a couple others to try.  You're checking that
it's working the right way, with md5sum, so at least you already have
enough tools to narrow the problem away from ZFS.  If you get really
desperate, you can use dd's skip= and count= options to emulate split,
and still use cat to combine.
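
And a sketch of the dd fallback (piece size and names are placeholders, matching the 8100m pieces used earlier in the thread):

#!/bin/ksh
# cut the stream into 8100 MB pieces with dd, then rejoin with cat
bs=1024k
blocks=8100                            # 8100 x 1 MB per piece
i=0
while :; do
    piece=mypictures.zfssnap.dd.`printf %03d $i`
    dd if=mypictures.zfssnap of=$piece bs=$bs count=$blocks \
        skip=`expr $i \* $blocks` 2>/dev/null
    if [ ! -s $piece ]; then           # empty piece means we ran past EOF
        rm -f $piece
        break
    fi
    i=`expr $i + 1`
done

cat mypictures.zfssnap.dd.[0-9][0-9][0-9] > testjoin.dd
cmp mypictures.zfssnap testjoin.dd && echo pieces rejoin cleanly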

Also check the filesizes.  If you have a 2GB filesize ulimit set, that
could mess up the stdout redirection, but on my Solaris system it
seems to default to unlimited.




Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-04 Thread Toby Thain

On 4-Feb-09, at 6:19 AM, Michael McKnight wrote:

 Hello everyone,

 I am trying to take ZFS snapshots (ie. zfs send) and burn them to  
 DVD's for offsite storage.  In many cases, the snapshots greatly  
 exceed the 8GB I can stuff onto a single DVD-DL.

 In order to make this work, I have used the split utility ...
 I use the following command to convert them back into a single file:
 #cat mypictures.zfssnap.split.a[a-g] > testjoin

 But when I compare the checksum of the original snapshot to that of  
 the rejoined snapshot, I get a different result:

Tested your RAM lately?

--Toby



 -Michael


Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-04 Thread Miles Nordin
 mm == Michael McKnight michael_mcknigh...@yahoo.com writes:

mm #split -b8100m ./mypictures.zfssnap mypictures.zfssnap.split.
mm #cat mypictures.zfssnap.split.a[a-g] > testjoin

mm But when I compare the checksum of the original snapshot to
mm that of the rejoined snapshot, I get a different result:

sounds fine.  I'm not sure why it's failing.

mm And when I try to restore the filesystem, I get the following
mm failure: #zfs recv pool_01/test < ./testjoin cannot receive
mm new filesystem stream: invalid stream (checksum mismatch)

however, aside from this problem you're immediately having, I think
you should never archive the output of 'zfs send'.  I think the
current warning on the wiki is not sufficiently drastic, but when I
asked for an account to update the wiki I got no answer.  Here are the
problems, again, with archiving 'zfs send' output:

 * no way to test the stream's integrity without receiving it.
   (meaning, to test a stream, you need enough space to store the
   stream being tested, plus that much space again.  not practical.)
   A test could possibly be hacked up, but because the whole ZFS
   software stack is involved in receiving, and is full of assertions
   itself, any test short of actual extraction wouldn't be a thorough
   test, so this is unlikely to change soon.

 * stream format is not guaranteed to be forward compatible with new
   kernels.  and versioning may be pickier than zfs/zpool versions.

 * stream is expanded _by the kernel_, so even if tar had a
   forward-compatibility problem, which it won't, you could
   hypothetically work around it by getting an old 'tar'.  For 'zfs
   send' streams you have to get an entire old kernel, and boot it on
   modern hardware, to get at your old stream.

 * supposed to be endian-independent, but isn't.

 * stream is ``protected'' from corruption in the following way: if a
   single bit is flipped anywhere in the stream, the entire stream and
   all incrementals descended from it become worthless.  It is
   EXTREMELY corruption-sensitive.  'tar' and zpool images both
   detect, report, work around, flipped bits.  The 'zfs send' idea is
   different: if there's corruption, the designers assume you can just
   restart the 'zfs send | zfs recv' until you get a clean go---what
   you most need is ability to atomically roll back the failed recv,
   which you do get.  You are not supposed to be archiving it!

 * unresolved bugs.  ``poisonous streams'' causing kernel panics when
   you receive them, 
http://www.opensolaris.org/jive/thread.jspa?threadID=81613&tstart=0

The following things do not have these problems:

 * ZFS filesystems inside file vdev's (except maybe the endian
   problem.  and also the needs-whole-kernel problem, but mitigated by
   better forward-compatibility guarantees.)

 * tar files

In both alternatives you probably shouldn't use gzip on the resulting
file.  If you must gzip, it would be better to make a bunch of tar.gz
files, ex., one per user, and tar the result.  Maybe I'm missing some
magic flag, but I've not gotten gzip to be too bitflip-resilient.
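
Something like this, for the per-user idea (a sketch; the directory names are placeholders), so a flipped bit costs at most one user's archive:

mkdir /var/tmp/peruser
cd /export/home
for u in *; do
    tar cf - "$u" | gzip -9 > /var/tmp/peruser/"$u".tar.gz
done
cd /var/tmp/peruser
tar cf /var/tmp/home.tar *.tar.gz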

The wiki cop-out is a nebulous ``enterprise backup `Solution' ''.
Short of that you might make a zpool in a file with zfs compression
turned on and rsync or cpio or zfs send | zfs recv the data into it.

Or just use gtar like in the old days.  With some care you may even be
able to convince tar to write directly to the medium.  And when you're
done you can do a 'tar t' directly from medium also, to check it.  I'm
not sure what to do about incrementals.  There is a sort of halfass
incremental feature in gtar, but not like what ZFS gives.




Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-04 Thread David Dyer-Bennet

On Wed, February 4, 2009 12:01, Miles Nordin wrote:

  * stream format is not guaranteed to be forward compatible with new
kernels.  and versioning may be pickier than zfs/zpool versions.

Useful points, all of them.  This particular one also points out something
I hadn't previously thought about -- using zfs send piped through ssh (or
in some other way going from one system to another) is also sensitive to
this versioning issue.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-04 Thread Bob Friesenhahn
On Wed, 4 Feb 2009, Toby Thain wrote:
 In order to make this work, I have used the split utility ...
 I use the following command to convert them back into a single file:
 #cat mypictures.zfssnap.split.a[a-g] > testjoin

 But when I compare the checksum of the original snapshot to that of
 the rejoined snapshot, I get a different result:

 Tested your RAM lately?

Split was originally designed to handle text files.  It may have 
problems with binary files.  Due to these issues, long ago (1993) I 
wrote a 'bustup' utility which works on binary files.  I have not 
looked at it since then.

Bob
==
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/



Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-04 Thread Toby Thain

On 4-Feb-09, at 2:29 PM, Bob Friesenhahn wrote:

 On Wed, 4 Feb 2009, Toby Thain wrote:
 In order to make this work, I have used the split utility ...
 I use the following command to convert them back into a single file:
 #cat mypictures.zfssnap.split.a[a-g] > testjoin

 But when I compare the checksum of the original snapshot to that of
 the rejoined snapshot, I get a different result:

 Tested your RAM lately?

 Split is originally designed to handle text files.  It may have  
 problems with binary files.

Ouch, OK.

--Toby

   Due to these issues, long ago (1993) I wrote a 'bustup' utility  
 which works on binary files.  I have not looked at it since then.

 Bob
 ==
 Bob Friesenhahn
 bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
 GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/




Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-04 Thread Miles Nordin
 tt == Toby Thain t...@telegraphics.com.au writes:

tt I know this was discussed a while back, but in what sense does
tt tar do any of those things? I understand that it is unlikely
tt to barf completely on bitflips, but won't tar simply silently
tt de-archive bad data?

yeah, I just tested it, and you're right.  I guess the checksums are
only for headers.  However, cpio does store checksums for files'
contents, so maybe it's better to use cpio than tar.  Just be careful
how you invoke it, because there are different cpio formats just like
there are different tar formats, and some might have no or weaker
checksum.

NetBSD 'pax' invoked as tar:
-8-
castrovalva:~$ dd if=/dev/zero of=t0 bs=1m count=1
1+0 records in
1+0 records out
1048576 bytes transferred in 0.022 secs (47662545 bytes/sec)
castrovalva:~$  tar cf t0.tar t0
castrovalva:~$ md5 t0.tar
MD5 (t0.tar) = 591a39a984f70fe3e44a5e13f0ac74b6
castrovalva:~$ tar tf t0.tar
t0
castrovalva:~$ dd of=t0.tar seek=$(( 512 * 1024 )) bs=1 conv=notrunc
asdfasdfasfs
13+0 records in
13+0 records out
13 bytes transferred in 2.187 secs (5 bytes/sec)
castrovalva:~$ md5 t0.tar
MD5 (t0.tar) = 14b3a9d851579d8331a0466a5ef62693
castrovalva:~$ tar tf t0.tar
t0
castrovalva:~$ tar xvf t0.tar
tar: Removing leading / from absolute path names in the archive
t0
tar: ustar vol 1, 1 files, 1054720 bytes read, 0 bytes written in 1 secs 
(1054720 bytes/sec)
castrovalva:~$ hexdump -C t0
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
0007fe00  61 73 64 66 61 73 64 66  61 73 66 73 0a 00 00 00  |asdfasdfasfs....|
0007fe10  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00100000
castrovalva:~$ 
-8-

GNU tar does the same thing.

NetBSD 'pax' invoked as cpio:
-8-
castrovalva:~$ dd if=/dev/zero of=t0 bs=1m count=1
1+0 records in
1+0 records out
1048576 bytes transferred in 0.018 secs (58254222 bytes/sec)
castrovalva:~$ cpio -H sv4cpio -o > t0.cpio
t0
castrovalva:~$ md5 t0.cpio
MD5 (t0.cpio) = d5128381e72ee514ced8ad10a5a33f16
castrovalva:~$ dd of=t0.cpio seek=$(( 512 * 1024 )) bs=1 conv=notrunc
asdfasdfasdf
13+0 records in
13+0 records out
13 bytes transferred in 1.461 secs (8 bytes/sec)
castrovalva:~$ md5 t0.cpio
MD5 (t0.cpio) = b22458669256da5bcb6c94948d22a155
castrovalva:~$ rm t0
castrovalva:~$ cpio -i < t0.cpio
cpio: Removing leading / from absolute path names in the archive
cpio: Actual crc does not match expected crc t0
-8-




Re: [zfs-discuss] ZFS snapshot splitting & joining

2009-02-04 Thread Richard Elling
Miles Nordin wrote:
 mm == Michael McKnight michael_mcknigh...@yahoo.com writes:
 

 mm #split -b8100m ./mypictures.zfssnap mypictures.zfssnap.split.
 mm #cat mypictures.zfssnap.split.a[a-g] > testjoin

 mm But when I compare the checksum of the original snapshot to
 mm that of the rejoined snapshot, I get a different result:

 sounds fine.  I'm not sure why it's failing.

 mm And when I try to restore the filesystem, I get the following
 mm failure: #zfs recv pool_01/test < ./testjoin cannot receive
 mm new filesystem stream: invalid stream (checksum mismatch)

 however, aside from this problem you're immediately having, I think
 you should never archive the output of 'zfs send'.  I think the
 current warning on the wiki is not sufficiently drastic, but when I
 asked for an account to update the wiki I got no answer.  Here are the
 problems, again, with archiving 'zfs send' output:

  * no way to test the stream's integrity without receiving it.
(meaning, to test a stream, you need enough space to store the
stream being tested, plus that much space again.  not practical.)
A test could possibly be hacked up, but because the whole ZFS
software stack is involved in receiving, and is full of assertions
itself, any test short of actual extraction wouldn't be a thorough
test, so this is unlikely to change soon.

  * stream format is not guaranteed to be forward compatible with new
kernels.  and versioning may be pickier than zfs/zpool versions.
   

Backward compatibility is achieved.

  * stream is expanded _by the kernel_, so even if tar had a
forward-compatibility problem, which it won't, you could
hypothetically work around it by getting an old 'tar'.  For 'zfs
send' streams you have to get an entire old kernel, and boot it on
modern hardware, to get at your old stream.
   

An enterprising community member could easily put together a
utility to do a verification.  All of the necessary code is readily
available.
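
Until someone writes that utility, a brute-force check that works today is to receive the stream into a throwaway file-backed pool and see whether it completes (a sketch; the scratch file size is a placeholder and has to be big enough to hold the received filesystem):

mkfile 40g /var/tmp/verify.img
zpool create verifypool /var/tmp/verify.img
zfs recv verifypool/check < mypictures.zfssnap && echo stream received cleanly
zpool destroy verifypool
rm /var/tmp/verify.img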

  * supposed to be endian-independent, but isn't.
   

CR 6764193 was fixed in b105
http://bugs.opensolaris.org/view_bug.do?bug_id=6764193
Is there another?

  * stream is ``protected'' from corruption in the following way: if a
single bit is flipped anywhere in the stream, the entire stream and
all incrementals descended from it become worthless.  It is
EXTREMELY corruption-sensitive.  'tar' and zpool images both
detect, report, work around, flipped bits.  The 'zfs send' idea is
different: if there's corruption, the designers assume you can just
restart the 'zfs send | zfs recv' until you get a clean go---what
you most need is ability to atomically roll back the failed recv,
which you do get.  You are not supposed to be archiving it!
   

This is not completely accurate.  Snapshots which are completed
are completed.

  * unresolved bugs.  ``poisonous streams'' causing kernel panics when
you receive them, 
 http://www.opensolaris.org/jive/thread.jspa?threadID=81613&tstart=0

 The following things do not have these problems:

  * ZFS filesystems inside file vdev's (except maybe the endian
problem.  and also the needs-whole-kernel problem, but mitigated by
better forward-compatibility guarantees.)
   

Indeed, but perhaps you'll find the grace to file an appropriate RFE?

  * tar files

 In both alternatives you probably shouldn't use gzip on the resulting
 file.  If you must gzip, it would be better to make a bunch of tar.gz
 files, ex., one per user, and tar the result.  Maybe I'm missing some
 magic flag, but I've not gotten gzip to be too bitflip-resilient.

 The wiki cop-out is a nebulous ``enterprise backup ``Solution' ''.
   

Perhaps it would satisfy you to enumerate the market's Enterprise
Backup Solutions?  This might be helpful since Solaris does not
include such software, at least by my definition of Solaris.  So, the wiki
section "Using ZFS With Enterprise Backup Solutions" does in fact
enumerate them, and I don't see any benefit to repeating the enumeration.
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Using_ZFS_With_Enterprise_Backup_Solutions

 Short of that you might make a zpool in a file with zfs compression
 turned on and rsync or cpio or zfs send | zfs recv the data into it.

 Or just use gtar like in the old days.  With some care you may even be
 able to convince tar to write directly to the medium.  And when you're
 done you can do a 'tar t' directly from medium also, to check it.  I'm
 not sure what to do about incrementals.  There is a sort of halfass
 incremental feature in gtar, but not like what ZFS gives.
   

I suggest you consider an Enterprise Backup Solution.
 -- richard
