Re: [zfs-discuss] ZFS snapshot splitting joining
> The problem was with the shell. For whatever reason, /usr/bin/ksh
> can't rejoin the files correctly. When I switched to /sbin/sh, the
> rejoin worked fine, the cksums matched, ... The ksh I was using is:
>
>     # what /usr/bin/ksh
>     /usr/bin/ksh:
>             Version M-11/16/88i
>             SunOS 5.10 Generic 118873-04 Aug 2006
>
> So, is this a bug in the ksh included with Solaris 10?

Are you able to reproduce the issue with a script like this (it needs
~200 gigabytes of free disk space)?  I can't...

==
% cat split.sh
#!/bin/ksh
bs=1k
count=`expr 57 \* 1024 \* 1024`
split_bs=8100m
set -x
dd if=/dev/urandom of=data.orig bs=${bs} count=${count}
split -b ${split_bs} data.orig data.split.
ls -l data.split.*
cat data.split.a[a-z] > data.join
cmp -l data.orig data.join
==

On SXCE / OpenSolaris the same version of /bin/ksh = /usr/bin/ksh is
present:

% what /usr/bin/ksh
/usr/bin/ksh:
        Version M-11/16/88i
        SunOS 5.11 snv_104 November 2008

I ran the script in a directory in an uncompressed zfs filesystem:

% ./split.sh
+ dd if=/dev/urandom of=data.orig bs=1k count=59768832
59768832+0 records in
59768832+0 records out
+ split -b 8100m data.orig data.split.
+ ls -l data.split.aa data.split.ab data.split.ac data.split.ad data.split.ae data.split.af data.split.ag data.split.ah
-rw-r--r--   1 jk       usr      8493465600 Feb 12 18:31 data.split.aa
-rw-r--r--   1 jk       usr      8493465600 Feb 12 18:35 data.split.ab
-rw-r--r--   1 jk       usr      8493465600 Feb 12 18:39 data.split.ac
-rw-r--r--   1 jk       usr      8493465600 Feb 12 18:43 data.split.ad
-rw-r--r--   1 jk       usr      8493465600 Feb 12 18:48 data.split.ae
-rw-r--r--   1 jk       usr      8493465600 Feb 12 18:53 data.split.af
-rw-r--r--   1 jk       usr      8493465600 Feb 12 18:58 data.split.ag
-rw-r--r--   1 jk       usr      1749024768 Feb 12 18:58 data.split.ah
+ cat data.split.aa data.split.ab data.split.ac data.split.ad data.split.ae data.split.af data.split.ag data.split.ah 1> data.join
+ cmp -l data.orig data.join
2002.33u 2302.05s 1:51:06.85 64.5%

As expected, it works without problems.
The files are bit-for-bit identical after splitting and joining.

To me this looks more as if your hardware is broken:
http://opensolaris.org/jive/thread.jspa?messageID=338148

A single bad bit (!) in the middle of the joined file is very
suspicious...
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS snapshot splitting joining
Thanks to John K. and Richard E. for an answer that would have never,
ever occurred to me...

The problem was with the shell. For whatever reason, /usr/bin/ksh can't
rejoin the files correctly. When I switched to /sbin/sh, the rejoin
worked fine, the cksums matched, and the zfs recv worked without a
hitch. The ksh I was using is:

    # what /usr/bin/ksh
    /usr/bin/ksh:
            Version M-11/16/88i
            SunOS 5.10 Generic 118873-04 Aug 2006

So, is this a bug in the ksh included with Solaris 10? Should I file a
bug report with Sun? If so, how? I don't have a support contract or
anything.

Anyway, I'd like to thank you all for your valuable input and
assistance in helping me work through this issue.

-Michael
Re: [zfs-discuss] ZFS snapshot splitting joining
Hi again everyone,

OK... I'm even more confused at what is happening here when I try to
rejoin the split zfs send file...

When I cat the split files and pipe through cksum, I get the same cksum
as the original (unsplit) zfs send snapshot:

    # cat mypictures.zfssnap.split.a[a-d] | cksum
    2375397256      27601696744
    # cksum mypictures.zfssnap
    2375397256      27601696744

But when I cat them into a file and then run cksum on the file, it
results in a different cksum:

    # cat mypictures.zfssnap.split.a[a-d] > testjoin3
    # cksum testjoin3
    3408767053      27601696744     testjoin3

I am at a loss as to what on Earth is happening here! The resulting
file size is the same as the original, but why does cat produce a
different cksum when piped vs. redirected to a file? In each case where
I have run 'cmp -l' on the resulting file, there is a single byte with
the wrong value. What could cause this?

Any ideas would be greatly appreciated.

Thanks (again) to all in advance,
-Michael
Re: [zfs-discuss] ZFS snapshot splitting joining
>>>>> "re" == Richard Elling <richard.ell...@gmail.com> writes:

    re> The reason is that zfs send/recv has very good application,
    re> even in the backup space. There are, in fact, many people
    re> using it. [...] ZFS send is not an archival solution. You
    re> should use an archival method which is appropriate for your
    re> business requirements. Note: method above, not product or
    re> command.

Well, I think most backups are archival. If we start arguing about
words, I think everyone lost interest long ago. But I do think that, to
protect oneself from bad surprises, it would be good to never archive
the output of 'zfs send', only use it to move data from one place to
another.

Yes, backup ``method'': moving data from one place to another is often
part of backup and can be done safely with 'zfs send | zfs recv', but
without a specific warning people will imagine something is safe which
isn't, when you say the phrase ``use zfs send for backup''.

    re> CR 6764193 was fixed in b105

I read that someone ran into it when importing a pool, too, not just
when using 'zfs send', so hopefully that fix came for free at the same
time.

    re> Perhaps your memory needs to be using checksum=sha256 :-) I do
    re> not recall such a conversation or bug.

Fine, here you go:
http://mail.opensolaris.org/pipermail/zfs-discuss/2008-December/053894.html
Re: [zfs-discuss] ZFS snapshot splitting joining
my last contribution to this thread (and there was much rejoicing!)

Miles Nordin wrote:
> well, I think most backups are archival.

Disagree. Archives tend to not be overwritten, ever. Backups have all
sorts of management schemes to allow the backup media to be reused.

> If we start arguing about words, I think everyone lost interest long
> ago. But I do think that, to protect oneself from bad surprises, it
> would be good to never archive the output of 'zfs send', only use it
> to move data from one place to another. [...]
>
>     re> CR 6764193 was fixed in b105
>
> I read that someone ran into it when importing a pool, too, not just
> when using 'zfs send', so hopefully that fix came for free at the
> same time.
>
>     re> Perhaps your memory needs to be using checksum=sha256 :-) I
>     re> do not recall such a conversation or bug.
>
> fine, here you go:
> http://mail.opensolaris.org/pipermail/zfs-discuss/2008-December/053894.html

Bzzt. Thanks for playing. That is:

    re> CR 6764193 was fixed in b105
    re> http://bugs.opensolaris.org/view_bug.do?bug_id=6764193
    re> Is there another?

-- richard
Re: [zfs-discuss] ZFS snapshot splitting joining
Miles Nordin <car...@ivy.net> wrote:
> >>>>> "tt" == Toby Thain <t...@telegraphics.com.au> writes:
>
>     tt> I know this was discussed a while back, but in what sense
>     tt> does tar do any of those things? I understand that it is
>     tt> unlikely to barf completely on bitflips, but won't tar simply
>     tt> silently de-archive bad data?
>
> yeah, I just tested it, and you're right. I guess the checksums are
> only for headers. However, cpio does store checksums for files'
> contents, so maybe it's better to use cpio than tar. Just be careful
> how you invoke it, because there are different cpio formats just like
> there are different tar formats, and some might have no or weaker
> checksums.

cpio is a deprecated archive format. As it is hard to enhance the
features of cpio without breaking archive compatibility, POSIX defines
a standard archive format that is based on tar and made very
extensible.

BTW: if you are on ZFS, ZFS should prevent flipping bits in archives ;-)

Jörg

--
EMail: jo...@schily.isdn.cs.tu-berlin.de (home)  Jörg Schilling  D-13353 Berlin
       j...@cs.tu-berlin.de (uni)
       joerg.schill...@fokus.fraunhofer.de (work)
Blog:  http://schily.blogspot.com/
URL:   http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Re: [zfs-discuss] ZFS snapshot splitting joining
> I use the following command to convert them back into a single file:
>
>     # cat mypictures.zfssnap.split.a[a-g] > testjoin

Maybe I'm missing the point, but this command won't give you what
you're after - in bash you want:

    # cat mypictures.zfssnap.split.a{a..g} > testjoin

Chris
Re: [zfs-discuss] ZFS snapshot splitting joining
>> I use the following command to convert them back into a single file:
>>
>>     # cat mypictures.zfssnap.split.a[a-g] > testjoin
>
> Maybe I'm missing the point, but this command won't give you what
> you're after - in bash you want:
>
>     # cat mypictures.zfssnap.split.a{a..g} > testjoin

The first should work (unless they really broke the shell).

(Yes, I tested it, and yes, it works.)

Casper
Re: [zfs-discuss] ZFS snapshot splitting joining
On Thu, February 5, 2009 06:39, casper@sun.com wrote:
>>> I use the following command to convert them back into a single
>>> file:
>>>
>>>     # cat mypictures.zfssnap.split.a[a-g] > testjoin
>>
>> Maybe I'm missing the point, but this command won't give you what
>> you're after - in bash you want:
>>
>>     # cat mypictures.zfssnap.split.a{a..g} > testjoin
>
> The first should work (unless they really broke the shell).
>
> (Yes, I tested it, and yes, it works.)

Good, because that's a syntax I still remember and use. And it has
indeed worked for me recently as well.

--
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Re: [zfs-discuss] ZFS snapshot splitting joining
>>>>> "re" == Richard Elling <richard.ell...@gmail.com> writes:

    re> Indeed, but perhaps you'll find the grace to file an
    re> appropriate RFE?

For what? The main problem I saw was with the wiki not warning people
away from archiving 'zfs send' emphatically enough, for example by
comparing its archival characteristics to tar (or checksummed cpio)
files and explaining that 'zfs send' output needs to be ephemeral.

This is RFE-worthy:

 * unresolved bugs. ``poisonous streams'' causing kernel panics when
   you receive them:
   http://www.opensolaris.org/jive/thread.jspa?threadID=81613&tstart=0
   but I'm not having the problem, so I won't file it when I can't
   provide information.

 * stream format is not guaranteed to be forward compatible

    re> Backward compatibility is achieved.

I've read complaints where the zfs filesystem version has to match.
People _have_ reported compatibility problems. Maybe it is true that a
newer system can always receive an older stream, but not vice versa.
I'd not wish for more, and that removes this (but not other) objections
to archiving 'zfs send'.

Not entirely, though---when you archive it, you care about whether
you'll be able to read it years from now. Suppose there IS some problem
receiving an old stream on a new system. Even if there's not supposed
to be, and even if there isn't right now, a bug may appear later, and I
think it's less likely to get fixed than a bug importing an old zpool.
So archive the zpool, not the 'zfs send' output.

    re> An enterprising community member could easily put together a
    re> utility to do a verification. All of the necessary code is
    re> readily available.

Fine, but (a) what CAN be written doesn't change the fact that the tool
DOES NOT EXIST NOW, and the possibility of writing one isn't enough to
make archiving 'zfs send' streams a better idea, which is what I'm
discussing, and (b) it's my opinion that a thorough tool is not
possible, because, as I said, a bunch of kernel code is implicated in
the zfs recv, which is full of assertions itself.
'zfs recv' is actually panicing boxes, so I'd not have faith in some
userspace tool's claim that a stream is good, since it's necessarily
using different code than the actual extraction. 'tar t', 'cpio -it',
and I think 'zpool scrub' don't use separate code paths for
verification.

 * supposed to be endian-independent, but isn't.

    re> CR 6764193 was fixed in b105
    re> http://bugs.opensolaris.org/view_bug.do?bug_id=6764193
    re> Is there another?

No, no other, that is what I remember reading. I read that someone ran
into it when importing a pool, too, not just when using 'zfs send', so
hopefully that fix came for free at the same time.

    re> I suggest you consider an Enterprise Backup Solution.

I prefer Free Software, especially for archival. But I will consider
the advice I gave: back up to another zpool, or to a tar/cpio file.

I do not have a problem with the way 'zfs send' works. For
replication-like incremental backups, rolling back the entire recv for
one flipped bit is quite defensible. The lazy panics aren't, but the
architectural decision to trash a whole stream and all its descendent
incrementals for one flipped bit DOES make sense to me.

But 'zfs send' shouldn't be archived! That is what I'm saying, not
``zfs send | zfs recv sucks'', just that it shouldn't be archived.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS snapshot splitting joining
Hi everyone,

I appreciate the discussion on the practicality of archiving ZFS sends,
but right now I don't know of any other options. I'm a home user, so
Enterprise-level solutions aren't available, and as far as I know, tar,
cpio, etc. don't capture ACLs and other low-level filesystem
attributes. Plus, they are all susceptible to corruption while in
storage, making recovery no more likely than with a zfs send.

The checksumming capability is a key factor for me. I would rather be
unable to restore the data than unknowingly restore bad data. This is
the biggest reason I started using ZFS in the first place: too many
cases of invisible file corruption. Admittedly, it would be nicer if
zfs recv would flag individual files with checksum problems rather than
completely failing the restore.

What I need is a complete snapshot of the filesystem (i.e. ufsdump)
and, correct me if I'm wrong, but zfs send/recv is the closest (only)
thing we have. And I need to be able to break up this complete snapshot
into pieces small enough to fit onto a DVD-DL. So far, using ZFS
send/recv works great as long as the files aren't split.

I have seen suggestions on using something like 7z (?) instead of split
as an option. Does anyone else have any other ideas on how to
successfully break up a send file and join it back together?

Thanks again,
Michael
Re: [zfs-discuss] ZFS snapshot splitting joining
On Thu, February 5, 2009 14:15, Michael McKnight wrote:
> I appreciate the discussion on the practicality of archiving ZFS
> sends, but right now I don't know of any other options. I'm a home
> user, so Enterprise-level solutions aren't available, and as far as I
> know, tar, cpio, etc. don't capture ACLs and other low-level
> filesystem attributes. Plus, they are all susceptible to corruption
> while in storage, making recovery no more likely than with a zfs
> send.

Your big constraint is using optical disks. Certainly there are
arguments for single-use media for a backup, but a series of optical
disks containing a data stream gives rise to a nasty probability that
*one* disk in the set won't be readable, which will render everything
after that unrecoverable too. .99 ^ 56 = .57, which is not a
probability *I* want to see of fully recovering my data. (.99 is
probably pessimistic, though. I hope.) (56 disks is how many my backup
would take on DVD-DL disks, and is why I don't do it that way.)

External hard drives give you a lot more options. I'm formatting
external USB drives as a ZFS pool, and then rsyncing data to them. I
can scrub them for verification, and I can easily access individual
files. I create snapshots on them so that I can have generations of
backup accessible without duplicating data that hasn't changed.

I'm currently updating them via rsync, which doesn't propagate ACLs,
but I could and should be using send/receive instead, which would. I
believe I've figured out the logic, but haven't updated the script. If
you do it with send/receive, you get a snapshot on the backup drive
that's identical (modulo ZFS bugs) to the original, and which you can
scrub to verify when you want, etc.

Furthermore, I don't have to be physically present to change and label
and file 56 DVD-DL disks. Looks like DL disks are of similar price (per
GB) to external USB drives -- and external drives can be used for more
than one backup.
(Rather similar meaning within a factor of two either way; I only
checked prices one place.)

--
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
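[Editor's note: the back-of-the-envelope .99^56 figure above can be checked directly. The 0.99 per-disk success rate and the 56-disk count are the post's own assumptions.]

```shell
# Probability that every disk in an n-disk set is readable, assuming
# independent per-disk success rate p (POSIX awk has ^ exponentiation).
awk 'BEGIN {
  p = 0.99; n = 56
  printf "P(all %d disks readable) = %.2f\n", n, p ^ n
}'
```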
Re: [zfs-discuss] ZFS snapshot splitting joining
>>>>> "mm" == Michael McKnight <michael_mcknigh...@yahoo.com> writes:

    mm> as far as I know, tar, cpio, etc. don't capture ACLs and other
    mm> low-level filesystem attributes.

Take another look, with whatever specific ACLs you're using. Some of
the cpio formats will probably work, because I think there was a thread
in here about ACL copy working in cpio but not pax? You have to try it.

    mm> Plus, they are all susceptible to corruption while in storage,

yes, of course there are no magic beans.

    mm> making recovery no more likely than with a zfs send.

nonsense. With 'zfs send', recovery is impossible with any corruption.
With tar/cpio, partial recovery is the rule, not the exception. This is
a difference, a big one, and I am repeating myself, over and over. I am
baffled as to why this is so disputable.

    mm> The checksumming capability is a key factor to me.

Follow the thread. cpio does checksumming, at least with some of the
stream formats, and I showed an example of how to check that the
checksums are working, and prove they are missing from tar.

    mm> I would rather not be able to restore the data than to
    mm> unknowingly restore bad data.

I suppose that makes sense, but only for certain really specific kinds
of data that most people don't have. Of course being warned would be
nice, but I've rarely wanted to be warned by losing everything, even
files far away from the bit flip. I'd rather not be warned than get
that kind of warning, most of the time, especially for a backup.

OTOH if you're hauling the data from one place to another and throwing
away the DVD-R when you get it there, then maybe zfs send is
appropriate. In that case you are not archiving the zfs send stream,
but rather the expanded zpool in the remote location, which is how it's
meant to be used.

    mm> it would be nicer if zfs recv would flag individual files with
    mm> checksum problems rather than completely failing the restore.

It would be nice, but I suspect it's hard to do this and preserve the
incremental dump feature.
There are too many lazy panics as is, without wishing for incrementals
to roll forward from a corrupt base. Also, I think, architecturally,
replication and storage should not be mixed, because the goals when
errors occur are so different. Fixing this problem at the cost of
making replication jobs less reliable would be a bad thing, so I like
separate tools, and unstorable zfs send.

    mm> What I need is a complete snapshot of the filesystem
    mm> (ie. ufsdump) and, correct me if I'm wrong, but zfs send/recv
    mm> is the closest (only) thing we have.

Using 'zfs send | zfs recv' to replicate one zpool into another zpool
is a second option---store the destination pool on DVD-R, not the
stream. If you have enough space to store disk images of the second
zpool, which it sounds like you do, then once you get 'split' working
you can split it up and write it to DVD-R, too. Or you can let ZFS do
the splitting, and make DVD-sized vdevs, export the pool, and burn
them. It's not as robust as a split cpio when faced with a lost DVD,
but it's worlds better than a split 'zfs send'.

For your 'split' problem: I know I have used 'split' in the way you
want, but I would have been using GNU split. Bob suggested bewaring of
split's line-orientedness (be sure to use -b). A couple of other people
suggested using bash's {a..z} syntax rather than plain globbing to make
sure you're combining the pieces in the right order. There is
/usr/gnu/bin/split and /usr/5bin/split on my system in addition to
/usr/bin/split, so you've a couple of others to try.

You're checking that it's working the right way, with md5sum, so at
least you already have enough tools to narrow the problem away from
ZFS. If you get really desperate, you can use dd's skip= and count=
options to emulate split, and still use cat to combine.

Also check the file sizes. If you have a 2GB filesize ulimit set, that
could mess up the stdout redirection, but on my Solaris system it seems
to default to unlimited.
Re: [zfs-discuss] ZFS snapshot splitting joining
On 4-Feb-09, at 6:19 AM, Michael McKnight wrote:
> Hello everyone,
>
> I am trying to take ZFS snapshots (ie. zfs send) and burn them to
> DVDs for offsite storage. In many cases, the snapshots greatly exceed
> the 8GB I can stuff onto a single DVD-DL. In order to make this work,
> I have used the split utility ...
>
> I use the following command to convert them back into a single file:
>
>     # cat mypictures.zfssnap.split.a[a-g] > testjoin
>
> But when I compare the checksum of the original snapshot to that of
> the rejoined snapshot, I get a different result:

Tested your RAM lately?

--Toby

> -Michael
Re: [zfs-discuss] ZFS snapshot splitting joining
>>>>> "mm" == Michael McKnight <michael_mcknigh...@yahoo.com> writes:

    mm> # split -b8100m ./mypictures.zfssnap mypictures.zfssnap.split.
    mm> # cat mypictures.zfssnap.split.a[a-g] > testjoin

    mm> But when I compare the checksum of the original snapshot to
    mm> that of the rejoined snapshot, I get a different result:

sounds fine. I'm not sure why it's failing.

    mm> And when I try to restore the filesystem, I get the following
    mm> failure:
    mm> # zfs recv pool_01/test < ./testjoin
    mm> cannot receive new filesystem stream: invalid stream (checksum
    mm> mismatch)

However, aside from this problem you're immediately having, I think you
should never archive the output of 'zfs send'. I think the current
warning on the wiki is not sufficiently drastic, but when I asked for
an account to update the wiki I got no answer. Here are the problems,
again, with archiving 'zfs send' output:

 * no way to test the stream's integrity without receiving it.
   (meaning, to test a stream, you need enough space to store the
   stream being tested, plus that much space again. not practical.) A
   test could possibly be hacked up, but because the whole ZFS
   software stack is involved in receiving, and is full of assertions
   itself, any test short of actual extraction wouldn't be a thorough
   test, so this is unlikely to change soon.

 * stream format is not guaranteed to be forward compatible with new
   kernels. and versioning may be pickier than zfs/zpool versions.

 * stream is expanded _by the kernel_, so even if tar had a
   forward-compatibility problem, which it won't, you could
   hypothetically work around it by getting an old 'tar'. For 'zfs
   send' streams you have to get an entire old kernel, and boot it on
   modern hardware, to get at your old stream.

 * supposed to be endian-independent, but isn't.

 * stream is ``protected'' from corruption in the following way: if a
   single bit is flipped anywhere in the stream, the entire stream and
   all incrementals descended from it become worthless. It is
   EXTREMELY corruption-sensitive.
   'tar' and zpool images both detect, report, and work around flipped
   bits. The 'zfs send' idea is different: if there's corruption, the
   designers assume you can just restart the 'zfs send | zfs recv'
   until you get a clean go---what you most need is the ability to
   atomically roll back the failed recv, which you do get. You are not
   supposed to be archiving it!

 * unresolved bugs. ``poisonous streams'' causing kernel panics when
   you receive them:
   http://www.opensolaris.org/jive/thread.jspa?threadID=81613&tstart=0

The following things do not have these problems:

 * ZFS filesystems inside file vdevs (except maybe the endian problem.
   and also the needs-whole-kernel problem, but mitigated by better
   forward-compatibility guarantees.)

 * tar files

In both alternatives you probably shouldn't use gzip on the resulting
file. If you must gzip, it would be better to make a bunch of tar.gz
files, ex. one per user, and tar the result. Maybe I'm missing some
magic flag, but I've not gotten gzip to be too bitflip-resilient.

The wiki cop-out is a nebulous ``enterprise backup ``Solution'''.
Short of that, you might make a zpool in a file with zfs compression
turned on and rsync or cpio or zfs send | zfs recv the data into it.
Or just use gtar like in the old days. With some care you may even be
able to convince tar to write directly to the medium. And when you're
done you can do a 'tar t' directly from the medium also, to check it.

I'm not sure what to do about incrementals. There is a sort of halfass
incremental feature in gtar, but not like what ZFS gives.
Re: [zfs-discuss] ZFS snapshot splitting joining
On Wed, February 4, 2009 12:01, Miles Nordin wrote:
>  * stream format is not guaranteed to be forward compatible with new
>    kernels. and versioning may be pickier than zfs/zpool versions.

Useful points, all of them. This particular one also points out
something I hadn't previously thought about -- using zfs send piped
through ssh (or in some other way going from one system to another) is
also sensitive to this versioning issue.

--
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Re: [zfs-discuss] ZFS snapshot splitting joining
On Wed, 4 Feb 2009, Toby Thain wrote:
>> In order to make this work, I have used the split utility ...
>> I use the following command to convert them back into a single file:
>>
>>     # cat mypictures.zfssnap.split.a[a-g] > testjoin
>>
>> But when I compare the checksum of the original snapshot to that of
>> the rejoined snapshot, I get a different result:
>
> Tested your RAM lately?

Split was originally designed to handle text files. It may have
problems with binary files. Due to these issues, long ago (1993) I
wrote a 'bustup' utility which works on binary files. I have not looked
at it since then.

Bob
==
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] ZFS snapshot splitting joining
On 4-Feb-09, at 2:29 PM, Bob Friesenhahn wrote:
> Split was originally designed to handle text files. It may have
> problems with binary files.

Ouch, OK.

--Toby

> Due to these issues, long ago (1993) I wrote a 'bustup' utility
> which works on binary files. I have not looked at it since then.
>
> Bob
Re: [zfs-discuss] ZFS snapshot splitting joining
>>>>> "tt" == Toby Thain <t...@telegraphics.com.au> writes:

    tt> I know this was discussed a while back, but in what sense does
    tt> tar do any of those things? I understand that it is unlikely
    tt> to barf completely on bitflips, but won't tar simply silently
    tt> de-archive bad data?

yeah, I just tested it, and you're right. I guess the checksums are
only for headers. However, cpio does store checksums for files'
contents, so maybe it's better to use cpio than tar. Just be careful
how you invoke it, because there are different cpio formats just like
there are different tar formats, and some might have no or weaker
checksums.

NetBSD 'pax' invoked as tar:

-----8<-----
castrovalva:~$ dd if=/dev/zero of=t0 bs=1m count=1
1+0 records in
1+0 records out
1048576 bytes transferred in 0.022 secs (47662545 bytes/sec)
castrovalva:~$ tar cf t0.tar t0
castrovalva:~$ md5 t0.tar
MD5 (t0.tar) = 591a39a984f70fe3e44a5e13f0ac74b6
castrovalva:~$ tar tf t0.tar
t0
castrovalva:~$ dd of=t0.tar seek=$(( 512 * 1024 )) bs=1 conv=notrunc
asdfasdfasfs
13+0 records in
13+0 records out
13 bytes transferred in 2.187 secs (5 bytes/sec)
castrovalva:~$ md5 t0.tar
MD5 (t0.tar) = 14b3a9d851579d8331a0466a5ef62693
castrovalva:~$ tar tf t0.tar
t0
castrovalva:~$ tar xvf t0.tar
tar: Removing leading / from absolute path names in the archive
t0
tar: ustar vol 1, 1 files, 1054720 bytes read, 0 bytes written in 1 secs (1054720 bytes/sec)
castrovalva:~$ hexdump -C t0
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
0007fe00  61 73 64 66 61 73 64 66  61 73 66 73 0a 00 00 00  |asdfasdfasfs....|
0007fe10  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00100000
castrovalva:~$
-----8<-----

GNU tar does the same thing.
NetBSD 'pax' invoked as cpio:

-----8<-----
castrovalva:~$ dd if=/dev/zero of=t0 bs=1m count=1
1+0 records in
1+0 records out
1048576 bytes transferred in 0.018 secs (58254222 bytes/sec)
castrovalva:~$ cpio -H sv4cpio -o > t0.cpio
t0
castrovalva:~$ md5 t0.cpio
MD5 (t0.cpio) = d5128381e72ee514ced8ad10a5a33f16
castrovalva:~$ dd of=t0.cpio seek=$(( 512 * 1024 )) bs=1 conv=notrunc
asdfasdfasdf
13+0 records in
13+0 records out
13 bytes transferred in 1.461 secs (8 bytes/sec)
castrovalva:~$ md5 t0.cpio
MD5 (t0.cpio) = b22458669256da5bcb6c94948d22a155
castrovalva:~$ rm t0
castrovalva:~$ cpio -i < t0.cpio
cpio: Removing leading / from absolute path names in the archive
cpio: Actual crc does not match expected crc
t0
-----8<-----
Re: [zfs-discuss] ZFS snapshot splitting & joining
Miles Nordin wrote:

mm == Michael McKnight <michael_mcknigh...@yahoo.com> writes:

 mm> #split -b8100m ./mypictures.zfssnap mypictures.zfssnap.split.
 mm> #cat mypictures.zfssnap.split.a[a-g] > testjoin

 mm> But when I compare the checksum of the original snapshot to
 mm> that of the rejoined snapshot, I get a different result:

sounds fine.  I'm not sure why it's failing.

 mm> And when I try to restore the filesystem, I get the following
 mm> failure:
 mm> #zfs recv pool_01/test < ./testjoin
 mm> cannot receive new filesystem stream: invalid stream (checksum mismatch)

however, aside from this problem you're immediately having, I think you
should never archive the output of 'zfs send'.  I think the current
warning on the wiki is not sufficiently drastic, but when I asked for
an account to update the wiki I got no answer.

Here are the problems, again, with archiving 'zfs send' output:

* no way to test the stream's integrity without receiving it.
  (meaning, to test a stream, you need enough space to store the
  stream being tested, plus that much space again.  not practical.)

A test could possibly be hacked up, but because the whole ZFS software
stack is involved in receiving, and is full of assertions itself, any
test short of actual extraction wouldn't be a thorough test, so this
is unlikely to change soon.

* stream format is not guaranteed to be forward compatible with new
  kernels, and versioning may be pickier than zfs/zpool versions.

Backward compatibility is achieved.

* stream is expanded _by the kernel_, so even if tar had a
  forward-compatibility problem, which it won't, you could
  hypothetically work around it by getting an old 'tar'.  For 'zfs
  send' streams you have to get an entire old kernel, and boot it on
  modern hardware, to get at your old stream.

An enterprising community member could easily put together a utility
to do a verification.  All of the necessary code is readily available.

* supposed to be endian-independent, but isn't.
CR 6764193 was fixed in b105.
http://bugs.opensolaris.org/view_bug.do?bug_id=6764193
Is there another?

* stream is ``protected'' from corruption in the following way: if a
  single bit is flipped anywhere in the stream, the entire stream and
  all incrementals descended from it become worthless.  It is
  EXTREMELY corruption-sensitive.  'tar' and zpool images both detect,
  report, and work around flipped bits.  The 'zfs send' idea is
  different: if there's corruption, the designers assume you can just
  restart the 'zfs send | zfs recv' until you get a clean go---what
  you most need is the ability to atomically roll back the failed
  recv, which you do get.  You are not supposed to be archiving it!

This is not completely accurate.  Snapshots which are completed are
completed.

* unresolved bugs.  ``poisonous streams'' causing kernel panics when
  you receive them:
  http://www.opensolaris.org/jive/thread.jspa?threadID=81613&tstart=0

The following things do not have these problems:

* ZFS filesystems inside file vdevs (except maybe the endian problem,
  and also the needs-whole-kernel problem, but that's mitigated by
  better forward-compatibility guarantees.)

Indeed, but perhaps you'll find the grace to file an appropriate RFE?

* tar files

In both alternatives you probably shouldn't use gzip on the resulting
file.  If you must gzip, it would be better to make a bunch of tar.gz
files, e.g. one per user, and tar the result.  Maybe I'm missing some
magic flag, but I've not gotten gzip to be too bitflip-resilient.

The wiki cop-out is a nebulous ``enterprise backup Solution''.
Perhaps it would satisfy you to enumerate the market's Enterprise
Backup Solutions?  This might be helpful, since Solaris does not
include such software, at least by my definition of Solaris.

So, the wiki section Using ZFS With Enterprise Backup Solutions does
in fact enumerate them, and I don't see any benefit to repeating the
enumeration.
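Miles' gzip point is easy to reproduce. DEFLATE output is a bit-packed Huffman stream, so one flipped bit typically garbles everything decoded after it, and the member's trailing CRC-32 then fails: you lose the whole file, not 16 bytes. A sketch (any ordinary gzip; the od/dd dance is just a portable way to flip one bit):

```shell
#!/bin/sh
# Sketch: flip a single bit in the middle of a gzip member; decoding
# diverges from that point on and the trailing CRC-32 fails, so the
# whole member is lost -- unlike tar, where the damage stays local.
set -e
work=$(mktemp -d); cd "$work"
yes "some compressible text" | head -c 262144 | gzip > d.gz
size=$(wc -c < d.gz)
off=$(( size / 2 ))                    # well inside the DEFLATE data

# Read the byte at $off, flip its top bit, and write it back:
b=$(od -An -tu1 -j "$off" -N1 d.gz | tr -d ' ')
printf "$(printf '\\%03o' $(( (b + 128) % 256 )))" \
    | dd of=d.gz bs=1 seek="$off" conv=notrunc 2>/dev/null

gzip -t d.gz 2>/dev/null || echo "one flipped bit: whole member fails"
```

This is exactly why a pile of small per-user tar.gz files bounds the damage: each member fails or survives independently.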
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Using_ZFS_With_Enterprise_Backup_Solutions

Short of that, you might make a zpool in a file with zfs compression
turned on and rsync or cpio or 'zfs send | zfs recv' the data into it.
Or just use gtar like in the old days.  With some care you may even be
able to convince tar to write directly to the medium, and when you're
done you can do a 'tar t' directly from the medium, too, to check it.
I'm not sure what to do about incrementals.  There is a sort of
halfass incremental feature in gtar, but not like what ZFS gives.

I suggest you consider an Enterprise Backup Solution.
 -- richard
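The gtar-plus-verify workflow in the last paragraph can be sketched end to end. An ordinary file stands in for the medium/device here, and 'sha256sum' is an assumption (GNU coreutils; on Solaris the equivalent would be 'digest -a sha256'):

```shell
#!/bin/sh
# Sketch: archive a tree straight to the "medium", then verify it with
# a full-read 'tar t' plus a detached checksum stored alongside.
set -e
work=$(mktemp -d)
mkdir -p "$work/photos"
dd if=/dev/urandom of="$work/photos/a.bin" bs=1024 count=64 2>/dev/null

medium="$work/backup.tar"              # stand-in for the real device
tar -cf "$medium" -C "$work" photos

# Keep a checksum next to the archive so later corruption is detectable:
sha256sum "$medium" > "$medium.sha256"

# Verification pass: 'tar t' forces a full read of the archive.
tar -tf "$medium" > /dev/null
sha256sum -c "$medium.sha256" > /dev/null && echo "archive verified"
```

The 'tar t' pass makes media errors surface now rather than at restore time; the detached checksum catches bit rot on the medium later.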