[zfs-discuss] ZFS forensics/revert/restore shellscript and how-to.
I have written a Python script that makes it possible to get back already-deleted files and pools/partitions. This is highly experimental, but I managed to recover a month's work after all the partitions were deleted by accident (and of course backups are for the weak ;-) I hope someone can pass this information on to the ZFS forensics project, or wherever it belongs. First the basics; the HOW-TO follows. I am not a Solaris or ZFS expert, so I am sure there are many things to improve, and I hope you can help me out with some problems this still has.

[b]Basics:[/b] The script finds all the uberblocks, reads their metadata and orders them by time, then lets you destroy all the uberblocks that were created after the event you want to roll back. Then you destroy the cache and boot the machine again. This will only work if the discs are not very full and there was not much activity after the bad event. I managed to get files back from a ZFS partition after it had been deleted and several new ones created. I got this far with the help of these materials; the ones marked * are the key parts:

*http://mbruning.blogspot.com/2008/08/recovering-removed-file-on-zfs-disk.html*
http://blogs.sun.com/blogfinger/entry/zfs_and_the_uberblock
*http://www.opensolaris.org/jive/thread.jspa?threadID=85794%u205Etstart=0*
http://opensolaris.org/os/project/forensics/ZFS-Forensics/
http://docs.huihoo.com/opensolaris/solaris-zfs-administration-guide/html/ch04s06.html
http://www.lildude.co.uk/zfs-cheatsheet/

[b]How-to[/b] This is the scenario I had. First check the pool status:

$ zpool status zones

From there you will get the disc name, e.g. c2t60060E800457AB0057AB0146d0. Now look up the history of the pool so we can find the timeline and some uberblocks (their TXGs) to roll back to:

zpool history -il zones

Save this output for later use. You will definitely want to back up the disk before you continue from this point, e.g.:

ssh r...@host dd if=/dev/dsk/c...
| dd of=Desktop/zones.dd

Now take the script I have attached, zfs_revert.py. It has two options:

-bs is the block size, by default 512 (never tested with anything else)
-tb is the total number of blocks (this is mandatory; maybe someone could automate this)

To find the block size on Solaris you can use:

prtvtoc /dev/dsk/c2t60060E800457AB0057AB0146d0 | grep sectors

and look at the sectors row. If you have a file/loop device, it is simply size-in-bytes / block-size = total blocks. Now run the script, for example:

./zfs_revert.py -bs=512 -tb=41944319 /dev/dsk/c2t60060E800457AB0057AB0146d0

This uses dd, od and grep (GNU) to find the required information; the script should work on Linux and on Solaris. It gives you a listing of the uberblocks it found (I tested it with a 20GB pool; it did not take very long, since the uberblocks are only at the beginning and end of the disk). Something like this, but probably much longer:

TXG     timestamp             unixtime     addresses (there are 4 copies of each uberblock)
411579  05 Oct 2009 14:39:51  1254742791   [630, 1142, 41926774, 41927286]
411580  05 Oct 2009 14:40:21  1254742821   [632, 1144, 41926776, 41927288]
411586  05 Oct 2009 14:43:21  1254743001   [644, 1156, 41926788, 41927300]
411590  05 Oct 2009 14:45:21  1254743121   [652, 1164, 41926796, 41927308]

Now comes the FUN part: take a wild guess which TXG might be the one. It took me about 10 tries to get it right, and I have no idea how to tell the good blocks apart or how to check this up front; you will see later what I mean by that. Enter the last TXG you want to KEEP, and the script writes zeroes over all of the uberblocks after it. Now clear the ZFS cache and reboot (better solution, someone?):

rm -rf /etc/zfs/zpool.cache
reboot

After the box comes up you have to hurry; you don't have much time, if any at all, since ZFS will realize in a minute or two that something is fishy. First import the pool if it is not imported yet:

zpool import -f zones

Now see whether the import succeeds or fails miserably.
There is a good chance you will hit "corrupt data" and be unable to import, but as I said earlier, it took me about 10 tries to get it right. I did not have to restore the whole image every time; I just took baby steps, deleting a few more uberblocks each time until I found something stable (not quite stable, it will still crash after a few minutes, but that is enough time to get back conf files or some code).

Problems and unknown factors:
1) After the machine boots up, you have limited time before ZFS realizes that it has been corrupted (checksums? I tried to turn them off, but as soon as I turn checksumming off it crashes, and even when I could turn it off the data might be corrupted).
2) If you copy files and one of them is corrupted, the whole thing halts/crashes and you have to start over with the zfs_revert.py script and reboot again.
3) It might be that reverting to a TXG where the pool was exported gives a better chance of success.
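For reference, the two mechanical steps in the how-to above can be reproduced by hand. The -tb value is just the device size in bytes divided by the block size, and the uberblock scan works because active uberblocks begin with the magic number 0x00bab10c, which on a little-endian (x86) disk shows up in `od -t x2` output as the word pair `b10c 00ba`. This is a sketch only: the image size is a hypothetical 20 GiB, and the scan is demonstrated on a tiny synthetic file rather than a real device (bash printf and GNU od assumed).

```shell
# 1) Total blocks for -tb: size in bytes / block size.
#    (hypothetical 20 GiB image, 512-byte blocks as in the script's default)
echo $((21474836480 / 512))   # → 41943040

# 2) Uberblock discovery: write a 16-byte test image containing the magic
#    0x00bab10c at offset 4, stored little-endian (bytes 0c b1 ba 00 ...),
#    then locate it the way zfs_revert.py does internally with od/grep.
#    On a real pool you would pipe `dd` of the disk into `od` instead,
#    and the decimal offsets tell you where each uberblock copy lives.
printf 'XXXX\x0c\xb1\xba\x00\x00\x00\x00\x00YYYY' > /tmp/ub_test.img
od -A d -t x2 /tmp/ub_test.img | grep 'b10c 00ba'
```

On a big-endian host the byte pairs would print in the opposite order, so the grep pattern would differ; the original recovery blog posts used this trick on x86.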
Re: [zfs-discuss] ZFS forensics/revert/restore shellscript and how-to.
I forgot to add the script.
-- This message posted from opensolaris.org

zfs_revert.py
Description: Binary data

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS forensics/revert/restore shellscript and how-to.
The links work fine if you take the * off from the end... sorry about that.
Re: [zfs-discuss] Need Help Invalidating Uberblock
You might want to check out this thread: http://opensolaris.org/jive/thread.jspa?messageID=435420
[zfs-discuss] hung pool on iscsi
The zpool for a zone of a customer-facing production appserver hung due to iscsi transport errors. How can I (forcibly) reset this pool? zfs commands are hanging, and iscsiadm remove refuses:

r...@raadiku~[8]8:48#iscsiadm remove static-config iqn.1986-03.com.sun:02:aef78e-955a-4072-c7f6-afe087723466
iscsiadm: logical unit in use
iscsiadm: Unable to complete operation

r...@raadiku~[6]8:45#dmesg
[...]
Nov 16 00:03:30 Raadiku scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
Nov 16 00:03:30 Raadiku /scsi_vhci/s...@g013048c514da2a0049ae9806 (ssd3): Command Timeout on path /iscsi (iscsi0)
Nov 16 00:03:30 Raadiku scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/s...@g013048c514da2a0049ae9806 (ssd3):
Nov 16 00:03:30 Raadiku SCSI transport failed: reason 'timeout': retrying command
Nov 16 08:40:10 Raadiku su: [ID 810491 auth.crit] 'su root' failed for jritorto on /dev/pts/1
Nov 16 08:47:05 Raadiku iscsi: [ID 213721 kern.notice] NOTICE: iscsi session(9) - session logout failed (1)
[zfs-discuss] ZFS Deduplication Replication
Hello;

Dedup on ZFS is an absolutely wonderful feature! Is there a way to conduct dedup replication across boxes, from one dedup ZFS data set to another?

Warmest Regards
Steven Sim
Re: [zfs-discuss] ZFS Deduplication Replication
Steven Sim wrote:
> Dedup on ZFS is an absolutely wonderful feature! Is there a way to conduct dedup replication across boxes from one dedup ZFS data set to another?

Pass the '-D' argument to 'zfs send'.

--
Darren J Moffat
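A minimal sketch of what that looks like end to end. The pool, dataset and host names here are hypothetical, and -D deduplicates blocks within the send stream itself (the receiving side reassembles them), so both ends need a stream-dedup-capable ZFS.

```shell
# Snapshot the source dataset, then send a deduplicated stream to the
# remote box. Names (tank/data, backuphost, backup/data) are examples.
zfs snapshot tank/data@monday
zfs send -D tank/data@monday | ssh backuphost zfs receive backup/data
```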
Re: [zfs-discuss] permanent files error, unable to access pool
Hi Daniel,

Unfortunately, the permanent errors are in this pool's metadata, so it is unlikely that this pool can be recovered. Is this an external USB drive? These drives are not always well-behaved, and it's possible that it didn't synchronize successfully. Is the data accessible? I don't know if a zpool scrub would help in this case. Because we don't know what metadata is damaged, the pool data might be compromised. If this pool contains backup data of another pool, then you might destroy and recreate this pool and recreate the backup data. Maybe someone else has a better idea...

Cindy

On 11/14/09 10:56, daniel.rodriguez.delg...@gmail.com wrote:
> I have been using OpenSolaris for a couple of weeks; today is the first time I have rebooted the system, and I ran into a problem loading my external HD (meant for backup). I was expecting more descriptive file names, but given that I have no clue which files those are, can I just tell the OS to delete or ignore them? Any help would be greatly appreciated...
>
> jdrod...@_solaris:~# zpool status -v external
>   pool: external
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: none requested
> config:
>
>         NAME       STATE     READ WRITE CKSUM
>         external   ONLINE       0     0     0
>           c10t0d0  ONLINE       0     0     0
>
> errors: Permanent errors have been detected in the following files:
>         metadata:0x0
>         metadata:0x14
Re: [zfs-discuss] Best config for different sized disks
On Sun, 15 Nov 2009, Tim Cook wrote:
> Once again I question why you're wasting your time with raid-z. You might as well just stripe across all the drives. You're taking a performance penalty for a setup that essentially has 0 redundancy. You lose a 500gb drive, you lose everything.

Why do you say that this user will lose everything? The two concatenated/striped devices on the local system are no different than if they were concatenated on a SAN array and made available as one LUN. If one of those two drives fails, then it would have the same effect as if one larger drive failed.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] hung pool on iscsi
I encountered the same problem... like I said in the first post, the zpool command freezes. Does anyone know how to make it respond again?
[zfs-discuss] Old zfs version with OpenSolaris 2009.06 JeOS ??
Hi guys,

I needed a quick install of OpenSolaris and I found:
http://hub.opensolaris.org/bin/view/Project+jeos/200906+Prototype#HDownloads

The footprint is splendid, around 275 megs, so it is small. But I have a question: why are the ZFS and ZPOOL versions that old?

r...@osol-jeos:/var/www# cat /etc/release
OpenSolaris 2009.06 snv_111b X86
Copyright 2009 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 07 May 2009

# zpool upgrade -v
This system is currently running ZFS pool version 14.

As per http://www.opensolaris.org/os/community/zfs/version/N the latest is 21!

# zfs upgrade -v shows version 3, and as per http://www.opensolaris.org/os/community/zfs/version/zpl/N the latest ZFS filesystem version is 4.

Is there any way I can upgrade all that stuff? Using Live Upgrade? If yes, can you point me to how to do it?

thanks,
Re: [zfs-discuss] ZFS forensics/revert/restore shellscript and how-to.
I have no idea why this forum just makes files disappear??? I will put up a link tomorrow... a file was attached before...
Re: [zfs-discuss] permanent files error, unable to access pool
Thanks Cindy,

In fact, after some research, I ran into the scrub suggestion and it worked perfectly. Now I think that the automated message at http://www.sun.com/msg/ZFS-8000-8A should mention scrub as a worthy attempt.

It was related to an external usb disk. I guess I am happy it happened now, before I invested in a couple of other external disks as mirrors of the existing one. I guess I am better off installing an extra internal disk.

Is this something common on usb disks? Would it get improved in later versions of osol, or is it somewhat of an incompatibility/unfriendliness of zfs with external usb disks?
Re: [zfs-discuss] Backing up ZVOLs
You can use VCB to backup. In my test lab, I use VCB integrated with Bacula to backup all the VMs.
Re: [zfs-discuss] permanent files error, unable to access pool
On Mon, 16 Nov 2009, daniel.rodriguez.delg...@gmail.com wrote:
> is this something common on usb disks? would it get improved in later versions of osol or is it somewhat of an incompatibility/unfriendliness of zfs with external usb disks?

Some USB disks seem to ignore cache sync requests, which is deadly to zfs integrity. I am using LaCie d2 Quadra drives and have not observed any zfs issues at all. However, the external power supplies on these drives tend to fail, so I am not sure if I would recommend them (my solution was to buy a box of spare power supplies).

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] Old zfs version with OpenSolaris 2009.06 JeOS ??
On Mon, Nov 16, 2009 at 2:13 PM, Benoit Heroux hero...@videotron.ca wrote:
> I needed a quick install of OpenSolaris and found the 2009.06 JeOS prototype (snv_111b), but it runs ZFS pool version 14 and filesystem version 3. Is there any way I can upgrade all that stuff? Using Live Upgrade?

http://gibbs.acu.edu/2008/07/19/opensolaris-upgrade-instructions/

If you want the latest development build, which would be required to get to a build 21 zpool, you'd need to change your repository.

http://pkg.opensolaris.org/dev/en/index.shtml

--Tim
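Once the image-update to a newer build is done and you have booted into the new boot environment, the on-disk versions still have to be bumped explicitly. A sketch; note that pool and filesystem upgrades are one-way, so older builds can no longer import an upgraded pool.

```shell
# Show what the running bits support vs. what each pool is using:
zpool upgrade
# Upgrade every imported pool to the version of the running system:
zpool upgrade -a
# Likewise for the ZFS (ZPL) filesystem version on all datasets:
zfs upgrade -a
```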
Re: [zfs-discuss] Best config for different sized disks
On Mon, Nov 16, 2009 at 12:09 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote:
> Why do you say that this user will lose everything? The two concatenated/striped devices on the local system are no different than if they were concatenated on a SAN array and made available as one LUN. If one of those two drives fails, then it would have the same effect as if one larger drive failed.

Can I blame it on too many beers? I was thinking that losing half of one drive, rather than an entire vdev, would just cause weirdness in the pool rather than a clean failure. I suppose without experimentation there's no way to really know; in theory, though, zfs should be able to handle it.

--Tim
Re: [zfs-discuss] hung pool on iscsi
I already got my files back actually, and the disc already contains new pools, so I have no idea how it was set. I have to make a virtualbox installation and test it. Can you please tell me how to set the failmode?
Re: [zfs-discuss] permanent files error, unable to access pool
Hi Daniel,

In some cases, when I/O is suspended, permanent errors are logged and you need to run a zpool scrub to clear the errors. Are you saying that a zpool scrub cleared the errors that were displayed in the zpool status output? Or did you also use zpool clear?

Metadata is duplicated even in a one-device pool, but recovery must depend on the severity of the metadata errors.

Thanks,
Cindy

On 11/16/09 13:18, daniel.rodriguez.delg...@gmail.com wrote:
> In fact, after some research, I ran into the scrub suggestion and it worked perfectly. [...] It was related to an external usb disk.
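For the archives, the recovery sequence being discussed amounts to the following (pool name taken from the thread; whether it actually helps depends on which metadata was damaged):

```shell
zpool scrub external      # re-verify every block, repairing from redundant copies
zpool status -v external  # watch scrub progress and the remaining error list
zpool clear external      # once the scrub is clean, clear the error counters
```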
Re: [zfs-discuss] hung pool on iscsi
On Mon, Nov 16, 2009 at 4:00 PM, Martin Vool mardi...@gmail.com wrote:
> I have to make a virtualbox installation and test it. Can you please tell me how to set the failmode?

http://prefetch.net/blog/index.php/2008/03/01/configuring-zfs-to-gracefully-deal-with-failures/
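In short, the relevant knob is the pool's failmode property. A sketch with the pool name from this thread: wait is the default (I/O blocks until the device returns), continue returns EIO to new writes instead of hanging, and panic crashes the box.

```shell
zpool get failmode zones           # show the current setting
zpool set failmode=continue zones  # fail I/O with EIO instead of hanging
```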
Re: [zfs-discuss] hung pool on iscsi
On Nov 16, 2009, at 2:00 PM, Martin Vool wrote:
> I already got my files back actually, and the disc already contains new pools, so I have no idea how it was set. I have to make a virtualbox installation and test it.

Don't forget to change VirtualBox's default cache flush setting.

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#OpenSolaris.2FZFS.2FVirtual_Box_Recommendations

-- richard
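From the best-practices page linked above: VirtualBox by default ignores cache flushes from the guest, which is dangerous for ZFS. A sketch of the documented workaround for an IDE-attached virtual disk; the VM name and LUN number here are examples, and other controller types use a different device string.

```shell
# Make VirtualBox honor the guest's cache flush requests on IDE LUN#0.
VBoxManage setextradata "osol-vm" \
  "VBoxInternal/Devices/piix3ide/0/LUN#0/Config/IgnoreFlush" 0
```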
Re: [zfs-discuss] hung pool on iscsi
On Mon, Nov 16, 2009 at 4:49 PM, Tim Cook t...@cook.ms wrote:
> Is your failmode set to wait?

Yes. This box has like ten prod zones and ten corresponding zpools that initiate to iscsi targets on the filers. We can't panic the whole box just because one {zone/zpool/iscsi target} fails. Are there undocumented commands to reset a specific zpool or something?

thx
jake
Re: [zfs-discuss] permanent files error, unable to access pool
To the best of my recollection, I only needed to run a scrub and reboot, and the disk became operational again. The irony was that the error message was asking me to recover from backup, but the disk involved was the backup of my working pool.
[zfs-discuss] building zpools on device aliases
We have a number of Sun J4200 SAS JBOD arrays which we have multipathed using Sun's MPxIO facility. While this is great for reliability, it results in the /dev/dsk device IDs changing from cXtYd0 to something virtually unreadable like c4t5000C5000B21AC63d0s3.

Since the entries in /dev/{rdsk,dsk} are simply symbolic links anyway, would there be any problem with adding alias links to /devices there and building our zpools on them? We've tried this and it seems to work fine, producing a zpool status similar to the following:

...
        NAME       STATE     READ WRITE CKSUM
        vol01      ONLINE       0     0     0
          mirror   ONLINE       0     0     0
            top00  ONLINE       0     0     0
            bot00  ONLINE       0     0     0
          mirror   ONLINE       0     0     0
            top01  ONLINE       0     0     0
            bot01  ONLINE       0     0     0
...

Here our aliases are topnn and botnn, denoting the disks in the top and bottom JBODs. The obvious question is "what happens if the alias link disappears?". We've tested this, and ZFS seems to handle it quite nicely by finding the normal /dev/dsk link and simply working with that (although it's more difficult to get ZFS to use the alias again once it is recreated).

If anyone can think of anything really nasty that we've missed, we'd appreciate knowing about it. Alternatively, if there is a better supported means of having ZFS display human-readable device IDs, we're all ears :-) Perhaps an MPxIO RFE for vanity device names would be in order?
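A sketch of the alias approach described above, with hypothetical device and alias names. The symlink target is copied from the existing MPxIO entry, so the alias resolves to exactly the same /devices node (readlink(1) assumed to be available; you would likely want matching aliases under /dev/rdsk as well).

```shell
cd /dev/dsk
# Point a readable name at the same /devices node as the MPxIO link
# (device names here are examples; use your own cNtWWNdNsN entries):
ln -s "$(readlink c4t5000C5000B21AC63d0s0)" top00
ln -s "$(readlink c4t5000C5000B21AC64d0s0)" bot00
# Then build the pool on the aliases:
zpool create vol01 mirror top00 bot00
```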