Re: [zfs-discuss] Remove corrupt files from snapshot
On Tue, November 15, 2011 10:07, sbre...@hotmail.com wrote:

Would it make sense to do zfs scrub regularly and have a report sent, e.g. once a day, so discrepancies would be noticed beforehand? Is there anything readily available in the FreeBSD ZFS package for this?

If you're not scrubbing regularly, you're losing out on one of the key benefits of ZFS. In nearly all fileserver situations, a good amount of the content is essentially archival: infrequently accessed, but important now and then. (In my case it's my collection of digital and digitized photos.) A weekly scrub combined with a decent backup plan will detect bit-rot before the backups with the correct data cycle into the trash (and, with redundant storage like mirroring or RAID, the scrub will probably be able to fix the error without resorting to restoring files from backup).

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
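The periodic scrub-with-report the poster asks about is not prepackaged in FreeBSD's base system, but it is only a few lines of sh plus a cron entry. A minimal sketch, in which the pool name (backups), mail recipient, and script path are all placeholders to adapt:

```shell
#!/bin/sh
# Sketch: start a scrub, wait for it to finish, mail the result.
# Schedule from cron, e.g. weekly:
#   0 3 * * 0  root  /usr/local/sbin/zfs_scrub_report
POOL=backups
MAILTO=root

/sbin/zpool scrub "$POOL"

# Older zpool has no wait flag for scrub, so poll until it completes.
while /sbin/zpool status "$POOL" | grep -q "in progress"; do
    sleep 300
done

/sbin/zpool status -v "$POOL" | mail -s "zfs scrub report: $POOL" "$MAILTO"
```

Run daily or weekly depending on how long a scrub of the pool takes.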
Re: [zfs-discuss] Remove corrupt files from snapshot
Thanks, everyone, for the help. In the end I removed the corrupt files from the current view of the file system and left the snapshots as they were. This way at least the incremental backup continues. (It is sad that snapshots are so rigid that even corruption is permanent. What is more interesting: if snapshots are read-only, how can they become corrupted?)

Would it make sense to do zfs scrub regularly and have a report sent, e.g. once a day, so discrepancies would be noticed beforehand? Is there anything readily available in the FreeBSD ZFS package for this?

B.

From: opensolarisisdeadlongliveopensola...@nedharvey.com
To: sbre...@hotmail.com; zfs-discuss@opensolaris.org
Subject: RE: [zfs-discuss] Remove corrupt files from snapshot
Date: Mon, 14 Nov 2011 19:32:21 -0500

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of sbre...@hotmail.com

Actually a regular file (on a RAID1 setup with gmirror and 2 identical disks) is used as backing store for ZFS. The hardware should be fine as nothing else seems to be corrupt.

In a 10-second google, I see that gmirror is a FreeBSD RAID tool, perhaps similar in some ways to Linux LVM. One similarity it has: it doesn't count. You should be using zpool mirroring. Then ZFS will be aware of the redundant copy, and ZFS has the potential to correct corruption it finds. If you're doing the redundancy at a level below ZFS, then ZFS can only see one device. It cannot perform as well this way, and it cannot provide features such as redundant-copy error correction.
Re: [zfs-discuss] Remove corrupt files from snapshot
On Tue, Nov 15, 2011 at 8:07 AM, sbre...@hotmail.com wrote:

Thanks, everyone, for the help. In the end I removed the corrupt files from the current view of the file system and left the snapshots as they were. This way at least the incremental backup continues. (It is sad that snapshots are so rigid that even corruption is permanent. What is more interesting: if snapshots are read-only, how can they become corrupted?)

The snapshot is read-only, meaning users cannot modify the data in the snapshots. However, there's nothing to prevent random bit flips in the underlying storage. Maybe the physical hard drive has a bad block, and gmirror copied the bad data to both disks, which flipped a bit or two in the file you are using to back the ZFS pool. Since ZFS only sees a single device, it has no internal redundancy and can't fix the corrupted bits; it can only report that it found a block where the on-disk checksum doesn't match the computed checksum of the block. This is why you need to let ZFS handle redundancy via mirror vdevs, raidz vdevs, or (at the very least) the copies=2 property on the ZFS filesystem. If there's redundancy in the pool, then ZFS can correct the corruption.

Would it make sense to do zfs scrub regularly and have a report sent, e.g. once a day, so discrepancies would be noticed beforehand? Is there anything readily available in the FreeBSD ZFS package for this?

Without any redundancy in the pool, all a scrub will do is let you know there is corrupted data in the pool. It can't fix it. Neither can gmirror below the pool fix it. All you can do is delete the corrupted file and restore that file from backups.

You really should get rid of the gmirror setup, dedicate the entire disks to ZFS, and create a pool using a mirror vdev. File-backed ZFS vdevs really should only be used for testing purposes.

-- Freddie Cash fjwc...@gmail.com
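Freddie's suggestions translate to two commands. The disk names (da0, da1) are placeholders, not the poster's actual devices:

```shell
# ZFS-level redundancy: a mirror vdev across two whole disks, so ZFS
# can repair a failed checksum from the good side of the mirror.
zpool create backups mirror da0 da1

# Weaker fallback on a single device: keep two copies of every block.
# This survives isolated bad sectors, but not a whole-disk failure.
zfs set copies=2 backups/memory_card
```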
Re: [zfs-discuss] Remove corrupt files from snapshot
Use zpool status -v to see if any errors come up. Then you can use zpool scrub to remove at least some of them. I have had luck with this in the past.

---Todd

On Nov 14, 2011, at 04:25, sbre...@hotmail.com wrote:

Back to this topic: since I cannot touch snapshots, I thought I could simply remove the corrupt files after the last snapshot, so the next incremental backup will notice the difference (i.e. no file) and overwrite the corrupt-and-removed files with valid ones. This was the plan. However, while checking for corrupt files, find stops at some directory with "fts_read: Not a directory":

find . -exec md5 {} \; > /home/xxx/md5_out 2> /home/xxx/md5_err
tail /home/xxx/md5_err
...
md5: ./.zfs/snapshot/20100323081201/Bazsi/Projects/Java Test Client/java_test_client/lib/xxx/weblogic.jar: Input/output error
md5: ./.zfs/snapshot/20100323081201/@Cache (Bazsi)/BMWi SP/Publikationen/PDF-Broschüren/Nexxt.pdf: Input/output error
find: fts_read: Not a directory

What does this error mean? Can I not even scan the ZFS file system anymore? Is there any fsck for ZFS?

Cheers, B.

From: zfsdisc...@orgdotuk.org.uk
To: zfs-discuss@opensolaris.org
Date: Mon, 7 Nov 2011 21:49:56 +
Subject: Re: [zfs-discuss] Remove corrupt files from snapshot

-----Original Message-----
From: Edward Ned Harvey
Sent: 04/11/2011 21:23

You need to destroy the snapshot completely - But if you want to selectively delete from a snapshot, I think you can clone it, then promote the clone, then destroy the snapshot, then rm something from the clone and then snapshot the clone back to the original name, and then destroy the clone. Right?

Not so fast! :-) If you promote this new clone, the current state / branch of your filesystem becomes a clone instead, dependent on the snapshot. Then if you try to destroy the snapshot, you'll fail, because it has a dependent clone (your current fs!!!). If you continue without realising the implications, and so try the 'destroy' again with '-R', there goes the neighbourhood!
I did this once, and was only saved by the fact that my cwd was in my current filesystem, so it couldn't be unmounted, and therefore couldn't be removed! Phew!! Nice to learn something and only get singed eyebrows, instead of losing a leg!

hth
Andy
Re: [zfs-discuss] Remove corrupt files from snapshot
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Todd Urie

Use zpool status -v to see if any errors come up. Then you can use zpool scrub to remove at least some of them. I have had luck with this in the past.

Disks are made of chemicals, which can degrade over time. If some part of a disk starts to deteriorate, but you never attempt to read it, then you'll never know it's going bad. You should have redundancy, and scrub on a regular basis, much more frequently than the occurrence of a disk going bad - maybe once a week or once a month. If you can afford to scrub daily, that's great. Depending on your system and your data, scrubs might take several hours, thus making it impractical to scrub daily.
Re: [zfs-discuss] Remove corrupt files from snapshot
Back to this topic: since I cannot touch snapshots, I thought I could simply remove the corrupt files after the last snapshot, so the next incremental backup will notice the difference (i.e. no file) and overwrite the corrupt-and-removed files with valid ones. This was the plan. However, while checking for corrupt files, find stops at some directory with "fts_read: Not a directory":

find . -exec md5 {} \; > /home/xxx/md5_out 2> /home/xxx/md5_err
tail /home/xxx/md5_err
...
md5: ./.zfs/snapshot/20100323081201/Bazsi/Projects/Java Test Client/java_test_client/lib/xxx/weblogic.jar: Input/output error
md5: ./.zfs/snapshot/20100323081201/@Cache (Bazsi)/BMWi SP/Publikationen/PDF-Broschüren/Nexxt.pdf: Input/output error
find: fts_read: Not a directory

What does this error mean? Can I not even scan the ZFS file system anymore? Is there any fsck for ZFS?

Cheers, B.

From: zfsdisc...@orgdotuk.org.uk
To: zfs-discuss@opensolaris.org
Date: Mon, 7 Nov 2011 21:49:56 +
Subject: Re: [zfs-discuss] Remove corrupt files from snapshot

-----Original Message-----
From: Edward Ned Harvey
Sent: 04/11/2011 21:23

You need to destroy the snapshot completely - But if you want to selectively delete from a snapshot, I think you can clone it, then promote the clone, then destroy the snapshot, then rm something from the clone and then snapshot the clone back to the original name, and then destroy the clone. Right?

Not so fast! :-) If you promote this new clone, the current state / branch of your filesystem becomes a clone instead, dependent on the snapshot. Then if you try to destroy the snapshot, you'll fail, because it has a dependent clone (your current fs!!!). If you continue without realising the implications, and so try the 'destroy' again with '-R', there goes the neighbourhood!

I did this once, and was only saved by the fact that my cwd was in my current filesystem, so it couldn't be unmounted, and therefore couldn't be removed! Phew!!
Nice to learn something and only get singed eyebrows, instead of losing a leg!

hth
Andy
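One way around the fts_read failure is to keep find out of the snapshot tree altogether and checksum only the live files; a sketch reusing the poster's output paths:

```shell
# Prune .zfs so find never descends into the (corrupt) snapshots.
find . -name .zfs -prune -o -type f -exec md5 {} \; \
    > /home/xxx/md5_out 2> /home/xxx/md5_err
```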
Re: [zfs-discuss] Remove corrupt files from snapshot
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of sbre...@hotmail.com

What does this error mean? I cannot even scan the ZFS file system anymore? Is there any fsck for ZFS?

There is zpool scrub. It will check all the checksums previously calculated, verifying that the data actually written is the data ZFS previously thought it wrote. If you have sufficient redundancy (mirror or raid) it will self-correct any errors it finds. Since you're experiencing corruption that doesn't go away, I'm supposing you don't have redundancy, or else the corruption happened in something higher up, such as a failing CPU, non-ECC RAM, or a flaky disk controller.

In any event, do you have any reason to believe you've eliminated the cause of the corruption? The behavior you're experiencing is normal if you have failing hardware.
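In command form, the closest thing to an fsck here is the scrub/status cycle (pool name taken from the zpool status output earlier in the thread):

```shell
zpool scrub backups       # re-verify every checksum in the pool
zpool status -v backups   # list files with permanent errors, if any
zpool clear backups       # reset the error counters once the files are dealt with
```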
Re: [zfs-discuss] Remove corrupt files from snapshot
Actually a regular file (on a RAID1 setup with gmirror and 2 identical disks) is used as backing store for ZFS. The hardware should be fine, as nothing else seems to be corrupt. I wonder whether a server reset could have caused the issue. There are 2 things that surely do not work perfectly:

1. Startup:

[root@xxx /etc/rc.d]# cat /etc/rc.conf | grep mdconfig
mdconfig_md0=-f /usr/local/zfs/store
[root@xxx /etc/rc.d]# /etc/rc.d/mdconfig start
Creating md0 device (-f).
mount: /dev/md0: unknown special file or file system

I have to run zfs mount -a twice to see the folders.

2. Shutdown:

...
+++ /tmp/security.Z3SCbf2M 2011-10-26 03:07:20.0 +0200
+Waiting (max 60 seconds) for system process `vnlru' to stop...done
+Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
+Waiting (max 60 seconds) for system process `syncer' to stop...done
+Syncing disks, vnodes remaining... 5 4 1 3 3 2 2 0 0 0 done
+All buffers synced.

The computer does not reboot after this, just waits for ??? . A manual reset is needed.

Is ZFS not recommended with a file backing store?

B.

From: opensolarisisdeadlongliveopensola...@nedharvey.com
To: sbre...@hotmail.com; zfs-discuss@opensolaris.org
Subject: RE: [zfs-discuss] Remove corrupt files from snapshot
Date: Mon, 14 Nov 2011 09:36:58 -0500

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of sbre...@hotmail.com

What does this error mean? I cannot even scan the ZFS file system anymore? Is there any fsck for ZFS?

There is zpool scrub. It will check all the checksums previously calculated, verifying that the data actually written is the data ZFS previously thought it wrote. If you have sufficient redundancy (mirror or raid) it will self-correct any errors it finds. Since you're experiencing corruption that doesn't go away, I'm supposing you don't have redundancy, or else the corruption happened in something higher up, such as a failing CPU, non-ECC RAM, or a flaky disk controller.
In any event, do you have any reason to believe you've eliminated the cause of the corruption? The behavior you're experiencing is normal if you have failing hardware.
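For the startup half of the reported problem, the rc.conf line appears to be missing the md type flag: a file-backed md device needs -t vnode, and the mount error suggests rc.d/mdconfig is also expecting an fstab entry that a ZFS-only device does not have. An untested sketch, reusing the poster's path:

```shell
# /etc/rc.conf
mdconfig_md0="-t vnode -f /usr/local/zfs/store"
zfs_enable="YES"
```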
Re: [zfs-discuss] Remove corrupt files from snapshot
On Mon, 14 Nov 2011 16:39:25 +, sbre...@hotmail.com wrote:

Is ZFS not recommended with file backing store?

From zpool(1M) (SunOS 5.11, last change 24 Nov 2009):

  Virtual Devices (vdevs)
    A virtual device describes a single device or a collection of devices organized according to certain performance and fault characteristics. The following virtual devices are supported:

    disk    A block device, typically located under /dev/dsk. ZFS [...]

    file    A regular file. The use of files as a backing store is strongly discouraged. It is designed primarily for experimental purposes, as the fault tolerance of a file is only as good as the file system of which it is a part. A file must be specified by a full path.

    mirror  [...]

-- ( Kees Nuyt ) c[_]
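Getting off the discouraged file-backed vdev does not mean re-copying the data by hand; one hedged sketch (the new pool and disk names are hypothetical) replicates the whole pool onto a real mirror:

```shell
zpool create newpool mirror da0 da1   # redundancy at the ZFS level
zfs snapshot -r backups@migrate       # recursive snapshot of every dataset
zfs send -R backups@migrate | zfs recv -F newpool
```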
Re: [zfs-discuss] Remove corrupt files from snapshot
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of sbre...@hotmail.com

Actually a regular file (on a RAID1 setup with gmirror and 2 identical disks) is used as backing store for ZFS. The hardware should be fine as nothing else seems to be corrupt.

In a 10-second google, I see that gmirror is a FreeBSD RAID tool, perhaps similar in some ways to Linux LVM. One similarity it has: it doesn't count. You should be using zpool mirroring. Then ZFS will be aware of the redundant copy, and ZFS has the potential to correct corruption it finds. If you're doing the redundancy at a level below ZFS, then ZFS can only see one device. It cannot perform as well this way, and it cannot provide features such as redundant-copy error correction.
Re: [zfs-discuss] Remove corrupt files from snapshot
-----Original Message-----
From: Edward Ned Harvey
Sent: 04/11/2011 21:23

You need to destroy the snapshot completely - But if you want to selectively delete from a snapshot, I think you can clone it, then promote the clone, then destroy the snapshot, then rm something from the clone and then snapshot the clone back to the original name, and then destroy the clone. Right?

Not so fast! :-) If you promote this new clone, the current state / branch of your filesystem becomes a clone instead, dependent on the snapshot. Then if you try to destroy the snapshot, you'll fail, because it has a dependent clone (your current fs!!!). If you continue without realising the implications, and so try the 'destroy' again with '-R', there goes the neighbourhood!

I did this once, and was only saved by the fact that my cwd was in my current filesystem, so it couldn't be unmounted, and therefore couldn't be removed! Phew!! Nice to learn something and only get singed eyebrows, instead of losing a leg!

hth
Andy
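The trap described above can be checked for before any destroy; dataset and snapshot names below are hypothetical:

```shell
# If the live filesystem's 'origin' points at the snapshot, the live fs
# is a dependent clone, and 'zfs destroy -R' on that snapshot would
# destroy the live filesystem too.
zfs get origin tank/fs
zfs destroy tank/fs@snap   # without -R/-r this fails safely when clones depend on it
```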
Re: [zfs-discuss] Remove corrupt files from snapshot
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of sbre...@hotmail.com

However, snapshots seem to be read-only: Is there any way to force the file removal?

You need to destroy the snapshot completely - But if you want to selectively delete from a snapshot, I think you can clone it, then promote the clone, then destroy the snapshot, then rm something from the clone and then snapshot the clone back to the original name, and then destroy the clone. Right?

BTW, since snapshots are listed in chronological order, it is distinctly possible the above might cause unintended consequences for snapshot scripts / autosnapshot / whatever. Most people in your situation would simply destroy the snapshot and never look back. That's the easy thing to do.
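Spelled out as commands, the dance above would look roughly like this (names hypothetical). Note that, as a follow-up elsewhere in this thread warns, the promote step turns the live filesystem into a dependent clone, so the destroy step is exactly where things can go badly wrong:

```shell
zfs clone tank/fs@bad tank/fs_tmp   # writable copy of the snapshot
zfs promote tank/fs_tmp             # tank/fs now becomes a dependent clone
rm /tank/fs_tmp/path/to/corrupt_file
# Destroying the snapshot at this point fails, because the original
# filesystem depends on it -- and 'zfs destroy -R' would take the
# original filesystem with it. Proceed only if you understand that.
```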
[zfs-discuss] Remove corrupt files from snapshot
Hello,

I have got a bunch of corrupted files in various snapshots on my ZFS file backing store. I was not able to recover them, so I decided to remove them all; otherwise they continuously make trouble for my incremental backup (rsync, diff etc. fail). However, snapshots seem to be read-only:

# zpool status -v
  pool: backups
 state: ONLINE
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME     STATE   READ WRITE CKSUM
        backups  ONLINE     0     0    13
        md0      ONLINE     0     0    13

errors: Permanent errors have been detected in the following files:

        /backups/memory_card/.zfs/snapshot/20110218230726/Backup/Backup.arc
        ...

# rm /backups/memory_card/.zfs/snapshot/20110218230726/Backup/Backup.arc
rm: /backups/memory_card/.zfs/snapshot/20110218230726/Backup/Backup.arc: Read-only file system

Is there any way to force the file removal?

Cheers, B.
Re: [zfs-discuss] Remove corrupt files from snapshot
Hi,

snapshots are read-only by design; you can clone them and manipulate the clone, but the snapshot itself remains r/o.

HTH
Michael

On Thu, Nov 3, 2011 at 13:35, sbre...@hotmail.com wrote:

Hello,

I have got a bunch of corrupted files in various snapshots on my ZFS file backing store. I was not able to recover them, so I decided to remove them all; otherwise they continuously make trouble for my incremental backup (rsync, diff etc. fail). However, snapshots seem to be read-only:

# zpool status -v
  pool: backups
 state: ONLINE
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME     STATE   READ WRITE CKSUM
        backups  ONLINE     0     0    13
        md0      ONLINE     0     0    13

errors: Permanent errors have been detected in the following files:

        /backups/memory_card/.zfs/snapshot/20110218230726/Backup/Backup.arc
        ...

# rm /backups/memory_card/.zfs/snapshot/20110218230726/Backup/Backup.arc
rm: /backups/memory_card/.zfs/snapshot/20110218230726/Backup/Backup.arc: Read-only file system

Is there any way to force the file removal?

Cheers, B.

-- Michael Schuster http://recursiveramblings.wordpress.com/
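Michael's clone suggestion in command form, using the dataset and snapshot names visible in the zpool status output (the clone name is made up); the rm here edits the writable clone, not the read-only snapshot:

```shell
zfs clone backups/memory_card@20110218230726 backups/mc_work
rm /backups/mc_work/Backup/Backup.arc
```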
Re: [zfs-discuss] Remove corrupt files from snapshot
On Thu, Nov 3, 2011 at 8:35 AM, sbre...@hotmail.com wrote:

I have got a bunch of corrupted files in various snapshots on my ZFS file backing store. I was not able to recover them, so I decided to remove them all; otherwise they continuously make trouble for my incremental backup (rsync, diff etc. fail).

Why are you backing up the snapshots? Or perhaps a better question is why are you backing them up more than once, as they can't change? What are you trying to accomplish with the snapshots?

You can set the snapdir property on the dataset to hidden and it will not show up with an ls, even an ls -a; you have to know that the .zfs directory is there and cd into it blind. This will keep tools that walk the directory tree from finding it.

# zfs get snapdir xxx
NAME  PROPERTY  VALUE   SOURCE
xxx   snapdir   hidden  default

You would use "zfs set snapdir=hidden dataset" to set the parameter.

-- {1-2-3-4-5-6-7-} Paul Kraus - Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ ) - Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ ) - Technical Advisor, RPI Players
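If hiding .zfs is not enough to keep the backup tool out of the snapshots, the backup run itself can skip it; a sketch with placeholder destination:

```shell
# Keep rsync from descending into the snapshot directory at all.
rsync -a --exclude='.zfs' /backups/memory_card/ backuphost:/dest/
```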
Re: [zfs-discuss] Remove corrupt files from snapshot
On 03 November, 2011 - Paul Kraus sent me these 1,3K bytes:

On Thu, Nov 3, 2011 at 8:35 AM, sbre...@hotmail.com wrote:

I have got a bunch of corrupted files in various snapshots on my ZFS file backing store. I was not able to recover them, so I decided to remove them all; otherwise they continuously make trouble for my incremental backup (rsync, diff etc. fail).

Why are you backing up the snapshots? Or perhaps a better question is why are you backing them up more than once, as they can't change? What are you trying to accomplish with the snapshots?

You can set the snapdir property on the dataset to hidden and it will not show up with an ls, even an ls -a; you have to know that the .zfs directory is there and cd into it blind. This will keep tools that walk the directory tree from finding it.

# zfs get snapdir xxx
NAME  PROPERTY  VALUE   SOURCE
xxx   snapdir   hidden  default

You would use "zfs set snapdir=hidden dataset" to set the parameter.

.. which is the default.

/Tomas
-- Tomas Forsman, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se