Re: [zfs-discuss] ZFS non-zero checksum and permanent error with deleted file
Thank you very much for your reply! :-)

Trevor Pretty wrote:
> Steven
>
> I had a similar problem back in 2006 when I was first playing with ZFS.
> Jeff Bronwick sent me this. It may (or may not) help. I'm not sure if the
> number is still the inode. If it is, please let zfs-discuss know.
>
> I've a non-mirrored zfs file system which shows the status below. I saw
> the thread in the archives about working this out but it looks like ZFS
> messages have changed. How do I find out what file(s) this is?
> [...]
> errors: The following persistent errors have been detected:
>
>     DATASET  OBJECT  RANGE
>     LOCAL    28905   3262251008-3262382080
>
> I realize this is a bit lame, but currently the answer is:
>
>     find /LOCAL -mount -inum 28905
>
> And yes, we do indeed plan to automate this. ;-)
>
> Jeff

Did your output come from a Solaris system? I couldn't find anything about a -mount parameter in the find man page; what does it do?

[u...@host ~]$ sudo zpool status -v zpool01
...
errors: Permanent errors have been detected in the following files:

        zpool01:0x3736a

[u...@host ~]$ sudo find /mnt/zpool01/ -inum 3736a
find: -inum: 3736a: illegal trailing character
[u...@host ~]$ sudo find /mnt/zpool01/ -inum 0x3736a
find: -inum: 0x3736a: illegal trailing character

Apparently, the -inum parameter needs a decimal number:

[u...@host ~]$ sudo find /mnt/zpool01/ -inum 226154
[u...@host ~]$

How could find ever find anything? The file at that inode was deleted, after all. And even if it did find anything, what would I do with the result?

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
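
The hex-to-decimal step in the message above can be sketched in shell. This is only a sketch, assuming the value after the colon in the `zpool status -v` output is a hexadecimal object number that matches the inode number find's -inum tests (the thread itself is not sure this still holds). `hex_to_dec` is a hypothetical helper name, not part of any ZFS tool. On the -mount question: Solaris find documents -mount as restricting the search to a single file system, and FreeBSD's find has -xdev (or the -x option) for the same purpose.

```shell
#!/bin/sh
# Sketch only: hex_to_dec is a hypothetical helper, not a ZFS or find feature.
# It converts a hexadecimal object number such as 0x3736a into the decimal
# form that find's -inum primary accepts.
hex_to_dec() {
    # printf's %d conversion accepts C-style constants, including 0x... hex.
    printf '%d\n' "$1"
}

obj_dec=$(hex_to_dec 0x3736a)
echo "$obj_dec"    # prints 226154

# Print (rather than run) the resulting search command for the pool
# mount point used in this thread.
echo "sudo find /mnt/zpool01/ -inum $obj_dec"
```

As the message notes, an empty result from find is expected if the affected file has already been deleted.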
Re: [zfs-discuss] ZFS non-zero checksum and permanent error with deleted file
Bob Friesenhahn wrote:
> On Thu, 5 Nov 2009, Steven Samuel Cole wrote:
>
> Definitely do
>
>     zpool scrub zpool01
>
> to see if there is any other decay.

I have done that prior to getting the status, several times actually; I tried to indicate that in my OP. IIRC, all checksums are zero after clearing; after scrubbing, the total checksum count goes back up to 4. The error is not cleared, though. Strange.

> I do recall that there was one OpenSolaris development release which
> did produce spurious checksum errors which looked weird like that.
> Hopefully you are not using that particular release.

I am using ZFS as it comes with the official FreeBSD 7.2 64bit: no patches, no dev releases, all binary out of the box, nothing self-built. IIRC, that's ZFS version 6.

> Your 'zpool status' output did not indicate that a scrub was done.

You are correct, my mistake. I reproduced the 3 zpool command lines in my OP from memory. I have gone through many clear/scrub/status, export/import, wash/rinse/repeat cycles now; the 'last scrub' info must have been lost in one of them. A scrub on that pool takes ~8 hours, so I refrained from running it again just for demonstration purposes.

Hmmm. Just as I want to double-check, I get this:

[u...@host ~]$ sudo zpool history
History for 'zpool01':
2008-05-31.22:16:22 zpool create -m /mnt/zpool01 zpool01 raidz1 ad12 ad14 ad16 ad18
2008-12-28.15:06:54 zpool import zpool01
2008-12-28.18:37:42 zpool export zpool01
2008-12-28.18:51:39 zpool import zpool01
2009-01-05.17:31:51 zpool export zpool01
2009-01-05.19:55:27 zpool import -d /dev/disk/by-id zpool01
2009-08-25.00:50:31 zpool clear zpool01
Assertion failed: ((null)), function nvlist_lookup_string(records[i], ZPOOL_HIST_CMD, cmdstr) == 0, file /usr/src/cddl/sbin/zpool/../../../cddl/contrib/opensolaris/cmd/zpool/zpool_main.c, line 3338.
Abort trap: 6 (core dumped)

Sigh. Maybe I should take that as another indication that something is just not right and that I should rebuild the pool; otherwise there'll always be that nagging thought about whether my data is actually safe...

> Bob
> --
> Bob Friesenhahn
> bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
ZFS non-zero checksum and permanent error with deleted file
Hello,

I couldn't find a dedicated FreeBSD/ZFS mailing list, so I hope this is the right place to ask. I'd like some advice on whether I can rely on one of my ZFS pools:

[u...@host ~]$ sudo zpool clear zpool01
...
[u...@host ~]$ sudo zpool scrub zpool01
...
[u...@host ~]$ sudo zpool status -v zpool01
  pool: zpool01
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zpool01     ONLINE       0     0     4
          raidz1    ONLINE       0     0     4
            ad12    ONLINE       0     0     0
            ad14    ONLINE       0     0     0
            ad16    ONLINE       0     0     0
            ad18    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        zpool01:0x3736a

How can there be an error in a file that does not seem to exist? How can I clear / recover from the error?

I have read the corresponding documentation and done the obligatory research, but so far the only option I can see is a full destroy/create cycle, which seems like overkill considering the pool size and the fact that there seems to be only one (deleted?) file involved.

[u...@host ~]$ df -h /mnt/zpool01/
Filesystem    Size    Used   Avail Capacity  Mounted on
zpool01       1.3T    1.2T    133G    90%    /mnt/zpool01

[u...@host ~]$ uname -a
FreeBSD host.domain 7.2-RELEASE FreeBSD 7.2-RELEASE #0: Fri May 1 07:18:07 UTC 2009 r...@driscoll.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64

Cheers,
ssc
Re: error message when starting smartd: FAILURE - SMART status=51 ...
Uwe Laverenz wrote:
> On Sat, Jun 21, 2008 at 01:38:25PM +1200, Steven Samuel Cole wrote:
>
> > Also, the disks are SATA300, the controller supports SATA150 only;
> > there is a jumper on the disks that limits them to SATA150 which I
> > removed. Could that be relevant ?
>
> Yes, it could be relevant. Several controllers have shown problems
> without this jumper in the past (VIA, 3ware...).
>
> Uwe

Hey Uwe,

thanks for your reply :-)

I put the jumpers back in, but unfortunately, the messages persist :-(

Cheers,
Steve
error message when starting smartd: FAILURE - SMART status=51 ...
Hello,

I see an error message every time I boot my AMD64 FreeBSD 7.0 system or when I restart smartd. These are the dmesg lines that seem relevant to the issue (shortened for clarity):

kernel: FreeBSD 7.0-RELEASE #0: Fri Jun 6 22:06:44 NZST 2008
kernel: CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 5200+ (2611.86-MHz K8-class CPU)
kernel: Origin = AuthenticAMD Id = 0x60fb2 Stepping = 2
kernel: usable memory = 8576704512 (8179 MB)
...
kernel: atapci2: <SiI SiI 3114 SATA150 controller> port 0xd480-0xd487,0xd400-0xd403,0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc0f mem 0xddeff400-0xddeff7ff irq 18 at device 10.0 on pci1
...
kernel: ata6: <ATA channel 0> on atapci2
kernel: ata7: <ATA channel 1> on atapci2
kernel: ata8: <ATA channel 2> on atapci2
kernel: ata9: <ATA channel 3> on atapci2
...
kernel: ad12: 476940MB <Seagate ST3500320AS SD15> at ata6-master SATA150
kernel: ad14: 476940MB <Seagate ST3500320AS SD15> at ata7-master SATA150
kernel: ad16: 476940MB <Seagate ST3500320AS SD15> at ata8-master SATA150
kernel: ad18: 476940MB <Seagate ST3500320AS SD15> at ata9-master SATA150

I am happy to provide more info if that helps. I am using the latest version of smartmontools (5.38).

The actual error messages are:

kernel: ad12: FAILURE - SMART status=51<READY,DSC,ERROR> error=4<ABORTED>
kernel: ad14: FAILURE - SMART status=51<READY,DSC,ERROR> error=4<ABORTED>
kernel: ad16: FAILURE - SMART status=51<READY,DSC,ERROR> error=4<ABORTED>
kernel: ad18: FAILURE - SMART status=51<READY,DSC,ERROR> error=4<ABORTED>

Apart from that, the disks seem to be working fine. I am running a ZFS pool on them and they perform great, no worries whatsoever; I am simply unsure what these error messages mean and whether they should worry me.

Also, the disks are SATA300, but the controller supports SATA150 only; there is a jumper on the disks that limits them to SATA150, which I removed. Could that be relevant?

Thank you very much for your attention.

Kind regards,
Steve
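
For readers wondering how status=51 maps to READY, DSC and ERROR in the messages above: the value is a hexadecimal bitmask of the ATA status register. The sketch below decodes it in shell. It assumes the standard ATA bit layout (0x40 ready, 0x10 seek complete, 0x01 error). `decode_ata_status` is a hypothetical helper, and the labels for bits that do not appear in the kernel message (BUSY, FAULT, DRQ, CORR, INDEX) are illustrative names chosen here, not taken from the kernel output. It shows only how the number is built up, not why the drives report the error.

```shell
#!/bin/sh
# Sketch only: decode_ata_status is a hypothetical helper. It takes the hex
# value printed after "status=" (without 0x) and lists the bits that are set.
decode_ata_status() {
    status=$(( 0x$1 ))
    names=""
    # mask:label pairs, highest bit first. Only READY, DSC and ERROR appear
    # in the thread's kernel messages; the other labels are assumptions.
    for pair in 80:BUSY 40:READY 20:FAULT 10:DSC 08:DRQ 04:CORR 02:INDEX 01:ERROR; do
        mask=$(( 0x${pair%%:*} ))
        label=${pair#*:}
        if [ $(( status & mask )) -ne 0 ]; then
            # Append the label, comma-separated after the first one.
            names="${names:+$names,}$label"
        fi
    done
    echo "$names"
}

decode_ata_status 51    # prints READY,DSC,ERROR
```

The same reading applies to error=4: in the ATA error register, bit 0x04 is the "command aborted" flag, which matches the ABORTED label in the messages.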