>Number:         169398
>Category:       misc
>Synopsis:       Can't remove file with permanent error
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jun 25 14:00:21 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Ron Dzierwa
>Release:        8.2-RELEASE-p6
>Organization:
Innovative Engineering, Inc.
>Environment:
FreeBSD phoenix.hsd1.md.comcast.net 8.2-RELEASE-p6 FreeBSD 8.2-RELEASE-p6 #0: 
Sat Mar 24 20:42:07 EDT 2012     
[email protected]:/usr/src/sys/amd64/compile/PHOENIX  amd64

>Description:
I am running ZFS filesystem version 4 and storage pool version 15 on a FreeBSD 
8.2-RELEASE amd64 kernel.  I have a single 12TB pool on a 3ware 9650 
controller with 8 Seagate ST2000DL003 drives in a RAID-5 configuration managed 
by the controller.

I recently had a connector problem on a disk in the array while running a 
performance test that was writing a 1TB pattern file to the array.  When the 
RAID controller started reporting errors, I stopped the test and re-seated the 
connector on the drive.  After running a verify on the RAID, I tried to read 
the partial pattern file, and ZFS produced copious checksum error messages on 
the system console.  So I rm'ed the file, and got even more checksum errors 
interspersed with several I/O error 86 messages.  Since the rm, ls no longer 
shows the file, but I ran a scrub just to be sure the bogus file was gone, and 
got thousands of checksum and I/O error 86 messages.  At the end, zpool status 
shows:

phoenix# zpool status -v zfsPool
  pool: zfsPool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 3h40m with 6353 errors on Fri Jun 22 08:36:36 2012
config:

        NAME        STATE     READ WRITE CKSUM
        zfsPool     ONLINE       0     0 6.20K
          da0       ONLINE       0     0 12.4K

errors: Permanent errors have been detected in the following files:

        zfsPool/raid:<0x9e241>
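
The entry above has no path, only a dataset name and an object number in hex. 
For anyone following up, here is a minimal Python sketch (not part of the 
original test setup) that pulls such entries out of saved "zpool status -v" 
output and converts the object number to decimal.  The function name 
orphaned_objects and the regular expression are illustrative assumptions 
based only on the output layout shown here, not on a documented format.

```python
import re

# Sample text copied from the "zpool status -v" output in this report.
SAMPLE = """\
errors: Permanent errors have been detected in the following files:

        zfsPool/raid:<0x9e241>
"""

# Assumed layout: entries with no resolvable path appear as dataset:<0xHEX>.
ENTRY = re.compile(r"^\s*(?P<dataset>[^\s:]+):<0x(?P<obj>[0-9a-fA-F]+)>\s*$")


def orphaned_objects(status_text):
    """Return (dataset, object_number) pairs for path-less error entries."""
    results = []
    in_errors = False
    for line in status_text.splitlines():
        # Only look at lines that follow the "errors:" header.
        if line.startswith("errors:"):
            in_errors = True
            continue
        if not in_errors:
            continue
        match = ENTRY.match(line)
        if match:
            # The object number is printed in hex; convert it to an int.
            results.append((match.group("dataset"), int(match.group("obj"), 16)))
    return results


if __name__ == "__main__":
    for dataset, obj in orphaned_objects(SAMPLE):
        print(f"{dataset}: object {obj} (0x{obj:x})")
```

Run against the sample, this prints "zfsPool/raid: object 647745 (0x9e241)". 
Entries that still have an ordinary file path are skipped, since they do not 
match the dataset:<0x...> form.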


I have tried "zpool clear"/reboot/"zpool scrub" several times now, and get a 
similar set of errors and results. 

My question is: how do I get rid of this file?  It is no longer linked to a 
directory entry, and nobody should have it open, since I have rebooted 
several times.  Yet ZFS still tells me there's a broken file and that I 
should restore it.  It is most likely the pattern test file that I deleted, so 
I don't need it and I don't want to recover it.  I would just like to get rid 
of it and get my filesystem clean again without resorting to starting over.


Thanks,
Ron.


>How-To-Repeat:
Not sure.  It occurred because of an untimely combination of high usage and 
hardware failures.
>Fix:
It was suggested that I either back up or copy the array somewhere and then 
copy it back, but the machine is in production, and I don't have enough 
capacity elsewhere to copy the entire content.  In any case, for a serious 
filesystem, it should be possible to clean up this file, even if it has bad 
links and checksums, without starting over.

>Release-Note:
>Audit-Trail:
>Unformatted: