On 10/11/17 12:41 PM, Ian Kumlien wrote:
> Hi,
> 
> I was running a btrfs raid with 6 disks, metadata: dup and data: raid 6
> 
> Two of the disks started behaving oddly:
> [436823.570296] sd 3:1:0:4: [sdf] Unaligned partial completion
> (resid=244, sector_sz=512)
> [436823.578604] sd 3:1:0:4: [sdf] Unaligned partial completion
> (resid=52, sector_sz=512)
> [436823.617593] sd 3:1:0:4: [sdf] Unaligned partial completion
> (resid=56, sector_sz=512)
> [436823.617771] sd 3:1:0:4: [sdf] Unaligned partial completion
> (resid=222, sector_sz=512)
> [436823.618386] sd 3:1:0:4: [sdf] Unaligned partial completion
> (resid=246, sector_sz=512)
> [436823.618463] sd 3:1:0:4: [sdf] Unaligned partial completion
> (resid=56, sector_sz=512)
> [436977.701944] scsi_io_completion: 68 callbacks suppressed
> [436977.701973] sd 3:1:0:4: [sdf] tag#0 FAILED Result: hostbyte=DID_OK
> driverbyte=DRIVER_SENSE
> [436977.701982] sd 3:1:0:4: [sdf] tag#0 Sense Key : Hardware Error
> [current]
> [436977.701991] sd 3:1:0:4: [sdf] tag#0 Add. Sense: Logical unit
> failure
> [436977.702000] sd 3:1:0:4: [sdf] tag#0 CDB: Read(10) 28 00 02 fb fb
> 80 00 00 28 00
> [436977.702005] print_req_error: 68 callbacks suppressed
> [436977.702010] print_req_error: critical target error, dev sdf,
> sector 50068352
> [498132.144319] print_req_error: 450 callbacks suppressed
> [498132.144324] print_req_error: critical target error, dev sdf,
> sector 41777640
> [498132.144590] btrfs_dev_stat_print_on_error: 540 callbacks
> suppressed
> [498132.144600] BTRFS error (device sdb1): bdev /dev/sdf1 errs: wr
> 632, rd 1526, flush 0, corrupt 0, gen 0
> 
> Eventually the filesystem becomes read-only and everything is odd...

Are you still able to mount it?  I'd be surprised if you could, given
that check can't open the file system.

> Trying to run btrfs check on the disks results in:
> btrfs check -b /dev/disk/by-uuid/8d431da9-dad4-481c-a5ad-5e6844f31da0
> bytenr mismatch, want=912228352, have=0
> Couldn't read tree root
> ERROR: cannot open file system
> 
> (For backup and normal)
> 
> So even if the data is duplicated on all disks, something in the above
> errors seemed to cause it to abort
> (These disks are seagate sshd disks, never ever buying them again)

If you have metadata: dup, that doesn't mean the metadata is duplicated
on every disk.  It means that there are two copies of the metadata on a
single disk.  If that disk is going bad and returning failures for both
copies of the metadata, you may be out of luck.  The dup profile is
really intended for a single spinning disk, to give it a little more
resiliency in the face of bad sectors.
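
As a rough illustration of why dup doesn't help when a whole disk goes
bad, here is a toy model.  It is only a sketch: the function names, the
round-robin placement, and every device name other than sdb and sdf are
assumptions made up for the example, and real btrfs chunk allocation
works differently.

```python
def place_chunk(profile, devices, index):
    """Return the devices holding the two copies of metadata chunk `index`.

    Toy model only: 'dup' keeps both copies on one device, 'raid1' puts
    them on two different devices.  Round-robin placement is an
    assumption for illustration, not how btrfs picks devices.
    """
    n = len(devices)
    if profile == "dup":
        dev = devices[index % n]
        return (dev, dev)
    if profile == "raid1":
        return (devices[index % n], devices[(index + 1) % n])
    raise ValueError("unknown profile: " + profile)


def lost_chunks(profile, devices, failed, nchunks):
    """Indices of chunks whose every copy lives on a failed device."""
    return [
        i for i in range(nchunks)
        if all(dev in failed for dev in place_chunk(profile, devices, i))
    ]


# Hypothetical six-disk layout; only sdb and sdf appear in the report.
devices = ["sdb", "sdc", "sdd", "sde", "sdf", "sdg"]

print("dup, sdf failed:  ", lost_chunks("dup", devices, {"sdf"}, 12))
print("raid1, sdf failed:", lost_chunks("raid1", devices, {"sdf"}, 12))
```

In this model a single failed disk takes out every dup chunk that
happened to land on it, while a two-copy profile spread across devices
loses nothing until a second disk fails.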

The check error above means that it wasn't able to map a logical address
to a physical address.  Typically that means that the mapping was lost.
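
To make the idea of a lost mapping concrete, here is a minimal sketch of
a logical-to-physical lookup.  The ChunkMap class, the chunk boundaries,
the device id, and the physical offset are all hypothetical; only the
logical address 912228352 comes from the check output above.  This is
not the btrfs-progs implementation.

```python
import bisect


class ChunkMap:
    """Toy map from logical byte ranges to (devid, physical offset)."""

    def __init__(self):
        self._starts = []   # sorted logical start addresses
        self._chunks = []   # (logical_start, length, devid, physical_start)

    def add(self, logical_start, length, devid, physical_start):
        # Keep both lists sorted by logical start so lookups can bisect.
        i = bisect.bisect_left(self._starts, logical_start)
        self._starts.insert(i, logical_start)
        self._chunks.insert(i, (logical_start, length, devid, physical_start))

    def resolve(self, logical):
        """Return (devid, physical) for `logical`, or None if unmapped."""
        i = bisect.bisect_right(self._starts, logical) - 1
        if i >= 0:
            start, length, devid, phys = self._chunks[i]
            if logical < start + length:
                return devid, phys + (logical - start)
        return None


want = 912228352  # logical address of the tree root from the check output

intact = ChunkMap()
# Hypothetical 256 MiB chunk starting at logical 768 MiB on device 5.
intact.add(768 * 1024**2, 256 * 1024**2, devid=5, physical_start=1024**2)
print("intact map:", intact.resolve(want))

damaged = ChunkMap()  # the entry covering `want` is gone
print("damaged map:", damaged.resolve(want))
```

With the mapping entry missing, the lookup has nowhere to send the read,
which is the situation described above: the tree root can't be located
even if its blocks are still intact somewhere on disk.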

-Jeff


-- 
Jeff Mahoney
SUSE Labs
