Re: btrfs-related kernel oops due to media error
> Hi,
>
> One of my disks, partitioned into a single btrfs partition, is showing media errors. The problem is that these errors lead to a kernel panic from btrfs, which makes the filesystem unusable until reboot, and it is therefore very hard for me to do a full backup of the data before replacing the disk.
>
> My current kernel is 3.2.0-8-generic from Ubuntu/precise (based on linux 3.2-final), but I quickly tested and got the same error with an older 3.1 kernel (and I can probably reproduce it with a vanilla kernel if necessary).
>
> I assume that the filesystem should not panic even in case of a media error... Is there any procedure I can follow, or a patch I could apply, to salvage my data while ignoring media errors?

I don't know about btrfs, but writing the sector with hdparm --write-sector will usually cause it to be remapped. You can use dd or another tool to read the entire disk to find out if there are more bad sectors.

Niels

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs-related kernel oops due to media error
On Tue, Jan 10, 2012 at 00:01, Niels de Carpentier ni...@decarpentier.com wrote:

>> Hi,
>>
>> One of my disks, partitioned into a single btrfs partition, is showing media errors. The problem is that these errors lead to a kernel panic from btrfs, which makes the filesystem unusable until reboot, and it is therefore very hard for me to do a full backup of the data before replacing the disk.
>>
>> My current kernel is 3.2.0-8-generic from Ubuntu/precise (based on linux 3.2-final), but I quickly tested and got the same error with an older 3.1 kernel (and I can probably reproduce it with a vanilla kernel if necessary).
>>
>> I assume that the filesystem should not panic even in case of a media error... Is there any procedure I can follow, or a patch I could apply, to salvage my data while ignoring media errors?
>
> I don't know about btrfs, but writing the sector with hdparm --write-sector will usually cause it to be remapped. You can use dd or another tool to read the entire disk to find out if there are more bad sectors.
>
> Niels

Thank you for the hint!

I'll probably try this, but since I've already managed to make a copy of all my interesting data, I think I'll keep the disk in the same state (with the bad sectors not remapped) for a few days, hoping the btrfs developers are interested in fixing this bug... Who will trust a filesystem that oopses on media failure? ;-)

Vincent
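The full-disk read that Niels suggests (dd or another tool) can also be sketched in a few lines of Python. This is only an illustration, not a replacement for dd or badblocks: the function name, the `reader` parameter and the 4096-byte block size are choices made for this example, and because the reads go through the page cache the reported offsets are only approximate to the block.

```python
import os


def scan_for_read_errors(path, block_size=4096, reader=os.pread):
    """Read `path` block by block and return the byte offsets of the
    blocks whose read raised OSError (e.g. EIO on a media error).

    `reader` defaults to os.pread; it is a parameter only so that the
    error path can be exercised without a failing disk.
    """
    bad_offsets = []
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end yields the size for regular files and,
        # on Linux, for block devices as well.
        size = os.lseek(fd, 0, os.SEEK_END)
        for offset in range(0, size, block_size):
            length = min(block_size, size - offset)
            try:
                reader(fd, length, offset)
            except OSError:
                # Record the failing block and keep going instead of
                # aborting at the first error.
                bad_offsets.append(offset)
    finally:
        os.close(fd)
    return bad_offsets
```

Run as root against the raw device (a path of the form /dev/sdX, not the mounted filesystem), this returns a list of byte offsets; dividing an offset by the sector size gives the sector number to compare with the kernel log.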
[v3.2-4874-ge4e1118 OOPS] btrfs-related kernel oops due to media error
[Note: this is a resend of a mail I sent to linux-btrfs earlier, this time tested with the latest git kernel]

Hi,

One of my disks, partitioned into a single btrfs partition, is showing media errors. The problem is that these errors lead to a kernel panic from btrfs, which makes the filesystem unusable until reboot, and it is therefore very hard for me to do a full backup of the data before replacing the disk.

My current kernel is a vanilla kernel at current tip (output from git describe is v3.2-4874-ge4e1118).

I assume that the filesystem should not panic even in case of a media error... Is there any procedure I can follow, or a patch I could apply, to salvage my data while ignoring media errors?

Logs and the oops are at the end of this mail; please let me know if more information is needed.

Best regards,
Vincent
---
[ 3210.717304] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 3210.717309] ata6.00: BMDMA stat 0x24
[ 3210.717312] ata6.00: failed command: READ DMA EXT
[ 3210.717318] ata6.00: cmd 25/00:08:5f:dc:2f/00:00:70:00:00/e0 tag 0 dma 4096 in
[ 3210.717320] res 51/40:00:61:dc:2f/40:00:70:00:00/e0 Emask 0x9 (media error)
[ 3210.717323] ata6.00: status: { DRDY ERR }
[ 3210.717325] ata6.00: error: { UNC }
[ 3210.732234] ata6.00: configured for UDMA/133
[ 3210.732248] sd 5:0:0:0: [sdd] Unhandled sense code
[ 3210.732250] sd 5:0:0:0: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 3210.732254] sd 5:0:0:0: [sdd] Sense Key : Medium Error [current] [descriptor]
[ 3210.732259] Descriptor sense data with sense descriptors (in hex):
[ 3210.732261] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 3210.732270] 70 2f dc 61
[ 3210.732274] sd 5:0:0:0: [sdd] Add. Sense: Unrecovered read error - auto reallocate failed
[ 3210.732278] sd 5:0:0:0: [sdd] CDB: Read(10): 28 00 70 2f dc 5f 00 00 08 00
[ 3210.732287] end_request: I/O error, dev sdd, sector 1882184801
[ 3210.732305] ata6: EH complete
[ 3210.732322] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 3210.732373] IP: [a017f129] extent_range_uptodate+0x59/0xe0 [btrfs]
[ 3210.732426] PGD 21e9b7067 PUD 21e9b6067 PMD 0
[ 3210.732455] Oops: [#1] SMP
[ 3210.732475] CPU 3
[ 3210.732486] Modules linked in: ip6table_filter ip6_tables ipt_MASQUERADE bnep iptable_nat nf_nat rfcomm bluetooth nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables bridge stp kvm_intel kvm parport_pc ppdev nfsd nfs lockd fscache binfmt_misc auth_rpcgss nfs_acl sunrpc dm_crypt snd_usb_audio snd_usbmidi_lib joydev snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc psmouse serio_raw cdc_acm lp parport btrfs zlib_deflate libcrc32c hid_logitech ff_memless usbhid hid i915 drm_kms_helper drm r8169 i2c_algo_bit video pata_jmicron
[ 3210.732870]
[ 3210.732880] Pid: 3856, comm: btrfs-endio-met Not tainted 3.2.0-custom #2 Gigabyte Technology Co., Ltd. G33-DS3R/G33-DS3R
[ 3210.732933] RIP: 0010:[a017f129] [a017f129] extent_range_uptodate+0x59/0xe0 [btrfs]
[ 3210.732989] RSP: 0018:880006f3fde0 EFLAGS: 00010246
[ 3210.733014] RAX: RBX: 00df57385000 RCX:
[ 3210.733047] RDX: 0001 RSI: 0df57385 RDI:
[ 3210.733079] RBP: 880006f3fe00 R08: R09: 88008bce5200
[ 3210.733111] R10: 8800299f9010 R11: 1000 R12: 8802190f4030
[ 3210.733143] R13: 00df573853ff R14: 880006f3fe98 R15: 880143263d88
[ 3210.733175] FS: () GS:88022fd8() knlGS:
[ 3210.733212] CS: 0010 DS: ES: CR0: 8005003b
[ 3210.733238] CR2: CR3: 00021f35a000 CR4: 000406e0
[ 3210.733270] DR0: DR1: DR2:
[ 3210.733302] DR3: DR6: 0ff0 DR7: 0400
[ 3210.74] Process btrfs-endio-met (pid: 3856, threadinfo 880006f3e000, task 8801fa8d8000)
[ 3210.733374] Stack:
[ 3210.733385] 8800298dd838 8801f9cc9840 88021ee05000
[ 3210.733423] 880006f3fe30 a01581f9 880143263d80 8800298dd860
[ 3210.733461] 880143263d80 880143263d98 880006f3fee0 a0187fef
[ 3210.733499] Call Trace:
[ 3210.733524] [a01581f9] end_workqueue_fn+0x119/0x140 [btrfs]
[ 3210.733567] [a0187fef] worker_loop+0x16f/0x5d0 [btrfs]
[ 3210.733608] [a0187e80] ? btrfs_queue_worker+0x310/0x310 [btrfs]
[ 3210.733643] [8106fa93] kthread+0x93/0xa0
[ 3210.733668] [8162caa4] kernel_thread_helper+0x4/0x10
[ 3210.733697] [8106fa00] ?
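As a cross-check of the log in this mail, the failing sector reported by end_request can be recomputed from the logged CDB and sense bytes. The sketch below assumes the standard READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8) and descriptor-format sense data (response code 0x72) carrying an Information descriptor (type 0x00, additional length 0x0a); the function names are made up for this example.

```python
def decode_read10(cdb):
    """Return (start_lba, sector_count) from a 10-byte READ(10) CDB."""
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    start_lba = int.from_bytes(cdb[2:6], "big")
    sector_count = int.from_bytes(cdb[7:9], "big")
    return start_lba, sector_count


def failing_lba_from_sense(sense):
    """Return the LBA held in the Information descriptor of
    descriptor-format sense data, or None if there is none."""
    if sense[0] & 0x7F not in (0x72, 0x73):
        raise ValueError("not descriptor-format sense data")
    end = 8 + sense[7]          # byte 7 = additional sense length
    pos = 8                     # descriptors start after the header
    while pos + 2 <= end:
        desc_type, desc_len = sense[pos], sense[pos + 1]
        if desc_type == 0x00 and desc_len == 0x0A:
            # 2 header bytes, 2 flag/reserved bytes, 8-byte value
            return int.from_bytes(sense[pos + 4:pos + 12], "big")
        pos += 2 + desc_len
    return None


# Bytes copied from the log above.
cdb = bytes.fromhex("28 00 70 2f dc 5f 00 00 08 00")
sense = bytes.fromhex(
    "72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 70 2f dc 61"
)

start, count = decode_read10(cdb)
bad = failing_lba_from_sense(sense)
print(start, count, bad, bad - start)
# prints: 1882184799 8 1882184801 2
```

So the 8-sector read starts at LBA 1882184799 and the unreadable sector is the third one in it, which matches "end_request: I/O error, dev sdd, sector 1882184801".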
btrfs-related kernel oops due to media error
Hi,

One of my disks, partitioned into a single btrfs partition, is showing media errors. The problem is that these errors lead to a kernel panic from btrfs, which makes the filesystem unusable until reboot, and it is therefore very hard for me to do a full backup of the data before replacing the disk.

My current kernel is 3.2.0-8-generic from Ubuntu/precise (based on linux 3.2-final), but I quickly tested and got the same error with an older 3.1 kernel (and I can probably reproduce it with a vanilla kernel if necessary).

I assume that the filesystem should not panic even in case of a media error... Is there any procedure I can follow, or a patch I could apply, to salvage my data while ignoring media errors?

Logs and the oops are at the end of this mail; please let me know if more information is needed.

Best regards,
Vincent
---
[ 129.241636] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 129.241640] ata6.00: BMDMA stat 0x24
[ 129.241643] ata6.00: failed command: READ DMA EXT
[ 129.241649] ata6.00: cmd 25/00:08:5f:dc:2f/00:00:70:00:00/e0 tag 0 dma 4096 in
[ 129.241651] res 51/40:00:61:dc:2f/40:00:70:00:00/e0 Emask 0x9 (media error)
[ 129.241654] ata6.00: status: { DRDY ERR }
[ 129.241656] ata6.00: error: { UNC }
[ 129.256243] ata6.00: configured for UDMA/133
[ 129.256261] ata6: EH complete
[ 131.640911] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 131.640915] ata6.00: BMDMA stat 0x24
[ 131.640918] ata6.00: failed command: READ DMA EXT
[ 131.640922] ata6.00: cmd 25/00:08:5f:dc:2f/00:00:70:00:00/e0 tag 0 dma 4096 in
[ 131.640923] res 51/40:00:61:dc:2f/40:00:70:00:00/e0 Emask 0x9 (media error)
[ 131.640926] ata6.00: status: { DRDY ERR }
[ 131.640927] ata6.00: error: { UNC }
[ 131.656244] ata6.00: configured for UDMA/133
[ 131.656260] ata6: EH complete
[ 134.317351] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 134.317355] ata6.00: BMDMA stat 0x24
[ 134.317359] ata6.00: failed command: READ DMA EXT
[ 134.317365] ata6.00: cmd 25/00:08:5f:dc:2f/00:00:70:00:00/e0 tag 0 dma 4096 in
[ 134.317366] res 51/40:00:61:dc:2f/40:00:70:00:00/e0 Emask 0x9 (media error)
[ 134.317369] ata6.00: status: { DRDY ERR }
[ 134.317371] ata6.00: error: { UNC }
[ 134.332234] ata6.00: configured for UDMA/133
[ 134.332248] ata6: EH complete
[ 136.894260] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 136.894264] ata6.00: BMDMA stat 0x24
[ 136.894268] ata6.00: failed command: READ DMA EXT
[ 136.894274] ata6.00: cmd 25/00:08:5f:dc:2f/00:00:70:00:00/e0 tag 0 dma 4096 in
[ 136.894275] res 51/40:00:61:dc:2f/40:00:70:00:00/e0 Emask 0x9 (media error)
[ 136.894278] ata6.00: status: { DRDY ERR }
[ 136.894280] ata6.00: error: { UNC }
[ 136.924255] ata6.00: configured for UDMA/133
[ 136.924269] ata6: EH complete
[ 139.437990] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 139.437994] ata6.00: BMDMA stat 0x24
[ 139.437998] ata6.00: failed command: READ DMA EXT
[ 139.438004] ata6.00: cmd 25/00:08:5f:dc:2f/00:00:70:00:00/e0 tag 0 dma 4096 in
[ 139.438005] res 51/40:00:61:dc:2f/40:00:70:00:00/e0 Emask 0x9 (media error)
[ 139.438008] ata6.00: status: { DRDY ERR }
[ 139.438010] ata6.00: error: { UNC }
[ 139.468239] ata6.00: configured for UDMA/133
[ 139.468253] ata6: EH complete
[ 141.937488] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 141.937493] ata6.00: BMDMA stat 0x24
[ 141.937497] ata6.00: failed command: READ DMA EXT
[ 141.937503] ata6.00: cmd 25/00:08:5f:dc:2f/00:00:70:00:00/e0 tag 0 dma 4096 in
[ 141.937504] res 51/40:00:61:dc:2f/40:00:70:00:00/e0 Emask 0x9 (media error)
[ 141.937507] ata6.00: status: { DRDY ERR }
[ 141.937509] ata6.00: error: { UNC }
[ 141.952236] ata6.00: configured for UDMA/133
[ 141.952253] sd 5:0:0:0: [sdd] Unhandled sense code
[ 141.952256] sd 5:0:0:0: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 141.952260] sd 5:0:0:0: [sdd] Sense Key : Medium Error [current] [descriptor]
[ 141.952264] Descriptor sense data with sense descriptors (in hex):
[ 141.952266] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 141.952275] 70 2f dc 61
[ 141.952279] sd 5:0:0:0: [sdd] Add. Sense: Unrecovered read error - auto reallocate failed
[ 141.952284] sd 5:0:0:0: [sdd] CDB: Read(10): 28 00 70 2f dc 5f 00 00 08 00
[ 141.952293] end_request: I/O error, dev sdd, sector 1882184801
[ 141.952313] ata6: EH complete
[ 141.952335] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 141.952383] IP: [a018e439] extent_range_uptodate+0x59/0xe0 [btrfs]
[ 141.952440] PGD 21caae067 PUD