Re: A partially failing disk in raid0 needs replacement
Hi Chris

> I don't see how you get an IO error in user space without the kernel
> reporting the source of that IO error, whatever it is.

I totally agree, so I just retried the deletion. The only related thing I
could see in /var/log/messages is this:

Nov 30 07:29:57 box kernel: [368193.019160] BTRFS info (device sdb): found 207 extents

Shortly after this, btrfs gives me the I/O error. I am guessing that the
kernel will log to this file, and it did before I changed the disk - but
not anymore, it seems.

> If Btrfs detects corruption of data extents, it will tell you the
> exact path to file names affected, as kernel messages. If you aren't
> getting that, then it's some other problem.

So it smells like some other problem, I guess. I have no idea what, so my
best plan is to move forward on my plans to back up the data, rebuild the
fs and restore everything. Does anyone have any better suggestion? :)

Thanks again for everything,
/klaus
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Re: A partially failing disk in raid0 needs replacement
On Wed, Nov 29, 2017 at 10:28 PM, Klaus Agnoletti wrote:
> Hi Chris,
>
>> I assume when you get that, either when deleting the device or
>> scrubbing, that you also see the device unrecoverable read error in
>> dmesg, as originally reported. If the drive must have the information
>> on that lost sector, and you can't increase SCT ERC time (as well as
>> the kernel SCSI command timer), or increasing it doesn't help, then
>> that data is lost. It's plausible btrfs check repair is smart enough
>> to be able to reconstruct this missing data, but I suspect it isn't
>> yet that capable.
>
> That's the 'fun' part: I don't get any kernel messages after changing
> the disk, hence my assumption that it's a relatively small, logical
> error somewhere on the fs.

I don't see how you get an IO error in user space without the kernel
reporting the source of that IO error, whatever it is.

>> So my recommendation is to prepare to lose the file system. Which
>> means backing up whatever you can while it's still working, such as it
>> is.
>
> Yeah, luckily I can temporarily borrow a couple of 3TB disks to host
> the data while I rebuild the fs. So that will probably be what I do.
>
> Is there any way I can remove just the files with bad data from the
> disk with errors, so I just get those out of the way?

If Btrfs detects corruption of data extents, it will tell you the
exact path to file names affected, as kernel messages. If you aren't
getting that, then it's some other problem.

--
Chris Murphy
Re: A partially failing disk in raid0 needs replacement
Hi Chris,

> I assume when you get that, either when deleting the device or
> scrubbing, that you also see the device unrecoverable read error in
> dmesg, as originally reported. If the drive must have the information
> on that lost sector, and you can't increase SCT ERC time (as well as
> the kernel SCSI command timer), or increasing it doesn't help, then
> that data is lost. It's plausible btrfs check repair is smart enough
> to be able to reconstruct this missing data, but I suspect it isn't
> yet that capable.

That's the 'fun' part: I don't get any kernel messages after changing
the disk, hence my assumption that it's a relatively small, logical
error somewhere on the fs.

> So my recommendation is to prepare to lose the file system. Which
> means backing up whatever you can while it's still working, such as it
> is.

Yeah, luckily I can temporarily borrow a couple of 3TB disks to host
the data while I rebuild the fs. So that will probably be what I do.

Is there any way I can remove just the files with bad data from the
disk with errors, so I just get those out of the way?

Thanks
/klaus
Re: A partially failing disk in raid0 needs replacement
On Wed, Nov 29, 2017 at 6:33 AM, Klaus Agnoletti wrote:
> Hi list
>
> Can anyone give me any hints here? If not, my plan right now is to
> start updating the server to the latest Debian stable (it's currently
> running Jessie), to get access to a newer btrfs driver and tools,
> hoping that decreases the risk of something screwing up, and then run
> btrfs check --repair on the unmounted fs and wish for the best.

New kernel and tools won't fix this:

ERROR: error removing the device '/dev/sdd' - Input/output error

I assume when you get that, either when deleting the device or
scrubbing, that you also see the device unrecoverable read error in
dmesg, as originally reported. If the drive must have the information
on that lost sector, and you can't increase SCT ERC time (as well as
the kernel SCSI command timer), or increasing it doesn't help, then
that data is lost. It's plausible btrfs check repair is smart enough
to be able to reconstruct this missing data, but I suspect it isn't
yet that capable.

So my recommendation is to prepare to lose the file system. Which
means backing up whatever you can while it's still working, such as it
is.

--
Chris Murphy
Re: A partially failing disk in raid0 needs replacement
Hi list

Can anyone give me any hints here? If not, my plan right now is to
start updating the server to the latest Debian stable (it's currently
running Jessie), to get access to a newer btrfs driver and tools,
hoping that decreases the risk of something screwing up, and then run
btrfs check --repair on the unmounted fs and wish for the best.

Does that make sense?

Thanks,
/klaus

On Tue, Nov 14, 2017 at 9:36 AM, Klaus Agnoletti wrote:
> Hi list
>
> I used to have 3x2TB in a btrfs in raid0. A few weeks ago, one of the
> 2TB disks started giving me I/O errors in dmesg like this:
>
> [388659.173819] ata5.00: exception Emask 0x0 SAct 0x7fff SErr 0x0 action 0x0
> [388659.175589] ata5.00: irq_stat 0x4008
> [388659.177312] ata5.00: failed command: READ FPDMA QUEUED
> [388659.179045] ata5.00: cmd 60/20:60:80:96:95/00:00:c4:00:00/40 tag 12 ncq 16384 in
>                          res 51/40:1c:84:96:95/00:00:c4:00:00/40 Emask 0x409 (media error)
> [388659.182552] ata5.00: status: { DRDY ERR }
> [388659.184303] ata5.00: error: { UNC }
> [388659.188899] ata5.00: configured for UDMA/133
> [388659.188956] sd 4:0:0:0: [sdd] Unhandled sense code
> [388659.188960] sd 4:0:0:0: [sdd]
> [388659.188962] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> [388659.188965] sd 4:0:0:0: [sdd]
> [388659.188967] Sense Key : Medium Error [current] [descriptor]
> [388659.188970] Descriptor sense data with sense descriptors (in hex):
> [388659.188972]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
> [388659.188981]         c4 95 96 84
> [388659.188985] sd 4:0:0:0: [sdd]
> [388659.188988] Add. Sense: Unrecovered read error - auto reallocate failed
> [388659.188991] sd 4:0:0:0: [sdd] CDB:
> [388659.188992] Read(10): 28 00 c4 95 96 80 00 00 20 00
> [388659.189000] end_request: I/O error, dev sdd, sector 3298137732
> [388659.190740] BTRFS: bdev /dev/sdd errs: wr 0, rd 3120, flush 0, corrupt 0, gen 0
> [388659.192556] ata5: EH complete
>
> At the same time, I started getting mails from smartd:
>
> Device: /dev/sdd [SAT], 2 Currently unreadable (pending) sectors
> Device info:
> Hitachi HDS723020BLA642, S/N:MN1220F30MNHUD, WWN:5-000cca-369c8f00b,
> FW:MN6OA580, 2.00 TB
>
> For details see host's SYSLOG.
>
> To fix it, it ended up with me adding a new 6TB disk and trying to
> delete the failing 2TB disk.
>
> That didn't go so well; apparently, the delete command aborts whenever
> it encounters I/O errors. So now my raid0 looks like this:
>
> klaus@box:~$ sudo btrfs fi show
> [sudo] password for klaus:
> Label: none  uuid: 5db5f82c-2571-4e62-a6da-50da0867888a
>         Total devices 4 FS bytes used 5.14TiB
>         devid    1 size 1.82TiB used 1.78TiB path /dev/sde
>         devid    2 size 1.82TiB used 1.78TiB path /dev/sdf
>         devid    3 size 0.00B used 1.49TiB path /dev/sdd
>         devid    4 size 5.46TiB used 305.21GiB path /dev/sdb
>
> Btrfs v3.17
>
> Obviously, I want /dev/sdd emptied and deleted from the raid.
>
> So how do I do that?
>
> I thought of three possibilities myself. I am sure there are more,
> given that I am in no way a btrfs expert:
>
> 1) Try to force a deletion of /dev/sdd where btrfs copies all intact
> data to the other disks
> 2) Somehow re-balance the raid so that sdd is emptied, and then delete it
> 3) Convert into a raid1, physically remove the failing disk,
> simulating a hard error, start the raid degraded, and convert it
> back to raid0 again.
>
> How do you guys think I should go about this? Given that it's a raid0
> for a reason, it's not the end of the world losing all data, but I'd
> really prefer losing as little as possible, obviously.
>
> FYI, I tried doing some scrubbing and balancing. There's traces of
> that in the syslog and dmesg I've attached. It's being used as a
> firewall too, so there's a lot of Shorewall block messages swamping
> the log, I'm afraid.
>
> Additional info:
> klaus@box:~$ uname -a
> Linux box 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19)
> x86_64 GNU/Linux
> klaus@box:~$ sudo btrfs --version
> Btrfs v3.17
> klaus@box:~$ sudo btrfs fi df /mnt
> Data, RAID0: total=5.34TiB, used=5.14TiB
> System, RAID0: total=96.00MiB, used=384.00KiB
> Metadata, RAID0: total=7.22GiB, used=5.82GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> Thanks a lot for any help you guys can give me. Btrfs is so incredibly
> cool, compared to md :-) I love it!
>
> --
> Klaus Agnoletti

--
Klaus Agnoletti
Re: A partially failing disk in raid0 needs replacement
Hi list

I tried removing the disk roughly like Roman suggested; I copied data
using ddrescue from /dev/sdd to the new disk /dev/sda. That seemed to
work fine. After that I physically removed the old /dev/sdd and put the
new /dev/sda in its place on the same controller.

After that I tried removing /dev/sdd from the btrfs. And I still get an
error:

klaus@box:~$ sudo btrfs device delete /dev/sdd /mnt
[sudo] password for klaus:
ERROR: error removing the device '/dev/sdd' - Input/output error

I tried scrubbing, but that failed, too:

scrub status for 5db5f82c-2571-4e62-a6da-50da0867888a
        scrub started at Sun Nov 26 01:27:58 2017 and finished after 21361 seconds
        total bytes scrubbed: 5.23TiB with 1 errors
        error details: csum=1
        corrected errors: 0, uncorrectable errors: 1, unverified errors: 0

The biggest difference now is that I don't get emails from smartd with
sector errors, so I am guessing this is 'just' a logical error. I've
hesitated doing any repairing on the filesystem for fear of messing
things up.

What do you guys think I should do to fix the I/O error?

Thanks,
/klaus

On Tue, Nov 14, 2017 at 3:44 PM, Roman Mamedov wrote:
> On Tue, 14 Nov 2017 15:09:52 +0100
> Klaus Agnoletti wrote:
>
>> Hi Roman
>>
>> I almost understand :-) - however, I need a bit more information:
>>
>> How do I copy the image file to the 6TB without screwing the existing
>> btrfs up when the fs is not mounted? Should I remove it from the raid
>> again?
>
> Oh, you already added it to your FS, that's so unfortunate. For my
> scenario I assumed you have a spare 6TB (or any 2TB+) disk you can use
> as temporary space.
>
> You could try removing it, but with one of the existing member drives
> malfunctioning, I wonder if trying any operation on that FS will cause
> further damage. For example if you remove the 6TB one, how do you
> prevent Btrfs from using the bad 2TB drive as destination to relocate
> data from the 6TB drive. Or use it for one of the metadata mirrors,
> which will fail to write properly, leading into transid failures
> later, etc.
>
> --
> With respect,
> Roman

--
Klaus Agnoletti
Re: A partially failing disk in raid0 needs replacement
On Tue, Nov 14, 2017 at 1:36 AM, Klaus Agnoletti wrote:

> Btrfs v3.17

Unrelated to the problem, but this is pretty old.

> Linux box 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19)

Also a pretty old kernel.

> x86_64 GNU/Linux
> klaus@box:~$ sudo btrfs --version
> Btrfs v3.17
> klaus@box:~$ sudo btrfs fi df /mnt
> Data, RAID0: total=5.34TiB, used=5.14TiB
> System, RAID0: total=96.00MiB, used=384.00KiB
> Metadata, RAID0: total=7.22GiB, used=5.82GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B

The central two problems: failing hardware, and no copies of metadata.
By default, mkfs.btrfs does -draid0 -mraid1 for multiple device
volumes. Explicitly making metadata raid0 basically means it's a
disposable file system the instant there's a problem.

What do you get for

smartctl -l scterc /dev/

If you're lucky, this is really short. If it is something like 7
seconds, there's a chance the data in this sector can be recovered with
a longer recovery time set by the drive *and* also setting the kernel's
SCSI command timer to a value higher than 30 seconds (to match whatever
you pick for the drive's error timeout). I'd pull something out of my
ass like 60 seconds, or hell why not 120 seconds, for both. Maybe then
there won't be a UNC error and you can quickly catch up your backups at
the least.

But before trying device removal again, assuming changing the error
timeout to be higher is possible, the first thing I'd do is convert
metadata to raid1. Then remove the bad device.

--
Chris Murphy
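[For reference, the timeout and conversion steps described above could look
roughly like this, assuming the failing drive is /dev/sdd as in this thread,
the FS is mounted at /mnt, and the drive actually supports SCT ERC -- a
sketch, not a tested recipe:]

```shell
# Raise the drive's own error recovery timeout. smartctl takes the
# read,write values in deciseconds, so 1200 = 120 seconds.
smartctl -l scterc,1200,1200 /dev/sdd

# Raise the kernel's SCSI command timer (in seconds) above the drive's
# timeout, so the kernel doesn't reset the link mid-recovery.
echo 180 > /sys/block/sdd/device/timeout

# Before retrying the device removal: convert metadata to raid1 so
# there is a second copy of every tree block.
btrfs balance start -mconvert=raid1 /mnt
```

Both timeout settings are lost on reboot/power cycle and would need to be
reapplied.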
Re: A partially failing disk in raid0 needs replacement
On Tue, Nov 14, 2017 at 5:48 AM, Roman Mamedov wrote:
> On Tue, 14 Nov 2017 10:36:22 +0200
> Klaus Agnoletti wrote:
>
>> Obviously, I want /dev/sdd emptied and deleted from the raid.
>
> * Unmount the RAID0 FS
>
> * copy the bad drive using `dd_rescue`[1] into a file on the 6TB drive
>   (noting how much of it is actually unreadable -- chances are it's
>   mostly intact)

This almost certainly will not work now; the delete command has copied
metadata to the 6TB drive, so it would have to be removed first to
remove that metadata, and Btrfs's record of that member device (to
avoid it being considered missing), and also any chunks successfully
copied over.

--
Chris Murphy
Re: A partially failing disk in raid0 needs replacement
On Tue, Nov 14, 2017 at 5:38 AM, Adam Borowski wrote:
> On Tue, Nov 14, 2017 at 10:36:22AM +0200, Klaus Agnoletti wrote:
>> I used to have 3x2TB in a btrfs in raid0. A few weeks ago, one of the
>> 2TB disks started giving me I/O errors in dmesg like this:
>>
>> [388659.188988] Add. Sense: Unrecovered read error - auto reallocate failed
>
> Alas, chances to recover anything are pretty slim. That's RAID0
> metadata for you.
>
> On the other hand, losing any non-trivial file while being able to
> gape at intact metadata isn't that much better, thus -mraid0 isn't
> completely unreasonable.

I don't know the statistics on UNC read error vs total drive failure.
If I thought that total drive failure was 2x or more likely than a
single UNC, then maybe raid0 is reasonable. But it's a 64KiB block size
for raid0. I think metadata raid0 probably doesn't offer that much
performance improvement over raid1, and if it did, that's a case for
raid10 metadata.

In the UNC case, chances are it hits a data extent of a single file, in
which case Btrfs can handle this fine; you just lose that one file. And
if it hits the smaller target of metadata, it's fine if metadata is
raid1 or raid10.

In a previous email in the archives, I did a test where I intentionally
formatted one member drive of a Btrfs data raid0, metadata raid1, and
it was totally recoverable with a bunch of scary messages, and
sometimes a file was corrupted. So it actually is pretty darn resilient
when there is a copy of metadata. (I did not try DUP.)

--
Chris Murphy
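[The layout tested above -- striped data, mirrored metadata -- is what
mkfs.btrfs gives a multi-device volume by default, but it can also be
requested explicitly. A sketch with placeholder device names:]

```shell
# Striped data for capacity/speed, mirrored metadata so a single lost
# or corrupted device doesn't take the whole tree with it.
# /dev/sdX, /dev/sdY, /dev/sdZ are placeholders.
mkfs.btrfs -d raid0 -m raid1 /dev/sdX /dev/sdY /dev/sdZ

# After mounting, confirm which profile each chunk type uses:
btrfs filesystem df /mnt
```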
Re: A partially failing disk in raid0 needs replacement
Hi Roman,

If you look at the 'show' command, the failing disk is sorta out of the
fs already, so maybe removing the 6TB disk again will divide the data
already on the 6TB disk (which isn't more than 300-something gigs)
between the two well-functioning disks.

Still, as putting the dd-image of the 2TB disk on the temporary disk is
only temporary, I do need one more 2TB+ disk attached to create a more
permanent btrfs with the 6TB disk (which is what I eventually want).
And for that I need some more hard disk power cables/splitters. And
another disk.

But that still seems to be the best option, so I will do that once I
have those things sorted out.

Thanks for your creative suggestion :)

/klaus

On Tue, Nov 14, 2017 at 4:44 PM, Roman Mamedov wrote:
> On Tue, 14 Nov 2017 15:09:52 +0100
> Klaus Agnoletti wrote:
>
>> Hi Roman
>>
>> I almost understand :-) - however, I need a bit more information:
>>
>> How do I copy the image file to the 6TB without screwing the existing
>> btrfs up when the fs is not mounted? Should I remove it from the raid
>> again?
>
> Oh, you already added it to your FS, that's so unfortunate. For my
> scenario I assumed you have a spare 6TB (or any 2TB+) disk you can use
> as temporary space.
>
> You could try removing it, but with one of the existing member drives
> malfunctioning, I wonder if trying any operation on that FS will cause
> further damage. For example if you remove the 6TB one, how do you
> prevent Btrfs from using the bad 2TB drive as destination to relocate
> data from the 6TB drive. Or use it for one of the metadata mirrors,
> which will fail to write properly, leading into transid failures
> later, etc.
>
> --
> With respect,
> Roman

--
Klaus Agnoletti
Re: A partially failing disk in raid0 needs replacement
On Tue, 14 Nov 2017 15:09:52 +0100
Klaus Agnoletti wrote:

> Hi Roman
>
> I almost understand :-) - however, I need a bit more information:
>
> How do I copy the image file to the 6TB without screwing the existing
> btrfs up when the fs is not mounted? Should I remove it from the raid
> again?

Oh, you already added it to your FS, that's so unfortunate. For my
scenario I assumed you have a spare 6TB (or any 2TB+) disk you can use
as temporary space.

You could try removing it, but with one of the existing member drives
malfunctioning, I wonder if trying any operation on that FS will cause
further damage. For example if you remove the 6TB one, how do you
prevent Btrfs from using the bad 2TB drive as destination to relocate
data from the 6TB drive. Or use it for one of the metadata mirrors,
which will fail to write properly, leading into transid failures later,
etc.

--
With respect,
Roman
Re: A partially failing disk in raid0 needs replacement
On Tue, 14 Nov 2017 17:48:56 +0500, Roman Mamedov wrote:

> [1] Note that "ddrescue" and "dd_rescue" are two different programs
> for the same purpose, one may work better than the other. I don't
> remember which. :)

One is a perl implementation and is the one working worse. ;-)

--
Regards,
Kai
Replies to list-only preferred.
Re: A partially failing disk in raid0 needs replacement
Hi Austin

Good points. Thanks a lot.

/klaus

On Tue, Nov 14, 2017 at 2:14 PM, Austin S. Hemmelgarn wrote:
> On 2017-11-14 03:36, Klaus Agnoletti wrote:
>>
>> Hi list
>>
>> I used to have 3x2TB in a btrfs in raid0. A few weeks ago, one of the
>> 2TB disks started giving me I/O errors in dmesg like this:
>>
>> [388659.173819] ata5.00: exception Emask 0x0 SAct 0x7fff SErr 0x0 action 0x0
>> [388659.175589] ata5.00: irq_stat 0x4008
>> [388659.177312] ata5.00: failed command: READ FPDMA QUEUED
>> [388659.179045] ata5.00: cmd 60/20:60:80:96:95/00:00:c4:00:00/40 tag 12 ncq 16384 in
>>                          res 51/40:1c:84:96:95/00:00:c4:00:00/40 Emask 0x409 (media error)
>> [388659.182552] ata5.00: status: { DRDY ERR }
>> [388659.184303] ata5.00: error: { UNC }
>> [388659.188899] ata5.00: configured for UDMA/133
>> [388659.188956] sd 4:0:0:0: [sdd] Unhandled sense code
>> [388659.188960] sd 4:0:0:0: [sdd]
>> [388659.188962] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
>> [388659.188965] sd 4:0:0:0: [sdd]
>> [388659.188967] Sense Key : Medium Error [current] [descriptor]
>> [388659.188970] Descriptor sense data with sense descriptors (in hex):
>> [388659.188972]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
>> [388659.188981]         c4 95 96 84
>> [388659.188985] sd 4:0:0:0: [sdd]
>> [388659.188988] Add. Sense: Unrecovered read error - auto reallocate failed
>> [388659.188991] sd 4:0:0:0: [sdd] CDB:
>> [388659.188992] Read(10): 28 00 c4 95 96 80 00 00 20 00
>> [388659.189000] end_request: I/O error, dev sdd, sector 3298137732
>> [388659.190740] BTRFS: bdev /dev/sdd errs: wr 0, rd 3120, flush 0, corrupt 0, gen 0
>> [388659.192556] ata5: EH complete
>
> Just some background, but this error is usually indicative of either
> media degradation from long-term usage, or a head crash.
>
>> At the same time, I started getting mails from smartd:
>>
>> Device: /dev/sdd [SAT], 2 Currently unreadable (pending) sectors
>> Device info:
>> Hitachi HDS723020BLA642, S/N:MN1220F30MNHUD, WWN:5-000cca-369c8f00b,
>> FW:MN6OA580, 2.00 TB
>>
>> For details see host's SYSLOG.
>
> And this correlates with the above errors (although the current
> pending sectors being non-zero is less specific than the above).
>
>> To fix it, it ended up with me adding a new 6TB disk and trying to
>> delete the failing 2TB disk.
>>
>> That didn't go so well; apparently, the delete command aborts
>> whenever it encounters I/O errors. So now my raid0 looks like this:
>
> I'm not going to comment on how to fix the current situation, as what
> has been stated in other people's replies pretty well covers that.
>
> I would however like to mention two things for future reference:
>
> 1. The delete command handles I/O errors just fine, provided that
> there is some form of redundancy in the filesystem. In your case, if
> this had been a raid1 array instead of raid0, then the delete command
> would have just fallen back to the other copy of the data when it hit
> an I/O error instead of dying. Just like a regular RAID0 array (be it
> LVM, MD, or hardware), you can't lose a device in a BTRFS raid0 array
> without losing the array.
>
> 2. While it would not have helped in this case, the preferred method
> when replacing a device is to use the `btrfs replace` command. It's a
> lot more efficient than add+delete (and exponentially more efficient
> than delete+add), and also a bit safer (in both cases because it needs
> to move less data). The only down-side to it is that you may need a
> couple of resize commands around it.
>
>> klaus@box:~$ sudo btrfs fi show
>> [sudo] password for klaus:
>> Label: none  uuid: 5db5f82c-2571-4e62-a6da-50da0867888a
>>         Total devices 4 FS bytes used 5.14TiB
>>         devid    1 size 1.82TiB used 1.78TiB path /dev/sde
>>         devid    2 size 1.82TiB used 1.78TiB path /dev/sdf
>>         devid    3 size 0.00B used 1.49TiB path /dev/sdd
>>         devid    4 size 5.46TiB used 305.21GiB path /dev/sdb
>>
>> Btrfs v3.17
>>
>> Obviously, I want /dev/sdd emptied and deleted from the raid.
>>
>> So how do I do that?
>>
>> I thought of three possibilities myself. I am sure there are more,
>> given that I am in no way a btrfs expert:
>>
>> 1) Try to force a deletion of /dev/sdd where btrfs copies all intact
>> data to the other disks
>> 2) Somehow re-balance the raid so that sdd is emptied, and then
>> delete it
>> 3) Convert into a raid1, physically remove the failing disk,
>> simulating a hard error, start the raid degraded, and convert it
>> back to raid0 again.
>>
>> How do you guys think I should go about this? Given that it's a raid0
>> for a reason, it's not the end of the world losing all data, but I'd
>> really prefer losing as little as possible, obviously.
>>
>> FYI, I tried doing some scrubbing and balancing. There's traces of
>> that in the syslog and dmesg I've attached. It's being used as a
>> firewall
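[For future reference, the `btrfs replace` flow recommended above could look
roughly like this -- a sketch with hypothetical device names (failing disk
/dev/old, replacement /dev/new, filesystem mounted at /mnt):]

```shell
# Start the replacement; -r avoids reading from the source device when
# another zero-defect mirror exists (only helps with redundant profiles).
btrfs replace start -r /dev/old /dev/new /mnt

# The operation runs in the background; check on it with:
btrfs replace status /mnt

# If the new disk is larger than the old one, grow the FS on that device
# afterwards (the "resize commands" mentioned above). <devid> is the
# device's id as shown by `btrfs filesystem show`.
btrfs filesystem resize <devid>:max /mnt
```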
Re: A partially failing disk in raid0 needs replacement
Hi Roman

I almost understand :-) - however, I need a bit more information:

How do I copy the image file to the 6TB without screwing the existing
btrfs up when the fs is not mounted? Should I remove it from the raid
again?

Also, as you might have noticed, I have a bit of an issue with the
entire space of the 6TB disk being added to the btrfs when I added the
disk. There's something kinda basic about using btrfs that I haven't
really understood yet. Maybe you - or someone else - can point me in
the right direction in terms of documentation.

Thanks
/klaus

On Tue, Nov 14, 2017 at 1:48 PM, Roman Mamedov wrote:
> On Tue, 14 Nov 2017 10:36:22 +0200
> Klaus Agnoletti wrote:
>
>> Obviously, I want /dev/sdd emptied and deleted from the raid.
>
> * Unmount the RAID0 FS
>
> * copy the bad drive using `dd_rescue`[1] into a file on the 6TB drive
>   (noting how much of it is actually unreadable -- chances are it's
>   mostly intact)
>
> * physically remove the bad drive (have a powerdown or reboot for this
>   to be sure Btrfs didn't remember it somewhere)
>
> * set up a loop device from the dd_rescue'd 2TB file
>
> * run `btrfs device scan`
>
> * mount the RAID0 filesystem
>
> * run the delete command on the loop device, it will not encounter I/O
>   errors anymore.
>
> [1] Note that "ddrescue" and "dd_rescue" are two different programs
> for the same purpose, one may work better than the other. I don't
> remember which. :)
>
> --
> With respect,
> Roman

--
Klaus Agnoletti
Re: A partially failing disk in raid0 needs replacement
On 2017-11-14 03:36, Klaus Agnoletti wrote:
> Hi list
>
> I used to have 3x2TB in a btrfs in raid0. A few weeks ago, one of the
> 2TB disks started giving me I/O errors in dmesg like this:
>
> [388659.173819] ata5.00: exception Emask 0x0 SAct 0x7fff SErr 0x0 action 0x0
> [388659.175589] ata5.00: irq_stat 0x4008
> [388659.177312] ata5.00: failed command: READ FPDMA QUEUED
> [388659.179045] ata5.00: cmd 60/20:60:80:96:95/00:00:c4:00:00/40 tag 12 ncq 16384 in
>                          res 51/40:1c:84:96:95/00:00:c4:00:00/40 Emask 0x409 (media error)
> [388659.182552] ata5.00: status: { DRDY ERR }
> [388659.184303] ata5.00: error: { UNC }
> [388659.188899] ata5.00: configured for UDMA/133
> [388659.188956] sd 4:0:0:0: [sdd] Unhandled sense code
> [388659.188960] sd 4:0:0:0: [sdd]
> [388659.188962] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> [388659.188965] sd 4:0:0:0: [sdd]
> [388659.188967] Sense Key : Medium Error [current] [descriptor]
> [388659.188970] Descriptor sense data with sense descriptors (in hex):
> [388659.188972]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
> [388659.188981]         c4 95 96 84
> [388659.188985] sd 4:0:0:0: [sdd]
> [388659.188988] Add. Sense: Unrecovered read error - auto reallocate failed
> [388659.188991] sd 4:0:0:0: [sdd] CDB:
> [388659.188992] Read(10): 28 00 c4 95 96 80 00 00 20 00
> [388659.189000] end_request: I/O error, dev sdd, sector 3298137732
> [388659.190740] BTRFS: bdev /dev/sdd errs: wr 0, rd 3120, flush 0, corrupt 0, gen 0
> [388659.192556] ata5: EH complete

Just some background, but this error is usually indicative of either
media degradation from long-term usage, or a head crash.

> At the same time, I started getting mails from smartd:
>
> Device: /dev/sdd [SAT], 2 Currently unreadable (pending) sectors
> Device info:
> Hitachi HDS723020BLA642, S/N:MN1220F30MNHUD, WWN:5-000cca-369c8f00b,
> FW:MN6OA580, 2.00 TB
>
> For details see host's SYSLOG.

And this correlates with the above errors (although the current pending
sectors being non-zero is less specific than the above).

> To fix it, it ended up with me adding a new 6TB disk and trying to
> delete the failing 2TB disk.
>
> That didn't go so well; apparently, the delete command aborts whenever
> it encounters I/O errors. So now my raid0 looks like this:

I'm not going to comment on how to fix the current situation, as what
has been stated in other people's replies pretty well covers that.

I would however like to mention two things for future reference:

1. The delete command handles I/O errors just fine, provided that there
is some form of redundancy in the filesystem. In your case, if this had
been a raid1 array instead of raid0, then the delete command would have
just fallen back to the other copy of the data when it hit an I/O error
instead of dying. Just like a regular RAID0 array (be it LVM, MD, or
hardware), you can't lose a device in a BTRFS raid0 array without
losing the array.

2. While it would not have helped in this case, the preferred method
when replacing a device is to use the `btrfs replace` command. It's a
lot more efficient than add+delete (and exponentially more efficient
than delete+add), and also a bit safer (in both cases because it needs
to move less data). The only down-side to it is that you may need a
couple of resize commands around it.

> klaus@box:~$ sudo btrfs fi show
> [sudo] password for klaus:
> Label: none  uuid: 5db5f82c-2571-4e62-a6da-50da0867888a
>         Total devices 4 FS bytes used 5.14TiB
>         devid    1 size 1.82TiB used 1.78TiB path /dev/sde
>         devid    2 size 1.82TiB used 1.78TiB path /dev/sdf
>         devid    3 size 0.00B used 1.49TiB path /dev/sdd
>         devid    4 size 5.46TiB used 305.21GiB path /dev/sdb
>
> Btrfs v3.17
>
> Obviously, I want /dev/sdd emptied and deleted from the raid.
>
> So how do I do that?
>
> I thought of three possibilities myself. I am sure there are more,
> given that I am in no way a btrfs expert:
>
> 1) Try to force a deletion of /dev/sdd where btrfs copies all intact
> data to the other disks
> 2) Somehow re-balance the raid so that sdd is emptied, and then delete it
> 3) Convert into a raid1, physically remove the failing disk,
> simulating a hard error, start the raid degraded, and convert it
> back to raid0 again.
>
> How do you guys think I should go about this? Given that it's a raid0
> for a reason, it's not the end of the world losing all data, but I'd
> really prefer losing as little as possible, obviously.
>
> FYI, I tried doing some scrubbing and balancing. There's traces of
> that in the syslog and dmesg I've attached. It's being used as a
> firewall too, so there's a lot of Shorewall block messages swamping
> the log, I'm afraid.
>
> Additional info:
> klaus@box:~$ uname -a
> Linux box 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19)
> x86_64 GNU/Linux
> klaus@box:~$ sudo btrfs --version
> Btrfs v3.17
> klaus@box:~$ sudo btrfs fi df /mnt
> Data, RAID0: total=5.34TiB, used=5.14TiB
> System, RAID0: total=96.00MiB, used=384.00KiB
> Metadata, RAID0: total=7.22GiB,
Re: A partially failing disk in raid0 needs replacement
On 2017-11-14 07:48, Roman Mamedov wrote:
> On Tue, 14 Nov 2017 10:36:22 +0200
> Klaus Agnoletti wrote:
>
>> Obviously, I want /dev/sdd emptied and deleted from the raid.
>
> * Unmount the RAID0 FS
> * copy the bad drive using `dd_rescue`[1] into a file on the 6TB drive
>   (noting how much of it is actually unreadable -- chances are it's
>   mostly intact)
> * physically remove the bad drive (have a powerdown or reboot for this,
>   to be sure Btrfs didn't remember it somewhere)
> * set up a loop device from the dd_rescue'd 2TB file
> * run `btrfs device scan`
> * mount the RAID0 filesystem
> * run the delete command on the loop device; it will not encounter I/O
>   errors anymore.

While the above procedure will work, it is worth noting that you may still lose data.

> [1] Note that "ddrescue" and "dd_rescue" are two different programs for
> the same purpose; one may work better than the other. I don't remember
> which. :)

As a general rule, GNU ddrescue is more user-friendly for block-level copies, while Kurt Garloff's dd_rescue tends to be better for copying at the file level. Both work fine in terms of reliability, though.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: A partially failing disk in raid0 needs replacement
On 14 November 2017 at 09:36, Klaus Agnoletti wrote:
>
> How do you guys think I should go about this?

I'd clone the disk with GNU ddrescue.
https://www.gnu.org/software/ddrescue/
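A typical ddrescue clone uses a map file so that bad areas can be retried later without re-reading the good ones. A rough sketch (source device, image path, and map path are assumptions; the image must live on a disk with enough free space):

```shell
# First pass: copy everything that reads cleanly, skipping bad areas
# quickly (-n disables the slow scraping phase). The map file records
# which regions failed so later runs can resume and retry them.
ddrescue -n /dev/sdd /mnt/spare/sdd.img /mnt/spare/sdd.map

# Second pass: retry only the previously failed areas, up to 3 times.
ddrescue -r3 /dev/sdd /mnt/spare/sdd.img /mnt/spare/sdd.map
```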
Re: A partially failing disk in raid0 needs replacement
On Tue, 14 Nov 2017 10:36:22 +0200
Klaus Agnoletti wrote:

> Obviously, I want /dev/sdd emptied and deleted from the raid.

* Unmount the RAID0 FS
* copy the bad drive using `dd_rescue`[1] into a file on the 6TB drive
  (noting how much of it is actually unreadable -- chances are it's
  mostly intact)
* physically remove the bad drive (have a powerdown or reboot for this,
  to be sure Btrfs didn't remember it somewhere)
* set up a loop device from the dd_rescue'd 2TB file
* run `btrfs device scan`
* mount the RAID0 filesystem
* run the delete command on the loop device; it will not encounter I/O
  errors anymore.

[1] Note that "ddrescue" and "dd_rescue" are two different programs for
the same purpose; one may work better than the other. I don't remember
which. :)

--
With respect,
Roman
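The steps above might translate into commands roughly like the following. All device names and paths here are assumptions, and the rescued image needs to live on a disk outside the RAID0 filesystem itself:

```shell
# Sketch of the procedure; adapt device names and paths to your setup.

umount /mnt                     # 1. unmount the RAID0 filesystem

# 2. image the bad 2TB drive onto a spare disk (GNU ddrescue shown
#    here; the map file tracks which sectors were unreadable)
ddrescue -r3 /dev/sdd /mnt/spare/sdd.img /mnt/spare/sdd.map

# 3. physically remove the bad drive, then power down or reboot

# 4. expose the rescued image as a block device
losetup /dev/loop0 /mnt/spare/sdd.img

# 5. let btrfs discover the copy, mount, and run the delete, which now
#    reads from the intact image instead of the failing disk
btrfs device scan
mount /dev/sde /mnt
btrfs device delete /dev/loop0 /mnt
```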
Re: A partially failing disk in raid0 needs replacement
On Tue, Nov 14, 2017 at 10:36:22AM +0200, Klaus Agnoletti wrote:
> I used to have 3x2TB in a btrfs in raid0. A few weeks ago, one of the
                                     ^
> 2TB disks started giving me I/O errors in dmesg like this:
>
> [388659.188988] Add. Sense: Unrecovered read error - auto reallocate failed

Alas, chances to recover anything are pretty slim. That's RAID0 metadata for you. On the other hand, losing any non-trivial file while being able to gape at intact metadata isn't that much better, thus -mraid0 isn't completely unreasonable.

> To fix it, it ended up with me adding a new 6TB disk and trying to
> delete the failing 2TB disks.
>
> That didn't go so well; apparently, the delete command aborts whenever
> it encounters I/O errors. So now my raid0 looks like this:
>
> klaus@box:~$ sudo btrfs fi show
> [sudo] password for klaus:
> Label: none  uuid: 5db5f82c-2571-4e62-a6da-50da0867888a
>         Total devices 4 FS bytes used 5.14TiB
>         devid 1 size 1.82TiB used 1.78TiB path /dev/sde
>         devid 2 size 1.82TiB used 1.78TiB path /dev/sdf
>         devid 3 size 0.00B used 1.49TiB path /dev/sdd
>         devid 4 size 5.46TiB used 305.21GiB path /dev/sdb
>
> Obviously, I want /dev/sdd emptied and deleted from the raid.
>
> So how do I do that?
>
> I thought of three possibilities myself. I am sure there are more,
> given that I am in no way a btrfs expert:
>
> 1) Try to force a deletion of /dev/sdd where btrfs copies all intact
> data to the other disks
> 2) Somehow re-balance the raid so that sdd is emptied, and then delete it
> 3) Convert into a raid1, physically remove the failing disk, simulate
> a hard error, start the raid degraded, and convert it back to raid0
> again.

There's hardly any intact data: roughly 2/3 of chunks have half of their blocks on the failed disk, densely interspersed. Even worse, the metadata required to map those blocks to files is gone too: if we naively assume there's only a single tree, a tree node is intact only if it and every single node on the path to the root is intact.
In practice, this means it's a total filesystem loss.

> How do you guys think I should go about this? Given that it's a raid0
> for a reason, it's not the end of the world losing all data, but I'd
> really prefer losing as little as possible, obviously.

As the disk isn't _completely_ gone, there's a slim chance of some stuff requiring only still-readable sectors. Probably a waste of time to try to recover, though.

Meow!
--
⢀⣴⠾⠻⢶⣦⠀ Laws we want back: Poland, Dz.U. 1921 nr.30 poz.177 (also Dz.U.
⣾⠁⢰⠒⠀⣿⡁ 1920 nr.11 poz.61): Art.2: An official, guilty of accepting a gift
⢿⡄⠘⠷⠚⠋⠀ or another material benefit, or a promise thereof, [in matters
⠈⠳⣄ relevant to duties], shall be punished by death by shooting.
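The path-survival argument can be sanity-checked with a toy calculation. Assuming, purely for illustration, that each metadata block independently survives with probability p ≈ 2/3 (if ~2/3 of chunks lost half their blocks, about 1/3 of blocks are gone), the fraction of leaves whose whole root-to-leaf path is intact is p^depth:

```shell
# Toy model: usable leaf fraction = p^depth with p = 2/3 surviving
# blocks; depths 1-4 bracket typical btrfs tree heights.
for depth in 1 2 3 4; do
    awk -v d="$depth" 'BEGIN { printf "depth %d: %.3f\n", d, (2/3)^d }'
done
```

Under these assumptions, a tree only four levels deep already leaves fewer than 20% of leaves reachable, consistent with calling it a total loss in practice.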