Re: A partially failing disk in raid0 needs replacement

2017-11-29 Thread Klaus Agnoletti
Hi Chris
>
> I don't see how you get an IO error in user space without the kernel
> reporting the source of that IO error, whatever it is.
>
I totally agree, so I just retried the deletion. The only thing
related I could see in /var/log/messages is this:
Nov 30 07:29:57 box kernel: [368193.019160] BTRFS info (device sdb):
found 207 extents

Shortly after this, btrfs gives me the I/O error. I was assuming the
kernel would log to this file, and it did before I changed the disk
- but not anymore, it seems.

> If Btrfs detects corruption of data extents, it will tell you the
> exact path to file names affected, as kernel messages. If you aren't
> getting that, then it's some other problem.

So it smells like some other problem, I guess. I have no idea what it
is, so my best plan is to move forward with my plan to back up the
data, rebuild the fs and restore everything. Does anyone have a better
suggestion? :)
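
For reference, a rough sketch of the backup step, assuming the
borrowed disks end up mounted at /mnt/backup (placeholder path and
device names):

  sudo mkfs.btrfs -d single -m raid1 /dev/sdX /dev/sdY   # the two borrowed 3TB disks
  sudo mkdir -p /mnt/backup
  sudo mount /dev/sdX /mnt/backup
  sudo rsync -aHAX --info=progress2 /mnt/ /mnt/backup/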

Thanks again for everything,

/klaus


Re: A partially failing disk in raid0 needs replacement

2017-11-29 Thread Chris Murphy
On Wed, Nov 29, 2017 at 10:28 PM, Klaus Agnoletti  wrote:
> Hi Chris,
>
>>
>> I assume when you get that, either when deleting the device or
>> scrubbing, that you also see the device unrecoverable read error in
>> dmesg, as originally reported. If the drive must have the information
>> on that lost sector, and you can't increase SCT ERC time (as well as
>> the kernel SCSI command timer), or increasing it doesn't help, then
>> that data is lost. It's plausible btrfs check repair is smart enough
>> to be able to reconstruct this missing data, but I suspect it isn't
>> yet that capable.
>
> That's the 'fun' part: I don't get any kernel messages after changing
> the disk, hence my assumption that it's a relatively small, logical
> error somewhere on the fs.

I don't see how you get an IO error in user space without the kernel
reporting the source of that IO error, whatever it is.


>
>>
>> So my recommendation is to prepare to lose the file system. Which
>> means backing up whatever you can while it's still working, such as it
>> is.
>
> Yeah, luckily I can temporarily borrow a couple of 3TB disks to host
> the data while I rebuild the fs. So that will probably be what I do.
>
> Is there any way I can remove just the files with bad data resulting
> from the disk error, so I just get those out of the way?

If Btrfs detects corruption of data extents, it will tell you the
exact path to file names affected, as kernel messages. If you aren't
getting that, then it's some other problem.
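
If you want to double-check, a quick sketch for digging those messages
out (exact wording varies by kernel version):

  dmesg | grep -iE 'btrfs.*(csum|checksum)'
  journalctl -k | grep -i 'csum'    # on systemd systems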



-- 
Chris Murphy


Re: A partially failing disk in raid0 needs replacement

2017-11-29 Thread Klaus Agnoletti
Hi Chris,

>
> I assume when you get that, either when deleting the device or
> scrubbing, that you also see the device unrecoverable read error in
> dmesg, as originally reported. If the drive must have the information
> on that lost sector, and you can't increase SCT ERC time (as well as
> the kernel SCSI command timer), or increasing it doesn't help, then
> that data is lost. It's plausible btrfs check repair is smart enough
> to be able to reconstruct this missing data, but I suspect it isn't
> yet that capable.

That's the 'fun' part: I don't get any kernel messages after changing
the disk, hence my assumption that it's a relatively small, logical
error somewhere on the fs.

>
> So my recommendation is to prepare to lose the file system. Which
> means backing up whatever you can while it's still working, such as it
> is.

Yeah, luckily I can temporarily borrow a couple of 3TB disks to host
the data while I rebuild the fs. So that will probably be what I do.

Is there any way I can remove just the files with bad data resulting
from the disk error, so I just get those out of the way?

Thanks

/klaus


Re: A partially failing disk in raid0 needs replacement

2017-11-29 Thread Chris Murphy
On Wed, Nov 29, 2017 at 6:33 AM, Klaus Agnoletti  wrote:
> Hi list
>
> Can anyone give me any hints here? If not, my plan right now is to
> start updating the server to the latest Debian stable (it's currently
> running Jessie), to get access to a newer btrfs driver and tools,
> hoping that decreases the risk of something screwing up, and then run
> btrfs check --repair on the unmounted fs and wish for the best.
>

New kernel and tools won't fix this:

ERROR: error removing the device '/dev/sdd' - Input/output error

I assume when you get that, either when deleting the device or
scrubbing, that you also see the device unrecoverable read error in
dmesg, as originally reported. If the drive must have the information
on that lost sector, and you can't increase SCT ERC time (as well as
the kernel SCSI command timer), or increasing it doesn't help, then
that data is lost. It's plausible btrfs check repair is smart enough
to be able to reconstruct this missing data, but I suspect it isn't
yet that capable.

So my recommendation is to prepare to lose the file system. Which
means backing up whatever you can while it's still working, such as it
is.



-- 
Chris Murphy


Re: A partially failing disk in raid0 needs replacement

2017-11-29 Thread Klaus Agnoletti
Hi list

Can anyone give me any hints here? If not, my plan right now is to
start updating the server to the latest Debian stable (it's currently
running Jessie), to get access to a newer btrfs driver and tools,
hoping that decreases the risk of something screwing up, and then run
btrfs check --repair on the unmounted fs and wish for the best.

Does that make sense?

Thanks,

/klaus

On Tue, Nov 14, 2017 at 9:36 AM, Klaus Agnoletti  wrote:
> Hi list
>
> I used to have 3x2TB in a btrfs in raid0. A few weeks ago, one of the
> 2TB disks started giving me I/O errors in dmesg like this:
>
> [388659.173819] ata5.00: exception Emask 0x0 SAct 0x7fff SErr 0x0 action 0x0
> [388659.175589] ata5.00: irq_stat 0x4008
> [388659.177312] ata5.00: failed command: READ FPDMA QUEUED
> [388659.179045] ata5.00: cmd 60/20:60:80:96:95/00:00:c4:00:00/40 tag 12 ncq 16384 in
>  res 51/40:1c:84:96:95/00:00:c4:00:00/40 Emask 0x409 (media error)
> [388659.182552] ata5.00: status: { DRDY ERR }
> [388659.184303] ata5.00: error: { UNC }
> [388659.188899] ata5.00: configured for UDMA/133
> [388659.188956] sd 4:0:0:0: [sdd] Unhandled sense code
> [388659.188960] sd 4:0:0:0: [sdd]
> [388659.188962] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> [388659.188965] sd 4:0:0:0: [sdd]
> [388659.188967] Sense Key : Medium Error [current] [descriptor]
> [388659.188970] Descriptor sense data with sense descriptors (in hex):
> [388659.188972] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
> [388659.188981] c4 95 96 84
> [388659.188985] sd 4:0:0:0: [sdd]
> [388659.188988] Add. Sense: Unrecovered read error - auto reallocate failed
> [388659.188991] sd 4:0:0:0: [sdd] CDB:
> [388659.188992] Read(10): 28 00 c4 95 96 80 00 00 20 00
> [388659.189000] end_request: I/O error, dev sdd, sector 3298137732
> [388659.190740] BTRFS: bdev /dev/sdd errs: wr 0, rd 3120, flush 0, corrupt 0, gen 0
> [388659.192556] ata5: EH complete
>
> At the same time, I started getting mails from smartd:
>
> Device: /dev/sdd [SAT], 2 Currently unreadable (pending) sectors
> Device info:
> Hitachi HDS723020BLA642, S/N:MN1220F30MNHUD, WWN:5-000cca-369c8f00b,
> FW:MN6OA580, 2.00 TB
>
> For details see host's SYSLOG.
>
> To fix it, I ended up adding a new 6TB disk and trying to delete the
> failing 2TB disk.
>
> That didn't go so well; apparently, the delete command aborts whenever
> it encounters I/O errors. So now my raid0 looks like this:
>
> klaus@box:~$ sudo btrfs fi show
> [sudo] password for klaus:
> Label: none  uuid: 5db5f82c-2571-4e62-a6da-50da0867888a
> Total devices 4 FS bytes used 5.14TiB
> devid1 size 1.82TiB used 1.78TiB path /dev/sde
> devid2 size 1.82TiB used 1.78TiB path /dev/sdf
> devid3 size 0.00B used 1.49TiB path /dev/sdd
> devid4 size 5.46TiB used 305.21GiB path /dev/sdb
>
> Btrfs v3.17
>
> Obviously, I want /dev/sdd emptied and deleted from the raid.
>
> So how do I do that?
>
> I thought of three possibilities myself. I am sure there are more,
> given that I am in no way a btrfs expert:
>
> 1) Try to force a deletion of /dev/sdd where btrfs copies all intact
> data to the other disks
> 2) Somehow re-balance the raid so that sdd is emptied, and then delete it
> 3) Convert to raid1, physically remove the failing disk, simulate a
> hard error, start the raid degraded, and convert it back to raid0
> again.
>
> How do you guys think I should go about this? Given that it's a raid0
> for a reason, it's not the end of the world losing all data, but I'd
> really prefer losing as little as possible, obviously.
>
> FYI, I tried doing some scrubbing and balancing. There are traces of
> that in the syslog and dmesg I've attached. It's being used as a
> firewall too, so there are a lot of Shorewall block messages spamming
> the log, I'm afraid.
>
> Additional info:
> klaus@box:~$ uname -a
> Linux box 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19)
> x86_64 GNU/Linux
> klaus@box:~$ sudo btrfs --version
> Btrfs v3.17
> klaus@box:~$ sudo btrfs fi df /mnt
> Data, RAID0: total=5.34TiB, used=5.14TiB
> System, RAID0: total=96.00MiB, used=384.00KiB
> Metadata, RAID0: total=7.22GiB, used=5.82GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> Thanks a lot for any help you guys can give me. Btrfs is so incredibly
> cool, compared to md :-) I love it!
>
> --
> Klaus Agnoletti



-- 
Klaus Agnoletti


Re: A partially failing disk in raid0 needs replacement

2017-11-26 Thread Klaus Agnoletti
Hi List

I tried removing the disk roughly like Roman suggested; I copied the data
using ddrescue from /dev/sdd to the new disk /dev/sda. That seemed to work
fine. After that I physically removed the old /dev/sdd and put the new
/dev/sda in its place on the same controller.

After that I tried removing /dev/sdd from the btrfs. And I still get an error:

klaus@box:~$ sudo btrfs device delete /dev/sdd /mnt
[sudo] password for klaus:
ERROR: error removing the device '/dev/sdd' - Input/output error

I tried scrubbing, but that failed, too:

scrub status for 5db5f82c-2571-4e62-a6da-50da0867888a
scrub started at Sun Nov 26 01:27:58 2017 and finished after
21361 seconds
total bytes scrubbed: 5.23TiB with 1 errors
error details: csum=1
corrected errors: 0, uncorrectable errors: 1, unverified errors: 0

The biggest difference now is that I don't get emails from smartd with
sector errors, so I am guessing this is 'just' a logical error.

I've hesitated to do any repairs on the filesystem for fear of
messing things up.

What do you guys think I should do to fix the I/O error?

Thanks,

/klaus

On Tue, Nov 14, 2017 at 3:44 PM, Roman Mamedov  wrote:
> On Tue, 14 Nov 2017 15:09:52 +0100
> Klaus Agnoletti  wrote:
>
>> Hi Roman
>>
>> I almost understand :-) - however, I need a bit more information:
>>
>> How do I copy the image file to the 6TB without screwing the existing
>> btrfs up when the fs is not mounted? Should I remove it from the raid
>> again?
>
> Oh, you already added it to your FS, that's so unfortunate. For my scenario I
> assumed you have a spare 6TB (or any 2TB+) disk you can use as temporary space.
>
> You could try removing it, but with one of the existing member drives
> malfunctioning, I wonder if trying any operation on that FS will cause further
> damage. For example, if you remove the 6TB one, how do you prevent Btrfs from
> using the bad 2TB drive as the destination to relocate data from the 6TB drive?
> Or from using it for one of the metadata mirrors, which will fail to write
> properly, leading to transid failures later, etc.
>
> --
> With respect,
> Roman



-- 
Klaus Agnoletti


Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Chris Murphy
On Tue, Nov 14, 2017 at 1:36 AM, Klaus Agnoletti  wrote:

> Btrfs v3.17

Unrelated to the problem but this is pretty old.


> Linux box 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19)

Also pretty old kernel.


> x86_64 GNU/Linux
> klaus@box:~$ sudo btrfs --version
> Btrfs v3.17
> klaus@box:~$ sudo btrfs fi df /mnt
> Data, RAID0: total=5.34TiB, used=5.14TiB
> System, RAID0: total=96.00MiB, used=384.00KiB
> Metadata, RAID0: total=7.22GiB, used=5.82GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B


The two central problems: failing hardware, and no copies of metadata.
By default, mkfs.btrfs does -draid0 -mraid1 for multiple device
volumes. Explicitly making metadata raid0 basically means it's a
disposable file system the instant there's a problem.
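
For comparison, a sketch of how the two layouts get created explicitly
at mkfs time (placeholder device names):

  mkfs.btrfs -d raid0 -m raid1 /dev/sdX /dev/sdY /dev/sdZ   # striped data, mirrored metadata (the default style)
  mkfs.btrfs -d raid0 -m raid0 /dev/sdX /dev/sdY /dev/sdZ   # what this fs appears to be: no second copy of metadata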

What do you get for
smartctl -l scterc /dev/

If you're lucky, this is really short. If it is something like 7
seconds, there's a chance the data in this sector can be recovered
with a longer recovery time set by the drive *and* also setting the
kernel's SCSI command timer to a value higher than 30 seconds (to
match whatever you pick for the drive's error timeout). I'd pull
something out of my ass like 60 seconds, or hell why not 120 seconds,
for both. Maybe then there won't be a UNC error and you can quickly
catch up your backups at the least.
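
Roughly, and only if the drive supports SCT ERC at all, that would
look something like this (sdX is a placeholder; smartctl takes the
limits in tenths of a second):

  smartctl -l scterc /dev/sdX                # show the current read/write recovery limits
  smartctl -l scterc,1200,1200 /dev/sdX      # ask for ~120 s recovery time, if the drive accepts it
  echo 180 > /sys/block/sdX/device/timeout   # raise the kernel SCSI command timer above that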

But before trying device removal again, assuming changing the error
timeout to be higher is possible, the first thing I'd do is convert
metadata to raid1. Then remove the bad device.
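
Untested against this particular filesystem, but the sketch would be
along these lines (the balance itself may still trip over the
unreadable sectors):

  btrfs balance start -mconvert=raid1 /mnt   # get a second copy of all metadata first
  btrfs device remove /dev/sdd /mnt          # then retry removing the failing drive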


-- 
Chris Murphy


Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Chris Murphy
On Tue, Nov 14, 2017 at 5:48 AM, Roman Mamedov  wrote:
> On Tue, 14 Nov 2017 10:36:22 +0200
> Klaus Agnoletti  wrote:
>
>> Obviously, I want /dev/sdd emptied and deleted from the raid.
>
>   * Unmount the RAID0 FS
>
>   * copy the bad drive using `dd_rescue`[1] into a file on the 6TB drive
> (noting how much of it is actually unreadable -- chances are it's mostly
> intact)

This almost certainly will not work now: the delete command has copied
metadata to the 6TB drive, so it would have to be removed first to get
rid of that metadata, of Btrfs's record of that member device (to avoid
it being considered missing), and of any chunks successfully copied
over.




-- 
Chris Murphy


Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Chris Murphy
On Tue, Nov 14, 2017 at 5:38 AM, Adam Borowski  wrote:
> On Tue, Nov 14, 2017 at 10:36:22AM +0200, Klaus Agnoletti wrote:
>> I used to have 3x2TB in a btrfs in raid0. A few weeks ago, one of the
>  ^
>> 2TB disks started giving me I/O errors in dmesg like this:
>>
>> [388659.188988] Add. Sense: Unrecovered read error - auto reallocate failed
>
> Alas, chances to recover anything are pretty slim.  That's RAID0 metadata
> for you.
>
> On the other hand, losing any non-trivial file while being able to gape at
> intact metadata isn't that much better, thus -mraid0 isn't completely
> unreasonable.

I don't know the statistics on UNC read error vs total drive failure.
If I thought that total drive failure was 2x or more likely than a
single UNC then maybe raid0 is reasonable. But it's a 64KB block size
for raid0. I think metadata raid0 probably doesn't offer that much
performance improvement over raid1, and if it did, that's a case for
raid10 metadata.

In the UNC case, chances are it hits a data extent of a single file,
in which case Btrfs can handle this fine, you just lose that one file.
And if it hits the smaller target of metadata, it's fine if metadata
is raid1 or raid10.

In a previous email in the archives, I did a test where I
intentionally formatted one member drive of a Btrfs data raid0,
metadata raid1, and it was totally recoverable with a bunch of scary
messages and sometimes a file was corrupted. So it actually is pretty
darn resilient when there is a copy of metadata. (I did not try DUP.)




-- 
Chris Murphy


Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Klaus Agnoletti
Hi Roman,

If you look at the 'show' command, the failing disk is sorta out of
the fs, so maybe removing the 6TB disk again will redistribute the data
already on the 6TB disk (which isn't more than 300-something gigs) to
the two well-functioning disks.

Still, as putting the dd-image of the 2TB disk on the temporary disk
is only temporary, I do need one more 2TB+ disk attached to create a
more permanent btrfs with the 6TB disk (which is what I eventually
want). And for that I need some more harddisk power cables/splitters.
And another disk. But that still seems to be the best option, so I
will do that once I have those things sorted out.

Thanks for your creative suggestion :)

/klaus


On Tue, Nov 14, 2017 at 4:44 PM, Roman Mamedov  wrote:
> On Tue, 14 Nov 2017 15:09:52 +0100
> Klaus Agnoletti  wrote:
>
>> Hi Roman
>>
>> I almost understand :-) - however, I need a bit more information:
>>
>> How do I copy the image file to the 6TB without screwing the existing
>> btrfs up when the fs is not mounted? Should I remove it from the raid
>> again?
>
> Oh, you already added it to your FS, that's so unfortunate. For my scenario I
> assumed you have a spare 6TB (or any 2TB+) disk you can use as temporary space.
>
> You could try removing it, but with one of the existing member drives
> malfunctioning, I wonder if trying any operation on that FS will cause further
> damage. For example, if you remove the 6TB one, how do you prevent Btrfs from
> using the bad 2TB drive as the destination to relocate data from the 6TB drive?
> Or from using it for one of the metadata mirrors, which will fail to write
> properly, leading to transid failures later, etc.
>
> --
> With respect,
> Roman



-- 
Klaus Agnoletti


Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Roman Mamedov
On Tue, 14 Nov 2017 15:09:52 +0100
Klaus Agnoletti  wrote:

> Hi Roman
> 
> I almost understand :-) - however, I need a bit more information:
> 
> How do I copy the image file to the 6TB without screwing the existing
> btrfs up when the fs is not mounted? Should I remove it from the raid
> again?

Oh, you already added it to your FS, that's so unfortunate. For my scenario I
assumed you have a spare 6TB (or any 2TB+) disk you can use as temporary space.

You could try removing it, but with one of the existing member drives
malfunctioning, I wonder if trying any operation on that FS will cause further
damage. For example, if you remove the 6TB one, how do you prevent Btrfs from
using the bad 2TB drive as the destination to relocate data from the 6TB drive?
Or from using it for one of the metadata mirrors, which will fail to write
properly, leading to transid failures later, etc.

-- 
With respect,
Roman


Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Kai Krakow
Am Tue, 14 Nov 2017 17:48:56 +0500
schrieb Roman Mamedov :

> [1] Note that "ddrescue" and "dd_rescue" are two different programs
> for the same purpose, one may work better than the other. I don't
> remember which. :)

One is a perl implementation and is the one working worse. ;-)


-- 
Regards,
Kai

Replies to list-only preferred.



Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Klaus Agnoletti
Hi Austin

Good points. Thanks a lot.

/klaus

On Tue, Nov 14, 2017 at 2:14 PM, Austin S. Hemmelgarn
 wrote:
> On 2017-11-14 03:36, Klaus Agnoletti wrote:
>>
>> Hi list
>>
>> I used to have 3x2TB in a btrfs in raid0. A few weeks ago, one of the
>> 2TB disks started giving me I/O errors in dmesg like this:
>>
>> [388659.173819] ata5.00: exception Emask 0x0 SAct 0x7fff SErr 0x0 action 0x0
>> [388659.175589] ata5.00: irq_stat 0x4008
>> [388659.177312] ata5.00: failed command: READ FPDMA QUEUED
>> [388659.179045] ata5.00: cmd 60/20:60:80:96:95/00:00:c4:00:00/40 tag 12 ncq 16384 in
>>   res 51/40:1c:84:96:95/00:00:c4:00:00/40 Emask 0x409 (media error)
>> [388659.182552] ata5.00: status: { DRDY ERR }
>> [388659.184303] ata5.00: error: { UNC }
>> [388659.188899] ata5.00: configured for UDMA/133
>> [388659.188956] sd 4:0:0:0: [sdd] Unhandled sense code
>> [388659.188960] sd 4:0:0:0: [sdd]
>> [388659.188962] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
>> [388659.188965] sd 4:0:0:0: [sdd]
>> [388659.188967] Sense Key : Medium Error [current] [descriptor]
>> [388659.188970] Descriptor sense data with sense descriptors (in hex):
>> [388659.188972] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
>> [388659.188981] c4 95 96 84
>> [388659.188985] sd 4:0:0:0: [sdd]
>> [388659.188988] Add. Sense: Unrecovered read error - auto reallocate failed
>> [388659.188991] sd 4:0:0:0: [sdd] CDB:
>> [388659.188992] Read(10): 28 00 c4 95 96 80 00 00 20 00
>> [388659.189000] end_request: I/O error, dev sdd, sector 3298137732
>> [388659.190740] BTRFS: bdev /dev/sdd errs: wr 0, rd 3120, flush 0, corrupt 0, gen 0
>> [388659.192556] ata5: EH complete
>
> Just some background, but this error is usually indicative of either media
> degradation from long-term usage, or a head crash.
>>
>>
>> At the same time, I started getting mails from smartd:
>>
>> Device: /dev/sdd [SAT], 2 Currently unreadable (pending) sectors
>> Device info:
>> Hitachi HDS723020BLA642, S/N:MN1220F30MNHUD, WWN:5-000cca-369c8f00b,
>> FW:MN6OA580, 2.00 TB
>>
>> For details see host's SYSLOG.
>
> And this correlates with the above errors (although the current pending
> sectors being non-zero is less specific than the above).
>>
>>
>> To fix it, I ended up adding a new 6TB disk and trying to delete the
>> failing 2TB disk.
>>
>> That didn't go so well; apparently, the delete command aborts whenever
>> it encounters I/O errors. So now my raid0 looks like this:
>
> I'm not going to comment on how to fix the current situation, as what has
> been stated in other people's replies pretty well covers that.
>
> I would however like to mention two things for future reference:
>
> 1. The delete command handles I/O errors just fine, provided that there is
> some form of redundancy in the filesystem.  In your case, if this had been a
> raid1 array instead of raid0, then the delete command would have just fallen
> back to the other copy of the data when it hit an I/O error instead of
> dying.  Just like a regular RAID0 array (be it LVM, MD, or hardware), you
> can't lose a device in a BTRFS raid0 array without losing the array.
>
> 2. While it would not have helped in this case, the preferred method when
> replacing a device is to use the `btrfs replace` command.  It's a lot more
> efficient than add+delete (and exponentially more efficient than
> delete+add), and also a bit safer (in both cases because it needs to move
> less data).  The only down-side to it is that you may need a couple of
> resize commands around it.
>
>>
>> klaus@box:~$ sudo btrfs fi show
>> [sudo] password for klaus:
>> Label: none  uuid: 5db5f82c-2571-4e62-a6da-50da0867888a
>>  Total devices 4 FS bytes used 5.14TiB
>>  devid1 size 1.82TiB used 1.78TiB path /dev/sde
>>  devid2 size 1.82TiB used 1.78TiB path /dev/sdf
>>  devid3 size 0.00B used 1.49TiB path /dev/sdd
>>  devid4 size 5.46TiB used 305.21GiB path /dev/sdb
>>
>> Btrfs v3.17
>>
>> Obviously, I want /dev/sdd emptied and deleted from the raid.
>>
>> So how do I do that?
>>
>> I thought of three possibilities myself. I am sure there are more,
>> given that I am in no way a btrfs expert:
>>
>> 1) Try to force a deletion of /dev/sdd where btrfs copies all intact
>> data to the other disks
>> 2) Somehow re-balance the raid so that sdd is emptied, and then delete it
>> 3) Convert to raid1, physically remove the failing disk, simulate a
>> hard error, start the raid degraded, and convert it back to raid0
>> again.
>>
>> How do you guys think I should go about this? Given that it's a raid0
>> for a reason, it's not the end of the world losing all data, but I'd
>> really prefer losing as little as possible, obviously.
>>
>> FYI, I tried doing some scrubbing and balancing. There's traces of
>> that in the syslog and dmesg I've attached. It's being used as
>> firewall 

Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Klaus Agnoletti
Hi Roman

I almost understand :-) - however, I need a bit more information:

How do I copy the image file to the 6TB without screwing the existing
btrfs up when the fs is not mounted? Should I remove it from the raid
again?

Also, as you might have noticed, I have a bit of an issue with the
entire space of the 6TB disk being added to the btrfs when I added the
disk. There's something kinda basic about using btrfs that I haven't
really understood yet. Maybe you - or someone else - can point me in
the right direction in terms of documentation.

Thanks

/klaus

On Tue, Nov 14, 2017 at 1:48 PM, Roman Mamedov  wrote:
> On Tue, 14 Nov 2017 10:36:22 +0200
> Klaus Agnoletti  wrote:
>
>> Obviously, I want /dev/sdd emptied and deleted from the raid.
>
>   * Unmount the RAID0 FS
>
>   * copy the bad drive using `dd_rescue`[1] into a file on the 6TB drive
> (noting how much of it is actually unreadable -- chances are it's mostly
> intact)
>
>   * physically remove the bad drive (have a powerdown or reboot for this to be
> sure Btrfs didn't remember it somewhere)
>
>   * set up a loop device from the dd_rescue'd 2TB file
>
>   * run `btrfs device scan`
>
>   * mount the RAID0 filesystem
>
>   * run the delete command on the loop device, it will not encounter I/O
> errors anymore.
>
>
> [1] Note that "ddrescue" and "dd_rescue" are two different programs for the
> same purpose, one may work better than the other. I don't remember which. :)
>
> --
> With respect,
> Roman



-- 
Klaus Agnoletti


Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Austin S. Hemmelgarn

On 2017-11-14 03:36, Klaus Agnoletti wrote:

Hi list

I used to have 3x2TB in a btrfs in raid0. A few weeks ago, one of the
2TB disks started giving me I/O errors in dmesg like this:

[388659.173819] ata5.00: exception Emask 0x0 SAct 0x7fff SErr 0x0 action 0x0
[388659.175589] ata5.00: irq_stat 0x4008
[388659.177312] ata5.00: failed command: READ FPDMA QUEUED
[388659.179045] ata5.00: cmd 60/20:60:80:96:95/00:00:c4:00:00/40 tag 12 ncq 16384 in
  res 51/40:1c:84:96:95/00:00:c4:00:00/40 Emask 0x409 (media error)
[388659.182552] ata5.00: status: { DRDY ERR }
[388659.184303] ata5.00: error: { UNC }
[388659.188899] ata5.00: configured for UDMA/133
[388659.188956] sd 4:0:0:0: [sdd] Unhandled sense code
[388659.188960] sd 4:0:0:0: [sdd]
[388659.188962] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[388659.188965] sd 4:0:0:0: [sdd]
[388659.188967] Sense Key : Medium Error [current] [descriptor]
[388659.188970] Descriptor sense data with sense descriptors (in hex):
[388659.188972] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[388659.188981] c4 95 96 84
[388659.188985] sd 4:0:0:0: [sdd]
[388659.188988] Add. Sense: Unrecovered read error - auto reallocate failed
[388659.188991] sd 4:0:0:0: [sdd] CDB:
[388659.188992] Read(10): 28 00 c4 95 96 80 00 00 20 00
[388659.189000] end_request: I/O error, dev sdd, sector 3298137732
[388659.190740] BTRFS: bdev /dev/sdd errs: wr 0, rd 3120, flush 0, corrupt 0, gen 0
[388659.192556] ata5: EH complete
Just some background, but this error is usually indicative of either 
media degradation from long-term usage, or a head crash.


At the same time, I started getting mails from smartd:

Device: /dev/sdd [SAT], 2 Currently unreadable (pending) sectors
Device info:
Hitachi HDS723020BLA642, S/N:MN1220F30MNHUD, WWN:5-000cca-369c8f00b,
FW:MN6OA580, 2.00 TB

For details see host's SYSLOG.
And this correlates with the above errors (although the current pending 
sectors being non-zero is less specific than the above).


To fix it, I ended up adding a new 6TB disk and trying to delete the
failing 2TB disk.

That didn't go so well; apparently, the delete command aborts whenever
it encounters I/O errors. So now my raid0 looks like this:
I'm not going to comment on how to fix the current situation, as what 
has been stated in other people's replies pretty well covers that.


I would however like to mention two things for future reference:

1. The delete command handles I/O errors just fine, provided that there 
is some form of redundancy in the filesystem.  In your case, if this had 
been a raid1 array instead of raid0, then the delete command would have 
just fallen back to the other copy of the data when it hit an I/O error 
instead of dying.  Just like a regular RAID0 array (be it LVM, MD, or 
hardware), you can't lose a device in a BTRFS raid0 array without losing 
the array.


2. While it would not have helped in this case, the preferred method 
when replacing a device is to use the `btrfs replace` command.  It's a 
lot more efficient than add+delete (and exponentially more efficient 
than delete+add), and also a bit safer (in both cases because it needs 
to move less data).  The only down-side to it is that you may need a 
couple of resize commands around it.
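
For future reference, a sketch of that flow with placeholder device
names (the devid for the resize comes from 'btrfs fi show'):

  btrfs replace start /dev/sdOLD /dev/sdNEW /mnt   # copy the old member's chunks onto the new drive
  btrfs replace status /mnt
  btrfs filesystem resize 3:max /mnt               # if the new drive is larger; '3' is just an example devid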


klaus@box:~$ sudo btrfs fi show
[sudo] password for klaus:
Label: none  uuid: 5db5f82c-2571-4e62-a6da-50da0867888a
 Total devices 4 FS bytes used 5.14TiB
 devid1 size 1.82TiB used 1.78TiB path /dev/sde
 devid2 size 1.82TiB used 1.78TiB path /dev/sdf
 devid3 size 0.00B used 1.49TiB path /dev/sdd
 devid4 size 5.46TiB used 305.21GiB path /dev/sdb

Btrfs v3.17

Obviously, I want /dev/sdd emptied and deleted from the raid.

So how do I do that?

I thought of three possibilities myself. I am sure there are more,
given that I am in no way a btrfs expert:

1) Try to force a deletion of /dev/sdd where btrfs copies all intact
data to the other disks
2) Somehow re-balance the raid so that sdd is emptied, and then delete it
3) Convert to raid1, physically remove the failing disk, simulate a
hard error, start the raid degraded, and convert it back to raid0
again.

How do you guys think I should go about this? Given that it's a raid0
for a reason, it's not the end of the world losing all data, but I'd
really prefer losing as little as possible, obviously.

FYI, I tried doing some scrubbing and balancing. There are traces of
that in the syslog and dmesg I've attached. It's being used as a
firewall too, so there are a lot of Shorewall block messages spamming
the log, I'm afraid.

Additional info:
klaus@box:~$ uname -a
Linux box 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19)
x86_64 GNU/Linux
klaus@box:~$ sudo btrfs --version
Btrfs v3.17
klaus@box:~$ sudo btrfs fi df /mnt
Data, RAID0: total=5.34TiB, used=5.14TiB
System, RAID0: total=96.00MiB, used=384.00KiB
Metadata, RAID0: total=7.22GiB, 

Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Austin S. Hemmelgarn

On 2017-11-14 07:48, Roman Mamedov wrote:

On Tue, 14 Nov 2017 10:36:22 +0200
Klaus Agnoletti  wrote:


Obviously, I want /dev/sdd emptied and deleted from the raid.


   * Unmount the RAID0 FS

   * copy the bad drive using `dd_rescue`[1] into a file on the 6TB drive
 (noting how much of it is actually unreadable -- chances are it's mostly
 intact)

   * physically remove the bad drive (have a powerdown or reboot for this to be
 sure Btrfs didn't remember it somewhere)

   * set up a loop device from the dd_rescue'd 2TB file

   * run `btrfs device scan`

   * mount the RAID0 filesystem

   * run the delete command on the loop device, it will not encounter I/O
 errors anymore.
While the above procedure will work, it is worth noting that you may 
still lose data.



[1] Note that "ddrescue" and "dd_rescue" are two different programs for the
same purpose, one may work better than the other. I don't remember which. :)
As a general rule, GNU ddrescue is more user friendly for block-level 
copies, while Kurt Garloff's dd_rescue tends to be better for copying at 
the file level.  Both work fine in terms of reliability though.



Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Patrik Lundquist
On 14 November 2017 at 09:36, Klaus Agnoletti  wrote:
>
> How do you guys think I should go about this?

I'd clone the disk with GNU ddrescue.

https://www.gnu.org/software/ddrescue/
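
A minimal sketch of the invocation, assuming the failing disk is
/dev/sdd and the clone target is a blank disk of at least the same
size (placeholder name):

  ddrescue -f -n /dev/sdd /dev/sdX /root/sdd.map   # first pass, skip the slow retries
  ddrescue -f -r3 /dev/sdd /dev/sdX /root/sdd.map  # then retry the bad areas a few times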


Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Roman Mamedov
On Tue, 14 Nov 2017 10:36:22 +0200
Klaus Agnoletti  wrote:

> Obviously, I want /dev/sdd emptied and deleted from the raid.

  * Unmount the RAID0 FS

  * copy the bad drive using `dd_rescue`[1] into a file on the 6TB drive
(noting how much of it is actually unreadable -- chances are it's mostly
intact)

  * physically remove the bad drive (have a powerdown or reboot for this to be
sure Btrfs didn't remember it somewhere)

  * set up a loop device from the dd_rescue'd 2TB file

  * run `btrfs device scan`

  * mount the RAID0 filesystem

  * run the delete command on the loop device, it will not encounter I/O
errors anymore.


[1] Note that "ddrescue" and "dd_rescue" are two different programs for the
same purpose, one may work better than the other. I don't remember which. :)
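
Condensed into commands, the steps above might look roughly like this,
assuming the 6TB drive carries its own filesystem mounted at /mnt6tb
(placeholder paths and device names):

  umount /mnt
  ddrescue /dev/sdd /mnt6tb/sdd.img /mnt6tb/sdd.map   # image the bad drive onto the 6TB disk
  # power down, pull the bad drive, boot again
  losetup -f --show /mnt6tb/sdd.img                   # prints the loop device, e.g. /dev/loop0
  btrfs device scan
  mount /dev/sde /mnt                                 # mount via any surviving member
  btrfs device delete /dev/loop0 /mnt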

-- 
With respect,
Roman


Re: A partially failing disk in raid0 needs replacement

2017-11-14 Thread Adam Borowski
On Tue, Nov 14, 2017 at 10:36:22AM +0200, Klaus Agnoletti wrote:
> I used to have 3x2TB in a btrfs in raid0. A few weeks ago, one of the
 ^
> 2TB disks started giving me I/O errors in dmesg like this:
> 
> [388659.188988] Add. Sense: Unrecovered read error - auto reallocate failed

Alas, chances to recover anything are pretty slim.  That's RAID0 metadata
for you.

On the other hand, losing any non-trivial file while being able to gape at
intact metadata isn't that much better, thus -mraid0 isn't completely
unreasonable.

> To fix it, I ended up adding a new 6TB disk and trying to delete the
> failing 2TB disk.
> 
> That didn't go so well; apparently, the delete command aborts whenever
> it encounters I/O errors. So now my raid0 looks like this:
> 
> klaus@box:~$ sudo btrfs fi show
> [sudo] password for klaus:
> Label: none  uuid: 5db5f82c-2571-4e62-a6da-50da0867888a
> Total devices 4 FS bytes used 5.14TiB
> devid1 size 1.82TiB used 1.78TiB path /dev/sde
> devid2 size 1.82TiB used 1.78TiB path /dev/sdf
> devid3 size 0.00B used 1.49TiB path /dev/sdd
> devid4 size 5.46TiB used 305.21GiB path /dev/sdb

> Obviously, I want /dev/sdd emptied and deleted from the raid.
> 
> So how do I do that?
> 
> I thought of three possibilities myself. I am sure there are more,
> given that I am in no way a btrfs expert:
> 
> 1) Try to force a deletion of /dev/sdd where btrfs copies all intact
> data to the other disks
> 2) Somehow re-balance the raid so that sdd is emptied, and then delete it
> 3) Convert to raid1, physically remove the failing disk, simulate a
> hard error, start the raid degraded, and convert it back to raid0
> again.

There's hardly any intact data: roughly 2/3 of chunks have half of their
blocks on the failed disk, densely interspersed.  Even worse, metadata
required to map those blocks to files is gone, too: if we naively assume
there's only a single tree, a tree node is intact only if it and every
single node on the path to the root is intact.  In practice, this means
it's a total filesystem loss.

> How do you guys think I should go about this? Given that it's a raid0
> for a reason, it's not the end of the world losing all data, but I'd
> really prefer losing as little as possible, obviously.

As the disk isn't _completely_ gone, there's a slim chance of some stuff
requiring only still-readable sectors.  Probably a waste of time to try
to recover, though.
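
If you did want to try, the usual route is offline extraction rather
than mounting; a sketch with placeholder paths:

  mkdir -p /mnt/salvage
  btrfs restore -i -v /dev/sde /mnt/salvage   # copy out whatever is still reachable, ignoring errors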


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ Laws we want back: Poland, Dz.U. 1921 nr.30 poz.177 (also Dz.U. 
⣾⠁⢰⠒⠀⣿⡁ 1920 nr.11 poz.61): Art.2: An official, guilty of accepting a gift
⢿⡄⠘⠷⠚⠋⠀ or another material benefit, or a promise thereof, [in matters
⠈⠳⣄ relevant to duties], shall be punished by death by shooting.