Re: Can't remove device -> I/O error

2017-09-30 Thread Duncan
Dirk Diggler posted on Fri, 29 Sep 2017 22:00:28 +0200 as excerpted:

> is there any chance to get my device removed?
> Scrub literally takes months to complete (SATA 2/3 mix, about 1 minute
> per gigabyte) and I'm not sure if that helps.
> I guess it's the same with balance. Maybe there is a quicker way. I can do
> without some data if it's corrupted. I have a backup, but I want to
> avoid copying all the data from scratch!

btrfs device remove uses an implicit balance to move data to other 
devices, so even if btrfs device remove were to work for you, it'd 
proceed at the same speed as balance.
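
As a rough back-of-the-envelope sketch (not something the btrfs tools provide), the quoted rate of about 1 minute per gigabyte can be turned into a time estimate. The function name estimate_hours is made up for illustration, and 333.57 is simply the 333.00GiB + 587.38MiB shown as allocated on /dev/sdj in the original post. Since removal relocates whole block groups rather than only the stripes on that one device, treat the result as a lower bound at best.

```shell
#!/bin/sh
# estimate_hours GIB MIN_PER_GIB
# Prints the estimated duration in hours (one decimal place).
estimate_hours() {
    awk -v gib="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", gib * rate / 60 }'
}

# Data allocated on /dev/sdj, at the quoted ~1 minute per gigabyte.
estimate_hours 333.57 1
```

Plugging in a larger figure (for example the total size of every block group that has a stripe on the device) gives a more pessimistic, and probably more realistic, number.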

[tl;dr stop there]

Even in the generic (non-btrfs) case, parity-raid is known to be slow for 
writes and therefore isn't recommended when speed has any priority above 
the minimum. It suits only storage where raw capacity and some level of 
device-failure recovery are what matter, and minimal speed is acceptable.

Between that and the btrfs-specific issues btrfs parity-raid had until 
kernel 4.13, with the known bugs now fixed (though not the write hole, 
which isn't btrfs-specific) but with the possibility of unknown issues 
still lurking, I'd still not consider btrfs parity-raid particularly 
viable, tho it's no longer entirely blacklisted as it was until those 
4.13 fixes.

So I'd suggest surrendering the fight and chalking it up to a learning 
experience. Either take the loss now and switch to something else, say 
btrfs raid1 on top of dm/mdraid-0 for higher speed, or btrfs raid10 if 
you prefer to stick with a single layer at the sacrifice of speed. Or, as 
you wrote in a different subthread, just stick with what you have (since 
you do have backups) until a device dies and you really have no 
alternative but to eat that "weeks to fix" penalty.

Of course if you have the resources, you can do both at once, continuing 
to operate on the existing setup, while you create an entirely new setup 
and either initialize it from the backups, or start copying data to it 
off the still live raid5, presumably at idle priority so as to affect 
other operations as little as possible.  But the resource requirements to 
keep both the old and the new in operation at once until you can switch 
over to the new entirely, are high enough it may not be feasible.
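
For the "idle priority" copy, a minimal sketch might look like the following. It assumes util-linux ionice and a cp that supports -a are available; idle_copy is a hypothetical helper name and the paths in the usage line are placeholders. The idle I/O class only has an effect with I/O schedulers that honour it (CFQ or BFQ, as far as I know), so it is worth checking which scheduler the source disks use.

```shell
#!/bin/sh
# idle_copy SRC_DIR DST_DIR
# Copies the contents of SRC_DIR into DST_DIR at the lowest CPU priority,
# and in the idle I/O class when ionice is usable on this system.
idle_copy() {
    src=$1
    dst=$2
    if ionice -c 3 true 2>/dev/null; then
        ionice -c 3 nice -n 19 cp -a -- "$src"/. "$dst"/
    else
        # ionice missing or not permitted: fall back to CPU niceness only.
        nice -n 19 cp -a -- "$src"/. "$dst"/
    fi
}

# Usage (placeholder paths): idle_copy /mnt/old-raid5 /mnt/new-fs
```

A plain cp keeps the sketch short; a resumable tool would be a better fit for a copy that may run for days.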

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Can't remove device -> I/O error

2017-09-30 Thread DocMAX

Thank you for all your effort.
Under "normal" conditions I know that the remove/delete command works 
(I have done it several times before).
But in this case it seems that I have some inconsistent data which 
prevents the operation from completing.




On 30.09.2017 at 13:40, Goffredo Baroncelli wrote:
> On 09/30/2017 12:40 PM, DocMAX wrote:
>> I removed it with the "echo" command and also physically.
>>
>> Both attempts ended with an I/O error.
>
> Below are the steps I used to simulate your issue (in a virtual machine):





Re: Can't remove device -> I/O error

2017-09-30 Thread Goffredo Baroncelli
On 09/30/2017 12:40 PM, DocMAX wrote:
> I removed it with the "echo" command and also physically.
> 
> Both attempts ended with an I/O error.

Below are the steps I used to simulate your issue (in a virtual machine):

### created the filesystem, and populated it (with about 500MB)
$ sudo mkfs.btrfs --force -d RAID5 -m RAID1 /dev/vd[bcd] /dev/sda
btrfs-progs v4.7.3
See http://btrfs.wiki.kernel.org for more information.

Performing full device TRIM (10.00GiB) ...
Label:              (null)
UUID:
Node size:          16384
Sector size:        4096
Filesystem size:    40.00GiB
Block group profiles:
  Data:             RAID5             3.00GiB
  Metadata:         RAID1             1.00GiB
  System:           RAID1             8.00MiB
SSD detected:       no
Incompat features:  extref, raid56, skinny-metadata
Number of devices:  4
Devices:
   ID        SIZE  PATH
    1    10.00GiB  /dev/vdb
    2    10.00GiB  /dev/vdc
    3    10.00GiB  /dev/vdd
    4    10.00GiB  /dev/sda

ghigo@emulato:~$ sudo mount /dev/sda /mnt/btrfs1
ghigo@emulato:~$ sudo cp -rfa /lib/modules/ /mnt/btrfs1/
ghigo@emulato:~$ sudo umount /mnt/btrfs1/

## remove the device; note that after this step /dev/sda is unreachable
## from both userspace and kernel space

ghigo@emulato:~$ sudo -i
root@emulato:~# echo 1 >/sys/block/sda/device/delete 
root@emulato:~# logout

## mount the filesystem in "degraded mode" and delete the missing device
##

ghigo@emulato:~$ sudo mount -o degraded /dev/vdb /mnt/btrfs1
ghigo@emulato:~$ sudo btrfs dev us /mnt/btrfs1/
/dev/sda, ID: 4
   Device size:         0.00B
   Device slack:     16.00EiB
   Data,RAID5:        1.00GiB
   System,RAID1:      8.00MiB
   Unallocated:       8.99GiB

/dev/vdb, ID: 1
   Device size:      10.00GiB
   Device slack:        0.00B
   Data,RAID5:        1.00GiB
   Metadata,RAID1:    1.00GiB
   Unallocated:       8.00GiB

/dev/vdc, ID: 2
   Device size:      10.00GiB
   Device slack:        0.00B
   Data,RAID5:        1.00GiB
   Metadata,RAID1:    1.00GiB
   Unallocated:       8.00GiB

/dev/vdd, ID: 3
   Device size:      10.00GiB
   Device slack:        0.00B
   Data,RAID5:        1.00GiB
   System,RAID1:      8.00MiB
   Unallocated:       8.99GiB

ghigo@emulato:~$ sudo btrfs dev del missing /mnt/btrfs1/

$ sudo btrfs fi us /mnt/btrfs1/
WARNING: RAID56 detected, not implemented
Overall:
    Device size:                  30.00GiB
    Device allocated:              2.06GiB
    Device unallocated:           27.94GiB
    Device missing:                  0.00B
    Used:                         47.97MiB
    Free (estimated):                0.00B      (min: 8.00EiB)
    Data ratio:                       0.00
    Metadata ratio:                   2.00
    Global reserve:               16.00MiB      (used: 0.00B)

Data,RAID5: Size:2.00GiB, Used:1.54GiB
   /dev/vdb        1.00GiB
   /dev/vdc        1.00GiB
   /dev/vdd        1.00GiB

Metadata,RAID1: Size:1.00GiB, Used:23.97MiB
   /dev/vdb        1.00GiB
   /dev/vdc        1.00GiB

System,RAID1: Size:32.00MiB, Used:16.00KiB
   /dev/vdc       32.00MiB
   /dev/vdd       32.00MiB

Unallocated:
   /dev/vdb        8.00GiB
   /dev/vdc        7.97GiB
   /dev/vdd        8.97GiB


And I don't get any errors in dmesg.
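
The first test above boils down to four commands. A minimal dry-run sketch, assuming a POSIX shell: remove_missing_device and step are hypothetical helper names, and nothing is executed unless RUN=1 is set (which needs root and really does detach the device). Arguments containing spaces are not handled.

```shell
#!/bin/sh
# step CMD: print CMD, or run it through sh when RUN=1.
step() {
    if [ "${RUN:-0}" = 1 ]; then
        sh -c "$1"
    else
        printf '%s\n' "$1"
    fi
}

# remove_missing_device KERNEL_NAME MEMBER_DEV MOUNT_POINT
remove_missing_device() {
    failed=$1   # kernel name of the device to drop, e.g. sda
    member=$2   # any remaining member device, e.g. /dev/vdb
    mnt=$3      # mount point of the filesystem

    step "umount $mnt" &&
    step "echo 1 > /sys/block/$failed/device/delete" &&
    step "mount -o degraded $member $mnt" &&
    step "btrfs device delete missing $mnt"
}

# Dry run with the device names from the test above.
remove_missing_device sda /dev/vdb /mnt/btrfs1
```

The steps are chained with && so that a real run stops at the first failing command instead of carrying on against a filesystem in an unexpected state.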

I ran another test: I removed the device without the "umount / mount -o degraded" step

# create the filesystem and populate it with about 1GB of data

$ sudo mkfs.btrfs --force -d RAID5 -m RAID1 /dev/vd[bcd] /dev/sda
btrfs-progs v4.7.3
See http://btrfs.wiki.kernel.org for more information.

Performing full device TRIM (10.00GiB) ...
Label:              (null)
UUID:
Node size:          16384
Sector size:        4096
Filesystem size:    40.00GiB
Block group profiles:
  Data:             RAID5             3.00GiB
  Metadata:         RAID1             1.00GiB
  System:           RAID1             8.00MiB
SSD detected:       no
Incompat features:  extref, raid56, skinny-metadata
Number of devices:  4
Devices:
   ID        SIZE  PATH
    1    10.00GiB  /dev/vdb
    2    10.00GiB  /dev/vdc
    3    10.00GiB  /dev/vdd
    4    10.00GiB  /dev/sda

ghigo@emulato:~$ sudo mount /dev/vdb /mnt/btrfs1
ghigo@emulato:~$ sudo cp -rfa /lib/modules/ /mnt/btrfs1/

ghigo@emulato:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            1.5G     0  1.5G   0% /dev
tmpfs           302M  4.3M  297M   2% /run
/dev/vda         99G  4.5G   89G   5% /
tmpfs           1.5G     0  1.5G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           1.5G     0  1.5G   0% /sys/fs/cgroup
tmpfs           302M     0  302M   0% /run/user/1000
/dev/vdb         40G  1.7G   36G   5% /mnt/btrfs1
ghigo@emulato:~$ sudo btrfs fi df /mnt/btrfs1/
Data, RAID5: total=3.00GiB, used=1.54GiB
System, RAID1: total=8.00MiB, used=16.00KiB
Metadata, RAID1: total=1.00GiB, used=22.72MiB
GlobalReserve, single: total=16.00MiB, used=0.00B

 

Re: Can't remove device -> I/O error

2017-09-30 Thread DocMAX

I removed it with the "echo" command and also physically.

Both attempts ended with an I/O error.



On 30.09.2017 at 09:16, Goffredo Baroncelli wrote:
> On 09/30/2017 01:06 AM, DocMAX wrote:
>>> Did you remove the disk before mounting (physically, or by doing
>>> "echo 1 >/sys/block/xxx/device/delete")? Which steps did you perform?
>>
>> - removed drive physically
>>
>> - mounted degraded mode
>>
>> - btrfs dev del -> same i/o error
>
> Did you switch off the machine? If not, before mounting in degraded mode, do
> "echo 1 >/sys/block/xxx/device/delete". After mounting, do a "btrfs dev del
> missing".








Re: Can't remove device -> I/O error

2017-09-30 Thread Goffredo Baroncelli
On 09/30/2017 01:06 AM, DocMAX wrote:
>>> Did you remove the disk before mounting (physically, or by doing
>>> "echo 1 >/sys/block/xxx/device/delete")? Which steps did you perform?
> 
> - removed drive physically
> 
> - mounted degraded mode
> 
> - btrfs dev del -> same i/o error
> 

Did you switch off the machine? If not, before mounting in degraded mode, do 
"echo 1 >/sys/block/xxx/device/delete". After mounting, do a "btrfs dev del 
missing".



> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: Can't remove device -> I/O error

2017-09-29 Thread DocMAX
>> Did you remove the disk before mounting (physically, or by doing
>> "echo 1 >/sys/block/xxx/device/delete")? Which steps did you perform?


- removed drive physically

- mounted degraded mode

- btrfs dev del -> same i/o error




Re: Can't remove device -> I/O error

2017-09-29 Thread Goffredo Baroncelli
On 09/29/2017 11:09 PM, DocMAX wrote:
> Thanks for the reply.
> 
> I don't want to replace the drive. I want to remove.
> 
> Also tried in degraded mode. I get the exact same error.

Did you remove the disk before mounting (physically, or by doing
"echo 1 >/sys/block/xxx/device/delete")? Which steps did you perform?

> 
> I'm not sure, but I think I formatted the drive on kernel 4.11.

This shouldn't matter.
> 
> I am on kernel 4.13 now.

OK, that is quite recent.
> 
> 
> I have the bad feeling that I will never get rid of that small drive unless I 
> re-format.

No, it should not be necessary.

> 
> 
> 
> On 29.09.2017 at 23:04, Goffredo Baroncelli wrote:
>> On 09/29/2017 10:00 PM, Dirk Diggler wrote:
>>> Hi,
>>>
>>> is there any chance to get my device removed?
>> I simulated a device removal in KVM with
>>
>> echo 1 >/sys/block/sdj/device/delete
>>
>> then
>>
>> btrfs dev del 6 /mnt/
>>
>>
>> And it succeeded. But I am not sure if this is the right thing to do.
>>
>> You can use "btrfs replace start -r ". But you need another device.
>>
>> Otherwise, you can shut down the filesystem, physically remove the disk, 
>> then remount with a "mount -o degraded " followed by a "btrfs dev del 
>> missing /..."
>> Before doing so, please tell us which kernel you are using.
>>
>> Until a few months ago RAID5/6 had a lot of bugs, so with an old kernel 
>> it is very difficult to remove a device successfully.
>>
>>
> 
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: Can't remove device -> I/O error

2017-09-29 Thread DocMAX

Thanks for the reply.

I don't want to replace the drive. I want to remove it.

I also tried in degraded mode. I get the exact same error.

I'm not sure, but I think I formatted the drive on kernel 4.11.

I am on kernel 4.13 now.


I have the bad feeling that I will never get rid of that small drive 
unless I re-format.




On 29.09.2017 at 23:04, Goffredo Baroncelli wrote:
> On 09/29/2017 10:00 PM, Dirk Diggler wrote:
>> Hi,
>>
>> is there any chance to get my device removed?
>
> I simulated a device removal in KVM with
>
> echo 1 >/sys/block/sdj/device/delete
>
> then
>
> btrfs dev del 6 /mnt/
>
>
> And it succeeded. But I am not sure if this is the right thing to do.
>
> You can use "btrfs replace start -r ". But you need another device.
>
> Otherwise, you can shut down the filesystem, physically remove the disk, then
> remount with a "mount -o degraded " followed by a "btrfs dev del missing /..."
> Before doing so, please tell us which kernel you are using.
>
> Until a few months ago RAID5/6 had a lot of bugs, so with an old kernel it
> is very difficult to remove a device successfully.





Re: Can't remove device -> I/O error

2017-09-29 Thread Goffredo Baroncelli
On 09/29/2017 10:00 PM, Dirk Diggler wrote:
> Hi,
> 
> is there any chance to get my device removed?

I simulated a device removal in KVM with

echo 1 >/sys/block/sdj/device/delete

then 

btrfs dev del 6 /mnt/


And it succeeded. But I am not sure if this is the right thing to do.

You can use "btrfs replace start -r ". But you need another device.

Otherwise, you can shut down the filesystem, physically remove the disk, then 
remount with a "mount -o degraded " followed by a "btrfs dev del missing 
/..."
Before doing so, please tell us which kernel you are using.

Until a few months ago RAID5/6 had a lot of bugs, so with an old kernel it 
is very difficult to remove a device successfully.

> Scrub literally takes months to complete (SATA 2/3 mix, about 1 minute
> per gigabyte) and I'm not sure if that helps.
> I guess it's the same with balance. Maybe there is a quicker way. I can do
> without some data if it's corrupted. I have a backup, but I want to
> avoid copying all the data from scratch!
> 
> Whenever i try to remove dev 6, i get:
> 
> console:
> ERROR: error removing device '/dev/sdj': Input/output error
> 
> dmesg (i/o error right after this):
> BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675115520
> csum 0x98f94189 expected csum 0x585e5744 mirror 1
> BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675119616
> csum 0x98f94189 expected csum 0xcefd2ae0 mirror 1
> BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675115520
> csum 0x98f94189 expected csum 0x585e5744 mirror 1
> BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675119616
> csum 0x98f94189 expected csum 0xcefd2ae0 mirror 1
> BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675115520
> csum 0x4023cac1 expected csum 0x585e5744 mirror 2
> BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675119616
> csum 0xea91b663 expected csum 0xcefd2ae0 mirror 2
> BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675115520
> csum 0x98f94189 expected csum 0x585e5744 mirror 1
> BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675115520
> csum 0x4023cac1 expected csum 0x585e5744 mirror 2
> 
> My setup:
> /dev/sdf, ID: 3
>    Device size:       2.73TiB
>    Device slack:        0.00B
>    Data,RAID5:      333.00GiB
>    Data,RAID5:        5.00GiB
>    Unallocated:       2.40TiB
> 
> /dev/sdg, ID: 2
>    Device size:       1.82TiB
>    Device slack:        0.00B
>    Data,RAID5:      333.00GiB
>    Data,RAID5:      955.00GiB
>    Data,RAID5:        5.57GiB
>    Metadata,RAID1:    3.00GiB
>    Unallocated:     566.44GiB
> 
> /dev/sdh, ID: 4
>    Device size:       1.82TiB
>    Device slack:        0.00B
>    Data,RAID5:      333.00GiB
>    Data,RAID5:      955.00GiB
>    Data,RAID5:        5.57GiB
>    Metadata,RAID1:    2.00GiB
>    System,RAID1:     32.00MiB
>    Unallocated:     567.41GiB
> 
> /dev/sdi, ID: 7
>    Device size:       2.73TiB
>    Device slack:        0.00B
>    Data,RAID5:      333.00GiB
>    Data,RAID5:      955.00GiB
>    Data,RAID5:        5.57GiB
>    Metadata,RAID1:   11.00GiB
>    System,RAID1:     32.00MiB
>    Unallocated:       1.45TiB
> 
> /dev/sdj, ID: 6
>    Device size:     465.76GiB
>    Device slack:        0.00B
>    Data,RAID5:      333.00GiB
>    Data,RAID5:      587.38MiB
>    Unallocated:     132.19GiB
> 
> /dev/sdk, ID: 1
>    Device size:       1.82TiB
>    Device slack:        0.00B
>    Data,RAID5:      333.00GiB
>    Data,RAID5:      955.00GiB
>    Data,RAID5:        5.57GiB
>    Metadata,RAID1:    3.00GiB
>    Unallocated:     566.44GiB
> 
> /dev/sdl, ID: 5
>    Device size:       1.82TiB
>    Device slack:        0.00B
>    Data,RAID5:      333.00GiB
>    Data,RAID5:      955.00GiB
>    Data,RAID5:        5.57GiB
>    Metadata,RAID1:    3.00GiB
>    Unallocated:     566.44GiB
> 
> Thanks,
> DocMAX
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: Can't remove device -> I/O error

2017-09-29 Thread DocMAX

Kernel:

Linux game 4.13.3-1-ARCH #1 SMP PREEMPT Thu Sep 21 20:33:16 CEST 2017 
x86_64 GNU/Linux



Can't remove device -> I/O error

2017-09-29 Thread Dirk Diggler
Hi,

is there any chance to get my device removed?
Scrub literally takes months to complete (SATA 2/3 mix, about 1 minute
per gigabyte) and I'm not sure if that helps.
I guess it's the same with balance. Maybe there is a quicker way. I can do
without some data if it's corrupted. I have a backup, but I want to
avoid copying all the data from scratch!

Whenever I try to remove dev 6, I get:

console:
ERROR: error removing device '/dev/sdj': Input/output error

dmesg (i/o error right after this):
BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675115520
csum 0x98f94189 expected csum 0x585e5744 mirror 1
BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675119616
csum 0x98f94189 expected csum 0xcefd2ae0 mirror 1
BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675115520
csum 0x98f94189 expected csum 0x585e5744 mirror 1
BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675119616
csum 0x98f94189 expected csum 0xcefd2ae0 mirror 1
BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675115520
csum 0x4023cac1 expected csum 0x585e5744 mirror 2
BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675119616
csum 0xea91b663 expected csum 0xcefd2ae0 mirror 2
BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675115520
csum 0x98f94189 expected csum 0x585e5744 mirror 1
BTRFS warning (device sdl): csum failed root -9 ino 365 off 3675115520
csum 0x4023cac1 expected csum 0x585e5744 mirror 2
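
To see at a glance how many distinct blocks are failing, the warnings can be condensed with a few lines of awk. This is only a sketch: summarize_csum_failures is a made-up name, and it assumes each warning carries "ino N off M" on the line containing "csum failed root", as in the log above. It prints inode, offset and the number of warnings seen for that pair.

```shell
#!/bin/sh
# summarize_csum_failures: read kernel log lines on stdin and print
# "INODE OFFSET COUNT" for every distinct (inode, offset) pair that
# appears in a btrfs "csum failed" warning.
summarize_csum_failures() {
    awk '
        /csum failed root/ {
            ino = ""; off = ""
            # The values follow the literal words "ino" and "off".
            for (i = 1; i < NF; i++) {
                if ($i == "ino") ino = $(i + 1)
                if ($i == "off") off = $(i + 1)
            }
            if (ino != "" && off != "") count[ino " " off]++
        }
        END {
            for (key in count) print key, count[key]
        }
    ' | sort -k1,1n -k2,2n
}

# Usage: dmesg | summarize_csum_failures
```

The eight warnings above cover just two offsets in inode 365, 3675115520 and 3675119616, which are 4096 bytes apart, with five and three warnings respectively.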

My setup:
/dev/sdf, ID: 3
   Device size:       2.73TiB
   Device slack:        0.00B
   Data,RAID5:      333.00GiB
   Data,RAID5:        5.00GiB
   Unallocated:       2.40TiB

/dev/sdg, ID: 2
   Device size:       1.82TiB
   Device slack:        0.00B
   Data,RAID5:      333.00GiB
   Data,RAID5:      955.00GiB
   Data,RAID5:        5.57GiB
   Metadata,RAID1:    3.00GiB
   Unallocated:     566.44GiB

/dev/sdh, ID: 4
   Device size:       1.82TiB
   Device slack:        0.00B
   Data,RAID5:      333.00GiB
   Data,RAID5:      955.00GiB
   Data,RAID5:        5.57GiB
   Metadata,RAID1:    2.00GiB
   System,RAID1:     32.00MiB
   Unallocated:     567.41GiB

/dev/sdi, ID: 7
   Device size:       2.73TiB
   Device slack:        0.00B
   Data,RAID5:      333.00GiB
   Data,RAID5:      955.00GiB
   Data,RAID5:        5.57GiB
   Metadata,RAID1:   11.00GiB
   System,RAID1:     32.00MiB
   Unallocated:       1.45TiB

/dev/sdj, ID: 6
   Device size:     465.76GiB
   Device slack:        0.00B
   Data,RAID5:      333.00GiB
   Data,RAID5:      587.38MiB
   Unallocated:     132.19GiB

/dev/sdk, ID: 1
   Device size:       1.82TiB
   Device slack:        0.00B
   Data,RAID5:      333.00GiB
   Data,RAID5:      955.00GiB
   Data,RAID5:        5.57GiB
   Metadata,RAID1:    3.00GiB
   Unallocated:     566.44GiB

/dev/sdl, ID: 5
   Device size:       1.82TiB
   Device slack:        0.00B
   Data,RAID5:      333.00GiB
   Data,RAID5:      955.00GiB
   Data,RAID5:        5.57GiB
   Metadata,RAID1:    3.00GiB
   Unallocated:     566.44GiB

Thanks,
DocMAX