Cannot 'mount -o degraded /dev/replacement' after a replace

2019-02-09 Thread Jakob Schöttl

Hi,

I've set up a RAID1 with two disks (disk1 and disk2) and I'm testing the 
btrfs replace command.


After replacing disk2 with disk3, I can only mount
(a) disk1 or disk3 (if both disks are plugged) and
(b) the original disk1 (degraded, if disk3 is unplugged).

I cannot mount the replacement disk3 if disk1 is unplugged.

> mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop3, missing codepage or helper program, or other error.


What I expect is that both disk1 and disk3 are fully valid and working 
after a replace.


Steps to reproduce:

  dd if=/dev/zero of=/vdisk1 bs=1024 count=30
  losetup /dev/loop1 /vdisk1
  dd if=/dev/zero of=/vdisk2 bs=1024 count=30
  losetup /dev/loop2 /vdisk2
  dd if=/dev/zero of=/vdisk3 bs=1024 count=30
  # losetup /dev/loop3 /vdisk3    # don't plug this device yet
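
Note: as written, these dd commands create 30 KiB backing files, far below the mkfs.btrfs minimum; the device sizes in the output later in this thread (~293 MiB) suggest the real test used much larger images. A sketch of an equivalent setup with sparse files (300 MiB is an assumption, and /tmp paths are used here for illustration):

```shell
# Create three sparse ~300 MiB image files (size is an assumption;
# anything above mkfs.btrfs's minimum size works).
for i in 1 2 3; do
    truncate -s 300M "/tmp/vdisk$i"
done

# Attach the first two as loop devices (needs root);
# /tmp/vdisk3 stays unattached for now, as in the steps above:
#   losetup /dev/loop1 /tmp/vdisk1
#   losetup /dev/loop2 /tmp/vdisk2
```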

Create RAID1 file system:

  mkfs.btrfs -L datavol -m raid1 -d raid1 /dev/loop1 /dev/loop2

Unplug device 2 to simulate a defect:

  losetup -d /dev/loop2

Plug device 3:

  losetup /dev/loop3 /vdisk3

Replace device 2 with device 3:

  mount -o degraded /dev/loop1 /mnt
  btrfs filesystem show   # to get devid of device 2
  btrfs replace start -Br 2 /dev/loop3 /mnt
  btrfs replace status /mnt   # check success
  umount /mnt
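
To avoid reading the devid off the 'btrfs filesystem show' output by eye, it can be extracted with a little awk. This is a sketch only: the show output is meant for humans and may change between btrfs-progs versions, and devid_of is a hypothetical helper name:

```shell
# Print the devid that 'btrfs filesystem show' reports for a given
# device path (show output read from stdin). Relies on the
# "devid N size ... path DEV" line format seen in this thread.
devid_of() {
    awk -v dev="$1" '$1 == "devid" && $NF == dev { print $2 }'
}

# Usage (needs root):
#   btrfs filesystem show | devid_of /dev/loop2
```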

Unplug the original device 1 to see if device 3 has really replaced 
device 2:


  losetup -d /dev/loop1
  mount -o degraded /dev/loop3 /mnt

The mount fails with this error:

> mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop3, missing codepage or helper program, or other error.


In this situation, btrfs device scan does not change anything, and btrfs
filesystem show prints:

> warning, device 1 is missing
> warning, device 1 is missing
> warning, device 1 is missing
> warning, device 1 is missing
> bad tree block 198180864, bytenr mismatch, want=198180864, have=0
> ERROR: cannot read chunk root
> Label: 'datavol'  uuid: 640e45d3-e741-4a78-a24e-2d8a41c6b8c3
>     Total devices 2 FS bytes used 128.00KiB
>     devid    2 size 292.97MiB used 104.00MiB path /dev/loop3
>     *** Some devices missing

Is this a known problem? Can you reproduce it? Am I doing something wrong?

Regards, Jakob



Re: Cannot 'mount -o degraded /dev/replacement' after a replace

2019-02-09 Thread Jakob Schöttl

Thanks Qu,

On 09.02.19 at 13:16, Qu Wenruo wrote:

On 2019/2/9 at 6:36 PM, Jakob Schöttl wrote:

Hi,

I've set up a RAID1 with two disks (disk1 and disk2) and I'm testing the
btrfs replace command.

After replacing disk2 with disk3, I can only mount
(a) disk1 or disk3 (if both disks are plugged) and
(b) the original disk1 (degraded, if disk3 is unplugged).

I cannot mount the replacement disk3 if disk1 is unplugged.

Sounds like there is a SINGLE-profile chunk on disk1, which caused the problem.


mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop3, missing codepage or helper program, or other error.

dmesg please.

[60456.856883] BTRFS: device label datavol devid 1 transid 5 /dev/loop1
[60456.940785] BTRFS: device label datavol devid 2 transid 5 /dev/loop2
[60525.211389] BTRFS info (device loop1): allowing degraded mounts
[60525.211395] BTRFS info (device loop1): disk space caching is enabled
[60525.211398] BTRFS info (device loop1): has skinny extents
[60525.211401] BTRFS info (device loop1): flagging fs with big metadata feature
[60525.213854] BTRFS warning (device loop1): devid 2 uuid 0b4e0b31-e2b1-40a0-8360-09978f58a2e4 is missing

[60525.214639] BTRFS info (device loop1): checking UUID tree
[60525.386695] BTRFS info (device loop1): dev_replace from <missing disk> (devid 2) to /dev/loop3 started
[60525.394403] BTRFS info (device loop1): dev_replace from <missing disk> (devid 2) to /dev/loop3 finished

[60533.721841] BTRFS info (device loop3): allowing degraded mounts
[60533.721846] BTRFS info (device loop3): disk space caching is enabled
[60533.721850] BTRFS info (device loop3): has skinny extents
[60533.723703] BTRFS error (device loop3): failed to read chunk root
[60533.773553] BTRFS error (device loop3): open_ctree failed


And btrfs-progs version please.

$ pacman -Q btrfs-progs
btrfs-progs 4.20.1-2
$ uname -a
Linux jathink 4.20.7-arch1-1-ARCH #1 SMP PREEMPT Wed Feb 6 18:42:40 UTC 2019 x86_64 GNU/Linux



Maybe mkfs is too old and left SINGLE profile chunks on the original fs.

And you could verify the chunk mapping by executing 'btrfs ins dump-tree
-t chunk ' and pasting the output.


When only /dev/loop3 is plugged:

# btrfs inspect-internal dump-tree -t chunk /dev/loop3
btrfs-progs v4.20.1
warning, device 1 is missing
warning, device 1 is missing
warning, device 1 is missing
warning, device 1 is missing
bad tree block 198180864, bytenr mismatch, want=198180864, have=0
ERROR: cannot read chunk root
ERROR: unable to open /dev/loop3

When only /dev/loop1 is plugged:

# btrfs inspect-internal dump-tree -t chunk /dev/loop1
btrfs-progs v4.20.1
warning, device 2 is missing
chunk tree
leaf 198180864 items 10 free space 15005 generation 8 owner CHUNK_TREE
leaf 198180864 flags 0x1(WRITTEN) backref revision 1
fs uuid 005a8d59-a561-4371-869e-b0ccc4a4862b
chunk uuid b3e609f1-a7fe-4add-bc51-6231f0bbf320
    item 0 key (DEV_ITEMS DEV_ITEM 1) itemoff 16185 itemsize 98
        devid 1 total_bytes 30720 bytes_used 306053120
        io_align 4096 io_width 4096 sector_size 4096 type 0
        generation 0 start_offset 0 dev_group 0
        seek_speed 0 bandwidth 0
        uuid 043443c7-ac91-4085-a5e4-983b59dd0803
        fsid 005a8d59-a561-4371-869e-b0ccc4a4862b
    item 1 key (DEV_ITEMS DEV_ITEM 2) itemoff 16087 itemsize 98
        devid 2 total_bytes 30720 bytes_used 109051904
        io_align 4096 io_width 4096 sector_size 4096 type 0
        generation 0 start_offset 0 dev_group 0
        seek_speed 0 bandwidth 0
        uuid 0b4e0b31-e2b1-40a0-8360-09978f58a2e4
        fsid 005a8d59-a561-4371-869e-b0ccc4a4862b
    item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
        length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
        io_align 65536 io_width 65536 sector_size 4096
        num_stripes 2 sub_stripes 0
            stripe 0 devid 2 offset 1048576
            dev_uuid 0b4e0b31-e2b1-40a0-8360-09978f58a2e4
            stripe 1 devid 1 offset 22020096
            dev_uuid 043443c7-ac91-4085-a5e4-983b59dd0803
    item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
        length 33554432 owner 2 stripe_len 65536 type METADATA|RAID1
        io_align 65536 io_width 65536 sector_size 4096
        num_stripes 2 sub_stripes 0
            stripe 0 devid 2 offset 9437184
            dev_uuid 0b4e0b31-e2b1-40a0-8360-09978f58a2e4
            stripe 1 devid 1 offset 30408704
            dev_uuid 043443c7-ac91-4085-a5e4-983b59dd0803
    item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 63963136) itemoff 15751 itemsize 112
        length 67108864 owner 2 stripe_len 65536 type DATA|RAID1
        io_align 65536 io_width 65536 sector_size 4096
        num_stripes 2 sub_stripes 0
            stripe 0 devid 2 offset 42991616
            dev_uuid 0b4e0b31-e2b1-40a0-8360-09978f58a2e4
            stripe 1 devid 1 offset 63963136
            dev_uuid 043443c7-ac91-4085-a5e4-983b59dd0803
    item 5 key (FIRST_CHUNK_TREE CHUNK_ITEM 131072000) 
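
The chunk types in a dump like the one above can be summarized with a grep, which makes a stray SINGLE chunk (Qu's hypothesis) easy to spot. A sketch against the dump format shown here (chunk_profiles is a hypothetical helper name):

```shell
# Summarize chunk profiles in 'btrfs inspect-internal dump-tree -t chunk'
# output (read from stdin). A leftover chunk without RAID1 here would
# explain the failed degraded mount.
chunk_profiles() {
    grep -o 'type [A-Z][A-Z1|]*' | sort | uniq -c
}

# Usage (needs root):
#   btrfs inspect-internal dump-tree -t chunk /dev/loop1 | chunk_profiles
```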

Re: Cannot 'mount -o degraded /dev/replacement' after a replace

2019-02-09 Thread Jakob Schöttl



On 09.02.19 at 16:32, Andrei Borzenkov wrote:

Running

btrfs balance start -f -ssoft,convert=raid1 -dsoft,convert=raid1
-msoft,convert=raid1 /mnt

fixes it. I am not sure whether btrfs is expected to do it automatically
as of now.


This fix seems to work for me, too. Output was:
> Done, had to relocate 5 out of 8 chunks

Tested without any subvolumes or data.
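
One way to confirm the conversion took: after the balance, 'btrfs fi df /mnt' should list only RAID1 block groups. A sketch that flags any other profile, parsing the human-oriented output (GlobalReserve is always reported as 'single' and is skipped; non_raid1_profiles is a hypothetical helper name):

```shell
# Flag block groups that are not RAID1 in 'btrfs fi df' output
# (read from stdin). GlobalReserve is always 'single' and is excluded.
non_raid1_profiles() {
    awk -F'[,:] ' '$1 != "GlobalReserve" && $2 != "RAID1" { print $1 ": " $2 }'
}

# Usage (needs root):
#   btrfs fi df /mnt | non_raid1_profiles
```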



Only one subvolume can be mounted after replace/balance

2021-01-23 Thread Jakob Schöttl

Hi,

In short:
When mounting a second subvolume from a pool, I get this error:
"mount: /mnt: wrong fs type, bad option, bad superblock on /dev/sda, missing code page or helper program, or other."

dmesg | grep BTRFS only shows this error:
info (device sda): disk space caching is enabled
error (device sda): Remounting read-write after error is not allowed

What happened:

In my RAID1 pool with two disks, I successfully replaced one disk with

btrfs replace start 2 /dev/sdx /mnt

After that, I mounted the pool and did

btrfs fi show /mnt

which showed WARNINGs about
"filesystems with multiple block group profiles detected"
(don't remember exactly)

I thought it would be a good idea to do

btrfs balance start /mnt

which finished without errors.

Now, I can only mount one (sub)volume of the pool at a time. Others can 
only be mounted read-only. See error messages at top of this mail.


Do you have any idea what happened or how to fix it?

I already tried btrfs rescue zero-log and btrfs rescue super-recover, 
which completed successfully but didn't help.


Regards, Jakob


Cannot resize filesystem: not enough free space

2021-01-24 Thread Jakob Schöttl



Help please, increasing the filesystem size doesn't work.

When mounting my btrfs filesystem, I had errors saying, "no space left
on device". Now I managed to mount the filesystem with -o skip_balance but:

# btrfs fi df /mnt
Data, RAID1: total=147.04GiB, used=147.02GiB
System, RAID1: total=8.00MiB, used=48.00KiB
Metadata, RAID1: total=1.00GiB, used=458.84MiB
GlobalReserve, single: total=181.53MiB, used=0.00B

It is full and resize doesn't work, although both block devices sda and
sdb have 250 GB or more nominal capacity (I don't have partitions;
btrfs is directly on sda and sdb):

# fdisk -l /dev/sd{a,b}*
Disk /dev/sda: 232.89 GiB, 250059350016 bytes, 488397168 sectors
[...]
Disk /dev/sdb: 465.76 GiB, 500107862016 bytes, 976773168 sectors
[...]

I tried:

# btrfs fi resize 230G /mnt
runs without errors but has no effect

# btrfs fi resize max /mnt
runs without errors but has no effect

# btrfs fi resize +1G /mnt
ERROR: unable to resize '/mnt': no enough free space

Any ideas? Thank you!


Re: Cannot resize filesystem: not enough free space

2021-01-24 Thread Jakob Schöttl



Hugo Mills  writes:


On Sun, Jan 24, 2021 at 07:23:21PM +0100, Jakob Schöttl wrote:


Help please, increasing the filesystem size doesn't work.

When mounting my btrfs filesystem, I had errors saying, "no space left
on device". Now I managed to mount the filesystem with -o skip_balance but:


# btrfs fi df /mnt
Data, RAID1: total=147.04GiB, used=147.02GiB
System, RAID1: total=8.00MiB, used=48.00KiB
Metadata, RAID1: total=1.00GiB, used=458.84MiB
GlobalReserve, single: total=181.53MiB, used=0.00B


   Can you show the output of "sudo btrfs fi show" as well?

   Hugo.
 


Thanks, Hugo, for the quick response.

# btrfs fi show /mnt/
Label: 'data'  uuid: fc991007-6ef3-4c2c-9ca7-b4d637fccafb
   Total devices 2 FS bytes used 148.43GiB
   devid    1 size 232.89GiB used 149.05GiB path /dev/sda
   devid    2 size 149.05GiB used 149.05GiB path /dev/sdb

Oh, now I see! Resize only worked on one device, sda!

# btrfs fi resize 1:max /mnt/
# btrfs fi resize 2:max /mnt/
# btrfs fi show /mnt/
Label: 'data'  uuid: fc991007-6ef3-4c2c-9ca7-b4d637fccafb
   Total devices 2 FS bytes used 150.05GiB
   devid    1 size 232.89GiB used 151.05GiB path /dev/sda
   devid    2 size 465.76GiB used 151.05GiB path /dev/sdb

Now it works. Thank you!
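
For pools with more devices, the per-device resize can be scripted by pulling every devid out of 'btrfs fi show'. A sketch (resize_all_max is a hypothetical helper; it parses the human-readable show output, so treat it as fragile):

```shell
# Resize every device of a mounted btrfs filesystem to its maximum.
# Per-device resize uses the '<devid>:max' syntax; a plain
# 'btrfs fi resize max /mnt' only touches devid 1.
resize_all_max() {
    mnt="$1"
    btrfs filesystem show "$mnt" |
        awk '$1 == "devid" { print $2 }' |
        while read -r id; do
            btrfs filesystem resize "$id:max" "$mnt"
        done
}

# Usage (needs root):
#   resize_all_max /mnt
```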


--
Jakob Schöttl
Phone: 0176 45762916
E-mail: jscho...@gmail.com
PGP-key: 0x25055C7F


Re: Only one subvolume can be mounted after replace/balance

2021-01-27 Thread Jakob Schöttl

Thank you Chris, it's resolved now, see below.

On 25.01.21 at 23:47, Chris Murphy wrote:

On Sat, Jan 23, 2021 at 7:50 AM Jakob Schöttl  wrote:

Hi,

In short:
When mounting a second subvolume from a pool, I get this error:
"mount: /mnt: wrong fs type, bad option, bad superblock on /dev/sda,
missing code page or helper program, or other."
dmesg | grep BTRFS only shows this error:
info (device sda): disk space caching is enabled
error (device sda): Remounting read-write after error is not allowed

It went read-only before this because it's confused. You need to
unmount it before it can be mounted rw. In some cases a reboot is
needed.

Oh, I didn't notice that the pool was already mounted (via fstab).
The filesystem was out of space and I had to resize both disks 
separately, and I had to mount with -o skip_balance for that. Now it 
works again.



What happened:

In my RAID1 pool with two disks, I successfully replaced one disk with

btrfs replace start 2 /dev/sdx /mnt

After that, I mounted the pool and did

I don't understand this sequence. In order to do a replace, the file
system is already mounted.

That was what I did before my actual problem occurred. But it's 
resolved now.



btrfs fi show /mnt

which showed WARNINGs about
"filesystems with multiple block group profiles detected"
(don't remember exactly)

I thought it would be a good idea to do

btrfs balance start /mnt

which finished without errors.

Balance alone does not convert block groups to a new profile. You have
to explicitly select a conversion filter, e.g.

btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /mnt

I didn't want to convert to a new profile; I thought btrfs replace 
automatically uses the same profile as the pool?
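
Replace does keep the profile of what it copies; the mixed-profile warning typically comes from a few chunks written in a different profile (e.g. during a degraded mount), which a plain balance rewrites without converting. A sketch that mirrors the kernel's "multiple block group profiles" check against 'btrfs fi df' output (mixed_profiles is a hypothetical helper name):

```shell
# Print any chunk type (Data/Metadata/System) that appears with more
# than one profile in 'btrfs fi df' output (read from stdin).
# GlobalReserve is always 'single' and is excluded.
mixed_profiles() {
    awk -F'[,:] ' '$1 != "GlobalReserve" { seen[$1]++ }
                   END { for (t in seen) if (seen[t] > 1) print t }'
}

# Usage (needs root):
#   btrfs fi df /mnt | mixed_profiles
```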


Regards, Jakob