Cannot 'mount -o degraded /dev/replacement' after a replace
Hi,

I've set up a RAID1 with two disks (disk1 and disk2) and I'm testing the
btrfs replace command. After replacing disk2 with disk3, I can only mount
(a) disk1 or disk3 (if both disks are plugged in) and (b) the original
disk1 (degraded, if disk3 is unplugged). I cannot mount the replacement
disk3 if disk1 is unplugged:

> mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop3,
> missing codepage or helper program, or other error.

What I expect is that both disk1 and disk3 are fully valid and working
after a replace.

Steps to reproduce:

  dd if=/dev/zero of=/vdisk1 bs=1024 count=30
  losetup /dev/loop1 /vdisk1
  dd if=/dev/zero of=/vdisk2 bs=1024 count=30
  losetup /dev/loop2 /vdisk2
  dd if=/dev/zero of=/vdisk3 bs=1024 count=30
  # losetup /dev/loop3 /vdisk3   # don't plug this device yet

Create the RAID1 file system:

  mkfs.btrfs -L datavol -m raid1 -d raid1 /dev/loop1 /dev/loop2

Unplug device 2 to simulate a defect:

  losetup -d /dev/loop2

Plug device 3:

  losetup /dev/loop3 /vdisk3

Replace device 2 with device 3:

  mount -o degraded /dev/loop1 /mnt
  btrfs filesystem show        # to get the devid of device 2
  btrfs replace start -Br 2 /dev/loop3 /mnt
  btrfs replace status /mnt    # check success
  umount /mnt

Unplug the original device 1 to see if device 3 has really replaced device 2:

  losetup -d /dev/loop1
  mount -o degraded /dev/loop3 /mnt

The mount fails with this error:

> mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop3,
> missing codepage or helper program, or other error.
In this situation, btrfs device scan does not change anything, and
btrfs filesystem show reports:

> warning, device 1 is missing
> warning, device 1 is missing
> warning, device 1 is missing
> warning, device 1 is missing
> bad tree block 198180864, bytenr mismatch, want=198180864, have=0
> ERROR: cannot read chunk root
> Label: 'datavol'  uuid: 640e45d3-e741-4a78-a24e-2d8a41c6b8c3
>         Total devices 2 FS bytes used 128.00KiB
>         devid    2 size 292.97MiB used 104.00MiB path /dev/loop3
>         *** Some devices missing

Is this a known problem? Can you reproduce it? Am I doing something wrong?

Regards, Jakob
Re: Cannot 'mount -o degraded /dev/replacement' after a replace
Thanks Qu,

On 09.02.19 at 13:16, Qu Wenruo wrote:
> On 2019/2/9 6:36 PM, Jakob Schöttl wrote:
>> Hi, I've set up a RAID1 with two disks (disk1 and disk2) and I'm testing
>> the btrfs replace command. After replacing disk2 with disk3, I can only
>> mount (a) disk1 or disk3 (if both disks are plugged in) and (b) the
>> original disk1 (degraded, if disk3 is unplugged). I cannot mount the
>> replacement disk3 if disk1 is unplugged.
>
> Sounds like there is one single chunk on disk1, which caused the problem.
>
>> mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop3,
>> missing codepage or helper program, or other error.
>
> dmesg please.

[60456.856883] BTRFS: device label datavol devid 1 transid 5 /dev/loop1
[60456.940785] BTRFS: device label datavol devid 2 transid 5 /dev/loop2
[60525.211389] BTRFS info (device loop1): allowing degraded mounts
[60525.211395] BTRFS info (device loop1): disk space caching is enabled
[60525.211398] BTRFS info (device loop1): has skinny extents
[60525.211401] BTRFS info (device loop1): flagging fs with big metadata feature
[60525.213854] BTRFS warning (device loop1): devid 2 uuid 0b4e0b31-e2b1-40a0-8360-09978f58a2e4 is missing
[60525.214639] BTRFS info (device loop1): checking UUID tree
[60525.386695] BTRFS info (device loop1): dev_replace from <missing disk> (devid 2) to /dev/loop3 started
[60525.394403] BTRFS info (device loop1): dev_replace from <missing disk> (devid 2) to /dev/loop3 finished
[60533.721841] BTRFS info (device loop3): allowing degraded mounts
[60533.721846] BTRFS info (device loop3): disk space caching is enabled
[60533.721850] BTRFS info (device loop3): has skinny extents
[60533.723703] BTRFS error (device loop3): failed to read chunk root
[60533.773553] BTRFS error (device loop3): open_ctree failed

> And btrfs-progs version please.

$ pacman -Q btrfs-progs
btrfs-progs 4.20.1-2
$ uname -a
Linux jathink 4.20.7-arch1-1-ARCH #1 SMP PREEMPT Wed Feb 6 18:42:40 UTC 2019 x86_64 GNU/Linux

> Maybe mkfs is too old to leave SINGLE profile chunks on the original fs.
> And you could verify the chunk mapping by executing
> 'btrfs ins dump-tree -t chunk ' and paste the output.

When only /dev/loop3 is plugged in:

# btrfs inspect-internal dump-tree -t chunk /dev/loop3
btrfs-progs v4.20.1
warning, device 1 is missing
warning, device 1 is missing
warning, device 1 is missing
warning, device 1 is missing
bad tree block 198180864, bytenr mismatch, want=198180864, have=0
ERROR: cannot read chunk root
ERROR: unable to open /dev/loop3

When only /dev/loop1 is plugged in:

# btrfs inspect-internal dump-tree -t chunk /dev/loop1
btrfs-progs v4.20.1
warning, device 2 is missing
chunk tree
leaf 198180864 items 10 free space 15005 generation 8 owner CHUNK_TREE
leaf 198180864 flags 0x1(WRITTEN) backref revision 1
fs uuid 005a8d59-a561-4371-869e-b0ccc4a4862b
chunk uuid b3e609f1-a7fe-4add-bc51-6231f0bbf320
        item 0 key (DEV_ITEMS DEV_ITEM 1) itemoff 16185 itemsize 98
                devid 1 total_bytes 30720 bytes_used 306053120
                io_align 4096 io_width 4096 sector_size 4096 type 0
                generation 0 start_offset 0 dev_group 0
                seek_speed 0 bandwidth 0
                uuid 043443c7-ac91-4085-a5e4-983b59dd0803
                fsid 005a8d59-a561-4371-869e-b0ccc4a4862b
        item 1 key (DEV_ITEMS DEV_ITEM 2) itemoff 16087 itemsize 98
                devid 2 total_bytes 30720 bytes_used 109051904
                io_align 4096 io_width 4096 sector_size 4096 type 0
                generation 0 start_offset 0 dev_group 0
                seek_speed 0 bandwidth 0
                uuid 0b4e0b31-e2b1-40a0-8360-09978f58a2e4
                fsid 005a8d59-a561-4371-869e-b0ccc4a4862b
        item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096) itemoff 15975 itemsize 112
                length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1
                io_align 65536 io_width 65536 sector_size 4096
                num_stripes 2 sub_stripes 0
                        stripe 0 devid 2 offset 1048576
                        dev_uuid 0b4e0b31-e2b1-40a0-8360-09978f58a2e4
                        stripe 1 devid 1 offset 22020096
                        dev_uuid 043443c7-ac91-4085-a5e4-983b59dd0803
        item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 30408704) itemoff 15863 itemsize 112
                length 33554432 owner 2 stripe_len 65536 type METADATA|RAID1
                io_align 65536 io_width 65536 sector_size 4096
                num_stripes 2 sub_stripes 0
                        stripe 0 devid 2 offset 9437184
                        dev_uuid 0b4e0b31-e2b1-40a0-8360-09978f58a2e4
                        stripe 1 devid 1 offset 30408704
                        dev_uuid 043443c7-ac91-4085-a5e4-983b59dd0803
        item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 63963136) itemoff 15751 itemsize 112
                length 67108864 owner 2 stripe_len 65536 type DATA|RAID1
                io_align 65536 io_width 65536 sector_size 4096
                num_stripes 2 sub_stripes 0
                        stripe 0 devid 2 offset 42991616
                        dev_uuid 0b4e0b31-e2b1-40a0-8360-09978f58a2e4
                        stripe 1 devid 1 offset 63963136
                        dev_uuid 043443c7-ac91-4085-a5e4-983b59dd0803
        item 5 key (FIRST_CHUNK_TREE CHUNK_ITEM 131072000)
[...]
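[Editor's note: Qu's hypothesis above is that a stray SINGLE-profile chunk was left on the original device. Rather than reading every item in the dump, the chunk profiles can be summarized with a small helper. This is a sketch, not part of the thread; `list_chunk_profiles` is a hypothetical name, and it assumes the `dump-tree -t chunk` output format shown above.]

```shell
# Hypothetical helper: summarize chunk profiles in
# 'btrfs inspect-internal dump-tree -t chunk' output read from stdin,
# so a leftover SINGLE chunk stands out among the RAID1 ones.
list_chunk_profiles() {
    # CHUNK_ITEMs carry a "type PROFILE" field, e.g. "type DATA|RAID1";
    # device items have a numeric "type 0", which this pattern skips.
    grep -o 'type [A-Z][A-Z|]*' | sort | uniq -c
}

# Live usage would be:
#   btrfs inspect-internal dump-tree -t chunk /dev/loop1 | list_chunk_profiles
# Demonstrated here on two lines copied from the dump above:
printf '%s\n' \
    'length 8388608 owner 2 stripe_len 65536 type SYSTEM|RAID1' \
    'length 67108864 owner 2 stripe_len 65536 type DATA|RAID1' \
    | list_chunk_profiles
```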
Re: Cannot 'mount -o degraded /dev/replacement' after a replace
On 09.02.19 at 16:32, Andrei Borzenkov wrote:
> Running
>
>   btrfs balance start -f -ssoft,convert=raid1 -dsoft,convert=raid1 -msoft,convert=raid1 /mnt
>
> fixes it. I am not sure whether btrfs is expected to do it automatically
> as of now.

This fix seems to work for me, too. Output was:

> Done, had to relocate 5 out of 8 chunks

Tested without any subvolumes or data.
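[Editor's note: after the convert balance suggested above, `btrfs fi df` should list only RAID1 block groups. The sketch below is not from the thread; `non_raid1_groups` is a hypothetical helper, and it assumes the standard `btrfs fi df` output format. GlobalReserve is always reported as "single" and is not a real on-disk chunk, so it is filtered out.]

```shell
# Hypothetical check: print any block-group line that is neither RAID1
# nor the virtual GlobalReserve entry. Empty output means the conversion
# to RAID1 is complete.
non_raid1_groups() {
    grep -v '^GlobalReserve' | grep -v 'RAID1'
}

# Live usage would be:
#   btrfs fi df /mnt | non_raid1_groups
# Demonstrated on sample 'btrfs fi df' output; this prints only the
# leftover "System, single" line:
printf '%s\n' \
    'Data, RAID1: total=64.00MiB, used=128.00KiB' \
    'System, single: total=32.00MiB, used=16.00KiB' \
    'GlobalReserve, single: total=16.00MiB, used=0.00B' \
    | non_raid1_groups
```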
Only one subvolume can be mounted after replace/balance
Hi,

In short: When mounting a second subvolume from a pool, I get this error:

  mount: /mnt: wrong fs type, bad option, bad superblock on /dev/sda,
  missing codepage or helper program, or other error.

dmesg | grep BTRFS only shows this:

  info (device sda): disk space caching is enabled
  error (device sda): Remounting read-write after error is not allowed

What happened: In my RAID1 pool with two disks, I successfully replaced
one disk with

  btrfs replace start 2 /dev/sdx

After that, I mounted the pool and did

  btrfs fi show /mnt

which showed WARNINGs about "filesystems with multiple block group
profiles detected" (don't remember exactly). I thought it would be a good
idea to do

  btrfs balance start /mnt

which finished without errors.

Now I can only mount one (sub)volume of the pool at a time. Others can
only be mounted read-only. See the error messages at the top of this mail.

Do you have any idea what happened or how to fix it? I already tried
rescue zero-log and super-recover, which ran successfully but didn't help.

Regards, Jakob
Cannot resize filesystem: not enough free space
Help please, increasing the filesystem size doesn't work.

When mounting my btrfs filesystem, I had errors saying "no space left on
device". Now I managed to mount the filesystem with -o skip_balance, but:

# btrfs fi df /mnt
Data, RAID1: total=147.04GiB, used=147.02GiB
System, RAID1: total=8.00MiB, used=48.00KiB
Metadata, RAID1: total=1.00GiB, used=458.84MiB
GlobalReserve, single: total=181.53MiB, used=0.00B

It is full, and resize doesn't work, although both block devices sda and
sdb have 250 GB and more nominal capacity (I don't have partitions, btrfs
is directly on sda and sdb):

# fdisk -l /dev/sd{a,b}*
Disk /dev/sda: 232.89 GiB, 250059350016 bytes, 488397168 sectors
[...]
Disk /dev/sdb: 465.76 GiB, 500107862016 bytes, 976773168 sectors
[...]

I tried:

# btrfs fi resize 230G /mnt
  (runs without errors but has no effect)
# btrfs fi resize max /mnt
  (runs without errors but has no effect)
# btrfs fi resize +1G /mnt
  ERROR: unable to resize '/mnt': no enough free space

Any ideas? Thank you!
Re: Cannot resize filesystem: not enough free space
Hugo Mills writes:
> On Sun, Jan 24, 2021 at 07:23:21PM +0100, Jakob Schöttl wrote:
>> Help please, increasing the filesystem size doesn't work. When mounting
>> my btrfs filesystem, I had errors saying "no space left on device". Now
>> I managed to mount the filesystem with -o skip_balance, but:
>>
>> # btrfs fi df /mnt
>> Data, RAID1: total=147.04GiB, used=147.02GiB
>> System, RAID1: total=8.00MiB, used=48.00KiB
>> Metadata, RAID1: total=1.00GiB, used=458.84MiB
>> GlobalReserve, single: total=181.53MiB, used=0.00B
>
> Can you show the output of "sudo btrfs fi show" as well?
>
> Hugo.

Thanks, Hugo, for the quick response.

# btrfs fi show /mnt/
Label: 'data'  uuid: fc991007-6ef3-4c2c-9ca7-b4d637fccafb
        Total devices 2 FS bytes used 148.43GiB
        devid    1 size 232.89GiB used 149.05GiB path /dev/sda
        devid    2 size 149.05GiB used 149.05GiB path /dev/sdb

Oh, now I see! Resize only worked for one device, sda!

# btrfs fi resize 1:max /mnt/
# btrfs fi resize 2:max /mnt/
# btrfs fi show /mnt/
Label: 'data'  uuid: fc991007-6ef3-4c2c-9ca7-b4d637fccafb
        Total devices 2 FS bytes used 150.05GiB
        devid    1 size 232.89GiB used 151.05GiB path /dev/sda
        devid    2 size 465.76GiB used 151.05GiB path /dev/sdb

Now it works. Thank you!

--
Jakob Schöttl
Phone: 0176 45762916
E-mail: jscho...@gmail.com
PGP-key: 0x25055C7F
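[Editor's note: the exchange above shows that `btrfs fi resize` acts on devid 1 unless a `devid:` prefix is given, so every device must be resized separately. The sketch below is not from the thread; `devids` is a hypothetical helper, and it assumes the `btrfs fi show` output format quoted above.]

```shell
# Hypothetical helper: extract the devids from 'btrfs fi show' output
# read from stdin, so each device can be resized in turn.
devids() {
    awk '$1 == "devid" { print $2 }'
}

# Live usage (requires root and a mounted filesystem) would be:
#   for id in $(btrfs fi show /mnt | devids); do
#       btrfs fi resize "$id":max /mnt
#   done
# Demonstrated on sample 'btrfs fi show' lines; prints 1 then 2:
printf '%s\n' \
    '        devid    1 size 232.89GiB used 149.05GiB path /dev/sda' \
    '        devid    2 size 149.05GiB used 149.05GiB path /dev/sdb' \
    | devids
```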
Re: Only one subvolume can be mounted after replace/balance
Thank you Chris, it's resolved now, see below.

On 25.01.21 at 23:47, Chris Murphy wrote:
> On Sat, Jan 23, 2021 at 7:50 AM Jakob Schöttl wrote:
>> Hi, In short: When mounting a second subvolume from a pool, I get this
>> error: "mount: /mnt: wrong fs type, bad option, bad superblock on
>> /dev/sda, missing codepage or helper program, or other error."
>>
>> dmesg | grep BTRFS only shows this:
>>   info (device sda): disk space caching is enabled
>>   error (device sda): Remounting read-write after error is not allowed
>
> It went read-only before this because it's confused. You need to unmount
> it before it can be mounted rw. In some cases a reboot is needed.

Oh, I didn't notice that the pool was already mounted (via fstab). The
filesystem was out of space and I had to resize both disks separately.
And I had to mount with -o skip_balance for that. Now it works again.

>> What happened: In my RAID1 pool with two disks, I successfully replaced
>> one disk with
>>   btrfs replace start 2 /dev/sdx
>> After that, I mounted the pool and did
>
> I don't understand this sequence. In order to do a replace, the file
> system is already mounted.

That was what I did before my actual problem occurred. But it's resolved
now.

>>   btrfs fi show /mnt
>> which showed WARNINGs about "filesystems with multiple block group
>> profiles detected" (don't remember exactly). I thought it would be a
>> good idea to do
>>   btrfs balance start /mnt
>> which finished without errors.
>
> Balance alone does not convert block groups to a new profile. You have
> to explicitly select a conversion filter, e.g.
>
>   btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /mnt

I didn't want to convert to a new profile. I thought btrfs replace
automatically uses the same profile as the pool?

Regards, Jakob