On 2021/2/4 5:54 AM, jos...@mailmag.net wrote:
Good Evening.

I have a large BTRFS array (14 drives, ~100 TB raw) whose mount has been timing 
out at boot. This causes the system to drop to emergency mode. I am then able to 
mount the array in emergency mode and all data appears fine, but upon reboot it 
fails again.

I actually first hit this problem around a year ago, and initially put 
considerable effort into extending the mount timeout in systemd, as I believed that 
to be the problem. However, none of the methods I attempted worked properly, or 
they caused the system to continue booting before the array was mounted, which led 
to all sorts of issues. Eventually, I was able to almost completely resolve it by 
defragmenting the extent tree and subvolume tree for each subvolume (btrfs fi 
defrag /mountpoint/subvolume/). This seemed to reduce the time required to 
mount, and made the array mount at boot the majority of the time.

Recently I expanded the array yet again by adding another drive (and some more 
data), and now I am having the same issue again. I've posted the relevant 
entries from my dmesg, as well as some information on my array and system, 
below. I ran a defrag as mentioned above on each subvolume and was able to get 
the system to boot successfully. Any ideas on a more reliable and permanent 
solution to this? Thanks much!
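(The defrag workaround above, scripted as a rough sketch over every subvolume; 
subvolume paths containing spaces would need extra quoting care:)

# defragment the metadata of each subvolume's top directory; without -r this
# touches only directory/subvolume metadata, not file contents
btrfs subvolume list /mountpoint | awk '{print $NF}' | while read -r sub; do
    btrfs filesystem defragment "/mountpoint/$sub"
done
# and the top-level subvolume itself
btrfs filesystem defragment /mountpoint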

dmesg entries upon boot:
[ 22.775439] BTRFS info (device sdh): use lzo compression, level 0
[ 22.775441] BTRFS info (device sdh): using free space tree
[ 22.775442] BTRFS info (device sdh): has skinny extents
[ 124.250554] BTRFS error (device sdh): open_ctree failed

dmesg entries after running 'mount -a' in emergency mode:
[ 178.317339] BTRFS info (device sdh): force zstd compression, level 2
[ 178.317342] BTRFS info (device sdh): using free space tree
[ 178.317343] BTRFS info (device sdh): has skinny extents

uname -a:
Linux HOSTNAME 5.10.0-2-amd64 #1 SMP Debian 5.10.9-1 (2021-01-20) x86_64 
GNU/Linux

btrfs --version:
btrfs-progs v5.10

btrfs fi show /mountpoint:
Label: 'DATA' uuid: {snip}
Total devices 14 FS bytes used 41.94TiB
devid 1 size 2.73TiB used 2.46TiB path /dev/sdh
devid 2 size 7.28TiB used 6.87TiB path /dev/sdm
devid 3 size 2.73TiB used 2.46TiB path /dev/sdk
devid 4 size 9.10TiB used 8.57TiB path /dev/sdj
devid 5 size 9.10TiB used 8.57TiB path /dev/sde
devid 6 size 9.10TiB used 8.57TiB path /dev/sdn
devid 7 size 7.28TiB used 4.65TiB path /dev/sdc
devid 9 size 9.10TiB used 8.57TiB path /dev/sdf
devid 10 size 2.73TiB used 2.21TiB path /dev/sdl
devid 12 size 2.73TiB used 2.20TiB path /dev/sdg
devid 13 size 9.10TiB used 8.57TiB path /dev/sdd
devid 15 size 7.28TiB used 6.75TiB path /dev/sda
devid 16 size 7.28TiB used 6.75TiB path /dev/sdi
devid 17 size 7.28TiB used 6.75TiB path /dev/sdb

With such a large array, the extent tree is considerably large.

And that's what causes the mount time problem: at mount time we need to load
every block group item into memory.
When the extent tree grows that large, those lookups are mostly random reads,
which is never a good thing for HDDs.
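(A hedged way to gauge the size of that load, using a device name from the fi show
output above; dump-tree is read-only but can take a while on a tree this size, and
gives the most consistent count when the fs is unmounted or mounted read-only:)

# count how many block group items the extent tree holds -- roughly, how many
# scattered metadata reads the kernel performs during mount
btrfs inspect-internal dump-tree -t extent /dev/sdh | grep -c BLOCK_GROUP_ITEM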

I have been pushing a skinny block group tree for btrfs, which arranges block
group items into a very compact tree, just like the chunk tree.

This should greatly improve mount performance, but there are several
problems:
- The feature is not yet merged
- The feature needs to convert the existing fs to the new tree
  For your fs, that may take quite some time

So unfortunately, no good short term solution yet.
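(For reference, a hedged sketch of what such a conversion could look like once the
feature lands, assuming it ends up exposed through btrfstune as in later btrfs-progs
releases; the flag name and requirements below are assumptions, not something
available with the kernel and progs versions in this thread:)

# assumption: conversion done offline via btrfstune, requiring a kernel and
# btrfs-progs that support the block-group-tree feature, plus the free space
# tree already enabled (which this fs has, per the dmesg above)
umount /mountpoint
btrfstune --convert-to-block-group-tree /dev/sdh
mount /mountpoint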

Thanks,
Qu

btrfs fi usage /mountpoint:
Overall:
Device size: 92.78TiB
Device allocated: 83.96TiB
Device unallocated: 8.83TiB
Device missing: 0.00B
Used: 83.94TiB
Free (estimated): 4.42TiB (min: 2.95TiB)
Free (statfs, df): 3.31TiB
Data ratio: 2.00
Metadata ratio: 3.00
Global reserve: 512.00MiB (used: 0.00B)
Multiple profiles: no

Data,RAID1: Size:41.88TiB, Used:41.877TiB (99.99%)
{snip}

Metadata,RAID1C3: Size:68GiB, Used:63.79GiB (93.81%)
{snip}

System,RAID1C3: Size:32MiB, Used:6.69MiB (20.90%)
{snip}

Unallocated:
{snip}
