Hi,

Thank you. So, I've installed a fresh CentOS 7 with the same
configuration, the old kernel, and btrfs. Then I upgraded the kernel to
5.11 and all went well, so I thought: let's do it on the prod server.
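In case anyone wants to reproduce the test setup: one way to get a
mainline kernel onto CentOS 7 is via the ELRepo kernel-ml packages,
roughly like this (exact package versions may differ, and the grub
paths assume BIOS boot):

# add the ELRepo repository and install the mainline kernel
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
yum install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
yum --enablerepo=elrepo-kernel install kernel-ml

# boot the newly installed kernel by default
grub2-set-default 0
grub2-mkconfig -o /boot/grub2/grub.cfg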
Unfortunately, when I boot into 5.11 the sysroot mount times out and I
have something like this in the log:

btrfs open_ctree failed

Any quick fix for this? I'm able to mount the btrfs volume using a
rescue CD, but I have the same issues there, e.g. rm on a big file
takes 10 minutes...

Thank you
Laszlo
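PS: from the rescue CD, would something like this be a sane first step
to gather more info? (untested sketch; device path taken from the fi
show output quoted below, adjust as needed)

# read-only consistency check, doesn't modify anything
btrfs check --readonly /dev/sdb1

# try mounting read-only with an older tree root, in case the
# current one is the problem
mount -o ro,usebackuproot /dev/sdb1 /mnt

# kernel messages from the failed mount attempt
dmesg | grep -i btrfs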
On Tue, Feb 16, 2021 at 2:08 PM Roman Stingler <roman.sting...@gmail.com> wrote:
>
> First, update your kernel to 5.10; it is LTS now. Then try again.
>
> There have been a million updates for stability and performance
> improvements in the past year.
>
> On 2/16/21 9:54 AM, Pal, Laszlo wrote:
> > Thank you all for the quick response. The server is running, but as I
> > said the I/O performance is not as good as it should be. I also think
> > fragmentation is the issue, but I would like to optimise my config
> > and, if possible, keep this server running with acceptable
> > performance, so let me answer the questions below.
> >
> > So, as far as I can see, the action plan is the following:
> > - enable the v2 space_cache. Is this safe/stable enough?
> > - run defrag on the old data. I suppose it will run for weeks, but
> >   I'm OK with that if the server can run smoothly during the process
> > - is compress=zstd the recommended mount option? Does it perform
> >   better than the default?
> > - I'm also thinking of compressing my logs -- after the defrag --
> >   with traditional gzip compression and turning off on-the-fly
> >   compression (is this a big performance gain?)
> >
> > Any other suggestions?
> >
> > Thank you
> > Laszlo
> > ---
> >
> > uname -a
> > 3.10.0-1160.6.1.el7.x86_64 #1 SMP Tue Nov 17 13:59:11 UTC 2020 x86_64
> > x86_64 x86_64 GNU/Linux
> >
> > btrfs --version
> > btrfs-progs v4.9.1
> >
> > btrfs fi show
> > Label: 'centos'  uuid: 7017204b-1582-4b4e-ad04-9e55212c7d46
> >         Total devices 2 FS bytes used 4.03TiB
> >         devid    1 size 491.12GiB used 119.02GiB path /dev/sda2
> >         devid    2 size 4.50TiB used 4.14TiB path /dev/sdb1
> >
> > btrfs fi df /var
> > Data, single: total=4.09TiB, used=3.96TiB
> > System, RAID1: total=8.00MiB, used=464.00KiB
> > Metadata, RAID1: total=81.00GiB, used=75.17GiB
> > GlobalReserve, single: total=512.00MiB, used=0.00B
> >
> > dmesg > dmesg.log
> > dmesg | grep -i btrfs
> > [  491.729364] BTRFS warning (device sdb1): block group
> > 4619266686976 has wrong amount of free space
> > [  491.729371] BTRFS warning (device sdb1): failed to load free
> > space cache for block group 4619266686976, rebuilding it now
> >
> > CPU type and model
> > processor  : 11
> > vendor_id  : GenuineIntel
> > cpu family : 6
> > model      : 26
> > model name : Intel(R) Xeon(R) CPU E5540 @ 2.53GHz
> > stepping   : 4
> > microcode  : 0x1d
> > cpu MHz    : 2533.423
> > cache size : 8192 KB
> > 12 vCPU on ESXi
> >
> > how much memory
> > 48 GB RAM
> >
> > type and model of hard disk
> > virtualized Fujitsu RAID on ESXi
> >
> > is it RAID
> > yes, the underlying virtualization provides redundancy, no sw RAID
> >
> > kernel version
> > 3.10.0-1160.6.1.el7.x86_64
> >
> > your btrfs mount options, probably in /etc/fstab
> > UUID=7017204b-1582-4b4e-ad04-9e55212c7d46 /    btrfs defaults,noatime,autodefrag,subvol=root 0 0
> > UUID=7017204b-1582-4b4e-ad04-9e55212c7d46 /var btrfs defaults,subvol=var,noatime,autodefrag 0 0
> >
> > size of log files
> > 4.5 TB on /var
> >
> > have you snapshots
> > no
> >
> > have you tried dedup tools
> > not yet
> >
> > things to do
> >
> > 1. Kernel update: the LTS kernel has been updated to 5.10 (you may
> > have to install it manually, because CentOS will be dropped) -> reboot.
> > You may have to remove your mount point from fstab, boot into the
> > system, and mount it manually later.
> > Is this absolutely necessary?
> >
> > 2. Set mount options in fstab:
> > defaults,autodefrag,space_cache=v2,compress=zstd (autodefrag only on HDD)
> > defaults,ssd,space_cache=v2,compress=zstd (for SSD)
> >
> > autodefrag is already enabled. Is the v2 space_cache safe enough?
> >
> > 3. sudo btrfs scrub start /dev/sda (use your device)
> > watch sudo btrfs scrub status /dev/sda (watch and wait until finished)
> >
> > 4. sudo btrfs device stats /dev/sda (your disk)
> >
> > 5. Install smartmontools,
> > run sudo smartctl -x /dev/sda (use your disk)
> > and check the output.
> > I think this is not applicable, because this is a virtual disk.
> >
> > On Tue, Feb 16, 2021 at 8:17 AM Nikolay Borisov <nbori...@suse.com> wrote:
> >>
> >> On 15.02.21 г. 16:53 ч., Pal, Laszlo wrote:
> >>> Hi,
> >>>
> >>> I'm not sure this is the right place to ask, but let me try :) I
> >>> have a server where I'm mainly using btrfs because of the built-in
> >>> compression feature. This is a central log server, storing logs
> >>> from tens of thousands of devices as text files, in millions of
> >>> files across thousands of directories.
> >>>
> >>> I've started to think it was not the best idea to choose btrfs for this :)
> >>>
> >>> The performance of this server was always worse than others where I
> >>> don't use btrfs, but I thought this was just the I/O overhead of
> >>> compression plus the not-so-good ESX host providing the disk to
> >>> this machine. But now even rm on a single file takes ages, so
> >>> something is definitely wrong. So I'm looking for recommendations
> >>> for an environment where the data-security functions of btrfs are
> >>> less important than performance.
> >>>
> >>> I have been searching the net for comprehensive performance
> >>> documentation for months, but I cannot find any so far.
> >>>
> >>> Thank you in advance
> >>> Laszlo
> >>>
> >> You are likely suffering from fragmentation issues. Given that you
> >> hold log files, I'd assume you do a lot of small writes, each of
> >> which results in a CoW operation that allocates space. This
> >> increases the size of the metadata tree, and since you are likely
> >> using hard drives, seeking is slow. To ascertain whether that's
> >> really the case, I'd advise you to show the output of the following
> >> commands:
> >>
> >> btrfs fi usage <mountpoint> - this will show the total used space
> >> on the filesystem.
> >>
> >> Then run: btrfs inspect-internal dump-tree -t5 </dev/xxx> | grep -c EXTENT_DATA
> >>
> >> which will show how many data extents there are in the filesystem.
> >>
> >> Subsequently run: btrfs inspect-internal dump-tree -t5 </dev/xxx> | grep -c leaf
> >>
> >> which will show how many leaves there are in the filesystem.
> >> Then you have 2 options:
> >>
> >> a) Use btrfs defragment to rewrite the leaves and bring them closer
> >> together, so that seeks become somewhat cheaper.
> >>
> >> b) Rewrite the log files by copying them with no reflinks, so that
> >> instead of one file consisting of many small extents they each
> >> consist of one giant extent. With your use case I'd assume you also
> >> want nocow to be enabled; unfortunately, nodatacow precludes using
> >> compression.
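For (a): would running the defragment over the whole of /var be the way
to go, something like the below? (sketch only; as far as I know -czstd
needs a newer kernel and btrfs-progs than my current 4.9.1, since zstd
support arrived around 4.14)

# recursively defragment /var, recompressing the data with zstd
btrfs filesystem defragment -r -v -czstd /var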
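For (b): is this roughly what you mean? (untested sketch; it assumes
the logs live under /var/log with a .log suffix, and that no writer has
a file open while it is being rewritten -- adjust path and pattern to
the real layout)

# rewrite each log file without reflinks, so the copy lands in fresh,
# mostly contiguous extents, then replace the original with it
find /var/log -type f -name '*.log' | while read -r f; do
    cp --reflink=never "$f" "$f.tmp" && mv "$f.tmp" "$f"
done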