Hi,

On 16/02/2021 at 09:54, Pal, Laszlo wrote:
> [...]
> So, as far as I see the action plan is the following
> - enable v2 space_cache. is this safe/stable enough?
> - run defrag on old data, I suppose it will run weeks, but I'm ok with
> it if the server can run smoothly during this process
> - compress=zstd is the recommended mount option? is this performing
> better than the default?
> - I'm also thinking to -after defrag- compress my logs with
> traditional gzip compression and turn off on-the-fly compress (is this
> a huge performance gain?)
>
> [...]
>
> 3.10.0-1160.6.1.el7.x86_64 #1 SMP Tue Nov 17 13:59:11 UTC 2020 x86_64
> x86_64 x86_64 GNU/Linux
>

As Nikolay pointed out, this is a vendor kernel based on a very old
upstream release (3.10, more than 7 years old) and this can create
problems.
For reference, see
https://btrfs.wiki.kernel.org/index.php/Changelog#By_feature

- v2 space_cache appeared in 4.5.
- compress=zstd appeared in 4.14.

So for the 2 questions related to those, you'll have to ask the
distribution if they back-ported them (I doubt it, usually only bugfixes
are backported).
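
To make the version comparison concrete, here is a minimal sketch that
compares a `uname -r` style string with the upstream releases where, as
far as I know, each feature landed (4.5 for the free space tree behind
space_cache=v2, 4.14 for zstd). The function names and the table are
mine, for illustration only. It only looks at the upstream base version,
so it cannot detect vendor back-ports: a "False" means "ask the
distribution", not "definitely unsupported".

```python
import re

# Upstream (mainline) kernel versions that introduced each feature.
# Assumption: vendor back-ports are NOT reflected here.
FEATURE_MIN_KERNEL = {
    "space_cache=v2": (4, 5),
    "compress=zstd": (4, 14),
}


def parse_kernel_release(release):
    """Return (major, minor) from a string such as '3.10.0-1160.6.1.el7.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def upstream_support(release):
    """Map each feature to True if the upstream base version already had it."""
    version = parse_kernel_release(release)
    return {
        name: version >= minimum
        for name, minimum in FEATURE_MIN_KERNEL.items()
    }
```

Calling `upstream_support(platform.release())` (after `import platform`)
would check the running kernel; for the kernel quoted above both entries
come out False.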

I wouldn't be comfortable using a 3.10 based kernel with BTRFS. For
example there was at least one compress=lzo bug (race condition?) that
occasionally corrupted data and was fixed (from memory) in either a
late 3.x or a 4.x kernel. Stability and performance on such a base will
not compare well with the current state of BTRFS.

If you really want to go ahead with this kernel and BTRFS I would at
least avoid compression with it and as you suggested in your last point
compress at the application level.

Note that compression will make fragmentation worse. BTRFS uses small
individually compressed extents (probably because there isn't any other
decent way to minimize the costs of seeking into a file). The more
extents you have, the more opportunities for fragmentation exist.

For defragmentation, I use something I coded to replace autodefrag,
which was not usable for my use cases:
https://github.com/jtek/ceph-utils/blob/master/btrfs-defrag-scheduler.rb
The complexity is worth it for me because it allows good performance on
filesystems where I need BTRFS (either for checksums or snapshots). For a
log server I wouldn't consider it but you already bit the BTRFS bullet
so it might help depending on the details (it could be adapted to handle
a transition to a more sane state for example).
Initially it was fine-tuned to handle Ceph OSDs and later on adapted to
very large BTRFS volumes backing NFS servers, backup servers (see the
README in the same repository) or even some of our PostgreSQL replicas.
The initialization on an existing filesystem needs special (undocumented
unfortunately) care though. By default the first pass goes very fast to
get an estimation of the number of files and this can create a very
large I/O load. If you decide to test it, I can provide directions (or
update the documentation). For a pure log server it is overkill (you
could simply defragment files on file rotation).


To sum up: if I were in your position I would probably choose between
these alternatives:
- switch to ext4 (maybe the easiest unless the migration is impractical),
- defragment old files that aren't written to anymore and schedule
defragmentation when log files are archived (maybe using logrotate),
- use my defragmentation scheduler as a last resort (might be a solution
if you store other data than logs in the filesystem too).
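
For the second alternative, a rough sketch of what "defragment files
that aren't written to anymore" could look like. It selects files by
modification time and hands them one by one to `btrfs filesystem
defragment` from btrfs-progs. The function names, the "*.log.*" pattern
and the one hour idle threshold are placeholders of mine, not something
taken from your setup. It defaults to a dry run that only prints the
commands.

```python
import subprocess
import time
from pathlib import Path


def idle_files(directory, pattern, min_idle_seconds, now=None):
    """Return files matching pattern that have not been modified recently."""
    now = time.time() if now is None else now
    return sorted(
        path
        for path in Path(directory).glob(pattern)
        if path.is_file() and now - path.stat().st_mtime >= min_idle_seconds
    )


def defrag_command(path):
    """Build the btrfs-progs command that defragments a single file."""
    return ["btrfs", "filesystem", "defragment", str(path)]


def defragment_idle_files(directory, pattern="*.log.*",
                          min_idle_seconds=3600, dry_run=True):
    """Defragment idle files one at a time; only print commands on a dry run."""
    commands = [
        defrag_command(path)
        for path in idle_files(directory, pattern, min_idle_seconds)
    ]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # Needs btrfs-progs installed and enough privileges on the files.
            subprocess.run(command, check=True)
    return commands
```

Processing one file at a time keeps the I/O load bounded, and the same
function could be called from a logrotate postrotate script for the
files that were just archived.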

In all cases I would avoid BTRFS compression and compress on log
rotation. You'll get better performance and compression this way.
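
A minimal sketch of compressing at the application level on rotation,
using only the Python standard library. The function name and the
".gz next to the original" layout are my own choices for illustration;
logrotate's `compress` directive gives a similar result without any
code.

```python
import gzip
import shutil
from pathlib import Path


def compress_rotated_log(path, level=6):
    """Gzip a rotated log next to the original, then remove the original."""
    source = Path(path)
    target = source.with_name(source.name + ".gz")
    # Stream the file so large logs are never loaded into memory at once.
    with source.open("rb") as src, \
            gzip.open(target, "wb", compresslevel=level) as dst:
        shutil.copyfileobj(src, dst)
    source.unlink()
    return target
```

The compressed file is written once, sequentially, and never modified
afterwards, which is the friendliest write pattern for BTRFS.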

If you can update the kernel, use space_cache=v2: it is stable on recent
kernels (I don't even remember it being buggy on the earlier kernels
that had it).

Best regards,

Lionel
