Hi João,

You can see how much RocksDB space has been used with the command “ceph daemon 
osd.X perf dump”, where X is the id of an OSD on the node you are running the 
command on.
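
For example, to pull out just the bluefs counters for one OSD (assuming the jq 
utility is available on the node; osd.0 here is just a placeholder for one of 
your local OSD ids):

    ceph daemon osd.0 perf dump | jq .bluefs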

You are looking for this section in the output :-
    "bluefs": {
        "gift_bytes": 0,
        "reclaim_bytes": 0,
        "db_total_bytes": 23966253056,
        "db_used_bytes": 1714421760,
        "wal_total_bytes": 0,
        "wal_used_bytes": 0,
        "slow_total_bytes": 0,
        "slow_used_bytes": 0,
        "num_files": 24,
        "log_bytes": 552120320,
        "log_compactions": 0,
        "logged_bytes": 537051136,
        "files_written_wal": 1,
        "files_written_sst": 11,
        "bytes_written_wal": 429315193,
        "bytes_written_sst": 601384180,
        "bytes_written_slow": 0,
        "max_bytes_wal": 0,
        "max_bytes_db": 1714421760,
        "max_bytes_slow": 0
    },

If you have non-zero numbers in the slow_ entries then your RocksDB is spilling 
over onto the HDD.
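
If you want a quick check across every OSD on a node, something along these 
lines works (a rough sketch only: it assumes the default admin socket paths 
under /var/run/ceph, a cluster named “ceph”, and that jq is installed):

    # Print slow_used_bytes for each OSD with an admin socket on this node.
    # A non-zero value means that OSD's RocksDB has spilt onto the slow device.
    for sock in /var/run/ceph/ceph-osd.*.asok; do
        id=$(basename "$sock" .asok | cut -d. -f2)
        used=$(ceph daemon osd."$id" perf dump | jq .bluefs.slow_used_bytes)
        echo "osd.$id slow_used_bytes=$used"
    done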

As to whether moving RocksDB and the WAL onto HDD can cause a performance 
degradation, it depends on how busy your disks are. If your HDDs are working 
hard and you are now going to throw a lot more workload onto them, then 
performance will degrade, possibly substantially. I have seen performance 
impacts of up to 75% when things have started spilling over from NVMe to HDD.
By that I mean I had a lovely flat line ingesting objects, and that line 
suddenly dropped by 75% once the RocksDB had filled up and spilt over onto the 
HDD.




From: João Victor Rodrigues Soares <[email protected]>
Date: Wednesday, 25 September 2019 at 14:37
To: "[email protected]" <[email protected]>
Subject: [ceph-users] Slow Write Issues

Hello,

In my company, we currently have the following infrastructure:

- Ceph Luminous
- OpenStack Pike.

We have a cluster of 3 osd nodes with the following configuration:

- 1 x Xeon (R) D-2146NT CPU @ 2.30GHz
- 128GB RAM
- 128GB ROOT DISK
- 12 x 10TB SATA ST10000NM0146 (OSD)
- 1 x Intel Optane P4800X SSD DC 375GB (block.DB / block.wal)
- Ubuntu 16.04
- 2 x 10Gb network interfaces configured with LACP


The compute nodes have
- 4 x 10Gb network interfaces with LACP.

We also have 4 monitors with:
- 4 x 10Gb LACP network interfaces.
- The monitor nodes show approx. 90% CPU idle time, with 32GB / 256GB of RAM 
available.

For each OSD disk we have created a 33GB partition for block.db and block.wal.

We have recently been facing a number of performance issues. Virtual machines 
created in OpenStack are experiencing slow write speeds (approx. 50MB/s).

Monitoring of the OSD nodes shows an average of 20% CPU iowait time and 70% CPU 
idle time.
Memory consumption is around 30%.
We have no latency issues (9ms average).

My question is whether what is happening may have to do with the amount of disk 
space dedicated to DB / WAL. The Ceph documentation recommends that the 
block.db size should not be smaller than 4% of the block (data) device size.

In this case, for each disk in my environment, block.db could not be less than 
400GB per OSD.

Another question is whether setting my disks to use block.db / block.wal on the 
mechanical disks themselves could lead to a performance degradation.

Best regards,
João Victor Rodrigues Soares
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
