Dear list,
I have a small cluster (Reef 18.2.4) with 7 hosts and 3-4 OSDs each
(a mix of 960 GB and 1.92 TB Intel D3-S4610, Samsung SM883, and Samsung PM897 SSDs):
  cluster:
    id:     ecff3ce8-539b-443e-a492-da428f4aa9e9
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum titan,mangan,kalium,argon,chromium (age 2w)
    mgr: mangan(active, since 2w), standbys: titan, argon
    osd: 22 osds: 22 up (since 2w), 22 in (since 3M)

  data:
    pools:   2 pools, 513 pgs
    objects: 2.76M objects, 7.0 TiB
    usage:   16 TiB used, 15 TiB / 31 TiB avail
    pgs:     513 active+clean
The cluster stores RBD volumes for virtual machines.
For a couple of months now, the cluster has been reporting slow ops on some
OSDs and marking some PGs as laggy. This happens once or twice a day, sometimes
more often and sometimes not at all for a few days, at completely random times:
it is independent of when snapshots are deleted and trimmed, and independent of
the I/O load or the load on the hosts.
After about 30 seconds, during which the write speed on the VMs drops to zero,
everything returns to normal. I cannot reproduce the slow ops manually by
generating write load on the cluster; even writing continuously at full speed
(300-400 MB/s) for 20 minutes does not cause any problems.
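For reference, a sustained write of that kind can be generated with rados
bench, e.g. 1200 seconds of 4 MiB writes with 16 ops in flight (the pool name
and parameters here are illustrative, not my exact invocation):

# rados bench -p rbd 1200 write -b 4194304 -t 16 --no-cleanup
# rados -p rbd cleanup

Runs like this sustain the 300-400 MB/s mentioned above without producing a
single slow op.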
See the attached log file for an example of a typical occurrence. I have also
measured the write load on the disks during the problems with iostat, which
shows how the writes stall; see the second attachment.
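The iostat numbers were collected with extended per-device statistics at a
one-second interval, i.e. roughly:

# iostat -x -t 1

(-x for extended statistics, -t to timestamp each report; the device list is
omitted here.)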
The OSDs with slow ops are completely random; any of the disks shows up once
in a while.
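In case it helps: while an incident is in progress, ceph health detail names
the OSDs currently reporting slow ops, and the individual ops can be inspected
through the OSD's admin socket, e.g. (the OSD id is just an example):

# ceph health detail
# ceph daemon osd.16 dump_historic_slow_ops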
Current config (I have already tried tuning snaptrim and scrub, which didn't help):
# ceph config dump
WHO     MASK  LEVEL     OPTION                                  VALUE         RO
global        advanced  auth_client_required                    cephx         *
global        advanced  auth_cluster_required                   cephx         *
global        advanced  auth_service_required                   cephx         *
global        advanced  bdev_async_discard                      true
global        advanced  bdev_enable_discard                     true
global        advanced  public_network                          10.0.4.0/24   *
mon           advanced  auth_allow_insecure_global_id_reclaim   false
mgr           advanced  mgr/balancer/active                     true
mgr           advanced  mgr/balancer/mode                       upmap
mgr           unknown   mgr/pg_autoscaler/autoscale_profile     scale-up      *
osd           basic     osd_memory_target                       4294967296
osd           advanced  osd_pg_max_concurrent_snap_trims        1
osd           advanced  osd_scrub_begin_hour                    23
osd           advanced  osd_scrub_end_hour                      4
osd           advanced  osd_scrub_sleep                         1.000000
osd           advanced  osd_snap_trim_priority                  1
osd           advanced  osd_snap_trim_sleep                     2.000000
osd.0         basic     osd_mclock_max_capacity_iops_ssd        29199.674019
osd.1         basic     osd_mclock_max_capacity_iops_ssd        31554.530141
osd.10        basic     osd_mclock_max_capacity_iops_ssd        25949.821194
osd.11        basic     osd_mclock_max_capacity_iops_ssd        26300.596265
osd.12        basic     osd_mclock_max_capacity_iops_ssd        25167.331294
osd.13        basic     osd_mclock_max_capacity_iops_ssd        21606.610828
osd.14        basic     osd_mclock_max_capacity_iops_ssd        27894.095121
osd.15        basic     osd_mclock_max_capacity_iops_ssd        25929.047047
osd.16        basic     osd_mclock_max_capacity_iops_ssd        15423.600235
osd.17        basic     osd_mclock_max_capacity_iops_ssd        25097.493934
osd.18        basic     osd_mclock_max_capacity_iops_ssd        25966.188007
osd.19        basic     osd_mclock_max_capacity_iops_ssd        23628.746459
osd.2         basic     osd_mclock_max_capacity_iops_ssd        32157.280832
osd.20        basic     osd_mclock_max_capacity_iops_ssd        22722.682745
osd.3         basic     osd_mclock_max_capacity_iops_ssd        33951.086556
osd.4         basic     osd_mclock_max_capacity_iops_ssd        22736.907664
osd.5         basic     osd_mclock_max_capacity_iops_ssd        21916.777510
osd.6         basic     osd_mclock_max_capacity_iops_ssd        29984.954749
osd.7         basic     osd_mclock_max_capacity_iops_ssd        26757.965797
osd.8         basic     osd_mclock_max_capacity_iops_ssd        22738.921429
osd.9         basic     osd_mclock_max_capacity_iops_ssd        24635.156413
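For completeness: the per-OSD osd_mclock_max_capacity_iops_ssd entries above
are the values the OSDs benchmarked for themselves at startup; as far as I
know the measurement can be repeated on a live OSD with the built-in bench,
e.g. (osd.0 is just an example):

# ceph tell osd.0 bench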
Any help would be much appreciated!
Thanks,
Tim