Hi Alwin,

Thanks for your reply! Appreciated.

> These messages are not necessarily caused by a network issue. It might
> well be that the daemon osd.18 can not react to heartbeat messages.

The thing is: the two OSDs are on the same host. I checked ceph-osd.18.log, and it contains just the regular Ceph activity, nothing special.

I noticed that on host pm2 there are multiple kworker PIDs running at 100% CPU utilisation. Also, swap usage is at 100%, while regular RAM usage (from the Proxmox GUI) is only 54%.

No idea what to make of that...
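In case it helps, this is the sort of thing I can run on pm2 to see what those kworkers are actually doing and how memory looks; it's all plain procps and /proc, nothing Ceph-specific, and <pid> is of course a placeholder:

ps -eo pid,stat,pcpu,comm --sort=-pcpu | head -20    # busiest processes, kworkers show up here too
cat /proc/<pid>/stack                                # kernel stack of one busy kworker (as root)
free -h                                              # RAM vs. swap totals
vmstat 1 5                                           # ongoing swap-in/out (si/so) and iowait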

> Check the logs on the host of osd.18.

Here they are:

2019-02-08 08:44:01.953390 7f6dc08b4700  1 leveldb: Level-0 table #1432303: started
2019-02-08 08:44:02.108622 7f6dc08b4700  1 leveldb: Level-0 table #1432303: 1299359 bytes OK
2019-02-08 08:44:02.181135 7f6dc08b4700  1 leveldb: Delete type=0 #1432295

Also, ceph-mon.1.log contains nothing special, just the regular entries.
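If anyone wants me to search for something specific, I can run greps along these lines (the patterns are just the usual heartbeat/slow-request strings, adjust as needed):

grep -i 'heartbeat_check' /var/log/ceph/ceph-osd.18.log | tail
grep -i 'slow request' /var/log/ceph/ceph-osd.18.log | tail
grep -iE 'slow|timeout|heartbeat' /var/log/ceph/ceph-mon.1.log | tail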

> The cluster is doing scrubbing too; this is an intensive operation and
> taxes your OSDs. This intensifies the issue. But in general, you need to
> find out what caused the slow requests. Ceph is able to throttle and
> tries to get IOs done, even under pressure.

Yes, I turned the noscrub and nodeep-scrub flags off again after the issues of yesterday morning were resolved. The system has been running HEALTH_OK for 24 hours now, with no issues (except for the worrying log lines appearing every second).
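For reference, what I did with the flags, and what I can pull from osd.18 directly if that is useful, is roughly this; the dump_historic_ops call goes through the admin socket, so it has to run on pm2:

ceph osd set noscrub              # set while debugging yesterday
ceph osd set nodeep-scrub
ceph osd unset noscrub            # unset again after things calmed down
ceph osd unset nodeep-scrub

ceph health detail                     # shows which OSDs/PGs are behind a warning
ceph daemon osd.18 dump_historic_ops   # slowest recent ops on osd.18 (run on pm2)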

> If you describe your system further (eg. osd tree, crush map, system
> specs) then we may be able to point you in the right direction. ;)

Here you go:

root@pm2:/var/log/ceph# ceph osd tree
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       87.35376 root default
-2       29.11688     host pm1
 0   hdd  3.64000         osd.0      up  1.00000 1.00000
 1   hdd  3.64000         osd.1      up  1.00000 1.00000
 2   hdd  3.63689         osd.2      up  1.00000 1.00000
 3   hdd  3.64000         osd.3      up  1.00000 1.00000
12   hdd  3.64000         osd.12     up  1.00000 1.00000
13   hdd  3.64000         osd.13     up  1.00000 1.00000
14   hdd  3.64000         osd.14     up  1.00000 1.00000
15   hdd  3.64000         osd.15     up  1.00000 1.00000
-3       29.12000     host pm2
 4   hdd  3.64000         osd.4      up  1.00000 1.00000
 5   hdd  3.64000         osd.5      up  1.00000 1.00000
 6   hdd  3.64000         osd.6      up  1.00000 1.00000
 7   hdd  3.64000         osd.7      up  1.00000 1.00000
16   hdd  3.64000         osd.16     up  1.00000 1.00000
17   hdd  3.64000         osd.17     up  1.00000 1.00000
18   hdd  3.64000         osd.18     up  1.00000 1.00000
19   hdd  3.64000         osd.19     up  1.00000 1.00000
-4       29.11688     host pm3
 8   hdd  3.64000         osd.8      up  1.00000 1.00000
 9   hdd  3.64000         osd.9      up  1.00000 1.00000
10   hdd  3.64000         osd.10     up  1.00000 1.00000
11   hdd  3.64000         osd.11     up  1.00000 1.00000
20   hdd  3.64000         osd.20     up  1.00000 1.00000
21   hdd  3.64000         osd.21     up  1.00000 1.00000
22   hdd  3.64000         osd.22     up  1.00000 1.00000
23   hdd  3.63689         osd.23     up  1.00000 1.00000

We have journals on SSD.
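(Assuming these are FileStore OSDs, which the leveldb lines above suggest, this is a quick way to double-check which device osd.18's journal actually sits on; the path below is just the standard OSD data dir layout:

readlink -f /var/lib/ceph/osd/ceph-18/journal    # where the journal symlink points
lsblk -o NAME,ROTA,SIZE,MODEL                    # ROTA=0 means non-rotational, i.e. the SSD)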

The crush map:

root@pm2:/var/log/ceph# cat /tmp/decomp
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
device 13 osd.13 class hdd
device 14 osd.14 class hdd
device 15 osd.15 class hdd
device 16 osd.16 class hdd
device 17 osd.17 class hdd
device 18 osd.18 class hdd
device 19 osd.19 class hdd
device 20 osd.20 class hdd
device 21 osd.21 class hdd
device 22 osd.22 class hdd
device 23 osd.23 class hdd

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host pm1 {
        id -2           # do not change unnecessarily
        id -5 class hdd         # do not change unnecessarily
        # weight 29.117
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 3.640
        item osd.1 weight 3.640
        item osd.3 weight 3.640
        item osd.12 weight 3.640
        item osd.13 weight 3.640
        item osd.14 weight 3.640
        item osd.15 weight 3.640
        item osd.2 weight 3.637
}
host pm2 {
        id -3           # do not change unnecessarily
        id -6 class hdd         # do not change unnecessarily
        # weight 29.120
        alg straw
        hash 0  # rjenkins1
        item osd.4 weight 3.640
        item osd.5 weight 3.640
        item osd.6 weight 3.640
        item osd.7 weight 3.640
        item osd.16 weight 3.640
        item osd.17 weight 3.640
        item osd.18 weight 3.640
        item osd.19 weight 3.640
}
host pm3 {
        id -4           # do not change unnecessarily
        id -7 class hdd         # do not change unnecessarily
        # weight 29.117
        alg straw
        hash 0  # rjenkins1
        item osd.8 weight 3.640
        item osd.9 weight 3.640
        item osd.10 weight 3.640
        item osd.11 weight 3.640
        item osd.20 weight 3.640
        item osd.21 weight 3.640
        item osd.22 weight 3.640
        item osd.23 weight 3.637
}
root default {
        id -1           # do not change unnecessarily
        id -8 class hdd         # do not change unnecessarily
        # weight 87.354
        alg straw
        hash 0  # rjenkins1
        item pm1 weight 29.117
        item pm2 weight 29.120
        item pm3 weight 29.117
}

# rules
rule replicated_ruleset {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map
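(The decompiled map above comes from the usual getcrushmap / crushtool round trip, i.e. roughly:

ceph osd getcrushmap -o /tmp/comp
crushtool -d /tmp/comp -o /tmp/decomp)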

The three servers are identical: 128 GB memory (50% used), dual Xeon(R) CPU E5-2630 v4 @ 2.20GHz, PVE 5.3.

Any ideas where to look? I could of course try a reboot of node pm2 to see if that makes the issue go away, but I'd rather understand why osd.18 does not respond to heartbeat messages, why swap usage is at 100%, and why there are multiple high-CPU kworker threads running on this host only.
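A quick way to see which processes actually hold the swap (and whether it is the ceph-osd daemons) would be something like this, plain /proc only; I can post the output if that helps:

# per-process swap usage in kB, largest first
for f in /proc/[0-9]*/status; do
    awk '/^Name:/{n=$2} /^VmSwap:/{print $2, n}' "$f"
done | sort -rn | head

cat /proc/sys/vm/swappiness    # current swappiness setting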

MJ
_______________________________________________
pve-user mailing list
pve-user@pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

Reply via email to