[PVE-User] lots of 'heartbeat_check: no reply from ...' in the logs

mj Thu, 07 Feb 2019 11:16:47 -0800

Hi,

We are continuously getting lines like the following in our logs, between osd.19 and osd.18; both are on the same host, pm2:

2019-02-07T19:59:24.724447+01:00 pm2 ceph-osd 3093 - - 2019-02-07 
19:59:24.723800 7f902e9f0700 -1 osd.19 15136 heartbeat_check: no reply from 
10.10.89.2:6807 osd.18 ever on either front or back, first ping sent 2019-02-07 
07:58:32.526903 (cutoff 2019-02-07 19:59:04.723796)
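
To see which OSD pairs are affected and how often, messages like the one above can be tallied. The following is a minimal Python sketch, not an official Ceph tool. The regular expression is an assumption derived only from the single log line quoted above, and the function name count_heartbeat_failures is hypothetical; other Ceph releases may format the message differently.

```python
import re
from collections import Counter

# Assumed pattern, derived from the one log line quoted in this post.
HEARTBEAT_RE = re.compile(
    r"(?P<reporter>osd\.\d+) \d+ heartbeat_check: no reply from "
    r"(?P<addr>\S+) (?P<target>osd\.\d+)"
)


def count_heartbeat_failures(lines):
    """Count 'no reply' messages per (reporter, target, address) triple."""
    counts = Counter()
    for line in lines:
        match = HEARTBEAT_RE.search(line)
        if match:
            key = (match["reporter"], match["target"], match["addr"])
            counts[key] += 1
    return counts


# The log line from this post, joined back into a single line.
sample = (
    "2019-02-07T19:59:24.724447+01:00 pm2 ceph-osd 3093 - - 2019-02-07 "
    "19:59:24.723800 7f902e9f0700 -1 osd.19 15136 heartbeat_check: "
    "no reply from 10.10.89.2:6807 osd.18 ever on either front or back, "
    "first ping sent 2019-02-07 07:58:32.526903 "
    "(cutoff 2019-02-07 19:59:04.723796)"
)

result = count_heartbeat_failures([sample])
for (reporter, target, addr), n in result.items():
    print(f"{reporter} -> {target} ({addr}): {n} message(s)")
```

In practice the lines would come from the syslog file rather than from a hard-coded sample; a single pair dominating the counts would suggest the problem is limited to those two OSDs.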

I can ping the IP address 10.10.89.2 from host pm2, and nc also confirms that the port is listening:

root@pm2:~# nc -vz 10.10.89.2 6807
nc: 10.10.89.2 (10.10.89.2) 6807 [6807] open
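
The same reachability check can be scripted, for example to test several ports in a loop. This is a minimal Python sketch of a TCP connect test comparable to nc -vz; the helper name port_is_open and the 2-second timeout are assumptions chosen for illustration. Note that a successful connect only shows that something accepts TCP connections on that port; it does not show that OSD heartbeat pings are being answered.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call with the address and port from the log message:
#   port_is_open("10.10.89.2", 6807)
```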

We had some trouble this morning from doing too many things at the same time, which caused slow requests, etc. The system recovered and has been up and running the whole day with no further issues. However, that is when these messages started appearing.


Some info on our system, consisting of three identical nodes:

root@pm2:~# ceph -v
ceph version 12.2.10 (fc2b1783e3727b66315cc667af9d663d30fe7ed4) luminous (stable)
root@pm2:~# ceph health detail
HEALTH_OK
root@pm2:~# ceph -s
  cluster:
    id:     1397f1dc-7d94-43ea-ab1xxxxxxxc1
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum 0,1,2
    mgr: pm1(active), standbys: pm3, pm2
    osd: 24 osds: 24 up, 24 in

  data:
    pools:   2 pools, 1088 pgs
    objects: 4.50M objects, 17.1TiB
    usage:   51.5TiB used, 35.8TiB / 87.3TiB avail
    pgs:     1085 active+clean
             2    active+clean+scrubbing
             1    active+clean+scrubbing+deep

  io:
    client:   18.3MiB/s rd, 39.8MiB/s wr, 87op/s rd, 539op/s wr

What can I do to get rid of these messages? They sound serious. If more info is required, just let me know!


Thanks very much in advance,

MJ
_______________________________________________
pve-user mailing list
pve-user@pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
