[ceph-users] Re: Ceph standby-replay metadata server: MDS internal heartbeat is not healthy

Patrick Donnelly Wed, 19 Feb 2020 20:17:44 -0800

Hi Martin,

On Thu, Feb 13, 2020 at 4:10 AM Martin Palma <[email protected]> wrote:
>
> Hi all,
>
> today we observed that, all of a sudden, our standby-replay metadata
> server continuously writes the following logs:
>
> 2020-02-13 11:56:50.216102 7fd2ad229700  1 heartbeat_map is_healthy
> 'MDSRank' had timed out after 15
> 2020-02-13 11:56:50.287699 7fd2ad229700  0 mds.beacon.dcucmds401
> Skipping beacon heartbeat to monitors (last acked 100.836s ago); MDS
> internal heartbeat is not healthy!
>
> and its memory usage grows until no memory is available any more, at
> which point the service gets restarted and then stops. The odd thing
> is that on the active MDS we see neither these log messages nor any
> increase in memory.
>
> We are running ceph version 12.2.10 on all nodes of our Ceph cluster.
> Any suggestions?


Please increase debugging on the standby-replay daemon and share the logs.
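
For reference, a minimal sketch of one way to raise the debug level through
the daemon's admin socket on the MDS host. The daemon name mds.dcucmds401 is
an assumption read off the "mds.beacon.dcucmds401" prefix in the quoted log,
and debug_mds 20 / debug_ms 1 are commonly used values, not something
prescribed in this thread. The helper names are hypothetical. Running the
script only prints the commands; nothing is executed unless apply() is called.

```python
import subprocess

# Assumed values: the daemon name is inferred from the quoted log prefix,
# and the levels are ones commonly used when debugging an MDS.
DAEMON = "mds.dcucmds401"
DEBUG_LEVELS = {"debug_mds": "20", "debug_ms": "1"}


def debug_commands(daemon=DAEMON, levels=DEBUG_LEVELS):
    """Build one `ceph daemon <name> config set <option> <value>` per option."""
    return [["ceph", "daemon", daemon, "config", "set", opt, val]
            for opt, val in levels.items()]


def apply(commands):
    """Run each command on the MDS host; raises if any command fails."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for cmd in debug_commands():
        print(" ".join(cmd))
```

With default settings the resulting MDS log is usually found under
/var/log/ceph/ on that host. Level 20 is very verbose, so it is worth
lowering the values again once the problem has been captured.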

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D