That may be pointing in the right direction - I see
{
"style": "legacy",
"name": "mon.rhel1.robeckert.us",
"fsid": "fe3a7cb0-69ca-11eb-8d45-c86000d08867",
"systemd_unit": "[email protected]",
"enabled": false,
"state": "stopped",
"host_version": "16.2.5"
},
And
{
"style": "cephadm:v1",
"name": "mon.rhel1",
"fsid": "fe3a7cb0-69ca-11eb-8d45-c86000d08867",
"systemd_unit": "[email protected]",
"enabled": true,
"state": "running",
"service_name": "mon",
"ports": [],
"ip": null,
"deployed_by": [
"quay.io/ceph/ceph@sha256:5d042251e1faa1408663508099cf97b256364300365d403ca5563a518060abac",
"quay.io/ceph/ceph@sha256:8a0f6f285edcd6488e2c91d3f9fa43534d37d7a9b37db1e0ff6691aae6466530"
],
"rank": null,
"rank_generation": null,
"memory_request": null,
"memory_limit": null,
"container_id": null,
"container_image_name":
"quay.io/ceph/ceph@sha256:5d042251e1faa1408663508099cf97b256364300365d403ca5563a518060abac",
"container_image_id": null,
"container_image_digests": null,
"version": null,
"started": null,
"created": "2021-09-20T15:46:42.166486Z",
"deployed": "2021-09-20T15:46:41.136498Z",
"configured": "2021-09-20T15:47:23.002007Z"
}
As the output.
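For anyone hitting the same thing, the leftover entry can be picked out of the `cephadm ls` output by its "style" field. Below is a minimal Python sketch, assuming the command prints a JSON array of entries like the two above. The function name find_legacy_daemons and the trimmed SAMPLE data are illustrative only and are not part of cephadm.

```python
import json

# Trimmed copy of the two entries shown above (illustrative sample data).
SAMPLE = """
[
  {"style": "legacy", "name": "mon.rhel1.robeckert.us",
   "enabled": false, "state": "stopped"},
  {"style": "cephadm:v1", "name": "mon.rhel1",
   "enabled": true, "state": "running"}
]
"""


def find_legacy_daemons(cephadm_ls_json):
    """Return the entries whose "style" is "legacy".

    `cephadm_ls_json` is the text printed by `cephadm ls`,
    assumed here to be a JSON array of daemon descriptions.
    """
    daemons = json.loads(cephadm_ls_json)
    return [d for d in daemons if d.get("style") == "legacy"]


for daemon in find_legacy_daemons(SAMPLE):
    # Print the fields that identify the stale entry.
    print(daemon["name"], daemon.get("state"), daemon.get("enabled"))
```

On a real host the same function could be fed the captured output of `cephadm ls` instead of SAMPLE.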
In /var/lib/ceph/mon (not
/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon), there is a link:
ceph-rhel1.robeckert.us ->
/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/
I removed the link and the error did clear up. (hopefully it will stay gone
:-))
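A stale link like this can also be found without browsing the directory by hand. Below is a rough Python sketch that lists symlinks whose target no longer exists. It assumes the leftover link is dangling, which matches the "No such file or directory" error in the log but was not checked separately here. The helper name find_dangling_symlinks is made up for this example, and the script only reports links; it does not remove anything.

```python
import os


def find_dangling_symlinks(directory):
    """Return sorted (link_path, target) pairs for broken symlinks in `directory`."""
    dangling = []
    for entry in os.scandir(directory):
        # os.path.exists() follows symlinks, so it is False for a broken link.
        if entry.is_symlink() and not os.path.exists(entry.path):
            dangling.append((entry.path, os.readlink(entry.path)))
    return sorted(dangling)


if __name__ == "__main__":
    # Legacy monitor directory mentioned above; skipped if it is not present.
    legacy_mon_dir = "/var/lib/ceph/mon"
    if os.path.isdir(legacy_mon_dir):
        for link, target in find_dangling_symlinks(legacy_mon_dir):
            print(link, "->", target)
```

Reviewing the printed list before removing any link by hand keeps the check read-only.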
Thanks,
Rob
-----Original Message-----
From: Fyodor Ustinov <[email protected]>
Sent: Monday, September 20, 2021 2:01 PM
To: Robert W. Eckert <[email protected]>
Cc: ceph-users <[email protected]>
Subject: Re: [ceph-users] Getting cephadm "stderr:Inferring config" every
minute in log - for a monitor that doesn't exist and shouldn't exist
Hi!
It looks exactly the same as the problem I had.
Try the `cephadm ls` command on the `rhel1.robeckert.us` node.
----- Original Message -----
> From: "Robert W. Eckert" <[email protected]>
> To: "ceph-users" <[email protected]>
> Sent: Monday, 20 September, 2021 18:28:08
> Subject: [ceph-users] Getting cephadm "stderr:Inferring config" every
> minute in log - for a monitor that doesn't exist and shouldn't exist
> Hi- after the upgrade to 16.2.6, I am now seeing this error:
>
> 9/20/21 10:45:00 AM[ERR]cephadm exited with an error code: 1,
> stderr:Inferring config
> /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config
> ERROR: [Errno 2] No such file or directory:
> '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config'
> Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1366, in _remote_connection
>     yield (conn, connr)
>   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1263, in _run_cephadm
>     code, '\n'.join(err)))
> orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:Inferring config
> /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config
> ERROR: [Errno 2] No such file or directory:
> '/var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us/config'
>
> The rhel1 server has a monitor under
> /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1, and it
> is up and active. If I copy
> /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1 to
> /var/lib/ceph/fe3a7cb0-69ca-11eb-8d45-c86000d08867/mon.rhel1.robeckert.us,
> the error clears. Then cephadm removes the folder with the domain
> name, and the error starts showing up in the log again.
>
> After a few minutes, I get the all clear:
>
> 9/20/21 11:00:00 AM[INF]overall HEALTH_OK
>
> 9/20/21 10:58:38 AM[INF]Removing key for mon.
>
> 9/20/21 10:58:37 AM[INF]Removing daemon mon.rhel1.robeckert.us from
> rhel1.robeckert.us
>
> 9/20/21 10:58:37 AM[INF]Removing monitor rhel1.robeckert.us from monmap...
>
> 9/20/21 10:58:37 AM[INF]Safe to remove mon.rhel1.robeckert.us: not in
> monmap (['rhel1', 'story', 'cube'])
>
> 9/20/21 10:52:21 AM[INF]Cluster is now healthy
>
> 9/20/21 10:52:21 AM[INF]Health check cleared: CEPHADM_REFRESH_FAILED (was:
> failed to probe daemons or devices)
>
> 9/20/21 10:51:15 AM
>
>
> I checked all of the configurations and can't find any reason it wants
> the monitor with the domain.
>
> But then the errors start up again. I haven't found any messages
> before they start, so I am going to monitor more closely.
> This doesn't seem to affect any functionality, just lots of messages in the
> log.
>
> Thanks,
> Rob
>
> _______________________________________________
> ceph-users mailing list -- [email protected] To unsubscribe send an
> email to [email protected]