The OSDs are up and in; the problem is with the PGs, as you can see below.
root@ceph-mon1:~# ceph -s
cluster:
id: 43f5d6b4-74b0-4281-92ab-940829d3ee5e
health: HEALTH_ERR
1/3 mons down, quorum ceph-mon1,ceph-mon3
14/32863 objects unfound (0.043%)
Possible
Could your hardware be faulty?
Are you trying to redeploy the faulty monitor, or a whole new cluster?
If you are trying to fix your cluster, you should focus on the OSDs.
A cluster can run without big trouble on 2 monitors for a few days (if not
years…).
-
Etienne Menguy
etienne.men...@croit.io
Hello team
Below is the error I am getting once I try to redeploy the same cluster:
TASK [ceph-mon : recursively fix ownership of monitor directory]
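For reference, that ceph-ansible task boils down to a recursive chown of the
mon data directory. A manual equivalent would be something like this (a
sketch, assuming the default cluster name "ceph" and the mon id "ceph-mon1"):

# give the ceph user ownership of the monitor data directory
chown -R ceph:ceph /var/lib/ceph/mon/ceph-ceph-mon1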
Have you tried to restart one of the OSDs that seems to block PG recovery?
I don’t think increasing PGs can help.
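For example (a sketch, assuming a systemd deployment; the OSD id 3 below is a
placeholder — pick one of the OSDs acting for the stuck PGs):

# list PGs that are stuck, with the OSDs acting for them
ceph pg dump_stuck unclean
# restart one suspect OSD, on the host where it runs
systemctl restart ceph-osd@3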
-
Etienne Menguy
etienne.men...@croit.io
> On 29 Oct 2021, at 11:53, Michel Niyoyita wrote:
>
> Hello Eugen
>
> The failure_domain is host level and the crush rule is replicated_rule
Hello Eugen
The failure_domain is host level and the crush rule is replicated_rule. In
the troubleshooting process I changed the PG count of pool 5 from 32 to 128
to see if there would be any change, and it has the default replica count (3).
Thanks for your continuous help
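For reference, that PG change corresponds to something like the following (a
sketch, with <pool-name> standing in for pool 5's actual name):

# raise the placement group count; pgp_num should follow pg_num
ceph osd pool set <pool-name> pg_num 128
ceph osd pool set <pool-name> pgp_num 128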
On Fri, Oct 29, 2021 at 11:44 AM Etienne Menguy wrote:
> Is there a way you can force the mon to rejoin the quorum? I tried to
> restart it but nothing changed. I guess it is the cause, if I am not mistaken.
No, but with quorum_status you can check the monitor status and whether it’s
trying to join the quorum.
You may have to use the daemon socket interface (asok) to query the failing
monitor directly.
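For example (a sketch; the mon id "ceph-mon2" is taken from the prompts in
this thread, and the asok path assumes a default package install):

# from any node with a working client: who is in quorum, who is missing
ceph quorum_status --format json-pretty
# on the failing mon's host, query it over its admin socket
ceph daemon mon.ceph-mon2 mon_status
# equivalent, giving the socket path explicitly
ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon2.asok mon_status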
Dear Etienne
Is there a way you can force the mon to rejoin the quorum? I tried to restart
it but nothing changed. I guess it is the cause, if I am not mistaken.
Below is the pg query output:
root@ceph-mon2:~# ceph pg 5.10 query
{
"snap_trimq": "[]",
"snap_trimq_len": 0,
"state":
Also what does the crush rule look like for pool 5 and what is the
failure-domain?
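For example (a sketch, with <pool-name> standing in for pool 5's actual name):

# which crush rule the pool uses, and what that rule contains
ceph osd pool get <pool-name> crush_rule
ceph osd crush rule dump replicated_rule
# the tree shows the hosts/OSDs the failure domain maps onto
ceph osd tree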
Quoting Etienne Menguy:
> With “ceph pg x.y query” you can check why it’s complaining.
> x.y for pg id, like 5.77
> It would also be interesting to check why mon fails to rejoin
> quorum, it may give you hints
With “ceph pg x.y query” you can check why it’s complaining.
x.y for pg id, like 5.77
It would also be interesting to check why the mon fails to rejoin the quorum;
it may give you hints about your OSD issues.
-
Etienne Menguy
etienne.men...@croit.io
> On 29 Oct 2021, at 10:34, Michel Niyoyita wrote:
Hello Etienne
This is the ceph -s output
root@ceph-mon1:~# ceph -s
cluster:
id: 43f5d6b4-74b0-4281-92ab-940829d3ee5e
health: HEALTH_ERR
1/3 mons down, quorum ceph-mon1,ceph-mon3
14/47681 objects unfound (0.029%)
1 scrub errors
Hi,
Please share “ceph -s” output.
-
Etienne Menguy
etienne.men...@croit.io
> On 29 Oct 2021, at 10:03, Michel Niyoyita wrote:
>
> Hello team
>
> I am running a ceph cluster with 3 monitors and 4 OSD nodes running 3 OSDs
> each. I deployed my ceph cluster using ansible, with ubuntu 20.04 as the OS.