Your Ceph cluster requires quorum to operate, and quorum is determined by your monitor nodes, not your OSD nodes; your earlier diagram doesn't show where the monitors are.

How many monitor nodes do you have, and where are they located?

nb. You should only have an odd number of monitor nodes.
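
A quick way to check, from any node that can still reach the cluster (standard Ceph commands, nothing Proxmox-specific):

  ceph mon stat        # lists all monitors and which of them are currently in quorum
  ceph quorum_status   # same information in more detail (JSON)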


On 5/10/2018 10:53 PM, Gilberto Nunes wrote:
Folks...

My Ceph servers are all on the same network: 10.10.10.0/24.
There is an optic fiber link between the two buildings, buildA and buildB, just to identify them.
When I first created the cluster, the 3 servers in buildB went down and the remaining Ceph servers kept working properly.
I do not understand why that doesn't happen anymore!
Sorry if I sound like a newbie! I am still learning about this!
---
Gilberto Nunes Ferreira

(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram

Skype: gilberto.nunes36





On Fri, 5 Oct 2018 at 09:44, Marcus Haarmann <[email protected]> wrote:

Gilberto,

the underlying problem is a Ceph problem and is not related to the VMs or to Proxmox.
Ceph requires a majority of the monitor nodes to be active.
Your setup seems to have six mon nodes, three per building, so quorum is lost as soon as the three servers on one side are gone.
Check "ceph -s" on each side to see whether Ceph reacts at all.
If it does not, there are probably not enough mons present.
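For example (host names taken from the ceph.conf you posted; pick any node that is reachable on each side):

  ssh pve-ceph01 ceph -s    # a node in the first building
  ssh pve-ceph04 ceph -s    # a node in the second building

If "ceph -s" hangs on one side, the monitors reachable from there have no quorum.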

Also, when one side is down you should see some OSD instances missing.
In that case Ceph itself might still be up, but your VMs, whose data is spread over the OSD disks, might block because their primary storage is not accessible.
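To see which OSDs drop out when one side goes down, the usual commands are:

  ceph osd stat    # summary: how many OSDs exist / are up / are in
  ceph osd tree    # per-host view, shows exactly which OSDs are down
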
The distribution of data over the OSD instances is steered by the CRUSH map.
You should make sure that enough copies are configured and that the CRUSH map is set up so that each side of your cluster holds at least one copy.
If the CRUSH map is mis-configured, all copies of your data may end up on the wrong side, resulting in Proxmox not being able to access the VM data.
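As a rough sketch (the bucket names buildA/buildB are only illustrative), you could add a "room" bucket per building, move the hosts into them, and use a rule that always places copies in both rooms, e.g. with a pool of size 4 / min_size 2:

  ceph osd crush add-bucket buildA room
  ceph osd crush add-bucket buildB room
  ceph osd crush move buildA root=default
  ceph osd crush move buildB root=default
  ceph osd crush move pve-ceph01 room=buildA    # same for pve-ceph02/03
  ceph osd crush move pve-ceph04 room=buildB    # same for pve-ceph05/06

and then, in the decompiled CRUSH map (crushtool), a rule such as:

  rule replicated_per_building {
      id 1
      type replicated
      min_size 2
      max_size 4
      step take default
      step choose firstn 2 type room         # pick both buildings
      step chooseleaf firstn 2 type host     # two hosts (and OSDs) per building
      step emit
  }

Note that this only covers data placement; the monitor quorum still needs a majority of mons reachable, which a two-building setup cannot give you on its own.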

Marcus Haarmann


Von: "Gilberto Nunes" <[email protected]>
An: "pve-user" <[email protected]>
Gesendet: Freitag, 5. Oktober 2018 14:31:20
Betreff: Re: [PVE-User] Proxmox CEPH 6 servers failures!

Nice... Perhaps if I create a VM on Proxmox01 and on Proxmox02 and join those VMs to the Ceph cluster, could that solve the quorum problem?
---
Gilberto Nunes Ferreira

(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram

Skype: gilberto.nunes36





On Fri, 5 Oct 2018 at 09:23, dorsy <[email protected]> wrote:

Your question has already been answered. You need a majority to have quorum.
On 2018. 10. 05. 14:10, Gilberto Nunes wrote:
Hi
Perhaps this can help:

https://imageshack.com/a/img921/6208/X7ha8R.png

I was thinking about it, and perhaps if I deploy a VM on each side with Proxmox and add those VMs to the Ceph cluster, maybe that could help!

thanks
---
Gilberto Nunes Ferreira

(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram

Skype: gilberto.nunes36





On Fri, 5 Oct 2018 at 03:55, Alexandre DERUMIER <[email protected]> wrote:

Hi,

Can you resend your diagram? It's impossible to read.


But you need quorum among the monitors for the cluster to keep working.
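(For reference: with N monitors, quorum needs floor(N/2) + 1 of them up. Your ceph.conf lists 6 monitors, so at least 4 must be reachable; losing the 3 in one building therefore stops the whole cluster.)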

----- Original Message -----
From: "Gilberto Nunes" <[email protected]>
To: "proxmoxve" <[email protected]>
Sent: Thursday, 4 October 2018 22:05:16
Subject: [PVE-User] Proxmox CEPH 6 servers failures!

Hi there

I have something like this:

CEPH01 ----|                                   |---- CEPH04
           |                                   |
CEPH02 ----|----------- Optic Fiber -----------|---- CEPH05
           |                                   |
CEPH03 ----|                                   |---- CEPH06

Sometimes, when the optic fiber link goes down and only CEPH01, CEPH02 and CEPH03 remain, the entire cluster fails!
I can't figure out the cause!

ceph.conf

[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 10.10.10.0/24
fsid = e67534b4-0a66-48db-ad6f-aa0868e962d8
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
public network = 10.10.10.0/24

[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring

[mon.pve-ceph01]
host = pve-ceph01
mon addr = 10.10.10.100:6789
mon osd allow primary affinity = true

[mon.pve-ceph02]
host = pve-ceph02
mon addr = 10.10.10.110:6789
mon osd allow primary affinity = true

[mon.pve-ceph03]
host = pve-ceph03
mon addr = 10.10.10.120:6789
mon osd allow primary affinity = true

[mon.pve-ceph04]
host = pve-ceph04
mon addr = 10.10.10.130:6789
mon osd allow primary affinity = true

[mon.pve-ceph05]
host = pve-ceph05
mon addr = 10.10.10.140:6789
mon osd allow primary affinity = true

[mon.pve-ceph06]
host = pve-ceph06
mon addr = 10.10.10.150:6789
mon osd allow primary affinity = true

Any help will be welcome!

---
Gilberto Nunes Ferreira

(47) 3025-5907
(47) 99676-7530 - Whatsapp / Telegram

Skype: gilberto.nunes36



--
Lindsay
