Hi!
Now I have the same situation on all monitors, without any reboot:
root@bes-mon3:~# ceph --verbose -w
Error initializing cluster client: Error
root@bes-mon3:~# ceph --admin-daemon /var/run/ceph/ceph-mon.3.asok mon_status
{ "name": "3",
"rank": 2,
"state": "peon",
"election_epoch": 86,
"quorum": [
0,
1,
2],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 3,
"fsid": "fffeafa2-a664-48a7-979a-517e3ffa0da1",
"modified": "2014-03-15 11:52:21.182767",
"created": "2014-03-15 11:51:42.321256",
"mons": [
{ "rank": 0,
"name": "1",
"addr": "10.92.8.80:6789\/0"},
{ "rank": 1,
"name": "2",
"addr": "10.92.8.81:6789\/0"},
{ "rank": 2,
"name": "3",
"addr": "10.92.8.82:6789\/0"}]}}
root@bes-mon3:~# ceph --admin-daemon /var/run/ceph/ceph-mon.3.asok quorum_status
{ "election_epoch": 86,
"quorum": [
0,
1,
2],
"quorum_names": [
"1",
"2",
"3"],
"quorum_leader_name": "1",
"monmap": { "epoch": 3,
"fsid": "fffeafa2-a664-48a7-979a-517e3ffa0da1",
"modified": "2014-03-15 11:52:21.182767",
"created": "2014-03-15 11:51:42.321256",
"mons": [
{ "rank": 0,
"name": "1",
"addr": "10.92.8.80:6789\/0"},
{ "rank": 1,
"name": "2",
"addr": "10.92.8.81:6789\/0"},
{ "rank": 2,
"name": "3",
"addr": "10.92.8.82:6789\/0"}]}}
root@bes-mon3:~# ceph --admin-daemon /var/run/ceph/ceph-mon.3.asok version
{"version":"0.72.2"}
The RBD image mounted from this cluster seems to be fine; reads and writes
don't hang.
Pavel.
On 23 March 2014, at 8:49, Kyle Bader <[email protected]> wrote:
>> I have two nodes with 8 OSDs on each. The first node runs 2 monitors in
>> different virtual machines (mon.1 and mon.2), and the second node runs mon.3.
>> After several reboots (I have tested power-failure scenarios), "ceph -w" on
>> node 2 always fails with the message:
>>
>> root@bes-mon3:~# ceph --verbose -w
>> Error initializing cluster client: Error
>
> The cluster is simply protecting itself from a split brain situation.
> Say you have:
>
> mon.1 mon.2 mon.3
>
> If mon.1 fails, no big deal, you still have 2/3 so no problem.
>
> Now instead, say mon.1 is separated from mon.2 and mon.3 because of a
> network partition (trunk failure, whatever). If one monitor of the
> three could elect itself as leader then you might have divergence
> between your monitors. Self-elected mon.1 thinks it's the leader and
> mon.{2,3} have elected a leader amongst themselves. The harsh reality
> is you really need to have monitors on 3 distinct physical hosts to
> protect against the failure of a physical host.
>
> --
>
> Kyle
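
For reference, the majority Kyle describes is floor(n/2)+1 monitors, so with three monitors any two that can still reach each other keep quorum, while an isolated one cannot. A minimal shell sketch of that arithmetic (illustrative only, not cluster output):

$ for n in 1 2 3 4 5; do echo "$n mons -> quorum needs $(( n / 2 + 1 ))"; done
1 mons -> quorum needs 1
2 mons -> quorum needs 2
3 mons -> quorum needs 2
4 mons -> quorum needs 3
5 mons -> quorum needs 3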