The Calamari master was ess68 and is now essperf3.
On all cluster nodes the following files now have 'master: essperf3':
/etc/salt/minion
/etc/salt/minion.d/calamari.conf
/etc/diamond/diamond.conf
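To double-check any one node (octeon109 shown as an example; note that the salt-minion service has to be restarted after the master change before the minion reconnects - 'service salt-minion restart' assumes sysvinit/upstart, systemd distros would use 'systemctl restart salt-minion'):

root@octeon109:~# cat /etc/salt/minion.d/calamari.conf
master: essperf3
root@octeon109:~# service salt-minion restart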
'salt \* ceph.get_heartbeats' is being run on essperf3. Here's a 'salt \*
test.ping' from the essperf3 Calamari master to the cluster. I've also included
a quick cluster sanity check with the output of 'ceph -s' and 'ceph osd tree',
and, for your reading pleasure, the output of 'salt octeon109
ceph.get_heartbeats', since I suspect there might be a missing field in the
monitor response.
root@essperf3:/etc/ceph# salt \* test.ping
octeon108:
    True
octeon114:
    True
octeon111:
    True
octeon101:
    True
octeon106:
    True
octeon109:
    True
octeon118:
    True
root@essperf3:/etc/ceph# ceph osd tree
# id    weight  type name               up/down reweight
-1      7       root default
-4      1               host octeon108
0       1                       osd.0   up      1
-2      1               host octeon111
1       1                       osd.1   up      1
-5      1               host octeon115
2       1                       osd.2   DNE
-6      1               host octeon118
3       1                       osd.3   up      1
-7      1               host octeon114
4       1                       osd.4   up      1
-8      1               host octeon106
5       1                       osd.5   up      1
-9      1               host octeon101
6       1                       osd.6   up      1
root@essperf3:/etc/ceph# ceph -s
    cluster 868bfacc-e492-11e4-89fa-000fb711110c
     health HEALTH_OK
     monmap e1: 1 mons at {octeon109=209.243.160.70:6789/0}, election epoch 1, quorum 0 octeon109
     osdmap e80: 6 osds: 6 up, 6 in
      pgmap v26765: 728 pgs, 2 pools, 20070 MB data, 15003 objects
            60604 MB used, 2734 GB / 2793 GB avail
                 728 active+clean
root@essperf3:/etc/ceph#
root@essperf3:/etc/ceph# salt octeon109 ceph.get_heartbeats
octeon109:
    ----------
    - boot_time:
        1430784431
    - ceph_version:
        0.80.8-0.el6
    - services:
        ----------
        ceph-mon.octeon109:
            ----------
            cluster:
                ceph
            fsid:
                868bfacc-e492-11e4-89fa-000fb711110c
            id:
                octeon109
            status:
                ----------
                election_epoch:
                    1
                extra_probe_peers:
                monmap:
                    ----------
                    created:
                        2015-04-16 23:50:52.412686
                    epoch:
                        1
                    fsid:
                        868bfacc-e492-11e4-89fa-000fb711110c
                    modified:
                        2015-04-16 23:50:52.412686
                    mons:
                        ----------
                        - addr:
                            209.243.160.70:6789/0
                        - name:
                            octeon109
                        - rank:
                            0
                    name:
                        octeon109
                    outside_quorum:
                    quorum:
                        - 0
                    rank:
                        0
                    state:
                        leader
                    sync_provider:
            type:
                mon
            version:
                0.86
    ----------
    - 868bfacc-e492-11e4-89fa-000fb711110c:
        ----------
        fsid:
            868bfacc-e492-11e4-89fa-000fb711110c
        name:
            ceph
        versions:
            ----------
            config:
                87f175c60e5c7ec06c263c556056fbcb
            health:
                a907d0ec395713369b4843381ec31bc2
            mds_map:
                1
            mon_map:
                1
            mon_status:
                1
            osd_map:
                80
            pg_summary:
                7e29d7cc93cfced8f3f146cc78f5682f
root@essperf3:/etc/ceph#
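Per Gregory's suggestion below I'll also probe the REST API on the new master
to see whether the cluster shows up there. A minimal check, assuming the
default Calamari install serving HTTP on port 80 (the API may require an
authenticated session before it returns anything useful):

root@essperf3:/etc/ceph# curl -s http://essperf3/api/v2/cluster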
> -----Original Message-----
> From: Gregory Meno [mailto:[email protected]]
> Sent: Tuesday, May 12, 2015 5:03 PM
> To: Bruce McFarland
> Cc: [email protected]; [email protected]; ceph-devel
> ([email protected])
> Subject: Re: [ceph-calamari] Does anyone understand Calamari??
>
> Bruce,
>
> It is great to hear that salt is reporting status from all the nodes in the
> cluster.
>
> Let me see if I understand your question:
>
> You want to know what conditions cause us to recognize a working cluster?
>
> see
> https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/manager/manager.py#L135
>
> https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/manager/manager.py#L349
>
> and
>
> https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/manager/cluster_monitor.py
>
>
> Let’s check whether you need to be digging into that level of detail:
>
> You switched to a new instance of calamari and it is not recognizing the
> cluster.
>
> You want to know what you are overlooking? Would you please clarify with
> some hostnames?
>
> i.e. let’s say that your old calamari node was called calamariA and that your
> new node is calamariB.
>
> From which node are you running the get_heartbeats?
>
> What is the master setting in the minion config files out on the nodes of the
> cluster? If things are set up correctly they would look like this:
>
> [root@node1 shadow_man]# cat /etc/salt/minion.d/calamari.conf
> master: calamariB
>
>
> If this is the case, the thing I would check is whether the
> http://calamariB/api/v2/cluster endpoint is reporting anything.
>
> hope this helps,
> Gregory
>
> > On May 12, 2015, at 4:34 PM, Bruce McFarland <[email protected]> wrote:
> >
> > Increasing the audience since ceph-calamari is not responsive. What salt
> > event/info does the Calamari master expect to see from the ceph-mon to
> > determine there is a working cluster? I had to change servers hosting the
> > calamari master and can’t get the new machine to recognize the cluster.
> > The ‘salt \* ceph.get_heartbeats’ returns monmap, fsid, ver, epoch, etc. for
> > the monitor and all of the OSDs. Can anyone point me to docs or code that
> > might enlighten me to what I’m overlooking? Thanks.