The Calamari master was ess68 and is now essperf3.
On all cluster nodes the following files now have 'master: essperf3':
/etc/salt/minion
/etc/salt/minion.d/calamari.conf
/etc/diamond/diamond.conf
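To double-check any one node (octeon109 shown as an example; note that the salt-minion service has to be restarted after the master change before the minion reconnects - 'service salt-minion restart' assumes sysvinit/upstart, systemd distros would use 'systemctl restart salt-minion'):

root@octeon109:~# cat /etc/salt/minion.d/calamari.conf
master: essperf3
root@octeon109:~# service salt-minion restart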
'salt \* ceph.get_heartbeats' is being run on essperf3. Here's a 'salt \*
test.ping' from the essperf3 Calamari master to the cluster. I've also included
a quick cluster sanity check with the output of 'ceph -s' and 'ceph osd tree',
and, for your reading pleasure, the output of 'salt octeon109
ceph.get_heartbeats', since I suspect there might be a missing field in the
monitor response.
root@essperf3:/etc/ceph# salt \* test.ping
octeon108:
    True
octeon114:
    True
octeon111:
    True
octeon101:
    True
octeon106:
    True
octeon109:
    True
octeon118:
    True
root@essperf3:/etc/ceph# ceph osd tree
# id    weight  type name               up/down reweight
-1      7       root default
-4      1               host octeon108
0       1                       osd.0   up      1
-2      1               host octeon111
1       1                       osd.1   up      1
-5      1               host octeon115
2       1                       osd.2   DNE
-6      1               host octeon118
3       1                       osd.3   up      1
-7      1               host octeon114
4       1                       osd.4   up      1
-8      1               host octeon106
5       1                       osd.5   up      1
-9      1               host octeon101
6       1                       osd.6   up      1
root@essperf3:/etc/ceph# ceph -s
    cluster 868bfacc-e492-11e4-89fa-000fb711110c
     health HEALTH_OK
     monmap e1: 1 mons at {octeon109=209.243.160.70:6789/0}, election epoch 1, quorum 0 octeon109
     osdmap e80: 6 osds: 6 up, 6 in
      pgmap v26765: 728 pgs, 2 pools, 20070 MB data, 15003 objects
            60604 MB used, 2734 GB / 2793 GB avail
                 728 active+clean
root@essperf3:/etc/ceph#
root@essperf3:/etc/ceph# salt octeon109 ceph.get_heartbeats
octeon109:
    ----------
    - boot_time:
        1430784431
    - ceph_version:
        0.80.8-0.el6
    - services:
        ----------
        ceph-mon.octeon109:
            ----------
            cluster:
                ceph
            fsid:
                868bfacc-e492-11e4-89fa-000fb711110c
            id:
                octeon109
            status:
                ----------
                election_epoch:
                    1
                extra_probe_peers:
                monmap:
                    ----------
                    created:
                        2015-04-16 23:50:52.412686
                    epoch:
                        1
                    fsid:
                        868bfacc-e492-11e4-89fa-000fb711110c
                    modified:
                        2015-04-16 23:50:52.412686
                    mons:
                        ----------
                        - addr:
                            209.243.160.70:6789/0
                        - name:
                            octeon109
                        - rank:
                            0
                    name:
                        octeon109
                    outside_quorum:
                    quorum:
                        - 0
                    rank:
                        0
                    state:
                        leader
                    sync_provider:
            type:
                mon
            version:
                0.86
    ----------
    - 868bfacc-e492-11e4-89fa-000fb711110c:
        ----------
        fsid:
            868bfacc-e492-11e4-89fa-000fb711110c
        name:
            ceph
        versions:
            ----------
            config:
                87f175c60e5c7ec06c263c556056fbcb
            health:
                a907d0ec395713369b4843381ec31bc2
            mds_map:
                1
            mon_map:
                1
            mon_status:
                1
            osd_map:
                80
            pg_summary:
                7e29d7cc93cfced8f3f146cc78f5682f
root@essperf3:/etc/ceph#
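Per Gregory's suggestion below I'll also probe the REST API on the new master
to see whether the cluster shows up there. A minimal check, assuming the
default Calamari install serving HTTP on port 80 (the API may require an
authenticated session before it returns anything useful):

root@essperf3:/etc/ceph# curl -s http://essperf3/api/v2/cluster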
> -----Original Message-----
> From: Gregory Meno [mailto:[email protected]]
> Sent: Tuesday, May 12, 2015 5:03 PM
> To: Bruce McFarland
> Cc: [email protected]; [email protected]; ceph-devel
> ([email protected])
> Subject: Re: [ceph-calamari] Does anyone understand Calamari??
>
> Bruce,
>
> It is great to hear that salt is reporting status from all the nodes in the
> cluster.
>
> Let me see if I understand your question:
>
> You want to know what conditions cause us to recognize a working cluster?
>
> see
> https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/manager/manager.py#L135
>
> https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/manager/manager.py#L349
>
> and
>
> https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/manager/cluster_monitor.py
>
>
> Let’s check whether you need to be digging into that level of detail:
>
> You switched to a new instance of calamari and it is not recognizing the
> cluster.
>
> You want to know what you are overlooking? Would you please clarify with
> some hostnames?
>
> i.e. let’s say that your old calamari node was called calamariA and that your
> new node is calamariB.
>
> From which node are you running the get_heartbeats?
>
> What is the master setting in the minion config files out on the nodes of the
> cluster? If things are set up correctly they would look like this:
>
> [root@node1 shadow_man]# cat /etc/salt/minion.d/calamari.conf
> master: calamariB
>
>
> If this is the case, the thing I would check is whether the
> http://calamariB/api/v2/cluster endpoint is reporting anything.
>
> hope this helps,
> Gregory
>
> > On May 12, 2015, at 4:34 PM, Bruce McFarland <[email protected]> wrote:
> >
> > Increasing the audience since ceph-calamari is not responsive. What salt
> > event/info does the Calamari master expect to see from the ceph-mon to
> > determine there is a working cluster? I had to change servers hosting the
> > calamari master and can’t get the new machine to recognize the cluster.
> > The ‘salt \* ceph.get_heartbeats’ returns monmap, fsid, ver, epoch, etc. for
> > the monitor and all of the OSDs. Can anyone point me to docs or code that
> > might enlighten me to what I’m overlooking? Thanks.