Do you have a firewall enabled on the new server, by any chance?
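
If a firewall came back enabled on the rebuilt node, the surviving mons
would stall in probing exactly like this. A quick check, assuming Ubuntu's
default ufw (the mon port is 6789/tcp on Jewel/Kraken):

    sudo ufw status
    sudo iptables -L -n
    # only if a firewall turns out to be active:
    sudo ufw allow 6789/tcp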

On Sun, Jun 18, 2017 at 8:18 PM, Jim Forde <[email protected]> wrote:

> I have an eight-node Ceph cluster running Jewel 10.2.5.
>
> One ceph-deploy node, four OSD nodes, and three monitor nodes.
>
> The ceph-deploy node is r710T.
>
> The OSD nodes are r710a, r710b, r710c, and r710d.
>
> The mon nodes are r710e, r710f, and r710g.
>
> Name resolution is handled via the hosts file on each node.
>
>
>
> Successfully removed monitor r710e from the cluster.
>
> Upgraded the ceph-deploy node r710T to Kraken 11.2.0 (ceph -v returns
> 11.2.0; all other nodes are still on 10.2.5).
>
> ceph -s reports HEALTH_OK with 2 mons.
>
> Rebuilt r710e with the same OS (Ubuntu 14.04 LTS) and the same IP address.
>
> “ceph-deploy install --release kraken r710e” is successful, with ceph -v
> returning 11.2.0 on node r710e.
>
> “ceph-deploy admin r710e” is successful and puts the keyring in
> /etc/ceph/ceph.client.admin.keyring
>
> “sudo chmod +r /etc/ceph/ceph.client.admin.keyring”
>
>
>
> Everything seems successful to this point.
>
> Then I run
>
> “ceph-deploy mon create r710e” and I get the following:
>
>
>
> [r710e][DEBUG ] ********************************************************************************
>
> [r710e][INFO  ] monitor: mon.r710e is currently at the state of probing
>
> [r710e][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.r710e.asok mon_status
>
> [r710e][WARNIN] r710e is not defined in `mon initial members`
>
> [r710e][WARNIN] monitor r710e does not exist in monmap
>
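> (For reference, what the surviving mons actually have in their monmap can
> be confirmed with “ceph mon dump”.)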
>
>
> r710e is in ‘mon initial members’.
>
> It is in the ceph.conf file correctly (it was running before, and the
> parameters have not changed). Public and cluster networks are defined.
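>
> For reference, the relevant section looks like this (the addresses below
> are placeholders, not my real ones):
>
>     [global]
>     mon initial members = r710e, r710f, r710g
>     mon host = 192.168.1.15, 192.168.1.16, 192.168.1.17
>     public network = 192.168.1.0/24
>     cluster network = 192.168.2.0/24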
>
> It is the same physical server with the same (but freshly installed) OS
> and the same IP address.
>
> Looking at the local daemon mon_status on all three monitors, I see:
>
> r710f and r710g see r710e in “extra_probe_peers”.
>
> r710e sees r710f and r710g in “extra_probe_peers”.
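>
> (Each queried via the admin socket, same command as in the log above,
> e.g. “sudo ceph --admin-daemon /var/run/ceph/ceph-mon.r710f.asok
> mon_status”.)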
>
>
>
> “ceph-deploy purge r710e” and “ceph-deploy purgedata r710e” with a reboot
> of the two remaining mons brings the cluster back to HEALTH_OK.
>
>
>
> Not sure what is going on. Is Ceph allergic to single-node upgrades? I'm
> afraid to push the upgrade to all mons.
>
>
>
> What I have done:
>
> Rebuilt r710e with different hardware. Rebuilt with a different OS.
> Rebuilt with a different name and IP address. Same result.
>
> I have also restructured the NTP setup. r710T is the NTP server for the
> cluster (HEALTH_OK prior to updating). I reset all mon nodes to get time
> from the default Ubuntu NTP sources. Same error.
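>
> (Sync state on each mon node can be verified with “ntpq -p”.)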
>