Do you have a firewall on the new server by any chance?
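If you want to rule that out quickly, something along these lines on r710e should show whether a host firewall is active and whether the mon port is reachable (just a sketch; it assumes Ubuntu's stock ufw/iptables tooling and the default mon port 6789):

    # check whether a host firewall is enabled on r710e
    sudo ufw status verbose
    sudo iptables -L -n

    # confirm the mon daemon (once started) is listening on the public address
    sudo ss -tlnp | grep 6789

    # from one of the existing mons, check that the port is reachable
    nc -zv r710e 6789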
On Sun, Jun 18, 2017 at 8:18 PM, Jim Forde <[email protected]> wrote:

> I have an eight node Ceph cluster running Jewel 10.2.5.
>
> One ceph-deploy node, four OSD nodes, and three monitor nodes.
> Ceph-deploy node is r710T.
> OSDs are r710a, r710b, r710c, and r710d.
> Mons are r710e, r710f, and r710g.
> Name resolution is in the hosts file on each node.
>
> Successfully removed monitor r710e from the cluster.
> Upgraded the ceph-deploy node r710T to Kraken 11.2.0 (ceph -v returns 11.2.0; all other nodes are still 10.2.5).
> "ceph -s" reports HEALTH_OK with 2 mons.
> Rebuilt r710e with the same OS (Ubuntu 14.04 LTS) and the same IP address.
> "ceph-deploy install --release kraken r710e" is successful, with ceph -v returning 11.2.0 on node r710e.
> "ceph-deploy admin r710e" is successful and puts the keyring in /etc/ceph/ceph.client.admin.keyring.
> "sudo chmod +r /etc/ceph/ceph.client.admin.keyring"
>
> Everything seems successful to this point. Then I run
> "ceph-deploy mon create r710e" and I get the following:
>
> [r710e][DEBUG ] ********************************************************************************
> [r710e][INFO  ] monitor: mon.r710e is currently at the state of probing
> [r710e][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.r710e.asok mon_status
> [r710e][WARNIN] r710e is not defined in `mon initial members`
> [r710e][WARNIN] monitor r710e does not exist in monmap
>
> r710e *is* in 'mon initial members'.
> It is in the ceph.conf file correctly (it was running before and the parameters have not changed), and the public and cluster networks are defined.
> It is the same physical server with the same (but freshly installed) OS and the same IP address.
>
> Looking at the local daemon mon_status on all three monitors I see:
> r710f and r710g see r710e as an "extra_probe_peer".
> r710e sees r710f and r710g as "extra_probe_peers".
>
> "ceph-deploy purge r710e" and "ceph-deploy purgedata r710e" with a reboot of the 2 mons brings the cluster back to HEALTH_OK.
>
> Not sure what is going on. Is Ceph allergic to single-node upgrades? Afraid to push the upgrade on all mons.
>
> What I have done:
> Rebuilt r710e with different hardware. Rebuilt with a different OS. Rebuilt with a different name and IP address. Same result.
> I have also restructured the NTP server. r710T is my NTP server on the cluster (HEALTH_OK prior to updating). I reset all mon nodes to get time from the Ubuntu default NTP sources. Same error.
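If the firewall looks clean, it may also be worth comparing what the surviving mons actually have in their monmap before re-adding r710e, since the warning says the monitor does not exist in the monmap. A rough sketch, run from r710f or r710g (the address placeholder is just illustrative):

    # dump the monmap as the current quorum sees it
    ceph mon dump

    # if r710e is not listed, it can be added to the map explicitly;
    # "ceph-deploy mon add r710e" (rather than "mon create") does this for you
    ceph mon add r710e <r710e-public-ip>:6789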
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
