We have a new Ceph cluster, and when I follow the guide (http://ceph.com/docs/master/start/quick-ceph-deploy/), the section where you add additional monitors fails, and it almost seems like it's using an improper IP address.
We have 4 nodes:

- lts-mon
- lts-osd1
- lts-osd2
- lts-osd3

Using ceph-deploy, we created a new cluster with lts-mon as the initial monitor:

    ceph-deploy new lts-mon
    ceph-deploy install lts-mon lts-osd1 lts-osd2 lts-osd3
    ceph-deploy mon create-initial
    ceph-deploy osd prepare ....
    ....
    ceph-deploy mds lts-mon

The only modifications I made to ceph.conf were to add the public and cluster network settings and to set the OSD pool default size:

    [global]
    fsid = 5ca0e0f5-d367-48b8-97b4-48e8b12fd517
    mon_initial_members = lts-mon
    mon_host = 10.5.68.236
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
    filestore_xattr_use_omap = true
    osd_pool_default_size = 3
    public_network = 10.5.68.0/22
    cluster_network = 10.1.1.0/24

This all seemed fine, and after adding all of our OSDs, ceph -s reports:

    # ceph -s
        cluster f4adbd94-bf49-42f2-bd57-ebc7db9aa863
         health HEALTH_WARN
                too few PGs per OSD (1 < min 30)
         monmap e1: 1 mons at {lts-mon=10.5.68.236:6789/0}
                election epoch 1, quorum 0 lts-mon
         osdmap e471: 102 osds: 102 up, 102 in
          pgmap v973: 64 pgs, 1 pools, 0 bytes data, 0 objects
                515 GB used, 370 TB / 370 TB avail
                      64 active+clean

We have not set the default number of PGs yet, so that warning seems okay for now.

The problem comes when we try to add a new monitor:

    ceph-deploy mon create lts-osd1
    [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
    [ceph_deploy.cli][INFO  ] Invoked (1.5.25): /usr/local/bin/ceph-deploy mon create lts-osd1
    [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts lts-osd1
    [ceph_deploy.mon][DEBUG ] detecting platform for host lts-osd1 ...
    [lts-osd1][DEBUG ] connection detected need for sudo
    [lts-osd1][DEBUG ] connected to host: lts-osd1
    [lts-osd1][DEBUG ] detect platform information from remote host
    [lts-osd1][DEBUG ] detect machine type
    [ceph_deploy.mon][INFO  ] distro info: Ubuntu 14.04 trusty
    [lts-osd1][DEBUG ] determining if provided host has same hostname in remote
    [lts-osd1][DEBUG ] get remote short hostname
    [lts-osd1][DEBUG ] deploying mon to lts-osd1
    [lts-osd1][DEBUG ] get remote short hostname
    [lts-osd1][DEBUG ] remote hostname: lts-osd1
    [lts-osd1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
    [lts-osd1][DEBUG ] create the mon path if it does not exist
    [lts-osd1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-lts-osd1/done
    [lts-osd1][DEBUG ] create a done file to avoid re-doing the mon deployment
    [lts-osd1][DEBUG ] create the init path if it does not exist
    [lts-osd1][DEBUG ] locating the `service` executable...
    [lts-osd1][INFO  ] Running command: sudo initctl emit ceph-mon cluster=ceph id=lts-osd1
    [lts-osd1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.lts-osd1.asok mon_status
    [lts-osd1][DEBUG ] ********************************************************************************
    [lts-osd1][DEBUG ] status for monitor: mon.lts-osd1
    [lts-osd1][DEBUG ] {
    [lts-osd1][DEBUG ]   "election_epoch": 0,
    [lts-osd1][DEBUG ]   "extra_probe_peers": [
    [lts-osd1][DEBUG ]     "10.5.68.236:6789/0"
    [lts-osd1][DEBUG ]   ],
    [lts-osd1][DEBUG ]   "monmap": {
    [lts-osd1][DEBUG ]     "created": "0.000000",
    [lts-osd1][DEBUG ]     "epoch": 0,
    [lts-osd1][DEBUG ]     "fsid": "5ca0e0f5-d367-48b8-97b4-48e8b12fd517",
    [lts-osd1][DEBUG ]     "modified": "0.000000",
    [lts-osd1][DEBUG ]     "mons": [
    [lts-osd1][DEBUG ]       {
    [lts-osd1][DEBUG ]         "addr": "0.0.0.0:0/1",
    [lts-osd1][DEBUG ]         "name": "lts-mon",
    [lts-osd1][DEBUG ]         "rank": 0
    [lts-osd1][DEBUG ]       }
    [lts-osd1][DEBUG ]     ]
    [lts-osd1][DEBUG ]   },
    [lts-osd1][DEBUG ]   "name": "lts-osd1",
    [lts-osd1][DEBUG ]   "outside_quorum": [],
    [lts-osd1][DEBUG ]   "quorum": [],
    [lts-osd1][DEBUG ]   "rank": -1,
    [lts-osd1][DEBUG ]   "state": "probing",
    [lts-osd1][DEBUG ]   "sync_provider": []
    [lts-osd1][DEBUG ] }
    [lts-osd1][DEBUG ] ********************************************************************************
    [lts-osd1][INFO  ] monitor: mon.lts-osd1 is currently at the state of probing
    [lts-osd1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.lts-osd1.asok mon_status
    [lts-osd1][WARNIN] lts-osd1 is not defined in `mon initial members`
    [lts-osd1][WARNIN] monitor lts-osd1 does not exist in monmap

The monitor I was trying to add shows:

    2015-06-09 11:33:24.661466 7fef2a806700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
    2015-06-09 11:33:24.661478 7fef2a806700  0 -- 10.5.68.229:6789/0 >> 10.5.68.236:6789/0 pipe(0x3571000 sd=13 :40912 s=1 pgs=0 cs=0 l=0 c=0x34083c0).failed verifying authorize reply
    2015-06-09 11:33:24.763579 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
    2015-06-09 11:33:24.763651 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
    2015-06-09 11:33:25.825163 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
    2015-06-09 11:33:25.825259 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
    2015-06-09 11:33:26.661737 7fef2a806700  0 cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
    2015-06-09 11:33:26.661750 7fef2a806700  0 -- 10.5.68.229:6789/0 >> 10.5.68.236:6789/0 pipe(0x3571000 sd=13 :40914 s=1 pgs=0 cs=0 l=0 c=0x34083c0).failed verifying authorize reply
    2015-06-09 11:33:26.887973 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
    2015-06-09 11:33:26.888047 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
    2015-06-09 11:33:27.950014 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
    2015-06-09 11:33:27.950113 7fef2eb83700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished

All of our Google searching seems to indicate that there may be a clock skew, but the clocks are matched to within .001 seconds.

Any assistance is much appreciated. Thanks,

Mike C
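P.S. Since the addresses looked suspect, one quick local sanity check I could script (just a sketch using Python's stdlib ipaddress module; the hostnames and IPs are the ones that appear in the logs above, nothing Ceph itself runs) is to confirm that both monitor addresses actually fall inside the public_network range from ceph.conf:

```python
import ipaddress

# The public_network line from our ceph.conf.
public_net = ipaddress.ip_network("10.5.68.0/22")

# Monitor addresses as they appear in the daemon logs above.
mon_ips = {"lts-mon": "10.5.68.236", "lts-osd1": "10.5.68.229"}

for host, ip in mon_ips.items():
    in_net = ipaddress.ip_address(ip) in public_net
    print(f"{host} {ip} inside {public_net}: {in_net}")
```

Both checks come back True here (the /22 covers 10.5.68.0 through 10.5.71.255), so the subnet configuration at least looks consistent with the addresses the monitors are binding to.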
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com