We have a new Ceph cluster, and when I follow the guide
(http://ceph.com/docs/master/start/quick-ceph-deploy/) to the section
where you add additional monitors, it fails, and it almost seems like
it's using an improper IP address.
We have 4 nodes:
- lts-mon
- lts-osd1
- lts-osd2
- lts-osd3
Using ceph-deploy, we created a new cluster with lts-mon as the
initial monitor:
ceph-deploy new lts-mon
ceph-deploy install lts-mon lts-osd1 lts-osd2 lts-osd3
ceph-deploy mon create-initial
ceph-deploy osd prepare ....
....
ceph-deploy mds create lts-mon
The only modifications I made to ceph.conf were to include the public and
cluster network settings, and set the osd pool default size:
[global]
fsid = 5ca0e0f5-d367-48b8-97b4-48e8b12fd517
mon_initial_members = lts-mon
mon_host = 10.5.68.236
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_pool_default_size = 3
public_network = 10.5.68.0/22
cluster_network = 10.1.1.0/24
This all seemed fine, and after adding all of our OSDs, ceph -s reports:
# ceph -s
cluster f4adbd94-bf49-42f2-bd57-ebc7db9aa863
health HEALTH_WARN
too few PGs per OSD (1 < min 30)
monmap e1: 1 mons at {lts-mon=10.5.68.236:6789/0}
election epoch 1, quorum 0 lts-mon
osdmap e471: 102 osds: 102 up, 102 in
pgmap v973: 64 pgs, 1 pools, 0 bytes data, 0 objects
515 GB used, 370 TB / 370 TB avail
64 active+clean
We have not yet set the default PG count for the pool, so that warning seems acceptable for now.
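For reference, the usual rule of thumb from the Ceph docs is roughly (target PGs per OSD × number of OSDs) / replica size, rounded up to the next power of two; here is a quick sketch of that calculation for our numbers (the target of 100 PGs per OSD is the commonly cited default, not something from our config):

```python
# Rough pg_num sizing per the common Ceph rule of thumb:
#   pg_num ~= (target_pgs_per_osd * num_osds) / pool_size,
# rounded up to the next power of two.

def suggested_pg_num(num_osds, pool_size, target_pgs_per_osd=100):
    raw = target_pgs_per_osd * num_osds / pool_size
    # Round up to the next power of two.
    pg_num = 1
    while pg_num < raw:
        pg_num *= 2
    return pg_num

# Our cluster: 102 OSDs, osd_pool_default_size = 3.
print(suggested_pg_num(102, 3))  # 4096
```

With 102 OSDs and size 3, that comes out to 4096 rather than the 64 we currently have, which matches the "too few PGs per OSD" warning above.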
The problem we have is when adding a new monitor:
ceph-deploy mon create lts-osd1
[ceph_deploy.conf][DEBUG ] found configuration file at:
/root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.25): /usr/local/bin/ceph-deploy mon
create lts-osd1
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts lts-osd1
[ceph_deploy.mon][DEBUG ] detecting platform for host lts-osd1 ...
[lts-osd1][DEBUG ] connection detected need for sudo
[lts-osd1][DEBUG ] connected to host: lts-osd1
[lts-osd1][DEBUG ] detect platform information from remote host
[lts-osd1][DEBUG ] detect machine type
[ceph_deploy.mon][INFO ] distro info: Ubuntu 14.04 trusty
[lts-osd1][DEBUG ] determining if provided host has same hostname in remote
[lts-osd1][DEBUG ] get remote short hostname
[lts-osd1][DEBUG ] deploying mon to lts-osd1
[lts-osd1][DEBUG ] get remote short hostname
[lts-osd1][DEBUG ] remote hostname: lts-osd1
[lts-osd1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[lts-osd1][DEBUG ] create the mon path if it does not exist
[lts-osd1][DEBUG ] checking for done path:
/var/lib/ceph/mon/ceph-lts-osd1/done
[lts-osd1][DEBUG ] create a done file to avoid re-doing the mon deployment
[lts-osd1][DEBUG ] create the init path if it does not exist
[lts-osd1][DEBUG ] locating the `service` executable...
[lts-osd1][INFO ] Running command: sudo initctl emit ceph-mon cluster=ceph
id=lts-osd1
[lts-osd1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon
/var/run/ceph/ceph-mon.lts-osd1.asok mon_status
[lts-osd1][DEBUG ]
********************************************************************************
[lts-osd1][DEBUG ] status for monitor: mon.lts-osd1
[lts-osd1][DEBUG ] {
[lts-osd1][DEBUG ] "election_epoch": 0,
[lts-osd1][DEBUG ] "extra_probe_peers": [
[lts-osd1][DEBUG ] "10.5.68.236:6789/0"
[lts-osd1][DEBUG ] ],
[lts-osd1][DEBUG ] "monmap": {
[lts-osd1][DEBUG ] "created": "0.000000",
[lts-osd1][DEBUG ] "epoch": 0,
[lts-osd1][DEBUG ] "fsid": "5ca0e0f5-d367-48b8-97b4-48e8b12fd517",
[lts-osd1][DEBUG ] "modified": "0.000000",
[lts-osd1][DEBUG ] "mons": [
[lts-osd1][DEBUG ] {
[lts-osd1][DEBUG ] "addr": "0.0.0.0:0/1",
[lts-osd1][DEBUG ] "name": "lts-mon",
[lts-osd1][DEBUG ] "rank": 0
[lts-osd1][DEBUG ] }
[lts-osd1][DEBUG ] ]
[lts-osd1][DEBUG ] },
[lts-osd1][DEBUG ] "name": "lts-osd1",
[lts-osd1][DEBUG ] "outside_quorum": [],
[lts-osd1][DEBUG ] "quorum": [],
[lts-osd1][DEBUG ] "rank": -1,
[lts-osd1][DEBUG ] "state": "probing",
[lts-osd1][DEBUG ] "sync_provider": []
[lts-osd1][DEBUG ] }
[lts-osd1][DEBUG ]
********************************************************************************
[lts-osd1][INFO ] monitor: mon.lts-osd1 is currently at the state of
probing
[lts-osd1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon
/var/run/ceph/ceph-mon.lts-osd1.asok mon_status
[lts-osd1][WARNIN] lts-osd1 is not defined in `mon initial members`
[lts-osd1][WARNIN] monitor lts-osd1 does not exist in monmap
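Given that warning, our guess (unconfirmed) is that the new monitor should be listed in the [global] section of ceph.conf before redeploying, something along these lines, with 10.5.68.229 being lts-osd1's public address from the log below:

```ini
[global]
mon_initial_members = lts-mon, lts-osd1
mon_host = 10.5.68.236, 10.5.68.229
```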
The monitor I was trying to add shows the following in its log:
2015-06-09 11:33:24.661466 7fef2a806700 0 cephx: verify_reply couldn't
decrypt with error: error decoding block for decryption
2015-06-09 11:33:24.661478 7fef2a806700 0 -- 10.5.68.229:6789/0 >>
10.5.68.236:6789/0 pipe(0x3571000 sd=13 :40912 s=1 pgs=0 cs=0 l=0
c=0x34083c0).failed verifying authorize reply
2015-06-09 11:33:24.763579 7fef2eb83700 0 log_channel(audit) log [DBG] :
from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2015-06-09 11:33:24.763651 7fef2eb83700 0 log_channel(audit) log [DBG] :
from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2015-06-09 11:33:25.825163 7fef2eb83700 0 log_channel(audit) log [DBG] :
from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2015-06-09 11:33:25.825259 7fef2eb83700 0 log_channel(audit) log [DBG] :
from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2015-06-09 11:33:26.661737 7fef2a806700 0 cephx: verify_reply couldn't
decrypt with error: error decoding block for decryption
2015-06-09 11:33:26.661750 7fef2a806700 0 -- 10.5.68.229:6789/0 >>
10.5.68.236:6789/0 pipe(0x3571000 sd=13 :40914 s=1 pgs=0 cs=0 l=0
c=0x34083c0).failed verifying authorize reply
2015-06-09 11:33:26.887973 7fef2eb83700 0 log_channel(audit) log [DBG] :
from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2015-06-09 11:33:26.888047 7fef2eb83700 0 log_channel(audit) log [DBG] :
from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2015-06-09 11:33:27.950014 7fef2eb83700 0 log_channel(audit) log [DBG] :
from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2015-06-09 11:33:27.950113 7fef2eb83700 0 log_channel(audit) log [DBG] :
from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
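Since the cephx "verify_reply couldn't decrypt" errors could also point at mismatched monitor keys rather than clock skew, here is a small sketch of how one might compare the mon. secret across keyring files (the file contents below are hypothetical; the real keyrings would be copied from /var/lib/ceph/mon/ceph-<host>/keyring on each node):

```python
# Compare the "key = ..." secret across Ceph keyring files to spot a
# cephx mismatch. Keyrings use an INI-like format, e.g.:
#   [mon.]
#       key = AQ...==
#       caps mon = "allow *"
import re

def extract_key(keyring_text):
    """Return the base64 secret from a keyring's 'key = ...' line."""
    m = re.search(r'^\s*key\s*=\s*(\S+)', keyring_text, re.MULTILINE)
    return m.group(1) if m else None

# Hypothetical contents copied from two monitor nodes.
mon_keyring = '[mon.]\n\tkey = AQBexampleAAA==\n\tcaps mon = "allow *"\n'
osd1_keyring = '[mon.]\n\tkey = AQBexampleBBB==\n\tcaps mon = "allow *"\n'

if extract_key(mon_keyring) != extract_key(osd1_keyring):
    print("mon keyrings differ -- cephx will fail to verify")
```

If the secrets differ between the existing mon and the one being added, the authorize reply would fail to decrypt exactly as in the log above.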
All of our Google searching seems to indicate that there may be clock
skew, but the clocks are matched to within 0.001 seconds.
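For completeness, this is roughly how we quantified the skew: sample the time on each node at (nearly) the same moment and compare against the mon's clock. The timestamp values below are illustrative, not our actual samples; Ceph's default mon_clock_drift_allowed is 0.05 s, so a 0.001 s spread should be well inside tolerance:

```python
# Quantify clock skew from per-host timestamps sampled at (nearly) the
# same moment, e.g. gathered with: ssh <host> date +%s.%N
# The values below are illustrative, not our real samples.
samples = {
    "lts-mon":  1433867604.661466,
    "lts-osd1": 1433867604.661900,
    "lts-osd2": 1433867604.662100,
    "lts-osd3": 1433867604.661700,
}

reference = samples["lts-mon"]
max_skew = max(abs(t - reference) for t in samples.values())
print("max skew vs lts-mon: %.6f s" % max_skew)

# Ceph's default mon_clock_drift_allowed is 0.05 s; anything under
# that should not trigger clock-skew complaints.
assert max_skew < 0.05
```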
Any assistance is much appreciated, thanks,
Mike C
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com