All,
I have a set of hardware with a few systems connected via InfiniBand (IB), along
with a DDN SFA12K.
There are 4 IB/SRP paths to each block device, which show up as
/dev/mapper/mpath[b-d].
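For reference, I verify the maps with the standard device-mapper-multipath tools,
nothing Ceph-specific:

  # show an mpath map and its four SRP paths
  multipath -ll /dev/mapper/mpathc
  # confirm which dm-N node the friendly name points at
  ls -l /dev/mapper/mpathc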
I am trying to do an initial install/setup of Ceph on 3 nodes. Each will run a
monitor as well as host a single OSD.
I am using ceph-deploy to do most of the heavy lifting (on CentOS 7.2.1511).
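Roughly the sequence I ran (from memory, so the exact arguments may be slightly
off):

  # set up the cluster config and the three monitors
  ceph-deploy new ceph-1-35a ceph-1-35b ceph-1-36a
  ceph-deploy install ceph-1-35a ceph-1-35b ceph-1-36a
  ceph-deploy mon create-initial
  # first OSD, on a multipath device (mpathb here is from memory)
  ceph-deploy osd create ceph-1-35a:/dev/mapper/mpathb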
Installing the monitors and even the first OSD goes smoothly.
ceph status shows:
    cluster 0d9e68e4-176d-4229-866b-d408f8055e5b
     health HEALTH_OK
     monmap e1: 3 mons at {ceph-1-35a=10.100.1.35:6789/0,ceph-1-35b=10.100.1.85:6789/0,ceph-1-36a=10.100.1.36:6789/0}
            election epoch 8, quorum 0,1,2 ceph-1-35a,ceph-1-36a,ceph-1-35b
     osdmap e5: 1 osds: 1 up, 1 in
            flags sortbitwise
      pgmap v8: 64 pgs, 1 pools, 0 bytes data, 0 objects
            40112 kB used, 43888 GB / 43889 GB avail
                  64 active+clean
But as soon as I try to add the next OSD on the second system with
  ceph-deploy osd create ceph-1-35b:/dev/mapper/mpathc
things start acting up.
The last bit of the output seems OK:
[ceph-1-35b][INFO ] checking OSD status...
[ceph-1-35b][INFO ] Running command: ceph --cluster=ceph osd stat --format=json
[ceph-1-35b][WARNIN] there is 1 OSD down
[ceph-1-35b][WARNIN] there is 1 OSD out
[ceph_deploy.osd][DEBUG ] Host ceph-1-35b is now ready for osd use.
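To see what ceph-deploy actually left behind on the new node, I can check how
ceph-disk classifies the device (run on ceph-1-35b):

  # list disks/partitions as ceph-disk sees them
  ceph-disk list
  # and the partitions that were created on the multipath device
  sgdisk -p /dev/mapper/mpathc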
But ceph status is now:
    cluster 0d9e68e4-176d-4229-866b-d408f8055e5b
     health HEALTH_OK
     monmap e1: 3 mons at {ceph-1-35a=10.100.1.35:6789/0,ceph-1-35b=10.100.1.85:6789/0,ceph-1-36a=10.100.1.36:6789/0}
            election epoch 8, quorum 0,1,2 ceph-1-35a,ceph-1-36a,ceph-1-35b
     osdmap e6: 2 osds: 1 up, 1 in
            flags sortbitwise
      pgmap v10: 64 pgs, 1 pools, 0 bytes data, 0 objects
            40120 kB used, 43888 GB / 43889 GB avail
                  64 active+clean
And ceph osd tree:
ID WEIGHT   TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 42.86040 root default
-2 42.86040     host ceph-1-35a
 0 42.86040         osd.0            up  1.00000          1.00000
 1        0 osd.1                  down        0          1.00000
I don't understand why ceph-deploy didn't activate this OSD when it did for the
first one. The OSD's data partition is not mounted on that box.
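A quick way to confirm that (nothing mounted, no daemon running) on ceph-1-35b:

  # is the OSD data partition mounted where ceph expects it?
  mount | grep /var/lib/ceph/osd
  # is an osd daemon running at all?
  ps aux | grep ceph-osd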
I can try to manually activate the down OSD with
  ceph-deploy disk activate ceph-1-35b:/dev/mapper/mpathc1:/dev/mapper/mpathc2
and things look good for a bit:
    cluster 0d9e68e4-176d-4229-866b-d408f8055e5b
     health HEALTH_OK
     monmap e1: 3 mons at {ceph-1-35a=10.100.1.35:6789/0,ceph-1-35b=10.100.1.85:6789/0,ceph-1-36a=10.100.1.36:6789/0}
            election epoch 8, quorum 0,1,2 ceph-1-35a,ceph-1-36a,ceph-1-35b
     osdmap e8: 2 osds: 2 up, 2 in
            flags sortbitwise
      pgmap v14: 64 pgs, 1 pools, 0 bytes data, 0 objects
            74804 kB used, 87777 GB / 87778 GB avail
                  64 active+clean
But after about a minute, the OSD is marked down again. ceph osd tree:
ID WEIGHT   TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 85.72079 root default
-2 42.86040     host ceph-1-35a
 0 42.86040         osd.0            up  1.00000          1.00000
-3 42.86040     host ceph-1-35b
 1 42.86040         osd.1          down  1.00000          1.00000
And ceph status:
    cluster 0d9e68e4-176d-4229-866b-d408f8055e5b
     health HEALTH_WARN
            1/2 in osds are down
     monmap e1: 3 mons at {ceph-1-35a=10.100.1.35:6789/0,ceph-1-35b=10.100.1.85:6789/0,ceph-1-36a=10.100.1.36:6789/0}
            election epoch 8, quorum 0,1,2 ceph-1-35a,ceph-1-36a,ceph-1-35b
     osdmap e9: 2 osds: 1 up, 2 in
            flags sortbitwise
      pgmap v15: 64 pgs, 1 pools, 0 bytes data, 0 objects
            74804 kB used, 87777 GB / 87778 GB avail
                  64 active+clean
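When I get a chance I will watch the daemon itself while this happens; a sketch
of what I plan to run on ceph-1-35b (assuming the usual systemd unit name and
log path on CentOS 7):

  # daemon state and recent journal around the time osd.1 drops
  systemctl status ceph-osd@1
  journalctl -u ceph-osd@1 --since "10 min ago"
  # the OSD's own log usually says why it stopped or was marked down
  tail -n 100 /var/log/ceph/ceph-osd.1.log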
Has anyone played with getting multipath devices to work as OSDs?
Of course, it could be something completely different, and I need to step back
and see which step is failing. Any insight into where to dig would be
appreciated.
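One specific thing I suspect is the udev-driven activation that ceph-disk relies
on, since device-mapper partitions may not trigger the same events as plain disk
partitions. What I plan to check (a sketch; dm-3 is a placeholder for whatever
node mpathc1 actually maps to on that box):

  # do the partitions carry the Ceph GPT type GUIDs the udev rules match on?
  sgdisk -i 1 /dev/mapper/mpathc
  sgdisk -i 2 /dev/mapper/mpathc
  # dry-run the udev rules against the dm partition to see if the ceph rules fire
  udevadm test /sys/block/dm-3 2>&1 | grep -i ceph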
Thanks in advance,
Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238