All,
I have found an issue with Ceph OSDs that are on a SAN and multipathed. It may
not matter that they are multipathed, but that is the setup in which I found
the issue.
Our setup has an InfiniBand network that uses SRP to present block devices
from a DDN array.
Every LUN can be seen by every node that loads the SRP drivers - that is, by
every one of my OSSes.
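(You can confirm the shared visibility on any of the nodes; the mpath names
here are the ones from my example below:)
multipath -ll           # every node lists all of the DDN LUNs, not just "its own"
ls /dev/mapper/mpath*   # mpathb, mpathc, mpathd and mpathe appear on every node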
I can create OSDs such that each node will have one OSD from what is available:
ceph-deploy osd create ceph-1-35a:/dev/mapper/mpathb:/dev/sda5 \
ceph-1-35b:/dev/mapper/mpathc:/dev/sda5 \
ceph-1-36a:/dev/mapper/mpathd:/dev/sda5 \
ceph-1-36b:/dev/mapper/mpathe:/dev/sda5
This creates the OSDs and puts each journal on partition 5 of a local SSD in
each node.
After a moment, everything is happy:
    cluster b04e16d1-95d4-4f5f-8b32-318e7abbec56
     health HEALTH_OK
     monmap e1: 3 mons at {gnas-1-35a=10.100.1.35:6789/0,gnas-1-35b=10.100.1.85:6789/0,gnas-1-36a=10.100.1.36:6789/0}
            election epoch 4, quorum 0,1,2 gnas-1-35a,gnas-1-36a,gnas-1-35b
     osdmap e19: 4 osds: 4 up, 4 in
            flags sortbitwise
      pgmap v39: 64 pgs, 1 pools, 0 bytes data, 0 objects
            158 MB used, 171 TB / 171 TB avail
                  64 active+clean
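One detail that turns out to matter below: ceph-disk (which ceph-deploy drives
under the hood) stamps the OSD data partition with Ceph's "ceph data" GPT
partition-type GUID. You can check it on one of the LUNs (sgdisk is in the
gdisk package; partition 1 is the data partition here):
# show the GPT type GUID that ceph-disk stamped on the data partition
sgdisk --info=1 /dev/mapper/mpathb
# the "Partition GUID code" should be 4fbd7e29-9d25-41b8-afd0-062c0ceff05d,
# Ceph's "ceph data" type GUID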
Now the problem is that when the system probes the devices, Ceph automatically
mounts ALL of the OSDs it can see (more on the mechanism after the listings
below):
#df
Filesystem                 1K-blocks    Used   Available Use% Mounted on
/dev/mapper/VG1-root        20834304 1313172    19521132   7% /
devtmpfs                   132011116       0   132011116   0% /dev
tmpfs                      132023232       0   132023232   0% /dev/shm
tmpfs                      132023232   19040   132004192   1% /run
tmpfs                      132023232       0   132023232   0% /sys/fs/cgroup
/dev/sda2                     300780  126376      174404  43% /boot
/dev/sda1                     307016    9680      297336   4% /boot/efi
/dev/mapper/VG1-tmp         16766976   33052    16733924   1% /tmp
/dev/mapper/VG1-var         50307072  363196    49943876   1% /var
/dev/mapper/VG1-log         50307072   37120    50269952   1% /var/log
/dev/mapper/VG1-auditlog    16766976   33412    16733564   1% /var/log/audit
tmpfs                       26404648       0    26404648   0% /run/user/0
/dev/mapper/mpathb1      46026204140   41592 46026162548   1% /var/lib/ceph/osd/ceph-0
#partprobe /dev/mapper/mpathc
#partprobe /dev/mapper/mpathd
#partprobe /dev/mapper/mpathe
#df
Filesystem                 1K-blocks    Used   Available Use% Mounted on
/dev/mapper/VG1-root        20834304 1313172    19521132   7% /
devtmpfs                   132011116       0   132011116   0% /dev
tmpfs                      132023232       0   132023232   0% /dev/shm
tmpfs                      132023232   19040   132004192   1% /run
tmpfs                      132023232       0   132023232   0% /sys/fs/cgroup
/dev/sda2                     300780  126376      174404  43% /boot
/dev/sda1                     307016    9680      297336   4% /boot/efi
/dev/mapper/VG1-tmp         16766976   33052    16733924   1% /tmp
/dev/mapper/VG1-var         50307072  363196    49943876   1% /var
/dev/mapper/VG1-log         50307072   37120    50269952   1% /var/log
/dev/mapper/VG1-auditlog    16766976   33412    16733564   1% /var/log/audit
tmpfs                       26404648       0    26404648   0% /run/user/0
/dev/mapper/mpathb1      46026204140   41592 46026162548   1% /var/lib/ceph/osd/ceph-0
/dev/mapper/mpathc1      46026204140   39912 46026164228   1% /var/lib/ceph/osd/ceph-1
/dev/mapper/mpathd1      46026204140   39992 46026164148   1% /var/lib/ceph/osd/ceph-2
/dev/mapper/mpathe1      46026204140   39964 46026164176   1% /var/lib/ceph/osd/ceph-3
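As far as I can tell, the chain is: partprobe makes the kernel emit "add"
uevents for the newly seen partitions, udev matches the "ceph data" type GUID
noted above, and the packaged rule (95-ceph-osd.rules) runs ceph-disk, which
activates - i.e. mounts - the OSD. From memory (the exact rule text varies
between Ceph versions), it looks roughly like:
# /usr/lib/udev/rules.d/95-ceph-osd.rules (paraphrased, version-dependent)
# any partition whose GPT type GUID is "ceph data" gets activated on sight
ACTION=="add", SUBSYSTEM=="block", ENV{DEVTYPE}=="partition", \
  ENV{ID_PART_ENTRY_TYPE}=="4fbd7e29-9d25-41b8-afd0-062c0ceff05d", \
  RUN+="/usr/sbin/ceph-disk activate /dev/$name"
So any node that can see a Ceph data partition will activate it, shared SAN
or not.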
Well, that causes great grief and lockups, since multiple nodes end up
mounting the same OSD filesystem.
Is there a way within Ceph to tell a particular OSS to ignore OSDs that aren't
meant for it? It's odd to me that a mere partprobe even causes the OSD to
mount.
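The only workarounds I have come up with so far (assumptions on my part,
neither is an official Ceph knob) are to mask the udev rule and activate each
node's own OSD by hand - an empty file of the same name under
/etc/udev/rules.d overrides the packaged one:
# stop udev from auto-activating every Ceph partition this node can see
touch /etc/udev/rules.d/95-ceph-osd.rules
udevadm control --reload-rules
# then activate only the OSD that belongs on this node, e.g. on ceph-1-35a:
ceph-disk activate /dev/mapper/mpathb1
or to blacklist the foreign LUNs by WWID in each node's /etc/multipath.conf so
the other mpath devices never appear at all. I would rather have a supported
way to do this.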
Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238