All,
I have found an issue with Ceph OSDs that sit on a SAN and are multipathed. The
multipathing itself may not matter, but that is the setup in which I found the
issue.

Our setup has an InfiniBand network that uses SRP to present block devices from
a DDN array.
Every LUN can be seen by every node that loads the SRP drivers; in our case,
that is all of the OSSes.
I can create OSDs so that each node gets one OSD from what is available:

ceph-deploy osd create ceph-1-35a:/dev/mapper/mpathb:/dev/sda5 \
ceph-1-35b:/dev/mapper/mpathc:/dev/sda5 \
ceph-1-36a:/dev/mapper/mpathd:/dev/sda5 \
ceph-1-36b:/dev/mapper/mpathe:/dev/sda5

This creates each OSD and puts its journal on partition 5 of a local SSD on
each node.
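For reference, each argument to "ceph-deploy osd create" is a HOST:DATA[:JOURNAL]
triple split on ':'. A purely illustrative bash sketch of how one of the specs
above decomposes (nothing here touches the cluster):

```shell
# Illustrative only: how a ceph-deploy HOST:DATA[:JOURNAL] disk spec splits.
# Here the multipath LUN is the data device and partition 5 of the local
# SSD is the journal.
spec="ceph-1-35a:/dev/mapper/mpathb:/dev/sda5"
IFS=: read -r host data journal <<< "$spec"
echo "host=$host data=$data journal=$journal"
```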
After a moment, everything is happy:

    cluster b04e16d1-95d4-4f5f-8b32-318e7abbec56
     health HEALTH_OK
     monmap e1: 3 mons at 
{gnas-1-35a=10.100.1.35:6789/0,gnas-1-35b=10.100.1.85:6789/0,gnas-1-36a=10.100.1.36:6789/0}
            election epoch 4, quorum 0,1,2 gnas-1-35a,gnas-1-36a,gnas-1-35b
     osdmap e19: 4 osds: 4 up, 4 in
            flags sortbitwise
      pgmap v39: 64 pgs, 1 pools, 0 bytes data, 0 objects
            158 MB used, 171 TB / 171 TB avail
                  64 active+clean

Now the problem: when the system probes the devices, Ceph automatically mounts
ALL the OSDs it can see:
#df
Filesystem                          1K-blocks       Used   Available Use% Mounted on
/dev/mapper/VG1-root                 20834304    1313172    19521132   7% /
devtmpfs                            132011116          0   132011116   0% /dev
tmpfs                               132023232          0   132023232   0% /dev/shm
tmpfs                               132023232      19040   132004192   1% /run
tmpfs                               132023232          0   132023232   0% /sys/fs/cgroup
/dev/sda2                              300780     126376      174404  43% /boot
/dev/sda1                              307016       9680      297336   4% /boot/efi
/dev/mapper/VG1-tmp                  16766976      33052    16733924   1% /tmp
/dev/mapper/VG1-var                  50307072     363196    49943876   1% /var
/dev/mapper/VG1-log                  50307072      37120    50269952   1% /var/log
/dev/mapper/VG1-auditlog             16766976      33412    16733564   1% /var/log/audit
tmpfs                                26404648          0    26404648   0% /run/user/0
/dev/mapper/mpathb1               46026204140      41592 46026162548   1% /var/lib/ceph/osd/ceph-0

#partprobe /dev/mapper/mpathc
#partprobe /dev/mapper/mpathd
#partprobe /dev/mapper/mpathe
#df
Filesystem                          1K-blocks       Used   Available Use% Mounted on
/dev/mapper/VG1-root                 20834304    1313172    19521132   7% /
devtmpfs                            132011116          0   132011116   0% /dev
tmpfs                               132023232          0   132023232   0% /dev/shm
tmpfs                               132023232      19040   132004192   1% /run
tmpfs                               132023232          0   132023232   0% /sys/fs/cgroup
/dev/sda2                              300780     126376      174404  43% /boot
/dev/sda1                              307016       9680      297336   4% /boot/efi
/dev/mapper/VG1-tmp                  16766976      33052    16733924   1% /tmp
/dev/mapper/VG1-var                  50307072     363196    49943876   1% /var
/dev/mapper/VG1-log                  50307072      37120    50269952   1% /var/log
/dev/mapper/VG1-auditlog             16766976      33412    16733564   1% /var/log/audit
tmpfs                                26404648          0    26404648   0% /run/user/0
/dev/mapper/mpathb1               46026204140      41592 46026162548   1% /var/lib/ceph/osd/ceph-0
/dev/mapper/mpathc1               46026204140      39912 46026164228   1% /var/lib/ceph/osd/ceph-1
/dev/mapper/mpathd1               46026204140      39992 46026164148   1% /var/lib/ceph/osd/ceph-2
/dev/mapper/mpathe1               46026204140      39964 46026164176   1% /var/lib/ceph/osd/ceph-3

Well, that causes great grief and lockups...
Is there a way within Ceph to tell a particular OSS to ignore OSDs that aren't
meant for it? It's odd to me that a mere partprobe even causes the OSD to mount.
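For context on the auto-mount: as far as I can tell, it is not Ceph itself but
ceph-disk's udev rules doing the mounting. ceph-disk tags OSD data partitions
with a well-known GPT type GUID, and a packaged udev rule matches that GUID and
runs "ceph-disk activate" whenever the partition appears, on any node that can
see the LUN. A sketch (the GUID and rule path below are my understanding of the
ceph-disk packaging; verify them on your distro before relying on this):

```shell
# GPT partition type GUID that ceph-disk assigns to OSD data partitions
# (assumption: taken from ceph-disk; confirm against your Ceph version).
OSD_DATA_GUID="4fbd7e29-9d25-41b8-afd0-062c0ceff05d"

# Inspect a partition's type GUID with sgdisk (from the gdisk package):
#   sgdisk --info=1 /dev/mapper/mpathc | grep 'Partition GUID code'

# One possible (untested) workaround: mask the packaged udev rule so OSDs
# are only activated explicitly, never on a mere partition probe:
#   ln -s /dev/null /etc/udev/rules.d/95-ceph-osd.rules
#   udevadm control --reload

echo "$OSD_DATA_GUID"
```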


Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238


_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com