Greetings Cephers,

I have been lurking on this list for a while, but this is my first inquiry. I 
have been playing with Ceph for the past 9 months and am in the process of 
deploying a production Ceph cluster. I am seeking advice on an issue that I 
have encountered. I do not believe it is a Ceph-specific issue, but more of a 
Linux issue. Technically, it's not an issue, just undesired behaviour that I am 
hoping someone here has encountered and can suggest a workaround for.

Basically, there are occasions when an OSD host machine gets rebooted. 
Sometimes one or more drives do not spin up properly. This causes the 
corresponding OSD to go offline, along with all other OSDs after it in sequence.

I created my OSDs by following the online docs, using the Linux device names 
(e.g. /dev/sdc, /dev/sdd, /dev/sde, etc.). So osd.0 = /dev/sdc, 
osd.1 = /dev/sdd, osd.2 = /dev/sde, osd.3 = /dev/sdf, etc.

But if one of the drives fails or does not spin up, Linux assigns the device 
names differently. For example, if the drive at /dev/sdd fails on reboot, the 
drive that was /dev/sde becomes /dev/sdd, so osd.1 comes up on what is actually 
the osd.2 drive, osd.2 comes up on what was the osd.3 drive, and every OSD 
after the failed osd.1 falls offline in sequence.
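
To illustrate the shift, here is a small Python sketch. It is only a 
simplified model that assumes the kernel hands out names strictly in detection 
order starting at sdc; the drive labels (disk-osd0, etc.) and the 
enumerate_drives helper are made up for the example.

```python
import string


def enumerate_drives(slots, first_letter="c"):
    """Assign /dev/sdX names to the detected drives, in slot order.

    `slots` is a list of (drive_label, detected) pairs. This is a simplified
    model of enumeration, not how the kernel is actually implemented.
    """
    start = string.ascii_lowercase.index(first_letter)
    letters = string.ascii_lowercase[start:]
    present = [label for label, detected in slots if detected]
    return {f"/dev/sd{letter}": label for letter, label in zip(letters, present)}


# OSD layout as originally configured: OSD -> device name.
osd_devices = {
    "osd.0": "/dev/sdc",
    "osd.1": "/dev/sdd",
    "osd.2": "/dev/sde",
    "osd.3": "/dev/sdf",
}

# Hypothetical drive labels; the drive behind osd.1 does not spin up.
slots = [
    ("disk-osd0", True),
    ("disk-osd1", False),
    ("disk-osd2", True),
    ("disk-osd3", True),
]

names = enumerate_drives(slots)
for osd, dev in osd_devices.items():
    # Show which physical drive each OSD now finds at its configured name.
    print(osd, dev, "->", names.get(dev, "missing"))
```

In this model osd.0 still finds its own drive, osd.1 finds the osd.2 drive, 
osd.2 finds the osd.3 drive, and osd.3 finds nothing at /dev/sdf.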

As expected, if I replace the failed drive and reboot, Linux enumerates the 
drives with their original device names, and Ceph behaves properly: the 
affected OSD is marked down and out, while the remaining OSDs in sequence come 
up and recover gracefully.

Does anyone have any thoughts or experience with how one can ensure that Linux 
device names will always map to the physical device ID? I was thinking along 
the lines of a udev ruleset for the drives or something similar. Or, is there a 
better way to create the OSD using the physical device ID? Basically, some sort 
of way to ensure that a specific physical drive always gets mapped to the same 
device name and OSD.
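
To make concrete what I am after: as far as I understand, udev already 
maintains symlinks under /dev/disk/by-id that are named after the drive model 
and serial number rather than the detection order. The rough Python sketch 
below just lists them alongside the kernel name each one currently points at. 
It assumes that directory exists on the host, and the persistent_names helper 
is my own name for illustration. Whether OSDs can or should be created against 
such paths is exactly the part I am unsure about.

```python
import os


def persistent_names(by_id_dir="/dev/disk/by-id"):
    """Map each persistent symlink name to the device it currently points at.

    Returns an empty dict if the directory does not exist on this host.
    """
    mapping = {}
    if not os.path.isdir(by_id_dir):
        return mapping
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        # Only symlinks are of interest; skip anything else in the directory.
        if os.path.islink(link):
            mapping[name] = os.path.realpath(link)
    return mapping


if __name__ == "__main__":
    for name, dev in persistent_names().items():
        print(f"{name} -> {dev}")
```

Running this before and after a reboot with a missing drive should show the 
same symlink names pointing at different /dev/sdX targets.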

Thanks for any insight or thoughts on this,

Colin
