They were not. After I changed them manually, I was still unable to start the service. Furthermore, a reboot screwed up the permissions again:
ls -al /dev/sda*
brw-rw---- 1 root disk 8, 0 Jan 3 11:10 /dev/sda
brw-rw---- 1 root disk 8, 1 Jan 3 11:10 /dev/sda1
brw-rw---- 1 root disk 8, 2 Jan 3 11:10 /dev/sda2

[root@osd01 ~]# chown ceph:ceph /dev/sda1
[root@osd01 ~]# chown ceph:ceph /dev/sda2
[root@osd01 ~]# ls -al /dev/sda*
brw-rw---- 1 root disk 8, 0 Jan 3 11:10 /dev/sda
brw-rw---- 1 ceph ceph 8, 1 Jan 3 11:10 /dev/sda1
brw-rw---- 1 ceph ceph 8, 2 Jan 3 11:10 /dev/sda2

[root@osd01 ~]# systemctl start ceph-osd@3
[root@osd01 ~]# systemctl status ceph-osd@3
● [email protected] - Ceph object storage daemon osd.3
   Loaded: loaded (/usr/lib/systemd/system/[email protected]; enabled-runtime; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Wed 2018-01-03 11:18:09 EST; 5s ago
  Process: 3823 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
  Process: 3818 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 3823 (code=exited, status=1/FAILURE)

Jan 03 11:18:09 osd01.tor.medavail.net systemd[1]: Unit [email protected] entered failed state.
Jan 03 11:18:09 osd01.tor.medavail.net systemd[1]: [email protected] failed.

ceph-osd[3823]: 2018-01-03 11:18:08.515687 7fa55aec8d00 -1 bluestore(/var/lib/ceph/osd/ceph-3/block.db) _read_bdev_label unable to decode label at offset 102: buffer::malformed_input: void bluesto
ceph-osd[3823]: 2018-01-03 11:18:08.515710 7fa55aec8d00 -1 bluestore(/var/lib/ceph/osd/ceph-3) _open_db check block device(/var/lib/ceph/osd/ceph-3/block.db) label returned: (22) Invalid argument

This is very odd, as the server was working fine before.

What is the proper procedure for replacing a failed SSD drive used by BlueStore?
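For context on the reboot behaviour: the device nodes under /dev are recreated by udev at every boot, so a manual chown never persists. ceph-disk normally tags its partitions with Ceph-specific partition type GUIDs so that the udev rules shipped with Ceph (95-ceph-osd.rules, if I remember correctly) chown them to ceph:ceph automatically; partitions recreated by hand without those type codes are left at root:disk. A minimal stop-gap sketch, assuming the DB/WAL partitions really are /dev/sda1 and /dev/sda2 and using a hypothetical rule file name, would be a local udev rule such as:

# /etc/udev/rules.d/99-ceph-db-wal.rules  (hypothetical file name)
# Force ownership of the hand-made DB/WAL partitions so ceph-osd can open them after a reboot
KERNEL=="sda[12]", SUBSYSTEM=="block", OWNER="ceph", GROUP="ceph", MODE="0660"

After adding it, "udevadm control --reload" followed by "udevadm trigger" (or a reboot) should re-apply the ownership. The cleaner long-term fix is to recreate the partitions with the proper Ceph partition type GUIDs (sgdisk --typecode=...) so the stock rules match them.

On 3 January 2018 at 10:23, Sergey Malinin <[email protected]> wrote:

> Are actual devices (not only udev links) owned by user "ceph"?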
>
> ------------------------------
> From: ceph-users <[email protected]> on behalf of Steven Vacaroaia <[email protected]>
> Sent: Wednesday, January 3, 2018 6:19:45 PM
> To: ceph-users
> Subject: [ceph-users] ceph luminous - SSD partitions disssapeared
>
> Hi,
>
> After a reboot, all the partitions created on the SSD drive disappeared.
> They were used by BlueStore DB and WAL, so the OSDs are down.
>
> The following error messages are in /var/log/messages:
>
> Jan 3 09:54:12 osd01 ceph-osd: 2018-01-03 09:54:12.992218 7f4b52b9ed00 -1 bluestore(/var/lib/ceph/osd/ceph-6) _open_db /var/lib/ceph/osd/ceph-6/block.db link target doesn't exist
> Jan 3 09:54:12 osd01 ceph-osd: 2018-01-03 09:54:12.993231 7f7ad37b1d00 -1 bluestore(/var/lib/ceph/osd/ceph-5) _open_db /var/lib/ceph/osd/ceph-5/block.db link target doesn't exist
>
> I then decided to take this opportunity, "assume" a dead SSD, and recreate the partitions.
>
> I zapped /dev/sda and then used http://ceph.com/geen-categorie/ceph-recover-osds-after-ssd-journal-failure/ to recreate the partitions for ceph-3.
> Unfortunately, it is now "complaining" about permissions, but they seem fine:
>
> Jan 3 09:54:12 osd01 ceph-osd: 2018-01-03 09:54:12.992120 7f74003d1d00 -1 bdev(0x562336677800 /var/lib/ceph/osd/ceph-3/block.db) open open got: (13) Permission denied
> Jan 3 09:54:12 osd01 ceph-osd: 2018-01-03 09:54:12.992131 7f74003d1d00 -1 bluestore(/var/lib/ceph/osd/ceph-3) _open_db add block device(/var/lib/ceph/osd/ceph-3/block.db) returned: (13) Permission denied
>
> ls -al /var/lib/ceph/osd/ceph-3/
> total 60
> drwxr-xr-x  2 ceph ceph 310 Jan 2 16:39 .
> drwxr-x---. 7 ceph ceph 131 Jan 2 16:39 ..
> -rw-r--r--  1 root root 183 Jan 2 16:39 activate.monmap
> -rw-r--r--  1 ceph ceph   3 Jan 2 16:39 active
> lrwxrwxrwx  1 ceph ceph  58 Jan 2 16:32 block -> /dev/disk/by-partuuid/13560618-5942-4c7e-922a-1fafddb4a4d2
> lrwxrwxrwx  1 ceph ceph  58 Jan 2 16:32 block.db -> /dev/disk/by-partuuid/5f610ecb-cb78-44d3-b503-016840d33ff6
> -rw-r--r--  1 ceph ceph  37 Jan 2 16:32 block.db_uuid
> -rw-r--r--  1 ceph ceph  37 Jan 2 16:32 block_uuid
> lrwxrwxrwx  1 ceph ceph  58 Jan 2 16:32 block.wal -> /dev/disk/by-partuuid/04d38ce7-c9e7-4648-a3f5-7b459e508109
>
> Has anyone had to deal with a similar issue?
>
> How do I fix the permissions?
>
> What is the proper procedure for dealing with a "dead" SSD?
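Regarding the broader "dead SSD" question: as far as I know, Luminous has no supported way to attach a brand-new, empty block.db to an existing BlueStore OSD, which is presumably also why the recreated partition fails the label check above (the new partition contains none of the OSD's original DB data or BlueStore label). The usual approach is therefore to remove and redeploy every OSD whose DB/WAL lived on the failed SSD and let the cluster backfill. A rough sketch for one OSD, with placeholder device names (/dev/sdb as the OSD data disk, /dev/sda as the replacement DB SSD):

ceph osd out 3
systemctl stop ceph-osd@3
ceph osd crush remove osd.3
ceph auth del osd.3
ceph osd rm 3
ceph-disk zap /dev/sdb                                       # wipe the OSD's data disk (placeholder device)
ceph-disk prepare --bluestore --block.db /dev/sda /dev/sdb   # redeploy with DB on the new SSD (placeholder devices)

Wait for backfill to finish before moving on to the next OSD.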
