Hi,
One simple/quick question.
In my Ceph cluster, I had a disk which was in predicted failure. It had
degraded so badly that the ceph-osd daemon crashed.
After the OSD crashed, Ceph moved the data correctly (or at least that's what
I thought), and ceph -s reported "HEALTH_OK".
Perfect.
I tried to tell Ceph to mark the OSD down: it told me the OSD was already
down... fine.
Then I ran this :
ID=43 ; ceph osd down $ID ; ceph auth del osd.$ID ; ceph osd rm $ID ; ceph osd crush remove osd.$ID
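The same removal sequence can be expanded into a small script; remove_osd_cmds below is a hypothetical helper (not part of Ceph) that only prints the commands it would run, so the sequence can be sanity-checked as a dry run before pointing it at a live cluster:

```shell
#!/bin/sh
# Dry-run sketch of the OSD removal sequence from the message above.
# remove_osd_cmds is a hypothetical helper: it prints the commands rather
# than executing them. Pipe its output to sh to actually run them.
remove_osd_cmds() {
  id="$1"
  echo "ceph osd down $id"
  echo "ceph auth del osd.$id"
  echo "ceph osd rm $id"
  echo "ceph osd crush remove osd.$id"
}

# Print the commands for OSD 43 instead of executing them.
remove_osd_cmds 43
```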
And immediately after this, Ceph told me:
# ceph -s
    cluster 70ac4a78-46c0-45e6-8ff9-878b37f50fa1
     health HEALTH_WARN
            37 pgs backfilling
            3 pgs stuck unclean
            recovery 12086/355688 objects misplaced (3.398%)
     monmap e2: 3 mons at {ceph0=192.54.207.70:6789/0,ceph1=192.54.207.71:6789/0,ceph2=192.54.207.72:6789/0}
            election epoch 938, quorum 0,1,2 ceph0,ceph1,ceph2
     mdsmap e64: 1/1/1 up {0=ceph1=up:active}, 1 up:standby-replay, 1 up:standby
     osdmap e25455: 119 osds: 119 up, 119 in; 35 remapped pgs
      pgmap v5473702: 3212 pgs, 10 pools, 378 GB data, 97528 objects
            611 GB used, 206 TB / 207 TB avail
            12086/355688 objects misplaced (3.398%)
                3175 active+clean
                  37 active+remapped+backfilling
  client io 192 kB/s rd, 1352 kB/s wr, 117 op/s
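To watch the backfill drain, one option is to pull the misplaced-object percentage out of the ceph -s text. This sketch assumes the Hammer-era plain-text status format shown above and parses a saved sample line rather than a live cluster:

```shell
#!/bin/sh
# Extract the misplaced percentage from a `ceph -s` recovery line
# (format as printed by Hammer, 0.94.x). On a live cluster the line
# would come from: ceph -s | grep misplaced
status='recovery 12086/355688 objects misplaced (3.398%)'

# Capture whatever sits between the parentheses, minus the % sign.
pct=$(printf '%s\n' "$status" | sed -n 's/.*(\(.*\)%).*/\1/p')
echo "$pct"
```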
Of course, I'm sure OSD 43 was the one that was down ;)
My question therefore is:
If Ceph successfully and automatically migrated the data off the down/out
OSD, why is anything happening at all once I tell Ceph to forget about this
OSD? Was the cluster not "HEALTH_OK" after all?
(ceph-0.94.6-0.el7.x86_64 for now)
Thanks && regards
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com