Hi, I had a very ugly thing happen to me over the weekend. Some of
my OSDs in a root that handles metadata pools filled up to 100% disk
usage because of omap size (even though I had the full ratio set to
97%, which is odd) and refused to start. Some PGs on those OSDs went
away with them. I have tried compacting the omap, moving files away,
etc., but nothing helped. I can't export the PGs; I get errors like this:
2018-08-27 04:42:33.436182 7fcb53382580 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1535359353436170, "job": 1, "event": "recovery_started", "log_files": [5504, 5507]}
2018-08-27 04:42:33.436194 7fcb53382580 4 rocksdb: [/build/ceph-12.2.5/src/rocksdb/db/db_impl_open.cc:482] Recovering log #5504 mode 2
2018-08-27 04:42:35.422502 7fcb53382580 4 rocksdb: [/build/ceph-12.2.5/src/rocksdb/db/db_impl.cc:217] Shutdown: canceling all background work
2018-08-27 04:42:35.431613 7fcb53382580 4 rocksdb: [/build/ceph-12.2.5/src/rocksdb/db/db_impl.cc:343] Shutdown complete
2018-08-27 04:42:35.431716 7fcb53382580 -1 rocksdb: IO error: No space left on device/var/lib/ceph/osd/ceph-5//current/omap/005507.sst: No space left on device
Mount failed with '(1) Operation not permitted'
2018-08-27 04:42:35.432945 7fcb53382580 -1 filestore(/var/lib/ceph/osd/ceph-5/) mount(1723): Error initializing rocksdb :
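
For reference, the "No space left on device" condition can be checked
against the full ratio roughly like this. It is only a minimal sketch:
the helper names are made up for illustration, 0.97 is the full ratio
mentioned above, and it looks at plain filesystem usage, which is not
necessarily the same number Ceph uses when it enforces the ratio.

```python
import shutil


def usage_fraction(used_bytes, total_bytes):
    """Return used space as a fraction of total capacity (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def exceeds_full_ratio(used_bytes, total_bytes, full_ratio=0.97):
    """True when usage is at or above the given full ratio."""
    return usage_fraction(used_bytes, total_bytes) >= full_ratio


def check_path(path, full_ratio=0.97):
    """Check the filesystem holding `path` (hypothetical helper).

    Returns a (fraction_used, over_ratio) tuple based on what the
    operating system reports for that filesystem.
    """
    usage = shutil.disk_usage(path)
    fraction = usage_fraction(usage.used, usage.total)
    return fraction, exceeds_full_ratio(usage.used, usage.total, full_ratio)
```

Calling check_path("/var/lib/ceph/osd/ceph-5") on the affected host
would report how full that OSD's data partition is.
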
I decided to take the loss, mark the OSDs as lost and remove them
from the cluster. However, that left 4 PGs stuck in the incomplete +
inactive state, which apparently prevents my radosgw from starting. Is
there another way to export/import the PGs into their new OSDs, or to
recreate them? I'm running Luminous 12.2.5 on Ubuntu 16.04.
Thanks
Josef
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com