Hi Greg,

Thanks for your support!

On 08. 09. 14 20:20, Gregory Farnum wrote:

> The first one is not caused by the same thing as the ticket you
> reference (it was fixed well before emperor), so it appears to be some
> kind of disk corruption.
> The second one is definitely corruption of some kind as it's missing
> an OSDMap it thinks it should have. It's possible that you're running
> into bugs in emperor that were fixed after we stopped doing regular
> support releases of it, but I'm more concerned that you've got disk
> corruption in the stores. What kind of crashes did you see previously;
> are there any relevant messages in dmesg, etc?

Nothing special in dmesg, except for some XFS warnings that are probably
irrelevant:

XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)

All logs from before the disaster are still there. Do you have any
advice on which ones would be relevant?

> Given these issues, you might be best off identifying exactly which
> PGs are missing, carefully copying them to working OSDs (use the osd
> store tool), and killing these OSDs. Do lots of backups at each
> stage...

This sounds scary. I'll keep my fingers crossed and make plenty of
backups. There are 17 PGs with missing objects.

What exactly do you mean by the osd store tool? Is it the
'ceph_filestore_tool' binary?

François

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
