Hi,

On 03.06.2014 21:46, Jason Harley wrote:
> Howdy —
> 
> I’ve had a failure on a small, Dumpling (0.67.4) cluster running on Ubuntu 
> 13.10 machines.  I had three OSD nodes (running 6 OSDs each), and lost two of 
> them in a beautiful failure.  One of these nodes even went so far as to 
> scramble the XFS filesystems of my OSD disks (I’m curious if it has some bad 
> DIMMs).
> 
> Anyway, the thing is: I’m okay with losing the data, this was a test setup 
> and I want to take this opportunity to learn from the recovery process.  I’m 
> now stuck in ‘HEALTH_ERR’ and want to get back to ‘HEALTH_OK’ without just 
> reinitializing the cluster.
> 
> My OSD map seems correct, I’ve done scrubs (deep, and normal) at the PG and 
> OSD levels.  ‘ceph -s’ shows that I have 47 unfound objects still after I 
> told ceph to ‘mark_unfound_lost’.  The remaining 47 PGs tell me that they 
> "haven't probed all sources, not marking lost".  Two days have passed at this 
> point, and I’d just like to get my cluster back to working and deal with the 
> object loss (which seems localized to a single pool).
> 
> How do I move forward from here, if at all?  Do I ‘force_create_pg’ the PGs 
> containing my unfound objects?
> 
>> # ceph health detail | grep "unfound" | grep "^pg"
>> pg 4.ffe is active+recovering, acting [7,26], 3 unfound
...
>> pg 4.43 is active+recovering, acting [9,23], 1 unfound
> 
> 

What is the output of:

ceph pg 4.ffe query


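The part of the query output worth looking at is the 'might_have_unfound' list
under 'recovery_state': it shows each candidate OSD and whether it was probed,
which is what the "haven't probed all sources" message refers to. A sketch
against a hypothetical fragment of that JSON (the OSD numbers and statuses
below are made up for illustration):

```shell
# Hypothetical 'ceph pg 4.ffe query' fragment; on a real cluster use:
#   query=$(ceph pg 4.ffe query)
# An entry stuck on "querying" or "osd is down" blocks marking objects lost.
query='{"recovery_state": [{"might_have_unfound": [
  {"osd": 7,  "status": "already probed"},
  {"osd": 12, "status": "osd is down"}]}]}'

# Pull out just the probe status of each candidate source OSD.
printf '%s\n' "$query" | grep -o '"status": "[^"]*"'
```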

-- 

Kind regards,

Florian Wiessner

Smart Weblications GmbH
Martinsberger Str. 1
D-95119 Naila

fon.: +49 9282 9638 200
fax.: +49 9282 9638 205
24/7: +49 900 144 000 00 - 0,99 EUR/Min*
http://www.smart-weblications.de

--
Registered office: Naila
Managing Director: Florian Wiessner
Commercial register no.: HRB 3840, Amtsgericht Hof
*from a German landline; prices from mobile networks may differ
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
