Sorry mate, I've just noticed the
"unfound (0.007%)"
I think that your main culprit here is osd.0. With size=2 spread across your two
hosts, you need all OSDs on at least one host up to get all the data back.
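
If you want to confirm which OSDs the unfound objects actually need, something
along these lines should show it (the pg id is a placeholder, take the real pg
ids from the health detail output; the "might_have_unfound" section of the pg
query output lists the OSDs that may still hold the missing copies):

# ceph health detail | grep unfound
# ceph pg <pgid> list_missing
# ceph pg <pgid> query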

Also, for the time being I would just change size and min_size down to 1 and try
to figure out which OSDs you actually need to get all the data back, then fix
your machine problems. In my experience, regardless of the solution, when you are
in degraded mode and try to fix things at the same time, they only get worse.
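
If you go that route, it would be something like the following for each pool
(pool names are whatever ceph osd lspools shows on your cluster; remember that
size 1 means a single copy, so treat it as a temporary measure only):

# ceph osd lspools
# ceph osd pool set <pool> size 1
# ceph osd pool set <pool> min_size 1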


> On 28 Aug 2017, at 17:31, hjcho616 <hjcho...@yahoo.com> wrote:
> 
> Thank you all for suggestions!
> 
> Maged,
> 
> I'll see what I can do on that... Looks like I may have to add another OSD 
> host as I utilized all of the SATA ports on those boards. =P
> 
> Ronny,
> 
> I am running with size=2 min_size=1.  I created everything with ceph-deploy 
> and didn't touch much of the pool settings...  I hope not, but it sounds like I 
> may have lost some files!  I do want some of those OSDs to come back online 
> somehow... to get that confidence level up. =P
> 
> The dead osd.3 message is probably from me trying to stop and start the OSD.  
> There were some cases where stop didn't kill the ceph-osd process, so I just 
> started or restarted the OSD to see if that worked.  After that there were 
> some reboots, and I am not seeing those messages anymore...
> 
> Tomasz,
> 
> This is something I am running at home.  I am the only user.  In a way it is 
> a production environment, but driven only by me. =)
> 
> Do you have any suggestions for getting osd.3, osd.4, osd.5, and osd.8 to 
> come back up without removing them?  I have a feeling I can get some of the 
> data back with some of them intact.
> 
> Thank you!
> 
> Regards,
> Hong
> 
> 
> On Monday, August 28, 2017 6:09 AM, Tomasz Kusmierz <tom.kusmi...@gmail.com> 
> wrote:
> 
> 
> Personally I would suggest to (rough example commands at the end of this list):
> - change the replication failure domain to OSD (from the default host)
> - remove the OSDs from the host with all those down OSDs (note that they are 
> down, not out, which makes it more weird)
> - let the single-node cluster stabilise; yes, performance will suck, but at 
> least you will have two copies of the data on a single host … better that than 
> nothing.
> - fix whatever issues you have on host OSD2 
> - add all OSDs on OSD2 back in and mark all OSDs from OSD1 with weight 0 - this 
> will make Ceph migrate all data away from host OSD1
> - fix all the problems you’ve got on host OSD1 
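> 
> Roughly, the commands I have in mind look like this on jewel (untested on my 
> side; the rule name is made up and <pool> is a placeholder, so check the 
> ruleset id with the dump command before assigning it):
> 
> # ceph osd crush rule create-simple replicated-osd default osd
> # ceph osd crush rule dump replicated-osd
> # ceph osd pool set <pool> crush_ruleset <ruleset-id>
> # ceph osd crush reweight osd.0 0
> 
> (the last one repeated for each OSD on host OSD1 when you get to that step)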
> 
> The reason I suggest this is that it seems you’ve got issues everywhere, and 
> since you are running a production environment (at least it seems like that to 
> me), data and downtime are the main priority.
> 
> > On 28 Aug 2017, at 11:58, Ronny Aasen <ronny+ceph-us...@aasen.cx> wrote:
> > 
> > On 28. aug. 2017 08:01, hjcho616 wrote:
> >> Hello!
> >>
> >> I've been using Ceph for a long time, mostly for network CephFS storage, 
> >> even before the Argonaut release!  It's been working very well for me.  
> >> Yes, I had some power outages before, asked a few questions on this list, 
> >> and got them resolved happily!  Thank you all!
> >>
> >> Not sure why, but we've been having quite a bit of power outages lately.  
> >> Ceph appeared to be running OK with those going on.. so I was pretty happy 
> >> and didn't think much of it... till yesterday.  When I started to move 
> >> some videos to CephFS, Ceph decided that it was full although df showed 
> >> only 54% utilization!  Then I looked, and some of the OSDs were down! 
> >> (only 3 at that point!)
> >>
> >> I am running a pretty simple Ceph configuration... I have one machine 
> >> running the MDS and mon, named MDS1, and two OSD machines, OSD1 and OSD2, 
> >> each with 5 2TB HDDs and 1 SSD for the journal.
> >>
> >> At the time, I was running jewel 10.2.2.  I looked at some of the downed 
> >> OSDs' log files and googled some of the errors... they appeared to be tied 
> >> to version 10.2.2, so I just upgraded everything to 10.2.9.  Well, that 
> >> didn't solve my problems.. =P  While looking at some of this.. there was 
> >> another power outage!  D'oh!  I may need to invest in a UPS or 
> >> something... Until this happened, all of the down OSDs were on OSD2, but 
> >> then OSD1 took a hit!  It couldn't boot, because osd.0 was damaged... I 
> >> tried xfs_repair -L /dev/sdb1 as suggested by the command line.. I was 
> >> able to mount it again, phew, rebooted... and then /dev/sdb1 was no longer 
> >> accessible!  Noooo!!!
> >>
> >> So this is what I have today!  I am a bit concerned as half of the OSDs 
> >> are down, and osd.0 doesn't look good at all...
> >> # ceph osd tree
> >> ID WEIGHT  TYPE NAME    UP/DOWN REWEIGHT PRIMARY-AFFINITY
> >> -1 16.24478 root default
> >> -2  8.12239    host OSD1
> >>  1  1.95250        osd.1      up  1.00000          1.00000
> >>  0  1.95250        osd.0    down        0          1.00000
> >>  7  0.31239        osd.7      up  1.00000          1.00000
> >>  6  1.95250        osd.6      up  1.00000          1.00000
> >>  2  1.95250        osd.2      up  1.00000          1.00000
> >> -3  8.12239    host OSD2
> >>  3  1.95250        osd.3    down        0          1.00000
> >>  4  1.95250        osd.4    down        0          1.00000
> >>  5  1.95250        osd.5    down        0          1.00000
> >>  8  1.95250        osd.8    down        0          1.00000
> >>  9  0.31239        osd.9      up  1.00000          1.00000
> >> This looked a lot better before that last extra power outage... =(  I 
> >> can't mount it anymore!
> >> # ceph health
> >> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 44 pgs 
> >> backfill_toofull; 80 pgs backfill_wait; 122 pgs degraded; 6 pgs down; 8 
> >> pgs inconsistent; 6 pgs peering; 2 pgs recovering; 18 pgs recovery_wait; 
> >> 16 pgs stale; 122 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck 
> >> stale; 159 pgs stuck unclean; 102 pgs stuck undersized; 102 pgs 
> >> undersized; 1 requests are blocked > 32 sec; recovery 1803466/4503980 
> >> objects degraded (40.042%); recovery 692976/4503980 objects misplaced 
> >> (15.386%); recovery 147/2251990 unfound (0.007%); 1 near full osd(s); 54 
> >> scrub errors; mds cluster is degraded; no legacy OSD present but 
> >> 'sortbitwise' flag is not set
> >> Each of the OSDs is showing a different failure signature.
> >> I've uploaded the OSD logs with debug osd = 20, debug filestore = 20, and 
> >> debug ms = 20.  You can find them at the links below.  Let me know if 
> >> there is a preferred way to share these!
> >> https://drive.google.com/open?id=0By7YztAJNGUWQXItNzVMR281Snc (ceph-osd.3.log)
> >> https://drive.google.com/open?id=0By7YztAJNGUWYmJBb3RvLVdSQWc (ceph-osd.4.log)
> >> https://drive.google.com/open?id=0By7YztAJNGUWaXhRMlFOajN6M1k (ceph-osd.5.log)
> >> https://drive.google.com/open?id=0By7YztAJNGUWdm9BWFM5a3ExOFE (ceph-osd.8.log)
> >> So how does this look?  Can this be fixed? =)  If so, please let me know.  
> >> I used to take backups, but since it grew so big I wasn't able to do so 
> >> anymore... and I would like to get most of this data back if I can.  
> >> Please let me know if you need more info!
> >> Thank you!
> >> Regards,
> >> Hong
> > 
> > With only 2 OSD hosts, how are you doing replication?  I assume you use 
> > size=2, and that is somewhat OK if you have min_size=2, but if you have 
> > min_size=1 it can quickly become a big problem of lost objects.
> > 
> > With size=2, min_size=2 your data should be safely on 2 drives (if you can 
> > get one of them running again), but your cluster will block when there is 
> > an issue.
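> > 
> > To double check what you are actually running with, something like this 
> > should show it (pool names are whatever your cluster uses, e.g. your cephfs 
> > data and metadata pools):
> > 
> > # ceph osd lspools
> > # ceph osd pool get <pool> size
> > # ceph osd pool get <pool> min_size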
> > 
> > If at all possible, I would add a third OSD node to your cluster, so your 
> > OK PGs can replicate to it and you can work on the down OSDs without fear 
> > of losing additional working OSDs.
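> > 
> > If you still use ceph-deploy, adding a node would be roughly something like 
> > the following (OSD3, sdb and /dev/ssd1 are placeholder names for whatever 
> > host and devices you end up using, with the old HOST:DISK[:JOURNAL] syntax):
> > 
> > # ceph-deploy install OSD3
> > # ceph-deploy osd create OSD3:sdb:/dev/ssd1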
> > 
> > Also, some of your logs contain lines like...
> > 
> > failed to bind the UNIX domain socket to '/var/run/ceph/ceph-osd.3.asok': 
> > (17) File exists
> > 
> > filestore(/var/lib/ceph/osd/ceph-3) lock_fsid failed to lock 
> > /var/lib/ceph/osd/ceph-3/fsid, is another ceph-osd still running? (11) 
> > Resource temporarily unavailable
> > 
> > 7faf16e23800 -1 osd.3 0 OSD::pre_init: object store 
> > '/var/lib/ceph/osd/ceph-3' is currently in use. (Is ceph-osd already 
> > running?)
> > 
> > 7faf16e23800 -1  ** ERROR: osd pre_init failed: (16) Device or resource busy
> > 
> > 
> > 
> > This can indicate that you have a dead osd.3 process keeping the resources 
> > open and preventing a new OSD from starting.
> > 
> > Check with ps aux whether you can see any ceph processes.  If you do find 
> > something relating to your down OSDs, try stopping it normally, and if that 
> > fails, kill it manually before trying to restart the OSD.
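> > 
> > For example (assuming the systemd units of a default jewel install; the pid 
> > is whatever ps shows for the stuck process):
> > 
> > # ps aux | grep ceph-osd
> > # systemctl stop ceph-osd@3
> > # kill -9 <pid>
> > # systemctl start ceph-osd@3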
> > 
> > Also check dmesg for messages relating to faulty hardware or the OOM killer 
> > there.  I have had experiences with the OOM killer where the OSD node 
> > became unreliable until I rebooted the machine.
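> > 
> > A quick way to scan for that sort of thing (adjust the pattern to taste):
> > 
> > # dmesg -T | egrep -i 'oom|error|xfs|ata'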
> > 
> > 
> > kind regards, and good luck
> > Ronny Aasen
> > 
> 
> 
> 
> 

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
