Re: [ceph-users] Power Outage

2014-08-12 Thread hjcho616
, August 12, 2014 3:02 PM, Craig Lewis cle...@centraldesktop.com wrote: I can't really help with MDS.  Hopefully somebody else will chime in here. (Resending, because my last reply was too large.) On Tue, Aug 12, 2014 at 12:44 PM, hjcho616 hjcho...@yahoo.com wrote: Craig, Thanks.  It turns

[ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread hjcho616
When CephFS is mounted on a client and the client goes to sleep, the MDS segfaults.  Has anyone seen this?  Below is part of the MDS log.  This happened in Emperor and the recent 0.77 release.  I am running Debian Wheezy with testing kernel 3.13.  What can I do to not crash the whole system if

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread hjcho616
]: segfault at 200 ip 7f36c3d480b8 sp 7f36c07d3520 error 4 in libgcc_s.so.1[7f36c3d39000+15000] Regards, Hong From: Luke Jing Yuan jyl...@mimos.my To: hjcho616 hjcho...@yahoo.com Cc: Mohd Bazli Ab Karim bazli.abka...@mimos.my; ceph-users@lists.ceph.com

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-21 Thread hjcho616
emperor.  I am using debian packages. Client went to sleep for a while (like 8+ hours).  There was no I/O prior to the sleep other than the fact that cephfs was still mounted. Regards, Hong From: Luke Jing Yuan jyl...@mimos.my To: hjcho616 hjcho...@yahoo.com Cc

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-25 Thread hjcho616
From: Gregory Farnum g...@inktank.com To: hjcho616 hjcho...@yahoo.com Cc: Mohd Bazli Ab Karim bazli.abka...@mimos.my; Yan, Zheng uker...@gmail.com; Sage Weil s...@inktank.com; ceph-users@lists.ceph.com ceph-users@lists.ceph.com Sent: Tuesday, March 25, 2014 11:05 AM Subject: Re: [ceph

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-25 Thread hjcho616
continued way before the wake event.   I'll monitor the sleep and wake a few more times and see if it is good. Thanks. Regards, Hong From: Gregory Farnum g...@inktank.com To: hjcho616 hjcho...@yahoo.com Cc: Mohd Bazli Ab Karim bazli.abka...@mimos.my; Yan

Re: [ceph-users] infernalis and jewel upgrades...

2016-04-15 Thread hjcho616
.16024__0_4E98A1D9__none root@OSD2:/var/lib/ceph/osd# diff ./ceph-3/current/meta/DIR_9/DIR_D/osdmap.16024__0_4E98A1D9__none ./ceph-5/current/meta/DIR_9/DIR_D/osdmap.16024__0_4E98A1D9__none Regards, Hong On Saturday, April 16, 2016 12:35 AM, hjcho616 <hjcho...@yahoo.com> wrote: osd.3 did have

Re: [ceph-users] Power outages!!! help!

2017-09-15 Thread hjcho616
ster again using the method linked a few times in this thread. How did that go, were you successful in recovering those PGs? kind regards. Ronny Aasen On 15. sep. 2017 07:52, hjcho616 wrote: > I just did this and backfilling started.  Let's see where this takes me. > ceph osd lost

Re: [ceph-users] Power outages!!! help!

2017-09-14 Thread hjcho616
Ronny, Working with all of the PGs shown in the "ceph health detail", I ran the command below to export each PG. ceph-objectstore-tool --op export --pgid 0.1c --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --skip-journal-replay --file 0.1c.export I have all PGs

Re: [ceph-users] Power outages!!! help!

2017-09-14 Thread hjcho616
I just did this and backfilling started.  Let's see where this takes me. ceph osd lost 0 --yes-i-really-mean-it Regards, Hong On Friday, September 15, 2017 12:44 AM, hjcho616 <hjcho...@yahoo.com> wrote: Ronny, Working with all of the pgs shown in the "ceph health detail"

Re: [ceph-users] Power outages!!! help!

2017-09-16 Thread hjcho616
Looking better... working on scrubbing.. HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 1 pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 30); mds rank 0 has failed; mds cluster is

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
Anyone?  Can this page be saved?  If not what are my options? Regards, Hong On Saturday, September 16, 2017 1:55 AM, hjcho616 <hjcho...@yahoo.com> wrote: Looking better... working on scrubbing.. HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs incomplete;

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
but 'sortbitwise' flag is not set Regards, Hong On Wednesday, September 20, 2017 11:53 AM, Ronny Aasen <ronny+ceph-us...@aasen.cx> wrote: On 20.09.2017 16:49, hjcho616 wrote: Anyone?  Can this page be saved?  If not what are my options? Regards, Hong On Saturday, Sep

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
fo you need to do it all manually as on http://ceph.com/geen-categorie/ceph-manually-repair-object/ good luck Ronny Aasen On 20.09.2017 22:17, hjcho616 wrote: Thanks Ronny. I decided to try to tar everything under current directory.  Is this correct command for it?  Is there any direct

Re: [ceph-users] Power outages!!! help!

2017-09-20 Thread hjcho616
n't know how long "wait a bit" is, I just turned it back on after a minute or so, and it just returns to the same inconsistent message.. =P  Are we looking for the entire stopped OSD to map to a different OSD and get 3 replicas when running the stopped OSD again? Regards, Hong On Wednesday, Sep

Re: [ceph-users] Power outages!!! help!

2017-09-13 Thread hjcho616
Ronny, Just tried hooking osd.0 back up.  osd.0 seems to be better as I was able to run ceph-objectstore-tool export, so I decided to try hooking it up.  Looks like the journal is not happy.  Is there any way to get this running?  Or do I need to start getting data using ceph-objectstore-tool?

Re: [ceph-users] Power outages!!! help!

2017-09-22 Thread hjcho616
21, 2017 1:46 AM, Ronny Aasen <ronny+ceph-us...@aasen.cx> wrote: On 21. sep. 2017 00:35, hjcho616 wrote: > # rados list-inconsistent-pg data > ["0.0","0.5","0.a","0.e","0.1c","0.29","0.2c"] > # rados list-i

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread hjcho616
> survive.  > > Get yourself an LSI 94** controller and use it as HBA and you will be > fine. but get MORE DRIVES ! …  > > On 28 Aug 2017, at 23:10, hjcho616 <hjcho...@yahoo.com> wrote: > > > > Thank you Tomasz and Ronny.  I'll have to order some hdd

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread hjcho616
an extra disk or try your luck raising osd_backfill_full_ratio to 92%; it may fix things. /Maged On 2017-08-29 21:13, hjcho616 wrote: Nice!  Thank you for the explanation!  I feel like I can revive that OSD. =)   That does sound great.  I don't quite have another cluster so waiting for a drive to arr

Re: [ceph-users] Power outages!!! help!

2017-09-10 Thread hjcho616
It took a while.  It appears to have cleaned up quite a bit... but still has issues.  I've been seeing the message below for more than a day, and CPU and I/O utilization are low... looks like something is stuck...  I rebooted OSDs several times when it looked like it was stuck earlier and

Re: [ceph-users] Power outages!!! help!

2017-09-12 Thread hjcho616
ceph/osd/ceph-0/journal --file 0.2c.export.0 Failure to read OSD superblock: (2) No such file or directory Regards, Hong On Tuesday, September 12, 2017 10:04 AM, hjcho616 <hjcho...@yahoo.com> wrote: Thank you for those references!  I'll have to go study some more.  Good portion of that inco

Re: [ceph-users] Power outages!!! help!

2017-09-12 Thread hjcho616
you manage to get one running, let it recover and stabilize. - recover and inject objects from OSDs that do not run. Start by doing one PG at a time, and once you get the hang of the method you can do multiple PGs at the same time. good luck Ronny Aasen On 11. sep. 2017 06:51, hjcho616 wrote:

Re: [ceph-users] Power outages!!! help!

2017-09-28 Thread hjcho616
.ceph.com/docs/jewel/cephfs/disaster-recovery/ Restarted MDS.  HEALTH_WARN no legacy OSD present but 'sortbitwise' flag is not set Mounted!  Thank you everyone for the help!  Learned a lot! Regards, Hong On Friday, September 22, 2017 1:01 AM, hjcho616 <hjcho...@yahoo.com> wrote:

[ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Hello! I've been using Ceph for a long time, mostly for network CephFS storage, even before the Argonaut release!  It's been working very well for me.  Yes, I had some power outages before and asked a few questions on this list, and they got resolved happily!  Thank you all! Not sure why but we've

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
reason I suggest that is that it seems that you’ve got issues everywhere, and since you are running a production environment (at least it seems like that to me), data and downtime are the main priority. > On 28 Aug 2017, at 11:58, Ronny Aasen <ronny+ceph-us...@aasen.cx> wrote: > > On 28.

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
:53 PM, Ronny Aasen <ronny+ceph-us...@aasen.cx> wrote: comments inline On 28.08.2017 18:31, hjcho616 wrote: I'll see what I can do on that... Looks like I may have to add another OSD host as I utilized all of the SATA ports on those boards. =P Ronny, I am running with

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
On Mon, 2017-08-28 at 19:18 +00

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
or due to lack of space 3. Follow advice of Ronny Aasen on how to recover data from hard drives 4. get cooling to drives or you will lose more!  On 28 Aug 2017, at 22:39, hjcho616 <hjcho...@yahoo.com> wrote: Tomasz, Those machines are behind a surge protector.  Doesn't appear to be a good

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
, August 28, 2017 3:24 PM, Tomasz Kusmierz <tom.kusmi...@gmail.com> wrote: I think you’ve got your answer: 197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       1 On 28 Aug 2017, at 21:22, hjcho616 <hjcho...@yahoo.com> wrote: Steve, I thought that

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread hjcho616
Tomasz, Those machines are behind a surge protector.  Doesn't appear to be a good one!   I do have a UPS... but it is my fault... no battery.  Power was pretty reliable for a while... and UPS was just beeping every chance it had, disrupting some sleep.. =P  So running on surge protector only.  I

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
ies is: - more “proper temperature” you run them at the more life you get out of them - more battery is overpowered for your application the longer it will survive. Get yourself an LSI 94** controller and use it as HBA and you will be fine. but get MORE DRIVES ! …  On 28 Aug 2017, at 2

Re: [ceph-users] Power outages!!! help!

2017-09-04 Thread hjcho616
at means I need to export both and import both?  If I have to get both, is there a need to merge the two before importing?  Or would the tool know how to handle this? Regards, Hong On Monday, September 4, 2017 1:20 AM, hjcho616 <hjcho...@yahoo.com> wrote: Thank you Ronny.  I've added

Re: [ceph-users] Power outages!!! help!

2017-09-04 Thread hjcho616
ew osd to the cluster. kind regards Ronny Aasen On 03.09.2017 06:20, hjcho616 wrote: I checked with ceph-2, 3, 4, 5 so I figured it was safe to assume that superblock file is the same.  I copied it over and started OSD.  It still fails with the same error message.  Looks like when I u

Re: [ceph-users] Power outages!!! help!

2017-09-04 Thread hjcho616
--op export --pgid 2.2f --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --file 2.2f.export Failure to read OSD superblock: (2) No such file or directory Regards, Hong On Monday, September 4, 2017 2:29 AM, hjcho616 <hjcho...@yahoo.com> wrote: Ronny,

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
fy to use this drive if the data is missing? =)  Or am I being paranoid?   Just plug it? =) Regards, Hong On Friday, September 1, 2017 9:01 AM, hjcho616 <hjcho...@yahoo.com> wrote: Looks like it has been rescued... Only 1 error as we saw before in the SMART log! # ddrescue -f /dev/sd

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
662 objects misplaced (10.297%); recovery 103/2251966 unfound (0.005%); 7 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set Regards, Hong On Friday, September 1, 2017 10:37 PM, hjcho616 <hjcho...@yahoo.com> wrote: Tried connecting reco

Re: [ceph-users] Power outages!!! help!

2017-09-01 Thread hjcho616
superblock  whoami  active  fsid  keyring  ready  sysvinit  ceph_fsid  journal  lost+found  store_version  type Regards, Hong On Friday, September 1, 2017 2:59 PM, hjcho616 <hjcho...@yahoo.com> wrote: Found the partition, wasn't able to mount the par

Re: [ceph-users] Power outages!!! help!

2017-09-02 Thread hjcho616
Regards, Hong On Friday, September 1, 2017 11:10 PM, hjcho616 <hjcho...@yahoo.com> wrote: Just realized there is a file called superblock in the ceph directory.  The superblock files of ceph-1 and ceph-2 are identical, and those of ceph-6 and ceph-7 are identical, but not between the two groups.