Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-12-06 Thread Stefan Kooman
Quoting Stefan Kooman (ste...@bit.nl): > 13.2.6 with this patch is running production now. We will continue the > cleanup process that *might* have triggered this tomorrow morning. For what it's worth ... that process completed successfully ... Time will tell if it's really fixed, but it looks pro

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-12-05 Thread Stefan Kooman
Hi, Quoting Yan, Zheng (uker...@gmail.com): > Please check if https://github.com/ceph/ceph/pull/32020 works Thanks! 13.2.6 with this patch is running production now. We will continue the cleanup process that *might* have triggered this tomorrow morning. Gr. Stefan -- | BIT BV https://www.bi

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-12-04 Thread Yan, Zheng
On Thu, Dec 5, 2019 at 4:40 AM Stefan Kooman wrote: > > Quoting Stefan Kooman (ste...@bit.nl): > > and it crashed again (and again) ... until we stopped the mds and > > deleted the mds0_openfiles.0 from the metadata pool. > > > > Here is the (debug) output: > > > > A specific workload that *m

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-12-04 Thread Stefan Kooman
Quoting Stefan Kooman (ste...@bit.nl): > and it crashed again (and again) ... until we stopped the mds and > deleted the mds0_openfiles.0 from the metadata pool. > > Here is the (debug) output: > > A specific workload that *might* have triggered this: recursively deleting a > long > list of

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-12-04 Thread Stefan Kooman
Hi, Quoting Stefan Kooman (ste...@bit.nl): > > please apply following patch, thanks. > > > > diff --git a/src/mds/OpenFileTable.cc b/src/mds/OpenFileTable.cc > > index c0f72d581d..2ca737470d 100644 > > --- a/src/mds/OpenFileTable.cc > > +++ b/src/mds/OpenFileTable.cc > > @@ -470,7 +470,11 @@ voi

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-11-24 Thread Stefan Kooman
Hi, Quoting Yan, Zheng (uker...@gmail.com): > > > I double checked the code, but didn't find any clue. Can you compile > > > mds with a debug patch? > > > > Sure, I'll try to do my best to get a properly packaged Ceph Mimic > > 13.2.6 with the debug patch in it (and / or get help to get it built)

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-10-21 Thread Yan, Zheng
On Mon, Oct 21, 2019 at 7:58 PM Stefan Kooman wrote: > > Quoting Yan, Zheng (uker...@gmail.com): > > > I double checked the code, but didn't find any clue. Can you compile > > mds with a debug patch? > > Sure, I'll try to do my best to get a properly packaged Ceph Mimic > 13.2.6 with the debug pat

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-10-21 Thread Stefan Kooman
Quoting Yan, Zheng (uker...@gmail.com): > I double checked the code, but didn't find any clue. Can you compile > mds with a debug patch? Sure, I'll try to do my best to get a properly packaged Ceph Mimic 13.2.6 with the debug patch in it (and / or get help to get it built). Do you already have th

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-10-21 Thread Yan, Zheng
On Mon, Oct 21, 2019 at 4:33 PM Stefan Kooman wrote: > > Quoting Yan, Zheng (uker...@gmail.com): > > > delete 'mdsX_openfiles.0' object from cephfs metadata pool. (X is rank > > of the crashed mds) > > OK, MDS crashed again, restarted. I stopped it, deleted the object and > restarted the MDS. It b

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-10-21 Thread Stefan Kooman
Quoting Yan, Zheng (uker...@gmail.com): > delete 'mdsX_openfiles.0' object from cephfs metadata pool. (X is rank > of the crashed mds) OK, MDS crashed again, restarted. I stopped it, deleted the object and restarted the MDS. It became active right away. Any idea on why the openfiles list (object

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-10-21 Thread Stefan Kooman
Quoting Yan, Zheng (uker...@gmail.com): > delete 'mdsX_openfiles.0' object from cephfs metadata pool. (X is rank > of the crashed mds) Just to make sure I understand correctly. Current status is that the MDS is active (no standby for now) and not in a "crashed" state (although it has been crashin

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-10-21 Thread Yan, Zheng
On Sun, Oct 20, 2019 at 1:53 PM Stefan Kooman wrote: > > Dear list, > > Quoting Stefan Kooman (ste...@bit.nl): > > > I wonder if this situation is more likely to be hit on Mimic 13.2.6 than > > on any other system. > > > > Any hints / help to prevent this from happening? > > We have had this happe

Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-10-19 Thread Stefan Kooman
Dear list, Quoting Stefan Kooman (ste...@bit.nl): > I wonder if this situation is more likely to be hit on Mimic 13.2.6 than > on any other system. > > Any hints / help to prevent this from happening? We have had this happening another two times now. In both cases the MDS recovers, becomes acti

[ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-10-19 Thread Stefan Kooman
Dear list, Today our active MDS crashed with an assert: 2019-10-19 08:14:50.645 7f7906cb7700 -1 /build/ceph-13.2.6/src/mds/OpenFileTable.cc: In function 'void OpenFileTable::commit(MDSInternalContextBase*, uint64_t, int)' thread 7f7906cb7700 time 2019-10-19 08:14:50.648559 /build/ceph-13.2.6/s

Re: [ceph-users] MDS crash (Mimic 13.2.2 / 13.2.4 ) elist.h: 39: FAILED assert(!is_on_list())

2019-02-11 Thread Yan, Zheng
On Mon, Feb 11, 2019 at 8:01 PM Jake Grimmett wrote: > > Hi Zheng, > > Many, many thanks for your help... > > Your suggestion of setting large values for mds_cache_size and > mds_cache_memory_limit stopped our MDS crashing :) > > The values in ceph.conf are now: > > mds_cache_size = 8589934592 > m

Re: [ceph-users] MDS crash (Mimic 13.2.2 / 13.2.4 ) elist.h: 39: FAILED assert(!is_on_list())

2019-02-11 Thread Jake Grimmett
Hi Zheng, Sorry - I've just re-read your email and saw your instruction to restore the mds_cache_size and mds_cache_memory_limit to original values if the MDS does not crash - I have now done this... thanks again for your help, best regards, Jake On 2/11/19 12:01 PM, Jake Grimmett wrote: > Hi

Re: [ceph-users] MDS crash (Mimic 13.2.2 / 13.2.4 ) elist.h: 39: FAILED assert(!is_on_list())

2019-02-11 Thread Jake Grimmett
Hi Zheng, Many, many thanks for your help... Your suggestion of setting large values for mds_cache_size and mds_cache_memory_limit stopped our MDS crashing :) The values in ceph.conf are now: mds_cache_size = 8589934592 mds_cache_memory_limit = 17179869184 Should these values be left in our co

Re: [ceph-users] MDS crash (Mimic 13.2.2 / 13.2.4 ) elist.h: 39: FAILED assert(!is_on_list())

2019-02-11 Thread Yan, Zheng
On Sat, Feb 9, 2019 at 12:36 AM Jake Grimmett wrote: > > Dear All, > > Unfortunately the MDS has crashed on our Mimic cluster... > > First symptoms were rsync giving: > "No space left on device (28)" > when trying to rename or delete > > This prompted me to try restarting the MDS, as it reported l

[ceph-users] MDS crash (Mimic 13.2.2 / 13.2.4 ) elist.h: 39: FAILED assert(!is_on_list())

2019-02-08 Thread Jake Grimmett
Dear All, Unfortunately the MDS has crashed on our Mimic cluster... First symptoms were rsync giving: "No space left on device (28)" when trying to rename or delete This prompted me to try restarting the MDS, as it reported laggy. Restarting the MDS, shows this as error in the log before the cr

Re: [ceph-users] MDS crash Luminous

2018-02-26 Thread David C
Thanks for the tips, John. I'll increase the debug level as suggested. On 25 Feb 2018 20:56, "John Spray" wrote: > On Sat, Feb 24, 2018 at 10:13 AM, David C wrote: > > Hi All > > > > I had an MDS go down on a 12.2.1 cluster, the standby took over but I > don't > > know what caused the issue. Sc

Re: [ceph-users] MDS crash Luminous

2018-02-25 Thread John Spray
On Sat, Feb 24, 2018 at 10:13 AM, David C wrote: > Hi All > > I had an MDS go down on a 12.2.1 cluster, the standby took over but I don't > know what caused the issue. Scrubs are scheduled to start at 23:00 on this > cluster but this appears to have started a minute before. > > Can anyone help me

[ceph-users] MDS crash Luminous

2018-02-24 Thread David C
Hi All I had an MDS go down on a 12.2.1 cluster, the standby took over but I don't know what caused the issue. Scrubs are scheduled to start at 23:00 on this cluster but this appears to have started a minute before. Can anyone help me with diagnosing this please. Here's the relevant bit from the

Re: [ceph-users] MDS crash

2016-08-15 Thread Yan, Zheng
On Tue, Aug 16, 2016 at 6:29 AM, Randy Orr wrote: > Hi Patrick, > > We continue to hit this bug. Just a couple of questions: > > 1. I see that http://tracker.ceph.com/issues/16983 has been updated and you > believe it is related to http://tracker.ceph.com/issues/16013. It looks like > this fix is

Re: [ceph-users] MDS crash

2016-08-15 Thread Randy Orr
Hi Patrick, We continue to hit this bug. Just a couple of questions: 1. I see that http://tracker.ceph.com/issues/16983 has been updated and you believe it is related to http://tracker.ceph.com/issues/16013. It looks like this fix is scheduled to be backported to Jewel at some point... is there a

Re: [ceph-users] MDS crash

2016-08-10 Thread Randy Orr
Patrick, We are using the kernel client. We have a mix of 4.4 and 3.19 kernels on the client side with plans to move away from the 3.19 kernel where/when we can. -Randy On Wed, Aug 10, 2016 at 4:24 PM, Patrick Donnelly wrote: > Randy, are you using ceph-fuse or the kernel client (or something

Re: [ceph-users] MDS crash

2016-08-10 Thread Patrick Donnelly
Randy, are you using ceph-fuse or the kernel client (or something else)? On Wed, Aug 10, 2016 at 2:33 PM, Randy Orr wrote: > Great, thank you. Please let me know if I can be of any assistance in > testing or validating a fix. > > -Randy > > On Wed, Aug 10, 2016 at 1:21 PM, Patrick Donnelly > wro

Re: [ceph-users] MDS crash

2016-08-10 Thread Randy Orr
Great, thank you. Please let me know if I can be of any assistance in testing or validating a fix. -Randy On Wed, Aug 10, 2016 at 1:21 PM, Patrick Donnelly wrote: > Hello Randy, > > On Wed, Aug 10, 2016 at 12:20 PM, Randy Orr wrote: > > mds/Locker.cc: In function 'bool Locker::check_inode_max_

Re: [ceph-users] MDS crash

2016-08-10 Thread Patrick Donnelly
Hello Randy, On Wed, Aug 10, 2016 at 12:20 PM, Randy Orr wrote: > mds/Locker.cc: In function 'bool Locker::check_inode_max_size(CInode*, bool, > bool, uint64_t, bool, uint64_t, utime_t)' thread 7fc305b83700 time > 2016-08-09 18:51:50.626630 > mds/Locker.cc: 2190: FAILED assert(in->is_file()) > >

[ceph-users] MDS crash

2016-08-10 Thread Randy Orr
Hello, We have recently had some failures with our MDS processes. We are running Jewel 10.2.1. The two MDS services are on dedicated hosts running in active/standby on Ubuntu 14.04.3 with kernel 3.19.0-56-generic. I have searched the mailing list and open tickets without much luck so far. The fir

Re: [ceph-users] mds crash

2015-05-29 Thread Peter Tiernan
Hi, that appears to have worked. The MDS daemons are now stable and I can read and write correctly. Thanks for the help and have a good day. On 29/05/15 12:25, John Spray wrote: On 29/05/2015 11:41, Peter Tiernan wrote: ok, thanks. I wasn’t aware of this. Should this command fix everything or is

Re: [ceph-users] mds crash

2015-05-29 Thread John Spray
On 29/05/2015 11:41, Peter Tiernan wrote: ok, thanks. I wasn’t aware of this. Should this command fix everything or do I need to delete cephfs and pools and start again: > ceph osd tier cache-mode CachePool writeback It might well work, give it a try. John

Re: [ceph-users] mds crash

2015-05-29 Thread Peter Tiernan
OK, thanks. I wasn’t aware of this. Should this command fix everything or do I need to delete cephfs and pools and start again: > ceph osd tier cache-mode CachePool writeback On 29/05/15 11:37, John Spray wrote: On 29/05/2015 11:34, Peter Tiernan wrote: ok, that's interesting. I had issues

Re: [ceph-users] mds crash

2015-05-29 Thread John Spray
On 29/05/2015 11:34, Peter Tiernan wrote: ok, that's interesting. I had issues before this crash where files were being garbled. I followed what I thought was the correct procedure for an erasure-coded pool with a cache tier: > ceph osd pool create ECpool 800 800 erasure default > ceph osd pool crea

Re: [ceph-users] mds crash

2015-05-29 Thread Peter Tiernan
OK, that's interesting. I had issues before this crash where files were being garbled. I followed what I thought was the correct procedure for an erasure-coded pool with a cache tier: > ceph osd pool create ECpool 800 800 erasure default > ceph osd pool create CachePool 4096 4096 > ceph osd tier add

Re: [ceph-users] mds crash

2015-05-29 Thread John Spray
On 29/05/2015 09:46, Peter Tiernan wrote: -16> 2015-05-29 09:28:23.106541 7f78c53a9700 10 mds.0.objecter in handle_osd_op_reply -15> 2015-05-29 09:28:23.106543 7f78c53a9700 7 mds.0.objecter handle_osd_op_reply 28 ondisk v 0'0 uv 0 in 11.5ce99960 attempt 1 -14> 2015-05-29 09:28:23.106

Re: [ceph-users] mds crash

2015-05-29 Thread Peter Tiernan
Thank you for your reply. I had read the 'mds crashing' thread and I don't think I'm seeing that bug (http://tracker.ceph.com/issues/10449). I have enabled "debug objecter = 10" and here is the full log from starting the mds: http://pastebin.com/dbk0uLYy Here is the last part of the log: -35> 20

Re: [ceph-users] mds crash

2015-05-28 Thread John Spray
(This came up as in-reply-to to the previous "mds crashing" thread -- it's better to start threads with a fresh message) On 28/05/2015 16:58, Peter Tiernan wrote: Hi all, I have been testing cephfs with erasure coded pool and cache tier. I have 3 mds running on the same physical server as

[ceph-users] mds crash

2015-05-28 Thread Peter Tiernan
Hi all, I have been testing cephfs with an erasure-coded pool and a cache tier. I have 3 mds running on the same physical server as 3 mons. The cluster is otherwise in an OK state: rbd is working and all pgs are active+clean. I'm running v0.87.2 giant on all nodes, on Ubuntu 14.04.2. The cluster was

Re: [ceph-users] MDS crash when running a standby one

2014-07-21 Thread John Spray
For the question of OSD failures causing MDS crashes, there are many places where the MDS asserts that OSD operations succeeded (grep the code for "assert(r == 0)") -- we could probably do a better job of handling these, e.g. log the OSD error and respawn rather than assert'ing. John On Sat, Jul

Re: [ceph-users] MDS crash when running a standby one

2014-07-09 Thread Gregory Farnum
It crashed on an OSD reply. What's the output of "ceph -s"? -Greg On Wednesday, July 9, 2014, Florent B wrote: > Hi all, > > I run a Firefly cluster with a MDS server for a while without any problem. > > I would like to setup a second one to get a failover server. > > To minimize downtime in cas

Re: [ceph-users] MDS crash when running a standby one

2014-07-09 Thread Yan, Zheng
There is a memory leak bug in the standby-replay code; your issue is likely caused by it. Yan, Zheng On Wed, Jul 9, 2014 at 4:49 PM, Florent B wrote: > Hi all, > > I run a Firefly cluster with a MDS server for a while without any problem. > > I would like to setup a second one to get a failover server

Re: [ceph-users] MDS crash dump ?

2014-06-11 Thread Gregory Farnum
On Wednesday, June 11, 2014, Florent B wrote: > Hi every one, > > Sometimes my MDS crashes... sometimes after a few hours, sometimes after > a few days. > > I know I could enable debugging and so on to get more information. But > if it crashes after a few days, it generates gigabytes of debugging

Re: [ceph-users] MDS crash when client goes to sleep

2014-04-03 Thread Florent B
I'm not sure I will re-test and tell you ;) On 04/02/2014 04:14 PM, Gregory Farnum wrote: > A *clean* shutdown? That sounds like a different issue; hjcho616's > issue only happens when a client wakes back up again. > -Greg > Software Engineer #42 @ http://inktank.com | http://ceph.com > > > On Wed

Re: [ceph-users] MDS crash when client goes to sleep

2014-04-02 Thread Gregory Farnum
A *clean* shutdown? That sounds like a different issue; hjcho616's issue only happens when a client wakes back up again. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Wed, Apr 2, 2014 at 6:34 AM, Florent B wrote: > Can someone confirm that this issue is also in Emperor re

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-31 Thread Gregory Farnum
Yes, Zheng's fix for the MDS crash is in current mainline and will be in the next Firefly RC release. Sage, is there something else we can/should be doing when a client goes to sleep that we aren't already? (ie, flushing out all dirty data or something and disconnecting?) -Greg Software Engineer #

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-27 Thread hjcho616
wait a bit more. Regards, Hong From: Gregory Farnum To: hjcho616 Cc: Mohd Bazli Ab Karim ; "Yan, Zheng" ; Sage Weil ; "ceph-users@lists.ceph.com" Sent: Tuesday, March 25, 2014 11:05 AM Subject: Re: [ceph-users] MDS crash when client goe

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-25 Thread hjcho616
ts.ceph.com" Sent: Tuesday, March 25, 2014 12:59 PM Subject: Re: [ceph-users] MDS crash when client goes to sleep On Tue, Mar 25, 2014 at 9:56 AM, hjcho616 wrote: > I am merely putting the client to sleep and waking it up.  When it is up, > running ls on the mounted directory.  A

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-25 Thread Gregory Farnum
On Tue, Mar 25, 2014 at 9:56 AM, hjcho616 wrote: > I am merely putting the client to sleep and waking it up. When it is up, > running ls on the mounted directory. As far as I am concerned at very high > level I am doing the same thing. All are running 3.13 kernel Debian > provided. > > When tha

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-25 Thread hjcho616
Regards, Hong From: Gregory Farnum To: hjcho616 Cc: Mohd Bazli Ab Karim ; "Yan, Zheng" ; Sage Weil ; "ceph-users@lists.ceph.com" Sent: Tuesday, March 25, 2014 11:05 AM Subject: Re: [ceph-users] MDS crash when client goes to sleep On Mon, Mar 24, 2014 at 6:26

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-25 Thread Gregory Farnum
On Mon, Mar 24, 2014 at 6:26 PM, hjcho616 wrote: > I tried the patch twice. First time, it worked. There was no issue. > Connected back to MDS and was happily running. All three MDS daemons were > running ok. > > Second time though... all three daemons were alive. Health was reported OK. > Howev

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-23 Thread Mohd Bazli Ab Karim
steps, just in case if it happens again in future. Many thanks. Bazli -Original Message- From: Yan, Zheng [mailto:uker...@gmail.com] Sent: Sunday, March 23, 2014 2:53 PM To: Sage Weil Cc: Mohd Bazli Ab Karim; ceph-users@lists.ceph.com Subject: Re: [ceph-users] MDS crash when client goes to slee

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-22 Thread Yan, Zheng
uld it able to mount to the filesystem now? It >> > looks >> > similar to our case, http://www.spinics.net/lists/ceph-devel/msg18395.html >> > >> > However, you need to collect some logs to confirm this. >> > >> > >> > >> > Thanks. >

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-22 Thread Sage Weil
's the client now? Would it be able to mount the filesystem now? It looks > > similar to our case, http://www.spinics.net/lists/ceph-devel/msg18395.html > > > > However, you need to collect some logs to confirm this. > > > > > > > > Thanks. > > > > >

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-22 Thread Yan, Zheng
ou need to collect some logs to confirm this. > > > > Thanks. > > > > > > From: hjcho616 [mailto:hjcho...@yahoo.com] > Sent: Friday, March 21, 2014 2:30 PM > > > To: Luke Jing Yuan > Cc: Mohd Bazli Ab Karim; ceph-users@lists.ceph.com > Subject: Re: [

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-21 Thread Mohd Bazli Ab Karim
ch 21, 2014 2:30 PM To: Luke Jing Yuan Cc: Mohd Bazli Ab Karim; ceph-users@lists.ceph.com Subject: Re: [ceph-users] MDS crash when client goes to sleep Luke, Not sure what flapping ceph-mds daemon mean, but when I connected to MDS when this happened there no longer was any process with ceph-mds w

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread hjcho616
im ; "ceph-users@lists.ceph.com" Sent: Friday, March 21, 2014 1:17 AM Subject: RE: [ceph-users] MDS crash when client goes to sleep Hi Hong, That's interesting, for Mr. Bazli and I, we ended with MDS stuck in (up:replay) and a flapping ceph-mds daemon, but then again we are usin

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread Luke Jing Yuan
ds, Luke From: hjcho616 [mailto:hjcho...@yahoo.com] Sent: Friday, 21 March, 2014 12:09 PM To: Luke Jing Yuan Cc: Mohd Bazli Ab Karim; ceph-users@lists.ceph.com Subject: Re: [ceph-users] MDS crash when client goes to sleep Nope just these segfaults. [149884.709608] ceph-mds[17366]: segfault at 200 ip 00

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread hjcho616
PM Subject: Re: [ceph-users] MDS crash when client goes to sleep Did you see any messages in dmesg saying ceph-mds respawning or stuff like that? Regards, Luke On Mar 21, 2014, at 11:09 AM, "hjcho616" wrote: On client, I was no longer able to access the filesystem. It would

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread Luke Jing Yuan
@lists.ceph.com" Sent: Thursday, March 20, 2014 9:40 PM Subject: RE: [ceph-users] MDS crash when client goes to sleep Hi Hong, May I know what has happened to your MDS once it crashed? Was it able to recover from replay

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread hjcho616
: hjcho616 ; "ceph-users@lists.ceph.com" Sent: Thursday, March 20, 2014 9:40 PM Subject: RE: [ceph-users] MDS crash when client goes to sleep Hi Hong, May I know what has happened to your MDS once it crashed? Was it able to recover from replay? We are also facing this issue and I am int

Re: [ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread Mohd Bazli Ab Karim
hjcho616 Sent: Friday, March 21, 2014 10:29 AM To: ceph-users@lists.ceph.com Subject: [ceph-users] MDS crash when client goes to sleep When CephFS is mounted on a client and when client decides to go to sleep, MDS segfaults. Has anyone seen this? Below is a part of MDS log. This happened in

[ceph-users] MDS crash when client goes to sleep

2014-03-20 Thread hjcho616
When CephFS is mounted on a client and when client decides to go to sleep, MDS segfaults.  Has anyone seen this?  Below is a part of MDS log.  This happened in emperor and recent 0.77 release.  I am running Debian Wheezy with testing kernels 3.13.  What can I do to not crash the whole system if

Re: [ceph-users] MDS Crash on recovery (0.60)

2013-04-30 Thread Mike Bryant
Ah, looks like it was. I've got a gitbuilder build of the mds running and it seems to be working. Thanks! Mike On 30 April 2013 16:56, Kevin Decherf wrote: > On Tue, Apr 30, 2013 at 03:10:00PM +0100, Mike Bryant wrote: >> All of my MDS daemons have begun crashing when I start them up, and >> the

Re: [ceph-users] MDS Crash on recovery (0.60)

2013-04-30 Thread Kevin Decherf
On Tue, Apr 30, 2013 at 03:10:00PM +0100, Mike Bryant wrote: > All of my MDS daemons have begun crashing when I start them up, and > they try to begin recovery. Hi, It seems to be the same bug as #4644 http://tracker.ceph.com/issues/4644 -- Kevin Decherf - @Kdecherf GPG C610 FE73 E706 F968 612B

[ceph-users] MDS Crash on recovery (0.60)

2013-04-30 Thread Mike Bryant
All of my MDS daemons have begun crashing when I start them up, and they try to begin recovery. Log attached Mike -- Mike Bryant | Systems Administrator | Ocado Technology mike.bry...@ocado.com | 01707 382148 | www.ocado.com

Re: [ceph-users] mds crash

2013-02-25 Thread Gregory Farnum
On Mon, Feb 25, 2013 at 8:44 AM, Sage Weil wrote: > On Mon, 25 Feb 2013, Steffen Thorhauer wrote: >> Hi, >> I've found out, what I make wrong: stop the cluster and forget a client, >> which as mounting the cephfs. I simply forget the client. >> With a >> ceph mds newfs 0 1 --yes-i-really-mean-it

Re: [ceph-users] mds crash

2013-02-25 Thread Sage Weil
On Mon, 25 Feb 2013, Steffen Thorhauer wrote: > Hi, > I've found out, what I make wrong: stop the cluster and forget a client, > which as mounting the cephfs. I simply forget the client. > With a > ceph mds newfs 0 1 --yes-i-really-mean-it Oh... the 'newfs' resets the MDSMap in the monitor, but

Re: [ceph-users] mds crash

2013-02-25 Thread Steffen Thorhauer
Hi, I've found out what I did wrong: I stopped the cluster and forgot about a client that still had the cephfs mounted. I simply forgot the client. With a ceph mds newfs 0 1 --yes-i-really-mean-it (I don't really know what the parameters are), the mds is restarting with an empty fs. I tried the patch version