Hi,

I did it; the log files are available here: https://blondeau.users.greyc.fr/cephlog/debug20/

The OSD log files are really big, around 80 MB each.

After starting osd.20, some other OSDs crashed: the number of OSDs up dropped from 31 to 16. I noticed that after this the number of down+peering PGs decreased from 367 to 248. Is that "normal"? Maybe it's temporary, while the cluster re-checks all the PGs?

Regards
Pierre

On 02/07/2014 19:16, Samuel Just wrote:
You should add

debug osd = 20
debug filestore = 20
debug ms = 1

to the [osd] section of the ceph.conf and restart the osds.  I'd like
all three logs if possible.
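For reference, a minimal sketch of what the resulting [osd] section could look like (the debug lines are exactly the ones listed above; any other settings already in your ceph.conf stay as they are):

```ini
[osd]
    debug osd = 20
    debug filestore = 20
    debug ms = 1
```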

Thanks
-Sam

On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
<[email protected]> wrote:
Yes, but how do I do that?

With a command like this?

ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'
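A sketch, not from the original thread: print the injectargs command for each affected OSD (ids 20 and 23 are the ones mentioned in this thread; adjust to your cluster). Review the output, then pipe it to sh to actually run the commands.

```shell
# Print one "ceph tell ... injectargs" command per OSD id for review;
# the ids below are assumptions taken from this thread.
for id in 20 23; do
  echo "ceph tell osd.$id injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'"
done
```

Note that injectargs only changes the running daemons; settings in ceph.conf are what survive a restart.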

By modifying /etc/ceph/ceph.conf? That file is very minimal on my systems because I use udev detection.

Once I have made these changes, do you want all three log files or only osd.20's?

Thank you so much for the help

Regards
Pierre

On 01/07/2014 23:51, Samuel Just wrote:

Can you reproduce with
debug osd = 20
debug filestore = 20
debug ms = 1
?
-Sam

On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
<[email protected]> wrote:

Hi,

I'm attaching:
   - osd.20, one of the OSDs that I found to make other OSDs crash.
   - osd.23, one of the OSDs that crashes when I start osd.20.
   - mds, one of my MDSes.

I truncated the log files because they were too big, but everything is here:
https://blondeau.users.greyc.fr/cephlog/

Regards

On 30/06/2014 17:35, Gregory Farnum wrote:

What's the backtrace from the crashing OSDs?

Keep in mind that as a dev release, it's generally best not to upgrade
to unnamed versions like 0.82 (but it's probably too late to go back
now).

I will remember that next time ;)

-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
<[email protected]> wrote:
Hi,

After the upgrade to Firefly, I had some PGs stuck in the peering state.
I saw that 0.82 was out, so I tried upgrading to solve my problem.

My three MDSes crash, and some OSDs trigger a chain reaction that kills
other OSDs.
I think my MDSes will not start because their metadata are stored on the OSDs.

I have 36 OSDs across three servers, and I identified 5 OSDs which make the
others crash. If I don't start those, the cluster goes into a recovering
state with 31 OSDs, but I have 378 PGs in down+peering state.

What can I do? Do you want more information (OS, crash logs, etc.)?
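Not part of the original message, but a sketch of the kind of information that usually helps with stuck down+peering PGs: the commands below are all standard ceph CLI subcommands; the snippet only prints the checklist, so run the commands themselves on a node with the admin keyring.

```shell
# Print a checklist of standard ceph diagnostic commands; nothing here
# touches the cluster, it just echoes the list for review.
checklist="ceph -s                      # overall cluster status
ceph health detail           # which PGs are down/peering and why
ceph osd tree                # which OSDs are up/down, and on which host
ceph pg dump_stuck inactive  # PGs stuck in an inactive (e.g. peering) state"
echo "$checklist"
```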

Regards

--
----------------------------------------------
Pierre BLONDEAU
Systems & Network Administrator
Université de Caen
GREYC Laboratory, Computer Science Department

tel     : 02 31 56 75 42
office  : Campus 2, Science 3, 406
----------------------------------------------


_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
