[Please keep all mail on the list.]  

Hmm, that OSD log doesn't show a crash. I thought you said they were all 
crashing? Do they come up okay when you turn them back on again?
-Greg

Software Engineer #42 @ http://inktank.com | http://ceph.com
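
Whether the OSDs actually crashed can be checked from the signal-handler report each ceph-osd writes to its log. A minimal sketch for pulling that block out of a log file; the log path and the "Caught signal" marker are assumptions based on default Ceph logging, not details taken from the attached logs:

```shell
# Hedged sketch: extract the crash report from an OSD log, if one exists.
# The default log path and the "Caught signal" marker string are assumptions.
log="${1:-/var/log/ceph/ceph-osd.0.log}"

# The report starts at the "Caught signal" line and the numbered backtrace
# frames run until the next blank line; print just that block.
sed -n '/Caught signal/,/^$/p' "$log"
```

If this prints nothing, the daemon most likely exited without hitting the signal handler, which matches a log that "doesn't show a crash".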


On Wednesday, April 10, 2013 at 9:27 AM, Witalij Poljatchek wrote:

> Attached are the log files.
>  
> Thank you! :)
>  
> On 04/10/2013 06:06 PM, Gregory Farnum wrote:
> > [Re-adding the list.]
> >  
> > When the OSDs crash they will print out to their log a short description of 
> > what happened, with a bunch of function names.
> >  
> > Unfortunately the problem you've run into is probably non-trivial to solve 
> > as you've introduced a bit of a weird situation into the permanent record 
> > that your OSDs need to process. I've created a bug 
> > (http://tracker.ceph.com/issues/4699), you can follow that. :)
> > -Greg
> > Software Engineer #42 @ http://inktank.com | http://ceph.com
> >  
> >  
> > On Wednesday, April 10, 2013 at 8:57 AM, Witalij Poljatchek wrote:
> >  
> > > There is no data to salvage.
> > >  
> > > They are plain OSDs.
> > >  
> > >  
> > > What do you mean by a backtrace? An strace of the ceph-osd process?
> > >  
> > >  
> > > It is easy to reproduce:
> > >  
> > > Set up a plain cluster,
> > >  
> > > and then run:
> > >  
> > > ceph osd pool set rbd size 0
> > >  
> > > After a minute, run:
> > >  
> > > ceph osd pool set rbd size 2
> > >  
> > > That's all.
> > >  
> > >  
> > >  
> > >  
> > >  
> > >  
> > > On 04/10/2013 05:24 PM, Gregory Farnum wrote:
> > > > Sounds like they aren't handling the transition very well when trying 
> > > > to calculate old OSDs which might have held the PG. Are you trying to 
> > > > salvage the data that was in it, or can you throw it away?
> > > > Can you post the backtrace they're producing?
> > > > -Greg
> > > > Software Engineer #42 @ http://inktank.com | http://ceph.com
> > > >  
> > > >  
> > > > On Wednesday, April 10, 2013 at 3:59 AM, Witalij Poljatchek wrote:
> > > >  
> > > > > Hello,
> > > > >  
> > > > > I need help solving a segfault on all OSDs in my test cluster.
> > > > >  
> > > > >  
> > > > > Set up Ceph from scratch:
> > > > > service ceph -a start
> > > > >  
> > > > > ceph -w
> > > > > health HEALTH_OK
> > > > > monmap e1: 3 mons at 
> > > > > {1=10.200.20.1:6789/0,2=10.200.20.2:6789/0,3=10.200.20.3:6789/0}, 
> > > > > election epoch 6, quorum 0,1,2 1,2,3
> > > > > osdmap e5: 4 osds: 4 up, 4 in
> > > > > pgmap v305: 960 pgs: 960 active+clean; 0 bytes data, 40147 MB used, 
> > > > > 26667 GB / 26706 GB avail
> > > > > mdsmap e1: 0/0/1 up
> > > > >  
> > > > >  
> > > > > If I set the replica size to 0 (I know this makes no sense):
> > > > > ceph osd pool set rbd size 0
> > > > > and then back to 2:
> > > > > ceph osd pool set rbd size 2
> > > > >  
> > > > > then the ceph-osd process crashes with a segfault on all OSDs.
> > > > >  
> > > > > If I stop the MON daemons, I can start the OSDs, but as soon as I 
> > > > > start the MONs again, all the OSDs die again.
> > > > >  
> > > > >  
> > > > >  
> > > > > How can I repair this behavior?
> > > > >  
> > > > >  
> > > > >  
> > > > >  
> > > > >  
> > > > > My setup (nothing special):
> > > > >  
> > > > > Centos 6.3
> > > > >  
> > > > > Kernel: 3.8.3-1.el6.elrepo.x86_64
> > > > >  
> > > > > ceph-fuse-0.56.4-0.el6.x86_64
> > > > > ceph-test-0.56.4-0.el6.x86_64
> > > > > libcephfs1-0.56.4-0.el6.x86_64
> > > > > ceph-0.56.4-0.el6.x86_64
> > > > > ceph-release-1-0.el6.noarch
> > > > >  
> > > > > cat /etc/ceph/ceph.conf
> > > > >  
> > > > > [global]
> > > > > auth cluster required = none
> > > > > auth service required = none
> > > > > auth client required = none
> > > > > keyring = /etc/ceph/$name.keyring
> > > > > [mon]
> > > > > [mds]
> > > > > [osd]
> > > > > osd journal size = 10000
> > > > > [mon.1]
> > > > > host = ceph-mon1
> > > > > mon addr = 10.200.20.1:6789
> > > > > [mon.2]
> > > > > host = ceph-mon2
> > > > > mon addr = 10.200.20.2:6789
> > > > > [mon.3]
> > > > > host = ceph-mon3
> > > > > mon addr = 10.200.20.3:6789
> > > > >  
> > > > > [osd.0]
> > > > > host = ceph-osd1
> > > > > [osd.1]
> > > > > host = ceph-osd2
> > > > > [osd.2]
> > > > > host = ceph-osd3
> > > > > [osd.3]
> > > > > host = ceph-osd4
> > > > >  
> > > > > [mds.a]
> > > > > host = ceph-mds1
> > > > > [mds.b]
> > > > > host = ceph-mds2
> > > > > [mds.c]
> > > > > host = ceph-mds3
> > > > >  
> > > > > Thanks much.
> > > > > --
> > > > > AIXIT GmbH - Witalij Poljatchek
> > > > > (T) +49 69 203 4709-13 - (F) +49 69 203 470 979
> > > > > [email protected] - http://www.aixit.com
> > > > >  
> > > > > AIXIT GmbH
> > > > > Strahlenbergerstr. 14
> > > > > 63067 Offenbach am Main
> > > > > (T) +49 69 203 470 913
> > > > >  
> > > > > Amtsgericht Offenbach, HRB 43953
> > > > > Geschäftsführer: Friedhelm Heyer, Holger Grauer
> > > >  
> > >  
> > >  
> > >  
> > >  
> > >  
> >  
>  
>  
>  
>  
>  
> Attachments:  
> - ceph-osd.0.log
>  
> - ceph-mon.1.log
>  
> - ceph.log
>  



_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
