Re: Fwd: monitor crashing

2015-10-13 Thread Sage Weil
On Tue, 13 Oct 2015, Luis Periquito wrote:
> the store.db dir is 3.4GB big :(
> 
> can I do it on my side?

Never mind, I was able to reproduce it from the bugzilla.  I've pushed a
branch, wip-ecpool-hammer.  Not sure which distro you're on, but packages
will appear at gitbuilder.ceph.com in 30-45 minutes.  This fixes the mon
crash, which will let you delete the pool.  I suggest stopping the OSDs
before starting the mon with this build, or else they might get pg create
messages and crash too.  Once the pool is removed you can start them
again.  They shouldn't need to be upgraded.
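
As a rough outline of that order of operations, here is a dry-run sketch.
It only prints the steps and runs nothing against a cluster. The pool name
is a placeholder (the thread never names the pool), and stopping or starting
daemons depends on the init system, so those steps are left as descriptions
rather than commands. The pool delete line uses the usual ceph CLI form.

```shell
#!/bin/sh
# Dry-run sketch: prints the recovery steps in order, executes none of them.
# POOL is a hypothetical placeholder, not a name taken from this thread.
POOL="${POOL:-broken-lrc-pool}"

recovery_plan() {
    echo "1. stop all ceph-osd daemons (init-system specific)"
    echo "2. install the wip-ecpool-hammer packages on the monitor hosts"
    echo "3. start the ceph-mon daemons and wait for quorum"
    echo "4. ceph osd pool delete $POOL $POOL --yes-i-really-really-mean-it"
    echo "5. start the ceph-osd daemons again (no OSD upgrade expected)"
}

recovery_plan
```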

Note that the latest hammer doesn't let you create the pool at all because 
it fails the crush safety check (I had to disable the check to reproduce 
this), so that's good at least!

sage

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fwd: monitor crashing

2015-10-13 Thread Loic Dachary
https://github.com/ceph/ceph/compare/hammer...wip-ecpool-hammer

In order to bypass the crush verification, you could:

ceph tell mon.* injectargs --crushtool /bin/true
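
As far as I understand it, this works because the mon runs the program named
by the crushtool option to test a new crush map and rejects the map when that
program exits non-zero. /bin/true ignores its arguments and always exits 0,
so the check always passes. The sketch below shows only that exit-status
behaviour. The arguments are illustrative, not the ones the mon really
passes. Note that the injectargs command needs a running mon.

```shell
#!/bin/sh
# check_status PROGRAM: run PROGRAM with some illustrative arguments and
# print its exit status, which is what a caller would treat as pass/fail.
check_status() {
    if "$1" -i /nonexistent/crushmap --test >/dev/null 2>&1; then
        echo 0
    else
        echo "$?"
    fi
}

check_status /bin/true    # prints 0: the "check" passes whatever the input
check_status /bin/false   # prints 1: the "check" always fails
```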

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre





Re: Fwd: monitor crashing

2015-10-13 Thread Sage Weil
On Tue, 13 Oct 2015, Loic Dachary wrote:
> https://github.com/ceph/ceph/compare/hammer...wip-ecpool-hammer
> 
> In order to bypass the crush verification, you could:
> 
> ceph tell mon.* injectargs --crushtool /bin/true

Ah, good trick!

http://tracker.ceph.com/issues/13477

is the ticket, and my fix for master is

https://github.com/ceph/ceph/pull/6246

sage


Re: Fwd: monitor crashing

2015-10-13 Thread Sage Weil
On Tue, 13 Oct 2015, Luis Periquito wrote:
> Any ideas? I'm growing desperate :(
> 
> I've tried compiling from source, and including
> https://github.com/ceph/ceph/pull/5276, but it still crashes on boot
> of the ceph-mon

If you can email a (link to a) tarball of your mon data directory, I'd love
to extract the osdmap and see why crush is crashing.  It's obviously not
supposed to do that (even with a bad rule).  You can also use
the ceph-post-file utility.
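
A minimal sketch of the tarball route, assuming the default mon data location
/var/lib/ceph/mon/<cluster>-<id> and a hypothetical mon id "a"; adjust both
for your cluster. The helper only wraps tar, and the ceph-post-file call is
shown as a comment. The mon should not be running while its store is copied.

```shell
#!/bin/sh
# pack_mon_dir DIR OUT: archive a monitor data directory into a gzipped
# tarball, keeping only the last path component inside the archive.
pack_mon_dir() {
    dir="$1"
    out="$2"
    tar -czf "$out" -C "$(dirname "$dir")" "$(basename "$dir")"
}

# Example, not executed here (path and mon id are assumptions):
#   pack_mon_dir /var/lib/ceph/mon/ceph-a /tmp/mon-a.tar.gz
#   ceph-post-file -d "mon crash on boot, invalid LRC pool" /tmp/mon-a.tar.gz
```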

Thanks!
sage




Re: Fwd: monitor crashing

2015-10-13 Thread Luis Periquito
The store.db dir is 3.4 GB :(

Can I do the extraction on my side instead?

On Tue, Oct 13, 2015 at 2:25 PM, Sage Weil wrote:
> On Tue, 13 Oct 2015, Luis Periquito wrote:
>> Any ideas? I'm growing desperate :(
>>
>> I've tried compiling from source, and including
>> https://github.com/ceph/ceph/pull/5276, but it still crashes on boot
>> of the ceph-mon
>
> If you can email a (link to a) tarball of your mon data directory I'd love
> to extract the osdmap and see why crush is crashing.. it's obviously not
> supposed to do that (even with a bad rule).  You can also use
> the ceph-post-file utility.
>
> Thanks!
> sage


Fwd: monitor crashing

2015-10-13 Thread Luis Periquito
Any ideas? I'm growing desperate :(

I've tried compiling from source with
https://github.com/ceph/ceph/pull/5276 included, but ceph-mon still
crashes on boot.

-- Forwarded message --
From: Luis Periquito 
Date: Tue, Oct 13, 2015 at 12:26 PM
Subject: Re: monitor crashing
To: Ceph Users 


I'm currently running Hammer (0.94.3). I created an invalid LRC profile
(a typo in l=: it should have been l=4 but was l=3, so now I don't have
enough distinct ruleset-locality buckets) and created a pool with it. Is
there any way to delete this pool? Remember, I can't start the ceph-mon...
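
To illustrate why a smaller l can need more locality buckets: in the simple
LRC profile form, k+m has to be a multiple of l, and the chunks are split
into (k+m)/l groups, each with one extra local parity chunk and each placed
in a different ruleset-locality bucket. The thread does not give k and m, so
k=8 and m=4 below are purely hypothetical values chosen to show the effect.

```shell
#!/bin/sh
# lrc_layout K M L: print the number of locality groups and total chunks
# for the simple LRC form, or fail if K+M is not a multiple of L.
lrc_layout() {
    k="$1"
    m="$2"
    l="$3"
    if [ $(( (k + m) % l )) -ne 0 ]; then
        echo "k+m must be a multiple of l" >&2
        return 1
    fi
    groups=$(( (k + m) / l ))
    echo "groups=$groups chunks=$(( k + m + groups ))"
}

lrc_layout 8 4 4   # groups=3 chunks=15 (the intended l, hypothetical k and m)
lrc_layout 8 4 3   # groups=4 chunks=16 (the typo needs one more bucket)
```

With only three buckets of the locality type, the l=3 layout in this example
cannot be mapped, which matches the symptom described above.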

On Tue, Oct 13, 2015 at 11:56 AM, Luis Periquito wrote:
> It seems I've hit this bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1231630
>
> is there any way I can recover this cluster? It worked in our test
> cluster, but crashed the production one...


Re: Fwd: monitor crashing

2015-10-13 Thread Sage Weil
On Tue, 13 Oct 2015, Luis Periquito wrote:
> Hi Sage,
> 
> awesome help.
> 
> Sorry for not telling before, but I'm running 2xMON in precise and
> 1xMON in trusty. Looking at the status page
> (http://ceph.com/gitbuilder.cgi) it seems the precise build is
> failing... Can you have a look?

I've repushed the branch, this time cherry-picking the master fix.  Let me 
know if you run into other problems!

Thanks-
sage


Re: Fwd: monitor crashing

2015-10-13 Thread Luis Periquito
Hi Sage,

awesome help.

Sorry for not mentioning it earlier, but I'm running two MONs on precise
and one MON on trusty. Looking at the status page
(http://ceph.com/gitbuilder.cgi), it seems the precise build is
failing. Can you have a look?

thanks,


On Tue, Oct 13, 2015 at 2:59 PM, Sage Weil wrote:
> On Tue, 13 Oct 2015, Loic Dachary wrote:
>> https://github.com/ceph/ceph/compare/hammer...wip-ecpool-hammer
>>
>> In order to bypass the crush verification, you could:
>>
>> ceph tell mon.* injectargs --crushtool /bin/true
>
> Ah, good trick!
>
> http://tracker.ceph.com/issues/13477
>
> is the ticket, and my fix for master is
>
> https://github.com/ceph/ceph/pull/6246
>
> sage


Re: Fwd: monitor crashing

2015-10-13 Thread Luis Periquito
Thanks for all the help Sage. The cluster is now back to life with
your awesome patch.

On Tue, Oct 13, 2015 at 3:35 PM, Sage Weil wrote:
> On Tue, 13 Oct 2015, Luis Periquito wrote:
>> Hi Sage,
>>
>> awesome help.
>>
>> Sorry for not telling before, but I'm running 2xMON in precise and
>> 1xMON in trusty. Looking at the status page
>> (http://ceph.com/gitbuilder.cgi) it seems the precise build is
>> failing... Can you have a look?
>
> I've repushed the branch, this time cherry-picking the master fix.  Let me
> know if you run into other problems!
>
> Thanks-
> sage