Re: [ceph-users] Hammer: PGs stuck creating

2016-06-30 Thread Brad Hubbard
On Thu, Jun 30, 2016 at 11:34 PM, Brian Felton  wrote:
> Sure.  Here's a complete query dump of one of the 30 pgs:
> http://pastebin.com/NFSYTbUP

Looking at that, something immediately stands out.

There are a lot of entries in "past_intervals" like this one:

"past_intervals": [
 {
 "first": 18522,
 "last": 18523,
 "maybe_went_rw": 1,
 "up": [
 2147483647,
...
"acting": [
2147483647,
2147483647,
2147483647,
2147483647
],
"primary": -1,
"up_primary": -1

That value, 2147483647 (0x7fffffff), is defined in src/crush/crush.h like so:

#define CRUSH_ITEM_NONE   0x7fffffff  /* no result */

In other words, CRUSH returned "no result" for every slot in those up and
acting sets, so it looks like this could be related to a bad crush rule (or
at least a rule that was previously unsatisfiable).

Could you share the output from the following?

$ ceph osd crush rule ls

And then, for each rule listed by the above command:

$ ceph osd crush rule dump [rule_name]
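
If it's easier, something like this should dump them all in one go (untested;
assumes none of the rule names contain spaces):

$ for r in $(ceph osd crush rule ls | tr -d '[]",'); do ceph osd crush rule dump "$r"; done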

I'd then dump out the crushmap and test it for bad mappings with the
commands listed here:

http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#crush-gives-up-too-soon
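
Roughly, that boils down to something along these lines (rule id, paths, and
the x range below are placeholders; num-rep would be 4 for your k=3 m=1 pool):

$ ceph osd getcrushmap -o /tmp/crushmap
$ crushtool -i /tmp/crushmap --test --show-bad-mappings \
      --rule 1 --num-rep 4 --min-x 0 --max-x 1000

Any "bad mapping" lines in that output are inputs the rule couldn't map to a
full set of 4 OSDs.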

That should hopefully give some insight.

HTH,
Brad

>
> Brian
>
> On Wed, Jun 29, 2016 at 6:25 PM, Brad Hubbard  wrote:
>>
>> On Thu, Jun 30, 2016 at 3:22 AM, Brian Felton  wrote:
>> > Greetings,
>> >
>> > I have a lab cluster running Hammer 0.94.6 and being used exclusively
>> > for
>> > object storage.  The cluster consists of four servers running 60 6TB
>> > OSDs
>> > each.  The main .rgw.buckets pool is using k=3 m=1 erasure coding and
>> > contains 8192 placement groups.
>> >
>> > Last week, one of our guys marked out and removed one OSD from each of three
>> > of
>> > the four servers in the cluster, which resulted in some general badness
>> > (the
>> > disks were wiped post-removal, so the data are gone).  After a proper
>> > education in why this is a Bad Thing, we got the OSDs added back.  When
>> > all
>> > was said and done, we had 30 pgs that were stuck incomplete, and no
>> > amount
>> > of magic has been able to get them to recover.  From reviewing the data,
>> > we
>> > knew that all of these pgs contained at least 2 of the removed OSDs; I
>> > understand and accept that the data are gone, and that's not a concern
>> > (yay
>> > lab).
>> >
>> > Here are the things I've tried:
>> >
>> > - Restarted all OSDs
>> > - Stopped all OSDs, removed all OSDs from the crush map, and started
>> > everything back up
>> > - Executed a 'ceph pg force_create_pg <pgid>' for each of the 30 stuck pgs
>> > - Executed a 'ceph pg send_pg_creates' to get the ball rolling on
>> > creates
>> > - Executed several 'ceph pg <pgid> query' commands to ensure we were
>> > referencing valid OSDs after the 'force_create_pg'
>> > - Ensured those OSDs were really removed (e.g. 'ceph auth del', 'ceph
>> > osd
>> > crush remove', and 'ceph osd rm')
>>
>> Can you share some of the pg query output?
>>
>> >
>> > At this point, I've got the same 30 pgs that are stuck creating.  I've
>> > run
>> > out of ideas for getting this back to a healthy state.  In reviewing the
>> > other posts on the mailing list, the most common culprit was a bad OSD
>> > in
>> > the crush map, but I'm all but certain that isn't what's hitting us
>> > here.
>> > Normally, being the lab, I'd consider nuking the .rgw.buckets pool and
>> > starting from scratch, but we've recently spent a lot of time pulling
>> > 140TB
>> > of data into this cluster for some performance and recovery tests, and
>> > I'd
>> > prefer not to have to start that process again.  I am willing to
>> > entertain
>> > most any other idea irrespective of how destructive it is to these PGs,
>> > so
>> > long as I don't have to lose the rest of the data in the pool.
>> >
>> > Many thanks in advance for any assistance here.
>> >
>> > Brian Felton
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>> --
>> Cheers,
>> Brad
>
>



-- 
Cheers,
Brad


Re: [ceph-users] Hammer: PGs stuck creating

2016-06-30 Thread Brian Felton
Sure.  Here's a complete query dump of one of the 30 pgs:
http://pastebin.com/NFSYTbUP

Brian

On Wed, Jun 29, 2016 at 6:25 PM, Brad Hubbard  wrote:

> On Thu, Jun 30, 2016 at 3:22 AM, Brian Felton  wrote:
> > Greetings,
> >
> > I have a lab cluster running Hammer 0.94.6 and being used exclusively for
> > object storage.  The cluster consists of four servers running 60 6TB OSDs
> > each.  The main .rgw.buckets pool is using k=3 m=1 erasure coding and
> > contains 8192 placement groups.
> >
> > Last week, one of our guys marked out and removed one OSD from each of three
> of
> > the four servers in the cluster, which resulted in some general badness
> (the
> > disks were wiped post-removal, so the data are gone).  After a proper
> > education in why this is a Bad Thing, we got the OSDs added back.  When
> all
> > was said and done, we had 30 pgs that were stuck incomplete, and no
> amount
> > of magic has been able to get them to recover.  From reviewing the data,
> we
> > knew that all of these pgs contained at least 2 of the removed OSDs; I
> > understand and accept that the data are gone, and that's not a concern
> (yay
> > lab).
> >
> > Here are the things I've tried:
> >
> > - Restarted all OSDs
> > - Stopped all OSDs, removed all OSDs from the crush map, and started
> > everything back up
> > - Executed a 'ceph pg force_create_pg <pgid>' for each of the 30 stuck pgs
> > - Executed a 'ceph pg send_pg_creates' to get the ball rolling on creates
> > - Executed several 'ceph pg <pgid> query' commands to ensure we were
> > referencing valid OSDs after the 'force_create_pg'
> > - Ensured those OSDs were really removed (e.g. 'ceph auth del', 'ceph osd
> > crush remove', and 'ceph osd rm')
>
> Can you share some of the pg query output?
>
> >
> > At this point, I've got the same 30 pgs that are stuck creating.  I've
> run
> > out of ideas for getting this back to a healthy state.  In reviewing the
> > other posts on the mailing list, the most common culprit was a bad OSD
> in
> > the crush map, but I'm all but certain that isn't what's hitting us here.
> > Normally, being the lab, I'd consider nuking the .rgw.buckets pool and
> > starting from scratch, but we've recently spent a lot of time pulling
> 140TB
> > of data into this cluster for some performance and recovery tests, and
> I'd
> > prefer not to have to start that process again.  I am willing to
> entertain
> > most any other idea irrespective of how destructive it is to these PGs,
> so
> > long as I don't have to lose the rest of the data in the pool.
> >
> > Many thanks in advance for any assistance here.
> >
> > Brian Felton
> >
> >
> >
> >
> >
>
>
>
> --
> Cheers,
> Brad
>


Re: [ceph-users] Hammer: PGs stuck creating

2016-06-29 Thread Brad Hubbard
On Thu, Jun 30, 2016 at 3:22 AM, Brian Felton  wrote:
> Greetings,
>
> I have a lab cluster running Hammer 0.94.6 and being used exclusively for
> object storage.  The cluster consists of four servers running 60 6TB OSDs
> each.  The main .rgw.buckets pool is using k=3 m=1 erasure coding and
> contains 8192 placement groups.
>
> Last week, one of our guys marked out and removed one OSD from each of three of
> the four servers in the cluster, which resulted in some general badness (the
> disks were wiped post-removal, so the data are gone).  After a proper
> education in why this is a Bad Thing, we got the OSDs added back.  When all
> was said and done, we had 30 pgs that were stuck incomplete, and no amount
> of magic has been able to get them to recover.  From reviewing the data, we
> knew that all of these pgs contained at least 2 of the removed OSDs; I
> understand and accept that the data are gone, and that's not a concern (yay
> lab).
>
> Here are the things I've tried:
>
> - Restarted all OSDs
> - Stopped all OSDs, removed all OSDs from the crush map, and started
> everything back up
> - Executed a 'ceph pg force_create_pg <pgid>' for each of the 30 stuck pgs
> - Executed a 'ceph pg send_pg_creates' to get the ball rolling on creates
> - Executed several 'ceph pg <pgid> query' commands to ensure we were
> referencing valid OSDs after the 'force_create_pg'
> - Ensured those OSDs were really removed (e.g. 'ceph auth del', 'ceph osd
> crush remove', and 'ceph osd rm')

Can you share some of the pg query output?

>
> At this point, I've got the same 30 pgs that are stuck creating.  I've run
> out of ideas for getting this back to a healthy state.  In reviewing the
> other posts on the mailing list, the most common culprit was a bad OSD in
> the crush map, but I'm all but certain that isn't what's hitting us here.
> Normally, being the lab, I'd consider nuking the .rgw.buckets pool and
> starting from scratch, but we've recently spent a lot of time pulling 140TB
> of data into this cluster for some performance and recovery tests, and I'd
> prefer not to have to start that process again.  I am willing to entertain
> most any other idea irrespective of how destructive it is to these PGs, so
> long as I don't have to lose the rest of the data in the pool.
>
> Many thanks in advance for any assistance here.
>
> Brian Felton
>
>
>
>
>



-- 
Cheers,
Brad


[ceph-users] Hammer: PGs stuck creating

2016-06-29 Thread Brian Felton
Greetings,

I have a lab cluster running Hammer 0.94.6 and being used exclusively for
object storage.  The cluster consists of four servers running 60 6TB OSDs
each.  The main .rgw.buckets pool is using k=3 m=1 erasure coding and
contains 8192 placement groups.
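
For anyone wanting to reproduce a similar setup in a lab, a pool like that
can be created along these lines (the profile name and failure domain below
are just examples, not necessarily what we used):

$ ceph osd erasure-code-profile set ec-3-1 k=3 m=1 ruleset-failure-domain=host
$ ceph osd pool create .rgw.buckets 8192 8192 erasure ec-3-1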

Last week, one of our guys marked out and removed one OSD from each of three of
the four servers in the cluster, which resulted in some general badness
(the disks were wiped post-removal, so the data are gone).  After a proper
education in why this is a Bad Thing, we got the OSDs added back.  When all
was said and done, we had 30 pgs that were stuck incomplete, and no amount
of magic has been able to get them to recover.  From reviewing the data, we
knew that all of these pgs contained at least 2 of the removed OSDs; I
understand and accept that the data are gone, and that's not a concern (yay
lab).

Here are the things I've tried:

- Restarted all OSDs
- Stopped all OSDs, removed all OSDs from the crush map, and started
everything back up
- Executed a 'ceph pg force_create_pg <pgid>' for each of the 30 stuck pgs
- Executed a 'ceph pg send_pg_creates' to get the ball rolling on creates
- Executed several 'ceph pg <pgid> query' commands to ensure we were
referencing valid OSDs after the 'force_create_pg'
- Ensured those OSDs were really removed (e.g. 'ceph auth del', 'ceph osd
crush remove', and 'ceph osd rm')
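
Concretely, the sequence for each removed OSD and each stuck pg looked
roughly like this (the osd and pg ids below are placeholders):

$ ceph osd crush remove osd.42
$ ceph auth del osd.42
$ ceph osd rm 42
$ ceph pg force_create_pg 28.7f
$ ceph pg send_pg_creates
$ ceph pg 28.7f query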

At this point, I've got the same 30 pgs that are stuck creating.  I've run
out of ideas for getting this back to a healthy state.  In reviewing the
other posts on the mailing list, the most common culprit was a bad OSD in
the crush map, but I'm all but certain that isn't what's hitting us here.
Normally, being the lab, I'd consider nuking the .rgw.buckets pool and
starting from scratch, but we've recently spent a lot of time pulling 140TB
of data into this cluster for some performance and recovery tests, and I'd
prefer not to have to start that process again.  I am willing to entertain
most any other idea irrespective of how destructive it is to these PGs, so
long as I don't have to lose the rest of the data in the pool.

Many thanks in advance for any assistance here.

Brian Felton