Re: [ceph-users] PG stuck inconsistent, but appears ok?

2017-07-14 Thread Dan van der Ster
You probably have osd_max_scrubs=1 and the PG just isn't getting a slot to start. Here's a little trick to get that going right away:

  ceph osd set noscrub
  ceph osd set nodeep-scrub
  ceph tell osd.* injectargs -- --osd_max_scrubs 2
  ceph pg deep-scrub 22.1611

... wait until it starts scrubbing ...
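The commands in the message above can be wrapped in a small script. This is only a sketch: the function names (`run`, `force_deep_scrub`, `restore_scrub_settings`) and the `DRY_RUN` variable are hypothetical helpers, and the restore step is not part of the excerpt. It assumes the original value of osd_max_scrubs was 1, as the message guesses. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the "free up a scrub slot" trick from the message above.
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Steps quoted in the message: block scheduled scrubs, raise the
# per-OSD scrub limit, then request a deep scrub of a single PG.
force_deep_scrub() {
    pgid="$1"
    run ceph osd set noscrub
    run ceph osd set nodeep-scrub
    run ceph tell 'osd.*' injectargs -- --osd_max_scrubs 2
    run ceph pg deep-scrub "$pgid"
}

# ASSUMPTION: not in the excerpt. Undoes the changes once the scrub
# has started, assuming osd_max_scrubs was 1 beforehand.
restore_scrub_settings() {
    run ceph tell 'osd.*' injectargs -- --osd_max_scrubs 1
    run ceph osd unset nodeep-scrub
    run ceph osd unset noscrub
}

force_deep_scrub 22.1611
```

Run it with DRY_RUN=0 only after checking the printed commands against your cluster. Call restore_scrub_settings once the OSD log shows the deep scrub has started.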

Re: [ceph-users] PG stuck inconsistent, but appears ok?

2017-07-14 Thread Aaron Bassett
I issued the pg deep scrub command ~24 hours ago and nothing has changed. I see nothing in the active osd's log about kicking off the scrub.

On Jul 13, 2017, at 2:24 PM, David Turner wrote:

# ceph pg deep-scrub 22.1611

On Thu, Jul 13, 2017

Re: [ceph-users] PG stuck inconsistent, but appears ok?

2017-07-13 Thread David Turner
# ceph pg deep-scrub 22.1611

On Thu, Jul 13, 2017 at 1:00 PM Aaron Bassett wrote:
> I'm not sure if I'm doing something wrong, but when I run this:
>
> # ceph osd deep-scrub 294
>
> All I get in the osd log is:
>
> 2017-07-13 16:57:53.782841 7f40d089f700 0

Re: [ceph-users] PG stuck inconsistent, but appears ok?

2017-07-13 Thread Aaron Bassett
I'm not sure if I'm doing something wrong, but when I run this:

# ceph osd deep-scrub 294

All I get in the osd log is:

2017-07-13 16:57:53.782841 7f40d089f700 0 log_channel(cluster) log [INF] : 21.1ae9 deep-scrub starts
2017-07-13 16:57:53.785261 7f40ce09a700 0 log_channel(cluster) log
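One way to tell whether the PG of interest has actually begun scrubbing is to search the OSD log for its "deep-scrub starts" line, in the format shown in the log excerpt above. A minimal sketch follows. The function name `scrub_started` is hypothetical, and the log path in the usage comment is an assumed default location, not something stated in the thread.

```shell
# Succeeds if the given log file contains a "deep-scrub starts" line
# for the given PG id, matching the log format quoted above.
scrub_started() {
    pgid="$1"
    logfile="$2"
    grep -qF " : ${pgid} deep-scrub starts" "$logfile"
}

# Hypothetical usage; the log path is an assumption:
# scrub_started 22.1611 /var/log/ceph/ceph-osd.294.log && echo "deep scrub started"
```

In the log quoted above, this would match PG 21.1ae9 but not 22.1611.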

Re: [ceph-users] PG stuck inconsistent, but appears ok?

2017-07-13 Thread Aaron Bassett
Ok good to hear, I just kicked one off on the acting primary so I guess I'll be patient now...

Thanks,
Aaron

> On Jul 13, 2017, at 10:28 AM, Dan van der Ster wrote:
>
> On Thu, Jul 13, 2017 at 4:23 PM, Aaron Bassett wrote:
>> Because it was

Re: [ceph-users] PG stuck inconsistent, but appears ok?

2017-07-13 Thread Dan van der Ster
On Thu, Jul 13, 2017 at 4:23 PM, Aaron Bassett wrote:
> Because it was a read error I checked SMART stats for that osd's disk and sure
> enough, it had some uncorrected read errors. In order to stop it from causing
> more problems I stopped the daemon to let ceph

[ceph-users] PG stuck inconsistent, but appears ok?

2017-07-13 Thread Aaron Bassett
Good Morning,

I have an odd situation where a pg is listed inconsistent, but rados is struggling to tell me about it:

# ceph health detail
HEALTH_ERR 1 pgs inconsistent; 1 requests are blocked > 32 sec; 1 osds have slow requests; 1 scrub errors
pg 22.1611 is active+clean+inconsistent, acting