This PG/object is still doing something rather odd.
I attempted to repair the object, and the repair supposedly ran, but now I
appear to have less visibility.
> $ ceph health detail
> HEALTH_ERR 3 pgs inconsistent; 4 scrub errors; mds0: Many clients (20)
> failing to respond to cache pressure; noout,sortbitwise,require_jewel_osds
> flag(s) set
> pg 10.2d8 is active+clean+inconsistent, acting [18,17,22]
> pg 10.7bd is active+clean+inconsistent, acting [8,23,17]
> pg 17.ec is active+clean+inconsistent, acting [23,2,21]
> 4 scrub errors
> noout,sortbitwise,require_jewel_osds flag(s) set
osd.23 is the OSD scheduled for replacement, and it has generated another read error.
However, 17.ec does not show up in the output of the rados list-inconsistent-pg command:
> $ rados list-inconsistent-pg objects
> ["10.2d8","10.7bd”]
And examining 10.2d8 as before, I’m presented with this:
> $ rados list-inconsistent-obj 10.2d8 --format=json-pretty
> {
> "epoch": 21094,
> "inconsistents": []
> }
This is despite the logs showing that both the deep scrub and the repair found
errors and that the object was not repaired.
> $ zgrep 10.2d8 ceph-*
> ceph-osd.18.log.2.gz:2017-03-06 15:10:08.729827 7fc8dfeb8700 0
> log_channel(cluster) log [INF] : 10.2d8 repair starts
> ceph-osd.18.log.2.gz:2017-03-06 15:13:49.793839 7fc8dfeb8700 -1
> log_channel(cluster) log [ERR] : 10.2d8 recorded data digest 0x7fa9879c != on
> disk 0xa6798e03 on {object.name}:head
> ceph-osd.18.log.2.gz:2017-03-06 15:13:49.793941 7fc8dfeb8700 -1
> log_channel(cluster) log [ERR] : repair 10.2d8 {object.name}:head on disk
> size (15913) does not match object info size (10280) adjusted for ondisk to
> (10280)
> ceph-osd.18.log.2.gz:2017-03-06 15:46:13.286268 7fc8dfeb8700 -1
> log_channel(cluster) log [ERR] : 10.2d8 repair 2 errors, 0 fixed
> ceph-osd.18.log.4.gz:2017-03-04 18:16:23.693057 7fc8dd6b3700 0
> log_channel(cluster) log [INF] : 10.2d8 deep-scrub starts
> ceph-osd.18.log.4.gz:2017-03-04 18:19:25.471322 7fc8dfeb8700 -1
> log_channel(cluster) log [ERR] : 10.2d8 recorded data digest 0x7fa9879c != on
> disk 0xa6798e03 on {object.name}:head
> ceph-osd.18.log.4.gz:2017-03-04 18:19:25.471403 7fc8dfeb8700 -1
> log_channel(cluster) log [ERR] : deep-scrub 10.2d8 {object.name}:head on disk
> size (15913) does not match object info size (10280) adjusted for ondisk to
> (10280)
> ceph-osd.18.log.4.gz:2017-03-04 18:55:39.617841 7fc8dd6b3700 -1
> log_channel(cluster) log [ERR] : 10.2d8 deep-scrub 2 errors
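My working assumption is that list-inconsistent-obj only reflects the most
recent completed scrub, so perhaps another deep scrub would repopulate the
detail for 10.2d8; something like this (not yet run):
> $ ceph pg deep-scrub 10.2d8
> $ rados list-inconsistent-obj 10.2d8 --format=json-pretty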
File size and md5 still match.
> $ ls -la
> /var/lib/ceph/osd/ceph-*/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> -rw-r--r-- 1 ceph ceph 15913 Mar 2 17:24
> /var/lib/ceph/osd/ceph-17/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> -rw-r--r-- 1 ceph ceph 15913 Mar 2 17:24
> /var/lib/ceph/osd/ceph-18/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> -rw-r--r-- 1 ceph ceph 15913 Mar 2 17:24
> /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> $ md5sum
> /var/lib/ceph/osd/ceph-*/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> 55a76349b758d68945e5028784c59f24
> /var/lib/ceph/osd/ceph-17/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> 55a76349b758d68945e5028784c59f24
> /var/lib/ceph/osd/ceph-18/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> 55a76349b758d68945e5028784c59f24
> /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
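The one thing I have not compared is what the object info itself records.
Assuming the object info is still stored in the user.ceph._ xattr on filestore,
I believe something like this would dump it for comparison (an untested sketch,
using the osd.18 copy):
> $ getfattr -n user.ceph._ --only-values /var/lib/ceph/osd/ceph-18/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name} > /tmp/oi
> $ ceph-dencoder type object_info_t import /tmp/oi decode dump_json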
So is the object actually inconsistent?
Is rados somehow behind on something, not showing the third inconsistent PG?
Appreciate any help.
Reed
> On Mar 2, 2017, at 9:21 AM, Reed Dier <[email protected]> wrote:
>
> Over the weekend, two inconsistent PGs popped up in my cluster. This came
> after having scrubs disabled for close to 6 weeks during a very long rebalance
> after adding 33% more OSDs, an OSD failing, increasing PGs, etc.
>
> It appears we came out the other end with 2 inconsistent PGs, and I'm trying
> to resolve them, but am not having much luck.
> Ubuntu 16.04, Jewel 10.2.5, 3x replicated pool for reference.
>
>> $ ceph health detail
>> HEALTH_ERR 2 pgs inconsistent; 3 scrub errors;
>> noout,sortbitwise,require_jewel_osds flag(s) set
>> pg 10.7bd is active+clean+inconsistent, acting [8,23,17]
>> pg 10.2d8 is active+clean+inconsistent, acting [18,17,22]
>> 3 scrub errors
>
>> $ rados list-inconsistent-pg objects
>> ["10.2d8","10.7bd”]
>
> Pretty straightforward: 2 PGs with inconsistent copies. Let's dig deeper.
>
>> $ rados list-inconsistent-obj 10.2d8 --format=json-pretty
>> {
>> "epoch": 21094,
>> "inconsistents": [
>> {
>> "object": {
>> "name": “object.name",
>> "nspace": “namespace.name",
>> "locator": "",
>> "snap": "head"
>> },
>> "errors": [],
>> "shards": [
>> {
>> "osd": 17,
>> "size": 15913,
>> "omap_digest": "0xffffffff",
>> "data_digest": "0xa6798e03",
>> "errors": []
>> },
>> {
>> "osd": 18,
>> "size": 15913,
>> "omap_digest": "0xffffffff",
>> "data_digest": "0xa6798e03",
>> "errors": []
>> },
>> {
>> "osd": 22,
>> "size": 15913,
>> "omap_digest": "0xffffffff",
>> "data_digest": "0xa6798e03",
>> "errors": [
>> "data_digest_mismatch_oi"
>> ]
>> }
>> ]
>> }
>> ]
>> }
>
>> $ rados list-inconsistent-obj 10.7bd --format=json-pretty
>> {
>> "epoch": 21070,
>> "inconsistents": [
>> {
>> "object": {
>> "name": “object2.name",
>> "nspace": “namespace.name",
>> "locator": "",
>> "snap": "head"
>> },
>> "errors": [
>> "read_error"
>> ],
>> "shards": [
>> {
>> "osd": 8,
>> "size": 27691,
>> "omap_digest": "0xffffffff",
>> "data_digest": "0x9ce36903",
>> "errors": []
>> },
>> {
>> "osd": 17,
>> "size": 27691,
>> "omap_digest": "0xffffffff",
>> "data_digest": "0x9ce36903",
>> "errors": []
>> },
>> {
>> "osd": 23,
>> "size": 27691,
>> "errors": [
>> "read_error"
>> ]
>> }
>> ]
>> }
>> ]
>> }
>
>
> So we have one PG (10.7bd) with a read error on osd.23, which is known and
> scheduled for replacement.
> We also have a data digest mismatch on PG 10.2d8 on osd.22, which I have been
> attempting to repair with no real tangible results.
>
>> $ ceph pg repair 10.2d8
>> instructing pg 10.2d8 on osd.18 to repair
>
> I've run the ceph pg repair command multiple times, and each time it
> instructs osd.18 to repair the PG.
> Is it correct to assume that osd.18 is the acting primary for these copies,
> and that it is being told to backfill the known-good copy of the PG over the
> agreed-upon wrong version on osd.22?
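> (I assume I could confirm which OSD is the primary with something like the
> following, which should print the up/acting sets with the primary listed first:)
>> $ ceph pg map 10.2d8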
>
>> $ zgrep 'ERR' /var/log/ceph/*
>> /var/log/ceph/ceph-osd.18.log.7.gz:2017-02-23 20:45:21.561164 7fc8dfeb8700
>> -1 log_channel(cluster) log [ERR] : 10.2d8 recorded data digest 0x7fa9879c
>> != on disk 0xa6798e03 on 10:1b42251f:{object.name}:head
>> /var/log/ceph/ceph-osd.18.log.7.gz:2017-02-23 20:45:21.561225 7fc8dfeb8700
>> -1 log_channel(cluster) log [ERR] : deep-scrub 10.2d8
>> 10:1b42251f:{object.name}:head on disk size (15913) does not match object
>> info size (10280) adjusted for ondisk to (10280)
>> /var/log/ceph/ceph-osd.18.log.7.gz:2017-02-23 21:05:59.935815 7fc8dfeb8700
>> -1 log_channel(cluster) log [ERR] : 10.2d8 deep-scrub 2 errors
>
>
>> $ ceph pg 10.2d8 query
>> {
>> "state": "active+clean+inconsistent",
>> "snap_trimq": "[]",
>> "epoch": 21746,
>> "up": [
>> 18,
>> 17,
>> 22
>> ],
>> "acting": [
>> 18,
>> 17,
>> 22
>> ],
>> "actingbackfill": [
>> "17",
>> "18",
>> "22"
>> ],
>
> However, no recovery I/O ever occurs, and the PG never returns to plain
> active+clean. I'm not seeing anything exciting in the logs of the OSDs or the
> mons.
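> (For the next repair attempt, I figure I can watch the cluster log for this
> PG while it runs, e.g.:)
>> $ ceph -w | grep 10.2d8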
>
> I've found a few articles and mailing list entries that mention downing the
> OSD, flushing the journal, moving the object off the disk, starting the OSD,
> and running the repair command again (sketched roughly below).
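> (Roughly, as I understand that procedure; a sketch I have not yet tried,
> targeting the osd.22 copy:)
>> $ systemctl stop ceph-osd@22
>> $ ceph-osd -i 22 --flush-journal
>> $ # move the suspect copy aside, keeping a backup
>> $ mv /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name} ~/
>> $ systemctl start ceph-osd@22
>> $ ceph pg repair 10.2d8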
>
> However, after finding the object on disk and eyeballing the size and the
> md5sum, all three copies appear to be identical.
>> $ ls -la
>> /var/lib/ceph/osd/ceph-*/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>> -rw-r--r-- 1 ceph ceph 15913 Jan 27 02:31
>> /var/lib/ceph/osd/ceph-17/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>> -rw-r--r-- 1 ceph ceph 15913 Jan 27 02:31
>> /var/lib/ceph/osd/ceph-18/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>> -rw-r--r-- 1 ceph ceph 15913 Jan 27 02:31
>> /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>
>> $ md5sum
>> /var/lib/ceph/osd/ceph-*/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>> 55a76349b758d68945e5028784c59f24
>> /var/lib/ceph/osd/ceph-17/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>> 55a76349b758d68945e5028784c59f24
>> /var/lib/ceph/osd/ceph-18/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>> 55a76349b758d68945e5028784c59f24
>> /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
>
> Should I schedule another scrub? Should I do the whole down the OSD, flush
> journal, move object song and dance?
>
> Hoping the user list can provide some insight into the proper steps to move
> forward with, and assuming the other inconsistent PG will fix itself once the
> failing osd.23 is replaced.
>
> Thanks,
>
> Reed
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com