Hi Greg,

I have re-introduced the OSD that was taken out (the one that used to be a 
primary). I have kept debug 20 logs from both the re-introduced primary and the 
outgoing primary. I have used ceph-post-file to upload these, tag: 
5b305f94-83e2-469c-a301-7299d2279d94
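
For anyone following along, the upload side of ceph-post-file is just a matter of pointing it at the log files; the OSD ids below are placeholders for the two primaries involved:

```shell
# Upload the debug 20 logs for the two OSDs involved (osd ids are
# placeholders for the re-introduced and outgoing primaries):
ceph-post-file /var/log/ceph/ceph-osd.12.log /var/log/ceph/ceph-osd.34.log
# ceph-post-file prints an upload tag like the one quoted above; the
# uploaded files are accessible only to Ceph developers.
```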

Hope this helps, let me know if you'd like me to do another test.

Thanks,

George
________________________________
From: Gregory Farnum [[email protected]]
Sent: 13 December 2017 00:04
To: Vasilakakos, George (STFC,RAL,SC)
Cc: [email protected]
Subject: Re: [ceph-users] Sudden omap growth on some OSDs



On Tue, Dec 12, 2017 at 3:36 PM <[email protected]> wrote:
From: Gregory Farnum <[email protected]>
Date: Tuesday, 12 December 2017 at 19:24
To: "Vasilakakos, George (STFC,RAL,SC)" <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: [ceph-users] Sudden omap growth on some OSDs

On Tue, Dec 12, 2017 at 3:16 AM <[email protected]> wrote:

On 11 Dec 2017, at 18:24, Gregory Farnum <[email protected]> wrote:

Hmm, this does all sound odd. Have you tried just restarting the primary OSD 
yet? That frequently resolves transient oddities like this.
If not, I'll go poke at the kraken source and one of the developers more 
familiar with the recovery processes we're seeing here.
-Greg


Hi Greg,

I’ve tried this, no effect. Also, on Friday we tried removing an OSD (not the 
primary); the OSD chosen to replace it has had its LevelDB grow to 7 GiB by 
now, up from 5.3 GiB yesterday.
We’re not seeing any errors logged by the OSDs at the default logging level 
either.

Do you have any comments on the fact that the primary sees the PG’s state as 
being different to what the peers think?

Yes. It's super weird. :p

Now, with a new primary, I’m seeing the last peer in the set report the PG as 
‘active+clean’, as does the primary, while all the others say it’s 
‘active+clean+degraded’ (according to the pg query output).
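
A quick way to see this kind of disagreement, with the PG id as a placeholder, is to pull every state field out of the query output:

```shell
# 1.2f3 is a placeholder PG id; this prints the state the primary reports
# alongside the state recorded for each peer in the query output.
ceph pg 1.2f3 query | grep '"state"'
```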

Has the last OSD in the list shrunk down its LevelDB instance?

No, the last peer has the largest LevelDB of any OSD currently in the PG, at 
14 GiB.
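
For context on where these size figures come from: on FileStore OSDs, as used on this Kraken-era cluster, the LevelDB lives under each OSD's current/omap directory, so the sizes can be read straight off disk:

```shell
# Default FileStore layout assumed; run on each OSD host.
du -sh /var/lib/ceph/osd/ceph-*/current/omap
```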

If so (or even if not), I'd try restarting all the OSDs in the PG and see if 
that changes things.
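
A sketch of that, assuming a systemd-managed cluster and a placeholder PG id:

```shell
# Find the OSDs in the PG (1.2f3 is a placeholder PG id); the output
# includes the up and acting sets, e.g. "... up [5,12,34] acting [5,12,34]".
ceph pg map 1.2f3

# Then restart each listed OSD, on the host that owns it:
systemctl restart ceph-osd@5
```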

Will try that and report back.

If it doesn't...well, it's about to be Christmas and Luminous saw quite a bit 
of change in this space, so it's unlikely to get a lot of attention. :/

Yeah, this being Kraken I doubt it will get looked into deeply.

But the next step would be to gather high-level debug logs from the OSDs in 
question, especially as a peering action takes place.

I’ll be re-introducing the old primary this week so maybe I’ll bump the logging 
levels (to what?) on these OSDs and see what they come up with.

debug osd = 20
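
That can be injected into a running daemon without a restart; the osd id below is a placeholder for the OSDs in the affected PG:

```shell
# Raise OSD logging, capture the peering event, then drop the level
# back down (1/5 is the usual debug_osd default); debug 20 logs grow fast.
ceph tell osd.12 injectargs '--debug_osd 20'
# ... reproduce / observe the peering activity ...
ceph tell osd.12 injectargs '--debug_osd 1/5'
```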


Oh!
I didn't notice you previously mentioned "custom gateways using the 
libradosstriper". Are those backing onto this pool? What operations are they 
doing?
Something like repeated overwrites of EC data could definitely have symptoms 
similar to this (apart from the odd peering bit.)
-Greg

Think of these as using the cluster as an object store. Most of the time we’re 
writing an object in, reading it out anywhere from zero to thousands of times 
(running a stat each time), and eventually perhaps deleting it. Once written, 
there’s no reason for an object to be overwritten. They’re backing onto the EC 
pools (one per “tenant”), but the particular pool this PG belongs to has 
barely seen any use. The most used pool is storing petabytes, while this one 
was barely reaching 100 TiB when this came up.

Yeah, it would be about overwrites specifically, not just using the data. 
Congratulations, you've exceeded the range of even my WAGs. :/
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
