On Tue, Dec 12, 2017 at 3:36 PM <george.vasilaka...@stfc.ac.uk> wrote:
> From: Gregory Farnum <gfar...@redhat.com>
> Date: Tuesday, 12 December 2017 at 19:24
> To: "Vasilakakos, George (STFC,RAL,SC)" <george.vasilaka...@stfc.ac.uk>
> Cc: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>
> Subject: Re: [ceph-users] Sudden omap growth on some OSDs
>
> > On Tue, Dec 12, 2017 at 3:16 AM <george.vasilaka...@stfc.ac.uk> wrote:
> >
> > > On 11 Dec 2017, at 18:24, Gregory Farnum <gfar...@redhat.com> wrote:
> > >
> > > > Hmm, this does all sound odd. Have you tried just restarting the primary OSD yet? That frequently resolves transient oddities like this.
> > > > If not, I'll go poke at the kraken source and one of the developers more familiar with the recovery processes we're seeing here.
> > > > -Greg
> > >
> > > Hi Greg,
> > >
> > > I've tried this, no effect. Also, on Friday, we tried removing an OSD (not the primary); the OSD that was chosen to replace it has had its LevelDB grow to 7GiB by now. Yesterday it was 5.3GiB.
> > > We're not seeing any errors logged by the OSDs with the default logging level either.
> > >
> > > Do you have any comments on the fact that the primary sees the PG's state as being different to what the peers think?
> >
> > Yes. It's super weird. :p
> >
> > > Now, with a new primary, I'm seeing the last peer in the set reporting it's 'active+clean', as is the primary; all others are saying it's 'active+clean+degraded' (according to PG query output).
> >
> > Has the last OSD in the list shrunk down its LevelDB instance?
>
> No, the last peer has the largest one currently part of the PG, at 14GiB.
>
> > If so (or even if not), I'd try restarting all the OSDs in the PG and see if that changes things.
>
> Will try that and report back.
>
> > If it doesn't... well, it's about to be Christmas and Luminous saw quite a bit of change in this space, so it's unlikely to get a lot of attention. :/
>
> Yeah, this being Kraken I doubt it will get looked into deeply.
>
> > But the next step would be to gather high-level debug logs from the OSDs in question, especially as a peering action takes place.
> >
> > > I'll be re-introducing the old primary this week, so maybe I'll bump the logging levels (to what?) on these OSDs and see what they come up with.
> >
> > debug osd = 20
> >
> > Oh!
> > I didn't notice you previously mentioned "custom gateways using the libradosstriper". Are those backing onto this pool? What operations are they doing?
> > Something like repeated overwrites of EC data could definitely have symptoms similar to this (apart from the odd peering bit).
> > -Greg
>
> Think of these as using the cluster as an object store. Most of the time we're writing something in, reading it out anywhere from zero to thousands of times (each time running a stat as well) and eventually maybe deleting it. Once written, there's no reason for it to be overwritten. They're backing onto the EC pools (one per "tenant"), but the particular pool this PG is part of has barely seen any use. The most used one is storing petabytes, and this one was barely reaching 100TiB when this came up.

Yeah, it would be about overwrites specifically, not just using the data. Congratulations, you've exceeded the range of even my WAGs. :/
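P.S. For anyone who finds this thread later: the diagnostics we've been trading above boil down to roughly the following. Untested sketch, not a recipe -- it assumes FileStore with default paths, a made-up OSD id (42) and PG id (1.2f3), and the field names in the pg query JSON may differ slightly between releases:

  # Size of the omap LevelDB on an OSD (run on that OSD's host;
  # FileStore keeps it under current/omap)
  du -sh /var/lib/ceph/osd/ceph-42/current/omap

  # Compare the primary's view of the PG state with what each peer reports
  ceph pg 1.2f3 query | jq '.state, (.peer_info[] | {peer, state: .stats.state})'

  # Restart one OSD in the set, or mark it down to force a fresh peering
  systemctl restart ceph-osd@42
  ceph osd down 42

  # Bump OSD logging before re-introducing the old primary,
  # then watch the peering happen in the logs
  ceph tell osd.42 injectargs '--debug-osd 20'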
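And for concreteness, the access pattern George describes (write once, stat/read many times, eventually delete, never overwrite) is what libradosstriper gives you; the rados CLI can approximate it with its --striper flag, if your build carries it. Pool and object names here are invented:

  # write once...
  rados -p ecpool --striper put bigobject ./payload
  # ...then stat and read it back any number of times...
  rados -p ecpool --striper stat bigobject
  rados -p ecpool --striper get bigobject /tmp/payload.out
  # ...and eventually delete it; no overwrites in between
  rados -p ecpool --striper rm bigobject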