[ceph-users] Re: Pool full but the user cleaned it up already

2020-05-21 Thread Eugen Block
Do you have quotas enabled on that pool? Can you also show "ceph df detail"? Quoting "Szabo, Istvan (Agoda)": Restarted mgr and mon services, nothing helped :/ -Original Message- From: Eugen Block Sent: Wednesday, May 20, 2020 3:05 PM To: Szabo, Istvan (Agoda) Cc:
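A quick way to check both, as a minimal sketch (the pool name "k8s" is taken from later in this thread):

  # show any per-pool quota set on the pool
  ceph osd pool get-quota k8s
  # per-pool usage, quotas, and object counts
  ceph df detail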

[ceph-users] Re: 15.2.2 Upgrade - Corruption: error in middle of record

2020-05-21 Thread Igor Fedotov
Short update on the issue: Finally we're able to reproduce the issue in master (not octopus), investigating further... @Chris - to make sure you're facing the same issue, could you please check the content of the broken file. To do so: 1) run "ceph-bluestore-tool --path --out-dir
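The truncated command is presumably the bluefs-export mode of ceph-bluestore-tool; a minimal sketch, assuming OSD 0 and /tmp/bluefs-export as the export directory:

  # stop the OSD first, then export its BlueFS contents (RocksDB files) for inspection
  systemctl stop ceph-osd@0
  ceph-bluestore-tool bluefs-export --path /var/lib/ceph/osd/ceph-0 --out-dir /tmp/bluefs-export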

[ceph-users] Re: Nautilus: (Minority of) OSDs with huge buffer_anon usage - triggering OOMkiller in worst cases.

2020-05-21 Thread aoanla
I should note that these OSDs also drop out of the pool as part of their symptoms - it's not clear to me at the moment if they drop out *because* of the memory, or if the buffer_pool is growing large because it's buffering communications that aren't getting to the cluster [and hence they drop
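To correlate the two symptoms, one thing worth checking is whether the OSD is honouring its memory autotuning target; a minimal sketch against a running OSD (the id 12 is a placeholder):

  # what the OSD believes its memory target is
  ceph daemon osd.12 config get osd_memory_target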

[ceph-users] Re: OSDs taking too much memory, for buffer_anon

2020-05-21 Thread aoanla
So, to jump into this thread - we seem to see the same problem as Harald on our cluster here in Glasgow, except our "worst case" OSDs are much worse than his [we get up to ~tens of GB in buffer_anon]. Activity is a mix of reads and writes against a single EC (8+2) encoded pool, with 8MB
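For reference, the EC layout backing a pool can be confirmed from the profile; a minimal sketch (the profile name "ec-8-2" is a placeholder):

  # list profiles, then show k/m and plugin for the one backing the pool
  ceph osd erasure-code-profile ls
  ceph osd erasure-code-profile get ec-8-2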

[ceph-users] Nautilus: (Minority of) OSDs with huge buffer_anon usage - triggering OOMkiller in worst cases.

2020-05-21 Thread aoanla
Hi, Following on from various woes, we see an odd and unhelpful behaviour with some OSDs on our cluster currently. A minority of OSDs seem to have runaway memory usage, rising to 10s of GB, whilst other OSDs on the same host behave sensibly. This started when we moved from Mimic -> Nautilus,
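When an OSD is in that state, its mempool accounting shows where the memory actually sits; a minimal sketch (substitute the affected OSD id for 12):

  # per-mempool byte and item counts; compare buffer_anon against the rest
  ceph daemon osd.12 dump_mempools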

[ceph-users] remove secondary zone from multisite

2020-05-21 Thread Zhenshi Zhou
Hi all, I'm going to take my secondary zone offline. How do I remove the secondary zone from a multisite?
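A rough sketch of the usual procedure, run from the master zone; the zonegroup name "default" and zone name "secondary" are placeholders, so verify against your own configuration:

  # detach the zone from its zonegroup and publish the new period
  radosgw-admin zonegroup remove --rgw-zonegroup=default --rgw-zone=secondary
  radosgw-admin period update --commit
  # then delete the zone's configuration and publish again
  radosgw-admin zone rm --rgw-zone=secondary
  radosgw-admin period update --commit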

[ceph-users] Luminous, OSDs down: "osd init failed" and "failed to load OSD map for epoch ... got 0 bytes"

2020-05-21 Thread Fulvio Galeazzi
Hello all, I hope you can help me with a very strange problem that arose suddenly today. I tried searching, including in this mailing list, but could not find anything relevant. At some point today, without any action from my side, I noticed some OSDs in my production cluster would go down and never
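The "failed to load OSD map for epoch ... got 0 bytes" error suggests the OSD's local store is missing osdmaps it expects. One approach, sketched here with epoch 4000 and OSD 3 as placeholders and worth verifying against the exact situation before running, is to fetch the map from the monitors and inject it:

  # grab the missing map epoch from the mons
  ceph osd getmap 4000 -o /tmp/osdmap.4000
  # with the OSD stopped, inject it into the OSD's store
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-3 --op set-osdmap --file /tmp/osdmap.4000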

[ceph-users] Re: ceph orch upgrade stuck at the beginning.

2020-05-21 Thread Ashley Merrick
Hello, yes I did, but I wasn't able to suggest anything further to get around it. However: 1/ There is currently an issue with 15.2.2, so I would advise holding off any upgrade. 2/ Another mailing list user replied to one of your older emails in the thread asking for some manager logs; not sure if you have
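For anyone hitting a stuck cephadm upgrade, the orchestrator exposes the state and a way to abort; a minimal sketch:

  # inspect where the upgrade is stuck
  ceph orch upgrade status
  # abort the stuck upgrade so it can be retried later
  ceph orch upgrade stop
  # check overall cluster and mgr health while debugging
  ceph -s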

[ceph-users] Re: ceph orch upgrade stuck at the beginning.

2020-05-21 Thread Gencer W . Genç
Hi Sebastian, I did not get your reply via e-mail. I am very sorry for this. I hope you can see this message... I've re-run the upgrade and attached the log. Thanks, Gencer.

[ceph-users] Re: ceph orch upgrade stuck at the beginning.

2020-05-21 Thread Gencer W . Genç
Hi Ashley, Have you seen my previous reply? If so, and there is no solution, does anyone have any idea how this can be done with 2 nodes? Thanks, Gencer. On 20.05.2020 16:33:53, Gencer W. Genç wrote: This is a 2-node setup. I have no third node :( I am planning to add more in the future but currently

[ceph-users] Re: Pool full but the user cleaned it up already

2020-05-21 Thread Szabo, Istvan (Agoda)
Hello, here it is. I usually set just a space quota, not an object quota. NAME ID QUOTA OBJECTS QUOTA BYTES USED %USED MAX AVAIL OBJECTS DIRTY READ WRITE RAW USED k8s 8 N/A 200GiB
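If the 200GiB byte quota is what keeps reporting the pool full even after cleanup, it can be raised or removed; a minimal sketch for the "k8s" pool:

  # setting max_bytes to 0 removes the byte quota entirely
  ceph osd pool set-quota k8s max_bytes 0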

[ceph-users] Re: Nautilus: (Minority of) OSDs with huge buffer_anon usage - triggering OOMkiller in worst cases.

2020-05-21 Thread Mark Nelson
Hi Sam, I saw your comment in the other thread but wanted to reply here since you provided the mempool and perf counters.  It looks like the priority cache is (like in Harald's case) shrinking all of the caches to their smallest values trying to compensate for all of the stuff collecting in
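For anyone wanting to gather the same data, the counters come from the admin socket; a sketch (the OSD id 12 is a placeholder, and heap stats assumes a tcmalloc build):

  # perf counters, including the prioritycache section
  ceph daemon osd.12 perf dump
  # allocator-level view of what the process has actually mapped
  ceph tell osd.12 heap stats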

[ceph-users] Re: OSDs taking too much memory, for buffer_anon

2020-05-21 Thread Mark Nelson
Out of curiosity, do you have compression enabled?  FWIW, Deepika has been working on splitting the mempool assignments into much better categories for better tracking.  I suspect we are going to find a bug where something isn't being cleaned up properly in buffer_anon.  Adam's been taking up
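Compression can be set per pool or as a BlueStore-wide default; a quick way to check both (the pool name "ecpool" is a placeholder):

  # per-pool setting; errors out if the option was never set on the pool
  ceph osd pool get ecpool compression_mode
  # cluster-wide BlueStore default
  ceph config get osd bluestore_compression_mode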

[ceph-users] Setting up first cluster on proxmox - a few questions

2020-05-21 Thread CodingSpiderFox
Hello everyone :) When I try to create an OSD, Proxmox UI asks for * Data disk * DB disk * WAL disk What disk will be the limiting factor in terms of storage size for my OSD - the data disk? How large do I need to make the other two? Is there a risk of them running over capacity before the
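For reference, the three disks the Proxmox UI asks about correspond directly to ceph-volume's BlueStore device arguments, with the data device determining the OSD's usable capacity; a minimal sketch with placeholder device names:

  # data device sets capacity; DB and WAL are optional faster devices for metadata and the write-ahead log
  ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/sdc --block.wal /dev/sdd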

[ceph-users] Re: Setting up first cluster on proxmox - a few questions

2020-05-21 Thread Eneko Lacunza
Hi, I strongly suggest you read the Ceph documentation at https://docs.ceph.com/docs/master On 21/5/20 at 15:06, CodingSpiderFox wrote: Hello everyone :) When I try to create an OSD, Proxmox UI asks for * Data disk * DB disk * WAL disk What disk will be the limiting factor in terms of