[ceph-users] OSD Crash During Deep-Scrub

2021-03-29 Thread Dave Hall
Hello, A while back, I was having an issue with an OSD repeatedly crashing. I ultimately reweighted it to zero and then marked it 'Out'. Since then I found that the logs for those crashes match https://tracker.ceph.com/issues/46490. Since the OSD is in a 'Safe-to-Destroy' state, I'm wondering
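A minimal sketch of checking and then removing an OSD once it reports safe-to-destroy; the id osd.12 is a placeholder, not taken from the thread:

    ceph osd safe-to-destroy osd.12                  # confirms no PG data still depends on this OSD
    ceph osd purge osd.12 --yes-i-really-mean-it     # removes the auth key, CRUSH entry and OSD id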

[ceph-users] Re: Nautilus - PG Autoscaler Global vs Pool Setting

2021-03-29 Thread Dave Hall
All, In looking at the options for setting the default pg autoscale option, I notice that there is a global option setting and a per-pool option setting. It seems that the options at the pool level are off, warn, and on; the same, I assume, for the global setting. Is there a way to get rid of
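A minimal sketch of the two levels being compared, with a placeholder pool name; off, warn, and on are the valid modes at either level:

    ceph osd pool set mypool pg_autoscale_mode off                    # per-pool setting
    ceph config set global osd_pool_default_pg_autoscale_mode off     # default applied to newly created pools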

[ceph-users] Re: Resolving LARGE_OMAP_OBJECTS

2021-03-29 Thread David Orman
Response inline: On Fri, Mar 5, 2021 at 11:00 AM Benoît Knecht wrote: > On Friday, March 5th, 2021 at 15:20, Drew Weaver wrote: > > Sorry to sound clueless but no matter what I search for on El Goog I can't > > figure out how to answer the question as to whether dynamic sharding is
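A minimal sketch of the checks usually involved when chasing large bucket index omaps; the bucket name is a placeholder:

    radosgw-admin bucket limit check                   # objects per shard and fill status for each bucket
    radosgw-admin bucket stats --bucket=mybucket       # includes the bucket's current num_shards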

[ceph-users] Re: Nautilus - PG count decreasing after adding OSDs

2021-03-29 Thread Dave Hall
Eugen, I didn't really think my cluster was eating itself, but I also didn't want to be in denial. Regarding the autoscaler, I really thought that it only went up - I didn't expect that it would decrease the number of PGs. Plus, I thought I had it turned off. I see now that it's off globally

[ceph-users] Re: Cluster suspends when Add Mon or stop and start after a while.

2021-03-29 Thread Frank Schilder
Please use the correct list: ceph-users@ceph.io. Probably the same problem I had. Try reducing mon_sync_max_payload_size to 4096 and then start a new MON; it should just take a few seconds to boot up. Best regards, Frank Schilder, AIT Risø Campus, Bygning 109, rum S14
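A minimal sketch of applying that setting, either in ceph.conf on the monitor hosts or at runtime; the section placement is an assumption:

    # ceph.conf on the monitor hosts
    [mon]
        mon_sync_max_payload_size = 4096

    # or at runtime, via the config database
    ceph config set mon mon_sync_max_payload_size 4096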

[ceph-users] Re: Nautilus - PG Autoscaler Global vs Pool Setting

2021-03-29 Thread Eugen Block
Or you could just disable the mgr module, something like: ceph mgr module disable pg_autoscaler. Quoting Dave Hall: All, In looking at the options for setting the default pg autoscale option, I notice that there is a global option setting and a per-pool option setting. It seems that the
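A minimal sketch of that suggestion, with a follow-up check that the module is really disabled:

    ceph mgr module disable pg_autoscaler
    ceph mgr module ls        # pg_autoscaler should no longer be listed under enabled_modules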

[ceph-users] Nautilus - PG count decreasing after adding OSDs

2021-03-29 Thread Dave Hall
Hello, About 3 weeks ago I added a node and increased the number of OSDs in my cluster from 24 to 32, and then marked one old OSD down because it was frequently crashing. After adding the new OSDs the PG count jumped fairly dramatically, but ever since, amidst a continuous low level of

[ceph-users] Re: [Suspicious newsletter] Re: [Suspicious newsletter] bucket index and WAL/DB

2021-03-29 Thread Marcelo
That is true; from the slide it was understood that this configuration should be used. Thanks for the answer. On Fri, 26 Mar 2021 at 10:26, Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote: > Makes sense what you are talking about, I had the same confusion as you, > finally went with redhat

[ceph-users] Re: OSDs RocksDB corrupted when upgrading nautilus->octopus: unknown WriteBatch tag

2021-03-29 Thread Dan van der Ster
Hi, Saw that, looks scary! I have no experience with that particular crash, but I was thinking that if you have already backfilled the degraded PGs, and can afford to try another OSD, you could try: "bluestore_fsck_quick_fix_threads": "1", # because
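A minimal sketch of how that option could be set before retrying the omap conversion on another OSD; the [osd] section placement is an assumption, and the single thread is what the quoted snippet suggests:

    # ceph.conf on the OSD host, before starting the upgraded OSD
    [osd]
        bluestore_fsck_quick_fix_threads = 1

    # or via the central config database
    ceph config set osd bluestore_fsck_quick_fix_threads 1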

[ceph-users] OSDs RocksDB corrupted when upgrading nautilus->octopus: unknown WriteBatch tag

2021-03-29 Thread Jonas Jelten
Hi! After upgrading MONs and MGRs successfully, the first OSD host I upgraded on Ubuntu Bionic from 14.2.16 to 15.2.10 shredded all OSDs on it by corrupting RocksDB, and they now refuse to boot. RocksDB complains "Corruption: unknown WriteBatch tag". The initial crash/corruption occurred when

[ceph-users] Re: Nautilus - PG count decreasing after adding OSDs

2021-03-29 Thread Eugen Block
Hi, that sounds like the pg_autoscaler is doing its work. Check with: ceph osd pool autoscale-status. I don't think Ceph is eating itself or that you're losing data. ;-) Quoting Dave Hall: Hello, About 3 weeks ago I added a node and increased the number of OSDs in my cluster from 24 to

[ceph-users] Re: Nautilus: Reduce the number of managers

2021-03-29 Thread Stefan Kooman
On 3/28/21 3:52 PM, Dave Hall wrote: Hello, We are in the process of bringing new hardware online that will allow us to get all of the MGRs, MONs, MDSs, etc. off of our OSD nodes and onto dedicated management nodes. I've created MGRs and MONs on the new nodes, and I found procedures for
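A minimal sketch of retiring a mon and a standby mgr on an old OSD node once the new daemons are in quorum; the hostname is a placeholder, and non-cephadm, systemd-managed daemons are assumed:

    ceph mon remove osdnode01              # drop the old mon from the monmap
    systemctl stop ceph-mon@osdnode01 && systemctl disable ceph-mon@osdnode01
    systemctl stop ceph-mgr@osdnode01 && systemctl disable ceph-mgr@osdnode01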

[ceph-users] Re: memory consumption by osd

2021-03-29 Thread Stefan Kooman
On 3/28/21 4:58 AM, Tony Liu wrote: I don't see any problems yet. All OSDs are working fine. Just that 1.8GB free memory concerns me. I know 256GB memory for 10 OSDs (16TB HDD) is a lot, I am planning to reduce it or increase osd_memory_target (if that's what you meant) to boost performance. But
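A minimal sketch of raising osd_memory_target; the value and osd.0 are illustrative placeholders, not a recommendation from the thread:

    ceph config set osd osd_memory_target 8589934592   # roughly 8 GiB per OSD
    ceph config get osd.0 osd_memory_target            # verify the value an individual OSD will use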

[ceph-users] Re: [Suspicious newsletter] Re: How to clear Health Warning status?

2021-03-29 Thread Szabo, Istvan (Agoda)
Restart the OSD. Istvan Szabo, Senior Infrastructure Engineer, Agoda Services Co., Ltd., e: istvan.sz...@agoda.com -Original Message- From: jinguk.k...@ungleich.ch Sent: Monday, March
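A minimal sketch of restarting a single OSD on a non-cephadm, systemd-managed host; the id 12 is a placeholder:

    ceph osd ok-to-stop 12          # optional pre-check that stopping osd.12 won't make PGs unavailable
    systemctl restart ceph-osd@12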

[ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs?

2021-03-29 Thread Josh Baergen
As was mentioned in this thread, all of the mon clients (OSDs included) learn about other mons through monmaps, which are distributed when mon membership and election changes. Thus, your OSDs should already know about the new mons. mon_host indicates the list of mons that mon clients should try
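A minimal sketch of the corresponding ceph.conf update; the addresses are placeholders, and since running daemons keep working from the monmap, this only matters the next time a daemon or client starts:

    [global]
        mon_host = 10.0.0.1,10.0.0.2,10.0.0.3,10.0.0.4,10.0.0.5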

[ceph-users] Re: memory consumption by osd

2021-03-29 Thread Josh Baergen
Linux will automatically make use of all available memory for the buffer cache, freeing buffers when it needs more memory for other things. This is why MemAvailable is more useful than MemFree; the former indicates how much memory could be used between Free, buffer cache, and anything else that
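A minimal sketch of checking those fields on an OSD host:

    grep -E '^(MemTotal|MemFree|MemAvailable|Buffers|Cached):' /proc/meminfo
    free -h        # the 'available' column is derived from MemAvailable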