[ceph-users] Re: MDS daemons don't report any more

2023-09-12 Thread Frank Schilder
Hi Patrick, I'm not sure that it's exactly the same issue. I observed that "ceph tell mds.xyz session ls" had all counters 0. On the Friday before, we had a power loss on a rack that took out a JBOD with a few meta-data disks, and I suspect that the reporting of zeroes started after this crash. No
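For reference, a minimal way to check for the symptom described above (all session counters reading zero) is to query one MDS and pick out the counter fields; the daemon name, the use of jq, and the exact field names are illustrative and may vary by release:

    # list client sessions on a single MDS and show only the counter fields
    ceph tell mds.xyz session ls | jq '.[] | {id, num_caps, num_leases}'
    # active clients normally report non-zero num_caps; an all-zero listing matches the symptom above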

[ceph-users] Re: rgw: strong consistency for (bucket) policy settings?

2023-09-12 Thread Matthias Ferdinand
On Mon, Sep 11, 2023 at 02:37:59PM -0400, Matt Benjamin wrote: > Yes, it's also strongly consistent. It's also last writer wins, though, so two clients somehow permitted to contend for updating policy could overwrite each other's changes, just as with objects. Hi, thank you for confirming
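To illustrate the last-writer-wins behaviour confirmed above: two clients issuing put-bucket-policy against the same bucket simply overwrite each other, and a subsequent read returns whichever document landed last. The endpoint, bucket name, and policy files below are placeholders:

    # client A and client B each upload a complete policy document
    aws s3api put-bucket-policy --endpoint-url http://rgw.example.com --bucket demo --policy file://policy-a.json
    aws s3api put-bucket-policy --endpoint-url http://rgw.example.com --bucket demo --policy file://policy-b.json
    # the read is strongly consistent and returns the last write, i.e. policy-b.json
    aws s3api get-bucket-policy --endpoint-url http://rgw.example.com --bucket demo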

[ceph-users] Re: [ceph v16.2.10] radosgw crash

2023-09-12 Thread Tobias Urdin
Hello, That was solved in 16.2.11 in tracker [1] with fix [2]. Best regards Tobias [1] https://tracker.ceph.com/issues/55765 [2] https://github.com/ceph/ceph/pull/47194/commits > On 12 Sep 2023, at 05:29, Louis Koo wrote: > radosgw crash again with: ceph version 16.2.10

[ceph-users] Re: cannot create new OSDs - ceph version 17.2.6 (810db68029296377607028a6c6da1ec06f5a2b27) quincy (stable)

2023-09-12 Thread Konold, Martin
Hi Igor, I recreated the log with full debugging enabled. https://www.konsec.com/download/full-debug-20-ceph-osd.43.log.gz and another without the debug settings https://www.konsec.com/download/failed-ceph-osd.43.log.gz I hope you can draw some conclusions from it and I am looking forward to
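For anyone following along, raising log verbosity for a single OSD before reproducing the failure can be done roughly as below; the exact debug subsystems Igor asked for are not shown in this excerpt, so the selection here is a common choice rather than the thread's literal settings:

    # raise log verbosity for osd.43 only, then reproduce the failure
    ceph config set osd.43 debug_osd 20/20
    ceph config set osd.43 debug_bluestore 20/20
    ceph config set osd.43 debug_bluefs 20/20
    ceph config set osd.43 debug_rocksdb 20/20
    # remove the overrides once the log has been captured
    ceph config rm osd.43 debug_osd
    ceph config rm osd.43 debug_bluestore
    ceph config rm osd.43 debug_bluefs
    ceph config rm osd.43 debug_rocksdb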

[ceph-users] Re: Upgrading OS [and ceph release] nondestructively for oldish Ceph cluster

2023-09-12 Thread Ackermann, Christoph
Hello Sam, I started with a Ceph Jewel and CentOS 7 (POC) cluster in mid-2017, now successfully running the latest Quincy version 17.2.6 in production. BUT, we had to recreate all OSDs (DB/WAL) when moving from Filestore to Bluestore, and later once again for the CentOS 8 host migration. :-/ Major step

[ceph-users] Re: Rocksdb compaction and OSD timeout

2023-09-12 Thread Konstantin Shalygin
Hi Igor, > On 12 Sep 2023, at 15:28, Igor Fedotov wrote: > Default hybrid allocator (as well as AVL one it's based on) could take dramatically long time to allocate pretty large (hundreds of MBs) 64K-aligned chunks for BlueFS. At the original cluster it was exposed as 20-30 sec OSD

[ceph-users] Re: Rocksdb compaction and OSD timeout

2023-09-12 Thread Igor Fedotov
Hey Konstantin, forgot to mention - indeed, clusters having a 4K bluestore min alloc size are more likely to be exposed to the issue. The key point is the difference between bluestore and bluefs allocation sizes. The issue is likely to pop up when user and DB data are collocated but different
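A quick way to compare the two allocation sizes mentioned above for a given OSD (a sketch; the metadata field is only reported by newer releases, and the OSD id is illustrative):

    # allocation unit the OSD's data device was created with
    ceph osd metadata 0 | grep bluestore_min_alloc_size
    # allocation unit BlueFS requests on the shared device (default 64K)
    ceph config get osd.0 bluefs_shared_alloc_size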

[ceph-users] Re: Rocksdb compaction and OSD timeout

2023-09-12 Thread Igor Fedotov
Hi all, as promised, here is a postmortem analysis of what happened. The following ticket (https://tracker.ceph.com/issues/62815) with accompanying materials provides a low-level overview of the issue. In a few words it is as follows: the default hybrid allocator (as well as the AVL one it's based
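For readers who want to check whether their OSDs run the allocator discussed in the ticket, or to provoke the stall by forcing a compaction, something like the following can be used (OSD id illustrative):

    # show which BlueStore allocator the OSD is configured with (default: hybrid)
    ceph config get osd.0 bluestore_allocator
    # trigger a manual RocksDB compaction on that OSD and watch its log for slow-operation warnings
    ceph tell osd.0 compact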

[ceph-users] Re: Separating Mons and OSDs in Ceph Cluster

2023-09-12 Thread Joachim Kraftmayer - ceph ambassador
Another possibility is ceph mon discovery via DNS: https://docs.ceph.com/en/quincy/rados/configuration/mon-lookup-dns/#looking-up-monitors-through-dns Regards, Joachim ___ ceph ambassador DACH ceph consultant since 2012 Clyso GmbH - Premier Ceph
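A minimal sketch of what the linked DNS-based lookup looks like, with placeholder host names and TTLs: clients need no mon_host list, only the (default) SRV service name, and the DNS zone carries one SRV record per monitor.

    # ceph.conf on the clients - mon_dns_srv_name defaults to "ceph-mon", so this line is optional
    [global]
    mon_dns_srv_name = ceph-mon

    # DNS zone: one SRV record per monitor (priority weight port target)
    _ceph-mon._tcp.example.com. 3600 IN SRV 10 20 6789 mon1.example.com.
    _ceph-mon._tcp.example.com. 3600 IN SRV 10 20 6789 mon2.example.com.
    _ceph-mon._tcp.example.com. 3600 IN SRV 10 20 6789 mon3.example.com.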

[ceph-users] Re: Awful new dashboard in Reef

2023-09-12 Thread Nizamudeen A
Thank you Nicola, we are collecting this feedback. For a while we weren't focusing on the mobile view of the dashboard. If there are users relying on it, we'll look into that as well. We will let everyone know soon about the improvements in the UI. Regards, Nizam On Mon, Sep 11, 2023 at 2:23 PM

[ceph-users] Re: ceph orch command hung

2023-09-12 Thread Eugen Block
No, it's a flag you (or someone else?) set before shutting down the cluster. Look at your initial email; there were multiple flags set: pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover flag(s) set. When you bring your cluster back online you should unset those flags. Quoting
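Clearing the flags listed above is one unset per flag; note that pauserd and pausewr are cleared together via the combined pause flag:

    # clear the flags that were set before the shutdown
    ceph osd unset pause        # clears both pauserd and pausewr
    ceph osd unset nodown
    ceph osd unset noout
    ceph osd unset nobackfill
    ceph osd unset norebalance
    ceph osd unset norecover
    # confirm no flags remain in the health output
    ceph -s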