[ceph-users] Re: 14.2.20: Strange monitor problem eating 100% CPU

2021-05-04 Thread Rainer Krienke
Hello Dan, I checked whether I see the negative "Updated progress" messages, and I actually do. At 07:32:00 I started osd.2 again and then ran ceph -s a few times until the rebalance started, at which point ceph -s finally hung. In the mgr log I see this: https://cloud.uni-koblenz.de/s/CegqBT7pi9nobk4 At th
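
For reference, a minimal sketch of how such events could be spotted in the mgr log, assuming the default log location and that the negative progress value appears as a decimal in the "Updated progress" message (both are assumptions, not confirmed in the thread):

    # filter "Updated progress" events for negative values
    grep 'Updated progress' /var/log/ceph/ceph-mgr.*.log | grep -E -- '-[0-9]'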

[ceph-users] Re: Where is the MDS journal written to?

2021-05-04 Thread mabi
‐‐‐ Original Message ‐‐‐ On Tuesday, May 4, 2021 4:42 PM, 胡 玮文 wrote: > MDS also write its journal to the meta pool. And eventually located on the > OSDs. Thank you for your answer. That's good news then as my meta pool is all SSD. Another question popped up, for a small cluster like m

[ceph-users] Re: Certificat format for the SSL dashboard

2021-05-04 Thread Fabrice Bacchella
Thanks, my workaround for that problem was to put an Apache in front of the dashboard. > On 4 May 2021 at 20:49, Ernesto Puerta wrote: > > Hi Fabrice, > > Don't worry, it has nothing to do with the internal format of those > certificates. It's been a recent breakage and will be fixed in

[ceph-users] Re: Certificat format for the SSL dashboard

2021-05-04 Thread Ernesto Puerta
Hi Fabrice, Don't worry, it has nothing to do with the internal format of those certificates. It's been a recent breakage and will be fixed in 16.2.2. In the meantime, you can find a workaround (well, you already nailed it: the CLI command that failed is just a wrapper around the "ceph config-key
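
A possible sketch of that config-key workaround, assuming the dashboard reads its certificate from the mgr/dashboard/crt and mgr/dashboard/key keys (the key names are an assumption here, not confirmed in the quoted text; the certificate path is the one from Fabrice's message):

    # store certificate and key directly in the config-key store (key names assumed)
    ceph config-key set mgr/dashboard/crt -i /data/ceph/conf/ceph.crt
    ceph config-key set mgr/dashboard/key -i /data/ceph/conf/ceph.key
    # reload the dashboard module so it picks up the new certificate
    ceph mgr module disable dashboard
    ceph mgr module enable dashboard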

[ceph-users] Where is the MDS journal written to?

2021-05-04 Thread mabi
Hello, I have a small Octopus cluster (3 mon/mgr nodes, 3 osd nodes) installed with cephadm and hence running inside podman containers on Ubuntu 20.04. I want to use CephFS so I created a fs volume and saw that two MDS containers have been automatically deployed on two of my OSD nodes. Now I sa

[ceph-users] Weird PG Acting Set

2021-05-04 Thread Lazuardi Nasution
Hi, Suddenly we have a recovery_unfound situation. I find that the PG acting set is missing some OSDs which are up. Why can't OSDs 3 and 71 in the following PG query result be members of the PG acting set? Currently, we use v15.2.8. How do we recover from this situation? { "snap_trimq": "[]", "snap_trimq
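
A minimal sketch of the usual inspection steps for a recovery_unfound PG, with <pgid> as a placeholder since the actual PG id is not visible in the quoted snippet:

    ceph health detail              # lists the PGs with unfound objects
    ceph pg <pgid> query            # shows up/acting sets and might_have_unfound
    ceph pg <pgid> list_unfound     # enumerates the unfound objects
    # last resort only, after the missing OSDs have been ruled out:
    # ceph pg <pgid> mark_unfound_lost revert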

[ceph-users] Re: 14.2.20: Strange monitor problem eating 100% CPU

2021-05-04 Thread Dan van der Ster
On Tue, May 4, 2021 at 4:34 PM Janne Johansson wrote: > > On Tue 4 May 2021 at 16:29, Dan van der Ster wrote: > > BTW, if you find that this is indeed what's blocking your mons, you > > can work around it by setting `ceph progress off` until the fixes are > > released. > > Most ceph commands (and a f
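
The workaround quoted above, spelled out as commands (progress reporting can be switched back on once a release with the fix is installed):

    ceph progress off   # stop the mgr progress events that trigger the mon loop
    # ... upgrade to a release containing the fix ...
    ceph progress on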

[ceph-users] Re: Where is the MDS journal written to?

2021-05-04 Thread 胡 玮文
> On 4 May 2021, at 22:31, mabi wrote: > > Hello, > > I have a small Octopus cluster (3 mon/mgr nodes, 3 osd nodes) installed with > cephadm and hence running inside podman containers on Ubuntu 20.04. > > I want to use CephFS so I created a fs volume and saw that two MDS containers > have been automa
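
A small sketch of how one might confirm this on such a cluster, assuming a default-style metadata pool name of cephfs_metadata and the usual 200.* object naming of the rank-0 MDS journal (both names are assumptions, not taken from the thread):

    ceph fs ls                                   # shows which pool is the metadata pool
    rados -p cephfs_metadata ls | grep '^200\.'  # journal objects of MDS rank 0 (assumed layout)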

[ceph-users] Re: 14.2.20: Strange monitor problem eating 100% CPU

2021-05-04 Thread Janne Johansson
On Tue 4 May 2021 at 16:29, Dan van der Ster wrote: > BTW, if you find that this is indeed what's blocking your mons, you > can work around it by setting `ceph progress off` until the fixes are > released. Most ceph commands (and a few of the ceph daemon commands) would just block, so I guess one wou

[ceph-users] Re: 14.2.20: Strange monitor problem eating 100% CPU

2021-05-04 Thread Dan van der Ster
On Tue, May 4, 2021 at 4:21 PM Janne Johansson wrote: > > On Tue 4 May 2021 at 16:10, Rainer Krienke wrote: > > Hello, > > I am playing around with a test ceph 14.2.20 cluster. The cluster > > consists of 4 VMs, each VM has 2 OSDs. The first three VMs vceph1, > > vceph2 and vceph3 are monitors. v

[ceph-users] Re: 14.2.20: Strange monitor problem eating 100% CPU

2021-05-04 Thread Dan van der Ster
Hi, This sounds a lot like the negative progress bug we just found last week: https://tracker.ceph.com/issues/50591 That bug makes the mon enter a very long loop rendering a progress bar if the mgr incorrectly sends a message to the mon that the progress is negative. Octopus and later don't have

[ceph-users] Re: 14.2.20: Strange monitor problem eating 100% CPU

2021-05-04 Thread Janne Johansson
On Tue 4 May 2021 at 16:10, Rainer Krienke wrote: > Hello, > I am playing around with a test ceph 14.2.20 cluster. The cluster > consists of 4 VMs, each VM has 2 OSDs. The first three VMs vceph1, > vceph2 and vceph3 are monitors. vceph1 is also mgr. > What I did was quite simple. The cluster is in

[ceph-users] 14.2.20: Strange monitor problem eating 100% CPU

2021-05-04 Thread Rainer Krienke
Hello, I am playing around with a test ceph 14.2.20 cluster. The cluster consists of 4 VMs, each VM has 2 OSDs. The first three VMs vceph1, vceph2 and vceph3 are monitors. vceph1 is also mgr. What I did was quite simple. The cluster is in the state HEALTHY: vceph2: systemctl stop ceph-osd@2
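
The reproduction, as far as it can be read from the report, roughly amounts to the following (host names are those from the message; watching the mon with top is just one way to see the 100% CPU, not part of the original report):

    # on vceph2: take one OSD down while the cluster is healthy
    systemctl stop ceph-osd@2
    # on any node: run status checks until they hang
    ceph -s
    # on a monitor node: observe ceph-mon pegging a core
    top -p $(pidof ceph-mon)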

[ceph-users] Re: Failed cephadm Upgrade - ValueError

2021-05-04 Thread David Orman
Can you please run: "cat /sys/kernel/security/apparmor/profiles"? See if any of the lines have a label but no mode. Let us know what you find! Thanks, David On Mon, May 3, 2021 at 8:58 AM Ashley Merrick wrote: > Created BugTicket : https://tracker.ceph.com/issues/50616 > > On Mon May 03 2021 21
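
One way to spot such entries, assuming the usual "profile_name (mode)" line format of that file (the grep is a sketch, not taken from the thread):

    # lines that carry a label but no "(mode)" suffix
    grep -v '(' /sys/kernel/security/apparmor/profiles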

[ceph-users] Re: How to set bluestore_rocksdb_options_annex

2021-05-04 Thread Igor Fedotov
OSD to be restarted similar to altering bluestore_rocksdb_options. You can check if RocksDB has got proper option via OSD log, e.g. 2021-05-04T04:35:42.290+0300 7f325f8daec0  4 rocksdb: Options.write_buffer_size: 268435456 Thanks, Igor On 5/4/2021 7:59 AM, c...@elchaka.de wrote: For the r

[ceph-users] Re: How to set bluestore_rocksdb_options_annex

2021-05-04 Thread Igor Fedotov
You should use 'ceph daemon osd.3 config get bluestore_rocksdb_options_annex' to verify config has been altered. This option is treated as independent one and doesn't modify any other Ceph(!) option. bluestore_rocksdb_options + bluestore_rocksdb_options_annex form a single RocksDB options l
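
Putting the two answers together, a hedged sketch of setting and verifying the annex option; the example value mirrors the write_buffer_size shown in the log line above, and osd.3 plus the default log path are taken or assumed from the thread:

    # set the annex option for OSDs, then restart the OSD so it takes effect
    ceph config set osd bluestore_rocksdb_options_annex 'write_buffer_size=268435456'
    systemctl restart ceph-osd@3
    # verify the Ceph option itself ...
    ceph daemon osd.3 config get bluestore_rocksdb_options_annex
    # ... and the effective RocksDB option in the OSD log
    grep 'Options.write_buffer_size' /var/log/ceph/ceph-osd.3.log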

[ceph-users] possible bug in radosgw-admin bucket radoslist

2021-05-04 Thread Rob Haverkamp
Hi there, I think I found a bug in the radosgw-admin bucket radoslist command. I'm not 100% sure so would like to check here first before I file a bug report. I have a bucket called bucket3. If I do a multipart upload and stop it halfway for example and start a new upload with the same name and
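
For context, the command in question, using the bucket name from the message (a sketch of the invocation, not necessarily the reporter's exact one):

    # list the rados objects that radosgw-admin believes belong to the bucket
    radosgw-admin bucket radoslist --bucket=bucket3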

[ceph-users] Re: OSD id 241 != my id 248: conversion from "ceph-disk" to "ceph-volume simple" destroys OSDs

2021-05-04 Thread Frank Schilder
Hi Chris and Wissem, finally found the time: https://tracker.ceph.com/issues/50638 Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Chris Dunlop Sent: 16 March 2021 03:56:50 To: Frank Schilder Cc: ceph-users@ceph.

[ceph-users] Re: OSD slow ops warning not clearing after OSD down

2021-05-04 Thread Frank Schilder
I created a ticket: https://tracker.ceph.com/issues/50637 Hope a purge will do the trick. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: 03 May 2021 15:21:38 To: Dan van der Ster; Vladimir S
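
For reference, the purge mentioned above would look roughly like this, with <osd-id> as a placeholder since the id is not visible in the quoted text:

    # remove the OSD from the CRUSH map, auth db and OSD map in one step
    ceph osd purge <osd-id> --yes-i-really-mean-it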

[ceph-users] Re: Certificat format for the SSL dashboard

2021-05-04 Thread Fabrice Bacchella
And worse: $ ceph config-key set mgr/restful/fa42/crt -i /data/ceph/conf/ceph.crt set mgr/restful/fa42/crt The exact same certificate is accepted by the mgr. > On 3 May 2021 at 23:03, Fabrice Bacchella wrote: > > Once the dashboard is activated, I try to import certificates, but it fails:

[ceph-users] Re: How to set bluestore_rocksdb_options_annex

2021-05-04 Thread ceph
Also, it would be great if someone could give a hint on how to validate that this is really set for an OSD, as "ceph daemon osd.3 config get bluestore_rocksdb_options" still shows the default line without my modifications via the ...annex option. Best regards Mehmet On 4 May 2021 06:59:51 MES

[ceph-users] Manager carries wrong information until killing it

2021-05-04 Thread Nico Schottelius
Hello, we have a recurring, funky problem with managers on Nautilus (and probably also earlier versions): the manager displays incorrect information. This is a recurring pattern and it also breaks the prometheus graphs, as the I/O is described insanely incorrectly: "recovery: 43 TiB/s, 3.62k ke
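
Since the subject says the stale data only clears when the manager is killed, the usual way to force that without killing the process is a mgr failover; a sketch (the active mgr name has to be read from the cluster first):

    ceph mgr dump | grep active_name   # find the currently active mgr
    ceph mgr fail <active-mgr-name>    # force a standby mgr to take over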