[ceph-users] Re: Abandon incomplete (damaged EC) pgs - How to manage the impact on cephfs?

2021-04-08 Thread Szabo, Istvan (Agoda)
Hi, So finally how did you solve it? Which method out of the three? Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- -Original Message-

[ceph-users] Re: Nautilus 14.2.19 mon 100% CPU

2021-04-08 Thread Robert LeBlanc
Good thought. The storage for the monitor data is a RAID-0 over three NVMe devices. Watching iostat, they are completely idle, maybe 0.8% to 1.4% for a second every minute or so. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Apr 8, 2021
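A minimal sketch of that kind of device-level check with iostat (from sysstat); the NVMe device names are placeholders for the members of the mon store's RAID-0:
    # Extended per-device statistics every 2 seconds
    iostat -x nvme0n1 nvme1n1 nvme2n1 2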

[ceph-users] Re: Nautilus 14.2.19 radosgw ignoring ceph config

2021-04-08 Thread Arnaud Lefebvre
Hello Graham, We have the same issue after an upgrade from 14.2.16 to 14.2.19. I tracked down the issue today and made a bug report a few hours ago: https://tracker.ceph.com/issues/50249. Maybe the title can be adjusted if more than rgw_frontends is impacted. First nautilus release I found with
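A quick way to compare the stored setting against what the daemon actually does, as a sketch; the rgw instance name is a placeholder:
    # Value recorded in the cluster config database
    ceph config get client.rgw.gateway1 rgw_frontends
    # Ports the running radosgw is actually listening on
    ss -tlnp | grep radosgw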

[ceph-users] Re: Version of podman for Ceph 15.2.10

2021-04-08 Thread David Orman
The latest podman 3.0.1 release is fine (we have many production clusters running this). We have not tested 3.1 yet, but will soon. > On Apr 8, 2021, at 10:32, mabi wrote: > > Hello, > > I would like to install Ceph 15.2.10 using cephadm and just found the > following table by
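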

[ceph-users] Re: Abandon incomplete (damaged EC) pgs - How to manage the impact on cephfs?

2021-04-08 Thread Michael Thomas
Hi Joshua, I have had a similar issue three different times on one of my cephfs pools (15.2.10). The first time this happened I had lost some OSDs. In all cases I ended up with degraded PGs with unfound objects that could not be recovered. Here's how I recovered from the situation. Note
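The generic workflow for inspecting and, as a last resort, giving up on unfound objects looks roughly like the sketch below; this is not necessarily the exact sequence used here, and the pgid is a placeholder:
    # Find PGs reporting unfound objects
    ceph health detail | grep unfound
    # List the unfound objects in one PG
    ceph pg 21.3f list_unfound
    # Last resort once the objects are known to be unrecoverable:
    # revert to an older copy, or forget them entirely
    ceph pg 21.3f mark_unfound_lost revert
    ceph pg 21.3f mark_unfound_lost delete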

[ceph-users] short pages when listing RADOSGW buckets via Swift API

2021-04-08 Thread Paul Collins
Hi, I noticed while using rclone to migrate some data from a Swift cluster into a RADOSGW cluster that sometimes when listing a bucket RADOSGW will not always return as many results as specified by the "limit" parameter, even when more objects remain to list. This results in rclone believing on
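For reference, a Swift container listing with the limit and marker parameters, sketched with curl; the endpoint, token and container name are assumptions:
    # First page, at most 1000 names
    curl -s -H "X-Auth-Token: $TOKEN" \
      "https://rgw.example.com/swift/v1/mycontainer?format=json&limit=1000"
    # A client is expected to keep paging from the last name returned
    # until an empty response comes back, not stop at a short page
    curl -s -H "X-Auth-Token: $TOKEN" \
      "https://rgw.example.com/swift/v1/mycontainer?format=json&limit=1000&marker=<last-name>"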

[ceph-users] Re: Nautilus 14.2.19 mon 100% CPU

2021-04-08 Thread Stefan Kooman
On 4/8/21 6:22 PM, Robert LeBlanc wrote: I upgraded our Luminous cluster to Nautilus a couple of weeks ago and converted the last batch of FileStore OSDs to BlueStore about 36 hours ago. Yesterday our monitor cluster went nuts and started constantly calling elections because monitor nodes were

[ceph-users] Abandon incomplete (damaged EC) pgs - How to manage the impact on cephfs?

2021-04-08 Thread Joshua West
Hey everyone. Inside of cephfs, I have a directory which I setup a directory layout field to use an erasure coded (CLAY) pool, specific to the task. The rest of my cephfs is using normal replication. Fast forward some time, and the EC directory has been used pretty extensively, and through some
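Setting such a directory layout typically looks like the sketch below; pool, filesystem and directory names are placeholders:
    # Allow the EC pool as a cephfs data pool (EC data pools need overwrites enabled)
    ceph osd pool set ec_clay_pool allow_ec_overwrites true
    ceph fs add_data_pool cephfs ec_clay_pool
    # Pin the directory to the EC pool via its layout xattr and verify
    setfattr -n ceph.dir.layout.pool -v ec_clay_pool /mnt/cephfs/ec_dir
    getfattr -n ceph.dir.layout /mnt/cephfs/ec_dir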

[ceph-users] Re: bluestore_min_alloc_size_hdd on Octopus (15.2.10) / XFS formatted RBDs

2021-04-08 Thread Igor Fedotov
Hi David, On 4/7/2021 7:43 PM, David Orman wrote: Now that the hybrid allocator appears to be enabled by default in Octopus, is it safe to change bluestore_min_alloc_size_hdd to 4k from 64k on Octopus 15.2.10 clusters, and then redeploy every OSD to switch to the smaller allocation size,
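For context, the change itself is a one-liner, but it only affects OSDs created after it is set, which is why the full redeploy comes up (a sketch, not a recommendation):
    ceph config set osd bluestore_min_alloc_size_hdd 4096
    # Existing OSDs keep the allocation size they were built with;
    # each one has to be destroyed and re-created to pick up the new value.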

[ceph-users] Re: Nautilus 14.2.19 mon 100% CPU

2021-04-08 Thread Robert LeBlanc
I found this thread that matches a lot of what I'm seeing. I see the ms_dispatch thread going to 100%, but I'm at a single MON, the recovery is done and the rocksdb MON database is ~300MB. I've tried all the settings mentioned in that thread with no noticeable improvement. I was hoping that once
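Two checks that are often tried in this situation, as a sketch; the mon id is a placeholder:
    # Compact the mon's rocksdb store
    ceph tell mon.mon1 compact
    # See which ceph-mon thread is burning the CPU (e.g. ms_dispatch)
    top -H -p $(pidof ceph-mon)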

[ceph-users] Version of podman for Ceph 15.2.10

2021-04-08 Thread mabi
Hello, I would like to install Ceph 15.2.10 using cephadm and just found the following table by checking the requirements on the host: https://docs.ceph.com/en/latest/cephadm/compatibility/#compatibility-with-podman-versions Do I understand this table correctly that I should be using podman
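Checking what is currently installed before comparing against that table is a one-liner:
    podman --version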

[ceph-users] Nautilus 14.2.19 radosgw ignoring ceph config

2021-04-08 Thread Graham Allan
We just updated one of our ceph clusters from 14.2.15 to 14.2.19, and see some unexpected behavior by radosgw - it seems to ignore parameters set by the ceph config database. Specifically this is making it start up listening only on port 7480, and not the configured 80 and 443 (ssl) ports.
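For reference, a typical way those ports are set through the config database; the instance name and certificate path below are placeholders:
    ceph config set client.rgw.gateway1 rgw_frontends \
      "beast port=80 ssl_port=443 ssl_certificate=/etc/ceph/rgw.pem"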

[ceph-users] Re: KRBD failed to mount rbd image if mapping it to the host with read-only option

2021-04-08 Thread Ha, Son Hai
Thank you. The option "noload" works as expected. -Original Message- From: Wido den Hollander Sent: Thursday, April 8, 2021 3:56 PM To: Ha, Son Hai; ceph-users@ceph.io; ceph-us...@lists.ceph.com Subject: Re: [ceph-users] KRBD failed to mount rbd image if mapping it to the host with
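A sketch of the working combination, assuming the filesystem on the image is ext4 and using example names:
    # Map the image read-only, then mount without replaying the journal
    rbd map --read-only rbd/myimage
    mount -o ro,noload /dev/rbd0 /mnt/rbd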

[ceph-users] Re: Nautilus 14.2.19 mon 100% CPU

2021-04-08 Thread Robert LeBlanc
On Thu, Apr 8, 2021 at 11:24 AM Robert LeBlanc wrote: > > On Thu, Apr 8, 2021 at 10:22 AM Robert LeBlanc wrote: > > > > I upgraded our Luminous cluster to Nautilus a couple of weeks ago and > > converted the last batch of FileStore OSDs to BlueStore about 36 hours ago. > > Yesterday our

[ceph-users] Re: Nautilus 14.2.19 mon 100% CPU

2021-04-08 Thread Robert LeBlanc
On Thu, Apr 8, 2021 at 10:22 AM Robert LeBlanc wrote: > > I upgraded our Luminous cluster to Nautilus a couple of weeks ago and > converted the last batch of FileStore OSDs to BlueStore about 36 hours ago. > Yesterday our monitor cluster went nuts and started constantly calling > elections

[ceph-users] Nautilus 14.2.19 mon 100% CPU

2021-04-08 Thread Robert LeBlanc
I upgraded our Luminous cluster to Nautilus a couple of weeks ago and converted the last batch of FileStore OSDs to BlueStore about 36 hours ago. Yesterday our monitor cluster went nuts and started constantly calling elections because monitor nodes were at 100% and wouldn't respond to heartbeats.

[ceph-users] KRBD failed to mount rbd image if mapping it to the host with read-only option

2021-04-08 Thread Ha, Son Hai
Hi everyone, We encountered an issue with KRBD mounting after mapping it to the host with the read-only option. We have tried to pinpoint where the problem is, but have not been able to. The image mounts fine if we map it without the "read-only" option. This leads to an issue that the pod in k8s cannot

[ceph-users] Re: Ceph CFP Coordination for 2021

2021-04-08 Thread Mike Perez
KubeCon NA has extended their CFP dates to May 23rd. https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/program/cfp/#overview DevConf.US also has its CFP open until May 31st. https://www.devconf.info/us/ And lastly, we have Cloud-Native Data Management Day on May 4th with

[ceph-users] Re: KRBD failed to mount rbd image if mapping it to the host with read-only option

2021-04-08 Thread Wido den Hollander
On 08/04/2021 14:09, Ha, Son Hai wrote: Hi everyone, We encountered an issue with KRBD mounting after mapping it to the host with the read-only option. We have tried to pinpoint where the problem is, but have not been able to. See my reply down below. The image mounts fine if we map it without the

[ceph-users] Re: cephadm/podman :: upgrade to pacific stuck

2021-04-08 Thread Adrian Sevcenco
Hi! (and thanks for taking your time to answer my email :) ) On 4/8/21 1:18 AM, Sage Weil wrote: You would normally tell cephadm to deploy another mgr with 'ceph orch apply mgr 2'. In this case, the default placement policy for mgrs is already either 2 or 3, though--the problem is that you
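The placement commands referenced above, sketched with placeholder host names:
    # Keep two mgr daemons running, letting cephadm pick the hosts
    ceph orch apply mgr 2
    # Or pin them to specific hosts
    ceph orch apply mgr --placement="host1 host2"
    # Confirm what was deployed
    ceph orch ps --daemon-type mgr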

[ceph-users] Re: Upgrade and lost osds Operation not permitted

2021-04-08 Thread Behzad Khoshbakhti
I believe there is some problem in systemd, as the OSD starts successfully when run manually using the ceph-osd command. On Thu, Apr 8, 2021, 10:32 AM Enrico Kern wrote: > I agree. But why does the process start manually without systemd, which > obviously has nothing to do with uid/gid
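A few checks that help compare the two start paths, as a sketch; the OSD id is a placeholder:
    # What the packaged unit executes, and which user it drops privileges to
    systemctl cat ceph-osd@12.service | grep ExecStart
    # Manual foreground start for comparison
    /usr/bin/ceph-osd -f --cluster ceph --id 12 --setuser ceph --setgroup ceph
    # On-disk ownership should match the ceph user the unit switches to
    id ceph
    ls -ln /var/lib/ceph/osd/ceph-12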

[ceph-users] Re: Upgrade and lost osds Operation not permitted

2021-04-08 Thread Enrico Kern
I agree. But why does the process start manually without systemd, which obviously has nothing to do with uid/gid 167? It is also not really a fix to let all users change uid/gids... On Wed, Apr 7, 2021 at 7:39 PM Wladimir Mutel wrote: > Could there be more smooth migration? On my Ubuntu I have