[ceph-users] Re: ceph filesystem stuck in read only

2022-11-04 Thread Ramana Krisna Venkatesh Raja
On Fri, Nov 4, 2022 at 9:36 AM Galzin Rémi wrote: > Hi, > I'm looking for some help/ideas/advice in order to solve the problem > that occurred on my metadata > server after the server reboot. You rebooted an MDS's host and your file system became read-only? Was the Ceph cluster healthy before

[ceph-users] Re: OSDs are not utilized evenly

2022-11-04 Thread Joseph Mundackal
Hi Denis, can you share the following data points? ceph osd df tree (to see how the OSDs are distributed) ceph osd crush rule dump (to see what your EC rule looks like) ceph osd pool ls detail (to see the pools, their crush rule mapping, and pg_nums) Also "optimize_result": "Unable to
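A minimal sketch of gathering those data points on an admin node; the balancer check is added here because the quoted "optimize_result" line comes from its status output:

  ceph osd df tree          # how data and weights are distributed across OSDs, per CRUSH hierarchy
  ceph osd crush rule dump  # CRUSH rules, including the EC rule in question
  ceph osd pool ls detail   # pools, their crush_rule mapping and pg_num/pgp_num
  ceph balancer status      # balancer mode and the full "optimize_result" message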

[ceph-users] Re: Upgrade/migrate host operating system for ceph nodes (CentOS/Rocky)

2022-11-04 Thread Jimmy Spets
I have upgraded the majority of the nodes in a cluster that I manage from CentOS 8.6 to AlmaLinux 9. We have done the upgrade by emptying one node at a time and then reinstalling and bringing it back into the cluster. With AlmaLinux 9 I install the default "Server without GUI" packages and run
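A rough sketch of the empty/reinstall/re-add cycle described above, assuming a cephadm-managed cluster; the host name is a placeholder and the exact steps in the thread may differ:

  ceph orch host drain node01      # evacuate daemons and OSD data from the node
  ceph orch osd rm status          # watch the OSD removal / data migration finish

  # ... reinstall the OS (e.g. AlmaLinux 9 "Server without GUI") ...

  ssh-copy-id -f -i /etc/ceph/ceph.pub root@node01   # restore the orchestrator's SSH access
  ceph orch host add node01                          # bring the node back into the cluster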

[ceph-users] Re: Question about quorum

2022-11-04 Thread Janne Johansson
On Fri, 4 Nov 2022 at 13:37, Murilo Morais wrote: > Hi Tyler, thanks for clarifying, it makes total sense now. > Hypothetically, if there are failures and most of the monitors stop, how can I > re-initialize the cluster in its current state, or what can be done in this > kind of case? Just add one more mon
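For the hypothetical above (a majority of monitors permanently lost), the usual alternative to re-initializing the cluster is to shrink the monmap on a surviving monitor. A hedged sketch along the lines of the documented recovery procedure, with mon IDs as placeholders:

  systemctl stop ceph-mon@a                    # stop the surviving monitor
  ceph-mon -i a --extract-monmap /tmp/monmap   # dump its current monmap
  monmaptool /tmp/monmap --rm b                # drop the dead monitors ...
  monmaptool /tmp/monmap --rm c                # ... so the survivor can form quorum alone
  ceph-mon -i a --inject-monmap /tmp/monmap    # load the trimmed map
  systemctl start ceph-mon@a

Additional monitors can then be added back once the cluster is reachable again.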

[ceph-users] ceph filesystem stuck in read only

2022-11-04 Thread Galzin Rémi
Hi, I'm looking for some help/ideas/advice in order to solve the problem that occurred on my metadata server after the server reboot. "Ceph status" warns about my MDS being "read only" but the filesystem and the data seem healthy. It is still possible to access the content of my cephfs volumes
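A short sketch of the first things to check for a read-only MDS; an MDS typically switches to read-only after a failed write to the metadata pool, so the cluster health and the MDS log on the rebooted server are the places to look (daemon names are placeholders):

  ceph status
  ceph health detail
  ceph fs status
  ceph mds stat
  ceph tell mds.<active-mds> damage ls                # any recorded metadata damage
  journalctl -u ceph-mds@<id> --since "1 hour ago"    # unit name differs on cephadm deployments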

[ceph-users] Re: Question about quorum

2022-11-04 Thread Murilo Morais
Hi Tyler, thanks for clarifying, it makes total sense now. Hypothetically, if there are failures and most of the monitors stop, how can I re-initialize the cluster in its current state, or what can be done in this kind of case? On Thu, 3 Nov 2022 at 17:00, Tyler Brekke wrote: > Hi Murilo, > >

[ceph-users] Re: What is the reason of the rgw_user_quota_bucket_sync_interval and rgw_bucket_quota_ttl values?

2022-11-04 Thread Janne Johansson
On Fri, 4 Nov 2022 at 10:48, Szabo, Istvan (Agoda) wrote: > Hi, > One of my users told me that they can upload bigger files to the bucket than > the limit. My question is mainly to the developers: what's the reason for setting > rgw_bucket_quota_ttl=600 and rgw_user_quota_bucket_sync_interval=180?

[ceph-users] Re: ceph is stuck after increasing pg_nums

2022-11-04 Thread Adrian Nicolae
The problem was a single OSD daemon (not reported in health detail) which slowed down the entire peering process; after restarting it, the cluster got back to normal. On 11/4/2022 10:49 AM, Adrian Nicolae wrote: ceph health detail HEALTH_WARN Reduced data availability: 42 pgs inactive, 33
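For reference, restarting a single OSD daemon as described above is roughly the following (the OSD id is an example; which form applies depends on how the cluster was deployed):

  ceph orch daemon restart osd.103    # cephadm-managed cluster
  systemctl restart ceph-osd@103      # package-based deployment, run on the OSD's host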

[ceph-users] Re: PG Ratio for EC overwrites Pool

2022-11-04 Thread mailing-lists
Thank you very much. I've increased it to 2*#OSD rounded to the next power of 2. Best Ken On 03.11.22 15:30, Anthony D'Atri wrote: PG count isn't just about storage size, it also affects performance, parallelism, and recovery. You want pgp_num for the RBD metadata pool to be at the VERY least
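As a small illustration of that rule of thumb (pool name and OSD count are placeholders): with 50 OSDs backing the pool, 2 * 50 = 100, which rounds up to 128.

  ceph osd pool set rbd-meta pg_num 128
  ceph osd pool get rbd-meta pg_num     # verify; on Nautilus and later pgp_num follows automatically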

[ceph-users] Re: [PHISHING VERDACHT] ceph is stuck after increasing pg_nums

2022-11-04 Thread Burkhard Linke
Hi, On 11/4/22 09:45, Adrian Nicolae wrote: Hi, We have a Pacific cluster (16.2.4) with 30 servers and 30 OSDs. We started increasing the pg_num for the data bucket more than a month ago; I usually added 64 PGs in each step and didn't have any issues. The cluster was healthy before

[ceph-users] Re: How to remove remaining bucket index shard objects

2022-11-04 Thread 伊藤 祐司
Hi, Mysteriously, the large omap objects alert recurred recently. The values for omap_used_mbytes and omap_used_keys are slightly different from the previous investigation, but very close. Our team is going to keep this cluster for investigation and create another cluster to work with. Therefore, my
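A hedged sketch of inspecting the remaining bucket index shard objects and their omap key counts, assuming the default index pool name; the bucket name, bucket id, and shard number are placeholders:

  radosgw-admin bucket stats --bucket=mybucket | grep -E '"id"|"marker"'
  rados -p default.rgw.buckets.index ls | grep '<bucket-id>'                   # shard objects are named .dir.<bucket-id>.<shard>
  rados -p default.rgw.buckets.index listomapkeys .dir.<bucket-id>.0 | wc -l   # omap keys on one shard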

[ceph-users] What is the reason of the rgw_user_quota_bucket_sync_interval and rgw_bucket_quota_ttl values?

2022-11-04 Thread Szabo, Istvan (Agoda)
Hi, One of my users told me that they can upload bigger files to the bucket than the limit. My question is mainly to the developers: what's the reason for setting rgw_bucket_quota_ttl=600 and rgw_user_quota_bucket_sync_interval=180? I don't want to set them to 0 before I know the reason. With this
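For context, these two options control how long RGW trusts cached quota statistics and how often per-bucket stats are synced, which is presumably why uploads can temporarily exceed the limit. If the extra accounting load is acceptable, they can be lowered roughly like this (the client.rgw config target is an assumption about how the RGWs are deployed):

  ceph config set client.rgw rgw_bucket_quota_ttl 0
  ceph config set client.rgw rgw_user_quota_bucket_sync_interval 0
  ceph config get client.rgw rgw_bucket_quota_ttl    # verify the active value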

[ceph-users] Re: ceph is stuck after increasing pg_nums

2022-11-04 Thread Adrian Nicolae
ceph health detail HEALTH_WARN Reduced data availability: 42 pgs inactive, 33 pgs peering; 1 pool(s) have non-power-of-two pg_num; 2371 slow ops, oldest one blocked for 6218 sec, daemons [osd.103,osd.115,osd.126,osd.129,osd.130,osd.138,osd.155,osd.174,osd.179,osd.181]... have slow ops. [WRN]
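To dig into which operations are stuck on the OSDs listed above, the admin socket on one of those hosts can be queried (osd.103 is taken from the list as an example):

  ceph daemon osd.103 dump_ops_in_flight       # operations currently blocked on this OSD
  ceph daemon osd.103 dump_historic_slow_ops   # recent slow ops with per-stage timing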

[ceph-users] ceph is stuck after increasing pg_nums

2022-11-04 Thread Adrian Nicolae
Hi, We have a Pacific cluster (16.2.4) with 30 servers and 30 OSDs. We started increasing the pg_num for the data bucket more than a month ago; I usually added 64 PGs in each step and didn't have any issues. The cluster was healthy before increasing the PGs. Today I've added 128 PGs and the
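The incremental increases described here amount to something like the following; the pool name is an assumption (an RGW data pool), and on Nautilus and later pgp_num is adjusted automatically:

  ceph osd pool get default.rgw.buckets.data pg_num
  ceph osd pool set default.rgw.buckets.data pg_num 2176   # previous value + 128
  ceph -s                                                  # watch peering/backfill progress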