[ceph-users] Re: How to set bluestore_rocksdb_options_annex

2021-05-03 Thread ceph
For the record, I have tried/done this like: ceph config set osd bluestore_rocksdb_options_annex option1=8,option2=4 But I am not sure if it is necessary to restart the OSDs, because ceph config dump shows ... osd advanced option1=8,option2=4 * ... The "*" is shown in the "RO"
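A minimal way to check whether the annex value is actually live, rather than guessing (the option names are only placeholders, as in the message above):

    ceph config set osd bluestore_rocksdb_options_annex "option1=8,option2=4"
    ceph config dump | grep rocksdb                           # value stored in the mon config db
    ceph config show osd.0 bluestore_rocksdb_options_annex    # value the running daemon reports

If the running value does not match the stored one, the option only takes effect after an OSD restart; the "*" in the RO column of ceph config dump appears to mark exactly those options that cannot be changed at runtime.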

[ceph-users] Spam from Chip Cox

2021-05-03 Thread Frank Schilder
Does anyone else receive unsolicited replies from sender "Chip Cox" to e-mails posted on this list? Best regards, Frank Schilder, AIT Risø Campus, Bygning 109, rum S14

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Frank Schilder
I concur, having this heavily co-located set-up will not perform any better than you observe. Do you really have 2 MDS daemons per host? I just saw that you have only 2 disks, probably 1 per node. In this set-up, you cannot really expect good fail-over times due to the amount of simultaneous

[ceph-users] Certificat format for the SSL dashboard

2021-05-03 Thread Fabrice Bacchella
Once I activated the dashboard, I tried to import certificates, but it fails: $ ceph dashboard set-ssl-certificate-key -i /data/ceph/conf/ceph.key Error EINVAL: Traceback (most recent call last): File "/usr/share/ceph/mgr/mgr_module.py", line 1337, in _handle_command return
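For reference, the documented way to load the certificate and key into the dashboard looks roughly like this (the .crt path is an assumption, placed next to the key mentioned above):

    ceph dashboard set-ssl-certificate -i /data/ceph/conf/ceph.crt
    ceph dashboard set-ssl-certificate-key -i /data/ceph/conf/ceph.key
    # reload the dashboard so the new certificate is picked up
    ceph mgr module disable dashboard
    ceph mgr module enable dashboard

The dashboard expects PEM-encoded files; a DER or PKCS#12 bundle would have to be converted first (e.g. with openssl). If the traceback above persists with PEM input, the cause is likely something else.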

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Frank Schilder
Following up on this and other comments, there are 2 different time delays: (1) the time it takes from killing an MDS until a stand-by is made the active rank, and (2) the time it takes for the new active rank to restore all client sessions. My experience is that (1) takes close to 0

[ceph-users] Re: using ec pool with rgw

2021-05-03 Thread David Orman
We haven't found a more 'elegant' way, but the process we follow: we pre-create all the pools prior to creating the realm/zonegroup/zone, then we period apply, then we remove the default zonegroup/zone, period apply, then remove the default pools. Hope this is at least somewhat helpful, David On
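A rough sketch of that ordering, with purely hypothetical realm/zonegroup/zone names and EC profile (only the data pool is shown; the index and other zone pools would be pre-created the same way, replicated):

    # pre-create the EC data pool before RGW can auto-create it as replicated
    ceph osd erasure-code-profile set rgw-ec k=4 m=2 crush-failure-domain=host
    ceph osd pool create myzone.rgw.buckets.data 64 64 erasure rgw-ec
    ceph osd pool application enable myzone.rgw.buckets.data rgw
    # then create realm/zonegroup/zone and commit the period
    radosgw-admin realm create --rgw-realm=myrealm --default
    radosgw-admin zonegroup create --rgw-zonegroup=myzg --master --default
    radosgw-admin zone create --rgw-zonegroup=myzg --rgw-zone=myzone --master --default
    radosgw-admin period update --commit
    # finally remove the default zonegroup/zone, commit the period again,
    # and delete the leftover default.rgw.* pools, as described above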

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Eugen Block
I wouldn't recommend a colocated MDS in a production environment. Quoting Lokendra Rathour: Hello Frank, Thanks for your inputs. Responding to your queries, kindly refer below: - Do you have services co-located? - [loke]: Yes, they are colocated: - Cephnode1 :

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Lokendra Rathour
Hello Frank, Thanks for your inputs. Responding to your queries, kindly refer below: - Do you have services co-located? - [loke]: Yes, they are colocated: - Cephnode1: MDS, MGR, MON, RGW, OSD, MDS - Cephnode2: MDS, MGR, MON, RGW, OSD, MDS - Cephnode3: MON -

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Lokendra Rathour
Yes Patrick, in the process of killing the MDS we are also killing the monitor along with the OSD, Mgr and RGW. We are powering off/rebooting the complete node (with MDS, Mon, RGW, OSD, Mgr daemons). Cluster: 2 nodes with MDS|Mon|RGW|OSD each and a third node with 1 Mon. Note: when I am only stopping the MDS

[ceph-users] Re: How can I get tail information a parted rados object

2021-05-03 Thread by morphin
Hi Rob. I think I wasn't clear enough with the first mail. I'm having issues with the RGW. radosgw-admin or S3 cannot access some objects in the bucket. These objects exist in RADOS and I can export them with "rados get -p $pool $object". But the problem is the 4M chunks and multiparts. I have

[ceph-users] Re: OSD slow ops warning not clearing after OSD down

2021-05-03 Thread Frank Schilder
Hi Dan, just restarted all MONs, no change though :( Thanks for looking at this. I will wait until tomorrow. My plan is to get the disk up again with the same OSD ID and would expect that this will eventually allow the message to be cleared. Best regards, Frank Schilder AIT

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Lokendra Rathour
OK, will try with Nautilus as well. But we are really configuring too many variables to achieve 10 seconds of failover time. Is it possible for you to share the setup details? We are using a 2-node ceph cluster in health OK (configured replication factor and related variables). Hardware is HP,

[ceph-users] Re: OSD slow ops warning not clearing after OSD down

2021-05-03 Thread Frank Schilder
Hi Vladimir, thanks for your reply. I did, the cluster is healthy: [root@gnosis ~]# ceph status cluster: id: --- health: HEALTH_WARN 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops services: mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Patrick Donnelly
On Mon, May 3, 2021 at 6:36 AM Lokendra Rathour wrote: > > Hi Team, > I was setting up the ceph cluster with > >- Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW >- Deployment Type: Active Standby >- Testing Mode: Failover of MDS Node >- Setup : Octopus (15.2.7) >- OS: centos 8.3 >

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Eugen Block
Hi, Yes we tried ceph-standby-replay but could not see much difference in the handover time. It was coming as 35 to 40 seconds in either case. Did you also change these variables (as mentioned above) along with the hot-standby? no, we barely differ from the default configs and haven't

[ceph-users] Troubleshoot MDS failure

2021-05-03 Thread Alessandro Piazza
Dear all, I'm having a hard time troubleshooting a file-system failure on my 3 node cluster (deployed with cephadm + docker). After moving some files between folders, the cluster became laggy and Metadata Servers started failing and got stuck in rejoin state. Of course I already tried to
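Not from this thread, but the usual first diagnostics for an MDS stuck in rejoin would be something along these lines:

    ceph fs status                      # which ranks are active / rejoining
    ceph health detail                  # MDS-related warnings with details
    ceph fs dump                        # full MDS map, including failed ranks
    ceph config set mds debug_mds 10    # temporarily raise MDS log verbosity

Remember to lower debug_mds again afterwards; the log of the rejoining MDS usually shows what it is waiting on.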

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Lokendra Rathour
Hello Eugen, Thank you for the response. Yes we tried ceph-standby-replay but could not see much difference in the handover time. It was coming as 35 to 40 seconds in either case. Did you also change these variables (as mentioned above) along with the hot-standby? A couple of seconds is

[ceph-users] Failed cephadm Upgrade - ValueError

2021-05-03 Thread Ashley Merrick
Created bug ticket: https://tracker.ceph.com/issues/50616 > On Mon May 03 2021 21:49:41 GMT+0800 (Singapore Standard Time), Ashley > Merrick wrote: > Just checked cluster logs and they are full of: cephadm exited with an error > code: 1, stderr: Reconfig daemon osd.16 ... Traceback (most recent

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Eugen Block
Also there's a difference between 'standby-replay' (hot standby) and just 'standby'. We have been using CephFS for a couple of years now with standby-replay and the failover takes a couple of seconds max, depending on the current load. Have you tried to enable the standby-replay config and tested the
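For reference, since Nautilus standby-replay is enabled per filesystem rather than per daemon (the filesystem name "cephfs" is simply the one used in this thread):

    ceph fs set cephfs allow_standby_replay true
    ceph fs status      # the standby daemon should now show up as standby-replay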

[ceph-users] Failed cephadm Upgrade - ValueError

2021-05-03 Thread Ashley Merrick
Just checked cluster logs and they are full of: cephadm exited with an error code: 1, stderr: Reconfig daemon osd.16 ... Traceback (most recent call last): File "/var/lib/ceph/30449cba-44e4-11eb-ba64-dda10beff041/cephadm.17068a0b484bdc911a9c50d6408adfca696c2faaa65c018d660a3b697d119482", line

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Olivier AUDRY
Hello, perhaps you should have more than one MDS active. mds: cephfs:3 {0=cephfs-d=up:active,1=cephfs-e=up:active,2=cephfs-a=up:active} 1 up:standby-replay I got 3 active MDS and one standby. I'm using Rook in Kubernetes for this setup. oau On Monday, 03 May 2021 at 19:06 +0530, Lokendra
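Raising the number of active MDS ranks, as suggested here, is a one-liner (the value 3 matches the setup quoted above):

    ceph fs set cephfs max_mds 3
    ceph fs status      # extra standbys are promoted to ranks 1 and 2

Note that more active ranks spread metadata load, but a failed rank still has to be taken over by a standby, so this alone does not remove the failover delay.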

[ceph-users] [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Lokendra Rathour
Hi Team, I was setting up the ceph cluster with - Node Details:3 Mon,2 MDS, 2 Mgr, 2 RGW - Deployment Type: Active Standby - Testing Mode: Failover of MDS Node - Setup : Octopus (15.2.7) - OS: centos 8.3 - hardware: HP - Ram: 128 GB on each Node - OSD: 2 ( 1 tb each) -
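The settings most often tuned for faster detection of a dead MDS are the beacon interval and grace; the values below are purely illustrative and not taken from this thread. The grace is, as far as I know, also evaluated by the mons, hence the global scope. Lowering them trades faster failover detection against more false positives.

    ceph config set global mds_beacon_interval 2
    ceph config set global mds_beacon_grace 10
    ceph config get mds mds_beacon_grace     # confirm what the MDS will use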

[ceph-users] Re: OSD slow ops warning not clearing after OSD down

2021-05-03 Thread Dan van der Ster
Wait, first just restart the leader mon. See: https://tracker.ceph.com/issues/47380 for a related issue. -- dan On Mon, May 3, 2021 at 2:55 PM Vladimir Sigunov wrote: > > Hi Frank, > Yes, I would purge the osd. The cluster looks absolutely healthy except of > this osd.584 Probably, the purge
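Identifying and restarting only the leader, as suggested, could look like this (the hostname is whatever quorum_status reports; use the orch variant on cephadm-managed clusters):

    ceph quorum_status -f json-pretty | grep quorum_leader_name
    systemctl restart ceph-mon@<leader-host>      # classic/packaged deployment
    ceph orch daemon restart mon.<leader-host>    # cephadm deployment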

[ceph-users] Re: OSD slow ops warning not clearing after OSD down

2021-05-03 Thread Vladimir Sigunov
Hi Frank, Yes, I would purge the osd. The cluster looks absolutely healthy except for this osd.584. Probably, the purge will help the cluster to forget this faulty one. Also, I would restart the monitors, too. With the amount of data you maintain in your cluster, I don't think your ceph.conf
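If purging really turns out to be necessary, the usual form is shown below (the id is the one from the warning earlier in the thread; this removes the OSD from the CRUSH map, its auth key and the OSD map in one step, so it conflicts with re-using the same ID later):

    ceph osd purge 580 --yes-i-really-mean-it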

[ceph-users] Re: How can I get tail information a parted rados object

2021-05-03 Thread Rob Haverkamp
Hi Morphin, There are multiple ways you can do this. 1. Run radosgw-admin bucket radoslist --bucket , write that output to a file, grep all entries containing the object name 'im034113.jpg', sort that list and download them. 2. Run radosgw-admin object stat --bucket --object ; this
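A sketch of both approaches with a hypothetical bucket name (the data pool name depends on the zone; default.rgw.buckets.data is only the common default):

    # 1) list the RADOS objects behind the bucket, pick out the parts of one S3 key
    radosgw-admin bucket radoslist --bucket=mybucket > radoslist.txt
    grep im034113.jpg radoslist.txt | sort
    rados -p default.rgw.buckets.data get <rados-object-name> part.bin   # per matching object

    # 2) stat the S3 object to see its manifest (head object plus tail/multipart layout)
    radosgw-admin object stat --bucket=mybucket --object=im034113.jpg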

[ceph-users] Re: OSD slow ops warning not clearing after OSD down

2021-05-03 Thread Vladimir Sigunov
Hi Frank. Check your cluster for inactive/incomplete placement groups. I saw similar behavior on Octopus when some PGs were stuck in an incomplete/inactive or peering state. From: Frank Schilder Sent: Monday, May 3, 2021 3:42:48 AM To: ceph-users@ceph.io Subject:
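Quick ways to check for such PGs (nothing cluster-specific assumed here):

    ceph health detail | grep -i pg
    ceph pg dump_stuck inactive
    ceph pg dump_stuck unclean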

[ceph-users] OSD slow ops warning not clearing after OSD down

2021-05-03 Thread Frank Schilder
Dear cephers, I have a strange problem. An OSD went down and recovery finished. For some reason, I have a slow ops warning for the failed OSD stuck in the system: health: HEALTH_WARN 430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops The OSD is auto-out: | 580 |

[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-03 Thread Ilya Dryomov
On Mon, May 3, 2021 at 12:24 PM Magnus Harlander wrote: > > On 03.05.21 at 11:22, Ilya Dryomov wrote: > > There is a 6th osd directory on both machines, but it's empty > > [root@s0 osd]# ll > total 0 > drwxrwxrwt. 2 ceph ceph 200 May 2 16:31 ceph-1 > drwxrwxrwt. 2 ceph ceph 200 May 2 16:31

[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-03 Thread Ilya Dryomov
On Mon, May 3, 2021 at 12:27 PM Magnus Harlander wrote: > > On 03.05.21 at 12:25, Ilya Dryomov wrote: > > ceph osd setmaxosd 10 > > Bingo! Mount works again. > > Very strange things are going on here (-: > > Thanx a lot for now!! If I can help to track it down, please let me know. Good to
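For anyone hitting the same symptom, the check and the fix that worked here are simply:

    ceph osd dump | grep max_osd    # shows the max_osd value in the osdmap (12 in this case)
    ceph osd setmaxosd 10           # shrink it back to match the real number of OSDs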

[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-03 Thread Ilya Dryomov
On Mon, May 3, 2021 at 12:00 PM Magnus Harlander wrote: > > On 03.05.21 at 11:22, Ilya Dryomov wrote: > > max_osd 12 > > I never had more than 10 osds on the two osd nodes of this cluster. > > I was running a 3 osd-node cluster earlier with more than 10 > osds, but the current cluster has been

[ceph-users] Re: Cannot create issue in bugtracker

2021-05-03 Thread David Caro
I created an issue during the weekend without problems: https://tracker.ceph.com/issues/50604 On 05/03 09:36, Tobias Urdin wrote: > Hello, > > Anybody, still error? > > > Best regards > > - > > > Internal error > An error occurred on the page you were trying to access. > If you

[ceph-users] Re: Cannot create issue in bugtracker

2021-05-03 Thread Tobias Urdin
Hello, Is anybody else still seeing this error? Best regards - Internal error An error occurred on the page you were trying to access. If you continue to experience problems please contact your Redmine administrator for assistance. If you are the Redmine administrator, check your log files for details

[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-03 Thread Ilya Dryomov
On Mon, May 3, 2021 at 9:20 AM Magnus Harlander wrote: > > On 03.05.21 at 00:44, Ilya Dryomov wrote: > > On Sun, May 2, 2021 at 11:15 PM Magnus Harlander wrote: > > Hi, > > I know there is a thread about problems with mounting cephfs with 5.11 > kernels. > > ... > > Hi Magnus, > > What is the

[ceph-users] Failed cephadm Upgrade - ValueError

2021-05-03 Thread Ashley Merrick
Hello, wondering if anyone had any feedback on some commands I could try to manually update the current OSD that is down to 16.2.1, so I can at least get around this upgrade bug and back to 100%? If there are any logs, or if it seems to be a new bug and I should create a bugzilla report, do let me