[ceph-users] Re: quincy rgw with ceph orch and two realms only get answers from first realm

2024-05-24 Thread Boris
Thanks for being my rubber ducky.
Turns out I didn't have the rgw_zonegroup configured in the first apply.
Adding it to the spec and applying it again does not restart or
reconfigure the containers, though.
After a ceph orch restart rgw.customer it seems to work now.
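For the archives, the sequence that fixed it was roughly this (spec file
name assumed):

ceph orch apply -i rgw-customer.yaml
ceph orch restart rgw.customer
ceph orch ps | grep rgw        # check that the daemons actually restarted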

Happy weekend everybody.

Am Fr., 24. Mai 2024 um 14:19 Uhr schrieb Boris :

> Hi,
>
> we are currently in the process of adopting the main S3 cluster to the
> orchestrator.
> We have two realms (one for us and one for the customer).
>
> The old config worked fine, and depending on which port I requested, I got a
> different x-amz-request-id header back:
> x-amz-request-id: tx0307170ac0d734ab4-0066508120-94aa0f66-eu-central-1
> x-amz-request-id: tx0a9d8fc1821bbe258-00665081d1-949b2447-eu-customer-1
>
> I then deployed the orchestrator config to other systems and checked, but
> now I always get an answer from the eu-central-1 zone and never from the
> customer zone.
>
> The test is very simple and works fine for the old rgw config, but fails
> for the new orchestrator config:
> curl [IPv6]:7481/somebucket -v
>
> All rgw instances are running on ceph version 17.2.7.
> Configs attached below.
>
> The old ceph.conf looked like this:
> [client.eu-central-1-s3db1-old]
> rgw_frontends = beast endpoint=[::]:7480
> rgw_region = eu
> rgw_zone = eu-central-1
> rgw_dns_name = example.com
> rgw_dns_s3website_name = eu-central-1.example.com
> [client.eu-customer-1-s3db1]
> rgw_frontends = beast endpoint=[::]:7481
> rgw_region = eu-customer
> rgw_zone = eu-customer-1
> rgw_dns_name = s3.customer.domain
> rgw_dns_s3website_name = eu-central-1.s3.customer.domain
>
> And this is the new service.yaml
> service_type: rgw
> service_id: ab12
> placement:
>   label: rgw
> config:
>   debug_rgw: 0
>   rgw_thread_pool_size: 2048
>   rgw_dns_name: example.com
>   rgw_dns_s3website_name: eu-central-1.example.com
>   rgw_enable_gc_threads: false
> spec:
>   rgw_frontend_port: 7480
>   rgw_frontend_type: beast
>   rgw_realm: company
>   rgw_zone: eu-central-1
>   rgw_zonegroup: eu
> ---
> service_type: rgw
> service_id: customer
> placement:
>   label: rgw
> config:
>   debug_rgw: 0
>   rgw_dns_name: s3.customer.domain
>   rgw_dns_s3website_name: eu-central-1.s3.customer.domain
>   rgw_enable_gc_threads: false
> spec:
>   rgw_frontend_port: 7481
>   rgw_frontend_type: beast
>   rgw_realm: customer
>   rgw_zone: eu-customer-1
>   rgw_zonegroup: eu-customer
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] quincy rgw with ceph orch and two realms only get answers from first realm

2024-05-24 Thread Boris
Hi,

we are currently in the process of adopting the main S3 cluster to the
orchestrator.
We have two realms (one for us and one for the customer).

The old config worked fine, and depending on which port I requested, I got a
different x-amz-request-id header back:
x-amz-request-id: tx0307170ac0d734ab4-0066508120-94aa0f66-eu-central-1
x-amz-request-id: tx0a9d8fc1821bbe258-00665081d1-949b2447-eu-customer-1

I then deployed the orchestrator config to other systems and checked, but
now I always get an answer from the eu-central-1 zone and never from the
customer zone.

The test is very simple and works fine for the old rgw config, but fails
for the new orchestrator config:
curl [IPv6]:7481/somebucket -v

All rgw instances are running on ceph version 17.2.7.
Configs attached below.

The old ceph.conf looked like this:
[client.eu-central-1-s3db1-old]
rgw_frontends = beast endpoint=[::]:7480
rgw_region = eu
rgw_zone = eu-central-1
rgw_dns_name = example.com
rgw_dns_s3website_name = eu-central-1.example.com
[client.eu-customer-1-s3db1]
rgw_frontends = beast endpoint=[::]:7481
rgw_region = eu-customer
rgw_zone = eu-customer-1
rgw_dns_name = s3.customer.domain
rgw_dns_s3website_name = eu-central-1.s3.customer.domain

And this is the new service.yaml
service_type: rgw
service_id: ab12
placement:
  label: rgw
config:
  debug_rgw: 0
  rgw_thread_pool_size: 2048
  rgw_dns_name: example.com
  rgw_dns_s3website_name: eu-central-1.example.com
  rgw_enable_gc_threads: false
spec:
  rgw_frontend_port: 7480
  rgw_frontend_type: beast
  rgw_realm: company
  rgw_zone: eu-central-1
  rgw_zonegroup: eu
---
service_type: rgw
service_id: customer
placement:
  label: rgw
config:
  debug_rgw: 0
  rgw_dns_name: s3.customer.domain
  rgw_dns_s3website_name: eu-central-1.s3.customer.domain
  rgw_enable_gc_threads: false
spec:
  rgw_frontend_port: 7481
  rgw_frontend_type: beast
  rgw_realm: customer
  rgw_zone: eu-customer-1
  rgw_zonegroup: eu-customer

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: purging already destroyed OSD leads to degraded and misplaced objects?

2024-04-04 Thread Boris
Hi Tobias,

what we usually do when we want to remove an OSD is to reweight it to 0 in
the CRUSH map first. That avoids the second rebalance you would otherwise
get when the OSD is removed from the CRUSH map. Setting an OSD to "out"
keeps its CRUSH weight, so once the OSD is purged the cluster rebalances
the PGs again to reflect the new CRUSH map.
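The sequence looks roughly like this (the OSD id is a placeholder):

ceph osd crush reweight osd.<id> 0
# wait until backfill is finished and the cluster is HEALTH_OK again
ceph osd out <id>
systemctl stop ceph-osd@<id>        # or: ceph orch daemon stop osd.<id>
ceph osd purge <id> --yes-i-really-mean-it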

Am Do., 4. Apr. 2024 um 10:54 Uhr schrieb tobias tempel :
>
> Dear Cephers,
>
> reorganizing one of our clusters i'm removing some hosts from it, taking 
> "out" all OSDs on these hosts and waiting until all PGs are fine.
> After stopping and destroying all OSDs on one host i notice, that "purge" of 
> such destroyed OSDs temporarily leads to degraded and misplaced objects 
> (tried "reweight 0" as well, same picture) .
> Why that? I would - due to my limited experience - expect, that completely 
> removing an already destroyed OSD could not have any effect on object 
> placement at all.
>
> This effect yet seems not to be a real problem, as recovery works fine ... 
> but perhaps it indicates some other issue.
> All of that takes place at Pacific 16.2.14 AlmaLinux 8.9, yes i know, there's 
> some work to do.
>
> Perhaps you can give me a hint, where to look, for an explanation?
>
> Thank you
> cheers, toBias
>
> --
> Tobias Tempel
> Deutsches Elektronen-Synchrotron - IT
> Notkestr. 85, 22607 Hamburg, Germany
> email: tobias.tem...@desy.de
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend
im groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [quincy 17.2.7] ceph orchestrator not doing anything

2024-01-16 Thread Boris
Good morning Eugen,

I just found this thread and saw that I had a test image configured for RGW
in the config.
After removing both the global and the RGW config value, everything was
instantly fine.
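In our case the cleanup was essentially this (the exact section name for
the RGW value may differ; "ceph config dump | grep container_image" shows
it):

ceph config rm global container_image
ceph config rm client.rgw container_image    # section name assumed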

Cheers and a happy week
 Boris

Am Di., 16. Jan. 2024 um 10:20 Uhr schrieb Eugen Block :

> Hi,
>
> there have been a few threads with this topic, one of them is this one
> [1]. The issue there was that different ceph container images were in
> use. Can you check your container versions? If you don't configure a
> global image for all ceph daemons, e.g.:
>
> quincy-1:~ # ceph config set global container_image
> quay.io/ceph/ceph:v17.2.7
>
> you can end up with different images for different daemons which could
> also prevent the orchestrator from properly working. Check the local
> images with "podman|docker images" and/or your current configuration:
>
> quincy-1:~ # ceph config get mon container_image
> quincy-1:~ # ceph config get osd container_image
> quincy-1:~ # ceph config get mgr container_image
> quincy-1:~ # ceph config get mgr mgr/cephadm/container_image_base
>
> Regards,
> Eugen
>
> [1] https://www.spinics.net/lists/ceph-users/msg77573.html
>
> Zitat von Boris :
>
> > Happy new year everybody.
> >
> > I just found out that the orchestrator in one of our clusters is not
> doing
> > anything.
> >
> > What I tried until now:
> > - disabling / enabling cephadm (no impact)
> > - restarting hosts (no impact)
> > - starting upgrade to same version (no impact)
> > - starting downgrade (no impact)
> > - forcefully removing hosts and adding them again (now I have no daemons
> > anymore)
> > - applying new configurations (no impact)
> >
> > The orchestrator just does nothing.
> > Cluster itself is fine.
> >
> > I also checked the SSH connectivity from all hosts to all hosts (
> > https://docs.ceph.com/en/quincy/cephadm/troubleshooting/#ssh-errors)
> >
> > The logs always show a message like "took the task" but then nothing
> > happens.
> >
> > Cheers
> >  Boris
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] [quincy 17.2.7] ceph orchestrator not doing anything

2024-01-11 Thread Boris
Happy new year everybody.

I just found out that the orchestrator in one of our clusters is not doing
anything.

What I tried until now:
- disabling / enabling cephadm (no impact)
- restarting hosts (no impact)
- starting upgrade to same version (no impact)
- starting downgrade (no impact)
- forcefully removing hosts and adding them again (now I have no daemons
anymore)
- applying new configurations (no impact)

The orchestrator just does nothing.
Cluster itself is fine.

I also checked the SSH connectivity from all hosts to all hosts (
https://docs.ceph.com/en/quincy/cephadm/troubleshooting/#ssh-errors)

The logs always show a message like "took the task" but then nothing
happens.

Cheers
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Assistance Needed with Ceph Cluster Slow Ops Issue

2023-12-06 Thread Boris
Snaptrim is the process that reclaims disk space after a snapshot has been
deleted. I don't know the ceph internals in detail, but it helped for us.

You can try to restrict snaptrim to specific time frames and limit it to one
concurrent trim per OSD. A sleep between deletions (3s worked for us) also
helped to take pressure off the OSDs. Please check the documentation for the
exact config parameters.
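From memory, the knobs were roughly these (double-check the option names for
your release):

ceph config set osd osd_pg_max_concurrent_snap_trims 1
ceph config set osd osd_snap_trim_sleep 3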

I don't know if Octopus already has the tooling to remove the pg log dups.

Check "ceph-objectstore-tool --help" to see whether the --op parameter
accepts trim-pg-log-dups.

croit made a blog post how to remove these dups: 
https://croit.io/blog/oom-killer-osds

But be warned: checking an 8 TB Samsung SAS SSD with 300 PGs (only listing,
not trimming) took 4 hours.
You can skip the listing and counting and start the trim directly, but the
OSD needs to be offline for it.
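The invocation looked roughly like this (OSD stopped first; data path and
PG id are placeholders, and only newer point releases ship the op):

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --op trim-pg-log-dups --pgid <pgid>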

What we did when we were hit by this issue:
stop snaptrim, update to Pacific, do an offline RocksDB compaction before the
OSDs start after the upgrade, start the OSDs and hate our lives while they
come up, wait a week, slowly re-enable snaptrim and hope for the best. :-)

Mit freundlichen Grüßen
 - Boris Behrens

> Am 06.12.2023 um 20:17 schrieb Peter :
> 
> 
> Thank you for pointing this out. I did check my cluster by using the article 
> given command, it over 17 million PG dups over each OSDs. 
> May I know if the snaptrim activity takes place every six hours?  If I 
> disable the snaptrim, will it stop the slow ops temporarily before my 
> performing version upgrade?
> If I want to upgrade my Ceph, it will take time to analysis the environment. 
> Can I have work around quickly for delete OSD then create it again for 
> zeroized the log times? or manually delete the OSD log?
> 
> 
> 
> From: Boris 
> Sent: Wednesday, December 6, 2023 10:13
> To: Peter 
> Cc: ceph-users@ceph.io 
> Subject: Re: [ceph-users] Assistance Needed with Ceph Cluster Slow Ops Issue
>  
> Hi Peter,
> 
> try to set the cluster to nosnaptrim
> 
> If this helps, you might need to upgrade to pacific, because you are hit by 
> the pg dups bug. 
> 
> See: https://www.clyso.com/blog/how-to-identify-osds-affected-by-pg-dup-bug/
> 
> 
> Mit freundlichen Grüßen
>  - Boris Behrens
> 
>>> Am 06.12.2023 um 19:01 schrieb Peter :
>>> 
>> Dear all,
>> 
>> 
>> I am reaching out regarding an issue with our Ceph cluster that has been 
>> recurring every six hours. Upon investigating the problem using the "ceph 
>> daemon dump_historic_slow_ops" command, I observed that the issue appears to 
>> be related to slow operations, specifically getting stuck at "Waiting for RW 
>> Locks." The wait times often range from one to two seconds.
>> 
>> Our cluster use SAS SSD disks from Samsung for the storage pool in question. 
>> While these disks are of high quality and should provide sufficient speed, 
>> the problem persists. The slow ops occurrence is consistent every six hours.
>> 
>> I would greatly appreciate any insights or suggestions you may have to 
>> address and resolve this issue. If there are specific optimizations or 
>> configurations that could improve the situation, please advise.
>> 
>> 
>> below are some output:
>> 
>> root@lasas003:~# ceph -v
>> ceph version 15.2.17 (542df8d06ef24dbddcf4994db16bcc4c89c9ec2d) octopus 
>> (stable)
>> 
>> 
>> "events": [
>> 
>>{
>>"event": "initiated",
>>"time": "2023-12-06T08:34:18.501644-0800",
>>"duration": 0
>>},
>>{
>>"event": "throttled",
>>"time": "2023-12-06T08:34:18.501644-0800",
>>"duration": 3.067e-06
>>},
>>{
>>"event": "header_read",
>>"time": "2023-12-06T08:34:18.501647-0800",
>>"duration": 3.5428e-06
>>},
>>{
>>"event": "all_read",
>>"time": "2023-12-06T08:34:18.501650-0800",
>>"duration": 9.3397e-07
>>},
>>{
>>"event": "dispatched",
>>   

[ceph-users] Re: Assistance Needed with Ceph Cluster Slow Ops Issue

2023-12-06 Thread Boris
Hi Peter,

try to set the cluster to nosnaptrim
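That is simply:

ceph osd set nosnaptrim
# and, once you want snapshot trimming to run again:
ceph osd unset nosnaptrim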

If this helps, you might need to upgrade to pacific, because you are hit by the 
pg dups bug. 

See: https://www.clyso.com/blog/how-to-identify-osds-affected-by-pg-dup-bug/


Mit freundlichen Grüßen
 - Boris Behrens

> Am 06.12.2023 um 19:01 schrieb Peter :
> 
> Dear all,
> 
> 
> I am reaching out regarding an issue with our Ceph cluster that has been 
> recurring every six hours. Upon investigating the problem using the "ceph 
> daemon dump_historic_slow_ops" command, I observed that the issue appears to 
> be related to slow operations, specifically getting stuck at "Waiting for RW 
> Locks." The wait times often range from one to two seconds.
> 
> Our cluster use SAS SSD disks from Samsung for the storage pool in question. 
> While these disks are of high quality and should provide sufficient speed, 
> the problem persists. The slow ops occurrence is consistent every six hours.
> 
> I would greatly appreciate any insights or suggestions you may have to 
> address and resolve this issue. If there are specific optimizations or 
> configurations that could improve the situation, please advise.
> 
> 
> below are some output:
> 
> root@lasas003:~# ceph -v
> ceph version 15.2.17 (542df8d06ef24dbddcf4994db16bcc4c89c9ec2d) octopus 
> (stable)
> 
> 
> "events": [
> 
>{
>"event": "initiated",
>"time": "2023-12-06T08:34:18.501644-0800",
>"duration": 0
>},
>{
>"event": "throttled",
>"time": "2023-12-06T08:34:18.501644-0800",
>"duration": 3.067e-06
>},
>{
>"event": "header_read",
>"time": "2023-12-06T08:34:18.501647-0800",
>"duration": 3.5428e-06
>},
>{
>"event": "all_read",
>"time": "2023-12-06T08:34:18.501650-0800",
>"duration": 9.3397e-07
>},
>{
>"event": "dispatched",
>"time": "2023-12-06T08:34:18.501651-0800",
>"duration": 3.2832e-06
>},
>{
>"event": "queued_for_pg",
>"time": "2023-12-06T08:34:18.501654-0800",
>"duration": 1.381993999001
>},
>{
>"event": "reached_pg",
>"time": "2023-12-06T08:34:19.883648-0800",
>"duration": 5.7982e-06
>},
>{
>"event": "waiting for rw locks",
>"time": "2023-12-06T08:34:19.883654-0800",
>"duration": 4.248471164998
>},
>{
>"event": "reached_pg",
>"time": "2023-12-06T08:34:24.132125-0800",
>"duration": 1.0667e-05
>},
>{
>"event": "waiting for rw locks",
>"time": "2023-12-06T08:34:24.132136-0800",
>"duration": 2.159352784002
>},
>{
>"event": "reached_pg",
>"time": "2023-12-06T08:34:26.291489-0800",
>"duration": 3.292e-06
>},
>{
>"event": "waiting for rw locks",
>"time": "2023-12-06T08:34:26.291492-0800",
>"duration": 0.4391816470001
>},
>{
>"event": "reached_pg",
>"time&

[ceph-users] RadosGW public HA traffic - best practices?

2023-11-17 Thread Boris Behrens
Hi,
I am looking for some experience on how people make their RGW public.

Currently we use the following:
3 IP addresses that get distributed via keepalived across three HAProxy
instances, which then balance to three RGWs.
The caveat is that keepalived is a PITA to get working when distributing a
set of IP addresses, and it doesn't scale very well (up or down).
The upside is that it is really stable and customers almost never have an
availability problem, and the 3 IPs give us some sort of LB. It serves up
to 24 Gbit/s at peak times, when all those backup jobs run at night.
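For context, each HAProxy roughly looks like this (a stripped-down sketch,
names and addresses made up):

frontend rgw_in
    bind [::]:443 ssl crt /etc/haproxy/rgw.pem
    default_backend rgw_out

backend rgw_out
    balance leastconn
    server rgw1 [2001:db8::11]:7480 check
    server rgw2 [2001:db8::12]:7480 check
    server rgw3 [2001:db8::13]:7480 check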

But today I thought: what would happen if I just ditched keepalived and
configured those addresses statically on the HAProxy hosts?
How bad would the impact on a customer be if I rebooted one HAProxy? Is
there an easier, more scalable way to spread the load even further without
an ingress HW LB (which I don't have)?

I have a lot of hosts that could run a pod with an HAProxy and an RGW
container together, or even host the RGW alone in a container. It would
just need to bridge two networks.
But I currently do not have a way to use BGP to spread one IP address
across a set of RGW instances.

So, long story short:
What are your easy setups to serve public RGW traffic with some sort of HA
and LB (without a big HW LB capable of 100 Gbit/s of traffic)?
And have you experienced problems when you do not shift IP addresses around?

Cheers
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: how to disable ceph version check?

2023-11-07 Thread Boris
You can mute it with

"ceph health mute ALERT"
where ALERT is the all-caps check code from "ceph health detail".
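In this case probably something like (the TTL is optional):

ceph health detail                        # shows the check code, e.g. DAEMON_OLD_VERSION
ceph health mute DAEMON_OLD_VERSION 1w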

But I would update asap. 

Cheers
 Boris

> Am 08.11.2023 um 02:02 schrieb zxcs :
> 
> Hi, Experts,
> 
> we have a ceph cluster report HEALTH_ERR due to multiple old versions.
> 
>health: HEALTH_ERR
>There are daemons running multiple old versions of ceph
> 
> after run `ceph version`, we see three ceph versions in {16.2.*} , these 
> daemons are ceph osd.
> 
> our question is: how to stop this version check , we cannot upgrade all old 
> daemon.
> 
> 
> 
> Thanks,
> Xiong
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Boris Behrens
Hi,
follow these instructions:
https://docs.ceph.com/en/quincy/rados/operations/add-or-rm-mons/#removing-monitors-from-an-unhealthy-cluster
As you are using containers, you might need to specify the --mon-data
directory (/var/lib/CLUSTER_UUID/mon.MONNAME); I have never actually done
this in an orchestrator environment.
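The monmap surgery itself boils down to something like this (mon name,
paths and dead mon names are placeholders; run it with the surviving mon
stopped):

ceph-mon -i MONNAME --extract-monmap /tmp/monmap --mon-data /var/lib/CLUSTER_UUID/mon.MONNAME
monmaptool /tmp/monmap --print
monmaptool /tmp/monmap --rm DEADMON     # repeat for every dead mon
ceph-mon -i MONNAME --inject-monmap /tmp/monmap --mon-data /var/lib/CLUSTER_UUID/mon.MONNAME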

Good luck.


Am Do., 2. Nov. 2023 um 12:48 Uhr schrieb Mohamed LAMDAOUAR <
mohamed.lamdao...@enyx.fr>:

> Hello Boris,
>
> I have one server monitor up and two other servers of the cluster are also
> up (These two servers are not monitors ) .
> I have four other servers down (the boot disk is out) but the osd data
> disks are safe.
> I reinstalled the OS on a  new SSD disk. How can I rebuild my cluster with
> only one mons.
> If you would like, you can join me for a meeting. I will give you more
> information about the cluster.
>
> Thanks for your help, I'm very stuck because the data is present but I
> don't know how to add the old osd in the cluster to recover the data.
>
>
>
> Le jeu. 2 nov. 2023 à 11:55, Boris Behrens  a écrit :
>
>> Hi Mohamed,
>> are all mons down, or do you still have at least one that is running?
>>
>> AFAIK: the mons save their DB on the normal OS disks, and not within the
>> ceph cluster.
>> So if all mons are dead, which mean the disks which contained the mon data
>> are unrecoverable dead, you might need to bootstrap a new cluster and add
>> the OSDs to the new cluster. This will likely include tinkering with cephx
>> authentication, so you don't wipe the old OSD data.
>>
>> If you still have at least ONE mon alive, you can shut it down, and remove
>> all the other mons from the monmap and start it again. You CAN have
>> clusters with only one mon.
>>
>> Or is did your host just lost the boot disk and you just need to bring it
>> up somehow? losing 4x2 NVME disks at the same time, sounds a bit strange.
>>
>> Am Do., 2. Nov. 2023 um 11:34 Uhr schrieb Mohamed LAMDAOUAR <
>> mohamed.lamdao...@enyx.fr>:
>>
>> > Hello,
>> >
>> >   I have 7 machines on CEPH cluster, the service ceph runs on a docker
>> > container.
>> >  Each machine has 4 hdd of data (available) and 2 nvme sssd (bricked)
>> >   During a reboot, the ssd bricked on 4 machines, the data are
>> available on
>> > the HDD disk but the nvme is bricked and the system is not available.
>> is it
>> > possible to recover the data of the cluster (the data disk are all
>> > available)
>> > ___
>> > ceph-users mailing list -- ceph-users@ceph.io
>> > To unsubscribe send an email to ceph-users-le...@ceph.io
>> >
>>
>>
>> --
>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
>> groüen Saal.
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Boris Behrens
Hi Mohamed,
are all mons down, or do you still have at least one that is running?

AFAIK the mons save their DB on the normal OS disks, and not within the
ceph cluster.
So if all mons are dead, meaning the disks that contained the mon data are
unrecoverably dead, you might need to bootstrap a new cluster and add the
OSDs to the new cluster. This will likely include tinkering with cephx
authentication so you don't wipe the old OSD data.

If you still have at least ONE mon alive, you can shut it down, remove all
the other mons from the monmap, and start it again. You CAN run a cluster
with only one mon.

Or did your host just lose the boot disk and you just need to bring it up
somehow? Losing 4x2 NVMe disks at the same time sounds a bit strange.

Am Do., 2. Nov. 2023 um 11:34 Uhr schrieb Mohamed LAMDAOUAR <
mohamed.lamdao...@enyx.fr>:

> Hello,
>
>   I have 7 machines on CEPH cluster, the service ceph runs on a docker
> container.
>  Each machine has 4 hdd of data (available) and 2 nvme sssd (bricked)
>   During a reboot, the ssd bricked on 4 machines, the data are available on
> the HDD disk but the nvme is bricked and the system is not available. is it
> possible to recover the data of the cluster (the data disk are all
> available)
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW access logs with bucket name

2023-10-30 Thread Boris Behrens
Hi Dan,

we are currently moving all the logging into Lua scripts, so it is not an
issue for us anymore.
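For the archives, the Lua approach is roughly this (a sketch based on the
RGW Lua scripting docs; adjust the field names to your release):

-- bucket-log.lua, loaded with:
--   radosgw-admin script put --infile=bucket-log.lua --context=postRequest
if Request.Bucket then
  RGWDebugLog("bucket=" .. Request.Bucket.Name .. " op=" .. Request.RGWOp)
end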

Thanks

ps: the ceph analyzer is really cool. plusplus

Am Sa., 28. Okt. 2023 um 22:03 Uhr schrieb Dan van der Ster <
dan.vanders...@clyso.com>:

> Hi Boris,
>
> I found that you need to use debug_rgw=10 to see the bucket name :-/
>
> e.g.
> 2023-10-28T19:55:42.288+ 7f34dde06700 10 req 3268931155513085118
> 0.0s s->object=... s->bucket=xyz-bucket-123
>
> Did you find a more convenient way in the meantime? I think we should
> log bucket name at level 1.
>
> Cheers, Dan
>
> --
> Dan van der Ster
> CTO
>
> Clyso GmbH
> p: +49 89 215252722 | a: Vancouver, Canada
> w: https://clyso.com | e: dan.vanders...@clyso.com
>
> Try our Ceph Analyzer: https://analyzer.clyso.com
>
> On Thu, Mar 30, 2023 at 4:15 AM Boris Behrens  wrote:
> >
> > Sadly not.
> > I only see the the path/query of a request, but not the hostname.
> > So when a bucket is accessed via hostname (
> https://bucket.TLD/object?query)
> > I only see the object and the query (GET /object?query).
> > When a bucket is accessed bia path (https://TLD/bucket/object?query) I
> can
> > see also the bucket in the log (GET bucket/object?query)
> >
> > Am Do., 30. März 2023 um 12:58 Uhr schrieb Szabo, Istvan (Agoda) <
> > istvan.sz...@agoda.com>:
> >
> > > It has the full url begins with the bucket name in the beast logs http
> > > requests, hasn’t it?
> > >
> > > Istvan Szabo
> > > Staff Infrastructure Engineer
> > > ---
> > > Agoda Services Co., Ltd.
> > > e: istvan.sz...@agoda.com
> > > ---
> > >
> > > On 2023. Mar 30., at 17:44, Boris Behrens  wrote:
> > >
> > > Email received from the internet. If in doubt, don't click any link
> nor
> > > open any attachment !
> > > 
> > >
> > > Bringing up that topic again:
> > > is it possible to log the bucket name in the rgw client logs?
> > >
> > > currently I am only to know the bucket name when someone access the
> bucket
> > > via https://TLD/bucket/object instead of https://bucket.TLD/object.
> > >
> > > Am Di., 3. Jan. 2023 um 10:25 Uhr schrieb Boris Behrens  >:
> > >
> > > Hi,
> > >
> > > I am looking forward to move our logs from
> > >
> > > /var/log/ceph/ceph-client...log to our logaggregator.
> > >
> > >
> > > Is there a way to have the bucket name in the log file?
> > >
> > >
> > > Or can I write the rgw_enable_ops_log into a file? Maybe I could work
> with
> > >
> > > this.
> > >
> > >
> > > Cheers and happy new year
> > >
> > > Boris
> > >
> > >
> > >
> > >
> > > --
> > > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend
> im
> > > groüen Saal.
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >
> > >
> > > --
> > > This message is confidential and is for the sole use of the intended
> > > recipient(s). It may also be privileged or otherwise protected by
> copyright
> > > or other legal rules. If you have received it by mistake please let us
> know
> > > by reply email and delete it from your system. It is prohibited to copy
> > > this message or disclose its content to anyone. Any confidentiality or
> > > privilege is not waived or lost by any mistaken delivery or
> unauthorized
> > > disclosure of the message. All messages sent to and from Agoda may be
> > > monitored to ensure compliance with company policies, to protect the
> > > company's interests and to remove potential malware. Electronic
> messages
> > > may be intercepted, amended, lost or deleted, or contain viruses.
> > >
> >
> >
> > --
> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> > groüen Saal.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] traffic by IP address / bucket / user

2023-10-18 Thread Boris Behrens
Hi,
does someone have a ready-made solution to monitor traffic by IP address?

Cheers
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Autoscaler problems in pacific

2023-10-05 Thread Boris Behrens
I usually set it to warn, so I don't forget to check from time to time :)

Am Do., 5. Okt. 2023 um 12:24 Uhr schrieb Eugen Block :

> Hi,
>
> I strongly agree with Joachim, I usually disable the autoscaler in
> production environments. But the devs would probably appreciate bug
> reports to improve it.
>
> Zitat von Boris Behrens :
>
> > Hi,
> > I've just upgraded to our object storages to the latest pacific version
> > (16.2.14) and the autscaler is acting weird.
> > On one cluster it just shows nothing:
> > ~# ceph osd pool autoscale-status
> > ~#
> >
> > On the other clusters it shows this when it is set to warn:
> > ~# ceph health detail
> > ...
> > [WRN] POOL_TOO_MANY_PGS: 2 pools have too many placement groups
> > Pool .rgw.buckets.data has 1024 placement groups, should have 1024
> > Pool device_health_metrics has 1 placement groups, should have 1
> >
> > Version 16.2.13 seems to act normal.
> > Is this a known bug?
> > --
> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> > groüen Saal.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Autoscaler problems in pacific

2023-10-04 Thread Boris Behrens
Also found what the 2nd problem was:
When there are pools using the default replicated_ruleset while there are
multiple rulesets with different device classes, the autoscaler does not
produce any output.
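A possible workaround (untested here) is to give those pools a rule that is
pinned to a device class, e.g.:

ceph osd crush rule create-replicated replicated_hdd default host hdd
ceph osd pool set <pool> crush_rule replicated_hdd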

Should I open a bug for that?

Am Mi., 4. Okt. 2023 um 14:36 Uhr schrieb Boris Behrens :

> Found the bug for the TOO_MANY_PGS: https://tracker.ceph.com/issues/62986
> But I am still not sure, why I don't have any output on that one cluster.
>
> Am Mi., 4. Okt. 2023 um 14:08 Uhr schrieb Boris Behrens :
>
>> Hi,
>> I've just upgraded to our object storages to the latest pacific version
>> (16.2.14) and the autscaler is acting weird.
>> On one cluster it just shows nothing:
>> ~# ceph osd pool autoscale-status
>> ~#
>>
>> On the other clusters it shows this when it is set to warn:
>> ~# ceph health detail
>> ...
>> [WRN] POOL_TOO_MANY_PGS: 2 pools have too many placement groups
>> Pool .rgw.buckets.data has 1024 placement groups, should have 1024
>> Pool device_health_metrics has 1 placement groups, should have 1
>>
>> Version 16.2.13 seems to act normal.
>> Is this a known bug?
>> --
>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
>> groüen Saal.
>>
>
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Autoscaler problems in pacific

2023-10-04 Thread Boris Behrens
Found the bug for the TOO_MANY_PGS: https://tracker.ceph.com/issues/62986
But I am still not sure, why I don't have any output on that one cluster.

Am Mi., 4. Okt. 2023 um 14:08 Uhr schrieb Boris Behrens :

> Hi,
> I've just upgraded to our object storages to the latest pacific version
> (16.2.14) and the autscaler is acting weird.
> On one cluster it just shows nothing:
> ~# ceph osd pool autoscale-status
> ~#
>
> On the other clusters it shows this when it is set to warn:
> ~# ceph health detail
> ...
> [WRN] POOL_TOO_MANY_PGS: 2 pools have too many placement groups
> Pool .rgw.buckets.data has 1024 placement groups, should have 1024
> Pool device_health_metrics has 1 placement groups, should have 1
>
> Version 16.2.13 seems to act normal.
> Is this a known bug?
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Autoscaler problems in pacific

2023-10-04 Thread Boris Behrens
Hi,
I've just upgraded our object storage clusters to the latest Pacific version
(16.2.14) and the autoscaler is acting weird.
On one cluster it just shows nothing:
~# ceph osd pool autoscale-status
~#

On the other clusters it shows this when it is set to warn:
~# ceph health detail
...
[WRN] POOL_TOO_MANY_PGS: 2 pools have too many placement groups
Pool .rgw.buckets.data has 1024 placement groups, should have 1024
Pool device_health_metrics has 1 placement groups, should have 1

Version 16.2.13 seems to act normal.
Is this a known bug?
-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] multiple rgw instances with same cephx key

2023-09-22 Thread Boris Behrens
Hi,
is it possible to use one cephx key for multiple RGW instances running in
parallel? Maybe I could just use the same 'name' and the same key for all
of the RGW instances?
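What I have in mind is one shared key created once, roughly (name and caps
assumed, adjust as needed):

ceph auth get-or-create client.rgw.shared mon 'allow rw' osd 'allow rwx'

and then every RGW container starts as client.rgw.shared with that keyring.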

I plan to start RGWs all over the place in containers and let BGP handle
the traffic. But I don't know how to create on-demand keys that get removed
when the RGW shuts down.

I don't want to use the orchestrator for this, because I would need to add
all the compute nodes to it and there might be other processes in place
that add FW rules in our provisioning.

Cheers
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph orch osd data_allocate_fraction does not work

2023-09-21 Thread Boris Behrens
I have a use case where I want to use only a small portion of the disk for
the OSD, and the documentation states that I can use
data_allocate_fraction [1]

But cephadm can not use this and throws this error:
/usr/bin/podman: stderr ceph-volume lvm batch: error: unrecognized
arguments: --data-allocate-fraction 0.1

So, what I actually want to achieve:
Split up a single SSD into:
3-5x block.db for spinning disks (5x 320GB or 3x 500GB, depending on whether
I have 8TB HDDs or 16TB HDDs)
1x SSD OSD (100G) for RGW index / meta pools
1x SSD OSD (100G) for RGW gc pool because of this bug [2]

My service definition looks like this:

service_type: osd
service_id: hdd-8tb
placement:
  host_pattern: '*'
crush_device_class: hdd
spec:
  data_devices:
rotational: 1
size: ':9T'
  db_devices:
rotational: 0
limit: 5
size: '1T:2T'
  encrypted: true
  block_db_size: 3200
---
service_type: osd
service_id: hdd-16tb
placement:
  host_pattern: '*'
crush_device_class: hdd
spec:
  data_devices:
rotational: 1
size: '14T:'
  db_devices:
rotational: 0
limit: 1
size: '1T:2T'
  encrypted: true
  block_db_size: 5000
---
service_type: osd
service_id: gc
placement:
  host_pattern: '*'
crush_device_class: gc
spec:
  data_devices:
rotational: 0
size: '1T:2T'
  encrypted: true
  data_allocate_fraction: 0.05
---
service_type: osd
service_id: ssd
placement:
  host_pattern: '*'
crush_device_class: ssd
spec:
  data_devices:
rotational: 0
size: '1T:2T'
  encrypted: true
  data_allocate_fraction: 0.05


[1]
https://docs.ceph.com/en/pacific/cephadm/services/osd/#ceph.deployment.drive_group.DriveGroupSpec.data_allocate_fraction
[2] https://tracker.ceph.com/issues/53585

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Make ceph orch daemons reboot safe

2023-09-18 Thread Boris Behrens
Found it. The target was not enabled:
root@0cc47a6df14e:~# systemctl status
ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853.target
● ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853.target - Ceph cluster
03977a23-f00f-4bb0-b9a7-de57f40ba853
 Loaded: loaded
(/etc/systemd/system/ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853.target;
enabled; vendor preset: enabled)
 Active: inactive (dead)
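The fix then is just enabling (and starting) the target, presumably:

systemctl enable --now ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853.target
systemctl is-enabled ceph.target    # should be enabled as well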

Am Sa., 16. Sept. 2023 um 13:29 Uhr schrieb Boris :

> The other hosts are still online and the cluster only lost 1/3 of its
> services.
>
>
>
> > Am 16.09.2023 um 12:53 schrieb Eugen Block :
> >
> > I don’t have time to look into all the details, but I’m wondering how
> you seem to be able to start mgr services with the orchestrator if all mgr
> daemons are down. The orchestrator is a mgr module, so that’s a bit weird,
> isn’t it?
> >
> > Zitat von Boris Behrens :
> >
> >> Hi Eugen,
> >> the test-test cluster where we started with simple ceph and the adoption
> >> when straight forward are working fine.
> >>
> >> But this test cluster was all over the place.
> >> We had an old running update via orchestrator which was still in the
> >> pipeline, the adoption process was stopped a year ago and now got
> picked up
> >> again, and so on and so forth.
> >>
> >> But now we have it clean, at least we think it's clean.
> >>
> >> After a reboot, the services are not available. I have to start the via
> >> ceph orch.
> >> root@0cc47a6df14e:~# systemctl list-units | grep ceph
> >>  ceph-crash.service
> >>loaded active running   Ceph crash dump collector
> >>  ceph-fuse.target
> >>loaded active activeceph target allowing to
> start/stop
> >> all ceph-fuse@.service instances at once
> >>  ceph-mds.target
> >>   loaded active activeceph target allowing to start/stop
> >> all ceph-mds@.service instances at once
> >>  ceph-mgr.target
> >>   loaded active activeceph target allowing to start/stop
> >> all ceph-mgr@.service instances at once
> >>  ceph-mon.target
> >>   loaded active activeceph target allowing to start/stop
> >> all ceph-mon@.service instances at once
> >>  ceph-osd.target
> >>   loaded active activeceph target allowing to start/stop
> >> all ceph-osd@.service instances at once
> >>  ceph-radosgw.target
> >>   loaded active activeceph target allowing to start/stop
> >> all ceph-radosgw@.service instances at once
> >>  ceph.target
> >>   loaded active activeAll Ceph clusters and services
> >> root@0cc47a6df14e:~# ceph orch start mgr
> >> Scheduled to start mgr.0cc47a6df14e.nvjlcx on host '0cc47a6df14e'
> >> Scheduled to start mgr.0cc47a6df330.aznjao on host '0cc47a6df330'
> >> Scheduled to start mgr.0cc47aad8ce8.ifiydp on host '0cc47aad8ce8'
> >> root@0cc47a6df14e:~# ceph orch start mon
> >> Scheduled to start mon.0cc47a6df14e on host '0cc47a6df14e'
> >> Scheduled to start mon.0cc47a6df330 on host '0cc47a6df330'
> >> Scheduled to start mon.0cc47aad8ce8 on host '0cc47aad8ce8'
> >> root@0cc47a6df14e:~# ceph orch start osd.all-flash-over-1tb
> >> Scheduled to start osd.2 on host '0cc47a6df14e'
> >> Scheduled to start osd.5 on host '0cc47a6df14e'
> >> Scheduled to start osd.3 on host '0cc47a6df330'
> >> Scheduled to start osd.0 on host '0cc47a6df330'
> >> Scheduled to start osd.4 on host '0cc47aad8ce8'
> >> Scheduled to start osd.1 on host '0cc47aad8ce8'
> >> root@0cc47a6df14e:~# systemctl list-units | grep ceph
> >>
> ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@mgr.0cc47a6df14e.nvjlcx.service
> >>   loaded active running   Ceph
> >> mgr.0cc47a6df14e.nvjlcx for 03977a23-f00f-4bb0-b9a7-de57f40ba853
> >>  ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@mon.0cc47a6df14e.service
> >>loaded active running   Ceph
> >> mon.0cc47a6df14e for 03977a23-f00f-4bb0-b9a7-de57f40ba853
> >>  ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@osd.2.service
> >>   loaded active running   Ceph osd.2
> >> for 03977a23-f00f-4bb0-b9a7-de57f40ba853
> >>  ceph-crash.service
> >>loaded active running   Ceph
> crash
> >> dump collector
> >>  system-ceph\x2d03977a23\x2df00f\x2d4bb0\x2db9a7\x2dde57f40ba853.slice
> >>  

[ceph-users] Re: Make ceph orch daemons reboot safe

2023-09-16 Thread Boris
The other hosts are still online and the cluster only lost 1/3 of its services. 



> Am 16.09.2023 um 12:53 schrieb Eugen Block :
> 
> I don’t have time to look into all the details, but I’m wondering how you 
> seem to be able to start mgr services with the orchestrator if all mgr 
> daemons are down. The orchestrator is a mgr module, so that’s a bit weird, 
> isn’t it?
> 
> Zitat von Boris Behrens :
> 
>> Hi Eugen,
>> the test-test cluster where we started with simple ceph and the adoption
>> when straight forward are working fine.
>> 
>> But this test cluster was all over the place.
>> We had an old running update via orchestrator which was still in the
>> pipeline, the adoption process was stopped a year ago and now got picked up
>> again, and so on and so forth.
>> 
>> But now we have it clean, at least we think it's clean.
>> 
>> After a reboot, the services are not available. I have to start the via
>> ceph orch.
>> root@0cc47a6df14e:~# systemctl list-units | grep ceph
>>  ceph-crash.service
>>loaded active running   Ceph crash dump collector
>>  ceph-fuse.target
>>loaded active activeceph target allowing to start/stop
>> all ceph-fuse@.service instances at once
>>  ceph-mds.target
>>   loaded active activeceph target allowing to start/stop
>> all ceph-mds@.service instances at once
>>  ceph-mgr.target
>>   loaded active activeceph target allowing to start/stop
>> all ceph-mgr@.service instances at once
>>  ceph-mon.target
>>   loaded active activeceph target allowing to start/stop
>> all ceph-mon@.service instances at once
>>  ceph-osd.target
>>   loaded active activeceph target allowing to start/stop
>> all ceph-osd@.service instances at once
>>  ceph-radosgw.target
>>   loaded active activeceph target allowing to start/stop
>> all ceph-radosgw@.service instances at once
>>  ceph.target
>>   loaded active activeAll Ceph clusters and services
>> root@0cc47a6df14e:~# ceph orch start mgr
>> Scheduled to start mgr.0cc47a6df14e.nvjlcx on host '0cc47a6df14e'
>> Scheduled to start mgr.0cc47a6df330.aznjao on host '0cc47a6df330'
>> Scheduled to start mgr.0cc47aad8ce8.ifiydp on host '0cc47aad8ce8'
>> root@0cc47a6df14e:~# ceph orch start mon
>> Scheduled to start mon.0cc47a6df14e on host '0cc47a6df14e'
>> Scheduled to start mon.0cc47a6df330 on host '0cc47a6df330'
>> Scheduled to start mon.0cc47aad8ce8 on host '0cc47aad8ce8'
>> root@0cc47a6df14e:~# ceph orch start osd.all-flash-over-1tb
>> Scheduled to start osd.2 on host '0cc47a6df14e'
>> Scheduled to start osd.5 on host '0cc47a6df14e'
>> Scheduled to start osd.3 on host '0cc47a6df330'
>> Scheduled to start osd.0 on host '0cc47a6df330'
>> Scheduled to start osd.4 on host '0cc47aad8ce8'
>> Scheduled to start osd.1 on host '0cc47aad8ce8'
>> root@0cc47a6df14e:~# systemctl list-units | grep ceph
>>  ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@mgr.0cc47a6df14e.nvjlcx.service
>>   loaded active running   Ceph
>> mgr.0cc47a6df14e.nvjlcx for 03977a23-f00f-4bb0-b9a7-de57f40ba853
>>  ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@mon.0cc47a6df14e.service
>>loaded active running   Ceph
>> mon.0cc47a6df14e for 03977a23-f00f-4bb0-b9a7-de57f40ba853
>>  ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@osd.2.service
>>   loaded active running   Ceph osd.2
>> for 03977a23-f00f-4bb0-b9a7-de57f40ba853
>>  ceph-crash.service
>>loaded active running   Ceph crash
>> dump collector
>>  system-ceph\x2d03977a23\x2df00f\x2d4bb0\x2db9a7\x2dde57f40ba853.slice
>>   loaded active active
>> system-ceph\x2d03977a23\x2df00f\x2d4bb0\x2db9a7\x2dde57f40ba853.slice
>>  ceph-fuse.target
>>loaded active activeceph target
>> allowing to start/stop all ceph-fuse@.service instances at once
>>  ceph-mds.target
>>   loaded active activeceph target
>> allowing to start/stop all ceph-mds@.service instances at once
>>  ceph-mgr.target
>>   loaded active activeceph target
>> allowing to start/stop all ceph-mgr@.service instances at once
>>  ceph-mon.target
>>   loaded active activeceph target
>> allowing to start/stop all ceph-mon@.service instan

[ceph-users] Re: Make ceph orch daemons reboot safe

2023-09-16 Thread Boris Behrens
,805 7fef7b041740 DEBUG sysctl: stdout * Applying
/etc/sysctl.d/90-ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-osd.conf ...
2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout fs.aio-max-nr =
1048576
2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout kernel.pid_max =
4194304
2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout * Applying
/etc/sysctl.d/99-sysctl.conf ...
2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout * Applying
/usr/lib/sysctl.d/protect-links.conf ...
2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout
fs.protected_fifos = 1
2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout
fs.protected_hardlinks = 1
2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout
fs.protected_regular = 2
2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout
fs.protected_symlinks = 1
2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stdout * Applying
/etc/sysctl.conf ...
2023-09-15 11:32:50,805 7fef7b041740 DEBUG sysctl: stderr sysctl: setting
key "net.ipv4.conf.all.promote_secondaries": Invalid argument
2023-09-15 11:32:51,469 7fef7b041740 DEBUG Non-zero exit code 1 from
systemctl reset-failed ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@osd.3
2023-09-15 11:32:51,469 7fef7b041740 DEBUG systemctl: stderr Failed to
reset failed state of unit
ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@osd.3.service: Unit
ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@osd.3.service not loaded.
2023-09-15 11:32:51,954 7fef7b041740 DEBUG systemctl: stderr Created
symlink
/etc/systemd/system/ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853.target.wants/ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@osd.3.service
→ /etc/systemd/system/ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853@.service.
2023-09-15 11:32:54,331 7fef7b041740 DEBUG firewalld does not appear to be
present

Am Sa., 16. Sept. 2023 um 10:25 Uhr schrieb Eugen Block :

> That sounds a bit strange to me, because all clusters we adopted so
> far successfully converted the previous systemd-units into systemd
> units targeting the pods. This process also should have been logged
> (stdout, probably in the cephadm.log as well), resulting in "enabled"
> systemd units. Can you paste the output of 'systemctl status
> ceph-@mon.'? If you have it, please also share the logs
> from the adoption process.
> What I did notice in a test cluster a while ago was that I had to
> reboot a node where I had to "play around" a bit with removed and
> redeployed osd containers. At some point they didn't react to
> systemctl commands anymore, but a reboot fixed that. But I haven't
> seen that in a production cluster yet, so some more details would be
> useful.
>
> Zitat von Boris Behrens :
>
> > Hi,
> > is there a way to have the pods start again after reboot?
> > Currently I need to start them by hand via ceph orch start
> mon/mgr/osd/...
> >
> > I imagine this will lead to a lot of headache when the ceph cluster gets
> a
> > powercycle and the mon pods will not start automatically.
> >
> > I've spun up a test cluster and there the pods start very fast. On the
> > legacy test cluster, which got adopted to cephadm, it does not.
> >
> > Cheers
> >  Boris
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Make ceph orch daemons reboot safe

2023-09-15 Thread Boris Behrens
Hi,
is there a way to have the pods start again after reboot?
Currently I need to start them by hand via ceph orch start mon/mgr/osd/...

I imagine this will lead to a lot of headaches when the ceph cluster gets
power-cycled and the mon pods do not start automatically.

I've spun up a test cluster and there the pods start very fast. On the
legacy test cluster, which got adopted to cephadm, it does not.

Cheers
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph orchestator managed daemons do not use authentication (was: ceph orchestrator pulls strange images from docker.io)

2023-09-15 Thread Boris Behrens
Aha, found it. The mon store seemed not to have assimilated the ceph
config. We changed it and now it works:
# ceph config dump |grep auth
global  advanced  auth_client_required   none  *
global  advanced  auth_cluster_required  none  *
global  advanced  auth_service_required  none
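For reference, that roughly boils down to the usual commands (our clusters
intentionally run without cephx; most setups will want cephx here instead
of none):

ceph config set global auth_cluster_required none
ceph config set global auth_service_required none
ceph config set global auth_client_required none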

Am Fr., 15. Sept. 2023 um 13:01 Uhr schrieb Boris Behrens :

> Oh, we found the issue. A very old update was stuck in the pipeline. We
> canceled it and then the correct images got pulled.
>
> Now on to the next issue.
> Daemons that start have problems talking to the cluster
>
> # podman logs 72248bafb0d3
> 2023-09-15T10:47:30.740+ 7f2943559700 -1 monclient(hunting):
> handle_auth_bad_method server allowed_methods [1] but i only support [1]
> 2023-09-15T10:47:30.740+ 7f294ac601c0 -1 mgr init Authentication
> failed, did you specify a mgr ID with a valid keyring?
> Error in initialization: (13) Permission denied
>
> When we add the following lines to the mgr config and restart the daemon,
> it works flawlessly
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
>
> Did I miss some config value that needs to be set?
>
> Trying the same with a new mon, will not work.
> 2023-09-15T10:59:28.960+ 7fc851a77700 -1 mon.0cc47a6df330@-1(probing)
> e0 handle_auth_bad_method hmm, they didn't like 2 result (95) Operation not
> supported
> 2023-09-15T10:59:32.164+ 7fc851a77700 -1 mon.0cc47a6df330@-1(probing)
> e0 handle_auth_bad_method hmm, they didn't like 2 result (95) Operation not
> supported
> 2023-09-15T10:59:38.568+ 7fc851a77700 -1 mon.0cc47a6df330@-1(probing)
> e0 handle_auth_bad_method hmm, they didn't like 2 result (95) Operation not
> supported
>
> I added the mon via:
> ceph orch daemon add mon FQDN:[IPv6_address]
>
>
> Am Fr., 15. Sept. 2023 um 09:21 Uhr schrieb Boris Behrens :
>
>> Hi Stefan,
>>
>> the cluster is running 17.6.2 through the board. The mentioned container
>> with other version don't show in the ceph -s or ceph verions.
>> It looks like it is host related.
>> One host get the correct 17.2.6 images, one get the 16.2.11 images and
>> the third one uses the 7.0.0-7183-g54142666 (whatever this is) images.
>>
>> root@0cc47a6df330:~# ceph config-key get config/global/container_image
>> Error ENOENT:
>>
>> root@0cc47a6df330:~# ceph config-key list |grep container_image
>> "config-history/12/+mgr.0cc47a6df14e/container_image",
>> "config-history/13/+mgr.0cc47aad8ce8/container_image",
>> "config/mgr.0cc47a6df14e/container_image",
>> "config/mgr.0cc47aad8ce8/container_image",
>>
>> I've tried to set the detault image to ceph config-key set
>> config/global/container_image
>> quay.io/ceph/ceph:v17.2.6@sha256:6b0a24e3146d4723700ce6579d40e6016b2c63d9bf90422653f2d4caa49be232
>> But I can not redeploy the mgr daemons, because there is no standby
>> daemon.
>>
>> root@0cc47a6df330:~# ceph orch redeploy mgr
>> Error EINVAL: Unable to schedule redeploy for mgr.0cc47aad8ce8: No
>> standby MGR
>>
>> But there should be:
>> root@0cc47a6df330:~# ceph orch ps
>> NAME HOST PORTS   STATUS
>> REFRESHED  AGE  MEM USE  MEM LIM  VERSIONIMAGE ID
>>  CONTAINER ID
>> mgr.0cc47a6df14e.iltiot  0cc47a6df14e  *:9283  running (23s)22s ago
>> 2m10.6M-  16.2.11de4b0b384ad4  0f31a162fa3e
>> mgr.0cc47aad8ce8 0cc47aad8ce8  running (16h) 8m ago
>>  16h 591M-  17.2.6 22cd8daf4d70  8145c63fdc44
>>
>> root@0cc47a6df330:~# ceph orch ls
>> NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
>> mgr  2/2  8m ago 19h
>>  0cc47a6df14e;0cc47a6df330;0cc47aad8ce8
>>
>> I've also remove podman and containerd, kill all directories and then do
>> a fresh reinstall of podman, which also did not work.
>> It's also strange that the daemons with the wonky version got an extra
>> suffix.
>>
>> If I would now how, I would happily nuke the whole orchestrator, podman
>> and everything that goes along with it, and start over. In the end it is
>> not that hard to start some mgr/mon daemons without podman, so I would be
>> back to a classical cluster.
>> I tried this yesterday, but the daemons still use that very strange
>> images and I just don't understand why.
>>
>> I could just nuke the whole dev cluster, wipe all disks and start fresh
>

[ceph-users] Re: ceph orchestator managed daemons do not use authentication (was: ceph orchestrator pulls strange images from docker.io)

2023-09-15 Thread Boris Behrens
Oh, we found the issue. A very old update was stuck in the pipeline. We
canceled it and then the correct images got pulled.
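For anyone else hitting this: the stuck update shows up in, and can be
cancelled with, something like:

ceph orch upgrade status
ceph orch upgrade stop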

Now on to the next issue.
Daemons that start have problems talking to the cluster

# podman logs 72248bafb0d3
2023-09-15T10:47:30.740+ 7f2943559700 -1 monclient(hunting):
handle_auth_bad_method server allowed_methods [1] but i only support [1]
2023-09-15T10:47:30.740+ 7f294ac601c0 -1 mgr init Authentication
failed, did you specify a mgr ID with a valid keyring?
Error in initialization: (13) Permission denied

When we add the following lines to the mgr config and restart the daemon,
it works flawlessly
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

Did I miss some config value that needs to be set?

Trying the same with a new mon, will not work.
2023-09-15T10:59:28.960+ 7fc851a77700 -1 mon.0cc47a6df330@-1(probing)
e0 handle_auth_bad_method hmm, they didn't like 2 result (95) Operation not
supported
2023-09-15T10:59:32.164+ 7fc851a77700 -1 mon.0cc47a6df330@-1(probing)
e0 handle_auth_bad_method hmm, they didn't like 2 result (95) Operation not
supported
2023-09-15T10:59:38.568+ 7fc851a77700 -1 mon.0cc47a6df330@-1(probing)
e0 handle_auth_bad_method hmm, they didn't like 2 result (95) Operation not
supported

I added the mon via:
ceph orch daemon add mon FQDN:[IPv6_address]


Am Fr., 15. Sept. 2023 um 09:21 Uhr schrieb Boris Behrens :

> Hi Stefan,
>
> the cluster is running 17.6.2 through the board. The mentioned container
> with other version don't show in the ceph -s or ceph verions.
> It looks like it is host related.
> One host get the correct 17.2.6 images, one get the 16.2.11 images and the
> third one uses the 7.0.0-7183-g54142666 (whatever this is) images.
>
> root@0cc47a6df330:~# ceph config-key get config/global/container_image
> Error ENOENT:
>
> root@0cc47a6df330:~# ceph config-key list |grep container_image
> "config-history/12/+mgr.0cc47a6df14e/container_image",
> "config-history/13/+mgr.0cc47aad8ce8/container_image",
> "config/mgr.0cc47a6df14e/container_image",
> "config/mgr.0cc47aad8ce8/container_image",
>
> I've tried to set the detault image to ceph config-key set
> config/global/container_image
> quay.io/ceph/ceph:v17.2.6@sha256:6b0a24e3146d4723700ce6579d40e6016b2c63d9bf90422653f2d4caa49be232
> But I can not redeploy the mgr daemons, because there is no standby daemon.
>
> root@0cc47a6df330:~# ceph orch redeploy mgr
> Error EINVAL: Unable to schedule redeploy for mgr.0cc47aad8ce8: No standby
> MGR
>
> But there should be:
> root@0cc47a6df330:~# ceph orch ps
> NAME HOST PORTS   STATUS
>   REFRESHED  AGE  MEM USE  MEM LIM  VERSIONIMAGE ID  CONTAINER
> ID
> mgr.0cc47a6df14e.iltiot  0cc47a6df14e  *:9283  running (23s)22s ago
> 2m10.6M-  16.2.11de4b0b384ad4  0f31a162fa3e
> mgr.0cc47aad8ce8 0cc47aad8ce8  running (16h) 8m ago
>  16h 591M-  17.2.6 22cd8daf4d70  8145c63fdc44
>
> root@0cc47a6df330:~# ceph orch ls
> NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
> mgr  2/2  8m ago 19h
>  0cc47a6df14e;0cc47a6df330;0cc47aad8ce8
>
> I've also remove podman and containerd, kill all directories and then do a
> fresh reinstall of podman, which also did not work.
> It's also strange that the daemons with the wonky version got an extra
> suffix.
>
> If I would now how, I would happily nuke the whole orchestrator, podman
> and everything that goes along with it, and start over. In the end it is
> not that hard to start some mgr/mon daemons without podman, so I would be
> back to a classical cluster.
> I tried this yesterday, but the daemons still use that very strange images
> and I just don't understand why.
>
> I could just nuke the whole dev cluster, wipe all disks and start fresh
> after reinstalling the hosts, but as I have to adopt 17 clusters to the
> orchestrator, I rather get some learnings from the not working thing :)
>
> Am Fr., 15. Sept. 2023 um 08:26 Uhr schrieb Stefan Kooman :
>
>> On 14-09-2023 17:49, Boris Behrens wrote:
>> > Hi,
>> > I currently try to adopt our stage cluster, some hosts just pull strange
>> > images.
>> >
>> > root@0cc47a6df330:/var/lib/containers/storage/overlay-images# podman ps
>> > CONTAINER ID  IMAGE   COMMAND
>> >  CREATEDSTATUSPORTS   NAMES
>> > a532c37ebe42  docker.io/ceph/daemon-base:latest-master-devel  -n
>> > mgr.0cc47a6df3...  2 minutes ago  Up 2 minutes ago
>> >   ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df330-fxrfyl
>> >
>> 

[ceph-users] Re: ceph orchestator pulls strange images from docker.io

2023-09-15 Thread Boris Behrens
Hi Stefan,

the cluster is running 17.2.6 across the board. The mentioned containers
with other versions don't show up in ceph -s or ceph versions.
It looks like it is host related.
One host gets the correct 17.2.6 images, one gets the 16.2.11 images and the
third one uses the 17.0.0-7183-g54142666 (whatever that is) images.

root@0cc47a6df330:~# ceph config-key get config/global/container_image
Error ENOENT:

root@0cc47a6df330:~# ceph config-key list |grep container_image
"config-history/12/+mgr.0cc47a6df14e/container_image",
"config-history/13/+mgr.0cc47aad8ce8/container_image",
"config/mgr.0cc47a6df14e/container_image",
"config/mgr.0cc47aad8ce8/container_image",

I've tried to set the default image via ceph config-key set
config/global/container_image
quay.io/ceph/ceph:v17.2.6@sha256:6b0a24e3146d4723700ce6579d40e6016b2c63d9bf90422653f2d4caa49be232
But I can not redeploy the mgr daemons, because there is no standby daemon.

root@0cc47a6df330:~# ceph orch redeploy mgr
Error EINVAL: Unable to schedule redeploy for mgr.0cc47aad8ce8: No standby
MGR

But there should be:
root@0cc47a6df330:~# ceph orch ps
NAME HOST PORTS   STATUS
  REFRESHED  AGE  MEM USE  MEM LIM  VERSIONIMAGE ID  CONTAINER
ID
mgr.0cc47a6df14e.iltiot  0cc47a6df14e  *:9283  running (23s)22s ago
2m10.6M-  16.2.11de4b0b384ad4  0f31a162fa3e
mgr.0cc47aad8ce8 0cc47aad8ce8  running (16h) 8m ago
 16h 591M-  17.2.6 22cd8daf4d70  8145c63fdc44

root@0cc47a6df330:~# ceph orch ls
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
mgr  2/2  8m ago 19h  0cc47a6df14e;0cc47a6df330;0cc47aad8ce8

I've also removed podman and containerd, killed all directories and then did a
fresh reinstall of podman, which also did not work.
It's also strange that the daemons with the wonky version got an extra
suffix.

If I knew how, I would happily nuke the whole orchestrator, podman and
everything that goes along with it, and start over. In the end it is not
that hard to start some mgr/mon daemons without podman, so I would be back
to a classical cluster.
I tried this yesterday, but the daemons still use those very strange images
and I just don't understand why.

I could just nuke the whole dev cluster, wipe all disks and start fresh
after reinstalling the hosts, but as I have to adopt 17 clusters to the
orchestrator, I'd rather get some learnings from this broken setup :)
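
One thing I might still try (untested, just a sketch against the quincy
orchestrator CLI; daemon names taken from the ceph orch ps output above): pin
the intended default image, redeploy the non-active mgr by name first so there
is a usable standby, then fail over and redeploy the other one.

ceph config set global container_image quay.io/ceph/ceph:v17.2.6
ceph orch daemon redeploy mgr.0cc47a6df14e.iltiot quay.io/ceph/ceph:v17.2.6
ceph mgr fail
ceph orch daemon redeploy mgr.0cc47aad8ce8 quay.io/ceph/ceph:v17.2.6
ceph orch ps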

Am Fr., 15. Sept. 2023 um 08:26 Uhr schrieb Stefan Kooman :

> On 14-09-2023 17:49, Boris Behrens wrote:
> > Hi,
> > I currently try to adopt our stage cluster, some hosts just pull strange
> > images.
> >
> > root@0cc47a6df330:/var/lib/containers/storage/overlay-images# podman ps
> > CONTAINER ID  IMAGE   COMMAND
> >  CREATEDSTATUSPORTS   NAMES
> > a532c37ebe42  docker.io/ceph/daemon-base:latest-master-devel  -n
> > mgr.0cc47a6df3...  2 minutes ago  Up 2 minutes ago
> >   ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df330-fxrfyl
> >
> > root@0cc47a6df330:~# ceph orch ps
> > NAME HOST PORTS   STATUS
> >REFRESHED  AGE  MEM USE  MEM LIM  VERSIONIMAGE ID
> >   CONTAINER ID
> > mgr.0cc47a6df14e.vqizdz  0cc47a6df14e.f00f.gridscale.dev  *:9283
> running
> > (3m)  3m ago   3m10.8M-  16.2.11
> >   de4b0b384ad4  00b02cd82a1c
> > mgr.0cc47a6df330.iijety  0cc47a6df330.f00f.gridscale.dev  *:9283
> running
> > (5s)  2s ago   4s10.5M-  17.0.0-7183-g54142666
> >   75e3d7089cea  662c6baa097e
> > mgr.0cc47aad8ce8 0cc47aad8ce8.f00f.gridscale.dev
> running
> > (65m) 8m ago  60m 553M-  17.2.6
> > 22cd8daf4d70  8145c63fdc44
> >
> > Any idea what I need to do to change that?
>
> I want to get some things cleared up. What is the version you are
> running? I see three different ceph versions active now. I see you are
> running a podman ps command, but see docker images pulled. AFAIK podman
> needs a different IMAGE than docker ... or do you have a mixed setup?
>
> What does "ceph config-key get config/global/container_image" give you?
>
> ceph config-key list |grep container_image should give you a list
> (including config-history) where you can see what has been configured
> before.
>
> cephadm logs might give a clue as well.
>
> You can configure the IMAGE version / type that you want by setting the
> key and redeploy affected containers: For example (18.1.2):
>
> ceph config-key set config/global/container_image
>
> quay.io/ceph/ceph:v18.1.2@sha256:82a380c8127c42da406b7ce1281c2f3c0a86d4ba04b1f4b5f8d1036b8c24784f
>
> Gr. Stefan
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph orchestator pulls strange images from docker.io

2023-09-14 Thread Boris Behrens
Hi,
I am currently trying to adopt our stage cluster; some hosts just pull strange
images.

root@0cc47a6df330:/var/lib/containers/storage/overlay-images# podman ps
CONTAINER ID  IMAGE   COMMAND
CREATEDSTATUSPORTS   NAMES
a532c37ebe42  docker.io/ceph/daemon-base:latest-master-devel  -n
mgr.0cc47a6df3...  2 minutes ago  Up 2 minutes ago
 ceph-03977a23-f00f-4bb0-b9a7-de57f40ba853-mgr-0cc47a6df330-fxrfyl

root@0cc47a6df330:~# ceph orch ps
NAME HOST PORTS   STATUS
  REFRESHED  AGE  MEM USE  MEM LIM  VERSIONIMAGE ID
 CONTAINER ID
mgr.0cc47a6df14e.vqizdz  0cc47a6df14e.f00f.gridscale.dev  *:9283  running
(3m)  3m ago   3m10.8M-  16.2.11
 de4b0b384ad4  00b02cd82a1c
mgr.0cc47a6df330.iijety  0cc47a6df330.f00f.gridscale.dev  *:9283  running
(5s)  2s ago   4s10.5M-  17.0.0-7183-g54142666
 75e3d7089cea  662c6baa097e
mgr.0cc47aad8ce8 0cc47aad8ce8.f00f.gridscale.dev  running
(65m) 8m ago  60m 553M-  17.2.6
22cd8daf4d70  8145c63fdc44

Any idea what I need to do to change that?

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multple public_nework

2023-08-23 Thread Boris Behrens
Well, I've read that differently in older threads, but now that I know it, I
can work around it.

Thanks

Am Di., 22. Aug. 2023 um 16:41 Uhr schrieb Konstantin Shalygin <
k0...@k0ste.ru>:

> Hi,
>
> This how OSD's woks. For change the network subnet you need to setup
> reachability of both: old and new network, until end of migration
>
> k
> Sent from my iPhone
>
> > On 22 Aug 2023, at 10:43, Boris Behrens  wrote:
> >
> > The OSDs are still only bound to one IP address.
>
>

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multple public_nework

2023-08-22 Thread Boris Behrens
Yes, I did change the mon_host config in ceph.conf.
Then I restarted all ceph services with systemctl restart ceph.target

After the restart, nothing changed. I rebooted the host then, and now all
OSDs are attached to the new network.

I thought that OSDs could attach to multiple networks, or even to ALL IP
addresses.
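
A quick way to check what an OSD is actually bound to (just a sketch, osd.0 as
an example):

ceph osd metadata osd.0 | grep -E '"(front|back)_addr"'
# the bound addresses of all OSDs also show up in
ceph osd dump | grep '^osd\.'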

Am Di., 22. Aug. 2023 um 09:53 Uhr schrieb Eugen Block :

> Can you add some more details? Did you change the mon_host in
> ceph.conf and then rebooted? So the OSDs do work correctly now within
> the new network? OSDs do only bind to one public and one cluster IP,
> I'm not aware of a way to have them bind to multiple public IPs like
> the MONs can. You'll probably need to route the compute node traffic
> towards the new network. Please correct me if I misunderstood your
> response.
>
> Zitat von Boris Behrens :
>
> > The OSDs are still only bound to one IP address.
> > After a reboot, the OSDs switched to the new address and are now
> > unreachable from the compute nodes.
> >
> >
> >
> > Am Di., 22. Aug. 2023 um 09:17 Uhr schrieb Eugen Block :
> >
> >> You'll need to update the mon_host line as well. Not sure if it makes
> >> sense to have both old and new network in there, but I'd try on one
> >> host first and see if it works.
> >>
> >> Zitat von Boris Behrens :
> >>
> >> > We're working on the migration to cephadm, but it requires some
> >> > prerequisites that still needs planing.
> >> >
> >> > root@host:~# cat /etc/ceph/ceph.conf ; ceph config dump
> >> > [global]
> >> > fsid = ...
> >> > mon_host = [OLD_NETWORK::10], [OLD_NETWORK::11], [OLD_NETWORK::12]
> >> > #public_network = OLD_NETWORK::/64, NEW_NETWORK::/64
> >> > ms_bind_ipv6 = true
> >> > ms_bind_ipv4 = false
> >> > auth_cluster_required = none
> >> > auth_service_required = none
> >> > auth_client_required = none
> >> >
> >> > [client]
> >> > ms_mon_client_mode = crc
> >> > #debug_rgw = 20
> >> > rgw_frontends = beast endpoint=[OLD_NETWORK::12]:7480
> >> > rgw_region = ...
> >> > rgw_zone = ...
> >> > rgw_thread_pool_size = 512
> >> > rgw_dns_name = ...
> >> > rgw_dns_s3website_name = ...
> >> >
> >> > [mon-new]
> >> > public_addr = NEW_NETWORK::12
> >> > public_bind_addr = NEW_NETWORK::12
> >> >
> >> > WHO   MASK  LEVEL OPTION
> >> > VALUE
> >> >RO
> >> > global  advanced  auth_client_required
> >> > none
> >> > *
> >> > global  advanced  auth_cluster_required
> >> >  none
> >> > *
> >> > global  advanced  auth_service_required
> >> >  none
> >> > *
> >> > global  advanced  mon_allow_pool_size_one
> >> >  true
> >> > global  advanced  ms_bind_ipv4
> >> > false
> >> > global  advanced  ms_bind_ipv6
> >> > true
> >> > global  advanced  osd_pool_default_pg_autoscale_mode
> >> > warn
> >> > global  advanced  public_network
> >> > OLD_NETWORK::/64, NEW_NETWORK::/64
> >> >   *
> >> > mon advanced
> auth_allow_insecure_global_id_reclaim
> >> >  false
> >> > mon advanced  mon_allow_pool_delete
> >> >  false
> >> > mgr advanced  mgr/balancer/active
> >> >  true
> >> > mgr advanced  mgr/balancer/mode
> >> >  upmap
> >> > mgr advanced  mgr/cephadm/migration_current
> >> 5
> >> >
> >> >  *
> >> > mgr advanced  mgr/orchestrator/orchestrator
> >> >  cephadm
> >> > mgr.0cc47a6df14ebasic container_image
> >> >
> >>
> quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
> >> >  *
> >> > mgr.0cc47aad8ce8basic container_image
> >> >
> >>
> quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
> >> >  *
> >> > osd.0   basic osd_mclock_max_capacity_iops_ssd
> >> > 13295.4

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multple public_nework

2023-08-22 Thread Boris Behrens
The OSDs are still only bound to one IP address.
After a reboot, the OSDs switched to the new address and are now
unreachable from the compute nodes.



Am Di., 22. Aug. 2023 um 09:17 Uhr schrieb Eugen Block :

> You'll need to update the mon_host line as well. Not sure if it makes
> sense to have both old and new network in there, but I'd try on one
> host first and see if it works.
>
> Zitat von Boris Behrens :
>
> > We're working on the migration to cephadm, but it requires some
> > prerequisites that still needs planing.
> >
> > root@host:~# cat /etc/ceph/ceph.conf ; ceph config dump
> > [global]
> > fsid = ...
> > mon_host = [OLD_NETWORK::10], [OLD_NETWORK::11], [OLD_NETWORK::12]
> > #public_network = OLD_NETWORK::/64, NEW_NETWORK::/64
> > ms_bind_ipv6 = true
> > ms_bind_ipv4 = false
> > auth_cluster_required = none
> > auth_service_required = none
> > auth_client_required = none
> >
> > [client]
> > ms_mon_client_mode = crc
> > #debug_rgw = 20
> > rgw_frontends = beast endpoint=[OLD_NETWORK::12]:7480
> > rgw_region = ...
> > rgw_zone = ...
> > rgw_thread_pool_size = 512
> > rgw_dns_name = ...
> > rgw_dns_s3website_name = ...
> >
> > [mon-new]
> > public_addr = NEW_NETWORK::12
> > public_bind_addr = NEW_NETWORK::12
> >
> > WHO   MASK  LEVEL OPTION
> > VALUE
> >RO
> > global  advanced  auth_client_required
> > none
> > *
> > global  advanced  auth_cluster_required
> >  none
> > *
> > global  advanced  auth_service_required
> >  none
> > *
> > global  advanced  mon_allow_pool_size_one
> >  true
> > global  advanced  ms_bind_ipv4
> > false
> > global  advanced  ms_bind_ipv6
> > true
> > global  advanced  osd_pool_default_pg_autoscale_mode
> > warn
> > global  advanced  public_network
> > OLD_NETWORK::/64, NEW_NETWORK::/64
> >   *
> > mon advanced  auth_allow_insecure_global_id_reclaim
> >  false
> > mon advanced  mon_allow_pool_delete
> >  false
> > mgr advanced  mgr/balancer/active
> >  true
> > mgr advanced  mgr/balancer/mode
> >  upmap
> > mgr advanced  mgr/cephadm/migration_current
> 5
> >
> >  *
> > mgr advanced  mgr/orchestrator/orchestrator
> >  cephadm
> > mgr.0cc47a6df14ebasic container_image
> >
> quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
> >  *
> > mgr.0cc47aad8ce8basic container_image
> >
> quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
> >  *
> > osd.0   basic osd_mclock_max_capacity_iops_ssd
> > 13295.404086
> > osd.1   basic osd_mclock_max_capacity_iops_ssd
> > 14952.522452
> > osd.2   basic osd_mclock_max_capacity_iops_ssd
> > 13584.113025
> > osd.3   basic osd_mclock_max_capacity_iops_ssd
> > 16421.770356
> > osd.4   basic osd_mclock_max_capacity_iops_ssd
> > 15209.375302
> > osd.5   basic osd_mclock_max_capacity_iops_ssd
> > 15333.697366
> >
> > Am Mo., 21. Aug. 2023 um 14:20 Uhr schrieb Eugen Block :
> >
> >> Hi,
> >>
> >> > I don't have those configs. The cluster is not maintained via cephadm
> /
> >> > orchestrator.
> >>
> >> I just assumed that with Quincy it already would be managed by
> >> cephadm. So what does the ceph.conf currently look like on an OSD host
> >> (mask sensitive data)?
> >>
> >> Zitat von Boris Behrens :
> >>
> >> > Hey Eugen,
> >> > I don't have those configs. The cluster is not maintained via cephadm
> /
> >> > orchestrator.
> >> > The ceph.conf does not have IPaddresses configured.
> >> > A grep in /var/lib/ceph show only binary matches on the mons
> >> >
> >> > I've restarted the whole host, which also did not work.
> >> >
> >> > Am Mo., 21. Aug. 2023 um 13:18 Uhr schrieb Eugen Block  >:
> >> >
> >> >> Hi,
> >> >>
> >> >> there have been a couple of threads wrt network change

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multple public_nework

2023-08-21 Thread Boris Behrens
We're working on the migration to cephadm, but it requires some
prerequisites that still need planning.

root@host:~# cat /etc/ceph/ceph.conf ; ceph config dump
[global]
fsid = ...
mon_host = [OLD_NETWORK::10], [OLD_NETWORK::11], [OLD_NETWORK::12]
#public_network = OLD_NETWORK::/64, NEW_NETWORK::/64
ms_bind_ipv6 = true
ms_bind_ipv4 = false
auth_cluster_required = none
auth_service_required = none
auth_client_required = none

[client]
ms_mon_client_mode = crc
#debug_rgw = 20
rgw_frontends = beast endpoint=[OLD_NETWORK::12]:7480
rgw_region = ...
rgw_zone = ...
rgw_thread_pool_size = 512
rgw_dns_name = ...
rgw_dns_s3website_name = ...

[mon-new]
public_addr = NEW_NETWORK::12
public_bind_addr = NEW_NETWORK::12

WHO   MASK  LEVEL OPTION
VALUE
   RO
global  advanced  auth_client_required
none
*
global  advanced  auth_cluster_required
 none
*
global  advanced  auth_service_required
 none
*
global  advanced  mon_allow_pool_size_one
 true
global  advanced  ms_bind_ipv4
false
global  advanced  ms_bind_ipv6
true
global  advanced  osd_pool_default_pg_autoscale_mode
warn
global  advanced  public_network
OLD_NETWORK::/64, NEW_NETWORK::/64
  *
mon advanced  auth_allow_insecure_global_id_reclaim
 false
mon advanced  mon_allow_pool_delete
 false
mgr advanced  mgr/balancer/active
 true
mgr advanced  mgr/balancer/mode
 upmap
mgr advanced  mgr/cephadm/migration_current  5

 *
mgr advanced  mgr/orchestrator/orchestrator
 cephadm
mgr.0cc47a6df14ebasic container_image
quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
 *
mgr.0cc47aad8ce8basic container_image
quay.io/ceph/ceph@sha256:09e527353463993f0441ad3e86be98076c89c34552163e558a8c2f9bfb4a35f5
 *
osd.0   basic osd_mclock_max_capacity_iops_ssd
13295.404086
osd.1   basic osd_mclock_max_capacity_iops_ssd
14952.522452
osd.2   basic osd_mclock_max_capacity_iops_ssd
13584.113025
osd.3   basic osd_mclock_max_capacity_iops_ssd
16421.770356
osd.4   basic osd_mclock_max_capacity_iops_ssd
15209.375302
osd.5   basic osd_mclock_max_capacity_iops_ssd
15333.697366

Am Mo., 21. Aug. 2023 um 14:20 Uhr schrieb Eugen Block :

> Hi,
>
> > I don't have those configs. The cluster is not maintained via cephadm /
> > orchestrator.
>
> I just assumed that with Quincy it already would be managed by
> cephadm. So what does the ceph.conf currently look like on an OSD host
> (mask sensitive data)?
>
> Zitat von Boris Behrens :
>
> > Hey Eugen,
> > I don't have those configs. The cluster is not maintained via cephadm /
> > orchestrator.
> > The ceph.conf does not have IPaddresses configured.
> > A grep in /var/lib/ceph show only binary matches on the mons
> >
> > I've restarted the whole host, which also did not work.
> >
> > Am Mo., 21. Aug. 2023 um 13:18 Uhr schrieb Eugen Block :
> >
> >> Hi,
> >>
> >> there have been a couple of threads wrt network change, simply
> >> restarting OSDs is not sufficient. I still haven't had to do it
> >> myself, but did you 'ceph orch reconfig osd' after adding the second
> >> public network, then restart them? I'm not sure if the orchestrator
> >> works as expected here, last year there was a thread [1] with the same
> >> intention. Can you check the local ceph.conf file
> >> (/var/lib/ceph///config) of the OSDs (or start with
> >> one) if it contains both public networks? I (still) expect the
> >> orchestrator to update that config as well. Maybe it's worth a bug
> >> report? If there's more to it than just updating the monmap I would
> >> like to see that added to the docs since moving monitors to a
> >> different network is already documented [2].
> >>
> >> Regards,
> >> Eugen
> >>
> >> [1] https://www.spinics.net/lists/ceph-users/msg75162.html
> >> [2]
> >>
> >>
> https://docs.ceph.com/en/quincy/cephadm/services/mon/#moving-monitors-to-a-different-network
> >>
> >> Zitat von Boris Behrens :
> >>
> >> > Hi,
> >> > I need to migrate a storage cluster to a new network.
> >> >
> >> > I added the new network to the ceph config via:
> >> > ceph config set global public_network "old_network/64, new_network/64"
> >> > I've added a set of

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multple public_nework

2023-08-21 Thread Boris Behrens
Hey Eugen,
I don't have those configs. The cluster is not maintained via cephadm /
orchestrator.
The ceph.conf does not have IPaddresses configured.
A grep in /var/lib/ceph show only binary matches on the mons

I've restarted the whole host, which also did not work.

Am Mo., 21. Aug. 2023 um 13:18 Uhr schrieb Eugen Block :

> Hi,
>
> there have been a couple of threads wrt network change, simply
> restarting OSDs is not sufficient. I still haven't had to do it
> myself, but did you 'ceph orch reconfig osd' after adding the second
> public network, then restart them? I'm not sure if the orchestrator
> works as expected here, last year there was a thread [1] with the same
> intention. Can you check the local ceph.conf file
> (/var/lib/ceph///config) of the OSDs (or start with
> one) if it contains both public networks? I (still) expect the
> orchestrator to update that config as well. Maybe it's worth a bug
> report? If there's more to it than just updating the monmap I would
> like to see that added to the docs since moving monitors to a
> different network is already documented [2].
>
> Regards,
> Eugen
>
> [1] https://www.spinics.net/lists/ceph-users/msg75162.html
> [2]
>
> https://docs.ceph.com/en/quincy/cephadm/services/mon/#moving-monitors-to-a-different-network
>
> Zitat von Boris Behrens :
>
> > Hi,
> > I need to migrate a storage cluster to a new network.
> >
> > I added the new network to the ceph config via:
> > ceph config set global public_network "old_network/64, new_network/64"
> > I've added a set of new mon daemons with IP addresses in the new network
> > and they are added to the quorum and seem to work as expected.
> >
> > But when I restart the OSD daemons, the do not bind to the new
> addresses. I
> > would have expected that the OSDs try to bind to all networks but they
> are
> > only bound to the old_network.
> >
> > The idea was to add the new set of network config to the current storage
> > hosts, bind everything to ip addresses in both networks, shift over
> > workload, and then remove the old network.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] [quincy] Migrating ceph cluster to new network, bind OSDs to multple public_nework

2023-08-21 Thread Boris Behrens
Hi,
I need to migrate a storage cluster to a new network.

I added the new network to the ceph config via:
ceph config set global public_network "old_network/64, new_network/64"
I've added a set of new mon daemons with IP addresses in the new network
and they are added to the quorum and seem to work as expected.

But when I restart the OSD daemons, they do not bind to the new addresses. I
would have expected the OSDs to try to bind to all networks, but they are
only bound to the old_network.

The idea was to add the new set of network config to the current storage
hosts, bind everything to ip addresses in both networks, shift over
workload, and then remove the old network.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrading nautilus / centos7 to octopus / ubuntu 20.04. - Suggestions and hints?

2023-08-01 Thread Boris Behrens
Hi Goetz,
I've done the same, and went to Octopus and to Ubuntu. It worked like a
charm and with pip, you can get the pecan library working. I think I did it
with this:
yum -y install python36-six.noarch python36-PyYAML.x86_64
pip3 install pecan werkzeug cherrypy

Worked very well, until we got hit by this bug:
https://tracker.ceph.com/issues/53729#note-65
Nautilus seems not to have the tooling to detect it, and the fix is not
backported to octopus.

Our clusters started to act badly after the octopus upgrade, so we
fast-forwarded to pacific (untested emergency cluster upgrades are
okayish but ugly :D ).

And because of the bug, we went another route with the last cluster.
I reinstalled all hosts with ubuntu 18.04, then updated straight to pacific,
and then upgraded to ubuntu 20.04.

Hope that helped.

Cheers
 Boris


Am Di., 1. Aug. 2023 um 20:06 Uhr schrieb Götz Reinicke <
goetz.reini...@filmakademie.de>:

> Hi,
>
> As I’v read and thought a lot about the migration as this is a bigger
> project, I was wondering if anyone has done that already and might share
> some notes or playbooks, because in all readings there where some parts
> missing or miss understandable to me.
>
> I do have some different approaches in mind, so may be you have some
> suggestions or hints.
>
> a) upgrade nautilus on centos 7 with the few missing features like
> dashboard and prometheus. After that migrate one node after an other to
> ubuntu 20.04 with octopus and than upgrade ceph to the recent stable
> version.
>
> b) migrate one node after an other to ubuntu 18.04 with nautilus and then
> upgrade to octupus and after that to ubuntu 20.04.
>
> or
>
> c) upgrade one node after an other to ubuntu 20.04 with octopus and join
> it to the cluster until all nodes are upgraded.
>
>
> For test I tried c) with a mon node, but adding that to the cluster fails
> with some failed state, still probing for the other mons. (I dont have the
> right log at hand right now.)
>
> So my questions are:
>
> a) What would be the best (most stable) migration path and
>
> b) is it in general possible to add a new octopus mon (not upgraded one)
> to a nautilus cluster, where the other mons are still on nautilus?
>
>
> I hope my thoughts and questions are understandable :)
>
> Thanks for any hint and suggestion. Best . Götz
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: radosgw new zonegroup hammers master with metadata sync

2023-07-04 Thread Boris Behrens
Are there any ideas how to deal with this?
We disabled the logging so we do not run out of disk space, but the rgw
daemon still requires A LOT of CPU because of this.

Am Mi., 21. Juni 2023 um 10:45 Uhr schrieb Boris Behrens :

> I've update the dc3 site from octopus to pacific and the problem is still
> there.
> I find it very weird that in only happens from one single zonegroup to the
> master and not from the other two.
>
> Am Mi., 21. Juni 2023 um 01:59 Uhr schrieb Boris Behrens :
>
>> I recreated the site and the problem still persists.
>>
>> I've upped the logging and saw this for a lot of buckets (i've stopped
>> the debug log after some seconds).
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 20 get_system_obj_state:
>> rctx=0x7fcaab7f9320 obj=dc3.rgw.meta:root:s3bucket-fra2
>> state=0x7fcba05ac0a0 s->prefetch_data=0
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache get:
>> name=dc3.rgw.meta+root+s3bucket-fra2 : miss
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache put:
>> name=dc3.rgw.meta+root+s3bucket-fra2 info.flags=0x6
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 adding
>> dc3.rgw.meta+root+s3bucket-fra2 to cache LRU end
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache get:
>> name=dc3.rgw.meta+root+s3bucket-fra2 : type miss (requested=0x1, cached=0x6)
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache put:
>> name=dc3.rgw.meta+root+s3bucket-fra2 info.flags=0x1
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 moving
>> dc3.rgw.meta+root+s3bucket-fra2 to cache LRU end
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 20 get_system_obj_state:
>> rctx=0x7fcaab7f9320
>> obj=dc3.rgw.meta:root:.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
>> state=0x7fcba43ce0a0 s->prefetch_data=0
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache get:
>> name=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
>> : miss
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache put:
>> name=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
>> info.flags=0x16
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 adding
>> dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
>> to cache LRU end
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache get:
>> name=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
>> : type miss (requested=0x13, cached=0x16)
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache put:
>> name=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
>> info.flags=0x13
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 moving
>> dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
>> to cache LRU end
>> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 chain_cache_entry:
>> cache_locator=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
>>
>> Am Di., 20. Juni 2023 um 19:29 Uhr schrieb Boris :
>>
>>> Hi Casey,
>>> already did restart all RGW instances.  Only helped for 2 minutes. We
>>> now stopped the new site.
>>>
>>> I will remove and recreate it later.
>>> As twi other sites don't have the problem I currently think I made a
>>> mistake in the process.
>>>
>>> Mit freundlichen Grüßen
>>>  - Boris Behrens
>>>
>>> > Am 20.06.2023 um 18:30 schrieb Casey Bodley :
>>> >
>>> > hi Boris,
>>> >
>>> > we've been investigating reports of excessive polling from metadata
>>> > sync. i just opened https://tracker.ceph.com/issues/61743 to track
>>> > this. restarting the secondary zone radosgws should help as a
>>> > temporary workaround
>>> >
>>> >> On Tue, Jun 20, 2023 at 5:57 AM Boris Behrens  wrote:
>>> >>
>>> >> Hi,
>>> >> yesterday I added a new zonegroup and it looks like it seems to cycle
>>> over
>>> >> the same requests over and over again.
>>> >>
>>> >> In the log of the main zone I see these requests:
>>> >> 2023-06-20T09:48:37.979+ 7f8941fb3700  1 beast: 0x7f8a602f3700:
>>> >> fd00:2380:0:24::136 - - [2023-06-20T09:48:37.979941+] "GET
>>> >>
>>> /admin/log?type=metadata=62=e8fc96f1-ae86-4dc1-b432-470b0772fded=100&=b39392eb-75f8-47f0-b4f3-7d3882930b26
>>> >> H

[ceph-users] Re: list of rgw instances in ceph status

2023-07-03 Thread Boris Behrens
Hi Mahnoosh,
that helped. Thanks a lot!

Am Mo., 3. Juli 2023 um 13:46 Uhr schrieb mahnoosh shahidi <
mahnooosh@gmail.com>:

> Hi Boris,
>
> You can list your rgw daemons with the following command
>
> ceph service dump -f json-pretty | jq '.services.rgw.daemons'
>
>
> The following command extract all their ids
>
> ceph service dump -f json-pretty | jq '.services.rgw.daemons' | egrep -e
>> 'gid' -e '\"id\"'
>>
>
> Best Regards,
> Mahnoosh
>
> On Mon, Jul 3, 2023 at 3:00 PM Boris Behrens  wrote:
>
>> Hi,
>> might be a dump question, but is there a way to list the rgw instances
>> that
>> are running in a ceph cluster?
>>
>> Before pacific it showed up in `ceph status` but now it only tells me how
>> many daemons are active, now which daemons are active.
>>
>> ceph orch ls tells me that I need to configure a backend but we are not at
>> the stage that we are going to implement the orchestrator yet.
>>
>> Cheers
>>  Boris
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] list of rgw instances in ceph status

2023-07-03 Thread Boris Behrens
Hi,
might be a dumb question, but is there a way to list the rgw instances that
are running in a ceph cluster?

Before pacific they showed up in `ceph status`, but now it only tells me how
many daemons are active, not which daemons are active.

ceph orch ls tells me that I need to configure a backend but we are not at
the stage that we are going to implement the orchestrator yet.

Cheers
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: device class for nvme disk is ssd

2023-06-29 Thread Boris Behrens
So basically it does not matter unless I want to have that split up.
Thanks for all the answers.

I am still lobbying to phase out SATA SSDs and replace them with NVME
disks. :)
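
In case we ever do want that split, a rough sketch of how it could look (osd.0
and the rule name are just examples): reclassify the existing OSDs and give the
NVMe pools their own crush rule; new OSDs can get the class at creation time.

ceph osd crush rm-device-class osd.0
ceph osd crush set-device-class nvme osd.0
ceph osd crush rule create-replicated replicated-nvme default host nvme
# for new OSDs the class can be set directly
ceph-volume lvm create --bluestore --data /dev/nvme2n1 --crush-device-class nvme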

Am Mi., 28. Juni 2023 um 18:14 Uhr schrieb Anthony D'Atri <
a...@dreamsnake.net>:

> Even when you factor in density, iops, and the cost of an HBA?
>
> SAS is mostly dead, manufacturers are beginning to drop SATA from their
> roadmaps.
>
> > On Jun 28, 2023, at 10:24 AM, Marc  wrote:
> >
> > 
> >
> >>
> >> What would we use instead? SATA / SAS that are progressively withering
> >> in the market, less performance for the same money? Why pay extra for an
> >> HBA just to use legacy media?
> >
> > I am still buying sas/sata ssd's, these are for me still ~half price of
> the nvme equivalent.
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] device class for nvme disk is ssd

2023-06-28 Thread Boris Behrens
Hi,
is it a problem that the device class for all my disks is SSD even though all
of these disks are NVMe disks? If it is just a classification for ceph, so that
I can have pools on SSDs and NVMes separated, I don't care. But maybe ceph
handles NVMe disks differently internally?

I've added them via
ceph-volume lvm create --bluestore --data /dev/nvme2n1

and they only show up as ssd
root@a0423f621aaa:~# ceph osd metadata osd.0
{
"id": 0,
"arch": "x86_64",
...
"bluefs": "1",
"bluefs_dedicated_db": "0",
"bluefs_dedicated_wal": "0",
"bluefs_single_shared_device": "1",
"bluestore_bdev_access_mode": "blk",
"bluestore_bdev_block_size": "4096",
"bluestore_bdev_dev_node": "/dev/dm-2",
"bluestore_bdev_devices": "nvme0n1",
"bluestore_bdev_driver": "KernelDevice",
"bluestore_bdev_partition_path": "/dev/dm-2",
"bluestore_bdev_rotational": "0",
"bluestore_bdev_size": "1920378863616",
"bluestore_bdev_support_discard": "1",
"bluestore_bdev_type": "ssd",
"ceph_release": "pacific",
"ceph_version": "ceph version 16.2.13
(5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)",
"ceph_version_short": "16.2.13",
"ceph_version_when_created": "ceph version 16.2.13
(5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)",
"cpu": "Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz",
"created_at": "2023-06-20T14:03:35.167741Z",
"default_device_class": "ssd",
"device_ids": "nvme0n1=SAMSUNG_MZQLB1T9HAJR-7_S439NF0M506164",
"device_paths": "nvme0n1=/dev/disk/by-path/pci-:5e:00.0-nvme-1",
"devices": "nvme0n1",
"distro": "ubuntu",
"distro_description": "Ubuntu 20.04.6 LTS",
"distro_version": "20.04",
...
"journal_rotational": "0",
"kernel_description": "#169-Ubuntu SMP Tue Jun 6 22:23:09 UTC 2023",
"kernel_version": "5.4.0-152-generic",
"mem_swap_kb": "0",
"mem_total_kb": "196668116",
"network_numa_unknown_ifaces": "back_iface,front_iface",
"objectstore_numa_node": "0",
"objectstore_numa_nodes": "0",
"os": "Linux",
"osd_data": "/var/lib/ceph/osd/ceph-0",
"osd_objectstore": "bluestore",
"osdspec_affinity": "",
"rotational": "0"
}

Cheers
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: radosgw new zonegroup hammers master with metadata sync

2023-06-21 Thread Boris Behrens
I've updated the dc3 site from octopus to pacific and the problem is still
there.
I find it very weird that it only happens from one single zonegroup to the
master and not from the other two.

Am Mi., 21. Juni 2023 um 01:59 Uhr schrieb Boris Behrens :

> I recreated the site and the problem still persists.
>
> I've upped the logging and saw this for a lot of buckets (i've stopped the
> debug log after some seconds).
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 20 get_system_obj_state:
> rctx=0x7fcaab7f9320 obj=dc3.rgw.meta:root:s3bucket-fra2
> state=0x7fcba05ac0a0 s->prefetch_data=0
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache get:
> name=dc3.rgw.meta+root+s3bucket-fra2 : miss
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache put:
> name=dc3.rgw.meta+root+s3bucket-fra2 info.flags=0x6
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 adding
> dc3.rgw.meta+root+s3bucket-fra2 to cache LRU end
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache get:
> name=dc3.rgw.meta+root+s3bucket-fra2 : type miss (requested=0x1, cached=0x6)
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache put:
> name=dc3.rgw.meta+root+s3bucket-fra2 info.flags=0x1
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 moving
> dc3.rgw.meta+root+s3bucket-fra2 to cache LRU end
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 20 get_system_obj_state:
> rctx=0x7fcaab7f9320
> obj=dc3.rgw.meta:root:.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
> state=0x7fcba43ce0a0 s->prefetch_data=0
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache get:
> name=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
> : miss
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache put:
> name=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
> info.flags=0x16
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 adding
> dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
> to cache LRU end
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache get:
> name=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
> : type miss (requested=0x13, cached=0x16)
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache put:
> name=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
> info.flags=0x13
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 moving
> dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
> to cache LRU end
> 2023-06-20T23:32:29.365+ 7fcaab7fe700 10 chain_cache_entry:
> cache_locator=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
>
> Am Di., 20. Juni 2023 um 19:29 Uhr schrieb Boris :
>
>> Hi Casey,
>> already did restart all RGW instances.  Only helped for 2 minutes. We now
>> stopped the new site.
>>
>> I will remove and recreate it later.
>> As twi other sites don't have the problem I currently think I made a
>> mistake in the process.
>>
>> Mit freundlichen Grüßen
>>  - Boris Behrens
>>
>> > Am 20.06.2023 um 18:30 schrieb Casey Bodley :
>> >
>> > hi Boris,
>> >
>> > we've been investigating reports of excessive polling from metadata
>> > sync. i just opened https://tracker.ceph.com/issues/61743 to track
>> > this. restarting the secondary zone radosgws should help as a
>> > temporary workaround
>> >
>> >> On Tue, Jun 20, 2023 at 5:57 AM Boris Behrens  wrote:
>> >>
>> >> Hi,
>> >> yesterday I added a new zonegroup and it looks like it seems to cycle
>> over
>> >> the same requests over and over again.
>> >>
>> >> In the log of the main zone I see these requests:
>> >> 2023-06-20T09:48:37.979+ 7f8941fb3700  1 beast: 0x7f8a602f3700:
>> >> fd00:2380:0:24::136 - - [2023-06-20T09:48:37.979941+] "GET
>> >>
>> /admin/log?type=metadata=62=e8fc96f1-ae86-4dc1-b432-470b0772fded=100&=b39392eb-75f8-47f0-b4f3-7d3882930b26
>> >> HTTP/1.1" 200 44 - - -
>> >>
>> >> Only thing that changes is the 
>> >>
>> >> We have two other zonegroups that are configured identical (ceph.conf
>> and
>> >> period) and these don;t seem to spam the main rgw.
>> >>
>> >> root@host:~# radosgw-admin sync status
>> >>  realm 5d6f2ea4-b84a-459b-bce2-bccac338b3ef (main)
>> >>  zonegroup b39392eb-75f8-47f0-b4f3-7d3882930b26 (dc3)
>> >>   zone 96f5eca9-425b-

[ceph-users] Re: radosgw new zonegroup hammers master with metadata sync

2023-06-20 Thread Boris Behrens
I recreated the site and the problem still persists.

I've upped the logging and saw this for a lot of buckets (I've stopped the
debug log after a few seconds).
2023-06-20T23:32:29.365+ 7fcaab7fe700 20 get_system_obj_state:
rctx=0x7fcaab7f9320 obj=dc3.rgw.meta:root:s3bucket-fra2
state=0x7fcba05ac0a0 s->prefetch_data=0
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache get:
name=dc3.rgw.meta+root+s3bucket-fra2 : miss
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache put:
name=dc3.rgw.meta+root+s3bucket-fra2 info.flags=0x6
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 adding
dc3.rgw.meta+root+s3bucket-fra2 to cache LRU end
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache get:
name=dc3.rgw.meta+root+s3bucket-fra2 : type miss (requested=0x1, cached=0x6)
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache put:
name=dc3.rgw.meta+root+s3bucket-fra2 info.flags=0x1
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 moving
dc3.rgw.meta+root+s3bucket-fra2 to cache LRU end
2023-06-20T23:32:29.365+ 7fcaab7fe700 20 get_system_obj_state:
rctx=0x7fcaab7f9320
obj=dc3.rgw.meta:root:.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
state=0x7fcba43ce0a0 s->prefetch_data=0
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache get:
name=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
: miss
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache put:
name=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
info.flags=0x16
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 adding
dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
to cache LRU end
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache get:
name=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
: type miss (requested=0x13, cached=0x16)
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 cache put:
name=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
info.flags=0x13
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 moving
dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29
to cache LRU end
2023-06-20T23:32:29.365+ 7fcaab7fe700 10 chain_cache_entry:
cache_locator=dc3.rgw.meta+root+.bucket.meta.s3bucket-fra2:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297866866.29

Am Di., 20. Juni 2023 um 19:29 Uhr schrieb Boris :

> Hi Casey,
> already did restart all RGW instances.  Only helped for 2 minutes. We now
> stopped the new site.
>
> I will remove and recreate it later.
> As twi other sites don't have the problem I currently think I made a
> mistake in the process.
>
> Mit freundlichen Grüßen
>  - Boris Behrens
>
> > Am 20.06.2023 um 18:30 schrieb Casey Bodley :
> >
> > hi Boris,
> >
> > we've been investigating reports of excessive polling from metadata
> > sync. i just opened https://tracker.ceph.com/issues/61743 to track
> > this. restarting the secondary zone radosgws should help as a
> > temporary workaround
> >
> >> On Tue, Jun 20, 2023 at 5:57 AM Boris Behrens  wrote:
> >>
> >> Hi,
> >> yesterday I added a new zonegroup and it looks like it seems to cycle
> over
> >> the same requests over and over again.
> >>
> >> In the log of the main zone I see these requests:
> >> 2023-06-20T09:48:37.979+ 7f8941fb3700  1 beast: 0x7f8a602f3700:
> >> fd00:2380:0:24::136 - - [2023-06-20T09:48:37.979941+] "GET
> >>
> /admin/log?type=metadata=62=e8fc96f1-ae86-4dc1-b432-470b0772fded=100&=b39392eb-75f8-47f0-b4f3-7d3882930b26
> >> HTTP/1.1" 200 44 - - -
> >>
> >> Only thing that changes is the 
> >>
> >> We have two other zonegroups that are configured identical (ceph.conf
> and
> >> period) and these don;t seem to spam the main rgw.
> >>
> >> root@host:~# radosgw-admin sync status
> >>  realm 5d6f2ea4-b84a-459b-bce2-bccac338b3ef (main)
> >>  zonegroup b39392eb-75f8-47f0-b4f3-7d3882930b26 (dc3)
> >>   zone 96f5eca9-425b-4194-a152-86e310e91ddb (dc3)
> >>  metadata sync syncing
> >>full sync: 0/64 shards
> >>incremental sync: 64/64 shards
> >>metadata is caught up with master
> >>
> >> root@host:~# radosgw-admin period get
> >> {
> >>"id": "e8fc96f1-ae86-4dc1-b432-470b0772fded",
> >>"epoch": 92,
> >>"predecessor_uuid": "5349ac85-3d6d-4088-993f-7a1d4be3835a",
> >>"sync_status": [
> >>"",
> >> ...

[ceph-users] Re: radosgw new zonegroup hammers master with metadata sync

2023-06-20 Thread Boris
Hi Casey,
I already restarted all RGW instances. It only helped for 2 minutes. We have
now stopped the new site.

I will remove and recreate it later. 
As two other sites don't have the problem, I currently think I made a mistake
in the process.

Mit freundlichen Grüßen
 - Boris Behrens

> Am 20.06.2023 um 18:30 schrieb Casey Bodley :
> 
> hi Boris,
> 
> we've been investigating reports of excessive polling from metadata
> sync. i just opened https://tracker.ceph.com/issues/61743 to track
> this. restarting the secondary zone radosgws should help as a
> temporary workaround
> 
>> On Tue, Jun 20, 2023 at 5:57 AM Boris Behrens  wrote:
>> 
>> Hi,
>> yesterday I added a new zonegroup and it looks like it seems to cycle over
>> the same requests over and over again.
>> 
>> In the log of the main zone I see these requests:
>> 2023-06-20T09:48:37.979+ 7f8941fb3700  1 beast: 0x7f8a602f3700:
>> fd00:2380:0:24::136 - - [2023-06-20T09:48:37.979941+] "GET
>> /admin/log?type=metadata=62=e8fc96f1-ae86-4dc1-b432-470b0772fded=100&=b39392eb-75f8-47f0-b4f3-7d3882930b26
>> HTTP/1.1" 200 44 - - -
>> 
>> Only thing that changes is the 
>> 
>> We have two other zonegroups that are configured identical (ceph.conf and
>> period) and these don;t seem to spam the main rgw.
>> 
>> root@host:~# radosgw-admin sync status
>>  realm 5d6f2ea4-b84a-459b-bce2-bccac338b3ef (main)
>>  zonegroup b39392eb-75f8-47f0-b4f3-7d3882930b26 (dc3)
>>   zone 96f5eca9-425b-4194-a152-86e310e91ddb (dc3)
>>  metadata sync syncing
>>full sync: 0/64 shards
>>incremental sync: 64/64 shards
>>metadata is caught up with master
>> 
>> root@host:~# radosgw-admin period get
>> {
>>"id": "e8fc96f1-ae86-4dc1-b432-470b0772fded",
>>"epoch": 92,
>>"predecessor_uuid": "5349ac85-3d6d-4088-993f-7a1d4be3835a",
>>"sync_status": [
>>"",
>> ...
>>""
>>],
>>"period_map": {
>>"id": "e8fc96f1-ae86-4dc1-b432-470b0772fded",
>>"zonegroups": [
>>{
>>"id": "b39392eb-75f8-47f0-b4f3-7d3882930b26",
>>"name": "dc3",
>>"api_name": "dc3",
>>"is_master": "false",
>>"endpoints": [
>>],
>>"hostnames": [
>>],
>>"hostnames_s3website": [
>>],
>>"master_zone": "96f5eca9-425b-4194-a152-86e310e91ddb",
>>"zones": [
>>{
>>"id": "96f5eca9-425b-4194-a152-86e310e91ddb",
>>"name": "dc3",
>>"endpoints": [
>>],
>>"log_meta": "false",
>>"log_data": "false",
>>"bucket_index_max_shards": 11,
>>"read_only": "false",
>>"tier_type": "",
>>"sync_from_all": "true",
>>"sync_from": [],
>>"redirect_zone": ""
>>}
>>],
>>"placement_targets": [
>>{
>>"name": "default-placement",
>>"tags": [],
>>"storage_classes": [
>>"STANDARD"
>>]
>>}
>>],
>>"default_placement": "default-placement",
>>"realm_id": "5d6f2ea4-b84a-459b-bce2-bccac338b3ef",
>>"sync_policy": {
>>"groups": []
>>}
>>},
>> ...
>> 
>> --
>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
>> groüen Saal.
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] radosgw new zonegroup hammers master with metadata sync

2023-06-20 Thread Boris Behrens
Hi,
yesterday I added a new zonegroup and it looks like it cycles over
the same requests over and over again.

In the log of the main zone I see these requests:
2023-06-20T09:48:37.979+ 7f8941fb3700  1 beast: 0x7f8a602f3700:
fd00:2380:0:24::136 - - [2023-06-20T09:48:37.979941+] "GET
/admin/log?type=metadata=62=e8fc96f1-ae86-4dc1-b432-470b0772fded=100&=b39392eb-75f8-47f0-b4f3-7d3882930b26
HTTP/1.1" 200 44 - - -

Only thing that changes is the 

We have two other zonegroups that are configured identically (ceph.conf and
period) and these don't seem to spam the main rgw.

root@host:~# radosgw-admin sync status
  realm 5d6f2ea4-b84a-459b-bce2-bccac338b3ef (main)
  zonegroup b39392eb-75f8-47f0-b4f3-7d3882930b26 (dc3)
   zone 96f5eca9-425b-4194-a152-86e310e91ddb (dc3)
  metadata sync syncing
full sync: 0/64 shards
incremental sync: 64/64 shards
metadata is caught up with master

root@host:~# radosgw-admin period get
{
"id": "e8fc96f1-ae86-4dc1-b432-470b0772fded",
"epoch": 92,
"predecessor_uuid": "5349ac85-3d6d-4088-993f-7a1d4be3835a",
"sync_status": [
"",
...
""
],
"period_map": {
"id": "e8fc96f1-ae86-4dc1-b432-470b0772fded",
"zonegroups": [
{
"id": "b39392eb-75f8-47f0-b4f3-7d3882930b26",
"name": "dc3",
"api_name": "dc3",
"is_master": "false",
"endpoints": [
],
"hostnames": [
],
"hostnames_s3website": [
],
"master_zone": "96f5eca9-425b-4194-a152-86e310e91ddb",
"zones": [
{
"id": "96f5eca9-425b-4194-a152-86e310e91ddb",
"name": "dc3",
"endpoints": [
],
"log_meta": "false",
"log_data": "false",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
}
],
"placement_targets": [
{
"name": "default-placement",
"tags": [],
"storage_classes": [
"STANDARD"
]
}
],
"default_placement": "default-placement",
"realm_id": "5d6f2ea4-b84a-459b-bce2-bccac338b3ef",
"sync_policy": {
"groups": []
}
},
...

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Bucket empty after resharding on multisite environment

2023-04-27 Thread Boris Behrens
Ok, I was able to do a backflip and revert to the old index files:

# Get stuff
radosgw-admin metadata get bucket.instance:BUCKET_NAME:NEW_BUCKET_ID >
bucket.instance:BUCKET_NAME:NEW_BUCKET_ID.json
radosgw-admin metadata get bucket:BUCKET_NAME > bucket:BUCKET_NAME.json

# create copy for fast rollback
cp bucket.instance:BUCKET_NAME:NEW_BUCKET_ID.json
new.bucket.instance:BUCKET_NAME:NEW_BUCKET_ID.json
cp bucket:BUCKET_NAME.json new.bucket:BUCKET_NAME.json

# edit the new.* files and replace all required fields with the correct
values.

# del stuff
radosgw-admin metadata rm bucket:BUCKET_NAME
radosgw-admin metadata rm bucket.instance:BUCKET_NAME:NEW_BUCKET_ID

# upload stuff
radosgw-admin metadata put bucket:BUCKET_NAME < new.bucket:BUCKET_NAME.json
radosgw-admin metadata put bucket.instance:BUCKET_NAME:OLD_BUCKET_ID <
new.bucket.instance:BUCKET_NAME:OLD_BUCKET_ID.json

# rollback in case it did not work
radosgw-admin metadata rm bucket:BUCKET_NAME
radosgw-admin metadata rm bucket.instance:BUCKET_NAME:OLD_BUCKET_ID
radosgw-admin metadata put bucket:BUCKET_NAME < bucket:BUCKET_NAME.json
radosgw-admin metadata put bucket.instance:BUCKET_NAME:OLD_BUCKET_ID <
bucket.instance:BUCKET_NAME:NEW_BUCKET_ID.json
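
A quick sanity check after the metadata put (a sketch, not part of the original
steps): the bucket should report the old instance again and the index listing
should no longer be empty.

radosgw-admin bucket stats --bucket BUCKET_NAME
radosgw-admin bucket bi list --bucket BUCKET_NAME | head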


Am Do., 27. Apr. 2023 um 13:32 Uhr schrieb Boris Behrens :

> To clarify a bit:
> The bucket data is not in the main zonegroup.
> I wanted to start the reshard in the zonegroup where the bucket and the
> data is located, but rgw told me to do it in the primary zonegroup.
>
> So I did it there and the index on the zonegroup where the bucket is
> located is empty.
>
> We only sync metadata between the zonegroups not the actual data
> (basically have working credentials in all replicated zones but the buckets
> only life in one place)
>
> radosgw-admin bucket stats shows me the correct ID/marker and the amount
> of shards in all location.
> radosgw-admin reshard status shows 101 entries with "not-resharding"
> radosgw-admin reshard stale-instances list --yes-i-really-mean-it does NOT
> show the bucket
> radosgw-admin bucket radoslist --bucket BUCKET is empty
> radosgw-admin bucket bi list --bucket BUCKET is empty
> radosgw-admin bucket radoslist --bucket-id BUCKETID list files
>
> Am Do., 27. Apr. 2023 um 13:08 Uhr schrieb Boris Behrens :
>
>> Hi,
>> I just resharded a bucket on an octopus multisite environment from 11 to
>> 101.
>>
>> I did it on the master zone and it went through very fast.
>> But now the index is empty.
>>
>> The files are still there when doing a radosgw-admin bucket radoslist
>> --bucket-id
>> Do I just need to wait or do I need to recover that somehow?
>>
>>
>>
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Bucket empty after resharding on multisite environment

2023-04-27 Thread Boris Behrens
To clarify a bit:
The bucket data is not in the main zonegroup.
I wanted to start the reshard in the zonegroup where the bucket and the
data are located, but rgw told me to do it in the primary zonegroup.

So I did it there and the index on the zonegroup where the bucket is
located is empty.

We only sync metadata between the zonegroups, not the actual data (basically
we have working credentials in all replicated zones but the buckets only live
in one place).

radosgw-admin bucket stats shows me the correct ID/marker and the number of
shards in all locations.
radosgw-admin reshard status shows 101 entries with "not-resharding"
radosgw-admin reshard stale-instances list --yes-i-really-mean-it does NOT
show the bucket
radosgw-admin bucket radoslist --bucket BUCKET is empty
radosgw-admin bucket bi list --bucket BUCKET is empty
radosgw-admin bucket radoslist --bucket-id BUCKETID lists files

Am Do., 27. Apr. 2023 um 13:08 Uhr schrieb Boris Behrens :

> Hi,
> I just resharded a bucket on an octopus multisite environment from 11 to
> 101.
>
> I did it on the master zone and it went through very fast.
> But now the index is empty.
>
> The files are still there when doing a radosgw-admin bucket radoslist
> --bucket-id
> Do I just need to wait or do I need to recover that somehow?
>
>
>

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Bucket empty after resharding on multisite environment

2023-04-27 Thread Boris Behrens
Hi,
I just resharded a bucket on an octopus multisite environment from 11 to
101.

I did it on the master zone and it went through very fast.
But now the index is empty.

The files are still there when doing a radosgw-admin bucket radoslist
--bucket-id
Do I just need to wait or do I need to recover that somehow?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to find the bucket name from Radosgw log?

2023-04-27 Thread Boris Behrens
Cheers Dan,

would it be an option to enable the ops log? I still haven't figured out how
it actually works.
But I am also thinking about moving to log parsing in HAProxy and disabling the
access log on the RGW instances.
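
If it helps, a sketch of what the haproxy side could look like (names, ports and
the cert path are made up): with dns-style buckets the bucket name ends up in
the Host header, so capturing that header puts it into the normal httplog line.

frontend rgw-in
    bind [::]:443 v4v6 ssl crt /etc/haproxy/rgw.pem
    mode http
    option httplog
    capture request header Host len 64
    default_backend rgw-out

backend rgw-out
    mode http
    server rgw1 [::1]:7480 check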

Am Mi., 26. Apr. 2023 um 18:21 Uhr schrieb Dan van der Ster <
dan.vanders...@clyso.com>:

> Hi,
>
> Your cluster probably has dns-style buckets enabled.
> ..
> In that case the path does not include the bucket name, and neither
> does the rgw log.
> Do you have a frontend lb like haproxy? You'll find the bucket names there.
>
> -- Dan
>
> __
> Clyso GmbH | https://www.clyso.com
>
>
> On Tue, Apr 25, 2023 at 2:34 PM  wrote:
> >
> > I find a log like this, and I thought the bucket name should be "photos":
> >
> > [2023-04-19 15:48:47.0.5541s] "GET /photos/shares/
> >
> > But I can not find it:
> >
> > radosgw-admin bucket stats --bucket photos
> > failure: 2023-04-19 15:48:53.969 7f69dce49a80  0 could not get bucket
> info for bucket=photos
> > (2002) Unknown error 2002
> >
> > How does this happen? Thanks
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Veeam backups to radosgw seem to be very slow

2023-04-27 Thread Boris Behrens
Thanks Janne, I will hand that to the customer.

> Look at https://community.veeam.com/blogs-and-podcasts-57/sobr-veeam
> -capacity-tier-calculations-and-considerations-in-v11-2548
> for "extra large blocks" to make them 8M at least.
> We had one Veeam installation vomit millions of files onto our rgw-S3
> at an average size of 180k per object, and at those sizes, you will
> see very poor throughput and the many objs/MB will hurt all other
> kinds of performance like listing the bucket and so on.
>

@joachim
What do you mean with "default region"?
I just checked the period and it aligns. I've told them to try to get more
information from it

"bucket does not exist" or "permission denied".
> Had received similar error messages with another client program. The
> default region did not match the region of the cluster.
>

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Veeam backups to radosgw seem to be very slow

2023-04-25 Thread Boris Behrens
We have a customer that is trying to use Veeam with our RGW object storage and
it seems to be extremely slow.

What also seems strange is that Veeam sometimes shows "bucket does not
exist" or "permission denied".

I've tested in parallel and everything seems to work fine from the s3cmd/aws
CLI standpoint.

Has anyone here ever experienced Veeam problems with RGW?

Cheers
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: radosgw-admin bucket stats doesn't show real num_objects and size

2023-04-11 Thread Boris Behrens
I don't think you can exclude that.
We've built a notification in the customer panel that warns about incomplete
multipart uploads, whose space will be added to the bill. We also added a
button to create an LC policy for these objects.
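
For reference, the kind of rule that button creates is roughly the following (a sketch; bucket name, rule ID and the 3-day window are made up, and against RGW you need --endpoint-url):

aws s3api put-bucket-lifecycle-configuration --bucket BUCKET --lifecycle-configuration '{"Rules":[{"ID":"abort-incomplete-mpu","Status":"Enabled","Filter":{"Prefix":""},"AbortIncompleteMultipartUpload":{"DaysAfterInitiation":3}}]}'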

Am Di., 11. Apr. 2023 um 19:07 Uhr schrieb :

> The radosgw-admin bucket stats show there are 209266 objects in this
> bucket, but it included failed multiparts, so that make the size parameter
> is also wrong. When I use boto3 to count objects, the bucket only has
> 209049 objects.
>
> The only solution I can find is to use lifecycle to clean these failed
> multiparts, but in production, the client will decide to use lifecycle or
> not?
> So are there any way to exclude the failed multiparts in bucket statistic?
> Does Ceph allow to set auto clean failed multiparts globally?
>
> Thanks!
>
> "usage": {
> "rgw.main": {
> "size": 593286801276,
> "size_actual": 593716080640,
> "size_utilized": 593286801276,
> "size_kb": 579381642,
> "size_kb_actual": 579800860,
> "size_kb_utilized": 579381642,
> "num_objects": 209266
> }
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW can't create bucket

2023-03-31 Thread Boris Behrens
Sounds like all users have the problem?

So what I would do in my setup now:
- start a new rgw client with maximum logging (debug_rgw = 20) on a non-public port
- test against this endpoint and check the logs

This might give you more insight.
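
The kind of ceph.conf section I mean would look roughly like this (name, port and log path are just placeholders):

[client.rgw.debug]
rgw_frontends = beast endpoint=[::]:7999
debug_rgw = 20
debug_ms = 1
log_file = /var/log/ceph/ceph-rgw-debug.log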

Am Fr., 31. März 2023 um 09:36 Uhr schrieb Kamil Madac <
kamil.ma...@gmail.com>:

> We checked s3cmd --debug and endpoint is ok (Working with existing buckets
> works ok with same s3cmd config).  From what I read, "max_buckets": 0 means
> that there is no quota for the number of buckets. There are also users who
> have "max_buckets": 1000, and those users have the same access_denied issue
> when creating a bucket.
>
> We also tried other bucket names and it is the same issue.
>
> On Thu, Mar 30, 2023 at 6:28 PM Boris Behrens  wrote:
>
>> Hi Kamil,
>> is this with all new buckets or only the 'test' bucket? Maybe the name is
>> already taken?
>> Can you check s3cmd --debug if you are connecting to the correct endpoint?
>>
>> Also I see that the user seems to not be allowed to create buckets
>> ...
>> "max_buckets": 0,
>> ...
>>
>> Cheers
>>  Boris
>>
>> Am Do., 30. März 2023 um 17:43 Uhr schrieb Kamil Madac <
>> kamil.ma...@gmail.com>:
>>
>> > Hi Eugen
>> >
>> > It is version 16.2.6, we checked quotas and we can't see any applied
>> quotas
>> > for users. As I wrote, every user is affected. Are there any non-user or
>> > global quotas, which can cause that no user can create a bucket?
>> >
>> > Here is example output of newly created user which cannot create buckets
>> > too:
>> >
>> > {
>> > "user_id": "user123",
>> > "display_name": "user123",
>> > "email": "",
>> > "suspended": 0,
>> > "max_buckets": 0,
>> > "subusers": [],
>> > "keys": [
>> > {
>> > "user": "user123",
>> > "access_key": "ZIYY6XNSC06EU8YPL1AM",
>> > "secret_key": "xx"
>> > }
>> > ],
>> > "swift_keys": [],
>> > "caps": [
>> > {
>> > "type": "buckets",
>> > "perm": "*"
>> > }
>> > ],
>> > "op_mask": "read, write, delete",
>> > "default_placement": "",
>> > "default_storage_class": "",
>> > "placement_tags": [],
>> > "bucket_quota": {
>> > "enabled": false,
>> > "check_on_raw": false,
>> > "max_size": -1,
>> > "max_size_kb": 0,
>> > "max_objects": -1
>> > },
>> > "user_quota": {
>> > "enabled": false,
>> > "check_on_raw": false,
>> > "max_size": -1,
>> > "max_size_kb": 0,
>> > "max_objects": -1
>> > },
>> > "temp_url_keys": [],
>> > "type": "rgw",
>> > "mfa_ids": []
>> > }
>> >
>> > On Thu, Mar 30, 2023 at 1:25 PM Eugen Block  wrote:
>> >
>> > > Hi,
>> > >
>> > > what ceph version is this? Could you have hit some quota?
>> > >
>> > > Zitat von Kamil Madac :
>> > >
>> > > > Hi,
>> > > >
>> > > > One of my customers had a correctly working RGW cluster with two
>> zones
>> > in
>> > > > one zonegroup and since a few days ago users are not able to create
>> > > buckets
>> > > > and are always getting Access denied. Working with existing buckets
>> > works
>> > > > (like listing/putting objects into existing bucket). The only
>> operation
>> > > > which is not working is bucket creation. We also tried to create a
>> new
>> > > > user, but the behavior is the same, and he is not able to create the
>> > > > bucket. We tried s3cmd, python script with boto library and also
>> > > Dashboard
>> > > > as admin user. We are always getting Access Denied. Zones are
>> in-sync.
>> > > >
>> > > > Has anyone experienced such behavior?
>> > > >
>> > > > Thanks in advance, here are some outputs:
>> > >

[ceph-users] Re: OSD down cause all OSD slow ops

2023-03-30 Thread Boris Behrens
Hi,
you might suffer from the same bug we suffered:
https://tracker.ceph.com/issues/53729

https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/KG35GRTN4ZIDWPLJZ5OQOKERUIQT5WQ6/#K45MJ63J37IN2HNAQXVOOT3J6NTXIHCA

Basically there is a bug that prevents the removal of PGlog items. You need
to update to pacific for the fix. There is also a very easy check if you
MIGHT be affected: https://tracker.ceph.com/issues/53729#note-65
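
The quick check boils down to looking at the osd_pglog mempool per OSD; from memory it is something along these lines (run on the OSD host, exact grep/jq up to taste):

ceph daemon osd.<id> dump_mempools | grep -A 2 '"osd_pglog"'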

Am Do., 30. März 2023 um 17:02 Uhr schrieb :

> We experienced a Ceph failure causing the system to become unresponsive
> with no IOPS or throughput due to a problematic OSD process on one node.
> This resulted in slow operations and no IOPS for all other OSDs in the
> cluster. The incident timeline is as follows:
>
> Alert triggered for OSD problem.
> 6 out of 12 OSDs on the node were down.
> Soft restart attempted, but smartmontools process stuck while shutting
> down server.
> Hard restart attempted and service resumed as usual.
>
> Our Ceph cluster has 19 nodes, 218 OSDs, and is using version 15.2.17
> octopus (stable).
>
> Questions:
> 1. What is Ceph's detection mechanism? Why couldn't Ceph detect the faulty
> node and automatically abandon its resources?
> 2. Did we miss any patches or bug fixes?
> 3. Suggestions for improvements to quickly detect and avoid similar issues
> in the future?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Eccessive occupation of small OSDs

2023-03-30 Thread Boris Behrens
Hi Nicola, can you send the output of
ceph osd df tree
ceph df
?

Cheers
 Boris

Am Do., 30. März 2023 um 16:36 Uhr schrieb Nicola Mori :

> Dear Ceph users,
>
> my cluster is made up of 10 old machines, with uneven number of disks and
> disk size. Essentially I have just one big data pool (6+2 erasure code,
> with host failure domain) for which I am currently experiencing a very poor
> available space (88 TB of which 40 TB occupied, as reported by df -h on
> hosts mounting the cephfs) compared to the raw one (196.5 TB). I have a
> total of 104 OSDs and 512 PGs for the pool; I cannot increment the PG
> number since the machines are old and with very low amount of RAM, and some
> of them are already overloaded.
>
> In this situation I'm seeing a high occupation of small OSDs (500 MB) with
> respect to bigger ones (2 and 4 TB) even if the weight is set equal to disk
> capacity (see below for ceph osd tree). For example OSD 9 is at 62%
> occupancy even with weight 0.5 and reweight 0.75, while the highest
> occupancy for 2 TB OSDs is 41% (OSD 18) and 4 TB OSDs is 23% (OSD 79). I
> guess this high occupancy for 500 MB OSDs combined with erasure code size
> and host failure domain might be the cause of the poor available space,
> could this be true? The upmap balancer is currently running but I don't
> know if and how much it could improve the situation.
> Any hint is greatly appreciated, thanks.
>
> Nicola
>
> # ceph osd tree
> ID   CLASS  WEIGHT TYPE NAME STATUS  REWEIGHT  PRI-AFF
>  -1 196.47754  root default
>  -7  14.55518  host aka
>   4hdd1.81940  osd.4 up   1.0  1.0
>  11hdd1.81940  osd.11up   1.0  1.0
>  18hdd1.81940  osd.18up   1.0  1.0
>  26hdd1.81940  osd.26up   1.0  1.0
>  32hdd1.81940  osd.32up   1.0  1.0
>  41hdd1.81940  osd.41up   1.0  1.0
>  48hdd1.81940  osd.48up   1.0  1.0
>  55hdd1.81940  osd.55up   1.0  1.0
>  -3  14.55518  host balin
>   0hdd1.81940  osd.0 up   1.0  1.0
>   8hdd1.81940  osd.8 up   1.0  1.0
>  15hdd1.81940  osd.15up   1.0  1.0
>  22hdd1.81940  osd.22up   1.0  1.0
>  29hdd1.81940  osd.29up   1.0  1.0
>  34hdd1.81940  osd.34up   1.0  1.0
>  43hdd1.81940  osd.43up   1.0  1.0
>  49hdd1.81940  osd.49up   1.0  1.0
> -13  29.10950  host bifur
>   3hdd3.63869  osd.3 up   1.0  1.0
>  14hdd3.63869  osd.14up   1.0  1.0
>  27hdd3.63869  osd.27up   1.0  1.0
>  37hdd3.63869  osd.37up   1.0  1.0
>  50hdd3.63869  osd.50up   1.0  1.0
>  59hdd3.63869  osd.59up   1.0  1.0
>  64hdd3.63869  osd.64up   1.0  1.0
>  69hdd3.63869  osd.69up   1.0  1.0
> -17  29.10950  host bofur
>   2hdd3.63869  osd.2 up   1.0  1.0
>  21hdd3.63869  osd.21up   1.0  1.0
>  39hdd3.63869  osd.39up   1.0  1.0
>  57hdd3.63869  osd.57up   1.0  1.0
>  66hdd3.63869  osd.66up   1.0  1.0
>  72hdd3.63869  osd.72up   1.0  1.0
>  76hdd3.63869  osd.76up   1.0  1.0
>  79hdd3.63869  osd.79up   1.0  1.0
> -21  29.10376  host dwalin
>  88hdd1.81898  osd.88up   1.0  1.0
>  89hdd1.81898  osd.89up   1.0  1.0
>  90hdd1.81898  osd.90up   1.0  1.0
>  91hdd1.81898  osd.91up   1.0  1.0
>  92hdd1.81898  osd.92up   1.0  1.0
>  93hdd1.81898  osd.93up   1.0  1.0
>  94hdd1.81898  osd.94up   1.0  1.0
>  95hdd1.81898  osd.95up   1.0  1.0
>  96hdd1.81898  osd.96up   1.0  1.0
>  97hdd1.81898  osd.97up   1.0  1.0
>  98hdd1.81898  osd.98up   1.0  1.0
>  99hdd1.81898  osd.99

[ceph-users] Re: RGW can't create bucket

2023-03-30 Thread Boris Behrens
Hi Kamil,
is this with all new buckets or only the 'test' bucket? Maybe the name is
already taken?
Can you check s3cmd --debug if you are connecting to the correct endpoint?

Also, I see that the user does not seem to be allowed to create buckets:
...
"max_buckets": 0,
...
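
If that turns out to be the cause, lifting the limit should be possible with something along these lines (untested against your exact version):

radosgw-admin user modify --uid=user123 --max-buckets=1000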

Cheers
 Boris

Am Do., 30. März 2023 um 17:43 Uhr schrieb Kamil Madac <
kamil.ma...@gmail.com>:

> Hi Eugen
>
> It is version 16.2.6, we checked quotas and we can't see any applied quotas
> for users. As I wrote, every user is affected. Are there any non-user or
> global quotas, which can cause that no user can create a bucket?
>
> Here is example output of newly created user which cannot create buckets
> too:
>
> {
> "user_id": "user123",
> "display_name": "user123",
> "email": "",
> "suspended": 0,
> "max_buckets": 0,
> "subusers": [],
> "keys": [
> {
> "user": "user123",
> "access_key": "ZIYY6XNSC06EU8YPL1AM",
> "secret_key": "xx"
> }
> ],
> "swift_keys": [],
> "caps": [
> {
> "type": "buckets",
> "perm": "*"
> }
> ],
> "op_mask": "read, write, delete",
> "default_placement": "",
> "default_storage_class": "",
> "placement_tags": [],
> "bucket_quota": {
> "enabled": false,
> "check_on_raw": false,
> "max_size": -1,
> "max_size_kb": 0,
> "max_objects": -1
> },
> "user_quota": {
> "enabled": false,
> "check_on_raw": false,
> "max_size": -1,
> "max_size_kb": 0,
> "max_objects": -1
> },
> "temp_url_keys": [],
> "type": "rgw",
> "mfa_ids": []
> }
>
> On Thu, Mar 30, 2023 at 1:25 PM Eugen Block  wrote:
>
> > Hi,
> >
> > what ceph version is this? Could you have hit some quota?
> >
> > Zitat von Kamil Madac :
> >
> > > Hi,
> > >
> > > One of my customers had a correctly working RGW cluster with two zones
> in
> > > one zonegroup and since a few days ago users are not able to create
> > buckets
> > > and are always getting Access denied. Working with existing buckets
> works
> > > (like listing/putting objects into existing bucket). The only operation
> > > which is not working is bucket creation. We also tried to create a new
> > > user, but the behavior is the same, and he is not able to create the
> > > bucket. We tried s3cmd, python script with boto library and also
> > Dashboard
> > > as admin user. We are always getting Access Denied. Zones are in-sync.
> > >
> > > Has anyone experienced such behavior?
> > >
> > > Thanks in advance, here are some outputs:
> > >
> > > $ s3cmd -c .s3cfg_python_client mb s3://test
> > > ERROR: Access to bucket 'test' was denied
> > > ERROR: S3 error: 403 (AccessDenied)
> > >
> > > Zones are in-sync:
> > >
> > > Primary cluster:
> > >
> > > # radosgw-admin sync status
> > > realm 5429b434-6d43-4a18-8f19-a5720a89c621 (solargis-prod)
> > > zonegroup 00e4b3ff-1da8-4a86-9f52-4300c6d0f149 (solargis-prod-ba)
> > > zone 6067eec6-a930-45c7-af7d-a7ef2785a2d7 (solargis-prod-ba-dc)
> > > metadata sync no sync (zone is master)
> > > data sync source: e84fd242-dbae-466c-b4d9-545990590995
> > (solargis-prod-ba-hq)
> > > syncing
> > > full sync: 0/128 shards
> > > incremental sync: 128/128 shards
> > > data is caught up with source
> > >
> > >
> > > Secondary cluster:
> > >
> > > # radosgw-admin sync status
> > > realm 5429b434-6d43-4a18-8f19-a5720a89c621 (solargis-prod)
> > > zonegroup 00e4b3ff-1da8-4a86-9f52-4300c6d0f149 (solargis-prod-ba)
> > > zone e84fd242-dbae-466c-b4d9-545990590995 (solargis-prod-ba-hq)
> > > metadata sync syncing
> > > full sync: 0/64 shards
> > > incremental sync: 64/64 shards
> > > metadata is caught up with master
> > > data sync source: 6067eec6-a930-45c7-af7d-a7ef2785a2d7
> > (solargis-prod-ba-dc)
> > > syncing
> > > full sync: 0/128 shards
> > > incremental sync: 128/128 shards
> > > data is caught up with source
> > >
> > > --
> > > Kamil Madac
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
>
>
> --
> Kamil Madac <https://kmadac.github.io/>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW access logs with bucket name

2023-03-30 Thread Boris Behrens
Sadly not.
I only see the path/query of a request, but not the hostname.
So when a bucket is accessed via hostname (https://bucket.TLD/object?query)
I only see the object and the query (GET /object?query).
When a bucket is accessed via path (https://TLD/bucket/object?query) I can
also see the bucket in the log (GET bucket/object?query).
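
What I am looking at as an alternative is the ops log over a unix socket, roughly like this (socket path is arbitrary, and I have not verified the whole chain end to end yet); each JSON entry there includes the bucket name:

rgw_enable_ops_log = true
rgw_ops_log_rados = false
rgw_ops_log_socket_path = /var/run/ceph/rgw-ops.sock

socat - UNIX-CONNECT:/var/run/ceph/rgw-ops.sock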

Am Do., 30. März 2023 um 12:58 Uhr schrieb Szabo, Istvan (Agoda) <
istvan.sz...@agoda.com>:

> It has the full url begins with the bucket name in the beast logs http
> requests, hasn’t it?
>
> Istvan Szabo
> Staff Infrastructure Engineer
> ---
> Agoda Services Co., Ltd.
> e: istvan.sz...@agoda.com
> ---
>
> On 2023. Mar 30., at 17:44, Boris Behrens  wrote:
>
> Email received from the internet. If in doubt, don't click any link nor
> open any attachment !
> 
>
> Bringing up that topic again:
> is it possible to log the bucket name in the rgw client logs?
>
> currently I only get to know the bucket name when someone accesses the bucket
> via https://TLD/bucket/object instead of https://bucket.TLD/object.
>
> Am Di., 3. Jan. 2023 um 10:25 Uhr schrieb Boris Behrens :
>
> Hi,
>
> I am looking to move our logs from
> /var/log/ceph/ceph-client...log to our log aggregator.
>
> Is there a way to have the bucket name in the log file?
>
> Or can I write the rgw_enable_ops_log into a file? Maybe I could work with
> this.
>
>
> Cheers and happy new year
>
> Boris
>
>
>
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> --
> This message is confidential and is for the sole use of the intended
> recipient(s). It may also be privileged or otherwise protected by copyright
> or other legal rules. If you have received it by mistake please let us know
> by reply email and delete it from your system. It is prohibited to copy
> this message or disclose its content to anyone. Any confidentiality or
> privilege is not waived or lost by any mistaken delivery or unauthorized
> disclosure of the message. All messages sent to and from Agoda may be
> monitored to ensure compliance with company policies, to protect the
> company's interests and to remove potential malware. Electronic messages
> may be intercepted, amended, lost or deleted, or contain viruses.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW access logs with bucket name

2023-03-30 Thread Boris Behrens
Bringing up that topic again:
is it possible to log the bucket name in the rgw client logs?

currently I only get to know the bucket name when someone accesses the bucket
via https://TLD/bucket/object instead of https://bucket.TLD/object.

Am Di., 3. Jan. 2023 um 10:25 Uhr schrieb Boris Behrens :

> Hi,
> I am looking to move our logs from
> /var/log/ceph/ceph-client...log to our log aggregator.
>
> Is there a way to have the bucket name in the log file?
>
> Or can I write the rgw_enable_ops_log into a file? Maybe I could work with
> this.
>
> Cheers and happy new year
>  Boris
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-30 Thread Boris Behrens
A short correction:
The IOPS from the bench in our pacific cluster are also down to 40 again
for the 4/8TB disks, but the apply latency seems to stay in the same place.
I still don't understand why it is down again. Even when I synced out
the OSD so that it receives no traffic, it is still slow. After idling overnight
it is back up to 120 IOPS.

Am Do., 30. März 2023 um 09:45 Uhr schrieb Boris Behrens :

> After some digging in the nautilus cluster I see that the disks with the
> exceptional high IOPS performance are actually SAS attached NVME disks
> (these:
> https://semiconductor.samsung.com/ssd/enterprise-ssd/pm1643-pm1643a/mzilt7t6hala-7/
> ) and these disk make around 45% of cluster capacity. Maybe this explains
> the very low commit latency in the nautilus cluster.
>
> I did a bench on all SATA 8TB disks (nautilus) and most all of them only
> have ~30-50 IOPS.
> After redeploying one OSD with blkdiscard the IOPS went from 48 -> 120.
>
> The IOPS from the bench in out pacific cluster are also down to 40 again
> for the 4/8TB disks , but the apply latency seems to stay in the same place.
> But I still don't understand why it is down again. Even when I synced out
> the OSD so it receives 0 traffic it is still slow.
>
> I am unsure how I should interpret this. It also looks like that the AVG
> apply latency (4h resolution) goes up again (2023-03-01 upgrade to pacific,
> the dip around 25th was the redeploy and now it seems to go up again)
> [image: image.png]
>
>
>
> Am Mo., 27. März 2023 um 17:24 Uhr schrieb Igor Fedotov <
> igor.fedo...@croit.io>:
>
>>
>> On 3/27/2023 12:19 PM, Boris Behrens wrote:
>>
>> Nonetheless the IOPS the bench command generates are still VERY low
>> compared to the nautilus cluster (~150 vs ~250). But this is something I
>> would pin to this bug: https://tracker.ceph.com/issues/58530
>>
>> I've just run "ceph tell bench" against main, octopus and nautilus
>> branches (fresh osd deployed with vstart.sh) - I don't see any difference
>> between releases - sata drive shows around 110 IOPs in my case..
>>
>> So I suspect some difference between clusters in your case. E.g. are you
>> sure disk caching is off for both?
>>
>> @Igor do you want to me to update the ticket with my findings and the logs
>> from pastebin?
>>
>> Feel free to update if you like but IMO we still lack the understanding
>> what was the trigger for perf improvements in you case - OSD redeployment,
>> disk trimming or both?
>>
>>
>>

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-30 Thread Boris Behrens
After some digging in the nautilus cluster I see that the disks with the
exceptionally high IOPS performance are actually SAS-attached NVMe disks
(these:
https://semiconductor.samsung.com/ssd/enterprise-ssd/pm1643-pm1643a/mzilt7t6hala-7/
), and these disks make up around 45% of the cluster capacity. Maybe this explains
the very low commit latency in the nautilus cluster.

I did a bench on all SATA 8TB disks (nautilus) and almost all of them only
have ~30-50 IOPS.
After redeploying one OSD with blkdiscard the IOPS went from 48 -> 120.
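
For anyone who wants to reproduce the comparison, this is roughly the per-OSD procedure (osd.40 and /dev/sdX are placeholders, and the exact teardown of the old LVs may differ in your setup):

ceph osd out 40                              # drain it and wait for the rebalance
systemctl stop ceph-osd@40
ceph osd destroy 40 --yes-i-really-mean-it
ceph-volume lvm zap --destroy --osd-id 40    # remove the old LVs
blkdiscard /dev/sdX                          # ~10 min on the first run, seconds afterwards
ceph-volume lvm create --osd-id 40 --data /dev/sdX
ceph tell osd.40 bench                       # compare IOPS before/after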

The IOPS from the bench in our pacific cluster are also down to 40 again
for the 4/8TB disks, but the apply latency seems to stay in the same place.
I still don't understand why it is down again. Even when I synced out
the OSD so that it receives no traffic, it is still slow.

I am unsure how I should interpret this. It also looks like the AVG
apply latency (4h resolution) goes up again (2023-03-01 upgrade to pacific,
the dip around the 25th was the redeploy, and now it seems to go up again):
[image: image.png]



Am Mo., 27. März 2023 um 17:24 Uhr schrieb Igor Fedotov <
igor.fedo...@croit.io>:

>
> On 3/27/2023 12:19 PM, Boris Behrens wrote:
>
> Nonetheless the IOPS the bench command generates are still VERY low
> compared to the nautilus cluster (~150 vs ~250). But this is something I
> would pin to this bug: https://tracker.ceph.com/issues/58530
>
> I've just run "ceph tell bench" against main, octopus and nautilus
> branches (fresh osd deployed with vstart.sh) - I don't see any difference
> between releases - sata drive shows around 110 IOPs in my case..
>
> So I suspect some difference between clusters in your case. E.g. are you
> sure disk caching is off for both?
>
> @Igor do you want to me to update the ticket with my findings and the logs
> from pastebin?
>
> Feel free to update if you like but IMO we still lack the understanding
> what was the trigger for perf improvements in your case - OSD redeployment,
> disk trimming or both?
>
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-27 Thread Boris Behrens
Hey Igor,

we are currently using these disks - all SATA attached (is it normal to
have some OSDs without a wear counter?):
# ceph device ls | awk '{print $1}' | cut -f 1,2 -d _ | sort | uniq -c
 18 SAMSUNG_MZ7KH3T8 (4TB)
126 SAMSUNG_MZ7KM1T9 (2TB)
 24 SAMSUNG_MZ7L37T6 (8TB)
  1 TOSHIBA_THNSN81Q (2TB) (ceph device ls shows a wear of 16% so maybe
we remove this one)

These are the CPUs in the storage hosts:
# ceph osd metadata | grep -F '"cpu": "' | sort -u
"cpu": "Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz",
"cpu": "Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz",

The hosts have between 128GB and 256GB of memory and each has between 20 and
30 OSDs.
DB and data are on the same device, with no extra device for DB/WAL.

Seeing your IOPS it looks like we are around the same level.
I am curious if the performance will stay at the current level or degrade
over time.

Am Mo., 27. März 2023 um 13:42 Uhr schrieb Igor Fedotov <
igor.fedo...@croit.io>:

> Hi Boris,
>
> I wouldn't recommend to take absolute "osd bench" numbers too seriously.
> It's definitely not a full-scale quality benchmark tool.
>
> The idea was just to make brief OSDs comparison from c1 and c2.
>
> And for your reference -  IOPS numbers I'm getting in my lab with data/DB
> colocated:
>
> 1) OSD on top of Intel S4600 (SATA SSD) - ~110 IOPS
>
> 2) OSD on top of Samsung DCT 983 (M.2 NVMe) - 310 IOPS
>
> 3) OSD on top of Intel 905p (Optane NVMe) - 546 IOPS.
>
>
> Could you please provide a bit more info on the H/W and OSD setup?
>
> What are the disk models? NVMe or SATA? Are DB and main disk shared?
>
>
> Thanks,
>
> Igor
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-27 Thread Boris Behrens
Hello everyone,

I've redeployed all OSDs in the cluster and did a blkdiscard before
deploying them again. It now looks a lot better, even better than before the
octopus upgrade. I am waiting for confirmation from the dev and customer teams, as
the value averaged over all OSDs can be misleading, and we still have some OSDs
with a 5 minute mean between 1 and 2 ms.

What I also see is that I have three OSDs with quite a lot of OMAP
data compared to the other OSDs (~20 times higher). I don't know if this is
an issue:
ID   CLASS  WEIGHT REWEIGHT  SIZE RAW USE   DATA  OMAP META
AVAIL%USE   VAR   PGS  STATUS  TYPE NAME
...
 91ssd1.74660   1.0  1.7 TiB   1.1 TiB   1.1 TiB   26 MiB  2.9
GiB  670 GiB  62.52  1.08   59  up  osd.91
 92ssd1.74660   1.0  1.7 TiB   1.0 TiB  1022 GiB  575 MiB  2.6
GiB  764 GiB  57.30  0.99   56  up  osd.92
 93ssd1.74660   1.0  1.7 TiB   986 GiB   983 GiB   25 MiB  3.0
GiB  803 GiB  55.12  0.95   53  up  osd.93
...
130ssd1.74660   1.0  1.7 TiB  1018 GiB  1015 GiB   25 MiB  3.1
GiB  771 GiB  56.92  0.98   53  up  osd.130
131ssd1.74660   1.0  1.7 TiB  1023 GiB  1019 GiB  574 MiB  2.9
GiB  766 GiB  57.17  0.98   54  up  osd.131
132ssd1.74660   1.0  1.7 TiB   1.1 TiB   1.1 TiB   26 MiB  3.1
GiB  675 GiB  62.26  1.07   58  up  osd.132
...
 41ssd1.74660   1.0  1.7 TiB   991 GiB   989 GiB   25 MiB  2.5
GiB  797 GiB  55.43  0.95   52  up  osd.41
 44ssd1.74660   1.0  1.7 TiB   1.1 TiB   1.1 TiB  576 MiB  2.8
GiB  648 GiB  63.75  1.10   60  up  osd.44
 56ssd1.74660   1.0  1.7 TiB   993 GiB   990 GiB   25 MiB  2.9
GiB  796 GiB  55.51  0.95   54  up  osd.56

IMHO this might be due to the blkdiscard. We moved a lot of 2TB disks from
the nautilus cluster (c-2) to the then-octopus, now pacific cluster (c-1), and
we only removed the LVM data. Doing the blkdiscard took around 10 minutes
on an 8TB SSD on the first run, and around 5s on the second run.
I could imagine that this might be a problem with SSDs in combination with
bluestore, because there is no trimmable FS, so what the OSD
thinks is free and what the disk controller thinks is free might deviate. But I
am not really deep into storage mechanics, so this is just a wild guess.

Nonetheless the IOPS the bench command generates are still VERY low
compared to the nautilus cluster (~150 vs ~250). But this is something I
would pin to this bug: https://tracker.ceph.com/issues/58530

@Igor, do you want me to update the ticket with my findings and the logs
from pastebin?

@marc
If I interpret the linked bug correctly, you might want to have the
metadata on an SSD, because the write amplification might hit very hard on
HDDs. But maybe someone else from the mailing list can say more about it.

Cheers
 Boris

Am Mi., 22. März 2023 um 22:45 Uhr schrieb Boris Behrens :

> Hey Igor,
>
> sadly we do not have the data from the time where c1 was on nautilus.
> The RocksDB warning persisted the recreation.
>
> Here are the measurements.
> I've picked the same SSD models from the clusters to have some
> comparablity.
> For the 8TB disks it's even the same chassis configuration
> (CPU/Memory/Board/Network)
>
> The IOPS seem VERY low for me. Or are these normal values for SSDs? After
> recreation the IOPS are a lot better on the pacific cluster.
>
> I also blkdiscarded the SSDs before recreating them.
>
> Nautilus Cluster
> osd.22  = 8TB
> osd.343 = 2TB
> https://pastebin.com/EfSSLmYS
>
> Pacific Cluster before recreating OSDs
> osd.40  = 8TB
> osd.162 = 2TB
> https://pastebin.com/wKMmSW9T
>
> Pacific Cluster after recreation OSDs
> osd.40  = 8TB
> osd.162 = 2TB
> https://pastebin.com/80eMwwBW
>
> Am Mi., 22. März 2023 um 11:09 Uhr schrieb Igor Fedotov <
> igor.fedo...@croit.io>:
>
>> Hi Boris,
>>
>> first of all I'm not sure if it's valid to compare two different clusters
>> (pacific vs . nautilus, C1 vs. C2 respectively). The perf numbers
>> difference might be caused by a bunch of other factors: different H/W, user
>> load, network etc... I can see that you got ~2x latency increase after
>> Octopus to Pacific upgrade at C1 but Octopus numbers had been much above
>> Nautilus at C2 before the upgrade. Did you observe even lower numbers at C1
>> when it was running Nautilus if any?
>>
>>
>> You might want to try "ceph tell osd.N bench" to compare OSDs performance
>> for both C1 and C2. Would it be that different?
>>
>>
>> Then redeploy a single OSD at C1, wait till rebalance completion and
>> benchmark it again. What would be the new numbers? Please also collect perf
>> counters from the to-be-redeployed OSD beforehand.

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-22 Thread Boris Behrens
Hey Igor,

sadly we do not have the data from the time when c1 was on nautilus.
The RocksDB warning persisted through the recreation.

Here are the measurements.
I've picked the same SSD models from both clusters to have some comparability.
For the 8TB disks it's even the same chassis configuration
(CPU/Memory/Board/Network).

The IOPS seem VERY low to me. Or are these normal values for SSDs? After
recreation the IOPS are a lot better on the pacific cluster.

I also blkdiscarded the SSDs before recreating them.

Nautilus Cluster
osd.22  = 8TB
osd.343 = 2TB
https://pastebin.com/EfSSLmYS

Pacific Cluster before recreating OSDs
osd.40  = 8TB
osd.162 = 2TB
https://pastebin.com/wKMmSW9T

Pacific Cluster after recreation OSDs
osd.40  = 8TB
osd.162 = 2TB
https://pastebin.com/80eMwwBW

Am Mi., 22. März 2023 um 11:09 Uhr schrieb Igor Fedotov <
igor.fedo...@croit.io>:

> Hi Boris,
>
> first of all I'm not sure if it's valid to compare two different clusters
> (pacific vs . nautilus, C1 vs. C2 respectively). The perf numbers
> difference might be caused by a bunch of other factors: different H/W, user
> load, network etc... I can see that you got ~2x latency increase after
> Octopus to Pacific upgrade at C1 but Octopus numbers had been much above
> Nautilus at C2 before the upgrade. Did you observe even lower numbers at C1
> when it was running Nautilus if any?
>
>
> You might want to try "ceph tell osd.N bench" to compare OSDs performance
> for both C1 and C2. Would it be that different?
>
>
> Then redeploy a single OSD at C1, wait till rebalance completion and
> benchmark it again. What would be the new numbers? Please also collect perf
> counters from the to-be-redeployed OSD beforehand.
>
> W.r.t. rocksdb warning - I presume this might be caused by newer RocksDB
> version running on top of DB with a legacy format.. Perhaps redeployment
> would fix that...
>
>
> Thanks,
>
> Igor
> On 3/21/2023 5:31 PM, Boris Behrens wrote:
>
> Hi Igor,
> i've offline compacted all the OSDs and reenabled the bluefs_buffered_io
>
> It didn't change anything and the commit and apply latencies are around
> 5-10 times higher than on our nautlus cluster. The pacific cluster got a 5
> minute mean over all OSDs 2.2ms, while the nautilus cluster is around 0.2 -
> 0.7 ms.
>
> I also see these kind of logs. Google didn't really help:
> 2023-03-21T14:08:22.089+ 7efe7b911700  3 rocksdb:
> [le/block_based/filter_policy.cc:579] Using legacy Bloom filter with high
> (20) bits/key. Dramatic filter space and/or accuracy improvement is
> available with format_version>=5.
>
>
>
>
> Am Di., 21. März 2023 um 10:46 Uhr schrieb Igor Fedotov 
> :
>
>
> Hi Boris,
>
> additionally you might want to manually compact RocksDB for every OSD.
>
>
> Thanks,
>
> Igor
> On 3/21/2023 12:22 PM, Boris Behrens wrote:
>
> Disabling the write cache and the bluefs_buffered_io did not change
> anything.
> What we see is that larger disks seem to be the leader in therms of
> slowness (we have 70% 2TB, 20% 4TB and 10% 8TB SSDs in the cluster), but
> removing some of the 8TB disks and replace them with 2TB (because it's by
> far the majority and we have a lot of them) disks did also not change
> anything.
>
> Are there any other ideas I could try. Customer start to complain about the
> slower performance and our k8s team mentions problems with ETCD because the
> latency is too high.
>
> Would it be an option to recreate every OSD?
>
> Cheers
>  Boris
>
> Am Di., 28. Feb. 2023 um 22:46 Uhr schrieb Boris Behrens  
>   :
>
>
> Hi Josh,
> thanks a lot for the breakdown and the links.
> I disabled the write cache but it didn't change anything. Tomorrow I will
> try to disable bluefs_buffered_io.
>
> It doesn't sound that I can mitigate the problem with more SSDs.
>
>
> Am Di., 28. Feb. 2023 um 15:42 Uhr schrieb Josh Baergen 
>  :
>
>
> Hi Boris,
>
> OK, what I'm wondering is whetherhttps://tracker.ceph.com/issues/58530 is 
> involved. There are two
> aspects to that ticket:
> * A measurable increase in the number of bytes written to disk in
> Pacific as compared to Nautilus
> * The same, but for IOPS
>
> Per the current theory, both are due to the loss of rocksdb log
> recycling when using default recovery options in rocksdb 6.8; Octopus
> uses version 6.1.2, Pacific uses 6.8.1.
>
> 16.2.11 largely addressed the bytes-written amplification, but the
> IOPS amplification remains. In practice, whether this results in a
> write performance degradation depends on the speed of the underlying
> media and the workload, and thus the things I mention in the next
> paragraph may or may not be applicable to you.
>
> There's no known workaro

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-22 Thread Boris Behrens
Might be. Josh also pointed in that direction. I am currently searching for ways
to mitigate it.

Am Mi., 22. März 2023 um 10:30 Uhr schrieb Konstantin Shalygin <
k0...@k0ste.ru>:

> Hi,
>
>
> Maybe [1] ?
>
>
> [1] https://tracker.ceph.com/issues/58530
> k
>
> On 22 Mar 2023, at 16:20, Boris Behrens  wrote:
>
> Are there any other ides?
>
>
>

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-21 Thread Boris Behrens
Hi Igor,
I've offline compacted all the OSDs and re-enabled bluefs_buffered_io.

It didn't change anything, and the commit and apply latencies are around
5-10 times higher than on our nautilus cluster. The pacific cluster has a 5
minute mean over all OSDs of 2.2 ms, while the nautilus cluster is around 0.2 -
0.7 ms.

I also see these kinds of log messages, and Google didn't really help:
2023-03-21T14:08:22.089+ 7efe7b911700  3 rocksdb:
[le/block_based/filter_policy.cc:579] Using legacy Bloom filter with high
(20) bits/key. Dramatic filter space and/or accuracy improvement is
available with format_version>=5.
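
For reference, the offline compaction was done per OSD with the OSD stopped, roughly like this (default data path assumed):

ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> compact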




Am Di., 21. März 2023 um 10:46 Uhr schrieb Igor Fedotov <
igor.fedo...@croit.io>:

> Hi Boris,
>
> additionally you might want to manually compact RocksDB for every OSD.
>
>
> Thanks,
>
> Igor
> On 3/21/2023 12:22 PM, Boris Behrens wrote:
>
> Disabling the write cache and the bluefs_buffered_io did not change
> anything.
> What we see is that larger disks seem to be the leader in therms of
> slowness (we have 70% 2TB, 20% 4TB and 10% 8TB SSDs in the cluster), but
> removing some of the 8TB disks and replace them with 2TB (because it's by
> far the majority and we have a lot of them) disks did also not change
> anything.
>
> Are there any other ideas I could try. Customer start to complain about the
> slower performance and our k8s team mentions problems with ETCD because the
> latency is too high.
>
> Would it be an option to recreate every OSD?
>
> Cheers
>  Boris
>
> Am Di., 28. Feb. 2023 um 22:46 Uhr schrieb Boris Behrens  
> :
>
>
> Hi Josh,
> thanks a lot for the breakdown and the links.
> I disabled the write cache but it didn't change anything. Tomorrow I will
> try to disable bluefs_buffered_io.
>
> It doesn't sound that I can mitigate the problem with more SSDs.
>
>
> Am Di., 28. Feb. 2023 um 15:42 Uhr schrieb Josh Baergen 
> :
>
>
> Hi Boris,
>
> OK, what I'm wondering is whetherhttps://tracker.ceph.com/issues/58530 is 
> involved. There are two
> aspects to that ticket:
> * A measurable increase in the number of bytes written to disk in
> Pacific as compared to Nautilus
> * The same, but for IOPS
>
> Per the current theory, both are due to the loss of rocksdb log
> recycling when using default recovery options in rocksdb 6.8; Octopus
> uses version 6.1.2, Pacific uses 6.8.1.
>
> 16.2.11 largely addressed the bytes-written amplification, but the
> IOPS amplification remains. In practice, whether this results in a
> write performance degradation depends on the speed of the underlying
> media and the workload, and thus the things I mention in the next
> paragraph may or may not be applicable to you.
>
> There's no known workaround or solution for this at this time. In some
> cases I've seen that disabling bluefs_buffered_io (which itself can
> cause IOPS amplification in some cases) can help; I think most folks
> do this by setting it in local conf and then restarting OSDs in order
> to gain the config change. Something else to consider is
> https://docs.ceph.com/en/quincy/start/hardware-recommendations/#write-caches
> ,
> as sometimes disabling these write caches can improve the IOPS
> performance of SSDs.
>
> Josh
>
> On Tue, Feb 28, 2023 at 7:19 AM Boris Behrens  
>  wrote:
>
> Hi Josh,
> we upgraded 15.2.17 -> 16.2.11 and we only use rbd workload.
>
>
>
> Am Di., 28. Feb. 2023 um 15:00 Uhr schrieb Josh Baergen <
>
> jbaer...@digitalocean.com>:
>
> Hi Boris,
>
> Which version did you upgrade from and to, specifically? And what
> workload are you running (RBD, etc.)?
>
> Josh
>
> On Tue, Feb 28, 2023 at 6:51 AM Boris Behrens  
>  wrote:
>
> Hi,
> today I did the first update from octopus to pacific, and it looks
>
> like the
>
> avg apply latency went up from 1ms to 2ms.
>
> All 36 OSDs are 4TB SSDs and nothing else changed.
> Someone knows if this is an issue, or am I just missing a config
>
> value?
>
> Cheers
>  Boris
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend
>
> im groüen Saal.
>
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>
>
> --
> Igor Fedotov
> Ceph Lead Developer
> --
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
> Web <https://croit.io/> | LinkedIn <http://linkedin.com/company/croit> |
> Youtube <https://www

[ceph-users] Re: Changing os to ubuntu from centos 8

2023-03-21 Thread Boris Behrens
Hi Istvan,

I am currently making the move from CentOS 7 to Ubuntu 18.04 (we want to jump
directly from nautilus to pacific). When everything in the cluster runs the
same version, and that version is available on the new OS, you can just
reinstall the hosts with the new OS.

With the mons, I remove the current mon from the list while reinstalling
and recreate the mon afterwards, so I don't need to carry over any files.
With the OSD hosts I just set the cluster to "noout" and have the system
down for 20 minutes, which is about the time I need to install the new
OS and provision all the configs. Afterwards I just start all the OSDs
(ceph-volume lvm activate --all) and wait for the cluster to become green
again.
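
Condensed, the per-host flow looks roughly like this (a sketch; package installation and config provisioning omitted):

ceph osd set noout
# reinstall the host, install the matching ceph packages, restore ceph.conf and the keyrings
ceph-volume lvm activate --all
ceph osd unset noout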

Cheers
 Boris

Am Di., 21. März 2023 um 08:54 Uhr schrieb Szabo, Istvan (Agoda) <
istvan.sz...@agoda.com>:

> Hi,
>
> I'd like to change the os to ubuntu 20.04.5 from my bare metal deployed
> octopus 15.2.14 on centos 8. On the first run I would go with octopus
> 15.2.17 just to not make big changes in the cluster.
> I've found couple of threads on the mailing list but those were
> containerized (like: Re: Upgrade/migrate host operating system for ceph
> nodes (CentOS/Rocky) or  Re: Migrating CEPH OS looking for suggestions).
>
> Wonder what is the proper steps for this kind of migration? Do we need to
> start with mgr or mon or rgw or osd?
> Is it possible to reuse the osd with ceph-volume scan on the reinstalled
> machine?
> I'd stay with baremetal deployment and even maybe with octopus but I'm
> curious your advice.
>
> Thank you
>
> 
> This message is confidential and is for the sole use of the intended
> recipient(s). It may also be privileged or otherwise protected by copyright
> or other legal rules. If you have received it by mistake please let us know
> by reply email and delete it from your system. It is prohibited to copy
> this message or disclose its content to anyone. Any confidentiality or
> privilege is not waived or lost by any mistaken delivery or unauthorized
> disclosure of the message. All messages sent to and from Agoda may be
> monitored to ensure compliance with company policies, to protect the
> company's interests and to remove potential malware. Electronic messages
> may be intercepted, amended, lost or deleted, or contain viruses.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-21 Thread Boris Behrens
Disabling the write cache and the bluefs_buffered_io did not change
anything.
What we see is that larger disks seem to be the leader in therms of
slowness (we have 70% 2TB, 20% 4TB and 10% 8TB SSDs in the cluster), but
removing some of the 8TB disks and replace them with 2TB (because it's by
far the majority and we have a lot of them) disks did also not change
anything.

Are there any other ideas I could try. Customer start to complain about the
slower performance and our k8s team mentions problems with ETCD because the
latency is too high.

Would it be an option to recreate every OSD?

Cheers
 Boris

Am Di., 28. Feb. 2023 um 22:46 Uhr schrieb Boris Behrens :

> Hi Josh,
> thanks a lot for the breakdown and the links.
> I disabled the write cache but it didn't change anything. Tomorrow I will
> try to disable bluefs_buffered_io.
>
> It doesn't sound that I can mitigate the problem with more SSDs.
>
>
> Am Di., 28. Feb. 2023 um 15:42 Uhr schrieb Josh Baergen <
> jbaer...@digitalocean.com>:
>
>> Hi Boris,
>>
>> OK, what I'm wondering is whether
>> https://tracker.ceph.com/issues/58530 is involved. There are two
>> aspects to that ticket:
>> * A measurable increase in the number of bytes written to disk in
>> Pacific as compared to Nautilus
>> * The same, but for IOPS
>>
>> Per the current theory, both are due to the loss of rocksdb log
>> recycling when using default recovery options in rocksdb 6.8; Octopus
>> uses version 6.1.2, Pacific uses 6.8.1.
>>
>> 16.2.11 largely addressed the bytes-written amplification, but the
>> IOPS amplification remains. In practice, whether this results in a
>> write performance degradation depends on the speed of the underlying
>> media and the workload, and thus the things I mention in the next
>> paragraph may or may not be applicable to you.
>>
>> There's no known workaround or solution for this at this time. In some
>> cases I've seen that disabling bluefs_buffered_io (which itself can
>> cause IOPS amplification in some cases) can help; I think most folks
>> do this by setting it in local conf and then restarting OSDs in order
>> to gain the config change. Something else to consider is
>>
>> https://docs.ceph.com/en/quincy/start/hardware-recommendations/#write-caches
>> ,
>> as sometimes disabling these write caches can improve the IOPS
>> performance of SSDs.
>>
>> Josh
>>
>> On Tue, Feb 28, 2023 at 7:19 AM Boris Behrens  wrote:
>> >
>> > Hi Josh,
>> > we upgraded 15.2.17 -> 16.2.11 and we only use rbd workload.
>> >
>> >
>> >
>> > Am Di., 28. Feb. 2023 um 15:00 Uhr schrieb Josh Baergen <
>> jbaer...@digitalocean.com>:
>> >>
>> >> Hi Boris,
>> >>
>> >> Which version did you upgrade from and to, specifically? And what
>> >> workload are you running (RBD, etc.)?
>> >>
>> >> Josh
>> >>
>> >> On Tue, Feb 28, 2023 at 6:51 AM Boris Behrens  wrote:
>> >> >
>> >> > Hi,
>> >> > today I did the first update from octopus to pacific, and it looks
>> like the
>> >> > avg apply latency went up from 1ms to 2ms.
>> >> >
>> >> > All 36 OSDs are 4TB SSDs and nothing else changed.
>> >> > Someone knows if this is an issue, or am I just missing a config
>> value?
>> >> >
>> >> > Cheers
>> >> >  Boris
>> >> > ___
>> >> > ceph-users mailing list -- ceph-users@ceph.io
>> >> > To unsubscribe send an email to ceph-users-le...@ceph.io
>> >
>> >
>> >
>> > --
>> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend
>> im groüen Saal.
>>
>
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: radosgw SSE-C is not working (InvalidRequest)

2023-03-17 Thread Boris Behrens
Ha, found the error and now I feel just a tiny bit stupid:
haproxy did not add the X-Forwarded-Proto header.
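
For the archives, the missing bit was a one-liner in the HAProxy frontend, roughly (sketch of what we added):

http-request set-header X-Forwarded-Proto https if { ssl_fc }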

Am Fr., 17. März 2023 um 12:03 Uhr schrieb Boris Behrens :

> Hi,
> I try to evaluate SSE-C (so customer provides keys) for our object
> storages.
> We do not provide a KMS server.
>
> I've added "Access-Control-Allow-Headers" to the haproxy frontend.
> rspadd Access-Control-Allow-Headers...
> x-amz-server-side-encryption-customer-algorithm,\
> x-amz-server-side-encryption-customer-key,\
> x-amz-server-side-encryption-customer-key-MD5
>
> I've also enabled "rgw_trust_forwarded_https = true" in the client
> section in the ceph.conf and restarted the RGW daemons.
>
> I now try to get it working, but I am not sure if I am doing it correctly.
>
> $ encKey=$(openssl rand -base64 32)
> $ md5Key=$(echo $encKey | md5sum | awk '{print $1}' | base64)
> $ aws s3api --endpoint=https://radosgw put-object \
>   --body ~/Downloads/TESTFILE \
>   --bucket test-bb-encryption \
>   --key TESTFILE \
>   --sse-customer-algorithm AES256 \
>   --sse-customer-key $encKey \
>   --sse-customer-key-md5 $md5Key
>
> This is what the RGW log gives me:
> 2023-03-17T10:55:55.465+ 7f42bbe5f700  1 == starting new request
> req=0x7f448c185700 =
> 2023-03-17T10:55:55.469+ 7f434df83700  1 == req done
> req=0x7f448c185700 op status=-2021 http_status=400 latency=385ns ==
> 2023-03-17T10:55:55.469+ 7f434df83700  1 beast: 0x7f448c185700: IPV6 -
> - [2023-03-17T10:55:55.469539+] "PUT /test-bb-encryption/TESTFILE
> HTTP/1.1" 400 221 - "aws-cli/2.4.18 Python/3.9.10 Darwin/22.3.0
> source/x86_64 prompt/off command/s3api.put-object" -
>
> Maybe someone has a working example and is willing to share it with me, or
> has also encountered this problem and knows what to do?
>
> It's an octopus cluster.
>
> Cheers
>  Boris
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] radosgw SSE-C is not working (InvalidRequest)

2023-03-17 Thread Boris Behrens
Hi,
I am trying to evaluate SSE-C (where the customer provides the keys) for our object storage.
We do not provide a KMS server.

I've added "Access-Control-Allow-Headers" to the haproxy frontend.
rspadd Access-Control-Allow-Headers...
x-amz-server-side-encryption-customer-algorithm,\
x-amz-server-side-encryption-customer-key,\
x-amz-server-side-encryption-customer-key-MD5

I've also enabled "rgw_trust_forwarded_https = true" in the client section
in the ceph.conf and restarted the RGW daemons.

I am now trying to get it working, but I am not sure if I am doing it correctly.

$ encKey=$(openssl rand -base64 32)
$ md5Key=$(echo $encKey | md5sum | awk '{print $1}' | base64)
$ aws s3api --endpoint=https://radosgw put-object \
  --body ~/Downloads/TESTFILE \
  --bucket test-bb-encryption \
  --key TESTFILE \
  --sse-customer-algorithm AES256 \
  --sse-customer-key $encKey \
  --sse-customer-key-md5 $md5Key
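
A note for anyone copying this: I am not sure my md5Key line above is what S3 actually expects; as far as I understand it, the header should be the base64 of the binary MD5 of the raw key bytes (and I think the CLI computes it itself if you simply omit --sse-customer-key-md5), i.e. something like:

md5Key=$(echo -n "$encKey" | openssl enc -base64 -d | openssl dgst -md5 -binary | openssl enc -base64)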

This is what the RGW log gives me:
2023-03-17T10:55:55.465+ 7f42bbe5f700  1 == starting new request
req=0x7f448c185700 =
2023-03-17T10:55:55.469+ 7f434df83700  1 == req done
req=0x7f448c185700 op status=-2021 http_status=400 latency=385ns ==
2023-03-17T10:55:55.469+ 7f434df83700  1 beast: 0x7f448c185700: IPV6 -
- [2023-03-17T10:55:55.469539+] "PUT /test-bb-encryption/TESTFILE
HTTP/1.1" 400 221 - "aws-cli/2.4.18 Python/3.9.10 Darwin/22.3.0
source/x86_64 prompt/off command/s3api.put-object" -

Maybe someone has a working example and is willing to share it with me, or
has also encountered this problem and knows what to do?

It's an octopus cluster.

Cheers
 Boris
-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Concerns about swap in ceph nodes

2023-03-16 Thread Boris Behrens
Maybe worth mentioning, because it caught me by surprise:
Ubuntu creates a swap file (/swap.img) if you do not specify a swap
partition (check /etc/fstab).
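
Getting rid of it is something along these lines (a sketch; double-check the fstab line before deleting it):

swapoff -a
sed -i '/\/swap.img/d' /etc/fstab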

Cheers
 Boris

Am Mi., 15. März 2023 um 22:11 Uhr schrieb Anthony D'Atri <
a...@dreamsnake.net>:

>
> With CentOS/Rocky 7-8 I’ve observed unexpected usage of swap when there is
> plenty of physmem available.
>
> Swap IMHO is a relic of a time when RAM capacities were lower and much
> more expensive.
>
> In years beginning with a 2, and with Ceph explicitly, I assert that swap
> should never be enabled during day to day operation.
>
> The RAM you need depends in part on the media you’re using, and how many
> per node, but most likely you can and should disable your swap.
>
> >
> > Hello,
> > We have a 6-node ceph cluster, all of them have osd running and 3 of
> them (ceph-1 to ceph-3 )also has the ceph-mgr and ceph-mon. Here is the
> detailed configuration of each node (swap on ceph-1 to ceph-3 has been
> disabled after the alarm):
> >
> > # ceph-1 free -h
> >totalusedfree  shared
> buff/cache   available
> > Mem:  187Gi38Gi   5.4Gi   4.1Gi   143Gi
>  142Gi
> > Swap:0B  0B  0B
> > # ceph-2 free -h
> >totalusedfree  shared
> buff/cache   available
> > Mem:  187Gi49Gi   2.6Gi   4.0Gi   135Gi
>  132Gi
> > Swap:0B  0B  0B
> > # ceph-3 free -h
> >totalusedfree  shared
> buff/cache   available
> > Mem:  187Gi37Gi   4.6Gi   4.0Gi   145Gi
>  144Gi
> > Swap:0B  0B  0B
> > # ceph-4 free -h
> >totalusedfree  shared
> buff/cache   available
> > Mem:  251Gi31Gi   8.3Gi   231Mi   211Gi
>  217Gi
> > Swap: 124Gi   3.8Gi   121Gi
> > # ceph-5 free -h
> >totalusedfree  shared
> buff/cache   available
> > Mem:  251Gi32Gi14Gi   135Mi   204Gi
>  216Gi
> > Swap: 124Gi   4.0Gi   121Gi
> > # ceph-6 free -h
> >totalusedfree  shared
> buff/cache   available
> > Mem:  251Gi30Gi16Gi   145Mi   204Gi
>  218Gi
> > Swap: 124Gi   4.0Gi   121Gi
> >
> > We have configured swap space on all of them, for ceph-mgr nodes, we
> have 8G swap space and 128G swap configured for osd nodes, and our zabbix
> has monitored a swap over 50% usage for ceph-1 to ceph-3, but our available
> space are still around 140G against the total 187G. Just wondering whether
> the swap space is necessary when we have lots of memory available?
> >
> > Thanks very much for your answering.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] radosgw - octopus - 500 Bad file descriptor on upload

2023-03-09 Thread Boris Behrens
Hi,
we've observed 500 errors when uploading files to a single bucket, but the
problem went away after around 2 hours.

We've checked and saw the following error message:
2023-03-08T17:55:58.778+ 7f8062f15700 0 WARNING: set_req_state_err err_no=125 resorting to 500
2023-03-08T17:55:58.778+ 7f8062f15700 0 ERROR: RESTFUL_IO(s)->complete_header() returned err=Bad file descriptor
2023-03-08T17:55:58.778+ 7f8062f15700 1 == req done req=0x7f81d0189700 op status=-125 http_status=500 latency=65003730017ns ==
2023-03-08T17:55:58.778+ 7f8062f15700 1 beast: 0x7f81d0189700: IPADDRESS - - [2023-03-08T17:55:58.778961+] "PUT /BUCKET/OBJECT HTTP/1.1" 500 57 - "aws-sdk-php/3.257.11 OS/Linux/5.15.0-60-generic lang/php/8.2.3 GuzzleHttp/7" -

It only happened to a single bucket over a period of 1-2 hours (around 300
requests).
During the same time we've had >20k PUT requests that were working fine on other
buckets.

This error also seems to happen to other buckets, but only very sporadically.

Has anyone encountered this issue, or does anyone know what it could be?

Cheers
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-02-28 Thread Boris Behrens
Hi Josh,
thanks a lot for the breakdown and the links.
I disabled the write cache but it didn't change anything. Tomorrow I will
try to disable bluefs_buffered_io.

It doesn't sound like I can mitigate the problem with more SSDs.
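
For completeness, this is roughly how I am doing/planning both changes (sdX is a placeholder; SAS drives would need sdparm instead of hdparm):

hdparm -W 0 /dev/sdX                          # disable the volatile write cache
ceph config set osd bluefs_buffered_io false  # then restart the OSDs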


Am Di., 28. Feb. 2023 um 15:42 Uhr schrieb Josh Baergen <
jbaer...@digitalocean.com>:

> Hi Boris,
>
> OK, what I'm wondering is whether
> https://tracker.ceph.com/issues/58530 is involved. There are two
> aspects to that ticket:
> * A measurable increase in the number of bytes written to disk in
> Pacific as compared to Nautilus
> * The same, but for IOPS
>
> Per the current theory, both are due to the loss of rocksdb log
> recycling when using default recovery options in rocksdb 6.8; Octopus
> uses version 6.1.2, Pacific uses 6.8.1.
>
> 16.2.11 largely addressed the bytes-written amplification, but the
> IOPS amplification remains. In practice, whether this results in a
> write performance degradation depends on the speed of the underlying
> media and the workload, and thus the things I mention in the next
> paragraph may or may not be applicable to you.
>
> There's no known workaround or solution for this at this time. In some
> cases I've seen that disabling bluefs_buffered_io (which itself can
> cause IOPS amplification in some cases) can help; I think most folks
> do this by setting it in local conf and then restarting OSDs in order
> to gain the config change. Something else to consider is
>
> https://docs.ceph.com/en/quincy/start/hardware-recommendations/#write-caches
> ,
> as sometimes disabling these write caches can improve the IOPS
> performance of SSDs.
>
> Josh
>
> On Tue, Feb 28, 2023 at 7:19 AM Boris Behrens  wrote:
> >
> > Hi Josh,
> > we upgraded 15.2.17 -> 16.2.11 and we only use rbd workload.
> >
> >
> >
> > Am Di., 28. Feb. 2023 um 15:00 Uhr schrieb Josh Baergen <
> jbaer...@digitalocean.com>:
> >>
> >> Hi Boris,
> >>
> >> Which version did you upgrade from and to, specifically? And what
> >> workload are you running (RBD, etc.)?
> >>
> >> Josh
> >>
> >> On Tue, Feb 28, 2023 at 6:51 AM Boris Behrens  wrote:
> >> >
> >> > Hi,
> >> > today I did the first update from octopus to pacific, and it looks
> like the
> >> > avg apply latency went up from 1ms to 2ms.
> >> >
> >> > All 36 OSDs are 4TB SSDs and nothing else changed.
> >> > Someone knows if this is an issue, or am I just missing a config
> value?
> >> >
> >> > Cheers
> >> >  Boris
> >> > ___
> >> > ceph-users mailing list -- ceph-users@ceph.io
> >> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >
> >
> > --
> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-02-28 Thread Boris Behrens
Hi Josh,
we upgraded 15.2.17 -> 16.2.11 and we only use rbd workload.



Am Di., 28. Feb. 2023 um 15:00 Uhr schrieb Josh Baergen <
jbaer...@digitalocean.com>:

> Hi Boris,
>
> Which version did you upgrade from and to, specifically? And what
> workload are you running (RBD, etc.)?
>
> Josh
>
> On Tue, Feb 28, 2023 at 6:51 AM Boris Behrens  wrote:
> >
> > Hi,
> > today I did the first update from octopus to pacific, and it looks like
> the
> > avg apply latency went up from 1ms to 2ms.
> >
> > All 36 OSDs are 4TB SSDs and nothing else changed.
> > Someone knows if this is an issue, or am I just missing a config value?
> >
> > Cheers
> >  Boris
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] avg apply latency went up after update from octopus to pacific

2023-02-28 Thread Boris Behrens
Hi,
today I did the first update from octopus to pacific, and it looks like the
avg apply latency went up from 1ms to 2ms.

All 36 OSDs are 4TB SSDs and nothing else changed.
Someone knows if this is an issue, or am I just missing a config value?

Cheers
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: growing osd_pglog_items (was: increasing PGs OOM kill SSD OSDs (octopus) - unstable OSD behavior)

2023-02-23 Thread Boris Behrens
After reading a lot about it I still don't understand how this happened and
what I can do to fix it.

This only trims the pglog, but not the duplicates:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-41 --op
trim-pg-log --pgid 8.664
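
For completeness, this is the loop I use to run that over every PG of a stopped
OSD (OSD id and path are just examples, and it still leaves the dup entries
alone):

$ systemctl stop ceph-osd@41
$ for pg in $(ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-41 --op list-pgs); do
    # trim the pg_log of every PG hosted on this OSD
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-41 --op trim-pg-log --pgid "$pg"
  done
$ systemctl start ceph-osd@41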

I also tried to recreate the OSDs (sync out, crush rm, wipe disk, create new
osd, sync in), but the osd_pglog_items value seems to grow again after everything
is synced back in (I have 8TB disks that are at around 10 million items one
day after I synced them back in). It has not yet reached the old value, which
is around 50 million, but it is still growing.

Is there anything I can do with an octopus cluster, or is the only way to
upgrade?
And why does it happen?

Am Di., 21. Feb. 2023 um 18:31 Uhr schrieb Boris Behrens :

> Thanks a lot Josh. That really seems like my problem.
> That does not look healthy in the cluster. oof.
> ~# ceph tell osd.* perf dump |grep 'osd_pglog\|^osd\.[0-9]'
> osd.0: {
> "osd_pglog_bytes": 459617868,
> "osd_pglog_items": 2955043,
> osd.1: {
> "osd_pglog_bytes": 598414548,
> "osd_pglog_items": 4315956,
> osd.2: {
> "osd_pglog_bytes": 357056504,
> "osd_pglog_items": 1942486,
> osd.3: {
> "osd_pglog_bytes": 436198324,
> "osd_pglog_items": 2863501,
> osd.4: {
> "osd_pglog_bytes": 373516972,
> "osd_pglog_items": 2127588,
> osd.5: {
> "osd_pglog_bytes": 335471560,
> "osd_pglog_items": 1822608,
> osd.6: {
> "osd_pglog_bytes": 391814808,
> "osd_pglog_items": 2394209,
> osd.7: {
> "osd_pglog_bytes": 541849048,
> "osd_pglog_items": 3880437,
> ...
>
>
> Am Di., 21. Feb. 2023 um 18:21 Uhr schrieb Josh Baergen <
> jbaer...@digitalocean.com>:
>
>> Hi Boris,
>>
>> This sounds a bit like https://tracker.ceph.com/issues/53729.
>> https://tracker.ceph.com/issues/53729#note-65 might help you diagnose
>> whether this is the case.
>>
>> Josh
>>
>> On Tue, Feb 21, 2023 at 9:29 AM Boris Behrens  wrote:
>> >
>> > Hi,
>> > today I wanted to increase the PGs from 2k -> 4k and random OSDs went
>> > offline in the cluster.
>> > After some investigation we saw, that the OSDs got OOM killed (I've
>> seen a
>> > host that went from 90GB used memory to 190GB before OOM kills happen).
>> >
>> > We have around 24 SSD OSDs per host and 128GB/190GB/265GB memory in
>> these
>> > hosts. All of them experienced OOM kills.
>> > All hosts are octopus / ubuntu 20.04.
>> >
>> > And on every step new OSDs crashed with OOM. (We now set the
>> pg_num/pgp_num
>> > to 2516 to stop the process).
>> > The OSD logs do not show anything why this might happen.
>> > Some OSDs also segfault.
>> >
>> > I now started to stop all OSDs on a host, and do a "ceph-bluestore-tool
>> > repair" and a "ceph-kvstore-tool bluestore-kv compact" on all OSDs. This
>> > takes for the 8GB OSDs around 30 minutes. When I start the OSDs I
>> instantly
>> > get a lot of slow OPS from all the other OSDs when the OSD come up (the
>> 8TB
>> > OSDs take around 10 minutes with "load_pgs".
>> >
>> > I am unsure what I can do to restore normal cluster performance. Any
>> ideas
>> > or suggestions or maybe even known bugs?
>> > Maybe a line for what I can search in the logs.
>> >
>> > Cheers
>> >  Boris
>> > ___
>> > ceph-users mailing list -- ceph-users@ceph.io
>> > To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: increasing PGs OOM kill SSD OSDs (octopus) - unstable OSD behavior

2023-02-21 Thread Boris Behrens
Thanks a lot Josh. That really seems like my problem.
That does not look healthy in the cluster. oof.
~# ceph tell osd.* perf dump |grep 'osd_pglog\|^osd\.[0-9]'
osd.0: {
"osd_pglog_bytes": 459617868,
"osd_pglog_items": 2955043,
osd.1: {
"osd_pglog_bytes": 598414548,
"osd_pglog_items": 4315956,
osd.2: {
"osd_pglog_bytes": 357056504,
"osd_pglog_items": 1942486,
osd.3: {
"osd_pglog_bytes": 436198324,
"osd_pglog_items": 2863501,
osd.4: {
"osd_pglog_bytes": 373516972,
"osd_pglog_items": 2127588,
osd.5: {
"osd_pglog_bytes": 335471560,
"osd_pglog_items": 1822608,
osd.6: {
"osd_pglog_bytes": 391814808,
"osd_pglog_items": 2394209,
osd.7: {
"osd_pglog_bytes": 541849048,
"osd_pglog_items": 3880437,
...


Am Di., 21. Feb. 2023 um 18:21 Uhr schrieb Josh Baergen <
jbaer...@digitalocean.com>:

> Hi Boris,
>
> This sounds a bit like https://tracker.ceph.com/issues/53729.
> https://tracker.ceph.com/issues/53729#note-65 might help you diagnose
> whether this is the case.
>
> Josh
>
> On Tue, Feb 21, 2023 at 9:29 AM Boris Behrens  wrote:
> >
> > Hi,
> > today I wanted to increase the PGs from 2k -> 4k and random OSDs went
> > offline in the cluster.
> > After some investigation we saw, that the OSDs got OOM killed (I've seen
> a
> > host that went from 90GB used memory to 190GB before OOM kills happen).
> >
> > We have around 24 SSD OSDs per host and 128GB/190GB/265GB memory in these
> > hosts. All of them experienced OOM kills.
> > All hosts are octopus / ubuntu 20.04.
> >
> > And on every step new OSDs crashed with OOM. (We now set the
> pg_num/pgp_num
> > to 2516 to stop the process).
> > The OSD logs do not show anything why this might happen.
> > Some OSDs also segfault.
> >
> > I now started to stop all OSDs on a host, and do a "ceph-bluestore-tool
> > repair" and a "ceph-kvstore-tool bluestore-kv compact" on all OSDs. This
> > takes for the 8GB OSDs around 30 minutes. When I start the OSDs I
> instantly
> > get a lot of slow OPS from all the other OSDs when the OSD come up (the
> 8TB
> > OSDs take around 10 minutes with "load_pgs".
> >
> > I am unsure what I can do to restore normal cluster performance. Any
> ideas
> > or suggestions or maybe even known bugs?
> > Maybe a line for what I can search in the logs.
> >
> > Cheers
> >  Boris
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] increasing PGs OOM kill SSD OSDs (octopus) - unstable OSD behavior

2023-02-21 Thread Boris Behrens
Hi,
today I wanted to increase the PGs from 2k -> 4k and random OSDs went
offline in the cluster.
After some investigation we saw, that the OSDs got OOM killed (I've seen a
host that went from 90GB used memory to 190GB before OOM kills happen).

We have around 24 SSD OSDs per host and 128GB/190GB/265GB memory in these
hosts. All of them experienced OOM kills.
All hosts are octopus / ubuntu 20.04.

And on every step new OSDs crashed with OOM. (We now set the pg_num/pgp_num
to 2516 to stop the process).
The OSD logs do not show anything why this might happen.
Some OSDs also segfault.

I have now started to stop all OSDs on a host and run a "ceph-bluestore-tool
repair" and a "ceph-kvstore-tool bluestore-kv compact" on all OSDs. This
takes around 30 minutes for the 8TB OSDs. When I start the OSDs I instantly
get a lot of slow OPS from all the other OSDs while they come up (the 8TB
OSDs take around 10 minutes in "load_pgs").

I am unsure what I can do to restore normal cluster performance. Any ideas
or suggestions or maybe even known bugs?
Maybe a line for what I can search in the logs.

Cheers
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Very slow snaptrim operations blocking client I/O

2023-02-20 Thread Boris Behrens
Hi,
we've encountered the same issue after upgrading to octopus on one of our
rbd clusters, and now it reappears after the autoscaler lowered the PGs from
8k to 2k for the RBD pool.

What we've done in the past:
- recreated all OSDs after our 2nd incident with slow OPS in a single week
after the ceph upgrade (early September)
- upgraded the OS from CentOS 7 to Ubuntu focal after the third incident
(December)
- offline compacted all OSDs a week ago, because we had some (~500) very old
snapshots lying around and hoped that snaptrim works faster when the
rocksdb is freshly compacted.

After the first incident we had roughly three months of smooth sailing; now,
roughly three months later, we are experiencing these slow OPS again.
This time it might be because we have some OSDs (2TB SSDs) with very few
PGs (~30) and some OSDs (8TB SSDs) with a lot of PGs (~120).
I will try to compact all OSDs and check if it stops again, but I think I
need to bump the pool back up to 4k PGs, because the problem started again
when the autoscaler lowered the PGs.

And from our data (Prometheus), the apply latency goes up to 9 seconds and
it mostly hits the 8TB disks.

I am currently running "time ceph daemon osd.1
calc_objectstore_db_histogram" for all OSDs and get very mixed values, but
none of the values lies in the <1 minute range.
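
In case someone wants to reproduce it, this is the loop I run on each OSD host
(admin socket paths assumed to be the defaults):

$ for sock in /var/run/ceph/ceph-osd.*.asok; do
    echo "== $sock"
    # this can take a while on busy or bloated OSDs
    time ceph daemon "$sock" calc_objectstore_db_histogram | grep num_pgmeta_omap
  done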


Am Mi., 15. Feb. 2023 um 16:42 Uhr schrieb Victor Rodriguez <
vrodrig...@soltecsis.com>:

> An update on this for the record:
>
> To fully solve this I've had to destroy each OSD and create then again,
> one by one. I could have done it one host at a time but I've preferred
> to be on the safest side just in case something else went wrong.
>
> The values for num_pgmeta_omap (which I don't know what it is, yet) on
> the new OSD were similar to other clusters (I've seen from 30 to
> 70 aprox), so I believe the characteristics of the data in the
> cluster does not determine how big or small num_pgmeta_omap should be.
>
> One thing I've noticed is that /bad/ or /damaged/ OSDs (i.e. those
> showing high CPU usage and poor performance doing the trim operation)
> took much more time to calculate their histogram, even if their
> num_pgmeta_opmap was low:
>
> (/bad //OSD)/:
> # time ceph daemon osd.1 calc_objectstore_db_histogram | grep
> "num_pgmeta_omap"
>  "num_pgmeta_omap": 673208,
>
> real1m14,549s
> user0m0,075s
> sys0m0,025s
>
> (/good new OSD/):
> #  time ceph daemon osd.1 calc_objectstore_db_histogram | grep
> "num_pgmeta_omap"
>  "num_pgmeta_omap": 434298,
>
> real0m18,022s
> user0m0,078s
> sys0m0,023s
>
>
> Maybe is worth checking that histogram from time to time as a way to
> measure the OSD "health"?
>
> Again, thanks everyone.
>
>
>
> On 1/30/23 18:18, Victor Rodriguez wrote:
> >
> > On 1/30/23 15:15, Ana Aviles wrote:
> >> Hi,
> >>
> >> Josh already suggested, but I will one more time. We had similar
> >> behaviour upgrading from Nautilus to Pacific. In our case compacting
> >> the OSDs did the trick.
> >
> > Thanks for chimming in! Unfortunately, in my case neither an online
> > compaction (ceph tell osd.ID compact) or an offline repair
> > (ceph-bluestore-tool repair --path /var/lib/ceph/osd/OSD_ID) does
> > help. Compactions seem to compact some amount. I think that OSD log
> > dumps information about the size of rocksdb. It went from this:
> >
> >
> 
>
> >
> >   L0  0/00.00 KB   0.0  0.0 0.0  0.0 3.9 3.9
> > 0.0   1.0  0.0 62.2 64.81 61.59890.728   0  0
> >   L1  3/0   132.84 MB   0.5  7.0 3.9  3.1 5.0
> > 2.0   0.0   1.3 63.8 46.1 112.11 108.5223
> > 4.874 56M  7276K
> >   L2 12/0   690.99 MB   0.8  6.5 1.8  4.7 5.6
> > 0.9   0.1   3.2 21.4 18.5 310.78 307.1428
> > 11.099165M  3077K
> >   L3 54/03.37 GB   0.1  0.9 0.3  0.6 0.5
> > -0.1   0.0   1.6 35.9 20.2 24.84 24.49 4
> > 6.210 24M15M
> >  Sum 69/04.17 GB   0.0 14.4 6.0  8.3 15.1
> > 6.7   0.1   3.8 28.7 30.1 512.54 501.74   144
> > 3.559246M26M
> >  Int  0/00.00 KB   0.0  0.8 0.3  0.5 0.6 0.1
> > 0.0  14.1 27.5 20.7 31.13 30.73 47.783 18M  4086K
> >
> > To this:
> >
> >
> 
>
> >
> >   L0  2/0   72.42 MB   0.5  0.0 0.0  0.0 0.1 0.1
> > 0.0   1.0  0.0 63.2 1.14 0.84 20.572   0  0
> >   L3 48/03.10 GB   0.1  0.0 0.0  0.0 0.0 0.0
> > 0.0   0.0  0.0  0.0 0.00 0.00 00.000   0  0
> >  Sum   

[ceph-users] Re: [RGW - octopus] too many omapkeys on versioned bucket

2023-02-13 Thread Boris Behrens
I've tried it the other way around and let cat print all escaped chars
and then did the grep:

# cat -A omapkeys_list | grep -aFn '/'
9844:/$
9845:/^@v913^@$
88010:M-^@1000_/^@$
128981:M-^@1001_/$

Did anyone ever saw something like this?

Am Mo., 13. Feb. 2023 um 14:31 Uhr schrieb Boris Behrens :

> So here is some more weirdness:
> I've piped a list of all omapkeys into a file: (dedacted customer data
> with placeholders in <>)
>
> # grep -aFn '//' omapkeys_list
> 9844://
> 9845://v913
> 88010:�1000_//
> 128981:�1001_//
>
> # grep -aFn '/'
> omapkeys_list
> 
>
> # vim omapkeys_list +88010 (copy pasted from terminal)
> <80>1000_//^@
>
> Any idea what this is?
>
> Am Mo., 13. Feb. 2023 um 13:57 Uhr schrieb Boris Behrens :
>
>> Hi,
>> I have one bucket that showed up with a large omap warning, but the
>> amount of objects in the bucket, does not align with the amount of omap
>> keys. The bucket is sharded to get rid of the "large omapkeys" warning.
>>
>> I've counted all the omapkeys of one bucket and it came up with 33.383.622
>> (rados -p INDEXPOOL listomapkeys INDEXOBJECT | wc -l)
>> I've checked the amount of actual rados objects and it came up with
>> 17.095.877
>> (rados -p DATAPOOL ls | grep BUCKETMARKER | wc -l)
>> I've checked the bucket index and it came up with 16.738.482
>> (radosgw-admin bi list --bucket BUCKET | grep -F '"idx":' | wc -l)
>>
>> I have tried to fix it with
>> radosgw-admin bucket check --check-objects --fix --bucket BUCKET
>> but this did not change anything.
>>
>> Is this a known bug or might there be something else going on. How can I
>> investigate further?
>>
>> Cheers
>>  Boris
>> --
>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
>> groüen Saal.
>>
>
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [RGW - octopus] too many omapkeys on versioned bucket

2023-02-13 Thread Boris Behrens
So here is some more weirdness:
I've piped a list of all omapkeys into a file: (redacted customer data with
placeholders in <>)

# grep -aFn '//' omapkeys_list
9844://
9845://v913
88010:�1000_//
128981:�1001_//

# grep -aFn '/'
omapkeys_list


# vim omapkeys_list +88010 (copy pasted from terminal)
<80>1000_//^@

Any idea what this is?

Am Mo., 13. Feb. 2023 um 13:57 Uhr schrieb Boris Behrens :

> Hi,
> I have one bucket that showed up with a large omap warning, but the amount
> of objects in the bucket, does not align with the amount of omap keys. The
> bucket is sharded to get rid of the "large omapkeys" warning.
>
> I've counted all the omapkeys of one bucket and it came up with 33.383.622
> (rados -p INDEXPOOL listomapkeys INDEXOBJECT | wc -l)
> I've checked the amount of actual rados objects and it came up with
> 17.095.877
> (rados -p DATAPOOL ls | grep BUCKETMARKER | wc -l)
> I've checked the bucket index and it came up with 16.738.482
> (radosgw-admin bi list --bucket BUCKET | grep -F '"idx":' | wc -l)
>
> I have tried to fix it with
> radosgw-admin bucket check --check-objects --fix --bucket BUCKET
> but this did not change anything.
>
> Is this a known bug or might there be something else going on. How can I
> investigate further?
>
> Cheers
>  Boris
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] [RGW - octopus] too many omapkeys on versioned bucket

2023-02-13 Thread Boris Behrens
Hi,
I have one bucket that showed up with a large omap warning, but the number
of objects in the bucket does not align with the number of omap keys. The
bucket is sharded to get rid of the "large omapkeys" warning.

I've counted all the omapkeys of one bucket and it came up with 33.383.622
(rados -p INDEXPOOL listomapkeys INDEXOBJECT | wc -l)
I've checked the amount of actual rados objects and it came up with
17.095.877
(rados -p DATAPOOL ls | grep BUCKETMARKER | wc -l)
I've checked the bucket index and it came up with 16.738.482
(radosgw-admin bi list --bucket BUCKET | grep -F '"idx":' | wc -l)
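
To see whether the extra keys pile up in specific shards, I also count the omap
keys per index shard object like this (INDEXPOOL is the same placeholder as
above, and BUCKETID is the bucket id from radosgw-admin bucket stats):

$ for shard in $(rados -p INDEXPOOL ls | grep BUCKETID); do
    printf '%s %s\n' "$(rados -p INDEXPOOL listomapkeys "$shard" | wc -l)" "$shard"
  done | sort -n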

I have tried to fix it with
radosgw-admin bucket check --check-objects --fix --bucket BUCKET
but this did not change anything.

Is this a known bug or might there be something else going on. How can I
investigate further?

Cheers
 Boris
-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Migrate a bucket from replicated pool to ec pool

2023-02-13 Thread Boris Behrens
Hi Casey,

changes to the user's default placement target/storage class don't
> apply to existing buckets, only newly-created ones. a bucket's default
> placement target/storage class can't be changed after creation
>

so I can easily update the placement rules for this user and can migrate
existing buckets one at a time. Very cool. Thanks


> you might add the EC pool as a new storage class in the existing
> placement target, and use lifecycle transitions to move the objects.
> but the bucket's default storage class would still be replicated, so
> new uploads would go there unless the client adds a
> x-amz-storage-class header to override it. if you want to change those
> defaults, you'd need to create a new bucket and copy the objects over
>

Can you link me to the documentation? It might be the Monday, but I do not
totally understand that.
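
From what I understand so far, adding the EC pool as an extra storage class to
the existing placement target would look roughly like this (zone/zonegroup/pool
names are made up, and the period commit is only needed when realms are in use):

$ radosgw-admin zonegroup placement add --rgw-zonegroup eu \
    --placement-id default-placement --storage-class EC-BACKUP
$ radosgw-admin zone placement add --rgw-zone eu-central-1 \
    --placement-id default-placement --storage-class EC-BACKUP \
    --data-pool eu-central-1.rgw.buckets.data.ec
$ radosgw-admin period update --commit

If I read your mail correctly, new uploads would then still land in the
replicated (default) storage class unless the client sends
x-amz-storage-class: EC-BACKUP or a lifecycle transition moves them later.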

Do you know how much more CPU/RAM EC takes, and when (putting, reading, or
deleting objects, or recovering from an OSD failure)?


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Migrate a bucket from replicated pool to ec pool

2023-02-11 Thread Boris Behrens
Hi,
we use rgw as our backup storage, and it basically holds only compressed
rbd snapshots.
I would love to move these out of the replicated into a ec pool.

I've read that I can set a default placement target for a user (
https://docs.ceph.com/en/octopus/radosgw/placement/). What happens to
the existing user data?

How do I move the existing data to the new pool?
Does it somehow interfere with ongoing data upload (it is one internal
user, with 800 buckets which constantly get new data and old data removed)?

Cheers
 Boris

ps: Can't wait to see some of you at the cephalocon :)

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: PG_BACKFILL_FULL

2023-01-16 Thread Boris Behrens
Hmm... I ran into a similar issue.

IMHO there are two ways to work around the problem until the new disk is in
place:
1. change the backfill full threshold (I use these commands:
https://www.suse.com/support/kb/doc/?id=19724)
2. reweight the backfill full OSDs just a little bit, so they move data to
disks that are free enough (i.e. `ceph osd reweight osd.60 0.9`) - if you
have enough capacity in the cluster (577+ OSDs should be able to take that
:) )
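
A sketch of both options with concrete commands (the 0.92 is only an example
and must stay below your full_ratio):

$ ceph osd dump | grep ratio                 # check the current ratios first
$ ceph osd set-backfillfull-ratio 0.92       # option 1: raise the backfillfull threshold
$ ceph osd reweight osd.60 0.9               # option 2: push some data off the full OSD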

Cheers
 Boris


Am Mo., 16. Jan. 2023 um 15:01 Uhr schrieb Iztok Gregori <
iztok.greg...@elettra.eu>:

> Hi to all!
>
> We are in a situation where we have 3 PG in
> "active+remapped+backfill_toofull". It happened when we executed a
> "gentle-reweight" to zero of one OSD (osd.77) to swap it with a new one
> (the current one registered some read errors and it's to be replaced
> just-in-case).
>
> > # ceph health detail:
> > [WRN] PG_BACKFILL_FULL: Low space hindering backfill (add storage if
> this doesn't resolve itself): 3 pgs backfill_toofull
> > pg 10.46c is active+remapped+backfill_toofull, acting [77,607,96]
> > pg 10.8ad is active+remapped+backfill_toofull, acting [577,152,77]
> > pg 10.b15 is active+remapped+backfill_toofull, acting [483,348,77]
>
> Our cluster is a little unbalanced and we have 7 OSD nearfull (I think
> it's because we have 4 nodes with 6 TB disks and the other 19 have 10
> TB, but should be unrelated, why is the cluster unbalanced I mean, to
> the backfill_toofull ) not by too much (less then 88%). I'm not too much
> worried about it, we will add new storage this month (if the servers
> will arrive) and we will get rid of the old 6 TB servers.
>
> If I dump the PGs I see, if I'm not mistaken, that the osd.77 will be
> "replaced" by the osd.60, which is one of the nearfull ones (the top one
> with 87.53% used).
>
>
> > # ceph pg dump:
> >
> > 10.b15 37236   0 0  372360
> 1552496209920   0   5265  5265
> active+remapped+backfill_toofull  2023-01-16T14:45:46.155801+0100
> 305742'144106  305742:901513   [483,348,60] 483   [483,348,77]
>483  305211'144056  2023-01-11T10:20:56.600135+0100
> 305211'144056  2023-01-11T10:20:56.600135+0100  0
> > 10.8ad 37518   0 0  375180
> 1563450245120   0   5517  5517
> active+remapped+backfill_toofull  2023-01-16T14:45:38.510038+0100
> 305213'142117  305742:937228   [577,60,152] 577   [577,152,77]
>577  303828'142043  2023-01-06T17:52:02.523104+0100
> 303334'141645  2022-12-20T17:39:22.668083+0100  0
> > 10.46c 36710   0 0  367100
> 1530234434560   0   8172  8172
> active+remapped+backfill_toofull  2023-01-16T14:45:29.284223+0100
> 305298'141437  305741:877331[60,607,96]  60[77,607,96]
> 77  304802'141358  2023-01-08T21:39:23.469198+0100
> 304363'141349  2023-01-01T18:13:45.645494+0100  0
>
> > # ceph osd df:
> >  60hdd   5.45999   1.0  5.5 TiB  4.8 TiB   697 GiB  128 MiB
>  0 B   697 GiB  87.53  1.29   37  up
>
> In this situation was the correct way to address the problem?
> reweight-by-utilization the osd.60 to free up space (the OSD is a 6 TB
> disk, and other OSD on the same host are nearfull)? There is other way
> to manually map a PG to a different OSD?
>
> Thank you for your attention
>
> Iztok Gregori
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RGW - large omaps even when buckets are sharded

2023-01-16 Thread Boris Behrens
Hi,
since last week the scrubbing results in large omap warning.
After some digging I've got these results:

# searching for indexes with large omaps:
$ for i in `rados -p eu-central-1.rgw.buckets.index ls`; do
    rados -p eu-central-1.rgw.buckets.index listomapkeys $i | wc -l | tr -d '\n' >> omapkeys
    echo " - ${i}" >> omapkeys
done

$ sort -n omapkeys | tail -n 15
212010 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2342226177.1.0
212460 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2342226177.1.3
212466 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2342226177.1.10
213165 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2342226177.1.4
354692 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1.7
354760 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1.5
354799 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1.1
355040 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1.10
355874 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1.2
355930 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1.3
356499 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1.6
356583 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1.8
356925 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1.4
356935 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1.9
358986 - .dir.ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1.0

So I have a bucket (ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1) with
11 shards where each shard got around 350k omapkeys.
When checking which bucket it is, I get a totally different number:

$ radosgw-admin bucket stats
 --bucket-id=ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1
{
"bucket": "bucket",
"num_shards": 11,
"tenant": "",
"zonegroup": "da651dc1-2663-4e1b-af2e-ac4454f24c9d",
"placement_rule": "default-placement",
"explicit_placement": {
"data_pool": "",
"data_extra_pool": "",
"index_pool": ""
},
"id": "ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2421332952.1",
"marker": "ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2296333939.13",
"index_type": "Normal",
"owner": "user",
"ver":
"0#45265,1#44764,2#44631,3#44777,4#44859,5#44637,6#44814,7#44506,8#44853,9#44764,10#44813",
"master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0",
"mtime": "2022-11-16T08:34:17.298979Z",
"creation_time": "2021-11-16T09:13:34.480637Z",
"max_marker": "0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#",
"usage": {
"rgw.main": {
"size": 66897607205,
"size_actual": 68261179392,
"size_utilized": 66897607205,
"size_kb": 65329695,
"size_kb_actual": 1308,
"size_kb_utilized": 65329695,
"num_objects": 663369
},
"rgw.multimeta": {
"size": 0,
"size_actual": 0,
"size_utilized": 0,
"size_kb": 0,
"size_kb_actual": 0,
"size_kb_utilized": 0,
"num_objects": 0
}
},
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
}
}

It got 11 shards, with a total of 663k files. radosgw-admin bucket limit
check gives 60k objects per shard.

After getting a list with all omapkeys (total of 3917043) I see that
entries look like this (only in less, not in cat) - the ^@ char:
$ grep -aF object1 2421332952.1_omapkeys
object1
object1^@v910^@i3nb5Cdp00wrt3Phhbn4MgwTcsM7sdwK
object1^@v913^@iPVPdb60UlfOu4Mwzr.oqojwWzRdgheZ
<80>1000_object1^@i3nb5Cdp00wrt3Phhbn4MgwTcsM7sdwK
<80>1000_object1^@iPVPdb60UlfOu4Mwzr.oqojwWzRdgheZ
<80>1001_object1

I also pulled the whole bucket index of said bucket via radosgw-admin bi
list --bucket bucket > bucket_index_list and searched via jq for the
object1:
$ jq '.[] | select(.entry.name == "object1")' bucket_index_list
{
  "type": "plain",
  "idx": "object1",
  "entry": {
"name": "object1",
"instance": "",
"ver": {
  "pool": -1,
  "epoch": 0
},
"locator": "",
"exists": "false",
"meta": {
  "category": 0,
  "size": 0,
  "mtime": "0.00",
  "etag": "",
  "storage_class": "",
  "owner": "",
  "owner_display_name": "",
  "content_type": "",
  "accounted_size": 0,
  "user_data": "",
  "appendable": "false"
},
"tag": "",
"flags": 8,
"pending_map": [],
"versioned_epoch": 0
  }
}
{
  "type": "plain",
  "idx": "object1\uv910\ui3nb5Cdp00wrt3Phhbn4MgwTcsM7sdwK",
  "entry": {
"name": "object1",
"instance": "3nb5Cdp00wrt3Phhbn4MgwTcsM7sdwK",
"ver": {
  "pool": -1,
  "epoch": 0
},
"locator": "",
"exists": "false",
"meta": {
  "category": 0,
  "size": 0,
  "mtime": "2022-12-16T00:00:28.651053Z",
  "etag": "",
  "storage_class": "",
  "owner": "user",
  "owner_display_name": "user",
  "content_type": "",
  "accounted_size": 0,
  "user_data": 

[ceph-users] radosgw ceph.conf question

2023-01-13 Thread Boris Behrens
Hi,
I am just reading through this document (
https://docs.ceph.com/en/octopus/radosgw/config-ref/) and at the top it
states:

The following settings may added to the Ceph configuration file (i.e.,
> usually ceph.conf) under the [client.radosgw.{instance-name}] section.
>

And my ceph.conf looks like this:

[client.eu-central-1-s3db3]
> rgw_frontends = beast endpoint=[::]:7482
> rgw_region = eu
> rgw_zone = eu-central-1
>
> [client.eu-central-1-s3db3-old]
> rgw_frontends = beast endpoint=[::]:7480
> rgw_region = eu
> rgw_zone = eu-central-1
>
> [client.eu-customer-1-s3db3]
> rgw_frontends = beast endpoint=[::]:7481
> rgw_region = eu-someother
> rgw_zone = eu-someother-1
>

Do I need to change the section names? It also seems that rgw_region is a
non-existing config value (this might have come from very old RHCS
documentation)

Would be very nice if someone could help me clarify this.

Cheers and happy weekend
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Octopus RGW large omaps in usage

2023-01-10 Thread Boris Behrens
Hi,
I am currently trying to figure out how to resolve the
"large objects found in pool 'rgw.usage'"
error.

In the past I trimmed the usage log, but now I am at the point that I need
to trim it down to two weeks.

I checked the amount of omap keys, and the distribution is quite off:

# for OBJECT in `rados -p eu-central-1.rgw.usage ls`; do
  rados -p eu-central-1.rgw.usage listomapkeys ${OBJECT} | wc -l
done

86968
144388
6188
87854
46652
194788
46234
9622
45768
28376
104348
10018
2
34374
44744
40638
93664
35476
107794
18020
7172
17836
37344
73496
15572
31570
149352
740
113566
35292
5318
442176


Maybe it would be an option to increase the rgw_usage_max_user_shards value?
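
What I am looking at right now is trimming the usage log down and spreading it
over more shards, roughly like this (the date and shard count are only examples,
and as far as I understand the shard setting only affects newly written usage
entries):

$ radosgw-admin usage trim --end-date=2022-12-27     # keep roughly the last two weeks
$ ceph config set client.rgw rgw_usage_max_user_shards 8
# restart the rgw daemons afterwards so they pick up the new value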

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: docs.ceph.com -- Do you use the header navigation bar? (RESPONSES REQUESTED)

2023-01-09 Thread Boris Behrens
I actually do not mind if I need to scroll up a line, but I also think
it is a good idea to remove it.

Am Mo., 9. Jan. 2023 um 11:06 Uhr schrieb Frank Schilder :
>
> Hi John,
>
> firstly, image attachments are filtered out by the list. How about you upload 
> the image somewhere like https://imgur.com/ and post a link instead?
>
> In my browser, the sticky header contains only "home" and "edit on github", 
> which are both entirely useless for a user. What exactly is "header 
> navigation" expected to do if it contains nothing else? Unless I'm looking at 
> the wrong thing (I can't see the attached image), this header can be removed. 
> The "edit on github" link can be added to the end of a page.
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: John Zachary Dover 
> Sent: 04 January 2023 16:35:56
> To: ceph-users
> Subject: [ceph-users] docs.ceph.com -- Do you use the header navigation bar? 
> (RESPONSES REQUESTED)
>
> Do you use the header navigation bar on docs.ceph.com? See the attached
> file (sticky_header.png) if you are unsure of what "header navigation bar"
> means. In the attached file, the header navigation bar is indicated by
> means of two large, ugly, red-and-green arrows.
>
> *Cards on the Table*
> The navigation bar is the kind of thing that is sometimes referred to as a
> "sticky header", and it can get in the way of linked-to sections. I would
> like to remove this header bar. If there is community support for the
> header bar, though, I won't remove it.
>
> *What is Zac Complaining About?*
> Follow this procedure to see the behavior that has provoked my complaint:
>
>1. Go to https://docs.ceph.com/en/quincy/glossary/
>2. Scroll down to the "Ceph Cluster Map" entry.
>3. Click the "Cluster Map" link in the line that reads "See Cluster Map".
>4. Notice that the header navigation bar obscures the headword "Cluster
>Map".
>
> If you have any opinion at all on this matter, voice it. Please.
>
> Zac Dover
> Docs
> Upstream Ceph
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend
im groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rgw - unable to remove some orphans

2023-01-03 Thread Boris Behrens
Hi Andrei,
happy new year to you too.

The file might already be removed.
You can check if the rados object is there with `rados -p ls ...`
You can also check if the file is still in the bucket with
`radosgw-admin bucket radoslist --bucket BUCKET`

Cheers
 Boris

Am Di., 3. Jan. 2023 um 13:47 Uhr schrieb Andrei Mikhailovsky
:
>
> Happy New Year everyone!
>
> I have a bit of an issue with removing some of the orphan objects that were 
> generated with the rgw-orphan-list tool. Over the years rgw generated over 14 
> million orphans with an overall waste of over 100TB in size, considering the 
> overall data stored in rgw was well under 10TB at max. Anyways, I have 
> managed to remove around 12m objects over the holiday season, but there are 
> just over 2m orphans which were not removed. Here is an example of one of the 
> objects taken from the orphans list file:
>
> $ rados -p .rgw.buckets rm 'default.775634629.1__multipart_SQL 
> Backups/ALL-POND-LIVE_backup_2021_05_26_204508_8473183.d20210526-u200953.bak.s26895803904.zip.0e6LO9b4w9H3HepY-3IW_JSOaysLdFs.1_92'
>
> error removing .rgw.buckets>default.775634629.1__shadow_SQL 
> Backups/ALL-POND-LIVE_backup_2021_05_26_204508_8473183.d20210526-u200953.bak.s26895803904.zip.0e6LO9b4w9H3HepY-3IW_JSOaysLdFs.1_92:
>  (2) No such file or directory
>
> Checking the presence of the object with the rados tool shows that the object 
> is there.
>
> $ cat orphan-list-20230103105849.out |grep -a JSOaysLdFs |grep -a 92
> default.775634629.1__shadow_SQL 
> Backups/ALL-POND-LIVE_backup_2021_05_26_204508_8473183.d20210526-u200953.bak.s26895803904.zip.0e6LO9b4w9H3HepY-3IW_JSOaysLdFs.1_92
>
> $ cat rados-20230103105849.intermediate |grep -a JSOaysLdFs |grep -a 92
> default.775634629.1__shadow_SQL 
> Backups/ALL-POND-LIVE_backup_2021_05_26_204508_8473183.d20210526-u200953.bak.s26895803904.zip.0e6LO9b4w9H3HepY-3IW_JSOaysLdFs.1_92
>
>
> Why can't I remove it? I have around 2m objects which can't be removed. What 
> can I do to remove them?
>
> Thanks
>
> Andrei
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend
im groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RGW access logs with bucket name

2023-01-03 Thread Boris Behrens
Hi,
I am looking to move our logs from
/var/log/ceph/ceph-client...log to our log aggregator.

Is there a way to have the bucket name in the log file?

Or can I write the ops log (rgw_enable_ops_log) to a file? Maybe I could work with that.
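
The direction I am currently looking into is the ops log socket, which does
contain the bucket name per request; a rough sketch of what I have in mind, if I
understand the ops log socket correctly (socket path is arbitrary, and netcat is
only there as a stand-in for the log shipper):

[client.rgw.instance-name]
rgw_enable_ops_log = true
rgw_ops_log_socket_path = /var/run/ceph/rgw-ops-log.sock

# rgw should create the socket and stream JSON entries to whatever connects to it
$ nc -U /var/run/ceph/rgw-ops-log.sock | tee -a /var/log/ceph/rgw-ops.json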

Cheers and happy new year
 Boris
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to shutdown a ceph node

2022-12-31 Thread Boris
Yes I do. It's the ceph default and we use it on any cluster size (smallest is
3 hosts with 6 disks each) and it removes a lot of headache. :-)

And as the OP did not provide any config, I assumed that he uses the default.

Happy new year. 

> Am 31.12.2022 um 15:11 schrieb Anthony D'Atri :
> 
> Are you using size=3 replication and failure domain = host?  If so you’ll be 
> ok.
> We see folks sometimes using an EC profile that will result in PGs down, 
> especially with such a small cluster.
> 
>> On Dec 31, 2022, at 4:11 AM, Boris  wrote:
>> 
>> Hi,
>> I usually do 'ceph osd set noout' and 'ceph osd set norebalance' and then 
>> shut down the OS normally.
>> 
>> After everything is done I unset bot values and let the objects recover. 
>> 
>> Cheers and happy new year. 
>> 
>>>> Am 31.12.2022 um 08:52 schrieb Bülent ŞENGÜLER :
>>> 
>>> Hello,
>>> 
>>> I have a ceph cluster with 4 nodes and  İ have to shutdown one node of them
>>> due to electricity maintaince. I found how a cluster shutdown but I could
>>> not find to close a node. How can I power off a node gracefully.Thanks for
>>> answer.
>>> 
>>> Regards.
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to shutdown a ceph node

2022-12-31 Thread Boris
Hi,
I usually do 'ceph osd set noout' and 'ceph osd set norebalance' and then shut 
down the OS normally.

After everything is done I unset both values and let the objects recover.
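
Spelled out, the whole sequence I use looks like this (nothing exotic, just the
two flags mentioned above):

$ ceph osd set noout
$ ceph osd set norebalance
# shut the node down, do the maintenance, boot it again
$ ceph osd unset noout
$ ceph osd unset norebalance
$ ceph -s    # wait until all PGs are active+clean again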

Cheers and happy new year. 

> Am 31.12.2022 um 08:52 schrieb Bülent ŞENGÜLER :
> 
> Hello,
> 
> I have a ceph cluster with 4 nodes and  İ have to shutdown one node of them
> due to electricity maintaince. I found how a cluster shutdown but I could
> not find to close a node. How can I power off a node gracefully.Thanks for
> answer.
> 
> Regards.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: nautilus mgr die when the balancer runs

2022-12-13 Thread Boris
After some manual rebalancing, all PGs went into a clean state and I was able
to start the balancer again.

¯\_(ツ)_/¯ 

> Am 14.12.2022 um 01:18 schrieb Boris Behrens :
> 
> Hi,
> we had an issue with an old cluster, where we put disks from one host
> to another.
> We destroyed the disks and added them as new OSDs, but since then the
> mgr daemon were restarting in 120s intervals.
> 
> I tried to debug it a bit, and it looks like the balancer is the problem.
> I tried to disable it and create a plan on my own, but then the active
> manager just stops and the failover takes place.
> 
> Any idea how to easily fix it or debug it further?
> 
> I am currently trying to resolve the backfillfull issue by hand (ceph
> osd reweight) and then to restart all OSD hosts.
> 
> -- 
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend
> im groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: octopus (15.2.16) OSDs crash or don't answer heathbeats (and get marked as down)

2022-12-13 Thread Boris Behrens
You could try to do this in a screen session for a while.
while true; do radosgw-admin gc process; done

Maybe your normal RGW daemons are too busy for GC processing.
We have this in our config and have started extra RGW instances for GC only:
[global]
...
# disable garbage collector default
rgw_enable_gc_threads = false
[client.gc-host1]
rgw_frontends = "beast endpoint=[::1]:7489"
rgw_enable_gc_threads = true
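
To check whether the backlog actually shrinks while that loop runs, counting the
entries on the GC list from time to time is usually enough (this can take a
while when the list is long):

$ radosgw-admin gc list --include-all | grep -c '"tag"'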

Am Mi., 14. Dez. 2022 um 01:14 Uhr schrieb Jakub Jaszewski
:
>
> Hi Boris, many thanks for the link!
>
> I see that GC list keep growing on my cluster and there are some very big 
> multipart objects on the GC list, even 138660 parts that I calculate as 
> >500GB in size.
> These objects are visible on the GC list but not on rados-level when calling 
> radosgw-admin --bucket=bucket_name bucket radoslist
> Also I manually called GC process,  radosgw-admin gc process 
> --bucket=bucket_name --debug-rgw=20   which according to logs did the job (no 
> errors raised although objects do not exist in rados?)
> ...
> 2022-12-13T20:21:06.635+0100 7fe0eb771080 20 garbage collection: 
> RGWGC::process iterating over entry tag='2~rBQjiZ4SWUf8u9IS1BUXGEwCnFTHqfD', 
> time=2022-12-13T12:35:59.727067+0100, chain.objs.size()=138660
> 2022-12-13T20:21:06.635+0100 7fe0eb771080  5 garbage collection: 
> RGWGC::process removing 
> default.rgw.buckets.data:b4a09486-4fb6-474a-a45a-3fc6f7778e27.6781345.2__multipart_2ib3aonh7thn59a394l06un5i9lu2fhf1r2sl2g6rrhqbqhv6pjg.2~rBQjiZ4SWUf8u9IS1BUXGEwCnFTHqfD.1
> 2022-12-13T20:21:06.703+0100 7fe0eb771080  5 garbage collection: 
> RGWGC::process removing 
> default.rgw.buckets.data:b4a09486-4fb6-474a-a45a-3fc6f7778e27.6781345.2__shadow_2ib3aonh7thn59a394l06un5i9lu2fhf1r2sl2g6rrhqbqhv6pjg.2~rBQjiZ4SWUf8u9IS1BUXGEwCnFTHqfD.1_1
> 2022-12-13T20:21:06.859+0100 7fe0eb771080  5 garbage collection: 
> RGWGC::process removing 
> default.rgw.buckets.data:b4a09486-4fb6-474a-a45a-3fc6f7778e27.6781345.2__shadow_2ib3aonh7thn59a394l06un5i9lu2fhf1r2sl2g6rrhqbqhv6pjg.2~rBQjiZ4SWUf8u9IS1BUXGEwCnFTHqfD.1_2
> ...
> but GC queue did not reduce, objects are still on the GC list.
>
> Do you happen to know how to remove non existent RADOS objects from RGW GC 
> list ?
>
> One more thing i have to check is max_secs=3600 for GC when entering 
> particular index_shard. As you can see in the logs, processing of multiparted 
> objects takes more than 3600 seconds.  I will try to increase 
> rgw_gc_processor_max_time
>
> 2022-12-13T20:20:13.168+0100 7fe0eb771080 20 garbage collection: 
> RGWGC::process entered with GC index_shard=25, max_secs=3600, expired_only=1
> 2022-12-13T20:20:13.168+0100 7fe0eb771080 20 garbage collection: 
> RGWGC::process cls_rgw_gc_list returned with returned:0, entries.size=0, 
> truncated=0, next_marker=''
> 2022-12-13T20:20:13.172+0100 7fe0eb771080 20 garbage collection: 
> RGWGC::process cls_rgw_gc_list returned NO non expired entries, so setting 
> cache entry to TRUE
> 2022-12-13T20:20:27.748+0100 7fe02700  2 
> RGWDataChangesLog::ChangesRenewThread: start
> 2022-12-13T20:20:49.748+0100 7fe02700  2 
> RGWDataChangesLog::ChangesRenewThread: start
> ...
> 2022-12-13T20:21:05.339+0100 7fe0eb771080 20 garbage collection: 
> RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, 
> entries.size=100, truncated=1, next_marker='4/20986990'
> 2022-12-13T20:21:06.635+0100 7fe0eb771080 20 garbage collection: 
> RGWGC::process iterating over entry tag='2~rBQjiZ4SWUf8u9IS1BUXGEwCnFTHqfD', 
> time=2022-12-13T12:35:59.727067+0100, chain.objs.size()=138660
> 2022-12-13T20:21:06.635+0100 7fe0eb771080  5 garbage collection: 
> RGWGC::process removing 
> default.rgw.buckets.data:b4a09486-4fb6-474a-a45a-3fc6f7778e27.6781345.2__multipart_2ib3aonh7thn59a394l06un5i9lu2fhf1r2sl2g6rrhqbqhv6pjg.2~rBQjiZ4SWUf8u9IS1BUXGEwCnFTHqfD.1
> 2022-12-13T20:21:06.703+0100 7fe0eb771080  5 garbage collection: 
> RGWGC::process removing 
> default.rgw.buckets.data:b4a09486-4fb6-474a-a45a-3fc6f7778e27.6781345.2__shadow_2ib3aonh7thn59a394l06un5i9lu2fhf1r2sl2g6rrhqbqhv6pjg.2~rBQjiZ4SWUf8u9IS1BUXGEwCnFTHqfD.1_1
> ...
> 2022-12-13T21:31:23.505+0100 7fe0eb771080  5 garbage collection: 
> RGWGC::process removing 
> default.rgw.buckets.data:b4a09486-4fb6-474a-a45a-3fc6f7778e27.6781345.2__
> shadow_2ib3aonh7thn59a394l06un5i9lu2fhf1r2sl2g6rrhqbqhv6pjg.2~rBQjiZ4SWUf8u9IS1BUXGEwCnFTHqfD.4622_29
> ...
> 2022-12-13T21:31:23.565+0100 7fe0eb771080 20 garbage collection: 
> RGWGC::process entered with GC index_shard=26, max_secs=3600, expired_only=1
> ...
>
> Best Regards
> Jakub
>
> On Wed, Dec 7, 2022 at 6:10 PM Boris  wrote:
>>
>> Hi Jakub,
>>
>> the problem is in our case that we hit this bug 
>> (https://tracker.ceph.com/issues/53585)

[ceph-users] nautilus mgr die when the balancer runs

2022-12-13 Thread Boris Behrens
Hi,
we had an issue with an old cluster, where we put disks from one host
to another.
We destroyed the disks and added them as new OSDs, but since then the
mgr daemons were restarting in 120s intervals.

I tried to debug it a bit, and it looks like the balancer is the problem.
I tried to disable it and create a plan on my own, but then the active
manager just stops and the failover takes place.

Any idea how to easily fix it or debug it further?

I am currently trying to resolve the backfillfull issue by hand (ceph
osd reweight) and then to restart all OSD hosts.

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend
im groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: radosgw - limit maximum file size

2022-12-09 Thread Boris Behrens
Hi Eric,

am I reading it correctly that *rgw_max_put_size* only limits files that
are not uploaded as multipart?
My understanding would be, with these default values, that someone can
upload a 5TB file in 1 500MB multipart objects.

But I want to limit the maximum file size, so no one can upload a file
larger than 100GB, no matter how they size the multipart upload. Having
1000 99GB files is fine for me.
I want to mitigate this RGW bug [1], which currently leads to a lot of pain
on our side (some random customer seems to have lost all their rados objects
from a bucket because the GC went nuts [2]).
[1]: https://tracker.ceph.com/issues/53585
[2]:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/5XSUELNB64VTKRYRN6TXB5CU7VITPBVP/
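
Until there is a real per-object cap, the closest thing I can see is shrinking
both knobs so that part size times part count lands at the limit I want, e.g.
100GB as 1GB parts times 100 parts (the values below are only an illustration,
and clients that try to use bigger or more parts would simply get their uploads
rejected, which may or may not be acceptable):

$ ceph config set client.rgw rgw_max_put_size $((1 * 1024 * 1024 * 1024))   # 1 GiB per PUT / per part
$ ceph config set client.rgw rgw_multipart_part_upload_limit 100            # at most 100 parts per upload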

Am Fr., 9. Dez. 2022 um 11:45 Uhr schrieb Eric Goirand :

> Hello Boris,
>
> I think you may be looking for these RGW daemon parameters :
>
> # ceph config help *rgw_max_put_size*
> rgw_max_put_size - Max size (in bytes) of regular (non multi-part) object
> upload.
>   (size, advanced)
>   Default: 5368709120
>   Can update at runtime: true
>   Services: [rgw]
>
> # ceph config help *rgw_multipart_part_upload_limit*
> rgw_multipart_part_upload_limit - Max number of parts in multipart upload
>   (int, advanced)
>   Default: 1
>   Can update at runtime: true
>   Services: [rgw]
>
> *rgw_max_put_size* is set in bytes.
>
> Regards,
> Eric.
>
> On Fri, Dec 9, 2022 at 11:24 AM Boris Behrens  wrote:
>
>> Hi,
>> is it possible to somehow limit the maximum file/object size?
>>
>> I've read that I can limit the size of multipart objects and the amount of
>> multipart objects, but I would like to limit the size of each object in
>> the
>> index to 100GB.
>>
>> I haven't found a config or quota value, that would fit.
>>
>> Cheers
>>  Boris
>>
>> --
>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
>> groüen Saal.
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] radosgw - limit maximum file size

2022-12-09 Thread Boris Behrens
Hi,
is it possible to somehow limit the maximum file/object size?

I've read that I can limit the size of multipart objects and the amount of
multipart objects, but I would like to limit the size of each object in the
index to 100GB.

I haven't found a config or quota value, that would fit.

Cheers
 Boris

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-09 Thread Boris Behrens
Hello together,

@Alex: I am not sure what to look for in /sys/block//device.
There are a lot of files. Is there anything I should check in particular?

> You have sysfs access in /sys/block//device - this will show a lot
> of settings.  You can go to this directory on CentOS vs. Ubuntu, and see if
> any setting is different?
>
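
What I will probably do is dump the queue and device settings on one CentOS host
and on the Ubuntu host and diff them (sdc is just an example device, and some of
the sysfs entries are not readable as text, hence the 2>/dev/null):

$ grep -H . /sys/block/sdc/queue/* /sys/block/sdc/device/* 2>/dev/null \
    > /tmp/sdc-settings.$(hostname)
# copy the two files to one host and compare
$ diff /tmp/sdc-settings.centos-host /tmp/sdc-settings.ubuntu-host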

@Matthias: yes the kernel is an old one (3.10.0-1160.76.1.el7.x86_64)
The await values are not significantly different (something between 0.2 and 3 for
read and 0.1 and 0.4 for write).

> I guess Centos7 has a rather old kernel. What are the kernel versions on
> these hosts?
>
> I have seen a drastic increase in iostat %util numbers on a Ceph cluster
> on Ubuntu hosts, after an Ubuntu upgrade 18.04 => 20.04 => 22.04
> (upgrading Ceph along with it).  iostat %util was up high since, but
> iostat latency values dropped considerably. As the the cluster seemed
> slightly faster overall after these upgrades, I did not worry much about
> increased %util numbers.
>


@Anthony: Thanks for the link. Very nice read.

> https://brooker.co.za/blog/2014/07/04/iostat-pct.html
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-06 Thread Boris Behrens
Hi Janne,
that is a really good idea. Thank you.

I just saw that our only Ubuntu 20.04 host got very high %util (all 8TB disks):
Device  r/s     rkB/s      rrqm/s  %rrqm  r_await  rareq-sz  w/s      wkB/s      wrqm/s   %wrqm  w_await  wareq-sz  d/s   dkB/s  drqm/s  %drqm  d_await  dareq-sz  aqu-sz  %util
sdc     19.00   112.00     0.00    0.00   0.32     5.89      1535.00  68768.00   1260.00  45.08  1.33     44.80     0.00  0.00   0.00    0.00   0.00     0.00      1.44    76.00
sdd     62.00   5892.00    43.00   40.95  2.82     95.03     1196.00  78708.00   1361.00  53.23  2.35     65.81     0.00  0.00   0.00    0.00   0.00     0.00      2.31    72.00
sde     33.00   184.00     0.00    0.00   0.33     5.58      1413.00  102592.00  1709.00  54.74  1.70     72.61     0.00  0.00   0.00    0.00   0.00     0.00      1.68    84.40
sdf     62.00   8200.00    63.00   50.40  9.32     132.26    1066.00  74372.00   1173.00  52.39  1.68     69.77     0.00  0.00   0.00    0.00   0.00     0.00      1.80    70.00
sdg     5.00    40.00      0.00    0.00   0.40     8.00      1936.00  128188.00  2172.00  52.87  2.18     66.21     0.00  0.00   0.00    0.00   0.00     0.00      3.21    92.80
sdh     133.00  8636.00    44.00   24.86  4.14     64.93     1505.00  87820.00   1646.00  52.24  0.95     58.35     0.00  0.00   0.00    0.00   0.00     0.00      1.09    78.80

I've cross-checked the other 8TB disks in our cluster, which are at around
30-50% %util with roughly the same IOPS.
Maybe I am missing some optimization that is done on the CentOS 7 nodes but
not on the Ubuntu 20.04 node (if you know something off the top of your head,
I am happy to hear it).
Maybe it is just measured differently on Ubuntu.

But this was the first node where I restarted the OSDs, and it is where I
waited the longest to see if anything got better. The problem nearly
disappeared within a couple of seconds after the last OSD was restarted.
So I would not blame that node in particular, but I will investigate in
this direction.


Am Di., 6. Dez. 2022 um 10:08 Uhr schrieb Janne Johansson <
icepic...@gmail.com>:

> Perhaps run "iostat -xtcy  5" on the OSD hosts to
> see if any of the drives have weirdly high utilization despite low
> iops/requests?
>
>
> Den tis 6 dec. 2022 kl 10:02 skrev Boris Behrens :
> >
> > Hi Sven,
> > I am searching really hard for defect hardware, but I am currently out of
> > ideas:
> > - checked prometheus stats, but in all that data I don't know what to
> look
> > for (osd apply latency if very low at the mentioned point and went up to
> > 40ms after all OSDs were restarted)
> > - smartctl shows nothing
> > - dmesg show nothing
> > - network data shows nothing
> > - osd and clusterlogs show nothing
> >
> > If anybody got a good tip what I can check, that would be awesome. A
> string
> > in the logs (I made a copy from that days logs), or a tool to fire
> against
> > the hardware. I am 100% out of ideas what it could be.
> > In a time frame of 20s 2/3 of our OSDs went from "all fine" to "I am
> > waiting for the replicas to do their work" (log message 'waiting for sub
> > ops'). But there was no alert that any OSD had connection problems to
> other
> > OSDs. Additional the cluster_network is the same interface, switch,
> > everything as public_network. Only difference is the VLAN id (I plan to
> > remove the cluster_network because it does not provide anything for us).
> >
> > I am also planning to update all hosts from centos7 to ubuntu 20.04
> (newer
> > kernel, standardized OS config and so on).
> >
> > Am Mo., 5. Dez. 2022 um 14:24 Uhr schrieb Sven Kieske <
> s.kie...@mittwald.de
> > >:
> >
> > > On Sa, 2022-12-03 at 01:54 +0100, Boris Behrens wrote:
> > > > hi,
> > > > maybe someone here can help me to debug an issue we faced today.
> > > >
> > > > Today one of our clusters came to a grinding halt with 2/3 of our
> OSDs
> > > > reporting slow ops.
> > > > Only option to get it back to work fast, was to restart all OSDs
> daemons.
> > > >
> > > > The cluster is an octopus cluster with 150 enterprise SSD OSDs. Last
> work
> > > > on the cluster: synced in a node 4 days ago.
> > > >
> > > > The only health issue, that was reported, was the SLOW_OPS. No slow
> pings
> > > > on the networks. No restarting OSDs. Nothing.
> > > >
> > > > I was able to ping it to a 20s timeframe and I read ALL the logs in
> a 20
> > > > minute timeframe around this issue.
> > > >
> > > > I haven't found any clues.
> > > >
> > > > Maybe someon

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-06 Thread Boris Behrens
Hi Sven,
I am searching really hard for defective hardware, but I am currently out of
ideas:
- checked Prometheus stats, but in all that data I don't know what to look
for (osd apply latency is very low at the mentioned point and went up to
40ms after all OSDs were restarted)
- smartctl shows nothing
- dmesg show nothing
- network data shows nothing
- osd and clusterlogs show nothing

If anybody has a good tip on what I can check, that would be awesome: a string
in the logs (I made a copy of that day's logs), or a tool to fire against
the hardware. I am 100% out of ideas what it could be.
Within a time frame of 20s, 2/3 of our OSDs went from "all fine" to "I am
waiting for the replicas to do their work" (log message 'waiting for sub
ops'). But there was no alert that any OSD had connection problems to other
OSDs. Additionally, the cluster_network uses the same interface, switch,
everything as the public_network; the only difference is the VLAN id (I plan
to remove the cluster_network because it does not provide anything for us).

I am also planning to update all hosts from centos7 to ubuntu 20.04 (newer
kernel, standardized OS config and so on).

Am Mo., 5. Dez. 2022 um 14:24 Uhr schrieb Sven Kieske :

> On Sa, 2022-12-03 at 01:54 +0100, Boris Behrens wrote:
> > hi,
> > maybe someone here can help me to debug an issue we faced today.
> >
> > Today one of our clusters came to a grinding halt with 2/3 of our OSDs
> > reporting slow ops.
> > Only option to get it back to work fast, was to restart all OSDs daemons.
> >
> > The cluster is an octopus cluster with 150 enterprise SSD OSDs. Last work
> > on the cluster: synced in a node 4 days ago.
> >
> > The only health issue that was reported was SLOW_OPS. No slow pings
> > on the networks. No restarting OSDs. Nothing.
> >
> > I was able to pin it down to a 20s timeframe and I read ALL the logs in a
> > 20-minute timeframe around this issue.
> >
> > I haven't found any clues.
> >
> > Maybe someone encountered this in the past?
>
> Do you happen to run your RocksDB on a dedicated caching device (NVMe SSD)?
>
> I observed slow ops in Octopus after a faulty NVMe SSD was inserted in one
> Ceph server.
> As was said in other mails, try to isolate your root cause.
>
> Maybe the node added 4 days ago was the culprit here?
>
> We were able to pinpoint the NVMe by monitoring the slow OSDs,
> and the commonality in this case was the same NVMe cache device.
>
> You should always benchmark new hardware / perform burn-in tests IMHO, which
> is not always possible due to environment constraints.
>
> --
> Mit freundlichen Grüßen / Regards
>
> Sven Kieske
> Systementwickler / systems engineer
>
>
> Mittwald CM Service GmbH & Co. KG
> Königsberger Straße 4-6
> 32339 Espelkamp
>
> Tel.: 05772 / 293-900
> Fax: 05772 / 293-333
>
> https://www.mittwald.de
>
> Geschäftsführer: Robert Meyer, Florian Jürgens
>
> St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
> Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen
>
> Informationen zur Datenverarbeitung im Rahmen unserer Geschäftstätigkeit
> gemäß Art. 13-14 DSGVO sind unter www.mittwald.de/ds abrufbar.
>
>

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-04 Thread Boris Behrens
@Marius:
No swap at all. I'd rather buy more memory than use swap :)
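
For completeness, a quick way to double-check that on the OSD hosts (the host
names are made up; the OSD id is only an example):

for h in osd-host-01 osd-host-02 osd-host-03; do
  echo "== $h"
  # swapon prints nothing at all when no swap is configured
  ssh "$h" 'swapon --show; free -h; sysctl vm.swappiness'
done
# running memory budget of one of the OSDs, for reference
ceph config show osd.122 osd_memory_target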

Am So., 4. Dez. 2022 um 20:10 Uhr schrieb Marius Leustean <
marius.l...@gmail.com>:

> Hi Boris
>
> Do you have swap enabled on any of the OSD hosts? That may slow down
> RocksDB drastically.
>
> On Sun, Dec 4, 2022 at 8:59 PM Alex Gorbachev 
> wrote:
>
>> Hi Boris,
>>
>> These waits seem to be all over the place.  Usually, in the main ceph.log
>> you see "implicated OSD" messages - I would try to find some commonality
>> with either a host, switch, or something like that.  Can be bad
>> ports/NICs,
>> LACP problems, even bad cables sometimes.  I try to isolate an area that
>> is
>> problematic.  Sometimes rebooting OSD hosts one at a time.  Rebooting
>> switches (if stacked/MLAG) one at a time.  Something has got to be there,
>> which makes the problem go away.
>> --
>> Alex Gorbachev
>> https://alextelescope.blogspot.com
>>
>>
>>
>> On Sun, Dec 4, 2022 at 6:08 AM Boris Behrens  wrote:
>>
>> > Hi Alex,
>> > I am searching for a log line that points me in the right direction.
>> > From what I've seen, I could not find a specific host, OSD, or PG that
>> > was leading to this problem.
>> > But maybe I am looking at the wrong logs.
>> >
>> > I have around 150k lines that look like this:
>> > ceph.log.timeframe:2022-12-02T18:19:59.877920+0100 osd.122 (osd.122)
>> 5195
>> > : cluster [WRN] 14 slow requests (by type [ 'delayed' : 2 'waiting for
>> sub
>> > ops' : 12 ] most affected pool [ 'rbd' : 14 ])
>> > ceph.log.timeframe:2022-12-02T18:19:59.905505+0100 osd.118 (osd.118)
>> 21011
>> > : cluster [WRN] 256 slow requests (by type [ 'delayed' : 243 'waiting
>> for
>> > sub ops' : 13 ] most affected pool [ 'rbd' : 256 ])
>> > ceph.log.timeframe:2022-12-02T18:19:59.928599+0100 osd.120 (osd.120)
>> 19800
>> > : cluster [WRN] 71 slow requests (by type [ 'delayed' : 15 'waiting for
>> sub
>> > ops' : 56 ] most affected pool [ 'rbd' : 71 ])
>> > ceph.log.timeframe:2022-12-02T18:19:59.968535+0100 osd.54 (osd.54) 6960
>> :
>> > cluster [WRN] 38 slow requests (by type [ 'delayed' : 21 'waiting for
>> sub
>> > ops' : 17 ] most affected pool [ 'rbd' : 38 ])
>> > ceph.log.timeframe:2022-12-02T18:19:59.973174+0100 osd.97 (osd.97)
>> 16792 :
>> > cluster [WRN] 19 slow requests (by type [ 'delayed' : 11 'waiting for
>> sub
>> > ops' : 8 ] most affected pool [ 'rbd' : 19 ])
>> > ceph.log.timeframe:2022-12-02T18:19:59.978565+0100 osd.42 (osd.42) 5724
>> :
>> > cluster [WRN] 12 slow requests (by type [ 'delayed' : 5 'waiting for sub
>> > ops' : 7 ] most affected pool [ 'rbd' : 12 ])
>> > ceph.log.timeframe:2022-12-02T18:19:59.980684+0100 osd.98 (osd.98)
>> 18471 :
>> > cluster [WRN] 35 slow requests (by type [ 'delayed' : 3 'waiting for sub
>> > ops' : 32 ] most affected pool [ 'rbd' : 35 ])
>> > ceph.log.timeframe:2022-12-02T18:19:59.992514+0100 osd.77 (osd.77)
>> 11319 :
>> > cluster [WRN] 256 slow requests (by type [ 'delayed' : 232 'waiting for
>> sub
>> > ops' : 24 ] most affected pool [ 'rbd' : 256 ])
>> >
>> > and around 50k that look like this:
>> > ceph-osd.99.log.timeframe:2022-12-02T18:19:59.605+0100 7ff8f96ba700 -1
>> > osd.99 945870 get_health_metrics reporting 9 slow ops, oldest is
>> > osd_op(client.171194478.0:4862294 8.cf5
>> > 8:af34e5b1:::rbd_header.47d6a06b8b4567:head [watch ping cookie
>> > 18446462598732840961 gen 26] snapc 0=[] ondisk+write+known_if_redirected
>> > e945870)
>> > ceph-osd.92.log.timeframe:2022-12-02T18:14:57.415+0100 7f9e8e4fd700 -1
>> > osd.92 945870 get_health_metrics reporting 6 slow ops, oldest is
>> > osd_op(client.177840485.0:141305 8.159f
>> > 8:f9adda1f:::rbd_data.82f60d356b4e4a.a1c2:head [write
>> > 1900544~147456 in=147456b] snapc 0=[] ondisk+write+known_if_redirected
>> > e945868)
>> >
>> > Cheers
>> >  Boris
>> >
>> > Am So., 4. Dez. 2022 um 03:15 Uhr schrieb Alex Gorbachev <
>> > a...@iss-integration.com>:
>> >
>> >> Boris, I have seen one problematic OSD cause this issue on all OSDs with
>> >> which its PGs peered.  The solution was to take out the slow OSD;
>> >> immediately all slow ops stopped.  I found it by observing common OSDs
>> >> in reported slow ops.  Not saying this is your issue, but it may be a
>> >> possibility.  Good luck!
>> >>
>> >> --
>> >> Alex Gorbachev
>> >> https://alextelescope.blogspot.com
>> >>
>> >
>> >
>> > --
>> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
>> > groüen Saal.
>> >
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>

-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-04 Thread Boris Behrens
@Alex:
The issue is gone for now, but I fear it might come back sometime. The
cluster was running fine for months.
I will check if we can restart the switches easily. Host reboots should also
be no problem.
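
The reboots themselves would be the usual one-host-at-a-time cycle, roughly
like this (the host name is made up):

ceph osd set noout                 # don't rebalance while a host is down
ssh osd-host-01 'systemctl stop ceph-osd.target && reboot'
# wait until the host is back and all of its OSDs are up again
ceph -s                            # should be HEALTH_OK apart from the noout flag
ceph osd unset noout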

There is no "implicated OSD" message in the logs.
All OSDs were recreated 3 months ago. (sync out, destroy, wipe, create,
sync in). Maybe I will reinstall with ubuntu 20.04 (currently centos7) for
newer kernel.
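
For reference, that recreate cycle was roughly the following per OSD (the id
and the device path are just examples):

ceph osd out 42
# wait until the data has drained off the OSD
until ceph osd safe-to-destroy osd.42; do sleep 60; done
systemctl stop ceph-osd@42
ceph osd destroy 42 --yes-i-really-mean-it
ceph-volume lvm zap --destroy /dev/sdX
# re-create with the same id and let it sync back in
ceph-volume lvm create --osd-id 42 --data /dev/sdX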

Am So., 4. Dez. 2022 um 19:58 Uhr schrieb Alex Gorbachev <
a...@iss-integration.com>:

> Hi Boris,
>
> These waits seem to be all over the place.  Usually, in the main ceph.log
> you see "implicated OSD" messages - I would try to find some commonality
> with either a host, switch, or something like that.  Can be bad ports/NICs,
> LACP problems, even bad cables sometimes.  I try to isolate an area that is
> problematic.  Sometimes rebooting OSD hosts one at a time.  Rebooting
> switches (if stacked/MLAG) one at a time.  Something has got to be there,
> which makes the problem go away.
> --
> Alex Gorbachev
> https://alextelescope.blogspot.com
>
-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

