[ceph-users] Re: [Suspicious newsletter] Re: bluefs_buffered_io turn to true

2021-05-14 Thread Szabo, Istvan (Agoda)
com> --- From: Irek Fasikhov Sent: Friday, May 14, 2021 6:28 PM To: Szabo, Istvan (Agoda) Cc: Konstantin Shalygin ; ceph-users@ceph.io Subject: Re: [ceph-users] Re: [Suspicious newsletter] Re: bluefs_buffered_io turn to true Hi. https://access.redh

[ceph-users] Re: [Suspicious newsletter] Re: bluefs_buffered_io turn to true

2021-05-14 Thread Szabo, Istvan (Agoda)
From: Konstantin Shalygin Sent: Friday, May 14, 2021 6:07 PM To: Szabo, Istvan (Agoda) Cc: ceph-users@ceph.io Subject: Re: [ceph-users] [Suspicious newsletter] Re: bluefs_buffered_io turn to true I suggest looking into the vm.min_free_kbytes kernel option and doubling it. k On 14 May 2021, at

[ceph-users] Re: [Suspicious newsletter] Re: bluefs_buffered_io turn to true

2021-05-14 Thread Szabo, Istvan (Agoda)
com> --- From: Konstantin Shalygin Sent: Friday, May 14, 2021 5:32 PM To: Szabo, Istvan (Agoda) Cc: ceph-users@ceph.io Subject: Re: [ceph-users] [Suspicious newsletter] Re: bluefs_buffered_io turn to true It's enough, should be true now... k On 14 May 2021, at 12:51,

[ceph-users] Re: [Suspicious newsletter] Re: bluefs_buffered_io turn to true

2021-05-14 Thread Szabo, Istvan (Agoda)
Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com<mailto:istvan.sz...@agoda.com> --- From: Konstantin Shalygin Sent: Friday, May 14, 2021 3:26 PM To: Szabo, Istvan (Agoda) Cc

[ceph-users] Re: [Suspicious newsletter] Re: bluefs_buffered_io turn to true

2021-05-14 Thread Szabo, Istvan (Agoda)
Sent: Friday, May 14, 2021 3:00 PM To: Szabo, Istvan (Agoda) Cc: ceph-users@ceph.io Subject: Re: [Suspicious newsletter] [ceph-users] Re: bluefs_buffered_io turn to true On 14 May 2021, at 10:50, Szabo, Istvan (Agoda) mailto:istvan.sz...@agoda.com>> wrote: Is it

[ceph-users] Re: [Suspicious newsletter] Re: bluefs_buffered_io turn to true

2021-05-14 Thread Szabo, Istvan (Agoda)
com> --- From: Konstantin Shalygin Sent: Friday, May 14, 2021 2:12 PM To: Szabo, Istvan (Agoda) Cc: ceph-users@ceph.io Subject: Re: [Suspicious newsletter] [ceph-users] Re: bluefs_buffered_io turn to true I recommend to upgrade at least to 12.2.13, for luminous even .12 a

[ceph-users] Re: [Suspicious newsletter] Re: bluefs_buffered_io turn to true

2021-05-14 Thread Szabo, Istvan (Agoda)
Shalygin Sent: Friday, May 14, 2021 1:12 PM To: Szabo, Istvan (Agoda) Cc: ceph-users@ceph.io Subject: [Suspicious newsletter] [ceph-users] Re: bluefs_buffered_io turn to true Hi, This is not normal. It's something different, I think, like CRUSH changes on restart. This option

[ceph-users] bluefs_buffered_io turn to true

2021-05-14 Thread Szabo, Istvan (Agoda)
Hi, I had an issue with snaptrim after a huge amount of data was deleted: the snaptrim and snaptrim_wait PGs slow down the team's operations. I've changed a couple of things:
debug_ms = 0/0 # default 0/5
osd_snap_trim_priority = 1 # default 5
osd_pg_max_concurrent_snap_trims = 1 # default 2

[ceph-users] Re: Which EC-code for 6 servers?

2021-05-11 Thread Szabo, Istvan (Agoda)
2 host failures with RW access, you need m=3, which is probably a bit much with 6 hosts. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Szabo, Istvan (Agoda) Sent: 10 May 2021 10:35:34 To: ceph-users Subject:

[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-05-11 Thread Szabo, Istvan (Agoda)
com> --- From: 特木勒 Sent: Monday, May 10, 2021 6:53 PM To: Szabo, Istvan (Agoda) Cc: Jean-Sebastien Landry ; ceph-users@ceph.io; Amit Ghadge Subject: Re: [ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data Hi Istvan: Thanks for your help. Af

[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-05-10 Thread Szabo, Istvan (Agoda)
--- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com<mailto:istvan.sz...@agoda.com> --- From: 特木勒 Sent: Thursday, May 6, 2021 11:27 AM To: Jean-Sebastien Landry Cc: Szabo, Istvan (Agoda) ; ceph-users@ceph.io; Amit Ghadge Subject: Re:

[ceph-users] Which EC-code for 6 servers?

2021-05-10 Thread Szabo, Istvan (Agoda)
Hi, I'm thinking of using 2:2 so I can tolerate the loss of 2 hosts, but if I only want to tolerate the loss of 1 host, which is better, 3:2 or 4:1? Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e:
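The trade-off in the question above can be put into numbers with a small sketch. It assumes that "2:2", "3:2" and "4:1" mean k data chunks plus m coding chunks, that the failure domain is the host (one chunk per host), and that only data survival is counted. Whether client I/O keeps running after a failure also depends on the pool's min_size, which is not modelled here. The class and function names are hypothetical helpers, not part of any Ceph API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ECProfile:
    k: int  # data chunks (assumed meaning of the first number)
    m: int  # coding chunks (assumed meaning of the second number)

    @property
    def hosts_needed(self) -> int:
        # With a host failure domain, every chunk lands on a different host.
        return self.k + self.m

    @property
    def usable_fraction(self) -> float:
        # Share of raw capacity that holds user data.
        return self.k / (self.k + self.m)

    @property
    def survivable_host_losses(self) -> int:
        # Data stays recoverable as long as at most m chunks are lost.
        return self.m


def summarize(profile: ECProfile, hosts: int) -> dict:
    """Describe one profile for a cluster with the given number of hosts."""
    return {
        "profile": f"{profile.k}+{profile.m}",
        "fits": profile.hosts_needed <= hosts,
        "usable_fraction": profile.usable_fraction,
        "survivable_host_losses": profile.survivable_host_losses,
    }


for k, m in [(2, 2), (3, 2), (4, 1)]:
    print(summarize(ECProfile(k, m), hosts=6))
```

Under these assumptions 4+1 gives the most usable capacity but survives only one host loss, while 3+2 survives two at a lower usable fraction; all three profiles fit on 6 hosts.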

[ceph-users] Re: [Suspicious newsletter] Building ceph clusters with 8TB SSD drives?

2021-05-10 Thread Szabo, Istvan (Agoda)
We are using 15TB SSDs in our object store. -Original Message- From: Matt Larson Sent:

[ceph-users] Re: [Suspicious newsletter] Re: Getting `InvalidInput` when trying to create a notification topic with Kafka endpoint

2021-04-28 Thread Szabo, Istvan (Agoda)
al Message- From: Yuval Lifshitz Sent: Tuesday, April 27, 2021 11:49 PM To: Szabo, Istvan (Agoda) Cc: ceph-users@ceph.io; Raveendran, Vigneshwaran (Agoda) Subject: [Suspicious newsletter] [ceph-users] Re: Getting `InvalidInput` when trying to create a notification topic with Kafka endpoint On T

[ceph-users] Re: Getting `InvalidInput` when trying to create a notification topic with Kafka endpoint

2021-04-27 Thread Szabo, Istvan (Agoda)
. e: istvan.sz...@agoda.com<mailto:istvan.sz...@agoda.com> --- From: Yuval Lifshitz Sent: Wednesday, April 21, 2021 10:34 PM To: Szabo, Istvan (Agoda) Cc: ceph-users@ceph.io Subject: Re: [ceph-users] Getting `InvalidInput` when trying

[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-04-25 Thread Szabo, Istvan (Agoda)
com> --- From: 特木勒 Sent: Friday, April 23, 2021 7:50 PM To: Szabo, Istvan (Agoda) Cc: ceph-users@ceph.io Subject: Re: [Suspicious newsletter] [ceph-users] RGW: Multiple Site does not sync olds data Hi Istvan: We just upgraded whole cluster to 15.2.10 and the multiple site still cannot

[ceph-users] PG can't deep and simple scrub after unfound data delete

2021-04-25 Thread Szabo, Istvan (Agoda)
Hi, I have a PG on which the following command was run: ceph pg 44.1aa mark_unfound_lost delete. Afterwards the cluster no longer reported the unknown PGs, which was the goal of running it. However, this PG is now inconsistent and can't be deep-scrubbed. ceph health detail HEALTH_ERR 214275

[ceph-users] Re: [Suspicious newsletter] Re: cephadm: how to create more than 1 rgw per host

2021-04-23 Thread Szabo, Istvan (Agoda)
I don't use cephadm; I'm using a non-dockerised deployment. How do you create another one on the same host with this config? This is my RGW section:
[client.rgw.xyz-cephmon-2s01.rgw0]
host = xyz-cephmon-2s01
keyring = /var/lib/ceph/radosgw/ceph-rgw.xyz-cephmon-2s01.rgw0/keyring
log file =

[ceph-users] Re: Metrics for object sizes

2021-04-22 Thread Szabo, Istvan (Agoda)
com> --- On 2021. Apr 22., at 14:00, Wido den Hollander wrote: On 21/04/2021 11:46, Szabo, Istvan (Agoda) wrote: Hi, Is there any cluster-wide metric regarding object sizes? I'd like to collect some information about the users what is the object

[ceph-users] Re: [Suspicious newsletter] Re: cephadm: how to create more than 1 rgw per host

2021-04-22 Thread Szabo, Istvan (Agoda)
Have you ever tried this? Did it work for you? On 2021. Apr 22.,

[ceph-users] Metrics for object sizes

2021-04-21 Thread Szabo, Istvan (Agoda)
Hi, Is there any cluster-wide metric regarding object sizes? I'd like to collect information on the object sizes in the users' buckets. This message is confidential and is for the sole use of the intended recipient(s). It may also be

[ceph-users] Getting `InvalidInput` when trying to create a notification topic with Kafka endpoint

2021-04-21 Thread Szabo, Istvan (Agoda)
Hi Ceph Users, Here is the latest request I tried but still not working curl -v -H 'Date: Tue, 20 Apr 2021 16:05:47 +' -H 'Authorization: AWS :' -L -H 'content-type: application/x-www-form-urlencoded' -k -X POST https://servername -d

[ceph-users] Re: [Suspicious newsletter] cleanup multipart in radosgw

2021-04-19 Thread Szabo, Istvan (Agoda)
Hi, You have 2 options. The first is to use the S3 Browser app: select the multipart uploads in the menu and clean them up. The other is to set a lifecycle policy. On the client: vim lifecyclepolicy http://s3.amazonaws.com/doc/2006-03-01/;>

[ceph-users] Re: what-does-nosuchkey-error-mean-while-subscribing-for-notification-in-ceph

2021-04-16 Thread Szabo, Istvan (Agoda)
da.com> --- On 2021. Apr 16., at 14:58, David Caro wrote:  What does notif.xml have in it? Looking at the docs you linked, I say that it does not find the `S3Key` from that xml for whatever reason. On 04/16 06:54, Szabo, Istvan (Agoda) wrote: Hi, I

[ceph-users] what-does-nosuchkey-error-mean-while-subscribing-for-notification-in-ceph

2021-04-16 Thread Szabo, Istvan (Agoda)
Hi, I am trying to follow this url https://docs.ceph.com/en/latest/radosgw/s3/bucketops/#create-notification to create a publisher for my bucket into a topic. My curl: curl -v -H 'Date: Fri, 16 Apr 2021 05:21:14 +' -H 'Authorization: AWS accessid:secretkey' -L -H 'content-type:

[ceph-users] Re: Abandon incomplete (damaged EC) pgs - How to manage the impact on cephfs?

2021-04-08 Thread Szabo, Istvan (Agoda)
Hi, So how did you solve it in the end? Which of the three methods? -Original Message-

[ceph-users] Re: [ERR] OSD_SCRUB_ERRORS: 2 scrub errors

2021-04-01 Thread Szabo, Istvan (Agoda)
Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- -Original Message- From: Szabo, Istvan (Agoda) Sent: Thursday, April 1, 2021 11:38 AM To: ceph-users Subject: [ceph

[ceph-users] [ERR] OSD_SCRUB_ERRORS: 2 scrub errors

2021-04-01 Thread Szabo, Istvan (Agoda)
Hi, I'm continuously getting scrub errors in my index pool and log pool that I always need to repair.
HEALTH_ERR 2 scrub errors; Possible data damage: 1 pg inconsistent
[ERR] OSD_SCRUB_ERRORS: 2 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
pg 20.19 is

[ceph-users] Re: OSD Crash During Deep-Scrub

2021-03-30 Thread Szabo, Istvan (Agoda)
I hit this deep-scrub issue in the index pool's PGs almost every week. It puts the cluster into health error, so I always need to repair that PG :/ Have you found any solution so far?

[ceph-users] Re: [Suspicious newsletter] Re: How to clear Health Warning status?

2021-03-29 Thread Szabo, Istvan (Agoda)
Restart the OSD. -Original Message- From: jinguk.k...@ungleich.ch Sent: Monday, March

[ceph-users] Re: [Suspicious newsletter] Re: [Suspicious newsletter] bucket index and WAL/DB

2021-03-26 Thread Szabo, Istvan (Agoda)
Thu, 25 Mar 2021 at 22:42, Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote:
> Based on a couple of documents I've read, I finally decided to separate the index from WAL+DB.
> However, don't you think the density is a bit high with 12 HDDs per NVMe? So

[ceph-users] Possible to update from luminous 12.2.8 to nautilus latest?

2021-03-26 Thread Szabo, Istvan (Agoda)
Hi, Is it possible to make the jump in one step, or do I need to go more slowly: to the latest luminous, then the latest mimic, then the latest nautilus?

[ceph-users] How ceph sees when the pool is getting full?

2021-03-26 Thread Szabo, Istvan (Agoda)
Hi, I want to set an alert on the users' pools before they get full, but in nautilus I still haven't found which value in ceph df detail reflects their data usage. POOL ID STORED OBJECTS USED %USED

[ceph-users] Re: [Suspicious newsletter] bucket index and WAL/DB

2021-03-25 Thread Szabo, Istvan (Agoda)
Based on a couple of documents I've read, I finally decided to separate the index from WAL+DB. However, don't you think the density is a bit high with 12 HDDs per NVMe? If you lose the NVMe you effectively lose the complete host, and a lot of data movement will happen.

[ceph-users] Re: Incomplete pg , any chance to to make it survive or data loss :( ?

2021-03-22 Thread Szabo, Istvan (Agoda)
pg 44.1aa not scrubbed since 2021-01-14T05:50:23.852626+0100 From: Szabo, Istvan (Agoda) Sent: Monday, March 22, 2021 3:00 PM To: Ceph Users Subject: [ceph-users] Re: Incomplete pg , any chance to to make it survive or data loss :( ? Forgot to say

[ceph-users] Incomplete pg , any chance to to make it survive or data loss :( ?

2021-03-22 Thread Szabo, Istvan (Agoda)
Hi, What can I do with this PG to make it work? We lost OSDs 61 and 122 and no longer have them, but we still have 32, 33 and 70. I exported the PG chunks from them, but they are very small, and when I imported one into another OSD, that OSD never started again, so I had to remove that chunk (44.1aas2,

[ceph-users] Re: [Suspicious newsletter] v15.2.10 Octopus released

2021-03-18 Thread Szabo, Istvan (Agoda)
Hi David, I guess this one fixes the non-containerized deployment too, doesn't it? -Original

[ceph-users] Re: PG export import

2021-03-18 Thread Szabo, Istvan (Agoda)
--- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- -Original Message- From: Frank Schilder Sent: Thursday, March 18, 2021 6:16 PM To: Szabo, Istvan (Agoda) ; Ceph Users Subject: Re

[ceph-users] PG export import

2021-03-18 Thread Szabo, Istvan (Agoda)
Hi, I tried to rescue some PGs from a dead OSD. This is what I did: on the same server I picked an OSD that is barely used, stopped it, and imported the PG exported from the dead OSD. root@server:~# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-33 --no-mon-config --pgid 44.c0s0

[ceph-users] Re: Inactive pg, how to make it active / or delete

2021-03-16 Thread Szabo, Istvan (Agoda)
== > Frank Schilder > AIT Risø Campus > Bygning 109, rum S14 > > ____________ > From: Szabo, Istvan (Agoda) > Sent: 16 March 2021 10:44:10 > To: Ceph Users > Subject: [ceph-users] Inactive pg, how to make it active / or delete > > Hi

[ceph-users] Inactive pg, how to make it active / or delete

2021-03-16 Thread Szabo, Istvan (Agoda)
Hi, I have 4 inactive PGs in my cluster; the OSDs they were on have died. How can I make them work again? Or should I just throw them away, because last backfill=max? Based on the pg query, the PG is fully up on other OSDs. It is an EC 3+1 pool. This is an example inactive PG: ceph pg 44.1f0 query {

[ceph-users] Re: Safe to remove osd or not? Which statement is correct?

2021-03-16 Thread Szabo, Istvan (Agoda)
I had the problem that the OSDs would not start. I ran 'ceph-volume lvm activate --all' on the host, which brought the OSDs back up. But I can't tell you if it is safe to remove the OSD. Cheers Boris On Sun, 14 Mar 2021 at 02:38, Szabo, Istvan (Agoda) mailto:istvan.sz...@ago

[ceph-users] Re: millions slow ops on a cluster without load

2021-03-16 Thread Szabo, Istvan (Agoda)
Yeah, the MTU on the cluster network's NICs is 8982, and ping works with 8954-byte packets between interfaces. On 2021. Mar 15., at 23:40, Matthew H wrote: Might be an MTU problem, have you checked your network and MTU settings? From: Szabo, Istvan
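The two numbers in the reply above are consistent with each other if 8954 is the ICMP payload size passed to ping over IPv4. A minimal sketch of that arithmetic follows; it assumes a 20-byte IPv4 header without options and the 8-byte ICMP echo header, and the function name is a hypothetical helper.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits into one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small for an ICMP echo packet")
    return payload


print(max_ping_payload(8982))  # 8954, matching the packet size quoted above
print(max_ping_payload(1500))  # 1472, the usual value for standard Ethernet
```

On Linux, a payload of this size can be tested end to end with fragmentation forbidden, for example `ping -M do -s 8954 <host>`; if any hop has a smaller MTU, that ping fails instead of silently fragmenting.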

[ceph-users] millions slow ops on a cluster without load

2021-03-15 Thread Szabo, Istvan (Agoda)
We have a cluster with a huge number of warnings like this even when nothing is going on in the cluster. It fills the mgr's physical memory, the mon DB is maxed out, and 5 OSDs can't start :/ [WRN] slow request osd_op(mds.0.537792:26453 43.38 43:1d6c5587:::1fe56a6.:head [create,setxattr parent

[ceph-users] Safe to remove osd or not? Which statement is correct?

2021-03-13 Thread Szabo, Istvan (Agoda)
Hi Gents, There is a cluster with 14 hosts in this state: https://i.ibb.co/HPF3Pdr/6-ACB2-C5-B-6-B54-476-B-835-D-227-E9-BFB1247.jpg There is a host-based CRUSH rule for EC 3:1, and OSDs are down on 3 hosts. Unfortunately there are also pools with 3 replicas, which are host-based too. 2

[ceph-users] Re: ERROR: S3 error: 403 (SignatureDoesNotMatch)

2021-03-12 Thread Szabo, Istvan (Agoda)
: Szabo, Istvan (Agoda) Sent: Friday, March 12, 2021 11:57 AM To: 'Szabo, Istvan (Agoda)' ; Ceph Users Subject: RE: ERROR: S3 error: 403 (SignatureDoesNotMatch) Seems like the issue is this line in the radosgw configuration: rgw_dns_name = It is only bound to the one that is listed

[ceph-users] ERROR: S3 error: 403 (SignatureDoesNotMatch)

2021-03-11 Thread Szabo, Istvan (Agoda)
Hi, I'm struggling with the CNAME address of my old cluster. The s3 and curl commands work properly with the non-CNAME address, but with the CNAME one I get this in the civetweb log: 2021-03-12 10:24:18.812329 7f6b0c527700 1 == starting new request req=0x7f6b0c520f90 = 2021-03-12

[ceph-users] Re: A practical approach to efficiently store 100 billions small objects in Ceph

2021-03-11 Thread Szabo, Istvan (Agoda)
r your question? On 11/03/2021 08:22, Szabo, Istvan (Agoda) wrote: Hi, It relates to this sentence: "The median object size is ~4KB, written in RBD images using the default 4MB[0] object size. That will be ~100 millions RADOS objects instead of 100 billions." Istvan Szabo Senior Infrast
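The quoted sentence is an order-of-magnitude estimate, and the arithmetic behind it can be sketched as follows. The sketch assumes every small object is exactly the median 4 KiB, that they are packed back to back into 4 MiB RADOS objects with no padding or index overhead, and it uses a hypothetical helper name.

```python
KIB = 1024
MIB = 1024 * KIB


def packed_object_count(n_small: int, small_size: int, container_size: int) -> int:
    """Number of fixed-size containers needed to hold n_small equal-sized items."""
    per_container = container_size // small_size
    if per_container == 0:
        raise ValueError("a single item does not fit into one container")
    # Ceiling division without floating point.
    return -(-n_small // per_container)


rados_objects = packed_object_count(100_000_000_000, 4 * KIB, 4 * MIB)
print(rados_objects)  # 97656250, i.e. roughly 100 million
```

Each 4 MiB object holds 1024 objects of 4 KiB, so the object count drops by about three orders of magnitude, which is where "~100 millions instead of 100 billions" comes from.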

[ceph-users] Re: A practical approach to efficiently store 100 billions small objects in Ceph

2021-03-11 Thread Szabo, Istvan (Agoda)
Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- -Original Message- From: Loïc Dachary Sent: Thursday, March 11, 2021 2:10 PM To: Szabo, Istvan (Agoda) Cc: Ceph Users Subject: [ceph-users] Re: A practical approach to efficiently

[ceph-users] Re: Bluestore OSD Layout - WAL, DB, Journal

2021-03-10 Thread Szabo, Istvan (Agoda)
Hi, If you don't specify a WAL, it will be located on the same drive as RocksDB. You only need to specify a WAL if you have a disk faster than the one holding RocksDB, e.g. data on HDD, RocksDB on SSD, WAL on NVMe/Optane. In the past the suggested sizing was: 300GB data, 30GB RocksDB, 3GB WAL. Not sure if this
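The 300 GB / 30 GB / 3 GB figures quoted above amount to a 100:10:1 ratio. The sketch below only scales that ratio to other data device sizes; the post itself is unsure whether the rule still holds, so treat the output as an illustration of the arithmetic, not as a sizing recommendation. The function name is a hypothetical helper, not a Ceph tool.

```python
def bluestore_sizes_gb(data_gb: float) -> dict:
    """Scale the 300/30/3 GB rule of thumb (100:10:1) to another data size."""
    if data_gb <= 0:
        raise ValueError("data size must be positive")
    return {
        "data_gb": data_gb,
        "db_gb": data_gb / 10,    # RocksDB: 10% of the data device
        "wal_gb": data_gb / 100,  # WAL: 1% of the data device
    }


print(bluestore_sizes_gb(300))   # the figures from the post: 30 GB DB, 3 GB WAL
print(bluestore_sizes_gb(4000))  # e.g. a 4 TB HDD: 400 GB DB, 40 GB WAL
```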

[ceph-users] Re: Failure Domain = NVMe?

2021-03-10 Thread Szabo, Istvan (Agoda)
Don't forget that if you have a server failure you might lose many objects. If the failure domain is osd and you have, say, 12 drives in each server, then in an unlucky situation all chunks of an 8+2 EC object can be located on 1 server.

[ceph-users] Re: A practical approach to efficiently store 100 billions small objects in Ceph

2021-03-10 Thread Szabo, Istvan (Agoda)
Does this mean that even in an object store, files smaller than 4MB will be packed into one 4MB object? -Original Message- From: Loïc Dachary Sent: Thursday, March 11, 2021 2:13 AM To: Konstantin Shalygin Cc: Ceph Users ; swh-de...@inria.fr Subject: [ceph-users] Re: A

[ceph-users] Bluestore OSD crash with tcmalloc::allocate_full_cpp_throw_oom in multisite setup with PG_DAMAGED cluster error

2021-03-04 Thread Szabo, Istvan (Agoda)
Hi, I have a 3 DC multisite setup. The replication is directional, HKG->SGP->US, so the bucket is replicated from HKG to SGP and the same bucket is replicated further from SGP to US. The HKG->SGP connection is pretty fast: 12.5 million objects (600GB) were transferred in 6.5 hours. Once the OSD

[ceph-users] Re: Questions RE: Ceph/CentOS/IBM

2021-03-03 Thread Szabo, Istvan (Agoda)
But are you using kernel 4 then with centos 7? -Original Message- From: Marc Sent:

[ceph-users] Multiple radosgw on the same server

2021-03-02 Thread Szabo, Istvan (Agoda)
Hi, I've heard many times that installing multiple RADOS gateways on the same server is possible; you just need to create each one on a different port. However, I've never managed to make it work. Today I gave it another try like this: 1. Created a new keyring: ceph auth get-or-create

[ceph-users] Re: [Suspicious newsletter] RBD clone on Bluestore

2021-03-01 Thread Szabo, Istvan (Agoda)
I might be wrong, but the object size in Ceph is 4M, so that might be the case here too. I'm not familiar with the lower levels of data storage, though.

[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-02-28 Thread Szabo, Istvan (Agoda)
Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com<mailto:istvan.sz...@agoda.com> --- From: 特木勒 Sent: Sunday, February 28, 2021 1:34 PM To: Szabo, Istvan

[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-02-25 Thread Szabo, Istvan (Agoda)
Same for me, also on 15.2.8. I'm trying directional sync now; it looks like symmetrical sync has an issue.

[ceph-users] Re: Erasure coded calculation

2021-02-25 Thread Szabo, Istvan (Agoda)
109 is 81.75 yes the rest of them some bluestore stuffs I guess. Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- -Original Message- From:

[ceph-users] Re: [Suspicious newsletter] Re: List number of buckets owned per user

2021-02-25 Thread Szabo, Istvan (Agoda)
Maybe this one? radosgw-admin bucket list --uid= And afterwards Linux commands: grep, wc -l.

[ceph-users] Multisite sync shards cleanup

2021-02-23 Thread Szabo, Istvan (Agoda)
Hi, Is there a way to clean up the sync shards and start from scratch? Thank you

[ceph-users] Multisite cluster sync latency

2021-02-15 Thread Szabo, Istvan (Agoda)
Hi Gents, Can you tell me what the latency is in your multisite cluster? Multisite should be latency sensitive and I'm afraid this is my sync issue, but I don't really know what "low latency" means. Here is mine; I wonder whether it is good or not. In HKG: "data-sync-from-ash": { "fetch_bytes": {

[ceph-users] Data sync init vs bucket sync init

2021-02-11 Thread Szabo, Istvan (Agoda)
Hi, What's the difference between data sync init and bucket sync init? Does data sync init initialise the complete cluster, and bucket sync init only the bucket? I see that when the initialisation has finished there are shards behind, but nothing is done with them. What are the proper steps to bring things back into sync? Init, run, restart?

[ceph-users] Multisite reshard stale instances

2021-02-04 Thread Szabo, Istvan (Agoda)
Hi, I found 600-700 stale instances with the reshard stale instances list command. Is there a way to clean them up (or should I actually clean them up)? The stale instance rm doesn't work in multisite. Thank you

[ceph-users] log_meta log_data was turned off in multisite and deleted

2021-02-04 Thread Szabo, Istvan (Agoda)
Hi, Is there a way to reinitialize the stored data and make it sync from the logs? Thank you

[ceph-users] Bucket synchronization works only after disable/enable, once finished, some operation maxes out SSDs/nvmes and sync degraded.

2021-02-01 Thread Szabo, Istvan (Agoda)
Hello, We have a freshly installed multisite setup across 3 geographic locations, running octopus upgraded from 15.2.5 to 15.2.7. We have 6 OSD nodes and 3 mon/mgr/rgw nodes in each DC, all SSD, with every 3 SSDs using 1 NVMe for journaling. Each zone is backed by 3 RGWs, one on each mon/mgr node. The goal is to replicate 2

[ceph-users] Re: [Suspicious newsletter] Re: Multisite recovering shards

2021-02-01 Thread Szabo, Istvan (Agoda)
rds incremental sync: 128/128 shards data is caught up with source Quoting "Szabo, Istvan (Agoda)" : > Hi, > > I’ve never seen in our multisite sync status healthy output, almost > all the sync shards are recovering. > > What can I do

[ceph-users] Re: Multisite recovering shards

2021-01-29 Thread Szabo, Istvan (Agoda)
Two things I forgot to mention which might be interesting: we have only 2 buckets at the moment; one is presharded to 9000 shards, the other to 24000 shards (different users). > On 2021. Jan 30., at 10:02, Szabo, Istvan (Agoda) > wrote: > > Hi, > > I’ve never see

[ceph-users] Multisite recovering shards

2021-01-29 Thread Szabo, Istvan (Agoda)
Hi, I've never seen healthy output in our multisite sync status; almost all the sync shards are recovering. What can I do with the recovering shards? We have 1 realm and 1 zonegroup, and inside the zonegroup we have 3 zones in 3 different geographic locations. We are using octopus 15.2.7 for bucket sync

[ceph-users] Re: [Suspicious newsletter] Re: Rbd pool shows 458GB USED but the image is empty

2021-01-28 Thread Szabo, Istvan (Agoda)
com> --- On 2021. Jan 28., at 19:51, Burkhard Linke wrote: Email received from outside the company. If in doubt don't click links nor open attachments! Hi, On 28.01.21 13:21, Szabo, Istvan (Agoda) wrote: I mean the image hasn’t been d

[ceph-users] Re: [Suspicious newsletter] Re: Rbd pool shows 458GB USED but the image is empty

2021-01-28 Thread Szabo, Istvan (Agoda)
com> --- On 2021. Jan 28., at 18:21, Eugen Block wrote: The image is probably still in the trash, I assume. rbd -p trash ls Quoting "Szabo, Istvan (Agoda)&

[ceph-users] Rbd pool shows 458GB USED but the image is empty

2021-01-28 Thread Szabo, Istvan (Agoda)
Hi, We have a pool where the user has 2 images. They cleaned up the images and there are no snapshots in them, but ceph df detail still shows 458GB in the first column. Why? Thanks

[ceph-users] Running ceph cluster on different os

2021-01-26 Thread Szabo, Istvan (Agoda)
Hi, Is anybody running a cluster with different operating systems? Due to the CentOS 8 change I might try to add Ubuntu OSD nodes to a CentOS cluster and slowly decommission the CentOS nodes, but I'm not sure whether this is possible. Thank you

[ceph-users] Re: Multisite bucket data inconsistency

2021-01-25 Thread Szabo, Istvan (Agoda)
Debugging a bit more it shows in all sites many stale instances which can't be removed due to multisite limitation ☹ in octopus 15.2.7. -Original Message- From: Szabo, Istvan (Agoda) Sent: Monday, January 25, 2021 11:51 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: Multisite

[ceph-users] Re: Multisite bucket data inconsistency

2021-01-25 Thread Szabo, Istvan (Agoda)
Hmm, Looks like attached screenshots are not allowed, so: in HKG we have 19 million objects, in ash we have 32 million. -Original Message- From: Szabo, Istvan (Agoda) Sent: Monday, January 25, 2021 11:44 AM To: ceph-users@ceph.io Subject: [ceph-users] Multisite bucket data inconsistency

[ceph-users] Multisite bucket data inconsistency

2021-01-25 Thread Szabo, Istvan (Agoda)
Hi, We have bucket sync enabled and seems like it is inconsistent ☹ This is the master zone sync status on that specific bucket: realm 5fd28798-9195-44ac-b48d-ef3e95caee48 (realm) zonegroup 31a5ea05-c87a-436d-9ca0-ccfcbad481e3 (data) zone

[ceph-users] Re: [Suspicious newsletter] radosgw-admin realm pull from the secondary site fails "(13) Permission denied"

2021-01-23 Thread Szabo, Istvan (Agoda)
Hi, If you are using proxy, try to disable it. Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- On 2021. Jan 23.,

[ceph-users] RBD on windows

2021-01-20 Thread Szabo, Istvan (Agoda)
Hi, I'm looking at the SUSE documentation on their option for using RBD on Windows. I want to try it on a Windows Server 2019 VM, but I got this error: PS C:\Users\$admin$> rbd create image01 --size 4096 --pool windowstest -m 10.118.199.248,10.118.199.249,10.118.199.250 --id windowstest --keyring

[ceph-users] Re: [Suspicious newsletter] Re: .rgw.root was created wit a lot of PG

2021-01-15 Thread Szabo, Istvan (Agoda)
Which ceph version is this? Since Nautilus you can decrease pg numbers (or let pg-autoscaler do that for you). Quoting "Szabo, Istvan (Agoda)" : > Hi, &

[ceph-users] .rgw.root was created wit a lot of PG

2021-01-15 Thread Szabo, Istvan (Agoda)
Hi, Originally this pool was created with 512 PGs, which leaves a couple of OSDs with 500 PGs. What are the safe steps to copy this pool over? These are the files in this pool: default.realm period_config.f320e60d-8cff-4824-878e-c316423cc519 periods.18d63a25-8a50-4e17-9561-d452621f62fa.latest_epoch

[ceph-users] Re: radosgw-admin sync status takes ages to print output

2021-01-14 Thread Szabo, Istvan (Agoda)
lower object count. > > Thank you, > > Dominic L. Hilsbos, MBA > Director – Information Technology > Perform Air International Inc. > dhils...@performair.com > www.PerformAir.com > > > -Original Message- > From: Szabo, Istvan (Agoda) [mailto:istvan.sz...@agod

[ceph-users] Re: [Suspicious newsletter] Re: Centos 8 2021 with ceph, how to move forward?

2021-01-14 Thread Szabo, Istvan (Agoda)
a 1:1 binary compatible redhat fork due to the changes with Centos 8. Could be worth looking at. https://almalinux.org/ In our case we're using ceph on debian 10. -- David Majchrzak CTO Oderland Webbhotell AB Östra Hamngatan 50B, 411 09 Göteborg, SWEDEN Den 2021-01-14 kl. 09:04, skrev Szabo,

[ceph-users] Centos 8 2021 with ceph, how to move forward?

2021-01-14 Thread Szabo, Istvan (Agoda)
Hi, Just curious how you are moving forward with this CentOS 8 change. We just finished installing our full multisite cluster, and it looks like we need to change the operating system. So if you are using CentOS 8 with Ceph, I'm curious where you are going to move. Thank you

[ceph-users] Re: radosgw-admin sync status takes ages to print output

2021-01-14 Thread Szabo, Istvan (Agoda)
: [5,14,23,25,26,34,36,37,38,45,46,47,49,50,51,52,54,55,57,58,60,61,62,67,68,69,71,77,79,80,88,89,90,95,97,100,108,110,111,117,118,120,121,125,126] Sorry for the 2 email. -Original Message- From: Szabo, Istvan (Agoda) Sent: Thursday, January 14, 2021 12:57 PM To: ceph-users@ceph.io Subject: [ceph-users] radosgw-admin sync status takes ages to print output Email received from

[ceph-users] radosgw-admin sync status takes ages to print output

2021-01-14 Thread Szabo, Istvan (Agoda)
Hello, I have a 3 DC octopus multisite setup with a bucket sync policy applied. I have 2 buckets, one set to 24,000 shards and the other to 9,000, because the users want to use a single bucket with a huge number of objects (2,400,000,000 and 900,000,000), and in a multisite setup we need to preshard

[ceph-users] Re: Data migration between clusters

2020-12-23 Thread Szabo, Istvan (Agoda)
ers, > Kalle > > - Original Message - >> From: "Szabo, Istvan (Agoda)" >> To: "ceph-users" >> Sent: Thursday, 17 December, 2020 12:11:19 >> Subject: [ceph-users] Data migration between clusters > >> What is the easiest and best w

[ceph-users] Data migration between clusters

2020-12-17 Thread Szabo, Istvan (Agoda)
What is the easiest and best way to migrate a bucket from an old cluster to a new one? It is luminous to octopus; I'm not sure whether that matters from the data perspective.

[ceph-users] Weird ceph df

2020-12-15 Thread Szabo, Istvan (Agoda)
Hi, it is a Nautilus 14.2.13 Ceph cluster. The quota on the pool is 745GiB, so how can the stored data be 788GiB? (It is a 2-replica pool.) Based on the used column, it means just 334GiB is used, because the pool has 2 replicas only. I don't understand. POOLS: POOLID STORED OBJECTS

[ceph-users] Re: [Suspicious newsletter] Re: PG_DAMAGED

2020-12-04 Thread Szabo, Istvan (Agoda)
Hi, this is not necessarily, but most likely, a hint of a (slowly) failing disk. Check all OSDs for this PG for disk errors in dmesg and smartctl. Regards, Eugen Quoting "Szabo, Istvan (Agoda)" : > Hi, > > Not

[ceph-users] PG_DAMAGED

2020-12-04 Thread Szabo, Istvan (Agoda)
Hi, not sure whether it is related to my 15.2.7 update, but today I got this issue many times: 2020-12-04T15:14:23.910799+0700 osd.40 (osd.40) 11 : cluster [DBG] 11.2 deep-scrub starts 2020-12-04T15:14:23.947255+0700 osd.40 (osd.40) 12 : cluster [ERR] 11.2 soid

[ceph-users] Re: [Suspicious newsletter] Re: Unable to reshard bucket

2020-11-26 Thread Szabo, Istvan (Agoda)
Message- From: Eric Ivancich Sent: Wednesday, November 25, 2020 5:37 AM To: Szabo, Istvan (Agoda) Cc: ceph-users Subject: Re: [Suspicious newsletter] [ceph-users] Re: Unable to reshard bucket

[ceph-users] HA_proxy setup

2020-11-23 Thread Szabo, Istvan (Agoda)
Hi, I wonder whether anybody has a setup like the one I want to build. 1st subnet: 10.118.170.0/24 (FE users) 2nd subnet: 10.192.150.0/24 (BE users) The users come from these subnets, and I want the FE users to come in on the 1st interface of the load balancer, the BE users will come one

[ceph-users] Sizing radosgw and monitor

2020-11-23 Thread Szabo, Istvan (Agoda)
Hi, I haven't really found any documentation about how to size radosgw. One Red Hat doc says we need to decide the ratio, like 1:50 or 1:100 osd / rgw. I had an issue earlier where a user was load-balanced by source, so they always went to the same RADOS gateway, and one time it just maxed out. So the

[ceph-users] Re: [Suspicious newsletter] Re: Unable to reshard bucket

2020-11-21 Thread Szabo, Istvan (Agoda)
Seems like we need to plan this sharding carefully from the beginning. I'm thinking of setting the default shard number to the maximum, which is 64k, and leaving it as is, so we will never reach the limit unless we reach the maximum number of objects. It would be interesting to know what is the

[ceph-users] Weird ceph use case, is there any unknown bucket limitation?

2020-11-18 Thread Szabo, Istvan (Agoda)
Hi, I have a use case where the user would like to have 5 Buckets. Is it normal for Ceph, or is it just too much for me? The reason they want this level of granularity is that they might need to clean buckets for a specific subset and not affect others. Bucket format is this:

[ceph-users] Re: Ceph EC PG calculation

2020-11-18 Thread Szabo, Istvan (Agoda)
in warn and do it when it instructs. To be honest, I just want to be sure that my setup is correct and that I haven't missed something or done something wrong. -Original Message- From: Frank Schilder Sent: Wednesday, November 18, 2020 3:11 PM To: Szabo, Istvan (Agoda) ; ceph-users@ceph.io Subject: Re: Ceph EC PG

[ceph-users] EC overwrite

2020-11-18 Thread Szabo, Istvan (Agoda)
Hi, is it a problem if ec_overwrite is enabled in the data pool? https://docs.ceph.com/en/latest/rados/operations/erasure-code/#erasure-coding-with-overwrites Thanks

[ceph-users] Ceph EC PG calculation

2020-11-17 Thread Szabo, Istvan (Agoda)
Hi, I have 36 OSDs and get this error: Error ERANGE: pg_num 4096 size 6 would mean 25011 total pgs, which exceeds max 10500 (mon_max_pg_per_osd 250 * num_in_osds 42) If I want to calculate the max PGs in my server, how does it work if I have an EC pool? I have a 4:2 data EC pool, and the
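The arithmetic behind this error message can be sketched as follows. This is only an illustration using the numbers printed in the error, not the monitor's actual code. It assumes that an erasure-coded pool counts with size k + m (6 for a 4:2 profile) and that the 435 PG replicas not explained by the new pool (25011 - 4096 * 6) come from pools that already exist. The function names are hypothetical.

```python
def pg_budget(mon_max_pg_per_osd: int, num_in_osds: int) -> int:
    """Cluster-wide ceiling on PG replicas, as printed in the error."""
    return mon_max_pg_per_osd * num_in_osds


def projected_total(existing: int, pg_num: int, size: int) -> int:
    """PG replicas after adding a pool; for EC pools size = k + m."""
    return existing + pg_num * size


def largest_pg_num(budget: int, existing: int, size: int) -> int:
    """Largest power-of-two pg_num that still fits under the budget."""
    room = (budget - existing) // size
    if room < 1:
        return 0
    return 1 << (room.bit_length() - 1)


budget = pg_budget(250, 42)            # values taken from the error text
existing = 25011 - 4096 * 6            # assumed share of existing pools
print(budget, projected_total(existing, 4096, 6))
print(largest_pg_num(budget, existing, size=6))
```

Under these assumptions a 4:2 pool with pg_num 4096 needs 24,576 PG replicas on its own, more than twice the 10,500 budget, while pg_num 1024 would fit.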

[ceph-users] Multisite mechanism deeper understanding

2020-11-09 Thread Szabo, Istvan (Agoda)
Hi, a couple of questions came up that are not really documented anywhere; hopefully someone knows the answers: 1. Is there a way to see the replication queue? I want to create metrics, such as whether there is any delay in the replication. 2. Is the replication FIFO? 3. Actually how a replication
