[ceph-users] Re: How's the maturity of CephFS and how's the maturity of Ceph erasure code?

2021-02-08 Thread Martin Verges
Hello Fred, from hundreds of installations, we can say it is production ready and working fine if deployed and maintained correctly. As always it depends, but it works for a huge number of use cases. -- Martin Verges Managing director Mobile: +49 174 9335695 E-Mail: martin.ver...@croit.io Chat:

[ceph-users] Re: reinstalling node with orchestrator/cephadm

2021-02-08 Thread Eugen Block
Good morning, > Thanks for sharing your results! Since we have multiple clusters and clusters with 500+ OSDs, this solution is not feasible for us. Yeah, sounds about right. ;-) Quoting Kenneth Waegeman: Hi Eugen, all, Thanks for sharing your results! Since we have multiple clusters

[ceph-users] Re: one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Eugen Block
Sorry, I had lost access to my emails. Setting those affected OSDs out would have been one of the next steps; great that it worked for you. Quoting Boris Behrens: I've outed osd.18 and osd.54 and let it sync for some time and now the problem is gone. *shrugs* Thank you for the hints. On

[ceph-users] How's the maturity of CephFS and how's the maturity of Ceph erasure code?

2021-02-08 Thread fanyuanli
Hi all, I'm a rookie with Ceph. I want to ask two questions. One is about the maturity of CephFS, Ceph's file system, and whether it is recommended for a production environment. The other is about the maturity of Ceph's erasure coding and whether it can be used in a production environment. Are the above two

[ceph-users] Re: db_devices doesn't show up in exported osd service spec

2021-02-08 Thread Tony Liu
Hi David, Could you show me an example of an OSD service spec YAML to work around it by specifying size? Thanks! Tony From: David Orman Sent: February 8, 2021 04:06 PM To: Tony Liu Cc: ceph-users@ceph.io Subject: Re: [ceph-users] Re: db_devices doesn't show
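
For reference, a minimal sketch of the kind of OSD service spec being asked about, using a size filter on db_devices. The service_id, host pattern and the 400G threshold are placeholders, not values taken from this thread:

    service_type: osd
    service_id: osd_with_db
    placement:
      host_pattern: '*'
    spec:
      data_devices:
        rotational: 1        # spinning disks carry the data
      db_devices:
        rotational: 0
        size: '400G:'        # only SSDs of 400 GB and larger become DB devices

Such a spec would be applied with ceph orch apply -i <file>.yaml.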

[ceph-users] Re: Speed of S3 Ceph gateways

2021-02-08 Thread Steven Pine
You'll want to review your rgw configuration on Ceph. How many gateways are you using? Are you using any sort of load balancer in front of them? Are you using Beast or Civetweb? On Mon, Feb 8, 2021 at 9:13 AM wrote: > Hi all, > > We are testing our S3 Ceph endpoints and we are not satisfied with
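
A quick way to check which HTTP frontend the gateways actually run; a sketch assuming config-db managed daemons, with the daemon name as a placeholder:

    # frontend setting as stored in the config database
    ceph config get client.rgw rgw_frontends
    # or, on a gateway host, ask a running daemon via its admin socket
    ceph daemon client.rgw.<name> config show | grep rgw_frontends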

[ceph-users] Re: Increasing QD=1 performance (lowering latency)

2021-02-08 Thread Paul Emmerich
A few things that you can try on the network side to shave off microseconds: 1) 10G Base-T has quite some latency compared to fiber or DAC. I've measured 2 µs on Base-T vs. 0.3 µs on fiber for one link in one direction, so that's 8 µs you can save for a round-trip if it's client -> switch -> osd
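
To make the arithmetic behind that figure explicit (my reading of the numbers above, not part of the original mail):

    round trip = client -> switch -> osd -> switch -> client = 4 link traversals
    savings    ≈ 4 x (2 µs - 0.3 µs) ≈ 6.8 µs, roughly the quoted 8 µs if the fiber latency is treated as negligible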

[ceph-users] Re: reinstalling node with orchestrator/cephadm

2021-02-08 Thread Martin Verges
Hello, you could switch to croit. We can take over existing clusters without much pain and then you have a single button to upgrade in the future ;) -- Martin Verges Managing director Mobile: +49 174 9335695 E-Mail: martin.ver...@croit.io Chat: https://t.me/MartinVerges croit GmbH,

[ceph-users] Re: one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Boris Behrens
I've outed osd.18 and osd.54 and let it sync for some time and now the problem is gone. *shrugs* Thank you for the hints. On Mon, 8 Feb 2021 at 14:46, Boris Behrens wrote: > Hi, > sure > > ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF > -1 672.68457 root default >

[ceph-users] Re: reinstalling node with orchestrator/cephadm

2021-02-08 Thread Kenneth Waegeman
Hi Eugen, all, Thanks for sharing your results! Since we have multiple clusters and clusters with 500+ OSDs, this solution is not feasible for us. In the meantime I created an issue for this: https://tracker.ceph.com/issues/49159 We would need this especially to migrate/reinstall all our

[ceph-users] Speed of S3 Ceph gateways

2021-02-08 Thread michal . strnad
Hi all, We are testing our S3 Ceph endpoints and we are not satisfied with their speed. Our results are somewhere around 120-150 MB/s, depending on smaller/bigger files. This is fine for a 1 Gbps connection, but not for 10GbE or more. We've tried the most recent versions of the AWS CLI, s3cmd,
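
For context on why 120-150 MB/s suggests a ~1 Gbps bottleneck somewhere in the path rather than a Ceph limit:

    1 Gbps line rate         = 10^9 bit/s / 8 ≈ 125 MB/s
    usable single-stream TCP ≈ 115-118 MB/s after Ethernet/IP/TCP overhead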

[ceph-users] Re: one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Boris Behrens
Hi, sure

    ID CLASS WEIGHT    TYPE NAME       STATUS REWEIGHT PRI-AFF
    -1       672.68457 root default
    -2        58.20561     host s3db1
    23   hdd  14.55269         osd.23      up      1.0     1.0
    69   hdd  14.55269         osd.69      up      1.0     1.0
    73   hdd  14.55269         osd.73

[ceph-users] Re: one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Eugen Block
Can you share 'ceph osd tree'? Are the weights of this OSD appropriate? I've seen stuck PGs because of OSD weight imbalance. Is the OSD in the correct subtree? Quoting Boris Behrens: Hi Eugen, I've set it to 0 but the "degraded objects" count does not go down. On Mon, 8 Feb 2021 at

[ceph-users] Re: one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Boris Behrens
Hi Eugen, I've set it to 0 but the "degraded objects" count does not go down. On Mon, 8 Feb 2021 at 14:23, Eugen Block wrote: > Hi, > > one option would be to decrease (or set to 0) the primary-affinity of > osd.14 and see if that brings the pg back. > > Regards, > Eugen > > -- Die

[ceph-users] Re: ceph monitors using 6-96GB RAM and crashing [nautilus]

2021-02-08 Thread Eugen Block
Hi Nico, what is your mon_memory_target? Quoting Nico Schottelius: Hello, we have recently moved our ceph monitors from small 4 GB RAM servers to big servers, because we saw memory pressure on the machines. However, even on our big machines (64 GB to 1 TB RAM) we are seeing ceph-mon
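
For reference, on Nautilus the monitor memory target can be inspected and changed centrally; a sketch, with the 4 GiB value only as an example:

    # what the monitors are configured to use
    ceph config get mon mon_memory_target
    # raise it, e.g. to 4 GiB
    ceph config set mon mon_memory_target 4294967296
    # or query one running daemon directly via its admin socket
    ceph daemon mon.<id> config get mon_memory_target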

[ceph-users] Re: one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Eugen Block
Hi, one option would be to decrease (or set to 0) the primary-affinity of osd.14 and see if that brings the pg back. Regards, Eugen Quoting Boris Behrens: Good day everyone, I've got an issue after rebooting an OSD node. It looks like there is one PG that does not sync back to the
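
The suggested change as a concrete command (osd.14 is taken from the thread; revert by setting the value back to 1):

    # stop osd.14 from being chosen as primary for its PGs
    ceph osd primary-affinity osd.14 0
    # undo later
    ceph osd primary-affinity osd.14 1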

[ceph-users] one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Boris Behrens
Good day everyone, I've got an issue after rebooting an OSD node. It looks like there is one PG that does not sync back to the other up OSDs. I've tried to restart the Ceph processes for all three OSDs, and when I stopped the one on osd.14 the PG went down. Any ideas what I can do? # ceph pg ls
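
Typical commands for narrowing down such a stuck PG; the PG id is a placeholder, not one from this thread:

    # list PGs that are not active+clean
    ceph pg ls | grep -v 'active+clean'
    # or only the degraded/undersized ones
    ceph pg ls degraded undersized
    # inspect the up/acting sets and recovery state of a single PG
    ceph pg <pgid> query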

[ceph-users] Re: share haproxy config for radosgw

2021-02-08 Thread Marc
Mine is this with Freddy's option added.

    frontend radosgw
        mode http
        bind abns@radosgw accept-proxy ssl crt xxx..xxx.pem
        http-request track-sc0 src table per_ip_rates
        http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }
        option forwardfor
        default_backend
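
The preview cuts off before the table that track-sc0 refers to; a plausible sketch of such a per_ip_rates backend (sizes and rate window are guesses, not Marc's actual values):

    backend per_ip_rates
        # dummy backend that only holds the rate-tracking stick table
        stick-table type ip size 1m expire 10m store http_req_rate(10s)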

[ceph-users] Re: MDS rejects clients causing hanging mountpoint on linux kernel client

2021-02-08 Thread Dan van der Ster
We found the vmcore and backtrace and created https://tracker.ceph.com/issues/49210 Cheers, Dan On Mon, Feb 8, 2021 at 10:58 AM Dan van der Ster wrote: > > Hi, > > Sorry to ping this old thread, but we have a few kernel client nodes > stuck like this after an outage on their network. > MDS's

[ceph-users] Re: share haproxy config for radosgw

2021-02-08 Thread Marc
Hi Freddy, Thanks for posting this. I went through these settings in the haproxy manual and was wondering why: - You have added http-server-close? Is that because rgw does not support keep-alives? (I don't know.) - Why did you add the option forwardfor? This is not logged anywhere in radosgw, is it? At
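
On the forwardfor question: HAProxy's option only adds an X-Forwarded-For header; radosgw will log the client address from it only if told to read that header. A hedged sketch, with the option name recalled from the rgw docs and worth verifying before use:

    # make radosgw take the client IP from the header haproxy injects
    ceph config set client.rgw rgw_remote_addr_param http_x_forwarded_for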

[ceph-users] Re: MDS rejects clients causing hanging mountpoint on linux kernel client

2021-02-08 Thread Dan van der Ster
Hi, Sorry to ping this old thread, but we have a few kernel client nodes stuck like this after an outage on their network. The MDSs are running v14.2.11 and the client has kernel 3.10.0-1127.19.1.el7.x86_64. This is the first time at our lab that clients didn't reconnect after a network issue (but