Re: [ceph-users] Bright new cluster get all pgs stuck in inactive

2019-01-29 Thread Paul Emmerich
Your CRUSH rule specifies to select 3 different chassis, but your CRUSH map defines no chassis. Add buckets of type chassis or change the rule to select hosts. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247
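A minimal sketch of the two usual fixes, assuming the default root "default"; the new rule name, pool name and host/chassis names below are illustrative only:

  # Option 1: switch the pool to a rule whose failure domain is host
  ceph osd crush rule create-replicated replicated_host default host
  ceph osd pool set mypool crush_rule replicated_host

  # Option 2: keep the chassis rule and add chassis buckets, then move
  # each host underneath its chassis
  ceph osd crush add-bucket chassis1 chassis
  ceph osd crush move chassis1 root=default
  ceph osd crush move node1 chassis=chassis1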

[ceph-users] OSDs stuck in preboot with log msgs about "osdmap fullness state needs update"

2019-01-29 Thread Subhachandra Chandra
Hello, I have a bunch of OSDs stuck in the preboot stage with the following log messages while recovering from an outage. The following flags are set on the cluster: nodown,noout,nobackfill,norebalance,norecover,noscrub,nodeep-scrub How do we get these OSDs back to an active state? Or
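A hedged sketch of clearing those flags once the cluster is ready to recover; order and timing depend on the situation, and the OSDs may still need newer osdmaps to propagate before they leave preboot:

  # Let OSDs be marked up/down again and allow recovery traffic
  ceph osd unset nodown
  ceph osd unset noout
  ceph osd unset norecover
  ceph osd unset nobackfill
  ceph osd unset norebalance
  # Re-enable scrubbing once recovery has settled
  ceph osd unset noscrub
  ceph osd unset nodeep-scrub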

Re: [ceph-users] Multisite Ceph setup sync issue

2019-01-29 Thread Casey Bodley
On Tue, Jan 29, 2019 at 12:24 PM Krishna Verma wrote: > > Hi Ceph Users, > > > > I need your help to fix a sync issue in a multisite setup. > > > > I have 2 clusters in different datacenters that we want to use for > bidirectional data replication. By following the documentation >

Re: [ceph-users] Bright new cluster get all pgs stuck in inactive

2019-01-29 Thread PHARABOT Vincent
Thanks for the quick reply. Here is the result: # ceph osd crush rule dump [ { "rule_id": 0, "rule_name": "replicated_rule", "ruleset": 0, "type": 1, "min_size": 1, "max_size": 10, "steps": [ { "op": "take",

Re: [ceph-users] Bright new cluster get all pgs stuck in inactive

2019-01-29 Thread PHARABOT Vincent
Sorry JC, here is the correct osd crush rule dump (type=chassis instead of host) # ceph osd crush rule dump [ { "rule_id": 0, "rule_name": "replicated_rule", "ruleset": 0, "type": 1, "min_size": 1, "max_size": 10, "steps": [ { "op": "take", "item": -1, "item_name": "default" }, { "op":

[ceph-users] Multisite Ceph setup sync issue

2019-01-29 Thread Krishna Verma
Hi Ceph Users, I need your help to fix a sync issue in a multisite setup. I have 2 clusters in different datacenters that we want to use for bidirectional data replication. By following the documentation http://docs.ceph.com/docs/master/radosgw/multisite/ I have set up the gateway on each site but when I
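When multisite sync is not progressing, the usual first diagnostics are the period and sync status on each gateway; a minimal sketch using standard radosgw-admin subcommands:

  # Run on a gateway node in each zone
  radosgw-admin period get      # confirm both zones see the same committed period
  radosgw-admin sync status     # metadata/data sync state against the peer zone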

Re: [ceph-users] Bright new cluster get all pgs stuck in inactive

2019-01-29 Thread Jean-Charles Lopez
Hi, I suspect your generated CRUSH rule is incorrect because of osd_crush_chooseleaf_type=2 and by default chassis buckets are not created. Changing the type of bucket to host (osd_crush_chooseleaf_type=1, which is the default when using old ceph-deploy or ceph-ansible) for your deployment should
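As a quick check before changing anything, the CRUSH hierarchy and the decompiled map show whether any chassis buckets exist at all; a sketch (file paths are illustrative, and osd_crush_chooseleaf_type only affects the rule generated at cluster creation):

  # With the default deployment there will only be root, host and osd entries
  ceph osd crush tree
  # Decompile the full map to inspect or edit the rule by hand
  ceph osd getcrushmap -o /tmp/crushmap
  crushtool -d /tmp/crushmap -o /tmp/crushmap.txt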

[ceph-users] Bright new cluster get all pgs stuck in inactive

2019-01-29 Thread PHARABOT Vincent
Hello, I have a bright new cluster with 2 pools, but the cluster keeps pgs in an inactive state. I have 3 OSDs and 1 Mon… all seems ok except that I cannot get pgs into the active+clean state! I might be missing something obvious but I really don't know what…. Could someone help me? I tried to seek answers
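For anyone hitting the same symptom, these standard commands narrow down why PGs stay inactive (nothing here is specific to this cluster):

  ceph -s                       # overall health and pg state summary
  ceph osd tree                 # are all OSDs up and in, and under which buckets?
  ceph pg dump_stuck inactive   # list the stuck pgs and their acting sets
  ceph osd crush rule dump      # what failure domain does the pool's rule select?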

[ceph-users] Best practice for increasing number of pg and pgp

2019-01-29 Thread Albert Yue
Dear Ceph Users, As the number of OSDs increases in our cluster, we have reached a point where pg/osd is lower than the recommended value and we want to increase it from 4096 to 8192. Somebody recommends that this adjustment should be done in multiple stages, e.g. increasing by 1024 PGs each time. Is this a good
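For reference, a hedged sketch of one such stage (pool name and step size are placeholders); pgp_num is normally raised to match after each pg_num step so the new PGs actually get remapped:

  ceph osd pool set rbd pg_num 5120
  ceph osd pool set rbd pgp_num 5120
  ceph -s   # wait for the cluster to return to active+clean before the next step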

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-29 Thread Alexandre DERUMIER
Hi, here are some new results, from a different osd on a different cluster. Before the osd restart latency was between 2-5ms, after the osd restart it is around 1-1.5ms. http://odisoweb1.odiso.net/cephperf2/bad.txt (2-5ms) http://odisoweb1.odiso.net/cephperf2/ok.txt (1-1.5ms) http://odisoweb1.odiso.net/cephperf2/diff.txt
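For anyone wanting to compare on their own cluster, the latency figures being discussed can be pulled with standard commands; a minimal sketch (the osd id is a placeholder):

  ceph osd perf                 # cluster-wide per-OSD commit/apply latency in ms
  ceph daemon osd.0 perf dump   # full counter dump from one OSD's admin socket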

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-29 Thread Stefan Priebe - Profihost AG
Hi, on 30.01.19 at 08:33 Alexandre DERUMIER wrote: > Hi, > > here are some new results, > different osd / different cluster > > before osd restart latency was between 2-5ms > after osd restart it is around 1-1.5ms > > http://odisoweb1.odiso.net/cephperf2/bad.txt (2-5ms) >

Re: [ceph-users] Best practice for increasing number of pg and pgp

2019-01-29 Thread Linh Vu
We use the ceph-gentle-split script from https://github.com/cernceph/ceph-scripts to slowly increase by 16 pgs at a time until we hit the target.
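The idea is roughly the loop sketched below; this is not the CERN script itself, just an illustration with placeholder pool name, step and target, and a crude "wait for HEALTH_OK" settle check:

  pool=rbd
  step=16
  target=8192
  cur=$(ceph osd pool get "$pool" pg_num | awk '{print $2}')
  while [ "$cur" -lt "$target" ]; do
      cur=$((cur + step)); [ "$cur" -gt "$target" ] && cur=$target
      ceph osd pool set "$pool" pg_num "$cur"
      ceph osd pool set "$pool" pgp_num "$cur"
      # wait for the cluster to settle before the next small step
      until ceph health | grep -q HEALTH_OK; do sleep 30; done
  done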

[ceph-users] Question regarding client-network

2019-01-29 Thread Buchberger, Carsten
Hi, it might be a dumb question - our ceph-cluster runs with dedicated client and cluster networks. I understand it like this: the client network is the network from which the client connections come (from the mon & osd perspective), regardless of the source IP address. So as long
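For reference, a sketch of how the two networks are usually declared and how to check what the daemons actually advertise (the subnets are placeholders):

  # Typical ceph.conf [global] entries:
  #   public network  = 192.168.10.0/24   # mon/client-facing traffic
  #   cluster network = 192.168.20.0/24   # OSD replication and heartbeats
  ceph mon dump | grep addr      # addresses the monitors advertise
  ceph osd dump | grep 'osd\.'   # each osd line shows its public and cluster address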

[ceph-users] Ceph metadata

2019-01-29 Thread F B
Hi, I'm looking for some details about the limits of the metadata used by Ceph. I found some restrictions from XFS: - Max total keys/values size: 64 kB - Max key size: 255 bytes Does Ceph have limits for this metadata? Thanks in advance! Fabien BELLEGO
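For experimenting with where such limits actually bite, object xattrs can be exercised directly with the rados tool; a minimal sketch (pool and object names are placeholders):

  rados -p testpool create testobj
  rados -p testpool setxattr testobj mykey myvalue
  rados -p testpool getxattr testobj mykey
  rados -p testpool listxattr testobj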

Re: [ceph-users] ceph-fs crashed after upgrade to 13.2.4

2019-01-29 Thread Yan, Zheng
Upgraded from which version? Have you tried downgrading ceph-mds to the old version? On Mon, Jan 28, 2019 at 9:20 PM Ansgar Jazdzewski wrote: > > hi folks we need some help with our cephfs, all mds keep crashing > > starting mds.mds02 at - > terminate called after throwing an instance of >

Re: [ceph-users] tuning ceph mds cache settings

2019-01-29 Thread Yan, Zheng
On Tue, Jan 29, 2019 at 9:05 PM Jonathan Woytek wrote: > > On Tue, Jan 29, 2019 at 7:12 AM Yan, Zheng wrote: >> >> Looks like you have 5 active mds. I suspect your issue is related to >> load balancer. Please try disabling mds load balancer (add >> "mds_bal_max = 0" to mds section of
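A hedged sketch of the pinning half of that advice; ceph.dir.pin is the standard CephFS virtual xattr, while the mount path, directories and ranks below are illustrative:

  # From a CephFS client, pin top-level directories to specific mds ranks
  setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/projects
  setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/scratch
  # A value of -1 removes the pin again
  setfattr -n ceph.dir.pin -v -1 /mnt/cephfs/scratch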

Re: [ceph-users] tuning ceph mds cache settings

2019-01-29 Thread Jonathan Woytek
Thanks. I'll give this a shot and we'll see what happens! jonathan On Tue, Jan 29, 2019 at 8:47 AM Yan, Zheng wrote: > On Tue, Jan 29, 2019 at 9:05 PM Jonathan Woytek > wrote: > > > > On Tue, Jan 29, 2019 at 7:12 AM Yan, Zheng wrote: > >> > >> Looks like you have 5 active mds. I suspect

Re: [ceph-users] ceph-fs crashed after upgrade to 13.2.4

2019-01-29 Thread Yan, Zheng
On Tue, Jan 29, 2019 at 8:30 PM Ansgar Jazdzewski wrote: > > Hi, > > we upgraded from 12.2.8 to 13.2.4 (ubuntu 16.04) > > - after the upgrade (~2 hours after the upgrade) the replay-mds kept > crashing, so we tried to restart all MDS, then the filesystem was in > 'failed' state and no MDS is in

[ceph-users] Luminous defaults and OpenStack

2019-01-29 Thread Smith, Eric
Hey folks - I'm using Luminous (12.2.10) and I was wondering if there's anything out of the box I need to change performance-wise to get the most out of OpenStack on Ceph. I'm running Rocky (deployed with Kolla) and running Ceph deployed via ceph-deploy. Any tips / tricks / gotchas are greatly
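Not an authoritative checklist, but the client-side RBD cache options are a common first thing to verify for OpenStack on Luminous; a sketch of where they live and how to read the effective values (the values shown are examples, not recommendations):

  # Typical entries in the [client] section of ceph.conf on the hypervisors:
  #   rbd cache = true
  #   rbd cache writethrough until flush = true
  # Show the effective values the client library would use:
  ceph --show-config | grep -E 'rbd_cache|rbd_default_features'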

Re: [ceph-users] tuning ceph mds cache settings

2019-01-29 Thread Yan, Zheng
On Fri, Jan 25, 2019 at 9:49 PM Jonathan Woytek wrote: > > Hi friendly ceph folks. A little while after I got the message asking for > some stats, we had a network issue that caused us to take all of our > processing offline for a few hours. Since we brought everything back up, I > have been

Re: [ceph-users] ceph-fs crashed after upgrade to 13.2.4

2019-01-29 Thread Ansgar Jazdzewski
Hi, we upgraded from 12.2.8 to 13.2.4 (ubuntu 16.04) - after the upgrade (~2 hours after the upgrade) the replay-mds kept crashing, so we tried to restart all MDS, then the filesystem was in 'failed' state and no MDS is in "active" state - we then tried to downgrade the MDS to 13.2.1 but had no

Re: [ceph-users] tuning ceph mds cache settings

2019-01-29 Thread Jonathan Woytek
On Tue, Jan 29, 2019 at 7:12 AM Yan, Zheng wrote: > Looks like you have 5 active mds. I suspect your issue is related to > load balancer. Please try disabling mds load balancer (add > "mds_bal_max = 0" to mds section of ceph.conf). and use 'export_pin' > to manually pin directories to mds >

Re: [ceph-users] cephfs constantly strays ( num_strays)

2019-01-29 Thread Yan, Zheng
Nothing to worry about. On Sun, Jan 27, 2019 at 10:13 PM Marc Roos wrote: > > > I constantly have strays. What are strays? Why do I have them? Is this > bad? > > > > [@~]# ceph daemon mds.c perf dump| grep num_stray > "num_strays": 25823, > "num_strays_delayed": 0, >