Re: [ceph-users] Backfill stops after a while after OSD reweight

2018-06-22 Thread Konstantin Shalygin
> Yes, I know that section of the docs, but can't find how to change the crush rules after "ceph osd crush tunables ...". Could you give me a hint? What do you mean? All you need after upgrading to Luminous is the two commands below. k
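That is, run in order on a fully upgraded cluster (note that both commands can trigger significant data movement, so plan for rebalancing):

    # move the CRUSH tunables profile to the current optimal defaults
    ceph osd crush tunables optimal
    # convert every straw bucket to the improved straw2 algorithm
    ceph osd crush set-all-straw-buckets-to-straw2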

Re: [ceph-users] separate monitoring node

2018-06-22 Thread Stefan Kooman
Quoting Reed Dier (reed.d...@focusvq.com): > > > On Jun 22, 2018, at 2:14 AM, Stefan Kooman wrote: > > > > Just checking here: Are you using the telegraf ceph plugin on the nodes? > > In that case you _are_ duplicating data. But the good news is that you > > don't need to. There is a Ceph mgr telegraf plugin now (mimic) which > > also works on luminous…
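As a sketch, switching from the per-node telegraf plugin to the mgr module would look roughly like this (the socket address and interval are illustrative, not taken from the thread):

    # enable the ceph-mgr telegraf module
    ceph mgr module enable telegraf
    # point it at a telegraf socket listener (example address)
    ceph telegraf config-set address udp://127.0.0.1:8094
    # report cluster-wide stats every 10 seconds, from one place
    ceph telegraf config-set interval 10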

Re: [ceph-users] Recovery after datacenter outage

2018-06-22 Thread Jason Dillaman
It sounds like your OpenStack users do not have the correct caps to blacklist dead clients. See step 6 in the upgrade section of Luminous’ release notes or (preferably) use the new “profile rbd”-style caps if you don’t use older clients. The reason why repairing the object map seemed to fix everything…
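A minimal sketch of the “profile rbd”-style caps Jason refers to, following the Luminous release notes (the client name and pool names are assumptions for illustration):

    # grant an OpenStack client caps that include blacklisting dead clients
    ceph auth caps client.cinder \
        mon 'profile rbd' \
        osd 'profile rbd pool=volumes, profile rbd pool=vms, profile rbd-read-only pool=images'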

Re: [ceph-users] Recovery after datacenter outage

2018-06-22 Thread Gregory Farnum
On Fri, Jun 22, 2018 at 2:26 AM Christian Zunker wrote: > Hi List, > > we are running a ceph cluster (12.2.5) as backend to our OpenStack cloud. > > Yesterday our datacenter had a power outage. As if this weren't enough, > we also ended up with a partitioned ceph cluster because of networking problems…

Re: [ceph-users] unfound blocks IO or gives IO error?

2018-06-22 Thread Gregory Farnum
On Fri, Jun 22, 2018 at 6:22 AM Sergey Malinin wrote: > From > http://docs.ceph.com/docs/mimic/rados/troubleshooting/troubleshooting-pg/ > : > > "Now 1 knows that these objects exist, but there is no live ceph-osd who > has a copy. In this case, IO to those objects will block, and the cluster > will hope that the failed node comes back soon…"
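For completeness, the same docs page covers inspecting unfound objects and, as a last resort, giving up on them; a sketch with an illustrative PG id:

    # list the unfound objects in a PG
    ceph pg 2.5 list_unfound
    # last resort: revert unfound objects to a prior version, or delete them
    ceph pg 2.5 mark_unfound_lost revert
    ceph pg 2.5 mark_unfound_lost delete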

Re: [ceph-users] separate monitoring node

2018-06-22 Thread Reed Dier
> On Jun 22, 2018, at 2:14 AM, Stefan Kooman wrote: > > Just checking here: Are you using the telegraf ceph plugin on the nodes? > In that case you _are_ duplicating data. But the good news is that you > don't need to. There is a Ceph mgr telegraf plugin now (mimic) which > also works on luminous…

Re: [ceph-users] unfound blocks IO or gives IO error?

2018-06-22 Thread Dan van der Ster
Thanks. So I'm going to continue looking for the cause of these IO errors. -- dan On Fri, Jun 22, 2018 at 3:22 PM Sergey Malinin wrote: > > From > http://docs.ceph.com/docs/mimic/rados/troubleshooting/troubleshooting-pg/ : > > "Now 1 knows that these objects exist, but there is no live ceph-osd who has a copy…"

Re: [ceph-users] unfound blocks IO or gives IO error?

2018-06-22 Thread Sergey Malinin
From http://docs.ceph.com/docs/mimic/rados/troubleshooting/troubleshooting-pg/ : "Now 1 knows that these objects exist, but there is no live ceph-osd who has a copy. In this case, IO to those objects will block, and the cluster…

[ceph-users] unfound blocks IO or gives IO error?

2018-06-22 Thread Dan van der Ster
Hi all, Quick question: does an IO to an unfound object result in an IO error, or should the IO block? During a jewel-to-luminous upgrade some PGs passed through a state with unfound objects for a few seconds, and this seems to match the times when we had a few IO errors on RBD-attached volumes.
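A quick way to check for this state while it is happening (the PG id below is illustrative):

    # health output calls out unfound objects explicitly
    ceph health detail
    # inspect the recovery state of a specific PG
    ceph pg 1.4 query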

Re: [ceph-users] Backfill stops after a while after OSD reweight

2018-06-22 Thread Oliver Schulz
Yes, I know that section of the docs, but I can't find how to change the crush rules after "ceph osd crush tunables ...". Could you give me a hint? Another question, if I may: would you recommend going from my ancient tunables to hammer directly (or even to jewel, if I can get the clients updated)…
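A hedged sketch of the stepwise approach, assuming you verify client compatibility first (the profile names are the standard ones):

    # see which feature levels the connected clients report
    ceph features
    # step the tunables profile forward; each step can move a lot of data
    ceph osd crush tunables hammer
    # later, once all clients are new enough:
    ceph osd crush tunables jewel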

Re: [ceph-users] CentOS Dojo at CERN

2018-06-22 Thread Willem Jan Withagen
On 21-6-2018 14:44, Dan van der Ster wrote: > On Thu, Jun 21, 2018 at 2:41 PM Kai Wagner wrote: >> On 20.06.2018 17:39, Dan van der Ster wrote: >>> And BTW, if you can't make it to this event we're in the early days of planning a dedicated Ceph + OpenStack Days at CERN around May/June 2019. More news…

Re: [ceph-users] separate monitoring node

2018-06-22 Thread Lenz Grimmer
On 06/20/2018 05:42 PM, Kevin Hrpcek wrote: > The ceph mgr dashboard is only enabled on the mgr daemons. I'm not > familiar with the mimic dashboard yet, but it is much more advanced than > luminous' dashboard and may have some alerting abilities built in. Not yet - see http://docs.ceph.com/docs/
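For reference, a sketch of turning the mimic dashboard on (the login credentials are placeholders):

    ceph mgr module enable dashboard
    ceph dashboard create-self-signed-cert
    ceph dashboard set-login-credentials admin <password>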

[ceph-users] Howto add another client user id to a cluster

2018-06-22 Thread Steffen Winther Sørensen
Anyone? We have ceph clients that we want to mount two cephfs filesystems, each from its own ceph cluster. Both clusters are standard, created with ceph-deploy, and each possibly only knows its own client.admin. How could we allow a new client id to access the second cluster, e.g. as admin2? On the ceph cli…
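One possible sketch, assuming a separate conf and keyring for the second cluster (all names, paths and caps below are illustrative, not from the thread):

    # on the 2nd cluster: create a client with admin-like caps
    ceph auth get-or-create client.admin2 \
        mon 'allow *' osd 'allow *' mds 'allow *' mgr 'allow *' \
        -o /etc/ceph/cluster2.client.admin2.keyring
    # on the client: mount the 2nd cluster's cephfs under that name
    mount -t ceph mon1.cluster2:6789:/ /mnt/fs2 \
        -o name=admin2,secretfile=/etc/ceph/admin2.secret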

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-22 Thread Andrei Mikhailovsky
Hi Brad, here is the output of the command (replaced the real auth key with [KEY]):
2018-06-22 10:47:27.659895 7f70ef9e6700 10 monclient: build_initial_monmap
2018-06-22 10:47:27.661995 7f70ef9e6700 10 monclient: init
2018-06-22 10:47:27.662002 7f70ef9e6700  5 adding auth proto…
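Separately from the auth output, the usual first step with an inconsistent PG is to dump the inconsistency report (PG id illustrative):

    # show exactly which shards/objects disagree
    rados list-inconsistent-obj 2.5 --format=json-pretty
    # then, if appropriate, ask the primary to repair
    ceph pg repair 2.5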

[ceph-users] Recovery after datacenter outage

2018-06-22 Thread Christian Zunker
Hi List, we are running a ceph cluster (12.2.5) as backend to our OpenStack cloud. Yesterday our datacenter had a power outage. As if this weren't enough, we also ended up with a partitioned ceph cluster because of networking problems. First of all, thanks a lot to the ceph developers. After the network was…

Re: [ceph-users] radosgw failover help

2018-06-22 Thread Konstantin Shalygin
> Has anyone done, or is working on, a way to do S3 (radosgw) failover? I am trying to work out a way to have 2 radosgw servers with a VIP, so when one server goes down it will fail over to the other. Maybe better: failover + load balancing? For example, nginx does this + TLS; see the sketch below. k
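A minimal sketch of the nginx approach Konstantin suggests, terminating TLS and falling back to a second radosgw (hostnames, ports and certificate paths are illustrative):

    upstream radosgw {
        server rgw1.example.com:7480;
        server rgw2.example.com:7480 backup;  # used when rgw1 is down
    }
    server {
        listen 443 ssl;
        ssl_certificate     /etc/nginx/s3.example.com.crt;
        ssl_certificate_key /etc/nginx/s3.example.com.key;
        location / {
            proxy_pass http://radosgw;
            proxy_set_header Host $host;
        }
    }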

Re: [ceph-users] separate monitoring node

2018-06-22 Thread Stefan Kooman
Quoting Denny Fuchs (linuxm...@4lin.net): > hi, > > > On 19.06.2018 at 17:17, Kevin Hrpcek wrote: > > > > # ceph auth get client.icinga > > exported keyring for client.icinga > > [client.icinga] > > key = > > caps mgr = "allow r" > > caps mon = "allow r" > that's the point: it's…
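For reference, a keyring like the one quoted could have been created with a one-liner of this shape (a sketch; the caps follow the quoted keyring):

    ceph auth get-or-create client.icinga mon 'allow r' mgr 'allow r'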