Re: [ceph-users] 1 mon unable to join the quorum

2018-04-04 Thread Brad Hubbard
See my latest update in the tracker.

On Sun, Apr 1, 2018 at 2:27 AM, Julien Lavesque
 wrote:
> The cluster was initially deployed using ceph-ansible with the infernalis
> release.
> For some unknown reason controller02 was out of the quorum, and we were
> unable to bring it back into the quorum.
>
> We have updated the cluster to jewel version using the rolling-update
> playbook from ceph-ansible
>
> Controller02 was still not in the quorum.
>
> We tried to delete the mon completely and add it again using the manual
> method of http://docs.ceph.com/docs/jewel/rados/operations/add-or-rm-mons/
> (with id controller02)
>
> The logs provided are from when controller02 was added with the manual
> method.
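>
> For reference, the manual steps we followed were roughly these (a sketch
> of the documented procedure; the temporary paths are just placeholders):
>
>   ceph auth get mon. -o /tmp/mon.keyring
>   ceph mon getmap -o /tmp/monmap
>   ceph-mon -i controller02 --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
>   ceph-mon -i controller02 --public-addr 172.18.8.6:6789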
>
> But controller02 still won't join the cluster.
>
> Hope it helps you understand the situation.
>
>
>
> On 31/03/2018 02:12, Brad Hubbard wrote:
>>
>> I'm not sure I completely understand your "test". What exactly are you
>> trying to achieve and what documentation are you following?
>>
>> On Fri, Mar 30, 2018 at 10:49 PM, Julien Lavesque
>>  wrote:
>>>
>>> Brad,
>>>
>>> Thanks for your answer
>>>
>>> On 30/03/2018 02:09, Brad Hubbard wrote:


 2018-03-19 11:03:50.819493 7f842ed47640  0 mon.controller02 does not
 exist in monmap, will attempt to join an existing cluster
 2018-03-19 11:03:50.820323 7f842ed47640  0 starting mon.controller02
 rank -1 at 172.18.8.6:6789/0 mon_data
 /var/lib/ceph/mon/ceph-controller02 fsid
 f37f31b1-92c5-47c8-9834-1757a677d020

 We are called 'mon.controller02' and we can not find our name in the
 local copy of the monmap.

 2018-03-19 11:03:52.346318 7f842735d700 10
 mon.controller02@-1(probing) e68  ready to join, but i'm not in the
 monmap or my addr is blank, trying to join

 Our name is not in the copy of the monmap we got from peer controller01
 either.
>>>
>>>
>>>
>>> During our test we completely deleted the controller02 monitor and added
>>> it again.
>>>
>>> The log you have is from when controller02 was added (so it wasn't in the
>>> monmap before).
>>>
>>>

 $ cat ../controller02-mon_status.log
 [root@controller02 ~]# ceph --admin-daemon
 /var/run/ceph/ceph-mon.controller02.asok mon_status
 {
 "name": "controller02",
 "rank": 1,
 "state": "electing",
 "election_epoch": 32749,
 "quorum": [],
 "outside_quorum": [],
 "extra_probe_peers": [],
 "sync_provider": [],
 "monmap": {
 "epoch": 71,
 "fsid": "f37f31b1-92c5-47c8-9834-1757a677d020",
 "modified": "2018-03-29 10:48:06.371157",
 "created": "0.00",
 "mons": [
 {
 "rank": 0,
 "name": "controller01",
 "addr": "172.18.8.5:6789\/0"
 },
 {
 "rank": 1,
 "name": "controller02",
 "addr": "172.18.8.6:6789\/0"
 },
 {
 "rank": 2,
 "name": "controller03",
 "addr": "172.18.8.7:6789\/0"
 }
 ]
 }
 }

 In the monmaps we are called 'controller02', not 'mon.controller02'.
 These names need to be identical.

>>>
>>> The cluster has been deployed using ceph-ansible with the servers'
>>> hostnames.
>>> All monitors are called mon.controller0x in the monmap, and all three
>>> monitors have the same configuration.
>>>
>>> We see the same behavior when creating a monmap from scratch:
>>>
>>> [root@controller03 ~]# monmaptool --create --add controller01
>>> 172.18.8.5:6789 --add controller02 172.18.8.6:6789 --add controller03
>>> 172.18.8.7:6789 --fsid f37f31b1-92c5-47c8-9834-1757a677d020 --clobber
>>> test-monmap
>>> monmaptool: monmap file test-monmap
>>> monmaptool: set fsid to f37f31b1-92c5-47c8-9834-1757a677d020
>>> monmaptool: writing epoch 0 to test-monmap (3 monitors)
>>>
>>> [root@controller03 ~]# monmaptool --print test-monmap
>>> monmaptool: monmap file test-monmap
>>> epoch 0
>>> fsid f37f31b1-92c5-47c8-9834-1757a677d020
>>> last_changed 2018-03-30 14:42:18.809719
>>> created 2018-03-30 14:42:18.809719
>>> 0: 172.18.8.5:6789/0 mon.controller01
>>> 1: 172.18.8.6:6789/0 mon.controller02
>>> 2: 172.18.8.7:6789/0 mon.controller03
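>>>
>>> For comparison, the monmap the running monitors actually use can be
>>> dumped and printed the same way (a sketch; this assumes the admin
>>> keyring is available on the node):
>>>
>>>   ceph mon getmap -o /tmp/current-monmap
>>>   monmaptool --print /tmp/current-monmap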
>>>
>>>

 On Thu, Mar 29, 2018 at 7:23 PM, Julien Lavesque
  wrote:
>
>
> Hi Brad,
>
> The results have been uploaded on the tracker
> (https://tracker.ceph.com/issues/23403)
>
> Julien
>
>
> On 29/03/2018 07:54, Brad Hubbard wrote:
>>
>>
>>
>> Can you update with the result of the following commands from all of
>> the
>> MONs?
>>
>> # ceph --admin-daemon /var/run/ceph/ceph-mon.[whatever].asok
>> mon_status

Re: [ceph-users] no rebalance when changing chooseleaf_vary_r tunable

2018-04-04 Thread Adrian
Hi Gregory

We were planning on going to chooseleaf_vary_r=4 so we could upgrade to
jewel now and schedule the change to 1 at a more suitable time, since we
were expecting a large rebalancing of objects (I should have mentioned that).

Good to know that there's a valid reason we didn't see any rebalance
though; it had me worried, so thanks for the info.

Regards,
Adrian.

On Thu, Apr 5, 2018 at 9:16 AM, Gregory Farnum  wrote:

> http://docs.ceph.com/docs/master/rados/operations/crush-map/#firefly-crush-tunables3
>
> "The optimal value (in terms of computational cost and correctness) is 1."
>
> I think you're just finding that the production cluster, with a much
> larger number of buckets, didn't ever run into the situation
> chooseleaf_vary_r is meant to resolve, so it didn't change any
> mappings by turning it on.
> -Greg
>
> On Wed, Apr 4, 2018 at 3:49 PM, Adrian  wrote:
> > Hi all,
> >
> > Was wondering if someone could enlighten me...
> >
> > I've recently been upgrading a small test cluster's tunables from bobtail
> > to firefly prior to doing the same with an old production cluster.
> >
> > OS is RHEL 7.4; the kernel in test is 3.10.0-693.el7.x86_64 everywhere;
> > in prod the admin box is 3.10.0-693.el7.x86_64 and all mons and osds are
> > 4.4.76-1.el7.elrepo.x86_64.
> >
> > Ceph version is 0.94.10-0.el7, both were installed with ceph-deploy
> 5.37-0
> >
> > The production system was originally Red Hat Ceph but was then changed to
> > the community edition (all prior to me being here); it has 189 osds
> > on 21 hosts with 5 mons.
> >
> > In test I changed chooseleaf_vary_r from 0, stepping it incrementally from
> > 5 down to 1; each change saw a larger rebalance than the last:
> >
> >|---+--+---|
> >| chooseleaf_vary_r | degraded | misplaced |
> >|---+--+---|
> >| 5 |   0% |0.187% |
> >| 4 |   1.913% |2.918% |
> >| 3 |   6.965% |   18.904% |
> >| 2 |  14.303% |   32.380% |
> >| 1 |  20.657% |   48.310% |
> >|---+--+---|
> >
> > As the change to 5 was so minimal we decided to jump from 0 to 4 in prod
> >
> > I performed the exact same steps on the production cluster and changed
> > chooseleaf_vary_r to 4; however, nothing happened, no rebalancing at all.
> >
> > Update was done with
> >
> >ceph osd getcrushmap -o crushmap-bobtail
> >crushtool -i crushmap-bobtail --set-chooseleaf-vary-r 4 -o
> > crushmap-firefly
> >ceph osd setcrushmap -i crushmap-firefly
> >
> > I also decompiled and diff'ed the maps on occasion to confirm changes; I'm
> > relatively new to ceph, better safe than sorry :-)
> >
> >
> > tunables in prod prior to any change were
> > {
> > "choose_local_tries": 0,
> > "choose_local_fallback_tries": 0,
> > "choose_total_tries": 50,
> > "chooseleaf_descend_once": 1,
> > "chooseleaf_vary_r": 0,
> > "straw_calc_version": 0,
> > "allowed_bucket_algs": 22,
> > "profile": "bobtail",
> > "optimal_tunables": 0,
> > "legacy_tunables": 0,
> > "require_feature_tunables": 1,
> > "require_feature_tunables2": 1,
> > "require_feature_tunables3": 0,
> > "has_v2_rules": 0,
> > "has_v3_rules": 0,
> > "has_v4_buckets": 0
> > }
> >
> > tunables in prod now show
> > {
> > "choose_local_tries": 0,
> > "choose_local_fallback_tries": 0,
> > "choose_total_tries": 50,
> > "chooseleaf_descend_once": 1,
> > "chooseleaf_vary_r": 4,
> > "straw_calc_version": 0,
> > "allowed_bucket_algs": 22,
> > "profile": "unknown",
> > "optimal_tunables": 0,
> > "legacy_tunables": 0,
> > "require_feature_tunables": 1,
> > "require_feature_tunables2": 1,
> > "require_feature_tunables3": 1,
> > "has_v2_rules": 0,
> > "has_v3_rules": 0,
> > "has_v4_buckets": 0
> > }
> >
> > for ref in test they are now
> > {
> > "choose_local_tries": 0,
> > "choose_local_fallback_tries": 0,
> > "choose_total_tries": 50,
> > "chooseleaf_descend_once": 1,
> > "chooseleaf_vary_r": 1,
> > "straw_calc_version": 0,
> > "allowed_bucket_algs": 22,
> > "profile": "firefly",
> > "optimal_tunables": 1,
> > "legacy_tunables": 0,
> > "require_feature_tunables": 1,
> > "require_feature_tunables2": 1,
> > "require_feature_tunables3": 1,
> > "has_v2_rules": 0,
> > "has_v3_rules": 0,
> > "has_v4_buckets": 0
> > }
> >
> > I'm worried that no rebalancing occurred - does anyone have any idea why?
> >
> > The goal here is to get ready to upgrade to jewel - does anyone see any
> > issues with the above info?
> >
> > Thanks in advance,
> > Adrian.
> >
> > --
> > ---
> > Adrian : aussie...@gmail.com
> > If violence doesn't solve your problem, you're not using enough of it.
> >
> > ___
> > ceph-users mailing list
> > 

Re: [ceph-users] no rebalance when changing chooseleaf_vary_r tunable

2018-04-04 Thread Gregory Farnum
http://docs.ceph.com/docs/master/rados/operations/crush-map/#firefly-crush-tunables3

"The optimal value (in terms of computational cost and correctness) is 1."

I think you're just finding that the production cluster, with a much
larger number of buckets, didn't ever run into the situation
chooseleaf_vary_r is meant to resolve, so it didn't change any
mappings by turning it on.
-Greg

On Wed, Apr 4, 2018 at 3:49 PM, Adrian  wrote:
> Hi all,
>
> Was wondering if someone could enlighten me...
>
> I've recently been upgrading a small test cluster's tunables from bobtail to
> firefly prior to doing the same with an old production cluster.
>
> OS is RHEL 7.4; the kernel in test is 3.10.0-693.el7.x86_64 everywhere; in
> prod the admin box is 3.10.0-693.el7.x86_64 and all mons and osds are
> 4.4.76-1.el7.elrepo.x86_64.
>
> Ceph version is 0.94.10-0.el7, both were installed with ceph-deploy 5.37-0
>
> The production system was originally Red Hat Ceph but was then changed to
> the community edition (all prior to me being here); it has 189 osds
> on 21 hosts with 5 mons.
>
> In test I changed chooseleaf_vary_r from 0, stepping it incrementally from
> 5 down to 1; each change saw a larger rebalance than the last:
>
>|---+--+---|
>| chooseleaf_vary_r | degraded | misplaced |
>|---+--+---|
>| 5 |   0% |0.187% |
>| 4 |   1.913% |2.918% |
>| 3 |   6.965% |   18.904% |
>| 2 |  14.303% |   32.380% |
>| 1 |  20.657% |   48.310% |
>|---+--+---|
>
> As the change to 5 was so minimal we decided to jump from 0 to 4 in prod
>
> I performed the exact same steps on the production cluster and changed
> chooseleaf_vary_r to 4; however, nothing happened, no rebalancing at all.
>
> Update was done with
>
>ceph osd getcrushmap -o crushmap-bobtail
>crushtool -i crushmap-bobtail --set-chooseleaf-vary-r 4 -o
> crushmap-firefly
>ceph osd setcrushmap -i crushmap-firefly
>
> I also decompiled and diff'ed the maps on occasion to confirm changes; I'm
> relatively new to ceph, better safe than sorry :-)
>
>
> tunables in prod prior to any change were
> {
> "choose_local_tries": 0,
> "choose_local_fallback_tries": 0,
> "choose_total_tries": 50,
> "chooseleaf_descend_once": 1,
> "chooseleaf_vary_r": 0,
> "straw_calc_version": 0,
> "allowed_bucket_algs": 22,
> "profile": "bobtail",
> "optimal_tunables": 0,
> "legacy_tunables": 0,
> "require_feature_tunables": 1,
> "require_feature_tunables2": 1,
> "require_feature_tunables3": 0,
> "has_v2_rules": 0,
> "has_v3_rules": 0,
> "has_v4_buckets": 0
> }
>
> tunables in prod now show
> {
> "choose_local_tries": 0,
> "choose_local_fallback_tries": 0,
> "choose_total_tries": 50,
> "chooseleaf_descend_once": 1,
> "chooseleaf_vary_r": 4,
> "straw_calc_version": 0,
> "allowed_bucket_algs": 22,
> "profile": "unknown",
> "optimal_tunables": 0,
> "legacy_tunables": 0,
> "require_feature_tunables": 1,
> "require_feature_tunables2": 1,
> "require_feature_tunables3": 1,
> "has_v2_rules": 0,
> "has_v3_rules": 0,
> "has_v4_buckets": 0
> }
>
> for ref in test they are now
> {
> "choose_local_tries": 0,
> "choose_local_fallback_tries": 0,
> "choose_total_tries": 50,
> "chooseleaf_descend_once": 1,
> "chooseleaf_vary_r": 1,
> "straw_calc_version": 0,
> "allowed_bucket_algs": 22,
> "profile": "firefly",
> "optimal_tunables": 1,
> "legacy_tunables": 0,
> "require_feature_tunables": 1,
> "require_feature_tunables2": 1,
> "require_feature_tunables3": 1,
> "has_v2_rules": 0,
> "has_v3_rules": 0,
> "has_v4_buckets": 0
> }
>
> I'm worried that no rebalancing occurred - does anyone have any idea why?
>
> The goal here is to get ready to upgrade to jewel - does anyone see any
> issues with the above info?
>
> Thanks in advance,
> Adrian.
>
> --
> ---
> Adrian : aussie...@gmail.com
> If violence doesn't solve your problem, you're not using enough of it.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph recovery kill VM's even with the smallest priority

2018-04-04 Thread Gregory Farnum
On Thu, Mar 29, 2018 at 3:17 PM Damian Dabrowski  wrote:

> Greg, thanks for your reply!
>
> I think your idea makes sense. I've done some tests and the behaviour is
> quite hard for me to understand; I'll try to explain my situation in a few
> steps below.
> I think Ceph shows progress in recovery, but it can only handle objects
> which haven't really changed. It won't try to repair objects which are
> really degraded, because of the norecover flag. Am I right?
> After a while I see blocked requests (as you can see below).
>

Yeah, so the implementation of this is a bit funky. Basically, when the OSD
gets a map specifying norecovery, it will prevent any new recovery ops from
starting once it processes that map. But it doesn't change the state of the
PGs out of recovery; they just won't queue up more work.

So probably the existing recovery IO was from OSDs that weren't up-to-date
yet. Or maybe there's a bug in the norecover implementation; it definitely
looks a bit fragile.

But really I just wouldn't use that command. It's an expert flag that you
shouldn't use except in some extreme wonky cluster situations (and even
those may no longer exist in modern Ceph). For the use case you shared in
your first email, I'd just stick with noout.
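
For the simple "take a few OSDs down for a while" case, the noout workflow
is just this (nothing here is specific to your cluster):

    ceph osd set noout
    # stop the OSDs, do the maintenance, start them again
    ceph osd unset noout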
-Greg


>
> - FEW SEC AFTER START OSD -
> # ceph status
> cluster 848b340a-be27-45cb-ab66-3151d877a5a0
>  health HEALTH_WARN
> 140 pgs degraded
> 1 pgs recovering
> 92 pgs recovery_wait
> 140 pgs stuck unclean
> recovery 942/5772119 objects degraded (0.016%)
> noout,nobackfill,norecover flag(s) set
>  monmap e10: 3 mons at
> {node-19=
> 172.31.0.2:6789/0,node-20=172.31.0.8:6789/0,node-21=172.31.0.6:6789/0}
> election epoch 724, quorum 0,1,2 node-19,node-21,node-20
>  osdmap e18727: 36 osds: 36 up, 30 in
> flags noout,nobackfill,norecover
>   pgmap v20851644: 1472 pgs, 7 pools, 8510 GB data, 1880 kobjects
> 25204 GB used, 17124 GB / 42329 GB avail
> 942/5772119 objects degraded (0.016%)
> 1332 active+clean
>   92 active+recovery_wait+degraded
>   47 active+degraded
>1 active+recovering+degraded
> recovery io 31608 kB/s, 4 objects/s
>   client io 73399 kB/s rd, 80233 kB/s wr, 1218 op/s
>
> - 1 MIN AFTER OSD START, RECOVERY STUCK, BLOCKED REQUESTS -
> # ceph status
> cluster 848b340a-be27-45cb-ab66-3151d877a5a0
>  health HEALTH_WARN
> 140 pgs degraded
> 1 pgs recovering
> 109 pgs recovery_wait
> 140 pgs stuck unclean
> 80 requests are blocked > 32 sec
> recovery 847/5775929 objects degraded (0.015%)
> noout,nobackfill,norecover flag(s) set
>  monmap e10: 3 mons at
> {node-19=
> 172.31.0.2:6789/0,node-20=172.31.0.8:6789/0,node-21=172.31.0.6:6789/0}
> election epoch 724, quorum 0,1,2 node-19,node-21,node-20
>  osdmap e18727: 36 osds: 36 up, 30 in
> flags noout,nobackfill,norecover
>   pgmap v20851812: 1472 pgs, 7 pools, 8520 GB data, 1881 kobjects
> 25234 GB used, 17094 GB / 42329 GB avail
> 847/5775929 objects degraded (0.015%)
> 1332 active+clean
>  109 active+recovery_wait+degraded
>   30 active+degraded < degraded objects count got stuck
>1 active+recovering+degraded
> recovery io 3743 kB/s, 0 objects/s < depending on when the command is run,
> this line shows 0 objects/s or is absent
>   client io 26521 kB/s rd, 64211 kB/s wr, 1212 op/s
>
> - FEW SECONDS AFTER UNSETTING FLAGS NOOUT, NORECOVERY, NOBACKFILL -
> # ceph status
> cluster 848b340a-be27-45cb-ab66-3151d877a5a0
>  health HEALTH_WARN
> 134 pgs degraded
> 134 pgs recovery_wait
> 134 pgs stuck degraded
> 134 pgs stuck unclean
> recovery 591/5778179 objects degraded (0.010%)
>  monmap e10: 3 mons at
> {node-19=
> 172.31.0.2:6789/0,node-20=172.31.0.8:6789/0,node-21=172.31.0.6:6789/0}
> election epoch 724, quorum 0,1,2 node-19,node-21,node-20
>  osdmap e18730: 36 osds: 36 up, 30 in
>   pgmap v20851909: 1472 pgs, 7 pools, 8526 GB data, 1881 kobjects
> 25252 GB used, 17076 GB / 42329 GB avail
> 591/5778179 objects degraded (0.010%)
> 1338 active+clean
>  134 active+recovery_wait+degraded
> recovery io 191 MB/s, 26 objects/s
>   client io 100654 kB/s rd, 184 MB/s wr, 6303 op/s
>
>
>
> 2018-03-29 18:22 GMT+02:00 Gregory Farnum :
> >
> > On Thu, Mar 29, 2018 at 7:27 AM Damian Dabrowski 
> wrote:
> >>
> >> Hello,
> >>
> >> Few days ago I had very strange situation.
> >>
> >> I had to turn off few OSDs for a while. So I've set flags:noout,
> >> nobackfill, 

[ceph-users] no rebalance when changing chooseleaf_vary_r tunable

2018-04-04 Thread Adrian
Hi all,

Was wondering if someone could enlighten me...

I've recently been upgrading a small test cluster's tunables from bobtail to
firefly prior to doing the same with an old production cluster.

OS is RHEL 7.4; the kernel in test is 3.10.0-693.el7.x86_64 everywhere; in
prod the admin box is 3.10.0-693.el7.x86_64 and all mons and osds are
4.4.76-1.el7.elrepo.x86_64.

Ceph version is 0.94.10-0.el7, both were installed with ceph-deploy 5.37-0

The production system was originally Red Hat Ceph but was then changed to
the community edition (all prior to me being here); it has 189 osds
on 21 hosts with 5 mons.

In test I changed chooseleaf_vary_r from 0, stepping it incrementally from
5 down to 1; each change saw a larger rebalance than the last:

   |---+--+---|
   | chooseleaf_vary_r | degraded | misplaced |
   |---+--+---|
   | 5 |   0% |0.187% |
   | 4 |   1.913% |2.918% |
   | 3 |   6.965% |   18.904% |
   | 2 |  14.303% |   32.380% |
   | 1 |  20.657% |   48.310% |
   |---+--+---|

As the change to 5 was so minimal we decided to jump from 0 to 4 in prod

I performed the exact same steps on the production cluster and changed
chooseleaf_vary_r to 4; however, nothing happened, no rebalancing at all.

Update was done with

   ceph osd getcrushmap -o crushmap-bobtail
   crushtool -i crushmap-bobtail --set-chooseleaf-vary-r 4 -o
crushmap-firefly
   ceph osd setcrushmap -i crushmap-firefly
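
If it helps, I can also compare the computed mappings offline with crushtool
(the rule id and replica count below are just what I'd assume for our pools):

   crushtool -i crushmap-bobtail --test --rule 0 --num-rep 3 --show-mappings > before
   crushtool -i crushmap-firefly --test --rule 0 --num-rep 3 --show-mappings > after
   diff before after | wc -l    # rough count of changed mappings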

I also decompiled and diff'ed the maps on occasion to confirm changes; I'm
relatively new to ceph, better safe than sorry :-)


tunables in prod prior to any change were
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 0,
"straw_calc_version": 0,
"allowed_bucket_algs": 22,
"profile": "bobtail",
"optimal_tunables": 0,
"legacy_tunables": 0,
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"require_feature_tunables3": 0,
"has_v2_rules": 0,
"has_v3_rules": 0,
"has_v4_buckets": 0
}

tunables in prod now show
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 4,
"straw_calc_version": 0,
"allowed_bucket_algs": 22,
"profile": "unknown",
"optimal_tunables": 0,
"legacy_tunables": 0,
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"require_feature_tunables3": 1,
"has_v2_rules": 0,
"has_v3_rules": 0,
"has_v4_buckets": 0
}

for ref in test they are now
{
"choose_local_tries": 0,
"choose_local_fallback_tries": 0,
"choose_total_tries": 50,
"chooseleaf_descend_once": 1,
"chooseleaf_vary_r": 1,
"straw_calc_version": 0,
"allowed_bucket_algs": 22,
"profile": "firefly",
"optimal_tunables": 1,
"legacy_tunables": 0,
"require_feature_tunables": 1,
"require_feature_tunables2": 1,
"require_feature_tunables3": 1,
"has_v2_rules": 0,
"has_v3_rules": 0,
"has_v4_buckets": 0
}

I'm worried that no rebalancing occurred - does anyone have any idea why?

The goal here is to get ready to upgrade to jewel - does anyone see any
issues with the above info?

Thanks in advance,
Adrian.

-- 
---
Adrian : aussie...@gmail.com
If violence doesn't solve your problem, you're not using enough of it.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy: recommended?

2018-04-04 Thread Brady Deetz
We use ceph-deploy in production. That said, our crush map is getting more
complex and we are starting to make use of other tooling as that occurs.
But we still use ceph-deploy to install ceph and bootstrap OSDs.
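
For what it's worth, the flow we use is basically the standard bootstrap
sequence (a sketch with made-up hostnames and devices, using the ceph-deploy
2.x syntax):

    ceph-deploy new mon1 mon2 mon3
    ceph-deploy install --release luminous mon1 mon2 mon3 osd1
    ceph-deploy mon create-initial
    ceph-deploy admin mon1 mon2 mon3 osd1
    ceph-deploy osd create --data /dev/sdb osd1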

On Wed, Apr 4, 2018, 1:58 PM Robert Stanford 
wrote:

>
>  I read a couple of versions ago that ceph-deploy was not recommended for
> production clusters.  Why was that?  Is this still the case?  We have a lot
> of problems automating deployment without ceph-deploy.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph performance falls as data accumulates

2018-04-04 Thread Gregory Farnum
On Mon, Apr 2, 2018 at 11:18 AM Robert Stanford 
wrote:

>
>  This is a known issue as far as I can tell, I've read about it several
> times.  Ceph performs great (using radosgw), but as the OSDs fill up
> performance falls sharply.  I am down to half of the empty-cluster
> performance at about 50% disk usage.
>
>  My questions are: does adding more OSDs / disks to the cluster restore
> performance?  (Is it the absolute number of objects that degrades
> performance, or % occupancy?)  And, will the performance level off at some
> point, or will it continue to get lower and lower as our disks fill?
>

There are two dimensions of interest for you here:
1) the speed of your bucket index,
2) the speed of the OSD store for reading/writing actual data.

(2) should be better with bluestore than filestore AFAIK, and will
definitely improve as you add more OSDs to the cluster.
(1) will depend more on the amount of bucket index sharding and the number
of buckets you're working with.
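
If you want a quick look at where you stand on (1), the per-bucket shard
count and fill level are visible from the admin tool (a sketch; the uid and
bucket name are placeholders, and "bucket limit check" needs a reasonably
recent radosgw-admin):

    radosgw-admin bucket limit check --uid=someuser
    radosgw-admin bucket stats --bucket=somebucket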
-Greg


>
>  Thank you
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how the files in /var/lib/ceph/osd/ceph-0 are generated

2018-04-04 Thread Gregory Farnum
On Tue, Apr 3, 2018 at 6:30 PM Jeffrey Zhang <
zhang.lei.fly+ceph-us...@gmail.com> wrote:

> I am testing ceph Luminous, the environment is
>
> - centos 7.4
> - ceph luminous ( ceph offical repo)
> - ceph-deploy 2.0
> - bluestore + separate wal and db
>
> I found that the ceph osd folder `/var/lib/ceph/osd/ceph-0` is mounted
> from tmpfs. But where do the files in that folder come from, like `keyring`
> and `whoami`?
>

These are generated as part of the initialization process. I don't know the
exact commands involved, but the keyring for instance will draw from the
results of "ceph osd new" (which is invoked by one of the ceph-volume setup
commands). That and whoami are part of the basic information an OSD needs
to communicate with a monitor.
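
If it helps, my understanding (which may be off in the details) is that the
tmpfs directory is simply repopulated every time the OSD is activated, so you
can watch it being recreated with something like this (made-up OSD id and a
placeholder fsid):

    ceph-volume lvm activate 0 <osd-fsid>
    # or, for every OSD prepared on the node:
    ceph-volume lvm activate --all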
-Greg


>
> $ ls -alh /var/lib/ceph/osd/ceph-0/
> lrwxrwxrwx.  1 ceph ceph   24 Apr  3 16:49 block ->
> /dev/ceph-pool/osd0.data
> lrwxrwxrwx.  1 root root   22 Apr  3 16:49 block.db ->
> /dev/ceph-pool/osd0-db
> lrwxrwxrwx.  1 root root   23 Apr  3 16:49 block.wal ->
> /dev/ceph-pool/osd0-wal
> -rw---.  1 ceph ceph   37 Apr  3 16:49 ceph_fsid
> -rw---.  1 ceph ceph   37 Apr  3 16:49 fsid
> -rw---.  1 ceph ceph   55 Apr  3 16:49 keyring
> -rw---.  1 ceph ceph6 Apr  3 16:49 ready
> -rw---.  1 ceph ceph   10 Apr  3 16:49 type
> -rw---.  1 ceph ceph2 Apr  3 16:49 whoami
>
> I guess they may be loaded from bluestore, but I cannot find any clue about
> this.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy: recommended?

2018-04-04 Thread ceph


On 4 April 2018 20:58:19 CEST, Robert Stanford wrote:
>I read a couple of versions ago that ceph-deploy was not recommended
>for
>production clusters.  Why was that?  Is this still the case?  We have a
I cannot imagine that. I have used it now for a few versions prior to 2.0 and
it works great. We use it in production.

- Mehmet 
>lot
>of problems automating deployment without ceph-deploy.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-deploy: recommended?

2018-04-04 Thread Robert Stanford
 I read a couple of versions ago that ceph-deploy was not recommended for
production clusters.  Why was that?  Is this still the case?  We have a lot
of problems automating deployment without ceph-deploy.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Use trimfs on already mounted RBD image

2018-04-04 Thread Damian Dabrowski
Hello,

I wonder if there is any way to run `trimfs` on an RBD image which is currently
used by a KVM process (when I don't have access to the VM)?

I know that I can do this via qemu-guest-agent, but not all VMs have it
installed.
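
(For the VMs that do have the agent, I mean something along the lines of the
libvirt helper, which just asks the agent to trim the guest filesystems:

    virsh domfstrim <domain>

but that obviously doesn't cover the rest.)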

I can't use rbdmap either, because most images don't have distributed
filesystems; it's mostly ext4/xfs, so I can't mount them in two places at the
same time.

I would be grateful for any help.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] radosgw: can't delete bucket

2018-04-04 Thread Micha Krause

Hi,

I have a bucket with multiple broken multipart uploads which can't be aborted.

radosgw-admin bucket check shows thousands of _multipart_ objects;
unfortunately the --fix and --check-objects options don't change anything.
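
For reference, what I ran was along these lines (bucket name shortened):

radosgw-admin --id radosgw.rgw bucket check --bucket=my-bucket --fix --check-objects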

I decided to get rid of the bucket completely, but even this command:

radosgw-admin --id radosgw.rgw bucket rm --bucket my-bucket 
--inconsistent-index --yes-i-really-mean-it --bypass-gc --purge-objects

is not able to do it, and shows:

7fe7cdceab80 -1 ERROR: unable to remove bucket(2) No such file or directory

--debug-ms=1 shows that the error occurs after:

2018-04-04 15:59:58.884901 7f6f500c5700  1 -- 10.210.64.16:0/772895100 <== 
osd.458 10.210.34.34:6804/2673 1  osd_op_reply(13121
  
default.193458319.2__multipart_my/object.2~p8vFLOvgFKMGOSLeEc8WGT2Bgw5ehg9.meta 
[omap-get-vals] v0'0 uv0 ondisk = -2 ((2) No such file or directory))
  v8  346+0+0 (2461758253 0 0) 0x7f6f48268ba0 con 0x5626ec47ee20

And indeed when I try to get the object with the rados get command, it also shows 
"No such file or directory"

I tried to

rados put -p .rgw.buckets 
default.193458319.2__multipart_my/object.2~p8vFLOvgFKMGOSLeEc8WGT2Bgw5ehg9.meta 
emptyfile

in its place, but the error stays the same.

Any ideas how I can get rid of my bucket?


Micha Krause
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph scrub logs: _scan_snaps no head for $object?

2018-04-04 Thread Marc Roos


FYI, I have these too on a test cluster, upgraded from Kraken.

Apr  4 14:48:23 c01 ceph-osd: 2018-04-04 14:48:23.347040 7f9a39c05700 -1 
osd.8 pg_epoch: 19002 pg[17.32( v 19002'2523501 
(19002'2521979,19002'2523501] local-lis/les=18913/18914 n=3600 
ec=3636/3636 lis/c 18913/18913 les/c/f 18914/18914/0 18913/18913/18899) 
[10,5,8] r=2 lpr=18913 luod=0'0 crt=19002'2523501 lcod 19002'2523500 
active] _scan_snaps no head for 
17:4ffb808b:::rbd_data.239f7274b0dc51.071f:1d (have MIN)
Apr  4 14:48:23 c01 ceph-osd: 2018-04-04 14:48:23.347067 7f9a39c05700 -1 
osd.8 pg_epoch: 19002 pg[17.32( v 19002'2523501 
(19002'2521979,19002'2523501] local-lis/les=18913/18914 n=3600 
ec=3636/3636 lis/c 18913/18913 les/c/f 18914/18914/0 18913/18913/18899) 
[10,5,8] r=2 lpr=18913 luod=0'0 crt=19002'2523501 lcod 19002'2523500 
active] _scan_snaps no head for 
17:4ffb808b:::rbd_data.239f7274b0dc51.071f:14 (have MIN)
Apr  4 14:48:23 c01 ceph-osd: 2018-04-04 14:48:23.347067 7f9a39c05700 -1 
osd.8 pg_epoch: 19002 pg[17.32( v 19002'2523501 
(19002'2521979,19002'2523501] local-lis/les=18913/18914 n=3600 
ec=3636/3636 lis/c 18913/18913 les/c/f 18914/18914/0 18913/18913/18899) 
[10,5,8] r=2 lpr=18913 luod=0'0 crt=19002'2523501 lcod 19002'2523500 
active] _scan_snaps no head for 
17:4ffb808b:::rbd_data.239f7274b0dc51.071f:14 (have MIN)

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] User deletes bucket with partial multipart uploads in, objects still in quota

2018-04-04 Thread Matthew Vernon
On 04/04/18 10:30, Matthew Vernon wrote:
> Hi,
> 
> We have an rgw user who had a bunch of partial multipart uploads in a
> bucket, which they then deleted. radosgw-admin bucket list doesn't show
> the bucket any more, but user stats --sync-stats still has (I think)
> the contents of that bucket counted against the user's quota.
> 
> So, err, how do I cause a) the user's quota usage to not include this
> deleted bucket, and b) the associated storage to actually be cleared (since I
> infer the failure to do so is causing the quota issue)?

Sorry, should have said: this is running jewel.

Regards,

Matthew


-- 
 The Wellcome Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] amount of PGs/pools/OSDs for your openstack / Ceph

2018-04-04 Thread Osama Hasebou
Hi Everyone, 

I would like to know what kind of setup the Ceph community has been using for
their OpenStack Ceph configuration when it comes to the number of pools, OSDs,
and their PGs.

The Ceph documentation briefly mentions it for small cluster sizes, and I would
like to know from your experience how many PGs you have created in practice for
your OpenStack pools, for a Ceph cluster in the range of 1-2 PB capacity or
400-600 OSDs, that performs well without issues.
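
(For context, the rule of thumb I've seen in the docs and pgcalc is roughly
total PGs ~= (number of OSDs * 100) / replica size, rounded to a nearby power
of two and then split across the pools; e.g. 600 OSDs * 100 / 3 is about
20000, which lands on 16384 or 32768 depending on how you round. I'd like to
hear what people actually run.)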

Hope to hear from you! 

Thanks. 

Regards, 
Ossi 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] User deletes bucket with partial multipart uploads in, objects still in quota

2018-04-04 Thread Matthew Vernon
Hi,

We have an rgw user who had a bunch of partial multipart uploads in a
bucket, which they then deleted. radosgw-admin bucket list doesn't show
the bucket any more, but user stats --sync-stats still has (I think)
the contents of that bucket counted against the user's quota.
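
(Concretely, the numbers I'm looking at come from something like
radosgw-admin user stats --uid=<user> --sync-stats, which still reports the
old usage.)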

So, err, how do I cause a) the user's quota usage to not include this
deleted bucket, and b) the associated storage to actually be cleared (since I
infer the failure to do so is causing the quota issue)?

Thanks,

Matthew


-- 
 The Wellcome Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rados bucket issues, default.rgw.buckets.index growing every day

2018-04-04 Thread Mark Schouten
On Wed, 2018-04-04 at 09:38 +0200, Mark Schouten wrote:
> I have some issues with my bucket index. As you can see in the
> attachment, every day around 16:30 the number of objects in the
> default.rgw.buckets.index increases. This has been happening since upgrading
> from 12.2.2 to 12.2.4.

It seems there is still a resharding action running, which I cannot
cancel:

root@osdnode03:~# radosgw-admin reshard cancel --uid='DB0220$elasticsearch' --tenant=DB0220 --bucket=backups
Error in getting bucket backups: (2) No such file or directory
2018-04-04 10:07:03.905049 7f105ee0fcc0 -1 ERROR: failed to get entry
from reshard log, oid=reshard.10 tenant= bucket=backups

I think the tenant is empty because of
https://github.com/oritwas/ceph/blob/0a2142e83b58fa8e238bcb748d1cb97bdba674c5/src/rgw/rgw_admin.cc#L5755

What do you think?

-- 
Kerio Operator in de Cloud? https://www.kerioindecloud.nl/
Mark Schouten  | Tuxis Internet Engineering
KvK: 61527076  | http://www.tuxis.nl/
T: 0318 200208 | i...@tuxis.nl

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Rados bucket issues, default.rgw.buckets.index growing every day

2018-04-04 Thread Mark Schouten
Hi,

I have some issues with my bucket index. As you can see in the
attachment, every day around 16:30 the number of objects in
default.rgw.buckets.index increases. This has been happening since upgrading
from 12.2.2 to 12.2.4.

There is no real activity at that moment, but the logs do show the
following:

rgw.log.1.gz:2018-04-03 16:28:24.594527 7fad2e13d700  0
check_bucket_shards: resharding needed:
stats.num_objects=18446744073709551539 shard max_objects=10
rgw.log.2.gz:2018-04-02 16:28:19.100159 7fad2e13d700  0
check_bucket_shards: resharding needed:
stats.num_objects=18446744073709551539 shard max_objects=10
rgw.log.3.gz:2018-04-01 16:28:09.051291 7fad2e13d700  0
check_bucket_shards: resharding needed:
stats.num_objects=18446744073709551539 shard max_objects=10
rgw.log.4.gz:2018-03-31 16:27:59.205853 7fad2e13d700  0
check_bucket_shards: resharding needed:
stats.num_objects=18446744073709551539 shard max_objects=10
rgw.log.5.gz:2018-03-30 16:27:42.812917 7fad2e13d700  0
check_bucket_shards: resharding needed:
stats.num_objects=18446744073709551539 shard max_objects=10
rgw.log.6.gz:2018-03-29 16:26:51.298205 7fad2e13d700  0
check_bucket_shards: resharding needed:
stats.num_objects=18446744073709551539 shard max_objects=10
rgw.log.7.gz:2018-03-28 06:25:24.250243 7f646026a700  0
check_bucket_shards: resharding needed: stats.num_objects=134612 shard
max_objects=10
rgw.log.7.gz:2018-03-28 16:26:45.095042 7fad2e13d700  0
check_bucket_shards: resharding needed:
stats.num_objects=18446744073709551539 shard max_objects=10


I have dynamic bucketsharding enabled, and there is a reshard entry,
but because I use tenant and rados had a bug where it didn't add tenant
to the reshard-log, that entry is invalid. The bucket it mentions
doesn't even exist anymore.

So: how do I find out who is creating all these entries in the index,
and why? Secondly, how do I remove the resharding entry for a bucket
that does not exist? I have a strong feeling that the two questions are
very much related, because of https://tracker.ceph.com/issues/22046

Thanks,

Mark

-- 
Kerio Operator in de Cloud? https://www.kerioindecloud.nl/
Mark Schouten  | Tuxis Internet Engineering
KvK: 61527076  | http://www.tuxis.nl/
T: 0318 200208 | i...@tuxis.nl

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com