Re: [ceph-users] cluster can't remapped objects after change crush tree

2018-04-27 Thread Igor Gajsin
Thanks a lot for your help.

Konstantin Shalygin writes:

> On 04/27/2018 05:05 PM, Igor Gajsin wrote:
>> I have a crush rule like
>
>
> You still can use device classes!
>
>
>> * host0 has a piece of data on osd.0
> Not a piece, a full object, if we are talking about non-EC pools.
>> * host1 has pieces of data on osd.1 and osd.2
> host1 has a copy on osd.1 *or* osd.2
>> * host2 has no data
> host2 will also have one copy of the object.
>
> Also do not forget: the hosts with half as many OSDs as host1 (i.e. host0 and
> host2) will do "double work" in comparison.
> You can minimize this impact by decreasing the OSD crush weights on host1.
>
>
>
>
>
> k


--
With best regards,
Igor Gajsin


Re: [ceph-users] cluster can't remapped objects after change crush tree

2018-04-27 Thread Konstantin Shalygin

On 04/27/2018 05:05 PM, Igor Gajsin wrote:

I have a crush rule like



You still can use device classes!
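(For reference, a quick way to check the available device classes and the
class-aware shadow tree, and to create a class-aware rule; the rule name here
is only an example:)

# ceph osd crush class ls
# ceph osd crush tree --show-shadow
# ceph osd crush rule create-replicated podshdd default pod hdd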



* host0 has a piece of data on osd.0

Not a piece, a full object, if we are talking about non-EC pools.

* host1 has pieces of data on osd.1 and osd.2

host1 has a copy on osd.1 *or* osd.2

* host2 has no data

host2 will also have one copy of the object.

Also do not forget: the hosts with half as many OSDs as host1 (i.e. host0 and
host2) will do "double work" in comparison.

You can minimize this impact by decreasing the OSD crush weights on host1.
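(A rough sketch of what that could look like, taking the hypothetical host1
with osd.1 and osd.2 from above; the target weights are only illustrative:)

# halve the crush weight of host1's two OSDs so the host's total weight
# roughly matches the single-OSD hosts
ceph osd crush reweight osd.1 0.45
ceph osd crush reweight osd.2 0.45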





k


Re: [ceph-users] cluster can't remapped objects after change crush tree

2018-04-27 Thread Igor Gajsin
Thanks, man. Thanks a lot. Now I understand. So, to be sure: if I have 3 hosts,
the replication factor is also 3, and I have a crush rule like:
{
    "rule_id": 0,
    "rule_name": "replicated_rule",
    "ruleset": 0,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -1,
            "item_name": "default"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}

then my data is replicated across hosts, not across OSDs, every host holds a
copy of the data, and a situation like:

* host0 has a piece of data on osd.0
* host1 has pieces of data on osd.1 and osd.2
* host2 has no data

is completely excluded?
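(One way to double-check where a given object actually lands; the pool and
object names below are only examples:)

# show the PG and the acting OSD set for one object
ceph osd map rbd some_object
# or list every PG in the pool with its acting set
ceph pg ls-by-pool rbd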

Konstantin Shalygin writes:

> On 04/27/2018 04:37 PM, Igor Gajsin wrote:
>> pool 7 'rbd' replicated size 3 min_size 2 crush_rule 0
>
>
> Your pool has the proper size setting: 3. But your crush tree has only 2
> buckets for this rule (i.e. your pods).
> To make this rule work you need a minimum of 3 'pod' buckets.
>
>
>
>
> k


--
With best regards,
Igor Gajsin


Re: [ceph-users] cluster can't remapped objects after change crush tree

2018-04-27 Thread Konstantin Shalygin

On 04/27/2018 04:37 PM, Igor Gajsin wrote:

pool 7 'rbd' replicated size 3 min_size 2 crush_rule 0



Your pool has the proper size setting: 3. But your crush tree has only 2
buckets for this rule (i.e. your pods).

To make this rule work you need a minimum of 3 'pod' buckets.
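(A sketch of what adding a third pod bucket could look like; the bucket and
host names are assumptions based on the tree posted earlier in the thread:)

ceph osd crush add-bucket group3 pod
ceph osd crush move group3 root=default
ceph osd crush move feather1 pod=group3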




k


Re: [ceph-users] cluster can't remapped objects after change crush tree

2018-04-27 Thread Igor Gajsin
# ceph osd pool ls detail
pool 1 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash 
rjenkins pg_num 32 pgp_num 32 last_change 958 lfor 0/909 flags hashpspool 
stripe_width 0 application cephfs
pool 2 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash 
rjenkins pg_num 8 pgp_num 8 last_change 954 flags hashpspool stripe_width 0 
application cephfs
pool 3 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash 
rjenkins pg_num 8 pgp_num 8 last_change 22 owner 18446744073709551615 flags 
hashpspool stripe_width 0 application rgw
pool 4 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 
object_hash rjenkins pg_num 8 pgp_num 8 last_change 24 owner 
18446744073709551615 flags hashpspool stripe_width 0 application rgw
pool 5 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash 
rjenkins pg_num 8 pgp_num 8 last_change 26 owner 18446744073709551615 flags 
hashpspool stripe_width 0 application rgw
pool 6 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash 
rjenkins pg_num 8 pgp_num 8 last_change 28 owner 18446744073709551615 flags 
hashpspool stripe_width 0 application rgw
pool 7 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins 
pg_num 64 pgp_num 64 last_change 1161 flags hashpspool stripe_width 0 
application rbd
removed_snaps [1~3]
pool 8 'kube' replicated size 3 min_size 2 crush_rule 3 object_hash rjenkins 
pg_num 128 pgp_num 128 last_change 1241 lfor 0/537 flags hashpspool 
stripe_width 0 application cephfs
removed_snaps [1~5,7~2]

crush rule 3 is:

# ceph osd crush rule dump podshdd
{
    "rule_id": 3,
    "rule_name": "podshdd",
    "ruleset": 3,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -2,
            "item_name": "default~hdd"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "pod"
        },
        {
            "op": "emit"
        }
    ]
}

Konstantin Shalygin writes:

> On 04/26/2018 11:30 PM, Igor Gajsin wrote:
>> After assigning this rule to a pool, it gets stuck in the same state:
>
>
> `ceph osd pool ls detail` please
>
>
>
>
> k


--
With best regards,
Igor Gajsin


Re: [ceph-users] cluster can't remapped objects after change crush tree

2018-04-27 Thread Konstantin Shalygin

On 04/26/2018 11:30 PM, Igor Gajsin wrote:

After assigning this rule to a pool, it gets stuck in the same state:



`ceph osd pool ls detail` please




k


Re: [ceph-users] cluster can't remapped objects after change crush tree

2018-04-26 Thread Igor Gajsin
Hi Konstantin, thanks a lot for your response.

> Your crush tree is imbalanced:

I did it deliberately. The group2 of my small-but-helpful ceph cluster
will also host the master nodes of my new small-but-helpful kubernetes
cluster. What I want to achieve is: there are 2 groups of nodes, and even
if one of them fails completely (during the k8s installation), the other
group will still contain a copy of the data.
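(As a side note, that placement goal, one copy in every pod even with only
two pods, could probably also be expressed as a hand-written rule instead of
a plain chooseleaf over pods; an untested sketch in crushmap text syntax,
with an arbitrary rule name and id:)

rule pods_mirror {
    id 4
    type replicated
    min_size 1
    max_size 10
    step take default class hdd
    step choose firstn 0 type pod
    step chooseleaf firstn 2 type host
    step emit
}

# workflow to edit and re-inject the map (file names are arbitrary)
ceph osd getcrushmap -o cm.bin
crushtool -d cm.bin -o cm.txt
# edit cm.txt, add the rule above, then:
crushtool -c cm.txt -o cm.new.bin
ceph osd setcrushmap -i cm.new.bin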

But, ok. Let's rebalance it for testing purposes:

ID  CLASS WEIGHT  TYPE NAME
 -1   3.63835 root default
 -9   1.81917 pod group1
 -3   0.90958 host feather0
  0   hdd 0.90958 osd.0
 -5   0.90959 host feather1
  1   hdd 0.90959 osd.1
-10   1.81918 pod group2
 -7   1.81918 host ds1
  2   hdd 0.90959 osd.2
  3   hdd 0.90959 osd.3

and add your rule

> ceph osd crush rule create-replicated podshdd default pod hdd

# ceph osd crush rule dump podshdd
{
    "rule_id": 3,
    "rule_name": "podshdd",
    "ruleset": 3,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -2,
            "item_name": "default~hdd"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "pod"
        },
        {
            "op": "emit"
        }
    ]
}

After assigning this rule to a pool, it gets stuck in the same state:

# ceph -s
  cluster:
id: 34b66329-b511-4d97-9e07-7b1a0a6879ef
health: HEALTH_WARN
3971/42399 objects misplaced (9.366%)

  services:
mon: 3 daemons, quorum feather0,feather1,ds1
mgr: ds1(active), standbys: feather1, feather0
mds: cephfs-1/1/1 up  {0=feather0=up:active}, 2 up:standby
osd: 4 osds: 4 up, 4 in; 128 remapped pgs
rgw: 3 daemons active

  data:
pools:   8 pools, 264 pgs
objects: 14133 objects, 49684 MB
usage:   143 GB used, 3582 GB / 3725 GB avail
pgs: 3971/42399 objects misplaced (9.366%)
 136 active+clean
 128 active+clean+remapped

  io:
client:   19441 B/s rd, 29673 B/s wr, 18 op/s rd, 18 op/s wr

And here is what is interesting. At first it complains about something like
"objects misplaced (23%)" and ceph health detail shows a lot of degraded pgs.
But then there are no pgs in its output:

# ceph health detail
HEALTH_WARN 3971/42399 objects misplaced (9.366%)
OBJECT_MISPLACED 3971/42399 objects misplaced (9.366%)

and the number of misplaced objects stops decreasing; it has stayed at 9.366%
for the last 30 minutes.

If I switch the crush rule back to default, the cluster returns to the HEALTH_OK state.
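(One way to confirm that the rule simply cannot find a third independent pod
is to test it offline against the current map; file names here are arbitrary:)

ceph osd getcrushmap -o cm.bin
# mappings that contain only two OSDs mean the rule ran out of 'pod' buckets
crushtool -i cm.bin --test --rule 3 --num-rep 3 --show-mappings | head
crushtool -i cm.bin --test --rule 3 --num-rep 3 --show-bad-mappings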

Konstantin Shalygin writes:

>> # ceph osd crush tree
>> ID  CLASS WEIGHT  TYPE NAME
>>   -1   3.63835 root default
>>   -9   0.90959 pod group1
>>   -5   0.90959 host feather1
>>1   hdd 0.90959 osd.1
>> -10   2.72876 pod group2
>>   -7   1.81918 host ds1
>>2   hdd 0.90959 osd.2
>>3   hdd 0.90959 osd.3
>>   -3   0.90958 host feather0
>>0   hdd 0.90958 osd.0
>>
>> And I've made a rule
>>
>> # ceph osd crush rule dump pods
>> {
>>  "rule_id": 1,
>>  "rule_name": "pods",
>>  "ruleset": 1,
>>  "type": 1,
>>  "min_size": 1,
>>  "max_size": 10,
>>  "steps": [
>>  {
>>  "op": "take",
>>  "item": -1,
>>  "item_name": "default"
>>  },
>>  {
>>  "op": "chooseleaf_firstn",
>>  "num": 0,
>>  "type": "pod"
>>  },
>>  {
>>  "op": "emit"
>>  }
>>  ]
>> }
>
>
> 1. Assign a device class to your crush rule:
>
> ceph osd crush rule create-replicated pods default pod hdd
>
> 2. Your crush tree is imbalanced:
>
> *good*:
>
> root:
>     host1:
>         - osd0
>     host2:
>         - osd1
>     host3:
>         - osd3
>
> *bad*:
>
> root:
>     host1:
>         - osd0
>     host2:
>         - osd1
>         - osd2
>         - osd3
>
>
>
>
> k


--
With best regards,
Igor Gajsin


Re: [ceph-users] cluster can't remapped objects after change crush tree

2018-04-25 Thread Konstantin Shalygin

# ceph osd crush tree
ID  CLASS WEIGHT  TYPE NAME
  -1   3.63835 root default
  -9   0.90959 pod group1
  -5   0.90959 host feather1
   1   hdd 0.90959 osd.1
-10   2.72876 pod group2
  -7   1.81918 host ds1
   2   hdd 0.90959 osd.2
   3   hdd 0.90959 osd.3
  -3   0.90958 host feather0
   0   hdd 0.90958 osd.0

And I've made a rule

# ceph osd crush rule dump pods
{
    "rule_id": 1,
    "rule_name": "pods",
    "ruleset": 1,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -1,
            "item_name": "default"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "pod"
        },
        {
            "op": "emit"
        }
    ]
}



1. Assign a device class to your crush rule:

ceph osd crush rule create-replicated pods default pod hdd
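(and then point the pool at the new rule; the pool name here is just an
example:)

ceph osd pool set rbd crush_rule pods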

2. Your crush tree is imbalanced:

*good*:

root:
    host1:
        - osd0
    host2:
        - osd1
    host3:
        - osd3

*bad*:

root:
    host1:
        - osd0
    host2:
        - osd1
        - osd2
        - osd3




k



[ceph-users] cluster can't remapped objects after change crush tree

2018-04-25 Thread Igor Gajsin
Hi, I've got stuck on a problem with a crush rule.
I have a small cluster with 3 nodes and 4 OSDs. I decided to split it
into 2 failure domains, so I made 2 buckets and put the hosts into those
buckets as described in this guide:
http://www.sebastien-han.fr/blog/2014/01/13/ceph-managing-crush-with-the-cli/
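(For completeness, the bucket setup was roughly along these lines,
reconstructed from the tree below, so treat the exact commands as approximate:)

ceph osd crush add-bucket group1 pod
ceph osd crush add-bucket group2 pod
ceph osd crush move group1 root=default
ceph osd crush move group2 root=default
ceph osd crush move feather1 pod=group1
ceph osd crush move ds1 pod=group2
ceph osd crush move feather0 pod=group2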

Finally, I ended up with a crush tree like this:

# ceph osd crush tree
ID  CLASS WEIGHT  TYPE NAME
 -1   3.63835 root default
 -9   0.90959 pod group1
 -5   0.90959 host feather1
  1   hdd 0.90959 osd.1
-10   2.72876 pod group2
 -7   1.81918 host ds1
  2   hdd 0.90959 osd.2
  3   hdd 0.90959 osd.3
 -3   0.90958 host feather0
  0   hdd 0.90958 osd.0

And I've made a rule

# ceph osd crush rule dump pods
{
    "rule_id": 1,
    "rule_name": "pods",
    "ruleset": 1,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -1,
            "item_name": "default"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "pod"
        },
        {
            "op": "emit"
        }
    ]
}

If I apply that rule to a pool, my cluster moves to:

# ceph -s
 cluster:
id: 34b66329-b511-4d97-9e07-7b1a0a6879ef
health: HEALTH_WARN
6/42198 objects misplaced (0.014%)

  services:
mon: 3 daemons, quorum feather0,feather1,ds1
mgr: ds1(active), standbys: feather1, feather0
mds: cephfs-1/1/1 up  {0=feather0=up:active}, 2 up:standby
osd: 4 osds: 4 up, 4 in; 64 remapped pgs
rgw: 3 daemons active

  data:
pools:   8 pools, 264 pgs
objects: 14066 objects, 49429 MB
usage:   142 GB used, 3582 GB / 3725 GB avail
pgs: 6/42198 objects misplaced (0.014%)
 200 active+clean
 64  active+clean+remapped

  io:
client:   1897 kB/s wr, 0 op/s rd, 11 op/s wr

And it's frozen in that state; self-healing doesn't occur, it is just stuck
with objects misplaced / pgs active+clean+remapped.

I think something is wrong with my rule, and the cluster can't move the
objects to rearrange them according to the new rule. I have missed something
and I have no idea what exactly. Any help would be appreciated.

--
With best regards,
Igor Gajsin