Mike:
Thanks for the reply.
However, I ran the crushtool command, but the output doesn't give me any obvious
explanation for why osd.4 should be the primary OSD for the PGs.
All three rules have the same "step chooseleaf firstn 0 type host" step. According
to the Ceph documentation, each PG should then select two buckets of type host,
and all the OSDs have the same weight, type, etc. So why would every PG choose
osd.4 as its primary OSD?
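(As a side check - assuming 'ceph pg map' behaves the same way on 0.67 - I can also
query an individual PG to confirm what the dump shows, e.g.:

ceph pg map 0.1

which should print the up and acting sets for that PG.)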
Here is the content of my crush map.
****************************************************************************************************
# begin crush map
# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
# types
type 0 osd
type 1 host
type 2 rack
type 3 row
type 4 room
type 5 datacenter
type 6 root
# buckets
host gbl10134201 {
        id -2           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0          # rjenkins1
        item osd.4 weight 0.000
}
host gbl10134202 {
        id -3           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0          # rjenkins1
        item osd.1 weight 0.000
}
host gbl10134203 {
        id -4           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0          # rjenkins1
        item osd.2 weight 0.000
}
host gbl10134214 {
        id -5           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0          # rjenkins1
        item osd.3 weight 0.000
}
host gbl10134215 {
        id -6           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0          # rjenkins1
        item osd.0 weight 0.000
}
root default {
        id -1           # do not change unnecessarily
        # weight 0.000
        alg straw
        hash 0          # rjenkins1
        item gbl10134201 weight 0.000
        item gbl10134202 weight 0.000
        item gbl10134203 weight 0.000
        item gbl10134214 weight 0.000
        item gbl10134215 weight 0.000
}
# rules
rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule metadata {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule rbd {
        ruleset 2
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
# end crush map
****************************************************************************************************
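One thing I can still try (treating this as a rough sketch, since I'm not sure these
are exactly the right crushtool flags on 0.67) is to let crushtool simulate the rule
against the compiled map I pulled with 'ceph osd getcrushmap', to see which OSD
comes back first for a range of inputs:

crushtool -i crushmap --test --rule 0 --num-rep 2 \
    --min-x 0 --max-x 99 --show-mappings --show-statistics

If osd.4 shows up first in most of the simulated mappings, the skew would be coming
from the map itself rather than from anything in the running cluster.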
Regards,
Chen
-----Original Message-----
From: Mike Dawson [mailto:[email protected]]
Sent: Tuesday, October 01, 2013 11:31 AM
To: Chen, Ching-Cheng (KFRM 1); [email protected]
Subject: Re: [ceph-users] Weird behavior of PG distribution
Ching-Cheng,
Data placement is handled by CRUSH. Please examine the following:
ceph osd getcrushmap -o crushmap && crushtool -d crushmap -o crushmap.txt && cat crushmap.txt
That will show the topology and placement rules Ceph is using.
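A quicker way to glance at the same topology (which host each OSD sits under, and
the weights) should be:

ceph osd tree

but the decompiled map is what you want for reading the actual rules.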
Pay close attention to the "step chooseleaf" lines inside the rule for
each pool. Under certain configurations, I believe the placement that
you describe is in fact the expected behavior.
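For a replicated pool, a rule typically looks roughly like this (this is what a
default map usually contains, so yours may differ):

rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

With "chooseleaf firstn 0 type host", CRUSH picks as many OSDs as the pool's
replica count, each one under a distinct host bucket, and the first OSD it picks
becomes the primary for that PG.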
Thanks,
Mike Dawson
Co-Founder, Cloudapt LLC
On 10/1/2013 10:46 AM, Chen, Ching-Cheng (KFRM 1) wrote:
> Found a weird behavior (or at least it looks weird) with ceph 0.67.3.
>
> I have 5 servers. The monitor runs on server 1, and servers 2 to 5 each run
> one OSD (osd.0 - osd.3).
>
> I did a 'ceph pg dump' and can see the PGs are distributed more or less randomly
> across all 4 OSDs, which is the expected behavior.
>
> However, if I bring up one OSD on the same server that runs the monitor, it
> seems all PGs have their primary OSD moved to this new OSD. After I added the
> new OSD (osd.4) to the server running the monitor, the 'ceph pg dump' command
> shows the acting OSDs as [4,x] for all PGs.
>
> Is this expected behavior??
>
> Regards,
>
> Chen
>
> Ching-Cheng Chen
>
> *CREDIT SUISSE*
>
> Information Technology | MDS - New York, KVBB 41
>
> One Madison Avenue | 10010 New York | United States
>
> Phone +1 212 538 8031 | Mobile +1 732 216 7939
>
> [email protected]
> <mailto:[email protected]> | www.credit-suisse.com
> <http://www.credit-suisse.com>
===============================================================================
Please access the attached hyperlink for an important electronic communications
disclaimer:
http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
===============================================================================
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com