Here is my CRUSH map so you can see our general setup. We are using the
bottom rule (default.rgw.buckets.data) for the EC pool.
We are trying to get to the point where we can lose an entire host and have
the cluster continue to work.
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54
# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd
device 6 osd.6 class hdd
device 7 osd.7 class hdd
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
device 11 osd.11 class hdd
device 12 osd.12 class hdd
device 13 osd.13 class hdd
device 14 osd.14 class hdd
device 15 osd.15 class hdd
device 16 osd.16 class hdd
device 17 osd.17 class hdd
device 18 osd.18 class hdd
device 19 osd.19 class hdd
device 20 osd.20 class hdd
device 21 osd.21 class hdd
device 22 osd.22 class hdd
device 23 osd.23 class hdd
device 24 osd.24 class hdd
device 25 osd.25 class hdd
device 26 osd.26 class hdd
device 27 osd.27 class hdd
device 28 osd.28 class hdd
device 29 osd.29 class hdd
device 30 osd.30 class hdd
device 31 osd.31 class hdd
device 32 osd.32 class hdd
device 33 osd.33 class hdd
device 34 osd.34 class hdd
device 35 osd.35 class hdd
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
# buckets
host osd01tv01 {
        id -3 # do not change unnecessarily
        id -4 class hdd # do not change unnecessarily
        # weight 109.152
        alg straw2
        hash 0 # rjenkins1
        item osd.0 weight 9.096
        item osd.3 weight 9.096
        item osd.6 weight 9.096
        item osd.9 weight 9.096
        item osd.12 weight 9.096
        item osd.15 weight 9.096
        item osd.18 weight 9.096
        item osd.21 weight 9.096
        item osd.24 weight 9.096
        item osd.27 weight 9.096
        item osd.30 weight 9.096
        item osd.33 weight 9.096
}
host osd02tv01 {
        id -5 # do not change unnecessarily
        id -6 class hdd # do not change unnecessarily
        # weight 109.152
        alg straw2
        hash 0 # rjenkins1
        item osd.1 weight 9.096
        item osd.4 weight 9.096
        item osd.7 weight 9.096
        item osd.10 weight 9.096
        item osd.13 weight 9.096
        item osd.16 weight 9.096
        item osd.19 weight 9.096
        item osd.22 weight 9.096
        item osd.25 weight 9.096
        item osd.28 weight 9.096
        item osd.31 weight 9.096
        item osd.34 weight 9.096
}
host osd03tv01 {
        id -7 # do not change unnecessarily
        id -8 class hdd # do not change unnecessarily
        # weight 109.152
        alg straw2
        hash 0 # rjenkins1
        item osd.2 weight 9.096
        item osd.5 weight 9.096
        item osd.8 weight 9.096
        item osd.11 weight 9.096
        item osd.14 weight 9.096
        item osd.17 weight 9.096
        item osd.20 weight 9.096
        item osd.23 weight 9.096
        item osd.26 weight 9.096
        item osd.29 weight 9.096
        item osd.32 weight 9.096
        item osd.35 weight 9.096
}
root default {
        id -1 # do not change unnecessarily
        id -2 class hdd # do not change unnecessarily
        # weight 327.441
        alg straw2
        hash 0 # rjenkins1
        item osd01tv01 weight 109.147
        item osd02tv01 weight 109.147
        item osd03tv01 weight 109.147
}
# rules
rule replicated_rule {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule default.rgw.buckets.data {
        id 1
        type erasure
        min_size 3
        max_size 3
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step choose indep 2 type host
        step choose indep 2 type osd
        step emit
}
# end crush map
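If it helps to see how that rule actually lays chunks out, a crushtool test
along these lines should show it (assuming the decompiled map above is saved
as crushmap.txt; the file names are just placeholders):

    crushtool -c crushmap.txt -o crushmap.bin
    crushtool -i crushmap.bin --test --rule 1 --num-rep 4 --show-mappings
    crushtool -i crushmap.bin --test --rule 1 --num-rep 4 --show-bad-mappings

With --num-rep 4 to match the 2 hosts x 2 OSDs the rule selects, every mapping
line should list four OSDs drawn two apiece from two of the three hosts, and
the last command should report no bad mappings.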
Thanks again for all the help!
Tim Gipson
Systems Engineer
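
P.S. On the PG count question from earlier in the thread: following Christian's
suggestion (quoted below) of setting size = k+m in pgcalc, with its usual
target of roughly 100 PGs per OSD and taking the four placements the EC rule
above produces as k+m, the arithmetic works out to:

    36 OSDs x 100 / 4 chunks = 900, rounded up to the next power of two = 1024

So the pool creation would look something like the line below; <ec-profile> is
just a placeholder for whichever erasure-code profile we settle on, and the
same arithmetic applies to any other k+m:

    ceph osd pool create default.rgw.buckets.data 1024 1024 erasure <ec-profile> default.rgw.buckets.data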
On 11/12/17, 10:57 PM, "Christian Wuerdig" <[email protected]> wrote:
Well, as stated in the other email I think in the EC scenario you can
set size=k+m for the pgcalc tool. If you want 10+2 then in theory you
should be able to get away with 6 nodes to survive a single node
failure if you can guarantee that every node will always receive 2 out
of the 12 chunks - looks like this might be achievable:
http://ceph.com/planet/erasure-code-on-small-clusters/
On Mon, Nov 13, 2017 at 1:32 PM, Tim Gipson <[email protected]> wrote:
> I guess my questions are more centered around k+m and PG calculations.
>
> As we were starting to build and test our EC pools with our
> infrastructure, we were trying to figure out what our calculations
> needed to be, starting with 3 OSD hosts with 12 x 10 TB OSDs apiece.
> The nodes have the ability to expand to 24 drives apiece, and we hope
> to eventually get to around a 1 PB cluster after we add some more
> hosts. Initially we hoped to be able to do k=10, m=2 on the pool, but
> I am not sure that is going to be feasible. We’d like to set up the
> failure domain so that we would be able to lose an entire host without
> losing the cluster. At this point I’m not sure that’s possible without
> bringing in more hosts.
>
> Thanks for the help!
>
> Tim Gipson
>
>
> On 11/12/17, 5:14 PM, "Christian Wuerdig" <[email protected]> wrote:
>
> I might be wrong, but from memory I think you can use
> http://ceph.com/pgcalc/ and use k+m for the size
>
> On Sun, Nov 12, 2017 at 5:41 AM, Ashley Merrick <[email protected]> wrote:
> > Hello,
> >
> > Are you having any issues with getting the pool working or just
> > around the PG num you should use?
> >
> > ,Ashley
> >
> > ________________________________
> > From: ceph-users <[email protected]> on behalf of Tim Gipson
> > <[email protected]>
> > Sent: Saturday, November 11, 2017 5:38:02 AM
> > To: [email protected]
> > Subject: [ceph-users] Erasure Coding Pools and PG calculation -
> > documentation
> >
> > Hey all,
> >
> > I’m having some trouble setting up a Pool for Erasure Coding. I
> > haven’t found much documentation around the PG calculation for an
> > Erasure Coding pool. It seems from what I’ve tried so far that the
> > math needed to set one up is different than the math you use to
> > calculate PGs for a regular replicated pool.
> >
> > Does anyone have any experience setting up a pool this way and can
> > you give me some help or direction, or point me toward some
> > documentation that goes over the math behind this sort of pool setup?
> >
> > Any help would be greatly appreciated!
> >
> > Thanks,
> >
> >
> > Tim Gipson
> > Systems Engineer
> >
> >
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com