> 'ceph osd crush tunables optimal'
>
> or adjust an offline map file via the crushtool command line (more
> annoying) and retest; I suspect that is the problem.
>
> http://ceph.com/docs/master/rados/operations/crush-map/#tunables
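For reference, the offline route mentioned above can be sketched like this (a hypothetical session; flag names such as --set-chooseleaf-descend-once come from crushtool of that era and should be checked against your version's man page):

```
ceph osd getcrushmap -o crushmap.bin
crushtool -i crushmap.bin --set-chooseleaf-descend-once 1 -o crushmap.new
crushtool --test -i crushmap.new --rule 0 --num-rep 3 --show-utilization
ceph osd setcrushmap -i crushmap.new
```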
That solves the bug with weight 0, thanks.
But I still get the following distribution:
device 0: 423
device 1: 453
device 2: 430
device 3: 455
device 4: 657
device 5: 654
The hosts with only one OSD get too much data.
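To put a number on "too much": all six OSDs carry the same weight (3.630), so a weight-proportional placement would give each device an equal share. A quick check (a hypothetical sanity script, not crushtool output) of how far the counts above deviate from that share:

```python
# Deviation of the reported per-device counts from a weight-proportional
# share; all six OSDs have identical weight, so the fair share is total / 6.
counts = {0: 423, 1: 453, 2: 430, 3: 455, 4: 657, 5: 654}

total = sum(counts.values())   # 3072 = 1024 inputs x 3 replicas
fair = total / len(counts)     # 512.0 per device if perfectly even

for dev, n in sorted(counts.items()):
    print(f"device {dev}: {n:4d}  ({(n - fair) / fair:+.1%} vs fair share)")
```

The single-OSD devices (4 and 5) come out roughly 28% above their fair share, while the devices on the two-OSD hosts sit roughly 11-17% below it.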
> On Fri, 3 Jan 2014, Dietmar Maurer wrote:
>
> > > In both cases, you only get 2 replicas on the remaining 2 hosts.
> >
> > OK, I was able to reproduce this with crushtool.
> >
> > > The difference is if you have 4 hosts with 2 osds. In the choose
> > > case, you have some fraction of the data that chose the down host in
> > > the first step (most of the attempts, actually!) and then couldn't
> > > find a usable osd, leaving you with only 2
> >
> > This is also reproducible.
> >
> > > replicas. With chooseleaf that doesn't happen.
> > >
> > > The other difference is if you have one of the two OSDs on the host marked out.
> > > In the choose case, the remaining OSD will get allocated 2x the
> > > data; in the chooseleaf case, usage will remain proportional with
> > > the rest of the cluster and the data from the out OSD will be
> > > distributed across other OSDs (at least when there are > 3 hosts!).
> >
> > I see, but the data distribution does not seem optimal in that case.
> >
> > For example using this crush map:
> >
> > # types
> > type 0 osd
> > type 1 host
> > type 2 rack
> > type 3 row
> > type 4 room
> > type 5 datacenter
> > type 6 root
> >
> > # buckets
> > host prox-ceph-1 {
> >     id -2    # do not change unnecessarily
> >     # weight 7.260
> >     alg straw
> >     hash 0   # rjenkins1
> >     item osd.0 weight 3.630
> >     item osd.1 weight 3.630
> > }
> > host prox-ceph-2 {
> >     id -3    # do not change unnecessarily
> >     # weight 7.260
> >     alg straw
> >     hash 0   # rjenkins1
> >     item osd.2 weight 3.630
> >     item osd.3 weight 3.630
> > }
> > host prox-ceph-3 {
> >     id -4    # do not change unnecessarily
> >     # weight 3.630
> >     alg straw
> >     hash 0   # rjenkins1
> >     item osd.4 weight 3.630
> > }
> >
> > host prox-ceph-4 {
> >     id -5    # do not change unnecessarily
> >     # weight 3.630
> >     alg straw
> >     hash 0   # rjenkins1
> >     item osd.5 weight 3.630
> > }
> >
> > root default {
> >     id -1    # do not change unnecessarily
> >     # weight 21.780
> >     alg straw
> >     hash 0   # rjenkins1
> >     item prox-ceph-1 weight 7.260   # 2 OSDs
> >     item prox-ceph-2 weight 7.260   # 2 OSDs
> >     item prox-ceph-3 weight 3.630   # 1 OSD
> >     item prox-ceph-4 weight 3.630   # 1 OSD
> > }
> >
> > # rules
> > rule data {
> >     ruleset 0
> >     type replicated
> >     min_size 1
> >     max_size 10
> >     step take default
> >     step chooseleaf firstn 0 type host
> >     step emit
> > }
> > # end crush map
> >
> > crushtool shows the following utilization:
> >
> > # crushtool --test -i my.map --rule 0 --num-rep 3 --show-utilization
> > device 0: 423
> > device 1: 452
> > device 2: 429
> > device 3: 452
> > device 4: 661
> > device 5: 655
> >
> > Any explanation for that? Maybe related to the small number of devices?
> >
> >
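There may be a structural explanation for the quoted question. As a first-order model (an approximation of CRUSH's straw bucket behaviour, not its exact algorithm), "chooseleaf firstn 3 type host" behaves like drawing 3 distinct hosts with probability proportional to weight. With only 4 hosts, every host is drawn in most samples, so the inclusion probability of the light hosts is much closer to that of the heavy hosts than their weights suggest, and each light host funnels its whole share into a single OSD:

```python
# First-order model of "chooseleaf firstn 3 type host": weighted sampling
# of 3 distinct hosts without replacement, by exact enumeration of draw
# orders. This approximates CRUSH straw selection; it is not the real
# algorithm, but it predicts the shape of the skew.
from itertools import permutations

hosts = {            # host -> (weight, number of OSDs), from the map above
    "prox-ceph-1": (7.26, 2),
    "prox-ceph-2": (7.26, 2),
    "prox-ceph-3": (3.63, 1),
    "prox-ceph-4": (3.63, 1),
}
names = list(hosts)
weights = [hosts[n][0] for n in names]

# P(host is among the first 3 drawn), summed over all draw orders.
incl = dict.fromkeys(names, 0.0)
for perm in permutations(range(len(names))):
    p, remaining = 1.0, sum(weights)
    for idx in perm:
        p *= weights[idx] / remaining
        remaining -= weights[idx]
    for idx in perm[:3]:           # the first 3 hosts each receive a replica
        incl[names[idx]] += p

samples = 1024                     # crushtool --test's default input count
for n in names:
    per_dev = incl[n] * samples / hosts[n][1]
    print(f"{n}: inclusion={incl[n]:.3f}, expected per device ~{per_dev:.0f}")
```

Under this model the two-OSD hosts are included with probability 13/15 (about 444 placements per device over 1024 samples) and the one-OSD hosts with probability 19/30 (about 648 per device), which is close to the 423-455 and 654-661 that crushtool reports. If the model is right, the skew is inherent to placing 3 replicas across only 4 hosts of unequal size, rather than a bug.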
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com