On Mon, Nov 28, 2016 at 9:54 PM, Piotr Dzionek <[email protected]> wrote:
> Hi,
> I recently installed a 3-node Ceph cluster, v10.2.3. It has 3 mons and 12
> osds. I removed the default pool and created the following one:
>
> pool 7 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 1024 pgp_num 1024 last_change 126 flags hashpspool
> stripe_width 0
Do you understand the significance of min_size 1?
Are you OK with the likelihood of data loss that this value introduces?
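With size 2 and min_size 1, a single OSD failure leaves PGs accepting writes
with only one copy; losing that copy before recovery finishes means losing
data. If you want to reduce that risk, something along these lines would do it
(a sketch, assuming the pool is still called 'data'):

  # check the current values
  ceph osd pool get data size
  ceph osd pool get data min_size

  # three replicas, and require at least two available before serving I/O
  ceph osd pool set data size 3
  ceph osd pool set data min_size 2

Raising size will trigger backfill, so expect some recovery traffic.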
>
> The cluster is healthy when all osds are up; however, if I stop any of the
> osds, it becomes stuck and undersized - it does not rebuild.
>
> cluster *****
> health HEALTH_WARN
> 166 pgs degraded
> 108 pgs stuck unclean
> 166 pgs undersized
> recovery 67261/827220 objects degraded (8.131%)
> 1/12 in osds are down
> monmap e3: 3 mons at
> {**osd01=***.144:6789/0,***osd02=***.145:6789/0,**osd03=*****.146:6789/0}
> election epoch 14, quorum 0,1,2 **osd01,**osd02,**osd03
> osdmap e161: 12 osds: 11 up, 12 in; 166 remapped pgs
> flags sortbitwise
> pgmap v307710: 1024 pgs, 1 pools, 1230 GB data, 403 kobjects
> 2452 GB used, 42231 GB / 44684 GB avail
> 67261/827220 objects degraded (8.131%)
> 858 active+clean
> 166 active+undersized+degraded
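A useful first step here is to see exactly which PGs are stuck and which OSDs
they currently map to. Something like the following (the PG id at the end is
just a placeholder):

  ceph health detail          # lists the degraded/undersized PGs
  ceph pg dump_stuck unclean  # stuck PGs with their up/acting sets
  ceph osd tree               # shows which host the down OSD belongs to
  ceph pg 7.1a query          # detailed state of a single PG

One thing worth noting: the osdmap above shows 11 up but 12 in. If the stopped
OSD is never marked out, recovery onto the remaining OSDs will not start.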
>
> Replica size is 2 and I use the following crushmap:
>
> # begin crush map
> tunable choose_local_tries 0
> tunable choose_local_fallback_tries 0
> tunable choose_total_tries 50
> tunable chooseleaf_descend_once 1
> tunable chooseleaf_vary_r 1
> tunable straw_calc_version 1
>
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 osd.2
> device 3 osd.3
> device 4 osd.4
> device 5 osd.5
> device 6 osd.6
> device 7 osd.7
> device 8 osd.8
> device 9 osd.9
> device 10 osd.10
> device 11 osd.11
>
> # types
> type 0 osd
> type 1 host
> type 2 chassis
> type 3 rack
> type 4 row
> type 5 pdu
> type 6 pod
> type 7 room
> type 8 datacenter
> type 9 region
> type 10 root
>
> # buckets
> host osd01 {
> id -2 # do not change unnecessarily
> # weight 14.546
> alg straw
> hash 0 # rjenkins1
> item osd.0 weight 3.636
> item osd.1 weight 3.636
> item osd.2 weight 3.636
> item osd.3 weight 3.636
> }
> host osd02 {
> id -3 # do not change unnecessarily
> # weight 14.546
> alg straw
> hash 0 # rjenkins1
> item osd.4 weight 3.636
> item osd.5 weight 3.636
> item osd.6 weight 3.636
> item osd.7 weight 3.636
> }
> host osd03 {
> id -4 # do not change unnecessarily
> # weight 14.546
> alg straw
> hash 0 # rjenkins1
> item osd.8 weight 3.636
> item osd.9 weight 3.636
> item osd.10 weight 3.636
> item osd.11 weight 3.636
> }
> root default {
> id -1 # do not change unnecessarily
> # weight 43.637
> alg straw
> hash 0 # rjenkins1
> item osd01 weight 14.546
> item osd02 weight 14.546
> item osd03 weight 14.546
> }
>
> # rules
> rule replicated_ruleset {
> ruleset 0
> type replicated
> min_size 1
> max_size 10
> step take default
> step chooseleaf firstn 0 type host
> step emit
> }
>
> # end crush map
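The map itself can be sanity-checked offline with crushtool's test mode, which
reports whether rule 0 can actually find two distinct hosts for every input. A
sketch, assuming the decompiled map above is saved as crushmap.txt:

  crushtool -c crushmap.txt -o crushmap.bin
  crushtool -i crushmap.bin --test --rule 0 --num-rep 2 --show-statistics
  # add --show-bad-mappings to list any inputs the rule failed to map

If crushtool maps everything cleanly, the undersized PGs are more likely a
cluster-state issue than a crushmap one.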
>
> I am not sure what the reason for the undersized state is. All osd disks are
> the same size and the replica size is 2. Also, data is only replicated on a
> per-host basis and I have 3 separate hosts. Maybe the number of PGs is
> incorrect? Is 1024 too big? Or maybe there is some misconfiguration in the
> crushmap?
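For what it's worth, the usual rule of thumb (the one behind the pgcalc tool)
is roughly 100 PGs per OSD, divided by the replica count and rounded to a
power of two:

  (12 OSDs * 100) / 2 replicas = 600  ->  512 or 1024

So 1024 is on the high side but plausible; a high pg_num on its own would not
explain PGs staying undersized.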
>
>
> Kind regards,
> Piotr Dzionek
>
>
--
Cheers,
Brad
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com