Hi Christian,

Based on your feedback, I modified the CRUSH map:

step chooseleaf firstn 0 type host
to
step chooseleaf firstn 0 type osd

And then I recompiled and set the map, and voilà, health is OK now. Thanks so much!

ceph health
HEALTH_OK
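
(For anyone who finds this thread later: the "compiled and set" step above is the usual decompile/edit/recompile cycle; the filenames below are just placeholders.)

# dump the current CRUSH map and decompile it to text
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# edit crushmap.txt: change "type host" to "type osd" in the chooseleaf step

# recompile and inject the modified map
crushtool -c crushmap.txt -o crushmap-new.bin
ceph osd setcrushmap -i crushmap-new.bin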

Regards,
Doan


On 05/29/2015 10:53 AM, Christian Balzer wrote:
Hello,

Google is your friend; this comes up every month at least, if not more
frequently.

Your default size (replica count) is 2, and the default CRUSH rule you quote at the
very end of your mail delineates failure domains at the host level (quite
rightly so).
So with 2 replicas (quite dangerous with disks) you will need at least
2 storage nodes.
Or change the CRUSH rule to allow replicas to be placed on the same host.
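
(For a single-node test cluster the same effect can also be had at deployment time via ceph.conf, so that the initial CRUSH map is generated with OSDs as the failure domain. A sketch, assuming the option is set before the cluster is created:

[global]
# pick OSDs (type 0) instead of hosts as the default chooseleaf bucket type
osd crush chooseleaf type = 0
)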

Christian

On Fri, 29 May 2015 10:48:04 +0800 Doan Hartono wrote:

Hi ceph experts,

I just deployed a fresh ceph 0.94.1 cluster with one monitor and one storage
node containing 4 disks. But ceph health shows PGs stuck degraded,
unclean, and undersized. Any idea how to resolve this issue and get to the
active+clean state?

ceph health
HEALTH_WARN 27 pgs degraded; 27 pgs stuck degraded; 128 pgs stuck
unclean; 27 pgs stuck undersized; 27 pgs undersized

ceph status
      cluster 6a8291d4-a3b8-475b-ad6c-c73895228762
       health HEALTH_WARN
              27 pgs degraded
              27 pgs stuck degraded
              128 pgs stuck unclean
              27 pgs stuck undersized
              27 pgs undersized
       monmap e1: 1 mons at {ceph-mon=10.0.0.154:6789/0}
              election epoch 2, quorum 0 ceph-mon
       osdmap e38: 4 osds: 4 up, 4 in; 101 remapped pgs
        pgmap v63: 128 pgs, 1 pools, 0 bytes data, 0 objects
              135 MB used, 7428 GB / 7428 GB avail
                    73 active+remapped
                    28 active
                    27 active+undersized+degraded
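
(For reference, these are the standard commands to see which PGs are stuck and how the OSDs are laid out; output omitted here:

ceph health detail          # lists the stuck PGs and their current states
ceph pg dump_stuck unclean  # shows the stuck PGs and the OSDs they map to
ceph osd tree               # shows the CRUSH hierarchy (all 4 OSDs under one host)
)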

I set pg_num and pgp_num to 128, following the Ceph documentation's
recommendation:

[global]
fsid = 6a8291d4-a3b8-475b-ad6c-c73895228762
mon_initial_members = ceph-mon
mon_host = xxxxxxxxx
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd pool default size = 2
osd pool default pg num = 128
osd pool default pgp num = 128

I have set the rbd pool's pg_num and pgp_num to 128:
$ ceph osd pool get rbd pg_num
pg_num: 128
$ ceph osd pool get rbd pgp_num
pgp_num: 128
$ ceph osd pool get rbd size
size: 2
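
(As a sanity check on the PG count: the usual rule of thumb is roughly (number of OSDs x 100) / pool size, rounded up to a power of two, so 4 x 100 / 2 = 200, rounded up to 256; the documentation's preset of 128 for fewer than 5 OSDs is also fine for a small test cluster like this.)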

I have tried modifying the CRUSH tunables as well:

ceph osd crush tunables legacy
ceph osd crush tunables optimal

but it had no effect on ceph health.

CRUSH map:

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host research10-pc {
          id -2           # do not change unnecessarily
          # weight 7.240
          alg straw
          hash 0  # rjenkins1
          item osd.0 weight 1.810
          item osd.1 weight 1.810
          item osd.2 weight 1.810
          item osd.3 weight 1.810
}
root default {
          id -1           # do not change unnecessarily
          # weight 7.240
          alg straw
          hash 0  # rjenkins1
          item research10-pc weight 7.240
}

# rules
rule replicated_ruleset {
          ruleset 0
          type replicated
          min_size 1
          max_size 10
          step take default
          step chooseleaf firstn 0 type host
          step emit
}

Regards,
Doan

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


