You probably don’t want hashpspool set automatically, since your clients may 
not yet understand that CRUSH map feature. You can try unsetting it for that 
pool and see what happens, or create a new pool with hashpspool disabled from 
the start.  Just a guess.
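For what it's worth, a rough sketch of the commands that suggestion maps to (the pool name bench2 comes from the message below; "bench3" is just a hypothetical name for a fresh pool, and the exact flag syntax may vary between Ceph releases):

```shell
# Try clearing the hashpspool flag on the existing pool
# (some releases may ask for an extra confirmation flag):
ceph osd pool set bench2 hashpspool false

# Or create a fresh pool with hashpspool disabled from the start,
# by putting this in ceph.conf before creating it:
#   [global]
#   osd pool default flag hashpspool = false
ceph osd pool create bench3 128 128
```

Either way, checking whether the older clients then behave is the real test.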

Warren

From: Giuseppe Civitella <[email protected]>
Date: Friday, October 2, 2015 at 10:05 AM
To: ceph-users <[email protected]>
Subject: [ceph-users] pgs stuck unclean on a new pool despite the pool size 
reconfiguration

Hi all,
I have a Firefly cluster which has been upgraded from Emperor.
It has 2 OSD hosts and 3 monitors.
The cluster uses the default values for the pools' size and min_size.
Once upgraded to Firefly, I created a new pool called bench2:
ceph osd pool create bench2 128 128
and set its sizes:
ceph osd pool set bench2 size 2
ceph osd pool set bench2 min_size 1
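The values can be read back per pool to confirm they were applied, e.g.:

```shell
# Confirm the replication settings actually took effect on bench2:
ceph osd pool get bench2 size       # expect: size: 2
ceph osd pool get bench2 min_size   # expect: min_size: 1
```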

this is the state of the pools:
pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins 
pg_num 64 pgp_num 64 last_change 1 crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 1 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 1 stripe_width 0
pool 2 'rbd' replicated size 2 min_size 1 crush_ruleset 2 object_hash rjenkins 
pg_num 64 pgp_num 64 last_change 1 stripe_width 0
pool 3 'volumes' replicated size 2 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 384 pgp_num 384 last_change 2568 stripe_width 0
        removed_snaps [1~75]
pool 4 'images' replicated size 2 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 384 pgp_num 384 last_change 1895 stripe_width 0
pool 8 'bench2' replicated size 2 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 128 pgp_num 128 last_change 2580 flags hashpspool stripe_width 0

Despite this, I still get a warning about 128 pgs stuck unclean.
"ceph health detail" shows me the stuck PGs, so I take one to find the 
involved OSDs:

pg 8.38 is stuck unclean since forever, current state active, last acting [22,7]

If I restart the OSD with id 22, PG 8.38 goes to the active+clean state.
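For a PG stuck like this, querying it directly can show why peering never completed (which OSDs it is waiting on, its recovery state, and so on):

```shell
# Inspect the stuck PG's peering and recovery state in detail:
ceph pg 8.38 query
```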

This is incorrect behavior, AFAIK: the cluster should pick up the new size 
and min_size values without any manual intervention. So my questions are: any 
idea why this happens, and how can I restore the default behavior? Do I need 
to restart all of the OSDs to return to a healthy state?

thanks a lot
Giuseppe

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
