Re: [ceph-users] Slow request on node reboot

Hyun Ha Thu, 10 Aug 2017 20:37:42 -0700

Thanks for the reply.

In my case, the issue was the min_size of the pool.


# ceph osd pool ls detail
pool 5 'volumes' replicated size 2 min_size 2 crush_ruleset 0 object_hash
rjenkins pg_num 512 pgp_num 512 last_change 844 flags hashpspool
stripe_width 0
        removed_snaps [1~23]

When replicated size=2 and min_size=2 are set and an OSD goes down, the Ceph
cluster goes into an error state and client I/O hangs, since PGs with fewer
than min_size replicas up stop serving I/O.
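
To illustrate why I/O stops, here is a minimal sketch of the rule as I
understand it: a placement group serves client I/O only while the number of
up replicas is at least min_size. The function name and the simplified model
are illustrative assumptions, not Ceph code.

```python
def pg_accepts_io(up_replicas: int, min_size: int) -> bool:
    """Simplified model: a PG serves client I/O only if at least
    min_size replicas are currently up. Hypothetical helper, not Ceph code."""
    return up_replicas >= min_size


# Pool 'volumes' from the output above: size=2, min_size=2.
size, min_size = 2, 2
print(pg_accepts_io(size, min_size))      # all replicas up
print(pg_accepts_io(size - 1, min_size))  # one OSD down, min_size=2
print(pg_accepts_io(size - 1, 1))         # one OSD down, min_size=1
```

With size=2 and min_size=2 there is no headroom: losing a single replica
already puts the PG below min_size.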

ceph status log>
health HEALTH_ERR
            310 pgs are stuck inactive for more than 300 seconds
            35 pgs backfill_wait
            3 pgs backfilling
            38 pgs degraded
            382 pgs peering
            310 pgs stuck inactive
            310 pgs stuck unclean
            39 pgs undersized
            263 requests are blocked > 32 sec

You can easily reproduce this.
I solved it by setting min_size=1 with the "ceph osd pool set volumes
min_size 1" command.
It seems very strange to me: if this min_size setting can cause such a big
problem for the cluster, I would expect Ceph not to allow min_size to be set
to the same value as the replicated size.
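
One way to spot pools in this situation is to scan the text output of
"ceph osd pool ls detail" for replicated pools where min_size is not smaller
than size. The sketch below assumes the plain-text format quoted earlier in
this message; the function name and the regular expression are my own, and
other Ceph releases may print the fields differently.

```python
import re

# Matches the fields as they appear in the output quoted in this thread.
POOL_RE = re.compile(
    r"pool\s+(?P<id>\d+)\s+'(?P<name>[^']+)'\s+replicated\s+"
    r"size\s+(?P<size>\d+)\s+min_size\s+(?P<min_size>\d+)"
)


def pools_without_headroom(detail_text: str):
    """Return (name, size, min_size) for pools where min_size >= size,
    i.e. losing a single replica drops a PG below min_size."""
    risky = []
    for m in POOL_RE.finditer(detail_text):
        size, min_size = int(m["size"]), int(m["min_size"])
        if min_size >= size:
            risky.append((m["name"], size, min_size))
    return risky


# Sample input: the pool line from this message, joined onto one line.
sample = (
    "pool 5 'volumes' replicated size 2 min_size 2 crush_ruleset 0 "
    "object_hash rjenkins pg_num 512 pgp_num 512 last_change 844 "
    "flags hashpspool stripe_width 0\n"
)
print(pools_without_headroom(sample))
```

In practice the input text would be the saved output of the command rather
than a hard-coded sample.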

Thanks.

2017-08-10 23:33 GMT+09:00 David Turner <[email protected]>:

> When the node reboots, are the OSDs being marked down immediately? If the
> node were to reboot without marking the OSDs down, then all requests to those
> OSDs would block until they got marked down.
>
> On Thu, Aug 10, 2017, 5:46 AM Hyun Ha <[email protected]> wrote:
>
>> Hi, Ramirez
>>
>> I have exactly the same problem as you.
>> Did you solve that issue?
>> Do you have any experience or solutions?
>>
>> Thank you.
>> _______________________________________________
>> ceph-users mailing list
>> [email protected]
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
