On Tue, May 8, 2018 at 2:16 PM Maciej Puzio <[email protected]> wrote:

> Thank you everyone for your replies. However, I feel that at least
> part of the discussion deviated from the topic of my original post. As
> I wrote before, I am dealing with a toy cluster, whose purpose is not
> to provide a resilient storage, but to evaluate ceph and its behavior
> in the event of a failure, with particular attention paid to
> worst-case scenarios. This cluster is purposely minimal, and is built
> on VMs running on my workstation, all OSDs storing data on a single
> SSD. That's definitely not a production system.
>
> I am not asking for advice on how to build resilient clusters, not at
> this point. I asked some questions about specific things that I
> noticed during my tests, and that I was not able to find explained in
> ceph documentation. Dan van der Ster wrote:
> > See https://github.com/ceph/ceph/pull/8008 for the reason why min_size
> defaults to k+1 on ec pools.
> That's a good point, but I am wondering why reads are also blocked
> when the number of OSDs falls to k. If the total number of OSDs in a
> pool (n) is larger than k+m, should min_size then be k(+1) or
> n-m(+1)?
> In any case, since min_size can easily be changed, I guess this is
> not an implementation issue but rather a documentation issue.
>
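To make the k+1 reasoning from that pull request concrete, here is a toy availability model (plain Python, not Ceph code; the function names are invented for illustration). The point is that with exactly k shards alive, a freshly acknowledged write has zero remaining redundancy, so one more failure loses data:

```python
# Toy model (not Ceph code) of why min_size defaults to k+1 for an
# erasure-coded k+m pool. Function names here are illustrative only.

def pg_io_allowed(alive_shards: int, k: int, min_size: int) -> bool:
    """A PG serves I/O only while at least min_size shards are up.
    Below k shards the data is unreadable regardless of min_size."""
    return alive_shards >= max(min_size, k)

def losses_tolerated_after_write(alive_shards: int, k: int) -> int:
    """Further OSD losses a freshly written object can survive."""
    return alive_shards - k

k, m = 3, 2
# Default min_size = k + 1: I/O needs 4 shards up, so a new write
# still tolerates one more OSD loss.
assert pg_io_allowed(4, k, min_size=k + 1)
assert losses_tolerated_after_write(4, k) == 1
# min_size lowered to k: I/O resumes at exactly k shards, but any
# write acknowledged in that state tolerates zero further failures.
assert pg_io_allowed(3, k, min_size=k)
assert losses_tolerated_after_write(3, k) == 0
```

That zero-margin write window is the hazard the k+1 default guards against; reads at exactly k shards are a separate question, as noted above.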
> Which leaves my questions still unanswered:
> After killing m OSDs and setting min_size=k, most PGs were
> active+undersized, often with ...+degraded and/or remapped, but a few
> were active+clean or active+clean+remapped. Why? I would expect all
> PGs to be in the same state (perhaps active+undersized+degraded?).
> Is this mishmash of PG states normal? If not, would I have avoided it
> by creating the pool with min_size=k=3 from the start? In other
> words, does min_size influence the assignment of PGs to OSDs, or is
> it only used to force an I/O shutdown in the event of OSD failures?
>

active+clean does not make a lot of sense if every PG really was 3+2. But
perhaps you also had a 3x replicated pool or something similar left over
from your deployment tool?
The active+clean+remapped state means a PG was somehow lucky enough to have
an existing "stray" copy on one of the OSDs, which it has decided to use to
bring itself back up to the right number of copies, even though those copies
certainly won't match the proper failure domains.
The min_size setting, in relation to the k+m values, has no direct impact
here, although it might have an indirect effect by changing how quickly
stray PGs get deleted.
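To see this mix of states for yourself, you can tally them from the output of `ceph pg dump --format json`. A minimal sketch follows; the flat list-of-dicts layout with "pgid" and "state" keys is a simplification of the real dump (which nests PG stats under other keys depending on release), so adapt the parsing to your version:

```python
# Sketch: count how many PGs are in each combined state, given PG dump
# data. The JSON shape below is a simplified assumption, not the exact
# `ceph pg dump` schema; adjust key lookups for your Ceph release.
import json
from collections import Counter

sample = '''
[{"pgid": "2.0", "state": "active+undersized+degraded"},
 {"pgid": "2.1", "state": "active+clean+remapped"},
 {"pgid": "2.2", "state": "active+undersized+degraded"}]
'''

def state_histogram(pg_dump_json: str) -> Counter:
    """Return a histogram of combined PG states."""
    return Counter(pg["state"] for pg in json.loads(pg_dump_json))

print(state_histogram(sample))
# e.g. Counter({'active+undersized+degraded': 2, 'active+clean+remapped': 1})
```

A histogram like this makes it easy to spot the odd-one-out PGs (the active+clean or active+clean+remapped ones) and then inspect them individually with `ceph pg <pgid> query`.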
-Greg


>
> Thank you very much
>
> Maciej Puzio
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>