min_size should be at least k+1 for EC pools. There are times to use k,
such as emergencies like the one you had. I would suggest setting it back
to 3 once you're back to healthy.

As for why you needed to reduce min_size, my guess would have been that
recovery would happen as long as k copies were up. Were the PGs refusing
to backfill, or had they just not backfilled yet?
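
To make the arithmetic concrete for your k2m2 pools, here is a minimal
Python sketch. It is a simplified model, not Ceph's actual peering logic,
and the function names are made up for illustration. It assumes only two
things: k shards are enough to reconstruct the data, and a PG goes active
only when at least min_size shards are available. The 4 incomplete PGs
presumably had shards on both the failed drive and the failed node, so
each had lost 2 shards.

```python
def recommended_min_size(k: int) -> int:
    """Rule of thumb from this thread: min_size = k + 1 for an EC pool."""
    return k + 1


def pg_state(k: int, m: int, shards_lost: int, min_size: int) -> str:
    """Classify a PG in a simplified model (hypothetical helper, not a Ceph API)."""
    surviving = k + m - shards_lost
    if surviving < k:
        # Fewer than k shards left: the data cannot be reconstructed.
        return "data unavailable"
    if surviving < min_size:
        # Enough shards to rebuild the data, but below the I/O threshold.
        return "recoverable but inactive"
    return "active"


if __name__ == "__main__":
    k, m = 2, 2  # the k2m2 profile from the original message
    for min_size in (recommended_min_size(k), k):
        state = pg_state(k, m, shards_lost=2, min_size=min_size)
        print(f"min_size={min_size}: {state}")
```

With min_size=3 a PG that lost 2 shards stays inactive even though the
data is still recoverable; with min_size=2 it goes active, but with no
redundancy left. That is why it is worth running
"ceph osd pool set <pool> min_size 3" again once recovery finishes.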

On Mon, Oct 29, 2018, 9:24 PM Chad W Seys <cws...@physics.wisc.edu> wrote:

> Hi all,
>    Recently our cluster lost a drive and a node (3 drives) at the same
> time.  Our erasure coded pools are all k2m2, so if all is working
> correctly no data is lost.
>    However, there were 4 PGs that stayed "incomplete" until I finally
> took the suggestion in 'ceph health detail' to reduce min_size . (Thanks
> for the hint!)  I'm not sure what it was (likely 3), but setting it to 2
> caused all PGs to become active (though degraded) and the cluster is on
> path to recovering fully.
>    In replicated pools, would not ceph create replicas without the need
> to reduce min_size?  It seems odd to not recover automatically if
> possible.  Could someone explain what was going on there?
>    Also, how to decide what min_size should be?
> Thanks!
> Chad.
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com