min_size should be at least k+1 for EC. There are times to use k for emergencies like you had. I would suggest setting it back to 3 once you're back to healthy.
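To make the k+1 rule concrete, here is a rough sketch in Python of how min_size interacts with the number of surviving shards in a k2m2 pool. It is a simplified model, not Ceph's actual peering logic: the function name pg_state and the "inactive" / "data unavailable" labels are made up for illustration (your cluster reported the stuck PGs as "incomplete").

```python
def pg_state(k, m, min_size, shards_up):
    """Simplified model of an erasure-coded PG with k data and m coding shards.

    Hypothetical helper for illustration only; real Ceph peering considers
    much more than a shard count.
    """
    if not 0 <= shards_up <= k + m:
        raise ValueError("shards_up must be between 0 and k+m")
    if shards_up < k:
        # Fewer than k shards left: the data cannot be reconstructed.
        return "data unavailable"
    if shards_up < min_size:
        # Data is still reconstructable, but the PG will not serve I/O.
        return "inactive"
    if shards_up < k + m:
        return "active+degraded"
    return "active+clean"


k, m = 2, 2
# A PG that lost two of its four shards, with min_size = k+1 = 3:
print(pg_state(k, m, min_size=k + 1, shards_up=2))  # inactive
# The same PG after lowering min_size to k = 2:
print(pg_state(k, m, min_size=k, shards_up=2))      # active+degraded
```

The catch with min_size = k is that the PG then accepts writes with no redundancy left, so one more failure could lose them. That is why k is for emergencies only.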
As far as why you needed to reduce min_size, my guess would be that recovery would have happened as long as k copies were up. Were the PGs refusing to backfill, or had they just not backfilled yet?

On Mon, Oct 29, 2018, 9:24 PM Chad W Seys <cws...@physics.wisc.edu> wrote:

> Hi all,
> Recently our cluster lost a drive and a node (3 drives) at the same
> time. Our erasure coded pools are all k2m2, so if all is working
> correctly no data is lost.
> However, there were 4 PGs that stayed "incomplete" until I finally
> took the suggestion in 'ceph health detail' to reduce min_size. (Thanks
> for the hint!) I'm not sure what it was (likely 3), but setting it to 2
> caused all PGs to become active (though degraded) and the cluster is on
> path to recovering fully.
>
> In replicated pools, would not ceph create replicas without the need
> to reduce min_size? It seems odd to not recover automatically if
> possible. Could someone explain what was going on there?
>
> Also, how to decide what min_size should be?
>
> Thanks!
> Chad.
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com