Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
> On 7 December 2016 at 21:43, "Will.Boege" wrote:
>
> Thanks for the explanation. I guess this case you outlined explains why the Ceph developers chose to make this a ‘safe’ default.
>
> 2 OSDs are transiently down and the third fails hard. The PGs on the 3rd OSD with no more replicas are marked unfound.

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
> On 7 December 2016 at 21:18, Kevin Olbrich wrote:
>
> Is Ceph accepting this OSD if the other (newer) replica is down?
> In this case I would assume that my cluster is instantly broken when rack _after_ rack fails (power outage) and I just start in random order.
> We have at least one MON on stand-alone UPS to resolve such an issue

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Will.Boege
Thanks for the explanation. I guess this case you outlined explains why the Ceph developers chose to make this a ‘safe’ default.

2 OSDs are transiently down and the third fails hard. The PGs on the 3rd OSD with no more replicas are marked unfound. You bring up 1 and 2 and these PGs will
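
A toy Python model of that failure sequence may help. This is not Ceph code, and it assumes simplified semantics: a write is acknowledged by whichever OSDs are currently up, and IO blocks once fewer than min_size replicas remain (real Ceph peering is more involved):

    # Toy model, not Ceph code: why min_size=1 can leave objects unfound.
    def run(min_size):
        osd_up = {1: True, 2: True, 3: True}   # three replica OSDs, all healthy
        replicas = {}                          # object -> set of OSDs holding it

        def write(obj):
            up = [o for o, alive in osd_up.items() if alive]
            if len(up) < min_size:             # too few replicas: IO blocks
                return False
            replicas[obj] = set(up)            # write lands only on the up OSDs
            return True

        write("obj-before")                    # written while all three are up
        osd_up[1] = osd_up[2] = False          # OSDs 1 and 2 transiently down
        write("obj-during")                    # min_size=1 acks this on OSD 3 alone
        osd_up[3] = False                      # OSD 3 fails hard, its data is gone
        for holders in replicas.values():
            holders.discard(3)
        osd_up[1] = osd_up[2] = True           # OSDs 1 and 2 come back
        return sum(1 for h in replicas.values() if not h)   # unfound count

    print(run(1))   # 1 -- obj-during existed only on the dead OSD
    print(run(2))   # 0 -- the risky write was blocked instead

With min_size=2 the write during the degraded window is simply blocked, so when OSDs 1 and 2 return nothing is unfound.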

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Kevin Olbrich
Is Ceph accepting this OSD if the other (newer) replica is down? In this case I would assume that my cluster is instantly broken when rack _after_ rack fails (power outage) and I just start in random order. We have at least one MON on stand-alone UPS to resolve such an issue - I just assumed this
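
For what it's worth, the reason random start order is survivable is that a PG will not go active until a replica that saw the newest writes is back. A heavily simplified sketch of that idea (real Ceph compares full PG logs during peering, not a single epoch number; the names here are hypothetical):

    # Heavily simplified model of peering after a power outage -- real
    # Ceph compares PG logs; this only shows the stale-replica intuition.
    def pg_can_go_active(up_replicas, newest_epoch):
        """up_replicas: OSD name -> last epoch that OSD saw for this PG."""
        return any(e >= newest_epoch for e in up_replicas.values())

    # Rack A lost power at epoch 10; rack B kept taking writes until 12.
    print(pg_can_go_active({"rack-a.osd1": 10}, 12))                     # False: PG stays down rather than serve stale data
    print(pg_can_go_active({"rack-a.osd1": 10, "rack-b.osd7": 12}, 12))  # True: the newest replica has returned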

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Wido den Hollander
> On 7 December 2016 at 21:04, "Will.Boege" wrote:
>
> Hi Wido,
>
> Just curious how blocking IO to the final replica provides protection from data loss? I’ve never really understood why this is a Ceph best practice.
> In my head all 3 replicas would be on devices that have roughly the same odds of physically failing or getting logically corrupted

Re: [ceph-users] [EXTERNAL] Re: 2x replication: A BIG warning

2016-12-07 Thread Will.Boege
Hi Wido,

Just curious how blocking IO to the final replica provides protection from data loss? I’ve never really understood why this is a Ceph best practice. In my head all 3 replicas would be on devices that have roughly the same odds of physically failing or getting logically corrupted in
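
A back-of-the-envelope way to frame this (illustrative numbers, not from the thread): blocking IO below min_size does not change how likely any one device is to fail, but it guarantees that no acknowledged write ever lives on a single device, and the replica-count arithmetic is stark on its own:

    # Illustrative arithmetic only: assume each replica device fails
    # independently with probability p during one recovery window; data
    # is lost only if every replica fails inside that window.
    p = 0.01                    # hypothetical per-device failure odds
    for n in (2, 3):
        print("size=%d: P(all replicas fail) ~ %.0e" % (n, p ** n))
    # size=2: P(all replicas fail) ~ 1e-04
    # size=3: P(all replicas fail) ~ 1e-06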