Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-25 Thread Xavier Trilla
We had a similar situation, with one machine resetting when we restarted another one. I'm not 100% sure why it happened, but I would bet it was related to several thousand client connections migrating from the machine we restarted to another one. We have a similar setup to yours, and if you

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-25 Thread Wido den Hollander
On 7/25/19 9:56 AM, Xiaoxi Chen wrote: > The real impact of changing min_size to 1 is not the possibility > of losing data, but how much data will be lost. In both cases you will > lose some data, just how much. > > Let PG X -> (OSD A, B, C), min_size = 2, size = 3 > In your

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-25 Thread Xiaoxi Chen
The real impact of changing min_size to 1 is not the possibility of losing data, but how much data will be lost. In both cases you will lose some data; the question is just how much. Let PG X -> (OSD A, B, C), min_size = 2, size = 3. In your description, at T1, OSD A goes down due to the upgrade, now the PG
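A minimal sketch of the workflow being discussed, assuming a hypothetical replicated pool named "rbd-data"; lowering min_size trades durability for availability, so it is normally reverted as soon as the missing OSDs are back:

    # check which PGs are inactive/undersized before touching anything
    ceph health detail
    ceph pg dump_stuck undersized

    # allow I/O with only one remaining replica (durability risk)
    ceph osd pool set rbd-data min_size 1

    # ... wait for the upgraded/crashed OSDs to rejoin and recover ...

    # restore the safer setting once the PGs are active+clean again
    ceph osd pool set rbd-data min_size 2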

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-25 Thread Wido den Hollander
On 7/25/19 9:19 AM, Xiaoxi Chen wrote: > We had hit this case in production, but my solution was to change > min_size = 1 immediately so that the PG goes back to active right after. > > It somewhat trades off reliability (durability) against availability during > that window of 15 mins, but if you are certain
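Not prescribed in the thread, but a short sketch of how one might inspect the affected PGs before deciding to lower min_size (the PG id 1.2f is hypothetical):

    # list PGs that are currently undersized
    ceph pg ls undersized

    # query one PG to see which OSDs still hold a copy of it
    ceph pg 1.2f query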

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-25 Thread Wido den Hollander
On 7/25/19 8:55 AM, Janne Johansson wrote: > On Wed 24 Jul 2019 at 21:48, Wido den Hollander wrote: > > Right now I'm just trying to find a clever solution to this. It's a 2k > OSD cluster and the likelihood of a host or OSD crashing is reasonable > while

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-25 Thread Janne Johansson
On Wed 24 Jul 2019 at 21:48, Wido den Hollander wrote: > Right now I'm just trying to find a clever solution to this. It's a 2k > OSD cluster and the likelihood of a host or OSD crashing is reasonable > while you are performing maintenance on a different host. > > All kinds of things have
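A common sketch of the maintenance pattern the thread revolves around, setting cluster flags so a deliberately rebooted host does not trigger backfill while it is down; the flags shown are an assumption, not something the posters prescribe:

    # before taking a host down for maintenance
    ceph osd set noout        # don't mark its OSDs out and start backfill
    ceph osd set norebalance  # optionally also hold off rebalancing

    # ... reboot / upgrade the host ...

    # afterwards, once its OSDs are back up
    ceph osd unset norebalance
    ceph osd unset noout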

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-24 Thread Wido den Hollander
On 7/24/19 9:35 PM, Mark Schouten wrote: > I’d say the cure is worse than the issue you’re trying to fix, but that’s my > two cents. > I'm not completely happy with it either. Yes, the price goes up and latency increases as well. Right now I'm just trying to find a clever solution to this.

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-24 Thread Mark Schouten
I’d say the cure is worse than the issue you’re trying to fix, but that’s my two cents. Mark Schouten > On 24 Jul 2019 at 21:22, Wido den Hollander wrote the following: > > Hi, > > Is anybody using 4x (size=4, min_size=2) replication with Ceph? > > The reason I'm asking is that

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-24 Thread Paul Emmerich
We have a few size=4 pools, but most of them are metadata pools paired with m=3 or m=4 erasure-coded pools for the actual data. The goal is to provide the same availability and durability guarantees for the metadata as for the data. But we have some older odd setups with replicated size=4 for that reason
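A rough sketch of the kind of layout Paul describes, with hypothetical pool and profile names; only m=3 and the size=4 metadata pool come from the post, k=4 and the PG counts are assumptions:

    # erasure-coded data pool that tolerates the loss of 3 hosts (m=3)
    ceph osd erasure-code-profile set ec-k4-m3 k=4 m=3 crush-failure-domain=host
    ceph osd pool create data-ec 256 256 erasure ec-k4-m3

    # replicated metadata pool with matching guarantees: size=4, min_size=2
    ceph osd pool create meta-rep 64 64 replicated
    ceph osd pool set meta-rep size 4
    ceph osd pool set meta-rep min_size 2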

[ceph-users] Anybody using 4x (size=4) replication?

2019-07-24 Thread Wido den Hollander
Hi, Is anybody using 4x (size=4, min_size=2) replication with Ceph? The reason I'm asking is that a customer of mine asked me for a solution to prevent a situation which occurred: A cluster running with size=3 and replication over different racks was being upgraded from 13.2.5 to 13.2.6.
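A minimal sketch of the configuration being asked about, using a hypothetical pool name "volumes"; the rack failure domain mirrors the setup described in the post:

    # CRUSH rule that places replicas in distinct racks
    ceph osd crush rule create-replicated rep-per-rack default rack

    # four replicas, two required for I/O
    ceph osd pool set volumes crush_rule rep-per-rack
    ceph osd pool set volumes size 4
    ceph osd pool set volumes min_size 2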