On 09/18/2013 05:49 PM, Platonides wrote:
"We don't need multiple replication". It may not be cheap, practical or comfortable. But the facts only allowed the sentence to stand a couple of months :P
All told, it took us just a hair under four hours to get replication back up from scratch (which is pretty much a worst-case scenario). That seems like a reasonable compromise to me.
Plus, we even identified a few points during the process where we could shave some time off, so if it has to happen again, we'd recover even faster.
That said, I'm going to recommend adding a fourth database to serve as a warm swap host, which would make it easier to get back up even if the hardware failure itself was difficult to recover from. It's not as cool as having full redundancy, but it's the best bang for limited bucks we can get.
-- Marc
_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l
