Thanks all for your replies. The target deployment is on Azure, so with the nice disk snapshot feature, replacing a dead node is easier: no streaming from Cassandra needed.

About compaction overhead: using TWCS with a 1-day bucket and removing read repair and subrange repair should be sufficient. The only remaining issue is that QUORUM reads trigger read repair automatically, and before 4.0 there is unfortunately no flag to turn that off.
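For reference, a pre-4.0 table configured along those lines would look roughly like the sketch below (keyspace, table, columns and TTL are placeholders, not from this thread; the two read_repair_chance options only control the background/asynchronous read repair and were removed in 4.0):

    CREATE TABLE metrics.events_by_day (
        sensor_id  uuid,
        event_time timestamp,
        value      double,
        PRIMARY KEY ((sensor_id), event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC)
      AND compaction = {
            'class': 'TimeWindowCompactionStrategy',
            'compaction_window_unit': 'DAYS',   -- 1-day buckets
            'compaction_window_size': 1
          }
      AND default_time_to_live = 2592000        -- 30 days, example value only
      AND read_repair_chance = 0.0              -- background read repair off
      AND dclocal_read_repair_chance = 0.0;     -- (both options removed in 4.0)

Note that the blocking read repair triggered by a digest mismatch during a QUORUM read is separate from these options, which is exactly the part that cannot be turned off before 4.0.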
On Sep 30, 2019, at 15:47, "Eric Evans" <[email protected]> wrote:

On Sat, Sep 28, 2019 at 8:50 PM Jeff Jirsa <[email protected]> wrote:

[ ... ]

> 2) The 2TB guidance is old and irrelevant for most people, what you really care about is how fast you can replace the failed machine
>
> You’d likely be ok going significantly larger than that if you use a few vnodes, since that’ll help rebuild faster (you’ll stream from more sources on rebuild)
>
> If you don’t want to use vnodes, buy big machines and run multiple Cassandra instances in it - it’s not hard to run 3-4TB per instance and 12-16T of SSD per machine

We do this too. It's worth keeping in mind though that you'll still have a 12-16T blast radius in the event of a host failure. As the host density goes up, consider steps to make the host more robust (RAID, redundant power supplies, etc).

--
Eric Evans
[email protected]
