Thanks all for your replies. The target deployment is on Azure, so with the nice disk snapshot feature, replacing a dead node is easier: no streaming from Cassandra needed.

About compaction overhead: using TWCS with a 1-day bucket and removing read repair and subrange repair should be sufficient. The only remaining issue is that QUORUM reads trigger read repair automatically, and before 4.0 there is unfortunately no flag to turn that off.
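For reference, a pre-4.0 table configured along those lines would look roughly like the sketch below (keyspace, table, columns and TTL are placeholders, not from this thread; the two read_repair_chance options only control the background/asynchronous read repair and were removed in 4.0):

    CREATE TABLE metrics.events_by_day (
        sensor_id  uuid,
        event_time timestamp,
        value      double,
        PRIMARY KEY ((sensor_id), event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC)
      AND compaction = {
            'class': 'TimeWindowCompactionStrategy',
            'compaction_window_unit': 'DAYS',   -- 1-day buckets
            'compaction_window_size': 1
          }
      AND default_time_to_live = 2592000        -- 30 days, example value only
      AND read_repair_chance = 0.0              -- background read repair off
      AND dclocal_read_repair_chance = 0.0;     -- (both options removed in 4.0)

Note that the blocking read repair triggered by a digest mismatch during a QUORUM read is separate from these options, which is exactly the part that cannot be turned off before 4.0.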
On Sep 30, 2019, at 15:47, "Eric Evans" <[email protected]> wrote:

On Sat, Sep 28, 2019 at 8:50 PM Jeff Jirsa <[email protected]> wrote:

[ ... ]

> 2) The 2TB guidance is old and irrelevant for most people, what you really care about is how fast you can replace the failed machine
>
> You’d likely be ok going significantly larger than that if you use a few vnodes, since that’ll help rebuild faster (you’ll stream from more sources on rebuild)
>
> If you don’t want to use vnodes, buy big machines and run multiple Cassandra instances in it - it’s not hard to run 3-4TB per instance and 12-16T of SSD per machine

We do this too. It's worth keeping in mind though that you'll still have a 12-16T blast radius in the event of a host failure. As the host density goes up, consider steps to make the host more robust (RAID, redundant power supplies, etc).

--
Eric Evans
[email protected]
