Thank you Jeff for the hints. We are targeting 20 TB/machine using TWCS and 8 vnodes (with the new token allocation algorithm). We will also try the new zstd compression.
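To make that target concrete, here is the back-of-the-envelope we are working from (a quick Python sketch; the 3:1 zstd ratio and the 60% disk-utilisation headroom are assumptions we still have to validate against a sample of real SSTables):

    # Rough node count for the plan above.
    # ASSUMPTION: zstd compresses our repetitive sensor data ~3:1; the real
    # ratio must be measured on production data.
    RAW_DATA_TB = 900            # 2 years, RF=3, uncompressed (from the thread below)
    ZSTD_RATIO = 3.0             # assumed compression ratio
    TARGET_TB_PER_NODE = 20      # density target with TWCS + 8 vnodes

    on_disk_tb = RAW_DATA_TB / ZSTD_RATIO        # ~300 TB after compression
    nodes = on_disk_tb / TARGET_TB_PER_NODE      # ~15 nodes at full density
    # ASSUMPTION: size for ~60% disk utilisation to leave room for
    # compaction overhead and growth.
    nodes_with_headroom = on_disk_tb / (TARGET_TB_PER_NODE * 0.6)

    print(f"compressed data: {on_disk_tb:.0f} TB, "
          f"nodes: {nodes:.0f} (no headroom) / {nodes_with_headroom:.0f} (60% util)")

On the vnode side, that would be num_tokens: 8 plus allocate_tokens_for_keyspace (or allocate_tokens_for_local_replication_factor in 4.0) in cassandra.yaml.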
About transient replication, the underlying trade-offs and semantics are hard to grasp for most people (for example, reading at CL ONE after losing 2 full replicas leads to an UnavailableException, unlike with normal replication), so we will leave it out for the moment.

Regards

On Sun, Sep 29, 2019 at 3:50 AM Jeff Jirsa <[email protected]> wrote:
>
> A few random thoughts here
>
> 1) 90 nodes / 900 TB in a cluster isn't that big. A petabyte per cluster is a
> manageable size.
>
> 2) The 2 TB guidance is old and irrelevant for most people; what you really
> care about is how fast you can replace the failed machine.
>
> You'd likely be ok going significantly larger than that if you use a few
> vnodes, since that'll help rebuild faster (you'll stream from more sources on
> rebuild).
>
> If you don't want to use vnodes, buy big machines and run multiple Cassandra
> instances on them - it's not hard to run 3-4 TB per instance and 12-16 TB of SSD
> per machine.
>
> 3) Transient replication in 4.0 could potentially be worth trying out,
> depending on your risk tolerance. Doing 2 full and one transient replica may
> save you 30% storage.
>
> 4) Note that you're not factoring in compression, and some of the recent zstd
> work may go a long way if your sensor data is similar / compressible.
>
> > On Sep 28, 2019, at 1:23 PM, DuyHai Doan <[email protected]> wrote:
> >
> > Hello users
> >
> > I'm facing a very challenging exercise: sizing a cluster for a huge
> > dataset.
> >
> > Use-case = IoT
> >
> > Number of sensors: 30 million
> > Frequency of data: every 10 minutes
> > Estimated size of a data point: 100 bytes (including clustering columns)
> > Data retention: 2 years
> > Replication factor: 3 (pretty standard)
> >
> > A very quick math gives me:
> >
> > 6 data points/hour * 24 * 365 ~ 50 000 data points/year/sensor
> >
> > In terms of size, that is 50 000 x 100 bytes = 5 MB worth of data/year/sensor
> >
> > Now the big problem is that we have 30 million sensors, so the disk
> > requirement adds up pretty fast: 5 MB * 30 000 000 = 150 TB worth of
> > data/year
> >
> > We want to store data for 2 years => 300 TB
> >
> > We have RF=3 ==> 900 TB !!!!
> >
> > Now, according to the commonly recommended density (with SSD), one shall
> > not exceed 2 TB of data per node, which gives us a rough sizing of a
> > 450-node cluster !!!
> >
> > Even if we push the limit up to 10 TB using TWCS (has anyone tried this?),
> > we would still need 90 beefy nodes to support this.
> >
> > Any thoughts/ideas to reduce the node count or increase density and
> > keep the cluster manageable?
> >
> > Regards
> >
> > Duy Hai DOAN
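For reference, the raw 900 TB figure from the original mail reproduces like this (a quick Python sanity check; the only input is the 100-byte row estimate quoted above):

    # Reproduce the back-of-the-envelope sizing from the original question.
    SENSORS = 30_000_000
    ROW_BYTES = 100                      # estimate incl. clustering columns
    POINTS_PER_YEAR = 6 * 24 * 365       # every 10 minutes -> ~52 560/sensor/year
    YEARS = 2
    RF = 3

    raw_bytes = SENSORS * ROW_BYTES * POINTS_PER_YEAR * YEARS * RF
    print(f"{raw_bytes / 1e12:.0f} TB total")   # ~946 TB, i.e. the ~900 TB figure above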
