Re: Running Large Clusters in Production

2020-07-13 Thread Reid Pinchback
Date: Monday, July 13, 2020 at 10:48 AM To: "user@cassandra.apache.org" Subject: RE: Running Large Clusters in Production I’m curious – is the scaling needed for the amount of data, the amount of user connections, throughput, or what? I have a 200is

RE: Running Large Clusters in Production

2020-07-13 Thread Durity, Sean R
Reath (BLOOMBERG/ 919 3RD A) Sent: Monday, July 13, 2020 10:35 AM To: user@cassandra.apache.org Subject: [EXTERNAL] Re: Running Large Clusters in Production Thanks for the info Jeff, all very helpful! From: user@cassandra.apache.org At: 07/11/20 12:30:36 To

Re: Running Large Clusters in Production

2020-07-13 Thread Isaac Reath (BLOOMBERG/ 919 3RD A)
Thanks for the info Jeff, all very helpful! From: user@cassandra.apache.org At: 07/11/20 12:30:36 To: user@cassandra.apache.org Subject: Re: Running Large Clusters in Production Gossip-related stuff eventually becomes the issue. For example, when a new host joins the cluster (or replaces

Re: Running Large Clusters in Production

2020-07-11 Thread Jeff Jirsa
/10/20 19:06:27 > To: user@cassandra.apache.org > Cc: Isaac Reath (BLOOMBERG/ 919 3RD A ) > Subject: Re: Running Large Clusters in Production > > I worked on a handful of large clusters (> 200 nodes) using vnodes, and there > were some serious issues with both performance and

Re: Running Large Clusters in Production

2020-07-11 Thread Isaac Reath (BLOOMBERG/ 919 3RD A)
3RD A ) Subject: Re: Running Large Clusters in Production I worked on a handful of large clusters (> 200 nodes) using vnodes, and there were some serious issues with both performance and availability. We had to put in a LOT of work to fix the problems. I agree with Jeff - it's way bet

Re: Running Large Clusters in Production

2020-07-10 Thread onmstester onmstester
Yes, you should handle the routing logic at the app level. I wish there were another level of sharding (above DC and rack), at the cluster level, to distribute data across multiple clusters, but I don't think there is any other database that does such a thing for you. Another problem with a big cluster is for huge amount

Re: Running Large Clusters in Production

2020-07-10 Thread Sergio
Sorry for the dumb question: when we refer to 1000 nodes divided into 10 clusters (shards), we would have 100 nodes per cluster. A shard is not meant as a datacenter; it would be a cluster in itself that doesn't talk to the other ones, so there should be some routing logic at the application
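
For illustration only, here is a minimal Python sketch of what application-level routing across independent clusters might look like. Nothing in it comes from the thread: the cluster names, the function names, and the choice of a hash-modulo scheme are all assumptions made for the example, and a real deployment would map each shard to its own driver session and contact points.

```python
import hashlib

# Hypothetical shard list: 10 independent clusters (names are made up).
CLUSTERS = [f"cluster-{i:02d}" for i in range(10)]


def shard_for_key(partition_key: str, num_clusters: int) -> int:
    """Map a partition key to a shard index in [0, num_clusters).

    Uses a stable hash (MD5 here, purely for determinism across
    processes; Python's built-in hash() is salted per process).
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_clusters


def cluster_for_key(partition_key: str) -> str:
    """Return the (hypothetical) cluster name that owns this key."""
    return CLUSTERS[shard_for_key(partition_key, len(CLUSTERS))]


def nodes_per_cluster(total_nodes: int, num_clusters: int) -> int:
    """Even split of nodes across clusters, e.g. 1000 nodes over 10 shards."""
    return total_nodes // num_clusters
```

The application would call cluster_for_key() before every read or write and send the request to that cluster only. Note that plain hash-modulo routing remaps most keys when the number of clusters changes, so a lookup table or consistent hashing may be a better fit if shards are expected to be added later.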

Re: Running Large Clusters in Production

2020-07-10 Thread Jon Haddad
I worked on a handful of large clusters (> 200 nodes) using vnodes, and there were some serious issues with both performance and availability. We had to put in a LOT of work to fix the problems. I agree with Jeff - it's way better to manage multiple clusters than a really large one. On Fri,

Re: Running Large Clusters in Production

2020-07-10 Thread Jeff Jirsa
1000 instances are fine if you're not using vnodes. I'm not sure what the limit is if you're using vnodes. If you might get to 1000, shard early before you get there. Running 8x100 host clusters will be easier than one 800 host cluster. On Fri, Jul 10, 2020 at 2:19 PM Isaac Reath (BLOOMBERG/

Running Large Clusters in Production

2020-07-10 Thread Isaac Reath (BLOOMBERG/ 919 3RD A)
Hi All, I’m currently dealing with a use case that is running on around 200 nodes. Due to growth of their product as well as onboarding additional data sources, we are looking at having to expand that to around 700 nodes, and potentially beyond to 1000+. To that end I have a couple of