Datacenters and racks are different concepts.  While they don't have to be 
associated with their historical meanings, the historical meanings probably 
provide a helpful model for understanding what you want from them.

When companies own their own physical servers and have them housed somewhere, 
the question arises of where to locate any particular server.  It's a 
balancing act between network speed, so that related servers can talk to each 
other quickly, and fault tolerance, so that many servers are not all exposed 
to the same risks.

"Same rack" in that physical world tended to mean something like "all behind 
the same network switch and all sharing the same power bus".  The morning after 
an electrical glitch fries a power bus and thus everything in that rack, you 
wish you hadn't put so many of the same type of server together.  Well, they 
were servers.  Now they are door stops.  Badness and sadness.

That's the mindset to have with racks in Cassandra.  A rack is a construct 
that lets you separate servers into pools so that the pools have, hopefully, 
somewhat independent infrastructure risks.  However, all those servers are 
still doing the same kind of work, are the same version, etc.

Datacenters are amalgams of those racks, and how similar or different they are 
from each other depends on what you want to do with them.  What is true is that 
if you have N datacenters and replicate your data to all of them, each one must 
have enough disk storage to house all the data.  The actual physical footprint 
of that data in each DC depends on the replication factor set for that DC.
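
As a rough illustration of that arithmetic, here is a minimal Python sketch.  
The figures (500 GB of unique data, the DC names, replication factors, and node 
counts) are made up for the example, and the function name is hypothetical.  It 
ignores compression, compaction headroom, and snapshots, so treat it as a 
back-of-the-envelope estimate only.

```python
def per_dc_footprint(unique_data_gb, replication_factors, nodes_per_dc):
    """Estimate disk usage per datacenter.

    unique_data_gb      -- size of one copy of the data (assumed figure)
    replication_factors -- {dc_name: RF configured for that DC}
    nodes_per_dc        -- {dc_name: number of nodes in that DC}

    Returns {dc_name: (total_gb_in_dc, average_gb_per_node)}.
    """
    result = {}
    for dc, rf in replication_factors.items():
        total = unique_data_gb * rf          # each DC stores RF copies
        result[dc] = (total, total / nodes_per_dc[dc])
    return result


# Hypothetical cluster: two DCs with different replication factors.
footprint = per_dc_footprint(
    unique_data_gb=500,
    replication_factors={"dc1": 3, "dc2": 2},
    nodes_per_dc={"dc1": 6, "dc2": 4},
)
for dc, (total, per_node) in footprint.items():
    print(f"{dc}: {total} GB total, about {per_node:.0f} GB per node")
```

Both DCs hold every row, but dc1 stores three copies and dc2 stores two, so 
the totals differ even though the data set is the same.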

Note that you sorta can't have "one datacenter for writes" because the writes 
will replicate across the datacenters.  You could definitely choose to have 
only one that takes read queries, but it's best to think of writing as being 
universal.  One scenario you can have is where the DC not taking live read 
traffic is the one you use for maintenance, performance testing, or version 
upgrades.

One rack makes your life easier if you don't have a reason for multiple racks. 
It depends on the environment you deploy into and your fault tolerance goals.  
If you were in AWS and wanted to spread risk across availability zones, then 
you would likely have as many racks as the AZs you choose to be in, because 
that's really the point of using multiple AZs.
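
To make the rack idea more concrete, here is a simplified Python sketch of 
rack-aware replica selection.  It is not Cassandra's actual code: it assumes 
one token per node (no vnodes), a single DC, and a hypothetical `pick_replicas` 
helper that walks the ring clockwise, prefers racks it has not used yet, and 
only then falls back to nodes it skipped.

```python
from collections import Counter


def pick_replicas(ring, start, rf):
    """Pick rf replicas from ring, a list of (node, rack) in token order.

    Walk clockwise from index start, taking the first node seen in each
    new rack; if racks run out before rf replicas are found, fill the
    remainder from the skipped nodes in ring order.
    """
    n = len(ring)
    ordered = [ring[(start + i) % n] for i in range(n)]
    chosen, seen_racks, skipped = [], set(), []
    for node, rack in ordered:
        if len(chosen) == rf:
            break
        if rack not in seen_racks:
            chosen.append((node, rack))
            seen_racks.add(rack)
        else:
            skipped.append((node, rack))
    for item in skipped:                 # racks exhausted: reuse a rack
        if len(chosen) == rf:
            break
        chosen.append(item)
    return chosen


# Hypothetical rings: 3 racks vs. 2 racks, both with RF = 3.
three_racks = [("n1", "r1"), ("n2", "r2"), ("n3", "r3"),
               ("n4", "r1"), ("n5", "r2"), ("n6", "r3")]
two_racks = [("n1", "r1"), ("n2", "r2"), ("n3", "r1"), ("n4", "r2")]

spread_3 = Counter(rack for _, rack in pick_replicas(three_racks, 0, 3))
spread_2 = Counter(rack for _, rack in pick_replicas(two_racks, 0, 3))
print(dict(spread_3))
print(dict(spread_2))
```

With three racks and RF = 3 each rack holds exactly one replica, so losing a 
rack costs one copy.  With two racks one of them has to hold two of the three 
copies, which is the kind of imbalance the original question describes.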

R


On 10/23/19, 4:06 AM, "Sergio Bilello" <lapostadiser...@gmail.com> wrote:

    Hello guys!
    
    I was reading about 
    https://cassandra.apache.org/doc/latest/architecture/dynamo.html#networktopologystrategy
 
    
    I would like to understand a concept related to the node load balancing.
    
    I know that Jon recommends vnodes = 4, but I just found a cluster with 
    vnodes = 256, replication factor = 3, and 2 racks. This is unbalanced 
    because the number of racks is not a multiple of the replication factor.
    
    However, my plan is to move all the nodes into a single rack so that I can 
    eventually scale the cluster up and down one node at a time.
    
    If I had 3 racks and wanted to keep things balanced, I should scale up 
    3 nodes at a time, one for each rack.
    
    If I had 3 racks, should I also have 3 different datacenters, one 
    datacenter for each rack?
    
    Can I have 2 datacenters and 3 racks? If this is possible, would one 
    datacenter have more nodes than the other? Could that be a problem?
    
    I am thinking of splitting my cluster into one datacenter for reads and 
    one for writes, keeping all the nodes in the same rack so I can scale up 
    one node at a time.
    
    
    
    Please correct me if I am wrong
    
    
    
    Thanks,
    
    
    
    Sergio
    
    
    