>
> I am seeing some unbalancing and I was worried because I have 256 vnodes
> Weird stuff is related to this post where I don't find a match between the
> load and du -sh * for the node 10.1.31.60 and I was trying to figure out
> the reason, if it was due to the number of vnodes.


Out of curiosity, did you start with a smaller cluster and then add new
nodes? Just wondering if this is a case of not having run nodetool cleanup
post-expansion.

> Does Cassandra keep a copy of the data per rack so if I need to keep the
> things balanced and I would have to add 3 racks at the time in a single
> Datacenter keep the things balanced?


The output you posted shows that all nodes are in the same rack, but yes, C*
will place a replica in each rack so that each rack has a full copy.
Caveats apply: for example, this assumes RF=3 and 3 racks in the DC.

> Is it better to keep a single Rack with a single Datacenter in 3 different
> availability zones with replication factor = 3 or to have for each
> Datacenter: 1 Rack and 1 Availability Zone and eventually redirect the
> client to a fallback Datacenter in case one of the availability zone is not
> reachable


If you have RF=3 and EC2 instances in 3 AZs, you can choose to allocate the
instances to a logical C* rack based on their AZ. However, you would only
do this if each AZ has an identical number of nodes. If the node count in
each rack is not identical, you will end up with some bloated nodes, because
C* keeps a full copy of the data in each rack and the racks with fewer nodes
have to spread that copy over fewer machines.

Using racks also means that when you want to expand the cluster, you need
to provision instances in each AZ. As above, if you only provision
instances in 2 of 3 AZs (for example) then nodes in 1 AZ will be "fatter"
than nodes in the other 2 AZs.
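
To put rough numbers on the "fatter" nodes point, here is a minimal Python
sketch. It assumes RF=3 with 3 racks (so each rack owns one full copy) and
that data is spread evenly across the nodes within a rack. The function
name, the rack names and the 900 GB figure are made up for illustration.

```python
def per_node_load_gb(dataset_gb, nodes_per_rack):
    """Approximate on-disk load per node for each rack.

    Assumes RF equals the number of racks, so every rack holds one full
    copy of the dataset, spread evenly over the nodes in that rack.
    """
    return {rack: dataset_gb / count for rack, count in nodes_per_rack.items()}


# Same node count in every AZ: every node carries the same share.
balanced = per_node_load_gb(900, {"az-a": 3, "az-b": 3, "az-c": 3})

# Expanding only two of the three AZs leaves the az-c nodes "fatter".
lopsided = per_node_load_gb(900, {"az-a": 4, "az-b": 4, "az-c": 3})

print(balanced)
print(lopsided)
```

In the lopsided case the az-c nodes still hold a third of the rack's copy
each, while the nodes in the expanded AZs drop to a quarter each.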

Failing over to another DC isn't really necessary when only 1 AZ is
unreachable, provided you use a CL of LOCAL_QUORUM, since the replicas in
the remaining 2 AZs are sufficient to satisfy requests. But that really
depends on your application's business rules. Cheers!
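
The LOCAL_QUORUM arithmetic behind that can be sketched as follows. This is
only an illustration, assuming one replica per AZ (RF=3 across 3 racks); the
helper names are hypothetical and not part of any Cassandra API.

```python
def quorum(replication_factor):
    # A quorum is a majority of the replicas: floor(RF / 2) + 1.
    return replication_factor // 2 + 1


def local_quorum_available(replicas_per_az, unreachable_azs):
    """Return True if enough replicas in the local DC are still reachable."""
    rf = sum(replicas_per_az.values())
    alive = sum(
        count
        for az, count in replicas_per_az.items()
        if az not in unreachable_azs
    )
    return alive >= quorum(rf)


# Assumed layout: one replica in each of three AZs.
replicas = {"az-a": 1, "az-b": 1, "az-c": 1}

one_az_down = local_quorum_available(replicas, {"az-a"})
two_azs_down = local_quorum_available(replicas, {"az-a", "az-b"})

print(one_az_down, two_azs_down)
```

With RF=3 the quorum is 2, so losing one AZ still leaves enough replicas,
while losing two does not.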
