Re: Understanding rack in cassandra-rackdc.properties

2023-04-03 Thread Tibor Répási
Usually it’s good practice for the Cassandra topology to mirror the real datacenter, 
so nodes mounted in distinct physical racks are known to Cassandra under different 
rack names. The reason is that typical datacenter infrastructure has a single point 
of failure in each rack - e.g. a network switch. In that sense a rack can be 
considered a failure domain within the datacenter. Cassandra makes an effort to 
distribute token ranges among the available nodes so that as few replicas as 
possible of any partition end up in the same rack. In the best case you can lose a 
whole rack without losing more than a single replica of the affected partitions. 
(Note: this is just best effort.)
In some cases you can run into issues, e.g. if the number of nodes is very small, 
or if nodes share some other resource that behaves as a single point of failure - 
as VMs on the same hypervisor do. In such cases it might be better to configure 
every Cassandra node with the same rack.
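
To make this concrete, a minimal sketch (the keyspace name is made up; 'dc1' 
matches the dc used in the cassandra-rackdc.properties below): with 
GossipingPropertyFileSnitch, rack-aware replica placement is done by 
NetworkTopologyStrategy, so a keyspace defined as

    -- hypothetical keyspace, RF=3 in datacenter dc1
    CREATE KEYSPACE my_ks
      WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};

will, where possible, place the three replicas of each partition on nodes in 
three different racks of dc1.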

> On 3. Apr 2023, at 17:11, David Tinker  wrote:
> 
> I have a 3 node cluster using the GossipingPropertyFileSnitch and replication 
> factor of 3. All nodes are leased hardware and more or less the same. The 
> cassandra-rackdc.properties files look like this:
> 
> dc=dc1
> rack=rack1
> (rack2 and rack3 for the other nodes)
> 
> Now I need to expand the cluster. I was going to use rack4 for the next node, 
> then rack5 and rack6 because the nodes are physically all on different racks. 
> Elsewhere on this list someone mentioned that I should use rack1, rack2 and 
> rack3 again.
> 
> Why is that?
> 
> Thanks
> David
> 



Re: Understanding rack in cassandra-rackdc.properties

2023-04-03 Thread Bowen Song via user
I just want to mention that a "rack" in Cassandra doesn't need to match 
a physical rack. As long as each "rack" in Cassandra fails independently 
of the others, it is fine.


That means if you have 6 physical servers, each in a unique physical 
rack, and Cassandra RF=3, you can use any of the following 
configurations; each of them makes sense and all of them will work 
correctly (configuration 2 is sketched after the list):


1. 6 racks in Cassandra, each contains only 1 server

2. 3 racks in Cassandra, each contains 2 servers

3. 1 rack in Cassandra, with all 6 servers in it
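
For example, configuration 2 would be expressed in each node's 
cassandra-rackdc.properties like this (the rack names are arbitrary labels; 
any grouping of two servers per name will do):

    # servers 1 and 2
    dc=dc1
    rack=rack1

    # servers 3 and 4
    dc=dc1
    rack=rack2

    # servers 5 and 6
    dc=dc1
    rack=rack3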



On 03/04/2023 16:14, Jeff Jirsa wrote:
As long as the number of racks is already at or above the replication 
factor, it's gonna be fine.


Where it tends to surprise people is if you have RF=3 and either 1 or 
2 racks and then add a third: that third rack gets one copy of 
"all" of the data, so you often run out of disk space.


If you're already at 3 nodes / 3 racks / RF=3, you're already evenly 
distributed; the next (4th, 5th, 6th) racks will just be randomly 
assigned based on the random token allocation.




On Mon, Apr 3, 2023 at 8:12 AM David Tinker  
wrote:


I have a 3 node cluster using the GossipingPropertyFileSnitch and
replication factor of 3. All nodes are leased hardware and more or
less the same. The cassandra-rackdc.properties files look like this:

dc=dc1
rack=rack1
(rack2 and rack3 for the other nodes)

Now I need to expand the cluster. I was going to use rack4 for the
next node, then rack5 and rack6 because the nodes are physically
all on different racks. Elsewhere on this list someone mentioned
that I should use rack1, rack2 and rack3 again.

Why is that?

Thanks
David


Re: Understanding rack in cassandra-rackdc.properties

2023-04-03 Thread Jeff Jirsa
As long as the number of racks is already at or above the replication factor,
it's gonna be fine.

Where it tends to surprise people is if you have RF=3 and either 1 or 2
racks and then add a third: that third rack gets one copy of "all" of
the data, so you often run out of disk space.

If you're already at 3 nodes / 3 racks / RF=3, you're already evenly
distributed; the next (4th, 5th, 6th) racks will just be randomly assigned
based on the random token allocation.
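
If you want to sanity-check the distribution after adding nodes (just a
suggestion, not something from this thread), run nodetool status with the
keyspace name, e.g.

    nodetool status my_ks    # 'my_ks' is a placeholder for your keyspace

and look at the Rack and "Owns (effective)" columns: a lone node in a
brand-new rack owning close to 100% of the data is exactly the disk-space
trap described above.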



On Mon, Apr 3, 2023 at 8:12 AM David Tinker  wrote:

> I have a 3 node cluster using the GossipingPropertyFileSnitch and
> replication factor of 3. All nodes are leased hardware and more or less the
> same. The cassandra-rackdc.properties files look like this:
>
> dc=dc1
> rack=rack1
> (rack2 and rack3 for the other nodes)
>
> Now I need to expand the cluster. I was going to use rack4 for the next
> node, then rack5 and rack6 because the nodes are physically all on
> different racks. Elsewhere on this list someone mentioned that I should use
> rack1, rack2 and rack3 again.
>
> Why is that?
>
> Thanks
> David
>
>


Understanding rack in cassandra-rackdc.properties

2023-04-03 Thread David Tinker
I have a 3 node cluster using the GossipingPropertyFileSnitch and
replication factor of 3. All nodes are leased hardware and more or less the
same. The cassandra-rackdc.properties files look like this:

dc=dc1
rack=rack1
(rack2 and rack3 for the other nodes)

Now I need to expand the cluster. I was going to use rack4 for the next
node, then rack5 and rack6 because the nodes are physically all on
different racks. Elsewhere on this list someone mentioned that I should use
rack1, rack2 and rack3 again.

Why is that?

Thanks
David