Hello community,
We have a Cassandra cluster consisting of 4 datacenters with 22 Cassandra
nodes each.
A datacenter removal was executed for maintenance purposes.
Before the datacenter removal, a repair operation was triggered to
synchronize the data.
This repair operation was stopped abruptly, leaving one Cassandra node in an
overloaded state in which a lot of internal messages (WRITE, READ, etc.) were
being dropped.
During this overload situation, I observed that a lot of hint files were
replayed towards the rest of the Cassandra nodes of the cluster.
From my understanding, hint files are generated and replayed by a coordinator
node when the replica nodes are unavailable during a Write operation.
In this case, the Cassandra nodes belonging to the datacenter to be removed
were isolated from all clients, meaning that in theory none of them should
have been taking the coordinator role, since the client processes were
preferring the other datacenters.
My question is somewhat theoretical.
Is there any other operation that I should consider in the above-described
scenario which could trigger a Cassandra node to generate and replay hint
files?

BR
MK
