Hi,

I have a Cassandra setup with multiple data centres. The vast majority of 
writes are LOCAL_ONE writes to data centre DC-A. One node (let's call it A1) 
in DC-A has accumulated a large volume of hint files (~100 GB). In the logs of 
this node I see lots of messages like the following:

INFO  [HintsDispatcher:26] 2019-03-28 01:49:25,217 
HintsDispatchExecutor.java:289 - Finished hinted handoff of file 
db485ac6-8acd-4241-9e21-7a2b540459de-1553419324363-1.hints to endpoint 
/10.10.2.55: db485ac6-8acd-4241-9e21-7a2b540459de
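
As a side note, the hints file name in that message seems to carry some 
information of its own. Below is a minimal Python sketch for decoding it. It 
assumes the name follows the pattern 
<host id>-<creation time in epoch milliseconds>-<version>.hints, which is only 
my reading of the log line and not something I have confirmed, and that the 
log timestamp is in UTC. The helper names parse_hint_file and hint_age are 
made up for this example.

```python
import re
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Assumed layout: <host id (UUID)>-<creation time in ms>-<version>.hints
HINT_FILE = re.compile(
    r"(?P<host_id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
    r"-(?P<created_ms>\d+)-(?P<version>\d+)\.hints"
)


def parse_hint_file(name):
    """Return (host id, creation time in UTC, version) for a hints file name."""
    match = HINT_FILE.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected hints file name: {name}")
    # Integer milliseconds are added to the epoch to avoid float rounding.
    created = EPOCH + timedelta(milliseconds=int(match["created_ms"]))
    return match["host_id"], created, int(match["version"])


def hint_age(name, dispatched_at):
    """Time between creation of the hints file and its dispatch."""
    _, created, _ = parse_hint_file(name)
    return dispatched_at - created


if __name__ == "__main__":
    # Timestamp taken from the log line above, assumed to be UTC.
    dispatched = datetime(2019, 3, 28, 1, 49, 25, 217000, tzinfo=timezone.utc)
    name = "db485ac6-8acd-4241-9e21-7a2b540459de-1553419324363-1.hints"
    print(parse_hint_file(name))
    print(hint_age(name, dispatched))
```

If the assumed pattern is right, the difference between the creation time in 
the file name and the dispatch time in the log shows how long a hints file sat 
on A1 before it was delivered.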

The node 10.10.2.55 is in DC-B; let's call this node B1. There is no indication 
whatsoever that B1 was down: nothing in our monitoring, nothing in the logs of 
B1, nothing in the logs of A1. Are there any other situations, apart from A1's 
failure detector marking B1 as down, in which hints for B1 are stored on A1? 
For example, could the reason for the hints be that B1 is overloaded and 
cannot handle the intake from A1? Or that the network connection between 
DC-A and DC-B is too slow?

While researching this I also found the following information on Stack Overflow 
from Ben Slater regarding hints and multi-dc replication:

Another factor here is the consistency level you are using - a LOCAL_* 
consistency level will only require writes to be written to the local DC for 
the operation to be considered a success (and hints will be stored for 
replication to the other DC).
(…)
The hints are the records of writes that have been made in one DC that are not 
yet replicated to the other DC (or even nodes within a DC). I think your 
options to avoid them are: (1) write with ALL or QUORUM (not LOCAL_*) 
consistency - this will slow down your writes but will ensure writes go into 
both DCs before the op completes (2) Don't replicate the data to the second DC 
(by setting the replication factor to 0 for the second DC in the keyspace 
definition) (3) Increase the capacity of the second DC so it can keep up with 
the writes (4) Slow down your writes so the second DC can keep up.

Source: https://stackoverflow.com/a/37382726

This reads as if hints are used for “normal” (async) replication between data 
centres, i.e. hints could show up without any nodes being down whatsoever. This 
could explain what I am seeing. Does anyone know more about this? Does that 
mean I will see hints even if I disable hinted handoff?

Any pointers or help are greatly appreciated!

Thanks in advance
Jens



