Today I realized that one of the nodes in our Cassandra cluster (2.1.7) is
storing a lot of hints (>80GB) and I fail to see a convincing way to deal
with them.

From the system.log:
INFO  [ScheduledTasks:1] 2015-09-23 14:27:06,692 StatusLogger.java:115 -
system.hints                      276,1010945
INFO  [ScheduledTasks:1] 2015-09-23 14:38:06,722 StatusLogger.java:115 -
system.hints                      968,2968163
INFO  [ScheduledTasks:1] 2015-09-23 14:38:41,742 StatusLogger.java:115 -
system.hints                     1317,3799471
INFO  [ScheduledTasks:1] 2015-09-23 14:49:16,775 StatusLogger.java:115 -
system.hints                     1519,4399905
INFO  [ScheduledTasks:1] 2015-09-23 14:49:36,793 StatusLogger.java:115 -
system.hints                     2247,6514649
INFO  [ScheduledTasks:1] 2015-09-23 14:49:41,811 StatusLogger.java:115 -
system.hints                     2247,6514649
INFO  [ScheduledTasks:1] 2015-09-23 14:49:51,830 StatusLogger.java:115 -
system.hints                     2368,6733293
INFO  [ScheduledTasks:1] 2015-09-23 15:00:41,885 StatusLogger.java:115 -
system.hints                    283,450166810
INFO  [ScheduledTasks:1] 2015-09-23 15:12:16,919 StatusLogger.java:115 -
system.hints                       232,970964
INFO  [ScheduledTasks:1] 2015-09-23 15:12:31,934 StatusLogger.java:115 -
system.hints                      581,2034388
INFO  [ScheduledTasks:1] 2015-09-23 15:23:46,973 StatusLogger.java:115 -
system.hints                       234,321566
INFO  [ScheduledTasks:1] 2015-09-23 15:24:01,988 StatusLogger.java:115 -
system.hints                       368,935634
INFO  [ScheduledTasks:1] 2015-09-23 15:35:12,039 StatusLogger.java:115 -
system.hints                       264,636164
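
To follow these readings over time, a small script along the following lines
could pull them out of system.log. This is only a sketch: the function name
parse_hints is hypothetical, and reading the two numbers as memtable
operations and memtable data size is an assumption about the StatusLogger
output, not something confirmed above. It accepts both the wrapped form
shown in this mail and unwrapped single-line entries.

```python
import re

# Timestamp of a StatusLogger entry, e.g. "2015-09-23 14:27:06,692 StatusLogger.java:115".
TS_RE = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} StatusLogger")
# Value pair for system.hints, e.g. "system.hints    276,1010945".
# Assumption: the pair is (memtable ops, memtable data size).
HINTS_RE = re.compile(r"system\.hints\s+(\d+),(\d+)")


def parse_hints(lines):
    """Return a list of (timestamp, ops, data) tuples for system.hints."""
    readings = []
    timestamp = None
    for line in lines:
        ts = TS_RE.search(line)
        if ts:
            # Remember the most recent StatusLogger timestamp.
            timestamp = ts.group(1)
        m = HINTS_RE.search(line)
        if m and timestamp is not None:
            readings.append((timestamp, int(m.group(1)), int(m.group(2))))
            timestamp = None  # each timestamp is used for one reading only
    return readings


sample = [
    "INFO  [ScheduledTasks:1] 2015-09-23 15:00:41,885 StatusLogger.java:115 -",
    "system.hints                    283,450166810",
    "INFO  [ScheduledTasks:1] 2015-09-23 15:12:16,919 StatusLogger.java:115 -",
    "system.hints                       232,970964",
]

for timestamp, ops, data in parse_hints(sample):
    print(timestamp, ops, data)
```

Fed the whole log, this would make spikes such as the 15:00:41 entry easier
to spot and to correlate with load on the node.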

The cluster seems stable: we have had no downtime, although the load on
one of the nodes is sometimes quite high.

We had a look into the table system.hints and from there we learnt that
most hints are for one of the nodes in our 2nd datacenter, and that most
of the mutations are increments to one of our counter tables, which are
very frequent.

There seem to be no other suspicious messages in the log apart from a
few dropped events.

We have several questions:
- What could be the reason that only one of the nodes has hints, and only
for one target node, although every other node should sometimes be the
coordinator for these queries as well?
- Is there a way to turn off hinted handoff at the table level or at the
data center level?
- What could we do to investigate the cause of this issue more deeply?

Thank you!
Kind regards
Björn Hachmann
