I installed Cassandra on three nodes. I then ran a test suite against them to generate load. The test suite is designed to generate the same type of load that we plan to have in production. As one of many tests, I reset one of the nodes to check the failure/recovery modes. Cassandra worked just fine.
I stopped the load generation and got distracted by another project. A few days later, I noticed something strange on one of the nodes. On this node, hinted handoff starts every ten minutes, and although it seems to finish without any errors, it starts again ten minutes later. None of the nodes has had any traffic for several days. I checked the logs, and this goes back to the initial failure/recovery testing:

INFO [HintedHandoff:1] 2012-10-18 10:19:26,618 HintedHandOffManager.java (line 294) Started hinted handoff for token: 113427455640312821154458202477256070484 with IP: /192.168.128.136
INFO [HintedHandoff:1] 2012-10-18 10:19:26,779 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /192.168.128.136
INFO [HintedHandoff:1] 2012-10-18 10:29:26,622 HintedHandOffManager.java (line 294) Started hinted handoff for token: 113427455640312821154458202477256070484 with IP: /192.168.128.136
INFO [HintedHandoff:1] 2012-10-18 10:29:26,735 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /192.168.128.136
INFO [HintedHandoff:1] 2012-10-18 10:39:26,624 HintedHandOffManager.java (line 294) Started hinted handoff for token: 113427455640312821154458202477256070484 with IP: /192.168.128.136
INFO [HintedHandoff:1] 2012-10-18 10:39:26,751 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /192.168.128.136

The other nodes are happy and don't show this behavior. All the test data is readable, and everything else is fine, but I'm curious why hinted handoff keeps running on this one node.

I searched the bug database and found a bug that seems to have the same symptoms:

https://issues.apache.org/jira/browse/CASSANDRA-3733

Although it's been marked fixed in 0.6, it describes my problem exactly. I'm running Cassandra 1.1.5 from DataStax on CentOS 6.0:

http://rpm.datastax.com/community/noarch/apache-cassandra11-1.1.5-1.noarch.rpm

Is anyone else seeing this behavior?
What can I do to provide more information?

Steve