I installed Cassandra on three nodes. I then ran a test suite against them to 
generate load. The test suite is designed to generate the same type of load 
that we plan to have in production. As one of many tests, I reset one of the 
nodes to check the failure/recovery modes.  Cassandra worked just fine.

I stopped the load generation and got distracted by another project. A few days
later, I noticed something strange on one of the nodes. On this node, hinted
handoff starts every ten minutes. Each run seems to finish without errors,
reporting 0 rows delivered, and then it starts again ten minutes later. None of
the nodes has had any traffic for several days. I checked the logs, and this
goes back to the initial failure/recovery testing:

INFO [HintedHandoff:1] 2012-10-18 10:19:26,618 HintedHandOffManager.java (line 294) Started hinted handoff for token: 113427455640312821154458202477256070484 with IP: /192.168.128.136
INFO [HintedHandoff:1] 2012-10-18 10:19:26,779 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /192.168.128.136
INFO [HintedHandoff:1] 2012-10-18 10:29:26,622 HintedHandOffManager.java (line 294) Started hinted handoff for token: 113427455640312821154458202477256070484 with IP: /192.168.128.136
INFO [HintedHandoff:1] 2012-10-18 10:29:26,735 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /192.168.128.136
INFO [HintedHandoff:1] 2012-10-18 10:39:26,624 HintedHandOffManager.java (line 294) Started hinted handoff for token: 113427455640312821154458202477256070484 with IP: /192.168.128.136
INFO [HintedHandoff:1] 2012-10-18 10:39:26,751 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /192.168.128.136
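
To confirm the ten-minute cadence across the whole log rather than by eye, a
small script along these lines could help. This is only a minimal sketch: it
assumes the log lines look exactly like the excerpt above, and the function
name handoff_intervals is made up for illustration. It is not part of
Cassandra.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches only the "Started hinted handoff" entries in the format shown in
# the excerpt. \s+ between fields tolerates lines that were hard-wrapped.
START_RE = re.compile(
    r"INFO\s+\[HintedHandoff:\d+\]\s+"
    r"(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2},\d{3})\s+"
    r"HintedHandOffManager\.java\s+\(line\s+\d+\)\s+"
    r"Started\s+hinted\s+handoff\s+for\s+token:\s+(\d+)\s+"
    r"with\s+IP:\s+/(\S+)"
)


def handoff_intervals(log_text):
    """Return {(token, ip): [seconds between consecutive handoff starts]}.

    Hypothetical helper for illustration; assumes entries appear in
    chronological order, as they do in a normal log file.
    """
    starts = defaultdict(list)
    for date, time, token, ip in START_RE.findall(log_text):
        ts = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S,%f")
        starts[(token, ip)].append(ts)
    return {
        key: [(b - a).total_seconds() for a, b in zip(times, times[1:])]
        for key, times in starts.items()
    }
```

Passing the contents of the node's system.log to handoff_intervals gives, for
each token/IP pair, the gaps in seconds between successive handoff starts. The
location of that log depends on how Cassandra was installed, so the path is
left to the reader.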

The other nodes are happy and don't show this behavior. All the test data is 
readable, and everything is fine, but I'm curious why hinted handoff is running 
on one node all the time.

I searched the bug database, and I found a bug that seems to have the same 
symptoms:
https://issues.apache.org/jira/browse/CASSANDRA-3733
Although it's been marked fixed in 0.6, this describes my problem exactly.

I'm running Cassandra 1.1.5 from DataStax on CentOS 6.0:
http://rpm.datastax.com/community/noarch/apache-cassandra11-1.1.5-1.noarch.rpm

Is anyone else seeing this behavior? What can I do to provide more information?

Steve
