Hello,

Last night one of our nodes froze and the server had to be rebooted. After it
came up, the node rejoined the ring and everything looked normal.
However, this morning there seem to be some inconsistencies in the data (e.g.
some nodes don't have a given record, or have a different version of the record
than other nodes).

The logs also contain a lot of hinted handoff messages that started after the
node failure, like these:

INFO [HintedHandoff:1] 2013-05-05 11:22:23,339 HintedHandOffManager.java (line 294) Started hinted handoff for token: 56713727820156410577229101238628035242 with IP: /107.20.45.6
INFO [HintedHandoff:1] 2013-05-05 11:22:33,343 HintedHandOffManager.java (line 372) Timed out replaying hints to /107.20.45.6; aborting further deliveries
INFO [HintedHandoff:1] 2013-05-05 11:22:33,344 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /107.20.45.6
INFO [HintedHandoff:1] 2013-05-05 11:22:33,344 HintedHandOffManager.java (line 294) Started hinted handoff for token: 0 with IP: /67.202.15.178
INFO [HintedHandoff:1] 2013-05-05 11:22:43,348 HintedHandOffManager.java (line 372) Timed out replaying hints to /67.202.15.178; aborting further deliveries
INFO [HintedHandoff:1] 2013-05-05 11:22:43,348 HintedHandOffManager.java (line 390) Finished hinted handoff of 0 rows to endpoint /67.202.15.178

Do we need to run repair on all nodes to get the cluster back to a "normal" state?

Thanks for the help.

Dan Kogan
