Hello,

We have been seeing an issue where, roughly half of the time that a node fails
or is restarted, secondary indexes appear to be partially lost or corrupted.
Part of the index simply seems to be missing.  Dropping and re-adding the index
appears to correct the issue.  I do not see any errors in the Cassandra logs.
The corruption/loss does not always show up immediately; sometimes it appears
some time after the node is restarted.  In addition, the index never appears to
have a problem while the node is down; we only see the issue after the node
comes back up and recovers.

We developed some code that walks through every row, by key, of the table that
carries the index and then attempts to look up the same row via the secondary
index, so that we can detect when the issue occurs.  Another odd observation is
that, while the issue is present, the number of entries returned by the index
varies up and down, even though the index and the tables do not change that
often.
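
For illustration, a check of this kind can be sketched as follows.  This is a
simplified, hypothetical version and not our actual code: the function name
find_index_gaps and both of its parameters are made up for this example, and
the two inputs stand in for whatever client calls fetch rows by key and query
the secondary index.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Set, Tuple


def find_index_gaps(
    rows: Iterable[Tuple[Hashable, Hashable]],
    lookup_by_index: Callable[[Hashable], Iterable[Hashable]],
) -> List[Tuple[Hashable, Hashable]]:
    """Return (key, indexed_value) pairs that the index lookup fails to return.

    rows            -- hypothetical: (row key, indexed column value) pairs
                       obtained by scanning the table by key
    lookup_by_index -- hypothetical: given an indexed value, returns the row
                       keys that the secondary index reports for it
    """
    # Cache one index lookup per distinct value so each value is queried once.
    cache: Dict[Hashable, Set[Hashable]] = {}
    missing: List[Tuple[Hashable, Hashable]] = []
    for key, value in rows:
        if value not in cache:
            cache[value] = set(lookup_by_index(value))
        # The row exists in the table, so the index should return its key.
        if key not in cache[value]:
            missing.append((key, value))
    return missing


if __name__ == "__main__":
    # In-memory stand-ins for the table and a partially lost index.
    table = {"k1": "NY", "k2": "NY", "k3": "CA"}
    index = {"NY": {"k1"}, "CA": {"k3"}}  # the entry for k2 is missing
    print(find_index_gaps(table.items(), lambda v: index.get(v, set())))
```

Running the check repeatedly and comparing the size of the returned list over
time is one way to observe the count varying up and down.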

We are running a 6-node Cassandra cluster with a replication factor of 3.  The
consistency level for all queries is LOCAL_QUORUM.  We are running Cassandra
1.1.2.

Anyone have any insights?

-Mike
