In my last test, I cleaned everything and started from scratch, and I can still repro. The AssertionError you mention below could be unrelated to my writes not working.
This time there are _no_ exceptions or errors reported from any Cassandra node. My app runs for a while before writes start failing. I can kill my app, and the failures happen again immediately upon restarting it. I didn't dig as hard last time, but now I can see that nodetool tpstats on 3 consecutive machines in the ring shows MutationStage "stuck" with piled-up requests: Completed is not going up and Pending is not going down. I have concurrent_writes = 32.

Here is a sample from one of the machines:

Pool Name        Active   Pending   Completed   Blocked   All time blocked
MutationStage        32      1492     1416296         0                  0

On 9/23/11 3:11 PM, "Brandon Williams" <dri...@gmail.com> wrote:

>On 9/23/11, Todd Burruss <bburr...@expedia.com> wrote:
>> INFO [HintedHandoff:4] 2011-09-22 22:59:27,626 HintedHandOffManager.java (line 259) Started hinted handoff for token: 150124573641590498586782915043427152112 with IP: /10.185.35.39
>> ERROR [HintedHandoff:4] 2011-09-22 22:59:27,648 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[HintedHandoff:4,5,main]
>> java.lang.AssertionError
>>     at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
>>     at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:81)
>>     at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:333)
>>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>     at java.lang.Thread.run(Thread.java:662)
>
>This sounds like you have old hints 1.0 doesn't understand:
>
>    assert versionColumn != null;
>
>-Brandon
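
For anyone wanting to check for the same symptom, here is a rough sketch of how the "stuck" MutationStage check described above could be automated by comparing two tpstats samples taken some time apart. It assumes the tpstats output has been captured as text with whitespace-separated columns like the sample row above. The names parse_tpstats, stuck_pools and PoolStats are hypothetical helpers, not part of Cassandra or nodetool, and the rule for "stuck" (Pending above zero and not shrinking, Completed unchanged) is only one reading of the symptom.

```python
from typing import Dict, List, NamedTuple


class PoolStats(NamedTuple):
    active: int
    pending: int
    completed: int


# The one row quoted in this thread, with its header line.
FIRST_SAMPLE = """\
Pool Name Active Pending Completed Blocked All time blocked
MutationStage 32 1492 1416296 0 0
"""


def parse_tpstats(text: str) -> Dict[str, PoolStats]:
    """Collect rows shaped like: <pool> <active> <pending> <completed> ..."""
    pools: Dict[str, PoolStats] = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and any row whose first three counters are not integers.
        if len(fields) >= 4 and all(f.isdigit() for f in fields[1:4]):
            pools[fields[0]] = PoolStats(
                int(fields[1]), int(fields[2]), int(fields[3])
            )
    return pools


def stuck_pools(before: Dict[str, PoolStats],
                after: Dict[str, PoolStats]) -> List[str]:
    """Pools with queued work whose Completed count did not move between samples."""
    stuck = []
    for name, new in after.items():
        old = before.get(name)
        if old is None:
            continue
        if (new.pending > 0
                and new.completed == old.completed
                and new.pending >= old.pending):
            stuck.append(name)
    return stuck


if __name__ == "__main__":
    # Illustration only: the second sample is assumed identical to the first,
    # standing in for "Completed is not going up and Pending is not going down".
    first = parse_tpstats(FIRST_SAMPLE)
    second = parse_tpstats(FIRST_SAMPLE)
    print(stuck_pools(first, second))
```

In practice the two inputs would be real tpstats captures from the same node a few seconds or minutes apart; a pool that is merely busy will show Completed increasing and will not be flagged.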