On Thu, Jun 14, 2012 at 8:26 AM, Piavlo <lolitus...@gmail.com> wrote:
> I started looking for similar messages on other nodes saw a SINGLE
> IllegalArgumentException on ValidationExecutor on the same node and 2 other
> nodes (this is a 6 node cluster) which happened at almost the same time, in
> all nodes while flushing same UserCompletions CF memtable. This happened
> about 12 hours before the IllegalArgumentException in CompactionExecutor.

This actually does not happen during a flush but during a validation
compaction, which happens during a repair.
The exception is basically saying there is an invalid composite column
name (you do use a composite comparator, right?).
I guess that could result from some on-disk corruption. Are you using
sstable compression on UserCompletions? (I am asking because
compressed sstables have checksums.)

> And even bigger problem now is that running repairs on other CFs against
> different nodes does not have any effect, for example running
> /usr/bin/nodetool -h dsc2b.internal -pr repair PRODUCTION UserDirectVendors
> does not trigger any repair activity and nothing in the logs to indicate a
> start of repair. And I have ~24hours left to repair some CFs before the gc
> period ends :(

Does that happen on every node?
What can happen is that a failed repair blocks others from
starting. One thing you can try is to run the method called
forceTerminateAllRepairSessions in JMX under
org.apache.cassandra.db->StorageService->Operations (I'm afraid there
is no nodetool hook, so you will have to use jconsole). After that, try
starting a repair again. If that doesn't work, it's worth trying to
restart the node.
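
If jconsole is not convenient (e.g. on a headless box), the same
operation can be invoked from a small JMX client. This is only a
minimal sketch: it assumes JMX is reachable on Cassandra's default
port 7199 without authentication or SSL, and the class and helper
names (TerminateRepairSessions, serviceUrl, storageService) are made
up for illustration.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class TerminateRepairSessions {
    // Assumption: default Cassandra JMX port; change it if your
    // cassandra-env.sh configures a different one.
    static final int DEFAULT_JMX_PORT = 7199;

    // Builds the standard RMI-based JMX service URL for a host and port.
    static JMXServiceURL serviceUrl(String host, int port) throws Exception {
        return new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
    }

    // The MBean that jconsole shows as org.apache.cassandra.db -> StorageService.
    static ObjectName storageService() throws Exception {
        return new ObjectName("org.apache.cassandra.db:type=StorageService");
    }

    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : DEFAULT_JMX_PORT;

        // The connector is closed automatically when the block exits.
        try (JMXConnector connector = JMXConnectorFactory.connect(serviceUrl(host, port))) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // The operation takes no arguments, hence the empty arrays.
            mbs.invoke(storageService(), "forceTerminateAllRepairSessions",
                       new Object[0], new String[0]);
            System.out.println("Requested termination of all repair sessions on " + host);
        }
    }
}
```

Run it against the node where repairs refuse to start (for instance
"java TerminateRepairSessions dsc2b.internal"), then retry the
nodetool repair.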

--
Sylvain
