I ran out of file handles on the "repairing node" after running nodetool
repair. Strange, as I have never had this issue until using 0.7.0 (though I
should say that I had not truly tested 0.7.0 until now). I upped the
number of file handles, removed the data, restarted the nodes, then restarted
my test and waited a little while. I have two keyspaces on the cluster, so I
checked the number of SSTables in one of them before "nodetool repair"
and saw 36 "Data.db" files, spread over 11 column families. Very
reasonable.
After running nodetool repair I had over 900 "Data.db" files,
immediately! Now, after waiting several hours, I have over 1500 Data.db
files. Out of these I have 95 "compacted" files.
lsof reports 803 files in use by cassandra for the "Queues" keyspace:
[cassandra@kv-app02 ~]$ /usr/sbin/lsof -p 32645|grep Data.db|grep -c Queues
803
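For anyone wanting to reproduce the before/after counts without eyeballing a directory listing, here is a minimal sketch that tallies the files per column family. It assumes the on-disk naming is `<ColumnFamily>-<version>-<generation>-Data.db`, with a matching `...-Compacted` marker for compacted SSTables; that naming is my assumption and is not confirmed anywhere in this post. The function name `count_sstables` and the directory path in the usage comment are hypothetical.

```python
import os
from collections import Counter


def count_sstables(keyspace_dir):
    """Tally SSTable data files and compaction markers per column family.

    Assumption: the column family name is the text before the first '-'
    in each file name, e.g. "MyCF-e-12-Data.db" belongs to "MyCF".
    """
    data_files = Counter()
    compacted = Counter()
    for name in os.listdir(keyspace_dir):
        cf = name.split("-", 1)[0]
        if name.endswith("-Data.db"):
            data_files[cf] += 1
        elif name.endswith("-Compacted"):
            compacted[cf] += 1
    return data_files, compacted


# Usage (the path is an assumed default; adjust to your data directory):
#   data, comp = count_sstables("/var/lib/cassandra/data/Queues")
#   for cf in sorted(data):
#       print(cf, data[cf], "data files,", comp[cf], "compacted")
```

Running this before and after the repair would show whether the growth is concentrated in a few column families or spread across all of them.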
That many open files doesn't sound right to me. Checking the server log, I
see a lot of these messages:
ERROR [RequestResponseStage:14] 2011-01-26 17:00:29,493 DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
java.lang.ArrayIndexOutOfBoundsException: -1
    at java.util.ArrayList.fastRemove(ArrayList.java:441)
    at java.util.ArrayList.remove(ArrayList.java:424)
    at com.google.common.collect.AbstractMultimap.remove(AbstractMultimap.java:219)
    at com.google.common.collect.ArrayListMultimap.remove(ArrayListMultimap.java:60)
    at org.apache.cassandra.net.MessagingService.responseReceivedFrom(MessagingService.java:436)
    at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:40)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
and a lot of these:
ERROR [ReadStage:809] 2011-01-26 21:48:01,047 DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
java.lang.ArrayIndexOutOfBoundsException
ERROR [ReadStage:809] 2011-01-26 21:48:01,047 AbstractCassandraDaemon.java (line 91) Fatal exception in thread Thread[ReadStage:809,5,main]
java.lang.ArrayIndexOutOfBoundsException
and some more like this:
ERROR [ReadStage:15] 2011-01-26 20:59:14,695 DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
java.lang.ArrayIndexOutOfBoundsException: 6
    at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:56)
    at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
    at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
    at org.apache.cassandra.db.filter.QueryFilter$1.compare(QueryFilter.java:98)
    at org.apache.cassandra.db.filter.QueryFilter$1.compare(QueryFilter.java:95)
    at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:334)
    at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
    at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
    at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:118)
    at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(QueryFilter.java:142)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1230)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1107)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1077)
    at org.apache.cassandra.db.Table.getRow(Table.java:384)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:63)
    at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:68)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
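The last trace fails with index 6 inside TimeUUIDType.compareTimestampBytes, which would be consistent with one of the two column names being compared having fewer than 7 bytes, i.e. not being a full 16-byte UUID. The trace alone does not confirm this, so treat it as one possible reading. If it is worth ruling out on the client side, a minimal sketch of a check before writing to a TimeUUID-compared column family could look like the following. The helper name `is_time_uuid` is hypothetical, and it only uses Python's standard `uuid` module, not any Cassandra client API.

```python
import uuid


def is_time_uuid(name: bytes) -> bool:
    """Return True if `name` is a 16-byte version-1 (time-based) UUID."""
    if len(name) != 16:
        # Too short or too long to be a UUID at all.
        return False
    return uuid.UUID(bytes=name).version == 1


# Usage: validate a column name before sending it to the cluster.
#   good = uuid.uuid1().bytes     # time-based UUID, 16 bytes
#   is_time_uuid(good)            # True
#   is_time_uuid(b"abc")          # False, only 3 bytes
```

This does not explain how a short name would have reached the read path on the server; it only helps exclude malformed client input as the source.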