Merkle trees have a fixed size/depth (2**20), so it’s not that, but it could be timing out elsewhere (or still running validation or something)
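If it's the latter, something like this on the affected node should show whether validation work is still in flight (plain nodetool, output varies slightly by version):

    nodetool compactionstats   # look for compactions of type "Validation"
    nodetool netstats          # shows repair sessions / streaming still in progress
    nodetool tpstats           # pending/blocked counts for ValidationExecutor and ReadRepairStage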
-- 
Jeff Jirsa


> On Nov 30, 2017, at 10:12 AM, Javier Canillas <javier.canil...@gmail.com> wrote:
>
> Christian,
>
> I'm not an expert, but maybe the merkle tree is too big to transfer between
> nodes and that's why it times out. How many nodes do you have and what's the
> size of the keyspace? Have you ever done a successful repair before?
>
> Cassandra Reaper does repairs based on token range (or even part of one),
> which is why it can get away with a small merkle tree.
>
> Regards,
>
> Javier.
>
> 2017-11-30 6:48 GMT-03:00 Christian Lorenz <christian.lor...@webtrekk.com>:
>> Hello,
>>
>> After updating our cluster to Cassandra 3.11.1 (previously 3.9), running
>> 'nodetool repair --full' leads to the node crashing.
>>
>> The logfile showed the following exception:
>>
>> ERROR [ReadRepairStage:36] 2017-11-30 07:42:06,439 CassandraDaemon.java:228 - Exception in thread Thread[ReadRepairStage:36,5,main]
>> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
>>         at org.apache.cassandra.service.DataResolver$RepairMergeListener.close(DataResolver.java:199) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.close(UnfilteredPartitionIterators.java:175) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:92) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.service.DataResolver.compareResponses(DataResolver.java:76) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:50) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_151]
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_151]
>>         at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_151]
>>
>> The node data size is ~270 GB. A repair with Cassandra Reaper works fine, though.
>>
>> Any idea why this could be happening?
>>
>> Regards,
>>
>> Christian
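On the Reaper point above: it repairs the ring in many small sub-ranges rather than a node's whole range in one go, which is roughly equivalent to running something like the following by hand for each segment (the keyspace name and tokens below are placeholders):

    # repair one small segment of the ring instead of the node's whole range;
    # <start_token>/<end_token> and my_keyspace are placeholders
    nodetool repair --full -st <start_token> -et <end_token> my_keyspace

Each segment builds its own small merkle trees, so a single segment that times out can be retried on its own instead of failing the whole repair.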