Thanks Alexander,

I have now started running the repair with the -pr argument and with
keyspace and table arguments.
I still got the following error for one of the tables:

"ERROR [RepairJobTask:1] 2016-09-28 11:34:38,288
RepairRunnable.java:246 - Repair session
89af4d10-856f-11e6-b28f-df99132d7979 for range
[(8323429577695061526,8326640819362122791],
..., (4212695343340915405,4229348077081465596]]] Validation failed in
/10.45.113.88"

10.45.113.88 is the IP of the machine I am running nodetool on.
Is this normal?

Thanks,
Robert




Robert Sicoie

On Wed, Sep 28, 2016 at 11:53 AM, Alexander Dejanovski <
a...@thelastpickle.com> wrote:

> Hi,
>
> nodetool scrub won't help here, as what you're experiencing is most likely
> that one SSTable is going through anticompaction, and then another node is
> asking for a Merkle tree that involves it.
> For understandable reasons, an SSTable cannot be anticompacted and
> validation compacted at the same time.
>
> The solution here is to adjust the repair pressure on your cluster so that
> anticompaction can end before you run repair on another node.
> You may have a lot of anticompaction to do if you had high volumes of
> unrepaired data, which can take a long time depending on several factors.
>
> You can tune your repair process to make sure no anticompaction is running
> before launching a new session on another node, or you can try my Reaper
> fork that handles incremental repair:
> https://github.com/adejanovski/cassandra-reaper/tree/inc-repair-support-with-ui
> I may have to add a few checks in order to avoid all collisions between
> anticompactions and new sessions, but it should be helpful if you struggle
> with incremental repair.
>
> In any case, check if your nodes are still anticompacting before trying to
> run a new repair session on a node.
>
> Cheers,
>
>
> On Wed, Sep 28, 2016 at 10:31 AM Robert Sicoie <robert.sic...@gmail.com>
> wrote:
>
>> Hi guys,
>>
>> I have a cluster of 5 nodes running Cassandra 3.0.5.
>> I was running nodetool repair over the last few days, one node at a time,
>> when I first encountered this exception:
>>
>> ERROR [ValidationExecutor:11] 2016-09-27 16:12:20,409 CassandraDaemon.java:195 - Exception in thread Thread[ValidationExecutor:11,1,main]
>> java.lang.RuntimeException: Cannot start multiple repair sessions over the same sstables
>>     at org.apache.cassandra.db.compaction.CompactionManager.getSSTablesToValidate(CompactionManager.java:1194) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1084) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at org.apache.cassandra.db.compaction.CompactionManager.access$700(CompactionManager.java:80) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at org.apache.cassandra.db.compaction.CompactionManager$10.call(CompactionManager.java:714) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_60]
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_60]
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_60]
>>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
>>
>> On some of the other boxes I see this:
>>
>>
>> Caused by: org.apache.cassandra.exceptions.RepairException: [repair #9dd21ab0-83f4-11e6-b28f-df99132d7979 on notes/operator_source_mv, [(-7505573573695693981,-7495786486761919991],
>> ....
>> (-8483612809930827919,-8480482504800860871]]] Validation failed in /10.45.113.67
>>     at org.apache.cassandra.repair.ValidationTask.treesReceived(ValidationTask.java:68) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:183) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:408) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:168) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_60]
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_60]
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_60]
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_60]
>>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
>> ERROR [RepairJobTask:3] 2016-09-26 16:39:33,096 CassandraDaemon.java:195 - Exception in thread Thread[RepairJobTask:3,5,RMI Runtime]
>> java.lang.AssertionError: java.lang.InterruptedException
>>     at org.apache.cassandra.net.OutboundTcpConnection.enqueue(OutboundTcpConnection.java:172) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:761) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:729) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at org.apache.cassandra.repair.ValidationTask.run(ValidationTask.java:56) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_60]
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_60]
>>     at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_60]
>> Caused by: java.lang.InterruptedException: null
>>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220) ~[na:1.8.0_60]
>>     at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335) ~[na:1.8.0_60]
>>     at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339) ~[na:1.8.0_60]
>>     at org.apache.cassandra.net.OutboundTcpConnection.enqueue(OutboundTcpConnection.java:168) ~[apache-cassandra-3.0.5.jar:3.0.5]
>>     ... 6 common frames omitted
>>
>>
>> Now if I run nodetool repair, I get this exception:
>>
>> java.lang.RuntimeException: Cannot start multiple repair sessions over
>> the same sstables
>>
>> What do you suggest? Would nodetool scrub or sstablescrub help in this
>> case, or would it just make things worse?
>>
>> Thanks,
>>
>> Robert
>>
> --
> -----------------
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
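
The advice quoted above, to check that no node is still anticompacting
before starting a new repair session, could be scripted roughly as follows.
This is only a sketch under assumptions: it relies on `nodetool
compactionstats` listing running anticompactions with the word
"anticompaction" in the compaction type, and the function names and the
sample text in the usage note are hypothetical, not taken from the thread.

```python
import subprocess


def anticompaction_running(compactionstats_output: str) -> bool:
    """Return True if any line of the given text mentions anticompaction.

    Assumption: `nodetool compactionstats` reports a running
    anticompaction with that word in the compaction type column.
    """
    return any(
        "anticompaction" in line.lower()
        for line in compactionstats_output.splitlines()
    )


def node_is_anticompacting(host: str) -> bool:
    """Run `nodetool -h <host> compactionstats` and inspect its output.

    Raises CalledProcessError if nodetool exits with a non-zero status.
    """
    result = subprocess.run(
        ["nodetool", "-h", host, "compactionstats"],
        capture_output=True,
        text=True,
        check=True,
    )
    return anticompaction_running(result.stdout)
```

A wrapper script could call node_is_anticompacting() for every node in the
cluster and only launch the next `nodetool repair` once all of them return
False. The parsing function can be tried on its own with made-up text, for
example anticompaction_running("pending tasks: 0").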
