The parent repair session will be on the node that you kicked off the
repair on. Are the logs above from that node? Can you make it a bit clearer
how many nodes are involved and the corresponding logs from each node?
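
One way to line the logs up per node is to group the lines of each node's debug.log by the repair session UUID they mention. Below is a minimal sketch of that idea in Python, not anything Cassandra ships. The function names and node labels are made up for illustration, and it assumes session ids appear as plain UUIDs, as they do in the quoted logs.

```python
import re
from collections import defaultdict

# Assumption: repair session ids show up as plain lowercase UUIDs in the log,
# as in the lines quoted in this thread.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def repair_lines_by_session(log_text):
    """Group the lines of one node's log by the session UUID they mention."""
    sessions = defaultdict(list)
    for line in log_text.splitlines():
        # set() so a line that repeats the same id is only recorded once.
        for session_id in set(UUID_RE.findall(line)):
            sessions[session_id].append(line)
    return dict(sessions)


def summarize(logs_by_node):
    """Map each session id to {node label: number of matching lines}."""
    summary = defaultdict(dict)
    for node, text in logs_by_node.items():
        for session_id, lines in repair_lines_by_session(text).items():
            summary[session_id][node] = len(lines)
    return dict(summary)


if __name__ == "__main__":
    # Hypothetical input: in practice, read each node's debug.log here.
    logs = {
        "nodeY": "ERROR Parent repair session with id = "
                 "1de949e0-effa-11e7-8361-b7c9edfbfc33 has failed.",
    }
    for session_id, nodes in summarize(logs).items():
        print(session_id, nodes)
```

A parent session id that shows up on node Y but not on node X would suggest the logs are not from the node the repair was started on.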

On 9 January 2018 at 09:49, Hannu Kröger <hkro...@gmail.com> wrote:

> We have run restarts on the cluster and that doesn’t seem to help at all.
>
> Running repair separately for each table usually goes through, but
> running a repair on the whole keyspace doesn’t.
>
> Anything anyone?
>
> Hannu
>
>
> On 3 Jan 2018, at 23:24, Hannu Kröger <hkro...@gmail.com> wrote:
>
> I can certainly try that. No problem there.
>
> However, wouldn’t we then get this kind of error if that were the case:
>
> java.lang.RuntimeException: Cannot start multiple repair sessions over the 
> same sstables
>
> ?
>
> Hannu
>
> On 3 Jan 2018, at 20:50, Nandakishore Tokala <
> nandakishore.tok...@gmail.com> wrote:
>
> Hi Hannu,
>
> I think some of the repairs are hanging there. Please restart all the
> nodes in the cluster and then start the repair.
>
>
> Thanks
> Nanda
>
> On Wed, Jan 3, 2018 at 9:35 AM, Hannu Kröger <hkro...@gmail.com> wrote:
>
>> Additional notes:
>>
>> 1) If I run the repair just on those tables, it works fine
>> 2) Those tables are empty
>>
>> Hannu
>>
>> > On 3 Jan 2018, at 18:23, Hannu Kröger <hkro...@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > Situation is as follows:
>> >
>> > Repair was started on node X on this keyspace with -full -pr. Repair
>> fails on node Y.
>> >
>> > Node Y has debug logging on (DEBUG on org.apache.cassandra) and I’m
>> looking at the debug.log. I see the following messages related to this
>> repair request:
>> >
>> > -----------
>> > DEBUG [AntiEntropyStage:1] 2018-01-02 17:52:12,530 RepairMessageVerbHandler.java:114 - Validating ValidationRequest{gcBefore=1511473932} org.apache.cassandra.repair.messages.ValidationRequest@5a17430c
>> > DEBUG [ValidationExecutor:4] 2018-01-02 17:52:12,531 StorageService.java:3321 - Forcing flush on keyspace mykeyspace, CF mytable
>> > DEBUG [MemtablePostFlush:54] 2018-01-02 17:52:12,531 ColumnFamilyStore.java:954 - forceFlush requested but everything is clean in mytable
>> > ERROR [ValidationExecutor:4] 2018-01-02 17:52:12,532 Validator.java:268 - Failed creating a merkle tree for [repair #1df000a0-effa-11e7-8361-b7c9edfbfc33 on mykeyspace/mytable, [(6917529027641081856,-9223372036854775808]]], /123.123.123.123 (see log for details)
>> > -----------
>> >
>> > Then the same for another table, and after that the following, which
>> basically indicates that the repair “master” has told the node to abort,
>> right?
>> >
>> > -----------
>> > DEBUG [AntiEntropyStage:1] 2018-01-02 17:52:12,563 RepairMessageVerbHandler.java:142 - Got anticompaction request AnticompactionRequest{parentRepairSession=1de949e0-effa-11e7-8361-b7c9edfbfc33} org.apache.cassandra.repair.messages.AnticompactionRequest@5dc8beea
>> > ERROR [AntiEntropyStage:1] 2018-01-02 17:52:12,563 RepairMessageVerbHandler.java:168 - Got error, removing parent repair session
>> > ERROR [AntiEntropyStage:1] 2018-01-02 17:52:12,564 CassandraDaemon.java:228 - Exception in thread Thread[AntiEntropyStage:1,5,main]
>> > java.lang.RuntimeException: java.lang.RuntimeException: Parent repair session with id = 1de949e0-effa-11e7-8361-b7c9edfbfc33 has failed.
>> >        at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:171) ~[apache-cassandra-3.11.0.jar:3.11.0]
>> >        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[apache-cassandra-3.11.0.jar:3.11.0]
>> >        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_111]
>> >        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_111]
>> >        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_111]
>> >        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
>> >        at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.0.jar:3.11.0]
>> >        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
>> > Caused by: java.lang.RuntimeException: Parent repair session with id = 1de949e0-effa-11e7-8361-b7c9edfbfc33 has failed.
>> >        at org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:409) ~[apache-cassandra-3.11.0.jar:3.11.0]
>> >        at org.apache.cassandra.service.ActiveRepairService.doAntiCompaction(ActiveRepairService.java:444) ~[apache-cassandra-3.11.0.jar:3.11.0]
>> >        at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:143) ~[apache-cassandra-3.11.0.jar:3.11.0]
>> >        ... 7 common frames omitted
>> > -----------
>> >
>> > But that is almost everything in the log, and I don’t really see what
>> the original problem is.
>> >
>> > Cassandra flushes the table to start building the merkle tree, and in
>> the next millisecond it already fails the repair, without a proper
>> exception or error logged about the problem.
>> >
>> > Cassandra version is 3.11.0.
>> >
>> > Any ideas?
>> >
>> > Cheers,
>> > Hannu
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>
>
> --
> Thanks & Regards,
> Nanda Kishore
>
>
>
>
