Re: Repair fails for unknown reason

Hannu Kröger Wed, 03 Jan 2018 13:24:40 -0800

I can certainly try that. No problem there.

However, wouldn’t we then get this kind of error if that were the case:
java.lang.RuntimeException: Cannot start multiple repair sessions over the same 
sstables
?
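
One way to check which of these errors a node actually logged is to group the ERROR lines in debug.log by repair session id. Below is a minimal sketch in Python, not an official tool. It assumes the default Cassandra log layout seen in the excerpts in this thread (LEVEL [thread] timestamp File.java:line - message) with each entry on a single line, and the names LINE_RE, UUID_RE and errors_by_session are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed layout: LEVEL [thread] yyyy-mm-dd hh:mm:ss,SSS File.java:line - message
LINE_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<src>\S+\.java:\d+)\s+-\s+(?P<msg>.*)$"
)
# Repair session ids appear in the messages as ordinary UUID strings.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def errors_by_session(lines):
    """Group ERROR log entries by the repair session id(s) they mention."""
    sessions = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group("level") != "ERROR":
            continue  # skip non-ERROR entries and stack-trace continuation lines
        ids = UUID_RE.findall(m.group("msg")) or ["(no session id)"]
        for sid in ids:
            sessions[sid].append((m.group("ts"), m.group("src"), m.group("msg")))
    return dict(sessions)


if __name__ == "__main__":
    import sys

    # Usage: python repair_errors.py /path/to/debug.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for sid, entries in errors_by_session(fh).items():
            print(sid)
            for ts, src, msg in entries:
                print(f"  {ts} {src} {msg}")
```

If the "Cannot start multiple repair sessions" message never shows up in the output for the failing session, that would at least suggest the failure is not caused by overlapping sessions.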


Hannu

> On 3 Jan 2018, at 20:50, Nandakishore Tokala <nandakishore.tok...@gmail.com> 
> wrote:
> 
> Hi Hannu,
> 
> I think some of the repairs are hanging there. Please restart all the nodes 
> in the cluster and then start the repair again.
> 
> 
> Thanks
> Nanda
> 
> On Wed, Jan 3, 2018 at 9:35 AM, Hannu Kröger <hkro...@gmail.com> wrote:
> Additional notes:
> 
> 1) If I run the repair just on those tables, it works fine
> 2) Those tables are empty
> 
> Hannu
> 
> > On 3 Jan 2018, at 18:23, Hannu Kröger <hkro...@gmail.com> wrote:
> >
> > Hello,
> >
> > The situation is as follows:
> >
> > Repair was started on node X on this keyspace with --full -pr. Repair fails 
> > on node Y.
> >
> > Node Y has debug logging on (DEBUG on org.apache.cassandra) and I’m looking 
> > at the debug.log. I see the following messages related to this repair request:
> >
> > -----------
> > DEBUG [AntiEntropyStage:1] 2018-01-02 17:52:12,530 
> > RepairMessageVerbHandler.java:114 - Validating 
> > ValidationRequest{gcBefore=1511473932} 
> > org.apache.cassandra.repair.messages.ValidationRequest@5a17430c
> > DEBUG [ValidationExecutor:4] 2018-01-02 17:52:12,531 
> > StorageService.java:3321 - Forcing flush on keyspace mykeyspace, CF mytable
> > DEBUG [MemtablePostFlush:54] 2018-01-02 17:52:12,531 
> > ColumnFamilyStore.java:954 - forceFlush requested but everything is clean 
> > in mytable
> > ERROR [ValidationExecutor:4] 2018-01-02 17:52:12,532 Validator.java:268 - 
> > Failed creating a merkle tree for [repair 
> > #1df000a0-effa-11e7-8361-b7c9edfbfc33 on mykeyspace/mytable, 
> > [(6917529027641081856,-9223372036854775808]]], /123.123.123.123 
> > (see log for details)
> > -----------
> >
> > Then the same for another table, and after that the following, which 
> > suggests that the repair “master” has basically told the node to abort, right?
> >
> > -----------
> > DEBUG [AntiEntropyStage:1] 2018-01-02 17:52:12,563 
> > RepairMessageVerbHandler.java:142 - Got anticompaction request 
> > AnticompactionRequest{parentRepairSession=1de949e0-effa-11e7-8361-b7c9edfbfc33}
> >  org.apache.cassandra.repair.messages.AnticompactionRequest@5dc8beea
> > ERROR [AntiEntropyStage:1] 2018-01-02 17:52:12,563 
> > RepairMessageVerbHandler.java:168 - Got error, removing parent repair 
> > session
> > ERROR [AntiEntropyStage:1] 2018-01-02 17:52:12,564 CassandraDaemon.java:228 
> > - Exception in thread Thread[AntiEntropyStage:1,5,main]
> > java.lang.RuntimeException: java.lang.RuntimeException: Parent repair 
> > session with id = 1de949e0-effa-11e7-8361-b7c9edfbfc33 has failed.
> >        at 
> > org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:171)
> >  ~[apache-cassandra-3.11.0.jar:3.11.0]
> >        at 
> > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66)
> >  ~[apache-cassandra-3.11.0.jar:3.11.0]
> >        at 
> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> > ~[na:1.8.0_111]
> >        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> > ~[na:1.8.0_111]
> >        at 
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >  ~[na:1.8.0_111]
> >        at 
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >  [na:1.8.0_111]
> >        at 
> > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
> >  [apache-cassandra-3.11.0.jar:3.11.0]
> >        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_111]
> > Caused by: java.lang.RuntimeException: Parent repair session with id = 
> > 1de949e0-effa-11e7-8361-b7c9edfbfc33 has failed.
> >        at 
> > org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:409)
> >  ~[apache-cassandra-3.11.0.jar:3.11.0]
> >        at 
> > org.apache.cassandra.service.ActiveRepairService.doAntiCompaction(ActiveRepairService.java:444)
> >  ~[apache-cassandra-3.11.0.jar:3.11.0]
> >        at 
> > org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:143)
> >  ~[apache-cassandra-3.11.0.jar:3.11.0]
> >        ... 7 common frames omitted
> > -----------
> >
> > But that is almost all there is in the log, and I don’t really see what the 
> > original problem is.
> >
> > Cassandra flushes the table to start building the merkle tree, and in the 
> > next millisecond it already fails the repair, but without a proper exception 
> > or error logging about the problem.
> >
> > The Cassandra version is 3.11.0.
> >
> > Any ideas?
> >
> > Cheers,
> > Hannu
> 
> -- 
> Thanks & Regards,
> Nanda Kishore
