Re: Nodetool repair

Alain RODRIGUEZ Thu, 22 Sep 2016 08:30:28 -0700

As Matija mentioned, my coworker Alexander worked on Reaper. I believe the
branches of most interest would be:


Incremental repairs on Reaper:
https://github.com/adejanovski/cassandra-reaper/tree/inc-repair-that-works
UI integration with incremental repairs on Reaper:
https://github.com/adejanovski/cassandra-reaper/tree/inc-repair-support-with-ui

@George

> When I check the log for pattern "session completed successfully" in
> system.log, I see the last finished range occurred in 14 hours ago. So I
> think it is safe to say that the repair has hanged somehow.
>

What is your current setting for 'streaming_socket_timeout_in_ms'? You
might want to be aware of
https://issues.apache.org/jira/browse/CASSANDRA-8611 and
https://issues.apache.org/jira/browse/CASSANDRA-11840

Depending on how long the streams are expected to be, you might want to try
3600000 ms (1 hour) if you are currently using 0, or increase the value if
it is already set and you think you might be hitting
https://issues.apache.org/jira/browse/CASSANDRA-11840
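
To judge whether a repair has been quiet for longer than the timeout under
consideration, the log check George describes can be scripted. This is only
a rough sketch: it assumes system.log lines carry a timestamp shaped like
"2016-09-21 18:30:28,123" and that lines are in chronological order. The
function names are made up for illustration, and a long gap is a hint, not
proof, that the repair is stuck.

```python
import re
from datetime import datetime, timedelta

# Assumed timestamp layout inside each system.log line (an assumption,
# adjust to your logging configuration).
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")
PATTERN = "session completed successfully"


def last_completed_session(lines):
    """Return the timestamp of the last line containing PATTERN, or None."""
    last = None
    for line in lines:
        if PATTERN not in line:
            continue
        match = TIMESTAMP.search(line)
        if match:
            last = datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S,%f")
    return last


def looks_stalled(lines, now, timeout_ms=3600000):
    """True if the last completed session is older than timeout_ms.

    Returns False when no completed session is found, since there is
    nothing to measure the gap from.
    """
    last = last_completed_session(lines)
    if last is None:
        return False
    return now - last > timedelta(milliseconds=timeout_ms)
```

For example, with open("system.log") as f: looks_stalled(f, datetime.now())
compares the last completed range against the 1 hour value suggested above.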

> In order to start another repair, do we need to 'kill' this repair. If so,
> how do we do that?


Restarting the node is a straightforward way of doing that.

If you do not want to restart for some reason, you can use JMX
(forceTerminateAllRepairSessions). If you are going to use JMX and don't
know much about it, this video of the presentation given by Nate, another
coworker, at the Cassandra Summit 2016 might be of interest:
https://www.youtube.com/watch?v=uiUThbonnpc&index=21&list=PLm-EPIkBI3YoiA-02vufoEj4CgYvIQgIk
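
Whichever way the repair is stopped, the AntiEntropySessions counters from
'nodetool tpstats' mentioned earlier in the thread are a simple way to
confirm it is gone. A minimal sketch, assuming the first two numeric columns
of that line are active and pending, as in the output George pasted; the
function names are hypothetical:

```python
def anti_entropy_sessions(tpstats_output):
    """Return (active, pending) for the AntiEntropySessions pool, or None
    if the pool is not listed in the given 'nodetool tpstats' output."""
    for line in tpstats_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "AntiEntropySessions":
            return int(fields[1]), int(fields[2])
    return None


def repair_still_running(tpstats_output):
    """True if any repair session is active or pending on this node."""
    counts = anti_entropy_sessions(tpstats_output)
    return counts is not None and (counts[0] > 0 or counts[1] > 0)
```

The text passed in can be captured with, for example,
subprocess.run(["nodetool", "tpstats"], capture_output=True, text=True).stdout
on the node where the repair was initiated.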

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com


2016-09-22 16:45 GMT+02:00 Li, Guangxing <guangxing...@pearson.com>:

> Romain,
>
> I had another repair that seems to just hang last night. When I did 'nodetool
> tpstats' on nodes, I see the following in the node where I initiated the
> repair:
> AntiEntropySessions               1         1
> On all other nodes, I see:
> AntiEntropySessions               0         0
> When I check the log for pattern "session completed successfully" in
> system.log, I see the last finished range occurred in 14 hours ago. So I
> think it is safe to say that the repair has hanged somehow. In order to
> start another repair, do we need to 'kill' this repair. If so, how do we do
> that?
>
> Thanks.
>
> George.
>
> On Thu, Sep 22, 2016 at 6:23 AM, Romain Hardouin <romainh...@yahoo.fr>
> wrote:
>
>> I meant that pending (and active) AntiEntropySessions are a simple way to
>> check if a repair is still running on a cluster. Also have a look at
>> Cassandra reaper:
>> - https://github.com/spotify/cassandra-reaper
>>
>> - https://github.com/spodkowinski/cassandra-reaper-ui
>>
>> Best,
>> Romain
>>
>>
>>
>> On Wednesday, 21 September 2016 at 22:32, "Li, Guangxing" <
>> guangxing...@pearson.com> wrote:
>>
>> Romain,
>>
>> I started running a new repair. If I see such behavior again, I will try
>> what you mentioned.
>>
>> Thanks.
>>
>
>
