I agree with you on that. I just wanted to highlight that I am experiencing the same behavior.
Regards
Manish

On Tue, Jan 18, 2022, 22:50 Bowen Song <bo...@bso.ng> wrote:

> The link was related to Cassandra 1.2, and it was 9 years ago. Cassandra
> was full of bugs at that time, and it has improved a lot since then. For
> that reason, I would rather not compare the issue you have with a
> nine-year-old issue someone else had.
>
>
> On 18/01/2022 16:11, manish khandelwal wrote:
>
> I am not sure what is happening, but it has happened thrice: merkle
> trees are not received from the nodes of the other data center. The
> issue is along similar lines to the one mentioned here:
> https://user.cassandra.apache.narkive.com/GTbqO6za/repair-hangs-when-merkle-tree-request-is-not-acknowledged
>
> Regards
> Manish
>
> On Tue, Jan 18, 2022, 18:18 Bowen Song <bo...@bso.ng> wrote:
>
>> Keep reading the logs on the initiator and on the node sending the
>> merkle tree: does anything follow that? FYI, not every log line
>> contains the repair ID, so please read the relevant logs in
>> chronological order without filtering (e.g. "grep") on the repair ID.
>>
>> I'm sceptical that a network issue is causing all this. The merkle
>> tree is sent over a TCP connection, so the occasional dropped packet
>> during a few seconds of connectivity trouble should not cause any
>> issue for the repair. You should only start to see network-related
>> issues if the network problem persists for a period of time close to
>> or longer than the timeout values set in the cassandra.yaml file; in
>> the case of repair that is request_timeout_in_ms, which defaults to
>> 10 seconds.
>>
>> Carry on examining the logs, and you may find something useful.
>>
>> BTW, talking about stuck repairs: in my experience this can happen if
>> two or more repairs were run concurrently on the same node (regardless
>> of which node was the initiator) involving the same table. This could
>> happen if you accidentally ran "nodetool repair" on two nodes and both
>> involved the same table, or if you cancelled and then restarted a
>> "nodetool repair" on a node without waiting for, or killing, the
>> remnants of the first repair session on other nodes.
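>>
>> A quick way to check whether such leftover repair work is still
>> running on a node is to look at the relevant thread pools, e.g. (a
>> sketch; the pool names below are from the 3.x line and may differ
>> between versions):
>>
>>     nodetool tpstats | grep -E 'ValidationExecutor|AntiEntropyStage'
>>
>> Non-zero Active or Pending counts on these pools at a time when no
>> repair is expected to run would suggest remnants of an earlier repair
>> session.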
>> On 18/01/2022 11:55, manish khandelwal wrote:
>>
>> In the system logs, on the node where the repair was initiated, I see
>> that the node has requested merkle trees from all nodes, including
>> itself:
>>
>> INFO [Repair#3:1] 2022-01-14 03:32:18,805 RepairJob.java:172 - [repair
>> #6e3385e0-74d1-11ec-8e66-9f084ace9968] Requesting merkle trees for
>> tablename (to [/xyz.abc.def.14, /xyz.abc.def.13, /xyz.abc.def.12,
>> /xyz.mkn.pq.18, /xyz.mkn.pq.16, /xyz.mkn.pq.17])
>> INFO [AntiEntropyStage:1] 2022-01-14 03:32:18,841 RepairSession.java:180
>> - [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree
>> for tablename from /xyz.mkn.pq.17
>> INFO [AntiEntropyStage:1] 2022-01-14 03:32:18,847 RepairSession.java:180
>> - [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree
>> for tablename from /xyz.mkn.pq.16
>> INFO [AntiEntropyStage:1] 2022-01-14 03:32:18,851 RepairSession.java:180
>> - [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree
>> for tablename from /xyz.mkn.pq.18
>> INFO [AntiEntropyStage:1] 2022-01-14 03:32:18,856 RepairSession.java:180
>> - [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree
>> for tablename from /xyz.abc.def.14
>> INFO [AntiEntropyStage:1] 2022-01-14 03:32:18,876 RepairSession.java:180
>> - [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Received merkle tree
>> for tablename from /xyz.abc.def.12
>>
>> As per the logs, the merkle tree was not received from the node with
>> IP xyz.abc.def.13.
>>
>> In the system logs of the node with IP xyz.abc.def.13, I can see the
>> following:
>>
>> INFO [AntiEntropyStage:1] 2022-01-14 03:32:18,850 Validator.java:281 -
>> [repair #6e3385e0-74d1-11ec-8e66-9f084ace9968] Sending completed
>> merkle tree to /xyz.mkn.pq.17 for keyspace.tablename
>>
>> From the above I inferred that the repair task has become orphaned: it
>> is waiting for a merkle tree from a node, and it is never going to
>> receive it, because the tree was lost somewhere in the network in
>> between.
>>
>> Regards
>> Manish
>>
>> On Tue, Jan 18, 2022 at 4:39 PM Bowen Song <bo...@bso.ng> wrote:
>>
>>> The entry in the debug.log is not specific to a repair session, and
>>> it could also be caused by things other than network connectivity
>>> issues, such as long STW GC pauses. I usually don't start
>>> troubleshooting an issue from the debug log, as it can be rather
>>> noisy. The system.log is a better starting point.
>>>
>>> If I were to troubleshoot the issue, I would start from the system
>>> logs on the node that initiated the repair, i.e. the node you ran the
>>> "nodetool repair" command on. Follow the repair ID (a UUID) in the
>>> logs on all nodes involved in the repair, and read all related logs
>>> in chronological order to find out what exactly happened.
>>>
>>> BTW, if the issue is easily reproducible, I would re-run the repair
>>> with a reduced scope (such as a single table and token range) to get
>>> fewer logs related to the repair session. Fewer logs means less time
>>> spent reading and analysing them.
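>>>
>>> For example (the keyspace, table, and token values below are only
>>> placeholders, to be adapted to your own ring), a single-table,
>>> single-range repair could look like:
>>>
>>>     nodetool repair -st -9223372036854775808 -et -9123372036854775808 keyspace tablename
>>>
>>> which keeps the amount of repair-related logging per run small.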
>>>
>>> Hope this helps.
>>>
>>> On 18/01/2022 10:03, manish khandelwal wrote:
>>>
>>> I have a Cassandra 3.11.2 cluster with two DCs. While running repair,
>>> I am observing the following behavior.
>>>
>>> The node is not able to receive the merkle tree from one or two
>>> nodes, yet I can see that the missing nodes did send their merkle
>>> trees; they were just never received. This makes the repair hang on a
>>> consistent basis.
>>>
>>> In netstats I can see the following output:
>>>
>>> Mode: NORMAL
>>> Not sending any streams.
>>> Attempted: 7858888
>>> Mismatch (Blocking): 2560
>>> Mismatch (Background): 17173
>>> Pool Name         Active  Pending  Completed  Dropped
>>> Large messages    n/a     0        6313       3
>>> Small messages    n/a     0        55978004   3
>>> Gossip messages   n/a     0        93756      125
>>>
>>> Does it represent network issues? In the debug logs I saw this:
>>>
>>> DEBUG [MessagingService-Outgoing-hostname/xxx.yy.zz.kk-Large]
>>> 2022-01-14 05:00:19,031 OutboundTcpConnection.java:349 - Error
>>> writing to hostname/xxx.yy.zz.kk
>>> java.io.IOException: Connection timed out
>>> at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_221]
>>> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>> ~[na:1.8.0_221]
>>> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>>> ~[na:1.8.0_221]
>>> at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.8.0_221]
>>> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
>>> ~[na:1.8.0_221]
>>> at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
>>> ~[na:1.8.0_221]
>>> at java.nio.channels.Channels.writeFully(Channels.java:98)
>>> ~[na:1.8.0_221]
>>> at java.nio.channels.Channels.access$000(Channels.java:61)
>>> ~[na:1.8.0_221]
>>> at java.nio.channels.Channels$1.write(Channels.java:174) ~[na:1.8.0_221]
>>> at net.jpountz.lz4.LZ4BlockOutputStream.flushBufferedData(LZ4BlockOutputStream.java:205)
>>> ~[lz4-1.3.0.jar:na]
>>> at net.jpountz.lz4.LZ4BlockOutputStream.write(LZ4BlockOutputStream.java:158)
>>> ~[lz4-1.3.0.jar:na]
>>>
>>> Does this show any network fluctuations?
>>>
>>> Regards
>>> Manish