[ https://issues.apache.org/jira/browse/CASSANDRA-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831433#comment-17831433 ]

Stefan Miklosovic commented on CASSANDRA-19495:
-----------------------------------------------

[~paulchandler] could you try this? 
https://github.com/smiklosovic/cassandra/commit/a88b62889473d13119e6c3d912fd6635cd3f5c9e

I dug a little deeper here. The way it works is that there is a 
HintsDispatchTrigger running every 10 seconds (according to its javadoc). On 
each run it iterates over all hint stores, and if the node a hint store 
belongs to is alive (it came back online after being down), dispatching of 
that store's hints is triggered. 

Then it does this (1): as explained in the javadoc, it dispatches all hint 
files queued to be sent and closes the current writer. Closing the current 
writer means calling this (2), which puts the current descriptor as the last 
one. Basically, when a node comes back up, all full hint files are sent and 
the writer for the file that was not yet full is closed. Once that node goes 
down again, a new hint file is created and starts to fill. When the node comes 
back up again, all to-be-dispatched hints are sent over, and the current file 
is closed and will be sent in the next round, and so on ...
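
The writer life cycle above can be sketched roughly like this. This is an assumed simplification, not the real HintsStore/HintsWriter code: files are plain strings, and the class and method names are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class WriterLifecycleSketch {
    final Deque<String> dispatchQueue = new ArrayDeque<>(); // closed, dispatchable hint files
    String currentWriter; // the file currently being filled, if any

    // a hint arrives while the target node is down: open a file if needed
    void writeHint() {
        if (currentWriter == null)
            currentWriter = "hints-" + (dispatchQueue.size() + 1) + ".hints";
    }

    // node came back: send everything queued and close the open writer,
    // putting its descriptor last so it is sent in the next round
    void dispatch() {
        dispatchQueue.clear(); // full files sent
        if (currentWriter != null) {
            dispatchQueue.add(currentWriter);
            currentWriter = null;
        }
    }
}
```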

Now, the logic deciding whether we should save a hint or not is here (3). That 
goes to (4), which looks at the timestamp of the oldest descriptor and takes 
the minimum of it and all the earliest hints in the buffer pools. The latter 
comes from here (5).
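
In pseudocode form, the check works roughly as below. This is a hypothetical sketch under the assumptions stated in the comments; the maps and method names stand in for HintsStore's oldest descriptor and HintsBufferPool's earliest-hint tracking, and do not mirror the real signatures.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class EarliestHintSketch {
    final Map<UUID, Long> oldestDescriptorTs = new HashMap<>(); // stands in for HintsStore
    final Map<UUID, Long> earliestBufferedTs = new HashMap<>(); // stands in for HintsBufferPool

    // earliest known hint for a host: min of on-disk and buffered timestamps
    long earliestHintFor(UUID host) {
        long onDisk = oldestDescriptorTs.getOrDefault(host, Long.MAX_VALUE);
        long buffered = earliestBufferedTs.getOrDefault(host, Long.MAX_VALUE);
        return Math.min(onDisk, buffered);
    }

    // shouldHint-style window check: only store a hint while the earliest
    // known hint for the host is still inside max_hint_window
    boolean withinWindow(UUID host, long nowMs, long maxHintWindowMs) {
        long earliest = earliestHintFor(host);
        return earliest == Long.MAX_VALUE || nowMs - earliest <= maxHintWindowMs;
    }
}
```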

However, we only ever remove that host's UUID from the earliest-hints map when 
we close the store. So if we do not close it, because it is not full and the 
node is up again, the map will still hold stale values from before we first 
dispatched. 

The solution is to also remove the earliest-hint record when we close the 
writer on dispatch, since that node is alive again - so we dispatch and we 
remove the earliest hint for it. Then, once the node goes offline again, a 
fresh earliest hint will be recorded for it. 
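
The bug and the proposed fix can be demonstrated end to end with a small sketch. This is an illustrative model, not the patched code: the map, the 3-hour default window, and the method names are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class StaleEarliestHintDemo {
    final Map<UUID, Long> earliestByHost = new HashMap<>();
    static final long MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000L; // 3h default

    // first hint for a host during an outage records the earliest timestamp
    void recordHint(UUID host, long ts) {
        earliestByHost.putIfAbsent(host, ts);
    }

    // hint is only stored while the earliest hint is inside the window
    boolean shouldHint(UUID host, long nowMs) {
        long earliest = earliestByHost.getOrDefault(host, nowMs);
        return nowMs - earliest <= MAX_HINT_WINDOW_MS;
    }

    // the proposed fix: forget the earliest hint when the writer is closed
    // because the node came back and its hints were dispatched
    void onDispatchClose(UUID host) {
        earliestByHost.remove(host);
    }
}
```

Without onDispatchClose, the timestamp from the first outage survives the dispatch, so a second outage more than max_hint_window later refuses to store hints even though nothing is actually pending.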

(1) 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hints/HintsDispatchTrigger.java#L66-L76]
(2) 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hints/HintsStore.java#L311-L320]
(3) 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageProxy.java#L2395-L2401]
(4) 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hints/HintsService.java#L452-L459]
(5) 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hints/HintsBufferPool.java#L74-L83]

> Hints not stored after node goes down for the second time
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-19495
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19495
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Hints
>            Reporter: Paul Chandler
>            Assignee: Brandon Williams
>            Priority: Normal
>             Fix For: 4.1.x, 5.0.x, 5.x
>
>
> I have a scenario where a node goes down, hints are recorded on the second 
> node and replayed, as expected. If the first node goes down for a second time 
> and the time span between the first time it stopped and the second time it 
> stopped is more than max_hint_window, then the hint is not recorded, no 
> hint file is created, and the mutation never arrives at the node after it 
> comes up again.
> I have debugged this, and it appears to be due to the way the hint window is 
> persisted after https://issues.apache.org/jira/browse/CASSANDRA-14309
> The code here: 
> [https://github.com/apache/cassandra/blame/cassandra-4.1/src/java/org/apache/cassandra/service/StorageProxy.java#L2402]
>  uses the time stored in the HintsBuffer.earliestHintByHost map. This map is 
> keyed by the UUID of the host, but it does not seem to be cleared when the 
> node comes back up, and I think this is what is causing the problem.
>  
> This is in Cassandra 4.1.5.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
