[ 
https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17829886#comment-17829886
 ] 

Tiago L. Alves commented on CASSANDRA-19448:
--------------------------------------------

[~maxwellguo] I've looked into the code, and it seems possible to implement 
both millisecond- and microsecond-level PIT restore. Supporting milliseconds 
requires changes only in `CommitLogArchiver`, while supporting microseconds 
requires additional changes in `CommitLogReplayer`.

Both `restorePointInTime` in `CommitLogArchiver` and `restoreTarget` in 
`CommitLogReplayer` are assumed to be long values in milliseconds. 
Millisecond-level PIT restore can therefore be achieved just by recognizing 
the milliseconds part specified in the `restore_point_in_time` configuration. 
The existing code does not fail to parse a value that includes milliseconds, 
but it ignores them.
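
To illustrate the parsing behaviour described above, here is a minimal Java 
sketch. It is not the actual `CommitLogArchiver` code: the class 
`PitParseSketch`, the helper `parseUtc`, the `.SSS` pattern and the UTC time 
zone are assumptions made for the example, and the seconds-only pattern is 
inferred from the example value in the issue description.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class PitParseSketch {
    // Hypothetical helper: parse `text` with `pattern` in UTC and return epoch milliseconds.
    static long parseUtc(String pattern, String text) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            // DateFormat.parse(String) may stop before the end of the input,
            // so trailing text that the pattern does not cover is not an error.
            return format.parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable timestamp: " + text, e);
        }
    }

    public static void main(String[] args) {
        String target = "2024:01:18 17:01:01.623";

        // Seconds-only pattern: ".623" is silently left unparsed.
        long secondsOnly = parseUtc("yyyy:MM:dd HH:mm:ss", target);

        // Pattern that also recognizes a three-digit milliseconds part.
        long withMillis = parseUtc("yyyy:MM:dd HH:mm:ss.SSS", target);

        System.out.println("seconds-only pattern: " + secondsOnly);
        System.out.println("millisecond pattern:  " + withMillis);
        System.out.println("difference in ms:     " + (withMillis - secondsOnly));
    }
}
```

The two results differ by exactly the milliseconds part, which is the window 
of updates a seconds-only format drops from the restore.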

To support microsecond granularity, we need further changes in 
`CommitLogReplayer` so that it accepts microseconds instead of forcing the 
comparison to be in milliseconds.
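
The following sketch shows why a millisecond comparison cannot honour a 
microsecond restore target. It is an illustration only, not the 
`CommitLogReplayer` implementation: the class `PitCompareSketch` and the 
methods `exceededMillis` and `exceededMicros` are hypothetical names, and the 
mutation timestamp is assumed to be in microseconds.

```java
import java.util.concurrent.TimeUnit;

public class PitCompareSketch {
    // Millisecond comparison: the mutation timestamp is reduced to milliseconds first.
    static boolean exceededMillis(long mutationMicros, long targetMillis) {
        return TimeUnit.MICROSECONDS.toMillis(mutationMicros) > targetMillis;
    }

    // Microsecond comparison: both sides keep their full precision.
    static boolean exceededMicros(long mutationMicros, long targetMicros) {
        return mutationMicros > targetMicros;
    }

    public static void main(String[] args) {
        // 2024-01-18 17:01:01 UTC expressed in microseconds since the epoch.
        long baseMicros = 1_705_597_261_000_000L;

        long targetMicros = baseMicros + 623_392L;   // restore target ...01.623392
        long mutationMicros = baseMicros + 623_500L; // mutation written 108 microseconds later

        long targetMillis = TimeUnit.MICROSECONDS.toMillis(targetMicros);

        // Both values fall in the same millisecond, so the mutation is not seen as past the target.
        System.out.println("millis comparison exceeded: " + exceededMillis(mutationMicros, targetMillis));

        // With full precision the mutation is correctly seen as past the target.
        System.out.println("micros comparison exceeded: " + exceededMicros(mutationMicros, targetMicros));
    }
}
```

Under these assumptions the millisecond comparison would replay a mutation 
that was written after the requested restore point.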

I have a patch that adds support for millisecond-level PIT restore and issues 
warnings on attempts to specify microsecond granularity: 
https://github.com/apache/cassandra/pull/3200

> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19448
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log
>            Reporter: Jeremy Hanna
>            Assignee: Maxwell Guo
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Commitlog archiver allows users to backup commitlog files for the purpose of 
> doing point in time restores.  The [configuration 
> file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
>  gives an example with granularity down to the second but then asks 
> whether the timestamps are microseconds or milliseconds - defaulting to 
> microseconds.  Because the [CommitLogArchiver uses a second based date 
> format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
>  if a user specifies a restore point at a finer granularity like 
> milliseconds or microseconds, that means that it will truncate everything 
> after the second and restore to that second.  So say you specify a 
> restore_point_in_time like this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> it will silently truncate everything after the 01 seconds.  So effectively to 
> the user, it is missing updates between 01 and 01.623392.
> This appears to be a bug in the intent.  We should allow users to specify 
> down to the millisecond or even microsecond level. If we allow them to 
> specify down to microseconds for the restore point in time, then it may 
> internally need to change from a long.



