[
https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874309#comment-17874309
]
Maxwell Guo commented on CASSANDRA-19448:
-----------------------------------------
Hi [~brandon.williams], I think I found the reason for these
[failures|https://app.circleci.com/pipelines/github/driftx/cassandra/1704/workflows/bd8b0614-0b2a-4231-aca7-22688e0a06b8/jobs/97629/tests].
I pulled a branch, made a simple modification [here by adding a
sleep|https://github.com/Maxwell-Guo/cassandra/blob/CASSANDRA-19448-test-repeat/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L47],
and ran a repeat test according to your CI configuration, and it was
[successful|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/631/workflows/64d0372c-48e5-448c-99ae-87e326e5f09e].
||Branch||PR||
| trunk |[trunk|https://github.com/apache/cassandra/pull/3215/files]|
|5.0|[5.0|https://github.com/apache/cassandra/pull/3236/]|
|4.1|[4.1|https://github.com/apache/cassandra/pull/3237/files]|
|4.0|[4.0|https://github.com/apache/cassandra/pull/3238/files]|
The cause of the problem is that I changed the recovery point-in-time
granularity to
[microseconds|https://github.com/Maxwell-Guo/cassandra/blob/CASSANDRA-19448/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L81]
(the original test case works at the
[millisecond|https://github.com/Maxwell-Guo/cassandra/blob/trunk/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L48]
level).
The three actions in the test, [two INSERTs followed by reading the current
millisecond timestamp
|https://github.com/Maxwell-Guo/cassandra/blob/trunk/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L44-L48],
can all happen within the same millisecond. When that happens there is a
problem: C* write timestamps are in microseconds, so the timestamp of the second
[INSERT|https://github.com/Maxwell-Guo/cassandra/blob/trunk/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L45]
will be 1 greater than both the timestamp of the first
[INSERT|https://github.com/Maxwell-Guo/cassandra/blob/trunk/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L44]
and the [current millisecond timestamp multiplied by
1000|https://github.com/Maxwell-Guo/cassandra/blob/CASSANDRA-19448/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L81].
The second INSERT then falls just after the microsecond recovery point, so the
recovery fails. With the original millisecond recovery granularity this problem
would not exist.
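To make the collision concrete, here is a minimal Java sketch of the arithmetic. It is not the test or the replay code itself: the class and method names are hypothetical, the wall-clock value is arbitrary, and it assumes that a mutation is replayed only when its timestamp does not exceed the restore point.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not code from Cassandra or from the test.
class RestoreGranularityDemo {
    // Assumption: a write is replayed only if its timestamp is <= the restore point.
    static boolean replayedAtMicros(long writeMicros, long restorePointMicros) {
        return writeMicros <= restorePointMicros;
    }

    // Same check, but with the write timestamp reduced to millisecond precision first.
    static boolean replayedAtMillis(long writeMicros, long restorePointMillis) {
        return TimeUnit.MICROSECONDS.toMillis(writeMicros) <= restorePointMillis;
    }

    public static void main(String[] args) {
        long nowMillis = 1_705_597_261_623L;             // arbitrary fixed wall-clock value
        long firstInsertMicros = nowMillis * 1000;       // first INSERT in this millisecond
        long secondInsertMicros = firstInsertMicros + 1; // second INSERT, bumped by 1 microsecond
        long restorePointMicros = nowMillis * 1000;      // millisecond timestamp multiplied by 1000

        System.out.println(replayedAtMicros(firstInsertMicros, restorePointMicros));  // true
        System.out.println(replayedAtMicros(secondInsertMicros, restorePointMicros)); // false
        System.out.println(replayedAtMillis(secondInsertMicros, nowMillis));          // true
    }
}
```

Under these assumptions, only the microsecond comparison drops the second INSERT, which matches the failure seen in the repeated runs; a sleep between the INSERTs and the timestamp read moves the restore point past both writes.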
[~blambov] I see that the
[DropRecreateAndRestoreTest|https://github.com/Maxwell-Guo/cassandra/blob/CASSANDRA-19448/test/unit/org/apache/cassandra/cql3/validation/operations/DropRecreateAndRestoreTest.java#L52]
test case was written by you. Would you mind if I add a sleep here?
> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-19448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Commit Log
> Reporter: Jeremy Hanna
> Assignee: Maxwell Guo
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Commitlog archiver allows users to backup commitlog files for the purpose of
> doing point in time restores. The [configuration
> file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
> gives an example with granularity down to the second but then asks
> whether the timestamps are microseconds or milliseconds - defaulting to
> microseconds. Because the [CommitLogArchiver uses a second based date
> format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
> if a user specifies a restore point at a finer granularity like
> milliseconds or microseconds, it will truncate everything
> after the second and restore to that second. So say you specify a
> restore_point_in_time like this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> it will silently truncate everything after the 01 seconds. So effectively to
> the user, it is missing updates between 01 and 01.623392.
> This appears to be a bug in the intent. We should allow users to specify
> down to the millisecond or even microsecond level. If we allow them to
> specify down to microseconds for the restore point in time, then it may
> internally need to change from a long.
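The truncation described in the issue can be reproduced with a small Java sketch. It assumes a seconds-based pattern of `yyyy:MM:dd HH:mm:ss` (inferred from the example value above) and GMT as the time zone; the class and method names are hypothetical and this is not the CommitLogArchiver code itself.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Hypothetical illustration only; not the CommitLogArchiver implementation.
class RestorePointTruncationDemo {
    // Assumption: the restore point is parsed with a seconds-based pattern.
    static long parseToMillis(String restorePoint) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy:MM:dd HH:mm:ss");
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        try {
            // parse(String) stops once the pattern is satisfied and ignores trailing text.
            return format.parse(restorePoint).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable restore point: " + restorePoint, e);
        }
    }

    public static void main(String[] args) {
        long withFraction = parseToMillis("2024:01:18 17:01:01.623392");
        long wholeSecond = parseToMillis("2024:01:18 17:01:01");

        // Both strings resolve to the same instant: the fraction is silently dropped.
        System.out.println(withFraction == wholeSecond);
        System.out.println(withFraction % 1000);
    }
}
```

Because `SimpleDateFormat.parse(String)` does not require the whole input to be consumed, no error is raised for the fractional part, which is why the loss of precision goes unnoticed by the user.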