[
https://issues.apache.org/jira/browse/CASSANDRA-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529680#comment-17529680
]
Paulo Motta edited comment on CASSANDRA-16843 at 4/28/22 11:08 PM:
-------------------------------------------------------------------
The reason why snapshots of "dropped tables" are omitted from the "nodetool
listsnapshots" output above is because the prior implementation relied on the
mechanics of
[ColumnFamilyStore|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L2240=]
and
[Directories|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Directories.java#L964=]
to list snapshots.
Since dropped tables no longer have an associated {{ColumnFamilyStore}} object,
it's not possible to list snapshots of dropped tables in the current
implementation.
[This patch|https://github.com/apache/cassandra/pull/1595] re-architects the
snapshot listing logic to be fully decoupled from
{{{}ColumnFamilyStore{}}}/{{{}Directories{}}} and rely solely on the snapshot
directory structure, which currently has this format:
* {{$data_dir/$ks_name/$table_name-$table_uuid/snapshots/$tag}}
The new snapshot discovery logic is mostly contained in the
[SnapshotLoader|https://github.com/apache/cassandra/blob/993190ada5b65b79c5b7ca707d436a6ceff7abcf/src/java/org/apache/cassandra/service/snapshot/SnapshotLoader.java]
class, which traverses the data directory [looking for snapshot directories
matching the pattern
above|https://github.com/apache/cassandra/blob/993190ada5b65b79c5b7ca707d436a6ceff7abcf/src/java/org/apache/cassandra/service/snapshot/SnapshotLoader.java#L102=].
I updated
[StorageService.getSnapshotDetails|https://github.com/apache/cassandra/pull/1595/files#diff-9bf2c26bc294ef9085e16bf287490223665eaa2eb8ec24bcf5bd8653c713644bR4131]
which is used by {{nodetool listsnapshots}} to use new {{SnapshotLoader}}
class to list snapshots.
The snapshot true size computation was previously dependent on logic from
[Directories|https://github.com/apache/cassandra/blob/bb3749f2bb8282f67375c67712d8e3ca1f085879/src/java/org/apache/cassandra/db/Directories.java#L1153],
so in order to fully decouple snapshot listing from {{Directories}}, I
[simplified the computation of snapshot true
size|https://github.com/apache/cassandra/blob/993190ada5b65b79c5b7ca707d436a6ceff7abcf/src/java/org/apache/cassandra/service/snapshot/TableSnapshot.java#L282]
to only include files which do not have a corresponding "live" file on
{{{}$data_dir/$ks_name/$table_name-$table_uuid{}}}.
This simplification to the snapshot true size computation fixed two additional
issues with the previous implementation (illustrated with examples in the
previous comment):
1) Snapshot true size did not include "schema.cql" and "manifest.json" sizes
2) Snapshot true size did not include secondary indexes (CASSANDRA-17357)
I performed other simplifications and refactorings along the way, but given the
proximity to the 4.1 freeze, I prepared a leaner version of the original patch
to facilitate review.
After this is merged I will prepare another set of follow-up patches (for next
release) with refactorings and simplifications in the snapshot management
module that will be enabled by this change.
Testing:
- [dtest to check if snapshot of dropped tables are included on
listsnapshots|https://github.com/apache/cassandra/blob/993190ada5b65b79c5b7ca707d436a6ceff7abcf/test/distributed/org/apache/cassandra/distributed/test/SnapshotsTest.java#L195=]
-
[SnapshotLoaderTest|https://github.com/apache/cassandra/blob/993190ada5b65b79c5b7ca707d436a6ceff7abcf/test/unit/org/apache/cassandra/service/snapshot/SnapshotLoaderTest.java]
- [Test to check that manifest and schema file sizes are included in true size
computation|https://github.com/apache/cassandra/pull/1595/files#diff-ef5be0b69d0440b76021282c4b24bad69770ef9419be260df2169f49921db377R291]
- [Update DirectoriesTest.testSecondaryIndexDirectories to include 2i on true
size
computation|https://github.com/apache/cassandra/pull/1595/files#diff-1948a455b59a97d8d1ab3d2cb5388190c1cbb8e8081e3ac97bfc0c51a7ef64e3R421]
- [testGetLiveFileFromSnapshotFile (used by new true size
computation)|https://github.com/apache/cassandra/pull/1595/files#diff-d349fb289ec10bece5531f1630cd2bcc55665b5cf3cd59cfcfb4dc93f288a571R233]
was (Author: paulo):
The reason why snapshots of "dropped tables" are omitted from the "nodetool
listsnapshots" output above is because the prior implementation relied on the
mechanics of
[ColumnFamilyStore|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L2240=]
and
[Directories|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Directories.java#L964=]
to list snapshots.
Since dropped tables no longer have an associated {{ColumnFamilyStore}} object,
it's not possible to list snapshots of dropped tables in the current
implementation.
[This patch|https://github.com/apache/cassandra/pull/1595] re-architects the
snapshot listing logic to be fully decoupled from
{{{}ColumnFamilyStore{}}}/{{{}Directories{}}} and rely solely on the snapshot
directory structure, which currently has this format:
* {{$data_dir/$ks_name/$table_name-$table_uuid/snapshots/$tag}}
The new snapshot discovery logic is mostly contained in the
[SnapshotLoader|https://github.com/apache/cassandra/blob/993190ada5b65b79c5b7ca707d436a6ceff7abcf/src/java/org/apache/cassandra/service/snapshot/SnapshotLoader.java]
class, which traverses the data directory [looking for snapshot directories
matching the pattern
above|https://github.com/apache/cassandra/blob/993190ada5b65b79c5b7ca707d436a6ceff7abcf/src/java/org/apache/cassandra/service/snapshot/SnapshotLoader.java#L102=].
I updated
[StorageService.getSnapshotDetails|https://github.com/apache/cassandra/pull/1595/files#diff-9bf2c26bc294ef9085e16bf287490223665eaa2eb8ec24bcf5bd8653c713644bR4131]
which is used by {{nodetool listsnapshots}} to use new {{SnapshotLoader}}
class to load snapshots.
The snapshot true size computation was previously dependent on logic from
[Directories|https://github.com/apache/cassandra/blob/bb3749f2bb8282f67375c67712d8e3ca1f085879/src/java/org/apache/cassandra/db/Directories.java#L1153],
so in order to fully decouple snapshot listing from {{Directories}}, I
[simplified the computation of snapshot true
size|https://github.com/apache/cassandra/blob/993190ada5b65b79c5b7ca707d436a6ceff7abcf/src/java/org/apache/cassandra/service/snapshot/TableSnapshot.java#L282]
to only include files which do not have a corresponding "live" file on
{{{}$data_dir/$ks_name/$table_name-$table_uuid{}}}.
This simplification to the snapshot true size computation fixed two additional
issues with the previous implementation (illustrated with examples in the
previous comment):
1) Snapshot true size did not include "schema.cql" and "manifest.json" sizes
2) Snapshot true size did not include secondary indexes (CASSANDRA-17357)
I performed other simplifications and refactorings along the way, but given the
proximity to the 4.1 freeze, I prepared a leaner version of the original patch
to facilitate review.
After this is merged I will prepare another set of follow-up patches (for next
release) with refactorings and simplifications in the snapshot management
module that will be enabled by this change.
Testing:
- [dtest to check if snapshot of dropped tables are included on
listsnapshots|https://github.com/apache/cassandra/pull/1595/files#diff-35dcc7dbb180da51d4f548e79f31ba45fb7beb7dbeec27663053817619efff1bR195]
-
[SnapshotLoaderTest|https://github.com/apache/cassandra/blob/993190ada5b65b79c5b7ca707d436a6ceff7abcf/test/unit/org/apache/cassandra/service/snapshot/SnapshotLoaderTest.java]
- [Test to check that manifest and schema file sizes are included in true size
computation|https://github.com/apache/cassandra/pull/1595/files#diff-ef5be0b69d0440b76021282c4b24bad69770ef9419be260df2169f49921db377R291]
- [Update DirectoriesTest.testSecondaryIndexDirectories to include 2i on true
size
computation|https://github.com/apache/cassandra/pull/1595/files#diff-1948a455b59a97d8d1ab3d2cb5388190c1cbb8e8081e3ac97bfc0c51a7ef64e3R421]
- [testGetLiveFileFromSnapshotFile (used by new true size
computation)|https://github.com/apache/cassandra/pull/1595/files#diff-d349fb289ec10bece5531f1630cd2bcc55665b5cf3cd59cfcfb4dc93f288a571R233]
> List snapshots of dropped tables
> --------------------------------
>
> Key: CASSANDRA-16843
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16843
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Snapshots
> Reporter: James Brown
> Assignee: Paulo Motta
> Priority: Normal
> Fix For: 4.1
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Auto snapshots from dropped tables don't seem to show up in {{nodetool
> listsnapshots}} (even though they do get cleared by {{nodetool
> clearsnapshot}}). This makes them kind of annoying to clean up, since you
> need to muck about in the data directory to find them.
> Erick on the mailing list said that this seems to be an oversight and that
> clearsnapshot was fixed by
> [CASSANDRA-6418|https://issues.apache.org/jira/browse/CASSANDRA-6418].
> I reproduced this both on 3.11.11 and 4.0.0.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]