frankgh commented on code in PR #3374:
URL: https://github.com/apache/cassandra/pull/3374#discussion_r1683535668


##########
test/distributed/org/apache/cassandra/distributed/impl/Instance.java:
##########
@@ -632,7 +632,10 @@ public void startup(ICluster cluster)
                 {
                     partialStartup(cluster);
                 }
-                StorageService.instance.startSnapshotManager();
+                SnapshotManager.instance.start();

Review Comment:
   should we mirror what CassandraDaemon is doing here and start the snapshot 
manager right after the commit log starts? it should be 
`org.apache.cassandra.service.CassandraDaemon#setup` 



##########
src/java/org/apache/cassandra/service/StorageService.java:
##########
@@ -2969,77 +2958,49 @@ private Keyspace getValidKeyspace(String keyspaceName)
      * Remove the snapshot with the given name from the given keyspaces.
      * If no tag is specified we will remove all snapshots.
      */
-    public void clearSnapshot(String tag, String... keyspaceNames)
+    public void clearSnapshot(String tag, String... keyspaceNames) throws 
IOException
     {
         clearSnapshot(Collections.emptyMap(), tag, keyspaceNames);
     }
 
     public void clearSnapshot(Map<String, Object> options, String tag, 
String... keyspaceNames)
     {
-        if (tag == null)
-            tag = "";
-
         if (options == null)
             options = Collections.emptyMap();
 
-        Set<String> keyspaces = new HashSet<>();
-        for (String dataDir : DatabaseDescriptor.getAllDataFileLocations())
-        {
-            for (String keyspaceDir : new File(dataDir).tryListNames())
-            {
-                // Only add a ks if it has been specified as a param, assuming 
params were actually provided.
-                if (keyspaceNames.length > 0 && 
!Arrays.asList(keyspaceNames).contains(keyspaceDir))
-                    continue;
-                keyspaces.add(keyspaceDir);
-            }
-        }
+        Set<String> keyspaces = new HashSet<>(Arrays.asList(keyspaceNames));

Review Comment:
   I think you can do this now in java 11? Are we bringing this to 4.1 or 4.0 
and are trying to make it java 8 compatible?
   ```suggestion
           Set<String> keyspaces = Set.of(keyspaceNames);
   ```



##########
src/java/org/apache/cassandra/service/StorageService.java:
##########
@@ -2969,77 +2958,49 @@ private Keyspace getValidKeyspace(String keyspaceName)
      * Remove the snapshot with the given name from the given keyspaces.
      * If no tag is specified we will remove all snapshots.
      */
-    public void clearSnapshot(String tag, String... keyspaceNames)
+    public void clearSnapshot(String tag, String... keyspaceNames) throws 
IOException
     {
         clearSnapshot(Collections.emptyMap(), tag, keyspaceNames);
     }
 
     public void clearSnapshot(Map<String, Object> options, String tag, 
String... keyspaceNames)
     {
-        if (tag == null)
-            tag = "";
-
         if (options == null)
             options = Collections.emptyMap();
 
-        Set<String> keyspaces = new HashSet<>();
-        for (String dataDir : DatabaseDescriptor.getAllDataFileLocations())
-        {
-            for (String keyspaceDir : new File(dataDir).tryListNames())
-            {
-                // Only add a ks if it has been specified as a param, assuming 
params were actually provided.
-                if (keyspaceNames.length > 0 && 
!Arrays.asList(keyspaceNames).contains(keyspaceDir))
-                    continue;
-                keyspaces.add(keyspaceDir);
-            }
-        }
+        Set<String> keyspaces = new HashSet<>(Arrays.asList(keyspaceNames));
+        long maxCreatedAt = getMaxSnapshotCreatedAt(options);
+
+        SnapshotManager.instance.clearSnapshots(tag, keyspaces, maxCreatedAt);
 
+        if (logger.isDebugEnabled())
+            logger.debug("Cleared out snapshot directories");

Review Comment:
   maybe an opportunity to improve this log message:
   ```suggestion
               logger.debug("Cleared out snapshot directories tag={} 
keyspaces={} maxCreatedAt={}", tag, keyspaceNames, maxCreatedAt);
   ```



##########
src/java/org/apache/cassandra/service/StorageService.java:
##########
@@ -2969,77 +2958,49 @@ private Keyspace getValidKeyspace(String keyspaceName)
      * Remove the snapshot with the given name from the given keyspaces.
      * If no tag is specified we will remove all snapshots.
      */
-    public void clearSnapshot(String tag, String... keyspaceNames)
+    public void clearSnapshot(String tag, String... keyspaceNames) throws 
IOException
     {
         clearSnapshot(Collections.emptyMap(), tag, keyspaceNames);
     }
 
     public void clearSnapshot(Map<String, Object> options, String tag, 
String... keyspaceNames)
     {
-        if (tag == null)
-            tag = "";
-
         if (options == null)
             options = Collections.emptyMap();
 
-        Set<String> keyspaces = new HashSet<>();
-        for (String dataDir : DatabaseDescriptor.getAllDataFileLocations())
-        {
-            for (String keyspaceDir : new File(dataDir).tryListNames())
-            {
-                // Only add a ks if it has been specified as a param, assuming 
params were actually provided.
-                if (keyspaceNames.length > 0 && 
!Arrays.asList(keyspaceNames).contains(keyspaceDir))
-                    continue;
-                keyspaces.add(keyspaceDir);
-            }
-        }
+        Set<String> keyspaces = new HashSet<>(Arrays.asList(keyspaceNames));
+        long maxCreatedAt = getMaxSnapshotCreatedAt(options);
+
+        SnapshotManager.instance.clearSnapshots(tag, keyspaces, maxCreatedAt);
 
+        if (logger.isDebugEnabled())
+            logger.debug("Cleared out snapshot directories");
+    }
+
+    private static long getMaxSnapshotCreatedAt(Map<String, Object> options)
+    {
         Object olderThan = options.get("older_than");
         Object olderThanTimestamp = options.get("older_than_timestamp");
 
-        final long clearOlderThanTimestamp;
+        long maxCreatedAt = Clock.Global.currentTimeMillis();

Review Comment:
   This also changes the behavior. If no option is provided we are using `0` 
before. Now, we are using `Clock.Global.currentTimeMillis()`



##########
src/java/org/apache/cassandra/service/snapshot/TableSnapshot.java:
##########
@@ -138,11 +152,21 @@ public boolean isExpiring()
 
     public long computeSizeOnDiskBytes()
     {
-        return snapshotDirs.stream().mapToLong(FileUtils::folderSize).sum();
+        long sum = sizeOnDisk;
+        if (sum == 0)
+        {
+            for (File snapshotDir : snapshotDirs)
+                sizeOnDisk = sum += FileUtils.folderSize(snapshotDir);
+        }

Review Comment:
   I'm worried about races here. What if we have two concurrent requests to the 
`Table Snapshot` virtual table. 2 threads will see sum == 0 and compute the 
sizeOnDisk twice. Should `sizeOnDisk` potentially be volatile as well? Maybe a 
way to prevent two threads increasing this value independently is to refactor 
this code as this:
   
   ```suggestion
           long sum = sizeOnDisk;
           if (sum == 0)
           if (sum == 0)
           {
               for (File snapshotDir : snapshotDirs)
                   sum += FileUtils.folderSize(snapshotDir);
               sizeOnDisk = sum;
           }
   ```
   
   With the only potential side effect being that two threads do the same work. 



##########
test/unit/org/apache/cassandra/service/snapshot/TableSnapshotTest.java:
##########
@@ -298,7 +298,7 @@ public void testShouldClearSnapshot() throws Exception
                 // 1. snapshot to clear is not ephemeral
                 // 2. tag to clear is null, empty, or it is equal to snapshot 
tag
                 // 3. byTimestamp is true
-                if (TableSnapshot.shouldClearSnapshot(testingTag, 
olderThanTimestamp).test(snapshot))
+                if (SnapshotManager.shouldClearSnapshot(testingTag, 
Set.of(keyspace), olderThanTimestamp, false).test(snapshot))

Review Comment:
   this is another behavior change in the test and I'm not sure if it's 
desired. Should we avoid including the keyspace? The comments above seem to 
indicate we don't want to include the keyspace here
   ```suggestion
                   if (SnapshotManager.shouldClearSnapshot(testingTag, 
Set.of(), olderThanTimestamp, false).test(snapshot))
   ```



##########
src/java/org/apache/cassandra/db/Directories.java:
##########
@@ -1169,37 +1169,6 @@ private BiPredicate<File, FileType> getFilter(boolean 
includeForeignTables)
         }
     }
 
-    public Map<String, TableSnapshot> listSnapshots()

Review Comment:
   this looks like it's only used in test code



##########
src/java/org/apache/cassandra/service/snapshot/TableSnapshot.java:
##########
@@ -138,11 +152,21 @@ public boolean isExpiring()
 
     public long computeSizeOnDiskBytes()
     {
-        return snapshotDirs.stream().mapToLong(FileUtils::folderSize).sum();
+        long sum = sizeOnDisk;
+        if (sum == 0)
+        {
+            for (File snapshotDir : snapshotDirs)
+                sizeOnDisk = sum += FileUtils.folderSize(snapshotDir);
+        }

Review Comment:
   Another potential issue is that 1 thread starts updating the `sizeOnDisk`. 
The second threads sees a partial, but incorrect value of the sizeOnDisk



##########
test/unit/org/apache/cassandra/db/DirectoriesTest.java:
##########
@@ -350,40 +364,13 @@ public void testListSnapshots() throws Exception {
         snapshot1.snapshotDir.deleteRecursive();
 
         // Only snapshot 2 and 3 should be present
-        snapshots = directories.listSnapshots();
+        snapshots = listSnapshots(directories);
         assertThat(snapshots.keySet()).isEqualTo(Sets.newHashSet(SNAPSHOT2, 
SNAPSHOT3));
         
assertThat(snapshots.get(SNAPSHOT2)).isEqualTo(snapshot2.asTableSnapshot());
         
assertThat(snapshots.get(SNAPSHOT3)).isEqualTo(snapshot3.asTableSnapshot());
         assertThat(snapshots.get(SNAPSHOT3).isEphemeral()).isTrue();
     }
 
-    @Test
-    public void testListSnapshotDirsByTag() throws Exception {

Review Comment:
   do we lose this test?



##########
test/unit/org/apache/cassandra/db/DirectoriesTest.java:
##########
@@ -350,40 +364,13 @@ public void testListSnapshots() throws Exception {
         snapshot1.snapshotDir.deleteRecursive();
 
         // Only snapshot 2 and 3 should be present
-        snapshots = directories.listSnapshots();
+        snapshots = listSnapshots(directories);
         assertThat(snapshots.keySet()).isEqualTo(Sets.newHashSet(SNAPSHOT2, 
SNAPSHOT3));
         
assertThat(snapshots.get(SNAPSHOT2)).isEqualTo(snapshot2.asTableSnapshot());
         
assertThat(snapshots.get(SNAPSHOT3)).isEqualTo(snapshot3.asTableSnapshot());
         assertThat(snapshots.get(SNAPSHOT3).isEphemeral()).isTrue();
     }
 
-    @Test
-    public void testListSnapshotDirsByTag() throws Exception {

Review Comment:
   it looks like `listSnapshotDirsByTag()` was not used in production code, so 
makes sense to remove it



##########
src/java/org/apache/cassandra/service/snapshot/SnapshotWatcher.java:
##########
@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.service.snapshot;
+
+import java.io.IOException;
+import java.nio.file.ClosedWatchServiceException;
+import java.nio.file.FileSystems;
+import java.nio.file.Path;
+import java.nio.file.WatchEvent;
+import java.nio.file.WatchKey;
+import java.nio.file.WatchService;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import java.util.function.Consumer;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.cassandra.concurrent.ExecutorFactory;
+import org.apache.cassandra.io.util.File;
+import org.apache.cassandra.utils.ExecutorUtils;
+
+import static java.nio.file.StandardWatchEventKinds.ENTRY_DELETE;
+
+/**
+ * SnapshotWatcher watches snapshot directories. When a directory is removed 
from disk manually,
+ * it will clean respective snapshot from SnapshotManager. The result of doing 
so is that when somebody
+ * removes a snapshot via other means from e.g. nodetool clearsnapshot, such 
snapshot will not be present
+ * in SnapshotManager anymore hence nodetool listsnapshots nor 
system_views.snapshots will not display it either.
+ */
+public class SnapshotWatcher implements AutoCloseable
+{
+    private static final Logger logger = 
LoggerFactory.getLogger(SnapshotWatcher.class);
+
+    private final Consumer<Path> removedSnapshotConsumer;
+    private WatchService watchService;
+    private ExecutorService executor;
+    private final Map<WatchKey, Path> watchKeyPathMap = new 
ConcurrentHashMap<>();
+    private boolean started = false;
+
+    private Future<?> watcherFuture;
+
+    public SnapshotWatcher(Consumer<Path> removedSnapshotConsumer)
+    {
+        this.removedSnapshotConsumer = removedSnapshotConsumer;
+    }
+
+    public void watch(TableSnapshot snapshot)
+    {
+        if (!started)
+            return;
+
+        Collection<File> directories = snapshot.getDirectories();
+        if (directories == null)
+            return;
+
+        for (File snapshotDir : directories)
+        {
+            Path rootSnapshotsDir = snapshotDir.parent().toPath();
+            if (!watchKeyPathMap.containsValue(rootSnapshotsDir))

Review Comment:
   contains value is expensive, can we do something else rather?



##########
src/java/org/apache/cassandra/db/Directories.java:
##########
@@ -1228,100 +1197,10 @@ protected static SnapshotManifest 
maybeLoadManifest(String keyspace, String tabl
         return null;
     }
 
-    @VisibleForTesting
-    protected Map<String, Set<File>> listSnapshotDirsByTag()

Review Comment:
   also only used in test code from what I can tell



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to