smengcl commented on a change in pull request #2352:
URL: https://github.com/apache/hadoop/pull/2352#discussion_r497720649



##########
File path: 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
##########
@@ -2094,6 +2103,41 @@ public Void next(final FileSystem fs, final Path p)
     }.resolve(this, absF);
   }
 
+  /**
+   * Helper function to check if a trash root exists in the given directory,
+   * remove the trash root if it is empty, or throw an IOException if it is not.
+   * @param p Path to a directory.
+   */
+  private void checkTrashRootAndRemoveIfEmpty(final Path p) throws IOException {
+    Path trashRoot = new Path(p, FileSystem.TRASH_PREFIX);
+    try {
+      // listStatus has 4 possible outcomes here:
+      // 1) throws FileNotFoundException: the trash root doesn't exist.
+      // 2) returns empty array: the trash path is an empty directory.
+      // 3) returns non-empty array, len >= 2: the trash root is not empty.
+      // 4) returns non-empty array, len == 1:
+      //    i) if the element's path is exactly p, the trash path is not a dir.
+      //       e.g. a file named .Trash. Ignore.
+      //   ii) if the element's path isn't p, the trash root is not empty.
+      FileStatus[] fileStatuses = listStatus(trashRoot);
+      if (fileStatuses.length == 0) {
+        DFSClient.LOG.debug("Removing empty trash root {}", trashRoot);
+        delete(trashRoot, false);
+      } else {
+        if (fileStatuses.length == 1
+            && !fileStatuses[0].isDirectory()
+            && !fileStatuses[0].getPath().equals(p)) {
+          // Ignore the trash path because it is not a directory.
+          DFSClient.LOG.warn("{} is not a directory.", trashRoot);

Review comment:
       I get your point, but I don't think it is worth preventing the user from doing this. At best, we can emit a client-side warning when the user attempts it.
   
   There are several ways to circumvent this if the user really wants to: for example, the user could create the `.Trash` file before allowing snapshots, or rename a `.Trash` file in from another location.
   
   Even if we place restrictions in a newer version of the HDFS NameNode, users might have already created the `.Trash` file before the NN upgrade.
   
   Also, regular user trash faces the same issue:
   ```
   $ hdfs dfs -touch hdfs://127.0.0.1:9999/user/smeng/.Trash
   $ hdfs dfs -touch hdfs://127.0.0.1:9999/file3
   $ hdfs dfs -rm hdfs://127.0.0.1:9999/file3
   2020-09-30 11:27:43,062 WARN fs.TrashPolicyDefault: Can't create trash directory: hdfs://127.0.0.1:9999/user/smeng/.Trash/Current
   org.apache.hadoop.fs.ParentNotDirectoryException: /user/smeng/.Trash (is not a directory)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkIsDirectory(FSPermissionChecker.java:743)
   ...
   rm: Failed to move to trash: hdfs://127.0.0.1:9999/file3: /user/smeng/.Trash (is not a directory)
   ```
   
   Another point: trash is mostly a client-side feature, so the client should retain some freedom to manage it.
   
   It is a bit ironic for me to say this, because I myself have made so many changes that intervene in trash usage :D.
   At least creating this `.Trash` file shouldn't cause harm. It just fails, gloriously, when `.Trash` is a file and someone tries to move something to that trash.
   Or maybe an admin would even do this intentionally, to prevent users from using trash inside that specific snapshot directory?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
