[
https://issues.apache.org/jira/browse/HADOOP-19254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902774#comment-17902774
]
ASF GitHub Bot commented on HADOOP-19254:
-----------------------------------------
mukund-thakur commented on code in PR #7197:
URL: https://github.com/apache/hadoop/pull/7197#discussion_r1868429206
##########
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestBulkDeleteCommand.java:
##########
@@ -0,0 +1,45 @@
+package org.apache.hadoop.fs.shell;
+
+import org.apache.hadoop.conf.Configuration;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.LinkedList;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+public class TestBulkDeleteCommand {
+ private static Configuration conf;
+
+ @BeforeClass
+ public static void setup() throws IOException {
+ conf = new Configuration();
+ }
+
+ @Test
+ public void testDefaults() throws IOException {
+ LinkedList<String> options = new LinkedList<>();
+ BulkDeleteCommand bulkDeleteCommand = new BulkDeleteCommand();
+ bulkDeleteCommand.processOptions(options);
+ assertTrue(bulkDeleteCommand.childArgs.isEmpty());
+ }
+
+ @Test
+ public void testArguments() throws IOException, URISyntaxException {
Review Comment:
Add one more test for just the filename argument where paths are present in
file.
And what happens if user gives both a filename and then base path and paths.
##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/BulkDeleteCommand.java:
##########
@@ -0,0 +1,84 @@
+package org.apache.hadoop.fs.shell;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.BulkDelete;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.util.ArrayList;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.stream.Collectors;
+
+public class BulkDeleteCommand extends FsCommand {
+ public static void registerCommands(CommandFactory factory) {
+ factory.addClass(BulkDeleteCommand.class, "-bulkDelete");
+ }
+
+ public static final String name = "bulkDelete";
+
+ public static final String READ_FROM_FILE = "readFromFile";
+
+ public static final String USAGE = "-[ " + READ_FROM_FILE + "] [<file>]
[<basePath> <paths>]";
+
+ public static final String DESCRIPTION = "Deletes the set of files under
the given path. If a list of paths " +
+ "is provided then the paths are deleted directly. User can also
point to the file where the paths are" +
+ "listed as full object names.";
+
+ private String fileName;
+
+ /*
+ Making the class stateful as the PathData initialization for all args is
not needed
+ */
+ LinkedList<String> childArgs;
+
+ protected BulkDeleteCommand() {
+ this.childArgs = new LinkedList<>();
+ }
+
+ protected BulkDeleteCommand(Configuration conf) {super(conf);}
+
+ @Override
+ protected void processOptions(LinkedList<String> args) throws IOException {
+ CommandFormat cf = new CommandFormat(0, Integer.MAX_VALUE);
+ cf.addOptionWithValue(READ_FROM_FILE);
+ cf.parse(args);
+ fileName = cf.getOptValue(READ_FROM_FILE);
+ }
+
+ @Override
+ protected LinkedList<PathData> expandArguments(LinkedList<String> args)
throws IOException {
+ if(fileName == null && args.size() < 2) {
+ throw new IOException("Invalid Number of Arguments. Expected
more");
+ }
+ LinkedList<PathData> pathData = new LinkedList<>();
+ pathData.add(new PathData(args.get(0), getConf()));
+ args.remove(0);
+ this.childArgs = args;
+ return pathData;
+ }
+
+ @Override
+ protected void processArguments(LinkedList<PathData> args) throws
IOException {
+ PathData basePath = args.get(0);
+ out.println("Deleting files under:" + basePath);
+ List<Path> pathList = new ArrayList<>();
+ if(fileName != null) {
+ FileSystem localFile = FileSystem.get(getConf());
+ BufferedReader br = new BufferedReader(new
InputStreamReader(localFile.open(new Path(fileName))));
+ String line;
+ while((line = br.readLine()) != null) {
+ pathList.add(new Path(line));
+ }
+ } else {
+
pathList.addAll(this.childArgs.stream().map(Path::new).collect(Collectors.toList()));
+ }
+ BulkDelete bulkDelete = basePath.fs.createBulkDelete(basePath.path);
+ bulkDelete.bulkDelete(pathList);
Review Comment:
yes add try catch for IOEexception handling. What happens if the number of
paths is more than pageSize? I think it will fail. Do we want to fail? We can
do a for loop to call bulk delete multiple times.
##########
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestBulkDeleteCommand.java:
##########
@@ -0,0 +1,45 @@
+package org.apache.hadoop.fs.shell;
+
+import org.apache.hadoop.conf.Configuration;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.io.IOException;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.LinkedList;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+public class TestBulkDeleteCommand {
Review Comment:
can we not add the actual tests to delete file?
##########
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestBulkDeleteCommand.java:
##########
@@ -0,0 +1,45 @@
+package org.apache.hadoop.fs.shell;
+
Review Comment:
add apache license.
> Implement bulk delete command as hadoop fs command operation
> -------------------------------------------------------------
>
> Key: HADOOP-19254
> URL: https://issues.apache.org/jira/browse/HADOOP-19254
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Affects Versions: 3.4.1
> Reporter: Mukund Thakur
> Assignee: Harshit Gupta
> Priority: Major
> Labels: pull-request-available
>
> {code}
> hadoop fs -bulkdelete <base-url> <file>
> {code}
> Key uses
> * QE: Testing from python and other scripting languages
> * cluster maintenance: actual bulk deletion operations from the store
> one thought there: we MUST qualify paths with / elements: if a passed in path
> ends in /, it means "delete a marker", not "delete a dir"'. and if it doesn't
> have one then it's an object.. This makes it possible to be used to delete
> surplus markers or where there is a file above another file...cloudstore
> listobjects finds this
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]