Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-05-21 Thread via GitHub


steveloughran closed pull request #6494: HADOOP-18679. Add API for bulk/paged 
object deletion
URL: https://github.com/apache/hadoop/pull/6494





Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-05-21 Thread via GitHub


steveloughran closed pull request #5993: HADOOP-18679. Add API for bulk/paged 
object deletion
URL: https://github.com/apache/hadoop/pull/5993





Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-05-21 Thread via GitHub


steveloughran closed pull request #6738: HADOOP-18679. Add API for bulk/paged 
object deletion
URL: https://github.com/apache/hadoop/pull/6738





Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-05-13 Thread via GitHub


steveloughran commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2108677588

   Mukund, if you can do those naming changes then I'm +1.





Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-05-09 Thread via GitHub


hadoop-yetus commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2102481437

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 07s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m 01s |  |  xmllint was not available.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  markdownlint  |   0m 01s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m 01s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to 
include 11 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 42s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  | 107m 34s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  48m 30s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   7m 10s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   5m 19s | 
[/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/5/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |  23m 51s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 225m 30s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  | 228m 42s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 57s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  19m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  45m 43s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  45m 43s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 01s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   7m 25s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   5m 28s | 
[/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/5/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |  24m 00s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 232m 12s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   7m 14s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 700m 51s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6726 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle 
markdownlint |
   | uname | MINGW64_NT-10.0-17763 296d6abd6fb2 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / e37d88f764665c8530097bbed890a5935a5fd1f0 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/5/testReport/
 |
   | modules | C: hadoop-common-project/hadoop-common 
hadoop-hdfs-project/hadoop-hdfs hadoop-tools/hadoop-aws 
hadoop-tools/hadoop-azure U: . |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/5/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-05-07 Thread via GitHub


steveloughran commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1592834729


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/wrappedio/WrappedIO.java:
##
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.io.wrappedio;
+
+import java.io.IOException;
+import java.util.Collection;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.classification.InterfaceStability;
+import org.apache.hadoop.fs.BulkDelete;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
+/**
+ * Reflection-friendly access to APIs which are not available in
+ * some of the older Hadoop versions which libraries still
+ * compile against.
+ * 
+ * The intent is to avoid the need for complex reflection operations
+ * including wrapping of parameter classes, direct instantiation of
+ * new classes, etc.
+ */
+@InterfaceAudience.Public
+@InterfaceStability.Evolving
+public final class WrappedIO {
+
+  private WrappedIO() {
+  }
+
+  /**
+   * Get the maximum number of objects/files to delete in a single request.
+   * @param fs filesystem
+   * @param path path to delete under.
+   * @return a number greater than or equal to zero.
+   * @throws UnsupportedOperationException bulk delete under that path is not 
supported.
+   * @throws IllegalArgumentException path not valid.
+   * @throws IOException problems resolving paths
+   */
+  public static int bulkDeletePageSize(FileSystem fs, Path path) throws 
IOException {
+try (BulkDelete bulk = fs.createBulkDelete(path)) {
+  return bulk.pageSize();
+}
+  }
+
+  /**
+   * Delete a list of files/objects.
+   * <ul>
+   *   <li>Files must be under the path provided in {@code base}.</li>
+   *   <li>The size of the list must be equal to or less than the page size.</li>
+   *   <li>Directories are not supported; the outcome of attempting to delete
+   *   directories is undefined (ignored; undetected, listed as failures...).</li>
+   *   <li>The operation is not atomic.</li>
+   *   <li>The operation is treated as idempotent: network failures may
+   *    trigger resubmission of the request -any new objects created under a
+   *    path in the list may then be deleted.</li>
+   *   <li>There is no guarantee that any parent directories exist after this call.</li>
+   * </ul>
+   *
+   * @param fs filesystem
+   * @param base path to delete under.
+   * @param paths list of paths which must be absolute and under the base path.
+   * @return a list of all the paths which couldn't be deleted for a reason 
other than "not found" and any associated error message.
+   * @throws UnsupportedOperationException bulk delete under that path is not 
supported.
+   * @throws IOException IO problems including networking, authentication and 
more.
+   * @throws IllegalArgumentException if a path argument is invalid.
+   */
+  public static List<Map.Entry<Path, String>> bulkDelete(FileSystem fs,

Review Comment:
   rename to bulkDelete_delete
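   For reference, a minimal sketch of how a downstream library might drive these static wrapper methods (the caller class and paging loop are illustrative only; method names follow the excerpt above, before the rename suggested here):

   ```java
   import java.util.List;
   import java.util.Map;

   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   import org.apache.hadoop.io.wrappedio.WrappedIO;

   // Hypothetical caller: pages a list of paths through the wrapper so each
   // request stays within the store's bulk delete page size.
   public class WrappedIOCallerSketch {
     public static void deleteAll(FileSystem fs, Path base, List<Path> paths) throws Exception {
       int pageSize = WrappedIO.bulkDeletePageSize(fs, base);   // always >= 1
       for (int i = 0; i < paths.size(); i += pageSize) {
         List<Path> page = paths.subList(i, Math.min(i + pageSize, paths.size()));
         // each returned entry is (path, error text) for a path which could not be deleted
         List<Map.Entry<Path, String>> failures = WrappedIO.bulkDelete(fs, base, page);
         failures.forEach(f -> System.err.println(f.getKey() + ": " + f.getValue()));
       }
     }
   }
   ```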



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/wrappedio/WrappedIO.java:
##
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.io.wrappedio;
+
+import java.io.IOException;
+import java.util.Collection;
+import java.util.List;
+import java.util.Map;
+
+import 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-05-06 Thread via GitHub


mukund-thakur commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2097047455

   > can you do the same here? some style checker will complain but it will 
help us to separate the methods in the new class.
   
   I don't understand what to do here. 
   





Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-05-02 Thread via GitHub


steveloughran commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1588274603


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java:
##
@@ -4980,4 +4982,17 @@ public MultipartUploaderBuilder 
createMultipartUploader(Path basePath)
 methodNotSupported();
 return null;
   }
+
+  /**
+   * Create a default bulk delete operation to be used for any FileSystem.

Review Comment:
   This doesn't hold for the subclasses. Better to say:
   ```
   Create a bulk delete operation.
   The default implementation returns an instance of {@link DefaultBulkDeleteOperation}.
   ```
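   Assuming the behavior that suggested wording describes, the base-class method could look roughly like the sketch below (shape assumed for illustration; the actual change is the one in this diff):

   ```java
   import java.io.IOException;

   import org.apache.hadoop.fs.BulkDelete;
   import org.apache.hadoop.fs.DefaultBulkDeleteOperation;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;

   // Sketch of a default createBulkDelete(): hand back the single-path default
   // operation bound to the filesystem; subclasses (e.g. S3A) override it with
   // an implementation supporting a larger page size.
   final class CreateBulkDeleteSketch {
     static BulkDelete createBulkDelete(FileSystem fs, Path path) throws IOException {
       return new DefaultBulkDeleteOperation(path, fs);
     }
   }
   ```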
   






Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-30 Thread via GitHub


steveloughran commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1584787287


##
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/bulkdelete.md:
##
@@ -0,0 +1,136 @@
+
+
+#  interface `BulkDelete`
+
+
+
+The `BulkDelete` interface provides an API to perform bulk delete of 
files/objects
+in an object store or filesystem.
+
+## Key Features
+
+* An API for submitting a list of paths to delete.
+* This list must be no larger than the "page size" supported by the client; this size is also exposed as a method.
+* Triggers a request to delete files at the specific paths.
+* Returns a list of which paths were reported as delete failures by the store.
+* Does not consider a nonexistent file to be a failure.
+* Does not offer any atomicity guarantees.
+* Idempotency guarantees are weak: retries may delete files newly created by 
other clients.
+* Provides no guarantees as to the outcome if a path references a directory.
+* Provides no guarantees that parent directories will exist after the call.
+
+
+The API is designed to match the semantics of the AWS S3 [Bulk 
Delete](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) 
REST API call, but it is not
+exclusively restricted to this store. This is why the "provides no guarantees"
+restrictions do not state what the outcome will be when executed on other 
stores.
+
+### Interface `org.apache.hadoop.fs.BulkDeleteSource`
+
+The interface `BulkDeleteSource` is offered by a FileSystem/FileContext class 
if
+it supports the API.
+
+```java
+@InterfaceAudience.Public
+@InterfaceStability.Unstable
+public interface BulkDeleteSource {
+  default BulkDelete createBulkDelete(Path path)
+  throws UnsupportedOperationException, IllegalArgumentException, 
IOException;
+
+}
+
+```
+
+### Interface `org.apache.hadoop.fs.BulkDelete`
+
+This is the bulk delete implementation returned by the `createBulkDelete()` 
call.
+
+```java
+@InterfaceAudience.Public
+@InterfaceStability.Unstable
+public interface BulkDelete extends IOStatisticsSource, Closeable {
+  int pageSize();
+  Path basePath();
+  List<Map.Entry<Path, String>> bulkDelete(List<Path> paths)
+  throws IOException, IllegalArgumentException;
+
+}
+
+```
+
+### `bulkDelete(paths)`
+
+ Preconditions
+
+```python
+if length(paths) > pageSize: throw IllegalArgumentException
+```
+
+ Postconditions
+
+All paths which refer to files are removed from the set of files.
+```python
+FS'Files = FS.Files - [paths]
+```
+
+No other restrictions are placed upon the outcome.
+
+
+### Availability
+
+The `BulkDeleteSource` interface is exported by `FileSystem` and `FileContext` 
storage clients
+which is available for all FS via 
`org.apache.hadoop.fs.DefalutBulkDeleteSource`. For the
+ICEBERG integration to work seamlessly, all FS which supports delete() MUST 
leave the

Review Comment:
   say "for integration in applications like Apache Iceberg", all 
implementations of this interface MUST NOT reject the request but instead 
return a BulkDelete instance of size >= 1"
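   To make the calling pattern concrete, here is a small, hedged usage sketch of the interfaces quoted above (helper class name and error handling are illustrative; it assumes a FileSystem exposing createBulkDelete()):

   ```java
   import java.util.List;
   import java.util.Map;

   import org.apache.hadoop.fs.BulkDelete;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;

   // Drives the BulkDelete API as described in the spec: query the page size,
   // submit at most pageSize paths per call, and inspect the returned failures.
   public final class BulkDeleteUsageSketch {
     public static void deleteUnder(FileSystem fs, Path base, List<Path> paths) throws Exception {
       try (BulkDelete bulk = fs.createBulkDelete(base)) {   // Closeable: releases any resources
         int pageSize = bulk.pageSize();                     // 1 for the default implementation
         for (int i = 0; i < paths.size(); i += pageSize) {
           List<Path> page = paths.subList(i, Math.min(i + pageSize, paths.size()));
           for (Map.Entry<Path, String> failure : bulk.bulkDelete(page)) {
             System.err.println("Could not delete " + failure.getKey() + ": " + failure.getValue());
           }
         }
       }
     }
   }
   ```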



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DefaultBulkDeleteOperation.java:
##
@@ -0,0 +1,109 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.Map;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.util.functional.Tuples;
+
+import static java.util.Objects.requireNonNull;
+import static org.apache.hadoop.fs.BulkDeleteUtils.validateBulkDeletePaths;
+
+/**
+ * Default implementation of the {@link BulkDelete} interface.
+ */
+public class DefaultBulkDeleteOperation implements BulkDelete {
+
+private static Logger LOG = 
LoggerFactory.getLogger(DefaultBulkDeleteOperation.class);
+
+/** Default page size for bulk delete. */
+private static final int DEFAULT_PAGE_SIZE = 1;
+
+/** Base path for the bulk delete operation. */
+private final Path basePath;
+
+/** Delegate File system make actual delete calls. */
+private final 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-29 Thread via GitHub


mukund-thakur commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1583846245


##
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractBulkDelete.java:
##
@@ -85,6 +88,9 @@ public ITestS3AContractBulkDelete(boolean 
enableMultiObjectDelete) {
 protected Configuration createConfiguration() {
 Configuration conf = super.createConfiguration();
 S3ATestUtils.disableFilesystemCaching(conf);
+conf = propagateBucketOptions(conf, getTestBucketName(conf));
+skipIfNotEnabled(conf, Constants.ENABLE_MULTI_DELETE,

Review Comment:
   Nice catch. Tested with a GCS bucket as well.






Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-29 Thread via GitHub


hadoop-yetus commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2083482008

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 06s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m 01s |  |  xmllint was not available.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  markdownlint  |   0m 01s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to 
include 10 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 19s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  91m 56s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  40m 19s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   6m 14s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   4m 28s | 
[/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/4/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |  19m 58s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 195m 53s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 25s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  16m 00s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  37m 43s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  37m 43s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   5m 59s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   4m 36s | 
[/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/4/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |  19m 48s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 199m 09s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   5m 37s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 597m 32s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6726 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle 
markdownlint |
   | uname | MINGW64_NT-10.0-17763 e57383186604 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 0339eeb5bd4f0a90e5530abb8df9530f582d99b3 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/4/testReport/
 |
   | modules | C: hadoop-common-project/hadoop-common 
hadoop-hdfs-project/hadoop-hdfs hadoop-tools/hadoop-aws 
hadoop-tools/hadoop-azure U: . |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/4/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-29 Thread via GitHub


hadoop-yetus commented on PR #6738:
URL: https://github.com/apache/hadoop/pull/6738#issuecomment-2082900564

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 05s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m 00s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 00s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m 00s |  |  xmllint was not available.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  markdownlint  |   0m 01s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 17s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  90m 51s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  40m 09s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   5m 58s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   4m 31s | 
[/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6738/2/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |  13m 57s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 168m 48s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 17s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  10m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  38m 07s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  38m 07s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m 01s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6738/2/artifact/out/blanks-eol.txt)
 |  The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   5m 56s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   4m 27s | 
[/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6738/2/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |  13m 49s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 182m 29s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  asflicense  |   7m 44s | 
[/results-asflicense.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6738/2/artifact/out/results-asflicense.txt)
 |  The patch generated 1 ASF License warnings.  |
   |  |   | 550m 15s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6738 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle 
markdownlint |
   | uname | MINGW64_NT-10.0-17763 374a372225c9 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 744a643945e9fbf2fd1246c3e48c752789060370 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6738/2/testReport/
 |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws 
hadoop-tools/hadoop-azure U: . |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6738/2/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-26 Thread via GitHub


steveloughran commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2079798916

   Iceberg PoC PR: https://github.com/apache/iceberg/pull/10233





Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-26 Thread via GitHub


steveloughran commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2079798082

   For the Iceberg support to work, all filesystems MUST implement the API, or we have to modify that PoC to handle the case where they don't. I'd rather the spec say any FS which supports delete() MUST support this, since it comes for free.
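   For callers that cannot rely on that guarantee (for example when running against older releases), a hedged defensive pattern might look like this sketch (helper name is hypothetical, not from the PR):

   ```java
   import java.io.IOException;
   import java.util.Collections;

   import org.apache.hadoop.fs.BulkDelete;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;

   // Falls back to the classic single-path delete() if the store rejects the bulk API.
   final class FallbackDeleteSketch {
     static void deleteOne(FileSystem fs, Path base, Path path) throws IOException {
       try (BulkDelete bulk = fs.createBulkDelete(base)) {
         bulk.bulkDelete(Collections.singletonList(path));
       } catch (UnsupportedOperationException e) {
         fs.delete(path, false);
       }
     }
   }
   ```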





Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-26 Thread via GitHub


steveloughran commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1581288128


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DefaultBulkDeleteOperation.java:
##
@@ -17,61 +17,86 @@
  */
 package org.apache.hadoop.fs;
 
+import java.io.FileNotFoundException;
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collection;
 import java.util.List;
 import java.util.Map;
 
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
 import org.apache.hadoop.util.functional.Tuples;
 
 import static java.util.Objects.requireNonNull;
 import static org.apache.hadoop.fs.BulkDeleteUtils.validateBulkDeletePaths;
-import static org.apache.hadoop.util.Preconditions.checkArgument;
 
 /**
  * Default implementation of the {@link BulkDelete} interface.
  */
 public class DefaultBulkDeleteOperation implements BulkDelete {
 
-private final int pageSize;
+private static Logger LOG = 
LoggerFactory.getLogger(DefaultBulkDeleteOperation.class);
+
+/** Default page size for bulk delete. */
+private static final int DEFAULT_PAGE_SIZE = 1;
 
+/** Base path for the bulk delete operation. */
 private final Path basePath;
 
+/** Delegate File system make actual delete calls. */
 private final FileSystem fs;
 
-public DefaultBulkDeleteOperation(int pageSize,
-  Path basePath,
+public DefaultBulkDeleteOperation(Path basePath,
   FileSystem fs) {
-checkArgument(pageSize == 1, "Page size must be equal to 1");
-this.pageSize = pageSize;
 this.basePath = requireNonNull(basePath);
 this.fs = fs;
 }
 
 @Override
 public int pageSize() {
-return pageSize;
+return DEFAULT_PAGE_SIZE;
 }
 
 @Override
 public Path basePath() {
 return basePath;
 }
 
+/**
+ * {@inheritDoc}
+ */
 @Override
 public List> bulkDelete(Collection paths)
 throws IOException, IllegalArgumentException {
-validateBulkDeletePaths(paths, pageSize, basePath);
+validateBulkDeletePaths(paths, DEFAULT_PAGE_SIZE, basePath);
 List> result = new ArrayList<>();
-// this for loop doesn't make sense as pageSize must be 1.
-for (Path path : paths) {
+if (!paths.isEmpty()) {
+// As the page size is always 1, this should be the only one
+// path in the collection.
+Path pathToDelete = paths.iterator().next();
 try {
-fs.delete(path, false);
-// What to do if this return false?
-// I think we should add the path to the result list with 
value "Not Deleted".
-} catch (IOException e) {
-result.add(Tuples.pair(path, e.toString()));
+boolean deleted = fs.delete(pathToDelete, false);
+if (deleted) {
+return result;
+} else {
+try {
+FileStatus fileStatus = fs.getFileStatus(pathToDelete);
+if (fileStatus.isDirectory()) {
+result.add(Tuples.pair(pathToDelete, "Path is a 
directory"));
+}
+} catch (FileNotFoundException e) {
+// Ignore FNFE and don't add to the result list.
+LOG.debug("Couldn't delete {} - does not exist: {}", 
pathToDelete, e.toString());
+} catch (Exception e) {
+LOG.debug("Couldn't delete {} - exception occurred: 
{}", pathToDelete, e.toString());
+result.add(Tuples.pair(pathToDelete, e.toString()));
+}
+}
+} catch (Exception ex) {

Review Comment:
   make this an IOException
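   A minimal sketch of what that narrower handling could look like (shape assumed; variable names follow the excerpt, not the merged code):

   ```java
   import java.io.FileNotFoundException;
   import java.io.IOException;
   import java.util.List;
   import java.util.Map;

   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   import org.apache.hadoop.util.functional.Tuples;

   // Only IOException is converted into a (path, error) entry; anything else propagates.
   final class SinglePathDeleteSketch {
     static void deleteOne(FileSystem fs, Path pathToDelete,
         List<Map.Entry<Path, String>> result) {
       try {
         if (!fs.delete(pathToDelete, false)
             && fs.getFileStatus(pathToDelete).isDirectory()) {
           result.add(Tuples.pair(pathToDelete, "Path is a directory"));
         }
       } catch (FileNotFoundException e) {
         // already absent: bulk delete does not treat this as a failure
       } catch (IOException e) {
         result.add(Tuples.pair(pathToDelete, e.toString()));
       }
     }
   }
   ```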



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DefaultBulkDeleteOperation.java:
##
@@ -17,61 +17,86 @@
  */
 package org.apache.hadoop.fs;
 
+import java.io.FileNotFoundException;
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collection;
 import java.util.List;
 import java.util.Map;
 
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
 import org.apache.hadoop.util.functional.Tuples;
 
 import static java.util.Objects.requireNonNull;
 import static org.apache.hadoop.fs.BulkDeleteUtils.validateBulkDeletePaths;
-import static org.apache.hadoop.util.Preconditions.checkArgument;
 
 /**
  * Default implementation of the {@link BulkDelete} interface.
  */
 public class DefaultBulkDeleteOperation implements BulkDelete {
 
-private final int pageSize;
+private static Logger LOG = 
LoggerFactory.getLogger(DefaultBulkDeleteOperation.class);
+
+/** Default page size for bulk delete. */
+private static final int DEFAULT_PAGE_SIZE = 1;
 
+/** Base 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-26 Thread via GitHub


hadoop-yetus commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2078874481

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  22m 31s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 10 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 58s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  39m 38s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 30s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |  20m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   5m 52s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m 59s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   4m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   8m 57s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  43m 25s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 37s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 23s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 41s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |  22m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 12s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |  21m 12s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/5/artifact/out/blanks-eol.txt)
 |  The patch has 8 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   5m 34s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/5/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 409 new + 106 unchanged - 0 fixed = 515 total 
(was 106)  |
   | +1 :green_heart: |  mvnsite  |   5m 35s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m 52s | 
[/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/5/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  
hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
 with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 generated 2 new + 0 
unchanged - 0 fixed = 2 total (was 0)  |
   | -1 :x: |  javadoc  |   0m 43s | 
[/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/5/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt)
 |  
hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 
with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 generated 3 new + 
0 unchanged - 0 fixed = 3 total (was 0)  |
   | +1 :green_heart: |  spotbugs  |   9m 37s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  40m 41s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   5m 45s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 275m  3s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  unit  |   3m 23s | 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-26 Thread via GitHub


hadoop-yetus commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2078773861

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  18m 18s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 10 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 48s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  36m 32s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 45s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |  18m  7s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   4m 54s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m  9s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m 59s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   4m 22s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   8m 49s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  39m 33s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 57s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 46s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |  18m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 56s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |  17m 56s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/4/artifact/out/blanks-eol.txt)
 |  The patch has 8 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   4m 49s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/4/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 409 new + 106 unchanged - 0 fixed = 515 total 
(was 106)  |
   | +1 :green_heart: |  mvnsite  |   5m  5s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m 11s | 
[/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/4/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  
hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
 with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 generated 2 new + 0 
unchanged - 0 fixed = 2 total (was 0)  |
   | -1 :x: |  javadoc  |   0m 47s | 
[/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/4/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt)
 |  
hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 
with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 generated 3 new + 
0 unchanged - 0 fixed = 3 total (was 0)  |
   | +1 :green_heart: |  spotbugs  |   9m 24s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  40m  5s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   5m 45s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 267m  9s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | -1 :x: |  unit  |   3m 22s | 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-25 Thread via GitHub


mukund-thakur commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1580173720


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DefalutBulkDeleteSource.java:
##
@@ -0,0 +1,38 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs;
+
+import java.io.IOException;
+
+/**
+ * Default implementation of {@link BulkDeleteSource}.
+ */
+public class DefalutBulkDeleteSource implements BulkDeleteSource {
+
+private final FileSystem fs;

Review Comment:
   javadoc



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/BulkDeleteUtils.java:
##
@@ -0,0 +1,54 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs;
+
+import java.util.Collection;
+
+import static java.util.Objects.requireNonNull;
+import static org.apache.hadoop.util.Preconditions.checkArgument;
+
+/**
+ * Utility class for bulk delete operations.
+ */
+public final class BulkDeleteUtils {
+
+private BulkDeleteUtils() {
+}
+
+public static void validateBulkDeletePaths(Collection paths, int 
pageSize, Path basePath) {
+requireNonNull(paths);
+checkArgument(paths.size() <= pageSize,
+"Number of paths (%d) is larger than the page size (%d)", 
paths.size(), pageSize);
+paths.forEach(p -> {
+checkArgument(p.isAbsolute(), "Path %s is not absolute", p);
+checkArgument(validatePathIsUnderParent(p, basePath),
+"Path %s is not under the base path %s", p, basePath);
+});
+}
+
+public static boolean validatePathIsUnderParent(Path p, Path basePath) {

Review Comment:
   javadoc
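   To illustrate what that javadoc might say, a hedged sketch follows (the wording and the standalone helper are assumptions, not the PR's code):

   ```java
   import org.apache.hadoop.fs.Path;

   final class PathValidationSketch {

     /**
      * Probe whether a path lies under (or equals) a base path.
      * @param p path to check; expected to be absolute
      * @param basePath base path of the bulk delete operation
      * @return true if {@code p} equals {@code basePath} or has it as an ancestor
      */
     static boolean isUnderParent(Path p, Path basePath) {
       for (Path current = p; current != null; current = current.getParent()) {
         if (current.equals(basePath)) {
           return true;
         }
       }
       return false;
     }
   }
   ```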






Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-23 Thread via GitHub


hadoop-yetus commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2073964537

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 05s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m 01s |  |  xmllint was not available.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  markdownlint  |   0m 01s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 31s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  91m 09s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  40m 32s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   6m 09s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   4m 42s | 
[/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/1/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |  14m 12s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 171m 45s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 24s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  11m 00s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  38m 33s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  38m 33s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m 00s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/1/artifact/out/blanks-eol.txt)
 |  The patch has 5 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   6m 31s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   4m 36s | 
[/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/1/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |  14m 19s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 185m 02s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  asflicense  |   5m 46s | 
[/results-asflicense.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/1/artifact/out/results-asflicense.txt)
 |  The patch generated 1 ASF License warnings.  |
   |  |   | 555m 55s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6726 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle 
markdownlint |
   | uname | MINGW64_NT-10.0-17763 cfb6e8c364ad 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 741542703607b954851f005514b12af61a98afb6 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/1/testReport/
 |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws 
hadoop-tools/hadoop-azure U: . |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6726/1/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-23 Thread via GitHub


hadoop-yetus commented on PR #6738:
URL: https://github.com/apache/hadoop/pull/6738#issuecomment-2072921149

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 05s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m 00s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 00s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m 00s |  |  xmllint was not available.  |
   | +0 :ok: |  spotbugs  |   0m 00s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  markdownlint  |   0m 00s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   3m 11s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  90m 04s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  39m 11s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   5m 51s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   4m 20s | 
[/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6738/1/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |  13m 35s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 167m 43s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   2m 18s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  10m 40s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  37m 05s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  37m 05s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m 00s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6738/1/artifact/out/blanks-eol.txt)
 |  The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   6m 04s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   4m 25s | 
[/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6738/1/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |  13m 59s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 177m 47s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  asflicense  |   5m 31s | 
[/results-asflicense.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6738/1/artifact/out/results-asflicense.txt)
 |  The patch generated 1 ASF License warnings.  |
   |  |   | 540m 54s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6738 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle 
markdownlint |
   | uname | MINGW64_NT-10.0-17763 b4a02a5f9adc 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 744a643945e9fbf2fd1246c3e48c752789060370 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6738/1/testReport/
 |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws 
hadoop-tools/hadoop-azure U: . |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6738/1/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-19 Thread via GitHub


steveloughran commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1572223236


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DefaultBulkDeleteOperation.java:
##
@@ -0,0 +1,84 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.hadoop.util.functional.Tuples;
+
+import static java.util.Objects.requireNonNull;
+import static org.apache.hadoop.fs.BulkDeleteUtils.validateBulkDeletePaths;
+import static org.apache.hadoop.util.Preconditions.checkArgument;
+
+/**
+ * Default implementation of the {@link BulkDelete} interface.
+ */
+public class DefaultBulkDeleteOperation implements BulkDelete {
+
+private final int pageSize;

Review Comment:
   this is always 1, isn't it? so much can be simplified here
   * no need for the field
   * no need to pass it in the constructor
   * pageSize() to return 1
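
   For illustration, a minimal sketch of the simplified class under that assumption: the page size is hard-coded to 1, so there is no field and no constructor argument. This is a sketch against the interfaces quoted in this thread, not the final patch.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.fs.BulkDelete;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.functional.Tuples;

import static java.util.Objects.requireNonNull;

/**
 * Sketch: default BulkDelete implementation with the page size fixed at 1.
 */
public class DefaultBulkDeleteOperation implements BulkDelete {

  private final Path basePath;

  private final FileSystem fs;

  public DefaultBulkDeleteOperation(Path basePath, FileSystem fs) {
    this.basePath = requireNonNull(basePath);
    this.fs = requireNonNull(fs);
  }

  @Override
  public int pageSize() {
    // the default implementation only ever deletes one path per request
    return 1;
  }

  @Override
  public Path basePath() {
    return basePath;
  }

  @Override
  public List<Map.Entry<Path, String>> bulkDelete(List<Path> paths)
      throws IOException {
    // page size is 1, so the list holds at most one path;
    // validation of the paths against basePath is omitted in this sketch
    List<Map.Entry<Path, String>> failures = new ArrayList<>();
    for (Path path : paths) {
      try {
        fs.delete(path, false);          // non-recursive single delete
      } catch (IOException e) {
        failures.add(Tuples.pair(path, e.toString()));
      }
    }
    return failures;
  }

  @Override
  public void close() {
    // no resources to release in the default implementation
  }
}
```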



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-19 Thread via GitHub


steveloughran commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1572230146


##
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/bulkdelete.md:
##
@@ -161,124 +105,20 @@ store.hasPathCapability(path, 
"fs.capability.bulk.delete")
 
 ### Invocation through Reflection.
 
-The need for many Libraries to compile against very old versions of Hadoop
+The need for many libraries to compile against very old versions of Hadoop
 means that most of the cloud-first Filesystem API calls cannot be used except
 through reflection -And the more complicated The API and its data types are,
 The harder that reflection is to implement.
 
-To assist this, the class `org.apache.hadoop.fs.FileUtil` has two methods
+To assist this, the class `org.apache.hadoop.io.wrappedio.WrappedIO` has few 
methods
 which are intended to provide simple access to the API, especially
 through reflection.
 
 ```java
-  /**
-   * Get the maximum number of objects/files to delete in a single request.
-   * @param fs filesystem
-   * @param path path to delete under.
-   * @return a number greater than or equal to zero.
-   * @throws UnsupportedOperationException bulk delete under that path is not 
supported.
-   * @throws IllegalArgumentException path not valid.
-   * @throws IOException problems resolving paths
-   */
+
   public static int bulkDeletePageSize(FileSystem fs, Path path) throws 
IOException;
   
-  /**
-   * Delete a list of files/objects.
-   * 
-   *   Files must be under the path provided in {@code base}.
-   *   The size of the list must be equal to or less than the page 
size.
-   *   Directories are not supported; the outcome of attempting to delete
-   *   directories is undefined (ignored; undetected, listed as 
failures...).
-   *   The operation is not atomic.
-   *   The operation is treated as idempotent: network failures may
-   *trigger resubmission of the request -any new objects created under 
a
-   *path in the list may then be deleted.
-   *There is no guarantee that any parent directories exist after this 
call.
-   *
-   * 
-   * @param fs filesystem
-   * @param base path to delete under.
-   * @param paths list of paths which must be absolute and under the base path.
-   * @return a list of all the paths which couldn't be deleted for a reason 
other than "not found" and any associated error message.
-   * @throws UnsupportedOperationException bulk delete under that path is not 
supported.
-   * @throws IOException IO problems including networking, authentication and 
more.
-   * @throws IllegalArgumentException if a path argument is invalid.
-   */
-  public static List<Map.Entry<Path, String>> bulkDelete(FileSystem fs, Path base, List<Path> paths)
-```
+  public static int bulkDeletePageSize(FileSystem fs, Path path) throws 
IOException;
 
-## S3A Implementation
-
-The S3A client exports this API.

Review Comment:
   this needs to be covered, along with the default implementation "maps to 
delete(path, false)"



##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DefaultBulkDeleteOperation.java:
##
@@ -0,0 +1,84 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.hadoop.util.functional.Tuples;
+
+import static java.util.Objects.requireNonNull;
+import static org.apache.hadoop.fs.BulkDeleteUtils.validateBulkDeletePaths;
+import static org.apache.hadoop.util.Preconditions.checkArgument;
+
+/**
+ * Default implementation of the {@link BulkDelete} interface.
+ */
+public class DefaultBulkDeleteOperation implements BulkDelete {
+
+private final int pageSize;
+
+private final Path basePath;
+
+private final FileSystem fs;
+
+public DefaultBulkDeleteOperation(int pageSize,
+  Path basePath,
+  FileSystem fs) {
+checkArgument(pageSize == 1, "Page size must be equal to 1");
+this.pageSize = pageSize;
+this.basePath = requireNonNull(basePath);
+

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-18 Thread via GitHub


hadoop-yetus commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2065627094

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 50s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 35s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  36m 55s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 38s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |  18m 20s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   4m 51s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 32s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 21s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   5m  8s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  39m 46s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   1m 55s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 59s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |  18m 59s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 56s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |  17m 56s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/3/artifact/out/blanks-eol.txt)
 |  The patch has 5 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   4m 53s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/3/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 309 new + 41 unchanged - 0 fixed = 350 total (was 
41)  |
   | +1 :green_heart: |  mvnsite  |   3m 28s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m 10s | 
[/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/3/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  
hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
 with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 generated 3 new + 0 
unchanged - 0 fixed = 3 total (was 0)  |
   | -1 :x: |  javadoc  |   0m 48s | 
[/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/3/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt)
 |  
hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 
with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 generated 3 new + 
0 unchanged - 0 fixed = 3 total (was 0)  |
   | +1 :green_heart: |  spotbugs  |   5m 40s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  41m 29s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   5m 40s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  |   3m  7s | 
[/patch-unit-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/3/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt)
 |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  unit  |   2m 48s |  |  hadoop-azure in the patch 
passed.  |
   | -1 :x: |  asflicense  |   1m  4s | 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-17 Thread via GitHub


mukund-thakur commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1569566843


##
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/aws_sdk_upgrade.md:
##
@@ -324,6 +324,7 @@ They have also been updated to return V2 SDK classes.
 public interface S3AInternals {
   S3Client getAmazonS3V2Client(String reason);
 
+  S3AStore getStore();

Review Comment:
   this is a doc file. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-17 Thread via GitHub


mukund-thakur commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1569528772


##
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractBulkDeleteTest.java:
##
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.contract;
+
+import org.apache.hadoop.fs.*;
+import org.assertj.core.api.Assertions;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+import static org.apache.hadoop.fs.contract.ContractTestUtils.touch;
+import static org.apache.hadoop.test.LambdaTestUtils.intercept;
+
+public abstract class AbstractContractBulkDeleteTest extends 
AbstractFSContractTestBase {
+
+private static final Logger LOG =
+LoggerFactory.getLogger(AbstractContractBulkDeleteTest.class);
+
+protected int pageSize;
+
+protected Path basePath;
+
+protected FileSystem fs;
+
+@Before
+public void setUp() throws Exception {
+fs = getFileSystem();
+basePath = path(getClass().getName());

Review Comment:
   this is under setup and the path will be created under the contract test 
directory. so cleanup should work.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-16 Thread via GitHub


mukund-thakur commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1567902648


##
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractBulkDelete.java:
##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.contract.s3a;
+
+import org.apache.hadoop.conf.Configuration;

Review Comment:
   Ah, sorry. I installed a new IDE on my new Mac, so the old rules are gone.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-16 Thread via GitHub


hadoop-yetus commented on PR #6738:
URL: https://github.com/apache/hadoop/pull/6738#issuecomment-2059423509

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 20s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 14s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  20m 17s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   9m  0s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   8m 13s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   2m  6s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m  8s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 43s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   3m  2s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 49s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 42s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   8m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   8m 17s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6738/1/artifact/out/blanks-eol.txt)
 |  The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   2m  2s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6738/1/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 206 new + 41 unchanged - 0 fixed = 247 total (was 
41)  |
   | +1 :green_heart: |  mvnsite  |   2m  9s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   0m 44s | 
[/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6738/1/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  
hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
 with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 generated 3 new + 0 
unchanged - 0 fixed = 3 total (was 0)  |
   | -1 :x: |  javadoc  |   0m 29s | 
[/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6738/1/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt)
 |  
hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 
with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 generated 3 new + 
0 unchanged - 0 fixed = 3 total (was 0)  |
   | +1 :green_heart: |  spotbugs  |   3m 24s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 48s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   4m 20s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  |   2m 39s | 
[/patch-unit-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6738/1/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt)
 |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  unit  |   2m  4s |  |  hadoop-azure in the patch 
passed.  |
   | -1 :x: |  asflicense  |   0m 40s | 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-16 Thread via GitHub


steveloughran commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2059138849

   commented. I've also done a PR #6738 which tunes the API to work with 
iceberg, having just written a PoC of the iceberg binding. 
   
   My PR
   * moved the wrapper methods to a new wrappedio.WrappedIO class
   * add a probe for the api being available
   * I also added an availability probe in the interface. not sure about that 
as we really should make it available everywhere, always.
   
   Can you cherrypick this PR onto your branch and then do the review comments.
   
   After which, please do not do any rebasing of your PR. That way, it is 
easier for me to keep my own branch in sync with your changes. Thanks.
   
   PoC of iceberg integration, based on their S3FileIO one.
   
   
https://github.com/steveloughran/iceberg/blob/s3/HADOOP-18679-bulk-delete-api/core/src/main/java/org/apache/iceberg/hadoop/HadoopFileIO.java#L208
   
   The iceberg api passes in a collection of paths, *which may span multiple 
filesystems*.
   
   To handle this, 
   * the bulk delete API should take a Collection, not a list
   * it needs to be implemented in every FS, because trying to distinguish 
case-by-case on support would be really complex.
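
   To make the multi-filesystem case concrete, here is a hedged sketch of how a caller like the Iceberg binding might fan a mixed collection of paths out per filesystem. The static `bulkDeletePageSize`/`bulkDelete` wrappers are the names quoted from the draft documentation; their exact signatures, and everything else here, should be treated as an assumption rather than the actual HadoopFileIO code.

```java
// Hedged sketch: group a mixed Collection<Path> by owning filesystem, then
// issue paged bulk deletes per filesystem through the reflection-friendly
// static wrappers (failure handling is omitted).
public static void bulkDeleteAcrossFileSystems(Collection<Path> paths,
    Configuration conf) throws IOException {
  // group the paths by the filesystem that owns them (scheme + authority)
  Map<URI, List<Path>> byFileSystem = new HashMap<>();
  for (Path path : paths) {
    URI fsUri = path.getFileSystem(conf).getUri();
    byFileSystem.computeIfAbsent(fsUri, u -> new ArrayList<>()).add(path);
  }
  // one paged bulk delete sequence per filesystem
  for (Map.Entry<URI, List<Path>> entry : byFileSystem.entrySet()) {
    FileSystem fs = FileSystem.get(entry.getKey(), conf);
    Path base = new Path(entry.getKey());            // filesystem root as base
    List<Path> fsPaths = entry.getValue();
    int pageSize = WrappedIO.bulkDeletePageSize(fs, base);
    for (int i = 0; i < fsPaths.size(); i += pageSize) {
      List<Path> page =
          fsPaths.subList(i, Math.min(i + pageSize, fsPaths.size()));
      WrappedIO.bulkDelete(fs, base, page);          // failures ignored here
    }
  }
}
```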
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-16 Thread via GitHub


steveloughran commented on code in PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#discussion_r1566433464


##
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/bulkdelete.md:
##
@@ -0,0 +1,284 @@
+
+
+#  interface `BulkDelete`

Review Comment:
   needs to be referenced from index.md



##
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/bulkdelete.md:
##
@@ -0,0 +1,284 @@
+
+
+#  interface `BulkDelete`
+
+
+
+The `BulkDelete` interface provides an API to perform bulk delete of 
files/objects
+in an object store or filesystem.
+
+## Key Features
+
+* An API for submitting a list of paths to delete.
+* This list must be no larger than the "page size" supported by the client; 
This size is also exposed as a method.
+* Triggers a request to delete files at the specific paths.
+* Returns a list of which paths were reported as delete failures by the store.
+* Does not consider a nonexistent file to be a failure.
+* Does not offer any atomicity guarantees.
+* Idempotency guarantees are weak: retries may delete files newly created by 
other clients.
+* Provides no guarantees as to the outcome if a path references a directory.
+* Provides no guarantees that parent directories will exist after the call.
+
+
+The API is designed to match the semantics of the AWS S3 [Bulk 
Delete](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) 
REST API call, but it is not
+exclusively restricted to this store. This is why the "provides no guarantees"
+restrictions do not state what the outcome will be when executed on other 
stores.
+
+### Interface `org.apache.hadoop.fs.BulkDeleteSource`
+
+The interface `BulkDeleteSource` is offered by a FileSystem/FileContext class 
if
+it supports the API.
+
+```java
+@InterfaceAudience.Public
+@InterfaceStability.Unstable
+public interface BulkDeleteSource {
+
+  /**
+   * Create a bulk delete operation.
+   * There is no network IO at this point, simply the creation of
+   * a bulk delete object.
+   * A path must be supplied to assist in link resolution.
+   * @param path path to delete under.
+   * @return the bulk delete.
+   * @throws UnsupportedOperationException bulk delete under that path is not 
supported.
+   * @throws IllegalArgumentException path not valid.
+   * @throws IOException problems resolving paths
+   */
+  default BulkDelete createBulkDelete(Path path)
+  throws UnsupportedOperationException, IllegalArgumentException, 
IOException;
+
+}
+
+```
+
+### Interface `org.apache.hadoop.fs.BulkDelete`
+
+This is the bulk delete implementation returned by the `createBulkDelete()` 
call.
+
+```java
+/**
+ * API for bulk deletion of objects/files,
+ * but not directories.
+ * After use, call {@code close()} to release any resources and
+ * to guarantee store IOStatistics are updated.
+ * 
+ * Callers MUST have no expectation that parent directories will exist after 
the
+ * operation completes; if an object store needs to explicitly look for and 
create
+ * directory markers, that step will be omitted.
+ * 
+ * Be aware that on some stores (AWS S3) each object listed in a bulk delete 
counts
+ * against the write IOPS limit; large page sizes are counterproductive here, 
as
+ * are attempts at parallel submissions across multiple threads.
+ * @see <a href="https://issues.apache.org/jira/browse/HADOOP-16823">HADOOP-16823.
+ *  Large DeleteObject requests are their own Thundering Herd</a>
+ * 
+ */
+@InterfaceAudience.Public
+@InterfaceStability.Unstable
+public interface BulkDelete extends IOStatisticsSource, Closeable {
+
+  /**
+   * The maximum number of objects/files to delete in a single request.
+   * @return a number greater than or equal to zero.
+   */
+  int pageSize();
+
+  /**
+   * Base path of a bulk delete operation.
+   * All paths submitted in {@link #bulkDelete(List)} must be under this path.
+   */
+  Path basePath();
+
+  /**
+   * Delete a list of files/objects.
+   * 
+   *   Files must be under the path provided in {@link #basePath()}.
+   *   The size of the list must be equal to or less than the page size
+   *   declared in {@link #pageSize()}.
+   *   Directories are not supported; the outcome of attempting to delete
+   *   directories is undefined (ignored; undetected, listed as 
failures...).
+   *   The operation is not atomic.
+   *   The operation is treated as idempotent: network failures may
+   *trigger resubmission of the request -any new objects created under 
a
+   *path in the list may then be deleted.
+   *There is no guarantee that any parent directories exist after this 
call.
+   *
+   * 
+   * @param paths list of paths which must be absolute and under the base path.
+   * provided in {@link #basePath()}.
+   * @throws IOException IO problems including networking, authentication and 
more.
+   * @throws IllegalArgumentException if a path argument is invalid.
+   */
+  List<Map.Entry<Path, String>> bulkDelete(List<Path> paths)
+  throws 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-16 Thread via GitHub


steveloughran commented on PR #6738:
URL: https://github.com/apache/hadoop/pull/6738#issuecomment-2059110117

   This is #6726 with another commit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-15 Thread via GitHub


hadoop-yetus commented on PR #6726:
URL: https://github.com/apache/hadoop/pull/6726#issuecomment-2057942133

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  17m 20s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 53s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  37m  2s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 47s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |  17m 53s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   4m 47s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 31s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 42s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 25s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   5m  7s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  40m 13s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 37s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  3s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 59s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |  18m 59s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 57s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |  17m 57s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/1/artifact/out/blanks-eol.txt)
 |  The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   4m 46s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/1/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 204 new + 41 unchanged - 0 fixed = 245 total (was 
41)  |
   | +1 :green_heart: |  mvnsite  |   3m 29s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m 11s | 
[/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/1/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  
hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
 with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 generated 3 new + 0 
unchanged - 0 fixed = 3 total (was 0)  |
   | -1 :x: |  javadoc  |   0m 46s | 
[/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/1/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt)
 |  
hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 
with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 generated 3 new + 
0 unchanged - 0 fixed = 3 total (was 0)  |
   | +1 :green_heart: |  spotbugs  |   5m 37s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  40m  8s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   5m 44s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  |   3m 11s | 
[/patch-unit-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6726/1/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt)
 |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  unit  |   2m 47s |  |  hadoop-azure in the patch 
passed.  |
   | -1 :x: |  asflicense  |   1m  4s | 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-11 Thread via GitHub


mukund-thakur commented on code in PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#discussion_r1561315131


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/BulkDeleteOperationCallbacksImpl.java:
##
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.nio.file.AccessDeniedException;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import software.amazon.awssdk.services.s3.model.DeleteObjectsResponse;
+import software.amazon.awssdk.services.s3.model.ObjectIdentifier;
+import software.amazon.awssdk.services.s3.model.S3Error;
+
+import org.apache.hadoop.fs.s3a.Retries;
+import org.apache.hadoop.fs.s3a.S3AStore;
+import org.apache.hadoop.fs.store.audit.AuditSpan;
+import org.apache.hadoop.util.functional.Tuples;
+
+import static java.util.Collections.emptyList;
+import static java.util.Collections.singletonList;
+import static org.apache.hadoop.fs.s3a.Invoker.once;
+import static org.apache.hadoop.util.Preconditions.checkArgument;
+import static org.apache.hadoop.util.functional.Tuples.pair;
+
+/**
+ * Callbacks for the bulk delete operation.
+ */
+public class BulkDeleteOperationCallbacksImpl implements
+BulkDeleteOperation.BulkDeleteOperationCallbacks {
+
+  /**
+   * Path for logging.
+   */
+  private final String path;
+
+  /** Page size for bulk delete. */
+  private final int pageSize;
+
+  /** span for operations. */
+  private final AuditSpan span;
+
+  /**
+   * Store.
+   */
+  private final S3AStore store;
+
+
+  public BulkDeleteOperationCallbacksImpl(final S3AStore store,
+  String path, int pageSize, AuditSpan span) {
+this.span = span;
+this.pageSize = pageSize;
+this.path = path;
+this.store = store;
+  }
+
+  @Override
+  @Retries.RetryTranslated
+  public List<Map.Entry<String, String>> bulkDelete(final List<ObjectIdentifier> keysToDelete)
+  throws IOException, IllegalArgumentException {
+span.activate();
+final int size = keysToDelete.size();
+checkArgument(size <= pageSize,
+"Too many paths to delete in one operation: %s", size);
+if (size == 0) {
+  return emptyList();
+}
+
+if (size == 1) {
+  return deleteSingleObject(keysToDelete.get(0).key());
+}
+
+final DeleteObjectsResponse response = once("bulkDelete", path, () ->
+store.deleteObjects(store.getRequestFactory()
+.newBulkDeleteRequestBuilder(keysToDelete)
+.build())).getValue();
+final List<S3Error> errors = response.errors();
+if (errors.isEmpty()) {
+  // all good.
+  return emptyList();
+} else {
+  return errors.stream()
+  .map(e -> pair(e.key(), e.message()))

Review Comment:
   yes e.toString() sounds better.
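
   A minimal sketch of the agreed change, assuming the surrounding stream pipeline stays as quoted above:

```java
// Hedged sketch: S3Error.toString() carries the code, message and key, so the
// caller sees the full error detail rather than only the message.
return errors.stream()
    .map(e -> pair(e.key(), e.toString()))
    .collect(Collectors.toList());
```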



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-11 Thread via GitHub


steveloughran commented on code in PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#discussion_r1561126867


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/BulkDeleteOperationCallbacksImpl.java:
##
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.nio.file.AccessDeniedException;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import software.amazon.awssdk.services.s3.model.DeleteObjectsResponse;
+import software.amazon.awssdk.services.s3.model.ObjectIdentifier;
+import software.amazon.awssdk.services.s3.model.S3Error;
+
+import org.apache.hadoop.fs.s3a.Retries;
+import org.apache.hadoop.fs.s3a.S3AStore;
+import org.apache.hadoop.fs.store.audit.AuditSpan;
+import org.apache.hadoop.util.functional.Tuples;
+
+import static java.util.Collections.emptyList;
+import static java.util.Collections.singletonList;
+import static org.apache.hadoop.fs.s3a.Invoker.once;
+import static org.apache.hadoop.util.Preconditions.checkArgument;
+import static org.apache.hadoop.util.functional.Tuples.pair;
+
+/**
+ * Callbacks for the bulk delete operation.
+ */
+public class BulkDeleteOperationCallbacksImpl implements
+BulkDeleteOperation.BulkDeleteOperationCallbacks {
+
+  /**
+   * Path for logging.
+   */
+  private final String path;
+
+  /** Page size for bulk delete. */
+  private final int pageSize;
+
+  /** span for operations. */
+  private final AuditSpan span;
+
+  /**
+   * Store.
+   */
+  private final S3AStore store;
+
+
+  public BulkDeleteOperationCallbacksImpl(final S3AStore store,
+  String path, int pageSize, AuditSpan span) {
+this.span = span;
+this.pageSize = pageSize;
+this.path = path;
+this.store = store;
+  }
+
+  @Override
+  @Retries.RetryTranslated
+  public List<Map.Entry<String, String>> bulkDelete(final List<ObjectIdentifier> keysToDelete)
+  throws IOException, IllegalArgumentException {
+span.activate();
+final int size = keysToDelete.size();
+checkArgument(size <= pageSize,
+"Too many paths to delete in one operation: %s", size);
+if (size == 0) {
+  return emptyList();
+}
+
+if (size == 1) {
+  return deleteSingleObject(keysToDelete.get(0).key());
+}
+
+final DeleteObjectsResponse response = once("bulkDelete", path, () ->
+store.deleteObjects(store.getRequestFactory()
+.newBulkDeleteRequestBuilder(keysToDelete)
+.build())).getValue();
+final List<S3Error> errors = response.errors();
+if (errors.isEmpty()) {
+  // all good.
+  return emptyList();
+} else {
+  return errors.stream()
+  .map(e -> pair(e.key(), e.message()))

Review Comment:
   or e.toString()?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-10 Thread via GitHub


mukund-thakur commented on code in PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#discussion_r1560055171


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/BulkDeleteOperationCallbacksImpl.java:
##
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.nio.file.AccessDeniedException;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import software.amazon.awssdk.services.s3.model.DeleteObjectsResponse;
+import software.amazon.awssdk.services.s3.model.ObjectIdentifier;
+import software.amazon.awssdk.services.s3.model.S3Error;
+
+import org.apache.hadoop.fs.s3a.Retries;
+import org.apache.hadoop.fs.s3a.S3AStore;
+import org.apache.hadoop.fs.store.audit.AuditSpan;
+import org.apache.hadoop.util.functional.Tuples;
+
+import static java.util.Collections.emptyList;
+import static java.util.Collections.singletonList;
+import static org.apache.hadoop.fs.s3a.Invoker.once;
+import static org.apache.hadoop.util.Preconditions.checkArgument;
+import static org.apache.hadoop.util.functional.Tuples.pair;
+
+/**
+ * Callbacks for the bulk delete operation.
+ */
+public class BulkDeleteOperationCallbacksImpl implements
+BulkDeleteOperation.BulkDeleteOperationCallbacks {
+
+  /**
+   * Path for logging.
+   */
+  private final String path;
+
+  /** Page size for bulk delete. */
+  private final int pageSize;
+
+  /** span for operations. */
+  private final AuditSpan span;
+
+  /**
+   * Store.
+   */
+  private final S3AStore store;
+
+
+  public BulkDeleteOperationCallbacksImpl(final S3AStore store,
+  String path, int pageSize, AuditSpan span) {
+this.span = span;
+this.pageSize = pageSize;
+this.path = path;
+this.store = store;
+  }
+
+  @Override
+  @Retries.RetryTranslated
+  public List<Map.Entry<String, String>> bulkDelete(final List<ObjectIdentifier> keysToDelete)
+  throws IOException, IllegalArgumentException {
+span.activate();
+final int size = keysToDelete.size();
+checkArgument(size <= pageSize,
+"Too many paths to delete in one operation: %s", size);
+if (size == 0) {
+  return emptyList();
+}
+
+if (size == 1) {
+  return deleteSingleObject(keysToDelete.get(0).key());
+}
+
+final DeleteObjectsResponse response = once("bulkDelete", path, () ->
+store.deleteObjects(store.getRequestFactory()
+.newBulkDeleteRequestBuilder(keysToDelete)
+.build())).getValue();
+final List<S3Error> errors = response.errors();
+if (errors.isEmpty()) {
+  // all good.
+  return emptyList();
+} else {
+  return errors.stream()
+  .map(e -> pair(e.key(), e.message()))

Review Comment:
   e.code() gives AccessDenied
   and e.message() gives Access Denied. Does it make sense to add both 
**e.code() + " " + e.message()** to have the max info returned to the user?  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-08 Thread via GitHub


mukund-thakur commented on code in PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#discussion_r1556505328


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/BulkDelete.java:
##
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs;
+
+import java.io.Closeable;
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.classification.InterfaceStability;
+import org.apache.hadoop.fs.statistics.IOStatisticsSource;
+
+import static java.util.Objects.requireNonNull;
+
+/**
+ * API for bulk deletion of objects/files,
+ * but not directories.
+ * After use, call {@code close()} to release any resources and
+ * to guarantee store IOStatistics are updated.
+ * 
+ * Callers MUST have no expectation that parent directories will exist after 
the
+ * operation completes; if an object store needs to explicitly look for and 
create
+ * directory markers, that step will be omitted.
+ * 
+ * Be aware that on some stores (AWS S3) each object listed in a bulk delete 
counts
+ * against the write IOPS limit; large page sizes are counterproductive here, 
as
+ * are attempts at parallel submissions across multiple threads.
+ * @see <a href="https://issues.apache.org/jira/browse/HADOOP-16823">HADOOP-16823.
+ *  Large DeleteObject requests are their own Thundering Herd</a>
+ * 
+ */
+@InterfaceAudience.Public
+@InterfaceStability.Unstable
+public interface BulkDelete extends IOStatisticsSource, Closeable {
+
+  /**
+   * The maximum number of objects/files to delete in a single request.
+   * @return a number greater than or equal to zero.
+   */
+  int pageSize();

Review Comment:
   shouldn't this be greater than 0? 
   equal to 0 doesn't make sense. also we have the check in S3A impl. 
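
   A one-line sketch of the stricter check being suggested, reusing the existing Preconditions.checkArgument helper already imported in the S3A code quoted above; where exactly it would live (interface contract, S3A implementation, or both) is left open.

```java
// Hedged sketch: reject a non-positive page size instead of allowing zero.
checkArgument(pageSize > 0,
    "page size must be greater than 0: %s", pageSize);
```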



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-08 Thread via GitHub


mukund-thakur commented on code in PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#discussion_r1556490279


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/BulkDelete.java:
##
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs;
+
+import java.io.Closeable;
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.classification.InterfaceStability;
+import org.apache.hadoop.fs.statistics.IOStatisticsSource;
+
+import static java.util.Objects.requireNonNull;
+
+/**
+ * API for bulk deletion of objects/files,
+ * but not directories.
+ * After use, call {@code close()} to release any resources and
+ * to guarantee store IOStatistics are updated.
+ * 
+ * Callers MUST have no expectation that parent directories will exist after 
the
+ * operation completes; if an object store needs to explicitly look for and 
create
+ * directory markers, that step will be omitted.
+ * 
+ * Be aware that on some stores (AWS S3) each object listed in a bulk delete 
counts
+ * against the write IOPS limit; large page sizes are counterproductive here, 
as
+ * are attempts at parallel submissions across multiple threads.
+ * @see <a href="https://issues.apache.org/jira/browse/HADOOP-16823">HADOOP-16823.
+ *  Large DeleteObject requests are their own Thundering Herd</a>
+ * 
+ */
+@InterfaceAudience.Public
+@InterfaceStability.Unstable
+public interface BulkDelete extends IOStatisticsSource, Closeable {
+
+  /**
+   * The maximum number of objects/files to delete in a single request.
+   * @return a number greater than or equal to zero.
+   */
+  int pageSize();
+
+  /**
+   * Base path of a bulk delete operation.
+   * All paths submitted in {@link #bulkDelete(List)} must be under this path.
+   */
+  Path basePath();
+
+  /**
+   * Delete a list of files/objects.
+   * 
+   *   Files must be under the path provided in {@link #basePath()}.

Review Comment:
   Writing contract tests for this locally; I can't find the implementation of 
this in S3A. 
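
   For reference while those contract tests are being written, a minimal sketch of a test case against the public API (the cast to `BulkDeleteSource` and the AssertJ-style assertions are assumptions about test style, not code from this PR):

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.fs.BulkDelete;
import org.apache.hadoop.fs.BulkDeleteSource;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import static org.assertj.core.api.Assertions.assertThat;

public class BulkDeleteContractSketch {

  /** Delete a single file under the base path; one path always fits in a page. */
  public void verifySingleFileBulkDelete(FileSystem fs, Path base) throws Exception {
    Path file = new Path(base, "file-to-delete");
    fs.create(file).close();

    try (BulkDelete bulkDelete = ((BulkDeleteSource) fs).createBulkDelete(base)) {
      assertThat(bulkDelete.pageSize()).isGreaterThanOrEqualTo(1);
      List<Map.Entry<Path, String>> failures =
          bulkDelete.bulkDelete(Collections.singletonList(file));
      // Nonexistent files are not failures, so a clean run returns an empty list.
      assertThat(failures).isEmpty();
      assertThat(fs.exists(file)).isFalse();
    }
  }
}
```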



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-04-08 Thread via GitHub


mukund-thakur commented on code in PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#discussion_r1556489762


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/BulkDeleteOperationCallbacksImpl.java:
##
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.nio.file.AccessDeniedException;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import software.amazon.awssdk.services.s3.model.DeleteObjectsResponse;
+import software.amazon.awssdk.services.s3.model.ObjectIdentifier;
+import software.amazon.awssdk.services.s3.model.S3Error;
+
+import org.apache.hadoop.fs.s3a.Retries;
+import org.apache.hadoop.fs.s3a.S3AStore;
+import org.apache.hadoop.fs.store.audit.AuditSpan;
+import org.apache.hadoop.util.functional.Tuples;
+
+import static java.util.Collections.emptyList;
+import static java.util.Collections.singletonList;
+import static org.apache.hadoop.fs.s3a.Invoker.once;
+import static org.apache.hadoop.util.Preconditions.checkArgument;
+import static org.apache.hadoop.util.functional.Tuples.pair;
+
+/**
+ * Callbacks for the bulk delete operation.
+ */
+public class BulkDeleteOperationCallbacksImpl implements
+BulkDeleteOperation.BulkDeleteOperationCallbacks {
+
+  /**
+   * Path for logging.
+   */
+  private final String path;
+
+  /** Page size for bulk delete. */
+  private final int pageSize;
+
+  /** span for operations. */
+  private final AuditSpan span;
+
+  /**
+   * Store.
+   */
+  private final S3AStore store;
+
+
+  public BulkDeleteOperationCallbacksImpl(final S3AStore store,
+  String path, int pageSize, AuditSpan span) {
+this.span = span;
+this.pageSize = pageSize;
+this.path = path;
+this.store = store;
+  }
+
+  @Override
+  @Retries.RetryTranslated
+  public List<Map.Entry<String, String>> bulkDelete(final List<ObjectIdentifier> keysToDelete)
+  throws IOException, IllegalArgumentException {
+span.activate();
+final int size = keysToDelete.size();
+checkArgument(size <= pageSize,
+"Too many paths to delete in one operation: %s", size);
+if (size == 0) {
+  return emptyList();
+}
+
+if (size == 1) {
+  return deleteSingleObject(keysToDelete.get(0).key());
+}
+
+final DeleteObjectsResponse response = once("bulkDelete", path, () ->
+store.deleteObjects(store.getRequestFactory()
+.newBulkDeleteRequestBuilder(keysToDelete)
+.build())).getValue();
+final List<S3Error> errors = response.errors();
+if (errors.isEmpty()) {
+  // all good.
+  return emptyList();
+} else {
+  return errors.stream()
+  .map(e -> pair(e.key(), e.message()))
+  .collect(Collectors.toList());
+}
+  }
+
+  /**
+   * Delete a single object.
+   * @param key key to delete
+   * @return list of keys which failed to delete: length 0 or 1.
+   * @throws IOException IO problem other than AccessDeniedException
+   */
+  @Retries.RetryTranslated
+  private List<Map.Entry<String, String>> deleteSingleObject(final String key) throws IOException {

Review Comment:
   after checking locally, this is fine. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-03-28 Thread via GitHub


steveloughran commented on PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#issuecomment-2025696873

   In #6686 I'm creating a new utils class for reflection access, nothing else, 
and proposing that all tests of the API use reflection, so we can be really confident 
it works and that no accidental changes break reflection-based access.
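
   A hedged sketch of what reflection-based access could look like in such tests; the interface and method names come from this PR, but the wrapper itself is purely illustrative (the real utility class is the subject of #6686):

```java
import java.lang.reflect.Method;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class BulkDeleteReflectionSketch {

  private BulkDeleteReflectionSketch() {
  }

  /** Invoke createBulkDelete()/bulkDelete() without compile-time dependencies. */
  @SuppressWarnings("unchecked")
  public static List<Map.Entry<Path, String>> bulkDeleteViaReflection(
      FileSystem fs, Path base, List<Path> paths) throws Exception {
    Class<?> source = Class.forName("org.apache.hadoop.fs.BulkDeleteSource");
    Object bulkDelete = source.getMethod("createBulkDelete", Path.class)
        .invoke(fs, base);
    try {
      Method delete = Class.forName("org.apache.hadoop.fs.BulkDelete")
          .getMethod("bulkDelete", List.class);
      return (List<Map.Entry<Path, String>>) delete.invoke(bulkDelete, paths);
    } finally {
      // BulkDelete extends Closeable, so this cast is safe wherever the API exists.
      ((AutoCloseable) bulkDelete).close();
    }
  }
}
```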


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-03-28 Thread via GitHub


steveloughran commented on PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#issuecomment-2025591287

   FYI, I want to pull the rate limiter API of #6596 in here too; we'd have a 
rate limiter in the S3A store which, if enabled, would limit the number of deletes 
that can be issued against a bucket. Ideally it'd be set to 3000 on S3 Standard and 
off for S3 Express and third-party stores, to reduce the load this call can generate.
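
   To illustrate the intent only (the rate limiter API from #6596 isn't shown here), a minimal sketch using Guava's RateLimiter as a stand-in; the class name and how it would be wired into the S3A store are assumptions:

```java
import com.google.common.util.concurrent.RateLimiter;

/** Hypothetical gate on delete-object mutations issued against one bucket. */
public final class DeleteRateLimiterSketch {

  /** Null means "unlimited", e.g. for S3 Express or third-party stores. */
  private final RateLimiter limiter;

  public DeleteRateLimiterSketch(int deleteObjectsPerSecond) {
    this.limiter = deleteObjectsPerSecond > 0
        ? RateLimiter.create(deleteObjectsPerSecond)
        : null;
  }

  /** Block until a bulk delete of {@code objectCount} keys may be issued. */
  public void acquireDeleteCapacity(int objectCount) {
    if (limiter != null) {
      // Every key in a DeleteObjects request counts against the write IOPS limit.
      limiter.acquire(objectCount);
    }
  }
}
```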


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-03-28 Thread via GitHub


ahmarsuhail commented on code in PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#discussion_r1542647008


##
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/bulkdelete.md:
##
@@ -0,0 +1,284 @@
+
+
+#  interface `BulkDelete`
+
+
+
+The `BulkDelete` interface provides an API to perform bulk delete of 
files/objects
+in an object store or filesystem.
+
+## Key Features
+
+* An API for submitting a list of paths to delete.
+* This list must be no larger than the "page size" supported by the client; 
This size is also exposed as a method.
+* Triggers a request to delete files at the specific paths.
+* Returns a list of which paths were reported as delete failures by the store.
+* Does not consider a nonexistent file to be a failure.
+* Does not offer any atomicity guarantees.
+* Idempotency guarantees are weak: retries may delete files newly created by 
other clients.
+* Provides no guarantees as to the outcome if a path references a directory.
+* Provides no guarantees that parent directories will exist after the call.
+
+
+The API is designed to match the semantics of the AWS S3 [Bulk 
Delete](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html) 
REST API call, but it is not
+exclusively restricted to this store. This is why the "provides no guarantees"
+restrictions do not state what the outcome will be when executed on other 
stores.
+
+### Interface `org.apache.hadoop.fs.BulkDeleteSource`
+
+The interface `BulkDeleteSource` is offered by a FileSystem/FileContext class 
if
+it supports the API.
+
+```java
+@InterfaceAudience.Public
+@InterfaceStability.Unstable
+public interface BulkDeleteSource {
+
+  /**
+   * Create a bulk delete operation.
+   * There is no network IO at this point, simply the creation of
+   * a bulk delete object.
+   * A path must be supplied to assist in link resolution.
+   * @param path path to delete under.
+   * @return the bulk delete.
+   * @throws UnsupportedOperationException bulk delete under that path is not 
supported.
+   * @throws IllegalArgumentException path not valid.
+   * @throws IOException problems resolving paths
+   */
+  default BulkDelete createBulkDelete(Path path)
+  throws UnsupportedOperationException, IllegalArgumentException, 
IOException;
+
+}
+
+```
+
+### Interface `org.apache.hadoop.fs.BulkDelete`
+
+This is the bulk delete implementation returned by the `createBulkDelete()` 
call.
+
+```java
+/**
+ * API for bulk deletion of objects/files,
+ * but not directories.
+ * After use, call {@code close()} to release any resources and
+ * to guarantee store IOStatistics are updated.
+ * 
+ * Callers MUST have no expectation that parent directories will exist after 
the
+ * operation completes; if an object store needs to explicitly look for and 
create
+ * directory markers, that step will be omitted.
+ * 
+ * Be aware that on some stores (AWS S3) each object listed in a bulk delete 
counts
+ * against the write IOPS limit; large page sizes are counterproductive here, 
as
+ * are attempts at parallel submissions across multiple threads.
+ * @see <a href="https://issues.apache.org/jira/browse/HADOOP-16823">HADOOP-16823.
+ *  Large DeleteObject requests are their own Thundering Herd
+ * 
+ */
+@InterfaceAudience.Public
+@InterfaceStability.Unstable
+public interface BulkDelete extends IOStatisticsSource, Closeable {
+
+  /**
+   * The maximum number of objects/files to delete in a single request.
+   * @return a number greater than or equal to zero.
+   */
+  int pageSize();
+
+  /**
+   * Base path of a bulk delete operation.
+   * All paths submitted in {@link #bulkDelete(List)} must be under this path.
+   */
+  Path basePath();
+
+  /**
+   * Delete a list of files/objects.
+   * 
+   *   Files must be under the path provided in {@link #basePath()}.
+   *   The size of the list must be equal to or less than the page size
+   *   declared in {@link #pageSize()}.
+   *   Directories are not supported; the outcome of attempting to delete
+   *   directories is undefined (ignored; undetected, listed as 
failures...).
+   *   The operation is not atomic.
+   *   The operation is treated as idempotent: network failures may
+   *trigger resubmission of the request -any new objects created under 
a
+   *path in the list may then be deleted.
+   *There is no guarantee that any parent directories exist after this 
call.
+   *
+   * 
+   * @param paths list of paths which must be absolute and under the base path.
+   * provided in {@link #basePath()}.
+   * @throws IOException IO problems including networking, authentication and 
more.
+   * @throws IllegalArgumentException if a path argument is invalid.
+   */
+  List<Map.Entry<Path, String>> bulkDelete(List<Path> paths)
+  throws IOException, IllegalArgumentException;
+
+}
+
+```
+
+### `bulkDelete(paths)`
+
+ Preconditions
+
+```python
+if length(paths) > pageSize: throw IllegalArgumentException
+```
+
+ Postconditions
+
+All paths which 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-03-22 Thread via GitHub


steveloughran commented on code in PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#discussion_r1535463621


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##
@@ -5457,7 +5421,11 @@ public boolean hasPathCapability(final Path path, final 
String capability)
 case STORE_CAPABILITY_DIRECTORY_MARKER_AWARE:
   return true;
 
-  // multi object delete flag
+// this is always true, even if multi object
+// delete is disabled -the page size is simply reduced to 1.
+case CommonPathCapabilities.BULK_DELETE:

Review Comment:
   It means the API is present, along with some of its semantics ("parent dir existence 
not guaranteed"). For that reason, it will always be faster than before: one 
DELETE; no LIST/HEAD etc.
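
   A minimal sketch of how a client could use that probe to decide between the bulk API and plain delete() calls; illustrative only:

```java
import java.io.IOException;

import org.apache.hadoop.fs.CommonPathCapabilities;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class BulkDeleteProbe {

  private BulkDeleteProbe() {
  }

  /**
   * True on S3A even when multi-object delete is disabled: the page size is
   * then simply 1, but the API and its semantics are still available.
   */
  public static boolean supportsBulkDelete(FileSystem fs, Path path) throws IOException {
    return fs.hasPathCapability(path, CommonPathCapabilities.BULK_DELETE);
  }
}
```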



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-03-22 Thread via GitHub


steveloughran commented on code in PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#discussion_r1535462548


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/BulkDeleteOperationCallbacksImpl.java:
##
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.nio.file.AccessDeniedException;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import software.amazon.awssdk.services.s3.model.DeleteObjectsResponse;
+import software.amazon.awssdk.services.s3.model.ObjectIdentifier;
+import software.amazon.awssdk.services.s3.model.S3Error;
+
+import org.apache.hadoop.fs.s3a.Retries;
+import org.apache.hadoop.fs.s3a.S3AStore;
+import org.apache.hadoop.fs.store.audit.AuditSpan;
+import org.apache.hadoop.util.functional.Tuples;
+
+import static java.util.Collections.emptyList;
+import static java.util.Collections.singletonList;
+import static org.apache.hadoop.fs.s3a.Invoker.once;
+import static org.apache.hadoop.util.Preconditions.checkArgument;
+import static org.apache.hadoop.util.functional.Tuples.pair;
+
+/**
+ * Callbacks for the bulk delete operation.
+ */
+public class BulkDeleteOperationCallbacksImpl implements
+BulkDeleteOperation.BulkDeleteOperationCallbacks {
+
+  /**
+   * Path for logging.
+   */
+  private final String path;
+
+  /** Page size for bulk delete. */
+  private final int pageSize;
+
+  /** span for operations. */
+  private final AuditSpan span;
+
+  /**
+   * Store.
+   */
+  private final S3AStore store;
+
+
+  public BulkDeleteOperationCallbacksImpl(final S3AStore store,
+  String path, int pageSize, AuditSpan span) {
+this.span = span;
+this.pageSize = pageSize;
+this.path = path;
+this.store = store;
+  }
+
+  @Override
+  @Retries.RetryTranslated
+  public List<Map.Entry<String, String>> bulkDelete(final List<ObjectIdentifier> keysToDelete)
+  throws IOException, IllegalArgumentException {
+span.activate();
+final int size = keysToDelete.size();
+checkArgument(size <= pageSize,
+"Too many paths to delete in one operation: %s", size);
+if (size == 0) {
+  return emptyList();
+}
+
+if (size == 1) {
+  return deleteSingleObject(keysToDelete.get(0).key());
+}
+
+final DeleteObjectsResponse response = once("bulkDelete", path, () ->
+store.deleteObjects(store.getRequestFactory()
+.newBulkDeleteRequestBuilder(keysToDelete)
+.build())).getValue();
+final List<S3Error> errors = response.errors();
+if (errors.isEmpty()) {
+  // all good.
+  return emptyList();
+} else {
+  return errors.stream()
+  .map(e -> pair(e.key(), e.message()))
+  .collect(Collectors.toList());
+}
+  }
+
+  /**
+   * Delete a single object.
+   * @param key key to delete
+   * @return list of keys which failed to delete: length 0 or 1.
+   * @throws IOException IO problem other than AccessDeniedException
+   */
+  @Retries.RetryTranslated
+  private List<Map.Entry<String, String>> deleteSingleObject(final String key) throws IOException {

Review Comment:
   prefer a collection?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-03-21 Thread via GitHub


mukund-thakur commented on code in PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#discussion_r1532928858


##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##
@@ -5457,7 +5421,11 @@ public boolean hasPathCapability(final Path path, final 
String capability)
 case STORE_CAPABILITY_DIRECTORY_MARKER_AWARE:
   return true;
 
-  // multi object delete flag
+// this is always true, even if multi object
+// delete is disabled -the page size is simply reduced to 1.
+case CommonPathCapabilities.BULK_DELETE:

Review Comment:
   nit: won't this be a bit misleading? 



##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/BulkDeleteOperationCallbacksImpl.java:
##
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.nio.file.AccessDeniedException;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import software.amazon.awssdk.services.s3.model.DeleteObjectsResponse;
+import software.amazon.awssdk.services.s3.model.ObjectIdentifier;
+import software.amazon.awssdk.services.s3.model.S3Error;
+
+import org.apache.hadoop.fs.s3a.Retries;
+import org.apache.hadoop.fs.s3a.S3AStore;
+import org.apache.hadoop.fs.store.audit.AuditSpan;
+import org.apache.hadoop.util.functional.Tuples;
+
+import static java.util.Collections.emptyList;
+import static java.util.Collections.singletonList;
+import static org.apache.hadoop.fs.s3a.Invoker.once;
+import static org.apache.hadoop.util.Preconditions.checkArgument;
+import static org.apache.hadoop.util.functional.Tuples.pair;
+
+/**
+ * Callbacks for the bulk delete operation.
+ */
+public class BulkDeleteOperationCallbacksImpl implements
+BulkDeleteOperation.BulkDeleteOperationCallbacks {
+
+  /**
+   * Path for logging.
+   */
+  private final String path;
+
+  /** Page size for bulk delete. */
+  private final int pageSize;
+
+  /** span for operations. */
+  private final AuditSpan span;
+
+  /**
+   * Store.
+   */
+  private final S3AStore store;
+
+
+  public BulkDeleteOperationCallbacksImpl(final S3AStore store,
+  String path, int pageSize, AuditSpan span) {
+this.span = span;
+this.pageSize = pageSize;
+this.path = path;
+this.store = store;
+  }
+
+  @Override
+  @Retries.RetryTranslated
+  public List<Map.Entry<String, String>> bulkDelete(final List<ObjectIdentifier> keysToDelete)
+  throws IOException, IllegalArgumentException {
+span.activate();
+final int size = keysToDelete.size();
+checkArgument(size <= pageSize,
+"Too many paths to delete in one operation: %s", size);
+if (size == 0) {
+  return emptyList();
+}
+
+if (size == 1) {
+  return deleteSingleObject(keysToDelete.get(0).key());
+}
+
+final DeleteObjectsResponse response = once("bulkDelete", path, () ->
+store.deleteObjects(store.getRequestFactory()
+.newBulkDeleteRequestBuilder(keysToDelete)
+.build())).getValue();
+final List<S3Error> errors = response.errors();
+if (errors.isEmpty()) {
+  // all good.
+  return emptyList();
+} else {
+  return errors.stream()
+  .map(e -> pair(e.key(), e.message()))
+  .collect(Collectors.toList());
+}
+  }
+
+  /**
+   * Delete a single object.
+   * @param key key to delete
+   * @return list of keys which failed to delete: length 0 or 1.
+   * @throws IOException IO problem other than AccessDeniedException
+   */
+  @Retries.RetryTranslated
+  private List<Map.Entry<String, String>> deleteSingleObject(final String key) throws IOException {

Review Comment:
   do we need the return to be a List?



##
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/BulkDeleteOperationCallbacksImpl.java:
##
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-03-14 Thread via GitHub


hadoop-yetus commented on PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#issuecomment-1998394225

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 40s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  36m 15s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  18m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |  17m 15s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   4m 44s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 30s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 49s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | -1 :x: |  spotbugs  |   2m 33s | 
[/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/6/artifact/out/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html)
 |  hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs 
warnings.  |
   | +1 :green_heart: |  shadedclient  |  38m 11s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 26s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |  18m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 13s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  17m 13s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | 
[/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/6/artifact/out/blanks-eol.txt)
 |  The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix 
<>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   4m 37s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/6/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 23 new + 41 unchanged - 0 fixed = 64 total (was 
41)  |
   | +1 :green_heart: |  mvnsite  |   2m 29s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m  9s | 
[/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/6/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  
hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
 with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 generated 3 new + 0 
unchanged - 0 fixed = 3 total (was 0)  |
   | -1 :x: |  javadoc  |   0m 45s | 
[/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/6/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt)
 |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08 with 
JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 generated 3 new + 0 unchanged 
- 0 fixed = 3 total (was 0)  |
   | +1 :green_heart: |  spotbugs  |   4m  5s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  39m 25s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  19m 13s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  |   3m  6s | 
[/patch-unit-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/6/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt)
 |  hadoop-aws in the patch 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-03-13 Thread via GitHub


hadoop-yetus commented on PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#issuecomment-1996080816

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 27s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  36m 46s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 33s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |  20m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   5m 31s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 32s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 10s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | -1 :x: |  spotbugs  |   2m 36s | 
[/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/5/artifact/out/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html)
 |  hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs 
warnings.  |
   | +1 :green_heart: |  shadedclient  |  39m  4s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 12s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |  18m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  17m 29s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 31s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/5/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 18 new + 41 unchanged - 0 fixed = 59 total (was 
41)  |
   | +1 :green_heart: |  mvnsite  |   2m 26s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m  8s | 
[/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/5/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt)
 |  
hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
 with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 generated 3 new + 0 
unchanged - 0 fixed = 3 total (was 0)  |
   | -1 :x: |  javadoc  |   0m 45s | 
[/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/5/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt)
 |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08 with 
JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 generated 4 new + 0 unchanged 
- 0 fixed = 4 total (was 0)  |
   | +1 :green_heart: |  spotbugs  |   4m  9s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  38m 55s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  20m  3s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  |   3m 16s | 
[/patch-unit-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/5/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt)
 |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  1s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 270m 10s |  |  |
   
   
   | Reason | Tests |
   

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-02-15 Thread via GitHub


hadoop-yetus commented on PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#issuecomment-1947315201

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  18m 18s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 27s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  35m 25s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  18m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  17m 14s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   4m 38s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 27s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 47s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | -1 :x: |  spotbugs  |   2m 31s | 
[/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/4/artifact/out/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html)
 |  hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs 
warnings.  |
   | +1 :green_heart: |  shadedclient  |  38m 32s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  18m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 14s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  17m 14s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 55s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/4/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 1 new + 39 unchanged - 0 fixed = 40 total (was 
39)  |
   | +1 :green_heart: |  mvnsite  |   2m 38s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m 10s | 
[/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/4/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt)
 |  
hadoop-common-project_hadoop-common-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
 with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 generated 4 new + 0 
unchanged - 0 fixed = 4 total (was 0)  |
   | +1 :green_heart: |  javadoc  |   1m 44s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m 48s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  40m  5s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  20m  8s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m 51s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 58s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 280m 10s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6494 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 28b9081dc0d4 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-02-09 Thread via GitHub


hadoop-yetus commented on PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#issuecomment-1936382465

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 51s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m  8s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  36m 26s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  20m  5s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  16m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   4m 42s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 31s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 47s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | -1 :x: |  spotbugs  |   2m 33s | 
[/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/3/artifact/out/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html)
 |  hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs 
warnings.  |
   | +1 :green_heart: |  shadedclient  |  38m 23s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 31s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 29s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  17m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  16m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  16m 29s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 32s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/3/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 1 new + 39 unchanged - 0 fixed = 40 total (was 
39)  |
   | +1 :green_heart: |  mvnsite  |   2m 30s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m  7s | 
[/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/3/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt)
 |  
hadoop-common-project_hadoop-common-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
 with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 generated 4 new + 0 
unchanged - 0 fixed = 4 total (was 0)  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m  4s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  38m 19s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  19m  7s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m  9s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 57s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 259m 23s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6494 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux d1aa5776a4a0 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-02-09 Thread via GitHub


steveloughran commented on PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#issuecomment-1935875347

   +add a FileUtils method to assist deletion here, with 
`FileUtils.bulkDeletePageSize(path) -> int` and `FileUtils.bulkDelete(path, 
List) -> List`; each will create a bulk delete object, execute the 
operation/probe and then close. 
   
   Why so?
   
   Makes reflection binding straightforward: no new types; just two methods.
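
   A hedged sketch of those two helpers (the final class and method names, and whether the FileSystem is passed in or resolved from the path, may differ in the merged patch):

```java
import java.io.IOException;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.fs.BulkDelete;
import org.apache.hadoop.fs.BulkDeleteSource;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class BulkDeleteUtilsSketch {

  private BulkDeleteUtilsSketch() {
  }

  /** Probe: create the bulk delete, read the page size, close. */
  public static int bulkDeletePageSize(FileSystem fs, Path path) throws IOException {
    try (BulkDelete bulk = ((BulkDeleteSource) fs).createBulkDelete(path)) {
      return bulk.pageSize();
    }
  }

  /** One-shot delete of a single page of paths: create, execute, close. */
  public static List<Map.Entry<Path, String>> bulkDelete(
      FileSystem fs, Path base, List<Path> paths) throws IOException {
    try (BulkDelete bulk = ((BulkDeleteSource) fs).createBulkDelete(base)) {
      return bulk.bulkDelete(paths);
    }
  }
}
```

   Callers binding through reflection would then only need to look up these two static methods, with no new types crossing the reflection boundary.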


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-01-26 Thread via GitHub


hadoop-yetus commented on PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#issuecomment-1912353079

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 57s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 10s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  35m 50s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  18m 13s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |  16m 31s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   4m 37s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 30s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 47s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 43s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  39m 48s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 36s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 41s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |  18m 41s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m  6s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |  17m  6s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   4m 31s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/2/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3)  |
   | +1 :green_heart: |  mvnsite  |   2m 28s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m  9s | 
[/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/2/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt)
 |  
hadoop-common-project_hadoop-common-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
 with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 generated 4 new + 0 
unchanged - 0 fixed = 4 total (was 0)  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m  5s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  38m 38s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  19m  5s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   3m  7s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 57s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 260m 38s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6494 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux f605ff408523 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 5afb6598adc1d81d8dcbbbaaecd2fa28d558e36f |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-01-24 Thread via GitHub


hadoop-yetus commented on PR #6494:
URL: https://github.com/apache/hadoop/pull/6494#issuecomment-1908768612

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 49s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 21s |  |  Maven dependency ordering for branch  |
   | -1 :x: |  mvninstall  |   7m  7s | 
[/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/1/artifact/out/branch-mvninstall-root.txt)
 |  root in trunk failed.  |
   | -1 :x: |  compile  |   9m  3s | 
[/branch-compile-root-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/1/artifact/out/branch-compile-root-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt)
 |  root in trunk failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.  |
   | -1 :x: |  compile  |   8m 32s | 
[/branch-compile-root-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/1/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt)
 |  root in trunk failed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08. 
 |
   | +1 :green_heart: |  checkstyle  |   4m 38s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 18s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m  3s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m 14s |  |  trunk passed  |
   | -1 :x: |  shadedclient  |  11m 29s |  |  branch has errors when building 
and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 47s |  |  the patch passed  |
   | -1 :x: |  compile  |  12m 33s | 
[/patch-compile-root-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/1/artifact/out/patch-compile-root-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt)
 |  root in the patch failed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.  |
   | -1 :x: |  javac  |  12m 33s | 
[/patch-compile-root-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/1/artifact/out/patch-compile-root-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt)
 |  root in the patch failed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.  |
   | -1 :x: |  compile  |  12m 23s | 
[/patch-compile-root-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/1/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt)
 |  root in the patch failed with JDK Private 
Build-1.8.0_392-8u392-ga-1~20.04-b08.  |
   | -1 :x: |  javac  |  12m 23s | 
[/patch-compile-root-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/1/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt)
 |  root in the patch failed with JDK Private 
Build-1.8.0_392-8u392-ga-1~20.04-b08.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   6m 18s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/1/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3)  |
   | +1 :green_heart: |  mvnsite  |   2m 56s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m 18s | 
[/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6494/1/artifact/out/results-javadoc-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt)
 |  
hadoop-common-project_hadoop-common-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
 with JDK 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-01-24 Thread via GitHub


hadoop-yetus commented on PR #5993:
URL: https://github.com/apache/hadoop/pull/5993#issuecomment-1908677204

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 20s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 17s |  |  Maven dependency ordering for branch  |
   | -1 :x: |  mvninstall  |   4m 17s | 
[/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5993/4/artifact/out/branch-mvninstall-root.txt)
 |  root in trunk failed.  |
   | -1 :x: |  compile  |   3m 53s | 
[/branch-compile-root-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5993/4/artifact/out/branch-compile-root-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt)
 |  root in trunk failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.  |
   | -1 :x: |  compile  |   3m 36s | 
[/branch-compile-root-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5993/4/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt)
 |  root in trunk failed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08. 
 |
   | +1 :green_heart: |  checkstyle  |   1m 54s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 12s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   1m 49s |  |  trunk passed  |
   | -1 :x: |  shadedclient  |   4m 52s |  |  branch has errors when building 
and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 20s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   0m 45s |  |  the patch passed  |
   | -1 :x: |  compile  |   3m 48s | 
[/patch-compile-root-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5993/4/artifact/out/patch-compile-root-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt)
 |  root in the patch failed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.  |
   | -1 :x: |  javac  |   3m 48s | 
[/patch-compile-root-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5993/4/artifact/out/patch-compile-root-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt)
 |  root in the patch failed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.  |
   | -1 :x: |  compile  |   3m 36s | 
[/patch-compile-root-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5993/4/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt)
 |  root in the patch failed with JDK Private 
Build-1.8.0_392-8u392-ga-1~20.04-b08.  |
   | -1 :x: |  javac  |   3m 36s | 
[/patch-compile-root-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5993/4/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt)
 |  root in the patch failed with JDK Private 
Build-1.8.0_392-8u392-ga-1~20.04-b08.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m 48s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5993/4/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 5 new + 3 unchanged - 0 fixed = 8 total (was 3)  |
   | +1 :green_heart: |  mvnsite  |   1m  6s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   2m  1s |  |  the patch passed  |
   | -1 :x: |  shadedclient  |   4m 57s |  |  patch has errors when building 
and testing our client artifacts.  |
    _ Other Tests _ |
   | 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2024-01-01 Thread via GitHub


hadoop-yetus commented on PR #5993:
URL: https://github.com/apache/hadoop/pull/5993#issuecomment-1873467248

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 21s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 47s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  19m 28s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   8m 19s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  compile  |   7m 27s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   2m  4s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  4s |  |  trunk passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   1m  0s |  |  trunk passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   2m  9s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  19m 50s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 20s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   0m 50s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   7m 52s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javac  |   7m 52s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   7m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   7m 27s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m 59s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5993/3/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 5 new + 3 unchanged - 0 fixed = 8 total (was 3)  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m  2s |  |  the patch passed with JDK 
Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  the patch passed with JDK 
Private Build-1.8.0_392-8u392-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   2m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  19m 45s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  16m 22s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m 10s |  |  hadoop-aws in the patch passed. 
 |
   | +1 :green_heart: |  asflicense  |   0m 36s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 143m 28s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5993/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5993 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux e75e83010c54 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 
15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d69fac0192c14889f0b3aa62bdb76e1d196eec8c |
   | Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5993/3/testReport/ |
   | Max. process+thread count | 2432 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws 
U: . |
   | Console output | 

Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2023-10-06 Thread via GitHub


steveloughran commented on code in PR #5993:
URL: https://github.com/apache/hadoop/pull/5993#discussion_r1349184302


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/BulkDelete.java:
##
@@ -0,0 +1,324 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.CompletableFuture;
+
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.classification.InterfaceStability;
+import org.apache.hadoop.fs.statistics.IOStatistics;
+import org.apache.hadoop.fs.statistics.IOStatisticsSource;
+
+import static 
org.apache.hadoop.fs.statistics.IOStatisticsLogging.ioStatisticsToPrettyString;
+
+/**
+ * Interface for bulk file delete operations.
+ * 
+ * The expectation is that the iterator-provided list of paths
+ * will be batched into pages and submitted to the remote filesystem/store
+ * for bulk deletion, possibly in parallel.
+ * 
+ * A remote iterator provides the list of paths to delete; all must be under

Review Comment:
   it's for multiple mounted filesystems (viewfs) to direct the delete to the final fs.
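
   To make the base-path requirement concrete, here is a minimal sketch of the
kind of containment check an implementation might apply to each incoming path.
`PathChecks` and `isUnder` are invented names for illustration only, not part of
the proposed API, and scheme/authority comparison is skipped for brevity; only
`org.apache.hadoop.fs.Path` is a real Hadoop class here.

```java
import org.apache.hadoop.fs.Path;

/** Sketch: verify that a path to delete lies under the delete's base path. */
final class PathChecks {

  private PathChecks() {
  }

  /**
   * True iff candidate equals the base path or is a descendant of it.
   * Walks up the parent chain until it hits the base or runs off the root.
   */
  static boolean isUnder(Path base, Path candidate) {
    for (Path p = candidate; p != null; p = p.getParent()) {
      if (p.equals(base)) {
        return true;
      }
    }
    return false;
  }
}
```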






Re: [PR] HADOOP-18679. Add API for bulk/paged object deletion [hadoop]

2023-10-05 Thread via GitHub


steveloughran commented on PR #5993:
URL: https://github.com/apache/hadoop/pull/5993#issuecomment-1748524668

   @ahmarsuhail 
   
   - the caller provides a remote iterator, such as the ones we produce for 
listing, or another source/transformation (see RemoteIterators)
   - the build() call returns some result
   - the implementation kicks off a worker thread to process the iterator, 
reading values in until there are enough to issue a DELETE request (a page, 
or maybe a parallel set in a thread pool)
   - after each page/set of deletes, it invokes the supplied callback with 
the results
   - it then continues, unless told to stop
   - it finishes only when the iterator has nothing left or the iterator 
raises an exception
   - or maybe on reaching some limit on failures
   - including, maybe, those considered unrecoverable
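
   A minimal sketch of that processing loop, under the assumptions listed 
above. All of the names here (`BulkDeleteWorker`, `PageDeleter`, `PageResult`, 
`DeleteCallback`, the failure limit) are invented for illustration and are not 
the API proposed in this PR; only `Path` and `RemoteIterator` are real Hadoop 
classes.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

/** Sketch of the worker loop described in the comment above; names are hypothetical. */
class BulkDeleteWorker {

  /** Placeholder for the per-page outcome handed to the callback. */
  interface PageResult {
    int failures();
  }

  /** Placeholder for the callback invoked after each page of deletes. */
  interface DeleteCallback {
    void onPageDeleted(List<Path> page, PageResult result);
  }

  /** Placeholder for whatever issues the actual bulk DELETE request. */
  interface PageDeleter {
    PageResult delete(List<Path> page) throws IOException;
  }

  private final AtomicBoolean stopped = new AtomicBoolean(false);

  /** Ask the worker to stop after the page currently being processed. */
  void stop() {
    stopped.set(true);
  }

  /**
   * Drain the iterator into pages, delete each page and report the result,
   * until the iterator is exhausted, the caller asks to stop, or the total
   * number of failed deletes reaches the given limit.
   */
  void run(RemoteIterator<Path> paths,
      int pageSize,
      PageDeleter deleter,
      DeleteCallback callback,
      int failureLimit) throws IOException {
    int failures = 0;
    List<Path> page = new ArrayList<>(pageSize);
    while (!stopped.get() && paths.hasNext()) {
      page.add(paths.next());
      if (page.size() == pageSize) {
        PageResult result = deleter.delete(page);
        callback.onPageDeleted(page, result);
        failures += result.failures();
        if (failures >= failureLimit) {
          return;                       // too many failures: give up early
        }
        page = new ArrayList<>(pageSize);
      }
    }
    if (!stopped.get() && !page.isEmpty()) {
      // flush the final, possibly short, page
      callback.onPageDeleted(page, deleter.delete(page));
    }
  }
}
```

   In a real implementation the delete calls for different pages could run in 
a small thread pool, as the comment suggests; the sequential loop is kept here 
only for clarity.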
   
   

