suneet-s commented on code in PR #15770:
URL: https://github.com/apache/druid/pull/15770#discussion_r1470097321


##########
extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentKiller.java:
##########
@@ -63,6 +69,79 @@ public AzureDataSegmentKiller(
     this.azureCloudBlobIterableFactory = azureCloudBlobIterableFactory;
   }
 
+  @Override
+  public void kill(List<DataSegment> segments) throws SegmentLoadingException
+  {
+    if (segments.isEmpty()) {
+      return;
+    }
+    if (segments.size() == 1) {
+      kill(segments.get(0));
+      return;
+    }
+
+    // create a list of keys to delete
+    Map<String, List<String>> containerToKeysToDelete = new HashMap<>();
+    for (DataSegment segment : segments) {
+      Map<String, Object> loadSpec = segment.getLoadSpec();
+      final String containerName = MapUtils.getString(loadSpec, "containerName");
+      final String blobPath = MapUtils.getString(loadSpec, "blobPath");
+      List<String> keysToDelete = containerToKeysToDelete.computeIfAbsent(
+          containerName,
+          k -> new ArrayList<>()
+      );
+      keysToDelete.add(blobPath);
+    }
+
+    boolean shouldThrowException = false;
+    for (Map.Entry<String, List<String>> containerToKeys : containerToKeysToDelete.entrySet()) {
+      shouldThrowException = deleteBlobKeys(containerToKeys.getValue(), containerToKeys.getKey());
+    }
+
+    if (shouldThrowException) {
+      throw new SegmentLoadingException(
+          "Couldn't delete segments from Azure. See the task logs for more details."
+      );
+    }
+  }
+
+  private Boolean deleteBlobKeys(List<String> keysToDelete, String containerName)
+  {
+    boolean hadException = false;
+    List<List<String>> keysChunks = Lists.partition(
+        keysToDelete,
+        MAX_MULTI_OBJECT_DELETE_SIZE

Review Comment:
   Can you explain why `deleteBlobKeys` is responsible for splitting the list of keys into reasonably sized chunks for the bulk delete API? It looks like there are a couple of other functions that call `azureStorage.batchDeleteFiles(...)`. Should we push this chunking behavior into `azureStorage.batchDeleteFiles` instead?
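
   For illustration, the chunking the comment suggests moving into `batchDeleteFiles` could look something like this sketch (class and method names here are hypothetical; the real method delegates to the Azure blob batch client, which this sketch stubs out):

   ```java
   import java.util.ArrayList;
   import java.util.List;

   public class ChunkedDeleteSketch
   {
     // Mirrors the MAX_MULTI_OBJECT_DELETE_SIZE constant used in the PR;
     // the actual value in the codebase may differ.
     static final int MAX_MULTI_OBJECT_DELETE_SIZE = 256;

     // Split the full key list into SDK-sized chunks inside the delete
     // method itself, so no caller has to know about the batch size limit.
     static List<List<String>> partition(List<String> keys, int chunkSize)
     {
       List<List<String>> chunks = new ArrayList<>();
       for (int i = 0; i < keys.size(); i += chunkSize) {
         chunks.add(keys.subList(i, Math.min(i + chunkSize, keys.size())));
       }
       return chunks;
     }
   }
   ```

   With this shape, every caller of the batch delete API gets the chunking for free instead of each call site partitioning its own list.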



##########
extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureStorage.java:
##########
@@ -188,6 +188,12 @@ public void batchDeleteFiles(String containerName, Iterable<String> paths, Integ
     );
   }
 
+  public void batchDeleteFiles(String containerName, Iterable<String> paths)
+      throws BlobBatchStorageException
+  {
+    batchDeleteFiles(containerName, paths, null);
+  }

Review Comment:
   No need for this function; we can call the existing one with null directly. It would be good to add javadocs to the function above noting that `maxAttempts` is nullable, and that callers should pass null when they want to use the system-configured max retries.
   
   ```suggestion
   ```
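
   For reference, the javadoc the comment asks for might read something like this (a sketch; the parameter name `maxAttempts` follows the existing method signature quoted above):

   ```java
   /**
    * Deletes the given blob paths from the container in batches.
    *
    * @param maxAttempts maximum number of attempts per batch; nullable.
    *                    Pass null to use the system-configured max retries.
    */
   ```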



##########
extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentKiller.java:
##########
@@ -63,6 +69,79 @@ public AzureDataSegmentKiller(
     this.azureCloudBlobIterableFactory = azureCloudBlobIterableFactory;
   }
 
+  @Override
+  public void kill(List<DataSegment> segments) throws SegmentLoadingException
+  {
+    if (segments.isEmpty()) {
+      return;
+    }
+    if (segments.size() == 1) {
+      kill(segments.get(0));
+      return;
+    }
+
+    // create a list of keys to delete
+    Map<String, List<String>> containerToKeysToDelete = new HashMap<>();
+    for (DataSegment segment : segments) {
+      Map<String, Object> loadSpec = segment.getLoadSpec();
+      final String containerName = MapUtils.getString(loadSpec, "containerName");
+      final String blobPath = MapUtils.getString(loadSpec, "blobPath");
+      List<String> keysToDelete = containerToKeysToDelete.computeIfAbsent(
+          containerName,
+          k -> new ArrayList<>()
+      );
+      keysToDelete.add(blobPath);
+    }
+
+    boolean shouldThrowException = false;
+    for (Map.Entry<String, List<String>> containerToKeys : containerToKeysToDelete.entrySet()) {
+      shouldThrowException = deleteBlobKeys(containerToKeys.getValue(), containerToKeys.getKey());
+    }
+
+    if (shouldThrowException) {
+      throw new SegmentLoadingException(
+          "Couldn't delete segments from Azure. See the task logs for more details."
+      );
+    }
+  }
+
+  private Boolean deleteBlobKeys(List<String> keysToDelete, String containerName)
+  {
+    boolean hadException = false;
+    List<List<String>> keysChunks = Lists.partition(
+        keysToDelete,
+        MAX_MULTI_OBJECT_DELETE_SIZE
+    );
+    for (List<String> chunkOfKeys : keysChunks) {
+      try {
+        log.info(
+            "Removing from container: [%s] the following index files: [%s] from s3!",

Review Comment:
   ```suggestion
               "Removing from container [%s] the following files: [%s]",
   ```



##########
extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentKiller.java:
##########
@@ -63,6 +69,79 @@ public AzureDataSegmentKiller(
     this.azureCloudBlobIterableFactory = azureCloudBlobIterableFactory;
   }
 
+  @Override
+  public void kill(List<DataSegment> segments) throws SegmentLoadingException
+  {
+    if (segments.isEmpty()) {
+      return;
+    }
+    if (segments.size() == 1) {
+      kill(segments.get(0));
+      return;
+    }
+
+    // create a list of keys to delete
+    Map<String, List<String>> containerToKeysToDelete = new HashMap<>();
+    for (DataSegment segment : segments) {
+      Map<String, Object> loadSpec = segment.getLoadSpec();
+      final String containerName = MapUtils.getString(loadSpec, "containerName");
+      final String blobPath = MapUtils.getString(loadSpec, "blobPath");
+      List<String> keysToDelete = containerToKeysToDelete.computeIfAbsent(
+          containerName,
+          k -> new ArrayList<>()
+      );
+      keysToDelete.add(blobPath);
+    }
+
+    boolean shouldThrowException = false;
+    for (Map.Entry<String, List<String>> containerToKeys : containerToKeysToDelete.entrySet()) {
+      shouldThrowException = deleteBlobKeys(containerToKeys.getValue(), containerToKeys.getKey());

Review Comment:
   If `shouldThrowException` becomes true once, it should stay true; as written, each loop iteration overwrites it with the latest result.
   
   ```suggestion
         shouldThrowException = shouldThrowException || deleteBlobKeys(containerToKeys.getValue(), containerToKeys.getKey());
   ```
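
   One note on evaluation order (an illustration only, not part of the PR): Java's `||` short-circuits, so with the accumulator on the left, once it is true the delete call on the right is skipped for the remaining containers. Putting the call on the left guarantees every container is still attempted. A minimal sketch with a hypothetical stand-in for `deleteBlobKeys`:

   ```java
   public class AccumulateSketch
   {
     static int calls = 0;

     // Stand-in for deleteBlobKeys: records that it ran and reports failure.
     static boolean attemptDelete()
     {
       calls++;
       return true; // simulate a failure for every container
     }

     // Call first, then OR into the accumulator, so short-circuit
     // evaluation can never skip a container's delete.
     static boolean deleteAll(int containers)
     {
       boolean shouldThrow = false;
       for (int i = 0; i < containers; i++) {
         shouldThrow = attemptDelete() || shouldThrow;
       }
       return shouldThrow;
     }
   }
   ```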



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

