[ 
https://issues.apache.org/jira/browse/HADOOP-19450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930007#comment-17930007
 ] 

ASF GitHub Bot commented on HADOOP-19450:
-----------------------------------------

anujmodi2021 commented on code in PR #7364:
URL: https://github.com/apache/hadoop/pull/7364#discussion_r1968915858


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/contracts/exceptions/AbfsDriverException.java:
##########
@@ -51,4 +51,12 @@ public AbfsDriverException(final Exception innerException, final String activity
             : ERROR_MESSAGE + ", rId: " + activityId,
         null);
   }
+
+  public AbfsDriverException(final String errorMessage, final Exception innerException) {

Review Comment:
   I was wondering whether this exception will carry the request ID.
   The request ID is part of abfsHttpOperation. Can we also pass down the activity ID and 
append it to the error message?
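
One way to read this suggestion is sketched below. It is not the PR's code: `DriverExceptionSketch` is a hypothetical stand-in for `AbfsDriverException`, and the three-argument constructor is an assumed shape. Only the `", rId: "` separator is taken from the existing constructor shown in the hunk above.

```java
// Hypothetical stand-in for AbfsDriverException; not the real Hadoop class.
class DriverExceptionSketch extends java.io.IOException {
  // Separator copied from the existing constructor in the diff above.
  static final String RID_SEPARATOR = ", rId: ";

  // Assumed constructor shape: message, optional activity ID, and cause.
  DriverExceptionSketch(String errorMessage, String activityId, Exception cause) {
    super(format(errorMessage, activityId), cause);
  }

  // Appends the activity ID only when one is available.
  static String format(String errorMessage, String activityId) {
    if (activityId == null || activityId.isEmpty()) {
      return errorMessage;
    }
    return errorMessage + RID_SEPARATOR + activityId;
  }
}
```

With this shape, a caller that still holds the failed operation could pass its request ID along, and callers without one would pass `null` and get the plain message.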



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsDfsClient.java:
##########
@@ -705,6 +711,30 @@ public AbfsClientRenameResult renamePath(
         throw e;
       }
 
+      // recovery using client transaction id only if it is a retried request.
+      if (op.isARetriedRequest() && clientTransactionId != null
+          && SOURCE_PATH_NOT_FOUND.getErrorCode().equalsIgnoreCase(
+              op.getResult().getStorageErrorCode())) {
+        try {
+          final AbfsHttpOperation abfsHttpOperation =
+              getPathStatus(destination, false,
+                  tracingContext, null).getResult();
+          if (clientTransactionId.equals(
+              abfsHttpOperation.getResponseHeader(
+                  X_MS_CLIENT_TRANSACTION_ID))) {
+            return new AbfsClientRenameResult(
+                getSuccessOp(AbfsRestOperationType.RenamePath,
+                HTTP_METHOD_PUT, url, requestHeaders), true,
+                isMetadataIncompleteState);
+          }
+        } catch (AzureBlobFileSystemException exception) {
+          throw new AbfsDriverException(
+              "Error in getPathStatus while recovering from rename failure.",

Review Comment:
   Let's define the error string as a constant in the `AbfsErrors` class and add a 
Mockito test to verify that the proper exception is thrown, if one is not already there.
   Same for create as well.
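
A minimal sketch of the constant-based approach follows. The class `RecoveryErrorsSketch` and the constant names are hypothetical (the real `AbfsErrors` class may name or place them differently). The message strings are the ones used in this PR.

```java
// Hypothetical constants holder, modeled on the AbfsErrors class named above.
final class RecoveryErrorsSketch {
  static final String ERR_RENAME_RECOVERY =
      "Error in getPathStatus while recovering from rename failure.";
  static final String ERR_CREATE_RECOVERY =
      "Error in getPathStatus while recovering from create failure.";

  private RecoveryErrorsSketch() {
    // constants only, no instances
  }
}

class RecoveryFailureSketch {
  // Simulates the catch block: wrap the lower-level failure with the shared message.
  static void failRenameRecovery(Exception cause) throws java.io.IOException {
    throw new java.io.IOException(RecoveryErrorsSketch.ERR_RENAME_RECOVERY, cause);
  }
}
```

Production code and tests can then reference the same constant, so a wording change in the message cannot silently break the assertion.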



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsDfsClient.java:
##########
@@ -415,7 +416,9 @@ public AbfsRestOperation createPath(final String path,
           String existingResource =
               op.getResult().getResponseHeader(X_MS_EXISTING_RESOURCE_TYPE);
           if (existingResource != null && existingResource.equals(DIRECTORY)) {
-            return op; //don't throw ex on mkdirs for existing directory
+            //don't throw ex on mkdirs for existing directory
+            return getSuccessOp(AbfsRestOperationType.CreatePath,

Review Comment:
   This is a good function to have for cleaner code. Let's define it in the base class 
and use it everywhere we hard-set the result, including in the Blob client.
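
The refactoring could look roughly like the sketch below. All types here are simplified, hypothetical stand-ins (the real `AbfsClient` and `AbfsRestOperation` take URLs, headers, and more), and the assumption that a hard-set success means HTTP 200 is mine.

```java
// Hypothetical, simplified stand-in for AbfsRestOperation.
class RestOpSketch {
  final String operationType;
  final String httpMethod;
  private int statusCode = -1; // -1 means no result has been recorded yet

  RestOpSketch(String operationType, String httpMethod) {
    this.operationType = operationType;
    this.httpMethod = httpMethod;
  }

  // Records a status without sending a request (the "hard set" result).
  void hardSetResult(int httpStatus) {
    this.statusCode = httpStatus;
  }

  int getStatusCode() {
    return statusCode;
  }
}

// Hypothetical base class: the helper lives here so every client shares it.
abstract class BaseClientSketch {
  protected RestOpSketch getSuccessOp(String operationType, String httpMethod) {
    RestOpSketch op = new RestOpSketch(operationType, httpMethod);
    op.hardSetResult(java.net.HttpURLConnection.HTTP_OK); // assumed success status
    return op;
  }
}

class DfsClientSketch extends BaseClientSketch {
  RestOpSketch mkdirsOnExistingDirectory() {
    return getSuccessOp("CreatePath", "PUT");
  }
}

class BlobClientSketch extends BaseClientSketch {
  RestOpSketch mkdirsOnExistingDirectory() {
    return getSuccessOp("CreatePath", "PUT");
  }
}
```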



##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRename.java:
##########
@@ -1766,4 +1745,69 @@ public void getClientTransactionIdAfterRename() throws Exception {
           .isEqualTo(clientTransactionId[0]);
     }
   }
+
+  @Test
+  public void failureInGetPathStatusDuringRenameRecovery() throws Exception {
+    try (AzureBlobFileSystem fs = getFileSystem()) {
+      assumeRecoveryThroughClientTransactionID(false);
+      AbfsDfsClient abfsDfsClient = (AbfsDfsClient) Mockito.spy(fs.getAbfsClient());
+      fs.getAbfsStore().setClient(abfsDfsClient);
+      final String[] clientTransactionId = new String[1];
+      mockAddClientTransactionIdToHeader(abfsDfsClient, clientTransactionId);
+      mockRetriedRequest(abfsDfsClient, new ArrayList<>());
+      boolean[] flag = new boolean[1];
+      Mockito.doAnswer(getPathStatus -> {
+        if (!flag[0]) {
+          flag[0] = true;
+          throw new AbfsRestOperationException(HTTP_CLIENT_TIMEOUT, "", "", new Exception());
+        }
+        return getPathStatus.callRealMethod();
+      }).when(abfsDfsClient).getPathStatus(
+          Mockito.nullable(String.class), Mockito.nullable(Boolean.class),
+          Mockito.nullable(TracingContext.class),
+          Mockito.nullable(ContextEncryptionAdapter.class));
+
+      Path sourceDir = path("/testSrc");
+      assertMkdirs(fs, sourceDir);
+      String filename = "file1";
+      Path sourceFilePath = new Path(sourceDir, filename);
+      touch(sourceFilePath);
+      Path destFilePath = new Path(sourceDir, "file2");
+
+      String errorMessage = intercept(AbfsDriverException.class,
+          () -> fs.rename(sourceFilePath, destFilePath)).getErrorMessage();
+
+      Assertions.assertThat(errorMessage)
+          .describedAs("getPathStatus should fail while recovering")
+          .contains("Error in getPathStatus while recovering from rename failure.");

Review Comment:
   Same as above
   



##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemCreate.java:
##########
@@ -2213,6 +2193,49 @@ public void testClientTransactionIdAfterTwoCreateCalls() throws Exception {
     }
   }
 
+  /**
+   * Test to verify that the client transaction ID is included in the response header
+   * during the creation of a new file in Azure Blob Storage.
+   * <p>
+   * This test ensures that when a new file is created, the Azure Blob FileSystem client
+   * correctly includes the client transaction ID in the response header for the created file.
+   * The test uses a configuration where client transaction ID is enabled and verifies
+   * its presence after the file creation operation.
+   * </p>
+   *
+   * @throws Exception if any error occurs during test execution
+   */
+  @Test
+  public void failureInGetPathStatusDuringCreateRecovery() throws Exception {
+    try (AzureBlobFileSystem fs = getFileSystem()) {
+      assumeRecoveryThroughClientTransactionID(true);
+      final String[] clientTransactionId = new String[1];
+      AbfsDfsClient abfsDfsClient = mockIngressClientHandler(fs);
+      mockAddClientTransactionIdToHeader(abfsDfsClient, clientTransactionId);
+      mockRetriedRequest(abfsDfsClient, new ArrayList<>());
+      boolean[] flag = new boolean[1];
+      Mockito.doAnswer(getPathStatus -> {
+        if (!flag[0]) {
+          flag[0] = true;
+          throw new AbfsRestOperationException(HTTP_CLIENT_TIMEOUT, "", "", new Exception());
+        }
+        return getPathStatus.callRealMethod();
+      }).when(abfsDfsClient).getPathStatus(
+          Mockito.nullable(String.class), Mockito.nullable(Boolean.class),
+          Mockito.nullable(TracingContext.class),
+          Mockito.nullable(ContextEncryptionAdapter.class));
+
+      final Path nonOverwriteFile = new Path(
+          "/NonOverwriteTest_FileName_" + UUID.randomUUID());
+      String errorMessage = intercept(AbfsDriverException.class,
+          () -> fs.create(nonOverwriteFile, false)).getErrorMessage();
+
+      Assertions.assertThat(errorMessage)
+          .describedAs("getPathStatus should fail while recovering")
+          .contains("Error in getPathStatus while recovering from create failure.");

Review Comment:
   Okay, it seems this is the test I was referring to. Let's use constants 
here for the error message.





> [ABFS] Rename/Create path idempotency client-level resolution
> -------------------------------------------------------------
>
>                 Key: HADOOP-19450
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19450
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: fs/azure
>    Affects Versions: 3.5.0
>            Reporter: Manish Bhatt
>            Assignee: Manish Bhatt
>            Priority: Major
>              Labels: pull-request-available
>
> CreatePath and RenamePath APIs are idempotent, as subsequent retries on the same 
> resource don’t change the server state. However, when the client experiences a 
> connection break on the CreatePath or RenamePath API, it cannot 
> tell whether the request was accepted by the server. 
> On connection failure, the client retries the request. The server might 
> return 404 (sourceNotFound) for the RenamePath API and 409 
> (pathAlreadyExists) for the CreatePath (overwrite=false) API. The 
> client then has no path forward: for CreatePath, it 
> doesn’t know whether the path was created by the original request or 
> already existed because of some other request; for RenamePath, it 
> doesn’t know whether the source was removed by the original attempt or was 
> never there in the first place. 
>  
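
The rename half of this problem, and the client transaction ID check that the PR uses to resolve it, can be illustrated with a small in-memory sketch. Everything here is a simplified assumption: `RenameRecoverySketch`, its method names, and the map standing in for the server are hypothetical, and the real client reads the ID from the `X_MS_CLIENT_TRANSACTION_ID` response header of a getPathStatus call.

```java
// Hypothetical in-memory model of rename recovery; not the ABFS implementation.
class RenameRecoverySketch {
  // Stand-in for server state: path -> client transaction ID that last wrote it.
  private final java.util.Map<String, String> paths = new java.util.HashMap<>();

  void put(String path, String clientTransactionId) {
    paths.put(path, clientTransactionId);
  }

  // Server side: 404 if the source is missing, otherwise move it and record the ID.
  int serverRename(String source, String destination, String clientTransactionId) {
    if (!paths.containsKey(source)) {
      return java.net.HttpURLConnection.HTTP_NOT_FOUND;
    }
    paths.remove(source);
    paths.put(destination, clientTransactionId);
    return java.net.HttpURLConnection.HTTP_OK; // status code simplified
  }

  // Client side: on a retried request that gets 404, treat the rename as done
  // only if the destination carries this client's own transaction ID.
  boolean renameWithRecovery(String source, String destination,
      String clientTransactionId, boolean isRetriedRequest) {
    int status = serverRename(source, destination, clientTransactionId);
    if (status == java.net.HttpURLConnection.HTTP_OK) {
      return true;
    }
    return isRetriedRequest
        && clientTransactionId != null
        && clientTransactionId.equals(paths.get(destination));
  }
}
```

If the first attempt succeeded on the server but its response was lost, the retry sees 404, finds its own ID on the destination, and reports success. If the source was never there, the IDs do not match and the failure stands. The create case would follow the same pattern with 409 in place of 404.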



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
