[
https://issues.apache.org/jira/browse/HADOOP-19450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930007#comment-17930007
]
ASF GitHub Bot commented on HADOOP-19450:
-----------------------------------------
anujmodi2021 commented on code in PR #7364:
URL: https://github.com/apache/hadoop/pull/7364#discussion_r1968915858
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/contracts/exceptions/AbfsDriverException.java:
##########
@@ -51,4 +51,12 @@ public AbfsDriverException(final Exception innerException,
final String activity
: ERROR_MESSAGE + ", rId: " + activityId,
null);
}
+
+ public AbfsDriverException(final String errorMessage, final Exception innerException) {
Review Comment:
I was wondering whether this exception will carry the request id or not.
The request id is part of AbfsHttpOperation; can we also pass the activity id
down and append it to the error message?
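For illustration, a standalone sketch (not the Hadoop source) of the pattern this
suggests: appending the request/activity id obtained from the HTTP operation to the
exception message so the failure can be correlated server-side. The class and method
names below are hypothetical.

    // Hypothetical, standalone illustration of appending the request/activity id
    // to the exception message; not the actual AbfsDriverException constructor.
    public final class ErrorMessageWithRequestId {

      private ErrorMessageWithRequestId() {
      }

      static String withActivityId(final String baseMessage, final String activityId) {
        // Fall back to the plain message when no activity id is available.
        return (activityId == null || activityId.isEmpty())
            ? baseMessage
            : baseMessage + ", rId: " + activityId;
      }

      public static void main(String[] args) {
        System.out.println(withActivityId(
            "Error in getPathStatus while recovering from rename failure.", "a1b2c3d4"));
      }
    }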
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsDfsClient.java:
##########
@@ -705,6 +711,30 @@ public AbfsClientRenameResult renamePath(
throw e;
}
+ // recovery using client transaction id only if it is a retried request.
+ if (op.isARetriedRequest() && clientTransactionId != null
+ && SOURCE_PATH_NOT_FOUND.getErrorCode().equalsIgnoreCase(
+ op.getResult().getStorageErrorCode())) {
+ try {
+ final AbfsHttpOperation abfsHttpOperation =
+ getPathStatus(destination, false,
+ tracingContext, null).getResult();
+ if (clientTransactionId.equals(
+ abfsHttpOperation.getResponseHeader(
+ X_MS_CLIENT_TRANSACTION_ID))) {
+ return new AbfsClientRenameResult(
+ getSuccessOp(AbfsRestOperationType.RenamePath,
+ HTTP_METHOD_PUT, url, requestHeaders), true,
+ isMetadataIncompleteState);
+ }
+ } catch (AzureBlobFileSystemException exception) {
+ throw new AbfsDriverException(
+ "Error in getPathStatus while recovering from rename failure.",
Review Comment:
Let's define the error string as a constant in the `AbfsErrors` class, and add a
Mockito test to verify the proper exception is thrown if one isn't already there.
Same for create as well.
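As a rough sketch of the suggestion, assuming the strings end up in the existing
AbfsErrors constants class (shown here as a hypothetical standalone holder so the
names are clearly illustrative):

    // Hypothetical constants holder; the reviewer's suggestion is to place these
    // strings in the AbfsErrors class instead of repeating the literals.
    public final class RecoveryErrorMessages {
      public static final String ERR_RENAME_RECOVERY =
          "Error in getPathStatus while recovering from rename failure.";
      public static final String ERR_CREATE_RECOVERY =
          "Error in getPathStatus while recovering from create failure.";

      private RecoveryErrorMessages() {
        // constants only, no instances
      }
    }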
##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsDfsClient.java:
##########
@@ -415,7 +416,9 @@ public AbfsRestOperation createPath(final String path,
String existingResource =
op.getResult().getResponseHeader(X_MS_EXISTING_RESOURCE_TYPE);
if (existingResource != null && existingResource.equals(DIRECTORY)) {
- return op; //don't throw ex on mkdirs for existing directory
+ //don't throw ex on mkdirs for existing directory
+ return getSuccessOp(AbfsRestOperationType.CreatePath,
Review Comment:
This is a good function to have for cleaner code; let's define it in the base class
and use it everywhere we set a hard result, even in the Blob client.
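A hedged sketch of what such a shared helper in the base client could look like; the
getAbfsRestOperation factory and hardSetResult call are assumed from how the diff
above builds and hard-sets the result, so treat the exact signatures as assumptions.

    // Sketch only: a protected helper in the base AbfsClient so both the DFS and
    // Blob clients can reuse it. getAbfsRestOperation(...) and
    // AbfsRestOperation#hardSetResult(...) are assumed to exist with these shapes.
    protected AbfsRestOperation getSuccessOp(final AbfsRestOperationType operationType,
        final String httpMethod,
        final URL url,
        final List<AbfsHttpHeader> requestHeaders) {
      final AbfsRestOperation successOp = getAbfsRestOperation(
          operationType, httpMethod, url, requestHeaders);
      // Force a 200 result so callers can treat the recovered call as successful.
      successOp.hardSetResult(HttpURLConnection.HTTP_OK);
      return successOp;
    }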
##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRename.java:
##########
@@ -1766,4 +1745,69 @@ public void getClientTransactionIdAfterRename() throws Exception {
.isEqualTo(clientTransactionId[0]);
}
}
+
+ @Test
+ public void failureInGetPathStatusDuringRenameRecovery() throws Exception {
+ try (AzureBlobFileSystem fs = getFileSystem()) {
+ assumeRecoveryThroughClientTransactionID(false);
+ AbfsDfsClient abfsDfsClient = (AbfsDfsClient) Mockito.spy(fs.getAbfsClient());
+ fs.getAbfsStore().setClient(abfsDfsClient);
+ final String[] clientTransactionId = new String[1];
+ mockAddClientTransactionIdToHeader(abfsDfsClient, clientTransactionId);
+ mockRetriedRequest(abfsDfsClient, new ArrayList<>());
+ boolean[] flag = new boolean[1];
+ Mockito.doAnswer(getPathStatus -> {
+ if (!flag[0]) {
+ flag[0] = true;
+ throw new AbfsRestOperationException(HTTP_CLIENT_TIMEOUT, "", "", new Exception());
+ }
+ return getPathStatus.callRealMethod();
+ }).when(abfsDfsClient).getPathStatus(
+ Mockito.nullable(String.class), Mockito.nullable(Boolean.class),
+ Mockito.nullable(TracingContext.class),
+ Mockito.nullable(ContextEncryptionAdapter.class));
+
+ Path sourceDir = path("/testSrc");
+ assertMkdirs(fs, sourceDir);
+ String filename = "file1";
+ Path sourceFilePath = new Path(sourceDir, filename);
+ touch(sourceFilePath);
+ Path destFilePath = new Path(sourceDir, "file2");
+
+ String errorMessage = intercept(AbfsDriverException.class,
+ () -> fs.rename(sourceFilePath, destFilePath)).getErrorMessage();
+
+ Assertions.assertThat(errorMessage)
+ .describedAs("getPathStatus should fail while recovering")
+ .contains("Error in getPathStatus while recovering from rename failure.");
Review Comment:
Same as above
##########
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemCreate.java:
##########
@@ -2213,6 +2193,49 @@ public void testClientTransactionIdAfterTwoCreateCalls() throws Exception {
}
}
+ /**
+ * Test to verify that a failure in getPathStatus during create recovery
+ * surfaces as an AbfsDriverException with the expected error message.
+ * <p>
+ * The test enables recovery through the client transaction ID, forces the
+ * first getPathStatus call made during recovery to fail with a timeout, and
+ * verifies that the create call fails with the recovery error message.
+ * </p>
+ *
+ * @throws Exception if any error occurs during test execution
+ */
+ @Test
+ public void failureInGetPathStatusDuringCreateRecovery() throws Exception {
+ try (AzureBlobFileSystem fs = getFileSystem()) {
+ assumeRecoveryThroughClientTransactionID(true);
+ final String[] clientTransactionId = new String[1];
+ AbfsDfsClient abfsDfsClient = mockIngressClientHandler(fs);
+ mockAddClientTransactionIdToHeader(abfsDfsClient, clientTransactionId);
+ mockRetriedRequest(abfsDfsClient, new ArrayList<>());
+ boolean[] flag = new boolean[1];
+ Mockito.doAnswer(getPathStatus -> {
+ if (!flag[0]) {
+ flag[0] = true;
+ throw new AbfsRestOperationException(HTTP_CLIENT_TIMEOUT, "", "", new Exception());
+ }
+ return getPathStatus.callRealMethod();
+ }).when(abfsDfsClient).getPathStatus(
+ Mockito.nullable(String.class), Mockito.nullable(Boolean.class),
+ Mockito.nullable(TracingContext.class),
+ Mockito.nullable(ContextEncryptionAdapter.class));
+
+ final Path nonOverwriteFile = new Path(
+ "/NonOverwriteTest_FileName_" + UUID.randomUUID());
+ String errorMessage = intercept(AbfsDriverException.class,
+ () -> fs.create(nonOverwriteFile, false)).getErrorMessage();
+
+ Assertions.assertThat(errorMessage)
+ .describedAs("getPathStatus should fail while recovering")
+ .contains("Error in getPathStatus while recovering from create failure.");
Review Comment:
Okay, it seems this is the test I was referring to. Let's use the constants here
for the error message.
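For example, once the message lives in a shared constant (illustrative name
ERR_CREATE_RECOVERY, as in the sketch earlier), the assertion could read:

    // Hypothetical follow-up to the reviewer's ask: reference the shared constant
    // instead of repeating the string literal in the test.
    Assertions.assertThat(errorMessage)
        .describedAs("getPathStatus should fail while recovering")
        .contains(ERR_CREATE_RECOVERY);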
> [ABFS] Rename/Create path idempotency client-level resolution
> -------------------------------------------------------------
>
> Key: HADOOP-19450
> URL: https://issues.apache.org/jira/browse/HADOOP-19450
> Project: Hadoop Common
> Issue Type: Task
> Components: fs/azure
> Affects Versions: 3.5.0
> Reporter: Manish Bhatt
> Assignee: Manish Bhatt
> Priority: Major
> Labels: pull-request-available
>
> The CreatePath and RenamePath APIs are idempotent, since subsequent retries on
> the same resource don't change the server state. However, when the client
> experiences a connection break on a CreatePath or RenamePath call, it cannot
> tell whether the request was accepted by the server.
> On connection failure, the client retries the request. The server might return
> 404 (sourceNotFound) for the RenamePath API and 409 (pathAlreadyExists) for the
> CreatePath (overwrite=false) API. At that point the client has no path forward:
> for CreatePath, it doesn't know whether the path was created by the original
> request or already existed from some other request; for RenamePath, it doesn't
> know whether the source was removed by the original attempt or was never there
> in the first place.
>
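For context, a simplified, standalone sketch of the client-level resolution shown in
the rename diff above: on a retried request that fails with SourcePathNotFound, the
client probes the destination and compares the x-ms-client-transaction-id header to
decide whether the original attempt actually went through. Names and the error-code
literal are illustrative, mirroring the diff rather than the exact Hadoop source.

    // Standalone illustration (not the Hadoop source) of the recovery decision.
    public final class RenameRecoveryDecision {

      private RenameRecoveryDecision() {
      }

      /** Returns true when the retried rename should be treated as a success. */
      static boolean originalRenameSucceeded(final boolean isRetriedRequest,
          final String clientTransactionId,
          final String storageErrorCode,
          final String destinationTransactionIdHeader) {
        // Recovery only applies to retries that sent a transaction id and hit
        // a "source not found" error (SOURCE_PATH_NOT_FOUND in the diff above).
        if (!isRetriedRequest || clientTransactionId == null) {
          return false;
        }
        if (!"SourcePathNotFound".equalsIgnoreCase(storageErrorCode)) {
          return false;
        }
        // If the destination now carries our transaction id, the original
        // rename completed before the connection broke.
        return clientTransactionId.equals(destinationTransactionIdHeader);
      }

      public static void main(String[] args) {
        System.out.println(originalRenameSucceeded(
            true, "txn-123", "SourcePathNotFound", "txn-123")); // prints true
      }
    }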