snvijaya commented on a change in pull request #2246:
URL: https://github.com/apache/hadoop/pull/2246#discussion_r480968575
##########
File path:
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
##########
@@ -271,10 +272,67 @@ public AbfsRestOperation deleteFilesystem() throws AzureBlobFileSystemException
return op;
}
- public AbfsRestOperation createPath(final String path, final boolean isFile, final boolean overwrite,
- final String permission, final String umask,
- final boolean isAppendBlob) throws AzureBlobFileSystemException {
+ public AbfsRestOperation createPath(final String path,
+ final boolean isFile,
+ final boolean overwrite,
+ final String permission,
+ final String umask,
+ final boolean isAppendBlob) throws AzureBlobFileSystemException {
+ String operation = isFile
+ ? SASTokenProvider.CREATE_FILE_OPERATION
+ : SASTokenProvider.CREATE_DIRECTORY_OPERATION;
+
+ // HDFS FS defaults overwrite behaviour to true for create file which leads
+ // to majority create API traffic with overwrite=true. In some cases, this
+ // will end in race conditions at backend with parallel operations issued to
+ // same path either by means of the customer workload or ABFS driver retry.
+ // Disabling the create overwrite default setting to false should
+ // significantly reduce the chances for such race conditions.
+ boolean isFirstAttemptToCreateWithoutOverwrite = false;
+ if (isFile && overwrite
+ && abfsConfiguration.isDefaultCreateOverwriteDisabled()) {
+ isFirstAttemptToCreateWithoutOverwrite = true;
+ }
+
+ AbfsRestOperation op = null;
+ // Query builder
+ final AbfsUriQueryBuilder abfsUriQueryBuilder = createDefaultUriQueryBuilder();
+ abfsUriQueryBuilder.addQuery(QUERY_PARAM_RESOURCE,
+ operation.equals(SASTokenProvider.CREATE_FILE_OPERATION)
+ ? FILE
+ : DIRECTORY);
+ if (isAppendBlob) {
+ abfsUriQueryBuilder.addQuery(QUERY_PARAM_BLOBTYPE, APPEND_BLOB_TYPE);
+ }
+
+ appendSASTokenToQuery(path, operation, abfsUriQueryBuilder);
+
+ try {
+ op = createPathImpl(path, abfsUriQueryBuilder,
+ (isFirstAttemptToCreateWithoutOverwrite ? false : overwrite),
+ permission, umask);
+ } catch (AbfsRestOperationException e) {
+ if ((e.getStatusCode() == HttpURLConnection.HTTP_CONFLICT)
+ && isFirstAttemptToCreateWithoutOverwrite) {
+ isFirstAttemptToCreateWithoutOverwrite = false;
+ // was a first attempt made to create without overwrite. Now try again
+ // with overwrite now.
+ op = createPathImpl(path, abfsUriQueryBuilder, true, permission, umask);
Review comment:
Yes, this PR change is not an absolute fix for the race condition issue, but
it will help reduce the overall volume of create traffic sent with
overwrite=true, which is high because the HDFS default for overwrite is true.
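
To illustrate the point about traffic, here is a minimal, self-contained sketch of the pattern in the hunk above: try the create with overwrite=false first, and fall back to overwrite=true only on a conflict. It does not use the real AbfsClient. `FakeStore`, `ConflictException`, and `createFile` are hypothetical names introduced only for this illustration, and the in-memory map stands in for the backend.

```java
import java.util.HashMap;
import java.util.Map;

public class CreateOverwriteSketch {

  /** Hypothetical stand-in for an HTTP 409 (Conflict) response. */
  static class ConflictException extends RuntimeException {
    ConflictException(String path) {
      super("409 Conflict: " + path);
    }
  }

  /** Hypothetical in-memory store that counts requests by overwrite flag. */
  static class FakeStore {
    final Map<String, String> files = new HashMap<>();
    int overwriteTrueCalls = 0;
    int overwriteFalseCalls = 0;

    void create(String path, boolean overwrite) {
      if (overwrite) {
        overwriteTrueCalls++;
      } else {
        overwriteFalseCalls++;
        if (files.containsKey(path)) {
          throw new ConflictException(path);
        }
      }
      files.put(path, "");
    }
  }

  /**
   * Mirrors the shape of the PR logic (an assumption-laden sketch, not the
   * actual driver code): when the caller asks for overwrite=true and the
   * "default overwrite disabled" setting is on, attempt overwrite=false first.
   */
  static void createFile(FakeStore store, String path, boolean overwrite,
      boolean defaultOverwriteDisabled) {
    boolean firstAttemptWithoutOverwrite = overwrite && defaultOverwriteDisabled;
    try {
      store.create(path, firstAttemptWithoutOverwrite ? false : overwrite);
    } catch (ConflictException e) {
      if (!firstAttemptWithoutOverwrite) {
        throw e; // the caller explicitly asked for overwrite=false
      }
      // The path already exists; only now send the overwrite=true request.
      store.create(path, true);
    }
  }

  public static void main(String[] args) {
    FakeStore store = new FakeStore();
    createFile(store, "/a", true, true); // new path: one overwrite=false call
    createFile(store, "/b", true, true); // new path: one overwrite=false call
    createFile(store, "/a", true, true); // existing path: conflict, then retry
    System.out.println("overwrite=false requests: " + store.overwriteFalseCalls); // 3
    System.out.println("overwrite=true requests: " + store.overwriteTrueCalls);   // 1
  }
}
```

In this toy run only one of the three creates reaches the store with overwrite=true, instead of all three. The window between the conflict and the retry is still open to a parallel writer, which is why this reduces the exposure rather than removing it.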