snvijaya commented on code in PR #5462:
URL: https://github.com/apache/hadoop/pull/5462#discussion_r1131060581


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java:
##########
@@ -621,37 +622,57 @@ private AbfsRestOperation conditionalCreateOverwriteFile(final String relativePa
           isAppendBlob, null, tracingContext);
 
     } catch (AbfsRestOperationException e) {
+      LOG.debug("Failed to create {}", relativePath, e);
       if (e.getStatusCode() == HttpURLConnection.HTTP_CONFLICT) {
         // File pre-exists, fetch eTag
+        LOG.debug("Fetching etag of {}", relativePath);
         try {
           op = client.getPathStatus(relativePath, false, tracingContext);
         } catch (AbfsRestOperationException ex) {
+          LOG.debug("Failed to to getPathStatus {}", relativePath, ex);
           if (ex.getStatusCode() == HttpURLConnection.HTTP_NOT_FOUND) {

Review Comment:
   Hi @steveloughran, given that Hadoop follows single-writer semantics, would 
it be correct to expect that, as part of job parallelization, only one worker 
process should try to create a given file? This check for FileNotFound comes 
after an attempt to create the file with overwrite=false, which in turn failed 
with a conflict indicating that the file was present a moment earlier. A 
FileNotFound at this point therefore confirms a concurrent operation on the 
file. 
   
   It's quite possible that, if we let this create proceed, some other 
operation such as a delete could kick in later as well. Wouldn't the code 
below, which throws an exception at the first indication of parallel activity, 
be the right thing to do? 
   
   
   As the workload pattern is not honoring single-writer semantics, I feel we 
should retain the logic that throws ConcurrentWriteOperationDetectedException. 
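
   To make the sequence under discussion concrete, here is a minimal, 
self-contained Java sketch of the create / conflict / status-probe flow. It is 
not the ABFS implementation: `Store`, `StoreException`, 
`ConcurrentWriteDetected` and `FakeStore` are hypothetical stand-ins, and the 
final overwrite-with-eTag step is an assumption about what follows the lines 
shown in the diff. 

```java
import java.net.HttpURLConnection;

class ConditionalCreateSketch {

  /** Hypothetical stand-in for AbfsRestOperationException: carries an HTTP status. */
  static class StoreException extends RuntimeException {
    final int status;

    StoreException(int status, String message) {
      super(message);
      this.status = status;
    }
  }

  /** Hypothetical stand-in for ConcurrentWriteOperationDetectedException. */
  static class ConcurrentWriteDetected extends RuntimeException {
    ConcurrentWriteDetected(String message) {
      super(message);
    }
  }

  /** Hypothetical minimal store API; create returns the new eTag. */
  interface Store {
    String create(String path, boolean overwrite, String ifMatchETag);

    String getETag(String path);
  }

  static String conditionalCreateOverwrite(Store store, String path) {
    try {
      // Step 1: try to create without overwriting.
      return store.create(path, false, null);
    } catch (StoreException e) {
      if (e.status != HttpURLConnection.HTTP_CONFLICT) {
        throw e; // unrelated failure, propagate unchanged
      }
    }

    // Step 2: the conflict says the file exists, so fetch its eTag.
    String eTag;
    try {
      eTag = store.getETag(path);
    } catch (StoreException ex) {
      if (ex.status == HttpURLConnection.HTTP_NOT_FOUND) {
        // Present a moment ago, gone now: another actor touched the path.
        throw new ConcurrentWriteDetected(
            path + " disappeared between create and status probe");
      }
      throw ex;
    }

    // Step 3 (assumed): overwrite only if the eTag still matches.
    return store.create(path, true, eTag);
  }

  /** Hypothetical fake that can simulate a delete racing with the probe. */
  static class FakeStore implements Store {
    private final boolean deletedBeforeProbe;

    FakeStore(boolean deletedBeforeProbe) {
      this.deletedBeforeProbe = deletedBeforeProbe;
    }

    @Override
    public String create(String path, boolean overwrite, String ifMatchETag) {
      if (!overwrite) {
        throw new StoreException(HttpURLConnection.HTTP_CONFLICT,
            path + " already exists");
      }
      return "etag-2";
    }

    @Override
    public String getETag(String path) {
      if (deletedBeforeProbe) {
        throw new StoreException(HttpURLConnection.HTTP_NOT_FOUND,
            path + " not found");
      }
      return "etag-1";
    }
  }

  public static void main(String[] args) {
    // No race: the conflict is resolved by an eTag-guarded overwrite.
    System.out.println(
        conditionalCreateOverwrite(new FakeStore(false), "/job/part-0000"));

    // Race: the file vanishes before the probe, so the sketch fails fast.
    try {
      conditionalCreateOverwrite(new FakeStore(true), "/job/part-0000");
    } catch (ConcurrentWriteDetected e) {
      System.out.println("detected: " + e.getMessage());
    }
  }
}
```

   In this sketch the 404 after a 409 is treated as proof of a second actor on 
the path, which is the behavior I am arguing we should retain. 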



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

