SaketaChalamchala opened a new pull request, #6496:
URL: https://github.com/apache/ozone/pull/6496

   ## What changes were proposed in this pull request?
   S3A provides multiple MapReduce 
[committers](https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/committers.html).
 When a job uses the directory staging committer (`fs.s3a.committer.name=directory`) 
with the replace conflict mode (`fs.s3a.committer.staging.conflict-mode=replace`) and 
writes to an FSO bucket, it fails with the errors listed under "Errors" below.
   
   The failure occurs because the Complete MPU request lacks the logic that the 
Initiate MPU request uses to add back missing parent directories and missing 
output prefix directories 
([S3InitiateMultipartUploadRequestWithFSO.java](https://github.com/apache/ozone/blob/83d75861b0266160b219acde72d769eb0f9d5ac4/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/s3/multipart/S3InitiateMultipartUploadRequestWithFSO.java#L192-L207),
 
[S3InitiateMultipartUploadResponseWithFSO.java](https://github.com/apache/ozone/blob/83d75861b0266160b219acde72d769eb0f9d5ac4/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/response/s3/multipart/S3InitiateMultipartUploadResponseWithFSO.java#L83-L107)).
   
   The proposed solution adds the missing parent and output prefix directories 
back to the DB before completing the MPU.
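
   As a rough illustration of the idea (not the Ozone Manager code in this 
PR), the sketch below walks the key path and collects every ancestor directory 
that is absent from a set standing in for the directory table. The class name 
`MissingParentDirs`, the method `missingParents`, and the assumption that only 
the first two path components still exist are hypothetical; the key name is 
taken from the error log.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Ozone.
class MissingParentDirs {

    /**
     * Returns the ancestor directories of keyName that are not present in
     * existingDirs, ordered from the shallowest to the deepest.
     */
    static List<String> missingParents(String keyName, Set<String> existingDirs) {
        List<String> missing = new ArrayList<>();
        String[] parts = keyName.split("/");
        StringBuilder prefix = new StringBuilder();
        // The last component is the file itself, so stop before it.
        for (int i = 0; i < parts.length - 1; i++) {
            if (i > 0) {
                prefix.append('/');
            }
            prefix.append(parts[i]);
            String dir = prefix.toString();
            if (!existingDirs.contains(dir)) {
                missing.add(dir);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: only these two directories are still present.
        Set<String> dirs = new HashSet<>();
        dirs.add("st-data-con-jqmmwt");
        dirs.add("st-data-con-jqmmwt/qetest");

        String key =
            "st-data-con-jqmmwt/qetest/terasort/output-1710494153/part-r-00000";

        // Add the missing ancestors back before the upload would be completed.
        for (String dir : missingParents(key, dirs)) {
            dirs.add(dir);
            System.out.println("re-created " + dir);
        }
    }
}
```

   Returning the directories from shallowest to deepest means each one can be 
created after its own parent already exists.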
   
   Errors:
   ```
   ## When creating the DB key (getDBOzoneKey)
   StateMachine ApplyTransaction Thread - 
0]-org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCompleteRequest:
 MultipartUpload Complete request failed for Key: 
st-data-con-jqmmwt/qetest/terasort/output-1710494153/part-r-00000 in 
Volume/Bucket s3v/qe-dataconn-bucket
   DIRECTORY_NOT_FOUND org.apache.hadoop.ozone.om.exceptions.OMException: 
Failed to find parent directory of 
st-data-con-jqmmwt/qetest/terasort/output-1710494153/part-r-00000
        at 
org.apache.hadoop.ozone.om.request.file.OMFileRequest.getParentID(OMFileRequest.java:1008)
        at 
org.apache.hadoop.ozone.om.request.file.OMFileRequest.getParentID(OMFileRequest.java:958)
        at 
org.apache.hadoop.ozone.om.request.file.OMFileRequest.getParentId(OMFileRequest.java:1038)
        at 
org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCompleteRequestWithFSO.getDBOzoneKey(S3MultipartUploadCompleteRequestWithFSO.java:114)
        at 
org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCompleteRequest.validateAndUpdateCache(S3MultipartUploadCompleteRequest.java:157)
        at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:378)
        at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:568)
        at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:363)
        at 
java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
   ```
   
   ```
   ## When getting omKeyInfo from DB
   StateMachine ApplyTransaction Thread - 
0]-org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine: Terminating with 
exit status 1: Request cmdType: CompleteMultiPartUpload
   traceID: ""
   clientId: "client-5168AA460706"
   userInfo {
     userName: "xxx"
     remoteAddress: "10.140.142.67"
     hostName: "ccycloud-5.quasar-zycyup.root.comops.site"
   }
   version: 3
   completeMultiPartUploadRequest {
     keyArgs {
       volumeName: "s3v"
       bucketName: "qe-dataconn-bucket"
       keyName: 
"st-data-con-jqmmwt/qetest/terasort/output-1710494153/part-r-00000"
       multipartUploadID: 
"e1c08d2c-5798-4c81-ab2c-7a0bd0fea4c9-112101950169613319"
       acls {
         type: USER
         name: "[email protected]"
         rights: "\200"
         aclScope: ACCESS
       }
       acls {
         type: GROUP
         name: "hrt_qa"
         rights: "\000\001"
         aclScope: ACCESS
       }
       acls {
         type: GROUP
         name: "users"
         rights: "\000\001"
         aclScope: ACCESS
       }
       acls {
         type: GROUP
         name: "hivetest"
         rights: "\000\001"
         aclScope: ACCESS
       }
       modificationTime: 1710540186541
     }
     partsList {
       partNumber: 1
       partName: 
"/s3v/qe-dataconn-bucket/st-data-con-jqmmwt/qetest/terasort/output-1710494153/part-r-00000-e1c08d2c-5798-4c81-ab2c-7a0bd0fea4c9-112101950169613319-1"
     }
   }
   s3Authentication {
     stringToSign: 
"AWS4-HMAC-SHA256\n20240315T220306Z\n20240315/us-east-1/s3/aws4_request\nb99fbcaca83fd2e3e67e9d5ff8f83fe1d882107fba9398e24414391522fbd926"
     signature: 
"724343f896d39b9e644f354ac4bf19648f2b2bc5ceeb514570abc288ed38d6c8"
     accessId: "[email protected]"
   }
    failed with exception
   java.lang.NullPointerException
        at 
org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCompleteRequest.getOmKeyInfo(S3MultipartUploadCompleteRequest.java:378)
        at 
org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCompleteRequest.validateAndUpdateCache(S3MultipartUploadCompleteRequest.java:202)
        at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:378)
        at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:568)
        at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:363)
        at 
java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834) 
   ```
   
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-10630
   
   ## How was this patch tested?
   
   Ran the S3A Hadoop contract test `ITestS3ACommitterMRJob` against an FSO 
bucket to verify the fix. A separate 
[PR](https://github.com/apache/ozone/pull/6458/files) adds S3A contract tests 
to acceptance testing.
   
   ```
   ## Start an unsecure Ozone cluster using docker-compose and create an FSO bucket
   cd hadoop-ozone/dist/target/ozone-*-SNAPSHOT/compose/ozone
   docker-compose up -d --scale datanode=3
   ozone sh bucket create /s3v/fso-bucket -l FILE_SYSTEM_OPTIMIZED
   
   ## Download the hadoop-aws source
   curl -LSs -o "hadoop-src.tar.gz" \
     https://archive.apache.org/dist/hadoop/common/hadoop-3.3.6/hadoop-3.3.6-src.tar.gz
   ## tar -C needs the target directory to exist
   mkdir -p hadoop-src
   tar -x -z -C "hadoop-src" --strip-components=3 -f "hadoop-src.tar.gz" \
     'hadoop-*-src/hadoop-tools/hadoop-aws'
   
   ## Create auth-keys.xml
   vi hadoop-src/src/test/resources/auth-keys.xml 
   <configuration>
   
     <property>
       <name>fs.s3a.endpoint</name>
       <value>http://localhost:9878</value>
     </property>
   
     <property>
       <name>fs.s3a.access.key</name>
       <value>s3a-contract</value>
     </property>
   
     <property>
       <name>fs.s3a.secret.key</name>
       <value>unsecure</value>
     </property>
   
     <property>
       <name>fs.s3a.committer.staging.conflict-mode</name>
       <value>replace</value>
     </property>
   
     <property>
       <name>fs.contract.test.fs.s3a</name>
       <value>s3a://fso-bucket/</value>
     </property>
   
     <property>
       <name>test.fs.s3a.name</name>
       <value>s3a://fso-bucket/</value>
     </property>
   
     <property>
       <name>test.fs.s3a.sts.enabled</name>
       <value>false</value>
     </property>
   
     <property>
       <name>fs.s3a.path.style.access</name>
       <value>true</value>
     </property>
   
     <property>
       <name>fs.s3a.directory.marker.retention</name>
       <value>keep</value>
     </property>
   
   </configuration>
   
   ## Run the ITestS3ACommitterMRJob contract test from the extracted hadoop-aws module
   cd hadoop-src
   mvn clean test -B -V --no-transfer-progress -Dtest='ITestS3ACommitterMRJob'
   ```
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

