shameersss1 opened a new pull request, #6006:
URL: https://github.com/apache/hadoop/pull/6006

   ### Description of PR
   
   Currently, the S3A Magic Committer does not support concurrent writes. When 
several jobs write to the same parent directory but to different 
partitions/sub-directories, the multipart upload (MPU) metadata (.pendingset) 
of slower-running jobs might be deleted by the jobs that complete first. 
   
   This happens because the `__magic` directory is shared by all jobs and is 
cleaned up when a job completes, which can remove metadata that other running 
jobs still need.
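   
   The failure mode can be illustrated with a toy model. This is only a sketch, not the committer's code: the object store is a plain in-memory set of keys, and the `table/__magic/` layout, the job names, and all method names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the race: the "object store" is just a sorted set of keys. */
final class SharedMagicDirDemo {
    // Hypothetical key layout, used for illustration only.
    static final String SHARED_MAGIC = "table/__magic/";

    /** Record a job's .pendingset under the shared magic directory. */
    static void writePendingSet(Set<String> store, String jobId) {
        store.add(SHARED_MAGIC + jobId + ".pendingset");
    }

    /** Job cleanup: delete everything under the shared magic directory. */
    static void cleanupSharedMagic(Set<String> store) {
        store.removeIf(key -> key.startsWith(SHARED_MAGIC));
    }

    /** Returns true if the slow job's metadata survives the fast job's cleanup. */
    static boolean slowJobSurvives() {
        Set<String> store = new TreeSet<>();
        writePendingSet(store, "job_fast");
        writePendingSet(store, "job_slow");
        cleanupSharedMagic(store); // job_fast completes first and cleans up
        return store.contains(SHARED_MAGIC + "job_slow.pendingset");
    }

    public static void main(String[] args) {
        // prints "slow job metadata survives: false"
        System.out.println("slow job metadata survives: " + slowJobSurvives());
    }
}
```

   In this model the slower job can no longer find its `.pendingset` once the faster job has cleaned up.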
   
   ### Proposed Changes
   
   1. Instead of a global magic directory, `__magic`, each job has its own 
magic directory named `__magic_<jobId>`, and all of its .pendingset files are 
written to that directory.
   
   2. Introduced a new flag, `fs.s3a.magic.cleanup.enabled`, which defaults to 
true. The magic directory is cleaned up only when this flag is enabled. This was 
introduced to address https://issues.apache.org/jira/browse/HADOOP-18568
   
   3. The default value of `fs.s3a.committer.abort.pending.uploads` is changed 
to false so that concurrent writes are supported by default.
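   
   A minimal sketch of how changes 1 and 2 interact, again using an in-memory set of keys rather than the real committer. The `__magic_<jobId>` naming follows the description above. The class, the method names, and the `cleanupEnabled` parameter (standing in for `fs.s3a.magic.cleanup.enabled`) are assumptions made for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy model of per-job magic directories with optional cleanup. */
final class PerJobMagicDirDemo {

    /** Per-job magic directory; the trailing slash keeps job1 from matching job10. */
    static String magicDir(String dest, String jobId) {
        return dest + "/__magic_" + jobId + "/";
    }

    /** Record a task's .pendingset inside its own job's magic directory. */
    static void writePendingSet(Set<String> store, String dest, String jobId, String taskId) {
        store.add(magicDir(dest, jobId) + taskId + ".pendingset");
    }

    /** Delete only this job's magic directory, and only if cleanup is enabled. */
    static void cleanup(Set<String> store, String dest, String jobId, boolean cleanupEnabled) {
        if (!cleanupEnabled) {
            return; // models the cleanup flag being set to false
        }
        String prefix = magicDir(dest, jobId);
        store.removeIf(key -> key.startsWith(prefix));
    }

    /** Two jobs write under the same destination; the fast job then cleans up. */
    static Set<String> runDemo(boolean cleanupEnabled) {
        Set<String> store = new TreeSet<>();
        writePendingSet(store, "table", "job_fast", "task_0");
        writePendingSet(store, "table", "job_slow", "task_0");
        cleanup(store, "table", "job_fast", cleanupEnabled);
        return store;
    }

    public static void main(String[] args) {
        System.out.println(runDemo(true));  // only the slow job's key remains
        System.out.println(runDemo(false)); // both keys remain
    }
}
```

   Because each job deletes only keys under its own prefix, the slower job's metadata is untouched whichever job finishes first.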
   
   
   ### How was this patch tested?
   
   1. Ran S3A Unit Tests
   
   2. Ran the S3A integration tests in the `us-west-1` region 
   
   `[INFO] Running 
org.apache.hadoop.fs.s3a.commit.staging.TestDirectoryCommitterScale
   [INFO] Running org.apache.hadoop.fs.s3a.commit.staging.TestPaths
   [INFO] Running org.apache.hadoop.fs.s3a.commit.TestMagicCommitPaths
   [INFO] Running org.apache.hadoop.fs.s3a.commit.staging.TestStagingCommitter
   [INFO] Running 
org.apache.hadoop.fs.s3a.commit.staging.TestStagingPartitionedFileListing
   [INFO] Running 
org.apache.hadoop.fs.s3a.commit.staging.TestStagingDirectoryOutputCommitter
   [INFO] Running 
org.apache.hadoop.fs.s3a.commit.staging.TestStagingPartitionedJobCommit
   [INFO] Running 
org.apache.hadoop.fs.s3a.commit.staging.TestStagingPartitionedTaskCommit
   [INFO] Tests run: 28, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.473 s - in org.apache.hadoop.fs.s3a.commit.TestMagicCommitPaths
   [INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
3.611 s - in org.apache.hadoop.fs.s3a.commit.staging.TestPaths
   [INFO] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
20.965 s - in 
org.apache.hadoop.fs.s3a.commit.staging.TestStagingDirectoryOutputCommitter
   [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
23.474 s - in 
org.apache.hadoop.fs.s3a.commit.staging.TestStagingPartitionedFileListing
   [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
27.627 s - in 
org.apache.hadoop.fs.s3a.commit.staging.TestStagingPartitionedJobCommit
   [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
29.249 s - in 
org.apache.hadoop.fs.s3a.commit.staging.TestStagingPartitionedTaskCommit
   [INFO] Tests run: 63, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
61.11 s - in org.apache.hadoop.fs.s3a.commit.staging.TestStagingCommitter
   [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
84.193 s - in 
org.apache.hadoop.fs.s3a.commit.staging.TestDirectoryCommitterScale
   [INFO] Running 
org.apache.hadoop.fs.s3a.commit.staging.integration.ITestStagingCommitProtocol
   [INFO] Running 
org.apache.hadoop.fs.s3a.commit.staging.integration.ITestDirectoryCommitProtocol
   [INFO] Running 
org.apache.hadoop.fs.s3a.commit.staging.integration.ITestPartitionedCommitProtocol
   [INFO] Running 
org.apache.hadoop.fs.s3a.commit.staging.integration.ITestStagingCommitProtocolFailure
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.901 
s - in 
   .hadoop.fs.s3a.commit.staging.integration.ITestStagingCommitProtocolFailure
   [INFO] Running org.apache.hadoop.fs.s3a.commit.magic.ITestMagicCommitProtocol
   [INFO] Running 
org.apache.hadoop.fs.s3a.commit.magic.ITestMagicCommitProtocolFailure
   [INFO] Running 
org.apache.hadoop.fs.s3a.commit.integration.ITestS3ACommitterMRJob
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.225 
s - in org.apache.hadoop.fs.s3a.commit.magic.ITestMagicCommitProtocolFailure
   [INFO] Running org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory
   [INFO] Running org.apache.hadoop.fs.s3a.commit.ITestCommitOperations
   [INFO] Running org.apache.hadoop.fs.s3a.commit.ITestCommitOperationCost
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.05 
s - in org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory
   [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
43.398 s - in org.apache.hadoop.fs.s3a.commit.ITestCommitOperationCost
   [INFO] Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
143.308 s - in org.apache.hadoop.fs.s3a.commit.ITestCommitOperations
   [INFO] Running org.apache.hadoop.fs.s3a.auth.ITestAssumedRoleCommitOperations
   [INFO] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
218.466 s - in 
org.apache.hadoop.fs.s3a.commit.integration.ITestS3ACommitterMRJob
   [WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 18, Time elapsed: 
62.036 s - in org.apache.hadoop.fs.s3a.auth.ITestAssumedRoleCommitOperations
   [WARNING] Tests run: 24, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 
429.367 s - in 
   .hadoop.fs.s3a.commit.staging.integration.ITestPartitionedCommitProtocol
   [INFO] Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
472.06 s - in 
   .hadoop.fs.s3a.commit.staging.integration.ITestStagingCommitProtocol
   [INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
498.078 s - in 
   .hadoop.fs.s3a.commit.staging.integration.ITestDirectoryCommitProtocol
   [INFO] Tests run: 23, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
861.607 s - in org.apache.hadoop.fs.s3a.commit.magic.ITestMagicCommitProtocol
   [INFO] Running org.apache.hadoop.fs.s3a.commit.magic.ITestS3AHugeMagicCommits
   [WARNING] Tests run: 10, Failures: 0, Errors: 0, Skipped: 10, Time elapsed: 
29.141 s - in org.apache.hadoop.fs.s3a.commit.magic.ITestS3AHugeMagicCommits
   [INFO] Running org.apache.hadoop.fs.s3a.commit.terasort.ITestTerasortOnS3A
   [WARNING] Tests run: 14, Failures: 0, Errors: 0, Skipped: 14, Time elapsed: 
47.485 s - in org.apache.hadoop.fs.s3a.commit.terasort.ITestTerasortOnS3A`
   
   
   
   
   


