[
https://issues.apache.org/jira/browse/HADOOP-16632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran updated HADOOP-16632:
------------------------------------
Fix Version/s: 3.3.0
Resolution: Fixed
Status: Resolved (was: Patch Available)
> Speculating & Partitioned S3A magic committers can leave pending files under
> __magic
> ------------------------------------------------------------------------------------
>
> Key: HADOOP-16632
> URL: https://issues.apache.org/jira/browse/HADOOP-16632
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.2.1, 3.1.3
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Fix For: 3.3.0
>
>
> Partitioned S3A magic committers can leave pending files, and maybe pending upload data.
> This surfaced in an assertion failure on a parallel test run.
> I thought it was actually a test failure, but with HADOOP-16207 all the docs
> are preserved in the local FS and I can understand what happened.
> h3. Junit process
> {code}
> [INFO]
> [ERROR] Failures:
> [ERROR]
> ITestS3ACommitterMRJob.test_200_execute:344->customPostExecutionValidation:356
> Expected a java.io.FileNotFoundException to be thrown, but got the result: :
> "Found magic dir which should have been deleted at
> S3AFileStatus{path=s3a://hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic;
> isDirectory=true; modification_time=0; access_time=0; owner=stevel;
> group=stevel; permission=rwxrwxrwx; isSymlink=false; hasAcl=false;
> isEncrypted=true; isErasureCoded=false} isEmptyDirectory=UNKNOWN eTag=null
> versionId=null
> [s3a://hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/tasks/attempt_1570197469968_0003_m_000008_1/__base/part-m-00008
> s3a://hwdev-steve-ireland-new/fork-0001/test/ITestS3ACommitterMRJob-execute-magic/__magic/app-attempt-0001/tasks/attempt_1570197469968_0003_m_000008_1/__base/part-m-00008.pending
> {code}
> Full details to follow in the comment as they are, well, detailed.
>
> Key point: AM-side job and task cleanup can happen before the worker task
> finishes its writes. This will result in files under __magic. It may result
> in pending uploads too, but only if the write began after the AM job cleanup
> did a list + abort of all pending uploads under the destination directory.
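
The ordering described in the key point can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not the S3A committer code: `ToyStore`, `begin_write`, `finish_write`, `am_job_cleanup` and `run` are hypothetical names invented for the illustration, and the model assumes that a task write starts a pending upload when it begins and saves a `.pending` file under `__magic` when it finishes.

```python
class ToyStore:
    """Hypothetical in-memory stand-in for a bucket; not a real S3A or AWS API."""

    def __init__(self):
        self.files = set()            # object keys visible in the store
        self.pending_uploads = set()  # destination keys with an uncommitted upload

    def begin_write(self, dest_key):
        # Assumption: starting a magic write opens a pending upload to the destination.
        self.pending_uploads.add(dest_key)

    def finish_write(self, magic_key):
        # Assumption: finishing the write saves a .pending file under __magic.
        self.files.add(magic_key + ".pending")

    def am_job_cleanup(self, dest_dir):
        # List + abort all pending uploads under the destination directory,
        # then delete everything under its __magic directory.
        prefix = dest_dir + "/"
        self.pending_uploads = {
            u for u in self.pending_uploads if not u.startswith(prefix)
        }
        self.files = {
            f for f in self.files if not f.startswith(prefix + "__magic/")
        }


def run(write_begins_before_cleanup):
    """Run one late-finishing task against one AM cleanup and report what is left."""
    store = ToyStore()
    dest = "dest"
    dest_key = dest + "/part-m-00008"
    magic_key = dest + "/__magic/task/__base/part-m-00008"

    if write_begins_before_cleanup:
        store.begin_write(dest_key)
        store.am_job_cleanup(dest)   # the upload is listed and aborted here
    else:
        store.am_job_cleanup(dest)   # nothing to abort yet
        store.begin_write(dest_key)  # this upload is never seen by the cleanup

    # In both orderings the worker finishes its write after the cleanup.
    store.finish_write(magic_key)
    return sorted(store.files), sorted(store.pending_uploads)


if __name__ == "__main__":
    print("write began before cleanup:", run(True))
    print("write began after cleanup: ", run(False))
```

In this model both orderings leave a `.pending` file under `__magic`, while only the ordering in which the write begins after the cleanup also leaves a pending upload, which matches the two outcomes described above.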
--
This message was sent by Atlassian Jira
(v8.3.4#803005)