[
https://issues.apache.org/jira/browse/HADOOP-15003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243827#comment-16243827
]
Steve Loughran commented on HADOOP-15003:
-----------------------------------------
Thanks for this; I'll look through the @After code too.
# All the other scale tests work in parallel, so I don't think it's their
teardown, more something happening in a parallel test
# which can include a job commit happening on a different path
# as well as some test setup doing autopurge of old stuff
# the test is getting an inconsistent filename and so not correctly working
out the path of the pending file
The fact that you can see it and I can't makes me think it's either some race
condition or config. The fact you can consistently see it means that it's
unlikely to be a race, as that's going to fail intermittently. I'd consider
parallel tests purging buckets. But: how is that going to clean up the file
under __magic? Lose the uncommitted MPUs, yes.
Here's my plan:
# make sure that the code logs @ info when a magic file is saved, some message
like "file $dest... ready to commit, commit metadata stored at $pending."
# have the failing test case add some asserts about the existence of pending
files under a path. This will help differentiate a purge of commit data from a
purge of files. Adding them in the test case which creates the file will verify
whether any delete/purge happens between the two
# add a test which explicitly deletes the MPU of a pending commit, see what
happens
# add a test which explicitly runs two job commits in parallel to separate
paths: verifies isolation.
# review troubleshooting docs. Once your list MPU CI
# have all commands which purge pending MPUs actually list the files @ info.
After all, in normal execution there shouldn't be any.
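To make the first two items of the plan concrete, here is a rough sketch in plain Java. It is not the committer code: it uses java.nio on a local directory as a stand-in for the S3A filesystem, and the class name, method names and the ".pending" suffix are all assumptions made up for illustration. It only shows the shape of the info-level message and of an existence assert on the pending file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Logger;

/** Illustrative only: a local-filesystem stand-in, not the S3A committer. */
final class PendingFileCheck {
    private static final Logger LOG = Logger.getLogger("PendingFileCheck");

    // Assumption for this sketch: commit metadata is stored under this suffix.
    static final String PENDING_SUFFIX = ".pending";

    /** Builds the info-level message suggested in the plan. */
    static String readyToCommitMessage(Path dest, Path pending) {
        return String.format(
            "file %s ready to commit, commit metadata stored at %s",
            dest, pending);
    }

    /**
     * Stand-in for saving a magic file: writes only the commit metadata
     * under magicDir, logs at info, and returns the metadata path.
     */
    static Path savePending(Path magicDir, Path dest) throws IOException {
        Files.createDirectories(magicDir);
        Path pending = magicDir.resolve(dest.getFileName() + PENDING_SUFFIX);
        Files.write(pending, dest.toString().getBytes(StandardCharsets.UTF_8));
        LOG.info(readyToCommitMessage(dest, pending));
        return pending;
    }

    /** Test-side assert: fails if the pending file has been purged. */
    static void assertPendingExists(Path pending) {
        if (!Files.exists(pending)) {
            throw new AssertionError("pending file missing: " + pending);
        }
    }
}
```

Calling assertPendingExists() straight after savePending(), and again just before the commit, would show whether the metadata disappears between those two points.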
That CLI you've proposed for listing MPUs, "hadoop s3guard uploads", will help
diagnose stuff in the field.
> Merge S3A committers into trunk: Yetus patch checker
> ----------------------------------------------------
>
> Key: HADOOP-15003
> URL: https://issues.apache.org/jira/browse/HADOOP-15003
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.0.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: HADOOP-13786-041.patch, HADOOP-13786-042.patch,
> HADOOP-13786-043.patch, HADOOP-13786-044.patch, HADOOP-13786-045.patch,
> HADOOP-13786-046.patch
>
>
> This is a Yetus only JIRA created to have Yetus review the
> HADOOP-13786/HADOOP-14971 patch as a .patch file, as the review PR
> [https://github.com/apache/hadoop/pull/282] is stopping this happening in
> HADOOP-14971.
> Reviews should go into the PR/other task