[
https://issues.apache.org/jira/browse/HADOOP-16207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933718#comment-16933718
]
Siddharth Seth commented on HADOOP-16207:
-----------------------------------------
Seeing several MR job failures when running tests on HADOOP-16445.
{code}
[ERROR]
ITestMagicCommitMRJob>AbstractITCommitMRJob.testMRJob:146->AbstractFSContractTestBase.assertIsDirectory:327
» FileNotFound
[ERROR]
ITestDirectoryCommitMRJob>AbstractITCommitMRJob.testMRJob:146->AbstractFSContractTestBase.assertIsDirectory:327
» FileNotFound
[ERROR]
ITestPartitionCommitMRJob>AbstractITCommitMRJob.testMRJob:146->AbstractFSContractTestBase.assertIsDirectory:327
» FileNotFound
[ERROR]
ITestStagingCommitMRJob>AbstractITCommitMRJob.testMRJob:146->AbstractFSContractTestBase.assertIsDirectory:327
» FileNotFound
{code}
These always fail when run with -Ds3guard -Ddynamo -Dauth (they fail even when
starting with a clean DDB table).
The test setup seems broken to me:
* Cluster setup happens with createCluster(new JobConf()).
* After this, AbstractITCommitMRJob creates the MR job with
Job.getInstance(getClusterBinding().getConf() ...), which ends up reusing the
previously created JobConf.
* JobConf only reads core-site.xml, so the command-line parameters -Ds3guard,
-Ddynamo, -Dauth make no difference.
Adding fs.s3a.metadatastore.authoritative=true and
fs.s3a.metadatastore.impl=org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore
to auth-keys.xml or core-site.xml fixed all the test failures for me. (With
these additions, the JobConf used by the cluster picks up the settings, and the
tests do what they're supposed to.)
That isn't the correct fix, though. The proper fix is to make sure the test
configuration is used to create the JobConf for the cluster and the jobs, so
that the test properties take effect.
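A sketch of the kind of change I mean - purely illustrative, not the actual
AbstractITCommitMRJob code; the setup method shape and a getConfiguration()-style
accessor for the test's S3A configuration are assumptions here:
{code:java}
// Hypothetical sketch: seed the cluster's JobConf from the test
// configuration (which reflects -Ds3guard/-Ddynamo/-Dauth) instead of
// a bare new JobConf() that only sees core-site.xml.
@BeforeClass
public static void setupCluster() throws Exception {
  // was: createCluster(new JobConf());
  JobConf conf = new JobConf(getConfiguration()); // getConfiguration(): assumed test-conf accessor
  createCluster(conf);
}
{code}
With this, the conf later returned by getClusterBinding().getConf() would carry
the test properties into Job.getInstance(...) as well.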
That said, I did see three empty files, marked as deleted - part_0000,
part_0001, _SUCCESS - in the S3Guard table. I suspect this is a result of the
committer trying to access a file on the client, getting a cached FileSystem
instance (same UGI), whose getFileStatus call (maybe) creates these S3Guard DDB
entries.
> Fix ITestDirectoryCommitMRJob.testMRJob
> ---------------------------------------
>
> Key: HADOOP-16207
> URL: https://issues.apache.org/jira/browse/HADOOP-16207
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3, test
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Critical
>
> Reported failure of {{ITestDirectoryCommitMRJob}} in validation runs of
> HADOOP-16186; assertIsDirectory with s3guard enabled and a parallel test run:
> Path "is recorded as deleted by S3Guard"
> {code}
> waitForConsistency();
> assertIsDirectory(outputPath) /* here */
> {code}
> The file is there but there's a tombstone. Possibilities
> * some race condition with another test
> * tombstones aren't timing out
> * committers aren't creating that base dir in a way which cleans up S3Guard's
> tombstones.
> Remember: we do have to delete that dest dir before the committer runs unless
> overwrite==true, so at the start of the run there will be a tombstone. It
> should be overwritten by a success.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]