[
https://issues.apache.org/jira/browse/HADOOP-14946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486803#comment-16486803
]
Aaron Fabbri commented on HADOOP-14946:
---------------------------------------
Thank you for the patch [~gabor.bota]. +1 LGTM. I believe this will make the
failure less likely.
I think the issue is this: when you call sleep(), for example:
{noformat}
foo();
sleep(x);
bar();
{noformat}
This guarantees that bar() will execute *no sooner* than x seconds after foo(),
but doesn't guarantee when bar() will run. It could be x+1, x+2, or x+10,
depending on what your system is doing. This is consistent with the stack
traces showing there were *too many* items pruned (because by the time prune()
ran, more files had become stale).
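To illustrate the point about sleep(), here is a minimal sketch that measures how long a sleep actually takes. The class and method names are hypothetical and are not part of the patch or of the S3Guard tests. On an idle machine the overshoot is usually small; under load it can grow a lot, which is the window in which extra files can go stale.

```java
import java.util.concurrent.TimeUnit;

public class SleepOvershoot {

    /**
     * Sleeps for the requested time and returns the measured elapsed time in
     * milliseconds. The result is normally at least the requested time
     * (subject to timer precision) and has no upper bound.
     */
    static long measureSleepMillis(long requestedMillis) {
        long start = System.nanoTime();
        try {
            Thread.sleep(requestedMillis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and give up on the measurement.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sleeping", e);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        long requested = 50;
        long actual = measureSleepMillis(requested);
        // The overshoot depends on the scheduler and on system load.
        System.out.println("requested=" + requested + "ms actual=" + actual
            + "ms overshoot=" + (actual - requested) + "ms");
    }
}
```

Running this while the machine is busy (for example during a parallel test run) should show a larger overshoot than on an idle machine.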
I'll commit this after running the tests a couple of times under external load
(in us-west-2).
(If this failure comes back, we can use timers to at least detect when too much
time has passed: record the time before and after the prune() call, and if
*too many* files were pruned, the timers will tell us whether the system was
simply too slow to run the test. In that case we can ignore the failure with a
LOG.warn(). Even better would be to add a Ticker class, like the one the Google
cache code uses, to make the code testable without depending on real time.
That would be more work, though.)
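A rough sketch of the Ticker idea, under stated assumptions: the Ticker interface, FakeTicker, and Store below are hypothetical stand-ins modeled loosely on Guava's Ticker, not the S3Guard metadata store API. The point is only that once the time source is injected, the test advances time itself and the prune count no longer depends on how long sleep() really took.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class TickerPruneSketch {

    /** Hypothetical injectable time source. */
    interface Ticker {
        long nowMillis();
    }

    /** Test ticker: time only moves when the test advances it. */
    static final class FakeTicker implements Ticker {
        private long now;

        @Override
        public long nowMillis() {
            return now;
        }

        void advance(long millis) {
            now += millis;
        }
    }

    /** Hypothetical stand-in for a metadata store with a prune operation. */
    static final class Store {
        private final Map<String, Long> modTimes = new HashMap<>();
        private final Ticker ticker;

        Store(Ticker ticker) {
            this.ticker = ticker;
        }

        void put(String path) {
            modTimes.put(path, ticker.nowMillis());
        }

        /** Removes entries older than maxAgeMillis; returns how many were removed. */
        int prune(long maxAgeMillis) {
            long cutoff = ticker.nowMillis() - maxAgeMillis;
            int pruned = 0;
            Iterator<Map.Entry<String, Long>> it = modTimes.entrySet().iterator();
            while (it.hasNext()) {
                if (it.next().getValue() < cutoff) {
                    it.remove();
                    pruned++;
                }
            }
            return pruned;
        }
    }

    /** Creates one stale and one fresh entry, then prunes with a 1000 ms max age. */
    static int runScenario() {
        FakeTicker ticker = new FakeTicker();
        Store store = new Store(ticker);
        store.put("/stale");      // written at t=0
        ticker.advance(2000);     // replaces sleep(); no real time passes
        store.put("/fresh");      // written at t=2000
        ticker.advance(500);      // now t=2500, so the cutoff is t=1500
        return store.prune(1000); // only "/stale" is older than the cutoff
    }

    public static void main(String[] args) {
        System.out.println("pruned=" + runScenario());
    }
}
```

With the fake ticker the scenario always prunes exactly one entry, however slow the machine is. Production code would pass in a ticker backed by the system clock instead.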
> S3Guard testPruneCommandCLI can fail
> ------------------------------------
>
> Key: HADOOP-14946
> URL: https://issues.apache.org/jira/browse/HADOOP-14946
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.0.0
> Reporter: Steve Loughran
> Assignee: Gabor Bota
> Priority: Major
> Attachments: HADOOP-14946.001.patch
>
>
> The test of the S3Guard CLI prune can sometimes fail on parallel test runs.
> Assumption: it is the parallelism which is causing the problem
> {code}
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB
> testPruneCommandCLI(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB)
> Time elapsed: 10.765 sec <<< FAILURE!
> java.lang.AssertionError: Pruned children count [] expected:<1> but was:<0>
> at org.junit.Assert.fail(Assert.java:88)
> {code}