[ 
https://issues.apache.org/jira/browse/HADOOP-14946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486803#comment-16486803
 ] 

Aaron Fabbri commented on HADOOP-14946:
---------------------------------------

Thank you for the patch [~gabor.bota]. +1 LGTM. I believe this will make the 
failure less likely.

I think the issue is how sleep() behaves. For example:
{noformat}
foo();
sleep(x);
bar();
{noformat}
This guarantees that bar() will execute *no sooner* than x seconds after foo(), 
but it puts no upper bound on when bar() actually runs. It could be x+1, x+2, 
or x+10 seconds later, depending on what else your system is doing. This is 
consistent with the stack traces, which show that *too many* items were pruned: 
by the time prune() ran, more files had become stale.
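
To make the point concrete, here is a minimal sketch that measures how long a 
sleep really takes. The class and method names (SleepOvershoot, 
measureSleepMillis) are made up for illustration and are not part of the patch 
or the S3Guard tests.

```java
import java.util.concurrent.TimeUnit;

// Illustrative only: measures the gap between requested and actual sleep time.
class SleepOvershoot {

    /** Sleeps for requestedMillis and returns the elapsed wall time in ms. */
    static long measureSleepMillis(long requestedMillis) {
        long start = System.nanoTime();
        try {
            Thread.sleep(requestedMillis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and give up on the measurement.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sleeping", e);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        long requested = 100;
        for (int i = 0; i < 5; i++) {
            long actual = measureSleepMillis(requested);
            // The overshoot varies from run to run and grows under load.
            System.out.printf("requested=%d ms, actual=%d ms, overshoot=%d ms%n",
                requested, actual, actual - requested);
        }
    }
}
```

On an idle machine the overshoot is usually small. On a loaded one, such as a 
parallel test run, it can be large enough to move extra entries past the prune 
cutoff.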

I'll commit this after running the tests a couple of times under external load 
(in us-west-2).

(If this failure comes back, we can use timers to at least detect when too much 
time has passed: get the time before and after the prune() call, and if *too 
many* files were pruned, the timers will tell us whether the system was simply 
too slow to run the test, in which case we can ignore the failure with a 
LOG.warn(). Even better would be to add a Ticker class, like the one the Google 
cache code uses, to make the code testable without depending on real time. That 
would be more work, though.)
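
A rough sketch of the Ticker idea, under some assumptions: Guava's Ticker 
exposes a single read() method returning nanoseconds, and this example uses 
java.util.function.LongSupplier as a stand-in so it has no dependencies. 
FakeStore and FakeTicker are hypothetical names for illustration. They are not 
the S3Guard MetadataStore API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Illustrative only: a store that reads time from an injected clock, so a
// test can advance time explicitly instead of calling sleep().
class TickerPruneSketch {

    /** Hypothetical stand-in for a metadata store with a prune operation. */
    static class FakeStore {
        private final LongSupplier nanoClock;
        private final List<Long> modTimes = new ArrayList<>();

        FakeStore(LongSupplier nanoClock) {
            this.nanoClock = nanoClock;
        }

        /** Records an entry stamped with the current clock reading. */
        void put() {
            modTimes.add(nanoClock.getAsLong());
        }

        /** Removes entries older than maxAgeNanos; returns the count removed. */
        int prune(long maxAgeNanos) {
            long cutoff = nanoClock.getAsLong() - maxAgeNanos;
            int before = modTimes.size();
            modTimes.removeIf(t -> t < cutoff);
            return before - modTimes.size();
        }
    }

    /** Manually advanced clock; time moves only when the test says so. */
    static class FakeTicker implements LongSupplier {
        private long nanos;

        @Override
        public long getAsLong() {
            return nanos;
        }

        void advance(long amount, TimeUnit unit) {
            nanos += unit.toNanos(amount);
        }
    }

    /** One stale entry, one fresh entry, then prune with a 1 second max age. */
    static int runScenario() {
        FakeTicker ticker = new FakeTicker();
        FakeStore store = new FakeStore(ticker);
        store.put();                          // becomes stale
        ticker.advance(2, TimeUnit.SECONDS);  // replaces sleep() in the test
        store.put();                          // still fresh at prune time
        return store.prune(TimeUnit.SECONDS.toNanos(1));
    }

    public static void main(String[] args) {
        System.out.println("pruned = " + runScenario()); // prints "pruned = 1"
    }
}
```

Because the clock only moves when advance() is called, the pruned count is the 
same on every run, however slow or loaded the machine is. Production code 
would pass System::nanoTime as the clock.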

> S3Guard testPruneCommandCLI can fail
> ------------------------------------
>
>                 Key: HADOOP-14946
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14946
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0
>            Reporter: Steve Loughran
>            Assignee: Gabor Bota
>            Priority: Major
>         Attachments: HADOOP-14946.001.patch
>
>
> The test of the S3Guard CLI prune can sometimes fail on parallel test runs. 
> Assumption: it is the parallelism which is causing the problem
> {code}
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB
> testPruneCommandCLI(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB)
>   Time elapsed: 10.765 sec  <<< FAILURE!
> java.lang.AssertionError: Pruned children count [] expected:<1> but was:<0>
>       at org.junit.Assert.fail(Assert.java:88)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
