[ 
https://issues.apache.org/jira/browse/HADOOP-13793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15634668#comment-15634668
 ] 

Aaron Fabbri commented on HADOOP-13793:
---------------------------------------

[[email protected]], [~eddyxu] we've discussed this a bit, and [~liuml07] may 
be interested as well.

One idea I have is to take advantage of the separation of the S3Client 
introduced in HADOOP-13447 to create a "delay wrapper" or layer on top of the 
AmazonS3 client that selectively delays returning certain results.  We could 
call this wrapper DelayedAmazonS3 or something.  For example,

1. Create a file s3a://bucket/a/b/file
2. Call into DelayedAmazonS3 and tell it to delay visibility of that path for X 
milliseconds, where X is above or below the max retry threshold.
3. Call listStatus(s3a://bucket/a/b/file).  DelayedAmazonS3 will not return 
that path in the listing.  MetadataStore will add it to the listing though.
4. Call open(s3a://bucket/a/b/file).  The initial open will fail because 
DelayedAmazonS3 will fake a file-not-found.  The S3A client will go into retry 
mode, and we can assert that it does or does not succeed in getting the object.  
We can also assert on (to be added) FS statistics that at least one retry 
happened (being mindful of timing issues that could cause flaky tests).
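
To make the steps above concrete, here is a rough sketch of what such a delay 
wrapper could look like.  It is not written against the real AmazonS3 
interface: ObjectStore, InMemoryStore, DelayedObjectStore, RetryingReader and 
all of their methods are hypothetical names made up for illustration, and the 
clock is injected so a test can advance time itself instead of sleeping.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the store client; NOT the AWS SDK AmazonS3 API. */
interface ObjectStore {
    void put(String key);
    boolean exists(String key);
    List<String> list(String prefix);
}

/** Fully consistent in-memory backend, playing the role of "real" S3 here. */
class InMemoryStore implements ObjectStore {
    private final TreeSet<String> keys = new TreeSet<>();

    public void put(String key) { keys.add(key); }

    public boolean exists(String key) { return keys.contains(key); }

    public List<String> list(String prefix) {
        List<String> out = new ArrayList<>();
        for (String k : keys) {
            if (k.startsWith(prefix)) out.add(k);
        }
        return out;
    }
}

/** Wrapper that hides selected keys until a deadline on the injected clock. */
class DelayedObjectStore implements ObjectStore {
    private final ObjectStore inner;
    private final LongSupplier clockMillis;
    private final Map<String, Long> visibleAt = new HashMap<>();

    DelayedObjectStore(ObjectStore inner, LongSupplier clockMillis) {
        this.inner = inner;
        this.clockMillis = clockMillis;
    }

    /** Test hook: hide the key for delayMillis starting from "now". */
    void delayVisibility(String key, long delayMillis) {
        visibleAt.put(key, clockMillis.getAsLong() + delayMillis);
    }

    private boolean hidden(String key) {
        Long deadline = visibleAt.get(key);
        return deadline != null && clockMillis.getAsLong() < deadline;
    }

    public void put(String key) { inner.put(key); }

    /** Fakes a not-found result while the key is still hidden. */
    public boolean exists(String key) {
        return !hidden(key) && inner.exists(key);
    }

    /** Filters hidden keys out of the listing returned by the backend. */
    public List<String> list(String prefix) {
        List<String> out = new ArrayList<>();
        for (String k : inner.list(prefix)) {
            if (!hidden(k)) out.add(k);
        }
        return out;
    }
}

/** Toy retry loop standing in for the client's retry behavior. */
class RetryingReader {
    /** Returns the number of attempts used, or -1 if never found. */
    static int attemptsToOpen(ObjectStore store, String key,
                              int maxAttempts, Runnable backoff) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (store.exists(key)) return attempt;
            backoff.run(); // a test can advance a fake clock here
        }
        return -1;
    }
}
```

A test would wrap the backend, call delayVisibility() on the new path, and 
then assert on the listing and on the value returned by attemptsToOpen().  
Driving the clock from the backoff callback is one way to avoid the timing 
flakiness mentioned in step 4, since no wall-clock sleeps are involved.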

This is just one example of what I have in mind.  [~eddyxu] also had an idea 
to use the underlying S3 bucket to do something similar.  That could be a 
little less deterministic, depending on how it's implemented.  If you have 
other ideas, please share.


> s3guard: add inconsistency injection, integration tests
> -------------------------------------------------------
>
>                 Key: HADOOP-13793
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13793
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Aaron Fabbri
>
> Many of us share concerns that testing the consistency features of S3Guard 
> will be difficult if we depend on the rare and unpredictable occurrence of 
> actual inconsistency in S3 to exercise those code paths.
> I think we should have a mechanism for injecting failures to force the 
> consistency code paths in S3Guard to be exercised.
> Requirements:
> - Integration tests that cause S3A to see the types of inconsistency we 
> address with S3Guard.
> - These are deterministic integration tests.
> Unit tests are possible as well, if we were to stub out the S3Client.  That 
> may be less bang for the buck, though.


