[ 
https://issues.apache.org/jira/browse/HADOOP-14107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956028#comment-15956028
 ] 

Mingliang Liu commented on HADOOP-14107:
----------------------------------------

[~ste...@apache.org], thanks for your input.

I reran the test but could not reproduce the failure. There are two ways this 
test can fail:
# In {{noWriteBack.listStatus(directory);}}, the directory that exists only on 
S3 may not be listed because of eventual consistency in S3 listings. S3Guard is 
not guaranteed to discover new S3 objects in a timely manner. This is possible, 
but I did not reproduce it in my test runs.
# If the test data was not cleaned up after the previous test run, then this 
test will fail, as the stack trace in the description section shows.

Case #1 is a corner case, and we can simply sleep for 3 to 5 seconds. Case #2 
can be addressed by explicitly deleting the existing test directory. We may 
still need to sleep for a while until delete tracking is supported. Meanwhile, 
per your comment, I changed the coding style, comments, variable names, etc. in 
the v0 patch to make the purpose of the test clearer.
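
The two mitigations above can be sketched as follows. This is only an 
illustration, not code from the patch: {{ListingTestSupport}}, 
{{cleanTestDirectory}} and {{awaitListed}} are hypothetical names, the store is 
modelled as a plain set of paths, and a bounded poll is swapped in for the 
fixed 3 to 5 second sleep.

```java
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical test helpers; none of these names come from the patch. */
final class ListingTestSupport {
  private ListingTestSupport() {}

  /** Case #2: remove leftovers of an earlier run under the test directory. */
  static void cleanTestDirectory(Set<String> store, String dirPrefix) {
    store.removeIf(path -> path.startsWith(dirPrefix));
  }

  /**
   * Case #1: poll the listing until it contains the expected entry or the
   * timeout expires. Returns true if the entry became visible in time.
   */
  static boolean awaitListed(Supplier<Set<String>> lister, String expected,
      long timeoutMillis, long intervalMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      if (lister.get().contains(expected)) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }
}
```

A bounded poll returns as soon as the entry is visible, so the common case 
stays fast, while the timeout still caps the wait at the same few seconds.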

The main question is whether the test is doing the right thing. I think it is. 
It creates one directory on S3 only (via noS3Guard), and another directory on 
both S3 and DDB, i.e. the metadata store (via noWriteBack). Then, if we list 
the file system with S3Guard enabled, it should return both directories, 
regardless of whether S3Guard writes back or not. The only exception is case #1 
above: the S3-only data is not visible yet. That said, both noWriteBack and 
yesWriteBack will discover the S3-only object and union it with the DDB result. 
However, noWriteBack should not write the newly found S3-only object to DDB, 
while yesWriteBack should always write it to DDB. We assert that by querying 
DDB directly.
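
The expected behavior can be modelled in a few lines. This is a minimal 
in-memory sketch rather than the S3A implementation: {{WriteBackModel}} and its 
{{listStatus}} method are hypothetical, and S3 and DDB are stood in for by 
plain sets of entry names.

```java
import java.util.Set;
import java.util.TreeSet;

/** In-memory model of the listing behavior; hypothetical, not S3A code. */
final class WriteBackModel {
  private WriteBackModel() {}

  /**
   * Lists a directory as the union of the S3 listing and the metadata store
   * listing. When writeBack is true, entries found only in S3 are also
   * recorded in the metadata store; otherwise the store is left untouched.
   */
  static Set<String> listStatus(Set<String> s3, Set<String> metadataStore,
      boolean writeBack) {
    Set<String> result = new TreeSet<>(metadataStore);
    result.addAll(s3);
    if (writeBack) {
      metadataStore.addAll(s3);
    }
    return result;
  }
}
```

With S3 holding {{123}} and {{XYZ}} and the metadata store holding only 
{{XYZ}}, both modes return the two entries, but only the write-back mode leaves 
{{123}} in the metadata store afterwards.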

> ITestS3GuardListConsistency fails intermittently
> ------------------------------------------------
>
>                 Key: HADOOP-14107
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14107
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Mingliang Liu
>            Assignee: Mingliang Liu
>         Attachments: HADOOP-14107-HADOOP-13345.000.patch
>
>
> {code}
> mvn -Dit.test='ITestS3GuardListConsistency' -Dtest=none -Dscale -Ds3guard 
> -Ddynamo -q clean verify
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Running org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.544 sec <<< 
> FAILURE! - in org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency
> testListStatusWriteBack(org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency) 
>  Time elapsed: 3.147 sec  <<< FAILURE!
> java.lang.AssertionError: Unexpected number of results from metastore. 
> Metastore should only know about /XYZ: 
> DirListingMetadata{path=s3a://mliu-s3guard/test/ListStatusWriteBack, 
> listMap={s3a://mliu-s3guard/test/ListStatusWriteBack/XYZ=PathMetadata{fileStatus=S3AFileStatus{path=s3a://mliu-s3guard/test/ListStatusWriteBack/XYZ;
>  isDirectory=true; modification_time=0; access_time=0; owner=mliu; 
> group=mliu; permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true}, 
> s3a://mliu-s3guard/test/ListStatusWriteBack/123=PathMetadata{fileStatus=S3AFileStatus{path=s3a://mliu-s3guard/test/ListStatusWriteBack/123;
>  isDirectory=true; modification_time=0; access_time=0; owner=mliu; 
> group=mliu; permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true}}, 
> isAuthoritative=false}
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.assertTrue(Assert.java:41)
>       at 
> org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency.testListStatusWriteBack(ITestS3GuardListConsistency.java:127)
> {code}
> See discussion on the parent JIRA [HADOOP-13345].



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
