I can do a demo of this next week if people are interested.

> On 17 Aug 2017, at 23:07, Aaron Fabbri <fab...@cloudera.com> wrote:
>
> Hello,
>
> I'd like to open a vote (7 days, ending August 24 at 3:10 PST) to merge the
> HADOOP-13345 feature branch into trunk.
>
> This branch contains the new S3Guard feature which adds metadata
> consistency features to the S3A client. Formatted site documentation can
> be found here:
>
> https://github.com/apache/hadoop/blob/HADOOP-13345/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
>
> The current patch against trunk is posted here:
>
> https://issues.apache.org/jira/browse/HADOOP-13998
>
> The branch modifies the s3a portion of the hadoop-tools/hadoop-aws module:
>
> - The feature is off by default, and care has been taken to ensure it has
> no impact when disabled.
> - S3Guard can be enabled with the production database which is backed by
> DynamoDB, or with a local, in-memory implementation that facilitates
> integration testing without having to pay for a database.
> - getFileStatus() as well as directory listing consistency has been
> implemented and thoroughly tested, including delete tracking.
> - Convenient Maven profiles for testing with and without S3Guard.
> - New failure injection code and integration tests that exercise it. We
> use timers and a wrapper around the Amazon SDK client object to force
> consistency delays to occur. This allows us to assert that S3Guard works
> as advertised. This will be extended with more types of failure injection
> to continue hardening the S3A client.
>
> Outside of hadoop-tools/hadoop-aws's s3a directory there are some minor
> changes:
>
> - core-default.xml defaults and documentation for s3guard parameters.
> - A couple additional FS contract test cases around rename.
> - More goodies in LambdaTestUtils
> - A new CLI tool for inspecting and manipulating S3Guard features,
> including the backing MetadataStore database.
>
> This branch has seen extensive testing as well as use in production. This
> branch makes significant improvements to S3A's test toolkit as well.
>
> Performance is typically on par with, and in some cases better than, the
> existing S3A code without S3Guard enabled.
>
> This feature was developed with contributions and feedback from many
> people. I'd like to thank everyone who worked on HADOOP-13345 as well as
> all of those who contributed feedback and work on the original design
> document.
>
> This is the first major Apache Hadoop project I've worked on from start to
> finish, and I've really enjoyed it. Please shout if I've missed anything
> important here or in the VOTE process.
>
> Cheers,
> Aaron Fabbri
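
For anyone who wants a feel for the failure-injection idea in the quoted message (a timer plus a wrapper that forces consistency delays, with a metadata layer on top that tracks writes and deletes), here is a toy model in Python. It is only a sketch under stated assumptions: it is not S3A or S3Guard code, it does not use the Amazon SDK, and the names DelayedListingStore and GuardedStore are hypothetical, invented for this illustration.

```python
class DelayedListingStore:
    """Toy object store (hypothetical) with injected listing delays.

    A put or delete only becomes visible in list() after `delay` ticks
    of a manually advanced clock, mimicking an eventually consistent listing.
    """

    def __init__(self, delay):
        self.delay = delay
        self.now = 0
        self._objects = {}  # key -> [written_tick, deleted_tick or None]

    def tick(self, n=1):
        # Advance the injected clock instead of sleeping.
        self.now += n

    def put(self, key):
        self._objects[key] = [self.now, None]

    def delete(self, key):
        if key in self._objects:
            self._objects[key][1] = self.now

    def list(self, prefix):
        visible = []
        for key, (written, deleted) in self._objects.items():
            if not key.startswith(prefix):
                continue
            write_seen = self.now - written >= self.delay
            delete_seen = deleted is not None and self.now - deleted >= self.delay
            if write_seen and not delete_seen:
                visible.append(key)
        return sorted(visible)


class GuardedStore:
    """Toy consistency layer (hypothetical): records every put and delete
    in its own metadata map and corrects the store's listing with it."""

    def __init__(self, store):
        self.store = store
        self.meta = {}  # key -> True (exists) or False (tombstone)

    def put(self, key):
        self.store.put(key)
        self.meta[key] = True

    def delete(self, key):
        self.store.delete(key)
        self.meta[key] = False  # delete tracking via a tombstone

    def list(self, prefix):
        keys = set(self.store.list(prefix))
        for key, exists in self.meta.items():
            if not key.startswith(prefix):
                continue
            if exists:
                keys.add(key)      # written, but not yet listed by the store
            else:
                keys.discard(key)  # deleted, but still listed by the store
        return sorted(keys)
```

With a delay of 5 ticks, a key written through GuardedStore is listed by the guarded layer immediately while the raw store still returns nothing, and a deleted key disappears from the guarded listing while the raw store still returns it. A test can assert both behaviours without waiting on a real clock.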