If you've missed the announcement, AWS S3 storage is now strongly consistent: https://aws.amazon.com/s3/consistency/
That's full CRUD consistency, consistent listing, and no 404 caching.

You don't get: rename, or an atomic create-no-overwrite. Applications need to know that and code for it.

This is enabled for all S3 buckets; no need to change endpoints or any other settings. No extra cost, no performance impact. This is the biggest change in S3 semantics since it launched.

What does this mean for the Hadoop S3A connector?

1. We've been testing it for a while; no problems have surfaced.

2. There's no need for S3Guard; leave the default settings alone. If you were using it, turn it off, restart *everything*, and then you can delete the DDB table.

3. Without S3Guard, listings may get a bit slower. There's been a lot of work in branch-3.3 on speeding up listings against raw S3, especially for code which uses listStatusIterator() and listFiles() (HADOOP-17400).

It'll be time to get Hadoop 3.3.1 out the door for people to play with; it's got a fair few other S3A-side enhancements.

People are still using S3Guard and it needs to be maintained for now, but we'll have to be fairly ruthless about closing anything which isn't going to get fixed as WONTFIX. I'm worried here about anyone using S3Guard against non-AWS consistent stores. If you are, send me an email.

And so for releases/PRs, doing test runs with and without S3Guard is important. I've added an optional backwards-incompatible change recently for better scalability: HADOOP-13230, "S3A to optionally retain directory markers", which adds markers=keep/delete to the test matrix. This is a pain, though as you can only choose two options at a time it's manageable.

Apache HBase
============

You still need the HBoss extension in front of the S3A connector to use Zookeeper to lock files during compaction.

Apache Spark
============

Any workflows which chained together reads directly after writes/overwrites of files should now work reliably with raw S3.
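As a minimal sketch of the kind of chained write-then-read that previously needed S3Guard to be reliable (the bucket name s3a://mybucket is a placeholder; assumes hadoop-aws on the classpath and AWS credentials configured):

```shell
# Write a file through the S3A connector, then read it straight back.
# With strong consistency the read sees the new object immediately;
# against the old eventually-consistent S3 this could 404 or return stale data.
hadoop fs -put ./results.csv s3a://mybucket/output/results.csv
hadoop fs -cat s3a://mybucket/output/results.csv

# Listings are consistent too: the new file shows up straight away.
hadoop fs -ls s3a://mybucket/output/
```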
The classic FileOutputCommitter commit-by-rename algorithms are no longer going to fail with FileNotFoundException during task commit.

- They will still use copy to rename work, so take O(data) time to commit files.
- Without atomic directory rename, the v1 commit algorithm can't isolate the commit operations of two task attempts. So it's unsafe and very slow.
- The v2 commit algorithm is slow, and doesn't have isolation between task attempt commits against any filesystem.
- If different task attempts are generating unique filenames (possibly to work around S3 update inconsistencies), it's not safe. Turn that option off.

The S3A committers' algorithms are happy talking directly to S3. But: SPARK-33402 is needed to fix a race condition in the staging committer. The "Magic" committer, which has relied on a consistent store, is now safe. There's a fix in HADOOP-17318 for the staging committer; hadoop-aws builds with that in will work safely with older Spark versions.

Any formats which commit work by writing a file with a unique name and updating a reference to it in a consistent store (Iceberg &c) are still going to work great. Naming is irrelevant, and commit-by-writing-a-file is S3's best story.

Distcp
======

There'll be no cached 404s to break uploads, even if you don't have the relevant fixes to stop HEAD requests before creating files (HADOOP-16932 and the revert of HADOOP-8143) or update inconsistency (HADOOP-16775).

- If your distcp version supports -direct, use it to avoid rename performance penalties.
- If your distcp version doesn't have HADOOP-15209 it can issue needless DELETE calls to S3 after a big update, and end up being throttled badly. Upgrade if you can.

If people are seeing problems: issues.apache.org + component HADOOP is where to file JIRAs; please tag the version of the hadoop libraries you've been running with.

thanks,
-Steve