[ https://issues.apache.org/jira/browse/HADOOP-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Mackrory updated HADOOP-13760:
-----------------------------------
Attachment: HADOOP-13760-HADOOP-13345.002.patch

[~fabbri] - Just the unit tests with Null, Local, and Dynamo implementations. I'm also getting an encryption test and the one after it failing - I haven't entirely looked into it yet, but they succeed in isolation, so I'm assuming it's HADOOP-14305. As you pointed out offline, -Dlocal doesn't do anything, but because Local is the default it still ran the tests as I intended. And it definitely exercised all 3 implementations, because I saw failures clearly related to each one that I had to fix. I'm getting ready to run some actual workloads on an actual cluster, too.

[~ste...@apache.org] - schema versioning aside, this would cause clusters running the old code to continue including deleted items in listings. So it effectively prolongs the inconsistency I'm trying to eliminate until the tombstone gets pruned or otherwise removed.

Attaching another incremental patch. I've implemented the TODO to filter out deleted children server-side when we're deciding whether a directory is empty. I'm not sure I like this - the docs indicate there are limits on the pre-filtering data size that very large directories may hit. I'm not clear on whether regular queries would hit the same limits, but with large directories this saves us some network traffic (though not read-bandwidth-against-quota usage).

I also need to dig into the use of .withMaxResults. In my .001 patch I was applying that limit before filtering out deletes, so it's only luck / coincidence that tests didn't fail by treating non-empty directories as empty. So I need to add a test to catch that. I'm also not sure whether that limit applies before or after filtering; if it applies before, I shouldn't use it.

Also added a test that does a circular series of renames, and a few fixes that it required.
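As background, the tombstone idea being discussed above can be sketched roughly as follows. This is an illustrative toy, not the actual S3Guard code: the class name PathMetadata and the isDeleted flag come from the issue description, but everything else here is a hypothetical stand-in for how a MetadataStore might filter tombstoned entries out of listings and out of the "is this directory empty?" check.

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical stand-in for S3Guard's PathMetadata, carrying the proposed isDeleted flag.
class PathMetadata {
    final String path;
    final boolean isDeleted; // tombstone: the entry was deleted, but a record is retained
    PathMetadata(String path, boolean isDeleted) {
        this.path = path;
        this.isDeleted = isDeleted;
    }
}

public class TombstoneDemo {
    // listStatus-style filtering: tombstoned entries must never be returned to callers,
    // otherwise deleted files would keep showing up in listings on consistent stores.
    static List<String> listVisible(List<PathMetadata> children) {
        return children.stream()
                .filter(c -> !c.isDeleted)
                .map(c -> c.path)
                .collect(Collectors.toList());
    }

    // The "empty directory" decision must also ignore tombstones; a directory whose
    // children have all been deleted is empty, even though tombstone records remain.
    static boolean isEmptyDirectory(List<PathMetadata> children) {
        return children.stream().allMatch(c -> c.isDeleted);
    }

    public static void main(String[] args) {
        List<PathMetadata> children = Arrays.asList(
                new PathMetadata("/a/file1", false),
                new PathMetadata("/a/file2", true)); // deleted, tombstone retained
        System.out.println(listVisible(children));      // only /a/file1 is visible
        System.out.println(isEmptyDirectory(children)); // false: one live child remains
    }
}
```

Note the ordering pitfall raised in the comment: if a result limit (like .withMaxResults) is applied to the raw records *before* the tombstone filter, a directory containing one tombstone plus one live file could be truncated to just the tombstone and wrongly reported as empty; the filter has to run first.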
Most notably, if a directory is created and then renamed fast enough that S3 doesn't yet return it in listings, we used to throw a FileNotFoundException while trying to decide whether it was empty. We now assume it IS empty.

> S3Guard: add delete tracking
> ----------------------------
>
>                 Key: HADOOP-13760
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13760
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Aaron Fabbri
>            Assignee: Sean Mackrory
>      Attachments: HADOOP-13760-HADOOP-13345.001.patch, HADOOP-13760-HADOOP-13345.002.patch
>
>
> Following the S3AFileSystem integration patch in HADOOP-13651, we need to add delete tracking.
> Current behavior on delete is to remove the metadata from the MetadataStore. To make deletes consistent, we need to add a {{isDeleted}} flag to {{PathMetadata}} and check it when returning results from functions like {{getFileStatus()}} and {{listStatus()}}. In HADOOP-13651, I added TODO comments in most of the places these new conditions are needed. The work does not look too bad.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org