[ https://issues.apache.org/jira/browse/HADOOP-16433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888979#comment-16888979 ]
Gabor Bota commented on HADOOP-16433:
-------------------------------------

I'll create a jira on refactoring/moving the TTL handling inside the MetadataStore, since
* TTLTimeProvider became a first-class citizen inside the MS
* we handle ~1/3 of the TTL operations outside the MS (S3Guard static methods) and the other ~2/3 inside it, so the code can be rather confusing to maintain.

> S3Guard: Filter expired entries and tombstones when listing with MetadataStore#listChildren
> -------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-16433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16433
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Gabor Bota
>            Assignee: Gabor Bota
>            Priority: Blocker
>
> Currently, we don't filter out entries in {{listChildren}} implementations.
> This can cause bugs and inconsistencies, so this should be fixed.
> It can lead to a state that we can't recover from, as in the following:
> {{guarded and raw (OOB op) clients are doing ops to S3}}
> {noformat}
> Guarded: touch /AAAA
> Guarded: touch /ZZZZ
> Guarded: rm /AAAA {{-> tombstone in MS}}
> RAW: touch /AAAA/file.ext {{-> file is hidden with a tombstone}}
> Guarded: ls / {{-> only ZZZZ will show up in the listing. }}
> {noformat}
> After we change the following code
> {code:java}
> final List<PathMetadata> metas = new ArrayList<>();
> for (Item item : items) {
>   DDBPathMetadata meta = itemToPathMetadata(item, username);
>   metas.add(meta);
> }
> {code}
> to
> {code:java}
> // handle expiry - only add not expired entries to listing.
> if (meta.getLastUpdated() == 0 ||
>     !meta.isExpired(ttlTimeProvider.getMetadataTtl(),
>     ttlTimeProvider.getNow())) {
>   metas.add(meta);
> }
> {code}
> we will filter out expired entries from the listing, so we can recover from
> this kind of OOB op.
> Note: we have to handle the lastUpdated == 0 case, where the lastUpdated field is not filled in!
> Note: this can only be fixed cleanly after HADOOP-16383 is fixed because we need to have the TTLtimeProvider in MS to handle this internally.
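
For anyone who wants to try the filtering rule from the issue description in isolation, here is a minimal standalone sketch. It is not the S3Guard code: {{ExpiryFilterSketch}}, {{Entry}} and {{filterExpired}} are hypothetical names made up for illustration, and the {{isExpired}} semantics (expired once lastUpdated + ttl <= now) are an assumption rather than something stated in this thread. It only shows the shape of the check, including the lastUpdated == 0 case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, standalone illustration; not part of Hadoop or S3Guard.
class ExpiryFilterSketch {

  /** Hypothetical stand-in for a metadata entry (not the real PathMetadata). */
  static final class Entry {
    final String path;
    final long lastUpdated; // epoch millis; 0 means the field was never filled in

    Entry(String path, long lastUpdated) {
      this.path = path;
      this.lastUpdated = lastUpdated;
    }

    /** Assumed semantics: expired once lastUpdated + ttl is not after now. */
    boolean isExpired(long ttl, long now) {
      return lastUpdated + ttl <= now;
    }
  }

  /** Keeps entries with no lastUpdated value, or whose TTL has not elapsed yet. */
  static List<Entry> filterExpired(List<Entry> entries, long ttl, long now) {
    List<Entry> kept = new ArrayList<>();
    for (Entry e : entries) {
      // Same condition as in the issue: lastUpdated == 0 is never treated as expired.
      if (e.lastUpdated == 0 || !e.isExpired(ttl, now)) {
        kept.add(e);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    long ttl = 1_000L;
    long now = 10_000L;
    List<Entry> entries = new ArrayList<>();
    entries.add(new Entry("/AAAA", 8_000L));  // older than the TTL: filtered out
    entries.add(new Entry("/ZZZZ", 9_500L));  // still within the TTL: kept
    entries.add(new Entry("/legacy", 0L));    // lastUpdated not set: kept
    for (Entry e : filterExpired(entries, ttl, now)) {
      System.out.println(e.path);
    }
  }
}
```

With the values in {{main}}, the expired /AAAA entry is dropped while /ZZZZ and /legacy are kept. Without the explicit lastUpdated == 0 check, the /legacy entry would also count as expired under the assumed semantics (0 + ttl <= now), which is the case the first note in the description warns about.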