[GitHub] [hadoop] bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: Filter expired entries and tombstones when lis…

2019-07-24 Thread GitBox
bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: 
Filter expired entries and tombstones when lis…
URL: https://github.com/apache/hadoop/pull/1134#discussion_r306733738
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/LocalMetadataStore.java
 ##
 @@ -309,15 +316,17 @@ public void put(PathMetadata meta,
   DirListingMetadata parentDirMeta =
   new DirListingMetadata(parentPath, DirListingMetadata.EMPTY_DIR,
   false);
+  parentDirMeta.setLastUpdated(meta.getLastUpdated());
 
 Review comment:
   I've added `testUpdateParentLastUpdatedOnPutNewParent` to test this in the 
new PR


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: Filter expired entries and tombstones when lis…

2019-07-24 Thread GitBox
bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: 
Filter expired entries and tombstones when lis…
URL: https://github.com/apache/hadoop/pull/1134#discussion_r306729656
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/LocalMetadataStore.java
 ##
 @@ -309,15 +316,17 @@ public void put(PathMetadata meta,
   DirListingMetadata parentDirMeta =
   new DirListingMetadata(parentPath, DirListingMetadata.EMPTY_DIR,
   false);
+  parentDirMeta.setLastUpdated(meta.getLastUpdated());
 
 Review comment:
   also, this is only for the DirListingMeta for LocalMetadataStore, and 
nothing else. We do not persist DirListingMeta in DynamoDB, so the parent 
directory(/directories) won't get updated if a child is added. Imagine a 
scenario where we are deep in the dir structure (eg. 20+ directories in), so we 
should have an update for each parent FileMetadata (we don't store the listings 
there) whenever a child is added. That's not good imo. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: Filter expired entries and tombstones when lis…

2019-07-24 Thread GitBox
bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: 
Filter expired entries and tombstones when lis…
URL: https://github.com/apache/hadoop/pull/1134#discussion_r306709699
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/LocalMetadataStore.java
 ##
 @@ -309,15 +316,17 @@ public void put(PathMetadata meta,
   DirListingMetadata parentDirMeta =
   new DirListingMetadata(parentPath, DirListingMetadata.EMPTY_DIR,
   false);
+  parentDirMeta.setLastUpdated(meta.getLastUpdated());
 
 Review comment:
   if this is not here then the oob tombstone listing expiry test will fail. 
but I'll try to write a unit test for this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: Filter expired entries and tombstones when lis…

2019-07-24 Thread GitBox
bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: 
Filter expired entries and tombstones when lis…
URL: https://github.com/apache/hadoop/pull/1134#discussion_r306708473
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3GuardTtl.java
 ##
 @@ -286,4 +290,70 @@ public void testCreateParentHasTombstone() throws 
Exception {
 }
   }
 
+  /**
+   * Test that listing of metadatas is filtered from expired items.
+   */
+  @Test
+  public void testListingFilteredExpiredItems() throws Exception {
+LOG.info("Authoritative mode: {}", authoritative);
+final S3AFileSystem fs = getFileSystem();
+
+long oldTime = 100L;
+long newTime = 110L;
+long ttl = 9L;
+final String basedir = getMethodName();
 
 Review comment:
   `[]` chars are not the best in url names. Corrected to string literal 
instead.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: Filter expired entries and tombstones when lis…

2019-07-24 Thread GitBox
bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: 
Filter expired entries and tombstones when lis…
URL: https://github.com/apache/hadoop/pull/1134#discussion_r306701870
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3GuardOutOfBandOperations.java
 ##
 @@ -538,6 +540,47 @@ public void deleteAfterTombstoneExpiryOobCreate() throws 
Exception {
 }
   }
 
+  /**
+   * Test that a tombstone won't hide an entry after it's expired in the
+   * listing.
+   */
+  @Test
+  public void testRootTombstones() throws Exception {
+long ttl = 10L;
+ITtlTimeProvider mockTimeProvider = mock(ITtlTimeProvider.class);
+when(mockTimeProvider.getMetadataTtl()).thenReturn(ttl);
+when(mockTimeProvider.getNow()).thenReturn(100L);
+ITtlTimeProvider originalTimeProvider = guardedFs.getTtlTimeProvider();
+guardedFs.setTtlTimeProvider(mockTimeProvider);
+
+Path base = path(getMethodName() + UUID.randomUUID());
+Path testFile = new Path(base, "test.file");
+
+try {
+  touch(guardedFs, testFile);
+  ContractTestUtils.assertDeleted(guardedFs, testFile, false);
+
+  touch(rawFS, testFile);
+  awaitFileStatus(rawFS, testFile);
+
+  // the rawFS will include the file=
+  LambdaTestUtils.eventually(5000, 1000, () -> {
+checkListingContainsPath(rawFS, testFile);
+  });
+
+  // it will be hidden because of the tombstone
+  checkListingDoesNotContainPath(guardedFs, testFile);
 
 Review comment:
   also we have other tests for getFIleStatus. This is only for listing. I'd 
like to keep this test clean. Your test is already testing for that case we 
had, I don't want to mix more things into this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: Filter expired entries and tombstones when lis…

2019-07-24 Thread GitBox
bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: 
Filter expired entries and tombstones when lis…
URL: https://github.com/apache/hadoop/pull/1134#discussion_r306700684
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3GuardOutOfBandOperations.java
 ##
 @@ -538,6 +540,47 @@ public void deleteAfterTombstoneExpiryOobCreate() throws 
Exception {
 }
   }
 
+  /**
+   * Test that a tombstone won't hide an entry after it's expired in the
+   * listing.
+   */
+  @Test
+  public void testRootTombstones() throws Exception {
+long ttl = 10L;
+ITtlTimeProvider mockTimeProvider = mock(ITtlTimeProvider.class);
+when(mockTimeProvider.getMetadataTtl()).thenReturn(ttl);
+when(mockTimeProvider.getNow()).thenReturn(100L);
+ITtlTimeProvider originalTimeProvider = guardedFs.getTtlTimeProvider();
+guardedFs.setTtlTimeProvider(mockTimeProvider);
+
+Path base = path(getMethodName() + UUID.randomUUID());
+Path testFile = new Path(base, "test.file");
+
+try {
+  touch(guardedFs, testFile);
+  ContractTestUtils.assertDeleted(guardedFs, testFile, false);
+
+  touch(rawFS, testFile);
+  awaitFileStatus(rawFS, testFile);
+
+  // the rawFS will include the file=
+  LambdaTestUtils.eventually(5000, 1000, () -> {
+checkListingContainsPath(rawFS, testFile);
+  });
+
+  // it will be hidden because of the tombstone
+  checkListingDoesNotContainPath(guardedFs, testFile);
 
 Review comment:
   ok, I just wanted this as easy to understand and clean as possible. but I 
can add more check for sure.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: Filter expired entries and tombstones when lis…

2019-07-24 Thread GitBox
bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: 
Filter expired entries and tombstones when lis…
URL: https://github.com/apache/hadoop/pull/1134#discussion_r306683463
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3GuardOutOfBandOperations.java
 ##
 @@ -538,6 +540,47 @@ public void deleteAfterTombstoneExpiryOobCreate() throws 
Exception {
 }
   }
 
+  /**
+   * Test that a tombstone won't hide an entry after it's expired in the
+   * listing.
+   */
+  @Test
+  public void testRootTombstones() throws Exception {
+long ttl = 10L;
+ITtlTimeProvider mockTimeProvider = mock(ITtlTimeProvider.class);
+when(mockTimeProvider.getMetadataTtl()).thenReturn(ttl);
+when(mockTimeProvider.getNow()).thenReturn(100L);
+ITtlTimeProvider originalTimeProvider = guardedFs.getTtlTimeProvider();
+guardedFs.setTtlTimeProvider(mockTimeProvider);
+
+Path base = path(getMethodName() + UUID.randomUUID());
+Path testFile = new Path(base, "test.file");
+
+try {
+  touch(guardedFs, testFile);
+  ContractTestUtils.assertDeleted(guardedFs, testFile, false);
+
+  touch(rawFS, testFile);
+  awaitFileStatus(rawFS, testFile);
+
+  // the rawFS will include the file=
+  LambdaTestUtils.eventually(5000, 1000, () -> {
+checkListingContainsPath(rawFS, testFile);
+  });
+
+  // it will be hidden because of the tombstone
+  checkListingDoesNotContainPath(guardedFs, testFile);
 
 Review comment:
   I'd prefer `Preconditions.checkArgument(lastUpdated >=0`. Better fail fast 
if something's wrong. We don't need to be that failproof here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: Filter expired entries and tombstones when lis…

2019-07-24 Thread GitBox
bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: 
Filter expired entries and tombstones when lis…
URL: https://github.com/apache/hadoop/pull/1134#discussion_r306683463
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3GuardOutOfBandOperations.java
 ##
 @@ -538,6 +540,47 @@ public void deleteAfterTombstoneExpiryOobCreate() throws 
Exception {
 }
   }
 
+  /**
+   * Test that a tombstone won't hide an entry after it's expired in the
+   * listing.
+   */
+  @Test
+  public void testRootTombstones() throws Exception {
+long ttl = 10L;
+ITtlTimeProvider mockTimeProvider = mock(ITtlTimeProvider.class);
+when(mockTimeProvider.getMetadataTtl()).thenReturn(ttl);
+when(mockTimeProvider.getNow()).thenReturn(100L);
+ITtlTimeProvider originalTimeProvider = guardedFs.getTtlTimeProvider();
+guardedFs.setTtlTimeProvider(mockTimeProvider);
+
+Path base = path(getMethodName() + UUID.randomUUID());
+Path testFile = new Path(base, "test.file");
+
+try {
+  touch(guardedFs, testFile);
+  ContractTestUtils.assertDeleted(guardedFs, testFile, false);
+
+  touch(rawFS, testFile);
+  awaitFileStatus(rawFS, testFile);
+
+  // the rawFS will include the file=
+  LambdaTestUtils.eventually(5000, 1000, () -> {
+checkListingContainsPath(rawFS, testFile);
+  });
+
+  // it will be hidden because of the tombstone
+  checkListingDoesNotContainPath(guardedFs, testFile);
 
 Review comment:
   I'd prefer `Preconditions.checkArgument(lastUpdated >=0`. Better fail fast 
if something's wrong. We don't need to be that failproof here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: Filter expired entries and tombstones when lis…

2019-07-24 Thread GitBox
bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: 
Filter expired entries and tombstones when lis…
URL: https://github.com/apache/hadoop/pull/1134#discussion_r306683097
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/PathMetadata.java
 ##
 @@ -71,6 +78,7 @@ public PathMetadata(S3AFileStatus fileStatus, Tristate 
isEmptyDir, boolean
 this.fileStatus = fileStatus;
 this.isEmptyDirectory = isEmptyDir;
 this.isDeleted = isDeleted;
+this.setLastUpdated(lastUpdated);
 
 Review comment:
   I'd prefer `Preconditions.checkArgument(lastUpdated >=0 ` instead. It's 
better fail fast if there's a negative value, we don't need to be that 
failproof here imo.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: Filter expired entries and tombstones when lis…

2019-07-22 Thread GitBox
bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: 
Filter expired entries and tombstones when lis…
URL: https://github.com/apache/hadoop/pull/1134#discussion_r306023290
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDirListingMetadata.java
 ##
 @@ -291,6 +293,38 @@ public void testRemoveNotChild() {
 meta.remove(new Path("/different/ancestor"));
   }
 
+
+  @Test
+  public void testExpiredEntriesFromListing() {
 
 Review comment:
   yes, this should be removeExpiredEntriesFromListing. Todo: update it!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: Filter expired entries and tombstones when lis…

2019-07-22 Thread GitBox
bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: 
Filter expired entries and tombstones when lis…
URL: https://github.com/apache/hadoop/pull/1134#discussion_r306022775
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3GuardTtl.java
 ##
 @@ -286,4 +290,70 @@ public void testCreateParentHasTombstone() throws 
Exception {
 }
   }
 
+  /**
+   * Test that listing is filtering expired items, so
 
 Review comment:
   Finish this comment


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: Filter expired entries and tombstones when lis…

2019-07-22 Thread GitBox
bgaborg commented on a change in pull request #1134: HADOOP-16433. S3Guard: 
Filter expired entries and tombstones when lis…
URL: https://github.com/apache/hadoop/pull/1134#discussion_r306022959
 
 

 ##
 File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDirListingMetadata.java
 ##
 @@ -291,6 +293,38 @@ public void testRemoveNotChild() {
 meta.remove(new Path("/different/ancestor"));
   }
 
+
+  @Test
+  public void testExpiredEntriesFromListing() {
 
 Review comment:
   Not the best test name, maybe testRemoveExpiredEntriesFromListing?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org