[ 
https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048376#comment-16048376
 ] 

Sean Mackrory commented on HADOOP-14457:
----------------------------------------

Had a lot of back-and-forth about this offline with [~fabbri]. Notes are below. 
Would like some input from [~liuml07] / [~ste...@apache.org] on this, since 
either solution involves adding to the MetadataStore interface. The gist is 
this:

- I think it's reasonable to only do this if authoritative mode is enabled.
- I'm concerned about code complexity, and having the Dynamo DB implementation 
add all ancestors, and then replicate the same thing under certain conditions 
in the FS only for implementations that don't do this. I suspect future 
implementations are more likely than not to require the tree to be traversable 
from root like DynamoDB does. I'm actually not clear why Local doesn't have 
this same problem, unless it's just that we don't care about doing a full-table 
scan on an in-memory hashmap.
- [~fabbri] is concerned about performance and wants the MetadataStore 
interface to be as simple as possible to make future implementations easy to 
add, so having more of the logic live in the FS when possible is desirable.

There are two options we discussed:

- Add putRecursive to the metadata store interface and use that in 
finishedWrite() instead of having it loop up the directory hierarchy itself. 
For Dynamo or similar implementations, it can just wrap put. Mkdir might be 
able to use it, although it already traverses the tree making sure none of the 
parents are a file anyway, similar to what create() should eventually do.
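
To make the first option concrete, here is a minimal sketch of what a 
putRecursive default method might look like. All names here are invented for 
illustration (SimpleMetadataStore, InMemoryStore, String paths instead of 
Hadoop Path objects) and this is not the real S3Guard API; the point is just 
that the default walks the ancestor chain, while a Dynamo-style store that 
already persists ancestors in put() could override it to a plain put() call.

{code}
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical, simplified store interface for illustration only.
interface SimpleMetadataStore {
    void put(String path);  // record a single path entry

    // Default: record the path plus every ancestor, so the tree stays
    // traversable from root. An implementation whose put() already adds
    // ancestors (as DynamoDB's does) would override this to just put(path).
    default void putRecursive(String path) {
        String p = path;
        while (p != null && !p.isEmpty() && !p.equals("/")) {
            put(p);
            int slash = p.lastIndexOf('/');
            p = (slash <= 0) ? "/" : p.substring(0, slash);
        }
    }
}

// Toy in-memory store that relies on the default putRecursive.
class InMemoryStore implements SimpleMetadataStore {
    final Set<String> entries = new LinkedHashSet<>();
    @Override public void put(String path) { entries.add(path); }
}

public class PutRecursiveSketch {
    public static void main(String[] args) {
        InMemoryStore ms = new InMemoryStore();
        ms.putRecursive("/a/b/c/file4");
        // prints [/a/b/c/file4, /a/b/c, /a/b, /a]
        System.out.println(ms.entries);
    }
}
{code}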

- Go with the change as-is, but wrap the finishedWrite changes in a 
capabilities API, so that the FS only adds ancestors when the store needs them 
added and won't do it itself.
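
A rough sketch of how the second option could look. Again, everything here is 
hypothetical (the capability string, CapableStore, FlatStore, and the 
stand-in finishedWrite are all invented names, not the real API): the FS 
probes the store and only walks the ancestor chain itself when the store 
declares it won't.

{code}
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical capability-probe interface, for illustration only.
interface CapableStore {
    boolean hasCapability(String capability);
    void put(String path);
}

// Toy store that does NOT persist ancestors itself, so the FS must.
class FlatStore implements CapableStore {
    final Set<String> entries = new LinkedHashSet<>();
    @Override public boolean hasCapability(String c) { return false; }
    @Override public void put(String path) { entries.add(path); }
}

public class CapabilitySketch {
    // Invented capability name.
    static final String PUTS_ANCESTORS = "metadatastore.puts.ancestors";

    // Stand-in for the finishedWrite() logic: recurse up the tree only when
    // the store says it won't add ancestors on its own.
    static void finishedWrite(CapableStore ms, String path) {
        ms.put(path);
        if (!ms.hasCapability(PUTS_ANCESTORS)) {
            int slash = path.lastIndexOf('/');
            while (slash > 0) {
                path = path.substring(0, slash);
                ms.put(path);
                slash = path.lastIndexOf('/');
            }
        }
    }

    public static void main(String[] args) {
        FlatStore ms = new FlatStore();
        finishedWrite(ms, "/dir1/dir2/file");
        // prints [/dir1/dir2/file, /dir1/dir2, /dir1]
        System.out.println(ms.entries);
    }
}
{code}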

Personally, I prefer the former. It addresses my concerns about adding 
recursive operations at multiple points in the code (I think it's more likely 
than not that future implementations will share DynamoDB's need for the entire 
tree to be traversable from root anyway). The FS doesn't need to decide 
whether or not it should recurse, because it can always assume the underlying 
operation will do it the first time it's called. And the one place the 
operation is implemented can take advantage of any store-specific 
optimizations, rather than something unaware of the storage layer just making 
repeated calls.

On the other hand, if we feel like we're going to eventually need a 
capabilities API anyway, we may as well add it now and use that to solve this 
problem too.

Any thoughts, [~ste...@apache.org] / [~liuml07]?

Any nuances I didn't call out here, [~fabbri]?

> create() does not notify metadataStore of parent directories or ensure 
> they're not existing files
> -------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-14457
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14457
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>         Attachments: HADOOP-14457-HADOOP-13345.001.patch, 
> HADOOP-14457-HADOOP-13345.002.patch, HADOOP-14457-HADOOP-13345.003.patch, 
> HADOOP-14457-HADOOP-13345.004.patch, HADOOP-14457-HADOOP-13345.005.patch, 
> HADOOP-14457-HADOOP-13345.006.patch, HADOOP-14457-HADOOP-13345.007.patch, 
> HADOOP-14457-HADOOP-13345.008.patch, HADOOP-14457-HADOOP-13345.009.patch
>
>
> Not a great test yet, but it at least reliably demonstrates the issue. 
> LocalMetadataStore will sometimes erroneously report that a directory is 
> empty with isAuthoritative = true when it *definitely* has children the 
> metadatastore should know about. It doesn't appear to happen if the children 
> are just directories. The fact that it's returning an empty listing is 
> concerning, but the fact that it says it's authoritative *might* be a second 
> bug.
> {code}
> diff --git 
> a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
>  
> b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> index 78b3970..1821d19 100644
> --- 
> a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> +++ 
> b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> @@ -965,7 +965,7 @@ public boolean hasMetadataStore() {
>    }
>  
>    @VisibleForTesting
> -  MetadataStore getMetadataStore() {
> +  public MetadataStore getMetadataStore() {
>      return metadataStore;
>    }
>  
> diff --git 
> a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
>  
> b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> index 4339649..881bdc9 100644
> --- 
> a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> +++ 
> b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> @@ -23,6 +23,11 @@
>  import org.apache.hadoop.fs.contract.AbstractFSContract;
>  import org.apache.hadoop.fs.FileSystem;
>  import org.apache.hadoop.fs.Path;
> +import org.apache.hadoop.fs.s3a.S3AFileSystem;
> +import org.apache.hadoop.fs.s3a.Tristate;
> +import org.apache.hadoop.fs.s3a.s3guard.DirListingMetadata;
> +import org.apache.hadoop.fs.s3a.s3guard.MetadataStore;
> +import org.junit.Test;
>  
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset;
>  import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset;
> @@ -72,4 +77,24 @@ public void testRenameDirIntoExistingDir() throws 
> Throwable {
>      boolean rename = fs.rename(srcDir, destDir);
>      assertFalse("s3a doesn't support rename to non-empty directory", rename);
>    }
> +
> +  @Test
> +  public void testMkdirPopulatesFileAncestors() throws Exception {
> +    final FileSystem fs = getFileSystem();
> +    final MetadataStore ms = ((S3AFileSystem) fs).getMetadataStore();
> +    final Path parent = path("testMkdirPopulatesFileAncestors/source");
> +    try {
> +      fs.mkdirs(parent);
> +      final Path nestedFile = new Path(parent, "dir1/dir2/dir3/file4");
> +      byte[] srcDataset = dataset(256, 'a', 'z');
> +      writeDataset(fs, nestedFile, srcDataset, srcDataset.length,
> +          1024, false);
> +
> +      DirListingMetadata list = ms.listChildren(parent);
> +      assertTrue("MetadataStore falsely reports authoritative empty list",
> +          list.isEmpty() == Tristate.FALSE || !list.isAuthoritative());
> +    } finally {
> +      fs.delete(parent, true);
> +    }
> +  }
>  }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
