[
https://issues.apache.org/jira/browse/HADOOP-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Mackrory updated HADOOP-14457:
-----------------------------------
Attachment: HADOOP-14457-HADOOP-13345.001.patch
Attaching a patch with a better test and a fix. Was originally looking at this
as a bug in the Local implementation for not creating parents, but after a
quick discussion about it with @fabbri, I was convinced that that's merely an
implementation detail of the DynamoDB implementation, and that it's actually
the FS' responsibility to create things it needs to exist (which makes sense,
since innerMkdirs does that). I'm more or less doing something similar to
innerMkdirs. Maybe they should share a common function here, but there are a
few key differences we would need to preserve if we went that route:
* I think it'd be wasteful to create the empty directory placeholder. Removing
it later makes it 2 S3 round trips that aren't needed.
* I think directories created in this manner should NOT be considered
authoritative. ITestS3GuardEmptyDirs agrees with me :) I think there are a few
special cases where we could set it to true here, but in my head that logic
gets pretty complex and IMO it's best to just leave it as false if someone is
creating the directories implicitly with create().
There's a second issue addressed here also (since the fix for that issue is
inside the loop I would have had to do anyway): one could create /a/b.txt and
subsequently /a/b.txt/c/d.txt and nothing would stop you. Obviously not common
in practice, but very incorrect IMO. I added a test to the S3A contract test.
If this is considered part of the general FS contract we could move it up a
level, but I haven't tested anything but S3. It also did occur to me that there
*could* be applications depending on this behavior. So I fixed it while I was
in the code, but I'm not entirely convinced about it myself.
> LocalMetadataStore falsely reports empty parent directory authoritatively
> -------------------------------------------------------------------------
>
> Key: HADOOP-14457
> URL: https://issues.apache.org/jira/browse/HADOOP-14457
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Sean Mackrory
> Attachments: HADOOP-14457-HADOOP-13345.001.patch
>
>
> Not a great test yet, but it at least reliably demonstrates the issue.
> LocalMetadataStore will sometimes erroneously report that a directory is
> empty with isAuthoritative = true when it *definitely* has children the
> metadatastore should know about. It doesn't appear to happen if the children
> are just directory. The fact that it's returning an empty listing is
> concerning, but the fact that it says it's authoritative *might* be a second
> bug.
> {code}
> diff --git
> a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
>
> b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> index 78b3970..1821d19 100644
> ---
> a/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> +++
> b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> @@ -965,7 +965,7 @@ public boolean hasMetadataStore() {
> }
>
> @VisibleForTesting
> - MetadataStore getMetadataStore() {
> + public MetadataStore getMetadataStore() {
> return metadataStore;
> }
>
> diff --git
> a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
>
> b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> index 4339649..881bdc9 100644
> ---
> a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> +++
> b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractRename.java
> @@ -23,6 +23,11 @@
> import org.apache.hadoop.fs.contract.AbstractFSContract;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> +import org.apache.hadoop.fs.s3a.S3AFileSystem;
> +import org.apache.hadoop.fs.s3a.Tristate;
> +import org.apache.hadoop.fs.s3a.s3guard.DirListingMetadata;
> +import org.apache.hadoop.fs.s3a.s3guard.MetadataStore;
> +import org.junit.Test;
>
> import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset;
> import static org.apache.hadoop.fs.contract.ContractTestUtils.writeDataset;
> @@ -72,4 +77,24 @@ public void testRenameDirIntoExistingDir() throws
> Throwable {
> boolean rename = fs.rename(srcDir, destDir);
> assertFalse("s3a doesn't support rename to non-empty directory", rename);
> }
> +
> + @Test
> + public void testMkdirPopulatesFileAncestors() throws Exception {
> + final FileSystem fs = getFileSystem();
> + final MetadataStore ms = ((S3AFileSystem) fs).getMetadataStore();
> + final Path parent = path("testMkdirPopulatesFileAncestors/source");
> + try {
> + fs.mkdirs(parent);
> + final Path nestedFile = new Path(parent, "dir1/dir2/dir3/file4");
> + byte[] srcDataset = dataset(256, 'a', 'z');
> + writeDataset(fs, nestedFile, srcDataset, srcDataset.length,
> + 1024, false);
> +
> + DirListingMetadata list = ms.listChildren(parent);
> + assertTrue("MetadataStore falsely reports authoritative empty list",
> + list.isEmpty() == Tristate.FALSE || !list.isAuthoritative());
> + } finally {
> + fs.delete(parent, true);
> + }
> + }
> }
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]