[
https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran updated HADOOP-9361:
-----------------------------------
Attachment: HADOOP-9361-007.patch
Updated patch
# explicit option for an FS to not fail on a {{seek()}} on a closed file, only
on the read that follows.
# add raw local tests that bypass the {{ChecksumFilesystem}} -mostly to see
where quirks lay -and a couple that go down to Java's {{File}} class just to
make sure.
# more tests on rename behaviour
h2. Outstanding "ambiguities"
h3. Major: rename file onto file
On OS/X, LocalFS (and RawLocal) let you rename a source file over a destination
file; the source data becomes accessible at the source path. This is what
{{mv}} does at the command line, so presumably it's legitimate, even if HDFS
and the blobstore APIs all reject this.
This is a major difference in semantics from HDFS and Posix filesystems -code
I've written to assume that renaming a file fails if the destination exists
works well to implement some implicit concurrency control on HDFS, but would
not work on a Posix FS.
We have the option of making the rename operation stricter by explicitly adding
a check into rawlocal, but there's still a race condition between any check and
the OS-level rename action.
What does that leave? It leaves documenting this somewhere for end-users.
h3. Minor:
RawLocal returns true if you attempt to delete a nonexistent path; everything
else (including {{File.delete())}} returns true. This behaviour comes from
{{FileUtil.fullyDelete(f)}}, which does not check for a file existence first,
and when it gets false back from {{File.delete()}} generates the return code
{{!File.exists()}}. That is, the semantics of fully delete are "return true if
there is no directory at the end of the operation"
More subtly, it there is a small a race condition where you could accidentally
recursively delete a directory by attempting to delete a nonexistent file while
another process is creating a directory of the same name
This is because the check for {{!isFile()}} and !recursive are false when the
file does not exist, but by the time the delete operation starts the rename has
created a directory tree.
{code}
File f = pathToFile(p);
if (f.isFile()) {
return f.delete();
} else if (!recursive && f.isDirectory() &&
(FileUtil.listFiles(f).length != 0)) {
throw new IOException("Directory " + f.toString() + " is not empty");
}
return FileUtil.fullyDelete(f);
}
{code}
Adding an existence check at the start of the sequence produces a consistent
return code and eliminates this aspect of the race condition -this patch does
exactly that.
There is still a minor race: adding an entry to a directory after the empty
check and before the fullyDelete call.
the recursive flag logic should really be moved into a {{FileUtil}} method
itself.
> Strictly define the expected behavior of filesystem APIs and write tests to
> verify compliance
> ---------------------------------------------------------------------------------------------
>
> Key: HADOOP-9361
> URL: https://issues.apache.org/jira/browse/HADOOP-9361
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs, test
> Affects Versions: 3.0.0, 2.2.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch,
> HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch,
> HADOOP-9361-006.patch, HADOOP-9361-007.patch
>
>
> {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while
> HDFS gets tested downstream, other filesystems, such as blobstore bindings,
> don't.
> The only tests that are common are those of {{FileSystemContractTestBase}},
> which HADOOP-9258 shows is incomplete.
> I propose
> # writing more tests which clarify expected behavior
> # testing operations in the interface being in their own JUnit4 test classes,
> instead of one big test suite.
> # Having each FS declare via a properties file what behaviors they offer,
> such as atomic-rename, atomic-delete, umask, immediate-consistency -test
> methods can downgrade to skipped test cases if a feature is missing.
--
This message was sent by Atlassian JIRA
(v6.2#6252)