[
https://issues.apache.org/jira/browse/KUDU-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15274595#comment-15274595
]
David Alves commented on KUDU-1419:
-----------------------------------
[~mpercy] The "everything is well" path doesn't worry me as much as the
error-handling path. Can you elaborate on what that would look like (i.e.
crashes in the middle of each of the phases you described)?
Also can you compare to the number of fsyncs we already do?
It seems like in this case it would be:
- one for the dir creation
- one on the recovery-status file creation/append
- one on the dir after all the wals are moved
- one on the file before we start replaying to indicate the move is done
- one on the file for the change to CLEANING_UP
- one after we delete the dir.
- one at the end.
So about 7 in total.
Since these don't grow with tablet size I'm not too worried about the fsyncs,
but this path seems complex and crash handling will need some testing.
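
For reference, here is a minimal sketch (Python, POSIX only) of the two primitives that the fsync steps listed above rely on: fsyncing a file after an append, and fsyncing a directory after an entry is created in it. The helper names, the "recovery" directory name, and the MOVE_DONE state are hypothetical and only for illustration; only CLEANING_UP and the recovery-status file come from the discussion above. This is not Kudu's actual code.

```python
import os
import shutil
import tempfile


def fsync_dir(path):
    """Flush a directory's metadata (created/renamed/deleted entries) to disk."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def append_durably(path, record):
    """Append one record to a file and fsync the file before returning."""
    with open(path, "a") as f:
        f.write(record + "\n")
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # push the OS cache to stable storage


def demo_fsync():
    root = tempfile.mkdtemp()
    recovery_dir = os.path.join(root, "recovery")   # hypothetical name
    os.mkdir(recovery_dir)
    fsync_dir(root)                       # the dir creation is now durable
    status = os.path.join(recovery_dir, "recovery-status")
    append_durably(status, "MOVE_DONE")   # hypothetical state name
    fsync_dir(recovery_dir)               # the new file's dir entry is durable
    append_durably(status, "CLEANING_UP")
    with open(status) as f:
        states = f.read().split()
    shutil.rmtree(root)
    return states
```

Each call costs one fsync regardless of how much WAL data is involved, which is why the total stays constant as the tablet grows.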
> Kudu may fail to start in docker when using Ubuntu/AUFS
> -------------------------------------------------------
>
> Key: KUDU-1419
> URL: https://issues.apache.org/jira/browse/KUDU-1419
> Project: Kudu
> Issue Type: Bug
> Components: util
> Reporter: Casey Ching
> Priority: Critical
>
> By default Ubuntu's docker setup uses AUFS for its storage layer. That leads
> to problems during startup because rename() may not work in AUFS.
> {quote}
> To rename(2) directory may return EXDEV even if both of src and tgt are on
> the same aufs. When the rename-src dir exists on multiple branches and the
> lower dir has child(ren), aufs has to copyup all his children. It can be
> recursive copyup. Current aufs does not support such huge copyup operation at
> one time in kernel space, instead produces a warning and returns EXDEV.
> Generally, mv(1) detects this error and tries mkdir(2) and rename(2) or
> copy/unlink recursively. So the result is harmless. If your application which
> issues rename(2) for a directory does not support EXDEV, it will not work on
> aufs. Also this specification is applied to the case when the src directroy
> exists on the lower readonly branch and it has child(ren).
> {quote}
> http://aufs.sourceforge.net/aufs.html
> Starting the master may try to rename() the log directory:
> {code}
> RETURN_NOT_OK_PREPEND(fs_manager->env()->RenameFile(log_dir, recovery_path),
>                       Substitute("Could not move log directory $0 to recovery dir $1",
>                                  log_dir, recovery_path));
> {code}
> https://github.com/cloudera/kudu/blob/master/src/kudu/tablet/tablet_bootstrap.cc#L597
> {code}
> virtual Status RenameFile(const std::string& src, const std::string& target) OVERRIDE {
>   TRACE_EVENT2("io", "PosixEnv::RenameFile", "src", src, "dst", target);
>   ThreadRestrictions::AssertIOAllowed();
>   Status result;
>   if (rename(src.c_str(), target.c_str()) != 0) {
>     result = IOError(src, errno);
>   }
>   return result;
> }
> {code}
> https://github.com/cloudera/kudu/blob/master/src/kudu/util/env_posix.cc#L891
> I think Kudu is supposed to fall back to copy/remove. As an example, here is
> what Python does:
> {code}
> try:
>     os.rename(src, real_dst)
> except OSError:
>     if os.path.isdir(src):
>         if _destinsrc(src, dst):
>             raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)
>         copytree(src, real_dst, symlinks=True)
>         rmtree(src)
>     else:
>         copy2(src, real_dst)
>         os.unlink(src)
> {code}
> https://hg.python.org/cpython/file/2.7/Lib/shutil.py#l295
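
As a rough illustration of the fallback described above, here is a minimal Python 3 sketch that falls back to copy + delete only when rename fails with EXDEV and re-raises every other error. The function name `move_dir`, the injectable `rename` parameter (used here only to simulate the AUFS behaviour), and the file names in the demo are assumptions for illustration, not part of Kudu or of shutil.

```python
import errno
import os
import shutil
import tempfile


def move_dir(src, dst, rename=os.rename):
    """Move a directory; fall back to copy + delete only on EXDEV."""
    try:
        rename(src, dst)
    except OSError as e:
        if e.errno != errno.EXDEV:
            raise                      # unrelated failure: do not mask it
        shutil.copytree(src, dst, symlinks=True)
        shutil.rmtree(src)


def _rename_exdev(src, dst):
    """Stand-in for rename(2) failing the way the aufs manual describes."""
    raise OSError(errno.EXDEV, os.strerror(errno.EXDEV))


def demo_exdev_fallback():
    root = tempfile.mkdtemp()
    src = os.path.join(root, "wals")        # hypothetical directory names
    dst = os.path.join(root, "recovery")
    os.mkdir(src)
    with open(os.path.join(src, "wal-000001"), "w") as f:
        f.write("entry")
    move_dir(src, dst, rename=_rename_exdev)
    result = (os.path.exists(src), sorted(os.listdir(dst)))
    shutil.rmtree(root)
    return result
```

Unlike rename(2), the copy + delete fallback is not atomic: a crash partway through can leave both a partial destination and the original source on disk, so any real implementation would also need the crash handling discussed in the comment above.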