[
https://issues.apache.org/jira/browse/ZOOKEEPER-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086388#comment-14086388
]
Flavio Junqueira commented on ZOOKEEPER-2003:
---------------------------------------------
Fair enough, let's go into more detail here. From the analysis in the
description, I see two potential problems:
- The directory data of a newly created directory isn't persisted to disk
- A newly created log file isn't persisted as a directory entry
Both cases may lead to the loss of an otherwise persisted log file and they
imply that we need to fsync the directory data. There is a third point that I
believe is important, which is making sure that the metadata is updated when we
pad the log file.
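To make the padding point concrete, here is a minimal sketch, not the
ZooKeeper code. The class PadSketch and the method padAndSync are hypothetical
names. It grows a file by writing a single zero byte at the target offset and
then calls FileChannel.force(true), which asks for both content and metadata
to be flushed. Whether force(false) would already cover the new file length
depends on the platform, so force(true) is only the conservative choice here.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of ZooKeeper.
class PadSketch {
    /**
     * Grows the file to at least targetSize bytes (targetSize >= 1 assumed)
     * and then flushes content and metadata. Returns the resulting file size.
     */
    static long padAndSync(Path log, long targetSize) throws IOException {
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            if (ch.size() < targetSize) {
                // Writing one zero byte at the last offset extends the file.
                ch.write(ByteBuffer.wrap(new byte[1]), targetSize - 1);
            }
            // true = also flush metadata such as the new file length.
            ch.force(true);
            return ch.size();
        }
    }
}
```

A caller would invoke padAndSync(logPath, newSize) right after deciding to
pad, before relying on the padded region being there after a crash.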
The fsync documentation says that we need to fsync the directory as well to
make sure that the directory change is persisted to disk. You claim that it is
not possible to do this in Java, but I think that with Java 7 we can fsync
directories, no?
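As a rough sketch of what that could look like with the Java 7 NIO API:
opening the directory read-only with FileChannel.open and calling force(true)
on it works on Linux in my understanding, but the Java API documentation does
not promise it, and some platforms (Windows, for instance) refuse to open a
directory this way. The class DirFsyncSketch and its methods are hypothetical
names for illustration, and failures are reported through a boolean rather
than treated as fatal.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of ZooKeeper.
class DirFsyncSketch {
    /** Tries to fsync a directory; returns false if the platform refuses. */
    static boolean fsyncDirectory(Path dir) {
        try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
            ch.force(true);
            return true;
        } catch (IOException e) {
            // Assumption: platform-dependent failure, reported to the caller.
            return false;
        }
    }

    /**
     * Creates parent/dirName/fileName, writes data, and then syncs the file,
     * the log directory, and the parent of the log directory, in that order.
     */
    static Path createDurably(Path parent, String dirName, String fileName,
                              byte[] data) throws IOException {
        Path logDir = Files.createDirectories(parent.resolve(dirName));
        Path log = logDir.resolve(fileName);
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true);          // file content and metadata
        }
        fsyncDirectory(logDir);      // directory entry for the new log file
        fsyncDirectory(parent);      // directory entry for the new log dir
        return log;
    }
}
```

The order matters in this sketch: the file is synced first, then the
directory that holds its entry, then the directory that holds the entry of
the newly created log directory.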
It seems that there are two parts to this discussion. First, we need to
understand to what extent this is really a problem in the current code. I must
say that I haven't thought about this part of ZK in a while, so I don't have it
entirely fresh. Second, assuming there is something to be fixed, we need to
determine how to do it in Java.
> Missing fsync() on the logs parent directory
> --------------------------------------------
>
> Key: ZOOKEEPER-2003
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2003
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.4.6
> Reporter: Samer Al-Kiswany
>
> After studying the steps ZooKeeper takes to update the logs we found the
> following bug. The bug may not manifest in the current file system
> implementations, but it violates the POSIX recommendations and may be an
> issue in some file systems.
> Looking at the strace of zookeeper we see the following:
> mkdir(v)
> create(v/log)
> append(v/log)
> trunc(v/log)
> write(v/log)
> fdatasync(v/log)
> Although the data is fdatasynced to the log, the parent directory was never
> fsynced. Consequently, in case of a crash, the parent directory or the log
> file may be lost, as the parent directory and file metadata were never
> persisted on disk.
> To be safe, both the log directory and the parent directory of the log
> directory should be fsynced as well.