[
https://issues.apache.org/jira/browse/ZOOKEEPER-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Abraham Fine reassigned ZOOKEEPER-2528:
---------------------------------------
Assignee: (was: Abraham Fine)
> ZooKeeper cluster can become unavailable due to power failures
> --------------------------------------------------------------
>
> Key: ZOOKEEPER-2528
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2528
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.8
> Environment: A normal ZooKeeper cluster of 3 nodes running on 3 Linux
> machines.
> Reporter: Ramnatthan Alagappan
> Priority: Critical
>
> A ZooKeeper cluster can become unavailable if power failures occur at
> specific points in time.
> Details:
> I am running a three-node ZooKeeper cluster and perform a simple update from
> a client machine.
> When I update a value, ZooKeeper may create a new log file (for example,
> when the current log is full). It first creates the file and then appends
> header information to the newly created log. The system call sequence looks
> like this:
> creat(log.200000001)
> append(log.200000001, offset=0, count=16)
> If a power failure happens just after the creat of the log file but before
> the append of the header information, the log file is left without its
> header, and the node crashes with an EOF exception when it restarts. If the
> same problem occurs on two or more nodes of my three-node cluster, the entire
> cluster becomes unavailable because a majority of the servers have crashed.
> Simultaneous power failures across multiple nodes are plausible in
> single-data-center or single-rack deployments.
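
The failure window described above can be reproduced in miniature. The following is a rough sketch in Python, not ZooKeeper's actual code: the function names, the header layout (`>iiq`), and the magic value are assumptions chosen only so that the header comes to the 16 bytes shown in the append call. It creates one log with a header and one log that stops after the creat step, then shows that a strict header reader raises an EOF error on the second file.

```python
import os
import struct
import tempfile

# Hypothetical header layout for illustration: magic (int), version (int),
# db id (long). Big-endian with no padding, so it packs to 16 bytes,
# matching count=16 in the reported append call.
HEADER_FORMAT = ">iiq"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)
MAGIC = 0x5A4B4C47  # ASCII "ZKLG"; assumed value, used only as a placeholder


def write_log_with_header(path):
    """Normal path: creat followed by the header append."""
    with open(path, "wb") as f:
        f.write(struct.pack(HEADER_FORMAT, MAGIC, 2, 0))


def simulate_crash_after_creat(path):
    """Power-failure path: the file exists but the header was never written."""
    open(path, "wb").close()


def read_header(path):
    """Strict reader: raises EOFError if the file is shorter than the header."""
    with open(path, "rb") as f:
        data = f.read(HEADER_SIZE)
    if len(data) < HEADER_SIZE:
        raise EOFError(
            f"{path}: expected {HEADER_SIZE} header bytes, got {len(data)}"
        )
    return struct.unpack(HEADER_FORMAT, data)


def demo():
    """Return a per-file result: 'ok' if the header parsed, 'EOF' otherwise."""
    results = {}
    with tempfile.TemporaryDirectory() as d:
        good = os.path.join(d, "log.100000001")
        bad = os.path.join(d, "log.200000001")
        write_log_with_header(good)
        simulate_crash_after_creat(bad)
        for path in (good, bad):
            try:
                read_header(path)
                results[os.path.basename(path)] = "ok"
            except EOFError:
                results[os.path.basename(path)] = "EOF"
    return results


if __name__ == "__main__":
    print(demo())
```

In this sketch the zero-length file is the only thing that distinguishes the two cases, which is the state the report says a node is left in when power fails between the two system calls.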