[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abraham Fine reassigned ZOOKEEPER-2528:
---------------------------------------

    Assignee:     (was: Abraham Fine)

> ZooKeeper cluster can become unavailable due to power failures
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2528
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2528
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.8
>         Environment: A normal ZooKeeper cluster of 3 nodes running on 3 Linux 
> machines. 
>            Reporter: Ramnatthan Alagappan
>            Priority: Critical
>
> A ZooKeeper cluster can become unavailable if power failures happen at 
> specific points in time. 
> Details:
> I am running a three-node ZooKeeper cluster and perform a simple update 
> from a client machine. 
> When the update requires a new log file (for example, because the current 
> log is full), ZooKeeper first creates the file and then appends header 
> information to the newly created log. The system call sequence looks like 
> this:
> creat(log.200000001)
> append(log.200000001, offset=0, count=16)
> Now, if a power failure happens just after the creat of the log file but 
> before the append of the header information, the node is left with an empty 
> log file and crashes with an EOF exception. If the same problem occurs on 
> two or more nodes in my three-node cluster, the entire cluster becomes 
> unavailable because a majority of the servers have crashed.  
> A simultaneous power failure across multiple nodes is plausible in 
> single-data-center or single-rack deployments. 
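
The failure window described above can be reproduced in miniature. The
following is a minimal sketch, not ZooKeeper's actual code: the class name
EmptyLogDemo, the method hasCompleteHeader, and the use of a temp file are
hypothetical, and the 16-byte header size is assumed from the "count=16" in
the append call. It shows that a file which was created but never written
yields an EOFException on the first header read, and how a reader could
detect that case instead of crashing.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;

class EmptyLogDemo {
    // Assumed header size, taken from "count=16" in the report.
    static final int HEADER_BYTES = 16;

    // Returns true if a full header can be read, false if the file is
    // shorter than the header (e.g. created but never appended to).
    static boolean hasCompleteHeader(File log) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new FileInputStream(log))) {
            byte[] header = new byte[HEADER_BYTES];
            in.readFully(header); // throws EOFException on a short file
            return true;
        } catch (EOFException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulates creat() without the following append().
        File log = File.createTempFile("log.", ".demo");
        log.deleteOnExit();
        System.out.println(hasCompleteHeader(log)); // prints "false"

        // Simulates the append of a 16-byte header (contents are dummy).
        Files.write(log.toPath(), new byte[HEADER_BYTES]);
        System.out.println(hasCompleteHeader(log)); // prints "true"
    }
}
```

Whether an empty log file should be skipped, deleted, or reported is a
design decision for the server; the sketch only illustrates that the
condition is detectable before it turns into a crash.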



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
