[
https://issues.apache.org/jira/browse/ZOOKEEPER-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Flavio Junqueira updated ZOOKEEPER-702:
---
Status: Open (was: Patch Available)
Thanks for the updated patch, Abmar. The new tests, however, are not working as
expected. More specifically, the methods in QuorumBase (createLearnersFD and
createSessionsFD) are not being overridden as expected, which affects all new
hammer tests. I haven't checked the other tests, but I suspect they suffer from
the same problem.
I'm canceling the patch for now.
GSoC 2010: Failure Detector Model
-
Key: ZOOKEEPER-702
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-702
Project: Zookeeper
Issue Type: Wish
Reporter: Henry Robinson
Assignee: Abmar Barros
Attachments: bertier-pseudo.txt, bertier-pseudo.txt, chen-pseudo.txt,
chen-pseudo.txt, phiaccrual-pseudo.txt, phiaccrual-pseudo.txt,
ZOOKEEPER-702-code.patch, ZOOKEEPER-702-doc.patch, ZOOKEEPER-702.patch,
ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch,
ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch,
ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch,
ZOOKEEPER-702.patch, ZOOKEEPER-702.patch, ZOOKEEPER-702.patch
Failure Detector Module
Possible Mentor
Henry Robinson (henry at apache dot org)
Requirements
Java, some distributed systems knowledge, comfort implementing distributed
systems protocols
Description
ZooKeeper servers detect the failure of other servers and clients by
counting the number of 'ticks' for which they don't get a heartbeat from
other machines. This is the 'timeout' method of failure detection and works
very well; however, it may be too aggressive and is not easily tuned for
some more unusual ZooKeeper installations (such as in a wide-area network,
or even in a mobile ad-hoc network).
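To make the 'timeout' method concrete, here is a minimal sketch of tick counting. It is an illustration only and not ZooKeeper's actual code: the class name TickTimeoutDetector, its methods, and the maxMissedTicks parameter are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of tick-based timeout detection; not ZooKeeper's real classes.
final class TickTimeoutDetector {
    // Assumed tuning knob: ticks without a heartbeat before a peer is suspected.
    private final int maxMissedTicks;
    // Ticks elapsed since the last heartbeat, per monitored peer.
    private final Map<String, Integer> missedTicks = new HashMap<>();

    TickTimeoutDetector(int maxMissedTicks) {
        this.maxMissedTicks = maxMissedTicks;
    }

    // Start monitoring a peer with a fresh counter.
    void register(String peer) {
        missedTicks.put(peer, 0);
    }

    // A heartbeat from the peer resets its counter.
    void heartbeat(String peer) {
        missedTicks.put(peer, 0);
    }

    // Called once per tick: every monitored peer ages by one tick.
    void tick() {
        missedTicks.replaceAll((peer, n) -> n + 1);
    }

    // A peer is suspected once it has been silent for maxMissedTicks ticks.
    boolean isSuspected(String peer) {
        Integer n = missedTicks.get(peer);
        return n != null && n >= maxMissedTicks;
    }
}
```

The single threshold is what makes this approach hard to tune: a value that suits a LAN may raise false suspicions on a slower or more variable link.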
This project would abstract the notion of failure detection to a dedicated
Java module, and implement several failure detectors to compare and contrast
their appropriateness for ZooKeeper. For example, Apache Cassandra uses a
phi-accrual failure detector (http://ddsg.jaist.ac.jp/pub/HDY+04.pdf) which
is much more tunable and has some very interesting properties. This is a
great project if you are interested in distributed algorithms, or want to
help re-factor some of ZooKeeper's internal code.
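As a rough sketch of what such a module might look like, the following puts an accrual-style detector behind a small interface. Everything here is an assumption made for illustration: the FailureDetector interface, the PhiAccrualDetector class, and its parameters are hypothetical names, and the detector models heartbeat inter-arrival times as exponentially distributed for simplicity, whereas the phi-accrual paper estimates them from a normal distribution.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical abstraction; the names are illustrative, not from the patch.
interface FailureDetector {
    void heartbeat(long nowMillis);
    boolean isSuspected(long nowMillis);
}

// Simplified accrual detector assuming exponential inter-arrival times.
final class PhiAccrualDetector implements FailureDetector {
    private final double threshold;   // suspicion level at which the peer is suspected
    private final int windowSize;     // number of recent intervals to keep
    private final Deque<Long> intervals = new ArrayDeque<>();
    private long intervalSum = 0;
    private long lastHeartbeat = -1;  // -1 means no heartbeat seen yet

    PhiAccrualDetector(double threshold, int windowSize) {
        this.threshold = threshold;
        this.windowSize = windowSize;
    }

    @Override
    public void heartbeat(long nowMillis) {
        if (lastHeartbeat >= 0) {
            long interval = nowMillis - lastHeartbeat;
            intervals.addLast(interval);
            intervalSum += interval;
            // Keep only the most recent windowSize intervals.
            if (intervals.size() > windowSize) {
                intervalSum -= intervals.removeFirst();
            }
        }
        lastHeartbeat = nowMillis;
    }

    // phi = -log10(P(next heartbeat arrives later than the elapsed time)).
    // With P = exp(-elapsed / mean) this reduces to elapsed / (mean * ln 10).
    double phi(long nowMillis) {
        if (intervals.isEmpty()) {
            return 0.0; // not enough samples to estimate anything
        }
        double mean = (double) intervalSum / intervals.size();
        if (mean <= 0) {
            return 0.0; // no usable estimate
        }
        double elapsed = nowMillis - lastHeartbeat;
        return elapsed / (mean * Math.log(10.0));
    }

    @Override
    public boolean isSuspected(long nowMillis) {
        return phi(nowMillis) >= threshold;
    }
}
```

Unlike a fixed tick count, the suspicion level grows continuously and scales with the observed heartbeat rate, so the threshold expresses a confidence level instead of an absolute delay.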