[jira] Commented: (ZOOKEEPER-368) Observers

2009-07-20 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733411#action_12733411
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-368:
--

I think this is an excellent observation. The discussion on the user list also 
made me think about the redundancy between the two mechanisms (hierarchical 
quorums and observers), and I think Ben is right in that it may simplify the 
implementation. Also, it has a couple of extra benefits: it would exercise the 
flexible quorum implementation and would combine the configuration of 
hierarchical quorums and observers. 

> Observers
> -
>
> Key: ZOOKEEPER-368
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
> Project: Zookeeper
> Issue Type: New Feature
> Components: quorum
> Reporter: Flavio Paiva Junqueira
> Assignee: Henry Robinson
> Attachments: ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch
>
>
> Currently, all servers of an ensemble participate actively in reaching 
> agreement on the order of ZooKeeper transactions. That is, all followers 
> receive proposals, acknowledge them, and receive commit messages from the 
> leader. A leader issues commit messages once it receives acknowledgments from 
> a quorum of followers. For cross-colo operation, it would be useful to have a 
> third role: observer. Using Paxos terminology, observers are similar to 
> learners. An observer does not participate actively in the agreement step of 
> the atomic broadcast protocol. Instead, it only commits proposals that have 
> been accepted by some quorum of followers.
> One simple solution to implement observers is to have the leader forward 
> commit messages not only to followers but also to observers, and have 
> observers apply transactions according to the order followers agreed upon. 
> In the current implementation of the protocol, however, commit messages do 
> not carry their corresponding transaction payload because all servers 
> other than the leader are followers, and followers receive such a payload 
> first through a proposal message. Consequently, just forwarding commit 
> messages as they currently are to an observer is not sufficient. We have a 
> couple of options:
> 1- Include the transaction payload in commit messages to observers;
> 2- Send proposals to observers as well.
> Number 2 is simpler to implement because it doesn't require changing the 
> protocol implementation, but it increases traffic slightly. The performance 
> impact due to such an increase might be insignificant, though.
> For scalability purposes, we may consider having followers also forward 
> commit messages to observers. With this option, observers can connect to 
> followers, and receive messages from followers. This choice is important to 
> avoid increasing the load on the leader with the number of observers. 
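
A minimal sketch of option 2 from the description above, for illustration only. 
The class and method names (ObserverSketch, addVoter, addObserver, propose, 
onAck) are hypothetical and are not taken from the attached patches. It 
assumes that a quorum is a strict majority of the voting servers and that the 
leader's own vote is registered as one of the voters. Proposals and commits go 
to every peer, but only acknowledgments from voters count toward the quorum.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not ZooKeeper's Leader class.
class ObserverSketch {
    private final Set<Long> voters = new HashSet<>();    // servers whose ACKs count
    private final Set<Long> observers = new HashSet<>(); // servers that only learn
    private final Set<Long> acks = new HashSet<>();      // ACKs for the pending proposal
    private boolean committed = false;
    int proposalsSent = 0; // counters stand in for real network sends
    int commitsSent = 0;

    void addVoter(long id) { voters.add(id); }
    void addObserver(long id) { observers.add(id); }

    // Option 2: the proposal (which carries the payload) goes to every peer.
    void propose() {
        acks.clear();
        committed = false;
        proposalsSent += voters.size() + observers.size();
    }

    // Returns true only for the ACK that completes a quorum of voters.
    boolean onAck(long from) {
        if (committed || !voters.contains(from)) {
            return false; // ACKs from observers or unknown servers are ignored
        }
        acks.add(from);
        if (2 * acks.size() > voters.size()) {
            committed = true;
            // The commit needs no payload: every peer already holds the proposal.
            commitsSent += voters.size() + observers.size();
            return true;
        }
        return false;
    }
}
```

With three voters and two observers, the second voter ACK triggers the commit, 
and the observers add two messages per phase without changing the quorum size.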

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-368) Observers

2009-07-20 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733410#action_12733410
 ] 

Benjamin Reed commented on ZOOKEEPER-368:
-

henry, i was thinking the other day that an observer is very similar to a 
follower in a flexible quorum with 0 weight. actually the more i thought about 
it, the more i realized that it should be the same. a follower with 0 weight 
really should not send ACKs back and then it would be an observer. it turns out 
that there is a comment in ZOOKEEPER-29 that makes this observation as well. in 
that issue the differences that flavio points out are no longer relevant, i 
think. what do you think?
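
A small sketch of the idea above, under stated assumptions: a flat weighted 
majority check in which a server with weight 0 can never change the outcome, 
so it behaves like an observer as far as quorum counting goes. The names 
(WeightedQuorumSketch, containsQuorum, weights, ackSet) are hypothetical and 
this is not the flexible quorum code in ZooKeeper.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a weighted majority check.
class WeightedQuorumSketch {
    // weights maps server id -> weight; ackSet holds the ids that acknowledged.
    static boolean containsQuorum(Map<Long, Long> weights, Set<Long> ackSet) {
        long total = 0;
        long acked = 0;
        for (Map.Entry<Long, Long> e : weights.entrySet()) {
            total += e.getValue();
            if (ackSet.contains(e.getKey())) {
                acked += e.getValue(); // a weight of 0 adds nothing here
            }
        }
        // Strict majority of the total weight.
        return 2 * acked > total;
    }
}
```

Since a zero-weight ACK contributes nothing to the sum, skipping the ACK 
altogether leaves the result unchanged.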





[jira] Updated: (ZOOKEEPER-470) include unistd.h for sleep() in c tests

2009-07-20 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-470:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

I just committed this. thanks chris.

> include unistd.h for sleep() in c tests
> ---
>
> Key: ZOOKEEPER-470
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-470
> Project: Zookeeper
> Issue Type: Bug
> Components: c client
> Affects Versions: 3.2.0
> Reporter: Chris Darroch
> Assignee: Chris Darroch
> Priority: Minor
> Fix For: 3.2.1, 3.3.0
>
> Attachments: ZOOKEEPER-470.patch
>
>
> Include unistd.h for sleep() calls in C tests to ensure successful 
> compilation on some platforms.




[jira] Commented: (ZOOKEEPER-460) bad testRetry in cppunit tests (hudson failure)

2009-07-20 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733312#action_12733312
 ] 

Mahadev konar commented on ZOOKEEPER-460:
-

the tests are failing because the servers are not able to start for the 
cppunit tests.  the following is the exception on servers run via the cppunit 
tests - 

{code}
[CLOVER] FATAL ERROR: Clover could not be initialised. Are you sure you have Clover in the runtime classpath? (class java.lang.NoClassDefFoundError:com_cenqua_clover/CloverVersionInfo)
Exception in thread "main" java.lang.NoClassDefFoundError: com_cenqua_clover/CoverageRecorder
    at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:48)
Caused by: java.lang.ClassNotFoundException: com_cenqua_clover.CoverageRecorder
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader
{code}


> bad testRetry in cppunit tests (hudson failure)
> ---
>
> Key: ZOOKEEPER-460
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-460
> Project: Zookeeper
> Issue Type: Bug
> Components: c client, tests
> Reporter: Patrick Hunt
> Assignee: Henry Robinson
> Fix For: 3.2.1, 3.3.0
>
>
> the following code failed on hudson
> http://hudson.zones.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/371/
>   watchctx_t ctx1, ctx2;
>   zhandle_t *zk1 = createClient(&ctx1);
>   CPPUNIT_ASSERT_EQUAL(true, ctx1.waitForConnected(zk1));
>   zhandle_t *zk2 = createClient(&ctx2);
>   zookeeper_close(zk1);
>   CPPUNIT_ASSERT_EQUAL(true, ctx2.waitForConnected(zk2));
> there's a problem with this test: it assumes that close(1) can be called 
> before createClient(2) gets connected.
> this is not correct: createClient is an async call and in some cases the 
> connection can be established before
> createClient returns.
> this shows a failure in this case because client1 was created, then client2 
> attempted to connect
> but failed due to this on the server (max conn exceeded):
> sprintf(cmd, "export ZKMAXCNXNS=1;%s startClean %s", ZKSERVER_CMD, 
> getHostPorts());
> conn 2 failed and therefore the following assert eventually failed.
> this code should not assume that close(1) will beat connect(2)
> Henry can you take a look?




[jira] Created: (ZOOKEEPER-481) Add lastMessageSent to QuorumCnxManager

2009-07-20 Thread Flavio Paiva Junqueira (JIRA)
Add lastMessageSent to QuorumCnxManager
---

 Key: ZOOKEEPER-481
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-481
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Reporter: Flavio Paiva Junqueira
 Attachments: ZOOKEEPER-481.patch

Currently we rely on TCP for reliable delivery of FLE messages. However, as we 
concurrently drop and create new connections, it is possible that a message is 
sent but never received. With this patch, the cnx manager keeps a list of the 
last messages sent, and resends the last one sent. Receiving multiple copies 
is harmless. 
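
A rough sketch of the idea, not the attached patch. It assumes one remembered 
message per destination server and a resend whenever a connection to that 
server is re-established. The names (LastMessageSketch, send, onReconnect, 
wire) are hypothetical, and the wire list stands in for the real socket.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the QuorumCnxManager implementation.
class LastMessageSketch {
    // Last message handed to the connection for each destination server id.
    private final Map<Long, String> lastMessageSent = new HashMap<>();
    // Stand-in for the network: records everything that was transmitted.
    final List<String> wire = new ArrayList<>();

    void send(long sid, String msg) {
        lastMessageSent.put(sid, msg); // remember it before transmitting
        transmit(sid, msg);
    }

    // Called when a connection to sid has been re-established.
    void onReconnect(long sid) {
        String last = lastMessageSent.get(sid);
        if (last != null) {
            // The receiver may see this message twice; duplicates are harmless.
            transmit(sid, last);
        }
    }

    private void transmit(long sid, String msg) {
        wire.add(sid + ":" + msg);
    }
}
```

Resending only the most recent message relies on the receiver tolerating 
duplicates, which is the property the description states.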




[jira] Updated: (ZOOKEEPER-481) Add lastMessageSent to QuorumCnxManager

2009-07-20 Thread Flavio Paiva Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Paiva Junqueira updated ZOOKEEPER-481:
-

Attachment: ZOOKEEPER-481.patch

> Add lastMessageSent to QuorumCnxManager
> ---
>
> Key: ZOOKEEPER-481
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-481
> Project: Zookeeper
> Issue Type: Bug
> Components: leaderElection
> Reporter: Flavio Paiva Junqueira
> Attachments: ZOOKEEPER-481.patch
>
>
> Currently we rely on TCP for reliable delivery of FLE messages. However, as 
> we concurrently drop and create new connections, it is possible that a 
> message is sent but never received. With this patch, the cnx manager keeps a 
> list of the last messages sent, and resends the last one sent. Receiving 
> multiple copies is harmless. 




[jira] Updated: (ZOOKEEPER-480) FLE should perform leader check when node is not leading and add vote of follower

2009-07-20 Thread Flavio Paiva Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Paiva Junqueira updated ZOOKEEPER-480:
-

Attachment: ZOOKEEPER-480.patch

> FLE should perform leader check when node is not leading and add vote of 
> follower
> -
>
> Key: ZOOKEEPER-480
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-480
> Project: Zookeeper
> Issue Type: Bug
> Reporter: Flavio Paiva Junqueira
> Fix For: 3.2.1
>
> Attachments: ZOOKEEPER-480.patch
>
>
> As a server may join leader election while others have already elected a 
> leader, it is necessary that a server handles some special cases of leader 
> election when notifications are from servers that are either LEADING or 
> FOLLOWING. In such special cases, we check if we have received a message from 
> the leader to declare a leader elected. This check does not consider the case 
> that the process performing the check might be a recently elected leader, and 
> consequently the check fails.
> This patch also adds a new case, which corresponds to adding a vote to 
> recvset when the notification is from a process LEADING or FOLLOWING. This 
> fixes the case raised in ZOOKEEPER-475.




[jira] Created: (ZOOKEEPER-480) FLE should perform leader check when node is not leading and add vote of follower

2009-07-20 Thread Flavio Paiva Junqueira (JIRA)
FLE should perform leader check when node is not leading and add vote of 
follower
-

 Key: ZOOKEEPER-480
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-480
 Project: Zookeeper
  Issue Type: Bug
Reporter: Flavio Paiva Junqueira
 Fix For: 3.2.1


As a server may join leader election while others have already elected a 
leader, it is necessary that a server handles some special cases of leader 
election when notifications are from servers that are either LEADING or 
FOLLOWING. In such special cases, we check if we have received a message from 
the leader to declare a leader elected. This check does not consider the case 
that the process performing the check might be a recently elected leader, and 
consequently the check fails.

This patch also adds a new case, which corresponds to adding a vote to recvset 
when the notification is from a process LEADING or FOLLOWING. This fixes the 
case raised in ZOOKEEPER-475.
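
One possible reading of the leader check described in the first paragraph, 
sketched under assumptions. The names (LeaderCheckSketch, checkLeader, 
heardFrom, selfId) are hypothetical, and the actual patch may differ. The 
point is that a server that believes it was just elected cannot have received 
a message from itself, so the check only requires a LEADING notification when 
the proposed leader is some other server.

```java
import java.util.Map;

// Hypothetical sketch of the leader check, not the FLE patch itself.
class LeaderCheckSketch {
    enum ServerState { LOOKING, FOLLOWING, LEADING }

    // heardFrom maps server id -> state reported in its latest notification.
    static boolean checkLeader(Map<Long, ServerState> heardFrom,
                               long leader, long selfId) {
        if (leader == selfId) {
            // We are the proposed leader: there is no message from
            // ourselves to wait for, so the check passes.
            return true;
        }
        // Otherwise the proposed leader must have told us it is LEADING.
        return heardFrom.get(leader) == ServerState.LEADING;
    }
}
```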




[jira] Updated: (ZOOKEEPER-479) QuorumHierarchical does not count groups correctly

2009-07-20 Thread Flavio Paiva Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Paiva Junqueira updated ZOOKEEPER-479:
-

Attachment: ZOOKEEPER-479.patch

This patch fixes this issue and adds one more test.

> QuorumHierarchical does not count groups correctly
> --
>
> Key: ZOOKEEPER-479
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-479
> Project: Zookeeper
> Issue Type: Bug
> Components: quorum
> Reporter: Flavio Paiva Junqueira
> Assignee: Flavio Paiva Junqueira
> Fix For: 3.2.1
>
> Attachments: ZOOKEEPER-479.patch
>
>
> QuorumHierarchical::containsQuorum should not verify if all groups 
> represented in the input set have more than half of the total weight. 
> Instead, it should check only for an overall majority of groups. 




[jira] Created: (ZOOKEEPER-479) QuorumHierarchical does not count groups correctly

2009-07-20 Thread Flavio Paiva Junqueira (JIRA)
QuorumHierarchical does not count groups correctly
--

 Key: ZOOKEEPER-479
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-479
 Project: Zookeeper
  Issue Type: Bug
  Components: quorum
Reporter: Flavio Paiva Junqueira
Assignee: Flavio Paiva Junqueira
 Fix For: 3.2.1


QuorumHierarchical::containsQuorum should not verify if all groups represented 
in the input set have more than half of the total weight. Instead, it should 
check only for an overall majority of groups. 
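
A sketch of the intended check as described above, not the attached patch. 
The names (HierarchicalQuorumSketch, serverGroup, serverWeight, ackSet) are 
hypothetical. It assumes that a group is won when the set holds more than half 
of that group's weight, that servers without an explicit weight default to 1, 
and that groups whose total weight is 0 are left out of the group count.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a hierarchical quorum check.
class HierarchicalQuorumSketch {
    static boolean containsQuorum(Map<Long, Long> serverGroup,
                                  Map<Long, Long> serverWeight,
                                  Set<Long> ackSet) {
        Map<Long, Long> groupTotal = new HashMap<>(); // group id -> total weight
        Map<Long, Long> groupAcked = new HashMap<>(); // group id -> weight in ackSet
        for (Map.Entry<Long, Long> e : serverGroup.entrySet()) {
            long sid = e.getKey();
            long gid = e.getValue();
            long w = serverWeight.getOrDefault(sid, 1L);
            groupTotal.merge(gid, w, Long::sum);
            if (ackSet.contains(sid)) {
                groupAcked.merge(gid, w, Long::sum);
            }
        }
        int numGroups = 0;
        int wonGroups = 0;
        for (Map.Entry<Long, Long> e : groupTotal.entrySet()) {
            if (e.getValue() == 0) {
                continue; // zero-weight groups do not vote
            }
            numGroups++;
            long acked = groupAcked.getOrDefault(e.getKey(), 0L);
            if (2 * acked > e.getValue()) {
                wonGroups++; // more than half of this group's weight
            }
        }
        // A represented group without its own majority does not veto the
        // result; only the count of won groups matters.
        return 2 * wonGroups > numGroups;
    }
}
```

For three groups of three equal-weight servers, a set with two servers from 
each of two groups plus one server from the third is a quorum under this 
check, even though the third group is represented without a majority.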
