[jira] Updated: (ZOOKEEPER-530) Memory corruption: Zookeeper c client IPv6 implementation does not honor struct sockaddr_in6 size

2009-10-19 Thread Isabel Drost (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Drost updated ZOOKEEPER-530:
---

Attachment: ZOOKEEPER-530.patch

Changed error code and error message.

> Memory corruption: Zookeeper c client IPv6 implementation does not honor 
> struct sockaddr_in6 size
> -
>
> Key: ZOOKEEPER-530
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-530
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.2.0, 3.2.1
>Reporter: Isabel Drost
>Assignee: Isabel Drost
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-530.patch, ZOOKEEPER-530.patch
>
>
> I tried to run the ZooKeeper C client on a machine with IPv6 enabled. When 
> connecting to the IPv6 address, connect(...) gave an "Address family not 
> supported by protocol" error. The reason was that a few lines earlier, the 
> socket was opened with PF_INET instead of PF_INET6. Changing that as 
> follows:
> {code}
> if (zh->addrs[zh->connect_index].sa_family == AF_INET) {
>     zh->fd = socket(PF_INET, SOCK_STREAM, 0);
> } else {
>     zh->fd = socket(PF_INET6, SOCK_STREAM, 0);
> }
> {code}
> turned the error message into "Invalid argument". 
> When printing out sizeof(struct sockaddr), sizeof(struct sockaddr_in), and 
> sizeof(struct sockaddr_in6), I got sockaddr: 16, sockaddr_in: 16, and 
> sockaddr_in6: 28. 
> So in the code calling 
> {code}
> connect(zh->fd, &zh->addrs[zh->connect_index], sizeof(struct sockaddr_in));
> {code}
> the address_len parameter is too small.
> The same applies to how IPv6 addresses are handled in the function 
> getaddrs(zhandle_t *zh).
> (Big Thanks+kiss to Thilo Fromm for helping me debug this.)
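For reference, a minimal sketch of the direction the report points at, assuming 
the client keeps its addresses in struct sockaddr_storage (which the attached 
patch's use of ss_family suggests); connect_to() is a hypothetical helper, not 
a function from the ZooKeeper C client:

{code}
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Illustrative only: pick both the protocol family and the address
 * length from the stored address instead of hard-coding IPv4. */
static int connect_to(const struct sockaddr_storage *addr)
{
    /* sockaddr_in is 16 bytes but sockaddr_in6 is 28, so the length
     * passed to connect() must match the family of the address. */
    socklen_t addrlen = (addr->ss_family == AF_INET6)
                            ? sizeof(struct sockaddr_in6)
                            : sizeof(struct sockaddr_in);

    /* The family given to socket() must match as well; hard-coding
     * PF_INET is what produced "Address family not supported by
     * protocol" for IPv6 peers. */
    int fd = socket(addr->ss_family, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (const struct sockaddr *)addr, addrlen) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
{code}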

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-512) FLE election fails to elect leader

2009-10-19 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767645#action_12767645
 ] 

Mahadev konar commented on ZOOKEEPER-512:
-

Flavio, the patch looks good.

The following logging could be improved to include which quorum server it 
corresponds to, both for unit testing and in general. 

{code}
LOG.info("Leaving listener");
if (!shutdown)
    LOG.fatal("As I'm leaving the listener thread, I won't be able to participate in leader election any longer... digital life sucks");
{code}

Also, I can see the hatred for digital life :), but a more useful log message 
would be better!

Also, I am having trouble understanding this:

{code}
synchronized void connectOne(long sid) {
    if (senderWorkerMap.get(sid) == null) {
        InetSocketAddress electionAddr;
        if (self.quorumPeers.containsKey(sid))
            electionAddr = self.quorumPeers.get(sid).electionAddr;
        else {
            LOG.warn("Invalid server id: " + sid);
            return;
        }
{code}

You mentioned above that connectOne was being called with a sid that wasn't in 
the map. Is that possible?

> FLE election fails to elect leader
> --
>
> Key: ZOOKEEPER-512
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-512
> Project: Zookeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.2.0
>Reporter: Patrick Hunt
>Assignee: Flavio Paiva Junqueira
>Priority: Blocker
> Fix For: 3.3.0
>
> Attachments: jst.txt, log3_debug.tar.gz, logs.tar.gz, logs2.tar.gz, 
> t5_aj.tar.gz, ZOOKEEPER-512.patch, ZOOKEEPER-512.patch, ZOOKEEPER-512.patch, 
> ZOOKEEPER-512.patch
>
>
> I was doing some fault injection testing of 3.2.1 with the ZOOKEEPER-508 patch 
> applied and noticed that after some time the ensemble failed to re-elect a 
> leader.
> See the attached log files (5-member ensemble; typically server 5 is the 
> leader). Notice that after 16:23:50,525 no quorum is formed, even after 20 
> minutes elapse with no quorum.
> Environment:
> I was doing fault injection testing using AspectJ. The faults are injected 
> into SocketChannel read/write; I throw exceptions randomly at a 1/200 ratio 
> (rand.nextFloat() <= .005 => throw IOException).
> You can see when a fault is injected in the log via:
> 2009-08-19 16:57:09,568 - INFO  [Thread-74:readrequestfailsintermitten...@38] 
> - READPACKET FORCED FAIL
> vs a read/write that didn't force fail:
> 2009-08-19 16:57:09,568 - INFO  [Thread-74:readrequestfailsintermitten...@41] 
> - READPACKET OK
> Otherwise standard code/config (straight FLE quorum with 5 members).
> Also see the attached jstack trace for one of the servers. Notice in 
> particular that the number of SendWorkers != the number of RecvWorkers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-530) Memory corruption: Zookeeper c client IPv6 implementation does not honor struct sockaddr_in6 size

2009-10-19 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767637#action_12767637
 ] 

Mahadev konar commented on ZOOKEEPER-530:
-

Isabel, the patch looks good. Thanks for the fix. 

One thing I noticed in the patch is this:

{code}
zh->fd = socket(zh->addrs[zh->connect_index].ss_family, SOCK_STREAM, 0);
if (zh->fd < 0) {
    return api_epilog(zh, handle_socket_error_msg(zh, __LINE__,
            ZCONNECTIONLOSS, "connect() call failed"));
}
{code}

There is a new error check, which is a good thing, but can the error code be 
changed to ZSYSTEMERROR and the log message to "socket() call failed"?
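That is, roughly the following - a sketch of the requested change only, reusing 
the api_epilog/handle_socket_error_msg calls from the quoted block, not the 
committed patch:

{code}
zh->fd = socket(zh->addrs[zh->connect_index].ss_family, SOCK_STREAM, 0);
if (zh->fd < 0) {
    /* The failing call here is socket(), not connect(), so report a
     * system error with a message naming the right call. */
    return api_epilog(zh, handle_socket_error_msg(zh, __LINE__,
            ZSYSTEMERROR, "socket() call failed"));
}
{code}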




> Memory corruption: Zookeeper c client IPv6 implementation does not honor 
> struct sockaddr_in6 size
> -
>
> Key: ZOOKEEPER-530
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-530
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.2.0, 3.2.1
>Reporter: Isabel Drost
>Assignee: Isabel Drost
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-530.patch
>
>
> I tried to run the ZooKeeper C client on a machine with IPv6 enabled. When 
> connecting to the IPv6 address, connect(...) gave an "Address family not 
> supported by protocol" error. The reason was that a few lines earlier, the 
> socket was opened with PF_INET instead of PF_INET6. Changing that as 
> follows:
> {code}
> if (zh->addrs[zh->connect_index].sa_family == AF_INET) {
>     zh->fd = socket(PF_INET, SOCK_STREAM, 0);
> } else {
>     zh->fd = socket(PF_INET6, SOCK_STREAM, 0);
> }
> {code}
> turned the error message into "Invalid argument". 
> When printing out sizeof(struct sockaddr), sizeof(struct sockaddr_in), and 
> sizeof(struct sockaddr_in6), I got sockaddr: 16, sockaddr_in: 16, and 
> sockaddr_in6: 28. 
> So in the code calling 
> {code}
> connect(zh->fd, &zh->addrs[zh->connect_index], sizeof(struct sockaddr_in));
> {code}
> the address_len parameter is too small.
> The same applies to how IPv6 addresses are handled in the function 
> getaddrs(zhandle_t *zh).
> (Big Thanks+kiss to Thilo Fromm for helping me debug this.)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Maven POM

2009-10-19 Thread Patrick Hunt

3.3.0 has support for it:
https://issues.apache.org/jira/browse/ZOOKEEPER-529

but there are still some outstanding issues to be addressed:
https://issues.apache.org/jira/browse/ZOOKEEPER-537
https://issues.apache.org/jira/browse/ZOOKEEPER-224

Patrick


Alan D. Cabrera wrote:
I was looking for Zookeeper in the maven repo and was unable to find 
it.  Has it been published to Maven Central?



Regards,
Alan



Maven POM

2009-10-19 Thread Alan D. Cabrera
I was looking for Zookeeper in the maven repo and was unable to find  
it.  Has it been published to Maven Central?



Regards,
Alan



[jira] Updated: (ZOOKEEPER-368) Observers

2009-10-19 Thread Henry Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-368:
-

Attachment: observers sync benchmark.png

> Observers
> -
>
> Key: ZOOKEEPER-368
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: quorum
>Reporter: Flavio Paiva Junqueira
>Assignee: Henry Robinson
> Attachments: obs-refactor.patch, observer-refactor.patch, observers 
> sync benchmark.png, observers.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch
>
>
> Currently, all servers of an ensemble participate actively in reaching 
> agreement on the order of ZooKeeper transactions. That is, all followers 
> receive proposals, acknowledge them, and receive commit messages from the 
> leader. A leader issues commit messages once it receives acknowledgments from 
> a quorum of followers. For cross-colo operation, it would be useful to have a 
> third role: observer. Using Paxos terminology, observers are similar to 
> learners. An observer does not participate actively in the agreement step of 
> the atomic broadcast protocol. Instead, it only commits proposals that have 
> been accepted by some quorum of followers.
> One simple solution to implement observers is to have the leader forward 
> commit messages not only to followers but also to observers, and have 
> observers apply transactions in the order the followers agreed upon. 
> In the current implementation of the protocol, however, commit messages do 
> not carry their corresponding transaction payload, because all servers other 
> than the leader are followers, and followers first receive that payload 
> through a proposal message. Consequently, just forwarding commit messages as 
> they currently are to an observer is not sufficient. We have a couple of 
> options:
> 1- Include the transaction payload in commit messages to observers;
> 2- Send proposals to observers as well.
> Number 2 is simpler to implement because it doesn't require changing the 
> protocol implementation, but it increases traffic slightly. The performance 
> impact due to such an increase might be insignificant, though.
> For scalability purposes, we may consider having followers also forwarding 
> commit messages to observers. With this option, observers can connect to 
> followers, and receive messages from followers. This choice is important to 
> avoid increasing the load on the leader with the number of observers. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-368) Observers

2009-10-19 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767539#action_12767539
 ] 

Henry Robinson commented on ZOOKEEPER-368:
--

Flavio - 

Throughput is low - but we should be looking at the relative numbers, not the 
absolute values (I get similar numbers running the current trunk in the same 
configuration). One reason throughput might be arbitrarily low is that I'm 
running these benchmarks against a single machine, so I might be hitting disk 
bottlenecks due to contention for the logs. 

These numbers were for synchronous create operations, issued from a single 
client. So read throughput would at best stay constant since the client can't 
take advantage of the parallelism offered by multiple observers. I've also 
benchmarked reads and mixed workloads (the most interesting, typically). Reads, 
as expected, are fairly constant in throughput. Mixed workloads are better in 
heterogeneous clusters, again as you would expect. 

These are just indicative numbers to ensure that we're on the right track :)

Mahadev - I ran the experiment you suggested; I'll attach the chart results 
below. 

Henry

> Observers
> -
>
> Key: ZOOKEEPER-368
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: quorum
>Reporter: Flavio Paiva Junqueira
>Assignee: Henry Robinson
> Attachments: obs-refactor.patch, observer-refactor.patch, observers 
> sync benchmark.png, observers.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch
>
>
> Currently, all servers of an ensemble participate actively in reaching 
> agreement on the order of ZooKeeper transactions. That is, all followers 
> receive proposals, acknowledge them, and receive commit messages from the 
> leader. A leader issues commit messages once it receives acknowledgments from 
> a quorum of followers. For cross-colo operation, it would be useful to have a 
> third role: observer. Using Paxos terminology, observers are similar to 
> learners. An observer does not participate actively in the agreement step of 
> the atomic broadcast protocol. Instead, it only commits proposals that have 
> been accepted by some quorum of followers.
> One simple solution to implement observers is to have the leader forward 
> commit messages not only to followers but also to observers, and have 
> observers apply transactions in the order the followers agreed upon. 
> In the current implementation of the protocol, however, commit messages do 
> not carry their corresponding transaction payload, because all servers other 
> than the leader are followers, and followers first receive that payload 
> through a proposal message. Consequently, just forwarding commit messages as 
> they currently are to an observer is not sufficient. We have a couple of 
> options:
> 1- Include the transaction payload in commit messages to observers;
> 2- Send proposals to observers as well.
> Number 2 is simpler to implement because it doesn't require changing the 
> protocol implementation, but it increases traffic slightly. The performance 
> impact due to such an increase might be insignificant, though.
> For scalability purposes, we may consider having followers also forwarding 
> commit messages to observers. With this option, observers can connect to 
> followers, and receive messages from followers. This choice is important to 
> avoid increasing the load on the leader with the number of observers. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-368) Observers

2009-10-19 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767519#action_12767519
 ] 

Henry Robinson commented on ZOOKEEPER-368:
--

I have had to change the Leader Election code just a little bit to support 
Observers, and I wanted to run the decisions past everyone.

Observers don't participate in Leader Elections in the sense that they don't 
cast votes. However, they need to learn the results. The way I do this at the 
moment is to force Observers always to use LeaderElection as their election 
algorithm (and disable vote casting for them). So essentially they simply query 
the rest of the ensemble for a quorum of votes. This works well, and has the 
advantage of not needing to teach all LE algorithms about observers. The only 
change I make to the rest of the code is to always start a responder thread, no 
matter what the prevailing election type on the follower is, so that followers 
will always respond to queries from observers.

The correctness of this relies on the fact that a leader must always be 
supported by a quorum, no matter what protocol was used to elect it in the 
first place. So it's always correct to believe that a leader supported by a 
quorum is actually the leader.

Does this sound right? Are there any gotchas about always running the responder 
thread?

Henry


> Observers
> -
>
> Key: ZOOKEEPER-368
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: quorum
>Reporter: Flavio Paiva Junqueira
>Assignee: Henry Robinson
> Attachments: obs-refactor.patch, observer-refactor.patch, 
> observers.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch
>
>
> Currently, all servers of an ensemble participate actively in reaching 
> agreement on the order of ZooKeeper transactions. That is, all followers 
> receive proposals, acknowledge them, and receive commit messages from the 
> leader. A leader issues commit messages once it receives acknowledgments from 
> a quorum of followers. For cross-colo operation, it would be useful to have a 
> third role: observer. Using Paxos terminology, observers are similar to 
> learners. An observer does not participate actively in the agreement step of 
> the atomic broadcast protocol. Instead, it only commits proposals that have 
> been accepted by some quorum of followers.
> One simple solution to implement observers is to have the leader forward 
> commit messages not only to followers but also to observers, and have 
> observers apply transactions in the order the followers agreed upon. 
> In the current implementation of the protocol, however, commit messages do 
> not carry their corresponding transaction payload, because all servers other 
> than the leader are followers, and followers first receive that payload 
> through a proposal message. Consequently, just forwarding commit messages as 
> they currently are to an observer is not sufficient. We have a couple of 
> options:
> 1- Include the transaction payload in commit messages to observers;
> 2- Send proposals to observers as well.
> Number 2 is simpler to implement because it doesn't require changing the 
> protocol implementation, but it increases traffic slightly. The performance 
> impact due to such an increase might be insignificant, though.
> For scalability purposes, we may consider having followers also forwarding 
> commit messages to observers. With this option, observers can connect to 
> followers, and receive messages from followers. This choice is important to 
> avoid increasing the load on the leader with the number of observers. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-549) Refactor Followers and related classes into a Peer->Follower hierarchy in preparation for Observers

2009-10-19 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-549:
---

Status: Open  (was: Patch Available)

> Refactor Followers and related classes into a Peer->Follower hierarchy in 
> preparation for Observers
> ---
>
> Key: ZOOKEEPER-549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-549
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: quorum, server
>Affects Versions: 3.2.1
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-549.patch, ZOOKEEPER-549.patch
>
>
> For the Observers patch (ZOOKEEPER-368), a lot of functionality is shared 
> between Followers and Observers. To avoid copying code, it makes sense to 
> push the common code into a parent Peer class and specialise it for Followers 
> and Observers. At the same time, some of the lengthier methods in Follower 
> can be broken up to make the code more readable. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-554) zkpython can segfault when statting a deleted node

2009-10-19 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-554:
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

+1, looks good. Thanks, Henry!

> zkpython can segfault when statting a deleted node
> --
>
> Key: ZOOKEEPER-554
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-554
> Project: Zookeeper
>  Issue Type: Bug
>  Components: contrib-bindings
>Reporter: Henry Robinson
>Assignee: Henry Robinson
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-554.patch
>
>
> C client returns NULL for stat object for deleted nodes. zookeeper.c blindly 
> dereferences it. Segfault. 
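For context, the C client's stat completion has the signature void (*)(int rc, 
const struct Stat *stat, const void *data), and stat is only valid when rc is 
ZOK. A minimal sketch of the guard a binding needs - illustrative only; 
stat_completion() here is a hypothetical callback, not the zkpython code:

{code}
#include <zookeeper/zookeeper.h>

static void stat_completion(int rc, const struct Stat *stat, const void *data)
{
    (void)data;
    if (rc != ZOK || stat == NULL) {
        /* e.g. ZNONODE when the node was deleted: dereferencing
         * stat on this path is the reported segfault. */
        return;
    }
    /* Safe only on the success path. */
    long version = stat->version;
    (void)version;
}
{code}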

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-368) Observers

2009-10-19 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767245#action_12767245
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-368:
--

Henry, are these throughput values for synchronous or asynchronous operations? 
Throughput sounds pretty low in any case. 

Also, from your description, it sounds like you're testing only with write 
operations (create). I would expect write throughput to be independent of the 
number of observers (but dependent upon the number of followers). Read 
throughput should increase with the number of observers.

> Observers
> -
>
> Key: ZOOKEEPER-368
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
> Project: Zookeeper
>  Issue Type: New Feature
>  Components: quorum
>Reporter: Flavio Paiva Junqueira
>Assignee: Henry Robinson
> Attachments: obs-refactor.patch, observer-refactor.patch, 
> observers.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
> ZOOKEEPER-368.patch
>
>
> Currently, all servers of an ensemble participate actively in reaching 
> agreement on the order of ZooKeeper transactions. That is, all followers 
> receive proposals, acknowledge them, and receive commit messages from the 
> leader. A leader issues commit messages once it receives acknowledgments from 
> a quorum of followers. For cross-colo operation, it would be useful to have a 
> third role: observer. Using Paxos terminology, observers are similar to 
> learners. An observer does not participate actively in the agreement step of 
> the atomic broadcast protocol. Instead, it only commits proposals that have 
> been accepted by some quorum of followers.
> One simple solution to implement observers is to have the leader forwarding 
> commit messages not only to followers but also to observers, and have 
> observers applying transactions according to the order followers agreed upon. 
> In the current implementation of the protocol, however, commit messages do 
> not carry their corresponding transaction payload because all servers 
> different from the leader are followers and followers receive such a payload 
> first through a proposal message. Just forwarding commit messages as they 
> currently are to an observer consequently is not sufficient. We have a couple 
> of options:
> 1- Include the transaction payload along in commit messages to observers;
> 2- Send proposals to observers as well.
> Number 2 is simpler to implement because it doesn't require changing the 
> protocol implementation, but it increases traffic slightly. The performance 
> impact due to such an increase might be insignificant, though.
> For scalability purposes, we may consider having followers also forwarding 
> commit messages to observers. With this option, observers can connect to 
> followers, and receive messages from followers. This choice is important to 
> avoid increasing the load on the leader with the number of observers. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.