[jira] Commented: (ZOOKEEPER-744) Add monitoring four-letter word

2010-06-03 Thread Savu Andrei (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875021#action_12875021
 ] 

Savu Andrei commented on ZOOKEEPER-744:
---

Thanks for reviewing. I will resubmit the patch today.




-- 
Savu Andrei

Website: http://www.andreisavu.ro/



 Add monitoring four-letter word
 ---

 Key: ZOOKEEPER-744
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-744
 Project: Zookeeper
  Issue Type: New Feature
  Components: server
Affects Versions: 3.4.0
Reporter: Travis Crawford
Assignee: Savu Andrei
 Fix For: 3.4.0

 Attachments: zk-ganglia.png, ZOOKEEPER-744.patch, ZOOKEEPER-744.patch


 Filing a feature request based on a zookeeper-user discussion.
 Zookeeper should have a new four-letter word that returns key-value pairs 
 appropriate for importing into a monitoring system (such as Ganglia, which has 
 a large installed base).
 This command should initially export the following:
 (a) Count of instances in the ensemble.
 (b) Count of up-to-date instances in the ensemble.
 But it should be designed so that additional data can be added in the future. 
 For example, the output could describe each statistic in a comment, then print 
 a "key value" line (key and value separated by a single space character):
 
 # Total number of instances in the ensemble
 zk_ensemble_instances_total 5
 # Number of instances currently participating in the quorum.
 zk_ensemble_instances_active 4
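 
 As a client-side sketch only (the command name "mntr" and the helper class below 
 are illustrative assumptions, not part of this proposal), a monitoring agent 
 could consume such output by sending the four-letter word over the client port 
 and splitting each non-comment line on the first whitespace:
 
 import java.io.BufferedReader;
 import java.io.InputStreamReader;
 import java.net.Socket;
 import java.util.HashMap;
 import java.util.Map;
 
 public class FourLetterWordPoller {
     // Illustrative sketch: assumes the new word is named "mntr" and emits
     // "key value" lines plus '#' comment lines, as in the example above.
     public static Map<String, String> poll(String host, int port) throws Exception {
         Socket sock = new Socket(host, port);
         try {
             sock.getOutputStream().write("mntr".getBytes());
             sock.getOutputStream().flush();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(sock.getInputStream()));
             Map<String, String> stats = new HashMap<String, String>();
             String line;
             while ((line = in.readLine()) != null) {
                 if (line.startsWith("#") || line.trim().isEmpty()) {
                     continue;                        // skip comments and blank lines
                 }
                 String[] kv = line.split("\\s+", 2); // key<space>value
                 if (kv.length == 2) {
                     stats.put(kv[0], kv[1]);
                 }
             }
             return stats;
         } finally {
             sock.close();
         }
     }
 }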
 
 From the mailing list:
 
 Date: Mon, 19 Apr 2010 12:10:44 -0700
 From: Patrick Hunt ph...@apache.org
 To: zookeeper-u...@hadoop.apache.org
 Subject: Re: Recovery issue - how to debug?
 On 04/19/2010 11:55 AM, Travis Crawford wrote:
  It would be a lot easier from the operations perspective if the leader
  explicitly published some health stats:
 
  (a) Count of instances in the ensemble.
  (b) Count of up-to-date instances in the ensemble.
 
  This would greatly simplify monitoring & alerting - when an instance
  falls behind one could configure their monitoring system to let
  someone know and take a look at the logs.
 That's a great idea. Please enter a JIRA for this - a new 4 letter word 
 and JMX support. It would also be a great starter project for someone 
 interested in becoming more familiar with the server code.
 Patrick
 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



enhance zookeeper lock function in concurrent condition

2010-06-03 Thread Joe Zou
Hi All:
Using ZooKeeper to build a distributed lock is a major use case. Currently the 
lock function is implemented as in the code below:

public void lock() throws InterruptedException, KeeperException {
    do {
        if (path == null) {
            path = zk.create(lockPrefix, null, acl, CreateMode.EPHEMERAL_SEQUENTIAL);
        }
        List<String> children = zk.getChildren(parentPath, false);
        if (isFirst(children, path)) {
            return;                              // our znode is first: lock acquired
        } else {
            final CountDownLatch latch = new CountDownLatch(1);
            // watch the child immediately before ours and wait until it goes away
            String nearestChild = findLastBefore(children, path);
            if (zk.exists(nearestChild, new Watcher() {
                    public void process(WatchedEvent event) {
                        latch.countDown();
                    }
                }) != null) {
                latch.await();
            } else {
                // acquire lock success
                return;
            }
        }
    } while (true);
}

In the highly concurrent case the lock node may accumulate a very large number of 
ephemeral child nodes, so getChildren() can exceed the packet size limit (4 MB by 
default) and also hurts performance. To avoid this, I plan to add a new interface, 
isFirst, to ZooKeeper. I don't know whether it is useful as a general-purpose API, 
but I do think it helps in the concurrent situation. Below is a snippet of the 
code change; the attachment contains the full listing.

public void lock() throws InterruptedException, KeeperException {
    do {
        if (path == null) {
            path = zk.create(lockPrefix, null, acl, CreateMode.EPHEMERAL_SEQUENTIAL);
        }
        final CountDownLatch latch = new CountDownLatch(1);
        // proposed server-side call: returns true if our znode is the first child
        // of parentPath, otherwise registers the watcher to fire when that changes
        if (!zk.isFirst(parentPath, path, type, new Watcher() {
                public void process(WatchedEvent event) {
                    latch.countDown();
                }
            })) {
            latch.await();
        } else {
            // acquire success
            return;
        }
    } while (true);
}

As we know, only the first node can acquire the lock, so when a child node is 
removed from the lock parent, the server needs to trigger the watcher to notify 
the node that is now first.

The second lock requirement is:
In our current project each save needs to acquire multiple locks. In a distributed 
environment this can easily cause deadlock or lock starvation. So we need a state 
lock: the lock node keeps multiple states and uses them to judge whether a node 
may acquire the lock or not. Example:
Client1: lock(id1, id2, id3)  ->  znode 01
Client2: lock(id2, id3)       ->  znode 02
Client3: lock(id4)            ->  znode 03

Client2 needs to wait until client1 releases its lock, but client3 can acquire its 
lock at once. This conflict-judging logic lives in the ZooKeeper server. We add a 
LockState interface:
public interface LockState {
    String PATH_SEPERATOR = "/";
    String PATH_DELIMIT = "|";

    // true if this lock request conflicts with the given lock state
    boolean isConflict(LockState state);

    // serialized form stored in the lock znode
    byte[] getBytes();
}

Any new lock strategy can be added by implementing the interface.
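
For illustration only (this class is hypothetical and not part of the attached 
diff), a multi-id lock state along the lines of the example above could implement 
isConflict as a set-intersection check:

import java.util.HashSet;
import java.util.Set;

// Hypothetical example, not from the attached diff: a lock over several ids
// conflicts with another lock whenever the two id sets overlap.
public class MultiIdLockState implements LockState {
    private final Set<String> ids;

    public MultiIdLockState(Set<String> ids) {
        this.ids = ids;
    }

    public boolean isConflict(LockState other) {
        if (!(other instanceof MultiIdLockState)) {
            return true; // unknown state type: be conservative
        }
        Set<String> common = new HashSet<String>(ids);
        common.retainAll(((MultiIdLockState) other).ids);
        return !common.isEmpty(); // conflict iff the id sets intersect
    }

    public byte[] getBytes() {
        // serialize as id1|id2|... using the interface's delimiter
        StringBuilder sb = new StringBuilder();
        for (String id : ids) {
            if (sb.length() > 0) sb.append(PATH_DELIMIT);
            sb.append(id);
        }
        return sb.toString().getBytes();
    }
}

With that check, lock(id1, id2, id3) conflicts with lock(id2, id3) but not with 
lock(id4), matching the example above.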

Attached is my code diff against 3.2.2 and some lock use cases.


Best Regards
Joe Zou



[jira] Updated: (ZOOKEEPER-775) A large scale pub/sub system

2010-06-03 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-775:


Status: Patch Available  (was: Open)

 A large scale pub/sub system
 

 Key: ZOOKEEPER-775
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-775
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib
Reporter: Benjamin Reed
Assignee: Benjamin Reed
 Fix For: 3.4.0

 Attachments: libs.zip, libs_2.zip, ZOOKEEPER-775.patch, 
 ZOOKEEPER-775_2.patch, ZOOKEEPER-775_3.patch


 we have developed a large scale pub/sub system based on ZooKeeper and 
 BookKeeper.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code

2010-06-03 Thread Benjamin Reed (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-767:


Status: Open  (was: Patch Available)

 Submitting Demo/Recipe Shared / Exclusive Lock Code
 ---

 Key: ZOOKEEPER-767
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-767
 Project: Zookeeper
  Issue Type: Improvement
  Components: recipes
Affects Versions: 3.3.0
Reporter: Sam Baskinger
Assignee: Sam Baskinger
Priority: Minor
 Fix For: 3.4.0

 Attachments: ZOOKEEPER-767.patch, ZOOKEEPER-767.patch


 Networked Insights would like to share-back some code for shared/exclusive 
 locking that we are using in our labs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-785) Zookeeper 3.3.1 shouldn't infinite loop if someone creates a server.0 line

2010-06-03 Thread Benjamin Reed (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875227#action_12875227
 ] 

Benjamin Reed commented on ZOOKEEPER-785:
-

+1 i think we should log the message as a warning rather than error since we 
completely recover from the situation. we may also want to log a warning for 2 
servers to indicate that failures will not be tolerated. (feel free to ignore 
both comments and commit the patch :)

  Zookeeper 3.3.1 shouldn't infinite loop if someone creates a server.0 line
 ---

 Key: ZOOKEEPER-785
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-785
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.3.1
 Environment: Tested in linux with a new jvm
Reporter: Alex Newman
Assignee: Patrick Hunt
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-785.patch


 The following config causes an infinite loop
 [zoo.cfg]
 tickTime=2000
 dataDir=/var/zookeeper/
 clientPort=2181
 initLimit=10
 syncLimit=5
 server.0=localhost:2888:3888
 Output:
 2010-06-01 16:20:32,471 - INFO [main:quorumpeerm...@119] - Starting quorum 
 peer
 2010-06-01 16:20:32,489 - INFO [main:nioservercnxn$fact...@143] - binding to 
 port 0.0.0.0/0.0.0.0:2181
 2010-06-01 16:20:32,504 - INFO [main:quorump...@818] - tickTime set to 2000
 2010-06-01 16:20:32,504 - INFO [main:quorump...@829] - minSessionTimeout set 
 to -1
 2010-06-01 16:20:32,505 - INFO [main:quorump...@840] - maxSessionTimeout set 
 to -1
 2010-06-01 16:20:32,505 - INFO [main:quorump...@855] - initLimit set to 10
 2010-06-01 16:20:32,526 - INFO [main:files...@82] - Reading snapshot 
 /var/zookeeper/version-2/snapshot.c
 2010-06-01 16:20:32,547 - INFO [Thread-1:quorumcnxmanager$liste...@436] - My 
 election bind port: 3888
 2010-06-01 16:20:32,554 - INFO 
 [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@620] - LOOKING
 2010-06-01 16:20:32,556 - INFO 
 [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@649] - New election. My 
 id = 0, Proposed zxid = 12
 2010-06-01 16:20:32,558 - INFO 
 [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@689] - Notification: 0, 
 12, 1, 0, LOOKING, LOOKING, 0
 2010-06-01 16:20:32,560 - WARN 
 [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@623] - Unexpected exception
 java.lang.NullPointerException
 at 
 org.apache.zookeeper.server.quorum.FastLeaderElection.totalOrderPredicate(FastLeaderElection.java:496)
 at 
 org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:709)
 at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:621)
 2010-06-01 16:20:32,560 - INFO 
 [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@620] - LOOKING
 2010-06-01 16:20:32,560 - INFO 
 [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@649] - New election. My 
 id = 0, Proposed zxid = 12
 2010-06-01 16:20:32,561 - INFO 
 [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@689] - Notification: 0, 
 12, 2, 0, LOOKING, LOOKING, 0
 2010-06-01 16:20:32,561 - WARN 
 [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@623] - Unexpected exception
 java.lang.NullPointerException
 at 
 org.apache.zookeeper.server.quorum.FastLeaderElection.totalOrderPredicate(FastLeaderElection.java:496)
 at 
 org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:709)
 at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:621)
 2010-06-01 16:20:32,561 - INFO 
 [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@620] - LOOKING
 2010-06-01 16:20:32,562 - INFO 
 [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@649] - New election. My 
 id = 0, Proposed zxid = 12
 2010-06-01 16:20:32,562 - INFO 
 [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@689] - Notification: 0, 
 12, 3, 0, LOOKING, LOOKING, 0
 2010-06-01 16:20:32,562 - WARN 
 [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@623] - Unexpected exception
 java.lang.NullPointerException
 Things like HBase require that the zookeeper servers be listed in the 
 zoo.cfg. This is a bug on their part, but zookeeper shouldn't null pointer in 
 a loop though.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-785) Zookeeper 3.3.1 shouldn't infinite loop if someone creates a server.0 line

2010-06-03 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875229#action_12875229
 ] 

Patrick Hunt commented on ZOOKEEPER-785:


imo ERROR is correct for the 1 server case as there's no way that could be 
right and we want to bring it to the operator's attention - they need to fix it.


  Zookeeper 3.3.1 shouldn't infinite loop if someone creates a server.0 line
 ---

 Key: ZOOKEEPER-785
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-785
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.3.1
 Environment: Tested in linux with a new jvm
Reporter: Alex Newman
Assignee: Patrick Hunt
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-785.patch



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-785) Zookeeper 3.3.1 shouldn't infinite loop if someone creates a server.0 line

2010-06-03 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-785:
---

Status: Open  (was: Patch Available)

I'll implement Ben's suggestion re the 2-server warning.

  Zookeeper 3.3.1 shouldn't infinite loop if someone creates a server.0 line
 ---

 Key: ZOOKEEPER-785
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-785
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.3.1
 Environment: Tested in linux with a new jvm
Reporter: Alex Newman
Assignee: Patrick Hunt
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-785.patch



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: enhance zookeeper lock function in concurrent condition

2010-06-03 Thread Benjamin Reed
if i understand your situation correctly, you have a lock that may have 
more than 100,000 processes contending for it. since this can cause 
a problem for getChildren, you want a way to get the server to 
do the check for you without returning everything.


the isFirst method would return true if you are first (sorted in utf8 
order?) in the list of children. and you can set a watch on that 
condition. what do the path and type arguments do?


ben




[jira] Updated: (ZOOKEEPER-784) server-side functionality for read-only mode

2010-06-03 Thread Sergey Doroshenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Doroshenko updated ZOOKEEPER-784:


Attachment: ZOOKEEPER-784.patch

Fixed JMX behaviour: the read-only server is now visible in a JMX console when 
the server is partitioned.

 server-side functionality for read-only mode
 

 Key: ZOOKEEPER-784
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-784
 Project: Zookeeper
  Issue Type: Sub-task
Reporter: Sergey Doroshenko
Assignee: Sergey Doroshenko
 Attachments: ZOOKEEPER-784.patch, ZOOKEEPER-784.patch


 As per http://wiki.apache.org/hadoop/ZooKeeper/GSoCReadOnlyMode , create 
 ReadOnlyZooKeeperServer which comes into play when peer is partitioned.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code

2010-06-03 Thread Sam Baskinger (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875334#action_12875334
 ] 

Sam Baskinger commented on ZOOKEEPER-767:
-

Thanks for the feedback Benjamin,

Perhaps I misread the recipe or am missing the philosophy of ZK's atomicity. It 
wouldn't be the first time. :) To your points:



We do the create to ensure that we, at some point, will hold a lock. I do want 
to do the create, ensuring my turn, and then wait until I'm at the front of the 
line (front being defined in the exclusive or shared way).


There should be a unit test that ensures that this does indeed happen, 
semantically. Exclusive locks block all shared access, if I take your meaning 
correctly.


I thought the API guaranteed that in the event of a connection loss the 
EPHEMERAL creation property would guarantee that when the session timed out the 
file would be removed and watchers would be signaled.


All but those behind me in the line of locks. This could certainly be optimized 
and is something I thought about, but moved past to get the rough 
implementation in flight.


If the above 4 points hold, then extending the other implementation may be 
better for the community. :) I hope you'll include the code, but if not, we're 
very happy with it and appreciate ZooKeeper! Keep up the fine work. :)

What do you think? What did I miss? :)

Sam Baskinger



  



 Submitting Demo/Recipe Shared / Exclusive Lock Code
 ---

 Key: ZOOKEEPER-767
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-767
 Project: Zookeeper
  Issue Type: Improvement
  Components: recipes
Affects Versions: 3.3.0
Reporter: Sam Baskinger
Assignee: Sam Baskinger
Priority: Minor
 Fix For: 3.4.0

 Attachments: ZOOKEEPER-767.patch, ZOOKEEPER-767.patch


 Networked Insights would like to share-back some code for shared/exclusive 
 locking that we are using in our labs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-783) committedLog in ZKDatabase is not properly synchronized

2010-06-03 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-783:
---

 Assignee: Henry Robinson
Fix Version/s: 3.3.2
   3.4.0

 committedLog in ZKDatabase is not properly synchronized
 ---

 Key: ZOOKEEPER-783
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-783
 Project: Zookeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.3.1
Reporter: Henry Robinson
Assignee: Henry Robinson
Priority: Critical
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-783.patch


 ZKDatabase.getCommittedLog() returns a reference to the LinkedList<Proposal> 
 committedLog in ZKDatabase. This is then iterated over by at least one 
 caller. 
 I have seen a bug that causes a NPE in LinkedList.clear on committedLog, 
 which I am pretty sure is due to the lack of synchronization. This bug has 
 not been apparent in normal ZK operation, but in code that I have that starts 
 and stops a ZK server in process repeatedly (clear() is called from 
 ZooKeeperServerMain.shutdown()). 
 It's better style to defensively copy the list in getCommittedLog, and to 
 synchronize on the list in ZKDatabase.clear.
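 A minimal sketch of that style of fix, inside ZKDatabase (illustrative only; the 
 attached patch may differ in detail):
 
 // Sketch based on the description above, not on the attached patch.
 public List<Proposal> getCommittedLog() {
     synchronized (committedLog) {
         // defensive copy: callers iterate a snapshot, not the live list
         return new LinkedList<Proposal>(committedLog);
     }
 }
 
 public void clear() {
     synchronized (committedLog) {
         committedLog.clear();
     }
     // ... reset the rest of the in-memory database ...
 }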

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-767) Submitting Demo/Recipe Shared / Exclusive Lock Code

2010-06-03 Thread Sam Baskinger (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875336#action_12875336
 ] 

Sam Baskinger commented on ZOOKEEPER-767:
-

Thanks for the feedback Benjamin,

Replying by email removed snippets from your message. Same comments as above, 
but with quotes for context (and fewer smilies).

Perhaps I misread the recipe or am missing the philosophy of ZK's atomicity. It 
wouldn't be the first time. To your points:

 1) shouldn't you check to see if you already have a lock before you do the 
 create? that will remove the code right after the create in the getLock() 
 methods.

We do the create to ensure that we, at some point, will hold a lock. I do want 
to do the create, ensuring my turn, and then wait until I'm at the front of the 
line (front being defined in the exclusive or shared way).

 2) if you already have an exclusive lock, shouldn't that also count as a 
 shared lock?

There should be a unit test that ensures that this does indeed happen, 
semantically. Exclusive locks block all shared access, if I take your meaning 
correctly.

 3) the error handling is a bit problematic. a connection loss exception or an 
 interrupt can leave a process holding a lock without knowing it.

I thought the API guaranteed that in the event of a connection loss the 
EPHEMERAL creation property would guarantee that when the session timed out the 
file would be removed and watchers would be signaled.

 4) when you go through the children, you may end up checking for the 
 existence of every znode before you, which could be wasteful.

All but those behind me in the line of locks. This could certainly be optimized 
and is something I thought about, but moved past to get the rough 
implementation in flight.
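
(For illustration, assuming zk, lockDir, myNodeName and watcher are already in 
scope, that optimization would check and watch only the znode immediately ahead 
of ours rather than every earlier one:)

// Illustrative sketch, not from the attached patch.
List<String> children = zk.getChildren(lockDir, false);
Collections.sort(children);
int me = children.indexOf(myNodeName);
if (me == 0) {
    // we are at the front of the line: lock acquired
} else if (me > 0) {
    String previous = lockDir + "/" + children.get(me - 1);
    if (zk.exists(previous, watcher) == null) {
        // it vanished between getChildren() and exists(): re-check from the top
    }
}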

 i think it may be better to expand the current locking code to handle shared 
 lock rather than add a new lock implementation. the current lock recipe
 implementation only does exclusive locks, but it is implemented in a way that 
 makes it easy to support shared locks as well and it takes care of the
 above problems.

If the above 4 points hold, then extending the other implementation may be 
better for the community.  I hope you'll include the code, but if not, we're 
very happy with it and appreciate ZooKeeper! Keep up the fine work. 

What do you think? What did I miss? :)

Sam Baskinger


 Submitting Demo/Recipe Shared / Exclusive Lock Code
 ---

 Key: ZOOKEEPER-767
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-767
 Project: Zookeeper
  Issue Type: Improvement
  Components: recipes
Affects Versions: 3.3.0
Reporter: Sam Baskinger
Assignee: Sam Baskinger
Priority: Minor
 Fix For: 3.4.0

 Attachments: ZOOKEEPER-767.patch, ZOOKEEPER-767.patch


 Networked Insights would like to share-back some code for shared/exclusive 
 locking that we are using in our labs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-775) A large scale pub/sub system

2010-06-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875487#action_12875487
 ] 

Hadoop QA commented on ZOOKEEPER-775:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12446057/libs_2.zip
  against trunk revision 947063.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no tests are needed for this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/113/console

This message is automatically generated.

 A large scale pub/sub system
 

 Key: ZOOKEEPER-775
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-775
 Project: Zookeeper
  Issue Type: New Feature
  Components: contrib
Reporter: Benjamin Reed
Assignee: Benjamin Reed
 Fix For: 3.4.0

 Attachments: libs.zip, libs_2.zip, ZOOKEEPER-775.patch, 
 ZOOKEEPER-775_2.patch, ZOOKEEPER-775_3.patch


 we have developed a large scale pub/sub system based on ZooKeeper and 
 BookKeeper.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.