Re: txzookeeper - a twisted python client for zookeeper
Nice. Any chance of putting it back in zk? Would be useful. Thanks mahadev On 11/18/10 1:17 PM, Kapil Thangavelu kapil.f...@gmail.com wrote: At Canonical we've been using ZooKeeper heavily in the development of a new project (Ensemble), as noted by Gustavo. I just wanted to give a quick overview of the client library we're using for it. It's called txzookeeper; it has 100% test coverage, and it implements various queue, lock, and other utilities in addition to wrapping the standard ZK interface. It's based on the Twisted async networking framework for Python, and it obviates the need to use threads within the application, as all watches and result callbacks are invoked in the main app thread. This makes structuring the code significantly simpler imo than having to deal with threads in the application, but of course tastes may vary ;-). Source code is here: http://launchpad.net/txzookeeper comments and feedback welcome. cheers, Kapil
Re: JUnit tests do not produce logs if the JVM crashes
Hi Andras, JUnit will always buffer the logs unless you print them out to the console. To do that, try running: ant test -Dtest.output=yes This will print the logs to the console as they are logged. Thanks mahadev On 11/4/10 3:33 AM, András Kövi allp...@gmail.com wrote: Hi all, I'm new to ZooKeeper and ran into an issue while trying to run the tests with ant. It seems like the log output is buffered until the complete test suite finishes, and it is flushed into its specific file only after that. I had to make some changes to the code (no JNI or similar) that resulted in JVM crashes. Since the logs are lost in this case, it is a little hard to debug the issue. Do you have any idea how I could disable the buffering? Thanks, Andras
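As a side note, the same effect can be had by pointing log4j at the console instead of the buffered per-test file. A hypothetical log4j.properties fragment; the appender name and pattern are illustrative, not the file shipped in the ZooKeeper source tree:

```properties
# Send all test logging straight to stdout so nothing is lost if the JVM
# crashes mid-suite. Appender name and conversion pattern are illustrative.
log4j.rootLogger=INFO, CONSOLE
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n
```

Console output is unbuffered per log event, so the last messages before a crash survive in the captured stdout.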
FW: [Hadoop Wiki] Update of ZooKeeper/ZKClientBindings by yfinkelstein
Nice to see this! Thanks mahadev -- Forwarded Message From: Apache Wiki wikidi...@apache.org Reply-To: common-...@hadoop.apache.org Date: Tue, 2 Nov 2010 14:39:24 -0700 To: Apache Wiki wikidi...@apache.org Subject: [Hadoop Wiki] Update of ZooKeeper/ZKClientBindings by yfinkelstein Dear Wiki user, You have subscribed to a wiki page or wiki category on Hadoop Wiki for change notification. The ZooKeeper/ZKClientBindings page has been changed by yfinkelstein. http://wiki.apache.org/hadoop/ZooKeeper/ZKClientBindings?action=diff&rev1=5&rev2=6 -- ||Binding||Author||URL|| ||Scala||Steve Jenson, John Corwin||http://github.com/twitter/scala-zookeeper-client|| ||C#||Eric Hauser||http://github.com/ewhauser/zookeeper|| - || || || || + ||Node.js||Yuri Finkelstein||http://github.com/yfinkelstein/node-zookeeper|| -- End of Forwarded Message
Re: Problem with Zookeeper cluster configuration
I think Jared pointed this out: your clientPort and quorum port are the same: clientPort=5181 server.1=3.7.192.142:5181:5888 The above two ports should be different. Thanks mahadev On 10/27/10 10:19 AM, Ted Dunning ted.dunn...@gmail.com wrote: Sorry, didn't see this last bit. Hmph. A real ZK person will have to answer this. On Wed, Oct 27, 2010 at 6:21 AM, siddhartha banik siddhartha.ba...@gmail.com wrote: I have tried with the netstat command also. No other process is using port 5181 other than the ZooKeeper process. The other thing I have tried is using separate ports for server 1 and server 2. The surprise is that after starting server 2, server 1 also starts to use the same port as server 2 is using as its client port. Does that matter, as server 1 and server 2 are running on different boxes? Any help is appreciated. Thanks Siddhartha
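To make the fix concrete, a sketch of a working layout (the election port 5889 is hypothetical; the point is that clientPort must not collide with either of the two ports in a server.N entry, and additional servers follow the same pattern):

```properties
clientPort=5181

# server.N=host:quorumPort:electionPort - both must differ from clientPort
server.1=3.7.192.142:5888:5889
```

The first port after the host is used for follower-to-leader traffic, the second for leader election; neither may be the port clients connect to.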
Re: Unusual exception
Hi Avinash, Not sure if you got a response for your email. The exception that you mention mostly means that the client already closed the socket or shutdown. Looks like a client is trying to connect but disconnects before the server can respond. Do you have any such clients? Is this causing any issues with your zookeeper set up? Thanks mahadev On 10/13/10 2:49 PM, Avinash Lakshman avinash.laksh...@gmail.com wrote: I started seeing a bunch of these exceptions. What do these mean? 2010-10-13 14:01:33,426 - WARN [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:5001:nioserverc...@606] - EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket 2010-10-13 14:01:33,426 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:5001:nioserverc...@1286] - Closed socket connection for client /10.138.34.195:55738 (no session established for client) 2010-10-13 14:01:33,426 - DEBUG [CommitProcessor:1:finalrequestproces...@78] - Processing request:: sessionid:0x12b9d1f8b907a44 type:closeSession cxid:0x0 zxid:0x600193996 txntype:-11 reqpath:n/a 2010-10-13 14:01:33,427 - WARN [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:5001:nioserverc...@606] - EndOfStreamException: Unable to read additional data from client sessionid 0x12b9d1f8b907a5d, likely client has closed socket 2010-10-13 14:01:33,427 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:5001:nioserverc...@1286] - Closed socket connection for client /10.138.34.195:55979 which had sessionid 0x12b9d1f8b907a5d 2010-10-13 14:01:33,427 - DEBUG [QuorumPeer:/0.0.0.0:5001 :commitproces...@159] - Committing request:: sessionid:0x52b90ab45bd51af type:createSession cxid:0x0 zxid:0x600193cf9 txntype:-10 reqpath:n/a 2010-10-13 14:01:33,427 - DEBUG [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:5001:nioserverc...@1302] - ignoring exception during output shutdown java.net.SocketException: Transport endpoint is not connected at sun.nio.ch.SocketChannelImpl.shutdown(Native Method) at 
sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651) at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368) at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1298) at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1263) at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:609) at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262) 2010-10-13 14:01:33,428 - DEBUG [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:5001:nioserverc...@1310] - ignoring exception during input shutdown java.net.SocketException: Transport endpoint is not connected at sun.nio.ch.SocketChannelImpl.shutdown(Native Method) at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640) at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360) at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1306) at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1263) at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:609) at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262) 2010-10-13 14:01:33,428 - WARN [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:5001:nioserverc...@606] - EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket 2010-10-13 14:01:33,428 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:5001:nioserverc...@1286] - Closed socket connection for client /10.138.34.195:55731 (no session established for client)
Re: Zookeeper on 60+Gb mem
Hi Maarten, I definitely know of a group that uses around a 3GB memory heap for ZooKeeper, but I have never heard of someone with such huge requirements. I would say it would definitely be a learning experience with such high memory, one which I think would be very useful for others in the community as well. Thanks mahadev On 10/5/10 11:03 AM, Maarten Koopmans maar...@vrijheid.net wrote: Hi, I just wondered: has anybody ever run ZooKeeper to the max on a 68GB quadruple extra large high memory EC2 instance? With, say, 60GB allocated or so? Because EC2 with EBS is a nice way to grow your ZooKeeper cluster (data on the EBS volumes, upgrade as your memory utilization grows) - I just wonder what the limits are there, or if I am going where angels fear to tread... --Maarten
Re: possible bug in zookeeper ?
Hi Yatir, Any update on this? Are you still struggling with this problem? Thanks mahadev On 9/15/10 12:56 AM, Yatir Ben Shlomo yat...@outbrain.com wrote: Thanks to all who replied, I appreciate your efforts: 1. There is no connection problem from the client machine: (ob1078)(tom...@cass3:~)$ echo ruok | nc zook1 2181 imok(ob1078)(tom...@cass3:~)$ echo ruok | nc zook2 2181 imok(ob1078)(tom...@cass3:~)$ echo ruok | nc zook3 2181 imok(ob1078)(tom...@cass3:~)$ 2. Unfortunately I have already tried to switch to the new jar, but it does not seem to be backward compatible. It seems that the QuorumPeerConfig class does not have the field protected int clientPort; it was replaced by InetSocketAddress clientPortAddress in the new jar, so I am getting a java.lang.NoSuchFieldError exception... 3. I looked at the ClientCnxn.java code. It seems that the logic for iterating over the available servers (nextAddrToTry++) is used only inside the startConnect() function, but not in the finishConnect() function, nor anywhere else. Possibly something along these lines is happening: some exception inside the finishConnect() function is causing the cleanup() function to run, which in turn causes another exception. Nowhere in this code path is nextAddrToTry++ applied. Does this make sense to someone? thanks -Original Message- From: Patrick Hunt [mailto:ph...@apache.org] Sent: Tuesday, September 14, 2010 6:20 PM To: zookeeper-user@hadoop.apache.org Subject: Re: possible bug in zookeeper ? That is unusual. I don't recall anyone reporting a similar issue, and looking at the code I don't see any issues off hand. Can you try the following? 1) on that particular zk client machine, resolve the hosts zook1/zook2/zook3; what IP addresses do they resolve to?
(try dig) 2) try running the client using the 3.3.1 jar file (just replace the jar on the client); it includes more log4j information, so turn on DEBUG or TRACE logging. Patrick On Tue, Sep 14, 2010 at 8:44 AM, Yatir Ben Shlomo yat...@outbrain.com wrote: zook1:2181,zook2:2181,zook3:2181 -Original Message- From: Ted Dunning [mailto:ted.dunn...@gmail.com] Sent: Tuesday, September 14, 2010 4:11 PM To: zookeeper-user@hadoop.apache.org Subject: Re: possible bug in zookeeper ? What was the list of servers that was given originally to open the connection to ZK? On Tue, Sep 14, 2010 at 6:15 AM, Yatir Ben Shlomo yat...@outbrain.com wrote: Hi, I am using SolrCloud, which uses an ensemble of 3 zookeeper instances. I am performing survivability tests: taking one of the zookeeper instances down, I would expect the client to use a different zookeeper server instance. But as you can see in the logs attached below, depending on which instance I choose to take down (in my case, the last one in the list of zookeeper servers), the client constantly insists on the same zookeeper server (Attempting connection to server zook3/192.168.252.78:2181) and does not switch to a different one. The problem seems to arise from ClientCnxn.java. Anyone have an idea on this?
SolrCloud is currently using zookeeper-3.2.2.jar. Is this a known bug that was fixed in later versions (3.3.1)? Thanks in advance, Yatir Logs: Sep 14, 2010 9:02:20 AM org.apache.log4j.Category warn WARNING: Ignoring exception during shutdown input java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638) at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360) at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java) :999) at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970 ) Sep 14, 2010 9:02:20 AM org.apache.log4j.Category warn WARNING: Ignoring exception during shutdown output java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649) at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368) at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java) :1004) at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970 ) Sep 14, 2010 9:02:22 AM org.apache.log4j.Category info INFO: Attempting connection to server zook3/192.168.252.78:2181 Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn WARNING: Exception closing session 0x32b105244a20001 to sun.nio.ch.selectionkeyi...@3ca58cbf java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):933 ) Sep 14, 2010 9:02:22 AM
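As a side note on the suspected bug: the behavior Yatir describes (the client retrying the same dead server) is what you would get if the index into the server list only advances in startConnect() and not on the cleanup path. A tiny self-contained sketch of the intended behavior; the class and method names are mine, this is not the actual ClientCnxn code:

```java
import java.util.List;

// Illustrative sketch, not ZooKeeper source: the server index should
// advance before *every* connection attempt, including retries entered
// via the finishConnect()/cleanup() failure path, so that a dead server
// is eventually skipped in favor of the next one in the list.
public class ServerRoundRobin {
    private final List<String> servers;
    private int next = 0;

    public ServerRoundRobin(List<String> servers) {
        this.servers = servers;
    }

    // Returns the server to try next and advances the index, so a failed
    // attempt automatically moves on to the following server.
    public String nextServer() {
        String s = servers.get(next % servers.size());
        next++;
        return s;
    }
}
```

With zook1/zook2/zook3 in the list and zook3 down, repeated calls cycle through zook1 and zook2 instead of pinning to zook3.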
Re: Expiring session... timeout of 600000ms exceeded
I am not sure if anyone responded to this or not. Are the clients getting SessionExpired or ConnectionLoss? In any case, the ZooKeeper client has its own thread to update the server with its active connection status. Did you take a look at the GC activity on your client? Thanks mahadev On 9/21/10 8:24 AM, Tim Robertson timrobertson...@gmail.com wrote: Hi all, I am seeing a lot of my clients being kicked out after the 10 minute negotiated timeout is exceeded. My clients are each a JVM (around 100 running on a machine) which are doing web crawling of specific endpoints and handling the response XML - so they do wait around for 3-4 minutes on HTTP timeouts, but certainly not 10 mins. I am just prototyping right now on a 2x quad core mac pro with 12GB memory, and the 100 child processes only get -Xmx64m, and I don't see my machine exhausted. Do my clients need to do anything in order to initiate keep-alive heartbeats, or should this be automatic (I thought the tickTime would dictate this)? # my conf is: tickTime=2000 dataDir=/Volumes/Data/zookeeper clientPort=2181 maxClientCnxns=1 minSessionTimeout=4000 maxSessionTimeout=80 Thanks for any pointers to this newbie, Tim
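For readers puzzled by the 600000 ms figure: the session timeout a client asks for is clamped by the server between minSessionTimeout and maxSessionTimeout, which default to tickTime*2 and tickTime*20 when not set explicitly. A self-contained sketch of that clamping rule; the method is mine, not ZooKeeper source, and the 3.3-era defaults are assumed:

```java
// Illustrative clamp of a requested session timeout, mirroring the
// server-side negotiation: the result is forced into the
// [minSessionTimeout, maxSessionTimeout] range. The bounds are assumed
// to default to tickTime*2 and tickTime*20 when not set explicitly.
public class SessionTimeoutClamp {
    public static int negotiate(int requestedMs, int tickTimeMs,
                                Integer minOverrideMs, Integer maxOverrideMs) {
        int min = (minOverrideMs != null) ? minOverrideMs : tickTimeMs * 2;
        int max = (maxOverrideMs != null) ? maxOverrideMs : tickTimeMs * 20;
        if (requestedMs < min) return min;
        if (requestedMs > max) return max;
        return requestedMs;
    }
}
```

With tickTime=2000 and default bounds, a 600000 ms request would be clamped down to 40000 ms; and note that the maxSessionTimeout=80 in the quoted config would clamp every session to 80 ms, which is probably not what was intended.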
Re: SessionMovedException
Hi Jun, You can read more about the SessionMovedException at http://hadoop.apache.org/zookeeper/docs/r3.3.0/zookeeperProgrammers.html Thanks mahadev On 10/1/10 9:58 AM, Jun Rao jun...@gmail.com wrote: Hi, Could someone explain what SessionMovedException means? Should it be treated as SessionExpiredException (therefore have to recreate ephemeral nodes, etc)? I have seen this exception when the network is being upgraded. Thanks, Jun
Re: zkfuse
Hi Jun, I haven't seen people using zkfuse recently. What kind of issues are you facing? Thanks mahadev On 9/19/10 6:46 PM, 俊贤 junx...@taobao.com wrote: Hi guys, Has anyone succeeded in installing zkfuse?
Re: possible bug in zookeeper ?
Hi Yatir, Can you confirm that zook1 and zook2 can be looked up with nslookup from the client machine? We haven't seen a bug like this; it would be great to nail this down. Thanks mahadev On 9/14/10 8:44 AM, Yatir Ben Shlomo yat...@outbrain.com wrote: zook1:2181,zook2:2181,zook3:2181 -Original Message- From: Ted Dunning [mailto:ted.dunn...@gmail.com] Sent: Tuesday, September 14, 2010 4:11 PM To: zookeeper-user@hadoop.apache.org Subject: Re: possible bug in zookeeper ? What was the list of servers that was given originally to open the connection to ZK? On Tue, Sep 14, 2010 at 6:15 AM, Yatir Ben Shlomo yat...@outbrain.com wrote: Hi, I am using SolrCloud, which uses an ensemble of 3 zookeeper instances. I am performing survivability tests: taking one of the zookeeper instances down, I would expect the client to use a different zookeeper server instance. But as you can see in the logs attached below, depending on which instance I choose to take down (in my case, the last one in the list of zookeeper servers), the client constantly insists on the same zookeeper server (Attempting connection to server zook3/192.168.252.78:2181) and does not switch to a different one. The problem seems to arise from ClientCnxn.java. Anyone have an idea on this?
SolrCloud is currently using zookeeper-3.2.2.jar. Is this a known bug that was fixed in later versions (3.3.1)? Thanks in advance, Yatir Logs: Sep 14, 2010 9:02:20 AM org.apache.log4j.Category warn WARNING: Ignoring exception during shutdown input java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638) at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360) at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java) :999) at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970 ) Sep 14, 2010 9:02:20 AM org.apache.log4j.Category warn WARNING: Ignoring exception during shutdown output java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649) at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368) at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java) :1004) at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970 ) Sep 14, 2010 9:02:22 AM org.apache.log4j.Category info INFO: Attempting connection to server zook3/192.168.252.78:2181 Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn WARNING: Exception closing session 0x32b105244a20001 to sun.nio.ch.selectionkeyi...@3ca58cbf java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):933 ) Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn WARNING: Ignoring exception during shutdown input java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638) at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360) at
org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java) :999) at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970 ) Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn WARNING: Ignoring exception during shutdown output java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649) at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368) at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java) :1004) at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970 ) Sep 14, 2010 9:02:22 AM org.apache.log4j.Category info INFO: Attempting connection to server zook3/192.168.252.78:2181 Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn WARNING: Exception closing session 0x32b105244a2 to sun.nio.ch.selectionkeyi...@3960f81b java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):933 ) Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn WARNING: Ignoring exception during shutdown input java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638) at
Re: Receiving create events for self with synchronous create
Hi Todd, Sorry for my late response. I had marked this email to respond to but couldn't find the time :). Did you figure this out? It looks like, as soon as you set a watch on /follower, some other node instantly creates another child of /follower. Could that be the case? Thanks mahadev On 8/26/10 8:09 PM, Todd Nine t...@spidertracks.co.nz wrote: Sure thing. The FollowerWatcher class is instantiated by the IClusterManager implementation. It then performs FollowerWatcher.init(), which is intended to do the following. 1. Create our follower node so that other nodes know we exist, at path /com/spidertracks/aviator/cluster/follower/10.0.1.1, where the last node is an ephemeral node with the internal IP address of the node. These are lines 67 through 72. 2. Signal to the clusterManager that the cluster has changed (line 79). Ultimately the clusterManager will perform a barrier for partitioning data (a separate watcher). 3. Register a watcher to receive all future events on the follower path /com/spidertracks/aviator/cluster/follower/ (line 81). Then we have the following characteristics in the watcher: 1. If a node has been added to or deleted from the children of /com/spidertracks/aviator/cluster/follower, then continue. Otherwise, ignore the event. (Lines 33 through 44.) 2. If this was an event we should process, our cluster has changed; signal to the ClusterManager that a node has either been added or removed. (Line 51.) I'm trying to encapsulate the detection of additions and deletions of child nodes within this Watcher. All other events that occur due to a node being added or deleted should be handled externally by the clusterManager. Thanks, Todd On Thu, 2010-08-26 at 19:26 -0700, Mahadev Konar wrote: Hi Todd, From the code that you point to, I am not able to make out the sequence of steps. Can you be clearer about what you are trying to do in terms of the zookeeper api?
Thanks mahadev On 8/26/10 5:58 PM, Todd Nine t...@spidertracks.co.nz wrote: Hi all, I'm running into a strange issue I could use a hand with. I've implemented leader election, and this is working well. I'm now implementing a follower queue with ephemeral nodes. I have an interface IClusterManager which simply has the api clusterChanged. I don't care if nodes are added or deleted, I always want to fire this event. I have the following basic algorithm. init Create a path with /follower/+mynode name fire the clusterChangedEvent Watch set the event watcher on the path /follower. watch: reset the watch on /follower if event is not a NodeDeleted or NodeCreated, ignore fire the clustermanager event this seems pretty straightforward. Here is what I'm expecting 1. Create my node path 2. fire the clusterChanged event 3. Set watch on /follower 4. Receive watch events for changes from any other nodes. What's actually happening 1. Create my node path 2. fire the clusterChanged event 3. Set Watch on /follower 4. Receive watch event for node created in step 1 5. Receive future watch events for changes from any other nodes. Here is my code. Since I set the watch after I create the node, I'm not expecting to receive the event for it. Am I doing something incorrectly in creating my watch? Here is my code. http://pastebin.com/zDXgLagd Thanks, Todd
Re: Lock example
Hi Tim, The lock recipe you mention is supposed to avoid the herd effect and prevent starvation (though it has bugs :)). Are you looking for something like that, or just a simple lock and unlock that doesn't have to worry about the above issues? If that's the case, then just doing an ephemeral create and delete should give you your lock and unlock recipes. Thanks mahadev On 9/8/10 9:58 PM, Tim Robertson timrobertson...@gmail.com wrote: Hi all, I am new to ZK and using the queue and lock examples that come with zookeeper, but I have run into ZOOKEEPER-645 with the lock. I have several JVMs, each keeping a long-running ZK client, and the first JVM (and hence client) does not respect the locks obtained by subsequent clients - e.g. the first client always manages to get the lock even if another client holds it. Before I start digging, I thought I'd ask if anyone has a simple lock implemented they might share? My needs are simply to lock a URL to indicate that it is being worked on, so that I don't hammer my endpoints with multiple clients. Thanks for any advice, Tim
Re: Understanding ZooKeeper data file management and LogFormatter
Hi Vishal, Usually the default retention policy is safe enough for operations. http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperAdmin.html gives you an overview of how to use the purging library in zookeeper. Thanks mahadev On 9/8/10 12:01 PM, Vishal K vishalm...@gmail.com wrote: Hi All, Can you please share your experience regarding ZK snapshot retention and recovery policies? We have an application where we never need to roll back (i.e., revert to a previous state by using old snapshots). Given this, I am trying to understand under what circumstances we would ever need to use old ZK snapshots. I understand a lot of these decisions depend on the application and the amount of redundancy used at every level (e.g., the RAID level where the snapshots are stored, etc.) in the product. To simplify the discussion, I would like to rule out any application characteristics and focus mainly on data consistency. - Assuming that we have a 3-node cluster, I am trying to figure out when I would really need to use old snapshot files. With 3 nodes we already have at least 2 servers with a consistent database. If I lose files on one of the servers, I can use files from the other. In fact, the ZK server join will take care of this: I can remove files from a faulty node and reboot that node, and the faulty node will sync with the leader. - The old files will be useful if the current snapshot and/or log files are lost or corrupted on all 3 servers. If the loss is due to a disaster (the case where we lose all 3 servers), one would have to keep the snapshots on some external storage to recover. However, if the current snapshot file is corrupted on all 3 servers, then the most likely cause would be a bug in ZK. In which case, how can I trust the consistency of the old snapshots? - Given a set of snapshots and log files, how can I verify the correctness of these files? For example, what if one of the intermediate snapshot files is corrupt?
- The Admin's guide says: "Using older log and snapshot files, you can look at the previous state of ZooKeeper servers and even restore that state. The LogFormatter class allows an administrator to look at the transactions in a log." Is there a tool that does this for the admin? The LogFormatter only displays the transactions in the log file. - Has anyone ever had to play with the snapshot files in production? Thanks in advance. Regards, -Vishal
Re: ZooKeeper C bindings and cygwin?
Hi Jan, It would be great to have some documentation on how to use the Windows install. Would you mind submitting a patch with documentation, FAQs, and any other issues you might have faced? Thanks mahadev On 9/1/10 6:04 AM, jdeinh...@ujam.com wrote: Dear list readers, we've solved the problem ourselves. We found the DLL CYGZOOKEEPER_MT-2.DLL in /usr/local/bin. Best regards, Jan On 01.09.2010 at 12:57, jdeinh...@ujam.com wrote: Dear list readers, we want to use the zookeeper C bindings with our applications. Some of them run on Linux (e.g. a load balancer) and others (.NET (C#) audio servers) on Windows Server 2008. We'd like to try using cygwin to accomplish this task on Windows, but we need further advice on how to do that. What we did so far: 1) downloaded the latest cygwin 2) ran ./configure 3) ran make 4) ran make install Now we find some files (libzookeeper_mt.a, libzookeeper_mt.dll.a, libzookeeper_mt.la, libzookeeper_st.a, libzookeeper_st.dll.a and libzookeeper_st.la) in our cygwin/usr/local/lib folder, but these cannot be used in Visual Studio. Is it somehow possible to produce a file that we can then use like a .dll or a .lib? What do we have to do to accomplish our task? Are we heading in a completely wrong direction? Any help is greatly appreciated, thank you in advance! Best regards, Jan Jan Deinhard Software Developer UJAM GmbH Speicher 1 Konsul-Smidt-Str 8d 28217 Bremen fon +49 421 89 80 97-04 jdeinh...@ujam.com www.ujam.com
Re: getting created child on NodeChildrenChanged event
Hi Todd, We have always tried to err on the side of keeping things lightweight and the API simple. The only way you would be able to do this is with sequential creates. 1. Create nodes like /queueelement-$i, where i is a monotonically increasing number. You could use the sequential flag of zookeeper to do this. 2. When deleting a node, you would remove the node and create a deleted node at /deletedqueueelements/queueelement-$i. 2.1. On notification you would go to /deletedqueueelements/ and find out which ones were deleted. The above only works if you are ok with monotonically unique queue elements. 3. The above method lets folks see the deltas using /deletedqueueelements, which can be garbage collected by some cleanup process (you can be smarter about this as well). Would something like this work? Thanks mahadev On 8/31/10 3:55 PM, Todd Nine t...@spidertracks.co.nz wrote: Hi Dave, Thanks for the response. I understand your point about missed events during a watch reset period. I may be off; here is the functionality I was thinking of. I'm not sure if the ZK internal versioning process could possibly support something like this. 1. A watch is placed on children 2. The event is fired to the client. The client receives the Stat object as part of the event for the current state of the node when the event was created. We'll call this Stat A with version 1 3. The client performs processing. Meanwhile the node has several children changed. Versions are incremented to version 2 and version 3 4. Client resets the watch 5. A node is added 6. The event is fired to the client. Client receives Stat B with version 4 7. Client performs a deltaChildren(Stat A, Stat B) 8. zookeeper returns added nodes between stats, and also returns deleted nodes between stats. This would handle the missed event problem since the client would have the 2 states it needs to compare. It also allows clients dealing with large data sets to only deal with the delta over time (like a git replay).
Our number of queues could get quite large, and I'm concerned that keeping my previous event's children in a set to perform the delta may become quite memory and processor intensive Would a feature like this be possible without over complicating the Zookeeper core? Thanks, Todd On Tue, 2010-08-31 at 09:23 -0400, Dave Wright wrote: Hi Todd - The general explanation for why Zookeeper doesn't pass the event information w/ the event notification is that an event notification is only triggered once, and thus may indicate multiple events. For example, if you do a GetChildren and set a watch, then multiple children are added at about the same time, the first one triggers a notification, but the second (or later) ones do not. When you do another GetChildren() request to get the list and reset the watch, you'll see all the changed nodes, however if you had just been told about the first change in the notification you would have missed the others. To do what you are wanting, you would really need persistent watches that send notifications every time a change occurs and don't need to be reset so you can't miss events. That isn't the design that was chosen for Zookeeper and I don't think it's likely to be implemented. -Dave Wright On Tue, Aug 31, 2010 at 3:49 AM, Todd Nine t...@spidertracks.co.nz wrote: Hi all, I'm writing a distributed queue monitoring class for our leader node in the cluster. We're queueing messages per input hardware device, this queue is then assigned to a node with the least load in our cluster. To do this, I maintain 2 Persistent Znode with the following format. data queue /dataqueue/devices/unit id/data packet processing follower /dataqueue/nodes/node name/unit id The queue monitor watches for changes on the path of /dataqueue/devices. When the first packet from a unit is received, the queue writer will create the queue with the unit id. 
This triggers the watch event on the monitoring class, which in turn creates the znode for the path with the least loaded node. This path is watched for child node creation and the node creates a queue consumer to consume messages from the new queue. Our list of queues can become quite large, and I would prefer not to maintain a list of queues I have assigned then perform a delta when the event fires to determine which queues are new and caused the watch event. I can't really use sequenced nodes and keep track of my last read position, because I don't want to iterate over the list of queues to determine which sequenced node belongs to the current unit id (it would require full iteration, which really doesn't save me any reads). Is it possible to create a watch to return the path and Stat of the child node that caused the event to fire? Thanks, Todd
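On the delta question raised in this thread: if a client does keep the previous child list, the added and removed sets are just two set differences, linear in the number of children. A self-contained sketch with no ZooKeeper dependency (class and method names are mine):

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Computes which children appeared and which disappeared between two
// snapshots of getChildren() results. This is the comparison a client
// must do itself, since a watch event does not carry the changed paths.
public class ChildrenDelta {
    public static Set<String> added(Collection<String> before, Collection<String> after) {
        Set<String> result = new HashSet<>(after);
        result.removeAll(new HashSet<>(before));
        return result;
    }

    public static Set<String> removed(Collection<String> before, Collection<String> after) {
        // A removal is just an addition viewed in the opposite direction.
        return added(after, before);
    }
}
```

The memory cost is one string per child per snapshot, which is the same order as the child list the server already ships on every getChildren() call.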
Re: Logs and in memory operations
Hi Avinash, In the source code, the FinalRequestProcessor updates the in-memory data structures and the SyncRequestProcessor logs to disk. For deciding when to delete, take a look at PurgeTxnLog.java. Thanks mahadev On 8/30/10 1:11 PM, Avinash Lakshman avinash.laksh...@gmail.com wrote: Hi All, From my understanding, when a znode is updated/created, a write happens into the local transaction logs and then some in-memory data structure is updated to serve future reads. Where in the source code can I find this? Also, how can I decide when it is OK for me to delete the logs off disk? Please advise. Cheers Avinash
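For reference, PurgeTxnLog is typically invoked from the command line. The jar name, classpath, and directory paths below are assumptions for illustration, so check them against your installation:

```
# Keep the 3 most recent snapshots (and the transaction logs they need),
# deleting older ones. Jar names and paths are examples only.
java -cp zookeeper-3.3.1.jar:lib/log4j-1.2.15.jar:conf \
  org.apache.zookeeper.server.PurgeTxnLog /var/zookeeper/dataLogDir /var/zookeeper/dataDir -n 3
```

Running something like this from cron is a common way to keep the log directories bounded.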
Re: Spew after call to close
Hi Stack, Looks like you are shutting down the server and shutting down the client at the same time? Is that the issue? Thanks mahadev On 9/3/10 4:47 PM, Stack st...@duboce.net wrote: Have you fellas seen this before? I call close on zookeeper but it insists on doing the below exceptions. Why is it doing this 'Session 0x12ad9dccda30002 for server null, unexpected error, closing socket connection and attempting reconnect'? This would seem to come after the close has been noticed and looking in code, i'd think we'd not do this since the close flag should be set to true post call to close? Thanks lads (The below looks ugly in our logs... this is zk 3.3.1), St.Ack 2010-09-03 16:09:52,369 INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client /fe80:0:0:0:0:0:0:1%1:56941 which had sessionid 0x12ad9dccda30001 2010-09-03 16:09:52,369 INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client /127.0.0.1:56942 which had sessionid 0x12ad9dccda30002 2010-09-03 16:09:52,370 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x12ad9dccda30001, likely server has closed socket, closing socket connection and attempting reconnect 2010-09-03 16:09:52,370 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x12ad9dccda30002, likely server has closed socket, closing socket connection and attempting reconnect 2010-09-03 16:09:52,370 INFO org.apache.zookeeper.server.NIOServerCnxn: NIOServerCnxn factory exited run method 2010-09-03 16:09:52,370 INFO org.apache.zookeeper.server.PrepRequestProcessor: PrepRequestProcessor exited loop! 2010-09-03 16:09:52,370 INFO org.apache.zookeeper.server.SyncRequestProcessor: SyncRequestProcessor exited! 
2010-09-03 16:09:52,370 INFO org.apache.zookeeper.server.FinalRequestProcessor: shutdown of request processor complete 2010-09-03 16:09:52,470 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: localhost:/hbase Received ZooKeeper Event, type=None, state=Disconnected, path=null 2010-09-03 16:09:52,470 INFO org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: localhost:/hbase Received Disconnected from ZooKeeper, ignoring 2010-09-03 16:09:52,471 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: localhost:/hbase Received ZooKeeper Event, type=None, state=Disconnected, path=null 2010-09-03 16:09:52,471 INFO org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: localhost:/hbase Received Disconnected from ZooKeeper, ignoring 2010-09-03 16:09:52,857 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181 2010-09-03 16:09:52,858 WARN org.apache.zookeeper.ClientCnxn: Session 0x12ad9dccda30001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078) 2010-09-03 16:09:53,149 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/fe80:0:0:0:0:0:0:1%1:2181 2010-09-03 16:09:53,150 WARN org.apache.zookeeper.ClientCnxn: Session 0x12ad9dccda30002 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078) 2010-09-03 16:09:53,576 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181 2010-09-03 
16:09:53,576 WARN org.apache.zookeeper.ClientCnxn: Session 0x12ad9dccda30001 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078) 2010-09-03 16:09:54,000 INFO org.apache.zookeeper.server.SessionTrackerImpl: SessionTrackerImpl exited loop! 2010-09-03 16:09:54,002 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Closed zookeeper sessionid=0x12ad9dccda30001 2010-09-03 16:09:54,129 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181 2010-09-03 16:09:54,130 WARN org.apache.zookeeper.ClientCnxn: Session 0x12ad9dccda30002 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at
Re: Receiving create events for self with synchronous create
Hi Todd, From the code that you point to, I am not able to make out the sequence of steps. Can you be clearer about what you are trying to do in terms of the ZooKeeper API? Thanks mahadev On 8/26/10 5:58 PM, Todd Nine t...@spidertracks.co.nz wrote: Hi all, I'm running into a strange issue I could use a hand with. I've implemented leader election, and this is working well. I'm now implementing a follower queue with ephemeral nodes. I have an interface IClusterManager which simply has the API clusterChanged. I don't care whether nodes are added or deleted; I always want to fire this event. I have the following basic algorithm.
init: create a path with /follower/ + my node name, fire the clusterChanged event, and set the event watcher on the path /follower.
watch: reset the watch on /follower; if the event is not a NodeDeleted or NodeCreated, ignore it; otherwise fire the cluster manager event.
This seems pretty straightforward. Here is what I'm expecting: 1. Create my node path 2. Fire the clusterChanged event 3. Set watch on /follower 4. Receive watch events for changes from any other nodes. What's actually happening: 1. Create my node path 2. Fire the clusterChanged event 3. Set watch on /follower 4. Receive a watch event for the node created in step 1 5. Receive future watch events for changes from any other nodes. Since I set the watch after I create the node, I'm not expecting to receive the event for it. Am I doing something incorrectly in creating my watch? Here is my code. http://pastebin.com/zDXgLagd Thanks, Todd
Re: Size of a znode in memory
Hi Maarten, The usual memory footprint of a znode is around 40-80 bytes. I think Ben is planning to document a way to calculate the approximate memory footprint of your ZK servers given a set of updates and their sizes. thanks mahadev On 8/25/10 11:49 AM, Maarten Koopmans maar...@vrijheid.net wrote: Hi, Is there a way to know/measure the size of a znode? My average znode has a name of 32 bytes and user data of at most 128 bytes. Or is the only way to run a smoke test and watch the heap growth via jconsole or so? Thanks, Maarten
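Taking the 40-80 byte per-znode figure above, a rough capacity estimate is simple arithmetic. A hedged sketch (the overhead constant is an assumption from the reply above, not a measured value):

```python
# Back-of-envelope heap estimate for a tree of znodes, assuming a fixed
# per-znode overhead (40-80 bytes, per the reply above; 80 used here to be
# conservative). These constants are illustrative, not measured.

def estimate_heap_bytes(n_znodes, name_bytes, data_bytes, overhead=80):
    return n_znodes * (overhead + name_bytes + data_bytes)

# One million znodes with 32-byte names and 128-byte payloads:
print(estimate_heap_bytes(1_000_000, 32, 128) / 1024 / 1024)  # about 229 (MB)
```

A smoke test watched through jconsole, as Maarten suggests, is still the way to validate the estimate against the real JVM overhead.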
Re: Searching more ZooKeeper content
I am definitely a +1 on this, given that it's powered by Solr. Thanks mahadev On 8/25/10 9:22 AM, Alex Baranau alex.barano...@gmail.com wrote: Hello guys, Over at http://search-hadoop.com we index the ZooKeeper project's mailing lists, wiki, web site, source code, javadoc, and jira. Would the community be interested in a patch that replaces the Google-powered search with the one from search-hadoop.com, set to search only the ZooKeeper project by default? We are looking into adding this search service for all of Hadoop's sub-projects. Assuming people are for this, any suggestions for how the search should function by default, or any specific instructions for how the search box should be modified, would be great! Thank you, Alex Baranau. P.S. The HBase community has already accepted our proposal (please refer to https://issues.apache.org/jira/browse/HBASE-2886) and the new version (0.90) will include the new search box. A patch is also available for TIKA (we are in the process of discussing some details now): https://issues.apache.org/jira/browse/TIKA-488. ZooKeeper's site looks much like Avro's, for which we also created a patch recently (https://issues.apache.org/jira/browse/AVRO-626).
Re: Parent nodes multi-step transactions
Hi Gustavo, The paradigm I usually like to suggest is to have something like /A/init. Every client watches for the existence of this node, and the node is only created after /A has been initialized with the creation of /A/C or other structure. Would that work for you? Thanks mahadev On 8/23/10 7:34 AM, Gustavo Niemeyer gust...@niemeyer.net wrote: Greetings, We (a development team at Canonical) are stumbling into a situation here, and I'd be curious to hear what the general practice is, since I'm sure this is a fairly common issue. It's quite easy to describe: say there's a parent node A somewhere in the tree. That node was created dynamically over the course of running the system, because it's associated with some resource which has its own life-span. Now, under this node we put some control nodes for different reasons (say, A/B), and we also want to track some information which is related to a sequence of nodes (say, A/C/D-0, A/C/D-1, etc). So, we end up with something like this: A/B, A/C/D-0, A/C/D-1. The question here is about best practices for taking care of nodes like A/C. It'd be fantastic to be able to create A's structure together with A itself; otherwise we risk a situation where a client can see node A before its initialization has finished (A/C doesn't exist yet). In fact, A/C may never exist, since it is possible for a client to die between the creation of A and C. Anyway, I'm sure you all understand the problem. This is pretty common, and quite boring to deal with properly in every single client. Is there any feature on the roadmap to deal with this, and any common practice besides the obvious check for half-initialization and wait for A/C to be created, or deal with timeouts and whatnot on every client? I'm about to start writing another layer on top of ZooKeeper's API, so it'd be great to have some additional insight into this issue.
-- Gustavo Niemeyer http://niemeyer.net http://niemeyer.net/blog http://niemeyer.net/twitter
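Mahadev's /A/init marker suggestion can be sketched without a real ZooKeeper client. In the sketch below, a plain dict-backed stand-in plays the role of the znode tree (all names are illustrative); the marker node is created last, so its existence implies the rest of the structure is in place:

```python
# Sketch of the "/A/init marker" pattern: readers treat /A as ready only
# once /A/init exists, because the writer creates it last. An in-memory
# set stands in for the ZooKeeper tree; no real client is used.

class FakeTree:
    """Minimal stand-in for a ZooKeeper namespace."""
    def __init__(self):
        self.nodes = set()

    def create(self, path):
        self.nodes.add(path)

    def exists(self, path):
        return path in self.nodes

def initialize_parent(tree):
    tree.create("/A")
    tree.create("/A/B")
    tree.create("/A/C")
    tree.create("/A/init")  # marker: created last, so it implies the rest

def parent_ready(tree):
    # In real ZooKeeper this would be an exists() call with a watch,
    # so the reader is notified the moment /A/init appears.
    return tree.exists("/A/init")

tree = FakeTree()
tree.create("/A")          # half-initialized: /A exists but /A/C does not
print(parent_ready(tree))  # False -> readers keep waiting
initialize_parent(tree)
print(parent_ready(tree))  # True
```

The pattern sidesteps the half-initialization race Gustavo describes: a client dying between creating /A and /A/C leaves /A visible but never "ready".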
Re: Non Hadoop scheduling frameworks
Hi Todd, Just to be clear, are you looking at solving UC1 and UC2 via ZooKeeper? Or is this a broader question about scheduling on Cassandra nodes? For the latter this probably isn't the right mailing list. Thanks mahadev On 8/23/10 4:02 PM, Todd Nine t...@spidertracks.co.nz wrote: Hi all, We're using ZooKeeper for leader election and system monitoring. We're also using it for synchronizing our cluster-wide jobs with barriers. We're running into an issue where we now have a single job, but each node can fire the job independently of the others with different criteria in the job. In the event of a system failure, another node in our application cluster will need to fire this job. I've used Quartz previously (we're running Java 6), but it simply isn't designed for the use case we have. I found this article on Cloudera: http://www.cloudera.com/blog/2008/11/job-scheduling-in-hadoop/ I've looked at both plugins, but they require Hadoop. We're not currently running Hadoop; we only have Cassandra. Here are the two basic use cases we need to support.
UC1: Synchronized Jobs
1. A job is fired across all nodes
2. The nodes wait until the barrier is entered by all participants
3. The nodes process the data and leave
4. On all nodes leaving the barrier, the leader node marks the job as complete.
UC2: Multiple Jobs per Node
1. A job is scheduled for a future time on a specific node (usually the same node that's creating the trigger)
2. A trigger can be overwritten and cancelled without the job firing
3. In the event of a node failure, the leader will take all pending jobs from the failed node and partition them across the remaining nodes.
Any input would be greatly appreciated. Thanks, Todd
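UC1's enter/process/leave flow is essentially a double barrier. As a loose in-process analogy, the sketch below uses threading.Barrier standing in for ephemeral znodes under a barrier path; this only shows the control flow, not how you would build it on ZooKeeper itself:

```python
# In-process analogy for UC1: every participant enters the barrier,
# processes, then waits at a second barrier before anyone is "done".
# In ZooKeeper, enter/leave would be ephemeral child znodes under a
# barrier path, watched until the expected count is reached.
import threading

NODES = 3
enter = threading.Barrier(NODES)
leave = threading.Barrier(NODES)
results = []
lock = threading.Lock()

def worker(node_id):
    enter.wait()                 # block until all participants are present
    with lock:
        results.append(node_id)  # stand-in for "process the data"
    leave.wait()                 # block until everyone has finished

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NODES)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # [0, 1, 2] -> leader can now mark the job complete
```

The second barrier is what lets the leader safely mark the job complete: no node can observe "done" while another is still processing.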
Re: Zookeeper stops
Hi Wim, It looks like ZooKeeper is not able to create files on the /tmp filesystem. Is there a space shortage, or is it possible the file is being deleted as it's being written? Admins sometimes have a crontab that cleans up the /tmp filesystem. Thanks mahadev On 8/19/10 1:15 AM, Wim Jongman wim.jong...@gmail.com wrote: Hi, I have a zookeeper server running that can sometimes run for days and then quits. Is there somebody with a clue to the problem? I am running 64-bit Ubuntu with java version 1.6.0_18, OpenJDK Runtime Environment (IcedTea6 1.8) (6b18-1.8-0ubuntu1), OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode), and ZooKeeper 3.3.0. The log below has some context before it shows the fatal error. Our component.id=40676 indicates that it is the 40676th time that I have asked ZK to publish this information. It has been seen to go up to half a million before stopping. Regards, Wim ZooDiscovery Service Unpublished: Aug 18, 2010 11:17:28 PM. ServiceInfo[uri=osgiservices:// 188.40.116.87:3282/svc_19q0FmlQF0wEwjSl6SpUTJRlV5g=;id=ServiceID[type=ServiceTypeID[typeName=_osgiservices._tcp.default._iana];location=osgiservices://188.40.116.87:3282/svc_19q0FmlQF0wEwjSl6SpUTJRlV5g=;full=_osgiservices._tcp.default._i...@osgiservices://188.40.116.87:3282/svc_19q0FmlQF0wEwjSl6SpUTJRlV5g=];priority=0;weight=0;props=ServiceProperties[{ecf.rsvc.ns=ecf.namespace.generic.remoteservice, osgi.remote.service.interfaces=org.eclipse.ecf.services.quotes.QuoteService, ecf.sp.cns=org.eclipse.ecf.core.identity.StringID, ecf.rsvc.id =org.eclipse.ecf.discovery.serviceproperties$bytearraywrap...@68a1e081, component.name=Star Wars Quotes Service, ecf.sp.ect=ecf.generic.server, component.id=40676, ecf.sp.cid=org.eclipse.ecf.discovery.serviceproperties$bytearraywrap...@5b9a6ad1 }]] ZooDiscovery Service Published: Aug 18, 2010 11:17:29 PM.
ServiceInfo[uri=osgiservices:// 188.40.116.87:3282/svc_u2GpWmF3YKSlTauWcwOMsDgiBxs=;id=ServiceID[type=ServiceTypeID[typeName=_osgiservices._tcp.default._iana];location=osgiservices://188.40.116.87:3282/svc_u2GpWmF3YKSlTauWcwOMsDgiBxs=;full=_osgiservices._tcp.default._i...@osgiservices://188.40.116.87:3282/svc_u2GpWmF3YKSlTauWcwOMsDgiBxs=];priority=0;weight=0;props=ServiceProperties[{ecf.rsvc.ns=ecf.namespace.generic.remoteservice, osgi.remote.service.interfaces=org.eclipse.ecf.services.quotes.QuoteService, ecf.sp.cns=org.eclipse.ecf.core.identity.StringID, ecf.rsvc.id =org.eclipse.ecf.discovery.serviceproperties$bytearraywrap...@71bfa0a4, component.name=Eclipse Twitter, ecf.sp.ect=ecf.generic.server, component.id=40677, ecf.sp.cid=org.eclipse.ecf.discovery.serviceproperties$bytearraywrap...@5bcba953 }]] [log;+0200 2010.08.18 23:17:29:545;INFO;org.eclipse.ecf.remoteservice;org.eclipse.core.runtime.Status[plugin=org.eclipse.ecf.remoteservice;code=0;message=No async remote service interface found with name=org.eclipse.ecf.services.quotes.QuoteServiceAsync for proxy service class=org.eclipse.ecf.services.quotes.QuoteService;severity2;exception=null;children=[]]] 2010-08-18 23:17:37,057 - FATAL [Snapshot Thread:zookeeperser...@262] - Severe unrecoverable error, exiting java.io.FileNotFoundException: /tmp/zookeeperData/version-2/snapshot.13e2e (No such file or directory) at java.io.FileOutputStream.open(Native Method) at java.io.FileOutputStream.init(FileOutputStream.java:209) at java.io.FileOutputStream.init(FileOutputStream.java:160) at org.apache.zookeeper.server.persistence.FileSnap.serialize(FileSnap.java:224) at org.apache.zookeeper.server.persistence.FileTxnSnapLog.save(FileTxnSnapLog.java:211) at org.apache.zookeeper.server.ZooKeeperServer.takeSnapshot(ZooKeeperServer.java:260) at org.apache.zookeeper.server.SyncRequestProcessor$1.run(SyncRequestProcessor.java:120) ZooDiscovery Service Unpublished: Aug 18, 2010 11:17:37 PM. 
ServiceInfo[uri=osgiservices:// 188.40.116.87:3282/svc_u2GpWmF3YKSlTauWcwOMsDgiBxs=;id=ServiceID[type=ServiceTypeID[typeName=_osgiservices._tcp.default._iana];location=osgiservices://188.40.116.87:3282/svc_u2GpWmF3YKSlTauWcwOMsDgiBxs=;full=_osgiservices._tcp.default._i...@osgiservices://188.40.116.87:3282/svc_u2GpWmF3YKSlTauWcwOMsDgiBxs=];priority=0;weight=0;props=ServiceProperties[{ecf.rsvc.ns=ecf.namespace.generic.remoteservice, osgi.remote.service.interfaces=org.eclipse.ecf.services.quotes.QuoteService, ecf.sp.cns=org.eclipse.ecf.core.identity.StringID, ecf.rsvc.id =org.eclipse.ecf.discovery.serviceproperties$bytearraywrap...@71bfa0a4, component.name=Eclipse Twitter, ecf.sp.ect=ecf.generic.server, component.id=40677, ecf.sp.cid=org.eclipse.ecf.discovery.serviceproperties$bytearraywrap...@5bcba953 }]]
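The FileNotFoundException above is consistent with a /tmp cleaner (e.g. tmpwatch or tmpreaper via cron) removing the snapshot directory out from under a long-running server. Keeping ZooKeeper's data directories out of /tmp avoids this. A hedged zoo.cfg fragment; the paths are examples, not required values:

```
# zoo.cfg: keep snapshots and transaction logs out of /tmp so periodic
# cleaners cannot delete files under a running server.
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/log
```

Putting dataLogDir on a dedicated device also helps write latency, independent of the /tmp issue.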
Re: A question about Watcher
Hi Qian, The watcher information is saved at the client, and the client will reattach the watches to the new server it connects to. Hope that helps. Thanks mahadev On 8/16/10 9:28 AM, Qian Ye yeqian@gmail.com wrote: Thanks for the explanation. Since the watcher can be preserved when the client switches the zookeeper server it connects to, does that mean all the watcher information will be saved on all the zookeeper servers? I didn't find any sign in the source that the client can hold the watcher information. On Tue, Aug 17, 2010 at 12:21 AM, Ted Dunning ted.dunn...@gmail.com wrote: I should correct this. The watchers will deliver a session expiration event, but since the connection is closed at that point, no further events will be delivered and the cluster will remove them. This is as good as the watchers disappearing. On Mon, Aug 16, 2010 at 9:20 AM, Ted Dunning ted.dunn...@gmail.com wrote: The other is session expiration. Watchers do not survive this. This happens when a client does not provide timely evidence that it is alive and is marked as having disappeared by the cluster. -- With Regards! Ye, Qian
Re: How to handle Node does not exist error?
Hi Dr Hao, Can you please post the configuration of all three zookeeper servers? I suspect the cluster might be misconfigured and the servers might not belong to the same ensemble. Just to be clear: /xpe/queues/3bd7851e79381ef4bfd1a5857b5e34c04e5159e5/msgs/msg002807 and other such nodes exist on one of the zookeeper servers, and the same nodes do not exist on the other servers? Also, as Ted pointed out, can you please post the output of echo "stat" | nc localhost 2181 (on all three servers) to the list? Thanks mahadev On 8/11/10 12:10 AM, Dr Hao He h...@softtouchit.com wrote: hi, Ted, Thanks for the reply. Here is what I did: [zk: localhost:2181(CONNECTED) 0] ls /xpe/queues/3bd7851e79381ef4bfd1a5857b5e34c04e5159e5/msgs/msg002948 [] [zk: localhost:2181(CONNECTED) 1] ls /xpe/queues/3bd7851e79381ef4bfd1a5857b5e34c04e5159e5/msgs [msg002807, msg002700, msg002701, msg002804, msg002704, msg002706, msg002601, msg001849, msg001847, msg002508, msg002609, msg001841, msg002607, msg002606, msg002604, msg002809, msg002817, msg001633, msg002812, msg002814, msg002711, msg002815, msg002713, msg002716, msg001772, msg002811, msg001635, msg001774, msg002515, msg002610, msg001838, msg002517, msg002612, msg002519, msg001973, msg001835, msg001974, msg002619, msg001831, msg002510, msg002512, msg002615, msg002614, msg002617, msg002104, msg002106, msg001769, msg001768, msg002828, msg002822, msg001760, msg002820, msg001963, msg001961, msg002110, msg002118, msg002900, msg002836, msg001757, msg002907, msg001753, msg001752, msg001755, msg001952, msg001958, msg001852, msg001956, msg001854, msg002749, msg001608, msg001609, msg002747, msg002882, msg001743, msg002888, msg001605, msg002885, msg001487, msg001746, msg002330, msg001749, msg001488, msg001489, msg001881, msg001491, msg002890, msg001889, msg002758, msg002241, msg002892, msg002852, msg002759, msg002898, msg002850, msg001733, msg002751, msg001739, msg002753, msg002756, msg002332, msg001872, msg002233, msg001721, msg001627, msg001720, msg001625,
msg001628, msg001629, msg001729, msg002350, msg001727, msg002352, msg001622, msg001726, msg001623, msg001723, msg001724, msg001621, msg002736, msg002738, msg002363, msg001717, msg002878, msg002362, msg002361, msg001611, msg001894, msg002357, msg002218, msg002358, msg002355, msg001895, msg002356, msg001898, msg002354, msg001996, msg001990, msg002093, msg002880, msg002576, msg002579, msg002267, msg002266, msg002366, msg001901, msg002365, msg001903, msg001799, msg001906, msg002368, msg001597, msg002679, msg002166, msg001595, msg002481, msg002482, msg002373, msg002374, msg002371, msg001599, msg002773, msg002274, msg002275, msg002270, msg002583, msg002271, msg002580, msg002067, msg002277, msg002278, msg002376, msg002180, msg002467, msg002378, msg002182, msg002377, msg002184, msg002379, msg002187, msg002186, msg002665, msg002666, msg002381, msg002382, msg002661, msg002662, msg002663, msg002385, msg002284, msg002766, msg002282, msg002190, msg002599, msg002054, msg002596, msg002453, msg002459, msg002457, msg002456, msg002191, msg002652, msg002395, msg002650, msg002656, msg002655, msg002189, msg002047, msg002658, msg002659, msg002796, msg002250, msg002255, msg002589, msg002257, msg002061, msg002064, msg002585, msg002258, msg002587, msg002444, msg002446, msg002447, msg002450, msg002646, msg001501, msg002591, msg002592, msg001503, msg001506, msg002260, msg002594, msg002262, msg002263, msg002264, msg002590, msg002132, msg002130, msg002530, msg002931, msg001559, msg001808, msg002024, msg001553, msg002939, msg002937, msg001556, msg002935, msg002933, msg002140, msg001937, msg002143, msg002520, msg002522, msg002429, msg002524, msg002920, msg002035, msg001561, msg002134, msg002138, msg002925, msg002151, msg002287, msg002555, msg002010, msg002002, msg002290, msg001537, msg002005, msg002147, msg002145, msg002698,
Re: Sequence Number Generation With Zookeeper
Hi David, I think it would be really useful. It would be very helpful for someone looking to generate unique tokens/generation ids (I can think of plenty of applications for this). Please do consider contributing it back to the community! Thanks mahadev On 8/6/10 7:10 AM, David Rosenstrauch dar...@darose.net wrote: Perhaps. I'd have to ask my boss for permission to release the code. Is this something that would be interesting/useful to other people? If so, I can ask about it. DR On 08/05/2010 11:02 PM, Jonathan Holloway wrote: Hi David, We did discuss potentially doing this as well. It would be nice to get some recipes for ZooKeeper done for this area, if people think it's useful. Were you thinking of submitting this back as a recipe? If not, I could potentially work on such a recipe instead. Many thanks, Jon. I just ran into this exact situation, and handled it like so: I wrote a library that uses the option (b) you described above. Only instead of requesting a single sequence number, you request a block of them at a time from Zookeeper, and then locally use them up one by one from the block you retrieved. Retrieving by block (e.g., by blocks of 1 at a time) eliminates the contention issue. Then, if you're finished assigning IDs from that block but still have a bunch of IDs left in it, the library has another function to push back the unused IDs. They'll then get pulled again in the next block retrieval. We don't actually have this code running in production yet, so I can't vouch for how well it works. But the design was reviewed and given the thumbs up by the core developers on the team, and the implementation passes all my unit tests. HTH. Feel free to email back with specific questions if you'd like more details. DR
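David's block-allocation scheme can be sketched with the contended counter replaced by a plain in-memory stand-in; in ZooKeeper the counter would be a znode updated atomically (e.g. a versioned setData, retried on conflict). The class and method names below are illustrative, not his library's API:

```python
# Sketch of block-based ID allocation: fetch a block of IDs in one
# (potentially contended) update, then hand them out locally one by one.

class CounterStore:
    """Stand-in for a znode holding the next unallocated ID."""
    def __init__(self):
        self.next_id = 0

    def allocate_block(self, size):
        start = self.next_id
        self.next_id += size  # in ZooKeeper: a version-conditional setData
        return range(start, start + size)

class BlockIdGenerator:
    def __init__(self, store, block_size=1000):
        self.store = store
        self.block_size = block_size
        self.pool = iter(())  # empty until the first block is fetched

    def next_id(self):
        try:
            return next(self.pool)
        except StopIteration:
            # Local pool exhausted: fetch a fresh block from the store.
            self.pool = iter(self.store.allocate_block(self.block_size))
            return next(self.pool)

gen = BlockIdGenerator(CounterStore(), block_size=3)
print([gen.next_id() for _ in range(5)])  # [0, 1, 2, 3, 4] (two block fetches)
```

Contention on the shared counter drops by roughly the block size, at the cost of "holes" in the sequence when a node dies holding unused IDs; the push-back function David mentions mitigates that for clean shutdowns.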
Re: zkperl - skipped tests
Hi Martin, You might have to look into the tests; t/50_access.t is the file you want to take a look at. I am not a perl guru so am not of much help, but let me know if you can't work out the details on the skipped tests and I will try to dig into the perl code. Thanks mahadev On 8/4/10 6:16 AM, Martin Waite waite@gmail.com wrote: Hi, I built the perl module and ran the test suite. For test 50_access, 3 tests are skipped. vm-026-lenny-mw$ ZK_TEST_HOSTS=127.0.0.1:2181 make test PERL_DL_NONLAZY=1 /usr/bin/perl -MExtUtils::Command::MM -e test_harness(0, 'blib/lib', 'blib/arch') t/*.t t/10_invalid..ok 1/107# no ZooKeeper path specified in ZK_TEST_PATH env var, using root path t/10_invalid..ok t/15_thread...ok t/20_tie..ok t/22_stat_tie.ok t/24_watch_tieok t/30_connect..ok t/35_log..ok t/40_basicok t/45_classok t/50_access...ok 3/38 skipped: various reasons t/60_watchok All tests successful, 3 subtests skipped. Files=11, Tests=461, 18 wallclock secs ( 2.01 cusr + 3.08 csys = 5.09 CPU) Is there any way to find out which of the 38 tests were skipped and why? regards, Martin
Re: node symlinks
Hi Maarten, Can you elaborate on your use case for ZooKeeper? We currently don't have a symlink feature in zookeeper. The only way to do it would be a client-side hash/lookup table that buckets data to different zookeeper clusters. You could also store this hash/lookup table in one of the zookeeper clusters; the lookup table can then be cached on the client side after reading it once from the zookeeper servers. Thanks mahadev On 7/24/10 2:39 PM, Maarten Koopmans maar...@vrijheid.net wrote: Yes, I thought about Cassandra or Voldemort, but I need ZK's guarantees, as it will provide the file system hierarchy for a flat object store, so I need locking primitives and consistency. Doing that on top of Voldemort would give me a scalable version of ZK, but just slower. Might as well find a way to scale across ZK clusters. Also, I want to be able to add clusters as the number of nodes grows. Note that the number of nodes will grow with the number of users of the system, so the clusters can grow sequentially, hence the symlink idea. --Maarten On 07/24/2010 11:12 PM, Ted Dunning wrote: Depending on your application, it might be good to simply hash the node name to decide which ZK cluster to put it on. Also, a scalable key-value store like Voldemort or Cassandra might be more appropriate for your application. Unless you need the hard-core guarantees of ZK, they can be better for large-scale storage. On Sat, Jul 24, 2010 at 7:30 AM, Maarten Koopmans maar...@vrijheid.net wrote: Hi, I have a number of nodes that will grow larger than one cluster can hold, so I am looking for a way to efficiently stack clusters. One way is to have a zookeeper node symlink to another cluster. Has anybody ever done that, and does anyone have tips or alternative approaches? Currently I use Scala and traverse zookeeper trees by proper tail recursion, so adapting the tail recursion to process symlinks would be my approach. Best, Maarten
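The client-side bucketing Mahadev describes (and Ted's hashing suggestion) can be as simple as hashing the znode path to pick a cluster. A small sketch; the cluster connect strings are made up for the example:

```python
# Client-side bucketing: hash the znode path to choose which ZooKeeper
# cluster holds it. Every client computes the same mapping, so no shared
# lookup table is needed unless you later want to rebalance.
import hashlib

CLUSTERS = [
    "zk-a1:2181,zk-a2:2181,zk-a3:2181",  # example connect strings
    "zk-b1:2181,zk-b2:2181,zk-b3:2181",
]

def cluster_for(path):
    # md5 gives a stable, well-spread hash across processes and languages
    # (unlike Python's builtin hash(), which can vary between runs).
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return CLUSTERS[digest[0] % len(CLUSTERS)]

print(cluster_for("/users/maarten/inbox"))
```

The trade-off versus a stored lookup table: pure hashing needs no coordination, but moving paths between clusters later requires rehashing or an exception list, which is where the ZooKeeper-hosted table Mahadev mentions comes back in.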
Re: ZK recovery questions
Hi Ashwin, We have seen people who want something like ZooKeeper without the reliability of permanent storage and who are willing to work with loosened guarantees compared to the current ZooKeeper. What you mention about log files is certainly a valid use case. It would be great to see how much throughput you can get in a scenario where nothing is ever logged to permanent storage. Do you want to try this out and see what kind of throughput difference you get? Thanks mahadev On 7/19/10 8:35 PM, Ashwin Jayaprakash ashwin.jayaprak...@gmail.com wrote: Cool. I've only tried the single-node server so far. I didn't know it could sync from other senior servers. Server/cluster addresses: I read somewhere in the docs/todo list that the bootstrap server list for the clients should be the same. So, what happens when a new replacement server has to be brought in on a different IP/hostname? Do the older clients autodetect the new server, or is this even supported? I suppose not. Log files: I have absolutely no confusion between ZK and databases (very tempting tho'), but running ZK servers without log files does not seem unusual. Especially since you said new servers can sync directly from senior servers without relying on log files. In that case, I'm curious to see what happens if you just redirect log files to /dev/null. Anyone tried this? Regards, Ashwin Jayaprakash.
Re: unit test failure
Hi Martin, Can you check whether you have a stale Java process (ZooKeeperServer) running on your machine? That might cause some issues with the tests. Thanks mahadev On 7/14/10 8:03 AM, Martin Waite waite@gmail.com wrote: Hi, I am attempting to build the C client on debian lenny. autoconf, configure, make and make install all appear to work cleanly. I ran: autoreconf -if ./configure make make install make run-check However, the unit tests fail: $ make run-check make zktest-st zktest-mt make[1]: Entering directory `/home/martin/zookeeper-3.3.1/src/c' make[1]: `zktest-st' is up to date. make[1]: `zktest-mt' is up to date. make[1]: Leaving directory `/home/martin/zookeeper-3.3.1/src/c' ./zktest-st ./tests/zkServer.sh: line 52: kill: (17711) - No such process ZooKeeper server startedRunning Zookeeper_operations::testPing : elapsed 1 : OK Zookeeper_operations::testTimeoutCausedByWatches1 : elapsed 0 : OK Zookeeper_operations::testTimeoutCausedByWatches2 : elapsed 0 : OK Zookeeper_operations::testOperationsAndDisconnectConcurrently1 : elapsed 2 : OK Zookeeper_operations::testOperationsAndDisconnectConcurrently2 : elapsed 0 : OK Zookeeper_operations::testConcurrentOperations1 : elapsed 206 : OK Zookeeper_init::testBasic : elapsed 0 : OK Zookeeper_init::testAddressResolution : elapsed 0 : OK Zookeeper_init::testMultipleAddressResolution : elapsed 0 : OK Zookeeper_init::testNullAddressString : elapsed 0 : OK Zookeeper_init::testEmptyAddressString : elapsed 0 : OK Zookeeper_init::testOneSpaceAddressString : elapsed 0 : OK Zookeeper_init::testTwoSpacesAddressString : elapsed 0 : OK Zookeeper_init::testInvalidAddressString1 : elapsed 0 : OK Zookeeper_init::testInvalidAddressString2 : elapsed 2 : OK Zookeeper_init::testNonexistentHost : elapsed 108 : OK Zookeeper_init::testOutOfMemory_init : elapsed 0 : OK Zookeeper_init::testOutOfMemory_getaddrs1 : elapsed 0 : OK Zookeeper_init::testOutOfMemory_getaddrs2 : elapsed 0 : OK Zookeeper_init::testPermuteAddrsList : elapsed 0 : OK
Zookeeper_close::testCloseUnconnected : elapsed 0 : OK Zookeeper_close::testCloseUnconnected1 : elapsed 0 : OK Zookeeper_close::testCloseConnected1 : elapsed 0 : OK Zookeeper_close::testCloseFromWatcher1 : elapsed 0 : OK Zookeeper_simpleSystem::testAsyncWatcherAutoResetterminate called after throwing an instance of 'CppUnit::Exception' what(): equality assertion failed - Expected: -101 - Actual : -4 make: *** [run-check] Aborted This appears to come from tests/TestClient.cc - but beyond that, it is hard to identify which equality assertion failed. Help ! regards, Martin
Re: building client tools
Hi Martin, The only required tool to build the zookeeper C library is CppUnit. The README says it can be built without CppUnit installed, but there is an open bug regarding this, so CppUnit is required for now. Thanks mahadev On 7/13/10 10:09 AM, Martin Waite waite@gmail.com wrote: Hi, I am trying to build the c client on debian lenny for zookeeper 3.3.1. autoreconf -if configure.ac:33: warning: macro `AM_PATH_CPPUNIT' not found in library configure.ac:33: warning: macro `AM_PATH_CPPUNIT' not found in library configure.ac:33: error: possibly undefined macro: AM_PATH_CPPUNIT If this token and others are legitimate, please use m4_pattern_allow. See the Autoconf documentation. autoreconf: /usr/bin/autoconf failed with exit status: 1 I probably need to install some required tools. Is there a list of the tools needed to build this, please? regards, Martin
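On Debian, a missing AM_PATH_CPPUNIT macro usually means the CppUnit development package is not installed. The package name below follows Debian's naming convention and should be verified against your release:

```
# Install CppUnit headers and autoconf macros, then retry the build:
sudo apt-get install libcppunit-dev
autoreconf -if && ./configure && make
```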
Re: running the systest
Hi Stuart, The instructions are just out of date. If you could open a jira and post a patch to it that would be great! We should try getting this in 3.3.2! That would be useful! Thanks mahadev On 7/9/10 6:36 AM, Stuart Halloway stuart.hallo...@gmail.com wrote: Hi all, I am trying to run the systest and have hit a few minor issues: (1) The readme says src/contrib/jarjar, apparently should be src/contrib/fatjar (2) The compiled fatjar seems to be missing junit, so the launch instructions do not work. I can fix or workaround these, but I wanted to see if maybe the instructions are just out of date, and there is an easy (but currently undocumented) way to launch the tests. Thanks, Stu
Re: Suggested way to simulate client session expiration in unit tests?
Hi Jeremy, zk.disconnect() is the right way to disconnect from the servers. For session expiration you just have to make sure that the client stays disconnected for more than the session expiration interval. Hope that helps. Thanks mahadev On 7/6/10 9:09 AM, Jeremy Davis jerdavis.cassan...@gmail.com wrote: Is there a recommended way of simulating a client session expiration in unit tests? I see a TestableZooKeeper.java, with a pauseCnxn() method that does cause the connection to timeout/disconnect and reconnect. Is there an easy way to push this all the way through to session expiration? Thanks, -JD
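The timing rule behind Mahadev's answer can be sketched with a toy model. This is an illustration only, not ZooKeeper code (the class name and API are made up): the server expires a session once it has not heard from the client for longer than the negotiated session timeout, so a test that wants a real expiration must keep the client disconnected at least that long.

```java
// Toy model of the server-side expiry rule (illustration, not ZooKeeper code).
// A session is considered expired once the server has not heard from the
// client for longer than the negotiated session timeout.
final class ToySessionTracker {
    private final long sessionTimeoutMs;
    private long lastHeardMs;

    ToySessionTracker(long sessionTimeoutMs, long nowMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
        this.lastHeardMs = nowMs;
    }

    // A heartbeat or request from the client resets the clock.
    void touch(long nowMs) { lastHeardMs = nowMs; }

    // The server-side check: expired strictly after the timeout elapses.
    boolean isExpired(long nowMs) {
        return nowMs - lastHeardMs > sessionTimeoutMs;
    }
}
```

So pausing the connection (as pauseCnxn() does) only produces a disconnect event; the expiration only follows once the pause outlasts the session timeout.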
Re: Securing ZooKeeper connections
Hi Vishal, Ben (Benjamin Reed) has been working on a netty based client server protocol in ZooKeeper. I think there is an open jira for it. My network connection is pretty slow so I am finding it hard to search for it. We have been thinking about enabling secure connections via these netty based connections in zookeeper. Thanks mahadev On 5/25/10 12:20 PM, Vishal K vishalm...@gmail.com wrote: Hi All, Since ZooKeeper does not support secure network connections yet, I thought I would poll and see what people are doing to address this problem. Is anyone running ZooKeeper over secure channels (client-server and server-server authentication/encryption)? If yes, can you please elaborate how you do it? Thanks. Regards, -Vishal
Re: Zookeeper EventThread and SendThread
Hi Nick, These threads are spawned with each zookeeper client handle. As soon as you create a zookeeper client object these threads are spawned. Are you creating too many zookeeper client objects in your application? Thanks mahadev On 5/20/10 11:30 AM, Nick Bailey nicholas.bai...@rackspace.com wrote: Hey guys, Question regarding zookeeper's EventThread and SendThread. I'm not quite sure what these are used for but a stacktrace of our client application contains lines similar to pool-2-thread-20-EventThread daemon prio=10 tid=0x2aac3cb29c00 nid=0x75d waiting on condition [0x6b08..0x6b080b10] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for 0x2aab1f577250 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:414) pool-2-thread-20-SendThread daemon prio=10 tid=0x2aac3c35d400 nid=0x75c runnable [0x70ede000..0x70edeb90] java.lang.Thread.State: RUNNABLE at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215) at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69) - locked 0x2aab1f571d08 (a sun.nio.ch.Util$1) - locked 0x2aab1f571cf0 (a java.util.Collections$UnmodifiableSet) - locked 0x2aab1f5715b8 (a sun.nio.ch.EPollSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:921) There are pairs of threads ranging from thread-1 to thread-50 and also multiple pairs of these threads.
As in pool-2-thread-20-SendThread is the name of multiple threads in the trace. I'm debugging some load issues with our system and am suspicious that the large amount of zookeeper threads is contributing. Would anyone be able to elaborate on the purpose of these threads and how they are spawned? Thanks, Nick Bailey Rackspace Hosting Software Developer, Email Apps nicholas.bai...@rackspace.com
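Since each client handle carries its own SendThread/EventThread pair, the usual way to cut the thread count is to share one handle per process rather than creating one per task. A minimal sketch of that idea (the holder class is hypothetical; `C` stands in for org.apache.zookeeper.ZooKeeper):

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch: lazily create and share a single client handle,
// since every handle spawns its own SendThread/EventThread pair.
final class SharedClientHolder<C> {
    private final AtomicReference<C> ref = new AtomicReference<>();
    private final Supplier<C> factory;

    SharedClientHolder(Supplier<C> factory) { this.factory = factory; }

    C get() {
        C c = ref.get();
        if (c == null) {
            C created = factory.get();
            if (ref.compareAndSet(null, created)) {
                c = created;
            } else {
                // Another thread won the race; in real code, close `created` here.
                c = ref.get();
            }
        }
        return c;
    }
}
```

With this pattern, fifty worker threads share one thread pair instead of creating fifty pairs.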
Re: Using ZooKeeper for managing solrCloud
Hi Rakhi, You can read more about monitoring zookeeper servers at http://hadoop.apache.org/zookeeper/docs/r3.3.0/zookeeperAdmin.html#sc_monitoring Thanks mahadev On 5/14/10 4:09 AM, Rakhi Khatwani rkhatw...@gmail.com wrote: Hi, I just went through the zookeeper tutorial and successfully managed to run the zookeeper server. How do we monitor the zookeeper server? Is there a url for it? I pasted the following urls in a browser, but all I get is a blank page: http://localhost:2181 http://localhost:2181/zookeeper I actually needed zookeeper for managing solr cloud externally, but now if I have 2 solr servers running, how do I configure zookeeper to manage them? Regards, Raakhi
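The reason the browser shows a blank page is that port 2181 speaks the ZooKeeper protocol, not HTTP. The server does, however, answer "four letter word" commands on that same port — for example `ruok` (a healthy server replies `imok`) and `stat`. A sketch of sending one from Java (the helper below is made up for illustration; it is not part of the ZooKeeper API):

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: send a four-letter command (ruok, stat, ...) to a
// ZooKeeper server's client port and return its reply.
final class FourLetter {
    static String send(String host, int port, String cmd) throws Exception {
        try (Socket s = new Socket(host, port)) {
            OutputStream out = s.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            s.shutdownOutput();                 // signal we are done sending
            InputStream in = s.getInputStream();
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] b = new byte[512];
            int n;
            while ((n = in.read(b)) != -1) buf.write(b, 0, n);
            return buf.toString("US-ASCII");    // the server closes after replying
        }
    }
}
```

From a shell, the equivalent is `echo ruok | nc localhost 2181`.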
Re: Can't ls with large node count and I don't understand the use of jute.maxbuffer
Hi Aaron, Each of the requests and responses between client and servers is sent as a (buflen, buffer) packet. The contents of the packets are then deserialized from this buffer. Looks like the size of the packet (buflen) is big in your case. We usually avoid sending/receiving large packets just to discourage folks from using it as a bulk data store. We also discourage creating a flat hierarchy with too many direct children (your case). This is because such directories can cause huge load on the network/servers when a list of that directory is done by a huge number of clients. We always suggest bucketing these children into a more hierarchical structure. You are probably hitting the limit of 1MB for this! You might want to change this in your client configuration as a temporary fix! But for later you might want to think about your structure in ZooKeeper to make it more hierarchical via some kind of bucketing! Thanks mahadev On 5/13/10 10:18 AM, Aaron Crow dirtyvagab...@yahoo.com wrote: We're running Zookeeper with about 2 million nodes. It's working, with one specific exception: When I try to get all children on one of the main node trees, I get an IOException out of ClientCnxn (Packet len4648067 is out of range!). There are 150329 children under the node in question. I should also mention that I can successfully ls other nodes with similarly high children counts. But this specific node always fails. Googling led me to see that Mahadev dealt with this last year: http://www.mail-archive.com/zookeeper-comm...@hadoop.apache.org/msg00175.html Source diving led me to see that ClientCnxn enforces a bound based on the jute.maxbuffer setting: packetLen = Integer.getInteger("jute.maxbuffer", 4096 * 1024); ... if (len < 0 || len >= packetLen) { throw new IOException("Packet len" + len + " is out of range!"); So maybe I could bump this up in config... but, I'm confused when reading the documentation on jute.maxbuffer: It specifies the maximum size of the data that can be stored in a znode.
It's true we have an extremely high node count. However, we've been careful to keep each node's data very small -- e.g., we certainly should have no single data entry longer than 256 characters. The way I'm reading the docs, the jute.maxbuffer bound is purely against the data size of specific nodes, and shouldn't relate to child count. Or does it relate to child count as well? Here is a stat on the offending node: cZxid = 0x1000e ctime = Mon May 03 17:40:58 PDT 2010 mZxid = 0x1000e mtime = Mon May 03 17:40:58 PDT 2010 pZxid = 0x100315064 cversion = 150654 dataVersion = 0 aclVersion = 0 ephemeralOwner = 0x0 dataLength = 0 numChildren = 150372 Thanks for any insights... Aaron
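The bucketing Mahadev suggests can be as simple as deriving a fixed sub-bucket from a hash of the child name, so each getChildren() response stays small. A sketch under assumed names (the class, paths, and layout here are illustrative, not a ZooKeeper recipe):

```java
// Hedged sketch of "bucketing" a flat namespace: instead of 150k children
// directly under one parent, spread them across a fixed number of
// sub-buckets derived from a stable hash of the name.
final class Bucketing {
    static String bucketedPath(String parent, String child, int buckets) {
        int bucket = Math.floorMod(child.hashCode(), buckets); // stable, non-negative
        return String.format("%s/%02d/%s", parent, bucket, child);
    }
}
```

With 64 buckets, a node that had 150,000 direct children averages about 2,300 per bucket, and the response to any single getChildren() stays far below the jute.maxbuffer limit. The cost is that a full listing now takes one getChildren() per bucket.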
Re: Xid out of order. Got 8 expected 7
Hi Jordan, Can you create a jira for this? And attach all the server logs and client logs related to this timeline? How did you start up the servers? Are there some changes you might have made accidentally to the servers? Thanks mahadev On 5/12/10 10:49 AM, Jordan Zimmerman jzimmer...@proofpoint.com wrote: We've just started seeing an odd error and are having trouble determining the cause. Xid out of order. Got 8 expected 7 Any hints on what can cause this? Any ideas on how to debug? We're using ZK 3.3.0. The error occurs in ClientCnxn.java line 781 -Jordan
Re: ZookeeperPresentations Wiki
I just emailed in...@apache to ask for their help on this. I wasn't able to figure out what the problem is! Thanks for pointing it out. mahadev On 5/11/10 4:01 PM, Sudipto Das sudi...@cs.ucsb.edu wrote: Hi, I am trying to download some presentation slides from the ZookeeperPresentations wiki (http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperPresentations) but I am facing a weird problem. On clicking on a link for a presentation, I am getting the error message You are not allowed to do AttachFile on this page. Login and try again. I tried creating an account, and even after that, I get the same error message, except the login suggestion. All attachment links have an action=AttachFile URL (e.g. http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperPresentations?action=AttachFile&do=view&target=zookeeper_hbase.pptx for the zookeeper_hbase.pptx file). My intent is to just download the files. Please let me know if I am doing something wrong. Sorry for my ignorance, but I honestly tried out all obvious means to figure out. :( Best Regards Sudipto -- Sudipto Das PhD Candidate CS @ UCSB Santa Barbara, CA 93106, USA http://www.cs.ucsb.edu/~sudipto
Re: New ZooKeeper client library Cages
Hi Dominic, Good to see this. I like the name cages :). You might want to post to the list what cages is useful for. I think quite a few folks would be interested in something like this. Are you guys currently using it with cassandra? Thanks mahadev On 5/11/10 4:02 PM, Dominic Williams thedwilli...@googlemail.com wrote: Anyone looking for a Java client library for ZooKeeper, please checkout: Cages - http://cages.googlecode.com The library will be expanded and feedback will be helpful. Many thanks, Dominic ria101.wordpress.com
Re: avoiding deadlocks on client handle close w/ python/c api
Sure, I'll take a look at it. Thanks mahadev On 5/4/10 2:32 PM, Patrick Hunt ph...@apache.org wrote: Thanks Kapil, Mahadev perhaps you could take a look at this as well? Patrick On 05/04/2010 06:36 AM, Kapil Thangavelu wrote: I've constructed a simple example just using the zkpython library with condition variables, that will deadlock. I've filed a new ticket for it, https://issues.apache.org/jira/browse/ZOOKEEPER-763 the gdb stack traces look suspiciously like the ones in 591, but sans the watchers. https://issues.apache.org/jira/browse/ZOOKEEPER-591 the attached example on the ticket will deadlock in zk 3.3.0 (which has the fix for 591) and trunk. -kapil On Mon, May 3, 2010 at 9:48 PM, Kapil Thangavelu kapil.f...@gmail.com wrote: Hi Folks, I'm constructing an async api on top of the zookeeper python bindings for twisted. The intent was to make a thin wrapper that would wrap the existing async api with one that allows for integration with the twisted python event loop (http://www.twistedmatrix.com), primarily using the async apis. One issue I'm running into while developing unit tests: deadlocks occur if we attempt to close a handle while there are any outstanding async requests (aget, acreate, etc). Normally on close both the io thread and the completion thread are terminated and joined, however with outstanding async requests the completion thread won't be in a joinable state, and we effectively hang when the main thread does the join. I'm curious if this would be considered a bug; afaics ideal behavior would be, on close of a handle, to effectively clear out any remaining callbacks and let the completion thread terminate. I've tried adding some bookkeeping to the api to guard against closing while there is an outstanding completion request, but it's an imperfect solution due to the nature of the event loop integration. The problem is that the python callback invoked by the completion thread in turn schedules a function for the main thread.
In twisted the api for this is implemented by appending the function to a list attribute on the reactor and then writing a byte to a pipe to wakeup the main thread. If a thread switch to the main thread occurs before the completion thread callback returns, the scheduled function runs and the rest of the application keeps processing, of which the last step for the unit tests is to close the connection, which results in a deadlock. i've included some of the client log and gdb stack traces from a deadlock'd client process. thanks, Kapil
Re: ZKClient
Hi Adam, I don't think zk is very very hard to get right. There are examples in src/recipes which implement locks/queues/others. There is ZOOKEEPER-22 to make it even easier for applications to use. Regarding re-registration of watches, you can definitely write code and submit it as part of a well documented contrib module which lays out the assumptions/design of it. It could very well be useful for others. It's just that folks haven't had much time to focus on these areas as yet. Thanks mahadev On 5/4/10 2:58 PM, Adam Rosien a...@rosien.net wrote: I use zkclient in my work at kaChing and I have mixed feelings about it. On one hand it makes easy things easy which is great, but on the other hand I have very few ideas what assumptions it makes under the hood. I also dislike some of the design choices such as unchecked exceptions, but that's neither here nor there. It would take some extensive documentation work by the authors to really enumerate the model and assumptions, but the project doesn't seem to be active (either from it being adequate for its current users or just inactive). I'm not sure I could derive the assumptions myself. I'm a bit frustrated that zk is very, very hard to really get right. At a project level, can't we create structures to avoid most of these errors? Can there be a standard model with detailed assumptions and implementations of all the recipes? How can we start this? Is there something that makes this too hard? I feel like a recipe page is a big fail; wouldn't an example app that uses locks and barriers be that much more compelling? For the common FAQ items like you need to re-register the watch, can't we just create code that implements this pattern? My goal is to live up to the motto: a good API is impossible to use incorrectly. .. Adam On Tue, May 4, 2010 at 2:21 PM, Ted Dunning ted.dunn...@gmail.com wrote: In general, writing this sort of layer on top of ZK is very, very hard to get really right for general use.
In a simple use-case, you can probably nail it but distributed systems are a Zoo, to coin a phrase. The problem is that you are fundamentally changing the metaphors in use so assumptions can come unglued or be introduced pretty easily. One example of this is the fact that ZK watches *don't* fire for every change but when you write listener oriented code, you kind of expect that they will. That makes it really, really easy to introduce that assumption in the heads of the programmer using the event listener library on top of ZK. Another example is how the atomic get content/set watch call works in ZK is easy to violate in an event driven architecture because the thread that watches ZK probably resets the watch. If you assume that the listener will read the data, then you have introduced a timing mismatch between the read of the data and the resetting of the watch. That might be OK or it might not be. The point is that these changes are subtle and tricky to get exactly right. On Tue, May 4, 2010 at 1:48 PM, Jonathan Holloway jonathan.hollo...@gmail.com wrote: Is there any reason why this isn't part of the Zookeeper trunk already?
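The "re-register the watch" FAQ pattern that both Adam and Ted touch on can be sketched in a few lines. ZooKeeper watches are one-shot: after a watch fires you must set it again or you stop hearing about subsequent changes. The interface below is a stand-in for the real client, so this is an illustration of the pattern only — and note it exhibits exactly the subtlety Ted describes: changes made between the watch firing and the re-registration are seen only as the re-read data, never as individual events.

```java
import java.util.function.Consumer;

// Stand-in for the real client: reads current data and arms a one-shot
// watch that fires on the next change.
interface WatchSource {
    byte[] getDataAndWatch(String path, Runnable watcher);
}

final class Rewatcher {
    // Keeps a watch permanently armed by re-registering from inside the
    // notification callback, then delivering the freshly read data.
    static void watchForever(WatchSource zk, String path, Consumer<byte[]> onData) {
        byte[] data = zk.getDataAndWatch(path, () -> watchForever(zk, path, onData));
        onData.accept(data);
    }
}
```

Packaging this as well-documented helper code, with its assumptions spelled out, is exactly the kind of contrib module the thread is asking for.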
Re: Dynamic adding/removing ZK servers on client
Hi Dave, Just a question on how you see it being used: who would call addserver and removeserver? It does seem useful to be able to do this. This is definitely worth working on. You can link it as a subtask of ZOOKEEPER-107. Thanks mahadev On 5/3/10 7:03 AM, Dave Wright wrig...@gmail.com wrote: I've got a situation where I essentially need dynamic cluster membership, which has been talked about in ZOOKEEPER-107 but doesn't look like it's going to happen any time soon. For now, I'm planning on working around this by having a simple coordinator service on the server nodes that will re-write the configs and bounce the servers when membership changes. Clients may get an error or two and need to reconnect, but that should be handled by the normal error logic. On the client side, I'd really like to dynamically update the server list w/o having to re-create the entire Zookeeper object. Looking at the code, it seems like it would be pretty trivial to add RemoveServer()/AddServer() functions for Zookeeper that call down to ClientCnxn, where the servers are just maintained in a list. Of course if the server being removed is the one currently connected, we'd need to disconnect, but a simple call to disconnect() seems like it would resolve that and trigger the automatic re-connection logic. Does anyone see an issue with that approach? Were I to create the patch, do you think it would be interesting enough to merge? It seems like that functionality will eventually be needed for whatever full dynamic server support is eventually implemented. -Dave Wright
Re: Dynamic adding/removing ZK servers on client
Yeah, that was one of the ideas. I think it's been on the jira somewhere (I forget)... but it would definitely be one solution for it. Thanks mahadev On 5/3/10 2:12 PM, Ted Dunning ted.dunn...@gmail.com wrote: Should this be a znode in the privileged namespace? On Mon, May 3, 2010 at 1:45 PM, Dave Wright wrig...@gmail.com wrote: Hi Dave, Just a question on how you see it being used: who would call addserver and removeserver? It does seem useful to be able to do this. This is definitely worth working on. You can link it as a subtask of ZOOKEEPER-107. In my case, it would be my client application - I would get a notification (probably via a watched ZK node controlled by my manager process) that the cluster membership was changing, and I'd adjust the client server list accordingly. -Dave
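Dave's proposed client-side change can be sketched as a small thread-safe wrapper around the server list: removing the currently connected server triggers a disconnect so the normal reconnect logic picks a survivor. This is an illustration of the idea being discussed, not the actual ClientCnxn code — the class and its hooks are made up.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of client-side AddServer()/RemoveServer(): a mutable,
// thread-safe server list plus a hook into the existing reconnect logic.
final class DynamicServerList {
    private final CopyOnWriteArrayList<String> servers = new CopyOnWriteArrayList<>();
    private volatile String connected;      // host:port currently in use
    private final Runnable disconnect;      // stand-in for ClientCnxn's disconnect()

    DynamicServerList(Runnable disconnect) { this.disconnect = disconnect; }

    void addServer(String hostPort) { servers.addIfAbsent(hostPort); }

    void removeServer(String hostPort) {
        servers.remove(hostPort);
        if (hostPort.equals(connected)) {
            disconnect.run();               // forces reconnection to a remaining server
        }
    }

    void markConnected(String hostPort) { connected = hostPort; }

    List<String> snapshot() { return List.copyOf(servers); }
}
```

Tying removeServer() to a watched znode, as the thread suggests, would let a manager process drive membership changes for every client.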
Re: Question on maintaining leader/membership status in zookeeper
Hi Lei, In this case, the Leader will be disconnected from the ZK cluster and will give up its leadership. Since it's disconnected, the ZK cluster will realize that the Leader is dead! When the ZK cluster realizes that the Leader is dead (this is because the zk cluster hasn't heard from the Leader for a certain time, configurable via the session timeout parameter), the slaves will be notified of this via watchers in the zookeeper cluster. The slaves will realize that the Leader is gone, will elect a new Leader, and will start working with the new Leader. Does that answer your question? You might want to look through the documentation of ZK to understand its use cases and how it solves these kinds of issues. Thanks mahadev On 4/30/10 2:08 PM, Lei Gao l...@linkedin.com wrote: Thank you all for your answers. It clarifies a lot of my confusion about the service guarantees of ZK. I am still struggling with one failure case (I am not trying to be the pain in the neck. But I need to have a full understanding of what ZK can offer before I make a decision on whether to use it in my cluster.) Assume the following topology: Leader ZK cluster \\// \\ // \\ // Slave(s) If there is an asymmetric network failure such that the connections between the Leader and Slave(s) are broken while all other connections are still alive, would my system hang after some point? Because no new leader election will be initiated by slaves and the leader can't get the work to slave(s). Thanks, Lei On 4/30/10 1:54 PM, Ted Dunning ted.dunn...@gmail.com wrote: If one of your user clients can no longer reach one member of the ZK cluster, then it will try to reach another. If it succeeds, then it will continue without any problems as long as the ZK cluster itself is OK. This applies for all the ZK recipes. You will have to be a little bit careful to handle connection loss, but that should get easier soon (and isn't all that difficult anyway).
On Fri, Apr 30, 2010 at 1:26 PM, Lei Gao l...@linkedin.com wrote: I am not talking about the leader election within zookeeper cluster. I guess I didn't make the discussion context clear. In my case, I run a cluster that uses zookeeper for doing the leader election. Yes, nodes in my cluster are the clients of zookeeper. Those nodes depend on zookeeper to elect a new leader and figure out what the current leader is. So if the zookeeper (think of it as a stand-alone entity) becomes unavailabe in the way I've described earlier, how can I handle such situation so my cluster can still function while a majority of nodes still connect to each other (but not to the zookeeper)?
Re: Question on maintaining leader/membership status in zookeeper
Hi Lei, Sorry, I misinterpreted your question! The scenario you describe could be handled in such a way: you could have a status node in ZooKeeper which every slave will subscribe to and update! If one of the slave nodes sees that there have been too many connections refused to the Leader by the slaves, the slave could go ahead and delete the Leader znode, and force the Leader to give up its leadership. I am not describing a detailed way to do it, but it's not very hard to come up with a design for this. Do you intend to have the Leader and Slaves in different network (different ACLs I mean) protected zones? In that case it is a legitimate concern; otherwise I do think an asymmetric network partition would be very unlikely to happen. Do you usually see network partitions in such scenarios? Thanks mahadev On 4/30/10 4:05 PM, Lei Gao l...@linkedin.com wrote: Hi Mahadev, Why would the leader be disconnected from ZK? ZK is fine communicating with the leader in this case. We are talking about asymmetric network failure. Yes, the Leader could consider all the slaves being down if it tracks the status of all slaves himself. But I guess if ZK is used for membership management, neither the leader nor the slaves will be considered disconnected because they can all connect to ZK. Thanks, Lei On 4/30/10 3:47 PM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Lei, In this case, the Leader will be disconnected from the ZK cluster and will give up its leadership. Since it's disconnected, the ZK cluster will realize that the Leader is dead! When the ZK cluster realizes that the Leader is dead (this is because the zk cluster hasn't heard from the Leader for a certain time, configurable via the session timeout parameter), the slaves will be notified of this via watchers in the zookeeper cluster. The slaves will realize that the Leader is gone, will elect a new Leader, and will start working with the new Leader. Does that answer your question?
You might want to look though the documentation of ZK to understand its use case and how it solves these kind of issues Thanks mahadev On 4/30/10 2:08 PM, Lei Gao l...@linkedin.com wrote: Thank you all for your answers. It clarifies a lot of my confusion about the service guarantees of ZK. I am still struggling with one failure case (I am not trying to be the pain in the neck. But I need to have a full understanding of what ZK can offer before I make a decision on whether to used it in my cluster.) Assume the following topology: Leader ZK cluster \\// \\ // \\ // Slave(s) If I am asymmetric network failure such that the connection between Leader and Slave(s) are broken while all other connections are still alive, would my system hang after some point? Because no new leader election will be initiated by slaves and the leader can't get the work to slave(s). Thanks, Lei On 4/30/10 1:54 PM, Ted Dunning ted.dunn...@gmail.com wrote: If one of your user clients can no longer reach one member of the ZK cluster, then it will try to reach another. If it succeeds, then it will continue without any problems as long as the ZK cluster itself is OK. This applies for all the ZK recipes. You will have to be a little bit careful to handle connection loss, but that should get easier soon (and isn't all that difficult anyway). On Fri, Apr 30, 2010 at 1:26 PM, Lei Gao l...@linkedin.com wrote: I am not talking about the leader election within zookeeper cluster. I guess I didn't make the discussion context clear. In my case, I run a cluster that uses zookeeper for doing the leader election. Yes, nodes in my cluster are the clients of zookeeper. Those nodes depend on zookeeper to elect a new leader and figure out what the current leader is. 
So if the zookeeper (think of it as a stand-alone entity) becomes unavailabe in the way I've described earlier, how can I handle such situation so my cluster can still function while a majority of nodes still connect to each other (but not to the zookeeper)?
Re: Question on maintaining leader/membership status in zookeeper
Hi Lei, ZooKeeper provides a set of primitives which allow you to do all kinds of things! You might want to take a look at the api and some examples of zookeeper recipes to see how it works; probably that will clear things up for you. Here is the link: http://hadoop.apache.org/zookeeper/docs/r3.3.0/recipes.html Thanks mahadev On 4/30/10 4:46 PM, Lei Gao l...@linkedin.com wrote: Hi Mahadev, First of all, I'd like to thank you for being patient with me - my questions seem unclear to many of you who try to help me. I guess clients have to be smart enough to trigger a new leader election by trying to delete the znode. But in this case, ZK should not allow any single or multiple (as long as they are less than a quorum) client(s) to delete the znode corresponding to the master, right? A new consensus among clients (NOT among the nodes in the zk cluster) has to be there for the znode to be deleted, right? Does zk have this capability or do the clients have to come to this consensus outside of zk before trying to delete the znode in zk? Thanks, Lei Hi Lei, Sorry, I misinterpreted your question! The scenario you describe could be handled in such a way: you could have a status node in ZooKeeper which every slave will subscribe to and update! If one of the slave nodes sees that there have been too many connections refused to the Leader by the slaves, the slave could go ahead and delete the Leader znode, and force the Leader to give up its leadership. I am not describing a detailed way to do it, but it's not very hard to come up with a design for this. Do you intend to have the Leader and Slaves in different network (different ACLs I mean) protected zones? In that case it is a legitimate concern; otherwise I do think an asymmetric network partition would be very unlikely to happen. Do you usually see network partitions in such scenarios? Thanks mahadev On 4/30/10 4:05 PM, Lei Gao l...@linkedin.com wrote: Hi Mahadev, Why would the leader be disconnected from ZK?
ZK is fine communicating with the leader in this case. We are talking about asymmetric network failure. Yes. Leader could consider all the slaves being down if it tracks the status of all slaves himself. But I guess if ZK is used for for membership management, neither the leader nor the slaves will be considered disconnected because they can all connect to ZK. Thanks, Lei On 4/30/10 3:47 PM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Lei, In this case, the Leader will be disconnected from ZK cluster and will give up its leadership. Since its disconnected, ZK cluster will realize that the Leader is dead! When Zk cluster realizes that the Leader is dead (this is because the zk cluster hasn't heard from the Leader for a certain time Configurable via session timeout parameter), the slaves will be notified of this via watchers in zookeeper cluster. The slaves will realize that the Leader is gone and will relect a new Leader and will start working with the new Leader. Does that answer your question? You might want to look though the documentation of ZK to understand its use case and how it solves these kind of issues Thanks mahadev On 4/30/10 2:08 PM, Lei Gao l...@linkedin.com wrote: Thank you all for your answers. It clarifies a lot of my confusion about the service guarantees of ZK. I am still struggling with one failure case (I am not trying to be the pain in the neck. But I need to have a full understanding of what ZK can offer before I make a decision on whether to used it in my cluster.) Assume the following topology: Leader ZK cluster \\// \\ // \\ // Slave(s) If I am asymmetric network failure such that the connection between Leader and Slave(s) are broken while all other connections are still alive, would my system hang after some point? Because no new leader election will be initiated by slaves and the leader can't get the work to slave(s). 
Thanks, Lei On 4/30/10 1:54 PM, Ted Dunning ted.dunn...@gmail.com wrote: If one of your user clients can no longer reach one member of the ZK cluster, then it will try to reach another. If it succeeds, then it will continue without any problems as long as the ZK cluster itself is OK. This applies for all the ZK recipes. You will have to be a little bit careful to handle connection loss, but that should get easier soon (and isn't all that difficult anyway). On Fri, Apr 30, 2010 at 1:26 PM, Lei Gao l...@linkedin.com wrote: I am not talking about the leader election within zookeeper cluster. I guess I didn't make the discussion context clear. In my case, I run a cluster that uses zookeeper for doing the leader election. Yes, nodes in my cluster are the clients of zookeeper. Those nodes depend on zookeeper to elect
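The recipes document linked in this thread describes leader election via ephemeral sequential znodes: each candidate creates an ephemeral sequential node under an election parent, the candidate holding the lowest sequence number is the leader, and the others watch their immediate predecessor (not the leader itself) to avoid a herd effect. A sketch of just the selection rule — the node names and class below are illustrative, not ZooKeeper API:

```java
import java.util.Comparator;
import java.util.List;

// Illustrative selection rule from the leader-election recipe: given names
// like "n_0000000007" (as ZooKeeper appends to sequential znodes), the
// candidate with the lowest sequence number is the leader.
final class Election {
    static String leader(List<String> candidates) {
        return candidates.stream()
                .min(Comparator.comparingLong(Election::seq))
                .orElseThrow(IllegalArgumentException::new);
    }

    static long seq(String name) {
        return Long.parseLong(name.substring(name.lastIndexOf('_') + 1));
    }
}
```

Because the znodes are ephemeral, the leader's node vanishes when its session expires, which is what triggers re-election in the failure scenarios discussed above.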
Re: Embedding ZK in another application
We do set that Chad but it doesn't seem to help on some systems (especially bsd)... Thanks mahadev On 4/29/10 11:22 AM, Chad Harrington chad.harring...@gmail.com wrote: On Thu, Apr 29, 2010 at 8:49 AM, Patrick Hunt ph...@apache.org wrote: This is not foolproof however. We found that in general this would work, however there were some infrequent cases where a restarted server would fail to initialize due to the following issue: it is possible for the process to complete before the kernel has released the associated network resource, and this port cannot be bound to another process until the kernel has decided that it is done. more detail here: http://hea-www.harvard.edu/~fine/Tech/addrinuse.html as a result we ended up changing the test code to start each test with new quorum/election port numbers. This fixed the problem for us but would not be a solution in your case. Patrick I am not an expert at all on this, but I have used SO_REUSEADDR in other situations to avoid the address in use problem. Would that help here? Chad Harrington chad.harring...@gmail.com On 04/29/2010 07:13 AM, Vishal K wrote: Hi Ted, We want the application that embeds the ZK server to be running even after the ZK server is shutdown. So we don't want to restart the application. Also, we prefer not to use zkServer.sh/zkServer.cmd because these are OS dependent (our application will run on Win as well as Linux). Instead, we thought that calling QuorumPeerMain.initializeAndRun() and QuorumPeerMain.shutdown() will suffice to start and shutdown a ZK server and we won't have to worry about checking the OS. Is there way to cleanly shutdown the ZK server (by invoking ZK server API) when it is embedded in the application without actually restarting the application process? Thanks. On Thu, Apr 29, 2010 at 1:54 AM, Ted Dunningted.dunn...@gmail.com wrote: Hmmm it isn't quite clear what you mean by restart without restarting. Why is killing the server and restarting it not an option? 
It is common to do a rolling restart on a ZK cluster. Just restart one server at a time. This is often used during system upgrades. On Wed, Apr 28, 2010 at 8:22 PM, Vishal Kvishalm...@gmail.com wrote: What is a good way to restart a ZK server (standalone and quorum) without having to restart it? Currently, I have ZK server embedded in another java application.
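Chad's SO_REUSEADDR suggestion looks like this in Java: the option must be set before bind(), and it lets a restarted server rebind its port while the old socket lingers in TIME_WAIT. As Mahadev notes above, whether this fully avoids the restart problem is system-dependent (especially on BSD), so treat this as a sketch of the technique rather than a guaranteed fix; the helper class name is made up.

```java
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper: bind a listening socket with SO_REUSEADDR enabled,
// so a quick restart can rebind the same port despite TIME_WAIT remnants.
final class ReusableBind {
    static ServerSocket bind(int port) throws Exception {
        ServerSocket ss = new ServerSocket();   // unbound
        ss.setReuseAddress(true);               // must be set before bind()
        ss.bind(new InetSocketAddress(port));
        return ss;
    }
}
```

The alternative the test code took — allocating fresh quorum/election port numbers per test — sidesteps the kernel behavior entirely, at the cost of not exercising real restarts on a fixed port.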
Re: Zookeeper client
Hi Avinash, The zk client does maintain liveness information itself and also randomizes the list of servers to balance the number of clients connected to a single ZooKeeper server. Hope that helps. Thanks mahadev On 4/27/10 10:56 AM, Avinash Lakshman avinash.laksh...@gmail.com wrote: Let's assume I have 100 clients connecting to a cluster of 5 Zookeeper servers over time. On the client side I instantiate a ZooKeeper instance and use it whenever I need to read/write into ZK. Now I know I can pass in a connect string with the list of all servers that make up the ZK cluster. Does the ZK client automatically maintain liveness information and load balance my connections across the machines? How can I do this effectively? I basically want to spread the connections from the 100 clients to the 5 ZK instances effectively. Thanks Avinash
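The load balancing Mahadev describes happens client-side: the ZooKeeper client shuffles the hosts parsed from the connect string, so many clients given the same string spread across the ensemble. A rough, library-free Java sketch of that idea (the helper is hypothetical, not the actual client API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ConnectStringShuffle {
    // Hypothetical helper: split a ZooKeeper-style connect string into
    // host:port entries and randomize their order. The real client does
    // something similar internally before picking a server to try.
    static List<String> shuffledHosts(String connectString) {
        List<String> hosts =
            new ArrayList<>(Arrays.asList(connectString.split(",")));
        Collections.shuffle(hosts);
        return hosts;
    }

    public static void main(String[] args) {
        String cs = "zk1:2181,zk2:2181,zk3:2181,zk4:2181,zk5:2181";
        System.out.println(shuffledHosts(cs));
    }
}
```

On connection loss the client walks this list looking for a live server, which is where the liveness handling Mahadev mentions comes in.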
Re: Embedding ZK in another application
Hi Vishal and Asankha, I think Ted and Pat had somewhat commented on this before. Reiterating these comments below. If you are OK with these points, I see no concern in running ZooKeeper as an embedded application. Also, as Pat mentioned earlier, there are some cases where the server code will call System.exit. This is typically only if quorum communication fails in some weird, unrecoverable way. We have removed most of these but there are a few still remaining. --- Comments by Ted I can't comment on the details of your code (but I have run in-process ZK's in the past without problem) Operationally, however, this isn't a great idea. The problem is two-fold: a) firstly, somebody would probably like to look at Zookeeper to understand the state of your service. If the service is down, then ZK will go away. That means that Zookeeper can't be used that way and is mild to moderate on the logarithmic international suckitude scale. b) secondly, if you want to upgrade your server without upgrading Zookeeper then you still have to bounce Zookeeper. This is probably not a problem, but it can be a slight pain. c) thirdly, you can't scale your service independently of how you scale Zookeeper. This may or may not bother you, but it would bother me. d) fourthly, you will be synchronizing your server restarts with ZK's service restarts. Moving these events away from each other is likely to make them slightly more reliable. There is no failure mode that I know of that would be tickled here, but your service code will be slightly more complex since it has to make sure that ZK is up before it does stuff. If you could make the assumption that ZK is up or exit, that would be simpler. e) yes, I know that is more than two issues. That is itself an issue since any design where the number of worries is increasing so fast is suspect on larger grounds. If there are small problems cropping up at that rate, the likelihood of there being a large problem that comes up seems higher.
On 4/23/10 11:04 AM, Vishal K vishalm...@gmail.com wrote: Hi, Good question. We are planning to do something similar as well and it will be great to know if there are any issues with embedding a ZK server into an app. We simply use QuorumPeerMain and QuorumPeer from our app to start/stop the ZK server. Is this not a good way to do it? On Fri, Apr 23, 2010 at 1:28 PM, Asankha C. Perera asan...@apache.org wrote: Hi All I'm very new to ZK, and am looking at embedding ZK into an app that needs cluster management - and the objective is to use ZK to notify application cluster control operations (e.g. shutdown etc) across nodes. I came across this post [1] from the user list by Ted Dunning from some months back : My experience with Katta has led me to believe that embedding a ZK in a product is almost always a bad idea. - The problems are that you can't administer the Zookeeper cluster independently and that the cluster typically goes down when the associated service goes down. However, I believe that both the above are fine to live with for the application under consideration, as ZK will be used only to coordinate the larger application. Is there anything else that needs to be considered - and can I safely shutdown the clientPort since the application is always in the same JVM - but, if I do that how would I connect to ZK thereafter ? thanks and regards asankha [1] http://markmail.org/message/tjonwec7p7dhfpms
Re: Embedding ZK in another application
That's true! Thanks mahadev On 4/23/10 11:41 AM, Asankha C. Perera asan...@apache.org wrote: Hi Mahadev I think Ted and Pat had somewhat commented on this before. Reiterating these comments below. If you are ok with these points I see no concern in ZooKeeper as an embedded application... Thanks, I missed this on the archives, and it helps!.. I guess if we still decide to embed, the only way to connect to ZK is still with the normal TCP client.. cheers asankha
Re: bug: wrong heading in recipes doc
I think we should be using zookeeper locks to create jiras :) . Looks like both of you created one!!! :) Thanks mahadev On 4/22/10 1:37 PM, Patrick Hunt ph...@apache.org wrote: No problem. https://issues.apache.org/jira/browse/ZOOKEEPER-752 I've seen a lot of traffic on infrastruct...@apache, you might try there, I'm sure they could help you out. Regards, Patrick On 04/22/2010 01:26 PM, Adam Rosien wrote: I would, but the Apache JIRA has been f***ed since the break-in and I can't reset my password. Would you mind adding it for me? .. Adam On Thu, Apr 22, 2010 at 11:32 AM, Patrick Hunt ph...@apache.org wrote: Hi Adam, would you mind creating a JIRA? That's the best way to address this type of issue. Thanks! https://issues.apache.org/jira/browse/ZOOKEEPER Patrick On 04/22/2010 11:30 AM, Adam Rosien wrote: http://hadoop.apache.org/zookeeper/docs/r3.3.0/recipes.html#sc_recoverableSharedLocks uses the heading recoverable locks, but the text refers to revocable. .. Adam
Re: odd error message
Ok, I think this is possible. So here is what happens currently. This has been a long standing bug and should be fixed in 3.4 https://issues.apache.org/jira/browse/ZOOKEEPER-335 A newly elected leader currently doesn't log the new leader transaction to its database. In your case, the follower (the 3rd server) did log it but the leader never did. Now when you brought up the 3rd server it had the transaction log present but the leader did not have it. In that case the 3rd server cried foul and shut down. Removing the DB is totally fine. For now, we should update our docs on 3.3 and mention that this problem might occur during upgrade, and fix it in 3.4. Thanks for bringing it up Ted. Thanks mahadev On 4/20/10 2:14 PM, Ted Dunning ted.dunn...@gmail.com wrote: We have just done an upgrade of ZK to 3.3.0. Previous to this, ZK has been up for about a year with no problems. On two nodes, we killed the previous instance and started the 3.3.0 instance. The first node was a follower and the second a leader. All went according to plan and no clients seemed to notice anything. The stat command showed connections moving around as expected and all other indicators were normal. When we did the third node, we saw this in the log: 2010-04-20 14:07:49,010 - FATAL [QuorumPeer:/0.0.0.0:2181:follo...@71] - Leader epoch 18 is less than our epoch 19 The third node refused all connections. We brought down the third node, wiped away its snapshot, restarted and it joined without complaint. Note that the third node was originally a follower and had never been a leader during the upgrade process. Does anybody know why this happened? We are fully upgraded and there was no interruption to normal service, but this seems strange.
Re: Recovery issue - how to debug?
Hi Hao, As Vishal already asked, how are you determining if the writes are being received? Also, what was the status of C2 when you checked for these writes? Do you have the output of echo stat | nc localhost port? How long did you wait when you say that C2 did not receive the writes? What was the status of C2 (again echo stat | nc localhost port) when you saw that C2 had received the writes? Thanks mahadev On 4/18/10 10:54 PM, Dr Hao He h...@softtouchit.com wrote: I have a zookeeper cluster E1 with 3 nodes A, B, and C. I stopped C and did some writes on E1. Both A and B received the writes. I then started C and after a short while, C also received the writes. All seemed to go well, so I replicated the setup to another cluster E2 with exactly 3 nodes: A2, B2, and C2. I stopped C2 and did some writes on E2. A2 received the writes. I then started C2. However, no matter how long I wait, C2 never received the writes. I then did more writes on E2. Then C2 received all the writes, including the old writes from when it was down. How do I find out what was wrong with the E2 setup? I am running 3.2.2 on all nodes. Regards, Dr Hao He XPE - the truly SOA platform h...@softtouchit.com http://softtouchit.com
Re: rolling upgrade 3.2.1 - 3.3.0
Hi Charity, Looks like you are hitting a bug recently found in 3.3.0. https://issues.apache.org/jira/browse/ZOOKEEPER-737 is the bug, wherein the server does not show the right status. Looks like in your case the server is running fine but bin/zkServer.sh status is not returning the right result. You can try telnet localhost port and then type stat to get the status on the server. This bug will be fixed in the bug fix release 3.3.1 which most probably will be released by next week or so. Thanks mahadev On 4/14/10 3:59 PM, Charity Majors char...@shopkick.com wrote: Hi. I'm trying to upgrade a zookeeper cluster from 3.2.1 to 3.3.0, and having problems. I can't get a 3.3.0 node to successfully join the cluster and stay joined. If I run zkServer.sh status immediately after starting up the newly upgraded node, it says the service is probably not running, and shows me this: [char...@test-zookeeper001 zookeeper-current]$ bin/zkServer.sh status JMX enabled by default Using config: /services/zookeeper/zookeeper-20100412.1/bin/../conf/zoo.cfg 2010-04-14 22:47:35,574 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioservercnxn$fact...@251] - Accepted socket connection from /127.0.0.1:40287 2010-04-14 22:47:35,576 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@968] - Processing stat command from /127.0.0.1:40287 2010-04-14 22:47:35,577 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@606] - EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket 2010-04-14 22:47:35,578 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@1286] - Closed socket connection for client /127.0.0.1:40287 (no session established for client) Error contacting service. It is probably not running.
[char...@test-zookeeper001 zookeeper-current]$ 2010-04-14 22:47:35,580 - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@1310] - ignoring exception during input shutdown java.net.SocketException: Transport endpoint is not connected at sun.nio.ch.SocketChannelImpl.shutdown(Native Method) at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640) at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360) at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1306) at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1263) at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:609) at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262) If I connect with zkCli.sh, I can list the contents of zookeeper. If I make changes to the schema on either of the other two nodes, test-zookeeper002 and test-zookeeper003, both of which are running 3.2.1, the changes are reflected on test-zookeeper001, which is running 3.3.0. When I exit zkCli.sh, however, zkServer.sh status starts flapping between Error contacting service. It is probably not running. and Mode: follower, as you can see below. Any ideas? I'd really rather not have to take the production zookeeper cluster down to upgrade if it's not necessary. Thanks, Charity. 
[char...@test-zookeeper001 zookeeper-current]$ bin/zkServer.sh status JMX enabled by default Using config: /services/zookeeper/zookeeper-20100412.1/bin/../conf/zoo.cfg 2010-04-14 22:53:16,848 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioservercnxn$fact...@251] - Accepted socket connection from /127.0.0.1:55284 2010-04-14 22:53:16,849 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@968] - Processing stat command from /127.0.0.1:55284 2010-04-14 22:53:16,849 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@606] - EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket 2010-04-14 22:53:16,850 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@1286] - Closed socket connection for client /127.0.0.1:55284 (no session established for client) Error contacting service. It is probably not running. 2010-04-14 22:53:16,850 - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@1310] - ignoring exception during input shutdown java.net.SocketException: Transport endpoint is not connected at sun.nio.ch.SocketChannelImpl.shutdown(Native Method) at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640) at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360) at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1306) at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1263) at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:609) at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262)
Re: feed queue fetcher with hadoop/zookeeper/gearman?
Hi Thomas, There are a couple of projects inside Yahoo! that use ZooKeeper as an event manager for feed processing. I am a little bit unclear on your example below. As I understand it: 1. There are 1 million feeds that will be stored in HBase. 2. A map reduce job will be run on these feeds to find out which feeds need to be fetched. 3. This will create queues in ZooKeeper to fetch the feeds. 4. Workers will pull items from this queue and process feeds. Did I understand it correctly? Also, if the above is the case, how many queue items would you anticipate accumulating every hour? Thanks mahadev On 4/12/10 1:21 AM, Thomas Koch tho...@koch.ro wrote: Hi, I'd like to implement a feed loader with Hadoop and most likely HBase. I've got around 1 million feeds that should be loaded and checked for new entries. However the feeds have different priorities based on their average update frequency in the past and their relevance. The feeds (url, last_fetched timestamp, priority) are stored in HBase. How could I implement the fetch queue for the loaders? - An hourly map-reduce job to produce new queues for each node and save them on the nodes? - but how to know, which feeds have been fetched in the last hour? - what to do, if a fetch node dies? - Store a fetch queue in zookeeper and add to the queue with map-reduce each hour? - Isn't that too much load for zookeeper? (I could make one znode for a bunch of urls...?) - Use gearman to store the fetch queue? - But the gearman job server still seems to be a SPOF [1] http://gearman.org Thank you! Thomas Koch, http://www.koch.ro
Re: Errors while running sytest
Great. I was just responding with a different solution: --- Looks like the fatjar does not include the junit classes. Also, the -jar option does not use the classpath environment variable. Here is an excerpt from the man page of java: -jar Execute a program encapsulated in a JAR archive. The first argument is the name of a JAR file instead of a startup class name. In order for this option to work, the manifest of the JAR file must When you use this option, the JAR file is the source of all user classes, and other user class path settings are ignored. So you will have to use the main class in fatjar with the java -classpath option with all the libraries in the classpath. java -cp log4j:junit:fatjar org.apache.zookeeper.util.FatJarMain server ... But putting it in build and including it as part of fatjar is much more convenient!!! Thanks mahadev On 4/7/10 1:09 PM, Vishal K vishalm...@gmail.com wrote: Hi, It works for me now. Just for the record, I had to copy junit*.jar to build/lib because fat.jar expects it to be there. Then, I had to rebuild fatjar.jar. On Wed, Apr 7, 2010 at 12:10 AM, Vishal K vishalm...@gmail.com wrote: Hi, I am trying to run systest on a 3 node cluster ( http://svn.apache.org/repos/asf/hadoop/zookeeper/trunk/src/java/systest/README.txt ). When I reach the 4th step which is to actually run the test I get the exception shown below.
Exception in thread main java.lang.NoClassDefFoundError: junit/framework/TestCase at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632) at java.lang.ClassLoader.defineClass(ClassLoader.java:616) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) at java.net.URLClassLoader.defineClass(URLClassLoader.java:283) at java.net.URLClassLoader.access$000(URLClassLoader.java:58) at java.net.URLClassLoader$1.run(URLClassLoader.java:197) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:307) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:248) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:169) at org.apache.zookeeper.util.FatJarMain.main(FatJarMain.java:97) Caused by: java.lang.ClassNotFoundException: junit.framework.TestCase at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:307) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:248) ... 15 more Looks like it is not able to find classes in junit. However, my classpath is set right: :/opt/zookeeper-3.3.0/zookeeper.jar:/opt/zookeeper-3.3.0/lib/junit-4.4.jar:/opt/zookeeper-3.3.0/lib/log4j-1.2.15.jar:/opt/zookeeper-3.3.0/build/test/lib/junit-4.8.1.jar Any suggestions how I can get around this problem? Thanks.
Re: deleting a node - command line tool
Hi Karthik, You can use bin/zkCli.sh which provides a nice command line shell interface for executing commands. Thanks mahadev On 3/26/10 9:42 AM, Karthik K oss@gmail.com wrote: Hi - I am looking to delete a node (say, /katta) from a running zk ensemble altogether and curious if there is any command-line tool that is available that can do a delete. -- Karthik.
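For reference, a hedged sketch of such a zkCli.sh session (the host, port, and paths are made up; note that delete only removes an empty znode, so any children have to be deleted first):

```
# Connect to one server of the ensemble
bin/zkCli.sh -server localhost:2181

# Inside the shell: inspect, then delete children before the parent,
# since delete refuses to remove a znode that still has children
ls /katta
delete /katta/some-child
delete /katta
```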
Re: Zookeeper unit tester?
Hi David, We don't really have a mock/test ZooKeeper client which does not do any I/O. We have been thinking about using mockito sometime soon for this kind of testing, but currently there is none. Thanks mahadev On 3/9/10 2:23 PM, David Rosenstrauch dar...@darose.net wrote: Just wondering if there was a mock/fake version of org.apache.zookeeper.Zookeeper that could be used for unit testing? What I'm envisioning would be a single instance Zookeeper that operates completely in memory, with no network or disk I/O. This would make it possible to pass one of these memory-only FakeZookeepers into unit tests, while using a real Zookeeper in production code. Any such animal? :-) Thanks, DR
Re: Managing multi-site clusters with Zookeeper
Hi Martin, The results would be really nice information to have on the ZooKeeper wiki. Would be very helpful for others considering the same kind of deployment. So, do send out your results on the list. Thanks mahadev On 3/8/10 11:18 AM, Martin Waite waite@googlemail.com wrote: Hi Patrick, Thanks for your input. I am planning on having 3 zk servers per data centre, with perhaps only 2 in the tie-breaker site. The traffic between zk and the applications will be lots of local reads ("which is the primary database?"). Changes to the config will be rare (server rebuilds, etc - ie. planned changes) or caused by server / network / site failure. The interesting thing in my mind is how zookeeper will cope with inter-site link failure - how quickly the remote sites will notice, and how quickly normality can be resumed when the link reappears. I need to get this running in the lab and start pulling out wires. regards, Martin On 8 March 2010 17:39, Patrick Hunt ph...@apache.org wrote: IMO latency is the primary issue you will face, but also keep in mind reliability w/in a colo. Say you have 3 colos (obv can't be 2), if you only have 3 servers, one in each colo, you will be reliable but clients w/in each colo will have to connect to a remote colo if the local fails. You will want to prioritize the local colo given that reads can be serviced entirely local that way. If you have 7 servers (2-2-3) that would be better - if a local server fails you have a redundant one, if both fail then you go remote. You want to keep your writes as few as possible and as small as possible? Why? Say you have 100ms latency btw colos, let's go through a scenario for a client in a colo where the local servers are not the leader (zk cluster leader).
read: 1) client reads a znode from local server 2) local server (usually 1ms if in colo comm) responds in 1ms write: 1) client writes a znode to local server A 2) A proposes change to the ZK Leader (L) in remote colo 3) L gets the proposal in 100ms 4) L proposes the change to all followers 5) all followers (not exactly, but hopefully) get the proposal in 100ms 6) followers ack the change 7) L gets the acks in 100ms 8) L commits the change (message to all followers) 9) A gets the commit in 100ms 10) A responds to client (1ms) write latency: 100 + 100 + 100 + 100 = 400ms Obviously keeping these writes small is also critical. Patrick Martin Waite wrote: Hi Ted, If the links do not work for us for zk, then they are unlikely to work with any other solution - such as trying to stretch Pacemaker or Red Hat Cluster with their multicast protocols across the links. If the links are not good enough, we might have to spend some more money to fix this. regards, Martin On 8 March 2010 02:14, Ted Dunning ted.dunn...@gmail.com wrote: If you can stand the latency for updates then zk should work well for you. It is unlikely that you will be able to do better than zk does and still maintain correctness. Do note that you can probably bias the client to use a local server. That should make things more efficient. Sent from my iPhone On Mar 7, 2010, at 3:00 PM, Mahadev Konar maha...@yahoo-inc.com wrote: The inter-site links are a nuisance. We have two data-centres with 100Mb links which I hope would be good enough for most uses, but we need a 3rd site - and currently that only has 2Mb links to the other sites. This might be a problem.
Re: Managing multi-site clusters with Zookeeper
Hi Martin, As Ted rightly mentions, ZooKeeper is usually run within a colo because of the low latency requirements of the applications that it supports. It's definitely reasonable to use it in multi data center environments, but you should realize the implications of it. The high latency/low throughput means that you should make minimal use of such a ZooKeeper ensemble. Also, there are things like the tickTime, the syncLimit and others (setup parameters for ZooKeeper in config) which you will need to tune a little to get ZooKeeper running without many hiccups in this environment. Thanks mahadev On 3/6/10 10:29 AM, Ted Dunning ted.dunn...@gmail.com wrote: What you describe is relatively reasonable, even though Zookeeper is not normally distributed across multiple data centers with all members getting full votes. If you account for the limited throughput that this will impose on your applications that use ZK, then I think that this can work well. Probably, you would have local ZK clusters for higher transaction rate applications. You should also consider very carefully whether having multiple data centers increases or decreases your overall reliability. Unless you design very carefully, this will normally substantially degrade reliability. Making sure that it increases reliability is a really big task that involves a lot of surprising (it was to me) considerations and considerable hardware and time investments. Good luck! On Sat, Mar 6, 2010 at 1:50 AM, Martin Waite waite@googlemail.com wrote: Is this a viable approach, or am I taking Zookeeper out of its application domain and just asking for trouble ?
Re: Managing multi-site clusters with Zookeeper
Martin, A 2Mb link might certainly be a problem. We refer to these nodes as ZooKeeper servers; znodes refers to the data elements in the ZooKeeper data tree. The Zookeeper ensemble has minimal traffic, which is basically health checks between the members of the ensemble. We call the member leading the ensemble the Leader and the others Followers. The Leader does periodic health checks to see if the Followers are doing fine. This is of the order of 1KB/sec. There is some traffic when the leader election within the ensemble happens. This might be of the order of 1-2KB/sec. As you mentioned, the reads happen locally. So, a good enough link between the ensemble members is important so that the followers can be up to date with the Leader. But again, looking at your config, it looks like it's mostly read-only traffic. One more thing you should be aware of: let's say an ephemeral node was created and the client died; then the clients connected to the slow ZooKeeper server (with 2Mb/s links) would lag behind the clients connected to the other servers. In my opinion you should do some testing, since 2Mb/sec seems a little dodgy. Thanks mahadev On 3/7/10 2:09 PM, Martin Waite waite@googlemail.com wrote: Hi Mahadev, The inter-site links are a nuisance. We have two data-centres with 100Mb links which I hope would be good enough for most uses, but we need a 3rd site - and currently that only has 2Mb links to the other sites. This might be a problem. The ensemble would have a lot of read traffic from applications asking which database to connect to for each transaction - which presumably would be mostly handled by local zookeeper servers (do we call these nodes as opposed to znodes ?). The write traffic would be mostly changes to configuration (a rare event), and changes in the health of database servers - also hopefully rare. I suppose the main concern is how much ambient zookeeper system chatter will cross the links.
Are there any measurements of how much traffic is used by zookeeper in maintaining the ensemble ? Another question that occurs is whether I can link sites A,B, and C in a ring - so that if any one site drops out, the remaining 2 continue to talk. I suppose that if the zookeeper servers are all in direct contact with each other, this issue does not exist. regards, Martin On 7 March 2010 21:43, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Martin, As Ted rightly mentions, ZooKeeper is usually run within a colo because of the low latency requirements of the applications that it supports. It's definitely reasonable to use it in multi data center environments, but you should realize the implications of it. The high latency/low throughput means that you should make minimal use of such a ZooKeeper ensemble. Also, there are things like the tickTime, the syncLimit and others (setup parameters for ZooKeeper in config) which you will need to tune a little to get ZooKeeper running without many hiccups in this environment. Thanks mahadev On 3/6/10 10:29 AM, Ted Dunning ted.dunn...@gmail.com wrote: What you describe is relatively reasonable, even though Zookeeper is not normally distributed across multiple data centers with all members getting full votes. If you account for the limited throughput that this will impose on your applications that use ZK, then I think that this can work well. Probably, you would have local ZK clusters for higher transaction rate applications. You should also consider very carefully whether having multiple data centers increases or decreases your overall reliability. Unless you design very carefully, this will normally substantially degrade reliability. Making sure that it increases reliability is a really big task that involves a lot of surprising (it was to me) considerations and considerable hardware and time investments. Good luck!
On Sat, Mar 6, 2010 at 1:50 AM, Martin Waite waite@googlemail.com wrote: Is this a viable approach, or am I taking Zookeeper out of its application domain and just asking for trouble ?
Re: zookeeper utils
Hi David, There is an implementation for locks and queues in src/recipes. The documentation resides in src/recipes/{lock,queue}/README.txt. Thanks mahadev On 3/2/10 1:04 PM, David Rosenstrauch dar...@darose.net wrote: Was reading through the zookeeper docs on the web - specifically the recipes and solutions page (as well as comments elsewhere inviting additional such contributions from the community) and was wondering: Is there a library of higher-level zookeeper utilities that people have contributed, beyond the barrier and queue examples provided in the docs? Thanks, DR
Re: is there a good pattern for leases ?
I am not sure if I was clear enough in my last message. What I suggested was this: Create a client with a timeout of, let's say, 10 seconds! ZooKeeper zk = new ZooKeeper(...); (for brevity ignoring other parameters) zk.create("/parent/ephemeral", data, EPHEMERAL); // create another thread that triggers at 120 seconds On a trigger from this thread, call zk.delete("/parent/ephemeral"); That's how a lease can be done at the application side. Obviously your lease also expires on a session close and other events, which you need to be monitoring. Thanks mahadev On 2/24/10 11:09 AM, Martin Waite waite@googlemail.com wrote: Hi Mahadev, That is interesting. All I need to do is hold the connection for the required time of a session that created an ephemeral node. Zookeeper is an interesting tool. Thanks again, Martin On 24 February 2010 17:00, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Martin, There isn't an inherent model for leases in the zookeeper library itself. To implement leases you will have to implement them at your application side with timeout triggers (lease triggers) leading to session close at the client. Thanks mahadev On 2/24/10 3:40 AM, Martin Waite waite@googlemail.com wrote: Hi, Is there a good model for implementing leases in Zookeeper ? What I want to achieve is for a client to create a lock, and for that lock to disappear two minutes later - regardless of whether the client is still connected to zk. Like ephemeral nodes - but with a time delay. regards, Martin
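The timer-driven pattern above can be sketched without a live ensemble by standing a dummy delete action in for the real zk.delete call; everything here except the pattern itself (names, timings, the deleteAction callback) is made up for illustration:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class LeaseSketch {
    // "Acquire" a lease: after creating the ephemeral node, schedule its
    // deletion after leaseMillis. With a real ensemble, deleteAction would
    // call zk.delete on the ephemeral node's path. Returns a latch that
    // opens once the lease has been released.
    static CountDownLatch acquireLease(long leaseMillis, Runnable deleteAction) {
        CountDownLatch released = new CountDownLatch(1);
        ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
        timer.schedule(() -> {
            deleteAction.run();   // give the lock up when the lease expires
            released.countDown();
            timer.shutdown();
        }, leaseMillis, TimeUnit.MILLISECONDS);
        return released;
    }

    public static void main(String[] args) throws Exception {
        CountDownLatch done = acquireLease(50,
            () -> System.out.println("lease expired: deleting /parent/ephemeral"));
        done.await();             // block until the lease has been released
    }
}
```

As Mahadev says, a real implementation must also treat session expiry and connection loss as the lease ending, since the ephemeral node disappears with the session.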
Re: how to lock one-of-many ?
Hi Martin, Earlier you could not find out which server the client is connected to. This was fixed in this jira http://issues.apache.org/jira/browse/ZOOKEEPER-544 But again this does not tell you if you are connected to the primary or one of the followers. So you will anyway have to do some manual testing, specifying the client host:port address as just the primary or just the follower (for the follower test case). Leaking information (like whether the server is the primary or not) can cause applications to use this information in a wrong way. So we never exposed this information! :) Thanks mahadev On 2/24/10 11:25 AM, Martin Waite waite@googlemail.com wrote: Hi, I take the point that the watch is useful for stopping clients unnecessarily pestering the zk nodes. I think that this is something I will have to experiment with and see how it goes. I only need to place about 10k locks per minute, so I am hoping that whatever approach I take is well within the headroom of Zookeeper on some reasonable boxes. Is it possible for the client to know whether it has connected to the current primary or not ? During my testing I would like to make sure that the approach works both when the client is attached to the primary and when attached to a lagged non-primary node. regards, Martin On 24 February 2010 18:42, Ted Dunning ted.dunn...@gmail.com wrote: Random back-off like this is unlikely to succeed (seems to me). Better to use the watch on the locks directory to make the wait as long as possible AND as short as possible. On Wed, Feb 24, 2010 at 8:53 AM, Patrick Hunt ph...@apache.org wrote: Anyone interested in locking an explicit resource attempts to create an ephemeral node in /locks with the same ### as the resource they want access to. If interested in just getting any resource then you would getchildren(/resources) and getchildren(/locks) and attempt to lock anything not in the intersection (avail).
This could be done efficiently since resources won't change much: just cache the results of getchildren and set a watch at the same time. To lock a resource, randomize avail and attempt to lock each in turn. If all of avail fail to acquire the lock, then hold off for some random time, then re-getchildren(locks) and start over. -- Ted Dunning, CTO DeepDyve
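The "lock anything not in the intersection" step above is pure set arithmetic once the two child lists are in hand. A minimal sketch of just that step (the getchildren and ephemeral-create calls themselves are omitted; the class name is hypothetical):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Given the children of /resources and /locks, compute the candidate
// resources (those not currently locked) in randomized order, as the
// thread suggests. The caller would then attempt an ephemeral create
// under /locks for each candidate until one succeeds.
class LockCandidates {
    static List<String> available(List<String> resources, List<String> locked) {
        Set<String> lockedSet = new HashSet<>(locked);
        List<String> avail = new ArrayList<>();
        for (String r : resources) {
            if (!lockedSet.contains(r)) {
                avail.add(r);
            }
        }
        Collections.shuffle(avail);   // randomize to spread contention
        return avail;
    }
}
```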
Re: how to lock one-of-many ?
Hi Martin, How about this - you have resources in a directory (say /locks); each process which needs a lock lists all the children of this directory and then creates an ephemeral node called /locks/resource1/lock, depending on which resource it wants to lock. This ephemeral node will be deleted by the process as soon as it's done using the resource. A process should only use resource_{i} if it's been able to create /locks/resource_{i}/lock. Would that work? Thanks mahadev On 2/23/10 4:05 AM, Martin Waite waite@googlemail.com wrote: Hi, I have a set of resources, each of which has a unique identifier. Each resource element must be locked before it is used, and unlocked afterwards. The logic of the application is something like: lock any one element; if (none locked) then exit with error; else get resource-id from lock; use resource; unlock resource; end. Zookeeper looks like a good candidate for managing these locks, being fast and resilient, and it seems quite simple to recover from client failure. However, I cannot think of a good way to implement this sort of one-of-many locking. I could create a directory called available and another called locked. Available would have one entry for each resource id (or one entry containing a list of the resource-ids). For locking, I could loop through the available ids, attempting to create a lock for each in the locked directory. However, this seems a bit clumsy and slow. Also, the locks are held for a relatively short time (1 second on average), and by the time I have blundered through all the possible locks, ids that were locked at the start might be available again by the time I finished. Can anyone think of a more elegant and efficient way of doing this? regards, Martin
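The recipe above — try to create /locks/resource_{i}/lock as an ephemeral node for each resource until one create succeeds — can be sketched with the create call abstracted out. The tryCreate predicate here is a hypothetical stand-in for ZooKeeper.create with CreateMode.EPHEMERAL, returning false when the node already exists (someone else holds that lock):

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Try to lock any one of the given resources by creating an ephemeral
// node under /locks/<resource>/lock. tryCreate stands in for the real
// ephemeral create, returning false on a node-already-exists failure.
class AnyResourceLock {
    static Optional<String> lockAny(List<String> resources,
                                    Predicate<String> tryCreate) {
        for (String r : resources) {
            String lockPath = "/locks/" + r + "/lock";
            if (tryCreate.test(lockPath)) {
                return Optional.of(r);   // we now hold this resource
            }
        }
        return Optional.empty();         // none available right now
    }
}
```

The caller deletes the lock node (or lets session expiry delete it) when done with the resource, which is what makes recovery from client failure simple.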
Re: Bit of help debugging a TIMED OUT session please
Hi Stack, the other interesting part is with the session: 0x26ed968d880001 Looks like it gets disconnected from one of the servers (TIMEOUT). Do you see any of these messages: Attempting connection to server in the logs before you see all the consecutive org.apache.zookeeper.ClientCnxn: Exception closing session 0x26ed968d880001 to sun.nio.ch.selectionkeyi...@788ab708 java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4] at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945) and from the client 0x26ed968d880001? Thanks mahadev On 2/22/10 11:42 AM, Stack st...@duboce.net wrote: The thing that seems odd to me is that the connectivity complaints are out of the zk client, right? Why is it failing getting to member 14, and why not move to another ensemble member if there's an issue with 14? And if there were a general connectivity issue, I'd think that the running hbase cluster would be complaining at about the same time (it's talking to datanodes and masters at this time). (Thanks for the input lads) St.Ack On Mon, Feb 22, 2010 at 11:26 AM, Mahadev Konar maha...@yahoo-inc.com wrote: I also looked at the logs. Ted might have a point. It does look like the zookeeper servers are doing fine (though as Ted mentions the skew is a little concerning, though that might be due to very few packets served by the first server). Other than that, the latencies of 300 ms at max should not cause any timeouts. Also, the number of packets received is pretty low - meaning that it wasn't serving huge traffic. Is there any way we can check if the network connection from the client to the server is flaky? Thanks mahadev On 2/22/10 10:40 AM, Ted Dunning ted.dunn...@gmail.com wrote: Not sure this helps at all, but these times are remarkably asymmetrical. I would expect members of a ZK cluster to have very comparable times. 
Additionally, 345 ms is nowhere near large enough to cause a session to expire. My take is that ZK doesn't think it caused the timeout. On Mon, Feb 22, 2010 at 10:18 AM, Stack st...@duboce.net wrote: Latency min/avg/max: 2/125/345 ... Latency min/avg/max: 0/7/81 ... Latency min/avg/max: 1/1/1 Thanks for any pointers on how to debug.
Re: Ordering guarantees for async callbacks vs watchers
Hi Martin, a call like getChildren(final String path, Watcher watcher, ChildrenCallback cb, Object ctx) means: set a watch on this node for any further changes on the server. A client will see the response to getChildren before the above watch is fired. Hope that helps. Thanks mahadev On 2/10/10 6:59 PM, Martin Traverso mtrave...@gmail.com wrote: What are the ordering guarantees for asynchronous callbacks vs watcher notifications (Java API) when both are used in the same call? E.g., for getChildren(final String path, Watcher watcher, ChildrenCallback cb, Object ctx) Will the callback always be invoked before the watcher if there is a state change on the server at about the same time the call is made? I *think* that's what's implied by the documentation, but I'm not sure I'm reading it right: All completions for asynchronous calls and watcher callbacks will be made in order, one at a time. The caller can do any processing they wish, but no other callbacks will be processed during that time. ( http://hadoop.apache.org/zookeeper/docs/r3.2.2/zookeeperProgrammers.html#Java+ Binding ) Thanks! Martin
ZOOKEEPER-22 and release 3.3
Hi all, I had been working on ZOOKEEPER-22 and found out that it needs quite a few extensive changes. We will need to do some memory measurements to see if it has any memory impact or not. Since we are targeting the 3.3 release for early March, ZOOKEEPER-22 would be hard to get into 3.3. I am proposing to move it to a later release (3.4), so that it can be tested early in the release phase and gets baked into the release. Thanks mahadev
Re: Q about ZK internal: how commit is being remembered
Qian, ZooKeeper guarantees that if a client sees some transaction response, then it will persist, but the ones that a client does not see might be discarded or committed. So in case a quorum does not log the transaction, there might be a case wherein a zookeeper server which does not have the logged transaction becomes the leader (because the machines with the logged transaction are down). In that case the transaction is discarded. In the case when a machine which has the logged transaction becomes the leader, that transaction will be committed. Hope that clears your doubt. mahadev On 1/28/10 6:02 PM, Qian Ye yeqian@gmail.com wrote: Thanks Henry and Ben. Actually I have read the paper Henry mentioned in this mail, but I'm still not so clear on some of the details. Anyway, maybe more study of the source code can help my understanding. Since Ben said that, if less than a quorum of servers have accepted a transaction, we can commit or discard: would this feature cause any unexpected problem? Can you give some hints about this issue? On Fri, Jan 29, 2010 at 1:09 AM, Benjamin Reed br...@yahoo-inc.com wrote: henry is correct. just to state another way, Zab guarantees that if a quorum of servers have accepted a transaction, the transaction will commit. this means that if less than a quorum of servers have accepted a transaction, we can commit or discard. the only constraint we have in choosing is ordering. we have to decide which partially accepted transactions are going to be committed and which discarded before we propose any new messages so that ordering is preserved. ben Henry Robinson wrote: Hi - Note that a machine that has the highest received zxid will necessarily have seen the most recent transaction that was logged by a quorum of followers (the FIFO property of TCP again ensures that all previous messages will have been seen). This is the property that ZAB needs to preserve. The idea is to avoid missing a commit that went to a node that has since failed. 
I was therefore slightly imprecise in my previous mail - it's possible for only partially-proposed proposals to be committed if the leader that is elected next has seen them. Only when another proposal is committed instead must the original proposal be discarded. I highly recommend Ben Reed's and Flavio Junqueira's LADIS paper on the subject, for those with portal.acm.org access: http://portal.acm.org/citation.cfm?id=1529978 Henry On 27 January 2010 21:52, Qian Ye yeqian@gmail.com wrote: Hi Henry: According to your explanation, *ZAB makes the guarantee that a proposal which has been logged by a quorum of followers will eventually be committed* , however, the source code of Zookeeper, the FastLeaderElection.java file, shows that, in the election, the candidates only provide their zxid in the votes, the one with the max zxid would win the election. I mean, it seems that no check has been made to make sure whether the latest proposal has been logged by a quorum of servers. In this situation, the zookeeper would deliver a proposal, which is known as a failed one by the client. Imagine this scenario, a zookeeper cluster with 5 servers, Leader only receives 1 ack for proposal A, after a timeout, the client is told that the proposal failed. At this time, all servers restart due to a power failure. The server have the log of proposal A would be the leader, however, the client is told the proposal A failed. Do I misunderstand this? On Wed, Jan 27, 2010 at 10:37 AM, Henry Robinson he...@cloudera.com wrote: Qing - That part of the documentation is slightly confusing. The elected leader must have the highest zxid that has been written to disk by a quorum of followers. ZAB makes the guarantee that a proposal which has been logged by a quorum of followers will eventually be committed. Conversely, any proposals that *don't* get logged by a quorum before the leader sending them dies will not be committed. 
One of the ZAB papers covers both these situations - making sure proposals are committed or skipped at the right moments. So you get the neat property that leader election can be live in exactly the case where the ZK cluster is live. If a quorum of peers aren't available to elect the leader, the resulting cluster won't be live anyhow, so it's ok for leader election to fail. FLP impossibility isn't actually strictly relevant for ZAB, because FLP requires that message reordering is possible (see all the stuff in that paper about non-deterministically drawing messages from a potentially deliverable set). TCP FIFO channels don't reorder, so provide the extra signalling that ZAB requires. cheers, Henry 2010/1/26 Qing Yan qing...@gmail.com Hi, I have question about how zookeeper *remembers* a commit operation. According to
Re: Server exception when closing session
Hi Josh, This warning is not of any concern. Just a quick question: is there any reason for you to run the server at DEBUG level? Thanks mahadev On 1/22/10 5:19 PM, Josh Scheid jsch...@velocetechnologies.com wrote: Is it normal for client session close() to cause a server exception? Things seem to work, but the WARN is a bit disconcerting. 2010-01-22 17:15:01,573 - WARN [NIOServerCxn.Factory:2181:nioserverc...@518] - Exception causing close of session 0x126571af282114b due to java.io.IOException: Read error 2010-01-22 17:15:01,573 - DEBUG [NIOServerCxn.Factory:2181:nioserverc...@521] - IOException stack trace java.io.IOException: Read error at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:396) at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:239) 2010-01-22 17:15:01,573 - INFO [NIOServerCxn.Factory:2181:nioserverc...@857] - closing session:0x126571af282114b NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.66.16.96:2181 remote=/10.66.24.94:59591] 2010-01-22 17:15:01,573 - INFO [ProcessThread:-1:preprequestproces...@384] - Processed session termination request for id: 0x126571af282114b 2010-01-22 17:15:01,583 - DEBUG [SyncThread:0:finalrequestproces...@74] - Processing request:: sessionid:0x126571af282114b type:closeSession cxid:0x4b5a4d95 zxid:0x43f3 txntype:-11 n/a 2010-01-22 17:15:01,583 - DEBUG [SyncThread:0:finalrequestproces...@147] - sessionid:0x126571af282114b type:closeSession cxid:0x4b5a4d95 zxid:0x43f3 txntype:-11 n/a zk 3.2.2. Client is using zkpython. Nothing is otherwise abnormal. I can just connect, then close the session and this occurs. -Josh
Re: Server exception when closing session
Hi Josh, The server latency does seem huge. What OS and hardware are you running it on? What is the usage model of zookeeper? How much memory are you allocating to the server? The debug logging will exacerbate the problem. A dedicated disk means the following: Zookeeper has snapshots and transaction logs. The dataDir is the directory that stores the transaction logs. It's highly recommended that this directory be on a separate disk that isn't being used by any other process. The snapshots can sit on a disk that is being used by the OS and can be shared. Also, Pat ran some tests for server latencies at: http://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview You can take a look at that as well and see what the expected performance should be for your workload. Thanks mahadev On 1/22/10 5:40 PM, Josh Scheid jsch...@velocetechnologies.com wrote: On Fri, Jan 22, 2010 at 17:22, Mahadev Konar maha...@yahoo-inc.com wrote: This warning is not of any concern. OK. I'm used to warnings being things that must be addressed. I'll ignore this one in the future. Just a quick question, is there any reason for you to run the server at DEBUG level? We're having issues with server latency. Client default timeout of 1ms gets hit. I saw a stat output showing a 16s max latency today. Is DEBUG going to exacerbate that? Of the recommendations I've seen, the one I can't yet follow is a dedicated disk: dataDir is in the root partition of the server right now. -Josh
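The "dedicated disk" advice above maps onto two config keys: dataDir for snapshots (which may share a disk with the OS) and dataLogDir for the transaction log (ideally on its own spindle). A hedged sketch, with illustrative mount points:

```
# zoo.cfg fragment -- the paths here are hypothetical examples
dataDir=/var/zookeeper/data           # snapshots; may share a disk with the OS
dataLogDir=/zk-txnlog-disk/zookeeper  # transaction logs; dedicated disk recommended
```

When dataLogDir is not set, ZooKeeper puts the transaction logs under dataDir, which is exactly the shared-disk situation the advice warns against.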
Re: Question regarding Membership Election
Hi Vijay, Unfortunately you won't be able to keep running the observer in the other DC if the quorum in DC 1 is dead. Most of the folks we have talked to also want to avoid voting across colos. They usually run two instances of Zookeeper in the 2 DCs and copy the state of zookeeper (using a bridge) across colos to keep them in sync. Usually the data requirement across colos is very small, and they are usually able to do that by copying data across with their own bridge process. Hope that helps. Thanks mahadev On 1/14/10 12:12 PM, Vijay vijay2...@gmail.com wrote: Hi, I read about observers in the other datacenter. My question is: I don't want voting across the datacenters (so I will use observers), but at the same time when a DC goes down I don't want to lose the cluster. What's the solution for that? I have to have 3 nodes in the primary DC to tolerate 1 node failure. That's fine... but what about the other DC? How many nodes, and how will I make it work? Regards, /VJ
Re: Namespace partitioning ?
Hi Kay, the namespace partitioning in zookeeper has been on a back burner for a long time. There isn't any jira open on it. There have been some discussions on this but no real work. Flavio/Ben have had this on their minds for a while, but no real work/proposal is out yet. May I ask: is this something you are looking for in production? Thanks mahadev On 1/14/10 3:38 PM, Kay Kay kaykay.uni...@gmail.com wrote: Digging up some old tickets + search results - I am trying to understand the current state of support for namespace partitioning in zookeeper. Is it already in / are there any tickets or mailing list threads to understand the current state?
Re: Killing a zookeeper server
Hi Adam, That seems fair to file as an improvement. Running 'stat' did return the right stats, right? Saying the servers weren't able to elect a leader? mahadev On 1/13/10 11:52 AM, Adam Rosien a...@rosien.net wrote: On a related note, it was initially confusing to me that the server returned 'imok' when it wasn't part of the quorum. I realize the internal checks are probably in separate areas of the code, but if others feel similarly I could file an improvement in JIRA. .. Adam On Wed, Jan 13, 2010 at 11:19 AM, Nick Bailey ni...@mailtrust.com wrote: So the solution for us was to just nuke zookeeper and restart everywhere. We will also be upgrading soon as well. To answer your question, yes I believe all the servers were running normally except for the fact that they were experiencing high CPU usage. As we began to see some CPU alerts I started restarting some of the servers. It was then that we noticed that they were not actually running according to 'stat'. I still have the log from one server at debug level and the rest at warn level. If you would like to see any of these and analyze them, just let me know. Thanks for the help, Nick Bailey On Jan 12, 2010, at 8:20 PM, Patrick Hunt ph...@apache.org wrote: Nick Bailey wrote: In my last email I failed to include a log line that may be relevant as well 2010-01-12 18:33:10,658 [QuorumPeer:/0.0.0.0:2181] (QuorumCnxManager) DEBUG - Queue size: 0 2010-01-12 18:33:10,659 [QuorumPeer:/0.0.0.0:2181] (FastLeaderElection) INFO - Notification time out: 6400 Yes, that is significant/interesting. I believe this means that there is some problem with the election process (ie the server re-joining the ensemble). We have a backoff on these attempts, which matches your description below. We have fixed some election issues in recent versions (we introduced fault injection testing prior to the 3.2.1 release, which found a few issues with election). 
I don't have them off hand - but I've asked Flavio to comment directly (he's in a different tz). Can you provide a bit more background: prior to this issue, this particular server was running fine? You restarted it and then started seeing the issue? (rather than this being a new server, I mean). What I'm getting at is that there shouldn't/couldn't be any networking/firewall type issue going on, right? Can you provide a full/more complete log? What I'd suggest is: shut down this one server, clear the log4j log file, then restart it. Let the problem reproduce, then gzip the log4j log file and attach it to your response. Ok? Patrick We see this line occur frequently, and the timeout will gradually increase to 6. It appears that all of our servers that seem to be acting normally are experiencing the cpu issue I mentioned earlier 'https://issues.apache.org/jira/browse/ZOOKEEPER-427'. Perhaps that is causing the timeout in responding? Also to answer your other questions Patrick, we aren't storing a large amount of data really, and network latency appears fine. Thanks for the help, Nick -Original Message- From: Nick Bailey nicholas.bai...@rackspace.com Sent: Tuesday, January 12, 2010 6:03pm To: zookeeper-user@hadoop.apache.org Subject: Re: Killing a zookeeper server 12 was just to keep uniformity on our servers. Our clients are connecting from the same 12 servers. Easily modifiable, and perhaps we should look into changing that. The logs just seem to indicate that the servers that claim to have no server running are continually attempting to elect a leader. A sample is provided below. The initial exception is something we see regularly in our logs, and the debug and info lines following are simply repeated throughout the log. 
2010-01-12 17:55:02,269 [NIOServerCxn.Factory:2181] (NIOServerCnxn) WARN - Exception causing close of session 0x0 due to java.io.IOException: Read error 2010-01-12 17:55:02,269 [NIOServerCxn.Factory:2181] (NIOServerCnxn) DEBUG - IOException stack trace java.io.IOException: Read error at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:295) at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:162) 2010-01-12 17:55:02,269 [NIOServerCxn.Factory:2181] (NIOServerCnxn) INFO - closing session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/172.20.36.9:2181 remote=/172.20.36.9:50367] 2010-01-12 17:55:02,270 [NIOServerCxn.Factory:2181] (NIOServerCnxn) DEBUG - ignoring exception during input shutdown java.net.SocketException: Transport endpoint is not connected at sun.nio.ch.SocketChannelImpl.shutdown(Native Method) at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640) at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360) at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:767) at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:421)
Re: Fetching sequential children
Hi Ohad, there isn't a way to get a selected set of children from the servers. So you will have to get all of them and filter out the unwanted ones. Also, what Steve suggested in the other email might be useful for you. Thanks mahadev On 12/23/09 12:29 AM, Ohad Ben Porat o...@outbrain.com wrote: Hey, Under the main node of my application I have the following sequential children: mytest1, mytest2, mytest3, sometest1, sometest2, sometest3. Now, I want to get all children of my main node that start with mytest, something like getChildren(/main/mytest*, false). Is there a command for that? Or must I fetch all children and filter out the unwanted ones? Ohad
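Since getChildren has no server-side glob, the filtering happens on the client, as described above. A minimal sketch of that step (the getChildren call itself is omitted; the class name is hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// Client-side filtering of a getChildren() result: keep only the
// child names that start with the given prefix, e.g. "mytest".
class ChildFilter {
    static List<String> withPrefix(List<String> children, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String child : children) {
            if (child.startsWith(prefix)) {
                matches.add(child);
            }
        }
        return matches;
    }
}
```

Note that for sequential nodes the prefix test still works, since the sequence number is appended after the name the client supplied.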
Re: zkfuse
Hi Maarten, zkfuse does not have any support for acls. We haven't had much time to focus on zkfuse. Create/read/write/delete/ls are all supported. It was built mostly for infrequent updates, as more of a browsing interface on the filesystem. I don't think zkfuse is being used in production anywhere. Would you mind elaborating your use case? Thanks mahadev On 11/24/09 11:14 AM, Maarten Koopmans maar...@vrijheid.net wrote: Hi, I just started using zkfuse, and this may very well suit my needs for now. Thumbs up to the ZooKeeper team! What operations are supported (i.e. what is the best use of zkfuse)? I can see how files and dirs, their creation and listing, map quite nicely. ACLs? I have noticed two things on a fresh Ubuntu 9.10 (posting for future archive reference): - I *have* to run in debug mode (-d) - you have to add libboost or it won't compile Regards, Maarten
Bugfix release 3.2.2
Hi all, We are planning to make a bugfix release 3.2.2 which will include a critical bugfix in the c client code. The jira is ZOOKEEPER-562, http://issues.apache.org/jira/browse/ZOOKEEPER-562. If you would like some fix to be considered for this bugfix release please feel free to post on the zookeeper-dev list. Thanks Mahadev
Re: zookeeper viewer
Hi Hamoun, Can you please mention which link is broken? Are you looking for a zookeeper tree browser? Pat created a dashboard for zookeeper at github. Below is the link: http://github.com/phunt/zookeeper_dashboard Also, there is an open jira for a zookeeper browser which you can try out - http://issues.apache.org/jira/browse/ZOOKEEPER-418 Hope this helps. Thanks mahadev On 10/24/09 4:18 PM, Hamoun gh hamoun...@gmail.com wrote: I am looking for the zookeeper viewer. It seems the link is broken. Can somebody please help? Thank you, Hamoun Ghanbari
Re: Restarting a single zookeeper Server on the same port within the process
Hi Siddharth, Usually the time taken to release the port is dependent on the OS. So you can try sleeping a few more seconds to see if the port has been released or not, or just poll on the port to see if it's in use. There isn't an easier way to restart on the same port. mahadev On 10/22/09 4:52 PM, Siddharth Raghavan siddhar...@audiencescience.com wrote: Hello, I need to restart a single zookeeper server node on the same port within my unit tests. I tried stopping the server, having a delay and restarting it on the same port. But the server doesn't start up. When I re-start on a different port, it starts up correctly. Can you let me know how I can make this one work. Thank you. Regards, Siddharth
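Polling on the port, as suggested above, can be done with a plain TCP connect attempt, retrying until the connect fails (port released) within a timeout. A JDK-only sketch (the class name is hypothetical):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Poll a TCP port on localhost: inUse() returns true if something is
// accepting connections there; waitUntilFree() retries until the OS
// has released the port or the timeout elapses. Useful in tests that
// restart a ZooKeeper server on the same client port.
class PortProbe {
    static boolean inUse(int port) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress("127.0.0.1", port), 200);
            return true;
        } catch (IOException e) {
            return false;       // nothing listening (or connect refused)
        }
    }

    static boolean waitUntilFree(int port, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (!inUse(port)) {
                return true;    // port released; safe to restart here
            }
            Thread.sleep(100);
        }
        return false;
    }
}
```

The same probe, inverted, can also wait until the restarted server is accepting connections before the test proceeds.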
Re: Cluster Configuration Issues
Hi Mark, ZooKeeper does not create the myid file in the data directory. Looking at the config file, it looks like it is missing the quorum configuration for the other servers. Please take a look at http://hadoop.apache.org/zookeeper/docs/r3.2.1/zookeeperAdmin.html#sc_zkMultiServerSetup You will need to add config options for the other servers in the quorum in the config file. Thanks mahadev On 10/20/09 10:12 AM, Mark Vigeant mark.vige...@riskmetrics.com wrote: Hey- So I'm trying to run hbase on 4 nodes, and in order to do that I need to run zookeeper in replicated mode (I could have hbase run the quorum for me, but it's suggested that I don't). I have an issue though. For some reason the id I'm assigning each server in the file myid in the assigned data directory is not getting read. I feel like another id is being created and put somewhere else. Does anyone have any tips on starting a zookeeper quorum? Do I create the myid file myself or do I edit one once it is created by zookeeper? This is what my config looks like: ticktime=2000 dataDir=/home/hadoop/zookeeper clientPort=2181 initLimit=5 syncLimit=2 server.1=hadoop1:2888:3888 The name of my machine is hadoop1, with user name hadoop. In /home/hadoop/zookeeper I've created a myid file with the number 1 in it. Mark Vigeant RiskMetrics Group, Inc.
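The config quoted above lists only server.1; as the reply notes, a replicated ensemble needs a server.N entry for every member, in the same config file on every machine, plus the matching myid file. A hedged sketch of a complete three-node zoo.cfg for a setup like the one described (hostnames hadoop2/hadoop3 are illustrative; only hadoop1 appears in the original):

```
# zoo.cfg -- identical on every node in the ensemble
tickTime=2000
dataDir=/home/hadoop/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
# and /home/hadoop/zookeeper/myid contains just "1", "2", or "3"
# on the corresponding machine (created by hand, not by ZooKeeper)
```

Note also that the config key is tickTime (case matters); the lowercase ticktime in the quoted config would not be recognized.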
Re: specifying the location of zookeeper.log
Hi Leonard, You should be able to set the ZOO_LOG_DIR as an environment variable to get a different log directory. I think you are using bin/zkServer.sh to start the server? Also, please open a jira for this. It would be good to fix the documentation for this. Thanks mahadev On 10/16/09 11:04 AM, Leonard Cuff lc...@valueclick.com wrote: I've read through the admin manual at http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_logging and I don't see that there is any way to specify a location for the server's own log file. zookeeper.log appears in the bin directory, regardless of setting dataDir or dataLogDir in the configuration file. Am I overlooking something? Is there a way to have this file appear somewhere else? TIA, Leonard
Re: specifying the location of zookeeper.log
Hi Leonard, Looks like you are right. bin/zkServer.sh just logs the output to console, so you should be able to redirect to any file you want. No? Anyways this is a bug. Please open a jira for it. Thanks mahadev On 10/16/09 11:27 AM, Leonard Cuff lc...@valueclick.com wrote: I should have mentioned in my original email, but I had already tried setting ZOO_LOG_DIR as an environment variable. I am using zkServer.sh and I see where it passes ZOO_LOG_DIR as a parameter to the java invocation. java -Dzookeeper.log.dir=${ZOO_LOG_DIR} -Dzookeeper.root.logger=${ZOO_LOG4J_PROP} \ -cp $CLASSPATH $JVMFLAGS $ZOOMAIN $ZOOCFG I double checked by echo'ing the value of ZOO_LOG_DIR just before the java command. It's set correctly ... but it has no effect on the location of zookeeper.log :-( Leonard On 10/16/09 11:08 AM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Leonard, You should be able to set the ZOO_LOG_DIR as an environment variable to get a different log directory. I think you are using bin/zkServer.sh to start the server? Also, please open a jira for this. It would be good to fix the documentation for this. Thanks mahadev On 10/16/09 11:04 AM, Leonard Cuff lc...@valueclick.com wrote: I've read through the admin manual at http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_logging and I don't see that there is any way to specify a location for the server's own log file. zookeeper.log appears in the bin directory, regardless of setting dataDir or dataLogDir in the configuration file. Am I overlooking something? Is there a way to have this file appear somewhere else? TIA, Leonard
Re: specifying the location of zookeeper.log
Sorry, some misinformation on my side. You can actually change the log4j properties to get it to write to a file. Using the following in your log4j properties file: log4j.rootLogger=INFO, FILE log4j.appender.FILE=org.apache.log4j.FileAppender log4j.appender.FILE.File=${dir}/zoo.log log4j.appender.FILE.layout=org.apache.log4j.PatternLayout log4j.appender.FILE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%c...@%l] - %m%n will let you log to the output directory ${dir}. Hope that helps! mahadev On 10/16/09 11:35 AM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Leonard, Looks like you are right. bin/zkServer.sh just logs the output to console, so you should be able to redirect to any file you want. No? Anyways this is a bug. Please open a jira for it. Thanks mahadev On 10/16/09 11:27 AM, Leonard Cuff lc...@valueclick.com wrote: I should have mentioned in my original email, but I had already tried setting ZOO_LOG_DIR as an environment variable. I am using zkServer.sh and I see where it passes ZOO_LOG_DIR as a parameter to the java invocation. java -Dzookeeper.log.dir=${ZOO_LOG_DIR} -Dzookeeper.root.logger=${ZOO_LOG4J_PROP} \ -cp $CLASSPATH $JVMFLAGS $ZOOMAIN $ZOOCFG I double checked by echo'ing the value of ZOO_LOG_DIR just before the java command. It's set correctly ... but it has no effect on the location of zookeeper.log :-( Leonard On 10/16/09 11:08 AM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Leonard, You should be able to set the ZOO_LOG_DIR as an environment variable to get a different log directory. I think you are using bin/zkServer.sh to start the server? Also, please open a jira for this. It would be good to fix the documentation for this. Thanks mahadev On 10/16/09 11:04 AM, Leonard Cuff lc...@valueclick.com wrote: I've read through the admin manual at http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_logging and I don't see that there is any way to specify a location for the server's own log file. 
zookeeper.log appears in the bin directory, regardless of setting dataDir or dataLogDir in the configuration file. Am I overlooking something? Is there a way to have this file appear somewhere else? TIA, Leonard
Re: specifying the location of zookeeper.log
I just realized that as well :) Sent out an email already! mahadev On 10/16/09 11:56 AM, Leonard Cuff lc...@valueclick.com wrote: Your comment that the output goes to the console made me realize this is configurable via the log4j properties file, and that I'd configured it long ago and forgotten that I'd done so. Thanks for your attention. Leonard On 10/16/09 11:35 AM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Leonard, Looks like you are right. bin/zkServer.sh just logs the output to console, so you should be able to redirect to any file you want. No? Anyways this is a bug. Please open a jira for it. Thanks mahadev On 10/16/09 11:27 AM, Leonard Cuff lc...@valueclick.com wrote: I should have mentioned in my original email, but I had already tried setting ZOO_LOG_DIR as an environment variable. I am using zkServer.sh and I see where it passes ZOO_LOG_DIR as a parameter to the java invocation. java -Dzookeeper.log.dir=${ZOO_LOG_DIR} -Dzookeeper.root.logger=${ZOO_LOG4J_PROP} \ -cp $CLASSPATH $JVMFLAGS $ZOOMAIN $ZOOCFG I double checked by echo'ing the value of ZOO_LOG_DIR just before the java command. It's set correctly ... but it has no effect on the location of zookeeper.log :-( Leonard On 10/16/09 11:08 AM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Leonard, You should be able to set the ZOO_LOG_DIR as an environment variable to get a different log directory. I think you are using bin/zkServer.sh to start the server? Also, please open a jira for this. It would be good to fix the documentation for this. Thanks mahadev On 10/16/09 11:04 AM, Leonard Cuff lc...@valueclick.com wrote: I've read through the admin manual at http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_logging and I don't see that there is any way to specify a location for the server's own log file. zookeeper.log appears in the bin directory, regardless of setting dataDir or dataLogDir in the configuration file. Am I overlooking something? 
Is there a way to have this file appear somewhere else? TIA, Leonard
Re: Start problem of Running Replicated ZooKeeper
Hi Le, Is there some chance of these servers not being able to talk to each other? Is the zookeeper process running on debian-1? What error do you see on debian-1? The connection refused error suggests that debian-0 is not able to talk to the debian-1 machine. Thanks mahadev On 9/23/09 2:41 AM, Le Zhou lezhouy...@gmail.com wrote: Hi, I'm trying to install HBase 0.20.0 in fully distributed mode on my cluster. As HBase depends on Zookeeper, I have to know first how to make Zookeeper work. I downloaded the release 3.2.1 and installed it on each machine in my cluster. Zookeeper in standalone mode works well on each machine in my cluster. I followed the Zookeeper Getting Started Guide and got the expected output. Then I came to the Running Replicated ZooKeeper section. On each machine in my cluster (debian-0, debian-1, debian-5), I append the following lines to zoo.cfg, and create in dataDir a myid file which contains the server id (1 for debian-0, 2 for debian-1, 3 for debian-5). server.1=debian-0:2888:3888 server.2=debian-1:2888:3888 server.3=debian-5:2888:3888 Then I start the zookeeper server by running bin/zkServer.sh start, and I got the following output: cl...@debian-0:~/zookeeper$ bin/zkServer.sh start JMX enabled by default Using config: /home/cloud/zookeeper-3.2.1/bin/../conf/zoo.cfg Starting zookeeper ...
STARTED cl...@debian-0:~/zookeeper$ 2009-09-23 15:30:27,976 - INFO [main:quorumpeercon...@80] - Reading configuration from: /home/cloud/zookeeper-3.2.1/bin/../conf/zoo.cfg 2009-09-23 15:30:27,981 - INFO [main:quorumpeercon...@232] - Defaulting to majority quorums 2009-09-23 15:30:28,009 - INFO [main:quorumpeerm...@118] - Starting quorum peer 2009-09-23 15:30:28,034 - INFO [Thread-1:quorumcnxmanager$liste...@409] - My election bind port: 3888 2009-09-23 15:30:28,045 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@487] - LOOKING 2009-09-23 15:30:28,070 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@579] - New election: -1 2009-09-23 15:30:28,075 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@618] - Notification: 1, -1, 1, 1, LOOKING, LOOKING, 1 2009-09-23 15:30:28,075 - WARN [WorkerSender Thread:quorumcnxmana...@336] - Cannot open channel to 2 at election address debian-1/172.20.53.86:3888 java.net.ConnectException: Connection refused at sun.nio.ch.Net.connect(Native Method) at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507) at java.nio.channels.SocketChannel.open(SocketChannel.java:146) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:323) at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:302) at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:323) at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:296) at java.lang.Thread.run(Thread.java:619) 2009-09-23 15:30:28,085 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@642] - Adding vote 2009-09-23 15:30:28,099 - WARN [WorkerSender Thread:quorumcnxmana...@336] - Cannot open channel to 3 at election address debian-5/172.20.14.194:3888 java.net.ConnectException: Connection refused at sun.nio.ch.Net.connect(Native Method) at
sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507) at java.nio.channels.SocketChannel.open(SocketChannel.java:146) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:323) at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:302) at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:323) at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:296) at java.lang.Thread.run(Thread.java:619) 2009-09-23 15:30:28,288 - WARN [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorumcnxmana...@336] - Cannot open channel to 2 at election address debian-1/172.20.53.86:3888 java.net.ConnectException: Connection refused at sun.nio.ch.Net.connect(Native Method) at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507) at java.nio.channels.SocketChannel.open(SocketChannel.java:146) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:323) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:356) at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:603) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:488) The terminal keeps outputting the WARN messages until I stop the zookeeper server. I googled zookeeper cannot open channel to at address and searched the mailing list archives, but found nothing helpful. I need your help, thanks and best regards!
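For reference, the repeated "Cannot open channel" WARNs above are expected on the first server started: election cannot succeed until a quorum (2 of the 3 peers) is up and the 2888/3888 ports are reachable between the machines. A hedged sketch of the full replicated zoo.cfg being assembled in this thread (tickTime and the limits are illustrative values, not recommendations; dataDir is an assumption):

```properties
# zoo.cfg sketch for the three-server ensemble discussed above.
# Each server also needs a myid file in dataDir containing just its id.
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/cloud/zookeeper-data
clientPort=2181
server.1=debian-0:2888:3888
server.2=debian-1:2888:3888
server.3=debian-5:2888:3888
```

If the WARNs persist after all three servers are started, check that nothing (a firewall, or hostnames resolving to the wrong interface) is blocking ports 2888 and 3888 between the machines.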
Re: ACL question w/ Zookeeper 3.1.1
Hi Todd, From what I understand, you are saying that CREATOR_ALL_ACL does not work with auth? I tried the following with CREATOR_ALL_ACL and it seemed to work for me...

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.ACL;
import org.apache.zookeeper.ZooDefs.Ids;
import java.util.ArrayList;
import java.util.List;

public class TestACl implements Watcher {
    public static void main(String[] argv) throws Exception {
        List<ACL> acls = new ArrayList<ACL>(1);
        String authentication_type = "digest";
        String authentication = "mahadev:some";
        for (ACL ids_acl : Ids.CREATOR_ALL_ACL) {
            acls.add(ids_acl);
        }
        TestACl tacl = new TestACl();
        ZooKeeper zoo = new ZooKeeper("localhost:2181", 3000, tacl);
        zoo.addAuthInfo(authentication_type, authentication.getBytes());
        zoo.create("/some", new byte[0], acls, CreateMode.PERSISTENT);
        zoo.setData("/some", new byte[0], -1);
    }

    @Override
    public void process(WatchedEvent event) {
    }
}

And it worked on my set of zookeeper servers. Then I tried, without auth, getData("/some"), which correctly gave me the error: Exception in thread main org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /some at org.apache.zookeeper.KeeperException.create(KeeperException.java:104) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:892) at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921) at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:692) at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:579) at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:351) at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:309) at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:268) Is this what you are trying to do?
Thanks mahadev On 9/17/09 5:05 PM, Todd Greenwood to...@audiencescience.com wrote: I'm attempting to secure a zookeeper installation using zookeeper ACLs. However, I'm finding that while Ids.OPEN_ACL_UNSAFE works great, my attempts at using Ids.CREATOR_ALL_ACL are failing. Here's a code snippet:

public class ZooWrapper {
    /* 1. Here I'm setting up my authentication. I've got an ACL list, and
       my authentication strings. */
    private final List<ACL> acl = new ArrayList<ACL>(1);
    private static final String authentication_type = "digest";
    private static final String authentication = "audiencescience:gravy";

    public ZooWrapper(final String connection_string, final String path,
                      final int connectiontimeout) throws ZooWrapperException {
        ...
        /* 2. Here I'm adding the acls */
        // This works (creates nodes, sets data on nodes)
        for (ACL ids_acl : Ids.OPEN_ACL_UNSAFE) {
            acl.add(ids_acl);
        }
        /* NOTE: This does not work (nodes are not created, cannot set data
           on nodes b/c nodes do not exist) */
        // for (ACL ids_acl : Ids.CREATOR_ALL_ACL) {
        //     acl.add(ids_acl);
        // }
        /* 3. Finally, I create a new zookeeper instance and add my
           authorization info to it. */
        zoo = new ZooKeeper(connection_string, connectiontimeout, this);
        zoo.addAuthInfo(authentication_type, authentication.getBytes());
        /* 4. Later, I try to write some data into zookeeper by first
           creating the node, and then calling setData... */
        zoo.create(path, new byte[0], acl, CreateMode.PERSISTENT);
        zoo.setData(path, bytes, -1);

As I mentioned above, when I add Ids.OPEN_ACL_UNSAFE to acl, then both the create and setData succeed. However, when I use Ids.CREATOR_ALL_ACL, then the nodes are not created. Am I missing something obvious w/ respect to configuring ACLs?
I've used the following references: http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperProgrammers.html http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-commits/200807.mbox/%3c20080731201025.c62092388...@eris.apache.org%3e http://books.google.com/books?id=bKPEwR-Pt6ECpg=PT404lpg=PT404dq=zookeeper+ACL+digest+%22new+Id%22source=blots=kObz0y8eFksig=VFCAsNW0mBJyZswoweJDI31iNlohl=enei=Z82ySojRFsqRlAeqxsyIDwsa=Xoi=book_resultct=resultresnum=6#v=onepageq=zookeeper%20ACL%20digest%20%22new%20Id%22f=false -Todd
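One thing worth knowing when debugging digest ACLs like the ones in this thread: the id that ZooKeeper stores for the "digest" scheme is derived from the user:password string as user, a colon, and the base64 of the SHA-1 of the whole string. A minimal sketch of that derivation (mirroring, to the best of my understanding, what DigestAuthenticationProvider.generateDigest does; the class name here is my own, and java.util.Base64 is a modern convenience not available in the 2009-era JDK the thread used):

```java
import java.security.MessageDigest;
import java.util.Base64;

public class DigestAclId {
    // Sketch of ZooKeeper's digest-scheme id derivation, as I understand it:
    // the stored ACL id for scheme "digest" is
    //   user + ":" + base64(SHA-1("user:password"))
    static String generateDigest(String idPassword) throws Exception {
        String user = idPassword.split(":", 2)[0];
        byte[] sha = MessageDigest.getInstance("SHA-1")
                .digest(idPassword.getBytes("UTF-8"));
        return user + ":" + Base64.getEncoder().encodeToString(sha);
    }

    public static void main(String[] args) throws Exception {
        // Same user:password as in Mahadev's example above.
        System.out.println(generateDigest("mahadev:some"));
    }
}
```

This is handy when inspecting a znode's ACLs with getACL: the id you see there is the digested form, not the plaintext password.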
Re: Infinite ping after calling setData()
Hi Rob, you might want to take a look at our test cases in src/java/test/, specifically QuorumTest, wherein we start and stop a quorum of servers in a single junit test. Thanks mahadev On 9/15/09 10:15 AM, Rob Baccus r...@audiencescience.com wrote: These are not the complete logs because they are too long to add to an email at this time. Unfortunately I am having completely different issues now with the servers not shutting down. When I get past that, and if I run into this issue again, I will give more details. Thanks. -Original Message- From: Mahadev Konar [mailto:maha...@yahoo-inc.com] Sent: Monday, September 14, 2009 5:37 PM To: zookeeper-user@hadoop.apache.org Subject: Re: Infinite ping after calling setData() Hi Rob, Can you be a little more clear about what you are seeing? After reading through the email, it looks like the session is getting expired for some reason (is that what you are debugging?) Also, I see a close session for 0x30002 and no createSession for that in your logs. Are you sure these are the full logs? Thanks mahadev On 9/14/09 5:03 PM, Rob Baccus r...@audiencescience.com wrote: I am trying to automate creating an ensemble, then add data to it, and then pull it out. I am finding that after I call setData() there is an infinite loop of pings. This was working, then just stopped when I changed some code around to use JUnit 4.1 instead of 3.8.1, which I would expect has nothing to do with this issue. I saw the issue relating to session expiration due to system resources, but I don't believe this is the issue since this was working fine. My configuration: Linux VM with 1.5GB RAM Running in Eclipse 3.3 Configured 4 zookeeper servers, each running with different client and leader/leader election ports and different local transaction log locations. All 4 servers come up without issues and a leader is elected. Below is the stack trace that I am seeing with log4j DEBUG turned on for org.apache.zookeeper after the setData() call is made.
DEBUG [main] (com.audiencescience.util.zookeeper.qa.failover.FailOverTest.testSimpleZKClientFailover:124) - ** Before Set Data ** INFO [main-SendThread] (org.apache.zookeeper.ClientCnxn$SendThread.primeConnection:716) - Priming connection to java.nio.channels.SocketChannel[connected local=/172.17.1.133:40933 remote=robb02linux.corp.digimine.com/172.17.1.133:2181] INFO [main-SendThread] (org.apache.zookeeper.ClientCnxn$SendThread.run:868) - Server connection successful INFO [NIOServerCxn.Factory:2181] (org.apache.zookeeper.server.NIOServerCnxn.readConnectRequest:503) - Connected to /172.17.1.133:40933 lastZxid 0 INFO [NIOServerCxn.Factory:2181] (org.apache.zookeeper.server.NIOServerCnxn.readConnectRequest:534) - Creating new session 0x123bafb088a DEBUG [FollowerRequestProcessor:1] (org.apache.zookeeper.server.quorum.CommitProcessor.processRequest:168) - Processing request:: sessionid:0x123bafb088a type:createSession cxid:0x0 zxid:0xfffe txntype:unknown n/a DEBUG [ProcessThread:-1] (org.apache.zookeeper.server.quorum.CommitProcessor.processRequest:168) - Processing request:: sessionid:0x123bafb088a type:createSession cxid:0x0 zxid:0x30001 txntype:-10 n/a DEBUG [ProcessThread:-1] (org.apache.zookeeper.server.quorum.Leader.propose:560) - Proposing:: sessionid:0x123bafb088a type:createSession cxid:0x0 zxid:0x30001 txntype:-10 n/a WARN [QuorumPeer:/0:0:0:0:0:0:0:0:2183] (org.apache.zookeeper.server.quorum.Follower.followLeader:242) - Got zxid 0x30001 expected 0x1 WARN [QuorumPeer:/0:0:0:0:0:0:0:0:2182] (org.apache.zookeeper.server.quorum.Follower.followLeader:242) - Got zxid 0x30001 expected 0x1 WARN [QuorumPeer:/0:0:0:0:0:0:0:0:2181] (org.apache.zookeeper.server.quorum.Follower.followLeader:242) - Got zxid 0x30001 expected 0x1 INFO [SessionTracker] (org.apache.zookeeper.server.SessionTrackerImpl.run:132) - Expiring session 0x123baf02f28 INFO [SessionTracker] (org.apache.zookeeper.server.ZooKeeperServer.expire:317) - Expiring session 0x123baf02f28 INFO
[ProcessThread:-1] (org.apache.zookeeper.server.PrepRequestProcessor.pRequest:360) - Processed session termination request for id: 0x123baf02f28 DEBUG [ProcessThread:-1] (org.apache.zookeeper.server.quorum.CommitProcessor.processRequest:168) - Processing request:: sessionid:0x123baf02f28 type:closeSession cxid:0x0 zxid:0x30002 txntype:-11 n/a DEBUG [ProcessThread:-1] (org.apache.zookeeper.server.quorum.Leader.propose:560) - Proposing:: sessionid:0x123baf02f28 type:closeSession cxid:0x0 zxid:0x30002 txntype:-11 n/a DEBUG [FollowerHandler-/172.17.1.133:40537] (org.apache.zookeeper.server.quorum.Leader.processAck:382) - Ack zxid: 0x30001 DEBUG [FollowerHandler-/172.17.1.133:40537
Re: zookeeper on ec2
Hi Satish, ConnectionLoss is a little trickier than just retrying blindly. Please read the following sections on this - http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling And the programmers guide: http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperProgrammers.html to learn more about how to handle CONNECTIONLOSS. The idea is that blindly retrying would create problems with CONNECTIONLOSS, since a CONNECTIONLOSS does NOT necessarily mean that the zookeeper operation you were executing failed to execute. It might be possible that the operation actually went through on the servers. Since this has been a constant source of confusion for everyone who starts using zookeeper, we are working on a fix (ZOOKEEPER-22) which will take care of this problem, and programmers would not have to worry about CONNECTIONLOSS handling. Thanks mahadev On 9/1/09 4:13 PM, Satish Bhatti cthd2...@gmail.com wrote: I have recently started running on EC2 and am seeing quite a few ConnectionLoss exceptions. Should I just catch these and retry? Since I assume that eventually, if the shit truly hits the fan, I will get a SessionExpired? Satish On Mon, Jul 6, 2009 at 11:35 AM, Ted Dunning ted.dunn...@gmail.com wrote: We have used EC2 quite a bit for ZK. The basic lessons that I have learned include: a) EC2's biggest advantage after scaling and elasticity was conformity of configuration. Since you are bringing machines up and down all the time, they begin to act more like programs and you wind up with boot scripts that give you a very predictable environment. Nice. b) EC2 interconnect has a lot more going on than in a dedicated VLAN. That can make the ZK servers appear a bit less connected. You have to plan for ConnectionLoss events. c) for highest reliability, I switched to large instances. On reflection, I think that was helpful, but less important than I thought at the time. d) increasing and decreasing cluster size is nearly painless and is easily scriptable.
To decrease, do a rolling update on the survivors to update their configuration. Then take down the instance you want to lose. To increase, do a rolling update starting with the new instances to update the configuration to include all of the machines. The rolling update should bounce each ZK with several seconds between each bounce. Rescaling the cluster takes less than a minute, which makes it comparable to EC2 instance boot time (about 30 seconds for the Alestic ubuntu instance that we used, plus about 20 seconds for additional configuration). On Mon, Jul 6, 2009 at 4:45 AM, David Graf david.g...@28msec.com wrote: Hello I want to set up a zookeeper ensemble on amazon's ec2 service. In my system, zookeeper is used to run a locking service and to generate unique id's. Currently, for testing purposes, I am only running one instance. Now, I need to set up an ensemble to protect my system against crashes. The ec2 service has some differences from a normal server farm. E.g. the data saved on the file system of an ec2 instance is lost if the instance crashes. In the documentation of zookeeper, I have read that zookeeper saves snapshots of the in-memory data in the file system. Is that needed for recovery? Logically, it would be much easier for me if this is not the case. Additionally, ec2 brings the advantage that servers can be switched on and off dynamically depending on the load, traffic, etc. Can this advantage be utilized for a zookeeper ensemble? Is it possible to add a zookeeper server dynamically to an ensemble? E.g. depending on the in-memory load? David
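When ConnectionLoss shows up on shared infrastructure like EC2, one knob worth knowing is the tick/session timing in zoo.cfg. A hedged sketch (the values are illustrative, not recommendations; tune against your own observed network jitter):

```properties
# A larger tickTime tolerates more network jitter, at the cost of slower
# failure detection. initLimit/syncLimit are expressed in ticks.
tickTime=3000
initLimit=10
syncLimit=5
# Clients can also request a longer session timeout when constructing the
# ZooKeeper handle; the server clamps it to the range
# [2 * tickTime, 20 * tickTime] by default.
```

A longer session timeout makes SessionExpired less likely during transient EC2 network hiccups, but it also delays ephemeral-node cleanup (e.g. lock release) when a client really does die.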
Re: Runtime Interrogation of the Ensemble
Hi Todd, You can use JMX to find such information. You can also just do this:

echo stat | nc localhost <clientport>

to get status from the zookeeper servers. This is all documented in the forrest docs at http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperAdmin.html Hope this helps.

Ensemble : object representing a zookeeper ensemble
Long Ensemble.getLastTxid()
ZKServer Ensemble.getCurrentLeader()
ZKServer[] Ensemble.getPastLeaders()
ZKServer[] Ensemble.getConnectedServers()
ZKServer[] Ensemble.getDisConnectedServers()
Boolean doWeHaveAQuorum()

ZKServer : object representing a zookeeper server
Client[] ZKServer.getClients()
Client[] ZKServer.getDisconnectedClients()
Boolean isLeader()
Boolean isAbleToBeLeader()
int get|set VotingWeight()
Group get|set GroupMembership()

Most of this is available via the stat/jmx. It would be cool if...
1. The ensemble could trigger nagios type alerts if it no longer had a quorum
2. The ensemble could dynamically bring up new zk servers in the event that enough servers have died to prevent a quorum
3. The ensemble/server api could allow analysis of problem servers (servers that keep dropping connections and/or clients)

Nice idea... But we do not have anything like this as of now. Thanks mahadev
Re: question about watcher
Hi Qian, There isn't any such API. We have been thinking about adding an API for cancelling a client's watches. We have also been thinking about adding a proc filesystem wherein a client will have a list of all the watches. This data could be used to know which clients are watching which znode, but this has always been in the future discussions for us. We DO NOT have anything planned in the near future for this. Thanks mahadev On 8/5/09 6:57 PM, Qian Ye yeqian@gmail.com wrote: Hi all: Is there a client API for querying the watchers' owner for a specific znode? In some situations, we want to find out who set watchers on the znode. thx
Re: c client error message with chroot
This looks like a bug. Does this happen without doing any reads/writes using the zookeeper handle? Please do open a jira for this. Thanks mahadev On 8/2/09 10:53 PM, Michi Mutsuzaki mi...@cs.stanford.edu wrote: Hello, I'm doing something like this (using zookeeper-3.2.0): zhandle_t* zh = zookeeper_init(localhost:2818/servers, watcher, 1000, 0, 0, 0); and getting this error: 2009-08-03 05:48:30,693:3380(0x40a04950):zoo_i...@check_events@1439: initiated connection to server [127.0.0.1:2181] 2009-08-03 05:48:30,705:3380(0x40a04950):zoo_i...@check_events@1484: connected to server [127.0.0.1:2181] with session id=122ddb9be64016d 2009-08-03 05:48:30,705:3380(0x40c05950):zoo_er...@sub_string@730: server path does not include chroot path /servers The error log doesn't appear if I use localhost:2818 without chroot. Is this actually an error? Thanks! --Michi
Re: bad svn url : test-patch
Hi Todd, Yes, this happens with the branch 3.2. The test-patch link is broken because of the hadoop split. This file is used for the hudson test environment. It isn't used anywhere else, so the svn co should otherwise be fine. We should fix it anyway. Thanks mahadev On 7/30/09 2:57 PM, Todd Greenwood to...@audiencescience.com wrote: FYI - looks like there is a bad url in svn... $ svn co http://svn.apache.org/repos/asf/hadoop/zookeeper/branches/branch-3.2 branch-3.2 ... A branch-3.2/build.xml Fetching external item into 'branch-3.2/src/java/test/bin' svn: URL 'http://svn.apache.org/repos/asf/hadoop/common/nightly/test-patch' doesn't exist This does not repro w/ 3.1: $ svn co http://svn.apache.org/repos/asf/hadoop/zookeeper/branches/branch-3.1 branch-3.1 -Todd
Bug in 3.2 release.
Hi folks, We just discovered a bug in 3.2 release http://issues.apache.org/jira/browse/ZOOKEEPER-484. This bug will affect your clients whenever they switch zookeeper servers - from a zookeeper server that is a follower to a server that is leader. We should have a fix out by next week in 3.2.1 and trunk. 3.2.1 should be out in the next 2-3 weeks. If you are already using 3.2.0 in production I would suggest switching it back to 3.1.1 (though there is a workaround mentioned in the jira http://issues.apache.org/jira/browse/ZOOKEEPER-484 but I would advise against it). The 3.2.0 clients are compatible with 3.1.1 servers. Thanks mahadev -- End of Forwarded Message
Re: Leader Elections
Both of the options that Scott mentioned are quite interesting. Quite a few of our users are interested in these two features. I think for 2, we should be able to use observers with a subscription to the master cluster with interest in a special subtree. That avoids too much cross talk. Henry/Flavio, do you guys want to keep this in mind for Observers (not implement it in the jira, but generalize it in a way that partial subscription can be done later)? http://issues.apache.org/jira/browse/ZOOKEEPER-368 Wherein an observer can just register interest in a subtree and the master cluster can avoid sending updates for parts of the zookeeper tree not in the subtree. This would be very helpful in a WAN setting wherein only a small amount of data needs to be up to date across different data centers. thanks mahadev On 7/20/09 11:50 AM, Todd Greenwood to...@audiencescience.com wrote: Flavio, Ted, Henry, Scott, this would work perfectly well for my use case provided: SINGLE ENSEMBLE: GROUP A : ZK Servers w/ read/write AND Leader Elections GROUP B : ZK Servers w/ read/write W/O Leader Elections So, we can craft this via Observers and Hierarchical Quorum groups? Great. Problem solved. When will this be production ready? :o) Scott brought up a multi-feature that is very interesting for me. Namely: 1. Offline ZK servers that sync/merge on reconnect The offline servers idea seems conceptually simple, it's kind of like a messaging system. However, the merge and resolve step when two servers reconnect might be challenging. Cool idea though. 2. Partial memory graph subscriptions The second idea is partial memory graph subscriptions. This would enable virtual ensembles to interact on the same physical ensemble. For my use case, this would prevent unnecessary cross talk between nodes on a WAN, allowing me to define the subsets of the memory graph that need to be replicated, and to whom. This would be a huge scalability win for WAN use cases.
-Todd -Original Message- From: Scott Carey [mailto:sc...@richrelevance.com] Sent: Monday, July 20, 2009 11:00 AM To: zookeeper-user@hadoop.apache.org Subject: Re: Leader Elections Observers would be awesome, especially with a couple enhancements / extensions: An option for the observers to enter a special state if the WAN link goes down to the master cluster. A read-only option would be great. However, allowing certain types of writes to continue on a limited basis would be highly valuable as well. An observer could own a special node and its subnodes. Only these subnodes would be writable by the observer when there was a session break to the master cluster, and the master cluster would take all the changes when the link is reestablished. Essentially, it is a portion of the hierarchy that is writable only by a specific observer, and read-only for others. The purpose of this would be for when the WAN link goes down to the master ZKs for certain types of use cases - status updates or other changes local to the observer that are strictly read-only outside the Observer's 'realm'. On 7/19/09 12:16 PM, Henry Robinson he...@cloudera.com wrote: You can. See ZOOKEEPER-368 - at first glance it sounds like observers will be a good fit for your requirements. Do bear in mind that the patch on the jira is only for discussion purposes; I would not consider it currently fit for production use. I hope to put up a much better patch this week. Henry On Sat, Jul 18, 2009 at 7:38 PM, Ted Dunning ted.dunn...@gmail.com wrote: Can you submit updates via an observer? On Sat, Jul 18, 2009 at 6:38 AM, Flavio Junqueira f...@yahoo-inc.com wrote: 2- Observers: you could have one computing center containing an ensemble and observers around the edge just learning committed values. -- Ted Dunning, CTO DeepDyve
Re: Queue code
Also, are there any performance numbers for zookeeper-based queues? How does it compare with JMS? thanks Kishore G Hi Kishore, We do not have any performance numbers for queues on zookeeper. I think you can get a rough idea of those numbers from your usage of zookeeper (number of reads/writes per second) and the zookeeper performance numbers at http://hadoop.apache.org/zookeeper/docs/r3.2.0/zookeeperOver.html Hope that helps. Thanks mahadev
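For context on what a ZooKeeper-based queue does per operation: the usual recipe creates each queue item as a PERSISTENT_SEQUENTIAL child znode, so the server appends a monotonically increasing 10-digit counter to the name, and consumers list the children and process them in sequence order. A minimal sketch of just that client-side ordering step (the "qn-" prefix is an assumption; no live server is involved here):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class QueueOrder {
    // Children of a queue znode created with CreateMode.PERSISTENT_SEQUENTIAL
    // look like "qn-0000000007". Sorting by the numeric suffix (rather than
    // plain string order) yields the dequeue order; parsing the suffix as a
    // long also stays correct if the counter ever outgrows 10 digits.
    static List<String> inQueueOrder(List<String> children) {
        List<String> sorted = new ArrayList<String>(children);
        sorted.sort(Comparator.comparingLong(
                name -> Long.parseLong(name.substring(name.lastIndexOf('-') + 1))));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> kids = List.of("qn-0000000010", "qn-0000000002", "qn-0000000007");
        System.out.println(inQueueOrder(kids));
    }
}
```

Each enqueue is one write and each dequeue is a getChildren plus a delete, which is why the generic read/write throughput numbers Mahadev points to give a reasonable first estimate of queue throughput.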
Re: Instantiating HashSet for DataNode?
Hi Erik, I am not sure if that would be a considerable optimization, but even if you wanted to do it, it would be much more than just adding a check in the constructor (the serialization/deserialization would need to have specialized code). Right now all the datanodes are treated equally for ser/deser and other purposes. mahadev On 7/14/09 1:42 PM, Erik Holstad erikhols...@gmail.com wrote: I'm not sure if I've misread the code for the DataNode, but to me it looks like every node gets a set of children even though it might be an ephemeral node which cannot have children, so we are wasting 240 B for every one of those. Not sure if it makes a big difference, but just thinking that since everything sits in memory and there is no reason to instantiate it, maybe it would be possible just to add a check in the constructor? Regards Erik
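To make the suggestion concrete, here is a hypothetical sketch of the lazy-allocation idea Erik proposes, not the actual DataNode code: defer creating the children set until a child is actually added, so znodes that never get children (e.g. ephemerals) pay nothing for an empty HashSet.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class LazyNode {
    // Hypothetical stand-in for DataNode's children field: left null until
    // the first child is added, instead of eagerly allocating a HashSet in
    // the constructor for every node.
    private Set<String> children; // null == no children yet

    public void addChild(String name) {
        if (children == null) {
            children = new HashSet<String>();
        }
        children.add(name);
    }

    public Set<String> getChildren() {
        // Readers never see null; they get an immutable empty set instead.
        return children == null ? Collections.<String>emptySet() : children;
    }
}
```

As Mahadev notes, the real change would be more invasive than this, since the serialization/deserialization paths would also need to handle the null case.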
Re: Help to compile Zookeeper C API on a old system
Hi Qian, I am not sure if it will work. You should be able to back-port it in such a way that it works with gcc 3.*/4.*, but again, I have never tried it. mahadev On 7/6/09 6:35 PM, Qian Ye yeqian@gmail.com wrote: Thanks Mahadev, I followed the installation instructions in the README: autoreconf -i -f ./configure --prefix=$dir make make install Up to ./configure --prefix=$dir there were no errors; however, errors came when I ran make. My plan is to change the compiler from gcc to g++, and solve the compile errors one by one. Will that plan work? Thanks~ On Tue, Jul 7, 2009 at 2:22 AM, Mahadev Konar maha...@yahoo-inc.com wrote: Hi Qian, What issues do you face? I have never tried compiling with the configuration below, but I could give it a try in my free time to see if I can get it to compile. mahadev On 7/6/09 7:37 AM, Qian Ye yeqian@gmail.com wrote: Hi all: I'm writing to ask you to do me a favor. It's urgent. For some unchangeable reason, I have to compile libzookeeper_st.a, libzookeeper_mt.a on an old system: gcc 2.96 autoconf 2.13 automake 1.4-p5 libtool 1.4.2 I cannot compile the target libs in the usual way, and this task drives me crazy :-( could anyone help me out? Thanks a lot~
Re: General Question about Zookeeper
Hi Harold, As Henry mentioned, what ACLs give you is prevention of access to znodes through ZooKeeper itself. If someone has access to zookeeper's data stored on zookeeper's server machines, they should be able to reconstruct the data and read it (using zookeeper deserialization code). I am not sure what kind of security model you are interested in, but for ZooKeeper we expect the server-side data stored on local disks to be inaccessible to normal users and only accessible to admins. Hope this helps. Thanks mahadev On 6/25/09 11:01 AM, Henry Robinson he...@cloudera.com wrote: Hi Harold, Each ZooKeeper server stores updates to znodes in logfiles, and periodic snapshots of the state of the datatree in snapshot files. A user who has the same permissions as the server will be able to read these files, and can therefore recover the state of the datatree without the ZK server intervening. ACLs are applied only by the server; there is no filesystem-level representation of them. Henry On Thu, Jun 25, 2009 at 6:48 PM, Harold Lim rold...@yahoo.com wrote: Hi All, How does zookeeper store data/files? From reading the doc, the clients can put ACLs on files/znodes to limit read/write/create by other clients. However, I was wondering how these znodes are stored on the Zookeeper servers. I am interested in a security aspect of zookeeper, where the clients and the servers don't necessarily belong to the same group. If a client creates a znode in zookeeper, can the person who owns the zookeeper server simply look at its filesystem and read the data (out-of-band, not using a client, simply by browsing the file system of the machine hosting the zookeeper server)? Thanks, Harold
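Since ACLs do not protect the on-disk snapshots and transaction logs, the practical follow-up to this thread is to lock the data directory down at the OS level so only the account running the server can read it. A hedged sketch using the JDK's POSIX permission API (the path is an assumption; point it at your real dataDir, and note this applies only on POSIX filesystems):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermissions;

public class LockDownDataDir {
    public static void main(String[] args) throws Exception {
        // Hypothetical dataDir location for illustration only.
        Path dataDir = Paths.get(System.getProperty("java.io.tmpdir"), "zookeeper-data");
        Files.createDirectories(dataDir);
        // rwx for the owning (zookeeper admin) account only; group and
        // world get nothing, so snapshots/logs cannot be read out-of-band.
        Files.setPosixFilePermissions(dataDir, PosixFilePermissions.fromString("rwx------"));
        System.out.println(PosixFilePermissions.toString(Files.getPosixFilePermissions(dataDir)));
    }
}
```

The same effect is achieved with chmod 700 on the dataDir (and dataLogDir) in whatever provisioning scripts create them.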