Re: Too many connections

2010-10-06 Thread Patrick Hunt
On Tue, Oct 5, 2010 at 10:23 AM, Avinash Lakshman 
avinash.laksh...@gmail.com wrote:

 So shouldn't all servers in another DC just have one session? So even if I
 have 50 observers in another DC that should be 50 sessions established
 since
 the IP doesn't change correct? Am I missing something? In some ZK clients I
 see the following exception even though they are in the same DC.


This really depends on how you implemented your client. Each time you create
a ZooKeeper object a new session is established. If you have 50 clients each
creating a ZooKeeper object then you have 50 sessions.

Patrick


Re: znode inconsistencies across ZooKeeper servers

2010-10-06 Thread Patrick Hunt
Vishal the attachment seems to be getting removed by the list daemon (I
don't have it), can you create a JIRA and attach? Also this is a good
question for the ppl on zookeeper-user. (ccing)

You are aware that ephemeral znodes are tied to the session? And that
sessions only expire after the session timeout period? At which time any
znodes created during that session are then deleted. The fact that you are
killing your client process leads me to believe that you are not closing
the session cleanly (meaning that it will eventually expire after the
session timeout period), in which case the ephemeral znodes _should_
reappear when A is restarted and successfully rejoins the cluster. (at least
until the session timeout is exceeded)

Patrick

On Tue, Oct 5, 2010 at 11:04 AM, Vishal K vishalm...@gmail.com wrote:

 Hi,

 I have a 3 node ZK cluster (A, B, C). On one of the the nodes (node A), I
 have a ZK client running that connects to the local server and creates an
 ephemeral znode to indicate clients on other nodes that it is online.

 I have test script that reboots the zookeeper server as well as client on
 A. The test does a getstat on the ephemeral znode created by the client on
 A. I am seeing that the view of znodes on A is different from the other 2
 nodes. I can tell this from the session ID that the client gets after
 reconnecting to the local ZK server.

 So the test is simple:
 - kill zookeeper server and client process
 - wait for a few seconds
 - do zkCli.sh stat ...  test.out

 What I am seeing is that the ephemeral znode with old zxid, time, and
 session ID is reappearing on node A. I have attached the output of 3
 consecutive getstat requests of the test (see client_getstat.out). Notice
 that the third output is the same as the first one. That is, the old
 ephemeral znode reappeared at A. However, both B and C are showing the
 latest znode with correct time, zxid and session ID (output not attached).

 After this point, all following getstat requests on A are showing the old
 znode. Whereas, B and C show the correct znode every time the client on A
 comes online. This is something very perplexing. Earlier I thought this was
 a bug in my client implementation. But the test shows that the ZK server on
 A after reboot is out of sync with rest of the servers.

 The stat command to each server shows that the servers are in sync as far
 as zxid's are concerned (see stat.out). So there is something wrong with A's
 local database that is causing this problem.

 Has anyone seen this before? I will be doing more debugging in the next few
 days. Comments/suggestions for further debugging are welcomed.

 -Vishal





Re: znode inconsistencies across ZooKeeper servers

2010-10-06 Thread Vishal K
Hi Patrick,

You are correct, the test restarts both ZooKeeper server and the client. The
client opens a new connection after restarting. So we would expect that the
ephmeral znode (/foo) to expire after the session timeout. However, the
client with the new session creates the ephemeral znode (/foo) again after
it reboots (it sets a watch for /foo and recreates /foo if it is deleted or
doesn't exist). The client is not reusing the session ID. What I expect to
see is that the older /foo should expire after which a new /foo should get
created. Is my expectation correct?

What confuses me is the following output of 3 successive getstat /foo
requests on A (the zxid, time and owner fields).  Notice that the older
znode reappeared.
At the same time when I do getstat at B and C, I see the newer /foo.

log4j:WARN No appenders could be found for logger
(org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
cZxid = 0x105ef
ctime = Tue Oct 05 15:00:50 UTC 2010
mZxid = 0x105ef
mtime = Tue Oct 05 15:00:50 UTC 2010
pZxid = 0x105ef
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x2b7ce57ce4
dataLength = 54
numChildren = 0

log4j:WARN No appenders could be found for logger
(org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
cZxid = 0x10607
ctime = Tue Oct 05 15:01:07 UTC 2010
mZxid = 0x10607
mtime = Tue Oct 05 15:01:07 UTC 2010
pZxid = 0x10607
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x2b7ce5bda4
dataLength = 54
numChildren = 0

log4j:WARN No appenders could be found for logger
(org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
cZxid = 0x105ef
ctime = Tue Oct 05 15:00:50 UTC 2010
mZxid = 0x105ef
mtime = Tue Oct 05 15:00:50 UTC 2010
pZxid = 0x105ef
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x2b7ce57ce4
dataLength = 54
numChildren = 0

Thanks for your help.

-Vishal

On Wed, Oct 6, 2010 at 4:45 PM, Patrick Hunt ph...@apache.org wrote:

 Vishal the attachment seems to be getting removed by the list daemon (I
 don't have it), can you create a JIRA and attach? Also this is a good
 question for the ppl on zookeeper-user. (ccing)

 You are aware that ephemeral znodes are tied to the session? And that
 sessions only expire after the session timeout period? At which time any
 znodes created during that session are then deleted. The fact that you are
 killing your client process leads me to believe that you are not closing
 the session cleanly (meaning that it will eventually expire after the
 session timeout period), in which case the ephemeral znodes _should_
 reappear when A is restarted and successfully rejoins the cluster. (at
 least
 until the session timeout is exceeded)

 Patrick

 On Tue, Oct 5, 2010 at 11:04 AM, Vishal K vishalm...@gmail.com wrote:

  Hi,
 
  I have a 3 node ZK cluster (A, B, C). On one of the the nodes (node A), I
  have a ZK client running that connects to the local server and creates an
  ephemeral znode to indicate clients on other nodes that it is online.
 
  I have test script that reboots the zookeeper server as well as client on
  A. The test does a getstat on the ephemeral znode created by the client
 on
  A. I am seeing that the view of znodes on A is different from the other 2
  nodes. I can tell this from the session ID that the client gets after
  reconnecting to the local ZK server.
 
  So the test is simple:
  - kill zookeeper server and client process
  - wait for a few seconds
  - do zkCli.sh stat ...  test.out
 
  What I am seeing is that the ephemeral znode with old zxid, time, and
  session ID is reappearing on node A. I have attached the output of 3
  consecutive getstat requests of the test (see client_getstat.out). Notice
  that the third output is the same as the first one. That is, the old
  ephemeral znode reappeared at A. However, both B and C are showing the
  latest znode with correct time, zxid and session ID (output not
 attached).
 
  After this point, all following getstat requests on A are showing the old
  znode. Whereas, B and C show the correct znode every time the client on A
  comes online. This is something very perplexing. Earlier I thought this
 was
  a bug in my client implementation. But the test shows that the ZK server
 on
  A after reboot is out of sync with rest of the servers.
 
  The stat command to each server shows that the servers are in sync as far
  as zxid's are concerned (see stat.out). So there is something wrong with
 A's
  local database that is causing this problem.
 
  Has anyone seen this before? I will be doing more debugging in the next
 few
  days. Comments/suggestions for further debugging are welcomed.
 
  -Vishal
 
 
 



snapshots

2010-10-06 Thread Avinash Lakshman
Hi All

Are snapshots serialized dumps of the DataTree taken whenever a log rolls
over? So when a server goes down and comes back up does it construct the
data tree from the snapshots? What if I am running this on a machine with
SSD as extended RAM how does it affect anything?

Cheers
A


Changing configuration

2010-10-06 Thread Avinash Lakshman
Suppose I have a 3 node ZK cluster composed of machines A, B and C. Now for
whatever reason I lose C forever and the machine needs to be replaced. How
do I handle this situation? Update the config with D in place of C and
restart the cluster? Also if I am interested in read just the ZAB portions
which packages should I be looking at?

Cheers
A