Re: possible bug in zookeeper ?

2010-10-04 Thread Mahadev Konar
Hi Yatir,

  Any update on this? Are you still struggling with this problem?

Thanks
mahadev

On 9/15/10 12:56 AM, Yatir Ben Shlomo yat...@outbrain.com wrote:

 Thanks to all who replied, I appreciate your efforts:
 
 1. There is no connections problem from the client machine:
 (ob1078)(tom...@cass3:~)$ echo ruok | nc zook1 2181
 imok(ob1078)(tom...@cass3:~)$ echo ruok | nc zook2 2181
 imok(ob1078)(tom...@cass3:~)$ echo ruok | nc zook3 2181
 imok(ob1078)(tom...@cass3:~)$
 
 2. Unfortunately I have already tried to switch to the new jar but it does not
 seem to be backward compatible.
 It seems that the QuorumPeerConfig class does not have the following field
 protected int clientPort;
 It was replaced by InetSocketAddress clientPortAddress in the new jar
 So I am getting java.lang.NoSuchFieldError exception...
 
 3. I looked at the ClientCnxn.java code.
 It seems that the logic for iterating over the available servers
 (nextAddrToTry++ ) is used only inside the startConnect() function but not in
 the finishConnect() function, nor anywhere else.
 
 Possibly something along these lines is happening:
 an exception thrown inside the finishConnect() function triggers
 the cleanup() function, which in turn throws another exception.
 Nowhere in this code path is nextAddrToTry++ applied.
 Does this make sense to anyone?
 thanks
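
To make the hypothesis in point 3 concrete, here is a minimal sketch of a server list whose index advances on every attempt, whichever stage of the connection failed. This is not the actual ClientCnxn code: the class ServerRotation and its methods are hypothetical names used only for illustration, and the probe stands in for a real connection attempt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper (not part of ZooKeeper): round-robin over a connect string. */
final class ServerRotation {
    private final List<String> servers = new ArrayList<>();
    private int next = 0;

    /** Parses a comma-separated list such as "zook1:2181,zook2:2181,zook3:2181". */
    ServerRotation(String connectString) {
        for (String s : connectString.split(",")) {
            String t = s.trim();
            if (!t.isEmpty()) {
                servers.add(t);
            }
        }
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("no servers in connect string");
        }
    }

    /** Returns the server to try now and advances the index, wrapping around. */
    String nextServer() {
        String s = servers.get(next);
        next = (next + 1) % servers.size();
        return s;
    }

    /**
     * Tries each server at most once. Because nextServer() is called for every
     * attempt, a failing server is never retried before the others get a turn.
     * Returns the first server accepted by the probe, or null if none is.
     */
    String firstReachable(Predicate<String> probe) {
        for (int i = 0; i < servers.size(); i++) {
            String candidate = nextServer();
            if (probe.test(candidate)) {
                return candidate;
            }
        }
        return null;
    }
}
```

With zook3 down and the index currently pointing at it, firstReachable() would move on to zook1 instead of retrying zook3 indefinitely, which is the behaviour I expected from the client.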
 
 
 
 
 
 
 -Original Message-
 From: Patrick Hunt [mailto:ph...@apache.org]
 Sent: Tuesday, September 14, 2010 6:20 PM
 To: zookeeper-user@hadoop.apache.org
 Subject: Re: possible bug in zookeeper ?
 
 That is unusual. I don't recall anyone reporting a similar issue, and
 looking at the code I don't see any issues off hand. Can you try the
 following?
 
 1) on that particular zk client machine resolve the hosts zook1/zook2/zook3,
 what ip addresses does this resolve to? (try dig)
 2) try running the client using the 3.3.1 jar file (just replace the jar on
 the client), it includes more log4j information, turn on DEBUG or TRACE
 logging
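
For step 1, dig shows what DNS returns, but the JVM may resolve names differently (for example through /etc/hosts or its own address cache). A small sketch for checking what the JVM itself sees is below; the class name ResolveHosts is made up for this example, and the host names are the ones from this thread.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: prints the addresses the JVM resolves for each host. */
final class ResolveHosts {
    /** Returns all addresses for the host, or an empty list if it cannot be resolved. */
    static List<String> resolve(String host) {
        List<String> out = new ArrayList<>();
        try {
            for (InetAddress a : InetAddress.getAllByName(host)) {
                out.add(a.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Unresolvable host: report it as an empty list.
        }
        return out;
    }

    public static void main(String[] args) {
        for (String host : new String[] {"zook1", "zook2", "zook3"}) {
            System.out.println(host + " -> " + resolve(host));
        }
    }
}
```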
 
 Patrick
 
 On Tue, Sep 14, 2010 at 8:44 AM, Yatir Ben Shlomo yat...@outbrain.com wrote:
 
 zook1:2181,zook2:2181,zook3:2181
 
 
 -Original Message-
 From: Ted Dunning [mailto:ted.dunn...@gmail.com]
 Sent: Tuesday, September 14, 2010 4:11 PM
 To: zookeeper-user@hadoop.apache.org
 Subject: Re: possible bug in zookeeper ?
 
 What was the list of servers that was given originally to open the
 connection to ZK?
 
 On Tue, Sep 14, 2010 at 6:15 AM, Yatir Ben Shlomo yat...@outbrain.com
 wrote:
 
 Hi, I am using SolrCloud, which uses an ensemble of 3 ZooKeeper instances.
 
 I am performing survivability tests:
 after taking one of the ZooKeeper instances down, I would expect the client
 to use a different ZooKeeper server instance.
 
 But as you can see in the logs below, depending on which instance I take
 down (in my case, the last one in the list of ZooKeeper servers), the client
 keeps insisting on the same ZooKeeper server
 (Attempting connection to server zook3/192.168.252.78:2181)
 and does not switch to a different one.
 The problem seems to arise from ClientCnxn.java.
 Does anyone have an idea about this?
 
 SolrCloud is currently using zookeeper-3.2.2.jar.
 Is this a known bug that was fixed in a later version (3.3.1)?
 
 Thanks in advance,
 Yatir
 
 
 Logs:
 
 Sep 14, 2010 9:02:20 AM org.apache.log4j.Category warn
 WARNING: Ignoring exception during shutdown input
 java.nio.channels.ClosedChannelException
     at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):999)
     at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
 Sep 14, 2010 9:02:20 AM org.apache.log4j.Category warn
 WARNING: Ignoring exception during shutdown output
 java.nio.channels.ClosedChannelException
     at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):1004)
     at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
 Sep 14, 2010 9:02:22 AM org.apache.log4j.Category info
 INFO: Attempting connection to server zook3/192.168.252.78:2181
 Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
 WARNING: Exception closing session 0x32b105244a20001 to sun.nio.ch.selectionkeyi...@3ca58cbf
 java.net.ConnectException: Connection refused
     at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
     at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
     at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):933)
 Sep 14, 2010 9:02:22 AM 

Re: Expiring session... timeout of 600000ms exceeded

2010-10-04 Thread Mahadev Konar
I am not sure whether anyone responded to this. Are the clients getting
session expired or connection loss?
In any case, the ZooKeeper client has its own thread to update the server with
active connection status. Did you take a look at the GC activity on your
client?

Thanks
mahadev


On 9/21/10 8:24 AM, Tim Robertson timrobertson...@gmail.com wrote:

 Hi all,
 
 I am seeing a lot of my clients being kicked out after the 10 minute
 negotiated timeout is exceeded.
 My clients are each a JVM (around 100 running on a machine) which are
 doing web crawling of specific endpoints and handling the response XML
 - so they do wait around for 3-4 minutes on HTTP timeouts, but
 certainly not 10 mins.
 I am just prototyping right now on a 2xquad core mac pro with 12GB
 memory, and the 100 child processes only get -Xmx64m and I don't see
 my machine exhausted.
 
 Do my clients need to do anything in order to initiate keep alive
 heart beats or should this be automatic (I thought the ticktime would
 dictate this)?
 
 # my conf is:
 tickTime=2000
 dataDir=/Volumes/Data/zookeeper
 clientPort=2181
 maxClientCnxns=1
 minSessionTimeout=4000
 maxSessionTimeout=80
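
For reference, a minimal sketch of how the negotiated timeout relates to these settings, as I understand the documented behaviour: the server bounds the timeout requested by the client between minSessionTimeout and maxSessionTimeout, which default to 2 and 20 times tickTime. The class and method names are made up for illustration, and the values in the usage note are examples rather than the ones from the conf above.

```java
/** Hypothetical helper: bounds a requested session timeout as described above. */
final class SessionTimeouts {
    /** Documented default for minSessionTimeout: 2 ticks. */
    static int defaultMin(int tickTimeMs) {
        return 2 * tickTimeMs;
    }

    /** Documented default for maxSessionTimeout: 20 ticks. */
    static int defaultMax(int tickTimeMs) {
        return 20 * tickTimeMs;
    }

    /** Clamps the client's requested timeout into [minMs, maxMs]. */
    static int negotiate(int requestedMs, int minMs, int maxMs) {
        return Math.max(minMs, Math.min(requestedMs, maxMs));
    }
}
```

For example, with tickTime=2000 and default bounds, a client asking for 600000 ms would be given 40000 ms, so a 10 minute negotiated timeout implies a maxSessionTimeout of at least 600000.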
 
 Thanks for any pointers to this newbie,
 Tim
 



kaChing using ZooKeeper: Continuous Deployment

2010-10-04 Thread Jay Tobias
I was asked to post this to the user list after attending the ACM talk by Louis
Pascal. ZooKeeper is one of several technologies combined to build and deploy
their whole stack in 5 minutes (also presented by invitation at Google in May).
The time includes testing, and ZooKeeper controls the deployment rate.

Applied Lean Startup Ideas: Continuous Deployment at kaChing
Let me give some context about what kaChing is. Everything uses the same
platform, kawala. Coordination using ZooKeeper …
www.slideshare.net/pascallouis/...


--Jay