Re: kaChing using ZooKeeper: Continuous Deployment

2010-10-04 Thread Adam Rosien
Uhh, I'm not sure what you mean, but if anyone's curious I wrote the
system that we use at kaChing to do continuous deployment with zk as
the discovery and coordination layers, and I'd be happy to talk about
how it works, etc. Pretty basic stuff.

BTW it's Pascal-Louis Perez, CTO of kaChing.

.. Adam

On Mon, Oct 4, 2010 at 7:15 PM, Jay Tobias  wrote:
> I was asked to post this to the user list after attending the ACM talk by Louis 
> Pascal. ZooKeeper is one of several technologies combined to build and deploy 
> their whole stack in 5 minutes (also presented by invitation at Google in May). 
> The time includes testing, and ZooKeeper controls the deployment rate.
>
> Applied Lean Startup Ideas: Continuous Deployment at kaChing
> Let me give some context about what kaChing is. Everything uses the 
> same platform, kawala. Coordination using ZooKeeper …
> www.slideshare.net/pascallouis/...
>
>
> --Jay


kaChing using ZooKeeper: Continuous Deployment

2010-10-04 Thread Jay Tobias
I was asked to post this to the user list after attending the ACM talk by Louis 
Pascal. ZooKeeper is one of several technologies combined to build and deploy 
their whole stack in 5 minutes (also presented by invitation at Google in May). 
The time includes testing, and ZooKeeper controls the deployment rate.

Applied Lean Startup Ideas: Continuous Deployment at kaChing
Let me give some context about what kaChing is. Everything uses the same 
platform, kawala. Coordination using ZooKeeper …
www.slideshare.net/pascallouis/...


--Jay

Re: Expiring session... timeout of 600000ms exceeded

2010-10-04 Thread Mahadev Konar
I am not sure if anyone responded to this or not. Are the clients getting
session expired or ConnectionLoss?
In any case, the ZooKeeper client has its own thread to update the server with
active connection status. Did you take a look at the GC activity on your
client?

Thanks
mahadev


On 9/21/10 8:24 AM, "Tim Robertson"  wrote:

> Hi all,
> 
> I am seeing a lot of my clients being kicked out after the 10 minute
> negotiated timeout is exceeded.
> My clients are each a JVM (around 100 running on a machine) which are
> doing web crawling of specific endpoints and handling the response XML
> - so they do wait around for 3-4 minutes on HTTP timeouts, but
> certainly not 10 mins.
> I am just prototyping right now on a 2xquad core mac pro with 12GB
> memory, and the 100 child processes only get -Xmx64m and I don't see
> my machine exhausted.
> 
> Do my clients need to do anything in order to initiate keep alive
> heart beats or should this be automatic (I thought the ticktime would
> dictate this)?
> 
> # my conf is:
> tickTime=2000
> dataDir=/Volumes/Data/zookeeper
> clientPort=2181
> maxClientCnxns=1
> minSessionTimeout=4000
> maxSessionTimeout=80
> 
> Thanks for any pointers to this newbie,
> Tim
> 
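On the tickTime question above: as far as the server documentation describes it, heartbeats are sent by the client library itself, and the server clamps the timeout a client requests into the range [minSessionTimeout, maxSessionTimeout], which default to 2 and 20 times tickTime. The sketch below only illustrates that arithmetic. The function names are invented for this illustration, and the 600000 ms ceiling is an assumption taken from the thread subject, since the posted maxSessionTimeout value looks cut off.

```python
def negotiated_timeout(requested_ms, tick_time_ms, min_ms=None, max_ms=None):
    """Clamp a requested session timeout into the server's allowed range.

    When the bounds are not configured, the documented defaults apply:
    min = 2 * tickTime and max = 20 * tickTime.
    """
    lo = 2 * tick_time_ms if min_ms is None else min_ms
    hi = 20 * tick_time_ms if max_ms is None else max_ms
    return max(lo, min(requested_ms, hi))


def pause_expires_session(pause_ms, timeout_ms):
    """True if a client that stays silent for pause_ms (for example during a
    long GC pause) has been quiet for longer than the session timeout."""
    return pause_ms > timeout_ms


if __name__ == "__main__":
    # tickTime and minSessionTimeout as posted; maxSessionTimeout is assumed.
    timeout = negotiated_timeout(600000, 2000, min_ms=4000, max_ms=600000)
    print("negotiated:", timeout)
    print("with default bounds:", negotiated_timeout(600000, 2000))
    print("5 s pause expires session:", pause_expires_session(5000, timeout))
```

With a 10 minute negotiated timeout, only a pause (or a stalled heartbeat thread) of more than 10 minutes would explain an expiry, which is why looking at GC activity on the client is a sensible next step.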



Re: possible bug in zookeeper ?

2010-10-04 Thread Mahadev Konar
Hi Yatir,

  Any update on this? Are you still struggling with this problem?

Thanks
mahadev

On 9/15/10 12:56 AM, "Yatir Ben Shlomo"  wrote:

> Thanks to all who replied, I appreciate your efforts:
> 
> 1. There is no connections problem from the client machine:
> (ob1078)(tom...@cass3:~)$ echo ruok | nc zook1 2181
> imok(ob1078)(tom...@cass3:~)$ echo ruok | nc zook2 2181
> imok(ob1078)(tom...@cass3:~)$ echo ruok | nc zook3 2181
> imok(ob1078)(tom...@cass3:~)$
> 
> 2. Unfortunately I have already tried to switch to the new jar but it does not
> seem to be backward compatible.
> It seems that the QuorumPeerConfig class does not have the following field
> protected int clientPort;
> It was replaced by InetSocketAddress clientPortAddress in the new jar
> So I am getting java.lang.NoSuchFieldError exception...
> 
> 3. I looked at the ClientCnxn.java code.
> It seems that the logic for iterating over the available servers
> (nextAddrToTry++ ) is used only inside the startConnect() function but not in
> the finishConnect() function, nor anywhere else.
> 
> Possibly something along these lines is happening:
> some exception that happens inside the finishConnect() function triggers
> the cleanup() function, which in turn causes another exception.
> Nowhere in this code path is the nextAddrToTry++ applied.
> Does this make sense to anyone?
> thanks
> 
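To make the hypothesis above concrete, here is a minimal sketch of the behaviour one would expect from a client: the index into the server list advances on every attempt, so a failure on one server always leads to the next one. This is not the ClientCnxn code. `ServerRotation` and `connect_with_failover` are hypothetical names used only for illustration, and `try_connect` stands in for whatever actually opens the socket.

```python
class ServerRotation:
    """Hypothetical stand-in for the client's server list; not ZooKeeper code."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._next = 0

    def next_server(self):
        """Return the next address to try and advance the index, wrapping around."""
        server = self._servers[self._next]
        self._next = (self._next + 1) % len(self._servers)
        return server


def connect_with_failover(rotation, try_connect, max_attempts):
    """Try servers in turn; every attempt, failed or not, moves to the next one."""
    for _ in range(max_attempts):
        server = rotation.next_server()
        if try_connect(server):
            return server
    return None


if __name__ == "__main__":
    rotation = ServerRotation("zook1:2181,zook2:2181,zook3:2181".split(","))
    rotation.next_server()  # zook1 handed out
    rotation.next_server()  # zook2 handed out
    # zook3 is treated as down in this toy run, so the helper moves on to zook1.
    print(connect_with_failover(rotation, lambda s: not s.startswith("zook3"), 3))
```

If the increment were skipped on some failure path, as suspected above, `next_server` would keep returning the same address, which would match the repeated "Attempting connection to server zook3" lines in the logs.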
> 
> 
> 
> 
> 
> -Original Message-
> From: Patrick Hunt [mailto:ph...@apache.org]
> Sent: Tuesday, September 14, 2010 6:20 PM
> To: zookeeper-user@hadoop.apache.org
> Subject: Re: possible bug in zookeeper ?
> 
> That is unusual. I don't recall anyone reporting a similar issue, and
> looking at the code I don't see any issues off hand. Can you try the
> following?
> 
> 1) on that particular zk client machine resolve the hosts zook1/zook2/zook3,
> what ip addresses does this resolve to? (try dig)
> 2) try running the client using the 3.3.1 jar file (just replace the jar on
> the client), it includes more log4j information, turn on DEBUG or TRACE
> logging
> 
> Patrick
> 
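The two checks discussed in this thread (resolving the server names on the client machine, and the `echo ruok | nc` probe) can also be scripted. The following is a rough sketch using only the Python standard library. The function names `resolve` and `ruok` are made up for this example, and the probe assumes the server answers the four-letter command and then closes the connection, as the `nc` output in this thread suggests.

```python
import socket


def resolve(host):
    """Return the sorted set of IP addresses the local resolver gives for host."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def ruok(host, port=2181, timeout=5.0):
    """Send the 'ruok' four-letter command and return the server's reply text.

    Reads until the server closes the connection; a healthy server is
    expected to answer 'imok'.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"ruok")
        chunks = []
        while True:
            data = conn.recv(64)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")
```

Calling `resolve("zook3")` and `ruok("zook3")` for each of zook1, zook2 and zook3 from the client machine would show whether a name resolves to an unexpected address and whether each server is answering.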
> On Tue, Sep 14, 2010 at 8:44 AM, Yatir Ben Shlomo wrote:
> 
>> zook1:2181,zook2:2181,zook3:2181
>> 
>> 
>> -Original Message-
>> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
>> Sent: Tuesday, September 14, 2010 4:11 PM
>> To: zookeeper-user@hadoop.apache.org
>> Subject: Re: possible bug in zookeeper ?
>> 
>> What was the list of servers that was given originally to open the
>> connection to ZK?
>> 
>> On Tue, Sep 14, 2010 at 6:15 AM, Yatir Ben Shlomo wrote:
>> 
>>> Hi I am using solrCloud which uses an ensemble of 3 zookeeper instances.
>>> 
>>> I am performing survivability tests:
>>> Taking one of the zookeeper instances down, I would expect the client to
>>> use a different zookeeper server instance.
>>> 
>>> But as you can see in the logs attached below,
>>> depending on which instance I choose to take down (in my case, the last
>>> one in the list of zookeeper servers),
>>> the client constantly insists on the same zookeeper server (Attempting
>>> connection to server zook3/192.168.252.78:2181)
>>> and does not switch to a different one.
>>> The problem seems to arise from ClientCnxn.java.
>>> Does anyone have an idea about this?
>>> 
>>> Solr cloud currently is using zookeeper-3.2.2.jar
>>> Is this a known bug that was fixed in later versions (3.3.1)?
>>> 
>>> Thanks in advance,
>>> Yatir
>>> 
>>> 
>>> Logs:
>>> 
>>> Sep 14, 2010 9:02:20 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown input
>>> java.nio.channels.ClosedChannelException
>>>     at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>>     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):999)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> Sep 14, 2010 9:02:20 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown output
>>> java.nio.channels.ClosedChannelException
>>>     at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>>     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):1004)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category info
>>> INFO: Attempting connection to server zook3/192.168.252.78:2181
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Exception closing session 0x32b105244a20001 to
>>> sun.nio.ch.selectionkeyi...@3ca58cbf
>>> java.net.ConnectException: Connection refused
>>>at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
>>>