Debugging help for SessionExpiredException

2010-06-09 Thread Jordan Zimmerman
We have a test system using Zookeeper. There is a single Zookeeper server node and 4 clients. There is very little activity in this system. After a day's testing we start to see SessionExpiredException on the client. Things I've tried:
* Increasing the session timeout to 1 minute
* Making sure

Re: Debugging help for SessionExpiredException

2010-06-09 Thread Patrick Hunt
"100mb partition"? sounds like virtualization. resource starvation (worse in virtualized env) is a common cause of this. Are your clients gcing/swapping at all? If a client gc's for long periods of time the heartbeat thread won't be able to run and the server will expire the session. There is a

Re: Debugging help for SessionExpiredException

2010-06-09 Thread Stephen Green
On Wed, Jun 9, 2010 at 2:47 PM, Patrick Hunt wrote:
> My guess is that your client is gcing for long periods of time - you can
> rule this in/out by turning on gc logging in your clients and then viewing
> the results after another such incident happens (try gchisto for graphical
> view)

From re
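Alongside GC logging, one way to check the long-pause theory from inside a client is a small watchdog that sleeps for a fixed interval and measures how much longer the sleep actually took. This is only a sketch, not part of ZooKeeper: the class name PauseDetector, both method names, and the "one third of the session timeout" threshold are assumptions made for illustration. The 1 minute session timeout is the value mentioned at the start of the thread.

```java
import java.util.concurrent.TimeUnit;

public class PauseDetector {
    /** How far an observed sleep overshot the requested interval, in ms (never negative). */
    static long overshootMillis(long requestedNanos, long elapsedNanos) {
        return Math.max(0L, TimeUnit.NANOSECONDS.toMillis(elapsedNanos - requestedNanos));
    }

    /** Hypothetical rule of thumb: flag stalls longer than a third of the session timeout. */
    static boolean isSuspicious(long overshootMillis, long sessionTimeoutMillis) {
        return overshootMillis * 3 > sessionTimeoutMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        long sessionTimeoutMillis = 60_000;  // 1 minute, as tried in the original post
        long intervalMillis = 500;           // assumed sampling interval
        long intervalNanos = TimeUnit.MILLISECONDS.toNanos(intervalMillis);
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 10;

        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();
            Thread.sleep(intervalMillis);
            long overshoot = overshootMillis(intervalNanos, System.nanoTime() - start);
            // A large overshoot means this thread could not run (GC, swapping, starved VM).
            if (isSuspicious(overshoot, sessionTimeoutMillis)) {
                System.err.println("Stalled for ~" + overshoot + " ms; session may be at risk");
            }
        }
    }
}
```

In a real client the loop would run in a daemon thread for the life of the process. If a stall is logged shortly before a SessionExpiredException, that points at the client side (GC, swapping, a starved VM) rather than the server.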

Re: Debugging help for SessionExpiredException

2010-06-09 Thread Lei Zhang
We use ZooKeeper in virtualized environments, both on Amazon EC2 and on VMware Workstation on local machines. We've consistently run into issues with VMware Workstation (CentOS as guest OS) on a Windows host: just leaving the cluster idle overnight leads to ZK session expiry. My theory is:

Re: Debugging help for SessionExpiredException

2010-06-09 Thread Ted Dunning
This can depend on which kind of instance you invoke as well. The smallest instances disappear for short periods of time and that can lead to surprises.

On Wed, Jun 9, 2010 at 3:35 PM, Lei Zhang wrote:
> On EC2 (still CentOS as guest OS), we consistently run into zk session
> expire issue when

Re: Debugging help for SessionExpiredException

2010-06-09 Thread Patrick Hunt
On 06/09/2010 03:35 PM, Lei Zhang wrote:
> We've consistently run into issues with vmware workstation (CentOS as guest OS) on Windows host: just by leaving the cluster idle over night leads to zk session expire issue. My theory is: windows may have gone to hibernation, the zk heartbeat logic hibe