Re: Suggested way to simulate client session expiration in unit tests?

2010-07-06 Thread Patrick Hunt

If you want to simulate expiration use the example I sent.


http://github.com/phunt/zkexamples


Another option is to use a mock.
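
To illustrate the mock route, here is a minimal sketch in plain Java. It assumes the code under test reaches the session through a small listener interface of your own; SessionListener, FakeSession and LockHolder are hypothetical names made up for this example and are not part of the ZooKeeper API. The fake lets a test fire an expiration on demand, with no server and no waiting.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; none of these types come from ZooKeeper.
interface SessionListener {
    void onSessionExpired();
}

// Stand-in for the real client: a test can expire it on demand.
class FakeSession {
    private final List<SessionListener> listeners = new ArrayList<>();
    private boolean expired = false;

    void addListener(SessionListener listener) {
        listeners.add(listener);
    }

    boolean isExpired() {
        return expired;
    }

    // Called by the test to simulate the session expiring.
    void expire() {
        if (expired) {
            return; // deliver the event at most once
        }
        expired = true;
        for (SessionListener listener : listeners) {
            listener.onSessionExpired();
        }
    }
}

// Code under test: it must give up its lock when the session expires.
class LockHolder implements SessionListener {
    private boolean held = false;

    LockHolder(FakeSession session) {
        session.addListener(this);
    }

    void acquire() {
        held = true;
    }

    boolean isHeld() {
        return held;
    }

    @Override
    public void onSessionExpired() {
        held = false;
    }
}

class MockExpirationDemo {
    public static void main(String[] args) {
        FakeSession session = new FakeSession();
        LockHolder lock = new LockHolder(session);
        lock.acquire();
        session.expire();
        System.out.println("held after expiration: " + lock.isHeld());
    }
}
```

The trade-off is that this only exercises your own handling of the event; it says nothing about when a real server would actually expire the session.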

Patrick

On 07/06/2010 05:42 PM, Jeremy Davis wrote:

Thanks!
That seems to work, but it is approximately the same as zooKeeper.close() in
that there is no SessionExpired event that comes up through the default
Watcher.
Maybe I'm assuming more from ZK than I should, but should a paranoid lock
implementation periodically test its session by reading or writing a value?

Regards,
-JD


On Tue, Jul 6, 2010 at 10:32 AM, Mahadev Konar wrote:


Hi Jeremy,

  zk.disconnect() is the right way to disconnect from the servers. For
session expiration you just have to make sure that the client stays
disconnected for more than the session expiration interval.

Hope that helps.

Thanks
mahadev


On 7/6/10 9:09 AM, "Jeremy Davis"  wrote:


Is there a recommended way of simulating a client session expiration in unit
tests?
I see a TestableZooKeeper.java, with a pauseCnxn() method that does cause
the connection to timeout/disconnect and reconnect. Is there an easy way to
push this all the way through to session expiration?
Thanks,
-JD







Re: Suggested way to simulate client session expiration in unit tests?

2010-07-06 Thread Jeremy Davis
Thanks!
That seems to work, but it is approximately the same as zooKeeper.close() in
that there is no SessionExpired event that comes up through the default
Watcher.
Maybe I'm assuming more from ZK than I should, but should a paranoid lock
implementation periodically test its session by reading or writing a value?
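
One conservative alternative to polling, sketched below under stated assumptions: a client that is disconnected cannot tell whether its session is still alive, so the lock stops acting as holder as soon as it sees a disconnect and resumes only after a reconnect on the same session. ConnState and ParanoidLockGuard are hypothetical names for this illustration; the enum only loosely mirrors the connection events a watcher receives and is not the ZooKeeper API.

```java
// Hypothetical names; this is not ZooKeeper code.
enum ConnState { CONNECTED, DISCONNECTED, EXPIRED }

class ParanoidLockGuard {
    private ConnState state = ConnState.CONNECTED;
    private boolean acquired = false;

    synchronized void markAcquired() {
        acquired = true;
    }

    // Feed every connection-state change into the guard.
    synchronized void onStateChange(ConnState newState) {
        if (state == ConnState.EXPIRED) {
            return; // expiration is terminal for this session
        }
        state = newState;
        if (newState == ConnState.EXPIRED) {
            acquired = false; // the lock is gone together with the session
        }
    }

    // Act as lock holder only while the session is known to be live.
    synchronized boolean safeToAct() {
        return acquired && state == ConnState.CONNECTED;
    }
}

class ParanoidLockDemo {
    public static void main(String[] args) {
        ParanoidLockGuard guard = new ParanoidLockGuard();
        guard.markAcquired();

        guard.onStateChange(ConnState.DISCONNECTED);
        System.out.println("safe while disconnected: " + guard.safeToAct());

        guard.onStateChange(ConnState.CONNECTED);
        System.out.println("safe after reconnect: " + guard.safeToAct());

        guard.onStateChange(ConnState.EXPIRED);
        System.out.println("safe after expiration: " + guard.safeToAct());
    }
}
```

Whether pausing on every disconnect is acceptable depends on how costly a false alarm is for the application; this sketch simply prefers giving up work over risking two holders.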

Regards,
-JD


On Tue, Jul 6, 2010 at 10:32 AM, Mahadev Konar wrote:

> Hi Jeremy,
>
>  zk.disconnect() is the right way to disconnect from the servers. For
> session expiration you just have to make sure that the client stays
> disconnected for more than the session expiration interval.
>
> Hope that helps.
>
> Thanks
> mahadev
>
>
> On 7/6/10 9:09 AM, "Jeremy Davis"  wrote:
>
> > Is there a recommended way of simulating a client session expiration in unit
> > tests?
> > I see a TestableZooKeeper.java, with a pauseCnxn() method that does cause
> > the connection to timeout/disconnect and reconnect. Is there an easy way to
> > push this all the way through to session expiration?
> > Thanks,
> > -JD
>
>


Re: zookeeper crash

2010-07-06 Thread Travis Crawford
Hey all -

I believe we just suffered an outage from this issue. The short version:
while restarting quorum members with the GC flags recommended on the
Troubleshooting wiki page, a follower logged messages similar to those
in the following jiras:

2010-07-06 23:14:01,438 - FATAL
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:follo...@71] - Leader epoch 20 is
less than our epoch 21
2010-07-06 23:14:01,438 - WARN
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:follo...@82] - Exception when
following the leader
java.io.IOException: Error: Epoch of leader is lower

https://issues.apache.org/jira/browse/ZOOKEEPER-335
https://issues.apache.org/jira/browse/ZOOKEEPER-790

Reading through the jiras, it's unclear whether the issue is well
understood at this point (as there's a patch available) or still being
investigated.

If it's still being investigated, let me know and I can attach the
relevant log lines to the appropriate jira.

Or if the patch appears good I can make a new release and help test.
Let me know :)

--travis





On Wed, Jun 16, 2010 at 3:25 PM, Flavio Junqueira  wrote:
> I would recommend opening a separate jira issue. I'm not convinced the
> issues are the same, so I'd rather keep them separate and link the issues if
> it is the case.
>
> -Flavio
>
> On Jun 17, 2010, at 12:16 AM, Patrick Hunt wrote:
>
>> We are unable to reproduce this issue. If you can provide the server
>> logs (all servers) and attach them to the jira it would be very helpful.
>> Some detail on the approx time of the issue so we can correlate to the
>> logs would help too (summary of what you did/do to cause it, etc...
>> anything that might help us nail this one down).
>>
>> https://issues.apache.org/jira/browse/ZOOKEEPER-335
>>
>> Some detail on ZK version, OS, Java version, HW info, etc... would also
>> be of use to us.
>>
>> Patrick
>>
>> On 06/16/2010 02:49 PM, Vishal K wrote:
>>>
>>> Hi,
>>>
>>> We are running into this bug very often (almost 60-75% hit rate) while
>>> testing our newly developed application over ZK. This is almost a blocker
>>> for us. Will the fix be simplified if backward compatibility was not an
>>> issue?
>>>
>>> Considering that this bug is rarely reported, I am wondering why we are
>>> running into this problem so often. Also, on a side note, I am curious
>>> why
>>> the systest that comes with ZooKeeper did not detect this bug. Can anyone
>>> please give an overview of the problem?
>>>
>>> Thanks.
>>> -Vishal
>>>
>>>
>>> On Wed, Jun 2, 2010 at 8:17 PM, Charity Majors wrote:
>>>
>>>> Sure thing.
>>>>
>>>> We got paged this morning because backend services were not able to
>>>> write to the database.  Each server discovers the DB master using
>>>> zookeeper, so when zookeeper goes down, they assume they no longer
>>>> know who the DB master is and stop working.
>>>>
>>>> When we realized there were no problems with the database, we logged
>>>> in to the zookeeper nodes.  We weren't able to connect to zookeeper
>>>> using zkCli.sh from any of the three nodes, so we decided to restart
>>>> them all, starting with node one.  However, after restarting node
>>>> one, the cluster started responding normally again.
>>>>
>>>> (The timestamps on the zookeeper processes on nodes two and three
>>>> *are* dated today, but none of us restarted them.  We checked shell
>>>> histories and sudo logs, and they seem to back us up.)
>>>>
>>>> We tried getting node one to come back up and join the cluster, but
>>>> that's when we realized we weren't getting any logs, because
>>>> log4j.properties was in the wrong location.  Sorry -- I REALLY wish
>>>> I had those logs for you.  We put log4j back in place, and that's
>>>> when we saw the spew I pasted in my first message.
>>>>
>>>> I'll tack this on to ZK-335.
>>>>
>>>> On Jun 2, 2010, at 4:17 PM, Benjamin Reed wrote:
>>>>
>>>>> charity, do you mind going through your scenario again to give a
>>>>> timeline for the failure? i'm a bit confused as to what happened.
>>>>>
>>>>> ben
>>>>>
>>>>> On 06/02/2010 01:32 PM, Charity Majors wrote:
>>>>>>
>>>>>> Thanks.  That worked for me.  I'm a little confused about why it
>>>>>> threw the entire cluster into an unusable state, though.
>>>>>>
>>>>>> I said before that we restarted all three nodes, but tracing back,
>>>>>> we actually didn't.  The zookeeper cluster was refusing all
>>>>>> connections until we restarted node one.  But once node one had
>>>>>> been dropped from the cluster, the other two nodes formed a quorum
>>>>>> and started responding to queries on their own.
>>>>>>
>>>>>> Is that expected as well?  I didn't see it in ZOOKEEPER-335, so
>>>>>> thought I'd mention it.
>>>>>>
>>>>>> On Jun 2, 2010, at 11:49 AM, Patrick Hunt wrote:
>>>>>>
>>>>>>> Hi Charity, unfortunately this is a known issue not specific to
>>>>>>> 3.3 that we are working to address. See this thread for some
>>>>>>> background:
>>>>>>>
>>>>>>> http://zookeeper-user.578899.n

Re: Suggested way to simulate client session expiration in unit tests?

2010-07-06 Thread Patrick Hunt

not sure if this still works but here's an example:

http://github.com/phunt/zkexamples

Patrick

On 07/06/2010 10:32 AM, Mahadev Konar wrote:

Hi Jeremy,

  zk.disconnect() is the right way to disconnect from the servers. For
session expiration you just have to make sure that the client stays
disconnected for more than the session expiration interval.

Hope that helps.

Thanks
mahadev


On 7/6/10 9:09 AM, "Jeremy Davis"  wrote:


Is there a recommended way of simulating a client session expiration in unit
tests?
I see a TestableZooKeeper.java, with a pauseCnxn() method that does cause
the connection to timeout/disconnect and reconnect. Is there an easy way to
push this all the way through to session expiration?
Thanks,
-JD




Re: Suggested way to simulate client session expiration in unit tests?

2010-07-06 Thread Mahadev Konar
Hi Jeremy,

 zk.disconnect() is the right way to disconnect from the servers. For
session expiration you just have to make sure that the client stays
disconnected for more than the session expiration interval.
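
To make the timing concrete, here is a small model of the rule described above, written as a sketch rather than as ZooKeeper code. SessionModel and its methods are hypothetical; the model assumes the server expires a session once it has heard nothing from the client for longer than the session timeout, and that the client finds out only when it next reaches the server.

```java
// Hypothetical model of server-side session tracking; not ZooKeeper code.
class SessionModel {
    private final long timeoutMillis;
    private long lastHeardMillis;
    private boolean expired = false;

    SessionModel(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastHeardMillis = nowMillis;
    }

    // Server-side periodic check: expire the session after too long a silence.
    void tick(long nowMillis) {
        if (nowMillis - lastHeardMillis > timeoutMillis) {
            expired = true;
        }
    }

    // A heartbeat or request from the client reaches the server.
    // Returns false if the session had already expired, which is the
    // moment the client learns about the expiration.
    boolean touch(long nowMillis) {
        tick(nowMillis);
        if (expired) {
            return false;
        }
        lastHeardMillis = nowMillis;
        return true;
    }

    boolean isExpired() {
        return expired;
    }
}

class SessionModelDemo {
    public static void main(String[] args) {
        SessionModel session = new SessionModel(10_000, 0);

        // Short gap: 5 s of silence is within the 10 s timeout.
        session.touch(4_000);
        System.out.println("alive after 5 s gap: " + session.touch(9_000));

        // Long gap: 11 s of silence exceeds the timeout.
        System.out.println("alive after 11 s gap: " + session.touch(20_000));
    }
}
```

In this model a test that wants expiration must keep the client silent for longer than the timeout and then reconnect; merely closing the handle never produces the "expired" answer on the client side.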

Hope that helps.

Thanks
mahadev


On 7/6/10 9:09 AM, "Jeremy Davis"  wrote:

> Is there a recommended way of simulating a client session expiration in unit
> tests?
> I see a TestableZooKeeper.java, with a pauseCnxn() method that does cause
> the connection to timeout/disconnect and reconnect. Is there an easy way to
> push this all the way through to session expiration?
> Thanks,
> -JD



Suggested way to simulate client session expiration in unit tests?

2010-07-06 Thread Jeremy Davis
Is there a recommended way of simulating a client session expiration in unit
tests?
I see a TestableZooKeeper.java, with a pauseCnxn() method that does cause
the connection to timeout/disconnect and reconnect. Is there an easy way to
push this all the way through to session expiration?
Thanks,
-JD