Client seeing wrong data on nodeDataChanged

2010-10-28 Thread Stack
I'm trying to debug an issue that maybe you fellas have some ideas for figuring out.

In short:

Client 1 updates a znode, setting its content to X, then X again, then
Y, and then finally it deletes the znode.  Client 1 is watching the
znode and I can see that it's getting three nodeDataChanged events and
a nodeDeleted.

Client 2 is also watching the znode.  It gets notified three times:
two nodeDataChanged events (only) and a nodeDeleted event.  I'd expect
three nodeDataChanged events but understand a client might skip states.
The problem is that when client 2 looks at the data in the znode on
nodeDataChanged, in both cases the data is Y.  Not X and then Y, but
Y both times.  This is unexpected.

This is ZooKeeper 3.3.1 on a 5-node ensemble.

I have full zk logging enabled.  Would it help to post the logs?

St.Ack


Re: Client seeing wrong data on nodeDataChanged

2010-10-28 Thread Stack
On Thu, Oct 28, 2010 at 7:32 PM, Ted Dunning ted.dunn...@gmail.com wrote:
 Client 2 is not guaranteed to see X if it doesn't get to asking before the
 value has been updated to Y.

Right, but I wouldn't expect the watch to be triggered twice with value Y.

Anyway, I think we have a handle on what's going on: at the time of
the above incident, the master process was experiencing a flood of zk
changes, and our thought is that we're not paying sufficient attention
to the order of receipt.  Will be back if this is not the issue.

Thanks,
St.Ack
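
For anyone following along, here is a minimal sketch of how a handler can end up reading Y for every notification. It is a toy model in Python, not the ZooKeeper client API: `ToyZnode`, `LazyClient` and `skip_stale` are made-up names, and it assumes the handler queues events under load and only reads the znode later. It is an illustration of the "order of receipt" idea, not a claim about what the hbase master actually does.

```python
from collections import deque


class ToyZnode:
    """Toy stand-in for a znode: data plus a version counter and one-shot watches."""

    def __init__(self):
        self.data = None
        self.version = -1
        self.watchers = []

    def set_data(self, data):
        self.data = data
        self.version += 1
        # Watches are one-shot: take the current list, clear it, then fire each once.
        fired, self.watchers = self.watchers, []
        for callback in fired:
            callback("nodeDataChanged")

    def get_data(self):
        return self.data, self.version

    def watch(self, callback):
        self.watchers.append(callback)


class LazyClient:
    """Queues notifications and reads the znode only when the queue is drained."""

    def __init__(self, znode):
        self.znode = znode
        self.pending = deque()
        self.seen = []
        znode.watch(self.on_event)

    def on_event(self, event):
        self.pending.append(event)
        self.znode.watch(self.on_event)  # re-arm right away, read later

    def drain(self, skip_stale=False):
        last_version = None
        while self.pending:
            self.pending.popleft()
            data, version = self.znode.get_data()  # read happens now, not at event time
            if skip_stale and version == last_version:
                continue  # this event refers to a state we have already handled
            last_version = version
            self.seen.append(data)
        return self.seen


def replay(skip_stale):
    znode = ToyZnode()
    client = LazyClient(znode)
    for value in ["X", "X", "Y"]:
        znode.set_data(value)
    return client.drain(skip_stale=skip_stale)


print(replay(skip_stale=False))  # ['Y', 'Y', 'Y']
print(replay(skip_stale=True))   # ['Y']
```

The second call shows one way to pay attention to ordering: compare the version that comes back with the data (the real client returns a Stat with a version alongside the data) and drop events that point at a state already processed.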


Re: [jira] Commented: (MAHOUT-238) Further Dependency Cleanup

2010-01-22 Thread Stack
We're working on making hbase 0.21 the first hbase release that shows up in
a maven repo.
St.Ack

On Fri, Jan 22, 2010 at 1:12 PM, Mahadev Konar maha...@yahoo-inc.com wrote:
 Unfortunately no. We are planning to deploy 3.3 as the first version in a
 maven repo.


 Thanks
 mahadev


 On 1/22/10 12:58 PM, Ted Dunning ted.dunn...@gmail.com wrote:

 Is ZK 3.2.2 in a maven repository somewhere?

 -- Forwarded message --
 From: Drew Farris drew.far...@gmail.com
 Date: Fri, Jan 22, 2010 at 11:47 AM
 Subject: Re: [jira] Commented: (MAHOUT-238) Further Dependency Cleanup
 To: mahout-...@lucene.apache.org


 Neither hbase 0.20.2 nor zookeeper (any version) appear to be in a
 maven repo at this point, so Mahout would have to roll and deploy
 these. What was the process that was followed to build and deploy the
 mahout-packaged hadoop 0.20.1 and hbase artifacts? Is this something I
 could submit a patch to Mahout for, or better left for the committers?

 As Ted pointed out, yes the release of zk is 3.2.2

 Drew

 On Thu, Jan 21, 2010 at 5:12 AM, zhao zhendong zhaozhend...@gmail.com
 wrote:
 Hi Drew,

 I propose to
 1) update hbase-0.20.0.jar to hbase-0.20.2.jar, because the latter is stable
 and hbased-platform is based on this version,

 2) and add zookeeper-3.2.1.jar.

 Cheers,
 Zhendong

 On Tue, Jan 19, 2010 at 12:36 PM, zhao zhendong zhaozhend...@gmail.com
 wrote:

 Hi Drew,

 Including source code in the snapshots would be great.

 Currently, the HDFS reader does not work in 0.20.2. Without source code,
 it's not convenient for me to debug the code.

 Cheers,
 Zhendong

 On Sat, Jan 9, 2010 at 12:25 AM, Drew Farris drew.far...@gmail.com
 wrote:

 I wonder if we can get the hadoop people to include source jars with
 their snapshots?

 On Fri, Jan 8, 2010 at 11:23 AM, Sean Owen sro...@gmail.com wrote:
 I need a fix after 0.20.1, that's the primary reason. As a bonus, we
 don't have to maintain our own version. The downside is relying on a
 SNAPSHOT, but seems worth it to me.

 On Fri, Jan 8, 2010 at 4:02 PM, zhao zhendong zhaozhend...@gmail.com
 wrote:
 Thanks Drew,

 +1 from me to maintain a stable hadoop release, such as 0.20.1. The
 reason is obvious :)

 Cheers,
 Zhendong







 --
 -

 Zhen-Dong Zhao (Maxim)

 

 Department of Computer Science
 School of Computing
 National University of Singapore

 






 







Asking zk cluster how its configured and whats this expired about?

2009-11-24 Thread stack
Hey lads:

I want to ask a running zk cluster what its configuration is -- ticktime,
session timeout, etc. -- but do not see how.  There are the four letter
words.  Dump and stat do not print what I want.   I took a look in logs --
the leader in particular -- and do not see vitals dumped out.  Am I missing
something?

I was also wondering what this expire stuff in the dump output is about?
Here is what I see:

$ echo dump|nc X.X.X.X 2181
SessionTracker dump:
Session Sets (12):
0 expire at Tue Nov 24 20:56:24 UTC 2009:
0 expire at Tue Nov 24 20:56:27 UTC 2009:
0 expire at Tue Nov 24 20:56:30 UTC 2009:
0 expire at Tue Nov 24 20:56:39 UTC 2009:
0 expire at Tue Nov 24 20:56:42 UTC 2009:
0 expire at Tue Nov 24 20:56:45 UTC 2009:
0 expire at Tue Nov 24 20:56:48 UTC 2009:
0 expire at Tue Nov 24 20:57:00 UTC 2009:
0 expire at Tue Nov 24 20:57:03 UTC 2009:
2 expire at Tue Nov 24 20:57:06 UTC 2009:
82512629887926272
154570223919497222
2 expire at Tue Nov 24 20:57:09 UTC 2009:
82512629887926273
10455035895349254
3 expire at Tue Nov 24 20:57:21 UTC 2009:
154570223919497216
10455035895349248
154570223919497221

ephemeral nodes dump:
Sessions with Ephemerals (4):
0x2524ccbca8:
/hbase/rs/1259042878053
0x12524ccb9f4:
/hbase/rs/1259042878032
0x12524ccb9f40001:
/hbase/rs/1259042878106
0x22524ccb993:
/hbase/master

Thanks,
St.Ack
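
For scripting the same query that `echo dump|nc X.X.X.X 2181` performs, a minimal Python sketch follows. The function name `four_letter_word` and the timeout value are my own choices; the only behaviour assumed is the one visible above: send the four-letter word to the client port and read until the server closes the connection.

```python
import socket


def four_letter_word(host, port, word, timeout=5.0):
    """Send a four-letter command (e.g. 'stat' or 'dump') and return the reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(word.encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closes the connection once the reply is sent
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Usage would look like `print(four_letter_word("X.X.X.X", 2181, "dump"))`, with the host replaced by one of the ensemble members.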


Re: Asking zk cluster how its configured and whats this expired about?

2009-11-24 Thread stack
On Tue, Nov 24, 2009 at 1:33 PM, Patrick Hunt ph...@apache.org wrote:


 We can definitely add this, please create a JIRA.


ZOOKEEPER-595




  I was also wondering what this expire stuff in the dump output is about?


 Those are the expiration sets, or buckets. Each client session is put into
 a bucket based on when we last heard from it and its timeout. The leader
 uses this to determine when to expire sessions.

 Unfortunately the session ids are being printed in decimal; this is fixed
 in 3.3.0.
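
Until then, the decimal ids in the SessionTracker section can be converted by hand to line them up with the hex ids in the ephemerals section. A small sketch (the helper name `session_id_to_hex` is made up for illustration); the second id below comes out as the hex id that the same dump lists as the owner of /hbase/rs/1259042878106:

```python
def session_id_to_hex(decimal_id):
    """Render a session id printed in decimal in the hex form used for ephemeral owners."""
    return hex(int(decimal_id))


for sid in ["82512629887926272", "82512629887926273"]:
    print(sid, "->", session_id_to_hex(sid))
# 82512629887926272 -> 0x12524ccb9f40000
# 82512629887926273 -> 0x12524ccb9f40001
```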

 Good find. This would actually be useful information to monitor when
 determining which of your hbase clients are falling behind on
 heartbeating.


Well, are the items listed under '2 expire at Tue Nov 24 20:57:06 UTC 2009'
sessions that have already expired, or is this just a record of when they
will expire?  Looking in the logs I do not see sessions expiring.

Thanks,
St.Ack
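
A sketch of the bucketing idea described in the quoted reply, to make the question concrete. This is a toy model, not the server code: `expiry_bucket` and `ToySessionTracker` are made-up names, and both the rounding rule and the 3000 ms interval are assumptions picked to match the 3-second spacing of the buckets in the dump. Under this model a bucket time is a deadline in the future, and a session moves to a later bucket each time the server hears from it.

```python
def expiry_bucket(last_heard_ms, timeout_ms, interval_ms=3000):
    """Round the session deadline up to the next bucket boundary (assumed rule)."""
    deadline = last_heard_ms + timeout_ms
    return (deadline // interval_ms + 1) * interval_ms


class ToySessionTracker:
    """Keeps one bucket time per session; a session expires when its bucket comes due."""

    def __init__(self, interval_ms=3000):
        self.interval_ms = interval_ms
        self.bucket_of = {}  # session id -> bucket time in ms

    def touch(self, session_id, now_ms, timeout_ms):
        # Hearing from a client moves its session into a later bucket.
        self.bucket_of[session_id] = expiry_bucket(now_ms, timeout_ms, self.interval_ms)

    def expired(self, now_ms):
        return sorted(s for s, b in self.bucket_of.items() if b <= now_ms)


tracker = ToySessionTracker()
tracker.touch("a", now_ms=0, timeout_ms=10000)     # bucket 12000
tracker.touch("b", now_ms=0, timeout_ms=10000)     # bucket 12000
tracker.touch("a", now_ms=5000, timeout_ms=10000)  # heartbeat moves "a" to bucket 18000
print(tracker.expired(now_ms=12000))  # ['b']
```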








Please disregard - Re: Exception on close of connection (WAS - Re: c client on win32)

2009-11-20 Thread stack
Please disregard. Sorry for the noise (Patrick, of note, I am seeing this
session timeout on a cluster other than Zhenyu's).
St.Ack

On Fri, Nov 20, 2009 at 4:24 PM, stack st...@duboce.net wrote:

 Sorry, I had a bad subject on the below question.
 St.Ack

 On Fri, Nov 20, 2009 at 4:22 PM, stack st...@duboce.net wrote:

 Below is an excerpt from a single-node zk quorum that was at the heart of a
 small hbase cluster.  Unfortunately the log is not at DEBUG level (I've asked
 the gentleman to up the log level in the meantime).  What it seems to be
 reporting is that an exception while closing a session caused it to time out
 all connected sessions.

 Here is the line that mentions the exception on close of session.  There
 is no stack trace:

 2009-11-20 03:41:04,766 WARN org.apache.zookeeper.server.NIOServerCnxn:
 Exception causing close of session 0x124bc250d700790 due to
 java.io.IOException: Read error

 Is it correct that an error at this stage throws out all connected
 sessions?

 Thanks,
 St.Ack


 2009-11-20 00:00:04,948 INFO org.apache.zookeeper.server.NIOServerCnxn: Connected to /10.1.20.101:50716 lastZxid 0
 2009-11-20 00:00:04,982 INFO org.apache.zookeeper.server.NIOServerCnxn: Creating new session 0x1250f26319f0016
 2009-11-20 00:00:05,051 INFO org.apache.zookeeper.server.NIOServerCnxn: Finished init of 0x1250f26319f0016 valid:true
 2009-11-20 00:00:05,051 WARN org.apache.zookeeper.server.PrepRequestProcessor: Got exception when processing sessionid:0x1250f26319f0016 type:create cxid:0x1 zxid:0xfffe txntype:unknown n/a
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
     at org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:245)
     at org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)
 2009-11-20 00:00:40,150 WARN org.apache.zookeeper.server.PrepRequestProcessor: Got exception when processing sessionid:0x1250f26319f0016 type:create cxid:0x4 zxid:0xfffe txntype:unknown n/a
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
     at org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:245)
     at org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)
 2009-11-20 00:00:50,428 WARN org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of session 0x1250f26319f0016 due to java.io.IOException: Read error
 2009-11-20 00:00:50,429 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x1250f26319f0016 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/10.1.20.101:2181 remote=/10.1.20.101:50716]
 2009-11-20 00:01:22,002 INFO org.apache.zookeeper.server.SessionTrackerImpl: Expiring session 0x1250f26319f0016
 2009-11-20 00:01:22,002 INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session 0x1250f26319f0016
 2009-11-20 00:01:22,002 INFO org.apache.zookeeper.server.PrepRequestProcessor: Processed session termination request for id: 0x1250f26319f0016
 2009-11-20 03:41:04,766 WARN org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of session 0x124bc250d700790 due to java.io.IOException: Read error
 2009-11-20 03:41:04,864 INFO org.apache.zookeeper.server.SessionTrackerImpl: Expiring session 0x1250f26319f
 2009-11-20 03:41:04,927 INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session 0x1250f26319f
 2009-11-20 03:41:04,927 INFO org.apache.zookeeper.server.SessionTrackerImpl: Expiring session 0x124bc250d7007a2
 2009-11-20 03:41:04,927 INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session 0x124bc250d7007a2
 2009-11-20 03:41:04,927 INFO org.apache.zookeeper.server.SessionTrackerImpl: Expiring session 0x124bc250d700794
 2009-11-20 03:41:04,927 INFO org.apache.zookeeper.server.ZooKeeperServer: Expiring session 0x124bc250d700794






Re: Please disregard - Re: Exception on close of connection (WAS - Re: c client on win32)

2009-11-20 Thread stack
I think I can now explain the session expirations: hbase clients, especially
those running in a map/reduce task, can exit without closing the zk session.
Will fix.
St.Ack
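
The general shape of such a fix, as a minimal sketch: make sure the session handle is closed on every exit path of the task. `FakeSession` and `run_with_session` are hypothetical names standing in for a real client handle and the task wrapper; nothing here is hbase or ZooKeeper API beyond the idea of calling close().

```python
import contextlib


class FakeSession:
    """Hypothetical stand-in for a client session handle that has a close() method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def run_with_session(open_session, work):
    """Run work(session) and close the session even if work raises."""
    with contextlib.closing(open_session()) as session:
        return work(session)


session = FakeSession()
result = run_with_session(lambda: session, lambda s: "done")
print(result, session.closed)  # done True
```

This only covers exits that unwind normally or through an exception; a task that is killed outright still leaves the session to expire on the server.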

On Fri, Nov 20, 2009 at 4:45 PM, Patrick Hunt ph...@apache.org wrote:

 Yes, right, that's what I meant to say - what is causing the client to
 die, throwing read error on the server side, and then later you end up
 with the session expiration because the client was not closed gracefully.

 (thanks mahadev)

 Patrick


 Mahadev Konar wrote:

 That should be the case since the server gets an exception reading from the
 socket - meaning the client went away (not gracefully) and that leads the
 server to expire the session in 30 seconds.

 mahadev
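
The gap can be checked against the log excerpt from the original question. A small sketch that parses the leading timestamps of the "Exception causing close" and "Expiring session" lines for session 0x1250f26319f0016 (the helper names are made up; the two log lines are copied from the excerpt with the line wraps removed):

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_log_time(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], LOG_TIME_FORMAT)


def seconds_between(first_line, second_line):
    return (parse_log_time(second_line) - parse_log_time(first_line)).total_seconds()


close_line = (
    "2009-11-20 00:00:50,428 WARN org.apache.zookeeper.server.NIOServerCnxn: "
    "Exception causing close of session 0x1250f26319f0016 due to "
    "java.io.IOException: Read error"
)
expire_line = (
    "2009-11-20 00:01:22,002 INFO org.apache.zookeeper.server.SessionTrackerImpl: "
    "Expiring session 0x1250f26319f0016"
)

print(seconds_between(close_line, expire_line))  # 31.574
```

About 31.6 seconds between the read error and the expiration, which is in line with the 30 seconds mentioned above.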


 On 11/20/09 4:35 PM, Patrick Hunt ph...@apache.org wrote:

  Oops too late. ;-)

 I'm perplexed as to why you see all these expirations though. Are you
 killing your clients, i.e. not cleaning up the ZK session gracefully via
 close()?

 Patrick
