The master did not respond correctly to a SessionExpired event. I don't think there's a ZK bug. This is like HBASE-1232. Both the master and regionserver got a SessionExpired event. The bug I fixed for Ryan was just with the client getting a SessionExpired. Andrew's cluster shows us that it's just as likely for the master/RS to get this event.
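To make "responding correctly" concrete, here is a minimal sketch of the reaction I have in mind. It is not HBase code: `SessionState`, `SessionExpiryHandler`, and the restart hook are hypothetical stand-ins so the example runs without the ZooKeeper jar. In the real ZooKeeper Java client the state to look for is `Watcher.Event.KeeperState.Expired`, delivered to your `Watcher`.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for ZooKeeper's Watcher.Event.KeeperState. */
enum SessionState { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

/** Hypothetical handler: decides what a server does for each session state. */
final class SessionExpiryHandler {
    // Assumed hook supplied by the server, e.g. "abort and relaunch".
    private final Runnable restartHook;
    private final AtomicBoolean restartRequested = new AtomicBoolean(false);

    SessionExpiryHandler(Runnable restartHook) {
        this.restartHook = restartHook;
    }

    /** Called once per session state change reported by the client library. */
    void onStateChange(SessionState state) {
        switch (state) {
            case SYNC_CONNECTED:
                // Session is healthy; ephemeral nodes are intact.
                break;
            case DISCONNECTED:
                // Connection lost, but the session may still be valid.
                // The client library reconnects on its own, so do nothing drastic.
                break;
            case EXPIRED:
                // The handle is dead and the ephemeral nodes are gone.
                // Request a full restart, and only once.
                if (restartRequested.compareAndSet(false, true)) {
                    restartHook.run();
                }
                break;
        }
    }

    boolean isRestartRequested() {
        return restartRequested.get();
    }
}
```

The point of the sketch is the asymmetry: a disconnect is survivable, an expiry is not. The same handler shape would apply to the client, the RS, and the master.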
The only thing you can do on a SessionExpired event is to completely restart the node. SessionExpired means your ZooKeeper handle is dead, and your ephemeral nodes will go away. Since every server in HBase has some ephemeral node that indicates its liveness (e.g. /hbase/master, /hbase/rs/...), the node has to completely restart.

HBASE-1232, HBASE-1311, and HBASE-1312 are all the same problem, just seen from three different points of view (client, RS, master).

On Sun, Apr 5, 2009 at 2:32 PM, Ryan Rawson <[email protected]> wrote:

> ZK keeps the node up as long as the session is still valid.
>
> So the question is:
> - did the master not respond correctly to an expired session?
> - is there a ZK bug (HOPE NOT!)
>
> -ryan
>
> On Sun, Apr 5, 2009 at 2:22 PM, Andrew Purtell (JIRA) <[email protected]> wrote:
>
> > ZooKeeper: Master's ephemeral node went away while it was still up and
> > functioning normally
> > --------------------------------------------------------------------
> >
> > Key: HBASE-1312
> > URL: https://issues.apache.org/jira/browse/HBASE-1312
> > Project: Hadoop HBase
> > Issue Type: Bug
> > Reporter: Andrew Purtell
> >
> > Does the master watch its own znode? Right around the time of regionserver
> > problems described in HBASE-1311, clients could no longer find the master,
> > but according to its log it was up and functioning normally. I think the
> > master and regionserver sessions expired at the same time, as they were
> > started within seconds of each other.
> >
> > --
> > This message is automatically generated by JIRA.
> > -
> > You can reply to this email to add a comment to the issue online.
