When you say 'node restart' - do you mean a JVM reboot, or will we be able to have an internal reset?
Most people don't run hbase under some job control, so when hbase jvms die, they stay dead...

-ryan

On Sun, Apr 5, 2009 at 3:14 PM, Nitay <[email protected]> wrote:

> The master did not respond correctly to a SessionExpired event. I don't
> think there's a ZK bug. This is like HBASE-1232. Both the master and
> regionserver got a SessionExpired event. The bug I fixed for Ryan was just
> with the client getting a SessionExpired. Andrew's cluster shows us that
> it's just as likely for the master/RS to get this event.
>
> The only thing you can do on a SessionExpired event is to completely restart
> the node. SessionExpired means your ZooKeeper handle is dead, and your
> ephemeral nodes will go away. Since every server in HBase has some ephemeral
> node that indicates its liveness (e.g. /hbase/master, /hbase/rs/...), the
> node has to completely restart.
>
> HBASE-1232, HBASE-1311, and HBASE-1312 are all the same problem, just with
> three different points of view (client, RS, master).
>
> On Sun, Apr 5, 2009 at 2:32 PM, Ryan Rawson <[email protected]> wrote:
>
> > ZK keeps the node up as long as the session is still valid.
> >
> > So the question is:
> > - did the master not respond correctly to an expired session?
> > - is there a ZK bug (HOPE NOT!)
> >
> > -ryan
> >
> > On Sun, Apr 5, 2009 at 2:22 PM, Andrew Purtell (JIRA) <[email protected]> wrote:
> >
> > > ZooKeeper: Master's ephemeral node went away while it was still up and
> > > functioning normally
> > > ----------------------------------------------------------------------
> > >
> > >                 Key: HBASE-1312
> > >                 URL: https://issues.apache.org/jira/browse/HBASE-1312
> > >             Project: Hadoop HBase
> > >          Issue Type: Bug
> > >            Reporter: Andrew Purtell
> > >
> > >
> > > Does the master watch its own znode? Right around the time of the
> > > regionserver problems described in HBASE-1311, clients could no longer
> > > find the master, but according to its log it was up and functioning
> > > normally. I think the master and regionserver sessions expired at the
> > > same time, as they were started within seconds of each other.
> > >
> > > --
> > > This message is automatically generated by JIRA.
> > > -
> > > You can reply to this email to add a comment to the issue online.
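
To make the "JVM reboot vs. internal reset" question concrete, here is a minimal Python sketch of what an internal reset would have to do after a SessionExpired event. It is a toy model only, not ZooKeeper's or HBase's API: FakeEnsemble, Server, register, on_session_expired and is_visible are all hypothetical names invented for illustration. The one behavior taken from the thread is that an expired session loses its ephemeral nodes, so the server has to open a fresh session and re-create its liveness node.

```python
class FakeEnsemble:
    """Toy stand-in for a ZooKeeper ensemble (hypothetical, not the real API).

    Maps each ephemeral path to the id of the session that owns it.
    """

    def __init__(self):
        self.ephemerals = {}
        self._next_id = 0

    def open_session(self):
        # Hand out a new, unique session id.
        self._next_id += 1
        return self._next_id

    def create_ephemeral(self, session_id, path):
        self.ephemerals[path] = session_id

    def expire(self, session_id):
        # Expiry drops every ephemeral node owned by that session.
        self.ephemerals = {
            p: s for p, s in self.ephemerals.items() if s != session_id
        }


class Server:
    """Toy master/regionserver that advertises liveness via one ephemeral node."""

    def __init__(self, ensemble, path):
        self.ensemble = ensemble
        self.path = path
        self.session_id = None
        self.register()

    def register(self):
        # Open a fresh session and (re)create the liveness node under it.
        self.session_id = self.ensemble.open_session()
        self.ensemble.create_ephemeral(self.session_id, self.path)

    def on_session_expired(self):
        # "Internal reset": the old handle is dead, so start over with a
        # new session instead of killing the whole process.
        self.register()

    def is_visible(self):
        # Clients can find this server only if its node exists and is
        # owned by the session the server currently holds.
        return self.ensemble.ephemerals.get(self.path) == self.session_id


ens = FakeEnsemble()
master = Server(ens, "/hbase/master")
old_session = master.session_id

ens.expire(old_session)        # master is still running, but its node is gone
gone = not master.is_visible()

master.on_session_expired()    # internal reset instead of a JVM restart
back = master.is_visible()
```

In this model the master stays "up" after expiry while clients can no longer find it, which matches the symptom in HBASE-1312. A real server would also have to rebuild any watches and in-memory state tied to the old session, which the sketch does not cover.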
