Re: RE: HBASE-2312 discussion

2010-03-17 Thread Todd Lipcon
I opened HBASE-2342 to discuss the watchdog node concept. -Todd On Wed, Mar 17, 2010 at 2:59 PM, Todd Lipcon wrote: > Hi Ryan, > > I think the idea of a secondary watchdog node is a decent one, but as you > mentioned, it isn't a solution for the problem at hand. The GC pause > exacerbates the p

Re: RE: HBASE-2312 discussion

2010-03-17 Thread Todd Lipcon
Hi Ryan, I think the idea of a secondary watchdog node is a decent one, but as you mentioned, it isn't a solution for the problem at hand. The GC pause exacerbates the problem, but network blips, etc., can cause the same problem. Is there a JIRA open for the watchdog process? I think we should dis

Re: RE: HBASE-2312 discussion

2010-03-17 Thread Ryan Rawson
There are 2 ways to lose your ZK session: - you don't send pings back to ZK and it expires the session (GC pause of death, network disconnect, etc.) - ZK "somehow" expires your session for you. I have seen this once in a while; it's rare, but painful when it happens. It didn't seem to be correlated to GC paus

Re: RE: HBASE-2312 discussion

2010-03-17 Thread Todd Lipcon
On Wed, Mar 17, 2010 at 10:48 AM, Ryan Rawson wrote: > I have a 4th option :-) I'm on the his right now and I'll write it up when > I > get to work. In short, move the zk thread out of the rs into a monitoring > parent and then you can explicitly monitor for Juliet gc pauses. More to > come >

Re: RE: HBASE-2312 discussion

2010-03-17 Thread Ryan Rawson
I have a 4th option :-) I'm on the his right now and I'll write it up when I get to work. In short, move the zk thread out of the rs into a monitoring parent and then you can explicitly monitor for Juliet gc pauses. More to come On Mar 17, 2010 10:22 AM, "Karthik Ranganathan" wrote: Loved the
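
For readers following the "explicitly monitor for Juliet gc pauses" idea, here is a rough sketch of how a long pause can be detected at all. It is not the parent-process design Ryan describes (that write-up is not part of this thread excerpt). It is a simpler in-process technique: sleep for a fixed interval and measure how much longer than that the thread was actually away. The class name PauseMonitor, both helper methods, and the 40-second session timeout are assumptions made for illustration; none of them come from HBase or ZooKeeper.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: detect long stalls (e.g. GC pauses) by sleep overshoot.
public class PauseMonitor {

    // How many milliseconds longer than intervalMs the sleep actually took.
    // Never negative: an early wake-up counts as zero overshoot.
    static long overshootMillis(long sleptNanos, long intervalMs) {
        return Math.max(0L, TimeUnit.NANOSECONDS.toMillis(sleptNanos) - intervalMs);
    }

    // True when a single stall was long enough that a session with the given
    // timeout could have expired while the process was unresponsive.
    static boolean exceedsSessionBudget(long overshootMs, long sessionTimeoutMs) {
        return overshootMs >= sessionTimeoutMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long intervalMs = 100;           // assumed polling interval
        long sessionTimeoutMs = 40_000;  // assumed session timeout, not from the thread

        for (int i = 0; i < 5; i++) {
            long start = System.nanoTime();
            Thread.sleep(intervalMs);
            long overshoot = overshootMillis(System.nanoTime() - start, intervalMs);
            if (exceedsSessionBudget(overshoot, sessionTimeoutMs)) {
                System.err.println("Stalled for ~" + overshoot
                        + " ms; the session may already have expired");
            }
        }
        System.out.println("done");
    }
}
```

A detector like this only notices the pause after the fact, from inside the stalled JVM, which is presumably part of the motivation for moving the monitoring into a separate parent process.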