Re: Confused about KeeperState.Disconnected and KeeperState.Expired

2009-06-24 Thread Benjamin Reed
sorry to jump in late. if i understand the scenario correctly, you are partitioned from ZK, but you still have access to the NN on which you are holding leases to files. the problem is that even though your ephemeral nodes may time out, you are still holding a lease on the NN and recovery would

Re: Confused about KeeperState.Disconnected and KeeperState.Expired

2009-06-24 Thread Jean-Daniel Cryans
If the machine was completely partitioned, as far as I know, it would lose its lease so the only thing we have to make sure about is clearing the state of the region server by doing a "restart" so that it's ready to come back in the cluster. If ZK is down but the rest is up, closing the files in H

Re: Confused about KeeperState.Disconnected and KeeperState.Expired

2009-06-24 Thread Gustavo Niemeyer
> Ben's opinion is that it should not belong in the default API but in the common client that another recent thread was about. My opinion is just that I need such a functionality, wherever it is.
Understood, sorry. I just meant that it feels like something that would likely be useful to other

Re: Confused about KeeperState.Disconnected and KeeperState.Expired

2009-06-24 Thread Jean-Daniel Cryans
Gustavo, Ben's opinion is that it should not belong in the default API but in the common client that another recent thread was about. My opinion is just that I need such a functionality, wherever it is. If the node gets the exception (or has its own timer), as I wrote, it will shut itself down t

Re: Confused about KeeperState.Disconnected and KeeperState.Expired

2009-06-24 Thread Gustavo Niemeyer
Hi Jean-Daniel,
> I understand, maybe the common client is the best place.
It sounds like something useful to have in the default API, FWIW.
> In our situation, if an HBase region server is in the state of being disconnected for too long, the regions it's holding cannot be reached so this is

Re: Confused about KeeperState.Disconnected and KeeperState.Expired

2009-06-24 Thread Jean-Daniel Cryans
I understand, maybe the common client is the best place. In our situation, if an HBase region server is in the state of being disconnected for too long, the regions it's holding cannot be reached so this is a major problem. Also, if the HMaster node gets the event that an ephemeral is gone, it will

Re: Confused about KeeperState.Disconnected and KeeperState.Expired

2009-06-24 Thread Benjamin Reed
perhaps it would fit into the common client that stefan is proposing. we don't have such a timer currently in the client code that we just need to expose, so it will be something we need to add. one thing to be careful of is trying to be too tricky. you don't want to trigger right after the ses

Re: Confused about KeeperState.Disconnected and KeeperState.Expired

2009-06-23 Thread Jean-Daniel Cryans
Ben, Thank you, I now see the rationale in not telling the client its session is over because you can't be sure it actually is. But would it make sense to add a new state in KeeperState representing that corner case? Something like AfterSessionTimeout. I'm pretty sure others would find that useful

Re: Confused about KeeperState.Disconnected and KeeperState.Expired

2009-06-23 Thread Benjamin Reed
ZooKeeper only tells you about states that it is sure about, so you will not get the Expired event until you reconnect to ZooKeeper. if you never connect again to ZooKeeper, you will not get the Expired event. if you want to time out using some sanity value, 2 times the session timeout for examp
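
The client-side timer suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: `DisconnectWatchdog` and its methods are hypothetical names, not part of any ZooKeeper client, and the `KeeperState` enum here is a local stand-in for the Disconnected, SyncConnected and Expired states discussed in this thread. The factor of 2 comes from the "2 times the session timeout" example; whatever value is used is a guess on the client's side, since only the server knows whether the session really expired.

```python
import enum
import time


class KeeperState(enum.Enum):
    """Local stand-in for the connection states discussed in this thread."""
    DISCONNECTED = "Disconnected"
    SYNC_CONNECTED = "SyncConnected"
    EXPIRED = "Expired"


class DisconnectWatchdog:
    """Tracks how long the client has been disconnected.

    Hypothetical helper, not part of ZooKeeper: Expired is only reported
    after a reconnect, so the application keeps its own clock.
    """

    def __init__(self, session_timeout_s, factor=2.0, clock=time.monotonic):
        # Assumption: give up after `factor` times the session timeout.
        self.deadline_s = session_timeout_s * factor
        self._clock = clock
        self._disconnected_at = None

    def on_state(self, state):
        """Feed every connection-state event from the watcher into this method."""
        if state is KeeperState.DISCONNECTED:
            # Keep the first disconnect time if several Disconnected events arrive.
            if self._disconnected_at is None:
                self._disconnected_at = self._clock()
        else:
            # SyncConnected: the session survived. Expired: the server has
            # confirmed the loss, so the local guess is no longer needed.
            self._disconnected_at = None

    def should_give_up(self):
        """True once the disconnect has lasted at least the deadline."""
        if self._disconnected_at is None:
            return False
        return self._clock() - self._disconnected_at >= self.deadline_s
```

An application would call `on_state` from its watcher callback and poll `should_give_up()` from a periodic task; when it returns true, the process can decide to shut itself down or restart, as described for the region server case. The `clock` parameter exists only so the timing can be tested without sleeping.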