Re: Confused about KeeperState.Disconnected and KeeperState.Expired
> Ben's opinion is that it should not belong in the default API but in
> the common client that another recent thread was about. My opinion is
> just that I need such a functionality, wherever it is.

Understood, sorry. I just meant that it feels like something that would
likely be useful to other people too, so it might have a role in the
default API to ensure it gets done properly, considering the details
that Ben brought up.

> If the node gets the exception (or has its own timer), as I wrote, it
> will shut itself down to release HDFS leases as fast as possible. If
> ZK is really down and it's not a network partition, then HBase is down
> and this is fine because it won't be able to work anyway.

Right, that's mostly what I was wondering: under which circumstances
would the node be unable to talk to the ZooKeeper server but still be
holding the HDFS lease in a way that prevented the rest of the system
from going on?

If I understand what you mean, if ZooKeeper is down entirely, HBase
would be down for good. If the machine was partitioned off entirely,
the HDFS side of things would also be disconnected, so shutting the
node down won't help the rest of the system recover.

--
Gustavo Niemeyer
http://niemeyer.net
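[Editor's note: the distinction the thread keeps coming back to can be sketched in a few lines of Java. This is a self-contained model, not HBase's or ZooKeeper's actual code: the enum mirrors a subset of org.apache.zookeeper.Watcher.Event.KeeperState, and the Action type and decide() method are hypothetical names for illustration.]

```java
// Sketch: how a region server might react to the two session states
// discussed above. Disconnected only means the client lost contact with
// its current server; the session may still survive until the timeout.
// Expired is definitive: the ensemble has given up on the session, the
// ephemeral nodes are gone, and the safest move is to shut down and
// release any HDFS leases as fast as possible.
public class SessionStateSketch {
    enum KeeperState { SyncConnected, Disconnected, Expired }
    enum Action { NONE, WAIT_FOR_RECONNECT, SHUT_DOWN }

    static Action decide(KeeperState state) {
        switch (state) {
            case Disconnected: return Action.WAIT_FOR_RECONNECT;
            case Expired:      return Action.SHUT_DOWN;
            default:           return Action.NONE;
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(KeeperState.Disconnected)); // WAIT_FOR_RECONNECT
        System.out.println(decide(KeeperState.Expired));      // SHUT_DOWN
    }
}
```

Note that Expired can only be delivered after the client manages to reconnect, which is exactly why the "own timer" idea comes up later in the thread.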
Re: Confused about KeeperState.Disconnected and KeeperState.Expired
If the machine was completely partitioned, as far as I know, it would
lose its lease, so the only thing we have to make sure of is clearing
the state of the region server by doing a restart so that it's ready to
come back into the cluster. If ZK is down but the rest is up, closing
the files in HDFS should ensure that we lose a minimum of data, if any.

I think that in a multi-rack setup it is possible to be unable to talk
to ZK but still able to talk to the Namenode, since machines can be
anywhere. Especially in HBase 0.20, the master can fail over to any
node that has a backup Master ready. So in that case, the region server
should consider itself gone from the cluster, close any connection it
has, and restart.

Those are very legitimate questions Gustavo, thanks for asking.

J-D

On Wed, Jun 24, 2009 at 3:38 PM, Gustavo Niemeyer <gust...@niemeyer.net> wrote:
> Right, that's mostly what I was wondering. I was pondering about under
> which circumstances the node would be unable to talk to the ZooKeeper
> server but would still be holding the HDFS lease in a way that
> prevented the rest of the system from going on.
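[Editor's note: the "own timer" mentioned above can be sketched as a small watchdog. This is an illustrative, self-contained class, not HBase code; all names are hypothetical. The idea is that since an Expired event can only arrive after reconnecting, a client that has been Disconnected for longer than its session timeout should assume the session is lost and shut down proactively.]

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Watchdog sketch: arm a timer on Disconnected, cancel it on reconnect.
// If the timer fires first, the session timeout has elapsed without
// regaining ZK, so we run the shutdown hook (which would close HDFS
// files and release leases in the region-server scenario).
public class DisconnectWatchdog {
    // Daemon thread so the watchdog never keeps the JVM alive on its own.
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "zk-disconnect-watchdog");
            t.setDaemon(true);
            return t;
        });
    private final long sessionTimeoutMs;
    private final Runnable shutdownHook;
    private ScheduledFuture<?> pending;

    DisconnectWatchdog(long sessionTimeoutMs, Runnable shutdownHook) {
        this.sessionTimeoutMs = sessionTimeoutMs;
        this.shutdownHook = shutdownHook;
    }

    synchronized void onDisconnected() {
        if (pending == null) {
            pending = timer.schedule(shutdownHook, sessionTimeoutMs, TimeUnit.MILLISECONDS);
        }
    }

    synchronized void onReconnected() {
        if (pending != null) {
            pending.cancel(false);
            pending = null;
        }
    }
}
```

In practice the timeout would be set to the negotiated ZooKeeper session timeout (or slightly less), so the node gives up no later than the ensemble does.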
Re: Confused about KeeperState.Disconnected and KeeperState.Expired
sorry to jump in late. if i understand the scenario correctly, you are
partitioned from ZK, but you still have access to the NN on which you
are holding leases to files. the problem is that even though your
ephemeral nodes may time out, you are still holding a lease on the NN,
and recovery would go faster if you actually closed the file. right? or
is it deeper than that?

can you open a file in such a way that you stomp the lease? or make
sure that the lease timeout is smaller than the session timeout and
only renew if you are still connected to ZK?

thanx
ben

Jean-Daniel Cryans wrote:
> If the machine was completely partitioned, as far as I know, it would
> lose it's lease so the only thing we have to make sure about is
> clearing the state of the region server by doing a restart so that
> it's ready to come back in the cluster. If ZK is down but the rest is
> up, closing the files in HDFS should ensure that we lose a minimum of
> data if not losing any.
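[Editor's note: Ben's second suggestion, renew the lease only while the ZK session looks alive, and keep the lease timeout shorter than the session timeout, can be sketched as follows. This is a toy model; the class, its methods, and the connectivity check are all hypothetical names, not HDFS client API.]

```java
import java.util.function.BooleanSupplier;

// Sketch of conditional lease renewal: if we are partitioned from ZK,
// skip the renewal so the NameNode lease lapses on its own schedule.
// With lease timeout < ZK session timeout, the lease is guaranteed to
// expire no later than the ephemeral nodes, so recovery never waits on
// a lease held by a node the cluster already considers dead.
public class ConditionalLeaseRenewer {
    private final BooleanSupplier zkConnected;

    ConditionalLeaseRenewer(BooleanSupplier zkConnected) {
        this.zkConnected = zkConnected;
    }

    // Returns true if a renewal was issued, false if it was skipped
    // because the ZooKeeper session appears to be gone.
    boolean maybeRenew() {
        if (!zkConnected.getAsBoolean()) {
            return false;
        }
        // ... issue the actual NameNode lease renewal here ...
        return true;
    }
}
```

The trade-off is that a brief ZK hiccup now risks letting the lease lapse too, which is presumably why the thread treats this as an open question rather than a settled design.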
Next Bay Area Hadoop User Group - Focus on Hadoop 0.20 and Core Project Split
Bay Area Hadoop Fans,

We're excited to hold our first Hadoop User Group at Cloudera's office
in Burlingame (just south of SFO). We pushed the start time back 30
minutes to allow a little extra time to drive further north, and we
hope the mid-way location brings more users from San Francisco.

Since meetup.com seems to be the norm for HUGs around the country, we
created a meetup group for the bay area
(http://www.meetup.com/Bay-Area-Hadoop-User-Group-HUG). Join this group
to stay up to date with additional meetings and locations - we're
hoping to move the location around, potentially alternating between
north bay and south bay.

We've scheduled the next meetup for July 15th at 6:30 PM. Our office
isn't huge, but we do have room for 40 friendly people:
http://www.meetup.com/Bay-Area-Hadoop-User-Group-HUG/calendar/10728923/

We'll focus this meeting on Hadoop 0.20 and the split of core into
mapreduce, hdfs and common projects. Specifically, we'll go over new
features, API changes, upgrade experiences and more. If you'd like to
present about your experience, please let me know. If you'd like to
present about something else altogether, also let me know, and we'll
see what we can do at this, or a later, meetup.

We'll provide beer, drinks and snacks, and if there are any board game
fans in the house, we won't kick you out afterwards :-) On a more
serious note, after the meetup is a great opportunity to meet
Cloudera's engineering team and get advice about any headaches you
might be having.

We'll post the agenda to the meetup group as soon as we hear from
potential presenters and nail things down.

Christophe

--
get hadoop: cloudera.com/hadoop
online training: cloudera.com/hadoop-training
blog: cloudera.com/blog
twitter: twitter.com/cloudera