I personally think a self-healing HBase is what every HBase user expects.

On Wed, Jul 21, 2010 at 10:44 AM, Edward Capriolo <[email protected]> wrote:
> On Wed, Jul 21, 2010 at 1:40 PM, Ted Yu <[email protected]> wrote:
> > J-D:
> > Can you elaborate why ssh isn't preferred ?
> >
> > One solution is to create a light weight Java process that monitors region
> > server log and perform this duty.
> >
> > On Wed, Jul 21, 2010 at 10:31 AM, Jean-Daniel Cryans <[email protected]> wrote:
> >
> >> The stance in the hbase dev community on that issue is to let users'
> >> cluster management tools handle it.
> >>
> >> Also, how would you start a java process on a remote machine (if still
> >> alive) from another java process (the master)? If you can find an
> >> elegant way that doesn't rely on SSH, this would be a nice
> >> contribution that users could enable.
> >>
> >> J-D
> >>
> >> On Wed, Jul 21, 2010 at 10:25 AM, Ted Yu <[email protected]> wrote:
> >> > Thanks for the answer.
> >> >
> >> > GC pause seems to be a major cause for region server to come down:
> >> > 2010-07-21 09:07:14,138 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
> >> > 291505ms, ten times longer than scheduled: 10000
> >> >
> >> > Is it possible for HBase Master to restart dead region server in this case ?
> >> >
> >> > On Wed, Jul 21, 2010 at 10:02 AM, Jean-Daniel Cryans <[email protected]> wrote:
> >> >
> >> >> HBaseAdmin.getClusterStatus().getServers()
> >> >>
> >> >> J-D
> >> >>
> >> >> On Wed, Jul 21, 2010 at 9:56 AM, Ted Yu <[email protected]> wrote:
> >> >> > Hi,
> >> >> > Is there API to query the number of live region servers ?
> >> >> >
> >> >> > Thanks
> >> >> >
> >> >>
> >> >
> >
> Ted,
> I just wrote two very relevant articles in my blog
>
> Using func instead of SSH keys
> http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/func_hadoop_the_end_of
>
> http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/hadoop_secondarynamenode_puppet
>
> In which I use the puppet 'service resource' to do automatic restarts
> of a secondary namenode. You could use puppet to accomplish auto
> restarts for a region server.
>
> I usually tweet my new recipes when I come out with them
> http://twitter.com/@edwardcapriolo
>
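
On Ted's idea of a light weight Java process that watches the region server log: here is a minimal sketch of the detection half only, assuming the Sleeper warning keeps the format quoted in this thread. The class name SleeperPauseDetector, the method names, and the 60000 ms threshold are hypothetical placeholders chosen for illustration, not part of HBase. The actual restart is left to whatever cluster management tool you use (puppet, func, init scripts).

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an HBase class: spots long-pause warnings in a log line.
class SleeperPauseDetector {

    // Matches the Sleeper warning quoted in this thread, e.g.
    // "... We slept 291505ms, ten times longer than scheduled: 10000"
    private static final Pattern SLEPT = Pattern.compile(
            "We slept (\\d+)ms, ten times longer than scheduled: (\\d+)");

    // Returns the reported sleep time in ms, or empty if the line is not a Sleeper warning.
    static OptionalLong sleptMillis(String logLine) {
        Matcher m = SLEPT.matcher(logLine);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(1)))
                : OptionalLong.empty();
    }

    // True when the line reports a pause longer than thresholdMs.
    // The threshold is an assumption; pick one that fits your own timeouts.
    static boolean isLongPause(String logLine, long thresholdMs) {
        OptionalLong slept = sleptMillis(logLine);
        return slept.isPresent() && slept.getAsLong() > thresholdMs;
    }

    public static void main(String[] args) {
        String line = "2010-07-21 09:07:14,138 WARN org.apache.hadoop.hbase.util.Sleeper: "
                + "We slept 291505ms, ten times longer than scheduled: 10000";
        // 60000 ms is an illustrative threshold only.
        System.out.println("long pause: " + isLongPause(line, 60_000L));
    }
}
```

A watcher built on this would read new lines from the region server log, call isLongPause on each one, and hand off to the management tool when it returns true. Whether the region server has actually died still needs a separate check, since a long pause alone does not prove the process is gone.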
