Sending this back to the user mailing list. RegionServers can die for many reasons. Looking at your RegionServer log files should give hints as to why it's happening.
-Dima

On Fri, May 26, 2017 at 9:48 AM, jeff saremi <jeffsar...@hotmail.com> wrote:

> I had posted this to the user mailing list and I have not got any direct
> answer to my question.
>
> Where do dead RS's come from and how can they be cleaned up? Someone in
> the midst of developers should know this.
>
> thanks
>
> Jeff
>
> ________________________________
> From: jeff saremi <jeffsar...@hotmail.com>
> Sent: Thursday, May 25, 2017 10:23:17 AM
> To: u...@hbase.apache.org
> Subject: Re: What is Dead Region Servers and how to clear them up?
>
> I'm still looking to get hints on how to remove the dead regions. thanks
>
> ________________________________
> From: jeff saremi <jeffsar...@hotmail.com>
> Sent: Wednesday, May 24, 2017 12:27:06 PM
> To: u...@hbase.apache.org
> Subject: Re: What is Dead Region Servers and how to clear them up?
>
> i'm trying to eliminate the dead region servers.
>
> ________________________________
> From: Ted Yu <yuzhih...@gmail.com>
> Sent: Wednesday, May 24, 2017 12:17:40 PM
> To: u...@hbase.apache.org
> Subject: Re: What is Dead Region Servers and how to clear them up?
>
> bq. running hbck (many times
>
> Can you describe the specific inconsistencies you were trying to resolve ?
> Depending on the inconsistencies, advice can be given on the best known
> hbck command arguments to use.
>
> Feel free to pastebin master log if needed.
>
> On Wed, May 24, 2017 at 12:10 PM, jeff saremi <jeffsar...@hotmail.com>
> wrote:
>
> > these are the things I have done so far:
> >
> > - restarting master (few times)
> >
> > - running hbck (many times; this tool does not seem to be doing anything
> > at all)
> >
> > - checking the list of region servers in ZK (none of the dead ones are
> > listed here)
> >
> > - checking the WALs under <hbase_hdfs>/WALs. Out of 11 dead ones only 3
> > are listed here with "-splitting" at the end of their names and they
> > contain one single file like: 1493846660401..meta.1493922323600.meta
> >
> > ________________________________
> > From: jeff saremi <jeffsar...@hotmail.com>
> > Sent: Wednesday, May 24, 2017 9:04:11 AM
> > To: u...@hbase.apache.org
> > Subject: What is Dead Region Servers and how to clear them up?
> >
> > Apparently having dead region servers is so common that a section of the
> > master console is dedicated to that?
> > How can we clean this up (preferably in an automated fashion)? Why isn't
> > this being done by HBase automatically?
> >
> > thanks