On Thu, May 27, 2010 at 4:01 PM, Lucas Nazário dos Santos <nazario.lu...@gmail.com> wrote:
> Thanks a lot for the responses. I'll be monitoring HBase and get back in
> touch if it happens again.
>
> Maybe HBase could employ a mechanism to automatically recover from
> connectivity issues like the one I had gone through. Then me and others
> wouldn't need to manually restart it.
Well, usually if one machine is not reachable it's not a big deal, since there are other machines to connect to and HBase redistributes the regions to them. Also, why is the connection refused? Can we see the region server log?
>
> I still didn't get why the master kept failing even after its recovery, and
> why I had to stop/start the cluster in order to get rid of the "Connection
> refused" error.
I'd also like to understand why the region server isn't responding; the master can only know so much.
>
> I'm assuming it's not a big deal and my solution can live with it.
>
> More logs below.
>
Consider pastebin or a web server next time ;)