Can you post the stack trace and relevant log snippets ? Thanks
On Nov 16, 2011, at 2:40 AM, Matthias Hofschen <[email protected]> wrote: > Hi, > we had a case today where the loadbalancer stopped working. > (cloudera-cdh3-u1, 52nodes). Basically we had a hot region that we moved to > another node. Shortly thereafter the regionserver of that region was > stopped. In the master logs we see that master is trying to contact this > regionserver to move the region. At the same time we had about 1000 regions > missing because of the stopped regionserver. These where not assigned to > another regionserver. I suspect that the the load balancer on the master > machine was blocked by trying to move the one region which was not > possible. We then restarted the master which solved the problem. > > Is this a known problem, should I log an issue? I do have a stacktrace from > the master machine. > > Cheers Matthias
