Re: Unable to fully decommission a RS

Nicolas Liochon Wed, 06 Mar 2013 00:51:33 -0800

Yes, decommissioning the regionserver does not mean decommissioning the
datanode.
Here, if I understand your first step correctly, you migrated the regions to
other region servers. Physically, the data was still on the previous
machine, in the HDFS datanode. That datanode is no longer used for writes if
all other RS have a local DN, but it is still used for reads. A replication
factor of 1 makes things tricky: the old datanode holds the only copy of
those blocks, so once it is stopped they cannot be found. Running a major
compaction in HBase should be enough to make the files local to the new
region servers. After this, you then want to decommission the datanode in
HDFS. Wait for the replication to finish (if there is anything to
replicate). Then you're done.


So it would be:
1) decommission RS
2) stop RS
3) major compaction in HBase
4) decommission DN
5) stop DN
6) shut down the box
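
As a rough sketch only, the steps above could be written down as an ordered
plan. The helper below is hypothetical and executes nothing; it just returns
the commands in order. The commands are assumptions about a typical
HBase 0.92 / Hadoop 1.x install (graceful_stop.sh in the HBase bin
directory, an exclude file referenced by dfs.hosts.exclude in hdfs-site.xml),
so check names and paths against your own setup before running anything.

```python
from typing import List, Tuple


def decommission_plan(host: str, tables: List[str],
                      exclude_file: str) -> List[Tuple[str, str]]:
    """Return (step, command) pairs in the order of the list above.

    Hypothetical helper: nothing is executed here. The commands assume a
    stock HBase 0.92 / Hadoop 1.x layout and may differ on your cluster.
    """
    plan = [
        # graceful_stop.sh moves the regions off the host, then stops the RS.
        ("1-2) decommission and stop RS", f"bin/graceful_stop.sh {host}"),
    ]
    for table in tables:
        # A major compaction rewrites the store files, so the new region
        # servers write them to their local datanodes.
        plan.append(("3) major compaction",
                     f"echo \"major_compact '{table}'\" | hbase shell"))
    # exclude_file is assumed to be the file named by dfs.hosts.exclude.
    plan.append(("4) decommission DN",
                 f"echo {host} >> {exclude_file} && hadoop dfsadmin -refreshNodes"))
    # Re-run the report until the node is shown as decommissioned.
    plan.append(("4) wait for decommission to finish", "hadoop dfsadmin -report"))
    plan.append(("5) stop DN (run on the node itself)",
                 "hadoop-daemon.sh stop datanode"))
    plan.append(("6) shut down the box", "shutdown -h now"))
    return plan


if __name__ == "__main__":
    # Example values only; the host, table and path are made up.
    for step, command in decommission_plan(
            "rs1.example.com", ["mytable"], "/etc/hadoop/conf/dfs.exclude"):
        print(f"{step}: {command}")
```

Keeping the plan as data makes the ordering explicit: the major compaction
has to come before the datanode is decommissioned and stopped, otherwise
with a replication factor of 1 the only copy of the blocks goes away.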

Nicolas


On Wed, Mar 6, 2013 at 9:17 AM, Yves Langisch <[email protected]> wrote:

> Hi,
>
> I'm trying to decommission a RS, i.e. migrate all regions to different
> nodes in order to shut down that specific node. First I've moved all regions
> to different nodes and made sure that there are no regions left on this
> node:
>
> ...
>     ip-11-11-111-111.ec2.internal:60020 1362060123906
>         requestsPerSecond=0, numberOfOnlineRegions=0, usedHeapMB=118,
> maxHeapMB=9974
> ...
>
> After moving the regions I've stopped the HBase and Hadoop services on the
> decommissioned node. I thought that was it, but now when accessing the moved
> regions on the new nodes I get error messages in the logs that blocks are not
> being found... On the decommissioned node I can still see many blk_* files in the
> HDFS. Is there anything else that needs to be done to fully transfer all
> region data to the new nodes? It looks like the regions are transferred but
> the related data is still on the old node...Maybe the replication factor of
> only 1 (I know....) is the problem?!?
>
> I'm using HBase 0.92.0 btw.
>
> Thanks
> Yves
>
