Hi Mirko, Thanks for the reply!
"...it will not bring in exactly the same blocks like before" Is that what usually happens when adding nodes back in? Should I expect any data loss due to starting the data node process before running the balancing tool? Best Regards, Andrew Touchet On Thu, Jul 24, 2014 at 11:37 AM, Mirko Kämpf <[email protected]> wrote: > After you added the nodes back to your cluster you run the balancer tool, > but it will not bring in exactly the same blocks like before. > > Cheers, > Mirko > > > > 2014-07-24 17:34 GMT+01:00 andrew touchet <[email protected]>: > > Thanks for the reply, >> >> I am using Hadoop-0.20. We installed from Apache not cloundera, if that >> makes a difference. >> >> Currently I really need to know how to get the data that was replicated >> during decommissioning back onto my two data nodes. >> >> >> >> >> >> On Thursday, July 24, 2014, Stanley Shi <[email protected]> wrote: >> >>> which distribution are you using? >>> >>> Regards, >>> *Stanley Shi,* >>> >>> >>> >>> On Thu, Jul 24, 2014 at 4:38 AM, andrew touchet <[email protected]> >>> wrote: >>> >>>> I should have added this in my first email but I do get an error in the >>>> data node's log file >>>> >>>> '2014-07-12 19:39:58,027 INFO >>>> org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks >>>> got processed in 1 msecs' >>>> >>>> >>>> >>>> On Wed, Jul 23, 2014 at 3:18 PM, andrew touchet <[email protected]> >>>> wrote: >>>> >>>>> Hello, >>>>> >>>>> I am Decommissioning data nodes for an OS upgrade on a HPC cluster . >>>>> Currently, users can run jobs that use data stored on /hdfs. They are able >>>>> to access all datanodes/compute nodes except the one being decommissioned. >>>>> >>>>> Is this safe to do? Will edited files affect the decommissioning node? >>>>> >>>>> I've been adding the nodes to /usr/lib/hadoop-0.20/conf/hosts_exclude >>>>> and running 'hadoop dfsadmin -refreshNodes' on the name name node. Then >>>>> I simply wait for log files to report completion. After upgrade, I simply >>>>> remove the node from hosts_exlude and start hadoop again on the datanode. >>>>> >>>>> Also: Under the namenode web interface I just noticed that the node I >>>>> have decommissioned previously now has 0 Configured capacity, Used, >>>>> Remaining memory and is now 100% Used. >>>>> >>>>> I used the same /etc/sysconfig/hadoop file from before the upgrade, >>>>> removed the node from hosts_exclude, and ran '-refreshNodes' afterwards. >>>>> >>>>> What steps have I missed in the decommissioning process or while >>>>> bringing the data node back online? >>>>> >>>>> >>>>> >>>>> >>>> >>> >
