With a replication factor of 3 you would have to lose 3 entire nodes to
lose data. The replication factor counts nodes, not spindles. The number of
disks (sort of) determines how HDFS spreads I/O across the spindles for the
single copy of the data (one of the 3 nodes with copies) that the node
owns. Note that things get slightly complicated when the FIRST datum is
written to a cluster. (But that was not your question ; {)
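
If you want to see the node-level placement for yourself, here is a
minimal sketch using the Hadoop FileSystem Java API (the fallback path
/data/example is just a placeholder). It prints the hosts holding each
block's replicas; with the default factor of 3, each block should report
3 distinct datanode hostnames, no matter how many disks each node has:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockHosts {
  public static void main(String[] args) throws Exception {
    // Assumes the cluster config (HADOOP_CONF_DIR) is on the classpath.
    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path(args.length > 0 ? args[0] : "/data/example");
    FileStatus st = fs.getFileStatus(p);
    // One BlockLocation per block; getHosts() returns the datanodes
    // (whole nodes, not individual disks) holding a replica of that block.
    for (BlockLocation loc : fs.getFileBlockLocations(st, 0, st.getLen())) {
      System.out.println("offset " + loc.getOffset() + " -> "
          + String.join(", ", loc.getHosts()));
    }
  }
}

So on your 10 nodes with rep 3, any 2 simultaneous node failures are
survivable; you can only lose data if a 3rd node dies before the namenode
finishes re-replicating, and even then only blocks whose 3 replicas
happened to sit on exactly those 3 nodes.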
On Apr 4, 2015 10:39 PM, "Arthur Chan" <[email protected]> wrote:

> Hi,
>
> I use the default replication factor of 3 here; the cluster has 10 nodes,
> and each of my datanodes has 8 hard disks. If one of the nodes goes down
> because of hardware failure, i.e. its 8 hard disks all become unavailable
> immediately for the duration of the outage, does that mean I will have
> data loss? (8 hard disks > 3 replicas)
>
> Or what is the maximum number of servers that are allowed to be down
> without data loss here?
>
> Regards
> Arthur
>
> On Wednesday, December 17, 2014, Harshit Mathur <[email protected]>
> wrote:
>
>> Hi Arthur,
>>
>> In HDFS, replication happens at the block level. In case of total failure
>> of a datanode, the lost blocks become under-replicated, so the namenode
>> will create copies of these under-replicated blocks on other datanodes.
>>
>> BR,
>> Harshit
>>
>> On Wed, Dec 17, 2014 at 11:35 AM, [email protected] <
>> [email protected]> wrote:
>>>
>>> Hi,
>>>
>>> If each of my datanode servers has 8 hard disks (a 10-node cluster) and
>>> I use the default replication factor of 3, how will Hadoop handle it when
>>> a datanode suddenly suffers a total hardware failure?
>>>
>>> Regards
>>> Arthur
>>>
>>
>>
>>
>> --
>> Harshit Mathur
>>
>
