[ 
https://issues.apache.org/jira/browse/HDFS-15278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087997#comment-17087997
 ] 

Ayush Saxena commented on HDFS-15278:
-------------------------------------

Thanks for the discussion here.

Sorry, I couldn't check the patch in full. But this feature would be enabled 
on the server side, so it would be imposed on all the other clients too, 
including those who didn't intend to have it (they had their options at write 
time), for just one use case. It would also spread the data out irrespective 
of the load on the datanodes. Dispersing the data could take away the data 
locality advantage (short-circuit reads) and may impact read performance for 
those clients. 

In general, I feel the server shouldn't impose things that can be handled on 
the client side; those should be left to client preference, unless the 
scenario affects data availability, which BPP handles. 

> After execute ‘-setrep 1’, make sure that blocks of the file are dispersed 
> across different datanodes
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15278
>                 URL: https://issues.apache.org/jira/browse/HDFS-15278
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Yang Yun
>            Assignee: Yang Yun
>            Priority: Minor
>         Attachments: HDFS-15278.001.patch, HDFS-15278.002.patch, 
> HDFS-15278.003.patch
>
>
> After executing ‘-setrep 1’, many blocks of the file may be located on the 
> same machine, especially if the file was written on one datanode machine. 
> That causes data hot spots and is hard to fix if this machine is down.
> Add a chosen history to make sure that blocks of the file are dispersed 
> across different datanodes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

