[ https://issues.apache.org/jira/browse/HDFS-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604405#comment-13604405 ]

Bikas Saha commented on HDFS-4606:
----------------------------------

There are several such application hint proposals floating around, which makes 
it clear that HDFS needs to support new APIs for block placement. The number of 
proposals also suggests that the HDFS community needs to abstract the problem 
space and figure out the correct way forward (while balancing HDFS core 
principles against application complexity). It would help if we tried to 
identify the scenarios/use-cases being desired instead of solutions for those 
scenarios/use-cases.
e.g.
HDFS-2121 - By creating replicas on the fly (when a block is read off-switch), 
it looks like we are trying to solve the problem of hot-data locality for 
data-processing applications. When today's logs come in, almost every daily job 
wants to read them but gets stuck behind 3 replicas. Being able to replicate 
hot data highly on demand would help latency and locality. HDFS needs to 
understand that such over-replication must be tuned down by deleting excess 
replicas as demand falls. Persistence is not necessary here. This feature needs 
to be automatic to be useful.
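
A caller-driven approximation of this already exists through the standard 
replication API; a minimal sketch (the path and replication factors are 
illustrative only):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HotLogReplication {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path todaysLog = new Path("/logs/current/events.log"); // illustrative path

    // Over-replicate while every daily job wants to read the file, so more
    // nodes can serve a local replica.
    fs.setReplication(todaysLog, (short) 10);

    // ... daily jobs run against the hot data ...

    // Tune back down as demand falls; the NameNode schedules deletion of the
    // excess replicas.
    fs.setReplication(todaysLog, (short) 3);
  }
}
{code}

The difference HDFS-2121 asks for is that HDFS would do this itself based on 
observed read demand, instead of relying on every application to remember to 
dial replication up and back down.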
HDFS-2576 - This seems to solve the use case where the application knows a 
priori that certain files need to be co-located with each other. It's not clear 
whether all replicas of the blocks of those files need co-location or not. By 
specifying the locations, the proposal solves the problem of getting a good 
starting point without persisting any co-location state. Thus, for stable 
clusters, it's a good solution.
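
For concreteness, the hint sketched on that jira is roughly a create-time list 
of preferred datanodes. The overload below follows the favored-nodes create 
method proposed there; its exact shape, and the hostnames and paths used, are 
assumptions for illustration, not a committed API:

{code:java}
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class ColocatedCreate {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());

    // Ask for the replicas of a new file to land on these datanodes. The
    // NameNode treats the list as a hint, not a guarantee, and no
    // co-location state is persisted.
    InetSocketAddress[] favored = {
        new InetSocketAddress("dn1.example.com", 50010),
        new InetSocketAddress("dn2.example.com", 50010),
        new InetSocketAddress("dn3.example.com", 50010)};

    FSDataOutputStream out = dfs.create(new Path("/warehouse/users/part-0"),
        FsPermission.getDefault(), true /* overwrite */, 4096, (short) 3,
        128L * 1024 * 1024, null /* progress */, favored);
    out.close();
  }
}
{code}

Creating a second file with the same favored list is what gives the two files 
their good co-located starting point.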
On the same jira, there is an alternate proposal to make co-location of files a 
first-class feature that is persisted, so that HDFS can continue to co-locate 
them across machine failures and re-balancing. The application could 
potentially query the co-located machines and use that to assign its own 
failover services. This feature is also useful for data-processing applications 
that want to co-locate frequently joined, pre-partitioned data to avoid 
unnecessary re-partitioning. I have seen this scenario work at very large scale 
in a different Hadoop-like system, so it's useful.
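
If co-location did become a persisted, first-class property, the client-facing 
surface might look something like the hypothetical interface below. Nothing 
like it exists in HDFS; every name here is invented for illustration:

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.Path;

// Hypothetical sketch of a persisted co-location API; all names invented.
public interface ColocationGroups {
  // Persistently declare that these files' replicas should live together;
  // the NameNode would keep them together across machine failures and
  // re-balancing.
  void colocate(String groupName, Path... files) throws IOException;

  // Let an application (e.g. one assigning its own failover services)
  // discover which datanodes currently host a group's replicas.
  String[] getColocatedHosts(String groupName) throws IOException;
}
{code}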
However, this does not prevent HDFS from co-locating the data of 2 different 
HBase region servers on the same machine. HDFS-4606 addresses that problem by 
letting clients specify that they want to copy the data locally. But as someone 
suggested elsewhere, moving data to code goes against the basic grain of 
Hadoop. So the design of such an API needs to be careful about doing the right 
thing and preventing abuse.
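
For concreteness, the kind of call this jira asks for might look like the 
hypothetical method below; no such API exists, and the name and signature are 
made up here:

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.Path;

// Hypothetical sketch of the call HDFS-4606 proposes; name and shape are
// invented for illustration.
public interface ReplicaMover {
  // Ask the NameNode to place a replica of every block of the file on the
  // datanode co-located with the caller (e.g. a region server that has just
  // been assigned the region), then drop an excess remote replica once the
  // local one is complete.
  void moveReplicasToCaller(Path file) throws IOException;
}
{code}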

IMO, letting applications start managing their own blocks needs to be carefully 
thought out. It might be enticing at first, but we might soon end up with 
issues around who owns the blocks and takes the actions that HDFS currently 
takes on blocks for snapshotting, replication, fault tolerance, and every 
future HDFS feature. Also, does this mean applications might have to develop a 
host of namenode features themselves as they need to fix more issues down the 
line? Or ask HDFS for APIs to control all such HDFS actions? And how does HDFS 
manage all these APIs while doing the right thing for the data in the cluster? 
Mistakes in data management are too critical to make.
                
> HDFS API to move file replicas to caller's location
> ---------------------------------------------------
>
>                 Key: HDFS-4606
>                 URL: https://issues.apache.org/jira/browse/HDFS-4606
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Sanjay Radia
>            Assignee: Sanjay Radia
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira