[
https://issues.apache.org/jira/browse/HDFS-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604405#comment-13604405
]
Bikas Saha commented on HDFS-4606:
----------------------------------
There are several such application hint proposals floating around, which makes
it clear that HDFS needs to support new APIs with respect to block placement.
The number of proposals also suggests that the HDFS community needs to
abstract the problem space and figure out the correct way forward (while
balancing HDFS core principles and application complexity). It would help if we
tried to identify the desired scenarios/use-cases rather than solutions for
those scenarios/use-cases.
e.g.
HDFS-2121 - By creating replicas on the fly (when read off-switch), it looks
like we are trying to solve the problem of hot-data locality for data-processing
applications. When today's logs come in, almost every daily job wants
to read them but gets stuck on 3 replicas. Being able to highly replicate
hot data on demand would help latency and locality. HDFS needs to
understand that such over-replication must be tuned down by deleting excess
replicas as demand falls. Persistence is not necessary here. This feature needs
to be automatic to be useful.
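To make this concrete, an application can already approximate the ramp-up by
hand with the existing FileSystem#setReplication API; a minimal sketch is below
(the log path is hypothetical). What HDFS-2121 would add is making both the
ramp-up and the later tune-down automatic instead of client-driven.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HotDataBoost {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path todaysLogs = new Path("/logs/2013-03-15"); // hypothetical path

    // Over-replicate while every daily job is reading today's logs...
    fs.setReplication(todaysLogs, (short) 10);

    // ...and tune it back down once demand falls. HDFS deletes the excess
    // replicas asynchronously; nothing about this needs to be persisted.
    fs.setReplication(todaysLogs, (short) 3);
  }
}
{code}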
HDFS-2576 - This seems to solve the use case where the application knows a
priori that certain files need to be co-located with each other. It is not clear
whether all replicas of the blocks of those files need co-location or not. By
specifying the locations, the proposal solves the problem of getting a good
starting point without persisting any co-location state. Thus, for stable
clusters it is a good solution.
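For reference, the API shape in that proposal looks roughly like the sketch
below: a create overload on DistributedFileSystem that takes favored datanodes
as a placement hint. The hostnames are hypothetical and the exact signature
should be taken as illustrative.
{code:java}
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class ColocatedCreate {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    DistributedFileSystem dfs =
        (DistributedFileSystem) new Path("hdfs:///").getFileSystem(conf);

    // Hint: put this file's blocks on the nodes that already host its
    // partner file, so the two stay co-located on a stable cluster.
    InetSocketAddress[] favored = {
        new InetSocketAddress("dn1.example.com", 50010),
        new InetSocketAddress("dn2.example.com", 50010),
    };
    FSDataOutputStream out = dfs.create(new Path("/tables/orders/part-0"),
        FsPermission.getDefault(), true /* overwrite */, 4096 /* buffer */,
        (short) 3, 128L * 1024 * 1024 /* block size */, null /* progress */,
        favored);
    out.close();
  }
}
{code}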
On the same jira, there is an alternate proposal to make co-location of files
a first-class feature that is persisted, so that HDFS can continue to co-locate
them across machine failures and re-balancing. The application could potentially
query the co-located machines and use that to assign its own failover services.
This feature is also useful for data-processing applications that want to
co-locate frequently joined, pre-partitioned data to avoid unnecessary
re-partitioning. I have seen this scenario work at very large scale in a
different Hadoop-like system, so it is useful.
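To make the alternate proposal concrete, a first-class, persisted co-location
API might look roughly like the interface below; every name here is
hypothetical, since nothing like it exists in HDFS today.
{code:java}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.fs.Path;

// Hypothetical shape of a first-class co-location API; illustrative only.
public interface ColocationGroup {
  // Persist the intent that these files' blocks stay together; the namenode
  // would re-establish co-location after machine failures and re-balancing.
  void addFiles(Path... files) throws IOException;

  // Let the application discover where the group currently lives, e.g. to
  // assign its own failover services to the same machines.
  List<String> getColocatedHosts() throws IOException;
}
{code}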
However, this does not prevent HDFS from co-locating the data of 2 different
HBase region servers on the same machine. HDFS-4606 addresses that problem by
letting clients specify that they want to copy the data locally. But as someone
suggested elsewhere, moving data to code goes against the basic grain of
Hadoop, so the design of such an API needs to be careful about doing the right
thing and preventing abuse.
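Similarly, here is a purely hypothetical sketch of the client-facing call this
jira discusses; none of these names exist in HDFS, and it is only meant to
anchor the question below about who then owns the blocks.
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.Path;

// Hypothetical: ask the namenode to place one replica of each block of a
// file on (or near) the caller's node. Illustrative only.
public interface ReplicaMover {
  // HDFS would still own the blocks afterwards, so replication,
  // snapshotting and fault tolerance must keep working unchanged.
  void moveReplicasToCaller(Path file) throws IOException;
}
{code}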
IMO, letting applications start managing their own blocks needs to be carefully
thought out. It might be enticing at first, but we might soon end up with
issues around who owns the blocks and takes the actions that HDFS currently
takes on blocks for snapshotting, replication, fault tolerance and every future
HDFS feature. Also, does this mean applications might have to develop a host of
namenode features themselves as they need to fix more issues down the line, or
ask HDFS for APIs to control all such HDFS actions? And how does HDFS manage
all these APIs while doing the right thing for the data in the cluster?
Mistakes in data management are too critical to make.
> HDFS API to move file replicas to caller's location
> ---------------------------------------------------
>
> Key: HDFS-4606
> URL: https://issues.apache.org/jira/browse/HDFS-4606
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Sanjay Radia
> Assignee: Sanjay Radia
>