[ 
https://issues.apache.org/jira/browse/HDFS-487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732054#action_12732054
 ] 

dhruba borthakur commented on HDFS-487:
---------------------------------------

The API for pluggable block placement (HDFS-385) provides the pathname of the 
file to the block placement policy. The block placement policy can use the 
filename to determine what kind of placement algorithm to use for blocks in 
that file. This works well in the current NN design. However, if we separate 
the Block Manager from the NN in the future, the Block Manager might not know 
the pathname of the file to which a block belongs. In that case, the Block 
Manager will not be able to provide the filename when invoking the 
pluggable-block-placement-policy API. So, in some sense, using a fileid 
(instead of a filename) future-proofs the API.
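
As a rough illustration of the argument above, here is a minimal sketch in 
Java. It is not the HDFS-385 API: the interface, the class names, the policy 
labels ("cold", "default") and the "/archive/" prefix are all hypothetical and 
exist only for this example. It assumes a caller that may pass a null pathname 
(standing in for a separated Block Manager) but can always pass a fileid.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface for illustration only; NOT the HDFS-385 API.
interface PlacementChooser {
    // pathname may be null when the caller does not know it.
    String choosePolicy(String pathname, long fileId);
}

// Picks a policy from the pathname, as a filename-based design would.
final class PathBasedChooser implements PlacementChooser {
    @Override
    public String choosePolicy(String pathname, long fileId) {
        if (pathname == null) {
            // No path available: the policy can only fall back to a default.
            return "default";
        }
        return pathname.startsWith("/archive/") ? "cold" : "default";
    }
}

// Picks a policy from the fileid; never looks at the pathname.
final class FileIdBasedChooser implements PlacementChooser {
    private final Map<Long, String> policyByFileId = new HashMap<>();

    void register(long fileId, String policy) {
        policyByFileId.put(fileId, policy);
    }

    @Override
    public String choosePolicy(String pathname, long fileId) {
        return policyByFileId.getOrDefault(fileId, "default");
    }
}
```

With a null pathname, the path-based chooser above can only return its 
default, while the fileid-based chooser still returns whatever was registered 
for that id. The same holds if the file is renamed, since the id does not 
depend on the path.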

To emphasize again, HDFS-385 does not strictly need fileids, although they are 
good to have. The API designed in HDFS-385 should be marked as "experimental", 
and we can change it if/when the Block Manager is separated out from the NN. 
Which option do you prefer?

> HDFS should expose a fileid to uniquely identify a file
> -------------------------------------------------------
>
>                 Key: HDFS-487
>                 URL: https://issues.apache.org/jira/browse/HDFS-487
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: fileid1.txt
>
>
> HDFS should expose an id that uniquely identifies a file. This helps in 
> developing applications that work correctly even when files are moved from 
> one directory to another. A typical use-case is to make the Pluggable Block 
> Placement Policy (HDFS-385) use fileid instead of filename.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
