[ https://issues.apache.org/jira/browse/HDFS-487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732054#action_12732054 ]
dhruba borthakur commented on HDFS-487:
---------------------------------------

The API for pluggable block placement (HDFS-385) provides the pathname of the file to the block placement policy. The policy can use the filename to determine what kind of placement algorithm to use for blocks in that file. This works well in the current NN design.

However, if in the future we separate the Block Manager from the NN, the Block Manager might not know the pathname of the file to which a block belongs. In that case, the Block Manager will not be able to provide the filename when invoking the pluggable block placement policy API. So, in some sense, using a fileid (instead of a filename) future-proofs the API.

Again, to emphasize: HDFS-385 does not really need fileids, although they are good to have. The API designed in HDFS-385 should be marked as "experimental", and we can change it if/when the Block Manager is separated from the NN.

Which option do you prefer?

> HDFS should expose a fileid to uniquely identify a file
> -------------------------------------------------------
>
>                 Key: HDFS-487
>                 URL: https://issues.apache.org/jira/browse/HDFS-487
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: fileid1.txt
>
>
> HDFS should expose an id that uniquely identifies a file. This helps in
> developing applications that work correctly even when files are moved from
> one directory to another. A typical use-case is to make the Pluggable Block
> Placement Policy (HDFS-385) use fileid instead of filename.
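
To make the filename-versus-fileid trade-off concrete, here is a rough Java sketch. It is not the HDFS-385 API: every type and method name in it (FileIdPlacementSketch, PathKeyedPolicy, IdKeyedPolicy, Algorithm, register, choose) is hypothetical and was invented for illustration only. It assumes a fileid is a plain long that stays the same across renames, and it shows that a policy keyed by pathname loses its per-file setting after a rename, while a policy keyed by fileid does not need the pathname at all.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; none of these types exist in HDFS or in the HDFS-385 patch.
class FileIdPlacementSketch {

    /** Made-up names for placement algorithms a policy might choose between. */
    enum Algorithm { DEFAULT, COLOCATE }

    /** Policy that selects an algorithm from the file's pathname. */
    static class PathKeyedPolicy {
        private final Map<String, Algorithm> byPath = new HashMap<>();

        void register(String path, Algorithm algorithm) {
            byPath.put(path, algorithm);
        }

        Algorithm choose(String path) {
            // Falls back to DEFAULT when the path is unknown, e.g. after a rename.
            return byPath.getOrDefault(path, Algorithm.DEFAULT);
        }
    }

    /** Policy that selects an algorithm from an opaque file id (assumed rename-stable). */
    static class IdKeyedPolicy {
        private final Map<Long, Algorithm> byId = new HashMap<>();

        void register(long fileId, Algorithm algorithm) {
            byId.put(fileId, algorithm);
        }

        Algorithm choose(long fileId) {
            return byId.getOrDefault(fileId, Algorithm.DEFAULT);
        }
    }

    public static void main(String[] args) {
        PathKeyedPolicy byPath = new PathKeyedPolicy();
        IdKeyedPolicy byId = new IdKeyedPolicy();

        // The same file, known by its original path and by an assumed id of 42.
        byPath.register("/warehouse/a/part-0", Algorithm.COLOCATE);
        byId.register(42L, Algorithm.COLOCATE);

        // After a move to another directory, the caller only has the new path,
        // but the file id is assumed to be unchanged.
        String pathAfterMove = "/warehouse/b/part-0";
        System.out.println("path-keyed after move: " + byPath.choose(pathAfterMove));
        System.out.println("id-keyed after move:   " + byId.choose(42L));
    }
}
```

A caller that only tracks blocks and ids, as a separated Block Manager might, could invoke the id-keyed form without ever resolving a pathname.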