[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626868#comment-13626868 ]
Kihwal Lee commented on HDFS-4489:
----------------------------------

bq. Please look at the overall increase in memory usage instead of increase over used memory.

Your point would be valid only if the overhead were entirely a fixed amount (e.g. GSet). Since the extra memory consumption increases as the namespace grows, factoring the arbitrary max heap size into this can be misleading. But I agree that the 9% figure does not have an absolute meaning either. If the inode-to-block ratio is different, the number will be different. For the clusters I have seen, it will be a lower number.

The GSet used for the InodeID-to-INode map is also semi-fixed. Is it allocated similarly to BlocksMap?

In any case, I would not call this insignificant. We have a namenode which will not work well if we upgrade to a release with this feature, since it will need an extra 4-6GB for steady-state operation. Even if it could absorb the extra memory requirement, we would have to tell users that the namespace limit is X% worse. Simply saying the overhead is insignificant won't convince users. We should explain why the benefit from having this feature justifies the overhead.

I don't think an on/off switch is necessary.

> Use InodeID as an identifier of a file in HDFS protocols and APIs
> -----------------------------------------------------------------
>
>                 Key: HDFS-4489
>                 URL: https://issues.apache.org/jira/browse/HDFS-4489
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Brandon Li
>            Assignee: Brandon Li
>
> Using InodeID to uniquely identify a file has several benefits. Here are a few of them:
> 1. Uniquely identify a file across renames; related JIRAs include HDFS-4258 and HDFS-4437.
> 2. Modification checks in tools like distcp. Since a file could have been replaced or renamed to, the file name and size combination is not reliable, but the combination of file id and size is unique.
> 3. ID-based protocol support (e.g., NFS).
> 4. Make the pluggable block placement policy use fileid instead of filename (HDFS-385).
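
Point 2 of the issue description (modification checks) can be illustrated with a small sketch. This is not distcp's actual logic: `FileStatus`, `file_id`, `unchanged_by_name`, and `unchanged_by_id` are hypothetical names chosen for illustration, and the paths, IDs, and sizes are made-up values. It only assumes that an inode ID is assigned when a file is created and stays with it across renames.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStatus:
    """Hypothetical stand-in for the metadata a copy tool would see."""
    path: str
    file_id: int  # assumed: stable inode ID assigned at file creation
    length: int   # file size in bytes


def unchanged_by_name(old: FileStatus, new: FileStatus) -> bool:
    # Name-and-size check: fooled when a different file of the same size
    # is renamed onto the same path.
    return old.path == new.path and old.length == new.length


def unchanged_by_id(old: FileStatus, new: FileStatus) -> bool:
    # ID-and-size check: a replacement file carries a different inode ID.
    return old.file_id == new.file_id and old.length == new.length


before = FileStatus("/data/part-0", file_id=1001, length=4096)
# Assumed scenario: another 4096-byte file was renamed over /data/part-0.
after = FileStatus("/data/part-0", file_id=2002, length=4096)

print(unchanged_by_name(before, after))  # True: the replacement goes unnoticed
print(unchanged_by_id(before, after))    # False: the replacement is detected
```

The same ID-based comparison also treats a file that was only renamed as unchanged, which is the behavior point 1 asks for.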