[ https://issues.apache.org/jira/browse/HDFS-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16370199#comment-16370199 ]
Tsz Wo Nicholas Sze commented on HDFS-13102:
--------------------------------------------

Thanks [~shashikant] for working on this. Some comments on the patch:
- Pass INodeDirectory as a parameter in getSumForRange(..). Then, we could remove INodeDirectory dir from DirectoryDiffList.
- Let's replace getSumForRange with getMinListForRange in DiffList so that we can implement it in DiffListByArrayList using subList.
- diffSetIndexList does not seem useful since it is the same as the nodes in level 1. BTW, diffSetIndexList is not updated when an element is removed, which looks like a bug. I suggest removing diffSetIndexList since it can be computed if necessary.
- TestDirectoryDiffList does not test remove(..). As mentioned, remove(..) seems to have some bugs.

> Implement SnapshotSkipList class to store Multi level DirectoryDiffs
> --------------------------------------------------------------------
>
>                 Key: HDFS-13102
>                 URL: https://issues.apache.org/jira/browse/HDFS-13102
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>         Attachments: HDFS-13102.001.patch, HDFS-13102.002.patch,
> HDFS-13102.003.patch
>
>
> HDFS-11225 explains an issue where deletion of older snapshots can take a
> very long time when the number of snapshot diffs for a directory is large.
> For any directory under a snapshot, constructing the children list requires
> combining all the diffs from that particular snapshot up to the last
> snapshotDiff record and reverse-applying them (reverseApply) to the current
> children list of the directory on the live fs. This can take significant
> time if the number of snapshot diffs is large and each diff contains many
> changes.
> This Jira proposes to store the Directory diffs in a SnapshotSkip list, where
> we store multi level DirectoryDiffs. At each level, the Directory Diff will
> be a cumulative diff of k snapshot diffs,
> where k is the level of a node in the list.
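
To illustrate the getMinListForRange suggestion above, here is a minimal sketch in Java. It assumes a simplified, hypothetical interface: SimpleDiffList and ArrayBackedDiffList are stand-ins invented for this example, not the DiffList or DiffListByArrayList classes from the patch, and the method signature is a guess. An array-backed list keeps no cumulative diffs, so the shortest list covering a range is the sub-list itself. A skip-list-backed implementation could presumably return fewer entries by substituting higher-level cumulative diffs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical, simplified diff-list abstraction (not the Hadoop API). */
interface SimpleDiffList<T> {
  void add(T diff);

  T remove(int index);

  int size();

  /** Returns the shortest list of diffs covering [fromIndex, toIndex). */
  List<T> getMinListForRange(int fromIndex, int toIndex);
}

/** Array-backed variant: no cumulative diffs are stored. */
class ArrayBackedDiffList<T> implements SimpleDiffList<T> {
  private final List<T> diffs = new ArrayList<>();

  @Override
  public void add(T diff) {
    diffs.add(diff);
  }

  @Override
  public T remove(int index) {
    return diffs.remove(index);
  }

  @Override
  public int size() {
    return diffs.size();
  }

  @Override
  public List<T> getMinListForRange(int fromIndex, int toIndex) {
    // subList returns a view, so no diffs are copied. The view is only
    // valid until the backing list is structurally modified.
    return Collections.unmodifiableList(diffs.subList(fromIndex, toIndex));
  }
}
```

With this shape, the caller combines the returned diffs itself, so the list does not need to hold a reference to the directory. A test for remove(..) would add a few diffs, remove one, and check that the range query reflects the removal.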