hemantk-12 opened a new pull request, #4045: URL: https://github.com/apache/ozone/pull/4045
## What changes were proposed in this pull request?

To generate diffs between snapshots faster, we maintain a compaction DAG in memory. Whenever a compaction happens, the related SST file nodes are added to the DAG. Over time the DAG keeps growing and may cause memory pressure or become a bottleneck. To solve this, we can prune the unnecessary SST file nodes from the DAG, since we have the concept of the oldest snapshot with compaction history.

This change proposes the traversal and pruning of the DAG. The idea is to first remove the nodes and arcs that were created before the snapshot being deleted was created, because they are no longer needed to generate a diff. `pruneDownstreamDag` does that pruning: it removes nodes and arcs from both the forward and backward DAGs by going over the forward-DAG successors of the current level.

Once the older nodes and arcs have been deleted from the oldest snapshot's compaction history, the nodes and arcs that are not needed to generate diffs for newer snapshots are removed. `pruneUpstreamDag` does this remaining pruning: it removes nodes and arcs from both the forward and backward DAGs by going over the backward-DAG successors of each node in the current level. If a node in the current level doesn't have any successors in the forward DAG, the arc to its successor and the current node itself can be deleted.

Let's take the following diagram (backward DAG) as an example.

Snapshots were taken at level-1, level-3 and level-5:

* Snapshot-1: 000015.sst, 000013.sst, 000011.sst, 000009.sst
* Snapshot-2: 000027.sst, 000030.sst, 000028.sst, 000031.sst, 000029.sst, 000039.sst, 000037.sst, 000035.sst, 000033.sst
* Snapshot-3: 000059.sst, 000055.sst, 000056.sst, 000060.sst, 000057.sst, 000058.sst

If Snapshot-1 and Snapshot-2 need to be pruned, we can simply prune downstream of level-3 and then upstream of level-3 in the forward DAG.

1. `pruneDownstreamDag` will remove the nodes of level-1 and level-2 and the arcs between level-1 & level-2 and level-2 & level-3.
2. `pruneUpstreamDag` will remove the nodes of level-3 and the arcs between level-3 & level-4.

## What is the link to the Apache JIRA

* https://issues.apache.org/jira/browse/HDDS-7524

## How was this patch tested?

* Unit tests.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
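
For readers who want to see the two pruning passes from the pull request description in code form, here is a minimal sketch. It is not the code from this PR. The class `CompactionDagSketch`, its plain-map graph representation, and the method names `addArc`, `pruneDownstream` and `pruneUpstream` are all hypothetical. It also assumes that forward arcs point from a newer SST file to the older files below it, and that the upstream pass only needs to visit the start level, which matches the level-3 example but may not match the real traversal.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the compaction DAG; not Ozone's actual classes.
class CompactionDagSketch {
  // Forward arcs: newer SST file -> older SST files (assumed direction).
  final Map<String, Set<String>> forward = new HashMap<>();
  // Backward arcs: the same arcs reversed, older -> newer.
  final Map<String, Set<String>> backward = new HashMap<>();

  // Registers one arc in both DAGs and makes sure both nodes exist in both maps.
  void addArc(String newer, String older) {
    forward.computeIfAbsent(newer, k -> new HashSet<>()).add(older);
    forward.computeIfAbsent(older, k -> new HashSet<>());
    backward.computeIfAbsent(older, k -> new HashSet<>()).add(newer);
    backward.computeIfAbsent(newer, k -> new HashSet<>());
  }

  // Removes a node and every arc touching it from both DAGs.
  private void removeNode(String node) {
    for (String older : forward.getOrDefault(node, Collections.emptySet())) {
      backward.get(older).remove(node);
    }
    for (String newer : backward.getOrDefault(node, Collections.emptySet())) {
      forward.get(newer).remove(node);
    }
    forward.remove(node);
    backward.remove(node);
  }

  // Pass 1: remove everything reachable from the start level via forward arcs.
  // The start-level nodes themselves are kept.
  Set<String> pruneDownstream(Set<String> startLevel) {
    Set<String> removed = new HashSet<>();
    Deque<String> queue = new ArrayDeque<>();
    for (String start : startLevel) {
      queue.addAll(forward.getOrDefault(start, Collections.emptySet()));
    }
    while (!queue.isEmpty()) {
      String node = queue.poll();
      if (!forward.containsKey(node)) {
        continue; // already removed via another path
      }
      queue.addAll(forward.get(node)); // copy successors before removal
      removeNode(node);
      removed.add(node);
    }
    return removed;
  }

  // Pass 2: remove start-level nodes that no longer have forward successors,
  // together with the arcs to their backward successors.
  Set<String> pruneUpstream(Set<String> startLevel) {
    Set<String> removed = new HashSet<>();
    for (String node : startLevel) {
      Set<String> older = forward.get(node);
      if (older != null && older.isEmpty()) {
        removeNode(node);
        removed.add(node);
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    // Toy five-level DAG: a1 (level-1) up to e1 (level-5); names are made up.
    CompactionDagSketch dag = new CompactionDagSketch();
    dag.addArc("b1", "a1");
    dag.addArc("c1", "b1");
    dag.addArc("c2", "b1");
    dag.addArc("d1", "c1");
    dag.addArc("d1", "c2");
    dag.addArc("e1", "d1");
    Set<String> level3 = new HashSet<>(Arrays.asList("c1", "c2"));
    System.out.println("downstream removed: " + dag.pruneDownstream(level3));
    System.out.println("upstream removed: " + dag.pruneUpstream(level3));
    System.out.println("remaining nodes: " + dag.forward.keySet());
  }
}
```

In this toy graph, level-3 is `c1` and `c2`. The downstream pass drops the level-1 and level-2 nodes, and the upstream pass then drops the level-3 nodes and their arcs to level-4, leaving only levels 4 and 5. Keeping both adjacency maps in sync inside a single `removeNode` helper is a design choice of the sketch, so neither DAG can hold an arc to a node the other has already dropped.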
