Hello,

Yesterday I realized that the algorithm for nodatacow is broken: it can't reliably detect whether a given extent is referenced by only one snapshot.
Let me use the attached picture to describe the issue.

Figure (1) shows the initial tree structure; there is only one fs tree, A.
Figure (2) shows the tree structure after we create a snapshot of fs tree A. The new snapshot's root node is B.
Figure (3) shows the situation after we modify leaf node L when no snapshot exists, i.e. starting from the state shown in figure (1).
Figure (4) shows the situation after we modify leaf node L when snapshot B exists, i.e. starting from the state shown in figure (2).

In the figures, the color of a rectangle differentiates tree nodes that belong to different owners (the owner field in the tree node header). Node A' is the shadow copy of node A, and leaf L' is the shadow copy of leaf L.

When the nodatacow option is enabled, btrfs_count_snapshots_in_path is used to detect whether a given extent is referenced by only one snapshot. It does this using the backref info for the tree blocks in the btrfs_path and for the file extent. In the examples shown in figure (3) and figure (4), the backref info for node A', leaf L' and the file extent is used.

The backref info used in the case of figure (3) is the same as in the case of figure (4). But in figure (3) the file extent is referenced by one snapshot, while in figure (4) it is referenced by two snapshots. In both cases, btrfs_count_snapshots_in_path returns 1.

Regards
YZ
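P.S. In case it helps, here is a toy model of the ambiguity in figures (3) and (4). It is only a sketch under simplifying assumptions: the Block class, cow_path, path_view and naive_count are hypothetical names made up for illustration, the "backref info" is reduced to (block name, owner) pairs, and none of this is the real btrfs on-disk format or the actual logic of btrfs_count_snapshots_in_path. It only shows that a check which looks at the blocks on the path cannot tell the two cases apart, while walking all roots can.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    name: str
    owner: str  # stands in for the owner field in the tree node header
    children: list = field(default_factory=list)  # child Blocks or extent names (str)


def cow_path(root, leaf, owner):
    """Shadow-copy the root and the leaf, as happens when the leaf is modified."""
    new_leaf = Block(leaf.name + "'", owner, list(leaf.children))
    new_root = Block(root.name + "'", owner,
                     [new_leaf if c is leaf else c for c in root.children])
    return new_root, new_leaf


def reaches(block, extent):
    """True if the extent is reachable from this block."""
    return any(c == extent if isinstance(c, str) else reaches(c, extent)
               for c in block.children)


def true_snapshot_count(roots, extent):
    """Ground truth: number of live roots that reference the extent."""
    return sum(reaches(r, extent) for r in roots)


def path_view(path, extent):
    """Simplified stand-in for the info a path-only check gets to see."""
    return tuple((b.name, b.owner) for b in path) + (extent,)


def naive_count(path):
    """Path-only estimate: distinct owners among the blocks on the path."""
    return len({b.owner for b in path})


def figure3():
    """Modify L while only fs tree A exists."""
    leaf = Block("L", "A", ["E"])
    root_a = Block("A", "A", [leaf])
    new_root, new_leaf = cow_path(root_a, leaf, "A")
    return [new_root], [new_root, new_leaf]  # (live roots, path)


def figure4():
    """Modify L while snapshot B exists and still shares the old leaf L."""
    leaf = Block("L", "A", ["E"])
    root_a = Block("A", "A", [leaf])
    root_b = Block("B", "B", list(root_a.children))  # snapshot root
    new_root, new_leaf = cow_path(root_a, leaf, "A")
    return [new_root, root_b], [new_root, new_leaf]
```

Comparing path_view for the two cases gives identical results, and naive_count gives 1 for both, even though true_snapshot_count differs between figure3() and figure4().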
<<attachment: nocow.jpg>>