On Wed, Nov 28, 2018 at 02:54:28PM +0000, fdman...@kernel.org wrote:
> From: Filipe Manana <fdman...@suse.com>
> 
> The log tree has a long-standing problem: when a file is fsync'ed we
> only check for new ancestors, created in the current transaction, by
> following only the hard link for which the fsync was issued. We follow the
> ancestors using the VFS' dget_parent() API. This means that if we create a
> new link for a file in a directory that is new (or in any other new
> ancestor directory) and then fsync the file using an old hard link, we end
> up not logging the new ancestor, and on log replay that new hard link and
> ancestor do not exist. In some cases, involving renames, the file will not
> exist at all.
> 
> Example:
> 
>   mkfs.btrfs -f /dev/sdb
>   mount /dev/sdb /mnt
> 
>   mkdir /mnt/A
>   touch /mnt/foo
>   ln /mnt/foo /mnt/A/bar
>   xfs_io -c fsync /mnt/foo
> 
>   <power failure>
> 
> In this example after log replay only the hard link named 'foo' exists
> and directory A does not exist, which is unexpected. In other major Linux
> filesystems, such as ext4, xfs and f2fs, both hard links exist and so does
> directory A after mounting the filesystem again.
> 
> Checking whether any ancestors are new and need to be logged was added in
> 2009 by commit 12fcfd22fe5b ("Btrfs: tree logging unlink/rename fixes"),
> however only for the ancestors of the hard link (dentry) for which the
> fsync was issued, instead of checking for all ancestors for all of the
> inode's hard links.
> 
> So fix this by tracking the id of the last transaction where a hard link
> was created for an inode and then on fsync fall back to a full transaction
> commit when an inode has more than one hard link and at least one new hard
> link was created in the current transaction. This is the simplest solution
> since this is not a common use case (frequently adding hard links for
> which there's an ancestor created in the current transaction and then
> fsync'ing the file). In case it ever becomes a common use case, a solution
> that consists of iterating the fs/subvol btree for each hard link and
> checking if any ancestor is new could be implemented.
> 
> This solves many unexpected scenarios reported by Jayashree Mohan and
> Vijay Chidambaram, and for which there is a new test case for fstests
> under review.
> 
> Reported-by: Vijay Chidambaram <vvija...@gmail.com>
> Reported-by: Jayashree Mohan <jayashree2...@gmail.com>
> Fixes: 12fcfd22fe5b ("Btrfs: tree logging unlink/rename fixes")
> Signed-off-by: Filipe Manana <fdman...@suse.com>

Added to misc-next, thanks.
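
For anyone following along, the fallback rule described in the quoted
commit message can be sketched roughly as below. This is a standalone
model and not the patch itself: the struct, its field names and the
function name are hypothetical, and "a new hard link was created in the
current transaction" is assumed to mean that the recorded transaction id
is newer than the last committed one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the in-memory inode state. */
struct inode_model {
	uint32_t nlink;           /* current number of hard links */
	uint64_t last_link_trans; /* id of the last transaction that added a hard link */
};

/*
 * Return true when fsync should give up on the log tree and fall back to
 * a full transaction commit: the inode has more than one hard link and at
 * least one link was added after the last committed transaction
 * (assumption: that is what "in the current transaction" amounts to).
 */
static bool fsync_needs_full_commit(const struct inode_model *inode,
				    uint64_t last_committed_trans)
{
	assert(inode != NULL);

	return inode->nlink > 1 &&
	       inode->last_link_trans > last_committed_trans;
}
```

In the example from the commit message, 'foo' gains its second link
'A/bar' in the running transaction, so a model inode with nlink = 2 and a
last_link_trans newer than the last committed transaction makes the
function return true. Once that transaction has been committed, the same
inode returns false again.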
