On 15.05.19 г. 18:02 ч., [email protected] wrote:
> From: Filipe Manana <[email protected]>
> 
> While logging an inode we follow its ancestors and for each one we mark
> it as logged in the current transaction, even if we have not logged it.
> As a consequence if we change an attribute of an ancestor, such as the
> UID or GID for example, and then explicitly fsync it, we end up not
> logging the inode at all despite returning success to user space, which
> results in the attribute being lost if a power failure happens after
> the fsync.
> 
> Sample reproducer:
> 
>   $ mkfs.btrfs -f /dev/sdb
>   $ mount /dev/sdb /mnt
> 
>   $ mkdir /mnt/dir
>   $ chown 6007:6007 /mnt/dir
> 
>   $ sync
> 
>   $ chown 9003:9003 /mnt/dir
>   $ touch /mnt/dir/file
>   $ xfs_io -c fsync /mnt/dir/file
> 
>   # fsync our directory after fsync'ing the new file, should persist the
>   # new values for the uid and gid.
>   $ xfs_io -c fsync /mnt/dir
> 
>   <power failure>
> 
>   $ mount /dev/sdb /mnt
>   $ stat -c %u:%g /mnt/dir
>   6007:6007
> 
>     --> should be 9003:9003, the uid and gid were not persisted, despite
>         the explicit fsync on the directory prior to the power failure
> 
> Fix this by not updating the logged_trans field of ancestor inodes when
> logging an inode, since we have not logged them. Let only future calls to
> btrfs_log_inode() to mark inodes as logged.
> 
> This could be triggered by my recent fsync fuzz tester for fstests, for
> which an fstests patch exists titled "fstests: generic, fsync fuzz tester
> with fsstress".
> 
> Fixes: 12fcfd22fe5b ("Btrfs: tree logging unlink/rename fixes")
> Signed-off-by: Filipe Manana <[email protected]>
> ---
>  fs/btrfs/tree-log.c | 11 -----------
>  1 file changed, 11 deletions(-)
> 
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index 87e3e4e37606..7d13533a9620 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -5465,7 +5465,6 @@ static noinline int check_parent_dirs_for_sync(struct 
> btrfs_trans_handle *trans,
>  {
>       int ret = 0;
>       struct dentry *old_parent = NULL;
> -     struct btrfs_inode *orig_inode = inode;
>  
>       /*
>        * for regular files, if its inode is already on disk, we don't
> @@ -5485,16 +5484,6 @@ static noinline int check_parent_dirs_for_sync(struct 
> btrfs_trans_handle *trans,
>       }
>  
>       while (1) {
> -             /*
> -              * If we are logging a directory then we start with our inode,
> -              * not our parent's inode, so we need to skip setting the
> -              * logged_trans so that further down in the log code we don't
> -              * think this inode has already been logged.
> -              */
> -             if (inode != orig_inode)
> -                     inode->logged_trans = trans->transid;
> -             smp_mb();
> -

By removing this memory barrier don't you also obsolete the one in
btrfs_record_unlink_dir? Both of these were introduced in 12fcfd22fe5b
("Btrfs: tree logging unlink/rename fixes") and despite they are missing
explicit comments about the expected pairing one can only assume they
both pair against each other.


>               if (btrfs_must_commit_transaction(trans, inode)) {
>                       ret = 1;
>                       break;
> 

Reply via email to