Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-16 Thread Greg KH
On Mon, Sep 16, 2019 at 01:59:47PM +0100, Filipe Manana wrote:
> On Thu, Sep 12, 2019 at 8:32 AM Sasha Levin  wrote:
> >
> > Hi,
> >
> > [This is an automated email]
> >
> > This commit has been processed because it contains a -stable tag.
> > The stable tag indicates that it's relevant for the following trees: 4.4+
> >
> > The bot has tested the following trees: v5.2.14, v4.19.72, v4.14.143, 
> > v4.9.192, v4.4.192.
> >
> > v5.2.14: Build OK!
> > v4.19.72: Failed to apply! Possible dependencies:
> > 6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of 
> > different files")
> > a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
> > unlink/rmdir")
> > b8aa330d2acb ("Btrfs: improve performance on fsync of files with 
> > multiple hardlinks")
> >
> > v4.14.143: Failed to apply! Possible dependencies:
> > 0d836392cadd ("Btrfs: fix mount failure after fsync due to hard link 
> > recreation")
> > 1f250e929a9c ("Btrfs: fix log replay failure after unlink and link 
> > combination")
> > 6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of 
> > different files")
> > 8d9e220ca084 ("btrfs: simplify IS_ERR/PTR_ERR checks")
> > a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
> > unlink/rmdir")
> > b8aa330d2acb ("Btrfs: improve performance on fsync of files with 
> > multiple hardlinks")
> >
> > v4.9.192: Failed to apply! Possible dependencies:
> > 0b246afa62b0 ("btrfs: root->fs_info cleanup, add fs_info convenience 
> > variables")
> > 0d836392cadd ("Btrfs: fix mount failure after fsync due to hard link 
> > recreation")
> > 1f250e929a9c ("Btrfs: fix log replay failure after unlink and link 
> > combination")
> > 4791c8f19c45 ("btrfs: Make btrfs_check_ref_name_override take 
> > btrfs_inode")
> > 6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of 
> > different files")
> > a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
> > unlink/rmdir")
> > cf8cddd38bab ("btrfs: don't abuse REQ_OP_* flags for btrfs_map_block")
> > da17066c4047 ("btrfs: pull node/sector/stripe sizes out of root and 
> > into fs_info")
> > db0a669fb002 ("btrfs: Make btrfs_add_link take btrfs_inode")
> > de143792253e ("btrfs: struct btrfsic_state->root should be an fs_info")
> > fb456252d3d9 ("btrfs: root->fs_info cleanup, use fs_info->dev_root 
> > everywhere")
> >
> > v4.4.192: Failed to apply! Possible dependencies:
> > 0132761017e0 ("btrfs: fix string and comment grammatical issues and 
> > typos")
> > 09cbfeaf1a5a ("mm, fs: get rid of PAGE_CACHE_* and 
> > page_cache_{get,release} macros")
> > 0b246afa62b0 ("btrfs: root->fs_info cleanup, add fs_info convenience 
> > variables")
> > 0d836392cadd ("Btrfs: fix mount failure after fsync due to hard link 
> > recreation")
> > 0e749e54244e ("dax: increase granularity of dax_clear_blocks() 
> > operations")
> > 1f250e929a9c ("Btrfs: fix log replay failure after unlink and link 
> > combination")
> > 44f714dae50a ("Btrfs: improve performance on fsync against new inode 
> > after rename/unlink")
> > 4791c8f19c45 ("btrfs: Make btrfs_check_ref_name_override take 
> > btrfs_inode")
> > 52db400fcd50 ("pmem, dax: clean up clear_pmem()")
> > 6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of 
> > different files")
> > 781feef7e6be ("Btrfs: fix lockdep warning about log_mutex")
> > a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
> > unlink/rmdir")
> > b2e0d1625e19 ("dax: fix lifetime of in-kernel dax mappings with 
> > dax_map_atomic()")
> > bb7ab3b92e46 ("btrfs: Fix misspellings in comments.")
> > cf8cddd38bab ("btrfs: don't abuse REQ_OP_* flags for btrfs_map_block")
> > d1a5f2b4d8a1 ("block: use DAX for partition table reads")
> > db0a669fb002 ("btrfs: Make btrfs_add_link take btrfs_inode")
> > de143792253e ("btrfs: struct btrfsic_state->root should be an fs_info")
> >
> >
> > NOTE: The patch will not be queued to stable trees until it is upstream.
> >
> > How should we proceed with this patch?
> 
> So here follows, as attachments, patches for the following stable
> versions: 4.4.193, 4.9.193, 4.14.144 and 4.19.73.

Thanks for these backports, now queued up!

greg k-h


Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-16 Thread Filipe Manana
On Thu, Sep 12, 2019 at 8:32 AM Sasha Levin  wrote:
>
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: 4.4+
>
> The bot has tested the following trees: v5.2.14, v4.19.72, v4.14.143, 
> v4.9.192, v4.4.192.
>
> v5.2.14: Build OK!
> v4.19.72: Failed to apply! Possible dependencies:
> 6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of different 
> files")
> a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
> unlink/rmdir")
> b8aa330d2acb ("Btrfs: improve performance on fsync of files with multiple 
> hardlinks")
>
> v4.14.143: Failed to apply! Possible dependencies:
> 0d836392cadd ("Btrfs: fix mount failure after fsync due to hard link 
> recreation")
> 1f250e929a9c ("Btrfs: fix log replay failure after unlink and link 
> combination")
> 6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of different 
> files")
> 8d9e220ca084 ("btrfs: simplify IS_ERR/PTR_ERR checks")
> a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
> unlink/rmdir")
> b8aa330d2acb ("Btrfs: improve performance on fsync of files with multiple 
> hardlinks")
>
> v4.9.192: Failed to apply! Possible dependencies:
> 0b246afa62b0 ("btrfs: root->fs_info cleanup, add fs_info convenience 
> variables")
> 0d836392cadd ("Btrfs: fix mount failure after fsync due to hard link 
> recreation")
> 1f250e929a9c ("Btrfs: fix log replay failure after unlink and link 
> combination")
> 4791c8f19c45 ("btrfs: Make btrfs_check_ref_name_override take 
> btrfs_inode")
> 6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of different 
> files")
> a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
> unlink/rmdir")
> cf8cddd38bab ("btrfs: don't abuse REQ_OP_* flags for btrfs_map_block")
> da17066c4047 ("btrfs: pull node/sector/stripe sizes out of root and into 
> fs_info")
> db0a669fb002 ("btrfs: Make btrfs_add_link take btrfs_inode")
> de143792253e ("btrfs: struct btrfsic_state->root should be an fs_info")
> fb456252d3d9 ("btrfs: root->fs_info cleanup, use fs_info->dev_root 
> everywhere")
>
> v4.4.192: Failed to apply! Possible dependencies:
> 0132761017e0 ("btrfs: fix string and comment grammatical issues and 
> typos")
> 09cbfeaf1a5a ("mm, fs: get rid of PAGE_CACHE_* and 
> page_cache_{get,release} macros")
> 0b246afa62b0 ("btrfs: root->fs_info cleanup, add fs_info convenience 
> variables")
> 0d836392cadd ("Btrfs: fix mount failure after fsync due to hard link 
> recreation")
> 0e749e54244e ("dax: increase granularity of dax_clear_blocks() 
> operations")
> 1f250e929a9c ("Btrfs: fix log replay failure after unlink and link 
> combination")
> 44f714dae50a ("Btrfs: improve performance on fsync against new inode 
> after rename/unlink")
> 4791c8f19c45 ("btrfs: Make btrfs_check_ref_name_override take 
> btrfs_inode")
> 52db400fcd50 ("pmem, dax: clean up clear_pmem()")
> 6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of different 
> files")
> 781feef7e6be ("Btrfs: fix lockdep warning about log_mutex")
> a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
> unlink/rmdir")
> b2e0d1625e19 ("dax: fix lifetime of in-kernel dax mappings with 
> dax_map_atomic()")
> bb7ab3b92e46 ("btrfs: Fix misspellings in comments.")
> cf8cddd38bab ("btrfs: don't abuse REQ_OP_* flags for btrfs_map_block")
> d1a5f2b4d8a1 ("block: use DAX for partition table reads")
> db0a669fb002 ("btrfs: Make btrfs_add_link take btrfs_inode")
> de143792253e ("btrfs: struct btrfsic_state->root should be an fs_info")
>
>
> NOTE: The patch will not be queued to stable trees until it is upstream.
>
> How should we proceed with this patch?

So here follows, as attachments, patches for the following stable
versions: 4.4.193, 4.9.193, 4.14.144 and 4.19.73.

Thanks.

>
> --
> Thanks,
> Sasha



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”
From 842266d7437f7714a67ddb34680546f0268f27c9 Mon Sep 17 00:00:00 2001
From: Filipe Manana 
Date: Tue, 10 Sep 2019 15:26:49 +0100
Subject: [PATCH] Btrfs: fix assertion failure during fsync and use of stale
 transaction

Sometimes when fsync'ing a file we need to log that other inodes exist and
when we need to do that we acquire a reference on the inodes and then drop
that reference using iput() after logging them.

That generally is not a problem except if we end up doing the final iput()
(dropping the last reference) on the inode and that inode has a link count
of 0, which can happen in a very short time window if the logging path
gets a reference on the inode while it's being unlinked.

In that case we end up getting the eviction callback, btrfs_evict_inode(),
invoked through the iput() call chain which needs to drop all of the
inode's items from its subvol

Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-12 Thread Josef Bacik
On Thu, Sep 12, 2019 at 02:19:55PM +0100, Filipe Manana wrote:
> On Thu, Sep 12, 2019 at 1:18 PM Josef Bacik  wrote:
> >
> > On Tue, Sep 10, 2019 at 03:26:49PM +0100, fdman...@kernel.org wrote:
> > > From: Filipe Manana 
> > >
> > > Sometimes when fsync'ing a file we need to log that other inodes exist and
> > > when we need to do that we acquire a reference on the inodes and then drop
> > > that reference using iput() after logging them.
> > >
> > > That generally is not a problem except if we end up doing the final iput()
> > > (dropping the last reference) on the inode and that inode has a link count
> > > of 0, which can happen in a very short time window if the logging path
> > > gets a reference on the inode while it's being unlinked.
> > >
> > > In that case we end up getting the eviction callback, btrfs_evict_inode(),
> > > invoked through the iput() call chain which needs to drop all of the
> > > inode's items from its subvolume btree, and in order to do that, it needs
> > > to join a transaction at the helper function evict_refill_and_join().
> > > However because the task previously started a transaction at the fsync
> > > handler, btrfs_sync_file(), it has current->journal_info already pointing
> > > to a transaction handle and therefore evict_refill_and_join() will get
> > > that transaction handle from btrfs_join_transaction(). From this point on,
> > > two different problems can happen:
> > >
> > > 1) evict_refill_and_join() will often change the transaction handle's
> > >block reserve (->block_rsv) and set its ->bytes_reserved field to a
> > >value greater than 0. If evict_refill_and_join() never commits the
> > >transaction, the eviction handler ends up decreasing the reference
> > >count (->use_count) of the transaction handle through the call to
> > >btrfs_end_transaction(), and after that point we have a transaction
> > >handle with a NULL ->block_rsv (which is the value prior to the
> > >transaction join from evict_refill_and_join()) and a ->bytes_reserved
> > >value greater than 0. If after the eviction/iput completes the inode
> > >logging path hits an error or it decides that it must fallback to a
> > >transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a
> > >non-zero value from btrfs_log_dentry_safe(), and because of that
> > >non-zero value it tries to commit the transaction using a handle with
> > >a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes
> > >the transaction commit hit an assertion failure at
> > >btrfs_trans_release_metadata() because ->bytes_reserved is not zero but
> > >the ->block_rsv is NULL. The produced stack trace for that is like the
> > >following:
> > >
> > >[192922.917158] assertion failed: !trans->bytes_reserved, file: 
> > > fs/btrfs/transaction.c, line: 816
> > >[192922.917553] [ cut here ]
> > >[192922.917922] kernel BUG at fs/btrfs/ctree.h:3532!
> > >[192922.918310] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC PTI
> > >[192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: GW 
> > > 5.1.4-btrfs-next-47 #1
> > >[192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> > > BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
> > >[192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs]
> > >(...)
> > >[192922.920925] RSP: 0018:aebdc8a27da8 EFLAGS: 00010286
> > >[192922.921315] RAX: 0051 RBX: 95c9c16a41c0 RCX: 
> > > 
> > >[192922.921692] RDX:  RSI: 95cab6b16838 RDI: 
> > > 95cab6b16838
> > >[192922.922066] RBP: 95c9c16a41c0 R08:  R09: 
> > > 
> > >[192922.922442] R10: aebdc8a27e70 R11:  R12: 
> > > 95ca731a0980
> > >[192922.922820] R13:  R14: 95ca84c73338 R15: 
> > > 95ca731a0ea8
> > >[192922.923200] FS:  7f337eda4e80() GS:95cab6b0() 
> > > knlGS:
> > >[192922.923579] CS:  0010 DS:  ES:  CR0: 80050033
> > >[192922.923948] CR2: 7f337edad000 CR3: 0001e00f6002 CR4: 
> > > 003606e0
> > >[192922.924329] DR0:  DR1:  DR2: 
> > > 
> > >[192922.924711] DR3:  DR6: fffe0ff0 DR7: 
> > > 0400
> > >[192922.925105] Call Trace:
> > >[192922.925505]  btrfs_trans_release_metadata+0x10c/0x170 [btrfs]
> > >[192922.925911]  btrfs_commit_transaction+0x3e/0xaf0 [btrfs]
> > >[192922.926324]  btrfs_sync_file+0x44c/0x490 [btrfs]
> > >[192922.926731]  do_fsync+0x38/0x60
> > >[192922.927138]  __x64_sys_fdatasync+0x13/0x20
> > >[192922.927543]  do_syscall_64+0x60/0x1c0
> > >[192922.927939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > >(...)
> > >[192922.934077] ---[ end trace f00808b12068168f ]---

Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-12 Thread Filipe Manana
On Thu, Sep 12, 2019 at 1:18 PM Josef Bacik  wrote:
>
> On Tue, Sep 10, 2019 at 03:26:49PM +0100, fdman...@kernel.org wrote:
> > From: Filipe Manana 
> >
> > Sometimes when fsync'ing a file we need to log that other inodes exist and
> > when we need to do that we acquire a reference on the inodes and then drop
> > that reference using iput() after logging them.
> >
> > That generally is not a problem except if we end up doing the final iput()
> > (dropping the last reference) on the inode and that inode has a link count
> > of 0, which can happen in a very short time window if the logging path
> > gets a reference on the inode while it's being unlinked.
> >
> > In that case we end up getting the eviction callback, btrfs_evict_inode(),
> > invoked through the iput() call chain which needs to drop all of the
> > inode's items from its subvolume btree, and in order to do that, it needs
> > to join a transaction at the helper function evict_refill_and_join().
> > However because the task previously started a transaction at the fsync
> > handler, btrfs_sync_file(), it has current->journal_info already pointing
> > to a transaction handle and therefore evict_refill_and_join() will get
> > that transaction handle from btrfs_join_transaction(). From this point on,
> > two different problems can happen:
> >
> > 1) evict_refill_and_join() will often change the transaction handle's
> >block reserve (->block_rsv) and set its ->bytes_reserved field to a
> >value greater than 0. If evict_refill_and_join() never commits the
> >transaction, the eviction handler ends up decreasing the reference
> >count (->use_count) of the transaction handle through the call to
> >btrfs_end_transaction(), and after that point we have a transaction
> >handle with a NULL ->block_rsv (which is the value prior to the
> >transaction join from evict_refill_and_join()) and a ->bytes_reserved
> >value greater than 0. If after the eviction/iput completes the inode
> >logging path hits an error or it decides that it must fallback to a
> >transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a
> >non-zero value from btrfs_log_dentry_safe(), and because of that
> >non-zero value it tries to commit the transaction using a handle with
> >a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes
> >the transaction commit hit an assertion failure at
> >btrfs_trans_release_metadata() because ->bytes_reserved is not zero but
> >the ->block_rsv is NULL. The produced stack trace for that is like the
> >following:
> >
> >[192922.917158] assertion failed: !trans->bytes_reserved, file: 
> > fs/btrfs/transaction.c, line: 816
> >[192922.917553] [ cut here ]
> >[192922.917922] kernel BUG at fs/btrfs/ctree.h:3532!
> >[192922.918310] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC PTI
> >[192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: GW   
> >   5.1.4-btrfs-next-47 #1
> >[192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> > BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
> >[192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs]
> >(...)
> >[192922.920925] RSP: 0018:aebdc8a27da8 EFLAGS: 00010286
> >[192922.921315] RAX: 0051 RBX: 95c9c16a41c0 RCX: 
> > 
> >[192922.921692] RDX:  RSI: 95cab6b16838 RDI: 
> > 95cab6b16838
> >[192922.922066] RBP: 95c9c16a41c0 R08:  R09: 
> > 
> >[192922.922442] R10: aebdc8a27e70 R11:  R12: 
> > 95ca731a0980
> >[192922.922820] R13:  R14: 95ca84c73338 R15: 
> > 95ca731a0ea8
> >[192922.923200] FS:  7f337eda4e80() GS:95cab6b0() 
> > knlGS:
> >[192922.923579] CS:  0010 DS:  ES:  CR0: 80050033
> >[192922.923948] CR2: 7f337edad000 CR3: 0001e00f6002 CR4: 
> > 003606e0
> >[192922.924329] DR0:  DR1:  DR2: 
> > 
> >[192922.924711] DR3:  DR6: fffe0ff0 DR7: 
> > 0400
> >[192922.925105] Call Trace:
> >[192922.925505]  btrfs_trans_release_metadata+0x10c/0x170 [btrfs]
> >[192922.925911]  btrfs_commit_transaction+0x3e/0xaf0 [btrfs]
> >[192922.926324]  btrfs_sync_file+0x44c/0x490 [btrfs]
> >[192922.926731]  do_fsync+0x38/0x60
> >[192922.927138]  __x64_sys_fdatasync+0x13/0x20
> >[192922.927543]  do_syscall_64+0x60/0x1c0
> >[192922.927939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >(...)
> >[192922.934077] ---[ end trace f00808b12068168f ]---
> >
> > 2) If evict_refill_and_join() decides to commit the transaction, it will
> >be able to do it, since the nested transaction join only increments the
> >transaction handle's ->use_count reference counter and it does n

Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-12 Thread Filipe Manana
On Thu, Sep 12, 2019 at 12:59 PM Josef Bacik  wrote:
>
> On Tue, Sep 10, 2019 at 03:26:49PM +0100, fdman...@kernel.org wrote:
> > From: Filipe Manana 
> >
> > Sometimes when fsync'ing a file we need to log that other inodes exist and
> > when we need to do that we acquire a reference on the inodes and then drop
> > that reference using iput() after logging them.
> >
> > That generally is not a problem except if we end up doing the final iput()
> > (dropping the last reference) on the inode and that inode has a link count
> > of 0, which can happen in a very short time window if the logging path
> > gets a reference on the inode while it's being unlinked.
> >
> > In that case we end up getting the eviction callback, btrfs_evict_inode(),
> > invoked through the iput() call chain which needs to drop all of the
> > inode's items from its subvolume btree, and in order to do that, it needs
> > to join a transaction at the helper function evict_refill_and_join().
> > However because the task previously started a transaction at the fsync
> > handler, btrfs_sync_file(), it has current->journal_info already pointing
> > to a transaction handle and therefore evict_refill_and_join() will get
> > that transaction handle from btrfs_join_transaction(). From this point on,
> > two different problems can happen:
> >
> > 1) evict_refill_and_join() will often change the transaction handle's
> >block reserve (->block_rsv) and set its ->bytes_reserved field to a
> >value greater than 0. If evict_refill_and_join() never commits the
> >transaction, the eviction handler ends up decreasing the reference
> >count (->use_count) of the transaction handle through the call to
> >btrfs_end_transaction(), and after that point we have a transaction
> >handle with a NULL ->block_rsv (which is the value prior to the
> >transaction join from evict_refill_and_join()) and a ->bytes_reserved
> >value greater than 0. If after the eviction/iput completes the inode
> >logging path hits an error or it decides that it must fallback to a
> >transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a
> >non-zero value from btrfs_log_dentry_safe(), and because of that
> >non-zero value it tries to commit the transaction using a handle with
> >a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes
> >the transaction commit hit an assertion failure at
> >btrfs_trans_release_metadata() because ->bytes_reserved is not zero but
> >the ->block_rsv is NULL. The produced stack trace for that is like the
> >following:
> >
> >[192922.917158] assertion failed: !trans->bytes_reserved, file: 
> > fs/btrfs/transaction.c, line: 816
> >[192922.917553] [ cut here ]
> >[192922.917922] kernel BUG at fs/btrfs/ctree.h:3532!
> >[192922.918310] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC PTI
> >[192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: GW   
> >   5.1.4-btrfs-next-47 #1
> >[192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> > BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
> >[192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs]
> >(...)
> >[192922.920925] RSP: 0018:aebdc8a27da8 EFLAGS: 00010286
> >[192922.921315] RAX: 0051 RBX: 95c9c16a41c0 RCX: 
> > 
> >[192922.921692] RDX:  RSI: 95cab6b16838 RDI: 
> > 95cab6b16838
> >[192922.922066] RBP: 95c9c16a41c0 R08:  R09: 
> > 
> >[192922.922442] R10: aebdc8a27e70 R11:  R12: 
> > 95ca731a0980
> >[192922.922820] R13:  R14: 95ca84c73338 R15: 
> > 95ca731a0ea8
> >[192922.923200] FS:  7f337eda4e80() GS:95cab6b0() 
> > knlGS:
> >[192922.923579] CS:  0010 DS:  ES:  CR0: 80050033
> >[192922.923948] CR2: 7f337edad000 CR3: 0001e00f6002 CR4: 
> > 003606e0
> >[192922.924329] DR0:  DR1:  DR2: 
> > 
> >[192922.924711] DR3:  DR6: fffe0ff0 DR7: 
> > 0400
> >[192922.925105] Call Trace:
> >[192922.925505]  btrfs_trans_release_metadata+0x10c/0x170 [btrfs]
> >[192922.925911]  btrfs_commit_transaction+0x3e/0xaf0 [btrfs]
> >[192922.926324]  btrfs_sync_file+0x44c/0x490 [btrfs]
> >[192922.926731]  do_fsync+0x38/0x60
> >[192922.927138]  __x64_sys_fdatasync+0x13/0x20
> >[192922.927543]  do_syscall_64+0x60/0x1c0
> >[192922.927939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >(...)
> >[192922.934077] ---[ end trace f00808b12068168f ]---
> >
> > 2) If evict_refill_and_join() decides to commit the transaction, it will
> >be able to do it, since the nested transaction join only increments the
> >transaction handle's ->use_count reference counter and it does 

Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-12 Thread Josef Bacik
On Tue, Sep 10, 2019 at 03:26:49PM +0100, fdman...@kernel.org wrote:
> From: Filipe Manana 
> 
> Sometimes when fsync'ing a file we need to log that other inodes exist and
> when we need to do that we acquire a reference on the inodes and then drop
> that reference using iput() after logging them.
> 
> That generally is not a problem except if we end up doing the final iput()
> (dropping the last reference) on the inode and that inode has a link count
> of 0, which can happen in a very short time window if the logging path
> gets a reference on the inode while it's being unlinked.
> 
> In that case we end up getting the eviction callback, btrfs_evict_inode(),
> invoked through the iput() call chain which needs to drop all of the
> inode's items from its subvolume btree, and in order to do that, it needs
> to join a transaction at the helper function evict_refill_and_join().
> However because the task previously started a transaction at the fsync
> handler, btrfs_sync_file(), it has current->journal_info already pointing
> to a transaction handle and therefore evict_refill_and_join() will get
> that transaction handle from btrfs_join_transaction(). From this point on,
> two different problems can happen:
> 
> 1) evict_refill_and_join() will often change the transaction handle's
>block reserve (->block_rsv) and set its ->bytes_reserved field to a
>value greater than 0. If evict_refill_and_join() never commits the
>transaction, the eviction handler ends up decreasing the reference
>count (->use_count) of the transaction handle through the call to
>btrfs_end_transaction(), and after that point we have a transaction
>handle with a NULL ->block_rsv (which is the value prior to the
>transaction join from evict_refill_and_join()) and a ->bytes_reserved
>value greater than 0. If after the eviction/iput completes the inode
>logging path hits an error or it decides that it must fallback to a
>transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a
>non-zero value from btrfs_log_dentry_safe(), and because of that
>non-zero value it tries to commit the transaction using a handle with
>a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes
>the transaction commit hit an assertion failure at
>btrfs_trans_release_metadata() because ->bytes_reserved is not zero but
>the ->block_rsv is NULL. The produced stack trace for that is like the
>following:
> 
>[192922.917158] assertion failed: !trans->bytes_reserved, file: 
> fs/btrfs/transaction.c, line: 816
>[192922.917553] [ cut here ]
>[192922.917922] kernel BUG at fs/btrfs/ctree.h:3532!
>[192922.918310] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC PTI
>[192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: GW 
> 5.1.4-btrfs-next-47 #1
>[192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
>[192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs]
>(...)
>[192922.920925] RSP: 0018:aebdc8a27da8 EFLAGS: 00010286
>[192922.921315] RAX: 0051 RBX: 95c9c16a41c0 RCX: 
> 
>[192922.921692] RDX:  RSI: 95cab6b16838 RDI: 
> 95cab6b16838
>[192922.922066] RBP: 95c9c16a41c0 R08:  R09: 
> 
>[192922.922442] R10: aebdc8a27e70 R11:  R12: 
> 95ca731a0980
>[192922.922820] R13:  R14: 95ca84c73338 R15: 
> 95ca731a0ea8
>[192922.923200] FS:  7f337eda4e80() GS:95cab6b0() 
> knlGS:
>[192922.923579] CS:  0010 DS:  ES:  CR0: 80050033
>[192922.923948] CR2: 7f337edad000 CR3: 0001e00f6002 CR4: 
> 003606e0
>[192922.924329] DR0:  DR1:  DR2: 
> 
>[192922.924711] DR3:  DR6: fffe0ff0 DR7: 
> 0400
>[192922.925105] Call Trace:
>[192922.925505]  btrfs_trans_release_metadata+0x10c/0x170 [btrfs]
>[192922.925911]  btrfs_commit_transaction+0x3e/0xaf0 [btrfs]
>[192922.926324]  btrfs_sync_file+0x44c/0x490 [btrfs]
>[192922.926731]  do_fsync+0x38/0x60
>[192922.927138]  __x64_sys_fdatasync+0x13/0x20
>[192922.927543]  do_syscall_64+0x60/0x1c0
>[192922.927939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>(...)
>[192922.934077] ---[ end trace f00808b12068168f ]---
> 
> 2) If evict_refill_and_join() decides to commit the transaction, it will
>be able to do it, since the nested transaction join only increments the
>transaction handle's ->use_count reference counter and it does not
>prevent the transaction from getting committed. This means that after
>eviction completes, the fsync logging path will be using a transaction
>handle that refers to an already committed transaction. What 

Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-12 Thread Josef Bacik
On Tue, Sep 10, 2019 at 03:26:49PM +0100, fdman...@kernel.org wrote:
> From: Filipe Manana 
> 
> Sometimes when fsync'ing a file we need to log that other inodes exist and
> when we need to do that we acquire a reference on the inodes and then drop
> that reference using iput() after logging them.
> 
> That generally is not a problem except if we end up doing the final iput()
> (dropping the last reference) on the inode and that inode has a link count
> of 0, which can happen in a very short time window if the logging path
> gets a reference on the inode while it's being unlinked.
> 
> In that case we end up getting the eviction callback, btrfs_evict_inode(),
> invoked through the iput() call chain which needs to drop all of the
> inode's items from its subvolume btree, and in order to do that, it needs
> to join a transaction at the helper function evict_refill_and_join().
> However because the task previously started a transaction at the fsync
> handler, btrfs_sync_file(), it has current->journal_info already pointing
> to a transaction handle and therefore evict_refill_and_join() will get
> that transaction handle from btrfs_join_transaction(). From this point on,
> two different problems can happen:
> 
> 1) evict_refill_and_join() will often change the transaction handle's
>block reserve (->block_rsv) and set its ->bytes_reserved field to a
>value greater than 0. If evict_refill_and_join() never commits the
>transaction, the eviction handler ends up decreasing the reference
>count (->use_count) of the transaction handle through the call to
>btrfs_end_transaction(), and after that point we have a transaction
>handle with a NULL ->block_rsv (which is the value prior to the
>transaction join from evict_refill_and_join()) and a ->bytes_reserved
>value greater than 0. If after the eviction/iput completes the inode
>logging path hits an error or it decides that it must fallback to a
>transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a
>non-zero value from btrfs_log_dentry_safe(), and because of that
>non-zero value it tries to commit the transaction using a handle with
>a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes
>the transaction commit hit an assertion failure at
>btrfs_trans_release_metadata() because ->bytes_reserved is not zero but
>the ->block_rsv is NULL. The produced stack trace for that is like the
>following:
> 
>[192922.917158] assertion failed: !trans->bytes_reserved, file: 
> fs/btrfs/transaction.c, line: 816
>[192922.917553] [ cut here ]
>[192922.917922] kernel BUG at fs/btrfs/ctree.h:3532!
>[192922.918310] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC PTI
>[192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: GW 
> 5.1.4-btrfs-next-47 #1
>[192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
>[192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs]
>(...)
>[192922.920925] RSP: 0018:aebdc8a27da8 EFLAGS: 00010286
>[192922.921315] RAX: 0051 RBX: 95c9c16a41c0 RCX: 
> 
>[192922.921692] RDX:  RSI: 95cab6b16838 RDI: 
> 95cab6b16838
>[192922.922066] RBP: 95c9c16a41c0 R08:  R09: 
> 
>[192922.922442] R10: aebdc8a27e70 R11:  R12: 
> 95ca731a0980
>[192922.922820] R13:  R14: 95ca84c73338 R15: 
> 95ca731a0ea8
>[192922.923200] FS:  7f337eda4e80() GS:95cab6b0() 
> knlGS:
>[192922.923579] CS:  0010 DS:  ES:  CR0: 80050033
>[192922.923948] CR2: 7f337edad000 CR3: 0001e00f6002 CR4: 
> 003606e0
>[192922.924329] DR0:  DR1:  DR2: 
> 
>[192922.924711] DR3:  DR6: fffe0ff0 DR7: 
> 0400
>[192922.925105] Call Trace:
>[192922.925505]  btrfs_trans_release_metadata+0x10c/0x170 [btrfs]
>[192922.925911]  btrfs_commit_transaction+0x3e/0xaf0 [btrfs]
>[192922.926324]  btrfs_sync_file+0x44c/0x490 [btrfs]
>[192922.926731]  do_fsync+0x38/0x60
>[192922.927138]  __x64_sys_fdatasync+0x13/0x20
>[192922.927543]  do_syscall_64+0x60/0x1c0
>[192922.927939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>(...)
>[192922.934077] ---[ end trace f00808b12068168f ]---
> 
> 2) If evict_refill_and_join() decides to commit the transaction, it will
>be able to do it, since the nested transaction join only increments the
>transaction handle's ->use_count reference counter and it does not
>prevent the transaction from getting committed. This means that after

This brings up a good point, we should probably not allow the commit in this
case, or add an ASSERT(use_count == 1) or something, cause this 

Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-12 Thread Nikolay Borisov



On 12.09.19 г. 14:24 ч., Filipe Manana wrote:
> On Thu, Sep 12, 2019 at 12:10 PM Nikolay Borisov  wrote:
>>
>>
>>
>> On 10.09.19 г. 17:26 ч., fdman...@kernel.org wrote:
>>> From: Filipe Manana 
>>>
>>> Sometimes when fsync'ing a file we need to log that other inodes exist and
>>> when we need to do that we acquire a reference on the inodes and then drop
>>> that reference using iput() after logging them.
>>>
>>> That generally is not a problem except if we end up doing the final iput()
>>> (dropping the last reference) on the inode and that inode has a link count
>>> of 0, which can happen in a very short time window if the logging path
>>> gets a reference on the inode while it's being unlinked.
>>>
>>> In that case we end up getting the eviction callback, btrfs_evict_inode(),
>>> invoked through the iput() call chain which needs to drop all of the
>>> inode's items from its subvolume btree, and in order to do that, it needs
>>> to join a transaction at the helper function evict_refill_and_join().
>>> However because the task previously started a transaction at the fsync
>>> handler, btrfs_sync_file(), it has current->journal_info already pointing
>>> to a transaction handle and therefore evict_refill_and_join() will get
>>> that transaction handle from btrfs_join_transaction(). From this point on,
>>> two different problems can happen:
>>>
>>> 1) evict_refill_and_join() will often change the transaction handle's
>>>block reserve (->block_rsv) and set its ->bytes_reserved field to a
>>>value greater than 0. If evict_refill_and_join() never commits the
>>>transaction, the eviction handler ends up decreasing the reference
>>>count (->use_count) of the transaction handle through the call to
>>>btrfs_end_transaction(), and after that point we have a transaction
>>>handle with a NULL ->block_rsv (which is the value prior to the
>>>transaction join from evict_refill_and_join()) and a ->bytes_reserved
>>>value greater than 0. If after the eviction/iput completes the inode
>>>logging path hits an error or it decides that it must fallback to a
>>>transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a
>>>non-zero value from btrfs_log_dentry_safe(), and because of that
>>>non-zero value it tries to commit the transaction using a handle with
>>>a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes
>>>the transaction commit hit an assertion failure at
>>>btrfs_trans_release_metadata() because ->bytes_reserved is not zero but
>>>the ->block_rsv is NULL. The produced stack trace for that is like the
>>>following:
>>>
>>>[192922.917158] assertion failed: !trans->bytes_reserved, file: 
>>> fs/btrfs/transaction.c, line: 816
>>>[192922.917553] [ cut here ]
>>>[192922.917922] kernel BUG at fs/btrfs/ctree.h:3532!
>>>[192922.918310] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC PTI
>>>[192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: GW   
>>>   5.1.4-btrfs-next-47 #1
>>>[192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
>>> BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
>>>[192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs]
>>>(...)
>>>[192922.920925] RSP: 0018:aebdc8a27da8 EFLAGS: 00010286
>>>[192922.921315] RAX: 0051 RBX: 95c9c16a41c0 RCX: 
>>> 
>>>[192922.921692] RDX:  RSI: 95cab6b16838 RDI: 
>>> 95cab6b16838
>>>[192922.922066] RBP: 95c9c16a41c0 R08:  R09: 
>>> 
>>>[192922.922442] R10: aebdc8a27e70 R11:  R12: 
>>> 95ca731a0980
>>>[192922.922820] R13:  R14: 95ca84c73338 R15: 
>>> 95ca731a0ea8
>>>[192922.923200] FS:  7f337eda4e80() GS:95cab6b0() 
>>> knlGS:
>>>[192922.923579] CS:  0010 DS:  ES:  CR0: 80050033
>>>[192922.923948] CR2: 7f337edad000 CR3: 0001e00f6002 CR4: 
>>> 003606e0
>>>[192922.924329] DR0:  DR1:  DR2: 
>>> 
>>>[192922.924711] DR3:  DR6: fffe0ff0 DR7: 
>>> 0400
>>>[192922.925105] Call Trace:
>>>[192922.925505]  btrfs_trans_release_metadata+0x10c/0x170 [btrfs]
>>>[192922.925911]  btrfs_commit_transaction+0x3e/0xaf0 [btrfs]
>>>[192922.926324]  btrfs_sync_file+0x44c/0x490 [btrfs]
>>>[192922.926731]  do_fsync+0x38/0x60
>>>[192922.927138]  __x64_sys_fdatasync+0x13/0x20
>>>[192922.927543]  do_syscall_64+0x60/0x1c0
>>>[192922.927939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>(...)
>>>[192922.934077] ---[ end trace f00808b12068168f ]---
>>>
>>> 2) If evict_refill_and_join() decides to commit the transaction, it will
>> evict_refill_and_join only ever calls btrfs_join_transaction so it
>> cannot ever commit the transac

Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-12 Thread Filipe Manana
On Thu, Sep 12, 2019 at 12:10 PM Nikolay Borisov  wrote:
>
>
>
> On 10.09.19 г. 17:26 ч., fdman...@kernel.org wrote:
> > From: Filipe Manana 
> >
> > Sometimes when fsync'ing a file we need to log that other inodes exist and
> > when we need to do that we acquire a reference on the inodes and then drop
> > that reference using iput() after logging them.
> >
> > That generally is not a problem except if we end up doing the final iput()
> > (dropping the last reference) on the inode and that inode has a link count
> > of 0, which can happen in a very short time window if the logging path
> > gets a reference on the inode while it's being unlinked.
> >
> > In that case we end up getting the eviction callback, btrfs_evict_inode(),
> > invoked through the iput() call chain which needs to drop all of the
> > inode's items from its subvolume btree, and in order to do that, it needs
> > to join a transaction at the helper function evict_refill_and_join().
> > However because the task previously started a transaction at the fsync
> > handler, btrfs_sync_file(), it has current->journal_info already pointing
> > to a transaction handle and therefore evict_refill_and_join() will get
> > that transaction handle from btrfs_join_transaction(). From this point on,
> > two different problems can happen:
> >
> > 1) evict_refill_and_join() will often change the transaction handle's
> >block reserve (->block_rsv) and set its ->bytes_reserved field to a
> >value greater than 0. If evict_refill_and_join() never commits the
> >transaction, the eviction handler ends up decreasing the reference
> >count (->use_count) of the transaction handle through the call to
> >btrfs_end_transaction(), and after that point we have a transaction
> >handle with a NULL ->block_rsv (which is the value prior to the
> >transaction join from evict_refill_and_join()) and a ->bytes_reserved
> >value greater than 0. If after the eviction/iput completes the inode
> >logging path hits an error or it decides that it must fallback to a
> >transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a
> >non-zero value from btrfs_log_dentry_safe(), and because of that
> >non-zero value it tries to commit the transaction using a handle with
> >a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes
> >the transaction commit hit an assertion failure at
> >btrfs_trans_release_metadata() because ->bytes_reserved is not zero but
> >the ->block_rsv is NULL. The produced stack trace for that is like the
> >following:
> >
> >[192922.917158] assertion failed: !trans->bytes_reserved, file: 
> > fs/btrfs/transaction.c, line: 816
> >[192922.917553] [ cut here ]
> >[192922.917922] kernel BUG at fs/btrfs/ctree.h:3532!
> >[192922.918310] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC PTI
> >[192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: GW   
> >   5.1.4-btrfs-next-47 #1
> >[192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> > BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
> >[192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs]
> >(...)
> >[192922.920925] RSP: 0018:aebdc8a27da8 EFLAGS: 00010286
> >[192922.921315] RAX: 0051 RBX: 95c9c16a41c0 RCX: 
> > 
> >[192922.921692] RDX:  RSI: 95cab6b16838 RDI: 
> > 95cab6b16838
> >[192922.922066] RBP: 95c9c16a41c0 R08:  R09: 
> > 
> >[192922.922442] R10: aebdc8a27e70 R11:  R12: 
> > 95ca731a0980
> >[192922.922820] R13:  R14: 95ca84c73338 R15: 
> > 95ca731a0ea8
> >[192922.923200] FS:  7f337eda4e80() GS:95cab6b0() 
> > knlGS:
> >[192922.923579] CS:  0010 DS:  ES:  CR0: 80050033
> >[192922.923948] CR2: 7f337edad000 CR3: 0001e00f6002 CR4: 
> > 003606e0
> >[192922.924329] DR0:  DR1:  DR2: 
> > 
> >[192922.924711] DR3:  DR6: fffe0ff0 DR7: 
> > 0400
> >[192922.925105] Call Trace:
> >[192922.925505]  btrfs_trans_release_metadata+0x10c/0x170 [btrfs]
> >[192922.925911]  btrfs_commit_transaction+0x3e/0xaf0 [btrfs]
> >[192922.926324]  btrfs_sync_file+0x44c/0x490 [btrfs]
> >[192922.926731]  do_fsync+0x38/0x60
> >[192922.927138]  __x64_sys_fdatasync+0x13/0x20
> >[192922.927543]  do_syscall_64+0x60/0x1c0
> >[192922.927939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >(...)
> >[192922.934077] ---[ end trace f00808b12068168f ]---
> >
> > 2) If evict_refill_and_join() decides to commit the transaction, it will
> evict_refill_and_join only ever calls btrfs_join_transaction so it
> cannot ever commit the transaction.

It can:

https://git.kernel.org/pub/scm/linux/kern

Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-12 Thread Nikolay Borisov



On 10.09.19 г. 17:26 ч., fdman...@kernel.org wrote:
> From: Filipe Manana 
> 
> Sometimes when fsync'ing a file we need to log that other inodes exist and
> when we need to do that we acquire a reference on the inodes and then drop
> that reference using iput() after logging them.
> 
> That generally is not a problem except if we end up doing the final iput()
> (dropping the last reference) on the inode and that inode has a link count
> of 0, which can happen in a very short time window if the logging path
> gets a reference on the inode while it's being unlinked.
> 
> In that case we end up getting the eviction callback, btrfs_evict_inode(),
> invoked through the iput() call chain which needs to drop all of the
> inode's items from its subvolume btree, and in order to do that, it needs
> to join a transaction at the helper function evict_refill_and_join().
> However because the task previously started a transaction at the fsync
> handler, btrfs_sync_file(), it has current->journal_info already pointing
> to a transaction handle and therefore evict_refill_and_join() will get
> that transaction handle from btrfs_join_transaction(). From this point on,
> two different problems can happen:
> 
> 1) evict_refill_and_join() will often change the transaction handle's
>block reserve (->block_rsv) and set its ->bytes_reserved field to a
>value greater than 0. If evict_refill_and_join() never commits the
>transaction, the eviction handler ends up decreasing the reference
>count (->use_count) of the transaction handle through the call to
>btrfs_end_transaction(), and after that point we have a transaction
>handle with a NULL ->block_rsv (which is the value prior to the
>transaction join from evict_refill_and_join()) and a ->bytes_reserved
>value greater than 0. If after the eviction/iput completes the inode
>logging path hits an error or it decides that it must fallback to a
>transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a
>non-zero value from btrfs_log_dentry_safe(), and because of that
>non-zero value it tries to commit the transaction using a handle with
>a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes
>the transaction commit hit an assertion failure at
>btrfs_trans_release_metadata() because ->bytes_reserved is not zero but
>the ->block_rsv is NULL. The produced stack trace for that is like the
>following:
> 
>[192922.917158] assertion failed: !trans->bytes_reserved, file: 
> fs/btrfs/transaction.c, line: 816
>[192922.917553] [ cut here ]
>[192922.917922] kernel BUG at fs/btrfs/ctree.h:3532!
>[192922.918310] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC PTI
>[192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: GW 
> 5.1.4-btrfs-next-47 #1
>[192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
>[192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs]
>(...)
>[192922.920925] RSP: 0018:aebdc8a27da8 EFLAGS: 00010286
>[192922.921315] RAX: 0051 RBX: 95c9c16a41c0 RCX: 
> 
>[192922.921692] RDX:  RSI: 95cab6b16838 RDI: 
> 95cab6b16838
>[192922.922066] RBP: 95c9c16a41c0 R08:  R09: 
> 
>[192922.922442] R10: aebdc8a27e70 R11:  R12: 
> 95ca731a0980
>[192922.922820] R13:  R14: 95ca84c73338 R15: 
> 95ca731a0ea8
>[192922.923200] FS:  7f337eda4e80() GS:95cab6b0() 
> knlGS:
>[192922.923579] CS:  0010 DS:  ES:  CR0: 80050033
>[192922.923948] CR2: 7f337edad000 CR3: 0001e00f6002 CR4: 
> 003606e0
>[192922.924329] DR0:  DR1:  DR2: 
> 
>[192922.924711] DR3:  DR6: fffe0ff0 DR7: 
> 0400
>[192922.925105] Call Trace:
>[192922.925505]  btrfs_trans_release_metadata+0x10c/0x170 [btrfs]
>[192922.925911]  btrfs_commit_transaction+0x3e/0xaf0 [btrfs]
>[192922.926324]  btrfs_sync_file+0x44c/0x490 [btrfs]
>[192922.926731]  do_fsync+0x38/0x60
>[192922.927138]  __x64_sys_fdatasync+0x13/0x20
>[192922.927543]  do_syscall_64+0x60/0x1c0
>[192922.927939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>(...)
>[192922.934077] ---[ end trace f00808b12068168f ]---
> 
> 2) If evict_refill_and_join() decides to commit the transaction, it will
evict_refill_and_join only ever calls btrfs_join_transaction so it
cannot ever commit the transaction. Let's look into its sole caller,
btrfs_evict_inode, I see 3 cases where transactions handle is involved
in that function:

1. btrfs_commit_inode_delayed_inode - iot calls trans_join +
end_transaction => use_count = 2 -> no commit

2. btrfs_truncate_inode_items -> never calls e

Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-12 Thread Filipe Manana
On Thu, Sep 12, 2019 at 8:32 AM Sasha Levin  wrote:
>
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: 4.4+
>
> The bot has tested the following trees: v5.2.14, v4.19.72, v4.14.143, 
> v4.9.192, v4.4.192.
>
> v5.2.14: Build OK!
> v4.19.72: Failed to apply! Possible dependencies:
> 6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of different 
> files")
> a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
> unlink/rmdir")
> b8aa330d2acb ("Btrfs: improve performance on fsync of files with multiple 
> hardlinks")
>
> v4.14.143: Failed to apply! Possible dependencies:
> 0d836392cadd ("Btrfs: fix mount failure after fsync due to hard link 
> recreation")
> 1f250e929a9c ("Btrfs: fix log replay failure after unlink and link 
> combination")
> 6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of different 
> files")
> 8d9e220ca084 ("btrfs: simplify IS_ERR/PTR_ERR checks")
> a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
> unlink/rmdir")
> b8aa330d2acb ("Btrfs: improve performance on fsync of files with multiple 
> hardlinks")
>
> v4.9.192: Failed to apply! Possible dependencies:
> 0b246afa62b0 ("btrfs: root->fs_info cleanup, add fs_info convenience 
> variables")
> 0d836392cadd ("Btrfs: fix mount failure after fsync due to hard link 
> recreation")
> 1f250e929a9c ("Btrfs: fix log replay failure after unlink and link 
> combination")
> 4791c8f19c45 ("btrfs: Make btrfs_check_ref_name_override take 
> btrfs_inode")
> 6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of different 
> files")
> a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
> unlink/rmdir")
> cf8cddd38bab ("btrfs: don't abuse REQ_OP_* flags for btrfs_map_block")
> da17066c4047 ("btrfs: pull node/sector/stripe sizes out of root and into 
> fs_info")
> db0a669fb002 ("btrfs: Make btrfs_add_link take btrfs_inode")
> de143792253e ("btrfs: struct btrfsic_state->root should be an fs_info")
> fb456252d3d9 ("btrfs: root->fs_info cleanup, use fs_info->dev_root 
> everywhere")
>
> v4.4.192: Failed to apply! Possible dependencies:
> 0132761017e0 ("btrfs: fix string and comment grammatical issues and 
> typos")
> 09cbfeaf1a5a ("mm, fs: get rid of PAGE_CACHE_* and 
> page_cache_{get,release} macros")
> 0b246afa62b0 ("btrfs: root->fs_info cleanup, add fs_info convenience 
> variables")
> 0d836392cadd ("Btrfs: fix mount failure after fsync due to hard link 
> recreation")
> 0e749e54244e ("dax: increase granularity of dax_clear_blocks() 
> operations")
> 1f250e929a9c ("Btrfs: fix log replay failure after unlink and link 
> combination")
> 44f714dae50a ("Btrfs: improve performance on fsync against new inode 
> after rename/unlink")
> 4791c8f19c45 ("btrfs: Make btrfs_check_ref_name_override take 
> btrfs_inode")
> 52db400fcd50 ("pmem, dax: clean up clear_pmem()")
> 6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of different 
> files")
> 781feef7e6be ("Btrfs: fix lockdep warning about log_mutex")
> a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
> unlink/rmdir")
> b2e0d1625e19 ("dax: fix lifetime of in-kernel dax mappings with 
> dax_map_atomic()")
> bb7ab3b92e46 ("btrfs: Fix misspellings in comments.")
> cf8cddd38bab ("btrfs: don't abuse REQ_OP_* flags for btrfs_map_block")
> d1a5f2b4d8a1 ("block: use DAX for partition table reads")
> db0a669fb002 ("btrfs: Make btrfs_add_link take btrfs_inode")
> de143792253e ("btrfs: struct btrfsic_state->root should be an fs_info")
>
>
> NOTE: The patch will not be queued to stable trees until it is upstream.
>
> How should we proceed with this patch?

After it lands on Linus' tree, I'll try to send patch versions that
apply to different kernel releases.

Thanks.

>
> --
> Thanks,
> Sasha



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”


Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-12 Thread Sasha Levin
Hi,

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: 4.4+

The bot has tested the following trees: v5.2.14, v4.19.72, v4.14.143, v4.9.192, 
v4.4.192.

v5.2.14: Build OK!
v4.19.72: Failed to apply! Possible dependencies:
6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of different 
files")
a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
unlink/rmdir")
b8aa330d2acb ("Btrfs: improve performance on fsync of files with multiple 
hardlinks")

v4.14.143: Failed to apply! Possible dependencies:
0d836392cadd ("Btrfs: fix mount failure after fsync due to hard link 
recreation")
1f250e929a9c ("Btrfs: fix log replay failure after unlink and link 
combination")
6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of different 
files")
8d9e220ca084 ("btrfs: simplify IS_ERR/PTR_ERR checks")
a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
unlink/rmdir")
b8aa330d2acb ("Btrfs: improve performance on fsync of files with multiple 
hardlinks")

v4.9.192: Failed to apply! Possible dependencies:
0b246afa62b0 ("btrfs: root->fs_info cleanup, add fs_info convenience 
variables")
0d836392cadd ("Btrfs: fix mount failure after fsync due to hard link 
recreation")
1f250e929a9c ("Btrfs: fix log replay failure after unlink and link 
combination")
4791c8f19c45 ("btrfs: Make btrfs_check_ref_name_override take btrfs_inode")
6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of different 
files")
a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
unlink/rmdir")
cf8cddd38bab ("btrfs: don't abuse REQ_OP_* flags for btrfs_map_block")
da17066c4047 ("btrfs: pull node/sector/stripe sizes out of root and into 
fs_info")
db0a669fb002 ("btrfs: Make btrfs_add_link take btrfs_inode")
de143792253e ("btrfs: struct btrfsic_state->root should be an fs_info")
fb456252d3d9 ("btrfs: root->fs_info cleanup, use fs_info->dev_root 
everywhere")

v4.4.192: Failed to apply! Possible dependencies:
0132761017e0 ("btrfs: fix string and comment grammatical issues and typos")
09cbfeaf1a5a ("mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} 
macros")
0b246afa62b0 ("btrfs: root->fs_info cleanup, add fs_info convenience 
variables")
0d836392cadd ("Btrfs: fix mount failure after fsync due to hard link 
recreation")
0e749e54244e ("dax: increase granularity of dax_clear_blocks() operations")
1f250e929a9c ("Btrfs: fix log replay failure after unlink and link 
combination")
44f714dae50a ("Btrfs: improve performance on fsync against new inode after 
rename/unlink")
4791c8f19c45 ("btrfs: Make btrfs_check_ref_name_override take btrfs_inode")
52db400fcd50 ("pmem, dax: clean up clear_pmem()")
6b5fc433a7ad ("Btrfs: fix fsync after succession of renames of different 
files")
781feef7e6be ("Btrfs: fix lockdep warning about log_mutex")
a3baaf0d786e ("Btrfs: fix fsync after succession of renames and 
unlink/rmdir")
b2e0d1625e19 ("dax: fix lifetime of in-kernel dax mappings with 
dax_map_atomic()")
bb7ab3b92e46 ("btrfs: Fix misspellings in comments.")
cf8cddd38bab ("btrfs: don't abuse REQ_OP_* flags for btrfs_map_block")
d1a5f2b4d8a1 ("block: use DAX for partition table reads")
db0a669fb002 ("btrfs: Make btrfs_add_link take btrfs_inode")
de143792253e ("btrfs: struct btrfsic_state->root should be an fs_info")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

--
Thanks,
Sasha


Re: [PATCH] Btrfs: fix assertion failure during fsync and use of stale transaction

2019-09-11 Thread Josef Bacik
On Tue, Sep 10, 2019 at 03:26:49PM +0100, fdman...@kernel.org wrote:
> From: Filipe Manana 
> 
> Sometimes when fsync'ing a file we need to log that other inodes exist and
> when we need to do that we acquire a reference on the inodes and then drop
> that reference using iput() after logging them.
> 
> That generally is not a problem except if we end up doing the final iput()
> (dropping the last reference) on the inode and that inode has a link count
> of 0, which can happen in a very short time window if the logging path
> gets a reference on the inode while it's being unlinked.
> 
> In that case we end up getting the eviction callback, btrfs_evict_inode(),
> invoked through the iput() call chain which needs to drop all of the
> inode's items from its subvolume btree, and in order to do that, it needs
> to join a transaction at the helper function evict_refill_and_join().
> However because the task previously started a transaction at the fsync
> handler, btrfs_sync_file(), it has current->journal_info already pointing
> to a transaction handle and therefore evict_refill_and_join() will get
> that transaction handle from btrfs_join_transaction(). From this point on,
> two different problems can happen:
> 
> 1) evict_refill_and_join() will often change the transaction handle's
>block reserve (->block_rsv) and set its ->bytes_reserved field to a
>value greater than 0. If evict_refill_and_join() never commits the
>transaction, the eviction handler ends up decreasing the reference
>count (->use_count) of the transaction handle through the call to
>btrfs_end_transaction(), and after that point we have a transaction
>handle with a NULL ->block_rsv (which is the value prior to the
>transaction join from evict_refill_and_join()) and a ->bytes_reserved
>value greater than 0. If after the eviction/iput completes the inode
>logging path hits an error or it decides that it must fallback to a
>transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a
>non-zero value from btrfs_log_dentry_safe(), and because of that
>non-zero value it tries to commit the transaction using a handle with
>a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes
>the transaction commit hit an assertion failure at
>btrfs_trans_release_metadata() because ->bytes_reserved is not zero but
>the ->block_rsv is NULL. The produced stack trace for that is like the
>following:
> 
>[192922.917158] assertion failed: !trans->bytes_reserved, file: 
> fs/btrfs/transaction.c, line: 816
>[192922.917553] [ cut here ]
>[192922.917922] kernel BUG at fs/btrfs/ctree.h:3532!
>[192922.918310] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC PTI
>[192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: GW 
> 5.1.4-btrfs-next-47 #1
>[192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
>[192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs]
>(...)
>[192922.920925] RSP: 0018:aebdc8a27da8 EFLAGS: 00010286
>[192922.921315] RAX: 0051 RBX: 95c9c16a41c0 RCX: 
> 
>[192922.921692] RDX:  RSI: 95cab6b16838 RDI: 
> 95cab6b16838
>[192922.922066] RBP: 95c9c16a41c0 R08:  R09: 
> 
>[192922.922442] R10: aebdc8a27e70 R11:  R12: 
> 95ca731a0980
>[192922.922820] R13:  R14: 95ca84c73338 R15: 
> 95ca731a0ea8
>[192922.923200] FS:  7f337eda4e80() GS:95cab6b0() 
> knlGS:
>[192922.923579] CS:  0010 DS:  ES:  CR0: 80050033
>[192922.923948] CR2: 7f337edad000 CR3: 0001e00f6002 CR4: 
> 003606e0
>[192922.924329] DR0:  DR1:  DR2: 
> 
>[192922.924711] DR3:  DR6: fffe0ff0 DR7: 
> 0400
>[192922.925105] Call Trace:
>[192922.925505]  btrfs_trans_release_metadata+0x10c/0x170 [btrfs]
>[192922.925911]  btrfs_commit_transaction+0x3e/0xaf0 [btrfs]
>[192922.926324]  btrfs_sync_file+0x44c/0x490 [btrfs]
>[192922.926731]  do_fsync+0x38/0x60
>[192922.927138]  __x64_sys_fdatasync+0x13/0x20
>[192922.927543]  do_syscall_64+0x60/0x1c0
>[192922.927939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>(...)
>[192922.934077] ---[ end trace f00808b12068168f ]---
> 
> 2) If evict_refill_and_join() decides to commit the transaction, it will
>be able to do it, since the nested transaction join only increments the
>transaction handle's ->use_count reference counter and it does not
>prevent the transaction from getting committed. This means that after
>eviction completes, the fsync logging path will be using a transaction
>handle that refers to an already committed transaction. What