[BUG] delayed inodes and reflinks

2011-07-05 Thread Jan Schmidt
Hi,

I hit this bug an hour ago while executing some cp --reflink:

Jul  5 13:54:02 oglaroon kernel: [ 2654.545244] ------------[ cut here
]------------
Jul  5 13:54:02 oglaroon kernel: [ 2654.600508] kernel BUG at
fs/btrfs/delayed-inode.c:1637!
Jul  5 13:54:02 oglaroon kernel: [ 2654.664052] invalid opcode: 
[#1] SMP
Jul  5 13:54:02 oglaroon kernel: [ 2654.713244] last sysfs file:
/sys/devices/pci:00/:00:1c.4/:04:00.0/net/eth3/broadcast
Jul  5 13:54:02 oglaroon kernel: [ 2654.819429] CPU 1
Jul  5 13:54:02 oglaroon kernel: [ 2654.841372] Modules linked in: btrfs
mpt2sas scsi_transport_sas raid_class [last unloaded: btrfs]
Jul  5 13:54:02 oglaroon kernel: [ 2654.950364]
Jul  5 13:54:02 oglaroon kernel: [ 2654.968147] Pid: 22343, comm: cp
Tainted: GW   2.6.39+ #2 Supermicro X8SIL/X8SIL
Jul  5 13:54:02 oglaroon kernel: [ 2655.065386] RIP:
0010:[a0222490]  [a0222490]
btrfs_delayed_update_inode+0x120/0x130 [btrfs]
Jul  5 13:54:02 oglaroon kernel: [ 2655.186237] RSP:
0018:88023010dbd8  EFLAGS: 00010286
Jul  5 13:54:02 oglaroon kernel: [ 2655.249780] RAX: ffe4
RBX: 8802356bec40 RCX: 00018000
Jul  5 13:54:02 oglaroon kernel: [ 2655.335164] RDX: 0047
RSI:  RDI: 880230a84390
Jul  5 13:54:02 oglaroon kernel: [ 2655.420652] RBP: 88023010dc18
R08: 825eb2a0 R09: 0001
Jul  5 13:54:02 oglaroon kernel: [ 2655.506036] R10: 03e0
R11: 8802317c4560 R12: 8802356bec88
Jul  5 13:54:02 oglaroon kernel: [ 2655.591419] R13: 88023569c6f8
R14: 88023334f000 R15: 880234318000
Jul  5 13:54:02 oglaroon kernel: [ 2655.676803] FS:
7f07d396c700() GS:88023fc4() knlGS:
Jul  5 13:54:02 oglaroon kernel: [ 2655.773733] CS:  0010 DS:  ES:
 CR0: 80050033
Jul  5 13:54:02 oglaroon kernel: [ 2655.842476] CR2: 00407407
CR3: 000230974000 CR4: 06e0
Jul  5 13:54:02 oglaroon kernel: [ 2655.927859] DR0: 
DR1:  DR2: 
Jul  5 13:54:02 oglaroon kernel: [ 2656.013245] DR3: 
DR6: 0ff0 DR7: 0400
Jul  5 13:54:02 oglaroon kernel: [ 2656.098733] Process cp (pid: 22343,
threadinfo 88023010c000, task 8802317c3e80)
Jul  5 13:54:02 oglaroon kernel: [ 2656.194516] Stack:
Jul  5 13:54:02 oglaroon kernel: [ 2656.218540]  88023010dc38
00018000 013b 88023569c6f8
Jul  5 13:54:02 oglaroon kernel: [ 2656.307564]  88023334f000
88023569c6f8 88023568d000 
Jul  5 13:54:02 oglaroon kernel: [ 2656.396483]  88023010dc68
a01db4de 0068 
Jul  5 13:54:02 oglaroon kernel: [ 2656.485403] Call Trace:
Jul  5 13:54:02 oglaroon kernel: [ 2656.514742]  [a01db4de]
btrfs_update_inode+0x3e/0x150 [btrfs]
Jul  5 13:54:02 oglaroon kernel: [ 2656.593884]  [a0209160]
btrfs_ioctl_clone+0x9e0/0xca0 [btrfs]
Jul  5 13:54:02 oglaroon kernel: [ 2656.673022]  [81151f00] ?
might_fault+0x40/0xa0
Jul  5 13:54:02 oglaroon kernel: [ 2656.737613]  [a0209b05]
btrfs_ioctl+0x335/0xf70 [btrfs]
Jul  5 13:54:02 oglaroon kernel: [ 2656.810612]  [81151f57] ?
might_fault+0x97/0xa0
Jul  5 13:54:02 oglaroon kernel: [ 2656.875198]  [81151f0e] ?
might_fault+0x4e/0xa0
Jul  5 13:54:02 oglaroon kernel: [ 2656.939782]  [81859006] ?
_raw_spin_unlock+0x26/0x30
Jul  5 13:54:02 oglaroon kernel: [ 2657.009567]  [8117fcd3] ?
cp_new_stat+0xf3/0x110
Jul  5 13:54:02 oglaroon kernel: [ 2657.075325]  [8118cb5c]
do_vfs_ioctl+0x9c/0x560
Jul  5 13:54:02 oglaroon kernel: [ 2657.139982]  [818607ac] ?
sysret_check+0x27/0x62
Jul  5 13:54:02 oglaroon kernel: [ 2657.205604]  [8118d0b9]
sys_ioctl+0x99/0xa0
Jul  5 13:54:02 oglaroon kernel: [ 2657.266133]  [8186077b]
system_call_fastpath+0x16/0x1b
Jul  5 13:54:02 oglaroon kernel: [ 2657.337995] Code: f8 05 00 00 8d 0c
49 48 89 ca 48 89 4d c8 e8 c8 c7 f9 ff 85 c0 48 8b 4d c8 75 10 48 89 4b
08 e9 3d ff ff ff 0f 1f 80 00 00 00 00 0f 0b eb fe 66 66 66 2e 0f 1f
84 00 00 00 00 00 55 48 89 e5 41
Jul  5 13:54:02 oglaroon kernel: [ 2657.570642] RIP
[a0222490] btrfs_delayed_update_inode+0x120/0x130 [btrfs]
Jul  5 13:54:02 oglaroon kernel: [ 2657.663516]  RSP 88023010dbd8
Jul  5 13:54:02 oglaroon kernel: [ 2657.705561] ---[ end trace
0ae6cc23c8022b5b ]---

I was testing some completely different modifications I made myself, but
I'm quite certain that my changes did not trigger this error. With the
fs I could reproducibly get to this bug by creating 10 reflinks of a
certain file in a shell loop.

I lost the file system while trying to set up a clean, tight test case.
The fs I used had something like 50 files, some reflinks and a snapshot
with some files deleted. The tree had explicit backrefs and shared backrefs.

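Anyway, I could not set up a new file system triggering this bug. The
line that triggered is the BUG_ON in btrfs_delayed_update_inode (line
1693 in for-linus branch). We seem to have missed some reservation in
some special case. The patch Miao sent some days ago does not
appear related at first sight.

If I get back to a situation where I can reproduce the bug, I'll send a
follow-up.

-Jan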

Re: [BUG] delayed inodes and reflinks

2011-07-05 Thread Miao Xie
On Tue, 05 Jul 2011 15:25:12 +0200, Jan Schmidt wrote:
 I hit this bug an hour ago while executing some cp --reflink:
 
 Jul  5 13:54:02 oglaroon kernel: [ 2654.545244] ------------[ cut here
 ]------------
 Jul  5 13:54:02 oglaroon kernel: [ 2654.600508] kernel BUG at
 fs/btrfs/delayed-inode.c:1637!
[SNIP]
 Jul  5 13:54:02 oglaroon kernel: [ 2656.485403] Call Trace:
 Jul  5 13:54:02 oglaroon kernel: [ 2656.514742]  [a01db4de]
 btrfs_update_inode+0x3e/0x150 [btrfs]
 Jul  5 13:54:02 oglaroon kernel: [ 2656.593884]  [a0209160]
 btrfs_ioctl_clone+0x9e0/0xca0 [btrfs]
 Jul  5 13:54:02 oglaroon kernel: [ 2656.673022]  [81151f00] ?
 might_fault+0x40/0xa0
 Jul  5 13:54:02 oglaroon kernel: [ 2656.737613]  [a0209b05]
 btrfs_ioctl+0x335/0xf70 [btrfs]
 Jul  5 13:54:02 oglaroon kernel: [ 2656.810612]  [81151f57] ?
 might_fault+0x97/0xa0
 Jul  5 13:54:02 oglaroon kernel: [ 2656.875198]  [81151f0e] ?
 might_fault+0x4e/0xa0
 Jul  5 13:54:02 oglaroon kernel: [ 2656.939782]  [81859006] ?
 _raw_spin_unlock+0x26/0x30
 Jul  5 13:54:02 oglaroon kernel: [ 2657.009567]  [8117fcd3] ?
 cp_new_stat+0xf3/0x110
 Jul  5 13:54:02 oglaroon kernel: [ 2657.075325]  [8118cb5c]
 do_vfs_ioctl+0x9c/0x560
 Jul  5 13:54:02 oglaroon kernel: [ 2657.139982]  [818607ac] ?
 sysret_check+0x27/0x62
 Jul  5 13:54:02 oglaroon kernel: [ 2657.205604]  [8118d0b9]
 sys_ioctl+0x99/0xa0
 Jul  5 13:54:02 oglaroon kernel: [ 2657.266133]  [8186077b]
 system_call_fastpath+0x16/0x1b
 Jul  5 13:54:02 oglaroon kernel: [ 2657.337995] Code: f8 05 00 00 8d 0c
 49 48 89 ca 48 89 4d c8 e8 c8 c7 f9 ff 85 c0 48 8b 4d c8 75 10 48 89 4b
 08 e9 3d ff ff ff 0f 1f 80 00 00 00 00 0f 0b eb fe 66 66 66 2e 0f 1f
 84 00 00 00 00 00 55 48 89 e5 41
 Jul  5 13:54:02 oglaroon kernel: [ 2657.570642] RIP
 [a0222490] btrfs_delayed_update_inode+0x120/0x130 [btrfs]
 Jul  5 13:54:02 oglaroon kernel: [ 2657.663516]  RSP 88023010dbd8
 Jul  5 13:54:02 oglaroon kernel: [ 2657.705561] ---[ end trace
 0ae6cc23c8022b5b ]---
 
 I was testing some completely different modifications I made myself, but
 I'm quite certain that my changes did not trigger this error. With the
 fs I could reproducibly get to this bug by creating 10 reflinks of a
 certain file in a shell loop.
 
 I lost the file system while trying to set up a clean, tight test case.
 The fs I used had something like 50 files, some reflinks and a snapshot
 with some files deleted. The tree had explicit backrefs and shared backrefs.
 
 Anyway, I could not set up a new file system triggering this bug. The
 line that triggered is the BUG_ON in btrfs_delayed_update_inode (line
 1693 in for-linus branch). We seem to have missed some reservation in
 some special case. The patch Miao sent some days ago does not
 appear related at first sight.

I think you are right. btrfs_ioctl_clone() doesn't reserve enough space: we
need to reserve space for at least 3 items, not just 1:

  1 for the old extents that will be dropped (in some cases, we may need more)
  1 for the new extent
  1 for the inode

Maybe we need to search the fs tree to find out how many old extents need to
be dropped, and then reserve free space accurately.
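
To make the counting above concrete, here is a rough user-space sketch. It is
not btrfs code: struct extent, count_dropped_extents() and
clone_items_needed() are hypothetical names invented for illustration, and the
"at least one item for dropped extents" floor is only one reading of the list
above. It models the destination file's extents as byte ranges, counts how
many overlap the range being cloned into, and adds one item for the new
extent and one for the inode.

```c
#include <stddef.h>

/* Hypothetical model of one file extent: byte range [start, start + len). */
struct extent {
    unsigned long long start;
    unsigned long long len;
};

/* Count the extents that overlap the clone target range [off, off + len). */
static size_t count_dropped_extents(const struct extent *ext, size_t n,
                                    unsigned long long off,
                                    unsigned long long len)
{
    size_t dropped = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned long long end = ext[i].start + ext[i].len;

        if (ext[i].start < off + len && end > off)
            dropped++;
    }
    return dropped;
}

/*
 * Items to reserve under the assumption stated above:
 * max(dropped extents, 1) + 1 new extent + 1 inode, so never fewer than 3.
 */
static size_t clone_items_needed(const struct extent *ext, size_t n,
                                 unsigned long long off,
                                 unsigned long long len)
{
    size_t dropped = count_dropped_extents(ext, n, off, len);

    if (dropped < 1)
        dropped = 1;
    return dropped + 2;
}
```

For a destination with extents at [0, 4096), [4096, 12288) and
[16384, 20480), cloning over [0, 12288) overlaps the first two, so this model
asks for 2 + 2 = 4 items; with no overlapping extents it falls back to 3.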

Thanks
Miao

 
 If I get back to a situation where I can reproduce the bug, I'll send a
 follow-up.
 
 -Jan
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
