Hello, 

Recently I started getting the following crash on some servers
running btrfs:

[340435.480338] BTRFS info (device loop7): disk space caching is enabled
[340435.480509] BTRFS: has skinny extents
[340441.716174] BTRFS: checking UUID tree
[340441.912070] BUG: unable to handle kernel NULL pointer dereference at 
0000000000000098
[340441.912463] IP: [<ffffffffa081f774>] btrfs_uuid_tree_iterate+0xf4/0x2d0 
[btrfs]
[340441.912823] PGD 0 
[340441.913035] Oops: 0000 [#1] SMP 
[340441.913302] Modules linked in: 
[340441.916996] CPU: 10 PID: 24990 Comm: btrfs-uuid Tainted: P        W  O    
4.4.14-clouder1 #55
[340441.917287] Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.2 
01/16/2015
[340441.917573] task: ffff8801b95c1b80 ti: ffff88034e504000 task.ti: 
ffff88034e504000
[340441.917859] RIP: 0010:[<ffffffffa081f774>]  [<ffffffffa081f774>] 
btrfs_uuid_tree_iterate+0xf4/0x2d0 [btrfs]
[340441.918212] RSP: 0018:ffff88034e507e20  EFLAGS: 00010246
[340441.918382] RAX: 0000000000000000 RBX: 0000160000000000 RCX: 
ffff880000000000
[340441.918665] RDX: 0000000000000001 RSI: ffff8801e3abd140 RDI: 
ffff88046f027f00
[340441.918952] RBP: ffff88034e507ea8 R08: 000060fb80001760 R09: 
ffffffffa07ac1de
[340441.919236] R10: ffffe8ffffd41760 R11: ffffea00078eaf40 R12: 
ffff8801b98ab750
[340441.919521] R13: 00000000fffffffe R14: ffff8801e3abd140 R15: 
ffff880049586000
[340441.919810] FS:  0000000000000000(0000) GS:ffff88047fd40000(0000) 
knlGS:0000000000000000
[340441.920097] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[340441.920267] CR2: 0000000000000098 CR3: 0000000001c0a000 CR4: 
00000000000406e0
[340441.920554] Stack:
[340441.920717]  ffff880049586000 ffff8801b98ab750 00003f7b00014fc0 
ffff8803711dec08
[340441.921186]  ffffffffa07d0c40 ffff880332342000 0000000000000114 
1b7088046d7612f8
[340441.921655]  8cfb42689378e508 70157e0ade97f5d6 8c42689378e5081b 
15157e0ade97f5d6
[340441.922126] Call Trace:
[340441.922315]  [<ffffffffa07d0c40>] ? find_live_mirror.isra.18+0xc0/0xc0 
[btrfs]
[340441.922614]  [<ffffffffa07d0ae0>] ? btrfs_uuid_scan_kthread+0x3c0/0x3c0 
[btrfs]
[340441.922917]  [<ffffffffa07d0afb>] btrfs_uuid_rescan_kthread+0x1b/0x60 
[btrfs]
[340441.923197]  [<ffffffff8107161f>] kthread+0xef/0x110
[340441.923363]  [<ffffffff81071530>] ? kthread_park+0x60/0x60
[340441.923531]  [<ffffffff816149ff>] ret_from_fork+0x3f/0x70
[340441.923697]  [<ffffffff81071530>] ? kthread_park+0x60/0x60
[340441.923863] Code: 0f 86 a0 00 00 00 48 bb 00 00 00 00 00 16 00 00 41 8b 44 
24 40 48 b9 00 00 00 00 00 88 ff ff 8d 50 01 49 8b 04 24 41 89 54 24 40 <48> 03 
98 98 00 00 00 48 89 d8 48 c1 f8 06 48 c1 e0 0c 3b 54 08 
[340441.927296] RIP  [<ffffffffa081f774>] btrfs_uuid_tree_iterate+0xf4/0x2d0 
[btrfs]
[340441.927641]  RSP <ffff88034e507e20>
[340441.927806] CR2: 0000000000000098


ffffffffa081f774 is in the heavily inlined btrfs_next_item. Here
are the decoded instructions right before the crash, with annotations:

   0:   0f 86 a0 00 00 00       jbe    0xa6
   6:   48 bb 00 00 00 00 00    mov    $0x160000000000,%rbx
   d:   16 00 00 
  10:   41 8b 44 24 40          mov    0x40(%r12),%eax ; r12 is btrfs_path, eax = path->slots[0]
  15:   48 b9 00 00 00 00 00    mov    $0xffff880000000000,%rcx
  1c:   88 ff ff 
  1f:   8d 50 01                lea    0x1(%rax),%edx ; incr slot
  22:   49 8b 04 24             mov    (%r12),%rax ; load first extent_buffer 
in rax
  26:   41 89 54 24 40          mov    %edx,0x40(%r12) ; save incremented slot
  2b:*  48 03 98 98 00 00 00    add    0x98(%rax),%rbx <-- trapping instruction 
; load the first page from the extent_buffer
  32:   48 89 d8                mov    %rbx,%rax
  35:   48 c1 f8 06             sar    $0x6,%rax
  39:   48 c1 e0 0c             shl    $0xc,%rax
  3d:   3b                      .byte 0x3b
  3e:   54                      push   %rsp
  3f:   08                      .byte 0x8
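
As a sanity check on the annotations, the 0x40 offset used for the slot
can be reproduced in user space. This is only a sketch: mock_btrfs_path
and mock_extent_buffer are hypothetical stand-ins that mirror the first
two fields of the path as they appear in the dump further down (8
pointers followed by 8 ints); they are not the kernel definitions. On an
LP64 target such as x86-64, 8 pointers of 8 bytes each put slots[0] at
0x40, which matches the `0x40(%r12)` load and store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MOCK_MAX_LEVEL 8 /* the dump shows 8 entries per array */

struct mock_extent_buffer; /* opaque stand-in; its layout is not needed here */

/* Hypothetical mirror of the first two fields of the path. */
struct mock_btrfs_path {
	struct mock_extent_buffer *nodes[MOCK_MAX_LEVEL];
	int slots[MOCK_MAX_LEVEL];
};

/* nodes[] is the first member, so (%r12) reads nodes[0]. */
static_assert(offsetof(struct mock_btrfs_path, nodes) == 0,
	      "nodes must start at offset 0");

/* Byte offset of slots[0] from the start of the path. */
size_t mock_slots_offset(void)
{
	return offsetof(struct mock_btrfs_path, slots);
}

/* Print the two offsets that the annotated instructions rely on. */
void mock_print_offsets(void)
{
	printf("nodes[0] at 0x%zx, slots[0] at 0x%zx\n",
	       offsetof(struct mock_btrfs_path, nodes),
	       mock_slots_offset());
}
```

Calling mock_print_offsets() from a small main() prints both offsets for
the machine it is built on.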

As can be seen, rax is zero, i.e. path->nodes[0] is NULL, so the
load from rax + 0x98 faults at address 0x98, which matches CR2.
What's interesting is the content of the btrfs_path:

struct btrfs_path {
  nodes = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 
  slots = {1, 0, 0, 0, 0, 0, 0, 0}, 
  locks = {0, 0, 0, 0, 0, 0, 0, 0}, 
  reada = 0, 
  lowest_level = 0, 
  search_for_split = 0, 
  keep_locks = 0, 
  skip_locking = 0, 
  leave_spinning = 0, 
  search_commit_root = 0, 
  need_commit_sem = 0, 
  skip_release_on_error = 0
}

Any ideas how the btrfs_path can end up all zero? The 1 in
the first slot comes from the increment in btrfs_next_old_item.
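
To illustrate the ordering I mean, here is a simplified user-space model
of that step. It is a sketch based on the disassembly above, not the
kernel source: the mock_ types, the nritems field and the -EFAULT return
are assumptions made for illustration. The slot is incremented and
stored first, and only afterwards is nodes[0] dereferenced, so a path
whose nodes are all NULL ends up with slots[0] == 1 at the point of the
fault.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MOCK_MAX_LEVEL 8

/* Hypothetical stand-in; nritems models the leaf's item count. */
struct mock_extent_buffer {
	unsigned int nritems;
};

struct mock_btrfs_path {
	struct mock_extent_buffer *nodes[MOCK_MAX_LEVEL];
	int slots[MOCK_MAX_LEVEL];
};

/*
 * Model of the step order seen in the disassembly.
 * Returns 0 if the next item is in the same leaf, 1 if the caller
 * would have to move on to the next leaf, and -EFAULT at the point
 * where the kernel oopsed.
 */
int mock_next_item(struct mock_btrfs_path *p)
{
	assert(p != NULL);

	++p->slots[0];			/* mov %edx,0x40(%r12) */

	/* The crashing code path evidently has no such check. */
	if (p->nodes[0] == NULL)
		return -EFAULT;

	if ((unsigned int)p->slots[0] >= p->nodes[0]->nritems)
		return 1;
	return 0;
}
```

Calling mock_next_item() on a zero-initialized path returns -EFAULT and
leaves slots[0] at 1 with every node still NULL, which is the state
shown in the dump.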

Regards, 
Nikolay 

