I ran 'btrfs check --repair --init-extent-tree' and appear to be in an
infinite loop. It performed heavy IO for about 1.5 hours then the IO
stopped and the CPU stayed at 100%. It's been like that for more than 12
hours now.

I made a hardware change last week that resulted in unstable RAM, so I
suspect some corrupt data was written to disk. I tried mounting with
-o recovery,clear_cache,nospace_cache, but got a panic shortly
thereafter. I tried 'btrfs check --repair' and also got a panic. I
finally tried 'btrfs check --repair --init-extent-tree' and hit an
assertion failure with btrfs-progs 3.16.

After noticing some promising commits, I built btrfs-progs from the
integration repo (kdave) and re-ran the repair with v3.16.1. It got
further (about 2 hours) but then got stuck in this infinite loop.

Here's the backtrace of where it is now and has been for hours:

#0  0x0000000000438f01 in free_some_buffers (tree=0xda3078) at extent_io.c:553
#1  __alloc_extent_buffer (blocksize=4096, bytenr=<optimized out>, tree=0xda3078) at extent_io.c:592
#2  alloc_extent_buffer (tree=0xda3078, bytenr=<optimized out>, blocksize=4096) at extent_io.c:671
#3  0x000000000042be29 in btrfs_find_create_tree_block (root=root@entry=0xda34a0, bytenr=<optimized out>, blocksize=<optimized out>) at disk-io.c:133
#4  0x000000000042d683 in read_tree_block (root=0xda34a0, bytenr=<optimized out>, blocksize=<optimized out>, parent_transid=161580) at disk-io.c:260
#5  0x0000000000427c58 in read_node_slot (root=root@entry=0xda34a0, parent=parent@entry=0x165ab88c0, slot=slot@entry=43) at ctree.c:634
#6  0x0000000000428558 in push_leaf_right (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, path=path@entry=0xde317a0, data_size=data_size@entry=67, empty=empty@entry=0) at ctree.c:1608
#7  0x0000000000428e4c in split_leaf (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, ins_key=ins_key@entry=0x7fff24da24b0, path=path@entry=0xde317a0, data_size=data_size@entry=67, extend=extend@entry=0) at ctree.c:1977
#8  0x000000000042aa54 in btrfs_search_slot (trans=0xe709b0, root=root@entry=0xda34a0, key=key@entry=0x7fff24da24b0, p=p@entry=0xde317a0, ins_len=ins_len@entry=67, cow=cow@entry=1) at ctree.c:1120
#9  0x000000000042af51 in btrfs_insert_empty_items (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, path=path@entry=0xde317a0, cpu_key=cpu_key@entry=0x7fff24da24b0, data_size=data_size@entry=0x7fff24da24a0, nr=nr@entry=1) at ctree.c:2412
#10 0x00000000004175f6 in btrfs_insert_empty_item (data_size=42, key=0x7fff24da24b0, path=0xde317a0, root=0xda34a0, trans=0xe709b0) at ctree.h:2312
#11 record_extent (flags=0, allocated=<optimized out>, back=0x95cb3d90, rec=0x95cb3cc0, path=0xde317a0, info=0xda3010, trans=0xe709b0) at cmds-check.c:4438
#12 fixup_extent_refs (trans=trans@entry=0xe709b0, info=<optimized out>, extent_cache=extent_cache@entry=0x7fff24da2970, rec=rec@entry=0x95cb3cc0) at cmds-check.c:5287
#13 0x000000000041ac01 in check_extent_refs (extent_cache=0x7fff24da2970, root=<optimized out>, trans=<optimized out>) at cmds-check.c:5511
#14 check_chunks_and_extents (root=root@entry=0xfa7c70) at cmds-check.c:5978
#15 0x000000000041bdd9 in cmd_check (argc=<optimized out>, argv=<optimized out>) at cmds-check.c:6723
#16 0x0000000000404481 in main (argc=4, argv=0x7fff24da2fe0) at btrfs.c:247

I checked node, node->next, node->next->next, node->next->prev, etc. and
saw no obvious loop, at least not in the immediate vicinity of node. The
value of node is different each time I check it.
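
Checking a handful of neighbours by hand can miss a cycle that sits
further along the list, so a full walk may be more conclusive. Below is
a minimal sketch of such a check, assuming the buffer list is a
kernel-style circular list of struct list_head nodes (as the list.h
frames in the second backtrace suggest). The helper name
list_returns_to_head is hypothetical and is not part of btrfs-progs; it
uses Floyd's tortoise-and-hare walk to tell a healthy circular list from
one whose next pointers loop without ever coming back to the head.

```c
#include <stddef.h>

/* Minimal stand-in for the list node type declared in list.h. */
struct list_head {
    struct list_head *next, *prev;
};

/*
 * Hypothetical helper (not part of btrfs-progs): follow ->next from
 * head using Floyd's tortoise-and-hare.
 *
 * Returns  1 if the walk comes back to head (healthy circular list),
 *          0 if it is trapped in a cycle that never reaches head,
 *         -1 if it runs into a NULL next pointer.
 */
static int list_returns_to_head(const struct list_head *head)
{
    const struct list_head *slow = head;
    const struct list_head *fast = head;

    for (;;) {
        /* Hare: two steps, checking for NULL and head after each. */
        fast = fast->next;
        if (fast == NULL)
            return -1;
        if (fast == head)
            return 1;
        fast = fast->next;
        if (fast == NULL)
            return -1;
        if (fast == head)
            return 1;

        /* Tortoise: one step. Meeting the hare means a cycle. */
        slow = slow->next;
        if (slow == fast)
            return 0;
    }
}
```

This only inspects the next pointers; a similar walk over prev would be
needed to check the other direction. A result of 1 would only rule out a
corrupted list, not a logic error in the code that walks it.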

I'll periodically see the following backtrace:

#0  __list_del (next=0x1326fe820, prev=0xda3088) at list.h:113
#1  list_move_tail (head=0xda3088, list=0x1514b40f0) at list.h:183
#2  free_some_buffers (tree=0xda3078) at extent_io.c:560
#3  __alloc_extent_buffer (blocksize=4096, bytenr=<optimized out>, tree=0xda3078) at extent_io.c:592
#4  alloc_extent_buffer (tree=0xda3078, bytenr=<optimized out>, blocksize=4096) at extent_io.c:671
#5  0x000000000042be29 in btrfs_find_create_tree_block (root=root@entry=0xda34a0, bytenr=<optimized out>, blocksize=<optimized out>) at disk-io.c:133
#6  0x000000000042d683 in read_tree_block (root=0xda34a0, bytenr=<optimized out>, blocksize=<optimized out>, parent_transid=161580) at disk-io.c:260
#7  0x0000000000427c58 in read_node_slot (root=root@entry=0xda34a0, parent=parent@entry=0x165ab88c0, slot=slot@entry=43) at ctree.c:634
#8  0x0000000000428558 in push_leaf_right (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, path=path@entry=0xde317a0, data_size=data_size@entry=67, empty=empty@entry=0) at ctree.c:1608
#9  0x0000000000428e4c in split_leaf (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, ins_key=ins_key@entry=0x7fff24da24b0, path=path@entry=0xde317a0, data_size=data_size@entry=67, extend=extend@entry=0) at ctree.c:1977
#10 0x000000000042aa54 in btrfs_search_slot (trans=0xe709b0, root=root@entry=0xda34a0, key=key@entry=0x7fff24da24b0, p=p@entry=0xde317a0, ins_len=ins_len@entry=67, cow=cow@entry=1) at ctree.c:1120
#11 0x000000000042af51 in btrfs_insert_empty_items (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, path=path@entry=0xde317a0, cpu_key=cpu_key@entry=0x7fff24da24b0, data_size=data_size@entry=0x7fff24da24a0, nr=nr@entry=1) at ctree.c:2412
#12 0x00000000004175f6 in btrfs_insert_empty_item (data_size=42, key=0x7fff24da24b0, path=0xde317a0, root=0xda34a0, trans=0xe709b0) at ctree.h:2312
#13 record_extent (flags=0, allocated=<optimized out>, back=0x95cb3d90, rec=0x95cb3cc0, path=0xde317a0, info=0xda3010, trans=0xe709b0) at cmds-check.c:4438
#14 fixup_extent_refs (trans=trans@entry=0xe709b0, info=<optimized out>, extent_cache=extent_cache@entry=0x7fff24da2970, rec=rec@entry=0x95cb3cc0) at cmds-check.c:5287
#15 0x000000000041ac01 in check_extent_refs (extent_cache=0x7fff24da2970, root=<optimized out>, trans=<optimized out>) at cmds-check.c:5511
#16 check_chunks_and_extents (root=root@entry=0xfa7c70) at cmds-check.c:5978
#17 0x000000000041bdd9 in cmd_check (argc=<optimized out>, argv=<optimized out>) at cmds-check.c:6723
#18 0x0000000000404481 in main (argc=4, argv=0x7fff24da2fe0) at btrfs.c:247

If there's interest in debugging, I can leave this machine in this
state for a few days. It's just a backup server, so losing the fs
won't be the end of the world.

--Larkin
