Bug#841144: kernel BUG at /build/linux-Wgpe2M/linux-4.8.11/fs/ocfs2/alloc.c:1514!
OCFS2 folks, any thoughts on this crash?

On Tue, 2017-01-17 at 02:12 +, Ben Hutchings wrote:
> On Mon, 2017-01-16 at 13:12 -0600, Russell Mosemann wrote:
> [...]
> > Jan 15 17:31:03 vhost032 kernel: [ cut here ]
> > Jan 15 17:31:03 vhost032 kernel: kernel BUG at
> > /build/linux-Wgpe2M/linux-4.8.11/fs/ocfs2/alloc.c:1514!
>
> This is:
>
> static int ocfs2_grow_tree(handle_t *handle, struct ocfs2_extent_tree *et,
>                            int *final_depth, struct buffer_head **last_eb_bh,
>                            struct ocfs2_alloc_context *meta_ac)
> {
>         ...
>         BUG_ON(meta_ac == NULL);
>
> > [...]
> > Jan 15 17:31:03 vhost032 kernel: Call Trace:
> > Jan 15 17:31:03 vhost032 kernel: [] ?
> > ocfs2_set_buffer_uptodate+0x35/0x4a0 [ocfs2]
> > Jan 15 17:31:03 vhost032 kernel: [] ?
> > __find_get_block+0xa7/0x110
> > Jan 15 17:31:03 vhost032 kernel: [] ?
> > ocfs2_split_and_insert+0x307/0x490 [ocfs2]
> > Jan 15 17:31:03 vhost032 kernel: [] ?
> > ocfs2_split_extent+0x3ee/0x560 [ocfs2]
> > Jan 15 17:31:03 vhost032 kernel: [] ?
> > ocfs2_change_extent_flag+0x273/0x450 [ocfs2]
> > Jan 15 17:31:03 vhost032 kernel: [] ?
> > ocfs2_mark_extent_written+0x110/0x1d0 [ocfs2]
> > Jan 15 17:31:03 vhost032 kernel: [] ?
> > ocfs2_dio_end_io_write+0x44d/0x600 [ocfs2]
>
> meta_ac is passed down from ocfs2_dio_end_io_write(), which allocates
> it using ocfs2_lock_allocators()... but the latter only allocates it
> conditionally. It seems like the condition is wrong somehow.

This still seems to be happening for this user with 4.9.13. Looking at
"git log -p v4.9.13..origin/master -- fs/ocfs2", I wonder if
https://git.kernel.org/torvalds/c/3e10b793fc40dfdbe51762e0d084bd6f2c8acaaa
might be relevant?

The commit message mentions meta_ac not getting allocated and a
distinction between extent splits and refcount splits, and we have
ocfs2_split_extent in the trace. Slim reasoning, I know; maybe someone
who knows the code could make a better determination.

As Ben said before, the whole bug report can be found at
https://bugs.debian.org/841144

Ian.
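For anyone following the meta_ac discussion without the source at hand, here is a minimal userspace sketch of the failure pattern Ben describes: a caller reserves a metadata allocation context only when its up-front estimate says one is needed, while a later step requires it unconditionally. This is a toy model, not the kernel code. All names (alloc_context, lock_allocators, grow_tree, mark_written) and all the numbers are hypothetical stand-ins, and the real condition in ocfs2_lock_allocators() is more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ocfs2_alloc_context. */
struct alloc_context {
    int reserved_blocks;
};

/*
 * Toy model of a conditional reservation: a metadata context is handed
 * out only if the estimate says the free extent records will not suffice.
 * Otherwise *meta_ac stays NULL.
 */
static int lock_allocators(int free_records, int estimated_new_records,
                           struct alloc_context *storage,
                           struct alloc_context **meta_ac)
{
    *meta_ac = NULL;
    if (free_records < estimated_new_records) {
        storage->reserved_blocks = 1;
        *meta_ac = storage;
    }
    return 0;
}

/*
 * Toy model of the tree-growing step. The kernel has
 * BUG_ON(meta_ac == NULL) here; this sketch returns -1 instead of crashing.
 */
static int grow_tree(const struct alloc_context *meta_ac)
{
    if (meta_ac == NULL)
        return -1;
    return 0;
}

/*
 * Toy model of the completion path: reserve according to the estimate,
 * then grow the tree if the records actually needed exceed the free ones.
 */
static int mark_written(int free_records, int estimated_new_records,
                        int actual_new_records)
{
    struct alloc_context storage = { 0 };
    struct alloc_context *meta_ac = NULL;

    lock_allocators(free_records, estimated_new_records, &storage, &meta_ac);

    if (actual_new_records > free_records)
        return grow_tree(meta_ac);
    return 0;
}
```

In this model, mark_written(1, 1, 3) fails: the estimate says the single free record is enough, so nothing is reserved, but the operation then needs more records than are free and reaches grow_tree() with a NULL context. Whether the real bug is an underestimate of this kind is exactly the open question in this thread.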
Bug#841144: kernel BUG at /build/linux-Wgpe2M/linux-4.8.11/fs/ocfs2/alloc.c:1514!
On Mon, 2017-01-16 at 13:12 -0600, Russell Mosemann wrote:
[...]
> Jan 15 17:31:03 vhost032 kernel: [ cut here ]
> Jan 15 17:31:03 vhost032 kernel: kernel BUG at
> /build/linux-Wgpe2M/linux-4.8.11/fs/ocfs2/alloc.c:1514!

This is:

static int ocfs2_grow_tree(handle_t *handle, struct ocfs2_extent_tree *et,
                           int *final_depth, struct buffer_head **last_eb_bh,
                           struct ocfs2_alloc_context *meta_ac)
{
        ...
        BUG_ON(meta_ac == NULL);

> Jan 15 17:31:03 vhost032 kernel: invalid opcode: [#1] SMP
> Jan 15 17:31:03 vhost032 kernel: Modules linked in: vhost_net(E) vhost(E)
> macvtap(E) macvlan(E) tun(E) ocfs2(E) quota_tree(E) hmac(E) veth(E)
> iptable_filter(E) ip_tables(E) x_tables(E) nfsd(E) auth_rpcgss(E) nfs_acl(E)
> nfs(E) lockd(E) grace(E) fscache(E) sunrpc(E) ocfs2_dlmfs(E)
> ocfs2_stack_o2cb(E) ocfs2_dlm(E) ocfs2_nodemanager(E) ocfs2_stackglue(E)
> configfs(E) bridge(E) stp(E) llc(E) bonding(E) intel_rapl(E) sb_edac(E)
> edac_core(E) x86_pkg_temp_thermal(E) coretemp(E) ast(E) kvm_intel(E) ttm(E)
> drm_kms_helper(E) mxm_wmi(E) iTCO_wdt(E) kvm(E) iTCO_vendor_support(E) igb(E)
> evdev(E) irqbypass(E) drm(E) xhci_pci(E) dca(E) ehci_pci(E) xhci_hcd(E)
> crct10dif_pclmul(E) ehci_hcd(E) crc32_pclmul(E) i2c_algo_bit(E) e1000e(E)
> usbcore(E) ptp(E) mei_me(E) lpc_ich(E) ghash_clmulni_intel(E) i2c_i801(E)
> pcspkr(E) usb_common(E) sg(E)
> Jan 15 17:31:03 vhost032 kernel: mei(E) shpchp(E) i2c_smbus(E) pps_core(E)
> mfd_core(E) ipmi_si(E) wmi(E) fjes(E) ipmi_msghandler(E) tpm_tis(E)
> tpm_tis_core(E) tpm(E) acpi_power_meter(E) acpi_pad(E) button(E) fuse(E)
> drbd(E) lru_cache(E) libcrc32c(E) crc32c_generic(E) autofs4(E) ext4(E)
> crc16(E) jbd2(E) fscrypto(E) mbcache(E) dm_mod(E) md_mod(E) sd_mod(E)
> crc32c_intel(E) aesni_intel(E) aes_x86_64(E) glue_helper(E) lrw(E)
> gf128mul(E) ablk_helper(E) cryptd(E) ahci(E) libahci(E) libata(E) scsi_mod(E)
> Jan 15 17:31:03 vhost032 kernel: CPU: 5 PID: 28586 Comm: qemu-system-x86
> Tainted: GE 4.8.0-0.bpo.2-amd64 #1 Debian 4.8.11-1~bpo8+1
> Jan 15 17:31:03 vhost032 kernel: Hardware name: To Be Filled By O.E.M. To Be
> Filled By O.E.M./EPC612D4I, BIOS P2.10 03/31/2016
> Jan 15 17:31:03 vhost032 kernel: task: 8e6e8584d000 task.stack:
> 8e6d8079
> Jan 15 17:31:03 vhost032 kernel: RIP: 0010:[]
> [] ocfs2_grow_tree+0x6f2/0x780 [ocfs2]
> Jan 15 17:31:03 vhost032 kernel: RSP: 0018:8e6d80793618 EFLAGS: 00010246
> Jan 15 17:31:03 vhost032 kernel: RAX: RBX: 0004
> RCX: 8e6d80793790
> Jan 15 17:31:03 vhost032 kernel: RDX: 8e6d807936bc RSI: 8e6d80793968
> RDI: 8e6ea5012690
> Jan 15 17:31:03 vhost032 kernel: RBP: 8e6d80793678 R08:
> R09: 00141d0b
> Jan 15 17:31:03 vhost032 kernel: R10: 01586960 R11: 8e6e36ab30c0
> R12: 0001
> Jan 15 17:31:03 vhost032 kernel: R13: 8e6d80793828 R14: 8e6e36ab30c0
> R15: 0001
> Jan 15 17:31:03 vhost032 kernel: FS: 7f578affd700()
> GS:8e7cbf34() knlGS:
> Jan 15 17:31:03 vhost032 kernel: CS: 0010 DS: ES: CR0:
> 80050033
> Jan 15 17:31:03 vhost032 kernel: CR2: b0015a300238 CR3: 0001f5579000
> CR4: 001426e0
> Jan 15 17:31:03 vhost032 kernel: Stack:
> Jan 15 17:31:03 vhost032 kernel: 8e6d80793728 8e6d80793728
> c092aa75 8e6cb1fc0c30
> Jan 15 17:31:03 vhost032 kernel: 8e6e81ddb548 9e63ba27
> ab4b7e2a 0004
> Jan 15 17:31:03 vhost032 kernel: 0001 8e6d80793828
> 8e6d80793968 8e6e78de1700
> Jan 15 17:31:03 vhost032 kernel: Call Trace:
> Jan 15 17:31:03 vhost032 kernel: [] ?
> ocfs2_set_buffer_uptodate+0x35/0x4a0 [ocfs2]
> Jan 15 17:31:03 vhost032 kernel: [] ?
> __find_get_block+0xa7/0x110
> Jan 15 17:31:03 vhost032 kernel: [] ?
> ocfs2_split_and_insert+0x307/0x490 [ocfs2]
> Jan 15 17:31:03 vhost032 kernel: [] ?
> ocfs2_split_extent+0x3ee/0x560 [ocfs2]
> Jan 15 17:31:03 vhost032 kernel: [] ?
> ocfs2_change_extent_flag+0x273/0x450 [ocfs2]
> Jan 15 17:31:03 vhost032 kernel: [] ?
> ocfs2_mark_extent_written+0x110/0x1d0 [ocfs2]
> Jan 15 17:31:03 vhost032 kernel: [] ?
> ocfs2_dio_end_io_write+0x44d/0x600 [ocfs2]

meta_ac is passed down from ocfs2_dio_end_io_write(), which allocates
it using ocfs2_lock_allocators()... but the latter only allocates it
conditionally. It seems like the condition is wrong somehow.

I didn't see any relevant changes post-4.8 (though I did see a number
of unrelated bug fixes that maybe ought to go to stable).

The rest of the traceback is below; the whole bug report can be found
at https://bugs.debian.org/841144

Ben.

> Jan 15 17:31:03 vhost032 kernel: [] ?
> ocfs2_allocate_extend_trans+0x180/0x180 [ocfs2]
> Jan 15 17:31:03 vhost032 kernel: [] ?
> ocfs2_dio_end_io+0x3b/0x60 [ocfs2]
> Jan 15 17:31:03 vhost032 kernel: [] ?
Bug#841144: kernel BUG at /build/linux-Wgpe2M/linux-4.8.11/fs/ocfs2/alloc.c:1514!
Package: src:linux
Version: 4.8.11-1~bpo8+1
Severity: critical

Dear Maintainer,

* What led up to the situation?

Jan 15 17:31:03 vhost032 kernel: [ cut here ]
Jan 15 17:31:03 vhost032 kernel: kernel BUG at /build/linux-Wgpe2M/linux-4.8.11/fs/ocfs2/alloc.c:1514!
Jan 15 17:31:03 vhost032 kernel: invalid opcode: [#1] SMP
Jan 15 17:31:03 vhost032 kernel: Modules linked in: vhost_net(E) vhost(E) macvtap(E) macvlan(E) tun(E) ocfs2(E) quota_tree(E) hmac(E) veth(E) iptable_filter(E) ip_tables(E) x_tables(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) fscache(E) sunrpc(E) ocfs2_dlmfs(E) ocfs2_stack_o2cb(E) ocfs2_dlm(E) ocfs2_nodemanager(E) ocfs2_stackglue(E) configfs(E) bridge(E) stp(E) llc(E) bonding(E) intel_rapl(E) sb_edac(E) edac_core(E) x86_pkg_temp_thermal(E) coretemp(E) ast(E) kvm_intel(E) ttm(E) drm_kms_helper(E) mxm_wmi(E) iTCO_wdt(E) kvm(E) iTCO_vendor_support(E) igb(E) evdev(E) irqbypass(E) drm(E) xhci_pci(E) dca(E) ehci_pci(E) xhci_hcd(E) crct10dif_pclmul(E) ehci_hcd(E) crc32_pclmul(E) i2c_algo_bit(E) e1000e(E) usbcore(E) ptp(E) mei_me(E) lpc_ich(E) ghash_clmulni_intel(E) i2c_i801(E) pcspkr(E) usb_common(E) sg(E)
Jan 15 17:31:03 vhost032 kernel: mei(E) shpchp(E) i2c_smbus(E) pps_core(E) mfd_core(E) ipmi_si(E) wmi(E) fjes(E) ipmi_msghandler(E) tpm_tis(E) tpm_tis_core(E) tpm(E) acpi_power_meter(E) acpi_pad(E) button(E) fuse(E) drbd(E) lru_cache(E) libcrc32c(E) crc32c_generic(E) autofs4(E) ext4(E) crc16(E) jbd2(E) fscrypto(E) mbcache(E) dm_mod(E) md_mod(E) sd_mod(E) crc32c_intel(E) aesni_intel(E) aes_x86_64(E) glue_helper(E) lrw(E) gf128mul(E) ablk_helper(E) cryptd(E) ahci(E) libahci(E) libata(E) scsi_mod(E)
Jan 15 17:31:03 vhost032 kernel: CPU: 5 PID: 28586 Comm: qemu-system-x86 Tainted: GE 4.8.0-0.bpo.2-amd64 #1 Debian 4.8.11-1~bpo8+1
Jan 15 17:31:03 vhost032 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./EPC612D4I, BIOS P2.10 03/31/2016
Jan 15 17:31:03 vhost032 kernel: task: 8e6e8584d000 task.stack: 8e6d8079
Jan 15 17:31:03 vhost032 kernel: RIP: 0010:[] [] ocfs2_grow_tree+0x6f2/0x780 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: RSP: 0018:8e6d80793618 EFLAGS: 00010246
Jan 15 17:31:03 vhost032 kernel: RAX: RBX: 0004 RCX: 8e6d80793790
Jan 15 17:31:03 vhost032 kernel: RDX: 8e6d807936bc RSI: 8e6d80793968 RDI: 8e6ea5012690
Jan 15 17:31:03 vhost032 kernel: RBP: 8e6d80793678 R08: R09: 00141d0b
Jan 15 17:31:03 vhost032 kernel: R10: 01586960 R11: 8e6e36ab30c0 R12: 0001
Jan 15 17:31:03 vhost032 kernel: R13: 8e6d80793828 R14: 8e6e36ab30c0 R15: 0001
Jan 15 17:31:03 vhost032 kernel: FS: 7f578affd700() GS:8e7cbf34() knlGS:
Jan 15 17:31:03 vhost032 kernel: CS: 0010 DS: ES: CR0: 80050033
Jan 15 17:31:03 vhost032 kernel: CR2: b0015a300238 CR3: 0001f5579000 CR4: 001426e0
Jan 15 17:31:03 vhost032 kernel: Stack:
Jan 15 17:31:03 vhost032 kernel: 8e6d80793728 8e6d80793728 c092aa75 8e6cb1fc0c30
Jan 15 17:31:03 vhost032 kernel: 8e6e81ddb548 9e63ba27 ab4b7e2a 0004
Jan 15 17:31:03 vhost032 kernel: 0001 8e6d80793828 8e6d80793968 8e6e78de1700
Jan 15 17:31:03 vhost032 kernel: Call Trace:
Jan 15 17:31:03 vhost032 kernel: [] ? ocfs2_set_buffer_uptodate+0x35/0x4a0 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: [] ? __find_get_block+0xa7/0x110
Jan 15 17:31:03 vhost032 kernel: [] ? ocfs2_split_and_insert+0x307/0x490 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: [] ? ocfs2_split_extent+0x3ee/0x560 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: [] ? ocfs2_change_extent_flag+0x273/0x450 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: [] ? ocfs2_mark_extent_written+0x110/0x1d0 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: [] ? ocfs2_dio_end_io_write+0x44d/0x600 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: [] ? ocfs2_allocate_extend_trans+0x180/0x180 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: [] ? ocfs2_dio_end_io+0x3b/0x60 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: [] ? dio_complete+0x68/0x160
Jan 15 17:31:03 vhost032 kernel: [] ? do_blockdev_direct_IO+0x2079/0x23f0
Jan 15 17:31:03 vhost032 kernel: [] ? ocfs2_write_end_nolock+0x560/0x560 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: [] ? ocfs2_direct_IO+0x83/0x90 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: [] ? generic_file_direct_write+0xb3/0x180
Jan 15 17:31:03 vhost032 kernel: [] ? __generic_file_write_iter+0xb6/0x1e0
Jan 15 17:31:03 vhost032 kernel: [] ? ocfs2_file_write_iter+0x44e/0xae0 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: [] ? do_iter_readv_writev+0xb0/0x130
Jan 15 17:31:03 vhost032 kernel: [] ? do_readv_writev+0x1a2/0x240
Jan 15 17:31:03 vhost032 kernel: [] ? ocfs2_check_range_for_refcount+0x130/0x130 [ocfs2]
Jan 15 17:31:03 vhost032 kernel: [] ? schedule+0x31/0x80
Jan 15 17:31:03 vhost032