[Kernel-packages] [Bug 1818501] Re: kernel BUG at fs/ocfs2/alloc.c:1514
Unfortunately, we don't have a dedicated test system here, and I don't really want to test this on the production system. I can confirm that upgrading to the HWE kernel (4.18, which contains the patches) seems to solve the problem.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1818501

Title:
  kernel BUG at fs/ocfs2/alloc.c:1514

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  The current bionic kernel (4.15) contains a known bug in the OCFS2 distributed filesystem, which can cause all nodes (!) of a redundant cluster to crash.

  More information on this bug (including the patch) can be found here:
  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=841144

  This fix was included upstream in 4.16, so it is included in the HWE stack, but not in the GA kernel.

  In my opinion this is quite a severe bug, because it can bring a whole redundant setup down (this happened to us). This patch should be backported to 4.15.

  # cat /proc/version_signature
  Ubuntu 4.15.0-45.48-generic 4.15.18

  # lsb_release -rd
  Description: Ubuntu 18.04.2 LTS
  Release: 18.04

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1818501/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1818501] Re: kernel BUG at fs/ocfs2/alloc.c:1514
It should be these two commits:

63de8bd9328bf2a778fc277503da163ae3defa3c ocfs2: make metadata estimation accurate and clear
71a36944042b7d9dd71f6a5d1c5ea1c2353b5d42 ocfs2: try to reuse extent block in dealloc without meta_alloc
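One way to check whether a given kernel package already carries both patches is to search its changelog (or the output of `git log --format=%s` on the packaging tree) for the two commit subjects. The following is only a sketch: the helper name `missing_fixes` is made up for illustration, and it assumes the changelog quotes the upstream subject lines verbatim, which may not hold for every backport.

```python
# Subjects of the two upstream commits named in this thread.
FIX_SUBJECTS = (
    "ocfs2: make metadata estimation accurate and clear",
    "ocfs2: try to reuse extent block in dealloc without meta_alloc",
)


def missing_fixes(changelog_text):
    """Return the fix subjects that do not appear in the given text.

    Assumption: the text quotes upstream subjects unchanged. A backport
    with a reworded subject would be reported as missing.
    """
    return [subject for subject in FIX_SUBJECTS if subject not in changelog_text]


if __name__ == "__main__":
    # Hypothetical changelog excerpt containing only the first patch.
    excerpt = "  * ocfs2: make metadata estimation accurate and clear\n"
    for subject in missing_fixes(excerpt):
        print("not found:", subject)
```

An empty result only means both subjects were found in the text. It does not prove that the patches were applied cleanly.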
[Kernel-packages] [Bug 1818501] Re: kernel BUG at fs/ocfs2/alloc.c:1514
Attaching logs with apport-collect is not easily possible due to our security setup, but it should not be necessary because all the information can be found in the linked Debian bug report.

Here is our stack trace from when the bug occurred:

Mar 02 06:25:59 prometheus-lo kernel: [ cut here ]
Mar 02 06:25:59 prometheus-lo kernel: kernel BUG at /build/linux-uQJ2um/linux-4.15.0/fs/ocfs2/alloc.c:1514!
Mar 02 06:25:59 prometheus-lo kernel: invalid opcode: [#1] SMP PTI
Mar 02 06:25:59 prometheus-lo kernel: Modules linked in: vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 devlink ebtable_filter ebtables ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm
Mar 02 06:25:59 prometheus-lo kernel: xt_addrtype xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ib_iser rdma_cm iw_cm ib_cm ib_core
Mar 02 06:25:59 prometheus-lo kernel: CPU: 0 PID: 9345 Comm: kworker/0:1 Not tainted 4.15.0-45-generic #48-Ubuntu
Mar 02 06:25:59 prometheus-lo kernel: Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 11/14/2017
Mar 02 06:25:59 prometheus-lo kernel: Workqueue: dio/dm-0 dio_aio_complete_work
Mar 02 06:25:59 prometheus-lo kernel: RIP: 0010:ocfs2_grow_tree+0x5e9/0x7e0 [ocfs2]
Mar 02 06:25:59 prometheus-lo kernel: RSP: 0018:bea20df37a28 EFLAGS: 00010246
Mar 02 06:25:59 prometheus-lo kernel: RAX: RBX: bea20df37da0 RCX: bea20df37bb8
Mar 02 06:25:59 prometheus-lo kernel: RDX: bea20df37ac4 RSI: bea20df37da0 RDI: 9679f54479f0
Mar 02 06:25:59 prometheus-lo kernel: RBP: bea20df37a98 R08: R09: bea20df37c58
Mar 02 06:25:59 prometheus-lo kernel: R10: bea20df37b68 R11: 0030 R12: 9676ba5d95a0
Mar 02 06:25:59 prometheus-lo kernel: R13: 9679e321d0c0 R14: 9676ba5d95a0 R15: 0001
Mar 02 06:25:59 prometheus-lo kernel: FS: () GS:9679ffc0() knlGS:
Mar 02 06:25:59 prometheus-lo kernel: CS: 0010 DS: ES: CR0: 80050033
Mar 02 06:25:59 prometheus-lo kernel: CR2: 7f1b137ba3cc CR3: 0013a720a002 CR4: 007626f0
Mar 02 06:25:59 prometheus-lo kernel: DR0: DR1: DR2:
Mar 02 06:25:59 prometheus-lo kernel: DR3: DR6: fffe0ff0 DR7: 0400
Mar 02 06:25:59 prometheus-lo kernel: PKRU: 5554
Mar 02 06:25:59 prometheus-lo kernel: Call Trace:
Mar 02 06:25:59 prometheus-lo kernel: ? ocfs2_set_buffer_uptodate+0x34/0x490 [ocfs2]
Mar 02 06:25:59 prometheus-lo kernel: ocfs2_split_and_insert+0x332/0x4d0 [ocfs2]
Mar 02 06:25:59 prometheus-lo kernel: ? ocfs2_read_blocks+0x304/0x600 [ocfs2]
Mar 02 06:25:59 prometheus-lo kernel: ocfs2_split_extent+0x3cb/0x530 [ocfs2]
Mar 02 06:25:59 prometheus-lo kernel: ? ocfs2_dinode_set_last_eb_blk+0x20/0x20 [ocfs2]
Mar 02 06:25:59 prometheus-lo kernel: ocfs2_change_extent_flag+0x25b/0x3e0 [ocfs2]
Mar 02 06:25:59 prometheus-lo kernel: ocfs2_mark_extent_written+0xad/0x1c0 [ocfs2]
Mar 02 06:25:59 prometheus-lo kernel: ocfs2_dio_end_io_write+0x4ec/0x690 [ocfs2]
Mar 02 06:25:59 prometheus-lo kernel: ? __switch_to_asm+0x34/0x70
Mar 02 06:25:59 prometheus-lo kernel: ? ocfs2_allocate_extend_trans+0x1a0/0x1a0 [ocfs2]
Mar 02 06:25:59 prometheus-lo kernel: ocfs2_dio_end_io+0x3e/0x70 [ocfs2]
Mar 02 06:25:59 prometheus-lo kernel: dio_complete+0x86/0x220
Mar 02 06:25:59 prometheus-lo kernel: dio_aio_complete_work+0x19/0x20
Mar 02 06:25:59 prometheus-lo kernel: process_one_work+0x1de/0x410
Mar 02 06:25:59 prometheus-lo kernel: worker_thread+0x32/0x410
Mar 02 06:25:59 prometheus-lo kernel: kthread+0x121/0x140
Mar 02 06:25:59 prometheus-lo kernel: ? process_one_work+0x410/0x410
Mar 02 06:25:59 prometheus-lo kernel: ? kthread_create_worker_on_cpu+0x70/0x70
Mar 02 06:25:59 prometheus-lo kernel: ? do_syscall_64+0x73/0x130
Mar 02 06:25:59 prometheus-lo kernel: ? SyS_exit_group+0x14/0x20
Mar 02 06:25:59 prometheus-lo kernel: ret_from_fork+0x35/0x40
Mar 02 06:25:59 prometheus-lo kernel: Code: 00 00 00 00 00 00 10 4d 63 c6 48 c7 c1 40 26 b0 c0 ba 1c 06 00 00 48 c7 c6 f0 59 af c0 48 89 45 c0 e8 5c c2 de ff e9 41 fd ff ff <0f> 0b 48 8b 7d b8 48 85 ff 0f 84 e3 fd ff ff eb 22 3d 00 fe ff
Mar 02 06:25:59 prometheus-lo kernel: RIP: ocfs2_grow_tree+0x5e9/0x7e0 [ocfs2] RSP: bea20df37a28
Mar 02 06:25:59 prometheus-lo kernel: ---[ end trace 8ba9e5d5bf1ad2c7 ]---

** Changed in: linux (Ubuntu)
Status: Incomplete => Confirmed
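To spot the same crash on other cluster nodes, the `kernel BUG at <file>:<line>!` entry from a trace like the one above can be picked out of saved journal output. This is a rough sketch under stated assumptions: the function names are illustrative, and line number 1514 applies to the 4.15.0-45 build reported here, so other kernel builds may hit the same check at a different line.

```python
import re

# Matches entries such as:
#   kernel BUG at /build/.../fs/ocfs2/alloc.c:1514!
BUG_RE = re.compile(r"kernel BUG at (?P<path>\S+):(?P<line>\d+)!")


def find_kernel_bugs(log_lines):
    """Yield (source_path, line_number) for every 'kernel BUG at' entry."""
    for entry in log_lines:
        match = BUG_RE.search(entry)
        if match:
            yield match.group("path"), int(match.group("line"))


def is_reported_ocfs2_bug(path, line):
    """True if the entry matches the location reported in this thread.

    Assumption: the build-directory prefix varies, so only the path
    suffix is compared; the line number is specific to this build.
    """
    return path.endswith("fs/ocfs2/alloc.c") and line == 1514


if __name__ == "__main__":
    sample = [
        "kernel: kernel BUG at /build/linux-uQJ2um/linux-4.15.0/fs/ocfs2/alloc.c:1514!",
        "kernel: Call Trace:",
    ]
    for path, line in find_kernel_bugs(sample):
        print(path, line, is_reported_ocfs2_bug(path, line))
```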
[Kernel-packages] [Bug 1818501] [NEW] kernel BUG at fs/ocfs2/alloc.c:1514
Public bug reported:

The current bionic kernel (4.15) contains a known bug in the OCFS2 distributed filesystem, which can cause all nodes (!) of a redundant cluster to crash.

More information on this bug (including the patch) can be found here:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=841144

This fix was included upstream in 4.16, so it is included in the HWE stack, but not in the GA kernel.

In my opinion this is quite a severe bug, because it can bring a whole redundant setup down (this happened to us). This patch should be backported to 4.15.

# cat /proc/version_signature
Ubuntu 4.15.0-45.48-generic 4.15.18

# lsb_release -rd
Description: Ubuntu 18.04.2 LTS
Release: 18.04

** Affects: linux (Ubuntu)
   Importance: Undecided
   Status: Incomplete

** Tags: bionic
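The version check implied by the report can be sketched as follows: take the upstream base version from the last field of `/proc/version_signature` and compare it with 4.16, the release the report says first contained the fix. The function names are illustrative, and the three-field layout is assumed from the sample line in this report. A distribution kernel can carry backports, so a result of `True` only means the fix is not part of the upstream base.

```python
from typing import Tuple

# Assumption taken from the report: the fix first shipped in upstream 4.16.
FIXED_UPSTREAM = (4, 16)


def upstream_version(signature: str) -> Tuple[int, ...]:
    """Parse the upstream version out of a version_signature line.

    Expected form: "Ubuntu <package-version> <upstream-version>", for
    example "Ubuntu 4.15.0-45.48-generic 4.15.18".
    """
    fields = signature.split()
    if len(fields) != 3:
        raise ValueError("unexpected signature format: %r" % signature)
    return tuple(int(part) for part in fields[2].split("."))


def predates_fix(signature: str) -> bool:
    """True if the upstream base (major, minor) is older than FIXED_UPSTREAM."""
    return upstream_version(signature)[:2] < FIXED_UPSTREAM


if __name__ == "__main__":
    with open("/proc/version_signature") as handle:
        line = handle.read().strip()
    print(line, "-> predates fix:", predates_fix(line))
```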