[Bug 1867155] Re: P8 node modoc will reboot automatically when running the sru_misc test suite

2021-09-15 Thread Po-Hsu Lin
*** This bug is a duplicate of bug 1927076 ***
https://bugs.launchpad.net/bugs/1927076

** This bug is no longer a duplicate of bug 1909286
   ubuntu_kernel_selftest will be interrupted with the reuseport_bpf_cpu / reuseport_bpf_numa test in net (BUG: Unable to handle kernel instruction fetch (NULL pointer?))

** This bug has been marked a duplicate of bug 1927076
   IPv6 TCP in reuseport_bpf_cpu from ubuntu_kernel_selftests/net crash P8 node entei on 5.8 kernel (Oops: Exception in kernel mode, sig: 4 [#1])

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1867155

Title:
  P8 node modoc will reboot automatically when running the sru_misc test
  suite

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1867155/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1867155] Re: P8 node modoc will reboot automatically when running the sru_misc test suite

2020-12-25 Thread Po-Hsu Lin
*** This bug is a duplicate of bug 1909286 ***
https://bugs.launchpad.net/bugs/1909286

As node modoc is no longer accessible, and I managed to reproduce this
on nodes entei and dryden in bug 1909286, I will mark this one as a
duplicate.

** This bug has been marked a duplicate of bug 1909286
   ubuntu_kernel_selftest will be interrupted with the reuseport_bpf_cpu / reuseport_bpf_numa test in net (BUG: Unable to handle kernel instruction fetch (NULL pointer?))

[Bug 1867155] Re: P8 node modoc will reboot automatically when running the sru_misc test suite

2020-06-24 Thread Po-Hsu Lin
This issue is striking us again with 5.3.0-60.54 on modoc.

** Tags added: sru-20200608

[Bug 1867155] Re: P8 node modoc will reboot automatically when running the sru_misc test suite

2020-03-12 Thread Po-Hsu Lin
** Tags removed: kqa-blocker

[Bug 1867155] Re: P8 node modoc will reboot automatically when running the sru_misc test suite

2020-03-12 Thread Po-Hsu Lin
I can reproduce this with 5.3.0-40-generic, so removing the kqa-blocker
tag.

[Bug 1867155] Re: P8 node modoc will reboot automatically when running the sru_misc test suite

2020-03-12 Thread Sean Feole
** Changed in: ubuntu-kernel-tests
   Status: New => Triaged

[Bug 1867155] Re: P8 node modoc will reboot automatically when running the sru_misc test suite

2020-03-12 Thread Po-Hsu Lin
I managed to catch this issue with IPMI console:

[  673.975988] BUG: Unable to handle kernel instruction fetch
[  673.976017] Faulting instruction address: 0x7fe87fe8
[  673.976025] Oops: Kernel access of bad area, sig: 11 [#1]
[  673.976032] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[  673.976040] Modules linked in: binfmt_misc dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua leds_powernv ipmi_powernv ipmi_devintf ipmi_msghandler ibmpowernv uio_pdrv_genirq vmx_crypto uio powernv_rng powernv_op_panel sch_fq_codel ip_tables x_tables autofs4 ses enclosure scsi_transport_sas btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_vpmsum crc32c_vpmsum tg3 ipr [last unloaded: notifier_error_inject]
[  673.976101] CPU: 0 PID: 76667 Comm: reuseport_bpf_c Not tainted 5.3.0-42-generic #34-Ubuntu
[  673.976109] NIP:  7fe87fe8 LR: c0ca9d10 CTR: 7fe87fe8
[  673.976117] REGS: c00cf580 TRAP: 0480   Not tainted  (5.3.0-42-generic)
[  673.976124] MSR:  90009033   CR: 24002488  XER: 2000
[  673.976135] CFAR: c0ca9d0c IRQMASK: 0 
[  673.976135] GPR00: c0e4094c c00cf810 c19c9000 
c00a2208b6e0 
[  673.976135] GPR04: c00807c70038 c00a2208b6e0 0028 
00012e2e5506 
[  673.976135] GPR08: a2be9417   
 
[  673.976135] GPR12: 7fe87fe8 c1d6 0003 
0101 
[  673.976135] GPR16: c3fe00e8 22b8 2788 
c18fb600 
[  673.976135] GPR20: 000a  22b8 
0001 
[  673.976135] GPR24:  0028 00a0 
c3fe00e8 
[  673.976135] GPR28: c00807c7 147b1819 c00a2208b6e0 
c01e2d8b 
[  673.976195] NIP [7fe87fe8] 0x7fe87fe8
[  673.976206] LR [c0ca9d10] reuseport_select_sock+0x100/0x400
[  673.976212] Call Trace:
[  673.976216] [c00cf810] [c00fe9daf0f0] 0xc00fe9daf0f0 (unreliable)
[  673.976226] [c00cf8b0] [c0e4094c] inet6_lhash2_lookup+0x1ec/0x220
[  673.976234] [c00cf930] [c0e40c80] inet6_lookup_listener+0x300/0x3f0
[  673.976244] [c00cf9d0] [c0e1ec38] tcp_v6_rcv+0x7d8/0xe10
[  673.976252] [c00cfb00] [c0dd7e70] ip6_protocol_deliver_rcu+0x110/0x6e0
[  673.976261] [c00cfb80] [c0dd846c] ip6_input_finish+0x2c/0x40
[  673.976269] [c00cfba0] [c0dd8560] ip6_input+0xe0/0xf0
[  673.976276] [c00cfc10] [c0dd7008] ip6_rcv_finish+0xa8/0xe0
[  673.976283] [c00cfc40] [c0dd7b90] ipv6_rcv+0x100/0x110
[  673.976291] [c00cfcc0] [c0c6d040] __netif_receive_skb_one_core+0x70/0xb0
[  673.976299] [c00cfd00] [c0c6ee70] process_backlog+0xd0/0x230
[  673.976307] [c00cfd70] [c0c6db88] net_rx_action+0x1e8/0x510
[  673.976315] [c00cfe90] [c0e9c8a0] __do_softirq+0x160/0x3f4
[  673.976324] [c00cff90] [c0030be8] call_do_softirq+0x14/0x24
[  673.976332] [c4a4f6b0] [c001c2e8] do_softirq_own_stack+0x38/0x50
[  673.976341] [c4a4f6d0] [c01315a0] do_softirq.part.0+0x80/0xb0
[  673.976349] [c4a4f700] [c0131698] __local_bh_enable_ip+0xc8/0xf0
[  673.976357] [c4a4f720] [c0dd19b8] ip6_finish_output2+0x298/0x790
[  673.976365] [c4a4f7c0] [c0dd6124] ip6_output+0x84/0x1c0
[  673.976372] [c4a4f840] [c0dd222c] ip6_xmit+0x37c/0x7b0
[  673.976379] [c4a4f960] [c0e272e4] inet6_csk_xmit+0xb4/0x120
[  673.976387] [c4a4fa00] [c0d4231c] __tcp_transmit_skb+0x55c/0xd80
[  673.976396] [c4a4fab0] [c0d43a18] tcp_connect+0x918/0xab0
[  673.976403] [c4a4fb70] [c0e1c25c] tcp_v6_connect+0x5ac/0x740
[  673.976411] [c4a4fc50] [c0d7271c] __inet_stream_connect+0x12c/0x4b0
[  673.976419] [c4a4fcf0] [c0d72afc] inet_stream_connect+0x5c/0x90
[  673.976428] [c4a4fd30] [c0c37a0c] __sys_connect+0x11c/0x160
[  673.976435] [c4a4fe00] [c0c37a78] sys_connect+0x28/0x40
[  673.976443] [c4a4fe20] [c000b388] system_call+0x5c/0x70
[  673.976449] Instruction dump:
[  673.976454]        
 
[  673.976462]        
 
[  673.976472] ---[ end trace 0961698a90363364 ]---
[  673.976948] 
[  674.977035] Kernel panic - not syncing: Aiee, killing interrupt handler!
[  674.978187] Rebooting in 10 seconds..
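
One possible reading of the trace above: NIP and CTR hold the same address (7fe87fe8) while LR points into reuseport_select_sock, which would be consistent with an indirect branch from that function through a bad pointer. That is an interpretation, not something confirmed in this thread. The sketch below pulls the registers and the symbolized frames out of a pasted oops so they can be compared across runs. The helper name `summarize_oops` and its regular expressions are illustrative assumptions, not part of any kernel tooling.

```python
import re

REG_RE = re.compile(r"\b(NIP|LR|CTR):\s+([0-9a-f]+)")
FRAME_RE = re.compile(
    r"\[[0-9a-f]+\]\s+\[[0-9a-f]+\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_oops(text):
    """Return (registers, frames) parsed from a pasted powerpc oops."""
    # Undo mail line-wrapping: a line that does not open with a
    # "[ timestamp]" bracket is treated as a continuation of the previous one.
    joined = []
    for line in text.splitlines():
        if line.startswith("[") or not joined:
            joined.append(line.rstrip())
        else:
            joined[-1] += " " + line.strip()

    regs, frames = {}, []
    for line in joined:
        for name, value in REG_RE.findall(line):
            regs.setdefault(name, value)  # keep the first value seen
        match = FRAME_RE.search(line)
        if match:
            frames.append(match.group(1))
    return regs, frames


# Lines copied from the console capture above (one frame left wrapped).
SAMPLE = """\
[  673.976109] NIP:  7fe87fe8 LR: c0ca9d10 CTR: 7fe87fe8
[  673.976226] [c00cf8b0] [c0e4094c]
inet6_lhash2_lookup+0x1ec/0x220
[  673.976244] [c00cf9d0] [c0e1ec38] tcp_v6_rcv+0x7d8/0xe10
"""

regs, frames = summarize_oops(SAMPLE)
print(regs["NIP"] == regs["CTR"], frames)
```

Frames without a symbol, such as the first "(unreliable)" entry, are skipped on purpose.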

[Bug 1867155] Re: P8 node modoc will reboot automatically when running the sru_misc test suite

2020-03-12 Thread Po-Hsu Lin
I can see this message on a freshly deployed modoc:
[  396.856011] ipr 0001:08:00.0: 9076: Configuration error, missing remote IOA
[  396.856042] ipr 0001:08:00.0: Attached Adapter not discovered within allotted time [PRC: 17101541]
[  396.856051] ipr 0001:08:00.0: Remote IOA VPID/SN:   
[  396.856058] ipr 0001:08:00.0: Remote IOA WWN: 
[  396.856065] ipr: : 0100 0100 FC22 0840
[  396.856072] ipr: 0010: 4C494344 49424D20 20202020 35374438
[  396.856078] ipr: 0020: 30303153 4953494F 41202020 
[  396.856084] ipr: 0030: 30303431 57303132 50050760 5EC33900
[  396.856091] ipr: 0040: 0001  02020101 00FF

With 5.3.0-40

[Bug 1867155] Re: P8 node modoc will reboot automatically when running the sru_misc test suite

2020-03-12 Thread Po-Hsu Lin
Please find the syslog attached.

** Attachment added: "modoc-syslog.log"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1867155/+attachment/5336156/+files/modoc-syslog.log

[Bug 1867155] Re: P8 node modoc will reboot automatically when running the sru_misc test suite

2020-03-12 Thread Po-Hsu Lin
** Description changed:

  In 5 attempts, 4 hung around the following test in the ubuntu_kernel_selftests net sub-category:
   # selftests: net: reuseport_bpf_cpu
  
  First attempt:
  23:21:32 DEBUG| [stdout] ok 2 selftests: net: reuseport_bpf_cpu
  23:21:32 DEBUG| [stdout] # selftests: net: reuseport_bpf_numa
  23:21:32 DEBUG| [stdout] #  IPv4 UDP 
  (hang here)
  
  Second attempt:
  10:17:35 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
  10:17:35 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
  10:17:35 DEBUG| [stdout] #  IPv4 UDP 
  10:17:35 DEBUG| [stdout] # send cpu 0, receive socket 0
  (line skipped)
  10:17:35 DEBUG| [stdout] # send cpu 159, receive socket 159
  10:17:35 DEBUG| [stdout] #  IPv6 TCP 
  (hang here)
  
  Third attempt failed because of test timeout:
  12:46:16 DEBUG| [stdout] # [FAIL]
  12:46:16 DEBUG| [stdout] # 
  12:46:16 DEBUG| [stdout] # running psock_tpacket test
  12:46:16 DEBUG| [stdout] # 
  13:14:13 INFO | Timer expired (1800 sec.), nuking pid 161853
  
  Fourth attempt:
  07:41:51 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
  07:41:51 DEBUG| [stdout] #  IPv4 UDP 
  07:41:51 DEBUG| [stdout] # send cpu 0, receive socket 0
  (lines skipped)
  07:41:51 DEBUG| [stdout] # send cpu 159, receive socket 159
  07:41:51 DEBUG| [stdout] #  IPv6 UDP 
  07:41:51 DEBUG| [stdout] # send cpu 0, receive socket 0
  07:41:51 DEBUG| [stdout] # send cpu 1, receive socket 1
  (lines skipped)
  07:41:51 DEBUG| [stdout] # send cpu 157, receive socket 157
  07:41:51 DEBUG| [stdout] # send cpu 159, receive socket 159
  07:41:51 DEBUG| [stdout] #  IPv4 TCP 
  (test hang here)
  
  Fifth attempt:
  04:29:17 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
  04:29:17 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
  04:29:17 DEBUG| [stdout] #  IPv4 UDP 
  04:29:17 DEBUG| [stdout] # send cpu 0, receive socket 0
  (lines skipped)
  04:29:17 DEBUG| [stdout] # send cpu 159, receive socket 159
  04:29:17 DEBUG| [stdout] #  IPv6 UDP 
  04:29:17 DEBUG| [stdout] # send cpu 0, receive socket 0
  (lines skipped)
  04:29:17 DEBUG| [stdout] # send cpu 159, receive socket 159
  04:29:17 DEBUG| [stdout] #  IPv4 TCP 
  04:29:17 DEBUG| [stdout] # send cpu 0, receive socket 0
  (lines skipped)
  04:29:17 DEBUG| [stdout] # send cpu 15, receive socket 15
  (test hang here)
  
  I tried to run tests in this sru-misc suite in the following order:
- 'hwclock',
- 'ubuntu_bpf',
- 'ubuntu_bpf_jit',
- 'ubuntu_kernel_selftests',
- 'ubuntu_lxc',
- 'ubuntu_seccomp',
- 'ubuntu_unionmount_ovlfs',
- 'ubuntu_cts_kernel',
- 'ubuntu_kvm_unit_tests',
+ 'hwclock',
+ 'ubuntu_bpf',
+ 'ubuntu_bpf_jit',
+ 'ubuntu_kernel_selftests',
+ 'ubuntu_lxc',
+ 'ubuntu_seccomp',
+ 'ubuntu_unionmount_ovlfs',
+ 'ubuntu_cts_kernel',
+ 'ubuntu_kvm_unit_tests',
  One by one on this node, but I can't reproduce this issue.
  
  I tried to watch dmesg when this happens, but there is no information
  there; the system just reboots silently.
+ 
+ This is what you can see from syslog after reboot:
+ Mar 12 04:27:39 modoc kernel: [  536.668305] Injecting error (-12) to MEM_GOING_OFFLINE
+ Mar 12 04:27:39 modoc kernel: [  536.684547] Injecting error (-12) to MEM_GOING_OFFLINE
+ Mar 12 04:27:39 modoc kernel: [  536.700907] Injecting error (-12) to MEM_GOING_OFFLINE
+ Mar 12 04:27:39 modoc kernel: [  536.717246] Injecting error (-12) to MEM_GOING_OFFLINE
+ Mar 12 04:27:39 modoc kernel: [  536.719288] page:c00c00c4f000 refcount:1 mapcount:0 mapping:c00f8cfe0fd1 index:0x7611c3e
+ Mar 12 04:27:39 modoc kernel: [  536.719289] anon
+ Mar 12 04:27:39 modoc kernel: [  536.719291] flags: 0x3800080024(uptodate|active|swapbacked)
+ Mar 12 04:27:39 modoc kernel: [  536.719294] raw: 003800080024 5deadbeef100 5deadbeef122 c00f8cfe0fd1
+ Mar 12 04:27:39 modoc kernel: [  536.719295] raw: 07611c3e  0001 c00fcfd1c000
+ Mar 12 04:27:39 modoc kernel: [  536.719296] page dumped because: unmovable page
+ Mar 12 04:27:39 modoc kernel: [  536.719296] page->mem_cgroup:c00fcfd1c000
+ Mar 12 04:27:39 modoc kernel: [  536.735465] Injecting error (-12) to MEM_GOING_OFFLINE
+ Mar 12 04:27:39 modoc kernel: [  536.751848] Injecting error (-12) to MEM_GOING_OFFLINE
+ Mar 12 04:27:39 modoc kernel: [  536.768210] Injecting error (-12) to MEM_GOING_OFFLINE
+ Mar 12 04:27:39 modoc kernel: [  536.784450] Injecting error (-12) to MEM_GOING_OFFLINE
+ Mar 12 04:27:39 modoc kernel: [  536.800756] Injecting error (-12) to MEM_GOING_OFFLINE
+ Mar 12 04:27:39 modoc kernel: [  536.817006] Injecting error (-12) to MEM_GOING_OFFLINE
+ Mar 12 04:27:39 modoc kernel: [  536.833133] Injecting error (-12) to MEM_GOING_OFFLINE
+ Mar 
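
The five attempts in the description stop at different points (IPv4 UDP, IPv6 TCP, IPv4 TCP), so it can help to report mechanically how far each log got before going silent. A minimal sketch, assuming the `HH:MM:SS DEBUG| [stdout] ...` line format seen in the excerpts above; the function name `last_progress` is hypothetical and not taken from the test harness.

```python
import re

# Assumed log line shape, based only on the excerpts in the description.
LINE_RE = re.compile(r"^\s*\d{2}:\d{2}:\d{2} DEBUG\| \[stdout\] (.*)$")
SECTION_RE = re.compile(r"#\s+IPv[46] (UDP|TCP)")
TEST_PREFIX = "# selftests: net: "


def last_progress(log_text):
    """Return (test, section, last_message) reached before the log stops."""
    test = section = last = None
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # notes such as "(lines skipped)" are ignored
        msg = match.group(1).strip()
        if msg.startswith(TEST_PREFIX):
            test = msg[len(TEST_PREFIX):]
            section = None  # a new test starts with no protocol section yet
        elif SECTION_RE.fullmatch(msg):
            section = msg.lstrip("# ")
        last = msg
    return test, section, last


# Second attempt from the description, abbreviated the same way.
SAMPLE = """\
10:17:35 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
10:17:35 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
10:17:35 DEBUG| [stdout] #  IPv4 UDP
10:17:35 DEBUG| [stdout] # send cpu 0, receive socket 0
(line skipped)
10:17:35 DEBUG| [stdout] # send cpu 159, receive socket 159
10:17:35 DEBUG| [stdout] #  IPv6 TCP
(hang here)
"""

print(last_progress(SAMPLE)[:2])
```

For the second attempt this reports reuseport_bpf_cpu and IPv6 TCP, the same path (tcp_v6_rcv) that appears in the console trace captured over IPMI.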

[Bug 1867155] Re: P8 node modoc will reboot automatically when running the sru_misc test suite

2020-03-12 Thread Po-Hsu Lin
** Tags removed: sru-20200217
** Tags added: kqa-blocker
