[Kernel-packages] [Bug 1922387] Re: BUG: kernel NULL pointer dereference, address: 0000000000000050
Also worth mentioning: we are only seeing this on the A100. Neither our automated testing nor our manual testing of ftrace saw any issues on DGX2.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1922387

Title: BUG: kernel NULL pointer dereference, address: 0000000000000050

Status in linux package in Ubuntu: Incomplete
Status in linux source package in Focal: Confirmed
Status in linux source package in Groovy: Incomplete
Status in linux source package in Hirsute: Incomplete

Bug description:
I observed the following kernel panic with the 5.4.0-71.79-generic kernel while running kernel selftests:

blanka login: [ 1671.958400] mmiotrace: Error taking CPU253 down: -28
[ 1672.118199] mmiotrace: Error taking CPU254 down: -28
[ 1672.230306] mmiotrace: Error taking CPU255 down: -28
[ 2503.359753] BUG: kernel NULL pointer dereference, address: 0000000000000050
[ 2503.367527] #PF: supervisor read access in kernel mode
[ 2503.373257] #PF: error_code(0x) - not-present page
[ 2503.378989] PGD 0 P4D 0
[ 2503.381812] Oops: [#1] SMP NOPTI
[ 2503.385896] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE 5.4.0-71-generic #79-Ubuntu
[ 2503.395795] Hardware name: NVIDIA DGXA100 920-23687-2530-000/DGXA100, BIOS 0.33 01/19/2021
[ 2503.405027] RIP: 0010:trace_event_raw_event_wbt_timer+0x6f/0x100
[ 2503.411728] Code: 59 80 e5 02 0f 85 8f 00 00 00 4c 89 e6 ba 34 00 00 00 48 8d 7d a0 e8 d0 a4 ca ff 49 89 c4 48 85 c0 74 37 49 8b 87 b8 03 00 00 <48> 8b 70 50 48 85 f6 74 45 49 8d 7c 24 08 ba 20 00 00 00 e8 59 91
[ 2503.432683] RSP: 0018:a8d6c0003d90 EFLAGS: 00010286
[ 2503.438513] RAX: RBX: RCX: 8100
[ 2503.446474] RDX: 9968a228f418 RSI: 0100 RDI: 9968a228f414
[ 2503.454436] RBP: a8d6c0003df8 R08: 9968a228f414 R09: 0100
[ 2503.462394] R10: 0007 R11: 0007 R12: 9968a228f418
[ 2503.470353] R13: fffa R14: 0003 R15: 9a686f9b3000
[ 2503.478316] FS: () GS:99690cc0() knlGS:
[ 2503.487342] CS: 0010 DS: ES: CR0: 80050033
[ 2503.493752] CR2: 0050 CR3: 007e08ad6000 CR4: 00340ef0
[ 2503.501712] Call Trace:
[ 2503.504438]
[ 2503.506682] wb_timer_fn+0x1d6/0x3c0
[ 2503.510672] ? blk_stat_free_callback_rcu+0x30/0x30
[ 2503.516112] blk_stat_timer_fn+0x134/0x140
[ 2503.520683] call_timer_fn+0x32/0x130
[ 2503.524768] __run_timers.part.0+0x180/0x280
[ 2503.529535] ? trace_event_raw_event_softirq+0x5d/0xa0
[ 2503.535267] run_timer_softirq+0x2a/0x50
[ 2503.539644] __do_softirq+0xe1/0x2d6
[ 2503.543629] irq_exit+0xae/0xb0
[ 2503.547132] smp_apic_timer_interrupt+0x7b/0x140
[ 2503.552280] apic_timer_interrupt+0xf/0x20
[ 2503.556848]
[ 2503.559187] RIP: 0010:native_safe_halt+0xe/0x10
[ 2503.564239] Code: 7b ff ff ff eb bd 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d 66 dd 52 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 56 dd 52 00 fb f4 90 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 e8 cd cd 63 ff 65
[ 2503.585191] RSP: 0018:94803e18 EFLAGS: 0202 ORIG_RAX: ff13
[ 2503.593635] RAX: 0001e7c0 RBX: 996849080de8 RCX: 00149022
[ 2503.601595] RDX: 00149022 RSI: RDI: 948c5ba0
[ 2503.609556] RBP: 94803e38 R08: 02a8 R09: 9968a228f000
[ 2503.617516] R10: R11: 0002 R12:
[ 2503.625475] R13: R14: R15:
[ 2503.633440] ? default_idle+0x20/0x140
[ 2503.637623] arch_cpu_idle+0x15/0x20
[ 2503.641608] default_idle_call+0x23/0x30
[ 2503.645984] do_idle+0x1fb/0x270
[ 2503.649583] cpu_startup_entry+0x20/0x30
[ 2503.653960] rest_init+0xae/0xb0
[ 2503.657563] arch_call_rest_init+0xe/0x1b
[ 2503.662025] start_kernel+0x549/0x56a
[ 2503.666108] x86_64_start_reservations+0x24/0x26
[ 2503.671258] x86_64_start_kernel+0x75/0x79
[ 2503.675828] secondary_startup_64+0xa4/0xb0
[ 2503.680493] Modules linked in: sch_etf sch_fq dccp_ipv6 dccp_ipv4 dccp ip6table_nat iptable_nat xt_nat nf_nat algif_hash af_alg ip6table_filter xt_conntrack nf_conntrack nf_defrag_ipv4 ip6_tables nf_defrag_ipv6 ip_vti ip6_vti fou6 sit ipip tunnel4 geneve act_mirred cls_basic esp6 authenc echainiv iptable_filter xt_policy bpfilter veth esp4_offload esp4 xfrm_user xfrm_algo macsec fou vxlan ip6_udp_tunnel udp_tunnel vrf 8021q garp mrp bridge stp llc ip6_gre ip6_tunnel tunnel6 ip_gre ip_tunnel gre cls_u32 sch_htb dummy binfmt_misc nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua amd64_edac_mod edac_mce_amd kvm_amd kvm ipmi_ssif input_leds cdc_ether usbnet mii ccp k10temp ipmi_si ipmi_devintf
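
A note for anyone decoding the oops above by hand. The byte marked <48> in the first Code: line is where the fault hit. By my reading of the x86-64 encoding, 48 8b 70 50 is "mov 0x50(%rax),%rsi", and the preceding 49 8b 87 b8 03 00 00 is "mov 0x3b8(%r15),%rax". That would be consistent with the reported fault address of 0x50 if the pointer loaded into RAX was NULL, although the RAX value itself is not legible in this copy of the register dump. The small helper below (faulting_bytes is a hypothetical name, not part of any kernel tool) only pulls the marked byte and the three bytes after it out of a Code: line; the kernel tree's scripts/decodecode produces a full disassembly.

```shell
#!/bin/sh
# Print the faulting opcode byte (marked <..> in an oops "Code:" line)
# together with the three bytes that follow it.
# faulting_bytes is a hypothetical helper name used only for this sketch.
faulting_bytes() {
    sed -n 's/.*<\([0-9a-f][0-9a-f]\)> \([0-9a-f][0-9a-f] [0-9a-f][0-9a-f] [0-9a-f][0-9a-f]\).*/\1 \2/p'
}

# Sample input: the tail of the first Code: line from the oops above.
echo 'Code: 74 37 49 8b 87 b8 03 00 00 <48> 8b 70 50 48 85 f6 74 45' | faulting_bytes
# prints: 48 8b 70 50
```

The function reads from stdin, so a saved console log can be piped through it unchanged.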
[Kernel-packages] [Bug 1922387] Re: BUG: kernel NULL pointer dereference, address: 0000000000000050
Here are the steps I used to reproduce:

# If using the proposed pocket kernel, see
# https://wiki.ubuntu.com/Testing/EnableProposed
# deb-src must be enabled for proposed/updates for this to work
$ sudo apt update
$ sudo apt-get source linux
# After the source is pulled, build and run the ftrace selftests
$ sudo make -C linux-5.4.0/tools/testing/selftests TARGETS=ftrace run_tests

I also tested on Ubuntu-5.4.0-70.78 and saw similar behavior with soft lockups, but have yet to replicate the crash there. That said, I don't feel I have evidence that this is a kernel regression.
[Kernel-packages] [Bug 1922387] Re: BUG: kernel NULL pointer dereference, address: 0000000000000050
I did some manual ubuntu_kernel_selftests ftrace testing on the 5.4.0-71.79-generic kernel. I was able to replicate the panic, though not on every run; even on runs with no panic, dmesg reported several soft lockups. After removing the MOFED dkms, I was unable to replicate the panic or any of the soft lockups previously seen. Currently I have no evidence as to which MOFED module is potentially triggering the problem.
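
When comparing runs with and without the MOFED dkms, it may help to count the relevant kernel messages instead of eyeballing dmesg. The following is a minimal sketch, not a tool used in this report: count_kernel_faults is a hypothetical name, and the pattern assumes the kernel's watchdog messages contain the phrase "soft lockup". The sample input uses only lines from the oops in this bug.

```shell
#!/bin/sh
# Count kernel log lines that report a soft lockup or a NULL pointer oops.
# Reads log text on stdin, for example:  dmesg | count_kernel_faults
# count_kernel_faults is a hypothetical helper name used only for this sketch.
count_kernel_faults() {
    grep -c -E 'soft lockup|BUG: kernel NULL pointer dereference'
}

# Sample input taken from the console log in this bug report.
printf '%s\n' \
    '[ 1672.230306] mmiotrace: Error taking CPU255 down: -28' \
    '[ 2503.359753] BUG: kernel NULL pointer dereference, address: 0050' \
    '[ 2503.367527] #PF: supervisor read access in kernel mode' | count_kernel_faults
# prints: 1
```

Note that grep -c exits with a non-zero status when the count is 0, which is the desired result for a clean run.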
[Kernel-packages] [Bug 1922387] Re: BUG: kernel NULL pointer dereference, address: 0000000000000050
This panic occurred while running the ubuntu_kernel_selftests suite. The last few log lines are:

13:33:20 DEBUG| [stdout] # selftests: ftrace: ftracetest
13:33:20 DEBUG| [stdout] # === Ftrace unit tests ===
13:33:28 DEBUG| [stdout] # [1] Basic trace file check [PASS]
13:37:04 DEBUG| [stdout] # [2] Basic test for tracers [PASS]
13:39:48 DEBUG| [stdout] # [3] Basic trace clock test [PASS]
13:39:56 DEBUG| [stdout] # [4] Basic event tracing check [PASS]
13:40:04 DEBUG| [stdout] # [5] Change the ringbuffer size [PASS]
13:40:20 DEBUG| [stdout] # [6] Snapshot and tracing setting [PASS]
13:40:35 DEBUG| [stdout] # [7] trace_pipe and trace_marker [PASS]
13:40:51 DEBUG| [stdout] # [8] Generic dynamic event - add/remove kprobe events [PASS]
13:41:07 DEBUG| [stdout] # [9] Generic dynamic event - add/remove synthetic events [PASS]
13:41:14 DEBUG| [stdout] # [10] Generic dynamic event - selective clear (compatibility) [PASS]
13:41:22 DEBUG| [stdout] # [11] Generic dynamic event - generic clear event [PASS]
13:41:46 DEBUG| [stdout] # [12] event tracing - enable/disable with event level files [PASS]
13:42:17 DEBUG| [stdout] # [13] event tracing - restricts events based on pid [PASS]
13:42:41 DEBUG| [stdout] # [14] event tracing - enable/disable with subsystem level files [PASS]
13:43:05 DEBUG| [stdout] # [15] event tracing - enable/disable with top level files [PASS]
13:43:14 DEBUG| [stdout] # [16] Test trace_printk from module [PASS]
13:43:56 DEBUG| [stdout] # [17] ftrace - function graph filters with stack tracer [PASS]
13:44:29 DEBUG| [stdout] # [18] ftrace - function graph filters [PASS]
13:45:49 DEBUG| [stdout] # [19] ftrace - function pid filters [PASS]
13:46:06 DEBUG| [stdout] # [20] ftrace - stacktrace filter command [PASS]
13:46:38 DEBUG| [stdout] # [21] ftrace - function trace with cpumask [PASS]
13:47:13 DEBUG| [stdout] # [22] ftrace - test for function event triggers [PASS]
13:47:21 DEBUG| [stdout] # [23] ftrace - function trace on module [PASS]
13:47:31 DEBUG| [stdout] # [24] ftrace - function profiling [PASS]
13:48:07 DEBUG| [stdout] # [25] ftrace - function profiler with function tracing [PASS]
13:48:25 DEBUG| [stdout] # [26] ftrace - test reading of set_ftrace_filter [PASS]
END OF MESSAGES

This job was run twice. The prior run also hung before completing, but we don't have a console log for that time period, so it's unclear whether it also panicked. Its last messages were:

04:44:27 DEBUG| [stdout] # selftests: timers: nsleep-lat
04:44:48 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME [OK]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC [OK]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC_RAW [UNSUPPORTED]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME_COARSE [UNSUPPORTED]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC_COARSE [UNSUPPORTED]
04:45:30 DEBUG| [stdout] # nsleep latency CLOCK_BOOTTIME [OK]
04:45:52 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME_ALARM [OK]
04:46:13 DEBUG| [stdout] # nsleep latency CLOCK_BOOTTIME_ALARM [OK]
04:46:34 DEBUG| [stdout] # nsleep latency CLOCK_TAI [OK]
04:46:34 DEBUG| [stdout] # # Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
04:46:34 DEBUG| [stdout] ok 3 selftests: timers: nsleep-lat
04:46:34 DEBUG| [stdout] # selftests: timers: set-timer-lat

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1922387

Title: BUG: kernel NULL pointer dereference, address: 0000000000000050

Status in linux package in Ubuntu: New
Status in linux source package in Focal: Confirmed
Status in linux source package in Groovy: New
Status in linux source package in Hirsute: New
[Kernel-packages] [Bug 1922387] Re: BUG: kernel NULL pointer dereference, address: 0000000000000050
This panic occurred while running the ubuntu_kernel_selftests suite. The last few log lines are:

13:33:20 DEBUG| [stdout] # selftests: ftrace: ftracetest
13:33:20 DEBUG| [stdout] # === Ftrace unit tests ===
13:33:28 DEBUG| [stdout] # [1] Basic trace file check [PASS]
13:37:04 DEBUG| [stdout] # [2] Basic test for tracers [PASS]
13:39:48 DEBUG| [stdout] # [3] Basic trace clock test [PASS]
13:39:56 DEBUG| [stdout] # [4] Basic event tracing check [PASS]
13:40:04 DEBUG| [stdout] # [5] Change the ringbuffer size [PASS]
13:40:20 DEBUG| [stdout] # [6] Snapshot and tracing setting [PASS]
13:40:35 DEBUG| [stdout] # [7] trace_pipe and trace_marker [PASS]
13:40:51 DEBUG| [stdout] # [8] Generic dynamic event - add/remove kprobe events [PASS]
13:41:07 DEBUG| [stdout] # [9] Generic dynamic event - add/remove synthetic events [PASS]
13:41:14 DEBUG| [stdout] # [10] Generic dynamic event - selective clear (compatibility) [PASS]
13:41:22 DEBUG| [stdout] # [11] Generic dynamic event - generic clear event [PASS]
13:41:46 DEBUG| [stdout] # [12] event tracing - enable/disable with event level files [PASS]
13:42:17 DEBUG| [stdout] # [13] event tracing - restricts events based on pid [PASS]
13:42:41 DEBUG| [stdout] # [14] event tracing - enable/disable with subsystem level files [PASS]
13:43:05 DEBUG| [stdout] # [15] event tracing - enable/disable with top level files [PASS]
13:43:14 DEBUG| [stdout] # [16] Test trace_printk from module [PASS]
13:43:56 DEBUG| [stdout] # [17] ftrace - function graph filters with stack tracer [PASS]
13:44:29 DEBUG| [stdout] # [18] ftrace - function graph filters [PASS]
13:45:49 DEBUG| [stdout] # [19] ftrace - function pid filters [PASS]
13:46:06 DEBUG| [stdout] # [20] ftrace - stacktrace filter command [PASS]
13:46:38 DEBUG| [stdout] # [21] ftrace - function trace with cpumask [PASS]
13:47:13 DEBUG| [stdout] # [22] ftrace - test for function event triggers [PASS]
13:47:21 DEBUG| [stdout] # [23] ftrace - function trace on module [PASS]
13:47:31 DEBUG| [stdout] # [24] ftrace - function profiling [PASS]
13:48:07 DEBUG| [stdout] # [25] ftrace - function profiler with function tracing [PASS]
13:48:25 DEBUG| [stdout] # [26] ftrace - test reading of set_ftrace_filter [PASS]
END OF MESSAGES

This job was run twice. The prior run also hung before completing, but we don't have a console log for that time period, so it's unclear whether it also panicked. Its last messages were:

04:44:27 DEBUG| [stdout] # selftests: timers: nsleep-lat
04:44:48 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME [OK]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC [OK]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC_RAW [UNSUPPORTED]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME_COARSE [UNSUPPORTED]
04:45:09 DEBUG| [stdout] # nsleep latency CLOCK_MONOTONIC_COARSE [UNSUPPORTED]
04:45:30 DEBUG| [stdout] # nsleep latency CLOCK_BOOTTIME [OK]
04:45:52 DEBUG| [stdout] # nsleep latency CLOCK_REALTIME_ALARM [OK]
04:46:13 DEBUG| [stdout] # nsleep latency CLOCK_BOOTTIME_ALARM [OK]
04:46:34 DEBUG| [stdout] # nsleep latency CLOCK_TAI [OK]
04:46:34 DEBUG| [stdout] # # Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
04:46:34 DEBUG| [stdout] ok 3 selftests: timers: nsleep-lat
04:46:34 DEBUG| [stdout] # selftests: timers: set-timer-lat

The job can be found here:
http://10.246.72.4:8080/view/nvidia%20a100%20-%20blanka/job/focal-linux-generic-amd64-5.4.0-blanka-ubuntu_kernel_selftests/
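
Since both runs hung partway through, the last completed test entry is the most useful line in a log like the ftracetest one in this message. The sketch below extracts it from log text on stdin. It assumes the line layout shown in this report ("# [N] description [RESULT]"), and last_ftrace_test is a hypothetical helper name.

```shell
#!/bin/sh
# Print the last numbered ftracetest entry found in a test log on stdin.
# Assumes entries look like "... # [N] description [RESULT]" as in this report.
# last_ftrace_test is a hypothetical helper name used only for this sketch.
last_ftrace_test() {
    sed -n 's/.*# \(\[[0-9][0-9]*] .*\[[A-Z]*]\).*/\1/p' | tail -n 1
}

# Sample input: the final lines of the ftracetest log quoted in this report.
printf '%s\n' \
    '13:48:07 DEBUG| [stdout] # [25] ftrace - function profiler with function tracing [PASS]' \
    '13:48:25 DEBUG| [stdout] # [26] ftrace - test reading of set_ftrace_filter [PASS]' \
    'END OF MESSAGES' | last_ftrace_test
# prints: [26] ftrace - test reading of set_ftrace_filter [PASS]
```

Lines without a numbered entry, such as the "END OF MESSAGES" marker, are ignored.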