Public bug reported:

The system log is as follows:
Nov 13 15:34:56 user-Z790-D kernel: [65651.108600] [WARNING][BB]Skip gain error 
setting in scan status
Nov 13 15:35:36 user-Z790-D kernel: [65691.403191] [WARNING][BB]Skip gain error 
setting in scan status
Nov 13 15:36:18 user-Z790-D kernel: [65732.710350] [WARNING][BB]Skip gain error 
setting in scan status
Nov 13 15:36:21 user-Z790-D gnome-shell[1964]: Can't update stage views actor 
<unnamed>[<MetaWindowGroup>:0x606fc6268660] is on because it needs an 
allocation.
Nov 13 15:36:21 user-Z790-D gnome-shell[1964]: Can't update stage views actor 
<unnamed>[<MetaWindowActorX11>:0x606fcb3fef20] is on because it needs an 
allocation.
Nov 13 15:36:21 user-Z790-D gnome-shell[1964]: Can't update stage views actor 
<unnamed>[<MetaSurfaceActorX11>:0x606fca86e6c0] is on because it needs an 
allocation.
Nov 13 15:36:21 user-Z790-D gnome-shell[1964]: Can't update stage views actor 
<unnamed>[<MetaWindowGroup>:0x606fc6268660] is on because it needs an 
allocation.
Nov 13 15:36:21 user-Z790-D gnome-shell[1964]: Can't update stage views actor 
<unnamed>[<MetaWindowActorX11>:0x606fcb3fef20] is on because it needs an 
allocation.
Nov 13 15:36:21 user-Z790-D gnome-shell[1964]: Can't update stage views actor 
<unnamed>[<MetaSurfaceActorX11>:0x606fca86e6c0] is on because it needs an 
allocation.
Nov 13 15:36:56 user-Z790-D kernel: [65771.072011] [WARNING][BB]Skip gain error 
setting in scan status
Nov 13 15:37:36 user-Z790-D kernel: [65811.392486] [WARNING][BB]Skip gain error 
setting in scan status
Nov 13 15:38:17 user-Z790-D kernel: [65851.847893] [WARNING][BB]Skip gain error 
setting in scan status
Nov 13 15:38:56 user-Z790-D kernel: [65890.816063] [WARNING][BB]Skip gain error 
setting in scan status
Nov 13 15:39:36 user-Z790-D kernel: [65931.099975] [WARNING][BB]Skip gain error 
setting in scan status
Nov 13 15:40:17 user-Z790-D kernel: [65971.896316] [WARNING][BB]Skip gain error 
setting in scan status
Nov 13 15:40:56 user-Z790-D kernel: [66010.802158] [WARNING][BB]Skip gain error 
setting in scan status
Nov 13 15:41:17 user-Z790-D kernel: [66032.098555] BUG: kernel NULL pointer 
dereference, address: 0000000000000000
Nov 13 15:41:17 user-Z790-D kernel: [66032.098560] #PF: supervisor instruction 
fetch in kernel mode
Nov 13 15:41:17 user-Z790-D kernel: [66032.098561] #PF: error_code(0x0010) - 
not-present page
Nov 13 15:41:17 user-Z790-D kernel: [66032.098562] PGD 489614067 P4D 489614067 
PUD 0 
Nov 13 15:41:17 user-Z790-D kernel: [66032.098565] Oops: 0010 [#1] PREEMPT SMP 
NOPTI
Nov 13 15:41:17 user-Z790-D kernel: [66032.098567] CPU: 10 PID: 7027 Comm: 
python Tainted: P           OE      6.5.0-44-generic #44~22.04.1-Ubuntu
Nov 13 15:41:17 user-Z790-D kernel: [66032.098569] Hardware name: Gigabyte 
Technology Co., Ltd. Z790 D/Z790 D, BIOS F1 12/22/2023
Nov 13 15:41:17 user-Z790-D kernel: [66032.098570] RIP: 0010:0x0
Nov 13 15:41:17 user-Z790-D kernel: [66032.098595] Code: Unable to access 
opcode bytes at 0xffffffffffffffd6.
Nov 13 15:41:17 user-Z790-D kernel: [66032.098596] RSP: 0018:ffffaec60f01be08 
EFLAGS: 00010046
Nov 13 15:41:17 user-Z790-D kernel: [66032.098597] RAX: 00003c0e4bc72c8a RBX: 
ffffaec60f01be08 RCX: 0000000000000000
Nov 13 15:41:17 user-Z790-D kernel: [66032.098599] RDX: 0000000000000000 RSI: 
0000000000000000 RDI: 0000000000000000
Nov 13 15:41:17 user-Z790-D kernel: [66032.098599] RBP: 00000000000330c0 R08: 
0000000000000000 R09: 0000000000000000
Nov 13 15:41:17 user-Z790-D kernel: [66032.098600] R10: 0000000000000000 R11: 
0000000000000000 R12: ffffffffb3185872
Nov 13 15:41:17 user-Z790-D kernel: [66032.098601] R13: ffff988b3f6b30c0 R14: 
ffff987ecaad4d40 R15: 0000000000000000
Nov 13 15:41:17 user-Z790-D kernel: [66032.098602] FS:  000077889dacc740(0000) 
GS:ffff988b3f680000(0000) knlGS:0000000000000000
Nov 13 15:41:17 user-Z790-D kernel: [66032.098603] CS:  0010 DS: 0000 ES: 0000 
CR0: 0000000080050033
Nov 13 15:41:17 user-Z790-D kernel: [66032.098604] CR2: ffffffffffffffd6 CR3: 
00000003c6798000 CR4: 0000000000750ee0
Nov 13 15:41:17 user-Z790-D kernel: [66032.098606] PKRU: 55555554
Nov 13 15:41:17 user-Z790-D kernel: [66032.098606] Call Trace:
Nov 13 15:41:17 user-Z790-D kernel: [66032.098608]  <TASK>
Nov 13 15:41:17 user-Z790-D kernel: [66032.098609]  ? show_regs+0x6d/0x80
Nov 13 15:41:17 user-Z790-D kernel: [66032.098613]  ? __die+0x24/0x80
Nov 13 15:41:17 user-Z790-D kernel: [66032.098615]  ? page_fault_oops+0x99/0x1b0
Nov 13 15:41:17 user-Z790-D kernel: [66032.098617]  ? do_user_addr_fault+0x31d/0x6b0
Nov 13 15:41:17 user-Z790-D kernel: [66032.098619]  ? __cgroup_account_cputime+0x2f/0x60
Nov 13 15:41:17 user-Z790-D kernel: [66032.098622]  ? exc_page_fault+0x83/0x1b0
Nov 13 15:41:17 user-Z790-D kernel: [66032.098624]  ? asm_exc_page_fault+0x27/0x30
Nov 13 15:41:17 user-Z790-D kernel: [66032.098627]  ? sched_clock_cpu+0x12/0x1e0
Nov 13 15:41:17 user-Z790-D kernel: [66032.098630]  ? update_rq_clock+0x3c/0x230
Nov 13 15:41:17 user-Z790-D kernel: [66032.098632]  ? __schedule+0x133/0x750
Nov 13 15:41:17 user-Z790-D kernel: [66032.098634]  ? schedule+0x63/0x110
Nov 13 15:41:17 user-Z790-D kernel: [66032.098636]  ? do_sched_yield+0x9d/0xd0
Nov 13 15:41:17 user-Z790-D kernel: [66032.098637]  ? __do_sys_sched_yield+0xe/0x20
Nov 13 15:41:17 user-Z790-D kernel: [66032.098639]  ? x64_sys_call+0x116b/0x20b0
Nov 13 15:41:17 user-Z790-D kernel: [66032.098641]  ? do_syscall_64+0x55/0x90
Nov 13 15:41:17 user-Z790-D kernel: [66032.098642]  ? do_syscall_64+0x61/0x90
Nov 13 15:41:17 user-Z790-D kernel: [66032.098644]  ? do_syscall_64+0x61/0x90
Nov 13 15:41:17 user-Z790-D kernel: [66032.098645]  ? do_syscall_64+0x61/0x90
Nov 13 15:41:17 user-Z790-D kernel: [66032.098646]  ? entry_SYSCALL_64_after_hwframe+0x73/0xdd
Nov 13 15:41:17 user-Z790-D kernel: [66032.098649]  </TASK>
Nov 13 15:41:17 user-Z790-D kernel: [66032.098649] Modules linked in: 
nvidia_uvm(POE) snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel 
snd_sof_intel_hda_mlink soundwire_cadence intel_rapl_msr snd_sof_intel_hda 
intel_rapl_common snd_sof_pci snd_sof_xtensa_dsp intel_uncore_frequency joydev 
input_leds intel_uncore_frequency_common snd_sof 8852bu(OE) 
snd_hda_codec_realtek snd_sof_utils snd_soc_hdac_hda snd_hda_codec_generic 
snd_hda_ext_core snd_soc_acpi_intel_match nvidia_drm(POE) ledtrig_audio 
snd_soc_acpi x86_pkg_temp_thermal nvidia_modeset(POE) 
soundwire_generic_allocation intel_powerclamp soundwire_bus snd_soc_core 
snd_compress ac97_bus hid_generic snd_pcm_dmaengine cfg80211 coretemp 
snd_hda_codec_hdmi usbhid hid snd_hda_intel kvm_intel snd_intel_dspcfg 
snd_intel_sdw_acpi nvidia(POE) i915 snd_hda_codec kvm snd_hda_core snd_hwdep 
irqbypass snd_pcm crct10dif_pclmul polyval_clmulni snd_seq_midi polyval_generic 
binfmt_misc snd_seq_midi_event ghash_clmulni_intel sha256_ssse3 snd_rawmidi 
sha1_ssse3 drm_buddy aesni_intel ttm snd_seq crypto_simd
Nov 13 15:41:17 user-Z790-D kernel: [66032.098682]  drm_display_helper 
snd_seq_device nls_iso8859_1 cryptd mei_hdcp mei_pxp snd_timer cec rapl 
cmdlinepart rc_core intel_cstate snd spi_nor mei_me drm_kms_helper gigabyte_wmi 
wmi_bmof mtd soundcore i2c_algo_bit mei intel_hid sparse_keymap acpi_tad 
acpi_pad mac_hid sch_fq_codel msr parport_pc ppdev lp parport drm efi_pstore 
ip_tables x_tables autofs4 nvme crc32_pclmul r8169 i2c_i801 nvme_core 
intel_lpss_pci spi_intel_pci ahci realtek spi_intel intel_lpss i2c_smbus 
xhci_pci libahci idma64 xhci_pci_renesas nvme_common video wmi pinctrl_alderlake
Nov 13 15:41:17 user-Z790-D kernel: [66032.098709] CR2: 0000000000000000
Nov 13 15:41:17 user-Z790-D kernel: [66032.098711] ---[ end trace 
0000000000000000 ]---
Nov 13 15:41:17 user-Z790-D kernel: [66032.206403] RIP: 0010:0x0
Nov 13 15:41:17 user-Z790-D kernel: [66032.206418] Code: Unable to access 
opcode bytes at 0xffffffffffffffd6.
Nov 13 15:41:17 user-Z790-D kernel: [66032.206418] RSP: 0018:ffffaec60f01be08 
EFLAGS: 00010046
Nov 13 15:41:17 user-Z790-D kernel: [66032.206419] RAX: 00003c0e4bc72c8a RBX: 
ffffaec60f01be08 RCX: 0000000000000000
Nov 13 15:41:17 user-Z790-D kernel: [66032.206420] RDX: 0000000000000000 RSI: 
0000000000000000 RDI: 0000000000000000
Nov 13 15:41:17 user-Z790-D kernel: [66032.206421] RBP: 00000000000330c0 R08: 
0000000000000000 R09: 0000000000000000
Nov 13 15:41:17 user-Z790-D kernel: [66032.206422] R10: 0000000000000000 R11: 
0000000000000000 R12: ffffffffb3185872
Nov 13 15:41:17 user-Z790-D kernel: [66032.206422] R13: ffff988b3f6b30c0 R14: 
ffff987ecaad4d40 R15: 0000000000000000
Nov 13 15:41:17 user-Z790-D kernel: [66032.206423] FS:  000077889dacc740(0000) 
GS:ffff988b3f680000(0000) knlGS:0000000000000000
Nov 13 15:41:17 user-Z790-D kernel: [66032.206424] CS:  0010 DS: 0000 ES: 0000 
CR0: 0000000080050033
Nov 13 15:41:17 user-Z790-D kernel: [66032.206424] CR2: ffffffffffffffd6 CR3: 
00000003c6798000 CR4: 0000000000750ee0
Nov 13 15:41:17 user-Z790-D kernel: [66032.206425] PKRU: 55555554
Nov 13 15:41:17 user-Z790-D kernel: [66032.206426] note: python[7027] exited 
with irqs disabled
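
The trace above is marked "Tainted: P OE", and several entries in the
"Modules linked in:" list carry flags in parentheses (P = proprietary,
O = out-of-tree, E = unsigned). As a rough triage aid, the sketch below lists
those flagged modules. It is only an illustration: the function name
flagged_modules and the shortened sample string are assumptions made for this
example, and it does not identify which module, if any, caused the crash.

```python
import re

# Matches entries such as "nvidia_uvm(POE)" or "8852bu(OE)" in a
# "Modules linked in:" list. The upper-case letters in parentheses are the
# per-module taint flags; register dumps like "740(0000)" do not match
# because their parenthesised part is numeric.
MODULE_WITH_FLAGS = re.compile(r"\b([A-Za-z0-9_]+)\(([A-Z]+)\)")


def flagged_modules(oops_text):
    """Return (module, flags) pairs for modules that carry taint flags.

    Hypothetical helper for this example, not part of any kernel tooling.
    """
    return MODULE_WITH_FLAGS.findall(oops_text)


# Shortened excerpt of the module list from the log above.
sample = (
    "Modules linked in: nvidia_uvm(POE) snd_sof_pci_intel_tgl snd_sof "
    "8852bu(OE) nvidia_drm(POE) nvidia_modeset(POE) cfg80211 nvidia(POE) i915"
)

for name, flags in flagged_modules(sample):
    print(name, flags)
```

On the excerpt this reports the four nvidia modules and the 8852bu module as
flagged, which may be useful when deciding whether to retest without
out-of-tree drivers loaded.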

** Affects: ubuntu
     Importance: Undecided
         Status: New

** Summary changed:

- When training neural networks based on PyTorch, the Linux 22.04 system 
suddenly crashes, causing power outages to external devices and forcing them to 
shut down
+ When training neural networks based on PyTorch, the ubuntu 22.04 system 
suddenly crashes, causing power outages to external devices and forcing them to 
shut down

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2088049

Title:
  When training neural networks based on PyTorch, the ubuntu 22.04
  system suddenly crashes, causing power outages to external devices and
  forcing them to shut down

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+bug/2088049/+subscriptions


