Public bug reported:

[Impact]
When rebooting with the focal kernel, my system always raises a machine
check exception (MCE). Installing the NVIDIA driver, or simply blacklisting
the nouveau driver, avoids the issue.

Sometimes the MCE hard-hangs the system, requiring a manual power cycle:

[  OK  ] Reached target Reboot.
[  402.489755] Disabling lock debugging due to kernel taint
[  402.495319] mce: [Hardware Error]: CPU 24: Machine Check Exception: 5 Bank 6: bb80000000000e0b
[  402.503924] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff9ead91c7> {intel_idle+0x87/0x130}
[  402.512530] mce: [Hardware Error]: TSC 29fb4740af0 MISC d7000000
[  402.518622] mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1601415822 SOCKET 1 APIC 40 microcode 2006906
[  402.527998] mce: [Hardware Error]: Run the above through 'mcelog --ascii'

Other times it emits the MCE tombstone, but goes ahead and reboots
itself:

[  OK  ] Reached target Reboot.
[  870.372933] Disabling lock debugging due to kernel taint
[  870.378505] mce: [Hardware Error]: CPU 24: Machine Check Exception: 5 Bank 6: bb80000000000e0b
[  870.387110] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8e2d4847> {intel_idle+0x87/0x130}
[  870.395716] mce: [Hardware Error]: TSC 44e0f5e602c MISC d7000000
[  870.401801] mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1589320331 SOCKET 1 APIC 40 microcode 2000064
[  870.411185] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[  870.420531] mce: [Hardware Error]: Machine check: Processor context corrupt
[  870.427488] Kernel panic - not syncing: Fatal machine check
[  870.433108] Kernel Offset: 0xc800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  871.054820] Rebooting in 30 seconds..
[  900.901238] ACPI MEMORY or I/O RESET_REG.

Copyright(c) 2015 American Megatrends, Inc. 
0x19 : Pre-memory SB Initialization.         
Copyright(c) 2016 American Megatrends, Inc.
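
For readers without mcelog at hand, the status word printed on the
"Bank 6" line can be partly decoded by hand. The following is only a
minimal sketch, assuming the architectural IA32_MCi_STATUS flag layout
from the Intel SDM (bits 63 down to 57); the function name
decode_mci_status is hypothetical and is not part of mcelog or the
kernel. It does not interpret the model-specific fields.

```python
# Architectural flag bits of IA32_MCi_STATUS (assumed from the Intel SDM,
# "Machine-Check Architecture" chapter). Only bits 63..57 are covered here.
FLAGS = {
    63: "VAL",    # register contains valid information
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # the MISC register holds additional information
    58: "ADDRV",  # the ADDR register holds the error address
    57: "PCC",    # processor context corrupt
}


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into its flags and error codes."""
    set_flags = [
        name
        for bit, name in sorted(FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "flags": set_flags,
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        "mca_error_code": status & 0xFFFF,               # bits 15:0
    }


# Status word taken from the log above.
decoded = decode_mci_status(0xBB80000000000E0B)
print(decoded["flags"])
print(hex(decoded["mca_error_code"]))
```

The PCC flag being set matches the "Processor context corrupt" line in
the second log, and ADDRV being clear is consistent with no ADDR value
being printed.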

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Confirmed

** Changed in: linux (Ubuntu)
       Status: New => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1908294

Title:
  MCE on shutdown when nouveau driver loaded

Status in linux package in Ubuntu:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1908294/+subscriptions
