https://lists.ubuntu.com/archives/kernel-team/2018-August/094654.html

** Description changed:

+ 
+ == SRU Justification ==
+ A mainline commit introduced a regression in v4.15-rc1.  The regression
+ causes a kernel panic during system shutdown.  This commit fixes
+ that regression.  This commit was also cc'd to upstream stable, but it
+ has not yet landed in Bionic.
+ 
+ == Fix ==
+ 0d98ba8d70b0 ("scsi: hpsa: disable device during shutdown")
+ 
+ == Regression Potential ==
+ Low.  This patch fixes a current regression.  It has been cc'd to
+ upstream stable, so it has had additional upstream review.
+ 
+ == Test Case ==
+ A test kernel was built with this patch and tested by the original bug reporter.
+ The bug reporter states the test kernel resolved the bug.
+ 
+ 
  Verified on multiple DL360 Gen9 servers with up to date firmware.  Just
  before reboot or shutdown, there is the following panic:
  
  [  289.093083] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
  [  289.093085] {1}[Hardware Error]: event severity: fatal
  [  289.093087] {1}[Hardware Error]:  Error 0, type: fatal
  [  289.093088] {1}[Hardware Error]:   section_type: PCIe error
  [  289.093090] {1}[Hardware Error]:   port_type: 4, root port
  [  289.093091] {1}[Hardware Error]:   version: 1.16
  [  289.093093] {1}[Hardware Error]:   command: 0x6010, status: 0x0143
  [  289.093094] {1}[Hardware Error]:   device_id: 0000:00:01.0
  [  289.093095] {1}[Hardware Error]:   slot: 0
  [  289.093096] {1}[Hardware Error]:   secondary_bus: 0x03
  [  289.093097] {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2f02
  [  289.093098] {1}[Hardware Error]:   class_code: 040600
  [  289.093378] {1}[Hardware Error]:   bridge: secondary_status: 0x2000, control: 0x0003
  [  289.093380] {1}[Hardware Error]:  Error 1, type: fatal
  [  289.093381] {1}[Hardware Error]:   section_type: PCIe error
  [  289.093382] {1}[Hardware Error]:   port_type: 4, root port
  [  289.093383] {1}[Hardware Error]:   version: 1.16
  [  289.093384] {1}[Hardware Error]:   command: 0x6010, status: 0x0143
  [  289.093386] {1}[Hardware Error]:   device_id: 0000:00:01.0
  [  289.093386] {1}[Hardware Error]:   slot: 0
  [  289.093387] {1}[Hardware Error]:   secondary_bus: 0x03
  [  289.093388] {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2f02
  [  289.093674] {1}[Hardware Error]:   class_code: 040600
  [  289.093676] {1}[Hardware Error]:   bridge: secondary_status: 0x2000, control: 0x0003
  [  289.093678] Kernel panic - not syncing: Fatal hardware error!
  [  289.093745] Kernel Offset: 0x1cc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
  [  289.105835] ERST: [Firmware Warn]: Firmware does not respond in time.
  
  It does eventually restart after this.  Then during the subsequent POST,
  the following warning appears:
  
  Embedded RAID 1 : Smart Array P440ar Controller - (2048 MB, V6.30) 7 Logical
  Drive(s) - Operation Failed
-  - 1719-Slot 0 Drive Array - A controller failure event occurred prior
-    to this power-up.  (Previous lock up code = 0x13) Action: Install the
-    latest controller firmware. If the problem persists, replace the
-    controller.
+  - 1719-Slot 0 Drive Array - A controller failure event occurred prior
+    to this power-up.  (Previous lock up code = 0x13) Action: Install the
+    latest controller firmware. If the problem persists, replace the
+    controller.
  
  The latter's symptoms are described in
  https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c04805565
  but the running storage controller firmware is much newer than the doc's
  resolution.
  
  Neither of these problems occurs during shutdown/reboot on the xenial
  kernel.
  
  FWIW, when running on old P89 (1.50 (07/20/2015) vs 2.56 (01/22/2018)),
  the shutdown failure mode was a loop like so:
  
  [529151.035267] NMI: IOCK error (debug interrupt?) for reason 75 on CPU 0.
  [529153.222883] Uhhuh. NMI received for unknown reason 25 on CPU 0.
  [529153.222884] Do you have a strange power saving mode enabled?
  [529153.222884] Dazed and confused, but trying to continue
  [529153.554447] Uhhuh. NMI received for unknown reason 25 on CPU 0.
  [529153.554448] Do you have a strange power saving mode enabled?
  [529153.554449] Dazed and confused, but trying to continue
  [529153.554450] Uhhuh. NMI received for unknown reason 25 on CPU 0.
  [529153.554451] Do you have a strange power saving mode enabled?
  [529153.554452] Dazed and confused, but trying to continue
  [529153.554452] Uhhuh. NMI received for unknown reason 25 on CPU 0.
  [529153.554453] Do you have a strange power saving mode enabled?
  [529153.554454] Dazed and confused, but trying to continue
  [529153.554454] Uhhuh. NMI received for unknown reason 35 on CPU 0.
  [529153.554455] Do you have a strange power saving mode enabled?
  [529153.554456] Dazed and confused, but trying to continue
  [529153.554457] Uhhuh. NMI received for unknown reason 25 on CPU 0.
  [529153.554458] Do you have a strange power saving mode enabled?
  [529153.554458] Dazed and confused, but trying to continue
  [529153.554459] Uhhuh. NMI received for unknown reason 25 on CPU 0.
  [529153.554460] Do you have a strange power saving mode enabled?
  [529153.554460] Dazed and confused, but trying to continue
  [529154.953916] Uhhuh. NMI received for unknown reason 25 on CPU 0.
  [529154.953917] Do you have a strange power saving mode enabled?
  [529154.953918] Dazed and confused, but trying to continue
  
  But upgrading to 2.56 changes that to a kernel panic.
  
  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-signed-image-generic 4.15.0.21.22
  ProcVersionSignature: Ubuntu 4.15.0-21.22-generic 4.15.17
  Uname: Linux 4.15.0-21-generic x86_64
  AlsaDevices:
-  total 0
-  crw-rw---- 1 root audio 116,  1 May 15 23:11 seq
-  crw-rw---- 1 root audio 116, 33 May 15 23:11 timer
+  total 0
+  crw-rw---- 1 root audio 116,  1 May 15 23:11 seq
+  crw-rw---- 1 root audio 116, 33 May 15 23:11 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  Date: Wed May 16 00:17:53 2018
  HibernationDevice: RESUME=UUID=696e8063-c668-4c89-a478-bfc23a450369
  InstallationDate: Installed on 2016-06-01 (713 days ago)
  InstallationMedia: Ubuntu-Server 14.04.5 LTS "Trusty Tahr" - Beta amd64 (20160527)
  MachineType: HP ProLiant DL360 Gen9
  PciMultimedia:
-  
+ 
  ProcEnviron:
-  TERM=xterm-256color
-  PATH=(custom, no user)
-  LANG=en_US.UTF-8
-  SHELL=/bin/bash
+  TERM=xterm-256color
+  PATH=(custom, no user)
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
  ProcFB: 0 mgadrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-21-generic root=UUID=6e6d422d-8ffb-4db3-b8c7-6c81e320b1b2 ro console=tty0 console=ttyS1,38400 nosplash console=ttyS1,38400 console=tty0 nosplash
  RelatedPackageVersions:
-  linux-restricted-modules-4.15.0-21-generic N/A
-  linux-backports-modules-4.15.0-21-generic  N/A
-  linux-firmware                             1.173
+  linux-restricted-modules-4.15.0-21-generic N/A
+  linux-backports-modules-4.15.0-21-generic  N/A
+  linux-firmware                             1.173
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: Upgraded to bionic on 2018-05-09 (6 days ago)
  dmi.bios.date: 01/22/2018
  dmi.bios.vendor: HP
  dmi.bios.version: P89
  dmi.board.name: ProLiant DL360 Gen9
  dmi.board.vendor: HP
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrP89:bd01/22/2018:svnHP:pnProLiantDL360Gen9:pvr:rvnHP:rnProLiantDL360Gen9:rvr:cvnHP:ct23:cvr:
  dmi.product.family: ProLiant
  dmi.product.name: ProLiant DL360 Gen9
  dmi.sys.vendor: HP

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1771467

Title:
  Reboot/shutdown kernel panic on HP DL360/DL380 Gen9 w/ bionic 4.15.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1771467/+subscriptions
