SRU Request Submitted:
https://lists.ubuntu.com/archives/kernel-team/2018-June/093483.html
** Description changed:
+ == SRU Justification ==
+ IBM is seeing rcu_sched self-detected stall traces during testing. This
+ is due to a missing backport of an RTC driver fix, commit 682e6b4da5cb.
+ That commit was also cc'd to upstream stable, but it has not yet landed
+ in Bionic. It fixes upstream commit 628daa8d5abf.
+
+ Commit 34dd25de9fe3 is also needed as a prereq to define
+ OPAL_BUSY_DELAY_MS.
+
+ == Fixes ==
+ 34dd25de9fe3 ("powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops")
+ 682e6b4da5cb ("rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops")
+
+ == Regression Potential ==
+ Low. Limited to powerpc. Fixes a current regression.
+
+ == Test Case ==
+ A test kernel was built with these patches and tested by the original bug reporter.
+ The bug reporter states the test kernel resolved the bug.
+
+
== Comment: #0 - PAVAMAN SUBRAMANIYAM <[email protected]> - 2018-05-23 01:15:30 ==
Install a P9 OpenPOWER system with the latest OP920 firmware images
provided at the following link:
http://pfd.austin.ibm.com/releasenotes/openpower9/OP920/OP920_1808A/OP920_1808N_RelNote_Main.html
root@witherspoon:~# cat /etc/os-release
ID="openbmc-phosphor"
NAME="Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro)"
VERSION="ibm-v2.1"
VERSION_ID="ibm-v2.1-438-g0030304-r12-0-g5ee4fb0"
PRETTY_NAME="Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro) ibm-v2.1"
BUILD_ID="ibm-v2.1-438-g0030304-r12"
root@witherspoon:~# cat /var/lib/phosphor-software-manager/pnor/ro/VERSION
IBM-witherspoon-ibm-OP9-v2.0-2.14
- op-build-v2.0-11-gb248194-dirty
- buildroot-2018.02.1-6-ga8d1126
- skiboot-v6.0.1
- hostboot-8ab6717d-pfc036fa
- occ-77bb5e6
- linux-4.16.8-openpower2-pb532d68
- petitboot-v1.7.1-p1188545
- machine-xml-7cd20a6
- hostboot-binaries-276bb70
- capp-ucode-p9-dd2-v4
- sbe-a596975
- hcode-b8173e8
+ op-build-v2.0-11-gb248194-dirty
+ buildroot-2018.02.1-6-ga8d1126
+ skiboot-v6.0.1
+ hostboot-8ab6717d-pfc036fa
+ occ-77bb5e6
+ linux-4.16.8-openpower2-pb532d68
+ petitboot-v1.7.1-p1188545
+ machine-xml-7cd20a6
+ hostboot-binaries-276bb70
+ capp-ucode-p9-dd2-v4
+ sbe-a596975
+ hcode-b8173e8
Seeing the following messages in the dmesg logs.
[ 16.377405] ipmi_si: Unable to find any System Interface(s)
[ 17.384118] nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
[ 1372.711730] INFO: rcu_sched self-detected stall on CPU
[ 1372.711787] 32-....: (5249 ticks this GP) idle=182/140000000000001/0 softirq=1093/1093 fqs=2623
[ 1372.711863] (t=5250 jiffies g=22430 c=22429 q=953)
[ 1372.711921] Task dump for CPU 32:
[ 1372.711922] kworker/32:1 R running task 0 1123 2 0x00000804
[ 1372.711930] Workqueue: events rtc_timer_do_work
[ 1372.711931] Call Trace:
[ 1372.711934] [c000003fd2b97350] [c00000000014a8f8] sched_show_task.part.16+0xd8/0x110 (unreliable)
[ 1372.711939] [c000003fd2b973c0] [c0000000001aa8bc] rcu_dump_cpu_stacks+0xd4/0x138
[ 1372.711942] [c000003fd2b97410] [c0000000001a9988] rcu_check_callbacks+0x8e8/0xb40
[ 1372.711945] [c000003fd2b97540] [c0000000001b7c28] update_process_times+0x48/0x90
[ 1372.711948] [c000003fd2b97570] [c0000000001cf974] tick_sched_handle.isra.5+0x34/0xd0
[ 1372.711950] [c000003fd2b975a0] [c0000000001cfa70] tick_sched_timer+0x60/0xe0
[ 1372.711953] [c000003fd2b975e0] [c0000000001b87d4] __hrtimer_run_queues+0x144/0x370
[ 1372.711956] [c000003fd2b97660] [c0000000001b972c] hrtimer_interrupt+0xfc/0x350
[ 1372.711959] [c000003fd2b97730] [c0000000000249f0] __timer_interrupt+0x90/0x260
[ 1372.711962] [c000003fd2b97780] [c000000000024e08] timer_interrupt+0x98/0xe0
[ 1372.711965] [c000003fd2b977b0] [c000000000009054] decrementer_common+0x114/0x120
[ 1372.711970] --- interrupt: 901 at opal_get_rtc_time+0x98/0x110
- LR = opal_return+0x14/0x48
+ LR = opal_return+0x14/0x48
[ 1372.711972] [c000003fd2b97aa0] [c000000000a457b8] opal_get_rtc_time+0x98/0x110 (unreliable)
[ 1372.711975] [c000003fd2b97ae0] [c000000000a3f98c] __rtc_read_time+0x7c/0x180
[ 1372.711977] [c000003fd2b97b60] [c000000000a41738] rtc_timer_do_work+0x78/0x250
[ 1372.711980] [c000003fd2b97c90] [c000000000134378] process_one_work+0x298/0x5a0
[ 1372.711982] [c000003fd2b97d20] [c000000000134718] worker_thread+0x98/0x630
[ 1372.711985] [c000003fd2b97dc0] [c00000000013d348] kthread+0x1a8/0x1b0
[ 1372.711988] [c000003fd2b97e30] [c00000000000b658] ret_from_kernel_thread+0x5c/0x84
== Comment: #1 - PAVAMAN SUBRAMANIYAM <[email protected]> - 2018-05-23 01:31:06 ==
-
== Comment: #2 - Application Cdeadmin <[email protected]> - 2018-05-23 01:33:40 ==
cde00 ([email protected]) added native attachment /tmp/AIXOS07311082/dmesg.txt on 2018-05-23 01:33:33
== Comment: #3 - Application Cdeadmin <[email protected]> - 2018-05-24 16:45:41 ==
==== State: Open by: jayeshp on 24 May 2018 16:42:57 ====
#=#=# 2018-05-24 16:42:54 (CDT) #=#=#
New Fix_Potential = [P920.10W]
#=#=#=#=#=#=#=#=#=#=#=#=#=#=#=#=#=#=#
== Comment: #4 - Stewart Smith <[email protected]> - 2018-05-30 21:15:15 ==
This'll be a missing backport of some kernel fixes in the RTC driver.
It's at least this commit:
commit 682e6b4da5cbe8e9a53f979a58c2a9d7dc997175
Author: Nicholas Piggin <[email protected]>
Date: Tue Apr 10 21:49:32 2018 +1000
- rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops
-
- The OPAL RTC driver does not sleep in case it gets OPAL_BUSY or
- OPAL_BUSY_EVENT from firmware, which causes large scheduling
- latencies, up to 50 seconds have been observed here when RTC stops
- responding (BMC reboot can do it).
-
- Fix this by converting it to the standard form OPAL_BUSY loop that
- sleeps.
-
- Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks")
- Cc: [email protected] # v3.2+
- Signed-off-by: Nicholas Piggin <[email protected]>
- Acked-by: Alexandre Belloni <[email protected]>
- Signed-off-by: Michael Ellerman <[email protected]>
+ rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops
+
+ The OPAL RTC driver does not sleep in case it gets OPAL_BUSY or
+ OPAL_BUSY_EVENT from firmware, which causes large scheduling
+ latencies, up to 50 seconds have been observed here when RTC stops
+ responding (BMC reboot can do it).
+
+ Fix this by converting it to the standard form OPAL_BUSY loop that
+ sleeps.
+
+ Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks")
+ Cc: [email protected] # v3.2+
+ Signed-off-by: Nicholas Piggin <[email protected]>
+ Acked-by: Alexandre Belloni <[email protected]>
+ Signed-off-by: Michael Ellerman <[email protected]>
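
For readers unfamiliar with the pattern the commit message describes, here
is a minimal userspace sketch of an OPAL_BUSY style retry loop that sleeps
between attempts. It is not the driver code from commit 682e6b4da5cb. The
helpers fake_opal_rtc_read(), fake_msleep() and read_rtc_with_backoff() are
hypothetical, the return-code and delay values are illustrative stand-ins,
and the sleep is simulated by a counter so the example stays portable.

```c
#include <stdio.h>

/* Illustrative stand-ins for the kernel/OPAL definitions (assumptions). */
#define OPAL_SUCCESS         0
#define OPAL_BUSY          (-2)
#define OPAL_BUSY_EVENT   (-12)
#define OPAL_BUSY_DELAY_MS  10

static int busy_replies_left;  /* how often the fake firmware answers "busy" */
static int total_slept_ms;     /* time the loop yielded instead of spinning */

/* Hypothetical firmware call: busy for a while, then returns a time value. */
static int fake_opal_rtc_read(unsigned long *hms)
{
    if (busy_replies_left > 0) {
        busy_replies_left--;
        return OPAL_BUSY;
    }
    *hms = 0x011530UL;  /* arbitrary placeholder time value */
    return OPAL_SUCCESS;
}

/* Stand-in for the kernel's msleep(): records the delay, does not block. */
static void fake_msleep(int ms)
{
    total_slept_ms += ms;
}

/*
 * Retry while firmware reports busy, sleeping between attempts so the
 * CPU is released. Without the sleep this loop would spin for as long
 * as the RTC stays unresponsive.
 */
static int read_rtc_with_backoff(unsigned long *hms)
{
    int rc = OPAL_BUSY;

    while (rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT) {
        rc = fake_opal_rtc_read(hms);
        if (rc == OPAL_BUSY || rc == OPAL_BUSY_EVENT)
            fake_msleep(OPAL_BUSY_DELAY_MS);
    }
    return rc;
}
```

With busy_replies_left set to 3, read_rtc_with_backoff() makes four calls
to the fake firmware and accumulates three delays of OPAL_BUSY_DELAY_MS
before returning OPAL_SUCCESS. In the kernel the delay would be a real
sleep, which is what lets the scheduler run other work on that CPU.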
--
https://bugs.launchpad.net/bugs/1777857
Title:
[LTCTest][OPAL][OP920] INFO: rcu_sched self-detected stall on CPU