This bug was fixed in the package linux - 4.15.0-23.25

---------------
linux (4.15.0-23.25) bionic; urgency=medium

  * linux: 4.15.0-23.25 -proposed tracker (LP: #1772927)

  * arm64 SDEI support needs trampoline code for KPTI (LP: #1768630)
    - arm64: mmu: add the entry trampolines start/end section markers into
      sections.h
    - arm64: sdei: Add trampoline code for remapping the kernel

  * Some PCIe errors not surfaced through rasdaemon (LP: #1769730)
    - ACPI: APEI: handle PCIe AER errors in separate function
    - ACPI: APEI: call into AER handling regardless of severity

  * qla2xxx: Fix page fault at kmem_cache_alloc_node() (LP: #1770003)
    - scsi: qla2xxx: Fix session cleanup for N2N
    - scsi: qla2xxx: Remove unused argument from
      qlt_schedule_sess_for_deletion()
    - scsi: qla2xxx: Serialize session deletion by using work_lock
    - scsi: qla2xxx: Serialize session free in qlt_free_session_done
    - scsi: qla2xxx: Don't call dma_free_coherent with IRQ disabled.
    - scsi: qla2xxx: Fix warning in qla2x00_async_iocb_timeout()
    - scsi: qla2xxx: Prevent relogin trigger from sending too many commands
    - scsi: qla2xxx: Fix double free bug after firmware timeout
    - scsi: qla2xxx: Fixup locking for session deletion

  * Several hisi_sas bug fixes (LP: #1768974)
    - scsi: hisi_sas: dt-bindings: add an property of signal attenuation
    - scsi: hisi_sas: support the property of signal attenuation for v2 hw
    - scsi: hisi_sas: fix the issue of link rate inconsistency
    - scsi: hisi_sas: fix the issue of setting linkrate register
    - scsi: hisi_sas: increase timer expire of internal abort task
    - scsi: hisi_sas: remove unused variable hisi_sas_devices.running_req
    - scsi: hisi_sas: fix return value of hisi_sas_task_prep()
    - scsi: hisi_sas: Code cleanup and minor bug fixes

  * [bionic] machine stuck and bonding not working well when nvmet_rdma module
    is loaded (LP: #1764982)
    - nvmet-rdma: Don't flush system_wq by default during remove_one
    - nvme-rdma: Don't flush delete_wq by default during remove_one

  * Warnings/hang during error handling of SATA disks on SAS controller
    (LP: #1768971)
    - scsi: libsas: defer ata device eh commands to libata

  * Hotplugging a SATA disk into a SAS controller may cause crash (LP: #1768948)
    - ata: do not schedule hot plug if it is a sas host

  * ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU
    ATTEMPT TO RE-ENTER FIRMWARE! (LP: #1767927)
    - powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write()
    - powerpc/64s: return more carefully from sreset NMI
    - powerpc/64s: sreset panic if there is no debugger or crash dump handlers

  * fsnotify: Fix fsnotify_mark_connector race (LP: #1765564)
    - fsnotify: Fix fsnotify_mark_connector race

  * Hang on network interface removal in Xen virtual machine (LP: #1771620)
    - xen-netfront: Fix hang on device removal

  * HiSilicon HNS NIC names are truncated in /proc/interrupts (LP: #1765977)
    - net: hns: Avoid action name truncation

  * Ubuntu 18.04 kernel crashed while in degraded mode (LP: #1770849)
    - SAUCE: powerpc/perf: Fix memory allocation for core-imc based on
      num_possible_cpus()

  * Switch Build-Depends: transfig to fig2dev (LP: #1770770)
    - [Config] update Build-Depends: transfig to fig2dev

  * smp_call_function_single/many core hangs with stop4 alone (LP: #1768898)
    - cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer
      interrupt

  * Add d-i support for Huawei NICs (LP: #1767490)
    - d-i: add hinic to nic-modules udeb

  * unregister_netdevice: waiting for eth0 to become free. Usage count = 5
    (LP: #1746474)
    - xfrm: reuse uncached_list to track xdsts

  * Include nfp driver in linux-modules (LP: #1768526)
    - [Config] Add nfp.ko to generic inclusion list

  * Kernel panic on boot (m1.small in cn-north-1) (LP: #1771679)
    - x86/xen: Reset VCPU0 info pointer after shared_info remap

  * CVE-2018-3639 (x86)
    - x86/bugs: Fix the parameters alignment and missing void
    - KVM: SVM: Move spec control call after restore of GS
    - x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
    - x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
    - x86/cpufeatures: Disentangle SSBD enumeration
    - x86/cpufeatures: Add FEATURE_ZEN
    - x86/speculation: Handle HT correctly on AMD
    - x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
    - x86/speculation: Add virtualized speculative store bypass disable support
    - x86/speculation: Rework speculative_store_bypass_update()
    - x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}
    - x86/bugs: Expose x86_spec_ctrl_base directly
    - x86/bugs: Remove x86_spec_ctrl_set()
    - x86/bugs: Rework spec_ctrl base and mask logic
    - x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
    - KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
    - x86/bugs: Rename SSBD_NO to SSB_NO
    - bpf: Prevent memory disambiguation attack
    - KVM: VMX: Expose SSBD properly to guests.

  * Suspend to idle: Open lid didn't resume (LP: #1771542)
    - ACPI / PM: Do not reconfigure GPEs for suspend-to-idle

  * Fix initialization failure detection in SDEI for device-tree based systems
    (LP: #1768663)
    - firmware: arm_sdei: Fix return value check in sdei_present_dt()

  * No driver for Huawei network adapters on arm64 (LP: #1769899)
    - net-next/hinic: add arm64 support

  * CVE-2018-1092
    - ext4: fail ext4_iget for root directory if unallocated

  * kernel 4.15 breaks nouveau on Lenovo P50 (LP: #1763189)
    - drm/nouveau: Fix deadlock in nv50_mstm_register_connector()

  * update-initramfs not adding i915 GuC firmware for Kaby Lake, firmware fails
    to load (LP: #1728238)
    - Revert "UBUNTU: SAUCE: (no-up) i915: Remove MODULE_FIRMWARE statements for
      unreleased firmware"

  * Battery drains when laptop is off  (shutdown) (LP: #1745646)
    - PCI / PM: Check device_may_wakeup() in pci_enable_wake()

  * Dell Latitude 5490/5590 BIOS update 1.1.9 causes black screen at boot
    (LP: #1764194)
    - drm/i915/bios: filter out invalid DDC pins from VBT child devices

  * Intel 9462 A370:42A4 doesn't work (LP: #1748853)
    - iwlwifi: add shared clock PHY config flag for some devices
    - iwlwifi: add a bunch of new 9000 PCI IDs

  * Fix an issue that some PCI devices get incorrectly suspended (LP: #1764684)
    - PCI / PM: Always check PME wakeup capability for runtime wakeup support

  * [SRU][Bionic/Artful] fix false positives in W+X checking (LP: #1769696)
    - init: fix false positives in W+X checking

  * Bionic update to v4.15.18 stable release (LP: #1769723)
    - netfilter: ipset: Missing nfnl_lock()/nfnl_unlock() is added to
      ip_set_net_exit()
    - cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN
    - rds: MP-RDS may use an invalid c_path
    - slip: Check if rstate is initialized before uncompressing
    - vhost: fix vhost_vq_access_ok() log check
    - l2tp: fix races in tunnel creation
    - l2tp: fix race in duplicate tunnel detection
    - ip_gre: clear feature flags when incompatible o_flags are set
    - vhost: Fix vhost_copy_to_user()
    - lan78xx: Correctly indicate invalid OTP
    - media: v4l2-compat-ioctl32: don't oops on overlay
    - media: v4l: vsp1: Fix header display list status check in continuous mode
    - ipmi: Fix some error cleanup issues
    - parisc: Fix out of array access in match_pci_device()
    - parisc: Fix HPMC handler by increasing size to multiple of 16 bytes
    - Drivers: hv: vmbus: do not mark HV_PCIE as perf_device
    - PCI: hv: Serialize the present and eject work items
    - PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()
    - KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode
    - perf/core: Fix use-after-free in uprobe_perf_close()
    - x86/mce/AMD: Get address from already initialized block
    - hwmon: (ina2xx) Fix access to uninitialized mutex
    - ath9k: Protect queue draining by rcu_read_lock()
    - x86/apic: Fix signedness bug in APIC ID validity checks
    - f2fs: fix heap mode to reset it back
    - block: Change a rcu_read_{lock,unlock}_sched() pair into
      rcu_read_{lock,unlock}()
    - nvme: Skip checking heads without namespaces
    - lib: fix stall in __bitmap_parselist()
    - blk-mq: order getting budget and driver tag
    - blk-mq: don't keep offline CPUs mapped to hctx 0
    - ovl: fix lookup with middle layer opaque dir and absolute path redirects
    - xen: xenbus_dev_frontend: Fix XS_TRANSACTION_END handling
    - hugetlbfs: fix bug in pgoff overflow checking
    - nfsd: fix incorrect umasks
    - scsi: qla2xxx: Fix small memory leak in qla2x00_probe_one on probe failure
    - block/loop: fix deadlock after loop_set_status
    - nfit: fix region registration vs block-data-window ranges
    - s390/qdio: don't retry EQBS after CCQ 96
    - s390/qdio: don't merge ERROR output buffers
    - s390/ipl: ensure loadparm valid flag is set
    - get_user_pages_fast(): return -EFAULT on access_ok failure
    - mm/gup_benchmark: handle gup failures
    - getname_kernel() needs to make sure that ->name != ->iname in long case
    - Bluetooth: Fix connection if directed advertising and privacy is used
    - Bluetooth: hci_bcm: Treat Interrupt ACPI resources as always being
      active-low
    - rtl8187: Fix NULL pointer dereference in priv->conf_mutex
    - ovl: set lower layer st_dev only if setting lower st_ino
    - Linux 4.15.18

  * Kernel bug when unplugging Thunderbolt 3 cable, leaves xHCI host controller
    dead (LP: #1768852)
    - xhci: Fix Kernel oops in xhci dbgtty

  * Incorrect blacklist of bcm2835_wdt (LP: #1766052)
    - [Packaging] Fix missing watchdog for Raspberry Pi

  * CVE-2018-8087
    - mac80211_hwsim: fix possible memory leak in hwsim_new_radio_nl()

  * Integrated Webcam Realtek Integrated_Webcam_HD (0bda:58f4) not working in
    DELL XPS 13 9370 with firmware 1.50 (LP: #1763748)
    - SAUCE: media: uvcvideo: Support realtek's UVC 1.5 device

  * [ALSA] [PATCH] Clevo P950ER ALC1220 Fixup (LP: #1769721)
    - SAUCE: ALSA: hda/realtek - Clevo P950ER ALC1220 Fixup

  * Bionic: Intermittently sent to Emergency Mode on boot with unhandled kernel
    NULL pointer dereference at  0000000000000980 (LP: #1768292)
    - thunderbolt: Prevent crash when ICM firmware is not running

  * linux-snapdragon: reduce EPROBEDEFER noise during boot (LP: #1768761)
    - [Config] snapdragon: DRM_I2C_ADV7511=y

  * regression Aquantia Corp. AQC107 4.15.0-13-generic -> 4.15.0-20-generic ?
    (LP: #1767088)
    - net: aquantia: Regression on reset with 1.x firmware
    - net: aquantia: oops when shutdown on already stopped device

  * e1000e msix interrupts broken in linux-image-4.15.0-15-generic
    (LP: #1764892)
    - e1000e: Remove Other from EIAC

  * Acer Swift sf314-52 power button not managed  (LP: #1766054)
    - SAUCE: platform/x86: acer-wmi: add another KEY_POWER keycode

  * set PINCFG_HEADSET_MIC to parse_flags for Dell precision 3630 (LP: #1766398)
    - ALSA: hda/realtek - set PINCFG_HEADSET_MIC to parse_flags

  * Change the location for one of two front mics on a lenovo thinkcentre
    machine (LP: #1766477)
    - ALSA: hda/realtek - adjust the location of one mic

  * SRU: bionic: apply 50 ZFS upstream bugfixes (LP: #1764690)
    - SAUCE: (noup) Update zfs to 0.7.5-1ubuntu15 (LP: #1764690)

  * [8086:3e92] display becomes blank after S3 (LP: #1763271)
    - drm/i915/edp: Do not do link training fallback or prune modes on EDP

 -- Stefan Bader <stefan.ba...@canonical.com>  Wed, 23 May 2018 18:54:55 +0200

** Changed in: linux (Ubuntu)
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1768898

Title:
  smp_call_function_single/many core hangs with stop4 alone

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  == SRU Justification ==
  IBM reports that this bug occurs with stop4, which results in soft
  lockups/rcu stalls. This is a kernel synchronization issue leading to a
  deadlock.

  This bug was introduced by commit 7bc54b652f13 in v4.8-rc1.  This
  regression is fixed by mainline commit c0f7f5b6c6910.

  == Fix ==
  c0f7f5b6c6910 ("cpufreq: powernv: Fix hardlockup due to synchronous smp_call
  in timer interrupt")

  == Regression Potential ==
  Low. Fixes current regression.  Cc'd to upstream stable, so it has had
  additional upstream review.

  == Test Case ==
  A test kernel was built with this patch and tested by the original bug
  reporter. The bug reporter states the test kernel resolved the bug.


  Recently we discovered that this bug occurs with stop4 alone, which
  results in soft lockups/rcu stalls.

  ```
  root@ltc-boston125:~# [15523.619395] systemd[1]: systemd-journald.service: Processes still around after final SIGKILL. Entering failed mode.
  [15523.619508] systemd[1]: systemd-journald.service: Failed with result 'timeout'.
  [15523.619769] systemd[1]: Failed to start Journal Service.
  [15523.620618] systemd[1]: systemd-journald.service: Service has no hold-off time, scheduling restart.
  [15523.620774] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 21.
  [15523.621462] systemd[1]: Stopped Journal Service.
  [15523.621635] systemd[1]: systemd-journald.service: Found left-over process 1561 (systemd-journal) in control group while starting unit. Ignoring.
  [15523.621756] systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
  [15523.621888] systemd[1]: systemd-journald.service: Found left-over process 69060 (systemd-journal) in control group while starting unit. Ignoring.
  [15523.622029] systemd[1]: This usually indica
  [15541.629904] INFO: rcu_sched self-detected stall on CPU
  [15541.629958]        60-....: (2 GPs behind) idle=146/140000000000002/0 softirq=300022/300022 fqs=999069
  [15541.630046]         (t=2415546 jiffies g=184827 c=184826 q=57111)
  [15541.630101] NMI backtrace for cpu 60
  [15541.630135] CPU: 60 PID: 4810 Comm: tlbie_test Tainted: G             L   4.15.0-15-generic #16-Ubuntu
  [15541.630207] Call Trace:
  [15541.630232] [c000201a1da96b00] [c000000000ceb35c] dump_stack+0xb0/0xf4 (unreliable)
  [15541.630298] [c000201a1da96b40] [c000000000cf4d48] nmi_cpu_backtrace+0x1f8/0x200
  [15541.630363] [c000201a1da96bd0] [c000000000cf4ee8] nmi_trigger_cpumask_backtrace+0x198/0x1f0
  [15541.630429] [c000201a1da96c60] [c00000000002f2d8] arch_trigger_cpumask_backtrace+0x28/0x40
  [15541.630495] [c000201a1da96c80] [c0000000001a913c] rcu_dump_cpu_stacks+0xf4/0x158
  [15541.630560] [c000201a1da96cd0] [c0000000001a81e8] rcu_check_callbacks+0x8e8/0xb40
  [15541.630625] [c000201a1da96e00] [c0000000001b64a8] update_process_times+0x48/0x90
  [15541.630689] [c000201a1da96e30] [c0000000001ce1f4] tick_sched_handle.isra.5+0x34/0xd0
  [15541.630753] [c000201a1da96e60] [c0000000001ce2f0] tick_sched_timer+0x60/0xe0
  [15541.630818] [c000201a1da96ea0] [c0000000001b7054] __hrtimer_run_queues+0x144/0x370
  [15541.630883] [c000201a1da96f20] [c0000000001b7fac] hrtimer_interrupt+0xfc/0x350
  [15541.630948] [c000201a1da96ff0] [c0000000000248f0] __timer_interrupt+0x90/0x260
  [15541.631013] [c000201a1da97040] [c000000000024d08] timer_interrupt+0x98/0xe0
  [15541.631069] [c000201a1da97070] [c000000000009014] decrementer_common+0x114/0x120
  [15541.631135] --- interrupt: 901 at smp_call_function_single+0x134/0x180
  [15541.631135]     LR = smp_call_function_single+0x110/0x180
  [15541.631230] [c000201a1da973d0] [c0000000001d55e0] smp_call_function_any+0x180/0x250
  [15541.631294] [c000201a1da97430] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
  [15541.631359] [c000201a1da974e0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
  [15541.631433] [c000201a1da97560] [c0000000001b4958] expire_timers+0x138/0x1f0
  [15541.631488] [c000201a1da975d0] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
  [15541.631553] [c000201a1da97670] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
  [15541.631608] [c000201a1da97750] [c000000000114be8] irq_exit+0xe8/0x120
  [15541.631663] [c000201a1da97770] [c000000000024d0c] timer_interrupt+0x9c/0xe0
  [15541.631718] [c000201a1da977a0] [c000000000009014] decrementer_common+0x114/0x120
  [15541.631784] --- interrupt: 901 at smp_call_function_many+0x330/0x450
  [15541.631784]     LR = smp_call_function_many+0x324/0x450
  [15541.631879] [c000201a1da97b00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
  [15541.631935] [c000201a1da97b30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
  [15541.632000] [c000201a1da97ba0] [c000000000349278] change_protection_range+0xb88/0xe40
  [15541.632065] [c000201a1da97cf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
  [15541.632129] [c000201a1da97db0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
  [15541.632185] [c000201a1da97e30] [c00000000000b184] system_call+0x58/0x6c
  [15579.001651] watchdog: BUG: soft lockup - CPU#52 stuck for 23s! [grep:69263]
  [15579.001738] Modules linked in: vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink input_leds joydev mac_hid idt_89hpesx ofpart cmdlinepart powernv_flash ipmi_powernv ipmi_devintf opal_prd mtd ipmi_msghandler ibmpowernv at24 uio_pdrv_genirq uio vmx_crypto kvm_hv kvm sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure ast
  [15579.002363]  i2c_algo_bit hid_generic ttm drm_kms_helper mpt3sas syscopyarea sysfillrect usbhid sysimgblt fb_sys_fops hid raid_class crct10dif_vpmsum crc32c_vpmsum drm i40e aacraid scsi_transport_sas
  [15579.002524] CPU: 52 PID: 69263 Comm: grep Tainted: G             L   4.15.0-15-generic #16-Ubuntu
  [15579.002598] NIP:  c0000000001d5368 LR: c0000000001d5340 CTR: c000000000acc7f0
  [15579.002664] REGS: c000003e84eff7e0 TRAP: 0901   Tainted: G             L    (4.15.0-15-generic)
  [15579.002735] MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 48044222  XER: 00000000
  [15579.002810] CFAR: c01721ed8
  [15579.002810] GPR08: c000000001721ed8 0000000000000001 c009e006592e0960 0000000000000000
  [15579.002810] GPR12: c000000000acc7f0 c00000000faa3c00
  [15579.003084] NIP [c0000000001d5368] smp_call_function_single+0x138/0x180
  [15579.003139] LR [c0000000001d5340] smp_call_function_single+0x110/0x180
  [15579.003191] Call Trace:
  [15579.003217] [c000003e84effa60] [c0000000001d5340] smp_call_function_single+0x110/0x180 (unreliable)
  [15579.003298] [c000003e84effad0] [c0000000001d55e0] smp_call_function_any+0x180/0x250
  [15579.003381] [c000003e84effb30] [c000000000acc840] powernv_cpufreq_get+0x50/0x70
  [15579.003447] [c000003e84effb60] [c000000000ac2b8c] __cpufreq_get+0x5c/0x140
  [15579.003503] [c000003e84effba0] [c000000000ac2d18] cpufreq_get+0xa8/0xb0
  [15579.003560] [c000003e84effbe0] [c00000000009da50] pnv_get_proc_freq+0x20/0x50
  [15579.003625] [c000003e84effc00] [c0000000000283bc] show_cpuinfo+0x11c/0x400
  [15579.003680] [c000003e84effca0] [c00000000040c738] seq_read+0x138/0x610
  [15579.003737] [c000003e84effd40] [c00000000047fa38] proc_reg_read+0x88/0xd0
  [15579.003794] [c000003e84effd70] [c0000000003d293c] __vfs_read+0x3c/0x70
  [15579.003849] [c000003e84effd90] [c0000000003d2a2c] vfs_read+0xbc/0x1b0
  [15579.003905] [c000003e84effde0] [c0000000003d3028] SyS_read+0x68/0x110
  [15579.003962] [c000003e84effe30] [c00000000000b184] system_call+0x58/0x6c
  [15579.004016] Instruction dump:
  [15579.004051] 7fe4fb78 4bfffd4d 813f0018 71290001 4182002c 48000014 60000000 60000000
  [15579.004121] 60000000 60420000 7c210b78 7c421378 <813f0018> 71290001 4082fff0 7c2004ac
  [15604.648202] INFO: rcu_sched self-detected stall on CPU
  [15604.648260]        60-....: (2 GPs behind) idle=146/140000000000002/0 softirq=300022/300022 fqs=1005652
  [15604.648332]         (t=2431300 jiffies g=184827 c=184826 q=57308)
  [15604.648385] NMI backtrace for cpu 60
  [15604.648419] CPU: 60 PID: 4810 Comm: tlbie_test Tainted: G             L   4.15.0-15-generic #16-Ubuntu
  [15604.648491] Call Trace:
  [15604.648515] [c000201a1da96b00] [c000000000ceb35c] dump_stack+0xb0/0xf4 (unreliable)
  [15604.648581] [c000201a1da96b40] [c000000000cf4d48] nmi_cpu_backtrace+0x1f8/0x200
  [15604.648647] [c000201a1da96bd0] [c000000000cf4ee8] nmi_trigger_cpumask_backtrace+0x198/0x1f0
  [15604.648728] [c000201a1da96c60] [c00000000002f2d8] arch_trigger_cpumask_backtrace+0x28/0x40
  [15604.648793] [c000201a1da96c80] [c0000000001a913c] rcu_dump_cpu_stacks+0xf4/0x158
  [15604.648858] [c000201a1da96cd0] [c0000000001a81e8] rcu_check_callbacks+0x8e8/0xb40
  [15604.648924] [c000201a1da96e00] [c0000000001b64a8] update_process_times+0x48/0x90
  [15604.648988] [c000201a1da96e30] [c0000000001ce1f4] tick_sched_handle.isra.5+0x34/0xd0
  [15604.649052] [c000201a1da96e60] [c0000000001ce2f0] tick_sched_timer+0x60/0xe0
  [15604.649118] [c000201a1da96ea0] [c0000000001b7054] __hrtimer_run_queues+0x144/0x370
  [15604.649183] [c000201a1da96f20] [c0000000001b7fac] hrtimer_interrupt+0xfc/0x350
  [15604.649248] [c000201a1da96ff0] [c0000000000248f0] __timer_interrupt+0x90/0x260
  [15604.649313] [c000201a1da97040] [c000000000024d08] timer_interrupt+0x98/0xe0
  [15604.649369] [c000201a1da97070] [c000000000009014] decrementer_common+0x114/0x120
  [15604.649435] --- interrupt: 901 at smp_call_function_single+0x138/0x180
  [15604.649435]     LR = smp_call_function_single+0x110/0x180
  [15604.649530] [c000201a1da973d0] [c0000000001d55e0] smp_call_function_any+0x180/0x250
  [15604.649595] [c000201a1da97430] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
  [15604.649660] [c000201a1da974e0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
  [15604.649715] [c000201a1da97560] [c0000000001b4958] expire_timers+0x138/0x1f0
  [15604.649770] [c000201a1da975d0] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
  [15604.649835] [c000201a1da97670] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
  [15604.649891] [c000201a1da97750] [c000000000114be8] irq_exit+0xe8/0x120
  [15604.649946] [c000201a1da97770] [c000000000024d0c] timer_interrupt+0x9c/0xe0
  [15604.650002] [c000201a1da977a0] [c000000000009014] decrementer_common+0x114/0x120
  [15604.650084] --- interrupt: 901 at smp_call_function_many+0x330/0x450
  [15604.650084]     LR = smp_call_function_many+0x324/0x450
  [15604.650179] [c000201a1da97b00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
  [15604.650235] [c000201a1da97b30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
  [15604.650301] [c000201a1da97ba0] [c000000000349278] change_protection_range+0xb88/0xe40
  [15604.650366] [c000201a1da97cf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
  [15604.650430] [c000201a1da97db0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
  [15604.650486] [c000201a1da97e30] [c00000000000b184] system_call+0x58/0x6c
  [15667.666494] INFO: rcu_sched self-detected stall on CPU
  [15667.666550]        60-....: (2 GPs behind) idle=146/140000000000002/0 softirq=300022/300022 fqs=1012258
  [15667.666622]         (t=2447054 jiffies g=184827 c=184826 q=57457)
  [15667.666675] NMI backtrace for cpu 60
  [15667.666709] CPU: 60 PID: 4810 Comm: tlbie_test Tainted: G             L   4.15.0-15-generic #16-Ubuntu
  [15667.666781] Call Trace:
  [15667.666805] [c000201a1da96b00] [c000000000ceb35c] dump_stack+0xb0/0xf4 (unreliable)
  [15667.666871] [c000201a1da96b40] [c000000000cf4d48] nmi_cpu_backtrace+0x1f8/0x200
  [15667.666937] [c000201a1da96bd0] [c000000000cf4ee8] nmi_trigger_cpumask_backtrace+0x198/0x1f0
  [15667.667002] [c000201a1da96c60] [c00000000002f2d8] arch_trigger_cpumask_backtrace+0x28/0x40
  [15667.667086] [c000201a1da96c80] [c0000000001a913c] rcu_dump_cpu_stacks+0xf4/0x158
  [15667.667151] [c000201a1da96cd0] [c0000000001a81e8] rcu_check_callbacks+0x8e8/0xb40
  [15667.667216] [c000201a1da96e00] [c0000000001b64a8] update_process_times+0x48/0x90
  [15667.667280] [c000201a1da96e30] [c0000000001ce1f4] tick_sched_handle.isra.5+0x34/0xd0
  [15667.667344] [c000201a1da96e60] [c0000000001ce2f0] tick_sched_timer+0x60/0xe0
  [15667.667409] [c000201a1da96ea0] [c0000000001b7054] __hrtimer_run_queues+0x144/0x370
  [15667.667474] [c000201a1da96f20] [c0000000001b7fac] hrtimer_interrupt+0xfc/0x350
  [15667.667539] [c000201a1da96ff0] [c0000000000248f0] __timer_interrupt+0x90/0x260
  [15667.667604] [c000201a1da97040] [c000000000024d08] timer_interrupt+0x98/0xe0
  [15667.667660] [c000201a1da97070] [c000000000009014] decrementer_common+0x114/0x120
  [15667.667727] --- interrupt: 901 at smp_call_function_single+0x130/0x180
  [15667.667727]     LR = smp_call_function_single+0x110/0x180
  [15667.667821] [c000201a1da973d0] [c0000000001d55e0] smp_call_function_any+0x180/0x250
  [15667.667886] [c000201a1da97430] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
  [15667.667951] [c000201a1da974e0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
  [15667.668006] [c000201a1da97560] [c0000000001b4958] expire_timers+0x138/0x1f0
  [15667.668061] [c000201a1da975d0] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
  [15667.668126] [c000201a1da97670] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
  [15667.668181] [c000201a1da97750] [c000000000114be8] irq_exit+0xe8/0x120
  [15667.668236] [c000201a1da97770] [c000000000024d0c] timer_interrupt+0x9c/0xe0
  [15667.668292] [c000201a1da977a0] [c000000000009014] decrementer_common+0x114/0x120
  [15667.668358] --- interrupt: 901 at smp_call_function_many+0x330/0x450
  [15667.668358]     LR = smp_call_function_many+0x324/0x450
  [15667.668469] [c000201a1da97b00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
  [15667.668524] [c000201a1da97b30] [c0000000003a1120] change_huge_pmd+0xe0/0x270
  [15667.668589] [c000201a1da97ba0] [c000000000349278] change_protection_range+0xb88/0xe40
  [15667.668654] [c000201a1da97cf0] [c0000000003496c0] mprotect_fixup+0x140/0x340
  [15667.668719] [c000201a1da97db0] [c000000000349a74] SyS_mprotect+0x1b4/0x350
  [15667.668775] [c000201a1da97e30] [c00000000000b184] system_call+0x58/0x6c

  ```

  Per feedback from Vaidy, this currently does NOT appear to be a firmware
  problem. It seems to be a kernel synchronization issue leading to a
  deadlock.
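
  To make the nesting in the CPU 60 backtrace easier to follow, here is a
  minimal sketch (not part of the original report) that pulls the symbol
  names out of a few trace lines copied from the log in this report. The
  regular expressions and the helper name symbols() are illustrative
  assumptions, not an existing tool. The output should show that
  smp_call_function_single is reached from gpstate_timer_handler in timer
  softirq context, on a CPU that was already inside smp_call_function_many
  when the timer interrupt arrived.

```python
import re

# Lines copied from the CPU 60 backtrace in this report, with mail wrapping undone.
TRACE = """\
[15541.631135] --- interrupt: 901 at smp_call_function_single+0x134/0x180
[15541.631230] [c000201a1da973d0] [c0000000001d55e0] smp_call_function_any+0x180/0x250
[15541.631294] [c000201a1da97430] [c000000000acd3e8] gpstate_timer_handler+0x1e8/0x580
[15541.631359] [c000201a1da974e0] [c0000000001b46b0] call_timer_fn+0x50/0x1c0
[15541.631433] [c000201a1da97560] [c0000000001b4958] expire_timers+0x138/0x1f0
[15541.631488] [c000201a1da975d0] [c0000000001b4bf8] run_timer_softirq+0x1e8/0x270
[15541.631553] [c000201a1da97670] [c000000000d0d6c8] __do_softirq+0x158/0x3e4
[15541.631608] [c000201a1da97750] [c000000000114be8] irq_exit+0xe8/0x120
[15541.631663] [c000201a1da97770] [c000000000024d0c] timer_interrupt+0x9c/0xe0
[15541.631718] [c000201a1da977a0] [c000000000009014] decrementer_common+0x114/0x120
[15541.631784] --- interrupt: 901 at smp_call_function_many+0x330/0x450
[15541.631879] [c000201a1da97b00] [c000000000075f18] pmdp_invalidate+0x98/0xe0
"""

# Hypothetical patterns for the two line shapes seen in the trace above.
FRAME = re.compile(r"\[\s*[\d.]+\]\s+\[[0-9a-f]+\]\s+\[[0-9a-f]+\]\s+([\w.]+)\+0x")
IRQ = re.compile(r"--- interrupt: \w+ at ([\w.]+)\+0x")


def symbols(trace):
    """Return (kind, symbol) pairs in log order, innermost frame first."""
    out = []
    for line in trace.splitlines():
        m = IRQ.search(line)
        if m:
            # Code that was running when the interrupt was taken.
            out.append(("interrupted", m.group(1)))
            continue
        m = FRAME.search(line)
        if m:
            out.append(("frame", m.group(1)))
    return out


chain = symbols(TRACE)
for kind, sym in chain:
    print(f"{kind:12s} {sym}")
```

  Reading the printed list bottom to top gives the order of events on that
  CPU: pmdp_invalidate, an interrupt taken in smp_call_function_many, the
  timer softirq, gpstate_timer_handler, and finally a second wait in
  smp_call_function_single.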

  -------
  Fix identified by Shilpa as per Nick Piggin's recommendation.  Kernel fix is
  currently being tested.

  -------
  Fix upstream in 4.17-rc3

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v4.17-rc3&id=c0f7f5b6c69107ca92909512533e70258ee19188
  cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer
  interrupt

  Posted to stable as well.

  Mirroring to Launchpad for Canonical to pull in commit.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1768898/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp