This bug was fixed in the package linux - 4.15.0-23.25

---------------
linux (4.15.0-23.25) bionic; urgency=medium

  * linux: 4.15.0-23.25 -proposed tracker (LP: #1772927)

  * arm64 SDEI support needs trampoline code for KPTI (LP: #1768630)
    - arm64: mmu: add the entry trampolines start/end section markers into
      sections.h
    - arm64: sdei: Add trampoline code for remapping the kernel

  * Some PCIe errors not surfaced through rasdaemon (LP: #1769730)
    - ACPI: APEI: handle PCIe AER errors in separate function
    - ACPI: APEI: call into AER handling regardless of severity

  * qla2xxx: Fix page fault at kmem_cache_alloc_node() (LP: #1770003)
    - scsi: qla2xxx: Fix session cleanup for N2N
    - scsi: qla2xxx: Remove unused argument from
      qlt_schedule_sess_for_deletion()
    - scsi: qla2xxx: Serialize session deletion by using work_lock
    - scsi: qla2xxx: Serialize session free in qlt_free_session_done
    - scsi: qla2xxx: Don't call dma_free_coherent with IRQ disabled.
    - scsi: qla2xxx: Fix warning in qla2x00_async_iocb_timeout()
    - scsi: qla2xxx: Prevent relogin trigger from sending too many commands
    - scsi: qla2xxx: Fix double free bug after firmware timeout
    - scsi: qla2xxx: Fixup locking for session deletion

  * Several hisi_sas bug fixes (LP: #1768974)
    - scsi: hisi_sas: dt-bindings: add an property of signal attenuation
    - scsi: hisi_sas: support the property of signal attenuation for v2 hw
    - scsi: hisi_sas: fix the issue of link rate inconsistency
    - scsi: hisi_sas: fix the issue of setting linkrate register
    - scsi: hisi_sas: increase timer expire of internal abort task
    - scsi: hisi_sas: remove unused variable hisi_sas_devices.running_req
    - scsi: hisi_sas: fix return value of hisi_sas_task_prep()
    - scsi: hisi_sas: Code cleanup and minor bug fixes

  * [bionic] machine stuck and bonding not working well when nvmet_rdma module
    is loaded (LP: #1764982)
    - nvmet-rdma: Don't flush system_wq by default during remove_one
    - nvme-rdma: Don't flush delete_wq by default during remove_one

  * Warnings/hang during error handling of SATA disks on SAS controller
    (LP: #1768971)
    - scsi: libsas: defer ata device eh commands to libata

  * Hotplugging a SATA disk into a SAS controller may cause crash (LP: #1768948)
    - ata: do not schedule hot plug if it is a sas host

  * ISST-LTE:pKVM:Ubuntu1804: rcu_sched self-detected stall on CPU follow by CPU
    ATTEMPT TO RE-ENTER FIRMWARE! (LP: #1767927)
    - powerpc/powernv: Handle unknown OPAL errors in opal_nvram_write()
    - powerpc/64s: return more carefully from sreset NMI
    - powerpc/64s: sreset panic if there is no debugger or crash dump handlers

  * fsnotify: Fix fsnotify_mark_connector race (LP: #1765564)
    - fsnotify: Fix fsnotify_mark_connector race

  * Hang on network interface removal in Xen virtual machine (LP: #1771620)
    - xen-netfront: Fix hang on device removal

  * HiSilicon HNS NIC names are truncated in /proc/interrupts (LP: #1765977)
    - net: hns: Avoid action name truncation

  * Ubuntu 18.04 kernel crashed while in degraded mode (LP: #1770849)
    - SAUCE: powerpc/perf: Fix memory allocation for core-imc based on
      num_possible_cpus()

  * Switch Build-Depends: transfig to fig2dev (LP: #1770770)
    - [Config] update Build-Depends: transfig to fig2dev

  * smp_call_function_single/many core hangs with stop4 alone (LP: #1768898)
    - cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer
      interrupt

  * Add d-i support for Huawei NICs (LP: #1767490)
    - d-i: add hinic to nic-modules udeb

  * unregister_netdevice: waiting for eth0 to become free. Usage count = 5
    (LP: #1746474)
    - xfrm: reuse uncached_list to track xdsts

  * Include nfp driver in linux-modules (LP: #1768526)
    - [Config] Add nfp.ko to generic inclusion list

  * Kernel panic on boot (m1.small in cn-north-1) (LP: #1771679)
    - x86/xen: Reset VCPU0 info pointer after shared_info remap

  * CVE-2018-3639 (x86)
    - x86/bugs: Fix the parameters alignment and missing void
    - KVM: SVM: Move spec control call after restore of GS
    - x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
    - x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
    - x86/cpufeatures: Disentangle SSBD enumeration
    - x86/cpufeatures: Add FEATURE_ZEN
    - x86/speculation: Handle HT correctly on AMD
    - x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
    - x86/speculation: Add virtualized speculative store bypass disable support
    - x86/speculation: Rework speculative_store_bypass_update()
    - x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}
    - x86/bugs: Expose x86_spec_ctrl_base directly
    - x86/bugs: Remove x86_spec_ctrl_set()
    - x86/bugs: Rework spec_ctrl base and mask logic
    - x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
    - KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
    - x86/bugs: Rename SSBD_NO to SSB_NO
    - bpf: Prevent memory disambiguation attack
    - KVM: VMX: Expose SSBD properly to guests.

  * Suspend to idle: Open lid didn't resume (LP: #1771542)
    - ACPI / PM: Do not reconfigure GPEs for suspend-to-idle

  * Fix initialization failure detection in SDEI for device-tree based systems
    (LP: #1768663)
    - firmware: arm_sdei: Fix return value check in sdei_present_dt()

  * No driver for Huawei network adapters on arm64 (LP: #1769899)
    - net-next/hinic: add arm64 support

  * CVE-2018-1092
    - ext4: fail ext4_iget for root directory if unallocated

  * kernel 4.15 breaks nouveau on Lenovo P50 (LP: #1763189)
    - drm/nouveau: Fix deadlock in nv50_mstm_register_connector()

  * update-initramfs not adding i915 GuC firmware for Kaby Lake, firmware fails
    to load (LP: #1728238)
    - Revert "UBUNTU: SAUCE: (no-up) i915: Remove MODULE_FIRMWARE statements for
      unreleased firmware"

  * Battery drains when laptop is off  (shutdown) (LP: #1745646)
    - PCI / PM: Check device_may_wakeup() in pci_enable_wake()

  * Dell Latitude 5490/5590 BIOS update 1.1.9 causes black screen at boot
    (LP: #1764194)
    - drm/i915/bios: filter out invalid DDC pins from VBT child devices

  * Intel 9462 A370:42A4 doesn't work (LP: #1748853)
    - iwlwifi: add shared clock PHY config flag for some devices
    - iwlwifi: add a bunch of new 9000 PCI IDs

  * Fix an issue that some PCI devices get incorrectly suspended (LP: #1764684)
    - PCI / PM: Always check PME wakeup capability for runtime wakeup support

  * [SRU][Bionic/Artful] fix false positives in W+X checking (LP: #1769696)
    - init: fix false positives in W+X checking

  * Bionic update to v4.15.18 stable release (LP: #1769723)
    - netfilter: ipset: Missing nfnl_lock()/nfnl_unlock() is added to
      ip_set_net_exit()
    - cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN
    - rds: MP-RDS may use an invalid c_path
    - slip: Check if rstate is initialized before uncompressing
    - vhost: fix vhost_vq_access_ok() log check
    - l2tp: fix races in tunnel creation
    - l2tp: fix race in duplicate tunnel detection
    - ip_gre: clear feature flags when incompatible o_flags are set
    - vhost: Fix vhost_copy_to_user()
    - lan78xx: Correctly indicate invalid OTP
    - media: v4l2-compat-ioctl32: don't oops on overlay
    - media: v4l: vsp1: Fix header display list status check in continuous mode
    - ipmi: Fix some error cleanup issues
    - parisc: Fix out of array access in match_pci_device()
    - parisc: Fix HPMC handler by increasing size to multiple of 16 bytes
    - Drivers: hv: vmbus: do not mark HV_PCIE as perf_device
    - PCI: hv: Serialize the present and eject work items
    - PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()
    - KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode
    - perf/core: Fix use-after-free in uprobe_perf_close()
    - x86/mce/AMD: Get address from already initialized block
    - hwmon: (ina2xx) Fix access to uninitialized mutex
    - ath9k: Protect queue draining by rcu_read_lock()
    - x86/apic: Fix signedness bug in APIC ID validity checks
    - f2fs: fix heap mode to reset it back
    - block: Change a rcu_read_{lock,unlock}_sched() pair into
      rcu_read_{lock,unlock}()
    - nvme: Skip checking heads without namespaces
    - lib: fix stall in __bitmap_parselist()
    - blk-mq: order getting budget and driver tag
    - blk-mq: don't keep offline CPUs mapped to hctx 0
    - ovl: fix lookup with middle layer opaque dir and absolute path redirects
    - xen: xenbus_dev_frontend: Fix XS_TRANSACTION_END handling
    - hugetlbfs: fix bug in pgoff overflow checking
    - nfsd: fix incorrect umasks
    - scsi: qla2xxx: Fix small memory leak in qla2x00_probe_one on probe failure
    - block/loop: fix deadlock after loop_set_status
    - nfit: fix region registration vs block-data-window ranges
    - s390/qdio: don't retry EQBS after CCQ 96
    - s390/qdio: don't merge ERROR output buffers
    - s390/ipl: ensure loadparm valid flag is set
    - get_user_pages_fast(): return -EFAULT on access_ok failure
    - mm/gup_benchmark: handle gup failures
    - getname_kernel() needs to make sure that ->name != ->iname in long case
    - Bluetooth: Fix connection if directed advertising and privacy is used
    - Bluetooth: hci_bcm: Treat Interrupt ACPI resources as always being
      active-low
    - rtl8187: Fix NULL pointer dereference in priv->conf_mutex
    - ovl: set lower layer st_dev only if setting lower st_ino
    - Linux 4.15.18

  * Kernel bug when unplugging Thunderbolt 3 cable, leaves xHCI host controller
    dead (LP: #1768852)
    - xhci: Fix Kernel oops in xhci dbgtty

  * Incorrect blacklist of bcm2835_wdt (LP: #1766052)
    - [Packaging] Fix missing watchdog for Raspberry Pi

  * CVE-2018-8087
    - mac80211_hwsim: fix possible memory leak in hwsim_new_radio_nl()

  * Integrated Webcam Realtek Integrated_Webcam_HD (0bda:58f4) not working in
    DELL XPS 13 9370 with firmware 1.50 (LP: #1763748)
    - SAUCE: media: uvcvideo: Support realtek's UVC 1.5 device

  * [ALSA] [PATCH] Clevo P950ER ALC1220 Fixup (LP: #1769721)
    - SAUCE: ALSA: hda/realtek - Clevo P950ER ALC1220 Fixup

  * Bionic: Intermittently sent to Emergency Mode on boot with unhandled kernel
    NULL pointer dereference at  0000000000000980 (LP: #1768292)
    - thunderbolt: Prevent crash when ICM firmware is not running

  * linux-snapdragon: reduce EPROBEDEFER noise during boot (LP: #1768761)
    - [Config] snapdragon: DRM_I2C_ADV7511=y

  * regression Aquantia Corp. AQC107 4.15.0-13-generic -> 4.15.0-20-generic ?
    (LP: #1767088)
    - net: aquantia: Regression on reset with 1.x firmware
    - net: aquantia: oops when shutdown on already stopped device

  * e1000e msix interrupts broken in linux-image-4.15.0-15-generic
    (LP: #1764892)
    - e1000e: Remove Other from EIAC

  * Acer Swift sf314-52 power button not managed  (LP: #1766054)
    - SAUCE: platform/x86: acer-wmi: add another KEY_POWER keycode

  * set PINCFG_HEADSET_MIC to parse_flags for Dell precision 3630 (LP: #1766398)
    - ALSA: hda/realtek - set PINCFG_HEADSET_MIC to parse_flags

  * Change the location for one of two front mics on a lenovo thinkcentre
    machine (LP: #1766477)
    - ALSA: hda/realtek - adjust the location of one mic

  * SRU: bionic: apply 50 ZFS upstream bugfixes (LP: #1764690)
    - SAUCE: (noup) Update zfs to 0.7.5-1ubuntu15 (LP: #1764690)

  * [8086:3e92] display becomes blank after S3 (LP: #1763271)
    - drm/i915/edp: Do not do link training fallback or prune modes on EDP

 -- Stefan Bader <stefan.ba...@canonical.com>  Wed, 23 May 2018 18:54:55 +0200

** Changed in: linux (Ubuntu)
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1764982

Title:
  [bionic] machine stuck and bonding not working well when nvmet_rdma
  module is loaded

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:

  == SRU Justification ==
  This bug causes the machine to get stuck and bonding to not work when
  the nvmet_rdma module is loaded.

  Both of these commits are in mainline as of v4.17-rc1.

  == Fixes ==
  a3dd7d0022c3 ("nvmet-rdma: Don't flush system_wq by default during remove_one")
  9bad0404ecd7 ("nvme-rdma: Don't flush delete_wq by default during remove_one")

  == Regression Potential ==
  Low.  Limited to nvme driver and tested by Mellanox.

  == Test Case ==
  A test kernel was built with these patches and tested by the original bug
  reporter.
  The bug reporter states the test kernel resolved the bug.


  
  == Original Bug Description ==

  Hi
  The machine gets stuck after unregistering the bonding interface while the
  nvmet_rdma module is loaded.

  scenario:

   # modprobe nvmet_rdma
   # modprobe -r bonding
   # modprobe bonding  -v mode=1 miimon=100 fail_over_mac=0
   # ifdown eth4
   # ifdown eth5
   # ip addr add 15.209.12.173/8 dev bond0
   # ip link set bond0 up
   # echo +eth5 > /sys/class/net/bond0/bonding/slaves
   # echo +eth4 > /sys/class/net/bond0/bonding/slaves
   # echo -eth4 > /sys/class/net/bond0/bonding/slaves
   # echo -eth5 > /sys/class/net/bond0/bonding/slaves
   # echo -bond0 > /sys/class/net/bonding_masters

  dmesg:

  kernel: [78348.225556] bond0 (unregistering): Released all slaves
  kernel: [78358.339631] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78368.419621] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78378.499615] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78388.579625] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78398.659613] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78408.739655] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78418.819634] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78428.899642] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78438.979614] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78449.059619] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78459.139626] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78469.219623] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78479.299619] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78489.379620] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78499.459623] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78509.539631] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
  kernel: [78519.619629] unregister_netdevice: waiting for bond0 to become free. Usage count = 2
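
When triaging a log like the one above, it can help to extract the device name and the usage count from each matching line. The following is a minimal userspace C sketch, not part of the kernel fix. The function name parse_unregister_wait and the IFNAME_MAX constant are hypothetical, and the message format is assumed to be exactly the one shown in the dmesg output of this report.

```c
#include <stdio.h>
#include <string.h>

#define IFNAME_MAX 16 /* assumption: same size as IFNAMSIZ on Linux */

/*
 * Parse one log line. Returns 1 and fills dev/count when the line contains
 * the "unregister_netdevice: waiting for ..." message, otherwise returns 0
 * and leaves dev/count untouched.
 */
static int parse_unregister_wait(const char *line, char *dev, size_t dev_len,
                                 int *count)
{
    static const char marker[] = "unregister_netdevice: waiting for ";
    char name[IFNAME_MAX];
    int usage;
    const char *start = strstr(line, marker);

    if (start == NULL)
        return 0;

    /* %15s leaves room for the terminating NUL in name[IFNAME_MAX]. */
    if (sscanf(start + strlen(marker),
               "%15s to become free. Usage count = %d", name, &usage) != 2)
        return 0;

    snprintf(dev, dev_len, "%s", name);
    *count = usage;
    return 1;
}
```

A caller would feed each line of the log to parse_unregister_wait() and report any device whose usage count stays above zero across successive messages.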

  The following upstream commits fix this issue:

  commit a3dd7d0022c347207ae931c753a6dc3e6e8fcbc1
  Author: Max Gurtovoy <m...@mellanox.com>
  Date:   Wed Feb 28 13:12:38 2018 +0200

      nvmet-rdma: Don't flush system_wq by default during remove_one

      The .remove_one function is called for any ib_device removal.
      In case the removed device has no reference in our driver, there
      is no need to flush the system work queue.

      Reviewed-by: Israel Rukshin <isra...@mellanox.com>
      Signed-off-by: Max Gurtovoy <m...@mellanox.com>
      Reviewed-by: Sagi Grimberg <s...@grimberg.me>
      Signed-off-by: Keith Busch <keith.bu...@intel.com>
      Signed-off-by: Jens Axboe <ax...@kernel.dk>

  diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
  index aa8068f..a59263d 100644
  --- a/drivers/nvme/target/rdma.c
  +++ b/drivers/nvme/target/rdma.c
  @@ -1469,8 +1469,25 @@ static struct nvmet_fabrics_ops nvmet_rdma_ops = {
   static void nvmet_rdma_remove_one(struct ib_device *ib_device, void *client_data)
   {
          struct nvmet_rdma_queue *queue, *tmp;
  +       struct nvmet_rdma_device *ndev;
  +       bool found = false;
  +
  +       mutex_lock(&device_list_mutex);
  +       list_for_each_entry(ndev, &device_list, entry) {
  +               if (ndev->device == ib_device) {
  +                       found = true;
  +                       break;
  +               }
  +       }
  +       mutex_unlock(&device_list_mutex);
  +
  +       if (!found)
  +               return;

  -       /* Device is being removed, delete all queues using this device */
  +       /*
  +        * IB Device that is used by nvmet controllers is being removed,
  +        * delete all queues using this device.
  +        */
          mutex_lock(&nvmet_rdma_queue_mutex);
          list_for_each_entry_safe(queue, tmp, &nvmet_rdma_queue_list,
                                   queue_list) {

  commit 9bad0404ecd7594265cef04e176adeaa4ffbca4a
  Author: Max Gurtovoy <m...@mellanox.com>
  Date:   Wed Feb 28 13:12:39 2018 +0200

      nvme-rdma: Don't flush delete_wq by default during remove_one

      The .remove_one function is called for any ib_device removal.
      In case the removed device has no reference in our driver, there
      is no need to flush the work queue.

      Reviewed-by: Israel Rukshin <isra...@mellanox.com>
      Signed-off-by: Max Gurtovoy <m...@mellanox.com>
      Reviewed-by: Sagi Grimberg <s...@grimberg.me>
      Signed-off-by: Keith Busch <keith.bu...@intel.com>
      Signed-off-by: Jens Axboe <ax...@kernel.dk>

  diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
  index f5f460b..250b277 100644
  --- a/drivers/nvme/host/rdma.c
  +++ b/drivers/nvme/host/rdma.c
  @@ -2024,6 +2024,20 @@ static struct nvmf_transport_ops nvme_rdma_transport = {
   static void nvme_rdma_remove_one(struct ib_device *ib_device, void *client_data)
   {
          struct nvme_rdma_ctrl *ctrl;
  +       struct nvme_rdma_device *ndev;
  +       bool found = false;
  +
  +       mutex_lock(&device_list_mutex);
  +       list_for_each_entry(ndev, &device_list, entry) {
  +               if (ndev->dev == ib_device) {
  +                       found = true;
  +                       break;
  +               }
  +       }
  +       mutex_unlock(&device_list_mutex);
  +
  +       if (!found)
  +               return;

          /* Delete all controllers using this device */
          mutex_lock(&nvme_rdma_ctrl_mutex);
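
Both patches add the same guard: look the removed ib_device up in the driver's own device list under device_list_mutex, and return early if the driver never referenced it, so the teardown and work-queue flush are skipped for unrelated devices. The following userspace C sketch illustrates only the shape of that guard. It is not kernel code: struct tracked_device, track_device(), device_is_tracked(), flush_count, the simplified remove_one() signature, and the pthread mutex are hypothetical stand-ins for the kernel list, mutex, and flush.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct ib_device. */
struct ib_device { int id; };

/* Hypothetical stand-in for an entry in the driver's device_list. */
struct tracked_device {
    struct ib_device *device;
    struct tracked_device *next;
};

static pthread_mutex_t device_list_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct tracked_device *device_list; /* NULL-terminated list */
static int flush_count;                    /* counts simulated flushes */

/* Record that the driver now references dev. */
static void track_device(struct tracked_device *entry, struct ib_device *dev)
{
    pthread_mutex_lock(&device_list_mutex);
    entry->device = dev;
    entry->next = device_list;
    device_list = entry;
    pthread_mutex_unlock(&device_list_mutex);
}

/* Return true if the driver holds a reference to ib_device. */
static bool device_is_tracked(const struct ib_device *ib_device)
{
    bool found = false;

    pthread_mutex_lock(&device_list_mutex);
    for (struct tracked_device *d = device_list; d != NULL; d = d->next) {
        if (d->device == ib_device) {
            found = true;
            break;
        }
    }
    pthread_mutex_unlock(&device_list_mutex);
    return found;
}

/* Same shape as the patched remove_one(): bail out for unknown devices. */
static void remove_one(struct ib_device *ib_device)
{
    if (!device_is_tracked(ib_device))
        return;

    /* Stand-in for the teardown and work-queue flush in the real driver. */
    flush_count++;
}
```

In this sketch, calling remove_one() on a device that was never passed to track_device() leaves flush_count unchanged, which corresponds to the removal of an unrelated device (such as one behind the bonding interface in this report) no longer triggering a flush.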

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1764982/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
