[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
This bug was fixed in the package linux - 5.4.0-42.46

---------------
linux (5.4.0-42.46) focal; urgency=medium

  * focal/linux: 5.4.0-42.46 -proposed tracker (LP: #1887069)

  * linux 4.15.0-109-generic network DoS regression vs -108 (LP: #1886668)
    - SAUCE: Revert "netprio_cgroup: Fix unlimited memory leak of v2 cgroups"

linux (5.4.0-41.45) focal; urgency=medium

  * focal/linux: 5.4.0-41.45 -proposed tracker (LP: #1885855)

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * CVE-2019-19642
    - kernel/relay.c: handle alloc_percpu returning NULL in relay_open

  * CVE-2019-16089
    - SAUCE: nbd_genl_status: null check for nla_nest_start

  * CVE-2020-11935
    - aufs: do not call i_readcount_inc()

  * ip_defrag.sh in net from ubuntu_kernel_selftests failed with 5.0 / 5.3 /
    5.4 kernel (LP: #1826848)
    - selftests: net: ip_defrag: ignore EPERM

  * Update lockdown patches (LP: #1884159)
    - SAUCE: acpi: disallow loading configfs acpi tables when locked down

  * seccomp_bpf fails on powerpc (LP: #1885757)
    - SAUCE: selftests/seccomp: fix ptrace tests on powerpc

  * Introduce the new NVIDIA 418-server and 440-server series, and update the
    current NVIDIA drivers (LP: #1881137)
    - [packaging] add signed modules for the 418-server and the 440-server
      flavours

 -- Khalid Elmously  Thu, 09 Jul 2020 19:50:26 -0400

** Changed in: linux (Ubuntu)
       Status: In Progress => Fix Released

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2019-16089

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2019-19642

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2020-11935

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1871688

Title:
  LIO hanging in iscsit_free_session and iscsit_stop_session

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux source package in Focal:
  Fix Released

Bug description:
  The target subsystem (LIO) can hang if multiple threads try to destroy
  iSCSI sessions simultaneously. This is reproducible on systems that
  have multiple targets with initiators regularly
  connecting/disconnecting.

  This may happen when a "targetcli iscsi/iqn.../tpg1 disable" command
  is executed while a logout operation is underway.

  The iSCSI target does not handle such events correctly: two or more
  threads may end up sleeping while waiting for the driver to close the
  remaining connections on the session. When the connections are closed,
  the driver wakes up only the first thread, which then proceeds to
  destroy the session structure.

  The remaining threads are blocked there forever, waiting on a
  completion synchronization mechanism that no longer exists in memory
  because it has been freed by the first thread.

  Note that if the blocked threads are somehow forced to wake up, they
  will try to free the same iSCSI session structure destroyed by the
  first thread, causing double frees, memory corruption, etc.

  The driver has been reorganized so that the concurrent threads set a
  flag in the session structure to notify the driver that the session
  should be destroyed; then, they wait for the driver to close the
  remaining connections. When the connections are all closed, the driver
  wakes up all the threads and waits for the refcount of the iSCSI
  session structure to reach zero. When the last thread wakes up, the
  refcount is decreased to zero and the driver can proceed to destroy
  the session structure because no one is referencing it anymore.

  I've witnessed this happening on hundreds of Ubuntu 16.04.5 systems.
  It is a regression, because this did not occur several years ago.
  Unfortunately, I don't have detailed records from that far back to
  determine exactly which kernel I was running that was not affected by
  this bug (I believe it was either 4.8.x or 4.10.x).

  I've attached the requested uname, version_signature, dmesg, and lspci
  from my system. However, I've seen this happen on a wide array of
  hardware: 2 to 24 cores, 8GB to 256GB RAM, both AMD and Intel CPUs,
  onboard storage and PCIe SAS cards, etc.

  This has been fixed in the upstream master branch, but it hasn't yet
  been backported to "-stable". To fix this in the Ubuntu kernel, these
  three commits should be backported:

  * https://github.com/torvalds/linux/commit/e49a7d994379278d3353d7ffc7994672752fb0ad
  * https://github.com/torvalds/linux/commit/57c46e9f33da530a2485fa01aa27b6d18c28c796
  * https://github.com/torvalds/linux/commit/626bac73371eed79e2afa2966de393da96cf925e

  I'd like these commits to be added to the xenial, bionic, and focal
  kernels.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1871688/+subscriptions

--
Mailing list: https://launchpad
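The reorganized teardown scheme from the bug description (set a flag, wait for the connections to close, wake every waiter, free only when the refcount reaches zero) can be illustrated with a small user-space model. This is a minimal sketch in Python and not the kernel code: the class Session, the functions request_destroy, driver, and run, and all field names are hypothetical stand-ins chosen for illustration, and Python's threading.Condition takes the place of the kernel's completion mechanism. It also assumes that each destroying thread holds exactly one reference to the session.

```python
import threading


class Session:
    """Toy stand-in for an iSCSI session (hypothetical, not the kernel struct)."""

    def __init__(self, holders):
        self.cond = threading.Condition()
        self.close_requested = False     # flag set by threads that want teardown
        self.connections_closed = False  # set by the driver once all are closed
        self.refcount = holders          # assumption: one reference per thread
        self.times_freed = 0             # should end at exactly 1


def request_destroy(sess):
    """A thread asking for the session to be destroyed."""
    with sess.cond:
        sess.close_requested = True
        sess.cond.notify_all()   # tell the driver that teardown was requested
        # Wait on a predicate, so a wake-up that happened earlier is not lost.
        while not sess.connections_closed:
            sess.cond.wait()
        sess.refcount -= 1       # drop this thread's reference
        sess.cond.notify_all()   # let the driver re-check the refcount


def driver(sess):
    """Driver side: close connections, wake every waiter, free once."""
    with sess.cond:
        while not sess.close_requested:
            sess.cond.wait()
        sess.connections_closed = True
        sess.cond.notify_all()   # wake all waiters, not only the first
        while sess.refcount > 0:
            sess.cond.wait()
        sess.times_freed += 1    # only the driver frees, after the last put


def run(n_threads=3):
    """Start n_threads destroyers plus one driver; report whether any got stuck."""
    sess = Session(n_threads)
    threads = [threading.Thread(target=request_destroy, args=(sess,), daemon=True)
               for _ in range(n_threads)]
    threads.append(threading.Thread(target=driver, args=(sess,), daemon=True))
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    stuck = any(t.is_alive() for t in threads)
    return sess, stuck
```

In this model, run(4) is expected to finish with no thread left blocked, and the session is freed exactly once, by the driver, after the last waiter has dropped its reference. The waiters never free the session themselves, which is what rules out the double free mentioned in the description.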
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
This bug was fixed in the package linux - 5.4.0-31.35

---------------
linux (5.4.0-31.35) focal; urgency=medium

  * focal/linux: 5.4.0-31.35 -proposed tracker (LP: #1877253)

  * Intermittent display blackouts on event (LP: #1875254)
    - drm/i915: Limit audio CDCLK>=2*BCLK constraint back to GLK only

  * Unable to handle kernel pointer dereference in virtual kernel address space
    on Eoan (LP: #1876645)
    - SAUCE: overlayfs: fix shitfs special-casing

linux (5.4.0-30.34) focal; urgency=medium

  * focal/linux: 5.4.0-30.34 -proposed tracker (LP: #1875385)

  * ubuntu/focal64 fails to mount Vagrant shared folders (LP: #1873506)
    - [Packaging] Move virtualbox modules to linux-modules
    - [Packaging] Remove vbox and zfs modules from generic.inclusion-list

  * linux-image-5.0.0-35-generic breaks checkpointing of container
    (LP: #1857257)
    - SAUCE: overlayfs: use shiftfs hacks only with shiftfs as underlay

  * shiftfs: broken shiftfs nesting (LP: #1872094)
    - SAUCE: shiftfs: record correct creator credentials

  * Add debian/rules targets to compile/run kernel selftests (LP: #1874286)
    - [Packaging] add support to compile/run selftests

  * shiftfs: O_TMPFILE reports ESTALE (LP: #1872757)
    - SAUCE: shiftfs: fix dentry revalidation

  * LIO hanging in iscsit_free_session and iscsit_stop_session (LP: #1871688)
    - scsi: target: iscsi: calling iscsit_stop_session() inside
      iscsit_close_session() has no effect

  * [ICL] TC port in legacy/static mode can't be detected due TCCOLD
    (LP: #1868936)
    - SAUCE: drm/i915: Align power domain names with port names
    - SAUCE: drm/i915/display: Move out code to return the digital_port of the
      aux ch
    - SAUCE: drm/i915/display: Add intel_legacy_aux_to_power_domain()
    - SAUCE: drm/i915/display: Split hsw_power_well_enable() into two
    - SAUCE: drm/i915/tc/icl: Implement TC cold sequences
    - SAUCE: drm/i915/tc: Skip ref held check for TC legacy aux power wells
    - SAUCE: drm/i915/tc/tgl: Implement TC cold sequences
    - SAUCE: drm/i915/tc: Catch TC users accessing FIA registers without enable
      aux
    - SAUCE: drm/i915/tc: Do not warn when aux power well of static TC ports
      timeout

  * alsa/sof: external mic can't be deteced on Lenovo and HP laptops
    (LP: #1872569)
    - SAUCE: ASoC: intel/skl/hda - set autosuspend timeout for hda codecs

  * amdgpu kernel errors in Linux 5.4 (LP: #1871248)
    - drm/amd/display: Stop if retimer is not available

  * Focal update: v5.4.34 upstream stable release (LP: #1874111)
    - amd-xgbe: Use __napi_schedule() in BH context
    - hsr: check protocol version in hsr_newlink()
    - l2tp: Allow management of tunnels and session in user namespace
    - net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode
    - net: ipv4: devinet: Fix crash when add/del multicast IP with autojoin
    - net: ipv6: do not consider routes via gateways for anycast address check
    - net: phy: micrel: use genphy_read_status for KSZ9131
    - net: qrtr: send msgs from local of same id as broadcast
    - net: revert default NAPI poll timeout to 2 jiffies
    - net: tun: record RX queue in skb before do_xdp_generic()
    - net: dsa: mt7530: move mt7623 settings out off the mt7530
    - net: ethernet: mediatek: move mt7623 settings out off the mt7530
    - net/mlx5: Fix frequent ioread PCI access during recovery
    - net/mlx5e: Add missing release firmware call
    - net/mlx5e: Fix nest_level for vlan pop action
    - net/mlx5e: Fix pfnum in devlink port attribute
    - net: stmmac: dwmac-sunxi: Provide TX and RX fifo sizes
    - ovl: fix value of i_ino for lower hardlink corner case
    - scsi: ufs: Fix ufshcd_hold() caused scheduling while atomic
    - platform/chrome: cros_ec_rpmsg: Fix race with host event
    - jbd2: improve comments about freeing data buffers whose page mapping is
      NULL
    - acpi/nfit: improve bounds checking for 'func'
    - perf report: Fix no branch type statistics report issue
    - pwm: pca9685: Fix PWM/GPIO inter-operation
    - ext4: fix incorrect group count in ext4_fill_super error message
    - ext4: fix incorrect inodes per group in error message
    - clk: at91: sam9x60: fix usb clock parents
    - clk: at91: usb: use proper usbs_mask
    - ARM: dts: imx7-colibri: fix muxing of usbc_det pin
    - arm64: dts: librem5-devkit: add a vbus supply to usb0
    - usb: dwc3: gadget: Don't clear flags before transfer ended
    - ASoC: Intel: mrfld: fix incorrect check on p->sink
    - ASoC: Intel: mrfld: return error codes when an error occurs
    - ALSA: hda/realtek - Enable the headset mic on Asus FX505DT
    - ALSA: usb-audio: Filter error from connector kctl ops, too
    - ALSA: usb-audio: Don't override ignore_ctl_error value from the map
    - ALSA: usb-audio: Don't create jack controls for PCM terminals
    - ALSA: usb-audio: Check mapping at creating connector controls, too
    - arm64: vdso: don't free unallocated pages
    - keys: Fix proc_
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
This bug was fixed in the package linux - 4.15.0-101.102

---------------
linux (4.15.0-101.102) bionic; urgency=medium

  * bionic/linux: 4.15.0-101.102 -proposed tracker (LP: #1877262)

  * 4.15.0-100.101 breaks userspace builds due to a bug in the headers
    /usr/include/linux/swab.h of linux-libc-dev (LP: #1877123)
    - include/uapi/linux/swab.h: fix userspace breakage, use __BITS_PER_LONG for
      swap

  * bionic snapdragon 4.15 snap failed Certification testing (LP: #1877657)
    - Revert "drm/msm: Use the correct dma_sync calls in msm_gem"
    - Revert "drm/msm: stop abusing dma_map/unmap for cache"

linux (4.15.0-100.101) bionic; urgency=medium

  * bionic/linux: 4.15.0-100.101 -proposed tracker (LP: #1875878)

  * built-using constraints preventing uploads (LP: #1875601)
    - temporarily drop Built-Using data

  * Add debian/rules targets to compile/run kernel selftests (LP: #1874286)
    - [Packaging] add support to compile/run selftests

  * getitimer returns it_value=0 erroneously (LP: #1349028)
    - [Config] CONTEXT_TRACKING_FORCE policy should be unset

  * QEMU/KVM display is garbled when booting from kernel EFI stub due to missing
    bochs-drm module (LP: #1872863)
    - [Config] Enable CONFIG_DRM_BOCHS as module for all archs

  * Backport MPLS patches from 5.3 to 4.15 (LP: #1851446)
    - net/mlx5e: Report netdevice MPLS features
    - net: vlan: Inherit MPLS features from parent device
    - net: bonding: Inherit MPLS features from slave devices
    - net/mlx5e: Move to HW checksumming advertising

  * LIO hanging in iscsit_free_session and iscsit_stop_session (LP: #1871688)
    - scsi: target: remove boilerplate code
    - scsi: target: fix hang when multiple threads try to destroy the same iscsi
      session
    - scsi: target: iscsi: calling iscsit_stop_session() inside
      iscsit_close_session() has no effect

  * Add hw timestamps to received skbs in peak_canfd (LP: #1874124)
    - can: peak_canfd: provide hw timestamps in rx skbs

  * Bionic update: upstream stable patchset 2020-04-23 (LP: #1874502)
    - ARM: dts: sun8i-a83t-tbs-a711: HM5065 doesn't like such a high voltage
    - bus: sunxi-rsb: Return correct data when mixing 16-bit and 8-bit reads
    - net: vxge: fix wrong __VA_ARGS__ usage
    - hinic: fix a bug of waitting for IO stopped
    - hinic: fix wrong para of wait_for_completion_timeout
    - cxgb4/ptp: pass the sign of offset delta in FW CMD
    - qlcnic: Fix bad kzalloc null test
    - i2c: st: fix missing struct parameter description
    - firmware: arm_sdei: fix double-lock on hibernate with shared events
    - null_blk: Fix the null_add_dev() error path
    - null_blk: Handle null_add_dev() failures properly
    - null_blk: fix spurious IO errors after failed past-wp access
    - xhci: bail out early if driver can't accress host in resume
    - x86: Don't let pgprot_modify() change the page encryption bit
    - block: keep bdi->io_pages in sync with max_sectors_kb for stacked devices
    - irqchip/versatile-fpga: Handle chained IRQs properly
    - sched: Avoid scale real weight down to zero
    - selftests/x86/ptrace_syscall_32: Fix no-vDSO segfault
    - PCI/switchtec: Fix init_completion race condition with poll_wait()
    - libata: Remove extra scsi_host_put() in ata_scsi_add_hosts()
    - gfs2: Don't demote a glock until its revokes are written
    - x86/boot: Use unsigned comparison for addresses
    - efi/x86: Ignore the memory attributes table on i386
    - genirq/irqdomain: Check pointer in irq_domain_alloc_irqs_hierarchy()
    - block: Fix use-after-free issue accessing struct io_cq
    - usb: dwc3: core: add support for disabling SS instances in park mode
    - irqchip/gic-v4: Provide irq_retrigger to avoid circular locking dependency
    - md: check arrays is suspended in mddev_detach before call quiesce
      operations
    - locking/lockdep: Avoid recursion in lockdep_count_{for,back}ward_deps()
    - block, bfq: fix use-after-free in bfq_idle_slice_timer_body
    - btrfs: qgroup: ensure qgroup_rescan_running is only set when the worker is
      at least queued
    - btrfs: remove a BUG_ON() from merge_reloc_roots()
    - btrfs: track reloc roots based on their commit root bytenr
    - uapi: rename ext2_swab() to swab() and share globally in swab.h
    - slub: improve bit diffusion for freelist ptr obfuscation
    - ASoC: fix regwmask
    - ASoC: dapm: connect virtual mux with default value
    - ASoC: dpcm: allow start or stop during pause for backend
    - ASoC: topology: use name_prefix for new kcontrol
    - usb: gadget: f_fs: Fix use after free issue as part of queue failure
    - usb: gadget: composite: Inform controller driver of self-powered
    - ALSA: usb-audio: Add mixer workaround for TRX40 and co
    - ALSA: hda: Add driver blacklist
    - ALSA: hda: Fix potential access overflow in beep helper
    - ALSA: ice1724: Fix invalid access for enumerated ctl items
    - ALSA: pcm: oss: Fix regression by buffer overflow fix
    - ALSA: do
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
Thanks very much @Matt Coleman for reporting the bug and verifying the
fix! I will mark the bug as verified based on your feedback.

** Tags removed: verification-needed-bionic verification-needed-focal
** Tags added: verification-done-bionic verification-done-focal

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1871688

Title:
  LIO hanging in iscsit_free_session and iscsit_stop_session

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Focal:
  Fix Committed

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1871688/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
The Xenial system on which I installed the binary packages from the
proposed Bionic repository has been running properly and stably since
the 7th.
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
I installed the proposed Bionic packages on my Xenial system and have
confirmed that basic functionality is working properly. If the system
remains stable for >24h, I will consider the issue to be fixed in this
kernel.
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
My Xenial system is still stable after almost two days with the kernel
package I built from the proposed Bionic kernel's source.
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-focal' to 'verification-done-focal'. If the problem
still exists, change the tag 'verification-needed-focal' to
'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!

** Tags added: verification-needed-focal
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
I built and installed the package on one of my Xenial systems from the source available here: https://launchpad.net/ubuntu/+source/linux/4.15.0-100.101 If the system is still stable a day from now, is it okay for me to set the 'verification-done-bionic' tag, even though I'm testing on Xenial?
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
The only machines I've reproduced this bug on are running Xenial. For the past few months, I've been applying these patches to the Xenial HWE kernels and building packages for my dev/test systems. These systems frequently add and remove targets (around 125 per hour) with separate initiators connecting to each target. Initiators connect for varying durations, so creating/destroying and connecting/disconnecting happen for different targets in unpredictable order. With an unpatched kernel, the systems usually hit this bug within hours and almost always within a day (very occasionally it stretches into a second day). With the patches applied, the systems remain stable. I attempted to install the proposed Bionic kernel on one of my Xenial systems, but it depends on libssl1.1, which isn't available in Xenial.
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'. If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you! ** Tags added: verification-needed-bionic
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
** No longer affects: linux (Ubuntu Xenial)
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
Target to Xenial was meant for Xenial-hwe (derivative of Bionic).

** Changed in: linux (Ubuntu Bionic) Status: In Progress => Fix Committed
** Changed in: linux (Ubuntu Focal) Status: In Progress => Fix Committed
** Changed in: linux (Ubuntu Xenial) Status: New => Won't Fix
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
** Changed in: linux (Ubuntu Focal) Status: Confirmed => In Progress
** Changed in: linux (Ubuntu Bionic) Status: New => In Progress
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
** Also affects: linux (Ubuntu Focal) Importance: Undecided Status: Confirmed
** Also affects: linux (Ubuntu Xenial) Importance: Undecided Status: New
** Also affects: linux (Ubuntu Bionic) Importance: Undecided Status: New
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
** Tags added: bionic focal

** Description changed:

  To fix this in the Ubuntu kernel, these three commits should be backported:
- * https://github.com/torvalds/linux/commit/e49a7d994379278d3353d7ffc7994672752fb0ad#diff-b7557d7ed3ba34645f6e9d510f281d3a
- * https://github.com/torvalds/linux/commit/57c46e9f33da530a2485fa01aa27b6d18c28c796#diff-b7557d7ed3ba34645f6e9d510f281d3a
- * https://github.com/torvalds/linux/commit/626bac73371eed79e2afa2966de393da96cf925e#diff-b7557d7ed3ba34645f6e9d510f281d3a
+ * https://github.com/torvalds/linux/commit/e49a7d994379278d3353d7ffc7994672752fb0ad
+ * https://github.com/torvalds/linux/commit/57c46e9f33da530a2485fa01aa27b6d18c28c796
+ * https://github.com/torvalds/linux/commit/626bac73371eed79e2afa2966de393da96cf925e
+
+ I'd like these commits to be added to the xenial, bionic, and focal
+ kernels.
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
** Attachment added: "contents of /proc/version_signature" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1871688/+attachment/5349844/+files/version.log

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1871688

Title:
  LIO hanging in iscsit_free_session and iscsit_stop_session

Status in linux package in Ubuntu:
  New

Bug description:
  The target subsystem (LIO) can hang if multiple threads try to destroy iSCSI sessions simultaneously. This is reproducible on systems that have multiple targets with initiators regularly connecting/disconnecting.

  This may happen when a "targetcli iscsi/iqn.../tpg1 disable" command is executed when a logout operation is underway.

  The iscsi target doesn't handle such events in a correct way: two or more threads may end up sleeping while waiting for the driver to close the remaining connections on the session. When the connections are closed, the driver wakes up only the first thread that will then proceed to destroy the session structure. The remaining threads are blocked there forever, waiting on a completion synchronization mechanism that doesn't exist in memory anymore because it has been freed by the first thread.

  Note that if the blocked threads are somehow forced to wake up, they will try to free the same iSCSI session structure destroyed by the first thread, causing double frees, memory corruptions, etc.

  The driver has been reorganized so the concurrent threads will set a flag in the session structure to notify the driver that the session should be destroyed; then, they wait for the driver to close the remaining connections. When the connections are all closed, the driver will wake up all the threads and will wait for the refcount of the iSCSI session structure to reach zero. When the last thread wakes up, the refcount is decreased to zero and the driver can proceed to destroy the session structure because no one is referencing it anymore.

  I've witnessed this happening on hundreds of Ubuntu 16.04.5 systems. It is a regression, because this did not occur several years ago. Unfortunately, I don't have detailed records from that far back to determine exactly which kernel I was running that was not affected by this bug (I believe it was either 4.8.x or 4.10.x).

  I've attached the requested uname, version_signature, dmesg, and lspci from my system. However, I've seen this happen on a wide array of hardware: 2 to 24 cores, 8GB to 256GB RAM, both AMD and Intel CPUs, onboard storage and PCIe SAS cards, etc.

  This has been fixed in the upstream master branch, but it hasn't yet been backported to "-stable". To fix this in the Ubuntu kernel, these three commits should be backported:

  * https://github.com/torvalds/linux/commit/e49a7d994379278d3353d7ffc7994672752fb0ad
  * https://github.com/torvalds/linux/commit/57c46e9f33da530a2485fa01aa27b6d18c28c796
  * https://github.com/torvalds/linux/commit/626bac73371eed79e2afa2966de393da96cf925e

  I'd like these commits to be added to the xenial, bionic, and focal kernels.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1871688/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
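The teardown sequence in the bug description (set a flag, wait for the connections to close, wake every waiter, destroy only once the refcount reaches zero) can be illustrated with a small user-space analogy. This is only a sketch under stated assumptions: it is not the kernel code from the referenced commits, and every name in it (`Session`, `request_teardown`, `close_connections_and_destroy`, `run_demo`) is hypothetical. Python's `threading.Condition` stands in for the kernel's wait/wake mechanism, and the demo waits for all threads to register before closing connections purely to keep the run deterministic.

```python
import threading


class Session:
    """Toy stand-in for an iSCSI session (hypothetical, not the kernel struct)."""

    def __init__(self, open_connections):
        self.cond = threading.Condition()
        self.open_connections = open_connections
        self.teardown_requested = False  # flag: someone wants the session gone
        self.refcount = 0                # one reference per waiting thread
        self.destroyed = 0               # number of destroy calls; must end at 1


def request_teardown(session, woken):
    """Run by each concurrent thread that wants the session destroyed."""
    with session.cond:
        session.teardown_requested = True
        session.refcount += 1
        session.cond.notify_all()  # let observers see the new reference
        # Sleep until the "driver" has closed every connection.
        while session.open_connections > 0:
            session.cond.wait()
        woken.append(threading.current_thread().name)
        # Drop our reference and let the driver re-check the refcount.
        session.refcount -= 1
        session.cond.notify_all()


def close_connections_and_destroy(session):
    """Driver side: close connections, wake ALL waiters, destroy at refcount 0."""
    with session.cond:
        session.open_connections = 0
        session.cond.notify_all()  # waking only one waiter is the original bug
        while session.refcount > 0:
            session.cond.wait()
        session.destroyed += 1     # nobody references the session any more


def run_demo(n_waiters=3):
    session = Session(open_connections=2)
    woken = []
    threads = [
        threading.Thread(target=request_teardown, args=(session, woken))
        for _ in range(n_waiters)
    ]
    for t in threads:
        t.start()
    # Demo-only step: wait until every thread holds a reference.
    with session.cond:
        while session.refcount < n_waiters:
            session.cond.wait()
    close_connections_and_destroy(session)
    for t in threads:
        t.join()
    return len(woken), session.destroyed


if __name__ == "__main__":
    print(run_demo())
```

In this model every waiter is woken and the session is destroyed exactly once. Replacing the first `notify_all()` in `close_connections_and_destroy` with `notify()` would mirror the reported hang: only one waiter is guaranteed to wake, and the others can sleep indefinitely.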
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
** Attachment added: "output of `lspci -vvnn`" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1871688/+attachment/5349846/+files/lspci-vvnn.log
[Kernel-packages] [Bug 1871688] Re: LIO hanging in iscsit_free_session and iscsit_stop_session
** Attachment added: "dmesg output shortly after booting" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1871688/+attachment/5349845/+files/dmesg.log