Public bug reported:

(I'll add the SRU template + testing steps and post to ML shortly.)

A customer reported a CPU soft lockup on Trusty HWE kernel from Xenial
when detaching a virtio-scsi drive, and provided a crashdump that shows
2 things:

1) The soft locked up CPU is waiting for another CPU to finish something,
and that does not happen because the other CPU is infinitely looping in
virtscsi_target_destroy().

2) The loop happens because the 'tgt->reqs' counter is non-zero, and that
probably happened due to a missing decrement in SCSI command requeue path,
exercised when the virtio ring is full.

The reported problem itself happens because of a downstream/SAUCE patch,
coupled with the problem of the missing decrement for the reqs counter.

Introducing a decrement in the SCSI command requeue path resolves the
problem, verified synthetically with QEMU+GDB and with test-case/loop
provided by the customer as problem reproducer.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1798110

Title:
  xenial: virtio-scsi: CPU soft lockup due to loop in
  virtscsi_target_destroy()

Status in linux package in Ubuntu:
  New

Bug description:
  (I'll add the SRU template + testing steps and post to ML shortly.)

  A customer reported a CPU soft lockup on Trusty HWE kernel from Xenial
  when detaching a virtio-scsi drive, and provided a crashdump that shows
  2 things:

  1) The soft locked up CPU is waiting for another CPU to finish something,
  and that does not happen because the other CPU is infinitely looping in
  virtscsi_target_destroy().

  2) The loop happens because the 'tgt->reqs' counter is non-zero, and that
  probably happened due to a missing decrement in SCSI command requeue path,
  exercised when the virtio ring is full.

  The reported problem itself happens because of a downstream/SAUCE patch,
  coupled with the problem of the missing decrement for the reqs counter.

  Introducing a decrement in the SCSI command requeue path resolves the
  problem, verified synthetically with QEMU+GDB and with test-case/loop
  provided by the customer as problem reproducer.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1798110/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to