*** This bug is a security vulnerability ***

Private security bug reported:

When running the stress-ng vector floating point stressor in QEMU PPC64
virtual machines I get floating point verification errors when running
more stressor instances than the number of virtual CPUs.

How to reproduce:

Create a PPC64 VM with 8 virtual CPUs in QEMU on an x86 host. Log in,
then build the latest stress-ng and run 32 instances of the vecfp
stressor:

sudo apt-get build-dep stress-ng
git clone https://github.com/ColinIanKing/stress-ng
cd stress-ng
make clean; make -j $(nproc)
./stress-ng --vecfp 32 --verify -t 10

One should get failures such as:
stress-ng: info:  [1487] setting to a 10 second run per stressor
stress-ng: info:  [1487] dispatching hogs: 32 vecfp
stress-ng: fail:  [1489] vecfp: floatv64div float vector operation result mismatch, got 1078998925312.000000, expected 180812.062500
stress-ng: fail:  [1489] vecfp: floatv64div float vector operation result mismatch, got 46779686912.000000, expected 13278722.000000
stress-ng: fail:  [1489] vecfp: floatv64div float vector operation result mismatch, got 24992688128.000000, expected 26213772.000000
stress-ng: fail:  [1489] vecfp: floatv64div float vector operation result mismatch, got 17185787904.000000, expected 39415832.000000
stress-ng: fail:  [1488] vecfp: floatv16div float vector operation result mismatch, got 157250576.000000, expected 33576.261719
stress-ng: fail:  [1488] vecfp: floatv16div float vector operation result mismatch, got 170314032.000000, expected 13129044.000000
stress-ng: fail:  [1488] vecfp: floatv16div float vector operation result mismatch, got 183516080.000000, expected 26348392.000000
stress-ng: fail:  [1488] vecfp: floatv16div float vector operation result mismatch, got 196647552.000000, expected 39365508.000000
etc.

However, running fewer instances than the number of CPUs completes
without any errors:
./stress-ng --vecfp 1 --verify -t 10
stress-ng: info:  [1521] setting to a 10 second run per stressor
stress-ng: info:  [1521] dispatching hogs: 1 vecfp
stress-ng: info:  [1521] passed: 1: vecfp (1)
stress-ng: info:  [1521] failed: 0
stress-ng: info:  [1521] skipped: 0
stress-ng: info:  [1521] metrics untrustworthy: 0
stress-ng: info:  [1521] successful run completed in 19.00s

The failure appears only when the number of vecfp stressor instances
exceeds the number of virtual CPUs, that is, when stressor processes
have to share a vCPU and are context switched. This suggests that
vector floating point registers are being clobbered between processes,
which could be an exploitable security issue.
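
To narrow down where the failures start, the instance count can be swept
around the vCPU count. The script below is only a rough sketch: the
STRESS_NG variable and the instance_counts and run_vecfp helpers are
names made up for this example, and it assumes a failing run can be
recognised by a "fail:" line like the ones shown above. The stress-ng
options are the same ones used in the reproducer.

```shell
#!/bin/sh
# Sketch only: sweep the vecfp instance count around the vCPU count to
# see where verification failures start. STRESS_NG, instance_counts and
# run_vecfp are hypothetical names for this example; the stress-ng
# options are the ones used in this report.

STRESS_NG=${STRESS_NG:-./stress-ng}

# Print the instance counts to try for a given CPU count: a single
# instance, one per CPU, one more than the CPUs, and 4x oversubscribed.
instance_counts() {
    ncpu=$1
    echo "1 $ncpu $((ncpu + 1)) $((ncpu * 4))"
}

# Run one 10 second vecfp pass with the given number of instances and
# report whether any "fail:" line appeared in the combined output.
run_vecfp() {
    n=$1
    out=$("$STRESS_NG" --vecfp "$n" --verify -t 10 2>&1)
    case $out in
        *"fail:"*) echo "vecfp $n: FAIL" ;;
        *)         echo "vecfp $n: ok" ;;
    esac
}

# Only sweep when the stress-ng binary is actually present.
if [ -x "$STRESS_NG" ]; then
    for n in $(instance_counts "$(nproc)"); do
        run_vecfp "$n"
    done
fi
```

On the 8 vCPU guest described above this would try 1, 8, 9 and 32
instances. If the first two pass and the last two fail, that would be
consistent with the failures being tied to processes sharing a vCPU.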

Reproduced with an Ubuntu Lunar PPC64 VM (6.2.0-20-generic) on an x86
host (6.2.0-21-generic + qemu-kvm 1:5.0-5ubuntu6).

PPC64el kernels that reproduce the issue:

    Focal: 5.4.0-148-generic
    Jammy: 5.15.0-58-generic
    Lunar: 6.2.0-20-generic
    Mantic: 6.3.0-7-generic

It is not yet clear whether this is a kernel issue, a KVM issue, or both.

** Affects: linux (Ubuntu)
     Importance: High
         Status: New

** Affects: linux (Ubuntu Focal)
     Importance: High
         Status: New

** Affects: linux (Ubuntu Lunar)
     Importance: High
         Status: New

** Affects: linux (Ubuntu Mantic)
     Importance: High
         Status: New

** Also affects: linux (Ubuntu Lunar)
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2026883

Title:
  vector floating point registers get clobbered when running stress-ng
  --vecfp with more instances than CPUs

Status in linux package in Ubuntu:
  New
Status in linux source package in Focal:
  New
Status in linux source package in Lunar:
  New
Status in linux source package in Mantic:
  New
