Public bug reported:

The LTP madvise07 test case, which tests HWPOISON handling on anonymous
mapped memory, fails as shown below. The issue appears to be that
accessing HWPOISON'ed anonymous mapped memory causes the process to be
forcibly killed with SIGKILL instead of receiving SIGBUS as expected.
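
The SIGBUS versus SIGKILL distinction described above can be sketched as a parent inspecting its child's wait status. This is only an illustration, not the LTP test itself: it poisons no memory, the child simply sends itself a signal as a stand-in, and the names run_child_and_classify and die_with are hypothetical helpers introduced here.

```python
import os
import signal


def run_child_and_classify(child_fn):
    """Fork, run child_fn in the child, and report how the child ended."""
    pid = os.fork()
    if pid == 0:
        try:
            child_fn()
        finally:
            # Only reached if the child was not terminated by a signal.
            os._exit(0)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        sig = os.WTERMSIG(status)
        if sig == signal.SIGBUS:
            return "PASS: child killed by SIGBUS"
        return f"FAIL: child killed by {signal.Signals(sig).name}"
    return "FAIL: child exited without a signal"


def die_with(sig):
    """Hypothetical stand-in for touching a poisoned page: the child
    sends itself `sig` instead of calling madvise(MADV_HWPOISON)."""
    def child():
        if sig != signal.SIGKILL:
            # Make sure the default (terminating) action is in effect.
            signal.signal(sig, signal.SIG_DFL)
        os.kill(os.getpid(), sig)
    return child


if __name__ == "__main__":
    print(run_child_and_classify(die_with(signal.SIGBUS)))
    print(run_child_and_classify(die_with(signal.SIGKILL)))
```

The second call mirrors the outcome reported in this bug: the child is terminated, but by SIGKILL rather than SIGBUS, so the check fails.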

~john-cabaj discovered this error while reviewing linux-azure-nvidia
test results, and identified this patch as the culprit:
https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-nvidia/+git/noble/commit/?h=nvidia-6.14-next&id=7a1391d02f3e1a71b8953bfcd6a66c6093fe1280.

This appears to only impact kernels based on noble:linux-nvidia-6.14.

77544 20:44:08 ERROR| [stderr] -------------------------------------------
77545 20:44:08 ERROR| [stderr] INFO: runltp script is deprecated, try kirk
77546 20:44:08 ERROR| [stderr] https://github.com/linux-test-project/kirk
77547 20:44:08 ERROR| [stderr] -------------------------------------------
77548 20:44:08 DEBUG| [stdout] Checking for required user/group ids
77549 20:44:08 DEBUG| [stdout]
77550 20:44:08 DEBUG| [stdout] 'root' user id and group found.
77551 20:44:08 DEBUG| [stdout] 'nobody' user id and group found.
77552 20:44:08 DEBUG| [stdout] 'bin' user id and group found.
77553 20:44:08 DEBUG| [stdout] 'daemon' user id and group found.
77554 20:44:08 DEBUG| [stdout] Users group found.
77555 20:44:08 DEBUG| [stdout] Sys group found.
77556 20:44:08 DEBUG| [stdout] Required users/groups exist.
77557 20:44:08 DEBUG| [stdout] no big block device was specified on commandline.
77558 20:44:08 DEBUG| [stdout] Tests which require a big block device are disabled.
77559 20:44:08 DEBUG| [stdout] You can specify it with option -z
77560 20:44:08 DEBUG| [stdout] INFO: Test start time: Thu Sep 18 20:44:08 UTC 2025
77561 20:44:08 DEBUG| [stdout] COMMAND: /opt/ltp/bin/ltp-pan -q -e -S -a 150147 -n 150147 -f /tmp/ltp-1RFGjDBAMw/alltests -l /dev/null -C /dev/null -T /dev/null
77562 20:44:08 DEBUG| [stdout] LOG File: /dev/null
77563 20:44:08 DEBUG| [stdout] FAILED COMMAND File: /dev/null
77564 20:44:08 DEBUG| [stdout] TCONF COMMAND File: /dev/null
77565 20:44:08 DEBUG| [stdout] Running tests.......
77566 20:44:08 DEBUG| [stdout] tst_test.c:1952: TINFO: LTP version: 20250130
77567 20:44:08 DEBUG| [stdout] tst_test.c:1955: TINFO: Tested kernel: 6.14.0-1004-azure-nvidia #4~24.04.1-Ubuntu SMP Thu Sep 11 18:20:32 UTC 2025 aarch64
77568 20:44:08 DEBUG| [stdout] tst_kconfig.c:88: TINFO: Parsing kernel config '/lib/modules/6.14.0-1004-azure-nvidia/build/.config'
77569 20:44:08 DEBUG| [stdout] tst_test.c:1773: TINFO: Overall timeout per run is 0h 00m 30s
77570 20:44:08 DEBUG| [stdout] madvise07.c:43: TINFO: mmap(0, 65536, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0)
77571 20:44:08 DEBUG| [stdout] madvise07.c:54: TINFO: madvise(0xe595bee90000, 65536, MADV_HWPOISON)
77572 20:44:08 DEBUG| [stdout] madvise07.c:90: TFAIL: Child killed by SIGKILL
77573 20:44:08 DEBUG| [stdout]
77574 20:44:08 DEBUG| [stdout] Summary:
77575 20:44:08 DEBUG| [stdout] passed 0
77576 20:44:08 DEBUG| [stdout] failed 1
77577 20:44:08 DEBUG| [stdout] broken 0
77578 20:44:08 DEBUG| [stdout] skipped 0
77579 20:44:08 DEBUG| [stdout] warnings 0
77580 20:44:08 DEBUG| [stdout] INFO: ltp-pan reported some tests FAIL
77581 20:44:08 DEBUG| [stdout] LTP Version: 20250130
77582 20:44:08 DEBUG| [stdout] INFO: Test end time: Thu Sep 18 20:44:08 UTC 2025
77583 20:44:08 ERROR| [stderr] -------------------------------------------
77584 20:44:08 ERROR| [stderr] INFO: runltp script is deprecated, try kirk
77585 20:44:08 ERROR| [stderr] https://github.com/linux-test-project/kirk
77586 20:44:08 ERROR| [stderr] -------------------------------------------

This is accompanied by an error in dmesg:
...
[ 2786.499480] Memory failure: 0x232b9: forcibly killing madvise07:150194 because of failure to unmap corrupted page
...
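
When checking other test runs for the same symptom, the dmesg line above can be picked out mechanically. The following is a rough sketch that assumes the message keeps exactly the wording shown in this report; the hex value is assumed to be the page frame number, and find_forced_kills is a hypothetical helper name.

```python
import re

# Pattern built from the single dmesg line quoted in this report.
# \s+ before "because" tolerates the line being wrapped by a mail client.
MEMORY_FAILURE_KILL = re.compile(
    r"Memory failure: (?P<pfn>0x[0-9a-fA-F]+): forcibly killing "
    r"(?P<comm>\S+):(?P<pid>\d+)\s+because of failure to unmap corrupted page"
)


def find_forced_kills(dmesg_text):
    """Return (command, pid, pfn) for every forced-kill message found."""
    return [
        (m["comm"], int(m["pid"]), int(m["pfn"], 16))
        for m in MEMORY_FAILURE_KILL.finditer(dmesg_text)
    ]


if __name__ == "__main__":
    sample = (
        "[ 2786.499480] Memory failure: 0x232b9: forcibly killing "
        "madvise07:150194 because of failure to unmap corrupted page"
    )
    for comm, pid, pfn in find_forced_kills(sample):
        print(f"{comm} (pid {pid}) was force-killed, pfn {pfn:#x}")
```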

** Affects: linux-nvidia-6.14 (Ubuntu)
     Importance: Undecided
         Status: Invalid

** Affects: linux-nvidia-6.14 (Ubuntu Noble)
     Importance: Undecided
         Status: Triaged

** Also affects: linux-nvidia-6.14 (Ubuntu Noble)
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.14 in Ubuntu.
https://bugs.launchpad.net/bugs/2125434

Title:
  6.14 kernel SAUCE patch causes LTP madvise07 test case to fail with
  "Child killed by SIGKILL"

Status in linux-nvidia-6.14 package in Ubuntu:
  Invalid
Status in linux-nvidia-6.14 source package in Noble:
  Triaged
