Public bug reported:
Description:
During run-time, the kernel can be notified of new/latent errors in two
different ways:
1. If an application trips over a latent/unknown error, and the system has
machine check recovery, the kernel is notified via the mce handler and the
app is killed. In this case, the kernel should take the chance to clear the
errors for that page and re-online it so it can be used again in the
future.
2. If the 'patrol scrubber' discovers an error on a yet-to-be-accessed location,
it can send an ACPI notification to the nfit driver. In this case, the kernel
should clear the error if the page is not in use. If it is in use, the
application that has it mapped may need to be killed, as in case 1 above.
This additional run-time handling of errors will augment (and be complementary
to) the init-time handling in userspace, and having both will give us the best
possible coverage for media errors.
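The two notification paths above can be summarized as a small decision table. The following is only a user-space sketch of that policy, not kernel code: the names Source, Action and plan_actions are hypothetical and do not correspond to any existing kernel interface. It also assumes the strictest reading of "may need to be killed", i.e. that a scrub report on an in-use page is always handled like case 1.

```python
from enum import Enum, auto


class Source(Enum):
    """How the kernel learned about the poisoned location (hypothetical names)."""
    MCE = auto()           # case 1: an application consumed the error
    PATROL_SCRUB = auto()  # case 2: ACPI notification delivered to the nfit driver


class Action(Enum):
    """Steps the description asks the kernel to take (hypothetical names)."""
    KILL_MAPPERS = auto()   # kill the application(s) mapping the page
    CLEAR_ERROR = auto()    # clear the media error for the page
    REONLINE_PAGE = auto()  # make the page usable again


def plan_actions(source, page_in_use):
    """Return the ordered steps for one error report.

    Assumption: a scrub report on an in-use page is treated exactly like
    case 1. The description only says the application "may" need to be killed.
    """
    if source is Source.MCE or page_in_use:
        # The data at this location is lost, so the mapping application
        # cannot continue; afterwards the page is repaired and reused.
        return [Action.KILL_MAPPERS, Action.CLEAR_ERROR, Action.REONLINE_PAGE]
    # Patrol scrub found the error before anything touched the page:
    # nothing to kill, only the error needs clearing.
    return [Action.CLEAR_ERROR]


if __name__ == "__main__":
    for src, in_use in [(Source.MCE, True),
                        (Source.PATROL_SCRUB, False),
                        (Source.PATROL_SCRUB, True)]:
        steps = ", ".join(a.name for a in plan_actions(src, in_use))
        print(f"{src.name:13} in_use={in_use!s:5} -> {steps}")
```

The only branch that avoids killing anything is a patrol-scrub report on a page nobody has mapped, which is why catching errors before they are consumed is the cheaper path.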
Target Release: 19.10
Target Kernel: TBD
** Affects: intel
Importance: Undecided
Status: New
** Affects: linux (Ubuntu)
Importance: Undecided
Status: New
** Tags: intel-kernel-19.10
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1835340
Title:
[HMEM] kmem: clear poison reported via mce or ACPI patrol scrub
notifications
Status in intel:
New
Status in linux package in Ubuntu:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/intel/+bug/1835340/+subscriptions