Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors

2019-01-31 Thread Borislav Petkov
On Wed, Jan 23, 2019 at 06:40:08PM +, James Morse wrote:
> My SMM comment was because the CPU must jump from user-space->SMM, which injects
> an NMI into the kernel. The kernel's EIP must point into user-space, so
> returning from the NMI without doing the memory_failure() work puts us back

Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors

2019-01-23 Thread James Morse
Hi Boris,
On 21/01/2019 17:58, Borislav Petkov wrote:
> On Mon, Dec 03, 2018 at 06:06:10PM +, James Morse wrote:
>> memory_failure() offlines or repairs pages of memory that have been
>> discovered to be corrupt. These may be detected by an external
>> component, (e.g. the memory controller),

Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors

2019-01-23 Thread James Morse
Hi Boris,
On 22/01/2019 10:51, Borislav Petkov wrote:
> On Mon, Dec 10, 2018 at 07:15:13PM +, James Morse wrote:
>> What happens if we miss MF_ACTION_REQUIRED?
>
> AFAICU, the logic is to force-send a signal to the user process, i.e.,
> force_sig_info() which cannot be ignored. IOW, an

Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors

2019-01-22 Thread Borislav Petkov
On Mon, Dec 10, 2018 at 07:15:13PM +, James Morse wrote:
> What happens if we miss MF_ACTION_REQUIRED?
AFAICU, the logic is to force-send a signal to the user process, i.e., force_sig_info() which cannot be ignored. IOW, an "enlightened" process would know how to do recovery action from a

Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors

2019-01-21 Thread Borislav Petkov
On Mon, Dec 03, 2018 at 06:06:10PM +, James Morse wrote:
> memory_failure() offlines or repairs pages of memory that have been
> discovered to be corrupt. These may be detected by an external
> component, (e.g. the memory controller), and notified via an IRQ.
> In this case the work is queued

Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors

2018-12-10 Thread James Morse
Hi Xie XiuQi,
On 05/12/2018 02:02, Xie XiuQi wrote:
> On 2018/12/4 2:06, James Morse wrote:
>> memory_failure() offlines or repairs pages of memory that have been
>> discovered to be corrupt. These may be detected by an external
>> component, (e.g. the memory controller), and notified via an IRQ.

Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors

2018-12-04 Thread Xie XiuQi
Hi James & Boris,
On 2018/12/4 2:06, James Morse wrote:
> memory_failure() offlines or repairs pages of memory that have been
> discovered to be corrupt. These may be detected by an external
> component, (e.g. the memory controller), and notified via an IRQ.
> In this case the work is queued as

[PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors

2018-12-03 Thread James Morse
memory_failure() offlines or repairs pages of memory that have been discovered to be corrupt. These may be detected by an external component, (e.g. the memory controller), and notified via an IRQ. In this case the work is queued as not all of memory_failure()'s work can happen in IRQ context. If
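
For readers skimming the thread, the idea in the subject line (defer the work from IRQ context, then "kick" the queue so it is drained before a synchronous error returns to the faulting task) can be sketched in plain user-space C. This is only an illustration, not the code from the patch: struct mf_queue, mf_queue_push(), mf_queue_kick() and MF_QUEUE_DEPTH are hypothetical names invented here, and the per-entry handling is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MF_QUEUE_DEPTH 16 /* hypothetical depth, chosen for the sketch */

/* One deferred request: which page frame, and with which flags. */
struct mf_entry {
    unsigned long pfn;
    int flags;
};

/* Hypothetical fixed-size ring standing in for the deferred-work queue. */
struct mf_queue {
    struct mf_entry entries[MF_QUEUE_DEPTH];
    size_t head;    /* index of the next entry to process */
    size_t tail;    /* index of the next free slot */
    size_t handled; /* number of entries processed so far */
};

/* "IRQ context" side: only record the request, do no heavy work here. */
bool mf_queue_push(struct mf_queue *q, unsigned long pfn, int flags)
{
    assert(q != NULL);
    if (q->tail - q->head == MF_QUEUE_DEPTH)
        return false; /* queue full, caller must cope */
    q->entries[q->tail % MF_QUEUE_DEPTH] = (struct mf_entry){ pfn, flags };
    q->tail++;
    return true;
}

/*
 * "Kick": drain everything queued so far, synchronously, in a context
 * where the heavy work is allowed. Returns the number of entries handled.
 */
size_t mf_queue_kick(struct mf_queue *q)
{
    size_t n = 0;

    assert(q != NULL);
    while (q->head != q->tail) {
        /* A real handler would act on entries[head]; the sketch only counts. */
        q->head++;
        q->handled++;
        n++;
    }
    return n;
}
```

In this sketch an asynchronous error would call mf_queue_push() and leave the drain to whenever a worker gets round to it, while a synchronous error would call mf_queue_push() and then mf_queue_kick() before the affected task is allowed to run again.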