Re: [PATCH] tracing: Include PPIN in mce_record tracepoint

2024-01-23 Thread Tony Luck
On Tue, Jan 23, 2024 at 05:51:50PM -0600, Avadhut Naik wrote: > Machine Check Error information from struct mce is exported to userspace > through the mce_record tracepoint. > > Currently, however, the PPIN (Protected Processor Inventory Number) field > of struct mce is not exported through the

[PATCH 4/4] x86/mce: Avoid infinite loop for copy from user recovery

2021-03-25 Thread Tony Luck
same address with page faults enabled). Just in case there is some code that loops forever, enforce a limit of 10. Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/core.c | 40 ++ include/linux/sched.h | 1 + 2 files changed, 32 insertions(+), 9 deleti

[PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison

2021-03-25 Thread Tony Luck
Andy Lutomirski pointed out that sending SIGBUS to tasks that hit poison in the kernel copying syscall parameters from user address space is not the right semantic. So stop doing that. Add a new kill_me_never() call back that simply unmaps and offlines the poison page. current->mce_vaddr is no

[PATCH 2/4] mce/iter: Check for copyin failure & return error up stack

2021-03-25 Thread Tony Luck
Check for failure from low level copy from user code. Doing this requires some type changes from the unsigned "size_t" to some signed type (so that "if (xxx < 0)" works!). I picked "loff_t" but there may be some other more appropriate type. Very likely more places need to be changed. These
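
The signedness point in the snippet above can be shown in userspace. This is only a minimal sketch, not the patch's code: demo_copy(), failed_signed(), as_unsigned() and DEMO_EFAULT are hypothetical names invented for illustration, and "long long" stands in for the "loff_t" the patch mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical error number, standing in for a negative errno such as -EFAULT. */
#define DEMO_EFAULT 14

/* Hypothetical low-level copy: returns bytes copied, or a negative error. */
long long demo_copy(bool hit_poison, size_t len)
{
    return hit_poison ? -DEMO_EFAULT : (long long)len;
}

/* With a signed result the "< 0" check can actually fire. */
bool failed_signed(long long ret)
{
    return ret < 0;
}

/* Stored in size_t, the same error wraps to a very large positive count. */
size_t as_unsigned(long long ret)
{
    return (size_t)ret;
}

/* Self-check of the two behaviours described above. */
void demo_signedness_check(void)
{
    long long ret = demo_copy(true, 64);

    assert(failed_signed(ret));                   /* signed: error is visible */
    assert(as_unsigned(ret) > 64);                /* unsigned: looks like a huge length */
    assert(!failed_signed(demo_copy(false, 64))); /* success path is unaffected */
}
```

A caller that keeps the return value in an unsigned variable would treat the wrapped value as a byte count, which is why the error check needs a signed type somewhere in the call chain.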

[RFC 0/4] Fix machine check recovery for copy_from_user

2021-03-25 Thread Tony Luck
brain cells in the maze of nested macros that is lib/iov_iter.c Last part has been posted before. It covers the case where the kernel takes more than one swing at reading poison data before returning to user. Tony Luck (4): x86/mce: Fix copyin code to return -EFAULT on machine check. mce/iter

[PATCH 1/4] x86/mce: Fix copyin code to return -EFAULT on machine check.

2021-03-25 Thread Tony Luck
When copy from user fails due to a machine check on poison reading user data it should return an error code. --- Separate patch just now, but likely needs to be combined with patches to iteration code for bisection safety. --- arch/x86/lib/copy_user_64.S | 18 +++--- 1 file changed,

[PATCH] x86/mce: Add Skylake quirk for patrol scrub reported errors

2021-03-22 Thread Tony Luck
nux will try to take the affected page offline. [Tony: Wordsmith commit comment] Signed-off-by: Youquan Song Signed-off-by: Tony Luck --- Repost ... looks like this got lost somewhere. V2: Boris: Don't optimize with pointer to quirk function. Just do the vendor/family/model check i

[tip: x86/cpu] x86/mce: Add Xeon Sapphire Rapids to list of CPUs that support PPIN

2021-03-20 Thread tip-bot2 for Tony Luck
The following commit has been merged into the x86/cpu branch of tip: Commit-ID: a331f5fdd36dba1ffb0239a4dfaaf1df91ff1aab Gitweb: https://git.kernel.org/tip/a331f5fdd36dba1ffb0239a4dfaaf1df91ff1aab Author:Tony Luck AuthorDate:Fri, 19 Mar 2021 10:39:19 -07:00 Committer

[PATCH] x86/mce: Add Xeon Sapphire Rapids to list of CPUs that support PPIN

2021-03-19 Thread Tony Luck
New CPU model, same MSRs to control and read the inventory number. Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/intel.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c index e309476743b7..acfd5d9f93c6 100644

Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-02-25 Thread Tony Luck
On Thu, Feb 25, 2021 at 6:23 PM HORIGUCHI NAOYA(堀口 直也) wrote: > > On Thu, Feb 25, 2021 at 10:15:42AM -0800, Luck, Tony wrote: > > CPU3 reads the poison and starts along same path that CPU2 > > did. > > I think that the MCE loop happening on CPU2 and CPU3 is unexpected > and these threads should

[PATCH] x86/cpu: Add another Alder Lake CPU to the Intel family

2021-01-21 Thread Tony Luck
From: Gayatri Kammela Add Alder Lake mobile CPU model number to Intel family. Signed-off-by: Gayatri Kammela Signed-off-by: Tony Luck --- Boris: As usual, getting the CPU model number in first makes life simpler for the different teams adding model specific code to perf, power, etc

[PATCH v3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-14 Thread Tony Luck
different page from the first. Signed-off-by: Tony Luck --- V3: Thanks to extensive commentary from Andy & Boris Throws out the changes to get_user() and subsequent changes to core code. Everything is now handled in the machine check code. Downside is that we can (and do) take multiple machi

[PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-11 Thread Tony Luck
func(work); work = next; cond_resched(); } while (work); Add a "mce_busy" flag bit to detect this situation and panic when it happens. Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/core.c | 7 ++- include/linux/sched.h

[PATCH v2 3/3] futex, x86/mce: Avoid double machine checks

2021-01-11 Thread Tony Luck
case it is reasonable to retry 2) machine check on the user address, bad idea to re-read Check for the ENXIO return code from the first get_user() call and immediately return an error without re-reading the futex. Signed-off-by: Tony Luck --- kernel/futex.c | 5 - 1 file changed, 4 insertions

[PATCH v2 2/3] x86/mce: Add new return value to get_user() for machine check

2021-01-11 Thread Tony Luck
to an address that has an uncorrectable error. Signed-off-by: Tony Luck --- arch/x86/lib/getuser.S | 8 +++- arch/x86/mm/extable.c | 1 + 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S index fa1bc2104b32..c49a449fced6 100644 -

[PATCH v2 0/3] Fix infinite machine check loop in futex_wait_setup()

2021-01-11 Thread Tony Luck
l crashed when get_user() touched poison). Tony Luck (3): x86/mce: Avoid infinite loop for copy from user recovery x86/mce: Add new return value to get_user() for machine check futex, x86/mce: Avoid double machine checks arch/x86/kernel/cpu/mce/core.c | 7 ++- arch/x86/lib/getuser.S

[PATCH 2/2] futex, x86/mce: Avoid double machine checks

2021-01-08 Thread Tony Luck
case it is reasonable to retry 2) machine check on the user address, bad idea to re-read Add some infrastructure to differentiate these cases. Signed-off-by: Tony Luck --- arch/x86/include/asm/mmu.h | 7 +++ arch/x86/kernel/cpu/mce/core.c | 10 ++ include/linux/mm.h

[PATCH 1/2] x86/mce: Avoid infinite loop for copy from user recovery

2021-01-08 Thread Tony Luck
func(work); work = next; cond_resched(); } while (work); Add a "mce_busy" flag bit to detect this situation and panic when it happens. Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/core.c | 7 ++- include/linux/sched.h

[PATCH 0/2] Fix infinite machine check loop in futex_wait_setup()

2021-01-08 Thread Tony Luck
to handle the #defines etc. to define an arch specific function to be used in generic code] Tony Luck (2): x86/mce: Avoid infinite loop for copy from user recovery futex, x86/mce: Avoid double machine checks arch/x86/include/asm/mmu.h | 7 +++ arch/x86/kernel/cpu/mce/core.c | 17

[RFC PATCH] x86/mce: Add ppin and microcode to mce trace

2021-01-07 Thread Tony Luck
Steven, I've been remiss about updating the mce_record trace as new fields have been added to "struct mce". What are the ABI implications of a patch like the one below (sample only ... there are a couple more fields that may need to be added)? Are there any size limitations that I might hit

Re: [PATCH] mm/filemap: add static for function __add_to_page_cache_locked

2020-12-09 Thread Tony Luck
On Mon, Dec 7, 2020 at 4:36 PM Michal Kubecek wrote: > Not removal, commit 3351b16af494 ("mm/filemap: add static for function > __add_to_page_cache_locked") made the function static which breaks the > build in btfids phase - but it seems to happen only on some > architectures. In our case, ppc64,

[tip: ras/core] x86/mce: Use "safe" MSR functions when enabling additional error logging

2020-11-16 Thread tip-bot2 for Tony Luck
The following commit has been merged into the ras/core branch of tip: Commit-ID: 098416e6986127f7e4c8ce4fd6bbbd80e55b0386 Gitweb: https://git.kernel.org/tip/098416e6986127f7e4c8ce4fd6bbbd80e55b0386 Author:Tony Luck AuthorDate:Tue, 10 Nov 2020 16:39:54 -08:00 Committer

[PATCH 0/2] Update MAINTAINERS for EDAC

2020-11-05 Thread Tony Luck
A new driver for Intel client system on chip. Clean up a couple of F: entries for other EDAC drivers. Tony Luck (2): MAINTAINERS: Add entry for Intel IGEN6 EDAC driver MAINTAINERS: Clean up the F: entries for some EDAC drivers MAINTAINERS | 11 +-- 1 file changed, 9 insertions

[PATCH 2/2] MAINTAINERS: Clean up the F: entries for some EDAC drivers

2020-11-05 Thread Tony Luck
The edac_altera entry stopped at the "." and needed "[ch]" to match both the driver and the header file. The edac_skx entry only matched on ".c" files so didn't include skx_common.h Signed-off-by: Tony Luck --- MAINTAINERS | 4 ++-- 1 file changed, 2 insertions(
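
The effect of the two pattern fixes described above can be approximated with POSIX fnmatch(). This is a rough sketch under assumptions: the paths and patterns below are illustrative guesses rather than the exact MAINTAINERS lines, and the kernel's own tooling may match F: entries differently from fnmatch().

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>

/* True when the shell-style glob matches the path (POSIX fnmatch, no flags). */
bool glob_matches(const char *pattern, const char *path)
{
    return fnmatch(pattern, path, 0) == 0;
}

/* Self-check using illustrative (assumed) file names. */
void demo_glob_check(void)
{
    /* A pattern that stops at the "." matches neither the driver nor the header. */
    assert(!glob_matches("drivers/edac/altera_edac.", "drivers/edac/altera_edac.c"));

    /* Adding "[ch]" matches both files. */
    assert(glob_matches("drivers/edac/altera_edac.[ch]", "drivers/edac/altera_edac.c"));
    assert(glob_matches("drivers/edac/altera_edac.[ch]", "drivers/edac/altera_edac.h"));

    /* A ".c"-only pattern misses a header such as skx_common.h. */
    assert(!glob_matches("drivers/edac/skx_*.c", "drivers/edac/skx_common.h"));
    assert(glob_matches("drivers/edac/skx_*.[ch]", "drivers/edac/skx_common.h"));
}
```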

[PATCH 1/2] MAINTAINERS: Add entry for Intel IGEN6 EDAC driver

2020-11-05 Thread Tony Luck
New driver for "client" system on chip CPUs. Signed-off-by: Tony Luck --- MAINTAINERS | 7 +++ 1 file changed, 7 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index e73636b75f29..86eb55697c8b 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6353,6 +6353,13 @@ L:

[tip: ras/core] x86/mce: Enable additional error logging on certain Intel CPUs

2020-11-02 Thread tip-bot2 for Tony Luck
The following commit has been merged into the ras/core branch of tip: Commit-ID: 68299a42f84288537ee3420c431ac0115ccb90b1 Gitweb: https://git.kernel.org/tip/68299a42f84288537ee3420c431ac0115ccb90b1 Author:Tony Luck AuthorDate:Fri, 30 Oct 2020 12:04:00 -07:00 Committer

[tip: ras/core] x86/mce: Recover from poison found while copying from user space

2020-10-07 Thread tip-bot2 for Tony Luck
The following commit has been merged into the ras/core branch of tip: Commit-ID: c0ab7ffce275d3f83bd253c70889c28821d4a41d Gitweb: https://git.kernel.org/tip/c0ab7ffce275d3f83bd253c70889c28821d4a41d Author:Tony Luck AuthorDate:Tue, 06 Oct 2020 14:09:09 -07:00 Committer

[tip: ras/core] x86/mce: Provide method to find out the type of an exception handler

2020-10-07 Thread tip-bot2 for Tony Luck
The following commit has been merged into the ras/core branch of tip: Commit-ID: a05d54c41ecfa1a322b229b4e5ce50c157284f74 Gitweb: https://git.kernel.org/tip/a05d54c41ecfa1a322b229b4e5ce50c157284f74 Author:Tony Luck AuthorDate:Tue, 06 Oct 2020 14:09:06 -07:00 Committer

[tip: ras/core] x86/mce: Avoid tail copy when machine check terminated a copy from user

2020-10-07 Thread tip-bot2 for Tony Luck
The following commit has been merged into the ras/core branch of tip: Commit-ID: a2f73400e4dfd13f673c6e1b4b98d180fd1e47b3 Gitweb: https://git.kernel.org/tip/a2f73400e4dfd13f673c6e1b4b98d180fd1e47b3 Author:Tony Luck AuthorDate:Tue, 06 Oct 2020 14:09:08 -07:00 Committer

[tip: ras/core] x86/mce: Decode a kernel instruction to determine if it is copying from user

2020-10-07 Thread tip-bot2 for Tony Luck
The following commit has been merged into the ras/core branch of tip: Commit-ID: 300638101329e8f1569115f3d7197ef5ef754a3a Gitweb: https://git.kernel.org/tip/300638101329e8f1569115f3d7197ef5ef754a3a Author:Tony Luck AuthorDate:Tue, 06 Oct 2020 14:09:10 -07:00 Committer

[PATCH v3 4/6] x86/mce: Avoid tail copy when machine check terminated a copy from user

2020-10-06 Thread Tony Luck
code as if the copy succeeded. The machine check handler will use task_work_add() to make sure that the task is sent a SIGBUS. Signed-off-by: Tony Luck --- arch/x86/lib/copy_user_64.S | 15 +++ 1 file changed, 15 insertions(+) diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib

[PATCH v3 2/6] x86/mce: Provide method to find out the type of exception handle

2020-10-06 Thread Tony Luck
Avoid a proliferation of ex_has_*_handler() functions by having just one function that returns the type of the handler (if any). Drop the __visible attribute for this function. It is not called from assembler so the attribute is not necessary. Signed-off-by: Tony Luck --- arch/x86/include/asm

[PATCH v3 3/6] x86/mce: Add _ASM_EXTABLE_CPY for copy user access

2020-10-06 Thread Tony Luck
COPYIN will be used by mce_severity() calculation to indicate that a machine check is recoverable because the kernel was copying from user space. Signed-off-by: Youquan Song Signed-off-by: Tony Luck --- arch/x86/include/asm/asm.h | 6 +++ arch/x86/include/asm/mce.h | 15 ++ arch/x86/lib/copy_use

[PATCH v3 1/6] x86/mce: Pass pointer to saved pt_regs to severity calculation routines

2020-10-06 Thread Tony Luck
From: Youquan Song New recovery features require additional information about processor state when a machine check occurred. Pass pt_regs down to the routines that need it. No functional change. Signed-off-by: Youquan Song Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/core.c | 14

[PATCH v3 6/6] x86/mce: Decode a kernel instruction to determine if it is copying from user

2020-10-06 Thread Tony Luck
address. For MOVS instructions the source address is in the %rsi register. The function fault_in_kernel_space() determines whether the source address is kernel or user, upgrade it from "static" so it can be used here. Co-developed-by: Youquan Song Signed-off-by: Youquan Song Signed-off-by:

[PATCH v3 5/6] x86/mce: Recover from poison found while copying from user space

2020-10-06 Thread Tony Luck
d a SIGBUS to the task. Use a new helper function to share common code between the "fault in user mode" case and the "fault while copying from user" case. New code paths will be activated by the next patch which sets MCE_IN_KERNEL_COPYIN. Suggested-by: Borislav Petkov Signed-of

[PATCH v3 0/6] Add machine check recovery when copying from user space

2020-10-06 Thread Tony Luck
where function fault_in_kernel_space() is used. Check modrm.got and sib.got fields in "insn" were set before calling insn_get_addr() Change type of constant from ~0ul to -1l when checking whether address returned by insn_get_addr() is valid. Tony Luck (

[PATCH v2 5/7] x86/mce: Change fault_in_kernel_space() from static to global

2020-09-30 Thread Tony Luck
From: Youquan Song Machine check code needs to be able to determine if a faulting address is in user or kernel space. There is already a function to do this. Change from "static int" to "bool" and add declaration to No functional change. Signed-off-by: Youquan Song Signed-off-by: ---

[PATCH v2 1/7] x86/mce: Pass pointer to saved pt_regs to severity calculation routines

2020-09-30 Thread Tony Luck
From: Youquan Song New recovery features require additional information about processor state when a machine check occurred. Pass pt_regs down to the routines that need it. No functional change. Signed-off-by: Youquan Song Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/core.c | 14

[PATCH v2 2/7] x86/mce: Provide method to find out the type of exception handle

2020-09-30 Thread Tony Luck
Avoid a proliferation of ex_has_*_handler() functions by having just one function that returns the type of the handler (if any). Drop the __visible attribute for this function. It is not called from assembler so the attribute is not necessary. Signed-off-by: Tony Luck --- arch/x86/include/asm

[PATCH v2 0/7] Add machine check recovery when copying from user space

2020-09-30 Thread Tony Luck
need a test case where a futex has been poisoned to check. Probably this switch should be expanded with all the instructions that the compiler could possibly generate that read from user space. Change since v1: Moved the code to discover user address here in the

[PATCH v2 7/7] x86/mce: Decode a kernel instruction to determine if it is copying from user

2020-09-30 Thread Tony Luck
the source address. For MOVS instructions the source address is in the %rsi register. The function fault_in_kernel_space() determines whether the source address is kernel or user. Co-developed-by: Youquan Song Signed-off-by: Youquan Song Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/core.c

[PATCH v2 4/7] x86/mce: Avoid tail copy when machine check terminated a copy from user

2020-09-30 Thread Tony Luck
code as if the copy succeeded. The machine check handler will use task_work_add() to make sure that the task is sent a SIGBUS. Signed-off-by: Tony Luck --- arch/x86/lib/copy_user_64.S | 15 +++ 1 file changed, 15 insertions(+) diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib

[PATCH v2 3/7] x86/mce: Add _ASM_EXTABLE_CPY for copy user access

2020-09-30 Thread Tony Luck
e used by mce_severity() calculation to indicate that a machine check is recoverable because the kernel was copying from user space. Signed-off-by: Youquan Song Signed-off-by: Tony Luck --- arch/x86/include/asm/asm.h | 6 +++ arch/x86/include/asm/mce.h | 15 ++ arch/x86/lib/copy_user_64.

[PATCH v2 6/7] x86/mce: Recover from poison found while copying from user space

2020-09-30 Thread Tony Luck
p the page and send a SIGBUS to the task. Refactor the recovery code path to share common code between the "fault in user mode" case and the "fault while copying from user" case. New code paths will be activated by the next patch which sets MCE_IN_KERNEL_COPYIN. Signed-off-by: You

[tip: ras/core] x86/mce: Drop AMD-specific "DEFERRED" case from Intel severity rule list

2020-09-29 Thread tip-bot2 for Tony Luck
The following commit has been merged into the ras/core branch of tip: Commit-ID: ed9705e4ad1c19ae51ed0cb4c112f9eb6dfc69fc Gitweb: https://git.kernel.org/tip/ed9705e4ad1c19ae51ed0cb4c112f9eb6dfc69fc Author:Tony Luck AuthorDate:Tue, 29 Sep 2020 19:13:13 -07:00 Committer

[PATCH 1/2] x86/mce: Add Skylake quirk for patrol scrub reported errors

2020-09-29 Thread Tony Luck
the model, minimum stepping and range of machine check bank numbers. Add a new rule to detect the special signature (on model 0x55, stepping >=4 in any of the memory controller banks). Suggested-by: Youquan Song Rewritten-by: Borislav Petkov Co-developed-by: Tony Luck Signed-off-by: Tony L

[PATCH 2/2] x86/mce: Drop AMD specific "DEFERRED" case from Intel severity rule list

2020-09-29 Thread Tony Luck
D switched to a separate grading function in commit bf80bbd7dcf5 ("x86/mce: Add an AMD severities-grading function") Belatedly drop the DEFERRED case from the Intel rule list. Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/severity.c | 4 1 file changed, 4 deletions(-) diff --git a

[PATCH 0/2] mce severity quirk & cleanup

2020-09-29 Thread Tony Luck
in sight for years. Drop it. Borislav Petkov (1): x86/mce: Add Skylake quirk for patrol scrub reported errors Tony Luck (1): x86/mce: Drop AMD specific "DEFERRED" case from Intel severity rule list arch/x86/kernel/cpu/mce/severity.c | 32 -- 1 file

[tip: ras/core] x86/mce: Stop mce_reign() from re-computing severity for every CPU

2020-09-14 Thread tip-bot2 for Tony Luck
The following commit has been merged into the ras/core branch of tip: Commit-ID: 13c877f4b48b943105ad9e13ba2c7a093fb694e8 Gitweb: https://git.kernel.org/tip/13c877f4b48b943105ad9e13ba2c7a093fb694e8 Author:Tony Luck AuthorDate:Tue, 08 Sep 2020 10:55:12 -07:00 Committer

[RESEND PATCH 0/8] Add machine check recovery when copying from user space

2020-09-09 Thread Tony Luck
be safe, but I need a test case where a futex has been poisoned to check. Probably this switch should be expanded with all the instructions that the compiler could possibly generate that read from user space. Tony Luck (4): x86/mce: Stop mce_reign() from re-computing severity

[PATCH 1/8] x86/mce: Stop mce_reign() from re-computing severity for every CPU

2020-09-08 Thread Tony Luck
ic, we do still need one call to mce_severity to provide the correct message giving the reason for the panic. Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/core.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/k

[PATCH 5/8] x86/mce: Avoid tail copy when machine check terminated a copy from user

2020-09-08 Thread Tony Luck
succeeded. [Tried returning bytes not copied here, but that puts the kernel into a loop taking the machine check over and over. I don't know at what level some code is retrying] Signed-off-by: Tony Luck --- arch/x86/lib/copy_user_64.S | 7 +++ 1 file changed, 7 insertions(+) diff --git

[PATCH 8/8] x86/mce: Decode a kernel instruction to determine if it is copying from user

2020-09-08 Thread Tony Luck
information to determine direction of the data copy. In the case of "REP MOVS*" instructions it is necessary to also check the value in the %rsi register. Co-developed-by: Youquan Song Signed-off-by: Youquan Song Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/core.

[PATCH 6/8] x86/mce: Change fault_in_kernel_space() from static to global

2020-09-08 Thread Tony Luck
From: Youquan Song Machine check code needs to be able to determine if a faulting address is in user or kernel space. There is already a function to do this. Change from "static int" to "bool" and add declaration to No functional change. Signed-off-by: Youquan Song Signed-off-by: ---

[PATCH 4/8] x86/mce: Add _ASM_EXTABLE_CPY for copy user access

2020-09-08 Thread Tony Luck
e used by mce_severity() calculation to indicate that a machine check is recoverable because the kernel was copying from user space. Signed-off-by: Youquan Song Signed-off-by: Tony Luck --- arch/x86/include/asm/asm.h | 6 +++ arch/x86/include/asm/mce.h | 1 + arch/x86/lib/copy_user_64.

[PATCH 7/8] x86/mce: Recover from poison found while copying from user space

2020-09-08 Thread Tony Luck
Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/core.c | 51 ++ include/linux/sched.h | 1 + 2 files changed, 52 insertions(+) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 5512318a07ae..2a3c42329c3f 100644 --- a/arch

[RFD PATCH] x86/mce: Make sure to send SIGBUS even after losing the race to poison a page

2020-08-27 Thread Tony Luck
for many other error cases from memory_failure(). Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/core.c | 7 +-- mm/memory-failure.c| 2 +- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index

[tip: ras/core] x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()

2020-08-26 Thread tip-bot2 for Tony Luck
The following commit has been merged into the ras/core branch of tip: Commit-ID: 1e36d9c6886849c6f3d3c836370563e6bc1a6ddd Gitweb: https://git.kernel.org/tip/1e36d9c6886849c6f3d3c836370563e6bc1a6ddd Author:Tony Luck AuthorDate:Mon, 24 Aug 2020 15:12:37 -07:00 Committer

[PATCH] x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()

2020-08-24 Thread Tony Luck
ly be called when instructions have been changed/re-mapped. Recovery for an instruction fetch may change the physical address. But that doesn't happen until the scheduled work runs (which could be on another CPU). Reported-by: Gabriele Paoloni Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mc

[tip: x86/cpu] x86/cpu: Add Lakefield, Alder Lake and Rocket Lake models to the to Intel CPU family

2020-07-25 Thread tip-bot2 for Tony Luck
The following commit has been merged into the x86/cpu branch of tip: Commit-ID: e00b62f0b06d0ae2b844049f216807617aff0cdb Gitweb: https://git.kernel.org/tip/e00b62f0b06d0ae2b844049f216807617aff0cdb Author:Tony Luck AuthorDate:Mon, 20 Jul 2020 21:37:49 -07:00 Committer

[PATCH] x86/cpu: Add Lakefield, Alder Lake and Rocket Lake to Intel family

2020-07-20 Thread Tony Luck
Three new CPU models. Signed-off-by: Tony Luck --- This patch supersedes https://lore.kernel.org/lkml/20200709192353.21151-1-tony.l...@intel.com/ That one just added Rocket Lake arch/x86/include/asm/intel-family.h | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/x86/include

[PATCH] x86/cpu: Add Rocket Lake to Intel family

2020-07-09 Thread Tony Luck
From: Gayatri Kammela Add the model number/CPUID of Rocket Lake desktop to the Intel family. Reviewed-by: Tony Luck Signed-off-by: Gayatri Kammela Signed-off-by: Tony Luck --- Thomas, Ingo: I'd appreciate if this could go into some TIP branch quickly. As usual we have a bunch of different

[tip: efi/urgent] efivarfs: Don't return -EINTR when rate-limiting reads

2020-06-19 Thread tip-bot2 for Tony Luck
The following commit has been merged into the efi/urgent branch of tip: Commit-ID: 4353f03317fd3eb0bd803b61bdb287b68736a728 Gitweb: https://git.kernel.org/tip/4353f03317fd3eb0bd803b61bdb287b68736a728 Author:Tony Luck AuthorDate:Thu, 28 May 2020 12:49:05 -07:00 Committer

[tip: efi/urgent] efivarfs: Update inode modification time for successful writes

2020-06-19 Thread tip-bot2 for Tony Luck
The following commit has been merged into the efi/urgent branch of tip: Commit-ID: 2096721f1577b51b574fa06a7d91823dffe7267a Gitweb: https://git.kernel.org/tip/2096721f1577b51b574fa06a7d91823dffe7267a Author:Tony Luck AuthorDate:Thu, 28 May 2020 12:49:04 -07:00 Committer

[tip: x86/fsgsbase] x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation

2020-06-18 Thread tip-bot2 for Tony Luck
The following commit has been merged into the x86/fsgsbase branch of tip: Commit-ID: 978e1342c3c4d7b20808fd5875d9ac0d57db22ee Gitweb: https://git.kernel.org/tip/978e1342c3c4d7b20808fd5875d9ac0d57db22ee Author:Tony Luck AuthorDate:Thu, 28 May 2020 16:13:54 -04:00 Committer

[PATCH] x86/mce: Add Skylake quirk for patrol scrub reported errors

2020-06-15 Thread Tony Luck
nux will try to take the affected page offline. [Tony: Wordsmith commit comment] Signed-off-by: Youquan Song Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/core.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/k

Re: [RFC PATCH] x86/msr: Filter MSR writes

2020-06-12 Thread Tony Luck
On Fri, Jun 12, 2020 at 1:41 PM Peter Zijlstra wrote: > > On Fri, Jun 12, 2020 at 07:48:01PM +0200, Borislav Petkov wrote: > > On Fri, Jun 12, 2020 at 10:20:03AM -0700, Linus Torvalds wrote: > > > Since you already added the filtering, this looks fairly sane. > > > > > > IOW, what MSR's do we

[tip: x86/urgent] x86/cpu: Add Sapphire Rapids CPU model number

2020-06-03 Thread tip-bot2 for Tony Luck
The following commit has been merged into the x86/urgent branch of tip: Commit-ID: be25d1b5ea6a3a3ecbb5474e2ae8e32d2ba055ea Gitweb: https://git.kernel.org/tip/be25d1b5ea6a3a3ecbb5474e2ae8e32d2ba055ea Author:Tony Luck AuthorDate:Wed, 03 Jun 2020 10:33:52 -07:00 Committer

[PATCH] x86/cpu: Add Sapphire Rapids CPU model number

2020-06-03 Thread Tony Luck
Latest edition (039) of "Intel Architecture Instruction Set Extensions and Future Features Programming Reference" includes three new CPU model numbers. Linux already has the two Ice Lake server ones. Add the new model number for Sapphire Rapids. Signed-off-by: Tony Luck --- I'd

[PATCH 2/2] efivarfs: Don't return -EINTR when rate-limiting reads

2020-05-28 Thread Tony Luck
to a non-interruptible one. Reported-by: Lennart Poettering Signed-off-by: Tony Luck --- fs/efivarfs/file.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/fs/efivarfs/file.c b/fs/efivarfs/file.c index 4b8bc4560d70..feaa5e182b7b 100644 --- a/fs/efivarfs/file.c +++ b/fs

[PATCH 0/2] Couple of efivarfs fixes

2020-05-28 Thread Tony Luck
1) Some apps want to monitor changes in EFI variables, but reading the file and comparing is inefficient. Just have Linux update the modification time when a file is written. 2) A rate limited read can return -EINTR ... very surprising to apps. Tony Luck (2): efivarfs: Update inode

[PATCH 1/2] efivarfs: Update inode modification time for successful writes

2020-05-28 Thread Tony Luck
Some applications want to be able to see when EFI variables have been updated. Update the modification time for successful writes. Reported-by: Lennart Poettering Signed-off-by: Tony Luck --- fs/efivarfs/file.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/efivarfs/file.c b/fs

[tip: ras/core] x86/mce/dev-mcelog: Fix -Wstringop-truncation warning about strncpy()

2020-05-27 Thread tip-bot2 for Tony Luck
The following commit has been merged into the ras/core branch of tip: Commit-ID: 45811ba140593e288a288c2a2e45d25f38d20d73 Gitweb: https://git.kernel.org/tip/45811ba140593e288a288c2a2e45d25f38d20d73 Author:Tony Luck AuthorDate:Wed, 27 May 2020 11:28:08 -07:00 Committer

[PATCH] x86/mce/dev-mcelog: Fix "make W=1" warning about strncpy

2020-05-27 Thread Tony Luck
Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/mce/dev-mcelog.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/mce/dev-mcelog.c b/arch/x86/kernel/cpu/mce/dev-mcelog.c index d089567a9ce8..bcb379b2fd42 100644 --- a/arch/x86/kernel/cpu/mce/dev-mcelog.c

[tip: ras/core] x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned

2020-05-26 Thread tip-bot2 for Tony Luck
The following commit has been merged into the ras/core branch of tip: Commit-ID: be69f6c5cd38c457c22f6e718077f6524437369d Gitweb: https://git.kernel.org/tip/be69f6c5cd38c457c22f6e718077f6524437369d Author:Tony Luck AuthorDate:Wed, 20 May 2020 09:35:46 -07:00 Committer

Re: [PATCH v12 00/18] Enable FSGSBASE instructions

2020-05-25 Thread Tony Luck
pr_warn("** If you see this message and you are not debugging**\n"); >pr_warn("** the kernel, report this immediately to your vendor! > **\n"); >pr_warn("** > **\n"); >pr_warn("** NOTICE NOTICE NOTICE

[tip: ras/core] x86/{mce,mm}: Change so poison pages are either unmapped or marked uncacheable

2020-05-25 Thread tip-bot2 for Tony Luck
The following commit has been merged into the ras/core branch of tip: Commit-ID: 3cb1ada80fe29e2fa022b5f20370b65718e0a744 Gitweb: https://git.kernel.org/tip/3cb1ada80fe29e2fa022b5f20370b65718e0a744 Author:Tony Luck AuthorDate:Wed, 20 May 2020 09:35:46 -07:00 Committer

[PATCH] x86/mm: Don't try to change poison pages to uncacheable in a guest

2020-05-05 Thread Tony Luck
is running as a guest. If it is, there is no point in trying to change the cache mode of the bad page. The VMM has taken the whole page away. Reported-by: Jue Wang Tested-by: Jue Wang Fixes: 284ce4011ba6 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()") Cc: Signed-off-by: Tony Luck

Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()

2019-10-07 Thread Tony Luck
ctl(PR_SET_UNALIGN) to choose whether they want the kernel to silently fix things or to send SIGBUS. Kernel always noisily (rate limited) fixes up unaligned access. Your patch does make all the messages go away. Tested-by: Tony Luck

Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()

2019-10-07 Thread Tony Luck
On Mon, Oct 7, 2019 at 11:28 AM Linus Torvalds wrote: > > On Sun, Oct 6, 2019 at 8:11 PM Linus Torvalds > wrote: > > > > > > > > The last two should just do user_access_begin()/user_access_end() > > > instead of access_ok(). __copy_to_user_inatomic() has very few callers > > > as well: > > > >

[PATCH 2/3] drivers/net/b44: Align pwol_mask to unsigned long for better performance

2019-09-16 Thread Tony Luck
issue on x86. Signed-off-by: Peter Zijlstra Signed-off-by: Fenghua Yu Signed-off-by: Tony Luck --- drivers/net/ethernet/broadcom/b44.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c index

[PATCH 3/3] x86/split_lock: Align the x86_capability array to size of unsigned long

2019-09-16 Thread Tony Luck
changes. So choose the simpler solution by setting the array's alignment to size of unsigned long. Suggested-by: David Laight Suggested-by: Thomas Gleixner Signed-off-by: Fenghua Yu Signed-off-by: Tony Luck --- arch/x86/include/asm/processor.h | 10 +- 1 file changed, 9 insertions(+), 1
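
Setting a 32-bit array's alignment to that of unsigned long, as the snippet above describes, can be sketched in portable C11. This is an illustration only: struct demo_cpuinfo, DEMO_NCAPINTS and the helper names are hypothetical, and the kernel patch itself uses its own attribute macros rather than alignas.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical number of 32-bit capability words. */
#define DEMO_NCAPINTS 5

/* Hypothetical structure: a byte field followed by a 32-bit array that is
 * forced onto an unsigned long boundary so word-sized bit operations on it
 * start at a naturally aligned address. */
struct demo_cpuinfo {
    uint8_t family;
    alignas(unsigned long) uint32_t capability[DEMO_NCAPINTS];
};

/* Offset of the array within the structure. */
size_t demo_capability_offset(void)
{
    return offsetof(struct demo_cpuinfo, capability);
}

/* Self-check: the array starts on an unsigned long boundary. */
void demo_alignment_check(void)
{
    assert(demo_capability_offset() % alignof(unsigned long) == 0);
    assert(alignof(struct demo_cpuinfo) >= alignof(unsigned long));
}
```

Without the alignas specifier the array would only be guaranteed 4-byte alignment, so on a 64-bit build an 8-byte access starting at the array could begin at an address that is not a multiple of 8.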

[PATCH 1/3] x86/common: Align cpu_caps_cleared and cpu_caps_set to unsigned long

2019-09-16 Thread Tony Luck
__aligned(unsigned long) is a simpler fix. Signed-off-by: Fenghua Yu Reviewed-by: Borislav Petkov Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/common.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index

[PATCH 0/3] Fix some 4-byte vs. 8-byte alignment issues

2019-09-16 Thread Tony Luck
This series is made up of three patches from Fenghua Yu's "Split lock" series last posted here: https://lore.kernel.org/kvm/1560897679-228028-1-git-send-email-fenghua...@intel.com/ Part 3 has been fixed to use a union to force alignment per feedback from Thomas. These parts are all simple fixes
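For readers unfamiliar with the union trick mentioned above, here is a minimal userspace sketch of the idea. It is not the patch itself: the struct, field names, and array length are hypothetical, and it assumes a C11 compiler. Wrapping a 32-bit array in an anonymous union with an unsigned long member raises the array's alignment to that of unsigned long, so long-sized bit operations on it start on a naturally aligned address.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NCAPINTS 5 /* hypothetical array length, chosen to be odd */

/* Hypothetical struct; names are illustrative, not the kernel's. */
struct cpu_demo {
    char pad; /* forces the compiler to choose padding for what follows */
    union {
        uint32_t caps[DEMO_NCAPINTS];
        unsigned long caps_alignment; /* never read; present only to force alignment */
    };
};

/* The array now starts on an unsigned long boundary. */
static_assert(offsetof(struct cpu_demo, caps) % alignof(unsigned long) == 0,
              "caps must be aligned to unsigned long");
```

The extra union member costs no storage beyond possible tail padding, which is presumably why it is attractive compared with changing the array's element type.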

[PATCH 1/4] x86/cpu: Add Tiger Lake to Intel family

2019-09-05 Thread Tony Luck
From: Gayatri Kammela Add the model numbers/CPUIDs of Tiger Lake mobile and desktop to the Intel family. Cc: Peter Zijlstra Cc: Tony Luck Cc: Andy Shevchenko Cc: Kan Liang Cc: David E. Box Cc: Rajneesh Bhardwaj Suggested-by: Tony Luck Reviewed-by: Tony Luck Signed-off-by: Gayatri

[PATCH 4/4] x86/cpu: Update init data for new Airmont CPU model

2019-09-05 Thread Tony Luck
From: Rahul Tanwar Update properties for newly added Airmont CPU variant. Signed-off-by: Rahul Tanwar Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/common.c | 1 + arch/x86/kernel/cpu/intel.c | 1 + arch/x86/kernel/tsc_msr.c| 5 + 3 files changed, 7 insertions(+) diff --git

[PATCH 3/4] x86/cpu: Add new Airmont variant to Intel family

2019-09-05 Thread Tony Luck
From: Rahul Tanwar Add new Airmont variant CPU model to Intel family. Signed-off-by: Rahul Tanwar Signed-off-by: Tony Luck --- arch/x86/include/asm/intel-family.h | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h

[PATCH 2/4] x86/cpu: Add Elkhart Lake to Intel family

2019-09-05 Thread Tony Luck
From: Gayatri Kammela Add the model number/CPUID of atom based Elkhart Lake to the Intel family. Cc: Peter Zijlstra Cc: Tony Luck Cc: Andy Shevchenko Cc: Kan Liang Cc: David E. Box Cc: Rajneesh Bhardwaj Signed-off-by: Gayatri Kammela Signed-off-by: Tony Luck --- arch/x86/include/asm

[PATCH 0/4] New Intel CPU model numbers

2019-09-05 Thread Tony Luck
I'm going to be more aggressive about pushing new CPU model numbers into . Basically as soon as Intel talks publicly about some new model and I have the model number, then I'll post the update. Changes to the rest of the kernel will follow at the pace of the various groups that have model

[tip:x86/urgent] x86/cpu: Explain Intel model naming convention

2019-08-17 Thread tip-bot for Tony Luck
Commit-ID: 12ece2d53d3e8f827e972caf497c165f7729c717 Gitweb: https://git.kernel.org/tip/12ece2d53d3e8f827e972caf497c165f7729c717 Author: Tony Luck AuthorDate: Thu, 15 Aug 2019 11:16:24 -0700 Committer: Borislav Petkov CommitDate: Sat, 17 Aug 2019 10:06:32 +0200 x86/cpu: Explain Intel

[tip:x86/urgent] MAINTAINERS, x86/CPU: Tony Luck will maintain asm/intel-family.h

2019-08-15 Thread tip-bot for Tony Luck
Commit-ID: 5ed1c835ed8b522ce25071cc2d56a9a09bd5b59e Gitweb: https://git.kernel.org/tip/5ed1c835ed8b522ce25071cc2d56a9a09bd5b59e Author: Tony Luck AuthorDate: Wed, 14 Aug 2019 16:40:30 -0700 Committer: Borislav Petkov CommitDate: Thu, 15 Aug 2019 09:54:05 +0200 MAINTAINERS, x86/CPU

[PATCH] MAINTAINERS, x86/CPU: Tony Luck will maintain asm/intel-family.h

2019-08-14 Thread Tony Luck
There are a few different subsystems in the kernel that depend on model specific behaviour (perf, EDAC, power, ...). Easier for just one person to have the task to get new model numbers included instead of having these groups trip over each other to do it. Signed-off-by: Tony Luck

Re: [PATCH] EDAC, ie31200: Add Intel Corporation 3rd Gen Core processor

2019-08-12 Thread Tony Luck
On Wed, Jun 19, 2019 at 12:34 AM Jiping Ma wrote: Oops. Boris pointed out to me that this has been left hanging. Sorry for the delay. > 3rd Gen Core seems to work just like Skylake. Maybe "just like all the other Xeon-E3 processors"? "3rd Gen Core" seems to be Ivybridge generation (based on

[tip:ras/core] x86/mce: Don't check for the overflow bit on action optional machine checks

2019-08-05 Thread tip-bot for Tony Luck
Commit-ID: aaefca8e30d9df7a4ca13c9c8e135dd227b8ff19 Gitweb: https://git.kernel.org/tip/aaefca8e30d9df7a4ca13c9c8e135dd227b8ff19 Author: Tony Luck AuthorDate: Thu, 18 Jul 2019 11:29:20 -0700 Committer: Borislav Petkov CommitDate: Mon, 5 Aug 2019 09:34:02 +0200 x86/mce: Don't check

[PATCH] IB/core: Add mitigation for Spectre V1

2019-07-30 Thread Tony Luck
Some processors may mispredict an array bounds check and speculatively access memory that they should not. With a user supplied array index we like to play things safe by masking the value with the array size before it is used as an index. Signed-off-by: Tony Luck --- [I don't have h/w, so just
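The snippet above is truncated before the diff, so the following is only a sketch of the general technique, not the code from this patch. In the kernel the usual helper is array_index_nospec() from <linux/nospec.h>; the userspace functions below (index_mask, clamp_index, read_entry) are hypothetical names modeled on the kernel's generic mask. The sketch assumes index and size are both at most LONG_MAX and that right-shifting a negative signed value is an arithmetic shift, as it is with GCC and Clang.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Branch-free mask: all ones when index < size, zero otherwise.
 * Assumes index and size are <= LONG_MAX and that >> on a negative
 * long is an arithmetic shift (implementation-defined in ISO C).
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                           (sizeof(long) * CHAR_BIT - 1));
}

/* Force an out-of-range index to 0, even under misprediction. */
static unsigned long clamp_index(unsigned long index, unsigned long size)
{
    return index & index_mask(index, size);
}

/* Bounds check first, then mask the index before using it. */
static int read_entry(const int *table, unsigned long size,
                      unsigned long index, int *out)
{
    if (index >= size)
        return -1;
    index = clamp_index(index, size);
    *out = table[index];
    return 0;
}
```

Because the mask is computed with arithmetic rather than a branch, a mispredicted bounds check still yields an index inside the array on the speculative path.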

[PATCH] x86/mce: Don't check for "OVER" bit on action optional machine checks

2019-07-18 Thread Tony Luck
was logged first, followed by a corrected error. In this case the first error is retained in the bank. So in either case the machine check bank will contain the address of the SRAO error. So we can process that even if the overflow bit was set. Reported-by: Yongkai Wu Signed-off-by: Tony Luck ---

[tip:ras/core] RAS/CEC: Add CONFIG_RAS_CEC_DEBUG and move CEC debug features there

2019-06-08 Thread tip-bot for Tony Luck
Commit-ID: 60fd42d26cc7ec8847598da50ebf27e3c9647d7b Gitweb: https://git.kernel.org/tip/60fd42d26cc7ec8847598da50ebf27e3c9647d7b Author: Tony Luck AuthorDate: Mon, 6 May 2019 13:13:22 +0200 Committer: Borislav Petkov CommitDate: Sat, 8 Jun 2019 17:39:24 +0200 RAS/CEC: Add

Re: Is 2nd Generation Intel(R) Xeon(R) Processors (Formerly Cascade Lake) affected by MDS

2019-05-24 Thread Tony Luck
Stepping 5 is preproduction of cascade lake. MDS mitigation is in production parts (stepping >= 6) Sent from my iPhone > On May 24, 2019, at 07:17, Greg Kroah-Hartman > wrote: > >> On Fri, May 24, 2019 at 03:19:34PM +0200, Jinpu Wang wrote: >> Resend with plain text, and remove confidential

[PATCH] RAS/CEC: Add debugfs switch to disable at run time

2019-04-18 Thread Tony Luck
Useful when running error injection tests that want to see all of the MCi_(STATUS|ADDR|MISC) data via /dev/mcelog. Signed-off-by: Tony Luck --- drivers/ras/cec.c | 20 +++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c index
