On 3/14/20 9:18 AM, Nicholas Piggin wrote:
Ganesh Goudar's on March 14, 2020 12:04 am:
MCE handling on pSeries platform fails as recent rework to use common
code for pSeries and PowerNV in machine check error handling tries to
access per-cpu variables in realmode. The per-cpu variables may
On 4/3/20 7:38 AM, Nicholas Piggin wrote:
Ganesh Goudar's on March 30, 2020 5:12 pm:
From: Santosh S
Introduce notification chain which lets know about uncorrected memory
errors(UE). This would help prospective users in pmem or nvdimm subsystem
to track bad blocks for better handling
On 3/20/20 4:31 PM, Ganesh Goudar wrote:
MCE handling on pSeries platform fails as recent rework to use common
code for pSeries and PowerNV in machine check error handling tries to
access per-cpu variables in realmode. The per-cpu variables may be
outside the RMO region on pSeries platform
On 3/30/20 12:42 PM, Ganesh Goudar wrote:
From: Santosh S
Introduce notification chain which lets know about uncorrected memory
errors(UE). This would help prospective users in pmem or nvdimm subsystem
to track bad blocks for better handling of persistent memory allocations.
Signed-off
On 3/17/20 3:31 PM, Nicholas Piggin wrote:
Ganesh's on March 16, 2020 9:47 pm:
On 3/14/20 9:18 AM, Nicholas Piggin wrote:
Ganesh Goudar's on March 14, 2020 12:04 am:
MCE handling on pSeries platform fails as recent rework to use common
code for pSeries and PowerNV in machine check error
On 3/20/20 8:11 AM, Nicholas Piggin wrote:
Ganesh's on March 18, 2020 12:35 am:
On 3/17/20 3:31 PM, Nicholas Piggin wrote:
Ganesh's on March 16, 2020 9:47 pm:
On 3/14/20 9:18 AM, Nicholas Piggin wrote:
Ganesh Goudar's on March 14, 2020 12:04 am:
MCE handling on pSeries platform fails
On 3/20/20 8:58 PM, Nicholas Piggin wrote:
rtas_call allocates and uses memory in failure paths, which is
not safe for RMA. It also calls local_irq_save() which may not be safe
in all real mode contexts.
Particularly machine check may run with interrupts not "reconciled",
and it may have hit
On 3/24/20 10:57 AM, Michael Ellerman wrote:
Ganesh Goudar writes:
If we hit UE at an instruction with a fixup entry, flag to
ignore the event and set nip to continue execution at the
fixup entry.
You don't explain why we would want to do that. Or what the consequences
are if we *don't* do
On 10/1/20 11:21 PM, Ganesh Goudar wrote:
Use of nmi_enter/exit in real mode handler causes the kernel to panic
and reboot on injecting slb multihit on pseries machine running in hash
mmu mode, as these calls try to access memory outside RMO region in
real mode handler where translation
On 10/16/20 5:02 PM, Michael Ellerman wrote:
On Fri, 9 Oct 2020 12:10:03 +0530, Ganesh Goudar wrote:
This patch series fixes mce handling for pseries, Adds LKDTM test
for SLB multihit recovery and enables selftest for the same,
basically to test MCE handling on pseries/powernv machines running
On 7/24/20 12:09 PM, Ganesh Goudar wrote:
When an UE or memory error exception is encountered the MCE handler
tries to find the pfn using addr_to_pfn() which takes effective
address as an argument, later pfn is used to poison the page where
memory error occurred, recent rework in this area made
On 9/26/20 1:29 AM, Kees Cook wrote:
On Fri, Sep 25, 2020 at 04:01:23PM +0530, Ganesh Goudar wrote:
Add PPC_SLB_MULTIHIT to lkdtm selftest framework.
Signed-off-by: Ganesh Goudar
---
tools/testing/selftests/lkdtm/tests.txt | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing
On 9/26/20 1:27 AM, Kees Cook wrote:
On Fri, Sep 25, 2020 at 04:01:22PM +0530, Ganesh Goudar wrote:
Add support to inject slb multihit errors, to test machine
check handling.
Thank you for more tests in here!
Based on work by Mahesh Salgaonkar and Michal Suchánek.
Cc: Mahesh Salgaonkar
On 9/18/20 12:10 PM, Michael Ellerman wrote:
Hi Ganesh,
Ganesh Goudar writes:
To test machine check handling, add debugfs interface to inject
slb multihit errors.
To inject slb multihit:
#echo 1 > /sys/kernel/debug/powerpc/mce_error_inject/inject_slb_multihit
Rather than creating a
On 9/17/20 5:50 PM, Michal Suchánek wrote:
Hello,
On Wed, Sep 16, 2020 at 10:52:26PM +0530, Ganesh Goudar wrote:
Use of nmi_enter/exit in real mode handler causes the kernel to panic
and reboot on injecting slb multihit on pseries machine running in hash
mmu mode, as these calls try
On 9/17/20 5:59 PM, Michal Suchánek wrote:
Hello,
On Wed, Sep 16, 2020 at 10:52:25PM +0530, Ganesh Goudar wrote:
This patch series fixes mce handling for pseries, provides debugfs
interface for mce injection and adds selftest to test mce handling
on pseries/powernv machines running in hash mmu
On 9/17/20 5:53 PM, Michal Suchánek wrote:
Hello,
On Wed, Sep 16, 2020 at 10:52:27PM +0530, Ganesh Goudar wrote:
To test machine check handling, add debugfs interface to inject
slb multihit errors.
To inject slb multihit:
#echo 1 > /sys/kernel/debug/powerpc/mce_error_inj
On 7/21/20 3:38 PM, Nicholas Piggin wrote:
Excerpts from Ganesh Goudar's message of July 20, 2020 6:03 pm:
When an UE or memory error exception is encountered the MCE handler
tries to find the pfn using addr_to_pfn() which takes effective
address as an argument, later pfn is used to poison
On 10/19/20 6:45 PM, Michal Suchánek wrote:
On Mon, Oct 19, 2020 at 09:59:57PM +1100, Michael Ellerman wrote:
Hi Ganesh,
Some comments below ...
Ganesh Goudar writes:
To check machine check handling, add support to inject slb
multihit errors.
Cc: Kees Cook
Reviewed-by: Michal Suchánek
On 12/8/20 4:01 PM, Michael Ellerman wrote:
Ganesh Goudar writes:
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 9454d29ff4b4..4769954efa7d 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -273,6 +274,17 @@ struct
On 1/19/21 9:28 AM, Nicholas Piggin wrote:
Excerpts from Ganesh Goudar's message of January 15, 2021 10:58 pm:
Access to per-cpu variables requires translation to be enabled on
pseries machine running in hash mmu mode, Since part of MCE handler
runs in realmode and part of MCE handling code
On 1/25/21 2:54 PM, Christophe Leroy wrote:
Le 22/01/2021 à 13:32, Ganesh Goudar a écrit :
Access to per-cpu variables requires translation to be enabled on
pseries machine running in hash mmu mode, Since part of MCE handler
runs in realmode and part of MCE handling code is shared between
On 4/17/21 6:06 PM, Michael Ellerman wrote:
Ganesh Goudar writes:
The error type is ICACHE and DCACHE, for case MCE_ERROR_TYPE_ICACHE.
Do you mean "is ICACHE not DCACHE" ?
Right :), Should I send v2 ?
cheers
Signed-off-by: Ganesh Goudar
---
arch/powerpc/platforms/pseries
On 4/20/21 12:54 PM, Santosh Sivaraj wrote:
Hi Ganesh,
Ganesh Goudar writes:
When we hit an UE while using machine check safe copy routines,
ignore_event flag is set and the event is ignored by mce handler,
and the flag is also saved for deferred handling and printing of
mce event
On 4/7/21 10:28 AM, Ganesh Goudar wrote:
When we hit an UE while using machine check safe copy routines,
ignore_event flag is set and the event is ignored by mce handler,
and the flag is also saved for deferred handling and printing of
mce event information, But as of now saving of this flag
On 4/22/21 11:31 AM, Ganesh wrote:
On 4/7/21 10:28 AM, Ganesh Goudar wrote:
When we hit an UE while using machine check safe copy routines,
ignore_event flag is set and the event is ignored by mce handler,
and the flag is also saved for deferred handling and printing of
mce event information
On 8/26/21 8:57 AM, Michael Ellerman wrote:
Ganesh writes:
On 8/24/21 6:18 PM, Michael Ellerman wrote:
Ganesh Goudar writes:
Add test for real address or control memory address access
error handling, using NX-GZIP engine.
The error is injected by accessing the control memory address
On 8/25/21 2:54 AM, Segher Boessenkool wrote:
On Tue, Aug 24, 2021 at 04:39:57PM +1000, Michael Ellerman wrote:
+ case MC_ERROR_CTRL_MEM_ACCESS_PTABLE_WALK:
+ mce_err.u.ra_error_type =
+
On 8/24/21 6:18 PM, Michael Ellerman wrote:
Ganesh Goudar writes:
Add test for real address or control memory address access
error handling, using NX-GZIP engine.
The error is injected by accessing the control memory address
using illegal instruction, on successful handling the process
On 8/24/21 12:09 PM, Michael Ellerman wrote:
Hi Ganesh,
Some comments below ...
Ganesh Goudar writes:
Add support to parse and log control memory access
error for pseries.
Signed-off-by: Ganesh Goudar
---
v2: No changes in this patch.
---
arch/powerpc/platforms/pseries/ras.c | 21
Hi mpe, Any comments on this patchset?
On 8/5/21 2:50 PM, Ganesh Goudar wrote:
Add support to parse and log control memory access
error for pseries.
Signed-off-by: Ganesh Goudar
---
v2: No changes in this patch.
---
arch/powerpc/platforms/pseries/ras.c | 21 +
1 file
On 9/8/21 11:10 AM, Michael Ellerman wrote:
Ganesh writes:
On 9/6/21 6:03 PM, Michael Ellerman wrote:
Ganesh Goudar writes
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
CPU: 5 PID: 1883 Comm: insmod Tainted: GOE 5.14.0
On 8/6/21 6:53 PM, Ganesh Goudar wrote:
Check if the event info is valid before printing the
event information. When a fwnmi enabled nested kvm guest
hits a machine check exception L0 and L2 would generate
machine check event info, But L1 would not generate any
machine check event info
On 9/17/21 12:09 PM, Daniel Axtens wrote:
Hi Ganesh,
We queue an irq work for deferred processing of mce event
in realmode mce handler, where translation is disabled.
Queuing of the work may result in accessing memory outside
RMO region, such access needs the translation to be enabled
On 9/6/21 6:03 PM, Michael Ellerman wrote:
Ganesh Goudar writes:
We queue an irq work for deferred processing of mce event
in realmode mce handler, where translation is disabled.
Queuing of the work may result in accessing memory outside
RMO region, such access needs the translation
On 9/22/21 7:32 AM, Nicholas Piggin wrote:
The machine check handler is not considered NMI on 64s. The early
handler is the true NMI handler, and then it schedules the
machine_check_exception handler to run when interrupts are enabled.
This works fine except the case of an unrecoverable MCE,
On 11/8/21 19:49, Nicholas Piggin wrote:
Excerpts from Ganesh Goudar's message of November 8, 2021 6:38 pm:
In realmode mce handler we use irq_work_queue() to defer
the processing of mce events, irq_work_queue() can only
be called when translation is enabled because it touches
memory outside
ch 2/2, refactors this.
-
- /*
-* Queue irq work to log this rtas event later.
-* irq_work_queue uses per-cpu variables, so do this in virt
-* mode as well.
-*/
- irq_work_queue(_errlog_process_work);
-
- mtmsr(msr);
-
return disposition;
}
Thanks for the review :) .
Ganesh
On 9/6/21 14:13, Ganesh Goudar wrote:
Add support to parse and log control memory access
error for pseries. These changes are made according to
PAPR v2.11 10.3.2.2.12.
Signed-off-by: Ganesh Goudar
---
v3: Modify the commit log to mention the document according
to which changes are made
On 11/24/21 18:33, Nicholas Piggin wrote:
Excerpts from Ganesh Goudar's message of November 24, 2021 7:54 pm:
In realmode mce handler we use irq_work_queue() to defer
the processing of mce events, irq_work_queue() can only
be called when translation is enabled because it touches
memory outside
On 11/24/21 18:40, Nicholas Piggin wrote:
Excerpts from Ganesh Goudar's message of November 24, 2021 7:55 pm:
Now that we are no longer switching on the mmu in realmode
mce handler, Revert the commit 4ff753feab02("powerpc/pseries:
Avoid using addr_to_pfn in real mode") partia
On 1/7/22 19:44, Ganesh Goudar wrote:
Add support to parse and log control memory access
error for pseries. These changes are made according to
PAPR v2.11 10.3.2.2.12.
Signed-off-by: Ganesh Goudar
---
arch/powerpc/platforms/pseries/ras.c | 36
1 file changed
On 8/22/22 11:01, Sachin Sant wrote:
On 19-Aug-2022, at 10:12 AM, Ganesh wrote
We'll have to make sure everything get_pseries_errorlog() is either
forced inline, or marked noinstr.
Making the following functions always_inline and noinstr fixes the issue.
__always_inline
On 8/22/22 11:19, Michael Ellerman wrote:
So I guess the compiler has decided not to inline it (why?!), and it is
not marked noinstr, so it gets KASAN instrumentation which crashes in
real mode.
We'll have to make sure everything get_pseries_errorlog() is either
forced inline, or marked
On 8/17/22 11:28, Michael Ellerman wrote:
Sachin Sant writes:
Following crash is seen while running powerpc/mce subtest on
a Power10 LPAR.
1..1
# selftests: powerpc/mce: inject-ra-err
[ 155.240591] BUG: Unable to handle kernel data access on read at
0xc00e00022d55b503
[ 155.240618]
On 9/2/22 05:49, Jason Gunthorpe wrote:
On Tue, Aug 16, 2022 at 08:57:13AM +0530, Ganesh Goudar wrote:
Hi,
EEH recovery is currently serialized and these patches shorten
the time taken for EEH recovery by making the recovery to run
in parallel. The original author of these patches is Sam
On 9/7/22 09:49, Nicholas Piggin wrote:
On Mon Sep 5, 2022 at 4:38 PM AEST, Ganesh Goudar wrote:
Part of machine check error handling is done in realmode,
As of now instrumentation is not possible for any code that
runs in realmode.
When MCE is injected on KASAN enabled kernel, crash
Hi All,
I already sent this about 6-7 hours ago, but the
mail did not appear in the Aug 2009 archives, so I'm sending
it again. Sorry for this! Thanks in advance.
I'm working on MPC860 with Linux Kernel 2.4.18.
As I'm fine tuning the FEC(Fast Ethernet Controller) driver,
I came
Hi all,
I'm working on MPC860 with Linux Kernel 2.4.18.
As I'm fine tuning the FEC(Fast Ethernet Controller) driver,
I came across the receive side processing of the ethernet frames
where in the Rx BD rings are preallocated with the buffers and each time
a new frame is received, the whole
will not get updated there(initially it used to)
again it resumes after some 45-60 seconds and the sequence repeats.
Dunno what's happening with in the FEC if configured in bridge mode
any clue on this, Thanks a lakh in advance.
--Ganesh
On Friday 28 August 2009 18:19, you wrote:
Hi All,
I've
2018-04-17 22:33 GMT+08:00 Laurent Dufour :
> Add speculative_pgfault vmstat counter to count successful speculative page
> fault handling.
>
> Also fixing a minor typo in include/linux/vm_event_item.h.
>
> Signed-off-by: Laurent Dufour
>
2018-03-29 15:50 GMT+08:00 Laurent Dufour <lduf...@linux.vnet.ibm.com>:
> On 29/03/2018 05:06, Ganesh Mahendran wrote:
>> 2018-03-29 10:26 GMT+08:00 Ganesh Mahendran <opensource.gan...@gmail.com>:
>>> Hi, Laurent
>>>
>>> 2018-02-16 23:25 GMT+
Hi, Laurent
2018-02-16 23:25 GMT+08:00 Laurent Dufour :
> When the speculative page fault handler is returning VM_RETRY, there is a
> chance that VMA fetched without grabbing the mmap_sem can be reused by the
> legacy page fault handler. By reusing it, we avoid
2018-03-29 10:26 GMT+08:00 Ganesh Mahendran <opensource.gan...@gmail.com>:
> Hi, Laurent
>
> 2018-02-16 23:25 GMT+08:00 Laurent Dufour <lduf...@linux.vnet.ibm.com>:
>> When the speculative page fault handler is returning VM_RETRY, there is a
>> chance that VMA fetc
Hi, Laurent
2018-03-14 1:59 GMT+08:00 Laurent Dufour :
> This is a port on kernel 4.16 of the work done by Peter Zijlstra to
> handle page fault without holding the mm semaphore [1].
>
> The idea is to try to handle user space page faults without holding the
>
Add support to hwpoison the pages upon hitting machine check
exception.
This patch queues the address where UE is hit to percpu array
and schedules work to plumb it into memory poison infrastructure.
Reviewed-by: Mahesh Salgaonkar
Signed-off-by: Ganesh Goudar
---
arch/powerpc/include/asm
vram, call kmsg_dump()
before carrying out fadump or kdump.
Fixes: 4388c9b3a6ee ("powerpc: Do not send system reset request through the
oops path")
Reviewed-by: Mahesh Salgaonkar
Signed-off-by: Ganesh Goudar
---
arch/powerpc/kernel/traps.c | 1 +
1 file changed, 1 insertion(+)
diff
y: Mahesh Salgaonkar
Reviewed-by: Nicholas Piggin
Signed-off-by: Ganesh Goudar
---
V2: Rephrasing the commit message
---
arch/powerpc/kernel/traps.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 11caa0291254..82f43535e686 10064
8b0063 380b0001
---[ end trace 46fd63f36bbdd940 ]---
Fixes: 9ca766f9891d ("powerpc/64s/pseries: machine check convert to use common
event code")
Signed-off-by: Ganesh Goudar
---
arch/powerpc/kernel/exceptions-64s.S | 12
arch/powerpc/platforms/pseries/pseries.h | 1
If we hit UE at an instruction with a fixup entry, flag to
ignore the event and set nip to continue execution at the
fixup entry.
For powernv these changes are already made by commit
895e3dceeb97 ("powerpc/mce: Handle UE event for memcpy_mcsafe")
Signed-off-by: Ganesh Goudar
---
ar
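The fixup handling described in the commit message above can be sketched in plain userspace C. This is only an illustrative model, not the kernel code: `fixup_entry`, `cpu_regs`, `mce_info` and `handle_ue_with_fixup()` are hypothetical names, and the linear table scan is a simplification chosen for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one fixup table entry (not the kernel's layout). */
struct fixup_entry {
    unsigned long insn;   /* address of an instruction that may hit a UE */
    unsigned long fixup;  /* address at which execution should resume */
};

/* Hypothetical stand-ins for the saved registers and the MCE record. */
struct cpu_regs { unsigned long nip; };
struct mce_info { bool ignore_event; };

/* Linear search for an entry covering the faulting instruction. */
static const struct fixup_entry *find_fixup(const struct fixup_entry *table,
                                            size_t count, unsigned long nip)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].insn == nip)
            return &table[i];
    }
    return NULL;
}

/*
 * If the faulting instruction has a fixup entry, flag the event to be
 * ignored and redirect nip to the fixup address. Returns true when the
 * redirect happened, false when no entry matched (state is untouched).
 */
bool handle_ue_with_fixup(const struct fixup_entry *table, size_t count,
                          struct cpu_regs *regs, struct mce_info *mce)
{
    const struct fixup_entry *entry;

    assert(regs != NULL && mce != NULL);
    entry = find_fixup(table, count, regs->nip);
    if (!entry)
        return false;

    mce->ignore_event = true;
    regs->nip = entry->fixup;
    return true;
}
```

A caller would pass the saved register state for the faulting context; when the function returns false the event would presumably go through the normal, non-ignored handling path.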
event for memcpy_mcsafe")
Reviewed-by: Mahesh Salgaonkar
Reviewed-by: Santosh S
Signed-off-by: Ganesh Goudar
---
V2: Fixes a trivial checkpatch error in commit msg.
V3: Use proper subject prefix.
V4: Rephrase the commit message.
Define a common function to update nip with fixup address.
From: Santosh S
Introduce notification chain which lets know about uncorrected memory
errors(UE). This would help prospective users in pmem or nvdimm subsystem
to track bad blocks for better handling of persistent memory allocations.
Signed-off-by: Santosh S
Signed-off-by: Ganesh Goudar
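As a rough illustration of the notification-chain idea in the message above, here is a minimal userspace sketch in C. It is not the kernel notifier API and not the patch's code: `ue_notifier`, `ue_notifier_register()`, `ue_notify()` and the `bad_block_tracker` subscriber are all hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subscriber node; higher priority callbacks run first. */
struct ue_notifier {
    int (*call)(struct ue_notifier *self, unsigned long pfn);
    int priority;
    struct ue_notifier *next;
};

static struct ue_notifier *ue_chain;

/* Insert a subscriber, keeping the chain sorted by descending priority. */
void ue_notifier_register(struct ue_notifier *nb)
{
    struct ue_notifier **pp = &ue_chain;

    assert(nb != NULL && nb->call != NULL);
    while (*pp && (*pp)->priority >= nb->priority)
        pp = &(*pp)->next;
    nb->next = *pp;
    *pp = nb;
}

/* Remove a subscriber if it is on the chain; otherwise do nothing. */
void ue_notifier_unregister(struct ue_notifier *nb)
{
    struct ue_notifier **pp = &ue_chain;

    while (*pp && *pp != nb)
        pp = &(*pp)->next;
    if (*pp)
        *pp = nb->next;
}

/* Report an uncorrected error at pfn; returns how many callbacks ran. */
int ue_notify(unsigned long pfn)
{
    int called = 0;

    for (struct ue_notifier *nb = ue_chain; nb; nb = nb->next) {
        nb->call(nb, pfn);
        called++;
    }
    return called;
}

/* Example subscriber: remembers bad pfns, as a pmem user might. */
#define MAX_BAD 8
static unsigned long bad_pfns[MAX_BAD];
static int bad_count;

static int record_bad_block(struct ue_notifier *self, unsigned long pfn)
{
    (void)self;
    if (bad_count < MAX_BAD)
        bad_pfns[bad_count++] = pfn;
    return 0;
}

struct ue_notifier bad_block_tracker = {
    .call = record_bad_block,
    .priority = 0,
};
```

A subscriber registers once with `ue_notifier_register(&bad_block_tracker)`; each later `ue_notify(pfn)` then appends the pfn to its bad-block list until it unregisters. The model ignores locking and interrupt context, both of which a real in-kernel chain would need to consider.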
mce_handle_ierror() and mce_handle_derror() have some duplicate
code to recover from the recoverable MCE errors and to get the
MCE error sub-type while generating MCE error info, Add helper
functions to remove it.
Signed-off-by: Ganesh Goudar
---
arch/powerpc/kernel/mce_power.c | 136
and thereby
avoid poisoning the memory in host.
Reviewed-by: Mahesh Salgaonkar
Signed-off-by: Ganesh Goudar
---
arch/powerpc/kernel/mce_power.c | 10 +-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index
Reviewed-by: Santosh S
Signed-off-by: Ganesh Goudar
---
V2: Fixes a trivial checkpatch error in commit msg
---
arch/powerpc/platforms/pseries/ras.c | 8
1 file changed, 8 insertions(+)
diff --git a/arch/powerpc/platforms/pseries/ras.c
b/arch/powerpc/platforms/pseries/ras.c
index 5d
8b0063 380b0001
---[ end trace 46fd63f36bbdd940 ]---
Fixes: 9ca766f9891d ("powerpc/64s/pseries: machine check convert to use common
event code")
Reviewed-by: Mahesh Salgaonkar
Reviewed-by: Nicholas Piggin
Signed-off-by: Ganesh Goudar
---
v2: Avoid asm code to switch to virtual mode.
-
Reviewed-by: Santosh S
Signed-off-by: Ganesh Goudar
---
V2: Fixes a trivial checkpatch error in commit msg.
V3: Use proper subject prefix.
---
arch/powerpc/platforms/pseries/ras.c | 8
1 file changed, 8 insertions(+)
diff --git a/arch/powerpc/platforms/pseries/ras.c
b/arch/powerpc/platfor
To check machine check handling, add support to inject slb
multihit errors.
Cc: Kees Cook
Reviewed-by: Michal Suchánek
Co-developed-by: Mahesh Salgaonkar
Signed-off-by: Mahesh Salgaonkar
Signed-off-by: Ganesh Goudar
---
drivers/misc/lkdtm/Makefile | 1 +
drivers/misc/lkdtm
on pseries machine running in hash
mmu mode.
Fixes: 116ac378bb3f ("powerpc/64s: machine check interrupt update NMI
accounting")
Signed-off-by: Ganesh Goudar
---
arch/powerpc/kernel/mce.c | 7 +++
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/mc
,
as nesting is supported.
* Fix build errors and remove unused variables.
* Integrate error injection code into LKDTM.
* Add support to inject multihit in paca.
Ganesh Goudar (2):
powerpc/mce: remove nmi_enter/exit from real mode handler
lkdtm/powerpc: Add SLB multihit test
arch/powerpc/kernel
Add PPC_SLB_MULTIHIT to lkdtm selftest framework.
Signed-off-by: Ganesh Goudar
---
tools/testing/selftests/lkdtm/tests.txt | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/lkdtm/tests.txt
b/tools/testing/selftests/lkdtm/tests.txt
index 9d266e79c6a2..7eb3cf91c89e
on pseries machine running in hash
mmu mode.
Fixes: 116ac378bb3f ("powerpc/64s: machine check interrupt update NMI
accounting")
Signed-off-by: Ganesh Goudar
---
arch/powerpc/kernel/mce.c | 10 ++
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kern
.
* Fix build errors and remove unused variables.
* Integrate error injection code into LKDTM.
* Add support to inject multihit in paca.
Ganesh Goudar (3):
powerpc/mce: remove nmi_enter/exit from real mode handler
lkdtm/powerpc: Add SLB multihit test
selftests/lkdtm: Enable selftest for SLB
Add support to inject slb multihit errors, to test machine
check handling.
Based on work by Mahesh Salgaonkar and Michal Suchánek.
Cc: Mahesh Salgaonkar
Cc: Michal Suchánek
Signed-off-by: Ganesh Goudar
---
drivers/misc/lkdtm/Makefile | 4 ++
drivers/misc/lkdtm/core.c| 3 +
drivers
support to inject multihit in paca.
Ganesh Goudar (2):
powerpc/mce: remove nmi_enter/exit from real mode handler
lkdtm/powerpc: Add SLB multihit test
arch/powerpc/kernel/mce.c | 10 +-
drivers/misc/lkdtm/Makefile | 1 +
drivers/misc/lkdtm/core.c | 3
To check machine check handling, add support to inject slb
multihit errors.
Reviewed-by: Michal Suchánek
Co-developed-by: Mahesh Salgaonkar
Signed-off-by: Mahesh Salgaonkar
Signed-off-by: Ganesh Goudar
---
drivers/misc/lkdtm/Makefile | 1 +
drivers/misc/lkdtm/core.c
on pseries machine running in hash
mmu mode.
Fixes: 116ac378bb3f ("powerpc/64s: machine check interrupt update NMI
accounting")
Signed-off-by: Ganesh Goudar
---
arch/powerpc/kernel/mce.c | 10 ++
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kern
Add selftest to check if the system recovers from slb multihit
errors.
Signed-off-by: Ganesh Goudar
---
tools/testing/selftests/powerpc/Makefile | 3 ++-
tools/testing/selftests/powerpc/mces/Makefile| 6 ++
tools/testing/selftests/powerpc/mces/slb_multihit.sh | 9
To test machine check handling, add debugfs interface to inject
slb multihit errors.
To inject slb multihit:
#echo 1 > /sys/kernel/debug/powerpc/mce_error_inject/inject_slb_multihit
Signed-off-by: Ganesh Goudar
Signed-off-by: Mahesh Salgaonkar
---
arch/powerpc/Kconfig.debug |
on pseries machine running in hash
mmu mode.
Fixes: 116ac378bb3f ("powerpc/64s: machine check interrupt update NMI
accounting")
Signed-off-by: Ganesh Goudar
---
arch/powerpc/kernel/mce.c | 7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/mc
if possible.
Ganesh Goudar (3):
powerpc/mce: remove nmi_enter/exit from real mode handler
powerpc/mce: Add debugfs interface to inject MCE
selftest/powerpc: Add slb multihit selftest
arch/powerpc/Kconfig.debug| 9 ++
arch/powerpc/kernel/mce.c | 7
be fatal as it may try to access
memory outside RMO region.
To fix this use addr_to_pfn after switching to virtual mode.
Signed-off-by: Ganesh Goudar
---
V2: Leave bare metal code and save_mce_event as is.
---
arch/powerpc/platforms/pseries/ras.c | 20 +++-
1 file changed, 11
be fatal as it may try to access
memory outside RMO region.
To fix this move the use of addr_to_pfn to save_mce_event(), which
runs in virtual mode.
Signed-off-by: Ganesh Goudar
---
arch/powerpc/kernel/mce.c| 7 +
arch/powerpc/kernel/mce_power.c | 39
] 79291f24 790af00e 78e70020 7d095214 <7c69502a> 2fa3 419e011c
70690040
[ 485.128152] ---[ end trace d34b27e29ae0e340 ]---
Signed-off-by: Ganesh Goudar
---
V2: Leave bare metal code and save_mce_event as is.
V3: Have separate functions for realmode and virtual mode handling.
---
arch/p
vert to use common
event code")
Signed-off-by: Ganesh Goudar
---
V2: Leave bare metal code and save_mce_event as is.
V3: Have separate functions for realmode and virtual mode handling.
V4: Fix build warning, rephrase commit message.
---
arch/powerpc/platforms/pse
on different architectures, So have
these variables in paca instead of having them as per-cpu variables
to avoid complications.
Maximum recursive depth of MCE is 4, Considering the maximum depth
allowed reduce the size of event to 10 from 100.
Signed-off-by: Ganesh Goudar
---
v2: Dynamically
To check machine check handling, add support to inject slb
multihit errors.
Cc: Kees Cook
Cc: Michal Suchánek
Co-developed-by: Mahesh Salgaonkar
Signed-off-by: Mahesh Salgaonkar
Signed-off-by: Ganesh Goudar
---
v5:
- Insert entries at SLB_NUM_BOLTED and SLB_NUM_BOLTED +1, remove index
on different architectures, So have
these variables in paca instead of having them as per-cpu variables
to avoid complications.
Maximum recursive depth of MCE is 4, Considering the maximum depth
allowed reduce the size of event to 10 from 100.
Signed-off-by: Ganesh Goudar
---
arch/powerpc/include
on different architectures, So have
these variables in paca instead of having them as per-cpu variables
to avoid complications.
Signed-off-by: Ganesh Goudar
---
v2: Dynamically allocate memory for machine check event info
v3: Remove check for hash mmu lpar, use memblock_alloc_try_nid
Maximum recursive depth of MCE is 4, Considering the maximum depth
allowed reduce the size of event to 10 from 100. This saves us ~19kB
of memory and has no fatal consequences.
Signed-off-by: Ganesh Goudar
---
v4: This patch is a fragment of the original patch which is
split into two
on different architectures, So have
these variables in paca instead of having them as per-cpu variables
to avoid complications.
Maximum recursive depth of MCE is 4, Considering the maximum depth
allowed reduce the size of event to 10 from 100.
Signed-off-by: Ganesh Goudar
---
v2: Dynamically
Maximum recursive depth of MCE is 4, Considering the maximum depth
allowed reduce the size of event to 10 from 100. This saves us ~19kB
of memory and has no fatal consequences.
Signed-off-by: Ganesh Goudar
---
v4: This patch is a fragment of the original patch which is
split into two.
v5
on different architectures, So have
these variables in paca instead of having them as per-cpu variables
to avoid complications.
Signed-off-by: Ganesh Goudar
---
v2: Dynamically allocate memory for machine check event info.
v3: Remove check for hash mmu lpar, use memblock_alloc_try_nid
] memcpy+0x88/0x90
[ 512.972456] MCE: CPU1: Initiator CPU
[ 512.972534] MCE: CPU1: Unknown
Signed-off-by: Ganesh Goudar
---
arch/powerpc/kernel/mce.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 11f0cae086ed
The error type is ICACHE not DCACHE, for case MCE_ERROR_TYPE_ICACHE.
Signed-off-by: Ganesh Goudar
---
arch/powerpc/platforms/pseries/ras.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/platforms/pseries/ras.c
b/arch/powerpc/platforms/pseries/ras.c
index
receives SIGBUS.
Signed-off-by: Ganesh Goudar
---
v3: Avoid using shell script to inject error.
v2: Fix build error.
---
tools/testing/selftests/powerpc/Makefile | 3 +-
tools/testing/selftests/powerpc/mce/Makefile | 7 ++
.../selftests/powerpc/mce/inject-ra-err.c | 65
s space.
Signed-off-by: Ganesh Goudar
---
v3: No changes.
v2: No changes.
---
arch/powerpc/kernel/mce.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 9d1e39d42e3e..5baf69503349 100644
--- a/arch/powerpc/ker
] machine_check_queue_event+0xbc/0xd0
[c0001ebffcf0] [c000838c] machine_check_early_common+0x16c/0x1f4
Fixes: 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before returning from
handler")
Signed-off-by: Ganesh Goudar
---
arch/powerpc/kernel/mce.c | 16 ++--
1 file c
Add support to parse and log control memory access
error for pseries. These changes are made according to
PAPR v2.11 10.3.2.2.12.
Signed-off-by: Ganesh Goudar
---
v3: Modify the commit log to mention the document according
to which changes are made.
Define and use a macro to check
+0xbc/0xd0
[c0001ebffcf0] [c000838c] machine_check_early_common+0x16c/0x1f4
Fixes: 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before returning from
handler")
Signed-off-by: Ganesh Goudar
---
v2: Change in commit message.
---
arch/powerpc/kernel/mce.c | 16 ++--
1 file ch