Re: [PATCH v4] powerpc/pseries: Handle UE event for memcpy_mcsafe

2020-04-01 Thread Michael Ellerman
On Thu, 2020-03-26 at 18:49:16 UTC, Ganesh Goudar wrote:
> memcpy_mcsafe has been implemented for Power machines and is used
> by the pmem infrastructure, so that an uncorrectable error (UE)
> encountered during a memcpy from pmem devices does not result in a
> panic; instead, an appropriate error code is returned. The implementation
> expects the machine check handler to ignore the event and set nip to
> continue execution from the fixup code.
> 
> The appropriate changes have already been made to the powernv machine
> check handler. Make similar changes to the pseries machine check handler
> so that it ignores the event and sets nip to continue execution at the
> fixup entry if we hit a UE at an instruction with a fixup entry.
> 
> While we are at it, add a common function that searches for the exception
> table entry and updates nip with the fixup address, so that any future
> changes valid for both platforms can be made in one place.
> 
> powernv changes are made by
> commit 895e3dceeb97 ("powerpc/mce: Handle UE event for memcpy_mcsafe")
> 
> Reviewed-by: Mahesh Salgaonkar 
> Reviewed-by: Santosh S 
> Signed-off-by: Ganesh Goudar 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/efbc4303b255bb80ab1283794b36dd5fe1fb0ec3

cheers


[PATCH v4] powerpc/pseries: Handle UE event for memcpy_mcsafe

2020-03-26 Thread Ganesh Goudar
memcpy_mcsafe has been implemented for Power machines and is used
by the pmem infrastructure, so that an uncorrectable error (UE)
encountered during a memcpy from pmem devices does not result in a
panic; instead, an appropriate error code is returned. The implementation
expects the machine check handler to ignore the event and set nip to
continue execution from the fixup code.

The appropriate changes have already been made to the powernv machine
check handler. Make similar changes to the pseries machine check handler
so that it ignores the event and sets nip to continue execution at the
fixup entry if we hit a UE at an instruction with a fixup entry.

While we are at it, add a common function that searches for the exception
table entry and updates nip with the fixup address, so that any future
changes valid for both platforms can be made in one place.

powernv changes are made by
commit 895e3dceeb97 ("powerpc/mce: Handle UE event for memcpy_mcsafe")

Reviewed-by: Mahesh Salgaonkar 
Reviewed-by: Santosh S 
Signed-off-by: Ganesh Goudar 
---
V2: Fixes a trivial checkpatch error in commit msg.
V3: Use proper subject prefix.
V4: Rephrase the commit message.
    Define a common function to update nip with fixup address.
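
For readers unfamiliar with the fixup mechanism the commit message relies on,
here is a minimal userspace sketch of the lookup-and-redirect step. It is not
kernel code: every `sketch_*` and `SKETCH_*` name, the table contents, and the
addresses are hypothetical stand-ins, and the table uses absolute addresses
with a linear scan purely for brevity. It only illustrates the control flow
described above: if the faulting instruction has a fixup entry, mark the event
as ignorable and redirect nip; the pseries caller then reports the error as
recovered.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct sketch_regs {
	unsigned long nip;	/* address of the faulting instruction */
};

struct sketch_mce_err {
	bool ignore_event;	/* set when a fixup entry handles the error */
};

struct sketch_extable_entry {
	unsigned long insn;	/* instruction that may take a UE */
	unsigned long fixup;	/* where execution should resume */
};

/* Hypothetical dispositions; the values are illustrative only. */
enum sketch_disposition {
	SKETCH_DISP_NOT_RECOVERED,
	SKETCH_DISP_FULLY_RECOVERED,
};

/* Made-up table: two instructions that have fixup code. */
static const struct sketch_extable_entry sketch_table[] = {
	{ 0x1000, 0x9000 },
	{ 0x1040, 0x9010 },
};

/* Return the entry covering addr, or NULL if the address has no fixup. */
static const struct sketch_extable_entry *sketch_search(unsigned long addr)
{
	size_t n = sizeof(sketch_table) / sizeof(sketch_table[0]);

	for (size_t i = 0; i < n; i++)
		if (sketch_table[i].insn == addr)
			return &sketch_table[i];
	return NULL;
}

/* Models the shared helper: redirect nip only when a fixup exists. */
static void sketch_process_ue(struct sketch_regs *regs,
			      struct sketch_mce_err *err)
{
	const struct sketch_extable_entry *entry = sketch_search(regs->nip);

	if (entry) {
		err->ignore_event = true;
		regs->nip = entry->fixup;
	}
}

/* Models a platform caller: report recovery when the event is ignored. */
static enum sketch_disposition sketch_handle_ue(struct sketch_regs *regs)
{
	struct sketch_mce_err err = { .ignore_event = false };

	sketch_process_ue(regs, &err);
	return err.ignore_event ? SKETCH_DISP_FULLY_RECOVERED
				: SKETCH_DISP_NOT_RECOVERED;
}
```

Calling `sketch_handle_ue()` with nip set to 0x1040 moves nip to 0x9010 and
reports recovery; any address outside the table leaves nip untouched and is
reported as not recovered, so the normal UE handling would continue.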
---
 arch/powerpc/include/asm/mce.h       |  2 ++
 arch/powerpc/kernel/mce.c            | 14 ++++++++++++++
 arch/powerpc/kernel/mce_power.c      |  8 ++------
 arch/powerpc/platforms/pseries/ras.c |  3 +++
 4 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 6a6ddaabdb34..376a395daf32 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -218,6 +218,8 @@ extern void machine_check_queue_event(void);
 extern void machine_check_print_event_info(struct machine_check_event *evt,
   bool user_mode, bool in_guest);
 unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
+extern void mce_common_process_ue(struct pt_regs *regs,
+ struct mce_error_info *mce_err);
 #ifdef CONFIG_PPC_BOOK3S_64
 void flush_and_reload_slb(void);
 #endif /* CONFIG_PPC_BOOK3S_64 */
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 34c1001e9e8b..8077b5fb18a7 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -251,6 +252,19 @@ void machine_check_queue_event(void)
 	/* Queue irq work to process this event later. */
 	irq_work_queue(&mce_event_process_work);
 }
+
+void mce_common_process_ue(struct pt_regs *regs,
+			   struct mce_error_info *mce_err)
+{
+	const struct exception_table_entry *entry;
+
+	entry = search_kernel_exception_table(regs->nip);
+	if (entry) {
+		mce_err->ignore_event = true;
+		regs->nip = extable_fixup(entry);
+	}
+}
+
 /*
  * process pending MCE event from the mce event queue. This function will be
  * called during syscall exit.
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index 1cbf7f1a4e3d..067b094bfeff 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -579,14 +579,10 @@ static long mce_handle_ue_error(struct pt_regs *regs,
 					struct mce_error_info *mce_err)
 {
 	long handled = 0;
-	const struct exception_table_entry *entry;
 
-	entry = search_kernel_exception_table(regs->nip);
-	if (entry) {
-		mce_err->ignore_event = true;
-		regs->nip = extable_fixup(entry);
+	mce_common_process_ue(regs, mce_err);
+	if (mce_err->ignore_event)
 		return 1;
-	}
 
 	/*
 	 * On specific SCOM read via MMIO we may get a machine check
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 43710b69e09e..1d1da639b8b7 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -558,6 +558,9 @@ static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
 	switch (mce_log->error_type) {
 	case MC_ERROR_TYPE_UE:
 		mce_err.error_type = MCE_ERROR_TYPE_UE;
+		mce_common_process_ue(regs, &mce_err);
+		if (mce_err.ignore_event)
+			disposition = RTAS_DISP_FULLY_RECOVERED;
 		switch (err_sub_type) {
 		case MC_ERROR_UE_IFETCH:
 			mce_err.u.ue_error_type = MCE_UE_ERROR_IFETCH;
-- 
2.17.2