RE: [PATCH V2 4/6] x86/efi: Add efi page fault handler to fixup/recover from page faults caused by firmware

2018-09-03 Thread Prakhya, Sai Praneeth
> > The efi specific page fault handler offers us two advantages:
> > 1. Avoid panics/hangs caused by buggy firmware.
> > 2. Shout loud that the firmware is buggy and hence is not a kernel bug.
> >
> > Finally, this new mapping will not impact a reboot from kexec, as
> > kexec is only concerned about runtime memory regions.
> 
> No. This is just a horrible hack to make completely bogus firmware work and
> never fixed.
> 

Yes, that's true.

> The proper thing to do is to have a minimal page fault handler which does:
> 
>  1) Yell loudly if that ever happens
> 
>  2) Handles the reboot request gracefully
> 
>  3) Freeze and disable the EFI mess for all other cases
> 
> That does not require any hackery to make these mappings work from atomic
> context and keeps the mess confined to the EFI code where it belongs.
> 
> Ideally we just blacklist the offending system and be done with it.

This makes sense to me. I will implement the above suggested and as said should 
avoid the need for making mappings work from atomic context.

Regards,
Sai


Re: [PATCH V2 4/6] x86/efi: Add efi page fault handler to fixup/recover from page faults caused by firmware

2018-09-03 Thread Thomas Gleixner
On Sun, 2 Sep 2018, Sai Praneeth Prakhya wrote:
> Kernel panics/hangs because the memory region requested by the firmware
> isn't mapped, which causes a page fault in ring 0 and the kernel fails
> to handle it, leading to die(). To save kernel from hanging, add an efi
> specific page fault handler which detects illegal accesses by the
> firmware and
> 1. If the illegally accessed region is EFI_BOOT_SERVICES_,
>the efi page fault handler fixes it up by mapping the requested
>region.
> 2. If any other region (Eg: EFI_CONVENTIONAL_MEMORY or
>EFI_LOADER_), then the efi page fault handler freezes
>efi_rts_wq and schedules a new process.
> 3. If the access is to any other efi region like above but if the efi
>runtime service is efi_reset_system(), then the efi page fault
>handler will reboot the machine through BIOS.
> 
> Illegal accesses to EFI_BOOT_SERVICES_ and to other regions
> are dealt differently in efi page fault handler because, *generally*
> EFI_BOOT_SERVICES_ regions are smaller in size relative to
> other efi regions and hence could be reserved and can be dynamically
> mapped. But other EFI regions like EFI_CONVENTIONAL_MEMORY and
> EFI_LOADER_ cannot be reserved as they are very huge in size
> and reserving them will make the kernel un-bootable.
> 
> The efi specific page fault handler offers us two advantages:
> 1. Avoid panics/hangs caused by buggy firmware.
> 2. Shout loud that the firmware is buggy and hence is not a kernel bug.
> 
> Finally, this new mapping will not impact a reboot from kexec, as kexec
> is only concerned about runtime memory regions.

No. This is just a horrible hack to make completely bogus firmware work and
never fixed.

The proper thing to do is to have a minimal page fault handler which does:

 1) Yell loudly if that ever happens

 2) Handles the reboot request gracefully

 3) Freeze and disable the EFI mess for all other cases

That does not require any hackery to make these mappings work from atomic
context and keeps the mess confined to the EFI code where it belongs.

Ideally we just blacklist the offending system and be done with it.

Thanks,

tglx


[PATCH V2 4/6] x86/efi: Add efi page fault handler to fixup/recover from page faults caused by firmware

2018-09-02 Thread Sai Praneeth Prakhya
From: Sai Praneeth 

EFI regions could briefly be divided into 3 types.
1. EFI_BOOT_SERVICES_ regions
2. EFI_RUNTIME_SERVICES_ regions
3. Other EFI regions like EFI_LOADER_ etc.

As per the UEFI specification, after the call to ExitBootServices(),
accesses by the firmware to any memory region except
EFI_RUNTIME_SERVICES_ regions is considered illegal. A buggy
firmware could trigger these illegal accesses during boot time or at
runtime (i.e. when the kernel is up and running). Presently, the kernel
can fix up illegal accesses to EFI_BOOT_SERVICES_ regions
*only* during kernel boot phase. If the firmware triggers illegal
accesses to *any* other EFI regions during kernel boot, the kernel
panics or if this happens during kernel runtime then the kernel hangs.

Kernel panics/hangs because the memory region requested by the firmware
isn't mapped, which causes a page fault in ring 0 and the kernel fails
to handle it, leading to die(). To save kernel from hanging, add an efi
specific page fault handler which detects illegal accesses by the
firmware and
1. If the illegally accessed region is EFI_BOOT_SERVICES_,
   the efi page fault handler fixes it up by mapping the requested
   region.
2. If any other region (Eg: EFI_CONVENTIONAL_MEMORY or
   EFI_LOADER_), then the efi page fault handler freezes
   efi_rts_wq and schedules a new process.
3. If the access is to any other efi region like above but if the efi
   runtime service is efi_reset_system(), then the efi page fault
   handler will reboot the machine through BIOS.

Illegal accesses to EFI_BOOT_SERVICES_ and to other regions
are dealt differently in efi page fault handler because, *generally*
EFI_BOOT_SERVICES_ regions are smaller in size relative to
other efi regions and hence could be reserved and can be dynamically
mapped. But other EFI regions like EFI_CONVENTIONAL_MEMORY and
EFI_LOADER_ cannot be reserved as they are very huge in size
and reserving them will make the kernel un-bootable.

The efi specific page fault handler offers us two advantages:
1. Avoid panics/hangs caused by buggy firmware.
2. Shout loud that the firmware is buggy and hence is not a kernel bug.

Finally, this new mapping will not impact a reboot from kexec, as kexec
is only concerned about runtime memory regions.

Suggested-by: Matt Fleming 
Based-on-code-from: Ricardo Neri 
Signed-off-by: Sai Praneeth Prakhya 
Cc: Lee Chun-Yi 
Cc: Al Stone 
Cc: Borislav Petkov 
Cc: Ingo Molnar 
Cc: Andy Lutomirski 
Cc: Bhupesh Sharma 
Cc: Peter Zijlstra 
Cc: Ard Biesheuvel 
---
 arch/x86/include/asm/efi.h  |   7 ++
 arch/x86/mm/fault.c |   9 ++
 arch/x86/platform/efi/quirks.c  | 152 
 drivers/firmware/efi/runtime-wrappers.c |   7 ++
 include/linux/efi.h |   1 +
 5 files changed, 176 insertions(+)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index d9e5d9a6d138..68a28606909c 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -144,8 +144,15 @@ extern void efi_switch_mm(struct mm_struct *mm);
 
 #ifdef CONFIG_EFI_WARN_ON_ILLEGAL_ACCESS
 extern void __init efi_save_original_memmap(void);
+extern int efi_illegal_accesses_fixup(unsigned long phys_addr,
+ struct pt_regs *regs);
 #else
 static inline void __init efi_save_original_memmap(void) { }
+static inline int efi_illegal_accesses_fixup(unsigned long phys_addr,
+struct pt_regs *regs)
+{
+   return 0;
+}
 #endif /* CONFIG_EFI_WARN_ON_ILLEGAL_ACCESS */
 
 struct efi_setup_data {
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 2aafa6ab6103..afd42e76058e 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -16,6 +16,7 @@
 #include /* prefetchw*/
 #include /* exception_enter(), ...   */
 #include  /* faulthandler_disabled()  */
+#include  /* fixup for buggy UEFI firmware*/
 
 #include /* boot_cpu_has, ...*/
 #include  /* dotraplinkage, ...   */
@@ -24,6 +25,7 @@
 #include   /* emulate_vsyscall */
 #include   /* struct vm86  */
 #include/* vma_pkey()   */
+#include/* fixup for buggy UEFI firmware*/
 
 #define CREATE_TRACE_POINTS
 #include 
@@ -790,6 +792,13 @@ no_context(struct pt_regs *regs, unsigned long error_code,
return;
 
/*
+* Buggy firmware could trigger illegal accesses to some EFI regions
+* which might page fault, try to fixup or recover from such faults.
+*/
+   if (efi_illegal_accesses_fixup(address, regs))
+   return;
+
+   /*
 * Oops. The kernel tried to access some bad page. We'll have to
 * terminate things with extreme prejudice:
 */
diff --git a/arch/x86/platform/efi/quirks.c