Re: [PATCH] x86: Lock down MSR writing in secure boot

2013-02-08 Thread Borislav Petkov
On Fri, Feb 08, 2013 at 02:30:52PM -0800, H. Peter Anvin wrote:
 Also, keep in mind that there is a very simple way to deny MSR access
 completely, which is to not include the driver in your kernel (and not
 allow module loading, but if you can load modules you can just load a
 module to muck with whatever MSR you want.)

I was contemplating that too. What is the use case of having
msr.ko in a secure boot environment? Isn't that an all-no-tools,
you-can't-do-sh*t-except-what-you're-explicitly-allowed-to environment which
simply doesn't need to write MSRs in the first place?
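
For context, a minimal sketch of how user space usually reaches MSRs through the msr driver, which is why leaving the driver out closes that path: the driver exposes one character device per CPU (typically /dev/cpu/N/msr), and the file offset selects the register. The helper name read_msr() and the path handed to it are illustrative assumptions, not code from the patch under discussion.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Read one 64-bit MSR through the msr driver's character device.
 * dev_path is e.g. "/dev/cpu/0/msr"; reg is the MSR number, used as
 * the file offset. Returns 0 on success, -1 if the device cannot be
 * opened or the read fails (e.g. the driver is not present).
 */
static int read_msr(const char *dev_path, uint32_t reg, uint64_t *val)
{
	ssize_t n;
	int fd = open(dev_path, O_RDONLY);

	if (fd < 0)
		return -1;

	n = pread(fd, val, sizeof(*val), (off_t)reg);
	close(fd);

	return n == (ssize_t)sizeof(*val) ? 0 : -1;
}
```

On a kernel built without the driver, and with module loading disabled, the open() fails and there is nothing left for a tool to write to.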

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: EFI runtime and kexec

2013-03-01 Thread Borislav Petkov
On Fri, Mar 01, 2013 at 02:32:47PM -0800, H. Peter Anvin wrote:
 On 03/01/2013 01:39 PM, Borislav Petkov wrote:
  Hi guys,
  
  so I was talking with mfleming on IRC and he said I should talk to you
  about it. I actually pestered hpa about it already, sorry :).
  
  So I've been looking into making EFI runtime services available
  in the kexec'ed kernel. What I've found out so far is that
  efi_enter_virtual_mode() in the first kernel iterates over the EFI
  memmap and ioremaps all those EFI runtime services mappings. On my DELL
  workstation it looks like the dump below.
  
  Now, once this is done and SetVirtualAddressMap() is called, for the
  duration of this boot, those virtual addresses cannot change as the UEFI
  spec states: A virtual address map may only be applied one time. Once
  the runtime system is in virtual mode, calls to this function return
  EFI_UNSUPPORTED.
  
  Now, looking at those mappings, they're spread all over the VA space and
  their size is ~159.887Mb (which is 159MB too many for a goddam BIOS crap
  but whatever, everyone is jumping on this train so I'm gonna have to
  follow, unwillingly./EndOfRant)
  
  AFAICT, in a kexec kernel I'd have to recreate the exact-same mappings,
  i.e. phys_addr -> va for all those regions. And since I'm not an mm guy,
  I'd rather ask the experts before I dive into a catch-22 thing.
  
  So even if I manage to do the mappings in the kexec kernel correctly for
  all those regions which efi_ioremap() serves only with direct mappings
  through init_memory_mapping(), I probably won't be that lucky with
  regions of type EFI_MEMORY_MAPPED_IO (type 0xb below) for which we
  really do ioremap and those virtual addresses I can't control, AFAICT.
  
  So what do you guys think, would it be possible
  
  * to make all EFI runtime services use predefined mappings which are
  globally valid and I can read them out in kexec or
  
  * make those mappings virtually contiguous so that kexec kernel only
  gets a va_start and a size and after that it knows what to do or
  
  * an even better idea.
  
  In general, any suggestion is appreciated.
  
 
 Adding a few more people.
 
 This has been a big topic, and yes, we have a problem.
 
 We seem to have a few options:
 
 1. We could always map 1:1, with the EFI mappings being in the user
 part of the virtual address space.  This MAY be what Windows does
 already.  Some Apple platforms are known to fail in this configuration,
 but perhaps we can blacklist those platforms or do something special.
 
 2. We could always map them into a fixed address that can be relied upon
 to be consistent.  The most logical such area is the second quadrant of
 the address space (again, in the user portion.)  It would be
 beneficial if we could define it so that whenever Linux needs to go to
 more than 48 virtual address bits at some point in the future this can
 be compatible between 48-bit and N-bit kernels, but if that is the only
 thing that breaks, then oh well.
 
 3. We could just always map at the kernel virtual address.  The 64-bit
 address space is large enough that we could make every ioremap() land at
 its identity-mapped address instead of in a unique part of the virtual
 address space.
 
 4. We could export a table of mappings to the kexec'd kernel.  In that
 case, we have to re-establish those mappings very early in the kernel
 boot so that nothing else steps on them.
 
 What is quite interesting in your case is that you have a mishmash of
 the identity-mapped and the non-identity-mapped mappings.

Yeah, the mishmash comes from regions of type EFI_MEMORY_MAPPED_IO which
are really ioremapped instead of returning the kernel virtual address.

Btw, I always tend to like the simplest approaches so option 3.
is kinda winking at me right now. I don't know whether for those
EFI_MEMORY_MAPPED_IO type regions though, we can simply return the
identity-mapped address.

If we can, the advantage would be great because then the kexec kernel
would simply parse the efi memmap and use __va() on the physical
addresses there and no need for special option passing to it.
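
A minimal user-space sketch of what "parse the efi memmap and use __va()" could look like under option 3. The trimmed-down descriptor, the PAGE_OFFSET_SKETCH value and the helper names are assumptions for illustration; in the kernel, __va() adds PAGE_OFFSET to a physical address covered by the direct mapping.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed direct-map base, for illustration only (the kernel uses PAGE_OFFSET). */
#define PAGE_OFFSET_SKETCH	0xffff880000000000ULL
#define EFI_PAGE_SHIFT_SKETCH	12

/* Hypothetical, trimmed-down EFI memory descriptor. */
struct md_sketch {
	uint64_t phys_addr;
	uint64_t virt_addr;
	uint64_t num_pages;
};

/* Stand-in for __va(): direct-map translation from physical to virtual. */
static uint64_t va_sketch(uint64_t phys)
{
	return phys + PAGE_OFFSET_SKETCH;
}

/*
 * Fill in virt_addr for every descriptor from its physical address alone.
 * Returns the total number of bytes covered by the descriptors.
 */
static uint64_t assign_direct_map(struct md_sketch *map, size_t n)
{
	uint64_t bytes = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		map[i].virt_addr = va_sketch(map[i].phys_addr);
		bytes += map[i].num_pages << EFI_PAGE_SHIFT_SKETCH;
	}

	return bytes;
}
```

The point of the sketch is that nothing has to be passed between kernels: both derive the same virtual address from the same physical address.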

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: EFI runtime and kexec

2013-03-01 Thread Borislav Petkov
On Fri, Mar 01, 2013 at 02:58:56PM -0800, H. Peter Anvin wrote:
  We seem to have a few options:
 
  1. We could always map 1:1, with the EFI mappings being in the user
  part of the virtual address space.  This MAY be what Windows does
  already.  Some Apple platforms are known to fail in this configuration,
  but perhaps we can blacklist those platforms or do something special.
 
  2. We could always map them into a fixed address that can be relied upon
  to be consistent.  The most logical such area is the second quadrant of
  the address space (again, in the user portion.)  It would be
  beneficial if we could define it so that whenever Linux needs to go to
  more than 48 virtual address bits at some point in the future this can
  be compatible between 48-bit and N-bit kernels, but if that is the only
  thing that breaks, then oh well.
 
  3. We could just always map at the kernel virtual address.  The 64-bit
  address space is large enough that we could make every ioremap() land at
  its identity-mapped address instead of in a unique part of the virtual
  address space.
 
  4. We could export a table of mappings to the kexec'd kernel.  In that
  case, we have to re-establish those mappings very early in the kernel
  boot so that nothing else steps on them.
 
  What is quite interesting in your case is that you have a mishmash of
  the identity-mapped and the non-identity-mapped mappings.
  
  Yeah, the mishmash comes from regions of type EFI_MEMORY_MAPPED_IO which
  are really ioremapped instead of returning the kernel virtual address.
  
  Btw, I always tend to like the simplest approaches so option 3.
  is kinda winking at me right now. I don't know whether for those
  EFI_MEMORY_MAPPED_IO type regions though, we can simply return the
  identity-mapped address.
  
  If we can, the advantage would be great because then the kexec kernel
  would simply parse the efi memmap and use __va() on the physical
  addresses there and no need for special option passing to it.
  
 
 We can, and in fact we could do this for *all* ioremap()s in the 64-bit
 kernel.  This doesn't help the 32-bit kernel in any way, however.

Right, ok.

 One thing I *really* don't like about it is that it exposes the kernel
 virtual address map as an ABI.

Hmm, yeah, that's nasty. This also means option #2 can go too because
of the fixed addresses. Option #1 is also kinda polluting user address
space so maybe the most elegant one would be #4, AFAICT.

We just need a nice mechanism to tell those mappings to the kexec-d
kernel and when it starts, to establish them right away.
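
A rough sketch of what such a mechanism for option #4 might carry, assuming a flat table of (phys, virt, num_pages) entries handed to the next kernel. All struct and function names here are hypothetical; only the meaning of the runtime attribute bit (bit 63 of the descriptor attribute) is taken from the UEFI spec.

```c
#include <stddef.h>
#include <stdint.h>

/* Runtime attribute bit of an EFI memory descriptor (bit 63). */
#define EFI_MEMORY_RUNTIME_SKETCH	0x8000000000000000ULL
#define EFI_PAGE_SHIFT_SKETCH		12

/* Hypothetical, trimmed-down EFI memory descriptor. */
struct md_sketch {
	uint32_t type;
	uint64_t phys_addr;
	uint64_t virt_addr;
	uint64_t num_pages;
	uint64_t attribute;
};

/* Hypothetical entry of the table exported to the kexec'd kernel. */
struct kexec_efi_map {
	uint64_t phys;
	uint64_t virt;
	uint64_t num_pages;
};

/* First kernel: record the mapping of every runtime region. */
static size_t export_runtime_maps(const struct md_sketch *map, size_t n,
				  struct kexec_efi_map *out, size_t out_len)
{
	size_t i, used = 0;

	for (i = 0; i < n && used < out_len; i++) {
		if (!(map[i].attribute & EFI_MEMORY_RUNTIME_SKETCH))
			continue;

		out[used].phys = map[i].phys_addr;
		out[used].virt = map[i].virt_addr;
		out[used].num_pages = map[i].num_pages;
		used++;
	}

	return used;
}

/* kexec'd kernel: which VA did the first kernel use? 0 if not exported. */
static uint64_t lookup_exported_va(const struct kexec_efi_map *tbl, size_t n,
				   uint64_t phys)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t size = tbl[i].num_pages << EFI_PAGE_SHIFT_SKETCH;

		if (phys >= tbl[i].phys && phys - tbl[i].phys < size)
			return tbl[i].virt + (phys - tbl[i].phys);
	}

	return 0;
}
```

The second kernel would have to re-establish exactly these virtual addresses early in boot, before anything else can claim them.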

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: EFI runtime and kexec

2013-03-01 Thread Borislav Petkov
Just commenting on this one for now, the rest tomorrow cuz I'm half
asleep.

On Fri, Mar 01, 2013 at 11:30:25PM +, David Woodhouse wrote:
 The other option, for the long term, is to fix the damn firmware to
 allow SetVirtualAddressMap to be called more than once. It was stupid
 for it to be a one-time call anyway.

Now this would be the cleanest solution. If we can do that, we can then
simply call efi_enter_virtual_mode() in the kexec'd kernel without
the need to pass any options to it. Actually, the kexec'd kernel can
probably run the same efi code as the papa kernel.
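
To illustrate the constraint, here is a toy model of the one-shot behaviour: the first call switches the mock firmware to virtual mode, and any later call returns EFI_UNSUPPORTED unless the (hypothetical) fixed firmware allows remapping. The fw_sketch struct and the allow_remap flag are made up for this sketch; only the one-time semantics and the EFI_UNSUPPORTED status come from the UEFI spec.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t efi_status_sketch_t;

#define EFI_SUCCESS_SKETCH	0ULL
/* Error codes have the high bit set on 64-bit; EFI_UNSUPPORTED is code 3. */
#define EFI_UNSUPPORTED_SKETCH	(3ULL | (1ULL << 63))

/* Mock firmware state, for illustration only. */
struct fw_sketch {
	bool virtual_mode;	/* has SetVirtualAddressMap() run already? */
	uint64_t runtime_base;	/* where the runtime services think they live */
};

/*
 * Mock SetVirtualAddressMap(): succeeds once. allow_remap models the
 * firmware fix suggested above, where repeated calls would be accepted.
 */
static efi_status_sketch_t
fw_set_virtual_address_map(struct fw_sketch *fw, uint64_t new_base,
			   bool allow_remap)
{
	if (fw->virtual_mode && !allow_remap)
		return EFI_UNSUPPORTED_SKETCH;

	fw->virtual_mode = true;
	fw->runtime_base = new_base;

	return EFI_SUCCESS_SKETCH;
}
```

With today's firmware the kexec'd kernel is in the position of the second caller: it gets EFI_UNSUPPORTED and is stuck with whatever addresses the first kernel chose.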

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-24 Thread Borislav Petkov
On Fri, May 24, 2013 at 08:43:31AM +0100, Matt Fleming wrote:
 What appears to be happening is that the EFI runtime services
 code is calling into the EFI boot services code, which is definitely
 a bug in your firmware because we're at runtime, but we've seen
 other machines that do similar things so we usually handle it just
 fine. However, what makes your case different, and the reason you
 see the above splat, is that it's using the physical address of
 the EFI boot services region, not the virtual one we setup with
 SetVirtualAddressMap(). Which is a second firmware bug.

I'm speechless. Let's have someone else do the ranting this time:

http://www.happyassassin.net/2013/05/03/a-day-in-the-life-of-a-firmware-engineer/

 Again, we have seen other machines that access
 physical addresses after SetVirtualAddressMap(), but until now we
 haven't had any non-optional code that triggered them.
 
 The only reason I can see that the offending commit would introduce this
 problem is because it calls QueryVariableInfo() at boot time. I notice
 that your machine is an SGI UV one, is there any chance you could get a
 firmware fix for this? If possible, it would be also good to confirm
 that it's this chunk of code in setup_efi_vars(),
 
   status = efi_call_phys4(sys_table->runtime->query_variable_info,
   EFI_VARIABLE_NON_VOLATILE |
   EFI_VARIABLE_BOOTSERVICE_ACCESS |
   EFI_VARIABLE_RUNTIME_ACCESS, &store_size,
   &remaining_size, &var_size);
 
 that later makes GetNextVariable() jump to the physical address of the
 EFI Boot Services region. Because if not, we need to do some more
 digging.
 
 Borislav, how are your 1:1 mapping patches coming along? In theory, once
 those are merged we can gracefully workaround these kinds of issues.

What do you mean, map boot time functions 1:1 too?

In any case, I think I have an idea about the bug I was discussing with
hpa recently but I need to do more experimenting. I have the next week
off, though, so don't hold your breath just yet :).


[PATCH 0/4] EFI 1:1 mapping

2013-06-02 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Hi all,

this one is 64-bit only for now and it has been tested only in kvm with
OVMF.

Keeping in mind the inherent efi b0rkedness left and right, I'd like to
be very cautious and conservative with this and not hurry anything until
it has been actually very well tested on a variety of baremetal boxes.

Please take a closer look and let me know.

Thanks.

Borislav Petkov (4):
  efi: Convert runtime services function ptrs
  x86, cpa: Map in an arbitrary pgd
  x86, efi: Add an efi= kernel command line parameter
  x86, efi: Map runtime services 1:1

 arch/x86/boot/compressed/eboot.c |   2 +-
 arch/x86/include/asm/efi.h   |  30 +++---
 arch/x86/include/asm/pgtable_types.h |   3 +-
 arch/x86/mm/pageattr.c   |  80 
 arch/x86/platform/efi/efi.c  | 177 +--
 arch/x86/platform/efi/efi_stub_64.S  |  48 ++
 include/linux/efi.h  |  28 +++---
 7 files changed, 290 insertions(+), 78 deletions(-)

-- 
1.8.3.rc1.25.g423ecb0



[PATCH 1/4] efi: Convert runtime services function ptrs

2013-06-02 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

... to void * like the boot services and lose all the void * casts. No
functionality change.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/boot/compressed/eboot.c |  2 +-
 arch/x86/include/asm/efi.h   | 28 ++--
 include/linux/efi.h  | 28 ++--
 3 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index 35ee62fccf98..4060c8daf05e 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -266,7 +266,7 @@ static efi_status_t setup_efi_vars(struct boot_params *params)
while (data && data->next)
data = (struct setup_data *)(unsigned long)data->next;
 
-   status = efi_call_phys4((void *)sys_table->runtime->query_variable_info,
+   status = efi_call_phys4(sys_table->runtime->query_variable_info,
EFI_VARIABLE_NON_VOLATILE |
EFI_VARIABLE_BOOTSERVICE_ACCESS |
EFI_VARIABLE_RUNTIME_ACCESS, &store_size,
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 2fb5d5884e23..5b33686b6995 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -52,40 +52,40 @@ extern u64 efi_call6(void *fp, u64 arg1, u64 arg2, u64 arg3,
 u64 arg4, u64 arg5, u64 arg6);
 
 #define efi_call_phys0(f)  \
-   efi_call0((void *)(f))
+   efi_call0((f))
 #define efi_call_phys1(f, a1)  \
-   efi_call1((void *)(f), (u64)(a1))
+   efi_call1((f), (u64)(a1))
 #define efi_call_phys2(f, a1, a2)  \
-   efi_call2((void *)(f), (u64)(a1), (u64)(a2))
+   efi_call2((f), (u64)(a1), (u64)(a2))
 #define efi_call_phys3(f, a1, a2, a3)  \
-   efi_call3((void *)(f), (u64)(a1), (u64)(a2), (u64)(a3))
+   efi_call3((f), (u64)(a1), (u64)(a2), (u64)(a3))
 #define efi_call_phys4(f, a1, a2, a3, a4)  \
-   efi_call4((void *)(f), (u64)(a1), (u64)(a2), (u64)(a3), \
+   efi_call4((f), (u64)(a1), (u64)(a2), (u64)(a3), \
  (u64)(a4))
 #define efi_call_phys5(f, a1, a2, a3, a4, a5)  \
-   efi_call5((void *)(f), (u64)(a1), (u64)(a2), (u64)(a3), \
+   efi_call5((f), (u64)(a1), (u64)(a2), (u64)(a3), \
  (u64)(a4), (u64)(a5))
 #define efi_call_phys6(f, a1, a2, a3, a4, a5, a6)  \
-   efi_call6((void *)(f), (u64)(a1), (u64)(a2), (u64)(a3), \
+   efi_call6((f), (u64)(a1), (u64)(a2), (u64)(a3), \
  (u64)(a4), (u64)(a5), (u64)(a6))
 
 #define efi_call_virt0(f)  \
-   efi_call0((void *)(efi.systab->runtime->f))
+   efi_call0((efi.systab->runtime->f))
 #define efi_call_virt1(f, a1)  \
-   efi_call1((void *)(efi.systab->runtime->f), (u64)(a1))
+   efi_call1((efi.systab->runtime->f), (u64)(a1))
 #define efi_call_virt2(f, a1, a2)  \
-   efi_call2((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2))
+   efi_call2((efi.systab->runtime->f), (u64)(a1), (u64)(a2))
 #define efi_call_virt3(f, a1, a2, a3)  \
-   efi_call3((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
+   efi_call3((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
   (u64)(a3))
 #define efi_call_virt4(f, a1, a2, a3, a4)  \
-   efi_call4((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
+   efi_call4((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
   (u64)(a3), (u64)(a4))
 #define efi_call_virt5(f, a1, a2, a3, a4, a5)  \
-   efi_call5((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
+   efi_call5((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
   (u64)(a3), (u64)(a4), (u64)(a5))
 #define efi_call_virt6(f, a1, a2, a3, a4, a5, a6)  \
-   efi_call6((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
+   efi_call6((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
  (u64)(a3), (u64)(a4), (u64)(a5), (u64)(a6))
 
 extern void __iomem *efi_ioremap(unsigned long addr, unsigned long size,
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 2bc0ad78d058..21ae6b3c0359 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -287,20 +287,20 @@ typedef struct {
 
 typedef struct {
efi_table_hdr_t hdr;
-   unsigned long get_time;
-   unsigned long set_time;
-   unsigned long get_wakeup_time;
-   unsigned long set_wakeup_time;
-   unsigned long set_virtual_address_map;
-   unsigned long convert_pointer;
-   unsigned long get_variable;
-   unsigned long get_next_variable

[PATCH 2/4] x86, cpa: Map in an arbitrary pgd

2013-06-02 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Add the ability to map pages in an arbitrary pgd.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/include/asm/pgtable_types.h |  3 +-
 arch/x86/mm/pageattr.c   | 80 
 2 files changed, 65 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index e6423002c10b..0613e147f083 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -352,7 +352,8 @@ static inline void update_page_count(int level, unsigned long pages) { }
  */
 extern pte_t *lookup_address(unsigned long address, unsigned int *level);
 extern phys_addr_t slow_virt_to_phys(void *__address);
-
+extern void kernel_map_pages_in_pgd(pgd_t *pgd, unsigned long address,
+   unsigned numpages, unsigned long page_flags);
 #endif /* !__ASSEMBLY__ */
 
 #endif /* _ASM_X86_PGTABLE_DEFS_H */
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index bb32480c2d71..3d64e5fc2adc 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -30,6 +30,7 @@
  */
 struct cpa_data {
unsigned long   *vaddr;
+   pgd_t   *pgd;
pgprot_t    mask_set;
pgprot_t    mask_clr;
int numpages;
@@ -322,17 +323,9 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
return prot;
 }
 
-/*
- * Lookup the page table entry for a virtual address. Return a pointer
- * to the entry and the level of the mapping.
- *
- * Note: We return pud and pmd either when the entry is marked large
- * or when the present bit is not set. Otherwise we would return a
- * pointer to a nonexisting mapping.
- */
-pte_t *lookup_address(unsigned long address, unsigned int *level)
+static pte_t *
+__lookup_address_in_pgd(pgd_t *pgd, unsigned long address, unsigned int *level)
 {
-   pgd_t *pgd = pgd_offset_k(address);
pud_t *pud;
pmd_t *pmd;
 
@@ -361,8 +354,30 @@ pte_t *lookup_address(unsigned long address, unsigned int *level)
 
return pte_offset_kernel(pmd, address);
 }
+
+/*
+ * Lookup the page table entry for a virtual address. Return a pointer
+ * to the entry and the level of the mapping.
+ *
+ * Note: We return pud and pmd either when the entry is marked large
+ * or when the present bit is not set. Otherwise we would return a
+ * pointer to a nonexisting mapping.
+ */
+pte_t *lookup_address(unsigned long address, unsigned int *level)
+{
+   return __lookup_address_in_pgd(pgd_offset_k(address), address, level);
+}
 EXPORT_SYMBOL_GPL(lookup_address);
 
+pte_t *_lookup_address_cpa(struct cpa_data *cpa, unsigned long address,
+ unsigned int *level)
+{
+   if (cpa->pgd)
+   return __lookup_address_in_pgd(cpa->pgd, address, level);
+
+   return lookup_address(address, level);
+}
+
 /*
  * This is necessary because __pa() does not work on some
  * kinds of memory, like vmalloc() or the alloc_remap()
@@ -437,7 +452,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 * Check for races, another CPU might have split this page
 * up already:
 */
-   tmp = lookup_address(address, &level);
+   tmp = _lookup_address_cpa(cpa, address, &level);
if (tmp != kpte)
goto out_unlock;
 
@@ -543,7 +558,8 @@ out_unlock:
 }
 
 static int
-__split_large_page(pte_t *kpte, unsigned long address, struct page *base)
+__split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
+  struct page *base)
 {
pte_t *pbase = (pte_t *)page_address(base);
unsigned long pfn, pfninc = 1;
@@ -556,7 +572,7 @@ __split_large_page(pte_t *kpte, unsigned long address, struct page *base)
 * Check for races, another CPU might have split this page
 * up for us already:
 */
-   tmp = lookup_address(address, &level);
+   tmp = _lookup_address_cpa(cpa, address, &level);
if (tmp != kpte) {
spin_unlock(&pgd_lock);
return 1;
@@ -632,7 +648,8 @@ __split_large_page(pte_t *kpte, unsigned long address, struct page *base)
return 0;
 }
 
-static int split_large_page(pte_t *kpte, unsigned long address)
+static int split_large_page(struct cpa_data *cpa, pte_t *kpte,
+   unsigned long address)
 {
struct page *base;
 
@@ -644,7 +661,7 @@ static int split_large_page(pte_t *kpte, unsigned long address)
if (!base)
return -ENOMEM;
 
-   if (__split_large_page(kpte, address, base))
+   if (__split_large_page(cpa, kpte, address, base))
__free_page(base);
 
return 0;
@@ -697,7 +714,10 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
else
address = *cpa->vaddr;
 repeat:
-   kpte = lookup_address(address, &level);
+   if (cpa->pgd

[PATCH 4/4] x86, efi: Map runtime services 1:1

2013-06-02 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Due to the braindead design of EFI, we cannot map runtime services more
than once for the duration of a booted system. Thus, if we want to use
EFI runtime services in a kexec'ed kernel, maybe the only possible and
sensible approach would be to map them 1:1 so that when the kexec kernel
loads, it can simply call those addresses without the need for remapping
(which doesn't work anyway).

Furthermore, this mapping approach could be of help with b0rked EFI
implementations for a different set of reasons.

This implementation is 64-bit only for now and it boots fine in kvm with
OVMF BIOS.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/include/asm/efi.h  |   2 +
 arch/x86/platform/efi/efi.c | 161 +---
 arch/x86/platform/efi/efi_stub_64.S |  48 +++
 3 files changed, 180 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 5b33686b6995..1c9c0a5cc280 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -41,6 +41,8 @@ extern unsigned long asmlinkage efi_call_phys(void *, ...);
 
 #define EFI_LOADER_SIGNATURE   "EL64"
 
+extern pgd_t *efi_pgt;
+
 extern u64 efi_call0(void *fp);
 extern u64 efi_call1(void *fp, u64 arg1);
 extern u64 efi_call2(void *fp, u64 arg1, u64 arg2);
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index aea4337f7023..36ecefb54495 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -93,6 +93,8 @@ unsigned long x86_efi_facility;
 
 static unsigned long efi_config;
 
+extern bool use_11_map;
+
 /*
  * Returns 1 if 'facility' is enabled, 0 otherwise.
  */
@@ -763,6 +765,25 @@ static int __init efi_runtime_init(void)
 * virtual mode.
 */
efi.get_time = phys_efi_get_time;
+
+   if (efi_config & EFI_CFG_MAP11) {
+#define efi_phys_assign(f) \
+   efi_phys.f = (efi_ ##f## _t *)runtime->f
+
+   efi_phys_assign(set_time);
+   efi_phys_assign(get_wakeup_time);
+   efi_phys_assign(set_wakeup_time);
+   efi_phys_assign(get_variable);
+   efi_phys_assign(get_next_variable);
+   efi_phys_assign(set_variable);
+   efi_phys_assign(get_next_high_mono_count);
+   efi_phys_assign(reset_system);
+   efi_phys_assign(set_virtual_address_map);
+   efi_phys_assign(query_variable_info);
+   efi_phys_assign(update_capsule);
+   efi_phys_assign(query_capsule_caps);
+   }
+
early_iounmap(runtime, sizeof(efi_runtime_services_t));
 
return 0;
@@ -954,6 +975,61 @@ void efi_memory_uc(u64 addr, unsigned long size)
set_memory_uc(addr, npages);
 }
 
+static void __init __runtime_map_11(efi_memory_desc_t *md)
+{
+   pgd_t *pgd = (pgd_t *)__va(real_mode_header->trampoline_pgd);
+   unsigned long page_flags = 0;
+
+   if (md->type == EFI_RUNTIME_SERVICES_DATA ||
+   md->type == EFI_BOOT_SERVICES_DATA)
+   page_flags |= _PAGE_NX;
+
+   if (!(md->attribute & EFI_MEMORY_WB))
+   page_flags |= _PAGE_PCD;
+
+   kernel_map_pages_in_pgd(pgd + pgd_index(md->phys_addr),
+   md->phys_addr,
+   md->num_pages,
+   page_flags);
+
+   md->virt_addr = md->phys_addr;
+}
+
+static int __init __runtime_ioremap(efi_memory_desc_t *md)
+{
+   u64 end, systab, start_pfn, end_pfn;
+   unsigned long size;
+   void *va;
+
+   size  = md->num_pages << EFI_PAGE_SHIFT;
+   end   = md->phys_addr + size;
+   start_pfn = PFN_DOWN(md->phys_addr);
+   end_pfn   = PFN_UP(end);
+
+   if (pfn_range_is_mapped(start_pfn, end_pfn)) {
+   va = __va(md->phys_addr);
+
+   if (!(md->attribute & EFI_MEMORY_WB))
+   efi_memory_uc((u64)(unsigned long)va, size);
+   } else
+   va = efi_ioremap(md->phys_addr, size, md->type, md->attribute);
+
+   md->virt_addr = (u64) (unsigned long) va;
+   if (!va) {
+   pr_err("ioremap of 0x%llX failed!\n",
+   (unsigned long long)md->phys_addr);
+   return 1;
+   }
+
+   systab = (u64) (unsigned long) efi_phys.systab;
+   if (md->phys_addr <= systab && systab < end) {
+   systab += md->virt_addr - md->phys_addr;
+   efi.systab = (efi_system_table_t *) (unsigned long) systab;
+   }
+
+   return 0;
+}
+
 /*
  * This function will switch the EFI runtime services to virtual mode.
  * Essentially, look through the EFI memmap and map every region that
@@ -964,11 +1040,11 @@ void efi_memory_uc(u64 addr, unsigned long size)
  */
 void __init efi_enter_virtual_mode(void)
 {
+   pgd_t *pgd = (pgd_t *)__va(real_mode_header->trampoline_pgd);
efi_memory_desc_t *md, *prev_md = NULL;
efi_status_t status

Re: [PATCH 0/4] EFI 1:1 mapping

2013-06-03 Thread Borislav Petkov
On Sun, Jun 02, 2013 at 11:56:20PM +0100, Matthew Garrett wrote:
 I've just run Windows 8 under a hacked up copy of OVMF that dumps
 the data passed to SetVirtualAddressMap. It seems that Windows *is*
 mapping the runtime services to higher addresses - so presumably the
 1:1 mapping is in addition to the virtual mapping.

But but, once we call SetVirtualAddressMap with the set of addresses of
the runtime services, only those can be used after, right? If so, we
can't have both (this is at least my understanding)...

Thanks.


Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

2013-06-06 Thread Borislav Petkov
On Thu, Jun 06, 2013 at 06:50:52PM +0100, Matthew Garrett wrote:
 On Thu, Jun 06, 2013 at 03:26:03PM +0200, Borislav Petkov wrote:
 
  This would break the Macs, remember?
 
 I think the Macs will be fine as long as we're passing the high mappings 
 into SetVirtualAddressMap().

Right, on those we'll fall back to the current mappings and simply not
have the 1:1 thing.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 4/4] x86, efi: Map runtime services 1:1

2013-06-06 Thread Borislav Petkov
On Thu, Jun 06, 2013 at 12:28:20PM -0700, H. Peter Anvin wrote:
 Or we could materialize mappings for this specific PGD. However,
 adding a read of %cr3 in __do_page_fault sounds expensive.

Yes, I think we want to make sure all mappings are there when we do an
EFI runtime call so that we never #PF while it executes.

Matt mentioned on IRC that it could be that his EFI runtime is
referencing EFI_RESERVED area which we don't map. However, we need to
confirm/disprove that first, as it is currently only a hunch.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

2013-06-06 Thread Borislav Petkov
On Thu, Jun 06, 2013 at 08:35:48PM +0100, Matthew Garrett wrote:
 No, I think that's the wrong thing to do. We should set up the current
 mappings and the 1:1 mappings, and pass the current mappings through
 SetVirtualAddressMap(). That matches the behaviour of Windows.

And when do we use the 1:1 mappings and when the current mappings when
doing runtime calls?

Also, would the 1:1 mappings even work if not passed through
SetVirtualAddressMap? I'm sensing a yes but I don't know...

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

2013-06-06 Thread Borislav Petkov
On Thu, Jun 06, 2013 at 08:54:50PM +0100, Matthew Garrett wrote:
 We want both to be available when we're making the call, but I think
 we should probably enter via the high addresses. The only reason we're
 doing this at all is that some systems don't update all of their
 pointers from physical mode, and we'd prefer them to work rather than
 fault...

Actually, we do the 1:1 thing so that EFI runtime works in a kexec
kernel too. Which won't work if we use the high addresses.

However, if we can use the 1:1 map *after* SetVirtualAddressMap() has
been called with the high mappings, then my issue is solved - we drop
to using 1:1 in the kexec kernel only. But I don't think that is the
case...
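
A toy model of the "both mappings available during the call" idea quoted above, to make the two cases explicit. It only answers one question: would an access at a given address hit a mapping in the page table used for runtime calls, or fault? The structs and names are invented for this sketch, and it says nothing about whether real firmware tolerates being entered via the 1:1 addresses after SetVirtualAddressMap() has been called with the high ones, which is exactly the open question.

```c
#include <stdbool.h>
#include <stdint.h>

/* One contiguous virtual range, for illustration only. */
struct range_sketch {
	uint64_t start;
	uint64_t size;
};

/* Hypothetical view of the page table used while in an EFI runtime call. */
struct efi_pgd_sketch {
	struct range_sketch high;	/* mapping passed to SetVirtualAddressMap() */
	struct range_sketch identity;	/* 1:1 mapping, virt == phys */
	bool has_identity;		/* is the 1:1 mapping installed at all? */
};

static bool in_range(const struct range_sketch *r, uint64_t addr)
{
	return addr >= r->start && addr - r->start < r->size;
}

/* True if an access at addr would be backed by a mapping, false if it would fault. */
static bool access_is_mapped(const struct efi_pgd_sketch *pgd, uint64_t addr)
{
	if (in_range(&pgd->high, addr))
		return true;

	return pgd->has_identity && in_range(&pgd->identity, addr);
}
```

In this model a stale physical pointer that the firmware never converted only resolves when the 1:1 range is installed alongside the high one.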

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

2013-06-06 Thread Borislav Petkov
On Thu, Jun 06, 2013 at 09:18:28PM +0100, Matthew Garrett wrote:
 kexec seems like a lower priority than compatibility. Perhaps keep the
 efi argument for people who want to use kexec?

This is what I currently have in the code: if you boot with efi=1:1_map,
you get them.

 hpa suggested allocating a fixed high area for UEFI mappings, which
 would also solve this.

I guess we can do that too.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

2013-06-06 Thread Borislav Petkov
On Thu, Jun 06, 2013 at 09:30:57PM +0100, Matthew Garrett wrote:
 Well, we want the 1:1 mappings to exist all the time. The only
 thing the option should change is whether they're passed to
 SetVirtualAddressMap() or not.

But can you call them even if they haven't been passed through
SetVirtualAddressMap, *after* SetVirtualAddressMap has been called?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 3/4] x86, efi: Add an efi= kernel command line parameter

2013-06-06 Thread Borislav Petkov
On Thu, Jun 06, 2013 at 09:50:57PM +0100, Matthew Garrett wrote:
 What do you mean by "call them"? I don't think we ever want to call by
 physical address, other than maybe in the kexec case. The only reason
 we really care about the physical addresses being mapped 1:1 is that
 some pointers may not have been updated.

I want to be able to call the runtime services in the kexec kernel.
Which means, the kexec kernel would simply map the runtime code/data
regions 1:1 and then use the physical addresses to call the runtime
services.

Question is: would that work even if SetVirtualAddressMap has already
run in the original kernel and with virtual addresses?
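To make the constraint behind this question concrete, here is a minimal
userspace sketch of the "one time only" rule from the UEFI wording quoted
earlier in this archive. The struct and function names are hypothetical
mock-ups, not kernel or firmware code, and the sketch says nothing about
whether physical-address calls still work afterwards, which is the open
question here.

```c
#include <stdbool.h>
#include <stdint.h>

/* UEFI status codes on 64-bit platforms: the top bit marks an error. */
#define EFI_SUCCESS      0ULL
#define EFI_UNSUPPORTED  (0x8000000000000000ULL | 3)

/* Hypothetical mock of the firmware's runtime state. */
struct mock_firmware {
	bool virtual_mode;	/* set once a virtual address map was applied */
};

/*
 * Mock of SetVirtualAddressMap(): the first call switches the runtime
 * into virtual mode, every later call is refused.
 */
static uint64_t mock_set_virtual_address_map(struct mock_firmware *fw)
{
	if (fw->virtual_mode)
		return EFI_UNSUPPORTED;

	fw->virtual_mode = true;
	return EFI_SUCCESS;
}
```

A kexec'ed kernel inherits firmware that is already in virtual mode, so
in this model its own call would get EFI_UNSUPPORTED; it has to live with
whatever map the first kernel installed.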

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 4/4] x86, efi: Map runtime services 1:1

2013-06-10 Thread Borislav Petkov
On Thu, Jun 06, 2013 at 12:38:10PM -0700, H. Peter Anvin wrote:
 On 06/06/2013 12:36 PM, Borislav Petkov wrote:
  On Thu, Jun 06, 2013 at 12:28:20PM -0700, H. Peter Anvin wrote:
  Or we could materialize mappings for this specific PGD. However,
  adding a read of %cr3 in __do_page_fault sounds expensive.
  
  Yes, I think we want to make sure all mappings are there when we do an
  EFI runtime call so that we never #PF while it executes.
  
  Matt mentioned on IRC that it could be that his EFI runtime is
  referencing EFI_RESERVED area which we don't map. However, we need to
  confirm/disprove that first, as it is currently only a hunch.

FWIW,

booting the patchset on my Dell looks good here. Booting at least, I
don't know about other stuff. If you have an EFI test suite or want me
to try stuff out, let me know. efibootmgr output looks sane too.

Btw, I've added a printk to the code so that we know that we've managed
switching to the 1:1 thing:

[0.073119] efi: Using 1:1 map.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 1/4] efi: Convert runtime services function ptrs

2013-06-11 Thread Borislav Petkov
On Tue, Jun 11, 2013 at 07:49:12AM +0100, Matt Fleming wrote:
 OK, I chickened out of sending this in my latest pull request
 after reading Linus' -rc5 email about him not wanting to see any
 non-critical changes. I've stuck it in the 'next' branch with the rest
 of the stuff for v3.11.

Yep, did the same with a couple of edac patches too - cleanups can wait.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


[PATCH -v2 0/4] EFI 1:1 mapping

2013-06-17 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Hi all,

this is just a snapshot of the current state of affairs. The patchset
starts to boot successfully on real hardware now but we're far away from
the coverage we'd like to have before we even consider upstreaming it.

And yes, considering the sick f*ck EFI is, we're keeping the 1:1 mapping
optional and off by default (you need to boot with efi=1:1_map to
enable it).

Matt has picked up 1/4 already so I'll drop it when it lands into -tip
and so on...

Thanks for any suggestions, as always.

Borislav Petkov (4):
  efi: Convert runtime services function ptrs
  x86, cpa: Map in an arbitrary pgd
  x86, efi: Add an efi= kernel command line parameter
  x86, efi: Map runtime services 1:1

 arch/x86/boot/compressed/eboot.c |   2 +-
 arch/x86/include/asm/efi.h   |  81 ++-
 arch/x86/include/asm/pgtable_types.h |   3 +-
 arch/x86/mm/pageattr.c   |  82 
 arch/x86/platform/efi/efi.c  | 184 +--
 arch/x86/platform/efi/efi_stub_64.S  |  56 +++
 include/linux/efi.h  |  28 +++---
 7 files changed, 348 insertions(+), 88 deletions(-)

-- 
1.8.3



[PATCH -v2 3/4] x86, efi: Add an efi= kernel command line parameter

2013-06-17 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

.. for passing miscellaneous options from the kernel command line.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/platform/efi/efi.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 82089d8b1954..5af5b97bf203 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -88,6 +88,11 @@ static u64 active_size;
 
 unsigned long x86_efi_facility;
 
+ /* 1:1 mapping of services regions */
+#define EFI_CFG_MAP11  BIT(0)
+
+static unsigned long efi_config;
+
 /*
  * Returns 1 if 'facility' is enabled, 0 otherwise.
  */
@@ -1167,3 +1172,17 @@ efi_status_t efi_query_variable_store(u32 attributes, unsigned long size)
return EFI_SUCCESS;
 }
 EXPORT_SYMBOL_GPL(efi_query_variable_store);
+
+static int __init parse_efi_cmdline(char *str)
+{
+   if (*str == '=')
+   str++;
+
+#ifdef CONFIG_X86_64
+   if (!strncmp(str, "1:1_map", 7))
+   efi_config |= EFI_CFG_MAP11;
+#endif
+
+   return 0;
+}
+early_param("efi", parse_efi_cmdline);
-- 
1.8.3
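
The option parsing in this patch can be tried out in userspace. What
follows is a minimal sketch, assuming the same "skip a leading '=' and
prefix-match" logic; parse_efi_option() is a hypothetical name and it
returns the flags instead of updating a static variable as the patch does.

```c
#include <string.h>

/* Mirrors the patch's EFI_CFG_MAP11 bit (BIT(0)). */
#define EFI_CFG_MAP11 (1UL << 0)

/*
 * Parse the value part of an "efi=" argument and return the resulting
 * configuration bits. A leading '=' is skipped, as in the patch.
 */
static unsigned long parse_efi_option(const char *str)
{
	unsigned long config = 0;

	if (*str == '=')
		str++;

	/* strlen("1:1_map") == 7, so this is a prefix match. */
	if (!strncmp(str, "1:1_map", 7))
		config |= EFI_CFG_MAP11;

	return config;
}
```

Because strncmp() only compares the first seven characters, a longer
string such as "1:1_mapping" would also set the bit. That may or may not
be intended; it is simply what the comparison does.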



[PATCH -v2 1/4] efi: Convert runtime services function ptrs

2013-06-17 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

... to void * like the boot services and lose all the void * casts. No
functionality change.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/boot/compressed/eboot.c |  2 +-
 arch/x86/include/asm/efi.h   | 28 ++--
 include/linux/efi.h  | 28 ++--
 3 files changed, 29 insertions(+), 29 deletions(-)

diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index 35ee62fccf98..4060c8daf05e 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -266,7 +266,7 @@ static efi_status_t setup_efi_vars(struct boot_params *params)
while (data && data->next)
data = (struct setup_data *)(unsigned long)data->next;
 
-   status = efi_call_phys4((void *)sys_table->runtime->query_variable_info,
+   status = efi_call_phys4(sys_table->runtime->query_variable_info,
EFI_VARIABLE_NON_VOLATILE |
EFI_VARIABLE_BOOTSERVICE_ACCESS |
EFI_VARIABLE_RUNTIME_ACCESS, store_size,
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 2fb5d5884e23..5b33686b6995 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -52,40 +52,40 @@ extern u64 efi_call6(void *fp, u64 arg1, u64 arg2, u64 arg3,
 u64 arg4, u64 arg5, u64 arg6);
 
 #define efi_call_phys0(f)  \
-   efi_call0((void *)(f))
+   efi_call0((f))
 #define efi_call_phys1(f, a1)  \
-   efi_call1((void *)(f), (u64)(a1))
+   efi_call1((f), (u64)(a1))
 #define efi_call_phys2(f, a1, a2)  \
-   efi_call2((void *)(f), (u64)(a1), (u64)(a2))
+   efi_call2((f), (u64)(a1), (u64)(a2))
 #define efi_call_phys3(f, a1, a2, a3)  \
-   efi_call3((void *)(f), (u64)(a1), (u64)(a2), (u64)(a3))
+   efi_call3((f), (u64)(a1), (u64)(a2), (u64)(a3))
 #define efi_call_phys4(f, a1, a2, a3, a4)  \
-   efi_call4((void *)(f), (u64)(a1), (u64)(a2), (u64)(a3), \
+   efi_call4((f), (u64)(a1), (u64)(a2), (u64)(a3), \
  (u64)(a4))
 #define efi_call_phys5(f, a1, a2, a3, a4, a5)  \
-   efi_call5((void *)(f), (u64)(a1), (u64)(a2), (u64)(a3), \
+   efi_call5((f), (u64)(a1), (u64)(a2), (u64)(a3), \
  (u64)(a4), (u64)(a5))
 #define efi_call_phys6(f, a1, a2, a3, a4, a5, a6)  \
-   efi_call6((void *)(f), (u64)(a1), (u64)(a2), (u64)(a3), \
+   efi_call6((f), (u64)(a1), (u64)(a2), (u64)(a3), \
  (u64)(a4), (u64)(a5), (u64)(a6))
 
 #define efi_call_virt0(f)  \
-   efi_call0((void *)(efi.systab->runtime->f))
+   efi_call0((efi.systab->runtime->f))
 #define efi_call_virt1(f, a1)  \
-   efi_call1((void *)(efi.systab->runtime->f), (u64)(a1))
+   efi_call1((efi.systab->runtime->f), (u64)(a1))
 #define efi_call_virt2(f, a1, a2)  \
-   efi_call2((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2))
+   efi_call2((efi.systab->runtime->f), (u64)(a1), (u64)(a2))
 #define efi_call_virt3(f, a1, a2, a3)  \
-   efi_call3((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
+   efi_call3((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
  (u64)(a3))
 #define efi_call_virt4(f, a1, a2, a3, a4)  \
-   efi_call4((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
+   efi_call4((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
  (u64)(a3), (u64)(a4))
 #define efi_call_virt5(f, a1, a2, a3, a4, a5)  \
-   efi_call5((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
+   efi_call5((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
  (u64)(a3), (u64)(a4), (u64)(a5))
 #define efi_call_virt6(f, a1, a2, a3, a4, a5, a6)  \
-   efi_call6((void *)(efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
+   efi_call6((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
  (u64)(a3), (u64)(a4), (u64)(a5), (u64)(a6))
 
 extern void __iomem *efi_ioremap(unsigned long addr, unsigned long size,
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 2bc0ad78d058..21ae6b3c0359 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -287,20 +287,20 @@ typedef struct {
 
 typedef struct {
efi_table_hdr_t hdr;
-   unsigned long get_time;
-   unsigned long set_time;
-   unsigned long get_wakeup_time;
-   unsigned long set_wakeup_time;
-   unsigned long set_virtual_address_map;
-   unsigned long convert_pointer;
-   unsigned long get_variable;
-   unsigned long get_next_variable

[PATCH -v2 2/4] x86, cpa: Map in an arbitrary pgd

2013-06-17 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Add the ability to map pages in an arbitrary pgd.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/include/asm/pgtable_types.h |  3 +-
 arch/x86/mm/pageattr.c   | 82 
 2 files changed, 67 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index e6423002c10b..0613e147f083 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -352,7 +352,8 @@ static inline void update_page_count(int level, unsigned long pages) { }
  */
 extern pte_t *lookup_address(unsigned long address, unsigned int *level);
 extern phys_addr_t slow_virt_to_phys(void *__address);
-
+extern void kernel_map_pages_in_pgd(pgd_t *pgd, unsigned long address,
+   unsigned numpages, unsigned long page_flags);
 #endif /* !__ASSEMBLY__ */
 
 #endif /* _ASM_X86_PGTABLE_DEFS_H */
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index bb32480c2d71..b770b334c97e 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -30,6 +30,7 @@
  */
 struct cpa_data {
unsigned long   *vaddr;
+   pgd_t   *pgd;
pgprot_t mask_set;
pgprot_t mask_clr;
int numpages;
@@ -322,17 +323,9 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
return prot;
 }
 
-/*
- * Lookup the page table entry for a virtual address. Return a pointer
- * to the entry and the level of the mapping.
- *
- * Note: We return pud and pmd either when the entry is marked large
- * or when the present bit is not set. Otherwise we would return a
- * pointer to a nonexisting mapping.
- */
-pte_t *lookup_address(unsigned long address, unsigned int *level)
+static pte_t *
+__lookup_address_in_pgd(pgd_t *pgd, unsigned long address, unsigned int *level)
 {
-   pgd_t *pgd = pgd_offset_k(address);
pud_t *pud;
pmd_t *pmd;
 
@@ -361,8 +354,30 @@ pte_t *lookup_address(unsigned long address, unsigned int *level)
 
return pte_offset_kernel(pmd, address);
 }
+
+/*
+ * Lookup the page table entry for a virtual address. Return a pointer
+ * to the entry and the level of the mapping.
+ *
+ * Note: We return pud and pmd either when the entry is marked large
+ * or when the present bit is not set. Otherwise we would return a
+ * pointer to a nonexisting mapping.
+ */
+pte_t *lookup_address(unsigned long address, unsigned int *level)
+{
+   return __lookup_address_in_pgd(pgd_offset_k(address), address, level);
+}
 EXPORT_SYMBOL_GPL(lookup_address);
 
+pte_t *_lookup_address_cpa(struct cpa_data *cpa, unsigned long address,
+ unsigned int *level)
+{
+   if (cpa->pgd)
+   return __lookup_address_in_pgd(cpa->pgd, address, level);
+
+   return lookup_address(address, level);
+}
+
 /*
  * This is necessary because __pa() does not work on some
  * kinds of memory, like vmalloc() or the alloc_remap()
@@ -437,7 +452,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 * Check for races, another CPU might have split this page
 * up already:
 */
-   tmp = lookup_address(address, &level);
+   tmp = _lookup_address_cpa(cpa, address, &level);
if (tmp != kpte)
goto out_unlock;
 
@@ -543,7 +558,8 @@ out_unlock:
 }
 
 static int
-__split_large_page(pte_t *kpte, unsigned long address, struct page *base)
+__split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
+  struct page *base)
 {
pte_t *pbase = (pte_t *)page_address(base);
unsigned long pfn, pfninc = 1;
@@ -556,7 +572,7 @@ __split_large_page(pte_t *kpte, unsigned long address, struct page *base)
 * Check for races, another CPU might have split this page
 * up for us already:
 */
-   tmp = lookup_address(address, &level);
+   tmp = _lookup_address_cpa(cpa, address, &level);
if (tmp != kpte) {
spin_unlock(&pgd_lock);
return 1;
@@ -632,7 +648,8 @@ __split_large_page(pte_t *kpte, unsigned long address, struct page *base)
return 0;
 }
 
-static int split_large_page(pte_t *kpte, unsigned long address)
+static int split_large_page(struct cpa_data *cpa, pte_t *kpte,
+   unsigned long address)
 {
struct page *base;
 
@@ -644,7 +661,7 @@ static int split_large_page(pte_t *kpte, unsigned long address)
if (!base)
return -ENOMEM;
 
-   if (__split_large_page(kpte, address, base))
+   if (__split_large_page(cpa, kpte, address, base))
__free_page(base);
 
return 0;
@@ -697,7 +714,10 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
else
address = *cpa->vaddr;
 repeat:
-   kpte = lookup_address(address, &level);
+   if (cpa->pgd
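
The pattern this patch introduces ("use the caller's pgd if one was
given, else fall back to the kernel's") can be shown with a toy model.
The sketch below is plain userspace C with a made-up one-level table; all
toy_* names are hypothetical and only mirror the shape of
_lookup_address_cpa() above, not real page-table walking.

```c
#include <stddef.h>

/* Toy one-level "page table": 16 slots indexed by address bits 12..15. */
struct toy_pgd {
	int slot[16];
};

/* Stands in for the kernel's default table (init_mm.pgd). */
static struct toy_pgd kernel_pgd;

/* Like struct cpa_data: a NULL pgd means "use the default table". */
struct toy_cpa {
	struct toy_pgd *pgd;
};

static int *lookup_in_pgd(struct toy_pgd *pgd, unsigned long address)
{
	return &pgd->slot[(address >> 12) & 0xf];
}

/* Existing entry point keeps its behaviour: always the default table. */
static int *lookup(unsigned long address)
{
	return lookup_in_pgd(&kernel_pgd, address);
}

/* New entry point: honour an explicit table when the caller set one. */
static int *lookup_cpa(struct toy_cpa *cpa, unsigned long address)
{
	if (cpa->pgd)
		return lookup_in_pgd(cpa->pgd, address);

	return lookup(address);
}
```

Existing callers that never set the pgd field keep working unchanged,
which is presumably why the patch adds a field to struct cpa_data rather
than changing the signature of lookup_address().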

[PATCH -v2 4/4] x86, efi: Map runtime services 1:1

2013-06-17 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Due to the braindead design of EFI, we cannot map runtime services more
than once for the duration of a booted system. Thus, if we want to use
EFI runtime services in a kexec'ed kernel, maybe the only possible and
sensible approach would be to map them 1:1 so that when the kexec kernel
loads, it can simply call those addresses without the need for remapping
(which doesn't work anyway).

Furthermore, this mapping approach could be of help with b0rked EFI
implementations for a different set of reasons.

This implementation is 64-bit only for now.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/include/asm/efi.h  |  67 +++
 arch/x86/platform/efi/efi.c | 165 +---
 arch/x86/platform/efi/efi_stub_64.S |  56 
 3 files changed, 240 insertions(+), 48 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 5b33686b6995..3adeef4a0064 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -39,8 +39,13 @@ extern unsigned long asmlinkage efi_call_phys(void *, ...);
 
 #else /* !CONFIG_X86_32 */
 
+#include <linux/sched.h>
+
 #define EFI_LOADER_SIGNATURE   "EL64"
 
+extern pgd_t *efi_pgt;
+extern bool efi_use_11_map;
+
 extern u64 efi_call0(void *fp);
 extern u64 efi_call1(void *fp, u64 arg1);
 extern u64 efi_call2(void *fp, u64 arg1, u64 arg2);
@@ -51,6 +56,22 @@ extern u64 efi_call5(void *fp, u64 arg1, u64 arg2, u64 arg3,
 extern u64 efi_call6(void *fp, u64 arg1, u64 arg2, u64 arg3,
 u64 arg4, u64 arg5, u64 arg6);
 
+/*
+ * map-in low kernel mapping for passing arguments to EFI functions.
+ */
+static inline void efi_sync_low_kernel_mappings(void)
+{
+   unsigned num_pgds;
+   pgd_t *pgd;
+
+   pgd = (pgd_t *)__va(real_mode_header->trampoline_pgd);
+   num_pgds = pgd_index(VMALLOC_START - 1) - pgd_index(PAGE_OFFSET);
+
+   memcpy(pgd + pgd_index(PAGE_OFFSET),
+   init_mm.pgd + pgd_index(PAGE_OFFSET),
+   sizeof(pgd_t) * num_pgds);
+}
+
 #define efi_call_phys0(f)  \
efi_call0((f))
 #define efi_call_phys1(f, a1)  \
@@ -69,24 +90,36 @@ extern u64 efi_call6(void *fp, u64 arg1, u64 arg2, u64 arg3,
efi_call6((f), (u64)(a1), (u64)(a2), (u64)(a3), \
  (u64)(a4), (u64)(a5), (u64)(a6))
 
+#define _efi_call_virtX(x, f, ...) \
+({ \
+   efi_status_t __s;   \
+   \
+   if (efi_use_11_map) {   \
+   efi_sync_low_kernel_mappings(); \
+   preempt_disable();  \
+   }   \
+   \
+   __s = efi_call##x(efi.systab->runtime->f, __VA_ARGS__); \
+   \
+   if (efi_use_11_map) \
+   preempt_enable();   \
+   __s;\
+})
+
 #define efi_call_virt0(f)  \
-   efi_call0((efi.systab->runtime->f))
-#define efi_call_virt1(f, a1)  \
-   efi_call1((efi.systab->runtime->f), (u64)(a1))
-#define efi_call_virt2(f, a1, a2)  \
-   efi_call2((efi.systab->runtime->f), (u64)(a1), (u64)(a2))
-#define efi_call_virt3(f, a1, a2, a3)  \
-   efi_call3((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
- (u64)(a3))
-#define efi_call_virt4(f, a1, a2, a3, a4)  \
-   efi_call4((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
- (u64)(a3), (u64)(a4))
-#define efi_call_virt5(f, a1, a2, a3, a4, a5)  \
-   efi_call5((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
- (u64)(a3), (u64)(a4), (u64)(a5))
-#define efi_call_virt6(f, a1, a2, a3, a4, a5, a6)  \
-   efi_call6((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
- (u64)(a3), (u64)(a4), (u64)(a5), (u64)(a6))
+   _efi_call_virtX(0, f)
+#define efi_call_virt1(f, a1)  \
+   _efi_call_virtX(1, f, (u64)(a1))
+#define efi_call_virt2(f, a1, a2)  \
+   _efi_call_virtX(2, f, (u64)(a1), (u64)(a2))
+#define efi_call_virt3(f, a1, a2, a3)  \
+   _efi_call_virtX(3, f, (u64)(a1), (u64)(a2), (u64)(a3))
+#define efi_call_virt4(f, a1, a2, a3, a4)  \
+   _efi_call_virtX(4, f
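
The _efi_call_virtX() helper above folds seven near-identical macros into
one by pasting the argument count onto the callee name. Below is a
minimal userspace sketch of that technique, assuming GNU C (statement
expressions and ##__VA_ARGS__). The toy_* names are hypothetical, and a
plain counter stands in for preempt_disable()/preempt_enable().

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t efi_status_t;

/* Hypothetical stand-ins for the arity-specific efi_callN() stubs. */
static efi_status_t toy_call0(void *fp)
{
	(void)fp;
	return 0;
}

static efi_status_t toy_call1(void *fp, uint64_t a1)
{
	(void)fp;
	return a1;
}

static efi_status_t toy_call2(void *fp, uint64_t a1, uint64_t a2)
{
	(void)fp;
	return a1 + a2;
}

static int toy_depth;	/* stands in for the preempt count */

/*
 * toy_call##x picks the stub by arity. ##__VA_ARGS__ (a GNU extension)
 * drops the preceding comma when no extra arguments are passed, which
 * the x == 0 case needs.
 */
#define toy_call_virtX(x, f, ...)				\
({								\
	efi_status_t __s;					\
								\
	toy_depth++;						\
	__s = toy_call##x((f), ##__VA_ARGS__);			\
	toy_depth--;						\
	__s;							\
})
```

For example, toy_call_virtX(2, NULL, 2, 3) expands to a call of
toy_call2(NULL, 2, 3) wrapped in the enter/leave bookkeeping. The sketch
uses ##__VA_ARGS__ rather than a bare __VA_ARGS__ so that the
zero-argument form does not leave a trailing comma behind.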

Re: [PATCH -v2 0/4] EFI 1:1 mapping

2013-06-19 Thread Borislav Petkov
On Wed, Jun 19, 2013 at 02:52:43PM +0200, Ingo Molnar wrote:
 I hope making it a weird boot option is not the end plan, there's
 little point in _not_ enabling 1:1 mappings by default eventually:
 the 1:1 mapping is supposed to emulate a Windows compatible EFI
 environment better and is expected to work around certain EFI runtime
 crashes.

And yet there are the Macs which reportedly cannot stomach this.

And then there's the issue where some boxes cannot boot through the EFI
stub with those patches even without efi=1:1_map on the command line.
The issue has something to do with the cmpb $0, efi_use_11_map in the
efi_callX stubs.

And then again, other boxes have no problem with it and boot perfectly
fine.

So I don't know - it all looks like a weird boot, opt-in option for now.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2 0/4] EFI 1:1 mapping

2013-06-19 Thread Borislav Petkov
On Wed, Jun 19, 2013 at 03:04:34PM +0200, Ingo Molnar wrote:
 Do we know why?

Well, according to mjg59 some Macs break if we don't give them a map
which uses high addresses.

I can imagine flipping the meaning of this option to be on by default
and efi=no_11_map to disable the 1:1 map for those Macs.

 A bug I suspect?

Probably. The problem is, it is very hard to debug the boot stub that
early. And of course, I can't reproduce it in qemu :(. If only I had a
hardware debugger...

 But once it works reliably we can enable it, right?

It's all the same to me - I hate EFI with passion so whatever people
agree upon, I'll do it.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2 0/4] EFI 1:1 mapping

2013-06-19 Thread Borislav Petkov
On Wed, Jun 19, 2013 at 05:08:04PM +0100, Matthew Garrett wrote:
 But, as always, the only reliable thing to do here is to behave as
 much like Windows as possible. Which means performing the 1:1 mapping
 but maintaining the high mapping, and passing the high values via
 SetVirtualAddressMap.

We can't pass the high values via SetVirtualAddressMap and have EFI
runtime in the kexec-ed kernel, as you and I established last week. And
since not all would want EFI runtime in the kexec-ed kernel, I'm leaning
more towards a boot-time option which enables the 1:1 mapping.

Btw, why would you even want the 1:1 mappings if we pass the high values
via SetVirtualAddressMap?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2 0/4] EFI 1:1 mapping

2013-06-19 Thread Borislav Petkov
On Wed, Jun 19, 2013 at 05:48:22PM +0100, Matthew Garrett wrote:
  Ok, so it sounds like we want to *always* create both mappings but,
  depending on what we want, to shove down SetVirtualAddressMap a
  different set. And the 1:1 map will be the optional one which we give
  SetVirtualAddressMap only when user wants it, i.e. when booting with
  efi=1:1_map.
 
 Yup, I think that sounds ideal.

Crap, I got completely sidetracked. The 1:1 mappings go in a different
pagetable (real_mode_header->trampoline_pgd) than the kernel one
(i.e. init_mm.pgd). However, the ->trampoline_pgd has all mappings
anyway, which means that if we want to do EFI runtime calls with the
high mappings but *also* have the 1:1 mappings established, we should
*always* switch to that pagetable when doing those calls.

hpa, MattF, agreed?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2 0/4] EFI 1:1 mapping

2013-06-19 Thread Borislav Petkov
On Wed, Jun 19, 2013 at 12:25:42PM -0500, H. Peter Anvin wrote:
 On 06/19/2013 08:02 AM, Borislav Petkov wrote:
  
  And yet there are the Macs which reportedly cannot stomach this.
  
 No, the reports are that if you use the 1:1 map as the primary address
 on Macs the drivers fail... not that you can't have a 1:1 map.

That's what I meant: ... cannot stomach when the 1:1 map is shoved down
SetVirtualAddressMap.

The thing is, if we want to have both the 1:1 map and the high map
during an EFI runtime call, we would need to *always* switch the
pagetable for an EFI runtime call and establish both mappings in
->trampoline_pgd beforehand.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2 0/4] EFI 1:1 mapping

2013-06-19 Thread Borislav Petkov
On Wed, Jun 19, 2013 at 12:38:24PM -0500, H. Peter Anvin wrote:
 I thought that was the plan?

Well, currently if I'm booted with efi=1:1_map I'm creating only the
1:1 mapping in ->trampoline_pgd and switching the pagetable only then.
Otherwise, I'm using the high, ioremapped mappings - i.e., what we have
now.

I guess I can sync the kernel address space into ->trampoline_pgd after
having created the 1:1 mappings and always switch the pagetable later,
after we've done SetVirtualAddressMap.
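
For a feel of how much gets copied by such a sync, here is a small
userspace sketch of the index arithmetic used by
efi_sync_low_kernel_mappings() in patch 4/4. The constants are
assumptions (x86-64 with 4-level paging and the direct-mapping and
vmalloc bases of kernels from that era); they are not taken from the
patch itself.

```c
#include <stdint.h>

/* Assumed x86-64 4-level paging constants (not taken from the patch). */
#define PGDIR_SHIFT	39
#define PTRS_PER_PGD	512
#define PAGE_OFFSET	0xffff880000000000ULL	/* start of direct mapping */
#define VMALLOC_START	0xffffc90000000000ULL

/* Index of the top-level (PGD) entry covering a virtual address. */
static unsigned int pgd_index(uint64_t address)
{
	return (address >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
}

/* Number of top-level entries the patch's memcpy() would copy. */
static unsigned int low_kernel_pgds(void)
{
	return pgd_index(VMALLOC_START - 1) - pgd_index(PAGE_OFFSET);
}
```

With these assumed values, pgd_index(PAGE_OFFSET) is 272,
pgd_index(VMALLOC_START - 1) is 401, and the copy covers 129 entries of
8 bytes each, so the sync is a memcpy() of roughly 1 KB per call.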

Which should take care of the EFI boot stub issue too, as I can define
another set of efi_callX which switch the pagetable unconditionally.

Let me see how that pans out...

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2 0/4] EFI 1:1 mapping

2013-06-20 Thread Borislav Petkov
On Thu, Jun 20, 2013 at 11:22:37AM +0200, Ingo Molnar wrote:
   Cool - and supposedly this will work in a Mac environment as well? Would 
   be very nice to avoid fundamentally fragile system specific quirks for 
   something as fundamental as the EFI runtime memory mapping model ...
  
  Apple is the only case where I'd expect there to be an issue, since they 
  only started supporting booting Windows via UEFI on very recent systems. 
  However, unless they're actually sniffing the page tables on UEFI entry, 
  I can't see any way that this could break things???
 
 Agreed - I was surprised to see that the runtime was able to _break_ in
 any way due to 1:1: my assumption was that it can only get better.
 
 But I did not realize that the 1:1 boot flag also changed what was passed 
 down, which probably explains the breakages.

Right, in the next version, the boot flag will influence only what's
being passed down.

 I'd even argue to not do this whole boot flag thing at all - just 
 standardize on the Windows compatibility model as closely as possible.

This will break the Macs so maybe we can do

efi=no_11_map

so the Macs can still boot but use the 1:1 map by default.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2 0/4] EFI 1:1 mapping

2013-06-20 Thread Borislav Petkov
On Thu, Jun 20, 2013 at 05:54:26PM +0100, Matthew Garrett wrote:
 On Thu, Jun 20, 2013 at 09:46:15AM -0700, James Bottomley wrote:
 
  Unless you can think of the way out of this, we seem to have the stark
  choice of behave like windows or allow kexec.  For the server market,
  kexec wins, so either we find a way not to have to make the choice or we
  do something automatic to make it fairly painless.
 
 hpa suggested ensuring that UEFI regions are mapped at fixed high 
 offsets. Someone who cares about kexec should probably make that happen.

If we can detect the Macs, we can make this decision automatic. And
since no Mac boots windoze, a single DMI check of the sort if (Mac)
should suffice.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2 0/4] EFI 1:1 mapping

2013-06-20 Thread Borislav Petkov
On Thu, Jun 20, 2013 at 06:12:10PM +0100, Matthew Garrett wrote:
 On Thu, Jun 20, 2013 at 07:01:24PM +0200, Borislav Petkov wrote:
 
  If we can detect the Macs, we can make this decision automatic. And
  since no Mac boots windoze, a single DMI check of the sort if (Mac)
  should suffice.
 
 Yes, we can special-case Macs. But since our behaviour is then obviously 
 different to Windows, we'll inevitably break some other system.

Why different? We'll have the high mappings and shove the 1:1 mappings
down SetVirtualAddressMap by default.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2 0/4] EFI 1:1 mapping

2013-06-20 Thread Borislav Petkov
On Thu, Jun 20, 2013 at 07:10:15PM +0100, Matthew Garrett wrote:
 Because Windows passes high addresses to SetVirtualAddressMap(), and
 because if you can imagine firmware developers getting it wrong then
 firmware developers will have got it wrong.

Can we reversely assume that if we'd used fixed high offsets, as hpa
suggests, then it'll be fine? IOW, are any high addresses, even fixed
ones, fine?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2 0/4] EFI 1:1 mapping

2013-06-20 Thread Borislav Petkov
On Thu, Jun 20, 2013 at 07:17:31PM +0100, Matthew Garrett wrote:
 On Thu, Jun 20, 2013 at 08:14:45PM +0200, Borislav Petkov wrote:
  On Thu, Jun 20, 2013 at 07:10:15PM +0100, Matthew Garrett wrote:
   Because Windows passes high addresses to SetVirtualAddressMap(), and
   because if you can imagine firmware developers getting it wrong then
   firmware developers will have got it wrong.
  
  Can we reversely assume that if we'd used fixed high offsets, as hpa
  suggests, then it'll be fine? IOW, are any high addresses, even fixed
  ones, fine?
 
 Windows actually seems to start at the top of address space and go down 
 - this is what I get booting Windows 8 under kvm. It looks like very 
 high addresses are fine, and we're currently using low high addresses, 
 so I suspect we're fine pretty much anywhere in that range.
 
 ** SetVirtualAddressMap
 Type: 5
 Physical Start: 3E878000
 Virtual Start: FFBEB000
 Number Of Pages: 15
 Attributes: 800F
 Type: 6
 Physical Start: 3E88D000
 Virtual Start: FFBD6000
 Number Of Pages: 15
 Attributes: 800F
 Type: 5
 Physical Start: 3FB22000
 Virtual Start: FFBA6000
 Number Of Pages: 30
 Attributes: 800F
 Type: 6
 Physical Start: 3FB52000
 Virtual Start: FFB82000
 Number Of Pages: 24
 Attributes: 800F
 Type: 6
 Physical Start: 3FFE
 Virtual Start: FFB62000
 Number Of Pages: 20

I guess we can do a top-down allocation, starting from the highest
virtual addresses:

EFI_HIGHEST_ADDRESS
|
| size1
|
-- region1
|
| size2
|
-- region2

...

and we make EFI_HIGHEST_ADDRESS be the same absolute number on every
system.
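
A minimal userspace sketch of the top-down scheme drawn above, assuming
4 KB pages. The EFI_HIGHEST_ADDRESS value is a placeholder picked for
illustration only (the thread does not settle on one), and struct
efi_region is a hypothetical stand-in for a memory map entry.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Hypothetical fixed ceiling; the thread does not settle on a value. */
#define EFI_HIGHEST_ADDRESS 0xffffffff00000000ULL

struct efi_region {
	uint64_t num_pages;	/* input: size of the region in pages */
	uint64_t virt_start;	/* output: assigned virtual address */
};

/*
 * Assign virtual addresses top-down, each region directly below the
 * previous one. Returns the lowest address handed out.
 */
static uint64_t efi_assign_top_down(struct efi_region *r, size_t n)
{
	uint64_t va = EFI_HIGHEST_ADDRESS;
	size_t i;

	for (i = 0; i < n; i++) {
		va -= r[i].num_pages << PAGE_SHIFT;
		r[i].virt_start = va;
	}

	return va;
}
```

The result depends only on the ceiling and on the order and sizes of the
regions. If a kexec'ed kernel sees the same memory map and uses the same
ceiling, it would compute the same addresses without any state being
handed over, which seems to be the point of fixing EFI_HIGHEST_ADDRESS.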

hpa, is this close to what you had in mind? It would be prudent to
verify whether this will suit well with the kexec virtual space layout
though...

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2 0/4] EFI 1:1 mapping

2013-06-21 Thread Borislav Petkov
On Fri, Jun 21, 2013 at 03:05:30AM -0700, H. Peter Anvin wrote:
 If you cap it you are basically imposing a constraint on the firmware
 and may not run properly (or at least have to turn off EFI runtime
 calls with all that implies.)

I don't want to cap EFI just for the fun of it but rather set a limit
so that the next one who wants a chunk of the virtual address space can
have a reliable limit from where she/he can start. Otherwise we won't
know where EFI reliably ends...

 It might be good to have a sanity check but it needs to be pretty
 generous.

64 GB generous enough?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 3/4] arm: Add [U]EFI runtime services support

2013-06-26 Thread Borislav Petkov
On Wed, Jun 26, 2013 at 02:54:17PM +0100, Matt Fleming wrote:
 On Wed, 26 Jun, at 02:46:09PM, Grant Likely wrote:
  Eventually we'll need to look at how this interacts with kexec. A
  kexec'd kernel will need to use the mapping already chosen by a
  previous kernel, but that's an issue for another patch series.
 
 FYI, this is exactly what Borislav has been tackling on x86 recently. It
 would be nice if we could find one scheme that suits everyone.

Is this arm 32 or 64-bit? Because we haven't talked about 32-bit on x86
either. From skimming over the code, I'm not sure the same top-down
allocation and 1:1 mapping would work there. But I haven't looked hard
yet so I dunno.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Corrupted EFI region

2013-07-31 Thread Borislav Petkov
Hi guys,

so I'm seeing this funny thing where an EFI region changes when we enter
efi_enter_virtual_mode when booting with edk2 on kvm. Here's the diff:

--- before  2013-07-31 22:20:52.316039492 +0200
+++ after   2013-07-31 22:21:30.960731706 +0200
@@ -9,7 +9,7 @@ efi: mem07: type=2, attr=0xf, range=[0x0
 efi: mem08: type=7, attr=0xf, range=[0x4000-0x7c00) (960MB)
 efi: mem09: type=4, attr=0xf, range=[0x7c00-0x7c02) (0MB)
 efi: mem10: type=7, attr=0xf, range=[0x7c02-0x7e0ad000) (32MB)
-efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)
+efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0ad000) (0MB)
 efi: mem12: type=7, attr=0xf, range=[0x7e0cc000-0x7e0cd000) (0MB)
 efi: mem13: type=4, attr=0xf, range=[0x7e0cd000-0x7e55d000) (4MB)
 efi: mem14: type=3, attr=0xf, range=[0x7e55d000-0x7e59c000) (0MB)

That second boundary of region mem11 suddenly changes *before* we merge
the regions. edk2 bug?

Whole dmesg attached.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


test-x86_64.log.gz
Description: Binary data


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 01:27:16PM +0200, Laszlo Ersek wrote:
  --- before  2013-07-31 22:20:52.316039492 +0200
  +++ after   2013-07-31 22:21:30.960731706 +0200
  @@ -9,7 +9,7 @@ efi: mem07: type=2, attr=0xf, range=[0x0
   efi: mem08: type=7, attr=0xf, 
  range=[0x4000-0x7c00) (960MB)
   efi: mem09: type=4, attr=0xf, 
  range=[0x7c00-0x7c02) (0MB)
   efi: mem10: type=7, attr=0xf, 
  range=[0x7c02-0x7e0ad000) (32MB)
  -efi: mem11: type=4, attr=0xf, 
  range=[0x7e0ad000-0x7e0cc000) (0MB)
  +efi: mem11: type=4, attr=0xf, 
  range=[0x7e0ad000-0x7e0ad000) (0MB)
 
 (type 4 is EfiBootServicesData)

Yes.

   efi: mem12: type=7, attr=0xf, 
  range=[0x7e0cc000-0x7e0cd000) (0MB)
   efi: mem13: type=4, attr=0xf, 
  range=[0x7e0cd000-0x7e55d000) (4MB)
   efi: mem14: type=3, attr=0xf, 
  range=[0x7e55d000-0x7e59c000) (0MB)
  
  That second boundary of region mem11 suddenly changes *before* we merge
  the regions. edk2 bug?
 
 I take it you mean this change (ie. appearance of the zero-sized range)
 occurs when you enable KVM acceleration in qemu?

Right. And I'm booting with qemu -enable-kvm, so KVM acceleration should
be enabled. Or do you mean something else?

 If so, please locate gEfiMdePkgTokenSpaceGuid.PcdDebugPrintErrorLevel
 in OvmfPkg/OvmfPkgX64.dsc, and set the following bit in its value:
 
   # DEBUG_GCD  0x0010 Global Coherency Database changes
 
 Then please rebuild OVMF, and capture the debug port output of qemu
 (-debugcon file:debug.log -global isa-debugcon.iobase=0x402) both with
 and without KVM.
 
 DEBUG_GCD should produce messages related to CoreAllocateSpace(), and
 might help us find the spot the difference is introduced.

Ok, I'll try to get this thing done before my vacation. If not, we'll
deal with it afterwards but I won't forget, I promise! :-)

 BTW does this have anything to do with the NX bit report of yours, or
 have you noticed this independently?

Independently, while testing my runtime services mapping patchset. I was
getting an empty region and was wondering whether to discard it from the
mapping or not and then I looked at why I get it in the first place.

Basically, I get this empty region which appears at some point. It is
there when we enter efi_enter_virtual_mode in the kernel to setup the
runtime mappings:

[0.005012] efi: efi_enter_virtual_mode: enter
[0.006004] efi: mem00: type=7, attr=0xf, range=[0x-0x0009f000) (0MB)
[0.007004] efi: mem01: type=2, attr=0xf, range=[0x0009f000-0x000a) (0MB)
[0.008003] efi: mem02: type=7, attr=0xf, range=[0x0010-0x0080) (7MB)
[0.009004] efi: mem03: type=4, attr=0xf, range=[0x0080-0x0100) (8MB)
[0.010004] efi: mem04: type=7, attr=0xf, range=[0x0100-0x0200) (16MB)
[0.011004] efi: mem05: type=2, attr=0xf, range=[0x0200-0x036e3000) (22MB)
[0.012004] efi: mem06: type=7, attr=0xf, range=[0x036e3000-0x3fffb000) (969MB)
[0.013003] efi: mem07: type=2, attr=0xf, range=[0x3fffb000-0x4000) (0MB)
[0.014004] efi: mem08: type=7, attr=0xf, range=[0x4000-0x7c00) (960MB)
[0.015004] efi: mem09: type=4, attr=0xf, range=[0x7c00-0x7c02) (0MB)
[0.016004] efi: mem10: type=7, attr=0xf, range=[0x7c02-0x7e0ad000) (32MB)
[0.017004] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0ad000) (0MB)

^^

[0.018003] efi: mem12: type=7, attr=0xf, range=[0x7e0cc000-0x7e0cd000) (0MB)

When we dump the EFI regions initially, it is ok.

[0.00] efi: mem10: type=7, attr=0xf, range=[0x7c02-0x7e0ad000) (32MB)
[0.00] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)
[0.00] efi: mem12: type=7, attr=0xf, range=[0x7e0cc000-0x7e0cd000) (0MB)

So what basically happens is the end boundary of the region becomes the
start, practically turning it into a 0-size one.

Thanks for looking into it.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 03:39:31PM +0200, Laszlo Ersek wrote:
 My question was: is my understanding correct that you only see this
 problem with -enable-kvm? Because,
 
 On 08/01/13 18:49, Borislav Petkov wrote:
  so I'm seeing this funny thing where an EFI region changes when we
  enter efi_enter_virtual_mode when booting with edk2 on kvm. Here's
  the diff:
 
 You said "on kvm", and provided a diff. I think (hope) I understand the
 environment you've denoted with "after", but what's your "before"? The
 absence of -enable-kvm, or something else?

Ah, I see.

So 'before' is the initial dump of the EFI regions, very early during
boot:

[0.00] efi: EFI v2.31 by EDK II
[0.00] efi:  ACPI=0x7fb71000  ACPI 2.0=0x7fb71014 
[0.00] efi: mem00: type=7, attr=0xf, range=[0x-0x0009f000) (0MB)
[0.00] efi: mem01: type=2, attr=0xf, range=[0x0009f000-0x000a) (0MB)
[0.00] efi: mem02: type=7, attr=0xf, range=[0x0010-0x0080) (7MB)
[0.00] efi: mem03: type=4, attr=0xf, range=[0x0080-0x0100) (8MB)
[0.00] efi: mem04: type=7, attr=0xf, range=[0x0100-0x0200) (16MB)
[0.00] efi: mem05: type=2, attr=0xf, range=[0x0200-0x036e3000) (22MB)
[0.00] efi: mem06: type=7, attr=0xf, range=[0x036e3000-0x3fffb000) (969MB)
[0.00] efi: mem07: type=2, attr=0xf, range=[0x3fffb000-0x4000) (0MB)
[0.00] efi: mem08: type=7, attr=0xf, range=[0x4000-0x7c00) (960MB)
[0.00] efi: mem09: type=4, attr=0xf, range=[0x7c00-0x7c02) (0MB)
[0.00] efi: mem10: type=7, attr=0xf, range=[0x7c02-0x7e0ad000) (32MB)
[0.00] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)
[0.00] efi: mem12: type=7, attr=0xf, range=[0x7e0cc000-0x7e0cd000) (0MB)
[0.00] efi: mem13: type=4, attr=0xf, range=[0x7e0cd000-0x7e55d000) (4MB)
[0.00] efi: mem14: type=3, attr=0xf, range=[0x7e55d000-0x7e59c000) (0MB)
[0.00] efi: mem15: type=4, attr=0xf, range=[0x7e59c000-0x7e5a) (0MB)
[0.00] efi: mem16: type=3, attr=0xf, range=[0x7e5a-0x7e668000) (0MB)
[0.00] efi: mem17: type=5, attr=0x800f, range=[0x7e668000-0x7e67d000) (0MB)
[0.00] efi: mem18: type=6, attr=0x800f, range=[0x7e67d000-0x7e692000) (0MB)
[0.00] efi: mem19: type=4, attr=0xf, range=[0x7e692000-0x7f992000) (19MB)
[0.00] efi: mem20: type=7, attr=0xf, range=[0x7f992000-0x7f994000) (0MB)
[0.00] efi: mem21: type=3, attr=0xf, range=[0x7f994000-0x7fb12000) (1MB)
[0.00] efi: mem22: type=5, attr=0x800f, range=[0x7fb12000-0x7fb42000) (0MB)
[0.00] efi: mem23: type=6, attr=0x800f, range=[0x7fb42000-0x7fb66000) (0MB)
[0.00] efi: mem24: type=0, attr=0xf, range=[0x7fb66000-0x7fb6a000) (0MB)
[0.00] efi: mem25: type=9, attr=0xf, range=[0x7fb6a000-0x7fb72000) (0MB)
[0.00] efi: mem26: type=10, attr=0xf, range=[0x7fb72000-0x7fb76000) (0MB)
[0.00] efi: mem27: type=4, attr=0xf, range=[0x7fb76000-0x7ffe) (4MB)
[0.00] efi: mem28: type=6, attr=0x800f, range=[0x7ffe-0x8000) (0MB)

and with 'after' I've denoted the dump of the EFI regions a second time,
a bit later, when we enter efi_enter_virtual_mode():

[0.005012] efi: efi_enter_virtual_mode: enter
[0.006004] efi: mem00: type=7, attr=0xf, range=[0x-0x0009f000) (0MB)
[0.007004] efi: mem01: type=2, attr=0xf, range=[0x0009f000-0x000a) (0MB)
[0.008003] efi: mem02: type=7, attr=0xf, range=[0x0010-0x0080) (7MB)
[0.009004] efi: mem03: type=4, attr=0xf, range=[0x0080-0x0100) (8MB)
[0.010004] efi: mem04: type=7, attr=0xf, range=[0x0100-0x0200) (16MB)
[0.011004] efi: mem05: type=2, attr=0xf, range=[0x0200-0x036e3000) (22MB)
[0.012004] efi: mem06: type=7, attr=0xf, range=[0x036e3000-0x3fffb000) (969MB)
[0.013003] efi: mem07: type=2, attr=0xf, range=[0x3fffb000-0x4000) (0MB)
[0.014004] efi: mem08: type=7, attr=0xf, range=[0x4000-0x7c00) (960MB)
[0.015004] efi: mem09: type=4, attr=0xf, range=[0x7c00-0x7c02) (0MB)
[0.016004] efi: mem10: type=7, attr=0xf, range=[0x7c02-0x7e0ad000) (32MB)
[0.017004] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000

Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 04:27:44PM +0200, Laszlo Ersek wrote:
 I wouldn't call the design of SetVirtualAddressMap() braindead.

Ok, I've always wondered and you could probably shed some light on the
matter: why is SetVirtualAddressMap() a call-once only? Why can't I
simply call it again and update the mappings?

 I'd rather call kexec unique and somewhat unexpected :)

In all fairness, it was there before UEFI, AFAICT.

  I wouldn't wonder if we f*cked it up again like the last time. I'll give
  it a long hard look.
 
 Ah sorry, by "and you guys suspect" I didn't mean to imply anything
 between the lines, I was simply trying to ascertain your working idea :)

As long as we get to the bottom of this, we're all fine. And I'd
pretty much expect everyone who is dealing with EFI to have grown a
sufficiently thick skin before starting to do so, so don't worry.

:-)

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 06:41:20PM +0200, Laszlo Ersek wrote:
 I didn't realize the timestamps survive kexec. (As far as I remember
 the kernels I played with kexec on didn't have the automatic
 timestamps yet in dmesg, but I might have messed up just as well...)

No, no, no, kexec is not involved at all.

Here's the whole dmesg up until efi_enter_virtual_mode(). By the time we
have entered efi_enter_virtual_mode(), the region has changed from

[0.00] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)

to

[0.023004] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0ad000) (0MB)


And yes, I still need to audit whether the kernel actually does that
change. I'm still looking...


early console in decompress_kernel

Decompressing Linux... Parsing ELF... done.
Booting the kernel.
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 3.10.0-rc7+ (boris@nazgul) (gcc version 4.7.3 (Debian 4.7.3-4) ) #9 SMP PREEMPT Mon Aug 5 16:27:00 CEST 2013
[0.00] Command line: root=/dev/sda1 debug ignore_loglevel log_buf_len=10M earlyprintk=ttyS0,115200 console=ttyS0,115200 console=tty0
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009] usable
[0.00] BIOS-e820: [mem 0x0010-0x7e667fff] usable
[0.00] BIOS-e820: [mem 0x7e668000-0x7e691fff] reserved
[0.00] BIOS-e820: [mem 0x7e692000-0x7fb11fff] usable
[0.00] BIOS-e820: [mem 0x7fb12000-0x7fb69fff] reserved
[0.00] BIOS-e820: [mem 0x7fb6a000-0x7fb71fff] ACPI data
[0.00] BIOS-e820: [mem 0x7fb72000-0x7fb75fff] ACPI NVS
[0.00] BIOS-e820: [mem 0x7fb76000-0x7ffd] usable
[0.00] BIOS-e820: [mem 0x7ffe-0x7fff] reserved
[0.00] debug: ignoring loglevel setting.
[0.00] bootconsole [earlyser0] enabled
[0.00] NX (Execute Disable) protection: active
[0.00] efi: EFI v2.31 by EDK II
[0.00] efi:  ACPI=0x7fb71000  ACPI 2.0=0x7fb71014 
[0.00] efi: mem00: type=7, attr=0xf, range=[0x-0x0009f000) (0MB)
[0.00] efi: mem01: type=2, attr=0xf, range=[0x0009f000-0x000a) (0MB)
[0.00] efi: mem02: type=7, attr=0xf, range=[0x0010-0x0080) (7MB)
[0.00] efi: mem03: type=4, attr=0xf, range=[0x0080-0x0100) (8MB)
[0.00] efi: mem04: type=7, attr=0xf, range=[0x0100-0x0200) (16MB)
[0.00] efi: mem05: type=2, attr=0xf, range=[0x0200-0x036e3000) (22MB)
[0.00] efi: mem06: type=7, attr=0xf, range=[0x036e3000-0x3fffb000) (969MB)
[0.00] efi: mem07: type=2, attr=0xf, range=[0x3fffb000-0x4000) (0MB)
[0.00] efi: mem08: type=7, attr=0xf, range=[0x4000-0x7c00) (960MB)
[0.00] efi: mem09: type=4, attr=0xf, range=[0x7c00-0x7c02) (0MB)
[0.00] efi: mem10: type=7, attr=0xf, range=[0x7c02-0x7e0ad000) (32MB)
[0.00] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)
[0.00] efi: mem12: type=7, attr=0xf, range=[0x7e0cc000-0x7e0cd000) (0MB)
[0.00] efi: mem13: type=4, attr=0xf, range=[0x7e0cd000-0x7e55d000) (4MB)
[0.00] efi: mem14: type=3, attr=0xf, range=[0x7e55d000-0x7e59c000) (0MB)
[0.00] efi: mem15: type=4, attr=0xf, range=[0x7e59c000-0x7e5a) (0MB)
[0.00] efi: mem16: type=3, attr=0xf, range=[0x7e5a-0x7e668000) (0MB)
[0.00] efi: mem17: type=5, attr=0x800f, range=[0x7e668000-0x7e67d000) (0MB)
[0.00] efi: mem18: type=6, attr=0x800f, range=[0x7e67d000-0x7e692000) (0MB)
[0.00] efi: mem19: type=4, attr=0xf, range=[0x7e692000-0x7f992000) (19MB)
[0.00] efi: mem20: type=7, attr=0xf, range=[0x7f992000-0x7f994000) (0MB)
[0.00] efi: mem21: type=3, attr=0xf, range=[0x7f994000-0x7fb12000) (1MB)
[0.00] efi: mem22: type=5, attr=0x800f, range=[0x7fb12000-0x7fb42000) (0MB)
[0.00] efi: mem23: type=6, attr=0x800f, range=[0x7fb42000-0x7fb66000) (0MB)
[0.00] efi: mem24: type=0, attr=0xf, range=[0x7fb66000-0x7fb6a000) (0MB)
[0.00] efi: mem25: type=9, attr=0xf, 

Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 08:50:17AM -0700, Andrew Fish wrote:
 AFAICT EFI pre-dates kexec merge into mainline by a number of years as
 SetVirtualAddressMap() was part of EFI 1.0 (previous millennium)

Ok, fair enough.

 The EFI to UEFI conversion was placing EFI 1.10 into an industry
 standard, UEFI 2.0. UEFI is an industry standard so some one just
 needs to make a proposal to update the spec. The edk2 open source
 project is not part of the standards body so complaining on this
 mailing list is not going to get anything changed.

Right, I don't think that even changing the spec would help - it would
actually make things worse because then we'd have to differentiate
between UEFI versions: those which can do SetVirtualAddressMap() more
than once and the older ones.

So let's drop the discussion here - it is what it is, it is too late to
change anything. At least we talked about it. :-)

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 02:37:08PM -0700, H. Peter Anvin wrote:
 All of this would be a non-problem if there weren't buggy
 implementations which can't run *without* SetVirtualAddressMap().

Oh, you mean, if we were to call the runtime services through their
physical addresses?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 11:26:46PM +0200, Laszlo Ersek wrote:
 What happens if you pass memblock=debug on the kernel command line
 (see early_memblock() in mm/memblock.c)?
 
 (I just tried it in my Fedora 19 guest, and it in fact produced the message
 
 [0.00] efi: Could not reserve boot range [0x80-0xff]

Note to self: Always look for bugs in Linux' UEFI code first, before
going anywhere else!

Yes, very good analysis and good job Laszlo!

I'll write what I see now but will doublecheck it tomorrow because I'm
almost half asleep.

[0.00] efi: efi_reserve_boot_services:  - start: 0x7e0ad000, size: 0x1f000
[0.00] efi: Could not reserve boot range [0x007e0ad000-0x007e0cbfff]

And yes, this fails because memblock_is_region_reserved(start, size)
returns true.

And why is that:

[0.00] memblock_reserve: [0x00036be000-0x00036c3000] setup_arch+0x60e/0xa63
[0.00] MEMBLOCK configuration:
[0.00]  memory size = 0x7fef1000 reserved size = 0x1724570
[0.00]  memory.cnt  = 0x4
[0.00]  memory[0x0] [0x001000-0x09], 0x9f000 bytes
[0.00]  memory[0x1] [0x10-0x007e667fff], 0x7e568000 bytes
[0.00]  memory[0x2] [0x007e692000-0x007fb11fff], 0x148 bytes
[0.00]  memory[0x3] [0x007fb76000-0x007ffd], 0x46a000 bytes
[0.00]  reserved.cnt  = 0x3
[0.00]  reserved[0x0]   [0x09f000-0x0f], 0x61000 bytes
[0.00]  reserved[0x1]   [0x000200-0x00036c2fff], 0x16c3000 bytes
[0.00]  reserved[0x2]   [0x007e0ad018-0x007e0ad587], 0x570 bytes
^

There are 0x570 bytes right in this region which are memblock-reserved
and so we truncate it in efi_reserve_boot_services().
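
Boiled down to a userspace sketch with the numbers from the log above
(struct range, ranges_overlap() and reserve_or_drop() are names made up
here, and I'm assuming the failed reservation simply zeroes the page count,
which is what the zero-sized range in the dump suggests):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: a physical range given as start + size. */
struct range {
	uint64_t start;
	uint64_t size;
};

/* True if [a.start, a.start + a.size) and [b.start, b.start + b.size) intersect. */
static bool ranges_overlap(struct range a, struct range b)
{
	return a.start < b.start + b.size && b.start < a.start + a.size;
}

/*
 * Simplified stand-in for the reservation step: if the boot services
 * region collides with something that is already reserved, give up on it
 * and report zero pages, which turns it into an empty range.
 */
static uint64_t reserve_or_drop(struct range boot, struct range reserved,
				uint64_t num_pages)
{
	assert(boot.size > 0);
	return ranges_overlap(boot, reserved) ? 0 : num_pages;
}
```

Feeding it mem11 (start 0x7e0ad000, size 0x1f000) and reserved[0x2] (start
0x7e0ad018, size 0x570) gives an overlap, so the region ends up with zero
pages even though only a small piece of it is actually taken.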

This makes me say words which will offend this list so I'll instead go
out on the balcony and wake up the neighbors. :-)

Ok, thanks again for finding it, I'll go and try to figure out the whole
mess tomorrow.

Good night!

 BTW, regarding Michael's answer, I think this is just one of several
 ways in which Linux manipulates the EFI memmap between (b) and (c).
 For example it seems to merge ranges in the map.

Yes, it does so in efi_enter_virtual_mode(). That was my initial
suspicion, that's why I dumped the regions before the merging.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: UEFI Plugfest 2013 -- New Orleans

2013-08-19 Thread Borislav Petkov
On Mon, Aug 19, 2013 at 09:25:35AM +0100, David Woodhouse wrote:
 Hm. It would be really useful to have a kernel build option which
 *disables* all the workarounds we've ever put in for broken firmware.

Yeah, cool!

I wonder if we could reach a high double-digit percentage of machines
not booting/barfing on such a clean kernel.

While at it, can we please replace the fw with coreboot? :-)

 Every deviation from the spec (or common sense), however minor, should
 show up as a clear failure. Even the ones we *have* been able to work
 around, because we still want them *fixed*.
 
 And there's a school of thought that says we should brick as many
 Samsung machines as possible,

Yep, and there's the rooted secure boot asus f*ckup which we should also
advertize while booting, enabling people to use it:

https://www.blackhat.com/us-13/archives.html#Bulygin

/me LOLs ominously...

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


[PATCH 09/11] x86, pageattr: Add last levels of error path

2013-09-19 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

We try to free the pagetable pages once we've unmapped our portion.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/mm/pageattr.c | 94 +-
 1 file changed, 93 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index a0d2e90ad62b..ca76481c09e8 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -666,7 +666,99 @@ static int split_large_page(pte_t *kpte, unsigned long address)
return 0;
 }
 
-#define unmap_pmd_range(pud, start, pre_end)   do {} while (0)
+static bool try_to_free_pte_page(pte_t *pte)
+{
+   int i;
+
+   for (i = 0; i < PTRS_PER_PTE; i++)
+   if (!pte_none(pte[i]))
+   return false;
+
+   free_page((unsigned long)pte);
+   return true;
+}
+
+static bool try_to_free_pmd_page(pmd_t *pmd)
+{
+   int i;
+
+   for (i = 0; i < PTRS_PER_PMD; i++)
+   if (!pmd_none(pmd[i]))
+   return false;
+
+   free_page((unsigned long)pmd);
+   return true;
+}
+
+static bool unmap_pte_range(pmd_t *pmd, unsigned long start, unsigned long end)
+{
+   pte_t *pte = pte_offset_kernel(pmd, start);
+
+   while (start < end) {
+   set_pte(pte, __pte(0));
+
+   start += PAGE_SIZE;
+   pte++;
+   }
+
+   if (try_to_free_pte_page((pte_t *)pmd_page_vaddr(*pmd))) {
+   pmd_clear(pmd);
+   return true;
+   }
+   return false;
+}
+
+static void __unmap_pmd_range(pud_t *pud, pmd_t *pmd,
+ unsigned long start, unsigned long end)
+{
+   if (unmap_pte_range(pmd, start, end))
+   if (try_to_free_pmd_page((pmd_t *)pud_page_vaddr(*pud)))
+   pud_clear(pud);
+}
+
+static void unmap_pmd_range(pud_t *pud, unsigned long start, unsigned long end)
+{
+   pmd_t *pmd = pmd_offset(pud, start);
+
+   /*
+* Not on a 2MB page boundary?
+*/
+   if (start & (PMD_SIZE - 1)) {
+   unsigned long next_page = (start + PMD_SIZE) & PMD_MASK;
+   unsigned long pre_end = min_t(unsigned long, end, next_page);
+
+   __unmap_pmd_range(pud, pmd, start, pre_end);
+
+   start = pre_end;
+   pmd++;
+   }
+
+   /*
+* Try to unmap in 2M chunks.
+*/
+   while (end - start >= PMD_SIZE) {
+   if (pmd_large(*pmd))
+   pmd_clear(pmd);
+   else
+   __unmap_pmd_range(pud, pmd, start, start + PMD_SIZE);
+
+   start += PMD_SIZE;
+   pmd++;
+   }
+
+   /*
+* 4K leftovers?
+*/
+   if (start < end)
+   return __unmap_pmd_range(pud, pmd, start, end);
+
+   /*
+* Try again to free the PMD page if haven't succeeded above.
+*/
+   if (!pud_none(*pud))
+   if (try_to_free_pmd_page((pmd_t *)pud_page_vaddr(*pud)))
+   pud_clear(pud);
+}
 
 static void unmap_pud_range(pgd_t *pgd, unsigned long start, unsigned long end)
 {
-- 
1.8.4



[PATCH 08/11] x86, pageattr: Add a PUD error unwinding path

2013-09-19 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

In case we encounter an error during the mapping of a region, we want to
unwind what we've established so far exactly the way we did the mapping.
This is the PUD part kept deliberately small for easier review.
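
For reviewers who want to check the boundary arithmetic in isolation, here
is a standalone userspace sketch of the head/chunks/tail walk. It is not
part of the patch: struct split, split_range() and CHUNK_SIZE are
illustrative names, and CHUNK_SIZE is assumed to be 1G like PUD_SIZE on
x86-64.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_SIZE	(1ULL << 30)		/* 1G, like PUD_SIZE on x86-64 */
#define CHUNK_MASK	(~(CHUNK_SIZE - 1))

/* Hypothetical result type: how a range breaks down around chunk boundaries. */
struct split {
	uint64_t head;		/* bytes before the first chunk boundary */
	uint64_t chunks;	/* number of whole, aligned chunks */
	uint64_t tail;		/* bytes left after the last whole chunk */
};

static struct split split_range(uint64_t start, uint64_t end)
{
	struct split s = { 0, 0, 0 };

	assert(start <= end);

	/* Not on a chunk boundary? Handle the unaligned head first. */
	if (start & (CHUNK_SIZE - 1)) {
		uint64_t next = (start + CHUNK_SIZE) & CHUNK_MASK;
		uint64_t pre_end = end < next ? end : next;

		s.head = pre_end - start;
		start = pre_end;
	}

	/* Whole chunks. */
	while (end - start >= CHUNK_SIZE) {
		s.chunks++;
		start += CHUNK_SIZE;
	}

	/* Leftovers smaller than a chunk. */
	s.tail = end - start;
	return s;
}
```

A range from 1G - 1M to 3G + 2M, for instance, comes out as a 1M head, two
whole chunks and a 2M tail.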

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/mm/pageattr.c | 60 --
 1 file changed, 58 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 02cf97b3bb7c..a0d2e90ad62b 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -666,6 +666,51 @@ static int split_large_page(pte_t *kpte, unsigned long address)
return 0;
 }
 
+#define unmap_pmd_range(pud, start, pre_end)   do {} while (0)
+
+static void unmap_pud_range(pgd_t *pgd, unsigned long start, unsigned long end)
+{
+   pud_t *pud = pud_offset(pgd, start);
+
+   /*
+* Not on a GB page boundary?
+*/
+   if (start & (PUD_SIZE - 1)) {
+   unsigned long next_page = (start + PUD_SIZE) & PUD_MASK;
+   unsigned long pre_end   = min_t(unsigned long, end, next_page);
+
+   unmap_pmd_range(pud, start, pre_end);
+
+   start = pre_end;
+   pud++;
+   }
+
+   /*
+* Try to unmap in 1G chunks?
+*/
+   while (end - start >= PUD_SIZE) {
+
+   if (pud_large(*pud))
+   pud_clear(pud);
+   else
+   unmap_pmd_range(pud, start, start + PUD_SIZE);
+
+   start += PUD_SIZE;
+   pud++;
+   }
+
+   /*
+* 2M leftovers?
+*/
+   if (start < end)
+   unmap_pmd_range(pud, start, end);
+
+   /*
+* No need to try to free the PUD page because we'll free it in
+* populate_pgd's error path
+*/
+}
+
 static int alloc_pte_page(pmd_t *pmd)
 {
pte_t *pte = (pte_t *)get_zeroed_page(GFP_KERNEL | __GFP_NOTRACK);
@@ -883,9 +928,20 @@ static int populate_pgd(struct cpa_data *cpa, unsigned long addr)
pgprot_val(pgprot) |=  pgprot_val(cpa->mask_set);
 
ret = populate_pud(cpa, addr, pgd_entry, pgprot);
-   if (ret < 0)
-   return ret;
+   if (ret < 0) {
+   unmap_pud_range(pgd_entry, addr,
+   addr + (cpa->numpages << PAGE_SHIFT));
 
+   if (allocd_pgd) {
+   /*
+* If I allocated this PUD page, I can just as well
+* free it in this error path.
+*/
+   pgd_clear(pgd_entry);
+   free_page((unsigned long)pud);
+   }
+   return ret;
+   }
cpa->numpages = ret;
return 0;
 }
-- 
1.8.4



[PATCH 02/11] efi: Remove EFI_PAGE_SHIFT and EFI_PAGE_SIZE

2013-09-19 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

... and use the good old standard defines which we all know. Also,
simplify math to shift by PAGE_SHIFT instead of multiplying by
PAGE_SIZE.
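
A quick userspace illustration of the arithmetic, not part of the patch:
the helper names are made up, and it assumes 4K pages, where an EFI page
and a kernel page are the same size.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)

static_assert(PAGE_SIZE == 4096, "sketch assumes 4K pages");

/* Bytes covered by num_pages pages: shift instead of multiply. */
static uint64_t pages_to_bytes(uint64_t num_pages)
{
	return num_pages << PAGE_SHIFT;
}

/* Pages needed to hold size bytes, rounded up to a full page. */
static uint64_t bytes_to_pages(uint64_t size)
{
	return (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* Whole megabytes covered by num_pages pages, as printed in the memmap dump. */
static uint64_t pages_to_mb(uint64_t num_pages)
{
	return num_pages >> (20 - PAGE_SHIFT);
}
```

The shift and the multiplication by PAGE_SIZE give the same result, and a
region of fewer than 256 pages shows up as 0MB in the dump.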

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/boot/compressed/eboot.c | 12 ++--
 arch/x86/boot/compressed/eboot.h |  1 -
 arch/x86/platform/efi/efi.c  | 22 +++---
 include/linux/efi.h  |  6 ++
 4 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/arch/x86/boot/compressed/eboot.c b/arch/x86/boot/compressed/eboot.c
index b7388a425f09..5c440bf769a8 100644
--- a/arch/x86/boot/compressed/eboot.c
+++ b/arch/x86/boot/compressed/eboot.c
@@ -96,7 +96,7 @@ static efi_status_t high_alloc(unsigned long size, unsigned long align,
if (status != EFI_SUCCESS)
goto fail;
 
-   nr_pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
+   nr_pages = round_up(size, PAGE_SIZE) / PAGE_SIZE;
 again:
for (i = 0; i < map_size / desc_size; i++) {
efi_memory_desc_t *desc;
@@ -111,7 +111,7 @@ again:
continue;
 
start = desc->phys_addr;
-   end = start + desc->num_pages * (1UL << EFI_PAGE_SHIFT);
+   end = start + (desc->num_pages << PAGE_SHIFT);
 
if ((start + size) > end || (start + size) > max)
continue;
@@ -173,7 +173,7 @@ static efi_status_t low_alloc(unsigned long size, unsigned long align,
if (status != EFI_SUCCESS)
goto fail;
 
-   nr_pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
+   nr_pages = round_up(size, PAGE_SIZE) / PAGE_SIZE;
for (i = 0; i < map_size / desc_size; i++) {
efi_memory_desc_t *desc;
unsigned long m = (unsigned long)map;
@@ -188,7 +188,7 @@ static efi_status_t low_alloc(unsigned long size, unsigned long align,
continue;
 
start = desc->phys_addr;
-   end = start + desc->num_pages * (1UL << EFI_PAGE_SHIFT);
+   end = start + (desc->num_pages << PAGE_SHIFT);
 
/*
 * Don't allocate at 0x0. It will confuse code that
@@ -224,7 +224,7 @@ static void low_free(unsigned long size, unsigned long addr)
 {
unsigned long nr_pages;
 
-   nr_pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
+   nr_pages = round_up(size, PAGE_SIZE) / PAGE_SIZE;
efi_call_phys2(sys_table->boottime->free_pages, addr, nr_pages);
 }
 
@@ -1128,7 +1128,7 @@ static efi_status_t relocate_kernel(struct setup_header *hdr)
 * possible.
 */
start = hdr->pref_address;
-   nr_pages = round_up(hdr->init_size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
+   nr_pages = round_up(hdr->init_size, PAGE_SIZE) / PAGE_SIZE;
 
status = efi_call_phys4(sys_table->boottime->allocate_pages,
EFI_ALLOCATE_ADDRESS, EFI_LOADER_DATA,
diff --git a/arch/x86/boot/compressed/eboot.h b/arch/x86/boot/compressed/eboot.h
index e5b0a8f91c5f..786398c1bb9a 100644
--- a/arch/x86/boot/compressed/eboot.h
+++ b/arch/x86/boot/compressed/eboot.h
@@ -11,7 +11,6 @@
 
 #define DESC_TYPE_CODE_DATA (1 << 0)
 
-#define EFI_PAGE_SIZE  (1UL << EFI_PAGE_SHIFT)
 #define EFI_READ_CHUNK_SIZE (1024 * 1024)
 
 #define EFI_CONSOLE_OUT_DEVICE_GUID\
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 7cec1e9e5494..538c1e6b7b2c 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -339,7 +339,7 @@ static void __init do_add_efi_memmap(void)
for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
efi_memory_desc_t *md = p;
unsigned long long start = md->phys_addr;
-   unsigned long long size = md->num_pages << EFI_PAGE_SHIFT;
+   unsigned long long size = md->num_pages << PAGE_SHIFT;
int e820_type;
 
switch (md-type) {
@@ -416,8 +416,8 @@ static void __init print_efi_memmap(void)
pr_info("mem%02u: type=%u, attr=0x%llx, "
"range=[0x%016llx-0x%016llx) (%lluMB)\n",
i, md->type, md->attribute, md->phys_addr,
-   md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT),
-   (md->num_pages >> (20 - EFI_PAGE_SHIFT)));
+   md->phys_addr + (md->num_pages << PAGE_SHIFT),
+   (md->num_pages >> (20 - PAGE_SHIFT)));
}
 #endif  /*  EFI_DEBUG  */
 }
@@ -429,7 +429,7 @@ void __init efi_reserve_boot_services(void)
for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
efi_memory_desc_t *md = p;
u64 start = md->phys_addr;
-   u64 size = md->num_pages << EFI_PAGE_SHIFT;
+   u64 size = md->num_pages << PAGE_SHIFT;
 
if (md->type != EFI_BOOT_SERVICES_CODE &&
md->type != EFI_BOOT_SERVICES_DATA)
@@ -473,7 +473,7

[PATCH 10/11] x86, cpa: Map in an arbitrary pgd

2013-09-19 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Add the ability to map pages in an arbitrary pgd. This wires in the
remaining stuff so that there's a new interface with which you can map a
region into an arbitrary PGD.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/mm/pageattr.c | 53 +++---
 1 file changed, 46 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index ca76481c09e8..991386bf3aad 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -453,7 +453,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 * Check for races, another CPU might have split this page
 * up already:
 */
-   tmp = lookup_address(address, &level);
+   tmp = _lookup_address_cpa(cpa, address, &level);
if (tmp != kpte)
goto out_unlock;
 
@@ -559,7 +559,8 @@ out_unlock:
 }
 
 static int
-__split_large_page(pte_t *kpte, unsigned long address, struct page *base)
+__split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address,
+  struct page *base)
 {
pte_t *pbase = (pte_t *)page_address(base);
unsigned long pfn, pfninc = 1;
@@ -572,7 +573,7 @@ __split_large_page(pte_t *kpte, unsigned long address, struct page *base)
 * Check for races, another CPU might have split this page
 * up for us already:
 */
-   tmp = lookup_address(address, &level);
+   tmp = _lookup_address_cpa(cpa, address, &level);
if (tmp != kpte) {
spin_unlock(&pgd_lock);
return 1;
@@ -648,7 +649,8 @@ __split_large_page(pte_t *kpte, unsigned long address, struct page *base)
return 0;
 }
 
-static int split_large_page(pte_t *kpte, unsigned long address)
+static int split_large_page(struct cpa_data *cpa, pte_t *kpte,
+   unsigned long address)
 {
struct page *base;
 
@@ -660,7 +662,7 @@ static int split_large_page(pte_t *kpte, unsigned long address)
if (!base)
return -ENOMEM;
 
-   if (__split_large_page(kpte, address, base))
+   if (__split_large_page(cpa, kpte, address, base))
__free_page(base);
 
return 0;
@@ -1041,6 +1043,9 @@ static int populate_pgd(struct cpa_data *cpa, unsigned long addr)
 static int __cpa_process_fault(struct cpa_data *cpa, unsigned long vaddr,
   int primary)
 {
+   if (cpa->pgd)
+   return populate_pgd(cpa, vaddr);
+
/*
 * Ignore all non primary paths.
 */
@@ -1085,7 +1090,7 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
else
address = *cpa->vaddr;
 repeat:
-   kpte = lookup_address(address, &level);
+   kpte = _lookup_address_cpa(cpa, address, &level);
if (!kpte)
return __cpa_process_fault(cpa, address, primary);
 
@@ -1149,7 +1154,7 @@ repeat:
/*
 * We have to split the large page:
 */
-   err = split_large_page(kpte, address);
+   err = split_large_page(cpa, kpte, address);
if (!err) {
/*
 * Do a global flush tlb after splitting the large page
@@ -1298,6 +1303,8 @@ static int change_page_attr_set_clr(unsigned long *addr, int numpages,
int ret, cache, checkalias;
unsigned long baddr = 0;
 
+   memset(&cpa, 0, sizeof(cpa));
+
/*
 * Check, if we are requested to change a not supported
 * feature:
@@ -1744,6 +1751,7 @@ static int __set_pages_p(struct page *page, int numpages)
 {
unsigned long tempaddr = (unsigned long) page_address(page);
struct cpa_data cpa = { .vaddr = &tempaddr,
+   .pgd = 0,
.numpages = numpages,
.mask_set = __pgprot(_PAGE_PRESENT | _PAGE_RW),
.mask_clr = __pgprot(0),
@@ -1762,6 +1770,7 @@ static int __set_pages_np(struct page *page, int numpages)
 {
unsigned long tempaddr = (unsigned long) page_address(page);
struct cpa_data cpa = { .vaddr = &tempaddr,
+   .pgd = 0,
.numpages = numpages,
.mask_set = __pgprot(0),
.mask_clr = __pgprot(_PAGE_PRESENT | _PAGE_RW),
@@ -1822,6 +1831,36 @@ bool kernel_page_present(struct page *page)
 
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
+int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
+   unsigned numpages, unsigned long page_flags)
+{
+   int retval = 0;
+
+   struct cpa_data cpa = {
+   .vaddr = &address,
+   .pfn = pfn,
+   .pgd = pgd,
+   .numpages = numpages,
+   .mask_set = __pgprot(0),
+   .mask_clr = __pgprot(0),
+   .flags = 0

[PATCH 04/11] x86, pageattr: Add a PGD pagetable populating function

2013-09-19 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

This allocates, if necessary, and populates the corresponding PGD entry
with a PUD page. The next population level is a dummy macro which will
be removed by the next patch and it is added here to keep the patch
small and easily reviewable but not break bisection, at the same time.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/mm/pageattr.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index c53de62a1170..21a31e85283c 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -666,6 +666,45 @@ static int split_large_page(pte_t *kpte, unsigned long address)
return 0;
 }
 
+#define populate_pud(cpa, addr, pgd, pgprot)   (-1)
+
+/*
+ * Restrictions for kernel page table do not necessarily apply when mapping in
+ * an alternate PGD.
+ */
+static int populate_pgd(struct cpa_data *cpa, unsigned long addr)
+{
+   pgprot_t pgprot = __pgprot(_KERNPG_TABLE);
+   bool allocd_pgd = false;
+   pgd_t *pgd_entry;
+   pud_t *pud;
+   int ret;
+
+   pgd_entry = cpa->pgd + pgd_index(addr);
+
+   /*
+* Allocate a PUD page and hand it down for mapping.
+*/
+   if (pgd_none(*pgd_entry)) {
+   pud = (pud_t *)get_zeroed_page(GFP_KERNEL | __GFP_NOTRACK);
+   if (!pud)
+   return -1;
+
+   set_pgd(pgd_entry, __pgd(__pa(pud) | _KERNPG_TABLE));
+   allocd_pgd = true;
+   }
+
+   pgprot_val(pgprot) &= ~pgprot_val(cpa->mask_clr);
+   pgprot_val(pgprot) |=  pgprot_val(cpa->mask_set);
+
+   ret = populate_pud(cpa, addr, pgd_entry, pgprot);
+   if (ret < 0)
+   return ret;
+
+   cpa->numpages = ret;
+   return 0;
+}
+
 static int __cpa_process_fault(struct cpa_data *cpa, unsigned long vaddr,
   int primary)
 {
-- 
1.8.4



[PATCH 01/11] efi: Simplify EFI_DEBUG

2013-09-19 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

... and lose one #ifdef .. #endif sandwich.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/platform/efi/efi.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 90f6ed127096..7cec1e9e5494 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -51,7 +51,7 @@
 #include <asm/x86_init.h>
 #include <asm/rtc.h>
 
-#define EFI_DEBUG  1
+#define EFI_DEBUG
 
 #define EFI_MIN_RESERVE 5120
 
@@ -402,9 +402,9 @@ int __init efi_memblock_x86_reserve_range(void)
return 0;
 }
 
-#if EFI_DEBUG
 static void __init print_efi_memmap(void)
 {
+#ifdef EFI_DEBUG
efi_memory_desc_t *md;
void *p;
int i;
@@ -419,8 +419,8 @@ static void __init print_efi_memmap(void)
md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT),
(md->num_pages >> (20 - EFI_PAGE_SHIFT)));
}
-}
 #endif  /*  EFI_DEBUG  */
+}
 
 void __init efi_reserve_boot_services(void)
 {
@@ -774,10 +774,7 @@ void __init efi_init(void)
x86_platform.set_wallclock = efi_set_rtc_mmss;
}
 #endif
-
-#if EFI_DEBUG
print_efi_memmap();
-#endif
 }
 
 void __init efi_late_init(void)
-- 
1.8.4
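
Editorial note: the switch from `#if` to `#ifdef` is required once the macro is defined without a value, since `#if EFI_DEBUG` would then expand to an empty expression and fail to preprocess. Below is a minimal userspace sketch of the resulting pattern, where the function is always defined and only its body is compiled out. `DEMO_DEBUG` and `print_memmap()` are made-up names for illustration, not kernel symbols.

```c
#include <stdio.h>

/*
 * Defined with no value, like "#define EFI_DEBUG" in the patch.
 * Comment this line out to compile the body away. "#if DEMO_DEBUG"
 * would be a preprocessor error here because the macro expands to
 * nothing.
 */
#define DEMO_DEBUG

/*
 * Always defined, so callers need no #ifdef of their own.
 * Returns the number of lines printed (0 when debugging is compiled out).
 */
static int print_memmap(const unsigned long *start,
                        const unsigned long *pages, int n)
{
    int printed = 0;
#ifdef DEMO_DEBUG
    for (int i = 0; i < n; i++) {
        printf("mem%02d: start=0x%lx pages=%lu\n", i, start[i], pages[i]);
        printed++;
    }
#else
    (void)start;
    (void)pages;
    (void)n;
#endif
    return printed;
}
```

The call site stays unconditional, which is what lets the patch drop the second `#if`/`#endif` pair in efi_init().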



[PATCH 00/11] EFI runtime services virtual mapping

2013-09-19 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Hi all,

here's finally a new version of the runtime services VA mapping patchset
which hopefully implements hpa's idea of statically mapping EFI runtime
regions in a top-down manner starting at -4Gb virtual.

We're also using a different pagetable so as not to pollute kernel
address space. For that, we switch to that table before doing an EFI
call, and afterwards we switch back to the previous one.

To the patches:

1-2 are simple cleanups which Matt probably can take now

3-10 add the machinery to map regions into an arbitrary PGD. Those I've
split deliberately into very small bites so that they can be reviewed
more thoroughly and easily for my pagetable skills are pretty basic.

11 is the actual patch which implements that mapping so that we can use
runtime services in kexec (which is the whole reason for this fuss :))

So please take a long hard look at those, hammer on them on your
boxes and let me know. They boot fine on my Dell UEFI box and in OVMF
(obviously :)).

Thanks.

Borislav Petkov (11):
  efi: Simplify EFI_DEBUG
  efi: Remove EFI_PAGE_SHIFT and EFI_PAGE_SIZE
  x86, pageattr: Lookup address in an arbitrary PGD
  x86, pageattr: Add a PGD pagetable populating function
  x86, pageattr: Add a PUD pagetable populating function
  x86, pageattr: Add a PMD pagetable populating function
  x86, pageattr: Add a PTE pagetable populating function
  x86, pageattr: Add a PUD error unwinding path
  x86, pageattr: Add last levels of error path
  x86, cpa: Map in an arbitrary pgd
  EFI: Runtime services virtual mapping

 arch/x86/boot/compressed/eboot.c |  12 +-
 arch/x86/boot/compressed/eboot.h |   1 -
 arch/x86/include/asm/efi.h   |  58 +++--
 arch/x86/include/asm/pgtable_types.h |   3 +-
 arch/x86/mm/pageattr.c   | 461 +--
 arch/x86/platform/efi/efi.c  | 126 +-
 arch/x86/platform/efi/efi_64.c   |  56 +
 arch/x86/platform/efi/efi_stub_64.S  |  47 
 include/linux/efi.h  |   6 +-
 9 files changed, 615 insertions(+), 155 deletions(-)

-- 
1.8.4



[PATCH 03/11] x86, pageattr: Lookup address in an arbitrary PGD

2013-09-19 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

This is preparatory work in order to be able to map pages into a
specified PGD and not implicitly and only into init_mm.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/mm/pageattr.c | 36 ++--
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index bb32480c2d71..c53de62a1170 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -30,6 +30,7 @@
  */
 struct cpa_data {
unsigned long   *vaddr;
+   pgd_t   *pgd;
pgprot_t        mask_set;
pgprot_t        mask_clr;
int numpages;
@@ -322,17 +323,9 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
return prot;
 }
 
-/*
- * Lookup the page table entry for a virtual address. Return a pointer
- * to the entry and the level of the mapping.
- *
- * Note: We return pud and pmd either when the entry is marked large
- * or when the present bit is not set. Otherwise we would return a
- * pointer to a nonexisting mapping.
- */
-pte_t *lookup_address(unsigned long address, unsigned int *level)
+static pte_t *__lookup_address_in_pgd(pgd_t *pgd, unsigned long address,
+ unsigned int *level)
 {
-   pgd_t *pgd = pgd_offset_k(address);
pud_t *pud;
pmd_t *pmd;
 
@@ -361,8 +354,31 @@ pte_t *lookup_address(unsigned long address, unsigned int *level)
 
return pte_offset_kernel(pmd, address);
 }
+
+/*
+ * Lookup the page table entry for a virtual address. Return a pointer
+ * to the entry and the level of the mapping.
+ *
+ * Note: We return pud and pmd either when the entry is marked large
+ * or when the present bit is not set. Otherwise we would return a
+ * pointer to a nonexisting mapping.
+ */
+pte_t *lookup_address(unsigned long address, unsigned int *level)
+{
+return __lookup_address_in_pgd(pgd_offset_k(address), address, level);
+}
 EXPORT_SYMBOL_GPL(lookup_address);
 
+static pte_t *_lookup_address_cpa(struct cpa_data *cpa, unsigned long address,
+ unsigned int *level)
+{
+if (cpa->pgd)
+   return __lookup_address_in_pgd(cpa->pgd + pgd_index(address),
+  address, level);
+
+return lookup_address(address, level);
+}
+
 /*
  * This is necessary because __pa() does not work on some
  * kinds of memory, like vmalloc() or the alloc_remap()
-- 
1.8.4



[PATCH 07/11] x86, pageattr: Add a PTE pagetable populating function

2013-09-19 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Handle last level by unconditionally writing the PTEs into the PTE page
while paying attention to the NX bit.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/mm/pageattr.c | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index c56d71591617..02cf97b3bb7c 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -686,7 +686,27 @@ static int alloc_pmd_page(pud_t *pud)
return 0;
 }
 
-#define populate_pte(cpa, start, end, pages, pmd, pgprot)  do {} while (0)
+static void populate_pte(struct cpa_data *cpa,
+unsigned long start, unsigned long end,
+unsigned num_pages, pmd_t *pmd, pgprot_t pgprot)
+{
+   pte_t *pte;
+
+   pte = pte_offset_kernel(pmd, start);
+
+   while (num_pages-- && start < end) {
+
+   /* deal with the NX bit */
+   if (!(pgprot_val(pgprot) & _PAGE_NX))
+   cpa->pfn &= ~_PAGE_NX;
+
+   set_pte(pte, pfn_pte(cpa->pfn >> PAGE_SHIFT, pgprot));
+
+   start    += PAGE_SIZE;
+   cpa->pfn += PAGE_SIZE;
+   pte++;
+   }
+}
 
 static int populate_pmd(struct cpa_data *cpa,
unsigned long start, unsigned long end,
-- 
1.8.4



[PATCH 05/11] x86, pageattr: Add a PUD pagetable populating function

2013-09-19 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Add the next level of the pagetable populating function. We handle
chunks around a 1G boundary by mapping them with the lower level
functions; otherwise we use 1G pages for the mappings, thus using as
few pagetable pages as possible.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/mm/pageattr.c | 87 +-
 1 file changed, 86 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 21a31e85283c..41c6fdbbfab0 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -666,7 +666,92 @@ static int split_large_page(pte_t *kpte, unsigned long address)
return 0;
 }
 
-#define populate_pud(cpa, addr, pgd, pgprot)   (-1)
+static int alloc_pmd_page(pud_t *pud)
+{
+   pmd_t *pmd = (pmd_t *)get_zeroed_page(GFP_KERNEL | __GFP_NOTRACK);
+   if (!pmd)
+   return -1;
+
+   set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
+   return 0;
+}
+
+#define populate_pmd(cpa, start, end, pages, pud, pgprot)  (-1)
+
+static int populate_pud(struct cpa_data *cpa, unsigned long start, pgd_t *pgd,
+   pgprot_t pgprot)
+{
+   pud_t *pud;
+   unsigned long end;
+   int cur_pages = 0;
+
+   end = start + (cpa->numpages << PAGE_SHIFT);
+
+   /*
+* Not on a Gb page boundary? => map everything up to it with
+* smaller pages.
+*/
+   if (start & (PUD_SIZE - 1)) {
+   unsigned long pre_end;
+   unsigned long next_page = (start + PUD_SIZE) & PUD_MASK;
+
+   pre_end   = min_t(unsigned long, end, next_page);
+   cur_pages = (pre_end - start) >> PAGE_SHIFT;
+   cur_pages = min_t(int, (int)cpa->numpages, cur_pages);
+
+   pud = pud_offset(pgd, start);
+
+   /*
+* Need a PMD page?
+*/
+   if (pud_none(*pud))
+   if (alloc_pmd_page(pud))
+   return -1;
+
+   cur_pages = populate_pmd(cpa, start, pre_end, cur_pages,
+pud, pgprot);
+   if (cur_pages < 0)
+   return cur_pages;
+
+   start = pre_end;
+   }
+
+   /* We mapped them all? */
+   if (cpa->numpages == cur_pages)
+   return cur_pages;
+
+   pud = pud_offset(pgd, start);
+
+   /*
+* Map everything starting from the Gb boundary, possibly with 1G pages
+*/
+   while (end - start >= PUD_SIZE) {
+   set_pud(pud, __pud(cpa->pfn | _PAGE_PSE | massage_pgprot(pgprot)));
+
+   start += PUD_SIZE;
+   cpa->pfn  += PUD_SIZE;
+   cur_pages += PUD_SIZE >> PAGE_SHIFT;
+   pud++;
+   }
+
+   /* Map trailing leftover */
+   if (start < end) {
+   int tmp;
+
+   pud = pud_offset(pgd, start);
+   if (pud_none(*pud))
+   if (alloc_pmd_page(pud))
+   return -1;
+
+   tmp = populate_pmd(cpa, start, end, cpa->numpages - cur_pages,
+  pud, pgprot);
+   if (tmp < 0)
+   return cur_pages;
+
+   cur_pages += tmp;
+   }
+   return cur_pages;
+}
 
 /*
  * Restrictions for kernel page table do not necessarily apply when mapping in
-- 
1.8.4
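
Editorial note: the head/body/tail split that populate_pud() performs can be sketched in plain userspace C. This is only an illustration of the arithmetic, assuming 4K base pages and 1G PUD entries. The `DEMO_*` constants, `struct split` and `split_range()` are hypothetical names, and the sketch does not model page table allocation or whether the CPU actually supports 1G pages.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_PUD_SIZE   (1ULL << 30)            /* 1G */
#define DEMO_PUD_MASK   (~(DEMO_PUD_SIZE - 1))

struct split {
    uint64_t head_pages;  /* 4K pages before the first 1G boundary */
    uint64_t gb_pages;    /* number of whole 1G mappings */
    uint64_t tail_pages;  /* 4K pages after the last whole 1G chunk */
};

/* Split [start, start + numpages * 4K) the way the patch describes. */
static struct split split_range(uint64_t start, uint64_t numpages)
{
    struct split s = { 0, 0, 0 };
    uint64_t end = start + (numpages << DEMO_PAGE_SHIFT);

    /* Unaligned start: cover everything up to the next 1G boundary. */
    if (start & (DEMO_PUD_SIZE - 1)) {
        uint64_t next = (start + DEMO_PUD_SIZE) & DEMO_PUD_MASK;
        uint64_t pre_end = end < next ? end : next;

        s.head_pages = (pre_end - start) >> DEMO_PAGE_SHIFT;
        start = pre_end;
    }

    /* Whole 1G chunks. */
    while (end - start >= DEMO_PUD_SIZE) {
        s.gb_pages++;
        start += DEMO_PUD_SIZE;
    }

    /* Trailing leftover smaller than 1G. */
    s.tail_pages = (end - start) >> DEMO_PAGE_SHIFT;
    return s;
}
```

For a range that starts two pages below a 1G boundary and runs three pages past the second boundary after it, the sketch yields a head of 2 pages, two 1G mappings and a tail of 3 pages.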



Re: [PATCH 00/11] EFI runtime services virtual mapping

2013-09-20 Thread Borislav Petkov
On Fri, Sep 20, 2013 at 03:29:04PM +0800, Dave Young wrote:
> Just tested this series. For the 1st kernel it boots OK in qemu+OVMF, but
> it immediately reboots on my Thinkpad T420. Unfortunately there's no
> way to debug this very early problem because there's no serial port and
> earlyprintk does not work for EFI boot. There is no USB debug on
> this machine either. I will test it when I go back to work after the China
> holiday.

Hmm, I'm booting with the efi boot stub, how do you do it?

> OTOH, for 2nd kernel testing: because kexec tools does not fill
> efi_info[] in bootparam, the kernel will disable EFI; it also passes the
> acpi_rsdp pointer automatically to make the 2nd kernel boot OK.

Right, the way this could be done is to pass in efi_info.efi_memmap,
i.e. the physical map and then iterate over it and compute the virtual
addresses *without* calling phys_efi_set_virtual_address_map() - they
are stable now.
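
Editorial note: the point about stable addresses can be illustrated with a small userspace sketch. If both kernels walk the same physical memory map in the same order and hand out virtual addresses top-down from a fixed start, they arrive at identical results without consulting the firmware. The window bounds follow the -4G/64G scheme discussed in this thread; `struct demo_region`, `assign_virt()` and the overflow policy (stop at the first region that does not fit) are assumptions made for the example, not the patch's code.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_VA_START ((uint64_t)-4 * (1ULL << 30))   /* -4G  */
#define DEMO_VA_END   ((uint64_t)-68 * (1ULL << 30))  /* -68G */

struct demo_region {
    uint64_t phys_addr;
    uint64_t num_pages;  /* 4K pages */
    uint64_t virt_addr;  /* filled in by assign_virt() */
};

/*
 * Walk the map in order and hand out virtual addresses top-down.
 * Returns the number of regions placed; stops at the first region
 * that would fall below DEMO_VA_END.
 */
static size_t assign_virt(struct demo_region *map, size_t n)
{
    uint64_t va = DEMO_VA_START;
    size_t i;

    for (i = 0; i < n; i++) {
        uint64_t size = map[i].num_pages << 12;

        if (size > va - DEMO_VA_END)
            break;  /* would overflow the 64G window */

        va -= size;
        map[i].virt_addr = va;
    }
    return i;
}
```

Because the result depends only on the map contents and their order, a second kernel that receives the same physical map can recompute the same virtual addresses.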

> I tested with a user space patch which copies efi_info from the 1st kernel
> to bootparams. As I said previously, this is not enough because several
> fields in systab (fw_vendor, runtime and tables) are converted to
> virtual addresses, but the kernel EFI init function assumes they are
> physical addresses. Thus we need to save these physical addresses. I have a
> patch to save them and pass them to the 2nd kernel in bootparams.

Yep.

> Since the mappings are the same, I wonder if we can calculate the physical
> address from the virtual address. Idea?

Just look at the loop where we're iterating over regions in
efi_enter_virtual_mode(): we basically can do the same __map_region
calls without calling phys_efi_set_virtual_address_map.

> Another concern: is it safe for i386 EFI boot?

That's why I didn't put a git tree on k.org - I wanted to run tests
myself before Fengguang's robot :)

But no, 32-bit is not addressed here. Which just dawned on me: Matt, I
probably should keep the ioremapping code for 32-bit, doh. I completely
went 64-bit only here :-)

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 00/11] EFI runtime services virtual mapping

2013-09-20 Thread Borislav Petkov
On Fri, Sep 20, 2013 at 04:19:40PM +0800, Dave Young wrote:
> Actually the ovmf testing is qemu-system-x86_64 -kernel , boot from grub
> fails as well. Nothing printed on serial. I guess '-kernel' is using efi stub
> to boot?

Yes.

Which OVMF are you using? Mine is pretty recent: svn revision 14530 from August.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 00/11] EFI runtime services virtual mapping

2013-09-20 Thread Borislav Petkov
On Fri, Sep 20, 2013 at 10:49:13AM +0100, Matt Fleming wrote:
> /home/build/git/efi/arch/x86/platform/efi/efi.c: In function ‘__map_region’:
> /home/build/git/efi/arch/x86/platform/efi/efi.c:753:24: error: ‘struct real_mode_header’ has no member named ‘trampoline_pgd’
> /home/build/git/efi/arch/x86/platform/efi/efi.c: In function ‘efi_enter_virtual_mode’:
> /home/build/git/efi/arch/x86/platform/efi/efi.c:863:64: error: ‘struct real_mode_header’ has no member named ‘trampoline_pgd’
> /home/build/git/efi/arch/x86/platform/efi/efi.c:867:2: error: implicit declaration of function ‘efi_sync_low_kernel_mappings’ [-Werror=implicit-function-declaration]

Yep, I know - saw them last night and fixed them. But this place will
need some reorg anyway in the next version - just don't do 32-bit builds
with this one :)

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


[PATCH -v2] EFI: Runtime services virtual mapping

2013-09-21 Thread Borislav Petkov
On Thu, Sep 19, 2013 at 04:54:54PM +0200, Borislav Petkov wrote:
> From: Borislav Petkov b...@suse.de
>
> We map the EFI regions needed for runtime services contiguously on
> virtual addresses starting from -4G down for a total max space of 64G.
> This way, we provide for stable runtime services addresses across
> kernels so that a kexec'd kernel can still use them.
>
> This way, they're mapped in a separate pagetable so that we don't
> pollute the kernel namespace (you can see how the whole ioremapping and
> saving and restoring of PGDs is gone now).

Ok, this one was not so good, let's try again:

This time I saved 32-bit and am switching the pagetable only after
having built it properly. This boots fine again on baremetal and on OVMF
with Matt's handover flags fix from yesterday.

Also, I've uploaded the whole series to
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git, branch
efi-experimental

(-experimental doesn't trigger Fengguang's robot :-))

Good luck! :-)

---
From 880fcee20209a122eda846e7f109776ed1c56de5 Mon Sep 17 00:00:00 2001
From: Borislav Petkov b...@suse.de
Date: Wed, 18 Sep 2013 17:35:42 +0200
Subject: [PATCH] EFI: Runtime services virtual mapping

We map the EFI regions needed for runtime services contiguously on
virtual addresses starting from -4G down for a total max space of 64G.
This way, we provide for stable runtime services addresses across
kernels so that a kexec'd kernel can still use them.

This way, they're mapped in a separate pagetable so that we don't
pollute the kernel namespace (you can see how the whole ioremapping and
saving and restoring of PGDs is gone now).

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/include/asm/efi.h   | 43 ++
 arch/x86/include/asm/pgtable_types.h |  3 +-
 arch/x86/platform/efi/efi.c  | 68 -
 arch/x86/platform/efi/efi_32.c   | 29 +++-
 arch/x86/platform/efi/efi_64.c   | 85 +++-
 arch/x86/platform/efi/efi_stub_64.S  | 53 ++
 6 files changed, 181 insertions(+), 100 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 0062a0125041..9a99e0499e4b 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -69,24 +69,31 @@ extern u64 efi_call6(void *fp, u64 arg1, u64 arg2, u64 arg3,
efi_call6((f), (u64)(a1), (u64)(a2), (u64)(a3), \
  (u64)(a4), (u64)(a5), (u64)(a6))
 
+#define _efi_call_virtX(x, f, ...) \
+({ \
+   efi_status_t __s;   \
+   \
+   efi_sync_low_kernel_mappings(); \
+   preempt_disable();  \
+   __s = efi_call##x((void *)efi.systab->runtime->f, __VA_ARGS__); \
+   preempt_enable();   \
+   __s;\
+})
+
 #define efi_call_virt0(f)  \
-   efi_call0((efi.systab->runtime->f))
-#define efi_call_virt1(f, a1)  \
-   efi_call1((efi.systab->runtime->f), (u64)(a1))
-#define efi_call_virt2(f, a1, a2)  \
-   efi_call2((efi.systab->runtime->f), (u64)(a1), (u64)(a2))
-#define efi_call_virt3(f, a1, a2, a3)  \
-   efi_call3((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
- (u64)(a3))
-#define efi_call_virt4(f, a1, a2, a3, a4)  \
-   efi_call4((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
- (u64)(a3), (u64)(a4))
-#define efi_call_virt5(f, a1, a2, a3, a4, a5)  \
-   efi_call5((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
- (u64)(a3), (u64)(a4), (u64)(a5))
-#define efi_call_virt6(f, a1, a2, a3, a4, a5, a6)  \
-   efi_call6((efi.systab->runtime->f), (u64)(a1), (u64)(a2), \
- (u64)(a3), (u64)(a4), (u64)(a5), (u64)(a6))
+   _efi_call_virtX(0, f)
+#define efi_call_virt1(f, a1)  \
+   _efi_call_virtX(1, f, (u64)(a1))
+#define efi_call_virt2(f, a1, a2)  \
+   _efi_call_virtX(2, f, (u64)(a1), (u64)(a2))
+#define efi_call_virt3(f, a1, a2, a3)  \
+   _efi_call_virtX(3, f, (u64)(a1), (u64)(a2), (u64)(a3))
+#define efi_call_virt4(f, a1, a2, a3, a4)  \
+   _efi_call_virtX(4, f, (u64)(a1), (u64)(a2), (u64)(a3), (u64)(a4))
+#define efi_call_virt5(f, a1, a2, a3, a4, a5)  \
+   _efi_call_virtX(5, f, (u64)(a1), (u64)(a2), (u64)(a3), (u64)(a4), (u64)(a5))
+#define efi_call_virt6(f, a1, a2, a3, a4, a5, a6)  \
+   _efi_call_virtX(6, f, (u64)(a1), (u64)(a2

Re: [PATCH 02/11] efi: Remove EFI_PAGE_SHIFT and EFI_PAGE_SIZE

2013-09-21 Thread Borislav Petkov
On Sat, Sep 21, 2013 at 05:21:39PM +0200, Leif Lindholm wrote:

> It will probably not be a problem on the stub side, and it's not used
> in many places but it would break efi_lookup_mapped_address(),
> efi_range_is_wc() and memrange_efi_to_native() for use by arm64.
> At least the first of these would be a problem.

Ok, maybe the generic header include/linux/efi.h might be a problem but
the rest are changes to arch/x86/ which should have no effect whatsoever
on any other arch.

Or are you planning to move some of it into generic code?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 02/11] efi: Remove EFI_PAGE_SHIFT and EFI_PAGE_SIZE

2013-09-21 Thread Borislav Petkov
On Sat, Sep 21, 2013 at 05:41:43PM +0200, Borislav Petkov wrote:
> On Sat, Sep 21, 2013 at 05:21:39PM +0200, Leif Lindholm wrote:
>
> > It will probably not be a problem on the stub side, and it's not used
> > in many places but it would break efi_lookup_mapped_address(),
> > efi_range_is_wc() and memrange_efi_to_native() for use by arm64.
> > At least the first of these would be a problem.
>
> Ok, maybe the generic header include/linux/efi.h might be a problem but
> the rest are changes to arch/x86/ which should have no effect whatsoever
> on any other arch.
>
> Or are you planning to move some of it into generic code?

Oh, and arm64 defines a respective PAGE_SIZE too, so what's the problem?
Or is possibly EFI_PAGE_SIZE != PAGE_SIZE on arm64?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 02/11] efi: Remove EFI_PAGE_SHIFT and EFI_PAGE_SIZE

2013-09-21 Thread Borislav Petkov
On Sat, Sep 21, 2013 at 06:01:21PM +0200, Leif Lindholm wrote:
> Correct. On arm64, EFI_PAGE_SIZE will be 4K, and PAGE_SIZE can be 4K
> or 64K, with at least Fedora opting for 64K.

Hm, ok, it looks like we want to keep EFI_PAGE_SIZE.

Oh well.

Thanks.
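
Editorial note: the reason the two constants cannot be merged is easy to see with numbers. An EFI memory descriptor counts 4K pages regardless of the kernel's page size, so shifting `num_pages` by the kernel's page shift only gives the right size when the kernel also uses 4K pages. The sketch below is plain userspace C; `DEMO_EFI_PAGE_SHIFT`, `efi_region_bytes()` and `wrong_region_bytes()` are hypothetical names used only for this illustration.

```c
#include <stdint.h>

#define DEMO_EFI_PAGE_SHIFT 12  /* EFI descriptors count 4K pages */

/* Size in bytes of a region described in EFI pages. */
static uint64_t efi_region_bytes(uint64_t num_pages)
{
    return num_pages << DEMO_EFI_PAGE_SHIFT;
}

/* What you get by shifting by the kernel's page shift instead. */
static uint64_t wrong_region_bytes(uint64_t num_pages,
                                   unsigned int kernel_page_shift)
{
    return num_pages << kernel_page_shift;
}
```

With a 4K kernel page (shift 12) both functions agree. With a 64K kernel page (shift 16) a 256-page descriptor would be computed as 16 MiB instead of 1 MiB.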

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2] EFI: Runtime services virtual mapping

2013-09-22 Thread Borislav Petkov
On Sun, Sep 22, 2013 at 08:35:15PM +0800, Dave Young wrote:
> I tested your new patch, it works both with efi stub and grub boot in
> 1st kernel.

Good, thanks!

> But it panicked in kexec boot with my kexec related patchset; the patchset

That's the second kernel, right?

> contains 3 patches:
> 1. introduce cmdline kexecboot=0|1|2; 1 == kexec, 2 == kdump
> 2. export physical addr fw_vendor, runtime, tables to /sys/firmware/efi/systab
> 3. if kexecboot != 0, use fw_vendor, runtime, tables from bootparams; also do
>    not call SetVirtualAddressMap in case of kexecboot.
>
> The panic happens at the last line of efi_init:
> /* clean DUMMY object */
> efi.set_variable(efi_dummy_name, &EFI_DUMMY_GUID,
>                  EFI_VARIABLE_NON_VOLATILE |
>                  EFI_VARIABLE_BOOTSERVICE_ACCESS |
>                  EFI_VARIABLE_RUNTIME_ACCESS,
>                  0, NULL);
>
> Below is the dmesg:
> [0.003359] pid_max: default: 32768 minimum: 301
> [0.004792] BUG: unable to handle kernel paging request at fffefde97e70
> [0.00] IP: [8103a1db] virt_efi_set_variable+0x40/0x54
> [0.00] PGD 36981067 PUD 35828063 PMD 0

Here it is - fffefde97e70 is not mapped in the pagetable, PMD is 0.
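
Editorial note: the "PGD ... PUD ... PMD 0" line in the oops shows the walk finding valid entries at the first two levels and an empty one at the third. To see which slots a given address selects, it helps to split it into its per-level indices. The sketch below assumes x86-64 4-level paging with 4K pages (9 index bits per level above a 12-bit offset). `struct demo_idx` and `decode_va()` are hypothetical names, and the address used in the usage note is an assumed full 64-bit value chosen to sit just below -4G, not a value taken verbatim from the log above.

```c
#include <stdint.h>

struct demo_idx {
    unsigned int pgd, pud, pmd, pte;
};

/* x86-64 4-level paging: 9 index bits per level, 12-bit page offset. */
static struct demo_idx decode_va(uint64_t va)
{
    struct demo_idx i;

    i.pgd = (va >> 39) & 0x1ff;
    i.pud = (va >> 30) & 0x1ff;
    i.pmd = (va >> 21) & 0x1ff;
    i.pte = (va >> 12) & 0x1ff;
    return i;
}
```

For the assumed address 0xfffffffefde97e70 this gives PGD slot 511, PUD slot 507, PMD slot 495 and PTE slot 151. The -4G mark itself (0xffffffff00000000) is PUD slot 508 of the same PGD slot, so the assumed address lives in the 1G chunk directly below it.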

Ok, can you upload your patches somewhere and tell me exactly how to
reproduce this so that I can take a look too?

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2] EFI: Runtime services virtual mapping

2013-09-22 Thread Borislav Petkov
On Sun, Sep 22, 2013 at 08:27:34AM -0700, H. Peter Anvin wrote:
> The address that faults is interesting in that it is indeed just below
> -4G. The question at hand is probably what information you are using
> to build the EFI mappings in the secondary kernel and what could make
> it not match the primary.

Yep, so obviously we're not building the pagetable in the second kernel
the same way as the first or we're missing some pieces.

Btw, for debugging situations like this one, one could use
arch/x86/mm/dump_pagetables.c successfully by sticking in the right CR3
value into *start.

:-)

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2] EFI: Runtime services virtual mapping

2013-09-23 Thread Borislav Petkov
On Mon, Sep 23, 2013 at 01:47:41PM +0800, Dave Young wrote:
> > +   unsigned long size = md->num_pages << PAGE_SHIFT;
> > +
> > +   efi_va -= size;
> > +   if (efi_va < EFI_VA_END) {
> > +   pr_warning(FW_WARN "VA address range overflow!\n");
> > +   return;
> > +   }
> > +
> > +   /* Do the 1:1 map */
> > +   __map_region(md, md->phys_addr);
> > +
> > +   /* Do the VA map */
> > +   __map_region(md, efi_va);
>
>
> Could you add a comment for the above code? It's hard to understand the
> double mapping if one did not follow the old thread.

Does that suffice:

/*
 * Make sure the 1:1 mappings are present as a catch-all for b0rked firmware
 * which doesn't update all internal pointers after switching to virtual mode
 * and would otherwise crap on us.
 */

?

Btw, when you reply to a mail, please remove that quoted portion of it
which you're not replying to - I had to scroll a bunch of screens down
and I almost missed your reply. :)

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2] EFI: Runtime services virtual mapping

2013-09-23 Thread Borislav Petkov
On Sat, Sep 21, 2013 at 01:39:29PM +0200, Borislav Petkov wrote:
> -void __init efi_call_phys_prelog(void)
> +/*
> + * We allocate runtime services regions top-down, starting from -4G, i.e.
> + * 0xffff_ffff_0000_0000 and limit EFI VA mapping space to 64G.
> + */
> +static u64 efi_va = -4 * (1UL << 30);
> +#define EFI_VA_END (-68 * (1UL << 30))

Note to self: add this range to Documentation/x86/x86_64/mm.txt
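
Editorial note: for anyone writing that documentation entry, the two bounds can be checked with unsigned 64-bit arithmetic. This is a userspace sketch only; `DEMO_GB` and the three helper functions are hypothetical names, and `uint64_t` wraparound stands in for the kernel's `u64`.

```c
#include <stdint.h>

#define DEMO_GB (1ULL << 30)

/* Top of the EFI window: -4G as an unsigned 64-bit address. */
static uint64_t demo_va_start(void)
{
    return (uint64_t)-4 * DEMO_GB;
}

/* Bottom of the EFI window: -68G. */
static uint64_t demo_va_end(void)
{
    return (uint64_t)-68 * DEMO_GB;
}

/* Size of the window in bytes. */
static uint64_t demo_va_span(void)
{
    return demo_va_start() - demo_va_end();
}
```

The start works out to 0xffffffff00000000, the end to 0xffffffef00000000, and the span between them to exactly 64G.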

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2] EFI: Runtime services virtual mapping

2013-09-30 Thread Borislav Petkov
On Thu, Sep 26, 2013 at 11:12:42AM +0800, Dave Young wrote:
> If we choose this approach, can we save not only the efi_mapping, but
> also the fields which will be converted to virt addr, like fw_vendor,
> runtime, tables? During my test on an HP workstation, the config table
> item (SMBIOS) is also converted to virt addr though the spec only mentions
> fw_vendor/runtime/tables.

Btw, I was about to ask: how do you pass boot_params to the kexec
kernel?

Because I'm looking into hpa's idea to pass an efi_mapping array of
regions with setup_data but how does this get passed to the kexec'ed
kernel? I see in your patches you have boot_params.saved_*** for the
needed info but you're not writing to them anywhere. Is that why you've
added them to the systab_show function so that userspace can parse it
and build the boot_params thing?

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH -v2] EFI: Runtime services virtual mapping

2013-10-04 Thread Borislav Petkov
On Fri, Oct 04, 2013 at 07:43:37AM -0700, H. Peter Anvin wrote:
> We can do that... but it is different from what Windows does to my
> understanding and it also has the potential of severe pathologies...
> e.g. a window at the top of the address space being mapped.

Right, so after Matt and I talked about it a bit on IRC, we actually
don't really care how we do the mappings if we spell them out later to
kexec over proc or somewhere else, as you wanted.

So we can do the VA address space saving scheme first and change it
later, if there are issues. We'll see.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


[PATCH 11/12] efi: Add an efi= kernel command line parameter

2013-10-08 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

... for passing miscellaneous options and chicken bits from the command
line.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/platform/efi/efi.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 538c1e6b7b2c..16996aba5012 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -1113,3 +1113,12 @@ efi_status_t efi_query_variable_store(u32 attributes, unsigned long size)
 	return EFI_SUCCESS;
 }
 EXPORT_SYMBOL_GPL(efi_query_variable_store);
+
+static int __init parse_efi_cmdline(char *str)
+{
+	if (*str == '=')
+		str++;
+
+	return 0;
+}
+early_param("efi", parse_efi_cmdline);
-- 
1.8.4
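
For illustration only, and not part of the patch above: a minimal user-space sketch of how such a handler could recognize an option once one exists. The option name "old_map", the flag variable and the function name are assumptions made for this sketch; the posted patch only skips the leading '='.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag storage; the real code would keep such state elsewhere. */
static bool efi_old_map_requested;

/*
 * Sketch of an efi= handler. "old_map" is used here only as an example
 * option name; the patch above does not parse any option yet.
 */
static int parse_efi_cmdline_sketch(const char *str)
{
	assert(str != NULL);

	/* Skip the '=' that separates the parameter name from its value. */
	if (*str == '=')
		str++;

	if (!strncmp(str, "old_map", 7))
		efi_old_map_requested = true;

	return 0;
}
```

Unknown options are silently ignored in this sketch, which matches the "chicken bits" spirit of the parameter but is a design choice, not something the patch states.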

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 12/12] EFI: Runtime services virtual mapping

2013-10-10 Thread Borislav Petkov
On Thu, Oct 10, 2013 at 04:14:34PM +0800, Dave Young wrote:
 Even though I still have no idea why kernel text overlap with efi boot
 region, anyway map the un-overlapped part is necessary though.

 I can post the kexec related patches after your mapping patches settle
 down

Right, settle down being the key here.

Matt just mentioned on IRC that we might not need boot services mappings
by the time we have to start the kexec kernel, which would mean, you
don't have to do anything in efi_reserve_boot_services().

The question which needs answering first though is, how the whole efi
thing is going to handle any functionality like calling into efi boot
regions from runtime functions and such. Which hasn't really been tested
and fw vendors don't really want to support that. But this is all bits
and pieces I heard yesterday so it is all pretty wet and I'll let efi
guys, i.e. the Matts and a couple of others :-), figure out this whole
issue.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 12/12] EFI: Runtime services virtual mapping

2013-10-11 Thread Borislav Petkov
On Fri, Oct 11, 2013 at 02:24:37PM +0800, Dave Young wrote:
 But for current implementation from Boris, getting same mapping
 between different kernels depends on same md order (same start and
 size for each one). How about using this mapping solution but at the
 same time for kexec kernel we also pass the virtual mappings via
 setup_data, only thing different is we only need map the non boot
 region and just use the boot region size to ensure the other regions
 are mapped with same virtual address.

Actually, as hpa suggested, we will need to be passing the explicit
virtual addresses to the kexec kernel in case we change the mapping
algorithm in the future. So all should go through setup_data.

 OTOH, if we only passing ioremapped data without Boris's current patch
 the problem I worry about is how can we ensure the addresses are not
 used by other code before we mapping the in 2nd kernel efi_init.

Right, the old method of mapping EFI runtime regions used ioremap and
was mapping the regions in the same address space. Now we have reserved
a 64G window in the VA space ending at -4G (i.e. 0xffff_ffff_0000_0000)
which is used only for EFI RT.
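
As a rough illustration of that window (a user-space sketch, not the kernel code): the two bounds follow from -4G and -68G, and regions are handed out top-down from the upper bound. The helper name efi_va_alloc() and the EFI_VA_START macro are made up here, and alignment handling is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

#define GiB		(1ULL << 30)
#define EFI_VA_START	((uint64_t)(-4 * (int64_t)GiB))	 /* 0xffffffff00000000 */
#define EFI_VA_END	((uint64_t)(-68 * (int64_t)GiB)) /* 0xffffffef00000000 */

/* Next free address; regions are handed out below it, top-down. */
static uint64_t efi_va = EFI_VA_START;

/*
 * Reserve 'size' bytes below the current efi_va and return the start of the
 * reservation, or 0 if the 64G window would be exceeded.
 */
static uint64_t efi_va_alloc(uint64_t size)
{
	assert(size != 0);

	if (size > efi_va - EFI_VA_END)
		return 0;

	efi_va -= size;
	return efi_va;
}
```

The first 2M region would land at 0xfffffffeffe00000 in this model, directly below the -4G mark.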

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 12/12] EFI: Runtime services virtual mapping

2013-10-12 Thread Borislav Petkov
On Sat, Oct 12, 2013 at 11:13:08AM +0100, Matt Fleming wrote:
 On Sat, 12 Oct, at 03:54:44PM, Dave Young wrote:
  Boris:
  
  For the boot service region overlapping problem I have another idea,
  how about modify your mapping code to always mapping the RUNTIME region
  (non boot service region) firstly from the efi_va, then mapping other
  regions in order, in this way kexec 2nd kernel will be happy because
  it does not call SetVirtualAddressMap and it does not need the boot
  service area at all.
 
 Coalescing the runtime regions together implies that the second kernel
 would care about the fragmentation caused by unmapping the boot service
 regions - it shouldn't. We've sliced up a considerable chunk of kernel
 virtual address space (64G) and fragmentation shouldn't be an issue
 right now.
 
 Even if we run out of address space in the future due to fragmentation,
 and end up needing to coalesce runtime regions, this would be
 transparent to the kexec kernel because it's passed the memmap entries
 through setup_data.
 
 Though we are defining an ABI around the EFI address range
 (0xffffffef00000000 - 0xffffffff00000000), such that it needs to be the
 same between kernels, we must not make the layout of regions within that
 range part of the ABI. We need the freedom to change the layout in the
 future.

Basically, to sum up what Matt so eloquently explained, we will be
passing all the runtime regions *but* *not* the boot regions (because
the kexec kernel doesn't need them anyway) through setup_data to the
kexec kernel.

I.e., boot services regions are a don't-care for kexec.

And it is very important to restate that we want to reserve ourselves
the most flexible way of passing regions to the kexec kernel in case we
want to change the mapping algorithm in the future. Therefore, kexec
should simply not know anything about the VA layout of the EFI regions
but will get them spelled out through the boot header's setup_data.
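
A hedged sketch of what "spelled out" could look like on the producing side, in plain user-space C: walk the memmap, drop everything that is not a runtime region, and hand over the rest together with the virtual address the first kernel picked. struct efi_region and efi_collect_runtime_regions() are simplified stand-ins invented for this example; how the result is packed into setup_data is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFI_MEMORY_RUNTIME	0x8000000000000000ULL	/* attribute bit from the UEFI spec */

/* Simplified stand-in for efi_memory_desc_t. */
struct efi_region {
	uint32_t type;
	uint64_t phys_addr;
	uint64_t virt_addr;
	uint64_t num_pages;
	uint64_t attribute;
};

/*
 * Copy only the runtime regions, including the virtual address the first
 * kernel chose, into 'out'. Returns the number of entries copied.
 */
static size_t efi_collect_runtime_regions(const struct efi_region *map, size_t n,
					  struct efi_region *out, size_t out_len)
{
	size_t i, count = 0;

	assert(map != NULL && out != NULL);

	for (i = 0; i < n && count < out_len; i++) {
		if (!(map[i].attribute & EFI_MEMORY_RUNTIME))
			continue;	/* boot services etc.: a don't-care for kexec */
		out[count++] = map[i];
	}
	return count;
}
```

Because each entry carries its own virt_addr, the consumer never has to know how the addresses were chosen, which is the flexibility argued for above.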

This is the picture so far, AFAICT. Matt, please make a lot of noise if
I've misrepresented anything.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 12/12] EFI: Runtime services virtual mapping

2013-10-13 Thread Borislav Petkov
On Sun, Oct 13, 2013 at 11:11:27AM +0800, Dave Young wrote:
 Boris, I think we have got the agreement about passing setup_data?

Yes.

Basically, we want to start with what hpa suggested and see where it
gets us:

http://marc.info/?l=linux-kernel&m=138006799131051

 I think it should be on top of your patch series,

Yep.

 I can work on that along with other kexec related patches. Or if you
 would like to do it please let me know.

Absolutely, please feel free to do so - it's not like I don't have
anything else to do :-)

In the meantime, I'll finish randconfigs testing of the patches and
upload the latest version to k-org, I'll let you know.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 12/12] EFI: Runtime services virtual mapping

2013-10-21 Thread Borislav Petkov
On Mon, Oct 21, 2013 at 08:47:39PM +0800, Dave Young wrote:
 What's the status of this series?

They should appear at some point in Matt's efi-next branch, I think.

 I need below patch for mapping to fixed virt addr passed
 from 1st kernel.

You need this to map the runtime regions in the kexec kernel, right?
Please write that in the commit message.

 Would you like to add it to your series or I send out it later?

Yeah, just add it to your patchset.

 BTW, what tree should my patches based on? Matt's next tree?

Yeah, I think efi-next. Matt?


Re: [PATCH 12/12] EFI: Runtime services virtual mapping

2013-10-22 Thread Borislav Petkov
On Mon, Oct 21, 2013 at 11:04:26PM +0800, Dave Young wrote:
  You need this to map the runtime regions in the kexec kernel, right?
  Please write that in the commit message.
 
 Yes, will do

Ok, but why doesn't the normal code path in efi_enter_virtual_mode
work anymore? I mean, why do you need another function instead of doing
what you did previously:

	if (!kexec)
		phys_efi_set_virtual_address_map(...)

The path up to here does the mapping already anyway so you only need to
do the mapping in the kexec kernel and skip the set_virtual_map thing.

Thanks.


Re: [PATCH 12/12] EFI: Runtime services virtual mapping

2013-10-23 Thread Borislav Petkov
On Wed, Oct 23, 2013 at 10:17:31AM +0800, Dave Young wrote:
 The reason is that I only pass runtime regions from 1st kernel to
 kexec kernel, your efi mapping function uses the region size to
 determine the virtual address from top to down. Because the passed-in
 md ranges in kexec kernel are different from ranges booting from
 firmware so the virtual address will be different.

Well, this shouldn't be because SetVirtualAddressMap has already fixed
the virtual addresses for us. And if they're different, then runtime
services won't work anyway. Or am I missing something...?

 Even I pass the whole untouched ranges including BOOT_SERVICE there's
 still chance the function for reserving boot regions overwrite the
 boot region size to 0, and 1st kernel will leave it to be used as
 normal memory after efi init. I think we have talked about this issue
 previously.

Matt, didn't you question the need to keep boot services regions
mapped indefinitely? What was the story there?

Thanks.


Re: [PATCH 12/12] EFI: Runtime services virtual mapping

2013-10-23 Thread Borislav Petkov
On Wed, Oct 23, 2013 at 08:51:31PM +0800, Dave Young wrote:
 In kexec'ed 2nd kernel, phys_start_b need to be mapped to virt_start_b
 Simply use efi_map_region from your patch does not work because it
 will map phys_start_b to a different virt address, isn't it?

Oh ok, in the second kernel we're not mapping *all* regions we do map in
the first kernel, right.

 So I need simply map according to the kexec passed in mapping addr.

Yes, thanks for elaborating.
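
To make the mismatch concrete, here is a small user-space model (an illustration with made-up names, not code from either patch set). Deriving addresses from a running cursor gives a runtime region a different VA as soon as an earlier region is missing from the list, while using the saved address does not depend on the list at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFI_VA_START	0xffffffff00000000ULL	/* -4G */

struct region {
	uint64_t size;
	uint64_t virt_addr;
	int runtime;	/* 1: runtime region, 0: boot services region */
};

/* First-kernel style: derive each VA from the running efi_va cursor. */
static void map_top_down(struct region *r, size_t n)
{
	uint64_t efi_va = EFI_VA_START;
	size_t i;

	assert(r != NULL);
	for (i = 0; i < n; i++) {
		efi_va -= r[i].size;
		r[i].virt_addr = efi_va;
	}
}

/* kexec style: trust the address saved by the first kernel. */
static uint64_t map_fixed(const struct region *saved)
{
	assert(saved != NULL);
	return saved->virt_addr;
}
```

With a 1M boot region followed by a 2M runtime region, the first kernel places the runtime region at 0xfffffffeffd00000; re-running the cursor scheme on the runtime region alone would yield 0xfffffffeffe00000 instead.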


Re: [patch 2/6] x86 efi: reserve boot service fix

2013-10-27 Thread Borislav Petkov
On Sun, Oct 27, 2013 at 11:47:15AM +0800, dyo...@redhat.com wrote:
 Current code check boot service region with kernel text region by: 
 start+size >= __pa_symbol(_text)
 The end of the above region should be start + size - 1 instead.
 
 I see this problem in ovmf + Fedora 19 grub boot:
 text start: 100 md start: 80 md size: 80
 
 Signed-off-by: Dave Young dyo...@redhat.com

Acked-by: Borislav Petkov b...@suse.de

Btw, Matt, this being a bugfix and all, shouldn't it be tagged for
stable?
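
For reference, the off-by-one can be shown in a few lines of user-space C. This is only a sketch: the function names are invented, the second half of the condition (comparison against the end of the kernel image) is an assumption about the surrounding check, and the addresses used below are hypothetical values chosen so that the region ends exactly where the text begins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Check as described in the report: treats start + size as part of the region. */
static bool overlaps_text_old(uint64_t start, uint64_t size,
			      uint64_t text, uint64_t text_end)
{
	assert(size != 0);
	return start + size >= text && start <= text_end;
}

/* Proposed check: the last byte of the region is start + size - 1. */
static bool overlaps_text_fixed(uint64_t start, uint64_t size,
				uint64_t text, uint64_t text_end)
{
	assert(size != 0);
	return start + size - 1 >= text && start <= text_end;
}
```

A region of 8M starting at 8M merely touches a text section starting at 16M; the old form reports an overlap for it, the fixed form does not.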

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [patch 1/6] Add function efi_remap_region for remapping to saved virt address

2013-10-27 Thread Borislav Petkov
On Sun, Oct 27, 2013 at 11:47:14AM +0800, dyo...@redhat.com wrote:
 Kexec kernel will use saved runtime virtual mapping, so add a
 new function efi_remap_region to remapping it directly without
 calculate the virt addr from efi_va.
 
 The md is passed in from 1st kernel, the virtual addr is
 saved in md->virt_addr.
 
 Signed-off-by: Dave Young dyo...@redhat.com
 ---
  arch/x86/include/asm/efi.h |1 +
  arch/x86/platform/efi/efi_32.c |4 
  arch/x86/platform/efi/efi_64.c |   13 +
  3 files changed, 18 insertions(+)
 
 --- linux-2.6.orig/arch/x86/include/asm/efi.h
 +++ linux-2.6/arch/x86/include/asm/efi.h
 @@ -112,6 +112,7 @@ extern void efi_call_phys_epilog(void);
  extern void efi_unmap_memmap(void);
  extern void efi_memory_uc(u64 addr, unsigned long size);
  extern void __init efi_map_region(efi_memory_desc_t *md);
 +extern void __init efi_remap_region(efi_memory_desc_t *md);
  extern void efi_sync_low_kernel_mappings(void);
  extern void __init old_map_region(efi_memory_desc_t *md);
  
 --- linux-2.6.orig/arch/x86/platform/efi/efi_64.c
 +++ linux-2.6/arch/x86/platform/efi/efi_64.c
 @@ -177,6 +177,19 @@ void __init efi_map_region(efi_memory_de
  	md->virt_addr = efi_va;
  }
  
 +void __init efi_remap_region(efi_memory_desc_t *md)

remap? Why?

You did have efi_map_region_fixed() which made more sense.

 +{
 +	pgd_t *pgd = (pgd_t *)__va(real_mode_header->trampoline_pgd);
 +	unsigned long pf = 0;
 +
 +	if (!(md->attribute & EFI_MEMORY_WB))
 +		pf |= _PAGE_PCD;
 +
 +	if(kernel_map_pages_in_pgd(pgd, md->phys_addr, md->virt_addr, md->num_pages, pf))

ERROR: space required before the open parenthesis '('
#59: FILE: arch/x86/platform/efi/efi_64.c:188:
+	if(kernel_map_pages_in_pgd(pgd, md->phys_addr, md->virt_addr, md->num_pages, pf))


Please run them all through checkpatch.pl - better yet, integrate
checkpatch into your workflow, using git hooks, for example.

 +		pr_warning("Error mapping PA 0x%llx -> VA 0x%llx!\n",

WARNING: Prefer pr_warn(... to pr_warning(...
#60: FILE: arch/x86/platform/efi/efi_64.c:189:
+		pr_warning("Error mapping PA 0x%llx -> VA 0x%llx!\n",

 +			   md->phys_addr, md->virt_addr);
 +}
 +
  void __iomem *__init efi_ioremap(unsigned long phys_addr, unsigned long size,
u32 type, u64 attribute)
  {
 --- linux-2.6.orig/arch/x86/platform/efi/efi_32.c
 +++ linux-2.6/arch/x86/platform/efi/efi_32.c
 @@ -46,6 +46,10 @@ void __init efi_map_region(efi_memory_de
   old_map_region(md);
  }
  
 +void __init efi_remap_region(efi_memory_desc_t *md)
 +{
 +}

Let's keep braces on the same line as the function to save space:

void __init efi_remap_region(efi_memory_desc_t *md) {}

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [patch 2/6] x86 efi: reserve boot service fix

2013-10-28 Thread Borislav Petkov
On Mon, Oct 28, 2013 at 09:18:24AM +0800, Dave Young wrote:
 There should be some people see below message with non-kexec kernel:
 Could not reserve boot range ...

I can find one other report like that: https://lkml.org/lkml/2013/7/16/309

[0.00] efi: Could not reserve boot range [0x00-0x000fff]

for

efi: mem00: type=3, attr=0xf, range=[0x-0x1000) (0MB)

which is EFI_BOOT_SERVICES_CODE and

efi: Could not reserve boot range [0x05f000-0x09]

for

efi: mem06: type=3, attr=0xf, range=[0x0005f000-0x000a) (0MB)

which is of the same type.

 But it's hard for them to notice the bad functionality because
 it's only one mem range which might be not the boot range what
 SetVirtualAddressMap need

Right.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [patch 3/6] Cleanup efi_enter_virtual_mode function

2013-10-28 Thread Borislav Petkov
On Sun, Oct 27, 2013 at 11:47:16AM +0800, dyo...@redhat.com wrote:
 Add two small functions:
 efi_merge_regions and efi_map_regions, efi_enter_virtual_mode
 calls them instead of embedding two long for loop.
 
 Signed-off-by: Dave Young dyo...@redhat.com
 ---
  arch/x86/platform/efi/efi.c |   83 
 +++-
  1 file changed, 52 insertions(+), 31 deletions(-)
 
 --- efi.orig/arch/x86/platform/efi/efi.c
 +++ efi/arch/x86/platform/efi/efi.c
 @@ -789,35 +789,13 @@ void __init old_map_region(efi_memory_de
  		pr_err("ioremap of 0x%llX failed!\n",
  		       (unsigned long long)md->phys_addr);
  }
 -/*
 - * This function will switch the EFI runtime services to virtual mode.
 - * Essentially, look through the EFI memmap and map every region that
 - * has the runtime attribute bit set in its memory descriptor and update
 - * that memory descriptor with the virtual address obtained from ioremap().
 - * This enables the runtime services to be called without having to
 - * thunk back into physical mode for every invocation.
 - */
 -void __init efi_enter_virtual_mode(void)
 -{
 - efi_memory_desc_t *md, *prev_md = NULL;
 - void *p, *new_memmap = NULL;
 - unsigned long size;
 - efi_status_t status;
 - u64 end, systab;
 - int count = 0;
  
 - efi.systab = NULL;
 -
 - /*
 -  * We don't do virtual mode, since we don't do runtime services, on
 -  * non-native EFI
 -  */
 - if (!efi_is_native()) {
 - efi_unmap_memmap();
 - return;
 - }
 +/* Merge contiguous regions of the same type and attribute */
 +static void efi_merge_regions(void)
 +{
 + void *p;
 + efi_memory_desc_t *md, *prev_md = NULL;
  
 - /* Merge contiguous regions of the same type and attribute */
  	for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
   u64 prev_size;
   md = p;
 @@ -844,6 +822,19 @@ void __init efi_enter_virtual_mode(void)
   prev_md = md;
  
   }
 +}
 +
 +/*
 + * Map efi memory ranges for runtime serivce
 + * Return the new memmap with updated virtual addrresses.
 + */
 +void efi_map_regions(void **new_memmap, int *count)
 +{
 + efi_memory_desc_t *md, *prev_md = NULL;

Applying: Cleanup efi_enter_virtual_mode function
/home/boris/kernel/linux-2.6/.git/rebase-apply/patch:42: space before tab in indent.
efi_memory_desc_t *md, *prev_md = NULL;
error: patch failed: arch/x86/platform/efi/efi.c:862
error: arch/x86/platform/efi/efi.c: patch does not apply
Patch failed at 0001 Cleanup efi_enter_virtual_mode function

And I know git can be a bit pickier than patch but it doesn't apply with patch
either:

$ patch -p1 --dry-run -i .git/rebase-apply/patch
checking file arch/x86/platform/efi/efi.c
Hunk #3 FAILED at 853.
1 out of 3 hunks FAILED

For some reason, this patch doesn't apply and the .rej looks funny.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [patch 3/6] Cleanup efi_enter_virtual_mode function

2013-10-28 Thread Borislav Petkov
On Mon, Oct 28, 2013 at 05:51:17PM +0800, Dave Young wrote:
 It does have a warning, but it applied successfully, no idea though: 
 patches/02-efi-enter-virtual-mode-cleanup.patch
 Applying: Cleanup efi_enter_virtual_mode function
 /home/dave/git/efi/.git/rebase-apply/patch:42: space before tab in indent.
   efi_memory_desc_t *md, *prev_md = NULL;
 warning: 1 line adds whitespace errors.

Hmm, ok, can you upload the patches somewhere so that I can pull them?

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 11/12] efi: Add an efi= kernel command line parameter

2013-10-28 Thread Borislav Petkov
On Mon, Oct 28, 2013 at 11:02:13AM +, Matt Fleming wrote:
 This patch should be part of PATCH 12.

I wanted it to be separate as it adds an unrelated functionality but I
don't really care all that much - I'll merge it.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [patch 3/6] Cleanup efi_enter_virtual_mode function

2013-10-28 Thread Borislav Petkov
On Mon, Oct 28, 2013 at 06:10:11PM +0800, Dave Young wrote:
 Sorry, I have not any public git account. Attached the patch I applied
 with git

Still doesn't work:

$ patch -p1 --dry-run -i /tmp/02-efi-enter-virtual-mode-cleanup-1.patch.new 
checking file arch/x86/platform/efi/efi.c
Hunk #1 succeeded at 788 (offset -1 lines).
Hunk #2 succeeded at 821 (offset -1 lines).
Hunk #3 FAILED at 853.
1 out of 3 hunks FAILED

I'm using Matt's next branch + my efi runtime branch. What are you
basing your stuff on top of?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 12/12] EFI: Runtime services virtual mapping

2013-10-28 Thread Borislav Petkov
On Mon, Oct 28, 2013 at 11:22:46AM +, Matt Fleming wrote:
 Could you use the efi_enabled() function to test for EFI_OLD_MEMMAP
 instead of test_bit()?

Sure.

 This way we won't exhaust the bitspace quite so soon (since ARM/ARM64

Yeah, very foresightful.

 can reuse EFI_ARCH_1 if they need it), plus this memory mapping method
 is a very architecture-specific thing and so makes sense to hide it in
 the bowels of arch/x86. If it turns out that ARM/ARM64 need the exact
 same config option we can delete EFI_ARCH_1 and move EFI_OLD_MEMMAP to
 include/linux/efi.h just like in your original patch. 
 
 What do you think?

Yep, done and pushed out.
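
A user-space model of the arrangement discussed here, for illustration only: one arch-private bit that x86 gives a local name, queried through a single accessor instead of an open-coded bit test. The bit numbers, the storage variable and efi_set_facility() are assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit numbers are made up for this sketch. */
#define EFI_BOOT	0
#define EFI_ARCH_1	6			/* arch-private slot */
#define EFI_OLD_MEMMAP	EFI_ARCH_1		/* x86-only meaning of that slot */

static unsigned long efi_facility_bits;

static void efi_set_facility(int bit)
{
	assert(bit >= 0 && bit < 32);
	efi_facility_bits |= 1UL << bit;
}

/* Single accessor so callers never touch the bitmap directly. */
static bool efi_enabled(int bit)
{
	assert(bit >= 0 && bit < 32);
	return (efi_facility_bits & (1UL << bit)) != 0;
}
```

Another architecture could define its own alias for EFI_ARCH_1 without consuming a new bit, which is the bitspace argument made in the quoted text.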

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH 12/12] EFI: Runtime services virtual mapping

2013-10-30 Thread Borislav Petkov
On Wed, Oct 30, 2013 at 05:32:27PM +0800, Dave Young wrote:
 Boris, thanks for update, it's very elaborate, I have still wonder if
 32 bit case should be mentioned as well.

Ah, so that's why mfleming is bugging me about it on IRC :)

Well, I left out the 32-bit case simply because I don't think anyone
cares about it.

 Waiting for you next version of the patch series. I will redo my
 patches based on that.

Since I'm doing only minor fixups, I didn't want to spam
the lists again.

The latest version is my 'efi' branch at
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git

and you can pull it from there.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [patch 3/6] Cleanup efi_enter_virtual_mode function

2013-10-30 Thread Borislav Petkov
On Wed, Oct 30, 2013 at 10:03:49AM +0800, Dave Young wrote:
 Will try, but please keep the posted patches in mailing list up-to-date,

Would you like me to send them to you privately?

 I'm an old-fashioned person, do not tend to depend on git.

Really? You should change that - you're missing out on so much by not
using git. :)

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


[PATCH 03/11] x86, pageattr: Add a PGD pagetable populating function

2013-10-31 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

This allocates, if necessary, and populates the corresponding PGD entry
with a PUD page. The next population level is a dummy macro which will
be removed by the next patch and it is added here to keep the patch
small and easily reviewable but not break bisection, at the same time.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/mm/pageattr.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index c53de62a1170..4b47ae0602e1 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -666,6 +666,45 @@ static int split_large_page(pte_t *kpte, unsigned long address)
 	return 0;
 }
 
+#define populate_pud(cpa, addr, pgd, pgprot)	(-1)
+
+/*
+ * Restrictions for kernel page table do not necessarily apply when mapping in
+ * an alternate PGD.
+ */
+static int populate_pgd(struct cpa_data *cpa, unsigned long addr)
+{
+	pgprot_t pgprot = __pgprot(_KERNPG_TABLE);
+	bool allocd_pgd = false;
+	pgd_t *pgd_entry;
+	pud_t *pud = NULL;	/* shut up gcc */
+	int ret;
+
+	pgd_entry = cpa->pgd + pgd_index(addr);
+
+	/*
+	 * Allocate a PUD page and hand it down for mapping.
+	 */
+	if (pgd_none(*pgd_entry)) {
+		pud = (pud_t *)get_zeroed_page(GFP_KERNEL | __GFP_NOTRACK);
+		if (!pud)
+			return -1;
+
+		set_pgd(pgd_entry, __pgd(__pa(pud) | _KERNPG_TABLE));
+		allocd_pgd = true;
+	}
+
+	pgprot_val(pgprot) &= ~pgprot_val(cpa->mask_clr);
+	pgprot_val(pgprot) |=  pgprot_val(cpa->mask_set);
+
+	ret = populate_pud(cpa, addr, pgd_entry, pgprot);
+	if (ret < 0)
+		return ret;
+
+	cpa->numpages = ret;
+	return 0;
+}
+
 static int __cpa_process_fault(struct cpa_data *cpa, unsigned long vaddr,
   int primary)
 {
-- 
1.8.4
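
Not part of the patch: a small user-space sketch of the index arithmetic behind `cpa->pgd + pgd_index(addr)`, assuming x86-64 with 4-level paging (a 39-bit shift and 512 entries per PGD). The function names carry a _sketch suffix or are invented to make clear they are not the kernel helpers.

```c
#include <assert.h>
#include <stdint.h>

#define PGDIR_SHIFT	39	/* x86-64 with 4-level paging */
#define PTRS_PER_PGD	512

/* Index of the top-level (PGD) slot that covers 'addr'. */
static unsigned int pgd_index_sketch(uint64_t addr)
{
	return (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
}

/* True if [start, end) is covered by a single PGD slot. */
static int same_pgd_slot(uint64_t start, uint64_t end)
{
	assert(end > start);
	return pgd_index_sketch(start) == pgd_index_sketch(end - 1);
}
```

Under these assumptions a range such as -68G to -4G falls entirely into the last PGD slot (index 511), so a single PUD page would serve all of it.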



[PATCH 05/11] x86, pageattr: Add a PMD pagetable populating function

2013-10-31 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Handle PMD-level mappings the same as PUD ones.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/mm/pageattr.c | 82 +-
 1 file changed, 81 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 81deca77b871..968398b023c0 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -666,6 +666,16 @@ static int split_large_page(pte_t *kpte, unsigned long address)
 	return 0;
 }
 
+static int alloc_pte_page(pmd_t *pmd)
+{
+	pte_t *pte = (pte_t *)get_zeroed_page(GFP_KERNEL | __GFP_NOTRACK);
+	if (!pte)
+		return -1;
+
+	set_pmd(pmd, __pmd(__pa(pte) | _KERNPG_TABLE));
+	return 0;
+}
+
 static int alloc_pmd_page(pud_t *pud)
 {
 	pmd_t *pmd = (pmd_t *)get_zeroed_page(GFP_KERNEL | __GFP_NOTRACK);
@@ -676,7 +686,77 @@ static int alloc_pmd_page(pud_t *pud)
 	return 0;
 }
 
-#define populate_pmd(cpa, start, end, pages, pud, pgprot)	(-1)
+#define populate_pte(cpa, start, end, pages, pmd, pgprot)	do {} while (0)
+
+static int populate_pmd(struct cpa_data *cpa,
+			unsigned long start, unsigned long end,
+			unsigned num_pages, pud_t *pud, pgprot_t pgprot)
+{
+	unsigned int cur_pages = 0;
+	pmd_t *pmd;
+
+	/*
+	 * Not on a 2M boundary?
+	 */
+	if (start & (PMD_SIZE - 1)) {
+		unsigned long pre_end = start + (num_pages << PAGE_SHIFT);
+		unsigned long next_page = (start + PMD_SIZE) & PMD_MASK;
+
+		pre_end   = min_t(unsigned long, pre_end, next_page);
+		cur_pages = (pre_end - start) >> PAGE_SHIFT;
+		cur_pages = min_t(unsigned int, num_pages, cur_pages);
+
+		/*
+		 * Need a PTE page?
+		 */
+		pmd = pmd_offset(pud, start);
+		if (pmd_none(*pmd))
+			if (alloc_pte_page(pmd))
+				return -1;
+
+		populate_pte(cpa, start, pre_end, cur_pages, pmd, pgprot);
+
+		start = pre_end;
+	}
+
+	/*
+	 * We mapped them all?
+	 */
+	if (num_pages == cur_pages)
+		return cur_pages;
+
+	while (end - start >= PMD_SIZE) {
+
+		/*
+		 * We cannot use a 1G page so allocate a PMD page if needed.
+		 */
+		if (pud_none(*pud))
+			if (alloc_pmd_page(pud))
+				return -1;
+
+		pmd = pmd_offset(pud, start);
+
+		set_pmd(pmd, __pmd(cpa->pfn | _PAGE_PSE | massage_pgprot(pgprot)));
+
+		start	  += PMD_SIZE;
+		cpa->pfn  += PMD_SIZE;
+		cur_pages += PMD_SIZE >> PAGE_SHIFT;
+	}
+
+	/*
+	 * Map trailing 4K pages.
+	 */
+	if (start < end) {
+		pmd = pmd_offset(pud, start);
+		if (pmd_none(*pmd))
+			if (alloc_pte_page(pmd))
+				return -1;
+
+		populate_pte(cpa, start, end, num_pages - cur_pages,
+			     pmd, pgprot);
+	}
+	return num_pages;
+}
 
 static int populate_pud(struct cpa_data *cpa, unsigned long start, pgd_t *pgd,
pgprot_t pgprot)
-- 
1.8.4
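
The three phases of populate_pmd() (leading 4K pages up to a 2M boundary, whole 2M pages, trailing 4K pages) can be tried out in user space with the sketch below. It is not kernel code: struct split and split_range() are invented for the example, and it only counts pages instead of touching any page tables.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1ULL << PAGE_SHIFT)
#define PMD_SIZE	(1ULL << 21)	/* 2M */
#define PMD_MASK	(~(PMD_SIZE - 1))

struct split {
	uint64_t head_4k;	/* 4K pages before the first 2M boundary */
	uint64_t large_2m;	/* whole 2M pages */
	uint64_t tail_4k;	/* 4K pages after the last 2M page */
};

static struct split split_range(uint64_t start, uint64_t num_pages)
{
	uint64_t end = start + (num_pages << PAGE_SHIFT);
	struct split s = { 0, 0, 0 };

	assert((start & (PAGE_SIZE - 1)) == 0);

	/* Not on a 2M boundary: 4K pages up to the boundary or the end. */
	if (start & (PMD_SIZE - 1)) {
		uint64_t next = (start + PMD_SIZE) & PMD_MASK;
		uint64_t pre_end = end < next ? end : next;

		s.head_4k = (pre_end - start) >> PAGE_SHIFT;
		start = pre_end;
	}

	/* Whole 2M pages. */
	while (end - start >= PMD_SIZE) {
		s.large_2m++;
		start += PMD_SIZE;
	}

	/* Whatever is left is mapped with 4K pages again. */
	s.tail_4k = (end - start) >> PAGE_SHIFT;
	return s;
}
```

For a range starting one page below a 2M boundary and spanning 1028 pages, this yields one leading 4K page, two 2M pages and three trailing 4K pages.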



[PATCH 07/11] x86, pageattr: Add a PUD error unwinding path

2013-10-31 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

In case we encounter an error during the mapping of a region, we want to
unwind what we've established so far exactly the way we did the mapping.
This is the PUD part kept deliberately small for easier review.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/mm/pageattr.c | 60 --
 1 file changed, 58 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 2a1308a8c072..1cbdbbc35b47 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -666,6 +666,51 @@ static int split_large_page(pte_t *kpte, unsigned long address)
 	return 0;
 }
 
+#define unmap_pmd_range(pud, start, pre_end)	do {} while (0)
+
+static void unmap_pud_range(pgd_t *pgd, unsigned long start, unsigned long end)
+{
+	pud_t *pud = pud_offset(pgd, start);
+
+	/*
+	 * Not on a GB page boundary?
+	 */
+	if (start & (PUD_SIZE - 1)) {
+		unsigned long next_page = (start + PUD_SIZE) & PUD_MASK;
+		unsigned long pre_end	= min_t(unsigned long, end, next_page);
+
+		unmap_pmd_range(pud, start, pre_end);
+
+		start = pre_end;
+		pud++;
+	}
+
+	/*
+	 * Try to unmap in 1G chunks?
+	 */
+	while (end - start >= PUD_SIZE) {
+
+		if (pud_large(*pud))
+			pud_clear(pud);
+		else
+			unmap_pmd_range(pud, start, start + PUD_SIZE);
+
+		start += PUD_SIZE;
+		pud++;
+	}
+
+	/*
+	 * 2M leftovers?
+	 */
+	if (start < end)
+		unmap_pmd_range(pud, start, end);
+
+	/*
+	 * No need to try to free the PUD page because we'll free it in
+	 * populate_pgd's error path
+	 */
+}
+
 static int alloc_pte_page(pmd_t *pmd)
 {
 	pte_t *pte = (pte_t *)get_zeroed_page(GFP_KERNEL | __GFP_NOTRACK);
@@ -883,9 +928,20 @@ static int populate_pgd(struct cpa_data *cpa, unsigned long addr)
 	pgprot_val(pgprot) |=  pgprot_val(cpa->mask_set);
 
 	ret = populate_pud(cpa, addr, pgd_entry, pgprot);
-	if (ret < 0)
-		return ret;
+	if (ret < 0) {
+		unmap_pud_range(pgd_entry, addr,
+				addr + (cpa->numpages << PAGE_SHIFT));
 
+		if (allocd_pgd) {
+			/*
+			 * If I allocated this PUD page, I can just as well
+			 * free it in this error path.
+			 */
+			pgd_clear(pgd_entry);
+			free_page((unsigned long)pud);
+		}
+		return ret;
+	}
 	cpa->numpages = ret;
 	return 0;
 }
-- 
1.8.4
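
For readers following the unwinding logic, here is a minimal userspace sketch of how unmap_pud_range() above carves a virtual range into three phases: an unaligned head up to the next 1G boundary, whole 1G chunks, and a tail smaller than 1G. The struct pud_split type and split_pud_range() are hypothetical names used only for illustration. The sketch only counts bytes and chunks and touches no page tables, and PUD_SHIFT = 30 is assumed (x86-64 with 4K pages).

```c
#include <assert.h>
#include <stdint.h>

#define PUD_SHIFT 30
#define PUD_SIZE  (UINT64_C(1) << PUD_SHIFT)
#define PUD_MASK  (~(PUD_SIZE - 1))

/* Hypothetical result type: how a range splits across the three phases. */
struct pud_split {
	uint64_t head;	/* bytes before the first 1G boundary */
	uint64_t full;	/* number of whole 1G chunks */
	uint64_t tail;	/* leftover bytes, always smaller than 1G */
};

static struct pud_split split_pud_range(uint64_t start, uint64_t end)
{
	struct pud_split s = { 0, 0, 0 };

	assert(start <= end);

	/* Not on a 1G boundary? Peel off the head first. */
	if (start & (PUD_SIZE - 1)) {
		uint64_t next = (start + PUD_SIZE) & PUD_MASK;
		uint64_t pre_end = end < next ? end : next;

		s.head = pre_end - start;
		start = pre_end;
	}

	/* Whole 1G chunks. */
	while (end - start >= PUD_SIZE) {
		s.full++;
		start += PUD_SIZE;
	}

	/* Whatever remains is handled at the next lower level. */
	if (start < end)
		s.tail = end - start;

	return s;
}
```

Walking the range in the same order as the mapping path is what lets the error path undo exactly what was set up, which is the point the changelog makes.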



[PATCH 02/11] x86, pageattr: Lookup address in an arbitrary PGD

2013-10-31 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

This is preparatory work in order to be able to map pages into a
specified PGD and not implicitly and only into init_mm.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/mm/pageattr.c | 36 ++--
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index bb32480c2d71..c53de62a1170 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -30,6 +30,7 @@
  */
 struct cpa_data {
 	unsigned long	*vaddr;
+	pgd_t		*pgd;
 	pgprot_t	mask_set;
 	pgprot_t	mask_clr;
 	int		numpages;
@@ -322,17 +323,9 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
 	return prot;
 }
 
-/*
- * Lookup the page table entry for a virtual address. Return a pointer
- * to the entry and the level of the mapping.
- *
- * Note: We return pud and pmd either when the entry is marked large
- * or when the present bit is not set. Otherwise we would return a
- * pointer to a nonexisting mapping.
- */
-pte_t *lookup_address(unsigned long address, unsigned int *level)
+static pte_t *__lookup_address_in_pgd(pgd_t *pgd, unsigned long address,
+				      unsigned int *level)
 {
-	pgd_t *pgd = pgd_offset_k(address);
 	pud_t *pud;
 	pmd_t *pmd;
 
@@ -361,8 +354,31 @@ pte_t *lookup_address(unsigned long address, unsigned int *level)
 
 	return pte_offset_kernel(pmd, address);
 }
+
+/*
+ * Lookup the page table entry for a virtual address. Return a pointer
+ * to the entry and the level of the mapping.
+ *
+ * Note: We return pud and pmd either when the entry is marked large
+ * or when the present bit is not set. Otherwise we would return a
+ * pointer to a nonexisting mapping.
+ */
+pte_t *lookup_address(unsigned long address, unsigned int *level)
+{
+	return __lookup_address_in_pgd(pgd_offset_k(address), address, level);
+}
 EXPORT_SYMBOL_GPL(lookup_address);
 
+static pte_t *_lookup_address_cpa(struct cpa_data *cpa, unsigned long address,
+				  unsigned int *level)
+{
+	if (cpa->pgd)
+		return __lookup_address_in_pgd(cpa->pgd + pgd_index(address),
+					       address, level);
+
+	return lookup_address(address, level);
+}
+
 /*
  * This is necessary because __pa() does not work on some
  * kinds of memory, like vmalloc() or the alloc_remap()
-- 
1.8.4
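
As a companion to the cpa->pgd + pgd_index(address) expression in the patch above, this is a small userspace sketch of how a virtual address decomposes into the four table indices that the lookup walks. It assumes x86-64 4-level paging with 4K pages (9 index bits per level, shifts 39/30/21/12). struct pt_index and pt_indices() are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define PMD_SHIFT	21
#define PUD_SHIFT	30
#define PGDIR_SHIFT	39
#define PTRS_PER_TABLE	512	/* 9 bits of index per level */

/* Hypothetical container for the per-level indices of one address. */
struct pt_index {
	unsigned int pgd, pud, pmd, pte;
};

static struct pt_index pt_indices(uint64_t address)
{
	struct pt_index i;
	uint64_t top = address >> 47;

	/* 48-bit canonical addresses only: bits 63..47 must all be equal. */
	assert(top == 0 || top == 0x1ffff);

	i.pgd = (address >> PGDIR_SHIFT) & (PTRS_PER_TABLE - 1);
	i.pud = (address >> PUD_SHIFT)   & (PTRS_PER_TABLE - 1);
	i.pmd = (address >> PMD_SHIFT)   & (PTRS_PER_TABLE - 1);
	i.pte = (address >> PAGE_SHIFT)  & (PTRS_PER_TABLE - 1);

	return i;
}
```

With these assumptions, -4G (0xffffffff00000000) lands in the last PGD slot (511) at PUD index 508, so the same top-level index works for any PGD page passed in, not only init_mm's.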



[PATCH 10/11] EFI: Runtime services virtual mapping

2013-10-31 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

We map the EFI regions needed for runtime services non-contiguously,
with preserved alignment on virtual addresses starting from -4G down
for a total max space of 64G. This way, we provide for stable runtime
services addresses across kernels so that a kexec'd kernel can still use
them.

Thus, they're mapped in a separate pagetable so that we don't pollute
the kernel namespace.

Add an efi= kernel command line parameter for passing miscellaneous
options and chicken bits from the command line.

While at it, add a chicken bit called efi=old_map which can be used as
a fallback to the old runtime services mapping method in case there's
some b0rkage with a particular EFI implementation (haha, it is hard to
hold up the sarcasm here...).

Also, add the UEFI RT VA space to Documentation/x86/x86_64/mm.txt.
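
To make the "-4G down for a total max space of 64G" window concrete, here is a tiny arithmetic sketch. The helper names efi_va_start() and efi_va_end() are hypothetical and are not defined by the patch. The only inputs are the two figures from the changelog above, interpreted as unsigned 64-bit addresses.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)

/* Hypothetical helper: "-4G" as an unsigned 64-bit virtual address. */
static uint64_t efi_va_start(void)
{
	return (uint64_t)0 - 4 * GiB;
}

/* Hypothetical helper: lowest address of a window growing down from -4G. */
static uint64_t efi_va_end(uint64_t window)
{
	assert(window % GiB == 0 && window <= efi_va_start());
	return efi_va_start() - window;
}
```

Under these assumptions efi_va_start() is 0xffffffff00000000 and efi_va_end(64 * GiB) is 0xffffffef00000000.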

Signed-off-by: Borislav Petkov b...@suse.de
---
 Documentation/kernel-parameters.txt  |   6 ++
 Documentation/x86/x86_64/mm.txt  |   7 +++
 arch/x86/include/asm/efi.h   |  64 ++--
 arch/x86/include/asm/pgtable_types.h |   3 +-
 arch/x86/platform/efi/efi.c  |  94 +-
 arch/x86/platform/efi/efi_32.c   |   9 ++-
 arch/x86/platform/efi/efi_64.c   | 109 +++
 arch/x86/platform/efi/efi_stub_64.S  |  54 +
 include/linux/efi.h  |   1 +
 9 files changed, 300 insertions(+), 47 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 7f9d4f53882c..57603f2d7e86 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -833,6 +833,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
edd=[EDD]
Format: {off | on | skip[mbr]}
 
+   efi=[EFI]
+   Format: { old_map }
+   old_map [X86-64]: switch to the old ioremap-based EFI
+   runtime services mapping. 32-bit still uses this one by
+   default.
+
efi_no_storage_paranoia [EFI; X86]
Using this parameter you can use more than 50% of
your efi variable storage. Use this parameter only if
diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index 881582f75c9c..c584a51add15 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -28,4 +28,11 @@ reference.
 Current X86-64 implementations only support 40 bits of address space,
 but we support up to 46 bits. This expands into MBZ space in the page tables.
 
+-trampoline_pgd:
+
+We map EFI runtime services in the aforementioned PGD in the virtual
+range of 64Gb (arbitrarily set, can be raised if needed)
+
+0xffef - 0x
+
 -Andi Kleen, Jul 2004
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 0062a0125041..ee86179f8ca7 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -1,6 +1,24 @@
 #ifndef _ASM_X86_EFI_H
 #define _ASM_X86_EFI_H
 
+/*
+ * We map the EFI regions needed for runtime services non-contiguously,
+ * with preserved alignment on virtual addresses starting from -4G down
+ * for a total max space of 64G. This way, we provide for stable runtime
+ * services addresses across kernels so that a kexec'd kernel can still
+ * use them.
+ *
+ * This is the main reason why we're doing stable VA mappings for RT
+ * services.
+ *
+ * This flag is used in conjunction with a chicken bit called
+ * efi=old_map which can be used as a fallback to the old runtime
+ * services mapping method in case there's some b0rkage with a
+ * particular EFI implementation (haha, it is hard to hold up the
+ * sarcasm here...).
+ */
+#define EFI_OLD_MEMMAP EFI_ARCH_1
+
 #ifdef CONFIG_X86_32
 
 #define EFI_LOADER_SIGNATURE	"EL32"
@@ -69,24 +87,31 @@ extern u64 efi_call6(void *fp, u64 arg1, u64 arg2, u64 arg3,
efi_call6((f), (u64)(a1), (u64)(a2), (u64)(a3), \
  (u64)(a4), (u64)(a5), (u64)(a6))
 
+#define _efi_call_virtX(x, f, ...)					\
+({									\
+	efi_status_t __s;						\
+									\
+	efi_sync_low_kernel_mappings();					\
+	preempt_disable();						\
+	__s = efi_call##x((void *)efi.systab->runtime->f, __VA_ARGS__);	\
+	preempt_enable();						\
+	__s;								\
+})
+
 #define efi_call_virt0(f)				\
-	efi_call0((efi.systab->runtime->f))
-#define efi_call_virt1(f, a1

[PATCH 00/11] EFI runtime services virtual mapping, vN++

2013-10-31 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Hi all,

here's maybe the final version of the patchset, no major changes since
the last time but cosmetic and cleanups and tidying as requested by Matt
and others.

Patches also at:

git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git#efi

The previous announcement:

here's finally a new version of the runtime services VA mapping patchset
which hopefully implements hpa's idea of statically mapping EFI runtime
regions in a top-down manner starting at -4Gb virtual.

We're also using a different pagetable so as not to pollute kernel
address space. For that, we switch to that table before doing an EFI
call, and afterwards we switch back to the previous one.
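
A rough sketch of what top-down placement can look like, for illustration only. place_top_down() is a hypothetical helper and not the code from the series. It assumes the alignment to preserve is the region's offset within a 2M page (PMD_SIZE), which is one reading of "preserved alignment" in the patch 10 changelog. A cursor starts at -4G and moves down with every region placed.

```c
#include <assert.h>
#include <stdint.h>

#define PMD_SIZE (UINT64_C(1) << 21)
#define PMD_MASK (~(PMD_SIZE - 1))

/*
 * Hypothetical helper: carve 'size' bytes below *cursor so that the
 * returned VA has the same offset within a 2M page as 'pa'. The cursor
 * is moved down to the start of the new region.
 */
static uint64_t place_top_down(uint64_t *cursor, uint64_t pa, uint64_t size)
{
	uint64_t offset = pa & (PMD_SIZE - 1);
	uint64_t limit, va;

	assert(size > 0 && size <= *cursor);

	limit = *cursor - size;			/* highest possible start */
	va = (limit & PMD_MASK) + offset;	/* same 2M offset as pa */
	if (va > limit)				/* would overlap what is above */
		va -= PMD_SIZE;

	*cursor = va;
	return va;
}
```

Because the placement depends only on the starting cursor and the order and geometry of the regions, a second kernel that replays the same inputs arrives at the same virtual addresses, which is the property kexec needs.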

To the patches:

1-2 are simple cleanups which Matt probably can take now

3-10 add the machinery to map regions into an arbitrary PGD. Those I've
split deliberately into very small bites so that they can be reviewed
more thoroughly and easily for my pagetable skills are pretty basic.

11 is the actual patch which implements that mapping so that we can use
runtime services in kexec (which is the whole reason for this fuss :))

So please take a long hard look at those, hammer on them on your
boxes and let me know. They boot fine on my Dell UEFI box and in OVMF
(obviously :)).

Borislav Petkov (11):
  efi: Simplify EFI_DEBUG
  x86, pageattr: Lookup address in an arbitrary PGD
  x86, pageattr: Add a PGD pagetable populating function
  x86, pageattr: Add a PUD pagetable populating function
  x86, pageattr: Add a PMD pagetable populating function
  x86, pageattr: Add a PTE pagetable populating function
  x86, pageattr: Add a PUD error unwinding path
  x86, pageattr: Add last levels of error path
  x86, cpa: Map in an arbitrary pgd
  EFI: Runtime services virtual mapping
  efi: Check krealloc return value

 Documentation/kernel-parameters.txt  |   6 +
 Documentation/x86/x86_64/mm.txt  |   7 +
 arch/x86/include/asm/efi.h   |  64 +++--
 arch/x86/include/asm/pgtable_types.h |   3 +-
 arch/x86/mm/pageattr.c   | 461 +--
 arch/x86/platform/efi/efi.c  | 111 ++---
 arch/x86/platform/efi/efi_32.c   |   9 +-
 arch/x86/platform/efi/efi_64.c   | 109 +
 arch/x86/platform/efi/efi_stub_64.S  |  54 
 include/linux/efi.h  |   1 +
 10 files changed, 755 insertions(+), 70 deletions(-)

-- 
1.8.4



[PATCH 06/11] x86, pageattr: Add a PTE pagetable populating function

2013-10-31 Thread Borislav Petkov
From: Borislav Petkov b...@suse.de

Handle last level by unconditionally writing the PTEs into the PTE page
while paying attention to the NX bit.

Signed-off-by: Borislav Petkov b...@suse.de
---
 arch/x86/mm/pageattr.c | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 968398b023c0..2a1308a8c072 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -686,7 +686,27 @@ static int alloc_pmd_page(pud_t *pud)
 	return 0;
 }
 
-#define populate_pte(cpa, start, end, pages, pmd, pgprot)	do {} while (0)
+static void populate_pte(struct cpa_data *cpa,
+			 unsigned long start, unsigned long end,
+			 unsigned num_pages, pmd_t *pmd, pgprot_t pgprot)
+{
+	pte_t *pte;
+
+	pte = pte_offset_kernel(pmd, start);
+
+	while (num_pages-- && start < end) {
+
+		/* deal with the NX bit */
+		if (!(pgprot_val(pgprot) & _PAGE_NX))
+			cpa->pfn &= ~_PAGE_NX;
+
+		set_pte(pte, pfn_pte(cpa->pfn >> PAGE_SHIFT, pgprot));
+
+		start	 += PAGE_SIZE;
+		cpa->pfn += PAGE_SIZE;
+		pte++;
+	}
+}
 
 static int populate_pmd(struct cpa_data *cpa,
 			unsigned long start, unsigned long end,
-- 
1.8.4
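
For illustration, a simplified userspace model of the last-level loop in the patch above: it fills consecutive 64-bit entries of one PTE page with "frame address | protection bits". This is a sketch under stated assumptions, not the kernel code. fill_ptes() is a hypothetical name, the frame is tracked as a page frame number that is incremented by one per page, NX is taken to be bit 63, and the loop additionally stops at the end of the 512-entry table.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(UINT64_C(1) << PAGE_SHIFT)
#define PTRS_PER_PTE	512
#define PAGE_NX		(UINT64_C(1) << 63)	/* no-execute bit */

/*
 * Hypothetical helper: write up to num_pages entries covering
 * [start, end) into pte_page and return how many were written.
 */
static unsigned int fill_ptes(uint64_t *pte_page, uint64_t start, uint64_t end,
			      unsigned int num_pages, uint64_t pfn,
			      uint64_t prot)
{
	unsigned int idx = (start >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
	unsigned int done = 0;

	assert((start & (PAGE_SIZE - 1)) == 0);

	while (num_pages-- && start < end && idx < PTRS_PER_PTE) {
		/* entry = physical frame address plus protection bits */
		pte_page[idx++] = (pfn << PAGE_SHIFT) | prot;

		start += PAGE_SIZE;
		pfn++;
		done++;
	}

	return done;
}
```

Passing prot without PAGE_NX leaves the pages executable. Whether EFI runtime code regions should be mapped that way is a policy decision outside this sketch.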



Re: [patch 3/6] Cleanup efi_enter_virtual_mode function

2013-11-01 Thread Borislav Petkov
On Fri, Nov 01, 2013 at 09:18:25AM +0800, Dave Young wrote:
 Great, thank you. BTW, I have managed to test the original patch set on
 a friend's MacBook Air with USB boot, and it works OK.

Ok, that's actually very good news - the apples tend to be special wrt
their uefi implementation.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [patch 0/7 v2] kexec kernel efi runtime support

2013-11-05 Thread Borislav Petkov
On Tue, Nov 05, 2013 at 04:20:07PM +0800, dyo...@redhat.com wrote:
 Please help to review the patches.

Sure, but will have to wait 'til next week when I get back.

Thanks.

-- 
Regards/Gruss,
Boris.


Re: [patch 5/9 v3] efi: export more efi table variable to sysfs

2013-11-21 Thread Borislav Petkov
On Thu, Nov 21, 2013 at 02:17:09PM +0800, dyo...@redhat.com wrote:
 --- efi.orig/arch/x86/platform/efi/efi.c
 +++ efi/arch/x86/platform/efi/efi.c
 @@ -653,6 +653,10 @@ void __init efi_init(void)
  
   set_bit(EFI_SYSTEM_TABLES, &x86_efi_facility);
  
 + efi.fw_vendor = (unsigned long)efi.systab->fw_vendor;
 + efi.runtime = (unsigned long)efi.systab->runtime;
 + efi.config_table = (unsigned long)efi.systab->tables;

A bit more readable:

	efi.config_table = (unsigned long)efi.systab->tables;
	efi.fw_vendor    = (unsigned long)efi.systab->fw_vendor;
	efi.runtime      = (unsigned long)efi.systab->runtime;

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [RFC v2 2/2] x86, efi: Early use of boot service memory

2013-11-21 Thread Borislav Petkov
On Thu, Nov 21, 2013 at 02:01:26PM -0700, Jerry Hoemann wrote:
 Some platforms have firmware that violates the UEFI spec and accesses boot
 service code or data segments after the system has called ExitBootServices().
 The call to efi_reserve_boot_services is a workaround to avoid using
 boot service memory until after the kernel has done SetVirtualAddressMap().
 However, this reservation fragments memory which can cause
 large allocations early in boot (e.g. crash kernel) to fail.
 
 When reserve_crashkernel fails, kdump is disabled.
 
 This patch creates a quirk list that governs when the workaround,
 efi_reserve_boot_services, is called.
 
 For all firmware released prior to 2014, the workaround will be
 called unless an entry for the platform is in the quirk list saying
 not to do the workaround.
 
 For all firmware released 2014 and later,  the workaround will not
 be called unless an entry for the platform is in the quirk list
 saying to call the workaround.

This is yet another quirk list which can grow uncontrolled considering
the notoriety of firmware bugs. And since detecting such spec violation
is very simple - boot Linux on the machine - we should rather disable
this by default for FW >= 2014 and make this test part of the firmware
test suite so that vendors can get a chance to fix their BIOSen.

Provided vendors do boot fwts on their validation platforms, that is.
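
For reference, the decision rule described in the quoted changelog boils down to something like the sketch below. The enum and the function name are hypothetical and do not come from Jerry's patch. The only inputs are the firmware release year and whether a quirk entry exists for the platform.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical quirk states for a platform. */
enum bs_quirk {
	BS_QUIRK_NONE,		/* platform not in the quirk list */
	BS_QUIRK_FORCE_RESERVE,	/* entry says: do the workaround */
	BS_QUIRK_SKIP_RESERVE,	/* entry says: skip the workaround */
};

/* Should efi_reserve_boot_services() be called on this platform? */
static bool want_boot_services_reserve(int fw_year, enum bs_quirk quirk)
{
	assert(fw_year > 0);

	/* Pre-2014 firmware: reserve unless a quirk opts out. */
	if (fw_year < 2014)
		return quirk != BS_QUIRK_SKIP_RESERVE;

	/* 2014 and later: do not reserve unless a quirk opts in. */
	return quirk == BS_QUIRK_FORCE_RESERVE;
}
```

The alternative suggested in this reply would drop the opt-in branch for 2014 and later firmware and rely on the firmware test suite to catch violations instead.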

Yo Fleming, got a better idea? :)

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [patch 1/9 v3] efi: remove unused variables in __map_region

2013-11-21 Thread Borislav Petkov
On Thu, Nov 21, 2013 at 02:17:05PM +0800, dyo...@redhat.com wrote:
 variables size and end is useless in this function, thus remove them.
 
 Reported-by: Toshi Kani toshi.k...@hp.com
 Signed-off-by: Dave Young dyo...@redhat.com

Good catch.

Acked-by: Borislav Petkov b...@suse.de

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

