Re: [PATCH v2 0/5] mm: support parallel free of memory

2017-03-20 Thread Alex Thorlton
On Fri, Mar 17, 2017 at 10:21:58AM +0800, Aaron Lu wrote:
> On Thu, Mar 16, 2017 at 02:38:44PM -0500, Alex Thorlton wrote:
> > On Wed, Mar 15, 2017 at 04:59:59PM +0800, Aaron Lu wrote:
> > > v2 changes: Nothing major, only minor ones.
> > >  - rebased on top of v4.11-rc2-mmotm-2017-03-14-15-41;
> > >  - use list_add_tail instead of list_add to add worker to tlb's worker
> > >list so that when doing flush, the first queued worker gets flushed
> > >first(based on the comsumption that the first queued worker has a
> > >better chance of finishing its job than those later queued workers);
> > >  - use bool instead of int for variable free_batch_page in function
> > >tlb_flush_mmu_free_batches;
> > >  - style change according to ./scripts/checkpatch;
> > >  - reword some of the changelogs to make it more readable.
> > > 
> > > v1 is here:
> > > https://lkml.org/lkml/2017/2/24/245
> > 
> > I tested v1 on a Haswell system with 64 sockets/1024 cores/2048 threads
> > and 8TB of RAM, with a 1TB malloc.  The average free() time for a 1TB
> > malloc on a vanilla kernel was 41.69s, the patched kernel averaged
> > 21.56s for the same test.
> 
> Thanks a lot for the test result.
> 
> > 
> > I am testing v2 now and will report back with results in the next day or
> > so.
> 
> Testing plain v2 shouldn't bring any surprise/difference

You're right!  Not much difference here.  v2 averaged a 23.17s free
time for a 1T allocation.

> better set the
> following param before the test(I'm planning to make them default in the
> next version):
> # echo 64 > /sys/devices/virtual/workqueue/batch_free_wq/max_active
> # echo 1030 > /sys/kernel/debug/parallel_free/max_gather_batch_count

10 test runs with these params set averaged 22.22s to free 1T.

So, we're still seeing a nearly 50% decrease in free time vs. the
unpatched kernel.

- Alex


Re: [PATCH v2 0/5] mm: support parallel free of memory

2017-03-20 Thread Alex Thorlton
On Fri, Mar 17, 2017 at 10:21:58AM +0800, Aaron Lu wrote:
> On Thu, Mar 16, 2017 at 02:38:44PM -0500, Alex Thorlton wrote:
> > On Wed, Mar 15, 2017 at 04:59:59PM +0800, Aaron Lu wrote:
> > > v2 changes: Nothing major, only minor ones.
> > >  - rebased on top of v4.11-rc2-mmotm-2017-03-14-15-41;
> > >  - use list_add_tail instead of list_add to add worker to tlb's worker
> > >list so that when doing flush, the first queued worker gets flushed
> > >first(based on the comsumption that the first queued worker has a
> > >better chance of finishing its job than those later queued workers);
> > >  - use bool instead of int for variable free_batch_page in function
> > >tlb_flush_mmu_free_batches;
> > >  - style change according to ./scripts/checkpatch;
> > >  - reword some of the changelogs to make it more readable.
> > > 
> > > v1 is here:
> > > https://lkml.org/lkml/2017/2/24/245
> > 
> > I tested v1 on a Haswell system with 64 sockets/1024 cores/2048 threads
> > and 8TB of RAM, with a 1TB malloc.  The average free() time for a 1TB
> > malloc on a vanilla kernel was 41.69s, the patched kernel averaged
> > 21.56s for the same test.
> 
> Thanks a lot for the test result.
> 
> > 
> > I am testing v2 now and will report back with results in the next day or
> > so.
> 
> Testing plain v2 shouldn't bring any surprise/difference

You're right!  Not much difference here.  v2 averaged a 23.17s free
time for a 1T allocation.

> better set the
> following param before the test(I'm planning to make them default in the
> next version):
> # echo 64 > /sys/devices/virtual/workqueue/batch_free_wq/max_active
> # echo 1030 > /sys/kernel/debug/parallel_free/max_gather_batch_count

10 test runs with these params set averaged 22.22s to free 1T.

So, we're still seeing a nearly 50% decrease in free time vs. the
unpatched kernel.

- Alex


Re: [PATCH v2 0/5] mm: support parallel free of memory

2017-03-16 Thread Alex Thorlton
On Wed, Mar 15, 2017 at 04:59:59PM +0800, Aaron Lu wrote:
> v2 changes: Nothing major, only minor ones.
>  - rebased on top of v4.11-rc2-mmotm-2017-03-14-15-41;
>  - use list_add_tail instead of list_add to add worker to tlb's worker
>list so that when doing flush, the first queued worker gets flushed
>first(based on the comsumption that the first queued worker has a
>better chance of finishing its job than those later queued workers);
>  - use bool instead of int for variable free_batch_page in function
>tlb_flush_mmu_free_batches;
>  - style change according to ./scripts/checkpatch;
>  - reword some of the changelogs to make it more readable.
> 
> v1 is here:
> https://lkml.org/lkml/2017/2/24/245

I tested v1 on a Haswell system with 64 sockets/1024 cores/2048 threads
and 8TB of RAM, with a 1TB malloc.  The average free() time for a 1TB
malloc on a vanilla kernel was 41.69s, the patched kernel averaged
21.56s for the same test.

I am testing v2 now and will report back with results in the next day or
so.

- Alex


Re: [PATCH v2 0/5] mm: support parallel free of memory

2017-03-16 Thread Alex Thorlton
On Wed, Mar 15, 2017 at 04:59:59PM +0800, Aaron Lu wrote:
> v2 changes: Nothing major, only minor ones.
>  - rebased on top of v4.11-rc2-mmotm-2017-03-14-15-41;
>  - use list_add_tail instead of list_add to add worker to tlb's worker
>list so that when doing flush, the first queued worker gets flushed
>first(based on the comsumption that the first queued worker has a
>better chance of finishing its job than those later queued workers);
>  - use bool instead of int for variable free_batch_page in function
>tlb_flush_mmu_free_batches;
>  - style change according to ./scripts/checkpatch;
>  - reword some of the changelogs to make it more readable.
> 
> v1 is here:
> https://lkml.org/lkml/2017/2/24/245

I tested v1 on a Haswell system with 64 sockets/1024 cores/2048 threads
and 8TB of RAM, with a 1TB malloc.  The average free() time for a 1TB
malloc on a vanilla kernel was 41.69s, the patched kernel averaged
21.56s for the same test.

I am testing v2 now and will report back with results in the next day or
so.

- Alex


[PATCH 2/2] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-12-05 Thread Alex Thorlton
On systems with sufficiently large e820 tables, and several IOAPICs, it
is possible for the XENMEM_machine_memory_map callback (and its
counterpart, XENMEM_memory_map) to attempt to return an e820 table with
more than 128 entries.  This callback adds entries to the BIOS-provided
e820 table to account for IOAPIC registers, which, on sufficiently large
systems, can result in an e820 table that is too large to copy back into
xen_e820_map.

This change simply increases the size of xen_e820_map to E820_X_MAX to
ensure that there is enough room to store the entire e820 map returned
from this callback.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Suggested-by: Boris Ostrovsky <boris.ostrov...@oracle.com>
Reviewed-by: Juergen Gross <jgr...@suse.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: David Vrabel <david.vra...@citrix.com>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Denys Vlasenko <dvlas...@redhat.com>
Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: x...@kernel.org
Cc: xen-de...@lists.xenproject.org
---
 arch/x86/xen/setup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index f8960fc..8c394e3 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -41,7 +41,7 @@
 unsigned long xen_released_pages;
 
 /* E820 map used during setting up memory. */
-static struct e820entry xen_e820_map[E820MAX] __initdata;
+static struct e820entry xen_e820_map[E820_X_MAX] __initdata;
 static u32 xen_e820_map_entries __initdata;
 
 /*
@@ -750,7 +750,7 @@ char * __init xen_memory_setup(void)
max_pfn = min(max_pfn, xen_start_info->nr_pages);
mem_end = PFN_PHYS(max_pfn);
 
-   memmap.nr_entries = E820MAX;
+   memmap.nr_entries = ARRAY_SIZE(xen_e820_map);
set_xen_guest_handle(memmap.buffer, xen_e820_map);
 
op = xen_initial_domain() ?
@@ -923,7 +923,7 @@ char * __init xen_auto_xlated_memory_setup(void)
int i;
int rc;
 
-   memmap.nr_entries = E820MAX;
+   memmap.nr_entries = ARRAY_SIZE(xen_e820_map);
set_xen_guest_handle(memmap.buffer, xen_e820_map);
 
rc = HYPERVISOR_memory_op(XENMEM_memory_map, );
-- 
1.8.5.6



[PATCH 2/2] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-12-05 Thread Alex Thorlton
On systems with sufficiently large e820 tables, and several IOAPICs, it
is possible for the XENMEM_machine_memory_map callback (and its
counterpart, XENMEM_memory_map) to attempt to return an e820 table with
more than 128 entries.  This callback adds entries to the BIOS-provided
e820 table to account for IOAPIC registers, which, on sufficiently large
systems, can result in an e820 table that is too large to copy back into
xen_e820_map.

This change simply increases the size of xen_e820_map to E820_X_MAX to
ensure that there is enough room to store the entire e820 map returned
from this callback.

Signed-off-by: Alex Thorlton 
Suggested-by: Boris Ostrovsky 
Reviewed-by: Juergen Gross 
Cc: Russ Anderson 
Cc: David Vrabel 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: "H. Peter Anvin" 
Cc: Denys Vlasenko 
Cc: Boris Ostrovsky 
Cc: x...@kernel.org
Cc: xen-de...@lists.xenproject.org
---
 arch/x86/xen/setup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index f8960fc..8c394e3 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -41,7 +41,7 @@
 unsigned long xen_released_pages;
 
 /* E820 map used during setting up memory. */
-static struct e820entry xen_e820_map[E820MAX] __initdata;
+static struct e820entry xen_e820_map[E820_X_MAX] __initdata;
 static u32 xen_e820_map_entries __initdata;
 
 /*
@@ -750,7 +750,7 @@ char * __init xen_memory_setup(void)
max_pfn = min(max_pfn, xen_start_info->nr_pages);
mem_end = PFN_PHYS(max_pfn);
 
-   memmap.nr_entries = E820MAX;
+   memmap.nr_entries = ARRAY_SIZE(xen_e820_map);
set_xen_guest_handle(memmap.buffer, xen_e820_map);
 
op = xen_initial_domain() ?
@@ -923,7 +923,7 @@ char * __init xen_auto_xlated_memory_setup(void)
int i;
int rc;
 
-   memmap.nr_entries = E820MAX;
+   memmap.nr_entries = ARRAY_SIZE(xen_e820_map);
set_xen_guest_handle(memmap.buffer, xen_e820_map);
 
rc = HYPERVISOR_memory_op(XENMEM_memory_map, );
-- 
1.8.5.6



[RFC PATCH v3] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-12-05 Thread Alex Thorlton
This is the third pass at my patchset to fix up our problems with
XENMEM_machine_memory_map on large systems.  The only changes on this
pass were to flesh out the comment above the E820_X_MAX definition, and
to add Juergen's Reviewed-by to the second patch.

Let me know if anyone has any questions/comments!

Alex Thorlton (2):
  x86: Make E820_X_MAX unconditionally larger than E820MAX
  xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

 arch/x86/include/asm/e820.h | 12 
 arch/x86/xen/setup.c|  6 +++---
 2 files changed, 11 insertions(+), 7 deletions(-)

-- 
1.8.5.6



[PATCH 1/2] x86: Make E820_X_MAX unconditionally larger than E820MAX

2016-12-05 Thread Alex Thorlton
It's really not necessary to limit E820_X_MAX to 128 in the non-EFI
case.  This commit drops E820_X_MAX's dependency on CONFIG_EFI, so that
E820_X_MAX is always at least slightly larger than E820MAX.

The real motivation behind this is actually to prevent some issues in
the Xen kernel, where the XENMEM_machine_memory_map hypercall can
produce an e820 map larger than 128 entries, even on systems where the
original e820 table was quite a bit smaller than that, depending on how
many IOAPICs are installed on the system.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Suggested-by: Ingo Molnar <mi...@redhat.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: David Vrabel <david.vra...@citrix.com>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Denys Vlasenko <dvlas...@redhat.com>
Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: x...@kernel.org
Cc: xen-de...@lists.xenproject.org
---
 arch/x86/include/asm/e820.h | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
index 476b574..ec23d8e 100644
--- a/arch/x86/include/asm/e820.h
+++ b/arch/x86/include/asm/e820.h
@@ -1,13 +1,17 @@
 #ifndef _ASM_X86_E820_H
 #define _ASM_X86_E820_H
 
-#ifdef CONFIG_EFI
+/*
+ * E820_X_MAX is the maximum size of the extended E820 table.  The extended
+ * table may contain up to 3 extra E820 entries per possible NUMA node, so we
+ * make room for 3 * MAX_NUMNODES possible entries, beyond the standard 128.
+ * Also note that E820_X_MAX *must* be defined before we include 
uapi/asm/e820.h.
+ */
 #include 
 #define E820_X_MAX (E820MAX + 3 * MAX_NUMNODES)
-#else  /* ! CONFIG_EFI */
-#define E820_X_MAX E820MAX
-#endif
+
 #include 
+
 #ifndef __ASSEMBLY__
 /* see comment in arch/x86/kernel/e820.c */
 extern struct e820map *e820;
-- 
1.8.5.6



[RFC PATCH v3] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-12-05 Thread Alex Thorlton
This is the third pass at my patchset to fix up our problems with
XENMEM_machine_memory_map on large systems.  The only changes on this
pass were to flesh out the comment above the E820_X_MAX definition, and
to add Juergen's Reviewed-by to the second patch.

Let me know if anyone has any questions/comments!

Alex Thorlton (2):
  x86: Make E820_X_MAX unconditionally larger than E820MAX
  xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

 arch/x86/include/asm/e820.h | 12 
 arch/x86/xen/setup.c|  6 +++---
 2 files changed, 11 insertions(+), 7 deletions(-)

-- 
1.8.5.6



[PATCH 1/2] x86: Make E820_X_MAX unconditionally larger than E820MAX

2016-12-05 Thread Alex Thorlton
It's really not necessary to limit E820_X_MAX to 128 in the non-EFI
case.  This commit drops E820_X_MAX's dependency on CONFIG_EFI, so that
E820_X_MAX is always at least slightly larger than E820MAX.

The real motivation behind this is actually to prevent some issues in
the Xen kernel, where the XENMEM_machine_memory_map hypercall can
produce an e820 map larger than 128 entries, even on systems where the
original e820 table was quite a bit smaller than that, depending on how
many IOAPICs are installed on the system.

Signed-off-by: Alex Thorlton 
Suggested-by: Ingo Molnar 
Cc: Russ Anderson 
Cc: David Vrabel 
Cc: Ingo Molnar 
Cc: Juergen Gross 
Cc: Thomas Gleixner 
Cc: "H. Peter Anvin" 
Cc: Denys Vlasenko 
Cc: Boris Ostrovsky 
Cc: x...@kernel.org
Cc: xen-de...@lists.xenproject.org
---
 arch/x86/include/asm/e820.h | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
index 476b574..ec23d8e 100644
--- a/arch/x86/include/asm/e820.h
+++ b/arch/x86/include/asm/e820.h
@@ -1,13 +1,17 @@
 #ifndef _ASM_X86_E820_H
 #define _ASM_X86_E820_H
 
-#ifdef CONFIG_EFI
+/*
+ * E820_X_MAX is the maximum size of the extended E820 table.  The extended
+ * table may contain up to 3 extra E820 entries per possible NUMA node, so we
+ * make room for 3 * MAX_NUMNODES possible entries, beyond the standard 128.
+ * Also note that E820_X_MAX *must* be defined before we include 
uapi/asm/e820.h.
+ */
 #include 
 #define E820_X_MAX (E820MAX + 3 * MAX_NUMNODES)
-#else  /* ! CONFIG_EFI */
-#define E820_X_MAX E820MAX
-#endif
+
 #include 
+
 #ifndef __ASSEMBLY__
 /* see comment in arch/x86/kernel/e820.c */
 extern struct e820map *e820;
-- 
1.8.5.6



Re: [PATCH 1/2] x86: Make E820_X_MAX unconditionally larger than E820MAX

2016-12-01 Thread Alex Thorlton
On Wed, Nov 30, 2016 at 07:21:48AM +0100, Ingo Molnar wrote:
> 
> * Alex Thorlton <athorl...@sgi.com> wrote:
> 
> > It's really not necessary to limit E820_X_MAX to 128 in the non-EFI
> > case.  This commit drops E820_X_MAX's dependency on CONFIG_EFI, so that
> > E820_X_MAX is always at least slightly larger than E820MAX.
> > 
> > The real motivation behind this is actually to prevent some issues in
> > the Xen kernel, where the XENMEM_machine_memory_map hypercall can
> > produce an e820 map larger than 128 entries, even on systems where the
> > original e820 table was quite a bit smaller than that, depending on how
> > many IOAPICs are installed on the system.
> > 
> > Signed-off-by: Alex Thorlton <athorl...@sgi.com>
> > Suggested-by: Ingo Molnar <mi...@redhat.com>
> > Cc: Russ Anderson <r...@sgi.com>
> > Cc: David Vrabel <david.vra...@citrix.com>
> > Cc: Ingo Molnar <mi...@redhat.com>
> > Cc: Juergen Gross <jgr...@suse.com>
> > Cc: Thomas Gleixner <t...@linutronix.de>
> > Cc: "H. Peter Anvin" <h...@zytor.com>
> > Cc: Denys Vlasenko <dvlas...@redhat.com>
> > Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
> > Cc: x...@kernel.org
> > Cc: xen-de...@lists.xenproject.org
> > ---
> >  arch/x86/include/asm/e820.h | 10 ++
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
> > index 476b574..aa00d33 100644
> > --- a/arch/x86/include/asm/e820.h
> > +++ b/arch/x86/include/asm/e820.h
> > @@ -1,13 +1,15 @@
> >  #ifndef _ASM_X86_E820_H
> >  #define _ASM_X86_E820_H
> >  
> > -#ifdef CONFIG_EFI
> > +/*
> > + * We need to make sure that E820_X_MAX is defined
> > + * before we include uapi/asm/e820.h
> > + */
> >  #include 
> >  #define E820_X_MAX (E820MAX + 3 * MAX_NUMNODES)
> 
> What we need an explanation for in the comment is what does this stand for 
> (what 
> does the 'X' mean?), and what is the magic 3*MAX_NUMNODES about?

I'm not actually certain why 3*MAX_NUMNODES was chosen as the max size
for this table, but it was written by somebody here at SGI, so I'm sure
I can find out :)

As for the 'X,' I'm assuming that's meant to imply that this is the
maxmimum size for the 'eXtended' table, but that could be made more
clear in the comment as well.

I'll come up with a better comment for this and submit a v3.

Thanks, Ingo!


Re: [PATCH 1/2] x86: Make E820_X_MAX unconditionally larger than E820MAX

2016-12-01 Thread Alex Thorlton
On Wed, Nov 30, 2016 at 07:21:48AM +0100, Ingo Molnar wrote:
> 
> * Alex Thorlton  wrote:
> 
> > It's really not necessary to limit E820_X_MAX to 128 in the non-EFI
> > case.  This commit drops E820_X_MAX's dependency on CONFIG_EFI, so that
> > E820_X_MAX is always at least slightly larger than E820MAX.
> > 
> > The real motivation behind this is actually to prevent some issues in
> > the Xen kernel, where the XENMEM_machine_memory_map hypercall can
> > produce an e820 map larger than 128 entries, even on systems where the
> > original e820 table was quite a bit smaller than that, depending on how
> > many IOAPICs are installed on the system.
> > 
> > Signed-off-by: Alex Thorlton 
> > Suggested-by: Ingo Molnar 
> > Cc: Russ Anderson 
> > Cc: David Vrabel 
> > Cc: Ingo Molnar 
> > Cc: Juergen Gross 
> > Cc: Thomas Gleixner 
> > Cc: "H. Peter Anvin" 
> > Cc: Denys Vlasenko 
> > Cc: Boris Ostrovsky 
> > Cc: x...@kernel.org
> > Cc: xen-de...@lists.xenproject.org
> > ---
> >  arch/x86/include/asm/e820.h | 10 ++
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
> > index 476b574..aa00d33 100644
> > --- a/arch/x86/include/asm/e820.h
> > +++ b/arch/x86/include/asm/e820.h
> > @@ -1,13 +1,15 @@
> >  #ifndef _ASM_X86_E820_H
> >  #define _ASM_X86_E820_H
> >  
> > -#ifdef CONFIG_EFI
> > +/*
> > + * We need to make sure that E820_X_MAX is defined
> > + * before we include uapi/asm/e820.h
> > + */
> >  #include 
> >  #define E820_X_MAX (E820MAX + 3 * MAX_NUMNODES)
> 
> What we need an explanation for in the comment is what does this stand for 
> (what 
> does the 'X' mean?), and what is the magic 3*MAX_NUMNODES about?

I'm not actually certain why 3*MAX_NUMNODES was chosen as the max size
for this table, but it was written by somebody here at SGI, so I'm sure
I can find out :)

As for the 'X,' I'm assuming that's meant to imply that this is the
maxmimum size for the 'eXtended' table, but that could be made more
clear in the comment as well.

I'll come up with a better comment for this and submit a v3.

Thanks, Ingo!


[PATCH 1/2] x86: Make E820_X_MAX unconditionally larger than E820MAX

2016-11-29 Thread Alex Thorlton
It's really not necessary to limit E820_X_MAX to 128 in the non-EFI
case.  This commit drops E820_X_MAX's dependency on CONFIG_EFI, so that
E820_X_MAX is always at least slightly larger than E820MAX.

The real motivation behind this is actually to prevent some issues in
the Xen kernel, where the XENMEM_machine_memory_map hypercall can
produce an e820 map larger than 128 entries, even on systems where the
original e820 table was quite a bit smaller than that, depending on how
many IOAPICs are installed on the system.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Suggested-by: Ingo Molnar <mi...@redhat.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: David Vrabel <david.vra...@citrix.com>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Denys Vlasenko <dvlas...@redhat.com>
Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: x...@kernel.org
Cc: xen-de...@lists.xenproject.org
---
 arch/x86/include/asm/e820.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
index 476b574..aa00d33 100644
--- a/arch/x86/include/asm/e820.h
+++ b/arch/x86/include/asm/e820.h
@@ -1,13 +1,15 @@
 #ifndef _ASM_X86_E820_H
 #define _ASM_X86_E820_H
 
-#ifdef CONFIG_EFI
+/*
+ * We need to make sure that E820_X_MAX is defined
+ * before we include uapi/asm/e820.h
+ */
 #include 
 #define E820_X_MAX (E820MAX + 3 * MAX_NUMNODES)
-#else  /* ! CONFIG_EFI */
-#define E820_X_MAX E820MAX
-#endif
+
 #include 
+
 #ifndef __ASSEMBLY__
 /* see comment in arch/x86/kernel/e820.c */
 extern struct e820map *e820;
-- 
1.8.5.6



[PATCH 1/2] x86: Make E820_X_MAX unconditionally larger than E820MAX

2016-11-29 Thread Alex Thorlton
It's really not necessary to limit E820_X_MAX to 128 in the non-EFI
case.  This commit drops E820_X_MAX's dependency on CONFIG_EFI, so that
E820_X_MAX is always at least slightly larger than E820MAX.

The real motivation behind this is actually to prevent some issues in
the Xen kernel, where the XENMEM_machine_memory_map hypercall can
produce an e820 map larger than 128 entries, even on systems where the
original e820 table was quite a bit smaller than that, depending on how
many IOAPICs are installed on the system.

Signed-off-by: Alex Thorlton 
Suggested-by: Ingo Molnar 
Cc: Russ Anderson 
Cc: David Vrabel 
Cc: Ingo Molnar 
Cc: Juergen Gross 
Cc: Thomas Gleixner 
Cc: "H. Peter Anvin" 
Cc: Denys Vlasenko 
Cc: Boris Ostrovsky 
Cc: x...@kernel.org
Cc: xen-de...@lists.xenproject.org
---
 arch/x86/include/asm/e820.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h
index 476b574..aa00d33 100644
--- a/arch/x86/include/asm/e820.h
+++ b/arch/x86/include/asm/e820.h
@@ -1,13 +1,15 @@
 #ifndef _ASM_X86_E820_H
 #define _ASM_X86_E820_H
 
-#ifdef CONFIG_EFI
+/*
+ * We need to make sure that E820_X_MAX is defined
+ * before we include uapi/asm/e820.h
+ */
 #include 
 #define E820_X_MAX (E820MAX + 3 * MAX_NUMNODES)
-#else  /* ! CONFIG_EFI */
-#define E820_X_MAX E820MAX
-#endif
+
 #include 
+
 #ifndef __ASSEMBLY__
 /* see comment in arch/x86/kernel/e820.c */
 extern struct e820map *e820;
-- 
1.8.5.6



[PATCH 2/2] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-11-29 Thread Alex Thorlton
On systems with sufficiently large e820 tables, and several IOAPICs, it
is possible for the XENMEM_machine_memory_map callback (and its
counterpart, XENMEM_memory_map) to attempt to return an e820 table with
more than 128 entries.  This callback adds entries to the BIOS-provided
e820 table to account for IOAPIC registers, which, on sufficiently large
systems, can result in an e820 table that is too large to copy back into
xen_e820_map.

This change simply increases the size of xen_e820_map to E820_X_MAX to
ensure that there is enough room to store the entire e820 map returned
from this callback.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Suggested-by: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: David Vrabel <david.vra...@citrix.com>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Denys Vlasenko <dvlas...@redhat.com>
Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: x...@kernel.org
Cc: xen-de...@lists.xenproject.org
---
 arch/x86/xen/setup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index f8960fc..8c394e3 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -41,7 +41,7 @@
 unsigned long xen_released_pages;
 
 /* E820 map used during setting up memory. */
-static struct e820entry xen_e820_map[E820MAX] __initdata;
+static struct e820entry xen_e820_map[E820_X_MAX] __initdata;
 static u32 xen_e820_map_entries __initdata;
 
 /*
@@ -750,7 +750,7 @@ char * __init xen_memory_setup(void)
max_pfn = min(max_pfn, xen_start_info->nr_pages);
mem_end = PFN_PHYS(max_pfn);
 
-   memmap.nr_entries = E820MAX;
+   memmap.nr_entries = ARRAY_SIZE(xen_e820_map);
set_xen_guest_handle(memmap.buffer, xen_e820_map);
 
op = xen_initial_domain() ?
@@ -923,7 +923,7 @@ char * __init xen_auto_xlated_memory_setup(void)
int i;
int rc;
 
-   memmap.nr_entries = E820MAX;
+   memmap.nr_entries = ARRAY_SIZE(xen_e820_map);
set_xen_guest_handle(memmap.buffer, xen_e820_map);
 
rc = HYPERVISOR_memory_op(XENMEM_memory_map, );
-- 
1.8.5.6



[PATCH 2/2] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-11-29 Thread Alex Thorlton
On systems with sufficiently large e820 tables, and several IOAPICs, it
is possible for the XENMEM_machine_memory_map callback (and its
counterpart, XENMEM_memory_map) to attempt to return an e820 table with
more than 128 entries.  This callback adds entries to the BIOS-provided
e820 table to account for IOAPIC registers, which, on sufficiently large
systems, can result in an e820 table that is too large to copy back into
xen_e820_map.

This change simply increases the size of xen_e820_map to E820_X_MAX to
ensure that there is enough room to store the entire e820 map returned
from this callback.

Signed-off-by: Alex Thorlton 
Suggested-by: Boris Ostrovsky 
Cc: Russ Anderson 
Cc: David Vrabel 
Cc: Ingo Molnar 
Cc: Juergen Gross 
Cc: Thomas Gleixner 
Cc: "H. Peter Anvin" 
Cc: Denys Vlasenko 
Cc: Boris Ostrovsky 
Cc: x...@kernel.org
Cc: xen-de...@lists.xenproject.org
---
 arch/x86/xen/setup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index f8960fc..8c394e3 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -41,7 +41,7 @@
 unsigned long xen_released_pages;
 
 /* E820 map used during setting up memory. */
-static struct e820entry xen_e820_map[E820MAX] __initdata;
+static struct e820entry xen_e820_map[E820_X_MAX] __initdata;
 static u32 xen_e820_map_entries __initdata;
 
 /*
@@ -750,7 +750,7 @@ char * __init xen_memory_setup(void)
max_pfn = min(max_pfn, xen_start_info->nr_pages);
mem_end = PFN_PHYS(max_pfn);
 
-   memmap.nr_entries = E820MAX;
+   memmap.nr_entries = ARRAY_SIZE(xen_e820_map);
set_xen_guest_handle(memmap.buffer, xen_e820_map);
 
op = xen_initial_domain() ?
@@ -923,7 +923,7 @@ char * __init xen_auto_xlated_memory_setup(void)
int i;
int rc;
 
-   memmap.nr_entries = E820MAX;
+   memmap.nr_entries = ARRAY_SIZE(xen_e820_map);
set_xen_guest_handle(memmap.buffer, xen_e820_map);
 
rc = HYPERVISOR_memory_op(XENMEM_memory_map, );
-- 
1.8.5.6



[RFC PATCH v2] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-11-29 Thread Alex Thorlton
Here's the second round of my patches to fix up the problems that we're
seeing with the XENMEM_machine_memory_map hypercall.  These few simple
changes, to give xen_e820_map some extra space, get things working fine
on our large machine.

This version of the patchset adds a patch to remove the #ifdef
CONFIG_EFI conditional around the definition of E820_X_MAX, so that it's
always slightly larger than E820MAX.  I've also tweaked the code that
works on xen_e820_map to use ARRAY_SIZE(xen_e820_map) instead of using
E820_X_MAX directly, as suggested by Boris.

As always, I appreciate any input that others can give!

- Alex

Alex Thorlton (2):
  x86: Make E820_X_MAX unconditionally larger than E820MAX
  xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

 arch/x86/include/asm/e820.h | 8 +++-
 arch/x86/xen/setup.c| 6 +++---
 2 files changed, 6 insertions(+), 8 deletions(-)

-- 
1.8.5.6



[RFC PATCH v2] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-11-29 Thread Alex Thorlton
Here's the second round of my patches to fix up the problems that we're
seeing with the XENMEM_machine_memory_map hypercall.  These few simple
changes, to give xen_e820_map some extra space, get things working fine
on our large machine.

This version of the patchset adds a patch to remove the #ifdef
CONFIG_EFI conditional around the definition of E820_X_MAX, so that it's
always slightly larger than E820MAX.  I've also tweaked the code that
works on xen_e820_map to use ARRAY_SIZE(xen_e820_map) instead of using
E820_X_MAX directly, as suggested by Boris.

As always, I appreciate any input that others can give!

- Alex

Alex Thorlton (2):
  x86: Make E820_X_MAX unconditionally larger than E820MAX
  xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

 arch/x86/include/asm/e820.h | 8 +++-
 arch/x86/xen/setup.c| 6 +++---
 2 files changed, 6 insertions(+), 8 deletions(-)

-- 
1.8.5.6



Re: [tip:x86/urgent] x86/apic/uv: Silence a shift wrapping warning

2016-11-29 Thread Alex Thorlton
On Wed, Nov 23, 2016 at 10:25:48PM -0800, tip-bot for Dan Carpenter wrote:
> Commit-ID:  c4597fd756836a5fb7900f2091797ab564390ad0
> Gitweb: http://git.kernel.org/tip/c4597fd756836a5fb7900f2091797ab564390ad0
> Author: Dan Carpenter <dan.carpen...@oracle.com>
> AuthorDate: Thu, 24 Nov 2016 01:19:08 +0300
> Committer:  Ingo Molnar <mi...@kernel.org>
> CommitDate: Thu, 24 Nov 2016 06:01:05 +0100
> 
> x86/apic/uv: Silence a shift wrapping warning
> 
> 'm_io' is stored in 6 bits so it's a number in the 0-63 range.  Static
> analysis tools complain that 1 << 63 will wrap so I have changed it to
> 1ULL << m_io.
> 
> This code is over three years old so presumably the bug doesn't happen
> very frequently in real life or someone would have complained by now.
> 
> Signed-off-by: Dan Carpenter <dan.carpen...@oracle.com>
> Cc: Alex Thorlton <athorl...@sgi.com>

Acked-by: Alex Thorlton <athorl...@sgi.com>

- Alex


Re: [tip:x86/urgent] x86/apic/uv: Silence a shift wrapping warning

2016-11-29 Thread Alex Thorlton
On Wed, Nov 23, 2016 at 10:25:48PM -0800, tip-bot for Dan Carpenter wrote:
> Commit-ID:  c4597fd756836a5fb7900f2091797ab564390ad0
> Gitweb: http://git.kernel.org/tip/c4597fd756836a5fb7900f2091797ab564390ad0
> Author: Dan Carpenter 
> AuthorDate: Thu, 24 Nov 2016 01:19:08 +0300
> Committer:  Ingo Molnar 
> CommitDate: Thu, 24 Nov 2016 06:01:05 +0100
> 
> x86/apic/uv: Silence a shift wrapping warning
> 
> 'm_io' is stored in 6 bits so it's a number in the 0-63 range.  Static
> analysis tools complain that 1 << 63 will wrap so I have changed it to
> 1ULL << m_io.
> 
> This code is over three years old so presumably the bug doesn't happen
> very frequently in real life or someone would have complained by now.
> 
> Signed-off-by: Dan Carpenter 
> Cc: Alex Thorlton 

Acked-by: Alex Thorlton 

- Alex


Re: [RFC PATCH] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-11-15 Thread Alex Thorlton
On Tue, Nov 15, 2016 at 10:55:49AM +0100, Juergen Gross wrote:
> I'd go with the new error code. What about E2BIG or ENOSPC?
> 
> I think the hypervisor should fill in the number of entries required
> in this case.
> 
> In case nobody objects I can post patches for this purpose (both Xen
> and Linux).

This sounds like a good solution to me.  I think it's definitely more
appropriate than simply bumping up the size of xen_e820_map, especially
considering the fact that it's theoretically possible for the e820 map
generated by the hypercall to grow too large, even on a non-EFI machine,
where my change would have no effect.

Thanks for your input!

- Alex


Re: [RFC PATCH] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-11-15 Thread Alex Thorlton
On Tue, Nov 15, 2016 at 10:55:49AM +0100, Juergen Gross wrote:
> I'd go with the new error code. What about E2BIG or ENOSPC?
> 
> I think the hypervisor should fill in the number of entries required
> in this case.
> 
> In case nobody objects I can post patches for this purpose (both Xen
> and Linux).

This sounds like a good solution to me.  I think it's definitely more
appropriate than simply bumping up the size of xen_e820_map, especially
considering the fact that it's theoretically possible for the e820 map
generated by the hypercall to grow too large, even on a non-EFI machine,
where my change would have no effect.

Thanks for your input!

- Alex


[RFC PATCH] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-11-14 Thread Alex Thorlton
Hey everyone,

We're having problems with large systems hitting a BUG in
xen_memory_setup, due to extra e820 entries created in the
XENMEM_machine_memory_map callback.  The change in the patch gets things
working, but Boris and I wanted to get opinions on whether or not this
is the appropriate/entire solution, which is why I've sent it as an RFC
for now.

Boris pointed out to me that E820_X_MAX is only large when CONFIG_EFI=y,
which is a detail worth discussig.  He proposed possibly adding
CONFIG_XEN to the conditions under which we set E820_X_MAX to a larger
value than E820MAX, since the Xen e820 table isn't bound by the
zero-page memory limitations.

I do *slightly* question the use of E820_X_MAX here, only from a
cosmetic prospective, as I believe this macro is intended to describe
the maximum size of the extended e820 table, which, AFAIK, is not used
by the Xen HV.  That being said, there isn't exactly a "more
appropriate" macro/variable to use, so this may not really be an issue.

Any input on the patch, or the questions I've raised above is greatly
appreciated!

- Alex

Alex Thorlton (1):
  xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

 arch/x86/xen/setup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

-- 
1.8.5.6



[RFC PATCH] xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

2016-11-14 Thread Alex Thorlton
Hey everyone,

We're having problems with large systems hitting a BUG in
xen_memory_setup, due to extra e820 entries created in the
XENMEM_machine_memory_map callback.  The change in the patch gets things
working, but Boris and I wanted to get opinions on whether or not this
is the appropriate/entire solution, which is why I've sent it as an RFC
for now.

Boris pointed out to me that E820_X_MAX is only large when CONFIG_EFI=y,
which is a detail worth discussig.  He proposed possibly adding
CONFIG_XEN to the conditions under which we set E820_X_MAX to a larger
value than E820MAX, since the Xen e820 table isn't bound by the
zero-page memory limitations.

I do *slightly* question the use of E820_X_MAX here, only from a
cosmetic prospective, as I believe this macro is intended to describe
the maximum size of the extended e820 table, which, AFAIK, is not used
by the Xen HV.  That being said, there isn't exactly a "more
appropriate" macro/variable to use, so this may not really be an issue.

Any input on the patch, or the questions I've raised above is greatly
appreciated!

- Alex

Alex Thorlton (1):
  xen/x86: Increase xen_e820_map to E820_X_MAX possible entries

 arch/x86/xen/setup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

-- 
1.8.5.6



[tip:x86/urgent] x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates

2016-10-20 Thread tip-bot for Alex Thorlton
Commit-ID:  caef78b6cdeddf4ad364f95910bba6b43b8eb9bf
Gitweb: http://git.kernel.org/tip/caef78b6cdeddf4ad364f95910bba6b43b8eb9bf
Author: Alex Thorlton <athorl...@sgi.com>
AuthorDate: Wed, 19 Oct 2016 20:48:51 -0500
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Thu, 20 Oct 2016 08:47:58 +0200

x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates

Some time ago, we brought our UV BIOS callback code up to speed with the
new EFI memory mapping scheme, in commit:

d1be84a232e3 ("x86/uv: Update uv_bios_call() to use 
efi_call_virt_pointer()")

By leveraging some changes that I made to a few of the EFI runtime
callback mechanisms, in commit:

80e75596079f ("efi: Convert efi_call_virt() to efi_call_virt_pointer()")

This got everything running smoothly on UV, with the new EFI mapping
code.  However, this left one, small loose end, in that EFI_OLD_MEMMAP
(a.k.a. efi=old_map) will no longer work on UV, on kernels that include
the aforementioned changes.

At the time this was not a major issue (in fact, it still really isn't),
but there's no reason that EFI_OLD_MEMMAP *shouldn't* work on our
systems.  This commit adds a check into uv_bios_call(), to see if we have
the EFI_OLD_MEMMAP bit set in efi.flags.  If it is set, we fall back to
using our old callback method, which uses efi_call() directly on the __va()
of our function pointer.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Acked-by: Matt Fleming <m...@codeblueprint.co.uk>
Cc: <sta...@vger.kernel.org> # v4.7 and later
Cc: Andy Lutomirski <l...@kernel.org>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Denys Vlasenko <dvlas...@redhat.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Josh Poimboeuf <jpoim...@redhat.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masah...@socionext.com>
Cc: Mike Travis <tra...@sgi.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Russ Anderson <r...@sgi.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1476928131-170101-1-git-send-email-athorl...@sgi.com
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/platform/uv/bios_uv.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index b4d5e95..4a6a5a2 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -40,7 +40,15 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 */
return BIOS_STATUS_UNIMPLEMENTED;
 
-   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, a3, a4, 
a5);
+   /*
+* If EFI_OLD_MEMMAP is set, we need to fall back to using our old EFI
+* callback method, which uses efi_call() directly, with the kernel 
page tables:
+*/
+   if (unlikely(test_bit(EFI_OLD_MEMMAP, )))
+   ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
a3, a4, a5);
+   else
+   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
a3, a4, a5);
+
return ret;
 }
 EXPORT_SYMBOL_GPL(uv_bios_call);


[tip:x86/urgent] x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates

2016-10-20 Thread tip-bot for Alex Thorlton
Commit-ID:  caef78b6cdeddf4ad364f95910bba6b43b8eb9bf
Gitweb: http://git.kernel.org/tip/caef78b6cdeddf4ad364f95910bba6b43b8eb9bf
Author: Alex Thorlton 
AuthorDate: Wed, 19 Oct 2016 20:48:51 -0500
Committer:  Ingo Molnar 
CommitDate: Thu, 20 Oct 2016 08:47:58 +0200

x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates

Some time ago, we brought our UV BIOS callback code up to speed with the
new EFI memory mapping scheme, in commit:

d1be84a232e3 ("x86/uv: Update uv_bios_call() to use 
efi_call_virt_pointer()")

By leveraging some changes that I made to a few of the EFI runtime
callback mechanisms, in commit:

80e75596079f ("efi: Convert efi_call_virt() to efi_call_virt_pointer()")

This got everything running smoothly on UV, with the new EFI mapping
code.  However, this left one, small loose end, in that EFI_OLD_MEMMAP
(a.k.a. efi=old_map) will no longer work on UV, on kernels that include
the aforementioned changes.

At the time this was not a major issue (in fact, it still really isn't),
but there's no reason that EFI_OLD_MEMMAP *shouldn't* work on our
systems.  This commit adds a check into uv_bios_call(), to see if we have
the EFI_OLD_MEMMAP bit set in efi.flags.  If it is set, we fall back to
using our old callback method, which uses efi_call() directly on the __va()
of our function pointer.

Signed-off-by: Alex Thorlton 
Acked-by: Matt Fleming 
Cc:  # v4.7 and later
Cc: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Denys Vlasenko 
Cc: Dimitri Sivanich 
Cc: H. Peter Anvin 
Cc: Josh Poimboeuf 
Cc: Linus Torvalds 
Cc: Masahiro Yamada 
Cc: Mike Travis 
Cc: Peter Zijlstra 
Cc: Russ Anderson 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1476928131-170101-1-git-send-email-athorl...@sgi.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/platform/uv/bios_uv.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index b4d5e95..4a6a5a2 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -40,7 +40,15 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 */
return BIOS_STATUS_UNIMPLEMENTED;
 
-   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, a3, a4, 
a5);
+   /*
+* If EFI_OLD_MEMMAP is set, we need to fall back to using our old EFI
+* callback method, which uses efi_call() directly, with the kernel 
page tables:
+*/
+   if (unlikely(test_bit(EFI_OLD_MEMMAP, )))
+   ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
a3, a4, a5);
+   else
+   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
a3, a4, a5);
+
return ret;
 }
 EXPORT_SYMBOL_GPL(uv_bios_call);


[PATCH v2] x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates

2016-10-19 Thread Alex Thorlton
Some time ago, we brought our UV BIOS callback code up to speed with the
new EFI memory mapping scheme, in commit:

d1be84a232e3 ("x86/uv: Update uv_bios_call() to use 
efi_call_virt_pointer()")

By leveraging some changes that I made to a few of the EFI runtime
callback mechanisms, in commit:

80e75596079f ("efi: Convert efi_call_virt() to efi_call_virt_pointer()")

This got everything running smoothly on UV, with the new EFI mapping
code.  However, this left one, small loose end, in that EFI_OLD_MEMMAP
(a.k.a. efi=old_map) will no longer work on UV, on kernels that include
the aforementioned changes.

At the time this was not a major issue (in fact, it still really isn't),
but there's no reason that EFI_OLD_MEMMAP *shouldn't* work on our
systems.  This commit adds a check into uv_bios_call, to see if we have
the EFI_OLD_MEMMAP bit set in efi.flags.  If it is set, we fall back to
using our old callback method, which uses efi_call directly on the __va
of our function pointer.

v2: Invert if-statement and add unlikely() hint.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Cc: Mike Travis <tra...@sgi.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Masahiro Yamada <yamada.masah...@socionext.com>
Cc: x...@kernel.org
---
 arch/x86/platform/uv/bios_uv.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index b4d5e95..d1f7422 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -40,7 +40,15 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 */
return BIOS_STATUS_UNIMPLEMENTED;
 
-   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, a3, a4, 
a5);
+   /*
+* If EFI_OLD_MEMMAP is set, we need to fall back to using our old EFI
+* callback method, which uses efi_call directly, with the kernel page 
tables.
+*/
+   if (unlikely(test_bit(EFI_OLD_MEMMAP, )))
+   ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
a3, a4, a5);
+   else
+   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
a3, a4, a5);
+
return ret;
 }
 EXPORT_SYMBOL_GPL(uv_bios_call);
-- 
1.8.5.6



[PATCH v2] x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates

2016-10-19 Thread Alex Thorlton
Some time ago, we brought our UV BIOS callback code up to speed with the
new EFI memory mapping scheme, in commit:

d1be84a232e3 ("x86/uv: Update uv_bios_call() to use 
efi_call_virt_pointer()")

By leveraging some changes that I made to a few of the EFI runtime
callback mechanisms, in commit:

80e75596079f ("efi: Convert efi_call_virt() to efi_call_virt_pointer()")

This got everything running smoothly on UV, with the new EFI mapping
code.  However, this left one, small loose end, in that EFI_OLD_MEMMAP
(a.k.a. efi=old_map) will no longer work on UV, on kernels that include
the aforementioned changes.

At the time this was not a major issue (in fact, it still really isn't),
but there's no reason that EFI_OLD_MEMMAP *shouldn't* work on our
systems.  This commit adds a check into uv_bios_call, to see if we have
the EFI_OLD_MEMMAP bit set in efi.flags.  If it is set, we fall back to
using our old callback method, which uses efi_call directly on the __va
of our function pointer.

v2: Invert if-statement and add unlikely() hint.

Signed-off-by: Alex Thorlton 
Cc: Mike Travis 
Cc: Russ Anderson 
Cc: Dimitri Sivanich 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Matt Fleming 
Cc: Masahiro Yamada 
Cc: x...@kernel.org
---
 arch/x86/platform/uv/bios_uv.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index b4d5e95..d1f7422 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -40,7 +40,15 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 */
return BIOS_STATUS_UNIMPLEMENTED;
 
-   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, a3, a4, 
a5);
+   /*
+* If EFI_OLD_MEMMAP is set, we need to fall back to using our old EFI
+* callback method, which uses efi_call directly, with the kernel page 
tables.
+*/
+   if (unlikely(test_bit(EFI_OLD_MEMMAP, )))
+   ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
a3, a4, a5);
+   else
+   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
a3, a4, a5);
+
return ret;
 }
 EXPORT_SYMBOL_GPL(uv_bios_call);
-- 
1.8.5.6



Re: [PATCH] x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates

2016-10-19 Thread Alex Thorlton
On Wed, Oct 19, 2016 at 12:32:00PM +0100, Matt Fleming wrote:
> Could you please invert the conditional? I had to read it 3 times to
> make sure it was correct given the comment that precedes it. E.g. this
> is preferable,
> 
>   if (efi_enabled(EFI_OLD_MEMMAP))
>   ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
> a3, a4, a5);
>   else
>   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
> a3, a4, a5);

Sure, no problem!  I generally default to checking for the more common
condition first, but I definitely see how that makes the code kind of
hard to read, in this case.

I'll send another version shortly.

Thanks, Matt!


Re: [PATCH] x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates

2016-10-19 Thread Alex Thorlton
On Wed, Oct 19, 2016 at 12:32:00PM +0100, Matt Fleming wrote:
> Could you please invert the conditional? I had to read it 3 times to
> make sure it was correct given the comment that precedes it. E.g. this
> is preferable,
> 
>   if (efi_enabled(EFI_OLD_MEMMAP))
>   ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
> a3, a4, a5);
>   else
>   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
> a3, a4, a5);

Sure, no problem!  I generally default to checking for the more common
condition first, but I definitely see how that makes the code kind of
hard to read, in this case.

I'll send another version shortly.

Thanks, Matt!


[PATCH] x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates

2016-10-18 Thread Alex Thorlton
Some time ago, we brought our UV BIOS callback code up to speed with the
new EFI memory mapping scheme, in commit:

d1be84a232e3 ("x86/uv: Update uv_bios_call() to use 
efi_call_virt_pointer()")

By leveraging some changes that I made to a few of the EFI runtime
callback mechanisms, in commit:

80e75596079f ("efi: Convert efi_call_virt() to efi_call_virt_pointer()")

This got everything running smoothly on UV, with the new EFI mapping
code.  However, this left one, small loose end, in that EFI_OLD_MEMMAP
(a.k.a. efi=old_map) will no longer work on UV, on kernels that include
the aforementioned changes.

At the time this was not a major issue (in fact, it still really isn't),
but there's no reason that EFI_OLD_MEMMAP *shouldn't* work on our
systems.  This commit adds a check into uv_bios_call, to see if we have
the EFI_OLD_MEMMAP bit set in efi.flags.  If it is set, we fall back to
using our old callback method, which uses efi_call directly on the __va
of our function pointer.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Cc: Mike Travis <tra...@sgi.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Masahiro Yamada <yamada.masah...@socionext.com>
Cc: x...@kernel.org
---
 arch/x86/platform/uv/bios_uv.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index b4d5e95..12b6e52 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -40,7 +40,15 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 */
return BIOS_STATUS_UNIMPLEMENTED;
 
-   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, a3, a4, 
a5);
+   /*
+* If EFI_OLD_MEMMAP is set, we need to fall back to using our old EFI
+* callback method, which uses efi_call directly, with the kernel page 
tables.
+*/
+   if (!test_bit(EFI_OLD_MEMMAP, ))
+   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
a3, a4, a5);
+   else
+   ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
a3, a4, a5);
+
return ret;
 }
 EXPORT_SYMBOL_GPL(uv_bios_call);
-- 
1.8.5.6



[PATCH] x86/platform/UV: Fix support for EFI_OLD_MEMMAP after BIOS callback updates

2016-10-18 Thread Alex Thorlton
Some time ago, we brought our UV BIOS callback code up to speed with the
new EFI memory mapping scheme, in commit:

d1be84a232e3 ("x86/uv: Update uv_bios_call() to use 
efi_call_virt_pointer()")

By leveraging some changes that I made to a few of the EFI runtime
callback mechanisms, in commit:

80e75596079f ("efi: Convert efi_call_virt() to efi_call_virt_pointer()")

This got everything running smoothly on UV, with the new EFI mapping
code.  However, this left one, small loose end, in that EFI_OLD_MEMMAP
(a.k.a. efi=old_map) will no longer work on UV, on kernels that include
the aforementioned changes.

At the time this was not a major issue (in fact, it still really isn't),
but there's no reason that EFI_OLD_MEMMAP *shouldn't* work on our
systems.  This commit adds a check into uv_bios_call, to see if we have
the EFI_OLD_MEMMAP bit set in efi.flags.  If it is set, we fall back to
using our old callback method, which uses efi_call directly on the __va
of our function pointer.

Signed-off-by: Alex Thorlton 
Cc: Mike Travis 
Cc: Russ Anderson 
Cc: Dimitri Sivanich 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Matt Fleming 
Cc: Masahiro Yamada 
Cc: x...@kernel.org
---
 arch/x86/platform/uv/bios_uv.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index b4d5e95..12b6e52 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -40,7 +40,15 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 */
return BIOS_STATUS_UNIMPLEMENTED;
 
-   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, a3, a4, 
a5);
+   /*
+* If EFI_OLD_MEMMAP is set, we need to fall back to using our old EFI
+* callback method, which uses efi_call directly, with the kernel page 
tables.
+*/
+   if (!test_bit(EFI_OLD_MEMMAP, ))
+   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, 
a3, a4, a5);
+   else
+   ret = efi_call((void *)__va(tab->function), (u64)which, a1, a2, 
a3, a4, a5);
+
return ret;
 }
 EXPORT_SYMBOL_GPL(uv_bios_call);
-- 
1.8.5.6



Re: [PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-17 Thread Alex Thorlton
On Wed, Aug 17, 2016 at 03:01:51PM +0800, Dave Young wrote:
> > > Why do you guys need the physical mapping all of a sudden?
> > 
> > It's not that we need it all of the sudden, necessarily, it's just that
> > we've had to make other changes to make things work with the new,
> > (almost) completely isolated, EFI page tables.  We ended up choosing the
> > lesser of two evils, and have decided to temporarily rely on the
> > physical address of our runtime code, instead of continuing to rely on
> > EFI_OLD_MEMMAP.
> 
> In efi_map_region, there is already mapped md->phys_addr for broken
> firmware. SGI still need EFI_OLD_MEMMAP? I means in 1st kernel instead
> of kexec kernel.

We're actually in the middle of trying to move *away* from
EFI_OLD_MEMMAP, which is why we're just starting to uncover some of
these things.  efi_map_region covers us on the primary kernel, because
it maps in the physical address of each md (as you note here), but that
little piece is missing in the kexec'd kernel.  So, our primary kernel
works without efi=old_map, but the second kernel won't, without this
change (supplying "noefi" on the kexec command line also works, but then
we don't have any of our runtime stuff available).

As noted in a previous message, we're aware that our code needs a little
more work to be "perfect," but this small change buys us most of (all
of?) the stuff we'd get by implementing the other changes that we're
aware we need to make, i.e. update our runtime function pointer to its
efi_va during SetVirtualAddressMap, at least from a kexec perspective.

- Alex


Re: [PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-17 Thread Alex Thorlton
On Wed, Aug 17, 2016 at 03:01:51PM +0800, Dave Young wrote:
> > > Why do you guys need the physical mapping all of a sudden?
> > 
> > It's not that we need it all of the sudden, necessarily, it's just that
> > we've had to make other changes to make things work with the new,
> > (almost) completely isolated, EFI page tables.  We ended up choosing the
> > lesser of two evils, and have decided to temporarily rely on the
> > physical address of our runtime code, instead of continuing to rely on
> > EFI_OLD_MEMMAP.
> 
> In efi_map_region, there is already mapped md->phys_addr for broken
> firmware. SGI still need EFI_OLD_MEMMAP? I means in 1st kernel instead
> of kexec kernel.

We're actually in the middle of trying to move *away* from
EFI_OLD_MEMMAP, which is why we're just starting to uncover some of
these things.  efi_map_region covers us on the primary kernel, because
it maps in the physical address of each md (as you note here), but that
little piece is missing in the kexec'd kernel.  So, our primary kernel
works without efi=old_map, but the second kernel won't, without this
change (supplying "noefi" on the kexec command line also works, but then
we don't have any of our runtime stuff available).

As noted in a previous message, we're aware that our code needs a little
more work to be "perfect," but this small change buys us most of (all
of?) the stuff we'd get by implementing the other changes that we're
aware we need to make, i.e. update our runtime function pointer to its
efi_va during SetVirtualAddressMap, at least from a kexec perspective.

- Alex


Re: [PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-16 Thread Alex Thorlton
On Tue, Aug 16, 2016 at 07:50:10AM +0200, Borislav Petkov wrote:
> On Mon, Aug 15, 2016 at 01:47:31PM -0500, Alex Thorlton wrote:
> > The only thing we're adding here is the physical mappings, to match
> > what is availble in the primary kernel.
> 
> I can see what it does - I just am questioning the reasoning for as we
> did all that effort so that kexec can have stable virtual mappings.
> 
> I guess we still need a way to pass the virtual mappings to kexec
> as they're immutable as some "smartass" decided to allow to call
> SetVirtualAddressMap only once.
> 
> > This is sort of a hand-wavey answer - I will investigate the his further...
> 
> Yeah, it'll be interesting to know whether that is an issue because if
> we do the 1:1 mappings in the kexec kernel too and there's an address
> conflict, then we better know upfront.

I did some investigation/testing here and found, as Matt stated, that
this shouldn't cause any serious problems.  The crashkernel memory
reservation happens a little later than I originally thought, so we sort
of hit a hybrid of the two problems I described.  i.e, the memory isn't
mapped yet, but the kernel does know not to map anything over it.

I purposely set the crashkernel to land right on top of some of our
MMRs:

#define UV2_GLOBAL_MMR32_BASE   0xfc00UL

This is 4227858432 in base10, or 4128768K in crashkernel parameter
terms, so I used this on the command line:

crashkernel=256M@4128768K

And, not too surprisingly, I hit this while trying to reserve the
crashkernel memory:

[0.00] crashkernel reservation failed - memory is in use.

At a glance, I believe that this is because the memory range for these
MMRs is E820_RESERVED:

[0.00] BIOS-e820: [mem 0xf800-0xfdff] reserved

So, it gets skipped over during memblock_x86_fill, which only picks up
E820_RAM, and E820_RESERVED_KERN memory.  At least in this case, the
memory range where our MMRs fall is seen as completely unavailable (or,
more accurately, not seen at all) when we do the memblock_find_in_range
during reserve_crashkernel.

Note that I chose to map over some MMR space instead of EFI runtime
code.  This was an arbitrary choice - they both show up as
E820_RESERVED, so the behavior will be the same.

This is certainly not news to some, but I wasn't positive about how this
worked, and wanted to make sure that I addressed the concern that you
expressed.

Thanks again for your input!

- Alex


Re: [PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-16 Thread Alex Thorlton
On Tue, Aug 16, 2016 at 07:50:10AM +0200, Borislav Petkov wrote:
> On Mon, Aug 15, 2016 at 01:47:31PM -0500, Alex Thorlton wrote:
> > The only thing we're adding here is the physical mappings, to match
> > what is availble in the primary kernel.
> 
> I can see what it does - I just am questioning the reasoning for as we
> did all that effort so that kexec can have stable virtual mappings.
> 
> I guess we still need a way to pass the virtual mappings to kexec
> as they're immutable as some "smartass" decided to allow to call
> SetVirtualAddressMap only once.
> 
> > This is sort of a hand-wavey answer - I will investigate the his further...
> 
> Yeah, it'll be interesting to know whether that is an issue because if
> we do the 1:1 mappings in the kexec kernel too and there's an address
> conflict, then we better know upfront.

I did some investigation/testing here and found, as Matt stated, that
this shouldn't cause any serious problems.  The crashkernel memory
reservation happens a little later than I originally thought, so we sort
of hit a hybrid of the two problems I described.  i.e, the memory isn't
mapped yet, but the kernel does know not to map anything over it.

I purposely set the crashkernel to land right on top of some of our
MMRs:

#define UV2_GLOBAL_MMR32_BASE   0xfc00UL

This is 4227858432 in base10, or 4128768K in crashkernel parameter
terms, so I used this on the command line:

crashkernel=256M@4128768K

And, not too surprisingly, I hit this while trying to reserve the
crashkernel memory:

[0.00] crashkernel reservation failed - memory is in use.

At a glance, I believe that this is because the memory range for these
MMRs is E820_RESERVED:

[0.00] BIOS-e820: [mem 0xf800-0xfdff] reserved

So, it gets skipped over during memblock_x86_fill, which only picks up
E820_RAM, and E820_RESERVED_KERN memory.  At least in this case, the
memory range where our MMRs fall is seen as completely unavailable (or,
more accurately, not seen at all) when we do the memblock_find_in_range
during reserve_crashkernel.

Note that I chose to map over some MMR space instead of EFI runtime
code.  This was an arbitrary choice - they both show up as
E820_RESERVED, so the behavior will be the same.

This is certainly not news to some, but I wasn't positive about how this
worked, and wanted to make sure that I addressed the concern that you
expressed.

Thanks again for your input!

- Alex


Re: [PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-15 Thread Alex Thorlton
On Mon, Aug 15, 2016 at 05:07:09PM +0200, Borislav Petkov wrote:
> On Mon, Aug 15, 2016 at 01:42:58PM +0100, Matt Fleming wrote:
> > (Cc'ing Boris and Dave)
> > 
> > On Fri, 05 Aug, at 06:59:35PM, Alex Thorlton wrote:
> > > This is a simple change to add in the physical mappings as well as the
> > > virtual mappings in efi_map_region_fixed.  The motivation here is to
> > > get access to EFI runtime code that is only available via the 1:1
> > > mappings on a kexec'd kernel.
> 
> So I don't understand: the whole jumping through hoops so that we have
> stable virtual mappings just so that the kexec-ed kernel can call EFI
> runtime, is now useless?!

Well, anyone who is using the mappings in -4G down range still needs
your code to map their stuff into the appropriate spot in the kexec'd
kernel, so anybody who is doing things in the most up-to-date manner
still makes use of your new code, right?

AFAIK, we pass up the efi_runtime_map to the kexec'd kernel, and then
process the memory descriptors one by one, mapping in their virtual
addresses during kexec_enter_virtual_mode.  All of this relies on your
efforts to pass up the virtual mappings.  The only thing we're adding
here is the physical mappings, to match what is availble in the primary
kernel.

> What if the physical address is occupied by the kexec kernel?

Ah, that's a good question.  I usually don't force the placement of my
crashkernel, so I can't exactly comment intelligently on what will
happen in that situation.  I will look into this, but in our case, I
believe that the primary kernel would either fail to boot, or fail to
load the kexec kernel if we placed it over the range where are EFI code
gets mapped in, which I think is an issue even without this patch.

IIRC, the crashkernel memory gets reserved really early, so I'd imagine
we'd hit the former problem, since we will try to remap into that same
space in uv_bios_init.

This is sort of a hand-wavey answer - I will investigate the his further to
make sure that I'm not making any stupid assumptions (there's a good
chance that I am :)

> Why do you guys need the physical mapping all of a sudden?

It's not that we need it all of the sudden, necessarily, it's just that
we've had to make other changes to make things work with the new,
(almost) completely isolated, EFI page tables.  We ended up choosing the
lesser of two evils, and have decided to temporarily rely on the
physical address of our runtime code, instead of continuing to rely on
EFI_OLD_MEMMAP.

I guess what I'm saying is, if we hadn't been relying on some
semi-undefined behavior in the EFI memory mapping scheme, we would've
been relying on the physical address for quite a while, since nobody
would have been mapping in the virtual address for us.

It's important to note that we've been dancing back and forth between
workarounds for some slightly inorrect assumptions, and some actual bug
fixes.  We're working to get everything 100% compatible with the new
memory mapping schemes, but there're a few pieces of the puzzle we
haven't gotten around to yet.

> Your patch is basically rendering all the effort moot and we could've
> saved ourselves all that trouble of doing all that virtual address
> mapping and done the 1:1 thing.
> 
> Which really is probably simpler since we have an EFI-specific page
> table and running EFI in the kexec-ed kernel would mean basically
> recreating it.
> 
> What am I missing?

I don't think it renders all of your effort worthless.  It just allows
those who haven't had a chance to completely update their code to work
with the new mapping schemes a way to utilize EFI runtime callbacks, in
a kexec'd kernel.

I do understand that, in a perfect world, we would be able to just hop
right in and use your memory mapping scheme.  At the same time, I don't
see how this is that much different from what we already do in the
primary, non-kexec'd kernel.

If there are strong objections to this change, I won't pursue it
further.  We will be able to achieve the same effect once we've had a
chance to update our code to register a callback with
SetVirtualAddressMap to fix up our function pointer.  This is on my
upcoming to-do list, but it'll be a little bit before I have a chance to
get it finished.

Thanks for the input, Boris!

- Alex


Re: [PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-15 Thread Alex Thorlton
On Mon, Aug 15, 2016 at 05:07:09PM +0200, Borislav Petkov wrote:
> On Mon, Aug 15, 2016 at 01:42:58PM +0100, Matt Fleming wrote:
> > (Cc'ing Boris and Dave)
> > 
> > On Fri, 05 Aug, at 06:59:35PM, Alex Thorlton wrote:
> > > This is a simple change to add in the physical mappings as well as the
> > > virtual mappings in efi_map_region_fixed.  The motivation here is to
> > > get access to EFI runtime code that is only available via the 1:1
> > > mappings on a kexec'd kernel.
> 
> So I don't understand: the whole jumping through hoops so that we have
> stable virtual mappings just so that the kexec-ed kernel can call EFI
> runtime, is now useless?!

Well, anyone who is using the mappings in -4G down range still needs
your code to map their stuff into the appropriate spot in the kexec'd
kernel, so anybody who is doing things in the most up-to-date manner
still makes use of your new code, right?

AFAIK, we pass up the efi_runtime_map to the kexec'd kernel, and then
process the memory descriptors one by one, mapping in their virtual
addresses during kexec_enter_virtual_mode.  All of this relies on your
efforts to pass up the virtual mappings.  The only thing we're adding
here is the physical mappings, to match what is availble in the primary
kernel.

> What if the physical address is occupied by the kexec kernel?

Ah, that's a good question.  I usually don't force the placement of my
crashkernel, so I can't exactly comment intelligently on what will
happen in that situation.  I will look into this, but in our case, I
believe that the primary kernel would either fail to boot, or fail to
load the kexec kernel if we placed it over the range where are EFI code
gets mapped in, which I think is an issue even without this patch.

IIRC, the crashkernel memory gets reserved really early, so I'd imagine
we'd hit the former problem, since we will try to remap into that same
space in uv_bios_init.

This is sort of a hand-wavey answer - I will investigate the his further to
make sure that I'm not making any stupid assumptions (there's a good
chance that I am :)

> Why do you guys need the physical mapping all of a sudden?

It's not that we need it all of the sudden, necessarily, it's just that
we've had to make other changes to make things work with the new,
(almost) completely isolated, EFI page tables.  We ended up choosing the
lesser of two evils, and have decided to temporarily rely on the
physical address of our runtime code, instead of continuing to rely on
EFI_OLD_MEMMAP.

I guess what I'm saying is, if we hadn't been relying on some
semi-undefined behavior in the EFI memory mapping scheme, we would've
been relying on the physical address for quite a while, since nobody
would have been mapping in the virtual address for us.

It's important to note that we've been dancing back and forth between
workarounds for some slightly inorrect assumptions, and some actual bug
fixes.  We're working to get everything 100% compatible with the new
memory mapping schemes, but there're a few pieces of the puzzle we
haven't gotten around to yet.

> Your patch is basically rendering all the effort moot and we could've
> saved ourselves all that trouble of doing all that virtual address
> mapping and done the 1:1 thing.
> 
> Which really is probably simpler since we have an EFI-specific page
> table and running EFI in the kexec-ed kernel would mean basically
> recreating it.
> 
> What am I missing?

I don't think it renders all of your effort worthless.  It just allows
those who haven't had a chance to completely update their code to work
with the new mapping schemes a way to utilize EFI runtime callbacks, in
a kexec'd kernel.

I do understand that, in a perfect world, we would be able to just hop
right in and use your memory mapping scheme.  At the same time, I don't
see how this is that much different from what we already do in the
primary, non-kexec'd kernel.

If there are strong objections to this change, I won't pursue it
further.  We will be able to achieve the same effect once we've had a
chance to update our code to register a callback with
SetVirtualAddressMap to fix up our function pointer.  This is on my
upcoming to-do list, but it'll be a little bit before I have a chance to
get it finished.

Thanks for the input, Boris!

- Alex


[tip:efi/urgent] x86/platform/uv: Skip UV runtime services mapping in the efi_runtime_disabled case

2016-08-11 Thread tip-bot for Alex Thorlton
Commit-ID:  f72075c9eda8a43aeea2f9dbb8d187afd4a76f0b
Gitweb: http://git.kernel.org/tip/f72075c9eda8a43aeea2f9dbb8d187afd4a76f0b
Author: Alex Thorlton <athorl...@sgi.com>
AuthorDate: Thu, 11 Aug 2016 11:41:59 +0100
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Thu, 11 Aug 2016 13:55:36 +0200

x86/platform/uv: Skip UV runtime services mapping in the efi_runtime_disabled 
case

This problem has actually been in the UV code for a while, but we didn't
catch it until recently, because we had been relying on EFI_OLD_MEMMAP
to allow our systems to boot for a period of time.  We noticed the issue
when trying to kexec a recent community kernel, where we hit this NULL
pointer dereference in efi_sync_low_kernel_mappings():

 [0.337515] BUG: unable to handle kernel NULL pointer dereference at 
0880
 [0.346276] IP: [] efi_sync_low_kernel_mappings+0x5d/0x1b0

The problem doesn't show up with EFI_OLD_MEMMAP because we skip the
chunk of setup_efi_state() that sets the efi_loader_signature for the
kexec'd kernel.  When the kexec'd kernel boots, it won't set EFI_BOOT in
setup_arch, so we completely avoid the bug.

We always kexec with noefi on the command line, so this shouldn't be an
issue, but since we're not actually checking for efi_runtime_disabled in
uv_bios_init(), we end up trying to do EFI runtime callbacks when we
shouldn't be. This patch just adds a check for efi_runtime_disabled in
uv_bios_init() so that we don't map in uv_systab when runtime_disabled ==
true.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
Cc: <sta...@vger.kernel.org> # v4.7
Cc: Andy Lutomirski <l...@kernel.org>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Borislav Petkov <b...@suse.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Denys Vlasenko <dvlas...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Josh Poimboeuf <jpoim...@redhat.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Mike Travis <tra...@sgi.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Russ Anderson <r...@sgi.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1470912120-22831-2-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/platform/uv/bios_uv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 66b2166..0df8a03 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -187,7 +187,8 @@ EXPORT_SYMBOL_GPL(uv_bios_set_legacy_vga_target);
 void uv_bios_init(void)
 {
uv_systab = NULL;
-   if ((efi.uv_systab == EFI_INVALID_TABLE_ADDR) || !efi.uv_systab) {
+   if ((efi.uv_systab == EFI_INVALID_TABLE_ADDR) ||
+   !efi.uv_systab || efi_runtime_disabled()) {
pr_crit("UV: UVsystab: missing\n");
return;
}


[tip:efi/urgent] x86/platform/uv: Skip UV runtime services mapping in the efi_runtime_disabled case

2016-08-11 Thread tip-bot for Alex Thorlton
Commit-ID:  f72075c9eda8a43aeea2f9dbb8d187afd4a76f0b
Gitweb: http://git.kernel.org/tip/f72075c9eda8a43aeea2f9dbb8d187afd4a76f0b
Author: Alex Thorlton 
AuthorDate: Thu, 11 Aug 2016 11:41:59 +0100
Committer:  Ingo Molnar 
CommitDate: Thu, 11 Aug 2016 13:55:36 +0200

x86/platform/uv: Skip UV runtime services mapping in the efi_runtime_disabled 
case

This problem has actually been in the UV code for a while, but we didn't
catch it until recently, because we had been relying on EFI_OLD_MEMMAP
to allow our systems to boot for a period of time.  We noticed the issue
when trying to kexec a recent community kernel, where we hit this NULL
pointer dereference in efi_sync_low_kernel_mappings():

 [0.337515] BUG: unable to handle kernel NULL pointer dereference at 
0880
 [0.346276] IP: [] efi_sync_low_kernel_mappings+0x5d/0x1b0

The problem doesn't show up with EFI_OLD_MEMMAP because we skip the
chunk of setup_efi_state() that sets the efi_loader_signature for the
kexec'd kernel.  When the kexec'd kernel boots, it won't set EFI_BOOT in
setup_arch, so we completely avoid the bug.

We always kexec with noefi on the command line, so this shouldn't be an
issue, but since we're not actually checking for efi_runtime_disabled in
uv_bios_init(), we end up trying to do EFI runtime callbacks when we
shouldn't be. This patch just adds a check for efi_runtime_disabled in
uv_bios_init() so that we don't map in uv_systab when runtime_disabled ==
true.

Signed-off-by: Alex Thorlton 
Signed-off-by: Matt Fleming 
Cc:  # v4.7
Cc: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Borislav Petkov 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Denys Vlasenko 
Cc: H. Peter Anvin 
Cc: Josh Poimboeuf 
Cc: Linus Torvalds 
Cc: Mike Travis 
Cc: Peter Zijlstra 
Cc: Russ Anderson 
Cc: Thomas Gleixner 
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1470912120-22831-2-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 arch/x86/platform/uv/bios_uv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 66b2166..0df8a03 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -187,7 +187,8 @@ EXPORT_SYMBOL_GPL(uv_bios_set_legacy_vga_target);
 void uv_bios_init(void)
 {
uv_systab = NULL;
-   if ((efi.uv_systab == EFI_INVALID_TABLE_ADDR) || !efi.uv_systab) {
+   if ((efi.uv_systab == EFI_INVALID_TABLE_ADDR) ||
+   !efi.uv_systab || efi_runtime_disabled()) {
pr_crit("UV: UVsystab: missing\n");
return;
}


funkier linuxdev mail test

2016-08-10 Thread Alex Thorlton
Maybe this time?



funkier linuxdev mail test

2016-08-10 Thread Alex Thorlton
Maybe this time?



Re: [PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-10 Thread Alex Thorlton
On Fri, Aug 05, 2016 at 06:59:35PM -0500, Alex Thorlton wrote:
> This is a simple change to add in the physical mappings as well as the
> virtual mappings in efi_map_region_fixed.  The motivation here is to
> get access to EFI runtime code that is only available via the 1:1
> mappings on a kexec'd kernel.
> 
> The added call is essentially the kexec analog of the first __map_region
> that Boris put in efi_map_region in commit d2f7cbe7b26a ("x86/efi:
> Runtime services virtual mapping").

Just wanted to ping and see if anybody has had a chance to look at this?
Also, linux-kernel looks broken today (latest e-mails on lkml look to be
from last night?) so I wanted to send something to see if it goes
through :)

- Alex


Re: [PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-10 Thread Alex Thorlton
On Fri, Aug 05, 2016 at 06:59:35PM -0500, Alex Thorlton wrote:
> This is a simple change to add in the physical mappings as well as the
> virtual mappings in efi_map_region_fixed.  The motivation here is to
> get access to EFI runtime code that is only available via the 1:1
> mappings on a kexec'd kernel.
> 
> The added call is essentially the kexec analog of the first __map_region
> that Boris put in efi_map_region in commit d2f7cbe7b26a ("x86/efi:
> Runtime services virtual mapping").

Just wanted to ping and see if anybody has had a chance to look at this?
Also, linux-kernel looks broken today (latest e-mails on lkml look to be
from last night?) so I wanted to send something to see if it goes
through :)

- Alex


[PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-06 Thread Alex Thorlton
This is a simple change to add in the physical mappings as well as the
virtual mappings in efi_map_region_fixed.  The motivation here is to
get access to EFI runtime code that is only available via the 1:1
mappings on a kexec'd kernel.

The added call is essentially the kexec analog of the first __map_region
that Boris put in efi_map_region in commit d2f7cbe7b26a ("x86/efi:
Runtime services virtual mapping").

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: Mike Travis <tra...@sgi.com>
Cc: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
---
 arch/x86/platform/efi/efi_64.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 459bcbb..b206126 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -363,6 +363,7 @@ void __init efi_map_region(efi_memory_desc_t *md)
  */
 void __init efi_map_region_fixed(efi_memory_desc_t *md)
 {
+   __map_region(md, md->phys_addr);
__map_region(md, md->virt_addr);
 }
 
-- 
1.8.5.6



[PATCH] Map in physical addresses in efi_map_region_fixed

2016-08-06 Thread Alex Thorlton
This is a simple change to add in the physical mappings as well as the
virtual mappings in efi_map_region_fixed.  The motivation here is to
get access to EFI runtime code that is only available via the 1:1
mappings on a kexec'd kernel.

The added call is essentially the kexec analog of the first __map_region
that Boris put in efi_map_region in commit d2f7cbe7b26a ("x86/efi:
Runtime services virtual mapping").

Signed-off-by: Alex Thorlton 
Cc: Russ Anderson 
Cc: Dimitri Sivanich 
Cc: Mike Travis 
Cc: Matt Fleming 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
---
 arch/x86/platform/efi/efi_64.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 459bcbb..b206126 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -363,6 +363,7 @@ void __init efi_map_region(efi_memory_desc_t *md)
  */
 void __init efi_map_region_fixed(efi_memory_desc_t *md)
 {
+   __map_region(md, md->phys_addr);
__map_region(md, md->virt_addr);
 }
 
-- 
1.8.5.6



Re: [RFC PATCH] Fix EFI callbacks on UV during kexec

2016-08-04 Thread Alex Thorlton
On Thu, Aug 04, 2016 at 10:25:32AM +0100, Matt Fleming wrote:
> On Mon, 01 Aug, at 09:34:10AM, Alex Thorlton wrote:
> > 
> > Hmm...   That's a good point.  It certainly would be nice for us to have
> > those mappings in the kexec kernel, at least for the time being.  If
> > you're not opposed to it, I can write up the patch.  Pretty sure it's a
> > one-liner.
> 
> If it's that trivial, sure, please go ahead and submit.

Sure thing.  I played around with this a bit before I sent this patch
up.  I'm pretty sure it all worked as expected, but I'll need to double
check everything.  I'll try and get it out in the next day or two.

Thanks, Matt!

- Alex


Re: [RFC PATCH] Fix EFI callbacks on UV during kexec

2016-08-04 Thread Alex Thorlton
On Thu, Aug 04, 2016 at 10:25:32AM +0100, Matt Fleming wrote:
> On Mon, 01 Aug, at 09:34:10AM, Alex Thorlton wrote:
> > 
> > Hmm...   That's a good point.  It certainly would be nice for us to have
> > those mappings in the kexec kernel, at least for the time being.  If
> > you're not opposed to it, I can write up the patch.  Pretty sure it's a
> > one-liner.
> 
> If it's that trivial, sure, please go ahead and submit.

Sure thing.  I played around with this a bit before I sent this patch
up.  I'm pretty sure it all worked as expected, but I'll need to double
check everything.  I'll try and get it out in the next day or two.

Thanks, Matt!

- Alex


Re: [PATCH] Skip UV runtime services mapping in the efi_runtime_disabled case

2016-08-03 Thread Alex Thorlton
On Mon, Aug 01, 2016 at 09:28:06AM -0500, Alex Thorlton wrote:
> So, it definitely needs to go in for v4.8, but it's kind of a toss-up
> for the older kernels.  I'll discuss this with the other guys around
> here to see what they think, and get back to you a bit later, if that's
> alright?

We talked about this, and I think everyone here agrees that there's not
much point in pulling this change back to the older kernels.  The only
exception here would be that we'd definitely like this change on the
older kernels *if* my other memmap fixes get ported back to those
kernels, though I don't know what the chances are of those changes
making it through stable.

So, unless you have a particular reason that you'd like to pull it back
to the old kernels, or you think that my other fixes might make it back
there, I don't see much point.

Let me know what you think!

- Alex


Re: [PATCH] Skip UV runtime services mapping in the efi_runtime_disabled case

2016-08-03 Thread Alex Thorlton
On Mon, Aug 01, 2016 at 09:28:06AM -0500, Alex Thorlton wrote:
> So, it definitely needs to go in for v4.8, but it's kind of a toss-up
> for the older kernels.  I'll discuss this with the other guys around
> here to see what they think, and get back to you a bit later, if that's
> alright?

We talked about this, and I think everyone here agrees that there's not
much point in pulling this change back to the older kernels.  The only
exception here would be that we'd definitely like this change on the
older kernels *if* my other memmap fixes get ported back to those
kernels, though I don't know what the chances are of those changes
making it through stable.

So, unless you have a particular reason that you'd like to pull it back
to the old kernels, or you think that my other fixes might make it back
there, I don't see much point.

Let me know what you think!

- Alex


Re: [RFC PATCH] Fix EFI callbacks on UV during kexec

2016-08-01 Thread Alex Thorlton
On Mon, Aug 01, 2016 at 02:39:26PM +0100, Matt Fleming wrote:
> On Tue, 26 Jul, at 05:38:32PM, Alex Thorlton wrote:
> > 
> > After investigating the problem here and figuring out the proper way to
> > get the noefi parameter working again, I noticed that there appears to
> > be support for EFI runtime callbacks in a kexec'd kernel now...  I
> > think we need some more cleanup here to get that all working entirely.
> > Without noefi, we hit a bad paging request when we try to do EFI 
> > callbacks:
>  
> [...]
> 
> > [0.341531] BUG: unable to handle kernel paging request at 
> > 6a1ab938
> > [0.349319] IP: [<6a1ab938>] 0x6a1ab938
> > [0.354386] PGD 354e0063 PUD 0
> > [0.357910] Oops: 0010 [#1] SMP
> > [0.361414] Modules linked in:
> > [0.364833] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
> > 4.7.0-runtime-check+ #713
> 
> [...]
> 
> > This is due to the fact that the efi_map_region_fixed calls in
> > kexec_enter_virtual_mode, which map in the EFI runtime memory
> > descriptors, only map the virtual address of the descriptor.
> > Unfortunately, since we're still relying on the physical address of our
> > EFI runtime code being mapped in, we don't have access to that code in
> > the kexec scenario.
> > 
> > A potential fix for this would be to map in the physical addresses of
> > the descriptors as well as the virtual addresses in
> > efi_map_region_fixed, but the more "correct" fix would be to update
> > our system table pointer to its new virtual address during
> > SetVirtualAddressMap.  We intend to get that piece fixed up relatively
> > soon, but haven't quite gotten around to it yet.
> 
> I don't think it would be so bad if we did the 1:1 mappings in the
> kexec kernel too, we've got our own page tables after all and the VA
> space is available. It would be required if people ever want to use
> kexec with mixed mode kernels too.

Hmm...   That's a good point.  It certainly would be nice for us to have
those mappings in the kexec kernel, at least for the time being.  If
you're not opposed to it, I can write up the patch.  Pretty sure it's a
one-liner.

- Alex


Re: [RFC PATCH] Fix EFI callbacks on UV during kexec

2016-08-01 Thread Alex Thorlton
On Mon, Aug 01, 2016 at 02:39:26PM +0100, Matt Fleming wrote:
> On Tue, 26 Jul, at 05:38:32PM, Alex Thorlton wrote:
> > 
> > After investigating the problem here and figuring out the proper way to
> > get the noefi parameter working again, I noticed that there appears to
> > be support for EFI runtime callbacks in a kexec'd kernel now...  I
> > think we need some more cleanup here to get that all working entirely.
> > Without noefi, we hit a bad paging request when we try to do EFI 
> > callbacks:
>  
> [...]
> 
> > [0.341531] BUG: unable to handle kernel paging request at 
> > 6a1ab938
> > [0.349319] IP: [<6a1ab938>] 0x6a1ab938
> > [0.354386] PGD 354e0063 PUD 0
> > [0.357910] Oops: 0010 [#1] SMP
> > [0.361414] Modules linked in:
> > [0.364833] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
> > 4.7.0-runtime-check+ #713
> 
> [...]
> 
> > This is due to the fact that the efi_map_region_fixed calls in
> > kexec_enter_virtual_mode, which map in the EFI runtime memory
> > descriptors, only map the virtual address of the descriptor.
> > Unfortunately, since we're still relying on the physical address of our
> > EFI runtime code being mapped in, we don't have access to that code in
> > the kexec scenario.
> > 
> > A potential fix for this would be to map in the physical addresses of
> > the descriptors as well as the virtual addresses in
> > efi_map_region_fixed, but the more "correct" fix would be to update
> > our system table pointer to its new virtual address during
> > SetVirtualAddressMap.  We intend to get that piece fixed up relatively
> > soon, but haven't quite gotten around to it yet.
> 
> I don't think it would be so bad if we did the 1:1 mappings in the
> kexec kernel too, we've got our own page tables after all and the VA
> space is available. It would be required if people ever want to use
> kexec with mixed mode kernels too.

Hmm...   That's a good point.  It certainly would be nice for us to have
those mappings in the kexec kernel, at least for the time being.  If
you're not opposed to it, I can write up the patch.  Pretty sure it's a
one-liner.

- Alex


Re: [PATCH] Skip UV runtime services mapping in the efi_runtime_disabled case

2016-08-01 Thread Alex Thorlton
On Mon, Aug 01, 2016 at 02:49:57PM +0100, Matt Fleming wrote:
> On Tue, 26 Jul, at 05:38:33PM, Alex Thorlton wrote:
> > This problem has actually been in the UV code for a while, but we didn't
> > catch it until recently, because we had been relying on EFI_OLD_MEMMAP
> > to allow our systems to boot for a period of time.  We noticed the issue
> > when trying to kexec a recent community kernel, where we hit this NULL
> > pointer dereference in efi_sync_low_kernel_mappings:
> > 
> > [0.337515] BUG: unable to handle kernel NULL pointer dereference at 
> > 0880
> > [0.346276] IP: [] 
> > efi_sync_low_kernel_mappings+0x5d/0x1b0
> > 
> > The problem doesn't show up with EFI_OLD_MEMMAP because we skip the
> > chunk of setup_efi_state that sets the efi_loader_signature for the
> > kexec'd kernel.  When the kexec'd kernel boots, it won't set EFI_BOOT in
> > setup_arch, so we completely avoid the bug.
> > 
> > We always kexec with noefi on the command line, so this shouldn't be an
> > issue, but since we're not actually checking for efi_runtime_disabled in
> > uv_bios_init, we end up trying to do EFI runtime callbacks when we
> > shouldn't be. This patch just adds a check for efi_runtime_disabled in
> > uv_bios_init so that we don't map in uv_systab when runtime_disabled ==
> > true.
> > 
> > Signed-off-by: Alex Thorlton <athorl...@sgi.com>
> > Cc: Russ Anderson <r...@sgi.com>
> > Cc: Mike Travis <tra...@sgi.com>
> > Cc: Matt Fleming <m...@codeblueprint.co.uk>
> > Cc: Borislav Petkov <b...@suse.de>
> > Cc: Thomas Gleixner <t...@linutronix.de>
> > Cc: Ingo Molnar <mi...@redhat.com>
> > Cc: "H. Peter Anvin" <h...@zytor.com>
> > Cc: x...@kernel.org
> > ---
> >  arch/x86/platform/uv/bios_uv.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
> > index 66b2166..0df8a03 100644
> > --- a/arch/x86/platform/uv/bios_uv.c
> > +++ b/arch/x86/platform/uv/bios_uv.c
> > @@ -187,7 +187,8 @@ EXPORT_SYMBOL_GPL(uv_bios_set_legacy_vga_target);
> >  void uv_bios_init(void)
> >  {
> > uv_systab = NULL;
> > -   if ((efi.uv_systab == EFI_INVALID_TABLE_ADDR) || !efi.uv_systab) {
> > +   if ((efi.uv_systab == EFI_INVALID_TABLE_ADDR) ||
> > +   !efi.uv_systab || efi_runtime_disabled()) {
> > pr_crit("UV: UVsystab: missing\n");
> > return;
> > }
> 
> The fix looks fine, but I'm losing track of which kernels this patch
> should be applied to. Does it just need to be applied for v4.8 or
> earlier kernels too?

Well, we *have* to boot v4.6 and v4.7 with efi=old_map, which will avoid
our kexec problem entirely, so while the patch would apply just fine on
those kernels, and achieve the desired effect, we wouldn't really get
any benefit from it.

So, it definitely needs to go in for v4.8, but it's kind of a toss-up
for the older kernels.  I'll discuss this with the other guys around
here to see what they think, and get back to you a bit later, if that's
alright?

- Alex


Re: [PATCH] Skip UV runtime services mapping in the efi_runtime_disabled case

2016-08-01 Thread Alex Thorlton
On Mon, Aug 01, 2016 at 02:49:57PM +0100, Matt Fleming wrote:
> On Tue, 26 Jul, at 05:38:33PM, Alex Thorlton wrote:
> > This problem has actually been in the UV code for a while, but we didn't
> > catch it until recently, because we had been relying on EFI_OLD_MEMMAP
> > to allow our systems to boot for a period of time.  We noticed the issue
> > when trying to kexec a recent community kernel, where we hit this NULL
> > pointer dereference in efi_sync_low_kernel_mappings:
> > 
> > [0.337515] BUG: unable to handle kernel NULL pointer dereference at 
> > 0880
> > [0.346276] IP: [] 
> > efi_sync_low_kernel_mappings+0x5d/0x1b0
> > 
> > The problem doesn't show up with EFI_OLD_MEMMAP because we skip the
> > chunk of setup_efi_state that sets the efi_loader_signature for the
> > kexec'd kernel.  When the kexec'd kernel boots, it won't set EFI_BOOT in
> > setup_arch, so we completely avoid the bug.
> > 
> > We always kexec with noefi on the command line, so this shouldn't be an
> > issue, but since we're not actually checking for efi_runtime_disabled in
> > uv_bios_init, we end up trying to do EFI runtime callbacks when we
> > shouldn't be. This patch just adds a check for efi_runtime_disabled in
> > uv_bios_init so that we don't map in uv_systab when runtime_disabled ==
> > true.
> > 
> > Signed-off-by: Alex Thorlton 
> > Cc: Russ Anderson 
> > Cc: Mike Travis 
> > Cc: Matt Fleming 
> > Cc: Borislav Petkov 
> > Cc: Thomas Gleixner 
> > Cc: Ingo Molnar 
> > Cc: "H. Peter Anvin" 
> > Cc: x...@kernel.org
> > ---
> >  arch/x86/platform/uv/bios_uv.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
> > index 66b2166..0df8a03 100644
> > --- a/arch/x86/platform/uv/bios_uv.c
> > +++ b/arch/x86/platform/uv/bios_uv.c
> > @@ -187,7 +187,8 @@ EXPORT_SYMBOL_GPL(uv_bios_set_legacy_vga_target);
> >  void uv_bios_init(void)
> >  {
> > uv_systab = NULL;
> > -   if ((efi.uv_systab == EFI_INVALID_TABLE_ADDR) || !efi.uv_systab) {
> > +   if ((efi.uv_systab == EFI_INVALID_TABLE_ADDR) ||
> > +   !efi.uv_systab || efi_runtime_disabled()) {
> > pr_crit("UV: UVsystab: missing\n");
> > return;
> > }
> 
> The fix looks fine, but I'm losing track of which kernels this patch
> should be applied to. Does it just need to be applied for v4.8 or
> earlier kernels too?

Well, we *have* to boot v4.6 and v4.7 with efi=old_map, which will avoid
our kexec problem entirely, so while the patch would apply just fine on
those kernels, and achieve the desired effect, we wouldn't really get
any benefit from it.

So, it definitely needs to go in for v4.8, but it's kind of a toss-up
for the older kernels.  I'll discuss this with the other guys around
here to see what they think, and get back to you a bit later, if that's
alright?

- Alex


[PATCH] Skip UV runtime services mapping in the efi_runtime_disabled case

2016-07-26 Thread Alex Thorlton
This problem has actually been in the UV code for a while, but we didn't
catch it until recently, because we had been relying on EFI_OLD_MEMMAP
to allow our systems to boot for a period of time.  We noticed the issue
when trying to kexec a recent community kernel, where we hit this NULL
pointer dereference in efi_sync_low_kernel_mappings:

[0.337515] BUG: unable to handle kernel NULL pointer dereference at 
0880
[0.346276] IP: [] efi_sync_low_kernel_mappings+0x5d/0x1b0

The problem doesn't show up with EFI_OLD_MEMMAP because we skip the
chunk of setup_efi_state that sets the efi_loader_signature for the
kexec'd kernel.  When the kexec'd kernel boots, it won't set EFI_BOOT in
setup_arch, so we completely avoid the bug.

We always kexec with noefi on the command line, so this shouldn't be an
issue, but since we're not actually checking for efi_runtime_disabled in
uv_bios_init, we end up trying to do EFI runtime callbacks when we
shouldn't be. This patch just adds a check for efi_runtime_disabled in
uv_bios_init so that we don't map in uv_systab when runtime_disabled ==
true.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Mike Travis <tra...@sgi.com>
Cc: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Borislav Petkov <b...@suse.de>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: x...@kernel.org
---
 arch/x86/platform/uv/bios_uv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 66b2166..0df8a03 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -187,7 +187,8 @@ EXPORT_SYMBOL_GPL(uv_bios_set_legacy_vga_target);
 void uv_bios_init(void)
 {
uv_systab = NULL;
-   if ((efi.uv_systab == EFI_INVALID_TABLE_ADDR) || !efi.uv_systab) {
+   if ((efi.uv_systab == EFI_INVALID_TABLE_ADDR) ||
+   !efi.uv_systab || efi_runtime_disabled()) {
pr_crit("UV: UVsystab: missing\n");
return;
}
-- 
1.8.5.6



[PATCH] Skip UV runtime services mapping in the efi_runtime_disabled case

2016-07-26 Thread Alex Thorlton
This problem has actually been in the UV code for a while, but we didn't
catch it until recently, because we had been relying on EFI_OLD_MEMMAP
to allow our systems to boot for a period of time.  We noticed the issue
when trying to kexec a recent community kernel, where we hit this NULL
pointer dereference in efi_sync_low_kernel_mappings:

[0.337515] BUG: unable to handle kernel NULL pointer dereference at 
0880
[0.346276] IP: [] efi_sync_low_kernel_mappings+0x5d/0x1b0

The problem doesn't show up with EFI_OLD_MEMMAP because we skip the
chunk of setup_efi_state that sets the efi_loader_signature for the
kexec'd kernel.  When the kexec'd kernel boots, it won't set EFI_BOOT in
setup_arch, so we completely avoid the bug.

We always kexec with noefi on the command line, so this shouldn't be an
issue, but since we're not actually checking for efi_runtime_disabled in
uv_bios_init, we end up trying to do EFI runtime callbacks when we
shouldn't be. This patch just adds a check for efi_runtime_disabled in
uv_bios_init so that we don't map in uv_systab when runtime_disabled ==
true.

Signed-off-by: Alex Thorlton 
Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Matt Fleming 
Cc: Borislav Petkov 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
---
 arch/x86/platform/uv/bios_uv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 66b2166..0df8a03 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -187,7 +187,8 @@ EXPORT_SYMBOL_GPL(uv_bios_set_legacy_vga_target);
 void uv_bios_init(void)
 {
uv_systab = NULL;
-   if ((efi.uv_systab == EFI_INVALID_TABLE_ADDR) || !efi.uv_systab) {
+   if ((efi.uv_systab == EFI_INVALID_TABLE_ADDR) ||
+   !efi.uv_systab || efi_runtime_disabled()) {
pr_crit("UV: UVsystab: missing\n");
return;
}
-- 
1.8.5.6



[RFC PATCH] Fix EFI callbacks on UV during kexec

2016-07-26 Thread Alex Thorlton
Hey everyone,

This is a fix for our BIOS init code to skip mapping in runtime services
when runtime_disabled == true.   This is one that snuck under the radar
for a while, since we were using EFI_OLD_MEMMAP for so long.  I've
explained the details of how it went unnoticed in the commit message.

After investigating the problem here and figuring out the proper way to
get the noefi parameter working again, I noticed that there appears to
be support for EFI runtime callbacks in a kexec'd kernel now...  I
think we need some more cleanup here to get that all working entirely.
Without noefi, we hit a bad paging request when we try to do EFI 
callbacks:

[0.292926] UV: UVsystab: Revision:1
[0.296913] UV: No UVsystab socket table, ignoring
[0.302261] UV: N:4 M:36 m_shift:28 n_lshift:39
[0.307317] UV: gpa_mask/shift:0xff/0 pnode_mask:0xf apic_pns:5
[0.314697] UV: mmr_base/shift:0xff4000/26 gru_base/shift:0x0/0
[0.321692] UV: gnode_upper:0x0 gnode_extra:0x0
[0.326746] UV: NODE_PRESENT_DEPTH = 16
[0.331025] UV: NODE_PRESENT(0) = 0x0001
[0.336569] UV: Found 1 hubs, 1 nodes, 10 cpus
[0.341531] BUG: unable to handle kernel paging request at 6a1ab938
[0.349319] IP: [<6a1ab938>] 0x6a1ab938
[0.354386] PGD 354e0063 PUD 0
[0.357910] Oops: 0010 [#1] SMP
[0.361414] Modules linked in:
[0.364833] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-runtime-check+ 
#713
[0.372988] Hardware name: SGI UV3000/UV3000, BIOS SGI UV 3000 series BIOS 
01/15/2015
[0.381725] task: 880035614040 ti: 880035618000 task.ti: 
880035618000
[0.390075] RIP: 0010:[<6a1ab938>]  [<6a1ab938>] 0x6a1ab938
[0.397855] RSP: :88003561bbe8  EFLAGS: 00010086
[0.403780] RAX:  RBX: c9006000 RCX: 0001
[0.411741] RDX:  RSI: 0001 RDI: 6a1ab938
[0.419705] RBP: 88003561bc90 R08: 88003561bd10 R09: 88003561bd18
[0.427667] R10: 8800354d4000 R11: 00c9 R12: 
[0.435630] R13:  R14: 88003561bd18 R15: 0001
[0.443592] FS:  () GS:88003480() 
knlGS:
[0.452621] CS:  0010 DS:  ES:  CR0: 80050033
[0.459033] CR2: 6a1ab938 CR3: 354d3000 CR4: 001406f0
[0.466996] Stack:
[0.469236]  8105e148 0046 0096 
0096
[0.477532]  88003561bc28   
88003561bc90
[0.485826]  80050033   

[0.494121] Call Trace:
[0.496851]  [] ? efi_call+0x58/0x90
[0.502586]  [] uv_bios_call+0x82/0x120
[0.508609]  [] uv_bios_call_irqsave+0x20/0x40
[0.515310]  [] uv_bios_get_sn_info+0x40/0xb0
[0.521921]  [] uv_system_init+0x8b6/0x143e
[0.528337]  [] ? vprintk_emit+0x225/0x470
[0.534645]  [] native_smp_prepare_cpus+0x299/0x2e4
[0.541836]  [] kernel_init_freeable+0xc3/0x220
[0.548638]  [] kernel_init+0xe/0x110
[0.554467]  [] ret_from_fork+0x1f/0x40
[0.560491]  [] ? rest_init+0x80/0x80
[0.566320] Code:  Bad RIP value.
[0.570035] RIP  [<6a1ab938>] 0x6a1ab938
[0.575197]  RSP 
[0.579087] CR2: 6a1ab938
[0.582786] ---[ end trace 99fd1a588f7287b9 ]---

This is due to the fact that the efi_map_region_fixed calls in
kexec_enter_virtual_mode, which map in the EFI runtime memory
descriptors, only map the virtual address of the descriptor.
Unfortunately, since we're still relying on the physical address of our
EFI runtime code being mapped in, we don't have access to that code in
the kexec scenario.

A potential fix for this would be to map in the physical addresses of
the descriptors as well as the virtual addresses in
efi_map_region_fixed, but the more "correct" fix would be to update
our system table pointer to its new virtual address during
SetVirtualAddressMap.  We intend to get that piece fixed up relatively
soon, but haven't quite gotten around to it yet.

Let me know what you guys think!

Alex Thorlton (1):
  Skip UV runtime services mapping in the efi_runtime_disabled case

 arch/x86/platform/uv/bios_uv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
1.8.5.6



[RFC PATCH] Fix EFI callbacks on UV during kexec

2016-07-26 Thread Alex Thorlton
Hey everyone,

This is a fix for our BIOS init code to skip mapping in runtime services
when runtime_disabled == true.   This is one that snuck under the radar
for a while, since we were using EFI_OLD_MEMMAP for so long.  I've
explained the details of how it went unnoticed in the commit message.

After investigating the problem here and figuring out the proper way to
get the noefi parameter working again, I noticed that there appears to
be support for EFI runtime callbacks in a kexec'd kernel now...  I
think we need some more cleanup here to get that all working entirely.
Without noefi, we hit a bad paging request when we try to do EFI 
callbacks:

[0.292926] UV: UVsystab: Revision:1
[0.296913] UV: No UVsystab socket table, ignoring
[0.302261] UV: N:4 M:36 m_shift:28 n_lshift:39
[0.307317] UV: gpa_mask/shift:0xff/0 pnode_mask:0xf apic_pns:5
[0.314697] UV: mmr_base/shift:0xff4000/26 gru_base/shift:0x0/0
[0.321692] UV: gnode_upper:0x0 gnode_extra:0x0
[0.326746] UV: NODE_PRESENT_DEPTH = 16
[0.331025] UV: NODE_PRESENT(0) = 0x0001
[0.336569] UV: Found 1 hubs, 1 nodes, 10 cpus
[0.341531] BUG: unable to handle kernel paging request at 6a1ab938
[0.349319] IP: [<6a1ab938>] 0x6a1ab938
[0.354386] PGD 354e0063 PUD 0
[0.357910] Oops: 0010 [#1] SMP
[0.361414] Modules linked in:
[0.364833] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-runtime-check+ 
#713
[0.372988] Hardware name: SGI UV3000/UV3000, BIOS SGI UV 3000 series BIOS 
01/15/2015
[0.381725] task: 880035614040 ti: 880035618000 task.ti: 
880035618000
[0.390075] RIP: 0010:[<6a1ab938>]  [<6a1ab938>] 0x6a1ab938
[0.397855] RSP: :88003561bbe8  EFLAGS: 00010086
[0.403780] RAX:  RBX: c9006000 RCX: 0001
[0.411741] RDX:  RSI: 0001 RDI: 6a1ab938
[0.419705] RBP: 88003561bc90 R08: 88003561bd10 R09: 88003561bd18
[0.427667] R10: 8800354d4000 R11: 00c9 R12: 
[0.435630] R13:  R14: 88003561bd18 R15: 0001
[0.443592] FS:  () GS:88003480() 
knlGS:
[0.452621] CS:  0010 DS:  ES:  CR0: 80050033
[0.459033] CR2: 6a1ab938 CR3: 354d3000 CR4: 001406f0
[0.466996] Stack:
[0.469236]  8105e148 0046 0096 
0096
[0.477532]  88003561bc28   
88003561bc90
[0.485826]  80050033   

[0.494121] Call Trace:
[0.496851]  [] ? efi_call+0x58/0x90
[0.502586]  [] uv_bios_call+0x82/0x120
[0.508609]  [] uv_bios_call_irqsave+0x20/0x40
[0.515310]  [] uv_bios_get_sn_info+0x40/0xb0
[0.521921]  [] uv_system_init+0x8b6/0x143e
[0.528337]  [] ? vprintk_emit+0x225/0x470
[0.534645]  [] native_smp_prepare_cpus+0x299/0x2e4
[0.541836]  [] kernel_init_freeable+0xc3/0x220
[0.548638]  [] kernel_init+0xe/0x110
[0.554467]  [] ret_from_fork+0x1f/0x40
[0.560491]  [] ? rest_init+0x80/0x80
[0.566320] Code:  Bad RIP value.
[0.570035] RIP  [<6a1ab938>] 0x6a1ab938
[0.575197]  RSP 
[0.579087] CR2: 6a1ab938
[0.582786] ---[ end trace 99fd1a588f7287b9 ]---

This is due to the fact that the efi_map_region_fixed calls in
kexec_enter_virtual_mode, which map in the EFI runtime memory
descriptors, only map the virtual address of the descriptor.
Unfortunately, since we're still relying on the physical address of our
EFI runtime code being mapped in, we don't have access to that code in
the kexec scenario.

A potential fix for this would be to map in the physical addresses of
the descriptors as well as the virtual addresses in
efi_map_region_fixed, but the more "correct" fix would be to update
our system table pointer to its new virtual address during
SetVirtualAddressMap.  We intend to get that piece fixed up relatively
soon, but haven't quite gotten around to it yet.

Let me know what you guys think!

Alex Thorlton (1):
  Skip UV runtime services mapping in the efi_runtime_disabled case

 arch/x86/platform/uv/bios_uv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
1.8.5.6



[tip:efi/core] efi: Convert efi_call_virt() to efi_call_virt_pointer()

2016-06-27 Thread tip-bot for Alex Thorlton
Commit-ID:  80e75596079f0a41f905836ad0ccaac68ba33612
Gitweb: http://git.kernel.org/tip/80e75596079f0a41f905836ad0ccaac68ba33612
Author: Alex Thorlton <athorl...@sgi.com>
AuthorDate: Sat, 25 Jun 2016 08:20:27 +0100
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Mon, 27 Jun 2016 13:06:56 +0200

efi: Convert efi_call_virt() to efi_call_virt_pointer()

This commit makes a few slight modifications to the efi_call_virt() macro
to get it to work with function pointers that are stored in locations
other than efi.systab->runtime, and renames the macro to
efi_call_virt_pointer().  The majority of the changes here are to pull
these macros up into header files so that they can be accessed from
outside of drivers/firmware/efi/runtime-wrappers.c.

The most significant change not directly related to the code move is to
add an extra "p" argument into the appropriate efi_call macros, and use
that new argument in place of the, formerly hard-coded,
efi.systab->runtime pointer.

The last piece of the puzzle was to add an efi_call_virt() macro back into
drivers/firmware/efi/runtime-wrappers.c to wrap around the new
efi_call_virt_pointer() macro - this was mainly to keep the code from
looking too cluttered by adding a bunch of extra references to
efi.systab->runtime everywhere.

Note that I also broke up the code in the efi_call_virt_pointer() macro a
bit in the process of moving it.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Roy Franz <roy.fr...@linaro.org>
Cc: Russ Anderson <r...@sgi.com>
Cc: Russell King <li...@armlinux.org.uk>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Will Deacon <will.dea...@arm.com>
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1466839230-12781-5-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/arm/include/asm/efi.h  |  4 +--
 arch/arm64/include/asm/efi.h|  4 +--
 arch/x86/include/asm/efi.h  |  9 +++---
 drivers/firmware/efi/runtime-wrappers.c | 53 +++--
 include/linux/efi.h | 51 +++
 5 files changed, 69 insertions(+), 52 deletions(-)

diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h
index a708fa1..766bf9b 100644
--- a/arch/arm/include/asm/efi.h
+++ b/arch/arm/include/asm/efi.h
@@ -28,10 +28,10 @@ int efi_set_mapping_permissions(struct mm_struct *mm, 
efi_memory_desc_t *md);
 #define arch_efi_call_virt_setup() efi_virtmap_load()
 #define arch_efi_call_virt_teardown()  efi_virtmap_unload()
 
-#define arch_efi_call_virt(f, args...) \
+#define arch_efi_call_virt(p, f, args...)  \
 ({ \
efi_##f##_t *__f;   \
-   __f = efi.systab->runtime->f;   \
+   __f = p->f; \
__f(args);  \
 })
 
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 622db3c..bd88766 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -23,10 +23,10 @@ int efi_create_mapping(struct mm_struct *mm, 
efi_memory_desc_t *md);
efi_virtmap_load(); \
 })
 
-#define arch_efi_call_virt(f, args...) \
+#define arch_efi_call_virt(p, f, args...)  \
 ({ \
efi_##f##_t *__f;   \
-   __f = efi.systab->runtime->f;   \
+   __f = p->f; \
__f(args);  \
 })
 
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 78d1e74..55b4596 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -41,10 +41,9 @@ extern unsigned long asmlinkage efi_call_phys(void *, ...);
 /*
  * Wrap all the virtual calls in a way that forces the parameters on the stack.
  */
-#define arch_efi_call_virt(f, args...) \
+#define arch_efi_call_virt(p, f, args...)  \
 ({

[tip:efi/core] x86/efi: Update efi_thunk() to use the the arch_efi_call_virt*() macros

2016-06-27 Thread tip-bot for Alex Thorlton
Commit-ID:  21f866257c7027f8f49bfde83f559f9e58f9ee93
Gitweb: http://git.kernel.org/tip/21f866257c7027f8f49bfde83f559f9e58f9ee93
Author: Alex Thorlton <athorl...@sgi.com>
AuthorDate: Sat, 25 Jun 2016 08:20:29 +0100
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Mon, 27 Jun 2016 13:06:57 +0200

x86/efi: Update efi_thunk() to use the the arch_efi_call_virt*() macros

Currently, the efi_thunk macro has some semi-duplicated code in it that
can be replaced with the arch_efi_call_virt_setup/teardown macros. This
commit simply replaces the duplicated code with those macros.

Suggested-by: Matt Fleming <m...@codeblueprint.co.uk>
Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Roy Franz <roy.fr...@linaro.org>
Cc: Russ Anderson <r...@sgi.com>
Cc: Russell King <li...@armlinux.org.uk>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Will Deacon <will.dea...@arm.com>
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1466839230-12781-7-git-send-email-m...@codeblueprint.co.uk
[ Renamed variables to the standard __ prefix. ]
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/platform/efi/efi_64.c | 21 -
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index b226b3f..5cb4301 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -466,22 +466,17 @@ extern efi_status_t efi64_thunk(u32, ...);
 #define efi_thunk(f, ...)  \
 ({ \
efi_status_t __s;   \
-   unsigned long flags;\
-   u32 func;   \
+   unsigned long __flags;  \
+   u32 __func; \
\
-   efi_sync_low_kernel_mappings(); \
-   local_irq_save(flags);  \
+   local_irq_save(__flags);\
+   arch_efi_call_virt_setup(); \
\
-   efi_scratch.prev_cr3 = read_cr3();  \
-   write_cr3((unsigned long)efi_scratch.efi_pgt);  \
-   __flush_tlb_all();  \
+   __func = runtime_service32(f);  \
+   __s = efi64_thunk(__func, __VA_ARGS__); \
\
-   func = runtime_service32(f);\
-   __s = efi64_thunk(func, __VA_ARGS__);   \
-   \
-   write_cr3(efi_scratch.prev_cr3);\
-   __flush_tlb_all();  \
-   local_irq_restore(flags);   \
+   arch_efi_call_virt_teardown();  \
+   local_irq_restore(__flags); \
\
__s;\
 })


[tip:efi/core] x86/uv: Update uv_bios_call() to use efi_call_virt_pointer()

2016-06-27 Thread tip-bot for Alex Thorlton
Commit-ID:  d1be84a232e359ca9456c63e72cb0082d68311b6
Gitweb: http://git.kernel.org/tip/d1be84a232e359ca9456c63e72cb0082d68311b6
Author: Alex Thorlton <athorl...@sgi.com>
AuthorDate: Sat, 25 Jun 2016 08:20:28 +0100
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Mon, 27 Jun 2016 13:06:56 +0200

x86/uv: Update uv_bios_call() to use efi_call_virt_pointer()

Now that the efi_call_virt() macro has been generalized to be able to
use EFI system tables besides efi.systab, we are able to convert our
uv_bios_call() wrapper to use this standard EFI callback mechanism.

This simple change is part of a much larger effort to recover from some
issues with the way we were mapping in some of our MMRs, and the way
that we were doing our BIOS callbacks, which were uncovered by commit
67a9108ed431 ("x86/efi: Build our own page table structures").

The first issue that this uncovered was that we were relying on the EFI
memory mapping mechanism to map in our MMR space for us, which, while
reliable, was technically a bug, as it relied on "undefined" behavior in
the mapping code.

The reason we were able to piggyback on the EFI memory mapping code to
map in our MMRs was because, previously, EFI code used the
trampoline_pgd, which shares a few entries with the main kernel pgd.  It
just so happened, that the memory range containing our MMRs was inside
one of those shared regions, which kept our code working without issue
for quite a while.

Anyways, once we discovered this problem, we brought back our original
code to map in the MMRs with commit:

  08914f436bdd ("x86/platform/UV: Bring back the call to map_low_mmrs in 
uv_system_init")

This got our systems a little further along, but we were still running
into trouble with our EFI callbacks, which prevented us from booting
all the way up.

Our first step towards fixing the BIOS callbacks was to get our
uv_bios_call() wrapper updated to use efi_call_virt() instead of the plain
efi_call().  The previous patch took care of the effort needed to make
that possible.  Along the way, we hit a major issue with some confusion
about how to properly pull arguments higher than number 6 off the stack
in the efi_call() code, which resulted in the following commit from Linus:

  683ad8092cd2 ("x86/efi: Fix 7-parameter efi_call()s")

Now that all of those issues are out of the way, we're able to make this
simple change to use the new efi_call_virt_pointer() in uv_bios_call()
which gets our machines booting, running properly, and able to execute our
callbacks with 6+ arguments.

Note that, since we are now using the EFI page table when we make our
function call, we are no longer able to make the call using the __va()
of our function pointer, since the memory range containing that address
isn't mapped into the EFI page table.  For now, we will use the physical
address of the function directly, since that is mapped into the EFI page
table.  In the near future, we're going to get some code added in to
properly update our function pointer to its virtual address during
SetVirtualAddressMap.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Roy Franz <roy.fr...@linaro.org>
Cc: Russ Anderson <r...@sgi.com>
Cc: Russell King <li...@armlinux.org.uk>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Will Deacon <will.dea...@arm.com>
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1466839230-12781-6-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/platform/uv/bios_uv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 815fec6..66b2166 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -40,8 +40,7 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 */
return BIOS_STATUS_UNIMPLEMENTED;
 
-   ret = efi_call((void *)__va(tab->function), (u64)which,
-   a1, a2, a3, a4, a5);
+   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, a3, a4, 
a5);
return ret;
 }
 EXPORT_SYMBOL_GPL(uv_bios_call);


[tip:efi/core] x86/efi: Update efi_thunk() to use the the arch_efi_call_virt*() macros

2016-06-27 Thread tip-bot for Alex Thorlton
Commit-ID:  21f866257c7027f8f49bfde83f559f9e58f9ee93
Gitweb: http://git.kernel.org/tip/21f866257c7027f8f49bfde83f559f9e58f9ee93
Author: Alex Thorlton 
AuthorDate: Sat, 25 Jun 2016 08:20:29 +0100
Committer:  Ingo Molnar 
CommitDate: Mon, 27 Jun 2016 13:06:57 +0200

x86/efi: Update efi_thunk() to use the the arch_efi_call_virt*() macros

Currently, the efi_thunk macro has some semi-duplicated code in it that
can be replaced with the arch_efi_call_virt_setup/teardown macros. This
commit simply replaces the duplicated code with those macros.

Suggested-by: Matt Fleming 
Signed-off-by: Alex Thorlton 
Signed-off-by: Matt Fleming 
Cc: Ard Biesheuvel 
Cc: Catalin Marinas 
Cc: Dimitri Sivanich 
Cc: Linus Torvalds 
Cc: Mark Rutland 
Cc: Peter Zijlstra 
Cc: Roy Franz 
Cc: Russ Anderson 
Cc: Russell King 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1466839230-12781-7-git-send-email-m...@codeblueprint.co.uk
[ Renamed variables to the standard __ prefix. ]
Signed-off-by: Ingo Molnar 
---
 arch/x86/platform/efi/efi_64.c | 21 -
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index b226b3f..5cb4301 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -466,22 +466,17 @@ extern efi_status_t efi64_thunk(u32, ...);
 #define efi_thunk(f, ...)  \
 ({ \
efi_status_t __s;   \
-   unsigned long flags;\
-   u32 func;   \
+   unsigned long __flags;  \
+   u32 __func; \
\
-   efi_sync_low_kernel_mappings(); \
-   local_irq_save(flags);  \
+   local_irq_save(__flags);\
+   arch_efi_call_virt_setup(); \
\
-   efi_scratch.prev_cr3 = read_cr3();  \
-   write_cr3((unsigned long)efi_scratch.efi_pgt);  \
-   __flush_tlb_all();  \
+   __func = runtime_service32(f);  \
+   __s = efi64_thunk(__func, __VA_ARGS__); \
\
-   func = runtime_service32(f);\
-   __s = efi64_thunk(func, __VA_ARGS__);   \
-   \
-   write_cr3(efi_scratch.prev_cr3);\
-   __flush_tlb_all();  \
-   local_irq_restore(flags);   \
+   arch_efi_call_virt_teardown();  \
+   local_irq_restore(__flags); \
\
__s;\
 })


[tip:efi/core] x86/uv: Update uv_bios_call() to use efi_call_virt_pointer()

2016-06-27 Thread tip-bot for Alex Thorlton
Commit-ID:  d1be84a232e359ca9456c63e72cb0082d68311b6
Gitweb: http://git.kernel.org/tip/d1be84a232e359ca9456c63e72cb0082d68311b6
Author: Alex Thorlton 
AuthorDate: Sat, 25 Jun 2016 08:20:28 +0100
Committer:  Ingo Molnar 
CommitDate: Mon, 27 Jun 2016 13:06:56 +0200

x86/uv: Update uv_bios_call() to use efi_call_virt_pointer()

Now that the efi_call_virt() macro has been generalized to be able to
use EFI system tables besides efi.systab, we are able to convert our
uv_bios_call() wrapper to use this standard EFI callback mechanism.

This simple change is part of a much larger effort to recover from some
issues with the way we were mapping in some of our MMRs, and the way
that we were doing our BIOS callbacks, which were uncovered by commit
67a9108ed431 ("x86/efi: Build our own page table structures").

The first issue that this uncovered was that we were relying on the EFI
memory mapping mechanism to map in our MMR space for us, which, while
reliable, was technically a bug, as it relied on "undefined" behavior in
the mapping code.

The reason we were able to piggyback on the EFI memory mapping code to
map in our MMRs was because, previously, EFI code used the
trampoline_pgd, which shares a few entries with the main kernel pgd.  It
just so happened, that the memory range containing our MMRs was inside
one of those shared regions, which kept our code working without issue
for quite a while.

Anyways, once we discovered this problem, we brought back our original
code to map in the MMRs with commit:

  08914f436bdd ("x86/platform/UV: Bring back the call to map_low_mmrs in 
uv_system_init")

This got our systems a little further along, but we were still running
into trouble with our EFI callbacks, which prevented us from booting
all the way up.

Our first step towards fixing the BIOS callbacks was to get our
uv_bios_call() wrapper updated to use efi_call_virt() instead of the plain
efi_call().  The previous patch took care of the effort needed to make
that possible.  Along the way, we hit a major issue with some confusion
about how to properly pull arguments higher than number 6 off the stack
in the efi_call() code, which resulted in the following commit from Linus:

  683ad8092cd2 ("x86/efi: Fix 7-parameter efi_call()s")

Now that all of those issues are out of the way, we're able to make this
simple change to use the new efi_call_virt_pointer() in uv_bios_call()
which gets our machines booting, running properly, and able to execute our
callbacks with 6+ arguments.

Note that, since we are now using the EFI page table when we make our
function call, we are no longer able to make the call using the __va()
of our function pointer, since the memory range containing that address
isn't mapped into the EFI page table.  For now, we will use the physical
address of the function directly, since that is mapped into the EFI page
table.  In the near future, we're going to get some code added in to
properly update our function pointer to its virtual address during
SetVirtualAddressMap.

Signed-off-by: Alex Thorlton 
Signed-off-by: Matt Fleming 
Cc: Ard Biesheuvel 
Cc: Catalin Marinas 
Cc: Dimitri Sivanich 
Cc: Linus Torvalds 
Cc: Mark Rutland 
Cc: Peter Zijlstra 
Cc: Roy Franz 
Cc: Russ Anderson 
Cc: Russell King 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1466839230-12781-6-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 arch/x86/platform/uv/bios_uv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 815fec6..66b2166 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -40,8 +40,7 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 */
return BIOS_STATUS_UNIMPLEMENTED;
 
-   ret = efi_call((void *)__va(tab->function), (u64)which,
-   a1, a2, a3, a4, a5);
+   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, a3, a4, 
a5);
return ret;
 }
 EXPORT_SYMBOL_GPL(uv_bios_call);


[tip:efi/core] efi: Convert efi_call_virt() to efi_call_virt_pointer()

2016-06-27 Thread tip-bot for Alex Thorlton
Commit-ID:  80e75596079f0a41f905836ad0ccaac68ba33612
Gitweb: http://git.kernel.org/tip/80e75596079f0a41f905836ad0ccaac68ba33612
Author: Alex Thorlton 
AuthorDate: Sat, 25 Jun 2016 08:20:27 +0100
Committer:  Ingo Molnar 
CommitDate: Mon, 27 Jun 2016 13:06:56 +0200

efi: Convert efi_call_virt() to efi_call_virt_pointer()

This commit makes a few slight modifications to the efi_call_virt() macro
to get it to work with function pointers that are stored in locations
other than efi.systab->runtime, and renames the macro to
efi_call_virt_pointer().  The majority of the changes here are to pull
these macros up into header files so that they can be accessed from
outside of drivers/firmware/efi/runtime-wrappers.c.

The most significant change not directly related to the code move is to
add an extra "p" argument into the appropriate efi_call macros, and use
that new argument in place of the, formerly hard-coded,
efi.systab->runtime pointer.

The last piece of the puzzle was to add an efi_call_virt() macro back into
drivers/firmware/efi/runtime-wrappers.c to wrap around the new
efi_call_virt_pointer() macro - this was mainly to keep the code from
looking too cluttered by adding a bunch of extra references to
efi.systab->runtime everywhere.

Note that I also broke up the code in the efi_call_virt_pointer() macro a
bit in the process of moving it.

Signed-off-by: Alex Thorlton 
Signed-off-by: Matt Fleming 
Cc: Ard Biesheuvel 
Cc: Catalin Marinas 
Cc: Dimitri Sivanich 
Cc: Linus Torvalds 
Cc: Mark Rutland 
Cc: Peter Zijlstra 
Cc: Roy Franz 
Cc: Russ Anderson 
Cc: Russell King 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1466839230-12781-5-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 arch/arm/include/asm/efi.h  |  4 +--
 arch/arm64/include/asm/efi.h|  4 +--
 arch/x86/include/asm/efi.h  |  9 +++---
 drivers/firmware/efi/runtime-wrappers.c | 53 +++--
 include/linux/efi.h | 51 +++
 5 files changed, 69 insertions(+), 52 deletions(-)

diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h
index a708fa1..766bf9b 100644
--- a/arch/arm/include/asm/efi.h
+++ b/arch/arm/include/asm/efi.h
@@ -28,10 +28,10 @@ int efi_set_mapping_permissions(struct mm_struct *mm, 
efi_memory_desc_t *md);
 #define arch_efi_call_virt_setup() efi_virtmap_load()
 #define arch_efi_call_virt_teardown()  efi_virtmap_unload()
 
-#define arch_efi_call_virt(f, args...) \
+#define arch_efi_call_virt(p, f, args...)  \
 ({ \
efi_##f##_t *__f;   \
-   __f = efi.systab->runtime->f;   \
+   __f = p->f; \
__f(args);  \
 })
 
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 622db3c..bd88766 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -23,10 +23,10 @@ int efi_create_mapping(struct mm_struct *mm, 
efi_memory_desc_t *md);
efi_virtmap_load(); \
 })
 
-#define arch_efi_call_virt(f, args...) \
+#define arch_efi_call_virt(p, f, args...)  \
 ({ \
efi_##f##_t *__f;   \
-   __f = efi.systab->runtime->f;   \
+   __f = p->f; \
__f(args);  \
 })
 
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 78d1e74..55b4596 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -41,10 +41,9 @@ extern unsigned long asmlinkage efi_call_phys(void *, ...);
 /*
  * Wrap all the virtual calls in a way that forces the parameters on the stack.
  */
-#define arch_efi_call_virt(f, args...) \
+#define arch_efi_call_virt(p, f, args...)  \
 ({ \
-   ((efi_##f##_t __attribute__((regparm(0)))*) \
-   efi.systab->runtime->f)(args);  \
+   ((efi_##f##_t __attribute__((regparm(0)))*) p->f)(args);\
 })
 
 #define efi_ioremap(addr, size, type, attr)ioremap_cache(addr, size)
@@ -81,8 +80,8 @@ struct efi_scratch {
}

[PATCH 2/3] Update uv_bios_call to use efi_call_virt_pointer

2016-06-15 Thread Alex Thorlton
Now that the efi_call_virt macro has been generalized to be able to
use EFI system tables besides efi.systab, we are able to convert our
uv_bios_call wrapper to use this standard EFI callback mechanism.

This simple change is part of a much larger effort to recover from some
issues with the way we were mapping in some of our MMRs, and the way
that we were doing our BIOS callbacks, which were uncovered by commit
67a9108ed431 ("x86/efi: Build our own page table structures").

The first issue that this uncovered was that we were relying on the EFI
memory mapping mechanism to map in our MMR space for us, which, while
reliable, was technically a bug, as it relied on "undefined" behavior in
the mapping code.

The reason we were able to piggyback on the EFI memory mapping code to
map in our MMRs was because, previously, EFI code used the
trampoline_pgd, which shares a few entries with the main kernel pgd.  It
just so happened, that the memory range containing our MMRs was inside
one of those shared regions, which kept our code working without issue
for quite a while.

Anyways, once we discovered this problem, we brought back our original
code to map in the MMRs with commit 08914f436bdd ("x86/platform/UV:
Bring back the call to map_low_mmrs in uv_system_init").  This got our
systems a little further along, but we were still running into trouble
with our EFI callbacks, which prevented us from booting all the way up.

Our first step towards fixing the BIOS callbacks was to get our
uv_bios_call wrapper updated to use efi_call_virt instead of the plain
efi_call.  The previous patch took care of the effort needed to make
that possible.  Along the way, we hit a major issue with some confusion
about how to properly pull arguments higher than number 6 off the stack
in the efi_call code, which resulted in Linus's commit 683ad8092cd2
("x86/efi: Fix 7-parameter efi_call()s").

Now that all of those issues are out of the way, we're able to make this
simple change to use the new efi_call_virt_pointer in uv_bios_call which
gets our machines booting, running properly, and able to execute our
callbacks with 6+ arguments.

Note that, since we are now using the EFI page table when we make our
function call, we are no longer able to make the call using the __va()
of our function pointer, since the memory range containing that address
isn't mapped into the EFI page table.  For now, we will use the physical
address of the function directly, since that is mapped into the EFI page
table.  In the near future, we're going to get some code added in to
properly update our function pointer to its virtual address during
SetVirtualAddressMap.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Cc: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Russ Anderson <r...@sgi.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: Russell King <li...@armlinux.org.uk>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Roy Franz <roy.fr...@linaro.org>
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Cc: x...@kernel.org
---
 arch/x86/platform/uv/bios_uv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 815fec6..66b2166 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -40,8 +40,7 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 */
return BIOS_STATUS_UNIMPLEMENTED;
 
-   ret = efi_call((void *)__va(tab->function), (u64)which,
-   a1, a2, a3, a4, a5);
+   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, a3, a4, 
a5);
return ret;
 }
 EXPORT_SYMBOL_GPL(uv_bios_call);
-- 
1.8.5.6



[PATCH 1/3] Convert efi_call_virt to efi_call_virt_pointer

2016-06-15 Thread Alex Thorlton
This commit makes a few slight modifications to the efi_call_virt macro
to get it to work with function pointers that are stored in locations
other than efi.systab->runtime, and renames the macro to
efi_call_virt_pointer.  The majority of the changes here are to pull
these macros up into header files so that they can be accessed from
outside of drivers/firmware/efi/runtime-wrappers.c.

The most significant change not directly related to the code move is to
add an extra "p" argument into the appropriate efi_call macros, and use
that new argument in place of the, formerly hard-coded,
efi.systab->runtime pointer.

The last piece of the puzzle was to add an efi_call_virt macro back into
drivers/firmware/efi/runtime-wrappers.c to wrap around the new
efi_call_virt_pointer macro - this was mainly to keep the code from
looking too cluttered by adding a bunch of extra references to
efi.systab->runtime everywhere.

Note that I also broke up the code in the efi_call_virt_pointer macro a
bit in the process of moving it.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Cc: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Russ Anderson <r...@sgi.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: Russell King <li...@armlinux.org.uk>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Roy Franz <roy.fr...@linaro.org>
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Cc: x...@kernel.org
---
 arch/arm/include/asm/efi.h  |  4 +--
 arch/arm64/include/asm/efi.h|  4 +--
 arch/x86/include/asm/efi.h  |  9 +++---
 drivers/firmware/efi/runtime-wrappers.c | 53 +++--
 include/linux/efi.h | 51 +++
 5 files changed, 69 insertions(+), 52 deletions(-)

diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h
index a708fa1..766bf9b 100644
--- a/arch/arm/include/asm/efi.h
+++ b/arch/arm/include/asm/efi.h
@@ -28,10 +28,10 @@ int efi_set_mapping_permissions(struct mm_struct *mm, 
efi_memory_desc_t *md);
 #define arch_efi_call_virt_setup() efi_virtmap_load()
 #define arch_efi_call_virt_teardown()  efi_virtmap_unload()
 
-#define arch_efi_call_virt(f, args...) \
+#define arch_efi_call_virt(p, f, args...)  \
 ({ \
efi_##f##_t *__f;   \
-   __f = efi.systab->runtime->f;   \
+   __f = p->f; \
__f(args);  \
 })
 
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 622db3c..bd88766 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -23,10 +23,10 @@ int efi_create_mapping(struct mm_struct *mm, 
efi_memory_desc_t *md);
efi_virtmap_load(); \
 })
 
-#define arch_efi_call_virt(f, args...) \
+#define arch_efi_call_virt(p, f, args...)  \
 ({ \
efi_##f##_t *__f;   \
-   __f = efi.systab->runtime->f;   \
+   __f = p->f; \
__f(args);  \
 })
 
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 78d1e74..55b4596 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -41,10 +41,9 @@ extern unsigned long asmlinkage efi_call_phys(void *, ...);
 /*
  * Wrap all the virtual calls in a way that forces the parameters on the stack.
  */
-#define arch_efi_call_virt(f, args...) \
+#define arch_efi_call_virt(p, f, args...)  \
 ({ \
-   ((efi_##f##_t __attribute__((regparm(0)))*) \
-   efi.systab->runtime->f)(args);  \
+   ((efi_##f##_t __attribute__((regparm(0)))*) p->f)(args);\
 })
 
 #define efi_ioremap(addr, size, type, attr)ioremap_cache(addr, size)
@@ -81,8 +80,8 @@ struct efi_scratch {
}   \
 })
 
-#define arch_efi_call_virt(f, args...) 

[PATCH 1/3] Convert efi_call_virt to efi_call_virt_pointer

2016-06-15 Thread Alex Thorlton
This commit makes a few slight modifications to the efi_call_virt macro
to get it to work with function pointers that are stored in locations
other than efi.systab->runtime, and renames the macro to
efi_call_virt_pointer.  The majority of the changes here are to pull
these macros up into header files so that they can be accessed from
outside of drivers/firmware/efi/runtime-wrappers.c.

The most significant change not directly related to the code move is to
add an extra "p" argument into the appropriate efi_call macros, and use
that new argument in place of the, formerly hard-coded,
efi.systab->runtime pointer.

The last piece of the puzzle was to add an efi_call_virt macro back into
drivers/firmware/efi/runtime-wrappers.c to wrap around the new
efi_call_virt_pointer macro - this was mainly to keep the code from
looking too cluttered by adding a bunch of extra references to
efi.systab->runtime everywhere.

Note that I also broke up the code in the efi_call_virt_pointer macro a
bit in the process of moving it.

Signed-off-by: Alex Thorlton 
Cc: Matt Fleming 
Cc: Russ Anderson 
Cc: Dimitri Sivanich 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Ard Biesheuvel 
Cc: Mark Rutland 
Cc: Roy Franz 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Cc: x...@kernel.org
---
 arch/arm/include/asm/efi.h  |  4 +--
 arch/arm64/include/asm/efi.h|  4 +--
 arch/x86/include/asm/efi.h  |  9 +++---
 drivers/firmware/efi/runtime-wrappers.c | 53 +++--
 include/linux/efi.h | 51 +++
 5 files changed, 69 insertions(+), 52 deletions(-)

diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h
index a708fa1..766bf9b 100644
--- a/arch/arm/include/asm/efi.h
+++ b/arch/arm/include/asm/efi.h
@@ -28,10 +28,10 @@ int efi_set_mapping_permissions(struct mm_struct *mm, 
efi_memory_desc_t *md);
 #define arch_efi_call_virt_setup() efi_virtmap_load()
 #define arch_efi_call_virt_teardown()  efi_virtmap_unload()
 
-#define arch_efi_call_virt(f, args...) \
+#define arch_efi_call_virt(p, f, args...)  \
 ({ \
efi_##f##_t *__f;   \
-   __f = efi.systab->runtime->f;   \
+   __f = p->f; \
__f(args);  \
 })
 
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 622db3c..bd88766 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -23,10 +23,10 @@ int efi_create_mapping(struct mm_struct *mm, 
efi_memory_desc_t *md);
efi_virtmap_load(); \
 })
 
-#define arch_efi_call_virt(f, args...) \
+#define arch_efi_call_virt(p, f, args...)  \
 ({ \
efi_##f##_t *__f;   \
-   __f = efi.systab->runtime->f;   \
+   __f = p->f; \
__f(args);  \
 })
 
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 78d1e74..55b4596 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -41,10 +41,9 @@ extern unsigned long asmlinkage efi_call_phys(void *, ...);
 /*
  * Wrap all the virtual calls in a way that forces the parameters on the stack.
  */
-#define arch_efi_call_virt(f, args...) \
+#define arch_efi_call_virt(p, f, args...)  \
 ({ \
-   ((efi_##f##_t __attribute__((regparm(0)))*) \
-   efi.systab->runtime->f)(args);  \
+   ((efi_##f##_t __attribute__((regparm(0)))*) p->f)(args);\
 })
 
 #define efi_ioremap(addr, size, type, attr)ioremap_cache(addr, size)
@@ -81,8 +80,8 @@ struct efi_scratch {
}   \
 })
 
-#define arch_efi_call_virt(f, args...) \
-   efi_call((void *)efi.systab->runtime->f, args)  \
+#define arch_efi_call_virt(p, f, args...)  \
+   efi_call((void *)p->f, args)\
 
 #define arch_efi_call_virt_teardown()  \
 ({ 

[PATCH 2/3] Update uv_bios_call to use efi_call_virt_pointer

2016-06-15 Thread Alex Thorlton
Now that the efi_call_virt macro has been generalized to be able to
use EFI system tables besides efi.systab, we are able to convert our
uv_bios_call wrapper to use this standard EFI callback mechanism.

This simple change is part of a much larger effort to recover from some
issues with the way we were mapping in some of our MMRs, and the way
that we were doing our BIOS callbacks, which were uncovered by commit
67a9108ed431 ("x86/efi: Build our own page table structures").

The first issue that this uncovered was that we were relying on the EFI
memory mapping mechanism to map in our MMR space for us, which, while
reliable, was technically a bug, as it relied on "undefined" behavior in
the mapping code.

The reason we were able to piggyback on the EFI memory mapping code to
map in our MMRs was because, previously, EFI code used the
trampoline_pgd, which shares a few entries with the main kernel pgd.  It
just so happened, that the memory range containing our MMRs was inside
one of those shared regions, which kept our code working without issue
for quite a while.

Anyways, once we discovered this problem, we brought back our original
code to map in the MMRs with commit 08914f436bdd ("x86/platform/UV:
Bring back the call to map_low_mmrs in uv_system_init").  This got our
systems a little further along, but we were still running into trouble
with our EFI callbacks, which prevented us from booting all the way up.

Our first step towards fixing the BIOS callbacks was to get our
uv_bios_call wrapper updated to use efi_call_virt instead of the plain
efi_call.  The previous patch took care of the effort needed to make
that possible.  Along the way, we hit a major issue with some confusion
about how to properly pull arguments higher than number 6 off the stack
in the efi_call code, which resulted in Linus's commit 683ad8092cd2
("x86/efi: Fix 7-parameter efi_call()s").

Now that all of those issues are out of the way, we're able to make this
simple change to use the new efi_call_virt_pointer in uv_bios_call which
gets our machines booting, running properly, and able to execute our
callbacks with 6+ arguments.

Note that, since we are now using the EFI page table when we make our
function call, we are no longer able to make the call using the __va()
of our function pointer, since the memory range containing that address
isn't mapped into the EFI page table.  For now, we will use the physical
address of the function directly, since that is mapped into the EFI page
table.  In the near future, we're going to get some code added in to
properly update our function pointer to its virtual address during
SetVirtualAddressMap.

Signed-off-by: Alex Thorlton 
Cc: Matt Fleming 
Cc: Russ Anderson 
Cc: Dimitri Sivanich 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Ard Biesheuvel 
Cc: Mark Rutland 
Cc: Roy Franz 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Cc: x...@kernel.org
---
 arch/x86/platform/uv/bios_uv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 815fec6..66b2166 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -40,8 +40,7 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 */
return BIOS_STATUS_UNIMPLEMENTED;
 
-   ret = efi_call((void *)__va(tab->function), (u64)which,
-   a1, a2, a3, a4, a5);
+   ret = efi_call_virt_pointer(tab, function, (u64)which, a1, a2, a3, a4, 
a5);
return ret;
 }
 EXPORT_SYMBOL_GPL(uv_bios_call);
-- 
1.8.5.6



[PATCHv2 0/3] x86/UV, x86/efi: Re-factor efi_call_virt for general use

2016-06-15 Thread Alex Thorlton
Hey guys,

This patchset creates a general purpose version of the efi_call_virt
macro that does not assume that the function pointer being passed in is
inside of efi.systab->runtime.  It also fixes up uv_bios_call to use the
new functionality, and does a bit of cleanup in the efi_thunk macro.

Quick breakdown of the patches:

Patch 1) Move necessary macros to locations where we can access them.
 Remove hard-coded efi.systab reference from efi_call_virt.
 Rename/create new macros as needed.
Patch 2) Simple change to allow UV code to utilize the new
 functionality.  Included a detailed explanation of how we got
 here.
Patch 3) Replace a few bits of the efi_thunk macro with the
 arch_efi_call_setup/teardown macros.

The first two have been tested on simulators and hardware, but the third
has only been compile-tested.  I don't have any hardware to test that
on.

Updates for v2:

- Fix up arm and arm64 versions of arch_efi_call_virt.  I missed these
  on my first pass 
- Add some more detail to the commit message for the uv_bios_call fix.
- Change the third patch to use the arch_efi_call_setup/teardown macros
  inside of the efi_thunk macro, instead of replacing efi_thunk entirely
  for the CONFIG_EFI_MIXED case.

Cc: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Russ Anderson <r...@sgi.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: Russell King <li...@armlinux.org.uk>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Roy Franz <roy.fr...@linaro.org>
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Cc: x...@kernel.org

Alex Thorlton (3):
  Convert efi_call_virt to efi_call_virt_pointer
  Update uv_bios_call to use efi_call_virt_pointer
  Update efi_thunk to use the the arch_efi_call_virt* macros

 arch/arm/include/asm/efi.h  |  4 +--
 arch/arm64/include/asm/efi.h|  4 +--
 arch/x86/include/asm/efi.h  |  9 +++---
 arch/x86/platform/efi/efi_64.c  | 11 ++-
 arch/x86/platform/uv/bios_uv.c  |  3 +-
 drivers/firmware/efi/runtime-wrappers.c | 53 +++--
 include/linux/efi.h | 51 +++
 7 files changed, 73 insertions(+), 62 deletions(-)

-- 
1.8.5.6



[PATCHv2 0/3] x86/UV, x86/efi: Re-factor efi_call_virt for general use

2016-06-15 Thread Alex Thorlton
Hey guys,

This patchset creates a general purpose version of the efi_call_virt
macro that does not assume that the function pointer being passed in is
inside of efi.systab->runtime.  It also fixes up uv_bios_call to use the
new functionality, and does a bit of cleanup in the efi_thunk macro.

Quick breakdown of the patches:

Patch 1) Move necessary macros to locations where we can access them.
 Remove hard-coded efi.systab reference from efi_call_virt.
 Rename/create new macros as needed.
Patch 2) Simple change to allow UV code to utilize the new
 functionality.  Included a detailed explanation of how we got
 here.
Patch 3) Replace a few bits of the efi_thunk macro with the
 arch_efi_call_setup/teardown macros.

The first two have been tested on simulators and hardware, but the third
has only been compile-tested.  I don't have any hardware to test that
on.

Updates for v2:

- Fix up arm and arm64 versions of arch_efi_call_virt.  I missed these
  on my first pass 
- Add some more detail to the commit message for the uv_bios_call fix.
- Change the third patch to use the arch_efi_call_setup/teardown macros
  inside of the efi_thunk macro, instead of replacing efi_thunk entirely
  for the CONFIG_EFI_MIXED case.

Cc: Matt Fleming 
Cc: Russ Anderson 
Cc: Dimitri Sivanich 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Ard Biesheuvel 
Cc: Mark Rutland 
Cc: Roy Franz 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Cc: x...@kernel.org

Alex Thorlton (3):
  Convert efi_call_virt to efi_call_virt_pointer
  Update uv_bios_call to use efi_call_virt_pointer
  Update efi_thunk to use the the arch_efi_call_virt* macros

 arch/arm/include/asm/efi.h  |  4 +--
 arch/arm64/include/asm/efi.h|  4 +--
 arch/x86/include/asm/efi.h  |  9 +++---
 arch/x86/platform/efi/efi_64.c  | 11 ++-
 arch/x86/platform/uv/bios_uv.c  |  3 +-
 drivers/firmware/efi/runtime-wrappers.c | 53 +++--
 include/linux/efi.h | 51 +++
 7 files changed, 73 insertions(+), 62 deletions(-)

-- 
1.8.5.6



[PATCH 3/3] Update efi_thunk to use the the arch_efi_call_virt* macros

2016-06-15 Thread Alex Thorlton
Currently, the efi_thunk macro has some semi-duplicated code in it that
can be replaced with the arch_efi_call_virt_setup/teardown macros. This
commit simply replaces the duplicated code with those macros.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Suggested-by: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Russ Anderson <r...@sgi.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: Russell King <li...@armlinux.org.uk>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Mark Rutland <mark.rutl...@arm.com>
Cc: Roy Franz <roy.fr...@linaro.org>
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Cc: x...@kernel.org
---
 arch/x86/platform/efi/efi_64.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 6e7242b..4cc2b96 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -469,18 +469,13 @@ extern efi_status_t efi64_thunk(u32, ...);
unsigned long flags;\
u32 func;   \
\
-   efi_sync_low_kernel_mappings(); \
local_irq_save(flags);  \
-   \
-   efi_scratch.prev_cr3 = read_cr3();  \
-   write_cr3((unsigned long)efi_scratch.efi_pgt);  \
-   __flush_tlb_all();  \
+   arch_efi_call_virt_setup(); \
\
func = runtime_service32(f);\
-   __s = efi64_thunk(func, __VA_ARGS__);   \
+   __s = efi64_thunk(func, __VA_ARGS__);   \
\
-   write_cr3(efi_scratch.prev_cr3);\
-   __flush_tlb_all();  \
+   arch_efi_call_virt_teardown();  \
local_irq_restore(flags);   \
\
__s;\
-- 
1.8.5.6



[PATCH 3/3] Update efi_thunk to use the the arch_efi_call_virt* macros

2016-06-15 Thread Alex Thorlton
Currently, the efi_thunk macro has some semi-duplicated code in it that
can be replaced with the arch_efi_call_virt_setup/teardown macros. This
commit simply replaces the duplicated code with those macros.

Signed-off-by: Alex Thorlton 
Suggested-by: Matt Fleming 
Cc: Russ Anderson 
Cc: Dimitri Sivanich 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Ard Biesheuvel 
Cc: Mark Rutland 
Cc: Roy Franz 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Cc: x...@kernel.org
---
 arch/x86/platform/efi/efi_64.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 6e7242b..4cc2b96 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -469,18 +469,13 @@ extern efi_status_t efi64_thunk(u32, ...);
unsigned long flags;\
u32 func;   \
\
-   efi_sync_low_kernel_mappings(); \
local_irq_save(flags);  \
-   \
-   efi_scratch.prev_cr3 = read_cr3();  \
-   write_cr3((unsigned long)efi_scratch.efi_pgt);  \
-   __flush_tlb_all();  \
+   arch_efi_call_virt_setup(); \
\
func = runtime_service32(f);\
-   __s = efi64_thunk(func, __VA_ARGS__);   \
+   __s = efi64_thunk(func, __VA_ARGS__);   \
\
-   write_cr3(efi_scratch.prev_cr3);\
-   __flush_tlb_all();  \
+   arch_efi_call_virt_teardown();  \
local_irq_restore(flags);   \
\
__s;\
-- 
1.8.5.6



Re: [PATCH 2/3] Update uv_bios_call to use efi_call_virt_generic

2016-06-02 Thread Alex Thorlton
On Thu, Jun 02, 2016 at 04:14:03PM -0500, Alex Thorlton wrote:
> On Thu, Jun 02, 2016 at 08:45:47PM +0100, Matt Fleming wrote:
> > Unless I've missed it, I didn't see an explanation in the changelog of
> > why it's OK to switch from using __va(tab->function) to tab->function
> > directly, which presumably is a physical address.
> > 
> > Was that intended?
> 
> It was intended.  The motivation is so that we can use the same
> "dereference the pointer inside the macro" stuff that we do with the
> efi.systab->runtime pointer.  IINM, the reason it works is because we do
> 
> /*
>  * Make sure the 1:1 mappings are present as a catch-all for b0rked
>  * firmware which doesn't update all internal pointers after switching
>  * to virtual mode and would otherwise crap on us.
>  */
> __map_region(md, md->phys_addr);
> 
> Inside of efi_map_region, so we know we'll have that physical address
> mapped into the EFI page table.
> 
> Upon review, I'm wondering if the correct thing to do  here is to
> update that pointer during the switch to virtual mode, to avoid the
> b0rkage mentioned in the above comment.

Realizing after I typed this that the b0rkage in question is firmware
b0rkage, so the comment there doesn't really apply to this situation.

> Either way, the straigh tab->function dereference should work while
> using the EFI page table, but I do wonder if that tab->function address
> should have been updated to the __va() version of itself before we reach
> this point.

I played with this some and it's not working the way I expected.  In
thinking about it, I believe the reason is fairly obvious.  We have
already switched to the EFI page table when we try to dereference the
pointer, so if we have some 88 address, it won't be mapped
(this is somewhat related to the situation that got us here in the first
place).

I think, in order for this to work, tab->function has to be either the
physical address, or the address in the ffef  range that also
gets mapped in during efi_map_region.

My brain is getting close to fried for the day, so this might be utter
nonsense, but it's sounding pretty reasonable to me right now.  We'll
see how I feel about it tomorrow :)

Let me know what you think!

- Alex


Re: [PATCH 2/3] Update uv_bios_call to use efi_call_virt_generic

2016-06-02 Thread Alex Thorlton
On Thu, Jun 02, 2016 at 04:14:03PM -0500, Alex Thorlton wrote:
> On Thu, Jun 02, 2016 at 08:45:47PM +0100, Matt Fleming wrote:
> > Unless I've missed it, I didn't see an explanation in the changelog of
> > why it's OK to switch from using __va(tab->function) to tab->function
> > directly, which presumably is a physical address.
> > 
> > Was that intended?
> 
> It was intended.  The motivation is so that we can use the same
> "dereference the pointer inside the macro" stuff that we do with the
> efi.systab->runtime pointer.  IINM, the reason it works is because we do
> 
> /*
>  * Make sure the 1:1 mappings are present as a catch-all for b0rked
>  * firmware which doesn't update all internal pointers after switching
>  * to virtual mode and would otherwise crap on us.
>  */
> __map_region(md, md->phys_addr);
> 
> Inside of efi_map_region, so we know we'll have that physical address
> mapped into the EFI page table.
> 
> Upon review, I'm wondering if the correct thing to do  here is to
> update that pointer during the switch to virtual mode, to avoid the
> b0rkage mentioned in the above comment.

Realizing after I typed this that the b0rkage in question is firmware
b0rkage, so the comment there doesn't really apply to this situation.

> Either way, the straigh tab->function dereference should work while
> using the EFI page table, but I do wonder if that tab->function address
> should have been updated to the __va() version of itself before we reach
> this point.

I played with this some and it's not working the way I expected.  In
thinking about it, I believe the reason is fairly obvious.  We have
already switched to the EFI page table when we try to dereference the
pointer, so if we have some 88 address, it won't be mapped
(this is somewhat related to the situation that got us here in the first
place).

I think, in order for this to work, tab->function has to be either the
physical address, or the address in the ffef  range that also
gets mapped in during efi_map_region.

My brain is getting close to fried for the day, so this might be utter
nonsense, but it's sounding pretty reasonable to me right now.  We'll
see how I feel about it tomorrow :)

Let me know what you think!

- Alex


Re: [PATCH 3/3] Update efi_thunk to use efi_call_virt_generic

2016-06-02 Thread Alex Thorlton
On Thu, Jun 02, 2016 at 09:19:49PM +0100, Matt Fleming wrote:
> On Wed, 18 May, at 02:11:41PM, Alex Thorlton wrote:
> > +#define arch_efi_call_virt(p, f, ...)  
> > \
> > +({ \
> > +   u32 func = runtime_service32(f);\
> > +   efi64_thunk(func, __VA_ARGS__); \
> > +})
> > +
> 
> This isn't correct because you're turning the runtime decision of
> whether we're executing the thunking code into a build time one.

Ahh, yep, you're absolutely correct.  That's not what I intended to do,
but that's definitely the effect that this change has.

> Would something like this work instead? It's not as neat as your
> suggestion but it's a damn sight better than what we have today.
> 
> ---
> 
> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
> index 6e7242be1c87..b976084e56ef 100644
> --- a/arch/x86/platform/efi/efi_64.c
> +++ b/arch/x86/platform/efi/efi_64.c
> @@ -469,18 +469,13 @@ extern efi_status_t efi64_thunk(u32, ...);
>   unsigned long flags;\
>   u32 func;   \
>   \
> - efi_sync_low_kernel_mappings(); \
>   local_irq_save(flags);  \
> - \
> - efi_scratch.prev_cr3 = read_cr3();  \
> - write_cr3((unsigned long)efi_scratch.efi_pgt);  \
> - __flush_tlb_all();  \
> + arch_efi_call_virt_setup(); \
>   \
>   func = runtime_service32(f);\
>   __s = efi64_thunk(func, __VA_ARGS__);   \
>   \
> - write_cr3(efi_scratch.prev_cr3);\
> - __flush_tlb_all();  \
> + arch_efi_call_virt_teardown();  \
>   local_irq_restore(flags);   \
>   \
>   __s;\

This looks good to me.  We're at least making use of the
arch_efi_call_virt_* stuff where possible, and only using the special
thunk code where necessary.  I think it's a good middle ground between
the two approaches (especially considering the fact that mine won't
work :) 

I will re-work that last patch to include this change instead of my
original, broken one.

Thanks, Matt!

- Alex


Re: [PATCH 3/3] Update efi_thunk to use efi_call_virt_generic

2016-06-02 Thread Alex Thorlton
On Thu, Jun 02, 2016 at 09:19:49PM +0100, Matt Fleming wrote:
> On Wed, 18 May, at 02:11:41PM, Alex Thorlton wrote:
> > +#define arch_efi_call_virt(p, f, ...)  
> > \
> > +({ \
> > +   u32 func = runtime_service32(f);\
> > +   efi64_thunk(func, __VA_ARGS__); \
> > +})
> > +
> 
> This isn't correct because you're turning the runtime decision of
> whether we're executing the thunking code into a build time one.

Ahh, yep, you're absolutely correct.  That's not what I intended to do,
but that's definitely the effect that this change has.

> Would something like this work instead? It's not as neat as your
> suggestion but it's a damn sight better than what we have today.
> 
> ---
> 
> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
> index 6e7242be1c87..b976084e56ef 100644
> --- a/arch/x86/platform/efi/efi_64.c
> +++ b/arch/x86/platform/efi/efi_64.c
> @@ -469,18 +469,13 @@ extern efi_status_t efi64_thunk(u32, ...);
>   unsigned long flags;\
>   u32 func;   \
>   \
> - efi_sync_low_kernel_mappings(); \
>   local_irq_save(flags);  \
> - \
> - efi_scratch.prev_cr3 = read_cr3();  \
> - write_cr3((unsigned long)efi_scratch.efi_pgt);  \
> - __flush_tlb_all();  \
> + arch_efi_call_virt_setup(); \
>   \
>   func = runtime_service32(f);\
>   __s = efi64_thunk(func, __VA_ARGS__);   \
>   \
> - write_cr3(efi_scratch.prev_cr3);\
> - __flush_tlb_all();  \
> + arch_efi_call_virt_teardown();  \
>   local_irq_restore(flags);   \
>   \
>   __s;\

This looks good to me.  We're at least making use of the
arch_efi_call_virt_* stuff where possible, and only using the special
thunk code where necessary.  I think it's a good middle ground between
the two approaches (especially considering the fact that mine won't
work :) 

I will re-work that last patch to include this change instead of my
original, broken one.

Thanks, Matt!

- Alex


Re: [PATCH 2/3] Update uv_bios_call to use efi_call_virt_generic

2016-06-02 Thread Alex Thorlton
On Thu, Jun 02, 2016 at 08:45:47PM +0100, Matt Fleming wrote:
> Unless I've missed it, I didn't see an explanation in the changelog of
> why it's OK to switch from using __va(tab->function) to tab->function
> directly, which presumably is a physical address.
> 
> Was that intended?

It was intended.  The motivation is so that we can use the same
"dereference the pointer inside the macro" stuff that we do with the
efi.systab->runtime pointer.  IINM, the reason it works is because we do

/*
 * Make sure the 1:1 mappings are present as a catch-all for b0rked
 * firmware which doesn't update all internal pointers after switching
 * to virtual mode and would otherwise crap on us.
 */
__map_region(md, md->phys_addr);

Inside of efi_map_region, so we know we'll have that physical address
mapped into the EFI page table.

Upon review, I'm wondering if the correct thing to do  here is to
update that pointer during the switch to virtual mode, to avoid the
b0rkage mentioned in the above comment.

Either way, the straigh tab->function dereference should work while
using the EFI page table, but I do wonder if that tab->function address
should have been updated to the __va() version of itself before we reach
this point.

Maybe you can comment on that?

- Alex


Re: [PATCH 2/3] Update uv_bios_call to use efi_call_virt_generic

2016-06-02 Thread Alex Thorlton
On Thu, Jun 02, 2016 at 08:45:47PM +0100, Matt Fleming wrote:
> Unless I've missed it, I didn't see an explanation in the changelog of
> why it's OK to switch from using __va(tab->function) to tab->function
> directly, which presumably is a physical address.
> 
> Was that intended?

It was intended.  The motivation is so that we can use the same
"dereference the pointer inside the macro" stuff that we do with the
efi.systab->runtime pointer.  IINM, the reason it works is because we do

/*
 * Make sure the 1:1 mappings are present as a catch-all for b0rked
 * firmware which doesn't update all internal pointers after switching
 * to virtual mode and would otherwise crap on us.
 */
__map_region(md, md->phys_addr);

Inside of efi_map_region, so we know we'll have that physical address
mapped into the EFI page table.

Upon review, I'm wondering if the correct thing to do  here is to
update that pointer during the switch to virtual mode, to avoid the
b0rkage mentioned in the above comment.

Either way, the straigh tab->function dereference should work while
using the EFI page table, but I do wonder if that tab->function address
should have been updated to the __va() version of itself before we reach
this point.

Maybe you can comment on that?

- Alex


Re: [PATCH 1/3] Convert efi_call_virt to efi_call_virt_generic

2016-06-02 Thread Alex Thorlton
On Thu, Jun 02, 2016 at 04:41:14PM +0100, Matt Fleming wrote:
> Oops, you're missing updates to the 32-bit version and ARM/arm64,
> which results in this,
> 
> drivers/firmware/efi/runtime-wrappers.c: In function ‘virt_efi_get_time’:
> arch/x86/include/asm/efi.h:46:4: error: ‘efi_efi’ undeclared (first use in 
> this function)
>   ((efi_##f##_t __attribute__((regparm(0)))*)   \
> ^
> 
> but that's easily fixed up.

Oops!  Sorry about that.  I had done a pretty recent pull before I did
my last build, but it was probably before these updates came in.  I'll
rebase my patches onto the latest tip kernel and fix this.

> > +/*
> > + * Wrap around the new efi_call_virt_generic macros so that the
> > + * code doesn't get too cluttered
> > + */
> > +#define efi_call_virt(f, args...)   \
> > +   efi_call_virt_generic(efi.systab->runtime, f, args)
> > +#define __efi_call_virt(f, args...) \
> > +   __efi_call_virt_generic(efi.systab->runtime, f, args)
> > +
> > +void efi_call_virt_check_flags(unsigned long flags, const char *call)
> >  {
> > unsigned long cur_flags, mismatch;
> >  
> 
> I'm not crazy about using "generic" in the name. How about
> efi_call_virt_function() or efi_call_virt_pointer()?

I'm not married to "generic."  I'll change it to efi_call_virt_pointer.
I think efi_call_virt_function is a little confusing/redundant, since a
call is always to a function, right?

I will fix this stuff up and send a v2.

Thanks for looking this over, Matt!

- Alex


Re: [PATCH 1/3] Convert efi_call_virt to efi_call_virt_generic

2016-06-02 Thread Alex Thorlton
On Thu, Jun 02, 2016 at 04:41:14PM +0100, Matt Fleming wrote:
> Oops, you're missing updates to the 32-bit version and ARM/arm64,
> which results in this,
> 
> drivers/firmware/efi/runtime-wrappers.c: In function ‘virt_efi_get_time’:
> arch/x86/include/asm/efi.h:46:4: error: ‘efi_efi’ undeclared (first use in 
> this function)
>   ((efi_##f##_t __attribute__((regparm(0)))*)   \
> ^
> 
> but that's easily fixed up.

Oops!  Sorry about that.  I had done a pretty recent pull before I did
my last build, but it was probably before these updates came in.  I'll
rebase my patches onto the latest tip kernel and fix this.

> > +/*
> > + * Wrap around the new efi_call_virt_generic macros so that the
> > + * code doesn't get too cluttered
> > + */
> > +#define efi_call_virt(f, args...)   \
> > +   efi_call_virt_generic(efi.systab->runtime, f, args)
> > +#define __efi_call_virt(f, args...) \
> > +   __efi_call_virt_generic(efi.systab->runtime, f, args)
> > +
> > +void efi_call_virt_check_flags(unsigned long flags, const char *call)
> >  {
> > unsigned long cur_flags, mismatch;
> >  
> 
> I'm not crazy about using "generic" in the name. How about
> efi_call_virt_function() or efi_call_virt_pointer()?

I'm not married to "generic."  I'll change it to efi_call_virt_pointer.
I think efi_call_virt_function is a little confusing/redundant, since a
call is always to a function, right?

I will fix this stuff up and send a v2.

Thanks for looking this over, Matt!

- Alex


Re: [RFC PATCH 0/3] x86/UV, x86/efi: Re-factor efi_call_virt for general use

2016-05-18 Thread Alex Thorlton
On Wed, May 18, 2016 at 02:11:38PM -0500, Alex Thorlton wrote:
> Let me know what everybody thinks!

I realized right as I sent these that I should've included prefixes on
the individual patches.  I have a feeling we'll need a v2 anyways, so
I'll clean that up then.

- Alex


[PATCH 2/3] Update uv_bios_call to use efi_call_virt_generic

2016-05-18 Thread Alex Thorlton
Now that the efi_call_virt macro has been generalized to be able to
use EFI system tables besides efi.systab, we are able to convert our
uv_bios_call wrapper to use this standard EFI callback mechanism.

This simple change is part of a much larger effort to recover from some
issues with the way we were mapping in some of our MMRs, and the way
that we were doing our BIOS callbacks, which were uncovered by commit
67a9108ed431 ("x86/efi: Build our own page table structures").

The first issue that this uncovered was that we were relying on the EFI
memory mapping mechanism to map in our MMR space for us, which, while
reliable, was technically a bug, as it relied on "undefined" behavior in
the mapping code.

The reason we were able to piggyback on the EFI memory mapping code to
map in our MMRs was because, previously, EFI code used the
trampoline_pgd, which shares a few entries with the main kernel pgd.  It
just so happened, that the memory range containing our MMRs was inside
one of those shared regions, which kept our code working without issue
for quite a while.

Anyways, once we discovered this problem, we brought back our original
code to map in the MMRs with commit 08914f436bdd ("x86/platform/UV:
Bring back the call to map_low_mmrs in uv_system_init").  This got our
systems a little further along, but we were still running into trouble
with our EFI callbacks, which prevented us from booting all the way up.

Our first step towards fixing the BIOS callbacks was to get our
uv_bios_call wrapper updated to use efi_call_virt instead of the plain
efi_call.  The previous patch took care of the effort needed to make
that possible.  Along the way, we hit a major issue with some confusion
about how to properly pull arguments higher than number 6 off the stack
in the efi_call code, which resulted in Linus's commit 683ad8092cd2
("x86/efi: Fix 7-parameter efi_call()s").

Now that all of those issues are out of the way, we're able to make this
simple change to use the new efi_call_virt_generic in uv_bios_call which
gets our machines booting, running properly, and able to execute our
callbacks with 6+ arguments.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Cc: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Borislav Petkov <b...@suse.de>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Mike Travis <tra...@sgi.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
---
 arch/x86/platform/uv/bios_uv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 815fec6..0ae0826 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -40,8 +40,7 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 */
return BIOS_STATUS_UNIMPLEMENTED;
 
-   ret = efi_call((void *)__va(tab->function), (u64)which,
-   a1, a2, a3, a4, a5);
+   ret = efi_call_virt_generic(tab, function, (u64)which, a1, a2, a3, a4, 
a5);
return ret;
 }
 EXPORT_SYMBOL_GPL(uv_bios_call);
-- 
1.8.5.6



Re: [RFC PATCH 0/3] x86/UV, x86/efi: Re-factor efi_call_virt for general use

2016-05-18 Thread Alex Thorlton
On Wed, May 18, 2016 at 02:11:38PM -0500, Alex Thorlton wrote:
> Let me know what everybody thinks!

I realized right as I sent these that I should've included prefixes on
the individual patches.  I have a feeling we'll need a v2 anyways, so
I'll clean that up then.

- Alex


[PATCH 2/3] Update uv_bios_call to use efi_call_virt_generic

2016-05-18 Thread Alex Thorlton
Now that the efi_call_virt macro has been generalized to be able to
use EFI system tables besides efi.systab, we are able to convert our
uv_bios_call wrapper to use this standard EFI callback mechanism.

This simple change is part of a much larger effort to recover from some
issues with the way we were mapping in some of our MMRs, and the way
that we were doing our BIOS callbacks, which were uncovered by commit
67a9108ed431 ("x86/efi: Build our own page table structures").

The first issue that this uncovered was that we were relying on the EFI
memory mapping mechanism to map in our MMR space for us, which, while
reliable, was technically a bug, as it relied on "undefined" behavior in
the mapping code.

The reason we were able to piggyback on the EFI memory mapping code to
map in our MMRs was because, previously, EFI code used the
trampoline_pgd, which shares a few entries with the main kernel pgd.  It
just so happened, that the memory range containing our MMRs was inside
one of those shared regions, which kept our code working without issue
for quite a while.

Anyways, once we discovered this problem, we brought back our original
code to map in the MMRs with commit 08914f436bdd ("x86/platform/UV:
Bring back the call to map_low_mmrs in uv_system_init").  This got our
systems a little further along, but we were still running into trouble
with our EFI callbacks, which prevented us from booting all the way up.

Our first step towards fixing the BIOS callbacks was to get our
uv_bios_call wrapper updated to use efi_call_virt instead of the plain
efi_call.  The previous patch took care of the effort needed to make
that possible.  Along the way, we hit a major issue with some confusion
about how to properly pull arguments higher than number 6 off the stack
in the efi_call code, which resulted in Linus's commit 683ad8092cd2
("x86/efi: Fix 7-parameter efi_call()s").

Now that all of those issues are out of the way, we're able to make this
simple change to use the new efi_call_virt_generic in uv_bios_call which
gets our machines booting, running properly, and able to execute our
callbacks with 6+ arguments.

Signed-off-by: Alex Thorlton 
Cc: Matt Fleming 
Cc: Borislav Petkov 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Mike Travis 
Cc: Russ Anderson 
Cc: Dimitri Sivanich 
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
---
 arch/x86/platform/uv/bios_uv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
index 815fec6..0ae0826 100644
--- a/arch/x86/platform/uv/bios_uv.c
+++ b/arch/x86/platform/uv/bios_uv.c
@@ -40,8 +40,7 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, u64 
a3, u64 a4, u64 a5)
 */
return BIOS_STATUS_UNIMPLEMENTED;
 
-   ret = efi_call((void *)__va(tab->function), (u64)which,
-   a1, a2, a3, a4, a5);
+   ret = efi_call_virt_generic(tab, function, (u64)which, a1, a2, a3, a4, 
a5);
return ret;
 }
 EXPORT_SYMBOL_GPL(uv_bios_call);
-- 
1.8.5.6



[PATCH 3/3] Update efi_thunk to use efi_call_virt_generic

2016-05-18 Thread Alex Thorlton
Now that we have efi_call_virt_generic, we no longer need to have an
entirely separate efi_thunk macro to handle the CONFIG_EFI_MIXED
scenario, where the function pointers cannot be read directly out of
efi.systab->runtime.

This commit creates a new set of arch_efi_call_virt* macros to mimic the
behavior of the old efi_thunk macro.  In the end, the code should be the
same, functionally, but we'll have eliminated a good chunk of code
duplication by splitting the efi_thunk macro up into the appropriate
arch_efi_call_virt bits and then just calling efi_call_virt_generic,
instead of efi_thunk.  I do go ahead and create a new efi_thunk macro in
arch/x86/platform/efi/efi_64.c, but this is mainly to keep the existing
code clean.

One thing to note here, is that this is not and absolutely *perfect* use
of the efi_call_virt_generic macro, in that it still has the
efi.systab->runtime pointer hard-coded into runtime_service32.  For now,
I think this should be fine, as the only users of the function already
assumed that this was where their function pointer would come from.  If
another CONFIG_EFI_MIXED user comes along and needs to use
efi_call_virt_generic, then we will need to re-think things a bit.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Cc: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Borislav Petkov <b...@suse.de>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Mike Travis <tra...@sgi.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
---
 arch/x86/include/asm/efi.h | 47 
 arch/x86/platform/efi/efi_64.c | 49 ++
 2 files changed, 58 insertions(+), 38 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index f310f0b..6643f9b 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -68,6 +68,52 @@ struct efi_scratch {
u64 phys_stack;
 } __packed;
 
+#ifdef CONFIG_EFI_MIXED
+extern efi_status_t efi64_thunk(u32, ...);
+
+#define runtime_service32(func)
 \
+({  \
+   u32 table = (u32)(unsigned long)efi.systab;  \
+   u32 *rt, *___f;  \
+\
+   rt = (u32 *)(table + offsetof(efi_system_table_32_t, runtime));  \
+   ___f = (u32 *)(*rt + offsetof(efi_runtime_services_32_t, func)); \
+   *___f;   \
+})
+
+/*
+ * Switch to the EFI page tables early so that we can access the 1:1
+ * runtime services mappings which are not mapped in any other page
+ * tables. This function must be called before runtime_service32().
+ *
+ * Also, disable interrupts because the IDT points to 64-bit handlers,
+ * which aren't going to function correctly when we switch to 32-bit.
+ */
+#define arch_efi_call_virt_setup() \
+({ \
+   efi_sync_low_kernel_mappings(); \
+   local_irq_save(flags);  \
+   \
+   efi_scratch.prev_cr3 = read_cr3();  \
+   write_cr3((unsigned long)efi_scratch.efi_pgt);  \
+   __flush_tlb_all();  \
+})
+
+#define arch_efi_call_virt(p, f, ...)  \
+({ \
+   u32 func = runtime_service32(f);\
+   efi64_thunk(func, __VA_ARGS__); \
+})
+
+#define arch_efi_call_virt_teardown()  \
+({ \
+   write_cr3(efi_scratch.prev_cr3);\
+   __flush_tlb_all();  \
+   local_irq_restore(flags);   \
+})
+
+#else /* !CONFIG_EFI_MIXED */
+
 #define arch_efi_call_virt_setup() \
 ({ \
efi_sync_low_kernel_mappings(); \
@@ -94,6 +140,7 @@ struct efi_scratch {
__kernel_fpu_end(); \
preempt_enable();   \
 })
+#endif
 
 extern void __iomem *__init efi_ioremap(unsigned long addr, uns

[RFC PATCH 0/3] x86/UV, x86/efi: Re-factor efi_call_virt for general use

2016-05-18 Thread Alex Thorlton
Hey guys,

This patchset creates a general purpose version of the efi_call_virt
macro that does not assume that the function pointer being passed in is
inside of efi.systab->runtime.  It also fixes up a few potentional users
of that new functionality, namely the SGI UV, and the CONFIG_EFI_MIXED
code paths.

Quick breakdown of the patches:

Patch 1) Move necessary macros to locations where we can access them.
 Remove hard-coded efi.systab reference from efi_call_virt.
 Rename/create new macros as needed.
Patch 2) Simple change to allow UV code to utilize the new
 functionality.  Included a detailed explanation of how we got
 here.
Patch 3) This is the one I'm most looking for input on.  I merge the
 efi_thunk code in with the new efi_call_virt scheme (giving it
 it's own arch_efi_call_* macros, conditionally defined for
 EFI_MIXED) and then use efi_thunk as a wrapper for
 efi_call_virt_generic.

The first two have been tested on simulators and hardware, but the third
has only been compile-tested.  I don't have any hardware to test that
on.  I'm sure I could set up a VM with OVMF, but I haven't taken the
time yet :)

A few notes/concerns that I had about the patches:

1) I could have created more specific names for the individual uses of
   efi_call_virt instead of using the originals as wrappers for
   efi_call_virt_generic.
   * Would be easy enough to do, but would need to update all the
 original callers of the function.  Not difficult, just different.
2) I'm not sure if each macro really needs to have the same args
   implemented.  Could be simplified a bit if we didn't use "p" on the
   EFI_MIXED side.
   * I did this for consistency, but I suppose it's not explicitly
 necessary.
3) It wouldn't be too hard to add an efi_thunk_generic function that
   would just expect a 32-bit pointer, and then have an
   efi_thunk_runtime to wrap that and handle the call to
   runtime_service32 for us, so that the efi.systab pointer doesn't have
   to be hard-coded into the EFI_MIXED version of efi_call_virt_generic.
   * This would only be to cover a hypothetical situation where there
 was code that needed to use a function pointer outside of
 efi.systab->runtime, running with CONFIG_EFI_MIXED enabled.

I'm still playing around with some of this to see how it could be
cleaned up, but wanted to get something out there so people could see
how I'm thinking about handling this.

Let me know what everybody thinks!

Cc: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Borislav Petkov <b...@suse.de>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Mike Travis <tra...@sgi.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org

Alex Thorlton (3):
  Convert efi_call_virt to efi_call_virt_generic
  Update uv_bios_call to use efi_call_virt_generic
  Update efi_thunk to use efi_call_virt_generic

 arch/x86/include/asm/efi.h  | 51 +--
 arch/x86/platform/efi/efi_64.c  | 49 +++---
 arch/x86/platform/uv/bios_uv.c  |  3 +-
 drivers/firmware/efi/runtime-wrappers.c | 53 +++--
 include/linux/efi.h | 51 +++
 5 files changed, 122 insertions(+), 85 deletions(-)

-- 
1.8.5.6



[PATCH 3/3] Update efi_thunk to use efi_call_virt_generic

2016-05-18 Thread Alex Thorlton
Now that we have efi_call_virt_generic, we no longer need to have an
entirely separate efi_thunk macro to handle the CONFIG_EFI_MIXED
scenario, where the function pointers cannot be read directly out of
efi.systab->runtime.

This commit creates a new set of arch_efi_call_virt* macros to mimic the
behavior of the old efi_thunk macro.  In the end, the code should be the
same, functionally, but we'll have eliminated a good chunk of code
duplication by splitting the efi_thunk macro up into the appropriate
arch_efi_call_virt bits and then just calling efi_call_virt_generic,
instead of efi_thunk.  I do go ahead and create a new efi_thunk macro in
arch/x86/platform/efi/efi_64.c, but this is mainly to keep the existing
code clean.

One thing to note here, is that this is not and absolutely *perfect* use
of the efi_call_virt_generic macro, in that it still has the
efi.systab->runtime pointer hard-coded into runtime_service32.  For now,
I think this should be fine, as the only users of the function already
assumed that this was where their function pointer would come from.  If
another CONFIG_EFI_MIXED user comes along and needs to use
efi_call_virt_generic, then we will need to re-think things a bit.

Signed-off-by: Alex Thorlton 
Cc: Matt Fleming 
Cc: Borislav Petkov 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Mike Travis 
Cc: Russ Anderson 
Cc: Dimitri Sivanich 
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
---
 arch/x86/include/asm/efi.h | 47 
 arch/x86/platform/efi/efi_64.c | 49 ++
 2 files changed, 58 insertions(+), 38 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index f310f0b..6643f9b 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -68,6 +68,52 @@ struct efi_scratch {
u64 phys_stack;
 } __packed;
 
+#ifdef CONFIG_EFI_MIXED
+extern efi_status_t efi64_thunk(u32, ...);
+
+#define runtime_service32(func)
 \
+({  \
+   u32 table = (u32)(unsigned long)efi.systab;  \
+   u32 *rt, *___f;  \
+\
+   rt = (u32 *)(table + offsetof(efi_system_table_32_t, runtime));  \
+   ___f = (u32 *)(*rt + offsetof(efi_runtime_services_32_t, func)); \
+   *___f;   \
+})
+
+/*
+ * Switch to the EFI page tables early so that we can access the 1:1
+ * runtime services mappings which are not mapped in any other page
+ * tables. This function must be called before runtime_service32().
+ *
+ * Also, disable interrupts because the IDT points to 64-bit handlers,
+ * which aren't going to function correctly when we switch to 32-bit.
+ */
+#define arch_efi_call_virt_setup() \
+({ \
+   efi_sync_low_kernel_mappings(); \
+   local_irq_save(flags);  \
+   \
+   efi_scratch.prev_cr3 = read_cr3();  \
+   write_cr3((unsigned long)efi_scratch.efi_pgt);  \
+   __flush_tlb_all();  \
+})
+
+#define arch_efi_call_virt(p, f, ...)  \
+({ \
+   u32 func = runtime_service32(f);\
+   efi64_thunk(func, __VA_ARGS__); \
+})
+
+#define arch_efi_call_virt_teardown()  \
+({ \
+   write_cr3(efi_scratch.prev_cr3);\
+   __flush_tlb_all();  \
+   local_irq_restore(flags);   \
+})
+
+#else /* !CONFIG_EFI_MIXED */
+
 #define arch_efi_call_virt_setup() \
 ({ \
efi_sync_low_kernel_mappings(); \
@@ -94,6 +140,7 @@ struct efi_scratch {
__kernel_fpu_end(); \
preempt_enable();   \
 })
+#endif
 
 extern void __iomem *__init efi_ioremap(unsigned long addr, unsigned long size,
u32 type, u64 attribute);
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 6e7242b..747bbc3 100644
--- a/arch/x86/platform/efi/efi

[RFC PATCH 0/3] x86/UV, x86/efi: Re-factor efi_call_virt for general use

2016-05-18 Thread Alex Thorlton
Hey guys,

This patchset creates a general purpose version of the efi_call_virt
macro that does not assume that the function pointer being passed in is
inside of efi.systab->runtime.  It also fixes up a few potentional users
of that new functionality, namely the SGI UV, and the CONFIG_EFI_MIXED
code paths.

Quick breakdown of the patches:

Patch 1) Move necessary macros to locations where we can access them.
 Remove hard-coded efi.systab reference from efi_call_virt.
 Rename/create new macros as needed.
Patch 2) Simple change to allow UV code to utilize the new
 functionality.  Included a detailed explanation of how we got
 here.
Patch 3) This is the one I'm most looking for input on.  I merge the
 efi_thunk code in with the new efi_call_virt scheme (giving it
 it's own arch_efi_call_* macros, conditionally defined for
 EFI_MIXED) and then use efi_thunk as a wrapper for
 efi_call_virt_generic.

The first two have been tested on simulators and hardware, but the third
has only been compile-tested.  I don't have any hardware to test that
on.  I'm sure I could set up a VM with OVMF, but I haven't taken the
time yet :)

A few notes/concerns that I had about the patches:

1) I could have created more specific names for the individual uses of
   efi_call_virt instead of using the originals as wrappers for
   efi_call_virt_generic.
   * Would be easy enough to do, but would need to update all the
 original callers of the function.  Not difficult, just different.
2) I'm not sure if each macro really needs to have the same args
   implemented.  Could be simplified a bit if we didn't use "p" on the
   EFI_MIXED side.
   * I did this for consistency, but I suppose it's not explicitly
 necessary.
3) It wouldn't be too hard to add an efi_thunk_generic function that
   would just expect a 32-bit pointer, and then have an
   efi_thunk_runtime to wrap that and handle the call to
   runtime_service32 for us, so that the efi.systab pointer doesn't have
   to be hard-coded into the EFI_MIXED version of efi_call_virt_generic.
   * This would only be to cover a hypothetical situation where there
 was code that needed to use a function pointer outside of
 efi.systab->runtime, running with CONFIG_EFI_MIXED enabled.

I'm still playing around with some of this to see how it could be
cleaned up, but wanted to get something out there so people could see
how I'm thinking about handling this.

Let me know what everybody thinks!

Cc: Matt Fleming 
Cc: Borislav Petkov 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Mike Travis 
Cc: Russ Anderson 
Cc: Dimitri Sivanich 
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org

Alex Thorlton (3):
  Convert efi_call_virt to efi_call_virt_generic
  Update uv_bios_call to use efi_call_virt_generic
  Update efi_thunk to use efi_call_virt_generic

 arch/x86/include/asm/efi.h  | 51 +--
 arch/x86/platform/efi/efi_64.c  | 49 +++---
 arch/x86/platform/uv/bios_uv.c  |  3 +-
 drivers/firmware/efi/runtime-wrappers.c | 53 +++--
 include/linux/efi.h | 51 +++
 5 files changed, 122 insertions(+), 85 deletions(-)

-- 
1.8.5.6



[PATCH 1/3] Convert efi_call_virt to efi_call_virt_generic

2016-05-18 Thread Alex Thorlton
This commit makes a few slight modifications to the efi_call_virt macro
to get it to work with function pointers that are stored in locations
other than efi.systab->runtime, and renames the macro to
efi_call_virt_generic.  The majority of the changes here are to pull
these macros up into header files so that they can be accessed from
outside of drivers/firmware/efi/runtime-wrappers.c.

The most significant change not directly related to the code move is to
add an extra "p" argument into the appropriate efi_call macros, and use
that new argument in place of the, formerly hard-coded,
efi.systab->runtime pointer.

The last piece of the puzzle was to add an efi_call_virt macro back into
drivers/firmware/efi/runtime-wrappers.c to wrap around the new
efi_call_virt_generic macro - this was mainly to keep the code from
looking too cluttered by adding a bunch of extra references to
efi.systab->runtime everywhere.

Note that I also broke up the code in the efi_call_virt_generic macro a
bit in the process of moving it.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Cc: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Borislav Petkov <b...@suse.de>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Mike Travis <tra...@sgi.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
---
 arch/x86/include/asm/efi.h  |  4 +--
 drivers/firmware/efi/runtime-wrappers.c | 53 +++--
 include/linux/efi.h | 51 +++
 3 files changed, 63 insertions(+), 45 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 78d1e74..f310f0b 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -81,8 +81,8 @@ struct efi_scratch {
}   \
 })
 
-#define arch_efi_call_virt(f, args...) \
-   efi_call((void *)efi.systab->runtime->f, args)  \
+#define arch_efi_call_virt(p, f, args...)  
\
+   efi_call((void *)p->f, args)\
 
 #define arch_efi_call_virt_teardown()  \
 ({ \
diff --git a/drivers/firmware/efi/runtime-wrappers.c 
b/drivers/firmware/efi/runtime-wrappers.c
index 23bef6b..e8bc493 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -22,7 +22,16 @@
 #include 
 #include 
 
-static void efi_call_virt_check_flags(unsigned long flags, const char *call)
+/*
+ * Wrap around the new efi_call_virt_generic macros so that the
+ * code doesn't get too cluttered
+ */
+#define efi_call_virt(f, args...)   \
+   efi_call_virt_generic(efi.systab->runtime, f, args)
+#define __efi_call_virt(f, args...) \
+   __efi_call_virt_generic(efi.systab->runtime, f, args)
+
+void efi_call_virt_check_flags(unsigned long flags, const char *call)
 {
unsigned long cur_flags, mismatch;
 
@@ -39,48 +48,6 @@ static void efi_call_virt_check_flags(unsigned long flags, 
const char *call)
 }
 
 /*
- * Arch code can implement the following three template macros, avoiding
- * reptition for the void/non-void return cases of {__,}efi_call_virt:
- *
- *  * arch_efi_call_virt_setup
- *
- *Sets up the environment for the call (e.g. switching page tables,
- *allowing kernel-mode use of floating point, if required).
- *
- *  * arch_efi_call_virt
- *
- *Performs the call. The last expression in the macro must be the call
- *itself, allowing the logic to be shared by the void and non-void
- *cases.
- *
- *  * arch_efi_call_virt_teardown
- *
- *Restores the usual kernel environment once the call has returned.
- */
-
-#define efi_call_virt(f, args...)  \
-({ \
-   efi_status_t __s;   \
-   unsigned long flags;\
-   arch_efi_call_virt_setup(); \
-   local_save_flags(flags);\
-   __s = arch_efi_call_virt(f, args);  \
-   efi_call_virt_check_flags(flags, __stringify(f));   \
-   arch_efi_call_virt_teardown();  \
-   __s;\
-})
-
-#define __efi_call_virt(f, args...)\
-({ \
-   unsigned long flags;\

[PATCH 1/3] Convert efi_call_virt to efi_call_virt_generic

2016-05-18 Thread Alex Thorlton
This commit makes a few slight modifications to the efi_call_virt macro
to get it to work with function pointers that are stored in locations
other than efi.systab->runtime, and renames the macro to
efi_call_virt_generic.  The majority of the changes here are to pull
these macros up into header files so that they can be accessed from
outside of drivers/firmware/efi/runtime-wrappers.c.

The most significant change not directly related to the code move is to
add an extra "p" argument into the appropriate efi_call macros, and use
that new argument in place of the, formerly hard-coded,
efi.systab->runtime pointer.

The last piece of the puzzle was to add an efi_call_virt macro back into
drivers/firmware/efi/runtime-wrappers.c to wrap around the new
efi_call_virt_generic macro - this was mainly to keep the code from
looking too cluttered by adding a bunch of extra references to
efi.systab->runtime everywhere.

Note that I also broke up the code in the efi_call_virt_generic macro a
bit in the process of moving it.

Signed-off-by: Alex Thorlton 
Cc: Matt Fleming 
Cc: Borislav Petkov 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Mike Travis 
Cc: Russ Anderson 
Cc: Dimitri Sivanich 
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
---
 arch/x86/include/asm/efi.h  |  4 +--
 drivers/firmware/efi/runtime-wrappers.c | 53 +++--
 include/linux/efi.h | 51 +++
 3 files changed, 63 insertions(+), 45 deletions(-)

diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 78d1e74..f310f0b 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -81,8 +81,8 @@ struct efi_scratch {
}   \
 })
 
-#define arch_efi_call_virt(f, args...) \
-   efi_call((void *)efi.systab->runtime->f, args)  \
+#define arch_efi_call_virt(p, f, args...)  
\
+   efi_call((void *)p->f, args)\
 
 #define arch_efi_call_virt_teardown()  \
 ({ \
diff --git a/drivers/firmware/efi/runtime-wrappers.c 
b/drivers/firmware/efi/runtime-wrappers.c
index 23bef6b..e8bc493 100644
--- a/drivers/firmware/efi/runtime-wrappers.c
+++ b/drivers/firmware/efi/runtime-wrappers.c
@@ -22,7 +22,16 @@
 #include 
 #include 
 
-static void efi_call_virt_check_flags(unsigned long flags, const char *call)
+/*
+ * Wrap around the new efi_call_virt_generic macros so that the
+ * code doesn't get too cluttered
+ */
+#define efi_call_virt(f, args...)   \
+   efi_call_virt_generic(efi.systab->runtime, f, args)
+#define __efi_call_virt(f, args...) \
+   __efi_call_virt_generic(efi.systab->runtime, f, args)
+
+void efi_call_virt_check_flags(unsigned long flags, const char *call)
 {
unsigned long cur_flags, mismatch;
 
@@ -39,48 +48,6 @@ static void efi_call_virt_check_flags(unsigned long flags, 
const char *call)
 }
 
 /*
- * Arch code can implement the following three template macros, avoiding
- * reptition for the void/non-void return cases of {__,}efi_call_virt:
- *
- *  * arch_efi_call_virt_setup
- *
- *Sets up the environment for the call (e.g. switching page tables,
- *allowing kernel-mode use of floating point, if required).
- *
- *  * arch_efi_call_virt
- *
- *Performs the call. The last expression in the macro must be the call
- *itself, allowing the logic to be shared by the void and non-void
- *cases.
- *
- *  * arch_efi_call_virt_teardown
- *
- *Restores the usual kernel environment once the call has returned.
- */
-
-#define efi_call_virt(f, args...)  \
-({ \
-   efi_status_t __s;   \
-   unsigned long flags;\
-   arch_efi_call_virt_setup(); \
-   local_save_flags(flags);\
-   __s = arch_efi_call_virt(f, args);  \
-   efi_call_virt_check_flags(flags, __stringify(f));   \
-   arch_efi_call_virt_teardown();  \
-   __s;\
-})
-
-#define __efi_call_virt(f, args...)\
-({ \
-   unsigned long flags;\
-   arch_efi_call_virt_setup(); \
-   local_save_flags(flags);\
-   arch_efi_call_virt(f, args); 

Re: [PATCH 1/2] Create UV efi_call macros

2016-05-17 Thread Alex Thorlton
On Tue, May 17, 2016 at 01:11:22PM +0100, Matt Fleming wrote:
> On Mon, 16 May, at 05:58:40PM, Alex Thorlton wrote:
> > 
> > I was simply re-using the efi_call implementation.  Boris suggested that
> > I re-write this using the efi_call_virt macro, so I just went with that.
> > It all seems to work just fine, so I don't see much reason to stray away
> > from that implementation.  That being said, I'm obviously not a huge fun
> > of the code duplication across the macros.  I think there's probably a
> > way to minimize this, though I haven't quite worked out the best method
> > yet (ideas are welcome :)
> 
> The reason I'm pressing for details is that we have a related issue
> with the EFI thunking code (CONFIG_EFI_MIXED), where the function
> pointer we want to call isn't accessible via the EFI System Table, see
> efi_thunk().
> 
> Well, technically it *is* accessible, you just can't dereference the
> services at runtime because the pointers in the tables are not 64-bit.
> 
> But the same constraints exist for EFI thunk and UV code; given a
> function pointer to execute that isn't in efi.systab, setup the EFI
> runtime environment and call a custom ABI function.

I took a look at this, and see what you mean.  You pass in the same
pointer to efi_thunk, which handles essentially the same setup
stuff as efi_call_virt (sync low mappings, disable interrupts, switch
page tables), sans a few of the finer details in
arch_efi_call_virt_setup.

The separate efi_thunk macro is necessary in this case, because you
need to use the efi64_thunk call, with your runtime_service32 massaged
pointer, instead of efi_call, with a pointer straight out of
systab->runtime. This is a similar scenario to ours, in that we
need uv_efi_call instead of efi_call, with our own pointer, instead of
systab->runtime.

The only difference here is that your efi64_thunk call needs a
slightly different setup/teardown than the regular efi_call, so you
need that efi_thunk to be hacked up a bit more (compared to
efi_call_virt) than my uv_efi_call_virt macro.

IINM, we could probably make up for this discrepancy by having a
different arch_efi_call_virt_setup/teardown for the !efi_is_native case
(not sure if that is a feasible idea, correct me if that's stupid).

> I haven't tested this (or thought through all the implications), but
> could you look at providing a table (or something) for mapping a
> function name to a ptr,func pair, e.g.
> 
>   thunk_get_time: runtime_services32(get_time), efi64_thunk
>   thunk_set_time: runtime_services32(set_time), efi64_thunk
>   ...
>   uv_call_func: efi.uv_systab->function, uv_efi_call_virt
> 
> which we could use in arch_efi_call_virt()? That should give us much
> less code duplication and hide all this inside arch/x86.

This sounds like it could be a good way to handle this.  I will need to
think about it.  Unless someone can say for certain that we can use the
same arch_efi_call_virt_setup/teardown for your efi64_thunk call that we
use for efi_call, then we'll also have to take that into account, which
might make things uglier.  Not horrible, but more complicated.

I'm starting to play with this over here to see if I can get a cleaner
implementation working.

Let me know if you have other thoughts.  Thanks for the input!

- Alex


Re: [PATCH 1/2] Create UV efi_call macros

2016-05-17 Thread Alex Thorlton
On Tue, May 17, 2016 at 01:11:22PM +0100, Matt Fleming wrote:
> On Mon, 16 May, at 05:58:40PM, Alex Thorlton wrote:
> > 
> > I was simply re-using the efi_call implementation.  Boris suggested that
> > I re-write this using the efi_call_virt macro, so I just went with that.
> > It all seems to work just fine, so I don't see much reason to stray away
> > from that implementation.  That being said, I'm obviously not a huge fun
> > of the code duplication across the macros.  I think there's probably a
> > way to minimize this, though I haven't quite worked out the best method
> > yet (ideas are welcome :)
> 
> The reason I'm pressing for details is that we have a related issue
> with the EFI thunking code (CONFIG_EFI_MIXED), where the function
> pointer we want to call isn't accessible via the EFI System Table, see
> efi_thunk().
> 
> Well, technically it *is* accessible, you just can't dereference the
> services at runtime because the pointers in the tables are not 64-bit.
> 
> But the same constraints exist for EFI thunk and UV code; given a
> function pointer to execute that isn't in efi.systab, setup the EFI
> runtime environment and call a custom ABI function.

I took a look at this, and see what you mean.  You pass in the same
pointer to efi_thunk, which handles essentially the same setup
stuff as efi_call_virt (sync low mappings, disable interrupts, switch
page tables), sans a few of the finer details in
arch_efi_call_virt_setup.

The separate efi_thunk macro is necessary in this case, because you
need to use the efi64_thunk call, with your runtime_service32 massaged
pointer, instead of efi_call, with a pointer straight out of
systab->runtime. This is a similar scenario to ours, in that we
need uv_efi_call instead of efi_call, with our own pointer, instead of
systab->runtime.

The only difference here is that your efi64_thunk call needs a
slightly different setup/teardown than the regular efi_call, so you
need that efi_thunk to be hacked up a bit more (compared to
efi_call_virt) than my uv_efi_call_virt macro.

IINM, we could probably make up for this discrepancy by having a
different arch_efi_call_virt_setup/teardown for the !efi_is_native case
(not sure if that is a feasible idea, correct me if that's stupid).

> I haven't tested this (or thought through all the implications), but
> could you look at providing a table (or something) for mapping a
> function name to a ptr,func pair, e.g.
> 
>   thunk_get_time: runtime_services32(get_time), efi64_thunk
>   thunk_set_time: runtime_services32(set_time), efi64_thunk
>   ...
>   uv_call_func: efi.uv_systab->function, uv_efi_call_virt
> 
> which we could use in arch_efi_call_virt()? That should give us much
> less code duplication and hide all this inside arch/x86.

This sounds like it could be a good way to handle this.  I will need to
think about it.  Unless someone can say for certain that we can use the
same arch_efi_call_virt_setup/teardown for your efi64_thunk call that we
use for efi_call, then we'll also have to take that into account, which
might make things uglier.  Not horrible, but more complicated.

I'm starting to play with this over here to see if I can get a cleaner
implementation working.

Let me know if you have other thoughts.  Thanks for the input!

- Alex


Re: [PATCH 1/2] Create UV efi_call macros

2016-05-16 Thread Alex Thorlton
On Thu, May 12, 2016 at 10:17:39AM +0200, Ingo Molnar wrote:
> > Fine by me, although having a newline after arch_efi_call_virt_setup()
> > but not before arch_efi_call_virt_teardown() seems rather arbitrary
> 
> It's an oversight! :-)
> 
> #define efi_call_virt(f, args...) \
> ({\
>   efi_status_t __s;   \
>   unsigned long flags;\
>   \
>   arch_efi_call_virt_setup(); \
>   \
>   local_save_flags(flags);\
>   __s = arch_efi_call_virt(f, args);  \
>   efi_call_virt_check_flags(flags, __stringify(f));   \
>   \
>   arch_efi_call_virt_teardown();  \
>   \
>   __s;\
> })
> 
> But if it's too segmented this is fine too:
> 
> #define efi_call_virt(f, args...) \
> ({\
>   efi_status_t __s;   \
>   unsigned long flags;\
>   \
>   arch_efi_call_virt_setup(); \
>   local_save_flags(flags);\
>   __s = arch_efi_call_virt(f, args);  \
>   efi_call_virt_check_flags(flags, __stringify(f));   \
>   arch_efi_call_virt_teardown();  \
>   \
>   __s;\
> })

This makes sense to me.  I'll make sure to include something like this
in my next version of the patch.

Thanks, guys!

- Alex


Re: [PATCH 1/2] Create UV efi_call macros

2016-05-16 Thread Alex Thorlton
On Thu, May 12, 2016 at 10:17:39AM +0200, Ingo Molnar wrote:
> > Fine by me, although having a newline after arch_efi_call_virt_setup()
> > but not before arch_efi_call_virt_teardown() seems rather arbitrary
> 
> It's an oversight! :-)
> 
> #define efi_call_virt(f, args...) \
> ({\
>   efi_status_t __s;   \
>   unsigned long flags;\
>   \
>   arch_efi_call_virt_setup(); \
>   \
>   local_save_flags(flags);\
>   __s = arch_efi_call_virt(f, args);  \
>   efi_call_virt_check_flags(flags, __stringify(f));   \
>   \
>   arch_efi_call_virt_teardown();  \
>   \
>   __s;\
> })
> 
> But if it's too segmented this is fine too:
> 
> #define efi_call_virt(f, args...) \
> ({\
>   efi_status_t __s;   \
>   unsigned long flags;\
>   \
>   arch_efi_call_virt_setup(); \
>   local_save_flags(flags);\
>   __s = arch_efi_call_virt(f, args);  \
>   efi_call_virt_check_flags(flags, __stringify(f));   \
>   arch_efi_call_virt_teardown();  \
>   \
>   __s;\
> })

This makes sense to me.  I'll make sure to include something like this
in my next version of the patch.

Thanks, guys!

- Alex


Re: [PATCH 1/2] Create UV efi_call macros

2016-05-16 Thread Alex Thorlton
On Thu, May 12, 2016 at 01:06:00PM +0100, Matt Fleming wrote:
> (Adding author of arch_efi_call code)
> 
> On Wed, 11 May, at 02:55:44PM, Alex Thorlton wrote:
> > We need a slightly different macro than the standard efi_call_virt,
> > since those macros all assume that the function pointer, f, that gets
> > passed in will live somewhere in efi.systab->runtime.  Our EFI function
> > pointer lives in efi.uv_systab, so we can't use the standard macros out
> > of the box.
>  
> Is that true of all EFI services pointers? From reading the SGI/UV
> code I got the impression that it's possible to call the standard
> services via efi.systab->runtime but you also need the ability to call
> these UV bios functions, which are not accessible via efi.systab.

No, sorry.  I wasn't very clear there.  All of the standard EFI services
(get_time, get_variable, etc.) are still called through
efi.systab->runtime on UV.  It's only the special UV-specific function
pointer that is accessed through uv_systab, but, either way, we will
still need a slightly different macro to make that happen.

> Do you actually want the same environment setup and teardown to happen
> when calling efi.uv_systab ? Or are you simply trying to reuse the
> efi_call() implementation?

I was simply re-using the efi_call implementation.  Boris suggested that
I re-write this using the efi_call_virt macro, so I just went with that.
It all seems to work just fine, so I don't see much reason to stray away
from that implementation.  That being said, I'm obviously not a huge fun
of the code duplication across the macros.  I think there's probably a
way to minimize this, though I haven't quite worked out the best method
yet (ideas are welcome :)

> > This commit creates some new uv_* macros that are functionally
> > equivalent to the standard ones, with the exception of allowing us to
> > use a different function pointer.  I figure that we won't want to call
> > these uv_* in the end (we'll probably want something more generic), but
> > I thought I would get everyone's thoughts on how we might best reach
> > that goal instead of just trying to come up with a new implementation on
> > my own.
> > 
> > By itself, this commit does get our machines booting, but it needs the
> > small fix to the efi_call assembly code for our modules to work.
>  
> Could you provide some more details in the changelog on why your
> machine doesn't boot without this change?

Absolutely.  I wasn't sure exactly how much detail was necessary.  I'll
put a brief write-up of the original problem in the commit message for
the next version.

- Alex


Re: [PATCH 1/2] Create UV efi_call macros

2016-05-16 Thread Alex Thorlton
On Thu, May 12, 2016 at 01:06:00PM +0100, Matt Fleming wrote:
> (Adding author of arch_efi_call code)
> 
> On Wed, 11 May, at 02:55:44PM, Alex Thorlton wrote:
> > We need a slightly different macro than the standard efi_call_virt,
> > since those macros all assume that the function pointer, f, that gets
> > passed in will live somewhere in efi.systab->runtime.  Our EFI function
> > pointer lives in efi.uv_systab, so we can't use the standard macros out
> > of the box.
>  
> Is that true of all EFI services pointers? From reading the SGI/UV
> code I got the impression that it's possible to call the standard
> services via efi.systab->runtime but you also need the ability to call
> these UV bios functions, which are not accessible via efi.systab.

No, sorry.  I wasn't very clear there.  All of the standard EFI services
(get_time, get_variable, etc.) are still called through
efi.systab->runtime on UV.  It's only the special UV-specific function
pointer that is accessed through uv_systab, but, either way, we will
still need a slightly different macro to make that happen.

> Do you actually want the same environment setup and teardown to happen
> when calling efi.uv_systab ? Or are you simply trying to reuse the
> efi_call() implementation?

I was simply re-using the efi_call implementation.  Boris suggested that
I re-write this using the efi_call_virt macro, so I just went with that.
It all seems to work just fine, so I don't see much reason to stray away
from that implementation.  That being said, I'm obviously not a huge fun
of the code duplication across the macros.  I think there's probably a
way to minimize this, though I haven't quite worked out the best method
yet (ideas are welcome :)

> > This commit creates some new uv_* macros that are functionally
> > equivalent to the standard ones, with the exception of allowing us to
> > use a different function pointer.  I figure that we won't want to call
> > these uv_* in the end (we'll probably want something more generic), but
> > I thought I would get everyone's thoughts on how we might best reach
> > that goal instead of just trying to come up with a new implementation on
> > my own.
> > 
> > By itself, this commit does get our machines booting, but it needs the
> > small fix to the efi_call assembly code for our modules to work.
>  
> Could you provide some more details in the changelog on why your
> machine doesn't boot without this change?

Absolutely.  I wasn't sure exactly how much detail was necessary.  I'll
put a brief write-up of the original problem in the commit message for
the next version.

- Alex


Re: [GIT PULL] EFI fix

2016-05-16 Thread Alex Thorlton
On Mon, May 16, 2016 at 03:23:51PM -0500, Alex Thorlton wrote:
> Everything discussed above makes sense to me, and the patch looks sane.
> I will apply and test it today and let you know how it works.

I applied this to the latest -tip kernel and tested on both real
hardware and in our simulator.  Everything appears to be behaving as
expected.

I still need to work on the other patch in the patchset to get our
callbacks working with the new EFI page tables, but I believe we have a
clear path forward there, and that is (mostly) a separate issue.

Thanks again for catching this mistake, Linus!

- Alex


Re: [GIT PULL] EFI fix

2016-05-16 Thread Alex Thorlton
On Mon, May 16, 2016 at 03:23:51PM -0500, Alex Thorlton wrote:
> Everything discussed above makes sense to me, and the patch looks sane.
> I will apply and test it today and let you know how it works.

I applied this to the latest -tip kernel and tested on both real
hardware and in our simulator.  Everything appears to be behaving as
expected.

I still need to work on the other patch in the patchset to get our
callbacks working with the new EFI page tables, but I believe we have a
clear path forward there, and that is (mostly) a separate issue.

Thanks again for catching this mistake, Linus!

- Alex


Re: [GIT PULL] EFI fix

2016-05-16 Thread Alex Thorlton
On Mon, May 16, 2016 at 01:05:45PM -0700, Linus Torvalds wrote:
> On Mon, May 16, 2016 at 7:46 AM, Ingo Molnar  wrote:
> >
> > Please pull the latest efi-urgent-for-linus git tree from:
> >
> >git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> > efi-urgent-for-linus
> >
> ># HEAD: bea23c757f66d91dac8fdadd94da0cba6b0b66bc x86/efi: Fix 7th 
> > argument to efi_call()
> >
> > A leftover fix from the v4.6 cycle.
> 
> I'm not pulling this. It seems to be completely broken unless I'm
> mis-reading things.
> 
> > diff --git a/arch/x86/platform/efi/efi_stub_64.S 
> > b/arch/x86/platform/efi/efi_stub_64.S
> > index 92723aeae0f9..62938ffbb9f9 100644
> > --- a/arch/x86/platform/efi/efi_stub_64.S
> > +++ b/arch/x86/platform/efi/efi_stub_64.S
> > @@ -43,7 +43,7 @@ ENTRY(efi_call)
> > FRAME_BEGIN
> > SAVE_XMM
> > mov (%rsp), %rax
> > -   mov 8(%rax), %rax
> > +   mov 16(%rax), %rax
> > subq $48, %rsp
> > mov %r9, 32(%rsp)
> > mov %rax, 40(%rsp)
> 
> This code is an unmitigated disaster. It makes no sense, but the
> reason I refuse to pull it is that it also seems to be buggy - with or
> without that patch.
> 
> In particular,. the SAME_XMM code saves the old stack pointer, but
> that's just crazy. It saves the stack pointer *AFTER* we've done that
> 
> FRAME_BEGIN
> 
> which will have *changed* the stack pointer, depending on whether
> stack frames are enabled or not.
> 
> So when the code then does
> 
> mov (%rsp), %rax
> 
> we now move that old stack pointer into %rax, but the offset off that
> stack pointer will depend on whether that FRAME_BEGIN saved off %rbp
> or not.
> 
> So that whole 8-vs-16 offset confusion depends on the frame pointer!
> If frame pointers were enabled, it will be 16. If they weren't, it
> will be 8. That patch that changes it from 8 to 16 will just move the
> bug around. Before, it was correct when frame pointers were disabled
> and buggy otherwise. Now, it's correct if frame pointers are enabled,
> and buggy otherwise.

This makes sense.  I missed the implication of the conditionally defined
FRAME_BEGIN being used at the beginning of this function.  My fix works
on our machines, because we always have frame pointers enabled, but I do
see, now, how this effectively just moves the bug.

> I may be missing something, but I think that commit is pure garbage.
> 
> I think the right fix is to just get rid of that silly conditional
> frame pointer thing, and always use frame pointers in this stub
> function. And then we don't need that (odd) load to get the old stack
> pointer into %rax - we can just use the frame pointer.

After having read your explanation, I agree.  If we leave in the
conditional frame pointer chunk, we'll have to do some other trickery to
make sure that we get the offset for the 7th arg correct, which sounds
ugly.  Your idea seems like a much cleaner fix.

> Something like the attached completely untested patch.
> 
> But maybe I was missing something. Maybe my patch is crap and the
> patch above is right for some reason that completely evades me.
> 
> Since this apparently only affects the SGI EFI stuff, can you please
> test this, Alex?

Everything discussed above makes sense to me, and the patch looks sane.
I will apply and test it today and let you know how it works.

Thanks for looking at this!

- Alex


Re: [GIT PULL] EFI fix

2016-05-16 Thread Alex Thorlton
On Mon, May 16, 2016 at 01:05:45PM -0700, Linus Torvalds wrote:
> On Mon, May 16, 2016 at 7:46 AM, Ingo Molnar  wrote:
> >
> > Please pull the latest efi-urgent-for-linus git tree from:
> >
> >git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> > efi-urgent-for-linus
> >
> ># HEAD: bea23c757f66d91dac8fdadd94da0cba6b0b66bc x86/efi: Fix 7th 
> > argument to efi_call()
> >
> > A leftover fix from the v4.6 cycle.
> 
> I'm not pulling this. It seems to be completely broken unless I'm
> mis-reading things.
> 
> > diff --git a/arch/x86/platform/efi/efi_stub_64.S 
> > b/arch/x86/platform/efi/efi_stub_64.S
> > index 92723aeae0f9..62938ffbb9f9 100644
> > --- a/arch/x86/platform/efi/efi_stub_64.S
> > +++ b/arch/x86/platform/efi/efi_stub_64.S
> > @@ -43,7 +43,7 @@ ENTRY(efi_call)
> > FRAME_BEGIN
> > SAVE_XMM
> > mov (%rsp), %rax
> > -   mov 8(%rax), %rax
> > +   mov 16(%rax), %rax
> > subq $48, %rsp
> > mov %r9, 32(%rsp)
> > mov %rax, 40(%rsp)
> 
> This code is an unmitigated disaster. It makes no sense, but the
> reason I refuse to pull it is that it also seems to be buggy - with or
> without that patch.
> 
> In particular,. the SAME_XMM code saves the old stack pointer, but
> that's just crazy. It saves the stack pointer *AFTER* we've done that
> 
> FRAME_BEGIN
> 
> which will have *changed* the stack pointer, depending on whether
> stack frames are enabled or not.
> 
> So when the code then does
> 
> mov (%rsp), %rax
> 
> we now move that old stack pointer into %rax, but the offset off that
> stack pointer will depend on whether that FRAME_BEGIN saved off %rbp
> or not.
> 
> So that whole 8-vs-16 offset confusion depends on the frame pointer!
> If frame pointers were enabled, it will be 16. If they weren't, it
> will be 8. That patch that changes it from 8 to 16 will just move the
> bug around. Before, it was correct when frame pointers were disabled
> and buggy otherwise. Now, it's correct if frame pointers are enabled,
> and buggy otherwise.

This makes sense.  I missed the implication of the conditionally defined
FRAME_BEGIN being used at the beginning of this function.  My fix works
on our machines, because we always have frame pointers enabled, but I do
see, now, how this effectively just moves the bug.

> I may be missing something, but I think that commit is pure garbage.
> 
> I think the right fix is to just get rid of that silly conditional
> frame pointer thing, and always use frame pointers in this stub
> function. And then we don't need that (odd) load to get the old stack
> pointer into %rax - we can just use the frame pointer.

After having read your explanation, I agree.  If we leave in the
conditional frame pointer chunk, we'll have to do some other trickery to
make sure that we get the offset for the 7th arg correct, which sounds
ugly.  Your idea seems like a much cleaner fix.

> Something like the attached completely untested patch.
> 
> But maybe I was missing something. Maybe my patch is crap and the
> patch above is right for some reason that completely evades me.
> 
> Since this apparently only affects the SGI EFI stuff, can you please
> test this, Alex?

Everything discussed above makes sense to me, and the patch looks sane.
I will apply and test it today and let you know how it works.

Thanks for looking at this!

- Alex


  1   2   3   4   5   6   >