from:"Matt Fleming"

Re: [PATCH] x86: Add early quirk to reset Apple AirPort card

2016-06-02 Thread Matt Fleming

On Sun, 29 May, at 01:35:28AM, Lukas Wunner wrote:
> The EFI firmware on Macs contains a full-fledged network stack for
> downloading OS X images from osrecovery.apple.com. Unfortunately
> on Macs introduced 2011 and 2012, EFI brings up the Broadcom 4331
> wireless card on every boot and leaves it enabled even after
> ExitBootServices has been called. The card continues to assert its IRQ
> line, causing spurious interrupts if the IRQ is shared. It also corrupts
> memory by DMAing received packets, allowing for remote code execution
> over the air. This only stops when a driver is loaded for the wireless
> card, which may be never if the driver is not installed or blacklisted.

[... Snip a very thorough changelog ...]

This patch looks fine to me from an EFI perspective.

Acked-by: Matt Fleming

[PATCH 1/2] efi: Fix for_each_efi_memory_desc_in_map() for empty memmaps

2016-05-31 Thread Matt Fleming

From: Vitaly Kuznetsov <vkuzn...@redhat.com>

Commit 78ce248faa3c ("efi: Iterate over efi.memmap in
for_each_efi_memory_desc()") introduced a regression for systems booted
with 'noefi' kernel option. In particular, I observe early kernel hang in
efi_find_mirror() on for_each_efi_memory_desc() call. As we don't have
efi memmap we enter this iterator with the following parameters:

efi.memmap.map = 0, efi.memmap.map_end = 0, efi.memmap.desc_size = 28

for_each_efi_memory_desc_in_map() does the following comparison:

(md) <= (efi_memory_desc_t *)((m)->map_end - (m)->desc_size);

where md = 0, (m)->map_end = 0 and (m)->desc_size = 28 but when we subtract
something from a NULL pointer wrap around happens and we end up returning
invalid pointer.

Fixes: 78ce248faa3c ("efi: Iterate over efi.memmap in 
for_each_efi_memory_desc()")
Signed-off-by: Vitaly Kuznetsov <vkuzn...@redhat.com>
Cc: Mark Salter <msal...@redhat.com>
Cc: "K. Y. Srinivasan" <k...@microsoft.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 include/linux/efi.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/efi.h b/include/linux/efi.h
index c2db3ca22217..f196dd0b0f2f 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -1005,7 +1005,7 @@ extern int efi_memattr_apply_permissions(struct mm_struct 
*mm,
 /* Iterate through an efi_memory_map */
 #define for_each_efi_memory_desc_in_map(m, md)\
for ((md) = (m)->map;  \
-(md) <= (efi_memory_desc_t *)((m)->map_end - (m)->desc_size); \
+((void *)(md) + (m)->desc_size) <= (m)->map_end;  \
 (md) = (void *)(md) + (m)->desc_size)
 
 /**
-- 
2.7.3

[PATCH 1/2] efi: Fix for_each_efi_memory_desc_in_map() for empty memmaps

2016-05-31 Thread Matt Fleming

From: Vitaly Kuznetsov 

Commit 78ce248faa3c ("efi: Iterate over efi.memmap in
for_each_efi_memory_desc()") introduced a regression for systems booted
with 'noefi' kernel option. In particular, I observe early kernel hang in
efi_find_mirror() on for_each_efi_memory_desc() call. As we don't have
efi memmap we enter this iterator with the following parameters:

efi.memmap.map = 0, efi.memmap.map_end = 0, efi.memmap.desc_size = 28

for_each_efi_memory_desc_in_map() does the following comparison:

(md) <= (efi_memory_desc_t *)((m)->map_end - (m)->desc_size);

where md = 0, (m)->map_end = 0 and (m)->desc_size = 28 but when we subtract
something from a NULL pointer wrap around happens and we end up returning
invalid pointer.

Fixes: 78ce248faa3c ("efi: Iterate over efi.memmap in 
for_each_efi_memory_desc()")
Signed-off-by: Vitaly Kuznetsov 
Cc: Mark Salter 
Cc: "K. Y. Srinivasan" 
Signed-off-by: Matt Fleming 
---
 include/linux/efi.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/efi.h b/include/linux/efi.h
index c2db3ca22217..f196dd0b0f2f 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -1005,7 +1005,7 @@ extern int efi_memattr_apply_permissions(struct mm_struct 
*mm,
 /* Iterate through an efi_memory_map */
 #define for_each_efi_memory_desc_in_map(m, md)\
for ((md) = (m)->map;  \
-(md) <= (efi_memory_desc_t *)((m)->map_end - (m)->desc_size); \
+((void *)(md) + (m)->desc_size) <= (m)->map_end;  \
 (md) = (void *)(md) + (m)->desc_size)
 
 /**
-- 
2.7.3

[PATCH 2/2] efi/arm: Fix the format of debug message from efi

2016-05-31 Thread Matt Fleming

From: Dennis Chen <dennis.c...@arm.com>

When enable debug of efi and memblock with 'efi=debug memblock=debug' appended
to the kernel command line, the debug message output for earyly_con looks like:

[0.00] efi:   0xe105-0xe105 [Memory Mapped I/O  |RUN|  
|  |  |  |  |  |   |  |  |  |UC]
[0.00] efi:   0xe130-0xe1300fff [Memory Mapped I/O  |RUN|  
|  |  |  |  |  |   |  |  |  |UC]
[0.00] efi:   0xe820-0xe827 [Memory Mapped I/O  |RUN|  
|  |  |  |  |  |   |  |  |  |UC]
[0.00] efi:   0x0080-0x008001e7 [Runtime Data   |RUN|  
|  |  |  |  |  |   |WB|WT|WC|UC]
[0.00] memblock_add: [0x80-0x8001e7] flags 0x0 
early_init_dt_add_memory_arch+0x54/0x5c
[0.00] *
...

This patch is trying to fix the above output messed up by memblock_add(),
so we can get below debug mesg looks more formally after applied:

[0.00] efi:   0xe105-0xe105 [Memory Mapped I/O  |RUN|  
|  |  |  |  |  |   |  |  |  |UC]
[0.00] efi:   0xe130-0xe1300fff [Memory Mapped I/O  |RUN|  
|  |  |  |  |  |   |  |  |  |UC]
[0.00] efi:   0xe820-0xe827 [Memory Mapped I/O  |RUN|  
|  |  |  |  |  |   |  |  |  |UC]
[0.00] efi:   0x0080-0x008001e7 [Runtime Data   |RUN|  
|  |  |  |  |  |   |WB|WT|WC|UC]*
[0.00] memblock_add: [0x80-0x8001e7] flags 0x0 
early_init_dt_add_memory_arch+0x54/0x5c
...

Signed-off-by: Dennis Chen <dennis.c...@arm.com>
Acked-by: Mark Rutland <mark.rutl...@arm.com>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Steve Capper <steve.cap...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: linux-...@vger.kernel.org
Cc: Steve McIntyre <st...@einval.com>
Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Dan Williams <dan.j.willi...@intel.com>
Cc: Mark Salter <msal...@redhat.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 drivers/firmware/efi/arm-init.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/firmware/efi/arm-init.c b/drivers/firmware/efi/arm-init.c
index a850cbc48d8d..c49d50e68aee 100644
--- a/drivers/firmware/efi/arm-init.c
+++ b/drivers/firmware/efi/arm-init.c
@@ -174,6 +174,7 @@ static __init void reserve_regions(void)
 {
efi_memory_desc_t *md;
u64 paddr, npages, size;
+   int resv;
 
if (efi_enabled(EFI_DBG))
pr_info("Processing EFI memory map:\n");
@@ -190,12 +191,14 @@ static __init void reserve_regions(void)
paddr = md->phys_addr;
npages = md->num_pages;
 
+   resv = is_reserve_region(md);
if (efi_enabled(EFI_DBG)) {
char buf[64];
 
-   pr_info("  0x%012llx-0x%012llx %s",
+   pr_info("  0x%012llx-0x%012llx %s%s\n",
paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1,
-   efi_md_typeattr_format(buf, sizeof(buf), md));
+   efi_md_typeattr_format(buf, sizeof(buf), md),
+   resv ? "*" : "");
}
 
memrange_efi_to_native(, );
@@ -204,14 +207,9 @@ static __init void reserve_regions(void)
if (is_normal_ram(md))
early_init_dt_add_memory_arch(paddr, size);
 
-   if (is_reserve_region(md)) {
+   if (resv)
memblock_mark_nomap(paddr, size);
-   if (efi_enabled(EFI_DBG))
-   pr_cont("*");
-   }
 
-   if (efi_enabled(EFI_DBG))
-   pr_cont("\n");
}
 
set_bit(EFI_MEMMAP, );
-- 
2.7.3

[GIT PULL 0/2] EFI urgent fixes

2016-05-31 Thread Matt Fleming

Folks, please pull the following urgent patches which fix a boot crash
when using the "noefi" parameter and the debug output on arm.

The following changes since commit 1a695a905c18548062509178b98bc91e67510864:

  Linux 4.7-rc1 (2016-05-29 09:29:24 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi.git tags/efi-urgent

for you to fetch changes up to 1f0cf3892caeab20a99c19f5523499be77b533cd:

  efi/arm: Fix the format of debug message from efi (2016-05-30 22:51:53 +0100)


 * Fix crash when booting with the "noefi" kernel parameter, caused by
   recent changes to for_each_efi_memory_desc_in_map() - Vitaly Kuznetsov

 * Unscramble the debug output on arm when efi=debug and memblock=debug
   is passed on the kernel cmdline - Dennis Chen


Dennis Chen (1):
  efi/arm: Fix the format of debug message from efi

Vitaly Kuznetsov (1):
  efi: Fix for_each_efi_memory_desc_in_map() for empty memmaps

 drivers/firmware/efi/arm-init.c | 14 ++
 include/linux/efi.h |  2 +-
 2 files changed, 7 insertions(+), 9 deletions(-)

[PATCH 2/2] efi/arm: Fix the format of debug message from efi

2016-05-31 Thread Matt Fleming

From: Dennis Chen 

When enable debug of efi and memblock with 'efi=debug memblock=debug' appended
to the kernel command line, the debug message output for earyly_con looks like:

[0.00] efi:   0xe105-0xe105 [Memory Mapped I/O  |RUN|  
|  |  |  |  |  |   |  |  |  |UC]
[0.00] efi:   0xe130-0xe1300fff [Memory Mapped I/O  |RUN|  
|  |  |  |  |  |   |  |  |  |UC]
[0.00] efi:   0xe820-0xe827 [Memory Mapped I/O  |RUN|  
|  |  |  |  |  |   |  |  |  |UC]
[0.00] efi:   0x0080-0x008001e7 [Runtime Data   |RUN|  
|  |  |  |  |  |   |WB|WT|WC|UC]
[0.00] memblock_add: [0x80-0x8001e7] flags 0x0 
early_init_dt_add_memory_arch+0x54/0x5c
[0.00] *
...

This patch is trying to fix the above output messed up by memblock_add(),
so we can get below debug mesg looks more formally after applied:

[0.00] efi:   0xe105-0xe105 [Memory Mapped I/O  |RUN|  
|  |  |  |  |  |   |  |  |  |UC]
[0.00] efi:   0xe130-0xe1300fff [Memory Mapped I/O  |RUN|  
|  |  |  |  |  |   |  |  |  |UC]
[0.00] efi:   0xe820-0xe827 [Memory Mapped I/O  |RUN|  
|  |  |  |  |  |   |  |  |  |UC]
[0.00] efi:   0x0080-0x008001e7 [Runtime Data   |RUN|  
|  |  |  |  |  |   |WB|WT|WC|UC]*
[0.00] memblock_add: [0x80-0x8001e7] flags 0x0 
early_init_dt_add_memory_arch+0x54/0x5c
...

Signed-off-by: Dennis Chen 
Acked-by: Mark Rutland 
Cc: Catalin Marinas 
Cc: Steve Capper 
Cc: Will Deacon 
Cc: Ard Biesheuvel 
Cc: linux-...@vger.kernel.org
Cc: Steve McIntyre 
Cc: Steven Rostedt 
Cc: Dan Williams 
Cc: Mark Salter 
Signed-off-by: Matt Fleming 
---
 drivers/firmware/efi/arm-init.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/firmware/efi/arm-init.c b/drivers/firmware/efi/arm-init.c
index a850cbc48d8d..c49d50e68aee 100644
--- a/drivers/firmware/efi/arm-init.c
+++ b/drivers/firmware/efi/arm-init.c
@@ -174,6 +174,7 @@ static __init void reserve_regions(void)
 {
efi_memory_desc_t *md;
u64 paddr, npages, size;
+   int resv;
 
if (efi_enabled(EFI_DBG))
pr_info("Processing EFI memory map:\n");
@@ -190,12 +191,14 @@ static __init void reserve_regions(void)
paddr = md->phys_addr;
npages = md->num_pages;
 
+   resv = is_reserve_region(md);
if (efi_enabled(EFI_DBG)) {
char buf[64];
 
-   pr_info("  0x%012llx-0x%012llx %s",
+   pr_info("  0x%012llx-0x%012llx %s%s\n",
paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1,
-   efi_md_typeattr_format(buf, sizeof(buf), md));
+   efi_md_typeattr_format(buf, sizeof(buf), md),
+   resv ? "*" : "");
}
 
memrange_efi_to_native(, );
@@ -204,14 +207,9 @@ static __init void reserve_regions(void)
if (is_normal_ram(md))
early_init_dt_add_memory_arch(paddr, size);
 
-   if (is_reserve_region(md)) {
+   if (resv)
memblock_mark_nomap(paddr, size);
-   if (efi_enabled(EFI_DBG))
-   pr_cont("*");
-   }
 
-   if (efi_enabled(EFI_DBG))
-   pr_cont("\n");
}
 
set_bit(EFI_MEMMAP, );
-- 
2.7.3

[GIT PULL 0/2] EFI urgent fixes

2016-05-31 Thread Matt Fleming

Folks, please pull the following urgent patches which fix a boot crash
when using the "noefi" parameter and the debug output on arm.

The following changes since commit 1a695a905c18548062509178b98bc91e67510864:

  Linux 4.7-rc1 (2016-05-29 09:29:24 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi.git tags/efi-urgent

for you to fetch changes up to 1f0cf3892caeab20a99c19f5523499be77b533cd:

  efi/arm: Fix the format of debug message from efi (2016-05-30 22:51:53 +0100)


 * Fix crash when booting with the "noefi" kernel parameter, caused by
   recent changes to for_each_efi_memory_desc_in_map() - Vitaly Kuznetsov

 * Unscramble the debug output on arm when efi=debug and memblock=debug
   is passed on the kernel cmdline - Dennis Chen


Dennis Chen (1):
  efi/arm: Fix the format of debug message from efi

Vitaly Kuznetsov (1):
  efi: Fix for_each_efi_memory_desc_in_map() for empty memmaps

 drivers/firmware/efi/arm-init.c | 14 ++
 include/linux/efi.h |  2 +-
 2 files changed, 7 insertions(+), 9 deletions(-)

Re: [PATCH] efi: fix for_each_efi_memory_desc_in_map() for empty memmaps

2016-05-25 Thread Matt Fleming

(Cc'ing Mark, the original author)

On Wed, 25 May, at 04:36:55PM, Vitaly Kuznetsov wrote:
> Commit 78ce248faa3c ("efi: Iterate over efi.memmap in
> for_each_efi_memory_desc()") introduced a regression for systems booted
> with 'noefi' kernel option. In particular, I observe early kernel hang in
> efi_find_mirror() on for_each_efi_memory_desc() call. As we don't have
> efi memmap we enter this iterator with the following parameters:
> 
> efi.memmap.map = 0, efi.memmap.map_end = 0, efi.memmap.desc_size = 28
> 
> for_each_efi_memory_desc_in_map() does the following comparison:
> 
> (md) <= (efi_memory_desc_t *)((m)->map_end - (m)->desc_size);
> 
> where md = 0, (m)->map_end = 0 and (m)->desc_size = 28 but when we subtract
> something from a NULL pointer wrap around happens and we end up returning
> invalid pointer.
> 
> Fixes: 78ce248faa3c ("efi: Iterate over efi.memmap in 
> for_each_efi_memory_desc()")
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  include/linux/efi.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index c2db3ca..229ccb5 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -1005,7 +1005,8 @@ extern int efi_memattr_apply_permissions(struct 
> mm_struct *mm,
>  /* Iterate through an efi_memory_map */
>  #define for_each_efi_memory_desc_in_map(m, md)   
>\
>   for ((md) = (m)->map;  \
> -  (md) <= (efi_memory_desc_t *)((m)->map_end - (m)->desc_size); \
> +  (efi_memory_desc_t *)((md) + (m)->desc_size) <=   \
> +  (efi_memory_desc_t *)(m)->map_end;\
>(md) = (void *)(md) + (m)->desc_size)

Curse C type casting!

You're adding m->desc_size to a (efi_memory_desc_t *) which is 40
bytes. You need the below to avoid breaking regular EFI boot.

But yeah, this looks like a fine fix. Mark, any concerns?

---

diff --git a/include/linux/efi.h b/include/linux/efi.h
index d8ab480b1089..3a3f8d8bd9be 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -1026,7 +1026,7 @@ extern int efi_memattr_apply_permissions(struct mm_struct 
*mm,
 /* Iterate through an efi_memory_map */
 #define for_each_efi_memory_desc_in_map(m, md)\
for ((md) = (m)->map;  \
-(md) <= (efi_memory_desc_t *)((m)->map_end - (m)->desc_size); \
+((void *)(md) + (m)->desc_size) <= (m)->map_end;  \
 (md) = (void *)(md) + (m)->desc_size)
 
 /**

Re: [PATCH] efi: fix for_each_efi_memory_desc_in_map() for empty memmaps

2016-05-25 Thread Matt Fleming

(Cc'ing Mark, the original author)

On Wed, 25 May, at 04:36:55PM, Vitaly Kuznetsov wrote:
> Commit 78ce248faa3c ("efi: Iterate over efi.memmap in
> for_each_efi_memory_desc()") introduced a regression for systems booted
> with 'noefi' kernel option. In particular, I observe early kernel hang in
> efi_find_mirror() on for_each_efi_memory_desc() call. As we don't have
> efi memmap we enter this iterator with the following parameters:
> 
> efi.memmap.map = 0, efi.memmap.map_end = 0, efi.memmap.desc_size = 28
> 
> for_each_efi_memory_desc_in_map() does the following comparison:
> 
> (md) <= (efi_memory_desc_t *)((m)->map_end - (m)->desc_size);
> 
> where md = 0, (m)->map_end = 0 and (m)->desc_size = 28 but when we subtract
> something from a NULL pointer wrap around happens and we end up returning
> invalid pointer.
> 
> Fixes: 78ce248faa3c ("efi: Iterate over efi.memmap in 
> for_each_efi_memory_desc()")
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  include/linux/efi.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index c2db3ca..229ccb5 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -1005,7 +1005,8 @@ extern int efi_memattr_apply_permissions(struct 
> mm_struct *mm,
>  /* Iterate through an efi_memory_map */
>  #define for_each_efi_memory_desc_in_map(m, md)   
>\
>   for ((md) = (m)->map;  \
> -  (md) <= (efi_memory_desc_t *)((m)->map_end - (m)->desc_size); \
> +  (efi_memory_desc_t *)((md) + (m)->desc_size) <=   \
> +  (efi_memory_desc_t *)(m)->map_end;\
>(md) = (void *)(md) + (m)->desc_size)

Curse C type casting!

You're adding m->desc_size to a (efi_memory_desc_t *) which is 40
bytes. You need the below to avoid breaking regular EFI boot.

But yeah, this looks like a fine fix. Mark, any concerns?

---

diff --git a/include/linux/efi.h b/include/linux/efi.h
index d8ab480b1089..3a3f8d8bd9be 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -1026,7 +1026,7 @@ extern int efi_memattr_apply_permissions(struct mm_struct 
*mm,
 /* Iterate through an efi_memory_map */
 #define for_each_efi_memory_desc_in_map(m, md)\
for ((md) = (m)->map;  \
-(md) <= (efi_memory_desc_t *)((m)->map_end - (m)->desc_size); \
+((void *)(md) + (m)->desc_size) <= (m)->map_end;  \
 (md) = (void *)(md) + (m)->desc_size)
 
 /**

Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear

2016-05-25 Thread Matt Fleming

On Tue, 24 May, at 09:54:31AM, Tom Lendacky wrote:
> 
> I looked into this and this would be a large change also to parse tables
> and build lists.  It occurred to me that this could all be taken care of
> if the early_memremap calls were changed to early_ioremap calls. Looking
> in the git log I see that they were originally early_ioremap calls but
> were changed to early_memremap calls with this commit:
> 
> commit abc93f8eb6e4 ("efi: Use early_mem*() instead of early_io*()")
> 
> Looking at the early_memremap code and the early_ioremap code they both
> call __early_ioremap so I don't see how this change makes any
> difference (especially since FIXMAP_PAGE_NORMAL and FIXMAP_PAGE_IO are
> identical in this case).
> 
> Is it safe to change these back to early_ioremap calls (at least on
> x86)?

I really don't want to begin mixing early_ioremap() calls and
early_memremap() calls for any of the EFI code if it can be avoided.

There is slow but steady progress to move more and more of the
architecture specific EFI code out into generic code. Swapping
early_memremap() for early_ioremap() would be a step backwards,
because FIXMAP_PAGE_NORMAL and FIXMAP_PAGE_IO are not identical on
ARM/arm64.

Could you point me at the patch that in this series that fixes up
early_ioremap() to work with mem encrypt/decrypt? I took another
(quick) look through but couldn't find it.

Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear

2016-05-25 Thread Matt Fleming

On Tue, 24 May, at 09:54:31AM, Tom Lendacky wrote:
> 
> I looked into this and this would be a large change also to parse tables
> and build lists.  It occurred to me that this could all be taken care of
> if the early_memremap calls were changed to early_ioremap calls. Looking
> in the git log I see that they were originally early_ioremap calls but
> were changed to early_memremap calls with this commit:
> 
> commit abc93f8eb6e4 ("efi: Use early_mem*() instead of early_io*()")
> 
> Looking at the early_memremap code and the early_ioremap code they both
> call __early_ioremap so I don't see how this change makes any
> difference (especially since FIXMAP_PAGE_NORMAL and FIXMAP_PAGE_IO are
> identical in this case).
> 
> Is it safe to change these back to early_ioremap calls (at least on
> x86)?

I really don't want to begin mixing early_ioremap() calls and
early_memremap() calls for any of the EFI code if it can be avoided.

There is slow but steady progress to move more and more of the
architecture specific EFI code out into generic code. Swapping
early_memremap() for early_ioremap() would be a step backwards,
because FIXMAP_PAGE_NORMAL and FIXMAP_PAGE_IO are not identical on
ARM/arm64.

Could you point me at the patch that in this series that fixes up
early_ioremap() to work with mem encrypt/decrypt? I took another
(quick) look through but couldn't find it.

Re: [GIT PULL] EFI fix

2016-05-23 Thread Matt Fleming

On Mon, 16 May, at 01:05:45PM, Linus Torvalds wrote:
> 
> I think the right fix is to just get rid of that silly conditional
> frame pointer thing, and always use frame pointers in this stub
> function. And then we don't need that (odd) load to get the old stack
> pointer into %rax - we can just use the frame pointer.
> 
> Something like the attached completely untested patch.

Linus, are you going to apply your patch directly or would you prefer
someone to send a pull request which also includes Josh's patch for
arch/x86/entry/thunk_64.S?

Re: [GIT PULL] EFI fix

2016-05-23 Thread Matt Fleming

On Mon, 16 May, at 01:05:45PM, Linus Torvalds wrote:
> 
> I think the right fix is to just get rid of that silly conditional
> frame pointer thing, and always use frame pointers in this stub
> function. And then we don't need that (odd) load to get the old stack
> pointer into %rax - we can just use the frame pointer.
> 
> Something like the attached completely untested patch.

Linus, are you going to apply your patch directly or would you prefer
someone to send a pull request which also includes Josh's patch for
arch/x86/entry/thunk_64.S?

Re: [PATCH 2/3] sched,fair: Fix local starvation

2016-05-20 Thread Matt Fleming

On Tue, 10 May, at 07:43:16PM, Peter Zijlstra wrote:
> Mike reported that the recent commit 3a47d5124a95 ("sched/fair: Fix
> fairness issue on migration") broke interactivity and the signal
> starve test.
> 
> The problem is that I assumed ENQUEUE_WAKING was only set when we do a
> cross-cpu wakeup (migration), which isn't true. This means we now
> destroy the vruntime history of tasks and wakeup-preemption suffers.
> 
> Cure this by making my assumption true, only call
> sched_class::task_waking() when we do a cross-cpu wakeup. This avoids
> the indirect call in the case we do a local wakeup.
> 
> Cc: Pavan Kondeti <pkond...@codeaurora.org>
> Cc: Ben Segall <bseg...@google.com>
> Cc: Matt Fleming <m...@codeblueprint.co.uk>
> Cc: Morten Rasmussen <morten.rasmus...@arm.com>
> Cc: Paul Turner <p...@google.com>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: byungchul.p...@lge.com
> Cc: Andrew Hunter <a...@google.com>
> Fixes: 3a47d5124a95 ("sched/fair: Fix fairness issue on migration")
> Reported-by: Mike Galbraith <mgalbra...@suse.de>
> Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
> ---
>  kernel/sched/core.c |   29 +
>  kernel/sched/fair.c |   41 -
>  2 files changed, 61 insertions(+), 9 deletions(-)
 
This patch appears to cause a regression for hackbench -pipe of
between ~8% and ~10% with groups >= NR_CPU.

I haven't probed much yet, but it looks like the vruntime of tasks has
gone nuts.

Re: [PATCH 2/3] sched,fair: Fix local starvation

2016-05-20 Thread Matt Fleming

On Tue, 10 May, at 07:43:16PM, Peter Zijlstra wrote:
> Mike reported that the recent commit 3a47d5124a95 ("sched/fair: Fix
> fairness issue on migration") broke interactivity and the signal
> starve test.
> 
> The problem is that I assumed ENQUEUE_WAKING was only set when we do a
> cross-cpu wakeup (migration), which isn't true. This means we now
> destroy the vruntime history of tasks and wakeup-preemption suffers.
> 
> Cure this by making my assumption true, only call
> sched_class::task_waking() when we do a cross-cpu wakeup. This avoids
> the indirect call in the case we do a local wakeup.
> 
> Cc: Pavan Kondeti 
> Cc: Ben Segall 
> Cc: Matt Fleming 
> Cc: Morten Rasmussen 
> Cc: Paul Turner 
> Cc: Thomas Gleixner 
> Cc: byungchul.p...@lge.com
> Cc: Andrew Hunter 
> Fixes: 3a47d5124a95 ("sched/fair: Fix fairness issue on migration")
> Reported-by: Mike Galbraith 
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  kernel/sched/core.c |   29 +
>  kernel/sched/fair.c |   41 -
>  2 files changed, 61 insertions(+), 9 deletions(-)
 
This patch appears to cause a regression for hackbench -pipe of
between ~8% and ~10% with groups >= NR_CPU.

I haven't probed much yet, but it looks like the vruntime of tasks has
gone nuts.

Re: [Xen-devel] [GIT PULL] EFI ARM Xen support

2016-05-18 Thread Matt Fleming

On Wed, 18 May, at 12:46:38PM, Ingo Molnar wrote:
> 
> I have no particular objections, since this seems to be Xen-next merged to 
> the EFI 
> tree that is already upstream, plus this single commit on top of t:
> 
>   11ee5491e5ff Xen: EFI: Parse DT parameters for Xen specific UEFI
> 
> Which, if Matt is fine with it, you could send to Linus too as part of this 
> cycle's Xen pull request.

Sure, I'm fine with that.

Re: [Xen-devel] [GIT PULL] EFI ARM Xen support

2016-05-18 Thread Matt Fleming

On Wed, 18 May, at 12:46:38PM, Ingo Molnar wrote:
> 
> I have no particular objections, since this seems to be Xen-next merged to 
> the EFI 
> tree that is already upstream, plus this single commit on top of t:
> 
>   11ee5491e5ff Xen: EFI: Parse DT parameters for Xen specific UEFI
> 
> Which, if Matt is fine with it, you could send to Linus too as part of this 
> cycle's Xen pull request.

Sure, I'm fine with that.

Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-18 Thread Matt Fleming

On Wed, 18 May, at 03:01:27AM, Yuyang Du wrote:
> On Tue, May 17, 2016 at 01:24:15PM +0100, Matt Fleming wrote:
> > So, if the code looks like the following, either now or in the future,
> > 
> > static void __schedule(bool preempt)
> > {
> > ...
> > /* Clear RQCF_ACT_SKIP */
> > rq->clock_update_flags = 0;
> > ...
> > delta = rq_clock();
> > }
>  
> Sigh, you even said "Clear RQCF_ACT_SKIP", but you not only clear it,
> you clear everything.

That was sloppy on my part but intentional because that's what the
code looks like in tip/sched/core today.

It was purely meant to demonstrate that setting RQCF_UPDATE just
because RQCF_ACT_SKIP is set does not make sense. You can replace the
clearing line with the correct bit masking operation.

But I get it, the pseudo-code was confusing. I'll send out a v2.

> And if you clear the RQCF_UPDATE also (maybe you shouldn't, but
> actually it does not matter), of course you will get a warning...

Right, I wouldn't actually clear RQCF_UPDATE in v2 of this patch.

> In addition, it looks like multiple skips are possible, so:

I'm not sure what you mean, could you elaborate?

> update_rq_clock() {
> rq->clock_update_flags |= RQCF_UPDATE;
> 
> ...
> }
> 
> instead of clearing the skip flag there.

Huh? RQCF_*_SKIP are not cleared in update_rq_clock().

Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-18 Thread Matt Fleming

On Wed, 18 May, at 03:01:27AM, Yuyang Du wrote:
> On Tue, May 17, 2016 at 01:24:15PM +0100, Matt Fleming wrote:
> > So, if the code looks like the following, either now or in the future,
> > 
> > static void __schedule(bool preempt)
> > {
> > ...
> > /* Clear RQCF_ACT_SKIP */
> > rq->clock_update_flags = 0;
> > ...
> > delta = rq_clock();
> > }
>  
> Sigh, you even said "Clear RQCF_ACT_SKIP", but you not only clear it,
> you clear everything.

That was sloppy on my part but intentional because that's what the
code looks like in tip/sched/core today.

It was purely meant to demonstrate that setting RQCF_UPDATE just
because RQCF_ACT_SKIP is set does not make sense. You can replace the
clearing line with the correct bit masking operation.

But I get it, the pseudo-code was confusing. I'll send out a v2.

> And if you clear the RQCF_UPDATE also (maybe you shouldn't, but
> actually it does not matter), of course you will get a warning...

Right, I wouldn't actually clear RQCF_UPDATE in v2 of this patch.

> In addition, it looks like multiple skips are possible, so:

I'm not sure what you mean, could you elaborate?

> update_rq_clock() {
> rq->clock_update_flags |= RQCF_UPDATE;
> 
> ...
> }
> 
> instead of clearing the skip flag there.

Huh? RQCF_*_SKIP are not cleared in update_rq_clock().

Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-17 Thread Matt Fleming

On Tue, 17 May, at 04:11:09AM, Yuyang Du wrote:
> On Mon, May 16, 2016 at 10:46:38AM +0100, Matt Fleming wrote:
> > 
> > No because if someone calls rq_clock() immediately after __schedule(),
> > or even immediately after we clear RQCF_ACT_SKIP in __schedule(), we
> > should trigger a warning since the clock has not actually been
> > updated.
> 
> Well, I don't know how concurrent it can be, but aren't both update
> and read synchronized by rq->lock? So I don't understand the latter
> case, and the former should be addressed (missing its own update?).

I'm not talking about concurrency; when I said "someone" above, I was
referring to code.

So, if the code looks like the following, either now or in the future,

static void __schedule(bool preempt)
{
...
/* Clear RQCF_ACT_SKIP */
rq->clock_update_flags = 0;
...
delta = rq_clock();
}

I would expect to see a warning triggered, because we've read the rq
clock outside of the code area where we know it's safe to do so
without a clock update.

The solution for that bug may be as simple as rearranging the code,

delta = rq_clock();
...
rq->clock_update_flags = 0;

but we definitely want to catch such bugs in the first instance.

Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-17 Thread Matt Fleming

On Tue, 17 May, at 04:11:09AM, Yuyang Du wrote:
> On Mon, May 16, 2016 at 10:46:38AM +0100, Matt Fleming wrote:
> > 
> > No because if someone calls rq_clock() immediately after __schedule(),
> > or even immediately after we clear RQCF_ACT_SKIP in __schedule(), we
> > should trigger a warning since the clock has not actually been
> > updated.
> 
> Well, I don't know how concurrent it can be, but aren't both update
> and read synchronized by rq->lock? So I don't understand the latter
> case, and the former should be addressed (missing its own update?).

I'm not talking about concurrency; when I said "someone" above, I was
referring to code.

So, if the code looks like the following, either now or in the future,

static void __schedule(bool preempt)
{
...
/* Clear RQCF_ACT_SKIP */
rq->clock_update_flags = 0;
...
delta = rq_clock();
}

I would expect to see a warning triggered, because we've read the rq
clock outside of the code area where we know it's safe to do so
without a clock update.

The solution for that bug may be as simple as rearranging the code,

delta = rq_clock();
...
rq->clock_update_flags = 0;

but we definitely want to catch such bugs in the first instance.

Re: [PATCH 1/2] Create UV efi_call macros

2016-05-17 Thread Matt Fleming

On Mon, 16 May, at 05:58:40PM, Alex Thorlton wrote:
> 
> I was simply re-using the efi_call implementation.  Boris suggested that
> I re-write this using the efi_call_virt macro, so I just went with that.
> It all seems to work just fine, so I don't see much reason to stray away
> from that implementation.  That being said, I'm obviously not a huge fun
> of the code duplication across the macros.  I think there's probably a
> way to minimize this, though I haven't quite worked out the best method
> yet (ideas are welcome :)

The reason I'm pressing for details is that we have a related issue
with the EFI thunking code (CONFIG_EFI_MIXED), where the function
pointer we want to call isn't accessible via the EFI System Table, see
efi_thunk().

Well, technically it *is* accessible, you just can't dereference the
services at runtime because the pointers in the tables are not 64-bit.

But the same constraints exist for EFI thunk and UV code; given a
function pointer to execute that isn't in efi.systab, setup the EFI
runtime environment and call a custom ABI function.

I haven't tested this (or thought through all the implications), but
could you look at providing a table (or something) for mapping a
function name to a ptr,func pair, e.g.

thunk_get_time: runtime_services32(get_time), efi64_thunk
thunk_set_time: runtime_services32(set_time), efi64_thunk
...
uv_call_func: efi.uv_systab->function, uv_efi_call_virt

which we could use in arch_efi_call_virt()? That should give us much
less code duplication and hide all this inside arch/x86.

Re: [PATCH 1/2] Create UV efi_call macros

2016-05-17 Thread Matt Fleming

On Mon, 16 May, at 05:58:40PM, Alex Thorlton wrote:
> 
> I was simply re-using the efi_call implementation.  Boris suggested that
> I re-write this using the efi_call_virt macro, so I just went with that.
> It all seems to work just fine, so I don't see much reason to stray away
> from that implementation.  That being said, I'm obviously not a huge fun
> of the code duplication across the macros.  I think there's probably a
> way to minimize this, though I haven't quite worked out the best method
> yet (ideas are welcome :)

The reason I'm pressing for details is that we have a related issue
with the EFI thunking code (CONFIG_EFI_MIXED), where the function
pointer we want to call isn't accessible via the EFI System Table, see
efi_thunk().

Well, technically it *is* accessible, you just can't dereference the
services at runtime because the pointers in the tables are not 64-bit.

But the same constraints exist for EFI thunk and UV code; given a
function pointer to execute that isn't in efi.systab, setup the EFI
runtime environment and call a custom ABI function.

I haven't tested this (or thought through all the implications), but
could you look at providing a table (or something) for mapping a
function name to a ptr,func pair, e.g.

thunk_get_time: runtime_services32(get_time), efi64_thunk
thunk_set_time: runtime_services32(set_time), efi64_thunk
...
uv_call_func: efi.uv_systab->function, uv_efi_call_virt

which we could use in arch_efi_call_virt()? That should give us much
less code duplication and hide all this inside arch/x86.

Re: [GIT PULL] EFI fix

2016-05-17 Thread Matt Fleming

On Tue, 17 May, at 10:04:34AM, Matt Fleming wrote:
> 
> Now I'm wondering whether other users of FRAME_BEGIN/FRAME_END make
> this same mistake. Coccinelle might be able to detect it perhaps.

A quick bit of sed turned up the code in arch/x86/entry/entry_64.S,
which looks to suffer from the same bug,

/* rdi: arg1 ... normal C conventions. rax is saved/restored. */
.macro THUNK name, func, put_ret_addr_in_rdi=0
.globl \name
.type \name, @function
\name:
FRAME_BEGIN

/* this one pushes 9 elems, the next one would be %rIP */
pushq %rdi
pushq %rsi
pushq %rdx
pushq %rcx
pushq %rax
pushq %r8
pushq %r9
pushq %r10
pushq %r11

.if \put_ret_addr_in_rdi
/* 9*8(%rsp) is return addr on stack */
movq 9*8(%rsp), %rdi
.endif

With CONFIG_FRAME_POINTER=y 9*8(%rsp) is actually the value of %rbp on
entry, not the return address.

Re: [GIT PULL] EFI fix

2016-05-17 Thread Matt Fleming

On Tue, 17 May, at 10:04:34AM, Matt Fleming wrote:
> 
> Now I'm wondering whether other users of FRAME_BEGIN/FRAME_END make
> this same mistake. Coccinelle might be able to detect it perhaps.

A quick bit of sed turned up the code in arch/x86/entry/entry_64.S,
which looks to suffer from the same bug,

/* rdi: arg1 ... normal C conventions. rax is saved/restored. */
.macro THUNK name, func, put_ret_addr_in_rdi=0
.globl \name
.type \name, @function
\name:
FRAME_BEGIN

/* this one pushes 9 elems, the next one would be %rIP */
pushq %rdi
pushq %rsi
pushq %rdx
pushq %rcx
pushq %rax
pushq %r8
pushq %r9
pushq %r10
pushq %r11

.if \put_ret_addr_in_rdi
/* 9*8(%rsp) is return addr on stack */
movq 9*8(%rsp), %rdi
.endif

With CONFIG_FRAME_POINTER=y 9*8(%rsp) is actually the value of %rbp on
entry, not the return address.

Re: [GIT PULL] EFI fix

2016-05-17 Thread Matt Fleming

On Mon, 16 May, at 01:05:45PM, Linus Torvalds wrote:
> 
> So that whole 8-vs-16 offset confusion depends on the frame pointer!
> If frame pointers were enabled, it will be 16. If they weren't, it
> will be 8. That patch that changes it from 8 to 16 will just move the
> bug around. Before, it was correct when frame pointers were disabled
> and buggy otherwise. Now, it's correct if frame pointers are enabled,
> and buggy otherwise.

Urgh, right.

We didn't always use frame pointers in efi_call(), they were
introduced in commit 779c433b8ea5c ("x86/asm/efi: Create a stack frame
in efi_call()") earlier this year to stop objtool complaining.

I admit I totally missed the part about pulling arguments off the
stack when I reviewed that patch.

> I may be missing something, but I think that commit is pure garbage.

You're correct.

> I think the right fix is to just get rid of that silly conditional
> frame pointer thing, and always use frame pointers in this stub
> function. And then we don't need that (odd) load to get the old stack
> pointer into %rax - we can just use the frame pointer.
> 
> Something like the attached completely untested patch.

Looks good to me, but I haven't tested it.

Now I'm wondering whether other users of FRAME_BEGIN/FRAME_END make
this same mistake. Coccinelle might be able to detect it perhaps.

Re: [GIT PULL] EFI fix

2016-05-17 Thread Matt Fleming

On Mon, 16 May, at 01:05:45PM, Linus Torvalds wrote:
> 
> So that whole 8-vs-16 offset confusion depends on the frame pointer!
> If frame pointers were enabled, it will be 16. If they weren't, it
> will be 8. That patch that changes it from 8 to 16 will just move the
> bug around. Before, it was correct when frame pointers were disabled
> and buggy otherwise. Now, it's correct if frame pointers are enabled,
> and buggy otherwise.

Urgh, right.

We didn't always use frame pointers in efi_call(), they were
introduced in commit 779c433b8ea5c ("x86/asm/efi: Create a stack frame
in efi_call()") earlier this year to stop objtool complaining.

I admit I totally missed the part about pulling arguments off the
stack when I reviewed that patch.

> I may be missing something, but I think that commit is pure garbage.

You're correct.

> I think the right fix is to just get rid of that silly conditional
> frame pointer thing, and always use frame pointers in this stub
> function. And then we don't need that (odd) load to get the old stack
> pointer into %rax - we can just use the frame pointer.
> 
> Something like the attached completely untested patch.

Looks good to me, but I haven't tested it.

Now I'm wondering whether other users of FRAME_BEGIN/FRAME_END make
this same mistake. Coccinelle might be able to detect it perhaps.

Re: [PATCH] ftrace/x86: Fix function graph tracer reset path

2016-05-16 Thread Matt Fleming

On Mon, 16 May, at 09:58:33AM, Steven Rostedt wrote:
> 
> Can we solve this by doing the same thing it did for the kernel?
> 
> -- Steve
> 
> diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
> index ed48a9f465f8..e13a695c3084 100644
> --- a/arch/x86/kernel/mcount_64.S
> +++ b/arch/x86/kernel/mcount_64.S
> @@ -182,7 +182,8 @@ GLOBAL(ftrace_graph_call)
>   jmp ftrace_stub
>  #endif
>  
> -GLOBAL(ftrace_stub)
> +/* This is weak to keep gas from relaxing the jumps */
> +WEAK(ftrace_stub)
>   retq
>  END(ftrace_caller)
  
Works for me.

Tested-by: Matt Fleming <m...@codeblueprint.co.uk>

Re: [PATCH] ftrace/x86: Fix function graph tracer reset path

2016-05-16 Thread Matt Fleming

On Mon, 16 May, at 09:58:33AM, Steven Rostedt wrote:
> 
> Can we solve this by doing the same thing it did for the kernel?
> 
> -- Steve
> 
> diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
> index ed48a9f465f8..e13a695c3084 100644
> --- a/arch/x86/kernel/mcount_64.S
> +++ b/arch/x86/kernel/mcount_64.S
> @@ -182,7 +182,8 @@ GLOBAL(ftrace_graph_call)
>   jmp ftrace_stub
>  #endif
>  
> -GLOBAL(ftrace_stub)
> +/* This is weak to keep gas from relaxing the jumps */
> +WEAK(ftrace_stub)
>   retq
>  END(ftrace_caller)
  
Works for me.

Tested-by: Matt Fleming

Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-16 Thread Matt Fleming

On Sun, 15 May, at 10:14:39AM, Yuyang Du wrote:
> Hi Matt,
> 
> Thanks for Ccing me.
> 
> I am indeed interested, because I recently encountered an rq clock
> issue, which is that the clock jumps about 200ms when I was
> experimenting the "flat util hierarchy" patches, which really annoyed
> me, and I had to stop to figure out what is wrong (but haven't yet
> figured out ;))
> 
> First, this patchset does not solve my problem, but never mind, by
> reviewing your patches, I have some comments:

Thanks for the review. One gap that this patch series doesn't address
is that some callers of update_rq_clock() do not pin rq->lock, which
makes the diagnostic checks useless in that case.

I plan on handling that next, but I wanted to get this series out as
soon as possible for review.

> On Thu, May 12, 2016 at 08:49:53PM +0100, Matt Fleming wrote:
> >  
> > -   rq->clock_skip_update = 0;
> > +   /* Clear ACT, preserve everything else */
> > +   rq->clock_update_flags ^= RQCF_ACT_SKIP;
> 
> The comment says "Clear ACT", but this is really xor, and I am not sure
> this is even what you want.

Urgh, you're right. I'm not sure what I was thinking when I wrote
that.

> In addition, would it be simpler to do this?
> 
> update_rq_clock()
>   if (flags & RQCF_ACT_SKIP)
>   flags <<= 1; /* effective skip is an update */
>   return;
> 
>   flags = RQCF_UPDATED;

No because if someone calls rq_clock() immediately after __schedule(),
or even immediately after we clear RQCF_ACT_SKIP in __schedule(), we
should trigger a warning since the clock has not actually been
updated.

Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-16 Thread Matt Fleming

On Sun, 15 May, at 10:14:39AM, Yuyang Du wrote:
> Hi Matt,
> 
> Thanks for Ccing me.
> 
> I am indeed interested, because I recently encountered an rq clock
> issue, which is that the clock jumps about 200ms when I was
> experimenting the "flat util hierarchy" patches, which really annoyed
> me, and I had to stop to figure out what is wrong (but haven't yet
> figured out ;))
> 
> First, this patchset does not solve my problem, but never mind, by
> reviewing your patches, I have some comments:

Thanks for the review. One gap that this patch series doesn't address
is that some callers of update_rq_clock() do not pin rq->lock, which
makes the diagnostic checks useless in that case.

I plan on handling that next, but I wanted to get this series out as
soon as possible for review.

> On Thu, May 12, 2016 at 08:49:53PM +0100, Matt Fleming wrote:
> >  
> > -   rq->clock_skip_update = 0;
> > +   /* Clear ACT, preserve everything else */
> > +   rq->clock_update_flags ^= RQCF_ACT_SKIP;
> 
> The comment says "Clear ACT", but this is really xor, and I am not sure
> this is even what you want.

Urgh, you're right. I'm not sure what I was thinking when I wrote
that.

> In addition, would it be simpler to do this?
> 
> update_rq_clock()
>   if (flags & RQCF_ACT_SKIP)
>   flags <<= 1; /* effective skip is an update */
>   return;
> 
>   flags = RQCF_UPDATED;

No because if someone calls rq_clock() immediately after __schedule(),
or even immediately after we clear RQCF_ACT_SKIP in __schedule(), we
should trigger a warning since the clock has not actually been
updated.

Re: [PATCH] ftrace/x86: Fix function graph tracer reset path

2016-05-15 Thread Matt Fleming

On Fri, 13 May, at 10:06:10AM, Steven Rostedt wrote:
> Matt,
> 
> This bug looks very similar to what you were hitting with the function
> profiler. Can you apply this patch and see if it fixes the issue for
> you.

Yep, this patch fixes it for me.

For the record, this is what objdump tells me (with patch applied),

00b6 :
  b6:   eb 03   jmpbb 
  b8:   90  nop
  b9:   90  nop
  ba:   90  nop

So my toolchain is definitely generating a short jump. This is
binutils 2.26.

But on one of my other test machines with binutils 2.24 I see this,

00aa :
  aa:   e9 00 00 00 00  jmpq   af 
ab: R_X86_64_PC32   ftrace_stub-0x4

i.e. a near jump.

Re: [PATCH] ftrace/x86: Fix function graph tracer reset path

2016-05-15 Thread Matt Fleming

On Fri, 13 May, at 10:06:10AM, Steven Rostedt wrote:
> Matt,
> 
> This bug looks very similar to what you were hitting with the function
> profiler. Can you apply this patch and see if it fixes the issue for
> you.

Yep, this patch fixes it for me.

For the record, this is what objdump tells me (with patch applied),

00b6 :
  b6:   eb 03   jmpbb 
  b8:   90  nop
  b9:   90  nop
  ba:   90  nop

So my toolchain is definitely generating a short jump. This is
binutils 2.26.

But on one of my other test machines with binutils 2.24 I see this,

00aa :
  aa:   e9 00 00 00 00  jmpq   af 
ab: R_X86_64_PC32   ftrace_stub-0x4

i.e. a near jump.

Re: [PATCH 2/5] efibc: Fix excessive stack footprint warning

2016-05-14 Thread Matt Fleming

On Wed, 11 May, at 05:16:24PM, Jeremy Compostella wrote:
> From 3a54e6872e220e1ac4db0eae126a20b5383dae3e Mon Sep 17 00:00:00 2001
> From: Jeremy Compostella 
> Date: Tue, 10 May 2016 10:34:21 +0200
> Subject: [PATCH] efibc: report more information in the error messages
> 
> Report the name of the EFI variable if the value size is too large or
> if efibc_set_variable() fails to allocate the struct efivar_entry
> object.  If efibc_set_variable() fails because the value size is too
> large, it also reports the value size in the error message.
> 
> Signed-off-by: Jeremy Compostella 
> ---
>  drivers/firmware/efi/efibc.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)

Applied, thanks Jeremy.

Re: [PATCH 2/5] efibc: Fix excessive stack footprint warning

2016-05-14 Thread Matt Fleming

On Wed, 11 May, at 05:16:24PM, Jeremy Compostella wrote:
> From 3a54e6872e220e1ac4db0eae126a20b5383dae3e Mon Sep 17 00:00:00 2001
> From: Jeremy Compostella 
> Date: Tue, 10 May 2016 10:34:21 +0200
> Subject: [PATCH] efibc: report more information in the error messages
> 
> Report the name of the EFI variable if the value size is too large or
> if efibc_set_variable() fails to allocate the struct efivar_entry
> object.  If efibc_set_variable() fails because the value size is too
> large, it also reports the value size in the error message.
> 
> Signed-off-by: Jeremy Compostella 
> ---
>  drivers/firmware/efi/efibc.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)

Applied, thanks Jeremy.

[GIT PULL] EFI ARM Xen support

2016-05-14 Thread Matt Fleming

Folks,

Please pull the following branch which contains support for Xen on ARM
EFI platforms.

This merge includes a merge of Stefano's xen/linux-next branch to pull
in the prerequisites required for Shannon's commit:

  11ee5491e5ff ("Xen: EFI: Parse DT parameters for Xen specific UEFI")

as it needs both the latest changes in the EFI 'next' branch (now
tip/efi/core) and xen/linux-next.

I have intentionally not included the individual patches as I would
normally do in a pull request, so that commit history created by my
merging of Stefano's branch is preserved, and so that there's no way
to accidentally apply the patches individually.

The following changes since commit 6c5450ef66816216e574885cf8d3ddb31ef77428:

  efivarfs: Make efivarfs_file_ioctl() static (2016-05-07 07:06:13 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi.git efi/arm-xen

for you to fetch changes up to 11ee5491e5ff519e0d3a7018eb21351cb6955a98:

  Xen: EFI: Parse DT parameters for Xen specific UEFI (2016-05-14 19:31:01 
+0100)


David Vrabel (1):
  xen/gntdev: reduce copy batch size to 16

Matt Fleming (1):
  Merge branch 'xen/linux-next' into efi/arm-xen

Shannon Zhao (16):
  Xen: ACPI: Hide UART used by Xen
  xen/grant-table: Move xlated_setup_gnttab_pages to common place
  Xen: xlate: Use page_to_xen_pfn instead of page_to_pfn
  arm/xen: Use xen_xlate_map_ballooned_pages to setup grant table
  xen: memory : Add new XENMAPSPACE type XENMAPSPACE_dev_mmio
  Xen: ARM: Add support for mapping platform device mmio
  Xen: ARM: Add support for mapping AMBA device mmio
  Xen: public/hvm: sync changes of HVM_PARAM_CALLBACK_VIA ABI from Xen
  xen/hvm/params: Add a new delivery type for event-channel in 
HVM_PARAM_CALLBACK_IRQ
  arm/xen: Get event-channel irq through HVM_PARAM when booting with ACPI
  ARM: XEN: Move xen_early_init() before efi_init()
  ARM: Xen: Document UEFI support on Xen ARM virtual platforms
  XEN: EFI: Move x86 specific codes to architecture directory
  ARM64: XEN: Add a function to initialize Xen specific UEFI runtime 
services
  FDT: Add a helper to get the subnode by given name
  Xen: EFI: Parse DT parameters for Xen specific UEFI

Stefano Stabellini (1):
  xen/x86: don't lose event interrupts

 Documentation/devicetree/bindings/arm/xen.txt |  35 +
 arch/arm/include/asm/xen/xen-ops.h|   6 +
 arch/arm/kernel/setup.c   |   2 +-
 arch/arm/xen/Makefile |   1 +
 arch/arm/xen/efi.c|  40 ++
 arch/arm/xen/enlighten.c  | 125 
 arch/arm64/include/asm/xen/xen-ops.h  |   6 +
 arch/arm64/kernel/setup.c |   2 +-
 arch/arm64/xen/Makefile   |   1 +
 arch/x86/xen/efi.c| 111 +++
 arch/x86/xen/grant-table.c|  57 +---
 arch/x86/xen/time.c   |   6 +-
 drivers/acpi/scan.c   |  74 ++
 drivers/firmware/efi/arm-runtime.c|   5 +
 drivers/firmware/efi/efi.c|  81 ---
 drivers/of/fdt.c  |  13 ++
 drivers/xen/Kconfig   |   2 +-
 drivers/xen/Makefile  |   1 +
 drivers/xen/arm-device.c  | 196 ++
 drivers/xen/efi.c | 173 +--
 drivers/xen/gntdev.c  |   2 +-
 drivers/xen/xlate_mmu.c   |  77 ++
 include/linux/of_fdt.h|   2 +
 include/xen/interface/hvm/params.h|  40 +-
 include/xen/interface/memory.h|   1 +
 include/xen/xen-ops.h |  32 +++--
 26 files changed, 840 insertions(+), 251 deletions(-)
 create mode 100644 arch/arm/include/asm/xen/xen-ops.h
 create mode 100644 arch/arm/xen/efi.c
 create mode 100644 arch/arm64/include/asm/xen/xen-ops.h
 create mode 100644 drivers/xen/arm-device.c

[GIT PULL] EFI ARM Xen support

2016-05-14 Thread Matt Fleming

Folks,

Please pull the following branch which contains support for Xen on ARM
EFI platforms.

This merge includes a merge of Stefano's xen/linux-next branch to pull
in the prerequisites required for Shannon's commit:

  11ee5491e5ff ("Xen: EFI: Parse DT parameters for Xen specific UEFI")

as it needs both the latest changes in the EFI 'next' branch (now
tip/efi/core) and xen/linux-next.

I have intentionally not included the individual patches as I would
normally do in a pull request, so that commit history created by my
merging of Stefano's branch is preserved, and so that there's no way
to accidentally apply the patches individually.

The following changes since commit 6c5450ef66816216e574885cf8d3ddb31ef77428:

  efivarfs: Make efivarfs_file_ioctl() static (2016-05-07 07:06:13 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi.git efi/arm-xen

for you to fetch changes up to 11ee5491e5ff519e0d3a7018eb21351cb6955a98:

  Xen: EFI: Parse DT parameters for Xen specific UEFI (2016-05-14 19:31:01 
+0100)


David Vrabel (1):
  xen/gntdev: reduce copy batch size to 16

Matt Fleming (1):
  Merge branch 'xen/linux-next' into efi/arm-xen

Shannon Zhao (16):
  Xen: ACPI: Hide UART used by Xen
  xen/grant-table: Move xlated_setup_gnttab_pages to common place
  Xen: xlate: Use page_to_xen_pfn instead of page_to_pfn
  arm/xen: Use xen_xlate_map_ballooned_pages to setup grant table
  xen: memory : Add new XENMAPSPACE type XENMAPSPACE_dev_mmio
  Xen: ARM: Add support for mapping platform device mmio
  Xen: ARM: Add support for mapping AMBA device mmio
  Xen: public/hvm: sync changes of HVM_PARAM_CALLBACK_VIA ABI from Xen
  xen/hvm/params: Add a new delivery type for event-channel in 
HVM_PARAM_CALLBACK_IRQ
  arm/xen: Get event-channel irq through HVM_PARAM when booting with ACPI
  ARM: XEN: Move xen_early_init() before efi_init()
  ARM: Xen: Document UEFI support on Xen ARM virtual platforms
  XEN: EFI: Move x86 specific codes to architecture directory
  ARM64: XEN: Add a function to initialize Xen specific UEFI runtime 
services
  FDT: Add a helper to get the subnode by given name
  Xen: EFI: Parse DT parameters for Xen specific UEFI

Stefano Stabellini (1):
  xen/x86: don't lose event interrupts

 Documentation/devicetree/bindings/arm/xen.txt |  35 +
 arch/arm/include/asm/xen/xen-ops.h|   6 +
 arch/arm/kernel/setup.c   |   2 +-
 arch/arm/xen/Makefile |   1 +
 arch/arm/xen/efi.c|  40 ++
 arch/arm/xen/enlighten.c  | 125 
 arch/arm64/include/asm/xen/xen-ops.h  |   6 +
 arch/arm64/kernel/setup.c |   2 +-
 arch/arm64/xen/Makefile   |   1 +
 arch/x86/xen/efi.c| 111 +++
 arch/x86/xen/grant-table.c|  57 +---
 arch/x86/xen/time.c   |   6 +-
 drivers/acpi/scan.c   |  74 ++
 drivers/firmware/efi/arm-runtime.c|   5 +
 drivers/firmware/efi/efi.c|  81 ---
 drivers/of/fdt.c  |  13 ++
 drivers/xen/Kconfig   |   2 +-
 drivers/xen/Makefile  |   1 +
 drivers/xen/arm-device.c  | 196 ++
 drivers/xen/efi.c | 173 +--
 drivers/xen/gntdev.c  |   2 +-
 drivers/xen/xlate_mmu.c   |  77 ++
 include/linux/of_fdt.h|   2 +
 include/xen/interface/hvm/params.h|  40 +-
 include/xen/interface/memory.h|   1 +
 include/xen/xen-ops.h |  32 +++--
 26 files changed, 840 insertions(+), 251 deletions(-)
 create mode 100644 arch/arm/include/asm/xen/xen-ops.h
 create mode 100644 arch/arm/xen/efi.c
 create mode 100644 arch/arm64/include/asm/xen/xen-ops.h
 create mode 100644 drivers/xen/arm-device.c

Re: [PATCH v2] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-05-13 Thread Matt Fleming

(Including more folks, quoting entire patch)

On Thu, 12 May, at 08:19:54PM, Shannon Zhao wrote:
> From: Shannon Zhao <shannon.z...@linaro.org>
> 
> The EFI DT parameters for bare metal are located under /chosen node,
> while for Xen Dom0 they are located under /hyperviosr/uefi node. These
> parameters under /chosen and /hyperviosr/uefi are not expected to appear
> at the same time.
> Parse these EFI parameters and initialize EFI like the way for bare
> metal except the runtime services because the runtime services for Xen
> Dom0 are available through hypercalls and they are always enabled. So it
> sets the EFI_RUNTIME_SERVICES flag if it finds /hyperviosr/uefi node and
> bails out in arm_enable_runtime_services() when EFI_RUNTIME_SERVICES
> flag is set already.
> 
> CC: Matt Fleming <m...@codeblueprint.co.uk>
> Signed-off-by: Shannon Zhao <shannon.z...@linaro.org>
> ---
> v2: rebase it on top of EFI and Xen next branch, add some comments to
> explain the codes. 
> ---
>  arch/arm/xen/enlighten.c   | 22 +++
>  drivers/firmware/efi/arm-runtime.c |  5 +++
>  drivers/firmware/efi/efi.c | 81 
> ++
>  3 files changed, 92 insertions(+), 16 deletions(-)
 
This looks good to me. Thanks for all the work Shannon.

Stefan, OK, let's figure out how to deal with this.

What I've done is merge xen/linux-next into the 'efi/xen-arm' topic
branch and applied this patch here,

  
https://git.kernel.org/cgit/linux/kernel/git/mfleming/efi.git/log/?h=efi/arm-xen

Yes, the merge includes some Xen patches that are not actually needed
for any of the EFI patches, but Linus has complained about picking
random commits of a tree to merge. At least this way this is all the
Xen linux-next material.

If you're OK with this, and assuming no one else has any preferred
methods, I'll send out a pull request to tip over the weekend.

> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index 13e3e9f..d4f36eb 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -15,7 +15,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -261,6 +263,19 @@ static int __init fdt_find_hyper_node(unsigned long 
> node, const char *uname,
>   !strncmp(hyper_node.prefix, s, strlen(hyper_node.prefix)))
>   hyper_node.version = s + strlen(hyper_node.prefix);
>  
> + /*
> +  * Check if Xen supports EFI by checking whether there is the
> +  * "/hypervisor/uefi" node in DT. If so, runtime services are available
> +  * through proxy functions (e.g. in case of Xen dom0 EFI implementation
> +  * they call special hypercall which executes relevant EFI functions)
> +  * and that is why they are always enabled.
> +  */
> + if (IS_ENABLED(CONFIG_XEN_EFI)) {
> + if ((of_get_flat_dt_subnode_by_name(node, "uefi") > 0) &&
> + !efi_runtime_disabled())
> + set_bit(EFI_RUNTIME_SERVICES, );
> + }
> +
>   return 0;
>  }
>  
> @@ -352,6 +367,13 @@ static int __init xen_guest_init(void)
>   return -ENODEV;
>   }
>  
> + /*
> +  * The fdt parsing codes have set EFI_RUNTIME_SERVICES if Xen EFI
> +  * parameters are found. Force enable runtime services.
> +  */
> + if (efi_enabled(EFI_RUNTIME_SERVICES))
> + xen_efi_runtime_setup();
> +
>   shared_info_page = (struct shared_info *)get_zeroed_page(GFP_KERNEL);
>  
>   if (!shared_info_page) {
> diff --git a/drivers/firmware/efi/arm-runtime.c 
> b/drivers/firmware/efi/arm-runtime.c
> index 17ccf0a..c394b81 100644
> --- a/drivers/firmware/efi/arm-runtime.c
> +++ b/drivers/firmware/efi/arm-runtime.c
> @@ -107,6 +107,11 @@ static int __init arm_enable_runtime_services(void)
>   return 0;
>   }
>  
> + if (efi_enabled(EFI_RUNTIME_SERVICES)) {
> + pr_info("EFI runtime services access via paravirt.\n");
> + return 0;
> + }
> +
>   pr_info("Remapping and enabling EFI services.\n");
>  
>   mapsize = efi.memmap.map_end - efi.memmap.map;
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 05509f3..1c6f9dd 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -472,12 +472,14 @@ device_initcall(efi_load_efivars);
>   FIELD_SIZEOF(struct efi_fdt_params, field) \
>   }
>  
> -static __initdata struct {
> +struct params {
>   const char name[32];
>   const char propname[32];
>   int offset;
>   int size;

Re: [PATCH v2] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-05-13 Thread Matt Fleming

(Including more folks, quoting entire patch)

On Thu, 12 May, at 08:19:54PM, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> The EFI DT parameters for bare metal are located under /chosen node,
> while for Xen Dom0 they are located under /hyperviosr/uefi node. These
> parameters under /chosen and /hyperviosr/uefi are not expected to appear
> at the same time.
> Parse these EFI parameters and initialize EFI like the way for bare
> metal except the runtime services because the runtime services for Xen
> Dom0 are available through hypercalls and they are always enabled. So it
> sets the EFI_RUNTIME_SERVICES flag if it finds /hyperviosr/uefi node and
> bails out in arm_enable_runtime_services() when EFI_RUNTIME_SERVICES
> flag is set already.
> 
> CC: Matt Fleming 
> Signed-off-by: Shannon Zhao 
> ---
> v2: rebase it on top of EFI and Xen next branch, add some comments to
> explain the codes. 
> ---
>  arch/arm/xen/enlighten.c   | 22 +++
>  drivers/firmware/efi/arm-runtime.c |  5 +++
>  drivers/firmware/efi/efi.c | 81 
> ++
>  3 files changed, 92 insertions(+), 16 deletions(-)
 
This looks good to me. Thanks for all the work Shannon.

Stefan, OK, let's figure out how to deal with this.

What I've done is merge xen/linux-next into the 'efi/xen-arm' topic
branch and applied this patch here,

  
https://git.kernel.org/cgit/linux/kernel/git/mfleming/efi.git/log/?h=efi/arm-xen

Yes, the merge includes some Xen patches that are not actually needed
for any of the EFI patches, but Linus has complained about picking
random commits of a tree to merge. At least this way this is all the
Xen linux-next material.

If you're OK with this, and assuming no one else has any preferred
methods, I'll send out a pull request to tip over the weekend.

> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index 13e3e9f..d4f36eb 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -15,7 +15,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -261,6 +263,19 @@ static int __init fdt_find_hyper_node(unsigned long 
> node, const char *uname,
>   !strncmp(hyper_node.prefix, s, strlen(hyper_node.prefix)))
>   hyper_node.version = s + strlen(hyper_node.prefix);
>  
> + /*
> +  * Check if Xen supports EFI by checking whether there is the
> +  * "/hypervisor/uefi" node in DT. If so, runtime services are available
> +  * through proxy functions (e.g. in case of Xen dom0 EFI implementation
> +  * they call special hypercall which executes relevant EFI functions)
> +  * and that is why they are always enabled.
> +  */
> + if (IS_ENABLED(CONFIG_XEN_EFI)) {
> + if ((of_get_flat_dt_subnode_by_name(node, "uefi") > 0) &&
> + !efi_runtime_disabled())
> + set_bit(EFI_RUNTIME_SERVICES, );
> + }
> +
>   return 0;
>  }
>  
> @@ -352,6 +367,13 @@ static int __init xen_guest_init(void)
>   return -ENODEV;
>   }
>  
> + /*
> +  * The fdt parsing codes have set EFI_RUNTIME_SERVICES if Xen EFI
> +  * parameters are found. Force enable runtime services.
> +  */
> + if (efi_enabled(EFI_RUNTIME_SERVICES))
> + xen_efi_runtime_setup();
> +
>   shared_info_page = (struct shared_info *)get_zeroed_page(GFP_KERNEL);
>  
>   if (!shared_info_page) {
> diff --git a/drivers/firmware/efi/arm-runtime.c 
> b/drivers/firmware/efi/arm-runtime.c
> index 17ccf0a..c394b81 100644
> --- a/drivers/firmware/efi/arm-runtime.c
> +++ b/drivers/firmware/efi/arm-runtime.c
> @@ -107,6 +107,11 @@ static int __init arm_enable_runtime_services(void)
>   return 0;
>   }
>  
> + if (efi_enabled(EFI_RUNTIME_SERVICES)) {
> + pr_info("EFI runtime services access via paravirt.\n");
> + return 0;
> + }
> +
>   pr_info("Remapping and enabling EFI services.\n");
>  
>   mapsize = efi.memmap.map_end - efi.memmap.map;
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 05509f3..1c6f9dd 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -472,12 +472,14 @@ device_initcall(efi_load_efivars);
>   FIELD_SIZEOF(struct efi_fdt_params, field) \
>   }
>  
> -static __initdata struct {
> +struct params {
>   const char name[32];
>   const char propname[32];
>   int offset;
>   int size;
> -} dt_params[] = {
> +};
> +
> +static __initdata struct params fdt_param

[PATCH] x86/efi: Fix 7th argument to efi_call

2016-05-13 Thread Matt Fleming

From: Alex Thorlton <athorl...@sgi.com>

The efi_call assembly code has a slight error that prevents us from
using arguments 7 and higher, which will be passed in on the stack.

mov (%rsp), %rax
mov 8(%rax), %rax
...
mov %rax, 40(%rsp)

This code goes and grabs the return address for the current stack frame,
and puts it on the stack, next to the 5th argument for the EFI runtime
call.  Considering the fact that having the return address in that
position on the stack makes no sense, I'm guessing that the intent of
this code was actually to grab an argument off the stack frame for this
call and place it into the frame for the next one.

The small change to that offset (i.e. 8(%rax) to 16(%rax)) ensures that
we grab the 7th argument off the stack, and pass it as the 6th argument
to the EFI runtime function that we're about to call.  This change gets
our EFI runtime calls that need to pass more than 6 arguments working
again.  SGI/UV is the only platform that passes more than 6 arguments.

Signed-off-by: Alex Thorlton <athorl...@sgi.com>
Cc: Dimitri Sivanich <sivan...@sgi.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Mike Travis <tra...@sgi.com>
Cc: Borislav Petkov <b...@suse.de>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
Cc: <sta...@vger.kernel.org>
[ Updated changelog. ]
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 arch/x86/platform/efi/efi_stub_64.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/platform/efi/efi_stub_64.S 
b/arch/x86/platform/efi/efi_stub_64.S
index 92723aeae0f9..62938ffbb9f9 100644
--- a/arch/x86/platform/efi/efi_stub_64.S
+++ b/arch/x86/platform/efi/efi_stub_64.S
@@ -43,7 +43,7 @@ ENTRY(efi_call)
FRAME_BEGIN
SAVE_XMM
mov (%rsp), %rax
-   mov 8(%rax), %rax
+   mov 16(%rax), %rax
subq $48, %rsp
mov %r9, 32(%rsp)
mov %rax, 40(%rsp)
-- 
2.7.3

[PATCH] x86/efi: Fix 7th argument to efi_call

2016-05-13 Thread Matt Fleming

From: Alex Thorlton 

The efi_call assembly code has a slight error that prevents us from
using arguments 7 and higher, which will be passed in on the stack.

mov (%rsp), %rax
mov 8(%rax), %rax
...
mov %rax, 40(%rsp)

This code goes and grabs the return address for the current stack frame,
and puts it on the stack, next to the 5th argument for the EFI runtime
call.  Considering the fact that having the return address in that
position on the stack makes no sense, I'm guessing that the intent of
this code was actually to grab an argument off the stack frame for this
call and place it into the frame for the next one.

The small change to that offset (i.e. 8(%rax) to 16(%rax)) ensures that
we grab the 7th argument off the stack, and pass it as the 6th argument
to the EFI runtime function that we're about to call.  This change gets
our EFI runtime calls that need to pass more than 6 arguments working
again.  SGI/UV is the only platform that passes more than 6 arguments.

Signed-off-by: Alex Thorlton 
Cc: Dimitri Sivanich 
Cc: Russ Anderson 
Cc: Mike Travis 
Cc: Borislav Petkov 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: linux-...@vger.kernel.org
Cc: 
[ Updated changelog. ]
Signed-off-by: Matt Fleming 
---
 arch/x86/platform/efi/efi_stub_64.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/platform/efi/efi_stub_64.S 
b/arch/x86/platform/efi/efi_stub_64.S
index 92723aeae0f9..62938ffbb9f9 100644
--- a/arch/x86/platform/efi/efi_stub_64.S
+++ b/arch/x86/platform/efi/efi_stub_64.S
@@ -43,7 +43,7 @@ ENTRY(efi_call)
FRAME_BEGIN
SAVE_XMM
mov (%rsp), %rax
-   mov 8(%rax), %rax
+   mov 16(%rax), %rax
subq $48, %rsp
mov %r9, 32(%rsp)
mov %rax, 40(%rsp)
-- 
2.7.3

[GIT PULL] EFI urgent fix

2016-05-13 Thread Matt Fleming

The following changes since commit c10fcb14c7afd6688c7b197a814358fecf244222:

  x86/sysfb_efi: Fix valid BAR address range check (2016-05-05 16:01:00 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi.git tags/efi-urgent

for you to fetch changes up to 6f4c184576aeb5594b8d21f9e7206b7b62e3d96e:

  x86/efi: Fix 7th argument to efi_call (2016-05-13 21:12:13 +0100)


 * Fix passing of 7 parameters or more to efi_call. This issue is only
   triggered on SGI/UV systems - Alex Thorlton


Alex Thorlton (1):
  x86/efi: Fix 7th argument to efi_call

 arch/x86/platform/efi/efi_stub_64.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

[GIT PULL] EFI urgent fix

2016-05-13 Thread Matt Fleming

The following changes since commit c10fcb14c7afd6688c7b197a814358fecf244222:

  x86/sysfb_efi: Fix valid BAR address range check (2016-05-05 16:01:00 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi.git tags/efi-urgent

for you to fetch changes up to 6f4c184576aeb5594b8d21f9e7206b7b62e3d96e:

  x86/efi: Fix 7th argument to efi_call (2016-05-13 21:12:13 +0100)


 * Fix passing of 7 parameters or more to efi_call. This issue is only
   triggered on SGI/UV systems - Alex Thorlton


Alex Thorlton (1):
  x86/efi: Fix 7th argument to efi_call

 arch/x86/platform/efi/efi_stub_64.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Re: [PATCH 1/3] MAINTAINERS: Remove asterisk from EFI directory names

2016-05-13 Thread Matt Fleming

On Fri, 13 May, at 08:23:47AM, Joe Perches wrote:
> On Fri, 2016-05-13 at 17:11 +0200, Xose Vazquez Perez wrote:
> > Matt Fleming wrote:
> > 
> > > 
> > > Mark reported that having asterisks on the end of directory names
> > > confuses get_maintainer.pl when it encounters subdirectories, and
> > > that
> > If this's a get_maintainer.pl bug, it should also happen in:
> 
> Mark is confused.
 
Sorry, "confuses" was my word, not Mark's.

> The meaning of * is all the files in a particular
> directory but not any subdirectories.
 
Exactly. This is not what EFI maintainer entries should be using. I am
responsible for everything under drivers/firmware/efi, including
subdirectories.

Re: [PATCH 1/3] MAINTAINERS: Remove asterisk from EFI directory names

2016-05-13 Thread Matt Fleming

On Fri, 13 May, at 08:23:47AM, Joe Perches wrote:
> On Fri, 2016-05-13 at 17:11 +0200, Xose Vazquez Perez wrote:
> > Matt Fleming wrote:
> > 
> > > 
> > > Mark reported that having asterisks on the end of directory names
> > > confuses get_maintainer.pl when it encounters subdirectories, and
> > > that
> > If this's a get_maintainer.pl bug, it should also happen in:
> 
> Mark is confused.
 
Sorry, "confuses" was my word, not Mark's.

> The meaning of * is all the files in a particular
> directory but not any subdirectories.
 
Exactly. This is not what EFI maintainer entries should be using. I am
responsible for everything under drivers/firmware/efi, including
subdirectories.

[RFC][PATCH 0/5] sched: Diagnostic checks for missing rq clock updates

2016-05-12 Thread Matt Fleming

There are currently no runtime diagnostic checks for detecting when we
have inadvertently missed a call to update_rq_clock() before accessing
rq_clock().

The idea in these patches, which came from Peter, is to piggyback on
the rq->lock pin/unpin context to detect when we expected (and failed)
to see an update to the rq clock. They've already caught a couple of
bugs: see patch 1 and commit b52fad2db5d7 ("sched/fair: Update rq
clock before updating nohz CPU load") in tip/sched/core.

I'm not sure how palatable the s/pin_cookie/rq_flags/ changes will be
in patch 2, so I've marked this entire series as RFC.

All the diagnostic code is guarded by CONFIG_SCHED_DEBUG, but there
are minimal changes to __schedule() in patch 5 for the !SCHED_DEBUG
case.

Matt Fleming (5):
  sched/fair: Update the rq clock before detaching tasks
  sched: Add wrappers for lockdep_(un)pin_lock()
  sched/core: Reset RQCF_ACT_SKIP before unpinning rq->lock
  sched/fair: Push rq lock pin/unpin into idle_balance()
  sched/core: Add debug code to catch missing update_rq_clock()

 kernel/sched/core.c  |  94 +++
 kernel/sched/deadline.c  |  10 ++---
 kernel/sched/fair.c  |  31 +--
 kernel/sched/idle_task.c |   2 +-
 kernel/sched/rt.c|   6 +--
 kernel/sched/sched.h | 101 ---
 kernel/sched/stop_task.c |   2 +-
 7 files changed, 166 insertions(+), 80 deletions(-)

-- 
2.7.3

[RFC][PATCH 0/5] sched: Diagnostic checks for missing rq clock updates

2016-05-12 Thread Matt Fleming

There are currently no runtime diagnostic checks for detecting when we
have inadvertently missed a call to update_rq_clock() before accessing
rq_clock().

The idea in these patches, which came from Peter, is to piggyback on
the rq->lock pin/unpin context to detect when we expected (and failed)
to see an update to the rq clock. They've already caught a couple of
bugs: see patch 1 and commit b52fad2db5d7 ("sched/fair: Update rq
clock before updating nohz CPU load") in tip/sched/core.

I'm not sure how palatable the s/pin_cookie/rq_flags/ changes will be
in patch 2, so I've marked this entire series as RFC.

All the diagnostic code is guarded by CONFIG_SCHED_DEBUG, but there
are minimal changes to __schedule() in patch 5 for the !SCHED_DEBUG
case.

Matt Fleming (5):
  sched/fair: Update the rq clock before detaching tasks
  sched: Add wrappers for lockdep_(un)pin_lock()
  sched/core: Reset RQCF_ACT_SKIP before unpinning rq->lock
  sched/fair: Push rq lock pin/unpin into idle_balance()
  sched/core: Add debug code to catch missing update_rq_clock()

 kernel/sched/core.c  |  94 +++
 kernel/sched/deadline.c  |  10 ++---
 kernel/sched/fair.c  |  31 +--
 kernel/sched/idle_task.c |   2 +-
 kernel/sched/rt.c|   6 +--
 kernel/sched/sched.h | 101 ---
 kernel/sched/stop_task.c |   2 +-
 7 files changed, 166 insertions(+), 80 deletions(-)

-- 
2.7.3

[RFC][PATCH 2/5] sched: Add wrappers for lockdep_(un)pin_lock()

2016-05-12 Thread Matt Fleming

In preparation for adding diagnostic checks to catch missing calls to
update_rq_clock(), provide wrappers for (re)pinning and unpinning
rq->lock.

Because the pending diagnostic checks allow state to be maintained in
rq_flags across pin contexts, swap the 'struct pin_cookie' arguments
for 'struct rq_flags *'.

Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ingo Molnar <mi...@kernel.org>
Cc: Mike Galbraith <umgwanakikb...@gmail.com>
Cc: Mel Gorman <mgor...@techsingularity.net>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Wanpeng Li <wanpeng...@hotmail.com>
Cc: Luca Abeni <luca.ab...@unitn.it>
Cc: Yuyang Du <yuyang...@intel.com>
Cc: Byungchul Park <byungchul.p...@lge.com>
Cc: Rik van Riel <r...@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wyso...@intel.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 kernel/sched/core.c  | 80 
 kernel/sched/deadline.c  | 10 +++---
 kernel/sched/fair.c  |  6 ++--
 kernel/sched/idle_task.c |  2 +-
 kernel/sched/rt.c|  6 ++--
 kernel/sched/sched.h | 31 ++-
 kernel/sched/stop_task.c |  2 +-
 7 files changed, 76 insertions(+), 61 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 404c0784b1fc..dfdd8c3c083b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -184,7 +184,7 @@ struct rq *__task_rq_lock(struct task_struct *p, struct 
rq_flags *rf)
rq = task_rq(p);
raw_spin_lock(>lock);
if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
-   rf->cookie = lockdep_pin_lock(>lock);
+   rq_pin_lock(rq, rf);
return rq;
}
raw_spin_unlock(>lock);
@@ -224,7 +224,7 @@ struct rq *task_rq_lock(struct task_struct *p, struct 
rq_flags *rf)
 * pair with the WMB to ensure we must then also see migrating.
 */
if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
-   rf->cookie = lockdep_pin_lock(>lock);
+   rq_pin_lock(rq, rf);
return rq;
}
raw_spin_unlock(>lock);
@@ -1183,9 +1183,9 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
 * OK, since we're going to drop the lock immediately
 * afterwards anyway.
 */
-   lockdep_unpin_lock(>lock, rf.cookie);
+   rq_unpin_lock(rq, );
rq = move_queued_task(rq, p, dest_cpu);
-   lockdep_repin_lock(>lock, rf.cookie);
+   rq_repin_lock(rq, );
}
 out:
task_rq_unlock(rq, p, );
@@ -1677,7 +1677,7 @@ static inline void ttwu_activate(struct rq *rq, struct 
task_struct *p, int en_fl
  * Mark the task runnable and perform wakeup-preemption.
  */
 static void ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int 
wake_flags,
-  struct pin_cookie cookie)
+  struct rq_flags *rf)
 {
check_preempt_curr(rq, p, wake_flags);
p->state = TASK_RUNNING;
@@ -1689,9 +1689,9 @@ static void ttwu_do_wakeup(struct rq *rq, struct 
task_struct *p, int wake_flags,
 * Our task @p is fully woken up and running; so its safe to
 * drop the rq->lock, hereafter rq is only used for statistics.
 */
-   lockdep_unpin_lock(>lock, cookie);
+   rq_unpin_lock(rq, rf);
p->sched_class->task_woken(rq, p);
-   lockdep_repin_lock(>lock, cookie);
+   rq_repin_lock(rq, rf);
}
 
if (rq->idle_stamp) {
@@ -1710,7 +1710,7 @@ static void ttwu_do_wakeup(struct rq *rq, struct 
task_struct *p, int wake_flags,
 
 static void
 ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags,
-struct pin_cookie cookie)
+struct rq_flags *rf)
 {
int en_flags = ENQUEUE_WAKEUP;
 
@@ -1725,7 +1725,7 @@ ttwu_do_activate(struct rq *rq, struct task_struct *p, 
int wake_flags,
 #endif
 
ttwu_activate(rq, p, en_flags);
-   ttwu_do_wakeup(rq, p, wake_flags, cookie);
+   ttwu_do_wakeup(rq, p, wake_flags, rf);
 }
 
 /*
@@ -1744,7 +1744,7 @@ static int ttwu_remote(struct task_struct *p, int 
wake_flags)
if (task_on_rq_queued(p)) {
/* check_preempt_curr() may use rq clock */
update_rq_clock(rq);
-   ttwu_do_wakeup(rq, p, wake_flags, rf.cookie);
+   ttwu_do_wakeup(rq, p, wake_flags, );
ret = 1;
}
__task_rq_unlock(rq, );
@@ -1757,15 +1757,15 @@ void sched_ttwu_pending(void)
 {
struct rq *rq = this_rq();
struct

[RFC][PATCH 2/5] sched: Add wrappers for lockdep_(un)pin_lock()

2016-05-12 Thread Matt Fleming

In preparation for adding diagnostic checks to catch missing calls to
update_rq_clock(), provide wrappers for (re)pinning and unpinning
rq->lock.

Because the pending diagnostic checks allow state to be maintained in
rq_flags across pin contexts, swap the 'struct pin_cookie' arguments
for 'struct rq_flags *'.

Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Mike Galbraith 
Cc: Mel Gorman 
Cc: Thomas Gleixner 
Cc: Frederic Weisbecker 
Cc: Wanpeng Li 
Cc: Luca Abeni 
Cc: Yuyang Du 
Cc: Byungchul Park 
Cc: Rik van Riel 
Cc: "Rafael J. Wysocki" 
Signed-off-by: Matt Fleming 
---
 kernel/sched/core.c  | 80 
 kernel/sched/deadline.c  | 10 +++---
 kernel/sched/fair.c  |  6 ++--
 kernel/sched/idle_task.c |  2 +-
 kernel/sched/rt.c|  6 ++--
 kernel/sched/sched.h | 31 ++-
 kernel/sched/stop_task.c |  2 +-
 7 files changed, 76 insertions(+), 61 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 404c0784b1fc..dfdd8c3c083b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -184,7 +184,7 @@ struct rq *__task_rq_lock(struct task_struct *p, struct 
rq_flags *rf)
rq = task_rq(p);
raw_spin_lock(>lock);
if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
-   rf->cookie = lockdep_pin_lock(>lock);
+   rq_pin_lock(rq, rf);
return rq;
}
raw_spin_unlock(>lock);
@@ -224,7 +224,7 @@ struct rq *task_rq_lock(struct task_struct *p, struct 
rq_flags *rf)
 * pair with the WMB to ensure we must then also see migrating.
 */
if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
-   rf->cookie = lockdep_pin_lock(>lock);
+   rq_pin_lock(rq, rf);
return rq;
}
raw_spin_unlock(>lock);
@@ -1183,9 +1183,9 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
 * OK, since we're going to drop the lock immediately
 * afterwards anyway.
 */
-   lockdep_unpin_lock(>lock, rf.cookie);
+   rq_unpin_lock(rq, );
rq = move_queued_task(rq, p, dest_cpu);
-   lockdep_repin_lock(>lock, rf.cookie);
+   rq_repin_lock(rq, );
}
 out:
task_rq_unlock(rq, p, );
@@ -1677,7 +1677,7 @@ static inline void ttwu_activate(struct rq *rq, struct 
task_struct *p, int en_fl
  * Mark the task runnable and perform wakeup-preemption.
  */
 static void ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int 
wake_flags,
-  struct pin_cookie cookie)
+  struct rq_flags *rf)
 {
check_preempt_curr(rq, p, wake_flags);
p->state = TASK_RUNNING;
@@ -1689,9 +1689,9 @@ static void ttwu_do_wakeup(struct rq *rq, struct 
task_struct *p, int wake_flags,
 * Our task @p is fully woken up and running; so its safe to
 * drop the rq->lock, hereafter rq is only used for statistics.
 */
-   lockdep_unpin_lock(>lock, cookie);
+   rq_unpin_lock(rq, rf);
p->sched_class->task_woken(rq, p);
-   lockdep_repin_lock(>lock, cookie);
+   rq_repin_lock(rq, rf);
}
 
if (rq->idle_stamp) {
@@ -1710,7 +1710,7 @@ static void ttwu_do_wakeup(struct rq *rq, struct 
task_struct *p, int wake_flags,
 
 static void
 ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags,
-struct pin_cookie cookie)
+struct rq_flags *rf)
 {
int en_flags = ENQUEUE_WAKEUP;
 
@@ -1725,7 +1725,7 @@ ttwu_do_activate(struct rq *rq, struct task_struct *p, 
int wake_flags,
 #endif
 
ttwu_activate(rq, p, en_flags);
-   ttwu_do_wakeup(rq, p, wake_flags, cookie);
+   ttwu_do_wakeup(rq, p, wake_flags, rf);
 }
 
 /*
@@ -1744,7 +1744,7 @@ static int ttwu_remote(struct task_struct *p, int 
wake_flags)
if (task_on_rq_queued(p)) {
/* check_preempt_curr() may use rq clock */
update_rq_clock(rq);
-   ttwu_do_wakeup(rq, p, wake_flags, rf.cookie);
+   ttwu_do_wakeup(rq, p, wake_flags, );
ret = 1;
}
__task_rq_unlock(rq, );
@@ -1757,15 +1757,15 @@ void sched_ttwu_pending(void)
 {
struct rq *rq = this_rq();
struct llist_node *llist = llist_del_all(>wake_list);
-   struct pin_cookie cookie;
struct task_struct *p;
unsigned long flags;
+   struct rq_flags rf;
 
if (!llist)
return;
 
raw_spin_lock_irqsave(>lock, flags);
-   cookie = lockdep_pin_lock(>lock);
+   rq_pin_lock(rq, );
 
while (llist) {

[RFC][PATCH 1/5] sched/fair: Update the rq clock before detaching tasks

2016-05-12 Thread Matt Fleming

detach_task_cfs_rq() may indirectly call rq_clock() to inform the
cpufreq code that the rq utilisation has changed. In which case, we
need to update the rq clock.

Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ingo Molnar <mi...@kernel.org>
Cc: Mel Gorman <mgor...@techsingularity.net>
Cc: Mike Galbraith <umgwanakikb...@gmail.com>
Cc: Yuyang Du <yuyang...@intel.com>
Cc: Byungchul Park <byungchul.p...@lge.com>
Cc: Rik van Riel <r...@redhat.com>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 kernel/sched/fair.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 218f8e83db73..02856647339d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8378,6 +8378,8 @@ static void detach_task_cfs_rq(struct task_struct *p)
struct sched_entity *se = >se;
struct cfs_rq *cfs_rq = cfs_rq_of(se);
 
+   update_rq_clock(task_rq(p));
+
if (!vruntime_normalized(p)) {
/*
 * Fix up our vruntime so that the current sleep doesn't
-- 
2.7.3

[RFC][PATCH 1/5] sched/fair: Update the rq clock before detaching tasks

2016-05-12 Thread Matt Fleming

detach_task_cfs_rq() may indirectly call rq_clock() to inform the
cpufreq code that the rq utilisation has changed. In which case, we
need to update the rq clock.

Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Mel Gorman 
Cc: Mike Galbraith 
Cc: Yuyang Du 
Cc: Byungchul Park 
Cc: Rik van Riel 
Cc: Frederic Weisbecker 
Signed-off-by: Matt Fleming 
---
 kernel/sched/fair.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 218f8e83db73..02856647339d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8378,6 +8378,8 @@ static void detach_task_cfs_rq(struct task_struct *p)
struct sched_entity *se = >se;
struct cfs_rq *cfs_rq = cfs_rq_of(se);
 
+   update_rq_clock(task_rq(p));
+
if (!vruntime_normalized(p)) {
/*
 * Fix up our vruntime so that the current sleep doesn't
-- 
2.7.3

[RFC][PATCH 3/5] sched/core: Reset RQCF_ACT_SKIP before unpinning rq->lock

2016-05-12 Thread Matt Fleming

rq_clock() is called from sched_info_{depart,arrive}() after resetting
RQCF_ACT_SKIP but prior to a call to update_rq_clock().

In preparation for pending patches that check whether the rq clock has
been updated inside of a pin context before rq_clock() is called, move
the reset of rq->clock_skip_update immediately before unpinning the rq
lock.

This will avoid the new warnings which check if update_rq_clock() is
being actively skipped.

Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ingo Molnar <mi...@kernel.org>
Cc: Mike Galbraith <umgwanakikb...@gmail.com>
Cc: Mel Gorman <mgor...@techsingularity.net>
Cc: Thomas Gleixner <t...@linutronix.de>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 kernel/sched/core.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index dfdd8c3c083b..d2112c268fd1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2824,6 +2824,9 @@ context_switch(struct rq *rq, struct task_struct *prev,
prev->active_mm = NULL;
rq->prev_mm = oldmm;
}
+
+   rq->clock_skip_update = 0;
+
/*
 * Since the runqueue lock will be released by the next
 * task (which is an invalid locking op but in the case
@@ -3316,7 +3319,6 @@ static void __sched notrace __schedule(bool preempt)
next = pick_next_task(rq, prev, );
clear_tsk_need_resched(prev);
clear_preempt_need_resched();
-   rq->clock_skip_update = 0;
 
if (likely(prev != next)) {
rq->nr_switches++;
@@ -3326,6 +3328,7 @@ static void __sched notrace __schedule(bool preempt)
trace_sched_switch(preempt, prev, next);
rq = context_switch(rq, prev, next, ); /* unlocks the rq */
} else {
+   rq->clock_skip_update = 0;
rq_unpin_lock(rq, );
raw_spin_unlock_irq(>lock);
}
-- 
2.7.3

[RFC][PATCH 3/5] sched/core: Reset RQCF_ACT_SKIP before unpinning rq->lock

2016-05-12 Thread Matt Fleming

rq_clock() is called from sched_info_{depart,arrive}() after resetting
RQCF_ACT_SKIP but prior to a call to update_rq_clock().

In preparation for pending patches that check whether the rq clock has
been updated inside of a pin context before rq_clock() is called, move
the reset of rq->clock_skip_update immediately before unpinning the rq
lock.

This will avoid the new warnings which check if update_rq_clock() is
being actively skipped.

Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Mike Galbraith 
Cc: Mel Gorman 
Cc: Thomas Gleixner 
Signed-off-by: Matt Fleming 
---
 kernel/sched/core.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index dfdd8c3c083b..d2112c268fd1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2824,6 +2824,9 @@ context_switch(struct rq *rq, struct task_struct *prev,
prev->active_mm = NULL;
rq->prev_mm = oldmm;
}
+
+   rq->clock_skip_update = 0;
+
/*
 * Since the runqueue lock will be released by the next
 * task (which is an invalid locking op but in the case
@@ -3316,7 +3319,6 @@ static void __sched notrace __schedule(bool preempt)
next = pick_next_task(rq, prev, );
clear_tsk_need_resched(prev);
clear_preempt_need_resched();
-   rq->clock_skip_update = 0;
 
if (likely(prev != next)) {
rq->nr_switches++;
@@ -3326,6 +3328,7 @@ static void __sched notrace __schedule(bool preempt)
trace_sched_switch(preempt, prev, next);
rq = context_switch(rq, prev, next, ); /* unlocks the rq */
} else {
+   rq->clock_skip_update = 0;
rq_unpin_lock(rq, );
raw_spin_unlock_irq(>lock);
}
-- 
2.7.3

[RFC][PATCH 4/5] sched/fair: Push rq lock pin/unpin into idle_balance()

2016-05-12 Thread Matt Fleming

Future patches will emit warnings if rq_clock() is called before
update_rq_clock() inside a rq_pin_lock()/rq_unpin_lock() pair.

Since there is only one caller of idle_balance() we can push the
unpin/repin there.

Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ingo Molnar <mi...@kernel.org>
Cc: Mike Galbraith <umgwanakikb...@gmail.com>
Cc: Mel Gorman <mgor...@techsingularity.net>
Cc: Yuyang Du <yuyang...@intel.com>
Cc: Byungchul Park <byungchul.p...@lge.com>
Cc: Rik van Riel <r...@redhat.com>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 kernel/sched/fair.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7f776a99bde0..217e3a9d78db 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3095,7 +3095,7 @@ static inline unsigned long cfs_rq_load_avg(struct cfs_rq 
*cfs_rq)
return cfs_rq->avg.load_avg;
 }
 
-static int idle_balance(struct rq *this_rq);
+static int idle_balance(struct rq *this_rq, struct rq_flags *rf);
 
 #else /* CONFIG_SMP */
 
@@ -3118,7 +3118,7 @@ attach_entity_load_avg(struct cfs_rq *cfs_rq, struct 
sched_entity *se) {}
 static inline void
 detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se) {}
 
-static inline int idle_balance(struct rq *rq)
+static inline int idle_balance(struct rq *rq, struct rq_flags *rf)
 {
return 0;
 }
@@ -5701,15 +5701,8 @@ simple:
return p;
 
 idle:
-   /*
-* This is OK, because current is on_cpu, which avoids it being picked
-* for load-balance and preemption/IRQs are still disabled avoiding
-* further scheduler activity on it and we're being very careful to
-* re-start the picking loop.
-*/
-   rq_unpin_lock(rq, rf);
-   new_tasks = idle_balance(rq);
-   rq_repin_lock(rq, rf);
+   new_tasks = idle_balance(rq, rf);
+
/*
 * Because idle_balance() releases (and re-acquires) rq->lock, it is
 * possible for any higher priority task to appear. In that case we
@@ -7641,7 +7634,7 @@ update_next_balance(struct sched_domain *sd, int 
cpu_busy, unsigned long *next_b
  * idle_balance is called by schedule() if this_cpu is about to become
  * idle. Attempts to pull tasks from other CPUs.
  */
-static int idle_balance(struct rq *this_rq)
+static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 {
unsigned long next_balance = jiffies + HZ;
int this_cpu = this_rq->cpu;
@@ -7655,6 +7648,14 @@ static int idle_balance(struct rq *this_rq)
 */
this_rq->idle_stamp = rq_clock(this_rq);
 
+   /*
+* This is OK, because current is on_cpu, which avoids it being picked
+* for load-balance and preemption/IRQs are still disabled avoiding
+* further scheduler activity on it and we're being very careful to
+* re-start the picking loop.
+*/
+   rq_unpin_lock(this_rq, rf);
+
if (this_rq->avg_idle < sysctl_sched_migration_cost ||
!this_rq->rd->overload) {
rcu_read_lock();
@@ -7732,6 +7733,8 @@ out:
if (pulled_task)
this_rq->idle_stamp = 0;
 
+   rq_repin_lock(this_rq, rf);
+
return pulled_task;
 }
 
-- 
2.7.3

[RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-12 Thread Matt Fleming

There's no diagnostic checks for figuring out when we've accidentally
missed update_rq_clock() calls. Let's add some by piggybacking on the
rq_*pin_lock() wrappers.

The idea behind the diagnostic checks is that upon pining rq lock the
rq clock should be updated, via update_rq_clock(), before anybody
reads the clock with rq_clock(). The exception to this rule is when
updates have explicitly been disabled with the rq_clock_skip_update()
optimisation.

There are some functions that only unpin the rq lock in order to grab
some other lock and avoid deadlock. In that case we don't need to
update the clock again and the previous diagnostic state can be
carried over in rq_repin_lock() by saving the state in the rq_flags
context.

Since this patch adds a new clock update flag and some already exist
in rq::clock_skip_update, that field has now been renamed. An attempt
has been made to keep the flag manipulation code small and fast since
it's used in the heart of the __schedule() fast path.

For the !CONFIG_SCHED_DEBUG case the only object code change (other
than addresses) is the following change to the two lines to reset
RQCF_ACT_SKIP inside of __schedule(),

  -   41 c7 84 24 f0 08 00movl$0x0,0x8f0(%r12)
  -   00 00 00 00 00
  +   41 83 b4 24 f0 08 00xorl$0x2,0x8f0(%r12)
  +   00 02

Suggested-by: Peter Zijlstra <pet...@infradead.org>
Cc: Ingo Molnar <mi...@kernel.org>
Cc: Mike Galbraith <umgwanakikb...@gmail.com>
Cc: Mel Gorman <mgor...@techsingularity.net>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Yuyang Du <yuyang...@intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wyso...@intel.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 kernel/sched/core.c  | 13 +++---
 kernel/sched/sched.h | 70 +++-
 2 files changed, 73 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d2112c268fd1..0999b3f23dde 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -101,9 +101,12 @@ void update_rq_clock(struct rq *rq)
 
lockdep_assert_held(>lock);
 
-   if (rq->clock_skip_update & RQCF_ACT_SKIP)
+   if (rq->clock_update_flags & RQCF_ACT_SKIP)
return;
 
+#ifdef CONFIG_SCHED_DEBUG
+   rq->clock_update_flags |= RQCF_UPDATED;
+#endif
delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
if (delta < 0)
return;
@@ -2825,7 +2828,8 @@ context_switch(struct rq *rq, struct task_struct *prev,
rq->prev_mm = oldmm;
}
 
-   rq->clock_skip_update = 0;
+   /* Clear ACT, preserve everything else */
+   rq->clock_update_flags ^= RQCF_ACT_SKIP;
 
/*
 * Since the runqueue lock will be released by the next
@@ -3287,7 +3291,7 @@ static void __sched notrace __schedule(bool preempt)
raw_spin_lock(>lock);
rq_pin_lock(rq, );
 
-   rq->clock_skip_update <<= 1; /* promote REQ to ACT */
+   rq->clock_update_flags <<= 1; /* promote REQ to ACT */
 
switch_count = >nivcsw;
if (!preempt && prev->state) {
@@ -3328,7 +3332,8 @@ static void __sched notrace __schedule(bool preempt)
trace_sched_switch(preempt, prev, next);
rq = context_switch(rq, prev, next, ); /* unlocks the rq */
} else {
-   rq->clock_skip_update = 0;
+   /* Clear ACT, preserve everything else */
+   rq->clock_update_flags ^= RQCF_ACT_SKIP;
rq_unpin_lock(rq, );
raw_spin_unlock_irq(>lock);
}
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index cbeb830c7ac6..072c020cd8e3 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -628,7 +628,7 @@ struct rq {
unsigned long next_balance;
struct mm_struct *prev_mm;
 
-   unsigned int clock_skip_update;
+   unsigned int clock_update_flags;
u64 clock;
u64 clock_task;
 
@@ -735,9 +735,45 @@ static inline u64 __rq_clock_broken(struct rq *rq)
return READ_ONCE(rq->clock);
 }
 
+/*
+ * rq::clock_update_flags bits
+ *
+ * %RQCF_REQ_SKIP - will request skipping of clock update on the next
+ *  call to __schedule(). This is an optimisation to avoid
+ *  neighbouring rq clock updates.
+ *
+ * %RQCF_ACT_SKIP - is set from inside of __schedule() when skipping is
+ *  in effect and calls to update_rq_clock() are being ignored.
+ *
+ * %RQCF_UPDATED - is a debug flag that indicates whether a call has been
+ *  made to update_rq_clock() since the last time rq::lock was pinned.
+ *
+ * If inside of __schedule(), clock_update_flags will have been
+ * shifted left (a left shift is a cheap operation for the fast path
+ * to promote %RQCF_REQ_SKIP to %RQCF_ACT_SKIP), so you must use,
+ *
+ * if (rq-clock_

[RFC][PATCH 4/5] sched/fair: Push rq lock pin/unpin into idle_balance()

2016-05-12 Thread Matt Fleming

Future patches will emit warnings if rq_clock() is called before
update_rq_clock() inside a rq_pin_lock()/rq_unpin_lock() pair.

Since there is only one caller of idle_balance() we can push the
unpin/repin there.

Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Mike Galbraith 
Cc: Mel Gorman 
Cc: Yuyang Du 
Cc: Byungchul Park 
Cc: Rik van Riel 
Cc: Frederic Weisbecker 
Signed-off-by: Matt Fleming 
---
 kernel/sched/fair.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7f776a99bde0..217e3a9d78db 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3095,7 +3095,7 @@ static inline unsigned long cfs_rq_load_avg(struct cfs_rq 
*cfs_rq)
return cfs_rq->avg.load_avg;
 }
 
-static int idle_balance(struct rq *this_rq);
+static int idle_balance(struct rq *this_rq, struct rq_flags *rf);
 
 #else /* CONFIG_SMP */
 
@@ -3118,7 +3118,7 @@ attach_entity_load_avg(struct cfs_rq *cfs_rq, struct 
sched_entity *se) {}
 static inline void
 detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se) {}
 
-static inline int idle_balance(struct rq *rq)
+static inline int idle_balance(struct rq *rq, struct rq_flags *rf)
 {
return 0;
 }
@@ -5701,15 +5701,8 @@ simple:
return p;
 
 idle:
-   /*
-* This is OK, because current is on_cpu, which avoids it being picked
-* for load-balance and preemption/IRQs are still disabled avoiding
-* further scheduler activity on it and we're being very careful to
-* re-start the picking loop.
-*/
-   rq_unpin_lock(rq, rf);
-   new_tasks = idle_balance(rq);
-   rq_repin_lock(rq, rf);
+   new_tasks = idle_balance(rq, rf);
+
/*
 * Because idle_balance() releases (and re-acquires) rq->lock, it is
 * possible for any higher priority task to appear. In that case we
@@ -7641,7 +7634,7 @@ update_next_balance(struct sched_domain *sd, int 
cpu_busy, unsigned long *next_b
  * idle_balance is called by schedule() if this_cpu is about to become
  * idle. Attempts to pull tasks from other CPUs.
  */
-static int idle_balance(struct rq *this_rq)
+static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 {
unsigned long next_balance = jiffies + HZ;
int this_cpu = this_rq->cpu;
@@ -7655,6 +7648,14 @@ static int idle_balance(struct rq *this_rq)
 */
this_rq->idle_stamp = rq_clock(this_rq);
 
+   /*
+* This is OK, because current is on_cpu, which avoids it being picked
+* for load-balance and preemption/IRQs are still disabled avoiding
+* further scheduler activity on it and we're being very careful to
+* re-start the picking loop.
+*/
+   rq_unpin_lock(this_rq, rf);
+
if (this_rq->avg_idle < sysctl_sched_migration_cost ||
!this_rq->rd->overload) {
rcu_read_lock();
@@ -7732,6 +7733,8 @@ out:
if (pulled_task)
this_rq->idle_stamp = 0;
 
+   rq_repin_lock(this_rq, rf);
+
return pulled_task;
 }
 
-- 
2.7.3

[RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()

2016-05-12 Thread Matt Fleming

There's no diagnostic checks for figuring out when we've accidentally
missed update_rq_clock() calls. Let's add some by piggybacking on the
rq_*pin_lock() wrappers.

The idea behind the diagnostic checks is that upon pining rq lock the
rq clock should be updated, via update_rq_clock(), before anybody
reads the clock with rq_clock(). The exception to this rule is when
updates have explicitly been disabled with the rq_clock_skip_update()
optimisation.

There are some functions that only unpin the rq lock in order to grab
some other lock and avoid deadlock. In that case we don't need to
update the clock again and the previous diagnostic state can be
carried over in rq_repin_lock() by saving the state in the rq_flags
context.

Since this patch adds a new clock update flag and some already exist
in rq::clock_skip_update, that field has now been renamed. An attempt
has been made to keep the flag manipulation code small and fast since
it's used in the heart of the __schedule() fast path.

For the !CONFIG_SCHED_DEBUG case the only object code change (other
than addresses) is the following change to the two lines to reset
RQCF_ACT_SKIP inside of __schedule(),

  -   41 c7 84 24 f0 08 00movl$0x0,0x8f0(%r12)
  -   00 00 00 00 00
  +   41 83 b4 24 f0 08 00xorl$0x2,0x8f0(%r12)
  +   00 02

Suggested-by: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Mike Galbraith 
Cc: Mel Gorman 
Cc: Thomas Gleixner 
Cc: Frederic Weisbecker 
Cc: Yuyang Du 
Cc: "Rafael J. Wysocki" 
Signed-off-by: Matt Fleming 
---
 kernel/sched/core.c  | 13 +++---
 kernel/sched/sched.h | 70 +++-
 2 files changed, 73 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d2112c268fd1..0999b3f23dde 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -101,9 +101,12 @@ void update_rq_clock(struct rq *rq)
 
lockdep_assert_held(>lock);
 
-   if (rq->clock_skip_update & RQCF_ACT_SKIP)
+   if (rq->clock_update_flags & RQCF_ACT_SKIP)
return;
 
+#ifdef CONFIG_SCHED_DEBUG
+   rq->clock_update_flags |= RQCF_UPDATED;
+#endif
delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
if (delta < 0)
return;
@@ -2825,7 +2828,8 @@ context_switch(struct rq *rq, struct task_struct *prev,
rq->prev_mm = oldmm;
}
 
-   rq->clock_skip_update = 0;
+   /* Clear ACT, preserve everything else */
+   rq->clock_update_flags ^= RQCF_ACT_SKIP;
 
/*
 * Since the runqueue lock will be released by the next
@@ -3287,7 +3291,7 @@ static void __sched notrace __schedule(bool preempt)
raw_spin_lock(>lock);
rq_pin_lock(rq, );
 
-   rq->clock_skip_update <<= 1; /* promote REQ to ACT */
+   rq->clock_update_flags <<= 1; /* promote REQ to ACT */
 
switch_count = >nivcsw;
if (!preempt && prev->state) {
@@ -3328,7 +3332,8 @@ static void __sched notrace __schedule(bool preempt)
trace_sched_switch(preempt, prev, next);
rq = context_switch(rq, prev, next, ); /* unlocks the rq */
} else {
-   rq->clock_skip_update = 0;
+   /* Clear ACT, preserve everything else */
+   rq->clock_update_flags ^= RQCF_ACT_SKIP;
rq_unpin_lock(rq, );
raw_spin_unlock_irq(>lock);
}
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index cbeb830c7ac6..072c020cd8e3 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -628,7 +628,7 @@ struct rq {
unsigned long next_balance;
struct mm_struct *prev_mm;
 
-   unsigned int clock_skip_update;
+   unsigned int clock_update_flags;
u64 clock;
u64 clock_task;
 
@@ -735,9 +735,45 @@ static inline u64 __rq_clock_broken(struct rq *rq)
return READ_ONCE(rq->clock);
 }
 
+/*
+ * rq::clock_update_flags bits
+ *
+ * %RQCF_REQ_SKIP - will request skipping of clock update on the next
+ *  call to __schedule(). This is an optimisation to avoid
+ *  neighbouring rq clock updates.
+ *
+ * %RQCF_ACT_SKIP - is set from inside of __schedule() when skipping is
+ *  in effect and calls to update_rq_clock() are being ignored.
+ *
+ * %RQCF_UPDATED - is a debug flag that indicates whether a call has been
+ *  made to update_rq_clock() since the last time rq::lock was pinned.
+ *
+ * If inside of __schedule(), clock_update_flags will have been
+ * shifted left (a left shift is a cheap operation for the fast path
+ * to promote %RQCF_REQ_SKIP to %RQCF_ACT_SKIP), so you must use,
+ *
+ * if (rq-clock_update_flags >= RQCF_UPDATED)
+ *
+ * to check if %RQCF_UPADTED is set. It'll never be shifted more than
+ * one position though, because the next rq_unpin_lock() will shift it
+ * back.
+ */
+#define RQCF_REQ_SKIP  0x01
+#define RQCF_ACT_SKIP  0x02
+#define RQCF_U

Re: [PATCH 1/2] Create UV efi_call macros

2016-05-12 Thread Matt Fleming

(Adding author of arch_efi_call code)

On Wed, 11 May, at 02:55:44PM, Alex Thorlton wrote:
> We need a slightly different macro than the standard efi_call_virt,
> since those macros all assume that the function pointer, f, that gets
> passed in will live somewhere in efi.systab->runtime.  Our EFI function
> pointer lives in efi.uv_systab, so we can't use the standard macros out
> of the box.
 
Is that true of all EFI services pointers? From reading the SGI/UV
code I got the impression that it's possible to call the standard
services via efi.systab->runtime but you also need the ability to call
these UV bios functions, which are not accessible via efi.systab.

Do you actually want the same environment setup and teardown to happen
when calling efi.uv_systab ? Or are you simply trying to reuse the
efi_call() implementation?

> This commit creates some new uv_* macros that are functionally
> equivalent to the standard ones, with the exception of allowing us to
> use a different function pointer.  I figure that we won't want to call
> these uv_* in the end (we'll probably want something more generic), but
> I thought I would get everyone's thoughts on how we might best reach
> that goal instead of just trying to come up with a new implementation on
> my own.
> 
> By itself, this commit does get our machines booting, but it needs the
> small fix to the efi_call assembly code for our modules to work.
 
Could you provide some more details in the changelog on why your
machine doesn't boot without this change?

> Signed-off-by: Alex Thorlton <athorl...@sgi.com>
> Cc: Dimitri Sivanich <sivan...@sgi.com>
> Cc: Russ Anderson <r...@sgi.com>
> Cc: Mike Travis <tra...@sgi.com>
> Cc: Matt Fleming <m...@codeblueprint.co.uk>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: Ingo Molnar <mi...@redhat.com>
> Cc: "H. Peter Anvin" <h...@zytor.com>
> Cc: x...@kernel.org
> Cc: linux-...@vger.kernel.org
> ---
>  arch/x86/include/asm/efi.h  |  3 ++
>  arch/x86/platform/uv/bios_uv.c  |  3 +-
>  drivers/firmware/efi/runtime-wrappers.c | 44 +-
>  include/linux/efi.h | 55 
> +
>  4 files changed, 60 insertions(+), 45 deletions(-)
> 
> diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
> index 78d1e74..f384047 100644
> --- a/arch/x86/include/asm/efi.h
> +++ b/arch/x86/include/asm/efi.h
> @@ -84,6 +84,9 @@ struct efi_scratch {
>  #define arch_efi_call_virt(f, args...)   
> \
>   efi_call((void *)efi.systab->runtime->f, args)  \
>  
> +#define uv_efi_call_virt(f, args...) \
> + efi_call((void *)f, args)   \
> +
>  #define arch_efi_call_virt_teardown()
> \
>  ({   \
>   if (efi_scratch.use_pgd) {  \
> diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
> index 815fec6..62a46cb 100644
> --- a/arch/x86/platform/uv/bios_uv.c
> +++ b/arch/x86/platform/uv/bios_uv.c
> @@ -40,8 +40,7 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, 
> u64 a3, u64 a4, u64 a5)
>*/
>   return BIOS_STATUS_UNIMPLEMENTED;
>  
> - ret = efi_call((void *)__va(tab->function), (u64)which,
> - a1, a2, a3, a4, a5);
> + ret = uv_call_virt(tab->function, (u64)which, a1, a2, a3, a4, a5);
>   return ret;
>  }
>  EXPORT_SYMBOL_GPL(uv_bios_call);
> diff --git a/drivers/firmware/efi/runtime-wrappers.c 
> b/drivers/firmware/efi/runtime-wrappers.c
> index 23bef6b..008a1a3 100644
> --- a/drivers/firmware/efi/runtime-wrappers.c
> +++ b/drivers/firmware/efi/runtime-wrappers.c
> @@ -22,7 +22,7 @@
>  #include 
>  #include 
>  
> -static void efi_call_virt_check_flags(unsigned long flags, const char *call)
> +void efi_call_virt_check_flags(unsigned long flags, const char *call)
>  {
>   unsigned long cur_flags, mismatch;
>  
> @@ -39,48 +39,6 @@ static void efi_call_virt_check_flags(unsigned long flags, 
> const char *call)
>  }
>  
>  /*
> - * Arch code can implement the following three template macros, avoiding
> - * reptition for the void/non-void return cases of {__,}efi_call_virt:
> - *
> - *  * arch_efi_call_virt_setup
> - *
> - *Sets up the environment for the call (e.g. switching page tables,
> - *allowing kernel-mode use of floating point, if required).
> - *
> - *  *

Re: [PATCH 1/2] Create UV efi_call macros

2016-05-12 Thread Matt Fleming

(Adding author of arch_efi_call code)

On Wed, 11 May, at 02:55:44PM, Alex Thorlton wrote:
> We need a slightly different macro than the standard efi_call_virt,
> since those macros all assume that the function pointer, f, that gets
> passed in will live somewhere in efi.systab->runtime.  Our EFI function
> pointer lives in efi.uv_systab, so we can't use the standard macros out
> of the box.
 
Is that true of all EFI services pointers? From reading the SGI/UV
code I got the impression that it's possible to call the standard
services via efi.systab->runtime but you also need the ability to call
these UV bios functions, which are not accessible via efi.systab.

Do you actually want the same environment setup and teardown to happen
when calling efi.uv_systab ? Or are you simply trying to reuse the
efi_call() implementation?

> This commit creates some new uv_* macros that are functionally
> equivalent to the standard ones, with the exception of allowing us to
> use a different function pointer.  I figure that we won't want to call
> these uv_* in the end (we'll probably want something more generic), but
> I thought I would get everyone's thoughts on how we might best reach
> that goal instead of just trying to come up with a new implementation on
> my own.
> 
> By itself, this commit does get our machines booting, but it needs the
> small fix to the efi_call assembly code for our modules to work.
 
Could you provide some more details in the changelog on why your
machine doesn't boot without this change?

> Signed-off-by: Alex Thorlton 
> Cc: Dimitri Sivanich 
> Cc: Russ Anderson 
> Cc: Mike Travis 
> Cc: Matt Fleming 
> Cc: Borislav Petkov 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: x...@kernel.org
> Cc: linux-...@vger.kernel.org
> ---
>  arch/x86/include/asm/efi.h  |  3 ++
>  arch/x86/platform/uv/bios_uv.c  |  3 +-
>  drivers/firmware/efi/runtime-wrappers.c | 44 +-
>  include/linux/efi.h | 55 
> +
>  4 files changed, 60 insertions(+), 45 deletions(-)
> 
> diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
> index 78d1e74..f384047 100644
> --- a/arch/x86/include/asm/efi.h
> +++ b/arch/x86/include/asm/efi.h
> @@ -84,6 +84,9 @@ struct efi_scratch {
>  #define arch_efi_call_virt(f, args...)   
> \
>   efi_call((void *)efi.systab->runtime->f, args)  \
>  
> +#define uv_efi_call_virt(f, args...) \
> + efi_call((void *)f, args)   \
> +
>  #define arch_efi_call_virt_teardown()
> \
>  ({   \
>   if (efi_scratch.use_pgd) {  \
> diff --git a/arch/x86/platform/uv/bios_uv.c b/arch/x86/platform/uv/bios_uv.c
> index 815fec6..62a46cb 100644
> --- a/arch/x86/platform/uv/bios_uv.c
> +++ b/arch/x86/platform/uv/bios_uv.c
> @@ -40,8 +40,7 @@ s64 uv_bios_call(enum uv_bios_cmd which, u64 a1, u64 a2, 
> u64 a3, u64 a4, u64 a5)
>*/
>   return BIOS_STATUS_UNIMPLEMENTED;
>  
> - ret = efi_call((void *)__va(tab->function), (u64)which,
> - a1, a2, a3, a4, a5);
> + ret = uv_call_virt(tab->function, (u64)which, a1, a2, a3, a4, a5);
>   return ret;
>  }
>  EXPORT_SYMBOL_GPL(uv_bios_call);
> diff --git a/drivers/firmware/efi/runtime-wrappers.c 
> b/drivers/firmware/efi/runtime-wrappers.c
> index 23bef6b..008a1a3 100644
> --- a/drivers/firmware/efi/runtime-wrappers.c
> +++ b/drivers/firmware/efi/runtime-wrappers.c
> @@ -22,7 +22,7 @@
>  #include 
>  #include 
>  
> -static void efi_call_virt_check_flags(unsigned long flags, const char *call)
> +void efi_call_virt_check_flags(unsigned long flags, const char *call)
>  {
>   unsigned long cur_flags, mismatch;
>  
> @@ -39,48 +39,6 @@ static void efi_call_virt_check_flags(unsigned long flags, 
> const char *call)
>  }
>  
>  /*
> - * Arch code can implement the following three template macros, avoiding
> - * reptition for the void/non-void return cases of {__,}efi_call_virt:
> - *
> - *  * arch_efi_call_virt_setup
> - *
> - *Sets up the environment for the call (e.g. switching page tables,
> - *allowing kernel-mode use of floating point, if required).
> - *
> - *  * arch_efi_call_virt
> - *
> - *Performs the call. The last expression in the macro must be the call
> - *itself, allowing the logic to be shared by the void and non-void
> - *cases.
> - *
> - *  * arch_efi_call_virt_teardown
> - *
> -

Re: [PATCH 2/2] Fix efi_call

2016-05-12 Thread Matt Fleming

On Thu, 12 May, at 08:48:35AM, Ingo Molnar wrote:
> 
> * Alex Thorlton  wrote:
> 
> > The efi_call assembly code has a slight error that prevents us from
> > using arguments 7 and higher, which will be passed in on the stack.
> > 
> > mov (%rsp), %rax
> > mov 8(%rax), %rax
> > ...
> > mov %rax, 40(%rsp)
> > 
> > This code goes and grabs the return address for the current stack frame,
> > and puts it on the stack, next the 5th argument for the EFI runtime
> > call.  Considering the fact that having the return address in that
> > position on the stack makes no sense, I'm guessing that the intent of
> > this code was actually to grab an argument off the stack frame for this
> > call and place it into the frame for the next one.
> > 
> > The small change to that offset (i.e. 8(%rax) to 16(%rax)) ensures that
> > we grab the 7th argument off the stack, and pass it as the 6th argument
> > to the EFI runtime function that we're about to call.  This change gets
> > our EFI runtime calls that need to pass more than 6 arguments working
> > again.
> 
> I suppose the SGI/UV code is the only one using 7 arguments or more? Might 
> make 
> sense to point that out in the changelog.
 
Yeah, I included that info when I applied this patch.

Re: [PATCH 2/2] Fix efi_call

2016-05-12 Thread Matt Fleming

On Thu, 12 May, at 08:48:35AM, Ingo Molnar wrote:
> 
> * Alex Thorlton  wrote:
> 
> > The efi_call assembly code has a slight error that prevents us from
> > using arguments 7 and higher, which will be passed in on the stack.
> > 
> > mov (%rsp), %rax
> > mov 8(%rax), %rax
> > ...
> > mov %rax, 40(%rsp)
> > 
> > This code goes and grabs the return address for the current stack frame,
> > and puts it on the stack, next the 5th argument for the EFI runtime
> > call.  Considering the fact that having the return address in that
> > position on the stack makes no sense, I'm guessing that the intent of
> > this code was actually to grab an argument off the stack frame for this
> > call and place it into the frame for the next one.
> > 
> > The small change to that offset (i.e. 8(%rax) to 16(%rax)) ensures that
> > we grab the 7th argument off the stack, and pass it as the 6th argument
> > to the EFI runtime function that we're about to call.  This change gets
> > our EFI runtime calls that need to pass more than 6 arguments working
> > again.
> 
> I suppose the SGI/UV code is the only one using 7 arguments or more? Might 
> make 
> sense to point that out in the changelog.
 
Yeah, I included that info when I applied this patch.

Re: [PATCH 2/2] Fix efi_call

2016-05-12 Thread Matt Fleming

On Wed, 11 May, at 02:55:45PM, Alex Thorlton wrote:
> The efi_call assembly code has a slight error that prevents us from
> using arguments 7 and higher, which will be passed in on the stack.
> 
> mov (%rsp), %rax
> mov 8(%rax), %rax
>   ...
> mov %rax, 40(%rsp)
> 
> This code goes and grabs the return address for the current stack frame,
> and puts it on the stack, next the 5th argument for the EFI runtime
> call.  Considering the fact that having the return address in that
> position on the stack makes no sense, I'm guessing that the intent of
> this code was actually to grab an argument off the stack frame for this
> call and place it into the frame for the next one.
> 
> The small change to that offset (i.e. 8(%rax) to 16(%rax)) ensures that
> we grab the 7th argument off the stack, and pass it as the 6th argument
> to the EFI runtime function that we're about to call.  This change gets
> our EFI runtime calls that need to pass more than 6 arguments working
> again.
> 
> Signed-off-by: Alex Thorlton <athorl...@sgi.com>
> Cc: Dimitri Sivanich <sivan...@sgi.com>
> Cc: Russ Anderson <r...@sgi.com>
> Cc: Mike Travis <tra...@sgi.com>
> Cc: Matt Fleming <m...@codeblueprint.co.uk>
> Cc: Borislav Petkov <b...@suse.de>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: Ingo Molnar <mi...@redhat.com>
> Cc: "H. Peter Anvin" <h...@zytor.com>
> Cc: x...@kernel.org
> Cc: linux-...@vger.kernel.org
> ---
>  arch/x86/platform/efi/efi_stub_64.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/platform/efi/efi_stub_64.S 
> b/arch/x86/platform/efi/efi_stub_64.S
> index 92723ae..62938ff 100644
> --- a/arch/x86/platform/efi/efi_stub_64.S
> +++ b/arch/x86/platform/efi/efi_stub_64.S
> @@ -43,7 +43,7 @@ ENTRY(efi_call)
>   FRAME_BEGIN
>   SAVE_XMM
>   mov (%rsp), %rax
> - mov 8(%rax), %rax
> + mov 16(%rax), %rax
>   subq $48, %rsp
>   mov %r9, 32(%rsp)
>   mov %rax, 40(%rsp)

Nice. Your fix looks good, so I've put it in the urgent queue and
tagged it for stable.

Re: [PATCH 2/2] Fix efi_call

2016-05-12 Thread Matt Fleming

On Wed, 11 May, at 02:55:45PM, Alex Thorlton wrote:
> The efi_call assembly code has a slight error that prevents us from
> using arguments 7 and higher, which will be passed in on the stack.
> 
> mov (%rsp), %rax
> mov 8(%rax), %rax
>   ...
> mov %rax, 40(%rsp)
> 
> This code goes and grabs the return address for the current stack frame,
> and puts it on the stack, next the 5th argument for the EFI runtime
> call.  Considering the fact that having the return address in that
> position on the stack makes no sense, I'm guessing that the intent of
> this code was actually to grab an argument off the stack frame for this
> call and place it into the frame for the next one.
> 
> The small change to that offset (i.e. 8(%rax) to 16(%rax)) ensures that
> we grab the 7th argument off the stack, and pass it as the 6th argument
> to the EFI runtime function that we're about to call.  This change gets
> our EFI runtime calls that need to pass more than 6 arguments working
> again.
> 
> Signed-off-by: Alex Thorlton 
> Cc: Dimitri Sivanich 
> Cc: Russ Anderson 
> Cc: Mike Travis 
> Cc: Matt Fleming 
> Cc: Borislav Petkov 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: x...@kernel.org
> Cc: linux-...@vger.kernel.org
> ---
>  arch/x86/platform/efi/efi_stub_64.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/platform/efi/efi_stub_64.S 
> b/arch/x86/platform/efi/efi_stub_64.S
> index 92723ae..62938ff 100644
> --- a/arch/x86/platform/efi/efi_stub_64.S
> +++ b/arch/x86/platform/efi/efi_stub_64.S
> @@ -43,7 +43,7 @@ ENTRY(efi_call)
>   FRAME_BEGIN
>   SAVE_XMM
>   mov (%rsp), %rax
> - mov 8(%rax), %rax
> + mov 16(%rax), %rax
>   subq $48, %rsp
>   mov %r9, 32(%rsp)
>   mov %rax, 40(%rsp)

Nice. Your fix looks good, so I've put it in the urgent queue and
tagged it for stable.

Re: [PATCH 0/3] sched: Fix wakeup preemption regression

2016-05-12 Thread Matt Fleming

On Tue, 10 May, at 07:43:14PM, Peter Zijlstra wrote:
> A recent commit caused an interactivity/starvation issue because we wrecked rq
> local wakeup preemption.
> 
> These patches rectify this while also (hopefully) keeping the problem that led
> to the fault patch fixed.
> 
> Mike, Pavan, could you guys please confirm?
 
FWIW, I took a quick look over these patches and they made sense to
me. (I appreciate the comment block above enqueue_entity())

Re: [PATCH 0/3] sched: Fix wakeup preemption regression

2016-05-12 Thread Matt Fleming

On Tue, 10 May, at 07:43:14PM, Peter Zijlstra wrote:
> A recent commit caused an interactivity/starvation issue because we wrecked rq
> local wakeup preemption.
> 
> These patches rectify this while also (hopefully) keeping the problem that led
> to the fault patch fixed.
> 
> Mike, Pavan, could you guys please confirm?
 
FWIW, I took a quick look over these patches and they made sense to
me. (I appreciate the comment block above enqueue_entity())

Re: [PATCH 1/3] sched,fair: Move record_wakee()

2016-05-12 Thread Matt Fleming

On Tue, 10 May, at 07:43:15PM, Peter Zijlstra wrote:
> Since I want to make ->task_woken() conditional on the task getting
> migrated, we cannot use it to call record_wakee().
 
You mean ->task_waking(), right?

Re: [PATCH 1/3] sched,fair: Move record_wakee()

2016-05-12 Thread Matt Fleming

On Tue, 10 May, at 07:43:15PM, Peter Zijlstra wrote:
> Since I want to make ->task_woken() conditional on the task getting
> migrated, we cannot use it to call record_wakee().
 
You mean ->task_waking(), right?

Re: [PATCH] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-05-12 Thread Matt Fleming

On Thu, 12 May, at 10:22:07AM, Shannon Zhao wrote:
> 
> As said above, I will rebase this patch on top of the EFI next branch.

OK thanks.

Note that it is not possible to escape merge conflicts, since there
are changes in the xen tip tree that are not in the EFI next branch
and vice versa.

For example these commits from xen/linux-next look relevant,

  8e147fcc3ffa ("FDT: Add a helper to get the subnode by given name")
  37060935dc04 ("ARM64: XEN: Add a function to initialize Xen specific UEFI 
runtime services")
  acb2c923a860 ("XEN: EFI: Move x86 specific codes to architecture directory")
  055be2d42408 ("ARM: Xen: Document UEFI support on Xen ARM virtual platforms")
  3915fea959b6 ("ARM: XEN: Move xen_early_init() before efi_init()")

Linus, Stefano, tip-folks: I'm soliciting opinions on the correct way
to handle this inter-tree dependency where we've got changes to EFI
code in two separate trees and Shannon wants to write patches on top
of both.

I'm guessing the best solution is to merge xen/linux-next and efi/next
into a topic branch either in the EFI tree or Xen tree, but I want to
be cautious of the branch history that will create.

(In hindsight all of these change should have probably gone via the
EFI tree.)

Re: [PATCH] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-05-12 Thread Matt Fleming

On Thu, 12 May, at 10:22:07AM, Shannon Zhao wrote:
> 
> As said above, I will rebase this patch on top of the EFI next branch.

OK thanks.

Note that it is not possible to escape merge conflicts, since there
are changes in the xen tip tree that are not in the EFI next branch
and vice versa.

For example these commits from xen/linux-next look relevant,

  8e147fcc3ffa ("FDT: Add a helper to get the subnode by given name")
  37060935dc04 ("ARM64: XEN: Add a function to initialize Xen specific UEFI 
runtime services")
  acb2c923a860 ("XEN: EFI: Move x86 specific codes to architecture directory")
  055be2d42408 ("ARM: Xen: Document UEFI support on Xen ARM virtual platforms")
  3915fea959b6 ("ARM: XEN: Move xen_early_init() before efi_init()")

Linus, Stefano, tip-folks: I'm soliciting opinions on the correct way
to handle this inter-tree dependency where we've got changes to EFI
code in two separate trees and Shannon wants to write patches on top
of both.

I'm guessing the best solution is to merge xen/linux-next and efi/next
into a topic branch either in the EFI tree or Xen tree, but I want to
be cautious of the branch history that will create.

(In hindsight all of these change should have probably gone via the
EFI tree.)

Re: [PATCH] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-05-11 Thread Matt Fleming

On Wed, 11 May, at 09:35:47PM, Shannon Zhao wrote:
> > 
> > Also, why do you need to setup efi.runtime_version here? Isn't that
> > done inside uefi_init()?
> > 
> I don't see any codes which setup efi.runtime_version in uefi_init().
 
Look in the EFI tree,

  
https://git.kernel.org/cgit/linux/kernel/git/mfleming/efi.git/tree/drivers/firmware/efi/arm-init.c?h=next#n120

> > Here is how I would have expected this patch to look:
> > 
> >   - Add FDT code to find hypervisor EFI params
> > 
> >   - Force enable EFI_RUNTIME_SERVICES for Xen and call
> > xen_efi_runtime_setup() inside xen_guest_init()
> > 
> >   - Change arm_enable_runtime_services() to handle the case where
> > EFI_RUNTIME_SERVICES is already set
> > 
> > Am I misunderstanding some ordering issues?
> 
> Since xen_guest_init() and arm_enable_runtime_services() are both
> early_initcall and the calling order is random I think.

Urgh, right, I missed that.

> And when we call xen_efi_runtime_setup() and setup
> efi.runtime_version in xen_guest_init(), the efi.systab might be
> NULL. So it needs to map the systanle explicitly before we use
> efi.systab.

Could you explain why you need to copy efi.runtime_version instead of
letting the existing code in uefi_init() set it up?

Also, efi.systab isn't used by xen_efi_runtime_setup() as far I can
see, not sure why you need that either?

Re: [PATCH] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-05-11 Thread Matt Fleming

On Wed, 11 May, at 09:35:47PM, Shannon Zhao wrote:
> > 
> > Also, why do you need to setup efi.runtime_version here? Isn't that
> > done inside uefi_init()?
> > 
> I don't see any codes which setup efi.runtime_version in uefi_init().
 
Look in the EFI tree,

  
https://git.kernel.org/cgit/linux/kernel/git/mfleming/efi.git/tree/drivers/firmware/efi/arm-init.c?h=next#n120

> > Here is how I would have expected this patch to look:
> > 
> >   - Add FDT code to find hypervisor EFI params
> > 
> >   - Force enable EFI_RUNTIME_SERVICES for Xen and call
> > xen_efi_runtime_setup() inside xen_guest_init()
> > 
> >   - Change arm_enable_runtime_services() to handle the case where
> > EFI_RUNTIME_SERVICES is already set
> > 
> > Am I misunderstanding some ordering issues?
> 
> Since xen_guest_init() and arm_enable_runtime_services() are both
> early_initcall and the calling order is random I think.

Urgh, right, I missed that.

> And when we call xen_efi_runtime_setup() and setup
> efi.runtime_version in xen_guest_init(), the efi.systab might be
> NULL. So it needs to map the systanle explicitly before we use
> efi.systab.

Could you explain why you need to copy efi.runtime_version instead of
letting the existing code in uefi_init() set it up?

Also, efi.systab isn't used by xen_efi_runtime_setup() as far I can
see, not sure why you need that either?

Re: [RFC][PATCH 4/7] sched: Replace sd_busy/nr_busy_cpus with sched_domain_shared

2016-05-11 Thread Matt Fleming

On Wed, 11 May, at 07:37:52PM, Peter Zijlstra wrote:
> On Wed, May 11, 2016 at 12:55:56PM +0100, Matt Fleming wrote:
> 
> > This breaks my POWER7 box which presumably doesn't have 
> > SD_SHARE_PKG_RESOURCES,
> 
> > index 978b3ef2d87e..d27153adee4d 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7920,7 +7920,8 @@ static inline void set_cpu_sd_state_busy(void)
> > goto unlock;
> > sd->nohz_idle = 0;
> >  
> > -   atomic_inc(>shared->nr_busy_cpus);
> > +   if (sd->shared)
> > +   atomic_inc(>shared->nr_busy_cpus);
> >  unlock:
> > rcu_read_unlock();
> >  }
> 
> 
> Ah, no, the problem is that while it does have SHARE_PKG_RESOURCES (in
> its SMT domain -- SMT threads share all cache after all), I failed to
> connect the sched_domain_shared structure for it.
> 
> Does something like this also work?

Yep, it does.

Re: [RFC][PATCH 4/7] sched: Replace sd_busy/nr_busy_cpus with sched_domain_shared

2016-05-11 Thread Matt Fleming

On Wed, 11 May, at 07:37:52PM, Peter Zijlstra wrote:
> On Wed, May 11, 2016 at 12:55:56PM +0100, Matt Fleming wrote:
> 
> > This breaks my POWER7 box which presumably doesn't have 
> > SD_SHARE_PKG_RESOURCES,
> 
> > index 978b3ef2d87e..d27153adee4d 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7920,7 +7920,8 @@ static inline void set_cpu_sd_state_busy(void)
> > goto unlock;
> > sd->nohz_idle = 0;
> >  
> > -   atomic_inc(>shared->nr_busy_cpus);
> > +   if (sd->shared)
> > +   atomic_inc(>shared->nr_busy_cpus);
> >  unlock:
> > rcu_read_unlock();
> >  }
> 
> 
> Ah, no, the problem is that while it does have SHARE_PKG_RESOURCES (in
> its SMT domain -- SMT threads share all cache after all), I failed to
> connect the sched_domain_shared structure for it.
> 
> Does something like this also work?

Yep, it does.

Re: [PATCH 2/5] efibc: Fix excessive stack footprint warning

2016-05-11 Thread Matt Fleming

On Tue, 10 May, at 10:40:22AM, Jeremy Compostella wrote:
> Why not.  See patch as attachment.
> 
> Thanks,
> 
> Jérémy
> 

> From 8a9b07e2d7242fa8a36157f1025202a96c3c7c9a Mon Sep 17 00:00:00 2001
> From: Jeremy Compostella 
> Date: Tue, 10 May 2016 10:34:21 +0200
> Subject: [PATCH] efibc: report the EFI variable name in the error messages
> 
> Report the name of the EFI variable if the value is incorrect or if
> efibc_set_variable() fails to allocate the struct efivar_entry object.
> 
> Signed-off-by: Jeremy Compostella 
> ---
>  drivers/firmware/efi/efibc.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/firmware/efi/efibc.c b/drivers/firmware/efi/efibc.c
> index cb4f573..93d34a1 100644
> --- a/drivers/firmware/efi/efibc.c
> +++ b/drivers/firmware/efi/efibc.c
> @@ -37,13 +37,14 @@ static int efibc_set_variable(const char *name, const 
> char *value)
>   size_t size = (strlen(value) + 1) * sizeof(efi_char16_t);
>  
>   if (size > sizeof(entry->var.Data)) {
> - pr_err("value is too large");
> + pr_err("value is too large for %s EFI variable", name);
>   return -EINVAL;
>   }

It'd be a good idea to print 'size' too.

>  
>   entry = kmalloc(sizeof(*entry), GFP_KERNEL);
>   if (!entry) {
> - pr_err("failed to allocate efivar entry");
> + pr_err("failed to allocate efivar entry for %s EFI variable",
> +name);
>   return -ENOMEM;
>   }

Aren't these pr_err() calls missing newline characters?

Re: [PATCH 2/5] efibc: Fix excessive stack footprint warning

2016-05-11 Thread Matt Fleming

On Tue, 10 May, at 10:40:22AM, Jeremy Compostella wrote:
> Why not.  See patch as attachment.
> 
> Thanks,
> 
> Jérémy
> 

> From 8a9b07e2d7242fa8a36157f1025202a96c3c7c9a Mon Sep 17 00:00:00 2001
> From: Jeremy Compostella 
> Date: Tue, 10 May 2016 10:34:21 +0200
> Subject: [PATCH] efibc: report the EFI variable name in the error messages
> 
> Report the name of the EFI variable if the value is incorrect or if
> efibc_set_variable() fails to allocate the struct efivar_entry object.
> 
> Signed-off-by: Jeremy Compostella 
> ---
>  drivers/firmware/efi/efibc.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/firmware/efi/efibc.c b/drivers/firmware/efi/efibc.c
> index cb4f573..93d34a1 100644
> --- a/drivers/firmware/efi/efibc.c
> +++ b/drivers/firmware/efi/efibc.c
> @@ -37,13 +37,14 @@ static int efibc_set_variable(const char *name, const 
> char *value)
>   size_t size = (strlen(value) + 1) * sizeof(efi_char16_t);
>  
>   if (size > sizeof(entry->var.Data)) {
> - pr_err("value is too large");
> + pr_err("value is too large for %s EFI variable", name);
>   return -EINVAL;
>   }

It'd be a good idea to print 'size' too.

>  
>   entry = kmalloc(sizeof(*entry), GFP_KERNEL);
>   if (!entry) {
> - pr_err("failed to allocate efivar entry");
> + pr_err("failed to allocate efivar entry for %s EFI variable",
> +name);
>   return -ENOMEM;
>   }

Aren't these pr_err() calls missing newline characters?

Re: [PATCH] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-05-11 Thread Matt Fleming

On Fri, 06 May, at 09:52:42AM, Mathieu Poirier wrote:
> > +static int __init efi_remap_init(void)
> > +{
> > +   u64 mapsize;
> > +
> > +   pr_info("Remapping and enabling EFI services.\n");
> > +
> > +   mapsize = memmap.map_end - memmap.map;
> > +   memmap.map = (__force void *)ioremap_cache(memmap.phys_map,
> > +  mapsize);
> > +   if (!memmap.map) {
> > +   pr_err("Failed to remap EFI memory map\n");
> > +   return -ENOMEM;
> > +   }
> > +   memmap.map_end = memmap.map + mapsize;
> > +   efi.memmap = 
> > +
> > +   efi.systab = (__force void *)ioremap_cache(efi_system_table,
> > +  
> > sizeof(efi_system_table_t));
> > +   if (!efi.systab) {
> 
> Is there a reason why memmap.map isn't iounmap() in the error path?
> The original code didn't have it either, hence the question.

I suspect that is a bug.

In reality, if you can't access the EFI System Table because you fail
to map it I would guess you are going to crash very, very quickly
anyhow.

Re: [PATCH] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-05-11 Thread Matt Fleming

On Fri, 06 May, at 09:52:42AM, Mathieu Poirier wrote:
> > +static int __init efi_remap_init(void)
> > +{
> > +   u64 mapsize;
> > +
> > +   pr_info("Remapping and enabling EFI services.\n");
> > +
> > +   mapsize = memmap.map_end - memmap.map;
> > +   memmap.map = (__force void *)ioremap_cache(memmap.phys_map,
> > +  mapsize);
> > +   if (!memmap.map) {
> > +   pr_err("Failed to remap EFI memory map\n");
> > +   return -ENOMEM;
> > +   }
> > +   memmap.map_end = memmap.map + mapsize;
> > +   efi.memmap = 
> > +
> > +   efi.systab = (__force void *)ioremap_cache(efi_system_table,
> > +  
> > sizeof(efi_system_table_t));
> > +   if (!efi.systab) {
> 
> Is there a reason why memmap.map isn't iounmap() in the error path?
> The original code didn't have it either, hence the question.

I suspect that is a bug.

In reality, if you can't access the EFI System Table because you fail
to map it I would guess you are going to crash very, very quickly
anyhow.

Re: [PATCH] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-05-11 Thread Matt Fleming

On Fri, 06 May, at 04:32:06PM, Shannon Zhao wrote:
> From: Shannon Zhao <shannon.z...@linaro.org>
> 
> Add a new function to parse DT parameters for Xen specific UEFI just
> like the way for normal UEFI. Then it could reuse the existing codes.
> 
> If Xen supports EFI, initialize runtime services.

This commit log would benefit from a little expansion. I'd like to see
information that answers the following questions,

 - How do the Xen DT paramters differ from bare metal?
 - What existing code is reused with your patch?
 - How are Xen runtime services different from bare metal?
 - Why is it OK to force enable EFI runtime services for Xen?

I think it would also be good to explicitly state that we do not
expect to find both Xen EFI DT parameters and bare metal EFI DT
parameters when performing the search; the two should be mutually
exclusive.

> CC: Matt Fleming <m...@codeblueprint.co.uk>
> Signed-off-by: Shannon Zhao <shannon.z...@linaro.org>
> ---
> Drop using of EFI_PARAVIRT. Discussion can be found from [1].
> [1] https://lkml.org/lkml/2016/4/29/8
> ---
>  arch/arm/include/asm/efi.h |  2 +
>  arch/arm/xen/enlighten.c   | 16 
>  arch/arm64/include/asm/efi.h   |  2 +
>  drivers/firmware/efi/arm-runtime.c | 70 +++-
>  drivers/firmware/efi/efi.c | 81 
> ++
>  5 files changed, 137 insertions(+), 34 deletions(-)
> 
> diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h
> index e0eea72..5ba4343 100644
> --- a/arch/arm/include/asm/efi.h
> +++ b/arch/arm/include/asm/efi.h
> @@ -52,9 +52,11 @@ static inline void efi_set_pgd(struct mm_struct *mm)
>  
>  void efi_virtmap_load(void);
>  void efi_virtmap_unload(void);
> +int xen_arm_enable_runtime_services(void);
>  
>  #else
>  #define efi_init()
> +#define xen_arm_enable_runtime_services() (0)
>  #endif /* CONFIG_EFI */
>  
>  /* arch specific definitions used by the stub code */
> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index 13e3e9f..3362870 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -16,6 +16,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -261,6 +262,13 @@ static int __init fdt_find_hyper_node(unsigned long 
> node, const char *uname,
>   !strncmp(hyper_node.prefix, s, strlen(hyper_node.prefix)))
>   hyper_node.version = s + strlen(hyper_node.prefix);
>  
> + if (IS_ENABLED(CONFIG_XEN_EFI)) {
> + /* Check if Xen supports EFI */
> + if ((of_get_flat_dt_subnode_by_name(node, "uefi") > 0) &&
> + !efi_runtime_disabled())
> + set_bit(EFI_RUNTIME_SERVICES, );
> + }
> +
>   return 0;
>  }
>  

The above comment could do with including similar information to the
commit log as to why we want to force enable runtime services. For x86
we have this,

 *
 * When EFI_PARAVIRT is in force then we could not map runtime
 * service memory region because we do not have direct access to it.
 * However, runtime services are available through proxy functions
 * (e.g. in case of Xen dom0 EFI implementation they call special
 * hypercall which executes relevant EFI functions) and that is why
 * they are always enabled.
 */

We need something similar here.

> @@ -338,6 +346,7 @@ static int __init xen_guest_init(void)
>  {
>   struct xen_add_to_physmap xatp;
>   struct shared_info *shared_info_page = NULL;
> + int ret;
>  
>   if (!xen_domain())
>   return 0;
> @@ -352,6 +361,13 @@ static int __init xen_guest_init(void)
>   return -ENODEV;
>   }
>  
> + if (IS_ENABLED(CONFIG_XEN_EFI) && efi_enabled(EFI_BOOT) &&
> + efi_enabled(EFI_RUNTIME_SERVICES)) {
> + ret = xen_arm_enable_runtime_services();
> + if (ret)
> + return ret;
> + }
> +

Is it ever possible to have EFI_RUNTIME_SERVICES set but
efi_enabled(EFI_BOOT) and IS_ENABLED(CONFIG_XEN_EFI) be false inside
this function? If the answer is "no", I'd suggest just reducing this
down to,

/*
 * The fdt parsing code will have set EFI_RUNTIME_SERVICES if
 * it found Xen EFI parameters. Force enable runtime services.
 */ 
if (efi_enabled(EFI_RUNTIME_SERVICES)) {
ret = xen_arm_enable_runtime_services();
if (ret)
return ret;
}

But first, see my comments below.

> @@ -39,6 +40,33 @@ static struct mm_struct efi_mm = {
>

Re: [PATCH] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-05-11 Thread Matt Fleming

On Fri, 06 May, at 04:32:06PM, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Add a new function to parse DT parameters for Xen specific UEFI just
> like the way for normal UEFI. Then it could reuse the existing codes.
> 
> If Xen supports EFI, initialize runtime services.

This commit log would benefit from a little expansion. I'd like to see
information that answers the following questions,

 - How do the Xen DT paramters differ from bare metal?
 - What existing code is reused with your patch?
 - How are Xen runtime services different from bare metal?
 - Why is it OK to force enable EFI runtime services for Xen?

I think it would also be good to explicitly state that we do not
expect to find both Xen EFI DT parameters and bare metal EFI DT
parameters when performing the search; the two should be mutually
exclusive.

> CC: Matt Fleming 
> Signed-off-by: Shannon Zhao 
> ---
> Drop using of EFI_PARAVIRT. Discussion can be found from [1].
> [1] https://lkml.org/lkml/2016/4/29/8
> ---
>  arch/arm/include/asm/efi.h |  2 +
>  arch/arm/xen/enlighten.c   | 16 
>  arch/arm64/include/asm/efi.h   |  2 +
>  drivers/firmware/efi/arm-runtime.c | 70 +++-
>  drivers/firmware/efi/efi.c | 81 
> ++
>  5 files changed, 137 insertions(+), 34 deletions(-)
> 
> diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h
> index e0eea72..5ba4343 100644
> --- a/arch/arm/include/asm/efi.h
> +++ b/arch/arm/include/asm/efi.h
> @@ -52,9 +52,11 @@ static inline void efi_set_pgd(struct mm_struct *mm)
>  
>  void efi_virtmap_load(void);
>  void efi_virtmap_unload(void);
> +int xen_arm_enable_runtime_services(void);
>  
>  #else
>  #define efi_init()
> +#define xen_arm_enable_runtime_services() (0)
>  #endif /* CONFIG_EFI */
>  
>  /* arch specific definitions used by the stub code */
> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index 13e3e9f..3362870 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -16,6 +16,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -261,6 +262,13 @@ static int __init fdt_find_hyper_node(unsigned long 
> node, const char *uname,
>   !strncmp(hyper_node.prefix, s, strlen(hyper_node.prefix)))
>   hyper_node.version = s + strlen(hyper_node.prefix);
>  
> + if (IS_ENABLED(CONFIG_XEN_EFI)) {
> + /* Check if Xen supports EFI */
> + if ((of_get_flat_dt_subnode_by_name(node, "uefi") > 0) &&
> + !efi_runtime_disabled())
> + set_bit(EFI_RUNTIME_SERVICES, );
> + }
> +
>   return 0;
>  }
>  

The above comment could do with including similar information to the
commit log as to why we want to force enable runtime services. For x86
we have this,

 *
 * When EFI_PARAVIRT is in force then we could not map runtime
 * service memory region because we do not have direct access to it.
 * However, runtime services are available through proxy functions
 * (e.g. in case of Xen dom0 EFI implementation they call special
 * hypercall which executes relevant EFI functions) and that is why
 * they are always enabled.
 */

We need something similar here.

> @@ -338,6 +346,7 @@ static int __init xen_guest_init(void)
>  {
>   struct xen_add_to_physmap xatp;
>   struct shared_info *shared_info_page = NULL;
> + int ret;
>  
>   if (!xen_domain())
>   return 0;
> @@ -352,6 +361,13 @@ static int __init xen_guest_init(void)
>   return -ENODEV;
>   }
>  
> + if (IS_ENABLED(CONFIG_XEN_EFI) && efi_enabled(EFI_BOOT) &&
> + efi_enabled(EFI_RUNTIME_SERVICES)) {
> + ret = xen_arm_enable_runtime_services();
> + if (ret)
> + return ret;
> + }
> +

Is it ever possible to have EFI_RUNTIME_SERVICES set but
efi_enabled(EFI_BOOT) and IS_ENABLED(CONFIG_XEN_EFI) be false inside
this function? If the answer is "no", I'd suggest just reducing this
down to,

/*
 * The fdt parsing code will have set EFI_RUNTIME_SERVICES if
 * it found Xen EFI parameters. Force enable runtime services.
 */ 
if (efi_enabled(EFI_RUNTIME_SERVICES)) {
ret = xen_arm_enable_runtime_services();
if (ret)
return ret;
}

But first, see my comments below.

> @@ -39,6 +40,33 @@ static struct mm_struct efi_mm = {
>   .mmlist = LIST_HEAD_INIT(efi_mm.mmlist),
>  };
>  
> +static int __init efi_r

Re: [RFC][PATCH 4/7] sched: Replace sd_busy/nr_busy_cpus with sched_domain_shared

2016-05-11 Thread Matt Fleming

On Mon, 09 May, at 12:48:11PM, Peter Zijlstra wrote:
> Move the nr_busy_cpus thing from its hacky sd->parent->groups->sgc
> location into the much more natural sched_domain_shared location.
> 
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  include/linux/sched.h|1 +
>  kernel/sched/core.c  |   10 +-
>  kernel/sched/fair.c  |   22 --
>  kernel/sched/sched.h |6 +-
>  kernel/time/tick-sched.c |   10 +-
>  5 files changed, 24 insertions(+), 25 deletions(-)
> 
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1059,6 +1059,7 @@ struct sched_group;
>  
>  struct sched_domain_shared {
>   atomic_tref;
> + atomic_tnr_busy_cpus;
>  };
>  
>  struct sched_domain {

[...]

> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7842,13 +7842,13 @@ static inline void set_cpu_sd_state_busy
>   int cpu = smp_processor_id();
>  
>   rcu_read_lock();
> - sd = rcu_dereference(per_cpu(sd_busy, cpu));
> + sd = rcu_dereference(per_cpu(sd_llc, cpu));
>  
>   if (!sd || !sd->nohz_idle)
>   goto unlock;
>   sd->nohz_idle = 0;
>  
> - atomic_inc(>groups->sgc->nr_busy_cpus);
> + atomic_inc(>shared->nr_busy_cpus);
>  unlock:
>   rcu_read_unlock();
>  }

This breaks my POWER7 box which presumably doesn't have SD_SHARE_PKG_RESOURCES,

  NIP [c012de68] .set_cpu_sd_state_idle+0x58/0x80
  LR [c017ded4] .tick_nohz_idle_enter+0x24/0x90
  Call Trace:
  [c007774b7cf0] [c007774b4080] 0xc007774b4080 (unreliable)
  [c007774b7d60] [c017ded4] .tick_nohz_idle_enter+0x24/0x90
  [c007774b7dd0] [c0137200] .cpu_startup_entry+0xe0/0x440
  [c007774b7ee0] [c004739c] .start_secondary+0x35c/0x3a0
  [c007774b7f90] [c0008bfc] start_secondary_prolog+0x10/0x14

The following fixes it,

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 978b3ef2d87e..d27153adee4d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7920,7 +7920,8 @@ static inline void set_cpu_sd_state_busy(void)
goto unlock;
sd->nohz_idle = 0;
 
-   atomic_inc(>shared->nr_busy_cpus);
+   if (sd->shared)
+   atomic_inc(>shared->nr_busy_cpus);
 unlock:
rcu_read_unlock();
 }
@@ -7937,7 +7938,8 @@ void set_cpu_sd_state_idle(void)
goto unlock;
sd->nohz_idle = 1;
 
-   atomic_dec(>shared->nr_busy_cpus);
+   if (sd->shared)
+   atomic_dec(>shared->nr_busy_cpus);
 unlock:
rcu_read_unlock();
 }

Re: [RFC][PATCH 4/7] sched: Replace sd_busy/nr_busy_cpus with sched_domain_shared

2016-05-11 Thread Matt Fleming

On Mon, 09 May, at 12:48:11PM, Peter Zijlstra wrote:
> Move the nr_busy_cpus thing from its hacky sd->parent->groups->sgc
> location into the much more natural sched_domain_shared location.
> 
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  include/linux/sched.h|1 +
>  kernel/sched/core.c  |   10 +-
>  kernel/sched/fair.c  |   22 --
>  kernel/sched/sched.h |6 +-
>  kernel/time/tick-sched.c |   10 +-
>  5 files changed, 24 insertions(+), 25 deletions(-)
> 
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1059,6 +1059,7 @@ struct sched_group;
>  
>  struct sched_domain_shared {
>   atomic_tref;
> + atomic_tnr_busy_cpus;
>  };
>  
>  struct sched_domain {

[...]

> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7842,13 +7842,13 @@ static inline void set_cpu_sd_state_busy
>   int cpu = smp_processor_id();
>  
>   rcu_read_lock();
> - sd = rcu_dereference(per_cpu(sd_busy, cpu));
> + sd = rcu_dereference(per_cpu(sd_llc, cpu));
>  
>   if (!sd || !sd->nohz_idle)
>   goto unlock;
>   sd->nohz_idle = 0;
>  
> - atomic_inc(>groups->sgc->nr_busy_cpus);
> + atomic_inc(>shared->nr_busy_cpus);
>  unlock:
>   rcu_read_unlock();
>  }

This breaks my POWER7 box which presumably doesn't have SD_SHARE_PKG_RESOURCES,

  NIP [c012de68] .set_cpu_sd_state_idle+0x58/0x80
  LR [c017ded4] .tick_nohz_idle_enter+0x24/0x90
  Call Trace:
  [c007774b7cf0] [c007774b4080] 0xc007774b4080 (unreliable)
  [c007774b7d60] [c017ded4] .tick_nohz_idle_enter+0x24/0x90
  [c007774b7dd0] [c0137200] .cpu_startup_entry+0xe0/0x440
  [c007774b7ee0] [c004739c] .start_secondary+0x35c/0x3a0
  [c007774b7f90] [c0008bfc] start_secondary_prolog+0x10/0x14

The following fixes it,

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 978b3ef2d87e..d27153adee4d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7920,7 +7920,8 @@ static inline void set_cpu_sd_state_busy(void)
goto unlock;
sd->nohz_idle = 0;
 
-   atomic_inc(>shared->nr_busy_cpus);
+   if (sd->shared)
+   atomic_inc(>shared->nr_busy_cpus);
 unlock:
rcu_read_unlock();
 }
@@ -7937,7 +7938,8 @@ void set_cpu_sd_state_idle(void)
goto unlock;
sd->nohz_idle = 1;
 
-   atomic_dec(>shared->nr_busy_cpus);
+   if (sd->shared)
+   atomic_dec(>shared->nr_busy_cpus);
 unlock:
rcu_read_unlock();
 }

Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear

2016-05-10 Thread Matt Fleming

On Tue, 26 Apr, at 05:57:40PM, Tom Lendacky wrote:
> The EFI tables are not encrypted and need to be accessed as such. Be sure
> to memmap them without the encryption attribute set. For EFI support that
> lives outside of the arch/x86 tree, create a routine that uses the __weak
> attribute so that it can be overridden by an architecture specific routine.
> 
> When freeing boot services related memory, since it has been mapped as
> un-encrypted, be sure to change the mapping to encrypted for future use.
> 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/include/asm/cacheflush.h  |3 +
>  arch/x86/include/asm/mem_encrypt.h |   22 +++
>  arch/x86/kernel/setup.c|6 +--
>  arch/x86/mm/mem_encrypt.c  |   56 +++
>  arch/x86/mm/pageattr.c |   75 
> 
>  arch/x86/platform/efi/efi.c|   26 +++-
>  arch/x86/platform/efi/efi_64.c |9 +++-
>  arch/x86/platform/efi/quirks.c |   12 +-
>  drivers/firmware/efi/efi.c |   18 +++--
>  drivers/firmware/efi/esrt.c|   12 +++---
>  include/linux/efi.h|3 +
>  11 files changed, 212 insertions(+), 30 deletions(-)

The size of this change is completely unexpected. Why is there so much
churn to workaround this new feature?

Is it not possible to maintain some kind of kernel virtual address
mapping so memremap*() and friends can figure out when to twiddle the
mapping attributes and map with/without encryption?

These API changes place an undue burden on developers that don't even
care about SME.

Re: [RFC PATCH v1 10/18] x86/efi: Access EFI related tables in the clear

2016-05-10 Thread Matt Fleming

On Tue, 26 Apr, at 05:57:40PM, Tom Lendacky wrote:
> The EFI tables are not encrypted and need to be accessed as such. Be sure
> to memmap them without the encryption attribute set. For EFI support that
> lives outside of the arch/x86 tree, create a routine that uses the __weak
> attribute so that it can be overridden by an architecture specific routine.
> 
> When freeing boot services related memory, since it has been mapped as
> un-encrypted, be sure to change the mapping to encrypted for future use.
> 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/include/asm/cacheflush.h  |3 +
>  arch/x86/include/asm/mem_encrypt.h |   22 +++
>  arch/x86/kernel/setup.c|6 +--
>  arch/x86/mm/mem_encrypt.c  |   56 +++
>  arch/x86/mm/pageattr.c |   75 
> 
>  arch/x86/platform/efi/efi.c|   26 +++-
>  arch/x86/platform/efi/efi_64.c |9 +++-
>  arch/x86/platform/efi/quirks.c |   12 +-
>  drivers/firmware/efi/efi.c |   18 +++--
>  drivers/firmware/efi/esrt.c|   12 +++---
>  include/linux/efi.h|3 +
>  11 files changed, 212 insertions(+), 30 deletions(-)

The size of this change is completely unexpected. Why is there so much
churn to workaround this new feature?

Is it not possible to maintain some kind of kernel virtual address
mapping so memremap*() and friends can figure out when to twiddle the
mapping attributes and map with/without encryption?

These API changes place an undue burden on developers that don't even
care about SME.

Re: [BUG] x86/efi: MMRs no longer properly mapped after switch to isolated page table

2016-05-09 Thread Matt Fleming

On Mon, 02 May, at 04:39:31PM, Alex Thorlton wrote:
> 
> If you think we're violating EFI rules by accessing these registers from
> both sides of the fence, please let me know.  I'd like to make sure that
> we get everything behaving the way it should be!

Oh no, I don't think this is violating the UEFI spec at all, but I do
think it goes against the spirit of the way other implementations are
designed; with maximum separation between firmware and kernel.

In a perfect world, I'd suggest mapping the MMR range in both the
kernel and firmware, at different virtual address ranges, but have
the firmware's version opaque to the kernel and only described by an
EfiMemoryMappedIO region, or something. That is ignoring any region
type conflicts that may arise.

Of course, we don't operate in a perfect world, so a good solution may
be to just insert the kernels' mapping into the EFI page tables.

Re: [BUG] x86/efi: MMRs no longer properly mapped after switch to isolated page table

2016-05-09 Thread Matt Fleming

On Mon, 02 May, at 04:39:31PM, Alex Thorlton wrote:
> 
> If you think we're violating EFI rules by accessing these registers from
> both sides of the fence, please let me know.  I'd like to make sure that
> we get everything behaving the way it should be!

Oh no, I don't think this is violating the UEFI spec at all, but I do
think it goes against the spirit of the way other implementations are
designed; with maximum separation between firmware and kernel.

In a perfect world, I'd suggest mapping the MMR range in both the
kernel and firmware, at different virtual address ranges, but have
the firmware's version opaque to the kernel and only described by an
EfiMemoryMappedIO region, or something. That is ignoring any region
type conflicts that may arise.

Of course, we don't operate in a perfect world, so a good solution may
be to just insert the kernels' mapping into the EFI page tables.

[tip:efi/core] efi/capsule: Move 'capsule' to the stack in efi_capsule_supported()

2016-05-07 Thread tip-bot for Matt Fleming

Commit-ID:  fb7a84cac03541f4da18dfa25b3f4767d4efc6fc
Gitweb: http://git.kernel.org/tip/fb7a84cac03541f4da18dfa25b3f4767d4efc6fc
Author: Matt Fleming <m...@codeblueprint.co.uk>
AuthorDate: Fri, 6 May 2016 22:39:29 +0100
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Sat, 7 May 2016 07:06:13 +0200

efi/capsule: Move 'capsule' to the stack in efi_capsule_supported()

Dan Carpenter reports that passing the address of the pointer to the
kmalloc()'d memory for 'capsule' is dangerous:

 "drivers/firmware/efi/capsule.c:109 efi_capsule_supported()
  warn: did you mean to pass the address of 'capsule'

   108
   109  status = efi.query_capsule_caps(, 1, _size, reset);

  If we modify capsule inside this function call then at the end of the
  function we aren't freeing the original pointer that we allocated."

Ard Biesheuvel noted that we don't even need to call kmalloc() since the
object we allocate isn't very big and doesn't need to persist after the
function returns.

Place 'capsule' on the stack instead.

Suggested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Reported-by: Dan Carpenter <dan.carpen...@oracle.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
Acked-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Andy Lutomirski <l...@amacapital.net>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Bryan O'Donoghue <pure.lo...@nexus-software.ie>
Cc: Denys Vlasenko <dvlas...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Kweh Hock Leong <hock.leong.k...@intel.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: joeyli <j...@suse.com>
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1462570771-13324-4-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 drivers/firmware/efi/capsule.c | 29 +++--
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/drivers/firmware/efi/capsule.c b/drivers/firmware/efi/capsule.c
index e530540..53b9fd2 100644
--- a/drivers/firmware/efi/capsule.c
+++ b/drivers/firmware/efi/capsule.c
@@ -86,33 +86,26 @@ bool efi_capsule_pending(int *reset_type)
  */
 int efi_capsule_supported(efi_guid_t guid, u32 flags, size_t size, int *reset)
 {
-   efi_capsule_header_t *capsule;
+   efi_capsule_header_t capsule;
+   efi_capsule_header_t *cap_list[] = {  };
efi_status_t status;
u64 max_size;
-   int rv = 0;
 
if (flags & ~EFI_CAPSULE_SUPPORTED_FLAG_MASK)
return -EINVAL;
 
-   capsule = kmalloc(sizeof(*capsule), GFP_KERNEL);
-   if (!capsule)
-   return -ENOMEM;
-
-   capsule->headersize = capsule->imagesize = sizeof(*capsule);
-   memcpy(>guid, , sizeof(efi_guid_t));
-   capsule->flags = flags;
+   capsule.headersize = capsule.imagesize = sizeof(capsule);
+   memcpy(, , sizeof(efi_guid_t));
+   capsule.flags = flags;
 
-   status = efi.query_capsule_caps(, 1, _size, reset);
-   if (status != EFI_SUCCESS) {
-   rv = efi_status_to_err(status);
-   goto out;
-   }
+   status = efi.query_capsule_caps(cap_list, 1, _size, reset);
+   if (status != EFI_SUCCESS)
+   return efi_status_to_err(status);
 
if (size > max_size)
-   rv = -ENOSPC;
-out:
-   kfree(capsule);
-   return rv;
+   return -ENOSPC;
+
+   return 0;
 }
 EXPORT_SYMBOL_GPL(efi_capsule_supported);

[tip:efi/core] efi/capsule: Move 'capsule' to the stack in efi_capsule_supported()

2016-05-07 Thread tip-bot for Matt Fleming

Commit-ID:  fb7a84cac03541f4da18dfa25b3f4767d4efc6fc
Gitweb: http://git.kernel.org/tip/fb7a84cac03541f4da18dfa25b3f4767d4efc6fc
Author: Matt Fleming 
AuthorDate: Fri, 6 May 2016 22:39:29 +0100
Committer:  Ingo Molnar 
CommitDate: Sat, 7 May 2016 07:06:13 +0200

efi/capsule: Move 'capsule' to the stack in efi_capsule_supported()

Dan Carpenter reports that passing the address of the pointer to the
kmalloc()'d memory for 'capsule' is dangerous:

 "drivers/firmware/efi/capsule.c:109 efi_capsule_supported()
  warn: did you mean to pass the address of 'capsule'

   108
   109  status = efi.query_capsule_caps(, 1, _size, reset);

  If we modify capsule inside this function call then at the end of the
  function we aren't freeing the original pointer that we allocated."

Ard Biesheuvel noted that we don't even need to call kmalloc() since the
object we allocate isn't very big and doesn't need to persist after the
function returns.

Place 'capsule' on the stack instead.

Suggested-by: Ard Biesheuvel 
Reported-by: Dan Carpenter 
Signed-off-by: Matt Fleming 
Acked-by: Ard Biesheuvel 
Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Bryan O'Donoghue 
Cc: Denys Vlasenko 
Cc: H. Peter Anvin 
Cc: Kweh Hock Leong 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: joeyli 
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1462570771-13324-4-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 drivers/firmware/efi/capsule.c | 29 +++--
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/drivers/firmware/efi/capsule.c b/drivers/firmware/efi/capsule.c
index e530540..53b9fd2 100644
--- a/drivers/firmware/efi/capsule.c
+++ b/drivers/firmware/efi/capsule.c
@@ -86,33 +86,26 @@ bool efi_capsule_pending(int *reset_type)
  */
 int efi_capsule_supported(efi_guid_t guid, u32 flags, size_t size, int *reset)
 {
-   efi_capsule_header_t *capsule;
+   efi_capsule_header_t capsule;
+   efi_capsule_header_t *cap_list[] = {  };
efi_status_t status;
u64 max_size;
-   int rv = 0;
 
if (flags & ~EFI_CAPSULE_SUPPORTED_FLAG_MASK)
return -EINVAL;
 
-   capsule = kmalloc(sizeof(*capsule), GFP_KERNEL);
-   if (!capsule)
-   return -ENOMEM;
-
-   capsule->headersize = capsule->imagesize = sizeof(*capsule);
-   memcpy(>guid, , sizeof(efi_guid_t));
-   capsule->flags = flags;
+   capsule.headersize = capsule.imagesize = sizeof(capsule);
+   memcpy(, , sizeof(efi_guid_t));
+   capsule.flags = flags;
 
-   status = efi.query_capsule_caps(, 1, _size, reset);
-   if (status != EFI_SUCCESS) {
-   rv = efi_status_to_err(status);
-   goto out;
-   }
+   status = efi.query_capsule_caps(cap_list, 1, _size, reset);
+   if (status != EFI_SUCCESS)
+   return efi_status_to_err(status);
 
if (size > max_size)
-   rv = -ENOSPC;
-out:
-   kfree(capsule);
-   return rv;
+   return -ENOSPC;
+
+   return 0;
 }
 EXPORT_SYMBOL_GPL(efi_capsule_supported);

[tip:efi/core] efi/capsule: Make efi_capsule_pending() lockless

2016-05-07 Thread tip-bot for Matt Fleming

Commit-ID:  62075e581802ea1842d5d3c490a7e46330bdb9e1
Gitweb: http://git.kernel.org/tip/62075e581802ea1842d5d3c490a7e46330bdb9e1
Author: Matt Fleming <m...@codeblueprint.co.uk>
AuthorDate: Fri, 6 May 2016 22:39:27 +0100
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Sat, 7 May 2016 07:06:13 +0200

efi/capsule: Make efi_capsule_pending() lockless

Taking a mutex in the reboot path is bogus because we cannot sleep
with interrupts disabled, such as when rebooting due to panic(),

  BUG: sleeping function called from invalid context at 
kernel/locking/mutex.c:97
  in_atomic(): 0, irqs_disabled(): 1, pid: 7, name: rcu_sched
  Call Trace:
dump_stack+0x63/0x89
___might_sleep+0xd8/0x120
__might_sleep+0x49/0x80
mutex_lock+0x20/0x50
efi_capsule_pending+0x1d/0x60
native_machine_emergency_restart+0x59/0x280
machine_emergency_restart+0x19/0x20
emergency_restart+0x18/0x20
panic+0x1ba/0x217

In this case all other CPUs will have been stopped by the time we
execute the platform reboot code, so 'capsule_pending' cannot change
under our feet. We wouldn't care even if it could since we cannot wait
for it complete.

Also, instead of relying on the external 'system_state' variable just
use a reboot notifier, so we can set 'stop_capsules' while holding
'capsule_mutex', thereby avoiding a race where system_state is updated
while we're in the middle of efi_capsule_update_locked() (since CPUs
won't have been stopped at that point).

Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
Cc: Andy Lutomirski <l...@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Bryan O'Donoghue <pure.lo...@nexus-software.ie>
Cc: Denys Vlasenko <dvlas...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Kweh Hock Leong <hock.leong.k...@intel.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: joeyli <j...@suse.com>
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1462570771-13324-2-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 drivers/firmware/efi/capsule.c | 35 +--
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/drivers/firmware/efi/capsule.c b/drivers/firmware/efi/capsule.c
index 0de5594..e530540 100644
--- a/drivers/firmware/efi/capsule.c
+++ b/drivers/firmware/efi/capsule.c
@@ -22,11 +22,12 @@ typedef struct {
 } efi_capsule_block_desc_t;
 
 static bool capsule_pending;
+static bool stop_capsules;
 static int efi_reset_type = -1;
 
 /*
  * capsule_mutex serialises access to both capsule_pending and
- * efi_reset_type.
+ * efi_reset_type and stop_capsules.
  */
 static DEFINE_MUTEX(capsule_mutex);
 
@@ -50,18 +51,13 @@ static DEFINE_MUTEX(capsule_mutex);
  */
 bool efi_capsule_pending(int *reset_type)
 {
-   bool rv = false;
-
-   mutex_lock(_mutex);
if (!capsule_pending)
-   goto out;
+   return false;
 
if (reset_type)
*reset_type = efi_reset_type;
-   rv = true;
-out:
-   mutex_unlock(_mutex);
-   return rv;
+
+   return true;
 }
 
 /*
@@ -176,7 +172,7 @@ efi_capsule_update_locked(efi_capsule_header_t *capsule,
 * whether to force an EFI reboot), and we're racing against
 * that call. Abort in that case.
 */
-   if (unlikely(system_state == SYSTEM_RESTART)) {
+   if (unlikely(stop_capsules)) {
pr_warn("Capsule update raced with reboot, aborting.\n");
return -EINVAL;
}
@@ -298,3 +294,22 @@ out:
return rv;
 }
 EXPORT_SYMBOL_GPL(efi_capsule_update);
+
+static int capsule_reboot_notify(struct notifier_block *nb, unsigned long 
event, void *cmd)
+{
+   mutex_lock(_mutex);
+   stop_capsules = true;
+   mutex_unlock(_mutex);
+
+   return NOTIFY_DONE;
+}
+
+static struct notifier_block capsule_reboot_nb = {
+   .notifier_call = capsule_reboot_notify,
+};
+
+static int __init capsule_reboot_register(void)
+{
+   return register_reboot_notifier(_reboot_nb);
+}
+core_initcall(capsule_reboot_register);

[tip:efi/core] efi/capsule: Make efi_capsule_pending() lockless

2016-05-07 Thread tip-bot for Matt Fleming

Commit-ID:  62075e581802ea1842d5d3c490a7e46330bdb9e1
Gitweb: http://git.kernel.org/tip/62075e581802ea1842d5d3c490a7e46330bdb9e1
Author: Matt Fleming 
AuthorDate: Fri, 6 May 2016 22:39:27 +0100
Committer:  Ingo Molnar 
CommitDate: Sat, 7 May 2016 07:06:13 +0200

efi/capsule: Make efi_capsule_pending() lockless

Taking a mutex in the reboot path is bogus because we cannot sleep
with interrupts disabled, such as when rebooting due to panic(),

  BUG: sleeping function called from invalid context at 
kernel/locking/mutex.c:97
  in_atomic(): 0, irqs_disabled(): 1, pid: 7, name: rcu_sched
  Call Trace:
dump_stack+0x63/0x89
___might_sleep+0xd8/0x120
__might_sleep+0x49/0x80
mutex_lock+0x20/0x50
efi_capsule_pending+0x1d/0x60
native_machine_emergency_restart+0x59/0x280
machine_emergency_restart+0x19/0x20
emergency_restart+0x18/0x20
panic+0x1ba/0x217

In this case all other CPUs will have been stopped by the time we
execute the platform reboot code, so 'capsule_pending' cannot change
under our feet. We wouldn't care even if it could since we cannot wait
for it complete.

Also, instead of relying on the external 'system_state' variable just
use a reboot notifier, so we can set 'stop_capsules' while holding
'capsule_mutex', thereby avoiding a race where system_state is updated
while we're in the middle of efi_capsule_update_locked() (since CPUs
won't have been stopped at that point).

Signed-off-by: Matt Fleming 
Cc: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Bryan O'Donoghue 
Cc: Denys Vlasenko 
Cc: H. Peter Anvin 
Cc: Kweh Hock Leong 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: joeyli 
Cc: linux-...@vger.kernel.org
Link: 
http://lkml.kernel.org/r/1462570771-13324-2-git-send-email-m...@codeblueprint.co.uk
Signed-off-by: Ingo Molnar 
---
 drivers/firmware/efi/capsule.c | 35 +--
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/drivers/firmware/efi/capsule.c b/drivers/firmware/efi/capsule.c
index 0de5594..e530540 100644
--- a/drivers/firmware/efi/capsule.c
+++ b/drivers/firmware/efi/capsule.c
@@ -22,11 +22,12 @@ typedef struct {
 } efi_capsule_block_desc_t;
 
 static bool capsule_pending;
+static bool stop_capsules;
 static int efi_reset_type = -1;
 
 /*
  * capsule_mutex serialises access to both capsule_pending and
- * efi_reset_type.
+ * efi_reset_type and stop_capsules.
  */
 static DEFINE_MUTEX(capsule_mutex);
 
@@ -50,18 +51,13 @@ static DEFINE_MUTEX(capsule_mutex);
  */
 bool efi_capsule_pending(int *reset_type)
 {
-   bool rv = false;
-
-   mutex_lock(_mutex);
if (!capsule_pending)
-   goto out;
+   return false;
 
if (reset_type)
*reset_type = efi_reset_type;
-   rv = true;
-out:
-   mutex_unlock(_mutex);
-   return rv;
+
+   return true;
 }
 
 /*
@@ -176,7 +172,7 @@ efi_capsule_update_locked(efi_capsule_header_t *capsule,
 * whether to force an EFI reboot), and we're racing against
 * that call. Abort in that case.
 */
-   if (unlikely(system_state == SYSTEM_RESTART)) {
+   if (unlikely(stop_capsules)) {
pr_warn("Capsule update raced with reboot, aborting.\n");
return -EINVAL;
}
@@ -298,3 +294,22 @@ out:
return rv;
 }
 EXPORT_SYMBOL_GPL(efi_capsule_update);
+
+static int capsule_reboot_notify(struct notifier_block *nb, unsigned long 
event, void *cmd)
+{
+   mutex_lock(_mutex);
+   stop_capsules = true;
+   mutex_unlock(_mutex);
+
+   return NOTIFY_DONE;
+}
+
+static struct notifier_block capsule_reboot_nb = {
+   .notifier_call = capsule_reboot_notify,
+};
+
+static int __init capsule_reboot_register(void)
+{
+   return register_reboot_notifier(_reboot_nb);
+}
+core_initcall(capsule_reboot_register);

[PATCH 5/5] efivarfs: Make efivarfs_file_ioctl static

2016-05-06 Thread Matt Fleming

From: Peter Jones <pjo...@redhat.com>

There are no callers except through the file_operations struct below
this, so it should be static like everything else here.

Signed-off-by: Peter Jones <pjo...@redhat.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 fs/efivarfs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/efivarfs/file.c b/fs/efivarfs/file.c
index d48e0d261d78..5f22e74bbade 100644
--- a/fs/efivarfs/file.c
+++ b/fs/efivarfs/file.c
@@ -157,7 +157,7 @@ efivarfs_ioc_setxflags(struct file *file, void __user *arg)
return 0;
 }
 
-long
+static long
 efivarfs_file_ioctl(struct file *file, unsigned int cmd, unsigned long p)
 {
void __user *arg = (void __user *)p;
-- 
2.7.3

[PATCH 4/5] efi: Merge boolean flag arguments

2016-05-06 Thread Matt Fleming

From: Julia Lawall <julia.law...@lip6.fr>

The parameters atomic and duplicates of efivar_init always have opposite
values.  Drop the parameter atomic, replace the uses of !atomic with
duplicates, and update the call sites accordingly.

The code using duplicates is slightly reorganized with an else, to avoid
duplicating the lock code.

Signed-off-by: Julia Lawall <julia.law...@lip6.fr>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Matthew Garrett <mj...@srcf.ucam.org>
Cc: Jeremy Kerr <j...@ozlabs.org>
Cc: Vaishali Thakkar <vaishali.thak...@oracle.com>
Cc: Saurabh Sengar <saurabh.tr...@gmail.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 drivers/firmware/efi/efivars.c |  5 ++---
 drivers/firmware/efi/vars.c| 23 ++-
 fs/efivarfs/super.c|  3 +--
 include/linux/efi.h|  3 +--
 4 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/drivers/firmware/efi/efivars.c b/drivers/firmware/efi/efivars.c
index 096adcbcb5a9..116b244dee68 100644
--- a/drivers/firmware/efi/efivars.c
+++ b/drivers/firmware/efi/efivars.c
@@ -661,7 +661,7 @@ static void efivar_update_sysfs_entries(struct work_struct 
*work)
return;
 
err = efivar_init(efivar_update_sysfs_entry, entry,
- true, false, _sysfs_list);
+ false, _sysfs_list);
if (!err)
break;
 
@@ -730,8 +730,7 @@ int efivars_sysfs_init(void)
return -ENOMEM;
}
 
-   efivar_init(efivars_sysfs_callback, NULL, false,
-   true, _sysfs_list);
+   efivar_init(efivars_sysfs_callback, NULL, true, _sysfs_list);
 
error = create_efivars_bin_attributes();
if (error) {
diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
index 0012331d5a3d..d3b751383286 100644
--- a/drivers/firmware/efi/vars.c
+++ b/drivers/firmware/efi/vars.c
@@ -419,8 +419,7 @@ static void dup_variable_bug(efi_char16_t *str16, 
efi_guid_t *vendor_guid,
  * Returns 0 on success, or a kernel error code on failure.
  */
 int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, unsigned long, void *),
-   void *data, bool atomic, bool duplicates,
-   struct list_head *head)
+   void *data, bool duplicates, struct list_head *head)
 {
const struct efivar_operations *ops = __efivars->ops;
unsigned long variable_name_size = 1024;
@@ -450,7 +449,7 @@ int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, 
unsigned long, void *),
_guid);
switch (status) {
case EFI_SUCCESS:
-   if (!atomic)
+   if (duplicates)
spin_unlock_irq(&__efivars->lock);
 
variable_name_size = var_name_strnsize(variable_name,
@@ -465,21 +464,19 @@ int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, 
unsigned long, void *),
 * and may end up looping here forever.
 */
if (duplicates &&
-   variable_is_present(variable_name, _guid, 
head)) {
+   variable_is_present(variable_name, _guid,
+   head)) {
dup_variable_bug(variable_name, _guid,
 variable_name_size);
-   if (!atomic)
-   spin_lock_irq(&__efivars->lock);
-
status = EFI_NOT_FOUND;
-   break;
+   } else {
+   err = func(variable_name, vendor_guid,
+  variable_name_size, data);
+   if (err)
+   status = EFI_NOT_FOUND;
}
 
-   err = func(variable_name, vendor_guid, 
variable_name_size, data);
-   if (err)
-   status = EFI_NOT_FOUND;
-
-   if (!atomic)
+   if (duplicates)
spin_lock_irq(&__efivars->lock);
 
break;
diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c
index 553c5d2db4a4..9cb54a38832d 100644
--- a/fs/efivarfs/super.c
+++ b/fs/efivarfs/super.c
@@ -216,8 +216,7 @@ static int efivarfs_fill_super(struct super_block *sb, void 
*data, int silent)
 
INIT_LIST_HEAD(_list);
 
-   err = efivar_init(efivarfs_callback, (void *)sb, false,
- true, _list);
+   err = efivar_init(efivarfs_callback, (void *)sb, true, _list);
if (err)
__efivar_entry_

[PATCH 5/5] efivarfs: Make efivarfs_file_ioctl static

2016-05-06 Thread Matt Fleming

From: Peter Jones 

There are no callers except through the file_operations struct below
this, so it should be static like everything else here.

Signed-off-by: Peter Jones 
Signed-off-by: Matt Fleming 
---
 fs/efivarfs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/efivarfs/file.c b/fs/efivarfs/file.c
index d48e0d261d78..5f22e74bbade 100644
--- a/fs/efivarfs/file.c
+++ b/fs/efivarfs/file.c
@@ -157,7 +157,7 @@ efivarfs_ioc_setxflags(struct file *file, void __user *arg)
return 0;
 }
 
-long
+static long
 efivarfs_file_ioctl(struct file *file, unsigned int cmd, unsigned long p)
 {
void __user *arg = (void __user *)p;
-- 
2.7.3

[PATCH 4/5] efi: Merge boolean flag arguments

2016-05-06 Thread Matt Fleming

From: Julia Lawall 

The parameters atomic and duplicates of efivar_init always have opposite
values.  Drop the parameter atomic, replace the uses of !atomic with
duplicates, and update the call sites accordingly.

The code using duplicates is slightly reorganized with an else, to avoid
duplicating the lock code.

Signed-off-by: Julia Lawall 
Cc: Ard Biesheuvel 
Cc: Matthew Garrett 
Cc: Jeremy Kerr 
Cc: Vaishali Thakkar 
Cc: Saurabh Sengar 
Signed-off-by: Matt Fleming 
---
 drivers/firmware/efi/efivars.c |  5 ++---
 drivers/firmware/efi/vars.c| 23 ++-
 fs/efivarfs/super.c|  3 +--
 include/linux/efi.h|  3 +--
 4 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/drivers/firmware/efi/efivars.c b/drivers/firmware/efi/efivars.c
index 096adcbcb5a9..116b244dee68 100644
--- a/drivers/firmware/efi/efivars.c
+++ b/drivers/firmware/efi/efivars.c
@@ -661,7 +661,7 @@ static void efivar_update_sysfs_entries(struct work_struct 
*work)
return;
 
err = efivar_init(efivar_update_sysfs_entry, entry,
- true, false, _sysfs_list);
+ false, _sysfs_list);
if (!err)
break;
 
@@ -730,8 +730,7 @@ int efivars_sysfs_init(void)
return -ENOMEM;
}
 
-   efivar_init(efivars_sysfs_callback, NULL, false,
-   true, _sysfs_list);
+   efivar_init(efivars_sysfs_callback, NULL, true, _sysfs_list);
 
error = create_efivars_bin_attributes();
if (error) {
diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
index 0012331d5a3d..d3b751383286 100644
--- a/drivers/firmware/efi/vars.c
+++ b/drivers/firmware/efi/vars.c
@@ -419,8 +419,7 @@ static void dup_variable_bug(efi_char16_t *str16, 
efi_guid_t *vendor_guid,
  * Returns 0 on success, or a kernel error code on failure.
  */
 int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, unsigned long, void *),
-   void *data, bool atomic, bool duplicates,
-   struct list_head *head)
+   void *data, bool duplicates, struct list_head *head)
 {
const struct efivar_operations *ops = __efivars->ops;
unsigned long variable_name_size = 1024;
@@ -450,7 +449,7 @@ int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, 
unsigned long, void *),
_guid);
switch (status) {
case EFI_SUCCESS:
-   if (!atomic)
+   if (duplicates)
spin_unlock_irq(&__efivars->lock);
 
variable_name_size = var_name_strnsize(variable_name,
@@ -465,21 +464,19 @@ int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, 
unsigned long, void *),
 * and may end up looping here forever.
 */
if (duplicates &&
-   variable_is_present(variable_name, _guid, 
head)) {
+   variable_is_present(variable_name, _guid,
+   head)) {
dup_variable_bug(variable_name, _guid,
 variable_name_size);
-   if (!atomic)
-   spin_lock_irq(&__efivars->lock);
-
status = EFI_NOT_FOUND;
-   break;
+   } else {
+   err = func(variable_name, vendor_guid,
+  variable_name_size, data);
+   if (err)
+   status = EFI_NOT_FOUND;
}
 
-   err = func(variable_name, vendor_guid, 
variable_name_size, data);
-   if (err)
-   status = EFI_NOT_FOUND;
-
-   if (!atomic)
+   if (duplicates)
spin_lock_irq(&__efivars->lock);
 
break;
diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c
index 553c5d2db4a4..9cb54a38832d 100644
--- a/fs/efivarfs/super.c
+++ b/fs/efivarfs/super.c
@@ -216,8 +216,7 @@ static int efivarfs_fill_super(struct super_block *sb, void 
*data, int silent)
 
INIT_LIST_HEAD(_list);
 
-   err = efivar_init(efivarfs_callback, (void *)sb, false,
- true, _list);
+   err = efivar_init(efivarfs_callback, (void *)sb, true, _list);
if (err)
__efivar_entry_iter(efivarfs_destroy, _list, NULL, 
NULL);
 
diff --git a/include/linux/efi.h b/include/linux/efi.h
index aa36fb8bea4b..df7acb51f3cc 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -1336,8 +1336,7 @@ int efivars_unregister(s

[PATCH 3/5] efi/capsule: Move 'capsule' to the stack in efi_capsule_supported()

2016-05-06 Thread Matt Fleming

Dan reports that passing the address of the pointer to the kmalloc()'d
memory for 'capsule' is dangerous,

 "drivers/firmware/efi/capsule.c:109 efi_capsule_supported()
  warn: did you mean to pass the address of 'capsule'

   108
   109  status = efi.query_capsule_caps(, 1, _size, reset);

  If we modify capsule inside this function call then at the end of the
  function we aren't freeing the original pointer that we allocated."

Ard noted that we don't even need to call kmalloc() since the object
we allocate isn't very big and doesn't need to persist after the
function returns.

Place 'capsule' on the stack instead.

Suggested-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
Reported-by: Dan Carpenter <dan.carpen...@oracle.com>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Kweh Hock Leong <hock.leong.k...@intel.com>
Cc: Bryan O'Donoghue <pure.lo...@nexus-software.ie>
Cc: joeyli <j...@suse.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 drivers/firmware/efi/capsule.c | 29 +++--
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/drivers/firmware/efi/capsule.c b/drivers/firmware/efi/capsule.c
index 4703dc9b8fbd..7593108f5402 100644
--- a/drivers/firmware/efi/capsule.c
+++ b/drivers/firmware/efi/capsule.c
@@ -86,33 +86,26 @@ bool efi_capsule_pending(int *reset_type)
  */
 int efi_capsule_supported(efi_guid_t guid, u32 flags, size_t size, int *reset)
 {
-   efi_capsule_header_t *capsule;
+   efi_capsule_header_t capsule;
+   efi_capsule_header_t *cap_list[] = {  };
efi_status_t status;
u64 max_size;
-   int rv = 0;
 
if (flags & ~EFI_CAPSULE_SUPPORTED_FLAG_MASK)
return -EINVAL;
 
-   capsule = kmalloc(sizeof(*capsule), GFP_KERNEL);
-   if (!capsule)
-   return -ENOMEM;
-
-   capsule->headersize = capsule->imagesize = sizeof(*capsule);
-   memcpy(>guid, , sizeof(efi_guid_t));
-   capsule->flags = flags;
+   capsule.headersize = capsule.imagesize = sizeof(capsule);
+   memcpy(, , sizeof(efi_guid_t));
+   capsule.flags = flags;
 
-   status = efi.query_capsule_caps(, 1, _size, reset);
-   if (status != EFI_SUCCESS) {
-   rv = efi_status_to_err(status);
-   goto out;
-   }
+   status = efi.query_capsule_caps(cap_list, 1, _size, reset);
+   if (status != EFI_SUCCESS)
+   return efi_status_to_err(status);
 
if (size > max_size)
-   rv = -ENOSPC;
-out:
-   kfree(capsule);
-   return rv;
+   return -ENOSPC;
+
+   return 0;
 }
 EXPORT_SYMBOL_GPL(efi_capsule_supported);
 
-- 
2.7.3

[PATCH 3/5] efi/capsule: Move 'capsule' to the stack in efi_capsule_supported()

2016-05-06 Thread Matt Fleming

Dan reports that passing the address of the pointer to the kmalloc()'d
memory for 'capsule' is dangerous,

 "drivers/firmware/efi/capsule.c:109 efi_capsule_supported()
  warn: did you mean to pass the address of 'capsule'

   108
   109  status = efi.query_capsule_caps(, 1, _size, reset);

  If we modify capsule inside this function call then at the end of the
  function we aren't freeing the original pointer that we allocated."

Ard noted that we don't even need to call kmalloc() since the object
we allocate isn't very big and doesn't need to persist after the
function returns.

Place 'capsule' on the stack instead.

Suggested-by: Ard Biesheuvel 
Acked-by: Ard Biesheuvel 
Reported-by: Dan Carpenter 
Cc: Borislav Petkov 
Cc: Kweh Hock Leong 
Cc: Bryan O'Donoghue 
Cc: joeyli 
Signed-off-by: Matt Fleming 
---
 drivers/firmware/efi/capsule.c | 29 +++--
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/drivers/firmware/efi/capsule.c b/drivers/firmware/efi/capsule.c
index 4703dc9b8fbd..7593108f5402 100644
--- a/drivers/firmware/efi/capsule.c
+++ b/drivers/firmware/efi/capsule.c
@@ -86,33 +86,26 @@ bool efi_capsule_pending(int *reset_type)
  */
 int efi_capsule_supported(efi_guid_t guid, u32 flags, size_t size, int *reset)
 {
-   efi_capsule_header_t *capsule;
+   efi_capsule_header_t capsule;
+   efi_capsule_header_t *cap_list[] = {  };
efi_status_t status;
u64 max_size;
-   int rv = 0;
 
if (flags & ~EFI_CAPSULE_SUPPORTED_FLAG_MASK)
return -EINVAL;
 
-   capsule = kmalloc(sizeof(*capsule), GFP_KERNEL);
-   if (!capsule)
-   return -ENOMEM;
-
-   capsule->headersize = capsule->imagesize = sizeof(*capsule);
-   memcpy(>guid, , sizeof(efi_guid_t));
-   capsule->flags = flags;
+   capsule.headersize = capsule.imagesize = sizeof(capsule);
+   memcpy(, , sizeof(efi_guid_t));
+   capsule.flags = flags;
 
-   status = efi.query_capsule_caps(, 1, _size, reset);
-   if (status != EFI_SUCCESS) {
-   rv = efi_status_to_err(status);
-   goto out;
-   }
+   status = efi.query_capsule_caps(cap_list, 1, _size, reset);
+   if (status != EFI_SUCCESS)
+   return efi_status_to_err(status);
 
if (size > max_size)
-   rv = -ENOSPC;
-out:
-   kfree(capsule);
-   return rv;
+   return -ENOSPC;
+
+   return 0;
 }
 EXPORT_SYMBOL_GPL(efi_capsule_supported);
 
-- 
2.7.3

[GIT PULL 0/5] EFI changes for v4.7

2016-05-06 Thread Matt Fleming

Folks, this is the second pull request containing v4.7 material. The
commits are listed in priority order, with the first patch fixing an
oops in the EFI capsule code sitting in tip/efi/core, and the rest
being a compiler warning fix, static checker fix, and a couple of
cleanups.

The following changes since commit 0ec7ae928a9c19c2b7b8054507d5694a2597065e:

  efi: Remove unnecessary (and buggy) .memmap initialization from the Xen EFI 
driver (2016-04-29 11:06:15 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi.git tags/efi-next

for you to fetch changes up to 20948c1d9fdefa7acfaa84046f59adce9ef00f2e:

  efivarfs: Make efivarfs_file_ioctl static (2016-05-05 16:52:19 +0100)


 * Fix an oops in the EFI capsule code reported by the 0day bot
   because efi_capsule_pending() was grabbing a mutex in the emergency
   reboot path - Matt Fleming

 * Fix a compiler warning about excessive stack usage in the new efibc
   driver by kmalloc'ing the efivar_entry object - Jeremy Compostella

 * Dan Carpenter reported that it's potentially unsafe to pass the
   address of a pointer to the firmware in efi_capsule_supported().
   Instead we can skip the dynamic allocation entirely and put the
   capsule object on the stack - Matt Fleming

 * Simplify the locking in the efivars code by merging two of
   efivar_init()'s parameters into one - Julia Lawall

 * Cleanup efivarfs_file_ioctl by marking it as static since it has no
   external users - Peter Jones


Jeremy Compostella (1):
  efibc: Fix excessive stack footprint warning

Julia Lawall (1):
  efi: Merge boolean flag arguments

Matt Fleming (2):
  efi/capsule: Make efi_capsule_pending() lockless
  efi/capsule: Move 'capsule' to the stack in efi_capsule_supported()

Peter Jones (1):
  efivarfs: Make efivarfs_file_ioctl static

 drivers/firmware/efi/capsule.c | 65 --
 drivers/firmware/efi/efibc.c   | 34 +++---
 drivers/firmware/efi/efivars.c |  5 ++--
 drivers/firmware/efi/vars.c| 23 +++
 fs/efivarfs/file.c |  2 +-
 fs/efivarfs/super.c|  3 +-
 include/linux/efi.h|  3 +-
 7 files changed, 75 insertions(+), 60 deletions(-)

[PATCH 1/5] efi/capsule: Make efi_capsule_pending() lockless

2016-05-06 Thread Matt Fleming

Taking a mutex in the reboot path is bogus because we cannot sleep
with interrupts disabled, such as when rebooting due to panic(),

  BUG: sleeping function called from invalid context at 
kernel/locking/mutex.c:97
  in_atomic(): 0, irqs_disabled(): 1, pid: 7, name: rcu_sched
  Call Trace:
dump_stack+0x63/0x89
___might_sleep+0xd8/0x120
__might_sleep+0x49/0x80
mutex_lock+0x20/0x50
efi_capsule_pending+0x1d/0x60
native_machine_emergency_restart+0x59/0x280
machine_emergency_restart+0x19/0x20
emergency_restart+0x18/0x20
panic+0x1ba/0x217

In this case all other CPUs will have been stopped by the time we
execute the platform reboot code, so 'capsule_pending' cannot change
under our feet. We wouldn't care even if it could since we cannot wait
for it complete.

Also, instead of relying on the external 'system_state' variable just
use a reboot notifier, so we can set 'stop_capsules' while holding
'capsule_mutex', thereby avoiding a race where system_state is updated
while we're in the middle of efi_capsule_update_locked() (since CPUs
won't have been stopped at that point).

Cc: Borislav Petkov <b...@alien8.de>
Cc: Kweh Hock Leong <hock.leong.k...@intel.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
Cc: Bryan O'Donoghue <pure.lo...@nexus-software.ie>
Cc: joeyli <j...@suse.com>
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 drivers/firmware/efi/capsule.c | 36 ++--
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/drivers/firmware/efi/capsule.c b/drivers/firmware/efi/capsule.c
index 0de55944ac0b..4703dc9b8fbd 100644
--- a/drivers/firmware/efi/capsule.c
+++ b/drivers/firmware/efi/capsule.c
@@ -22,11 +22,12 @@ typedef struct {
 } efi_capsule_block_desc_t;
 
 static bool capsule_pending;
+static bool stop_capsules;
 static int efi_reset_type = -1;
 
 /*
  * capsule_mutex serialises access to both capsule_pending and
- * efi_reset_type.
+ * efi_reset_type and stop_capsules.
  */
 static DEFINE_MUTEX(capsule_mutex);
 
@@ -50,18 +51,13 @@ static DEFINE_MUTEX(capsule_mutex);
  */
 bool efi_capsule_pending(int *reset_type)
 {
-   bool rv = false;
-
-   mutex_lock(_mutex);
if (!capsule_pending)
-   goto out;
+   return false;
 
if (reset_type)
*reset_type = efi_reset_type;
-   rv = true;
-out:
-   mutex_unlock(_mutex);
-   return rv;
+
+   return true;
 }
 
 /*
@@ -176,7 +172,7 @@ efi_capsule_update_locked(efi_capsule_header_t *capsule,
 * whether to force an EFI reboot), and we're racing against
 * that call. Abort in that case.
 */
-   if (unlikely(system_state == SYSTEM_RESTART)) {
+   if (unlikely(stop_capsules)) {
pr_warn("Capsule update raced with reboot, aborting.\n");
return -EINVAL;
}
@@ -298,3 +294,23 @@ out:
return rv;
 }
 EXPORT_SYMBOL_GPL(efi_capsule_update);
+
+static int capsule_reboot_notify(struct notifier_block *nb,
+unsigned long event, void *cmd)
+{
+   mutex_lock(_mutex);
+   stop_capsules = true;
+   mutex_unlock(_mutex);
+
+   return NOTIFY_DONE;
+}
+
+static struct notifier_block capsule_reboot_nb = {
+   .notifier_call = capsule_reboot_notify,
+};
+
+static int __init capsule_reboot_register(void)
+{
+   return register_reboot_notifier(_reboot_nb);
+}
+core_initcall(capsule_reboot_register);
-- 
2.7.3

[PATCH 2/5] efibc: Fix excessive stack footprint warning

2016-05-06 Thread Matt Fleming

From: Jeremy Compostella <jeremy.composte...@intel.com>

gcc complains about a newly added file for the EFI Bootloader Control:

drivers/firmware/efi/efibc.c: In function 'efibc_set_variable':
drivers/firmware/efi/efibc.c:53:1: error: the frame size of 2272 bytes is 
larger than 1024 bytes [-Werror=frame-larger-than=]

The problem is the declaration of a local variable of type struct
efivar_entry, which is by itself larger than the warning limit of 1024
bytes.

Use dynamic memory allocation instead of stack memory for the entry
object.

This patch also fixes a potential buffer overflow.

Reported-by: Ingo Molnar <mi...@kernel.org>
Reported-by: Arnd Bergmann <a...@arndb.de>
Signed-off-by: Jeremy Compostella <jeremy.composte...@intel.com>
Cc: Ard Biesheuvel <ard.biesheu...@linaro.org>
[ Updated changelog to include gcc error ]
Signed-off-by: Matt Fleming <m...@codeblueprint.co.uk>
---
 drivers/firmware/efi/efibc.c | 34 +++---
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/firmware/efi/efibc.c b/drivers/firmware/efi/efibc.c
index 2e0c7ccd9d9e..8dd0c7085e59 100644
--- a/drivers/firmware/efi/efibc.c
+++ b/drivers/firmware/efi/efibc.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static void efibc_str_to_str16(const char *str, efi_char16_t *str16)
 {
@@ -28,41 +29,52 @@ static void efibc_str_to_str16(const char *str, 
efi_char16_t *str16)
str16[i] = '\0';
 }
 
-static void efibc_set_variable(const char *name, const char *value)
+static int efibc_set_variable(const char *name, const char *value)
 {
int ret;
efi_guid_t guid = LINUX_EFI_LOADER_ENTRY_GUID;
-   struct efivar_entry entry;
+   struct efivar_entry *entry;
size_t size = (strlen(value) + 1) * sizeof(efi_char16_t);
 
-   if (size > sizeof(entry.var.Data))
+   if (size > sizeof(entry->var.Data)) {
pr_err("value is too large");
+   return -EINVAL;
+   }
 
-   efibc_str_to_str16(name, entry.var.VariableName);
-   efibc_str_to_str16(value, (efi_char16_t *)entry.var.Data);
-   memcpy(, , sizeof(guid));
+   entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+   if (!entry) {
+   pr_err("failed to allocate efivar entry");
+   return -ENOMEM;
+   }
 
-   ret = efivar_entry_set(,
+   efibc_str_to_str16(name, entry->var.VariableName);
+   efibc_str_to_str16(value, (efi_char16_t *)entry->var.Data);
+   memcpy(>var.VendorGuid, , sizeof(guid));
+
+   ret = efivar_entry_set(entry,
   EFI_VARIABLE_NON_VOLATILE
   | EFI_VARIABLE_BOOTSERVICE_ACCESS
   | EFI_VARIABLE_RUNTIME_ACCESS,
-  size, entry.var.Data, NULL);
+  size, entry->var.Data, NULL);
if (ret)
pr_err("failed to set %s EFI variable: 0x%x\n",
   name, ret);
+
+   kfree(entry);
+   return ret;
 }
 
 static int efibc_reboot_notifier_call(struct notifier_block *notifier,
  unsigned long event, void *data)
 {
const char *reason = "shutdown";
+   int ret;
 
if (event == SYS_RESTART)
reason = "reboot";
 
-   efibc_set_variable("LoaderEntryRebootReason", reason);
-
-   if (!data)
+   ret = efibc_set_variable("LoaderEntryRebootReason", reason);
+   if (ret || !data)
return NOTIFY_DONE;
 
efibc_set_variable("LoaderEntryOneShot", (char *)data);
-- 
2.7.3

[GIT PULL 0/5] EFI changes for v4.7

2016-05-06 Thread Matt Fleming

Folks, this is the second pull request containing v4.7 material. The
commits are listed in priority order, with the first patch fixing an
oops in the EFI capsule code sitting in tip/efi/core, and the rest
being a compiler warning fix, static checker fix, and a couple of
cleanups.

The following changes since commit 0ec7ae928a9c19c2b7b8054507d5694a2597065e:

  efi: Remove unnecessary (and buggy) .memmap initialization from the Xen EFI 
driver (2016-04-29 11:06:15 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi.git tags/efi-next

for you to fetch changes up to 20948c1d9fdefa7acfaa84046f59adce9ef00f2e:

  efivarfs: Make efivarfs_file_ioctl static (2016-05-05 16:52:19 +0100)


 * Fix an oops in the EFI capsule code reported by the 0day bot
   because efi_capsule_pending() was grabbing a mutex in the emergency
   reboot path - Matt Fleming

 * Fix a compiler warning about excessive stack usage in the new efibc
   driver by kmalloc'ing the efivar_entry object - Jeremy Compostella

 * Dan Carpenter reported that it's potentially unsafe to pass the
   address of a pointer to the firmware in efi_capsule_supported().
   Instead we can skip the dynamic allocation entirely and put the
   capsule object on the stack - Matt Fleming

 * Simplify the locking in the efivars code by merging two of
   efivar_init()'s parameters into one - Julia Lawall

 * Cleanup efivarfs_file_ioctl by marking it as static since it has no
   external users - Peter Jones


Jeremy Compostella (1):
  efibc: Fix excessive stack footprint warning

Julia Lawall (1):
  efi: Merge boolean flag arguments

Matt Fleming (2):
  efi/capsule: Make efi_capsule_pending() lockless
  efi/capsule: Move 'capsule' to the stack in efi_capsule_supported()

Peter Jones (1):
  efivarfs: Make efivarfs_file_ioctl static

 drivers/firmware/efi/capsule.c | 65 --
 drivers/firmware/efi/efibc.c   | 34 +++---
 drivers/firmware/efi/efivars.c |  5 ++--
 drivers/firmware/efi/vars.c| 23 +++
 fs/efivarfs/file.c |  2 +-
 fs/efivarfs/super.c|  3 +-
 include/linux/efi.h|  3 +-
 7 files changed, 75 insertions(+), 60 deletions(-)

[PATCH 1/5] efi/capsule: Make efi_capsule_pending() lockless

2016-05-06 Thread Matt Fleming

Taking a mutex in the reboot path is bogus because we cannot sleep
with interrupts disabled, such as when rebooting due to panic(),

  BUG: sleeping function called from invalid context at 
kernel/locking/mutex.c:97
  in_atomic(): 0, irqs_disabled(): 1, pid: 7, name: rcu_sched
  Call Trace:
dump_stack+0x63/0x89
___might_sleep+0xd8/0x120
__might_sleep+0x49/0x80
mutex_lock+0x20/0x50
efi_capsule_pending+0x1d/0x60
native_machine_emergency_restart+0x59/0x280
machine_emergency_restart+0x19/0x20
emergency_restart+0x18/0x20
panic+0x1ba/0x217

In this case all other CPUs will have been stopped by the time we
execute the platform reboot code, so 'capsule_pending' cannot change
under our feet. We wouldn't care even if it could since we cannot wait
for it complete.

Also, instead of relying on the external 'system_state' variable just
use a reboot notifier, so we can set 'stop_capsules' while holding
'capsule_mutex', thereby avoiding a race where system_state is updated
while we're in the middle of efi_capsule_update_locked() (since CPUs
won't have been stopped at that point).

Cc: Borislav Petkov 
Cc: Kweh Hock Leong 
Cc: Ard Biesheuvel 
Cc: Bryan O'Donoghue 
Cc: joeyli 
Signed-off-by: Matt Fleming 
---
 drivers/firmware/efi/capsule.c | 36 ++--
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/drivers/firmware/efi/capsule.c b/drivers/firmware/efi/capsule.c
index 0de55944ac0b..4703dc9b8fbd 100644
--- a/drivers/firmware/efi/capsule.c
+++ b/drivers/firmware/efi/capsule.c
@@ -22,11 +22,12 @@ typedef struct {
 } efi_capsule_block_desc_t;
 
 static bool capsule_pending;
+static bool stop_capsules;
 static int efi_reset_type = -1;
 
 /*
  * capsule_mutex serialises access to both capsule_pending and
- * efi_reset_type.
+ * efi_reset_type and stop_capsules.
  */
 static DEFINE_MUTEX(capsule_mutex);
 
@@ -50,18 +51,13 @@ static DEFINE_MUTEX(capsule_mutex);
  */
 bool efi_capsule_pending(int *reset_type)
 {
-   bool rv = false;
-
-   mutex_lock(_mutex);
if (!capsule_pending)
-   goto out;
+   return false;
 
if (reset_type)
*reset_type = efi_reset_type;
-   rv = true;
-out:
-   mutex_unlock(_mutex);
-   return rv;
+
+   return true;
 }
 
 /*
@@ -176,7 +172,7 @@ efi_capsule_update_locked(efi_capsule_header_t *capsule,
 * whether to force an EFI reboot), and we're racing against
 * that call. Abort in that case.
 */
-   if (unlikely(system_state == SYSTEM_RESTART)) {
+   if (unlikely(stop_capsules)) {
pr_warn("Capsule update raced with reboot, aborting.\n");
return -EINVAL;
}
@@ -298,3 +294,23 @@ out:
return rv;
 }
 EXPORT_SYMBOL_GPL(efi_capsule_update);
+
+static int capsule_reboot_notify(struct notifier_block *nb,
+unsigned long event, void *cmd)
+{
+   mutex_lock(_mutex);
+   stop_capsules = true;
+   mutex_unlock(_mutex);
+
+   return NOTIFY_DONE;
+}
+
+static struct notifier_block capsule_reboot_nb = {
+   .notifier_call = capsule_reboot_notify,
+};
+
+static int __init capsule_reboot_register(void)
+{
+   return register_reboot_notifier(_reboot_nb);
+}
+core_initcall(capsule_reboot_register);
-- 
2.7.3

< 4 5 6 7 8 9 10 11 12 13 >

801 - 900 of 4043 matches

Mail list logo