Re: [PATCH] x86/hyperv/vtl: Replace real_mode_header only under Hyper-V
On 08.09.23 17:02, Saurabh Singh Sengar wrote:
> On Fri, Sep 08, 2023 at 12:26:10PM +0200, Mathias Krause wrote:
>> Booting a CONFIG_HYPERV_VTL_MODE=y enabled kernel on bare metal or a
>> non-Hyper-V hypervisor leads to severe memory corruption as
>
> FWIW, CONFIG_HYPERV_VTL_MODE is not expected to be enabled for non VTL
> platforms.

Fair enough, but there's really no excuse for randomly crashing the
kernel if one forgot to RTFM, like I did. The code should (and easily
can) handle such situations, especially as it's just a matter of a two
line change.

> Referring to the Kconfig documentation:
> "A kernel built with this option must run at VTL2, and will not run as
> a normal guest."

So, maybe, the 'return 0' below should be a 'panic("Need to run on
Hyper-V!")' instead?

But then, looking at the code, most of the VTL specifics only run when
the Hyper-V hypervisor was actually detected during early boot. It's
just hv_vtl_early_init() that runs unconditionally and spoils the game.

Is there really a *hard* requirement / reason for having VTL support
disabled when not running under Hyper-V? At least I can't find any from
the code side. It'll all be fine with the below patch, also enabling
running the same kernel on multiple platforms -- bare metal, KVM,
Hyper-V, ...

Thanks,
Mathias

> - Saurabh
>
>> hv_vtl_early_init() will run even though hv_vtl_init_platform() did
>> not. This skips no-oping the 'realmode_reserve' and 'realmode_init'
>> platform hooks, making init_real_mode() -> setup_real_mode() try to
>> copy 'real_mode_blob' over 'real_mode_header' which we set to the
>> stub 'hv_vtl_real_mode_header'. However, as 'real_mode_blob' isn't
>> just a 'struct real_mode_header' -- it's the complete code! --
>> copying it over 'hv_vtl_real_mode_header' will corrupt quite some
>> memory following it.
>>
>> The real cause for this erroneous behaviour is that
>> hv_vtl_early_init() blindly assumes the kernel is running on
>> Hyper-V, which it may not.
>>
>> Fix this by making sure the code only replaces the real mode header
>> with the stub one iff the kernel is running under Hyper-V.
>>
>> Fixes: 3be1bc2fe9d2 ("x86/hyperv: VTL support for Hyper-V")
>> Cc: Saurabh Sengar
>> Cc: sta...@kernel.org
>> Signed-off-by: Mathias Krause
>> ---
>>  arch/x86/hyperv/hv_vtl.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/x86/hyperv/hv_vtl.c b/arch/x86/hyperv/hv_vtl.c
>> index 57df7821d66c..54c06f4b8b4c 100644
>> --- a/arch/x86/hyperv/hv_vtl.c
>> +++ b/arch/x86/hyperv/hv_vtl.c
>> @@ -12,6 +12,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include
>>
>>  extern struct boot_params boot_params;
>> @@ -214,6 +215,9 @@ static int hv_vtl_wakeup_secondary_cpu(int apicid,
>> unsigned long start_eip)
>>
>>  static int __init hv_vtl_early_init(void)
>>  {
>> +	if (!hypervisor_is_type(X86_HYPER_MS_HYPERV))
>> +		return 0;
>> +
>>  	/*
>>  	 * `boot_cpu_has` returns the runtime feature support,
>>  	 * and here is the earliest it can be used.
>> --
>> 2.30.2
>>
Re: [PATCH] vmlinux.lds.h: drop unused __vermagic
Hi Jessica,

On Mon, 21 Jan 2019 at 12:39, Mathias Krause wrote:
>
> On Wed, 2 Jan 2019 at 21:29, Jessica Yu wrote:
> >
> > +++ Mathias Krause [30/12/18 13:40 +0100]:
> > >The reference to '__vermagic' is a relic from v2.5 times. And even
> > >there it had a very short lifetime, from v2.5.59 (commit 1d411b80ee18
> > >("Module Sanity Check") in the historic tree) to v2.5.69 (commit
> > >67ac5b866bda ("[PATCH] complete modinfo section")).
> > >
> > >Neither current kernels nor modules contain a '__vermagic' section
> > >any more, so get rid of it.
> > >
> > >Signed-off-by: Mathias Krause
> > >Cc: Rusty Russell
> > >Cc: Jessica Yu
> >
> > Thanks for the cleanup.
> >
> > Reviewed-by: Jessica Yu
> >
>
> Arnd, will you carry this patch through your asm-generic tree or did
> the MAINTAINERS file mislead me here?
>

Looks like Arnd keeps ignoring my emails. :/

Can you carry this patch through your module tree? Or do you think I
should ask Andrew Morton instead?

Thanks,
Mathias

> > >---
> > > include/asm-generic/vmlinux.lds.h | 1 -
> > > 1 file changed, 1 deletion(-)
> > >
> > >diff --git a/include/asm-generic/vmlinux.lds.h
> > >b/include/asm-generic/vmlinux.lds.h
> > >index 3d7a6a9c2370..530ce1e7a1ec 100644
> > >--- a/include/asm-generic/vmlinux.lds.h
> > >+++ b/include/asm-generic/vmlinux.lds.h
> > >@@ -332,7 +332,6 @@
> > > 	__start_rodata = .;				\
> > > 	*(.rodata) *(.rodata.*)				\
> > > 	RO_AFTER_INIT_DATA	/* Read only after init */	\
> > >-	KEEP(*(__vermagic))	/* Kernel version magic */	\
> > > 	. = ALIGN(8);					\
> > > 	__start___tracepoints_ptrs = .;			\
> > > 	KEEP(*(__tracepoints_ptrs))	/* Tracepoints: pointer array */ \
> > >--
> > >2.19.2
> > >
Re: [PATCH] vmlinux.lds.h: drop unused __vermagic
On Wed, 2 Jan 2019 at 21:29, Jessica Yu wrote:
>
> +++ Mathias Krause [30/12/18 13:40 +0100]:
> >The reference to '__vermagic' is a relic from v2.5 times. And even
> >there it had a very short lifetime, from v2.5.59 (commit 1d411b80ee18
> >("Module Sanity Check") in the historic tree) to v2.5.69 (commit
> >67ac5b866bda ("[PATCH] complete modinfo section")).
> >
> >Neither current kernels nor modules contain a '__vermagic' section
> >any more, so get rid of it.
> >
> >Signed-off-by: Mathias Krause
> >Cc: Rusty Russell
> >Cc: Jessica Yu
>
> Thanks for the cleanup.
>
> Reviewed-by: Jessica Yu
>

Arnd, will you carry this patch through your asm-generic tree or did
the MAINTAINERS file mislead me here?

Thanks,
Mathias

> >---
> > include/asm-generic/vmlinux.lds.h | 1 -
> > 1 file changed, 1 deletion(-)
> >
> >diff --git a/include/asm-generic/vmlinux.lds.h
> >b/include/asm-generic/vmlinux.lds.h
> >index 3d7a6a9c2370..530ce1e7a1ec 100644
> >--- a/include/asm-generic/vmlinux.lds.h
> >+++ b/include/asm-generic/vmlinux.lds.h
> >@@ -332,7 +332,6 @@
> > 	__start_rodata = .;				\
> > 	*(.rodata) *(.rodata.*)				\
> > 	RO_AFTER_INIT_DATA	/* Read only after init */	\
> >-	KEEP(*(__vermagic))	/* Kernel version magic */	\
> > 	. = ALIGN(8);					\
> > 	__start___tracepoints_ptrs = .;			\
> > 	KEEP(*(__tracepoints_ptrs))	/* Tracepoints: pointer array */ \
> >--
> >2.19.2
> >
[PATCH] vmlinux.lds.h: drop unused __vermagic
The reference to '__vermagic' is a relic from v2.5 times. And even
there it had a very short lifetime, from v2.5.59 (commit 1d411b80ee18
("Module Sanity Check") in the historic tree) to v2.5.69 (commit
67ac5b866bda ("[PATCH] complete modinfo section")).

Neither current kernels nor modules contain a '__vermagic' section any
more, so get rid of it.

Signed-off-by: Mathias Krause
Cc: Rusty Russell
Cc: Jessica Yu
---
 include/asm-generic/vmlinux.lds.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 3d7a6a9c2370..530ce1e7a1ec 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -332,7 +332,6 @@
 	__start_rodata = .;				\
 	*(.rodata) *(.rodata.*)				\
 	RO_AFTER_INIT_DATA	/* Read only after init */	\
-	KEEP(*(__vermagic))	/* Kernel version magic */	\
 	. = ALIGN(8);					\
 	__start___tracepoints_ptrs = .;			\
 	KEEP(*(__tracepoints_ptrs))	/* Tracepoints: pointer array */ \
--
2.19.2
[PATCH] kallsyms: lower alignment on ARM
As mentioned in the info pages of gas, the '.align' pseudo-op's
interpretation of the alignment value is architecture specific. It may
either be a byte count or be taken as a power of two. On ARM it's
actually the latter, which leads to unnecessarily large alignments of
16 bytes for 32 bit builds or 256 bytes for 64 bit builds.

Fix this by switching to '.balign' instead, which is consistent across
all architectures.

Signed-off-by: Mathias Krause
Cc: Catalin Marinas
Cc: Will Deacon
---
 scripts/kallsyms.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 109a1af7e444..77cebad0474e 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -334,10 +334,10 @@ static void write_src(void)
 	printf("#include \n");
 	printf("#if BITS_PER_LONG == 64\n");
 	printf("#define PTR .quad\n");
-	printf("#define ALGN .align 8\n");
+	printf("#define ALGN .balign 8\n");
 	printf("#else\n");
 	printf("#define PTR .long\n");
-	printf("#define ALGN .align 4\n");
+	printf("#define ALGN .balign 4\n");
 	printf("#endif\n");

 	printf("\t.section .rodata, \"a\"\n");
--
2.19.2
Re: [PATCH] net: ipv6: xfrm6_state: remove VLA usage
On 9 March 2018 at 13:21, Andreas Christoforou wrote:
> The kernel would like to have all stack VLA usage removed[1].
>
> Signed-off-by: Andreas Christoforou
> ---
>  net/ipv6/xfrm6_state.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c
> index b15075a..45c0d98 100644
> --- a/net/ipv6/xfrm6_state.c
> +++ b/net/ipv6/xfrm6_state.c
> @@ -62,7 +62,12 @@ __xfrm6_sort(void **dst, void **src, int n, int
> (*cmp)(void *p), int maxclass)
>  {
>  	int i;
>  	int class[XFRM_MAX_DEPTH];
> -	int count[maxclass];
> +	int *count;
> +
> +	count = kcalloc(maxclass + 1, sizeof(*count), GFP_KERNEL);
> +
> +	if (!count)
> +		return -ENOMEM;
>
>  	memset(count, 0, sizeof(count));
>
> @@ -80,6 +85,7 @@ __xfrm6_sort(void **dst, void **src, int n, int (*cmp)(void
> *p), int maxclass)
>  		src[i] = NULL;
>  	}
>
> +	kfree(count);
>  	return 0;
>  }

Instead of dynamically allocating and freeing memory here, shouldn't we
just get rid of the maxclass parameter and use XFRM_MAX_DEPTH as size
for the count[] array, too?

Cheers,
Mathias
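[Not part of the original thread: a minimal user-space sketch of the alternative Mathias suggests -- a fixed-size count[] bounded by XFRM_MAX_DEPTH instead of a VLA or a heap allocation. The function name, the error convention, and the XFRM_MAX_DEPTH value of 6 are assumptions for illustration; the real value lives in include/net/xfrm.h. Note that with a real array, sizeof(count) covers the whole buffer again, which the kcalloc variant's memset would not.]

```c
#include <string.h>

#define XFRM_MAX_DEPTH 6	/* assumed; see include/net/xfrm.h */

/* Hypothetical sketch: count how often each class occurs, rejecting
 * out-of-range classes instead of sizing the array by a parameter. */
static int count_classes_sketch(const int *classes, int n, int *out_max)
{
	int count[XFRM_MAX_DEPTH];
	int i, max = 0;

	memset(count, 0, sizeof(count));	/* covers the full array */

	for (i = 0; i < n; i++) {
		if (classes[i] < 0 || classes[i] >= XFRM_MAX_DEPTH)
			return -1;	/* would be -EINVAL in kernel code */
		count[classes[i]]++;
		if (count[classes[i]] > max)
			max = count[classes[i]];
	}
	*out_max = max;
	return 0;
}
```

The bound check replaces the implicit trust in maxclass: any caller passing a class outside [0, XFRM_MAX_DEPTH) is a bug, not a reason to size the array dynamically.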
[tip:x86/urgent] x86/alternatives: Fix alt_max_short macro to really be a max()
Commit-ID:  6b32c126d33d5cb379bca280ab8acedc1ca978ff
Gitweb:     https://git.kernel.org/tip/6b32c126d33d5cb379bca280ab8acedc1ca978ff
Author:     Mathias Krause <mini...@googlemail.com>
AuthorDate: Thu, 5 Oct 2017 20:30:12 +0200
Committer:  Thomas Gleixner <t...@linutronix.de>
CommitDate: Mon, 9 Oct 2017 13:35:17 +0200

x86/alternatives: Fix alt_max_short macro to really be a max()

The alt_max_short() macro in asm/alternative.h does not work as
intended, leading to nasty bugs. E.g. alt_max_short("1", "3") evaluates
to 3, but alt_max_short("3", "1") evaluates to 1 -- not exactly the
maximum of 1 and 3.

In fact, I had to learn it the hard way by crashing my kernel in not so
funny ways by attempting to make use of the ALTERNATIVE_2 macro with
alternatives where the first one was larger than the second one.

According to [1] and commit dbe4058a6a44 ("x86/alternatives: Fix
ALTERNATIVE_2 padding generation properly") the right-hand side should
read "-(-(a < b))" not "-(-(a - b))". Fix that, to make the macro work
as intended.

While at it, fix up the comments regarding the additional "-", too.
It's not about gas' usage of s32 but the brain dead logic of having a
"true" value of -1 for the < operator ... *sigh*

Btw., the one in asm/alternative-asm.h is correct. And, apparently, all
current users of ALTERNATIVE_2() pass same sized alternatives, avoiding
to hit the bug.

[1] http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax

Reviewed-and-tested-by: Borislav Petkov <b...@suse.de>
Fixes: dbe4058a6a44 ("x86/alternatives: Fix ALTERNATIVE_2 padding generation properly")
Signed-off-by: Mathias Krause <mini...@googlemail.com>
Signed-off-by: Thomas Gleixner <t...@linutronix.de>
Cc: Borislav Petkov <b...@suse.de>
Cc: sta...@vger.kernel.org
Link: https://lkml.kernel.org/r/1507228213-13095-1-git-send-email-mini...@googlemail.com
---
 arch/x86/include/asm/alternative-asm.h | 4 +++-
 arch/x86/include/asm/alternative.h     | 6 +++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/alternative-asm.h b/arch/x86/include/asm/alternative-asm.h
index e7636ba..6c98821 100644
--- a/arch/x86/include/asm/alternative-asm.h
+++ b/arch/x86/include/asm/alternative-asm.h
@@ -62,8 +62,10 @@
 #define new_len2		145f-144f

 /*
- * max without conditionals. Idea adapted from:
+ * gas compatible max based on the idea from:
  * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
+ *
+ * The additional "-" is needed because gas uses a "true" value of -1.
  */
 #define alt_max_short(a, b)	((a) ^ (((a) ^ (b)) & -(-((a) < (b)))))

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index c096624..ccbe24e 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -103,12 +103,12 @@ static inline int alternatives_text_reserved(void *start, void *end)
 	alt_end_marker ":\n"

 /*
- * max without conditionals. Idea adapted from:
+ * gas compatible max based on the idea from:
  * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
  *
- * The additional "-" is needed because gas works with s32s.
+ * The additional "-" is needed because gas uses a "true" value of -1.
 */
-#define alt_max_short(a, b)	"((" a ") ^ (((" a ") ^ (" b ")) & -(-((" a ") - (" b ")))))"
+#define alt_max_short(a, b)	"((" a ") ^ (((" a ") ^ (" b ")) & -(-((" a ") < (" b ")))))"

 /*
  * Pad the second replacement alternative with additional NOPs if it is
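[Not part of the original message: a user-space C sketch of the branchless max the patch fixes. The function names are hypothetical. In C, (a < b) is 0 or 1, so a single negation yields the all-ones/all-zeros mask; gas' "true" is -1, which is why the kernel macro needs the extra "-".]

```c
#include <assert.h>

/* Correct branchless max (bithacks page): when a < b, the mask is all
 * ones and the XOR chain yields b; otherwise the mask is 0 and we get a. */
static int alt_max(int a, int b)
{
	return a ^ ((a ^ b) & -(a < b));
}

/* The broken pre-patch variant: subtracting instead of comparing makes
 * the mask an arbitrary value, so the "max" is garbage whenever a > b
 * (and sometimes even when a < b, e.g. max(1, 6)). */
static int alt_max_broken(int a, int b)
{
	return a ^ ((a ^ b) & (a - b));	/* -(-(a - b)) == a - b */
}
```

The broken variant reproduces the nm output quoted later in the thread: max(3, 1) comes out as 1 and max(1, 6) as 2, while max(1, 3) happens to be right by accident.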
[PATCH v2] x86/alternatives: Fix alt_max_short macro to really be a max()
The alt_max_short() macro in asm/alternative.h does not work as
intended, leading to nasty bugs. E.g. alt_max_short("1", "3") evaluates
to 3, but alt_max_short("3", "1") evaluates to 1 -- not exactly the
maximum of 1 and 3.

In fact, I had to learn it the hard way by crashing my kernel in not so
funny ways by attempting to make use of the ALTERNATIVE_2 macro with
alternatives where the first one was larger than the second one.

According to [1] and commit dbe4058a6a44 ("x86/alternatives: Fix
ALTERNATIVE_2 padding generation properly") the right-hand side should
read "-(-(a < b))" not "-(-(a - b))". Fix that, to make the macro work
as intended.

While at it, fix up the comments regarding the additional "-", too.
It's not about gas' usage of s32 but the brain dead logic of having a
"true" value of -1 for the < operator ... *sigh*

Btw., the one in asm/alternative-asm.h is correct. And, apparently, all
current users of ALTERNATIVE_2() pass same sized alternatives, avoiding
to hit the bug.

[1] http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax

Fixes: dbe4058a6a44 ("x86/alternatives: Fix ALTERNATIVE_2 padding generation properly")
Signed-off-by: Mathias Krause <mini...@googlemail.com>
Reviewed-and-tested-by: Borislav Petkov <b...@suse.de>
---
v2:
- fix the comments even further, as requested by Boris
- add Reviewed-and-tested-by for Boris

 arch/x86/include/asm/alternative-asm.h | 4 +++-
 arch/x86/include/asm/alternative.h     | 6 +++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/alternative-asm.h b/arch/x86/include/asm/alternative-asm.h
index e7636bac7372..6c98821fef5e 100644
--- a/arch/x86/include/asm/alternative-asm.h
+++ b/arch/x86/include/asm/alternative-asm.h
@@ -62,8 +62,10 @@
 #define new_len2		145f-144f

 /*
- * max without conditionals. Idea adapted from:
+ * gas compatible max based on the idea from:
  * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
+ *
+ * The additional "-" is needed because gas uses a "true" value of -1.
 */
 #define alt_max_short(a, b)	((a) ^ (((a) ^ (b)) & -(-((a) < (b)))))

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index c096624137ae..ccbe24e697c4 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -103,12 +103,12 @@ static inline int alternatives_text_reserved(void *start, void *end)
 	alt_end_marker ":\n"

 /*
- * max without conditionals. Idea adapted from:
+ * gas compatible max based on the idea from:
  * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
  *
- * The additional "-" is needed because gas works with s32s.
+ * The additional "-" is needed because gas uses a "true" value of -1.
 */
-#define alt_max_short(a, b)	"((" a ") ^ (((" a ") ^ (" b ")) & -(-((" a ") - (" b ")))))"
+#define alt_max_short(a, b)	"((" a ") ^ (((" a ") ^ (" b ")) & -(-((" a ") < (" b ")))))"

 /*
  * Pad the second replacement alternative with additional NOPs if it is
--
1.7.10.4
Re: [PATCH] x86/alternatives: Fix alt_max_short macro to really be a max()
On 5 October 2017 at 14:52, Michael Matz <m...@suse.de> wrote:
> On Thu, 5 Oct 2017, Borislav Petkov wrote:
>> On Thu, Oct 05, 2017 at 02:35:33PM +0200, Mathias Krause wrote:
>> > Note the "<"! ...comment is wrong, though the implementation works!
>>
>> I know, I realized that when I looked at alternative-asm.h. Wanted to
>> double-check it with Micha first.
>
> Yeah, for bit-twiddling the result of arithmetic would need to be
> booleanized first, or alternatively the boolean operators be used in
> the first place. So if '<' works then that's better in this context.

Ack!

> (In a different context, or in the same one there definitely was a
> problem with using '<', but I can't remember the details, it's too
> long ago we discussed about this; maybe it even was a problem only
> with some binutils versions. So I'd suggest using the more obvious
> way until problems reoccur, and then document why exactly using
> relational ops was a problem ;-) )

Might be because the "true" value of gas' < operator is non-obvious
with being -1. But, well, if you don't know, I don't know either ;)

Anyways, the alt_max_short() macro in alternative.h is plain wrong
while the one in alternative-asm.h works and is used in a few places,
even with varying lengths of the alternatives (e.g.
arch/x86/entry/vdso/vdso32/system_call.S,
arch/x86/lib/{clear_page_64.S,memcpy_64.S,memset_64.S}), therefore
proves to be functional.

Thanks,
Mathias
Re: [PATCH] x86/alternatives: Fix alt_max_short macro to really be a max()
On 5 October 2017 at 09:58, Mathias Krause <mini...@googlemail.com> wrote:
> On 5 October 2017 at 09:38, Borislav Petkov <b...@alien8.de> wrote:
>> On Wed, Oct 04, 2017 at 08:08:12PM +0200, Mathias Krause wrote:
>>> diff --git a/arch/x86/include/asm/alternative.h
>>> b/arch/x86/include/asm/alternative.h
>>> index c096624137ae..7c553f48f163 100644
>>> --- a/arch/x86/include/asm/alternative.h
>>> +++ b/arch/x86/include/asm/alternative.h
>>> @@ -106,9 +106,9 @@ static inline int alternatives_text_reserved(void
>>> *start, void *end)
>>>  * max without conditionals. Idea adapted from:
>>                              ^
>>
>> You did read this part, right?

Oh, btw., quoting its counterpart from
arch/x86/include/asm/alternative-asm.h:

/*
 * max without conditionals. Idea adapted from:
 * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
 */
#define alt_max_short(a, b)	((a) ^ (((a) ^ (b)) & -(-((a) < (b)))))

Note the "<"! ...comment is wrong, though the implementation works!

Cheers,
Mathias
Re: [PATCH] x86/alternatives: Fix alt_max_short macro to really be a max()
On 5 October 2017 at 09:38, Borislav Petkov <b...@alien8.de> wrote:
> On Wed, Oct 04, 2017 at 08:08:12PM +0200, Mathias Krause wrote:
>> The alt_max_short() macro in asm/alternative.h does not work as
>> intended, leading to nasty bugs. E.g. alt_max_short("1", "3")
>> evaluates to 3, but alt_max_short("3", "1") evaluates to 1 -- not
>> exactly the maximum of 1 and 3.
>>
>> In fact, I had to learn it the hard way by crashing my kernel in not
>> so funny ways by attempting to make use of the ALTERNATIVE_2 macro
>> with alternatives where the first one was larger than the second
>> one.
>>
>> According to [1] and commit dbe4058a6a44 ("x86/alternatives: Fix
>> ALTERNATIVE_2 padding generation properly") the right-hand side
>> should read "-(-(a < b))" not "-(-(a - b))". Fix that, to make the
>> macro work as intended.
>>
>> While at it, fix up the comment regarding the additional "-", too.
>> It's not about gas' usage of s32 but brain dead logic of having a
>> "true" value of -1 for the < operator ... *sigh*
>>
>> Btw., the one in asm/alternative-asm.h is correct. And, apparently,
>> all current users of ALTERNATIVE_2() pass same sized alternatives,
>> avoiding to hit the bug.
>>
>> [1] http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
>>
>> Fixes: dbe4058a6a44 ("x86/alternatives: Fix ALTERNATIVE_2 padding
>> generation properly")
>> Signed-off-by: Mathias Krause <mini...@googlemail.com>
>> Cc: Borislav Petkov <b...@suse.de>
>> ---
>>  arch/x86/include/asm/alternative.h | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/alternative.h
>> b/arch/x86/include/asm/alternative.h
>> index c096624137ae..7c553f48f163 100644
>> --- a/arch/x86/include/asm/alternative.h
>> +++ b/arch/x86/include/asm/alternative.h
>> @@ -106,9 +106,9 @@ static inline int alternatives_text_reserved(void
>> *start, void *end)
>>  * max without conditionals. Idea adapted from:
>                               ^
>
> You did read this part, right?

Yes. But it's just wrong the way it is right now. It's a crap number
generator instead of being a "max without conditionals".

> AFAIR, gas can't stomach conditionals but I don't remember the details
> anymore.

Can't be, as arch/x86/include/asm/alternative.h itself makes use of
them for the implementation of ALTERNATIVE[_2], see, e.g. __OLDINSTR()
and OLDINSTR_2().

> Could be that -1 representation of "true". Let me add Micha to CC.
>
> Anyway, how can I reproduce your observation? Code snippet and
> compiler pls.

Here you go:

$ cat alt_max.c
#ifdef VANILLA
# define alt_max_short(a, b) \
	"((" a ") ^ (((" a ") ^ (" b ")) & -(-((" a ") - (" b ")))))"
#else
# define alt_max_short(a, b) \
	"((" a ") ^ (((" a ") ^ (" b ")) & -(-((" a ") < (" b ")))))"
#endif

#define ams(a, b) \
	"ams_" #a "_" #b " = " alt_max_short(#a, #b) "\n\t"

asm(ams(1, 3) ams(3, 1) ams(1, 6));

$ gcc -DVANILLA -c alt_max.c && nm alt_max.o
0003 a ams_1_3
0002 a ams_1_6
0001 a ams_3_1

$ gcc -c alt_max.c && nm alt_max.o
0003 a ams_1_3
0006 a ams_1_6
0003 a ams_3_1

Note not only how it gets the max(3,1) case wrong but generates even
more insane crap for max(1,6).

$ as --version | head -1
GNU assembler (GNU Binutils for Debian) 2.25

Cheers,
Mathias

> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.
[PATCH] x86/alternatives: Fix alt_max_short macro to really be a max()
The alt_max_short() macro in asm/alternative.h does not work as intended, leading to nasty bugs. E.g. alt_max_short("1", "3") evaluates to 3, but alt_max_short("3", "1") evaluates to 1 -- not exactly the maximum of 1 and 3.

In fact, I had to learn it the hard way by crashing my kernel in not so funny ways by attempting to make use of the ALTERNATIVE_2 macro with alternatives where the first one was larger than the second one.

According to [1] and commit dbe4058a6a44 ("x86/alternatives: Fix ALTERNATIVE_2 padding generation properly") the right-hand side should read "-(-(a < b))" not "-(-(a - b))". Fix that, to make the macro work as intended.

While at it, fix up the comment regarding the additional "-", too. It's not about gas' usage of s32 but brain dead logic of having a "true" value of -1 for the < operator ... *sigh*

Btw., the one in asm/alternative-asm.h is correct. And, apparently, all current users of ALTERNATIVE_2() pass same sized alternatives, avoiding to hit the bug.

[1] http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax

Fixes: dbe4058a6a44 ("x86/alternatives: Fix ALTERNATIVE_2 padding generation properly")
Signed-off-by: Mathias Krause <mini...@googlemail.com>
Cc: Borislav Petkov <b...@suse.de>
---
 arch/x86/include/asm/alternative.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index c096624137ae..7c553f48f163 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -106,9 +106,9 @@ static inline int alternatives_text_reserved(void *start, void *end)
  * max without conditionals. Idea adapted from:
  * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
  *
- * The additional "-" is needed because gas works with s32s.
+ * The additional "-" is needed because gas uses a "true" value of -1.
  */
-#define alt_max_short(a, b)	"((" a ") ^ (((" a ") ^ (" b ")) & -(-((" a ") - (" b ")))))"
+#define alt_max_short(a, b)	"((" a ") ^ (((" a ") ^ (" b ")) & -(-((" a ") < (" b ")))))"
 
 /*
  * Pad the second replacement alternative with additional NOPs if it is
-- 
1.7.10.4
[tip:x86/urgent] x86/vdso: Ensure vdso32_enabled gets set to valid values only
Commit-ID:  c06989da39cdb10604d572c8c7ea8c8c97f3c483
Gitweb:     http://git.kernel.org/tip/c06989da39cdb10604d572c8c7ea8c8c97f3c483
Author:     Mathias Krause <mini...@googlemail.com>
AuthorDate: Mon, 10 Apr 2017 17:14:27 +0200
Committer:  Thomas Gleixner <t...@linutronix.de>
CommitDate: Mon, 10 Apr 2017 18:31:41 +0200

x86/vdso: Ensure vdso32_enabled gets set to valid values only

vdso_enabled can be set to arbitrary integer values via the kernel command line 'vdso32=' parameter or via 'sysctl abi.vsyscall32'. load_vdso32() only maps VDSO if vdso_enabled == 1, but ARCH_DLINFO_IA32 merely checks for vdso_enabled != 0. As a consequence the AT_SYSINFO_EHDR auxiliary vector for the VDSO_ENTRY is emitted with a NULL pointer which causes a segfault when the application tries to use the VDSO.

Restrict the valid arguments on the command line and the sysctl to 0 and 1.

Fixes: b0b49f2673f0 ("x86, vdso: Remove compat vdso support")
Signed-off-by: Mathias Krause <mini...@googlemail.com>
Acked-by: Andy Lutomirski <l...@amacapital.net>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: sta...@vger.kernel.org
Cc: Roland McGrath <rol...@redhat.com>
Link: http://lkml.kernel.org/r/1491424561-7187-1-git-send-email-mini...@googlemail.com
Link: http://lkml.kernel.org/r/20170410151723.518412...@linutronix.de
Signed-off-by: Thomas Gleixner <t...@linutronix.de>
---
 arch/x86/entry/vdso/vdso32-setup.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/vdso/vdso32-setup.c b/arch/x86/entry/vdso/vdso32-setup.c
index 7853b53..3f9d1a8 100644
--- a/arch/x86/entry/vdso/vdso32-setup.c
+++ b/arch/x86/entry/vdso/vdso32-setup.c
@@ -30,8 +30,10 @@ static int __init vdso32_setup(char *s)
 {
 	vdso32_enabled = simple_strtoul(s, NULL, 0);
 
-	if (vdso32_enabled > 1)
+	if (vdso32_enabled > 1) {
 		pr_warn("vdso32 values other than 0 and 1 are no longer allowed; vdso disabled\n");
+		vdso32_enabled = 0;
+	}
 
 	return 1;
 }
@@ -62,13 +64,18 @@ subsys_initcall(sysenter_setup);
 
 /* Register vsyscall32 into the ABI table */
 #include <linux/sysctl.h>
 
+static const int zero;
+static const int one = 1;
+
 static struct ctl_table abi_table2[] = {
 	{
 		.procname	= "vsyscall32",
 		.data		= &vdso32_enabled,
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= (int *)&zero,
+		.extra2		= (int *)&one,
 	},
 	{}
 };
Re: [kernel-hardening] Re: [RFC v2][PATCH 04/11] x86: Implement __arch_rare_write_begin/unmap()
On 7 April 2017 at 15:14, Thomas Gleixner <t...@linutronix.de> wrote:
> On Fri, 7 Apr 2017, Mathias Krause wrote:
>> On 7 April 2017 at 11:46, Thomas Gleixner <t...@linutronix.de> wrote:
>> > Whether protected by preempt_disable or local_irq_disable, to make that
>> > work it needs CR0 handling in the exception entry/exit at the lowest
>> > level. And that's just a nightmare maintenance wise as it's prone to be
>> > broken over time.
>>
>> It seems to be working fine for more than a decade now in PaX. So it
>> can't be such a big maintenance nightmare ;)
>
> I really do not care whether PaX wants to chase and verify that over and
> over. I certainly don't want to take the chance to leak CR0.WP ever and I
> very much care about extra stuff to check in the entry/exit path.

Fair enough. However, placing a BUG_ON(!(read_cr0() & X86_CR0_WP)) somewhere sensible should make those "leaks" visible fast -- and their exploitation impossible, i.e. fail hard.

>> The "proper solution" seems to be much slower compared to just
>> toggling CR0.WP (which is costly in itself, already) because of the
>> TLB invalidation / synchronisation involved.
>
> Why the heck should we care about rare writes being performant?

As soon as they stop being rare and people start extending the r/o protection to critical data structures accessed often. Then performance matters.

>> > It's valid (at least on x86) to have a shadow map with the same page
>> > attributes but write enabled. That does not require any fixups of CR0 and
>> > just works.
>>
>> "Just works", sure -- but it's not as tightly focused as the PaX
>> solution which is CPU local, while your proposed solution is globally
>> visible.
>
> Making the world and some more writeable hardly qualifies as tightly
> focussed. Making the mapping concept CPU local is not rocket science
> either. The question is whether it's worth the trouble.
No, the question is if the value of the concept is well understood and if people can see what could be done with such a strong primitive. Apparently not... Cheers, Mathias
Re: [kernel-hardening] Re: [RFC v2][PATCH 04/11] x86: Implement __arch_rare_write_begin/unmap()
On 7 April 2017 at 11:46, Thomas Gleixner <t...@linutronix.de> wrote:
> On Fri, 7 Apr 2017, Mathias Krause wrote:
>> On 6 April 2017 at 17:59, Andy Lutomirski <l...@amacapital.net> wrote:
>> > On Wed, Apr 5, 2017 at 5:14 PM, Kees Cook <keesc...@chromium.org> wrote:
>> >> static __always_inline rare_write_begin(void)
>> >> {
>> >>     preempt_disable();
>> >>     local_irq_disable();
>> >>     barrier();
>> >>     __arch_rare_write_begin();
>> >>     barrier();
>> >> }
>> >
>> > Looks good, except you don't need preempt_disable().
>> > local_irq_disable() also disables preemption. You might need to use
>> > local_irq_save(), though, depending on whether any callers already
>> > have IRQs off.
>>
>> Well, doesn't look good to me. NMIs will still be able to interrupt
>> this code and will run with CR0.WP = 0.
>>
>> Shouldn't you instead question yourself why PaX can do it "just" with
>> preempt_disable() instead?!
>
> That's silly. Just because PaX does it, doesn't mean it's correct. To be
> honest, playing games with the CR0.WP bit is outright stupid to begin with.

Why that? It allows fast and CPU local modifications of r/o memory. OTOH, an approach that needs to fiddle with page table entries requires global synchronization to keep the individual TLB states in sync. Hmm.. Not that fast, I'd say.

> Whether protected by preempt_disable or local_irq_disable, to make that
> work it needs CR0 handling in the exception entry/exit at the lowest
> level. And that's just a nightmare maintenance wise as it's prone to be
> broken over time.

It seems to be working fine for more than a decade now in PaX. So it can't be such a big maintenance nightmare ;)

> Aside of that it's pointless overhead for the normal case.
>
> The proper solution is:
>
> write_rare(ptr, val)
> {
>         mp = map_shadow_rw(ptr);
>         *mp = val;
>         unmap_shadow_rw(mp);
> }
>
> map_shadow_rw() is essentially the same thing as we do in the highmem case
> where the kernel creates a shadow mapping of the user space pages via
> kmap_atomic().

The "proper solution" seems to be much slower compared to just toggling CR0.WP (which is costly in itself, already) because of the TLB invalidation / synchronisation involved.

> It's valid (at least on x86) to have a shadow map with the same page
> attributes but write enabled. That does not require any fixups of CR0 and
> just works.

"Just works", sure -- but it's not as tightly focused as the PaX solution which is CPU local, while your proposed solution is globally visible.

Cheers,
Mathias
Re: [kernel-hardening] Re: [RFC v2][PATCH 04/11] x86: Implement __arch_rare_write_begin/unmap()
On 6 April 2017 at 17:59, Andy Lutomirski <l...@amacapital.net> wrote:
> On Wed, Apr 5, 2017 at 5:14 PM, Kees Cook <keesc...@chromium.org> wrote:
>> On Wed, Apr 5, 2017 at 4:57 PM, Andy Lutomirski wrote:
>>> On Wed, Mar 29, 2017 at 6:41 PM, Kees Cook wrote:
>>>> On Wed, Mar 29, 2017 at 3:38 PM, Andy Lutomirski wrote:
>>>>> On Wed, Mar 29, 2017 at 11:15 AM, Kees Cook wrote:
>>>>>> Based on PaX's x86 pax_{open,close}_kernel() implementation, this
>>>>>> allows HAVE_ARCH_RARE_WRITE to work on x86.
>>>>>>
>>>>>> +
>>>>>> +static __always_inline unsigned long __arch_rare_write_begin(void)
>>>>>> +{
>>>>>> +       unsigned long cr0;
>>>>>> +
>>>>>> +       preempt_disable();
>>>>>
>>>>> This looks wrong. DEBUG_LOCKS_WARN_ON(!irqs_disabled()) would work,
>>>>> as would local_irq_disable(). There's no way that just disabling
>>>>> preemption is enough.
>>>>>
>>>>> (Also, how does this interact with perf nmis?)
>>>>
>>>> Do you mean preempt_disable() isn't strong enough here? I'm open to
>>>> suggestions. The goal would be to make sure nothing between _begin
>>>> and _end would get executed without interruption...
>>>
>>> Sorry for the very slow response.
>>>
>>> preempt_disable() isn't strong enough to prevent interrupts, and an
>>> interrupt here would run with WP off, causing unknown havoc. I tend
>>> to think that the caller should be responsible for turning off
>>> interrupts.
>>
>> So, something like:
>>
>> Top-level functions:
>>
>> static __always_inline rare_write_begin(void)
>> {
>>     preempt_disable();
>>     local_irq_disable();
>>     barrier();
>>     __arch_rare_write_begin();
>>     barrier();
>> }
>
> Looks good, except you don't need preempt_disable().
> local_irq_disable() also disables preemption. You might need to use
> local_irq_save(), though, depending on whether any callers already
> have IRQs off.

Well, doesn't look good to me. NMIs will still be able to interrupt this code and will run with CR0.WP = 0.

Shouldn't you instead question yourself why PaX can do it "just" with preempt_disable() instead?!

Cheers,
Mathias
[PATCH] x86/vdso: ensure vdso32_enabled gets set to valid values only
If either via kernel command line 'vdso32=' or via 'sysctl abi.vsyscall32' vdso32_enabled gets set to a value below 0 or above 1, load_vdso32() won't map the vDSO but ARCH_DLINFO_IA32 would still pass an AT_SYSINFO_EHDR auxiliary vector, however with a NULL pointer. That'll make any program trying to make use of it fail with a segmentation fault. At least musl makes use of it if the kernel provides it.

Ensure vdso32_enabled gets set to valid values only to fix this corner case.

Fixes: b0b49f2673f0 ("x86, vdso: Remove compat vdso support")
Cc: Andy Lutomirski <l...@amacapital.net>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: Roland McGrath <rol...@redhat.com>
Signed-off-by: Mathias Krause <mini...@googlemail.com>
---
 arch/x86/entry/vdso/vdso32-setup.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/vdso/vdso32-setup.c b/arch/x86/entry/vdso/vdso32-setup.c
index 7853b53959cd..ca312c174d6f 100644
--- a/arch/x86/entry/vdso/vdso32-setup.c
+++ b/arch/x86/entry/vdso/vdso32-setup.c
@@ -30,8 +30,10 @@ static int __init vdso32_setup(char *s)
 {
 	vdso32_enabled = simple_strtoul(s, NULL, 0);
 
-	if (vdso32_enabled > 1)
+	if (vdso32_enabled > 1) {
 		pr_warn("vdso32 values other than 0 and 1 are no longer allowed; vdso disabled\n");
+		vdso32_enabled = 0;
+	}
 
 	return 1;
 }
@@ -62,13 +64,18 @@ int __init sysenter_setup(void)
 
 /* Register vsyscall32 into the ABI table */
 #include <linux/sysctl.h>
 
+static const int zero;
+static const int one = 1;
+
 static struct ctl_table abi_table2[] = {
 	{
 		.procname	= "vsyscall32",
 		.data		= &vdso32_enabled,
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= (int *)&zero,
+		.extra2		= (int *)&one,
 	},
 	{}
 };
-- 
1.7.10.4
[PATCH v2] cris: remove unused wp_works_ok macro
It had no use since its introduction in v2.4.1.2. Get rid of it.

Cc: Mikael Starvik <star...@axis.com>
Signed-off-by: Mathias Krause <mini...@googlemail.com>
Acked-by: Jesper Nilsson <jesper.nils...@axis.com>
---
Same patch as v1 but as the tip folks took only the x86 parts, I think, this one should go through the CRIS tree.

Cheers,
Mathias

 arch/cris/include/arch-v10/arch/processor.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/cris/include/arch-v10/arch/processor.h b/arch/cris/include/arch-v10/arch/processor.h
index 93feb2a487d8..58f75bee1d6c 100644
--- a/arch/cris/include/arch-v10/arch/processor.h
+++ b/arch/cris/include/arch-v10/arch/processor.h
@@ -7,9 +7,6 @@
  */
 #define current_text_addr() ({void *pc; __asm__ ("move.d $pc,%0" : "=rm" (pc)); pc; })
 
-/* CRIS has no problems with write protection */
-#define wp_works_ok 1
-
 /* CRIS thread_struct. this really has nothing to do with the processor itself, since
  * CRIS does not do any hardware task-switching, but it's here for legacy reasons.
  * The thread_struct here is used when task-switching using _resume defined in entry.S.
-- 
1.7.10.4
[PATCH v2] sparc: remove unused wp_works_ok macro
It's been unused for ages; it used to be required for ksyms.c back in the v1.1 times.

Signed-off-by: Mathias Krause <mini...@googlemail.com>
Acked-by: David S. Miller <da...@davemloft.net>
---
Hi Dave,

same patch as v1 but as the tip folks took only the x86 parts, I think, this one should go through the SPARC tree.

Cheers,
Mathias

 arch/sparc/include/asm/processor_32.h | 6 ------
 arch/sparc/include/asm/processor_64.h | 4 ----
 2 files changed, 10 deletions(-)

diff --git a/arch/sparc/include/asm/processor_32.h b/arch/sparc/include/asm/processor_32.h
index 365d4cb267b4..dd27159819eb 100644
--- a/arch/sparc/include/asm/processor_32.h
+++ b/arch/sparc/include/asm/processor_32.h
@@ -18,12 +18,6 @@
 #include
 #include
 
-/*
- * The sparc has no problems with write protection
- */
-#define wp_works_ok 1
-#define wp_works_ok__is_a_macro /* for versions in ksyms.c */
-
 /* Whee, this is STACK_TOP + PAGE_SIZE and the lowest kernel address too...
  * That one page is used to protect kernel from intruders, so that
  * we can make our access_ok test faster
diff --git a/arch/sparc/include/asm/processor_64.h b/arch/sparc/include/asm/processor_64.h
index 6448cfc8292f..b58ee9018433 100644
--- a/arch/sparc/include/asm/processor_64.h
+++ b/arch/sparc/include/asm/processor_64.h
@@ -18,10 +18,6 @@
 #include
 #include
 
-/* The sparc has no problems with write protection */
-#define wp_works_ok 1
-#define wp_works_ok__is_a_macro /* for versions in ksyms.c */
-
 /*
  * User lives in his very own context, and cannot reference us. Note
  * that TASK_SIZE is a misnomer, it really gives maximum user virtual
-- 
1.7.10.4
[tip:x86/cpu] x86/cpu: Drop wp_works_ok member of struct cpuinfo_x86
Commit-ID: 6415813bae75feba10b8ca3ed6634a72c2a4d313
Gitweb: http://git.kernel.org/tip/6415813bae75feba10b8ca3ed6634a72c2a4d313
Author: Mathias Krause
AuthorDate: Sun, 12 Feb 2017 22:12:08 +0100
Committer: Thomas Gleixner
CommitDate: Sat, 11 Mar 2017 14:30:24 +0100

x86/cpu: Drop wp_works_ok member of struct cpuinfo_x86

Remove the wp_works_ok member of struct cpuinfo_x86. It's an optimization back from Linux v0.99 times where we had no fixup support yet and did the CR0.WP test via special code in the page fault handler. The < 0 test was an optimization to not do the special casing for each NULL ptr access violation but just for the first one doing the WP test.

Today it serves no real purpose as the test no longer needs special code in the page fault handler and the only call site -- mem_init() -- calls it just once, anyway.

However, Xen pre-initializes it to 1, to skip the test. Doing the test again for Xen should be no issue at all, as even the commit introducing skipping the test (commit d560bc61575e ("x86, xen: Suppress WP test on Xen")) mentioned it being a band aid only. And, in fact, testing the patch on Xen showed nothing breaks.

The pre-fixup times are long gone and with the removal of the fallback handling code in commit a5c2a893dbd4 ("x86, 386 removal: Remove CONFIG_X86_WP_WORKS_OK") the kernel requires a working CR0.WP anyway. So just get rid of the "optimization" and do the test unconditionally.

Signed-off-by: Mathias Krause
Acked-by: Borislav Petkov
Cc: Jesper Nilsson
Cc: Jeremy Fitzhardinge
Cc: Arnd Hannemann
Cc: Mikael Starvik
Cc: Geert Uytterhoeven
Cc: Andrew Morton
Cc: "David S. Miller"
Link: http://lkml.kernel.org/r/1486933932-585-3-git-send-email-mini...@googlemail.com
Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/processor.h | 4 +---
 arch/x86/kernel/cpu/proc.c       | 5 ++---
 arch/x86/kernel/setup.c          | 11 ++++-------
 arch/x86/mm/init_32.c            | 9 +++++----
 arch/x86/xen/enlighten.c         | 1 -
 5 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 893f80e..4aa93b5 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -89,9 +89,7 @@ struct cpuinfo_x86 {
 	__u8	x86_vendor;	/* CPU vendor */
 	__u8	x86_model;
 	__u8	x86_mask;
-#ifdef CONFIG_X86_32
-	char	wp_works_ok;	/* It doesn't on 386's */
-#else
+#ifdef CONFIG_X86_64
 	/* Number of 4K pages in DTLB/ITLB combined(in pages): */
 	int	x86_tlbsize;
 #endif
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index 18ca99f..6df621a 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -31,14 +31,13 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c)
 		"fpu\t\t: %s\n"
 		"fpu_exception\t: %s\n"
 		"cpuid level\t: %d\n"
-		"wp\t\t: %s\n",
+		"wp\t\t: yes\n",
 		static_cpu_has_bug(X86_BUG_FDIV) ? "yes" : "no",
 		static_cpu_has_bug(X86_BUG_F00F) ? "yes" : "no",
 		static_cpu_has_bug(X86_BUG_COMA) ? "yes" : "no",
 		static_cpu_has(X86_FEATURE_FPU) ? "yes" : "no",
 		static_cpu_has(X86_FEATURE_FPU) ? "yes" : "no",
-		c->cpuid_level,
-		c->wp_works_ok ? "yes" : "no");
+		c->cpuid_level);
 }
 #else
 static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 4bf0c89..7cd7bbe 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -173,14 +173,11 @@ static struct resource bss_resource = {
 #ifdef CONFIG_X86_32
-/* cpu data as detected by the assembly code in head.S */
-struct cpuinfo_x86 new_cpu_data = {
-	.wp_works_ok = -1,
-};
+/* cpu data as detected by the assembly code in head_32.S */
+struct cpuinfo_x86 new_cpu_data;
+
 /* common cpu data for all cpus */
-struct cpuinfo_x86 boot_cpu_data __read_mostly = {
-	.wp_works_ok = -1,
-};
+struct cpuinfo_x86 boot_cpu_data __read_mostly;
 EXPORT_SYMBOL(boot_cpu_data);

 unsigned int def_to_bigsmp;
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 2b4b53e..4dddfaf 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -716,15 +716,17 @@ void __init paging_init(void)
  */
 static void __init test_wp_bit(void)
 {
+	int wp_works_ok;
+
 	printk(KERN_INFO "Checking if this processor honours the WP bit even in supervisor mode...");
[tip:x86/cpu] x86/cpu: Drop unneded members of struct cpuinfo_x86
Commit-ID: 04402116846f36adea9503d7cd5104a7ed27a1a6
Gitweb: http://git.kernel.org/tip/04402116846f36adea9503d7cd5104a7ed27a1a6
Author: Mathias Krause <mini...@googlemail.com>
AuthorDate: Sun, 12 Feb 2017 22:12:07 +0100
Committer: Thomas Gleixner <t...@linutronix.de>
CommitDate: Sat, 11 Mar 2017 14:30:23 +0100

x86/cpu: Drop unneded members of struct cpuinfo_x86

Those members serve no purpose -- not even fill padding for alignment or such. So just get rid of them.

Signed-off-by: Mathias Krause <mini...@googlemail.com>
Acked-by: Borislav Petkov <b...@alien8.de>
Cc: Jesper Nilsson <jesper.nils...@axis.com>
Cc: Mikael Starvik <star...@axis.com>
Cc: Geert Uytterhoeven <ge...@linux-m68k.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: "David S. Miller" <da...@davemloft.net>
Link: http://lkml.kernel.org/r/1486933932-585-2-git-send-email-mini...@googlemail.com
Signed-off-by: Thomas Gleixner <t...@linutronix.de>
---
 arch/x86/include/asm/processor.h | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f385eca..893f80e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -80,7 +80,7 @@ extern u16 __read_mostly tlb_lld_1g[NR_INFO];

 /*
  * CPU type and hardware bug flags. Kept separately for each CPU.
- * Members of this structure are referenced in head.S, so think twice
+ * Members of this structure are referenced in head_32.S, so think twice
  * before touching them. [mj]
  */

@@ -91,11 +91,6 @@ struct cpuinfo_x86 {
 	__u8	x86_mask;
 #ifdef CONFIG_X86_32
 	char	wp_works_ok;	/* It doesn't on 386's */
-
-	/* Problems on some 486Dx4's and old 386's: */
-	char	rfu;
-	char	pad0;
-	char	pad1;
 #else
 	/* Number of 4K pages in DTLB/ITLB combined(in pages): */
 	int	x86_tlbsize;
Re: [PATCH 3/6] x86/cpu: proc - remove "wp" status line in cpuinfo
On 14 February 2017 at 22:42, Mathias Krause <mini...@googlemail.com> wrote:
> On 14 February 2017 at 19:13, H. Peter Anvin <h...@zytor.com> wrote:
>> On 02/12/17 13:12, Mathias Krause wrote:
>>> As of commit a5c2a893dbd4 ("x86, 386 removal: Remove
>>> CONFIG_X86_WP_WORKS_OK") the kernel won't boot if CR0.WP isn't working
>>> correctly. This makes a process reading this file always see "wp : yes"
>>> here -- otherwise there would be no process to begin with ;)
>>>
>>> As this status line in /proc/cpuinfo serves no purpose for quite some
>>> time now, get rid of it.
>>>
>>> Cc: Borislav Petkov <b...@alien8.de>
>>> Cc: H. Peter Anvin <h...@linux.intel.com>
>>> Signed-off-by: Mathias Krause <mini...@googlemail.com>
>>> ---
>>>  arch/x86/kernel/cpu/proc.c | 6 ++----
>>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
>>> index 6df621ae62a7..c6c5217a7980 100644
>>> --- a/arch/x86/kernel/cpu/proc.c
>>> +++ b/arch/x86/kernel/cpu/proc.c
>>> @@ -30,8 +30,7 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c)
>>> 		"coma_bug\t: %s\n"
>>> 		"fpu\t\t: %s\n"
>>> 		"fpu_exception\t: %s\n"
>>> -		"cpuid level\t: %d\n"
>>> -		"wp\t\t: yes\n",
>>> +		"cpuid level\t: %d\n",
>>> 		static_cpu_has_bug(X86_BUG_FDIV) ? "yes" : "no",
>>> 		static_cpu_has_bug(X86_BUG_F00F) ? "yes" : "no",
>>> 		static_cpu_has_bug(X86_BUG_COMA) ? "yes" : "no",
>>> @@ -45,8 +44,7 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c)
>>> 	seq_printf(m,
>>> 		"fpu\t\t: yes\n"
>>> 		"fpu_exception\t: yes\n"
>>> -		"cpuid level\t: %d\n"
>>> -		"wp\t\t: yes\n",
>>> +		"cpuid level\t: %d\n",
>>> 		c->cpuid_level);
>>>  }
>>>  #endif
>>>
>>
>> Potential userspace breakage, which is why the line was left in the
>> first place despite its value now being hard-coded. Removing it will
>> save a whopping 9 bytes of kernel rodata; this is a very small price to
>> pay for guaranteeing continued compatibility.
>
> Indeed. That's why I've separated the removal into an extra patch --
> to make it easier not to take it.
>
>> Nacked-by: H. Peter Anvin <h...@zytor.com>
>
> Do you want me to send the series again without this patch and patch
> #6 (Geert took it already) or are you okay with sorting them out
> yourself?
>

Ping... Peter, what's your preference here?

Mathias
Re: [PATCH 3/6] x86/cpu: proc - remove "wp" status line in cpuinfo
On 14 February 2017 at 19:13, H. Peter Anvin <h...@zytor.com> wrote:
> On 02/12/17 13:12, Mathias Krause wrote:
>> As of commit a5c2a893dbd4 ("x86, 386 removal: Remove
>> CONFIG_X86_WP_WORKS_OK") the kernel won't boot if CR0.WP isn't working
>> correctly. This makes a process reading this file always see "wp : yes"
>> here -- otherwise there would be no process to begin with ;)
>>
>> As this status line in /proc/cpuinfo serves no purpose for quite some
>> time now, get rid of it.
>>
>> Cc: Borislav Petkov <b...@alien8.de>
>> Cc: H. Peter Anvin <h...@linux.intel.com>
>> Signed-off-by: Mathias Krause <mini...@googlemail.com>
>> ---
>>  arch/x86/kernel/cpu/proc.c | 6 ++----
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
>> index 6df621ae62a7..c6c5217a7980 100644
>> --- a/arch/x86/kernel/cpu/proc.c
>> +++ b/arch/x86/kernel/cpu/proc.c
>> @@ -30,8 +30,7 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c)
>> 		"coma_bug\t: %s\n"
>> 		"fpu\t\t: %s\n"
>> 		"fpu_exception\t: %s\n"
>> -		"cpuid level\t: %d\n"
>> -		"wp\t\t: yes\n",
>> +		"cpuid level\t: %d\n",
>> 		static_cpu_has_bug(X86_BUG_FDIV) ? "yes" : "no",
>> 		static_cpu_has_bug(X86_BUG_F00F) ? "yes" : "no",
>> 		static_cpu_has_bug(X86_BUG_COMA) ? "yes" : "no",
>> @@ -45,8 +44,7 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c)
>> 	seq_printf(m,
>> 		"fpu\t\t: yes\n"
>> 		"fpu_exception\t: yes\n"
>> -		"cpuid level\t: %d\n"
>> -		"wp\t\t: yes\n",
>> +		"cpuid level\t: %d\n",
>> 		c->cpuid_level);
>>  }
>>  #endif
>>
>
> Potential userspace breakage, which is why the line was left in the
> first place despite its value now being hard-coded. Removing it will
> save a whopping 9 bytes of kernel rodata; this is a very small price to
> pay for guaranteeing continued compatibility.

Indeed. That's why I've separated the removal into an extra patch -- to make it easier not to take it.

> Nacked-by: H. Peter Anvin <h...@zytor.com>

Do you want me to send the series again without this patch and patch #6 (Geert took it already) or are you okay with sorting them out yourself?

Cheers, Mathias
Re: [PATCH 3/6] x86/cpu: proc - remove "wp" status line in cpuinfo
On 14 February 2017 at 17:20, Borislav Petkov <b...@alien8.de> wrote:
> On Sun, Feb 12, 2017 at 10:12:09PM +0100, Mathias Krause wrote:
>> As of commit a5c2a893dbd4 ("x86, 386 removal: Remove
>> CONFIG_X86_WP_WORKS_OK") the kernel won't boot if CR0.WP isn't working
>> correctly. This makes a process reading this file always see "wp : yes"
>> here -- otherwise there would be no process to begin with ;)
>>
>> As this status line in /proc/cpuinfo serves no purpose for quite some
>> time now, get rid of it.
>
> Right, sure, except /proc/cpuinfo's format is kind of an ABI and scripts
> rely on it, I'm being told.

That's the reason I haven't folded this change into patch 2. I had similar doubts but it's not documented in Documentation/ and kinda useless to test anyway -- what would a "wp : no" tell one?

> TBH, I'd remove that wp:-line too but this
> is just me. tip guys' call.
>
> FWIW, for all three:
>
> Acked-by: Borislav Petkov <b...@suse.de>

Thanks, Mathias
Re: [PATCH 1/6] x86: drop unneded members of struct cpuinfo_x86
On 14 February 2017 at 17:17, Borislav Petkov <b...@alien8.de> wrote:
> On Sun, Feb 12, 2017 at 10:12:07PM +0100, Mathias Krause wrote:
>> Those member serve no purpose -- not even fill padding for alignment or
>> such. So just get rid of them.
>
> Well, almost. You need the wp_works_ok removal patch too, otherwise you
> have the 3 bytes hole below.

Heh, indeed! But only since commit 79a8b9aa388b ("x86/CPU/AMD: Bring back Compute Unit ID") ;)

> But the wp_works_ok goes away too so I guess that's fine.
>
> $ pahole -C cpuinfo_x86 vmlinux
> struct cpuinfo_x86 {
>         __u8 x86;             /* 0 1 */
>         __u8 x86_vendor;      /* 1 1 */
>         __u8 x86_model;       /* 2 1 */
>         __u8 x86_mask;        /* 3 1 */
>         char wp_works_ok;     /* 4 1 */
>         __u8 x86_virt_bits;   /* 5 1 */
>         __u8 x86_phys_bits;   /* 6 1 */
>         __u8 x86_coreid_bits; /* 7 1 */
>         __u8 cu_id;           /* 8 1 */
>
>         /* XXX 3 bytes hole, try to pack */

The cu_id member is "new". Without it there would be no hole. But, yeah, without wp_works_ok everything's fine again.

Cheers, Mathias
[PATCH 2/6] x86/cpu: drop wp_works_ok member of struct cpuinfo_x86
Remove the wp_works_ok member of struct cpuinfo_x86. It's an optimization back from Linux v0.99 times where we had no fixup support yet and did the CR0.WP test via special code in the page fault handler. The < 0 test was an optimization to not do the special casing for each NULL ptr access violation but just for the first one doing the WP test.

Today it serves no real purpose as the test no longer needs special code in the page fault handler and the only call site -- mem_init() -- calls it just once, anyway.

However, Xen pre-initializes it to 1, to skip the test. Doing the test again for Xen should be no issue at all, as even the commit introducing skipping the test (commit d560bc61575e ("x86, xen: Suppress WP test on Xen")) mentioned it being a band aid only. And, in fact, testing the patch on Xen showed nothing breaks.

The pre-fixup times are long gone and with the removal of the fallback handling code in commit a5c2a893dbd4 ("x86, 386 removal: Remove CONFIG_X86_WP_WORKS_OK") the kernel requires a working CR0.WP anyway. So just get rid of the "optimization" and do the test unconditionally.

Cc: Borislav Petkov <b...@alien8.de>
Cc: H. Peter Anvin <h...@linux.intel.com>
Cc: Arnd Hannemann <hannem...@nets.rwth-aachen.de>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardi...@citrix.com>
Signed-off-by: Mathias Krause <mini...@googlemail.com>
---
 arch/x86/include/asm/processor.h | 4 +---
 arch/x86/kernel/cpu/proc.c       | 5 ++---
 arch/x86/kernel/setup.c          | 11 ++++-------
 arch/x86/mm/init_32.c            | 9 +++++----
 arch/x86/xen/enlighten.c         | 1 -
 5 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index bf7cb1e00ce7..7b15b29e8a66 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -89,9 +89,7 @@ struct cpuinfo_x86 {
 	__u8	x86_vendor;	/* CPU vendor */
 	__u8	x86_model;
 	__u8	x86_mask;
-#ifdef CONFIG_X86_32
-	char	wp_works_ok;	/* It doesn't on 386's */
-#else
+#ifdef CONFIG_X86_64
 	/* Number of 4K pages in DTLB/ITLB combined(in pages): */
 	int	x86_tlbsize;
 #endif
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index 18ca99f2798b..6df621ae62a7 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -31,14 +31,13 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c)
 		"fpu\t\t: %s\n"
 		"fpu_exception\t: %s\n"
 		"cpuid level\t: %d\n"
-		"wp\t\t: %s\n",
+		"wp\t\t: yes\n",
 		static_cpu_has_bug(X86_BUG_FDIV) ? "yes" : "no",
 		static_cpu_has_bug(X86_BUG_F00F) ? "yes" : "no",
 		static_cpu_has_bug(X86_BUG_COMA) ? "yes" : "no",
 		static_cpu_has(X86_FEATURE_FPU) ? "yes" : "no",
 		static_cpu_has(X86_FEATURE_FPU) ? "yes" : "no",
-		c->cpuid_level,
-		c->wp_works_ok ? "yes" : "no");
+		c->cpuid_level);
 }
 #else
 static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 4cfba947d774..ffc2791ab256 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -173,14 +173,11 @@ int default_check_phys_apicid_present(int phys_apicid)
 #ifdef CONFIG_X86_32
-/* cpu data as detected by the assembly code in head.S */
-struct cpuinfo_x86 new_cpu_data = {
-	.wp_works_ok = -1,
-};
+/* cpu data as detected by the assembly code in head_32.S */
+struct cpuinfo_x86 new_cpu_data;
+
 /* common cpu data for all cpus */
-struct cpuinfo_x86 boot_cpu_data __read_mostly = {
-	.wp_works_ok = -1,
-};
+struct cpuinfo_x86 boot_cpu_data __read_mostly;
 EXPORT_SYMBOL(boot_cpu_data);

 unsigned int def_to_bigsmp;
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 928d657de829..e0fd0c8b9ad1 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -716,15 +716,17 @@ void __init paging_init(void)
  */
 static void __init test_wp_bit(void)
 {
+	int wp_works_ok;
+
 	printk(KERN_INFO "Checking if this processor honours the WP bit even in supervisor mode...");

 	/* Any page-aligned address will do, the test is non-destructive */
 	__set_fixmap(FIX_WP_TEST, __pa(_pg_dir), PAGE_KERNEL_RO);
-	boot_cpu_data.wp_works_ok = do_test_wp_bit();
+	wp_works_ok = do_test_wp_bit();
 	clear_fixmap(FIX_WP_TEST);

-	if (!boot_cpu_data.wp_works_ok) {
+	if (!wp_works_ok) {
[PATCH 4/6] sparc: remove unused wp_works_ok macro
It's unused for ages, used to be required for ksyms.c back in the v1.1 times. Cc: David S. Miller <da...@davemloft.net> Signed-off-by: Mathias Krause <mini...@googlemail.com> --- arch/sparc/include/asm/processor_32.h |6 -- arch/sparc/include/asm/processor_64.h |4 2 files changed, 10 deletions(-) diff --git a/arch/sparc/include/asm/processor_32.h b/arch/sparc/include/asm/processor_32.h index 365d4cb267b4..dd27159819eb 100644 --- a/arch/sparc/include/asm/processor_32.h +++ b/arch/sparc/include/asm/processor_32.h @@ -18,12 +18,6 @@ #include #include -/* - * The sparc has no problems with write protection - */ -#define wp_works_ok 1 -#define wp_works_ok__is_a_macro /* for versions in ksyms.c */ - /* Whee, this is STACK_TOP + PAGE_SIZE and the lowest kernel address too... * That one page is used to protect kernel from intruders, so that * we can make our access_ok test faster diff --git a/arch/sparc/include/asm/processor_64.h b/arch/sparc/include/asm/processor_64.h index 6448cfc8292f..b58ee9018433 100644 --- a/arch/sparc/include/asm/processor_64.h +++ b/arch/sparc/include/asm/processor_64.h @@ -18,10 +18,6 @@ #include #include -/* The sparc has no problems with write protection */ -#define wp_works_ok 1 -#define wp_works_ok__is_a_macro /* for versions in ksyms.c */ - /* * User lives in his very own context, and cannot reference us. Note * that TASK_SIZE is a misnomer, it really gives maximum user virtual -- 1.7.10.4
[PATCH 2/6] x86/cpu: drop wp_works_ok member of struct cpuinfo_x86
Remove the wp_works_ok member of struct cpuinfo_x86. It's an optimization back from Linux v0.99 times where we had no fixup support yet and did the CR0.WP test via special code in the page fault handler. The < 0 test was an optimization to not do the special casing for each NULL ptr access violation but just for the first one doing the WP test. Today it serves no real purpose as the test no longer needs special code in the page fault handler and the only call site -- mem_init() -- calls it just once, anyway. However, Xen pre-initializes it to 1, to skip the test. Doing the test again for Xen should be no issue at all, as even the commit introducing skipping the test (commit d560bc61575e ("x86, xen: Suppress WP test on Xen")) mentioned it being a band-aid only. And, in fact, testing the patch on Xen showed nothing breaks. The pre-fixup times are long gone and with the removal of the fallback handling code in commit a5c2a893dbd4 ("x86, 386 removal: Remove CONFIG_X86_WP_WORKS_OK") the kernel requires a working CR0.WP anyway. So just get rid of the "optimization" and do the test unconditionally. Cc: Borislav Petkov Cc: H. 
Peter Anvin Cc: Arnd Hannemann Cc: Jeremy Fitzhardinge Signed-off-by: Mathias Krause --- arch/x86/include/asm/processor.h |4 +--- arch/x86/kernel/cpu/proc.c |5 ++--- arch/x86/kernel/setup.c | 11 --- arch/x86/mm/init_32.c|9 + arch/x86/xen/enlighten.c |1 - 5 files changed, 12 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index bf7cb1e00ce7..7b15b29e8a66 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -89,9 +89,7 @@ struct cpuinfo_x86 { __u8x86_vendor; /* CPU vendor */ __u8x86_model; __u8x86_mask; -#ifdef CONFIG_X86_32 - charwp_works_ok;/* It doesn't on 386's */ -#else +#ifdef CONFIG_X86_64 /* Number of 4K pages in DTLB/ITLB combined(in pages): */ int x86_tlbsize; #endif diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c index 18ca99f2798b..6df621ae62a7 100644 --- a/arch/x86/kernel/cpu/proc.c +++ b/arch/x86/kernel/cpu/proc.c @@ -31,14 +31,13 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c) "fpu\t\t: %s\n" "fpu_exception\t: %s\n" "cpuid level\t: %d\n" - "wp\t\t: %s\n", + "wp\t\t: yes\n", static_cpu_has_bug(X86_BUG_FDIV) ? "yes" : "no", static_cpu_has_bug(X86_BUG_F00F) ? "yes" : "no", static_cpu_has_bug(X86_BUG_COMA) ? "yes" : "no", static_cpu_has(X86_FEATURE_FPU) ? "yes" : "no", static_cpu_has(X86_FEATURE_FPU) ? "yes" : "no", - c->cpuid_level, - c->wp_works_ok ? 
"yes" : "no"); + c->cpuid_level); } #else static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c) diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 4cfba947d774..ffc2791ab256 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -173,14 +173,11 @@ int default_check_phys_apicid_present(int phys_apicid) #ifdef CONFIG_X86_32 -/* cpu data as detected by the assembly code in head.S */ -struct cpuinfo_x86 new_cpu_data = { - .wp_works_ok = -1, -}; +/* cpu data as detected by the assembly code in head_32.S */ +struct cpuinfo_x86 new_cpu_data; + /* common cpu data for all cpus */ -struct cpuinfo_x86 boot_cpu_data __read_mostly = { - .wp_works_ok = -1, -}; +struct cpuinfo_x86 boot_cpu_data __read_mostly; EXPORT_SYMBOL(boot_cpu_data); unsigned int def_to_bigsmp; diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index 928d657de829..e0fd0c8b9ad1 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -716,15 +716,17 @@ void __init paging_init(void) */ static void __init test_wp_bit(void) { + int wp_works_ok; + printk(KERN_INFO "Checking if this processor honours the WP bit even in supervisor mode..."); /* Any page-aligned address will do, the test is non-destructive */ __set_fixmap(FIX_WP_TEST, __pa(_pg_dir), PAGE_KERNEL_RO); - boot_cpu_data.wp_works_ok = do_test_wp_bit(); + wp_works_ok = do_test_wp_bit(); clear_fixmap(FIX_WP_TEST); - if (!boot_cpu_data.wp_works_ok) { + if (!wp_works_ok) { printk(KERN_CONT "No.\n"); panic("Linux doesn't support CPUs with broken WP."); } el
[PATCH 3/6] x86/cpu: proc - remove "wp" status line in cpuinfo
As of commit a5c2a893dbd4 ("x86, 386 removal: Remove CONFIG_X86_WP_WORKS_OK") the kernel won't boot if CR0.WP isn't working correctly. This makes a process reading this file always see "wp : yes" here -- otherwise there would be no process to begin with ;) As this status line in /proc/cpuinfo has served no purpose for quite some time now, get rid of it. Cc: Borislav Petkov <b...@alien8.de> Cc: H. Peter Anvin <h...@linux.intel.com> Signed-off-by: Mathias Krause <mini...@googlemail.com> --- arch/x86/kernel/cpu/proc.c |6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c index 6df621ae62a7..c6c5217a7980 100644 --- a/arch/x86/kernel/cpu/proc.c +++ b/arch/x86/kernel/cpu/proc.c @@ -30,8 +30,7 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c) "coma_bug\t: %s\n" "fpu\t\t: %s\n" "fpu_exception\t: %s\n" - "cpuid level\t: %d\n" - "wp\t\t: yes\n", + "cpuid level\t: %d\n", static_cpu_has_bug(X86_BUG_FDIV) ? "yes" : "no", static_cpu_has_bug(X86_BUG_F00F) ? "yes" : "no", static_cpu_has_bug(X86_BUG_COMA) ? "yes" : "no", @@ -45,8 +44,7 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c) seq_printf(m, "fpu\t\t: yes\n" "fpu_exception\t: yes\n" - "cpuid level\t: %d\n" - "wp\t\t: yes\n", + "cpuid level\t: %d\n", c->cpuid_level); } #endif -- 1.7.10.4
[PATCH 6/6] m68k: paging_init - remove dead code
The macro TEST_VERIFY_AREA can never be defined as there's no wp_works_ok variable. So just remove the dead code. Cc: Geert Uytterhoeven <ge...@linux-m68k.org> Signed-off-by: Mathias Krause <mini...@googlemail.com> --- arch/m68k/mm/sun3mmu.c |3 --- 1 file changed, 3 deletions(-) diff --git a/arch/m68k/mm/sun3mmu.c b/arch/m68k/mm/sun3mmu.c index b5b7d53f7283..177d776de1a0 100644 --- a/arch/m68k/mm/sun3mmu.c +++ b/arch/m68k/mm/sun3mmu.c @@ -44,9 +44,6 @@ void __init paging_init(void) unsigned long zones_size[MAX_NR_ZONES] = { 0, }; unsigned long size; -#ifdef TEST_VERIFY_AREA - wp_works_ok = 0; -#endif empty_zero_page = alloc_bootmem_pages(PAGE_SIZE); address = PAGE_OFFSET; -- 1.7.10.4
[PATCH 5/6] cris: remove unused wp_works_ok macro
It has had no use since its introduction in v2.4.1.2. Get rid of it. Cc: Jesper Nilsson <jesper.nils...@axis.com> Cc: Mikael Starvik <star...@axis.com> Signed-off-by: Mathias Krause <mini...@googlemail.com> --- arch/cris/include/arch-v10/arch/processor.h |3 --- 1 file changed, 3 deletions(-) diff --git a/arch/cris/include/arch-v10/arch/processor.h b/arch/cris/include/arch-v10/arch/processor.h index 93feb2a487d8..58f75bee1d6c 100644 --- a/arch/cris/include/arch-v10/arch/processor.h +++ b/arch/cris/include/arch-v10/arch/processor.h @@ -7,9 +7,6 @@ */ #define current_text_addr() ({void *pc; __asm__ ("move.d $pc,%0" : "=rm" (pc)); pc; }) -/* CRIS has no problems with write protection */ -#define wp_works_ok 1 - /* CRIS thread_struct. this really has nothing to do with the processor itself, since * CRIS does not do any hardware task-switching, but it's here for legacy reasons. * The thread_struct here is used when task-switching using _resume defined in entry.S. -- 1.7.10.4
[PATCH 0/6] struct cpuinfo_x86 related cleanups
This small series slims down the x86 specific struct cpuinfo_x86 (patches 1 and 2). It's kind of a continuation of Boris' work from 2013. Besides the x86 specific part it also cleans up other arches that had to have a wp_works_ok variable / define back in the old v1.1 times. But those times are long gone, so we can get rid of that ancient hackery. Therefore patches 4, 5 and 6 are independent of the x86 specific changes and could be taken by the individual arch maintainers if they prefer to. Please apply, Mathias Mathias Krause (6): x86: drop unneeded members of struct cpuinfo_x86 x86/cpu: drop wp_works_ok member of struct cpuinfo_x86 x86/cpu: proc - remove "wp" status line in cpuinfo sparc: remove unused wp_works_ok macro cris: remove unused wp_works_ok macro m68k: paging_init - remove dead code arch/cris/include/arch-v10/arch/processor.h |3 --- arch/m68k/mm/sun3mmu.c |3 --- arch/sparc/include/asm/processor_32.h |6 -- arch/sparc/include/asm/processor_64.h |4 arch/x86/include/asm/processor.h| 11 ++- arch/x86/kernel/cpu/proc.c |9 +++-- arch/x86/kernel/setup.c | 11 --- arch/x86/mm/init_32.c |9 + arch/x86/xen/enlighten.c|1 - 9 files changed, 14 insertions(+), 43 deletions(-) -- 1.7.10.4
[PATCH 1/6] x86: drop unneeded members of struct cpuinfo_x86
Those members serve no purpose -- not even as fill padding for alignment or such. So just get rid of them. Cc: Borislav Petkov <b...@alien8.de> Cc: H. Peter Anvin <h...@linux.intel.com> Signed-off-by: Mathias Krause <mini...@googlemail.com> --- arch/x86/include/asm/processor.h |7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index e6cfe7ba2d65..bf7cb1e00ce7 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -80,7 +80,7 @@ enum tlb_infos { /* * CPU type and hardware bug flags. Kept separately for each CPU. - * Members of this structure are referenced in head.S, so think twice + * Members of this structure are referenced in head_32.S, so think twice * before touching them. [mj] */ @@ -91,11 +91,6 @@ struct cpuinfo_x86 { __u8x86_mask; #ifdef CONFIG_X86_32 charwp_works_ok;/* It doesn't on 386's */ - - /* Problems on some 486Dx4's and old 386's: */ - charrfu; - charpad0; - charpad1; #else /* Number of 4K pages in DTLB/ITLB combined(in pages): */ int x86_tlbsize; -- 1.7.10.4
Re: [PATCH v1 0/2] Introduce the initify gcc plugin
Hi Emese, On Tue, Jun 28, 2016 at 01:34:07PM +0200, Emese Revfy wrote: > I would like to introduce the initify gcc plugin. The kernel already has > a mechanism to free up code and data memory that is only used during kernel > or module initialization. > This plugin will teach the compiler to find more such code and data that > can be freed after initialization. It reduces memory usage. > The initify gcc plugin can be useful for embedded systems. > > It is a CII project supported by the Linux Foundation. > > This plugin is the part of grsecurity/PaX. > > The plugin supports all gcc versions from 4.5 to 6.0. > > I made some changes on top of the PaX version (since March 6.). These are > the important ones: > * move all local strings to init.rodata.str and exit.rodata.str >(not just __func__) > * report all initified strings and functions >(GCC_PLUGIN_INITIFY_VERBOSE config option) > * automatically discover init/exit functions and apply the __init or >__exit attributes on them > > You can find more about the changes here: > https://github.com/ephox-gcc-plugins/initify > > This patch set is based on the "Add support for complex gcc plugins that > don't fit in a single file" patch set (git/kees/linux.git#kspp HEAD: > e5d4798b284cd192c8b). > > Some statistics about the plugin: > > On allyes config (amd64, gcc-6): > * 7731 initified strings > * 231 initified functions > > On allmod config (i386, gcc-6): > * 8846 initified strings > * 252 initified functions > > On allyes config (amd64, gcc-6): > > section vanilla vanilla + initifychange > --- > .rodata 39059688 (0x25400e8)38527210 (0x24be0ea)-532478 > .data 45744128 (0x2ba)45404160 (0x2b4d000)-339968 > .init.data 1361144 (0x14c4f8) 1674200 (0x198bd8)+313056 > .text 77615128 (0x4a05018)77576664 (0x49fb9d8) -38464 > .init.text 1108455 (0x10e9e7) 1137618 (0x115bd2) +29163 You should probably provide numbers for .init.rodata.str, .exit.rodata.str and .exit.text as well. 
Otherwise this delta calculation suggests a rather gigantic image size reduction which is probably not the case ;) Also a comparison of the final kernel image size would be nice to see if the string duplication issue mentioned in [1] is actually an issue. [1] http://marc.info/?l=linux-kernel=140364632417795=2 Thanks, Mathias > > > Emese Revfy (2): > Add the initify gcc plugin > Mark functions with the __nocapture attribute > > --- > arch/Kconfig | 23 + > arch/arm/include/asm/string.h| 10 +- > arch/arm64/include/asm/string.h | 23 +- > arch/powerpc/include/asm/string.h| 19 +- > arch/x86/boot/string.h |4 +- > arch/x86/include/asm/string_32.h | 21 +- > arch/x86/include/asm/string_64.h | 18 +- > arch/x86/kernel/hpet.c |2 +- > include/asm-generic/bug.h|6 +- > include/asm-generic/vmlinux.lds.h|2 + > include/linux/compiler-gcc.h | 10 +- > include/linux/compiler.h |4 + > include/linux/fs.h |5 +- > include/linux/printk.h |2 +- > include/linux/string.h | 73 +-- > scripts/Makefile.gcc-plugins |4 + > scripts/gcc-plugins/initify_plugin.c | 1147 > ++ > 17 files changed, 1283 insertions(+), 90 deletions(-)
Re: [PATCH v1 0/2] Introduce the initify gcc plugin
Hi Emese, On Tue, Jun 28, 2016 at 01:34:07PM +0200, Emese Revfy wrote: > I would like to introduce the initify gcc plugin. The kernel already has > a mechanism to free up code and data memory that is only used during kernel > or module initialization. > This plugin will teach the compiler to find more such code and data that > can be freed after initialization. It reduces memory usage. > The initify gcc plugin can be useful for embedded systems. > > It is a CII project supported by the Linux Foundation. > > This plugin is the part of grsecurity/PaX. > > The plugin supports all gcc versions from 4.5 to 6.0. > > I made some changes on top of the PaX version (since March 6.). These are > the important ones: > * move all local strings to init.rodata.str and exit.rodata.str >(not just __func__) > * report all initified strings and functions >(GCC_PLUGIN_INITIFY_VERBOSE config option) > * automatically discover init/exit functions and apply the __init or >__exit attributes on them > > You can find more about the changes here: > https://github.com/ephox-gcc-plugins/initify > > This patch set is based on the "Add support for complex gcc plugins that > don't fit in a single file" patch set (git/kees/linux.git#kspp HEAD: > e5d4798b284cd192c8b). > > Some statistics about the plugin: > > On allyes config (amd64, gcc-6): > * 7731 initified strings > * 231 initified functions > > On allmod config (i386, gcc-6): > * 8846 initified strings > * 252 initified functions > > On allyes config (amd64, gcc-6): > > section vanilla vanilla + initifychange > --- > .rodata 39059688 (0x25400e8)38527210 (0x24be0ea)-532478 > .data 45744128 (0x2ba)45404160 (0x2b4d000)-339968 > .init.data 1361144 (0x14c4f8) 1674200 (0x198bd8)+313056 > .text 77615128 (0x4a05018)77576664 (0x49fb9d8) -38464 > .init.text 1108455 (0x10e9e7) 1137618 (0x115bd2) +29163 You should probably provide numbers for .init.rodata.str, .exit.rodata.str and .exit.text as well. 
Otherwise this delta calculation suggests a rather gigantic image size reduction which is probably not the case ;) Also a comparison of the final kernel image size would be nice to see if the string duplication issue mentioned in [1] is actually an issue. [1] http://marc.info/?l=linux-kernel=140364632417795=2 Thanks, Mathias > > > Emese Revfy (2): > Add the initify gcc plugin > Mark functions with the __nocapture attribute > > --- > arch/Kconfig | 23 + > arch/arm/include/asm/string.h| 10 +- > arch/arm64/include/asm/string.h | 23 +- > arch/powerpc/include/asm/string.h| 19 +- > arch/x86/boot/string.h |4 +- > arch/x86/include/asm/string_32.h | 21 +- > arch/x86/include/asm/string_64.h | 18 +- > arch/x86/kernel/hpet.c |2 +- > include/asm-generic/bug.h|6 +- > include/asm-generic/vmlinux.lds.h|2 + > include/linux/compiler-gcc.h | 10 +- > include/linux/compiler.h |4 + > include/linux/fs.h |5 +- > include/linux/printk.h |2 +- > include/linux/string.h | 73 +-- > scripts/Makefile.gcc-plugins |4 + > scripts/gcc-plugins/initify_plugin.c | 1147 > ++ > 17 files changed, 1283 insertions(+), 90 deletions(-)
[tip:x86/mm] x86/extable: Ensure entries are swapped completely when sorting
Commit-ID: 67d7a982bab6702d84415ea889996fae72a7d3b2 Gitweb: http://git.kernel.org/tip/67d7a982bab6702d84415ea889996fae72a7d3b2 Author: Mathias Krause <mini...@googlemail.com> AuthorDate: Tue, 10 May 2016 23:07:02 +0200 Committer: Ingo Molnar <mi...@kernel.org> CommitDate: Wed, 11 May 2016 11:14:06 +0200 x86/extable: Ensure entries are swapped completely when sorting The x86 exception table sorting was changed in this recent commit: 29934b0fb8ff ("x86/extable: use generic search and sort routines") ... to use the arch independent code in lib/extable.c. However, the patch was mangled somehow on its way into the kernel from the last version posted at: https://lkml.org/lkml/2016/1/27/232 The committed version kind of attempted to incorporate the changes of contemporary commit done in the x86 tree: 548acf19234d ("x86/mm: Expand the exception table logic to allow new handling options") ... as in _completely_ _ignoring_ the x86 specific 'handler' member of struct exception_table_entry. This effectively broke the sorting as entries will only be partly swapped now. Fortunately, the x86 Kconfig selects BUILDTIME_EXTABLE_SORT, so the exception table doesn't need to be sorted at runtime. However, in case that ever changes, we better not break the exception table sorting just because of that. Fix this by providing a swap_ex_entry_fixup() macro that takes care of the 'handler' member. Signed-off-by: Mathias Krause <mini...@googlemail.com> Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org> Cc: Andrew Morton <a...@linux-foundation.org> Cc: Andy Lutomirski <l...@amacapital.net> Cc: Andy Lutomirski <l...@kernel.org> Cc: Borislav Petkov <b...@alien8.de> Cc: Borislav Petkov <b...@suse.de> Cc: Brian Gerst <brge...@gmail.com> Cc: Denys Vlasenko <dvlas...@redhat.com> Cc: H. 
Peter Anvin <h...@zytor.com> Cc: Linus Torvalds <torva...@linux-foundation.org> Cc: Peter Zijlstra <pet...@infradead.org> Cc: Thomas Gleixner <t...@linutronix.de> Cc: Tony Luck <tony.l...@intel.com> Link: http://lkml.kernel.org/r/1462914422-2911-1-git-send-email-mini...@googlemail.com Signed-off-by: Ingo Molnar <mi...@kernel.org> --- arch/x86/include/asm/uaccess.h | 8 1 file changed, 8 insertions(+) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 8b3fb76..86c48f3 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -108,6 +108,14 @@ struct exception_table_entry { #define ARCH_HAS_RELATIVE_EXTABLE +#define swap_ex_entry_fixup(a, b, tmp, delta) \ + do {\ + (a)->fixup = (b)->fixup + (delta); \ + (b)->fixup = (tmp).fixup - (delta); \ + (a)->handler = (b)->handler + (delta); \ + (b)->handler = (tmp).handler - (delta); \ + } while (0) + extern int fixup_exception(struct pt_regs *regs, int trapnr); extern bool ex_has_fault_handler(unsigned long ip); extern int early_fixup_exception(unsigned long *ip);
Re: [PATCH] x86/extable: ensure entries are swapped completely when sorting
On 11 May 2016 at 07:46, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote: > On 10 May 2016 at 23:07, Mathias Krause <mini...@googlemail.com> wrote: >> The x86 exception table sorting was changed in commit 29934b0fb8ff >> ("x86/extable: use generic search and sort routines") to use the arch >> independent code in lib/extable.c. However, the patch was mangled >> somehow on its way into the kernel from the last version posted at [1]. >> The committed version kind of attempted to incorporate the changes of >> commit 548acf19234d ("x86/mm: Expand the exception table logic to allow >> new handling options") as in _completely_ _ignoring_ the x86 specific >> 'handler' member of struct exception_table_entry. This effectively broke >> the sorting as entries will only partly be swapped now. >> > > OK, it appears that Tony's patches grew a 'handler' field fairly late > in their review cycle, and I didn't pick up on that. Apologies. No need to apologize, as your patch *must* have created conflicts for the committer applying it. It's just that those conflicts were resolved wrongly. He/She should have asked you for a new version, but... did not :/ >> Fortunately, the x86 Kconfig selects BUILDTIME_EXTABLE_SORT, so the >> exception table doesn't need to be sorted at runtime. However, in case >> that ever changes, we better not break the exception table sorting just >> because of that. >> > > BUILDTIME_EXTABLE_SORT applies to the core image only, but we still > rely on the sorting routines for modules in that case. So I think this > needs to be fixed right away. Good point! > >> Fix this by providing a swap_ex_entry_fixup() macro that takes care of >> the 'handler' member. >> >> [1] https://lkml.org/lkml/2016/1/27/232 >> >> Signed-off-by: Mathias Krause <mini...@googlemail.com> >> Cc: Andrew Morton <a...@linux-foundation.org> >> Cc: Andy Lutomirski <l...@kernel.org> >> Cc: Ard Biesheuvel <ard.biesheu...@linaro.org> >> Cc: Borislav Petkov <b...@suse.de> >> Cc: H. 
Peter Anvin <h...@linux.intel.com> >> Cc: Ingo Molnar <mi...@kernel.org> >> Cc: Linus Torvalds <torva...@linux-foundation.org> >> Cc: Thomas Gleixner <t...@linutronix.de> >> Cc: Tony Luck <tony.l...@intel.com> > > Fixes: 29934b0fb8f ("x86/extable: use generic search and sort routines") > Reviewed-by: Ard Biesheuvel <ard.biesheu...@linaro.org> > > Thanks, > Ard. > Thanks, Mathias
[PATCH] x86/extable: ensure entries are swapped completely when sorting
The x86 exception table sorting was changed in commit 29934b0fb8ff ("x86/extable: use generic search and sort routines") to use the arch independent code in lib/extable.c. However, the patch was mangled somehow on its way into the kernel from the last version posted at [1]. The committed version kind of attempted to incorporate the changes of commit 548acf19234d ("x86/mm: Expand the exception table logic to allow new handling options") as in _completely_ _ignoring_ the x86 specific 'handler' member of struct exception_table_entry. This effectively broke the sorting as entries will only partly be swapped now. Fortunately, the x86 Kconfig selects BUILDTIME_EXTABLE_SORT, so the exception table doesn't need to be sorted at runtime. However, in case that ever changes, we better not break the exception table sorting just because of that. Fix this by providing a swap_ex_entry_fixup() macro that takes care of the 'handler' member. [1] https://lkml.org/lkml/2016/1/27/232 Signed-off-by: Mathias Krause <mini...@googlemail.com> Cc: Andrew Morton <a...@linux-foundation.org> Cc: Andy Lutomirski <l...@kernel.org> Cc: Ard Biesheuvel <ard.biesheu...@linaro.org> Cc: Borislav Petkov <b...@suse.de> Cc: H. 
Peter Anvin <h...@linux.intel.com> Cc: Ingo Molnar <mi...@kernel.org> Cc: Linus Torvalds <torva...@linux-foundation.org> Cc: Thomas Gleixner <t...@linutronix.de> Cc: Tony Luck <tony.l...@intel.com>
---
 arch/x86/include/asm/uaccess.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index a969ae607be8..b69b7bffb0e0 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -108,6 +108,14 @@ struct exception_table_entry {
 
 #define ARCH_HAS_RELATIVE_EXTABLE
 
+#define swap_ex_entry_fixup(a, b, tmp, delta)			\
+	do {							\
+		(a)->fixup = (b)->fixup + (delta);		\
+		(b)->fixup = (tmp).fixup - (delta);		\
+		(a)->handler = (b)->handler + (delta);		\
+		(b)->handler = (tmp).handler - (delta);		\
+	} while (0)
+
 extern int fixup_exception(struct pt_regs *regs, int trapnr);
 extern bool ex_has_fault_handler(unsigned long ip);
 extern int early_fixup_exception(unsigned long *ip);
-- 
1.7.10.4
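The delta correction in the macro above can be illustrated outside the kernel. The sketch below is a userspace toy (illustrative names, not the actual lib/extable.c sort machinery): it models relative exception-table entries whose members store their targets as offsets relative to the member's own address, and shows why a swap must adjust *every* member by the byte distance between the two entries — including the 'handler' member the mangled commit dropped.

```c
#include <stdint.h>

/* Toy model of x86's relative exception_table_entry: each member holds
 * its target as an offset relative to the member's own address. */
struct entry {
	int insn;
	int fixup;
	int handler;
};

/* Store `target` as a self-relative offset in *field. */
static void set_field(int *field, const void *target)
{
	*field = (int)((intptr_t)target - (intptr_t)field);
}

/* Recover the absolute target a self-relative field points at. */
static uintptr_t resolve(const int *field)
{
	return (uintptr_t)field + (uintptr_t)(intptr_t)*field;
}

/* Swap two entries the way the sort's swap callback must: contents
 * moving down by `delta` bytes need their offsets grown by `delta`,
 * contents moving up need them shrunk -- for every member.  Skipping
 * `handler` leaves it resolving to the wrong address after the sort,
 * which is exactly the breakage being fixed. */
static void swap_entries(struct entry *a, struct entry *b)
{
	int delta = (int)((intptr_t)b - (intptr_t)a);
	struct entry tmp = *a;

	a->insn    = b->insn    + delta;
	b->insn    = tmp.insn   - delta;
	a->fixup   = b->fixup   + delta;
	b->fixup   = tmp.fixup  - delta;
	a->handler = b->handler + delta;
	b->handler = tmp.handler - delta;
}
```

After swap_entries(&tab[0], &tab[1]), every field of tab[0] resolves to the absolute address the corresponding field of tab[1] resolved to before, and vice versa — the invariant the sort relies on.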
Re: [PATCH] proc: prevent accessing /proc/<pid>/environ until it's ready
On 28 April 2016 at 23:30, Andrew Morton <a...@linux-foundation.org> wrote: > On Thu, 28 Apr 2016 21:04:18 +0200 Mathias Krause <mini...@googlemail.com> > wrote: > >> If /proc/<pid>/environ gets read before the envp[] array is fully set >> up in create_{aout,elf,elf_fdpic,flat}_tables(), we might end up trying >> to read more bytes than are actually written, as env_start will already >> be set but env_end will still be zero, making the range calculation >> underflow, allowing to read beyond the end of what has been written. >> >> Fix this as it is done for /proc/<pid>/cmdline by testing env_end for >> zero. It is, apparently, intentionally set last in create_*_tables(). > > Also, if this is indeed our design then > > a) the various create_*_tables() should have comments in there which >explain this subtlety to the reader. Or, better, they use a common >helper function for this readiness-signaling operation because.. > > b) we'll need some barriers there to ensure that the environ_read() >caller sees the create_*_tables() writes in the correct order. I totally agree that this kind of "synchronization" is rather fragile. Adding comments won't help much, I fear. Rather a dedicated flag, signaling "process ready for inspection" may be needed. So far, that's what env_end is (ab-)used for. Regards, Mathias
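For reference, the barrier pairing Andrew asks for would look roughly like this. The sketch below is userspace C11 with made-up names — not the kernel's create_*_tables() code, which would use smp_store_release()/smp_load_acquire() instead: the writer publishes a readiness flag with release semantics only after filling the fields, and the reader checks it with acquire semantics before trusting them.

```c
#include <stdatomic.h>

/* Toy model of the ordering problem: env fields are written first,
 * then a readiness flag is published.  The release store orders all
 * earlier plain writes; the paired acquire load guarantees a reader
 * that observes ready != 0 also observes the completed fields. */
struct toy_mm {
	unsigned long env_start;
	unsigned long env_end;
	atomic_int ready;
};

/* Writer side -- the execve path in the analogy. */
static void toy_publish_env(struct toy_mm *mm, unsigned long start,
			    unsigned long end)
{
	mm->env_start = start;
	mm->env_end = end;
	atomic_store_explicit(&mm->ready, 1, memory_order_release);
}

/* Reader side -- the environ_read() analogy: 0 means "not ready yet". */
static unsigned long toy_env_len(struct toy_mm *mm)
{
	if (!atomic_load_explicit(&mm->ready, memory_order_acquire))
		return 0;
	return mm->env_end - mm->env_start;
}
```

This is the "dedicated flag" variant: readiness is signaled explicitly rather than inferred from env_end's zero/non-zero state.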
Re: [PATCH] proc: prevent accessing /proc/<pid>/environ until it's ready
On 28 April 2016 at 23:26, Andrew Morton <a...@linux-foundation.org> wrote: > On Thu, 28 Apr 2016 21:04:18 +0200 Mathias Krause <mini...@googlemail.com> > wrote: > >> If /proc/<pid>/environ gets read before the envp[] array is fully set >> up in create_{aout,elf,elf_fdpic,flat}_tables(), we might end up trying >> to read more bytes than are actually written, as env_start will already >> be set but env_end will still be zero, making the range calculation >> underflow, allowing to read beyond the end of what has been written. >> >> Fix this as it is done for /proc/<pid>/cmdline by testing env_end for >> zero. It is, apparently, intentionally set last in create_*_tables(). >> >> This bug was found by the PaX size_overflow plugin that detected the >> arithmetic underflow of 'this_len = env_end - (env_start + src)' when >> env_end is still zero. > > So what are the implications of this? From my reading, a craftily > constructed application could occasionally read arbitrarily large > amounts of kernel memory? I don't think access_remote_vm() is capable of that. So, the only consequence is, userland trying to access /proc/<pid>/environ of a not yet fully set up process may get inconsistent data as we're in the middle of copying in the environment variables. Regards, Mathias
Re: [PATCH] proc: prevent accessing /proc/<pid>/environ until it's ready
On 28 April 2016 at 21:20, Mateusz Guzik wrote: > In this case get_cmdline in mm/util.c should also be patched for > completness. It tests for arg_end, but later accesses env_end. But it'll do this only when argv[] was modified from what the kernel initially wrote, which, in turn, either requires the process to have started executing and messing with its own argv[] or another process poking at it via ptrace(). In the former case env_end will be non-zero already and I don't know if the latter case is actually possible, i.e. if one can already attach to a process this early. If one can, then yes, that place needs to be modified, too. Thanks, Mathias
[PATCH] proc: prevent accessing /proc/<pid>/environ until it's ready
If /proc/<pid>/environ gets read before the envp[] array is fully set up in create_{aout,elf,elf_fdpic,flat}_tables(), we might end up trying to read more bytes than are actually written, as env_start will already be set but env_end will still be zero, making the range calculation underflow, allowing to read beyond the end of what has been written.

Fix this as it is done for /proc/<pid>/cmdline by testing env_end for zero. It is, apparently, intentionally set last in create_*_tables().

This bug was found by the PaX size_overflow plugin that detected the arithmetic underflow of 'this_len = env_end - (env_start + src)' when env_end is still zero.

Reported-at: https://forums.grsecurity.net/viewtopic.php?f=3=4363
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=116461
Signed-off-by: Mathias Krause <mini...@googlemail.com>
Cc: Emese Revfy <re.em...@gmail.com>
Cc: Pax Team <pagee...@freemail.hu>
Cc: Al Viro <v...@zeniv.linux.org.uk>
Cc: Mateusz Guzik <mgu...@redhat.com>
Cc: Alexey Dobriyan <adobri...@gmail.com>
Cc: Cyrill Gorcunov <gorcu...@openvz.org>
Cc: Jarod Wilson <ja...@redhat.com>
---
 fs/proc/base.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 4f764c2ac1a5..45f2162e55b2 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -955,7 +955,8 @@ static ssize_t environ_read(struct file *file, char __user *buf,
 	struct mm_struct *mm = file->private_data;
 	unsigned long env_start, env_end;
 
-	if (!mm)
+	/* Ensure the process spawned far enough to have an environment. */
+	if (!mm || !mm->env_end)
 		return 0;
 
 	page = (char *)__get_free_page(GFP_TEMPORARY);
-- 
1.7.10.4
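The arithmetic the size_overflow plugin flagged is plain unsigned wraparound and can be reproduced in isolation. The sketch below is an illustrative userspace model, not the kernel code itself; the values are made up.

```c
/* Model of the offending calculation in environ_read(): with env_end
 * still zero but env_start already set, the unsigned subtraction wraps
 * around to a huge value instead of going negative, so a length check
 * against it passes when it should not. */
static unsigned long toy_this_len(unsigned long env_start,
				  unsigned long env_end,
				  unsigned long src)
{
	return env_end - (env_start + src);
}

/* The fix in a nutshell: a zero env_end means create_*_tables() has
 * not finished yet, so the environment must be treated as empty. */
static int toy_environ_ready(unsigned long env_end)
{
	return env_end != 0;
}
```

With env_end == 0 the "remaining length" wraps to nearly the full range of unsigned long, which is exactly the out-of-bounds read window the patch closes.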
Re: [kernel-hardening] [RFC][PATCH 0/3] Sanitization of buddy pages
On 25 January 2016 at 17:55, Laura Abbott wrote: > Hi, > > This is an implementation of page poisoning/sanitization for all arches. It > takes advantage of the existing implementation for > !ARCH_SUPPORTS_DEBUG_PAGEALLOC arches. This is a different approach than what > the Grsecurity patches were taking but should provide equivalent > functionality. > > For those who aren't familiar with this, the goal of sanitization is to reduce > the severity of use after free and uninitialized data bugs. Memory is cleared > on free so any sensitive data is no longer available. Discussion of > sanitization was brough up in a thread about CVEs > (lkml.kernel.org/g/<20160119112812.GA10818@mwanda>) > > I eventually expect Kconfig names will want to be changed and or moved if this > is going to be used for security but that can happen later. > > Credit to Mathias Krause for the version in grsecurity Thanks for the credits but I don't deserve them. I've contributed the slab based sanitization only. The page based one shipped in PaX and grsecurity is from the PaX Team. > > Laura Abbott (3): > mm/debug-pagealloc.c: Split out page poisoning from debug page_alloc > mm/page_poison.c: Enable PAGE_POISONING as a separate option > mm/page_poisoning.c: Allow for zero poisoning > > Documentation/kernel-parameters.txt | 5 ++ > include/linux/mm.h | 13 +++ > include/linux/poison.h | 4 + > mm/Kconfig.debug| 35 +++- > mm/Makefile | 5 +- > mm/debug-pagealloc.c| 127 + > mm/page_alloc.c | 10 ++- > mm/page_poison.c| 158 > > 8 files changed, 228 insertions(+), 129 deletions(-) > create mode 100644 mm/page_poison.c > > -- > 2.5.0 > Regards, Mathias
Re: [PATCH] ACPI / OSL: Add kerneldoc comments to memory mapping functions
On 2 January 2016 at 03:10, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki > > Add kerneldoc comments to acpi_os_map_iomem() and acpi_os_unmap_iomem() > and explain why the latter needs the __ref annotation in one of them > (as suggested by Mathias Krause). > > Signed-off-by: Rafael J. Wysocki > --- > drivers/acpi/osl.c | 27 +++ > 1 file changed, 27 insertions(+) > > Index: linux-pm/drivers/acpi/osl.c > === > --- linux-pm.orig/drivers/acpi/osl.c > +++ linux-pm/drivers/acpi/osl.c > @@ -366,6 +366,19 @@ static void acpi_unmap(acpi_physical_add > iounmap(vaddr); > } > > +/** > + * acpi_os_map_iomem - Get a virtual address for a given physical address > range. > + * @phys: Start of the physical address range to map. > + * @size: Size of the physical address range to map. > + * > + * Look up the given physical address range in the list of existing ACPI > memory > + * mappings. If found, get a reference to it and return a pointer to it (its > + * virtual address). If not found, map it, add it to that list and return a > + * pointer to it. > + * > + * During early init (when acpi_gbl_permanent_mmap has not been set yet) this > + * routine simply calls __acpi_map_table() to get the job done. > + */ > void __iomem *__init_refok > acpi_os_map_iomem(acpi_physical_address phys, acpi_size size) > { > @@ -441,6 +454,20 @@ static void acpi_os_map_cleanup(struct a > } > } > > +/** > + * acpi_os_unmap_iomem - Drop a memory mapping reference. > + * @virt: Start of the address range to drop a reference to. > + * @size: Size of the address range to drop a reference to. > + * > + * Look up the given virtual address range in the list of existing ACPI > memory > + * mappings, drop a reference to it and unmap it if there are no more active > + * references to it. > + * > + * During early init (when acpi_gbl_permanent_mmap has not been set yet) this > + * routine simply calls __acpi_unmap_table() to get the job done. 
> + * Since __acpi_unmap_table() is an __init function, the __ref annotation is needed > + * here. > + */ > void __ref acpi_os_unmap_iomem(void __iomem *virt, acpi_size size) > { > struct acpi_ioremap *map; > Looks much better than my two-liner comment! Acked-by: Mathias Krause -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
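The lookup-then-refcount discipline the kerneldoc above describes can be modeled in a few lines of plain C. Everything below is an illustrative sketch with invented names — it is not the actual drivers/acpi/osl.c code, which tracks its ioremap'ed ranges in a list with proper locking:

```c
#include <stddef.h>

/* Toy stand-in for the ACPI ioremap cache: a fixed table of physical
 * ranges, each with a refcount.  `mapped` plays the role of the
 * ioremap'ed virtual address being valid. */
struct toy_mapping {
	unsigned long phys;
	unsigned long size;
	int refcount;
	int mapped;
};

/* Find an existing mapping covering [phys, phys + size). */
static struct toy_mapping *toy_lookup(struct toy_mapping *tab, size_t n,
				      unsigned long phys, unsigned long size)
{
	for (size_t i = 0; i < n; i++)
		if (tab[i].mapped && tab[i].phys <= phys &&
		    phys + size <= tab[i].phys + tab[i].size)
			return &tab[i];
	return NULL;
}

/* acpi_os_map_iomem()-style get: reuse an existing mapping and take a
 * reference, or create a new one. */
static struct toy_mapping *toy_map_get(struct toy_mapping *tab, size_t n,
				       unsigned long phys, unsigned long size)
{
	struct toy_mapping *m = toy_lookup(tab, n, phys, size);

	if (!m) {
		for (size_t i = 0; i < n && !m; i++)
			if (!tab[i].mapped)
				m = &tab[i];
		if (!m)
			return NULL;	/* table full */
		m->phys = phys;
		m->size = size;
		m->refcount = 0;
		m->mapped = 1;		/* the "ioremap" happens here */
	}
	m->refcount++;
	return m;
}

/* acpi_os_unmap_iomem()-style put: unmap only once the last user left. */
static void toy_map_put(struct toy_mapping *m)
{
	if (--m->refcount == 0)
		m->mapped = 0;		/* the "iounmap" happens here */
}
```

A second get of a range already covered by an existing mapping returns the same entry with a bumped refcount, which is the sharing behavior the kerneldoc documents.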
Re: [kernel-hardening] [RFC][PATCH 6/7] mm: Add Kconfig option for slab sanitization
On 22 December 2015 at 21:01, Christoph Lameter wrote: > On Tue, 22 Dec 2015, Mathias Krause wrote: > >> How many systems, do you think, are running with enabled DEBUG_SLAB / >> SLUB_DEBUG in production? Not so many, I'd guess. And the ones running >> into issues probably just disable DEBUG_SLAB / SLUB_DEBUG. > > All systems run with SLUB_DEBUG in production. SLUB_DEBUG causes the code > for debugging to be compiled in. Then it can be enabled later with a > command line parameter. Indeed, I meant CONFIG_SLUB_DEBUG_ON, i.e. compiled in and enabled SLAB cache debugging including poisoning. Regards, Mathias
Re: [kernel-hardening] [RFC][PATCH 6/7] mm: Add Kconfig option for slab sanitization
On 22 December 2015 at 18:51, Laura Abbott wrote: >> [snip] >> >> Related to this, have you checked that the sanitization doesn't >> interfere with the various slab handling schemes, namely RCU related >> specialties? Not all caches are marked SLAB_DESTROY_BY_RCU, some use >> call_rcu() instead, implicitly relying on the semantics RCU'ed slabs >> permit, namely allowing a "use-after-free" access to be legitimate >> within the RCU grace period. Scrubbing the object during that period >> would break that assumption. > > I haven't looked into that. I was working off the assumption that > if the regular SLAB debug poisoning worked so would the sanitization. > The regular debug poisoning only checks for SLAB_DESTROY_BY_RCU so > how does that work then? Maybe it doesn't? ;) How many systems, do you think, are running with enabled DEBUG_SLAB / SLUB_DEBUG in production? Not so many, I'd guess. And the ones running into issues probably just disable DEBUG_SLAB / SLUB_DEBUG. Btw, SLUB not only looks for SLAB_DESTROY_BY_RCU but also excludes "call_rcu slabs" via other mechanisms. As SLUB is the default SLAB allocator for quite some time now, even with enabled SLUB_DEBUG one wouldn't be able to trigger RCU related sanitization issues. >> Speaking of RCU, do you have a plan to support RCU'ed slabs as well? >> > > My only plan was to get the base support in. I didn't have a plan to > support RCU slabs but that's certainly something to be done in the > future. "Base support", in my opinion, includes covering the buddy allocator as well. Otherwise this feature is incomplete. Regards, Mathias
Re: [kernel-hardening] [RFC][PATCH 6/7] mm: Add Kconfig option for slab sanitization
On 22 December 2015 at 04:40, Laura Abbott wrote: > > The SL[AOU]B allocators all behave differently w.r.t. to what happen > an object is freed. CONFIG_SLAB_SANITIZATION introduces a common > mechanism to control what happens on free. When this option is > enabled, objects may be poisoned according to a combination of > slab_sanitization command line option and whether SLAB_NO_SANITIZE > is set on a cache. > > All credit for the original work should be given to Brad Spengler and > the PaX Team. > > Signed-off-by: Laura Abbott > --- > init/Kconfig | 36 > 1 file changed, 36 insertions(+) > > diff --git a/init/Kconfig b/init/Kconfig > index 235c7a2..37857f3 100644 > --- a/init/Kconfig > +++ b/init/Kconfig > @@ -1755,6 +1755,42 @@ config SLUB_CPU_PARTIAL > which requires the taking of locks that may cause latency spikes. > Typically one would choose no for a realtime system. > > +config SLAB_MEMORY_SANITIZE > + bool "Sanitize all freed memory" > + help > + By saying Y here the kernel will erase slab objects as soon as they > + are freed. This in turn reduces the lifetime of data > + stored in them, making it less likely that sensitive information > such > + as passwords, cryptographic secrets, etc stay in memory for too > long. > + > + This is especially useful for programs whose runtime is short, long > + lived processes and the kernel itself benefit from this as long as > + they ensure timely freeing of memory that may hold sensitive > + information. This part is not true. The code is handling SLAB objects only, so talking about processes in this context is misleading. Freeing memory in userland containing secrets cannot be covered by this feature as is. It needs a counter-part in the userland memory allocator as well as handling page sanitization in the buddy allocator. I guess you've just copy+pasted that Kconfig description from the PaX feature PAX_MEMORY_SANITIZE that also covers the buddy allocator, therefore fits that description while this patch set does not. 
So please adapt the text or implement the fully featured version. > + > + A nice side effect of the sanitization of slab objects is the > + reduction of possible info leaks caused by padding bytes within the > + leaky structures. Use-after-free bugs for structures containing > + pointers can also be detected as dereferencing the sanitized pointer > + will generate an access violation. > + > + The tradeoff is performance impact. The noticible impact can vary > + and you are advised to test this feature on your expected workload > + before deploying it > + > + The slab sanitization feature excludes a few slab caches per default > + for performance reasons. The level of sanitization can be adjusted > + with the sanitize_slab commandline option: > + sanitize_slab=off: No sanitization will occur > + santiize_slab=slow: Sanitization occurs only on the slow path > + for all but the excluded slabs > + (relevant for SLUB allocator only) > + sanitize_slab=partial: Sanitization occurs on all path for all > + but the excluded slabs > + sanitize_slab=full: All slabs are sanitize This should probably be moved to Documentation/kernel-parameters.txt, as can be found in the PaX patch[1]? > + > + If unsure, say Y here. Really? It has an unknown performance impact, depending on the workload, which might make "unsure users" preferably say No, if they don't care about info leaks. Related to this, have you checked that the sanitization doesn't interfere with the various slab handling schemes, namely RCU related specialties? Not all caches are marked SLAB_DESTROY_BY_RCU, some use call_rcu() instead, implicitly relying on the semantics RCU'ed slabs permit, namely allowing a "use-after-free" access to be legitimate within the RCU grace period. Scrubbing the object during that period would break that assumption. Speaking of RCU, do you have a plan to support RCU'ed slabs as well? 
> + > config MMAP_ALLOW_UNINITIALIZED > bool "Allow mmapped anonymous memory to be uninitialized" > depends on EXPERT && !MMU > -- > 2.5.0 > Regards, Mathias [1] https://github.com/minipli/linux-grsec/blob/v4.3.3-pax/Documentation/kernel-parameters.txt#L2689-L2696
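For readers unfamiliar with the feature under review, the core of sanitize-on-free is small. The sketch below is a userspace toy with invented names and a hypothetical poison byte — the real PaX/grsecurity and SLUB implementations differ in many details: the object is scrubbed when it is released, so stale secrets don't survive the free and a use-after-free reads poison instead of live pointers.

```c
#include <string.h>

#define TOY_POISON_FREE 0xfe	/* hypothetical poison byte */

/* Minimal cache descriptor: the per-cache opt-out flag mirrors the
 * SLAB_NO_SANITIZE exclusion discussed above. */
struct toy_cache {
	size_t obj_size;
	int sanitize;
};

/* Free-side hook: scrub the object before it would go back on the
 * freelist.  (A real allocator would link it onto the freelist next;
 * here the poisoned bytes are simply left in place for inspection.) */
static void toy_cache_free(struct toy_cache *c, void *obj)
{
	if (c->sanitize)
		memset(obj, TOY_POISON_FREE, c->obj_size);
}
```

This also makes the RCU concern above concrete: scrubbing at toy_cache_free() time is only safe if no reader may still legitimately dereference the object, which is exactly what an RCU grace period permits.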
Re: [kernel-hardening] [RFC][PATCH 6/7] mm: Add Kconfig option for slab sanitization
On 22 December 2015 at 04:40, Laura Abbott wrote:
>
> The SL[AOU]B allocators all behave differently w.r.t. what happens when
> an object is freed. CONFIG_SLAB_SANITIZATION introduces a common
> mechanism to control what happens on free. When this option is
> enabled, objects may be poisoned according to a combination of the
> slab_sanitization command line option and whether SLAB_NO_SANITIZE
> is set on a cache.
>
> All credit for the original work should be given to Brad Spengler and
> the PaX Team.
>
> Signed-off-by: Laura Abbott
> ---
>  init/Kconfig | 36
>  1 file changed, 36 insertions(+)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index 235c7a2..37857f3 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1755,6 +1755,42 @@ config SLUB_CPU_PARTIAL
> 	  which requires the taking of locks that may cause latency spikes.
> 	  Typically one would choose no for a realtime system.
>
> +config SLAB_MEMORY_SANITIZE
> +	bool "Sanitize all freed memory"
> +	help
> +	  By saying Y here the kernel will erase slab objects as soon as they
> +	  are freed. This in turn reduces the lifetime of data
> +	  stored in them, making it less likely that sensitive information such
> +	  as passwords, cryptographic secrets, etc stay in memory for too long.
> +
> +	  This is especially useful for programs whose runtime is short, long
> +	  lived processes and the kernel itself benefit from this as long as
> +	  they ensure timely freeing of memory that may hold sensitive
> +	  information.

This part is not true. The code is handling slab objects only, so
talking about processes in this context is misleading. Freeing memory in
userland containing secrets cannot be covered by this feature as is. It
needs a counterpart in the userland memory allocator as well as handling
page sanitization in the buddy allocator.

I guess you've just copy+pasted that Kconfig description from the PaX
feature PAX_MEMORY_SANITIZE, which also covers the buddy allocator and
therefore fits that description, while this patch set does not. So
please adapt the text or implement the fully featured version.

> +
> +	  A nice side effect of the sanitization of slab objects is the
> +	  reduction of possible info leaks caused by padding bytes within the
> +	  leaky structures. Use-after-free bugs for structures containing
> +	  pointers can also be detected as dereferencing the sanitized pointer
> +	  will generate an access violation.
> +
> +	  The tradeoff is performance impact. The noticeable impact can vary
> +	  and you are advised to test this feature on your expected workload
> +	  before deploying it.
> +
> +	  The slab sanitization feature excludes a few slab caches per default
> +	  for performance reasons. The level of sanitization can be adjusted
> +	  with the sanitize_slab command line option:
> +	    sanitize_slab=off:     No sanitization will occur
> +	    sanitize_slab=slow:    Sanitization occurs only on the slow path
> +	                           for all but the excluded slabs
> +	                           (relevant for the SLUB allocator only)
> +	    sanitize_slab=partial: Sanitization occurs on all paths for all
> +	                           but the excluded slabs
> +	    sanitize_slab=full:    All slabs are sanitized

This should probably be moved to Documentation/kernel-parameters.txt, as
can be found in the PaX patch[1]?

> +
> +	  If unsure, say Y here.

Really? It has an unknown performance impact, depending on the workload,
which might make "unsure users" preferably say N if they don't care
about info leaks.

Related to this, have you checked that the sanitization doesn't
interfere with the various slab handling schemes, namely RCU related
specialties? Not all caches are marked SLAB_DESTROY_BY_RCU, some use
call_rcu() instead, implicitly relying on the semantics RCU'ed slabs
permit, namely allowing a "use-after-free" access to be legitimate
within the RCU grace period. Scrubbing the object during that period
would break that assumption.

Speaking of RCU, do you have a plan to support RCU'ed slabs as well?

> +
>  config MMAP_ALLOW_UNINITIALIZED
>  	bool "Allow mmapped anonymous memory to be uninitialized"
>  	depends on EXPERT && !MMU
> --
> 2.5.0
>

Regards,
Mathias

[1] https://github.com/minipli/linux-grsec/blob/v4.3.3-pax/Documentation/kernel-parameters.txt#L2689-L2696
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [kernel-hardening] [RFC][PATCH 6/7] mm: Add Kconfig option for slab sanitization
On 22 December 2015 at 18:51, Laura Abbott wrote:
>> [snip]
>>
>> Related to this, have you checked that the sanitization doesn't
>> interfere with the various slab handling schemes, namely RCU related
>> specialties? Not all caches are marked SLAB_DESTROY_BY_RCU, some use
>> call_rcu() instead, implicitly relying on the semantics RCU'ed slabs
>> permit, namely allowing a "use-after-free" access to be legitimate
>> within the RCU grace period. Scrubbing the object during that period
>> would break that assumption.
>
> I haven't looked into that. I was working off the assumption that
> if the regular SLAB debug poisoning worked so would the sanitization.
> The regular debug poisoning only checks for SLAB_DESTROY_BY_RCU so
> how does that work then?

Maybe it doesn't? ;) How many systems, do you think, are running with
DEBUG_SLAB / SLUB_DEBUG enabled in production? Not so many, I'd guess.
And the ones running into issues probably just disable DEBUG_SLAB /
SLUB_DEBUG.

Btw, SLUB not only looks for SLAB_DESTROY_BY_RCU but also excludes
"call_rcu slabs" via other mechanisms. As SLUB has been the default slab
allocator for quite some time now, even with SLUB_DEBUG enabled one
wouldn't be able to trigger RCU related sanitization issues.

>> Speaking of RCU, do you have a plan to support RCU'ed slabs as well?
>>
>
> My only plan was to get the base support in. I didn't have a plan to
> support RCU slabs but that's certainly something to be done in the
> future.

"Base support", in my opinion, includes covering the buddy allocator as
well. Otherwise this feature is incomplete.

Regards,
Mathias
Re: [kernel-hardening] [PATCH 0/2] introduce post-init read-only memory
On 29 November 2015 at 16:39, Ingo Molnar wrote:
>
> * PaX Team wrote:
>
>> On 29 Nov 2015 at 9:08, Ingo Molnar wrote:
>>
>> >
>> > * PaX Team wrote:
>> >
>> > > i don't see the compile time vs. runtime detection as 'competing'
>> > > approaches, both have their own role. [...]
>> >
>> > That's true - but only as long as 'this can be solved in tooling!' is
>> > not used as an excuse to oppose the runtime solution and we end up
>> > doing neither.
>>
>> actually, i already voiced my opinion elsewhere in the constify thread
>> on the kernel hardening list that adding/using __read_only is somewhat
>> premature without also adding the compile time verification part (as
>> part of the constify plugin for example). right now its use on the
>> embedded vdso image is simple and easy to verify but once people begin
>> to add it to variables that the compiler knows and cares about (say,
>> the ops structures) then things can become fragile without compile
>> checking. so yes, i'd also advise to get such tooling in *before* more
>> __read_only usage is added.
>
> I think you are mistaken there: if we add the page fault fixup to make
> sure we don't crash if a read-only variable is accessed, then we'll have
> most of the benefits of read-only mappings and no fragility - without
> having to wait for tooling.

I guess the point PaX Team (and me earlier in this thread) wanted to
make is that having misuse detection *only* at run-time will make those
kinds of bugs visible too late -- as late as on the wrong write attempt
actually happening. It would be much better to have the compiler
complain about invalid write statements already during compilation, much
like it does when one wants to assign some value to a const object.

Having the page fault handler being able to recover from __ro_after_init
faults is a nice feature to support users, so they can actually report
such bugs. But it shouldn't be the only way to detect those kinds of
bugs.

In fact, we have tools in our toolchain that try to detect and flag
wrong usage of __init / __exit, so why not cover __ro_after_init as
well? Admittedly, that won't be possible with modpost in its current
state, but would require a compiler extension instead. Its current
non-existence shouldn't be a show-stopper for __ro_after_init but the
very next step to take care of before extending the usage of that
annotation.

Just my 2ct,
Mathias
Re: [kernel-hardening] [PATCH 0/2] introduce post-init read-only memory
On 24 November 2015 at 22:38, Kees Cook wrote:
> Many things are written to only during __init, and never changed
> again. These cannot be made "const" since the compiler will do the wrong
> thing (we do actually need to write to them). Instead, move these items
> into a memory region that will be made read-only during mark_rodata_ro()
> which happens after all kernel __init code has finished.
>
> This introduces __read_only as a way to mark such memory, and uses it on
> the x86 vDSO to kill an extant kernel exploitation method.

...just some random notes on the experience with kernels implementing
such a feature for quite a lot of locations, not just the vDSO.

While having that annotation makes perfect sense, not only from a
security perspective but also from a micro-optimization point of view
(much like the already existing __read_mostly annotation), it has its
drawbacks. Violating the "r/o after init" rule by writing to such
annotated variables from non-init code goes unnoticed as far as it
concerns the toolchain. Neither the compiler nor the linker will flag
that incorrect use. It'll just trap at runtime and that's bad.

I myself had some educating experience seeing my machine triple fault
when resuming from an S3 sleep. The root cause was a variable that was
annotated __read_only but was (unnecessarily) modified during the CPU
bring-up phase. Debugging that kind of problem is sort of a PITA, as you
can imagine.

So, prior to extending the usage of the __read_only annotation, some
toolchain support is needed. Maybe a gcc plugin that'll warn/error on
code that writes to such a variable but is not __init itself. The
initify and checker plugins from the PaX patch might be worth looking at
for that purpose, as they're doing similar things already. Adding such a
check to sparse might be worth it, too. A modpost check probably won't
work as it's unable to tell if it's a legitimate access (r/o) or a
violation (w/ access). So the gcc plugin is the way to go, IMHO.

Regards,
Mathias
[PATCH] printk: prevent userland from spoofing kernel messages
The following statement of ABI/testing/dev-kmsg is not quite right:

  It is not possible to inject messages from userspace with the
  facility number LOG_KERN (0), to make sure that the origin of the
  messages can always be reliably determined.

Userland actually can inject messages with a facility of 0 by abusing
the fact that the facility is stored in a u8 data type. By using a
facility which is a multiple of 256 the assignment of msg->facility in
log_store() implicitly truncates it to 0, i.e. LOG_KERN, allowing users
of /dev/kmsg to spoof kernel messages as shown below:

The following call...

  # printf '<%d>Kernel panic - not syncing: beer empty\n' 0 >/dev/kmsg

...leads to the following log entry (dmesg -x | tail -n 1):

  user  :emerg : [   66.137758] Kernel panic - not syncing: beer empty

However, this call...

  # printf '<%d>Kernel panic - not syncing: beer empty\n' 0x800 >/dev/kmsg

...leads to the slightly different log entry (note the kernel facility):

  kern  :emerg : [   74.177343] Kernel panic - not syncing: beer empty

Fix that by limiting the user provided facility to 8 bit right from the
beginning and catch the truncation early.

Fixes: 7ff9554bb578 ("printk: convert byte-buffer to variable-length...")
Signed-off-by: Mathias Krause
Cc: Greg Kroah-Hartman
Cc: Petr Mladek
Cc: Alex Elder
Cc: Joe Perches
Cc: Kay Sievers
---
Might be worth to apply to stable, too. Don't know. Prior to commit
7ff9554bb578 there was no facility, so user injected messages had an
implicit facility of LOG_KERN. But commit 7ff9554bb578 explicitly
mentions this feature, so, dunno :/

 kernel/printk/printk.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 8f0324ef72ab..9982616ce712 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -269,6 +269,9 @@ static u32 clear_idx;
 #define PREFIX_MAX		32
 #define LOG_LINE_MAX		(1024 - PREFIX_MAX)

+#define LOG_LEVEL(v)		((v) & 0x07)
+#define LOG_FACILITY(v)		((v) >> 3 & 0xff)
+
 /* record buffer */
 #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
 #define LOG_ALIGN 4
@@ -611,7 +614,6 @@ struct devkmsg_user {
 static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from)
 {
 	char *buf, *line;
-	int i;
 	int level = default_message_loglevel;
 	int facility = 1;	/* LOG_USER */
 	size_t len = iov_iter_count(from);
@@ -641,12 +643,13 @@ static ssize_t devkmsg_write(struct kiocb *iocb, struct iov_iter *from)
 	line = buf;
 	if (line[0] == '<') {
 		char *endp = NULL;
+		unsigned int u;

-		i = simple_strtoul(line+1, &endp, 10);
+		u = simple_strtoul(line + 1, &endp, 10);
 		if (endp && endp[0] == '>') {
-			level = i & 7;
-			if (i >> 3)
-				facility = i >> 3;
+			level = LOG_LEVEL(u);
+			if (LOG_FACILITY(u) != 0)
+				facility = LOG_FACILITY(u);
 			endp++;
 			len -= endp - line;
 			line = endp;
--
1.7.10.4
Re: [PATCH v2 1/3] unix: fix use-after-free in unix_dgram_poll()
> 		return 0;
>
> out_unlock:
> @@ -1194,6 +1216,8 @@ restart:
>
> 	sock_hold(sk);
> 	unix_peer(newsk)	= sk;
> +	if (sk->sk_type == SOCK_SEQPACKET)
> +		add_wait_queue(&unix_sk(sk)->peer_wait,
> +			       &unix_sk(newsk)->wait);
> 	newsk->sk_state		= TCP_ESTABLISHED;
> 	newsk->sk_type		= sk->sk_type;
> 	init_peercred(newsk);
> @@ -1220,6 +1244,8 @@ restart:
>
> 	smp_mb__after_atomic();	/* sock_hold() does an atomic_inc() */
> 	unix_peer(sk) = newsk;
> +	if (sk->sk_type == SOCK_SEQPACKET)
> +		add_wait_queue(&unix_sk(newsk)->peer_wait,
> +			       &unix_sk(sk)->wait);
>
> 	unix_state_unlock(sk);
>
> @@ -1254,6 +1280,10 @@ static int unix_socketpair(struct socket *socka, struct socket *sockb)
> 	sock_hold(skb);
> 	unix_peer(ska) = skb;
> 	unix_peer(skb) = ska;
> +	if (ska->sk_type != SOCK_STREAM) {
> +		add_wait_queue(&unix_sk(ska)->peer_wait, &unix_sk(skb)->wait);
> +		add_wait_queue(&unix_sk(skb)->peer_wait, &unix_sk(ska)->wait);
> +	}
> 	init_peercred(ska);
> 	init_peercred(skb);
>
> @@ -1565,6 +1595,7 @@ restart:
> 	unix_state_lock(sk);
> 	if (unix_peer(sk) == other) {
> 		unix_peer(sk) = NULL;
> +		remove_wait_queue(&unix_sk(other)->peer_wait, &u->wait);
> 		unix_state_unlock(sk);
>
> 		unix_dgram_disconnected(sk, other);
> @@ -2441,7 +2472,6 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
> 	other = unix_peer_get(sk);
> 	if (other) {
> 		if (unix_peer(other) != sk) {
> -			sock_poll_wait(file, &unix_sk(other)->peer_wait, wait);
> 			if (unix_recvq_full(other))
> 				writable = 0;
> 		}
> --
> 1.8.2.rc2
>

My reproducer runs on this patch for more than 3 days now without
triggering anything anymore.

Tested-by: Mathias Krause

Thanks Jason!

Regards,
Mathias
Re: [PATCH v13 13/23] x86/asm/crypto: Create stack frames in aesni-intel_asm.S
On Do, Okt 01, 2015 at 08:29:50 -0500, Josh Poimboeuf wrote:
> On Thu, Oct 01, 2015 at 08:10:26AM +0200, mini...@ld-linux.so wrote:
> > On Tue, Sep 22, 2015 at 10:47:04AM -0500, Josh Poimboeuf wrote:
> > >  ENTRY(aesni_set_key)
> > > +	FRAME_BEGIN
> > >  #ifndef __x86_64__
> > >  	pushl KEYP
> > >  	movl 8(%esp), KEYP		# ctx
> >
> > This will break 32 bit builds using the aesni-intel.ko module. You need
> > to adjust the esp-based offsets for the non-x86_64 case, as FRAME_BEGIN
> > may do another push.
> >
> > How about adding a FRAME_OFFSET() macro to wrap the offsets?:
> >
> > #ifdef CONFIG_FRAME_POINTER
> > # define FRAME_OFFSET(x)	((x) + (BITS_PER_LONG / 8))
> > #else
> > # define FRAME_OFFSET(x)	(x)
> > #endif
> >
> > And using them like this:
> >
> > 	movl FRAME_OFFSET(8)(%esp), KEYP	# ctx
>
> Ah, right. The 32-bit ABI passes arguments on the stack instead of via
> registers.
>
> For now, I'm inclined to just make FRAME_BEGIN and FRAME_END do nothing
> on 32-bit. We're only doing stack validation on x86_64 and I don't know
> if anybody cares about frame pointers on 32-bit at this point.

Well, we had issues in the past, especially in that very module, but
only on 32 bit systems. So it would be nice to get frame pointers right
for 32 bit, too.

Regards,
Mathias
Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket
On 30 September 2015 at 15:25, Rainer Weikusat wrote:
> Mathias Krause writes:
>> On 30 September 2015 at 12:56, Rainer Weikusat wrote:
>>> In case you want some information on this: This is a kernel warning I
>>> could trigger (more than once) on the single day I could so far spend
>>> looking into this (3.2.54 kernel):
>>>
>>> Sep 15 19:37:19 doppelsaurus kernel: WARNING: at lib/list_debug.c:53 list_del+0x9/0x30()
>>> Sep 15 19:37:19 doppelsaurus kernel: Hardware name: 500-330nam
>>> Sep 15 19:37:19 doppelsaurus kernel: list_del corruption. prev->next should be 88022c38f078, but was dead00100100
>>> [snip]
>>
>> Is that with Jason's patch or a vanilla v3.2.54?
>
> That's a kernel warning which occurred repeatedly (among other "link
> pointer disorganization" warnings) when I tested the "program with
> unknown behaviour" you wrote with the kernel I'm currently supporting a
> while ago (as I already wrote in the original mail).

So I assume Jason's patch is not included in your kernel. Then those
messages are expected; expected even on kernels as old as v2.6.27.

Can you re-try with Jason's patch applied?

Thanks,
Mathias
Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket
On 30 September 2015 at 12:56, Rainer Weikusat wrote:
> Mathias Krause writes:
>> On 29 September 2015 at 21:09, Jason Baron wrote:
>>> However, if we call connect on socket 's', to connect to a new socket
>>> 'o2', we drop the reference on the original socket 'o'. Thus, we can
>>> now close socket 'o' without unregistering from epoll. Then, when we
>>> either close the ep or unregister 'o', we end up with this list
>>> corruption. Thus, this is not a race per se, but can be triggered
>>> sequentially.
>>
>> Sounds profound, but the reproducer calls connect only once per
>> socket. So there is no "connect to a new socket", no?
>> But w/e, see below.
>
> In case you want some information on this: This is a kernel warning I
> could trigger (more than once) on the single day I could so far spend
> looking into this (3.2.54 kernel):
>
> Sep 15 19:37:19 doppelsaurus kernel: WARNING: at lib/list_debug.c:53 list_del+0x9/0x30()
> Sep 15 19:37:19 doppelsaurus kernel: Hardware name: 500-330nam
> Sep 15 19:37:19 doppelsaurus kernel: list_del corruption. prev->next should be 88022c38f078, but was dead00100100
> [snip]

Is that with Jason's patch or a vanilla v3.2.54?

Regards,
Mathias
Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket
On 30 September 2015 at 15:25, Rainer Weikusat
<rweiku...@mobileactivedefense.com> wrote:
> Mathias Krause <mini...@googlemail.com> writes:
>> On 30 September 2015 at 12:56, Rainer Weikusat
>> <rweiku...@mobileactivedefense.com> wrote:
>>> In case you want some information on this: This is a kernel warning I
>>> could trigger (more than once) on the single day I could so far spend
>>> looking into this (3.2.54 kernel):
>>>
>>> Sep 15 19:37:19 doppelsaurus kernel: WARNING: at lib/list_debug.c:53
>>> list_del+0x9/0x30()
>>> Sep 15 19:37:19 doppelsaurus kernel: Hardware name: 500-330nam
>>> Sep 15 19:37:19 doppelsaurus kernel: list_del corruption. prev->next
>>> should be 88022c38f078, but was dead00100100
>>> [snip]
>>
>> Is that with Jason's patch or a vanilla v3.2.54?
>
> That's a kernel warning which occurred repeatedly (among other "link
> pointer disorganization" warnings) when I tested the "program with
> unknown behaviour" you wrote with the kernel I'm currently supporting a
> while ago (as I already wrote in the original mail).

So I assume Jason's patch is not included in your kernel. Then those
messages are expected; expected even on kernels as old as v2.6.27.

Can you re-try with Jason's patch applied?

Thanks, Mathias
Re: List corruption on epoll_ctl(EPOLL_CTL_DEL) an AF_UNIX socket
On 29 September 2015 at 21:09, Jason Baron wrote:
> However, if we call connect on socket 's', to connect to a new socket
> 'o2', we drop the reference on the original socket 'o'. Thus, we can now
> close socket 'o' without unregistering from epoll. Then, when we either
> close the ep or unregister 'o', we end up with this list corruption.
> Thus, this is not a race per se, but can be triggered sequentially.

Sounds profound, but the reproducer calls connect only once per
socket. So there is no "connect to a new socket", no?
But w/e, see below.

> Linus explains the general case in the context the signalfd stuff here:
> https://lkml.org/lkml/2013/10/14/634

I also found that posting while looking for similar bug reports. Also
found that one: https://lkml.org/lkml/2014/5/15/532

> So this may be the case that we've been chasing here for a while...

That bug triggers since commit 3c73419c09 "af_unix: fix 'poll for
write'/connected DGRAM sockets". That's v2.6.26-rc7, as noted in the
reproducer.

>
> In any case, we could fix with that same POLLFREE mechanism, the
> simplest would be something like:
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 03ee4d3..d499f81 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -392,6 +392,9 @@ static void unix_sock_destructor(struct sock *sk)
>      pr_debug("UNIX %p is destroyed, %ld are still alive.\n", sk,
>          atomic_long_read(&unix_nr_socks));
>  #endif
> +    /* make sure we remove from epoll */
> +    wake_up_poll(&u->peer_wait, POLLFREE);
> +    synchronize_sched();
>  }
>
>  static void unix_release_sock(struct sock *sk, int embrion)
>
> I'm not suggesting we apply that, but that fixed the supplied test case.
> We could enhance the above, to avoid the free'ing latency there by doing
> the SLAB_DESTROY_BY_RCU for unix sockets. But I'm not convinced
> that this wouldn't be still broken for select()/poll() as well. I think
> we can be in a select() call for socket 's', and if we remove socket
> 'o' from it in the meantime (by doing a connect() on s to somewhere else
> and a close on 'o'), I think we can still crash there. So POLLFREE would
> have to be extended. I tried to hit this with select() but could not,
> but I think if I tried harder I could.
>
> Instead of going further down that route, perhaps something like below
> might be better. The basic idea would be to do away with the 'other'
> poll call in unix_dgram_poll(), and instead revert back to a registering
> on a single wait queue. We add a new wait queue to unix sockets such
> that we can register it with a remote other on connect(). Then we can
> use the wakeup from the remote to wake up the registered unix socket.
> Probably better explained with the patch below. Note I didn't add to
> the remote for SOCK_STREAM, since the poll() routine there doesn't do
> the double wait queue registering:
>
> diff --git a/include/net/af_unix.h b/include/net/af_unix.h
> index 4a167b3..9698aff 100644
> --- a/include/net/af_unix.h
> +++ b/include/net/af_unix.h
> @@ -62,6 +62,7 @@ struct unix_sock {
>  #define UNIX_GC_CANDIDATE   0
>  #define UNIX_GC_MAYBE_CYCLE 1
>      struct socket_wq    peer_wq;
> +    wait_queue_t        wait;
>  };
>  #define unix_sk(__sk) ((struct unix_sock *)__sk)
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 03ee4d3..9e0692a 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -420,6 +420,8 @@ static void unix_release_sock(struct sock *sk, int embrion)
>      skpair = unix_peer(sk);
>
>      if (skpair != NULL) {
> +        if (sk->sk_type != SOCK_STREAM)
> +            remove_wait_queue(&unix_sk(skpair)->peer_wait,
> +                      &u->wait);
>          if (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET) {
>              unix_state_lock(skpair);
>              /* No more writes */
> @@ -636,6 +638,16 @@ static struct proto unix_proto = {
>   */
>  static struct lock_class_key af_unix_sk_receive_queue_lock_key;
>
> +static int peer_wake(wait_queue_t *wait, unsigned mode,
> +             int sync, void *key)
> +{
> +    struct unix_sock *u;
> +
> +    u = container_of(wait, struct unix_sock, wait);
> +    wake_up_interruptible_sync_poll(sk_sleep(&u->sk), key);
> +
> +    return 0;
> +}
> +
>  static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
>  {
>      struct sock *sk = NULL;
> @@ -664,6 +676,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
>      INIT_LIST_HEAD(&u->link);
>      mutex_init(&u->readlock); /* single task reading lock */
>      init_waitqueue_head(&u->peer_wait);
> +    init_waitqueue_func_entry(&u->wait, peer_wake);
>      unix_insert_socket(unix_sockets_unbound(sk), sk);
>  out:
>      if (sk == NULL)
> @@ -1030,7 +1043,10 @@ restart:
>       */
>      if (unix_peer(sk)) {
>          struct sock *old_peer = unix_peer(sk);
> +
> +