Re: [PATCH v2 13/13] lkdtm: Add a test for function descriptors protection

2021-10-15 Thread Kees Cook
On Thu, Oct 14, 2021 at 07:50:02AM +0200, Christophe Leroy wrote:
> Add WRITE_OPD to check that you can't modify function
> descriptors.
> 
> Gives the following result when function descriptors are
> not protected:
> 
>   lkdtm: Performing direct entry WRITE_OPD
>   lkdtm: attempting bad 16 bytes write at c269b358
>   lkdtm: FAIL: survived bad write
>   lkdtm: do_nothing was hijacked!
> 
> Looks like a standard compiler barrier(); is not enough to force
> GCC to use the modified function descriptor. Add to add a fake empty
> inline assembly to force GCC to reload the function descriptor.
> 
> Signed-off-by: Christophe Leroy 
> ---
>  drivers/misc/lkdtm/core.c  |  1 +
>  drivers/misc/lkdtm/lkdtm.h |  1 +
>  drivers/misc/lkdtm/perms.c | 22 ++
>  3 files changed, 24 insertions(+)
> 
> diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
> index fe6fd34b8caf..de092aa03b5d 100644
> --- a/drivers/misc/lkdtm/core.c
> +++ b/drivers/misc/lkdtm/core.c
> @@ -148,6 +148,7 @@ static const struct crashtype crashtypes[] = {
>   CRASHTYPE(WRITE_RO),
>   CRASHTYPE(WRITE_RO_AFTER_INIT),
>   CRASHTYPE(WRITE_KERN),
> + CRASHTYPE(WRITE_OPD),
>   CRASHTYPE(REFCOUNT_INC_OVERFLOW),
>   CRASHTYPE(REFCOUNT_ADD_OVERFLOW),
>   CRASHTYPE(REFCOUNT_INC_NOT_ZERO_OVERFLOW),
> diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
> index c212a253edde..188bd0fd6575 100644
> --- a/drivers/misc/lkdtm/lkdtm.h
> +++ b/drivers/misc/lkdtm/lkdtm.h
> @@ -105,6 +105,7 @@ void __init lkdtm_perms_init(void);
>  void lkdtm_WRITE_RO(void);
>  void lkdtm_WRITE_RO_AFTER_INIT(void);
>  void lkdtm_WRITE_KERN(void);
> +void lkdtm_WRITE_OPD(void);
>  void lkdtm_EXEC_DATA(void);
>  void lkdtm_EXEC_STACK(void);
>  void lkdtm_EXEC_KMALLOC(void);
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 96b3ebfcb8ed..3870bc82d40d 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -44,6 +44,11 @@ static noinline void do_overwritten(void)
>   return;
>  }
>  
> +static noinline void do_almost_nothing(void)
> +{
> + pr_info("do_nothing was hijacked!\n");
> +}
> +
>  static void *setup_function_descriptor(func_desc_t *fdesc, void *dst)
>  {
>   memcpy(fdesc, do_nothing, sizeof(*fdesc));
> @@ -143,6 +148,23 @@ void lkdtm_WRITE_KERN(void)
>   do_overwritten();
>  }
>  
> +void lkdtm_WRITE_OPD(void)
> +{
> + size_t size = sizeof(func_desc_t);
> + void (*func)(void) = do_nothing;
> +
> + if (!have_function_descriptors()) {
> + pr_info("Platform doesn't have function descriptors.\n");

This should be more explicit ('xfail'):

pr_info("XFAIL: platform doesn't use function descriptors.\n");

> + return;
> + }
> + pr_info("attempting bad %zu bytes write at %px\n", size, do_nothing);
> + memcpy(do_nothing, do_almost_nothing, size);
> + pr_err("FAIL: survived bad write\n");
> +
> + asm("" : "=m"(func));

Since this is a descriptor, I assume no icache flush is needed. Are
function descriptors strictly dcache? (Is anything besides just a
barrier needed?)

> + func();
> +}
> +
>  void lkdtm_EXEC_DATA(void)
>  {
>   execute_location(data_area, CODE_WRITE);
> -- 
> 2.31.1
> 

-- 
Kees Cook


Re: [PATCH v2 11/13] lkdtm: Fix lkdtm_EXEC_RODATA()

2021-10-15 Thread Kees Cook
On Thu, Oct 14, 2021 at 07:50:00AM +0200, Christophe Leroy wrote:
> Behind its location, lkdtm_EXEC_RODATA() executes
> lkdtm_rodata_do_nothing() which is a real function,
> not a copy of do_nothing().
> 
> So executes it directly instead of using execute_location().
> 
> This is necessary because following patch will fix execute_location()
> to use a copy of the function descriptor of do_nothing() and
> function descriptor of lkdtm_rodata_do_nothing() might be different.
> 
> And fix displayed addresses by dereferencing the function descriptors.
> 
> Signed-off-by: Christophe Leroy 

I still don't understand this -- it doesn't look needed at all given the
changes in patch 12. (i.e. everything is using
dereference_function_descriptor() now)

Can't this patch be dropped?

-Kees

> ---
>  drivers/misc/lkdtm/perms.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 035fcca441f0..5266dc28df6e 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -153,7 +153,14 @@ void lkdtm_EXEC_VMALLOC(void)
>  
>  void lkdtm_EXEC_RODATA(void)
>  {
> - execute_location(lkdtm_rodata_do_nothing, CODE_AS_IS);
> + pr_info("attempting ok execution at %px\n",
> + dereference_function_descriptor(do_nothing));
> + do_nothing();
> +
> + pr_info("attempting bad execution at %px\n",
> + dereference_function_descriptor(lkdtm_rodata_do_nothing));
> + lkdtm_rodata_do_nothing();
> + pr_err("FAIL: func returned\n");
>  }
>  
>  void lkdtm_EXEC_USERSPACE(void)
> -- 
> 2.31.1
> 

-- 
Kees Cook


Re: [PATCH v2 12/13] lkdtm: Fix execute_[user]_location()

2021-10-15 Thread Kees Cook
On Thu, Oct 14, 2021 at 07:50:01AM +0200, Christophe Leroy wrote:
> execute_location() and execute_user_location() intent
> to copy do_nothing() text and execute it at a new location.
> However, at the time being it doesn't copy do_nothing() function
> but do_nothing() function descriptor which still points to the
> original text. So at the end it still executes do_nothing() at
> its original location allthough using a copied function descriptor.
> 
> So, fix that by really copying do_nothing() text and build a new
> function descriptor by copying do_nothing() function descriptor and
> updating the target address with the new location.
> 
> Also fix the displayed addresses by dereferencing do_nothing()
> function descriptor.
> 
> Signed-off-by: Christophe Leroy 
> ---
>  drivers/misc/lkdtm/perms.c | 25 +
>  include/asm-generic/sections.h |  5 +
>  2 files changed, 26 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 5266dc28df6e..96b3ebfcb8ed 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -44,19 +44,32 @@ static noinline void do_overwritten(void)
>   return;
>  }
>  
> +static void *setup_function_descriptor(func_desc_t *fdesc, void *dst)
> +{
> + memcpy(fdesc, do_nothing, sizeof(*fdesc));
> + fdesc->addr = (unsigned long)dst;
> + barrier();
> +
> + return fdesc;
> +}

How about collapsing the "have_function_descriptors()" check into
setup_function_descriptor()?

static void *setup_function_descriptor(func_desc_t *fdesc, void *dst)
{
if (__is_defined(HAVE_FUNCTION_DESCRIPTORS)) {
memcpy(fdesc, do_nothing, sizeof(*fdesc));
fdesc->addr = (unsigned long)dst;
barrier();
return fdesc;
} else {
return dst;
}
}

> +
>  static noinline void execute_location(void *dst, bool write)
>  {
>   void (*func)(void) = dst;
> + func_desc_t fdesc;
> + void *do_nothing_text = dereference_function_descriptor(do_nothing);
>  
> - pr_info("attempting ok execution at %px\n", do_nothing);
> + pr_info("attempting ok execution at %px\n", do_nothing_text);
>   do_nothing();
>  
>   if (write == CODE_WRITE) {
> - memcpy(dst, do_nothing, EXEC_SIZE);
> + memcpy(dst, do_nothing_text, EXEC_SIZE);
>   flush_icache_range((unsigned long)dst,
>  (unsigned long)dst + EXEC_SIZE);
>   }
>   pr_info("attempting bad execution at %px\n", func);
> + if (have_function_descriptors())
> + func = setup_function_descriptor(, dst);
>   func();
>   pr_err("FAIL: func returned\n");
>  }
> @@ -67,15 +80,19 @@ static void execute_user_location(void *dst)
>  
>   /* Intentionally crossing kernel/user memory boundary. */
>   void (*func)(void) = dst;
> + func_desc_t fdesc;
> + void *do_nothing_text = dereference_function_descriptor(do_nothing);
>  
> - pr_info("attempting ok execution at %px\n", do_nothing);
> + pr_info("attempting ok execution at %px\n", do_nothing_text);
>   do_nothing();
>  
> - copied = access_process_vm(current, (unsigned long)dst, do_nothing,
> + copied = access_process_vm(current, (unsigned long)dst, do_nothing_text,
>  EXEC_SIZE, FOLL_WRITE);
>   if (copied < EXEC_SIZE)
>   return;
>   pr_info("attempting bad execution at %px\n", func);
> + if (have_function_descriptors())
> + func = setup_function_descriptor(, dst);
>   func();
>   pr_err("FAIL: func returned\n");
>  }


> diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
> index 76163883c6ff..d225318538bd 100644
> --- a/include/asm-generic/sections.h
> +++ b/include/asm-generic/sections.h
> @@ -70,6 +70,11 @@ typedef struct {
>  } func_desc_t;
>  #endif
>  
> +static inline bool have_function_descriptors(void)
> +{
> + return __is_defined(HAVE_FUNCTION_DESCRIPTORS);
> +}
> +
>  /* random extra sections (if any).  Override
>   * in asm/sections.h */
>  #ifndef arch_is_kernel_text

This hunk seems like it should live in a separate patch.

-- 
Kees Cook


Re: [PATCH 2/2] kbuild: use more subdir- for visiting subdirectories while cleaning

2021-10-13 Thread Kees Cook
On Wed, Oct 13, 2021 at 03:36:22PM +0900, Masahiro Yamada wrote:
> Documentation/kbuild/makefiles.rst suggests to use "archclean" for
> cleaning arch/$(SRCARCH)/boot/.
> 
> Since commit d92cc4d51643 ("kbuild: require all architectures to have
> arch/$(SRCARCH)/Kbuild"), we can use the "subdir- += boot" trick for
> all architectures. This can take advantage of the parallel option (-j)
> for "make clean".
> 
> I also cleaned up the comments. The "archdep" target does not exist.
> 
> Signed-off-by: Masahiro Yamada 

I like the clean-up!

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH v1 05/10] asm-generic: Define 'funct_descr_t' to commonly describe function descriptors

2021-10-13 Thread Kees Cook
On Wed, Oct 13, 2021 at 09:23:56AM +0200, Christophe Leroy wrote:
> 
> 
> Le 13/10/2021 à 09:01, Kees Cook a écrit :
> > On Mon, Oct 11, 2021 at 05:25:32PM +0200, Christophe Leroy wrote:
> > > We have three architectures using function descriptors, each with its
> > > own name.
> > > 
> > > Add a common typedef that can be used in generic code.
> > > 
> > > Also add a stub typedef for architecture without function descriptors,
> > 
> > nit: funct_descr_t reads weird to me. why not func_desc_t ? Either way:
> 
> func_desc_t already exists in powerpc. I have a patch to remove it as it is
> redundant with struct ppc64_opd_entry, but I didnt' want to include it in
> this series.
> 
> But after all I can add it in this series, I'll add it in v2.

Ah-ha! That works for me. :) Thanks!

-Kees

-- 
Kees Cook


Re: [PATCH v1 09/10] lkdtm: Fix lkdtm_EXEC_RODATA()

2021-10-13 Thread Kees Cook
On Mon, Oct 11, 2021 at 05:25:36PM +0200, Christophe Leroy wrote:
> Behind a location, lkdtm_EXEC_RODATA() executes a real function,
> not a copy of do_nothing().
>
> So do it directly instead of using execute_location().
>
> And fix displayed addresses by dereferencing the function descriptors.
>
> Signed-off-by: Christophe Leroy 
> ---
>  drivers/misc/lkdtm/perms.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 442d60ed25ef..da16564e1ecd 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -153,7 +153,14 @@ void lkdtm_EXEC_VMALLOC(void)
>
>  void lkdtm_EXEC_RODATA(void)
>  {
> - execute_location(lkdtm_rodata_do_nothing, CODE_AS_IS);
> + pr_info("attempting ok execution at %px\n",
> + dereference_symbol_descriptor(do_nothing));
> + do_nothing();
> +
> + pr_info("attempting bad execution at %px\n",
> + dereference_symbol_descriptor(lkdtm_rodata_do_nothing));
> + lkdtm_rodata_do_nothing();
> + pr_err("FAIL: func returned\n");
>  }

(In re-reading this more carefully, I see now why kallsyms.h is used
earlier: _function_ vs _symbol_ descriptor.)

In the next patch:

static noinline void execute_location(void *dst, bool write)
{
...
   func = setup_function_descriptor(, dst);
   if (IS_ERR(func))
   return;

   pr_info("attempting bad execution at %px\n", dst);
   func();
   pr_err("FAIL: func returned\n");
}

What are the conditions for which dereference_symbol_descriptor works
but dereference _function_descriptor doesn't?

--
Kees Cook


Re: [PATCH v1 10/10] lkdtm: Fix execute_[user]_location()

2021-10-13 Thread Kees Cook
On Mon, Oct 11, 2021 at 05:25:37PM +0200, Christophe Leroy wrote:
> execute_location() and execute_user_location() intent
> to copy do_nothing() text and execute it at a new location.
> However, at the time being it doesn't copy do_nothing() function
> but do_nothing() function descriptor which still points to the
> original text. So at the end it still executes do_nothing() at
> its original location allthough using a copied function descriptor.
> 
> So, fix that by really copying do_nothing() text and build a new
> function descriptor by copying do_nothing() function descriptor and
> updating the target address with the new location.
> 
> Also fix the displayed addresses by dereferencing do_nothing()
> function descriptor.
> 
> Signed-off-by: Christophe Leroy 
> ---
>  drivers/misc/lkdtm/perms.c | 45 +++---
>  1 file changed, 37 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index da16564e1ecd..1d03cd44c21d 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -44,19 +44,42 @@ static noinline void do_overwritten(void)
>   return;
>  }
>  
> +static void *setup_function_descriptor(funct_descr_t *fdesc, void *dst)
> +{
> + int err;
> +
> + if (!__is_defined(HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR))
> + return dst;
> +
> + err = copy_from_kernel_nofault(fdesc, do_nothing, sizeof(*fdesc));
> + if (err < 0)
> + return ERR_PTR(err);
> +
> + fdesc->addr = (unsigned long)dst;
> + barrier();
> +
> + return fdesc;
> +}
> +
>  static noinline void execute_location(void *dst, bool write)
>  {
> - void (*func)(void) = dst;
> + void (*func)(void);
> + funct_descr_t fdesc;
> + void *do_nothing_text = dereference_symbol_descriptor(do_nothing);
>  
> - pr_info("attempting ok execution at %px\n", do_nothing);
> + pr_info("attempting ok execution at %px\n", do_nothing_text);
>   do_nothing();
>  
>   if (write == CODE_WRITE) {
> - memcpy(dst, do_nothing, EXEC_SIZE);
> + memcpy(dst, do_nothing_text, EXEC_SIZE);
>   flush_icache_range((unsigned long)dst,
>  (unsigned long)dst + EXEC_SIZE);
>   }
> - pr_info("attempting bad execution at %px\n", func);
> + func = setup_function_descriptor(, dst);
> + if (IS_ERR(func))
> + return;

I think this error case should complain with some details. :) Maybe:

pr_err("FAIL: could not build function descriptor for %px\n", dst);

> +
> + pr_info("attempting bad execution at %px\n", dst);

And then leave this pr_info() as it was, before the
setup_function_descriptor() call.

>   func();
>   pr_err("FAIL: func returned\n");
>  }
> @@ -66,16 +89,22 @@ static void execute_user_location(void *dst)
>   int copied;
>  
>   /* Intentionally crossing kernel/user memory boundary. */
> - void (*func)(void) = dst;
> + void (*func)(void);
> + funct_descr_t fdesc;
> + void *do_nothing_text = dereference_symbol_descriptor(do_nothing);
>  
> - pr_info("attempting ok execution at %px\n", do_nothing);
> + pr_info("attempting ok execution at %px\n", do_nothing_text);
>   do_nothing();
>  
> - copied = access_process_vm(current, (unsigned long)dst, do_nothing,
> + copied = access_process_vm(current, (unsigned long)dst, do_nothing_text,
>  EXEC_SIZE, FOLL_WRITE);
>   if (copied < EXEC_SIZE)
>   return;
> - pr_info("attempting bad execution at %px\n", func);
> + func = setup_function_descriptor(, dst);
> + if (IS_ERR(func))
> + return;
> +
> + pr_info("attempting bad execution at %px\n", dst);

Same here.

>   func();
>   pr_err("FAIL: func returned\n");
>  }
> -- 
> 2.31.1
> 

-- 
Kees Cook


Re: [PATCH v1 09/10] lkdtm: Fix lkdtm_EXEC_RODATA()

2021-10-13 Thread Kees Cook
On Mon, Oct 11, 2021 at 05:25:36PM +0200, Christophe Leroy wrote:
> Behind a location, lkdtm_EXEC_RODATA() executes a real function,
> not a copy of do_nothing().
> 
> So do it directly instead of using execute_location().

I don't understand this. Why does the next patch not fix this?

-Kees

> 
> And fix displayed addresses by dereferencing the function descriptors.
> 
> Signed-off-by: Christophe Leroy 
> ---
>  drivers/misc/lkdtm/perms.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 442d60ed25ef..da16564e1ecd 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -153,7 +153,14 @@ void lkdtm_EXEC_VMALLOC(void)
>  
>  void lkdtm_EXEC_RODATA(void)
>  {
> - execute_location(lkdtm_rodata_do_nothing, CODE_AS_IS);
> + pr_info("attempting ok execution at %px\n",
> + dereference_symbol_descriptor(do_nothing));
> + do_nothing();
> +
> + pr_info("attempting bad execution at %px\n",
> + dereference_symbol_descriptor(lkdtm_rodata_do_nothing));
> + lkdtm_rodata_do_nothing();
> +     pr_err("FAIL: func returned\n");
>  }
>  
>  void lkdtm_EXEC_USERSPACE(void)
> -- 
> 2.31.1
> 

-- 
Kees Cook


Re: [PATCH v1 08/10] lkdtm: Really write into kernel text in WRITE_KERN

2021-10-13 Thread Kees Cook
On Mon, Oct 11, 2021 at 05:25:35PM +0200, Christophe Leroy wrote:
> WRITE_KERN is supposed to overwrite some kernel text, namely
> do_overwritten() function.
> 
> But at the time being it overwrites do_overwritten() function
> descriptor, not function text.
> 
> Fix it by dereferencing the function descriptor to obtain
> function text pointer.
> 
> And make do_overwritten() noinline so that it is really
> do_overwritten() which is called by lkdtm_WRITE_KERN().
> 
> Signed-off-by: Christophe Leroy 
> ---
>  drivers/misc/lkdtm/perms.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
> index 60b3b2fe929d..442d60ed25ef 100644
> --- a/drivers/misc/lkdtm/perms.c
> +++ b/drivers/misc/lkdtm/perms.c
> @@ -5,6 +5,7 @@
>   * even non-readable regions.
>   */
>  #include "lkdtm.h"
> +#include 

Why not #include  instead here?

>  #include 
>  #include 
>  #include 
> @@ -37,7 +38,7 @@ static noinline void do_nothing(void)
>  }
>  
>  /* Must immediately follow do_nothing for size calculuations to work out. */
> -static void do_overwritten(void)
> +static noinline void do_overwritten(void)
>  {
>   pr_info("do_overwritten wasn't overwritten!\n");
>   return;
> @@ -113,8 +114,9 @@ void lkdtm_WRITE_KERN(void)
>   size_t size;
>   volatile unsigned char *ptr;
>  
> - size = (unsigned long)do_overwritten - (unsigned long)do_nothing;
> - ptr = (unsigned char *)do_overwritten;
> + size = (unsigned long)dereference_symbol_descriptor(do_overwritten) -
> +(unsigned long)dereference_symbol_descriptor(do_nothing);
> + ptr = dereference_symbol_descriptor(do_overwritten);

But otherwise, yup, I expect there will be a bunch of things like this
to clean up in LKDTM. :| Sorry about that!

Acked-by: Kees Cook 

>  
>   pr_info("attempting bad %zu byte write at %px\n", size, ptr);
>   memcpy((void *)ptr, (unsigned char *)do_nothing, size);
> -- 
> 2.31.1
> 

-- 
Kees Cook


Re: [PATCH v1 07/10] lkdtm: Force do_nothing() out of line

2021-10-13 Thread Kees Cook
On Mon, Oct 11, 2021 at 05:25:34PM +0200, Christophe Leroy wrote:
> LKDTM tests display that the run do_nothing() at a given
> address, but in reality do_nothing() is inlined into the
> caller.
> 
> Force it out of line so that it really runs text at the
> displayed address.
> 
> Signed-off-by: Christophe Leroy 

Acked-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH v1 06/10] asm-generic: Refactor dereference_[kernel]_function_descriptor()

2021-10-13 Thread Kees Cook
On Mon, Oct 11, 2021 at 05:25:33PM +0200, Christophe Leroy wrote:
> dereference_function_descriptor() and
> dereference_kernel_function_descriptor() are identical on the
> three architectures implementing them.
> 
> Make it common.
> 
> Signed-off-by: Christophe Leroy 
> ---
>  arch/ia64/include/asm/sections.h| 19 ---
>  arch/parisc/include/asm/sections.h  |  9 -
>  arch/parisc/kernel/process.c| 21 -
>  arch/powerpc/include/asm/sections.h | 23 ---
>  include/asm-generic/sections.h  | 18 ++
>  5 files changed, 18 insertions(+), 72 deletions(-)

A diffstat to love. :)

Reviewed-by: Kees Cook 


> 
> diff --git a/arch/ia64/include/asm/sections.h 
> b/arch/ia64/include/asm/sections.h
> index 929b5c535620..d9addaea8339 100644
> --- a/arch/ia64/include/asm/sections.h
> +++ b/arch/ia64/include/asm/sections.h
> @@ -30,23 +30,4 @@ extern char __start_gate_brl_fsys_bubble_down_patchlist[], 
> __end_gate_brl_fsys_b
>  extern char __start_unwind[], __end_unwind[];
>  extern char __start_ivt_text[], __end_ivt_text[];
>  
> -#undef dereference_function_descriptor
> -static inline void *dereference_function_descriptor(void *ptr)
> -{
> - struct fdesc *desc = ptr;
> - void *p;
> -
> - if (!get_kernel_nofault(p, (void *)>addr))
> - ptr = p;
> - return ptr;
> -}
> -
> -#undef dereference_kernel_function_descriptor
> -static inline void *dereference_kernel_function_descriptor(void *ptr)
> -{
> - if (ptr < (void *)__start_opd || ptr >= (void *)__end_opd)
> - return ptr;
> - return dereference_function_descriptor(ptr);
> -}
> -
>  #endif /* _ASM_IA64_SECTIONS_H */
> diff --git a/arch/parisc/include/asm/sections.h 
> b/arch/parisc/include/asm/sections.h
> index 329e80f7af0a..b041fb32a300 100644
> --- a/arch/parisc/include/asm/sections.h
> +++ b/arch/parisc/include/asm/sections.h
> @@ -12,13 +12,4 @@ typedef Elf64_Fdesc funct_descr_t;
>  
>  extern char __alt_instructions[], __alt_instructions_end[];
>  
> -#ifdef CONFIG_64BIT
> -
> -#undef dereference_function_descriptor
> -void *dereference_function_descriptor(void *);
> -
> -#undef dereference_kernel_function_descriptor
> -void *dereference_kernel_function_descriptor(void *);
> -#endif
> -
>  #endif
> diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c
> index 38ec4ae81239..7382576b52a8 100644
> --- a/arch/parisc/kernel/process.c
> +++ b/arch/parisc/kernel/process.c
> @@ -266,27 +266,6 @@ get_wchan(struct task_struct *p)
>   return 0;
>  }
>  
> -#ifdef CONFIG_64BIT
> -void *dereference_function_descriptor(void *ptr)
> -{
> - Elf64_Fdesc *desc = ptr;
> - void *p;
> -
> - if (!get_kernel_nofault(p, (void *)>addr))
> - ptr = p;
> - return ptr;
> -}
> -
> -void *dereference_kernel_function_descriptor(void *ptr)
> -{
> - if (ptr < (void *)__start_opd ||
> - ptr >= (void *)__end_opd)
> - return ptr;
> -
> - return dereference_function_descriptor(ptr);
> -}
> -#endif
> -
>  static inline unsigned long brk_rnd(void)
>  {
>   return (get_random_int() & BRK_RND_MASK) << PAGE_SHIFT;
> diff --git a/arch/powerpc/include/asm/sections.h 
> b/arch/powerpc/include/asm/sections.h
> index d0d5287fa568..8f8e95f234e2 100644
> --- a/arch/powerpc/include/asm/sections.h
> +++ b/arch/powerpc/include/asm/sections.h
> @@ -72,29 +72,6 @@ static inline int overlaps_kernel_text(unsigned long 
> start, unsigned long end)
>   (unsigned long)_stext < end;
>  }
>  
> -#ifdef PPC64_ELF_ABI_v1
> -
> -#undef dereference_function_descriptor
> -static inline void *dereference_function_descriptor(void *ptr)
> -{
> - struct ppc64_opd_entry *desc = ptr;
> - void *p;
> -
> - if (!get_kernel_nofault(p, (void *)>addr))
> - ptr = p;
> - return ptr;
> -}
> -
> -#undef dereference_kernel_function_descriptor
> -static inline void *dereference_kernel_function_descriptor(void *ptr)
> -{
> - if (ptr < (void *)__start_opd || ptr >= (void *)__end_opd)
> - return ptr;
> -
> - return dereference_function_descriptor(ptr);
> -}
> -#endif /* PPC64_ELF_ABI_v1 */
> -
>  #endif
>  
>  #endif /* __KERNEL__ */
> diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
> index 436412d94054..5baaf9d7c671 100644
> --- a/include/asm-generic/sections.h
> +++ b/include/asm-generic/sections.h
> @@ -60,6 +60,24 @@ extern __visible const void __nosave_begin, __nosave_end;
>  
&g

Re: [PATCH v1 05/10] asm-generic: Define 'funct_descr_t' to commonly describe function descriptors

2021-10-13 Thread Kees Cook
On Mon, Oct 11, 2021 at 05:25:32PM +0200, Christophe Leroy wrote:
> We have three architectures using function descriptors, each with its
> own name.
> 
> Add a common typedef that can be used in generic code.
> 
> Also add a stub typedef for architecture without function descriptors,

nit: funct_descr_t reads weird to me. why not func_desc_t ? Either way:

Reviewed-by: Kees Cook 


> to avoid a forest of #ifdefs.
> 
> Signed-off-by: Christophe Leroy 
> ---
>  arch/ia64/include/asm/sections.h| 1 +
>  arch/parisc/include/asm/sections.h  | 1 +
>  arch/powerpc/include/asm/sections.h | 1 +
>  include/asm-generic/sections.h  | 3 +++
>  4 files changed, 6 insertions(+)
> 
> diff --git a/arch/ia64/include/asm/sections.h 
> b/arch/ia64/include/asm/sections.h
> index 80f5868afb06..929b5c535620 100644
> --- a/arch/ia64/include/asm/sections.h
> +++ b/arch/ia64/include/asm/sections.h
> @@ -8,6 +8,7 @@
>   */
>  
>  #define HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR 1
> +typedef struct fdesc funct_descr_t;
>  
>  #include 
>  #include 
> diff --git a/arch/parisc/include/asm/sections.h 
> b/arch/parisc/include/asm/sections.h
> index 2e781ee19b66..329e80f7af0a 100644
> --- a/arch/parisc/include/asm/sections.h
> +++ b/arch/parisc/include/asm/sections.h
> @@ -4,6 +4,7 @@
>  
>  #ifdef CONFIG_64BIT
>  #define HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR 1
> +typedef Elf64_Fdesc funct_descr_t;
>  #endif
>  
>  /* nothing to see, move along */
> diff --git a/arch/powerpc/include/asm/sections.h 
> b/arch/powerpc/include/asm/sections.h
> index b7f1ba04e756..d0d5287fa568 100644
> --- a/arch/powerpc/include/asm/sections.h
> +++ b/arch/powerpc/include/asm/sections.h
> @@ -10,6 +10,7 @@
>  
>  #ifdef PPC64_ELF_ABI_v1
>  #define HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR 1
> +typedef struct ppc64_opd_entry funct_descr_t;
>  #endif
>  
>  #include 
> diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
> index 1db5cfd69817..436412d94054 100644
> --- a/include/asm-generic/sections.h
> +++ b/include/asm-generic/sections.h
> @@ -63,6 +63,9 @@ extern __visible const void __nosave_begin, __nosave_end;
>  #else
>  #define dereference_function_descriptor(p) ((void *)(p))
>  #define dereference_kernel_function_descriptor(p) ((void *)(p))
> +typedef struct {
> + unsigned long addr;
> +} funct_descr_t;
>  #endif
>  
>  /* random extra sections (if any).  Override
> -- 
> 2.31.1
> 

-- 
Kees Cook


Re: [PATCH v1 04/10] asm-generic: Use HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR to define associated stubs

2021-10-13 Thread Kees Cook
On Mon, Oct 11, 2021 at 05:25:31PM +0200, Christophe Leroy wrote:
> Use HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR instead of 
> 'dereference_function_descriptor'
> to know whether arch has function descriptors.
> 
> Signed-off-by: Christophe Leroy 

I'd mention the intentionally-empty #if/#else in the commit log, as I,
like Helge, mentally tripped over it in the review. :)

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH v1 03/10] ia64: Rename 'ip' to 'addr' in 'struct fdesc'

2021-10-13 Thread Kees Cook
On Mon, Oct 11, 2021 at 05:25:30PM +0200, Christophe Leroy wrote:
> There are three architectures with function descriptors, try to
> have common names for the address they contain in order to
> refactor some functions into generic functions later.
> 
> powerpc has 'funcaddr'
> ia64 has 'ip'
> parisc has 'addr'
> 
> Vote for 'addr' and update 'struct fdesc' accordingly.
> 
> Signed-off-by: Christophe Leroy 

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH v1 02/10] powerpc: Rename 'funcaddr' to 'addr' in 'struct ppc64_opd_entry'

2021-10-13 Thread Kees Cook
On Mon, Oct 11, 2021 at 05:25:29PM +0200, Christophe Leroy wrote:
> There are three architectures with function descriptors, try to
> have common names for the address they contain in order to
> refactor some functions into generic functions later.
> 
> powerpc has 'funcaddr'
> ia64 has 'ip'
> parisc has 'addr'
> 
> Vote for 'addr' and update 'struct ppc64_opd_entry' accordingly.
> 
> Signed-off-by: Christophe Leroy 

Reasonable. :)

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH v1 01/10] powerpc: Move 'struct ppc64_opd_entry' back into asm/elf.h

2021-10-13 Thread Kees Cook
On Mon, Oct 11, 2021 at 05:25:28PM +0200, Christophe Leroy wrote:
> 'struct ppc64_opd_entry' doesn't belong to uapi/asm/elf.h
> 
> It was initially in module_64.c and commit 2d291e902791 ("Fix compile
> failure with non modular builds") moved it into asm/elf.h
> 
> But it was by mistake added outside of __KERNEL__ section,
> therefore commit c3617f72036c ("UAPI: (Scripted) Disintegrate
> arch/powerpc/include/asm") moved it to uapi/asm/elf.h
> 
> Move it back into asm/elf.h, this brings it back in line with
> IA64 and PARISC architectures.
> 
> Fixes: 2d291e902791 ("Fix compile failure with non modular builds")
> Signed-off-by: Christophe Leroy 

I'd agree with Arnd: this is a reasonable cleanup and nothing should be
using it.

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH] lkdtm: Fix content of section containing lkdtm_rodata_do_nothing()

2021-10-08 Thread Kees Cook
On Fri, 8 Oct 2021 18:58:40 +0200, Christophe Leroy wrote:
> On a kernel without CONFIG_STRICT_KERNEL_RWX, running EXEC_RODATA
> test leads to "Illegal instruction" failure.
> 
> Looking at the content of rodata_objcopy.o, we see that the
> function content zeroes only:
> 
>   Disassembly of section .rodata:
> 
> [...]

Applied to for-next/lkdtm, thanks!

[1/1] lkdtm: Fix content of section containing lkdtm_rodata_do_nothing()
  https://git.kernel.org/kees/c/19c3069c5f5f

Also, can you take a moment and get "patatt" set up[1] for signing your
patches? I would appreciate that since b4 yells at me when patches aren't
signed. :)

-Kees

[1] https://github.com/mricon/patatt

-- 
Kees Cook



Re: [PATCH] lkdtm: Fix content of section containing lkdtm_rodata_do_nothing()

2021-10-08 Thread Kees Cook
On Fri, Oct 08, 2021 at 11:09:47AM -0700, Nick Desaulniers wrote:
> On Fri, Oct 8, 2021 at 9:59 AM Christophe Leroy
>  wrote:
> >
> > On a kernel without CONFIG_STRICT_KERNEL_RWX, running EXEC_RODATA
> > test leads to "Illegal instruction" failure.
> >
> > Looking at the content of rodata_objcopy.o, we see that the
> > function content zeroes only:
> >
> > Disassembly of section .rodata:
> >
> >  <.lkdtm_rodata_do_nothing>:
> >0:   00 00 00 00 .long 0x0
> >
> > Add the contents flag in order to keep the content of the section
> > while renaming it.
> >
> > Disassembly of section .rodata:
> >
> >  <.lkdtm_rodata_do_nothing>:
> >0:   4e 80 00 20 blr
> >
> > Fixes: e9e08a07385e ("lkdtm: support llvm-objcopy")
> 
> Thanks for the patch; sorry I broke this.
> Reviewed-by: Nick Desaulniers 

Hah! Whoops; sorry I don't have an inverted version of this test! I
should have caught this when it broke. :|

-Kees

-- 
Kees Cook


Re: [RFC PATCH 4/8] powerpc: add CPU field to struct thread_info

2021-09-30 Thread Kees Cook
On Thu, Sep 30, 2021 at 08:46:04AM +1000, Michael Ellerman wrote:
> Ard Biesheuvel  writes:
> > On Tue, 28 Sept 2021 at 02:16, Michael Ellerman  wrote:
> >>
> >> Michael Ellerman  writes:
> >> > Ard Biesheuvel  writes:
> >> >> On Tue, 14 Sept 2021 at 14:11, Ard Biesheuvel  wrote:
> >> >>>
> >> >>> The CPU field will be moved back into thread_info even when
> >> >>> THREAD_INFO_IN_TASK is enabled, so add it back to powerpc's definition
> >> >>> of struct thread_info.
> >> >>>
> >> >>> Signed-off-by: Ard Biesheuvel 
> >> >>
> >> >> Michael,
> >> >>
> >> >> Do you have any objections or issues with this patch or the subsequent
> >> >> ones cleaning up the task CPU kludge for ppc32? Christophe indicated
> >> >> that he was happy with it.
> >> >
> >> > No objections, it looks good to me, thanks for cleaning up that horror :)
> >> >
> >> > It didn't apply cleanly to master so I haven't tested it at all, if you 
> >> > can point me at a
> >> > git tree with the dependencies I'd be happy to run some tests over it.
> >>
> >> Actually I realised I can just drop the last patch.
> >>
> >> So that looks fine, passes my standard quick build & boot on qemu tests,
> >> and builds with/without stack protector enabled.
> >>
> >
> > Thanks.
> >
> > Do you have any opinion on how this series should be merged? Kees Cook
> > is willing to take them via his cross-arch tree, or you could carry
> > them if you prefer. Taking it via multiple trees at the same time is
> > going to be tricky, or take two cycles, with I'd prefer to avoid.
> 
> I don't really mind. If Kees is happy to take it then that's OK by me.
> 
> If Kees put the series in a topic branch based off rc2 then I could
> merge that, and avoid any conflicts.

I've created:

git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git 
for-next/thread_info/cpu

it includes a --no-ff merge commit, which I'm not sure is desirable? Let
me know if I should adjust this, or if Linus will yell about this if I
send him a PR containing a merge commit? I'm not sure what's right here.

Thanks!

-- 
Kees Cook


Re: [RFC PATCH 4/8] powerpc: add CPU field to struct thread_info

2021-09-29 Thread Kees Cook
On Thu, Sep 30, 2021 at 08:46:04AM +1000, Michael Ellerman wrote:
> Ard Biesheuvel  writes:
> > On Tue, 28 Sept 2021 at 02:16, Michael Ellerman  wrote:
> >>
> >> Michael Ellerman  writes:
> >> > Ard Biesheuvel  writes:
> >> >> On Tue, 14 Sept 2021 at 14:11, Ard Biesheuvel  wrote:
> >> >>>
> >> >>> The CPU field will be moved back into thread_info even when
> >> >>> THREAD_INFO_IN_TASK is enabled, so add it back to powerpc's definition
> >> >>> of struct thread_info.
> >> >>>
> >> >>> Signed-off-by: Ard Biesheuvel 
> >> >>
> >> >> Michael,
> >> >>
> >> >> Do you have any objections or issues with this patch or the subsequent
> >> >> ones cleaning up the task CPU kludge for ppc32? Christophe indicated
> >> >> that he was happy with it.
> >> >
> >> > No objections, it looks good to me, thanks for cleaning up that horror :)
> >> >
> >> > It didn't apply cleanly to master so I haven't tested it at all, if you 
> >> > can point me at a
> >> > git tree with the dependencies I'd be happy to run some tests over it.
> >>
> >> Actually I realised I can just drop the last patch.
> >>
> >> So that looks fine, passes my standard quick build & boot on qemu tests,
> >> and builds with/without stack protector enabled.
> >>
> >
> > Thanks.
> >
> > Do you have any opinion on how this series should be merged? Kees Cook
> > is willing to take them via his cross-arch tree, or you could carry
> > them if you prefer. Taking it via multiple trees at the same time is
> > going to be tricky, or take two cycles, with I'd prefer to avoid.
> 
> I don't really mind. If Kees is happy to take it then that's OK by me.
> 
> If Kees put the series in a topic branch based off rc2 then I could
> merge that, and avoid any conflicts.

If that helps, yeah, I can make a separate stable branch. Thanks!

-Kees

-- 
Kees Cook


Re: [PATCH] mm: Remove HARDENED_USERCOPY_FALLBACK

2021-09-21 Thread Kees Cook
On Tue, Sep 21, 2021 at 08:11:49AM +0200, Stephen Kitt wrote:
> This has served its purpose and is no longer used. All usercopy
> violations appear to have been handled by now, any remaining
> instances (or new bugs) will cause copies to be rejected.
> 
> This isn't a direct revert of commit 2d891fbc3bb6 ("usercopy: Allow
> strict enforcement of whitelists"); since usercopy_fallback is
> effectively 0, the fallback handling is removed too.
> 
> This also removes the usercopy_fallback module parameter on
> slab_common.
> 
> Link: https://github.com/KSPP/linux/issues/153
> Signed-off-by: Stephen Kitt 
> Suggested-by: Kees Cook 

Thanks for doing this!

Acked-by: Kees Cook 

-- 
Kees Cook


[PATCH for-next 01/25] scsi: ibmvscsi: Avoid multi-field memset() overflow by aiming at srp

2021-08-22 Thread Kees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Instead of writing beyond the end of evt_struct->iu.srp.cmd, target the
upper union (evt_struct->iu.srp) instead, as that's what is being wiped.

Cc: Tyrel Datwyler 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: "James E.J. Bottomley" 
Cc: "Martin K. Petersen" 
Cc: linux-s...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Kees Cook 
Acked-by: Martin K. Petersen 
Link: https://lore.kernel.org/lkml/yq135rzp79c@ca-mkp.ca.oracle.com
Acked-by: Tyrel Datwyler 
Link: 
https://lore.kernel.org/lkml/6eae8434-e9a7-aa74-628b-b515b3695...@linux.ibm.com
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index e6a3eaaa57d9..3bd3a0124123 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -1055,8 +1055,9 @@ static int ibmvscsi_queuecommand_lck(struct scsi_cmnd 
*cmnd,
return SCSI_MLQUEUE_HOST_BUSY;
 
/* Set up the actual SRP IU */
+   BUILD_BUG_ON(sizeof(evt_struct->iu.srp) != SRP_MAX_IU_LEN);
+   memset(_struct->iu.srp, 0x00, sizeof(evt_struct->iu.srp));
srp_cmd = _struct->iu.srp.cmd;
-   memset(srp_cmd, 0x00, SRP_MAX_IU_LEN);
srp_cmd->opcode = SRP_CMD;
memcpy(srp_cmd->cdb, cmnd->cmnd, sizeof(srp_cmd->cdb));
int_to_scsilun(lun, _cmd->lun);
-- 
2.30.2



[PATCH for-next 02/25] powerpc: Split memset() to avoid multi-field overflow

2021-08-22 Thread Kees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Instead of writing across a field boundary with memset(), move the call
to just the array, and an explicit zeroing of the prior field.

Cc: Benjamin Herrenschmidt 
Cc: Qinglang Miao 
Cc: "Gustavo A. R. Silva" 
Cc: Hulk Robot 
Cc: Wang Wensheng 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Kees Cook 
Reviewed-by: Michael Ellerman 
Link: https://lore.kernel.org/lkml/87czqsnmw9@mpe.ellerman.id.au
---
 drivers/macintosh/smu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/macintosh/smu.c b/drivers/macintosh/smu.c
index 94fb63a7b357..3e2b25ea58a3 100644
--- a/drivers/macintosh/smu.c
+++ b/drivers/macintosh/smu.c
@@ -848,7 +848,8 @@ int smu_queue_i2c(struct smu_i2c_cmd *cmd)
cmd->read = cmd->info.devaddr & 0x01;
switch(cmd->info.type) {
case SMU_I2C_TRANSFER_SIMPLE:
-   memset(>info.sublen, 0, 4);
+   cmd->info.sublen = 0;
+   memset(cmd->info.subaddr, 0, sizeof(cmd->info.subaddr));
break;
case SMU_I2C_TRANSFER_COMBINED:
cmd->info.devaddr &= 0xfe;
-- 
2.30.2



Re: [PATCH v2 57/63] powerpc/signal32: Use struct_group() to zero spe regs

2021-08-20 Thread Kees Cook
On Fri, Aug 20, 2021 at 05:49:35PM +1000, Michael Ellerman wrote:
> Kees Cook  writes:
> > In preparation for FORTIFY_SOURCE performing compile-time and run-time
> > field bounds checking for memset(), avoid intentionally writing across
> > neighboring fields.
> >
> > Add a struct_group() for the spe registers so that memset() can correctly 
> > reason
> > about the size:
> >
> >In function 'fortify_memset_chk',
> >inlined from 'restore_user_regs.part.0' at 
> > arch/powerpc/kernel/signal_32.c:539:3:
> >>> include/linux/fortify-string.h:195:4: error: call to 
> >>> '__write_overflow_field' declared with attribute warning: detected write 
> >>> beyond size of field (1st parameter); maybe use struct_group()? 
> >>> [-Werror=attribute-warning]
> >  195 |__write_overflow_field();
> >  |^~~~
> >
> > Cc: Michael Ellerman 
> > Cc: Benjamin Herrenschmidt 
> > Cc: Paul Mackerras 
> > Cc: Christophe Leroy 
> > Cc: Sudeep Holla 
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Reported-by: kernel test robot 
> > Signed-off-by: Kees Cook 
> > ---
> >  arch/powerpc/include/asm/processor.h | 6 --
> >  arch/powerpc/kernel/signal_32.c  | 6 +++---
> >  2 files changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/processor.h 
> > b/arch/powerpc/include/asm/processor.h
> > index f348e564f7dd..05dc567cb9a8 100644
> > --- a/arch/powerpc/include/asm/processor.h
> > +++ b/arch/powerpc/include/asm/processor.h
> > @@ -191,8 +191,10 @@ struct thread_struct {
> > int used_vsr;   /* set if process has used VSX */
> >  #endif /* CONFIG_VSX */
> >  #ifdef CONFIG_SPE
> > -   unsigned long   evr[32];/* upper 32-bits of SPE regs */
> > -   u64 acc;/* Accumulator */
> > +   struct_group(spe,
> > +   unsigned long   evr[32];/* upper 32-bits of SPE regs */
> > +   u64 acc;/* Accumulator */
> > +   );
> > unsigned long   spefscr;/* SPE & eFP status */
> > unsigned long   spefscr_last;   /* SPEFSCR value on last prctl
> >call or trap return */
> > diff --git a/arch/powerpc/kernel/signal_32.c 
> > b/arch/powerpc/kernel/signal_32.c
> > index 0608581967f0..77b86caf5c51 100644
> > --- a/arch/powerpc/kernel/signal_32.c
> > +++ b/arch/powerpc/kernel/signal_32.c
> > @@ -532,11 +532,11 @@ static long restore_user_regs(struct pt_regs *regs,
> > regs_set_return_msr(regs, regs->msr & ~MSR_SPE);
> > if (msr & MSR_SPE) {
> > /* restore spe registers from the stack */
> > -   unsafe_copy_from_user(current->thread.evr, >mc_vregs,
> > - ELF_NEVRREG * sizeof(u32), failed);
> > +   unsafe_copy_from_user(>thread.spe, >mc_vregs,
> > + sizeof(current->thread.spe), failed);
> 
> This makes me nervous, because the ABI is that we copy ELF_NEVRREG *
> sizeof(u32) bytes, not whatever sizeof(current->thread.spe) happens to
> be.
> 
> ie. if we use sizeof an inadvertent change to the fields in
> thread_struct could change how many bytes we copy out to userspace,
> which would be an ABI break.
> 
> And that's not that hard to do, because it's not at all obvious that the
> size and layout of fields in thread_struct affects the user ABI.
> 
> At the same time we don't want to copy the right number of bytes but
> the wrong content, so from that point of view using sizeof is good :)
> 
> The way we handle it in ptrace is to have BUILD_BUG_ON()s to verify that
> things match up, so maybe we should do that here too.
> 
> ie. add:
> 
>   BUILD_BUG_ON(sizeof(current->thread.spe) == ELF_NEVRREG * sizeof(u32));
> 
> Not sure if you are happy doing that as part of this patch. I can always
> do it later if not.

Sounds good to me; I did that in a few other cases in the series where
the relationships between things seemed tenuous. :) I'll add this (as
!=) in v3.

Thanks!

-- 
Kees Cook


Re: [PATCH v2 61/63] powerpc: Split memset() to avoid multi-field overflow

2021-08-18 Thread Kees Cook
On Wed, Aug 18, 2021 at 08:42:18AM +0200, Christophe Leroy wrote:
> 
> 
> Le 18/08/2021 à 08:05, Kees Cook a écrit :
> > In preparation for FORTIFY_SOURCE performing compile-time and run-time
> > field bounds checking for memset(), avoid intentionally writing across
> > neighboring fields.
> > 
> > Instead of writing across a field boundary with memset(), move the call
> > to just the array, and an explicit zeroing of the prior field.
> > 
> > Cc: Benjamin Herrenschmidt 
> > Cc: Qinglang Miao 
> > Cc: "Gustavo A. R. Silva" 
> > Cc: Hulk Robot 
> > Cc: Wang Wensheng 
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Signed-off-by: Kees Cook 
> > Reviewed-by: Michael Ellerman 
> > Link: https://lore.kernel.org/lkml/87czqsnmw9@mpe.ellerman.id.au
> > ---
> >   drivers/macintosh/smu.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/macintosh/smu.c b/drivers/macintosh/smu.c
> > index 94fb63a7b357..59ce431da7ef 100644
> > --- a/drivers/macintosh/smu.c
> > +++ b/drivers/macintosh/smu.c
> > @@ -848,7 +848,8 @@ int smu_queue_i2c(struct smu_i2c_cmd *cmd)
> > cmd->read = cmd->info.devaddr & 0x01;
> > switch(cmd->info.type) {
> > case SMU_I2C_TRANSFER_SIMPLE:
> > -   memset(>info.sublen, 0, 4);
> > +   cmd->info.sublen = 0;
> > +   memset(>info.subaddr, 0, 3);
> 
> subaddr[] is a table, should the & be avoided ?

It results in the same thing, but it's better form to not have the &; I
will fix this.

> And while at it, why not use sizeof(subaddr) instead of 3 ?

Agreed. :)

Thanks!

-- 
Kees Cook


[PATCH v2 57/63] powerpc/signal32: Use struct_group() to zero spe regs

2021-08-18 Thread Kees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Add a struct_group() for the spe registers so that memset() can correctly reason
about the size:

   In function 'fortify_memset_chk',
   inlined from 'restore_user_regs.part.0' at 
arch/powerpc/kernel/signal_32.c:539:3:
>> include/linux/fortify-string.h:195:4: error: call to 
>> '__write_overflow_field' declared with attribute warning: detected write 
>> beyond size of field (1st parameter); maybe use struct_group()? 
>> [-Werror=attribute-warning]
 195 |__write_overflow_field();
 |^~~~

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Christophe Leroy 
Cc: Sudeep Holla 
Cc: linuxppc-dev@lists.ozlabs.org
Reported-by: kernel test robot 
Signed-off-by: Kees Cook 
---
 arch/powerpc/include/asm/processor.h | 6 --
 arch/powerpc/kernel/signal_32.c  | 6 +++---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index f348e564f7dd..05dc567cb9a8 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -191,8 +191,10 @@ struct thread_struct {
int used_vsr;   /* set if process has used VSX */
 #endif /* CONFIG_VSX */
 #ifdef CONFIG_SPE
-   unsigned long   evr[32];/* upper 32-bits of SPE regs */
-   u64 acc;/* Accumulator */
+   struct_group(spe,
+   unsigned long   evr[32];/* upper 32-bits of SPE regs */
+   u64 acc;/* Accumulator */
+   );
unsigned long   spefscr;/* SPE & eFP status */
unsigned long   spefscr_last;   /* SPEFSCR value on last prctl
   call or trap return */
diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index 0608581967f0..77b86caf5c51 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -532,11 +532,11 @@ static long restore_user_regs(struct pt_regs *regs,
regs_set_return_msr(regs, regs->msr & ~MSR_SPE);
if (msr & MSR_SPE) {
/* restore spe registers from the stack */
-   unsafe_copy_from_user(current->thread.evr, >mc_vregs,
- ELF_NEVRREG * sizeof(u32), failed);
+   unsafe_copy_from_user(>thread.spe, >mc_vregs,
+ sizeof(current->thread.spe), failed);
current->thread.used_spe = true;
} else if (current->thread.used_spe)
-   memset(current->thread.evr, 0, ELF_NEVRREG * sizeof(u32));
+   memset(>thread.spe, 0, sizeof(current->thread.spe));
 
/* Always get SPEFSCR back */
unsafe_get_user(current->thread.spefscr, (u32 __user *)>mc_vregs + 
ELF_NEVRREG, failed);
-- 
2.30.2



[PATCH v2 61/63] powerpc: Split memset() to avoid multi-field overflow

2021-08-18 Thread Kees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Instead of writing across a field boundary with memset(), move the call
to just the array, and an explicit zeroing of the prior field.

Cc: Benjamin Herrenschmidt 
Cc: Qinglang Miao 
Cc: "Gustavo A. R. Silva" 
Cc: Hulk Robot 
Cc: Wang Wensheng 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Kees Cook 
Reviewed-by: Michael Ellerman 
Link: https://lore.kernel.org/lkml/87czqsnmw9@mpe.ellerman.id.au
---
 drivers/macintosh/smu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/macintosh/smu.c b/drivers/macintosh/smu.c
index 94fb63a7b357..59ce431da7ef 100644
--- a/drivers/macintosh/smu.c
+++ b/drivers/macintosh/smu.c
@@ -848,7 +848,8 @@ int smu_queue_i2c(struct smu_i2c_cmd *cmd)
cmd->read = cmd->info.devaddr & 0x01;
switch(cmd->info.type) {
case SMU_I2C_TRANSFER_SIMPLE:
-   memset(>info.sublen, 0, 4);
+   cmd->info.sublen = 0;
+   memset(>info.subaddr, 0, 3);
break;
case SMU_I2C_TRANSFER_COMBINED:
cmd->info.devaddr &= 0xfe;
-- 
2.30.2



[PATCH v2 36/63] scsi: ibmvscsi: Avoid multi-field memset() overflow by aiming at srp

2021-08-18 Thread Kees Cook
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Instead of writing beyond the end of evt_struct->iu.srp.cmd, target the
upper union (evt_struct->iu.srp) instead, as that's what is being wiped.

Cc: Tyrel Datwyler 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: "James E.J. Bottomley" 
Cc: "Martin K. Petersen" 
Cc: linux-s...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Kees Cook 
Acked-by: Martin K. Petersen 
Link: https://lore.kernel.org/lkml/yq135rzp79c@ca-mkp.ca.oracle.com
Acked-by: Tyrel Datwyler 
Link: 
https://lore.kernel.org/lkml/6eae8434-e9a7-aa74-628b-b515b3695...@linux.ibm.com
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 50df7dd9cb91..ea8e01f49cba 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -1055,8 +1055,9 @@ static int ibmvscsi_queuecommand_lck(struct scsi_cmnd 
*cmnd,
return SCSI_MLQUEUE_HOST_BUSY;
 
/* Set up the actual SRP IU */
+   BUILD_BUG_ON(sizeof(evt_struct->iu.srp) != SRP_MAX_IU_LEN);
+   memset(_struct->iu.srp, 0x00, sizeof(evt_struct->iu.srp));
srp_cmd = _struct->iu.srp.cmd;
-   memset(srp_cmd, 0x00, SRP_MAX_IU_LEN);
srp_cmd->opcode = SRP_CMD;
memcpy(srp_cmd->cdb, cmnd->cmnd, sizeof(srp_cmd->cdb));
int_to_scsilun(lun, _cmd->lun);
-- 
2.30.2



Re: [PATCH v5 2/8] lkdtm/powerpc: Add test to hijack a patch mapping

2021-08-11 Thread Kees Cook
On Wed, Aug 11, 2021 at 12:57:00PM -0500, Christopher M. Riedl wrote:
> On Thu Aug 5, 2021 at 4:13 AM CDT, Christophe Leroy wrote:
> >
> >
> > Le 13/07/2021 à 07:31, Christopher M. Riedl a écrit :
> > > When live patching with STRICT_KERNEL_RWX the CPU doing the patching
> > > must temporarily remap the page(s) containing the patch site with +W
> > > permissions. While this temporary mapping is in use, another CPU could
> > > write to the same mapping and maliciously alter kernel text. Implement a
> > > LKDTM test to attempt to exploit such an opening during code patching.
> > > The test is implemented on powerpc and requires LKDTM built into the
> > > kernel (building LKDTM as a module is insufficient).
> > > 
> > > The LKDTM "hijack" test works as follows:
> > > 
> > >1. A CPU executes an infinite loop to patch an instruction. This is
> > >   the "patching" CPU.
> > >2. Another CPU attempts to write to the address of the temporary
> > >   mapping used by the "patching" CPU. This other CPU is the
> > >   "hijacker" CPU. The hijack either fails with a fault/error or
> > >   succeeds, in which case some kernel text is now overwritten.
> > > [...]
> > > +#if (IS_BUILTIN(CONFIG_LKDTM) && defined(CONFIG_STRICT_KERNEL_RWX) && \
> > > + defined(CONFIG_PPC))
> >
> > I think this test shouldn't be limited to CONFIG_PPC and shouldn't be
> > limited to CONFIG_STRICT_KERNEL_RWX. It should be there all the time.

Agreed: if the machinery exists to provide this defense on even one
arch/config/whatever combo, I'd like LKDTM to test for it. This lets use
compare defenses across different combinations more easily, and means
folks must answer questions like "why doesn't $combination provide
$defense?"

> > Also why limiting it to IS_BUILTIN(CONFIG_LKDTM) ?
> 
> The test needs read_cpu_patching_addr() which definitely cannot be
> exposed outside of the kernel (ie. builtin).

FWIW, I'm okay with this. There isn't a solution that feels entirely
"right", so either a build-time requirement like this, or using an
exception for modules like this:

arch/x86/kernel/cpu/common.c:#if IS_MODULE(CONFIG_LKDTM)
arch/x86/kernel/cpu/common.c-EXPORT_SYMBOL_GPL(native_write_cr4);
arch/x86/kernel/cpu/common.c-#endif

I think neither is great. Another idea is maybe using a name-spaced
export, like:

EXPORT_SYMBOL_NS_GPL(native_write_cr4, LKDTM);

But that still means it gets exposed to malicious discovery, so probably
not.

I suspect the best is to just do the BUILTIN check, since building LKDTM
as a module on a _production_ kernel is rare if it exists at all. The
only downside is needing to completely reboot to perform updated tests,
but then, I frequently find myself breaking the kernel badly on bad
tests, so I have to reboot anyway. ;)

-Kees

-- 
Kees Cook


[PATCH] ibmvnic: Use strscpy() instead of strncpy()

2021-06-21 Thread Kees Cook
Since these strings are expected to be NUL-terminated and the buffers
are exactly sized (in vnic_client_data_len()) with no padding, strncpy()
can be safely replaced with strscpy() here, as strncpy() on
NUL-terminated string is considered deprecated[1]. This has the
side-effect of silencing a -Warray-bounds warning due to the compiler
being confused about the vlcd incrementing:

In file included from ./include/linux/string.h:253,
 from ./include/linux/bitmap.h:10,
 from ./include/linux/cpumask.h:12,
 from ./include/linux/mm_types_task.h:14,
 from ./include/linux/mm_types.h:5,
 from ./include/linux/buildid.h:5,
 from ./include/linux/module.h:14,
 from drivers/net/ethernet/ibm/ibmvnic.c:35:
In function '__fortify_strncpy',
inlined from 'vnic_add_client_data' at 
drivers/net/ethernet/ibm/ibmvnic.c:3919:2:
./include/linux/fortify-string.h:39:30: warning: '__builtin_strncpy' offset 12 
from the object at 'v
lcd' is out of the bounds of referenced subobject 'name' with type 'char[]' at 
offset 12 [-Warray-bo
unds]
   39 | #define __underlying_strncpy __builtin_strncpy
  |  ^
./include/linux/fortify-string.h:51:9: note: in expansion of macro 
'__underlying_strncpy'
   51 |  return __underlying_strncpy(p, q, size);
  | ^~~~
drivers/net/ethernet/ibm/ibmvnic.c: In function 'vnic_add_client_data':
drivers/net/ethernet/ibm/ibmvnic.c:3883:7: note: subobject 'name' declared here
 3883 |  char name[];
  |   ^~~~

[1] 
https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings

Cc: Dany Madden 
Cc: Sukadev Bhattiprolu 
Cc: Thomas Falcon 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: "David S. Miller" 
Cc: Jakub Kicinski 
Cc: net...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Kees Cook 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 2d8804ebdf96..adb0d5ca9ff1 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -3909,21 +3909,21 @@ static void vnic_add_client_data(struct ibmvnic_adapter 
*adapter,
vlcd->type = 1;
len = strlen(os_name) + 1;
vlcd->len = cpu_to_be16(len);
-   strncpy(vlcd->name, os_name, len);
+   strscpy(vlcd->name, os_name, len);
vlcd = (struct vnic_login_client_data *)(vlcd->name + len);
 
/* Type 2 - LPAR name */
vlcd->type = 2;
len = strlen(utsname()->nodename) + 1;
vlcd->len = cpu_to_be16(len);
-   strncpy(vlcd->name, utsname()->nodename, len);
+   strscpy(vlcd->name, utsname()->nodename, len);
vlcd = (struct vnic_login_client_data *)(vlcd->name + len);
 
/* Type 3 - device name */
vlcd->type = 3;
len = strlen(adapter->netdev->name) + 1;
vlcd->len = cpu_to_be16(len);
-   strncpy(vlcd->name, adapter->netdev->name, len);
+   strscpy(vlcd->name, adapter->netdev->name, len);
 }
 
 static int send_login(struct ibmvnic_adapter *adapter)
-- 
2.30.2



Re: [PATCH] crypto: nx: Fix memcpy() over-reading in nonce

2021-06-18 Thread Kees Cook
On Thu, Jun 17, 2021 at 04:08:15PM +1000, Michael Ellerman wrote:
> Kees Cook  writes:
> > Fix typo in memcpy() where size should be CTR_RFC3686_NONCE_SIZE.
> >
> > Fixes: 030f4e968741 ("crypto: nx - Fix reentrancy bugs")
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Kees Cook 
> 
> Thanks.
> 
> > ---
> >  drivers/crypto/nx/nx-aes-ctr.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/crypto/nx/nx-aes-ctr.c b/drivers/crypto/nx/nx-aes-ctr.c
> > index 13f518802343..6120e350ff71 100644
> > --- a/drivers/crypto/nx/nx-aes-ctr.c
> > +++ b/drivers/crypto/nx/nx-aes-ctr.c
> > @@ -118,7 +118,7 @@ static int ctr3686_aes_nx_crypt(struct skcipher_request 
> > *req)
> > struct nx_crypto_ctx *nx_ctx = crypto_skcipher_ctx(tfm);
> > u8 iv[16];
> >  
> > -   memcpy(iv, nx_ctx->priv.ctr.nonce, CTR_RFC3686_IV_SIZE);
> > +   memcpy(iv, nx_ctx->priv.ctr.nonce, CTR_RFC3686_NONCE_SIZE);
> > memcpy(iv + CTR_RFC3686_NONCE_SIZE, req->iv, CTR_RFC3686_IV_SIZE);
> > iv[12] = iv[13] = iv[14] = 0;
> > iv[15] = 1;
> 
> Where IV_SIZE is 8 and NONCE_SIZE is 4.
> 
> And iv is 16 bytes, so it's not a buffer overflow.
> 
> But priv.ctr.nonce is 4 bytes, and at the end of the struct, so it reads
> 4 bytes past the end of the nx_crypto_ctx, which is not good.
> 
> But then immediately overwrites whatever it read with req->iv.
> 
> So seems pretty harmless in practice?

Right -- there's no damage done, but future memcpy() FORTIFY work alerts
on this, so I'm going through cleaning all of these up. :)

-- 
Kees Cook


[PATCH] crypto: nx: Fix memcpy() over-reading in nonce

2021-06-16 Thread Kees Cook
Fix typo in memcpy() where size should be CTR_RFC3686_NONCE_SIZE.

Fixes: 030f4e968741 ("crypto: nx - Fix reentrancy bugs")
Cc: sta...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/crypto/nx/nx-aes-ctr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/nx/nx-aes-ctr.c b/drivers/crypto/nx/nx-aes-ctr.c
index 13f518802343..6120e350ff71 100644
--- a/drivers/crypto/nx/nx-aes-ctr.c
+++ b/drivers/crypto/nx/nx-aes-ctr.c
@@ -118,7 +118,7 @@ static int ctr3686_aes_nx_crypt(struct skcipher_request 
*req)
struct nx_crypto_ctx *nx_ctx = crypto_skcipher_ctx(tfm);
u8 iv[16];
 
-   memcpy(iv, nx_ctx->priv.ctr.nonce, CTR_RFC3686_IV_SIZE);
+   memcpy(iv, nx_ctx->priv.ctr.nonce, CTR_RFC3686_NONCE_SIZE);
memcpy(iv + CTR_RFC3686_NONCE_SIZE, req->iv, CTR_RFC3686_IV_SIZE);
iv[12] = iv[13] = iv[14] = 0;
iv[15] = 1;
-- 
2.25.1



Re: [PATCH v1 1/1] kernel.h: Split out panic and oops helpers

2021-04-06 Thread Kees Cook
On Tue, Apr 06, 2021 at 04:31:58PM +0300, Andy Shevchenko wrote:
> kernel.h is being used as a dump for all kinds of stuff for a long time.
> Here is the attempt to start cleaning it up by splitting out panic and
> oops helpers.
> 
> At the same time convert users in header and lib folder to use new header.
> Though for time being include new header back to kernel.h to avoid twisted
> indirected includes for existing users.
> 
> Signed-off-by: Andy Shevchenko 

I like it! Do you have a multi-arch CI to do allmodconfig builds to
double-check this?

Acked-by: Kees Cook 

-Kees

-- 
Kees Cook


Re: [PATCH next v3 12/15] printk: introduce a kmsg_dump iterator

2021-02-25 Thread Kees Cook
On Thu, Feb 25, 2021 at 09:24:35PM +0100, John Ogness wrote:
> Rather than storing the iterator information in the registered
> kmsg_dumper structure, create a separate iterator structure. The
> kmsg_dump_iter structure can reside on the stack of the caller, thus
> allowing lockless use of the kmsg_dump functions.
> 
> This change also means that the kmsg_dumper dump() callback no
> longer needs to pass in the kmsg_dumper as an argument. If
> kmsg_dumpers want to access the kernel logs, they can use the new
> iterator.
> 
> Update the kmsg_dumper callback prototype. Update code that accesses
> the kernel logs using the kmsg_dumper structure to use the new
> kmsg_dump_iter structure. For kmsg_dumpers, this also means adding a
> call to kmsg_dump_rewind() to initialize the iterator.
> 
> All this is in preparation for removal of @logbuf_lock.
> 
> Signed-off-by: John Ogness 
> ---
>  arch/powerpc/kernel/nvram_64.c | 14 +++---
>  arch/powerpc/platforms/powernv/opal-kmsg.c |  3 +-
>  arch/powerpc/xmon/xmon.c   |  6 +--
>  arch/um/kernel/kmsg_dump.c |  8 +--
>  drivers/hv/vmbus_drv.c |  7 +--
>  drivers/mtd/mtdoops.c  |  8 +--
>  fs/pstore/platform.c   |  8 +--

Reviewed-by: Kees Cook  # pstore

-Kees

>  include/linux/kmsg_dump.h  | 38 ---
>  kernel/debug/kdb/kdb_main.c| 10 ++--
>  kernel/printk/printk.c | 57 ++
>  10 files changed, 81 insertions(+), 78 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
> index 532f22637783..5a64b24a91c2 100644
> --- a/arch/powerpc/kernel/nvram_64.c
> +++ b/arch/powerpc/kernel/nvram_64.c
> @@ -72,8 +72,7 @@ static const char *nvram_os_partitions[] = {
>   NULL
>  };
>  
> -static void oops_to_nvram(struct kmsg_dumper *dumper,
> -   enum kmsg_dump_reason reason);
> +static void oops_to_nvram(enum kmsg_dump_reason reason);
>  
>  static struct kmsg_dumper nvram_kmsg_dumper = {
>   .dump = oops_to_nvram
> @@ -642,11 +641,11 @@ void __init nvram_init_oops_partition(int 
> rtas_partition_exists)
>   * that we think will compress sufficiently to fit in the lnx,oops-log
>   * partition.  If that's too much, go back and capture uncompressed text.
>   */
> -static void oops_to_nvram(struct kmsg_dumper *dumper,
> -   enum kmsg_dump_reason reason)
> +static void oops_to_nvram(enum kmsg_dump_reason reason)
>  {
>   struct oops_log_info *oops_hdr = (struct oops_log_info *)oops_buf;
>   static unsigned int oops_count = 0;
> + static struct kmsg_dump_iter iter;
>   static bool panicking = false;
>   static DEFINE_SPINLOCK(lock);
>   unsigned long flags;
> @@ -681,13 +680,14 @@ static void oops_to_nvram(struct kmsg_dumper *dumper,
>   return;
>  
>   if (big_oops_buf) {
> - kmsg_dump_get_buffer(dumper, false,
> + kmsg_dump_rewind();
> + kmsg_dump_get_buffer(, false,
>big_oops_buf, big_oops_buf_sz, _len);
>   rc = zip_oops(text_len);
>   }
>   if (rc != 0) {
> - kmsg_dump_rewind(dumper);
> - kmsg_dump_get_buffer(dumper, false,
> + kmsg_dump_rewind();
> + kmsg_dump_get_buffer(, false,
>oops_data, oops_data_sz, _len);
>   err_type = ERR_TYPE_KERNEL_PANIC;
>   oops_hdr->version = cpu_to_be16(OOPS_HDR_VERSION);
> diff --git a/arch/powerpc/platforms/powernv/opal-kmsg.c 
> b/arch/powerpc/platforms/powernv/opal-kmsg.c
> index 6c3bc4b4da98..a7bd6ac681f4 100644
> --- a/arch/powerpc/platforms/powernv/opal-kmsg.c
> +++ b/arch/powerpc/platforms/powernv/opal-kmsg.c
> @@ -19,8 +19,7 @@
>   * may not be completely printed.  This function does not actually dump the
>   * message, it just ensures that OPAL completely flushes the console buffer.
>   */
> -static void kmsg_dump_opal_console_flush(struct kmsg_dumper *dumper,
> -  enum kmsg_dump_reason reason)
> +static void kmsg_dump_opal_console_flush(enum kmsg_dump_reason reason)
>  {
>   /*
>* Outside of a panic context the pollers will continue to run,
> diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> index 80ed3e1becf9..5978b90a885f 100644
> --- a/arch/powerpc/xmon/xmon.c
> +++ b/arch/powerpc/xmon/xmon.c
> @@ -3001,7 +3001,7 @@ print_address(unsigned long addr)
>  static void
>  dump_log_buf(void)
>  {
> - struct kmsg_dumper dumper;
> + struct kmsg_dump_iter iter;
>

Re: linux-next: build warning after merge of the akpm tree

2020-12-09 Thread Kees Cook
On Tue, Dec 08, 2020 at 11:01:57PM +1100, Stephen Rothwell wrote:
> Hi Stephen,
> 
> On Fri, 4 Dec 2020 21:00:00 +1100 Stephen Rothwell  
> wrote:
> >
> > Hi all,
> > 
> > After merging the akpm tree, today's linux-next build (powerpc
> > allyesconfig) produced warnings like this:
> > 
> > ld: warning: orphan section `.data..Lubsan_data177' from 
> > `arch/powerpc/oprofile/op_model_pa6t.o' being placed in section 
> > `.data..Lubsan_data177'
> > 
> > (lots of these latter ones)
> 
> 781584 of them today!
> 
> > I don't know what produced these, but it is in the akpm-current or
> > akpm trees.
> 
> Presumably the result of commit
> 
>   186c3e18dba3 ("ubsan: enable for all*config builds")
> 
> from the akpm-current tree.
> 
> arch/powerpc/kernel/vmlinux.lds.S has:
> 
> #ifdef CONFIG_PPC32
> .data : AT(ADDR(.data) - LOAD_OFFSET) {
> DATA_DATA
> #ifdef CONFIG_UBSAN
> *(.data..Lubsan_data*)
> *(.data..Lubsan_type*)
> #endif
> *(.data.rel*)
> *(SDATA_MAIN)
> 
> added by commit
> 
>   beba24ac5913 ("powerpc/32: Add .data..Lubsan_data*/.data..Lubsan_type* 
> sections explicitly")
> 
> in 2018, but no equivalent for 64 bit.
> 
> I will try the following patch tomorrow:
> 
> From: Stephen Rothwell 
> Date: Tue, 8 Dec 2020 22:58:24 +1100
> Subject: [PATCH] powerpc: Add .data..Lubsan_data*/.data..Lubsan_type* 
> sections explicitly
> 
> Similarly to commit
> 
>   beba24ac5913 ("powerpc/32: Add .data..Lubsan_data*/.data..Lubsan_type* 
> sections explicitly")
> 
> since CONFIG_UBSAN bits can now be enabled for all*config.
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  arch/powerpc/kernel/vmlinux.lds.S | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/vmlinux.lds.S 
> b/arch/powerpc/kernel/vmlinux.lds.S
> index 3b4c26e94328..0318ba436f34 100644
> --- a/arch/powerpc/kernel/vmlinux.lds.S
> +++ b/arch/powerpc/kernel/vmlinux.lds.S
> @@ -296,6 +296,10 @@ SECTIONS
>  #else
>   .data : AT(ADDR(.data) - LOAD_OFFSET) {
>   DATA_DATA
> +#ifdef CONFIG_UBSAN
> + *(.data..Lubsan_data*)
> + *(.data..Lubsan_type*)
> +#endif
>   *(.data.rel*)
>   *(.toc1)
>   *(.branch_lt)
> -- 
> 2.29.2
> 
> -- 
> Cheers,
> Stephen Rothwell

Reviewed-by: Kees Cook 

Thanks for figuring this one out. :) Andrew, can you add this to your
ubsan patch stack, or do you want me to resend it to you directly?


-- 
Kees Cook


Re: [PATCH v2 2/2] kbuild: Disable CONFIG_LD_ORPHAN_WARN for ld.lld 10.0.1

2020-12-02 Thread Kees Cook
On Wed, Dec 02, 2020 at 11:37:38AM +0900, Masahiro Yamada wrote:
> On Wed, Dec 2, 2020 at 5:56 AM Kees Cook  wrote:
> >
> > On Tue, Dec 01, 2020 at 10:31:37PM +0900, Masahiro Yamada wrote:
> > > On Wed, Nov 25, 2020 at 7:22 AM Kees Cook  wrote:
> > > >
> > > > On Thu, Nov 19, 2020 at 01:13:27PM -0800, Nick Desaulniers wrote:
> > > > > On Thu, Nov 19, 2020 at 12:57 PM Nathan Chancellor
> > > > >  wrote:
> > > > > >
> > > > > > ld.lld 10.0.1 spews a bunch of various warnings about .rela 
> > > > > > sections,
> > > > > > along with a few others. Newer versions of ld.lld do not have these
> > > > > > warnings. As a result, do not add '--orphan-handling=warn' to
> > > > > > LDFLAGS_vmlinux if ld.lld's version is not new enough.
> > > > > >
> > > > > > Link: https://github.com/ClangBuiltLinux/linux/issues/1187
> > > > > > Link: https://github.com/ClangBuiltLinux/linux/issues/1193
> > > > > > Reported-by: Arvind Sankar 
> > > > > > Reported-by: kernelci.org bot 
> > > > > > Reported-by: Mark Brown 
> > > > > > Reviewed-by: Kees Cook 
> > > > > > Signed-off-by: Nathan Chancellor 
> > > > >
> > > > > Thanks for the additions in v2.
> > > > > Reviewed-by: Nick Desaulniers 
> > > >
> > > > I'm going to carry this for a few days in -next, and if no one screams,
> > > > ask Linus to pull it for v5.10-rc6.
> > > >
> > > > Thanks!
> > > >
> > > > --
> > > > Kees Cook
> > >
> > >
> > > Sorry for the delay.
> > > Applied to linux-kbuild.
> >
> > Great, thanks!
> >
> > > But, I already see this in linux-next.
> > > Please let me know if I should drop it from my tree.
> >
> > My intention was to get this to Linus this week. Do you want to do that
> > yourself, or Ack the patches in my tree and I'll send it?
> 
> I will send a kbuild pull request myself this week.

Okay, thanks! I've removed it from my -next tree now.

-- 
Kees Cook


Re: [PATCH v2 2/2] kbuild: Disable CONFIG_LD_ORPHAN_WARN for ld.lld 10.0.1

2020-12-01 Thread Kees Cook
On Tue, Dec 01, 2020 at 10:31:37PM +0900, Masahiro Yamada wrote:
> On Wed, Nov 25, 2020 at 7:22 AM Kees Cook  wrote:
> >
> > On Thu, Nov 19, 2020 at 01:13:27PM -0800, Nick Desaulniers wrote:
> > > On Thu, Nov 19, 2020 at 12:57 PM Nathan Chancellor
> > >  wrote:
> > > >
> > > > ld.lld 10.0.1 spews a bunch of various warnings about .rela sections,
> > > > along with a few others. Newer versions of ld.lld do not have these
> > > > warnings. As a result, do not add '--orphan-handling=warn' to
> > > > LDFLAGS_vmlinux if ld.lld's version is not new enough.
> > > >
> > > > Link: https://github.com/ClangBuiltLinux/linux/issues/1187
> > > > Link: https://github.com/ClangBuiltLinux/linux/issues/1193
> > > > Reported-by: Arvind Sankar 
> > > > Reported-by: kernelci.org bot 
> > > > Reported-by: Mark Brown 
> > > > Reviewed-by: Kees Cook 
> > > > Signed-off-by: Nathan Chancellor 
> > >
> > > Thanks for the additions in v2.
> > > Reviewed-by: Nick Desaulniers 
> >
> > I'm going to carry this for a few days in -next, and if no one screams,
> > ask Linus to pull it for v5.10-rc6.
> >
> > Thanks!
> >
> > --
> > Kees Cook
> 
> 
> Sorry for the delay.
> Applied to linux-kbuild.

Great, thanks!

> But, I already see this in linux-next.
> Please let me know if I should drop it from my tree.

My intention was to get this to Linus this week. Do you want to do that
yourself, or Ack the patches in my tree and I'll send it?

-Kees

-- 
Kees Cook


Re: [PATCH v2 2/2] kbuild: Disable CONFIG_LD_ORPHAN_WARN for ld.lld 10.0.1

2020-11-24 Thread Kees Cook
On Thu, Nov 19, 2020 at 01:13:27PM -0800, Nick Desaulniers wrote:
> On Thu, Nov 19, 2020 at 12:57 PM Nathan Chancellor
>  wrote:
> >
> > ld.lld 10.0.1 spews a bunch of various warnings about .rela sections,
> > along with a few others. Newer versions of ld.lld do not have these
> > warnings. As a result, do not add '--orphan-handling=warn' to
> > LDFLAGS_vmlinux if ld.lld's version is not new enough.
> >
> > Link: https://github.com/ClangBuiltLinux/linux/issues/1187
> > Link: https://github.com/ClangBuiltLinux/linux/issues/1193
> > Reported-by: Arvind Sankar 
> > Reported-by: kernelci.org bot 
> > Reported-by: Mark Brown 
> > Reviewed-by: Kees Cook 
> > Signed-off-by: Nathan Chancellor 
> 
> Thanks for the additions in v2.
> Reviewed-by: Nick Desaulniers 

I'm going to carry this for a few days in -next, and if no one screams,
ask Linus to pull it for v5.10-rc6.

Thanks!

-- 
Kees Cook


Re: [PATCH v2 1/2] kbuild: Hoist '--orphan-handling' into Kconfig

2020-11-20 Thread Kees Cook
On Thu, Nov 19, 2020 at 01:46:56PM -0700, Nathan Chancellor wrote:
> Currently, '--orphan-handling=warn' is spread out across four different
> architectures in their respective Makefiles, which makes it a little
> unruly to deal with in case it needs to be disabled for a specific
> linker version (in this case, ld.lld 10.0.1).
> 
> To make it easier to control this, hoist this warning into Kconfig and
> the main Makefile so that disabling it is simpler, as the warning will
> only be enabled in a couple places (main Makefile and a couple of
> compressed boot folders that blow away LDFLAGS_vmlinx) and making it
> conditional is easier due to Kconfig syntax. One small additional
> benefit of this is saving a call to ld-option on incremental builds
> because we will have already evaluated it for CONFIG_LD_ORPHAN_WARN.
> 
> To keep the list of supported architectures the same, introduce
> CONFIG_ARCH_WANT_LD_ORPHAN_WARN, which an architecture can select to
> gain this automatically after all of the sections are specified and size
> asserted. A special thanks to Kees Cook for the help text on this
> config.
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/1187
> Acked-by: Kees Cook 
> Acked-by: Michael Ellerman  (powerpc)
> Reviewed-by: Nick Desaulniers 
> Tested-by: Nick Desaulniers 
> Signed-off-by: Nathan Chancellor 

Masahiro, do you want to take these to get them to Linus for v5.10? I
can send them if you'd prefer.

-Kees

-- 
Kees Cook


Re: [PATCH 1/2] kbuild: Hoist '--orphan-handling' into Kconfig

2020-11-17 Thread Kees Cook
On Fri, Nov 13, 2020 at 12:55:52PM -0700, Nathan Chancellor wrote:
> Currently, '--orphan-handling=warn' is spread out across four different
> architectures in their respective Makefiles, which makes it a little
> unruly to deal with in case it needs to be disabled for a specific
> linker version (in this case, ld.lld 10.0.1).
> 
> To make it easier to control this, hoist this warning into Kconfig and
> the main Makefile so that disabling it is simpler, as the warning will
> only be enabled in a couple places (main Makefile and a couple of
> compressed boot folders that blow away LDFLAGS_vmlinx) and making it
> conditional is easier due to Kconfig syntax. One small additional
> benefit of this is saving a call to ld-option on incremental builds
> because we will have already evaluated it for CONFIG_LD_ORPHAN_WARN.
> 
> To keep the list of supported architectures the same, introduce
> CONFIG_ARCH_WANT_LD_ORPHAN_WARN, which an architecture can select to
> gain this automatically after all of the sections are specified and size
> asserted. A special thanks to Kees Cook for the help text on this
> config.
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/1187
> Signed-off-by: Nathan Chancellor 

Looks good to me. With the other suggestions from the thread added,
please consider it:

Acked-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH 2/2] kbuild: Disable CONFIG_LD_ORPHAN_WARN for ld.lld 10.0.1

2020-11-17 Thread Kees Cook
On Fri, Nov 13, 2020 at 12:55:53PM -0700, Nathan Chancellor wrote:
>  config LD_ORPHAN_WARN
> - def_bool ARCH_WANT_LD_ORPHAN_WARN && $(ld-option,--orphan-handling=warn)
> + def_bool ARCH_WANT_LD_ORPHAN_WARN && 
> $(ld-option,--orphan-handling=warn) && (!LD_IS_LLD || LLD_VERSION >= 11)

Readability nit-pick... I prefer separate "depends" lines to make things
a little easier to parse, change, etc:

config LD_ORPHAN_WARN
def_bool y
depends on ARCH_WANT_LD_ORPHAN_WARN
depends on !LD_IS_LLD || LLD_VERSION >= 11
depends on $(ld-option,--orphan-handling=warn)

Otherwise, yeah, looks good to me. With this and the other suggestions,
please consider it:

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH seccomp v2 0/8] seccomp: add bitmap cache support on remaining arches and report cache in procfs

2020-11-17 Thread Kees Cook
On Wed, 11 Nov 2020 07:33:46 -0600, YiFei Zhu wrote:
> This patch series enables bitmap cache for the remaining arches with
> SECCOMP_FILTER, other than MIPS.
> 
> I was unable to find any of the arches having subarch-specific NR_syscalls
> macros, so generic NR_syscalls is used. SH's syscall_get_arch seems to
> only have the 32-bit subarch implementation. I'm not sure if this is
> expected.
> 
> [...]

Applied to for-next/seccomp, thanks!

I made a small tweak to the last patch to add more details to the per-ARCH
help text, and to drop the needless "depends on SECCOMP" (which "depends
on SECCOMP_FILTER" was already present).

I successfully build-tested on parisc, powerpc, riscv, s390, and
sh. xtensa doesn't build using the existing Debian cross-compiler, and
I can't make csky with clang work, but they look correct. *cross fingers*

[1/8] csky: Enable seccomp architecture tracking
  https://git.kernel.org/kees/c/ee7ce951028f
[2/8] parisc: Enable seccomp architecture tracking
  https://git.kernel.org/kees/c/7f049cc068a3
[3/8] powerpc: Enable seccomp architecture tracking
  https://git.kernel.org/kees/c/95f8ae2624a0
[4/8] riscv: Enable seccomp architecture tracking
  https://git.kernel.org/kees/c/8f9f0f44a37b
[5/8] s390: Enable seccomp architecture tracking
  https://git.kernel.org/kees/c/5897106c6902
[6/8] sh: Enable seccomp architecture tracking
  https://git.kernel.org/kees/c/75186111c257
[7/8] xtensa: Enable seccomp architecture tracking
  https://git.kernel.org/kees/c/4f408bc643aa
[8/8] seccomp/cache: Report cache data through /proc/pid/seccomp_cache
  https://git.kernel.org/kees/c/49a6968cc78f

-- 
Kees Cook



Re: [PATCH seccomp 0/8] seccomp: add bitmap cache support on remaining arches and report cache in procfs

2020-11-03 Thread Kees Cook
On Tue, Nov 03, 2020 at 07:42:56AM -0600, YiFei Zhu wrote:
> From: YiFei Zhu 
> 
> This patch series enables bitmap cache for the remaining arches with
> SECCOMP_FILTER, other than MIPS.
> 
> I was unable to find any of the arches having subarch-specific NR_syscalls
> macros, so generic NR_syscalls is used. SH's syscall_get_arch seems to
> only have the 32-bit subarch implementation. I'm not sure if this is
> expected.
> 
> This series has not been tested; I have not built all the cross compilers
> necessary to build test, let alone run the kernel or benchmark the
> performance, so help on making sure the bitmap cache works as expected
> would be appreciated. The series applies on top of Kees's for-next/seccomp
> branch.

Thank you! This looks good. I wonder if the different handling of little
endian is worth solving -- I'm suspicious about powerpc's use of
__LITTLE_ENDIAN__ vs a CONFIG, but I guess the compiler would match the
target endian-ness. Regardless, it captures what the architectures are
doing, and gets things standardized.

> 
> YiFei Zhu (8):
>   csky: Enable seccomp architecture tracking
>   parisc: Enable seccomp architecture tracking

I don't have compilers for these.

>   powerpc: Enable seccomp architecture tracking
>   riscv: Enable seccomp architecture tracking
>   s390: Enable seccomp architecture tracking

These I can build-test immediately.

>   sh: Enable seccomp architecture tracking
>   xtensa: Enable seccomp architecture tracking

These two are available in Ubuntu's cross compiler set, so I'll get them
added to my cross-builders.

>   seccomp/cache: Report cache data through /proc/pid/seccomp_cache

In the meantime, I'll wait a bit to see if we can get some Acks/Reviews
from arch maintainers. :)

-Kees

> 
>  arch/Kconfig   | 15 
>  arch/csky/include/asm/Kbuild   |  1 -
>  arch/csky/include/asm/seccomp.h| 11 ++
>  arch/parisc/include/asm/Kbuild |  1 -
>  arch/parisc/include/asm/seccomp.h  | 22 +++
>  arch/powerpc/include/asm/seccomp.h | 21 +++
>  arch/riscv/include/asm/seccomp.h   | 10 +
>  arch/s390/include/asm/seccomp.h|  9 +
>  arch/sh/include/asm/seccomp.h  | 10 +
>  arch/xtensa/include/asm/Kbuild |  1 -
>  arch/xtensa/include/asm/seccomp.h  | 11 ++
>  fs/proc/base.c |  6 +++
>  include/linux/seccomp.h|  7 
>  kernel/seccomp.c   | 59 ++
>  14 files changed, 181 insertions(+), 3 deletions(-)
>  create mode 100644 arch/csky/include/asm/seccomp.h
>  create mode 100644 arch/parisc/include/asm/seccomp.h
>  create mode 100644 arch/xtensa/include/asm/seccomp.h
> 
> 
> base-commit: 38c37e8fd3d2590c4234d8cfbc22158362f0eb04
> --
> 2.29.2

-- 
Kees Cook


Re: [PATCH v2 3/3] selftests/lkdtm: Enable selftest for SLB multihit

2020-09-25 Thread Kees Cook
On Fri, Sep 25, 2020 at 04:01:23PM +0530, Ganesh Goudar wrote:
> Add PPC_SLB_MULTIHIT to lkdtm selftest framework.
> 
> Signed-off-by: Ganesh Goudar 
> ---
>  tools/testing/selftests/lkdtm/tests.txt | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tools/testing/selftests/lkdtm/tests.txt 
> b/tools/testing/selftests/lkdtm/tests.txt
> index 9d266e79c6a2..7eb3cf91c89e 100644
> --- a/tools/testing/selftests/lkdtm/tests.txt
> +++ b/tools/testing/selftests/lkdtm/tests.txt
> @@ -70,3 +70,4 @@ USERCOPY_KERNEL
>  USERCOPY_KERNEL_DS
>  STACKLEAK_ERASING OK: the rest of the thread stack is properly erased
>  CFI_FORWARD_PROTO
> +PPC_SLB_MULTIHIT Recovered

Please squash this into the lkdtm patch -- I'd like test implementation
and kselftest awareness to go in together.

-- 
Kees Cook


Re: [PATCH v2 2/3] lkdtm/powerpc: Add SLB multihit test

2020-09-25 Thread Kees Cook
_ to happen).

> +
> +static void inject_vmalloc_slb_multihit(void)
> +{
> + char *p;
> +
> + p = vmalloc(2048);
> + if (!p)
> + return;
> +
> + insert_slb_entry(p, MMU_SEGSIZE_1T);
> + vfree(p);
> +}
> +
> +static void inject_kmalloc_slb_multihit(void)
> +{
> + char *p;
> +
> + p = kmalloc(2048, GFP_KERNEL);
> + if (!p)
> + return;
> +
> + insert_slb_entry(p, MMU_SEGSIZE_1T);
> + kfree(p);
> +}

It looks like the expected failure injection is actually the "p[0] = '!'" line 
in the
"insert" helper? I would expect pr_info/pr_err wrappers, etc, as in
other lkdtm tests.

If this is the negative test, what does the positive test look like?
e.g. in other lkdtm tests, first a positive test is done, then a
negative: first run a legit function, then run a function from a bad
location.

> +
> +static void insert_dup_slb_entry_0(void)
> +{
> + unsigned long test_address = 0xC000;
> + volatile unsigned long *test_ptr;
> + unsigned long entry, i = 0;
> + unsigned long esid, vsid;
> +
> + test_ptr = (unsigned long *)test_address;
> + preempt_disable();
> +
> + asm volatile("slbmfee  %0,%1" : "=r" (esid) : "r" (i));
> + asm volatile("slbmfev  %0,%1" : "=r" (vsid) : "r" (i));
> + entry = get_slb_index();
> +
> + /* for i !=0 we would need to mask out the old entry number */
> + asm volatile("slbmte %0,%1" :
> + : "r" (vsid),
> +   "r" (esid | entry)
> + : "memory");
> +
> + asm volatile("slbmfee  %0,%1" : "=r" (esid) : "r" (i));
> + asm volatile("slbmfev  %0,%1" : "=r" (vsid) : "r" (i));
> + entry = get_slb_index();
> +
> + /* for i !=0 we would need to mask out the old entry number */
> + asm volatile("slbmte %0,%1" :
> + : "r" (vsid),
> +   "r" (esid | entry)
> + : "memory");
> +
> + pr_info("lkdtm: %s accessing test address 0x%lx: 0x%lx\n",
> + __func__, test_address, *test_ptr);
> +
> + preempt_enable();
> +}

What does this do?

> +
> +void lkdtm_PPC_SLB_MULTIHIT(void)
> +{
> + if (mmu_has_feature(MMU_FTR_HPTE_TABLE)) {
> + inject_vmalloc_slb_multihit();
> + inject_kmalloc_slb_multihit();
> + insert_dup_slb_entry_0();
> + }
> + pr_info("Recovered from SLB multihit. (Ignore this message on non HPTE 
> machines)");

Is this bad? If so, I'd expect pr_err("FAIL: ...") Can HPTE machines be
detected so that an XFAIL can be emitted instead?

Since there are three (two?) distinct regions being tested, should these
be separate tests? Right now there is no way to separate a vmalloc
failure from a kmalloc failure, and no way to fail the first without
hiding the result from the latter (or maybe the machine cannot survive
this test? ... which should also be a comment.)

And finally, assuming a successful test (or testing from a separate
thread later), so there any state that needs to be restored (or cleaned
up before doing the "insert" calls)?

Thanks!

-- 
Kees Cook


[PATCH v2 1/4] selftests/seccomp: Record syscall during ptrace entry

2020-09-19 Thread Kees Cook
In preparation for performing actions during ptrace syscall exit, save
the syscall number during ptrace syscall entry. Some architectures do
no have the syscall number available during ptrace syscall exit.

Suggested-by: Thadeu Lima de Souza Cascardo 
Link: 
https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-casca...@canonical.com/
Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 40 +--
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index bc0fb463c709..c0311b4c736b 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1949,12 +1949,19 @@ void tracer_seccomp(struct __test_metadata *_metadata, 
pid_t tracee,
 
 }
 
+FIXTURE(TRACE_syscall) {
+   struct sock_fprog prog;
+   pid_t tracer, mytid, mypid, parent;
+   long syscall_nr;
+};
+
 void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
   int status, void *args)
 {
-   int ret, nr;
+   int ret;
unsigned long msg;
static bool entry;
+   FIXTURE_DATA(TRACE_syscall) *self = args;
 
/*
 * The traditional way to tell PTRACE_SYSCALL entry/exit
@@ -1968,24 +1975,31 @@ void tracer_ptrace(struct __test_metadata *_metadata, 
pid_t tracee,
EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
: PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
 
-   if (!entry)
+   /*
+* Some architectures only support setting return values during
+* syscall exit under ptrace, and on exit the syscall number may
+* no longer be available. Therefore, save the initial sycall
+* number here, so it can be examined during both entry and exit
+* phases.
+*/
+   if (entry)
+   self->syscall_nr = get_syscall(_metadata, tracee);
+   else
return;
 
-   nr = get_syscall(_metadata, tracee);
-
-   if (nr == __NR_getpid)
+   switch (self->syscall_nr) {
+   case __NR_getpid:
change_syscall(_metadata, tracee, __NR_getppid, 0);
-   if (nr == __NR_gettid)
+   break;
+   case __NR_gettid:
change_syscall(_metadata, tracee, -1, 45000);
-   if (nr == __NR_openat)
+   break;
+   case __NR_openat:
change_syscall(_metadata, tracee, -1, -ESRCH);
+   break;
+   }
 }
 
-FIXTURE(TRACE_syscall) {
-   struct sock_fprog prog;
-   pid_t tracer, mytid, mypid, parent;
-};
-
 FIXTURE_VARIANT(TRACE_syscall) {
/*
 * All of the SECCOMP_RET_TRACE behaviors can be tested with either
@@ -2044,7 +2058,7 @@ FIXTURE_SETUP(TRACE_syscall)
self->tracer = setup_trace_fixture(_metadata,
   variant->use_ptrace ? tracer_ptrace
   : tracer_seccomp,
-  NULL, variant->use_ptrace);
+  self, variant->use_ptrace);
 
ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
ASSERT_EQ(0, ret);
-- 
2.25.1



[PATCH v2 4/4] selftests/clone3: Avoid OS-defined clone_args

2020-09-19 Thread Kees Cook
As the UAPI headers start to appear in distros, we need to avoid outdated
versions of struct clone_args to be able to test modern features;
rename to "struct __clone_args". Additionally update the struct size
macro names to match UAPI names.

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/clone3/clone3.c   | 45 ---
 .../clone3/clone3_cap_checkpoint_restore.c|  4 +-
 .../selftests/clone3/clone3_clear_sighand.c   |  2 +-
 .../selftests/clone3/clone3_selftests.h   | 24 +-
 .../testing/selftests/clone3/clone3_set_tid.c |  4 +-
 tools/testing/selftests/seccomp/seccomp_bpf.c |  4 +-
 6 files changed, 40 insertions(+), 43 deletions(-)

diff --git a/tools/testing/selftests/clone3/clone3.c 
b/tools/testing/selftests/clone3/clone3.c
index b7e6dec36173..42be3b925830 100644
--- a/tools/testing/selftests/clone3/clone3.c
+++ b/tools/testing/selftests/clone3/clone3.c
@@ -20,13 +20,6 @@
 #include "../kselftest.h"
 #include "clone3_selftests.h"
 
-/*
- * Different sizes of struct clone_args
- */
-#ifndef CLONE3_ARGS_SIZE_V0
-#define CLONE3_ARGS_SIZE_V0 64
-#endif
-
 enum test_mode {
CLONE3_ARGS_NO_TEST,
CLONE3_ARGS_ALL_0,
@@ -38,13 +31,13 @@ enum test_mode {
 
 static int call_clone3(uint64_t flags, size_t size, enum test_mode test_mode)
 {
-   struct clone_args args = {
+   struct __clone_args args = {
.flags = flags,
.exit_signal = SIGCHLD,
};
 
struct clone_args_extended {
-   struct clone_args args;
+   struct __clone_args args;
__aligned_u64 excess_space[2];
} args_ext;
 
@@ -52,11 +45,11 @@ static int call_clone3(uint64_t flags, size_t size, enum 
test_mode test_mode)
int status;
 
memset(_ext, 0, sizeof(args_ext));
-   if (size > sizeof(struct clone_args))
+   if (size > sizeof(struct __clone_args))
args_ext.excess_space[1] = 1;
 
if (size == 0)
-   size = sizeof(struct clone_args);
+   size = sizeof(struct __clone_args);
 
switch (test_mode) {
case CLONE3_ARGS_ALL_0:
@@ -77,9 +70,9 @@ static int call_clone3(uint64_t flags, size_t size, enum 
test_mode test_mode)
break;
}
 
-   memcpy(_ext.args, , sizeof(struct clone_args));
+   memcpy(_ext.args, , sizeof(struct __clone_args));
 
-   pid = sys_clone3((struct clone_args *)_ext, size);
+   pid = sys_clone3((struct __clone_args *)_ext, size);
if (pid < 0) {
ksft_print_msg("%s - Failed to create new process\n",
strerror(errno));
@@ -144,14 +137,14 @@ int main(int argc, char *argv[])
else
ksft_test_result_skip("Skipping clone3() with CLONE_NEWPID\n");
 
-   /* Do a clone3() with CLONE3_ARGS_SIZE_V0. */
-   test_clone3(0, CLONE3_ARGS_SIZE_V0, 0, CLONE3_ARGS_NO_TEST);
+   /* Do a clone3() with CLONE_ARGS_SIZE_VER0. */
+   test_clone3(0, CLONE_ARGS_SIZE_VER0, 0, CLONE3_ARGS_NO_TEST);
 
-   /* Do a clone3() with CLONE3_ARGS_SIZE_V0 - 8 */
-   test_clone3(0, CLONE3_ARGS_SIZE_V0 - 8, -EINVAL, CLONE3_ARGS_NO_TEST);
+   /* Do a clone3() with CLONE_ARGS_SIZE_VER0 - 8 */
+   test_clone3(0, CLONE_ARGS_SIZE_VER0 - 8, -EINVAL, CLONE3_ARGS_NO_TEST);
 
/* Do a clone3() with sizeof(struct clone_args) + 8 */
-   test_clone3(0, sizeof(struct clone_args) + 8, 0, CLONE3_ARGS_NO_TEST);
+   test_clone3(0, sizeof(struct __clone_args) + 8, 0, CLONE3_ARGS_NO_TEST);
 
/* Do a clone3() with exit_signal having highest 32 bits non-zero */
test_clone3(0, 0, -EINVAL, CLONE3_ARGS_INVAL_EXIT_SIGNAL_BIG);
@@ -165,31 +158,31 @@ int main(int argc, char *argv[])
/* Do a clone3() with NSIG < exit_signal < CSIG */
test_clone3(0, 0, -EINVAL, CLONE3_ARGS_INVAL_EXIT_SIGNAL_NSIG);
 
-   test_clone3(0, sizeof(struct clone_args) + 8, 0, CLONE3_ARGS_ALL_0);
+   test_clone3(0, sizeof(struct __clone_args) + 8, 0, CLONE3_ARGS_ALL_0);
 
-   test_clone3(0, sizeof(struct clone_args) + 16, -E2BIG,
+   test_clone3(0, sizeof(struct __clone_args) + 16, -E2BIG,
CLONE3_ARGS_ALL_0);
 
-   test_clone3(0, sizeof(struct clone_args) * 2, -E2BIG,
+   test_clone3(0, sizeof(struct __clone_args) * 2, -E2BIG,
CLONE3_ARGS_ALL_0);
 
/* Do a clone3() with > page size */
test_clone3(0, getpagesize() + 8, -E2BIG, CLONE3_ARGS_NO_TEST);
 
-   /* Do a clone3() with CLONE3_ARGS_SIZE_V0 in a new PID NS. */
+   /* Do a clone3() with CLONE_ARGS_SIZE_VER0 in a new PID NS. */
if (uid == 0)
-   test_clone3(CLONE_NEWPID, CLONE3_ARGS_SIZE_V0, 0,
+   test_clone3(CLONE_NEWPID, CLONE_ARGS_SIZE_VER0, 0,
CLONE3_ARGS_NO_TEST);
else
ksft_test_result_ski

[PATCH v2 0/4] selftests/seccomp: Refactor change_syscall()

2020-09-19 Thread Kees Cook
v1: https://lore.kernel.org/lkml/20200912110820.597135-1-keesc...@chromium.org
v2:
- Took Acked patches into -next
- refactored powerpc syscall setting implementation
- refactored clone3 args implementation

Hi,

This finishes the refactoring of the seccomp selftest logic used in
for ptrace syscall number/return handling for powerpc. Additionally
fixes clone3 (which seccomp depends on for testing) to run under MIPS
where an old struct clone_args has become visible.

(FWIW, I expect to take these via the seccomp tree.)

Thanks,

Kees Cook (4):
  selftests/seccomp: Record syscall during ptrace entry
  selftests/seccomp: Allow syscall nr and ret value to be set separately
  selftests/seccomp: powerpc: Set syscall return during ptrace syscall
exit
  selftests/clone3: Avoid OS-defined clone_args

 tools/testing/selftests/clone3/clone3.c   |  45 +++
 .../clone3/clone3_cap_checkpoint_restore.c|   4 +-
 .../selftests/clone3/clone3_clear_sighand.c   |   2 +-
 .../selftests/clone3/clone3_selftests.h   |  24 ++--
 .../testing/selftests/clone3/clone3_set_tid.c |   4 +-
 tools/testing/selftests/seccomp/seccomp_bpf.c | 120 ++
 6 files changed, 131 insertions(+), 68 deletions(-)

-- 
2.25.1



[PATCH v2 2/4] selftests/seccomp: Allow syscall nr and ret value to be set separately

2020-09-19 Thread Kees Cook
In preparation for setting syscall nr and ret values separately, refactor
the helpers to take a pointer to a value, so that a NULL can indicate
"do not change this respective value". This is done to keep the regset
read/write happening once and in one code path.

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 59 +++
 1 file changed, 47 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index c0311b4c736b..98ce5e8a6398 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1888,27 +1888,47 @@ int get_syscall(struct __test_metadata *_metadata, 
pid_t tracee)
 }
 
 /* Architecture-specific syscall changing routine. */
-void change_syscall(struct __test_metadata *_metadata,
-   pid_t tracee, int syscall, int result)
+void __change_syscall(struct __test_metadata *_metadata,
+   pid_t tracee, long *syscall, long *ret)
 {
ARCH_REGS orig, regs;
 
+   /* Do not get/set registers if we have nothing to do. */
+   if (!syscall && !ret)
+   return;
+
EXPECT_EQ(0, ARCH_GETREGS(regs)) {
return;
}
orig = regs;
 
-   SYSCALL_NUM_SET(regs, syscall);
+   if (syscall)
+   SYSCALL_NUM_SET(regs, *syscall);
 
-   /* If syscall is skipped, change return value. */
-   if (syscall == -1)
-   SYSCALL_RET_SET(regs, result);
+   if (ret)
+   SYSCALL_RET_SET(regs, *ret);
 
/* Flush any register changes made. */
if (memcmp(, , sizeof(orig)) != 0)
EXPECT_EQ(0, ARCH_SETREGS(regs));
 }
 
+/* Change only syscall number. */
+void change_syscall_nr(struct __test_metadata *_metadata,
+  pid_t tracee, long syscall)
+{
+   __change_syscall(_metadata, tracee, , NULL);
+}
+
+/* Change syscall return value (and set syscall number to -1). */
+void change_syscall_ret(struct __test_metadata *_metadata,
+   pid_t tracee, long ret)
+{
+   long syscall = -1;
+
+   __change_syscall(_metadata, tracee, , );
+}
+
 void tracer_seccomp(struct __test_metadata *_metadata, pid_t tracee,
int status, void *args)
 {
@@ -1924,17 +1944,17 @@ void tracer_seccomp(struct __test_metadata *_metadata, 
pid_t tracee,
case 0x1002:
/* change getpid to getppid. */
EXPECT_EQ(__NR_getpid, get_syscall(_metadata, tracee));
-   change_syscall(_metadata, tracee, __NR_getppid, 0);
+   change_syscall_nr(_metadata, tracee, __NR_getppid);
break;
case 0x1003:
/* skip gettid with valid return code. */
EXPECT_EQ(__NR_gettid, get_syscall(_metadata, tracee));
-   change_syscall(_metadata, tracee, -1, 45000);
+   change_syscall_ret(_metadata, tracee, 45000);
break;
case 0x1004:
/* skip openat with error. */
EXPECT_EQ(__NR_openat, get_syscall(_metadata, tracee));
-   change_syscall(_metadata, tracee, -1, -ESRCH);
+   change_syscall_ret(_metadata, tracee, -ESRCH);
break;
case 0x1005:
/* do nothing (allow getppid) */
@@ -1961,6 +1981,8 @@ void tracer_ptrace(struct __test_metadata *_metadata, 
pid_t tracee,
int ret;
unsigned long msg;
static bool entry;
+   long syscall_nr_val, syscall_ret_val;
+   long *syscall_nr = NULL, *syscall_ret = NULL;
FIXTURE_DATA(TRACE_syscall) *self = args;
 
/*
@@ -1987,17 +2009,30 @@ void tracer_ptrace(struct __test_metadata *_metadata, 
pid_t tracee,
else
return;
 
+   syscall_nr = _nr_val;
+   syscall_ret = _ret_val;
+
+   /* Now handle the actual rewriting cases. */
switch (self->syscall_nr) {
case __NR_getpid:
-   change_syscall(_metadata, tracee, __NR_getppid, 0);
+   syscall_nr_val = __NR_getppid;
+   /* Never change syscall return for this case. */
+   syscall_ret = NULL;
break;
case __NR_gettid:
-   change_syscall(_metadata, tracee, -1, 45000);
+   syscall_nr_val = -1;
+   syscall_ret_val = 45000;
break;
case __NR_openat:
-   change_syscall(_metadata, tracee, -1, -ESRCH);
+   syscall_nr_val = -1;
+   syscall_ret_val = -ESRCH;
break;
+   default:
+   /* Unhandled, do nothing. */
+   return;
}
+
+   __change_syscall(_metadata, tracee, syscall_nr, syscall_ret);
 }
 
 FIXTURE_VARIANT(TRACE_syscall) {
-- 
2.25.1



[PATCH v2 3/4] selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit

2020-09-19 Thread Kees Cook
Some archs (like powerpc) only support changing the return code during
syscall exit when ptrace is used. Test entry vs exit phases for which
portions of the syscall number and return values need to be set at which
different phases. For non-powerpc, all changes are made during ptrace
syscall entry, as before. For powerpc, the syscall number is changed at
ptrace syscall entry and the syscall return value is changed on ptrace
syscall exit.

Reported-by: Thadeu Lima de Souza Cascardo 
Suggested-by: Thadeu Lima de Souza Cascardo 
Link: 
https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-casca...@canonical.com/
Fixes: 58d0a862f573 ("seccomp: add tests for ptrace hole")
Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 25 ---
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 98ce5e8a6398..894c2404d321 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1765,6 +1765,7 @@ TEST_F(TRACE_poke, getpid_runs_normally)
(_regs).ccr &= ~0x1000; \
}   \
} while (0)
+# define SYSCALL_RET_SET_ON_PTRACE_EXIT
 #elif defined(__s390__)
 # define ARCH_REGS s390_regs
 # define SYSCALL_NUM(_regs)(_regs).gprs[2]
@@ -1853,6 +1854,18 @@ TEST_F(TRACE_poke, getpid_runs_normally)
} while (0)
 #endif
 
+/*
+ * Some architectures (e.g. powerpc) can only set syscall
+ * return values on syscall exit during ptrace.
+ */
+const bool ptrace_entry_set_syscall_nr = true;
+const bool ptrace_entry_set_syscall_ret =
+#ifndef SYSCALL_RET_SET_ON_PTRACE_EXIT
+   true;
+#else
+   false;
+#endif
+
 /*
  * Use PTRACE_GETREGS and PTRACE_SETREGS when available. This is useful for
  * architectures without HAVE_ARCH_TRACEHOOK (e.g. User-mode Linux).
@@ -2006,11 +2019,15 @@ void tracer_ptrace(struct __test_metadata *_metadata, 
pid_t tracee,
 */
if (entry)
self->syscall_nr = get_syscall(_metadata, tracee);
-   else
-   return;
 
-   syscall_nr = _nr_val;
-   syscall_ret = _ret_val;
+   /*
+* Depending on the architecture's syscall setting abilities, we
+* pick which things to set during this phase (entry or exit).
+*/
+   if (entry == ptrace_entry_set_syscall_nr)
+   syscall_nr = _nr_val;
+   if (entry == ptrace_entry_set_syscall_ret)
+   syscall_ret = _ret_val;
 
/* Now handle the actual rewriting cases. */
switch (self->syscall_nr) {
-- 
2.25.1



Re: [PATCH 14/15] selftests/clone3: Avoid OS-defined clone_args

2020-09-18 Thread Kees Cook
On Tue, Sep 15, 2020 at 06:25:28PM +0200, Christian Brauner wrote:
> On Sat, Sep 12, 2020 at 04:08:19AM -0700, Kees Cook wrote:
> > As the UAPI headers start to appear in distros, we need to avoid
> > outdated versions of struct clone_args to be able to test modern
> > features. Additionally pull in the syscall numbers correctly.
> > 
> > Signed-off-by: Kees Cook 
> > ---
> 
> Hm, with this patch applied I'm getting:
> 
> gcc -g -I../../../../usr/include/clone3_set_tid.c 
> /home/brauner/src/git/linux/linux/tools/testing/selftests/kselftest_harness.h 
> /home/brauner/src/git/linux/linux/tools/testing/selftests/kselftest.h -lcap 
> -o 
> /home/brauner/src/git/linux/linux/tools/testing/selftests/clone3/clone3_set_tid
> In file included from clone3_set_tid.c:24:
> clone3_selftests.h:37:8: error: redefinition of ‘struct clone_args’
>37 | struct clone_args {
>   |^~
> In file included from clone3_set_tid.c:12:
> /usr/include/linux/sched.h:92:8: note: originally defined here
>92 | struct clone_args {
>   |^~
> make: *** [../lib.mk:140: 
> /home/brauner/src/git/linux/linux/tools/testing/selftests/clone3/clone3_set_tid]
>  Error 1

Hm, weird.

> One trick to avoid this could be:
> 
> #ifndef CLONE_ARGS_SIZE_VER0
> #define CLONE_ARGS_SIZE_VER0 64 /* sizeof first published struct */
> #endif
> 
> #ifndef CLONE_ARGS_SIZE_VER1
> #define CLONE_ARGS_SIZE_VER1 80 /* sizeof second published struct */
> #endif
> 
> #ifndef CLONE_ARGS_SIZE_VER2
> #define CLONE_ARGS_SIZE_VER2 88 /* sizeof third published struct */
> #endif
> 
> struct __clone_args {
>   __aligned_u64 flags;
>   __aligned_u64 pidfd;
>   __aligned_u64 child_tid;
>   __aligned_u64 parent_tid;
>   __aligned_u64 exit_signal;
>   __aligned_u64 stack;
>   __aligned_u64 stack_size;
>   __aligned_u64 tls;
>   __aligned_u64 set_tid;
>   __aligned_u64 set_tid_size;
>   __aligned_u64 cgroup;
> };
> 
> static pid_t sys_clone3(struct __clone_args *args, size_t size)
> {
>   return syscall(__NR_clone3, args, size);
> }

Yeah, that has fewer down sides. I'll rework it.

-- 
Kees Cook


Re: [PATCH 03/15] selftests/seccomp: mips: Define SYSCALL_NUM_SET macro

2020-09-18 Thread Kees Cook
On Tue, Sep 15, 2020 at 05:55:46PM +0200, Christian Brauner wrote:
> On Sat, Sep 12, 2020 at 04:08:08AM -0700, Kees Cook wrote:
> > Remove the mips special-case in change_syscall().
> > 
> > Signed-off-by: Kees Cook 
> > ---
> >  tools/testing/selftests/seccomp/seccomp_bpf.c | 17 +
> >  1 file changed, 9 insertions(+), 8 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
> > b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > index 1c83e743bfb1..02a9a6599746 100644
> > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > @@ -1742,6 +1742,13 @@ TEST_F(TRACE_poke, getpid_runs_normally)
> >  # define ARCH_REGS struct pt_regs
> >  # define SYSCALL_NUM(_regs)(_regs).regs[2]
> >  # define SYSCALL_SYSCALL_NUM   regs[4]
> > +# define SYSCALL_NUM_SET(_regs, _nr)   \
> > +   do {\
> > +   if ((_regs).regs[2] == __NR_O32_Linux)  \
> > +   (_regs).regs[4] = _nr;  \
> > +   else\
> > +   (_regs).regs[2] = _nr;  \
> > +   } while (0)
> 
> I think that
> 
> # define SYSCALL_NUM_SET(_regs, _nr)  \
>   do {\
>   if (SYSCALL_NUM(_regs) == __NR_O32_Linux)   \
>   (_regs).regs[4] = _nr;  \
>   else\
>   (_regs).regs[2] = _nr;  \
>   } while (0)
> 
> would read better but that's just a matter of taste. :)

That's how I started originally, but when I realized that I'd have to
reorganize SYSCALL_NUM() too, it seem best to have minimal churn, so I
left it open coded here, since that's how it needs to be in the end.

> Looks good!
> Acked-by: Christian Brauner 

Thanks for the reviews!

-- 
Kees Cook


Re: [PATCH] selftests/seccomp: fix ptrace tests on powerpc

2020-09-17 Thread Kees Cook
On Sun, Sep 13, 2020 at 10:34:23PM +1000, Michael Ellerman wrote:
> Thadeu Lima de Souza Cascardo  writes:
> > On Tue, Sep 08, 2020 at 04:18:17PM -0700, Kees Cook wrote:
> >> On Tue, Jun 30, 2020 at 01:47:39PM -0300, Thadeu Lima de Souza Cascardo 
> >> wrote:
> ...
> >> > @@ -1809,10 +1818,15 @@ void tracer_ptrace(struct __test_metadata 
> >> > *_metadata, pid_t tracee,
> >> >  EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> >> >  : PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
> >> >  
> >> > -if (!entry)
> >> > +if (!entry && !syscall_nr)
> >> >  return;
> >> >  
> >> > -nr = get_syscall(_metadata, tracee);
> >> > +if (entry)
> >> > +nr = get_syscall(_metadata, tracee);
> >> > +else
> >> > +nr = *syscall_nr;
> >> 
> >> This is weird? Shouldn't get_syscall() be modified to do the right thing
> >> here instead of depending on the extra arg?
> >> 
> >
> > R0 might be clobered. Same documentation mentions it as volatile. So, during
> > syscall exit, we can't tell for sure that R0 will have the original syscall
> > number. So, we need to grab it during syscall enter, save it somewhere and
> > reuse it. I used the test context/args for that.
> 
> The user r0 (in regs->gpr[0]) shouldn't be clobbered.
> 
> But it is modified if the tracer skips the syscall, by setting the
> syscall number to -1. Or if the tracer changes the syscall number.
> 
> So if you need the original syscall number in the exit path then I think
> you do need to save it at entry.

... the selftest code wants to test the updated syscall (-1 or
whatever), so this sounds correct. Was this test actually failing on
powerpc? (I'd really rather not split entry/exit if I don't have to.)

-- 
Kees Cook


Re: [PATCH 00/15] selftests/seccomp: Refactor change_syscall()

2020-09-14 Thread Kees Cook
On Mon, Sep 14, 2020 at 10:15:18PM +1000, Michael Ellerman wrote:
> Kees Cook  writes:
> > Hi,
> >
> > This refactors the seccomp selftest macros used in change_syscall(),
> > in an effort to remove special cases for mips, arm, arm64, and xtensa,
> > which paves the way for powerpc fixes.
> >
> > I'm not entirely done testing, but all-arch build tests and x86_64
> > selftests pass. I'll be doing arm, arm64, and i386 selftests shortly,
> > but I currently don't have an easy way to check xtensa, mips, nor
> > powerpc. Any help there would be appreciated!
> 
> The series builds fine for me, and all the tests pass (see below).
> 
> Thanks for picking up those changes to deal with powerpc being oddball.
> 
> Tested-by: Michael Ellerman  (powerpc)

Awesome; thanks!

However...

> 
> cheers
> 
> 
> ./seccomp_bpf
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> #  RUN   global.kcmp ...
> #OK  global.kcmp
> ok 1 global.kcmp
> [...]
> #  RUN   global.KILL_thread ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.

Was this a mis-paste, or has something very very bad happened here in
global.KILL_one_arg_six finishes?

> #  RUN   global.kcmp ...
> #OK  global.kcmp
> ok 1 global.kcmp
> [...]
> #  RUN   global.user_notification_basic ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_basic ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_signal ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_closed_listener ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_child_pid_ns ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_sibling_pid_ns ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_sibling_pid_ns ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_sibling_pid_ns ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_fault_recv ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_continue ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_filter_empty ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_filter_empty_threaded ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_addfd ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> #  RUN   global.user_notification_addfd_rlimit ...
> TAP version 13
> 1..86
> # Starting 86 tests from 7 test cases.
> [...]
> # PASSED: 86 / 86 tests passed.
> # Totals: pass:86 fail:0 xfail:0 xpass:0 skip:0 error:0

And after every user_notification test? O_O

-- 
Kees Cook


Re: [PATCH 13/15] selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit

2020-09-14 Thread Kees Cook
On Mon, Sep 14, 2020 at 03:47:13PM +1000, Michael Ellerman wrote:
> Kees Cook  writes:
> > Some archs (like ppc) only support changing the return code during
> > syscall exit when ptrace is used. As the syscall number might not
> > be available anymore during syscall exit, it needs to be saved
> > during syscall enter. Adjust the ptrace tests to do this.
> 
> I'm not that across all the fixture stuff, but if I'm reading it right
> you're now calling change_syscall() on both entry and exit for all
> arches.

Correct.

> That should work, but it no longer tests changing the return code on
> entry on the arches that support it, which seems like a backward step?

That's a good point. I wouldn't be in a position to notice a regression
for the other architectures. I will refactor this one...

-- 
Kees Cook


[PATCH 15/15] selftests/seccomp: Use __NR_mknodat instead of __NR_mknod

2020-09-12 Thread Kees Cook
The __NR_mknod syscall doesn't exist on arm64 (only __NR_mknodat).
Switch to the modern syscall.

Fixes: ad5682184a81 ("selftests/seccomp: Check for EPOLLHUP for user_notif")
Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 26c712c6a575..b34ede28f314 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -3773,7 +3773,7 @@ TEST(user_notification_filter_empty)
if (pid == 0) {
int listener;
 
-   listener = user_notif_syscall(__NR_mknod, 
SECCOMP_FILTER_FLAG_NEW_LISTENER);
+   listener = user_notif_syscall(__NR_mknodat, 
SECCOMP_FILTER_FLAG_NEW_LISTENER);
if (listener < 0)
_exit(EXIT_FAILURE);
 
-- 
2.25.1



[PATCH 14/15] selftests/clone3: Avoid OS-defined clone_args

2020-09-12 Thread Kees Cook
As the UAPI headers start to appear in distros, we need to avoid
outdated versions of struct clone_args to be able to test modern
features. Additionally pull in the syscall numbers correctly.

Signed-off-by: Kees Cook 
---
I needed to fix this to get MIPS to build the seccomp selftests.
---
 .../testing/selftests/clone3/clone3_selftests.h  | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/clone3/clone3_selftests.h 
b/tools/testing/selftests/clone3/clone3_selftests.h
index 91c1a78ddb39..bc0f34e37ae1 100644
--- a/tools/testing/selftests/clone3/clone3_selftests.h
+++ b/tools/testing/selftests/clone3/clone3_selftests.h
@@ -4,11 +4,19 @@
 #define _CLONE3_SELFTESTS_H
 
 #define _GNU_SOURCE
+
+/* Pull in syscall numbers. */
+#include 
+#include 
+
+/* Avoid old OS versions of "struct clone_args". */
+#define clone_args old_clone_args
 #include 
 #include 
+#undef clone_args
+
 #include 
 #include 
-#include 
 #include 
 
 #include "../kselftest.h"
@@ -25,6 +33,7 @@
 
 #ifndef __NR_clone3
 #define __NR_clone3 -1
+#endif
 struct clone_args {
__aligned_u64 flags;
__aligned_u64 pidfd;
@@ -34,13 +43,16 @@ struct clone_args {
__aligned_u64 stack;
__aligned_u64 stack_size;
__aligned_u64 tls;
+#ifndef CLONE_ARGS_SIZE_VER1
 #define CLONE_ARGS_SIZE_VER1 80
+#endif
__aligned_u64 set_tid;
__aligned_u64 set_tid_size;
+#ifndef CLONE_ARGS_SIZE_VER2
 #define CLONE_ARGS_SIZE_VER2 88
+#endif
__aligned_u64 cgroup;
 };
-#endif /* __NR_clone3 */
 
 static pid_t sys_clone3(struct clone_args *args, size_t size)
 {
-- 
2.25.1



[PATCH 11/15] selftests/seccomp: Remove SYSCALL_NUM_RET_SHARE_REG in favor of SYSCALL_RET_SET

2020-09-12 Thread Kees Cook
Instead of special-casing the specific case of shared registers, create
a default SYSCALL_RET_SET() macro (mirroring SYSCALL_NUM_SET()), that
writes to the SYSCALL_RET register. For architectures that can't set the
return value (for whatever reason), they can define SYSCALL_RET_SET()
without an associated SYSCALL_RET() macro. This also paves the way for
architectures that need to do special things to set the return value
(e.g. powerpc).

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 33 +--
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 2790d9cd50f4..623953a53032 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1753,8 +1753,8 @@ TEST_F(TRACE_poke, getpid_runs_normally)
 #elif defined(__s390__)
 # define ARCH_REGS s390_regs
 # define SYSCALL_NUM(_regs)(_regs).gprs[2]
-# define SYSCALL_RET(_regs)(_regs).gprs[2]
-# define SYSCALL_NUM_RET_SHARE_REG
+# define SYSCALL_RET_SET(_regs, _val)  \
+   TH_LOG("Can't modify syscall return on this architecture")
 #elif defined(__mips__)
 # include 
 # include 
@@ -1776,8 +1776,8 @@ TEST_F(TRACE_poke, getpid_runs_normally)
else\
(_regs).regs[2] = _nr;  \
} while (0)
-# define SYSCALL_RET(_regs)(_regs).regs[2]
-# define SYSCALL_NUM_RET_SHARE_REG
+# define SYSCALL_RET_SET(_regs, _val)  \
+   TH_LOG("Can't modify syscall return on this architecture")
 #elif defined(__xtensa__)
 # define ARCH_REGS struct user_pt_regs
 # define SYSCALL_NUM(_regs)(_regs).syscall
@@ -1804,9 +1804,26 @@ TEST_F(TRACE_poke, getpid_runs_normally)
SYSCALL_NUM(_regs) = (_nr); \
} while (0)
 #endif
+/*
+ * Most architectures can change the syscall return value by just
+ * writing to the SYSCALL_RET register. This is the default if not
+ * defined above. If an architecture cannot set the return value
+ * (for example when the syscall and return value register is
+ * shared), report it with TH_LOG() in an arch-specific definition
+ * of SYSCALL_RET_SET() above, and leave SYSCALL_RET undefined.
+ */
+#if !defined(SYSCALL_RET) && !defined(SYSCALL_RET_SET)
+# error "One of SYSCALL_RET or SYSCALL_RET_SET is needed for this arch"
+#endif
+#ifndef SYSCALL_RET_SET
+# define SYSCALL_RET_SET(_regs, _val)  \
+   do {\
+   SYSCALL_RET(_regs) = (_val);\
+   } while (0)
+#endif
 
 /* When the syscall return can't be changed, stub out the tests for it. */
-#ifdef SYSCALL_NUM_RET_SHARE_REG
+#ifndef SYSCALL_RET
 # define EXPECT_SYSCALL_RETURN(val, action)EXPECT_EQ(-1, action)
 #else
 # define EXPECT_SYSCALL_RETURN(val, action)\
@@ -1870,11 +1887,7 @@ void change_syscall(struct __test_metadata *_metadata,
 
/* If syscall is skipped, change return value. */
if (syscall == -1)
-#ifdef SYSCALL_NUM_RET_SHARE_REG
-   TH_LOG("Can't modify syscall return on this architecture");
-#else
-   SYSCALL_RET(regs) = result;
-#endif
+   SYSCALL_RET_SET(regs, result);
 
/* Flush any register changes made. */
if (memcmp(, , sizeof(orig)) != 0)
-- 
2.25.1



[PATCH 13/15] selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit

2020-09-12 Thread Kees Cook
Some archs (like ppc) only support changing the return code during
syscall exit when ptrace is used. As the syscall number might not
be available anymore during syscall exit, it needs to be saved
during syscall enter. Adjust the ptrace tests to do this.

Reported-by: Thadeu Lima de Souza Cascardo 
Suggested-by: Thadeu Lima de Souza Cascardo 
Link: 
https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-casca...@canonical.com/
Fixes: 58d0a862f573 ("seccomp: add tests for ptrace hole")
Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 34 +++
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index bbab2420d708..26c712c6a575 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1949,12 +1949,19 @@ void tracer_seccomp(struct __test_metadata *_metadata, 
pid_t tracee,
 
 }
 
+FIXTURE(TRACE_syscall) {
+   struct sock_fprog prog;
+   pid_t tracer, mytid, mypid, parent;
+   long syscall_nr;
+};
+
 void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
   int status, void *args)
 {
-   int ret, nr;
+   int ret;
unsigned long msg;
static bool entry;
+   FIXTURE_DATA(TRACE_syscall) *self = args;
 
/*
 * The traditional way to tell PTRACE_SYSCALL entry/exit
@@ -1968,24 +1975,23 @@ void tracer_ptrace(struct __test_metadata *_metadata, 
pid_t tracee,
EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
: PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
 
-   if (!entry)
-   return;
-
-   nr = get_syscall(_metadata, tracee);
+   /*
+* Some architectures only support setting return values during
+* syscall exit under ptrace, and on exit the syscall number may
+* no longer be available. Therefore, save it here, and call
+* "change syscall and set return values" on both entry and exit.
+*/
+   if (entry)
+   self->syscall_nr = get_syscall(_metadata, tracee);
 
-   if (nr == __NR_getpid)
+   if (self->syscall_nr == __NR_getpid)
change_syscall(_metadata, tracee, __NR_getppid, 0);
-   if (nr == __NR_gettid)
+   if (self->syscall_nr == __NR_gettid)
change_syscall(_metadata, tracee, -1, 45000);
-   if (nr == __NR_openat)
+   if (self->syscall_nr == __NR_openat)
change_syscall(_metadata, tracee, -1, -ESRCH);
 }
 
-FIXTURE(TRACE_syscall) {
-   struct sock_fprog prog;
-   pid_t tracer, mytid, mypid, parent;
-};
-
 FIXTURE_VARIANT(TRACE_syscall) {
/*
 * All of the SECCOMP_RET_TRACE behaviors can be tested with either
@@ -2044,7 +2050,7 @@ FIXTURE_SETUP(TRACE_syscall)
self->tracer = setup_trace_fixture(_metadata,
   variant->use_ptrace ? tracer_ptrace
   : tracer_seccomp,
-  NULL, variant->use_ptrace);
+  self, variant->use_ptrace);
 
ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
ASSERT_EQ(0, ret);
-- 
2.25.1



[PATCH 12/15] selftests/seccomp: powerpc: Fix seccomp return value testing

2020-09-12 Thread Kees Cook
On powerpc, the errno is not inverted, and depends on ccr.so being
set. Add this to a powerpc definition of SYSCALL_RET_SET().

Co-developed-by: Thadeu Lima de Souza Cascardo 
Signed-off-by: Thadeu Lima de Souza Cascardo 
Link: 
https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-casca...@canonical.com/
Fixes: 5d83c2b37d43 ("selftests/seccomp: Add powerpc support")
Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 623953a53032..bbab2420d708 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1750,6 +1750,21 @@ TEST_F(TRACE_poke, getpid_runs_normally)
 # define ARCH_REGS struct pt_regs
 # define SYSCALL_NUM(_regs)(_regs).gpr[0]
 # define SYSCALL_RET(_regs)(_regs).gpr[3]
+# define SYSCALL_RET_SET(_regs, _val)  \
+   do {\
+   typeof(_val) _result = (_val);  \
+   /*  \
+* A syscall error is signaled by CR0 SO bit\
+* and the code is stored as a positive value.  \
+*/ \
+   if (_result < 0) {  \
+   SYSCALL_RET(_regs) = -result;   \
+   (_regs).ccr |= 0x1000;  \
+   } else {\
+   SYSCALL_RET(_regs) = result;\
+   (_regs).ccr &= ~0x1000; \
+   }   \
+   } while (0)
 #elif defined(__s390__)
 # define ARCH_REGS s390_regs
 # define SYSCALL_NUM(_regs)(_regs).gprs[2]
-- 
2.25.1



[PATCH 10/15] selftests/seccomp: Avoid redundant register flushes

2020-09-12 Thread Kees Cook
When none of the registers have changed, don't flush them back. This can
happen if the architecture uses a non-register way to change the syscall
(e.g. arm64) , and a return value hasn't been written.

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index d9346121b89b..2790d9cd50f4 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1859,11 +1859,12 @@ int get_syscall(struct __test_metadata *_metadata, 
pid_t tracee)
 void change_syscall(struct __test_metadata *_metadata,
pid_t tracee, int syscall, int result)
 {
-   ARCH_REGS regs;
+   ARCH_REGS orig, regs;
 
EXPECT_EQ(0, ARCH_GETREGS(regs)) {
return;
}
+   orig = regs;
 
SYSCALL_NUM_SET(regs, syscall);
 
@@ -1876,7 +1877,8 @@ void change_syscall(struct __test_metadata *_metadata,
 #endif
 
/* Flush any register changes made. */
-   EXPECT_EQ(0, ARCH_SETREGS(regs));
+   if (memcmp(, , sizeof(orig)) != 0)
+   EXPECT_EQ(0, ARCH_SETREGS(regs));
 }
 
 void tracer_seccomp(struct __test_metadata *_metadata, pid_t tracee,
-- 
2.25.1



[PATCH 09/15] selftests/seccomp: Convert REGSET calls into ARCH_GETREG/ARCH_SETREG

2020-09-12 Thread Kees Cook
Consolidate the REGSET logic into the new ARCH_GETREG() and
ARCH_SETREG() macros, avoiding more #ifdef code in function bodies.

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 42 +++
 1 file changed, 15 insertions(+), 27 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index a986f2332327..d9346121b89b 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1828,26 +1828,29 @@ TEST_F(TRACE_poke, getpid_runs_normally)
 #if defined(__x86_64__) || defined(__i386__) || defined(__mips__)
 # define ARCH_GETREGS(_regs)   ptrace(PTRACE_GETREGS, tracee, 0, &(_regs))
 # define ARCH_SETREGS(_regs)   ptrace(PTRACE_SETREGS, tracee, 0, &(_regs))
+#else
+# define ARCH_GETREGS(_regs)   ({  \
+   struct iovec __v;   \
+   __v.iov_base = &(_regs);\
+   __v.iov_len = sizeof(_regs);\
+   ptrace(PTRACE_GETREGSET, tracee, NT_PRSTATUS, &__v);\
+   })
+# define ARCH_SETREGS(_regs)   ({  \
+   struct iovec __v;   \
+   __v.iov_base = &(_regs);\
+   __v.iov_len = sizeof(_regs);\
+   ptrace(PTRACE_SETREGSET, tracee, NT_PRSTATUS, &__v);\
+   })
 #endif
 
 /* Architecture-specific syscall fetching routine. */
 int get_syscall(struct __test_metadata *_metadata, pid_t tracee)
 {
ARCH_REGS regs;
-#ifdef ARCH_GETREGS
-   EXPECT_EQ(0, ARCH_GETREGS(regs)) {
-   return -1;
-   }
-#else
-   struct iovec iov;
 
-   iov.iov_base = 
-   iov.iov_len = sizeof(regs);
-   EXPECT_EQ(0, ptrace(PTRACE_GETREGSET, tracee, NT_PRSTATUS, )) {
-   TH_LOG("PTRACE_GETREGSET failed");
+   EXPECT_EQ(0, ARCH_GETREGS(regs)) {
return -1;
}
-#endif
 
return SYSCALL_NUM(regs);
 }
@@ -1857,18 +1860,10 @@ void change_syscall(struct __test_metadata *_metadata,
pid_t tracee, int syscall, int result)
 {
ARCH_REGS regs;
-#ifdef ARCH_GETREGS
+
EXPECT_EQ(0, ARCH_GETREGS(regs)) {
return;
}
-#else
-   int ret;
-   struct iovec iov;
-   iov.iov_base = 
-   iov.iov_len = sizeof(regs);
-   ret = ptrace(PTRACE_GETREGSET, tracee, NT_PRSTATUS, );
-   EXPECT_EQ(0, ret);
-#endif
 
SYSCALL_NUM_SET(regs, syscall);
 
@@ -1881,14 +1876,7 @@ void change_syscall(struct __test_metadata *_metadata,
 #endif
 
/* Flush any register changes made. */
-#ifdef ARCH_SETREGS
EXPECT_EQ(0, ARCH_SETREGS(regs));
-#else
-   iov.iov_base = 
-   iov.iov_len = sizeof(regs);
-   ret = ptrace(PTRACE_SETREGSET, tracee, NT_PRSTATUS, );
-   EXPECT_EQ(0, ret);
-#endif
 }
 
 void tracer_seccomp(struct __test_metadata *_metadata, pid_t tracee,
-- 
2.25.1



[PATCH 08/15] selftests/seccomp: Convert HAVE_GETREG into ARCH_GETREG/ARCH_SETREG

2020-09-12 Thread Kees Cook
Instead of special-casing the get/set-registers routines, move the
HAVE_GETREG logic into the new ARCH_GETREG() and ARCH_SETREG() macros.

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 27 ++-
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 3b77bdbe7125..a986f2332327 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1821,20 +1821,21 @@ TEST_F(TRACE_poke, getpid_runs_normally)
} while (0)
 #endif
 
-/* Use PTRACE_GETREGS and PTRACE_SETREGS when available. This is useful for
+/*
+ * Use PTRACE_GETREGS and PTRACE_SETREGS when available. This is useful for
  * architectures without HAVE_ARCH_TRACEHOOK (e.g. User-mode Linux).
  */
 #if defined(__x86_64__) || defined(__i386__) || defined(__mips__)
-#define HAVE_GETREGS
+# define ARCH_GETREGS(_regs)   ptrace(PTRACE_GETREGS, tracee, 0, &(_regs))
+# define ARCH_SETREGS(_regs)   ptrace(PTRACE_SETREGS, tracee, 0, &(_regs))
 #endif
 
 /* Architecture-specific syscall fetching routine. */
 int get_syscall(struct __test_metadata *_metadata, pid_t tracee)
 {
ARCH_REGS regs;
-#ifdef HAVE_GETREGS
-   EXPECT_EQ(0, ptrace(PTRACE_GETREGS, tracee, 0, )) {
-   TH_LOG("PTRACE_GETREGS failed");
+#ifdef ARCH_GETREGS
+   EXPECT_EQ(0, ARCH_GETREGS(regs)) {
return -1;
}
 #else
@@ -1855,17 +1856,19 @@ int get_syscall(struct __test_metadata *_metadata, 
pid_t tracee)
 void change_syscall(struct __test_metadata *_metadata,
pid_t tracee, int syscall, int result)
 {
-   int ret;
ARCH_REGS regs;
-#ifdef HAVE_GETREGS
-   ret = ptrace(PTRACE_GETREGS, tracee, 0, );
+#ifdef ARCH_GETREGS
+   EXPECT_EQ(0, ARCH_GETREGS(regs)) {
+   return;
+   }
 #else
+   int ret;
struct iovec iov;
iov.iov_base = 
iov.iov_len = sizeof(regs);
ret = ptrace(PTRACE_GETREGSET, tracee, NT_PRSTATUS, );
-#endif
EXPECT_EQ(0, ret);
+#endif
 
SYSCALL_NUM_SET(regs, syscall);
 
@@ -1878,14 +1881,14 @@ void change_syscall(struct __test_metadata *_metadata,
 #endif
 
/* Flush any register changes made. */
-#ifdef HAVE_GETREGS
-   ret = ptrace(PTRACE_SETREGS, tracee, 0, );
+#ifdef ARCH_SETREGS
+   EXPECT_EQ(0, ARCH_SETREGS(regs));
 #else
iov.iov_base = 
iov.iov_len = sizeof(regs);
ret = ptrace(PTRACE_SETREGSET, tracee, NT_PRSTATUS, );
-#endif
EXPECT_EQ(0, ret);
+#endif
 }
 
 void tracer_seccomp(struct __test_metadata *_metadata, pid_t tracee,
-- 
2.25.1



[PATCH 07/15] selftests/seccomp: Remove syscall setting #ifdefs

2020-09-12 Thread Kees Cook
With all architectures now using the common SYSCALL_NUM_SET() macro, the
arch-specific #ifdef can be removed from change_syscall() itself.

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 16 +++-
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index aa1c224371d1..3b77bdbe7125 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1865,20 +1865,9 @@ void change_syscall(struct __test_metadata *_metadata,
iov.iov_len = sizeof(regs);
ret = ptrace(PTRACE_GETREGSET, tracee, NT_PRSTATUS, );
 #endif
-   EXPECT_EQ(0, ret) {}
+   EXPECT_EQ(0, ret);
 
-#if defined(__x86_64__) || defined(__i386__) || defined(__powerpc__) || \
-   defined(__s390__) || defined(__hppa__) || defined(__riscv) || \
-   defined(__xtensa__) || defined(__csky__) || defined(__sh__) || \
-   defined(__mips__) || defined(__arm__) || defined(__aarch64__)
-   {
-   SYSCALL_NUM_SET(regs, syscall);
-   }
-#else
-   ASSERT_EQ(1, 0) {
-   TH_LOG("How is the syscall changed on this architecture?");
-   }
-#endif
+   SYSCALL_NUM_SET(regs, syscall);
 
/* If syscall is skipped, change return value. */
if (syscall == -1)
@@ -1888,6 +1877,7 @@ void change_syscall(struct __test_metadata *_metadata,
SYSCALL_RET(regs) = result;
 #endif
 
+   /* Flush any register changes made. */
 #ifdef HAVE_GETREGS
ret = ptrace(PTRACE_SETREGS, tracee, 0, );
 #else
-- 
2.25.1



[PATCH 06/15] selftests/seccomp: mips: Remove O32-specific macro

2020-09-12 Thread Kees Cook
Instead of having the mips O32 macro special-cased, pull the logic into
the SYSCALL_NUM() macro. Additionally include the ABI headers, since
these appear to have been missing, leaving __NR_O32_Linux undefined.

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index cfa606d96086..aa1c224371d1 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1756,9 +1756,19 @@ TEST_F(TRACE_poke, getpid_runs_normally)
 # define SYSCALL_RET(_regs)(_regs).gprs[2]
 # define SYSCALL_NUM_RET_SHARE_REG
 #elif defined(__mips__)
+# include 
+# include 
+# include 
 # define ARCH_REGS struct pt_regs
-# define SYSCALL_NUM(_regs)(_regs).regs[2]
-# define SYSCALL_SYSCALL_NUM   regs[4]
+# define SYSCALL_NUM(_regs)\
+   ({  \
+   typeof((_regs).regs[2]) _nr;\
+   if ((_regs).regs[2] == __NR_O32_Linux)  \
+   _nr = (_regs).regs[4];  \
+   else\
+   _nr = (_regs).regs[2];  \
+   _nr;\
+   })
 # define SYSCALL_NUM_SET(_regs, _nr)   \
do {\
if ((_regs).regs[2] == __NR_O32_Linux)  \
@@ -1838,10 +1848,6 @@ int get_syscall(struct __test_metadata *_metadata, pid_t 
tracee)
}
 #endif
 
-#if defined(__mips__)
-   if (SYSCALL_NUM(regs) == __NR_O32_Linux)
-   return regs.SYSCALL_SYSCALL_NUM;
-#endif
return SYSCALL_NUM(regs);
 }
 
-- 
2.25.1



[PATCH 05/15] selftests/seccomp: arm64: Define SYSCALL_NUM_SET macro

2020-09-12 Thread Kees Cook
Remove the arm64 special-case in change_syscall().

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 610fc036e374..cfa606d96086 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1717,6 +1717,18 @@ TEST_F(TRACE_poke, getpid_runs_normally)
 #elif defined(__aarch64__)
 # define ARCH_REGS struct user_pt_regs
 # define SYSCALL_NUM(_regs)(_regs).regs[8]
+# ifndef NT_ARM_SYSTEM_CALL
+#  define NT_ARM_SYSTEM_CALL 0x404
+# endif
+# define SYSCALL_NUM_SET(_regs, _nr)   \
+   do {\
+   struct iovec __v;   \
+   typeof(_nr) __nr = (_nr);   \
+   __v.iov_base = &__nr;   \
+   __v.iov_len = sizeof(__nr); \
+   EXPECT_EQ(0, ptrace(PTRACE_SETREGSET, tracee,   \
+   NT_ARM_SYSTEM_CALL, &__v)); \
+   } while (0)
 # define SYSCALL_RET(_regs)(_regs).regs[0]
 #elif defined(__riscv) && __riscv_xlen == 64
 # define ARCH_REGS struct user_regs_struct
@@ -1852,23 +1864,10 @@ void change_syscall(struct __test_metadata *_metadata,
 #if defined(__x86_64__) || defined(__i386__) || defined(__powerpc__) || \
defined(__s390__) || defined(__hppa__) || defined(__riscv) || \
defined(__xtensa__) || defined(__csky__) || defined(__sh__) || \
-   defined(__mips__) || defined(__arm__)
+   defined(__mips__) || defined(__arm__) || defined(__aarch64__)
{
SYSCALL_NUM_SET(regs, syscall);
}
-
-#elif defined(__aarch64__)
-# ifndef NT_ARM_SYSTEM_CALL
-#  define NT_ARM_SYSTEM_CALL 0x404
-# endif
-   {
-   iov.iov_base = 
-   iov.iov_len = sizeof(syscall);
-   ret = ptrace(PTRACE_SETREGSET, tracee, NT_ARM_SYSTEM_CALL,
-);
-   EXPECT_EQ(0, ret);
-   }
-
 #else
ASSERT_EQ(1, 0) {
TH_LOG("How is the syscall changed on this architecture?");
-- 
2.25.1



[PATCH 04/15] selftests/seccomp: arm: Define SYSCALL_NUM_SET macro

2020-09-12 Thread Kees Cook
Remove the arm special-case in change_syscall().

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 02a9a6599746..610fc036e374 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1708,6 +1708,11 @@ TEST_F(TRACE_poke, getpid_runs_normally)
 #elif defined(__arm__)
 # define ARCH_REGS struct pt_regs
 # define SYSCALL_NUM(_regs)(_regs).ARM_r7
+# ifndef PTRACE_SET_SYSCALL
+#  define PTRACE_SET_SYSCALL   23
+# endif
+# define SYSCALL_NUM_SET(_regs, _nr)   \
+   EXPECT_EQ(0, ptrace(PTRACE_SET_SYSCALL, tracee, NULL, _nr))
 # define SYSCALL_RET(_regs)(_regs).ARM_r0
 #elif defined(__aarch64__)
 # define ARCH_REGS struct user_pt_regs
@@ -1847,20 +1852,11 @@ void change_syscall(struct __test_metadata *_metadata,
 #if defined(__x86_64__) || defined(__i386__) || defined(__powerpc__) || \
defined(__s390__) || defined(__hppa__) || defined(__riscv) || \
defined(__xtensa__) || defined(__csky__) || defined(__sh__) || \
-   defined(__mips__)
+   defined(__mips__) || defined(__arm__)
{
SYSCALL_NUM_SET(regs, syscall);
}
 
-#elif defined(__arm__)
-# ifndef PTRACE_SET_SYSCALL
-#  define PTRACE_SET_SYSCALL   23
-# endif
-   {
-   ret = ptrace(PTRACE_SET_SYSCALL, tracee, NULL, syscall);
-   EXPECT_EQ(0, ret);
-   }
-
 #elif defined(__aarch64__)
 # ifndef NT_ARM_SYSTEM_CALL
 #  define NT_ARM_SYSTEM_CALL 0x404
-- 
2.25.1



[PATCH 02/15] selftests/seccomp: Provide generic syscall setting macro

2020-09-12 Thread Kees Cook
In order to avoid "#ifdef"s in the main function bodies, create a new
macro, SYSCALL_NUM_SET(), where arch-specific logic can live.

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index fef15080b575..1c83e743bfb1 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1760,6 +1760,17 @@ TEST_F(TRACE_poke, getpid_runs_normally)
 # error "Do not know how to find your architecture's registers and syscalls"
 #endif
 
+/*
+ * Most architectures can change the syscall by just updating the
+ * associated register. This is the default if not defined above.
+ */
+#ifndef SYSCALL_NUM_SET
+# define SYSCALL_NUM_SET(_regs, _nr)   \
+   do {\
+   SYSCALL_NUM(_regs) = (_nr); \
+   } while (0)
+#endif
+
 /* When the syscall return can't be changed, stub out the tests for it. */
 #ifdef SYSCALL_NUM_RET_SHARE_REG
 # define EXPECT_SYSCALL_RETURN(val, action)EXPECT_EQ(-1, action)
@@ -1830,14 +1841,14 @@ void change_syscall(struct __test_metadata *_metadata,
defined(__s390__) || defined(__hppa__) || defined(__riscv) || \
defined(__xtensa__) || defined(__csky__) || defined(__sh__)
{
-   SYSCALL_NUM(regs) = syscall;
+   SYSCALL_NUM_SET(regs, syscall);
}
 #elif defined(__mips__)
{
if (SYSCALL_NUM(regs) == __NR_O32_Linux)
regs.SYSCALL_SYSCALL_NUM = syscall;
else
-   SYSCALL_NUM(regs) = syscall;
+   SYSCALL_NUM_SET(regs, syscall);
}
 
 #elif defined(__arm__)
-- 
2.25.1



[PATCH 00/15] selftests/seccomp: Refactor change_syscall()

2020-09-12 Thread Kees Cook
Hi,

This refactors the seccomp selftest macros used in change_syscall(),
in an effort to remove special cases for mips, arm, arm64, and xtensa,
which paves the way for powerpc fixes.

I'm not entirely done testing, but all-arch build tests and x86_64
selftests pass. I'll be doing arm, arm64, and i386 selftests shortly,
but I currently don't have an easy way to check xtensa, mips, nor
powerpc. Any help there would be appreciated!

(FWIW, I expect to take these via the seccomp tree.)

Thanks,

-Kees


Kees Cook (15):
  selftests/seccomp: Refactor arch register macros to avoid xtensa
special case
  selftests/seccomp: Provide generic syscall setting macro
  selftests/seccomp: mips: Define SYSCALL_NUM_SET macro
  selftests/seccomp: arm: Define SYSCALL_NUM_SET macro
  selftests/seccomp: arm64: Define SYSCALL_NUM_SET macro
  selftests/seccomp: mips: Remove O32-specific macro
  selftests/seccomp: Remove syscall setting #ifdefs
  selftests/seccomp: Convert HAVE_GETREG into ARCH_GETREG/ARCH_SETREG
  selftests/seccomp: Convert REGSET calls into ARCH_GETREG/ARCH_SETREG
  selftests/seccomp: Avoid redundant register flushes
  selftests/seccomp: Remove SYSCALL_NUM_RET_SHARE_REG in favor of
SYSCALL_RET_SET
  selftests/seccomp: powerpc: Fix seccomp return value testing
  selftests/seccomp: powerpc: Set syscall return during ptrace syscall
exit
  selftests/clone3: Avoid OS-defined clone_args
  selftests/seccomp: Use __NR_mknodat instead of __NR_mknod

 .../selftests/clone3/clone3_selftests.h   |  16 +-
 tools/testing/selftests/seccomp/seccomp_bpf.c | 313 ++
 2 files changed, 184 insertions(+), 145 deletions(-)

-- 
2.25.1



[PATCH 01/15] selftests/seccomp: Refactor arch register macros to avoid xtensa special case

2020-09-12 Thread Kees Cook
To avoid an xtensa special-case, refactor all arch register macros to
take the register variable instead of depending on the macro expanding
as a struct member name.

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 97 +--
 1 file changed, 47 insertions(+), 50 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index c5002fc25b00..fef15080b575 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1698,64 +1698,64 @@ TEST_F(TRACE_poke, getpid_runs_normally)
 }
 
 #if defined(__x86_64__)
-# define ARCH_REGS struct user_regs_struct
-# define SYSCALL_NUM   orig_rax
-# define SYSCALL_RET   rax
+# define ARCH_REGS struct user_regs_struct
+# define SYSCALL_NUM(_regs)(_regs).orig_rax
+# define SYSCALL_RET(_regs)(_regs).rax
 #elif defined(__i386__)
-# define ARCH_REGS struct user_regs_struct
-# define SYSCALL_NUM   orig_eax
-# define SYSCALL_RET   eax
+# define ARCH_REGS struct user_regs_struct
+# define SYSCALL_NUM(_regs)(_regs).orig_eax
+# define SYSCALL_RET(_regs)(_regs).eax
 #elif defined(__arm__)
-# define ARCH_REGS struct pt_regs
-# define SYSCALL_NUM   ARM_r7
-# define SYSCALL_RET   ARM_r0
+# define ARCH_REGS struct pt_regs
+# define SYSCALL_NUM(_regs)(_regs).ARM_r7
+# define SYSCALL_RET(_regs)(_regs).ARM_r0
 #elif defined(__aarch64__)
-# define ARCH_REGS struct user_pt_regs
-# define SYSCALL_NUM   regs[8]
-# define SYSCALL_RET   regs[0]
+# define ARCH_REGS struct user_pt_regs
+# define SYSCALL_NUM(_regs)(_regs).regs[8]
+# define SYSCALL_RET(_regs)(_regs).regs[0]
 #elif defined(__riscv) && __riscv_xlen == 64
-# define ARCH_REGS struct user_regs_struct
-# define SYSCALL_NUM   a7
-# define SYSCALL_RET   a0
+# define ARCH_REGS struct user_regs_struct
+# define SYSCALL_NUM(_regs)(_regs).a7
+# define SYSCALL_RET(_regs)(_regs).a0
 #elif defined(__csky__)
-# define ARCH_REGS struct pt_regs
-#if defined(__CSKYABIV2__)
-# define SYSCALL_NUM   regs[3]
-#else
-# define SYSCALL_NUM   regs[9]
-#endif
-# define SYSCALL_RET   a0
+# define ARCH_REGS struct pt_regs
+#  if defined(__CSKYABIV2__)
+#   define SYSCALL_NUM(_regs)  (_regs).regs[3]
+#  else
+#   define SYSCALL_NUM(_regs)  (_regs).regs[9]
+#  endif
+# define SYSCALL_RET(_regs)(_regs).a0
 #elif defined(__hppa__)
-# define ARCH_REGS struct user_regs_struct
-# define SYSCALL_NUM   gr[20]
-# define SYSCALL_RET   gr[28]
+# define ARCH_REGS struct user_regs_struct
+# define SYSCALL_NUM(_regs)(_regs).gr[20]
+# define SYSCALL_RET(_regs)(_regs).gr[28]
 #elif defined(__powerpc__)
-# define ARCH_REGS struct pt_regs
-# define SYSCALL_NUM   gpr[0]
-# define SYSCALL_RET   gpr[3]
+# define ARCH_REGS struct pt_regs
+# define SYSCALL_NUM(_regs)(_regs).gpr[0]
+# define SYSCALL_RET(_regs)(_regs).gpr[3]
 #elif defined(__s390__)
-# define ARCH_REGS s390_regs
-# define SYSCALL_NUM   gprs[2]
-# define SYSCALL_RET   gprs[2]
+# define ARCH_REGS s390_regs
+# define SYSCALL_NUM(_regs)(_regs).gprs[2]
+# define SYSCALL_RET(_regs)(_regs).gprs[2]
 # define SYSCALL_NUM_RET_SHARE_REG
 #elif defined(__mips__)
-# define ARCH_REGS struct pt_regs
-# define SYSCALL_NUM   regs[2]
-# define SYSCALL_SYSCALL_NUM regs[4]
-# define SYSCALL_RET   regs[2]
+# define ARCH_REGS struct pt_regs
+# define SYSCALL_NUM(_regs)(_regs).regs[2]
+# define SYSCALL_SYSCALL_NUM   regs[4]
+# define SYSCALL_RET(_regs)(_regs).regs[2]
 # define SYSCALL_NUM_RET_SHARE_REG
 #elif defined(__xtensa__)
-# define ARCH_REGS struct user_pt_regs
-# define SYSCALL_NUM   syscall
+# define ARCH_REGS struct user_pt_regs
+# define SYSCALL_NUM(_regs)(_regs).syscall
 /*
  * On xtensa syscall return value is in the register
  * a2 of the current window which is not fixed.
  */
-#define SYSCALL_RET(reg) a[(reg).windowbase * 4 + 2]
+#define SYSCALL_RET(_regs) (_regs).a[(_regs).windowbase * 4 + 2]
 #elif defined(__sh__)
-# define ARCH_REGS struct pt_regs
-# define SYSCALL_NUM   gpr[3]
-# define SYSCALL_RET   gpr[0]
+# define ARCH_REGS struct pt_regs
+# define SYSCALL_NUM(_regs)(_regs).gpr[3]
+# define SYSCALL_RET(_regs)(_regs).gpr[0]
 #else
 # error "Do not know how to find your architecture's registers and syscalls"
 #endif
@@ -1804,10 +1804,10 @@ int get_syscall(struct __test_metadata *_metadata, 
pid_t tracee)
 #endif
 
 #if defined(__mips__)
-   if (regs.SYSCALL_NUM == __NR_O32_Linux)
+   if (SYSCALL_NUM(regs) == __NR_O32_Linux)
return regs.SYSCALL_SYSCALL_NUM;
 #endif
-   return regs.SYSCALL_NUM;
+   return SYSCALL_NUM(regs);
 }
 
 /* Architecture-specific syscall changing routine. */
@@ -1830,14 +1830,14 @@ void change_syscall(struct __test_m

[PATCH 03/15] selftests/seccomp: mips: Define SYSCALL_NUM_SET macro

2020-09-12 Thread Kees Cook
Remove the mips special-case in change_syscall().

Signed-off-by: Kees Cook 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 1c83e743bfb1..02a9a6599746 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1742,6 +1742,13 @@ TEST_F(TRACE_poke, getpid_runs_normally)
 # define ARCH_REGS struct pt_regs
 # define SYSCALL_NUM(_regs)(_regs).regs[2]
 # define SYSCALL_SYSCALL_NUM   regs[4]
+# define SYSCALL_NUM_SET(_regs, _nr)   \
+   do {\
+   if ((_regs).regs[2] == __NR_O32_Linux)  \
+   (_regs).regs[4] = _nr;  \
+   else\
+   (_regs).regs[2] = _nr;  \
+   } while (0)
 # define SYSCALL_RET(_regs)(_regs).regs[2]
 # define SYSCALL_NUM_RET_SHARE_REG
 #elif defined(__xtensa__)
@@ -1839,17 +1846,11 @@ void change_syscall(struct __test_metadata *_metadata,
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__powerpc__) || \
defined(__s390__) || defined(__hppa__) || defined(__riscv) || \
-   defined(__xtensa__) || defined(__csky__) || defined(__sh__)
+   defined(__xtensa__) || defined(__csky__) || defined(__sh__) || \
+   defined(__mips__)
{
SYSCALL_NUM_SET(regs, syscall);
}
-#elif defined(__mips__)
-   {
-   if (SYSCALL_NUM(regs) == __NR_O32_Linux)
-   regs.SYSCALL_SYSCALL_NUM = syscall;
-   else
-   SYSCALL_NUM_SET(regs, syscall);
-   }
 
 #elif defined(__arm__)
 # ifndef PTRACE_SET_SYSCALL
-- 
2.25.1



Re: [PATCH v2] selftests/seccomp: fix ptrace tests on powerpc

2020-09-11 Thread Kees Cook
On Fri, Sep 11, 2020 at 03:10:12PM -0300, Thadeu Lima de Souza Cascardo wrote:
> As pointed out by Michael Ellerman, the ptrace ABI on powerpc does not
> allow or require the return code to be set on syscall entry when
> skipping the syscall. It will always return ENOSYS and the return code
> must be set on syscall exit.
> 
> This code does that, behaving more similarly to strace. It still sets
> the return code on entry, which is overridden on powerpc, and it will
> always repeat the same on exit. Also, on powerpc, the errno is not
> inverted, and depends on ccr.so being set.
> 
> This has been tested on powerpc and amd64.

This looks like two fixes in one, so this should be split. :)

>  tools/testing/selftests/seccomp/seccomp_bpf.c | 81 ---
>  1 file changed, 53 insertions(+), 28 deletions(-)
> 
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
> b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 7a6d40286a42..0ddc0846e9c0 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -1837,15 +1837,24 @@ void change_syscall(struct __test_metadata *_metadata,
>  #endif
>  
>   /* If syscall is skipped, change return value. */
> - if (syscall == -1)
> + if (syscall == -1) {
>  #ifdef SYSCALL_NUM_RET_SHARE_REG
>   TH_LOG("Can't modify syscall return on this architecture");
> -
>  #elif defined(__xtensa__)
>   regs.SYSCALL_RET(regs) = result;
> +#elif defined(__powerpc__)
> + /* Error is signaled by CR0 SO bit and error code is positive. 
> */
> + if (result < 0) {
> + regs.SYSCALL_RET = -result;
> + regs.ccr |= 0x1000;
> + } else {
> + regs.SYSCALL_RET = result;
> + regs.ccr &= ~0x1000;
> + }
>  #else
>   regs.SYSCALL_RET = result;
>  #endif
> + }

I'll send a series soon that will include this bit, since I don't want
to collect these kinds of arch-specific things in the functions. (And
the xtensa one went in without my review!)

> +FIXTURE(TRACE_syscall) {
> + struct sock_fprog prog;
> + pid_t tracer, mytid, mypid, parent;
> +};
> +
> +FIXTURE_VARIANT(TRACE_syscall) {
> + /*
> +  * All of the SECCOMP_RET_TRACE behaviors can be tested with either
> +  * SECCOMP_RET_TRACE+PTRACE_CONT or plain ptrace()+PTRACE_SYSCALL.
> +  * This indicates if we should use SECCOMP_RET_TRACE (false), or
> +  * ptrace (true).
> +  */
> + bool use_ptrace;
> +
> + /*
> +  * Some archs (like ppc) only support changing the return code during
> +  * syscall exit when ptrace is used.  As the syscall number might not
> +  * be available anymore during syscall exit, it needs to be saved
> +  * during syscall enter.
> +  */
> + int syscall_nr;

This should be part of the fixture struct, not the variant. 

> +};
> +
> +FIXTURE_VARIANT_ADD(TRACE_syscall, ptrace) {
> + .use_ptrace = true,
> +};
> +
> +FIXTURE_VARIANT_ADD(TRACE_syscall, seccomp) {
> + .use_ptrace = false,
> +};

i.e. if a member isn't initialized in FIXTURE_VARIANT_ADD, it shouldn't
be defined in FIXTURE_VARIANT. :)

> +
>  void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
>  int status, void *args)
>  {
>   int ret, nr;
>   unsigned long msg;
>   static bool entry;
> + FIXTURE_VARIANT(TRACE_syscall) * variant = args;
>  
>   /*
>* The traditional way to tell PTRACE_SYSCALL entry/exit
> @@ -1916,10 +1957,15 @@ void tracer_ptrace(struct __test_metadata *_metadata, 
> pid_t tracee,
>   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
>   : PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
>  
> - if (!entry)
> + if (!entry && !variant)
>   return;
>  
> - nr = get_syscall(_metadata, tracee);
> + if (entry)
> +     nr = get_syscall(_metadata, tracee);
> + else if (variant)
> + nr = variant->syscall_nr;
> + if (variant)
> + variant->syscall_nr = nr;

So, to be clear this is _only_ an issue for the ptrace side of things,
yes? i.e. seccomp's setting of the return value will correct stick?

-- 
Kees Cook


Re: [PATCH] kbuild: preprocess module linker script

2020-09-10 Thread Kees Cook
On Fri, Sep 04, 2020 at 10:31:21PM +0900, Masahiro Yamada wrote:
> There was a request to preprocess the module linker script like we do
> for the vmlinux one (https://lkml.org/lkml/2020/8/21/512).
> 
> The difference between vmlinux.lds and module.lds is that the latter
> is needed for external module builds, thus must be cleaned up by
> 'make mrproper' instead of 'make clean' (also, it must be created by
> 'make modules_prepare').
> 
> You cannot put it in arch/*/kernel/ because 'make clean' descends into
> it. I moved arch/*/kernel/module.lds to arch/*/include/asm/module.lds.h,
> which is included from scripts/module.lds.S.
> 
> scripts/module.lds is fine because 'make clean' keeps all the build
> artifacts under scripts/.
> 
> You can add arch-specific sections in .
> 
> Signed-off-by: Masahiro Yamada 

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH] selftests/seccomp: fix ptrace tests on powerpc

2020-09-08 Thread Kees Cook
On Tue, Jun 30, 2020 at 01:47:39PM -0300, Thadeu Lima de Souza Cascardo wrote:
> As pointed out by Michael Ellerman, the ptrace ABI on powerpc does not
> allow or require the return code to be set on syscall entry when
> skipping the syscall. It will always return ENOSYS and the return code
> must be set on syscall exit.
> 
> This code does that, behaving more similarly to strace. It still sets
> the return code on entry, which is overridden on powerpc, and it will
> always repeat the same on exit. Also, on powerpc, the errno is not
> inverted, and depends on ccr.so being set.
> 
> This has been tested on powerpc and amd64.
> 
> Cc: Michael Ellerman 
> Cc: Kees Cook 
> Signed-off-by: Thadeu Lima de Souza Cascardo 

Yikes, I missed this from a while ago. I apologize for responding so
late!

This appears still unfixed; is that correct?

> ---
>  tools/testing/selftests/seccomp/seccomp_bpf.c | 24 +++
>  1 file changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
> b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 252140a52553..b90a9190ba88 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -1738,6 +1738,14 @@ void change_syscall(struct __test_metadata *_metadata,
>   TH_LOG("Can't modify syscall return on this architecture");
>  #else
>   regs.SYSCALL_RET = result;
> +# if defined(__powerpc__)
> + if (result < 0) {
> + regs.SYSCALL_RET = -result;
> + regs.ccr |= 0x1000;
> + } else {
> + regs.ccr &= ~0x1000;
> + }
> +# endif
>  #endif

Just so I understand correctly: for ppc to "see" this result, it needs
to be both negative AND have this specific register set?

>  
>  #ifdef HAVE_GETREGS
> @@ -1796,6 +1804,7 @@ void tracer_ptrace(struct __test_metadata *_metadata, 
> pid_t tracee,
>   int ret, nr;
>   unsigned long msg;
>   static bool entry;
> + int *syscall_nr = args;
>  
>   /*
>* The traditional way to tell PTRACE_SYSCALL entry/exit
> @@ -1809,10 +1818,15 @@ void tracer_ptrace(struct __test_metadata *_metadata, 
> pid_t tracee,
>   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
>   : PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
>  
> - if (!entry)
> + if (!entry && !syscall_nr)
>   return;
>  
> - nr = get_syscall(_metadata, tracee);
> + if (entry)
> + nr = get_syscall(_metadata, tracee);
> + else
> + nr = *syscall_nr;

This is weird? Shouldn't get_syscall() be modified to do the right thing
here instead of depending on the extra arg?

> + if (syscall_nr)
> + *syscall_nr = nr;
>  
>   if (nr == __NR_getpid)
>   change_syscall(_metadata, tracee, __NR_getppid, 0);
> @@ -1889,9 +1903,10 @@ TEST_F(TRACE_syscall, ptrace_syscall_redirected)
>  
>  TEST_F(TRACE_syscall, ptrace_syscall_errno)
>  {
> + int syscall_nr = -1;
>   /* Swap SECCOMP_RET_TRACE tracer for PTRACE_SYSCALL tracer. */
>   teardown_trace_fixture(_metadata, self->tracer);
> - self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, NULL,
> + self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, 
> _nr,
>  true);
>  
>   /* Tracer should skip the open syscall, resulting in ESRCH. */
> @@ -1900,9 +1915,10 @@ TEST_F(TRACE_syscall, ptrace_syscall_errno)
>  
>  TEST_F(TRACE_syscall, ptrace_syscall_faked)
>  {
> + int syscall_nr = -1;
>   /* Swap SECCOMP_RET_TRACE tracer for PTRACE_SYSCALL tracer. */
>   teardown_trace_fixture(_metadata, self->tracer);
> - self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, NULL,
> + self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, 
> _nr,
>  true);
>  
>   /* Tracer should skip the gettid syscall, resulting fake pid. */
> -- 
> 2.25.1
> 

-- 
Kees Cook


Re: [PATCH 2/2] powerpc/vdso32: link vdso64 with linker

2020-09-01 Thread Kees Cook
I think $subject needs a typo update... vdso32...

On Tue, Sep 01, 2020 at 03:25:23PM -0700, Nick Desaulniers wrote:
> Rather than invoke the compiler as the driver, use the linker. That way
> we can check --orphan-handling=warn support correctly, as cc-ldoption
> was removed in
> commit 055efab3120b ("kbuild: drop support for cc-ldoption").
> 
> Requires dropping the .got section.  I couldn't find how it was used in
> the vdso32.
> 
> Fixes: commit f2af201002a8 ("powerpc/build: vdso linker warning for orphan 
> sections")
> Link: 
> https://lore.kernel.org/lkml/cakwvodnn3wxydjomvnveyd_njwrku3fabwt_bs92duihhyw...@mail.gmail.com/
> Signed-off-by: Nick Desaulniers 
> ---
> Not sure removing .got is a good idea or not.  Otherwise I observe the
> following link error:
> powerpc-linux-gnu-ld: warning: orphan section `.got' from 
> `arch/powerpc/kernel/vdso32/sigtramp.o' being placed in section `.got'
> powerpc-linux-gnu-ld: _GLOBAL_OFFSET_TABLE_ not defined in linker created .got
> powerpc-linux-gnu-ld: final link failed: bad value

If it's like the x86 and arm toolchains, I think you'll be required to
keep .got, but you can assert it to a 0 size, e.g.:

/*
 * Sections that should stay zero sized, which is safer to
 * explicitly check instead of blindly discarding.
 */
.got : {
*(.got)
}
ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")

(and put that at the end of the linker script)


-Kees

> 
> sigtramp.c doesn't mention anything from the GOT AFAICT, and doesn't
> look like it contains relocations that do, so I'm not sure where
> references to _GLOBAL_OFFSET_TABLE_ are coming from.
> 
>  arch/powerpc/kernel/vdso32/Makefile | 7 +--
>  arch/powerpc/kernel/vdso32/vdso32.lds.S | 3 ++-
>  2 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/vdso32/Makefile 
> b/arch/powerpc/kernel/vdso32/Makefile
> index 87ab1152d5ce..611a5951945a 100644
> --- a/arch/powerpc/kernel/vdso32/Makefile
> +++ b/arch/powerpc/kernel/vdso32/Makefile
> @@ -27,6 +27,9 @@ UBSAN_SANITIZE := n
>  ccflags-y := -shared -fno-common -fno-builtin -nostdlib \
>   -Wl,-soname=linux-vdso32.so.1 -Wl,--hash-style=both
>  asflags-y := -D__VDSO32__ -s
> +ldflags-y := -shared -soname linux-vdso32.so.1 \
> + $(call ld-option, --eh-frame-hdr) \
> + $(call ld-option, --orphan-handling=warn) -T
>  
>  obj-y += vdso32_wrapper.o
>  extra-y += vdso32.lds
> @@ -49,8 +52,8 @@ $(obj-vdso32): %.o: %.S FORCE
>   $(call if_changed_dep,vdso32as)
>  
>  # actual build commands
> -quiet_cmd_vdso32ld = VDSO32L $@
> -  cmd_vdso32ld = $(VDSOCC) $(c_flags) $(CC32FLAGS) -o $@ $(call 
> cc-ldoption, -Wl$(comma)--orphan-handling=warn) -Wl,-T$(filter %.lds,$^) 
> $(filter %.o,$^)
> +quiet_cmd_vdso32ld = LD  $@
> +  cmd_vdso32ld = $(cmd_ld)
>  quiet_cmd_vdso32as = VDSO32A $@
>cmd_vdso32as = $(VDSOCC) $(a_flags) $(CC32FLAGS) -c -o $@ $<
>  
> diff --git a/arch/powerpc/kernel/vdso32/vdso32.lds.S 
> b/arch/powerpc/kernel/vdso32/vdso32.lds.S
> index 4c985467a668..0ccdebad18b8 100644
> --- a/arch/powerpc/kernel/vdso32/vdso32.lds.S
> +++ b/arch/powerpc/kernel/vdso32/vdso32.lds.S
> @@ -61,7 +61,6 @@ SECTIONS
>   .fixup  : { *(.fixup) }
>  
>   .dynamic: { *(.dynamic) }   :text   :dynamic
> - .got: { *(.got) }   :text
>   .plt: { *(.plt) }
>  
>   _end = .;
> @@ -108,7 +107,9 @@ SECTIONS
>   .debug_varnames  0 : { *(.debug_varnames) }
>  
>   /DISCARD/   : {
> + *(.got)
>   *(.note.GNU-stack)
> + *(.branch_lt)
>   *(.data .data.* .gnu.linkonce.d.* .sdata*)
>   *(.bss .sbss .dynbss .dynsbss)
>   *(.glink .iplt .plt .rela*)
> -- 
> 2.28.0.402.g5ffc5be6b7-goog
> 

-- 
Kees Cook


Re: [PATCH 05/10] lkdtm: disable set_fs-based tests for !CONFIG_SET_FS

2020-09-01 Thread Kees Cook
On Sat, Aug 29, 2020 at 11:24:06AM +0200, Christoph Hellwig wrote:
> On Thu, Aug 27, 2020 at 11:06:28AM -0700, Linus Torvalds wrote:
> > On Thu, Aug 27, 2020 at 8:00 AM Christoph Hellwig  wrote:
> > >
> > > Once we can't manipulate the address limit, we also can't test what
> > > happens when the manipulation is abused.
> > 
> > Just remove these tests entirely.
> > 
> > Once set_fs() doesn't exist on x86, the tests no longer make any sense
> > what-so-ever, because test coverage will be basically zero.
> > 
> > So don't make the code uglier just to maintain a fiction that
> > something is tested when it isn't really.
> 
> Sure fine with me unless Kees screams.

To clarify: if any of x86, arm64, arm, powerpc, riscv, and s390 are
using set_fs(), I want to keep this test. "ugly" is fine in lkdtm. :)

-- 
Kees Cook


Re: [PATCH 05/10] lkdtm: disable set_fs-based tests for !CONFIG_SET_FS

2020-09-01 Thread Kees Cook
On Sat, Aug 29, 2020 at 11:24:06AM +0200, Christoph Hellwig wrote:
> On Thu, Aug 27, 2020 at 11:06:28AM -0700, Linus Torvalds wrote:
> > On Thu, Aug 27, 2020 at 8:00 AM Christoph Hellwig  wrote:
> > >
> > > Once we can't manipulate the address limit, we also can't test what
> > > happens when the manipulation is abused.
> > 
> > Just remove these tests entirely.
> > 
> > Once set_fs() doesn't exist on x86, the tests no longer make any sense
> > what-so-ever, because test coverage will be basically zero.
> > 
> > So don't make the code uglier just to maintain a fiction that
> > something is tested when it isn't really.
> 
> Sure fine with me unless Kees screams.

If we don't have set_fs, we don't need the tests. :)

-- 
Kees Cook


Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code

2020-08-18 Thread Kees Cook
On Tue, Aug 18, 2020 at 10:00:16PM +0200, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 12:59:05PM -0700, Kees Cook wrote:
> > > I didn't see a problem bisecting, do you have something particular in
> > > mind?
> > 
> > Oh, I misunderstood this patch to be a fix for compilation. Is this just
> > a correctness fix?
> 
> It prepares for using the definition from assembly, which is done in
> the next patch.

Ah! Okay; thanks.

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code

2020-08-18 Thread Kees Cook
On Tue, Aug 18, 2020 at 09:55:39PM +0200, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 12:44:49PM -0700, Kees Cook wrote:
> > On Mon, Aug 17, 2020 at 09:32:09AM +0200, Christoph Hellwig wrote:
> > > For 64-bit the only hing missing was a strategic _AC, and for 32-bit we
> > 
> > typo: thing
> > 
> > > need to use __PAGE_OFFSET instead of PAGE_OFFSET in the TASK_SIZE
> > > definition to escape the explicit unsigned long cast.  This just works
> > > because __PAGE_OFFSET is defined using _AC itself and thus never needs
> > > the cast anyway.
> > 
> > Shouldn't this be folded into the prior patch so there's no bisection
> > problem?
> 
> I didn't see a problem bisecting, do you have something particular in
> mind?

Oh, I misunderstood this patch to be a fix for compilation. Is this just
a correctness fix?

-- 
Kees Cook


Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops

2020-08-18 Thread Kees Cook
On Tue, Aug 18, 2020 at 09:54:46PM +0200, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 12:39:34PM -0700, Kees Cook wrote:
> > On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> > > default_file_splice_write is the last piece of generic code that uses
> > > set_fs to make the uaccess routines operate on kernel pointers.  It
> > > implements a "fallback loop" for splicing from files that do not actually
> > > provide a proper splice_read method.  The usual file systems and other
> > > high bandwith instances all provide a ->splice_read, so this just removes
> > > support for various device drivers and procfs/debugfs files.  If splice
> > > support for any of those turns out to be important it can be added back
> > > by switching them to the iter ops and using generic_file_splice_read.
> > > 
> > > Signed-off-by: Christoph Hellwig 
> > 
> > This seems a bit disruptive? I feel like this is going to make fuzzers
> > really noisy (e.g. trinity likes to splice random stuff out of /sys and
> > /proc).
> 
> Noisy in the sence of triggering the pr_debug or because they can't
> handle -EINVAL?

Well, maybe both? I doubt much _expects_ to be using splice, so I'm fine
with that, but it seems weird not to have a fall-back, especially if
something would like to splice a file out of there. But, I'm not opposed
to the change, it just seems like it might cause pain down the road.

-- 
Kees Cook


Re: [PATCH 09/11] x86: remove address space overrides using set_fs()

2020-08-18 Thread Kees Cook
On Mon, Aug 17, 2020 at 09:32:10AM +0200, Christoph Hellwig wrote:
> Stop providing the possibility to override the address space using
> set_fs() now that there is no need for that any more.  To properly
> handle the TASK_SIZE_MAX checking for 4 vs 5-level page tables on
> x86 a new alternative is introduced, which just like the one in
> entry_64.S has to use the hardcoded virtual address bits to escape
> the fact that TASK_SIZE_MAX isn't actually a constant when 5-level
> page tables are enabled.
> 
> Signed-off-by: Christoph Hellwig 

Awesome. :)

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code

2020-08-18 Thread Kees Cook
On Mon, Aug 17, 2020 at 09:32:09AM +0200, Christoph Hellwig wrote:
> For 64-bit the only hing missing was a strategic _AC, and for 32-bit we

typo: thing

> need to use __PAGE_OFFSET instead of PAGE_OFFSET in the TASK_SIZE
> definition to escape the explicit unsigned long cast.  This just works
> because __PAGE_OFFSET is defined using _AC itself and thus never needs
> the cast anyway.

Shouldn't this be folded into the prior patch so there's no bisection
problem?

-Kees

> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/x86/include/asm/page_32_types.h | 4 ++--
>  arch/x86/include/asm/page_64_types.h | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/page_32_types.h 
> b/arch/x86/include/asm/page_32_types.h
> index 26236925fb2c36..f462895a33e452 100644
> --- a/arch/x86/include/asm/page_32_types.h
> +++ b/arch/x86/include/asm/page_32_types.h
> @@ -44,8 +44,8 @@
>  /*
>   * User space process size: 3GB (default).
>   */
> -#define IA32_PAGE_OFFSET PAGE_OFFSET
> -#define TASK_SIZEPAGE_OFFSET
> +#define IA32_PAGE_OFFSET __PAGE_OFFSET
> +#define TASK_SIZE__PAGE_OFFSET
>  #define TASK_SIZE_LOWTASK_SIZE
>  #define TASK_SIZE_MAXTASK_SIZE
>  #define DEFAULT_MAP_WINDOW   TASK_SIZE
> diff --git a/arch/x86/include/asm/page_64_types.h 
> b/arch/x86/include/asm/page_64_types.h
> index 996595c9897e0a..838515daf87b36 100644
> --- a/arch/x86/include/asm/page_64_types.h
> +++ b/arch/x86/include/asm/page_64_types.h
> @@ -76,7 +76,7 @@
>   *
>   * With page table isolation enabled, we map the LDT in ... [stay tuned]
>   */
> -#define TASK_SIZE_MAX((1UL << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
> +#define TASK_SIZE_MAX((_AC(1,UL) << __VIRTUAL_MASK_SHIFT) - 
> PAGE_SIZE)
>  
>  #define DEFAULT_MAP_WINDOW   ((1UL << 47) - PAGE_SIZE)
>  
> -- 
> 2.28.0
> 

-- 
Kees Cook


Re: [PATCH 05/11] test_bitmap: skip user bitmap tests for !CONFIG_SET_FS

2020-08-18 Thread Kees Cook
On Mon, Aug 17, 2020 at 09:32:06AM +0200, Christoph Hellwig wrote:
> We can't run the tests for userspace bitmap parsing if set_fs() doesn't
> exist.
> 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH 04/11] uaccess: add infrastructure for kernel builds with set_fs()

2020-08-18 Thread Kees Cook
On Mon, Aug 17, 2020 at 09:32:05AM +0200, Christoph Hellwig wrote:
> Add a CONFIG_SET_FS option that is selected by architecturess that
> implement set_fs, which is all of them initially.  If the option is not
> set stubs for routines related to overriding the address space are
> provided so that architectures can start to opt out of providing set_fs.
> 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops

2020-08-18 Thread Kees Cook
On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> default_file_splice_write is the last piece of generic code that uses
> set_fs to make the uaccess routines operate on kernel pointers.  It
> implements a "fallback loop" for splicing from files that do not actually
> provide a proper splice_read method.  The usual file systems and other
> high bandwith instances all provide a ->splice_read, so this just removes
> support for various device drivers and procfs/debugfs files.  If splice
> support for any of those turns out to be important it can be added back
> by switching them to the iter ops and using generic_file_splice_read.
> 
> Signed-off-by: Christoph Hellwig 

This seems a bit disruptive? I feel like this is going to make fuzzers
really noisy (e.g. trinity likes to splice random stuff out of /sys and
/proc).

Conceptually, though:

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH 02/11] fs: don't allow kernel reads and writes without iter ops

2020-08-18 Thread Kees Cook
On Mon, Aug 17, 2020 at 09:32:03AM +0200, Christoph Hellwig wrote:
> Don't allow calling ->read or ->write with set_fs as a preparation for
> killing off set_fs.  All the instances that we use kernel_read/write on
> are using the iter ops already.
> 
> If a file has both the regular ->read/->write methods and the iter
> variants those could have different semantics for messed up enough
> drivers.  Also fails the kernel access to them in that case.
> 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH 01/11] mem: remove duplicate ops for /dev/zero and /dev/null

2020-08-18 Thread Kees Cook
On Mon, Aug 17, 2020 at 09:32:02AM +0200, Christoph Hellwig wrote:
> There is no good reason to implement both the traditional ->read and
> ->write as well as the iter based ops.  So implement just the iter
> based ones.
> 
> Suggested-by: Al Viro 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH 06/11] lkdtm: disable set_fs-based tests for !CONFIG_SET_FS

2020-08-18 Thread Kees Cook
On Mon, Aug 17, 2020 at 09:32:07AM +0200, Christoph Hellwig wrote:
> Once we can't manipulate the address limit, we also can't test what
> happens when the manipulation is abused.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/misc/lkdtm/bugs.c | 2 ++
>  drivers/misc/lkdtm/core.c | 4 
>  drivers/misc/lkdtm/usercopy.c | 2 ++
>  3 files changed, 8 insertions(+)
> 
> diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
> index 4dfbfd51bdf774..66f1800b1cb82d 100644
> --- a/drivers/misc/lkdtm/bugs.c
> +++ b/drivers/misc/lkdtm/bugs.c
> @@ -312,6 +312,7 @@ void lkdtm_CORRUPT_LIST_DEL(void)
>   pr_err("list_del() corruption not detected!\n");
>  }
>  
> +#ifdef CONFIG_SET_FS
>  /* Test if unbalanced set_fs(KERNEL_DS)/set_fs(USER_DS) check exists. */
>  void lkdtm_CORRUPT_USER_DS(void)
>  {
> @@ -321,6 +322,7 @@ void lkdtm_CORRUPT_USER_DS(void)
>   /* Make sure we do not keep running with a KERNEL_DS! */
>   force_sig(SIGKILL);
>  }
> +#endif

Please let the test defined, but it should XFAIL with a message about
the CONFIG (see similar ifdefs in lkdtm).

>  /* Test that VMAP_STACK is actually allocating with a leading guard page */
>  void lkdtm_STACK_GUARD_PAGE_LEADING(void)
> diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
> index a5e344df916632..aae08b33a7ee2a 100644
> --- a/drivers/misc/lkdtm/core.c
> +++ b/drivers/misc/lkdtm/core.c
> @@ -112,7 +112,9 @@ static const struct crashtype crashtypes[] = {
>   CRASHTYPE(CORRUPT_STACK_STRONG),
>   CRASHTYPE(CORRUPT_LIST_ADD),
>   CRASHTYPE(CORRUPT_LIST_DEL),
> +#ifdef CONFIG_SET_FS
>   CRASHTYPE(CORRUPT_USER_DS),
> +#endif
>   CRASHTYPE(STACK_GUARD_PAGE_LEADING),
>   CRASHTYPE(STACK_GUARD_PAGE_TRAILING),
>   CRASHTYPE(UNSET_SMEP),
> @@ -172,7 +174,9 @@ static const struct crashtype crashtypes[] = {
>   CRASHTYPE(USERCOPY_STACK_FRAME_FROM),
>   CRASHTYPE(USERCOPY_STACK_BEYOND),
>   CRASHTYPE(USERCOPY_KERNEL),
> +#ifdef CONFIG_SET_FS
>   CRASHTYPE(USERCOPY_KERNEL_DS),
> +#endif
>   CRASHTYPE(STACKLEAK_ERASING),
>   CRASHTYPE(CFI_FORWARD_PROTO),

Then none of these are needed.

>  #ifdef CONFIG_X86_32

Hmpf, this ifdef was missed in ae56942c1474 ("lkdtm: Make arch-specific
tests always available"). I will fix that.

> diff --git a/drivers/misc/lkdtm/usercopy.c b/drivers/misc/lkdtm/usercopy.c
> index b833367a45d053..4b632fe79ab6bb 100644
> --- a/drivers/misc/lkdtm/usercopy.c
> +++ b/drivers/misc/lkdtm/usercopy.c
> @@ -325,6 +325,7 @@ void lkdtm_USERCOPY_KERNEL(void)
>   vm_munmap(user_addr, PAGE_SIZE);
>  }
>  
> +#ifdef CONFIG_SET_FS
>  void lkdtm_USERCOPY_KERNEL_DS(void)
>  {
>   char __user *user_ptr =
> @@ -339,6 +340,7 @@ void lkdtm_USERCOPY_KERNEL_DS(void)
>   pr_err("copy_to_user() to noncanonical address succeeded!?\n");
>   set_fs(old_fs);
>  }
> +#endif

(Same here, please.)

>  
>  void __init lkdtm_usercopy_init(void)
>  {
> -- 
> 2.28.0
> 

-- 
Kees Cook


Re: [PATCH 07/11] x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h

2020-08-18 Thread Kees Cook
On Mon, Aug 17, 2020 at 09:32:08AM +0200, Christoph Hellwig wrote:
> At least for 64-bit this moves them closer to some of the defines
> they are based on, and it prepares for using the TASK_SIZE_MAX
> definition from assembly.
> 
> Signed-off-by: Christoph Hellwig 

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH 0/8] scsi: convert tasklets to use new tasklet_setup()

2020-08-17 Thread Kees Cook
On Mon, Aug 17, 2020 at 07:41:58AM -0700, James Bottomley wrote:
> On Mon, 2020-08-17 at 14:24 +0530, Allen Pais wrote:
> > From: Allen Pais 
> > 
> > Commit 12cc923f1ccc ("tasklet: Introduce new initialization API")'
> > introduced a new tasklet initialization API. This series converts 
> > all the scsi drivers to use the new tasklet_setup() API
> 
> I've got to say I agree with Jens, this was a silly obfuscation:
> 
> +#define from_tasklet(var, callback_tasklet, tasklet_fieldname) \
> +   container_of(callback_tasklet, typeof(*var), tasklet_fieldname)
> 
> Just use container_of directly since we all understand what it does.

But then the lines get really long, wrapped, etc. This is what the
timer_struct conversion did too (added a container_of wrapper), so I
think it makes sense here too.

-- 
Kees Cook


Re: [Latest Git kernel/Linux-next kernel] Xorg doesn't start after the seccomp updates v5.9-rc1

2020-08-07 Thread Kees Cook
On Fri, Aug 07, 2020 at 04:45:14PM +0200, Christian Zigotzky wrote:
> But Xorg works on Ubuntu 10.04.4 (PowerPC 32-bit), openSUSE Tumbleweed
> 20190722 PPC64 and on Fedora 27 PPC64 with the latest Git kernel.
> 
> I bisected today [4].
> 
> Result: net/scm: Regularize compat handling of scm_detach_fds()
> (c0029de50982c1fb215330a5f9d433cec0cfd8cc) [5] is the first bad commit.
> 
> This commit has been merged with the seccomp updates v5.9-rc1 on 2020-08-04
> 14:11:08 -0700 [1]. Since these updates, Xorg doesn't start anymore on some
> Linux distributions.

Hi! Thanks for bisecting; yes, sorry for the trouble (I'm still trying
to understand why my compat tests _passed_...). Regardless, can you try
this patch:

https://lore.kernel.org/lkml/20200807173609.GJ4402@mussarela/

-- 
Kees Cook


Re: [PATCH v2 5/5] powerpc: Add LKDTM test to hijack a patch mapping

2020-07-14 Thread Kees Cook
wake_up_process(patching_kthrd);
> +
> + addr = offset_in_page(patch_site) | 
> read_cpu_patching_addr(patching_cpu);
> +
> + pr_info("starting hijacker_cpu=%d\n", hijacker_cpu);
> + for (attempts = 0; attempts < 10; ++attempts) {
> + /* Use __put_user to catch faults without an Oops */
> + hijacked = !__put_user(0xbad00bad, (unsigned int *)addr);
> +
> + if (hijacked) {
> + if (kthread_stop(patching_kthrd))
> + goto out;
> + break;
> + }
> + }
> + pr_info("hijack attempts: %d\n", attempts);
> +
> + if (hijacked) {
> + if (*(unsigned int *)READ_ONCE(patch_site) == 0xbad00bad)
> + pr_err("overwrote kernel text\n");
> + /*
> +  * There are window conditions where the hijacker cpu manages to
> +  * write to the patch site but the site gets overwritten again 
> by
> +  * the patching cpu. We still consider that a "successful" 
> hijack
> +  * since the hijacker cpu did not fault on the write.
> +  */
> + pr_err("FAIL: wrote to another cpu's patching area\n");
> + } else {
> + kthread_stop(patching_kthrd);
> + }
> +
> +out:
> + /* Restore the original insn for any future lkdtm tests */
> +     patch_instruction(patch_site, original_insn);

Can this test be done for x86's instruction patching too?

> +}
> +
> +#else
> +
> +void lkdtm_HIJACK_PATCH(void)
> +{
> + if (!IS_ENABLED(CONFIG_PPC))
> + pr_err("XFAIL: this test is powerpc-only\n");
> + if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
> + pr_err("XFAIL: this test requires CONFIG_STRICT_KERNEL_RWX\n");
> +}
> +
> +#endif /* CONFIG_PPC && CONFIG_STRICT_KERNEL_RWX */
> +
>  void __init lkdtm_perms_init(void)
>  {
>   /* Make sure we can write to __ro_after_init values during __init */
> -- 
> 2.27.0

Otherwise, looks good!

-- 
Kees Cook


Re: [PATCH v5 06/12] arch: xtensa: add linker section for KUnit test suites

2020-06-26 Thread Kees Cook
On Fri, Jun 26, 2020 at 02:09:11PM -0700, Brendan Higgins wrote:
> Add a linker section to xtensa where KUnit can put references to its
> test suites. This patch is an early step in transitioning to dispatching
> all KUnit tests from a centralized executor rather than having each as
> its own separate late_initcall.
> 
> Signed-off-by: Brendan Higgins 
> ---
>  arch/xtensa/kernel/vmlinux.lds.S | 4 

If you ever find yourself modifying multiple arch linker scripts for a
series, something has gone wrong. ;)

-- 
Kees Cook


Re: [PATCH v5 01/12] vmlinux.lds.h: add linker section for KUnit test suites

2020-06-26 Thread Kees Cook
On Fri, Jun 26, 2020 at 02:09:06PM -0700, Brendan Higgins wrote:
> Add a linker section where KUnit can put references to its test suites.
> This patch is the first step in transitioning to dispatching all KUnit
> tests from a centralized executor rather than having each as its own
> separate late_initcall.
> 
> Co-developed-by: Iurii Zaikin 
> Signed-off-by: Iurii Zaikin 
> Signed-off-by: Brendan Higgins 
> Reviewed-by: Stephen Boyd 
> ---
>  include/asm-generic/vmlinux.lds.h | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/include/asm-generic/vmlinux.lds.h 
> b/include/asm-generic/vmlinux.lds.h
> index db600ef218d7d..4f9b036fc9616 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -881,6 +881,13 @@
>   KEEP(*(.con_initcall.init)) \
>   __con_initcall_end = .;
>  
> +/* Alignment must be consistent with (kunit_suite *) in include/kunit/test.h 
> */

Nit on naming:

> +#define KUNIT_TEST_SUITES\

I would call this KUNIT_TABLE to maintain the same names as other things
of this nature.

> + . = ALIGN(8);   \
> + __kunit_suites_start = .;   \
> + KEEP(*(.kunit_test_suites)) \
> + __kunit_suites_end = .;
> +
>  #ifdef CONFIG_BLK_DEV_INITRD
>  #define INIT_RAM_FS  \
>   . = ALIGN(4);   \
> @@ -1056,6 +1063,7 @@
>   INIT_CALLS  \
>   CON_INITCALL\
>   INIT_RAM_FS \
> + KUNIT_TEST_SUITES   \
>   }

Nack: this must be in INIT_DATA, not in INIT_DATA_SECTION. Not all
architectures use the INIT_DATA_SECTION macro (e.g. arm64), but everything
uses INIT_DATA.


-- 
Kees Cook


Re: [PATCH 17/17] arch: rename copy_thread_tls() back to copy_thread()

2020-06-22 Thread Kees Cook
On Tue, Jun 23, 2020 at 01:43:26AM +0200, Christian Brauner wrote:
> Now that HAVE_COPY_THREAD_TLS has been removed, rename copy_thread_tls()
> back simply copy_thread(). It's a simpler name, and doesn't imply that only
> tls is copied here. This finishes an outstanding chunk of internal process
> creation work since we've added clone3().
> [...]
> -copy_thread_tls(unsigned long clone_flags, unsigned long user_stack_base,
> +copy_thread(unsigned long clone_flags, unsigned long user_stack_base,
>   unsigned long user_stack_size, struct task_struct *p,
>   unsigned long tls)

Maybe clean up the arg indentation too? I'm not sure how strongly people
feel about that, but I think it'd be nice.

Either way:

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH 16/17] arch: remove HAVE_COPY_THREAD_TLS

2020-06-22 Thread Kees Cook
On Tue, Jun 23, 2020 at 01:43:25AM +0200, Christian Brauner wrote:
> All architectures support copy_thread_tls() now, so remove the legacy
> copy_thread() function and the HAVE_COPY_THREAD_TLS config option. Everyone
> uses the same process creation calling convention based on
> copy_thread_tls() and struct kernel_clone_args. This will make it easier to
> maintain the core process creation code under kernel/, simplifies the
> callpaths and makes the identical for all architectures.
> [...]
> Signed-off-by: Christian Brauner 

Massive CC list. ;) linux-arch@ may be sufficient.

Reviewed-by: Kees Cook 

-- 
Kees Cook


[PATCH] powerpc/mm: Fix typo in IS_ENABLED()

2020-06-05 Thread Kees Cook
From: Joe Perches 

IS_ENABLED() matches names exactly, so the missing "CONFIG_" prefix
means this code would never be built.

Also fixes a missing newline in pr_warn().

Signed-off-by: Joe Perches 
Link: 
https://lore.kernel.org/lkml/b08611018fdb6d88757c6008a5c02fa0e07b32fb.ca...@perches.com
Signed-off-by: Kees Cook 
---
 arch/powerpc/mm/book3s64/hash_utils.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/hash_utils.c 
b/arch/powerpc/mm/book3s64/hash_utils.c
index 8ed2411c3f39..cf2e1b06e5d4 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -660,11 +660,10 @@ static void __init htab_init_page_sizes(void)
 * Pick a size for the linear mapping. Currently, we only
 * support 16M, 1M and 4K which is the default
 */
-   if (IS_ENABLED(STRICT_KERNEL_RWX) &&
+   if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) &&
(unsigned long)_stext % 0x100) {
if (mmu_psize_defs[MMU_PAGE_16M].shift)
-   pr_warn("Kernel not 16M aligned, "
-   "disabling 16M linear map alignment");
+   pr_warn("Kernel not 16M aligned, disabling 16M 
linear map alignment\n");
    aligned = false;
}
 
-- 
2.25.1


-- 
Kees Cook


Re: [PATCH] pwm: Add missing "CONFIG_" prefix

2020-06-04 Thread Kees Cook
On Wed, Jun 03, 2020 at 04:04:31PM -0700, Joe Perches wrote:
> On Wed, 2020-06-03 at 15:40 -0700, Kees Cook wrote:
> > The IS_ENABLED() use was missing the CONFIG_ prefix which would have
> > lead to skipping this code.
> > 
> > Fixes: 3ad1f3a33286 ("pwm: Implement some checks for lowlevel drivers")
> > Signed-off-by: Kees Cook 
> > ---
> >  drivers/pwm/core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pwm/core.c b/drivers/pwm/core.c
> > index 9973c442b455..6b3cbc0490c6 100644
> > --- a/drivers/pwm/core.c
> > +++ b/drivers/pwm/core.c
> > @@ -121,7 +121,7 @@ static int pwm_device_request(struct pwm_device *pwm, 
> > const char *label)
> > pwm->chip->ops->get_state(pwm->chip, pwm, >state);
> > trace_pwm_get(pwm, >state);
> >  
> > -   if (IS_ENABLED(PWM_DEBUG))
> > +   if (IS_ENABLED(CONFIG_PWM_DEBUG))
> > pwm->last = pwm->state;
> > }
> >  
> > -- 
> > 2.25.1
> > 
> 
> more odd uses (mostly in comments)
> 
> $ git grep -P -oh '\bIS_ENABLED\s*\(\s*\w+\s*\)'| \
>   sed -r 's/\s+//g'| \
>   grep -v '(CONFIG_' | \
>   sort | uniq -c | sort -rn
>   7 IS_ENABLED(DEBUG)
>   4 IS_ENABLED(DRM_I915_SELFTEST)
>   4 IS_ENABLED(cfg)
>   2 IS_ENABLED(opt_name)
>   2 IS_ENABLED(DEBUG_PRINT_TRIE_GRAPHVIZ)
>   2 IS_ENABLED(config)
>   2 IS_ENABLED(cond)
>   2 IS_ENABLED(__BIG_ENDIAN)
>   1 IS_ENABLED(x)
>   1 IS_ENABLED(STRICT_KERNEL_RWX)
>   1 IS_ENABLED(PWM_DEBUG)
>   1 IS_ENABLED(option)
>   1 IS_ENABLED(ETHTOOL_NETLINK)
>   1 IS_ENABLED(DEBUG_RANDOM_TRIE)
>   1 IS_ENABLED(DEBUG_CHACHA20POLY1305_SLOW_CHUNK_TEST)
> 
> STRICT_KERNEL_RWX is misused here in ppc
> 
> ---
> 
> Fix pr_warn without newline too.
> 
>  arch/powerpc/mm/book3s64/hash_utils.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c 
> b/arch/powerpc/mm/book3s64/hash_utils.c
> index 51e3c15f7aff..dd60c5f2b991 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -660,11 +660,10 @@ static void __init htab_init_page_sizes(void)
>* Pick a size for the linear mapping. Currently, we only
>* support 16M, 1M and 4K which is the default
>*/
> - if (IS_ENABLED(STRICT_KERNEL_RWX) &&
> + if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) &&
>   (unsigned long)_stext % 0x100) {
>   if (mmu_psize_defs[MMU_PAGE_16M].shift)
> - pr_warn("Kernel not 16M aligned, "
> -     "disabling 16M linear map alignment");
> + pr_warn("Kernel not 16M aligned, disabling 16M 
> linear map alignment\n");
>   aligned = false;
>   }

Joe, I was going to send all of the fixes for these issues, but your
patch doesn't have a SoB. Shall I add one for the above patch?

-- 
Kees Cook


Re: [PATCH] pwm: Add missing "CONFIG_" prefix

2020-06-03 Thread Kees Cook
On Wed, Jun 03, 2020 at 04:04:31PM -0700, Joe Perches wrote:
> more odd uses (mostly in comments)
> 
> $ git grep -P -oh '\bIS_ENABLED\s*\(\s*\w+\s*\)'| \
>   sed -r 's/\s+//g'| \
>   grep -v '(CONFIG_' | \
>   sort | uniq -c | sort -rn

I think a missed a bunch because my grep was messy. :) This is much
easier to scan.

>   7 IS_ENABLED(DEBUG)
>   4 IS_ENABLED(cfg)
>   2 IS_ENABLED(opt_name)
>   2 IS_ENABLED(DEBUG_PRINT_TRIE_GRAPHVIZ)
>   2 IS_ENABLED(config)
>   2 IS_ENABLED(cond)
>   2 IS_ENABLED(__BIG_ENDIAN)
>   1 IS_ENABLED(x)
>   1 IS_ENABLED(PWM_DEBUG)
>   1 IS_ENABLED(option)
>   1 IS_ENABLED(DEBUG_RANDOM_TRIE)
>   1 IS_ENABLED(DEBUG_CHACHA20POLY1305_SLOW_CHUNK_TEST)

These seem to be "as expected".

>   4 IS_ENABLED(DRM_I915_SELFTEST)
>   1 IS_ENABLED(STRICT_KERNEL_RWX)
>   1 IS_ENABLED(ETHTOOL_NETLINK)

But these are not.

> 
> STRICT_KERNEL_RWX is misused here in ppc
> 
> ---
> 
> Fix pr_warn without newline too.
> 
>  arch/powerpc/mm/book3s64/hash_utils.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c 
> b/arch/powerpc/mm/book3s64/hash_utils.c
> index 51e3c15f7aff..dd60c5f2b991 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -660,11 +660,10 @@ static void __init htab_init_page_sizes(void)
>* Pick a size for the linear mapping. Currently, we only
>* support 16M, 1M and 4K which is the default
>*/
> - if (IS_ENABLED(STRICT_KERNEL_RWX) &&
> + if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) &&
>   (unsigned long)_stext % 0x100) {
>   if (mmu_psize_defs[MMU_PAGE_16M].shift)
> - pr_warn("Kernel not 16M aligned, "
> - "disabling 16M linear map alignment");
> + pr_warn("Kernel not 16M aligned, disabling 16M 
> linear map alignment\n");
>   aligned = false;
>   }

Reviewed-by: Kees Cook 


-- 
Kees Cook


Re: [Intel-gfx] [PATCH 06/13] ocfs2: use new sysctl subdir helper register_sysctl_subdir()

2020-05-29 Thread Kees Cook
On Fri, May 29, 2020 at 11:49:12AM +, Luis Chamberlain wrote:
> Yikes, sense, you're right. Nope, I left the random config tests to
> 0day. Will fix, thanks!

Yeah, I do the same for randconfig, but I always do an "allmodconfig"
build before sending stuff. It's a good smoke test.

-- 
Kees Cook


  1   2   3   4   5   6   7   >