from:"Kees Cook"

Re: [PATCH] efi: pstore: Allow dynamic initialization based on module parameter

2024-02-01 Thread Kees Cook

On Wed, 03 Jan 2024 15:40:32 -0300, Guilherme G. Piccoli wrote:
> The efi-pstore module parameter "pstore_disable" warrants that users
> are able to deactivate such backend. There is also a Kconfig option
> for the default value of this parameter. It was originally added due
> to some bad UEFI FW implementations that could break with many variables
> written.
> 
> Some distros (such as Arch Linux) set this in their config file still
> nowadays. And once it is set, even being a writable module parameter,
> there is effectively no way to make use of efi-pstore anymore.
> If "pstore_disable" is set to true, the init function of the module exits
> early and is never called again after the initcall processing.
> 
> [...]

Applied to for-next/pstore, thanks!

[1/1] efi: pstore: Allow dynamic initialization based on module parameter
  https://git.kernel.org/kees/c/c3f849caf81b

Take care,

-- 
Kees Cook

Re: [RFC PATCH v2 1/9] pstore: move pstore creator id, section type and record struct to common header

2023-09-25 Thread Kees Cook

On Mon, Sep 25, 2023 at 03:44:18PM +0800, Shuai Xue wrote:
> Move pstore creator id, section type and record struct to the common
> header, so that it can be use by MCE and GHES driver.

I would prefer this was not in the pstore header -- this is a backend
detail that should stay in backend headers.

-Kees

-- 
Kees Cook

Re: [RFC] random: UEFI RNG input is bootloader randomness

2019-09-28 Thread Kees Cook

On Sat, Sep 28, 2019 at 12:14:28PM +0200, Dominik Brodowski wrote:
> Depending on RANDOM_TRUST_BOOTLOADER, bootloader-provided randomness
> is credited as entropy. As the UEFI seeding entropy pool is seeded by
> the UEFI firmware/bootloader, add its content as bootloader randomness.
> 
> Note that this UEFI (v2.4 or newer) feature is currently only
> implemented for EFI stub booting on ARM, and further note that
> RANDOM_TRUST_BOOTLOADER must only be enabled if there indeed is
> sufficient trust in the bootloader _and_ its source of randomness.
> 
> Signed-off-by: Dominik Brodowski 
> Cc: Ard Biesheuvel 
> Cc: Hsin-Yi Wang 
> Cc: Stephen Boyd 
> Cc: Rob Herring 
> Cc: Theodore Ts'o 
> Cc: Lee, Chun-Yi 

Reviewed-by: Kees Cook 

-Kees

> 
> ---
> 
> Untested patch, as efi_random_get_seed() is only hooked up on ARM,
> and the firmware on my old x86 laptop only has UEFI v2.31 anyway.
> 
> Thanks,
>   Dominik
> 
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 8f1ab04f6743..db0bffce754e 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -545,7 +545,7 @@ int __init efi_config_parse_tables(void *config_tables, 
> int count, int sz,
> sizeof(*seed) + size);
>   if (seed != NULL) {
>   pr_notice("seeding entropy pool\n");
> - add_device_randomness(seed->bits, seed->size);
> + add_bootloader_randomness(seed->bits, 
> seed->size);
>   early_memunmap(seed, sizeof(*seed) + size);
>   } else {
>   pr_err("Could not map UEFI random seed!\n");

-- 
Kees Cook

Re: [PATCH V34 28/29] efi: Restrict efivar_ssdt_load when the kernel is locked down

2019-06-22 Thread Kees Cook

On Fri, Jun 21, 2019 at 05:03:57PM -0700, Matthew Garrett wrote:
> efivar_ssdt_load allows the kernel to import arbitrary ACPI code from an
> EFI variable, which gives arbitrary code execution in ring 0. Prevent
> that when the kernel is locked down.
> 
> Signed-off-by: Matthew Garrett 

Reviewed-by: Kees Cook 

-Kees

> Cc: Ard Biesheuvel 
> Cc: linux-efi@vger.kernel.org
> ---
>  drivers/firmware/efi/efi.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 55b77c576c42..9f92a013ab27 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -31,6 +31,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -242,6 +243,11 @@ static void generic_ops_unregister(void)
>  static char efivar_ssdt[EFIVAR_SSDT_NAME_MAX] __initdata;
>  static int __init efivar_ssdt_setup(char *str)
>  {
> + int ret = security_locked_down(LOCKDOWN_ACPI_TABLES);
> +
> + if (ret)
> + return ret;
> +
>   if (strlen(str) < sizeof(efivar_ssdt))
>   memcpy(efivar_ssdt, str, strlen(str));
>   else
> -- 
> 2.22.0.410.gd8fdbe21b5-goog
> 

-- 
Kees Cook

Re: [PATCH] pstore: Convert buf_lock to semaphore

2018-12-04 Thread Kees Cook

On Tue, Dec 4, 2018 at 7:41 AM Sebastian Andrzej Siewior
 wrote:
>
> On 2018-11-30 14:47:36 [-0800], Kees Cook wrote:
> > diff --git a/drivers/firmware/efi/efi-pstore.c 
> > b/drivers/firmware/efi/efi-pstore.c
> > index cfe87b465819..0f7d97917197 100644
> > --- a/drivers/firmware/efi/efi-pstore.c
> > +++ b/drivers/firmware/efi/efi-pstore.c
> > @@ -259,8 +259,7 @@ static int efi_pstore_write(struct pstore_record 
> > *record)
> >   efi_name[i] = name[i];
> >
> >   ret = efivar_entry_set_safe(efi_name, vendor, PSTORE_EFI_ATTRIBUTES,
> > -   !pstore_cannot_block_path(record->reason),
> > -   record->size, record->psi->buf);
> > +   preemptible(), record->size, record->psi->buf);
>
> Well. Better I think.
> might_sleep() / preempt_count_equals() checks for preemptible() + 
> rcu_preempt_depth().
> kmsg_dump() starts with rcu_read_lock() which means with this patch applied I
> got:

Okay, so, if kmsg_dump() uses rcu_read_lock(), that means efi-pstore
can _never_ sleep, and it's nothing to do with pstore internals. :( I
guess we just hard-code it, then? And efi-pstore should probably only
attach to pstore if it has a nonblock implementation (and warn if one
isn't available).

-Kees

-- 
Kees Cook

Re: [PATCH] pstore: Convert buf_lock to semaphore

2018-12-04 Thread Kees Cook

On Tue, Dec 4, 2018 at 7:41 AM Sebastian Andrzej Siewior
 wrote:
>
> On 2018-11-30 14:47:36 [-0800], Kees Cook wrote:
> > diff --git a/drivers/firmware/efi/efi-pstore.c 
> > b/drivers/firmware/efi/efi-pstore.c
> > index cfe87b465819..0f7d97917197 100644
> > --- a/drivers/firmware/efi/efi-pstore.c
> > +++ b/drivers/firmware/efi/efi-pstore.c
> > @@ -259,8 +259,7 @@ static int efi_pstore_write(struct pstore_record 
> > *record)
> >   efi_name[i] = name[i];
> >
> >   ret = efivar_entry_set_safe(efi_name, vendor, PSTORE_EFI_ATTRIBUTES,
> > -   !pstore_cannot_block_path(record->reason),
> > -   record->size, record->psi->buf);
> > +   preemptible(), record->size, record->psi->buf);
>
> Well. Better I think.
> might_sleep() / preempt_count_equals() checks for preemptible() + 
> rcu_preempt_depth().
> kmsg_dump() starts with rcu_read_lock() which means with this patch applied I
> got:
>
> | BUG: sleeping function called from invalid context at 
> kernel/sched/completion.c:99
> | in_atomic(): 0, irqs_disabled(): 0, pid: 2286, name: sig-xstate-bum PC: 0 
> RCU: 1
> | Preemption disabled at:
> | [] __queue_work+0x95/0x440
> | CPU: 30 PID: 2286 Comm: sig-xstate-bum Tainted: G  D   
> 4.20.0-rc3+ #90
> | Call Trace:
> |  dump_stack+0x4f/0x6a
> |  ___might_sleep.cold.91+0xef/0x100
> |  __might_sleep+0x50/0x90
> |  wait_for_completion+0x32/0x130
> |  virt_efi_query_variable_info+0x14e/0x160
> |  efi_query_variable_store+0x51/0x1a0
> |  efivar_entry_set_safe+0xa3/0x1b0
> |  efi_pstore_write+0x110/0x140
> |  pstore_dump+0x114/0x320
> |  kmsg_dump+0xa4/0xd0
> |  oops_exit+0x7f/0x90
> |  oops_end+0x67/0xd0
> |  die+0x41/0x4a
> |  do_general_protection+0xc1/0x150
> |  general_protection+0x1e/0x30
> | RIP: 0010:__fpu__restore_sig+0x1c1/0x540
>
> just in case you wonder why both counter are zero and it still creates
> this backtrace.

Oh, hmm. That didn't show up in my testing. Any thoughts on a solution?

>
> >   if (record->reason == KMSG_DUMP_OOPS)
> >   efivar_run_worker();
> > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> > index 2387cb74f729..afdfd3687f94 100644
> > --- a/fs/pstore/platform.c
> > +++ b/fs/pstore/platform.c
> > @@ -400,23 +401,20 @@ static void pstore_dump(struct kmsg_dumper *dumper,
> >   unsigned long   total = 0;
> >   const char  *why;
> >   unsigned intpart = 1;
> > - unsigned long   flags = 0;
> > - int is_locked;
> >   int ret;
> >
> >   why = get_reason_str(reason);
> >
> > - if (pstore_cannot_block_path(reason)) {
> > - is_locked = spin_trylock_irqsave(>buf_lock, flags);
> > - if (!is_locked) {
> > - pr_err("pstore dump routine blocked in %s path, may 
> > corrupt error record\n"
> > -, in_nmi() ? "NMI" : why);
> > + if (down_trylock(>buf_lock)) {
> > + /* Failed to acquire lock: give up if we cannot wait. */
> > + if (pstore_cannot_wait(reason)) {
> > + pr_err("dump skipped in %s path: may corrupt error 
> > record\n",
> > + in_nmi() ? "NMI" : why);
> >   return;
> >   }
> > - } else {
> > -     spin_lock_irqsave(>buf_lock, flags);
> > - is_locked = 1;
> > + down_interruptible(>buf_lock);
>
>  In function ‘pstore_dump’:
> fs/pstore/platform.c:393:3: warning: ignoring return value of 
> ‘down_interruptible’, declared with attribute warn_unused_result 
> [-Wunused-result]
>down_interruptible(>buf_lock);
>^
>

Thanks, yes. I've fixed this in the version in -next.

-Kees

-- 
Kees Cook

Re: [PATCH] pstore: Convert buf_lock to semaphore

2018-12-03 Thread Kees Cook

On Mon, Dec 3, 2018 at 8:26 AM Arnd Bergmann  wrote:
>
> On Mon, Dec 3, 2018 at 12:18 PM Sebastian Andrzej Siewior
>  wrote:
> >
> > On 2018-12-01 09:42:38 [+0100], Arnd Bergmann wrote:
> > > You are right that you can't take (or release) a mutex from interrupt
> > > context. However, I don't think converting a spinlock to a semaphore
> > > is going to help here either.
> >
> > you can acquire a semaphore with a try_lock from interrupts context but
> > you can't do that with a mutex. You can also a acquire a semaphore in
> > one context and release in another.
>
> Right, that is the obvious part.
>
> > I haven't looked a general picture yet, will try to do so later today or
> > tomorrow.
>
> To speed this up, the problem I'm referring to is in
> virt_efi_query_variable_info() and efi_queue_work(),
> as in the original BUG_ON() that Kees quoted:
>
> > BUG: sleeping function called from invalid context at 
> > kernel/sched/completion.c:99
> > |in_atomic(): 1, irqs_disabled(): 1, pid: 2236, name: sig-xstate-bum
> > |Preemption disabled at:
> > |[] pstore_dump+0x72/0x330
> > |CPU: 26 PID: 2236 Comm: sig-xstate-bum Tainted: G  D   
> > 4.20.0-rc3 #45
> > |Call Trace:
> > | dump_stack+0x4f/0x6a
> > | ___might_sleep.cold.91+0xd3/0xe4
> > | __might_sleep+0x50/0x90
> > | wait_for_completion+0x32/0x130
> > | virt_efi_query_variable_info+0x14e/0x160
> > | efi_query_variable_store+0x51/0x1a0
> > | efivar_entry_set_safe+0xa3/0x1b0
> > | efi_pstore_write+0x109/0x140
> > | pstore_dump+0x11c/0x330
> > | kmsg_dump+0xa4/0xd0
> > | oops_exit+0x22/0x30
>
> This will no longer happen when pstore is called from process
> context with his patch, but we still get the same thing if we call
> pstore from interrupt context, unless both the down_interruptible
> and wait_for_completion in there are also changed to
> nonblocking calls. However, once they are no longer blocking,
> we don't need the outer lock to be changed from spinlock
> to semaphore any more either.

My proposed patch was trying to do two things:

- have pstore not make things _worse_ (do not hold a spin lock)
- use preemptible() in efi pstore backend to get it right:

ret = efivar_entry_set_safe(efi_name, vendor, PSTORE_EFI_ATTRIBUTES,
- !pstore_cannot_block_path(record->reason),
-     record->size, record->psi->buf);
+ preemptible(), record->size, record->psi->buf);

IIUC, that would make efi-vars take the nonblocking path so we don't
trip over the might_sleep().

-Kees

-- 
Kees Cook

Re: [PATCH] pstore: Convert buf_lock to semaphore

2018-11-30 Thread Kees Cook

On Fri, Nov 30, 2018 at 2:52 PM Arnd Bergmann  wrote:
>
> On Fri, Nov 30, 2018 at 11:48 PM Kees Cook  wrote:
> >
> > Instead of running with interrupts disabled, use a semaphore. This should
> > make it easier for backends that may need to sleep (e.g. EFI) when
> > performing a write:
> >
> > |BUG: sleeping function called from invalid context at 
> > kernel/sched/completion.c:99
> > |in_atomic(): 1, irqs_disabled(): 1, pid: 2236, name: sig-xstate-bum
> > |Preemption disabled at:
> > |[] pstore_dump+0x72/0x330
> > |CPU: 26 PID: 2236 Comm: sig-xstate-bum Tainted: G  D   
> > 4.20.0-rc3 #45
> > |Call Trace:
> > | dump_stack+0x4f/0x6a
> > | ___might_sleep.cold.91+0xd3/0xe4
> > | __might_sleep+0x50/0x90
> > | wait_for_completion+0x32/0x130
> > | virt_efi_query_variable_info+0x14e/0x160
> > | efi_query_variable_store+0x51/0x1a0
> > | efivar_entry_set_safe+0xa3/0x1b0
> > | efi_pstore_write+0x109/0x140
> > | pstore_dump+0x11c/0x330
> > | kmsg_dump+0xa4/0xd0
> > | oops_exit+0x22/0x30
> > ...
> >
> > Reported-by: Sebastian Andrzej Siewior 
> > Fixes: 21b3ddd39fee ("efi: Don't use spinlocks for efi vars")
> > Signed-off-by: Kees Cook 
>
> Hmm, I've actually been working on a patch set recently to deprecate
> all semaphores from the kernel and replace them with something
> else as much as possible.
>
> Why can't this be a mutex instead?

My understanding is that I can't use a mutex in interrupt context
(Documentation/kernel-hacking/locking.rst) and pstore_dump() needs to
handle being called from anywhere. I'm surprised it's managed to get
away with using a spinlock for this long. :P

-Kees

-- 
Kees Cook

[PATCH] pstore: Convert buf_lock to semaphore

2018-11-30 Thread Kees Cook

Instead of running with interrupts disabled, use a semaphore. This should
make it easier for backends that may need to sleep (e.g. EFI) when
performing a write:

|BUG: sleeping function called from invalid context at 
kernel/sched/completion.c:99
|in_atomic(): 1, irqs_disabled(): 1, pid: 2236, name: sig-xstate-bum
|Preemption disabled at:
|[] pstore_dump+0x72/0x330
|CPU: 26 PID: 2236 Comm: sig-xstate-bum Tainted: G  D   4.20.0-rc3 
#45
|Call Trace:
| dump_stack+0x4f/0x6a
| ___might_sleep.cold.91+0xd3/0xe4
| __might_sleep+0x50/0x90
| wait_for_completion+0x32/0x130
| virt_efi_query_variable_info+0x14e/0x160
| efi_query_variable_store+0x51/0x1a0
| efivar_entry_set_safe+0xa3/0x1b0
| efi_pstore_write+0x109/0x140
| pstore_dump+0x11c/0x330
| kmsg_dump+0xa4/0xd0
| oops_exit+0x22/0x30
...

Reported-by: Sebastian Andrzej Siewior 
Fixes: 21b3ddd39fee ("efi: Don't use spinlocks for efi vars")
Signed-off-by: Kees Cook 
---
 arch/powerpc/kernel/nvram_64.c|  2 --
 drivers/acpi/apei/erst.c  |  1 -
 drivers/firmware/efi/efi-pstore.c |  4 +--
 fs/pstore/platform.c  | 41 +++
 fs/pstore/ram.c   |  1 -
 include/linux/pstore.h|  7 +++---
 6 files changed, 24 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index 22e9d281324d..e7d4ce6964ae 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -563,8 +563,6 @@ static int nvram_pstore_init(void)
nvram_pstore_info.buf = oops_data;
nvram_pstore_info.bufsize = oops_data_sz;
 
-   spin_lock_init(_pstore_info.buf_lock);
-
rc = pstore_register(_pstore_info);
if (rc && (rc != -EPERM))
/* Print error only when pstore.backend == nvram */
diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index a5e1d963208e..9953e50667ec 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -1176,7 +1176,6 @@ static int __init erst_init(void)
"Error Record Serialization Table (ERST) support is initialized.\n");
 
buf = kmalloc(erst_erange.size, GFP_KERNEL);
-   spin_lock_init(_info.buf_lock);
if (buf) {
erst_info.buf = buf + sizeof(struct cper_pstore_record);
erst_info.bufsize = erst_erange.size -
diff --git a/drivers/firmware/efi/efi-pstore.c 
b/drivers/firmware/efi/efi-pstore.c
index cfe87b465819..0f7d97917197 100644
--- a/drivers/firmware/efi/efi-pstore.c
+++ b/drivers/firmware/efi/efi-pstore.c
@@ -259,8 +259,7 @@ static int efi_pstore_write(struct pstore_record *record)
efi_name[i] = name[i];
 
ret = efivar_entry_set_safe(efi_name, vendor, PSTORE_EFI_ATTRIBUTES,
- !pstore_cannot_block_path(record->reason),
- record->size, record->psi->buf);
+ preemptible(), record->size, record->psi->buf);
 
if (record->reason == KMSG_DUMP_OOPS)
efivar_run_worker();
@@ -369,7 +368,6 @@ static __init int efivars_pstore_init(void)
return -ENOMEM;
 
efi_pstore_info.bufsize = 1024;
-   spin_lock_init(_pstore_info.buf_lock);
 
if (pstore_register(_pstore_info)) {
kfree(efi_pstore_info.buf);
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 2387cb74f729..afdfd3687f94 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -161,26 +161,27 @@ static const char *get_reason_str(enum kmsg_dump_reason 
reason)
}
 }
 
-bool pstore_cannot_block_path(enum kmsg_dump_reason reason)
+/*
+ * Should pstore_dump() wait for a concurrent pstore_dump()? If
+ * not, the current pstore_dump() will report a failure to dump
+ * and return.
+ */
+static bool pstore_cannot_wait(enum kmsg_dump_reason reason)
 {
-   /*
-* In case of NMI path, pstore shouldn't be blocked
-* regardless of reason.
-*/
+   /* In NMI path, pstore shouldn't block regardless of reason. */
if (in_nmi())
return true;
 
switch (reason) {
/* In panic case, other cpus are stopped by smp_send_stop(). */
case KMSG_DUMP_PANIC:
-   /* Emergency restart shouldn't be blocked by spin lock. */
+   /* Emergency restart shouldn't be blocked. */
case KMSG_DUMP_EMERG:
return true;
default:
return false;
}
 }
-EXPORT_SYMBOL_GPL(pstore_cannot_block_path);
 
 #if IS_ENABLED(CONFIG_PSTORE_DEFLATE_COMPRESS)
 static int zbufsize_deflate(size_t size)
@@ -400,23 +401,20 @@ static void pstore_dump(struct kmsg_dumper *dumper,
unsigned long   total = 0;
const char  *why;
unsigned intpart = 1;
-   unsigned long   flags = 0;
-   int is_locked;
int ret;
 
why = get_reason_str(re

Re: EFI-pstore, sleeping from back context after fault

2018-11-30 Thread Kees Cook

On Fri, Nov 30, 2018 at 1:20 PM Kees Cook  wrote:
>
> On Fri, Nov 30, 2018 at 3:43 AM Sebastian Andrzej Siewior
>  wrote:
> >
> > On 2018-11-29 13:43:58 [-0800], Kees Cook wrote:
> > > On Mon, Nov 26, 2018 at 9:04 AM Sebastian Andrzej Siewior
> > >  wrote:
> > > This bug got handled by Jann Horn, yes? (I remember seeing a related
> > > thread go by...)
> >
> > Correct, fix sits in the tip tree. Nevertheless it was useful to spot
> > the other thing.
> >
> > > > This looks like it comes from commit 21b3ddd39fee ("efi: Don't use
> > > > spinlocks for efi vars") because it replaced a spin_lock() with
> > > > down_interruptible() in this case. In this case, since pstore_dump()
> > > > holds psinfo->buf_lock with disabled interrupts we can't block…
> > > >
> > > > I have this as a workaround:
> > >
> > > I'm not sure this is correct...
> > >
> > > > diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
> > > > index 9336ffdf6e2c..d4dcfe46eb2e 100644
> > > > --- a/drivers/firmware/efi/vars.c
> > > > +++ b/drivers/firmware/efi/vars.c
> > > > @@ -755,7 +755,9 @@ int efivar_entry_set_safe(efi_char16_t *name, 
> > > > efi_guid_t vendor, u32 attributes,
> > > > return -EINTR;
> > > > }
> > > >
> > > > -   status = check_var_size(attributes, size + ucs2_strsize(name, 
> > > > 1024));
> > > > +   status = ops->query_variable_store(attributes,
> > > > +  size + ucs2_strsize(name, 
> > > > 1024),
> > > > +  block);
> > >
> > > Why does this change help?
> >
> > check_var_size() uses false as the argument for `block'.
> > check_var_size_nonblocking uses true as the argument for `block'.
> > Doing this ops->… allows to pass the `block' argument as received from
> > caller. And the functions above already use the `block' argument. Plus
> > the next hunk makes sure that `block' is set properly.
>
> Here's the code I see:
>
> if (!block && ops->set_variable_nonblocking)
> return efivar_entry_set_nonblocking(name, vendor, attributes,
> size, data);
>
> if (!block) {
> if (down_trylock(_lock))
> return -EBUSY;
> } else {
> if (down_interruptible(_lock))
> return -EINTR;
> }
>
> status = check_var_size(attributes, size + ucs2_strsize(name, 1024));
> if (status != EFI_SUCCESS) {
> up(_lock);
> return -ENOSPC;
> }
>
> status = ops->set_variable(name, , attributes, size, data);
>
> up(_lock);
>
> In the !block case, this will already take the
> efivar_entry_set_nonblocking() path. Isn't that sufficient? (i.e. I'm
> not sure why I see a change needed in the EFI code, just the
> pstore_cannot_block_path() path?
>
> >
> > > > if (status != EFI_SUCCESS) {
> > > > up(_lock);
> > > > return -ENOSPC;
> > > > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> > > > index b821054ca3ed..9aa27a2c8d36 100644
> > > > --- a/fs/pstore/platform.c
> > > > +++ b/fs/pstore/platform.c
> > > > @@ -127,10 +127,10 @@ static const char *get_reason_str(enum 
> > > > kmsg_dump_reason reason)
> > > >  bool pstore_cannot_block_path(enum kmsg_dump_reason reason)
> > > >  {
> > > > /*
> > > > -* In case of NMI path, pstore shouldn't be blocked
> > > > -* regardless of reason.
> > > > +* In case of disabled preemption / interrupts we can't block 
> > > > on any
> > > > +* lock regardless of reason.
> > > >  */
> > > > -   if (in_nmi())
> > > > +   if (in_atomic() || irqs_disabled())
> > > > return true;
> > > >
> > > > switch (reason) {
> > >
> > > I'd like to avoid any case where pstore might "miss" a crash report,
> > > though... this seems to make it easier for it to fail to record a
> > > crash, yes?
> >
> > Well, if the lock is occupied, yes. But waiting for the completion with

Re: EFI-pstore, sleeping from back context after fault

2018-11-30 Thread Kees Cook

On Fri, Nov 30, 2018 at 3:43 AM Sebastian Andrzej Siewior
 wrote:
>
> On 2018-11-29 13:43:58 [-0800], Kees Cook wrote:
> > On Mon, Nov 26, 2018 at 9:04 AM Sebastian Andrzej Siewior
> >  wrote:
> > This bug got handled by Jann Horn, yes? (I remember seeing a related
> > thread go by...)
>
> Correct, fix sits in the tip tree. Nevertheless it was useful to spot
> the other thing.
>
> > > This looks like it comes from commit 21b3ddd39fee ("efi: Don't use
> > > spinlocks for efi vars") because it replaced a spin_lock() with
> > > down_interruptible() in this case. In this case, since pstore_dump()
> > > holds psinfo->buf_lock with disabled interrupts we can't block…
> > >
> > > I have this as a workaround:
> >
> > I'm not sure this is correct...
> >
> > > diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
> > > index 9336ffdf6e2c..d4dcfe46eb2e 100644
> > > --- a/drivers/firmware/efi/vars.c
> > > +++ b/drivers/firmware/efi/vars.c
> > > @@ -755,7 +755,9 @@ int efivar_entry_set_safe(efi_char16_t *name, 
> > > efi_guid_t vendor, u32 attributes,
> > > return -EINTR;
> > > }
> > >
> > > -   status = check_var_size(attributes, size + ucs2_strsize(name, 
> > > 1024));
> > > +   status = ops->query_variable_store(attributes,
> > > +  size + ucs2_strsize(name, 
> > > 1024),
> > > +  block);
> >
> > Why does this change help?
>
> check_var_size() uses false as the argument for `block'.
> check_var_size_nonblocking uses true as the argument for `block'.
> Doing this ops->… allows to pass the `block' argument as received from
> caller. And the functions above already use the `block' argument. Plus
> the next hunk makes sure that `block' is set properly.

Here's the code I see:

if (!block && ops->set_variable_nonblocking)
return efivar_entry_set_nonblocking(name, vendor, attributes,
size, data);

if (!block) {
if (down_trylock(_lock))
return -EBUSY;
} else {
if (down_interruptible(_lock))
return -EINTR;
}

status = check_var_size(attributes, size + ucs2_strsize(name, 1024));
if (status != EFI_SUCCESS) {
up(_lock);
return -ENOSPC;
}

status = ops->set_variable(name, , attributes, size, data);

up(_lock);

In the !block case, this will already take the
efivar_entry_set_nonblocking() path. Isn't that sufficient? (i.e. I'm
not sure why I see a change needed in the EFI code, just the
pstore_cannot_block_path() path?

>
> > > if (status != EFI_SUCCESS) {
> > > up(_lock);
> > > return -ENOSPC;
> > > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> > > index b821054ca3ed..9aa27a2c8d36 100644
> > > --- a/fs/pstore/platform.c
> > > +++ b/fs/pstore/platform.c
> > > @@ -127,10 +127,10 @@ static const char *get_reason_str(enum 
> > > kmsg_dump_reason reason)
> > >  bool pstore_cannot_block_path(enum kmsg_dump_reason reason)
> > >  {
> > > /*
> > > -* In case of NMI path, pstore shouldn't be blocked
> > > -* regardless of reason.
> > > +* In case of disabled preemption / interrupts we can't block on 
> > > any
> > > +* lock regardless of reason.
> > >  */
> > > -   if (in_nmi())
> > > +   if (in_atomic() || irqs_disabled())
> > > return true;
> > >
> > > switch (reason) {
> >
> > I'd like to avoid any case where pstore might "miss" a crash report,
> > though... this seems to make it easier for it to fail to record a
> > crash, yes?
>
> Well, if the lock is occupied, yes. But waiting for the completion with
> disabled interrupts isn't much better :)
>
> pstore_cannot_block_path() is used in two places. One caller is taking a
> spin_lock(). So that one could keep using the "old" function.
>
> The other caller uses it before acquiring the semaphore. The non-try
> version can't be used with preemption or interrupts disabled.
>
> You could split those two cases and let the sempaphire user use the
> "(in_atomic() || irqs_disabled())" from above.

preempt.h:#define in_nmi()  (preempt_count() & NMI_MASK)
preempt.h:#define in_atomic()   (preempt_count() != 0)

NMI is a special case of preempt_count() test. in_atomic() is a
non-zero preempt_count() test. It looks like we want to use
preemptible()?

#define preemptible() (preempt_count() == 0 && !irqs_disabled())

So, perhaps:

-   if (in_nmi())
+   if (!preemptible())

What do you think?

-- 
Kees Cook

Re: EFI-pstore, sleeping from back context after fault

2018-11-29 Thread Kees Cook

On Mon, Nov 26, 2018 at 9:04 AM Sebastian Andrzej Siewior
 wrote:
>
> So I triggered a bug in x86 code. First the "okay" backtrace:
> |BUG: GPF in non-whitelisted uaccess (non-canonical address?)
> |general protection fault:  [#1] PREEMPT SMP NOPTI
> |CPU: 26 PID: 2236 Comm: sig-xstate-bum Not tainted 4.20.0-rc3 #45
> |RIP: 0010:__fpu__restore_sig+0x1c1/0x540
> |Call Trace:
> | fpu__restore_sig+0x28/0x40
> | restore_sigcontext+0x13a/0x180
> | __ia32_sys_rt_sigreturn+0xae/0x100
> | do_syscall_64+0x4f/0x100
> | entry_SYSCALL_64_after_hwframe+0x44/0xa9
> |RIP: 0033:0x7f9b06aea227
> |---[ end trace a45ac23b593e9ab0 ]---

This bug got handled by Jann Horn, yes? (I remember seeing a related
thread go by...)

>
> and now the not okay part:
>
> |BUG: sleeping function called from invalid context at 
> kernel/sched/completion.c:99
> |in_atomic(): 1, irqs_disabled(): 1, pid: 2236, name: sig-xstate-bum
> |Preemption disabled at:
> |[] pstore_dump+0x72/0x330
> |CPU: 26 PID: 2236 Comm: sig-xstate-bum Tainted: G  D   
> 4.20.0-rc3 #45
> |Call Trace:
> | dump_stack+0x4f/0x6a
> | ___might_sleep.cold.91+0xd3/0xe4
> | __might_sleep+0x50/0x90
> | wait_for_completion+0x32/0x130
> | virt_efi_query_variable_info+0x14e/0x160
> | efi_query_variable_store+0x51/0x1a0
> | efivar_entry_set_safe+0xa3/0x1b0
> | efi_pstore_write+0x109/0x140

Eww. :)

> | pstore_dump+0x11c/0x330
> | kmsg_dump+0xa4/0xd0
> | oops_exit+0x22/0x30
> | oops_end+0x67/0xd0
> | die+0x41/0x4a
> | do_general_protection+0xc1/0x150
> | general_protection+0x1e/0x30
> |RIP: 0010:__fpu__restore_sig+0x1c1/0x540
> | fpu__restore_sig+0x28/0x40
> | restore_sigcontext+0x13a/0x180
> | __ia32_sys_rt_sigreturn+0xae/0x100
> | do_syscall_64+0x4f/0x100
> | entry_SYSCALL_64_after_hwframe+0x44/0xa9
> |RIP: 0033:0x7f9b06aea227
> … [ moar backtrace ]
>
> This looks like it comes from commit 21b3ddd39fee ("efi: Don't use
> spinlocks for efi vars") because it replaced a spin_lock() with
> down_interruptible() in this case. In this case, since pstore_dump()
> holds psinfo->buf_lock with disabled interrupts we can't block…
>
> I have this as a workaround:

I'm not sure this is correct...

> diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
> index 9336ffdf6e2c..d4dcfe46eb2e 100644
> --- a/drivers/firmware/efi/vars.c
> +++ b/drivers/firmware/efi/vars.c
> @@ -755,7 +755,9 @@ int efivar_entry_set_safe(efi_char16_t *name, efi_guid_t 
> vendor, u32 attributes,
> return -EINTR;
> }
>
> -   status = check_var_size(attributes, size + ucs2_strsize(name, 1024));
> +   status = ops->query_variable_store(attributes,
> +  size + ucs2_strsize(name, 1024),
> +  block);

Why does this change help?

> if (status != EFI_SUCCESS) {
> up(_lock);
> return -ENOSPC;
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> index b821054ca3ed..9aa27a2c8d36 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -127,10 +127,10 @@ static const char *get_reason_str(enum kmsg_dump_reason 
> reason)
>  bool pstore_cannot_block_path(enum kmsg_dump_reason reason)
>  {
> /*
> -* In case of NMI path, pstore shouldn't be blocked
> -* regardless of reason.
> +* In case of disabled preemption / interrupts we can't block on any
> +* lock regardless of reason.
>  */
> -   if (in_nmi())
> +   if (in_atomic() || irqs_disabled())
> return true;
>
> switch (reason) {

I'd like to avoid any case where pstore might "miss" a crash report,
though... this seems to make it easier for it to fail to record a
crash, yes?

I have some pending clean-ups in this area (buf_lock may not be needed
at all, actually), so I'll try to work through those too.

-Kees

-- 
Kees Cook

Re: [PATCH v7 3/3] x86/boot/KASLR: Limit kaslr to choosing the immovable memory

2018-09-22 Thread Kees Cook

On Thu, Sep 13, 2018 at 3:46 AM, Chao Fan  wrote:
> If CONFIG_MEMORY_HOTREMOVE enabled and the amount of immovable
> memory regions is not zero. Calculate the intersection between memory
> regions from e820/efi memory table and immovable memory regions.
>
> Signed-off-by: Chao Fan 
> ---
>  arch/x86/boot/compressed/kaslr.c | 71 +++-
>  1 file changed, 60 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/kaslr.c 
> b/arch/x86/boot/compressed/kaslr.c
> index 0c3567bc231c..0a7ef2daf169 100644
> --- a/arch/x86/boot/compressed/kaslr.c
> +++ b/arch/x86/boot/compressed/kaslr.c
> @@ -101,6 +101,11 @@ static bool memmap_too_large;
>  /* Store memory limit specified by "mem=nn[KMG]" or "memmap=nn[KMG]" */
>  static unsigned long long mem_limit = ULLONG_MAX;
>
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +/* Store the immovable memory regions */
> +extern struct mem_vector immovable_mem[MAX_NUMNODES*2];
> +#endif
> +
>
>  enum mem_avoid_index {
> MEM_AVOID_ZO_RANGE = 0,
> @@ -577,9 +582,9 @@ static unsigned long slots_fetch_random(void)
> return 0;
>  }
>
> -static void process_mem_region(struct mem_vector *entry,
> -  unsigned long minimum,
> -  unsigned long image_size)
> +static void slots_count(struct mem_vector *entry,
> +   unsigned long minimum,
> +   unsigned long image_size)
>  {
> struct mem_vector region, overlap;
> unsigned long start_orig, end;
> @@ -655,6 +660,56 @@ static void process_mem_region(struct mem_vector *entry,
> }
>  }
>
> +static bool process_mem_region(struct mem_vector *region,
> +  unsigned long long minimum,
> +  unsigned long long image_size)
> +{
> +   int i;
> +   /*
> +* If no immovable memory found, or MEMORY_HOTREMOVE disabled,
> +* walk all the regions, so use region directely.
> +*/
> +   if (num_immovable_mem == 0) {
> +   slots_count(region, minimum, image_size);
> +
> +   if (slot_area_index == MAX_SLOT_AREA) {
> +   debug_putstr("Aborted e820/efi memmap scan 
> (slot_areas full)!\n");
> +   return 1;
> +   }
> +   return 0;
> +   }
> +
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +   /*
> +* If immovable memory found, filter the intersection between
> +* immovable memory and region to slots_count.
> +* Otherwise, go on old code.
> +*/
> +   for (i = 0; i < num_immovable_mem; i++) {
> +   struct mem_vector entry;
> +   unsigned long long start, end, entry_end, region_end;
> +
> +   start = immovable_mem[i].start;
> +   end = start + immovable_mem[i].size;
> +   region_end = region->start + region->size;
> +
> +   entry.start = clamp(region->start, start, end);
> +   entry_end = clamp(region_end, start, end);
> +
> +   if (entry.start + image_size < entry_end) {

Can this logic be rewritten to use the existing mem_overlaps() check
instead? I think that would make it much more readable.

Otherwise, yes, this all looks fine.

-Kees

> +   entry.size = entry_end - entry.start;
> +   slots_count(, minimum, image_size);
> +
> +   if (slot_area_index == MAX_SLOT_AREA) {
> +   debug_putstr("Aborted e820/efi memmap scan 
> (slot_areas full)!\n");
> +   return 1;
> +   }
> +   }
> +   }
> +   return 0;
> +#endif
> +}
> +
>  #ifdef CONFIG_EFI
>  /*
>   * Returns true if mirror region found (and must have been processed
> @@ -720,11 +775,8 @@ process_efi_entries(unsigned long minimum, unsigned long 
> image_size)
>
> region.start = md->phys_addr;
> region.size = md->num_pages << EFI_PAGE_SHIFT;
> -   process_mem_region(, minimum, image_size);
> -   if (slot_area_index == MAX_SLOT_AREA) {
> -   debug_putstr("Aborted EFI scan (slot_areas full)!\n");
> +   if (process_mem_region(, minimum, image_size))
> break;
> -   }
> }
> return true;
>  }
> @@ -751,11 +803,8 @@ static void process_e820_entries(unsigned long minimum,
> continue;
> region.start = entry->add

Re: [PATCH v7 2/3] x86/boot/KASLR: Walk srat tables to filter immovable memory

2018-09-22 Thread Kees Cook

On Thu, Sep 13, 2018 at 3:46 AM, Chao Fan  wrote:
> If CONFIG_MEMORY_HOTREMOVE enabled, walk through the acpi srat memory
> tables and store those immovable memory regions so that kaslr can get
> where to choose for randomization.
>
> Signed-off-by: Chao Fan 
> ---
>  arch/x86/boot/compressed/kaslr.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/arch/x86/boot/compressed/kaslr.c 
> b/arch/x86/boot/compressed/kaslr.c
> index 9ed9709d9947..0c3567bc231c 100644
> --- a/arch/x86/boot/compressed/kaslr.c
> +++ b/arch/x86/boot/compressed/kaslr.c
> @@ -417,6 +417,11 @@ static void mem_avoid_init(unsigned long input, unsigned 
> long input_size,
> /* Mark the memmap regions we need to avoid */
> handle_mem_options();
>
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +   /* Mark the immovable regions we need to choose */
> +   get_immovable_mem();
> +#endif

Style nit-pick: I'd prefer the definition of get_immovable_mem() do
the ifdef so it doesn't appear here.

-Kees

> +
>  #ifdef CONFIG_X86_VERBOSE_BOOTUP
> /* Make sure video RAM can be used. */
> add_identity_map(0, PMD_SIZE);
> --
> 2.17.1
>
>
>



-- 
Kees Cook
Pixel Security

Re: [PATCH v7 1/3] x86/boot: Add acpitb.c to parse acpi tables

2018-09-22 Thread Kees Cook

On Thu, Sep 13, 2018 at 3:46 AM, Chao Fan  wrote:
> There is a bug that kaslr may randomly chooses some positions
> which are located in movable memory regions. This will break memory
> hotplug feature and make the movable memory chosen by KASLR can't be
> removed. So dig SRAT table from ACPI tables to get memory information.
>
> Imitate the ACPI code of parsing ACPI tables to dig and read ACPI
> tables. Since some operations are not needed here, functions are
> simplified. Functions will be used to dig only SRAT tables to get
> information of memory, so that KASLR can the memory in immovable node.
>
> And also, these functions won't influence the initialization of
> ACPI after start_kernel().
>
> Since use physical address directely, so acpi_os_map_memory()
> and acpi_os_unmap_memory() are not needed.
>
> Signed-off-by: Chao Fan 
> ---
>  arch/x86/boot/compressed/Makefile |   4 +
>  arch/x86/boot/compressed/acpitb.c | 401 ++

Does this logic live anywhere else in the kernel already? (i.e. could
other code be reused?)

-Kees

-- 
Kees Cook
Pixel Security

Re: [GIT PULL] Kernel lockdown for secure boot

2018-04-03 Thread Kees Cook

On Tue, Apr 3, 2018 at 12:01 PM, Andy Lutomirski <l...@kernel.org> wrote:
> On Tue, Apr 3, 2018 at 11:45 AM, Kees Cook <keesc...@chromium.org> wrote:
>> On Tue, Apr 3, 2018 at 9:45 AM, Andy Lutomirski <l...@kernel.org> wrote:
>>> On Tue, Apr 3, 2018 at 9:29 AM, Matthew Garrett <mj...@google.com> wrote:
>>>> On Tue, Apr 3, 2018 at 8:11 AM Andy Lutomirski <l...@kernel.org> wrote:
>>>>> Can you explain that much more clearly?  I'm asking why booting via
>>>>> UEFI Secure Boot should enable lockdown, and I don't see what this has
>>>>> to do with kexec.  And "someone blacklist[ing] your key in the
>>>>> bootloader" sounds like a political issue, not a technical issue.
>>>>
>>>> A kernel that allows users arbitrary access to ring 0 is just an
>>>> overfeatured bootloader. Why would you want secure boot in that case?
>>>
>>> To get a chain of trust.  I can provision a system with some public
>>> keys, stored in UEFI authenticated variables, such that the system
>>> will only boot a signed image.  That signed image, can, in turn, load
>>> a signed (or hashed or otherwise verfified) kernel and a verified
>>> initramfs.  The initramfs can run a full system from a verified (using
>>> dm-verity or similar) filesystem, for example.  Now it's very hard to
>>> persistently attack this system.  Chromium OS does something very much
>>> like this, except that it doesn't use UEFI as far as I know.  So does
>>> iOS, and so do some Android versions.
>>
>> Correct, Chrome OS does not use UEFI, and we still want this patch
>> series, as it plugs all the known "intentional" escalation paths from
>> uid-0 to ring-0. Happily, that means all the politics around the UEFI
>> and Secure Boot case can be ignored, because those issues are specific
>> to Secure Boot, not the lockdown series. (They are _related_, sure,
>> but lockdown isn't only about Secure Boot -- it's just that SB is one
>> of the widely deployed implementations of this kind of
>> trust-chain-booting-thing. Chrome OS and Android's Verified Boot do
>> similar things and have the same expectations about the uid-0/ring-0
>> separation.)
>>
>> The goal for that bright line on Chrome OS and Android is to stop
>> attack persistence. We want to know that a reboot onto a new kernel
>> and OS image will actually result in getting the desired system state,
>> and that any attack on persistent system data (even for things running
>> with full root privileges) can't result in using kernel interfaces to
>> gain kernel control. This isn't expected to be _perfect_, since
>> nothing is. But it creates a place to work from. The idea that uid-0
>> is NOT ring-0 is still relatively new, so the existing designs in the
>> kernel aren't well suited to building that distinction. I view this
>> series as a solid first step to getting there, though.
>
> But wouldn't Chrome OS possibly want to lock down kernel memory write
> vectors but not read vectors?  After all, debugging is useful even on
> Chrome OS.

Chrome OS absolutely wants to block writing. We also want to block
reading as much as we possibly can, though yes we bump up against
debugging in that quest. But those cases are manageable and specific,
IMO.

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [GIT PULL] Kernel lockdown for secure boot

2018-04-03 Thread Kees Cook

On Tue, Apr 3, 2018 at 9:45 AM, Andy Lutomirski <l...@kernel.org> wrote:
> On Tue, Apr 3, 2018 at 9:29 AM, Matthew Garrett <mj...@google.com> wrote:
>> On Tue, Apr 3, 2018 at 8:11 AM Andy Lutomirski <l...@kernel.org> wrote:
>>> Can you explain that much more clearly?  I'm asking why booting via
>>> UEFI Secure Boot should enable lockdown, and I don't see what this has
>>> to do with kexec.  And "someone blacklist[ing] your key in the
>>> bootloader" sounds like a political issue, not a technical issue.
>>
>> A kernel that allows users arbitrary access to ring 0 is just an
>> overfeatured bootloader. Why would you want secure boot in that case?
>
> To get a chain of trust.  I can provision a system with some public
> keys, stored in UEFI authenticated variables, such that the system
> will only boot a signed image.  That signed image, can, in turn, load
> a signed (or hashed or otherwise verfified) kernel and a verified
> initramfs.  The initramfs can run a full system from a verified (using
> dm-verity or similar) filesystem, for example.  Now it's very hard to
> persistently attack this system.  Chromium OS does something very much
> like this, except that it doesn't use UEFI as far as I know.  So does
> iOS, and so do some Android versions.

Correct, Chrome OS does not use UEFI, and we still want this patch
series, as it plugs all the known "intentional" escalation paths from
uid-0 to ring-0. Happily, that means all the politics around the UEFI
and Secure Boot case can be ignored, because those issues are specific
to Secure Boot, not the lockdown series. (They are _related_, sure,
but lockdown isn't only about Secure Boot -- it's just that SB is one
of the widely deployed implementations of this kind of
trust-chain-booting-thing. Chrome OS and Android's Verified Boot do
similar things and have the same expectations about the uid-0/ring-0
separation.)

The goal for that bright line on Chrome OS and Android is to stop
attack persistence. We want to know that a reboot onto a new kernel
and OS image will actually result in getting the desired system state,
and that any attack on persistent system data (even for things running
with full root privileges) can't result in using kernel interfaces to
gain kernel control. This isn't expected to be _perfect_, since
nothing is. But it creates a place to work from. The idea that uid-0
is NOT ring-0 is still relatively new, so the existing designs in the
kernel aren't well suited to building that distinction. I view this
series as a solid first step to getting there, though.

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [GIT PULL] hash addresses printed with %p

2017-12-01 Thread Kees Cook

On Fri, Dec 1, 2017 at 7:34 AM, Greg Kroah-Hartman
<gre...@linuxfoundation.org> wrote:

> And isn't there a specific %p modifier you should use for a kernel
> pointer.  I've lost the thread here for what should, or should not, be
> done for kernel pointers these days based on the long email discussion.

Current implementation to bypass the hashing is %px. (Though perhaps
all %px usage should include a comment with a justification?)

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC Part1 PATCH v3 10/17] resource: Provide resource struct in resource walk callback

2017-07-31 Thread Kees Cook

On Mon, Jul 24, 2017 at 12:07 PM, Brijesh Singh <brijesh.si...@amd.com> wrote:
> From: Tom Lendacky <thomas.lenda...@amd.com>
>
> In prep for a new function that will need additional resource information
> during the resource walk, update the resource walk callback to pass the
> resource structure.  Since the current callback start and end arguments
> are pulled from the resource structure, the callback functions can obtain
> them from the resource structure directly.
>
> Signed-off-by: Tom Lendacky <thomas.lenda...@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.si...@amd.com>

This is a nice clean up even without the refactoring need. :)

Reviewed-by: Kees Cook <keesc...@chromium.org>

Thanks!

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 0/2] Add EFI capsule pstore backend support

2017-06-23 Thread Kees Cook

On Thu, Jun 22, 2017 at 9:34 AM, Qiuxu Zhuo <qiuxu.z...@intel.com> wrote:
> Change Log v3->v4:
>  - Add comment 'the number of config tables' for 'nr_config_table' in efi 
> structure
>  - Initialize 'efi.nr_config_table' to 0 in default
>  - Set 'efi.nr_config_table' to 'efi.systab->nr_tables' in 
> drivers/firmware/efi/arm-init.c -> uefi_init()
>  - Mark 'efi_capsule_pstore_disable' as __ro_after_init
>  - Use timestamp value passed from pstore API rather than using get_seconds()
>  - Pass page physcial addresses rather than struct page pointers to 
> efi_capsule_update()

Thanks for the updates!

Reviewed-by: Kees Cook <keesc...@chromium.org>

>
> Qiuxu Zhuo (2):
>   efi: Add 'nr_config_table' variable in efi structure
>   eif/capsule-pstore: Add capsule pstore backend
>
>  arch/x86/platform/efi/efi.c   |   1 +
>  drivers/firmware/efi/Kconfig  |  21 ++
>  drivers/firmware/efi/Makefile |   1 +
>  drivers/firmware/efi/arm-init.c   |   4 +-
>  drivers/firmware/efi/capsule-pstore.c | 679 
> ++
>  drivers/firmware/efi/efi.c|   1 +
>  include/linux/efi.h   |   1 +
>  7 files changed, 707 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/firmware/efi/capsule-pstore.c
>
> --
> 2.9.0.GIT
>



-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 2/2] eif/capsule-pstore: Add capsule pstore backend

2017-06-20 Thread Kees Cook

   if (atomic_long_cmpxchg(ring_offset, curr, next)) {
> +   curr = 0;
> +   break;
> +   }
> +   continue;
> +   }
> +
> +   } while (atomic_long_cmpxchg(ring_offset, curr, next) != curr);
> +
> +   return ring_buf + curr;
> +}
> +
> +static int notrace efi_capsule_pstore_write(struct pstore_record *record)
> +{
> +   struct efi_capsule_pstore *pctx = record->psi->data;
> +   struct efi_capsule_pstore_record *rec;
> +   static atomic64_t seq;
> +   void *dst;
> +
> +   /*
> +* A zero size record would break our detection of
> +* partially-filled capsules.
> +*/
> +   if (!record->size)
> +   return -EINVAL;
> +
> +   if (record->type == PSTORE_TYPE_CONSOLE) {
> +   dst = get_pstore_write_ring_buf(>console, record->size);
> +   } else if (record->type == PSTORE_TYPE_FTRACE) {
> +   dst = get_pstore_write_ring_buf(>ftrace, record->size);
> +   } else if (record->type == PSTORE_TYPE_DMESG) {
> +   size_t size = record->size;
> +
> +   rec = get_pstore_write_record(>dmesg, >size);
> +   if (!rec || (record->compressed && record->size < size))
> +   return -ENOSPC;
> +
> +   rec->type = record->type;
> +   rec->timestamp = get_seconds();

timestamp will already be correctly populated by the pstore API, so it
should be copied from there rather than using get_seconds() which may
not be functional during resume, etc.

> +   rec->size = record->size;
> +   if (!atomic64_read())
> +   atomic64_set(, rec->timestamp << 32);
> +   record->id = atomic64_inc_return();
> +   rec->id = record->id;
> +   rec->compressed = record->compressed;
> +   rec->magic = CAPSULE_MAGIC;
> +
> +   dst = rec->data;
> +   } else if (record->type == PSTORE_TYPE_PMSG) {
> +   pr_warn_ratelimited("PMSG should not call %s\n", __func__);
> +   return -EINVAL;
> +   } else {
> +   return -EINVAL;
> +   }
> +
> +   if (!dst)
> +   return -ENOSPC;
> +
> +   memcpy(dst, record->buf, record->size);
> +
> +   return 0;
> +}
> +
> +static int notrace efi_capsule_pstore_write_user(struct pstore_record 
> *record, const char __user *buf)
> +{
> +   struct efi_capsule_pstore *pctx = record->psi->data;
> +   void *dst;
> +
> +   if (record->type == PSTORE_TYPE_PMSG) {
> +   dst = get_pstore_write_ring_buf(>pmsg, record->size);
> +   if (!dst)
> +   return -ENOSPC;
> +
> +   if (unlikely(__copy_from_user(dst, buf, record->size)))
> +   return -EFAULT;
> +
> +   return 0;
> +   }
> +
> +   return -EINVAL;
> +}
> +
> +static __init int check_capsule_support(void)
> +{
> +   int rv;
> +
> +   efi_guid_t guid = LINUX_EFI_CRASH_GUID;
> +   u32 flags = EFI_CAPSULE_PERSIST_ACROSS_RESET |
> +   EFI_CAPSULE_POPULATE_SYSTEM_TABLE;
> +
> +   rv = efi_capsule_supported(guid, flags, CAPSULE_SIZE, 
> _reset_type);
> +   if (rv)
> +   return rv;
> +
> +   return 0;
> +}
> +
> +static __init int efi_capsule_pstore_init(void)
> +{
> +   int rv = 0;
> +
> +   if (!efi_enabled(EFI_RUNTIME_SERVICES))
> +   return -ENODEV;
> +
> +   if (efi_capsule_pstore_disable) {
> +   pr_info("Capsule-pstore is disabled\n ");
> +   return 0;
> +   }
> +
> +   rv = check_capsule_support();
> +   if (rv) {
> +   pr_info("Capsule-pstore is not supported: %d\n ", rv);
> +   return rv;
> +   }
> +
> +   rv = efi_capsule_pstore_setup();
> +   if (rv) {
> +   pr_err("Fail to setup capsule-pstore: %d\n", rv);
> +   return rv;
> +   }
> +
> +   /*
> +* Once the memory reserved for capsules passed to the firmware
> +* by EFI UpdateCapsule(), there's no way to let firmware give up
> +* the capsules. So this module cannot be unloaded once loaded.
> +*/
> +   __module_get(THIS_MODULE);
> +
> +   return 0;
> +}
> +
> +static __exit void efi_capsule_pstore_exit(void) {}
> +
> +module_init(efi_capsule_pstore_init);
> +module_exit(efi_capsule_pstore_exit);
> +
> +module_param_named(pstore_disable, efi_capsule_pstore_disable, bool, 0644);
> +
> +MODULE_DESCRIPTION("EFI capsule backend for pstore");
> +MODULE_LICENSE("GPL v2");
> --
> 2.9.0.GIT
>

I can't speak well to the EFI portions, but the after fixing the nits
above, this looks fine to me. :)

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] efi-pstore: Fix write/erase id tracking

2017-05-19 Thread Kees Cook

Prior to the pstore interface refactoring, the "id" generated during
a backend pstore_write() was only retained by the internal pstore
inode tracking list. Additionally the "part" was ignored, so EFI
would encode this in the id. This corrects the misunderstandings
and correctly sets "id" during pstore_write(), and uses "part"
directly during pstore_erase().

Reported-by: Marta Lofstedt <marta.lofst...@intel.com>
Fixes: 76cc9580e3fb ("pstore: Replace arguments for write() API")
Fixes: a61072aae693 ("pstore: Replace arguments for erase() API")
Signed-off-by: Kees Cook <keesc...@chromium.org>
---
Since the pstore refactor broke this, I intend to push this via the
pstore tree.
---
 drivers/firmware/efi/efi-pstore.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/firmware/efi/efi-pstore.c 
b/drivers/firmware/efi/efi-pstore.c
index 44148fd4c9f2..dda2e96328c0 100644
--- a/drivers/firmware/efi/efi-pstore.c
+++ b/drivers/firmware/efi/efi-pstore.c
@@ -53,6 +53,7 @@ static int efi_pstore_read_func(struct efivar_entry *entry,
if (sscanf(name, "dump-type%u-%u-%d-%lu-%c",
   >type, , , , _type) == 5) {
record->id = generic_id(time, part, cnt);
+   record->part = part;
record->count = cnt;
record->time.tv_sec = time;
record->time.tv_nsec = 0;
@@ -64,6 +65,7 @@ static int efi_pstore_read_func(struct efivar_entry *entry,
} else if (sscanf(name, "dump-type%u-%u-%d-%lu",
   >type, , , ) == 4) {
record->id = generic_id(time, part, cnt);
+   record->part = part;
record->count = cnt;
record->time.tv_sec = time;
record->time.tv_nsec = 0;
@@ -77,6 +79,7 @@ static int efi_pstore_read_func(struct efivar_entry *entry,
 * multiple logs, remains.
 */
record->id = generic_id(time, part, 0);
+   record->part = part;
record->count = 0;
record->time.tv_sec = time;
record->time.tv_nsec = 0;
@@ -241,9 +244,15 @@ static int efi_pstore_write(struct pstore_record *record)
efi_guid_t vendor = LINUX_EFI_CRASH_GUID;
int i, ret = 0;
 
+   record->time.tv_sec = get_seconds();
+   record->time.tv_nsec = 0;
+
+   record->id = generic_id(record->time.tv_sec, record->part,
+   record->count);
+
snprintf(name, sizeof(name), "dump-type%u-%u-%d-%lu-%c",
 record->type, record->part, record->count,
-get_seconds(), record->compressed ? 'C' : 'D');
+record->time.tv_sec, record->compressed ? 'C' : 'D');
 
for (i = 0; i < DUMP_NAME_LEN; i++)
efi_name[i] = name[i];
@@ -255,7 +264,6 @@ static int efi_pstore_write(struct pstore_record *record)
if (record->reason == KMSG_DUMP_OOPS)
efivar_run_worker();
 
-   record->id = record->part;
return ret;
 };
 
@@ -287,7 +295,7 @@ static int efi_pstore_erase_func(struct efivar_entry 
*entry, void *data)
 * holding multiple logs, remains.
 */
snprintf(name_old, sizeof(name_old), "dump-type%u-%u-%lu",
-   ed->record->type, (unsigned int)ed->record->id,
+   ed->record->type, ed->record->part,
ed->record->time.tv_sec);
 
for (i = 0; i < DUMP_NAME_LEN; i++)
@@ -320,10 +328,7 @@ static int efi_pstore_erase(struct pstore_record *record)
char name[DUMP_NAME_LEN];
efi_char16_t efi_name[DUMP_NAME_LEN];
int found, i;
-   unsigned int part;
 
-   do_div(record->id, 1000);
-   part = do_div(record->id, 100);
snprintf(name, sizeof(name), "dump-type%u-%u-%d-%lu",
 record->type, record->part, record->count,
 record->time.tv_sec);
-- 
2.7.4


-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] efi-pstore: Fix read iter after pstore API refactor

2017-05-18 Thread Kees Cook

On Thu, May 18, 2017 at 9:18 AM, Ard Biesheuvel
<ard.biesheu...@linaro.org> wrote:
> On 18 May 2017 at 14:01, Kees Cook <keesc...@chromium.org> wrote:
>> On Thu, May 18, 2017 at 3:35 AM, Ard Biesheuvel
>> <ard.biesheu...@linaro.org> wrote:
>>> On 12 May 2017 at 22:58, Kees Cook <keesc...@chromium.org> wrote:
>>>> During the internal pstore API refactoring, the EFI vars read entry was
>>>> accidentally made to update a stack variable instead of the pstore
>>>> private data pointer. This corrects the problem (and removes the now
>>>> needless argument).
>>>>
>>>> Signed-off-by: Kees Cook <keesc...@chromium.org>
>>>
>>> Does this need a cc stable?
>>
>> No, the refactor first appeared in 4.12-rc1.
>>
>
> OK. Applied.

Err, eek, no, it's already gone into Linus's tree.

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] efi-pstore: Fix read iter after pstore API refactor

2017-05-18 Thread Kees Cook

On Thu, May 18, 2017 at 3:35 AM, Ard Biesheuvel
<ard.biesheu...@linaro.org> wrote:
> On 12 May 2017 at 22:58, Kees Cook <keesc...@chromium.org> wrote:
>> During the internal pstore API refactoring, the EFI vars read entry was
>> accidentally made to update a stack variable instead of the pstore
>> private data pointer. This corrects the problem (and removes the now
>> needless argument).
>>
>> Signed-off-by: Kees Cook <keesc...@chromium.org>
>
> Does this need a cc stable?

No, the refactor first appeared in 4.12-rc1.

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] efi-pstore: Fix read iter after pstore API refactor

2017-05-12 Thread Kees Cook

During the internal pstore API refactoring, the EFI vars read entry was
accidentally made to update a stack variable instead of the pstore
private data pointer. This corrects the problem (and removes the now
needless argument).

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 drivers/firmware/efi/efi-pstore.c | 26 ++
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/firmware/efi/efi-pstore.c 
b/drivers/firmware/efi/efi-pstore.c
index 93d8cdbe7ef4..9e6f14c354f1 100644
--- a/drivers/firmware/efi/efi-pstore.c
+++ b/drivers/firmware/efi/efi-pstore.c
@@ -155,25 +155,20 @@ static int efi_pstore_scan_sysfs_exit(struct efivar_entry 
*pos,
  * efi_pstore_sysfs_entry_iter
  *
  * @record: pstore record to pass to callback
- * @pos: entry to begin iterating from
  *
  * You MUST call efivar_enter_iter_begin() before this function, and
  * efivar_entry_iter_end() afterwards.
  *
- * It is possible to begin iteration from an arbitrary entry within
- * the list by passing @pos. @pos is updated on return to point to
- * the next entry of the last one passed to efi_pstore_read_func().
- * To begin iterating from the beginning of the list @pos must be %NULL.
  */
-static int efi_pstore_sysfs_entry_iter(struct pstore_record *record,
-  struct efivar_entry **pos)
+static int efi_pstore_sysfs_entry_iter(struct pstore_record *record)
 {
+   struct efivar_entry *pos = (struct efivar_entry *)record->psi->data;
struct efivar_entry *entry, *n;
struct list_head *head = _sysfs_list;
int size = 0;
int ret;
 
-   if (!*pos) {
+   if (!pos) {
list_for_each_entry_safe(entry, n, head, list) {
efi_pstore_scan_sysfs_enter(entry, n, head);
 
@@ -185,21 +180,21 @@ static int efi_pstore_sysfs_entry_iter(struct 
pstore_record *record,
if (size)
break;
}
-   *pos = n;
+   pos = n;
return size;
}
 
-   list_for_each_entry_safe_from((*pos), n, head, list) {
-   efi_pstore_scan_sysfs_enter((*pos), n, head);
+   list_for_each_entry_safe_from(pos, n, head, list) {
+   efi_pstore_scan_sysfs_enter(pos, n, head);
 
-   size = efi_pstore_read_func((*pos), record);
-   ret = efi_pstore_scan_sysfs_exit((*pos), n, head, size < 0);
+   size = efi_pstore_read_func(pos, record);
+   ret = efi_pstore_scan_sysfs_exit(pos, n, head, size < 0);
if (ret)
return ret;
if (size)
break;
}
-   *pos = n;
+   pos = n;
return size;
 }
 
@@ -218,7 +213,6 @@ static int efi_pstore_sysfs_entry_iter(struct pstore_record 
*record,
  */
 static ssize_t efi_pstore_read(struct pstore_record *record)
 {
-   struct efivar_entry *entry = (struct efivar_entry *)record->psi->data;
ssize_t size;
 
record->buf = kzalloc(EFIVARS_DATA_SIZE_MAX, GFP_KERNEL);
@@ -229,7 +223,7 @@ static ssize_t efi_pstore_read(struct pstore_record *record)
size = -EINTR;
goto out;
}
-   size = efi_pstore_sysfs_entry_iter(record, );
+   size = efi_pstore_sysfs_entry_iter(record);
efivar_entry_iter_end();
 
 out:
-- 
2.7.4


-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 14/24] x86: Restrict MSR access when the kernel is locked down

2017-04-05 Thread Kees Cook

On Wed, Apr 5, 2017 at 10:12 AM, David Howells <dhowe...@redhat.com> wrote:
> From: Matthew Garrett <matthew.garr...@nebula.com>
>
> Writing to MSRs should not be allowed if the kernel is locked down, since
> it could lead to execution of arbitrary code in kernel mode.  Based on a
> patch by Kees Cook.
>
> Cc: Kees Cook <keesc...@chromium.org>
> Signed-off-by: Matthew Garrett <matthew.garr...@nebula.com>
> Signed-off-by: David Howells <dhowe...@redhat.com>

Acked-by: Kees Cook <keesc...@chromium.org>

-Kees

> ---
>
>  arch/x86/kernel/msr.c |7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/arch/x86/kernel/msr.c b/arch/x86/kernel/msr.c
> index ef688804f80d..fbcce028e502 100644
> --- a/arch/x86/kernel/msr.c
> +++ b/arch/x86/kernel/msr.c
> @@ -84,6 +84,9 @@ static ssize_t msr_write(struct file *file, const char 
> __user *buf,
> int err = 0;
> ssize_t bytes = 0;
>
> +   if (kernel_is_locked_down())
> +   return -EPERM;
> +
> if (count % 8)
> return -EINVAL; /* Invalid chunk size */
>
> @@ -131,6 +134,10 @@ static long msr_ioctl(struct file *file, unsigned int 
> ioc, unsigned long arg)
> err = -EBADF;
> break;
> }
> +   if (kernel_is_locked_down()) {
> +   err = -EPERM;
> +   break;
> +   }
> if (copy_from_user(, uregs, sizeof regs)) {
> err = -EFAULT;
> break;
>



-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/18] pstore: Extract common arguments into structure

2017-03-07 Thread Kees Cook

On Tue, Mar 7, 2017 at 8:22 AM, Namhyung Kim <namhy...@gmail.com> wrote:
> On Tue, Mar 7, 2017 at 6:55 AM, Kees Cook <keesc...@chromium.org> wrote:
>> The read/mkfile pair pass the same arguments and should be cleared
>> between calls. Move to a structure and wipe it after every loop.
>>
>> Signed-off-by: Kees Cook <keesc...@chromium.org>
>> ---
>>  fs/pstore/platform.c   | 55 
>> +++---
>>  include/linux/pstore.h | 28 -
>>  2 files changed, 57 insertions(+), 26 deletions(-)
>>
>> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
>> index 320a673ecb5b..3fa1575a6e36 100644
>> --- a/fs/pstore/platform.c
>> +++ b/fs/pstore/platform.c
>> @@ -766,16 +766,9 @@ EXPORT_SYMBOL_GPL(pstore_unregister);
>>  void pstore_get_records(int quiet)
>>  {
>> struct pstore_info *psi = psinfo;
>> -   char*buf = NULL;
>> -   ssize_t size;
>> -   u64 id;
>> -   int count;
>> -   enum pstore_type_id type;
>> -   struct timespec time;
>> +   struct pstore_recordrecord = { .psi = psi, };
>> int failed = 0, rc;
>> -   boolcompressed;
>> int unzipped_len = -1;
>> -   ssize_t ecc_notice_size = 0;
>>
>> if (!psi)
>> return;
>> @@ -784,39 +777,51 @@ void pstore_get_records(int quiet)
>> if (psi->open && psi->open(psi))
>> goto out;
>>
>> -   while ((size = psi->read(, , , , , 
>> ,
>> -_notice_size, psi)) > 0) {
>> -   if (compressed && (type == PSTORE_TYPE_DMESG)) {
>> +   while ((record.size = psi->read(, ,
>> +, ,
>> +, ,
>> +_notice_size,
>> +record.psi)) > 0) {
>> +   if (record.compressed &&
>> +   record.type == PSTORE_TYPE_DMESG) {
>> if (big_oops_buf)
>> -   unzipped_len = pstore_decompress(buf,
>> -   big_oops_buf, size,
>> +   unzipped_len = pstore_decompress(
>> +   record.buf,
>> +   big_oops_buf,
>> +   record.size,
>> big_oops_buf_sz);
>>
>> if (unzipped_len > 0) {
>> -   if (ecc_notice_size)
>> +   if (record.ecc_notice_size)
>> memcpy(big_oops_buf + unzipped_len,
>> -  buf + size, ecc_notice_size);
>> -   kfree(buf);
>> -   buf = big_oops_buf;
>> -   size = unzipped_len;
>> -   compressed = false;
>> +  record.buf + recorrecord.size,
>
> A typo on record.size.

Thanks! Yeah, 0-day noticed this too. I've refreshed the patches in my
tree with the correction now.

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 03/18] pstore: Avoid race in module unloading

2017-03-07 Thread Kees Cook

On Tue, Mar 7, 2017 at 8:16 AM, Namhyung Kim <namhy...@gmail.com> wrote:
> Hi Kees,
>
> On Tue, Mar 7, 2017 at 6:55 AM, Kees Cook <keesc...@chromium.org> wrote:
>> Technically, it might be possible for struct pstore_info to go out of
>> scope after the module_put(), so report the backend name first.
>
> But in that case, using pstore will crash the kernel anyway, right?
> If so, why pstore doesn't keep a reference until unregister?
> Do I miss something?

I could be wrong with this, since the backend can't call unregister
until register has finished... I'll drop this patch.

-Kees

>
> Thanks,
> Namhyung
>
>
>>
>> Signed-off-by: Kees Cook <keesc...@chromium.org>
>> ---
>>  fs/pstore/platform.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
>> index 074fe85a2078..d69ef8a840b9 100644
>> --- a/fs/pstore/platform.c
>> +++ b/fs/pstore/platform.c
>> @@ -722,10 +722,10 @@ int pstore_register(struct pstore_info *psi)
>>  */
>> backend = psi->name;
>>
>> -   module_put(owner);
>> -
>> pr_info("Registered %s as persistent store backend\n", psi->name);
>>
>> +   module_put(owner);
>> +
>> return 0;
>>  }
>>  EXPORT_SYMBOL_GPL(pstore_register);
>> --
>> 2.7.4
>>



-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 02/18] pstore: Shut down worker when unregistering

2017-03-06 Thread Kees Cook

When built as a module and running with update_ms >= 0, pstore will Oops
during module unload since the work timer is still running. This makes sure
the worker is stopped before unloading.

Signed-off-by: Kees Cook <keesc...@chromium.org>
Cc: sta...@vger.kernel.org
---
 fs/pstore/platform.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index cfc1abd264d9..074fe85a2078 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -709,6 +709,7 @@ int pstore_register(struct pstore_info *psi)
if (psi->flags & PSTORE_FLAGS_PMSG)
pstore_register_pmsg();
 
+   /* Start watching for new records, if desired. */
if (pstore_update_ms >= 0) {
pstore_timer.expires = jiffies +
msecs_to_jiffies(pstore_update_ms);
@@ -731,6 +732,11 @@ EXPORT_SYMBOL_GPL(pstore_register);
 
 void pstore_unregister(struct pstore_info *psi)
 {
+   /* Stop timer and make sure all work has finished. */
+   pstore_update_ms = -1;
+   del_timer_sync(_timer);
+   flush_work(_work);
+
if (psi->flags & PSTORE_FLAGS_PMSG)
pstore_unregister_pmsg();
if (psi->flags & PSTORE_FLAGS_FTRACE)
@@ -830,7 +836,9 @@ static void pstore_timefunc(unsigned long dummy)
schedule_work(_work);
}
 
-   mod_timer(_timer, jiffies + msecs_to_jiffies(pstore_update_ms));
+   if (pstore_update_ms >= 0)
+   mod_timer(_timer,
+ jiffies + msecs_to_jiffies(pstore_update_ms));
 }
 
 module_param(backend, charp, 0444);
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 04/18] pstore: Improve register_pstore() error reporting

2017-03-06 Thread Kees Cook

Uncommon errors are better to get reported to dmesg so developers can
more easily figure out why pstore is unhappy with a backend attempting
to register.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 fs/pstore/platform.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index d69ef8a840b9..320a673ecb5b 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -673,11 +673,15 @@ int pstore_register(struct pstore_info *psi)
 {
struct module *owner = psi->owner;
 
-   if (backend && strcmp(backend, psi->name))
+   if (backend && strcmp(backend, psi->name)) {
+   pr_warn("ignoring unexpected backend '%s'\n", psi->name);
return -EPERM;
+   }
 
spin_lock(_lock);
if (psinfo) {
+   pr_warn("backend '%s' already loaded: ignoring '%s'\n",
+   psinfo->name, psi->name);
spin_unlock(_lock);
return -EBUSY;
}
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 13/18] pstore: Allocate records on heap instead of stack

2017-03-06 Thread Kees Cook

In preparation for handling records off to pstore_mkfile(), allocate the
record instead of reusing stack. This still always frees the record,
though, since pstore_mkfile() isn't yet keeping it.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 fs/pstore/platform.c | 30 ++
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index d897e2f11b6a..072326625629 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -818,8 +818,7 @@ static void decompress_record(struct pstore_record *record)
 void pstore_get_records(int quiet)
 {
struct pstore_info *psi = psinfo;
-   struct pstore_recordrecord = { .psi = psi, };
-   int failed = 0, rc;
+   int failed = 0;
 
if (!psi)
return;
@@ -833,19 +832,34 @@ void pstore_get_records(int quiet)
 * may reallocate record.buf. On success, pstore_mkfile() will keep
 * the record.buf, so free it only on failure.
 */
-   while ((record.size = psi->read()) > 0) {
-   decompress_record();
-   rc = pstore_mkfile();
+   for (;;) {
+   struct pstore_record *record;
+   int rc;
+
+   record = kzalloc(sizeof(*record), GFP_KERNEL);
+   if (!record) {
+   pr_err("out of memory creating record\n");
+   break;
+   }
+   record->psi = psi;
+
+   record->size = psi->read(record);
+
+   /* No more records left in backend? */
+   if (record->size <= 0)
+   break;
+
+   decompress_record(record);
+   rc = pstore_mkfile(record);
if (rc) {
/* pstore_mkfile() did not take buf, so free it. */
-   kfree(record.buf);
+   kfree(record->buf);
if (rc != -EEXIST || !quiet)
failed++;
}
 
/* Reset for next record. */
-   memset(, 0, sizeof(record));
-   record.psi = psi;
+   kfree(record);
}
if (psi->close)
psi->close(psi);
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 12/18] pstore: Pass record contents instead of copying

2017-03-06 Thread Kees Cook

pstore_mkfile() shouldn't have to memcpy the record contents. It can use
the existing copy instead. This adjusts the allocation lifetime management
and renames the contents variable from "data" to "buf" to assist moving to
struct pstore_record in the future.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 fs/pstore/inode.c| 22 +++---
 fs/pstore/platform.c | 16 
 2 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index a98787bab3e6..3d1f047e4f41 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -52,7 +52,7 @@ struct pstore_private {
u64 id;
int count;
ssize_t size;
-   chardata[];
+   char*buf;
 };
 
 struct pstore_ftrace_seq_data {
@@ -63,6 +63,14 @@ struct pstore_ftrace_seq_data {
 
 #define REC_SIZE sizeof(struct pstore_ftrace_record)
 
+static void free_pstore_private(struct pstore_private *private)
+{
+   if (!private)
+   return;
+   kfree(private->buf);
+   kfree(private);
+}
+
 static void *pstore_ftrace_seq_start(struct seq_file *s, loff_t *pos)
 {
struct pstore_private *ps = s->private;
@@ -105,7 +113,7 @@ static int pstore_ftrace_seq_show(struct seq_file *s, void 
*v)
 {
struct pstore_private *ps = s->private;
struct pstore_ftrace_seq_data *data = v;
-   struct pstore_ftrace_record *rec = (void *)(ps->data + data->off);
+   struct pstore_ftrace_record *rec = (void *)(ps->buf + data->off);
 
seq_printf(s, "CPU:%d ts:%llu %08lx  %08lx  %pf <- %pF\n",
   pstore_ftrace_decode_cpu(rec),
@@ -143,7 +151,7 @@ static ssize_t pstore_file_read(struct file *file, char 
__user *userbuf,
 
if (ps->type == PSTORE_TYPE_FTRACE)
return seq_read(file, userbuf, count, ppos);
-   return simple_read_from_buffer(userbuf, count, ppos, ps->data, 
ps->size);
+   return simple_read_from_buffer(userbuf, count, ppos, ps->buf, ps->size);
 }
 
 static int pstore_file_open(struct inode *inode, struct file *file)
@@ -221,7 +229,7 @@ static void pstore_evict_inode(struct inode *inode)
spin_lock_irqsave(_lock, flags);
list_del(>list);
spin_unlock_irqrestore(_lock, flags);
-   kfree(p);
+   free_pstore_private(p);
}
 }
 
@@ -332,7 +340,7 @@ int pstore_mkfile(struct pstore_record *record)
goto fail;
inode->i_mode = S_IFREG | 0444;
inode->i_fop = _file_operations;
-   private = kmalloc(sizeof *private + size, GFP_KERNEL);
+   private = kzalloc(sizeof(*private), GFP_KERNEL);
if (!private)
goto fail_alloc;
private->type = record->type;
@@ -394,7 +402,7 @@ int pstore_mkfile(struct pstore_record *record)
if (!dentry)
goto fail_lockedalloc;
 
-   memcpy(private->data, record->buf, size);
+   private->buf = record->buf;
inode->i_size = private->size = size;
 
inode->i_private = private;
@@ -414,7 +422,7 @@ int pstore_mkfile(struct pstore_record *record)
 
 fail_lockedalloc:
inode_unlock(d_inode(root));
-   kfree(private);
+   free_pstore_private(private);
 fail_alloc:
iput(inode);
 
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index c0d401e732e6..d897e2f11b6a 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -828,14 +828,22 @@ void pstore_get_records(int quiet)
if (psi->open && psi->open(psi))
goto out;
 
+   /*
+* Backend callback read() allocates record.buf. decompress_record()
+* may reallocate record.buf. On success, pstore_mkfile() will keep
+* the record.buf, so free it only on failure.
+*/
while ((record.size = psi->read()) > 0) {
decompress_record();
rc = pstore_mkfile();
+   if (rc) {
+   /* pstore_mkfile() did not take buf, so free it. */
+   kfree(record.buf);
+   if (rc != -EEXIST || !quiet)
+   failed++;
+   }
 
-   if (rc && (rc != -EEXIST || !quiet))
-   failed++;
-
-   kfree(record.buf);
+   /* Reset for next record. */
memset(, 0, sizeof(record));
record.psi = psi;
}
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 18/18] pstore: Remove write_buf() callback

2017-03-06 Thread Kees Cook

Now that write() and write_buf() are functionally identical, this removes
write_buf(), and renames write_buf_user() to write_user(). Additionally
adds sanity-checks for pstore_info's declared functions and flags at
registration time.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 fs/pstore/ftrace.c |  4 ++--
 fs/pstore/platform.c   | 35 ---
 fs/pstore/pmsg.c   |  4 ++--
 fs/pstore/ram.c| 10 +-
 include/linux/pstore.h | 29 ++---
 5 files changed, 39 insertions(+), 43 deletions(-)

diff --git a/fs/pstore/ftrace.c b/fs/pstore/ftrace.c
index a5506ec6995e..06aab07b6bb7 100644
--- a/fs/pstore/ftrace.c
+++ b/fs/pstore/ftrace.c
@@ -53,7 +53,7 @@ static void notrace pstore_ftrace_call(unsigned long ip,
rec.parent_ip = parent_ip;
pstore_ftrace_write_timestamp(, pstore_ftrace_stamp++);
pstore_ftrace_encode_cpu(, raw_smp_processor_id());
-   psinfo->write_buf();
+   psinfo->write();
 
local_irq_restore(flags);
 }
@@ -122,7 +122,7 @@ void pstore_register_ftrace(void)
 {
struct dentry *file;
 
-   if (!psinfo->write_buf)
+   if (!psinfo->write)
return;
 
pstore_ftrace_dir = debugfs_create_dir("pstore", NULL);
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 1e6642a2063e..e79f170fa79b 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -604,7 +604,7 @@ static void pstore_console_write(struct console *con, const 
char *s, unsigned c)
}
record.buf = (char *)s;
record.size = c;
-   psinfo->write_buf();
+   psinfo->write();
spin_unlock_irqrestore(>buf_lock, flags);
s += c;
c = e - s;
@@ -632,15 +632,8 @@ static void pstore_register_console(void) {}
 static void pstore_unregister_console(void) {}
 #endif
 
-static int pstore_write_compat(struct pstore_record *record)
-{
-   record->buf = psinfo->buf;
-
-   return record->psi->write_buf(record);
-}
-
-static int pstore_write_buf_user_compat(struct pstore_record *record,
-   const char __user *buf)
+static int pstore_write_user_compat(struct pstore_record *record,
+   const char __user *buf)
 {
unsigned long flags = 0;
size_t i, bufsize, total_size = record->size;
@@ -662,7 +655,7 @@ static int pstore_write_buf_user_compat(struct 
pstore_record *record,
break;
}
record->size = c;
-   ret = record->psi->write_buf(record);
+   ret = record->psi->write(record);
if (unlikely(ret < 0))
break;
i += c;
@@ -687,6 +680,20 @@ int pstore_register(struct pstore_info *psi)
return -EPERM;
}
 
+   /* Sanity check flags. */
+   if (!psi->flags) {
+   pr_warn("backend '%s' must support at least one frontend\n",
+   psi->name);
+   return -EINVAL;
+   }
+
+   /* Check for required functions. */
+   if (!psi->read || !psi->write) {
+   pr_warn("backend '%s' must implement read() and write()\n",
+   psi->name);
+   return -EINVAL;
+   }
+
spin_lock(_lock);
if (psinfo) {
pr_warn("backend '%s' already loaded: ignoring '%s'\n",
@@ -695,10 +702,8 @@ int pstore_register(struct pstore_info *psi)
return -EBUSY;
}
 
-   if (!psi->write)
-   psi->write = pstore_write_compat;
-   if (!psi->write_buf_user)
-   psi->write_buf_user = pstore_write_buf_user_compat;
+   if (!psi->write_user)
+   psi->write_user = pstore_write_user_compat;
psinfo = psi;
mutex_init(>read_mutex);
spin_unlock(_lock);
diff --git a/fs/pstore/pmsg.c b/fs/pstore/pmsg.c
index ce35907602de..c16a2477e106 100644
--- a/fs/pstore/pmsg.c
+++ b/fs/pstore/pmsg.c
@@ -33,12 +33,12 @@ static ssize_t write_pmsg(struct file *file, const char 
__user *buf,
if (!count)
return 0;
 
-   /* check outside lock, page in any data. write_buf_user also checks */
+   /* check outside lock, page in any data. write_user also checks */
if (!access_ok(VERIFY_READ, buf, count))
return -EFAULT;
 
mutex_lock(_lock);
-   ret = psinfo->write_buf_user(, buf);
+   ret = psinfo->write_user(, buf);
mutex_unlock(_lock);
return ret ? ret : count;
 }
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index d85e1adae1b6..5523df7f17ef 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -378,7 +378,7 @@ static size_t ramoops_write_kmsg_hdr(struct 
per

[PATCH 11/18] pstore: Always allocate buffer for decompression

2017-03-06 Thread Kees Cook

Currently, pstore_mkfile() performs a memcpy() of the record contents,
so it can live anywhere. However, this is needlessly wasteful. In
preparation of pstore_mkfile() keeping the record contents, always
allocate a buffer for the contents.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 fs/pstore/platform.c | 38 --
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 879658b4c679..c0d401e732e6 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -768,6 +768,7 @@ EXPORT_SYMBOL_GPL(pstore_unregister);
 static void decompress_record(struct pstore_record *record)
 {
int unzipped_len;
+   char *decompressed;
 
/* Only PSTORE_TYPE_DMESG support compression. */
if (!record->compressed || record->type != PSTORE_TYPE_DMESG) {
@@ -783,17 +784,29 @@ static void decompress_record(struct pstore_record 
*record)
 
unzipped_len = pstore_decompress(record->buf, big_oops_buf,
 record->size, big_oops_buf_sz);
-   if (unzipped_len > 0) {
-   if (record->ecc_notice_size)
-   memcpy(big_oops_buf + unzipped_len,
-  record->buf + record->size,
-  record->ecc_notice_size);
-   kfree(record->buf);
-   record->buf = big_oops_buf;
-   record->size = unzipped_len;
-   record->compressed = false;
-   } else
+   if (unzipped_len <= 0) {
pr_err("decompression failed: %d\n", unzipped_len);
+   return;
+   }
+
+   /* Build new buffer for decompressed contents. */
+   decompressed = kmalloc(unzipped_len + record->ecc_notice_size,
+  GFP_KERNEL);
+   if (!decompressed) {
+   pr_err("decompression ran out of memory\n");
+   return;
+   }
+   memcpy(decompressed, big_oops_buf, unzipped_len);
+
+   /* Append ECC notice to decompressed buffer. */
+   memcpy(decompressed + unzipped_len, record->buf + record->size,
+  record->ecc_notice_size);
+
+   /* Swap out compresed contents with decompressed contents. */
+   kfree(record->buf);
+   record->buf = decompressed;
+   record->size = unzipped_len;
+   record->compressed = false;
 }
 
 /*
@@ -819,13 +832,10 @@ void pstore_get_records(int quiet)
decompress_record();
rc = pstore_mkfile();
 
-   /* Free buffer other than big oops */
-   if (record.buf != big_oops_buf)
-   kfree(record.buf);
-
if (rc && (rc != -EEXIST || !quiet))
failed++;
 
+   kfree(record.buf);
memset(, 0, sizeof(record));
record.psi = psi;
}
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 15/18] pstore: Replace arguments for erase() API

2017-03-06 Thread Kees Cook

This removes the argument list for the erase() callback and replaces it
with a pointer to the backend record details to be removed.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 drivers/acpi/apei/erst.c  |  8 +++-
 drivers/firmware/efi/efi-pstore.c | 26 +++---
 fs/pstore/inode.c | 12 +---
 fs/pstore/ram.c   | 15 +++
 include/linux/pstore.h| 16 +---
 5 files changed, 31 insertions(+), 46 deletions(-)

diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index 440588d189e7..7207e5fc9d3d 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -927,8 +927,7 @@ static int erst_open_pstore(struct pstore_info *psi);
 static int erst_close_pstore(struct pstore_info *psi);
 static ssize_t erst_reader(struct pstore_record *record);
 static int erst_writer(struct pstore_record *record);
-static int erst_clearer(enum pstore_type_id type, u64 id, int count,
-   struct timespec time, struct pstore_info *psi);
+static int erst_clearer(struct pstore_record *record);
 
 static struct pstore_info erst_info = {
.owner  = THIS_MODULE,
@@ -1100,10 +1099,9 @@ static int erst_writer(struct pstore_record *record)
return ret;
 }
 
-static int erst_clearer(enum pstore_type_id type, u64 id, int count,
-   struct timespec time, struct pstore_info *psi)
+static int erst_clearer(struct pstore_record *record)
 {
-   return erst_clear(id);
+   return erst_clear(record->id);
 }
 
 static int __init erst_init(void)
diff --git a/drivers/firmware/efi/efi-pstore.c 
b/drivers/firmware/efi/efi-pstore.c
index f81e3ec6f1c0..93d8cdbe7ef4 100644
--- a/drivers/firmware/efi/efi-pstore.c
+++ b/drivers/firmware/efi/efi-pstore.c
@@ -266,10 +266,7 @@ static int efi_pstore_write(struct pstore_record *record)
 };
 
 struct pstore_erase_data {
-   u64 id;
-   enum pstore_type_id type;
-   int count;
-   struct timespec time;
+   struct pstore_record *record;
efi_char16_t *name;
 };
 
@@ -295,8 +292,9 @@ static int efi_pstore_erase_func(struct efivar_entry 
*entry, void *data)
 * Check if an old format, which doesn't support
 * holding multiple logs, remains.
 */
-   sprintf(name_old, "dump-type%u-%u-%lu", ed->type,
-   (unsigned int)ed->id, ed->time.tv_sec);
+   snprintf(name_old, sizeof(name_old), "dump-type%u-%u-%lu",
+   ed->record->type, (unsigned int)ed->record->id,
+   ed->record->time.tv_sec);
 
for (i = 0; i < DUMP_NAME_LEN; i++)
efi_name_old[i] = name_old[i];
@@ -321,8 +319,7 @@ static int efi_pstore_erase_func(struct efivar_entry 
*entry, void *data)
return 1;
 }
 
-static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
-   struct timespec time, struct pstore_info *psi)
+static int efi_pstore_erase(struct pstore_record *record)
 {
struct pstore_erase_data edata;
struct efivar_entry *entry = NULL;
@@ -331,17 +328,16 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 
id, int count,
int found, i;
unsigned int part;
 
-   do_div(id, 1000);
-   part = do_div(id, 100);
-   sprintf(name, "dump-type%u-%u-%d-%lu", type, part, count, time.tv_sec);
+   do_div(record->id, 1000);
+   part = do_div(record->id, 100);
+   snprintf(name, sizeof(name), "dump-type%u-%u-%d-%lu",
+record->type, record->part, record->count,
+record->time.tv_sec);
 
for (i = 0; i < DUMP_NAME_LEN; i++)
efi_name[i] = name[i];
 
-   edata.id = part;
-   edata.type = type;
-   edata.count = count;
-   edata.time = time;
+   edata.record = record;
edata.name = efi_name;
 
if (efivar_entry_iter_begin())
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 0ea281b457fa..06504b69575b 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -210,14 +210,12 @@ static int pstore_unlink(struct inode *dir, struct dentry 
*dentry)
if (err)
return err;
 
-   if (record->psi->erase) {
-   mutex_lock(>psi->read_mutex);
-   record->psi->erase(record->type, record->id, record->count,
- d_inode(dentry)->i_ctime, record->psi);
-   mutex_unlock(>psi->read_mutex);
-   } else {
+   if (!record->psi->erase)
return -EPERM;
-   }
+
+   mutex_lock(>psi->read_mutex);
+   record->psi->erase(record);
+   mutex_unlock(>psi->read_mutex);
 
return simple_unlink(dir, dentry);
 }
diff --g

[PATCH 08/18] pstore: Switch pstore_mkfile to pass record

2017-03-06 Thread Kees Cook

Instead of the long list of arguments, just pass the new record struct.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 fs/pstore/inode.c| 57 +---
 fs/pstore/internal.h |  5 +
 fs/pstore/platform.c |  6 +-
 3 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 57c0646479f5..a98787bab3e6 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -302,9 +302,7 @@ bool pstore_is_mounted(void)
  * Load it up with "size" bytes of data from "buf".
  * Set the mtime & ctime to the date that this record was originally stored.
  */
-int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
- char *data, bool compressed, size_t size,
- struct timespec time, struct pstore_info *psi)
+int pstore_mkfile(struct pstore_record *record)
 {
struct dentry   *root = pstore_sb->s_root;
struct dentry   *dentry;
@@ -313,12 +311,13 @@ int pstore_mkfile(enum pstore_type_id type, char *psname, 
u64 id, int count,
charname[PSTORE_NAMELEN];
struct pstore_private   *private, *pos;
unsigned long   flags;
+   size_t  size = record->size + record->ecc_notice_size;
 
spin_lock_irqsave(_lock, flags);
list_for_each_entry(pos, , list) {
-   if (pos->type == type &&
-   pos->id == id &&
-   pos->psi == psi) {
+   if (pos->type == record->type &&
+   pos->id == record->id &&
+   pos->psi == record->psi) {
rc = -EEXIST;
break;
}
@@ -336,48 +335,56 @@ int pstore_mkfile(enum pstore_type_id type, char *psname, 
u64 id, int count,
private = kmalloc(sizeof *private + size, GFP_KERNEL);
if (!private)
goto fail_alloc;
-   private->type = type;
-   private->id = id;
-   private->count = count;
-   private->psi = psi;
+   private->type = record->type;
+   private->id = record->id;
+   private->count = record->count;
+   private->psi = record->psi;
 
-   switch (type) {
+   switch (record->type) {
case PSTORE_TYPE_DMESG:
scnprintf(name, sizeof(name), "dmesg-%s-%lld%s",
- psname, id, compressed ? ".enc.z" : "");
+ record->psi->name, record->id,
+ record->compressed ? ".enc.z" : "");
break;
case PSTORE_TYPE_CONSOLE:
-   scnprintf(name, sizeof(name), "console-%s-%lld", psname, id);
+   scnprintf(name, sizeof(name), "console-%s-%lld",
+ record->psi->name, record->id);
break;
case PSTORE_TYPE_FTRACE:
-   scnprintf(name, sizeof(name), "ftrace-%s-%lld", psname, id);
+   scnprintf(name, sizeof(name), "ftrace-%s-%lld",
+ record->psi->name, record->id);
break;
case PSTORE_TYPE_MCE:
-   scnprintf(name, sizeof(name), "mce-%s-%lld", psname, id);
+   scnprintf(name, sizeof(name), "mce-%s-%lld",
+ record->psi->name, record->id);
break;
case PSTORE_TYPE_PPC_RTAS:
-   scnprintf(name, sizeof(name), "rtas-%s-%lld", psname, id);
+   scnprintf(name, sizeof(name), "rtas-%s-%lld",
+ record->psi->name, record->id);
break;
case PSTORE_TYPE_PPC_OF:
scnprintf(name, sizeof(name), "powerpc-ofw-%s-%lld",
- psname, id);
+ record->psi->name, record->id);
break;
case PSTORE_TYPE_PPC_COMMON:
scnprintf(name, sizeof(name), "powerpc-common-%s-%lld",
- psname, id);
+ record->psi->name, record->id);
break;
case PSTORE_TYPE_PMSG:
-   scnprintf(name, sizeof(name), "pmsg-%s-%lld", psname, id);
+   scnprintf(name, sizeof(name), "pmsg-%s-%lld",
+ record->psi->name, record->id);
break;
case PSTORE_TYPE_PPC_OPAL:
-   sprintf(name, "powerpc-opal-%s-%lld", psname, id);
+   scnprintf(name, sizeof(name), "powerpc-opal-%s-%lld",
+ record->psi->name, record->id);
break;
case PSTORE

[PATCH 00/18] pstore: refactor internal APIs

2017-03-06 Thread Kees Cook

For a long time I've been bothered by the complexity of argument passing
in the pstore internals, which makes understanding things and changing
things extremely fragile.

With the proposal of a new backend (EPI capsules), and my attempts to
reorganize things for the proposed multiple-pmsg frontend, I've spent
some time refactoring the internal pstore API to pass a record structure
instead of lots of arguments. This hugely simplifies things, and lets me
document the API better, reduce memory copies, etc.

Included in the series are a few bug fixes as well.

If pstore backend authors could review the changes I've made in their
drivers, I'd appreciate it. Hopefully I got it all right. :) And unless
there are any objections, I'll put this into -next in a couple days.

I intend to continue working on this to fix up the "ecc_notification_size"
field and make it more generalized, and get multiple pmsg support added
too, but I'd like to get this first series out for review first, since
it's rather long already.

Thanks!

-Kees

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 14/18] pstore: Do not duplicate record metadata

2017-03-06 Thread Kees Cook

This switches the inode-private data from carrying duplicate metadata to
keeping the record passed in during pstore_mkfile().

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 fs/pstore/inode.c| 57 ++--
 fs/pstore/platform.c |  6 ++
 2 files changed, 30 insertions(+), 33 deletions(-)

diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 3d1f047e4f41..0ea281b457fa 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -47,12 +47,8 @@ static LIST_HEAD(allpstore);
 
 struct pstore_private {
struct list_head list;
-   struct pstore_info *psi;
-   enum pstore_type_id type;
-   u64 id;
-   int count;
-   ssize_t size;
-   char*buf;
+   struct pstore_record *record;
+   size_t total_size;
 };
 
 struct pstore_ftrace_seq_data {
@@ -67,7 +63,10 @@ static void free_pstore_private(struct pstore_private 
*private)
 {
if (!private)
return;
-   kfree(private->buf);
+   if (private->record) {
+   kfree(private->record->buf);
+   kfree(private->record);
+   }
kfree(private);
 }
 
@@ -80,9 +79,9 @@ static void *pstore_ftrace_seq_start(struct seq_file *s, 
loff_t *pos)
if (!data)
return NULL;
 
-   data->off = ps->size % REC_SIZE;
+   data->off = ps->total_size % REC_SIZE;
data->off += *pos * REC_SIZE;
-   if (data->off + REC_SIZE > ps->size) {
+   if (data->off + REC_SIZE > ps->total_size) {
kfree(data);
return NULL;
}
@@ -102,7 +101,7 @@ static void *pstore_ftrace_seq_next(struct seq_file *s, 
void *v, loff_t *pos)
struct pstore_ftrace_seq_data *data = v;
 
data->off += REC_SIZE;
-   if (data->off + REC_SIZE > ps->size)
+   if (data->off + REC_SIZE > ps->total_size)
return NULL;
 
(*pos)++;
@@ -113,7 +112,9 @@ static int pstore_ftrace_seq_show(struct seq_file *s, void 
*v)
 {
struct pstore_private *ps = s->private;
struct pstore_ftrace_seq_data *data = v;
-   struct pstore_ftrace_record *rec = (void *)(ps->buf + data->off);
+   struct pstore_ftrace_record *rec;
+
+   rec = (struct pstore_ftrace_record *)(ps->record->buf + data->off);
 
seq_printf(s, "CPU:%d ts:%llu %08lx  %08lx  %pf <- %pF\n",
   pstore_ftrace_decode_cpu(rec),
@@ -133,7 +134,7 @@ static const struct seq_operations pstore_ftrace_seq_ops = {
 
 static int pstore_check_syslog_permissions(struct pstore_private *ps)
 {
-   switch (ps->type) {
+   switch (ps->record->type) {
case PSTORE_TYPE_DMESG:
case PSTORE_TYPE_CONSOLE:
return check_syslog_permissions(SYSLOG_ACTION_READ_ALL,
@@ -149,9 +150,10 @@ static ssize_t pstore_file_read(struct file *file, char 
__user *userbuf,
struct seq_file *sf = file->private_data;
struct pstore_private *ps = sf->private;
 
-   if (ps->type == PSTORE_TYPE_FTRACE)
+   if (ps->record->type == PSTORE_TYPE_FTRACE)
return seq_read(file, userbuf, count, ppos);
-   return simple_read_from_buffer(userbuf, count, ppos, ps->buf, ps->size);
+   return simple_read_from_buffer(userbuf, count, ppos,
+  ps->record->buf, ps->total_size);
 }
 
 static int pstore_file_open(struct inode *inode, struct file *file)
@@ -165,7 +167,7 @@ static int pstore_file_open(struct inode *inode, struct 
file *file)
if (err)
return err;
 
-   if (ps->type == PSTORE_TYPE_FTRACE)
+   if (ps->record->type == PSTORE_TYPE_FTRACE)
sops = _ftrace_seq_ops;
 
err = seq_open(file, sops);
@@ -201,17 +203,18 @@ static const struct file_operations 
pstore_file_operations = {
 static int pstore_unlink(struct inode *dir, struct dentry *dentry)
 {
struct pstore_private *p = d_inode(dentry)->i_private;
+   struct pstore_record *record = p->record;
int err;
 
err = pstore_check_syslog_permissions(p);
if (err)
return err;
 
-   if (p->psi->erase) {
-   mutex_lock(>psi->read_mutex);
-   p->psi->erase(p->type, p->id, p->count,
- d_inode(dentry)->i_ctime, p->psi);
-   mutex_unlock(>psi->read_mutex);
+   if (record->psi->erase) {
+   mutex_lock(>psi->read_mutex);
+   record->psi->erase(record->type, record->id, record->count,
+ d_inode(dentry)->i_ctime, record->psi);
+   mutex_unlock(>psi->read_mutex);
} else {
return -EPERM;
}
@@ -323,9 +326,9 @@ int pstore_mkfile(struct pstore_r

[PATCH 06/18] pstore: Extract common arguments into structure

2017-03-06 Thread Kees Cook

The read/mkfile pair pass the same arguments and should be cleared
between calls. Move to a structure and wipe it after every loop.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 fs/pstore/platform.c   | 55 +++---
 include/linux/pstore.h | 28 -
 2 files changed, 57 insertions(+), 26 deletions(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 320a673ecb5b..3fa1575a6e36 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -766,16 +766,9 @@ EXPORT_SYMBOL_GPL(pstore_unregister);
 void pstore_get_records(int quiet)
 {
struct pstore_info *psi = psinfo;
-   char*buf = NULL;
-   ssize_t size;
-   u64 id;
-   int count;
-   enum pstore_type_id type;
-   struct timespec time;
+   struct pstore_recordrecord = { .psi = psi, };
int failed = 0, rc;
-   boolcompressed;
int unzipped_len = -1;
-   ssize_t ecc_notice_size = 0;
 
if (!psi)
return;
@@ -784,39 +777,51 @@ void pstore_get_records(int quiet)
if (psi->open && psi->open(psi))
goto out;
 
-   while ((size = psi->read(, , , , , ,
-_notice_size, psi)) > 0) {
-   if (compressed && (type == PSTORE_TYPE_DMESG)) {
+   while ((record.size = psi->read(, ,
+, ,
+, ,
+_notice_size,
+record.psi)) > 0) {
+   if (record.compressed &&
+   record.type == PSTORE_TYPE_DMESG) {
if (big_oops_buf)
-   unzipped_len = pstore_decompress(buf,
-   big_oops_buf, size,
+   unzipped_len = pstore_decompress(
+   record.buf,
+   big_oops_buf,
+   record.size,
big_oops_buf_sz);
 
if (unzipped_len > 0) {
-   if (ecc_notice_size)
+   if (record.ecc_notice_size)
memcpy(big_oops_buf + unzipped_len,
-  buf + size, ecc_notice_size);
-   kfree(buf);
-   buf = big_oops_buf;
-   size = unzipped_len;
-   compressed = false;
+  record.buf + recorrecord.size,
+  record.ecc_notice_size);
+   kfree(record.buf);
+   record.buf = big_oops_buf;
+   record.size = unzipped_len;
+   record.compressed = false;
} else {
pr_err("decompression failed;returned %d\n",
   unzipped_len);
-   compressed = true;
+   record.compressed = true;
}
}
-   rc = pstore_mkfile(type, psi->name, id, count, buf,
-  compressed, size + ecc_notice_size,
-  time, psi);
+   rc = pstore_mkfile(record.type, psi->name, record.id,
+  record.count, record.buf,
+  record.compressed,
+  record.size + record.ecc_notice_size,
+  record.time, record.psi);
if (unzipped_len < 0) {
/* Free buffer other than big oops */
-   kfree(buf);
-   buf = NULL;
+   kfree(record.buf);
+   record.buf = NULL;
} else
unzipped_len = -1;
if (rc && (rc != -EEXIST || !quiet))
failed++;
+
+   memset(, 0, sizeof(record));
+   record.psi = psi;
}
if (psi->close)
psi->close(psi);
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index 083b10bacd4a..7b25f7f17915 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -30,6 +30,8 @@
 #include 
 #include 
 
+struct module;
+
 /* pstore record types (see fs/pstore/inode.c for filename templates) */

[PATCH 09/18] pstore: Replace arguments for read() API

2017-03-06 Thread Kees Cook

The argument list for the pstore_read() interface is unwieldy. This changes
passes the new struct pstore_record instead. The erst backend was already
doing something similar internally.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 arch/powerpc/kernel/nvram_64.c|  61 +++---
 drivers/acpi/apei/erst.c  |  38 ++
 drivers/firmware/efi/efi-pstore.c | 104 --
 fs/pstore/platform.c  |   7 +--
 fs/pstore/ram.c   |  53 ++-
 include/linux/pstore.h|  20 +++-
 6 files changed, 124 insertions(+), 159 deletions(-)

diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index d5e2b8309939..7f192001d09a 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -442,10 +442,7 @@ static int nvram_pstore_write(enum pstore_type_id type,
  * Returns the length of the data we read from each partition.
  * Returns 0 if we've been called before.
  */
-static ssize_t nvram_pstore_read(u64 *id, enum pstore_type_id *type,
-   int *count, struct timespec *time, char **buf,
-   bool *compressed, ssize_t *ecc_notice_size,
-   struct pstore_info *psi)
+static ssize_t nvram_pstore_read(struct pstore_record *record)
 {
struct oops_log_info *oops_hdr;
unsigned int err_type, id_no, size = 0;
@@ -459,40 +456,40 @@ static ssize_t nvram_pstore_read(u64 *id, enum 
pstore_type_id *type,
switch (nvram_type_ids[read_type]) {
case PSTORE_TYPE_DMESG:
part = _log_partition;
-   *type = PSTORE_TYPE_DMESG;
+   record->type = PSTORE_TYPE_DMESG;
break;
case PSTORE_TYPE_PPC_COMMON:
sig = NVRAM_SIG_SYS;
part = _partition;
-   *type = PSTORE_TYPE_PPC_COMMON;
-   *id = PSTORE_TYPE_PPC_COMMON;
-   time->tv_sec = 0;
-   time->tv_nsec = 0;
+   record->type = PSTORE_TYPE_PPC_COMMON;
+   record->id = PSTORE_TYPE_PPC_COMMON;
+   record->time.tv_sec = 0;
+   record->time.tv_nsec = 0;
break;
 #ifdef CONFIG_PPC_PSERIES
case PSTORE_TYPE_PPC_RTAS:
part = _log_partition;
-   *type = PSTORE_TYPE_PPC_RTAS;
-   time->tv_sec = last_rtas_event;
-   time->tv_nsec = 0;
+   record->type = PSTORE_TYPE_PPC_RTAS;
+   record->time.tv_sec = last_rtas_event;
+   record->time.tv_nsec = 0;
break;
case PSTORE_TYPE_PPC_OF:
sig = NVRAM_SIG_OF;
part = _config_partition;
-   *type = PSTORE_TYPE_PPC_OF;
-   *id = PSTORE_TYPE_PPC_OF;
-   time->tv_sec = 0;
-   time->tv_nsec = 0;
+   record->type = PSTORE_TYPE_PPC_OF;
+   record->id = PSTORE_TYPE_PPC_OF;
+   record->time.tv_sec = 0;
+   record->time.tv_nsec = 0;
break;
 #endif
 #ifdef CONFIG_PPC_POWERNV
case PSTORE_TYPE_PPC_OPAL:
sig = NVRAM_SIG_FW;
part = _partition;
-   *type = PSTORE_TYPE_PPC_OPAL;
-   *id = PSTORE_TYPE_PPC_OPAL;
-   time->tv_sec = 0;
-   time->tv_nsec = 0;
+   record->type = PSTORE_TYPE_PPC_OPAL;
+   record->id = PSTORE_TYPE_PPC_OPAL;
+   record->time.tv_sec = 0;
+   record->time.tv_nsec = 0;
break;
 #endif
default:
@@ -520,10 +517,10 @@ static ssize_t nvram_pstore_read(u64 *id, enum 
pstore_type_id *type,
return 0;
}
 
-   *count = 0;
+   record->count = 0;
 
if (part->os_partition)
-   *id = id_no;
+   record->id = id_no;
 
if (nvram_type_ids[read_type] == PSTORE_TYPE_DMESG) {
size_t length, hdr_size;
@@ -533,28 +530,28 @@ static ssize_t nvram_pstore_read(u64 *id, enum 
pstore_type_id *type,
/* Old format oops header had 2-byte record size */
hdr_size = sizeof(u16);
length = be16_to_cpu(oops_hdr->version);
-   time->tv_sec = 0;
-   time->tv_nsec = 0;
+   record->time.tv_sec = 0;
+   record->time.tv_nsec = 0;
} else {
hdr_size = sizeof(*oops_hdr);
length = be16_to_cpu(oops_hdr->report_length);
-   time->tv_sec = be64_to_cpu(oops_hdr->timestamp);
-   time->tv_nsec = 0;
+   record->time.tv_sec = be64_to_cpu(oops_hdr->timest

[PATCH 05/18] pstore: Add kernel-doc for struct pstore_info

2017-03-06 Thread Kees Cook

This adds documentation for struct pstore_info, which also includes
the basic API the backends need to implement.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 include/linux/pstore.h | 133 +++--
 1 file changed, 128 insertions(+), 5 deletions(-)

diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index 0da29cae009b..083b10bacd4a 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -30,7 +30,7 @@
 #include 
 #include 
 
-/* types */
+/* pstore record types (see fs/pstore/inode.c for filename templates) */
 enum pstore_type_id {
PSTORE_TYPE_DMESG   = 0,
PSTORE_TYPE_MCE = 1,
@@ -47,14 +47,138 @@ enum pstore_type_id {
 
 struct module;
 
+/**
+ * struct pstore_info - backend pstore driver structure
+ *
+ * @owner: module which is repsonsible for this backend driver
+ * @name:  name of the backend driver
+ *
+ * @buf_lock:  spinlock to serialize access to @buf
+ * @buf:   preallocated crash dump buffer
+ * @bufsize:   size of @buf available for crash dump writes
+ *
+ * @read_mutex:mutex to serialize @open, @read, and @close callbacks
+ * @flags: bitfield of frontends the backend can accept writes for
+ * @data:  backend-private pointer passed back during callbacks
+ *
+ * Callbacks:
+ *
+ * @open:
+ * Notify backend that pstore is starting a full read of backend
+ * records. Followed by one or more @read calls, and a final @close.
+ *
+ * @psi:   in: pointer to the struct pstore_info for the backend
+ *
+ * Returns 0 on success, and non-zero on error.
+ *
+ * @close:
+ * Notify backend that pstore has finished a full read of backend
+ * records. Always preceded by an @open call and one or more @read
+ * calls.
+ *
+ * @psi:   in: pointer to the struct pstore_info for the backend
+ *
+ * Returns 0 on success, and non-zero on error. (Though pstore will
+ * ignore the error.)
+ *
+ * @read:
+ * Read next available backend record. Called after a successful
+ * @open.
+ *
+ * @id:out: unique identifier for the record
+ * @type:  out: pstore record type
+ * @count: out: for PSTORE_TYPE_DMESG, the Oops count.
+ * @time:  out: timestamp for the record
+ * @buf:   out: kmalloc copy of record contents, to be freed by pstore
+ * @compressed:
+ * out: if the record contents are compressed
+ * @ecc_notice_size:
+ * out: ECC information
+ * @psi:   in: pointer to the struct pstore_info for the backend
+ *
+ * Returns record size on success, zero when no more records are
+ * available, or negative on error.
+ *
+ * @write:
+ * Perform a frontend notification of a write to a backend record. The
+ * data to be stored has already been written to the registered @buf
+ * of the @psi structure.
+ *
+ * @type:  in: pstore record type to write
+ * @reason:
+ * in: pstore write reason
+ * @id:out: unique identifier for the record
+ * @part:  in: position in a multipart write
+ * @count: in: increasing from 0 since boot, the number of this Oops
+ * @compressed:
+ * in: if the record is compressed
+ * @size:  in: size of the write
+ * @psi:   in: pointer to the struct pstore_info for the backend
+ *
+ * Returns 0 on success, and non-zero on error.
+ *
+ * @write_buf:
+ * Perform a frontend write to a backend record, using a specified
+ * buffer. Unlike @write, this does not use the @psi @buf.
+ *
+ * @type:  in: pstore record type to write
+ * @reason:
+ * in: pstore write reason
+ * @id:out: unique identifier for the record
+ * @part:  in: position in a multipart write
+ * @buf:   in: pointer to contents to write to backend record
+ * @compressed:
+ * in: if the record is compressed
+ * @size:  in: size of the write
+ * @psi:   in: pointer to the struct pstore_info for the backend
+ *
+ * Returns 0 on success, and non-zero on error.
+ *
+ * @write_buf_user:
+ * Perform a frontend write to a backend record, using a specified
+ * buffer that is coming directly from userspace.
+ *
+ * @type:  in: pstore record type to write
+ * @reason:
+ * in: pstore write reason
+ * @id:out: unique identifier for the record
+ * @part:  in: position in a multipart write
+ * @buf:   in: pointer to userspace contents to write to backend record
+ * @compressed:
+ * in: if the record is compressed
+ * @size:  in: size of the write
+ * @psi:   in: pointer to the struct pstore_info for the backend
+ *
+ * Returns 0 on success, and non-zero on error.
+ *
+ * @erase:
+ * Delete a record from backend storage.  Different backends
+ * identify records differently, so all possible methods of
+ * identification are included to help the backend locate the
+ * record to

[PATCH 07/18] pstore: Move record decompression to function

2017-03-06 Thread Kees Cook

This moves the record decompression logic out to a separate function
to avoid the deep indentation.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 fs/pstore/platform.c | 67 +---
 1 file changed, 37 insertions(+), 30 deletions(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 3fa1575a6e36..0503380704de 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -757,6 +757,37 @@ void pstore_unregister(struct pstore_info *psi)
 }
 EXPORT_SYMBOL_GPL(pstore_unregister);
 
+static void decompress_record(struct pstore_record *record)
+{
+   int unzipped_len;
+
+   /* Only PSTORE_TYPE_DMESG support compression. */
+   if (!record->compressed || record->type != PSTORE_TYPE_DMESG) {
+   pr_warn("ignored compressed record type %d\n", record->type);
+   return;
+   }
+
+   /* No compression method has created the common buffer. */
+   if (!big_oops_buf) {
+   pr_warn("no decompression buffer allocated\n");
+   return;
+   }
+
+   unzipped_len = pstore_decompress(record->buf, big_oops_buf,
+record->size, big_oops_buf_sz);
+   if (unzipped_len > 0) {
+   if (record->ecc_notice_size)
+   memcpy(big_oops_buf + unzipped_len,
+  record->buf + record->size,
+  record->ecc_notice_size);
+   kfree(record->buf);
+   record->buf = big_oops_buf;
+   record->size = unzipped_len;
+   record->compressed = false;
+   } else
+   pr_err("decompression failed: %d\n", unzipped_len);
+}
+
 /*
  * Read all the records from the persistent store. Create
  * files in our filesystem.  Don't warn about -EEXIST errors
@@ -768,7 +799,6 @@ void pstore_get_records(int quiet)
struct pstore_info *psi = psinfo;
struct pstore_recordrecord = { .psi = psi, };
int failed = 0, rc;
-   int unzipped_len = -1;
 
if (!psi)
return;
@@ -782,41 +812,18 @@ void pstore_get_records(int quiet)
 , ,
 _notice_size,
 record.psi)) > 0) {
-   if (record.compressed &&
-   record.type == PSTORE_TYPE_DMESG) {
-   if (big_oops_buf)
-   unzipped_len = pstore_decompress(
-   record.buf,
-   big_oops_buf,
-   record.size,
-   big_oops_buf_sz);
-
-   if (unzipped_len > 0) {
-   if (record.ecc_notice_size)
-   memcpy(big_oops_buf + unzipped_len,
-  record.buf + recorrecord.size,
-  record.ecc_notice_size);
-   kfree(record.buf);
-   record.buf = big_oops_buf;
-   record.size = unzipped_len;
-   record.compressed = false;
-   } else {
-   pr_err("decompression failed;returned %d\n",
-  unzipped_len);
-   record.compressed = true;
-   }
-   }
+
+   decompress_record();
rc = pstore_mkfile(record.type, psi->name, record.id,
   record.count, record.buf,
   record.compressed,
   record.size + record.ecc_notice_size,
   record.time, record.psi);
-   if (unzipped_len < 0) {
-   /* Free buffer other than big oops */
+
+   /* Free buffer other than big oops */
+   if (record.buf != big_oops_buf)
kfree(record.buf);
-   record.buf = NULL;
-   } else
-   unzipped_len = -1;
+
if (rc && (rc != -EEXIST || !quiet))
failed++;
 
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 17/18] pstore: Replace arguments for write_buf_user() API

2017-03-06 Thread Kees Cook

Removes argument list in favor of pstore record, though the user buffer
remains passed separately since it must carry the __user annotation.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 fs/pstore/platform.c   | 35 ---
 fs/pstore/pmsg.c   |  9 ++---
 fs/pstore/ram.c| 14 +-
 include/linux/pstore.h | 23 +++
 4 files changed, 30 insertions(+), 51 deletions(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 5eecf9012459..1e6642a2063e 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -639,47 +639,36 @@ static int pstore_write_compat(struct pstore_record 
*record)
return record->psi->write_buf(record);
 }
 
-static int pstore_write_buf_user_compat(enum pstore_type_id type,
-  enum kmsg_dump_reason reason,
-  u64 *id, unsigned int part,
-  const char __user *buf,
-  bool compressed, size_t size,
-  struct pstore_info *psi)
+static int pstore_write_buf_user_compat(struct pstore_record *record,
+   const char __user *buf)
 {
unsigned long flags = 0;
-   size_t i, bufsize = size;
+   size_t i, bufsize, total_size = record->size;
long ret = 0;
 
-   if (unlikely(!access_ok(VERIFY_READ, buf, size)))
+   if (unlikely(!access_ok(VERIFY_READ, buf, total_size)))
return -EFAULT;
+   bufsize = total_size;
if (bufsize > psinfo->bufsize)
bufsize = psinfo->bufsize;
+   record->buf = psinfo->buf;
spin_lock_irqsave(>buf_lock, flags);
-   for (i = 0; i < size; ) {
-   struct pstore_record record = {
-   .type = type,
-   .reason = reason,
-   .id = id,
-   .part = part,
-   .buf = psinfo->buf,
-   .compressed = compressed,
-   .psi = psi,
-   };
-   size_t c = min(size - i, bufsize);
+   for (i = 0; i < total_size; ) {
+   size_t c = min(total_size - i, bufsize);
 
-   ret = __copy_from_user(psinfo->buf, buf + i, c);
+   ret = __copy_from_user(record->buf, buf + i, c);
if (unlikely(ret != 0)) {
ret = -EFAULT;
break;
}
-   record.size = c;
-   ret = psi->write_buf();
+   record->size = c;
+   ret = record->psi->write_buf(record);
if (unlikely(ret < 0))
break;
i += c;
}
spin_unlock_irqrestore(>buf_lock, flags);
-   return unlikely(ret < 0) ? ret : size;
+   return unlikely(ret < 0) ? ret : total_size;
 }
 
 /*
diff --git a/fs/pstore/pmsg.c b/fs/pstore/pmsg.c
index 78f6176c020f..ce35907602de 100644
--- a/fs/pstore/pmsg.c
+++ b/fs/pstore/pmsg.c
@@ -23,7 +23,11 @@ static DEFINE_MUTEX(pmsg_lock);
 static ssize_t write_pmsg(struct file *file, const char __user *buf,
  size_t count, loff_t *ppos)
 {
-   u64 id;
+   struct pstore_record record = {
+   .type = PSTORE_TYPE_PMSG,
+   .size = count,
+   .psi = psinfo,
+   };
int ret;
 
if (!count)
@@ -34,8 +38,7 @@ static ssize_t write_pmsg(struct file *file, const char 
__user *buf,
return -EFAULT;
 
mutex_lock(_lock);
-   ret = psinfo->write_buf_user(PSTORE_TYPE_PMSG, 0, , 0, buf, 0, count,
-psinfo);
+   ret = psinfo->write_buf_user(, buf);
mutex_unlock(_lock);
return ret ? ret : count;
 }
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index a7cdde60b1f9..d85e1adae1b6 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -451,19 +451,15 @@ static int notrace ramoops_pstore_write_buf(struct 
pstore_record *record)
return 0;
 }
 
-static int notrace ramoops_pstore_write_buf_user(enum pstore_type_id type,
-enum kmsg_dump_reason reason,
-u64 *id, unsigned int part,
-const char __user *buf,
-bool compressed, size_t size,
-struct pstore_info *psi)
+static int notrace ramoops_pstore_write_buf_user(struct pstore_record *record,
+const char __user *buf)
 {
-   if (type == PSTORE_TYPE_PMSG) {
-   struct ramoops_context *cxt = psi->data;
+   if (record->type == PSTORE_TYPE_PMSG) {
+   struct ramoops_context *cxt = record-

[PATCH 16/18] pstore: Replace arguments for write_buf() API

2017-03-06 Thread Kees Cook

As with the other API updates, this removes the long argument list in favor
of passing a single pstore recaord.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 fs/pstore/ftrace.c |  9 +++--
 fs/pstore/platform.c   | 30 +-
 fs/pstore/ram.c| 44 ++--
 include/linux/pstore.h | 21 +
 4 files changed, 55 insertions(+), 49 deletions(-)

diff --git a/fs/pstore/ftrace.c b/fs/pstore/ftrace.c
index 899d0ba0bd6c..a5506ec6995e 100644
--- a/fs/pstore/ftrace.c
+++ b/fs/pstore/ftrace.c
@@ -37,6 +37,12 @@ static void notrace pstore_ftrace_call(unsigned long ip,
 {
unsigned long flags;
struct pstore_ftrace_record rec = {};
+   struct pstore_record record = {
+   .type = PSTORE_TYPE_FTRACE,
+   .buf = (char *),
+   .size = sizeof(rec),
+   .psi = psinfo,
+   };
 
if (unlikely(oops_in_progress))
return;
@@ -47,8 +53,7 @@ static void notrace pstore_ftrace_call(unsigned long ip,
rec.parent_ip = parent_ip;
pstore_ftrace_write_timestamp(, pstore_ftrace_stamp++);
pstore_ftrace_encode_cpu(, raw_smp_processor_id());
-   psinfo->write_buf(PSTORE_TYPE_FTRACE, 0, NULL, 0, (void *),
- 0, sizeof(rec), psinfo);
+   psinfo->write_buf();
 
local_irq_restore(flags);
 }
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index aa3d6e572ede..5eecf9012459 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -587,8 +587,11 @@ static void pstore_console_write(struct console *con, 
const char *s, unsigned c)
const char *e = s + c;
 
while (s < e) {
+   struct pstore_record record = {
+   .type = PSTORE_TYPE_CONSOLE,
+   .psi = psinfo,
+   };
unsigned long flags;
-   u64 id;
 
if (c > psinfo->bufsize)
c = psinfo->bufsize;
@@ -599,8 +602,9 @@ static void pstore_console_write(struct console *con, const 
char *s, unsigned c)
} else {
spin_lock_irqsave(>buf_lock, flags);
}
-   psinfo->write_buf(PSTORE_TYPE_CONSOLE, 0, , 0,
- s, 0, c, psinfo);
+   record.buf = (char *)s;
+   record.size = c;
+   psinfo->write_buf();
spin_unlock_irqrestore(>buf_lock, flags);
s += c;
c = e - s;
@@ -630,10 +634,9 @@ static void pstore_unregister_console(void) {}
 
 static int pstore_write_compat(struct pstore_record *record)
 {
-   return record->psi->write_buf(record->type, record->reason,
- >id, record->part,
- psinfo->buf, record->compressed,
- record->size, record->psi);
+   record->buf = psinfo->buf;
+
+   return record->psi->write_buf(record);
 }
 
 static int pstore_write_buf_user_compat(enum pstore_type_id type,
@@ -653,6 +656,15 @@ static int pstore_write_buf_user_compat(enum 
pstore_type_id type,
bufsize = psinfo->bufsize;
spin_lock_irqsave(>buf_lock, flags);
for (i = 0; i < size; ) {
+   struct pstore_record record = {
+   .type = type,
+   .reason = reason,
+   .id = id,
+   .part = part,
+   .buf = psinfo->buf,
+   .compressed = compressed,
+   .psi = psi,
+   };
size_t c = min(size - i, bufsize);
 
ret = __copy_from_user(psinfo->buf, buf + i, c);
@@ -660,8 +672,8 @@ static int pstore_write_buf_user_compat(enum pstore_type_id 
type,
ret = -EFAULT;
break;
}
-   ret = psi->write_buf(type, reason, id, part, psinfo->buf,
-compressed, c, psi);
+   record.size = c;
+   ret = psi->write_buf();
if (unlikely(ret < 0))
break;
i += c;
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index a18575fe32e9..a7cdde60b1f9 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -378,23 +378,18 @@ static size_t ramoops_write_kmsg_hdr(struct 
persistent_ram_zone *prz,
return len;
 }
 
-static int notrace ramoops_pstore_write_buf(enum pstore_type_id type,
-   enum kmsg_dump_reason reason,
-   u64 *id, unsigned int part,
-   const char *buf,
-   bool compress

[PATCH 10/18] pstore: Replace arguments for write() API

2017-03-06 Thread Kees Cook

Similar to the pstore_info read() callback, there were too many arguments.
This switches to the new struct pstore_record pointer instead. This adds
"reason" and "part" to the record structure as well.

Signed-off-by: Kees Cook <keesc...@chromium.org>
---
 arch/powerpc/kernel/nvram_64.c| 27 +
 drivers/acpi/apei/erst.c  | 18 +---
 drivers/firmware/efi/efi-pstore.c | 18 +---
 fs/pstore/platform.c  | 62 ++-
 include/linux/pstore.h| 36 +++
 5 files changed, 76 insertions(+), 85 deletions(-)

diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index 7f192001d09a..caf2e1f36d6b 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -389,51 +389,40 @@ static int nvram_pstore_open(struct pstore_info *psi)
 
 /**
  * nvram_pstore_write - pstore write callback for nvram
- * @type:   Type of message logged
- * @reason: reason behind dump (oops/panic)
- * @id: identifier to indicate the write performed
- * @part:   pstore writes data to registered buffer in parts,
- *  part number will indicate the same.
- * @count:  Indicates oops count
- * @compressed: Flag to indicate the log is compressed
- * @size:   number of bytes written to the registered buffer
- * @psi:registered pstore_info structure
+ * @record: pstore record to write, with @id to be set
  *
  * Called by pstore_dump() when an oops or panic report is logged in the
  * printk buffer.
  * Returns 0 on successful write.
  */
-static int nvram_pstore_write(enum pstore_type_id type,
-   enum kmsg_dump_reason reason,
-   u64 *id, unsigned int part, int count,
-   bool compressed, size_t size,
-   struct pstore_info *psi)
+static int nvram_pstore_write(struct pstore_record *record)
 {
int rc;
unsigned int err_type = ERR_TYPE_KERNEL_PANIC;
struct oops_log_info *oops_hdr = (struct oops_log_info *) oops_buf;
 
/* part 1 has the recent messages from printk buffer */
-   if (part > 1 || (type != PSTORE_TYPE_DMESG))
+   if (record->part > 1 || (record->type != PSTORE_TYPE_DMESG))
return -1;
 
if (clobbering_unread_rtas_event())
return -1;
 
oops_hdr->version = cpu_to_be16(OOPS_HDR_VERSION);
-   oops_hdr->report_length = cpu_to_be16(size);
+   oops_hdr->report_length = cpu_to_be16(record->size);
oops_hdr->timestamp = cpu_to_be64(ktime_get_real_seconds());
 
-   if (compressed)
+   if (record->compressed)
err_type = ERR_TYPE_KERNEL_PANIC_GZ;
 
rc = nvram_write_os_partition(_log_partition, oops_buf,
-   (int) (sizeof(*oops_hdr) + size), err_type, count);
+   (int) (sizeof(*oops_hdr) + record->size), err_type,
+   record->count);
 
if (rc != 0)
return rc;
 
-   *id = part;
+   record->id = record->part;
return 0;
 }
 
diff --git a/drivers/acpi/apei/erst.c b/drivers/acpi/apei/erst.c
index bbefb7522f80..440588d189e7 100644
--- a/drivers/acpi/apei/erst.c
+++ b/drivers/acpi/apei/erst.c
@@ -926,9 +926,7 @@ static int erst_check_table(struct acpi_table_erst 
*erst_tab)
 static int erst_open_pstore(struct pstore_info *psi);
 static int erst_close_pstore(struct pstore_info *psi);
 static ssize_t erst_reader(struct pstore_record *record);
-static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
-  u64 *id, unsigned int part, int count, bool compressed,
-  size_t size, struct pstore_info *psi);
+static int erst_writer(struct pstore_record *record);
 static int erst_clearer(enum pstore_type_id type, u64 id, int count,
struct timespec time, struct pstore_info *psi);
 
@@ -1054,9 +1052,7 @@ static ssize_t erst_reader(struct pstore_record *record)
return (rc < 0) ? rc : (len - sizeof(*rcd));
 }
 
-static int erst_writer(enum pstore_type_id type, enum kmsg_dump_reason reason,
-  u64 *id, unsigned int part, int count, bool compressed,
-  size_t size, struct pstore_info *psi)
+static int erst_writer(struct pstore_record *record)
 {
struct cper_pstore_record *rcd = (struct cper_pstore_record *)
(erst_info.buf - sizeof(*rcd));
@@ -1071,21 +1067,21 @@ static int erst_writer(enum pstore_type_id type, enum 
kmsg_dump_reason reason,
/* timestamp valid. platform_id, partition_id are invalid */
rcd->hdr.validation_bits = CPER_VALID_TIMESTAMP;
rcd->hdr.timestamp = get_seconds();
-   rcd->

Re: [PATCH 1/2] x86: Fix kernel panic when booting with XD disabled in uEFI firmware

2015-12-08 Thread Kees Cook

On Tue, Dec 8, 2015 at 6:19 AM, Borislav Petkov <b...@alien8.de> wrote:
> On Tue, Dec 08, 2015 at 12:25:57PM +, Matt Fleming wrote:
>> On Mon, 07 Dec, at 11:10:43PM, Kosuke Tatsukawa wrote:
>> >
>> > Thank you pointing that out.
>> >
>> > linux-4.4-rc3 booted without a problem on a real server even with XD
>> > turned off by the firmware.  I didn't notice this before because I was
>
> The aforementioned patch reenables NX.
>
>> Borislav, what do you think about stripping PAGE_NX from 'page_flags'
>> in kernel_map_pages_in_pgd() if NX isn't supported, rather than
>> returning EINVAL? At least that way EFI runtime services would still
>> work.
>
> I guess we can - I mean, I don't see what can go wrong more when
> allowing the kernel to execute even NX UEFI regions. Maybe easier
> generation of "gadgets" in the ROP sense ...
>
> On a related node, I'm very sceptical of the existence of this "noexec"
> chicken bit, if you ask me. It is a really bad idea, security-wise, to
> disable NX. Is there even a valid use case to disable NX?
>
> Because if not, I'd vote for removing that chicken bit or at least
> taining the kernel with
>
> add_taint(TAINT_USER_MORON, ... );

If we add this for not-nx, I would like to add it for not-rodata too.

> Kees, has this NX disabling practice come up in the past, per chance... ?

I've never seen anyone actually use it. I was asked to include it out
of fear of some kind of rogue imagined CPU configuration that mixed NX
and non-NX capable CPUs in a single machine where the forced NX
re-enablement code would cause problems. As you might imagine, I'm not
aware of this case ever being an issue. ;)

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 2/3] efi-pstore: implement efivars_pstore_exit()

2015-11-11 Thread Kees Cook

On Wed, Nov 11, 2015 at 3:23 PM, Luck, Tony <tony.l...@intel.com> wrote:
>>>>  module_init(efivars_pstore_init);
>>>
>>> Looks OK to me. Kees, are you picking this up?
>>
>> I can, though usually it goes through Tony.
>
> Can I count that as "Acked-by" from both of you?

Yup, sorry. I thought I'd acked them already. :) My bad!

Acked-by: Kees Cook <keesc...@chromium.org>

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 2/3] efi-pstore: implement efivars_pstore_exit()

2015-11-11 Thread Kees Cook

On Wed, Nov 11, 2015 at 8:59 AM, Matt Fleming <m...@codeblueprint.co.uk> wrote:
> On Sat, 07 Nov, at 12:43:48PM, Geliang Tang wrote:
>> The original efivars_pstore_exit() is empty. I
>>  1) add a bufsize check statement.
>>  2) call pstore_unregister as it is defined now.
>>  3) free the memory and set bufsize to 0.
>>
>> Signed-off-by: Geliang Tang <geliangt...@163.com>
>> ---
>>  drivers/firmware/efi/efi-pstore.c | 7 +++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/firmware/efi/efi-pstore.c 
>> b/drivers/firmware/efi/efi-pstore.c
>> index eac76a7..62d57d8 100644
>> --- a/drivers/firmware/efi/efi-pstore.c
>> +++ b/drivers/firmware/efi/efi-pstore.c
>> @@ -393,6 +393,13 @@ static __init int efivars_pstore_init(void)
>>
>>  static __exit void efivars_pstore_exit(void)
>>  {
>> + if (!efi_pstore_info.bufsize)
>> + return;
>> +
>> + pstore_unregister(_pstore_info);
>> + kfree(efi_pstore_info.buf);
>> + efi_pstore_info.buf = NULL;
>> + efi_pstore_info.bufsize = 0;
>>  }
>>
>>  module_init(efivars_pstore_init);
>
> Looks OK to me. Kees, are you picking this up?

I can, though usually it goes through Tony.

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [GIT PULL] x86/mm changes for v4.4

2015-11-10 Thread Kees Cook

On Mon, Nov 9, 2015 at 11:08 PM, Ard Biesheuvel
<ard.biesheu...@linaro.org> wrote:
> On 9 November 2015 at 22:08, Kees Cook <keesc...@chromium.org> wrote:
>> On Sat, Nov 7, 2015 at 11:55 PM, Ard Biesheuvel
>> <ard.biesheu...@linaro.org> wrote:
>>> On 8 November 2015 at 07:58, Kees Cook <keesc...@chromium.org> wrote:
>>>> On Fri, Nov 6, 2015 at 11:39 PM, Ard Biesheuvel
>>>> <ard.biesheu...@linaro.org> wrote:
>>>>> On 7 November 2015 at 08:09, Ingo Molnar <mi...@kernel.org> wrote:
>>>>>>
>>>>>> * Matt Fleming <m...@codeblueprint.co.uk> wrote:
>>>>>>
>>>>>>> On Fri, 06 Nov, at 07:55:50AM, Ingo Molnar wrote:
>>>>>>> >
>>>>>>> >  3) We should fix the EFI permission problem without relying on the 
>>>>>>> > firmware: it
>>>>>>> > appears we could just mark everything R-X optimistically, and if 
>>>>>>> > a write fault
>>>>>>> > happens (it's pretty rare in fact, only triggers when we write to 
>>>>>>> > an EFI
>>>>>>> > variable and so), we can mark the faulting page RW- on the fly, 
>>>>>>> > because it
>>>>>>> > appears that writable EFI sections, while not enumerated very 
>>>>>>> > well in 'old'
>>>>>>> > firmware, are still supposed to be page granular. (Even 'new' 
>>>>>>> > firmware I
>>>>>>> > wouldn't automatically trust to get the enumeration right...)
>>>>>>>
>>>>>>> Sorry, this isn't true. I misled you with one of my earlier posts on
>>>>>>> this topic. Let me try and clear things up...
>>>>>>>
>>>>>>> Writing to EFI regions has to do with every invocation of the EFI
>>>>>>> runtime services - it's not limited to when you read/write/delete EFI
>>>>>>> variables. In fact, EFI variables really have nothing to do with this
>>>>>>> discussion, they're a completely opaque concept to the OS, we have no
>>>>>>> idea how the firmware implements them. Everything is done via the EFI
>>>>>>> boot/runtime services.
>>>>>>>
>>>>>>> The firmware itself will attempt to write to EFI regions when we
>>>>>>> invoke the EFI services because that's where the PE/COFF ".data" and
>>>>>>> ".bss" sections live along with the heap. There's even some relocation
>>>>>>> fixups that occur as SetVirtualAddressMap() time so it'll write to
>>>>>>> ".text" too.
>>>>>>>
>>>>>>> Now, the above PE/COFF sections are usually (always?) contained within
>>>>>>> EFI regions of type EfiRuntimeServicesCode. We know this is true
>>>>>>> because the firmware folks have told us so, and because stopping that
>>>>>>> is the motivation behind the new EFI_PROPERTIES_TABLE feature in UEFI
>>>>>>> V2.5.
>>>>>>>
>>>>>>> The data sections within the region are also *not* guaranteed to be
>>>>>>> page granular because work was required in Tianocore for emitting
>>>>>>> sections with 4k alignment as part of the EFI_PROPERTIES_TABLE
>>>>>>> support.
>>>>>>>
>>>>>>> Ultimately, what this means is that if you were to attempt to
>>>>>>> dynamically fixup those regions that required write permission, you'd
>>>>>>> have to modify the mappings for the majority of the EFI regions
>>>>>>> anyway. And if you're blindly allowing write permission as a fixup,
>>>>>>> there's not much security to be had.
>>>>>>
>>>>>> I think you misunderstood my suggestion: the 'fixup' would be changing 
>>>>>> it from R-X
>>>>>> to RW-, i.e. it would add 'write' permission but remove 'execute' 
>>>>>> permission.
>>>>>>
>>>>>> Note that there would be no 'RWX' permission at any given moment - which 
>>>>>> is the
>>>>>> dangerous combination.
>>>>>>
>>>>>
>>>>> The problem with that is that /any/ page in the UEFI runtime region
&g

Re: [GIT PULL] x86/mm changes for v4.4

2015-11-09 Thread Kees Cook

On Sat, Nov 7, 2015 at 11:55 PM, Ard Biesheuvel
<ard.biesheu...@linaro.org> wrote:
> On 8 November 2015 at 07:58, Kees Cook <keesc...@chromium.org> wrote:
>> On Fri, Nov 6, 2015 at 11:39 PM, Ard Biesheuvel
>> <ard.biesheu...@linaro.org> wrote:
>>> On 7 November 2015 at 08:09, Ingo Molnar <mi...@kernel.org> wrote:
>>>>
>>>> * Matt Fleming <m...@codeblueprint.co.uk> wrote:
>>>>
>>>>> On Fri, 06 Nov, at 07:55:50AM, Ingo Molnar wrote:
>>>>> >
>>>>> >  3) We should fix the EFI permission problem without relying on the 
>>>>> > firmware: it
>>>>> > appears we could just mark everything R-X optimistically, and if a 
>>>>> > write fault
>>>>> > happens (it's pretty rare in fact, only triggers when we write to 
>>>>> > an EFI
>>>>> > variable and so), we can mark the faulting page RW- on the fly, 
>>>>> > because it
>>>>> > appears that writable EFI sections, while not enumerated very well 
>>>>> > in 'old'
>>>>> > firmware, are still supposed to be page granular. (Even 'new' 
>>>>> > firmware I
>>>>> > wouldn't automatically trust to get the enumeration right...)
>>>>>
>>>>> Sorry, this isn't true. I misled you with one of my earlier posts on
>>>>> this topic. Let me try and clear things up...
>>>>>
>>>>> Writing to EFI regions has to do with every invocation of the EFI
>>>>> runtime services - it's not limited to when you read/write/delete EFI
>>>>> variables. In fact, EFI variables really have nothing to do with this
>>>>> discussion, they're a completely opaque concept to the OS, we have no
>>>>> idea how the firmware implements them. Everything is done via the EFI
>>>>> boot/runtime services.
>>>>>
>>>>> The firmware itself will attempt to write to EFI regions when we
>>>>> invoke the EFI services because that's where the PE/COFF ".data" and
>>>>> ".bss" sections live along with the heap. There's even some relocation
>>>>> fixups that occur as SetVirtualAddressMap() time so it'll write to
>>>>> ".text" too.
>>>>>
>>>>> Now, the above PE/COFF sections are usually (always?) contained within
>>>>> EFI regions of type EfiRuntimeServicesCode. We know this is true
>>>>> because the firmware folks have told us so, and because stopping that
>>>>> is the motivation behind the new EFI_PROPERTIES_TABLE feature in UEFI
>>>>> V2.5.
>>>>>
>>>>> The data sections within the region are also *not* guaranteed to be
>>>>> page granular because work was required in Tianocore for emitting
>>>>> sections with 4k alignment as part of the EFI_PROPERTIES_TABLE
>>>>> support.
>>>>>
>>>>> Ultimately, what this means is that if you were to attempt to
>>>>> dynamically fixup those regions that required write permission, you'd
>>>>> have to modify the mappings for the majority of the EFI regions
>>>>> anyway. And if you're blindly allowing write permission as a fixup,
>>>>> there's not much security to be had.
>>>>
>>>> I think you misunderstood my suggestion: the 'fixup' would be changing it 
>>>> from R-X
>>>> to RW-, i.e. it would add 'write' permission but remove 'execute' 
>>>> permission.
>>>>
>>>> Note that there would be no 'RWX' permission at any given moment - which 
>>>> is the
>>>> dangerous combination.
>>>>
>>>
>>> The problem with that is that /any/ page in the UEFI runtime region
>>> may intersect with both .text and .data of any of the PE/COFF images
>>> that make up the runtime firmware (since the PE/COFF sections are not
>>> necessarily page aligned). Such pages require RWX permissions. The
>>> UEFI memory map does not provide the information to identify those
>>> pages a priori (the entire region containing several PE/COFF images
>>> could be covered by a single entry) so it is hard to guess which pages
>>> should be allowed these RWX permissions.
>>
>> I'm sad that UEFI was designed without even the most basic of memory
>> protections in mind. UEFI _itself_ should be setting up protective
>> page mappings. :

Re: [PATCH 1/2] efi-pstore: fix kernel-doc argument name

2015-11-03 Thread Kees Cook

On Sat, Oct 31, 2015 at 8:23 AM, Geliang Tang <geliangt...@163.com> wrote:
> The first argument name in the kernel-doc argument list for
> efi_pstore_scan_sysfs_enter() was slightly off. Fix it for the
> kernel doc.
>
> Signed-off-by: Geliang Tang <geliangt...@163.com>

Acked-by: Kees Cook <keesc...@chromium.org>

Thanks!

-Kees

> ---
>  drivers/firmware/efi/efi-pstore.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/efi/efi-pstore.c 
> b/drivers/firmware/efi/efi-pstore.c
> index c8d794c..eac76a7 100644
> --- a/drivers/firmware/efi/efi-pstore.c
> +++ b/drivers/firmware/efi/efi-pstore.c
> @@ -103,7 +103,7 @@ static int efi_pstore_read_func(struct efivar_entry 
> *entry, void *data)
>
>  /**
>   * efi_pstore_scan_sysfs_enter
> - * @entry: scanning entry
> + * @pos: scanning entry
>   * @next: next entry
>   * @head: list head
>   */
> --
> 2.4.3
>
>



-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 2/7] x86, boot: Move ZO to end of buffer

2015-03-10 Thread Kees Cook

On Tue, Mar 10, 2015 at 1:00 AM, Borislav Petkov b...@suse.de wrote:
 Final patch:

 ---
 From: Yinghai Lu ying...@kernel.org
 Date: Sat, 7 Mar 2015 14:07:16 -0800
 Subject: [PATCH] x86/setup: Move compressed kernel to the end of the buffer

 Boris found that passing KASLR status through setup_data from the boot
 stage cannot be used later in the kernel stage, see commit

   f47233c2d34f (x86/mm/ASLR: Propagate base load address calculation)

 Here's some background:

 The boot loader allocates a buffer of size init_size in concordance with
 the value passed in the setup header and it loads the compressed, i.e.
 first kernel (arch/x86/boot/compressed/vmlinux) in it.

 First kernel then moves itself somewhere around the middle of the
 buffer at z_extract_offset to make sure that the decompressor does not
 overwrite input data.

 After the decompressor is finished, kernel proper (vmlinux) uses the
 whole buffer from the beginning and the compressed kernel's code and
 data section is overlapped with the kernel proper's bss section.

 Later on, clear_bss() in kernel proper clears .bss before code in
 arch/x86/kernel/setup.c can access setup_data passed in the first,
 compressed kernel.

 To make sure that data survives, we should avoid the overlapping.

 As a first step, move the first kernel closer to the end of the buffer
 instead of the middle. As a result, this will place first kernel's data
 area out of kernel proper's .bss area.

 This way we can find out where the data section of the copied first
 kernel is instead of guessing. In addition, it will make the KASLR
 mem_avoid array preparation for the search of a fitting buffer much
 simpler.

 While at it, rename z_extract_offset to z_min_extract_offset as it is
 actually the minimum extract offset now.

 In order to keep the final extract offset page-aligned we need to
 make both kernels' _end markers page-aligned too so that init_size is
 page-aligned as a result.

 Signed-off-by: Yinghai Lu ying...@kernel.org
 Cc: H. Peter Anvin h...@zytor.com
 Cc: Matt Fleming matt.flem...@intel.com
 Cc: Kees Cook keesc...@chromium.org
 Cc: Thomas Gleixner t...@linutronix.de
 Cc: Jiri Kosina jkos...@suse.cz
 Cc: linux-efi@vger.kernel.org
 Cc: Ingo Molnar mi...@redhat.com
 Cc: Baoquan He b...@redhat.com
 Fixes: f47233c2d34f (x86/mm/ASLR: Propagate base load address calculation)
 Link: 
 http://lkml.kernel.org/r/1425766041-6551-3-git-send-email-ying...@kernel.org
 [ Commit message massively rewritten ]

Acked-by: Kees Cook keesc...@chromium.org

Thanks!

-Kees

 Signed-off-by:
 ---
  arch/x86/boot/compressed/head_32.S | 11 +--
  arch/x86/boot/compressed/head_64.S |  8 ++--
  arch/x86/boot/compressed/mkpiggy.c |  7 ++-
  arch/x86/boot/compressed/vmlinux.lds.S |  1 +
  arch/x86/boot/header.S |  2 +-
  arch/x86/kernel/asm-offsets.c  |  1 +
  arch/x86/kernel/vmlinux.lds.S  |  1 +
  7 files changed, 21 insertions(+), 10 deletions(-)

 diff --git a/arch/x86/boot/compressed/head_32.S 
 b/arch/x86/boot/compressed/head_32.S
 index cbed1407a5cd..a9b56f1d8e75 100644
 --- a/arch/x86/boot/compressed/head_32.S
 +++ b/arch/x86/boot/compressed/head_32.S
 @@ -147,7 +147,9 @@ preferred_addr:
  1:

 /* Target address to relocate to for decompression */
 -   addl$z_extract_offset, %ebx
 +   movlBP_init_size(%esi), %eax
 +   subl$_end, %eax
 +   addl%eax, %ebx

 /* Set up the stack */
 lealboot_stack_end(%ebx), %esp
 @@ -208,8 +210,13 @@ relocated:
   */
 /* push arguments for decompress_kernel: */
 pushl   $z_output_len   /* decompressed length */
 -   lealz_extract_offset_negative(%ebx), %ebp
 +
 +   movlBP_init_size(%esi), %eax
 +   subl$_end, %eax
 +   movl%ebx, %ebp
 +   subl%eax, %ebp
 pushl   %ebp/* output address */
 +
 pushl   $z_input_len/* input_len */
 lealinput_data(%ebx), %eax
 pushl   %eax/* input_data */
 diff --git a/arch/x86/boot/compressed/head_64.S 
 b/arch/x86/boot/compressed/head_64.S
 index 2884e0c3e8a5..69015b576cf6 100644
 --- a/arch/x86/boot/compressed/head_64.S
 +++ b/arch/x86/boot/compressed/head_64.S
 @@ -101,7 +101,9 @@ ENTRY(startup_32)
  1:

 /* Target address to relocate to for decompression */
 -   addl$z_extract_offset, %ebx
 +   movlBP_init_size(%esi), %eax
 +   subl$_end, %eax
 +   addl%eax, %ebx

  /*
   * Prepare for entering 64 bit mode
 @@ -329,7 +331,9 @@ preferred_addr:
  1:

 /* Target address to relocate to for decompression */
 -   leaqz_extract_offset(%rbp), %rbx
 +   movlBP_init_size(%rsi), %ebx
 +   subl$_end, %ebx
 +   addq%rbp, %rbx

 /* Set up the stack */
 leaqboot_stack_end(%rbx), %rsp
 diff --git a/arch/x86/boot/compressed/mkpiggy.c 
 b/arch/x86/boot/compressed/mkpiggy.c
 index

Re: [PATCH v3 0/7] x86, boot: clean up kasl

2015-03-09 Thread Kees Cook

On Sat, Mar 7, 2015 at 2:07 PM, Yinghai Lu ying...@kernel.org wrote:
 First 3 patches make ZO (arch/x86/boot/compressed/vmlinux) data region is not
 overwritten by VO (vmlinux) after decompress.  So could pass data from ZO to 
 VO.

Tiny nit-pick, in case there is a v5: the Subject on this (0/7) says
kasl instead of kaslr.

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 1/7] x86, kaslr: Use init_size instead of run_size

2015-03-09 Thread Kees Cook

On Sat, Mar 7, 2015 at 2:07 PM, Yinghai Lu ying...@kernel.org wrote:
 commit e6023367d779 (x86, kaslr: Prevent .bss from overlaping initrd)
 introduced one run_size for kaslr.
 We should use real runtime size (include copy/decompress) aka init_size.

 run_size is VO (vmlinux) init size include bss and brk.
 init_size is the size needed for decompress and it is bigger than run_size
 when decompress need more buff.

 According to arch/x86/boot/header.S:
 | #define ZO_INIT_SIZE(ZO__end - ZO_startup_32 + ZO_z_extract_offset)
 | #define VO_INIT_SIZE(VO__end - VO__text)
 | #if ZO_INIT_SIZE  VO_INIT_SIZE
 | #define INIT_SIZE ZO_INIT_SIZE
 | #else
 | #define INIT_SIZE VO_INIT_SIZE
 | #endif
 | init_size:  .long INIT_SIZE # kernel initialization size

 Bootloader allocate buffer according to init_size in hdr, and load the
 ZO (arch/x86/boot/compressed/vmlinux) from start of that buffer.
 init_size first come from VO (vmlinux) init size. That VO init size
 is from VO _end to VO _end and include VO bss and brk area.

 During running of ZO, ZO move itself to the middle of buffer at
 z_extract_offset to make sure that decompressor would not have output
 overwrite input data before input data get consumed.
 But z_extract_offset calculating is based on size of VO (vmlinux) and size
 of compressed VO only at first.
 So need to make sure [z_extra_offset, init_size) will fit ZO, that means
 init_size need to be adjusted according to ZO size.
 That make init_size is always = run_size.

 During aslr buffer searching, we need to make sure the new buffer is big
 enough for decompress at first. So use init_size instead, and kill not
 needed run_size related code.

 Fixes: e6023367d779 (x86, kaslr: Prevent .bss from overlaping initrd)
 Cc: H. Peter Anvin h...@zytor.com
 Cc: Josh Triplett j...@joshtriplett.org
 Cc: Matt Fleming matt.flem...@intel.com
 Cc: Kees Cook keesc...@chromium.org
 Cc: Andrew Morton a...@linux-foundation.org
 Cc: Ard Biesheuvel ard.biesheu...@linaro.org
 Cc: Junjie Mao eternal@gmail.com
 Signed-off-by: Yinghai Lu ying...@kernel.org

Acked-by: Kees Cook keesc...@chromium.org

 ---
  arch/x86/boot/compressed/Makefile  |  4 +---
  arch/x86/boot/compressed/head_32.S |  5 ++---
  arch/x86/boot/compressed/head_64.S |  5 +
  arch/x86/boot/compressed/misc.c| 15 +++---
  arch/x86/boot/compressed/mkpiggy.c |  9 ++--
  arch/x86/tools/calc_run_size.sh| 42 
 --
  6 files changed, 13 insertions(+), 67 deletions(-)
  delete mode 100644 arch/x86/tools/calc_run_size.sh

 diff --git a/arch/x86/boot/compressed/Makefile 
 b/arch/x86/boot/compressed/Makefile
 index 0a291cd..70cc92c 100644
 --- a/arch/x86/boot/compressed/Makefile
 +++ b/arch/x86/boot/compressed/Makefile
 @@ -92,10 +92,8 @@ suffix-$(CONFIG_KERNEL_XZ)   := xz
  suffix-$(CONFIG_KERNEL_LZO):= lzo
  suffix-$(CONFIG_KERNEL_LZ4):= lz4

 -RUN_SIZE = $(shell $(OBJDUMP) -h vmlinux | \
 -$(CONFIG_SHELL) $(srctree)/arch/x86/tools/calc_run_size.sh)
  quiet_cmd_mkpiggy = MKPIGGY $@
 -  cmd_mkpiggy = $(obj)/mkpiggy $ $(RUN_SIZE)  $@ || ( rm -f $@ ; false 
 )
 +  cmd_mkpiggy = $(obj)/mkpiggy $  $@ || ( rm -f $@ ; false )

  targets += piggy.S
  $(obj)/piggy.S: $(obj)/vmlinux.bin.$(suffix-y) $(obj)/mkpiggy FORCE
 diff --git a/arch/x86/boot/compressed/head_32.S 
 b/arch/x86/boot/compressed/head_32.S
 index 1d7fbbc..cbed140 100644
 --- a/arch/x86/boot/compressed/head_32.S
 +++ b/arch/x86/boot/compressed/head_32.S
 @@ -207,8 +207,7 @@ relocated:
   * Do the decompression, and jump to the new kernel..
   */
 /* push arguments for decompress_kernel: */
 -   pushl   $z_run_size /* size of kernel with .bss and .brk */
 -   pushl   $z_output_len   /* decompressed length, end of relocs */
 +   pushl   $z_output_len   /* decompressed length */
 lealz_extract_offset_negative(%ebx), %ebp
 pushl   %ebp/* output address */
 pushl   $z_input_len/* input_len */
 @@ -218,7 +217,7 @@ relocated:
 pushl   %eax/* heap area */
 pushl   %esi/* real mode pointer */
 calldecompress_kernel /* returns kernel location in %eax */
 -   addl$28, %esp
 +   addl$24, %esp

  /*
   * Jump to the decompressed kernel.
 diff --git a/arch/x86/boot/compressed/head_64.S 
 b/arch/x86/boot/compressed/head_64.S
 index 6b1766c..2884e0c 100644
 --- a/arch/x86/boot/compressed/head_64.S
 +++ b/arch/x86/boot/compressed/head_64.S
 @@ -402,16 +402,13 @@ relocated:
   * Do the decompression, and jump to the new kernel..
   */
 pushq   %rsi/* Save the real mode argument */
 -   movq$z_run_size, %r9/* size of kernel with .bss and .brk 
 */
 -   pushq   %r9
 movq%rsi, %rdi  /* real mode address */
 leaqboot_heap(%rip), %rsi   /* malloc area for uncompression */
 leaq

Re: [PATCH v3 6/7] x86, boot: Split kernel_ident_mapping_init to another file

2015-03-09 Thread Kees Cook

On Sat, Mar 7, 2015 at 2:07 PM, Yinghai Lu ying...@kernel.org wrote:
 We need to include that in boot::decompress_kernel stage to set new
 ident mapping.

 Also add checking for __pa/__va macro definition, as we need to override them
 in boot::decompress_kernel stage.

 Signed-off-by: Yinghai Lu ying...@kernel.org

This seems good. :)

Reviewed-by: Kees Cook keesc...@chromium.org

-Kees

 ---
  arch/x86/include/asm/page.h |  5 +++
  arch/x86/mm/ident_map.c | 74 
 +
  arch/x86/mm/init_64.c   | 74 
 +
  3 files changed, 80 insertions(+), 73 deletions(-)
  create mode 100644 arch/x86/mm/ident_map.c

 diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
 index 802dde3..cf8f619 100644
 --- a/arch/x86/include/asm/page.h
 +++ b/arch/x86/include/asm/page.h
 @@ -37,7 +37,10 @@ static inline void copy_user_page(void *to, void *from, 
 unsigned long vaddr,
 alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
  #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE

 +#ifndef __pa
  #define __pa(x)__phys_addr((unsigned long)(x))
 +#endif
 +
  #define __pa_nodebug(x)__phys_addr_nodebug((unsigned long)(x))
  /* __pa_symbol should be used for C visible symbols.
 This seems to be the official gcc blessed way to do such arithmetic. */
 @@ -51,7 +54,9 @@ static inline void copy_user_page(void *to, void *from, 
 unsigned long vaddr,
  #define __pa_symbol(x) \
 __phys_addr_symbol(__phys_reloc_hide((unsigned long)(x)))

 +#ifndef __va
  #define __va(x)((void *)((unsigned 
 long)(x)+PAGE_OFFSET))
 +#endif

  #define __boot_va(x)   __va(x)
  #define __boot_pa(x)   __pa(x)
 diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
 new file mode 100644
 index 000..751ca92
 --- /dev/null
 +++ b/arch/x86/mm/ident_map.c
 @@ -0,0 +1,74 @@
 +
 +static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page,
 +  unsigned long addr, unsigned long end)
 +{
 +   addr = PMD_MASK;
 +   for (; addr  end; addr += PMD_SIZE) {
 +   pmd_t *pmd = pmd_page + pmd_index(addr);
 +
 +   if (!pmd_present(*pmd))
 +   set_pmd(pmd, __pmd(addr | pmd_flag));
 +   }
 +}
 +static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page,
 + unsigned long addr, unsigned long end)
 +{
 +   unsigned long next;
 +
 +   for (; addr  end; addr = next) {
 +   pud_t *pud = pud_page + pud_index(addr);
 +   pmd_t *pmd;
 +
 +   next = (addr  PUD_MASK) + PUD_SIZE;
 +   if (next  end)
 +   next = end;
 +
 +   if (pud_present(*pud)) {
 +   pmd = pmd_offset(pud, 0);
 +   ident_pmd_init(info-pmd_flag, pmd, addr, next);
 +   continue;
 +   }
 +   pmd = (pmd_t *)info-alloc_pgt_page(info-context);
 +   if (!pmd)
 +   return -ENOMEM;
 +   ident_pmd_init(info-pmd_flag, pmd, addr, next);
 +   set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
 +   }
 +
 +   return 0;
 +}
 +
 +int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
 + unsigned long addr, unsigned long end)
 +{
 +   unsigned long next;
 +   int result;
 +   int off = info-kernel_mapping ? pgd_index(__PAGE_OFFSET) : 0;
 +
 +   for (; addr  end; addr = next) {
 +   pgd_t *pgd = pgd_page + pgd_index(addr) + off;
 +   pud_t *pud;
 +
 +   next = (addr  PGDIR_MASK) + PGDIR_SIZE;
 +   if (next  end)
 +   next = end;
 +
 +   if (pgd_present(*pgd)) {
 +   pud = pud_offset(pgd, 0);
 +   result = ident_pud_init(info, pud, addr, next);
 +   if (result)
 +   return result;
 +   continue;
 +   }
 +
 +   pud = (pud_t *)info-alloc_pgt_page(info-context);
 +   if (!pud)
 +   return -ENOMEM;
 +   result = ident_pud_init(info, pud, addr, next);
 +   if (result)
 +   return result;
 +   set_pgd(pgd, __pgd(__pa(pud) | _KERNPG_TABLE));
 +   }
 +
 +   return 0;
 +}
 diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
 index 30eb05a..c30efb6 100644
 --- a/arch/x86/mm/init_64.c
 +++ b/arch/x86/mm/init_64.c
 @@ -56,79 +56,7 @@

  #include mm_internal.h

 -static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page,
 -  unsigned long addr, unsigned long end)
 -{
 -   addr = PMD_MASK;
 -   for (; addr  end; addr += PMD_SIZE) {
 -   pmd_t *pmd = pmd_page + pmd_index

Re: [PATCH v3 4/7] x86, kaslr: Access the correct kaslr_enabled variable

2015-03-09 Thread Kees Cook

On Sat, Mar 7, 2015 at 2:07 PM, Yinghai Lu ying...@kernel.org wrote:
 Commit

   f47233c2d34f (x86/mm/ASLR: Propagate base load address calculation)

 started passing KASLR status to kernel proper, but it uses a physical
 address as the vaule, leading to parsing bogus KASLR status in kernel
 proper.

 The setup_data linked list and thus the element which contains
 kaslr_enabled is chained together using physical addresses. At the time
 when we access it in the kernel proper, we're already running with
 paging enabled and therefore must access it through its virtual address.

 This patch changes the code to use early_memmap() and access the value.

 -v3: add checking return from early_memmap according to Boris.
 -v4: add description about setup_data accessing from Boris to change log.

Nit for next version: revision change notes usually go below the ---
in a separate section, and end with ---, above the files changed
report, and below the main changelog.

-Kees


 Fixes: f47233c2d34f (x86/mm/ASLR: Propagate base load address calculation)
 Cc: Matt Fleming matt.flem...@intel.com
 Cc: Borislav Petkov b...@suse.de
 Cc: Kees Cook keesc...@chromium.org
 Cc: Jiri Kosina jkos...@suse.cz
 Acked-by: Jiri Kosina jkos...@suse.cz
 Signed-off-by: Yinghai Lu ying...@kernel.org
 ---
  arch/x86/kernel/setup.c | 13 -
  1 file changed, 12 insertions(+), 1 deletion(-)

 diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
 index 98dc931..912f124 100644
 --- a/arch/x86/kernel/setup.c
 +++ b/arch/x86/kernel/setup.c
 @@ -429,7 +429,18 @@ static void __init reserve_initrd(void)

  static void __init parse_kaslr_setup(u64 pa_data, u32 data_len)
  {
 -   kaslr_enabled = (bool)(pa_data + sizeof(struct setup_data));
 +   /* kaslr_setup_data is defined in aslr.c */
 +   unsigned char *data;
 +   unsigned long offset = sizeof(struct setup_data);
 +
 +   data = early_memremap(pa_data, offset + 1);
 +   if (!data) {
 +   kaslr_enabled = true;
 +   return;
 +   }
 +
 +   kaslr_enabled = *(data + offset);
 +   early_memunmap(data, offset + 1);
  }

  static void __init parse_setup_data(void)
 --
 1.8.4.5




-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 5/7] x86, kaslr: Consolidate mem_avoid array filling

2015-03-09 Thread Kees Cook

On Sat, Mar 7, 2015 at 2:07 PM, Yinghai Lu ying...@kernel.org wrote:
 Now ZO sit end of the buffer, we can find out where is ZO text
 and data/bss etc.

 [input, input+input_size) is copied compressed kernel, not the whole ZO.
 [output, output+init_size) is the buffer for VO.

 [input+input_size, output+init_size) is [_text, _end) for ZO.
 that could be first range in mem_avoid. We don't need to guess that anymore.

 That area already include heap and stack for ZO running. So we don't need
 to put them into mem_avoid array.

 We need to put boot_params into the mem_avoid too. As with 64bit bootloader
 could put it anywhere.

This may be a stupid question, but are boot_params being used outside
of the compressed loader? If so, it might make sense to split that
change into a separate patch to go to stable, if it's causing
problems. (And document what problem is getting solved.)

-Kees


 Also change output_size referring to init_size, as we pass init_size instead.

 Cc: Kees Cook keesc...@chromium.org
 Signed-off-by: Yinghai Lu ying...@kernel.org
 ---
  arch/x86/boot/compressed/aslr.c | 29 ++---
  arch/x86/boot/compressed/misc.h |  4 ++--
  2 files changed, 16 insertions(+), 17 deletions(-)

 diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
 index 7083c16..a279514 100644
 --- a/arch/x86/boot/compressed/aslr.c
 +++ b/arch/x86/boot/compressed/aslr.c
 @@ -116,7 +116,7 @@ struct mem_vector {
 unsigned long size;
  };

 -#define MEM_AVOID_MAX 5
 +#define MEM_AVOID_MAX 4
  static struct mem_vector mem_avoid[MEM_AVOID_MAX];

  static bool mem_contains(struct mem_vector *region, struct mem_vector *item)
 @@ -142,7 +142,7 @@ static bool mem_overlaps(struct mem_vector *one, struct 
 mem_vector *two)
  }

  static void mem_avoid_init(unsigned long input, unsigned long input_size,
 -  unsigned long output, unsigned long output_size)
 +  unsigned long output, unsigned long init_size)
  {
 u64 initrd_start, initrd_size;
 u64 cmd_line, cmd_line_size;
 @@ -151,10 +151,13 @@ static void mem_avoid_init(unsigned long input, 
 unsigned long input_size,

 /*
  * Avoid the region that is unsafe to overlap during
 -* decompression (see calculations at top of misc.c).
 +* decompression.
 +* As we already move ZO (arch/x86/boot/compressed/vmlinux)
 +* to the end of buffer, [input+input_size, output+init_size)
 +* has [_text, _end) for ZO.
  */
 -   unsafe_len = (output_size  12) + 32768 + 18;
 -   unsafe = (unsigned long)input + input_size - unsafe_len;
 +   unsafe_len = output + init_size - (input + input_size);
 +   unsafe = (unsigned long)input + input_size;
 mem_avoid[0].start = unsafe;
 mem_avoid[0].size = unsafe_len;

 @@ -176,13 +179,9 @@ static void mem_avoid_init(unsigned long input, unsigned 
 long input_size,
 mem_avoid[2].start = cmd_line;
 mem_avoid[2].size = cmd_line_size;

 -   /* Avoid heap memory. */
 -   mem_avoid[3].start = (unsigned long)free_mem_ptr;
 -   mem_avoid[3].size = BOOT_HEAP_SIZE;
 -
 -   /* Avoid stack memory. */
 -   mem_avoid[4].start = (unsigned long)free_mem_end_ptr;
 -   mem_avoid[4].size = BOOT_STACK_SIZE;
 +   /* Avoid params */
 +   mem_avoid[3].start = (unsigned long)real_mode;
 +   mem_avoid[3].size = sizeof(*real_mode);
  }

  /* Does this memory vector overlap a known avoided area? */
 @@ -327,7 +326,7 @@ unsigned char *choose_kernel_location(struct boot_params 
 *params,
   unsigned char *input,
   unsigned long input_size,
   unsigned char *output,
 - unsigned long output_size)
 + unsigned long init_size)
  {
 unsigned long choice = (unsigned long)output;
 unsigned long random;
 @@ -349,10 +348,10 @@ unsigned char *choose_kernel_location(struct 
 boot_params *params,

 /* Record the various known unsafe memory ranges. */
 mem_avoid_init((unsigned long)input, input_size,
 -  (unsigned long)output, output_size);
 +  (unsigned long)output, init_size);

 /* Walk e820 and find a random address. */
 -   random = find_random_addr(choice, output_size);
 +   random = find_random_addr(choice, init_size);
 if (!random) {
 debug_putstr(KASLR could not find suitable E820 
 region...\n);
 goto out;
 diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
 index ee3576b..23156e7 100644
 --- a/arch/x86/boot/compressed/misc.h
 +++ b/arch/x86/boot/compressed/misc.h
 @@ -61,7 +61,7 @@ unsigned char *choose_kernel_location(struct boot_params 
 *params,
   unsigned char *input

Re: [PATCH v3 7/7] x86, kaslr, 64bit: Set new or extra ident_mapping

2015-03-09 Thread Kees Cook

On Sat, Mar 7, 2015 at 2:07 PM, Yinghai Lu ying...@kernel.org wrote:
 First, aslr will support to put random VO above 4G, so we must set ident
 mapping for the range even we come from startup_32 path.

 Second, when boot from 64bit bootloader, bootloader set ident mapping,
 and boot via ZO (arch/x86/boot/compressed/vmlinux) startup_64.
 Those pages for pagetable need to be avoided when we select new random
 VO (vmlinux) base. Otherwise decompressor would overwrite them during
 decompressing.

 One solution: go through pagetable and find out every page is used by
 pagetable for every mem_aovid checking but we will need extra code.

 Other solution: create new ident mapping instead, and pages for pagetable
 will sit in _pagetable section of ZO, and they are in mem_avoid array already.
 In this way, we can reuse the code for setting ident mapping.

 The _pgtable will be shared 32bit and 64bit path to reduce init_size,
 as now ZO _rodata to _end will contribute init_size.

 Need to increase pgt buffer size.
 When boot via startup_64, as we need to cover old VO, params, cmdline
 and new VO, in extreme case we could have them all cross 512G boundary,
 will need (2+2)*4 pages with 2M mapping. And need 2 for first 2M for vga ram.
 Plus one for level4. Total will be 19 pages.
 When boot via startup_32, aslr would move new VO above 4G, we need set extra
 ident mapping for new VO, pgt buffer come from _pgtable offset 6 pages.
 should only need (2+2) pages at most when it cross 512G boundary.
 So 19 pages could make both pathes happy.


 -v3: add mapping for first 2M with video ram when X86_VERBOSE_BOOTUP is set.
  Don't need to set mapping for setup_data, as it is already late
  in boot/ZO stage, will not access it until VO stage, and VO stage
  will use early_memmap or kernel address to access them.

 Cc: Kees Cook keesc...@chromium.org
 Cc: Jiri Kosina jkos...@suse.cz
 Cc: Borislav Petkov b...@suse.de
 Cc: Matt Fleming matt.flem...@intel.com
 Signed-off-by: Yinghai Lu ying...@kernel.org
 ---
  arch/x86/boot/compressed/aslr.c | 21 
  arch/x86/boot/compressed/head_64.S  |  4 +-
  arch/x86/boot/compressed/misc_pgt.c | 98 
 +
  arch/x86/include/asm/boot.h | 19 +++
  4 files changed, 140 insertions(+), 2 deletions(-)
  create mode 100644 arch/x86/boot/compressed/misc_pgt.c

 diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
 index a279514..34eb652 100644
 --- a/arch/x86/boot/compressed/aslr.c
 +++ b/arch/x86/boot/compressed/aslr.c
 @@ -1,3 +1,8 @@
 +#ifdef CONFIG_X86_64
 +#define __pa(x)  ((unsigned long)(x))
 +#define __va(x)  ((void *)((unsigned long)(x)))
 +#endif
 +
  #include misc.h

  #include asm/msr.h
 @@ -21,6 +26,8 @@ struct kaslr_setup_data {
 __u8 data[1];
  } kaslr_setup_data;

 +#include misc_pgt.c

Shouldn't this just be a normal built .o file that is linked together
in the Makefile, specifically tracking CONFIG_RANDOMIZE_BASE as aslr.o
already is?

-Kees

 +
  #define I8254_PORT_CONTROL 0x43
  #define I8254_PORT_COUNTER00x40
  #define I8254_CMD_READBACK 0xC0
 @@ -160,6 +167,7 @@ static void mem_avoid_init(unsigned long input, unsigned 
 long input_size,
 unsafe = (unsigned long)input + input_size;
 mem_avoid[0].start = unsafe;
 mem_avoid[0].size = unsafe_len;
 +   fill_pagetable(output, init_size);

 /* Avoid initrd. */
 initrd_start  = (u64)real_mode-ext_ramdisk_image  32;
 @@ -168,6 +176,7 @@ static void mem_avoid_init(unsigned long input, unsigned 
 long input_size,
 initrd_size |= real_mode-hdr.ramdisk_size;
 mem_avoid[1].start = initrd_start;
 mem_avoid[1].size = initrd_size;
 +   /* don't need to set mapping for initrd */

 /* Avoid kernel command line. */
 cmd_line  = (u64)real_mode-ext_cmd_line_ptr  32;
 @@ -178,10 +187,19 @@ static void mem_avoid_init(unsigned long input, 
 unsigned long input_size,
 ;
 mem_avoid[2].start = cmd_line;
 mem_avoid[2].size = cmd_line_size;
 +   fill_pagetable(cmd_line, cmd_line_size);

 /* Avoid params */
 mem_avoid[3].start = (unsigned long)real_mode;
 mem_avoid[3].size = sizeof(*real_mode);
 +   fill_pagetable((unsigned long)real_mode, sizeof(*real_mode));
 +
 +   /* don't need to set mapping for setup_data */
 +
 +#ifdef CONFIG_X86_VERBOSE_BOOTUP
 +   /* for video ram */
 +   fill_pagetable(0, PMD_SIZE);
 +#endif
  }

  /* Does this memory vector overlap a known avoided area? */
 @@ -362,6 +380,9 @@ unsigned char *choose_kernel_location(struct boot_params 
 *params,
 goto out;

 choice = random;
 +
 +   fill_pagetable(choice, init_size);
 +   switch_pagetable();
  out:
 return (unsigned char *)choice;
  }
 diff --git a/arch/x86/boot/compressed/head_64.S 
 b/arch/x86/boot/compressed/head_64.S
 index 69015b5..1b6e34a 100644
 --- a/arch/x86/boot

Re: [PATCH v3 5/7] x86, kaslr: Consolidate mem_avoid array filling

2015-03-09 Thread Kees Cook

On Mon, Mar 9, 2015 at 6:10 PM, Yinghai Lu ying...@kernel.org wrote:
 On Mon, Mar 9, 2015 at 6:00 PM, Kees Cook keesc...@chromium.org wrote:
 On Sat, Mar 7, 2015 at 2:07 PM, Yinghai Lu ying...@kernel.org wrote:

 This may be a stupid question, but are boot_params being used outside
 of the compressed loader? If so, it might make sense to split that
 change into a separate patch to go to stable, if it's causing
 problems. (And document what problem is getting solved.)

 boot_params will keep the same and until it is passed
 x86_64_start_kernel in vmlinux.
 and there it will be copied, same as cmdline.

 but current kaslr only support random the base high, and it does not
 support kexec
 and current boot loader (grub2) put it really low. (under 1M). the
 real real_mode. :).

Oh right! Of course, thanks for the details. Yes, it was implicitly
ignored before, but now it needs to be explicitly ignored. Great,
thanks!

-Kees


 Thanks

 Yinghai



-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 01/15] x86, kaslr: Use init_size instead of run_size

2015-03-06 Thread Kees Cook

On Fri, Mar 6, 2015 at 10:44 AM, Yinghai Lu ying...@kernel.org wrote:
 On Fri, Mar 6, 2015 at 5:55 AM, Borislav Petkov b...@alien8.de wrote:
 On Wed, Mar 04, 2015 at 12:00:34AM -0800, Yinghai Lu wrote:
 commit e6023367d779 (x86, kaslr: Prevent .bss from overlaping initrd)

 introduced one run_size for kaslr.

 We do not need to have home grown run_size.

 We should use real runtime size (include copy/decompress) aka init_size

 Why?

 New change log:

 Subject: [PATCH] x86, kaslr: Use init_size instead of run_size

 commit e6023367d779 (x86, kaslr: Prevent .bss from overlaping initrd)
 introduced one run_size for kaslr.
 We should use real runtime size (include copy/decompress) aka init_size.

 run_size is size of VO (vmlinux).
 init_size is the size needed for decompress and it is bigger than run_size
 when decompress need more buff.

 According to arch/x86/boot/header.S:
 | #define ZO_INIT_SIZE(ZO__end - ZO_startup_32 + ZO_z_extract_offset)
 | #define VO_INIT_SIZE(VO__end - VO__text)
 | #if ZO_INIT_SIZE  VO_INIT_SIZE
 | #define INIT_SIZE ZO_INIT_SIZE
 | #else
 | #define INIT_SIZE VO_INIT_SIZE
 | #endif
 | init_size:  .long INIT_SIZE # kernel initialization size

 Bootloader allocate buffer according to init_size in hdr, and load the
 ZO (arch/x86/boot/compressed/vmlinux) from start of that buffer.
 During running of ZO, ZO move itself to the middle of buffer at
 z_extract_offset to make sure that decompressor would not have output
 overwrite input data before input data get consumed.
 But z_extract_offset calculating is based on size of VO (vmlinux) and size
 of compressed VO only at first.
 So need to make [z_extra_offset, init_size) will fit ZO, that means
 init_size need to be adjusted according to ZO size.
 That make init_size is always = run_size.

 During aslr buffer searching, we need to make sure the buffer is bigger
 enough for decompress at first. So use init_size instead, and kill not
 needed run_size related code.

I don't see how bss and brk are related to these sizes. Can you
explain how bss, brk, and initrd factor into these sizes? Those were
what run_size was created to represent. I don't want to accidentally
start stomping on bss and brk again. :)

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Trusted kernel patchset for Secure Boot lockdown

2014-03-19 Thread Kees Cook

On Fri, Mar 14, 2014 at 3:31 PM, One Thousand Gnomes
gno...@lxorguk.ukuu.org.uk wrote:
 On Fri, 14 Mar 2014 22:15:45 +
 Matthew Garrett matthew.garr...@nebula.com wrote:

 On Fri, 2014-03-14 at 22:08 +, One Thousand Gnomes wrote:
  On Fri, 14 Mar 2014 21:56:33 +
  Matthew Garrett matthew.garr...@nebula.com wrote:
   Signed userspace is not a requirement, and therefore any solution that
   relies on a signed initrd is inadequate. There are use cases that
   require verification of the initrd and other levels. This isn't one of
   them.
 
  The job of the kernel is to solve the general problem. There are lots of
  people who happen to care about verification beyond the kernel so it
  shouldn't be ignored. And they can do do things like load trusted SELinux
  rulesets even if you can't support it in your environment.

 The general problem includes having to support this even without an
 selinux policy.

 Yes. No dispute about that. But equally the general solution should allow
 for it.

 And one that's not going to change, so the general problem includes not
 relying on a signed initramfs.

 Likewise

 some other way. ChromeOS will load unmeasured kernel modules provided it
 can attest to the trustworthyness of the filesystem containing them.

 See How to Bypass Verified Boot Security in Chromium OS 8)

That method a) is intentionally available (system owner can disable
firmware RO and install their own keys), and b) requires physical
presence.

 So ChromeOS loads *measured* kernel modules. It just did the measuring
 differently to the signed module code.

Right, using dm-verity. However, the read-only media case is still
valid (and is why I am still trying to get the module restrictions LSM
accepted, and why I'm modelling my firmware restrictions on the same
principle).

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Trusted kernel patchset for Secure Boot lockdown

2014-03-19 Thread Kees Cook

On Fri, Mar 14, 2014 at 4:18 PM, Theodore Ts'o ty...@mit.edu wrote:
 On Fri, Mar 14, 2014 at 10:08:40PM +, One Thousand Gnomes wrote:
  Signed userspace is not a requirement, and therefore any solution that
  relies on a signed initrd is inadequate. There are use cases that
  require verification of the initrd and other levels. This isn't one of
  them.

 The job of the kernel is to solve the general problem. There are lots of
 people who happen to care about verification beyond the kernel so it
 shouldn't be ignored. And they can do do things like load trusted SELinux
 rulesets even if you can't support it in your environment.

 This is really a question about goals and trust models.  Alan is
 arguing that we should support trust models and goals which go far
 beyond the goals and trust model of the UEFI Forum.

 Matthew is, I think, trying to make the argument that his patches
 fulfill the goals that are needed so we can boot Linux distribution
 kernels on UEFI secure boot machines without worrying about Microsoft
 deciding to revoke keys so that Red Hat or SuSE will no longer be able
 to be bootable on desktops that are certified for Windows 8.  And
 while we might want to improve the framework to support other trust
 models later on, supporting distro kernels on Windows 8 certified PC's
 is important enough that we should let these patches into mainline.

 Is that a fair summary of the two viewpoints?

UEFI is a red herring in both cases. This isn't about UEFI, it just
happens that one of the things that can assert trusted_kernel is the
UEFI Secure Boot path. Chrome OS would assert trusted_kernel by
passing it on the kernel command line (since it, along with firmware,
kernel, and root fs are effectively measured). A
boot-my-router-from-CD system can assert similarly because the kernel
is on RO media.

Based on my view of the conversation, it's about whether or not
capable() can be made to do these checks. The proposal about creating
the action/privilege map is interesting, and I'm all for doing it just
to make things easy to reason about for capabilities, but it still
seems separate from this work.

There still needs to be a global boolean for the trusted kernel
state. It would be used, for example, on the module param
whitelisting, which has nothing to do with capabilities. It would be
used to change driver behavior (e.g. disabling DMA or other badness
that isn't trusted), which has nothing to do with capabilities. The
ability to check this state is going to be needed going into the
future as we uncover more dangerous interfaces.

Since the capability work is separate, when it happens, that same
regex work could replace some of the is_trusted_kernel checks with new
action tests, but we still need the same infrastructure and
identification of dangerous interfaces.

Capabilities can be seen as related to this patch set, but it cannot
be seen as a blocker. This logic is needed today, it's implemented,
and it clearly marks where the known problems are. If an overhaul of
capabilities happens, it can happen separately.

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Trusted kernel patchset for Secure Boot lockdown

2014-03-14 Thread Kees Cook

 that covers this
new thing we want to do and keeps the new policy distinctly separate.

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Trusted kernel patchset for Secure Boot lockdown

2014-03-13 Thread Kees Cook

On Wed, Mar 12, 2014 at 10:01 PM, Matthew Garrett
matthew.garr...@nebula.com wrote:
 On Fri, 2014-02-28 at 14:03 +1100, James Morris wrote:

 Ok, which tree should take this?  I'm happy to, although most of it is
 outside security/ .

 Should I be looking for someone else to take them instead? :)

Andrew, is this series[1] something you'd be okay taking? It touches
many different areas, so you might be best for it.

-Kees

[1] https://lkml.org/lkml/2014/2/26/811

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 12/12] Add option to automatically set trusted_kernel when in Secure Boot mode

2014-02-27 Thread Kees Cook

On Wed, Feb 26, 2014 at 2:48 PM, Matthew Garrett
matthew.garr...@nebula.com wrote:
 On Wed, 2014-02-26 at 22:41 +, One Thousand Gnomes wrote:
 Another issue that needs addressing is firmware. Quite a few of our
 request_firmware cases load device firmware which is not signed into DMA
 capable hardware. Probably also worth checking what the
 architectural guarantees on bogus microcode updates is. Maybe we need
 firmware signing for such cases to match the mod signing ?

 Vendors keep telling me that they're validating firmware for new
 hardware, and I keep tending not to believe them. Meh. The big problem
 with firmware signatures is that we don't necessarily have the right to
 distribute modified versions of the firmware, so we'd need detached
 signature support. I'm certainly not against this.

I have been working on a patch series for this. It will have LSM hooks
for validating firmware origin (via fd) and contents (via blob),
similar to the changes I made for validating module origins. It just
need to finish testing, and I'll post the series. If you want to check
it out in its current state, it's here:

http://git.kernel.org/cgit/linux/kernel/git/kees/linux.git/log/?h=fw-restrict

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Trusted kernel patchset for Secure Boot lockdown

2014-02-26 Thread Kees Cook

On Wed, Feb 26, 2014 at 12:11 PM, Matthew Garrett
matthew.garr...@nebula.com wrote:
 The conclusion we came to at Plumbers was that this patchset was basically
 fine but that Linus hated the name securelevel more than I hate pickled
 herring, so after thinking about this for a few months I've come up with
 Trusted Kernel. This flag indicates that the kernel is, via some
 external mechanism, trusted and should behave that way. If firmware has
 some way to verify the kernel, it can pass that information on. If userspace
 has some way to verify the kernel, it can set the flag itself. However,
 userspace should not attempt to use the flag as a means to verify that the
 kernel was trusted - untrusted userspace could have set it on an untrusted
 kernel, but by the same metric an untrusted kernel could just set it itself.

 If people object to this name then I swear to god that I will open a poll
 on Phoronix to decide the next attempt and you will like that even less.

For the Chrome OS use-case, it might be better described as untrusted
userspace, but that seems unfriendly. :) The trusted kernel name
seems fine to me.

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 00/12] One more attempt at useful kernel lockdown

2013-09-10 Thread Kees Cook

On Tue, Sep 10, 2013 at 11:51 AM, gre...@linuxfoundation.org
gre...@linuxfoundation.org wrote:
 On Tue, Sep 10, 2013 at 11:29:45AM -0700, H. Peter Anvin wrote:
 On 09/10/2013 11:26 AM, Matthew Garrett wrote:
  On Tue, 2013-09-10 at 14:23 -0300, Henrique de Moraes Holschuh wrote:
  On Tue, 10 Sep 2013, Matthew Garrett wrote:
  That's why modern systems require signed firmware updates.
 
  Linux doesn't.  Is someone working on adding signature support to the
  runtime firmware loader?
 
  It'd be simple to do so, but so far the model appears to be that devices
  that expect signed firmware enforce that themselves.
 

 Most devices do absolutely no verification on the firmware, and simply
 trust the driver.

 So signing firmware is probably critical.

 How are you going to validate that the firmware is correct, given
 that it's just a blob living in the linux-firmware tree.  If you sign
 it, what is that saying?

In theory these blobs are traceable to a manufacturer. It's not really
an indication that it's safe more than it's an indication that it
hasn't been changed. But I haven't chased this very hard yet because
of below...

 I'm with Matthew here, any device that needs/wants this, has their own
 built-in checking, nothing the kernel should do here.

 Especially given that no other os does this :)

Yeah, it's impossible to handle since the way components do firmware
updates is frequently exposed to userspace anyway. 3G modems that do
firmware updates over the AT-command set, harddrives doing firmware
updates over SCSI-generic commands, etc. Creating this barrier in the
kernel is not a good solution; the component makers need to be doing
the enforcement. :(

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 00/12] One more attempt at useful kernel lockdown

2013-09-09 Thread Kees Cook

On Mon, Sep 9, 2013 at 4:30 PM, James Bottomley
james.bottom...@hansenpartnership.com wrote:
 On Mon, 2013-09-09 at 16:20 -0700, Kees Cook wrote:
 On Mon, Sep 9, 2013 at 4:19 PM, David Lang da...@lang.hm wrote:
  And if SELinux can do the job, what is the reason for creating this new
  option?

 Not everyone uses SELinux. :) Also, it's rarely controlled the things
 we want to control here.

 It comes on by default (or its equivalent: AppArmour) in almost every
 shipping distro.

Right, if LSM was meant here, yeah, I do use an LSM. But they, as a
class of security policy in the kernel, handle isolation of entirely
different things. The goal of no way to mess with ring-0 isn't
really related to the goals of the LSM in general, or specific MACs in
particular.

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH V3 08/11] kexec: Disable at runtime if the kernel enforces module loading restrictions

2013-09-08 Thread Kees Cook

On Sun, Sep 8, 2013 at 12:24 AM, Greg KH gre...@linuxfoundation.org wrote:
 On Sun, Sep 08, 2013 at 06:44:08AM +, Matthew Garrett wrote:
 On Sat, 2013-09-07 at 23:40 -0700, Greg KH wrote:
  If you apply this, you break everyone who is currently relying on kexec
  (i.e. kdump, bootloaders, etc.), from using signed kernel modules, which
  personally, seems like a very bad idea.

 Enforcing signed modules provides you with no additional security if you
 have kexec enabled. It's better to make that obvious.

 Then document the heck out of it, don't disable a valid use case just
 because it possibly could be used in some way that is different from the
 original system.

 If you take this to an extreme, kexec shouldn't be here at all, as it
 can do anything in the kernel wherever it wants to.

 kexec has nothing to do with signed modules, don't tie them together.

It's not accurate to say it has nothing to do with signed modules.
The purpose of signed modules is to ensure the integrity of the
running system against the root user.

It was, however, incomplete. Terrible analogy follows: signed modules
was locking the front door, but we have all sorts of windows still
open. This closes those windows. You're trying to say that shutting
windows has nothing to do with lumber locks. While technically true,
this is about the intent of the barriers.

Anyone currently using signed modules (with sig_enforce) AND kexec is
deluding themselves about what the state of their system's ring-0
security stance is. Those people should be running without
sig_enforce, and if they want both sig_enforce and kexec, then I would
expect a follow-up patch from them to provide signed kexec support.

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re:

2013-09-04 Thread Kees Cook

On Tue, Sep 3, 2013 at 4:50 PM, Matthew Garrett
matthew.garr...@nebula.com wrote:
 We have two in-kernel mechanisms for restricting module loading - disabling
 it entirely, or limiting it to the loading of modules signed with a trusted
 key. These can both be configured in such a way that even root is unable to
 relax the restrictions.

 However, right now, there's several other straightforward ways for root to
 modify running kernel code. At the most basic level these allow root to
 reset the configuration such that modules can be loaded again, rendering
 the existing restrictions useless.

 This patchset adds additional restrictions to various kernel entry points
 that would otherwise make it straightforward for root to disable enforcement
 of module loading restrictions. It also provides a patch that allows the
 kernel to be configured such that module signing will be automatically
 enabled when the system is booting via UEFI Secure Boot, allowing a stronger
 guarantee of kernel integrity.

 V3 addresses some review feedback and also locks down uswsusp.

Looks good to me. Consider the entire series:

Acked-by: Kees Cook keesc...@chromium.org

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/10] Add additional security checks when module loading is restricted

2013-08-28 Thread Kees Cook

On Wed, Aug 28, 2013 at 3:58 PM, Lenny Szubowicz lszub...@redhat.com wrote:

 - Original Message -
 From: Matthew Garrett matthew.garr...@nebula.com
 To: Lenny Szubowicz lszub...@redhat.com
 Cc: linux-ker...@vger.kernel.org, linux-efi@vger.kernel.org, 
 jwbo...@redhat.com, keesc...@chromium.org
 Sent: Wednesday, August 28, 2013 6:41:55 PM
 Subject: Re: [PATCH 0/10] Add additional security checks when module loading 
 is restricted

 On Wed, 2013-08-28 at 18:37 -0400, Lenny Szubowicz wrote:

  Did you purposely exclude similar checks for hibernate that were covered
  by earlier versions of your patch set?

 Yes, I think it's worth tying it in with the encrypted hibernation
 support. The local attack is significantly harder in the hibernation
 case - in the face of unknown hardware it basically involves a
 pre-generated memory image corresponding to your system or the ability
 to force a reboot into an untrusted environment. I think it's probably
 more workable to just add a configuration option for forcing encrypted
 hibernation when secure boot is in use.

 --
 Matthew Garrett matthew.garr...@nebula.com

 I'm root. So I can write anything I want to the swap file that looks
 like a valid hibernate image but is code of my choosing. I can read
 anything I need from /dev/mem or /dev/kmem to help me do that.
 I can then immediately initiate a reboot.

Strictly speaking, RAM contents are not available via /dev/*mem, even
to root. However, you can request a suspend image be written, but to
not enter hibernation. Then modify the image, and request a resume
from it.

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 05/10] asus-wmi: Restrict debugfs interface when module loading is restricted

2013-08-19 Thread Kees Cook

On Mon, Aug 19, 2013 at 9:10 AM, Matthew Garrett
matthew.garr...@nebula.com wrote:
 We have no way of validating what all of the Asus WMI methods do on a
 given machine, and there's a risk that some will allow hardware state to
 be manipulated in such a way that arbitrary code can be executed in the
 kernel, circumventing module loading restrictions. Prevent that if any of
 these features are enabled.

 Signed-off-by: Matthew Garrett matthew.garr...@nebula.com
 ---
  drivers/platform/x86/asus-wmi.c | 9 +
  1 file changed, 9 insertions(+)

 diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
 index 19c313b..8105530 100644
 --- a/drivers/platform/x86/asus-wmi.c
 +++ b/drivers/platform/x86/asus-wmi.c
 @@ -1618,6 +1618,9 @@ static int show_dsts(struct seq_file *m, void *data)
 int err;
 u32 retval = -1;

 +   if (secure_modules())
 +   return -EPERM;
 +
 err = asus_wmi_get_devstate(asus, asus-debug.dev_id, retval);

 if (err  0)
 @@ -1634,6 +1637,9 @@ static int show_devs(struct seq_file *m, void *data)
 int err;
 u32 retval = -1;

 +   if (!capable(CAP_COMPROMISE_KERNEL))

Looks like this and the next chunk weren't changed to the
secure_modules() API...

-Kees

 +   return -EPERM;
 +
 err = asus_wmi_set_devstate(asus-debug.dev_id, 
 asus-debug.ctrl_param,
 retval);

 @@ -1658,6 +1664,9 @@ static int show_call(struct seq_file *m, void *data)
 union acpi_object *obj;
 acpi_status status;

 +   if (!capable(CAP_COMPROMISE_KERNEL))
 +   return -EPERM;
 +
 status = wmi_evaluate_method(ASUS_WMI_MGMT_GUID,
  1, asus-debug.method_id,
  input, output);
 --
 1.8.3.1




-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/10] Add additional security checks when module loading is restricted

2013-08-19 Thread Kees Cook

On Mon, Aug 19, 2013 at 10:26 AM, Matthew Garrett
matthew.garr...@nebula.com wrote:
 We have two in-kernel mechanisms for restricting module loading - disabling
 it entirely, or limiting it to the loading of modules signed with a trusted
 key. These can both be configured in such a way that even root is unable to
 relax the restrictions.

 However, right now, there's several other straightforward ways for root to
 modify running kernel code. At the most basic level these allow root to
 reset the configuration such that modules can be loaded again, rendering
 the existing restrictions useless.

 This patchset adds additional restrictions to various kernel entry points
 that would otherwise make it straightforward for root to disable enforcement
 of module loading restrictions. It also provides a patch that allows the
 kernel to be configured such that module signing will be automatically
 enabled when the system is booting via UEFI Secure Boot, allowing a stronger
 guarantee of kernel integrity.

I like this approach of attaching it to the module loading logic.
Given the LSM hook for loading modules, I think I'd like to add a hook
for the secure_modules() check so that an LSM can respond secure as
well, if it doing module loading mediation (for example, in the case
of Chrome OS's modules-only-from-rootfs). I'll work up a patch...

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] x86: Lock down MSR writing in secure boot

2013-02-08 Thread Kees Cook

On Fri, Feb 8, 2013 at 11:42 AM, H. Peter Anvin h...@zytor.com wrote:
 On 02/08/2013 11:18 AM, Kees Cook wrote:

 No. CAP_RAWIO is for reading. Writing needs a much stronger check.

 If so, I suspect we need to do this for *all* raw I/O... but I keep
 wondering how much more sensitive writing really is than reading.

Well, I think there's a reasonable distinction between systems that
expect to strictly enforce user-space/kernel-space separation
(CAP_COMPROMISE_KERNEL) and things that are fiddling with hardware
(CAP_SYS_RAWIO).

For example, even things like /dev/mem already have this separation
(although it is stronger). You can't open /dev/mem without
CAP_SYS_RAWIO, but if you do, you still can't write to RAM in
/dev/mem. This might be one of the earliest examples of this
distinction, actually.

I think it's likely that after a while, we can convert some of these
proposed CAP_COMPROMISE_KERNEL checks in always-deny once we figure
out how to deal with those areas more safely.

-Kees

--
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] x86: Lock down MSR writing in secure boot

2013-02-08 Thread Kees Cook

On Fri, Feb 8, 2013 at 5:29 PM, Matthew Garrett
matthew.garr...@nebula.com wrote:
 On Fri, 2013-02-08 at 17:22 -0800, H. Peter Anvin wrote:

 You don't have to build the kernel twice to exclude a loadable module.

 I guess you could just strip the signatures off any modules you don't
 want to support under Secure Boot, but that breaks some other use cases.

Also, _reading_ MSRs from userspace arguably has utility that doesn't
compromise ring-0. So excluding the driver entirely seems like
overkill.

-Kees

--
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

75 matches

Mail list logo