Re: [PATCH v8 6/6] docs: trusted-encrypted: add DCP as new trust source

2024-04-13 Thread Jarkko Sakkinen
On Fri Apr 12, 2024 at 9:26 AM EEST, Herbert Xu wrote:
> On Wed, Apr 03, 2024 at 06:47:51PM +0300, Jarkko Sakkinen wrote:
> >
> > Reviewed-by: Jarkko Sakkinen 
> > 
> > I can only test that this does not break a machine without the
> > hardware feature.
>
> Please feel free to take this through your tree.
>
> Thanks,

OK, thanks!

BR, Jarkko


Re: [EXT] Re: [PATCH v8 6/6] docs: trusted-encrypted: add DCP as new trust source

2024-04-13 Thread Jarkko Sakkinen
On Tue Apr 9, 2024 at 12:48 PM EEST, Kshitiz Varshney wrote:
> Hi Jarkko,
>
>
> > -Original Message-
> > From: Jarkko Sakkinen 
> > Sent: Wednesday, April 3, 2024 9:18 PM
> > To: David Gstir ; Mimi Zohar ;
> > James Bottomley ; Herbert Xu
> > ; David S. Miller 
> > Cc: Shawn Guo ; Jonathan Corbet
> > ; Sascha Hauer ; Pengutronix
> > Kernel Team ; Fabio Estevam
> > ; dl-linux-imx ; Ahmad Fatoum
> > ; sigma star Kernel Team
> > ; David Howells ; Li
> > Yang ; Paul Moore ; James
> > Morris ; Serge E. Hallyn ; Paul E.
> > McKenney ; Randy Dunlap ;
> > Catalin Marinas ; Rafael J. Wysocki
> > ; Tejun Heo ; Steven Rostedt
> > (Google) ; linux-...@vger.kernel.org; linux-
> > ker...@vger.kernel.org; linux-integr...@vger.kernel.org;
> > keyri...@vger.kernel.org; linux-cry...@vger.kernel.org; linux-arm-
> > ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org; linux-security-
> > mod...@vger.kernel.org; Richard Weinberger ; David
> > Oberhollenzer 
> > Subject: [EXT] Re: [PATCH v8 6/6] docs: trusted-encrypted: add DCP as new
> > trust source
> > 
> > On Wed Apr 3, 2024 at 10:21 AM EEST, David Gstir wrote:
> > > Update the documentation for trusted and encrypted KEYS with DCP as
> > > new trust source:
> > >
> > > - Describe security properties of DCP trust source
> > > - Describe key usage
> > > - Document blob format
> > >
> > > Co-developed-by: Richard Weinberger 
> > > Signed-off-by: Richard Weinberger 
> > > Co-developed-by: David Oberhollenzer
> > > 
> > > Signed-off-by: David Oberhollenzer 
> > > Signed-off-by: David Gstir 
> > > ---
> > >  .../security/keys/trusted-encrypted.rst   | 53 +++
> > >  security/keys/trusted-keys/trusted_dcp.c  | 19 +++
> > >  2 files changed, 72 insertions(+)
> > >
> > > diff --git a/Documentation/security/keys/trusted-encrypted.rst
> > > b/Documentation/security/keys/trusted-encrypted.rst
> > > index e989b9802f92..f4d7e162d5e4 100644
> > > --- a/Documentation/security/keys/trusted-encrypted.rst
> > > +++ b/Documentation/security/keys/trusted-encrypted.rst
> > > @@ -42,6 +42,14 @@ safe.
> > >          randomly generated and fused into each SoC at manufacturing time.
> > >          Otherwise, a common fixed test key is used instead.
> > >
> > > +     (4) DCP (Data Co-Processor: crypto accelerator of various i.MX SoCs)
> > > +
> > > +         Rooted to a one-time programmable key (OTP) that is generally burnt
> > > +         in the on-chip fuses and is accessible to the DCP encryption engine only.
> > > +         DCP provides two keys that can be used as root of trust: the OTP key
> > > +         and the UNIQUE key. Default is to use the UNIQUE key, but selecting
> > > +         the OTP key can be done via a module parameter (dcp_use_otp_key).
> > > +
> > >    *  Execution isolation
> > >
> > >      (1) TPM
> > > @@ -57,6 +65,12 @@ safe.
> > >
> > >          Fixed set of operations running in isolated execution environment.
> > >
> > > +     (4) DCP
> > > +
> > > +         Fixed set of cryptographic operations running in isolated execution
> > > +         environment. Only basic blob key encryption is executed there.
> > > +         The actual key sealing/unsealing is done on main processor/kernel space.
> > > +
> > >    * Optional binding to platform integrity state
> > >
> > >      (1) TPM
> > > @@ -79,6 +93,11 @@ safe.
> > >          Relies on the High Assurance Boot (HAB) mechanism of NXP SoCs
> > >          for platform integrity.
> > >
> > > +     (4) DCP
> > > +
> > > +         Relies on Secure/Trusted boot process (called HAB by vendor) for
> > > +         platform integrity.
> > > +
> > >    *  Interfaces and APIs
> > >
> > >      (1) TPM
> > > @@ -94,6 +113,11 @@ safe.
> > >
> > >          Interface is specific to silicon vendor.
> > >
> > > +     (4) DCP
> > > +
> > > +         Vendor-specific API that is implemented as part of the DCP crypto
> > > +         driver in ``drivers/crypto/mxs-dcp.c``.
> > > +
> > >    *  Threat model
> > >
> > >      The strength and appropriateness of a particular trust source for a given
> > > @@ -129,6 +153,13 @@ selected trust source:
> > >          CAAM HWRNG, enable CRYPTO_DEV_FSL_CAAM_RNG_API and ensure the device
> > >          is probed.
> > >
> > > +  *  DCP (Data Co-Processor: crypto accelerator of various i.MX SoCs)
> > > +
> > > +     The DCP hardware device itself does not provide a dedicated RNG interface,
> > > +     so the kernel default RNG is used. SoCs with DCP like the i.MX6ULL do have
> > > +     a dedicated hardware RNG that is independent from DCP which can be enabled
> > > +     to back the kernel RNG.
> > > +
> > >  Users may override this by specifying ``trusted.rng=kernel`` on the
> 
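For readers trying this out: once a DCP-backed trust source is active (e.g.
trusted.source=dcp on the kernel command line), creating a trusted key works
the same as with TPM or CAAM. A minimal userspace sketch using keyutils'
add_key(2) wrapper follows -- the key name "kmk", the payload, and the note
about the module-parameter path are illustrative assumptions, not taken from
the patch:

    #include <keyutils.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
            /* Ask the trusted-keys subsystem for a new 32-byte key.
             * With trusted.source=dcp, sealing is rooted in the DCP
             * UNIQUE key (or the OTP key, if the dcp_use_otp_key
             * module parameter is set; exact parameter path assumed). */
            key_serial_t key = add_key("trusted", "kmk", "new 32",
                                       strlen("new 32"),
                                       KEY_SPEC_USER_KEYRING);
            if (key < 0) {
                    perror("add_key");
                    return 1;
            }
            printf("trusted key serial: %d\n", key);
            return 0;
    }

Build with -lkeyutils; the sealed blob can then be exported with
"keyctl pipe" as described elsewhere in trusted-encrypted.rst.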

Re: [PATCH v1 3/3] mm: use "GUP-fast" instead "fast GUP" in remaining comments

2024-04-13 Thread John Hubbard

On 4/2/24 5:55 AM, David Hildenbrand wrote:

Let's fixup the remaining comments to consistently call that thing
"GUP-fast".

Reviewed-by: Mike Rapoport (IBM) 
Signed-off-by: David Hildenbrand 
---
  mm/filemap.c| 2 +-
  mm/khugepaged.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)


Yes, everything is changed over now, confirmed.

Reviewed-by: John Hubbard 


thanks,
--
John Hubbard
NVIDIA



Re: [PATCH v1 2/3] mm/treewide: rename CONFIG_HAVE_FAST_GUP to CONFIG_HAVE_GUP_FAST

2024-04-13 Thread John Hubbard

On 4/2/24 5:55 AM, David Hildenbrand wrote:

Nowadays, we call it "GUP-fast", the external interface includes
functions like "get_user_pages_fast()", and we renamed all internal
functions to reflect that as well.

Let's make the config option reflect that.

Reviewed-by: Mike Rapoport (IBM) 
Signed-off-by: David Hildenbrand 
---
  arch/arm/Kconfig   |  2 +-
  arch/arm64/Kconfig |  2 +-
  arch/loongarch/Kconfig |  2 +-
  arch/mips/Kconfig  |  2 +-
  arch/powerpc/Kconfig   |  2 +-
  arch/riscv/Kconfig |  2 +-
  arch/s390/Kconfig  |  2 +-
  arch/sh/Kconfig|  2 +-
  arch/x86/Kconfig   |  2 +-
  include/linux/rmap.h   |  8 
  kernel/events/core.c   |  4 ++--
  mm/Kconfig |  2 +-
  mm/gup.c   | 10 +-
  mm/internal.h  |  2 +-
  14 files changed, 22 insertions(+), 22 deletions(-)



Reviewed-by: John Hubbard 


thanks,
--
John Hubbard
NVIDIA
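As a quick illustration of what consumers of the renamed option look like
(a sketch only, not code from the patch; pin_pages() and my_gup_fast() are
hypothetical stand-ins for the real internals):

    /* Sketch: generic code gating on the renamed Kconfig option. */
    static long pin_pages(unsigned long start, unsigned long nr_pages,
                          unsigned int gup_flags, struct page **pages)
    {
            if (!IS_ENABLED(CONFIG_HAVE_GUP_FAST))  /* was: CONFIG_HAVE_FAST_GUP */
                    return 0;       /* nothing pinned; caller takes the slow path */

            return my_gup_fast(start, nr_pages, gup_flags, pages);
    }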



Re: [PATCH v1 1/3] mm/gup: consistently name GUP-fast functions

2024-04-13 Thread John Hubbard

On 4/2/24 5:55 AM, David Hildenbrand wrote:

Let's consistently call the "fast-only" part of GUP "GUP-fast" and rename
all relevant internal functions to start with "gup_fast", to make it
clearer that this is not ordinary GUP. The current mixture of
"lockless", "gup" and "gup_fast" is confusing.

Further, avoid the term "huge" when talking about a "leaf" -- for
example, we nowadays check pmd_leaf() because pmd_huge() is gone. For the
"hugepd"/"hugepte" stuff, it's part of the name ("is_hugepd"), so that
stays.

What remains is the "external" interface:
* get_user_pages_fast_only()
* get_user_pages_fast()
* pin_user_pages_fast()

The high-level internal functions for GUP-fast (+slow fallback) are now:
* internal_get_user_pages_fast() -> gup_fast_fallback()
* lockless_pages_from_mm() -> gup_fast()

The basic GUP-fast walker functions:
* gup_pgd_range() -> gup_fast_pgd_range()
* gup_p4d_range() -> gup_fast_p4d_range()
* gup_pud_range() -> gup_fast_pud_range()
* gup_pmd_range() -> gup_fast_pmd_range()
* gup_pte_range() -> gup_fast_pte_range()
* gup_huge_pgd()  -> gup_fast_pgd_leaf()
* gup_huge_pud()  -> gup_fast_pud_leaf()
* gup_huge_pmd()  -> gup_fast_pmd_leaf()


This is my favorite cleanup of 2024 so far. The above mix was confusing
even for someone who worked in this file all day long -- you constantly had
to translate from a function name to "is this fast or slow?". Whew.




The weird hugepd stuff:
* gup_huge_pd() -> gup_fast_hugepd()
* gup_hugepte() -> gup_fast_hugepte()

The weird devmap stuff:
* __gup_device_huge_pud() -> gup_fast_devmap_pud_leaf()
* __gup_device_huge_pmd() -> gup_fast_devmap_pmd_leaf()
* __gup_device_huge() -> gup_fast_devmap_leaf()
* undo_dev_pagemap()  -> gup_fast_undo_dev_pagemap()

Helper functions:
* unpin_user_pages_lockless() -> gup_fast_unpin_user_pages()
* gup_fast_folio_allowed() is already properly named
* gup_fast_permitted() is already properly named

With "gup_fast()", we now even have a function that is referred to in
comment in mm/mmu_gather.c.

Reviewed-by: Jason Gunthorpe 
Reviewed-by: Mike Rapoport (IBM) 
Signed-off-by: David Hildenbrand 
---
  mm/gup.c | 205 ---
  1 file changed, 103 insertions(+), 102 deletions(-)



Reviewed-by: John Hubbard 


thanks,
--
John Hubbard
NVIDIA
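For context, the unchanged external entry points are used like this -- a
minimal in-kernel sketch, not part of the patch, with error handling trimmed:

    #include <linux/mm.h>

    /* Pin one writable user page via the GUP-fast entry point, then release it. */
    static int demo_pin_one_page(unsigned long uaddr)
    {
            struct page *page;
            long nr = pin_user_pages_fast(uaddr, 1, FOLL_WRITE, &page);

            if (nr != 1)
                    return nr < 0 ? nr : -EFAULT;

            /* ... use the page; it cannot be migrated away while pinned ... */

            unpin_user_page(page);
            return 0;
    }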



Re: [PATCH 1/4] KVM: delete .change_pte MMU notifier callback

2024-04-13 Thread Marc Zyngier
On Fri, 12 Apr 2024 15:54:22 +0100,
Sean Christopherson  wrote:
> 
> On Fri, Apr 12, 2024, Marc Zyngier wrote:
> > On Fri, 12 Apr 2024 11:44:09 +0100, Will Deacon  wrote:
> > > On Fri, Apr 05, 2024 at 07:58:12AM -0400, Paolo Bonzini wrote:
> > > Also, if you're in the business of hacking the MMU notifier code, it
> > > would be really great to change the .clear_flush_young() callback so
> > > that the architecture could handle the TLB invalidation. At the moment,
> > > the core KVM code invalidates the whole VMID courtesy of 'flush_on_ret'
> > > being set by kvm_handle_hva_range(), whereas we could do a much
> > > lighter-weight and targeted TLBI in the architecture page-table code
> > > when we actually update the ptes for small ranges.
> > 
> > Indeed, and I was looking at this earlier this week as it has a pretty
> > devastating effect with NV (it blows the shadow S2 for that VMID, with
> > costly consequences).
> > 
> > In general, it feels like the TLB invalidation should stay with the
> > code that deals with the page tables, as it has a pretty good idea of
> > what needs to be invalidated and how -- especially on architectures
> > that have a HW-broadcast facility like arm64.
> 
> Would this be roughly on par with an in-line flush on arm64?  The simpler,
> more straightforward solution would be to let architectures override
> flush_on_ret, but I would prefer something like the below as x86 can also
> utilize a range-based flush when running as a nested hypervisor.
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index ff0a20565f90..b65116294efe 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -601,6 +601,7 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
>  	struct kvm_gfn_range gfn_range;
>  	struct kvm_memory_slot *slot;
>  	struct kvm_memslots *slots;
> +	bool need_flush = false;
>  	int i, idx;
>  
>  	if (WARN_ON_ONCE(range->end <= range->start))
> @@ -653,10 +654,22 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
>  				break;
>  			}
>  			r.ret |= range->handler(kvm, &gfn_range);
> +
> +			/*
> +			 * Use a precise gfn-based TLB flush when possible, as
> +			 * most mmu_notifier events affect a small-ish range.
> +			 * Fall back to a full TLB flush if the gfn-based flush
> +			 * fails, and don't bother trying the gfn-based flush
> +			 * if a full flush is already pending.
> +			 */
> +			if (range->flush_on_ret && !need_flush && r.ret &&
> +			    kvm_arch_flush_remote_tlbs_range(kvm, gfn_range.start,
> +							     gfn_range.end - gfn_range.start + 1))
> +				need_flush = true;
>  		}
>  	}
>  
> -	if (range->flush_on_ret && r.ret)
> +	if (need_flush)
>  		kvm_flush_remote_tlbs(kvm);
>  
>  	if (r.found_memslot)

I think this works for us on HW that has range invalidation, which
would already be a positive move.

For the lesser HW that isn't range capable, it also gives the
opportunity to perform the iteration ourselves or go for the nuclear
option if the range is larger than some arbitrary constant (though
this is additional work).

But this still considers the whole range as being affected by
range->handler(). It'd be interesting to try and see whether more
precise tracking is (or isn't) generally beneficial.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.
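A rough sketch of the arch-side decision being discussed -- precise
invalidation for small ranges, falling back to the full VMID flush for large
ones -- might look like the following on arm64. The threshold and the
fallback policy are hypothetical; kvm_tlb_flush_vmid_range() is the existing
range helper, but its use here is an assumption, not Marc's or Sean's code:

    /* Return 0 if the range was invalidated precisely, nonzero to make
     * the caller fall back to a full flush. Threshold is arbitrary. */
    #define RANGE_TLBI_MAX_PAGES	512

    int kvm_arch_flush_remote_tlbs_range(struct kvm *kvm, gfn_t start, u64 nr_pages)
    {
    	/* Past some size, nuking the whole VMID is cheaper than iterating. */
    	if (!system_supports_tlb_range() && nr_pages > RANGE_TLBI_MAX_PAGES)
    		return -EOPNOTSUPP;

    	/* One broadcast TLBI range op when supported, else a bounded loop. */
    	kvm_tlb_flush_vmid_range(&kvm->arch.mmu, start << PAGE_SHIFT,
    				 nr_pages << PAGE_SHIFT);
    	return 0;
    }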


Re: [RFC PATCH] powerpc: Optimise barriers for fully ordered atomics

2024-04-13 Thread Michael Ellerman
Nicholas Piggin  writes:
> "Fully ordered" atomics (RMW that return a value) are said to have a
> full barrier before and after the atomic operation. This is implemented
> as:
>
>   hwsync
>   larx
>   ...
>   stcx.
>   bne-
>   hwsync
>
> This is slow on POWER processors because hwsync and stcx. require a
> round-trip to the nest (~= L2 cache). The hwsyncs can be avoided with
> the sequence:
>
>   lwsync
>   larx
>   ...
>   stcx.
>   bne-
>   isync
>
> lwsync prevents all reorderings except store/load reordering, so the
> larx could be executed ahead of a prior store becoming visible. However
> the stcx. is a store, so it is ordered by the lwsync against all prior
> access and if the value in memory had been modified since the larx, it
> will fail. So the point at which the larx executes is not a concern
> because the stcx. always verifies the memory was unchanged.
>
> The isync prevents subsequent instructions being executed before the
> stcx. executes, and stcx. is necessarily visible to the system after
> it executes, so there is no opportunity for it (or prior stores, thanks
> to lwsync) to become visible after a subsequent load or store.

AFAICS commit b97021f85517 ("powerpc: Fix atomic_xxx_return barrier
semantics") disagrees.

That was 2011, so maybe it's wrong or outdated?

Either way it would be good to have some litmus tests to back this up.

cheers

ps. isn't there a rule against sending barrier patches after midnight? ;)
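For the record, a sketch of where Nick's sequence would land, assuming the
PPC_ATOMIC_{ENTRY,EXIT}_BARRIER macros in arch/powerpc/include/asm/synch.h
remain the hook (current definitions paraphrased from memory, so treat this
as illustrative rather than an exact diff):

    /* Swap the full barriers around value-returning atomics for the
     * cheaper lwsync/isync pair described above. */
    #define PPC_ATOMIC_ENTRY_BARRIER "\n" stringify_in_c(LWSYNC) "\n"	/* was: sync */
    #define PPC_ATOMIC_EXIT_BARRIER  "\n" stringify_in_c(isync) "\n"	/* was: sync */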


Re: [PATCH 1/3] x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n

2024-04-13 Thread Michael Ellerman
Michael Ellerman  writes:
> Stephen Rothwell  writes:
...
>> On Tue,  9 Apr 2024 10:51:05 -0700 Sean Christopherson wrote:
...
>>> diff --git a/kernel/cpu.c b/kernel/cpu.c
>>> index 8f6affd051f7..07ad53b7f119 100644
>>> --- a/kernel/cpu.c
>>> +++ b/kernel/cpu.c
>>> @@ -3207,7 +3207,8 @@ enum cpu_mitigations {
>>>  };
>>>  
>>>  static enum cpu_mitigations cpu_mitigations __ro_after_init =
>>> -   CPU_MITIGATIONS_AUTO;
>>> +   IS_ENABLED(CONFIG_SPECULATION_MITIGATIONS) ? CPU_MITIGATIONS_AUTO :
>>> +                                        CPU_MITIGATIONS_OFF;
>>>  
>>>  static int __init mitigations_parse_cmdline(char *arg)
>>>  {

I think a minimal workaround/fix would be:

diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 2b8fd6bb7da0..290be2f9e909 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -191,6 +191,10 @@ config GENERIC_CPU_AUTOPROBE
 config GENERIC_CPU_VULNERABILITIES
bool

+config SPECULATION_MITIGATIONS
+   def_bool y
+   depends on !X86
+
 config SOC_BUS
bool
select GLOB

cheers


Re: [PATCH 1/3] x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n

2024-04-13 Thread Michael Ellerman
Stephen Rothwell  writes:
> Hi Sean,
>
> I noticed this commit in linux-next.
>
> On Tue,  9 Apr 2024 10:51:05 -0700 Sean Christopherson wrote:
>>
>> Initialize cpu_mitigations to CPU_MITIGATIONS_OFF if the kernel is built
>> with CONFIG_SPECULATION_MITIGATIONS=n, as the help text quite clearly
>> states that disabling SPECULATION_MITIGATIONS is supposed to turn off all
>> mitigations by default.
>> 
>>   │ If you say N, all mitigations will be disabled. You really
>>   │ should know what you are doing to say so.
>> 
>> As is, the kernel still defaults to CPU_MITIGATIONS_AUTO, which results in
>> some mitigations being enabled in spite of SPECULATION_MITIGATIONS=n.
>> 
>> Fixes: f43b9876e857 ("x86/retbleed: Add fine grained Kconfig knobs")
>> Cc: sta...@vger.kernel.org
>> Signed-off-by: Sean Christopherson 
>> ---
>>  kernel/cpu.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/kernel/cpu.c b/kernel/cpu.c
>> index 8f6affd051f7..07ad53b7f119 100644
>> --- a/kernel/cpu.c
>> +++ b/kernel/cpu.c
>> @@ -3207,7 +3207,8 @@ enum cpu_mitigations {
>>  };
>>  
>>  static enum cpu_mitigations cpu_mitigations __ro_after_init =
>> -CPU_MITIGATIONS_AUTO;
>> +IS_ENABLED(CONFIG_SPECULATION_MITIGATIONS) ? CPU_MITIGATIONS_AUTO :
>> + CPU_MITIGATIONS_OFF;
>>  
>>  static int __init mitigations_parse_cmdline(char *arg)
>>  {
>> -- 
>> 2.44.0.478.gd926399ef9-goog
>> 
>
> I noticed because it turned off all mitigations for my PowerPC qemu
> boot tests - probably because CONFIG_SPECULATION_MITIGATIONS only
> exists in arch/x86/Kconfig ... thus for other architectures that have
> cpu mitigations, this will always default them to off, right?

Yep.

The patch has the effect of changing the default for non-x86 arches from
auto to off.

I see at least powerpc, arm64 and s390 use cpu_mitigations_off() and
will be affected.

cheers
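To make the blast radius concrete, affected architectures consult the flag
roughly like this (an illustrative sketch modelled on the powerpc
security-feature setup, not copied from it):

    #include <linux/cpu.h>

    /* If the global switch now defaults to OFF on !x86 builds (because
     * SPECULATION_MITIGATIONS is x86-only), none of the body runs. */
    static void __init setup_arch_mitigations(void)
    {
    	if (cpu_mitigations_off())
    		return;

    	/* enable barrier_nospec, count-cache flush, etc. */
    }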