Re: [PATCH 2/4] arm64: alternatives: apply boot time fixups via the linear mapping

2017-02-11 Thread Kees Cook
   PAGE_KERNEL_RO);
> +
> +   /* flush the TLBs after updating live kernel mappings */
> +   flush_tlb_all();
> +}

Oh, dur, sorry, I missed this the first time through. Nevermind! Looks
like linear will be RO, kernel will be ROX at runtime, and during
boot, kernel will be ROX, and linear will be RW. Nice!

-Kees

> +
>  static void __init map_mem(pgd_t *pgd)
>  {
> struct memblock_region *reg;
> --
> 2.7.4
>



-- 
Kees Cook
Pixel Security
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 3/4] arm64: mmu: map .text as read-only from the outset

2017-02-11 Thread Kees Cook
On Fri, Feb 10, 2017 at 9:16 AM, Ard Biesheuvel wrote:
> Now that alternatives patching code no longer relies on the primary
> mapping of .text being writable, we can remove the code that removes
> the writable permissions post-init time, and map it read-only from
> the outset.
>
> Signed-off-by: Ard Biesheuvel 

Awesome!

Reviewed-by: Kees Cook 

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH 4/4] arm64: mmu: apply strict permissions to .init.text and .init.data

2017-02-11 Thread Kees Cook
On Fri, Feb 10, 2017 at 9:16 AM, Ard Biesheuvel wrote:
> To avoid having mappings that are writable and executable at the same
> time, split the init region into a .init.text region that is mapped
> read-only, and a .init.data region that is mapped non-executable.
>
> This is possible now that the alternative patching occurs via the linear
> mapping, and the linear alias of the init region is always mapped writable
> (but never executable).

Er, so, that means kernel text is still basically RWX... you just
write to the linear mapping and execute the kernel mapping. Can't we
make the linear mapping match the kernel mapping permissions?

-Kees

-- 
Kees Cook
Pixel Security


Re: [RFC] minimum gcc version for kernel: raise to gcc-4.3 or 4.6?

2017-04-16 Thread Kees Cook
Was there a conclusion to this discussion? I didn't see anything
definitive in the thread...

Notes below...

On Fri, Dec 16, 2016 at 3:14 AM, Arnd Bergmann  wrote:
> [Fixed linux-arm-kernel mailing list address, sorry for the duplicate,
>  I'm not reposting all the ugly patches though, unless someone really
>  wants them, https://lkml.org/lkml/2016/12/16/174 has a copy]
>
> On Friday, December 16, 2016 11:56:21 AM CET Arnd Bergmann wrote:
>> I had some fun doing build testing with older gcc versions, building
>> every release from 4.0 through 7.0 and running that on my randconfig
>> setup to see what comes out.
>>
>> First of all, gcc-4.9 and higher is basically warning-free everywhere,
>> although gcc-7 introduces some interesting new warnings (I have started
>> doing patches for those as well). gcc-4.8 is probably good, too, and
>> gcc-4.6 and 4.7 at least don't produce build failures in general, though
>> the level of false-positive warnings increases (we could decide to turn
>> those off for older compilers for build test purposes).
>>
>> In gcc-4.5 and below, dead code elimination is not as good as later,
>> causing a couple of link errors, and some of them have no good workaround
>> (see patch 1). It would be nice to declare that version too old, but
>> several older distros that are still in wide use ship with compilers
>> earlier than 4.6:
>>
>>  RHEL6:   gcc-4.4

This appears to have support until July 31, 2018. (Though it's using a
2.6 kernel.)

>>  Debian 6:gcc-4.4

This went fully unsupported on Feb 29, 2016.

>>  Ubuntu 10.04:gcc-4.4

This went fully unsupported on Apr 30, 2015.

>>  SLES11:  gcc-4.3

General support ends Mar 31, 2019; fully unsupported Mar 31, 2022. (And,
like RHEL6, it's using a 2.6 kernel.)

>> With gcc-4.3, we need a couple of workaround patches beyond the problem
>> mentioned above, more configuration options are unavailable and we get
>> a significant number of false-positive warnings, but it's not much worse
>> than gcc-4.5 otherwise.
>>
>> These are the options I had to disable to get gcc-4.3 randconfig builds
>> working:
>>
>>  CONFIG_HAVE_GCC_PLUGINS

This needs gcc 4.5, with 4.7 preferred.

>>  CONFIG_CC_STACKPROTECTOR_STRONG

Introduced in gcc 4.9.

>>  CONFIG_ARM_SINGLE_ARMV7M
>>  CONFIG_THUMB2_KERNEL
>>  CONFIG_KERNEL_MODE_NEON
>>  CONFIG_VDSO
>>  CONFIG_FUNCTION_TRACER (with CONFIG_FRAME_POINTER=n)
>>
>> I have not checked in detail which version is required for
>> each of the above.
>>
>> Specifically on ARM, going further makes things rather useless especially
>> for build testing: with gcc-4.2, we lose support for ARMv7, EABI, and
>> effectively ARMv6 (as it relies on EABI for building reliably). Also,
>> the number of false-positive build warnings is so high that it is useless
>> for finding actual bugs from the warnings.
>>
>> See the replies to this mail for 13 patches I needed to work around
>> issues for each of the releases before 4.6. I have also submitted
>> some separate patches for issues that I considered actual bugs
>> uncovered by the older compilers and that should be applied regardless.
>>
>> The original gcc-4.3 release was in early 2008. If we decide to still
>> support that, we probably want the first 10 quirks in this series,
>> while gcc-4.6 (released in 2011) requires none of them.

I'd be in support of raising the minimum to gcc 4.6. (I'd actually
prefer 4.7, just to avoid some 4.6 packaging issues, and for better
gcc plugin support.)

I'm curious what gcc 4.6 binaries are common in the wild besides
old-stable Debian (unsupported in maybe a year from now?) and 12.04
Ubuntu (going fully unsupported in 2 weeks). It looks like 4.6 was
used only in Fedora 15 and 16 (both EOL).

-Kees

-- 
Kees Cook
Pixel Security


Re: [RFC] minimum gcc version for kernel: raise to gcc-4.3 or 4.6?

2017-04-20 Thread Kees Cook
On Thu, Apr 20, 2017 at 3:15 AM, Arnd Bergmann  wrote:
> On Sun, Apr 16, 2017 at 9:52 PM, Kees Cook  wrote:
>>>> The original gcc-4.3 release was in early 2008. If we decide to still
>>>> support that, we probably want the first 10 quirks in this series,
>>>> while gcc-4.6 (released in 2011) requires none of them.
>>
>> I'd be in support of raising the minimum to gcc 4.6. (I'd actually
>> prefer 4.7, just to avoid some 4.6 packaging issues, and for better
>> gcc plugin support.)
>>
>> I'm curious what gcc 4.6 binaries are common in the wild besides
>> old-stable Debian (unsupported in maybe a year from now?) and 12.04
>> Ubuntu (going fully unsupported in 2 weeks). It looks like 4.6 was
>> used only in Fedora 15 and 16 (both EOL).
>
> I think we are better off defining two versions: One that we know
> a lot of people care about, and we actively try to make that work
> well in all configurations (e.g. 4.6, 4.7 or 4.8), fixing all warnings
> we run into, and an older version that we try not to break
> intentionally (e.g. 3.4, 4.1 or 4.3) but that we only fix when
> someone actually runs into a problem they can't work around
> by upgrading to a more modern compiler.

For "working well everywhere" I feel like 4.8 is the better of those
three (I'd prefer 4.9). I think we should avoid 4.6 -- it seems not
widely used.

For an old compiler... yikes. 3.4 sounds insane to me. :)

-Kees

-- 
Kees Cook
Pixel Security


Re: [RFC] minimum gcc version for kernel: raise to gcc-4.3 or 4.6?

2017-04-21 Thread Kees Cook
On Fri, Apr 21, 2017 at 1:55 PM, Arnd Bergmann  wrote:
> On Thu, Apr 20, 2017 at 9:52 PM, Kees Cook  wrote:
>> On Thu, Apr 20, 2017 at 3:15 AM, Arnd Bergmann  wrote:
>>> On Sun, Apr 16, 2017 at 9:52 PM, Kees Cook  wrote:
>>>>>> The original gcc-4.3 release was in early 2008. If we decide to still
>>>>>> support that, we probably want the first 10 quirks in this series,
>>>>>> while gcc-4.6 (released in 2011) requires none of them.
>>>>
>>>> I'd be in support of raising the minimum to gcc 4.6. (I'd actually
>>>> prefer 4.7, just to avoid some 4.6 packaging issues, and for better
>>>> gcc plugin support.)
>>>>
>>>> I'm curious what gcc 4.6 binaries are common in the wild besides
>>>> old-stable Debian (unsupported in maybe a year from now?) and 12.04
>>>> Ubuntu (going fully unsupported in 2 weeks). It looks like 4.6 was
>>>> used only in Fedora 15 and 16 (both EOL).
>>>
>>> I think we are better off defining two versions: One that we know
>>> a lot of people care about, and we actively try to make that work
>>> well in all configurations (e.g. 4.6, 4.7 or 4.8), fixing all warnings
>>> we run into, and an older version that we try not to break
>>> intentionally (e.g. 3.4, 4.1 or 4.3) but that we only fix when
>>> someone actually runs into a problem they can't work around
>>> by upgrading to a more modern compiler.
>>
>> For "working well everywhere" I feel like 4.8 is the better of those
>> three (I'd prefer 4.9). I think we should avoid 4.6 -- it seems not
>> widely used.
>
> I suspect that 4.9 might be the one that actually works best
> across architectures, and it contained some very significant
> changes. In my testing gcc-5 tends to behave very similarly
> to 4.9, and gcc-6 introduced a larger number of new warnings,
> so that would clearly be too new for a recommended version.
>
> The suggestion of 4.9 or higher is appealing as a recommendation
> because it matches what I would personally tell people:
>
> - If you have gcc-4.9 or newer and you don't rely on any newer
>   features, there is no need to upgrade
> - With gcc-4.8, the -Wmaybe-uninitialized warnings are now turned
>   off because they were too noisy, so upgrading is probably a good
>   idea even though the compiler is otherwise ok and in widespread
>   use
> - gcc-4.6 and 4.7 are basically usable for building kernels, but the
>   warning output is often counterproductive, and the generated
>   object code may be noticeably worse.
> - anything before gcc-4.6 is missing too many features to be
>   useful on ARM, but may still be fine on other architectures.
>
> On the other hand, there is a noticeable difference in compile
> speed, as a 5% slowdown compared to the previous release
> apparently is not considered a regression. These are the times
> I see for building ARM 'vexpress_defconfig':
>
> gcc-4.4: real 0m47.269s user 11m48.576s
> gcc-4.5: real 0m44.878s user 10m58.900s
> gcc-4.6: real 0m44.621s user 11m34.716s
> gcc-4.7: real 0m47.476s user 12m42.924s
> gcc-4.8: real 0m48.494s user 13m19.736s
> gcc-4.9: real 0m50.140s user 13m44.876s
> gcc-5.x: real 0m51.302s user 14m05.564s
> gcc-6.x: real 0m54.615s user 15m06.304s
> gcc-7.x: real 0m56.008s user 15m44.720s
>
> That is a factor of 1.5x in CPU cycles between slowest and
> fastest, so there is clearly a benefit to keeping the old versions
> around, but there is also no clear cut-off other than noticing
> that gcc-4.4 is slower than 4.5 in this particular
> configuration.
>
>> For an old compiler... yikes. 3.4 sounds insane to me. :)
>
> That was my initial thought as well. On ARM, it clearly is
> insane, as even gcc-4.0 is unable to build any of the modern
> defconfigs (lacking -mabi=aapcs, ICE when building vsprintf.c)
> and even the patch I did to get gcc-4.1 to build is probably
> too ugly to get merged, so to build any unpatched kernel after
> linux-3.6 you need at least gcc-4.2, or even gcc-4.4 for the
> 'defconfig' (gcc-4.3 if you disable vdso).
>
> Then again, on x86, old compilers were claimed to be much better
> supported. I just tried it out and found that no x86 defconfig kernel
> since linux-3.2 could be built with gcc-3.4, probably not on any
> other architecture either (it cannot have forward declarations
> for inline functions and we have one in kernel/sched_fair.c).
>
> I think that would be a really good argument for requiring
> something newer ;-)
>
> The linux-4.2 x86 defconfig could still be built with gcc-4.0, but
> later kernels have several 

Re: [RFC] minimum gcc version for kernel: raise to gcc-4.3 or 4.6?

2017-04-24 Thread Kees Cook
On Mon, Apr 24, 2017 at 1:30 PM, Arnd Bergmann  wrote:
> How about this approach then:
>
> - To keep it simple, we update the README.rst to say that a minimum
>   gcc-4.3 is required, while recommending gcc-4.9 for all architectures
> - Support for gcc-4.0 and earlier gets removed from linux/compiler.h,
>   and instead we add a summary of what I found, explaining that
>   gcc-4.1 has active users on a few architectures.
> - We make the Makefile show a warning once during compilation for
>   gcc earlier than 4.3.

This sounds good to me!

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH 01/11] arm64: docs: describe ELF hwcaps

2017-08-03 Thread Kees Cook
On Wed, Jul 19, 2017 at 9:01 AM, Mark Rutland  wrote:
> We don't document our ELF hwcaps, leaving developers to interpret them
> according to hearsay, guesswork, or (in exceptional cases) inspection of
> the current kernel code.
>
> This is less than optimal, and it would be far better if we had some
> definitive description of each of the ELF hwcaps that developers could
> refer to.
>
> This patch adds a document describing the (native) arm64 ELF hwcaps.
>
> Signed-off-by: Mark Rutland 
> Cc: Catalin Marinas 
> Cc: Dave Martin 
> Cc: Suzuki K Poulose 
> Cc: Will Deacon 
> ---
>  Documentation/arm64/elf_hwcaps.txt | 133 +

With the kernel docs moving to ReST markup[1], perhaps reformat this
to a .rst file and link to it from somewhere sensible in the ReST
tree, perhaps the userspace API section in
Documentation/userspace-api/index.rst?

-Kees

[1] https://www.kernel.org/doc/html/latest/doc-guide/sphinx.html

-- 
Kees Cook
Pixel Security


Re: [PATCH] KVM: arm/arm64: Allow usercopy to vcpu->arch.ctxt and arm64 debug

2017-10-21 Thread Kees Cook
On Sat, Oct 21, 2017 at 11:45 AM, Christoffer Dall wrote:
> We do direct useraccess copying to the kvm_cpu_context structure
> embedded in the kvm_vcpu_arch structure, and to the vcpu debug register
> state.  Everything else (timer, PMU, vgic) goes through a temporary
> indirection.

Are these copies done with a dynamic size? The normal way these get
whitelisted is via __builtin_constant_p() sizes on the copy. Looking at
KVM_REG_SIZE(), though, it seems that would be a dynamic calculation.

> Fixing all accesses to kvm_cpu_context is massively invasive, and we'd
> like to avoid that, so we tell kvm_init_usercopy to whitelist accesses
> to our context structure.
>
> The debug system register accesses on arm64 are modified to work through
> an indirection instead.
>
> Cc: kernel-harden...@lists.openwall.com
> Cc: Kees Cook 
> Cc: Paolo Bonzini 
> Cc: Radim Krčmář 
> Cc: Marc Zyngier 
> Signed-off-by: Christoffer Dall 
> ---
> This fixes KVM/ARM on today's linux next with CONFIG_HARDENED_USERCOPY.
>
> The patch is based on linux-next plus Paolo's x86 patch which introduces
> kvm_init_usercopy.  Not sure how this needs to get merged, but it would
> potentially make sense for Paolo to put together a set of the patches
> needed for this.

I was planning to carry Paolo's patches, and I can take this one too.
If this poses a problem, then I could just do a two-phase commit of
the whitelisting code, leaving the very last commit (which enables the
defense for anything not yet whitelisted), until the KVM trees land.

What's preferred?

Thanks for looking at this!

-Kees

>
> Thanks,
> -Christoffer
>
>  arch/arm64/kvm/sys_regs.c | 36 
>  virt/kvm/arm/arm.c |  5 -
>  2 files changed, 24 insertions(+), 17 deletions(-)
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 2e070d3baf9f..cdf47a9108fe 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -293,19 +293,20 @@ static bool trap_bvr(struct kvm_vcpu *vcpu,
>  static int set_bvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
> const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -   __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg];
> +   __u64 r;
>
> -   if (copy_from_user(r, uaddr, KVM_REG_SIZE(reg->id)) != 0)
> +   if (copy_from_user(&r, uaddr, KVM_REG_SIZE(reg->id)) != 0)
> return -EFAULT;
> +   vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg] = r;
> return 0;
>  }
>
>  static int get_bvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
> const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -   __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg];
> +   __u64 r = vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg];
>
> -   if (copy_to_user(uaddr, r, KVM_REG_SIZE(reg->id)) != 0)
> +   if (copy_to_user(uaddr, &r, KVM_REG_SIZE(reg->id)) != 0)
> return -EFAULT;
> return 0;
>  }
> @@ -335,10 +336,11 @@ static bool trap_bcr(struct kvm_vcpu *vcpu,
>  static int set_bcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
> const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -   __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg];
> +   __u64 r;
>
> -   if (copy_from_user(r, uaddr, KVM_REG_SIZE(reg->id)) != 0)
> +   if (copy_from_user(&r, uaddr, KVM_REG_SIZE(reg->id)) != 0)
> return -EFAULT;
> +   vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg] = r;
>
> return 0;
>  }
> @@ -346,9 +348,9 @@ static int set_bcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  static int get_bcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
> const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -   __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg];
> +   __u64 r = vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg];
>
> -   if (copy_to_user(uaddr, r, KVM_REG_SIZE(reg->id)) != 0)
> +   if (copy_to_user(uaddr, &r, KVM_REG_SIZE(reg->id)) != 0)
> return -EFAULT;
> return 0;
>  }
> @@ -379,19 +381,20 @@ static bool trap_wvr(struct kvm_vcpu *vcpu,
>  static int set_wvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
> const struct kvm_one_reg *reg, void __user *uaddr)
>  {
> -   __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->reg];
> +   __u64 r;
>
> -   if (copy_from_user(r, uaddr, KVM_REG_SIZE(reg->id)) != 0)
> +   if (copy_from_user(&r, uaddr, KVM_REG_SIZE(reg->id

Re: [PATCH] KVM: arm/arm64: Allow usercopy to vcpu->arch.ctxt and arm64 debug

2017-10-23 Thread Kees Cook
On Mon, Oct 23, 2017 at 7:14 AM, Paolo Bonzini  wrote:
> On 22/10/2017 09:44, Christoffer Dall wrote:
>> However, I think it's much clearer if I
>> rewrite these to use get_user() and put_user().  v2 incoming.
>
> I'd actually prefer if you all do a trivial conversion to
> kvm_init_usercopy to begin with.  In fact, we could just change the
> default from "0, 0" to "0, sizeof (kvm_arch_vcpu)" in kvm_init.  Any
> other change can be applied after the patches are merged to Linus's
> tree, especially with KVM Forum and the merge window both coming soon.
>
> I'll send a v2 myself later this week.

Okay, which patches would you like me to carry in the usercopy
whitelisting tree for the coming merge window?

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v5 15/30] arm64/sve: Signal handling support

2017-12-06 Thread Kees Cook
On Tue, Oct 31, 2017 at 8:51 AM, Dave Martin  wrote:
> This patch implements support for saving and restoring the SVE
> registers around signals.
>
> A fixed-size header struct sve_context is always included in the
> signal frame encoding the thread's vector length at the time of
> signal delivery, optionally followed by a variable-layout structure
> encoding the SVE registers.
>
> Because of the need to preserve backwards compatibility, the FPSIMD
> view of the SVE registers is always dumped as a struct
> fpsimd_context in the usual way, in addition to any sve_context.
>
> The SVE vector registers are dumped in full, including bits 127:0
> of each register which alias the corresponding FPSIMD vector
> registers in the hardware.  To avoid any ambiguity about which
> alias to restore during sigreturn, the kernel always restores bits
> 127:0 of each SVE vector register from the fpsimd_context in the
> signal frame (which must be present): userspace needs to take this
> into account if it wants to modify the SVE vector register contents
> on return from a signal.
>
> FPSR and FPCR, which are used by both FPSIMD and SVE, are not
> included in sve_context because they are always present in
> fpsimd_context anyway.
>
> For signal delivery, a new helper
> fpsimd_signal_preserve_current_state() is added to update _both_
> the FPSIMD and SVE views in the task struct, to make it easier to
> populate this information into the signal frame.  Because of the
> redundancy between the two views of the state, only one is updated
> otherwise.
>
> Signed-off-by: Dave Martin 
> Cc: Alex Bennée 
> Cc: Ard Biesheuvel 
> Cc: Will Deacon 
>
> ---
>
> **Dropped** Reviewed-by: Catalin Marinas 
> (Non-trivial changes.)
>
> Changes since v4
> 
>
> Requested by Will Deacon:
>
>  * Fix inconsistent return semantics in restore_sve_fpsimd_context().
>
>Currently a nonzero return value from __copy_from_user() is passed
>    back as-is to the caller of restore_sve_fpsimd_context(), rather than
>    translating it to an error code as is done elsewhere.
>
>Callers of restore_sve_fpsimd_context() only care whether the return
>value is 0 or not, so this isn't a big deal, but it is more
>consistent to return proper error codes on failure in case we grow a
>use for them later.
>
>This patch returns -EFAULT in the __copy_from_user() error cases
>that weren't explicitly handled.
>
> Miscellaneous:
>
>  * Change inconsistent copy_to_user() calls to __copy_to_user() in
>preserve_sve_context().
>
>There are already __put_user_error() calls here.
>
>The whole extended signal frame is already checked for
>access_ok(VERIFY_WRITE) in get_sigframe().

Verifying all these __copy_to/from_user() calls is rather non-trivial.
For example, I had to understand that the access_ok() check actually
spans memory that both user->sigframe and user->next_frame point into.
And it isn't clear to me that all users of apply_user_offset() are
within this range too, along with other manually calculated offsets in
setup_sigframe().

And it's not clear if parse_user_sigframe() is safe either. Are
user->fpsimd and user->sve checked somewhere? It seems like it's
safely contained within sf->uc.uc_mcontext.__reserved, but it's hard to
read, though I do see access_ok() checks against __reserved at the end
of the while loop.

Can we just drop all the __... calls for the fully checked
copy_to/from_user() instead?

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v5 15/30] arm64/sve: Signal handling support

2017-12-07 Thread Kees Cook
On Thu, Dec 7, 2017 at 2:49 AM, Will Deacon  wrote:
> Hi Kees,
>
> On Wed, Dec 06, 2017 at 11:56:50AM -0800, Kees Cook wrote:
>> On Tue, Oct 31, 2017 at 8:51 AM, Dave Martin  wrote:
>> > Miscellaneous:
>> >
>> >  * Change inconsistent copy_to_user() calls to __copy_to_user() in
>> >preserve_sve_context().
>> >
>> >There are already __put_user_error() calls here.
>> >
>> >The whole extended signal frame is already checked for
>> >access_ok(VERIFY_WRITE) in get_sigframe().
>>
>> Verifying all these __copy_to/from_user() calls is rather non-trivial.
>> For example, I had to understand that the access_ok() check actually
>> spans memory that both user->sigframe and user->next_frame point into.
>
> I don't think that's particularly difficult -- you just have to read the
> four lines preceding the access_ok.

Sure, but to someone working backward from finding the __copy_*, it's
less obvious. :)

>> And it isn't clear to me that all users of apply_user_offset() are
>> within this range too, along with other manually calculated offsets in
>> setup_sigframe().
>
> The offsets passed into apply_user_offset are calculated by
> setup_sigframe_layout as the stack is allocated, so they're correct by
> construction. We could add a size check in apply_user_offset if you like?
>
>> And it's not clear if parse_user_sigframe() is safe either. Are
>> user->fpsimd and user->sve checked somewhere? It seems like it's
>> safely contained by in sf->uc.uc_mcontext.__reserved, but it's hard to
>> read, though I do see access_ok() checks against __reserved at the end
>> of the while loop.
>
> This one is certainly more difficult to follow, mainly because it's spread
> about a bit and we have to check the extra context separately. However, the
> main part of the frame is checked in sys_rt_sigreturn before calling
> restore_sigframe, and the extra context is checked in parse_user_sigframe
> if we find it.
>
> Dave, any thoughts on making this easier to understand?

My question is mainly: why not just use copy_*() everywhere instead?
Having these things so spread out makes it fragile, and there's very
little performance benefit from using __copy_*() over copy_*().

As far as I can see, everything looks correctly checked here, but it
took a while to convince myself of it. Having Dave's description in
the other reply certainly helped, though, thanks for that! :)

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v5 15/30] arm64/sve: Signal handling support

2017-12-11 Thread Kees Cook
On Mon, Dec 11, 2017 at 6:07 AM, Will Deacon  wrote:
> On Thu, Dec 07, 2017 at 10:50:38AM -0800, Kees Cook wrote:
>> My question is mainly: why not just use copy_*() everywhere instead?
>> Having these things so spread out makes it fragile, and there's very
>> little performance benefit from using __copy_*() over copy_*().
>
> I think that's more of a general question. Why not just remove the __
> versions from the kernel entirely if they're not worth the perf?

That has been something Linus has strongly suggested in the past, so
I've kind of been looking for easy places to drop the __copy_*
versions. :)

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v5 15/30] arm64/sve: Signal handling support

2017-12-12 Thread Kees Cook
On Tue, Dec 12, 2017 at 3:11 AM, Dave Martin  wrote:
> On Tue, Dec 12, 2017 at 10:40:30AM +, Will Deacon wrote:
>> On Mon, Dec 11, 2017 at 11:23:09AM -0800, Kees Cook wrote:
>> > On Mon, Dec 11, 2017 at 6:07 AM, Will Deacon  wrote:
>> > > On Thu, Dec 07, 2017 at 10:50:38AM -0800, Kees Cook wrote:
>> > >> My question is mainly: why not just use copy_*() everywhere instead?
>> > >> Having these things so spread out makes it fragile, and there's very
>> > >> little performance benefit from using __copy_*() over copy_*().
>> > >
>> > > I think that's more of a general question. Why not just remove the __
>> > > versions from the kernel entirely if they're not worth the perf?
>> >
>> > That has been something Linus has strongly suggested in the past, so
>> > I've kind of been looking for easy places to drop the __copy_*
>> > versions. :)
>>
>> Tell you what then: I'll Ack the arm64 patch if it's part of a series
>> removing the thing entirely :p
>>
>> I guess we'd still want to do the validation of the whole sigframe though,
>> so we don't end up pushing half a signal stack before running into an
>> access_ok failure?
>
> That's an interesting question.  In many cases access_ok() might become
> redundant, but for syscalls that you don't want to have side-effects
> on user memory on failure it's still relevant.
>
> In the signal case we'd still an encompassing access_ok() to prevent
> stack guard overruns, because the signal frame can be large and isn't
> written or read contiguously or in a well-defined order.

Yeah, I think bailing early is fine. I think the existing access_ok()
checks are fine; I wouldn't want to drop those.

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH 00/17] ARMv8.3 pointer authentication support

2018-11-13 Thread Kees Cook
On Tue, Nov 13, 2018 at 10:17 AM, Kristina Martsenko wrote:
> When the PAC authentication fails, it doesn't actually generate an
> exception, it just flips a bit in the high-order bits of the pointer,
> making the pointer invalid. Then when the pointer is dereferenced (e.g.
> as a function return address), it generates the usual type of exception
> for an invalid address.

Ah! Okay, thanks. I missed that detail. :)

What area of memory ends up being addressable with such bit flips?
(i.e. is the kernel making sure nothing executable ends up there?)

> So when a function return fails in user mode, the exception is handled
> in __do_user_fault and a forced SIGSEGV is delivered to the task. When a
> function return fails in kernel mode, the exception is handled in
> __do_kernel_fault and the task is killed.
>
> This is different from stack protector as we don't panic the kernel, we
> just kill the task. It would be difficult to panic as we don't have a
> reliable way of knowing that the exception was caused by a PAC
> authentication failure (we just have an invalid pointer with a specific
> bit flipped). We also don't print out any PAC-related warning.

There are other "guesses" in __do_kernel_fault(), I think? Could a
"PAC mismatch?" warning be included in the Oops if execution fails in
the address range that PAC failures would resolve into?

-Kees

-- 
Kees Cook


Re: [PATCH 00/17] ARMv8.3 pointer authentication support

2018-11-14 Thread Kees Cook
On Wed, Nov 14, 2018 at 3:47 PM, Mark Rutland  wrote:
> On Tue, Nov 13, 2018 at 05:09:00PM -0600, Kees Cook wrote:
>> On Tue, Nov 13, 2018 at 10:17 AM, Kristina Martsenko wrote:
>> > When the PAC authentication fails, it doesn't actually generate an
>> > exception, it just flips a bit in the high-order bits of the pointer,
>> > making the pointer invalid. Then when the pointer is dereferenced (e.g.
>> > as a function return address), it generates the usual type of exception
>> > for an invalid address.
>>
>> Ah! Okay, thanks. I missed that detail. :)
>>
>> What area of memory ends up being addressable with such bit flips?
>> (i.e. is the kernel making sure nothing executable ends up there?)
>>
>> > So when a function return fails in user mode, the exception is handled
>> > in __do_user_fault and a forced SIGSEGV is delivered to the task. When a
>> > function return fails in kernel mode, the exception is handled in
>> > __do_kernel_fault and the task is killed.
>> >
>> > This is different from stack protector as we don't panic the kernel, we
>> > just kill the task. It would be difficult to panic as we don't have a
>> > reliable way of knowing that the exception was caused by a PAC
>> > authentication failure (we just have an invalid pointer with a specific
>> > bit flipped). We also don't print out any PAC-related warning.
>>
>> There are other "guesses" in __do_kernel_fault(), I think? Could a
>> "PAC mismatch?" warning be included in the Oops if execution fails in
>> the address range that PAC failures would resolve into?
>
> I'd personally prefer that we didn't try to guess if a fault is due to a
> failed AUT*, even for logging.
>
> Presently, it's not possible to distinguish between a fault resulting from a
> failed AUT* and a fault which happens to have the same bits set/clear, so
> there are false positives. The architecture may also change the precise
> details of the faulting address, and we'd have false negatives in that case.
>
> Given that, I think suggesting that a fault is due to a failed AUT* is liable
> to make things more confusing.

Okay, no worries. It should be pretty clear from the back trace anyway. :)

As long as there isn't any way for the pointer bit flips to result in
an addressable range, I'm happy. :)
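The failure mode discussed in this thread can be modelled in a few lines. Everything below (field position, hash, poison bit) is invented for illustration; real hardware uses the architected PAC field and a keyed cipher. The point is only the shape of the behaviour: AUT* never faults directly, it corrupts high-order bits on mismatch, and the fault happens later when the pointer is used.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model: sign inserts a keyed hash into spare high bits; auth
 * recomputes it. On mismatch, instead of faulting, a high-order bit is
 * poisoned so the resulting address is non-canonical, and only a later
 * dereference generates the usual invalid-address exception.
 */
#define PAC_MASK 0x007f000000000000ULL /* pretend bits 54:48 hold the PAC */
#define ERR_BIT  0x2000000000000000ULL /* poison: address no longer canonical */

static uint64_t toy_sign(uint64_t ptr, uint64_t key)
{
	uint64_t pac = (((ptr & ~PAC_MASK) ^ key) * 0x9e3779b97f4a7c15ULL)
			& PAC_MASK;

	return (ptr & ~PAC_MASK) | pac;
}

static uint64_t toy_auth(uint64_t signed_ptr, uint64_t key)
{
	uint64_t ptr = signed_ptr & ~PAC_MASK;

	if (toy_sign(ptr, key) == signed_ptr)
		return ptr;		/* PAC matched: usable pointer */
	return ptr | ERR_BIT;		/* mismatch: later dereference faults */
}
```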

-Kees

-- 
Kees Cook


Re: [PATCH] KVM: arm64: Don't use cbz/adr with external symbols

2021-03-05 Thread Kees Cook
On Fri, Mar 05, 2021 at 12:21:24PM -0800, Sami Tolvanen wrote:
> allmodconfig + CONFIG_LTO_CLANG_THIN=y fails to build due to following
> linker errors:
> 
>   ld.lld: error: irqbypass.c:(function __guest_enter: .text+0x21CC):
>   relocation R_AARCH64_CONDBR19 out of range: 2031220 is not in
>   [-1048576, 1048575]; references hyp_panic
>   >>> defined in vmlinux.o
> 
>   ld.lld: error: irqbypass.c:(function __guest_enter: .text+0x21E0):
>   relocation R_AARCH64_ADR_PREL_LO21 out of range: 2031200 is not in
>   [-1048576, 1048575]; references hyp_panic
>   >>> defined in vmlinux.o
> 
> This is because with LTO, the compiler ends up placing hyp_panic()
> more than 1MB away from __guest_enter(). Use an unconditional branch
> and adr_l instead to fix the issue.
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/1317
> Reported-by: Nathan Chancellor 
> Suggested-by: Marc Zyngier 
> Suggested-by: Ard Biesheuvel 
> Signed-off-by: Sami Tolvanen 

Reviewed-by: Kees Cook 

-- 
Kees Cook


[PATCH] KVM: arm64: vgic-its: Remove VLA usage

2018-06-29 Thread Kees Cook
In the quest to remove all stack VLA usage from the kernel[1], this
switches to using a maximum size and adds sanity checks. Additionally
cleans up some of the int-vs-u32 usage and adds additional bounds checking.
As it currently stands, this will always be 8 bytes until the ABI changes.

[1] 
https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com

Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Eric Auger 
Cc: Andre Przywara 
Cc: linux-arm-ker...@lists.infradead.org
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Kees Cook 
---
 virt/kvm/arm/vgic/vgic-its.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 4ed79c939fb4..3143fc047fcf 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -168,8 +168,14 @@ struct vgic_its_abi {
int (*commit)(struct vgic_its *its);
 };
 
+#define ABI_0_ESZ  8
+#define ESZ_MAX  ABI_0_ESZ
+
 static const struct vgic_its_abi its_table_abi_versions[] = {
-   [0] = {.cte_esz = 8, .dte_esz = 8, .ite_esz = 8,
+   [0] = {
+.cte_esz = ABI_0_ESZ,
+.dte_esz = ABI_0_ESZ,
+.ite_esz = ABI_0_ESZ,
 .save_tables = vgic_its_save_tables_v0,
 .restore_tables = vgic_its_restore_tables_v0,
 .commit = vgic_its_commit_v0,
@@ -180,10 +186,12 @@ static const struct vgic_its_abi its_table_abi_versions[] = {
 
 inline const struct vgic_its_abi *vgic_its_get_abi(struct vgic_its *its)
 {
+   if (WARN_ON(its->abi_rev >= NR_ITS_ABIS))
+   return NULL;
return &its_table_abi_versions[its->abi_rev];
 }
 
-int vgic_its_set_abi(struct vgic_its *its, int rev)
+static int vgic_its_set_abi(struct vgic_its *its, u32 rev)
 {
const struct vgic_its_abi *abi;
 
@@ -1881,16 +1889,19 @@ typedef int (*entry_fn_t)(struct vgic_its *its, u32 id, void *entry,
  * Return: < 0 on error, 0 if last element was identified, 1 otherwise
  * (the last element may not be found on second level tables)
  */
-static int scan_its_table(struct vgic_its *its, gpa_t base, int size, int esz,
+static int scan_its_table(struct vgic_its *its, gpa_t base, int size, u32 esz,
  int start_id, entry_fn_t fn, void *opaque)
 {
struct kvm *kvm = its->dev->kvm;
unsigned long len = size;
int id = start_id;
gpa_t gpa = base;
-   char entry[esz];
+   char entry[ESZ_MAX];
int ret;
 
+   if (WARN_ON(esz > ESZ_MAX))
+   return -EINVAL;
+
memset(entry, 0, esz);
 
while (len > 0) {
-- 
2.17.1
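The pattern the patch applies, a fixed-size buffer plus an up-front bound check in place of a VLA, can be sketched standalone. Names and the helper below are made up for illustration, not the kernel's:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ESZ_MAX 8	/* largest entry size any ABI revision uses */

/*
 * Illustrative stand-in for the scan_its_table() change: instead of the
 * VLA "char entry[esz]", size the buffer for the largest supported ABI
 * and reject anything bigger up front (the WARN_ON() in the patch).
 */
static int read_entry(const char *src, size_t esz, char *out)
{
	char entry[ESZ_MAX];

	if (esz > ESZ_MAX)
		return -1;

	memset(entry, 0, sizeof(entry));
	memcpy(entry, src, esz);
	memcpy(out, entry, esz);
	return 0;
}
```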


-- 
Kees Cook
Pixel Security


Re: [PATCH] KVM: arm64: vgic-its: Remove VLA usage

2018-07-02 Thread Kees Cook
On Mon, Jul 2, 2018 at 12:36 AM, Auger Eric  wrote:
> Hi Kees,
>
> On 06/29/2018 08:46 PM, Kees Cook wrote:
>> In the quest to remove all stack VLA usage from the kernel[1], this
>> switches to using a maximum size and adds sanity checks. Additionally
>> cleans up some of the int-vs-u32 usage and adds additional bounds checking.
>> As it currently stands, this will always be 8 bytes until the ABI changes.
>>
>> [1] 
>> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
>>
>> Cc: Christoffer Dall 
>> Cc: Marc Zyngier 
>> Cc: Eric Auger 
>> Cc: Andre Przywara 
>> Cc: linux-arm-ker...@lists.infradead.org
>> Cc: kvmarm@lists.cs.columbia.edu
>> Signed-off-by: Kees Cook 
>> ---
>>  virt/kvm/arm/vgic/vgic-its.c | 19 +++++++++++++++----
>>  1 file changed, 15 insertions(+), 4 deletions(-)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>> index 4ed79c939fb4..3143fc047fcf 100644
>> --- a/virt/kvm/arm/vgic/vgic-its.c
>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>> @@ -168,8 +168,14 @@ struct vgic_its_abi {
>>   int (*commit)(struct vgic_its *its);
>>  };
>>
>> +#define ABI_0_ESZ  8
>> +#define ESZ_MAX  ABI_0_ESZ
>> +
>>  static const struct vgic_its_abi its_table_abi_versions[] = {
>> - [0] = {.cte_esz = 8, .dte_esz = 8, .ite_esz = 8,
>> + [0] = {
>> +  .cte_esz = ABI_0_ESZ,
>> +  .dte_esz = ABI_0_ESZ,
>> +  .ite_esz = ABI_0_ESZ,
>>.save_tables = vgic_its_save_tables_v0,
>>.restore_tables = vgic_its_restore_tables_v0,
>>.commit = vgic_its_commit_v0,
>> @@ -180,10 +186,12 @@ static const struct vgic_its_abi its_table_abi_versions[] = {
>>
>>  inline const struct vgic_its_abi *vgic_its_get_abi(struct vgic_its *its)
>>  {
>> + if (WARN_ON(its->abi_rev >= NR_ITS_ABIS))
>> + return NULL;
>>   return &its_table_abi_versions[its->abi_rev];
>>  }
>>
>> -int vgic_its_set_abi(struct vgic_its *its, int rev)
>> +static int vgic_its_set_abi(struct vgic_its *its, u32 rev)
>>  {
> if vgic_its_get_abi is likely to return NULL, don't we need to check abi
> != NULL in all call sites.

My thinking was that since it should never happen, a WARN_ON would be
sufficient. But I can drop all these changes if you want. I just
wanted to see the VLA removed. :)

-Kees

>
> abi_rev is actually set by vgic_its_set_abi() which is actually called
> by vgic_mmio_uaccess_write_its_iidr() and vgic_its_create().
>
> Only vgic_mmio_uaccess_write_its_iidr allows the userspace to overwrite
> the default abi_rev. At this point a check against NR_ITS_ABIS is
> already done. So to me the check is done at the source?
>
> Thanks
>
> Eric
>>   const struct vgic_its_abi *abi;
>>
>> @@ -1881,16 +1889,19 @@ typedef int (*entry_fn_t)(struct vgic_its *its, u32 id, void *entry,
>>   * Return: < 0 on error, 0 if last element was identified, 1 otherwise
>>   * (the last element may not be found on second level tables)
>>   */
>> -static int scan_its_table(struct vgic_its *its, gpa_t base, int size, int esz,
>> +static int scan_its_table(struct vgic_its *its, gpa_t base, int size, u32 esz,
>> int start_id, entry_fn_t fn, void *opaque)
>>  {
>>   struct kvm *kvm = its->dev->kvm;
>>   unsigned long len = size;
>>   int id = start_id;
>>   gpa_t gpa = base;
>> - char entry[esz];
>> + char entry[ESZ_MAX];
>>   int ret;
>>
>> + if (WARN_ON(esz > ESZ_MAX))
>> + return -EINVAL;
>> +
>>   memset(entry, 0, esz);
>>
>>   while (len > 0) {
>>



-- 
Kees Cook
Pixel Security


Re: [PATCH] KVM: arm64: vgic-its: Remove VLA usage

2018-07-09 Thread Kees Cook
On Mon, Jul 9, 2018 at 3:47 AM, Marc Zyngier  wrote:
> Hi kees,
>
> On 02/07/18 18:15, Kees Cook wrote:
>> On Mon, Jul 2, 2018 at 12:36 AM, Auger Eric  wrote:
>>> Hi Kees,
>>>
>>> On 06/29/2018 08:46 PM, Kees Cook wrote:
>>>> In the quest to remove all stack VLA usage from the kernel[1], this
>>>> switches to using a maximum size and adds sanity checks. Additionally
>>>> cleans up some of the int-vs-u32 usage and adds additional bounds checking.
>>>> As it currently stands, this will always be 8 bytes until the ABI changes.
>>>>
>>>> [1] 
>>>> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
>>>>
>>>> Cc: Christoffer Dall 
>>>> Cc: Marc Zyngier 
>>>> Cc: Eric Auger 
>>>> Cc: Andre Przywara 
>>>> Cc: linux-arm-ker...@lists.infradead.org
>>>> Cc: kvmarm@lists.cs.columbia.edu
>>>> Signed-off-by: Kees Cook 
>>>> ---
>>>>  virt/kvm/arm/vgic/vgic-its.c | 19 +++++++++++++++----
>>>>  1 file changed, 15 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>>>> index 4ed79c939fb4..3143fc047fcf 100644
>>>> --- a/virt/kvm/arm/vgic/vgic-its.c
>>>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>>>> @@ -168,8 +168,14 @@ struct vgic_its_abi {
>>>>   int (*commit)(struct vgic_its *its);
>>>>  };
>>>>
>>>> +#define ABI_0_ESZ  8
>>>> +#define ESZ_MAX  ABI_0_ESZ
>>>> +
>>>>  static const struct vgic_its_abi its_table_abi_versions[] = {
>>>> - [0] = {.cte_esz = 8, .dte_esz = 8, .ite_esz = 8,
>>>> + [0] = {
>>>> +  .cte_esz = ABI_0_ESZ,
>>>> +  .dte_esz = ABI_0_ESZ,
>>>> +  .ite_esz = ABI_0_ESZ,
>>>>.save_tables = vgic_its_save_tables_v0,
>>>>.restore_tables = vgic_its_restore_tables_v0,
>>>>.commit = vgic_its_commit_v0,
>>>> @@ -180,10 +186,12 @@ static const struct vgic_its_abi its_table_abi_versions[] = {
>>>>
>>>>  inline const struct vgic_its_abi *vgic_its_get_abi(struct vgic_its *its)
>>>>  {
>>>> + if (WARN_ON(its->abi_rev >= NR_ITS_ABIS))
>>>> + return NULL;
>>>>   return &its_table_abi_versions[its->abi_rev];
>>>>  }
>>>>
>>>> -int vgic_its_set_abi(struct vgic_its *its, int rev)
>>>> +static int vgic_its_set_abi(struct vgic_its *its, u32 rev)
>>>>  {
>>> if vgic_its_get_abi is likely to return NULL, don't we need to check abi
>>> != NULL in all call sites.
>>
>> My thinking was that since it should never happen, a WARN_ON would be
>> sufficient. But I can drop all these changes if you want. I just
>> wanted to see the VLA removed. :)
>
> Are you going to respin this patch limiting it to just the VLA changes?
> I'm actively queuing stuff for the next merge window, and it'd be good
> to get that one in.
>
> Alternatively, I can drop the WARN_ONs myself. Just let me know.

If you can just drop them that would be best, thanks!

-Kees

-- 
Kees Cook
Pixel Security


Re: [RFC 17/17] arm64: compile the kernel with ptrauth -msign-return-address

2018-10-15 Thread Kees Cook
On Thu, Oct 11, 2018 at 7:23 AM, Vladimir Murzin
 wrote:
> Hi Kristina,
>
> On 05/10/18 09:47, Kristina Martsenko wrote:
>> diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
>> index 106039d25e2f..dbcd43ea99d8 100644
>> --- a/arch/arm64/Makefile
>> +++ b/arch/arm64/Makefile
>> @@ -56,6 +56,10 @@ KBUILD_AFLAGS  += $(lseinstr) $(brokengasinst)
>>  KBUILD_CFLAGS  += $(call cc-option,-mabi=lp64)
>>  KBUILD_AFLAGS  += $(call cc-option,-mabi=lp64)
>>
>> +ifeq ($(CONFIG_ARM64_PTR_AUTH),y)
>> +KBUILD_CFLAGS  += -msign-return-address=all
>> +endif
>
> Should not it be done via cc-option so old toolchains keep working [1]?
>
> [1]
> $ aarch64-linux-gnu-gcc --version
> aarch64-linux-gnu-gcc (Linaro GCC 2014.11) 4.9.3 20141031 (prerelease)
> Copyright (C) 2014 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> $ aarch64-linux-gnu-gcc -msign-return-address=all
> aarch64-linux-gnu-gcc: error: unrecognized command line option 
> '-msign-return-address=all'
> ...

I would like to see CONFIG_ARM64_PTR_AUTH testing for compiler support
via Kconfig (as stack-protector does). This would allow developers to
only see the option if it was available (i.e. no "downgrade" happens
if the compiler is missing support). Using cc-option runs the risk of
building a kernel with CONFIG_ARM64_PTR_AUTH set, but _not_ actually
using ptr auth.
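A sketch of what that Kconfig-side gate could look like, modelled on how stack-protector support is detected; the symbol name below is hypothetical, and the `$(cc-option,...)` Kconfig macro syntax is assumed to be available:

```kconfig
# Hypothetical sketch: detect compiler support at configuration time so
# the option is only offered when the flag actually works, instead of
# silently downgrading via a build-time cc-option check.
config CC_HAS_SIGN_RETURN_ADDRESS
	def_bool $(cc-option,-msign-return-address=all)

config ARM64_PTR_AUTH
	bool "Enable support for pointer authentication"
	depends on CC_HAS_SIGN_RETURN_ADDRESS
```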

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v5 11/17] arm64: docs: document pointer authentication

2018-10-15 Thread Kees Cook
> +these instructions operate correctly regardless of the presence of the
> +extension.
> +
> +
> +Basic support
> +-------------
> +
> +When CONFIG_ARM64_PTR_AUTH is selected, and relevant HW support is
> +present, the kernel will assign a random APIAKey value to each process
> +at exec*() time. This key is shared by all threads within the process,
> +and the key is preserved across fork(). Presence of functionality using
> +APIAKey is advertised via HWCAP_APIA.

It might be useful to include documentation here on how many bits of
the address are being used for the PAC bits (I'm assuming it's 7?)
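For reference, the PAC width falls out of the VA configuration rather than being fixed. A sketch of the arithmetic, assuming top-byte-ignore (bits 63:56) is enabled and bit 55 is reserved to select the translation regime:

```c
#include <assert.h>

/* Whatever is left between bit 54 and the top VA bit holds the PAC. */
static int pac_bits(int va_bits)
{
	return 64 - 8 /* TBI byte */ - 1 /* bit 55 */ - va_bits;
}
```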

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH 00/17] ARMv8.3 pointer authentication support

2018-10-15 Thread Kees Cook
On Fri, Oct 5, 2018 at 1:47 AM, Kristina Martsenko
 wrote:
> This series adds support for the ARMv8.3 pointer authentication
> extension. The series contains Mark's original patches to enable pointer
> authentication for userspace [1], followed by early RFC patches using
> pointer authentication in the kernel.

It wasn't obvious to me where the PAC mismatch exceptions will be
caught. I'm mainly curious to compare the PAC exception handling to
the existing stack-protector panic(). Can you point me to which
routines manage that? (Perhaps I just missed it in the series...)

Thanks for the series! I'm quite excited for ARMv8.3 hardware. :)

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v5 07/17] arm64: add basic pointer authentication support

2018-10-19 Thread Kees Cook
On Fri, Oct 19, 2018 at 4:24 AM, Will Deacon  wrote:
> [+Cyrill Gorcunov for CRIU stuff]
>
> On Fri, Oct 19, 2018 at 12:15:43PM +0100, Catalin Marinas wrote:
>> On Fri, Oct 05, 2018 at 09:47:44AM +0100, Kristina Martsenko wrote:
>> > diff --git a/arch/arm64/include/asm/pointer_auth.h 
>> > b/arch/arm64/include/asm/pointer_auth.h
>> > new file mode 100644
>> > index ..2aefedc31d9e
>> > --- /dev/null
>> > +++ b/arch/arm64/include/asm/pointer_auth.h
>> > @@ -0,0 +1,63 @@
>> > +// SPDX-License-Identifier: GPL-2.0
>> > +#ifndef __ASM_POINTER_AUTH_H
>> > +#define __ASM_POINTER_AUTH_H
>> > +
>> > +#include 
>> > +
>> > +#include 
>> > +#include 
>> > +
>> > +#ifdef CONFIG_ARM64_PTR_AUTH
>> > +/*
>> > + * Each key is a 128-bit quantity which is split across a pair of 64-bit
>> > + * registers (Lo and Hi).
>> > + */
>> > +struct ptrauth_key {
>> > +   unsigned long lo, hi;
>> > +};
>> > +
>> > +/*
>> > + * We give each process its own instruction A key (APIAKey), which is 
>> > shared by
>> > + * all threads. This is inherited upon fork(), and reinitialised upon 
>> > exec*().
>> > + * All other keys are currently unused, with APIBKey, APDAKey, and APBAKey
>> > + * instructions behaving as NOPs.
>> > + */
>>
>> I don't remember the past discussions but I assume the tools guys are ok
>> with a single key shared by multiple threads. Ramana, could you ack this
>> part, FTR?
>>
>> (and it would help if someone from the Android and Chrome camps can
>> confirm)
>
> FWIW: I think we should be entertaining a prctl() interface to use a new
> key on a per-thread basis. Obviously, this would need to be used with care
> (e.g. you'd fork(); use the prctl() and then you'd better not return from
> the calling function!).
>
> Assuming we want this (Kees -- I was under the impression that everything in
> Android would end up with the same key otherwise?), then the question is
> do we want:
>
>   - prctl() get/set operations for the key, or
>   - prctl() set_random_key operation, or
>   - both of the above?
>
> Part of the answer to that may lie in the requirements of CRIU, where I
> strongly suspect they need explicit get/set operations, although these
> could be gated on CONFIG_CHECKPOINT_RESTORE=y.

Oh CRIU. Yikes. I'd like the get/set to be gated by the CONFIG, yes.
No reason to allow explicit access to the key (and selected algo) if
we don't have to.

As for per-thread or not, having a "pick a new key now" prctl() sounds
good, but I'd like to have an eye toward having it just be "automatic"
on clone().

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v5 07/17] arm64: add basic pointer authentication support

2018-10-19 Thread Kees Cook
On Fri, Oct 19, 2018 at 8:49 AM, Will Deacon  wrote:
> On Fri, Oct 19, 2018 at 08:36:45AM -0700, Kees Cook wrote:
>> On Fri, Oct 19, 2018 at 4:24 AM, Will Deacon  wrote:
>> > FWIW: I think we should be entertaining a prctl() interface to use a new
>> > key on a per-thread basis. Obviously, this would need to be used with care
>> > (e.g. you'd fork(); use the prctl() and then you'd better not return from
>> > the calling function!).
>> >
>> > Assuming we want this (Kees -- I was under the impression that everything 
>> > in
>> > Android would end up with the same key otherwise?), then the question is
>> > do we want:
>> >
>> >   - prctl() get/set operations for the key, or
>> >   - prctl() set_random_key operation, or
>> >   - both of the above?
>> >
>> > Part of the answer to that may lie in the requirements of CRIU, where I
>> > strongly suspect they need explicit get/set operations, although these
>> > could be gated on CONFIG_CHECKPOINT_RESTORE=y.
>>
>> Oh CRIU. Yikes. I'd like the get/set to be gated by the CONFIG, yes.
>> No reason to allow explicit access to the key (and selected algo) if
>> we don't have to.
>
> Makes sense.
>
>> As for per-thread or not, having a "pick a new key now" prctl() sounds
>> good, but I'd like to have an eye toward having it just be "automatic"
>> on clone().
>
> I thought about that too, but we're out of clone() flags afaict and there's
> no arch hook in there. We could add yet another clone syscall, but yuck (and
> I reckon viro would kill us).
>
> Or are you saying that we could infer the behaviour from the existing set
> of flags?

I mean if it's starting a new thread, it should get a new key
automatically, just like the ssp canary happens in dup_task_struct().

(Or did I miss some context for why that's not possible?)
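A toy model of that "fresh key on clone" behaviour, analogous to the canary refresh in dup_task_struct(). Names are invented, and a deterministic counter stands in for a random key source purely so the sketch is checkable:

```c
#include <assert.h>
#include <stdint.h>

struct toy_task {
	uint64_t apia_key;
};

static uint64_t keygen;	/* stand-in for a random key generator */

/*
 * Copy the parent task, but give the new thread its own ptrauth key,
 * the same way the stack-protector canary gets refreshed on dup.
 */
static struct toy_task toy_dup_task(const struct toy_task *parent)
{
	struct toy_task child = *parent;

	child.apia_key = ++keygen;
	return child;
}
```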

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH 06/14] Fix CFLAGS for UBSAN_BOUNDS on Clang

2020-09-14 Thread Kees Cook
On Mon, Sep 14, 2020 at 05:27:42PM +, George-Aurelian Popescu wrote:
> From: George Popescu 
> 
> When the kernel is compiled with Clang, UBSAN_BOUNDS inserts a brk after
> the handler call, preventing it from printing any information processed
> inside the buffer.
> For Clang -fsanitize=bounds expands to -fsanitize=array-bounds and
> -fsanitize=local-bounds, and the latter adds a brk after the handler
> call.

That sounds like a compiler bug?

> Signed-off-by: George Popescu 
> ---
 scripts/Makefile.ubsan | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/scripts/Makefile.ubsan b/scripts/Makefile.ubsan
> index 27348029b2b8..3d15ac346c97 100644
> --- a/scripts/Makefile.ubsan
> +++ b/scripts/Makefile.ubsan
> @@ -4,7 +4,14 @@ ifdef CONFIG_UBSAN_ALIGNMENT
>  endif
>  
>  ifdef CONFIG_UBSAN_BOUNDS
> -  CFLAGS_UBSAN += $(call cc-option, -fsanitize=bounds)
> +  # For Clang -fsanitize=bounds translates to -fsanitize=array-bounds and
> +  # -fsanitize=local-bounds; the latter adds a brk right after the
> +  # handler is called.
> +  ifdef CONFIG_CC_IS_CLANG
> +CFLAGS_UBSAN += $(call cc-option, -fsanitize=array-bounds)

This would mean losing the local-bounds coverage? Isn't that for locally
defined arrays on the stack?

> +  else
> +CFLAGS_UBSAN += $(call cc-option, -fsanitize=bounds)
> +  endif
>  endif
>  
>  ifdef CONFIG_UBSAN_MISC
> -- 
> 2.28.0.618.gf4bc123cb7-goog
> 

-- 
Kees Cook


Re: [PATCH 06/14] Fix CFLAGS for UBSAN_BOUNDS on Clang

2020-09-17 Thread Kees Cook
On Tue, Sep 15, 2020 at 10:24:58AM +, George Popescu wrote:
> This would mean losing the local-bounds coverage. I tried to  test it without
> local-bounds and with a locally defined array on the stack and it works fine
> (the handler is called and the error reported). For me it feels like
> --array-bounds and --local-bounds are triggered for the same type of
> undefined_behaviours but they are handling them different.

Er, if --array-bounds still works on local arrays, what does
local-bounds actually do? >_> :P If we don't have a reduction in
coverage, yeah, I'm fine to turn that off.

-- 
Kees Cook


Re: [PATCH 06/14] Fix CFLAGS for UBSAN_BOUNDS on Clang

2020-09-17 Thread Kees Cook
On Thu, Sep 17, 2020 at 11:35:40AM +, George Popescu wrote:
> On Thu, Sep 17, 2020 at 08:37:07AM +0200, Marco Elver wrote:
> > So, it seems that local-bounds can still catch some rare OOB accesses,
> > where KASAN fails to catch it because the access might skip over the
> > redzone.
> > 
> > The other more interesting bit of history is that
> > -fsanitize=local-bounds used to be -fbounds-checking, and meant for
> > production use as a hardening feature:
> > http://lists.llvm.org/pipermail/llvm-dev/2012-May/049972.html
> > 
> > And local-bounds just does not behave like any other sanitizer as a
> > result, it just traps. The fact that it's enabled via
> > -fsanitize=local-bounds (or just bounds) but hasn't much changed in
> > behaviour is a little unfortunate.
> 
> > I suppose there are 3 options:
> > 
> > 1. George implements trap handling somehow. Is this feasible? If not,
> > why not? Maybe that should also have been explained in the commit
> > message.
> > 
> > 2. Only enable -fsanitize=local-bounds if UBSAN_TRAP was selected, at
> > least for as long as Clang traps for local-bounds. I think this makes
> > sense either way, because if we do not expect UBSAN to trap, it really
> > should not trap!
> > 
> > 3. Change the compiler. As always, this will take a while to implement
> > and then to reach whoever should have that updated compiler.
> > 
> > Preferences?
> Considering of what you said above, I find option 2 the most elegant.
> The first one doesn't sound doable for the moment, also the third.
> I will edit this patch considering your comments and resend it to the
> list.

I have a slightly different suggestion that is very nearly #2 above:
split local-bounds into a separate CONFIG that requires UBSAN_TRAP, and
then carefully document both:
- what does it catch that "bounds" doesn't
- why it only operates in trap mode

The rationale I have is that I don't like the coverage of some
mitigation or detection to "silently" vary between builds. e.g. someone
would build with/without UBSAN_TRAP and end up with unexpectedly
different coverage. I'd rather there be a separate CONFIG that appears.
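Such a split could look roughly like the fragment below; the option name and help text are hypothetical:

```kconfig
# Hypothetical sketch: local-bounds becomes its own visible, trap-only
# option rather than coverage that silently varies with UBSAN_TRAP.
config UBSAN_LOCAL_BOUNDS
	bool "Perform array local bounds checking"
	depends on UBSAN_TRAP
	depends on CC_IS_CLANG
	help
	  This option enables -fsanitize=local-bounds, which traps on the
	  out-of-bounds accesses it catches instead of calling a handler.
```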

-- 
Kees Cook


Re: [PATCH v4 09/17] perf/core: Use static_call to optimize perf_guest_info_callbacks

2022-02-06 Thread Kees Cook
On Fri, Feb 04, 2022 at 09:35:49AM -0800, Sami Tolvanen wrote:
> On Wed, Feb 2, 2022 at 10:43 AM Sean Christopherson  wrote:
> > > +DEFINE_STATIC_CALL_RET0(__perf_guest_state, *perf_guest_cbs->state);
> > > +DEFINE_STATIC_CALL_RET0(__perf_guest_get_ip, *perf_guest_cbs->get_ip);
> > > +DEFINE_STATIC_CALL_RET0(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr);
> >
> > Using __static_call_return0() makes clang's CFI sad on arm64 due to the
> > resulting function prototype mismatch, which IIUC, is verified by clang's
> > __cfi_check() for indirect calls, i.e. architectures without
> > CONFIG_HAVE_STATIC_CALL.
> >
> > We could fudge around the issue by using stubs, massaging prototypes, 
> > etc..., but
> > that means doing that for every arch-agnostic user of 
> > __static_call_return0().
> >
> > Any clever ideas?  Can we do something like generate a unique function for 
> > every
> > DEFINE_STATIC_CALL_RET0 for CONFIG_HAVE_STATIC_CALL=n, e.g. using typeof() 
> > to
> > get the prototype?
> 
> I'm not sure there's a clever fix for this. On architectures without
> HAVE_STATIC_CALL, this is an indirect call to a function with a
> mismatching type, which CFI is intended to catch.
> 
> The obvious way to solve the problem would be to use a stub function
> with the correct type, which I agree, isn't going to scale. You can
> alternatively check if .func points to __static_call_return0 and not
> make the indirect call if it does. If neither of these options are
> feasible, you can disable CFI checking in the functions that have
> these static calls using the __nocfi attribute.
> 
> Kees, any thoughts?

I'm digging through the macros to sort this out, but IIUC, an example of
the problem is:

perf_guest_cbs->handle_intel_pt_intr is:

unsigned int (*handle_intel_pt_intr)(void);

The declaration for this starts with:

DECLARE_STATIC_CALL(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr);

which produces:

extern struct static_call_key STATIC_CALL_KEY(__perf_guest_handle_intel_pt_intr);
extern typeof(*perf_guest_cbs->handle_intel_pt_intr) STATIC_CALL_TRAMP(__perf_guest_handle_intel_pt_intr);

and the last line becomes:

extern unsigned int (*__SCTperf_guest_handle_intel_pt_intr)(void);

with !HAVE_STATIC_CALL, when perf_guest_handle_intel_pt_intr() does:

return static_call(__perf_guest_handle_intel_pt_intr)();

it is resolving static_call() into:

((typeof(STATIC_CALL_TRAMP(name))*)(STATIC_CALL_KEY(name).func))

so the caller is expecting "unsigned int (*)(void)" but the prototype
of __static_call_return0 is "long (*)(void)":

long __static_call_return0(void);

Could we simply declare a type-matched ret0 trampoline too?

#define STATIC_CALL_TRAMP_RET0_PREFIX   __SCT__ret0__
#define STATIC_CALL_TRAMP_RET0(name)    __PASTE(STATIC_CALL_TRAMP_RET0_PREFIX, name)

#define DEFINE_STATIC_CALL_RET0(name, _func)\
static typeof(_func) STATIC_CALL_TRAMP_RET0(name) { return 0; } \
__DEFINE_STATIC_CALL(name, _func, STATIC_CALL_TRAMP_RET0(name));

static_call_update(__perf_guest_handle_intel_pt_intr,
		   (void *)&STATIC_CALL_TRAMP_RET0(__perf_guest_handle_intel_pt_intr))
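A user-space miniature of the idea above (all names hypothetical) showing why a type-matched "return 0" trampoline keeps the indirect call prototype-correct for a CFI check, where the generic long-returning helper would not be:

```c
#include <assert.h>

/* the prototype the call site expects */
typedef unsigned int (*intel_pt_intr_fn)(void);

/* generic helper with a mismatching prototype, as __static_call_return0 */
static long generic_ret0(void) { return 0; }

/* per-call, type-matched "return 0" trampoline */
static unsigned int tramp_ret0_intel_pt_intr(void) { return 0; }

/* struct static_call_key stores an untyped function pointer */
static void *key_func = (void *)tramp_ret0_intel_pt_intr;

static unsigned int call_intel_pt_intr(void)
{
	/* caller and callee prototypes now agree, so a CFI check passes */
	return ((intel_pt_intr_fn)key_func)();
}
```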

-- 
Kees Cook


Re: [PATCH v4 09/17] perf/core: Use static_call to optimize perf_guest_info_callbacks

2022-02-06 Thread Kees Cook
On Sun, Feb 06, 2022 at 09:28:52PM +0100, Peter Zijlstra wrote:
> On Sun, Feb 06, 2022 at 10:45:15AM -0800, Kees Cook wrote:
> 
> > I'm digging through the macros to sort this out, but IIUC, an example of
> > the problem is:
> > 
> 
> > so the caller is expecting "unsigned int (*)(void)" but the prototype
> > of __static_call_return0 is "long (*)(void)":
> > 
> > long __static_call_return0(void);
> > 
> > Could we simply declare a type-matched ret0 trampoline too?
> 
> That'll work for this case, but the next case the function will have
> arguments we'll need even more nonsense...

Shouldn't the typeof() work there too, though? I.e. as long as the
return value can hold a "0", it'd work.

> And as stated in that other email, there's tb_stub_func() having the
> exact same problem as well.

Yeah, I'd need to go look at that again.

> The x86_64 CFI patches had a work-around for this, that could trivially
> be lifted I suppose.

Yeah, I think it'd be similar. I haven't had a chance to go look at that
again...

-- 
Kees Cook