On Tue, Aug 28, 2018 at 1:26 AM Andy Lutomirski <l...@kernel.org> wrote:
>
> On Mon, Aug 27, 2018 at 4:12 PM, Jann Horn <ja...@google.com> wrote:
> > On Tue, Aug 28, 2018 at 1:04 AM Andy Lutomirski <l...@kernel.org> wrote:
> >>
> >> In NMI context, we might be in the middle of context switching or in
> >> the middle of switch_mm_irqs_off().  In either case, CR3 might not
> >> match current->mm, which could cause copy_from_user_nmi() and
> >> friends to read the wrong memory.
> >>
> >> Fix it by adding a new nmi_uaccess_okay() helper and checking it in
> >> copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.
> >
> > What about eBPF probes (which I think can be attached to kprobe points
> > / tracepoints / perf events) that perform userspace reads / userspace
> > writes / kernel reads? Can those run in NMI context, and if so, do
> > they also need special handling?
>
> I assume they can run in NMI context, which might be problematic in
> and of themselves.  For example, does BPF adequately protect against a
> BPF program accessing a map while bpf(2) is modifying it?  It seems
> like bpf_prog_active is intended to serve this purpose.
>
> But I don't see any obvious mechanism for eBPF programs to read user memory.

Look in kernel/trace/bpf_trace.c, which defines a bunch of eBPF
helpers that can only be called from privileged eBPF code. Ah, but I
misremembered: the userspace write helper does have a guard against
interrupt context; it's just the arbitrary read helper that doesn't.

BPF_CALL_3(bpf_probe_read, void *, dst, u32, size, const void *, unsafe_ptr)
{
    int ret;

    ret = probe_kernel_read(dst, unsafe_ptr, size);
    if (unlikely(ret < 0))
        memset(dst, 0, size);

    return ret;
}
[...]
BPF_CALL_3(bpf_probe_write_user, void *, unsafe_ptr, const void *, src,
       u32, size)
{
    /*
     * Ensure we're in user context which is safe for the helper to
     * run. This helper has no business in a kthread.
     *
     * access_ok() should prevent writing to non-user memory, but in
     * some situations (nommu, temporary switch, etc) access_ok() does
     * not provide enough validation, hence the check on KERNEL_DS.
     */

    if (unlikely(in_interrupt() ||
             current->flags & (PF_KTHREAD | PF_EXITING)))
        return -EPERM;
    if (unlikely(uaccess_kernel()))
        return -EPERM;
    if (!access_ok(VERIFY_WRITE, unsafe_ptr, size))
        return -EPERM;

    return probe_kernel_write(unsafe_ptr, src, size);
}
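
To make the asymmetry between the two helpers concrete, here is a small
userspace model of their guards. It is only a sketch, not kernel code:
struct model_ctx, model_probe_read(), model_probe_write_user() and
model_demo() are made-up names, the three flags stand in for
in_interrupt(), current->flags & (PF_KTHREAD | PF_EXITING) and
uaccess_kernel(), and the access_ok() check and fault handling are left
out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the state the kernel checks consult. */
struct model_ctx {
    bool in_interrupt;        /* models in_interrupt(): IRQ/NMI context */
    bool kthread_or_exiting;  /* models PF_KTHREAD | PF_EXITING being set */
    bool kernel_ds;           /* models uaccess_kernel() */
};

/* Models bpf_probe_read(): the copy runs without any context check. */
static int model_probe_read(const struct model_ctx *ctx, void *dst,
                            const void *src, size_t size)
{
    (void)ctx; /* deliberately unused: this helper has no guard */
    memcpy(dst, src, size);
    return 0;
}

/* Models the guards at the top of bpf_probe_write_user(). */
static int model_probe_write_user(const struct model_ctx *ctx, void *dst,
                                  const void *src, size_t size)
{
    if (ctx->in_interrupt || ctx->kthread_or_exiting)
        return -EPERM;
    if (ctx->kernel_ds)
        return -EPERM;

    memcpy(dst, src, size);
    return 0;
}

/* In a simulated NMI the read goes ahead while the write is refused. */
void model_demo(void)
{
    struct model_ctx nmi = { .in_interrupt = true };
    char src[4] = "abc";
    char dst[4] = { 0 };

    assert(model_probe_read(&nmi, dst, src, sizeof(src)) == 0);
    assert(model_probe_write_user(&nmi, dst, src, sizeof(src)) == -EPERM);
}
```

In this model only the write path looks at the context at all, which is
the gap described above for the read helper.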
