Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, 5 Sep 2018, Andrea Arcangeli wrote: > [ 1838.621010] Kernel panic - not syncing: Watchdog detected hard LOCKUP on > cpu 6 > [ 1838.629070] CPU: 6 PID: 0 Comm: swapper/6 Not tainted > 3.10.0-327.62.4.el7.x86_64 #1 > [ 1838.637610] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.4.2 > 01/09/2017 > [ 1838.645954] Call Trace: > [ 1838.648680][] dump_stack+0x19/0x1b > [ 1838.655113] [] panic+0xd8/0x1e7 > [ 1838.660460] [] ? restart_watchdog_hrtimer+0x50/0x50 > [ 1838.667742] [] watchdog_overflow_callback+0xc2/0xd0 > [ 1838.675024] [] __perf_event_overflow+0xa1/0x250 > [ 1838.681920] [] perf_event_overflow+0x14/0x20 > [ 1838.688526] [] intel_pmu_handle_irq+0x1e8/0x470 > [ 1838.695423] [] ? ioremap_page_range+0x24c/0x330 > [ 1838.702320] [] ? unmap_kernel_range_noflush+0x11/0x20 > [ 1838.709797] [] ? ghes_copy_tofrom_phys+0x124/0x210 > [ 1838.716984] [] ? ghes_read_estatus+0xa0/0x190 > [ 1838.723687] [] perf_event_nmi_handler+0x2b/0x50 > [ 1838.730582] [] nmi_handle.isra.0+0x69/0xb0 > [ 1838.736992] [] do_nmi+0x169/0x340 > [ 1838.742532] [] end_repeat_nmi+0x1e/0x7e > [ 1838.748653] [] ? _raw_spin_lock_irqsave+0x3d/0x60 > [ 1838.755742] [] ? _raw_spin_lock_irqsave+0x3d/0x60 > [ 1838.762831] [] ? _raw_spin_lock_irqsave+0x3d/0x60 > [ 1838.769917] <> [] avc_compute_av+0x126/0x1b5 > [ 1838.777125] [] ? walk_tg_tree_from+0xbe/0x110 > [ 1838.783828] [] avc_has_perm_noaudit+0xc4/0x110 > [ 1838.790628] [] cred_has_capability+0x6b/0x120 > [ 1838.797331] [] ? ktime_get+0x4c/0xd0 > [ 1838.803160] [] ? clockevents_program_event+0x6b/0xf0 > [ 1838.810532] [] selinux_capable+0x2e/0x40 > [ 1838.816748] [] security_capable_noaudit+0x15/0x20 > [ 1838.823829] [] has_ns_capability_noaudit+0x15/0x20 > [ 1838.831014] [] ptrace_has_cap+0x35/0x40 > [ 1838.837126] [] ___ptrace_may_access+0xa7/0x1e0 > [ 1838.843925] [] __schedule+0x26e/0xa00 > [ 1838.849855] [] schedule_preempt_disabled+0x29/0x70 > [ 1838.857041] [] cpu_startup_entry+0x184/0x290 > [ 1838.863637] [] start_secondary+0x1da/0x250 So yeah, current Linus' tree needs the same treatment -- we have to avoid calling out to ptrace_has_cap() in PTRACE_MODE_NOACCESS_CHK cases. I have updated the patch for v4. Thanks again, -- Jiri Kosina SUSE Labs
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, 5 Sep 2018, Andrea Arcangeli wrote: > [ 1838.621010] Kernel panic - not syncing: Watchdog detected hard LOCKUP on > cpu 6 > [ 1838.629070] CPU: 6 PID: 0 Comm: swapper/6 Not tainted > 3.10.0-327.62.4.el7.x86_64 #1 > [ 1838.637610] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.4.2 > 01/09/2017 > [ 1838.645954] Call Trace: > [ 1838.648680][] dump_stack+0x19/0x1b > [ 1838.655113] [] panic+0xd8/0x1e7 > [ 1838.660460] [] ? restart_watchdog_hrtimer+0x50/0x50 > [ 1838.667742] [] watchdog_overflow_callback+0xc2/0xd0 > [ 1838.675024] [] __perf_event_overflow+0xa1/0x250 > [ 1838.681920] [] perf_event_overflow+0x14/0x20 > [ 1838.688526] [] intel_pmu_handle_irq+0x1e8/0x470 > [ 1838.695423] [] ? ioremap_page_range+0x24c/0x330 > [ 1838.702320] [] ? unmap_kernel_range_noflush+0x11/0x20 > [ 1838.709797] [] ? ghes_copy_tofrom_phys+0x124/0x210 > [ 1838.716984] [] ? ghes_read_estatus+0xa0/0x190 > [ 1838.723687] [] perf_event_nmi_handler+0x2b/0x50 > [ 1838.730582] [] nmi_handle.isra.0+0x69/0xb0 > [ 1838.736992] [] do_nmi+0x169/0x340 > [ 1838.742532] [] end_repeat_nmi+0x1e/0x7e > [ 1838.748653] [] ? _raw_spin_lock_irqsave+0x3d/0x60 > [ 1838.755742] [] ? _raw_spin_lock_irqsave+0x3d/0x60 > [ 1838.762831] [] ? _raw_spin_lock_irqsave+0x3d/0x60 > [ 1838.769917] <> [] avc_compute_av+0x126/0x1b5 > [ 1838.777125] [] ? walk_tg_tree_from+0xbe/0x110 > [ 1838.783828] [] avc_has_perm_noaudit+0xc4/0x110 > [ 1838.790628] [] cred_has_capability+0x6b/0x120 > [ 1838.797331] [] ? ktime_get+0x4c/0xd0 > [ 1838.803160] [] ? clockevents_program_event+0x6b/0xf0 > [ 1838.810532] [] selinux_capable+0x2e/0x40 > [ 1838.816748] [] security_capable_noaudit+0x15/0x20 > [ 1838.823829] [] has_ns_capability_noaudit+0x15/0x20 > [ 1838.831014] [] ptrace_has_cap+0x35/0x40 > [ 1838.837126] [] ___ptrace_may_access+0xa7/0x1e0 > [ 1838.843925] [] __schedule+0x26e/0xa00 > [ 1838.849855] [] schedule_preempt_disabled+0x29/0x70 > [ 1838.857041] [] cpu_startup_entry+0x184/0x290 > [ 1838.863637] [] start_secondary+0x1da/0x250 So yeah, current Linus' tree needs the same treatment -- we have to avoid calling out to ptrace_has_cap() in PTRACE_MODE_NOACCESS_CHK cases. I have updated the patch for v4. Thanks again, -- Jiri Kosina SUSE Labs
RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
> -Original Message- > From: Peter Zijlstra [mailto:pet...@infradead.org] > Sent: Wednesday, September 05, 2018 12:03 PM > To: Andrea Arcangeli > Cc: Jiri Kosina ; Andi Kleen ; Tim > Chen > ; Schaufler, Casey ; > Thomas Gleixner ; Ingo Molnar ; > Josh Poimboeuf ; Woodhouse, David > ; Oleg Nesterov ; linux- > ker...@vger.kernel.org; x...@kernel.org > Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can > be applied on arbitrary tasks > > On Wed, Sep 05, 2018 at 02:40:18PM -0400, Andrea Arcangeli wrote: > > [ 1838.769917] <> [] > avc_compute_av+0x126/0x1b5 > > That does read_lock(), which is not allowed from scheduler context. > > > [ 1838.777125] [] ? walk_tg_tree_from+0xbe/0x110 > > [ 1838.783828] [] avc_has_perm_noaudit+0xc4/0x110 > > In current code this can end up in avc_update_node() which uses > spin_lock(), which is a bug from scheduler context.o > > > [ 1838.790628] [] cred_has_capability+0x6b/0x120 > > [ 1838.797331] [] ? ktime_get+0x4c/0xd0 > > [ 1838.803160] [] ? > clockevents_program_event+0x6b/0xf0 > > [ 1838.810532] [] selinux_capable+0x2e/0x40 > > [ 1838.816748] [] security_capable_noaudit+0x15/0x20 > > [ 1838.823829] [] has_ns_capability_noaudit+0x15/0x20 > > [ 1838.831014] [] ptrace_has_cap+0x35/0x40 > > [ 1838.837126] [] ___ptrace_may_access+0xa7/0x1e0 > > [ 1838.843925] [] __schedule+0x26e/0xa00 > > [ 1838.849855] [] schedule_preempt_disabled+0x29/0x70 > > [ 1838.857041] [] cpu_startup_entry+0x184/0x290 > > [ 1838.863637] [] start_secondary+0x1da/0x250 > > So yes, looks like all that security LSM nonsense isn't going to work > here. What won't work is using the ptrace code. That is one of the reasons why you can't just blindly use it. Look at the patch set I submitted and you'll see that the SELinux selinux_task_safe_sidechannel() hook does not do the things that cause the lockup.
RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
> -Original Message- > From: Peter Zijlstra [mailto:pet...@infradead.org] > Sent: Wednesday, September 05, 2018 12:03 PM > To: Andrea Arcangeli > Cc: Jiri Kosina ; Andi Kleen ; Tim > Chen > ; Schaufler, Casey ; > Thomas Gleixner ; Ingo Molnar ; > Josh Poimboeuf ; Woodhouse, David > ; Oleg Nesterov ; linux- > ker...@vger.kernel.org; x...@kernel.org > Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can > be applied on arbitrary tasks > > On Wed, Sep 05, 2018 at 02:40:18PM -0400, Andrea Arcangeli wrote: > > [ 1838.769917] <> [] > avc_compute_av+0x126/0x1b5 > > That does read_lock(), which is not allowed from scheduler context. > > > [ 1838.777125] [] ? walk_tg_tree_from+0xbe/0x110 > > [ 1838.783828] [] avc_has_perm_noaudit+0xc4/0x110 > > In current code this can end up in avc_update_node() which uses > spin_lock(), which is a bug from scheduler context.o > > > [ 1838.790628] [] cred_has_capability+0x6b/0x120 > > [ 1838.797331] [] ? ktime_get+0x4c/0xd0 > > [ 1838.803160] [] ? > clockevents_program_event+0x6b/0xf0 > > [ 1838.810532] [] selinux_capable+0x2e/0x40 > > [ 1838.816748] [] security_capable_noaudit+0x15/0x20 > > [ 1838.823829] [] has_ns_capability_noaudit+0x15/0x20 > > [ 1838.831014] [] ptrace_has_cap+0x35/0x40 > > [ 1838.837126] [] ___ptrace_may_access+0xa7/0x1e0 > > [ 1838.843925] [] __schedule+0x26e/0xa00 > > [ 1838.849855] [] schedule_preempt_disabled+0x29/0x70 > > [ 1838.857041] [] cpu_startup_entry+0x184/0x290 > > [ 1838.863637] [] start_secondary+0x1da/0x250 > > So yes, looks like all that security LSM nonsense isn't going to work > here. What won't work is using the ptrace code. That is one of the reasons why you can't just blindly use it. Look at the patch set I submitted and you'll see that the SELinux selinux_task_safe_sidechannel() hook does not do the things that cause the lockup.
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, Sep 05, 2018 at 02:40:18PM -0400, Andrea Arcangeli wrote: > [ 1838.769917] <> [] avc_compute_av+0x126/0x1b5 That does read_lock(), which is not allowed from scheduler context. > [ 1838.777125] [] ? walk_tg_tree_from+0xbe/0x110 > [ 1838.783828] [] avc_has_perm_noaudit+0xc4/0x110 In current code this can end up in avc_update_node() which uses spin_lock(), which is a bug from scheduler context.o > [ 1838.790628] [] cred_has_capability+0x6b/0x120 > [ 1838.797331] [] ? ktime_get+0x4c/0xd0 > [ 1838.803160] [] ? clockevents_program_event+0x6b/0xf0 > [ 1838.810532] [] selinux_capable+0x2e/0x40 > [ 1838.816748] [] security_capable_noaudit+0x15/0x20 > [ 1838.823829] [] has_ns_capability_noaudit+0x15/0x20 > [ 1838.831014] [] ptrace_has_cap+0x35/0x40 > [ 1838.837126] [] ___ptrace_may_access+0xa7/0x1e0 > [ 1838.843925] [] __schedule+0x26e/0xa00 > [ 1838.849855] [] schedule_preempt_disabled+0x29/0x70 > [ 1838.857041] [] cpu_startup_entry+0x184/0x290 > [ 1838.863637] [] start_secondary+0x1da/0x250 So yes, looks like all that security LSM nonsense isn't going to work here.
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, Sep 05, 2018 at 02:40:18PM -0400, Andrea Arcangeli wrote: > [ 1838.769917] <> [] avc_compute_av+0x126/0x1b5 That does read_lock(), which is not allowed from scheduler context. > [ 1838.777125] [] ? walk_tg_tree_from+0xbe/0x110 > [ 1838.783828] [] avc_has_perm_noaudit+0xc4/0x110 In current code this can end up in avc_update_node() which uses spin_lock(), which is a bug from scheduler context.o > [ 1838.790628] [] cred_has_capability+0x6b/0x120 > [ 1838.797331] [] ? ktime_get+0x4c/0xd0 > [ 1838.803160] [] ? clockevents_program_event+0x6b/0xf0 > [ 1838.810532] [] selinux_capable+0x2e/0x40 > [ 1838.816748] [] security_capable_noaudit+0x15/0x20 > [ 1838.823829] [] has_ns_capability_noaudit+0x15/0x20 > [ 1838.831014] [] ptrace_has_cap+0x35/0x40 > [ 1838.837126] [] ___ptrace_may_access+0xa7/0x1e0 > [ 1838.843925] [] __schedule+0x26e/0xa00 > [ 1838.849855] [] schedule_preempt_disabled+0x29/0x70 > [ 1838.857041] [] cpu_startup_entry+0x184/0x290 > [ 1838.863637] [] start_secondary+0x1da/0x250 So yes, looks like all that security LSM nonsense isn't going to work here.
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, 5 Sep 2018, Andrea Arcangeli wrote: > Perhaps you didn't sandbox KVM inside selinux by default? We by default do not enable selinux, so that's probably why. Thanks again, will go through the code in mainline and adapt the patch before sending v4. -- Jiri Kosina SUSE Labs
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, 5 Sep 2018, Andrea Arcangeli wrote: > Perhaps you didn't sandbox KVM inside selinux by default? We by default do not enable selinux, so that's probably why. Thanks again, will go through the code in mainline and adapt the patch before sending v4. -- Jiri Kosina SUSE Labs
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, Sep 05, 2018 at 08:29:07PM +0200, Jiri Kosina wrote: > (and no, my testing of the patch I sent on current tree didn't produce any > hangs -- was there a reliable way to trigger it on 3.10?). Only a very specific libvirt acceptance test found this after a while and it wasn't a customer it was caught by QA. The reporter said it wasn't sure about how to reproduce this issue either, it happened once in a while the backtrace was still enough to fix it for sure and then it never happened again. It's not because of virt but probably because of selinux+audit. This is precisely why I thought once you enter LSM from the scheduler atomic path the trouble starts as each LSM implementation of those calls may crash or not crash. Perhaps you didn't sandbox KVM inside selinux by default? This is the lockup the patch I posted fixed for 3.10. [ 1838.621010] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 6 [ 1838.629070] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 3.10.0-327.62.4.el7.x86_64 #1 [ 1838.637610] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.4.2 01/09/2017 [ 1838.645954] Call Trace: [ 1838.648680][] dump_stack+0x19/0x1b [ 1838.655113] [] panic+0xd8/0x1e7 [ 1838.660460] [] ? restart_watchdog_hrtimer+0x50/0x50 [ 1838.667742] [] watchdog_overflow_callback+0xc2/0xd0 [ 1838.675024] [] __perf_event_overflow+0xa1/0x250 [ 1838.681920] [] perf_event_overflow+0x14/0x20 [ 1838.688526] [] intel_pmu_handle_irq+0x1e8/0x470 [ 1838.695423] [] ? ioremap_page_range+0x24c/0x330 [ 1838.702320] [] ? unmap_kernel_range_noflush+0x11/0x20 [ 1838.709797] [] ? ghes_copy_tofrom_phys+0x124/0x210 [ 1838.716984] [] ? ghes_read_estatus+0xa0/0x190 [ 1838.723687] [] perf_event_nmi_handler+0x2b/0x50 [ 1838.730582] [] nmi_handle.isra.0+0x69/0xb0 [ 1838.736992] [] do_nmi+0x169/0x340 [ 1838.742532] [] end_repeat_nmi+0x1e/0x7e [ 1838.748653] [] ? _raw_spin_lock_irqsave+0x3d/0x60 [ 1838.755742] [] ? _raw_spin_lock_irqsave+0x3d/0x60 [ 1838.762831] [] ? _raw_spin_lock_irqsave+0x3d/0x60 [ 1838.769917] <> [] avc_compute_av+0x126/0x1b5 [ 1838.777125] [] ? walk_tg_tree_from+0xbe/0x110 [ 1838.783828] [] avc_has_perm_noaudit+0xc4/0x110 [ 1838.790628] [] cred_has_capability+0x6b/0x120 [ 1838.797331] [] ? ktime_get+0x4c/0xd0 [ 1838.803160] [] ? clockevents_program_event+0x6b/0xf0 [ 1838.810532] [] selinux_capable+0x2e/0x40 [ 1838.816748] [] security_capable_noaudit+0x15/0x20 [ 1838.823829] [] has_ns_capability_noaudit+0x15/0x20 [ 1838.831014] [] ptrace_has_cap+0x35/0x40 [ 1838.837126] [] ___ptrace_may_access+0xa7/0x1e0 [ 1838.843925] [] __schedule+0x26e/0xa00 [ 1838.849855] [] schedule_preempt_disabled+0x29/0x70 [ 1838.857041] [] cpu_startup_entry+0x184/0x290 [ 1838.863637] [] start_secondary+0x1da/0x250
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, Sep 05, 2018 at 08:29:07PM +0200, Jiri Kosina wrote: > (and no, my testing of the patch I sent on current tree didn't produce any > hangs -- was there a reliable way to trigger it on 3.10?). Only a very specific libvirt acceptance test found this after a while and it wasn't a customer it was caught by QA. The reporter said it wasn't sure about how to reproduce this issue either, it happened once in a while the backtrace was still enough to fix it for sure and then it never happened again. It's not because of virt but probably because of selinux+audit. This is precisely why I thought once you enter LSM from the scheduler atomic path the trouble starts as each LSM implementation of those calls may crash or not crash. Perhaps you didn't sandbox KVM inside selinux by default? This is the lockup the patch I posted fixed for 3.10. [ 1838.621010] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 6 [ 1838.629070] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 3.10.0-327.62.4.el7.x86_64 #1 [ 1838.637610] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.4.2 01/09/2017 [ 1838.645954] Call Trace: [ 1838.648680][] dump_stack+0x19/0x1b [ 1838.655113] [] panic+0xd8/0x1e7 [ 1838.660460] [] ? restart_watchdog_hrtimer+0x50/0x50 [ 1838.667742] [] watchdog_overflow_callback+0xc2/0xd0 [ 1838.675024] [] __perf_event_overflow+0xa1/0x250 [ 1838.681920] [] perf_event_overflow+0x14/0x20 [ 1838.688526] [] intel_pmu_handle_irq+0x1e8/0x470 [ 1838.695423] [] ? ioremap_page_range+0x24c/0x330 [ 1838.702320] [] ? unmap_kernel_range_noflush+0x11/0x20 [ 1838.709797] [] ? ghes_copy_tofrom_phys+0x124/0x210 [ 1838.716984] [] ? ghes_read_estatus+0xa0/0x190 [ 1838.723687] [] perf_event_nmi_handler+0x2b/0x50 [ 1838.730582] [] nmi_handle.isra.0+0x69/0xb0 [ 1838.736992] [] do_nmi+0x169/0x340 [ 1838.742532] [] end_repeat_nmi+0x1e/0x7e [ 1838.748653] [] ? _raw_spin_lock_irqsave+0x3d/0x60 [ 1838.755742] [] ? _raw_spin_lock_irqsave+0x3d/0x60 [ 1838.762831] [] ? _raw_spin_lock_irqsave+0x3d/0x60 [ 1838.769917] <> [] avc_compute_av+0x126/0x1b5 [ 1838.777125] [] ? walk_tg_tree_from+0xbe/0x110 [ 1838.783828] [] avc_has_perm_noaudit+0xc4/0x110 [ 1838.790628] [] cred_has_capability+0x6b/0x120 [ 1838.797331] [] ? ktime_get+0x4c/0xd0 [ 1838.803160] [] ? clockevents_program_event+0x6b/0xf0 [ 1838.810532] [] selinux_capable+0x2e/0x40 [ 1838.816748] [] security_capable_noaudit+0x15/0x20 [ 1838.823829] [] has_ns_capability_noaudit+0x15/0x20 [ 1838.831014] [] ptrace_has_cap+0x35/0x40 [ 1838.837126] [] ___ptrace_may_access+0xa7/0x1e0 [ 1838.843925] [] __schedule+0x26e/0xa00 [ 1838.849855] [] schedule_preempt_disabled+0x29/0x70 [ 1838.857041] [] cpu_startup_entry+0x184/0x290 [ 1838.863637] [] start_secondary+0x1da/0x250
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, 5 Sep 2018, Andi Kleen wrote: > Please if you repost include plenty of performance numbers for multi > threaded workloads. It's ridiculous to even discuss this without them. Talking about ridiculous ... I find it a bit sad that Intel has let this be unfixed for 3/4 years in linux; that doesn't really signal deep dedication to customer safety. Have any STIBP patches been even submitted? This is not the same situation as IBRS which was mostly ignored -- there we have retpolines to protect the kernel, and it's debatable whether it's exploitable on SKL at all. Ignoring IBPB and STIBP is keeping the system plain vulnerable to user-user attacks, and us not providing users with possibiliy to easily mitigate, is a bit embarassing in my eyes. Thanks, -- Jiri Kosina SUSE Labs
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, 5 Sep 2018, Andi Kleen wrote: > Please if you repost include plenty of performance numbers for multi > threaded workloads. It's ridiculous to even discuss this without them. Talking about ridiculous ... I find it a bit sad that Intel has let this be unfixed for 3/4 years in linux; that doesn't really signal deep dedication to customer safety. Have any STIBP patches been even submitted? This is not the same situation as IBRS which was mostly ignored -- there we have retpolines to protect the kernel, and it's debatable whether it's exploitable on SKL at all. Ignoring IBPB and STIBP is keeping the system plain vulnerable to user-user attacks, and us not providing users with possibiliy to easily mitigate, is a bit embarassing in my eyes. Thanks, -- Jiri Kosina SUSE Labs
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, 5 Sep 2018, Andrea Arcangeli wrote: > ptrace_has_cap(tcred->user_ns, mode) is supposed to eventually lockup > hard if called from scheduler as it does some locking, and we fixed > that already half a year ago. > > Not sure how it's still unfixed in Jiri's codebase after so long, or > if it's an issue specific to 3.10 and upstream gets away without this. We haven't got any lockup reports in our kernels (and we do carry a variant of this patch), so it might be somehow specific to 3.10. > diff --git a/kernel/ptrace.c b/kernel/ptrace.c > index eb7862f185ff..4a8d0dd73c93 100644 > --- a/kernel/ptrace.c > +++ b/kernel/ptrace.c > @@ -285,7 +285,8 @@ int ___ptrace_may_access(struct task_struct *tracer, > gid_eq(caller_gid, tcred->sgid) && > gid_eq(caller_gid, tcred->gid)) > goto ok; > - if (ptrace_has_cap(tcred->user_ns, mode)) > + if (!(mode & PTRACE_MODE_NOACCESS_CHK) && > + ptrace_has_cap(tcred->user_ns, mode)) > goto ok; > rcu_read_unlock(); > return -EPERM; > @@ -296,7 +297,8 @@ ok: > dumpable = get_dumpable(task->mm); > rcu_read_lock(); > if (dumpable != SUID_DUMP_USER && > - !ptrace_has_cap(__task_cred(task)->user_ns, mode)) { > + ((mode & PTRACE_MODE_NOACCESS_CHK) || > + !ptrace_has_cap(__task_cred(task)->user_ns, mode))) { > rcu_read_unlock(); > return -EPERM; I will look into this whether it's still applicable or not, thanks a lot for the pointer. (and no, my testing of the patch I sent on current tree didn't produce any hangs -- was there a reliable way to trigger it on 3.10?). Thanks, -- Jiri Kosina SUSE Labs
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, 5 Sep 2018, Andrea Arcangeli wrote: > ptrace_has_cap(tcred->user_ns, mode) is supposed to eventually lockup > hard if called from scheduler as it does some locking, and we fixed > that already half a year ago. > > Not sure how it's still unfixed in Jiri's codebase after so long, or > if it's an issue specific to 3.10 and upstream gets away without this. We haven't got any lockup reports in our kernels (and we do carry a variant of this patch), so it might be somehow specific to 3.10. > diff --git a/kernel/ptrace.c b/kernel/ptrace.c > index eb7862f185ff..4a8d0dd73c93 100644 > --- a/kernel/ptrace.c > +++ b/kernel/ptrace.c > @@ -285,7 +285,8 @@ int ___ptrace_may_access(struct task_struct *tracer, > gid_eq(caller_gid, tcred->sgid) && > gid_eq(caller_gid, tcred->gid)) > goto ok; > - if (ptrace_has_cap(tcred->user_ns, mode)) > + if (!(mode & PTRACE_MODE_NOACCESS_CHK) && > + ptrace_has_cap(tcred->user_ns, mode)) > goto ok; > rcu_read_unlock(); > return -EPERM; > @@ -296,7 +297,8 @@ ok: > dumpable = get_dumpable(task->mm); > rcu_read_lock(); > if (dumpable != SUID_DUMP_USER && > - !ptrace_has_cap(__task_cred(task)->user_ns, mode)) { > + ((mode & PTRACE_MODE_NOACCESS_CHK) || > + !ptrace_has_cap(__task_cred(task)->user_ns, mode))) { > rcu_read_unlock(); > return -EPERM; I will look into this whether it's still applicable or not, thanks a lot for the pointer. (and no, my testing of the patch I sent on current tree didn't produce any hangs -- was there a reliable way to trigger it on 3.10?). Thanks, -- Jiri Kosina SUSE Labs
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, 5 Sep 2018, Andi Kleen wrote: > > So, after giving it a bit more thought, I still believe "I want spectre V2 > > protection" vs. "I do not care about spectre V2 on my system > > (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 > > of my patchset, including the ptrace check in switch_mm() (statically > > patched out on !IBPB-capable systems), and we can then later see whether > > the LSM implementation, once it exists, should be used instead. > > Please if you repost include plenty of performance numbers for multi threaded > workloads. It's ridiculous to even discuss this without them. Either we care about that problem and provide a proper mechanism to protect systems or we do not. That's not a performance number problem at all. Thanks, tglx
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, 5 Sep 2018, Andi Kleen wrote: > > So, after giving it a bit more thought, I still believe "I want spectre V2 > > protection" vs. "I do not care about spectre V2 on my system > > (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 > > of my patchset, including the ptrace check in switch_mm() (statically > > patched out on !IBPB-capable systems), and we can then later see whether > > the LSM implementation, once it exists, should be used instead. > > Please if you repost include plenty of performance numbers for multi threaded > workloads. It's ridiculous to even discuss this without them. Either we care about that problem and provide a proper mechanism to protect systems or we do not. That's not a performance number problem at all. Thanks, tglx
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, Sep 05, 2018 at 08:58:23AM -0700, Andi Kleen wrote: > > So, after giving it a bit more thought, I still believe "I want spectre V2 > > protection" vs. "I do not care about spectre V2 on my system > > (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 > > of my patchset, including the ptrace check in switch_mm() (statically > > patched out on !IBPB-capable systems), and we can then later see whether > > the LSM implementation, once it exists, should be used instead. > > Please if you repost include plenty of performance numbers for multi threaded > workloads. It's ridiculous to even discuss this without them. Multi threaded workloads won't be affected because they share the memory in the first place... the check itself is lost in the noise too. Maybe you meant to ask for multiple parallel processes (multithreaded or not, zero difference) all with a different user id? What is more weird for me is to attempt to discuss the STIBP part of the patch without knowing which microcodes exactly implement STIBP in a way that is slow. Tim already said it's a measurable performance hit, but on some CPU it's zero performance hit. We don't even know if STIBP is actually useful or if it's a noop on those CPUs where it won't affect performance. Back to the IBPB, from implementation standpoint at least on 3.10 this code posted would lockup hard eventually and we got complains. ptrace_has_cap(tcred->user_ns, mode) is supposed to eventually lockup hard if called from scheduler as it does some locking, and we fixed that already half a year ago. Not sure how it's still unfixed in Jiri's codebase after so long, or if it's an issue specific to 3.10 and upstream gets away without this. diff --git a/kernel/ptrace.c b/kernel/ptrace.c index eb7862f185ff..4a8d0dd73c93 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -285,7 +285,8 @@ int ___ptrace_may_access(struct task_struct *tracer, gid_eq(caller_gid, tcred->sgid) && gid_eq(caller_gid, tcred->gid)) goto ok; - if (ptrace_has_cap(tcred->user_ns, mode)) + if (!(mode & PTRACE_MODE_NOACCESS_CHK) && + ptrace_has_cap(tcred->user_ns, mode)) goto ok; rcu_read_unlock(); return -EPERM; @@ -296,7 +297,8 @@ ok: dumpable = get_dumpable(task->mm); rcu_read_lock(); if (dumpable != SUID_DUMP_USER && - !ptrace_has_cap(__task_cred(task)->user_ns, mode)) { + ((mode & PTRACE_MODE_NOACCESS_CHK) || +!ptrace_has_cap(__task_cred(task)->user_ns, mode))) { rcu_read_unlock(); return -EPERM; }
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, Sep 05, 2018 at 08:58:23AM -0700, Andi Kleen wrote: > > So, after giving it a bit more thought, I still believe "I want spectre V2 > > protection" vs. "I do not care about spectre V2 on my system > > (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 > > of my patchset, including the ptrace check in switch_mm() (statically > > patched out on !IBPB-capable systems), and we can then later see whether > > the LSM implementation, once it exists, should be used instead. > > Please if you repost include plenty of performance numbers for multi threaded > workloads. It's ridiculous to even discuss this without them. Multi threaded workloads won't be affected because they share the memory in the first place... the check itself is lost in the noise too. Maybe you meant to ask for multiple parallel processes (multithreaded or not, zero difference) all with a different user id? What is more weird for me is to attempt to discuss the STIBP part of the patch without knowing which microcodes exactly implement STIBP in a way that is slow. Tim already said it's a measurable performance hit, but on some CPU it's zero performance hit. We don't even know if STIBP is actually useful or if it's a noop on those CPUs where it won't affect performance. Back to the IBPB, from implementation standpoint at least on 3.10 this code posted would lockup hard eventually and we got complains. ptrace_has_cap(tcred->user_ns, mode) is supposed to eventually lockup hard if called from scheduler as it does some locking, and we fixed that already half a year ago. Not sure how it's still unfixed in Jiri's codebase after so long, or if it's an issue specific to 3.10 and upstream gets away without this. diff --git a/kernel/ptrace.c b/kernel/ptrace.c index eb7862f185ff..4a8d0dd73c93 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -285,7 +285,8 @@ int ___ptrace_may_access(struct task_struct *tracer, gid_eq(caller_gid, tcred->sgid) && gid_eq(caller_gid, tcred->gid)) goto ok; - if (ptrace_has_cap(tcred->user_ns, mode)) + if (!(mode & PTRACE_MODE_NOACCESS_CHK) && + ptrace_has_cap(tcred->user_ns, mode)) goto ok; rcu_read_unlock(); return -EPERM; @@ -296,7 +297,8 @@ ok: dumpable = get_dumpable(task->mm); rcu_read_lock(); if (dumpable != SUID_DUMP_USER && - !ptrace_has_cap(__task_cred(task)->user_ns, mode)) { + ((mode & PTRACE_MODE_NOACCESS_CHK) || +!ptrace_has_cap(__task_cred(task)->user_ns, mode))) { rcu_read_unlock(); return -EPERM; }
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
> So, after giving it a bit more thought, I still believe "I want spectre V2 > protection" vs. "I do not care about spectre V2 on my system > (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 > of my patchset, including the ptrace check in switch_mm() (statically > patched out on !IBPB-capable systems), and we can then later see whether > the LSM implementation, once it exists, should be used instead. Please if you repost include plenty of performance numbers for multi threaded workloads. It's ridiculous to even discuss this without them. -Andi
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
> So, after giving it a bit more thought, I still believe "I want spectre V2 > protection" vs. "I do not care about spectre V2 on my system > (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 > of my patchset, including the ptrace check in switch_mm() (statically > patched out on !IBPB-capable systems), and we can then later see whether > the LSM implementation, once it exists, should be used instead. Please if you repost include plenty of performance numbers for multi threaded workloads. It's ridiculous to even discuss this without them. -Andi
RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
> -Original Message- > From: Peter Zijlstra [mailto:pet...@infradead.org] > Sent: Wednesday, September 05, 2018 1:00 AM > To: Jiri Kosina > Cc: Tim Chen ; Thomas Gleixner > ; Ingo Molnar ; Josh Poimboeuf > ; Andrea Arcangeli ; > Woodhouse, David ; Oleg Nesterov > ; Schaufler, Casey ; linux- > ker...@vger.kernel.org; x...@kernel.org > Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can > be applied on arbitrary tasks > > On Tue, Sep 04, 2018 at 07:35:29PM +0200, Jiri Kosina wrote: > > On Tue, 4 Sep 2018, Tim Chen wrote: > > > > > > Current ptrace_may_access() implementation assumes that the 'source' > task is > > > > always the caller (current). > > > > > > > > Expose ___ptrace_may_access() that can be used to apply the check on > arbitrary > > > > tasks. > > > > > > Casey recently has proposed putting the decision making of whether to > > > do IBPB in the security module. > > > > > > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4- > casey.schauf...@intel.com/ > > > > > > That will have the advantage of giving the administrator a more > > > flexibility > > > of when to turn on IBPB. The policy is very similar to what you have > proposed here > > > but I think the security module is a more appropriate place for the > > > security > policy. > > > > Yeah, well, honestly, I have a bit hard time buying the "generic > > sidechannel prevention security module" idea, given how completely > > different in nature all the mitigations have been so far. I don't see that > > trying to abstract this somehow provides more clarity. > > > > So if this should be done in LSM, it'd probably have to be written by > > someone else than me :) who actually understands how the "sidechannel > LSM" > > idea works. > > Yeah, I'm not convinced on LSM either. Lets just do these here patches > first and then Casey can try and convince us later. Works for me. There are advantages to doing it either way. The LSM approach allows you to consider implications of LSM data, which you can't do otherwise. Once the hook is available it becomes the obvious place to do other checks.
RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
> -Original Message- > From: Peter Zijlstra [mailto:pet...@infradead.org] > Sent: Wednesday, September 05, 2018 1:00 AM > To: Jiri Kosina > Cc: Tim Chen ; Thomas Gleixner > ; Ingo Molnar ; Josh Poimboeuf > ; Andrea Arcangeli ; > Woodhouse, David ; Oleg Nesterov > ; Schaufler, Casey ; linux- > ker...@vger.kernel.org; x...@kernel.org > Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can > be applied on arbitrary tasks > > On Tue, Sep 04, 2018 at 07:35:29PM +0200, Jiri Kosina wrote: > > On Tue, 4 Sep 2018, Tim Chen wrote: > > > > > > Current ptrace_may_access() implementation assumes that the 'source' > task is > > > > always the caller (current). > > > > > > > > Expose ___ptrace_may_access() that can be used to apply the check on > arbitrary > > > > tasks. > > > > > > Casey recently has proposed putting the decision making of whether to > > > do IBPB in the security module. > > > > > > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4- > casey.schauf...@intel.com/ > > > > > > That will have the advantage of giving the administrator a more > > > flexibility > > > of when to turn on IBPB. The policy is very similar to what you have > proposed here > > > but I think the security module is a more appropriate place for the > > > security > policy. > > > > Yeah, well, honestly, I have a bit hard time buying the "generic > > sidechannel prevention security module" idea, given how completely > > different in nature all the mitigations have been so far. I don't see that > > trying to abstract this somehow provides more clarity. > > > > So if this should be done in LSM, it'd probably have to be written by > > someone else than me :) who actually understands how the "sidechannel > LSM" > > idea works. > > Yeah, I'm not convinced on LSM either. Lets just do these here patches > first and then Casey can try and convince us later. Works for me. There are advantages to doing it either way. The LSM approach allows you to consider implications of LSM data, which you can't do otherwise. Once the hook is available it becomes the obvious place to do other checks.
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, Sep 04, 2018 at 07:35:29PM +0200, Jiri Kosina wrote: > On Tue, 4 Sep 2018, Tim Chen wrote: > > > > Current ptrace_may_access() implementation assumes that the 'source' task > > > is > > > always the caller (current). > > > > > > Expose ___ptrace_may_access() that can be used to apply the check on > > > arbitrary > > > tasks. > > > > Casey recently has proposed putting the decision making of whether to > > do IBPB in the security module. > > > > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-casey.schauf...@intel.com/ > > > > That will have the advantage of giving the administrator a more flexibility > > of when to turn on IBPB. The policy is very similar to what you have > > proposed here > > but I think the security module is a more appropriate place for the > > security policy. > > Yeah, well, honestly, I have a bit hard time buying the "generic > sidechannel prevention security module" idea, given how completely > different in nature all the mitigations have been so far. I don't see that > trying to abstract this somehow provides more clarity. > > So if this should be done in LSM, it'd probably have to be written by > someone else than me :) who actually understands how the "sidechannel LSM" > idea works. Yeah, I'm not convinced on LSM either. Lets just do these here patches first and then Casey can try and convince us later.
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, Sep 04, 2018 at 07:35:29PM +0200, Jiri Kosina wrote: > On Tue, 4 Sep 2018, Tim Chen wrote: > > > > Current ptrace_may_access() implementation assumes that the 'source' task > > > is > > > always the caller (current). > > > > > > Expose ___ptrace_may_access() that can be used to apply the check on > > > arbitrary > > > tasks. > > > > Casey recently has proposed putting the decision making of whether to > > do IBPB in the security module. > > > > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-casey.schauf...@intel.com/ > > > > That will have the advantage of giving the administrator a more flexibility > > of when to turn on IBPB. The policy is very similar to what you have > > proposed here > > but I think the security module is a more appropriate place for the > > security policy. > > Yeah, well, honestly, I have a bit hard time buying the "generic > sidechannel prevention security module" idea, given how completely > different in nature all the mitigations have been so far. I don't see that > trying to abstract this somehow provides more clarity. > > So if this should be done in LSM, it'd probably have to be written by > someone else than me :) who actually understands how the "sidechannel LSM" > idea works. Yeah, I'm not convinced on LSM either. Lets just do these here patches first and then Casey can try and convince us later.
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, Sep 04, 2018 at 04:40:57PM +0200, Jiri Kosina wrote: > From: Jiri Kosina > > Current ptrace_may_access() implementation assumes that the 'source' task is > always the caller (current). Note that now you 'fixed' the new user, this is still true. Which makes me think you do not in fact need this patch, right?
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, Sep 04, 2018 at 04:40:57PM +0200, Jiri Kosina wrote: > From: Jiri Kosina > > Current ptrace_may_access() implementation assumes that the 'source' task is > always the caller (current). Note that now you 'fixed' the new user, this is still true. Which makes me think you do not in fact need this patch, right?
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, 4 Sep 2018, Tim Chen wrote: > I think STIBP should be an opt in option as it will have significant > impact on performance. The attack from neighbor thread is pretty > difficult to pull off considering you have to know what the sibling > thread is running and its address allocation. In many scenarios the attacker can just easily taskset itself to the correct sibling. > We could also use a security module to opt in the STIBP policy. I am a bit afraid that we are offloading to sysadmins decisions that are very hard for them to make, as they require deep understanding of both the technical details of the security issue in the CPU, and the mitigation. I surely understand that Intel is doing what they could to minimize the performance effect, but achieving that by making it a rocket science to configure it properly doesn't feel right. So, after giving it a bit more thought, I still believe "I want spectre V2 protection" vs. "I do not care about spectre V2 on my system (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 of my patchset, including the ptrace check in switch_mm() (statically patched out on !IBPB-capable systems), and we can then later see whether the LSM implementation, once it exists, should be used instead. Thanks, -- Jiri Kosina SUSE Labs
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, 4 Sep 2018, Tim Chen wrote: > I think STIBP should be an opt in option as it will have significant > impact on performance. The attack from neighbor thread is pretty > difficult to pull off considering you have to know what the sibling > thread is running and its address allocation. In many scenarios the attacker can just easily taskset itself to the correct sibling. > We could also use a security module to opt in the STIBP policy. I am a bit afraid that we are offloading to sysadmins decisions that are very hard for them to make, as they require deep understanding of both the technical details of the security issue in the CPU, and the mitigation. I surely understand that Intel is doing what they could to minimize the performance effect, but achieving that by making it a rocket science to configure it properly doesn't feel right. So, after giving it a bit more thought, I still believe "I want spectre V2 protection" vs. "I do not care about spectre V2 on my system (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 of my patchset, including the ptrace check in switch_mm() (statically patched out on !IBPB-capable systems), and we can then later see whether the LSM implementation, once it exists, should be used instead. Thanks, -- Jiri Kosina SUSE Labs
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, Sep 05, 2018 at 01:00:37AM +, Schaufler, Casey wrote: > Sorry, I've been working in security too long for my > optimistic streak to be very wide. Eheh. So I was simply trying to follow in context, but it wasn't entirely clear, so I tried to take it out of context and then it was even less clear, never mind I was only looking for some more explanation. > Not especially, no. I have gotten considerable feedback that > it should be configurable, and while I may not see the value myself, > I do respect the people who've requested the configurability. Correct me if I'm wrong, but LSM as last word "module" implies to make this logic modular. My wondering is because: 1) why not just a single version of this logic in the scheduler? (i.e. can only be on or off or perhaps a only-ibpb-dumpable tweak to retain current slightly higher perf but less secure behavior) 2) even if you want a myriad of versions of this IBPB logic, how do the different versions can possibly fit in a whole "module" whose semantics have pretty much nothing to do with the other methods in the module like LSM_HOOK_INIT(capable, cap_capable), LSM_HOOK_INIT(vm_enough_memory, cap_vm_enough_memory), and LSM_HOOK_INIT(mmap_addr, cap_mmap_addr), or even LSM_HOOK_INIT(inode_follow_link, selinux_inode_follow_link), LSM_HOOK_INIT(inode_permission, selinux_inode_permission). I mean it looks an apple to oranges monolithic link in the same module. Or are you going to load this method in a separate module with only a single method implemented and then load multiple LSM modules at the same time? 3) if you need so much tweaking that not even a single off/on switch independent of any module loaded through LSM is not enough, how is it ok that all the rest shouldn't be configurable? All the rest has more performance impact than this one so it'd start from there if something. I understand how configurablity is valuable (this is why we kept everything dynamic tunable at runtime, including the PTI stop machine to allow even PTI TLB flushes to be turned off). I'm just trying to understand how IBPB fits in a LSM module implementation. Even if you build with CONFIG_SECURITY=n PTI won't go away, retpoline won't go away, the l1flush in vmenter won't go away, the pte/transhugepmd inversion won't go away, why only the runtime tunability or tweaking of IBPB fits in a LSM module? > If they provide different answers there should be different > functions. It's a question of viewpoint. If you don't care about > the SELinux viewpoint you shouldn't have to include it in your > checks, any more than you care about x86 issues when you're > running on a MIPS processor. I don't see how selinux fits into this. Selinux doesn't affect virtual memory protection. Not having selinux even built in doesn't mean you are ok with virtual memory protection not being provided by the CPU. Or in other words selinux fits into this as much as seccomp or apparmor fits into this. This is just like a memory barrier mb() to be executed in the context switch code. Security issues can emerge with the lack of it regardless if selinux is built in and enabled or CONFIG_SECURITY=n. selinux matters after an exploit already happened, this as opposed is needed to prevent the exploit in the first place. Also the correct fully secure version provides a single answer (i.e. in theory assuming a perfect implementation that didn't forget anything so having a single implementation will also increase the chances that we get as close as possible to the theoretical correct implementation). If it provides different answers in this case it is because somebody wants not perfect security to provide higher performance, i.e. the "configurablity" which is valuable and fine feature to provide. Just a LSM module doesn't seem the most flexibile way to provide configurability by far. > Yes, even security people have to worry about locking. Yes it was some lock that if contended would lockup if used from inside the scheduler. > What *is* the most robust way? The below pretty much explains it. > Yes, there are locking issues. The code in the LSM infrastructure and in > the security modules needs to be aware of that and deal with it. The SELinux > code I proposed is more complex than it could be because the audit code > requires locking that doesn't work in the switching context. Or in other words having multiple versions of this called from within the scheduler is a bit like making the kernel modular and then because the locking is subtle you may then have scheduler deadlocks only happening with some kernel module loaded instead of others, but the real question is what is the payoff compared to just allowing the scheduler code to be tuned with x86 debugfs or sysctl the normal way. Also how would a new common code method in LSM fit for CPUs that are so slow that they can't possibly need anything like IBPB and flush_RSB()? Thanks! Andrea
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Wed, Sep 05, 2018 at 01:00:37AM +, Schaufler, Casey wrote: > Sorry, I've been working in security too long for my > optimistic streak to be very wide. Eheh. So I was simply trying to follow in context, but it wasn't entirely clear, so I tried to take it out of context and then it was even less clear, never mind I was only looking for some more explanation. > Not especially, no. I have gotten considerable feedback that > it should be configurable, and while I may not see the value myself, > I do respect the people who've requested the configurability. Correct me if I'm wrong, but LSM as last word "module" implies to make this logic modular. My wondering is because: 1) why not just a single version of this logic in the scheduler? (i.e. can only be on or off or perhaps a only-ibpb-dumpable tweak to retain current slightly higher perf but less secure behavior) 2) even if you want a myriad of versions of this IBPB logic, how do the different versions can possibly fit in a whole "module" whose semantics have pretty much nothing to do with the other methods in the module like LSM_HOOK_INIT(capable, cap_capable), LSM_HOOK_INIT(vm_enough_memory, cap_vm_enough_memory), and LSM_HOOK_INIT(mmap_addr, cap_mmap_addr), or even LSM_HOOK_INIT(inode_follow_link, selinux_inode_follow_link), LSM_HOOK_INIT(inode_permission, selinux_inode_permission). I mean it looks an apple to oranges monolithic link in the same module. Or are you going to load this method in a separate module with only a single method implemented and then load multiple LSM modules at the same time? 3) if you need so much tweaking that not even a single off/on switch independent of any module loaded through LSM is not enough, how is it ok that all the rest shouldn't be configurable? All the rest has more performance impact than this one so it'd start from there if something. I understand how configurablity is valuable (this is why we kept everything dynamic tunable at runtime, including the PTI stop machine to allow even PTI TLB flushes to be turned off). I'm just trying to understand how IBPB fits in a LSM module implementation. Even if you build with CONFIG_SECURITY=n PTI won't go away, retpoline won't go away, the l1flush in vmenter won't go away, the pte/transhugepmd inversion won't go away, why only the runtime tunability or tweaking of IBPB fits in a LSM module? > If they provide different answers there should be different > functions. It's a question of viewpoint. If you don't care about > the SELinux viewpoint you shouldn't have to include it in your > checks, any more than you care about x86 issues when you're > running on a MIPS processor. I don't see how selinux fits into this. Selinux doesn't affect virtual memory protection. Not having selinux even built in doesn't mean you are ok with virtual memory protection not being provided by the CPU. Or in other words selinux fits into this as much as seccomp or apparmor fits into this. This is just like a memory barrier mb() to be executed in the context switch code. Security issues can emerge with the lack of it regardless if selinux is built in and enabled or CONFIG_SECURITY=n. selinux matters after an exploit already happened, this as opposed is needed to prevent the exploit in the first place. Also the correct fully secure version provides a single answer (i.e. in theory assuming a perfect implementation that didn't forget anything so having a single implementation will also increase the chances that we get as close as possible to the theoretical correct implementation). If it provides different answers in this case it is because somebody wants not perfect security to provide higher performance, i.e. the "configurablity" which is valuable and fine feature to provide. Just a LSM module doesn't seem the most flexibile way to provide configurability by far. > Yes, even security people have to worry about locking. Yes it was some lock that if contended would lockup if used from inside the scheduler. > What *is* the most robust way? The below pretty much explains it. > Yes, there are locking issues. The code in the LSM infrastructure and in > the security modules needs to be aware of that and deal with it. The SELinux > code I proposed is more complex than it could be because the audit code > requires locking that doesn't work in the switching context. Or in other words having multiple versions of this called from within the scheduler is a bit like making the kernel modular and then because the locking is subtle you may then have scheduler deadlocks only happening with some kernel module loaded instead of others, but the real question is what is the payoff compared to just allowing the scheduler code to be tuned with x86 debugfs or sysctl the normal way. Also how would a new common code method in LSM fit for CPUs that are so slow that they can't possibly need anything like IBPB and flush_RSB()? Thanks! Andrea
RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
> -Original Message- > From: Andrea Arcangeli [mailto:aarca...@redhat.com] > Sent: Tuesday, September 04, 2018 4:37 PM > To: Schaufler, Casey > Cc: Jiri Kosina ; Tim Chen ; > Thomas Gleixner ; Ingo Molnar ; > Peter Zijlstra ; Josh Poimboeuf > ; Woodhouse, David ; Oleg > Nesterov ; linux-kernel@vger.kernel.org; x...@kernel.org > Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can > be applied on arbitrary tasks > > Hello, > > On Tue, Sep 04, 2018 at 06:10:47PM +, Schaufler, Casey wrote: > > The real reason to use an LSM based approach is that overloading ptrace > > checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a > > processor interface. Even if ptrace_may_access() does exactly what you > > "Side channel is a processor interface" doesn't make me optimistic, > but I assume you're not speaking for Intel. Sorry, I've been working in security too long for my optimistic streak to be very wide. > > want it to for side-channel mitigation today it would be incredibly > > inappropriate to tie the two together for eternity. You don't want to > > restrict > > the ptrace checks to only those things that are also good for side-channel, > > and vice versa. > > It seems like you want to make this more configurable, Not especially, no. I have gotten considerable feedback that it should be configurable, and while I may not see the value myself, I do respect the people who've requested the configurability. > we have all > debugfs x86 specific tweaks to disable IBPB at runtime and we don't > allow a runtime opt-out of IBPB alone. > > If you shutdown either IBRS or retpolines at runtime with debugfs, > then IBPB goes away too. > > Giving a finegrined way to disable only IBPB we found was overkill > because IBPB has never been measurable if done only when the prev task > cannot ptrace the next task (which is a superset of the too weak > upstream not dumpable check, but still not a performance issue). > > Allowing IBPB runtime opt-out doesn't make much sense if you don't > allow to disable retpolines too still at runtime. And disabling > retpolines from LSM doesn't sounds the right place, it's an x86 > temporary quirk only relevant for current buggy CPUs. > > There should be a function that decides when IBPB and flush_RSB should > be run (flush_RSB has to be run if SMEP because with SMEP there's no > need to run flush_RSB at every kernel entry anymore), and that > function happens to check ptrace today, but it would check other stuff > too if we had other interfaces besides ptrace that would allow the > prev task to read the memory of the next task. So it's not so much > about ptrace nor about IBPB, it's about "IBPB_RSB" vs "prev task > can read memory of next task". Then each arch can implement > "IBPB_RSB" method its own way but the check is for the common > code and it should be in the scheduler and there's just 1 version of > this check needed. Right, I get that. There's more to ptrace access than "I can read the other task". There's more to what the processor is up to than IBPB. > > I don't think there should be a ton of different versions of this > function each providing a different answer, which is what LSM would > provide. If they provide different answers there should be different functions. It's a question of viewpoint. If you don't care about the SELinux viewpoint you shouldn't have to include it in your checks, any more than you care about x86 issues when you're running on a MIPS processor. > At most you can add a x86 debugfs tweak to shut off the logic but > again it would make more sense if retpolines could be shut off at > runtime too, doing it just for IBPB sounds backwards because it has > such an unmeasurable overhead. > > > Yes. That would be me. > > Even the attempt to make an innocuous call like > ptrace_has_cap(tcred->user_ns, mode) will eventually > deadlock there. Which is yet another reason there needs to be separate logic. > I used a PTRACE_MODE_ check that the ptrace check code uses to filter > out specific calls that may eventually enter LSM and hard lockup in > non reproducible workloads (some false positive IBPB is ok, especially > if it avoids a deadlock). Yes, even security people have to worry about locking. > Everything can be fixed in any way, but calling LSM from scheduler > code doesn't sound the most robust thing to do in general because what > works outside the scheduler doesn't work from within those semi atomic > regions (it tends to work initially until it eventually > deadlocks). The original code of such check, had all sort of deadlocks > because of that. What *is* the most robust way? Yes, there are locking issues. The code in the LSM infrastructure and in the security modules needs to be aware of that and deal with it. The SELinux code I proposed is more complex than it could be because the audit code requires locking that doesn't work in the switching context. > Thanks, > Andrea
RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
> -Original Message- > From: Andrea Arcangeli [mailto:aarca...@redhat.com] > Sent: Tuesday, September 04, 2018 4:37 PM > To: Schaufler, Casey > Cc: Jiri Kosina ; Tim Chen ; > Thomas Gleixner ; Ingo Molnar ; > Peter Zijlstra ; Josh Poimboeuf > ; Woodhouse, David ; Oleg > Nesterov ; linux-kernel@vger.kernel.org; x...@kernel.org > Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can > be applied on arbitrary tasks > > Hello, > > On Tue, Sep 04, 2018 at 06:10:47PM +, Schaufler, Casey wrote: > > The real reason to use an LSM based approach is that overloading ptrace > > checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a > > processor interface. Even if ptrace_may_access() does exactly what you > > "Side channel is a processor interface" doesn't make me optimistic, > but I assume you're not speaking for Intel. Sorry, I've been working in security too long for my optimistic streak to be very wide. > > want it to for side-channel mitigation today it would be incredibly > > inappropriate to tie the two together for eternity. You don't want to > > restrict > > the ptrace checks to only those things that are also good for side-channel, > > and vice versa. > > It seems like you want to make this more configurable, Not especially, no. I have gotten considerable feedback that it should be configurable, and while I may not see the value myself, I do respect the people who've requested the configurability. > we have all > debugfs x86 specific tweaks to disable IBPB at runtime and we don't > allow a runtime opt-out of IBPB alone. > > If you shutdown either IBRS or retpolines at runtime with debugfs, > then IBPB goes away too. > > Giving a finegrined way to disable only IBPB we found was overkill > because IBPB has never been measurable if done only when the prev task > cannot ptrace the next task (which is a superset of the too weak > upstream not dumpable check, but still not a performance issue). > > Allowing IBPB runtime opt-out doesn't make much sense if you don't > allow to disable retpolines too still at runtime. And disabling > retpolines from LSM doesn't sounds the right place, it's an x86 > temporary quirk only relevant for current buggy CPUs. > > There should be a function that decides when IBPB and flush_RSB should > be run (flush_RSB has to be run if SMEP because with SMEP there's no > need to run flush_RSB at every kernel entry anymore), and that > function happens to check ptrace today, but it would check other stuff > too if we had other interfaces besides ptrace that would allow the > prev task to read the memory of the next task. So it's not so much > about ptrace nor about IBPB, it's about "IBPB_RSB" vs "prev task > can read memory of next task". Then each arch can implement > "IBPB_RSB" method its own way but the check is for the common > code and it should be in the scheduler and there's just 1 version of > this check needed. Right, I get that. There's more to ptrace access than "I can read the other task". There's more to what the processor is up to than IBPB. > > I don't think there should be a ton of different versions of this > function each providing a different answer, which is what LSM would > provide. If they provide different answers there should be different functions. It's a question of viewpoint. If you don't care about the SELinux viewpoint you shouldn't have to include it in your checks, any more than you care about x86 issues when you're running on a MIPS processor. > At most you can add a x86 debugfs tweak to shut off the logic but > again it would make more sense if retpolines could be shut off at > runtime too, doing it just for IBPB sounds backwards because it has > such an unmeasurable overhead. > > > Yes. That would be me. > > Even the attempt to make an innocuous call like > ptrace_has_cap(tcred->user_ns, mode) will eventually > deadlock there. Which is yet another reason there needs to be separate logic. > I used a PTRACE_MODE_ check that the ptrace check code uses to filter > out specific calls that may eventually enter LSM and hard lockup in > non reproducible workloads (some false positive IBPB is ok, especially > if it avoids a deadlock). Yes, even security people have to worry about locking. > Everything can be fixed in any way, but calling LSM from scheduler > code doesn't sound the most robust thing to do in general because what > works outside the scheduler doesn't work from within those semi atomic > regions (it tends to work initially until it eventually > deadlocks). The original code of such check, had all sort of deadlocks > because of that. What *is* the most robust way? Yes, there are locking issues. The code in the LSM infrastructure and in the security modules needs to be aware of that and deal with it. The SELinux code I proposed is more complex than it could be because the audit code requires locking that doesn't work in the switching context. > Thanks, > Andrea
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
Hello, On Tue, Sep 04, 2018 at 06:10:47PM +, Schaufler, Casey wrote: > The real reason to use an LSM based approach is that overloading ptrace > checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a > processor interface. Even if ptrace_may_access() does exactly what you "Side channel is a processor interface" doesn't make me optimistic, but I assume you're not speaking for Intel. > want it to for side-channel mitigation today it would be incredibly > inappropriate to tie the two together for eternity. You don't want to restrict > the ptrace checks to only those things that are also good for side-channel, > and vice versa. It seems like you want to make this more configurable, we have all debugfs x86 specific tweaks to disable IBPB at runtime and we don't allow a runtime opt-out of IBPB alone. If you shutdown either IBRS or retpolines at runtime with debugfs, then IBPB goes away too. Giving a finegrined way to disable only IBPB we found was overkill because IBPB has never been measurable if done only when the prev task cannot ptrace the next task (which is a superset of the too weak upstream not dumpable check, but still not a performance issue). Allowing IBPB runtime opt-out doesn't make much sense if you don't allow to disable retpolines too still at runtime. And disabling retpolines from LSM doesn't sounds the right place, it's an x86 temporary quirk only relevant for current buggy CPUs. There should be a function that decides when IBPB and flush_RSB should be run (flush_RSB has to be run if SMEP because with SMEP there's no need to run flush_RSB at every kernel entry anymore), and that function happens to check ptrace today, but it would check other stuff too if we had other interfaces besides ptrace that would allow the prev task to read the memory of the next task. So it's not so much about ptrace nor about IBPB, it's about "IBPB_RSB" vs "prev task can read memory of next task". Then each arch can implement "IBPB_RSB" method its own way but the check is for the common code and it should be in the scheduler and there's just 1 version of this check needed. I don't think there should be a ton of different versions of this function each providing a different answer, which is what LSM would provide. At most you can add a x86 debugfs tweak to shut off the logic but again it would make more sense if retpolines could be shut off at runtime too, doing it just for IBPB sounds backwards because it has such an unmeasurable overhead. > Yes. That would be me. Even the attempt to make an innocuous call like ptrace_has_cap(tcred->user_ns, mode) will eventually deadlock there. I used a PTRACE_MODE_ check that the ptrace check code uses to filter out specific calls that may eventually enter LSM and hard lockup in non reproducible workloads (some false positive IBPB is ok, especially if it avoids a deadlock). Everything can be fixed in any way, but calling LSM from scheduler code doesn't sound the most robust thing to do in general because what works outside the scheduler doesn't work from within those semi atomic regions (it tends to work initially until it eventually deadlocks). The original code of such check, had all sort of deadlocks because of that. Thanks, Andrea
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
Hello, On Tue, Sep 04, 2018 at 06:10:47PM +, Schaufler, Casey wrote: > The real reason to use an LSM based approach is that overloading ptrace > checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a > processor interface. Even if ptrace_may_access() does exactly what you "Side channel is a processor interface" doesn't make me optimistic, but I assume you're not speaking for Intel. > want it to for side-channel mitigation today it would be incredibly > inappropriate to tie the two together for eternity. You don't want to restrict > the ptrace checks to only those things that are also good for side-channel, > and vice versa. It seems like you want to make this more configurable, we have all debugfs x86 specific tweaks to disable IBPB at runtime and we don't allow a runtime opt-out of IBPB alone. If you shutdown either IBRS or retpolines at runtime with debugfs, then IBPB goes away too. Giving a finegrined way to disable only IBPB we found was overkill because IBPB has never been measurable if done only when the prev task cannot ptrace the next task (which is a superset of the too weak upstream not dumpable check, but still not a performance issue). Allowing IBPB runtime opt-out doesn't make much sense if you don't allow to disable retpolines too still at runtime. And disabling retpolines from LSM doesn't sounds the right place, it's an x86 temporary quirk only relevant for current buggy CPUs. There should be a function that decides when IBPB and flush_RSB should be run (flush_RSB has to be run if SMEP because with SMEP there's no need to run flush_RSB at every kernel entry anymore), and that function happens to check ptrace today, but it would check other stuff too if we had other interfaces besides ptrace that would allow the prev task to read the memory of the next task. So it's not so much about ptrace nor about IBPB, it's about "IBPB_RSB" vs "prev task can read memory of next task". Then each arch can implement "IBPB_RSB" method its own way but the check is for the common code and it should be in the scheduler and there's just 1 version of this check needed. I don't think there should be a ton of different versions of this function each providing a different answer, which is what LSM would provide. At most you can add a x86 debugfs tweak to shut off the logic but again it would make more sense if retpolines could be shut off at runtime too, doing it just for IBPB sounds backwards because it has such an unmeasurable overhead. > Yes. That would be me. Even the attempt to make an innocuous call like ptrace_has_cap(tcred->user_ns, mode) will eventually deadlock there. I used a PTRACE_MODE_ check that the ptrace check code uses to filter out specific calls that may eventually enter LSM and hard lockup in non reproducible workloads (some false positive IBPB is ok, especially if it avoids a deadlock). Everything can be fixed in any way, but calling LSM from scheduler code doesn't sound the most robust thing to do in general because what works outside the scheduler doesn't work from within those semi atomic regions (it tends to work initially until it eventually deadlocks). The original code of such check, had all sort of deadlocks because of that. Thanks, Andrea
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On 09/04/2018 11:48 AM, Jiri Kosina wrote: > On Tue, 4 Sep 2018, Schaufler, Casey wrote: > >>> So if this should be done in LSM, it'd probably have to be written by >>> someone else than me :) who actually understands how the "sidechannel LSM" >>> idea works. >> >> Yes. That would be me. > > Ok, cool. Then 1/2 and 2/3 can be ignored / replaced by Casey's LSM stuff. > > Some form of 3/3 still should be merged independently on that. I think STIBP should be an opt in option as it will have significant impact on performance. The attack from neighbor thread is pretty difficult to pull off considering you have to know what the sibling thread is running and its address allocation. We could also use a security module to opt in the STIBP policy. Tim > > Thanks, >
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On 09/04/2018 11:48 AM, Jiri Kosina wrote: > On Tue, 4 Sep 2018, Schaufler, Casey wrote: > >>> So if this should be done in LSM, it'd probably have to be written by >>> someone else than me :) who actually understands how the "sidechannel LSM" >>> idea works. >> >> Yes. That would be me. > > Ok, cool. Then 1/2 and 2/3 can be ignored / replaced by Casey's LSM stuff. > > Some form of 3/3 still should be merged independently on that. I think STIBP should be an opt in option as it will have significant impact on performance. The attack from neighbor thread is pretty difficult to pull off considering you have to know what the sibling thread is running and its address allocation. We could also use a security module to opt in the STIBP policy. Tim > > Thanks, >
RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, 4 Sep 2018, Schaufler, Casey wrote: > > So if this should be done in LSM, it'd probably have to be written by > > someone else than me :) who actually understands how the "sidechannel LSM" > > idea works. > > Yes. That would be me. Ok, cool. Then 1/2 and 2/3 can be ignored / replaced by Casey's LSM stuff. Some form of 3/3 still should be merged independently on that. Thanks, -- Jiri Kosina SUSE Labs
RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, 4 Sep 2018, Schaufler, Casey wrote: > > So if this should be done in LSM, it'd probably have to be written by > > someone else than me :) who actually understands how the "sidechannel LSM" > > idea works. > > Yes. That would be me. Ok, cool. Then 1/2 and 2/3 can be ignored / replaced by Casey's LSM stuff. Some form of 3/3 still should be merged independently on that. Thanks, -- Jiri Kosina SUSE Labs
RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
> -Original Message- > From: Jiri Kosina [mailto:ji...@kernel.org] > Sent: Tuesday, September 04, 2018 10:35 AM > To: Tim Chen > Cc: Thomas Gleixner ; Ingo Molnar ; > Peter Zijlstra ; Josh Poimboeuf > ; Andrea Arcangeli ; > Woodhouse, David ; Oleg Nesterov > ; Schaufler, Casey ; linux- > ker...@vger.kernel.org; x...@kernel.org > Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can > be applied on arbitrary tasks > > On Tue, 4 Sep 2018, Tim Chen wrote: > > > > Current ptrace_may_access() implementation assumes that the 'source' > task is > > > always the caller (current). > > > > > > Expose ___ptrace_may_access() that can be used to apply the check on > arbitrary > > > tasks. > > > > Casey recently has proposed putting the decision making of whether to > > do IBPB in the security module. > > > > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4- > casey.schauf...@intel.com/ > > > > That will have the advantage of giving the administrator a more flexibility > > of when to turn on IBPB. The policy is very similar to what you have > > proposed > here > > but I think the security module is a more appropriate place for the security > policy. > > Yeah, well, honestly, I have a bit hard time buying the "generic > sidechannel prevention security module" idea, given how completely > different in nature all the mitigations have been so far. I don't see that > trying to abstract this somehow provides more clarity. The real reason to use an LSM based approach is that overloading ptrace checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a processor interface. Even if ptrace_may_access() does exactly what you want it to for side-channel mitigation today it would be incredibly inappropriate to tie the two together for eternity. You don't want to restrict the ptrace checks to only those things that are also good for side-channel, and vice versa. > So if this should be done in LSM, it'd probably have to be written by > someone else than me :) who actually understands how the "sidechannel LSM" > idea works. Yes. That would be me. > > Thanks, > > -- > Jiri Kosina > SUSE Labs
RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
> -Original Message- > From: Jiri Kosina [mailto:ji...@kernel.org] > Sent: Tuesday, September 04, 2018 10:35 AM > To: Tim Chen > Cc: Thomas Gleixner ; Ingo Molnar ; > Peter Zijlstra ; Josh Poimboeuf > ; Andrea Arcangeli ; > Woodhouse, David ; Oleg Nesterov > ; Schaufler, Casey ; linux- > ker...@vger.kernel.org; x...@kernel.org > Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can > be applied on arbitrary tasks > > On Tue, 4 Sep 2018, Tim Chen wrote: > > > > Current ptrace_may_access() implementation assumes that the 'source' > task is > > > always the caller (current). > > > > > > Expose ___ptrace_may_access() that can be used to apply the check on > arbitrary > > > tasks. > > > > Casey recently has proposed putting the decision making of whether to > > do IBPB in the security module. > > > > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4- > casey.schauf...@intel.com/ > > > > That will have the advantage of giving the administrator a more flexibility > > of when to turn on IBPB. The policy is very similar to what you have > > proposed > here > > but I think the security module is a more appropriate place for the security > policy. > > Yeah, well, honestly, I have a bit hard time buying the "generic > sidechannel prevention security module" idea, given how completely > different in nature all the mitigations have been so far. I don't see that > trying to abstract this somehow provides more clarity. The real reason to use an LSM based approach is that overloading ptrace checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a processor interface. Even if ptrace_may_access() does exactly what you want it to for side-channel mitigation today it would be incredibly inappropriate to tie the two together for eternity. You don't want to restrict the ptrace checks to only those things that are also good for side-channel, and vice versa. > So if this should be done in LSM, it'd probably have to be written by > someone else than me :) who actually understands how the "sidechannel LSM" > idea works. Yes. That would be me. > > Thanks, > > -- > Jiri Kosina > SUSE Labs
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, 4 Sep 2018, Tim Chen wrote: > > Current ptrace_may_access() implementation assumes that the 'source' task is > > always the caller (current). > > > > Expose ___ptrace_may_access() that can be used to apply the check on > > arbitrary > > tasks. > > Casey recently has proposed putting the decision making of whether to > do IBPB in the security module. > > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-casey.schauf...@intel.com/ > > That will have the advantage of giving the administrator a more flexibility > of when to turn on IBPB. The policy is very similar to what you have > proposed here > but I think the security module is a more appropriate place for the security > policy. Yeah, well, honestly, I have a bit hard time buying the "generic sidechannel prevention security module" idea, given how completely different in nature all the mitigations have been so far. I don't see that trying to abstract this somehow provides more clarity. So if this should be done in LSM, it'd probably have to be written by someone else than me :) who actually understands how the "sidechannel LSM" idea works. Thanks, -- Jiri Kosina SUSE Labs
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, 4 Sep 2018, Tim Chen wrote: > > Current ptrace_may_access() implementation assumes that the 'source' task is > > always the caller (current). > > > > Expose ___ptrace_may_access() that can be used to apply the check on > > arbitrary > > tasks. > > Casey recently has proposed putting the decision making of whether to > do IBPB in the security module. > > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-casey.schauf...@intel.com/ > > That will have the advantage of giving the administrator a more flexibility > of when to turn on IBPB. The policy is very similar to what you have > proposed here > but I think the security module is a more appropriate place for the security > policy. Yeah, well, honestly, I have a bit hard time buying the "generic sidechannel prevention security module" idea, given how completely different in nature all the mitigations have been so far. I don't see that trying to abstract this somehow provides more clarity. So if this should be done in LSM, it'd probably have to be written by someone else than me :) who actually understands how the "sidechannel LSM" idea works. Thanks, -- Jiri Kosina SUSE Labs
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On 09/04/2018 07:40 AM, Jiri Kosina wrote: > From: Jiri Kosina > > Current ptrace_may_access() implementation assumes that the 'source' task is > always the caller (current). > > Expose ___ptrace_may_access() that can be used to apply the check on arbitrary > tasks. Casey recently has proposed putting the decision making of whether to do IBPB in the security module. https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-casey.schauf...@intel.com/ That will have the advantage of giving the administrator a more flexibility of when to turn on IBPB. The policy is very similar to what you have proposed here but I think the security module is a more appropriate place for the security policy. Thanks. Tim > > Originally-by: Tim Chen > Signed-off-by: Jiri Kosina > --- > > Sorry for the resend, my pine is buggered and broke threading. > > include/linux/ptrace.h | 12 > kernel/ptrace.c| 13 ++--- > 2 files changed, 22 insertions(+), 3 deletions(-) > > diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h > index 4f36431c380b..09744d4113fb 100644 > --- a/include/linux/ptrace.h > +++ b/include/linux/ptrace.h > @@ -87,6 +87,18 @@ extern void exit_ptrace(struct task_struct *tracer, struct > list_head *dead); > */ > extern bool ptrace_may_access(struct task_struct *task, unsigned int mode); > > +/** > + * ___ptrace_may_access - variant of ptrace_may_access that can be used > + * between two arbitrary tasks > + * @curr: source task > + * @task: target task > + * @mode: selects type of access and caller credentials > + * > + * Returns true on success, false on denial. > + */ > +extern int ___ptrace_may_access(struct task_struct *curr, struct task_struct > *task, > + unsigned int mode); > + > static inline int ptrace_reparented(struct task_struct *child) > { > return !same_thread_group(child->real_parent, child->parent); > diff --git a/kernel/ptrace.c b/kernel/ptrace.c > index 21fec73d45d4..07ff6685ebed 100644 > --- a/kernel/ptrace.c > +++ b/kernel/ptrace.c > @@ -268,9 +268,10 @@ static int ptrace_has_cap(struct user_namespace *ns, > unsigned int mode) > } > > /* Returns 0 on success, -errno on denial. */ > -static int __ptrace_may_access(struct task_struct *task, unsigned int mode) > +int ___ptrace_may_access(struct task_struct *curr, struct task_struct *task, > + unsigned int mode) > { > - const struct cred *cred = current_cred(), *tcred; > + const struct cred *cred, *tcred; > struct mm_struct *mm; > kuid_t caller_uid; > kgid_t caller_gid; > @@ -290,9 +291,10 @@ static int __ptrace_may_access(struct task_struct *task, > unsigned int mode) >*/ > > /* Don't let security modules deny introspection */ > - if (same_thread_group(task, current)) > + if (same_thread_group(task, curr)) > return 0; > rcu_read_lock(); > + cred = __task_cred(curr); > if (mode & PTRACE_MODE_FSCREDS) { > caller_uid = cred->fsuid; > caller_gid = cred->fsgid; > @@ -331,6 +333,11 @@ static int __ptrace_may_access(struct task_struct *task, > unsigned int mode) > return security_ptrace_access_check(task, mode); > } > > +static int __ptrace_may_access(struct task_struct *task, unsigned int mode) > +{ > + return ___ptrace_may_access(current, task, mode); > +} > + > bool ptrace_may_access(struct task_struct *task, unsigned int mode) > { > int err; >
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On 09/04/2018 07:40 AM, Jiri Kosina wrote: > From: Jiri Kosina > > Current ptrace_may_access() implementation assumes that the 'source' task is > always the caller (current). > > Expose ___ptrace_may_access() that can be used to apply the check on arbitrary > tasks. Casey recently has proposed putting the decision making of whether to do IBPB in the security module. https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-casey.schauf...@intel.com/ That will have the advantage of giving the administrator a more flexibility of when to turn on IBPB. The policy is very similar to what you have proposed here but I think the security module is a more appropriate place for the security policy. Thanks. Tim > > Originally-by: Tim Chen > Signed-off-by: Jiri Kosina > --- > > Sorry for the resend, my pine is buggered and broke threading. > > include/linux/ptrace.h | 12 > kernel/ptrace.c| 13 ++--- > 2 files changed, 22 insertions(+), 3 deletions(-) > > diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h > index 4f36431c380b..09744d4113fb 100644 > --- a/include/linux/ptrace.h > +++ b/include/linux/ptrace.h > @@ -87,6 +87,18 @@ extern void exit_ptrace(struct task_struct *tracer, struct > list_head *dead); > */ > extern bool ptrace_may_access(struct task_struct *task, unsigned int mode); > > +/** > + * ___ptrace_may_access - variant of ptrace_may_access that can be used > + * between two arbitrary tasks > + * @curr: source task > + * @task: target task > + * @mode: selects type of access and caller credentials > + * > + * Returns true on success, false on denial. > + */ > +extern int ___ptrace_may_access(struct task_struct *curr, struct task_struct > *task, > + unsigned int mode); > + > static inline int ptrace_reparented(struct task_struct *child) > { > return !same_thread_group(child->real_parent, child->parent); > diff --git a/kernel/ptrace.c b/kernel/ptrace.c > index 21fec73d45d4..07ff6685ebed 100644 > --- a/kernel/ptrace.c > +++ b/kernel/ptrace.c > @@ -268,9 +268,10 @@ static int ptrace_has_cap(struct user_namespace *ns, > unsigned int mode) > } > > /* Returns 0 on success, -errno on denial. */ > -static int __ptrace_may_access(struct task_struct *task, unsigned int mode) > +int ___ptrace_may_access(struct task_struct *curr, struct task_struct *task, > + unsigned int mode) > { > - const struct cred *cred = current_cred(), *tcred; > + const struct cred *cred, *tcred; > struct mm_struct *mm; > kuid_t caller_uid; > kgid_t caller_gid; > @@ -290,9 +291,10 @@ static int __ptrace_may_access(struct task_struct *task, > unsigned int mode) >*/ > > /* Don't let security modules deny introspection */ > - if (same_thread_group(task, current)) > + if (same_thread_group(task, curr)) > return 0; > rcu_read_lock(); > + cred = __task_cred(curr); > if (mode & PTRACE_MODE_FSCREDS) { > caller_uid = cred->fsuid; > caller_gid = cred->fsgid; > @@ -331,6 +333,11 @@ static int __ptrace_may_access(struct task_struct *task, > unsigned int mode) > return security_ptrace_access_check(task, mode); > } > > +static int __ptrace_may_access(struct task_struct *task, unsigned int mode) > +{ > + return ___ptrace_may_access(current, task, mode); > +} > + > bool ptrace_may_access(struct task_struct *task, unsigned int mode) > { > int err; >
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, 4 Sep 2018, Thomas Gleixner wrote: > On Tue, 4 Sep 2018, Jiri Kosina wrote: > > +/** > > + * ___ptrace_may_access - variant of ptrace_may_access that can be used > > + * between two arbitrary tasks > > + * @curr: source task > > That's a pretty shitty name, really. If @curr is supposed to be an > arbitrary task, then please rename it in a way which makes that entirely > clear. curr is too close to current. Just grep for 'task_struct \*curr'. Something like '*tracer' would make it entirely clear what's going on. Thanks, tglx
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, 4 Sep 2018, Thomas Gleixner wrote: > On Tue, 4 Sep 2018, Jiri Kosina wrote: > > +/** > > + * ___ptrace_may_access - variant of ptrace_may_access that can be used > > + * between two arbitrary tasks > > + * @curr: source task > > That's a pretty shitty name, really. If @curr is supposed to be an > arbitrary task, then please rename it in a way which makes that entirely > clear. curr is too close to current. Just grep for 'task_struct \*curr'. Something like '*tracer' would make it entirely clear what's going on. Thanks, tglx
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, 4 Sep 2018, Jiri Kosina wrote: > +/** > + * ___ptrace_may_access - variant of ptrace_may_access that can be used > + * between two arbitrary tasks > + * @curr: source task That's a pretty shitty name, really. If @curr is supposed to be an arbitrary task, then please rename it in a way which makes that entirely clear. curr is too close to current. Just grep for 'task_struct \*curr'. Thanks, tglx
Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks
On Tue, 4 Sep 2018, Jiri Kosina wrote: > +/** > + * ___ptrace_may_access - variant of ptrace_may_access that can be used > + * between two arbitrary tasks > + * @curr: source task That's a pretty shitty name, really. If @curr is supposed to be an arbitrary task, then please rename it in a way which makes that entirely clear. curr is too close to current. Just grep for 'task_struct \*curr'. Thanks, tglx