subject:"RE\: \[PATCH v3 1\/3\] ptrace\: Provide ___ptrace_may_access\(\) that can be applied on arbitrary tasks"

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Jiri Kosina

On Wed, 5 Sep 2018, Andrea Arcangeli wrote:

> [ 1838.621010] Kernel panic - not syncing: Watchdog detected hard LOCKUP on 
> cpu 6
> [ 1838.629070] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 
> 3.10.0-327.62.4.el7.x86_64 #1
> [ 1838.637610] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.4.2 
> 01/09/2017
> [ 1838.645954] Call Trace:
> [ 1838.648680][] dump_stack+0x19/0x1b
> [ 1838.655113]  [] panic+0xd8/0x1e7
> [ 1838.660460]  [] ? restart_watchdog_hrtimer+0x50/0x50
> [ 1838.667742]  [] watchdog_overflow_callback+0xc2/0xd0
> [ 1838.675024]  [] __perf_event_overflow+0xa1/0x250
> [ 1838.681920]  [] perf_event_overflow+0x14/0x20
> [ 1838.688526]  [] intel_pmu_handle_irq+0x1e8/0x470
> [ 1838.695423]  [] ? ioremap_page_range+0x24c/0x330
> [ 1838.702320]  [] ? unmap_kernel_range_noflush+0x11/0x20
> [ 1838.709797]  [] ? ghes_copy_tofrom_phys+0x124/0x210
> [ 1838.716984]  [] ? ghes_read_estatus+0xa0/0x190
> [ 1838.723687]  [] perf_event_nmi_handler+0x2b/0x50
> [ 1838.730582]  [] nmi_handle.isra.0+0x69/0xb0
> [ 1838.736992]  [] do_nmi+0x169/0x340
> [ 1838.742532]  [] end_repeat_nmi+0x1e/0x7e
> [ 1838.748653]  [] ? _raw_spin_lock_irqsave+0x3d/0x60
> [ 1838.755742]  [] ? _raw_spin_lock_irqsave+0x3d/0x60
> [ 1838.762831]  [] ? _raw_spin_lock_irqsave+0x3d/0x60
> [ 1838.769917]  <>  [] avc_compute_av+0x126/0x1b5
> [ 1838.777125]  [] ? walk_tg_tree_from+0xbe/0x110
> [ 1838.783828]  [] avc_has_perm_noaudit+0xc4/0x110
> [ 1838.790628]  [] cred_has_capability+0x6b/0x120
> [ 1838.797331]  [] ? ktime_get+0x4c/0xd0
> [ 1838.803160]  [] ? clockevents_program_event+0x6b/0xf0
> [ 1838.810532]  [] selinux_capable+0x2e/0x40
> [ 1838.816748]  [] security_capable_noaudit+0x15/0x20
> [ 1838.823829]  [] has_ns_capability_noaudit+0x15/0x20
> [ 1838.831014]  [] ptrace_has_cap+0x35/0x40
> [ 1838.837126]  [] ___ptrace_may_access+0xa7/0x1e0
> [ 1838.843925]  [] __schedule+0x26e/0xa00
> [ 1838.849855]  [] schedule_preempt_disabled+0x29/0x70
> [ 1838.857041]  [] cpu_startup_entry+0x184/0x290
> [ 1838.863637]  [] start_secondary+0x1da/0x250

So yeah, current Linus' tree needs the same treatment -- we have to avoid 
calling out to ptrace_has_cap() in PTRACE_MODE_NOACCESS_CHK cases.

I have updated the patch for v4. Thanks again,

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Jiri Kosina

On Wed, 5 Sep 2018, Andrea Arcangeli wrote:

> [ 1838.621010] Kernel panic - not syncing: Watchdog detected hard LOCKUP on 
> cpu 6
> [ 1838.629070] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 
> 3.10.0-327.62.4.el7.x86_64 #1
> [ 1838.637610] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.4.2 
> 01/09/2017
> [ 1838.645954] Call Trace:
> [ 1838.648680][] dump_stack+0x19/0x1b
> [ 1838.655113]  [] panic+0xd8/0x1e7
> [ 1838.660460]  [] ? restart_watchdog_hrtimer+0x50/0x50
> [ 1838.667742]  [] watchdog_overflow_callback+0xc2/0xd0
> [ 1838.675024]  [] __perf_event_overflow+0xa1/0x250
> [ 1838.681920]  [] perf_event_overflow+0x14/0x20
> [ 1838.688526]  [] intel_pmu_handle_irq+0x1e8/0x470
> [ 1838.695423]  [] ? ioremap_page_range+0x24c/0x330
> [ 1838.702320]  [] ? unmap_kernel_range_noflush+0x11/0x20
> [ 1838.709797]  [] ? ghes_copy_tofrom_phys+0x124/0x210
> [ 1838.716984]  [] ? ghes_read_estatus+0xa0/0x190
> [ 1838.723687]  [] perf_event_nmi_handler+0x2b/0x50
> [ 1838.730582]  [] nmi_handle.isra.0+0x69/0xb0
> [ 1838.736992]  [] do_nmi+0x169/0x340
> [ 1838.742532]  [] end_repeat_nmi+0x1e/0x7e
> [ 1838.748653]  [] ? _raw_spin_lock_irqsave+0x3d/0x60
> [ 1838.755742]  [] ? _raw_spin_lock_irqsave+0x3d/0x60
> [ 1838.762831]  [] ? _raw_spin_lock_irqsave+0x3d/0x60
> [ 1838.769917]  <>  [] avc_compute_av+0x126/0x1b5
> [ 1838.777125]  [] ? walk_tg_tree_from+0xbe/0x110
> [ 1838.783828]  [] avc_has_perm_noaudit+0xc4/0x110
> [ 1838.790628]  [] cred_has_capability+0x6b/0x120
> [ 1838.797331]  [] ? ktime_get+0x4c/0xd0
> [ 1838.803160]  [] ? clockevents_program_event+0x6b/0xf0
> [ 1838.810532]  [] selinux_capable+0x2e/0x40
> [ 1838.816748]  [] security_capable_noaudit+0x15/0x20
> [ 1838.823829]  [] has_ns_capability_noaudit+0x15/0x20
> [ 1838.831014]  [] ptrace_has_cap+0x35/0x40
> [ 1838.837126]  [] ___ptrace_may_access+0xa7/0x1e0
> [ 1838.843925]  [] __schedule+0x26e/0xa00
> [ 1838.849855]  [] schedule_preempt_disabled+0x29/0x70
> [ 1838.857041]  [] cpu_startup_entry+0x184/0x290
> [ 1838.863637]  [] start_secondary+0x1da/0x250

So yeah, current Linus' tree needs the same treatment -- we have to avoid 
calling out to ptrace_has_cap() in PTRACE_MODE_NOACCESS_CHK cases.

I have updated the patch for v4. Thanks again,

-- 
Jiri Kosina
SUSE Labs

RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Schaufler, Casey

> -Original Message-
> From: Peter Zijlstra [mailto:pet...@infradead.org]
> Sent: Wednesday, September 05, 2018 12:03 PM
> To: Andrea Arcangeli 
> Cc: Jiri Kosina ; Andi Kleen ; Tim 
> Chen
> ; Schaufler, Casey ;
> Thomas Gleixner ; Ingo Molnar ;
> Josh Poimboeuf ; Woodhouse, David
> ; Oleg Nesterov ; linux-
> ker...@vger.kernel.org; x...@kernel.org
> Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can
> be applied on arbitrary tasks
> 
> On Wed, Sep 05, 2018 at 02:40:18PM -0400, Andrea Arcangeli wrote:
> > [ 1838.769917]  <>  []
> avc_compute_av+0x126/0x1b5
> 
> That does read_lock(), which is not allowed from scheduler context.
> 
> > [ 1838.777125]  [] ? walk_tg_tree_from+0xbe/0x110
> > [ 1838.783828]  [] avc_has_perm_noaudit+0xc4/0x110
> 
> In current code this can end up in avc_update_node() which uses
> spin_lock(), which is a bug from scheduler context.o
> 
> > [ 1838.790628]  [] cred_has_capability+0x6b/0x120
> > [ 1838.797331]  [] ? ktime_get+0x4c/0xd0
> > [ 1838.803160]  [] ?
> clockevents_program_event+0x6b/0xf0
> > [ 1838.810532]  [] selinux_capable+0x2e/0x40
> > [ 1838.816748]  [] security_capable_noaudit+0x15/0x20
> > [ 1838.823829]  [] has_ns_capability_noaudit+0x15/0x20
> > [ 1838.831014]  [] ptrace_has_cap+0x35/0x40
> > [ 1838.837126]  [] ___ptrace_may_access+0xa7/0x1e0
> > [ 1838.843925]  [] __schedule+0x26e/0xa00
> > [ 1838.849855]  [] schedule_preempt_disabled+0x29/0x70
> > [ 1838.857041]  [] cpu_startup_entry+0x184/0x290
> > [ 1838.863637]  [] start_secondary+0x1da/0x250
> 
> So yes, looks like all that security LSM nonsense isn't going to work
> here.

What won't work is using the ptrace code. That is one of the reasons why
you can't just blindly use it. Look at the patch set I submitted and you'll see
that the SELinux selinux_task_safe_sidechannel() hook does not do the things
that cause the lockup.

RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Schaufler, Casey

> -Original Message-
> From: Peter Zijlstra [mailto:pet...@infradead.org]
> Sent: Wednesday, September 05, 2018 12:03 PM
> To: Andrea Arcangeli 
> Cc: Jiri Kosina ; Andi Kleen ; Tim 
> Chen
> ; Schaufler, Casey ;
> Thomas Gleixner ; Ingo Molnar ;
> Josh Poimboeuf ; Woodhouse, David
> ; Oleg Nesterov ; linux-
> ker...@vger.kernel.org; x...@kernel.org
> Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can
> be applied on arbitrary tasks
> 
> On Wed, Sep 05, 2018 at 02:40:18PM -0400, Andrea Arcangeli wrote:
> > [ 1838.769917]  <>  []
> avc_compute_av+0x126/0x1b5
> 
> That does read_lock(), which is not allowed from scheduler context.
> 
> > [ 1838.777125]  [] ? walk_tg_tree_from+0xbe/0x110
> > [ 1838.783828]  [] avc_has_perm_noaudit+0xc4/0x110
> 
> In current code this can end up in avc_update_node() which uses
> spin_lock(), which is a bug from scheduler context.o
> 
> > [ 1838.790628]  [] cred_has_capability+0x6b/0x120
> > [ 1838.797331]  [] ? ktime_get+0x4c/0xd0
> > [ 1838.803160]  [] ?
> clockevents_program_event+0x6b/0xf0
> > [ 1838.810532]  [] selinux_capable+0x2e/0x40
> > [ 1838.816748]  [] security_capable_noaudit+0x15/0x20
> > [ 1838.823829]  [] has_ns_capability_noaudit+0x15/0x20
> > [ 1838.831014]  [] ptrace_has_cap+0x35/0x40
> > [ 1838.837126]  [] ___ptrace_may_access+0xa7/0x1e0
> > [ 1838.843925]  [] __schedule+0x26e/0xa00
> > [ 1838.849855]  [] schedule_preempt_disabled+0x29/0x70
> > [ 1838.857041]  [] cpu_startup_entry+0x184/0x290
> > [ 1838.863637]  [] start_secondary+0x1da/0x250
> 
> So yes, looks like all that security LSM nonsense isn't going to work
> here.

What won't work is using the ptrace code. That is one of the reasons why
you can't just blindly use it. Look at the patch set I submitted and you'll see
that the SELinux selinux_task_safe_sidechannel() hook does not do the things
that cause the lockup.

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Peter Zijlstra

On Wed, Sep 05, 2018 at 02:40:18PM -0400, Andrea Arcangeli wrote:
> [ 1838.769917]  <>  [] avc_compute_av+0x126/0x1b5

That does read_lock(), which is not allowed from scheduler context.

> [ 1838.777125]  [] ? walk_tg_tree_from+0xbe/0x110
> [ 1838.783828]  [] avc_has_perm_noaudit+0xc4/0x110

In current code this can end up in avc_update_node() which uses
spin_lock(), which is a bug from scheduler context.o

> [ 1838.790628]  [] cred_has_capability+0x6b/0x120
> [ 1838.797331]  [] ? ktime_get+0x4c/0xd0
> [ 1838.803160]  [] ? clockevents_program_event+0x6b/0xf0
> [ 1838.810532]  [] selinux_capable+0x2e/0x40
> [ 1838.816748]  [] security_capable_noaudit+0x15/0x20
> [ 1838.823829]  [] has_ns_capability_noaudit+0x15/0x20
> [ 1838.831014]  [] ptrace_has_cap+0x35/0x40
> [ 1838.837126]  [] ___ptrace_may_access+0xa7/0x1e0
> [ 1838.843925]  [] __schedule+0x26e/0xa00
> [ 1838.849855]  [] schedule_preempt_disabled+0x29/0x70
> [ 1838.857041]  [] cpu_startup_entry+0x184/0x290
> [ 1838.863637]  [] start_secondary+0x1da/0x250

So yes, looks like all that security LSM nonsense isn't going to work
here.

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Peter Zijlstra

On Wed, Sep 05, 2018 at 02:40:18PM -0400, Andrea Arcangeli wrote:
> [ 1838.769917]  <>  [] avc_compute_av+0x126/0x1b5

That does read_lock(), which is not allowed from scheduler context.

> [ 1838.777125]  [] ? walk_tg_tree_from+0xbe/0x110
> [ 1838.783828]  [] avc_has_perm_noaudit+0xc4/0x110

In current code this can end up in avc_update_node() which uses
spin_lock(), which is a bug from scheduler context.o

> [ 1838.790628]  [] cred_has_capability+0x6b/0x120
> [ 1838.797331]  [] ? ktime_get+0x4c/0xd0
> [ 1838.803160]  [] ? clockevents_program_event+0x6b/0xf0
> [ 1838.810532]  [] selinux_capable+0x2e/0x40
> [ 1838.816748]  [] security_capable_noaudit+0x15/0x20
> [ 1838.823829]  [] has_ns_capability_noaudit+0x15/0x20
> [ 1838.831014]  [] ptrace_has_cap+0x35/0x40
> [ 1838.837126]  [] ___ptrace_may_access+0xa7/0x1e0
> [ 1838.843925]  [] __schedule+0x26e/0xa00
> [ 1838.849855]  [] schedule_preempt_disabled+0x29/0x70
> [ 1838.857041]  [] cpu_startup_entry+0x184/0x290
> [ 1838.863637]  [] start_secondary+0x1da/0x250

So yes, looks like all that security LSM nonsense isn't going to work
here.

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Jiri Kosina

On Wed, 5 Sep 2018, Andrea Arcangeli wrote:

> Perhaps you didn't sandbox KVM inside selinux by default?

We by default do not enable selinux, so that's probably why.

Thanks again, will go through the code in mainline and adapt the patch 
before sending v4.

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Jiri Kosina

On Wed, 5 Sep 2018, Andrea Arcangeli wrote:

> Perhaps you didn't sandbox KVM inside selinux by default?

We by default do not enable selinux, so that's probably why.

Thanks again, will go through the code in mainline and adapt the patch 
before sending v4.

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Andrea Arcangeli

On Wed, Sep 05, 2018 at 08:29:07PM +0200, Jiri Kosina wrote:
> (and no, my testing of the patch I sent on current tree didn't produce any 
> hangs -- was there a reliable way to trigger it on 3.10?).

Only a very specific libvirt acceptance test found this after a while
and it wasn't a customer it was caught by QA. The reporter said it
wasn't sure about how to reproduce this issue either, it happened once
in a while the backtrace was still enough to fix it for sure and then
it never happened again.

It's not because of virt but probably because of selinux+audit. This
is precisely why I thought once you enter LSM from the scheduler
atomic path the trouble starts as each LSM implementation of those
calls may crash or not crash.

Perhaps you didn't sandbox KVM inside selinux by default?

This is the lockup the patch I posted fixed for 3.10.

[ 1838.621010] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 
6
[ 1838.629070] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 
3.10.0-327.62.4.el7.x86_64 #1
[ 1838.637610] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.4.2 
01/09/2017
[ 1838.645954] Call Trace:
[ 1838.648680][] dump_stack+0x19/0x1b
[ 1838.655113]  [] panic+0xd8/0x1e7
[ 1838.660460]  [] ? restart_watchdog_hrtimer+0x50/0x50
[ 1838.667742]  [] watchdog_overflow_callback+0xc2/0xd0
[ 1838.675024]  [] __perf_event_overflow+0xa1/0x250
[ 1838.681920]  [] perf_event_overflow+0x14/0x20
[ 1838.688526]  [] intel_pmu_handle_irq+0x1e8/0x470
[ 1838.695423]  [] ? ioremap_page_range+0x24c/0x330
[ 1838.702320]  [] ? unmap_kernel_range_noflush+0x11/0x20
[ 1838.709797]  [] ? ghes_copy_tofrom_phys+0x124/0x210
[ 1838.716984]  [] ? ghes_read_estatus+0xa0/0x190
[ 1838.723687]  [] perf_event_nmi_handler+0x2b/0x50
[ 1838.730582]  [] nmi_handle.isra.0+0x69/0xb0
[ 1838.736992]  [] do_nmi+0x169/0x340
[ 1838.742532]  [] end_repeat_nmi+0x1e/0x7e
[ 1838.748653]  [] ? _raw_spin_lock_irqsave+0x3d/0x60
[ 1838.755742]  [] ? _raw_spin_lock_irqsave+0x3d/0x60
[ 1838.762831]  [] ? _raw_spin_lock_irqsave+0x3d/0x60
[ 1838.769917]  <>  [] avc_compute_av+0x126/0x1b5
[ 1838.777125]  [] ? walk_tg_tree_from+0xbe/0x110
[ 1838.783828]  [] avc_has_perm_noaudit+0xc4/0x110
[ 1838.790628]  [] cred_has_capability+0x6b/0x120
[ 1838.797331]  [] ? ktime_get+0x4c/0xd0
[ 1838.803160]  [] ? clockevents_program_event+0x6b/0xf0
[ 1838.810532]  [] selinux_capable+0x2e/0x40
[ 1838.816748]  [] security_capable_noaudit+0x15/0x20
[ 1838.823829]  [] has_ns_capability_noaudit+0x15/0x20
[ 1838.831014]  [] ptrace_has_cap+0x35/0x40
[ 1838.837126]  [] ___ptrace_may_access+0xa7/0x1e0
[ 1838.843925]  [] __schedule+0x26e/0xa00
[ 1838.849855]  [] schedule_preempt_disabled+0x29/0x70
[ 1838.857041]  [] cpu_startup_entry+0x184/0x290
[ 1838.863637]  [] start_secondary+0x1da/0x250

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Andrea Arcangeli

On Wed, Sep 05, 2018 at 08:29:07PM +0200, Jiri Kosina wrote:
> (and no, my testing of the patch I sent on current tree didn't produce any 
> hangs -- was there a reliable way to trigger it on 3.10?).

Only a very specific libvirt acceptance test found this after a while
and it wasn't a customer it was caught by QA. The reporter said it
wasn't sure about how to reproduce this issue either, it happened once
in a while the backtrace was still enough to fix it for sure and then
it never happened again.

It's not because of virt but probably because of selinux+audit. This
is precisely why I thought once you enter LSM from the scheduler
atomic path the trouble starts as each LSM implementation of those
calls may crash or not crash.

Perhaps you didn't sandbox KVM inside selinux by default?

This is the lockup the patch I posted fixed for 3.10.

[ 1838.621010] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 
6
[ 1838.629070] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 
3.10.0-327.62.4.el7.x86_64 #1
[ 1838.637610] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.4.2 
01/09/2017
[ 1838.645954] Call Trace:
[ 1838.648680][] dump_stack+0x19/0x1b
[ 1838.655113]  [] panic+0xd8/0x1e7
[ 1838.660460]  [] ? restart_watchdog_hrtimer+0x50/0x50
[ 1838.667742]  [] watchdog_overflow_callback+0xc2/0xd0
[ 1838.675024]  [] __perf_event_overflow+0xa1/0x250
[ 1838.681920]  [] perf_event_overflow+0x14/0x20
[ 1838.688526]  [] intel_pmu_handle_irq+0x1e8/0x470
[ 1838.695423]  [] ? ioremap_page_range+0x24c/0x330
[ 1838.702320]  [] ? unmap_kernel_range_noflush+0x11/0x20
[ 1838.709797]  [] ? ghes_copy_tofrom_phys+0x124/0x210
[ 1838.716984]  [] ? ghes_read_estatus+0xa0/0x190
[ 1838.723687]  [] perf_event_nmi_handler+0x2b/0x50
[ 1838.730582]  [] nmi_handle.isra.0+0x69/0xb0
[ 1838.736992]  [] do_nmi+0x169/0x340
[ 1838.742532]  [] end_repeat_nmi+0x1e/0x7e
[ 1838.748653]  [] ? _raw_spin_lock_irqsave+0x3d/0x60
[ 1838.755742]  [] ? _raw_spin_lock_irqsave+0x3d/0x60
[ 1838.762831]  [] ? _raw_spin_lock_irqsave+0x3d/0x60
[ 1838.769917]  <>  [] avc_compute_av+0x126/0x1b5
[ 1838.777125]  [] ? walk_tg_tree_from+0xbe/0x110
[ 1838.783828]  [] avc_has_perm_noaudit+0xc4/0x110
[ 1838.790628]  [] cred_has_capability+0x6b/0x120
[ 1838.797331]  [] ? ktime_get+0x4c/0xd0
[ 1838.803160]  [] ? clockevents_program_event+0x6b/0xf0
[ 1838.810532]  [] selinux_capable+0x2e/0x40
[ 1838.816748]  [] security_capable_noaudit+0x15/0x20
[ 1838.823829]  [] has_ns_capability_noaudit+0x15/0x20
[ 1838.831014]  [] ptrace_has_cap+0x35/0x40
[ 1838.837126]  [] ___ptrace_may_access+0xa7/0x1e0
[ 1838.843925]  [] __schedule+0x26e/0xa00
[ 1838.849855]  [] schedule_preempt_disabled+0x29/0x70
[ 1838.857041]  [] cpu_startup_entry+0x184/0x290
[ 1838.863637]  [] start_secondary+0x1da/0x250

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Jiri Kosina

On Wed, 5 Sep 2018, Andi Kleen wrote:

> Please if you repost include plenty of performance numbers for multi 
> threaded workloads.  It's ridiculous to even discuss this without them.

Talking about ridiculous ... I find it a bit sad that Intel has let this 
be unfixed for 3/4 years in linux; that doesn't really signal deep 
dedication to customer safety. Have any STIBP patches been even submitted?

This is not the same situation as IBRS which was mostly ignored -- there 
we have retpolines to protect the kernel, and it's debatable whether it's 
exploitable on SKL at all.

Ignoring IBPB and STIBP is keeping the system plain vulnerable to 
user-user attacks, and us not providing users with possibiliy to easily 
mitigate, is a bit embarassing in my eyes.

Thanks,

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Jiri Kosina

On Wed, 5 Sep 2018, Andi Kleen wrote:

> Please if you repost include plenty of performance numbers for multi 
> threaded workloads.  It's ridiculous to even discuss this without them.

Talking about ridiculous ... I find it a bit sad that Intel has let this 
be unfixed for 3/4 years in linux; that doesn't really signal deep 
dedication to customer safety. Have any STIBP patches been even submitted?

This is not the same situation as IBRS which was mostly ignored -- there 
we have retpolines to protect the kernel, and it's debatable whether it's 
exploitable on SKL at all.

Ignoring IBPB and STIBP is keeping the system plain vulnerable to 
user-user attacks, and us not providing users with possibiliy to easily 
mitigate, is a bit embarassing in my eyes.

Thanks,

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Jiri Kosina

On Wed, 5 Sep 2018, Andrea Arcangeli wrote:

> ptrace_has_cap(tcred->user_ns, mode) is supposed to eventually lockup
> hard if called from scheduler as it does some locking, and we fixed
> that already half a year ago.
> 
> Not sure how it's still unfixed in Jiri's codebase after so long, or
> if it's an issue specific to 3.10 and upstream gets away without this.

We haven't got any lockup reports in our kernels (and we do carry a 
variant of this patch), so it might be somehow specific to 3.10.

> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index eb7862f185ff..4a8d0dd73c93 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -285,7 +285,8 @@ int ___ptrace_may_access(struct task_struct *tracer,
>   gid_eq(caller_gid, tcred->sgid) &&
>   gid_eq(caller_gid, tcred->gid))
>   goto ok;
> - if (ptrace_has_cap(tcred->user_ns, mode))
> + if (!(mode & PTRACE_MODE_NOACCESS_CHK) &&
> + ptrace_has_cap(tcred->user_ns, mode))
>   goto ok;
>   rcu_read_unlock();
>   return -EPERM;
> @@ -296,7 +297,8 @@ ok:
>   dumpable = get_dumpable(task->mm);
>   rcu_read_lock();
>   if (dumpable != SUID_DUMP_USER &&
> - !ptrace_has_cap(__task_cred(task)->user_ns, mode)) {
> + ((mode & PTRACE_MODE_NOACCESS_CHK) ||
> +  !ptrace_has_cap(__task_cred(task)->user_ns, mode))) {
>   rcu_read_unlock();
>   return -EPERM;

I will look into this whether it's still applicable or not, thanks a lot 
for the pointer.

(and no, my testing of the patch I sent on current tree didn't produce any 
hangs -- was there a reliable way to trigger it on 3.10?).

Thanks,

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Jiri Kosina

On Wed, 5 Sep 2018, Andrea Arcangeli wrote:

> ptrace_has_cap(tcred->user_ns, mode) is supposed to eventually lockup
> hard if called from scheduler as it does some locking, and we fixed
> that already half a year ago.
> 
> Not sure how it's still unfixed in Jiri's codebase after so long, or
> if it's an issue specific to 3.10 and upstream gets away without this.

We haven't got any lockup reports in our kernels (and we do carry a 
variant of this patch), so it might be somehow specific to 3.10.

> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index eb7862f185ff..4a8d0dd73c93 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -285,7 +285,8 @@ int ___ptrace_may_access(struct task_struct *tracer,
>   gid_eq(caller_gid, tcred->sgid) &&
>   gid_eq(caller_gid, tcred->gid))
>   goto ok;
> - if (ptrace_has_cap(tcred->user_ns, mode))
> + if (!(mode & PTRACE_MODE_NOACCESS_CHK) &&
> + ptrace_has_cap(tcred->user_ns, mode))
>   goto ok;
>   rcu_read_unlock();
>   return -EPERM;
> @@ -296,7 +297,8 @@ ok:
>   dumpable = get_dumpable(task->mm);
>   rcu_read_lock();
>   if (dumpable != SUID_DUMP_USER &&
> - !ptrace_has_cap(__task_cred(task)->user_ns, mode)) {
> + ((mode & PTRACE_MODE_NOACCESS_CHK) ||
> +  !ptrace_has_cap(__task_cred(task)->user_ns, mode))) {
>   rcu_read_unlock();
>   return -EPERM;

I will look into this whether it's still applicable or not, thanks a lot 
for the pointer.

(and no, my testing of the patch I sent on current tree didn't produce any 
hangs -- was there a reliable way to trigger it on 3.10?).

Thanks,

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Thomas Gleixner

On Wed, 5 Sep 2018, Andi Kleen wrote:

> > So, after giving it a bit more thought, I still believe "I want spectre V2 
> > protection" vs. "I do not care about spectre V2 on my system 
> > (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 
> > of my patchset, including the ptrace check in switch_mm() (statically 
> > patched out on !IBPB-capable systems), and we can then later see whether 
> > the LSM implementation, once it exists, should be used instead.
> 
> Please if you repost include plenty of performance numbers for multi threaded
> workloads.  It's ridiculous to even discuss this without them.

Either we care about that problem and provide a proper mechanism to protect
systems or we do not. That's not a performance number problem at all.

Thanks,

tglx

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Thomas Gleixner

On Wed, 5 Sep 2018, Andi Kleen wrote:

> > So, after giving it a bit more thought, I still believe "I want spectre V2 
> > protection" vs. "I do not care about spectre V2 on my system 
> > (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 
> > of my patchset, including the ptrace check in switch_mm() (statically 
> > patched out on !IBPB-capable systems), and we can then later see whether 
> > the LSM implementation, once it exists, should be used instead.
> 
> Please if you repost include plenty of performance numbers for multi threaded
> workloads.  It's ridiculous to even discuss this without them.

Either we care about that problem and provide a proper mechanism to protect
systems or we do not. That's not a performance number problem at all.

Thanks,

tglx

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Andrea Arcangeli

On Wed, Sep 05, 2018 at 08:58:23AM -0700, Andi Kleen wrote:
> > So, after giving it a bit more thought, I still believe "I want spectre V2 
> > protection" vs. "I do not care about spectre V2 on my system 
> > (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 
> > of my patchset, including the ptrace check in switch_mm() (statically 
> > patched out on !IBPB-capable systems), and we can then later see whether 
> > the LSM implementation, once it exists, should be used instead.
> 
> Please if you repost include plenty of performance numbers for multi threaded
> workloads.  It's ridiculous to even discuss this without them.

Multi threaded workloads won't be affected because they share the
memory in the first place... the check itself is lost in the noise
too. Maybe you meant to ask for multiple parallel processes
(multithreaded or not, zero difference) all with a different user id?

What is more weird for me is to attempt to discuss the STIBP part of
the patch without knowing which microcodes exactly implement STIBP in
a way that is slow. Tim already said it's a measurable performance hit,
but on some CPU it's zero performance hit. We don't even know if STIBP
is actually useful or if it's a noop on those CPUs where it won't
affect performance.

Back to the IBPB, from implementation standpoint at least on 3.10 this
code posted would lockup hard eventually and we got complains.

ptrace_has_cap(tcred->user_ns, mode) is supposed to eventually lockup
hard if called from scheduler as it does some locking, and we fixed
that already half a year ago.

Not sure how it's still unfixed in Jiri's codebase after so long, or
if it's an issue specific to 3.10 and upstream gets away without this.

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index eb7862f185ff..4a8d0dd73c93 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -285,7 +285,8 @@ int ___ptrace_may_access(struct task_struct *tracer,
gid_eq(caller_gid, tcred->sgid) &&
gid_eq(caller_gid, tcred->gid))
goto ok;
-   if (ptrace_has_cap(tcred->user_ns, mode))
+   if (!(mode & PTRACE_MODE_NOACCESS_CHK) &&
+   ptrace_has_cap(tcred->user_ns, mode))
goto ok;
rcu_read_unlock();
return -EPERM;
@@ -296,7 +297,8 @@ ok:
dumpable = get_dumpable(task->mm);
rcu_read_lock();
if (dumpable != SUID_DUMP_USER &&
-   !ptrace_has_cap(__task_cred(task)->user_ns, mode)) {
+   ((mode & PTRACE_MODE_NOACCESS_CHK) ||
+!ptrace_has_cap(__task_cred(task)->user_ns, mode))) {
rcu_read_unlock();
return -EPERM;
}

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Andrea Arcangeli

On Wed, Sep 05, 2018 at 08:58:23AM -0700, Andi Kleen wrote:
> > So, after giving it a bit more thought, I still believe "I want spectre V2 
> > protection" vs. "I do not care about spectre V2 on my system 
> > (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 
> > of my patchset, including the ptrace check in switch_mm() (statically 
> > patched out on !IBPB-capable systems), and we can then later see whether 
> > the LSM implementation, once it exists, should be used instead.
> 
> Please if you repost include plenty of performance numbers for multi threaded
> workloads.  It's ridiculous to even discuss this without them.

Multi threaded workloads won't be affected because they share the
memory in the first place... the check itself is lost in the noise
too. Maybe you meant to ask for multiple parallel processes
(multithreaded or not, zero difference) all with a different user id?

What is more weird for me is to attempt to discuss the STIBP part of
the patch without knowing which microcodes exactly implement STIBP in
a way that is slow. Tim already said it's a measurable performance hit,
but on some CPU it's zero performance hit. We don't even know if STIBP
is actually useful or if it's a noop on those CPUs where it won't
affect performance.

Back to the IBPB, from implementation standpoint at least on 3.10 this
code posted would lockup hard eventually and we got complains.

ptrace_has_cap(tcred->user_ns, mode) is supposed to eventually lockup
hard if called from scheduler as it does some locking, and we fixed
that already half a year ago.

Not sure how it's still unfixed in Jiri's codebase after so long, or
if it's an issue specific to 3.10 and upstream gets away without this.

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index eb7862f185ff..4a8d0dd73c93 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -285,7 +285,8 @@ int ___ptrace_may_access(struct task_struct *tracer,
gid_eq(caller_gid, tcred->sgid) &&
gid_eq(caller_gid, tcred->gid))
goto ok;
-   if (ptrace_has_cap(tcred->user_ns, mode))
+   if (!(mode & PTRACE_MODE_NOACCESS_CHK) &&
+   ptrace_has_cap(tcred->user_ns, mode))
goto ok;
rcu_read_unlock();
return -EPERM;
@@ -296,7 +297,8 @@ ok:
dumpable = get_dumpable(task->mm);
rcu_read_lock();
if (dumpable != SUID_DUMP_USER &&
-   !ptrace_has_cap(__task_cred(task)->user_ns, mode)) {
+   ((mode & PTRACE_MODE_NOACCESS_CHK) ||
+!ptrace_has_cap(__task_cred(task)->user_ns, mode))) {
rcu_read_unlock();
return -EPERM;
}

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Andi Kleen

> So, after giving it a bit more thought, I still believe "I want spectre V2 
> protection" vs. "I do not care about spectre V2 on my system 
> (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 
> of my patchset, including the ptrace check in switch_mm() (statically 
> patched out on !IBPB-capable systems), and we can then later see whether 
> the LSM implementation, once it exists, should be used instead.

Please if you repost include plenty of performance numbers for multi threaded
workloads.  It's ridiculous to even discuss this without them.

-Andi

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Andi Kleen

> So, after giving it a bit more thought, I still believe "I want spectre V2 
> protection" vs. "I do not care about spectre V2 on my system 
> (=nospectre_v2)" are the sane options we should provide; so I'll respin v4 
> of my patchset, including the ptrace check in switch_mm() (statically 
> patched out on !IBPB-capable systems), and we can then later see whether 
> the LSM implementation, once it exists, should be used instead.

Please if you repost include plenty of performance numbers for multi threaded
workloads.  It's ridiculous to even discuss this without them.

-Andi

RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Schaufler, Casey

> -Original Message-
> From: Peter Zijlstra [mailto:pet...@infradead.org]
> Sent: Wednesday, September 05, 2018 1:00 AM
> To: Jiri Kosina 
> Cc: Tim Chen ; Thomas Gleixner
> ; Ingo Molnar ; Josh Poimboeuf
> ; Andrea Arcangeli ;
> Woodhouse, David ; Oleg Nesterov
> ; Schaufler, Casey ; linux-
> ker...@vger.kernel.org; x...@kernel.org
> Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can
> be applied on arbitrary tasks
> 
> On Tue, Sep 04, 2018 at 07:35:29PM +0200, Jiri Kosina wrote:
> > On Tue, 4 Sep 2018, Tim Chen wrote:
> >
> > > > Current ptrace_may_access() implementation assumes that the 'source'
> task is
> > > > always the caller (current).
> > > >
> > > > Expose ___ptrace_may_access() that can be used to apply the check on
> arbitrary
> > > > tasks.
> > >
> > > Casey recently has proposed putting the decision making of whether to
> > > do IBPB in the security module.
> > >
> > > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-
> casey.schauf...@intel.com/
> > >
> > > That will have the advantage of giving the administrator a more 
> > > flexibility
> > > of when to turn on IBPB.  The policy is very similar to what you have
> proposed here
> > > but I think the security module is a more appropriate place for the 
> > > security
> policy.
> >
> > Yeah, well, honestly, I have a bit hard time buying the "generic
> > sidechannel prevention security module" idea, given how completely
> > different in nature all the mitigations have been so far. I don't see that
> > trying to abstract this somehow provides more clarity.
> >
> > So if this should be done in LSM, it'd probably have to be written by
> > someone else than me :) who actually understands how the "sidechannel
> LSM"
> > idea works.
> 
> Yeah, I'm not convinced on LSM either. Lets just do these here patches
> first and then Casey can try and convince us later.

Works for me. There are advantages to doing it either way.
The LSM approach allows you to consider implications of LSM
data, which you can't do otherwise. Once the hook is available
it becomes the obvious place to do other checks.

RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Schaufler, Casey

> -Original Message-
> From: Peter Zijlstra [mailto:pet...@infradead.org]
> Sent: Wednesday, September 05, 2018 1:00 AM
> To: Jiri Kosina 
> Cc: Tim Chen ; Thomas Gleixner
> ; Ingo Molnar ; Josh Poimboeuf
> ; Andrea Arcangeli ;
> Woodhouse, David ; Oleg Nesterov
> ; Schaufler, Casey ; linux-
> ker...@vger.kernel.org; x...@kernel.org
> Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can
> be applied on arbitrary tasks
> 
> On Tue, Sep 04, 2018 at 07:35:29PM +0200, Jiri Kosina wrote:
> > On Tue, 4 Sep 2018, Tim Chen wrote:
> >
> > > > Current ptrace_may_access() implementation assumes that the 'source'
> task is
> > > > always the caller (current).
> > > >
> > > > Expose ___ptrace_may_access() that can be used to apply the check on
> arbitrary
> > > > tasks.
> > >
> > > Casey recently has proposed putting the decision making of whether to
> > > do IBPB in the security module.
> > >
> > > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-
> casey.schauf...@intel.com/
> > >
> > > That will have the advantage of giving the administrator a more 
> > > flexibility
> > > of when to turn on IBPB.  The policy is very similar to what you have
> proposed here
> > > but I think the security module is a more appropriate place for the 
> > > security
> policy.
> >
> > Yeah, well, honestly, I have a bit hard time buying the "generic
> > sidechannel prevention security module" idea, given how completely
> > different in nature all the mitigations have been so far. I don't see that
> > trying to abstract this somehow provides more clarity.
> >
> > So if this should be done in LSM, it'd probably have to be written by
> > someone else than me :) who actually understands how the "sidechannel
> LSM"
> > idea works.
> 
> Yeah, I'm not convinced on LSM either. Lets just do these here patches
> first and then Casey can try and convince us later.

Works for me. There are advantages to doing it either way.
The LSM approach allows you to consider implications of LSM
data, which you can't do otherwise. Once the hook is available
it becomes the obvious place to do other checks.

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Peter Zijlstra

On Tue, Sep 04, 2018 at 07:35:29PM +0200, Jiri Kosina wrote:
> On Tue, 4 Sep 2018, Tim Chen wrote:
> 
> > > Current ptrace_may_access() implementation assumes that the 'source' task 
> > > is
> > > always the caller (current).
> > > 
> > > Expose ___ptrace_may_access() that can be used to apply the check on 
> > > arbitrary
> > > tasks.
> > 
> > Casey recently has proposed putting the decision making of whether to
> > do IBPB in the security module.
> > 
> > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-casey.schauf...@intel.com/
> > 
> > That will have the advantage of giving the administrator a more flexibility
> > of when to turn on IBPB.  The policy is very similar to what you have 
> > proposed here
> > but I think the security module is a more appropriate place for the 
> > security policy.
> 
> Yeah, well, honestly, I have a bit hard time buying the "generic 
> sidechannel prevention security module" idea, given how completely 
> different in nature all the mitigations have been so far. I don't see that 
> trying to abstract this somehow provides more clarity.
> 
> So if this should be done in LSM, it'd probably have to be written by 
> someone else than me :) who actually understands how the "sidechannel LSM" 
> idea works.

Yeah, I'm not convinced on LSM either. Lets just do these here patches
first and then Casey can try and convince us later.

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Peter Zijlstra

On Tue, Sep 04, 2018 at 07:35:29PM +0200, Jiri Kosina wrote:
> On Tue, 4 Sep 2018, Tim Chen wrote:
> 
> > > Current ptrace_may_access() implementation assumes that the 'source' task 
> > > is
> > > always the caller (current).
> > > 
> > > Expose ___ptrace_may_access() that can be used to apply the check on 
> > > arbitrary
> > > tasks.
> > 
> > Casey recently has proposed putting the decision making of whether to
> > do IBPB in the security module.
> > 
> > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-casey.schauf...@intel.com/
> > 
> > That will have the advantage of giving the administrator a more flexibility
> > of when to turn on IBPB.  The policy is very similar to what you have 
> > proposed here
> > but I think the security module is a more appropriate place for the 
> > security policy.
> 
> Yeah, well, honestly, I have a bit hard time buying the "generic 
> sidechannel prevention security module" idea, given how completely 
> different in nature all the mitigations have been so far. I don't see that 
> trying to abstract this somehow provides more clarity.
> 
> So if this should be done in LSM, it'd probably have to be written by 
> someone else than me :) who actually understands how the "sidechannel LSM" 
> idea works.

Yeah, I'm not convinced on LSM either. Lets just do these here patches
first and then Casey can try and convince us later.

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Peter Zijlstra

On Tue, Sep 04, 2018 at 04:40:57PM +0200, Jiri Kosina wrote:
> From: Jiri Kosina 
> 
> Current ptrace_may_access() implementation assumes that the 'source' task is
> always the caller (current).

Note that now you 'fixed' the new user, this is still true. Which makes
me think you do not in fact need this patch, right?

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Peter Zijlstra

On Tue, Sep 04, 2018 at 04:40:57PM +0200, Jiri Kosina wrote:
> From: Jiri Kosina 
> 
> Current ptrace_may_access() implementation assumes that the 'source' task is
> always the caller (current).

Note that now you 'fixed' the new user, this is still true. Which makes
me think you do not in fact need this patch, right?

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Jiri Kosina

On Tue, 4 Sep 2018, Tim Chen wrote:

> I think STIBP should be an opt in option as it will have significant
> impact on performance.  The attack from neighbor thread is pretty
> difficult to pull off considering you have to know what the sibling
> thread is running and its address allocation.

In many scenarios the attacker can just easily taskset itself to the 
correct sibling.

> We could also use a security module to opt in the STIBP policy.

I am a bit afraid that we are offloading to sysadmins decisions that are 
very hard for them to make, as they require deep understanding of both the 
technical details of the security issue in the CPU, and the mitigation.

I surely understand that Intel is doing what they could to minimize the 
performance effect, but achieving that by making it a rocket science to 
configure it properly doesn't feel right.

So, after giving it a bit more thought, I still believe "I want spectre V2 
protection" vs. "I do not care about spectre V2 on my system 
(=nospectre_v2)" are the sane options we should provide; so I'll respin v4 
of my patchset, including the ptrace check in switch_mm() (statically 
patched out on !IBPB-capable systems), and we can then later see whether 
the LSM implementation, once it exists, should be used instead.

Thanks,

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-05 Thread Jiri Kosina

On Tue, 4 Sep 2018, Tim Chen wrote:

> I think STIBP should be an opt in option as it will have significant
> impact on performance.  The attack from neighbor thread is pretty
> difficult to pull off considering you have to know what the sibling
> thread is running and its address allocation.

In many scenarios the attacker can just easily taskset itself to the 
correct sibling.

> We could also use a security module to opt in the STIBP policy.

I am a bit afraid that we are offloading to sysadmins decisions that are 
very hard for them to make, as they require deep understanding of both the 
technical details of the security issue in the CPU, and the mitigation.

I surely understand that Intel is doing what they could to minimize the 
performance effect, but achieving that by making it a rocket science to 
configure it properly doesn't feel right.

So, after giving it a bit more thought, I still believe "I want spectre V2 
protection" vs. "I do not care about spectre V2 on my system 
(=nospectre_v2)" are the sane options we should provide; so I'll respin v4 
of my patchset, including the ptrace check in switch_mm() (statically 
patched out on !IBPB-capable systems), and we can then later see whether 
the LSM implementation, once it exists, should be used instead.

Thanks,

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Andrea Arcangeli

On Wed, Sep 05, 2018 at 01:00:37AM +, Schaufler, Casey wrote:
> Sorry, I've been working in security too long for my
> optimistic streak to be very wide.

Eheh. So I was simply trying to follow in context, but it wasn't
entirely clear, so I tried to take it out of context and then it was
even less clear, never mind I was only looking for some more
explanation.

> Not especially, no. I have gotten considerable feedback that
> it should be configurable, and while I may not see the value myself,
> I do respect the people who've requested the configurability.

Correct me if I'm wrong, but LSM as last word "module" implies to make
this logic modular.

My wondering is because:

1) why not just a single version of this logic in the scheduler?
   (i.e. can only be on or off or perhaps a only-ibpb-dumpable tweak
   to retain current slightly higher perf but less secure behavior)

2) even if you want a myriad of versions of this IBPB logic, how do
   the different versions can possibly fit in a whole "module" whose
   semantics have pretty much nothing to do with the other methods in
   the module like LSM_HOOK_INIT(capable, cap_capable),
   LSM_HOOK_INIT(vm_enough_memory, cap_vm_enough_memory), and
   LSM_HOOK_INIT(mmap_addr, cap_mmap_addr), or even
   LSM_HOOK_INIT(inode_follow_link, selinux_inode_follow_link),
   LSM_HOOK_INIT(inode_permission, selinux_inode_permission). I mean
   it looks an apple to oranges monolithic link in the same module. Or
   are you going to load this method in a separate module with only a
   single method implemented and then load multiple LSM modules at the
   same time?

3) if you need so much tweaking that not even a single off/on switch
   independent of any module loaded through LSM is not enough, how is
   it ok that all the rest shouldn't be configurable? All the rest has
   more performance impact than this one so it'd start from there if
   something.

I understand how configurablity is valuable (this is why we kept
everything dynamic tunable at runtime, including the PTI stop machine
to allow even PTI TLB flushes to be turned off).

I'm just trying to understand how IBPB fits in a LSM module
implementation.

Even if you build with CONFIG_SECURITY=n PTI won't go away, retpoline
won't go away, the l1flush in vmenter won't go away, the
pte/transhugepmd inversion won't go away, why only the runtime
tunability or tweaking of IBPB fits in a LSM module?

> If they provide different answers there should be different
> functions. It's a question of viewpoint. If you don't care about
> the SELinux viewpoint you shouldn't have to include it in your
> checks, any more than you care about x86 issues when you're
> running on a MIPS processor.

I don't see how selinux fits into this. Selinux doesn't affect virtual
memory protection. Not having selinux even built in doesn't mean you
are ok with virtual memory protection not being provided by the CPU.

Or in other words selinux fits into this as much as seccomp or
apparmor fits into this.

This is just like a memory barrier mb() to be executed in the context
switch code. Security issues can emerge with the lack of it regardless
if selinux is built in and enabled or CONFIG_SECURITY=n.

selinux matters after an exploit already happened, this as opposed is
needed to prevent the exploit in the first place.

Also the correct fully secure version provides a single answer
(i.e. in theory assuming a perfect implementation that didn't forget
anything so having a single implementation will also increase the
chances that we get as close as possible to the theoretical correct
implementation).

If it provides different answers in this case it is because somebody
wants not perfect security to provide higher performance, i.e. the
"configurablity" which is valuable and fine feature to provide. Just a
LSM module doesn't seem the most flexibile way to provide
configurability by far.

> Yes, even security people have to worry about locking.

Yes it was some lock that if contended would lockup if used from
inside the scheduler.

> What *is* the most robust way?

The below pretty much explains it.

> Yes, there are locking issues. The code in the LSM infrastructure and in
> the security modules needs to be aware of that and deal with it. The SELinux
> code I proposed is more complex than it could be because the audit code
> requires locking that doesn't work in the switching context.

Or in other words having multiple versions of this called from within
the scheduler is a bit like making the kernel modular and then because
the locking is subtle you may then have scheduler deadlocks only
happening with some kernel module loaded instead of others, but the
real question is what is the payoff compared to just allowing the
scheduler code to be tuned with x86 debugfs or sysctl the normal
way.

Also how would a new common code method in LSM fit for CPUs that are
so slow that they can't possibly need anything like IBPB and
flush_RSB()?

Thanks!
Andrea

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Andrea Arcangeli

On Wed, Sep 05, 2018 at 01:00:37AM +, Schaufler, Casey wrote:
> Sorry, I've been working in security too long for my
> optimistic streak to be very wide.

Eheh. So I was simply trying to follow in context, but it wasn't
entirely clear, so I tried to take it out of context and then it was
even less clear, never mind I was only looking for some more
explanation.

> Not especially, no. I have gotten considerable feedback that
> it should be configurable, and while I may not see the value myself,
> I do respect the people who've requested the configurability.

Correct me if I'm wrong, but LSM as last word "module" implies to make
this logic modular.

My wondering is because:

1) why not just a single version of this logic in the scheduler?
   (i.e. can only be on or off or perhaps a only-ibpb-dumpable tweak
   to retain current slightly higher perf but less secure behavior)

2) even if you want a myriad of versions of this IBPB logic, how do
   the different versions can possibly fit in a whole "module" whose
   semantics have pretty much nothing to do with the other methods in
   the module like LSM_HOOK_INIT(capable, cap_capable),
   LSM_HOOK_INIT(vm_enough_memory, cap_vm_enough_memory), and
   LSM_HOOK_INIT(mmap_addr, cap_mmap_addr), or even
   LSM_HOOK_INIT(inode_follow_link, selinux_inode_follow_link),
   LSM_HOOK_INIT(inode_permission, selinux_inode_permission). I mean
   it looks an apple to oranges monolithic link in the same module. Or
   are you going to load this method in a separate module with only a
   single method implemented and then load multiple LSM modules at the
   same time?

3) if you need so much tweaking that not even a single off/on switch
   independent of any module loaded through LSM is not enough, how is
   it ok that all the rest shouldn't be configurable? All the rest has
   more performance impact than this one so it'd start from there if
   something.

I understand how configurablity is valuable (this is why we kept
everything dynamic tunable at runtime, including the PTI stop machine
to allow even PTI TLB flushes to be turned off).

I'm just trying to understand how IBPB fits in a LSM module
implementation.

Even if you build with CONFIG_SECURITY=n PTI won't go away, retpoline
won't go away, the l1flush in vmenter won't go away, the
pte/transhugepmd inversion won't go away, why only the runtime
tunability or tweaking of IBPB fits in a LSM module?

> If they provide different answers there should be different
> functions. It's a question of viewpoint. If you don't care about
> the SELinux viewpoint you shouldn't have to include it in your
> checks, any more than you care about x86 issues when you're
> running on a MIPS processor.

I don't see how selinux fits into this. Selinux doesn't affect virtual
memory protection. Not having selinux even built in doesn't mean you
are ok with virtual memory protection not being provided by the CPU.

Or in other words selinux fits into this as much as seccomp or
apparmor fits into this.

This is just like a memory barrier mb() to be executed in the context
switch code. Security issues can emerge with the lack of it regardless
if selinux is built in and enabled or CONFIG_SECURITY=n.

selinux matters after an exploit already happened, this as opposed is
needed to prevent the exploit in the first place.

Also the correct fully secure version provides a single answer
(i.e. in theory assuming a perfect implementation that didn't forget
anything so having a single implementation will also increase the
chances that we get as close as possible to the theoretical correct
implementation).

If it provides different answers in this case it is because somebody
wants not perfect security to provide higher performance, i.e. the
"configurablity" which is valuable and fine feature to provide. Just a
LSM module doesn't seem the most flexibile way to provide
configurability by far.

> Yes, even security people have to worry about locking.

Yes it was some lock that if contended would lockup if used from
inside the scheduler.

> What *is* the most robust way?

The below pretty much explains it.

> Yes, there are locking issues. The code in the LSM infrastructure and in
> the security modules needs to be aware of that and deal with it. The SELinux
> code I proposed is more complex than it could be because the audit code
> requires locking that doesn't work in the switching context.

Or in other words having multiple versions of this called from within
the scheduler is a bit like making the kernel modular and then because
the locking is subtle you may then have scheduler deadlocks only
happening with some kernel module loaded instead of others, but the
real question is what is the payoff compared to just allowing the
scheduler code to be tuned with x86 debugfs or sysctl the normal
way.

Also how would a new common code method in LSM fit for CPUs that are
so slow that they can't possibly need anything like IBPB and
flush_RSB()?

Thanks!
Andrea

RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Schaufler, Casey

> -Original Message-
> From: Andrea Arcangeli [mailto:aarca...@redhat.com]
> Sent: Tuesday, September 04, 2018 4:37 PM
> To: Schaufler, Casey 
> Cc: Jiri Kosina ; Tim Chen ;
> Thomas Gleixner ; Ingo Molnar ;
> Peter Zijlstra ; Josh Poimboeuf
> ; Woodhouse, David ; Oleg
> Nesterov ; linux-kernel@vger.kernel.org; x...@kernel.org
> Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can
> be applied on arbitrary tasks
> 
> Hello,
> 
> On Tue, Sep 04, 2018 at 06:10:47PM +, Schaufler, Casey wrote:
> > The real reason to use an LSM based approach is that overloading ptrace
> > checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a
> > processor interface. Even if ptrace_may_access() does exactly what you
> 
> "Side channel is a processor interface" doesn't make me optimistic,
> but I assume you're not speaking for Intel.

Sorry, I've been working in security too long for my
optimistic streak to be very wide.

> > want it to for side-channel mitigation today it would be incredibly
> > inappropriate to tie the two together for eternity. You don't want to 
> > restrict
> > the ptrace checks to only those things that are also good for side-channel,
> > and vice versa.
> 
> It seems like you want to make this more configurable,

Not especially, no. I have gotten considerable feedback that
it should be configurable, and while I may not see the value myself,
I do respect the people who've requested the configurability.

> we have all
> debugfs x86 specific tweaks to disable IBPB at runtime and we don't
> allow a runtime opt-out of IBPB alone.
> 
> If you shutdown either IBRS or retpolines at runtime with debugfs,
> then IBPB goes away too.
> 
> Giving a finegrined way to disable only IBPB we found was overkill
> because IBPB has never been measurable if done only when the prev task
> cannot ptrace the next task (which is a superset of the too weak
> upstream not dumpable check, but still not a performance issue).
> 
> Allowing IBPB runtime opt-out doesn't make much sense if you don't
> allow to disable retpolines too still at runtime. And disabling
> retpolines from LSM doesn't sounds the right place, it's an x86
> temporary quirk only relevant for current buggy CPUs.
> 
> There should be a function that decides when IBPB and flush_RSB should
> be run (flush_RSB has to be run if SMEP because with SMEP there's no
> need to run flush_RSB at every kernel entry anymore), and that
> function happens to check ptrace today, but it would check other stuff
> too if we had other interfaces besides ptrace that would allow the
> prev task to read the memory of the next task. So it's not so much
> about ptrace nor about IBPB, it's about "IBPB_RSB" vs "prev task
> can read memory of next task". Then each arch can implement
> "IBPB_RSB" method its own way but the check is for the common
> code and it should be in the scheduler and there's just 1 version of
> this check needed.

Right, I get that. There's more to ptrace access than "I can read the
other task". There's more to what the processor is up to than IBPB.

> 
> I don't think there should be a ton of different versions of this
> function each providing a different answer, which is what LSM would
> provide.

If they provide different answers there should be different
functions. It's a question of viewpoint. If you don't care about
the SELinux viewpoint you shouldn't have to include it in your
checks, any more than you care about x86 issues when you're
running on a MIPS processor.

> At most you can add a x86 debugfs tweak to shut off the logic but
> again it would make more sense if retpolines could be shut off at
> runtime too, doing it just for IBPB sounds backwards because it has
> such an unmeasurable overhead.
> 
> > Yes. That would be me.
> 
> Even the attempt to make an innocuous call like
> ptrace_has_cap(tcred->user_ns, mode) will eventually
> deadlock there.

Which is yet another reason there needs to be separate logic.

> I used a PTRACE_MODE_ check that the ptrace check code uses to filter
> out specific calls that may eventually enter LSM and hard lockup in
> non reproducible workloads (some false positive IBPB is ok, especially
> if it avoids a deadlock).

Yes, even security people have to worry about locking.

> Everything can be fixed in any way, but calling LSM from scheduler
> code doesn't sound the most robust thing to do in general because what
> works outside the scheduler doesn't work from within those semi atomic
> regions (it tends to work initially until it eventually
> deadlocks). The original code of such check, had all sort of deadlocks
> because of that.

What *is* the most robust way?
Yes, there are locking issues. The code in the LSM infrastructure and in
the security modules needs to be aware of that and deal with it. The SELinux
code I proposed is more complex than it could be because the audit code
requires locking that doesn't work in the switching context.

> Thanks,
> Andrea

RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Schaufler, Casey

> -Original Message-
> From: Andrea Arcangeli [mailto:aarca...@redhat.com]
> Sent: Tuesday, September 04, 2018 4:37 PM
> To: Schaufler, Casey 
> Cc: Jiri Kosina ; Tim Chen ;
> Thomas Gleixner ; Ingo Molnar ;
> Peter Zijlstra ; Josh Poimboeuf
> ; Woodhouse, David ; Oleg
> Nesterov ; linux-kernel@vger.kernel.org; x...@kernel.org
> Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can
> be applied on arbitrary tasks
> 
> Hello,
> 
> On Tue, Sep 04, 2018 at 06:10:47PM +, Schaufler, Casey wrote:
> > The real reason to use an LSM based approach is that overloading ptrace
> > checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a
> > processor interface. Even if ptrace_may_access() does exactly what you
> 
> "Side channel is a processor interface" doesn't make me optimistic,
> but I assume you're not speaking for Intel.

Sorry, I've been working in security too long for my
optimistic streak to be very wide.

> > want it to for side-channel mitigation today it would be incredibly
> > inappropriate to tie the two together for eternity. You don't want to 
> > restrict
> > the ptrace checks to only those things that are also good for side-channel,
> > and vice versa.
> 
> It seems like you want to make this more configurable,

Not especially, no. I have gotten considerable feedback that
it should be configurable, and while I may not see the value myself,
I do respect the people who've requested the configurability.

> we have all
> debugfs x86 specific tweaks to disable IBPB at runtime and we don't
> allow a runtime opt-out of IBPB alone.
> 
> If you shutdown either IBRS or retpolines at runtime with debugfs,
> then IBPB goes away too.
> 
> Giving a finegrined way to disable only IBPB we found was overkill
> because IBPB has never been measurable if done only when the prev task
> cannot ptrace the next task (which is a superset of the too weak
> upstream not dumpable check, but still not a performance issue).
> 
> Allowing IBPB runtime opt-out doesn't make much sense if you don't
> allow to disable retpolines too still at runtime. And disabling
> retpolines from LSM doesn't sounds the right place, it's an x86
> temporary quirk only relevant for current buggy CPUs.
> 
> There should be a function that decides when IBPB and flush_RSB should
> be run (flush_RSB has to be run if SMEP because with SMEP there's no
> need to run flush_RSB at every kernel entry anymore), and that
> function happens to check ptrace today, but it would check other stuff
> too if we had other interfaces besides ptrace that would allow the
> prev task to read the memory of the next task. So it's not so much
> about ptrace nor about IBPB, it's about "IBPB_RSB" vs "prev task
> can read memory of next task". Then each arch can implement
> "IBPB_RSB" method its own way but the check is for the common
> code and it should be in the scheduler and there's just 1 version of
> this check needed.

Right, I get that. There's more to ptrace access than "I can read the
other task". There's more to what the processor is up to than IBPB.

> 
> I don't think there should be a ton of different versions of this
> function each providing a different answer, which is what LSM would
> provide.

If they provide different answers there should be different
functions. It's a question of viewpoint. If you don't care about
the SELinux viewpoint you shouldn't have to include it in your
checks, any more than you care about x86 issues when you're
running on a MIPS processor.

> At most you can add a x86 debugfs tweak to shut off the logic but
> again it would make more sense if retpolines could be shut off at
> runtime too, doing it just for IBPB sounds backwards because it has
> such an unmeasurable overhead.
> 
> > Yes. That would be me.
> 
> Even the attempt to make an innocuous call like
> ptrace_has_cap(tcred->user_ns, mode) will eventually
> deadlock there.

Which is yet another reason there needs to be separate logic.

> I used a PTRACE_MODE_ check that the ptrace check code uses to filter
> out specific calls that may eventually enter LSM and hard lockup in
> non reproducible workloads (some false positive IBPB is ok, especially
> if it avoids a deadlock).

Yes, even security people have to worry about locking.

> Everything can be fixed in any way, but calling LSM from scheduler
> code doesn't sound the most robust thing to do in general because what
> works outside the scheduler doesn't work from within those semi atomic
> regions (it tends to work initially until it eventually
> deadlocks). The original code of such check, had all sort of deadlocks
> because of that.

What *is* the most robust way?
Yes, there are locking issues. The code in the LSM infrastructure and in
the security modules needs to be aware of that and deal with it. The SELinux
code I proposed is more complex than it could be because the audit code
requires locking that doesn't work in the switching context.

> Thanks,
> Andrea

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Andrea Arcangeli

Hello,

On Tue, Sep 04, 2018 at 06:10:47PM +, Schaufler, Casey wrote:
> The real reason to use an LSM based approach is that overloading ptrace
> checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a
> processor interface. Even if ptrace_may_access() does exactly what you

"Side channel is a processor interface" doesn't make me optimistic,
but I assume you're not speaking for Intel.

> want it to for side-channel mitigation today it would be incredibly
> inappropriate to tie the two together for eternity. You don't want to restrict
> the ptrace checks to only those things that are also good for side-channel,
> and vice versa. 

It seems like you want to make this more configurable, we have all
debugfs x86 specific tweaks to disable IBPB at runtime and we don't
allow a runtime opt-out of IBPB alone.

If you shutdown either IBRS or retpolines at runtime with debugfs,
then IBPB goes away too.

Giving a finegrined way to disable only IBPB we found was overkill
because IBPB has never been measurable if done only when the prev task
cannot ptrace the next task (which is a superset of the too weak
upstream not dumpable check, but still not a performance issue).

Allowing IBPB runtime opt-out doesn't make much sense if you don't
allow to disable retpolines too still at runtime. And disabling
retpolines from LSM doesn't sounds the right place, it's an x86
temporary quirk only relevant for current buggy CPUs.

There should be a function that decides when IBPB and flush_RSB should
be run (flush_RSB has to be run if SMEP because with SMEP there's no
need to run flush_RSB at every kernel entry anymore), and that
function happens to check ptrace today, but it would check other stuff
too if we had other interfaces besides ptrace that would allow the
prev task to read the memory of the next task. So it's not so much
about ptrace nor about IBPB, it's about "IBPB_RSB" vs "prev task
can read memory of next task". Then each arch can implement
"IBPB_RSB" method its own way but the check is for the common
code and it should be in the scheduler and there's just 1 version of
this check needed.

I don't think there should be a ton of different versions of this
function each providing a different answer, which is what LSM would
provide.

At most you can add a x86 debugfs tweak to shut off the logic but
again it would make more sense if retpolines could be shut off at
runtime too, doing it just for IBPB sounds backwards because it has
such an unmeasurable overhead.

> Yes. That would be me. 

Even the attempt to make an innocuous call like
ptrace_has_cap(tcred->user_ns, mode) will eventually
deadlock there.

I used a PTRACE_MODE_ check that the ptrace check code uses to filter
out specific calls that may eventually enter LSM and hard lockup in
non reproducible workloads (some false positive IBPB is ok, especially
if it avoids a deadlock).

Everything can be fixed in any way, but calling LSM from scheduler
code doesn't sound the most robust thing to do in general because what
works outside the scheduler doesn't work from within those semi atomic
regions (it tends to work initially until it eventually
deadlocks). The original code of such check, had all sort of deadlocks
because of that.

Thanks,
Andrea

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Andrea Arcangeli

Hello,

On Tue, Sep 04, 2018 at 06:10:47PM +, Schaufler, Casey wrote:
> The real reason to use an LSM based approach is that overloading ptrace
> checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a
> processor interface. Even if ptrace_may_access() does exactly what you

"Side channel is a processor interface" doesn't make me optimistic,
but I assume you're not speaking for Intel.

> want it to for side-channel mitigation today it would be incredibly
> inappropriate to tie the two together for eternity. You don't want to restrict
> the ptrace checks to only those things that are also good for side-channel,
> and vice versa. 

It seems like you want to make this more configurable, we have all
debugfs x86 specific tweaks to disable IBPB at runtime and we don't
allow a runtime opt-out of IBPB alone.

If you shutdown either IBRS or retpolines at runtime with debugfs,
then IBPB goes away too.

Giving a finegrined way to disable only IBPB we found was overkill
because IBPB has never been measurable if done only when the prev task
cannot ptrace the next task (which is a superset of the too weak
upstream not dumpable check, but still not a performance issue).

Allowing IBPB runtime opt-out doesn't make much sense if you don't
allow to disable retpolines too still at runtime. And disabling
retpolines from LSM doesn't sounds the right place, it's an x86
temporary quirk only relevant for current buggy CPUs.

There should be a function that decides when IBPB and flush_RSB should
be run (flush_RSB has to be run if SMEP because with SMEP there's no
need to run flush_RSB at every kernel entry anymore), and that
function happens to check ptrace today, but it would check other stuff
too if we had other interfaces besides ptrace that would allow the
prev task to read the memory of the next task. So it's not so much
about ptrace nor about IBPB, it's about "IBPB_RSB" vs "prev task
can read memory of next task". Then each arch can implement
"IBPB_RSB" method its own way but the check is for the common
code and it should be in the scheduler and there's just 1 version of
this check needed.

I don't think there should be a ton of different versions of this
function each providing a different answer, which is what LSM would
provide.

At most you can add a x86 debugfs tweak to shut off the logic but
again it would make more sense if retpolines could be shut off at
runtime too, doing it just for IBPB sounds backwards because it has
such an unmeasurable overhead.

> Yes. That would be me. 

Even the attempt to make an innocuous call like
ptrace_has_cap(tcred->user_ns, mode) will eventually
deadlock there.

I used a PTRACE_MODE_ check that the ptrace check code uses to filter
out specific calls that may eventually enter LSM and hard lockup in
non reproducible workloads (some false positive IBPB is ok, especially
if it avoids a deadlock).

Everything can be fixed in any way, but calling LSM from scheduler
code doesn't sound the most robust thing to do in general because what
works outside the scheduler doesn't work from within those semi atomic
regions (it tends to work initially until it eventually
deadlocks). The original code of such check, had all sort of deadlocks
because of that.

Thanks,
Andrea

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Tim Chen

On 09/04/2018 11:48 AM, Jiri Kosina wrote:
> On Tue, 4 Sep 2018, Schaufler, Casey wrote:
> 
>>> So if this should be done in LSM, it'd probably have to be written by
>>> someone else than me :) who actually understands how the "sidechannel LSM"
>>> idea works.
>>
>> Yes. That would be me. 
> 
> Ok, cool. Then 1/2 and 2/3 can be ignored / replaced by Casey's LSM stuff. 
> 
> Some form of 3/3 still should be merged independently on that.

I think STIBP should be an opt in option as it will have significant
impact on performance.  The attack from neighbor thread is pretty
difficult to pull off considering you have to know what the sibling
thread is running and its address allocation.

We could also use a security module to opt in the STIBP policy.

Tim 

> 
> Thanks,
>

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Tim Chen

On 09/04/2018 11:48 AM, Jiri Kosina wrote:
> On Tue, 4 Sep 2018, Schaufler, Casey wrote:
> 
>>> So if this should be done in LSM, it'd probably have to be written by
>>> someone else than me :) who actually understands how the "sidechannel LSM"
>>> idea works.
>>
>> Yes. That would be me. 
> 
> Ok, cool. Then 1/2 and 2/3 can be ignored / replaced by Casey's LSM stuff. 
> 
> Some form of 3/3 still should be merged independently on that.

I think STIBP should be an opt in option as it will have significant
impact on performance.  The attack from neighbor thread is pretty
difficult to pull off considering you have to know what the sibling
thread is running and its address allocation.

We could also use a security module to opt in the STIBP policy.

Tim 

> 
> Thanks,
>

RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Jiri Kosina

On Tue, 4 Sep 2018, Schaufler, Casey wrote:

> > So if this should be done in LSM, it'd probably have to be written by
> > someone else than me :) who actually understands how the "sidechannel LSM"
> > idea works.
> 
> Yes. That would be me. 

Ok, cool. Then 1/2 and 2/3 can be ignored / replaced by Casey's LSM stuff. 

Some form of 3/3 still should be merged independently on that.

Thanks,

-- 
Jiri Kosina
SUSE Labs

RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Jiri Kosina

On Tue, 4 Sep 2018, Schaufler, Casey wrote:

> > So if this should be done in LSM, it'd probably have to be written by
> > someone else than me :) who actually understands how the "sidechannel LSM"
> > idea works.
> 
> Yes. That would be me. 

Ok, cool. Then 1/2 and 2/3 can be ignored / replaced by Casey's LSM stuff. 

Some form of 3/3 still should be merged independently on that.

Thanks,

-- 
Jiri Kosina
SUSE Labs

RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Schaufler, Casey

> -Original Message-
> From: Jiri Kosina [mailto:ji...@kernel.org]
> Sent: Tuesday, September 04, 2018 10:35 AM
> To: Tim Chen 
> Cc: Thomas Gleixner ; Ingo Molnar ;
> Peter Zijlstra ; Josh Poimboeuf
> ; Andrea Arcangeli ;
> Woodhouse, David ; Oleg Nesterov
> ; Schaufler, Casey ; linux-
> ker...@vger.kernel.org; x...@kernel.org
> Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can
> be applied on arbitrary tasks
> 
> On Tue, 4 Sep 2018, Tim Chen wrote:
> 
> > > Current ptrace_may_access() implementation assumes that the 'source'
> task is
> > > always the caller (current).
> > >
> > > Expose ___ptrace_may_access() that can be used to apply the check on
> arbitrary
> > > tasks.
> >
> > Casey recently has proposed putting the decision making of whether to
> > do IBPB in the security module.
> >
> > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-
> casey.schauf...@intel.com/
> >
> > That will have the advantage of giving the administrator a more flexibility
> > of when to turn on IBPB.  The policy is very similar to what you have 
> > proposed
> here
> > but I think the security module is a more appropriate place for the security
> policy.
> 
> Yeah, well, honestly, I have a bit hard time buying the "generic
> sidechannel prevention security module" idea, given how completely
> different in nature all the mitigations have been so far. I don't see that
> trying to abstract this somehow provides more clarity.

The real reason to use an LSM based approach is that overloading ptrace
checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a
processor interface. Even if ptrace_may_access() does exactly what you
want it to for side-channel mitigation today it would be incredibly
inappropriate to tie the two together for eternity. You don't want to restrict
the ptrace checks to only those things that are also good for side-channel,
and vice versa. 


> So if this should be done in LSM, it'd probably have to be written by
> someone else than me :) who actually understands how the "sidechannel LSM"
> idea works.

Yes. That would be me. 

> 
> Thanks,
> 
> --
> Jiri Kosina
> SUSE Labs

RE: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Schaufler, Casey

> -Original Message-
> From: Jiri Kosina [mailto:ji...@kernel.org]
> Sent: Tuesday, September 04, 2018 10:35 AM
> To: Tim Chen 
> Cc: Thomas Gleixner ; Ingo Molnar ;
> Peter Zijlstra ; Josh Poimboeuf
> ; Andrea Arcangeli ;
> Woodhouse, David ; Oleg Nesterov
> ; Schaufler, Casey ; linux-
> ker...@vger.kernel.org; x...@kernel.org
> Subject: Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can
> be applied on arbitrary tasks
> 
> On Tue, 4 Sep 2018, Tim Chen wrote:
> 
> > > Current ptrace_may_access() implementation assumes that the 'source'
> task is
> > > always the caller (current).
> > >
> > > Expose ___ptrace_may_access() that can be used to apply the check on
> arbitrary
> > > tasks.
> >
> > Casey recently has proposed putting the decision making of whether to
> > do IBPB in the security module.
> >
> > https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-
> casey.schauf...@intel.com/
> >
> > That will have the advantage of giving the administrator a more flexibility
> > of when to turn on IBPB.  The policy is very similar to what you have 
> > proposed
> here
> > but I think the security module is a more appropriate place for the security
> policy.
> 
> Yeah, well, honestly, I have a bit hard time buying the "generic
> sidechannel prevention security module" idea, given how completely
> different in nature all the mitigations have been so far. I don't see that
> trying to abstract this somehow provides more clarity.

The real reason to use an LSM based approach is that overloading ptrace
checks is a Really Bad Idea. Ptrace is a user interface. Side-channel is a
processor interface. Even if ptrace_may_access() does exactly what you
want it to for side-channel mitigation today it would be incredibly
inappropriate to tie the two together for eternity. You don't want to restrict
the ptrace checks to only those things that are also good for side-channel,
and vice versa. 


> So if this should be done in LSM, it'd probably have to be written by
> someone else than me :) who actually understands how the "sidechannel LSM"
> idea works.

Yes. That would be me. 

> 
> Thanks,
> 
> --
> Jiri Kosina
> SUSE Labs

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Jiri Kosina

On Tue, 4 Sep 2018, Tim Chen wrote:

> > Current ptrace_may_access() implementation assumes that the 'source' task is
> > always the caller (current).
> > 
> > Expose ___ptrace_may_access() that can be used to apply the check on 
> > arbitrary
> > tasks.
> 
> Casey recently has proposed putting the decision making of whether to
> do IBPB in the security module.
> 
> https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-casey.schauf...@intel.com/
> 
> That will have the advantage of giving the administrator a more flexibility
> of when to turn on IBPB.  The policy is very similar to what you have 
> proposed here
> but I think the security module is a more appropriate place for the security 
> policy.

Yeah, well, honestly, I have a bit hard time buying the "generic 
sidechannel prevention security module" idea, given how completely 
different in nature all the mitigations have been so far. I don't see that 
trying to abstract this somehow provides more clarity.

So if this should be done in LSM, it'd probably have to be written by 
someone else than me :) who actually understands how the "sidechannel LSM" 
idea works.

Thanks,

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Jiri Kosina

On Tue, 4 Sep 2018, Tim Chen wrote:

> > Current ptrace_may_access() implementation assumes that the 'source' task is
> > always the caller (current).
> > 
> > Expose ___ptrace_may_access() that can be used to apply the check on 
> > arbitrary
> > tasks.
> 
> Casey recently has proposed putting the decision making of whether to
> do IBPB in the security module.
> 
> https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-casey.schauf...@intel.com/
> 
> That will have the advantage of giving the administrator a more flexibility
> of when to turn on IBPB.  The policy is very similar to what you have 
> proposed here
> but I think the security module is a more appropriate place for the security 
> policy.

Yeah, well, honestly, I have a bit hard time buying the "generic 
sidechannel prevention security module" idea, given how completely 
different in nature all the mitigations have been so far. I don't see that 
trying to abstract this somehow provides more clarity.

So if this should be done in LSM, it'd probably have to be written by 
someone else than me :) who actually understands how the "sidechannel LSM" 
idea works.

Thanks,

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Tim Chen

On 09/04/2018 07:40 AM, Jiri Kosina wrote:
> From: Jiri Kosina 
> 
> Current ptrace_may_access() implementation assumes that the 'source' task is
> always the caller (current).
> 
> Expose ___ptrace_may_access() that can be used to apply the check on arbitrary
> tasks.

Casey recently has proposed putting the decision making of whether to
do IBPB in the security module.

https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-casey.schauf...@intel.com/

That will have the advantage of giving the administrator a more flexibility
of when to turn on IBPB.  The policy is very similar to what you have proposed 
here
but I think the security module is a more appropriate place for the security 
policy.

Thanks.

Tim

> 
> Originally-by: Tim Chen 
> Signed-off-by: Jiri Kosina 
> ---
> 
> Sorry for the resend, my pine is buggered and broke threading.
> 
>  include/linux/ptrace.h | 12 
>  kernel/ptrace.c| 13 ++---
>  2 files changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
> index 4f36431c380b..09744d4113fb 100644
> --- a/include/linux/ptrace.h
> +++ b/include/linux/ptrace.h
> @@ -87,6 +87,18 @@ extern void exit_ptrace(struct task_struct *tracer, struct 
> list_head *dead);
>   */
>  extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);
>  
> +/**
> + * ___ptrace_may_access - variant of ptrace_may_access that can be used
> + * between two arbitrary tasks
> + * @curr: source task
> + * @task: target task
> + * @mode: selects type of access and caller credentials
> + *
> + * Returns true on success, false on denial.
> + */
> +extern int ___ptrace_may_access(struct task_struct *curr, struct task_struct 
> *task,
> + unsigned int mode);
> +
>  static inline int ptrace_reparented(struct task_struct *child)
>  {
>   return !same_thread_group(child->real_parent, child->parent);
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 21fec73d45d4..07ff6685ebed 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -268,9 +268,10 @@ static int ptrace_has_cap(struct user_namespace *ns, 
> unsigned int mode)
>  }
>  
>  /* Returns 0 on success, -errno on denial. */
> -static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
> +int ___ptrace_may_access(struct task_struct *curr, struct task_struct *task,
> + unsigned int mode)
>  {
> - const struct cred *cred = current_cred(), *tcred;
> + const struct cred *cred, *tcred;
>   struct mm_struct *mm;
>   kuid_t caller_uid;
>   kgid_t caller_gid;
> @@ -290,9 +291,10 @@ static int __ptrace_may_access(struct task_struct *task, 
> unsigned int mode)
>*/
>  
>   /* Don't let security modules deny introspection */
> - if (same_thread_group(task, current))
> + if (same_thread_group(task, curr))
>   return 0;
>   rcu_read_lock();
> + cred =  __task_cred(curr);
>   if (mode & PTRACE_MODE_FSCREDS) {
>   caller_uid = cred->fsuid;
>   caller_gid = cred->fsgid;
> @@ -331,6 +333,11 @@ static int __ptrace_may_access(struct task_struct *task, 
> unsigned int mode)
>   return security_ptrace_access_check(task, mode);
>  }
>  
> +static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
> +{
> + return ___ptrace_may_access(current, task, mode);
> +}
> +
>  bool ptrace_may_access(struct task_struct *task, unsigned int mode)
>  {
>   int err;
>

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Tim Chen

On 09/04/2018 07:40 AM, Jiri Kosina wrote:
> From: Jiri Kosina 
> 
> Current ptrace_may_access() implementation assumes that the 'source' task is
> always the caller (current).
> 
> Expose ___ptrace_may_access() that can be used to apply the check on arbitrary
> tasks.

Casey recently has proposed putting the decision making of whether to
do IBPB in the security module.

https://lwn.net/ml/kernel-hardening/20180815235355.14908-4-casey.schauf...@intel.com/

That will have the advantage of giving the administrator a more flexibility
of when to turn on IBPB.  The policy is very similar to what you have proposed 
here
but I think the security module is a more appropriate place for the security 
policy.

Thanks.

Tim

> 
> Originally-by: Tim Chen 
> Signed-off-by: Jiri Kosina 
> ---
> 
> Sorry for the resend, my pine is buggered and broke threading.
> 
>  include/linux/ptrace.h | 12 
>  kernel/ptrace.c| 13 ++---
>  2 files changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
> index 4f36431c380b..09744d4113fb 100644
> --- a/include/linux/ptrace.h
> +++ b/include/linux/ptrace.h
> @@ -87,6 +87,18 @@ extern void exit_ptrace(struct task_struct *tracer, struct 
> list_head *dead);
>   */
>  extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);
>  
> +/**
> + * ___ptrace_may_access - variant of ptrace_may_access that can be used
> + * between two arbitrary tasks
> + * @curr: source task
> + * @task: target task
> + * @mode: selects type of access and caller credentials
> + *
> + * Returns true on success, false on denial.
> + */
> +extern int ___ptrace_may_access(struct task_struct *curr, struct task_struct 
> *task,
> + unsigned int mode);
> +
>  static inline int ptrace_reparented(struct task_struct *child)
>  {
>   return !same_thread_group(child->real_parent, child->parent);
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 21fec73d45d4..07ff6685ebed 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -268,9 +268,10 @@ static int ptrace_has_cap(struct user_namespace *ns, 
> unsigned int mode)
>  }
>  
>  /* Returns 0 on success, -errno on denial. */
> -static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
> +int ___ptrace_may_access(struct task_struct *curr, struct task_struct *task,
> + unsigned int mode)
>  {
> - const struct cred *cred = current_cred(), *tcred;
> + const struct cred *cred, *tcred;
>   struct mm_struct *mm;
>   kuid_t caller_uid;
>   kgid_t caller_gid;
> @@ -290,9 +291,10 @@ static int __ptrace_may_access(struct task_struct *task, 
> unsigned int mode)
>*/
>  
>   /* Don't let security modules deny introspection */
> - if (same_thread_group(task, current))
> + if (same_thread_group(task, curr))
>   return 0;
>   rcu_read_lock();
> + cred =  __task_cred(curr);
>   if (mode & PTRACE_MODE_FSCREDS) {
>   caller_uid = cred->fsuid;
>   caller_gid = cred->fsgid;
> @@ -331,6 +333,11 @@ static int __ptrace_may_access(struct task_struct *task, 
> unsigned int mode)
>   return security_ptrace_access_check(task, mode);
>  }
>  
> +static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
> +{
> + return ___ptrace_may_access(current, task, mode);
> +}
> +
>  bool ptrace_may_access(struct task_struct *task, unsigned int mode)
>  {
>   int err;
>

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Thomas Gleixner

On Tue, 4 Sep 2018, Thomas Gleixner wrote:
> On Tue, 4 Sep 2018, Jiri Kosina wrote:
> > +/**
> > + * ___ptrace_may_access - variant of ptrace_may_access that can be used
> > + * between two arbitrary tasks
> > + * @curr: source task
> 
> That's a pretty shitty name, really. If @curr is supposed to be an
> arbitrary task, then please rename it in a way which makes that entirely
> clear. curr is too close to current. Just grep for 'task_struct \*curr'.

Something like '*tracer' would make it entirely clear what's going on.

Thanks,

tglx

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Thomas Gleixner

On Tue, 4 Sep 2018, Thomas Gleixner wrote:
> On Tue, 4 Sep 2018, Jiri Kosina wrote:
> > +/**
> > + * ___ptrace_may_access - variant of ptrace_may_access that can be used
> > + * between two arbitrary tasks
> > + * @curr: source task
> 
> That's a pretty shitty name, really. If @curr is supposed to be an
> arbitrary task, then please rename it in a way which makes that entirely
> clear. curr is too close to current. Just grep for 'task_struct \*curr'.

Something like '*tracer' would make it entirely clear what's going on.

Thanks,

tglx

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Thomas Gleixner

On Tue, 4 Sep 2018, Jiri Kosina wrote:
> +/**
> + * ___ptrace_may_access - variant of ptrace_may_access that can be used
> + * between two arbitrary tasks
> + * @curr: source task

That's a pretty shitty name, really. If @curr is supposed to be an
arbitrary task, then please rename it in a way which makes that entirely
clear. curr is too close to current. Just grep for 'task_struct \*curr'.

Thanks,

tglx

Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

2018-09-04 Thread Thomas Gleixner

On Tue, 4 Sep 2018, Jiri Kosina wrote:
> +/**
> + * ___ptrace_may_access - variant of ptrace_may_access that can be used
> + * between two arbitrary tasks
> + * @curr: source task

That's a pretty shitty name, really. If @curr is supposed to be an
arbitrary task, then please rename it in a way which makes that entirely
clear. curr is too close to current. Just grep for 'task_struct \*curr'.

Thanks,

tglx

48 matches

Mail list logo