Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-05-01 Thread Khalid Aziz
On 5/1/19 8:49 AM, Waiman Long wrote:
> On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt
> b/Documentation/admin-guide/kernel-parameters.txt
> 
>> index 858b6c0b9a15..9b36da94760e 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -2997,6 +2997,12 @@
>>
>>   nox2apic    [X86-64,APIC] Do not enable x2APIC mode.
>>
>> +    noxpfo    [XPFO] Disable eXclusive Page Frame Ownership (XPFO)
>> +    when CONFIG_XPFO is on. Physical pages mapped into
>> +    user applications will also be mapped in the
>> +    kernel's address space as if CONFIG_XPFO was not
>> +    enabled.
>> +
>>   cpu0_hotplug    [X86] Turn on CPU0 hotplug feature when
>>   CONFIG_BO OTPARAM_HOTPLUG_CPU0 is off.
>>   Some features depend on CPU0. Known dependencies are:
> 
> Given the big performance impact that XPFO can have. It should be off by
> default when configured. Instead, the xpfo option should be used to
> enable it.

Agreed. I plan to disable it by default in the next version of the
patch. This is likely to end up being a feature for extreme security
conscious folks only, unless I or someone else comes up with further
significant performance boost.

Thanks,
Khalid

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-05-01 Thread Waiman Long
On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
> diff --git a/Documentation/admin-guide/kernel-parameters.txt
b/Documentation/admin-guide/kernel-parameters.txt

> index 858b6c0b9a15..9b36da94760e 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2997,6 +2997,12 @@
>
>  nox2apic    [X86-64,APIC] Do not enable x2APIC mode.
>
> +    noxpfo    [XPFO] Disable eXclusive Page Frame Ownership (XPFO)
> +    when CONFIG_XPFO is on. Physical pages mapped into
> +    user applications will also be mapped in the
> +    kernel's address space as if CONFIG_XPFO was not
> +    enabled.
> +
>  cpu0_hotplug    [X86] Turn on CPU0 hotplug feature when
>  CONFIG_BO OTPARAM_HOTPLUG_CPU0 is off.
>  Some features depend on CPU0. Known dependencies are:

Given the big performance impact that XPFO can have. It should be off by
default when configured. Instead, the xpfo option should be used to
enable it.

Cheers,
Longman

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-22 Thread Kees Cook via iommu
On Thu, Apr 18, 2019 at 7:35 AM Khalid Aziz  wrote:
>
> On 4/17/19 11:41 PM, Kees Cook wrote:
> > On Wed, Apr 17, 2019 at 11:41 PM Andy Lutomirski  wrote:
> >> I don't think this type of NX goof was ever the argument for XPFO.
> >> The main argument I've heard is that a malicious user program writes a
> >> ROP payload into user memory (regular anonymous user memory) and then
> >> gets the kernel to erroneously set RSP (*not* RIP) to point there.
> >
> > Well, more than just ROP. Any of the various attack primitives. The NX
> > stuff is about moving RIP: SMEP-bypassing. But there is still basic
> > SMAP-bypassing for putting a malicious structure in userspace and
> > having the kernel access it via the linear mapping, etc.
> >
> >> I find this argument fairly weak for a couple reasons.  First, if
> >> we're worried about this, let's do in-kernel CFI, not XPFO, to
> >
> > CFI is getting much closer. Getting the kernel happy under Clang, LTO,
> > and CFI is under active development. (It's functional for arm64
> > already, and pieces have been getting upstreamed.)
> >
>
> CFI theoretically offers protection with fairly low overhead. I have not
> played much with CFI in clang. I agree with Linus that probability of
> bugs in XPFO implementation itself is a cause of concern. If CFI in
> Clang can provide us the same level of protection as XPFO does, I
> wouldn't want to push for an expensive change like XPFO.
>
> If Clang/CFI can't get us there for extended period of time, does it
> make sense to continue to poke at XPFO?

Well, I think CFI will certainly vastly narrow the execution paths
available to an attacker, but what I continue to see XPFO useful for
is stopping attacks that need to locate something in memory. (i.e. not
ret2dir but, like, read2dir.) It's arguable that such attacks would
just use heap, stack, etc to hold such things, but the linear map
remains relatively easy to find/target. But I agree: the protection is
getting more and more narrow (especially with CFI coming down the
pipe), and if it's still a 28% hit, that's not going to be tenable for
anyone but the truly paranoid. :)

All that said, there isn't a very good backward-edge CFI protection
(i.e. ROP defense) on x86 in Clang. The forward-edge looks decent, but
requires LTO, etc. My point is there is still a long path to gaining
CFI in upstream.

-- 
Kees Cook
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-22 Thread Khalid Aziz
On 4/18/19 8:34 AM, Khalid Aziz wrote:
> On 4/17/19 11:41 PM, Kees Cook wrote:
>> On Wed, Apr 17, 2019 at 11:41 PM Andy Lutomirski  wrote:
>>> I don't think this type of NX goof was ever the argument for XPFO.
>>> The main argument I've heard is that a malicious user program writes a
>>> ROP payload into user memory (regular anonymous user memory) and then
>>> gets the kernel to erroneously set RSP (*not* RIP) to point there.
>>
>> Well, more than just ROP. Any of the various attack primitives. The NX
>> stuff is about moving RIP: SMEP-bypassing. But there is still basic
>> SMAP-bypassing for putting a malicious structure in userspace and
>> having the kernel access it via the linear mapping, etc.
>>
>>> I find this argument fairly weak for a couple reasons.  First, if
>>> we're worried about this, let's do in-kernel CFI, not XPFO, to
>>
>> CFI is getting much closer. Getting the kernel happy under Clang, LTO,
>> and CFI is under active development. (It's functional for arm64
>> already, and pieces have been getting upstreamed.)
>>
> 
> CFI theoretically offers protection with fairly low overhead. I have not
> played much with CFI in clang. I agree with Linus that probability of
> bugs in XPFO implementation itself is a cause of concern. If CFI in
> Clang can provide us the same level of protection as XPFO does, I
> wouldn't want to push for an expensive change like XPFO.
> 
> If Clang/CFI can't get us there for extended period of time, does it
> make sense to continue to poke at XPFO?

Any feedback on continued effort on XPFO? If it makes sense to have XPFO
available as a solution for ret2dir issue in case Clang/CFI does not
work out, I will continue to refine it.

--
Khalid

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-18 Thread Khalid Aziz
On 4/17/19 11:41 PM, Kees Cook wrote:
> On Wed, Apr 17, 2019 at 11:41 PM Andy Lutomirski  wrote:
>> I don't think this type of NX goof was ever the argument for XPFO.
>> The main argument I've heard is that a malicious user program writes a
>> ROP payload into user memory (regular anonymous user memory) and then
>> gets the kernel to erroneously set RSP (*not* RIP) to point there.
> 
> Well, more than just ROP. Any of the various attack primitives. The NX
> stuff is about moving RIP: SMEP-bypassing. But there is still basic
> SMAP-bypassing for putting a malicious structure in userspace and
> having the kernel access it via the linear mapping, etc.
> 
>> I find this argument fairly weak for a couple reasons.  First, if
>> we're worried about this, let's do in-kernel CFI, not XPFO, to
> 
> CFI is getting much closer. Getting the kernel happy under Clang, LTO,
> and CFI is under active development. (It's functional for arm64
> already, and pieces have been getting upstreamed.)
> 

CFI theoretically offers protection with fairly low overhead. I have not
played much with CFI in clang. I agree with Linus that probability of
bugs in XPFO implementation itself is a cause of concern. If CFI in
Clang can provide us the same level of protection as XPFO does, I
wouldn't want to push for an expensive change like XPFO.

If Clang/CFI can't get us there for extended period of time, does it
make sense to continue to poke at XPFO?

Thanks,
Khalid

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-18 Thread Kees Cook via iommu
On Wed, Apr 17, 2019 at 11:41 PM Andy Lutomirski  wrote:
> I don't think this type of NX goof was ever the argument for XPFO.
> The main argument I've heard is that a malicious user program writes a
> ROP payload into user memory (regular anonymous user memory) and then
> gets the kernel to erroneously set RSP (*not* RIP) to point there.

Well, more than just ROP. Any of the various attack primitives. The NX
stuff is about moving RIP: SMEP-bypassing. But there is still basic
SMAP-bypassing for putting a malicious structure in userspace and
having the kernel access it via the linear mapping, etc.

> I find this argument fairly weak for a couple reasons.  First, if
> we're worried about this, let's do in-kernel CFI, not XPFO, to

CFI is getting much closer. Getting the kernel happy under Clang, LTO,
and CFI is under active development. (It's functional for arm64
already, and pieces have been getting upstreamed.)

> mitigate it.  Second, I don't see why the exact same attack can't be
> done using, say, page cache, and unless I'm missing something, XPFO
> doesn't protect page cache.  Or network buffers, or pipe buffers, etc.

My understanding is that it's much easier to feel out the linear
mapping address than for the others. But yes, all of those same attack
primitives are possible in other memory areas (though most are NX),
and plenty of exploits have done such things.

-- 
Kees Cook
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-18 Thread Thomas Gleixner
On Wed, 17 Apr 2019, Linus Torvalds wrote:
> On Wed, Apr 17, 2019 at 4:42 PM Thomas Gleixner  wrote:
> > On Wed, 17 Apr 2019, Linus Torvalds wrote:
> > > With SMEP, user space pages are always NX.
> >
> > We talk past each other. The user space page in the ring3 valid virtual
> > address space (non negative) is of course protected by SMEP.
> >
> > The attack utilizes the kernel linear mapping of the physical
> > memory. I.e. user space address 0x43210 has a kernel equivalent at
> > 0xfxx. So if the attack manages to trick the kernel to that valid
> > kernel address and that is mapped X --> game over. SMEP does not help
> > there.
> 
> Oh, agreed.
> 
> But that would simply be a kernel bug. We should only map kernel pages
> executable when we have kernel code in them, and we should certainly
> not allow those pages to be mapped writably in user space.
> 
> That kind of "executable in kernel, writable in user" would be a
> horrendous and major bug.
> 
> So i think it's a non-issue.

Pretty much.

> > From the top of my head I'd say this is a non issue as those kernel address
> > space mappings _should_ be NX, but we got bitten by _should_ in the past:)
> 
> I do agree that bugs can happen, obviously, and we might have missed 
> something.
>
> But in the context of XPFO, I would argue (*very* strongly) that the
> likelihood of the above kind of bug is absolutely *miniscule* compared
> to the likelihood that we'd have something wrong in the software
> implementation of XPFO.
> 
> So if the argument is "we might have bugs in software", then I think
> that's an argument _against_ XPFO rather than for it.

No argument from my side. We better spend time to make sure that a bogus
kernel side X mapping is caught, like we catch other things.

Thanks,

tglx

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Andy Lutomirski
On Wed, Apr 17, 2019 at 5:00 PM Linus Torvalds
 wrote:
>
> On Wed, Apr 17, 2019 at 4:42 PM Thomas Gleixner  wrote:
> >
> > On Wed, 17 Apr 2019, Linus Torvalds wrote:
> >
> > > With SMEP, user space pages are always NX.
> >
> > We talk past each other. The user space page in the ring3 valid virtual
> > address space (non negative) is of course protected by SMEP.
> >
> > The attack utilizes the kernel linear mapping of the physical
> > memory. I.e. user space address 0x43210 has a kernel equivalent at
> > 0xfxx. So if the attack manages to trick the kernel to that valid
> > kernel address and that is mapped X --> game over. SMEP does not help
> > there.
>
> Oh, agreed.
>
> But that would simply be a kernel bug. We should only map kernel pages
> executable when we have kernel code in them, and we should certainly
> not allow those pages to be mapped writably in user space.
>
> That kind of "executable in kernel, writable in user" would be a
> horrendous and major bug.
>
> So i think it's a non-issue.
>
> > From the top of my head I'd say this is a non issue as those kernel address
> > space mappings _should_ be NX, but we got bitten by _should_ in the past:)
>
> I do agree that bugs can happen, obviously, and we might have missed 
> something.
>
> But in the context of XPFO, I would argue (*very* strongly) that the
> likelihood of the above kind of bug is absolutely *miniscule* compared
> to the likelihood that we'd have something wrong in the software
> implementation of XPFO.
>
> So if the argument is "we might have bugs in software", then I think
> that's an argument _against_ XPFO rather than for it.
>

I don't think this type of NX goof was ever the argument for XPFO.
The main argument I've heard is that a malicious user program writes a
ROP payload into user memory (regular anonymous user memory) and then
gets the kernel to erroneously set RSP (*not* RIP) to point there.

I find this argument fairly weak for a couple reasons.  First, if
we're worried about this, let's do in-kernel CFI, not XPFO, to
mitigate it.  Second, I don't see why the exact same attack can't be
done using, say, page cache, and unless I'm missing something, XPFO
doesn't protect page cache.  Or network buffers, or pipe buffers, etc.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Linus Torvalds
On Wed, Apr 17, 2019 at 4:42 PM Thomas Gleixner  wrote:
>
> On Wed, 17 Apr 2019, Linus Torvalds wrote:
>
> > With SMEP, user space pages are always NX.
>
> We talk past each other. The user space page in the ring3 valid virtual
> address space (non negative) is of course protected by SMEP.
>
> The attack utilizes the kernel linear mapping of the physical
> memory. I.e. user space address 0x43210 has a kernel equivalent at
> 0xfxx. So if the attack manages to trick the kernel to that valid
> kernel address and that is mapped X --> game over. SMEP does not help
> there.

Oh, agreed.

But that would simply be a kernel bug. We should only map kernel pages
executable when we have kernel code in them, and we should certainly
not allow those pages to be mapped writably in user space.

That kind of "executable in kernel, writable in user" would be a
horrendous and major bug.

So i think it's a non-issue.

> From the top of my head I'd say this is a non issue as those kernel address
> space mappings _should_ be NX, but we got bitten by _should_ in the past:)

I do agree that bugs can happen, obviously, and we might have missed something.

But in the context of XPFO, I would argue (*very* strongly) that the
likelihood of the above kind of bug is absolutely *miniscule* compared
to the likelihood that we'd have something wrong in the software
implementation of XPFO.

So if the argument is "we might have bugs in software", then I think
that's an argument _against_ XPFO rather than for it.

Linus
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Thomas Gleixner
On Wed, 17 Apr 2019, Linus Torvalds wrote:

> On Wed, Apr 17, 2019, 14:20 Thomas Gleixner  wrote:
> 
> >
> > It's not necessarily a W+X issue. The user space text is mapped in the
> > kernel as well and even if it is mapped RX then this can happen. So any
> > kernel mappings of user space text need to be mapped NX!
> 
> With SMEP, user space pages are always NX.

We talk past each other. The user space page in the ring3 valid virtual
address space (non negative) is of course protected by SMEP.

The attack utilizes the kernel linear mapping of the physical
memory. I.e. user space address 0x43210 has a kernel equivalent at
0xfxx. So if the attack manages to trick the kernel to that valid
kernel address and that is mapped X --> game over. SMEP does not help
there.

>From the top of my head I'd say this is a non issue as those kernel address
space mappings _should_ be NX, but we got bitten by _should_ in the past:)

Thanks,

tglx


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Linus Torvalds
On Wed, Apr 17, 2019, 14:20 Thomas Gleixner  wrote:

>
> It's not necessarily a W+X issue. The user space text is mapped in the
> kernel as well and even if it is mapped RX then this can happen. So any
> kernel mappings of user space text need to be mapped NX!


With SMEP, user space pages are always NX.

I really think SM[AE]P is something we can already take for granted. People
who have old CPU's workout it are simply not serious about security anyway.
There is no point in saying "we can do it badly in software".

   Linus

>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Thomas Gleixner
On Wed, 17 Apr 2019, Nadav Amit wrote:
> > On Apr 17, 2019, at 10:26 AM, Ingo Molnar  wrote:
> >> As I was curious, I looked at the paper. Here is a quote from it:
> >> 
> >> "In x86-64, however, the permissions of physmap are not in sane state.
> >> Kernels up to v3.8.13 violate the W^X property by mapping the entire region
> >> as “readable, writeable, and executable” (RWX)—only very recent kernels
> >> (≥v3.9) use the more conservative RW mapping.”
> > 
> > But v3.8.13 is a 5+ years old kernel, it doesn't count as a "modern" 
> > kernel in any sense of the word. For any proposed patchset with 
> > significant complexity and non-trivial costs the benchmark version 
> > threshold is the "current upstream kernel".
> > 
> > So does that quote address my followup questions:
> > 
> >> Is this actually true of modern x86-64 kernels? We've locked down W^X
> >> protections in general.
> >> 
> >> I.e. this conclusion:
> >> 
> >>  "Therefore, by simply overwriting kfptr with 0x87FF9F08 and
> >>   triggering the kernel to dereference it, an attacker can directly
> >>   execute shell code with kernel privileges."
> >> 
> >> ... appears to be predicated on imperfect W^X protections on the x86-64
> >> kernel.
> >> 
> >> Do such holes exist on the latest x86-64 kernel? If yes, is there a
> >> reason to believe that these W^X holes cannot be fixed, or that any fix
> >> would be more expensive than XPFO?
> > 
> > ?
> > 
> > What you are proposing here is a XPFO patch-set against recent kernels 
> > with significant runtime overhead, so my questions about the W^X holes 
> > are warranted.
> > 
> 
> Just to clarify - I am an innocent bystander and have no part in this work.
> I was just looking (again) at the paper, as I was curious due to the recent
> patches that I sent that improve W^X protection.

It's not necessarily a W+X issue. The user space text is mapped in the
kernel as well and even if it is mapped RX then this can happen. So any
kernel mappings of user space text need to be mapped NX!

Thanks,

tglx___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Khalid Aziz
On 4/17/19 1:49 PM, Andy Lutomirski wrote:
> On Wed, Apr 17, 2019 at 10:33 AM Khalid Aziz  wrote:
>>
>> On 4/17/19 11:09 AM, Ingo Molnar wrote:
>>>
>>> * Khalid Aziz  wrote:
>>>
> I.e. the original motivation of the XPFO patches was to prevent execution
> of direct kernel mappings. Is this motivation still present if those
> mappings are non-executable?
>
> (Sorry if this has been asked and answered in previous discussions.)

 Hi Ingo,

 That is a good question. Because of the cost of XPFO, we have to be very
 sure we need this protection. The paper from Vasileios, Michalis and
 Angelos - ,
 does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
 and 6.2.
>>>
>>> So it would be nice if you could generally summarize external arguments
>>> when defending a patchset, instead of me having to dig through a PDF
>>> which not only causes me to spend time that you probably already spent
>>> reading that PDF, but I might also interpret it incorrectly. ;-)
>>
>> Sorry, you are right. Even though that paper explains it well, a summary
>> is always useful.
>>
>>>
>>> The PDF you cited says this:
>>>
>>>   "Unfortunately, as shown in Table 1, the W^X prop-erty is not enforced
>>>in many platforms, including x86-64.  In our example, the content of
>>>user address 0xBEEF000 is also accessible through kernel address
>>>0x87FF9F08 as plain, executable code."
>>>
>>> Is this actually true of modern x86-64 kernels? We've locked down W^X
>>> protections in general.
>>>
>>> I.e. this conclusion:
>>>
>>>   "Therefore, by simply overwriting kfptr with 0x87FF9F08 and
>>>triggering the kernel to dereference it, an attacker can directly
>>>execute shell code with kernel privileges."
>>>
>>> ... appears to be predicated on imperfect W^X protections on the x86-64
>>> kernel.
>>>
>>> Do such holes exist on the latest x86-64 kernel? If yes, is there a
>>> reason to believe that these W^X holes cannot be fixed, or that any fix
>>> would be more expensive than XPFO?
>>
>> Even if physmap is not executable, return-oriented programming (ROP) can
>> still be used to launch an attack. Instead of placing executable code at
>> user address 0xBEEF000, attacker can place an ROP payload there. kfptr
>> is then overwritten to point to a stack-pivoting gadget. Using the
>> physmap address aliasing, the ROP payload becomes kernel-mode stack. The
>> execution can then be hijacked upon execution of ret instruction. This
>> is a gist of the subsection titled "Non-executable physmap" under
>> section 6.2 and it looked convincing enough to me. If you have a
>> different take on this, I am very interested in your point of view.
> 
> My issue with all this is that XPFO is really very expensive.  I think
> that, if we're going to seriously consider upstreaming expensive
> exploit mitigations like this, we should consider others first, in
> particular CFI techniques.  grsecurity's RAP would be a great start.
> I also proposed using a gcc plugin (or upstream gcc feature) to add
> some instrumentation to any code that pops RSP to verify that the
> resulting (unsigned) change in RSP is between 0 and THREAD_SIZE bytes.
> This will make ROP quite a bit harder.
> 

Yes, XPFO is expensive. I have been able to reduce the overhead of XPFO
from 2537% to 28% (on large servers) but 28% is still quite significant.
Alternative mitigation techniques with lower impact would easily be more
acceptable as long as they provide same level of protection. If we have
to go with XPFO, we will continue to look for more performance
improvement to bring that number down further from 28%. Hopefully what
Tycho is working on will yield better results. I am continuing to look
for improvements to XPFO in parallel.

Thanks,
Khalid

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Tycho Andersen
On Wed, Apr 17, 2019 at 12:49:04PM -0700, Andy Lutomirski wrote:
> I also proposed using a gcc plugin (or upstream gcc feature) to add
> some instrumentation to any code that pops RSP to verify that the
> resulting (unsigned) change in RSP is between 0 and THREAD_SIZE bytes.
> This will make ROP quite a bit harder.

I've been playing around with this for a bit, and hope to have
something to post Soon :)

Tycho
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Andy Lutomirski
On Wed, Apr 17, 2019 at 10:33 AM Khalid Aziz  wrote:
>
> On 4/17/19 11:09 AM, Ingo Molnar wrote:
> >
> > * Khalid Aziz  wrote:
> >
> >>> I.e. the original motivation of the XPFO patches was to prevent execution
> >>> of direct kernel mappings. Is this motivation still present if those
> >>> mappings are non-executable?
> >>>
> >>> (Sorry if this has been asked and answered in previous discussions.)
> >>
> >> Hi Ingo,
> >>
> >> That is a good question. Because of the cost of XPFO, we have to be very
> >> sure we need this protection. The paper from Vasileios, Michalis and
> >> Angelos - ,
> >> does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
> >> and 6.2.
> >
> > So it would be nice if you could generally summarize external arguments
> > when defending a patchset, instead of me having to dig through a PDF
> > which not only causes me to spend time that you probably already spent
> > reading that PDF, but I might also interpret it incorrectly. ;-)
>
> Sorry, you are right. Even though that paper explains it well, a summary
> is always useful.
>
> >
> > The PDF you cited says this:
> >
> >   "Unfortunately, as shown in Table 1, the W^X prop-erty is not enforced
> >in many platforms, including x86-64.  In our example, the content of
> >user address 0xBEEF000 is also accessible through kernel address
> >0x87FF9F08 as plain, executable code."
> >
> > Is this actually true of modern x86-64 kernels? We've locked down W^X
> > protections in general.
> >
> > I.e. this conclusion:
> >
> >   "Therefore, by simply overwriting kfptr with 0x87FF9F08 and
> >triggering the kernel to dereference it, an attacker can directly
> >execute shell code with kernel privileges."
> >
> > ... appears to be predicated on imperfect W^X protections on the x86-64
> > kernel.
> >
> > Do such holes exist on the latest x86-64 kernel? If yes, is there a
> > reason to believe that these W^X holes cannot be fixed, or that any fix
> > would be more expensive than XPFO?
>
> Even if physmap is not executable, return-oriented programming (ROP) can
> still be used to launch an attack. Instead of placing executable code at
> user address 0xBEEF000, attacker can place an ROP payload there. kfptr
> is then overwritten to point to a stack-pivoting gadget. Using the
> physmap address aliasing, the ROP payload becomes kernel-mode stack. The
> execution can then be hijacked upon execution of ret instruction. This
> is a gist of the subsection titled "Non-executable physmap" under
> section 6.2 and it looked convincing enough to me. If you have a
> different take on this, I am very interested in your point of view.

My issue with all this is that XPFO is really very expensive.  I think
that, if we're going to seriously consider upstreaming expensive
exploit mitigations like this, we should consider others first, in
particular CFI techniques.  grsecurity's RAP would be a great start.
I also proposed using a gcc plugin (or upstream gcc feature) to add
some instrumentation to any code that pops RSP to verify that the
resulting (unsigned) change in RSP is between 0 and THREAD_SIZE bytes.
This will make ROP quite a bit harder.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Nadav Amit
> On Apr 17, 2019, at 10:26 AM, Ingo Molnar  wrote:
> 
> 
> * Nadav Amit  wrote:
> 
>>> On Apr 17, 2019, at 10:09 AM, Ingo Molnar  wrote:
>>> 
>>> 
>>> * Khalid Aziz  wrote:
>>> 
> I.e. the original motivation of the XPFO patches was to prevent execution 
> of direct kernel mappings. Is this motivation still present if those 
> mappings are non-executable?
> 
> (Sorry if this has been asked and answered in previous discussions.)
 
 Hi Ingo,
 
 That is a good question. Because of the cost of XPFO, we have to be very
 sure we need this protection. The paper from Vasileios, Michalis and
 Angelos - ,
 does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
 and 6.2.
>>> 
>>> So it would be nice if you could generally summarize external arguments 
>>> when defending a patchset, instead of me having to dig through a PDF 
>>> which not only causes me to spend time that you probably already spent 
>>> reading that PDF, but I might also interpret it incorrectly. ;-)
>>> 
>>> The PDF you cited says this:
>>> 
>>> "Unfortunately, as shown in Table 1, the W^X prop-erty is not enforced 
>>>  in many platforms, including x86-64.  In our example, the content of 
>>>  user address 0xBEEF000 is also accessible through kernel address 
>>>  0x87FF9F08 as plain, executable code."
>>> 
>>> Is this actually true of modern x86-64 kernels? We've locked down W^X 
>>> protections in general.
>> 
>> As I was curious, I looked at the paper. Here is a quote from it:
>> 
>> "In x86-64, however, the permissions of physmap are not in sane state.
>> Kernels up to v3.8.13 violate the W^X property by mapping the entire region
>> as “readable, writeable, and executable” (RWX)—only very recent kernels
>> (≥v3.9) use the more conservative RW mapping.”
> 
> But v3.8.13 is a 5+ years old kernel, it doesn't count as a "modern" 
> kernel in any sense of the word. For any proposed patchset with 
> significant complexity and non-trivial costs the benchmark version 
> threshold is the "current upstream kernel".
> 
> So does that quote address my followup questions:
> 
>> Is this actually true of modern x86-64 kernels? We've locked down W^X
>> protections in general.
>> 
>> I.e. this conclusion:
>> 
>>  "Therefore, by simply overwriting kfptr with 0x87FF9F08 and
>>   triggering the kernel to dereference it, an attacker can directly
>>   execute shell code with kernel privileges."
>> 
>> ... appears to be predicated on imperfect W^X protections on the x86-64
>> kernel.
>> 
>> Do such holes exist on the latest x86-64 kernel? If yes, is there a
>> reason to believe that these W^X holes cannot be fixed, or that any fix
>> would be more expensive than XPFO?
> 
> ?
> 
> What you are proposing here is a XPFO patch-set against recent kernels 
> with significant runtime overhead, so my questions about the W^X holes 
> are warranted.
> 

Just to clarify - I am an innocent bystander and have no part in this work.
I was just looking (again) at the paper, as I was curious due to the recent
patches that I sent that improve W^X protection.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Khalid Aziz
On 4/17/19 11:09 AM, Ingo Molnar wrote:
> 
> * Khalid Aziz  wrote:
> 
>>> I.e. the original motivation of the XPFO patches was to prevent execution 
>>> of direct kernel mappings. Is this motivation still present if those 
>>> mappings are non-executable?
>>>
>>> (Sorry if this has been asked and answered in previous discussions.)
>>
>> Hi Ingo,
>>
>> That is a good question. Because of the cost of XPFO, we have to be very
>> sure we need this protection. The paper from Vasileios, Michalis and
>> Angelos - ,
>> does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
>> and 6.2.
> 
> So it would be nice if you could generally summarize external arguments 
> when defending a patchset, instead of me having to dig through a PDF 
> which not only causes me to spend time that you probably already spent 
> reading that PDF, but I might also interpret it incorrectly. ;-)

Sorry, you are right. Even though that paper explains it well, a summary
is always useful.

> 
> The PDF you cited says this:
> 
>   "Unfortunately, as shown in Table 1, the W^X prop-erty is not enforced 
>in many platforms, including x86-64.  In our example, the content of 
>user address 0xBEEF000 is also accessible through kernel address 
>0x87FF9F08 as plain, executable code."
> 
> Is this actually true of modern x86-64 kernels? We've locked down W^X 
> protections in general.
> 
> I.e. this conclusion:
> 
>   "Therefore, by simply overwriting kfptr with 0x87FF9F08 and 
>triggering the kernel to dereference it, an attacker can directly 
>execute shell code with kernel privileges."
> 
> ... appears to be predicated on imperfect W^X protections on the x86-64 
> kernel.
> 
> Do such holes exist on the latest x86-64 kernel? If yes, is there a 
> reason to believe that these W^X holes cannot be fixed, or that any fix 
> would be more expensive than XPFO?

Even if physmap is not executable, return-oriented programming (ROP) can
still be used to launch an attack. Instead of placing executable code at
user address 0xBEEF000, attacker can place an ROP payload there. kfptr
is then overwritten to point to a stack-pivoting gadget. Using the
physmap address aliasing, the ROP payload becomes kernel-mode stack. The
execution can then be hijacked upon execution of ret instruction. This
is a gist of the subsection titled "Non-executable physmap" under
section 6.2 and it looked convincing enough to me. If you have a
different take on this, I am very interested in your point of view.

Thanks,
Khalid


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Ingo Molnar

* Nadav Amit  wrote:

> > On Apr 17, 2019, at 10:09 AM, Ingo Molnar  wrote:
> > 
> > 
> > * Khalid Aziz  wrote:
> > 
> >>> I.e. the original motivation of the XPFO patches was to prevent execution 
> >>> of direct kernel mappings. Is this motivation still present if those 
> >>> mappings are non-executable?
> >>> 
> >>> (Sorry if this has been asked and answered in previous discussions.)
> >> 
> >> Hi Ingo,
> >> 
> >> That is a good question. Because of the cost of XPFO, we have to be very
> >> sure we need this protection. The paper from Vasileios, Michalis and
> >> Angelos - ,
> >> does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
> >> and 6.2.
> > 
> > So it would be nice if you could generally summarize external arguments 
> > when defending a patchset, instead of me having to dig through a PDF 
> > which not only causes me to spend time that you probably already spent 
> > reading that PDF, but I might also interpret it incorrectly. ;-)
> > 
> > The PDF you cited says this:
> > 
> >  "Unfortunately, as shown in Table 1, the W^X prop-erty is not enforced 
> >   in many platforms, including x86-64.  In our example, the content of 
> >   user address 0xBEEF000 is also accessible through kernel address 
> >   0x87FF9F08 as plain, executable code."
> > 
> > Is this actually true of modern x86-64 kernels? We've locked down W^X 
> > protections in general.
> 
> As I was curious, I looked at the paper. Here is a quote from it:
> 
> "In x86-64, however, the permissions of physmap are not in sane state.
> Kernels up to v3.8.13 violate the W^X property by mapping the entire region
> as “readable, writeable, and executable” (RWX)—only very recent kernels
> (≥v3.9) use the more conservative RW mapping.”

But v3.8.13 is a 5+ years old kernel, it doesn't count as a "modern" 
kernel in any sense of the word. For any proposed patchset with 
significant complexity and non-trivial costs the benchmark version 
threshold is the "current upstream kernel".

So does that quote address my followup questions:

> Is this actually true of modern x86-64 kernels? We've locked down W^X
> protections in general.
>
> I.e. this conclusion:
>
>   "Therefore, by simply overwriting kfptr with 0x87FF9F08 and
>triggering the kernel to dereference it, an attacker can directly
>execute shell code with kernel privileges."
>
> ... appears to be predicated on imperfect W^X protections on the x86-64
> kernel.
>
> Do such holes exist on the latest x86-64 kernel? If yes, is there a
> reason to believe that these W^X holes cannot be fixed, or that any fix
> would be more expensive than XPFO?

?

What you are proposing here is a XPFO patch-set against recent kernels 
with significant runtime overhead, so my questions about the W^X holes 
are warranted.

Thanks,

Ingo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Nadav Amit
> On Apr 17, 2019, at 10:09 AM, Ingo Molnar  wrote:
> 
> 
> * Khalid Aziz  wrote:
> 
>>> I.e. the original motivation of the XPFO patches was to prevent execution 
>>> of direct kernel mappings. Is this motivation still present if those 
>>> mappings are non-executable?
>>> 
>>> (Sorry if this has been asked and answered in previous discussions.)
>> 
>> Hi Ingo,
>> 
>> That is a good question. Because of the cost of XPFO, we have to be very
>> sure we need this protection. The paper from Vasileios, Michalis and
>> Angelos - ,
>> does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
>> and 6.2.
> 
> So it would be nice if you could generally summarize external arguments 
> when defending a patchset, instead of me having to dig through a PDF 
> which not only causes me to spend time that you probably already spent 
> reading that PDF, but I might also interpret it incorrectly. ;-)
> 
> The PDF you cited says this:
> 
>  "Unfortunately, as shown in Table 1, the W^X prop-erty is not enforced 
>   in many platforms, including x86-64.  In our example, the content of 
>   user address 0xBEEF000 is also accessible through kernel address 
>   0x87FF9F08 as plain, executable code."
> 
> Is this actually true of modern x86-64 kernels? We've locked down W^X 
> protections in general.

As I was curious, I looked at the paper. Here is a quote from it:

"In x86-64, however, the permissions of physmap are not in sane state.
Kernels up to v3.8.13 violate the W^X property by mapping the entire region
as “readable, writeable, and executable” (RWX)—only very recent kernels
(≥v3.9) use the more conservative RW mapping.”

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Ingo Molnar


* Khalid Aziz  wrote:

> > I.e. the original motivation of the XPFO patches was to prevent execution 
> > of direct kernel mappings. Is this motivation still present if those 
> > mappings are non-executable?
> > 
> > (Sorry if this has been asked and answered in previous discussions.)
> 
> Hi Ingo,
> 
> That is a good question. Because of the cost of XPFO, we have to be very
> sure we need this protection. The paper from Vasileios, Michalis and
> Angelos - ,
> does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
> and 6.2.

So it would be nice if you could generally summarize external arguments 
when defending a patchset, instead of me having to dig through a PDF 
which not only causes me to spend time that you probably already spent 
reading that PDF, but I might also interpret it incorrectly. ;-)

The PDF you cited says this:

  "Unfortunately, as shown in Table 1, the W^X prop-erty is not enforced 
   in many platforms, including x86-64.  In our example, the content of 
   user address 0xBEEF000 is also accessible through kernel address 
   0x87FF9F08 as plain, executable code."

Is this actually true of modern x86-64 kernels? We've locked down W^X 
protections in general.

I.e. this conclusion:

  "Therefore, by simply overwriting kfptr with 0x87FF9F08 and 
   triggering the kernel to dereference it, an attacker can directly 
   execute shell code with kernel privileges."

... appears to be predicated on imperfect W^X protections on the x86-64 
kernel.

Do such holes exist on the latest x86-64 kernel? If yes, is there a 
reason to believe that these W^X holes cannot be fixed, or that any fix 
would be more expensive than XPFO?

Thanks,

Ingo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Khalid Aziz
On 4/17/19 10:15 AM, Ingo Molnar wrote:
> 
> [ Sorry, had to trim the Cc: list from hell. Tried to keep all the 
>   mailing lists and all x86 developers. ]
> 
> * Khalid Aziz  wrote:
> 
>> From: Juerg Haefliger 
>>
>> This patch adds basic support infrastructure for XPFO which protects 
>> against 'ret2dir' kernel attacks. The basic idea is to enforce 
>> exclusive ownership of page frames by either the kernel or userspace, 
>> unless explicitly requested by the kernel. Whenever a page destined for 
>> userspace is allocated, it is unmapped from physmap (the kernel's page 
>> table). When such a page is reclaimed from userspace, it is mapped back 
>> to physmap. Individual architectures can enable full XPFO support using 
>> this infrastructure by supplying architecture specific pieces.
> 
> I have a higher level, meta question:
> 
> Is there any updated analysis outlining why this XPFO overhead would be 
> required on x86-64 kernels running on SMAP/SMEP CPUs which should be all 
> recent Intel and AMD CPUs, and with kernel that mark all direct kernel 
> mappings as non-executable - which should be all reasonably modern 
> kernels later than v4.0 or so?
> 
> I.e. the original motivation of the XPFO patches was to prevent execution 
> of direct kernel mappings. Is this motivation still present if those 
> mappings are non-executable?
> 
> (Sorry if this has been asked and answered in previous discussions.)

Hi Ingo,

That is a good question. Because of the cost of XPFO, we have to be very
sure we need this protection. The paper from Vasileios, Michalis and
Angelos - ,
does go into how ret2dir attacks can bypass SMAP/SMEP in sections 6.1
and 6.2.

Thanks,
Khalid


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-17 Thread Ingo Molnar


[ Sorry, had to trim the Cc: list from hell. Tried to keep all the 
  mailing lists and all x86 developers. ]

* Khalid Aziz  wrote:

> From: Juerg Haefliger 
> 
> This patch adds basic support infrastructure for XPFO which protects 
> against 'ret2dir' kernel attacks. The basic idea is to enforce 
> exclusive ownership of page frames by either the kernel or userspace, 
> unless explicitly requested by the kernel. Whenever a page destined for 
> userspace is allocated, it is unmapped from physmap (the kernel's page 
> table). When such a page is reclaimed from userspace, it is mapped back 
> to physmap. Individual architectures can enable full XPFO support using 
> this infrastructure by supplying architecture specific pieces.

I have a higher level, meta question:

Is there any updated analysis outlining why this XPFO overhead would be 
required on x86-64 kernels running on SMAP/SMEP CPUs which should be all 
recent Intel and AMD CPUs, and with kernel that mark all direct kernel 
mappings as non-executable - which should be all reasonably modern 
kernels later than v4.0 or so?

I.e. the original motivation of the XPFO patches was to prevent execution 
of direct kernel mappings. Is this motivation still present if those 
mappings are non-executable?

(Sorry if this has been asked and answered in previous discussions.)

Thanks,

Ingo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-04 Thread Peter Zijlstra
On Thu, Apr 04, 2019 at 09:15:46AM -0600, Khalid Aziz wrote:
> Thanks Peter. I really appreciate your review. Your feedback helps make
> this code better and closer to where I can feel comfortable not calling
> it RFC any more.
> 
> The more I look at xpfo_kmap()/xpfo_kunmap() code, the more I get
> uncomfortable with it. As you pointed out about calling kmap_atomic from
> NMI context, that just makes the kmap_atomic code look even worse. I
> pointed out one problem with this code in cover letter and suggested a
> rewrite. I see these problems with this code:

Well, I no longer use it from NMI context, but I did do that for a
while. We now have a giant heap of magic in the NMI path that allows us
to take faults from NMI context (please don't ask), this means we can
mostly do copy_from_user_inatomic() now.

> 1. When xpfo_kmap maps a page back in physmap, it opens up the ret2dir
> attack security hole again even if just for the duration of kmap. A kmap
> can stay around for some time if the page is being used for I/O.

Correct.

> 2. This code uses spinlock which leads to problems. If it does not
> disable IRQ, it is exposed to deadlock around xpfo_lock. If it disables
> IRQ, I think it can still deadlock around pgd_lock.

I've not spotted that inversion yet, but then I didn't look at the lock
usage outside of k{,un}map_xpfo().

> I think a better implementation of xpfo_kmap()/xpfo_kunmap() would map
> the page at a new virtual address similar to what kmap_high for i386
> does. This avoids re-opening the ret2dir security hole. We can also
> possibly do away with xpfo_lock saving bytes in page-frame and the not
> so sane code sequence can go away.

Right, the TLB invalidation issues are still tricky, even there :/
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-04 Thread Khalid Aziz
On 4/4/19 1:43 AM, Peter Zijlstra wrote:
> 
> You must be so glad I no longer use kmap_atomic from NMI context :-)
> 
> On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
>> +static inline void xpfo_kmap(void *kaddr, struct page *page)
>> +{
>> +unsigned long flags;
>> +
>> +if (!static_branch_unlikely(_inited))
>> +return;
>> +
>> +if (!PageXpfoUser(page))
>> +return;
>> +
>> +/*
>> + * The page was previously allocated to user space, so
>> + * map it back into the kernel if needed. No TLB flush required.
>> + */
>> +spin_lock_irqsave(>xpfo_lock, flags);
>> +
>> +if ((atomic_inc_return(>xpfo_mapcount) == 1) &&
>> +TestClearPageXpfoUnmapped(page))
>> +set_kpte(kaddr, page, PAGE_KERNEL);
>> +
>> +spin_unlock_irqrestore(>xpfo_lock, flags);
> 
> That's a really sad sequence, not wrong, but sad. _3_ atomic operations,
> 2 on likely the same cacheline. And mostly all pointless.
> 
> This patch makes xpfo_mapcount an atomic, but then all modifications are
> under the spinlock, what gives?
> 
> Anyway, a possibly saner sequence might be:
> 
>   if (atomic_inc_not_zero(>xpfo_mapcount))
>   return;
> 
>   spin_lock_irqsave(>xpfo_lock, flag);
>   if ((atomic_inc_return(>xpfo_mapcount) == 1) &&
>   TestClearPageXpfoUnmapped(page))
>   set_kpte(kaddr, page, PAGE_KERNEL);
>   spin_unlock_irqrestore(>xpfo_lock, flags);
> 
>> +}
>> +
>> +static inline void xpfo_kunmap(void *kaddr, struct page *page)
>> +{
>> +unsigned long flags;
>> +
>> +if (!static_branch_unlikely(_inited))
>> +return;
>> +
>> +if (!PageXpfoUser(page))
>> +return;
>> +
>> +/*
>> + * The page is to be allocated back to user space, so unmap it from
>> + * the kernel, flush the TLB and tag it as a user page.
>> + */
>> +spin_lock_irqsave(>xpfo_lock, flags);
>> +
>> +if (atomic_dec_return(>xpfo_mapcount) == 0) {
>> +#ifdef CONFIG_XPFO_DEBUG
>> +WARN_ON(PageXpfoUnmapped(page));
>> +#endif
>> +SetPageXpfoUnmapped(page);
>> +set_kpte(kaddr, page, __pgprot(0));
>> +xpfo_flush_kernel_tlb(page, 0);
> 
> You didn't speak about the TLB invalidation anywhere. But basically this
> is one that x86 cannot do.
> 
>> +}
>> +
>> +spin_unlock_irqrestore(>xpfo_lock, flags);
> 
> Idem:
> 
>   if (atomic_add_unless(>xpfo_mapcount, -1, 1))
>   return;
> 
>   
> 
> 
>> +}
> 
> Also I'm failing to see the point of PG_xpfo_unmapped, afaict it
> is identical to !atomic_read(>xpfo_mapcount).
> 

Thanks Peter. I really appreciate your review. Your feedback helps make
this code better and closer to where I can feel comfortable not calling
it RFC any more.

The more I look at xpfo_kmap()/xpfo_kunmap() code, the more I get
uncomfortable with it. As you pointed out about calling kmap_atomic from
NMI context, that just makes the kmap_atomic code look even worse. I
pointed out one problem with this code in cover letter and suggested a
rewrite. I see these problems with this code:

1. When xpfo_kmap maps a page back in physmap, it opens up the ret2dir
attack security hole again even if just for the duration of kmap. A kmap
can stay around for some time if the page is being used for I/O.

2. This code uses spinlock which leads to problems. If it does not
disable IRQ, it is exposed to deadlock around xpfo_lock. If it disables
IRQ, I think it can still deadlock around pgd_lock.

I think a better implementation of xpfo_kmap()/xpfo_kunmap() would map
the page at a new virtual address similar to what kmap_high for i386
does. This avoids re-opening the ret2dir security hole. We can also
possibly do away with xpfo_lock saving bytes in page-frame and the not
so sane code sequence can go away.

Good point about PG_xpfo_unmapped flag. You are right that it can be
replaced with !atomic_read(>xpfo_mapcount).

Thanks,
Khalid

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-04 Thread Tycho Andersen
On Thu, Apr 04, 2019 at 09:21:52AM +0200, Peter Zijlstra wrote:
> On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
> > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > index 2c471a2c43fa..d17d33f36a01 100644
> > --- a/include/linux/mm_types.h
> > +++ b/include/linux/mm_types.h
> > @@ -204,6 +204,14 @@ struct page {
> >  #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
> > int _last_cpupid;
> >  #endif
> > +
> > +#ifdef CONFIG_XPFO
> > +   /* Counts the number of times this page has been kmapped. */
> > +   atomic_t xpfo_mapcount;
> > +
> > +   /* Serialize kmap/kunmap of this page */
> > +   spinlock_t xpfo_lock;
> 
> NAK, see ALLOC_SPLIT_PTLOCKS
> 
> spinlock_t can be _huge_ (CONFIG_PROVE_LOCKING=y), also are you _really_
> sure you want spinlock_t and not raw_spinlock_t ? For
> CONFIG_PREEMPT_FULL spinlock_t turns into a rtmutex.
> 
> > +#endif
> 
> Growing the page-frame by 8 bytes (in the good case) is really sad,
> that's a _lot_ of memory.

Originally we had this in page_ext, it's not really clear to me why we
moved it out.

Julien?

Tycho
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-04 Thread Peter Zijlstra
On Thu, Apr 04, 2019 at 09:21:52AM +0200, Peter Zijlstra wrote:
> On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
> > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > index 2c471a2c43fa..d17d33f36a01 100644
> > --- a/include/linux/mm_types.h
> > +++ b/include/linux/mm_types.h
> > @@ -204,6 +204,14 @@ struct page {
> >  #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
> > int _last_cpupid;
> >  #endif
> > +
> > +#ifdef CONFIG_XPFO
> > +   /* Counts the number of times this page has been kmapped. */
> > +   atomic_t xpfo_mapcount;
> > +
> > +   /* Serialize kmap/kunmap of this page */
> > +   spinlock_t xpfo_lock;
> 
> NAK, see ALLOC_SPLIT_PTLOCKS
> 
> spinlock_t can be _huge_ (CONFIG_PROVE_LOCKING=y), also are you _really_
> sure you want spinlock_t and not raw_spinlock_t ? For
> CONFIG_PREEMPT_FULL spinlock_t turns into a rtmutex.
> 
> > +#endif
> 
> Growing the page-frame by 8 bytes (in the good case) is really sad,
> that's a _lot_ of memory.

So if you use the original kmap_atomic/kmap code from i386 and create
an alias per user you can do away with all that.

Now, that leaves you with the fixmap kmap_atomic code, which I also
hate, but it gets rid of a lot of the ugly you introduce in these here
patches.

As to the fixmap kmap_atomic; so fundamentally the PTEs are only used on
a single CPU and therefore CPU local TLB invalidation _should_ suffice.

However, speculation...

Another CPU can speculatively hit upon a fixmap entry for another CPU
and populate it's own TLB entry. Then the TLB invalidate is
insufficient, it leaves a stale entry in a remote TLB.

If the remote CPU then re-uses that fixmap slot to alias another page,
we have two CPUs with different translations for the same VA, a
condition that AMD CPU's dislike enough to machine check on (IIRC).

Actually hitting that is incredibly difficult (we have to have
speculation, fixmap reuse and not get a full TLB invalidate in between),
but, afaict, not impossible.

Your monstrosity from the last patch avoids this particular issue by not
aliasing in this manner, but it comes at the cost of this page-frame
bloat. Also, I'm still not sure there's not other problems with it.

Bah..
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-04 Thread Peter Zijlstra


You must be so glad I no longer use kmap_atomic from NMI context :-)

On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
> +static inline void xpfo_kmap(void *kaddr, struct page *page)
> +{
> + unsigned long flags;
> +
> + if (!static_branch_unlikely(_inited))
> + return;
> +
> + if (!PageXpfoUser(page))
> + return;
> +
> + /*
> +  * The page was previously allocated to user space, so
> +  * map it back into the kernel if needed. No TLB flush required.
> +  */
> + spin_lock_irqsave(>xpfo_lock, flags);
> +
> + if ((atomic_inc_return(>xpfo_mapcount) == 1) &&
> + TestClearPageXpfoUnmapped(page))
> + set_kpte(kaddr, page, PAGE_KERNEL);
> +
> + spin_unlock_irqrestore(>xpfo_lock, flags);

That's a really sad sequence, not wrong, but sad. _3_ atomic operations,
2 on likely the same cacheline. And mostly all pointless.

This patch makes xpfo_mapcount an atomic, but then all modifications are
under the spinlock, what gives?

Anyway, a possibly saner sequence might be:

if (atomic_inc_not_zero(>xpfo_mapcount))
return;

spin_lock_irqsave(>xpfo_lock, flag);
if ((atomic_inc_return(>xpfo_mapcount) == 1) &&
TestClearPageXpfoUnmapped(page))
set_kpte(kaddr, page, PAGE_KERNEL);
spin_unlock_irqrestore(>xpfo_lock, flags);

> +}
> +
> +static inline void xpfo_kunmap(void *kaddr, struct page *page)
> +{
> + unsigned long flags;
> +
> + if (!static_branch_unlikely(_inited))
> + return;
> +
> + if (!PageXpfoUser(page))
> + return;
> +
> + /*
> +  * The page is to be allocated back to user space, so unmap it from
> +  * the kernel, flush the TLB and tag it as a user page.
> +  */
> + spin_lock_irqsave(>xpfo_lock, flags);
> +
> + if (atomic_dec_return(>xpfo_mapcount) == 0) {
> +#ifdef CONFIG_XPFO_DEBUG
> + WARN_ON(PageXpfoUnmapped(page));
> +#endif
> + SetPageXpfoUnmapped(page);
> + set_kpte(kaddr, page, __pgprot(0));
> + xpfo_flush_kernel_tlb(page, 0);

You didn't speak about the TLB invalidation anywhere. But basically this
is one that x86 cannot do.

> + }
> +
> + spin_unlock_irqrestore(>xpfo_lock, flags);

Idem:

if (atomic_add_unless(>xpfo_mapcount, -1, 1))
return;




> +}

Also I'm failing to see the point of PG_xpfo_unmapped, afaict it
is identical to !atomic_read(>xpfo_mapcount).

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-04 Thread Peter Zijlstra
On Wed, Apr 03, 2019 at 11:34:04AM -0600, Khalid Aziz wrote:
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 2c471a2c43fa..d17d33f36a01 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -204,6 +204,14 @@ struct page {
>  #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
>   int _last_cpupid;
>  #endif
> +
> +#ifdef CONFIG_XPFO
> + /* Counts the number of times this page has been kmapped. */
> + atomic_t xpfo_mapcount;
> +
> + /* Serialize kmap/kunmap of this page */
> + spinlock_t xpfo_lock;

NAK, see ALLOC_SPLIT_PTLOCKS

spinlock_t can be _huge_ (CONFIG_PROVE_LOCKING=y), also are you _really_
sure you want spinlock_t and not raw_spinlock_t ? For
CONFIG_PREEMPT_FULL spinlock_t turns into a rtmutex.

> +#endif

Growing the page-frame by 8 bytes (in the good case) is really sad,
that's a _lot_ of memory.

>  } _struct_page_alignment;
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RFC PATCH v9 03/13] mm: Add support for eXclusive Page Frame Ownership (XPFO)

2019-04-03 Thread Khalid Aziz
From: Juerg Haefliger 

This patch adds basic support infrastructure for XPFO which protects
against 'ret2dir' kernel attacks. The basic idea is to enforce exclusive
ownership of page frames by either the kernel or userspace, unless
explicitly requested by the kernel. Whenever a page destined for
userspace is allocated, it is unmapped from physmap (the kernel's page
table). When such a page is reclaimed from userspace, it is mapped back
to physmap. Individual architectures can enable full XPFO support using
this infrastructure by supplying architecture specific pieces.

Additional fields in the page struct are used for XPFO housekeeping,
specifically:
  - two flags to distinguish user vs. kernel pages and to tag unmapped
pages.
  - a reference counter to balance kmap/kunmap operations.
  - a lock to serialize access to the XPFO fields.

This patch is based on the work of Vasileios P. Kemerlis et al. who
published their work in this paper:
  http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf

CC: x...@kernel.org
Suggested-by: Vasileios P. Kemerlis 
Signed-off-by: Juerg Haefliger 
Signed-off-by: Tycho Andersen 
Signed-off-by: Marco Benatto 
[jstec...@amazon.de: encode all XPFO info in struct page]
Signed-off-by: Julian Stecklina 
Signed-off-by: Khalid Aziz 
Cc: Khalid Aziz 
---
 v9: * Do not use page extensions. Encode all xpfo information in struct
  page (Julian Stecklina).
* Split architecture specific code into its own separate patch
* Move kmap*() to include/linux/xpfo.h for cleaner code as suggested
  for an earlier version of this patch
* Use irq versions of spin_lock to address possible deadlock around
  xpfo_lock caused by interrupts.
* Incorporated various feedback provided on v6 patch way back.

v6: * use flush_tlb_kernel_range() instead of __flush_tlb_one, so we flush
  the tlb entry on all CPUs when unmapping it in kunmap
* handle lookup_page_ext()/lookup_xpfo() returning NULL
* drop lots of BUG()s in favor of WARN()
* don't disable irqs in xpfo_kmap/xpfo_kunmap, export
  __split_large_page so we can do our own alloc_pages(GFP_ATOMIC) to
  pass it

 .../admin-guide/kernel-parameters.txt |   6 +
 include/linux/highmem.h   |  31 +---
 include/linux/mm_types.h  |   8 +
 include/linux/page-flags.h|  23 ++-
 include/linux/xpfo.h  | 147 ++
 include/trace/events/mmflags.h|  10 +-
 mm/Makefile   |   1 +
 mm/compaction.c   |   2 +-
 mm/internal.h |   2 +-
 mm/page_alloc.c   |  10 +-
 mm/page_isolation.c   |   2 +-
 mm/xpfo.c | 106 +
 security/Kconfig  |  27 
 13 files changed, 337 insertions(+), 38 deletions(-)
 create mode 100644 include/linux/xpfo.h
 create mode 100644 mm/xpfo.c

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 858b6c0b9a15..9b36da94760e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2997,6 +2997,12 @@
 
nox2apic[X86-64,APIC] Do not enable x2APIC mode.
 
+   noxpfo  [XPFO] Disable eXclusive Page Frame Ownership (XPFO)
+   when CONFIG_XPFO is on. Physical pages mapped into
+   user applications will also be mapped in the
+   kernel's address space as if CONFIG_XPFO was not
+   enabled.
+
cpu0_hotplug[X86] Turn on CPU0 hotplug feature when
CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off.
Some features depend on CPU0. Known dependencies are:
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index ea5cdbd8c2c3..59a1a5fa598d 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -77,36 +78,6 @@ static inline struct page *kmap_to_page(void *addr)
 
 static inline unsigned long totalhigh_pages(void) { return 0UL; }
 
-#ifndef ARCH_HAS_KMAP
-static inline void *kmap(struct page *page)
-{
-   might_sleep();
-   return page_address(page);
-}
-
-static inline void kunmap(struct page *page)
-{
-}
-
-static inline void *kmap_atomic(struct page *page)
-{
-   preempt_disable();
-   pagefault_disable();
-   return page_address(page);
-}
-#define kmap_atomic_prot(page, prot)   kmap_atomic(page)
-
-static inline void __kunmap_atomic(void *addr)
-{
-   pagefault_enable();
-   preempt_enable();
-}
-
-#define kmap_atomic_pfn(pfn)   kmap_atomic(pfn_to_page(pfn))
-
-#define kmap_flush_unused()do {} while(0)
-#endif
-
 #endif /* CONFIG_HIGHMEM */
 
 #if defined(CONFIG_HIGHMEM) ||