Re: [RFC v5 00/38] powerpc: Memory Protection Keys
On Thu 13-07-17 08:53:52, Benjamin Herrenschmidt wrote:
> On Wed, 2017-07-12 at 09:23 +0200, Michal Hocko wrote:
> > > Ideally the MMU looks at the PTE for keys, in order to enforce
> > > protection. This is the case with x86 and is the case with power9 Radix
> > > page table. Hence the keys have to be programmed into the PTE.
> >
> > But x86 doesn't update ptes for PKEYs, that would be just too expensive.
> > You could use standard mprotect to do the same...
>
> What do you mean? x86 ends up in mprotect_fixup -> change_protection()
> which will update the PTEs just the same as we do.
>
> Changing the key for a page is a form of mprotect. Changing the access
> permissions for keys is different, for us it's a special register
> (AMR).
>
> I don't understand why you think we are doing any differently than x86
> here.

That was a misunderstanding on my side as explained in my other reply.

> > > However with HPT on power, these keys do not necessarily have to be
> > > programmed into the PTE. We could bypass the Linux Page Table Entry (PTE)
> > > and instead just program them into the Hash Page Table Entry (HPTE), since
> > > the MMU does not refer to the PTE but to the HPTE. The last version
> > > of the patchset attempted to do that. It worked as follows:
> > >
> > > a) When an address range is requested to be associated with a key by the
> > >    application through the pkey_mprotect() system call, the kernel
> > >    stores that key in the vmas corresponding to that address
> > >    range.
> > >
> > > b) Whenever there is a hash page fault for that address, the fault
> > >    handler reads the key from the VMA and programs the key into the
> > >    HPTE. __hash_page() is the function that does that.
> >
> > What causes the fault here?
>
> The hardware. With the hash MMU, the HW walks a hash table which is
> effectively a large in-memory TLB extension. When a page isn't found
> there, a "hash fault" is generated allowing Linux to populate that
> hash table with the content of the corresponding PTE.

Thanks for the clarification
--
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
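The distinction drawn in this message, that changing the access rights of a key only touches a register while changing the key of a page touches page tables, can be sketched in C. This is a minimal model and not the kernel's code: it assumes a 64-bit AMR holding two bits per key with key 0 in the most significant position, and the helper names (pkey_shift, amr_set_rights, amr_write_denied, amr_read_denied) and bit constants are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 32 keys, 2 bits each, key 0 in the top two bits. */
#define AMR_BITS_PER_PKEY 2
#define AMR_RD_BIT UINT64_C(0x1) /* assumed: set means reads are denied  */
#define AMR_WR_BIT UINT64_C(0x2) /* assumed: set means writes are denied */

/* Bit offset of the 2-bit field that belongs to pkey. */
static inline unsigned pkey_shift(unsigned pkey)
{
    assert(pkey < 32);
    return 64 - (pkey + 1) * AMR_BITS_PER_PKEY;
}

/* Return a copy of amr with the rights of one key replaced.
 * No page table is involved: only the register value changes. */
static inline uint64_t amr_set_rights(uint64_t amr, unsigned pkey, uint64_t bits)
{
    unsigned shift = pkey_shift(pkey);

    amr &= ~((AMR_RD_BIT | AMR_WR_BIT) << shift);
    return amr | ((bits & (AMR_RD_BIT | AMR_WR_BIT)) << shift);
}

static inline int amr_write_denied(uint64_t amr, unsigned pkey)
{
    return ((amr >> pkey_shift(pkey)) & AMR_WR_BIT) != 0;
}

static inline int amr_read_denied(uint64_t amr, unsigned pkey)
{
    return ((amr >> pkey_shift(pkey)) & AMR_RD_BIT) != 0;
}
```

In this model, denying writes for key 1 is a single register update, however many pages carry that key.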
Re: [RFC v5 00/38] powerpc: Memory Protection Keys
On Wed, 2017-07-12 at 09:23 +0200, Michal Hocko wrote:
> > Ideally the MMU looks at the PTE for keys, in order to enforce
> > protection. This is the case with x86 and is the case with power9 Radix
> > page table. Hence the keys have to be programmed into the PTE.
>
> But x86 doesn't update ptes for PKEYs, that would be just too expensive.
> You could use standard mprotect to do the same...

What do you mean? x86 ends up in mprotect_fixup -> change_protection()
which will update the PTEs just the same as we do.

Changing the key for a page is a form of mprotect. Changing the access
permissions for keys is different, for us it's a special register
(AMR).

I don't understand why you think we are doing any differently than x86
here.

> > However with HPT on power, these keys do not necessarily have to be
> > programmed into the PTE. We could bypass the Linux Page Table Entry (PTE)
> > and instead just program them into the Hash Page Table Entry (HPTE), since
> > the MMU does not refer to the PTE but to the HPTE. The last version
> > of the patchset attempted to do that. It worked as follows:
> >
> > a) When an address range is requested to be associated with a key by the
> >    application through the pkey_mprotect() system call, the kernel
> >    stores that key in the vmas corresponding to that address
> >    range.
> >
> > b) Whenever there is a hash page fault for that address, the fault
> >    handler reads the key from the VMA and programs the key into the
> >    HPTE. __hash_page() is the function that does that.
>
> What causes the fault here?

The hardware. With the hash MMU, the HW walks a hash table which is
effectively a large in-memory TLB extension. When a page isn't found
there, a "hash fault" is generated allowing Linux to populate that
hash table with the content of the corresponding PTE.

> > c) Once the HPTE is programmed, the MMU can sense key violations and
> >    generate key-faults.
> >
> > The problem is with step (b). This step is really a very critical
> > path which is performance sensitive. We don't want to add any delays.
> > However if we want to access the key from the vma, we will have to
> > hold the vma semaphore, and that is a big NO-NO. As a result, this
> > design had to be dropped.
> >
> > I reverted back to the old design, i.e. the design used before v4. In this
> > version we do the following:
> >
> > a) When an address range is requested to be associated with a key by the
> >    application through the pkey_mprotect() system call, the kernel
> >    stores that key in the vmas corresponding to that address
> >    range. Also the kernel programs the key into the Linux PTEs
> >    corresponding to all the pages associated with the address range.
>
> OK, so how is this any different from the regular mprotect then?

It takes the key argument. This is nothing new. This was done for x86
already, we are just re-using the infrastructure. Look at
do_mprotect_pkey() in mm/mprotect.c today. It's all the same code,
pkey_mprotect() is just mprotect with an added key argument.

> > b) Whenever there is a hash page fault for that address, the fault
> >    handler reads the key from the Linux PTE and programs the key into
> >    the HPTE.
> >
> > c) Once the HPTE is programmed, the MMU can sense key violations and
> >    generate key-faults.
> >
> > Since step (b) in this case has easy access to the Linux PTE, and hence
> > to the key, it is fast to access it and program the HPTE. Thus we avoid
> > taking any performance hit on this critical path.
> >
> > Hope this explains the rationale,
> >
> > As promised here is the high level design:
>
> I will read through that later
> [...]
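The three steps of the v5 design discussed in this message can be modelled with a toy page table and a toy hash table. This is only a sketch under loud assumptions: the structures, the fixed table size and every toy_* name are invented for illustration, each page in a range is treated as already mapped, and none of this is taken from the patch series. It shows only that the hash-fault path reads the key from the PTE and never consults a VMA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NPAGES 8

struct toy_pte  { int present; unsigned pkey; }; /* software (Linux) PTE  */
struct toy_hpte { int valid;   unsigned pkey; }; /* hardware-walked entry */

static struct toy_pte  linux_pt[NPAGES];
static struct toy_hpte hash_pt[NPAGES];

/* Step (a): a pkey_mprotect()-like call stores the key in every PTE of
 * the range and drops any stale hash entry for those pages. */
static void toy_pkey_mprotect(size_t first, size_t count, unsigned pkey)
{
    assert(first + count <= NPAGES && pkey < 32);
    for (size_t i = first; i < first + count; i++) {
        linux_pt[i].present = 1;
        linux_pt[i].pkey = pkey;
        hash_pt[i].valid = 0;
    }
}

/* Step (b): the hash fault copies the key from the PTE into the hash
 * entry. No VMA lookup, hence no VMA locking on this hot path. */
static void toy_hash_fault(size_t page)
{
    assert(page < NPAGES && linux_pt[page].present);
    hash_pt[page].pkey = linux_pt[page].pkey;
    hash_pt[page].valid = 1;
}

/* Step (c): an access hits the hash entry (faulting it in if needed) and
 * is checked against a mask in which bit k set means key k is denied.
 * Returns 0 on success and -1 for a key fault. */
static int toy_access(size_t page, uint32_t denied_mask)
{
    assert(page < NPAGES);
    if (!hash_pt[page].valid)
        toy_hash_fault(page);
    return ((denied_mask >> hash_pt[page].pkey) & 1u) ? -1 : 0;
}
```

The dropped v4 variant would differ only in step (b), which would have to find the key through the VMA instead of the PTE.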
Re: [RFC v5 00/38] powerpc: Memory Protection Keys
On Wed 12-07-17 09:23:37, Michal Hocko wrote:
> On Tue 11-07-17 12:32:57, Ram Pai wrote:
[...]
> > Ideally the MMU looks at the PTE for keys, in order to enforce
> > protection. This is the case with x86 and is the case with power9 Radix
> > page table. Hence the keys have to be programmed into the PTE.
>
> But x86 doesn't update ptes for PKEYs, that would be just too expensive.
> You could use standard mprotect to do the same...

OK, this seems to be a misunderstanding and confusion on my end.
do_mprotect_pkey does mprotect_fixup even for the pkey path which is
quite surprising to me. I guess my misunderstanding comes from
Documentation/x86/protection-keys.txt
"
Memory Protection Keys provides a mechanism for enforcing page-based
protections, but without requiring modification of the page tables
when an application changes protection domains. It works by
dedicating 4 previously ignored bits in each page table entry to a
"protection key", giving 16 possible keys.
"

So please disregard my previous comments about page tables and sorry
about the confusion.
--
Michal Hocko
SUSE Labs
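The documentation passage quoted in this message, four bits per page table entry giving 16 keys, can be illustrated with a small C sketch. It assumes the x86-64 placement of the key field in bits 59 to 62 of a 64-bit entry; the macro and function names are illustrative and are not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: a 4-bit protection key in bits 59..62 of the entry. */
#define PTE_PKEY_SHIFT 59
#define PTE_PKEY_MASK  (UINT64_C(0xF) << PTE_PKEY_SHIFT)

/* Extract the protection key stored in a page table entry value. */
static inline unsigned pte_get_pkey(uint64_t pte)
{
    return (unsigned)((pte & PTE_PKEY_MASK) >> PTE_PKEY_SHIFT);
}

/* Return a copy of pte with the key field replaced; all other bits are
 * preserved. Four bits allow keys 0..15. */
static inline uint64_t pte_set_pkey(uint64_t pte, unsigned pkey)
{
    assert(pkey < 16);
    return (pte & ~PTE_PKEY_MASK) | ((uint64_t)pkey << PTE_PKEY_SHIFT);
}
```

Assigning a key to a page therefore means rewriting its entry, which fits the observation that the pkey path still goes through mprotect_fixup. What avoids page table updates is changing the rights attached to a key.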
Re: [RFC v5 00/38] powerpc: Memory Protection Keys
On Tue 11-07-17 12:32:57, Ram Pai wrote:
> On Tue, Jul 11, 2017 at 04:52:46PM +0200, Michal Hocko wrote:
> > On Wed 05-07-17 14:21:37, Ram Pai wrote:
> > > Memory protection keys enable an application to protect its
> > > address space from inadvertent access or corruption from
> > > itself.
> > >
> > > The overall idea:
> > >
> > >  A process allocates a key and associates it with
> > >  an address range within its address space.
> > >  The process then can dynamically set read/write
> > >  permissions on the key without involving the
> > >  kernel. Any code that violates the permissions
> > >  of the address space, as defined by its associated
> > >  key, will receive a segmentation fault.
> > >
> > > This patch series enables the feature on PPC64 HPTE
> > > platform.
> > >
> > > ISA3.0 section 5.7.13 describes the detailed specifications.
> >
> > Could you describe the high-level design of this feature in the cover
> > letter.
>
> Yes it can be hard to understand without the big picture. I will
> provide the high level design and the rationale behind the patch split
> towards the end. Also I will have it in the cover letter for my next
> revision of the patchset.

Thanks!

> > I have tried to get some idea from the patchset but it was
> > really far from trivial. Patches are not very well split up (many
> > helpers are added without their users etc.).
>
> I see your point. Earlier, I had the patches split in such a way that the
> users of the helpers were in the same patch as the helper.
> But then comments from others led to the current split.

It is not my call here, obviously. I cannot review arch specific parts
due to lack of familiarity but it is a general good practice to include
helpers along with their users to make the usage clear. Also, as much as
I like small patches because they are easier to review, having very many
of them can lead to a harder review in the end because you easily lose
a higher level overview.

> > > Testing:
> > >  This patch series has passed all the protection key
> > >  tests available in the selftests directory.
> > >  The tests are updated to work on both x86 and powerpc.
> > >
> > > version v5:
> > >  (1) reverted back to the old design -- store the
> > >      key in the pte, instead of bypassing it.
> > >      The v4 design slowed down the hash page path.
> >
> > This surprised me a lot but I couldn't find the respective code. Why do
> > you need to store anything in the pte? My understanding of PKEYs is that
> > the setup and teardown should be very cheap and so no page tables have
> > to be updated. Or do I just misunderstand what you wrote here?
>
> Ideally the MMU looks at the PTE for keys, in order to enforce
> protection. This is the case with x86 and is the case with power9 Radix
> page table. Hence the keys have to be programmed into the PTE.

But x86 doesn't update ptes for PKEYs, that would be just too expensive.
You could use standard mprotect to do the same...

> However with HPT on power, these keys do not necessarily have to be
> programmed into the PTE. We could bypass the Linux Page Table Entry (PTE)
> and instead just program them into the Hash Page Table Entry (HPTE), since
> the MMU does not refer to the PTE but to the HPTE. The last version
> of the patchset attempted to do that. It worked as follows:
>
> a) When an address range is requested to be associated with a key by the
>    application through the pkey_mprotect() system call, the kernel
>    stores that key in the vmas corresponding to that address
>    range.
>
> b) Whenever there is a hash page fault for that address, the fault
>    handler reads the key from the VMA and programs the key into the
>    HPTE. __hash_page() is the function that does that.

What causes the fault here?

> c) Once the HPTE is programmed, the MMU can sense key violations and
>    generate key-faults.
>
> The problem is with step (b). This step is really a very critical
> path which is performance sensitive. We don't want to add any delays.
> However if we want to access the key from the vma, we will have to
> hold the vma semaphore, and that is a big NO-NO. As a result, this
> design had to be dropped.
>
> I reverted back to the old design, i.e. the design used before v4. In this
> version we do the following:
>
> a) When an address range is requested to be associated with a key by the
>    application through the pkey_mprotect() system call, the kernel
>    stores that key in the vmas corresponding to that address
>    range. Also the kernel programs the key into the Linux PTEs
>    corresponding to all the pages associated with the address range.

OK, so how is this any different from the regular mprotect then?

> b) Whenever there is a hash page fault for that address, the fault
>    handler reads the key from the Linux PTE and programs the key into
>    the HPTE.
>
> c) Once the HPTE is programmed, the MMU can sense key violations and
>    generate key-faults.
>
Re: [RFC v5 00/38] powerpc: Memory Protection Keys
On Tue, 2017-07-11 at 12:32 -0700, Ram Pai wrote:
> Ideally the MMU looks at the PTE for keys, in order to enforce
> protection. This is the case with x86 and is the case with power9 Radix
> page table. Hence the keys have to be programmed into the PTE.

POWER9 radix doesn't currently support keys.

Cheers,
Ben.
Re: [RFC v5 00/38] powerpc: Memory Protection Keys
On Tue, Jul 11, 2017 at 04:52:46PM +0200, Michal Hocko wrote:
> On Wed 05-07-17 14:21:37, Ram Pai wrote:
> > Memory protection keys enable an application to protect its
> > address space from inadvertent access or corruption from
> > itself.
> >
> > The overall idea:
> >
> >  A process allocates a key and associates it with
> >  an address range within its address space.
> >  The process then can dynamically set read/write
> >  permissions on the key without involving the
> >  kernel. Any code that violates the permissions
> >  of the address space, as defined by its associated
> >  key, will receive a segmentation fault.
> >
> > This patch series enables the feature on PPC64 HPTE
> > platform.
> >
> > ISA3.0 section 5.7.13 describes the detailed specifications.
>
> Could you describe the high-level design of this feature in the cover
> letter.

Yes it can be hard to understand without the big picture. I will
provide the high level design and the rationale behind the patch split
towards the end. Also I will have it in the cover letter for my next
revision of the patchset.

> I have tried to get some idea from the patchset but it was
> really far from trivial. Patches are not very well split up (many
> helpers are added without their users etc.).

I see your point. Earlier, I had the patches split in such a way that the
users of the helpers were in the same patch as the helper.
But then comments from others led to the current split.

> > Testing:
> >  This patch series has passed all the protection key
> >  tests available in the selftests directory.
> >  The tests are updated to work on both x86 and powerpc.
> >
> > version v5:
> >  (1) reverted back to the old design -- store the
> >      key in the pte, instead of bypassing it.
> >      The v4 design slowed down the hash page path.
>
> This surprised me a lot but I couldn't find the respective code. Why do
> you need to store anything in the pte? My understanding of PKEYs is that
> the setup and teardown should be very cheap and so no page tables have
> to be updated. Or do I just misunderstand what you wrote here?

Ideally the MMU looks at the PTE for keys, in order to enforce
protection. This is the case with x86 and is the case with power9 Radix
page table. Hence the keys have to be programmed into the PTE.

However with HPT on power, these keys do not necessarily have to be
programmed into the PTE. We could bypass the Linux Page Table Entry (PTE)
and instead just program them into the Hash Page Table Entry (HPTE), since
the MMU does not refer to the PTE but to the HPTE. The last version
of the patchset attempted to do that. It worked as follows:

a) When an address range is requested to be associated with a key by the
   application through the pkey_mprotect() system call, the kernel
   stores that key in the vmas corresponding to that address
   range.

b) Whenever there is a hash page fault for that address, the fault
   handler reads the key from the VMA and programs the key into the
   HPTE. __hash_page() is the function that does that.

c) Once the HPTE is programmed, the MMU can sense key violations and
   generate key-faults.

The problem is with step (b). This step is really a very critical
path which is performance sensitive. We don't want to add any delays.
However if we want to access the key from the vma, we will have to
hold the vma semaphore, and that is a big NO-NO. As a result, this
design had to be dropped.

I reverted back to the old design, i.e. the design used before v4. In this
version we do the following:

a) When an address range is requested to be associated with a key by the
   application through the pkey_mprotect() system call, the kernel
   stores that key in the vmas corresponding to that address
   range. Also the kernel programs the key into the Linux PTEs
   corresponding to all the pages associated with the address range.

b) Whenever there is a hash page fault for that address, the fault
   handler reads the key from the Linux PTE and programs the key into
   the HPTE.

c) Once the HPTE is programmed, the MMU can sense key violations and
   generate key-faults.

Since step (b) in this case has easy access to the Linux PTE, and hence
to the key, it is fast to access it and program the HPTE. Thus we avoid
taking any performance hit on this critical path.

Hope this explains the rationale,

As promised here is the high level design:

(1) When an application associates a key with an address range,
    program the key in the Linux PTE.

(2) Program the key into the HPTE, when an HPTE is allocated to back
    the Linux PTE.

(3) And finally when the MMU detects a key violation due to invalid
    user access, invoke the registered signal handler and provide it
    with the key number that got violated and the state of the key
    register (AMR) at the time it faulted.

In order to accomplish (1) we need to free up 5 bits in the Linux PTE
to store the key. This is accompli
Re: [RFC v5 00/38] powerpc: Memory Protection Keys
On Wed 05-07-17 14:21:37, Ram Pai wrote:
> Memory protection keys enable an application to protect its
> address space from inadvertent access or corruption from
> itself.
>
> The overall idea:
>
>  A process allocates a key and associates it with
>  an address range within its address space.
>  The process then can dynamically set read/write
>  permissions on the key without involving the
>  kernel. Any code that violates the permissions
>  of the address space, as defined by its associated
>  key, will receive a segmentation fault.
>
> This patch series enables the feature on PPC64 HPTE
> platform.
>
> ISA3.0 section 5.7.13 describes the detailed specifications.

Could you describe the high-level design of this feature in the cover
letter. I have tried to get some idea from the patchset but it was
really far from trivial. Patches are not very well split up (many
helpers are added without their users etc.).

> Testing:
>  This patch series has passed all the protection key
>  tests available in the selftests directory.
>  The tests are updated to work on both x86 and powerpc.
>
> version v5:
>  (1) reverted back to the old design -- store the
>      key in the pte, instead of bypassing it.
>      The v4 design slowed down the hash page path.

This surprised me a lot but I couldn't find the respective code. Why do
you need to store anything in the pte? My understanding of PKEYs is that
the setup and teardown should be very cheap and so no page tables have
to be updated. Or do I just misunderstand what you wrote here?
--
Michal Hocko
SUSE Labs
Re: [RFC v5 00/38] powerpc: Memory Protection Keys
On Sun, Jul 09, 2017 at 11:05:44PM -0700, Ram Pai wrote:
> On Mon, Jul 10, 2017 at 11:13:23AM +0530, Anshuman Khandual wrote:
> > On 07/06/2017 02:51 AM, Ram Pai wrote:
.
> >
> > do you have data points to show the difference in
> > performance between this version and the last one where
> > we skipped the bits from PTE and directly programmed the
> > HPTE entries looking into VMA bits.
>
> No. I don't. I am hoping you can help me out with this.

Anshuman,
	The last version where we skipped the PTE bits is guaranteed
	to be bad/horrible. For one it has a bug, since it accesses
	the vma without a lock. And even if we did take a lock, it
	will slow down the page-hash path unacceptably. So there is
	no point measuring the performance of that design.

	I think the number we want to measure is the performance
	with the current design, compared to the performance
	without the memkey feature. We want to find out if there is
	any degradation from adding this feature.

RP
Re: [RFC v5 00/38] powerpc: Memory Protection Keys
On Mon, Jul 10, 2017 at 11:13:23AM +0530, Anshuman Khandual wrote:
> On 07/06/2017 02:51 AM, Ram Pai wrote:
> > Memory protection keys enable an application to protect its
> > address space from inadvertent access or corruption from
> > itself.
> >
> > The overall idea:
> >
> >  A process allocates a key and associates it with
> >  an address range within its address space.
> >  The process then can dynamically set read/write
> >  permissions on the key without involving the
> >  kernel. Any code that violates the permissions
> >  of the address space, as defined by its associated
> >  key, will receive a segmentation fault.
> >
> > This patch series enables the feature on PPC64 HPTE
> > platform.
> >
> > ISA3.0 section 5.7.13 describes the detailed specifications.
> >
> > Testing:
> >  This patch series has passed all the protection key
> >  tests available in the selftests directory.
> >  The tests are updated to work on both x86 and powerpc.
> >
> > version v5:
> >  (1) reverted back to the old design -- store the
> >      key in the pte, instead of bypassing it.
> >      The v4 design slowed down the hash page path.
> >  (2) detects key violation when kernel is told to
> >      access user pages.
> >  (3) further refined the patches into smaller consumable
> >      units
> >  (4) page fault handlers capture the faulting key
> >      from the pte instead of the vma. This closes a
> >      race between the key update in the vma and
> >      a key fault caused by the key programmed
> >      in the pte.
> >  (5) a key created with access-denied should
> >      also set it up to deny write. Fixed it.
> >  (6) protection-key number is displayed in smaps
> >      the x86 way.
>
> Hello Ram,
>
> This patch series has now grown a lot. Do you have this
> hosted somewhere for us to pull and test it out ? BTW

https://github.com/rampai/memorykeys.git
branch memkey.v5.3

> do you have data points to show the difference in
> performance between this version and the last one where
> we skipped the bits from PTE and directly programmed the
> HPTE entries looking into VMA bits.

No. I don't. I am hoping you can help me out with this.
RP
Re: [RFC v5 00/38] powerpc: Memory Protection Keys
On 07/06/2017 02:51 AM, Ram Pai wrote:
> Memory protection keys enable an application to protect its
> address space from inadvertent access or corruption from
> itself.
>
> The overall idea:
>
>  A process allocates a key and associates it with
>  an address range within its address space.
>  The process then can dynamically set read/write
>  permissions on the key without involving the
>  kernel. Any code that violates the permissions
>  of the address space, as defined by its associated
>  key, will receive a segmentation fault.
>
> This patch series enables the feature on PPC64 HPTE
> platform.
>
> ISA3.0 section 5.7.13 describes the detailed specifications.
>
> Testing:
>  This patch series has passed all the protection key
>  tests available in the selftests directory.
>  The tests are updated to work on both x86 and powerpc.
>
> version v5:
>  (1) reverted back to the old design -- store the
>      key in the pte, instead of bypassing it.
>      The v4 design slowed down the hash page path.
>  (2) detects key violation when kernel is told to
>      access user pages.
>  (3) further refined the patches into smaller consumable
>      units
>  (4) page fault handlers capture the faulting key
>      from the pte instead of the vma. This closes a
>      race between the key update in the vma and
>      a key fault caused by the key programmed
>      in the pte.
>  (5) a key created with access-denied should
>      also set it up to deny write. Fixed it.
>  (6) protection-key number is displayed in smaps
>      the x86 way.

Hello Ram,

This patch series has now grown a lot. Do you have this
hosted somewhere for us to pull and test it out ? BTW
do you have data points to show the difference in
performance between this version and the last one where
we skipped the bits from PTE and directly programmed the
HPTE entries looking into VMA bits.

- Anshuman
[RFC v5 00/38] powerpc: Memory Protection Keys
Memory protection keys enable an application to protect its
address space from inadvertent access or corruption from
itself.

The overall idea:

 A process allocates a key and associates it with
 an address range within its address space.
 The process then can dynamically set read/write
 permissions on the key without involving the
 kernel. Any code that violates the permissions
 of the address space, as defined by its associated
 key, will receive a segmentation fault.

This patch series enables the feature on PPC64 HPTE
platform.

ISA3.0 section 5.7.13 describes the detailed specifications.

Testing:
 This patch series has passed all the protection key
 tests available in the selftests directory.
 The tests are updated to work on both x86 and powerpc.

version v5:
 (1) reverted back to the old design -- store the
     key in the pte, instead of bypassing it.
     The v4 design slowed down the hash page path.
 (2) detects key violation when kernel is told to
     access user pages.
 (3) further refined the patches into smaller consumable
     units
 (4) page fault handlers capture the faulting key
     from the pte instead of the vma. This closes a
     race between the key update in the vma and
     a key fault caused by the key programmed
     in the pte.
 (5) a key created with access-denied should
     also set it up to deny write. Fixed it.
 (6) protection-key number is displayed in smaps
     the x86 way.

version v4:
 (1) patches no longer depend on the pte bits to program
     the hpte -- comment by Balbir
 (2) documentation updates
 (3) fixed a bug in the selftest.
 (4) unlike x86, powerpc lets signal handler change key
     permission bits; the change will persist across
     signal handler boundaries. Earlier we allowed
     the signal handler to modify a field in the siginfo
     structure which would then be used by the kernel
     to program the key protection register (AMR)
     -- resolves an issue raised by Ben.
     "Calls to sys_swapcontext with a made-up context
     will end up with a crap AMR if done by code who
     didn't know about that register".
 (5) these changes enable protection keys on 4k-page
     kernel as well.

version v3:
 (1) split the patches into smaller consumable
     patches.
 (2) added the ability to disable execute permission
     on a key at creation.
 (3) rename calc_pte_to_hpte_pkey_bits() to
     pte_to_hpte_pkey_bits() -- suggested by Anshuman
 (4) some code optimization and clarity in
     do_page_fault()
 (5) A bug fix while invalidating a hpte slot in
     __hash_page_4K() -- noticed by Aneesh

version v2:
 (1) documentation and selftest added
 (2) fixed a bug in 4k hpte backed 64k pte where page
     invalidation was not done correctly, and
     initialization of second-part-of-the-pte was not
     done correctly if the pte was not yet Hashed
     with a hpte. Reported by Aneesh.
 (3) Fixed ABI breakage caused in siginfo structure.
     Reported by Anshuman.

version v1: Initial version

Ram Pai (38):
  powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  powerpc: introduce pte_set_hash_slot() helper
  powerpc: introduce pte_get_hash_gslot() helper
  powerpc: capture the PTE format changes in the dump pte report
  powerpc: use helper functions in __hash_page_64K() for 64K PTE
  powerpc: use helper functions in __hash_page_huge() for 64K PTE
  powerpc: use helper functions in __hash_page_4K() for 64K PTE
  powerpc: use helper functions in __hash_page_4K() for 4K PTE
  powerpc: use helper functions in flush_hash_page()
  mm: introduce an additional vma bit for powerpc pkey
  mm: ability to disable execute permission on a key at creation
  x86: disallow pkey creation with PKEY_DISABLE_EXECUTE
  powerpc: initial plumbing for key management
  powerpc: helper function to read,write AMR,IAMR,UAMOR registers
  powerpc: implementation for arch_set_user_pkey_access()
  powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
  powerpc: store and restore the pkey state across context switches
  powerpc: introduce execute-only pkey
  powerpc: ability to associate pkey to a vma
  powerpc: implementation for arch_override_mprotect_pkey()
  powerpc: map vma key-protection bits to pte key bits.
  powerpc: sys_pkey_mprotect() system call
  powerpc: Program HPTE key protection bits
  powerpc: helper to validate key-access permissions of a pte
  powerpc: check key protection for user page access
  powerpc: Macro th