On Tue, Oct 24, 2017 at 05:47:38PM +0200, Michael Ellerman wrote: > Ram Pai <linux...@us.ibm.com> writes: > > > Handle Data and Instruction exceptions caused by memory > > protection-key. > > > > The CPU will detect the key fault if the HPTE is already > > programmed with the key. > > > > However if the HPTE is not hashed, a key fault will not > > be detected by the hardware. The software will detect > > pkey violation in such a case. > > That seems like the wrong trade off to me. > > It means every fault has to go through arch_vma_access_permitted(), > which is at least a function call in the best case, even when pkeys are > not in use, and/or the range in question is not protected by a key. > > Why not instead just service the fault and let the hardware catch it? > > If that access to a protected page is a bug then the process will > probably just exit, in which case who cares about the overhead of the > extra fault. > > If the access is not currently allowed, but the process wants to take > the signal and do something to allow it, then is the overhead of the > extra fault going to matter vs the signal etc? > > I think we should just let the hardware take a 2nd fault, unless someone > can demonstrate a workload where the cost of that is prohibitive. > > Or does that not work for some reason?
It does not work, because the arch-neutral code error-outs the fault without letting the power-specific code to page-in the faulting page. So neither does the page gets faulted, nor does the key-fault gets signalled. > > > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c > > index 4797d08..a16bc43 100644 > > --- a/arch/powerpc/mm/fault.c > > +++ b/arch/powerpc/mm/fault.c > > @@ -391,11 +408,9 @@ static int __do_page_fault(struct pt_regs *regs, > > unsigned long address, > > return 0; > > > > if (unlikely(page_fault_is_bad(error_code))) { > > - if (is_user) { > > - _exception(SIGBUS, regs, BUS_OBJERR, address); > > - return 0; > > - } > > - return SIGBUS; > > + if (!is_user) > > + return SIGBUS; > > + return bad_page_fault_exception(regs, address, error_code); > > I don't see why we want to handle the key fault here. > > I know it's caught here at the moment, because it's in > DSISR_BAD_FAULT_64S, but I think now that we support keys we should > remove DSISR_KEYFAULT from DSISR_BAD_FAULT_64S. > > Then we should let it fall through to further down, and handle it in > access_error(). > > That way eg. the page fault accounting will work as normal etc. Bit tricky to do that. Will try. For one if I take out DSISR_KEYFAULT from DSISR_BAD_FAULT_64S, than the assembly code in do_hash_page() will not call __do_page_fault(). We want it called, but somehow special-handle DSISR_KEYFAULT to call access_error(). > > > @@ -492,6 +507,18 @@ static int __do_page_fault(struct pt_regs *regs, > > unsigned long address, > > if (unlikely(access_error(is_write, is_exec, vma))) > > return bad_area(regs, address); > > > > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS > > + if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE, > > + is_exec, 0)) > > + return __bad_area(regs, address, SEGV_PKUERR); > > +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */ > > + > > + > > + /* handle_mm_fault() needs to know if its a instruction access > > + * fault. > > + */ > > + if (is_exec) > > + flags |= FAULT_FLAG_INSTRUCTION; > > What is this doing here? We already do that further up. The more I think of this, I find that the code can be optimized and remove redundancy. I am unnecessarily called arch_vma_access_permitted() above, when all the hardwork anyway gets done by handle_mm_fault(). handle_mm_fault() anyway calls arch_vma_access_permitted(). I could rather use the return value of handle_mm_fault() to signal a key-error to the userspace. RP