On Mon, Jul 07, 2025 at 03:02:15PM +0200, Christophe Leroy wrote:
>
> On 07/07/2025 at 13:49, Mike Rapoport wrote:
> > On Mon, Jul 07, 2025 at 12:10:43PM +0200, Christophe Leroy wrote:
> > >
> > > On 04/07/2025 at 15:49, Mike Rapoport wrote:
> > > > From: "Mike Rapoport (Microsoft)" <r...@kernel.org>
> > > >
> > > > The execmem_update_copy() that used text poking was required when memory
> > > > allocated from ROX cache was always read-only. Since now its permissions
> > > > can be switched to read-write there is no need in a function that updates
> > > > memory with text poking.
> > >
> > > Erm. Looks like I missed the patch that introduced this change.
> > >
> > > On some variant of powerpc, namely book3s/32, this is not feasible.
> >
> > The only user of EXECMEM_ROX_CACHE for now is x86-64, we can always revisit
> > when powerpc book3s/32 would want to opt in to cache usage.
> >
> > And it seems that [MODULES_VADDR, MODULES_END] is already mapped with
> > "large pages", isn't it?
>
> I don't think so. It uses execmem_alloc() which sets VM_ALLOW_HUGE_VMAP only
> when using EXECMEM_ROX_CACHE. And book3s/32 doesn't have large pages.
>
> Only 8xx has large pages but they are not PMD aligned (PMD_SIZE is 4M while
> large pages are 512k and 8M) so it wouldn't work well with existing
> execmem_vmalloc().

The PMD_SIZE can be replaced with one of arch_vmap size helpers if needed.
Or even parametrized in execmem_info.

> > > The granularity for setting the NX (non exec) bit is 256 Mbytes sections.
> > > So the area dedicated to execmem [MODULES_VADDR; MODULES_END[ always have
> > > the NX bit unset.
> > >
> > > You can change any page within this area from ROX to RWX but you can't make
> > > it RW without X. If you want RW without X you must map it in the VMALLOC
> > > area, as VMALLOC area have NX bit always set.
> >
> > So what will happen when one calls
> >
> > 	set_memory_nx()
> > 	set_memory_rw()
> >
> > in such areas?
>
> Nothing will happen. It will successfully unset the X bit on the PTE but
> that will be ignored by the HW which only relies on the segment's NX bit
> which is set for the entire VMALLOC area and unset for the entire MODULE
> area.
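
For illustration only, the behaviour described above can be sketched as a toy
userspace model. This is not kernel code: the struct and function names below
are hypothetical, and it assumes, per the description above, that execute
permission on book3s/32 is decided solely by the segment's NX bit while write
permission still comes from the PTE.

```c
#include <stdbool.h>

/* Toy model only: these types and names are hypothetical, not kernel APIs. */
struct pte_perms {
	bool write;	/* PTE allows writes */
	bool exec;	/* PTE's own X bit, as toggled by set_memory_x()/set_memory_nx() */
};

struct segment {
	bool nx;	/* single No-Execute bit covering the whole 256 Mbyte segment */
};

struct effective_perms {
	bool write;
	bool exec;
};

/*
 * Model of the book3s/32 behaviour described in the thread: execute
 * permission is taken from the segment only, so the PTE's X bit has
 * no effect on what the hardware allows.
 */
struct effective_perms effective(struct segment seg, struct pte_perms pte)
{
	struct effective_perms eff = {
		.write = pte.write,
		.exec = !seg.nx,	/* pte.exec is deliberately ignored */
	};
	return eff;
}
```

With a segment whose NX bit is unset (the MODULE area), a PTE marked RW without
X still comes out writable and executable in this model; with NX set (the
VMALLOC area), nothing is executable whatever the PTE says.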

And set_memory_rw() will essentially make the mapping RWX if it's in MODULE
area?

> Christophe
>

--
Sincerely yours,
Mike.