On Fri, Jun 27, 2025 at 02:39:16PM +0200, Jiri Olsa wrote: > On Fri, Jun 27, 2025 at 03:01:45PM +0900, Masami Hiramatsu wrote: > > SNIP > > > > > > > > > > + return tramp; > > > > > + } > > > > > + > > > > > + tramp = create_uprobe_trampoline(vaddr); > > > > > + if (!tramp) > > > > > + return NULL; > > > > > + > > > > > + *new = true; > > > > > + hlist_add_head(&tramp->node, &state->head_tramps); > > > > > + return tramp; > > > > > +} > > > > > + > > > > > +static void destroy_uprobe_trampoline(struct uprobe_trampoline > > > > > *tramp) > > > > > +{ > > > > > + hlist_del(&tramp->node); > > > > > + kfree(tramp); > > > > > > > > Don't we need to unmap the tramp->vaddr? > > > > > > that's tricky because we have no way to make sure the application is > > > no longer executing the trampoline, it's described in the changelog > > > of following patch: > > > > > > uprobes/x86: Add support to optimize uprobes > > > > > > ... > > > > > > We do not unmap and release uprobe trampoline when it's no longer > > > needed, > > > because there's no easy way to make sure none of the threads is still > > > inside the trampoline. But we do not waste memory, because there's > > > just > > > single page for all the uprobe trampoline mappings. > > > > > > > I think we should put this as a code comment. > > ok > > > > > > We do waste frame on page mapping for every 4GB by keeping the uprobe > > > trampoline page mapped, but that seems ok. > > > > Hmm, this is not right with the current find_nearest_page(), because > > it always finds a page from the farthest +2GB range until it is full. > > Thus, in the worst case, if we hits uprobes with the order of > > uprobe0 -> 1 -> 2 which is put as below; > > > > 0x0abc0004 [uprobe2] > > ... > > 0x0abc2004 [uprobe1] > > ... > > 0x0abc4004 [uprobe0] > > > > Then the trampoline pages can be allocated as below. > > > > 0x8abc0000 [uprobe_tramp2] > > [gap] > > 0x8abc2000 [uprobe_tramp1] > > [gap] > > 0x8abc4000 [uprobe_tramp0] > > > > Using true "find_nearest_page()", this will be mitigated. But not > > allocated for "every 4GB". So I think we should drop that part > > from the comment :) > > I think you're right, it's better to start with nearest page, > will change it in new version
I wonder we could actualy use page every 4GB (+-) code below seems to work, wdyt? thanks, jirka --- +#define __4GB (1UL << 32) +#define MASK_4GB ~(__4GB - 1) +#define PAGE_COUNT(addr) ((addr & ~MASK_4GB) >> PAGE_SHIFT) + +static unsigned long find_nearest_page(unsigned long vaddr) +{ + struct vm_unmapped_area_info info = { + .length = PAGE_SIZE, + .align_mask = ~PAGE_MASK, + }; + unsigned long limit, low_limit = PAGE_SIZE, high_limit = TASK_SIZE; + unsigned long cross_4GB, low_4GB, high_4GB; + unsigned long low_tramp, high_tramp; + unsigned long call_end = vaddr + 5; + + /* + * The idea is to create a trampoline every 4GB, so we need to find + * free page closest to the 4GB alignment. We find intersecting 4GB + * alignment address and search up and down to find the closest free + * page. + */ + + low_4GB = call_end & MASK_4GB; + high_4GB = low_4GB + __4GB; + + /* Restrict limits to be within (PAGE_SIZE,TASK_SIZE) boundaries. */ + if (!check_add_overflow(call_end, INT_MIN, &limit)) + low_limit = limit; + if (low_limit == PAGE_SIZE) + low_4GB = low_limit; + + high_limit = call_end + INT_MAX; + if (high_limit > TASK_SIZE) + high_limit = high_4GB = TASK_SIZE; + + /* Get 4GB alligned address that's within 2GB distance from call_end */ + if (low_limit <= low_4GB) + cross_4GB = low_4GB; + else + cross_4GB = high_4GB; + + /* Search up from intersecting 4GB alignment address. */ + info.low_limit = cross_4GB; + info.high_limit = high_limit; + high_tramp = vm_unmapped_area(&info); + + /* Search down from intersecting 4GB alignment address. */ + info.low_limit = low_limit; + info.high_limit = cross_4GB; + info.flags = VM_UNMAPPED_AREA_TOPDOWN; + low_tramp = vm_unmapped_area(&info); + + if (IS_ERR_VALUE(high_tramp) && IS_ERR_VALUE(low_tramp)) + return -ENOMEM; + if (IS_ERR_VALUE(high_tramp)) + return low_tramp; + if (IS_ERR_VALUE(low_tramp)) + return high_tramp; + + /* Return address that's closest to the 4GB alignment address. */ + if (cross_4GB - low_tramp < high_tramp - cross_4GB) + return low_tramp; + return high_tramp; +}