Hi Glenn, Sun, 12 Jan 2025 14:07:36 -0600 Glenn Washburn <developm...@efficientek.com> wrote: >On Fri, 10 Jan 2025 17:13:05 +0100 >Benjamin Berg <benja...@sipsolutions.net> wrote: > >> From: Benjamin Berg <benjamin.b...@intel.com> >> >> The stub execution uses the somewhat new close_range and execveat >> syscalls. Of these two, the execveat call is essential, but the >> close_range call is more about stub process hygiene rather than safety >> (and its result is ignored). >> >> Replace both calls with a raw syscall as older machines might not have a >> recent enough kernel for close_range (with CLOSE_RANGE_CLOEXEC) or a >> libc that does not yet expose both of the syscalls. >> >> Fixes: 32e8eaf263d9 ("um: use execveat to create userspace MMs") > >This change fixes the immediate issue, allowing the compile to complete >successfully. > >> Reported-by: Glenn Washburn <developm...@efficientek.com> >> Closes: https://lore.kernel.org/20250108022404.05e0de1e@crass-HP-ZBook-15-G2 >> Signed-off-by: Benjamin Berg <benjamin.b...@intel.com> >> --- >> arch/um/os-Linux/skas/process.c | 16 +++++++++++++--- >> 1 file changed, 13 insertions(+), 3 deletions(-) >> >> diff --git a/arch/um/os-Linux/skas/process.c >> b/arch/um/os-Linux/skas/process.c >> index f683cfc9e51a..64e89921bc7f 100644 >> --- a/arch/um/os-Linux/skas/process.c >> +++ b/arch/um/os-Linux/skas/process.c >> @@ -181,6 +181,10 @@ extern char __syscall_stub_start[]; >> >> static int stub_exe_fd; >> >> +#ifndef CLOSE_RANGE_CLOEXEC >> +#define CLOSE_RANGE_CLOEXEC (1U << 2) >> +#endif >> + >> static int userspace_tramp(void *stack) >> { >> char *const argv[] = { "uml-userspace", NULL }; >> @@ -202,8 +206,12 @@ static int userspace_tramp(void *stack) >> init_data.stub_data_fd = phys_mapping(uml_to_phys(stack), &offset); >> init_data.stub_data_offset = MMAP_OFFSET(offset); >> >> - /* Set CLOEXEC on all FDs and then unset on all memory related FDs */ >> - close_range(0, ~0U, CLOSE_RANGE_CLOEXEC); >> + /* >> + * Avoid leaking unneeded FDs to the stub by setting CLOEXEC on all FDs >> + * and then unsetting it on all memory related FDs. >> + * This is not strictly necessary from a safety perspective. >> + */ >> + syscall(__NR_close_range, 1, ~0U, CLOSE_RANGE_CLOEXEC); > >Was this intentional to change the fd parameter from 0 to 1? > >> >> fcntl(init_data.stub_data_fd, F_SETFD, 0); >> for (iomem = iomem_regions; iomem; iomem = iomem->next) >> @@ -224,7 +232,9 @@ static int userspace_tramp(void *stack) >> if (ret != sizeof(init_data)) >> exit(4); >> >> - execveat(stub_exe_fd, "", argv, NULL, AT_EMPTY_PATH); >> + /* Raw execveat for compatibility with older libc versions */ >> + syscall(__NR_execveat, stub_exe_fd, (unsigned long)"", >> + (unsigned long)argv, NULL, AT_EMPTY_PATH); > >I think it would look nicer to leave the call unchanged and define a stub >function like libc does, but only if we detect that the stubs are >undefined. I have a glibc specific patch for this, and then realized >that it should be more generic to cover other libcs. So this patch here >is more general. I think to have what I'd like, I'd need to add and run >a test binaries that check for these functions, like autotools does. >I'm not aware that this is really done in the kernel build system, >though I do know that there are binaries built during build for >generating binaries to be included in the kernel. So in theory I don't >think it should be too much trouble to do. Basically the idea would be >to have a system for testing host libc and outputting a config.h which >will define for instance, HAVE_EXECVEAT, if support is detected in the >linked libc for execveat. And in process.c define a stub around the >syscall() for execveat if not defined HAVE_EXECVEAT. > >So perhaps for now, this is the best solution, but ultimately it would >be nice to have the above and ultimately reverse these changes.
Sorry for my due enquiry, I don't want take a new thread with a patch. It is just a short question. Currently(119b1e61a769aa98e68599f44721661a4d8c55f3), there are three pieces of code calling to syscall __NR_close_range: # grep __NR_close_range -r arch/um/* -n arch/um/kernel/skas/stub_exe.c:73: res = stub_syscall3(__NR_close_range, 1, ~0U, 0); arch/um/os-Linux/start_up.c:269: if (stub_syscall3(__NR_close_range, 1, ~0U, 0)) arch/um/os-Linux/skas/process.c:331: syscall(__NR_close_range, 0, ~0U, CLOSE_RANGE_CLOEXEC); But my host kernel is 5.4 and fails to compile these pieces of codes because there is no this syscall in my older kernel. I was wondering if you are working on this issue. If not, I will try to make a draft solution to review. Thanks! Yongting > >Glenn > > >> exit(5); >> }