Re: [PATCH 5/5] add setting gs/fsbase
On Sat, May 6, 2023 at 1:12 PM Sergey Bugaev wrote: > This is ld.so in the exec server task trying to dir_lookup > ("/hurd/exec") -- it then crashes ext2fs when it tries to access the > TCB. But before that, ext2fs starts up, spawns all those > libports/libpager worker threads (which now start up correctly), > resumes exec, and ld.so begins starting up, which means the SHARED / > IS_IN (rtld) build is not super broken either \o/ I now have exec server loading and initializing and binding all the shared libraries, then starting all the way to main (), then trivfs_startup (), successfully handshaking with ext2fs! Then something goes wrong; I have not yet investigated, but it's probably because ext2fs expects to find /hurd/startup next, and I haven't put that into my ramdisk. Anyway, I declare this a huge success :) I'll send the glibc fixes some time soon, and I should keep working on the Hurd patches, hopefully there isn't too much work left, then we can upstream it, and then test with something that resembles a more complete Debian installation. Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Hello, On Tue, Apr 25, 2023 at 1:59 PM Samuel Thibault wrote: > Sergey Bugaev, le mar. 25 avril 2023 13:25:02 +0300, a ecrit: > > @@ -733,6 +734,10 @@ boolean_t thread_invoke( > > > > counter(c_thread_invoke_hits++); > > (void) spl0(); > > +#ifdef __x86_64__ > > + wrmsr(MSR_REG_FSBASE, new_thread->pcb->iss.fsbase); > > + wrmsr(MSR_REG_GSBASE, new_thread->pcb->iss.gsbase); > > +#endif > > I guess it could belong to switch_ktss? I'm now hitting this same issue with the fast RPC codepath: #0 stack_handoff (old=old@entry=0x90ecd828, new=new@entry=0x90ecd620) at ../i386/i386/pcb.c:301 #1 0x8104f6c4 in thread_handoff (old=old@entry=0x90ecd828, continuation=continuation@entry=0x81046f30 , new=new@entry=0x90ecd620) at ../kern/ipc_sched.c:239 #2 0x810485b4 in mach_msg_trap (msg=0xbfffeca0, option=, send_size=1096, rcv_size=, rcv_name=3, time_out=, notify=) at ../ipc/mach_msg.c:830 #3 0x81011eb2 in syscall64 () at ../x86_64/locore.S:1430 This is ld.so in the exec server task trying to dir_lookup ("/hurd/exec") -- it then crashes ext2fs when it tries to access the TCB. But before that, ext2fs starts up, spawns all those libports/libpager worker threads (which now start up correctly), resumes exec, and ld.so begins starting up, which means the SHARED / IS_IN (rtld) build is not super broken either \o/ Indeed it looks like switch_ktss is a better place for setting fsgs_base, I'm going to try that next. Sergey
Re: [PATCH 5/5] add setting gs/fsbase
On Tue, May 2, 2023 at 10:04 AM Samuel Thibault wrote: > I don't see any issue on 32bit gnumach: > > $ ./test > Killed > > Perhaps with is only with 64bit gnumach Yes, most likely it's something 64-bit specific. We know from experience that the 32-bit gnumach survives a task_terminate call just fine :) But then maybe you should be checking with only a single bootstrap task trying to terminate itself, not a full Hurd environment, which, again, is surely known to work. Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Hello, Sergey Bugaev, le mer. 26 avril 2023 20:33:20 +0300, a ecrit: > ../kern/ipc_tt.c:395: retrieve_task_self_fast: Assertion > `task->itk_self != IP_NULL' failed.panic ../kern/debug.c:103: > Debugger: Debugger invoked, but there isn't one! > > This is after typing 'quit' in bc, which calls exit () -- I had to fix > up _hurd_exit () in glibc a little to not crash if we don't have > _hurd_ports. From single-stepping, it seems task_terminate () works, > as in it tears down and deallocates the kernel task_t, but then the > syscall (which task_terminate is) just returns back to userspace to > the now-nonexistent task, and it keeps running. It then calls another > syscall and that one breaks with the assertion above. > > You should be able to reproduce this without glibc by just calling > task_terminate (mach_task_self ()). I don't see any issue on 32bit gnumach: $ ./test Killed Perhaps with is only with 64bit gnumach, with the AST questions, I see notably thread_terminate that in the thread==cur_thread case, AST_TERMINATE is used. Samuel
Re: [PATCH 5/5] add setting gs/fsbase
Applied, thanks! Luca Dariz, le mer. 19 avril 2023 21:47:03 +0200, a ecrit: > * i386/i386/i386asm.sym: add offsets for asm > * i386/i386/pcb.c: switch FSBASE/GSBASE on context switch and > implement accessors in thread setstatus/getstatus > * i386/i386/thread.h: add new state to thread saved state > * kern/thread.c: add i386_FSGS_BASE_STATE handler > * x86_64/locore.S: fix fs/gs handling, skipping the base address and > avoid resetting it by manually re-loading fs/gs > --- > i386/i386/i386asm.sym | 2 + > i386/i386/pcb.c | 39 +-- > i386/i386/thread.h| 4 ++ > kern/thread.c | 3 ++ > x86_64/locore.S | 89 ++- > 5 files changed, 116 insertions(+), 21 deletions(-) > > diff --git a/i386/i386/i386asm.sym b/i386/i386/i386asm.sym > index 1b9b40bb..fd0be557 100644 > --- a/i386/i386/i386asm.sym > +++ b/i386/i386/i386asm.sym > @@ -108,6 +108,8 @@ offseti386_saved_stater r12 > offset i386_saved_stater r13 > offset i386_saved_stater r14 > offset i386_saved_stater r15 > +offset i386_saved_stater fsbase > +offset i386_saved_stater gsbase > #endif > > offset i386_interrupt_statei eip > diff --git a/i386/i386/pcb.c b/i386/i386/pcb.c > index 61125fe8..8a9e3bf4 100644 > --- a/i386/i386/pcb.c > +++ b/i386/i386/pcb.c > @@ -51,6 +51,7 @@ > #include "eflags.h" > #include "gdt.h" > #include "ldt.h" > +#include "msr.h" > #include "ktss.h" > #include "pcb.h" > > @@ -372,7 +373,10 @@ thread_t switch_context( >* Load the rest of the user state for the new thread >*/ > switch_ktss(new->pcb); > - > +#if defined(__x86_64__) && !defined(USER32) > +wrmsr(MSR_REG_FSBASE, new->pcb->iss.fsbase); > +wrmsr(MSR_REG_GSBASE, new->pcb->iss.gsbase); > +#endif > return Switch_context(old, continuation, new); > } > > @@ -667,7 +671,23 @@ kern_return_t thread_setstatus( > return ret; > break; > } > - > +#if defined(__x86_64__) && !defined(USER32) > + case i386_FSGS_BASE_STATE: > +{ > +struct i386_fsgs_base_state *state; > +if (count < i386_FSGS_BASE_STATE_COUNT) > +return KERN_INVALID_ARGUMENT; > + > +state = (struct i386_fsgs_base_state *) tstate; > +thread->pcb->iss.fsbase = state->fs_base; > +thread->pcb->iss.gsbase = state->gs_base; > +if (thread == current_thread()) { > +wrmsr(MSR_REG_FSBASE, state->fs_base); > +wrmsr(MSR_REG_GSBASE, state->gs_base); > +} > +break; > +} > +#endif > default: > return(KERN_INVALID_ARGUMENT); > } > @@ -843,7 +863,20 @@ kern_return_t thread_getstatus( > *count = i386_DEBUG_STATE_COUNT; > break; > } > - > +#if defined(__x86_64__) && !defined(USER32) > + case i386_FSGS_BASE_STATE: > +{ > +struct i386_fsgs_base_state *state; > +if (*count < i386_FSGS_BASE_STATE_COUNT) > +return KERN_INVALID_ARGUMENT; > + > +state = (struct i386_fsgs_base_state *) tstate; > +state->fs_base = thread->pcb->iss.fsbase; > +state->fs_base = thread->pcb->iss.gsbase; > +*count = i386_FSGS_BASE_STATE_COUNT; > +break; > +} > +#endif > default: > return(KERN_INVALID_ARGUMENT); > } > diff --git a/i386/i386/thread.h b/i386/i386/thread.h > index 933b43d8..b5fc5ffb 100644 > --- a/i386/i386/thread.h > +++ b/i386/i386/thread.h > @@ -51,6 +51,10 @@ > */ > > struct i386_saved_state { > +#ifdef __x86_64__ > + unsigned long fsbase; > + unsigned long gsbase; > +#endif > unsigned long gs; > unsigned long fs; > unsigned long es; > diff --git a/kern/thread.c b/kern/thread.c > index 392d38f8..f0cea804 100644 > --- a/kern/thread.c > +++ b/kern/thread.c > @@ -1472,6 +1472,9 @@ kern_return_t thread_set_state( > if (flavor == i386_DEBUG_STATE && thread == current_thread()) > /* This state can be set directly for the curren thread. */ > return thread_setstatus(thread, flavor, new_state, > new_state_count); > + if (flavor == i386_FSGS_BASE_STATE && thread == current_thread()) > + /* This state can be set directly for the curren thread. */ > + return thread_setstatus(thread, flavor, new_state, > new_state_count); > #endif > > if (thread == THREAD_NULL || thread == current_thread()) > diff --git a/x86_64/locore.S b/x86_64/locore.S > index 1b17d921..ba0400a3
Re: [PATCH 5/5] add setting gs/fsbase
Hi again, I managed to break gnumach in a new and exciting way (tm): ../kern/ipc_tt.c:395: retrieve_task_self_fast: Assertion `task->itk_self != IP_NULL' failed.panic ../kern/debug.c:103: Debugger: Debugger invoked, but there isn't one! This is after typing 'quit' in bc, which calls exit () -- I had to fix up _hurd_exit () in glibc a little to not crash if we don't have _hurd_ports. From single-stepping, it seems task_terminate () works, as in it tears down and deallocates the kernel task_t, but then the syscall (which task_terminate is) just returns back to userspace to the now-nonexistent task, and it keeps running. It then calls another syscall and that one breaks with the assertion above. You should be able to reproduce this without glibc by just calling task_terminate (mach_task_self ()). Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Sergey Bugaev, le mar. 25 avril 2023 13:25:02 +0300, a ecrit: > @@ -733,6 +734,10 @@ boolean_t thread_invoke( > > counter(c_thread_invoke_hits++); > (void) spl0(); > +#ifdef __x86_64__ > + wrmsr(MSR_REG_FSBASE, new_thread->pcb->iss.fsbase); > + wrmsr(MSR_REG_GSBASE, new_thread->pcb->iss.gsbase); > +#endif I guess it could belong to switch_ktss? Samuel
Re: [PATCH 5/5] add setting gs/fsbase
On Tue, Apr 25, 2023 at 12:02 PM Sergey Bugaev wrote: > subcode is the address of the fault, and 128 just so happens to be > 0x80, and %fs:0x80 is of course tcb->reply_port. So it looks like > fs_base was not restored to its proper value when context-switching to > the thread, and gets instead set to 0, and so accessing the reply port > to make an RPC faults. > > But how come all the previous RPCs worked? Well, it looks like fs_base > is restored/preserved correctly when just doing synchronous kernel RPC > calls, but not when the thread is continued after reading via > io_done_thread_continue/thread_invoke/call_continuation. This is of course a wrong way to do this, but just for the sake of testing: diff --git a/kern/sched_prim.c b/kern/sched_prim.c index dd0f492b..820de799 100644 --- a/kern/sched_prim.c +++ b/kern/sched_prim.c @@ -55,6 +55,7 @@ #include #include #include +#include #ifMACH_FIXPRI #include @@ -733,6 +734,10 @@ boolean_t thread_invoke( counter(c_thread_invoke_hits++); (void) spl0(); +#ifdef __x86_64__ + wrmsr(MSR_REG_FSBASE, new_thread->pcb->iss.fsbase); + wrmsr(MSR_REG_GSBASE, new_thread->pcb->iss.gsbase); +#endif call_continuation(new_thread->swap_func); /*NOTREACHED*/ return TRUE; /* help for the compiler */ And sure enough: start bc: bc 1.07.1 Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006, 2008, 2012-2017 Free Software Foundation, Inc. This is free software with ABSOLUTELY NO WARRANTY. For details type `warranty'. 2 + 3 5 print "Hello from x86_64-gnu!" Hello from x86_64-gnu! Now's the time to admit I don't really know how to use bc :D Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Hello, I have done some debugging and I think I know what's going on, and it looks related to this very patch. The device_read_inband () call actually succeeds, and glibc tries to echo the same thing back (that's how devstream works, arguably maybe it shouldn't, but whatever); and when it tries to do the RPC is calls everyone's-since-recently-favorite-function, __mig_get_reply_port (), and... #0 i386_exception (exc=exc@entry=1, code=code@entry=2, subcode=128) at ../i386/i386/trap.c:638 #1 0x8103410b in user_page_fault_continue (kr=kr@entry=2) at ../i386/i386/trap.c:119 #2 0x81056346 in vm_fault (map=, vaddr=vaddr@entry=0, fault_type=1, change_wiring=change_wiring@entry=0, resume=resume@entry=0, continuation=continuation@entry=0x810340e0 ) at ../vm/vm_fault.c:1484 #3 0x81033f5b in user_trap (regs=0x90fe3868) at ../i386/i386/trap.c:518 #4 0x81011b73 in _take_trap () at ../x86_64/locore.S:549 subcode is the address of the fault, and 128 just so happens to be 0x80, and %fs:0x80 is of course tcb->reply_port. So it looks like fs_base was not restored to its proper value when context-switching to the thread, and gets instead set to 0, and so accessing the reply port to make an RPC faults. But how come all the previous RPCs worked? Well, it looks like fs_base is restored/preserved correctly when just doing synchronous kernel RPC calls, but not when the thread is continued after reading via io_done_thread_continue/thread_invoke/call_continuation. Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Sergey Bugaev, le lun. 24 avril 2023 23:27:09 +0300, a ecrit: > Alternatively, if it is some fatal crash in userland, what should I > break on in gnumach to detect it? Somewhere in vm_fault.c? The debug_all_traps_with_kdb variable is there for that, you can also enable it live from kdb with debug traps /on Samuel
Re: [PATCH 5/5] add setting gs/fsbase
Hi, On Mon, Apr 24, 2023 at 11:02 PM Luca Dariz wrote: > Il 24/04/23 17:19, Sergey Bugaev ha scritto: > > Resending without the attachment, because apparently the email did not > > make it into the list archive so maybe it didn't get to you either; > > and I'm too conscious about email just eating my letters without any > > notice or indication since that one time. If you didn't get the last > > one and still want the attachment, let me know and we'll work > > something out. And please ack receiving this one in any case. > > I received the previous one, including the attachment, thanks! Good -- and it did eventually make its way into the archive too. Interestingly, the second one got into the archive first, and it was being displayed as a reply to an unknown message for a while, and then the first one showed up. I don't know what's going on, perhaps *something* was taking its time scanning my 5.6 MB binary attachment for malware? > I can see issues also with simpler tests, if I use the graphical > console; actually the whole bootstrap task doesn't start. Yes, with the graphical console (called VGA text console I believe?) the bootstrap crashes before it even first enters user mode. So please debug that one too. > Did you try to > use the serial console? It has the drawback of reconfiguring the > terminal every time, but it seems to work for both in/out. ...but yes, since you mentioned using -nographic & console=com0 last time, this is what I've been using. > I tried using > the "com" mach device with the device_read/write() rpcs. I'm trying the "console" device, and glibc uses the _inband versions of the RPCs, I believe. Please try my bc build and see if you can reproduce it. Alternatively, if it is some fatal crash in userland, what should I break on in gnumach to detect it? Somewhere in vm_fault.c? I can't just break on all page faults since there are a lot of them and most are benign. Maybe on i386_exception ()? Hm, maybe I should just -D MACH_KDB. Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Il 24/04/23 17:19, Sergey Bugaev ha scritto: Resending without the attachment, because apparently the email did not make it into the list archive so maybe it didn't get to you either; and I'm too conscious about email just eating my letters without any notice or indication since that one time. If you didn't get the last one and still want the attachment, let me know and we'll work something out. And please ack receiving this one in any case. I received the previous one, including the attachment, thanks! Sergey Bugaev, le sam. 22 avril 2023 15:11:30 +0300, a ecrit: I had to patch it a bit to make it accept and use --device-port, and to stop trying to read (fileno (stdin)). So, Luca, I'm eagerly awaiting a fix for input :) I'm attaching the bc binary for you to test. [Not actually attaching this time. -S] I can see issues also with simpler tests, if I use the graphical console; actually the whole bootstrap task doesn't start. Did you try to use the serial console? It has the drawback of reconfiguring the terminal every time, but it seems to work for both in/out. I tried using the "com" mach device with the device_read/write() rpcs. Luca
Re: [PATCH 5/5] add setting gs/fsbase
Resending without the attachment, because apparently the email did not make it into the list archive so maybe it didn't get to you either; and I'm too conscious about email just eating my letters without any notice or indication since that one time. If you didn't get the last one and still want the attachment, let me know and we'll work something out. And please ack receiving this one in any case. --- On Sat, Apr 22, 2023 at 3:13 PM Samuel Thibault wrote: > > Sergey Bugaev, le sam. 22 avril 2023 15:11:30 +0300, a ecrit: > > So please suggest your favorite REPL :) > > bc ? :) Sure: GNU Mach 1.8 biosmem: physical memory map: biosmem: 00:09f000, available biosmem: 09fc00:0a, reserved biosmem: 0f:10, reserved biosmem: 10:001ffe, available biosmem: 001ffe:002000, reserved biosmem: 00feffc000:00ff00, reserved biosmem: 00fffc:01, reserved vm_page: page table size: 131024 entries (12284k) vm_page: DMA: pages: 4080 (15M), free: 0 (0M) vm_page: DMA: min:500 low:600 high:1000 vm_page: DMA32: pages: 61440 (240M), free: 61282 (239M) vm_page: DMA32: min:3072 low:3686 high:6144 vm_page: DIRECTMAP: pages: 65504 (255M), free: 59399 (232M) vm_page: DIRECTMAP: min:3275 low:3930 high:6550 com 2 out of range RTC time is 2023-04-24 14:30:26 Pausing, debug me! module 0: bc --device-port=${device-port} $(task-create) $(prompt-task-resume) 1 multiboot modules task loaded: bc --device-port=1 Pausing for bc... Hit to resume bootstrap. start bc: bc 1.07.1 Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006, 2008, 2012-2017 Free Software Foundation, Inc. This is free software with ABSOLUTELY NO WARRANTY. For details type `warranty'. (just like mach-bootstrap-hello, hangs here, does not react to input) I had to patch it a bit to make it accept and use --device-port, and to stop trying to read (fileno (stdin)). So, Luca, I'm eagerly awaiting a fix for input :) I'm attaching the bc binary for you to test. [Not actually attaching this time. -S] Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Sergey Bugaev, le sam. 22 avril 2023 15:11:30 +0300, a ecrit: > So please suggest your favorite REPL :) bc ? :) Samuel
Re: [PATCH 5/5] add setting gs/fsbase
On Sat, Apr 22, 2023 at 2:17 PM Luca Dariz wrote: > Hi Sergey, Hi, > Il 20/04/23 13:51, Sergey Bugaev ha scritto: > > On Wed, Apr 19, 2023 at 11:52 PM Sergey Bugaev wrote: > > We do reach the call to __thread_set_state (), but then it uses > > __mig_memcpy (), which is just 'return memcpy (...)'. And (because of > > the static build?), that memcpy () is the full real ifunc-selected > > memcpy. So it jumps to the memcpy@plt, which jumps to > > *(mem...@got.plt), which is supposed to jump into the rtld and then > > run the ifunc resolver and do its smarts and eventually jump to the > > right memcpy... > > I don't know if you already found a good solution for this, but if not > maybe we could add an x86_64-specific plain syscall to set fsbase/gsbase > during bootstrap, if it can make things more manageable. Thank you, but that won't help much, since thread_set_state is not the only offender, there are more memcpy/memmove calls further (in RPC_exec_startup_get_info and hurdstartup.c). So we need to solve this in userspace one way or another. My pre-filling GOT thing seems to work well, although it hasn't received any positive feedback from glibc maintainers. Alternatively we could use Adhemerval Zanella Netto's suggestion to redirect memcpy to a baseline implementation in some .c files at compile time. Rather, please look into why device_write () hangs in ast_taken (), and into enabling > 4 GB address space, and into the crash when not using console=com0. Once you figure out the hang in device_write (), I'd like to experiment with building and running some REPL -- like maybe Python or Bash or some Lisp; but it needs to work without forking (so Bash wouldn't really work...) and without accessing any files (Python wants to load its stdlib from the fs I believe). So please suggest your favorite REPL :) Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Hi Sergey, Il 20/04/23 13:51, Sergey Bugaev ha scritto: On Wed, Apr 19, 2023 at 11:52 PM Sergey Bugaev wrote: We do reach the call to __thread_set_state (), but then it uses __mig_memcpy (), which is just 'return memcpy (...)'. And (because of the static build?), that memcpy () is the full real ifunc-selected memcpy. So it jumps to the memcpy@plt, which jumps to *(mem...@got.plt), which is supposed to jump into the rtld and then run the ifunc resolver and do its smarts and eventually jump to the right memcpy... I don't know if you already found a good solution for this, but if not maybe we could add an x86_64-specific plain syscall to set fsbase/gsbase during bootstrap, if it can make things more manageable. Luca
Re: [PATCH 5/5] add setting gs/fsbase
Sergey Bugaev, le ven. 21 avril 2023 17:30:12 +0300, a ecrit: > On Fri, Apr 21, 2023 at 12:24 PM Samuel Thibault > wrote: > > No, they don't :) otherwise they wouldn't respect the ABI. The AMD64 > > ABI indeed defaults to /lib/ld64.so.1, and says that Linux uses > > /lib64/ld-linux-x86-64.so.2, and that is cast in stone, we cannot change > > it any more. > > > > /lib/ld-x86-64.so.1 should be completely fine. > > I don't think I understand -- so the AMD64 ABI spec requires it to be > /lib/ld64.so.1, It does not require it to be that, it says it should be that. But it's a very bad idea since it doesn't permit to distinguish different 64bit archs. > but it also mentions that Linux uses > /lib64/ld-linux-x86-64.so.2, Yes, because Linux chose to use that, to encode the OS name and the arch in the path. > and that means we should be using /lib/ld-x86-64.so.1? Yes, because we do want to put the arch name in the path. We could want to put "gnu-" in it too, it's just about time to decide. After we have some settled distribution, we should really not change that. > Moreover, how does the ELF spec requiring a specific file path for > INTERP make any sense at all? For interoperability between distributions. > Surely the .1 is the soname means something, we should be able to bump > it to .2 when we break ABI really bad? Yes, but we'd rather avoid that if possible. > What if the system is not a Unix and doesn't use /lib/ or .so for its > libraries (but still uses ELF as a format)? Then it's not the same system and thus it's free to use whatever interpreter path it prefers. But it should really rather stick to it once decided, for interoperability. Samuel
Re: [PATCH 5/5] add setting gs/fsbase
On Fri, Apr 21, 2023 at 12:24 PM Samuel Thibault wrote: > No, they don't :) otherwise they wouldn't respect the ABI. The AMD64 > ABI indeed defaults to /lib/ld64.so.1, and says that Linux uses > /lib64/ld-linux-x86-64.so.2, and that is cast in stone, we cannot change > it any more. > > /lib/ld-x86-64.so.1 should be completely fine. I don't think I understand -- so the AMD64 ABI spec requires it to be /lib/ld64.so.1, but it also mentions that Linux uses /lib64/ld-linux-x86-64.so.2, and that means we should be using /lib/ld-x86-64.so.1? How does that make any sense? Moreover, how does the ELF spec requiring a specific file path for INTERP make any sense at all? Surely the .1 is the soname means something, we should be able to bump it to .2 when we break ABI really bad? What if the system is not a Unix and doesn't use /lib/ or .so for its libraries (but still uses ELF as a format)? Why embed a path name into each binary at all, if the spec just defines the one true rtld path? And also surely the systems don't follow that part of the spec in practice? Linux uses its own thing as we have already established, and from other emulparams files in the binutils tree, I see that Haiku has /system/runtime_loader, and FreeBSD apparently /usr/libexec/ld-elf.so.1. In SerenityOS we use /usr/lib/Loader.so. I'm not objecting to /lib/ld-x86-64.so.1, I just don't understand... Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Sergey Bugaev, le ven. 21 avril 2023 12:17:54 +0300, a ecrit: > Hello, > > On Fri, Apr 21, 2023 at 12:01 AM Samuel Thibault > wrote: > > BTW, which dynamic linker path do we have set for x86_64-gnu? > > It's in gcc/config/i386/gnu64.h in GCC tree: > > #define GNU_USER_DYNAMIC_LINKER64 "/lib/ld-x86-64.so.1" > > It was added by Flavio. Ok, so it has the version, good :) > So let's maybe choose a single path and use it consistently? That's the one we should be using. > I would also expect distros to want to have a say in this. No, they don't :) otherwise they wouldn't respect the ABI. The AMD64 ABI indeed defaults to /lib/ld64.so.1, and says that Linux uses /lib64/ld-linux-x86-64.so.2, and that is cast in stone, we cannot change it any more. /lib/ld-x86-64.so.1 should be completely fine. > For instance on Debian GNU/Linux x86_64 ld.so is at > /usr/lib/x86_64-linux-gnu/ld-2.31.so, That's just where the file is stored. It's normal to set a symlink where it is actually looked up. > Which maybe makes sense if they wish to keep binaries built on Debian > runnable on other distros. Do we want the same for Debian GNU/Hurd, Better have a cross-distribution ABI, yes. Samuel
Re: [PATCH 5/5] add setting gs/fsbase
Hello, On Fri, Apr 21, 2023 at 12:01 AM Samuel Thibault wrote: > BTW, which dynamic linker path do we have set for x86_64-gnu? It's in gcc/config/i386/gnu64.h in GCC tree: #define GNU_USER_DYNAMIC_LINKER64 "/lib/ld-x86-64.so.1" It was added by Flavio. So GCC passes -dynamic-linker /lib/ld-x86-64.so.1 to ld. On its own, ld sets /lib/ld64.so.1 as far as I can see. glibc installs its ld.so into $prefix/lib/ld.so.1 (i.e. /usr/lib/ld.so.1 likely, or /lib/ld.so.1 in no-/usr configuration). So let's maybe choose a single path and use it consistently? I would also expect distros to want to have a say in this. For instance on Debian GNU/Linux x86_64 ld.so is at /usr/lib/x86_64-linux-gnu/ld-2.31.so, but the toolchains still sets /lib64/ld-linux-x86-64.so.2 in PT_INTERP, and to make this work they have /usr/lib64 which contains just a single symlink: /usr/lib64/ld-linux-x86-64.so.2 -> /lib/x86_64-linux-gnu/ld-2.31.so Which maybe makes sense if they wish to keep binaries built on Debian runnable on other distros. Do we want the same for Debian GNU/Hurd, or maybe we want Debian-specific GCC/binutils configuration to set PT_INTERP to /lib/x86_64-gnu/ld.so.1 directly? I imagine the Guix folks may also have their own preference. Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Hello, Sergey Bugaev, le jeu. 20 avril 2023 16:08:45 +0300, a ecrit: > Is that actually configurable somewhere (at gcc/binutils build time, I > mean)? It's probably in the link script, somewhere in /usr/lib/*/ldscripts/elf*x86_64* > I'm thinking we will want to map the lower 4GB of address space > inaccessible like dyld does, and not just one page, to catch pointer > truncation in addition to nullptr dereferences. That sounds useful indeed. > But that would need gcc/ld to know to not use low addresses for their > needs. It seems that binutils' ld/emulparams/ there are already some per-OS files: elf_x86_64_cloudabi.sh elf_x86_64_fbsd.sh elf_x86_64_haiku.sh elf_x86_64_sol2.sh One can probably create a gnu file that overrides TEXT_START_ADDR=0x40 BTW, which dynamic linker path do we have set for x86_64-gnu? Samuel
Re: [PATCH 5/5] add setting gs/fsbase
FWIW, mach-bootstrap-hello is supposed to then echo back what you type on the console, but it doesn't work. From some debugging, it seems that once I input something, the following device_write () call hangs here: (gdb) ast_taken () at ../kern/ast.c:88 88 if (reasons & AST_NETWORK) (gdb) 96 if (self != current_processor()->idle_thread) { (gdb) n 98 while (thread_should_halt(self)) (gdb) l 93 * or thread_block from the idle thread. 94 */ 95 96 if (self != current_processor()->idle_thread) { 97 #ifndef MIGRATING_THREADS 98 while (thread_should_halt(self)) 99 thread_halt_self(thread_exception_return); 100 #endif 101 102 /* (gdb) bt #0 ast_taken () at ../kern/ast.c:98 #1 0x81011b8e in _return_from_trap () at ../x86_64/locore.S:571 #2 0x9f442868 in ?? () #3 0x81018a70 in ?? () at ../kern/sched_prim.c:924 #4 0x9f43ffb8 in ?? () #5 0x in ?? () I don't know what to make of that, but perhaps you will. Note that again, this worked on i386. Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Il 20 aprile 2023 13:18:29 UTC, Sergey Bugaev ha scritto: >On Thu, Apr 20, 2023 at 4:08 PM Sergey Bugaev wrote: >> > Would it work for you to shrink temporarily the VM space, at least for >> > testing? >> >> Yeah, that's also what I've been thinking to try out next. I'm >> somewhat surprised that gcc/ld placed my static executable in the >> rather low part of the address space, but here it actually works in >> our favor. > >It's alive Wow!!!
Re: [PATCH 5/5] add setting gs/fsbase
Sergey Bugaev, le jeu. 20 avril 2023 16:18:29 +0300, a ecrit: > On Thu, Apr 20, 2023 at 4:08 PM Sergey Bugaev wrote: > > > Would it work for you to shrink temporarily the VM space, at least for > > > testing? > > > > Yeah, that's also what I've been thinking to try out next. I'm > > somewhat surprised that gcc/ld placed my static executable in the > > rather low part of the address space, but here it actually works in > > our favor. > > It's alive !! \o/ !!
Re: [PATCH 5/5] add setting gs/fsbase
On Thu, Apr 20, 2023 at 4:01 PM Luca wrote: > There is likely some limitation in the pmap module in gnumach, IIRC it should > statically allocate the user L3 page tables as long as the limit is below the > first 512GB, but I never really tested it yet. That sounds plausible, yes, I've seen some sort of TODOs/FIXMEs referencing L3 page tables while grepping for VM_MAX_USER_ADDRESS. Not that I have any clue how any of this works. Please look into this! > Would it work for you to shrink temporarily the VM space, at least for > testing? Yeah, that's also what I've been thinking to try out next. I'm somewhat surprised that gcc/ld placed my static executable in the rather low part of the address space, but here it actually works in our favor. Is that actually configurable somewhere (at gcc/binutils build time, I mean)? I'm thinking we will want to map the lower 4GB of address space inaccessible like dyld does, and not just one page, to catch pointer truncation in addition to nullptr dereferences. But that would need gcc/ld to know to not use low addresses for their needs. Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Il 20 aprile 2023 12:46:51 UTC, Sergey Bugaev ha scritto: >On Thu, Apr 20, 2023 at 3:33 PM Samuel Thibault >wrote: >> Sergey Bugaev, le jeu. 20 avril 2023 15:27:15 +0300, a ecrit: >> > On Thu, Apr 20, 2023 at 3:08 PM Samuel Thibault >> > wrote: >> > > > > See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy. >> > > > >> > > > Interesting; but that one's dealing with the SHARED case, isn't it? >> > > >> > > Yes but I guess it fixed the static case too? >> > >> > For the shared case, yes, lib*user will now reference >> > __mig_memcpy from libc.so, and that one needs a simple relocation >> > without ifunc. >> >> Ah, I missed the part that said that __mig_memcpy just calls the >> (ifunc-) memcpy. >> >> I guess I shouldn't be replying when I don't actually have time to check >> all ins and outs of what I'm talking about :) > >It's alright, I'm guilty of that too :) And also indeed I have just >spent hours debugging this, so I have some mental model of what's >going on (whether a correct one is another question...), but if you >haven't looked into this for some time, you wouldn't. > >> > What would be a correct value then, 1 << 48? >> >> I don't know by heart, but something like that yes. > >Well that value is making gnumach crash before the task even runs :( > >XNU has 0x7000, which is still larger than 0x2000, so >I'm going to try that one. Though they probably mean 0x8000 >and have just left a single page of address space for the commpage. There is likely some limitation in the pmap module in gnumach, IIRC it should statically allocate the user L3 page tables as long as the limit is below the first 512GB, but I never really tested it yet. Would it work for you to shrink temporarily the VM space, at least for testing? Luxa
Re: [PATCH 5/5] add setting gs/fsbase
On Thu, Apr 20, 2023 at 3:33 PM Samuel Thibault wrote: > Sergey Bugaev, le jeu. 20 avril 2023 15:27:15 +0300, a ecrit: > > On Thu, Apr 20, 2023 at 3:08 PM Samuel Thibault > > wrote: > > > > > See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy. > > > > > > > > Interesting; but that one's dealing with the SHARED case, isn't it? > > > > > > Yes but I guess it fixed the static case too? > > > > For the shared case, yes, lib*user will now reference > > __mig_memcpy from libc.so, and that one needs a simple relocation > > without ifunc. > > Ah, I missed the part that said that __mig_memcpy just calls the > (ifunc-) memcpy. > > I guess I shouldn't be replying when I don't actually have time to check > all ins and outs of what I'm talking about :) It's alright, I'm guilty of that too :) And also indeed I have just spent hours debugging this, so I have some mental model of what's going on (whether a correct one is another question...), but if you haven't looked into this for some time, you wouldn't. > > What would be a correct value then, 1 << 48? > > I don't know by heart, but something like that yes. Well that value is making gnumach crash before the task even runs :( XNU has 0x7000, which is still larger than 0x2000, so I'm going to try that one. Though they probably mean 0x8000 and have just left a single page of address space for the commpage. Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Sergey Bugaev, le jeu. 20 avril 2023 15:27:15 +0300, a ecrit: > On Thu, Apr 20, 2023 at 3:08 PM Samuel Thibault > wrote: > > > > See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy. > > > > > > Interesting; but that one's dealing with the SHARED case, isn't it? > > > > Yes but I guess it fixed the static case too? > > For the shared case, yes, lib*user will now reference > __mig_memcpy from libc.so, and that one needs a simple relocation > without ifunc. Ah, I missed the part that said that __mig_memcpy just calls the (ifunc-) memcpy. I guess I shouldn't be replying when I don't actually have time to check all ins and outs of what I'm talking about :) > > > > > VM_MAX_USER_ADDRESS, which is defined to VM_MAX_ADDRESS, which is then > > > > > defined to 0xc000, same as on i386. That's of course too small. > > > > > Can we bump this substantially? > > > > > > > > Of course ! > > > > > > Let me actually try doing #define VM_MAX_ADDRESS KENEL_MAP_BASE... > > > > It doesn't need to be that high :) > > > > And actually it probably cannot: IIRC not all bits can be used in amd64 > > virtual addresses anyway. > > Yes, but apparently it still works for the kernel? Or is that some > other kind of address? IIRC the last really-usable bit of the virtual address is just replicated (to 0 for user and to 1 for kernel). > What would be a correct value then, 1 << 48? I don't know by heart, but something like that yes. Samuel
Re: [PATCH 5/5] add setting gs/fsbase
On Thu, Apr 20, 2023 at 3:08 PM Samuel Thibault wrote: > > > See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy. > > > > Interesting; but that one's dealing with the SHARED case, isn't it? > > Yes but I guess it fixed the static case too? We must be talking past each other -- evidently it did not fix the static case, because I'm running into that issue now. On x86_64, that is; while on i386 it is working for the two reasons I listed. Moreover, I don't see how it could have changed anything for the static case. For the shared case, yes, lib*user will now reference __mig_memcpy from libc.so, and that one needs a simple relocation without ifunc. For the static case, it's all in the same binary; the only relocations left after static linking are the ifunc relocations, and the issue I'm having is different (and happens at an earlier stage). Or in other words: the issue I am/was running into was not that ifunc resolvers wouldn't work because of yet unrelocated _rtld_global_ro or some such, but that they aren't getting called in the first place. Instead we fetch some garbage pointers from the GOT *before* they are initialized by calling the ifunc resolvers, and jump there and eventually crash. (Or are they not garbage? It's suspicious that they really point back to the PLT.) > > > > VM_MAX_USER_ADDRESS, which is defined to VM_MAX_ADDRESS, which is then > > > > defined to 0xc000, same as on i386. That's of course too small. > > > > Can we bump this substantially? > > > > > > Of course ! > > > > Let me actually try doing #define VM_MAX_ADDRESS KENEL_MAP_BASE... > > It doesn't need to be that high :) > > And actually it probably cannot: IIRC not all bits can be used in amd64 > virtual addresses anyway. Yes, but apparently it still works for the kernel? Or is that some other kind of address? What would be a correct value then, 1 << 48? Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Sergey Bugaev, le jeu. 20 avril 2023 15:05:25 +0300, a ecrit: > On Thu, Apr 20, 2023 at 2:56 PM Samuel Thibault > wrote: > > > > Hello, > > > > Sergey Bugaev, le jeu. 20 avril 2023 14:51:04 +0300, a ecrit: > > > Why was this not an issue for us on i386? > > > > See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy. > > Interesting; but that one's dealing with the SHARED case, isn't it? Yes but I guess it fixed the static case too? > > > VM_MAX_USER_ADDRESS, which is defined to VM_MAX_ADDRESS, which is then > > > defined to 0xc000, same as on i386. That's of course too small. > > > Can we bump this substantially? > > > > Of course ! > > Let me actually try doing #define VM_MAX_ADDRESS KENEL_MAP_BASE... It doesn't need to be that high :) And actually it probably cannot: IIRC not all bits can be used in amd64 virtual addresses anyway. Samuel
Re: [PATCH 5/5] add setting gs/fsbase
On Thu, Apr 20, 2023 at 2:56 PM Samuel Thibault wrote: > > Hello, > > Sergey Bugaev, le jeu. 20 avril 2023 14:51:04 +0300, a ecrit: > > Why was this not an issue for us on i386? > > See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy. Interesting; but that one's dealing with the SHARED case, isn't it? In my case everything's statically linked so circular relocations between libc and lib*user aren't an issue; but on the other hand we need to run __thread_set_state (and other MIG routines) super early, unlike in SHARED. > > VM_MAX_USER_ADDRESS, which is defined to VM_MAX_ADDRESS, which is then > > defined to 0xc000, same as on i386. That's of course too small. > > Can we bump this substantially? > > Of course ! Let me actually try doing #define VM_MAX_ADDRESS KENEL_MAP_BASE... Sergey
Re: [PATCH 5/5] add setting gs/fsbase
Hello, Sergey Bugaev, le jeu. 20 avril 2023 14:51:04 +0300, a ecrit: > Why was this not an issue for us on i386? See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy. > VM_MAX_USER_ADDRESS, which is defined to VM_MAX_ADDRESS, which is then > defined to 0xc000, same as on i386. That's of course too small. > Can we bump this substantially? Of course ! Samuel
Re: [PATCH 5/5] add setting gs/fsbase
On Wed, Apr 19, 2023 at 11:52 PM Sergey Bugaev wrote: > On Wed, Apr 19, 2023 at 10:48 PM Luca Dariz wrote: > > * i386/i386/pcb.c: switch FSBASE/GSBASE on context switch and > > implement accessors in thread setstatus/getstatus > > * i386/i386/thread.h: add new state to thread saved state > > * kern/thread.c: add i386_FSGS_BASE_STATE handler > > Hi Luca -- this is great! > > I will build gnumach with your changes and see if that works with my > latest build of x86_64-gnu glibc. Well, too bad -- as is, we don't even get to actually doing the thread_set_state () RPC. We do reach the call to __thread_set_state (), but then it uses __mig_memcpy (), which is just 'return memcpy (...)'. And (because of the static build?), that memcpy () is the full real ifunc-selected memcpy. So it jumps to the memcpy@plt, which jumps to *(mem...@got.plt), which is supposed to jump into the rtld and then run the ifunc resolver and do its smarts and eventually jump to the right memcpy... But of course none of that is happening because we haven't really started running any of rtld proper yet, we're still inside _hurd_stack_setup (). So there was no-one to apply the relocations to our .got.plt yet, and they point to who knows where. (In practice, back to the PLT.) Why had this worked for me when I was running it on Linux? Most likely I have skipped over the whole __thread_set_state () call and not just the 'syscall'. Why was this not an issue for us on i386? Two reasons: * i386_set_gdt () doesn't need memcpy; * memcpy is not ifunc-resolved in static i386 We can't run libc_start_main () (which is what calls ARCH_INIT_CPU_FEATURES and then _dl_relocate_static_pie) before we get our argv/envp and our new stack, but to do that we need MIG, and MIG will memcpy... So how do we get out of this chicken-and-egg situation? Well, like this of course: leaq __memcpy_sse2_unaligned(%rip), %rax movq %rax, memcpy@GOTPCREL(%rip) Call that either a terrible hack or a brilliant solution :) :) :) and I'm sure H.J. Lu over on libc-alpha will appreciate it when I send the patch. But -- it works! I can get through memcpy now, and __thread_set_state () succeeds and writes $fs_base as seen from GDB! So, great work on this! Now, this exposed some of my own bugs in glibc... and after quickly fixing them, I'm now stuck at trying to allocate memory. Specifically, I'd like to have memory at 0x2000 and some further, but gnumach is not letting me! -- because that's more than VM_MAX_USER_ADDRESS, which is defined to VM_MAX_ADDRESS, which is then defined to 0xc000, same as on i386. That's of course too small. Can we bump this substantially? Sergey
Re: [PATCH 5/5] add setting gs/fsbase
On Wed, Apr 19, 2023 at 10:48 PM Luca Dariz wrote: > * i386/i386/pcb.c: switch FSBASE/GSBASE on context switch and > implement accessors in thread setstatus/getstatus > * i386/i386/thread.h: add new state to thread saved state > * kern/thread.c: add i386_FSGS_BASE_STATE handler Hi Luca -- this is great! I will build gnumach with your changes and see if that works with my latest build of x86_64-gnu glibc. Have you tried running my executable again? How does it fail now? Also since we talked last time, Samuel has reviewed and pushed most of my patches, so you can *almost* build x86_64-gnu glibc from master yourself. Have you looked into why the console (when not redirecting it into a serial port) crashes? Sergey P.S. And a small tip: it would have been nice of you to cc me on this patch. In this case I'm subscribed to the list so it doesn't matter much, but still. When using git send-email, you can use --cc to cc someone on the whole series, or add a Cc: line in an individual .patch file.