Re: [PATCH 5/5] add setting gs/fsbase

2023-05-06 Thread Sergey Bugaev
On Sat, May 6, 2023 at 1:12 PM Sergey Bugaev  wrote:
> This is ld.so in the exec server task trying to dir_lookup
> ("/hurd/exec") -- it then crashes ext2fs when it tries to access the
> TCB. But before that, ext2fs starts up, spawns all those
> libports/libpager worker threads (which now start up correctly),
> resumes exec, and ld.so begins starting up, which means the SHARED /
> IS_IN (rtld) build is not super broken either \o/

I now have exec server loading and initializing and binding all the
shared libraries, then starting all the way to main (), then
trivfs_startup (), successfully handshaking with ext2fs!

Then something goes wrong; I have not yet investigated, but it's
probably because ext2fs expects to find /hurd/startup next, and I
haven't put that into my ramdisk.

Anyway, I declare this a huge success :) I'll send the glibc fixes
some time soon, and I should keep working on the Hurd patches,
hopefully there isn't too much work left, then we can upstream it, and
then test with something that resembles a more complete Debian
installation.

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-05-06 Thread Sergey Bugaev
Hello,

On Tue, Apr 25, 2023 at 1:59 PM Samuel Thibault  wrote:
> Sergey Bugaev, le mar. 25 avril 2023 13:25:02 +0300, a ecrit:
> > @@ -733,6 +734,10 @@ boolean_t thread_invoke(
> >
> > counter(c_thread_invoke_hits++);
> > (void) spl0();
> > +#ifdef __x86_64__
> > +   wrmsr(MSR_REG_FSBASE, new_thread->pcb->iss.fsbase);
> > +   wrmsr(MSR_REG_GSBASE, new_thread->pcb->iss.gsbase);
> > +#endif
>
> I guess it could belong to switch_ktss?

I'm now hitting this same issue with the fast RPC codepath:

#0  stack_handoff (old=old@entry=0x90ecd828,
new=new@entry=0x90ecd620) at ../i386/i386/pcb.c:301
#1  0x8104f6c4 in thread_handoff
(old=old@entry=0x90ecd828,
continuation=continuation@entry=0x81046f30
,
new=new@entry=0x90ecd620) at ../kern/ipc_sched.c:239
#2  0x810485b4 in mach_msg_trap (msg=0xbfffeca0,
option=, send_size=1096, rcv_size=,
rcv_name=3, time_out=,
notify=) at ../ipc/mach_msg.c:830
#3  0x81011eb2 in syscall64 () at ../x86_64/locore.S:1430

This is ld.so in the exec server task trying to dir_lookup
("/hurd/exec") -- it then crashes ext2fs when it tries to access the
TCB. But before that, ext2fs starts up, spawns all those
libports/libpager worker threads (which now start up correctly),
resumes exec, and ld.so begins starting up, which means the SHARED /
IS_IN (rtld) build is not super broken either \o/

Indeed it looks like switch_ktss is a better place for setting
fsgs_base, I'm going to try that next.

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-05-02 Thread Sergey Bugaev
On Tue, May 2, 2023 at 10:04 AM Samuel Thibault  wrote:
> I don't see any issue on 32bit gnumach:
>
> $ ./test
> Killed
>
> Perhaps with is only with 64bit gnumach

Yes, most likely it's something 64-bit specific. We know from
experience that the 32-bit gnumach survives a task_terminate call just
fine :)

But then maybe you should be checking with only a single bootstrap
task trying to terminate itself, not a full Hurd environment, which,
again, is surely known to work.

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-05-02 Thread Samuel Thibault
Hello,

Sergey Bugaev, le mer. 26 avril 2023 20:33:20 +0300, a ecrit:
> ../kern/ipc_tt.c:395: retrieve_task_self_fast: Assertion
> `task->itk_self != IP_NULL' failed.panic ../kern/debug.c:103:
> Debugger: Debugger invoked, but there isn't one!
> 
> This is after typing 'quit' in bc, which calls exit () -- I had to fix
> up _hurd_exit () in glibc a little to not crash if we don't have
> _hurd_ports. From single-stepping, it seems task_terminate () works,
> as in it tears down and deallocates the kernel task_t, but then the
> syscall (which task_terminate is) just returns back to userspace to
> the now-nonexistent task, and it keeps running. It then calls another
> syscall and that one breaks with the assertion above.
> 
> You should be able to reproduce this without glibc by just calling
> task_terminate (mach_task_self ()).

I don't see any issue on 32bit gnumach:

$ ./test
Killed

Perhaps with is only with 64bit gnumach, with the AST questions, I see
notably thread_terminate that in the thread==cur_thread case,
AST_TERMINATE is used.

Samuel



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-30 Thread Samuel Thibault
Applied, thanks!

Luca Dariz, le mer. 19 avril 2023 21:47:03 +0200, a ecrit:
> * i386/i386/i386asm.sym: add offsets for asm
> * i386/i386/pcb.c: switch FSBASE/GSBASE on context switch and
>   implement accessors in thread setstatus/getstatus
> * i386/i386/thread.h: add new state to thread saved state
> * kern/thread.c: add i386_FSGS_BASE_STATE handler
> * x86_64/locore.S: fix fs/gs handling, skipping the base address and
>   avoid resetting it by manually re-loading fs/gs
> ---
>  i386/i386/i386asm.sym |  2 +
>  i386/i386/pcb.c   | 39 +--
>  i386/i386/thread.h|  4 ++
>  kern/thread.c |  3 ++
>  x86_64/locore.S   | 89 ++-
>  5 files changed, 116 insertions(+), 21 deletions(-)
> 
> diff --git a/i386/i386/i386asm.sym b/i386/i386/i386asm.sym
> index 1b9b40bb..fd0be557 100644
> --- a/i386/i386/i386asm.sym
> +++ b/i386/i386/i386asm.sym
> @@ -108,6 +108,8 @@ offseti386_saved_stater   r12
>  offset   i386_saved_stater   r13
>  offset   i386_saved_stater   r14
>  offset   i386_saved_stater   r15
> +offset   i386_saved_stater   fsbase
> +offset   i386_saved_stater   gsbase
>  #endif
>  
>  offset   i386_interrupt_statei   eip
> diff --git a/i386/i386/pcb.c b/i386/i386/pcb.c
> index 61125fe8..8a9e3bf4 100644
> --- a/i386/i386/pcb.c
> +++ b/i386/i386/pcb.c
> @@ -51,6 +51,7 @@
>  #include "eflags.h"
>  #include "gdt.h"
>  #include "ldt.h"
> +#include "msr.h"
>  #include "ktss.h"
>  #include "pcb.h"
>  
> @@ -372,7 +373,10 @@ thread_t switch_context(
>*  Load the rest of the user state for the new thread
>*/
>   switch_ktss(new->pcb);
> -
> +#if defined(__x86_64__) && !defined(USER32)
> +wrmsr(MSR_REG_FSBASE, new->pcb->iss.fsbase);
> +wrmsr(MSR_REG_GSBASE, new->pcb->iss.gsbase);
> +#endif
>   return Switch_context(old, continuation, new);
>  }
>  
> @@ -667,7 +671,23 @@ kern_return_t thread_setstatus(
>   return ret;
>   break;
>   }
> -
> +#if defined(__x86_64__) && !defined(USER32)
> + case i386_FSGS_BASE_STATE:
> +{
> +struct i386_fsgs_base_state *state;
> +if (count < i386_FSGS_BASE_STATE_COUNT)
> +return KERN_INVALID_ARGUMENT;
> +
> +state = (struct i386_fsgs_base_state *) tstate;
> +thread->pcb->iss.fsbase = state->fs_base;
> +thread->pcb->iss.gsbase = state->gs_base;
> +if (thread == current_thread()) {
> +wrmsr(MSR_REG_FSBASE, state->fs_base);
> +wrmsr(MSR_REG_GSBASE, state->gs_base);
> +}
> +break;
> +}
> +#endif
>   default:
>   return(KERN_INVALID_ARGUMENT);
>   }
> @@ -843,7 +863,20 @@ kern_return_t thread_getstatus(
>   *count = i386_DEBUG_STATE_COUNT;
>   break;
>   }
> -
> +#if defined(__x86_64__) && !defined(USER32)
> + case i386_FSGS_BASE_STATE:
> +{
> +struct i386_fsgs_base_state *state;
> +if (*count < i386_FSGS_BASE_STATE_COUNT)
> +return KERN_INVALID_ARGUMENT;
> +
> +state = (struct i386_fsgs_base_state *) tstate;
> +state->fs_base = thread->pcb->iss.fsbase;
> +state->fs_base = thread->pcb->iss.gsbase;
> +*count = i386_FSGS_BASE_STATE_COUNT;
> +break;
> +}
> +#endif
>   default:
>   return(KERN_INVALID_ARGUMENT);
>   }
> diff --git a/i386/i386/thread.h b/i386/i386/thread.h
> index 933b43d8..b5fc5ffb 100644
> --- a/i386/i386/thread.h
> +++ b/i386/i386/thread.h
> @@ -51,6 +51,10 @@
>   */
>  
>  struct i386_saved_state {
> +#ifdef __x86_64__
> + unsigned long   fsbase;
> + unsigned long   gsbase;
> +#endif
>   unsigned long   gs;
>   unsigned long   fs;
>   unsigned long   es;
> diff --git a/kern/thread.c b/kern/thread.c
> index 392d38f8..f0cea804 100644
> --- a/kern/thread.c
> +++ b/kern/thread.c
> @@ -1472,6 +1472,9 @@ kern_return_t thread_set_state(
>   if (flavor == i386_DEBUG_STATE && thread == current_thread())
>   /* This state can be set directly for the curren thread.  */
>   return thread_setstatus(thread, flavor, new_state, 
> new_state_count);
> + if (flavor == i386_FSGS_BASE_STATE && thread == current_thread())
> + /* This state can be set directly for the curren thread.  */
> + return thread_setstatus(thread, flavor, new_state, 
> new_state_count);
>  #endif
>  
>   if (thread == THREAD_NULL || thread == current_thread())
> diff --git a/x86_64/locore.S b/x86_64/locore.S
> index 1b17d921..ba0400a3 

Re: [PATCH 5/5] add setting gs/fsbase

2023-04-26 Thread Sergey Bugaev
Hi again,

I managed to break gnumach in a new and exciting way (tm):

../kern/ipc_tt.c:395: retrieve_task_self_fast: Assertion
`task->itk_self != IP_NULL' failed.panic ../kern/debug.c:103:
Debugger: Debugger invoked, but there isn't one!

This is after typing 'quit' in bc, which calls exit () -- I had to fix
up _hurd_exit () in glibc a little to not crash if we don't have
_hurd_ports. From single-stepping, it seems task_terminate () works,
as in it tears down and deallocates the kernel task_t, but then the
syscall (which task_terminate is) just returns back to userspace to
the now-nonexistent task, and it keeps running. It then calls another
syscall and that one breaks with the assertion above.

You should be able to reproduce this without glibc by just calling
task_terminate (mach_task_self ()).

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-25 Thread Samuel Thibault
Sergey Bugaev, le mar. 25 avril 2023 13:25:02 +0300, a ecrit:
> @@ -733,6 +734,10 @@ boolean_t thread_invoke(
> 
> counter(c_thread_invoke_hits++);
> (void) spl0();
> +#ifdef __x86_64__
> +   wrmsr(MSR_REG_FSBASE, new_thread->pcb->iss.fsbase);
> +   wrmsr(MSR_REG_GSBASE, new_thread->pcb->iss.gsbase);
> +#endif

I guess it could belong to switch_ktss?

Samuel



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-25 Thread Sergey Bugaev
On Tue, Apr 25, 2023 at 12:02 PM Sergey Bugaev  wrote:
> subcode is the address of the fault, and 128 just so happens to be
> 0x80, and %fs:0x80 is of course tcb->reply_port. So it looks like
> fs_base was not restored to its proper value when context-switching to
> the thread, and gets instead set to 0, and so accessing the reply port
> to make an RPC faults.
>
> But how come all the previous RPCs worked? Well, it looks like fs_base
> is restored/preserved correctly when just doing synchronous kernel RPC
> calls, but not when the thread is continued after reading via
> io_done_thread_continue/thread_invoke/call_continuation.

This is of course a wrong way to do this, but just for the sake of testing:

diff --git a/kern/sched_prim.c b/kern/sched_prim.c
index dd0f492b..820de799 100644
--- a/kern/sched_prim.c
+++ b/kern/sched_prim.c
@@ -55,6 +55,7 @@
 #include 
 #include 
 #include 
+#include 

 #ifMACH_FIXPRI
 #include 
@@ -733,6 +734,10 @@ boolean_t thread_invoke(

counter(c_thread_invoke_hits++);
(void) spl0();
+#ifdef __x86_64__
+   wrmsr(MSR_REG_FSBASE, new_thread->pcb->iss.fsbase);
+   wrmsr(MSR_REG_GSBASE, new_thread->pcb->iss.gsbase);
+#endif
call_continuation(new_thread->swap_func);
/*NOTREACHED*/
return TRUE; /* help for the compiler */

And sure enough:

start bc: bc 1.07.1
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006, 2008, 2012-2017
Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
2 + 3
5
print "Hello from x86_64-gnu!"
Hello from x86_64-gnu!

Now's the time to admit I don't really know how to use bc :D

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-25 Thread Sergey Bugaev
Hello,

I have done some debugging and I think I know what's going on, and it
looks related to this very patch. The device_read_inband () call
actually succeeds, and glibc tries to echo the same thing back (that's
how devstream works, arguably maybe it shouldn't, but whatever); and
when it tries to do the RPC is calls
everyone's-since-recently-favorite-function, __mig_get_reply_port (),
and...

#0  i386_exception (exc=exc@entry=1, code=code@entry=2, subcode=128)
at ../i386/i386/trap.c:638
#1  0x8103410b in user_page_fault_continue (kr=kr@entry=2) at
../i386/i386/trap.c:119
#2  0x81056346 in vm_fault (map=,
vaddr=vaddr@entry=0, fault_type=1,
change_wiring=change_wiring@entry=0, resume=resume@entry=0,
continuation=continuation@entry=0x810340e0
) at ../vm/vm_fault.c:1484
#3  0x81033f5b in user_trap (regs=0x90fe3868) at
../i386/i386/trap.c:518
#4  0x81011b73 in _take_trap () at ../x86_64/locore.S:549

subcode is the address of the fault, and 128 just so happens to be
0x80, and %fs:0x80 is of course tcb->reply_port. So it looks like
fs_base was not restored to its proper value when context-switching to
the thread, and gets instead set to 0, and so accessing the reply port
to make an RPC faults.

But how come all the previous RPCs worked? Well, it looks like fs_base
is restored/preserved correctly when just doing synchronous kernel RPC
calls, but not when the thread is continued after reading via
io_done_thread_continue/thread_invoke/call_continuation.

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-24 Thread Samuel Thibault
Sergey Bugaev, le lun. 24 avril 2023 23:27:09 +0300, a ecrit:
> Alternatively, if it is some fatal crash in userland, what should I
> break on in gnumach to detect it? Somewhere in vm_fault.c?

The debug_all_traps_with_kdb variable is there for that, you can also
enable it live from kdb with 

debug traps /on

Samuel



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-24 Thread Sergey Bugaev
Hi,

On Mon, Apr 24, 2023 at 11:02 PM Luca Dariz  wrote:
> Il 24/04/23 17:19, Sergey Bugaev ha scritto:
> > Resending without the attachment, because apparently the email did not
> > make it into the list archive so maybe it didn't get to you either;
> > and I'm too conscious about email just eating my letters without any
> > notice or indication since that one time. If you didn't get the last
> > one and still want the attachment, let me know and we'll work
> > something out. And please ack receiving this one in any case.
>
> I received the previous one, including the attachment, thanks!

Good -- and it did eventually make its way into the archive too.
Interestingly, the second one got into the archive first, and it was
being displayed as a reply to an unknown message for a while, and then
the first one showed up. I don't know what's going on, perhaps
*something* was taking its time scanning my 5.6 MB binary attachment
for malware?

> I can see issues also with simpler tests, if I use the graphical
> console; actually the whole bootstrap task doesn't start.

Yes, with the graphical console (called VGA text console I believe?)
the bootstrap crashes before it even first enters user mode. So please
debug that one too.

> Did you try to
> use the serial console? It has the drawback of reconfiguring the
> terminal every time, but it seems to work for both in/out.

...but yes, since you mentioned using -nographic & console=com0 last
time, this is what I've been using.

> I tried using
> the "com" mach device with the device_read/write() rpcs.

I'm trying the "console" device, and glibc uses the _inband versions
of the RPCs, I believe. Please try my bc build and see if you can
reproduce it.

Alternatively, if it is some fatal crash in userland, what should I
break on in gnumach to detect it? Somewhere in vm_fault.c? I can't
just break on all page faults since there are a lot of them and most
are benign. Maybe on i386_exception ()? Hm, maybe I should just -D
MACH_KDB.

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-24 Thread Luca Dariz

Il 24/04/23 17:19, Sergey Bugaev ha scritto:

Resending without the attachment, because apparently the email did not
make it into the list archive so maybe it didn't get to you either;
and I'm too conscious about email just eating my letters without any
notice or indication since that one time. If you didn't get the last
one and still want the attachment, let me know and we'll work
something out. And please ack receiving this one in any case.


I received the previous one, including the attachment, thanks!


Sergey Bugaev, le sam. 22 avril 2023 15:11:30 +0300, a ecrit:

I had to patch it a bit to make it accept and use --device-port, and
to stop trying to read (fileno (stdin)). So, Luca, I'm eagerly
awaiting a fix for input :) I'm attaching the bc binary for you to
test. [Not actually attaching this time. -S]


I can see issues also with simpler tests, if I use the graphical 
console; actually the whole bootstrap task doesn't start. Did you try to 
use the serial console? It has the drawback of reconfiguring the 
terminal every time, but it seems to work for both in/out. I tried using 
the "com" mach device with the device_read/write() rpcs.



Luca



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-24 Thread Sergey Bugaev
Resending without the attachment, because apparently the email did not
make it into the list archive so maybe it didn't get to you either;
and I'm too conscious about email just eating my letters without any
notice or indication since that one time. If you didn't get the last
one and still want the attachment, let me know and we'll work
something out. And please ack receiving this one in any case.

---

On Sat, Apr 22, 2023 at 3:13 PM Samuel Thibault  wrote:
>
> Sergey Bugaev, le sam. 22 avril 2023 15:11:30 +0300, a ecrit:
> > So please suggest your favorite REPL :)
>
> bc ? :)

Sure:

GNU Mach 1.8
biosmem: physical memory map:
biosmem: 00:09f000, available
biosmem: 09fc00:0a, reserved
biosmem: 0f:10, reserved
biosmem: 10:001ffe, available
biosmem: 001ffe:002000, reserved
biosmem: 00feffc000:00ff00, reserved
biosmem: 00fffc:01, reserved
vm_page: page table size: 131024 entries (12284k)
vm_page: DMA: pages: 4080 (15M), free: 0 (0M)
vm_page: DMA: min:500 low:600 high:1000
vm_page: DMA32: pages: 61440 (240M), free: 61282 (239M)
vm_page: DMA32: min:3072 low:3686 high:6144
vm_page: DIRECTMAP: pages: 65504 (255M), free: 59399 (232M)
vm_page: DIRECTMAP: min:3275 low:3930 high:6550
com 2 out of range
RTC time is 2023-04-24 14:30:26
Pausing, debug me!
module 0: bc --device-port=${device-port} $(task-create) $(prompt-task-resume)
1 multiboot modules
task loaded: bc --device-port=1
Pausing for bc...
Hit  to resume bootstrap.

start bc: bc 1.07.1
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006, 2008, 2012-2017
Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.

(just like mach-bootstrap-hello, hangs here, does not react to input)

I had to patch it a bit to make it accept and use --device-port, and
to stop trying to read (fileno (stdin)). So, Luca, I'm eagerly
awaiting a fix for input :) I'm attaching the bc binary for you to
test. [Not actually attaching this time. -S]

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-22 Thread Samuel Thibault
Sergey Bugaev, le sam. 22 avril 2023 15:11:30 +0300, a ecrit:
> So please suggest your favorite REPL :)

bc ? :)

Samuel



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-22 Thread Sergey Bugaev
On Sat, Apr 22, 2023 at 2:17 PM Luca Dariz  wrote:
> Hi Sergey,

Hi,

> Il 20/04/23 13:51, Sergey Bugaev ha scritto:
> > On Wed, Apr 19, 2023 at 11:52 PM Sergey Bugaev  wrote:
> > We do reach the call to __thread_set_state (), but then it uses
> > __mig_memcpy (), which is just 'return memcpy (...)'. And (because of
> > the static build?), that memcpy () is the full real ifunc-selected
> > memcpy. So it jumps to the memcpy@plt, which jumps to
> > *(mem...@got.plt), which is supposed to jump into the rtld and then
> > run the ifunc resolver and do its smarts and eventually jump to the
> > right memcpy...
>
> I don't know if you already found a good solution for this, but if not
> maybe we could add an x86_64-specific plain syscall to set fsbase/gsbase
> during bootstrap, if it can make things more manageable.

Thank you, but that won't help much, since thread_set_state is not the
only offender, there are more memcpy/memmove calls further (in
RPC_exec_startup_get_info and hurdstartup.c). So we need to solve this
in userspace one way or another.

My pre-filling GOT thing seems to work well, although it hasn't
received any positive feedback from glibc maintainers. Alternatively
we could use Adhemerval Zanella Netto's suggestion to redirect memcpy
to a baseline implementation in some .c files at compile time.

Rather, please look into why device_write () hangs in ast_taken (),
and into enabling > 4 GB address space, and into the crash when not
using console=com0.

Once you figure out the hang in device_write (), I'd like to
experiment with building and running some REPL -- like maybe Python or
Bash or some Lisp; but it needs to work without forking (so Bash
wouldn't really work...) and without accessing any files (Python wants
to load its stdlib from the fs I believe). So please suggest your
favorite REPL :)

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-22 Thread Luca Dariz

Hi Sergey,

Il 20/04/23 13:51, Sergey Bugaev ha scritto:

On Wed, Apr 19, 2023 at 11:52 PM Sergey Bugaev  wrote:
We do reach the call to __thread_set_state (), but then it uses
__mig_memcpy (), which is just 'return memcpy (...)'. And (because of
the static build?), that memcpy () is the full real ifunc-selected
memcpy. So it jumps to the memcpy@plt, which jumps to
*(mem...@got.plt), which is supposed to jump into the rtld and then
run the ifunc resolver and do its smarts and eventually jump to the
right memcpy...


I don't know if you already found a good solution for this, but if not 
maybe we could add an x86_64-specific plain syscall to set fsbase/gsbase 
during bootstrap, if it can make things more manageable.



Luca



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-21 Thread Samuel Thibault
Sergey Bugaev, le ven. 21 avril 2023 17:30:12 +0300, a ecrit:
> On Fri, Apr 21, 2023 at 12:24 PM Samuel Thibault
>  wrote:
> > No, they don't :) otherwise they wouldn't respect the ABI. The AMD64
> > ABI indeed defaults to /lib/ld64.so.1, and says that Linux uses
> > /lib64/ld-linux-x86-64.so.2, and that is cast in stone, we cannot change
> > it any more.
> >
> > /lib/ld-x86-64.so.1 should be completely fine.
> 
> I don't think I understand -- so the AMD64 ABI spec requires it to be
> /lib/ld64.so.1,

It does not require it to be that, it says it should be that. But it's a
very bad idea since it doesn't permit to distinguish different 64bit
archs.

> but it also mentions that Linux uses
> /lib64/ld-linux-x86-64.so.2,

Yes, because Linux chose to use that, to encode the OS name and the arch
in the path.

> and that means we should be using /lib/ld-x86-64.so.1?

Yes, because we do want to put the arch name in the path. We could want
to put "gnu-" in it too, it's just about time to decide. After we have
some settled distribution, we should really not change that.

> Moreover, how does the ELF spec requiring a specific file path for
> INTERP make any sense at all?

For interoperability between distributions.

> Surely the .1 is the soname means something, we should be able to bump
> it to .2 when we break ABI really bad?

Yes, but we'd rather avoid that if possible.

> What if the system is not a Unix and doesn't use /lib/ or .so for its
> libraries (but still uses ELF as a format)?

Then it's not the same system and thus it's free to use whatever
interpreter path it prefers. But it should really rather stick to it
once decided, for interoperability.

Samuel



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-21 Thread Sergey Bugaev
On Fri, Apr 21, 2023 at 12:24 PM Samuel Thibault
 wrote:
> No, they don't :) otherwise they wouldn't respect the ABI. The AMD64
> ABI indeed defaults to /lib/ld64.so.1, and says that Linux uses
> /lib64/ld-linux-x86-64.so.2, and that is cast in stone, we cannot change
> it any more.
>
> /lib/ld-x86-64.so.1 should be completely fine.

I don't think I understand -- so the AMD64 ABI spec requires it to be
/lib/ld64.so.1, but it also mentions that Linux uses
/lib64/ld-linux-x86-64.so.2, and that means we should be using
/lib/ld-x86-64.so.1? How does that make any sense?

Moreover, how does the ELF spec requiring a specific file path for
INTERP make any sense at all? Surely the .1 is the soname means
something, we should be able to bump it to .2 when we break ABI really
bad? What if the system is not a Unix and doesn't use /lib/ or .so for
its libraries (but still uses ELF as a format)? Why embed a path name
into each binary at all, if the spec just defines the one true rtld
path?

And also surely the systems don't follow that part of the spec in
practice? Linux uses its own thing as we have already established, and
from other emulparams files in the binutils tree, I see that Haiku has
/system/runtime_loader, and FreeBSD apparently
/usr/libexec/ld-elf.so.1. In SerenityOS we use /usr/lib/Loader.so.

I'm not objecting to /lib/ld-x86-64.so.1, I just don't understand...

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-21 Thread Samuel Thibault
Sergey Bugaev, le ven. 21 avril 2023 12:17:54 +0300, a ecrit:
> Hello,
> 
> On Fri, Apr 21, 2023 at 12:01 AM Samuel Thibault
>  wrote:
> > BTW, which dynamic linker path do we have set for x86_64-gnu?
> 
> It's in gcc/config/i386/gnu64.h in GCC tree:
> 
> #define GNU_USER_DYNAMIC_LINKER64 "/lib/ld-x86-64.so.1"
> 
> It was added by Flavio.

Ok, so it has the version, good :)

> So let's maybe choose a single path and use it consistently?

That's the one we should be using.

> I would also expect distros to want to have a say in this.

No, they don't :) otherwise they wouldn't respect the ABI. The AMD64
ABI indeed defaults to /lib/ld64.so.1, and says that Linux uses
/lib64/ld-linux-x86-64.so.2, and that is cast in stone, we cannot change
it any more.

/lib/ld-x86-64.so.1 should be completely fine.

> For instance on Debian GNU/Linux x86_64 ld.so is at
> /usr/lib/x86_64-linux-gnu/ld-2.31.so,

That's just where the file is stored. It's normal to set a symlink where
it is actually looked up.

> Which maybe makes sense if they wish to keep binaries built on Debian
> runnable on other distros. Do we want the same for Debian GNU/Hurd,

Better have a cross-distribution ABI, yes.

Samuel



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-21 Thread Sergey Bugaev
Hello,

On Fri, Apr 21, 2023 at 12:01 AM Samuel Thibault
 wrote:
> BTW, which dynamic linker path do we have set for x86_64-gnu?

It's in gcc/config/i386/gnu64.h in GCC tree:

#define GNU_USER_DYNAMIC_LINKER64 "/lib/ld-x86-64.so.1"

It was added by Flavio.

So GCC passes -dynamic-linker /lib/ld-x86-64.so.1 to ld. On its own,
ld sets /lib/ld64.so.1 as far as I can see. glibc installs its ld.so
into $prefix/lib/ld.so.1 (i.e. /usr/lib/ld.so.1 likely, or
/lib/ld.so.1 in no-/usr configuration).

So let's maybe choose a single path and use it consistently?

I would also expect distros to want to have a say in this. For
instance on Debian GNU/Linux x86_64 ld.so is at
/usr/lib/x86_64-linux-gnu/ld-2.31.so, but the toolchains still sets
/lib64/ld-linux-x86-64.so.2 in PT_INTERP, and to make this work they
have /usr/lib64 which contains just a single symlink:

/usr/lib64/ld-linux-x86-64.so.2 -> /lib/x86_64-linux-gnu/ld-2.31.so

Which maybe makes sense if they wish to keep binaries built on Debian
runnable on other distros. Do we want the same for Debian GNU/Hurd, or
maybe we want Debian-specific GCC/binutils configuration to set
PT_INTERP to /lib/x86_64-gnu/ld.so.1 directly? I imagine the Guix
folks may also have their own preference.

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Samuel Thibault
Hello,

Sergey Bugaev, le jeu. 20 avril 2023 16:08:45 +0300, a ecrit:
> Is that actually configurable somewhere (at gcc/binutils build time, I
> mean)?

It's probably in the link script, somewhere in
/usr/lib/*/ldscripts/elf*x86_64*

> I'm thinking we will want to map the lower 4GB of address space
> inaccessible like dyld does, and not just one page, to catch pointer
> truncation in addition to nullptr dereferences.

That sounds useful indeed.

> But that would need gcc/ld to know to not use low addresses for their
> needs.

It seems that binutils' ld/emulparams/ there are already some per-OS
files: elf_x86_64_cloudabi.sh elf_x86_64_fbsd.sh elf_x86_64_haiku.sh
elf_x86_64_sol2.sh

One can probably create a gnu file that overrides 

TEXT_START_ADDR=0x40

BTW, which dynamic linker path do we have set for x86_64-gnu?

Samuel



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Sergey Bugaev
FWIW, mach-bootstrap-hello is supposed to then echo back what you type
on the console, but it doesn't work. From some debugging, it seems
that once I input something, the following device_write () call hangs
here:

(gdb)
ast_taken () at ../kern/ast.c:88
88 if (reasons & AST_NETWORK)
(gdb)
96 if (self != current_processor()->idle_thread) {
(gdb) n
98 while (thread_should_halt(self))
(gdb) l
93 * or thread_block from the idle thread.
94 */
95
96 if (self != current_processor()->idle_thread) {
97 #ifndef MIGRATING_THREADS
98 while (thread_should_halt(self))
99 thread_halt_self(thread_exception_return);
100 #endif
101
102 /*
(gdb) bt
#0  ast_taken () at ../kern/ast.c:98
#1  0x81011b8e in _return_from_trap () at ../x86_64/locore.S:571
#2  0x9f442868 in ?? ()
#3  0x81018a70 in ?? () at ../kern/sched_prim.c:924
#4  0x9f43ffb8 in ?? ()
#5  0x in ?? ()

I don't know what to make of that, but perhaps you will. Note that
again, this worked on i386.

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Luca
Il 20 aprile 2023 13:18:29 UTC, Sergey Bugaev  ha scritto:
>On Thu, Apr 20, 2023 at 4:08 PM Sergey Bugaev  wrote:
>> > Would it work for you to shrink temporarily the VM space, at least for 
>> > testing?
>>
>> Yeah, that's also what I've been thinking to try out next. I'm
>> somewhat surprised that gcc/ld placed my static executable in the
>> rather low part of the address space, but here it actually works in
>> our favor.
>
>It's alive

Wow!!!




Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Samuel Thibault
Sergey Bugaev, le jeu. 20 avril 2023 16:18:29 +0300, a ecrit:
> On Thu, Apr 20, 2023 at 4:08 PM Sergey Bugaev  wrote:
> > > Would it work for you to shrink temporarily the VM space, at least for 
> > > testing?
> >
> > Yeah, that's also what I've been thinking to try out next. I'm
> > somewhat surprised that gcc/ld placed my static executable in the
> > rather low part of the address space, but here it actually works in
> > our favor.
> 
> It's alive

!! \o/ !!



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Sergey Bugaev
On Thu, Apr 20, 2023 at 4:01 PM Luca  wrote:
> There is likely some limitation in the pmap module in gnumach, IIRC it should 
> statically allocate the user L3 page tables as long as the limit is below the 
> first 512GB, but I never really tested it yet.

That sounds plausible, yes, I've seen some sort of TODOs/FIXMEs
referencing L3 page tables while grepping for VM_MAX_USER_ADDRESS. Not
that I have any clue how any of this works.

Please look into this!

> Would it work for you to shrink temporarily the VM space, at least for 
> testing?

Yeah, that's also what I've been thinking to try out next. I'm
somewhat surprised that gcc/ld placed my static executable in the
rather low part of the address space, but here it actually works in
our favor.

Is that actually configurable somewhere (at gcc/binutils build time, I
mean)? I'm thinking we will want to map the lower 4GB of address space
inaccessible like dyld does, and not just one page, to catch pointer
truncation in addition to nullptr dereferences. But that would need
gcc/ld to know to not use low addresses for their needs.

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Luca
Il 20 aprile 2023 12:46:51 UTC, Sergey Bugaev  ha scritto:
>On Thu, Apr 20, 2023 at 3:33 PM Samuel Thibault  
>wrote:
>> Sergey Bugaev, le jeu. 20 avril 2023 15:27:15 +0300, a ecrit:
>> > On Thu, Apr 20, 2023 at 3:08 PM Samuel Thibault  
>> > wrote:
>> > > > > See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy.
>> > > >
>> > > > Interesting; but that one's dealing with the SHARED case, isn't it?
>> > >
>> > > Yes but I guess it fixed the static case too?
>> >
>> > For the shared case, yes, lib*user will now reference
>> > __mig_memcpy from libc.so, and that one needs a simple relocation
>> > without ifunc.
>>
>> Ah, I missed the part that said that __mig_memcpy just calls the
>> (ifunc-) memcpy.
>>
>> I guess I shouldn't be replying when I don't actually have time to check
>> all ins and outs of what I'm talking about :)
>
>It's alright, I'm guilty of that too :) And also indeed I have just
>spent hours debugging this, so I have some mental model of what's
>going on (whether a correct one is another question...), but if you
>haven't looked into this for some time, you wouldn't.
>
>> > What would be a correct value then, 1 << 48?
>>
>> I don't know by heart, but something like that yes.
>
>Well that value is making gnumach crash before the task even runs :(
>
>XNU has 0x7000, which is still larger than 0x2000, so
>I'm going to try that one. Though they probably mean 0x8000
>and have just left a single page of address space for the commpage.

There is likely some limitation in the pmap module in gnumach, IIRC it should 
statically allocate the user L3 page tables as long as the limit is below the 
first 512GB, but I never really tested it yet. Would it work for you to shrink 
temporarily the VM space, at least for testing?

Luxa




Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Sergey Bugaev
On Thu, Apr 20, 2023 at 3:33 PM Samuel Thibault  wrote:
> Sergey Bugaev, le jeu. 20 avril 2023 15:27:15 +0300, a ecrit:
> > On Thu, Apr 20, 2023 at 3:08 PM Samuel Thibault  
> > wrote:
> > > > > See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy.
> > > >
> > > > Interesting; but that one's dealing with the SHARED case, isn't it?
> > >
> > > Yes but I guess it fixed the static case too?
> >
> > For the shared case, yes, lib*user will now reference
> > __mig_memcpy from libc.so, and that one needs a simple relocation
> > without ifunc.
>
> Ah, I missed the part that said that __mig_memcpy just calls the
> (ifunc-) memcpy.
>
> I guess I shouldn't be replying when I don't actually have time to check
> all ins and outs of what I'm talking about :)

It's alright, I'm guilty of that too :) And also indeed I have just
spent hours debugging this, so I have some mental model of what's
going on (whether a correct one is another question...), but if you
haven't looked into this for some time, you wouldn't.

> > What would be a correct value then, 1 << 48?
>
> I don't know by heart, but something like that yes.

Well that value is making gnumach crash before the task even runs :(

XNU has 0x7000, which is still larger than 0x2000, so
I'm going to try that one. Though they probably mean 0x8000
and have just left a single page of address space for the commpage.

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Samuel Thibault
Sergey Bugaev, le jeu. 20 avril 2023 15:27:15 +0300, a ecrit:
> On Thu, Apr 20, 2023 at 3:08 PM Samuel Thibault  
> wrote:
> > > > See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy.
> > >
> > > Interesting; but that one's dealing with the SHARED case, isn't it?
> >
> > Yes but I guess it fixed the static case too?
> 
> For the shared case, yes, lib*user will now reference
> __mig_memcpy from libc.so, and that one needs a simple relocation
> without ifunc.

Ah, I missed the part that said that __mig_memcpy just calls the
(ifunc-) memcpy.

I guess I shouldn't be replying when I don't actually have time to check
all ins and outs of what I'm talking about :)

> > > > > VM_MAX_USER_ADDRESS, which is defined to VM_MAX_ADDRESS, which is then
> > > > > defined to 0xc000, same as on i386. That's of course too small.
> > > > > Can we bump this substantially?
> > > >
> > > > Of course !
> > >
> > > Let me actually try doing #define VM_MAX_ADDRESS KENEL_MAP_BASE...
> >
> > It doesn't need to be that high :)
> >
> > And actually it probably cannot: IIRC not all bits can be used in amd64
> > virtual addresses anyway.
> 
> Yes, but apparently it still works for the kernel? Or is that some
> other kind of address?

IIRC the last really-usable bit of the virtual address is just
replicated (to 0 for user and to 1 for kernel).

> What would be a correct value then, 1 << 48?

I don't know by heart, but something like that yes.

Samuel



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Sergey Bugaev
On Thu, Apr 20, 2023 at 3:08 PM Samuel Thibault  wrote:
> > > See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy.
> >
> > Interesting; but that one's dealing with the SHARED case, isn't it?
>
> Yes but I guess it fixed the static case too?

We must be talking past each other -- evidently it did not fix the
static case, because I'm running into that issue now. On x86_64, that
is; while on i386 it is working for the two reasons I listed.

Moreover, I don't see how it could have changed anything for the
static case. For the shared case, yes, lib*user will now reference
__mig_memcpy from libc.so, and that one needs a simple relocation
without ifunc. For the static case, it's all in the same binary; the
only relocations left after static linking are the ifunc relocations,
and the issue I'm having is different (and happens at an earlier
stage).

Or in other words: the issue I am/was running into was not that ifunc
resolvers wouldn't work because of yet unrelocated _rtld_global_ro or
some such, but that they aren't getting called in the first place.
Instead we fetch some garbage pointers from the GOT *before* they are
initialized by calling the ifunc resolvers, and jump there and
eventually crash. (Or are they not garbage? It's suspicious that they
really point back to the PLT.)

> > > > VM_MAX_USER_ADDRESS, which is defined to VM_MAX_ADDRESS, which is then
> > > > defined to 0xc000, same as on i386. That's of course too small.
> > > > Can we bump this substantially?
> > >
> > > Of course !
> >
> > Let me actually try doing #define VM_MAX_ADDRESS KENEL_MAP_BASE...
>
> It doesn't need to be that high :)
>
> And actually it probably cannot: IIRC not all bits can be used in amd64
> virtual addresses anyway.

Yes, but apparently it still works for the kernel? Or is that some
other kind of address?

What would be a correct value then, 1 << 48?

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Samuel Thibault
Sergey Bugaev, le jeu. 20 avril 2023 15:05:25 +0300, a ecrit:
> On Thu, Apr 20, 2023 at 2:56 PM Samuel Thibault  
> wrote:
> >
> > Hello,
> >
> > Sergey Bugaev, le jeu. 20 avril 2023 14:51:04 +0300, a ecrit:
> > > Why was this not an issue for us on i386?
> >
> > See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy.
> 
> Interesting; but that one's dealing with the SHARED case, isn't it?

Yes but I guess it fixed the static case too?

> > > VM_MAX_USER_ADDRESS, which is defined to VM_MAX_ADDRESS, which is then
> > > defined to 0xc000, same as on i386. That's of course too small.
> > > Can we bump this substantially?
> >
> > Of course !
> 
> Let me actually try doing #define VM_MAX_ADDRESS KENEL_MAP_BASE...

It doesn't need to be that high :)

And actually it probably cannot: IIRC not all bits can be used in amd64
virtual addresses anyway.

Samuel



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Sergey Bugaev
On Thu, Apr 20, 2023 at 2:56 PM Samuel Thibault  wrote:
>
> Hello,
>
> Sergey Bugaev, le jeu. 20 avril 2023 14:51:04 +0300, a ecrit:
> > Why was this not an issue for us on i386?
>
> See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy.

Interesting; but that one's dealing with the SHARED case, isn't it? In
my case everything's statically linked so circular relocations between
libc and lib*user aren't an issue; but on the other hand we need to
run __thread_set_state (and other MIG routines) super early, unlike in
SHARED.

> > VM_MAX_USER_ADDRESS, which is defined to VM_MAX_ADDRESS, which is then
> > defined to 0xc000, same as on i386. That's of course too small.
> > Can we bump this substantially?
>
> Of course !

Let me actually try doing #define VM_MAX_ADDRESS KENEL_MAP_BASE...

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Samuel Thibault
Hello,

Sergey Bugaev, le jeu. 20 avril 2023 14:51:04 +0300, a ecrit:
> Why was this not an issue for us on i386?

See 56010b73e81e2cb1082e418699f98353598fe671 and its __mig_memcpy.

> VM_MAX_USER_ADDRESS, which is defined to VM_MAX_ADDRESS, which is then
> defined to 0xc000, same as on i386. That's of course too small.
> Can we bump this substantially?

Of course !

Samuel



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-20 Thread Sergey Bugaev
On Wed, Apr 19, 2023 at 11:52 PM Sergey Bugaev  wrote:
> On Wed, Apr 19, 2023 at 10:48 PM Luca Dariz  wrote:
> > * i386/i386/pcb.c: switch FSBASE/GSBASE on context switch and
> >   implement accessors in thread setstatus/getstatus
> > * i386/i386/thread.h: add new state to thread saved state
> > * kern/thread.c: add i386_FSGS_BASE_STATE handler
>
> Hi Luca -- this is great!
>
> I will build gnumach with your changes and see if that works with my
> latest build of x86_64-gnu glibc.

Well, too bad -- as is, we don't even get to actually doing the
thread_set_state () RPC.

We do reach the call to __thread_set_state (), but then it uses
__mig_memcpy (), which is just 'return memcpy (...)'. And (because of
the static build?), that memcpy () is the full real ifunc-selected
memcpy. So it jumps to the memcpy@plt, which jumps to
*(mem...@got.plt), which is supposed to jump into the rtld and then
run the ifunc resolver and do its smarts and eventually jump to the
right memcpy...

But of course none of that is happening because we haven't really
started running any of rtld proper yet, we're still inside
_hurd_stack_setup (). So there was no-one to apply the relocations to
our .got.plt yet, and they point to who knows where. (In practice,
back to the PLT.)

Why had this worked for me when I was running it on Linux? Most likely
I have skipped over the whole __thread_set_state () call and not just
the 'syscall'. Why was this not an issue for us on i386? Two reasons:

* i386_set_gdt () doesn't need memcpy;
* memcpy is not ifunc-resolved in static i386

We can't run libc_start_main () (which is what calls
ARCH_INIT_CPU_FEATURES and then _dl_relocate_static_pie) before we
get our argv/envp and our new stack, but to do that we need MIG, and
MIG will memcpy...

So how do we get out of this chicken-and-egg situation? Well, like
this of course:

leaq __memcpy_sse2_unaligned(%rip), %rax
movq %rax, memcpy@GOTPCREL(%rip)

Call that either a terrible hack or a brilliant solution :) :) :)
and I'm sure H.J. Lu over on libc-alpha will appreciate it when I
send the patch. But -- it works! I can get through memcpy now, and
__thread_set_state () succeeds and writes $fs_base as seen from GDB!
So, great work on this!

Now, this exposed some of my own bugs in glibc... and after quickly
fixing them, I'm now stuck at trying to allocate memory. Specifically,
I'd like to have memory at 0x2000 and some further, but
gnumach is not letting me! -- because that's more than
VM_MAX_USER_ADDRESS, which is defined to VM_MAX_ADDRESS, which is then
defined to 0xc000, same as on i386. That's of course too small.
Can we bump this substantially?

Sergey



Re: [PATCH 5/5] add setting gs/fsbase

2023-04-19 Thread Sergey Bugaev
On Wed, Apr 19, 2023 at 10:48 PM Luca Dariz  wrote:
> * i386/i386/pcb.c: switch FSBASE/GSBASE on context switch and
>   implement accessors in thread setstatus/getstatus
> * i386/i386/thread.h: add new state to thread saved state
> * kern/thread.c: add i386_FSGS_BASE_STATE handler

Hi Luca -- this is great!

I will build gnumach with your changes and see if that works with my
latest build of x86_64-gnu glibc.

Have you tried running my executable again? How does it fail now?

Also since we talked last time, Samuel has reviewed and pushed most of
my patches, so you can *almost* build x86_64-gnu glibc from master
yourself.

Have you looked into why the console (when not redirecting it into a
serial port) crashes?

Sergey

P.S. And a small tip: it would have been nice of you to cc me on this
patch. In this case I'm subscribed to the list so it doesn't matter
much, but still. When using git send-email, you can use --cc to cc
someone on the whole series, or add a Cc: line in an individual .patch
file.