Re: [PATCH v4 0/7 gnumach] x86_64 SMP

2026-01-18 Thread Samuel Thibault
Hello,

Michael Kelly, on Sun. 18 Jan. 2026 08:04:57 +, wrote:
> On 18/01/2026 03:46, Damien Zammit wrote:
> > There are two known remaining issues with this patchset:
> > 
> > 1. There is a problem with the recent changes regarding
> > _Xmach_port_set_ktype -> ipc_kobject_set that causes a hang
> > only on SMP kernels, which can be temporarily worked around by
> > reverting commit:
> > fdfca0e8 "Add mach_port_set_ktype RPC to set ktype of a user port"
> > but we really ought to dig into it and fix this.
> 
> This seems to be a case of trying to acquire a lock that is already held.
> The function mach_port_set_ktype() in mach_port.c acquires the ip_lock on
> the 'port' indirectly via ipc_object_translate(). The call to
> ipc_kobject_set() then tries to acquire the lock that is already held. As a
> proof-of-concept workaround I added an ipc_kobject_set_unlocked() that does
> the same as ipc_kobject_set() but without the lock/unlock. This alteration
> allowed the i386/smp to boot to console login.
> 
> Which of these methods is the best way of fixing this properly?
> 
> 1) Drop the port lock before the call to ipc_kobject_set() in some safe way.
> 
> 2) As above with a version of ipc_kobject_set() that requires the port to be
> already locked.

Yes, an internal version that assumes the port is locked will be fine.
Call it ipc_kobject_set_locked, however, to be consistent with other
such functions (e.g. vm_object_reference_locked etc.). And make
ipc_kobject_set call it, to avoid duplicating code.
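
For illustration, the split suggested above could look roughly like the
sketch below. It is a user-space stand-in, not gnumach code: struct
demo_port, the demo_* function names and the pthread mutex (playing the
role of ip_lock) are all assumptions made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a port; the mutex plays the role of ip_lock. */
struct demo_port {
    pthread_mutex_t lock;
    void *kobject;
    int ktype;
};

/* Internal variant: the caller must already hold p->lock. */
static void demo_kobject_set_locked(struct demo_port *p, void *kobject, int ktype)
{
    p->kobject = kobject;
    p->ktype = ktype;
}

/* Public variant: takes the lock, then delegates so no code is duplicated. */
static void demo_kobject_set(struct demo_port *p, void *kobject, int ktype)
{
    pthread_mutex_lock(&p->lock);
    demo_kobject_set_locked(p, kobject, ktype);
    pthread_mutex_unlock(&p->lock);
}

/* A caller that already holds the lock (as after a translate step that
   returns the object locked) uses the _locked variant directly. */
static void demo_set_ktype_while_locked(struct demo_port *p, int ktype)
{
    pthread_mutex_lock(&p->lock);
    demo_kobject_set_locked(p, p->kobject, ktype);
    pthread_mutex_unlock(&p->lock);
}
```

Callers that do not hold the lock keep using the public function; only
the path that already holds it switches to the _locked one.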

Samuel



Re: [PATCH v4 0/7 gnumach] x86_64 SMP

2026-01-18 Thread Michael Kelly

On 18/01/2026 03:46, Damien Zammit wrote:

> There are two known remaining issues with this patchset:
>
> 1. There is a problem with the recent changes regarding
> _Xmach_port_set_ktype -> ipc_kobject_set that causes a hang
> only on SMP kernels, which can be temporarily worked around by
> reverting commit:
> fdfca0e8 "Add mach_port_set_ktype RPC to set ktype of a user port"
> but we really ought to dig into it and fix this.


This seems to be a case of trying to acquire a lock that is already
held. The function mach_port_set_ktype() in mach_port.c acquires the
ip_lock on the 'port' indirectly via ipc_object_translate(). The call to
ipc_kobject_set() then tries to acquire the same lock again. As a
proof-of-concept workaround I added an ipc_kobject_set_unlocked() that
does the same as ipc_kobject_set() but without the lock/unlock. This
alteration allowed the i386/smp to boot to console login.
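
The double acquisition can be reproduced in miniature in user space. The
sketch below is only an analogy, not the kernel code: struct relock_port
and the relock_* names are made up for this example, and an
error-checking pthread mutex stands in for ip_lock so that the second
lock attempt reports EDEADLK instead of hanging the way a plain
non-recursive lock would.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a port and its lock. */
struct relock_port {
    pthread_mutex_t lock;
    int ktype;
};

/* Use an error-checking mutex so a relock by the owner returns EDEADLK. */
static int relock_port_init(struct relock_port *p)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(&p->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    p->ktype = 0;
    return rc;
}

/* Setter that takes the lock itself, like the function in the backtrace. */
static int relock_set_ktype(struct relock_port *p, int ktype)
{
    int rc = pthread_mutex_lock(&p->lock);
    if (rc != 0)
        return rc;              /* EDEADLK if the caller already holds it */
    p->ktype = ktype;
    return pthread_mutex_unlock(&p->lock);
}

/* Caller that already holds the lock and then calls the locking setter. */
static int relock_caller_holding_lock(struct relock_port *p, int ktype)
{
    int rc = pthread_mutex_lock(&p->lock);
    if (rc != 0)
        return rc;
    rc = relock_set_ktype(p, ktype);    /* second acquisition fails here */
    pthread_mutex_unlock(&p->lock);
    return rc;
}
```

With the lock already held, relock_caller_holding_lock() gets EDEADLK
back from the setter and the ktype is left unchanged; calling
relock_set_ktype() without holding the lock succeeds.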


Which of these methods is the best way of fixing this properly?

1) Drop the port lock before the call to ipc_kobject_set() in some safe way.

2) As above with a version of ipc_kobject_set() that requires the port 
to be already locked.


3) Something else.

Mike.




[PATCH v4 0/7 gnumach] x86_64 SMP

2026-01-17 Thread Damien Zammit


Hi all,

Nice to talk to everyone at the online Hurd party!

Thanks to Samuel for the fine-grained review/debugging yesterday.

With this v4 patch series, which supersedes all my previous attempts,
we have 64b SMP booting to userspace (well, almost...).

There are two known remaining issues with this patchset:

1. There is a problem with the recent changes regarding
   _Xmach_port_set_ktype -> ipc_kobject_set that causes a hang
   only on SMP kernels, which can be temporarily worked around by
   reverting commit:
   fdfca0e8 "Add mach_port_set_ktype RPC to set ktype of a user port"
   but we really ought to dig into it and fix this.

2. There is a page fault in 64b SMP when attempting to boot with
   multiple cpus, but this is a vast improvement over not being
   able to compile a 64b SMP kernel at all.

TESTED:

UP+apic i386: boots

UP+apic x86_64: boots

SMP+apic -smp 1 i386
[   3.2700050] cd0(ahcisata0:2:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100) (using DMA), NCQ (31 tags)
^D
Stopped at  ipc_kobject_set+0x22:   movl    0(%edx),%eax
ipc_kobject_set(f5fac168,0,1c,f5409ec4,f5bf4e78)+0x22
mach_port_set_ktype(c10be720,f5bf4e70,4d,1,1c)+0x87
_Xmach_port_set_ktype(f6185010,fa0aa010,1,f6185048,3b9ac9ff)+0xab
ipc_kobject_server(f6185000,f5bf4e70,f5bdbea0,0)+0x93
mach_msg_trap(bfffbba4,3,38,20,5)+0x73f
> user space <
db{0}>
(but boots with mentioned commit reverted)

SMP+apic -smp 6 i386
[   3.2900050] cd0(ahcisata0:2:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100) (using DMA), NCQ (31 tags)
^D
Stopped at  ipc_kobject_set+0x24:   testl   %eax,%eax
ipc_kobject_set(f67fb530,0,1c,f5bdbec4,f5bf4e78)+0x24
mach_port_set_ktype(c10be720,f5bf4e70,4d,1,1c)+0x87
_Xmach_port_set_ktype(f6195010,fa09a010,1,f6195048,3b9ac9ff)+0xab
ipc_kobject_server(f6195000,f5bf4e70,f5bd5ea0,0)+0x93
mach_msg_trap(bfffbba4,3,38,20,5)+0x73f
> user space <
db{0}>
(but boots with mentioned commit reverted)

SMP+apic -smp 1 x86_64
[   3.2800050] cd0(ahcisata0:2:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100) (using DMA), NCQ (31 tags)
^D
Stopped at  ipc_kobject_set+0x12:   TODO
ipc_kobject_set(...)+0x12
_Xmach_port_set_ktype(...)+0xc3
ipc_kobject_server(...)+0xac
mach_msg_trap(...)+0x7b3
syscall64(...)+0xea
> user space <
db{0}>
(but boots with mentioned commit reverted)

SMP+apic -smp 6 x86_64
[   1.050] ahcisata0: 64-bit DMA
trace/tu
Debugger(...)+0x15
Panic(...)+0x10f
kernel_trap(dc2e0e78)+0x25c
> Page fault (14) for  400036 at 0x400036 <
0x400036(
no memory is assigned to address 400036
...)
> user space <
0x0()
db{4}>
(does not quite boot)

Thanks,
Damien