> However, on amd64, with the diff applied the kernel faults when writing
> to curproc.  In the trace below statclock+0x108 corresponds to
> tu_enter(&p->p_tu) in statclock().

I have tried this and it fails even earlier for me.

The uvm_map_protect() call in kern_exec.c will now end up invoking
pmap_protect(), which is an inline function that ends up calling
pmap_write_protect(pmap_kernel(), va, va + PAGE_SIZE).
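For reference, that inline looks roughly like this (paraphrased from my
reading of arch/amd64/include/pmap.h, so the exact guards may differ):

        /*
         * pmap_protect(): any range that keeps some access is handed
         * straight to pmap_write_protect(), whatever the pmap is.
         */
        static __inline void
        pmap_protect(struct pmap *pmap, vaddr_t sva, vaddr_t eva,
            vm_prot_t prot)
        {
                if ((prot & PROT_WRITE) == 0) {
                        if (prot & (PROT_READ | PROT_EXEC))
                                pmap_write_protect(pmap, sva, eva, prot);
                        else
                                pmap_remove(pmap, sva, eva);
                }
        }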

In my case, va = 0xffff.8000.4210.c000, which is in kernel space.
However, at pmap_write_protect+0x213, which is the pmap_pte_clearbits()
macro expansion here in the loop:

                for (/*null */; spte < epte ; spte++) {
                        if (!pmap_valid_entry(*spte))
                                continue;
                        pmap_pte_clearbits(spte, clear);
                        pmap_pte_setbits(spte, set);
                }

we end up with spte == 0x7fffe.c000.8000, which is BELOW the kernel (and
*spte == 0x464c457f == the ELF signature). Therefore the attempt to flip
bits at this bogus address faults, pcb_onfault is (correctly) not set,
and kpageflttrap() panics.
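For context, pmap_pte_clearbits() on amd64 is, as far as I can tell,
just an atomic read-modify-write of the PTE:

        /* roughly what pmap_pte_clearbits(spte, clear) expands to */
        x86_atomic_clearbits_u64(spte, clear);  /* i.e. *spte &= ~clear */

so the very first store through the bogus spte pointer is what faults.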

Now if you look at the beginning of pmap_write_protect(), it does this:

        /* should be ok, but just in case ... */
        sva &= PG_FRAME;
        eva &= PG_FRAME;

and I'm afraid I don't understand this. My understanding is that
PG_FRAME is a mask meant to be applied to physical addresses, not to
virtual addresses!
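As far as I can tell its intended use is to strip the flag bits off a
PTE and recover the physical frame, i.e. something like (my reading,
not a quote from the tree):

        pt_entry_t pte = *spte;
        paddr_t pa = pte & PG_FRAME;    /* physical frame, flag bits gone */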

Because of this, my initial page address, passed as sva, gets
"normalized" from 0xffff.8000.4210.c000 to 0x000f.8000.4210.c000, which
is now LOWER than VM_MIN_KERNEL_ADDRESS and is no longer a correctly
sign-extended (canonical) address.

Is the PG_FRAME masking really only intended to mask off the low-order
bits, and should it use ~PAGE_MASK instead?
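To make the difference concrete, assuming the usual amd64 definitions
(PG_FRAME == 0x000ffffffffff000, PAGE_MASK == 0xfff):

        vaddr_t va = 0xffff80004210c000UL;  /* my faulting kernel page */
        vaddr_t a = va & PG_FRAME;   /* 0x000f80004210c000: high bits lost */
        vaddr_t b = va & ~PAGE_MASK; /* 0xffff80004210c000: offset cleared only */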

In addition to this, the computation of `blockend' in the main loop of
that routine will clear the high-order bits (in my case, down to
0x0000.8000.4220.0000). Because the loop assumes blockend > va in order
to make progress at every iteration, this actually becomes an infinite
loop which will corrupt memory until it faults or you get tired of
waiting for it to complete.
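For reference, this is roughly how blockend drives that loop
(paraphrased, so details may be off):

        for (va = sva; va < eva; va = blockend) {
                /* L2_FRAME also masks off the high-order bits */
                blockend = (va & L2_FRAME) + NBPD_L2;
                if (blockend > eva)
                        blockend = eva;

                /* ... per-page spte/epte loop quoted above ... */
        }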

This STRONGLY hints that this routine has never been used on
pmap_kernel() addresses until now.

Can anyone with some amd64 mmu knowledge confirm this analysis and do
the required work to make that routine cope with non-userland
addresses?
