Hi,

I've been trying to debug a pmap issue on ARM.  I have no solution, but
I would like to share my findings so far.

First of all, the panic/uvm_fault:


login:
uvm_fault(0xca06c858, 0, 1, 0) -> e
Fatal kernel mode data abort: 'Translation Fault (P)'
trapframe: 0xc986bd98
DFSR=00000007, DFAR=0000000c, spsr=60000113
r0 =00000000, r1 =ccaa7148, r2 =00136000, r3 =00000000
r4 =00000012, r5 =ccaa7148, r6 =32130000, r7 =00000005
r8 =3213000e, r9 =00000021, r10=cc634698, r11=c986be34
r12=cd43a9a0, ssp=c986bde8, slr=c054c1d0, pc =c054c1d0

Stopped at      pmap_enter+0x178:       ldr     r6, [r0, #0x00c]
ddb> trace
pmap_enter+0xc
        scp=0xc054c064 rlv=0xc04f60c0 (uvm_fault+0xa28)
        rsp=0xc986be38 rfp=0xc986bf54
        r10=0x00000000 r9=0x00000001 r8=0x00000000 r7=0xc050fe80
        r6=0x00000000 r5=0xc4a5aad0 r4=0x00000001
uvm_fault+0xc
        scp=0xc04f56a4 rlv=0xc0547e68 (data_abort_handler+0x264)
        rsp=0xc986bf58 rfp=0xc986bfac
        r10=0xc986bfb0 r9=0x00000007 r8=0xc986a000 r7=0x00000001
        r6=0xca06c858 r5=0x00000001 r4=0x00136000
data_abort_handler+0xc
        scp=0xc0547c10 rlv=0xc05477bc (exception_exit)
        rsp=0xc986bfb0 rfp=0xbffc2430
        r10=0x00da776c r9=0x00000001 r8=0x60165dbc r7=0x00000000
        r6=0x00000200 r5=0x00d0c920 r4=0x4188e9f4

The faulting instruction is in pmap_enter() of pmap7.c:

                        if (opg) {
                                /*
                                 * Replacing an existing mapping with a new one.
                                 * It is part of our managed memory so we
                                 * must remove it from the PV list
                                 */
                                pve = pmap_remove_pv(opg, pm, va);
    pve is NULL ->              oflags = pve->pv_flags;
                        } else
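
The register dump is consistent with a NULL pve: r0 is 0, the faulting
ldr reads from [r0, #0x00c], and DFAR is 0x0000000c.  A minimal sketch of
that arithmetic, assuming a pv entry made of four 32-bit words (next,
pmap, va, flags).  The struct and helper names are hypothetical and not
copied from pmap7.c:

```c
#include <stddef.h>	/* offsetof */
#include <stdint.h>

/*
 * Hypothetical model of a pv entry on a 32-bit ARM kernel.  Pointers are
 * written as uint32_t so the layout is identical on any host.  The field
 * order is an assumption.
 */
struct pv_entry32 {
	uint32_t pv_next;	/* next entry in the list */
	uint32_t pv_pmap;	/* owning pmap */
	uint32_t pv_va;		/* virtual address of the mapping */
	uint32_t pv_flags;	/* mapping flags */
};

/* Address that is read when loading pv_flags through the pointer pve. */
static uint32_t
pv_flags_addr(uint32_t pve)
{
	return pve + (uint32_t)offsetof(struct pv_entry32, pv_flags);
}
```

With pve = 0 this yields 0x0000000c, the address reported in DFAR.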

How did we get here?  pmap_enter() was called to enter a mapping for a
given virtual and physical address.  The VA is 0x136000.  It then checked
whether a mapping already exists at that address.  Turns out there is one!
opte (the old page table entry) is set to 0x429f202c.  This is an "invalid"
(meaning not active) page entry pointing to the physical page 0x429f2000.
Using the vm_pages struct for that physical address, we can have a look
at its virtual addresses.
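
The way opte splits into a physical page and an "invalid" marker can be
sketched as follows.  This assumes the ARMv7 short-descriptor L2 format,
where bits [31:12] of a small-page entry hold the frame and bits [1:0]
equal to 0b00 make the entry a fault entry.  The macro and function names
are illustrative and not copied from pmap7.c:

```c
#include <stdint.h>

/* Assumed layout of an ARMv7 short-descriptor L2 small-page entry. */
#define PTE_FRAME_MASK	0xfffff000u	/* bits [31:12]: physical frame */
#define PTE_TYPE_MASK	0x00000003u	/* bits [1:0]: descriptor type */
#define PTE_TYPE_INV	0x00000000u	/* 0b00: access faults */

/* Physical page address recorded in the entry. */
static uint32_t
pte_frame(uint32_t pte)
{
	return pte & PTE_FRAME_MASK;
}

/* Non-zero when the hardware would treat the entry as a fault entry. */
static int
pte_is_invalid(uint32_t pte)
{
	return (pte & PTE_TYPE_MASK) == PTE_TYPE_INV;
}
```

For 0x429f202c, pte_frame() gives 0x429f2000 and pte_is_invalid() is
true, since the low two bits of 0x2c are zero.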

vm_pages:
0xc4f86754:     429f1000
0xc4f86758:     0
0xc4f8675c:     0
0xc4f86760:     3
0xc4f86764:     0
0xc4f86768:     0
0xc4f8676c:     0
0xc4f86770:     c47263f0
0xc4f86774:     c4f86590
0xc4f86778:     c4a721d0
0xc4f8677c:     c520d200
0xc4f86780:     0
0xc4f86784:     0
0xc4f86788:     ca32b230
0xc4f8678c:     0
0xc4f86790:     0
0xc4f86794:     0
0xc4f86798:     140000
0xc4f8679c:     41
0xc4f867a0:     0
0xc4f867a4:     429f2000  <-- vm_page.phys_addr
0xc4f867a8:     0
0xc4f867ac:     cd43a9a0  <-- vm_page.mdpage.pvh_list
0xc4f867b0:     3
0xc4f867b4:     0
0xc4f867b8:     0
0xc4f867bc:     0
0xc4f867c0:     ffffffff
0xc4f867c4:     ffffffff
0xc4f867c8:     c4f50760
0xc4f867cc:     c4c67990
0xc4f867d0:     c50374d0
0xc4f867d4:     0
0xc4f867d8:     0
0xc4f867dc:     c0714248
0xc4f867e0:     cf2c6000
0xc4f867e4:     0
0xc4f867e8:     4d
0xc4f867ec:     385
0xc4f867f0:     1
0xc4f867f4:     429f3000

vm_page.mdpage.pvh_list:
0xcd43a9a0:     0         <-- next ptr in list
0xcd43a9a4:     ccaa7148  <-- pmap
0xcd43a9a8:     717c7000  <-- va
0xcd43a9ac:     b         <-- flags

This means the physical page already has a VA.  Oh, and it belongs
to the same pmap.  Have a look at register r5: that's the pmap we want
to enter a mapping for.  And it's not the kernel pmap:

ddb> print kernel_pmap_store
c073b8f4

So wait, the only VA for that physical address is 0x717c7000, but the
VA I used to look up the physical page is 0x136000.  That is weird.

So now we have a pmap_enter() for a physical address and a virtual
address, where the virtual address is already used by the same pmap
for another physical address.  But that physical address thinks a
completely different VA is using it.
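
That mismatch would explain the NULL pve.  A simplified model of the
lookup, assuming pmap_remove_pv() walks the page's pv list and matches on
both pmap and va (the real function also unlinks the entry, which is left
out here).  The struct layout and the name pv_lookup are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified pv entry; addresses are kept as plain integers. */
struct pv_entry {
	struct pv_entry *pv_next;
	uint32_t pv_pmap;
	uint32_t pv_va;
	uint32_t pv_flags;
};

/*
 * Return the entry matching (pmap, va), or NULL when the list holds no
 * such entry.
 */
static struct pv_entry *
pv_lookup(struct pv_entry *head, uint32_t pmap, uint32_t va)
{
	struct pv_entry *pve;

	for (pve = head; pve != NULL; pve = pve->pv_next) {
		if (pve->pv_pmap == pmap && pve->pv_va == va)
			return pve;
	}
	return NULL;
}
```

With a list holding only (0xccaa7148, 0x717c7000), a lookup for
(0xccaa7148, 0x136000) finds nothing and returns NULL.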

The phys address behind the existing mapping is 0x429f2000.  The phys
address it wanted to map the VA to is in r6: 0x32130000.

That's probably not enough to find the exact fault, but it's a writeup
of what I have found so far.

Patrick
