On 06/05/14 19:18, Mark Cave-Ayland wrote:

(cut)

As soon as I step into address 0x1001804 then this is where things start
to go wrong; the TLB (TTE) entry for 0x1800000 which is accessed by %sp
is marked as privileged, but ASI 0x11 is user access only. QEMU's
current behaviour for this is to generate a datafault for the page at
0x1800000 which seems to get all the way through to the retry at the end
of winfixsave, but then hits the breakpoint trap above when executing
the retry.

I've finally located the source of this bug thanks to more testing, which showed that OpenBSD 4.9 was, surprisingly, also able to boot (something I missed in my original bisection). This allowed me to track down what was happening fairly easily. The problem is caused by the fact that 0x1800000 has *two* mappings in the TLB, combined with the way in which QEMU resolves them.

Compare the state of the TLB when the fill_0_normal trap occurs on OpenBSD 5.5 (faults, incorrect) and OpenBSD 4.9 (no fault, correct):


OpenBSD 5.5:

(qemu) info tlb
MMU contexts: Primary: 0, Secondary: 0
DMMU dump
...
[14] VA: 1800000, PA: f400000,   4M, priv, RW, locked, ctx 0 local
...
[42] VA: 1800000, PA: f400000,   8k, user, RW, unlocked, ctx 0 local
...

OpenBSD 4.9:

(qemu) info tlb
MMU contexts: Primary: 0, Secondary: 0
DMMU dump
...
[08] VA: 1800000, PA: f400000,   8k, user, RW, unlocked, ctx 0 local
...
[14] VA: 1800000, PA: f400000,   4M, priv, RW, locked, ctx 0 local
...


The bug occurs because the QEMU TLB algorithm currently searches the TLB *in order* starting from entry 0 until it finds a VA match.

In the OpenBSD 5.5 case, the first mapping it finds is the 4M privileged mapping, and so the fill_0_normal trap which uses user ASI 0x11 faults due to not being privileged. This is in contrast to the OpenBSD 4.9 case where the first mapping it finds is the 8K unprivileged mapping, hence the fill_0_normal trap succeeds and we proceed to boot.
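
To make the ordering dependency concrete, here is a minimal standalone sketch of a first-match, in-order lookup. It is not QEMU's actual code: the ToyTLBEntry type, the toy_* functions and the 64-entry table size are all made up for illustration, and only the two entries shown in the dumps above are populated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TLB_SIZE 64   /* assumed table size, for illustration only */

/* Hypothetical, simplified TLB entry; not QEMU's real data structure. */
typedef struct {
    uint64_t va;     /* base virtual address of the mapping */
    uint64_t size;   /* page size in bytes (power of two) */
    bool     priv;   /* true if only privileged accesses are allowed */
    bool     valid;  /* false for unused slots */
} ToyTLBEntry;

/* True if addr falls inside the page covered by entry e. */
static bool toy_va_match(const ToyTLBEntry *e, uint64_t addr)
{
    return e->valid && (addr & ~(e->size - 1)) == e->va;
}

/* In-order search: index of the FIRST entry matching addr, or -1. */
static int toy_lookup_first(const ToyTLBEntry *tlb, size_t n, uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (toy_va_match(&tlb[i], addr)) {
            return (int)i;
        }
    }
    return -1;
}

/* A user-ASI access faults if the chosen entry is privileged or missing. */
static bool toy_user_access_faults(const ToyTLBEntry *tlb, size_t n,
                                   uint64_t addr)
{
    int i = toy_lookup_first(tlb, n, addr);
    return i < 0 || tlb[i].priv;
}

/* Slot layout copied from the OpenBSD 5.5 dump: 4M priv entry comes first. */
static const ToyTLBEntry toy_tlb_openbsd55[TOY_TLB_SIZE] = {
    [14] = { .va = 0x1800000, .size = 4u << 20, .priv = true,  .valid = true },
    [42] = { .va = 0x1800000, .size = 8u << 10, .priv = false, .valid = true },
};

/* Slot layout copied from the OpenBSD 4.9 dump: 8k user entry comes first. */
static const ToyTLBEntry toy_tlb_openbsd49[TOY_TLB_SIZE] = {
    [8]  = { .va = 0x1800000, .size = 8u << 10, .priv = false, .valid = true },
    [14] = { .va = 0x1800000, .size = 4u << 20, .priv = true,  .valid = true },
};

/* Same two mappings, different slots, different outcome. */
static void toy_self_check(void)
{
    uint64_t addr = 0x1800010;  /* arbitrary address inside the 8k page */
    assert(toy_user_access_faults(toy_tlb_openbsd55, TOY_TLB_SIZE, addr));
    assert(!toy_user_access_faults(toy_tlb_openbsd49, TOY_TLB_SIZE, addr));
}
```

Calling toy_self_check() from a main() passes silently: the same pair of mappings faults or succeeds purely depending on which slot each one happens to occupy.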

Does anyone know how real hardware resolves conflicts between multiple TLB entries with the same VA? My guess would be that the smaller 8K mapping should take priority, but the documentation on address aliasing is almost non-existent, so I'm wondering whether there are any other rules, for example whether privileged mappings should take priority or not. Once the behaviour is known, it will be fairly easy to fix up QEMU to match.
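
Purely to illustrate the guess above, and not as a statement about what real hardware does, a "smallest matching page wins" policy could look like the sketch below. The AliasTLBEntry type, alias_lookup_smallest() and the 64-entry table size are hypothetical names and values chosen for this example, not anything taken from QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ALIAS_TLB_SIZE 64   /* assumed table size, for illustration only */

/* Hypothetical, simplified TLB entry; not QEMU's real data structure. */
typedef struct {
    uint64_t va;     /* base virtual address of the mapping */
    uint64_t size;   /* page size in bytes (power of two) */
    bool     priv;   /* true if only privileged accesses are allowed */
    bool     valid;  /* false for unused slots */
} AliasTLBEntry;

/*
 * Scan every entry and return the index of the matching entry with the
 * smallest page size, or -1 if nothing matches. This encodes the GUESSED
 * priority rule only; the real hardware rule is the open question.
 */
static int alias_lookup_smallest(const AliasTLBEntry *tlb, size_t n,
                                 uint64_t addr)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        const AliasTLBEntry *e = &tlb[i];
        if (!e->valid || (addr & ~(e->size - 1)) != e->va) {
            continue;   /* unused slot or address outside this page */
        }
        if (best < 0 || e->size < tlb[best].size) {
            best = (int)i;
        }
    }
    return best;
}

/* Slot layout copied from the OpenBSD 5.5 dump. */
static const AliasTLBEntry alias_tlb_openbsd55[ALIAS_TLB_SIZE] = {
    [14] = { .va = 0x1800000, .size = 4u << 20, .priv = true,  .valid = true },
    [42] = { .va = 0x1800000, .size = 8u << 10, .priv = false, .valid = true },
};

/* Slot layout copied from the OpenBSD 4.9 dump. */
static const AliasTLBEntry alias_tlb_openbsd49[ALIAS_TLB_SIZE] = {
    [8]  = { .va = 0x1800000, .size = 8u << 10, .priv = false, .valid = true },
    [14] = { .va = 0x1800000, .size = 4u << 20, .priv = true,  .valid = true },
};

/* Under this policy both layouts select the 8k user mapping. */
static void alias_self_check(void)
{
    uint64_t addr = 0x1800010;  /* arbitrary address inside the 8k page */
    int a = alias_lookup_smallest(alias_tlb_openbsd55, ALIAS_TLB_SIZE, addr);
    int b = alias_lookup_smallest(alias_tlb_openbsd49, ALIAS_TLB_SIZE, addr);
    assert(a == 42 && !alias_tlb_openbsd55[a].priv);
    assert(b == 8 && !alias_tlb_openbsd49[b].priv);
}
```

With this rule the result no longer depends on slot order, at the cost of always scanning the whole table. Whether size, privilege or something else entirely should be the tie-breaker is exactly what needs confirming.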

Finally, it does raise an eyebrow that the first window trap taken when the kernel takes over the trap table is a fill_0_normal *user* trap, particularly when it's against an *unlocked* TLB entry which could potentially have been evicted beforehand. It might be worth double-checking whether this is the intended behaviour or not.


Kind regards,

Mark.
