Re: sparc64: problem after trap table takeover under QEMU

2014-05-25 Thread Mark Cave-Ayland

On 08/05/14 20:28, Mark Kettenis wrote:


Hi Mark,

Interesting to see sparc64 support in QEMU.


Yeah, it's been a work in progress for quite a while now. There seem to 
be two main areas of interest: firstly, people who are migrating 
away from SPARC but need to keep legacy applications running, and 
secondly, open source projects interested in testing across multiple 
architectures.



As soon as I step into address 0x1001804 then this is where things start
to go wrong; the TLB (TTE) entry for 0x180 which is accessed by %sp
is marked as privileged, but ASI 0x11 is user access only. QEMU's
current behaviour for this is to generate a datafault for the page at
0x180 which seems to get all the way through to the retry at the end
of winfixsave, but then hits the breakpoint trap above when executing
the retry.


I've finally located the source of this bug thanks to more testing,
which showed that OpenBSD 4.9 was surprisingly also able to boot
(something I missed in my original bisection). This allowed me to
track down what was happening fairly easily. The problem is caused by
the fact that 0x180 has *two* mappings in the TLB and the way in
which QEMU resolves them.

Compare the state of the TLB when the fill_0_normal trap occurs on
OpenBSD 5.5 (faults, incorrect) and OpenBSD 4.9 (no fault, correct):


OpenBSD 5.5:

(qemu) info tlb
MMU contexts: Primary: 0, Secondary: 0
DMMU dump
...
[14] VA: 180, PA: f40,   4M, priv, RW, locked, ctx 0 local
...
[42] VA: 180, PA: f40,   8k, user, RW, unlocked, ctx 0 local
...

OpenBSD 4.9:

(qemu) info tlb
MMU contexts: Primary: 0, Secondary: 0
DMMU dump
...
[08] VA: 180, PA: f40,   8k, user, RW, unlocked, ctx 0 local
...
[14] VA: 180, PA: f40,   4M, priv, RW, locked, ctx 0 local
...


The bug occurs because the QEMU TLB algorithm currently searches the TLB
*in order* starting from entry 0 until it finds a VA match.

In the OpenBSD 5.5 case, the first mapping it finds is the 4M privileged
mapping, and so the fill_0_normal trap which uses user ASI 0x11 faults
due to not being privileged. This is in contrast to the OpenBSD 4.9 case
where the first mapping it finds is the 8K unprivileged mapping, hence
the fill_0_normal trap succeeds and we proceed to boot.

Does anyone know how real hardware resolves conflicts between multiple
TLB entries with the same VA? My guess would be that the smaller 8K
mapping should take priority, but the documentation in relation to
address aliasing is fairly non-existent, so I'm wondering if there are any
other rules relating to whether privileged mappings should take priority
or not? Once the behaviour is known, it will be fairly easy to fix up
QEMU to match.


It seems that this first hypothesis was incorrect; after some help from 
the NetBSD guys we found out that all PROM mappings should default to 
privileged. So the question is no longer about the difference between 
privileged and unprivileged mappings, but why the fault occurs in the 
first place.



I don't know how the real hardware behaves.  But it certainly is the
intention that the 4M "locked" mapping gets used as soon as we've
taken over the trap table.  Not sure where the 8K mapping is coming
from.


Finally it does raise an eyebrow that the first window trap taken when
the kernel takes over the trap table is a fill_0_normal *user* trap,
particularly when it's against an *unlocked* TLB entry which could
potentially have been evicted beforehand. It might be worth
double-checking as to whether this is the intended behaviour or not.


Right.  It certainly isn't the intention that we end up in a
fill_0_normal trap at this point.  Perhaps %wstate is initialized
differently in QEMU than on real hardware?  The OpenBSD bootstrap code
does set %wstate appropriately immediately after taking over the trap
table.  We can't really do this earlier since we don't know the
conventions used by the spill and fill handlers provided by the
firmware.  But it looks like a Sun Fire T2000 actually initializes
%wstate to 0.

So perhaps we're just getting lucky on real hardware that the prom
code doesn't spill our trap frame and therefore we don't have to fill
it again.


After more work, I believe that your theory here is correct. Take a look 
at cpu_initialize() in locore.S:



/*
 * Initialize a CPU.  This is used both for bootstrapping the first CPU
 * and spinning up each subsequent CPU.  Basically:
 *
 *  Install trap table.
 *  Switch to the initial stack.
 *  Call the routine passed in in cpu_info->ci_spinup.
 */

_C_LABEL(cpu_initialize):

	wrpr	%g0, 0, %tl		! Make sure we're not in NUCLEUS mode
	flushw

	/* Change the trap base register */
	set	_C_LABEL(trapbase), %l1
#ifdef SUN4V
	sethi	%hi(_C_LABEL(cputyp)), %l0
	ld	[%l0 + %lo(_C_LABEL(cputyp))], %l0
	cmp	%l0, CPU_SUN4V
	bne,pt	%icc, 1f
	 nop
	set	_C_LABEL(trapbase_sun4v), %

Re: sparc64: problem after trap table takeover under QEMU

2014-05-08 Thread Mark Kettenis
> Date: Thu, 08 May 2014 14:44:30 +0100
> From: Mark Cave-Ayland 
> 
> On 06/05/14 19:18, Mark Cave-Ayland wrote:

Hi Mark,

Interesting to see sparc64 support in QEMU.  

> > As soon as I step into address 0x1001804 then this is where things start
> > to go wrong; the TLB (TTE) entry for 0x180 which is accessed by %sp
> > is marked as privileged, but ASI 0x11 is user access only. QEMU's
> > current behaviour for this is to generate a datafault for the page at
> > 0x180 which seems to get all the way through to the retry at the end
> > of winfixsave, but then hits the breakpoint trap above when executing
> > the retry.
> 
> I've finally located the source of this bug thanks to more testing, 
> which showed that OpenBSD 4.9 was surprisingly also able to boot 
> (something I missed in my original bisection). This allowed me to 
> track down what was happening fairly easily. The problem is caused by 
> the fact that 0x180 has *two* mappings in the TLB and the way in 
> which QEMU resolves them.
> 
> Compare the state of the TLB when the fill_0_normal trap occurs on 
> OpenBSD 5.5 (faults, incorrect) and OpenBSD 4.9 (no fault, correct):
> 
> 
> OpenBSD 5.5:
> 
> (qemu) info tlb
> MMU contexts: Primary: 0, Secondary: 0
> DMMU dump
> ...
> [14] VA: 180, PA: f40,   4M, priv, RW, locked, ctx 0 local
> ...
> [42] VA: 180, PA: f40,   8k, user, RW, unlocked, ctx 0 local
> ...
> 
> OpenBSD 4.9:
> 
> (qemu) info tlb
> MMU contexts: Primary: 0, Secondary: 0
> DMMU dump
> ...
> [08] VA: 180, PA: f40,   8k, user, RW, unlocked, ctx 0 local
> ...
> [14] VA: 180, PA: f40,   4M, priv, RW, locked, ctx 0 local
> ...
> 
> 
> The bug occurs because the QEMU TLB algorithm currently searches the TLB 
> *in order* starting from entry 0 until it finds a VA match.
> 
> In the OpenBSD 5.5 case, the first mapping it finds is the 4M privileged 
> mapping, and so the fill_0_normal trap which uses user ASI 0x11 faults 
> due to not being privileged. This is in contrast to the OpenBSD 4.9 case 
> where the first mapping it finds is the 8K unprivileged mapping, hence 
> the fill_0_normal trap succeeds and we proceed to boot.
> 
> Does anyone know how real hardware resolves conflicts between multiple 
> TLB entries with the same VA? My guess would be that the smaller 8K 
> mapping should take priority, but the documentation in relation to 
> address aliasing is fairly non-existent, so I'm wondering if there are any 
> other rules relating to whether privileged mappings should take priority 
> or not? Once the behaviour is known, it will be fairly easy to fix up 
> QEMU to match.

I don't know how the real hardware behaves.  But it certainly is the
intention that the 4M "locked" mapping gets used as soon as we've
taken over the trap table.  Not sure where the 8K mapping is coming
from.

> Finally it does raise an eyebrow that the first window trap taken when 
> the kernel takes over the trap table is a fill_0_normal *user* trap, 
> particularly when it's against an *unlocked* TLB entry which could 
> potentially have been evicted beforehand. It might be worth 
> double-checking as to whether this is the intended behaviour or not.

Right.  It certainly isn't the intention that we end up in a
fill_0_normal trap at this point.  Perhaps %wstate is initialized
differently in QEMU than on real hardware?  The OpenBSD bootstrap code
does set %wstate appropriately immediately after taking over the trap
table.  We can't really do this earlier since we don't know the
conventions used by the spill and fill handlers provided by the
firmware.  But it looks like a Sun Fire T2000 actually initializes
%wstate to 0.

So perhaps we're just getting lucky on real hardware that the prom
code doesn't spill our trap frame and therefore we don't have to fill
it again.
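
To illustrate why the initial value of %wstate matters: as far as the SPARC V9 architecture manual describes it, a fill trap uses the fill_n_normal vector with n taken from WSTATE.NORMAL when %otherwin is 0, and the fill_n_other vector with n taken from WSTATE.OTHER otherwise. The following is a minimal sketch in Python, not code from OpenBSD or QEMU. The function names are made up for illustration, and the trap base address 0x1000000 is an assumption chosen because it is consistent with the handler address 0x1001800 shown in the original report.

```python
# Sketch of SPARC V9 fill-trap vector selection (illustrative names only).

FILL_NORMAL_BASE = 0x0C0  # trap types 0x0C0..0x0DF: fill_n_normal
FILL_OTHER_BASE = 0x0E0   # trap types 0x0E0..0x0FF: fill_n_other


def fill_trap_type(wstate, otherwin):
    """Return the trap type raised by a fill, given %wstate and %otherwin."""
    normal = wstate & 0x7        # WSTATE.NORMAL, bits 2:0
    other = (wstate >> 3) & 0x7  # WSTATE.OTHER, bits 5:3
    if otherwin == 0:
        return FILL_NORMAL_BASE + 4 * normal
    return FILL_OTHER_BASE + 4 * other


def trap_vector(tba, tt, tl_before_trap=0):
    """Return the handler address: TBA<63:15> | (TL > 0)<14> | TT<13:5>."""
    tl_bit = (1 << 14) if tl_before_trap > 0 else 0
    return (tba & ~0x7FFF) | tl_bit | (tt << 5)


ASSUMED_TBA = 0x1000000  # assumption, see the note above

tt = fill_trap_type(wstate=0, otherwin=0)
print(hex(tt))                            # 0xc0, i.e. fill_0_normal
print(hex(trap_vector(ASSUMED_TBA, tt)))  # 0x1001800
```

With %wstate left at 0 and %otherwin at 0, any fill therefore lands in fill_0_normal, which matches the trap seen under QEMU.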



Re: sparc64: problem after trap table takeover under QEMU

2014-05-08 Thread Mark Cave-Ayland

On 06/05/14 19:18, Mark Cave-Ayland wrote:

(cut)


As soon as I step into address 0x1001804 then this is where things start
to go wrong; the TLB (TTE) entry for 0x180 which is accessed by %sp
is marked as privileged, but ASI 0x11 is user access only. QEMU's
current behaviour for this is to generate a datafault for the page at
0x180 which seems to get all the way through to the retry at the end
of winfixsave, but then hits the breakpoint trap above when executing
the retry.


I've finally located the source of this bug thanks to more testing, 
which showed that OpenBSD 4.9 was surprisingly also able to boot 
(something I missed in my original bisection). This allowed me to 
track down what was happening fairly easily. The problem is caused by 
the fact that 0x180 has *two* mappings in the TLB and the way in 
which QEMU resolves them.


Compare the state of the TLB when the fill_0_normal trap occurs on 
OpenBSD 5.5 (faults, incorrect) and OpenBSD 4.9 (no fault, correct):



OpenBSD 5.5:

(qemu) info tlb
MMU contexts: Primary: 0, Secondary: 0
DMMU dump
...
[14] VA: 180, PA: f40,   4M, priv, RW, locked, ctx 0 local
...
[42] VA: 180, PA: f40,   8k, user, RW, unlocked, ctx 0 local
...

OpenBSD 4.9:

(qemu) info tlb
MMU contexts: Primary: 0, Secondary: 0
DMMU dump
...
[08] VA: 180, PA: f40,   8k, user, RW, unlocked, ctx 0 local
...
[14] VA: 180, PA: f40,   4M, priv, RW, locked, ctx 0 local
...


The bug occurs because the QEMU TLB algorithm currently searches the TLB 
*in order* starting from entry 0 until it finds a VA match.


In the OpenBSD 5.5 case, the first mapping it finds is the 4M privileged 
mapping, and so the fill_0_normal trap which uses user ASI 0x11 faults 
due to not being privileged. This is in contrast to the OpenBSD 4.9 case 
where the first mapping it finds is the 8K unprivileged mapping, hence 
the fill_0_normal trap succeeds and we proceed to boot.
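
The first-match behaviour described above can be sketched roughly as follows. This is only an illustration in Python and not QEMU's actual code: TlbEntry and first_match() are hypothetical names, and the base address 0x1800000 is an assumption inferred from the %sp value 0x1800671 in the original report, since the dumps above only show the VA in abbreviated form.

```python
# Sketch of an in-order TLB search that stops at the first VA match.
from dataclasses import dataclass

PAGE_8K = 8 * 1024
PAGE_4M = 4 * 1024 * 1024
ASSUMED_BASE = 0x1800000  # assumption, see the note above


@dataclass
class TlbEntry:
    index: int        # slot number as printed by "info tlb"
    va: int           # virtual base address of the mapping
    size: int         # page size in bytes
    privileged: bool  # True for "priv", False for "user"


def first_match(tlb, va):
    """Return the lowest-numbered entry whose page covers va, or None."""
    for entry in sorted(tlb, key=lambda e: e.index):
        if entry.va <= va < entry.va + entry.size:
            return entry
    return None


openbsd55 = [
    TlbEntry(14, ASSUMED_BASE, PAGE_4M, privileged=True),
    TlbEntry(42, ASSUMED_BASE, PAGE_8K, privileged=False),
]
openbsd49 = [
    TlbEntry(8, ASSUMED_BASE, PAGE_8K, privileged=False),
    TlbEntry(14, ASSUMED_BASE, PAGE_4M, privileged=True),
]

va = 0x1800671 + 0x7FF  # address of the first load in the fill handler
for name, tlb in (("5.5", openbsd55), ("4.9", openbsd49)):
    hit = first_match(tlb, va)
    # A user ASI access faults if the matching entry is privileged.
    print(name, hit.index, "fault" if hit.privileged else "ok")
```

Both entries cover the same address, so the outcome depends purely on which slot comes first: slot 14 (privileged) for 5.5, slot 8 (user) for 4.9.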


Does anyone know how real hardware resolves conflicts between multiple 
TLB entries with the same VA? My guess would be that the smaller 8K 
mapping should take priority, but the documentation in relation to 
address aliasing is fairly non-existent, so I'm wondering if there are any 
other rules relating to whether privileged mappings should take priority 
or not? Once the behaviour is known, it will be fairly easy to fix up 
QEMU to match.


Finally it does raise an eyebrow that the first window trap taken when 
the kernel takes over the trap table is a fill_0_normal *user* trap, 
particularly when it's against an *unlocked* TLB entry which could 
potentially have been evicted beforehand. It might be worth 
double-checking as to whether this is the intended behaviour or not.



Kind regards,

Mark.



sparc64: problem after trap table takeover under QEMU

2014-05-06 Thread Mark Cave-Ayland

Hi all,

I'm currently working on a set of patches for OpenBIOS (the OF 
implementation for QEMU) in order to get the various *BSD kernels to 
boot under QEMU SPARC64 with some success, but I'm struggling with a 
privilege violation trap which occurs on the first window fill trap 
after OpenBSD takes over the trap table. This is with the latest OpenBSD 
5.5 and with my current patchset the console output looks like this:



Loading FCode image...
Loaded 4829 bytes
entry point is 0x4000
OpenBSD IEEE 1275 Bootblock 1.3
..
Jumping to entry point 0010 for type 0001...
switching to new context: entry point 0x10 stack 0xffe8aa09
>> OpenBSD BOOT 1.6
Trying bsd...
open /pci@1fe,0/pci-ata@5/ide1@600/cdrom@0:f/etc/random.seed: No such file or directory

Booting /pci@1fe,0/pci-ata@5/ide1@600/cdrom@0:f/bsd
3901336@0x100+6248@0x13b8798+3261984@0x180+932320@0x1b1c620
symbols @ 0xffc5a300 119 start=0x100

Unexpected client interface exception: -1
panic: trap type 0x101 (breakpoint): pc=1010254 npc=1010258 pstate=99110414

halted

EXIT


I asked around on IRC and it was suggested that I post the information 
here in order to get some further input on this. My feeling is that QEMU 
SPARC64 may be doing something different to real hardware but I don't 
have any to play with and this is my first dig into OpenBSD, so I'd 
really appreciate some pointers from interested parties.


The privilege violation trap occurs just after OpenBSD invokes the OF 
"SUNW,set-trap-table" call, in the epilogue of openfirmware() in 
locore.S at the final restore:


...
	rdpr	%pstate, %l0
	jmpl	%i4, %o7
	 wrpr	%g0, PSTATE_PROM|PSTATE_IE, %pstate
	wrpr	%l0, %g0, %pstate
	mov	%l1, %g1
	mov	%l2, %g2
	mov	%l3, %g3
	mov	%l4, %g4
	mov	%l5, %g5
	mov	%l6, %g6
	mov	%l7, %g7
	wrpr	%i2, 0, %pil
	ret
	 restore	%o0, %g0, %o0

What happens here is that when the final restore is executed in the 
delay slot, a fill_0_normal trap is generated which vectors into 
0x1001800 here:


(gdb) disas 0x1001800, 0x100185c
Dump of assembler code from 0x1001800 to 0x100185c:
=> 0x01001800:  wr  %g0, 0x11, %asi
   0x01001804:  ldxa  [ %sp + 0x7ff ] %asi, %l0
   0x01001808:  ldxa  [ %sp + 0x807 ] %asi, %l1
   0x0100180c:  ldxa  [ %sp + 0x80f ] %asi, %l2
   0x01001810:  ldxa  [ %sp + 0x817 ] %asi, %l3
   0x01001814:  ldxa  [ %sp + 0x81f ] %asi, %l4
   0x01001818:  ldxa  [ %sp + 0x827 ] %asi, %l5
   0x0100181c:  ldxa  [ %sp + 0x82f ] %asi, %l6
   0x01001820:  ldxa  [ %sp + 0x837 ] %asi, %l7
   0x01001824:  ldxa  [ %sp + 0x83f ] %asi, %i0
   0x01001828:  ldxa  [ %sp + 0x847 ] %asi, %i1
   0x0100182c:  ldxa  [ %sp + 0x84f ] %asi, %i2
   0x01001830:  ldxa  [ %sp + 0x857 ] %asi, %i3
   0x01001834:  ldxa  [ %sp + 0x85f ] %asi, %i4
   0x01001838:  ldxa  [ %sp + 0x867 ] %asi, %i5
   0x0100183c:  ldxa  [ %sp + 0x86f ] %asi, %fp
   0x01001840:  ldxa  [ %sp + 0x877 ] %asi, %i7
   0x01001844:  nop
   0x01001848:  sethi  %hi(0xe0018000), %g5
   0x0100184c:  ldx  [ %g5 + 0x10 ], %g5  ! 0xe0018010
   0x01001850:  ldx  [ %g5 + 0x28 ], %g5
   0x01001854:  xor  %g5, %i7, %i7
   0x01001858:  restored
End of assembler dump.
(gdb) info regi sp
sp             0x1800671        0x1800671

As soon as I step into address 0x1001804 then this is where things start 
to go wrong; the TLB (TTE) entry for 0x180 which is accessed by %sp 
is marked as privileged, but ASI 0x11 is user access only. QEMU's 
current behaviour for this is to generate a datafault for the page at 
0x180 which seems to get all the way through to the retry at the end 
of winfixsave, but then hits the breakpoint trap above when executing 
the retry.
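
As a worked check of which page the handler touches, the arithmetic can be written out as below. This is a small Python sketch with an illustrative page_base() helper; it relies on the 0x7ff offset in the first ldxa being the standard 64-bit SPARC stack bias (2047) and on the sixteen loads being 8 bytes apart, as in the disassembly above.

```python
# Which page do the fill handler's loads hit for %sp = 0x1800671?

STACK_BIAS = 0x7FF  # 2047, the offset used by the first ldxa
PAGE_8K = 8 * 1024
PAGE_4M = 4 * 1024 * 1024


def page_base(va, page_size):
    """Round va down to the start of its page (page_size is a power of 2)."""
    return va & ~(page_size - 1)


sp = 0x1800671

# Sixteen 8-byte loads: %l0-%l7 then %i0-%i7, offsets 0x7ff through 0x877.
loads = [sp + STACK_BIAS + 8 * i for i in range(16)]

print(hex(loads[0]))                       # 0x1800e70
print(hex(loads[-1]))                      # 0x1800ee8
print(hex(page_base(loads[0], PAGE_8K)))   # 0x1800000
print(hex(page_base(loads[0], PAGE_4M)))   # 0x1800000
```

All sixteen loads fall inside the same 8K page, so a single TLB entry decides whether the whole fill succeeds or faults.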


Based on this I have a couple of questions about what is happening here:

1) Is the fill_0_normal (user-level) trap the correct one? Or does 
OpenBIOS need to do something with %otherwin to invoke a 
supervisor-level trap?


2) Is the QEMU SPARC64 behaviour of invoking a data_access_exception 
when accessing supervisor memory with a user ASI correct?


FWIW I also tried some older OpenBSD ISOs and found that this behaviour 
was introduced between the 4.3 and 4.4 releases, and older releases 
don't exhibit this problem. Repeating the same test in 4.3, which is the 
last release that doesn't trap with the breakpoint error above, shows 
that the fill_0_normal trap is still invoked in the openfirmware() 
epilogue, however the stack pointer is now different:


(gdb) info regi sp
sp             0x1c09621        0x1c09621

And I can confirm that page 0x1c08000 exists in the TLB but compared to 
the current release above *isn't* marked as privileged, so no fault 
occurs and exec