Re: sparc64 boot issue on qemu

2020-06-01 Thread Mark Cave-Ayland
On 01/06/2020 21:08, Jason A. Donenfeld wrote:

> On Mon, Jun 1, 2020 at 1:54 PM Mark Cave-Ayland
>  wrote:
>>
>> On 01/06/2020 08:23, Jason A. Donenfeld wrote:
>>
>>> On Sun, May 31, 2020 at 3:18 AM Mark Cave-Ayland
>>>  wrote:
>>>>
>>>> AFAICT the problem here is the Forth being used at
>>>> https://github.com/openbsd/src/blob/master/sys/arch/sparc64/dev/fb.c#L511:
>>>> since the addr word isn't part of the IEEE-1275 specification, it is
>>>> currently unimplemented in OpenBIOS.
>>>
>>> Actually, it looks to me like after this line runs:
>>>
>>> OF_interpret("stdout @ is my-self "
>>> "addr char-height addr char-width "
>>> "addr window-top addr window-left",
>>> 4, &windowleft, &windowtop, &romwidth, &romheight);
>>>
>>> windowleft and windowtop contain legit addresses, but romwidth and
>>> romheight have garbage in them. It might be possible to chalk this up
>>> to bogus QEMU firmware, in which case, whatever.
>>
>> Sadly I think that's more due to luck than anything else. If you have a
>> working boot loader then can you try booting qemu-system-sparc64 with
>> -prom-env 'auto-boot?=false' and then entering the following definition of
>> addr at the Forth prompt:
>>
>> : addr
>>   parse-word $find if
>>     cell +
>>   then
>> ;
>>
>> followed by:
>>
>> boot
>>
>> That should give you a definition of addr that will return the address of a
>> value type in Forth.
> 
> Wow, that's magic, and works perfectly:
> https://data.zx2c4.com/openbsd-qemu-sparc64-pretty-vga-with-serif-font.png
> Pretty font too.
> 
> It sounds like the issue we're facing here is that the addr function
> is missing from QEMU's firmware? Would it be quasi interesting to
> remove use of it from OpenBSD? Or should we take this over to QEMU
> instead and get it implemented?

Oh wow it looks great! I also have commit access to OpenBIOS so I can tidy that
up and get it posted over on the OpenBIOS mailing list. Probably the main thing
is to figure out what to do if the specified word doesn't exist. I'll also try
and find a few mins to fire up my Mac Mini to see if it exists there to work out
if it should be restricted to SPARC only.

Note that I did my last merge a few days ago so it will be a little while before
it hits QEMU git master, but I can certainly get it added in time for the next
official QEMU release.


ATB,

Mark.



Re: sparc64 boot issue on qemu

2020-06-01 Thread Mark Cave-Ayland
On 01/06/2020 08:23, Jason A. Donenfeld wrote:

> On Sun, May 31, 2020 at 3:18 AM Mark Cave-Ayland
>  wrote:
>>
>> AFAICT the problem here is the Forth being used at
>> https://github.com/openbsd/src/blob/master/sys/arch/sparc64/dev/fb.c#L511:
>> since the addr word isn't part of the IEEE-1275 specification, it is
>> currently unimplemented in OpenBIOS.
> 
> Actually, it looks to me like after this line runs:
> 
> OF_interpret("stdout @ is my-self "
> "addr char-height addr char-width "
> "addr window-top addr window-left",
> 4, &windowleft, &windowtop, &romwidth, &romheight);
> 
> windowleft and windowtop contain legit addresses, but romwidth and
> romheight have garbage in them. It might be possible to chalk this up
> to bogus QEMU firmware, in which case, whatever.

Sadly I think that's more due to luck than anything else. If you have a working
boot loader then can you try booting qemu-system-sparc64 with -prom-env
'auto-boot?=false' and then entering the following definition of addr at the
Forth prompt:

: addr
  parse-word $find if
    cell +
  then
;

followed by:

boot

That should give you a definition of addr that will return the address of a
value type in Forth.
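
As a rough C model of what that definition does (a sketch only: the struct layout, `model_addr` and `read_metric` are hypothetical names, and it assumes a value word is stored as one code-field cell followed by one data cell, which is what "$find if cell + then" implies):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t cell_t;	/* one Forth cell on sparc64 */

/* Assumed layout of a Forth "value" word: code field, then data. */
struct value_word {
	cell_t code_field;	/* what the interpreter runs for this word */
	cell_t data;		/* the current value, e.g. char-height */
};

/* Model of "$find if cell + then": step over the code field. */
cell_t
model_addr(const struct value_word *xt)
{
	assert(xt != NULL);
	return (cell_t)(uintptr_t)xt + sizeof(cell_t);
}

/* What the client then does: dereference the address as a 64-bit cell. */
int
read_metric(cell_t address)
{
	return (int)*(const uint64_t *)(uintptr_t)address;
}
```

The model ignores the case where $find fails; in the Forth definition that path leaves the name string on the stack rather than an address.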


ATB,

Mark.



Re: sparc64 boot issue on qemu

2020-05-31 Thread Mark Cave-Ayland
On 31/05/2020 15:58, Theo de Raadt wrote:

>> AFAICT the problem here is the Forth being used at
>> https://github.com/openbsd/src/blob/master/sys/arch/sparc64/dev/fb.c#L511:
>> since the addr word isn't part of the IEEE-1275 specification, it is
>> currently unimplemented in OpenBIOS.
>>
>> Why is addr needed here? Does the fb.c driver try and change these values
>> rather than just read them?
> 
> Why does that matter?
> 
> sparc64 isn't an IEEE-1275 openfirmware.
> 
> It is a Sun openfirmware, meaning it is more than the vague
> specification.  An emulation must be able to emulate THE REAL HARDWARE.
> 
> This should work.

Well there are plenty of Sun-isms already included in OpenBIOS to enable Solaris
to boot as far as it does; I'm not against them, I was just commenting that this
was the reason why it is currently unimplemented.

> For another 64-bit cell_t usage see dev/prtc.c.
> 
> For another "addr" usage, see romgetcursoraddr()

A simple addr implementation for Forth values should be fairly easy to put
together. Since I don't have access to any Sun hardware, can someone confirm the
semantics of the addr word for me? In particular what does it return for:

- Values (presumably this is a pointer to a 64-bit value?)
- Defers (does it return a pointer to the deferred word?)
- Words  (is it the same as the ' word?)


ATB,

Mark.



Re: sparc64 boot issue on qemu

2020-05-31 Thread Mark Cave-Ayland
On 30/05/2020 00:19, Jason A. Donenfeld wrote:

> Note that you need to run this with `-nographic`, because the kernel
> crashes when trying to use vgafb on sparc64/qemu. I've witnessed two
> varieties of crashes:
> 
> - https://data.zx2c4.com/openbsd-6.7-sparc64-vga-panic-miniroot67.png
> This happens when booting up miniroot67.fs
> 
> - https://data.zx2c4.com/openbsd-6.7-sparc64-vga-panic-after-installation.png
> This happens after installing OpenBSD onto disk properly, and then
> booting up into it.
> 
> Passing `-nographic` prevents these from happening, since vgafb doesn't
> bind to anything.
> 
> I don't have a bsd.gdb in order to addr2line this, but if the miniroot
> panic is related to the normal panic, and we then assume alignment
> issues in fb_get_console_metrics, then I wonder if the below patch would
> make a difference. On the other hand, a "data access fault" makes it
> seem more likely that OF_interpret is just getting bogus addresses from
> buggy qemu firmware.
> 
> I probably have another two hours to go in waiting for this thing to
> build...
> 
> Jason
> 
> --- a/sys/arch/sparc64/dev/fb.c
> +++ b/sys/arch/sparc64/dev/fb.c
> @@ -507,6 +507,7 @@ int
>  fb_get_console_metrics(int *fontwidth, int *fontheight, int *wtop, int *wleft)
>  {
>   cell_t romheight, romwidth, windowtop, windowleft;
> + uint64_t romheight_64, romwidth_64, windowtop_64, windowleft_64;
> 
>   /*
>* Get the PROM font metrics and address
> @@ -520,10 +521,15 @@ fb_get_console_metrics(int *fontwidth, int *fontheight, int *wtop, int *wleft)
>   windowtop == 0 || windowleft == 0)
>   return (1);
> 
> - *fontwidth = (int)*(uint64_t *)romwidth;
> - *fontheight = (int)*(uint64_t *)romheight;
> - *wtop = (int)*(uint64_t *)windowtop;
> - *wleft = (int)*(uint64_t *)windowleft;
> + memcpy(&romheight_64, (void *)romheight, sizeof(romheight_64));
> + memcpy(&romwidth_64, (void *)romwidth, sizeof(romwidth_64));
> + memcpy(&windowtop_64, (void *)windowtop, sizeof(windowtop_64));
> + memcpy(&windowleft_64, (void *)windowleft, sizeof(windowleft_64));
> +
> + *fontwidth = (int)romwidth_64;
> + *fontheight = (int)romheight_64;
> + *wtop = (int)windowtop_64;
> + *wleft = (int)windowleft_64;
> 
>   return (0);
>  }

AFAICT the problem here is the Forth being used at
https://github.com/openbsd/src/blob/master/sys/arch/sparc64/dev/fb.c#L511:
since the addr word isn't part of the IEEE-1275 specification, it is currently
unimplemented in OpenBIOS.

Why is addr needed here? Does the fb.c driver try and change these values rather
than just read them?
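
For reference, the memcpy approach from the quoted patch can be sketched in isolation like this. It is only an illustration: `read_cell_as_int` is a hypothetical helper, not something in fb.c, and it only addresses alignment. If the firmware never produced a real address because addr is missing, copying from it fails just as the cast-and-dereference does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t cell_t;	/* firmware cell holding an address */

/*
 * Read the 64-bit cell that a firmware-supplied address points at,
 * without assuming the address is 8-byte aligned.
 * Returns 0 on success, 1 if there is no usable address.
 */
int
read_cell_as_int(cell_t address, int *out)
{
	uint64_t tmp;

	if (address == 0 || out == NULL)
		return 1;	/* same style of check as the driver */
	memcpy(&tmp, (const void *)(uintptr_t)address, sizeof(tmp));
	*out = (int)tmp;
	return 0;
}
```

A caller would pass each of the four cells returned by OF_interpret through such a helper and bail out if any of them returns 1.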


ATB,

Mark.



Re: sparc64 boot issue on qemu

2020-05-31 Thread Mark Cave-Ayland
On 30/05/2020 10:54, Otto Moerbeek wrote:

> https://cdn.openbsd.org/pub/OpenBSD/snapshots/sparc64/
> contains the unpatched miniroot.
> 
> https://www.drijf.net/openbsd/disk.qcow2
> 
> is the disk image based on the miniroot containing the patch in the
> first post in this thread.
> 
> Thanks for looking into this.
> 
> Note that we did *not* observe boot failure on any real sparc64
> hardware. The bootblock changes I did for the 6.7 release were tested
> on many different machines.

Thanks for the test case which enables me to reproduce the issue. With
?fcode-verbose enabled you see this at the end of the FCode execution:

...
...
5acf :  [ 0x8b7 ]
5ad0 : b(lit) [ 0x10 ]
5ad6 :  [ 0x81e ]
5ad7 : 0= [ 0x34 ]
5ad8 : swap [ 0x49 ]
5ad9 : drop [ 0x46 ]
5ada : b?branch [ 0x14 ]
   (offset) 5
5ade : (compile)  [ 0x8c8 ]
5adf : (compile) b(>resolve) [ 0xb2 ]
OpenBSD IEEE 1275 Bootblock 2.0
Booting from device /pci@1fe,0/pci@1,1/ide@3/ide@1/cdrom@0
Try superblock read
FFS v1
ufs-open complete
.Looking for ofwboot in directory...
.
..
ofwboot
Found it
.Loading 1a1c8  bytes of file...
Copying 2000 bytes to 4000
Copying 2000 bytes to 6000
Copying 2000 bytes to 8000
Copying 2000 bytes to a000
Copying 2000 bytes to c000
Copying 2000 bytes to e000
Copying 2000 bytes to 10000
Copying 2000 bytes to 12000
Copying 2000 bytes to 14000
Copying 2000 bytes to 16000
Copying 2000 bytes to 18000
Copying 2000 bytes to 1a000
Copying 2000 bytes to 1c000
Copying 2000 bytes to 1e000
5ae0 : expect [ 0x8a ]


Now that 0x8a is completely wrong since according to
https://github.com/openbsd/src/blob/master/sys/arch/sparc64/stand/bootblk/bootblk.fth
the last instruction should be exit which is 0x33.

Since the FCode itself is located at load-base (0x4000) it looks to me from the
above debug that you're loading ofwboot at the same address, overwriting the
FCode. Once do-boot has finished executing, the FCode interpreter returns to
execute the exit word which has now been overwritten: so instead of returning to
the updated client context via exit to execute ofwboot, it executes expect which
asks for input from the keyboard and then crashes because the stack is
incorrect.

My recommendation would be to load ofwboot at 0x6000 instead of 0x4000 which I
believe will fix the issue. It's interesting you mention that this works on real
hardware, since it doesn't agree with my reading of the IEEE-1275 specification,
so you're certainly relying on some undocumented behaviour here.
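
The overlap can be checked with some simple arithmetic. The sketch below assumes the FCode image is the 6882 bytes reported by OpenBIOS elsewhere in this thread ("Loaded 6882 bytes") and that ofwboot is the 0x1a1c8 bytes shown in the log above; `ranges_overlap` and `chunks_needed` are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if [a, a+alen) and [b, b+blen) share at least one byte. */
bool
ranges_overlap(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
	return a < b + blen && b < a + alen;
}

/* Number of chunk-sized copies needed to load len bytes. */
uint64_t
chunks_needed(uint64_t len, uint64_t chunk)
{
	assert(chunk != 0);
	return (len + chunk - 1) / chunk;
}
```

With those numbers the FCode spans 0x4000 up to 0x5ae2, which lines up with the final FCode offset (5ae0) in the trace. Loading ofwboot at 0x4000 overlaps that range, whereas a load address of 0x6000 would start just past it. The 0x1a1c8 bytes also need 14 copies of 0x2000 bytes, matching the 14 "Copying" lines.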


ATB,

Mark.



Re: sparc64 boot issue on qemu

2020-05-30 Thread Mark Cave-Ayland
On 29/05/2020 23:56, Jason A. Donenfeld wrote:

> Oh that's a nice observation about `boot disk -V`. Doing so actually
> got me booting up entirely:
> 
> $ qemu-img convert -O qcow2 miniroot66.fs disk.qcow2
> $ qemu-img resize disk.qcow2 20G
> $ qemu-system-sparc64 -m 1024 -drive file=disk.qcow2,if=ide -net
> nic,model=ne2k_pci -net user -boot a -nographic -monitor none -serial
> stdio

I think the problem here is that you're asking OpenBIOS to boot from the (empty)
floppy disk with "-boot a" rather than the qcow2 image which is normally
attached to the first hard disk "-boot c". As this is the default, then I would
expect the command line above to work if you simply drop "-boot a".

Also is there a particular reason for using the ne2k_pci NIC instead of the
default in-built sunhme device? I try and keep the documentation at
https://wiki.qemu.org/Documentation/Platforms/SPARC as accurate as I can, so do
look there for latest best practices and command line examples.

Finally the version of qemu-system-sparc64 you are running can also boot from a
virtio-blk-pci device (again see the above wiki page for details) if you are
looking for the best emulated disk performance.


ATB,

Mark.



Re: sparc64 boot issue on qemu

2020-05-30 Thread Mark Cave-Ayland
On 30/05/2020 10:03, Otto Moerbeek wrote:

> Hi,
> 
> thanks for the hints, but an unpatched 6.7 miniroot still fails to
> boot for me
> 
> qemu-system-sparc64 -machine sun4u -m 1024 -drive \
>   file=miniroot67.img,format=raw -nographic -serial stdio -monitor none
> 
> OpenBIOS for Sparc64
> Configuration device id QEMU version 1 machine id 0
> kernel cmdline 
> CPUs: 1 x SUNW,UltraSPARC-IIi
> UUID: ----
> Welcome to OpenBIOS v1.1 built on Oct 28 2019 17:08
>   Type 'help' for detailed information
> Trying disk:a...
> Not a bootable ELF image
> Not a bootable a.out image
> 
> Loading FCode image...
> Loaded 6882 bytes
> entry point is 0x4000
> Evaluating FCode...
> OpenBSD IEEE 1275 Bootblock 2.0
> ..
> 
> And then hangs
> 
> While the patched bootblocks do boot (but hang later after
> 
> scsibus1 at softraid0: 256 targets
> 
> 
> as before,
> 
>   -Otto

Hmmm odd. Is it possible for you to upload your miniroot somewhere for me to
take a quick look? I don't have a great deal of time right now, but I can run it
through a debugger to see if anything obvious shows up.


ATB,

Mark.



Re: Status of openbsd/macppc port?

2018-08-17 Thread Mark Cave-Ayland
On 17/08/18 14:27, Mark Kettenis wrote:

>> Obviously I can't categorically state that QEMU's emulation is perfect,
>> but it can now reliably run all of Linux, MacOS, NetBSD and FreeBSD in
>> my local tests which makes me suspect that OpenBSD is trying to do
>> something different here.
> 
> Runs fairly stable as long as there is enough RAM.  There is an
> (unknown) pmap bug that causes memory corruption as soon as the
> machine starts swapping.

Right, I wonder if this is related to the invalid memory accesses I'm
seeing in QEMU? Fortunately it's fairly easy to boot different images
within the VM, so let's go backwards in time...


OpenBSD 6.1
- Boots to userspace, but hangs quickly at the installer shell

OpenBSD 6.0
- Hangs on boot just after the USB controller initialises

OpenBSD 5.9
- Boots to userspace, but hangs quickly at the installer shell (qemu
console logs attempt to execute a NULL opcode, so looks like we're
jumping off somewhere strange?)

OpenBSD 5.8
- Hangs on boot just after the USB controller initialises (qemu console
logs an attempt to execute an invalid/unsupported opcode: 00 - 1c - 17 -
0a (004ad5f8)  1)

OpenBSD 5.7
- Lots of "mac_intr_establish called, not yet inited" warnings in the
kernel dmesg output
- However it boots to userspace and the installer shell seems stable

OpenBSD 5.6
- Panics with a stack smash warning:

OpenBSD 5.6 (RAMDISK) #163: Fri Aug  8 09:05:59 MDT 2014
dera...@macppc.openbsd.org:/usr/src/sys/arch/macppc/compile/RAMDISK
real mem = 1073741824 (1024MB)
avail mem = 1029210112 (981MB)
warning: no entropy supplied by boot loader
mainbus0 at root: model PowerMac3,1
cpu0 at mainbus0: 7400 (Revision 0x209): 900 MHz: L2 cache not enabled
mem at mainbus0 not configured
mpcpcibr0 at mainbus0 pci: uni-north
pci0 at mpcpcibr0 bus 0
panic: smashed stack in ofw_enumerate_pcibus
Stopped at  Debugger+0x10:  lwz r0,36(r1)
00a00ae4: end+0x561cc fp a00ac0 nfp a00ae0
001ee6dc: panic+0xe0 fp a00ae0 nfp a00b40
001e235c: __stack_smash_handler+0x18 fp a00b40 nfp a00b60
0037ea18: ofw_enumerate_pcibus+0x1b0 fp a00b60 nfp a00bc0
0031bc90: pciattach+0xf0 fp a00bc0 nfp a00bf0
001e3e50: config_attach+0x1f0 fp a00bf0 nfp a00c40
0037dc0c: mpcpcibrattach+0x3b0 fp a00c40 nfp a00d60
001e3e50: config_attach+0x1f0 fp a00d60 nfp a00db0
003095f0: dbdma_flush+0x4d8 fp a00db0 nfp a00e90
001e3e50: config_attach+0x1f0 fp a00e90 nfp a00ee0
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
ddb> trace
00a00ae4: end+0x561cc fp a00ac0 nfp a00ae0
001ee6dc: panic+0xe0 fp a00ae0 nfp a00b40
001e235c: __stack_smash_handler+0x18 fp a00b40 nfp a00b60
0037ea18: ofw_enumerate_pcibus+0x1b0 fp a00b60 nfp a00bc0
0031bc90: pciattach+0xf0 fp a00bc0 nfp a00bf0
001e3e50: config_attach+0x1f0 fp a00bf0 nfp a00c40
0037dc0c: mpcpcibrattach+0x3b0 fp a00c40 nfp a00d60
001e3e50: config_attach+0x1f0 fp a00d60 nfp a00db0
003095f0: dbdma_flush+0x4d8 fp a00db0 nfp a00e90
001e3e50: config_attach+0x1f0 fp a00e90 nfp a00ee0
002f63ec: cpu_configure+0x24 fp a00ee0 nfp a00f00
001c525c: main+0x3f0 fp a00f00 nfp a00f40
001001bc: kernel_text+0xa8 fp a00f40 nfp 0
ddb> ps
   PID   PPID   PGRPUID  S   FLAGS  WAIT  COMMAND
*0 -1  0  0  7 0x10200swapper
ddb>

OpenBSD 5.5

- Lots of "mac_intr_establish called, not yet inited" warnings in the
kernel dmesg output
- Panics on boot when initialising USB:

uhub0 at usb0 "Apple OHCI root hub" rev 1.00/1.00 addr 1
panic: trap type 600 at 2cf4a0 (mtx_enter+0x28) lr 2cf490
Stopped at  Debugger+0x10:  lwz r0,20(r1)
00fc: tlbdsmsize+0x14 fp 94ba70 nfp 94ba80
001cec40: panic+0xd0 fp 94ba80 nfp 94bae0
002ce8cc: trap+0x184 fp 94bae0 nfp 94bb60
00100900: ddblow+0x1ac fp 94bb60 nfp 94bc10
002cf48c: mtx_enter+0x14 fp 94bc10 nfp 94bc20
001c4a50: config_attach+0x200 fp 94bc20 nfp 94bc60
00351018: mpcpcibrattach+0x3b0 fp 94bc60 nfp 94bd80
001c4a40: config_attach+0x1f0 fp 94bd80 nfp 94bdc0
002e4af0: mb_matchname+0x4e8 fp 94bdc0 nfp 94beb0
001c4a40: config_attach+0x1f0 fp 94beb0 nfp 94bef0
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
ddb> trace
00fc: tlbdsmsize+0x14 fp 94ba70 nfp 94ba80
001cec40: panic+0xd0 fp 94ba80 nfp 94bae0
002ce8cc: trap+0x184 fp 94bae0 nfp 94bb60
00100900: ddblow+0x1ac fp 94bb60 nfp 94bc10
002cf48c: mtx_enter+0x14 fp 94bc10 nfp 94bc20
001c4a50: config_attach+0x200 fp 94bc20 nfp 94bc60
00351018: mpcpcibrattach+0x3b0 fp 94bc60 nfp 94bd80
001c4a40: config_attach+0x1f0 fp 94bd80 nfp 94bdc0
002e4af0: mb_matchname+0x4e8 fp 94bdc0 nfp 94beb0
001c4a40: config_attach+0x1f0 fp 94beb0 nfp 94bef0
002d1d9c: cpu_configure+0x24 fp 94bef0 nfp 94bf00
001a7314: main+0x3cc fp 94bf00 nfp 94bf40
001001bc: kernel_text+0xa8 fp 94bf40 nfp 0
ddb> ps
   PID   PPID   PGRPUID  S   FLAGS  WAIT  COMMAND
*0 -1  0  0  7 

Re: Status of openbsd/macppc port?

2018-08-17 Thread Mark Cave-Ayland
On 17/08/18 13:55, Solene Rapenne wrote:

> I've been using the macppc port from 6.1 to -current and, apart from failing
> hardware, I don't have any issues while playing Doom or rebuilding ports :)

Hmmm. 6.1 is the latest version that I can boot to userspace, even if it
faults quickly after a few keypresses (QEMU is generally really strict
on invalid memory accesses which is basically what I see, but once the
access is tracked down it would be possible to fix it).

I'd be interested to know if you are able to at least boot a 6.3
installation CDROM on the Mac Mini to the installer without hanging,
which is probably the closest match to what I'm doing on real hardware.


ATB,

Mark.



Re: Status of openbsd/macppc port?

2018-08-17 Thread Mark Cave-Ayland
On 17/08/18 13:37, Solene Rapenne wrote:
> Mark Cave-Ayland  wrote:
>> Hi all,
>>
>> I was just wondering what is the current state of the openbsd/macppc
>> port? As part of my recent work on qemu-system-ppc I now have a patch
>> that can boot OpenBSD macppc under the New World (-M mac99,via=pmu)
>> machine but I'm seeing quite a bit of instability in OpenBSD compared to
>> all my other test OSs.

> Hello
> 
> I can't help you much with your qemu issue but I can confirm that
> the OpenBSD macppc port works really well as I often use 2 macppc devices (a
> Mac mini and a PowerBook). The sad state is that fewer and fewer
> ports are running on them.

Thanks for the response Solene. Can I ask which version of
openbsd/macppc you are currently running?


ATB,

Mark.



Re: Status of openbsd/macppc port?

2018-08-17 Thread Mark Cave-Ayland
On 17/08/18 13:34, Jonathan Gray wrote:

> On Fri, Aug 17, 2018 at 12:15:10PM +0100, Mark Cave-Ayland wrote:
>> Hi all,
>>
>> I was just wondering what is the current state of the openbsd/macppc
>> port? As part of my recent work on qemu-system-ppc I now have a patch
>> that can boot OpenBSD macppc under the New World (-M mac99,via=pmu)
>> machine but I'm seeing quite a bit of instability in OpenBSD compared to
>> all my other test OSs.
>>
>> For those that are interested I have included screenshots below:
>>
>> OpenBSD 6.3
>> - Hangs just after USB detection
>> - https://www.ilande.co.uk/tmp/qemu/openbsd-6.3.png
>>
>> OpenBSD 6.2
>> - Panics just after USB detection
>> - https://www.ilande.co.uk/tmp/qemu/openbsd-6.2.png
>>
>> OpenBSD 6.1
>> - Boots all the way to the installer but causes qemu-system-ppc to
>> terminate fairly easily after pressing a few keys with "qemu: fatal:
>> ERROR: instruction should not need address translation"
>> - https://www.ilande.co.uk/tmp/qemu/openbsd-6.1.png
>>
>> Note I also get a constant stream of messages on the console related to
>> OpenPIC:
>>
>> qemu-system-ppc: openpic_iack: bad raised IRQ 47 ctpr 8 ivpr 0x4047002f
>> qemu-system-ppc: openpic_iack: bad raised IRQ 47 ctpr 8 ivpr 0x4047002f
>> qemu-system-ppc: openpic_iack: bad raised IRQ 47 ctpr 8 ivpr 0x4047002f
>> qemu-system-ppc: openpic_iack: bad raised IRQ 47 ctpr 8 ivpr 0x4047002f
>> qemu-system-ppc: openpic_iack: bad raised IRQ 28 ctpr 8 ivpr 0x4045001c
>> qemu-system-ppc: openpic_iack: bad raised IRQ 28 ctpr 8 ivpr 0x4045001c
>> qemu-system-ppc: openpic_iack: bad raised IRQ 28 ctpr 8 ivpr 0x4045001c
>> etc.
>>
>>
>> Obviously I can't categorically state that QEMU's emulation is perfect,
>> but it can now reliably run all of Linux, MacOS, NetBSD and FreeBSD in
>> my local tests which makes me suspect that OpenBSD is trying to do
>> something different here.
> 
> Builds are done natively on real hardware (xserves).  Your work on
> qemu-system-ppc would be improved by being able to compare to a real
> machine while it is still possible to find some that work.  You could
> search bugs@ but I don't believe any of those problems have been reported
> running on actual macppc machines.

Thanks for the information. I guess there is a difference between being able
to build and run the guest OS - for example do the builds get regularly
tested on any Sawtooth-type PowerMac3,1 machines (which is effectively
what QEMU is trying to emulate)?

FWIW from the screenshots above the "bad IRQs" being complained about
can be shown to be macgpio1 (IRQ 47) and ohci0 (IRQ 28). Is there
anything special about these interrupts at all, e.g. edge vs. level
triggering?
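
To make the ivpr values in those messages easier to read, here is a small decoding sketch. The bit positions are an assumption based on the usual OpenPIC vector/priority register layout (vector in the low 8 bits, priority in bits 16-19, sense in bit 22, activity in bit 30), and `decode_ivpr` is a hypothetical helper rather than anything in QEMU or OpenBSD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed OpenPIC IVPR field positions; check against the real model. */
#define IVPR_ACTIVITY		(1u << 30)
#define IVPR_SENSE		(1u << 22)	/* assumed: 1 = level, 0 = edge */
#define IVPR_PRIO_SHIFT		16
#define IVPR_PRIO_MASK		0xfu
#define IVPR_VECTOR_MASK	0xffu		/* assumed 8-bit vector field */

struct ivpr_fields {
	unsigned vector;	/* vector number reported on acknowledge */
	unsigned priority;	/* source priority, 0-15 */
	bool level;		/* level-sensitive if true */
	bool active;		/* activity bit set */
};

/* Split a raw IVPR value into its individual fields. */
struct ivpr_fields
decode_ivpr(uint32_t ivpr)
{
	struct ivpr_fields f;

	f.vector = ivpr & IVPR_VECTOR_MASK;
	f.priority = (ivpr >> IVPR_PRIO_SHIFT) & IVPR_PRIO_MASK;
	f.level = (ivpr & IVPR_SENSE) != 0;
	f.active = (ivpr & IVPR_ACTIVITY) != 0;
	return f;
}
```

Under those assumptions 0x4047002f decodes as vector 47, priority 7, level-sensitive, and 0x4045001c as vector 28, priority 5, level-sensitive. Both priorities are below the ctpr of 8 shown in the messages, which may be worth keeping in mind when looking at why the acknowledge is rejected.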


ATB,

Mark.



Status of openbsd/macppc port?

2018-08-17 Thread Mark Cave-Ayland
Hi all,

I was just wondering what is the current state of the openbsd/macppc
port? As part of my recent work on qemu-system-ppc I now have a patch
that can boot OpenBSD macppc under the New World (-M mac99,via=pmu)
machine but I'm seeing quite a bit of instability in OpenBSD compared to
all my other test OSs.

For those that are interested I have included screenshots below:

OpenBSD 6.3
- Hangs just after USB detection
- https://www.ilande.co.uk/tmp/qemu/openbsd-6.3.png

OpenBSD 6.2
- Panics just after USB detection
- https://www.ilande.co.uk/tmp/qemu/openbsd-6.2.png

OpenBSD 6.1
- Boots all the way to the installer but causes qemu-system-ppc to
terminate fairly easily after pressing a few keys with "qemu: fatal:
ERROR: instruction should not need address translation"
- https://www.ilande.co.uk/tmp/qemu/openbsd-6.1.png

Note I also get a constant stream of messages on the console related to
OpenPIC:

qemu-system-ppc: openpic_iack: bad raised IRQ 47 ctpr 8 ivpr 0x4047002f
qemu-system-ppc: openpic_iack: bad raised IRQ 47 ctpr 8 ivpr 0x4047002f
qemu-system-ppc: openpic_iack: bad raised IRQ 47 ctpr 8 ivpr 0x4047002f
qemu-system-ppc: openpic_iack: bad raised IRQ 47 ctpr 8 ivpr 0x4047002f
qemu-system-ppc: openpic_iack: bad raised IRQ 28 ctpr 8 ivpr 0x4045001c
qemu-system-ppc: openpic_iack: bad raised IRQ 28 ctpr 8 ivpr 0x4045001c
qemu-system-ppc: openpic_iack: bad raised IRQ 28 ctpr 8 ivpr 0x4045001c
etc.


Obviously I can't categorically state that QEMU's emulation is perfect,
but it can now reliably run all of Linux, MacOS, NetBSD and FreeBSD in
my local tests which makes me suspect that OpenBSD is trying to do
something different here.


ATB,

Mark.



Re: hme: incorrect register endian for PCI sun hme devices?

2017-08-14 Thread Mark Cave-Ayland
On 14/08/17 21:18, Mark Kettenis wrote:

>> So tracing through HME register writes it seems the difference between
>> OpenBSD and the other OSs is that OpenBSD appears to write to the
>> virtual address 0x40008098000 with a standard (0x80) primary ASI,
>> whereas the other OSs seem to write directly to the physical address
>> 0x1ff0400 with a physical LE ASI.
>>
>> Is this because in OpenBSD the memory is being allocated as DVMA memory
>> via the IOMMU?
> 
> Ah, no.  For memory mapped io it seems we create an actual
> little-endian memory mapping (i.e. with the IE bit set).  That was
> probably done to support mapping framebuffers.

Ah yes, I bet that's it - thanks for the pointer! Not sure it's going to
be the easiest job to implement though.


ATB,

Mark.



Re: hme: incorrect register endian for PCI sun hme devices?

2017-08-14 Thread Mark Cave-Ayland
On 14/08/17 14:25, Mark Kettenis wrote:

>> Great, thanks for the information - the fact that the nsphy0 has been
>> detected correctly means that the access still works. Looks like I'll
>> have to go digging deeper.
> 
> The OpenBSD code uses %asi if necessary to let the hardware do the
> byteswapping.  However, I think the psycho(4) host bridge also does an
> implicit byteswap.  Always has been a bit confusing to me.  But the
> code definitely works correctly on real hardware.

So tracing through HME register writes it seems the difference between
OpenBSD and the other OSs is that OpenBSD appears to write to the
virtual address 0x40008098000 with a standard (0x80) primary ASI,
whereas the other OSs seem to write directly to the physical address
0x1ff0400 with a physical LE ASI.

Is this because in OpenBSD the memory is being allocated as DVMA memory
via the IOMMU?


ATB,

Mark.



Re: hme: incorrect register endian for PCI sun hme devices?

2017-08-13 Thread Mark Cave-Ayland
On 13/08/17 16:52, Kaashif Hymabaccus wrote:

> Hello Mark,
> 
> I have a Sun Ultra 5 with the following dmesg:
> 
> console is /pci@1f,0/pci@1,1/ebus@1/se@14,40:a
> Copyright (c) 1982, 1986, 1989, 1991, 1993
>   The Regents of the University of California.  All rights reserved.
> Copyright (c) 1995-2017 OpenBSD. All rights reserved.  https://www.OpenBSD.org
> 
> OpenBSD 6.1-current (GENERIC) #225: Fri Aug 11 19:58:43 MDT 2017
> dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/GENERIC
> real mem = 536870912 (512MB)
> avail mem = 512393216 (488MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root: Sun Ultra 5/10 UPA/PCI (UltraSPARC-IIi 270MHz)
> cpu0 at mainbus0: SUNW,UltraSPARC-IIi (rev 1.3) @ 269.802 MHz
> cpu0: physical 16K instruction (32 b/l), 16K data (32 b/l), 256K external (64 
> b/l)
> psycho0 at mainbus0 addr 0xfffc4000: SUNW,sabre, impl 0, version 0, ign 7c0
> psycho0: bus range 0-2, PCI bus 0
> psycho0: dvma map c000-dfff
> pci0 at psycho0
> ppb0 at pci0 dev 1 function 1 "Sun Simba" rev 0x11
> pci1 at ppb0 bus 1
> ebus0 at pci1 dev 1 function 0 "Sun PCIO EBus2" rev 0x01
> auxio0 at ebus0 addr 726000-726003, 728000-728003, 72a000-72a003, 
> 72c000-72c003, 72f000-72f003
> power0 at ebus0 addr 724000-724003 ivec 0x25
> "SUNW,pll" at ebus0 addr 504000-504002 not configured
> sab0 at ebus0 addr 40-40007f ivec 0x2b: rev 3.2
> sabtty0 at sab0 port 0: console
> sabtty1 at sab0 port 1
> comkbd0 at ebus0 addr 3083f8-3083ff ivec 0x29: no keyboard
> comms0 at ebus0 addr 3062f8-3062ff ivec 0x2a
> wsmouse0 at comms0 mux 0
> lpt0 at ebus0 addr 3043bc-3043cb, 30015c-30015d, 70-7f ivec 0x22: 
> polled
> "fdthree" at ebus0 addr 3023f0-3023f7, 706000-70600f, 72-720003 ivec 0x27 
> not configured
> clock1 at ebus0 addr 0-1fff: mk48t59
> "flashprom" at ebus0 addr 0-f not configured
> audioce0 at ebus0 addr 20-2000ff, 702000-70200f, 704000-70400f, 
> 722000-722003 ivec 0x23 ivec 0x24: nvaddrs 0
> audio0 at audioce0
> hme0 at pci1 dev 1 function 1 "Sun HME" rev 0x01: ivec 0x7e1, address 
> 08:00:20:19:39:20
> nsphy0 at hme0 phy 1: DP83840 10/100 PHY, rev. 1
> machfb0 at pci1 dev 2 function 0 "ATI Mach64" rev 0x9a
> machfb0: ATY,GT-B, 1152x900
> wsdisplay0 at machfb0 mux 1
> wsdisplay0: screen 0 added (std, sun emulation)
> pciide0 at pci1 dev 3 function 0 "CMD Technology PCI0646" rev 0x03: DMA, 
> channel 0 configured to native-PCI, channel 1 configured to native-PCI
> pciide0: using ivec 0x7e0 for native-PCI interrupt
> wd0 at pciide0 channel 0 drive 0: 
> wd0: 16-sector PIO, LBA48, 117800MB, 241254720 sectors
> wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
> atapiscsi0 at pciide0 channel 1 drive 0
> scsibus1 at atapiscsi0: 2 targets
> cd0 at scsibus1 targ 0 lun 0:  ATAPI 5/cdrom 
> removable
> wd1 at pciide0 channel 1 drive 1: 
> wd1: 16-sector PIO, LBA, 19546MB, 40031712 sectors
> cd0(pciide0:1:0): using PIO mode 4, DMA mode 2
> wd1(pciide0:1:1): using PIO mode 4, DMA mode 2
> ppb1 at pci0 dev 1 function 0 "Sun Simba" rev 0x11
> pci2 at ppb1 bus 2
> vscsi0 at root
> scsibus2 at vscsi0: 256 targets
> softraid0 at root
> scsibus3 at softraid0: 256 targets
> bootpath: /pci@1f,0/pci@1,1/ide@3,0/disk@0,0
> root on wd0a (f52f0bbc65e53556.a) swap on wd0b dump on wd0b
> 
> It has a PCI hme card and it works great.
> 
> I would be happy to help if you want to test some diff or program, but
> I am not knowledgeable enough to comment on the inner workings of the
> hme driver.

Great, thanks for the information - the fact that the nsphy0 has been
detected correctly means that the access still works. Looks like I'll
have to go digging deeper.


ATB,

Mark.



hme: incorrect register endian for PCI sun hme devices?

2017-08-13 Thread Mark Cave-Ayland
Hi all,

Does anyone have any real Sun hardware containing a PCI hme card running
OpenBSD, and if so does it work with the current 6.1 release?

I've been working on a virtual hme device for qemu-system-sparc64 in the
hope of getting working networking on *BSD images and I have a driver
that now works well for pretty much all OSs... except OpenBSD.

Looking at the hme driver register accesses on OpenBSD, the issue
appears to be that all accesses to the hme register blocks defined in
if_hme_pci.c (SEB, ETX, ERX, MAC, MIF) are done using big endian
accesses, whereas for PCI devices these need to be done using little
endian accesses.

Is there something I've missed in the hme device emulation or is there
something amiss with the hme driver? I see that it is shared between PCI
and SBus so perhaps that is part of the puzzle?
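
To illustrate what the emulated device sees in that case, here is a minimal sketch of a 32-bit register image written in one byte order and read back in the other. The helper names are hypothetical and this is not code from either the hme driver or the QEMU device model.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the byte order of a 32-bit value. */
uint32_t
reg_swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	    ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Store v into a 4-byte register image, most significant byte first. */
void
store_be32(unsigned char reg[4], uint32_t v)
{
	reg[0] = (unsigned char)(v >> 24);
	reg[1] = (unsigned char)(v >> 16);
	reg[2] = (unsigned char)(v >> 8);
	reg[3] = (unsigned char)v;
}

/* Store v into a 4-byte register image, least significant byte first. */
void
store_le32(unsigned char reg[4], uint32_t v)
{
	reg[0] = (unsigned char)v;
	reg[1] = (unsigned char)(v >> 8);
	reg[2] = (unsigned char)(v >> 16);
	reg[3] = (unsigned char)(v >> 24);
}

/* Read a 4-byte register image as a little-endian value. */
uint32_t
load_le32(const unsigned char reg[4])
{
	return (uint32_t)reg[0] | ((uint32_t)reg[1] << 8) |
	    ((uint32_t)reg[2] << 16) | ((uint32_t)reg[3] << 24);
}
```

A value of 1 stored with store_be32() reads back through load_le32() as 0x01000000, so a device model that expects little-endian registers would see byte-swapped values unless something between the CPU and the device performs the swap.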


ATB,

Mark.



Re: sparc64 pmap diff

2016-04-18 Thread Mark Cave-Ayland
On 17/04/16 20:42, Mark Kettenis wrote:

> Ran into an interesting problem with the sparc64 pmap bootstrapping
> code.  Early on, we ask the firmware what physical memory is
> available.  Later we use this memory to set up the kernel page tables,
> kernel stack and per-cpu data structures.  We explicitly tell the
> firmware about the mappings of these data structure as the firmware is
> handling page faults for us at this stage.  To store these mappings
> the firmware may need to allocate more memory.  And if it happens to
> allocate memory that we're using for some other purpose, bad things
> will happen.  In my case dmesg stopped working because its mappings
> were messed up.
> 
> The following diff attempts to fix this issue by telling the firmware
> which pages we're stealing.  It's not perfect as it doesn't prevent us
> from allocating the same pages as the firmware is allocating.
> 
> Tests on a wide variety of  sparc64 hardware would be welcome.
> 
> 
> Index: pmap.c
> ===
> RCS file: /cvs/src/sys/arch/sparc64/sparc64/pmap.c,v
> retrieving revision 1.96
> diff -u -p -r1.96 pmap.c
> --- pmap.c27 Nov 2015 15:34:01 -  1.96
> +++ pmap.c17 Apr 2016 19:17:45 -
> @@ -2869,6 +2869,7 @@ pmap_get_page(paddr_t *pa, const char *w
>   *pa = VM_PAGE_TO_PHYS(pg);
>   } else {
>   uvm_page_physget(pa);
> + prom_claim_phys(*pa, PAGE_SIZE);
>   pmap_zero_phys(*pa);
>   }

This patch feels wrong - essentially it is just hiding the fact there is
a missing prom_claim_phys() or prom_alloc_phys() somewhere at the point
of allocation. Can you give more information about the particular case
you describe above?


ATB,

Mark.



SPARC64: suggested fixes for OF interface

2014-10-02 Thread Mark Cave-Ayland

Hi all,

From my work on running OpenBSD under OpenBIOS/QEMU, I found a couple 
of bugs in the NetBSD OF bindings for SPARC64 which also seem to be 
relevant to OpenBSD. I've applied patches to OpenBIOS to compensate for 
these bugs which allows OpenBSD to boot under QEMU, but thought that as 
there is interest here it would be worth documenting them for the sake 
of correctness.



1) OF_close has the wrong number of return arguments

src/sys/arch/sparc64/stand/ofwboot/Locore.c specifies that OF_close has 
args.nreturns == 1. From the IEEE1275 specification we can see that the 
close word doesn't return any arguments, and so args.nreturns should 
be set to 0. OpenBIOS currently compensates for this and issues a 
warning when debugging is enabled.
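
As a rough sketch of the corrected argument array (not the actual Locore.c code; the struct and function names here are my own, and cell_t is assumed to be 64 bits wide on this target):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t cell_t;    /* assumed width of one client interface cell */

/* Argument array for the "close" service: one input cell, no result cells. */
struct of_close_args {
    cell_t name;        /* address of the service name string */
    cell_t nargs;       /* number of input cells that follow */
    cell_t nreturns;    /* number of result cells that follow the inputs */
    cell_t ihandle;     /* the single input argument */
};

void of_close_args_init(struct of_close_args *a, cell_t ihandle)
{
    assert(a != NULL);
    a->name = (cell_t)(uintptr_t)"close";
    a->nargs = 1;
    a->nreturns = 0;    /* close returns nothing, so no result cell is reserved */
    a->ihandle = ihandle;
}
```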



2) OF_test_method takes a phandle not an ihandle, and also returns 0 on 
success


src/sys/arch/sparc64/sparc64/ofw_machdep.c calls OF_test_method with an 
ihandle instead of a phandle as detailed in the Open Firmware working 
group proposal at 
http://www.openfirmware.org/1275/proposals/Closed/Accepted/270-it.txt 
(WARNING: the above link is currently down, however Google still has a 
cached version available).


Similarly the Forth word signature looks like this:

test-method ( method-cstr phandle -- missing-flag? )

This means that missing-flag? should be true if the method is missing 
and false if it is present, which indicates that the check to determine 
the existence of SUNW,retain in ofw_machdep.c is the wrong way around, 
i.e. the result comparison should be == 0 rather than != 0.


What happens at the moment is that calling OF_test_method with an 
ihandle causes an exception and so the client interface returns -1 to 
indicate failure. However since the result is checked for != 0 then this 
is taken to indicate that SUNW,retain exists which is why this currently 
works on some real PROMs.
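
The two checks can be compared side by side in a small C sketch. The function names are hypothetical and rv stands for whatever value the kernel gets back from the test-method call; this is not the ofw_machdep.c code itself.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * rv as seen by the caller:
 *    0  missing-flag? is false, so the method exists
 *   -1  missing-flag? is true (Forth true), or the client interface
 *       call itself failed, e.g. because an ihandle was passed
 */

/* Current check: treats any non-zero result as "method exists". */
bool method_exists_buggy(int rv)
{
    return rv != 0;
}

/* Corrected check: the method exists only when missing-flag? is false. */
bool method_exists_fixed(int rv)
{
    return rv == 0;
}

/* Usage: a failed call (-1) fools the buggy check but not the fixed one. */
void test_method_semantics_demo(void)
{
    assert(method_exists_buggy(-1));
    assert(!method_exists_fixed(-1));
    assert(method_exists_fixed(0));
}
```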


It's worth mentioning that this fixes test-method on E250/E450 systems 
and so the NetBSD folks were able to remove the is_e250 hack after 
testing on real hardware.


For interested parties the corresponding NetBSD diff can be found at 
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/sparc64/sparc64/ofw_machdep.c.diff?r1=1.41&r2=1.42&f=h.



ATB,

Mark.



sparc64: fledgling QEMU support

2014-09-09 Thread Mark Cave-Ayland

Hi all,

Following up from my posts at the beginning of the summer, I'm pleased 
to announce that as of today, qemu-system-sparc64 built from QEMU git 
master will successfully install OpenBSD from an .iso and boot back into 
it in serial mode with its default sun4u emulation:



$ ./qemu-system-sparc64 -cdrom install55.iso -boot d -nographic
OpenBIOS for Sparc64
Configuration device id QEMU version 1 machine id 0
kernel cmdline
CPUs: 1 x SUNW,UltraSPARC-IIi
UUID: ----
Welcome to OpenBIOS v1.1 built on Aug 26 2014 12:48
  Type 'help' for detailed information
Trying cdrom:f...
Not a bootable ELF image
Not a bootable a.out image

Loading FCode image...
Loaded 4829 bytes
entry point is 0x4000
OpenBSD IEEE 1275 Bootblock 1.3
..
Jumping to entry point 0010 for type 0001...
switching to new context: entry point 0x10 stack 0xffe8aa09
 OpenBSD BOOT 1.6
Trying bsd...
open /pci@1fe,0/pci-ata@5/ide1@2200/cdrom@0:f/etc/random.seed: No such 
file or directory

Booting /pci@1fe,0/pci-ata@5/ide1@2200/cdrom@0:f/bsd
3901336@0x100+6248@0x13b8798+3261984@0x180+932320@0x1b1c620
symbols @ 0xffc5a300 119 start=0x100

Unexpected client interface exception: -1
console is /pci@1fe,0/ebus@3/su
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2014 OpenBSD. All rights reserved. 
http://www.OpenBSD.org


OpenBSD 5.5 (RAMDISK) #153: Tue Mar  4 15:12:10 MST 2014
dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/RAMDISK
real mem = 134217728 (128MB)
avail mem = 122011648 (116MB)
mainbus0 at root: OpenBiosTeam,OpenBIOS
cpu0 at mainbus0: SUNW,UltraSPARC-IIi (rev 9.1) @ 100 MHz
cpu0: physical 256K instruction (64 b/l), 16K data (32 b/l), 256K 
external (64 b/l)

psycho0 at mainbus0: SUNW,sabre, impl 0, version 0, ign 7c0
psycho0: bus range 0-2, PCI bus 0
psycho0: dvma map c000-dfff
pci0 at psycho0
ppb0 at pci0 dev 1 function 0 Sun Simba rev 0x11
pci1 at ppb0 bus 1
ppb1 at pci0 dev 1 function 1 Sun Simba rev 0x11
pci2 at ppb1 bus 2
unknown vendor 0x1234 product 0x (class display subclass VGA, rev 
0x00) at pci0 dev 2 function 0 not configured

ebus0 at pci0 dev 3 function 0 Sun PCIO EBus2 rev 0x01
fdthree at ebus0 addr 0- not configured
com0 at ebus0 addr 3f8-3ff ivec 0x2b: ns16550a, 16 byte fifo
com0: console
kb_ps2 at ebus0 addr 60-67 not configured
Realtek 8029 rev 0x00 at pci0 dev 4 function 0 not configured
pciide0 at pci0 dev 5 function 0 CMD Technology PCI0646 rev 0x07: DMA, 
channel 0 configured to native-PCI, channel 1 configured to native-PCI

pciide0: using ivec 0x7d4 for native-PCI interrupt
pciide0: channel 0 disabled (no drives)
atapiscsi0 at pciide0 channel 1 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: QEMU, QEMU DVD-ROM, 2.1. ATAPI 5/cdrom 
removable

cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
prtc0 at mainbus0
softraid0 at root
scsibus1 at softraid0: 256 targets
bootpath: /pci@1fe,0/pci-ata@5,0/ide1@2200,0/cdrom@0,0:f
root on rd0a swap on rd0b dump on rd0b
unix-gettod:interpret: exception -13 caught
interpret h# 01c099fc unix-gettod failed with error ffed
WARNING: bad date in battery clock -- CHECK AND RESET THE DATE!
erase ^?, werase ^W, kill ^U, intr ^C, status ^T

Welcome to the OpenBSD/sparc64 5.5 installation program.
(I)nstall, (U)pgrade, (A)utoinstall or (S)hell? I
At any prompt except password prompts you can escape to a shell by
typing '!'. Default answers are shown in []'s and are selected by
pressing RETURN.  You can exit this program at any time by pressing
Control-C, but this can leave your system in an inconsistent state.

Terminal type? [sun]
System hostname? (short form, e.g. 'foo') openbsd

Available network interfaces are: vlan0.
Which network interface do you wish to configure? (or 'done') [vlan0] done
DNS domain name? (e.g. 'bar.com') [my.domain]
DNS nameservers? (IP address list or 'none') [none]

Password for root account? (will not echo)
Password for root account? (again)
Start sshd(8) by default? [yes]
Start ntpd(8) by default? [no]
Do you expect to run the X Window System? [no]
Setup a user? (enter a lower-case loginname, or 'no') [no]

... etc.


There are still some issues with the device tree to work out; in 
particular NVRAM and networking (I'd guess that the OpenBSD sparc64 
kernel doesn't contain the Realtek device driver so at some point I'll 
need to create a virtual hme device) but it's good enough to 
install/boot an OS on different hardware for testing - what could be 
more fun than that?



ATB,

Mark.



Re: sparc64: fledgling QEMU support

2014-09-09 Thread Mark Cave-Ayland

On 09/09/14 19:54, Mark Kettenis wrote:


> Sweet.
>
> The RealTek 8129 should be supported by the rl(4) driver, and is
> AFAICT included in the RAMDISK kernel.  Not sure why it doesn't
> attach.  If it is easy to hook up QEMU's e1000 hardware emulation to
> the emulated sparc64 hardware, that should be supported as well on the
> OpenBSD side.
>
> OpenBSD expects the device tree node for the PS/2 keyboard to be named
> 8042.  That's how it is named on the Ultra AXi boards.


Thanks for the information. I've had some interest from the NetBSD folk 
too and it seems that they don't build 8042 support into their default 
sparc64 kernel, so it looks like I'd have to switch over to su serial 
ports instead like the real thing (the QEMU sun4u model is fairly close 
to an Ultra 5). My aim is to try and provide an environment that mostly 
just works for as many OSs as possible.



> The NVRAM is supposed to be described by a node named eeprom under
> ebus.  proper emulation of this device will get rid of the
>
>    unix-gettod:interpret: exception -13 caught
>    interpret h# 01c099fc unix-gettod failed with error ffed
>    WARNING: bad date in battery clock -- CHECK AND RESET THE DATE!
>
> spam.


Brilliant - very useful. The one issue I am aware of is that currently 
the NVRAM chip is wired up as ioport rather than MMIO so that will need 
to change. I believe Artyom posted some patches for this a year or so 
ago, however they will likely need a bit of work to get them suitable 
for upstream QEMU.



ATB,

Mark.



Re: sparc64: fledgling QEMU support

2014-09-09 Thread Mark Cave-Ayland

On 09/09/14 19:57, Brad Smith wrote:


> The Realtek hardware in that dmesg is an NE2000 PCI adapter which
> the sparc64 kernel config indeed does not have a driver for at the
> very moment, although it could be added. Having a QEMU driver for
> the Happy Meal MAC would provide the best level of compatibility
> with other OS's as that is what comes with a lot of Sun systems.


Agreed. Once I've sorted out the NVRAM issues in theory QEMU should be 
able to run some older 64-bit Solaris 9-10 kernels, and I suspect I'll 
need to implement a virtual hme device in order for that to work. It 
seems like people on this list have quite a bit of SPARC experience, so 
would it be okay to ask questions about hme drivers on this list? Or 
would somewhere else be more appropriate?



> But for OpenBSD and sparc64 there are other options that could be
> used from QEMU's perspective such as the e1000 [em(4)], i82551 /
> i82559er [fxp(4)] and rtl8139 [re(4)] drivers that should work
> well.


Interesting. Longer term the aim of the QEMU project is to move the 
hardwired machine types into pluggable devices, e.g. you can build a 
whole machine on the command line from multiple -device parameters or 
preload the default machine types such as sun4u using instructions from 
a file. So while this is not practical now without source hacks, it is 
likely to become possible in the future.



ATB,

Mark.



Re: sparc64: fledgling QEMU support

2014-09-09 Thread Mark Cave-Ayland

On 09/09/14 20:04, Bryan Steele wrote:


> Neat! :-)
>
> It seems the GENERIC sparc64 kernel already has PCMCIA/CardBus ne(4), so
> adding 'ne* at pci?' might just work.
>
> OpenBSD/sparc64 already supports sun4v LDOMS, so there's drivers implementing
> the virtual protocols (..vnet(4)/vdsk(4)). Does QEMU support this?
>
> Could the PCI virtio stuff be adapted to non-x86 architectures?


QEMU already has a virtio PCI device that can be plugged into 
qemu-system-sparc64 (see Artyom's blog at 
http://tyom.blogspot.co.uk/2013/03/debiansparc64-wheezy-under-qemu-how-to.html 
for an example of how to do this with Linux).


This could be an amusing project; in theory it would be possible to work 
on an x86 laptop to test/debug big-endian virtio support with the help 
of QEMU's virtual hardware.  You can do this by plugging in a standard 
virtual cdrom/hd along with an additional virtio hd/nic, booting from 
the standard devices and then testing the drivers accessing the extra 
devices as required.


I should probably add that there may still be some CPU bugs lying 
around, and also you'd need a power source since I don't believe the 
UIIi processor has any power-saving instructions (or at least QEMU 
doesn't emulate them) which causes qemu-system-sparc64 to take a lot of 
CPU...



ATB,

Mark.



Re: sparc64: fledgling QEMU support

2014-09-09 Thread Mark Cave-Ayland

On 09/09/14 21:26, Miod Vallat wrote:


>> Interesting. Longer term the aim of the QEMU project is to move the
>> hardwired machine types into pluggable devices, e.g. you can build a whole
>> machine on the command line from multiple -device parameters or preload the
>> default machine types such as sun4u using instructions from a file. So while
>> this is not practical now without source hacks, it is likely to become
>> possible in the future.
>
> Do not expect any support for the fanciest device combinations. While
> most sparc64 systems will probably be able to cope with whatever
> five-feet sheeps you can build, sparc32 qemu will happily attempt to
> emulate systems which make no sense, physically, and dismissing reports
> that BSD does not run on such artificial setups is annoying, to say the
> least.


Oh sure. My point was more that the QEMU machine model will eventually 
become more flexible, which I see as something useful 
for development rather than production use. As I mentioned in one of my 
earlier emails, my aim is to get the basic sun4u Ultra 5 machine good 
enough to be able to run the main Linux/*BSD/Solaris OSs out of the box 
so the final choices of hardware for the virtual device model will be 
quite limited.



ATB,

Mark.



Re: sparc64: problem after trap table takeover under QEMU

2014-05-25 Thread Mark Cave-Ayland

On 08/05/14 20:28, Mark Kettenis wrote:


> Hi Mark,
>
> Interesting to see sparc64 support in QEMU.


Yeah, it's been a work in progress for quite a while now. There seem to 
be two main areas of interest: firstly for people who are now migrating 
away from SPARC but need to keep a legacy application(s), and secondly 
for open source projects interested in testing across multiple 
architectures.



>>> As soon as I step into address 0x1001804 then this is where things start
>>> to go wrong; the TLB (TTE) entry for 0x180 which is accessed by %sp
>>> is marked as privileged, but ASI 0x11 is user access only. QEMU's
>>> current behaviour for this is to generate a datafault for the page at
>>> 0x180 which seems to get all the way through to the retry at the end
>>> of winfixsave, but then hits the breakpoint trap above when executing
>>> the retry.


>> I've finally located the source of this bug thanks to more testing,
>> which showed that OpenBSD 4.9 was surprisingly also able to boot
>> (something I missed in my original bisection). This allowed me to
>> track down what was happening fairly easily. The problem is caused by
>> the fact that 0x180 has *two* mappings in the TLB and the way in
>> which QEMU resolves them.
>>
>> Compare the state of the TLB when the fill_0_normal trap occurs on
>> OpenBSD 5.5 (faults, incorrect) and OpenBSD 4.9 (no fault, correct):
>>
>>
>> OpenBSD 5.5:
>>
>> (qemu) info tlb
>> MMU contexts: Primary: 0, Secondary: 0
>> DMMU dump
>> ...
>> [14] VA: 180, PA: f40,   4M, priv, RW, locked, ctx 0 local
>> ...
>> [42] VA: 180, PA: f40,   8k, user, RW, unlocked, ctx 0 local
>> ...
>>
>> OpenBSD 4.9:
>>
>> (qemu) info tlb
>> MMU contexts: Primary: 0, Secondary: 0
>> DMMU dump
>> ...
>> [08] VA: 180, PA: f40,   8k, user, RW, unlocked, ctx 0 local
>> ...
>> [14] VA: 180, PA: f40,   4M, priv, RW, locked, ctx 0 local
>> ...
>>
>>
>> The bug occurs because the QEMU TLB algorithm currently searches the TLB
>> *in order* starting from entry 0 until it finds a VA match.
>>
>> In the OpenBSD 5.5 case, the first mapping it finds is the 4M privileged
>> mapping, and so the fill_0_normal trap which uses user ASI 0x11 faults
>> due to not being privileged. This is in contrast to the OpenBSD 4.9 case
>> where the first mapping it finds is the 8K unprivileged mapping, hence
>> the fill_0_normal trap succeeds and we proceed to boot.
>>
>> Does anyone know how real hardware resolves conflicts between multiple
>> TLB entries with the same VA? My guess would be that the smaller 8K
>> mapping should take priority, but the documentation in relation to
>> address aliasing is fairly non-existent so I'm wondering if there are any
>> other rules relating to whether privileged mappings should take priority
>> or not? Once the behaviour is known, it will be fairly easy to fix up
>> QEMU to match.


It seems that this first hypothesis was incorrect; after some help from 
the NetBSD guys we found out that all PROM mappings should default to 
privileged. So the issue is no longer to do with the difference between 
privileged/unprivileged mappings, but why does the fault occur in the 
first place?



> I don't know how the real hardware behaves.  But it certainly is the
> intention that the 4M locked mapping gets used as soon as we've
> taken over the trap table.  Not sure where the 8K mapping is coming
> from.


>> Finally it does raise an eyebrow that the first window trap taken when
>> the kernel takes over the trap table is a fill_0_normal *user* trap,
>> particularly when it's against an *unlocked* TLB entry which could
>> potentially have been evicted beforehand. It might be worth
>> double-checking as to whether this is the intended behaviour or not.


> Right.  It certainly isn't the intention that we end up in a
> fill_0_normal at this point.  Perhaps %wstate is initialized
> differently in QEMU than on real hardware?  The OpenBSD bootstrap code
> does set %wstate appropriately immediately after taking over the trap
> table.  We can't really do this earlier since we don't know the
> conventions used by the spill and fill handlers provided by the
> firmware.  But it looks like a Sun Fire T2000 actually initializes
> %wstate to 0.
>
> So perhaps we're just getting lucky on real hardware that the prom
> code doesn't spill our trap frame and therefore we don't have to fill
> it again.


After more work, I believe that your theory here is correct. Take a look 
at cpu_initialize() in locore.S:



/*
 * Initialize a CPU.  This is used both for bootstrapping the first CPU
 * and spinning up each subsequent CPU.  Basically:
 *
 *  Install trap table.
 *  Switch to the initial stack.
 *  Call the routine passed in in cpu_info->ci_spinup.
 */

_C_LABEL(cpu_initialize):

        wrpr    %g0, 0, %tl     ! Make sure we're not in NUCLEUS mode
        flushw

        /* Change the trap base register */
        set     _C_LABEL(trapbase), %l1
#ifdef SUN4V
        sethi   %hi(_C_LABEL(cputyp)), %l0
        ld      [%l0 + %lo(_C_LABEL(cputyp))], %l0
        cmp     %l0, CPU_SUN4V
        bne,pt  %icc, 1f
         nop
        set     _C_LABEL(trapbase_sun4v), %l1

Re: sparc64: problem after trap table takeover under QEMU

2014-05-08 Thread Mark Cave-Ayland

On 06/05/14 19:18, Mark Cave-Ayland wrote:

(cut)


> As soon as I step into address 0x1001804 then this is where things start
> to go wrong; the TLB (TTE) entry for 0x180 which is accessed by %sp
> is marked as privileged, but ASI 0x11 is user access only. QEMU's
> current behaviour for this is to generate a datafault for the page at
> 0x180 which seems to get all the way through to the retry at the end
> of winfixsave, but then hits the breakpoint trap above when executing
> the retry.


I've finally located the source of this bug thanks to more testing, 
which showed that OpenBSD 4.9 was surprisingly also able to boot 
(something I missed in my original bisection). This allowed me to 
track down what was happening fairly easily. The problem is caused by 
the fact that 0x180 has *two* mappings in the TLB and the way in 
which QEMU resolves them.


Compare the state of the TLB when the fill_0_normal trap occurs on 
OpenBSD 5.5 (faults, incorrect) and OpenBSD 4.9 (no fault, correct):



OpenBSD 5.5:

(qemu) info tlb
MMU contexts: Primary: 0, Secondary: 0
DMMU dump
...
[14] VA: 180, PA: f40,   4M, priv, RW, locked, ctx 0 local
...
[42] VA: 180, PA: f40,   8k, user, RW, unlocked, ctx 0 local
...

OpenBSD 4.9:

(qemu) info tlb
MMU contexts: Primary: 0, Secondary: 0
DMMU dump
...
[08] VA: 180, PA: f40,   8k, user, RW, unlocked, ctx 0 local
...
[14] VA: 180, PA: f40,   4M, priv, RW, locked, ctx 0 local
...


The bug occurs because the QEMU TLB algorithm currently searches the TLB 
*in order* starting from entry 0 until it finds a VA match.


In the OpenBSD 5.5 case, the first mapping it finds is the 4M privileged 
mapping, and so the fill_0_normal trap which uses user ASI 0x11 faults 
due to not being privileged. This is in contrast to the OpenBSD 4.9 case 
where the first mapping it finds is the 8K unprivileged mapping, hence 
the fill_0_normal trap succeeds and we proceed to boot.
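
The order dependence can be reproduced with a minimal first-match lookup in C. This is a sketch of the behaviour described above, not QEMU's actual MMU code; the struct layout and function name are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct tlb_entry {
    uint64_t va;        /* base virtual address of the mapping */
    uint64_t size;      /* mapping size in bytes, a power of two */
    bool     priv;      /* true if only privileged accesses are allowed */
    bool     valid;
};

/* Return the index of the first valid entry that covers va, or -1. */
int tlb_first_match(const struct tlb_entry *tlb, size_t n, uint64_t va)
{
    assert(tlb != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Mask off the offset within the page and compare the bases. */
        if (tlb[i].valid && (va & ~(tlb[i].size - 1)) == tlb[i].va)
            return (int)i;
    }
    return -1;
}
```

With a 4M privileged entry and an 8k user entry covering the same address, which one is returned depends purely on which slot comes first, which is the difference between the 5.5 and 4.9 dumps.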


Does anyone know how real hardware resolves conflicts between multiple 
TLB entries with the same VA? My guess would be that the smaller 8K 
mapping should take priority, but the documentation in relation to 
address aliasing is fairly non-existent so I'm wondering if there are any 
other rules relating to whether privileged mappings should take priority 
or not? Once the behaviour is known, it will be fairly easy to fix up 
QEMU to match.


Finally it does raise an eyebrow that the first window trap taken when 
the kernel takes over the trap table is a fill_0_normal *user* trap, 
particularly when it's against an *unlocked* TLB entry which could 
potentially have been evicted beforehand. It might be worth 
double-checking as to whether this is the intended behaviour or not.



Kind regards,

Mark.



sparc64: problem after trap table takeover under QEMU

2014-05-06 Thread Mark Cave-Ayland

Hi all,

I'm currently working on a set of patches for OpenBIOS (the OF 
implementation for QEMU) in order to get the various *BSD kernels to 
boot under QEMU SPARC64 with some success, but I'm struggling with a 
privilege violation trap which occurs on the first window fill trap 
after OpenBSD takes over the trap table. This is with the latest OpenBSD 
5.5 and with my current patchset the console output looks like this:



Loading FCode image...
Loaded 4829 bytes
entry point is 0x4000
OpenBSD IEEE 1275 Bootblock 1.3
..
Jumping to entry point 0010 for type 0001...
switching to new context: entry point 0x10 stack 0xffe8aa09
 OpenBSD BOOT 1.6
Trying bsd...
open /pci@1fe,0/pci-ata@5/ide1@600/cdrom@0:f/etc/random.seed: No such 
file or directory

Booting /pci@1fe,0/pci-ata@5/ide1@600/cdrom@0:f/bsd
3901336@0x100+6248@0x13b8798+3261984@0x180+932320@0x1b1c620
symbols @ 0xffc5a300 119 start=0x100

Unexpected client interface exception: -1
panic: trap type 0x101 (breakpoint): pc=1010254 npc=1010258 
pstate=99110414MG,PEF,PRIV

halted

EXIT


I asked around on IRC and it was suggested that I post the information 
here in order to get some further input on this. My feeling is that QEMU 
SPARC64 may be doing something different to real hardware but I don't 
have any to play with and this is my first dig into OpenBSD, so I'd 
really appreciate some pointers from interested parties.


The privilege violation trap I experience occurs just after OpenBSD 
invokes the OF SUNW,set-trap-table call and occurs in the epilogue of 
openfirmware() in locore.S at the final restore:


...
        rdpr    %pstate, %l0
        jmpl    %i4, %o7
         wrpr   %g0, PSTATE_PROM|PSTATE_IE, %pstate
        wrpr    %l0, %g0, %pstate
        mov     %l1, %g1
        mov     %l2, %g2
        mov     %l3, %g3
        mov     %l4, %g4
        mov     %l5, %g5
        mov     %l6, %g6
        mov     %l7, %g7
        wrpr    %i2, 0, %pil
        ret
         restore        %o0, %g0, %o0

What happens here is that when the final restore is executed in the 
delay slot, a fill_0_normal trap is generated which vectors into 
0x1001800 here:


(gdb) disas 0x1001800, 0x100185c
Dump of assembler code from 0x1001800 to 0x100185c:
=> 0x01001800:  wr  %g0, 0x11, %asi
   0x01001804:  ldxa  [ %sp + 0x7ff ] %asi, %l0
   0x01001808:  ldxa  [ %sp + 0x807 ] %asi, %l1
   0x0100180c:  ldxa  [ %sp + 0x80f ] %asi, %l2
   0x01001810:  ldxa  [ %sp + 0x817 ] %asi, %l3
   0x01001814:  ldxa  [ %sp + 0x81f ] %asi, %l4
   0x01001818:  ldxa  [ %sp + 0x827 ] %asi, %l5
   0x0100181c:  ldxa  [ %sp + 0x82f ] %asi, %l6
   0x01001820:  ldxa  [ %sp + 0x837 ] %asi, %l7
   0x01001824:  ldxa  [ %sp + 0x83f ] %asi, %i0
   0x01001828:  ldxa  [ %sp + 0x847 ] %asi, %i1
   0x0100182c:  ldxa  [ %sp + 0x84f ] %asi, %i2
   0x01001830:  ldxa  [ %sp + 0x857 ] %asi, %i3
   0x01001834:  ldxa  [ %sp + 0x85f ] %asi, %i4
   0x01001838:  ldxa  [ %sp + 0x867 ] %asi, %i5
   0x0100183c:  ldxa  [ %sp + 0x86f ] %asi, %fp
   0x01001840:  ldxa  [ %sp + 0x877 ] %asi, %i7
   0x01001844:  nop
   0x01001848:  sethi  %hi(0xe0018000), %g5
   0x0100184c:  ldx  [ %g5 + 0x10 ], %g5! 0xe0018010
   0x01001850:  ldx  [ %g5 + 0x28 ], %g5
   0x01001854:  xor  %g5, %i7, %i7
   0x01001858:  restored
End of assembler dump.
(gdb) info regi sp
sp             0x1800671        0x1800671
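
For reference, the 0x1800 offset follows from the SPARC V9 trap vector layout. The C sketch below shows the arithmetic as I understand it from the V9 manual; the function names are my own, and the trap base of 0x01000000 used when checking it is an assumption inferred from the 0x01001800 address in the dump above.

```c
#include <assert.h>
#include <stdint.h>

#define TT_FILL_NORMAL_BASE 0x0c0u  /* trap type of fill_0_normal */
#define TT_FILL_OTHER_BASE  0x0e0u  /* trap type of fill_0_other */

/*
 * Trap type for a window fill: the handler index n comes from
 * WSTATE.NORMAL (bits 2:0) when OTHERWIN is zero, otherwise from
 * WSTATE.OTHER (bits 5:3). Each handler spans four trap types.
 */
unsigned fill_trap_type(unsigned wstate, unsigned otherwin)
{
    unsigned n = otherwin ? (wstate >> 3) & 7u : wstate & 7u;
    unsigned base = otherwin ? TT_FILL_OTHER_BASE : TT_FILL_NORMAL_BASE;
    return base + 4u * n;
}

/*
 * Vector address: one trap type is 32 bytes of trap table, and traps
 * taken while TL was already non-zero use the upper half (bit 14).
 */
uint64_t trap_vector(uint64_t tba, unsigned tt, int tl_was_nonzero)
{
    assert(tt < 0x200u);
    return tba + (tl_was_nonzero ? 0x4000u : 0u) + ((uint64_t)tt << 5);
}
```

So with %wstate == 0 and %otherwin == 0, the restore lands in fill_0_normal at trap base + 0x1800, which matches the disassembly.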

As soon as I step into address 0x1001804 then this is where things start 
to go wrong; the TLB (TTE) entry for 0x180 which is accessed by %sp 
is marked as privileged, but ASI 0x11 is user access only. QEMU's 
current behaviour for this is to generate a datafault for the page at 
0x180 which seems to get all the way through to the retry at the end 
of winfixsave, but then hits the breakpoint trap above when executing 
the retry.
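
The check that produces the fault can be modelled in a few lines of C. This is only a model of the behaviour described above, with invented names, and whether real hardware agrees is exactly what I am asking; ASI 0x10 and 0x11 are the as-if-user primary and secondary ASIs.

```c
#include <assert.h>
#include <stdbool.h>

#define ASI_AIUP 0x10u  /* as-if-user primary */
#define ASI_AIUS 0x11u  /* as-if-user secondary */

enum access_result { ACCESS_OK, ACCESS_DATA_FAULT };

/* True for ASIs that make a privileged CPU access memory as user code would. */
bool asi_is_as_if_user(unsigned asi)
{
    assert(asi <= 0xffu);
    return asi == ASI_AIUP || asi == ASI_AIUS;
}

/* A privileged-only TTE faults when the access is effectively a user access. */
enum access_result check_access(bool cpu_privileged, unsigned asi, bool tte_priv)
{
    bool user_view = !cpu_privileged || asi_is_as_if_user(asi);
    return (tte_priv && user_view) ? ACCESS_DATA_FAULT : ACCESS_OK;
}
```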


Based on this I have a couple of questions about what is happening here:

1) Is the fill_0_normal (user-level) trap the correct one? Or does 
OpenBIOS need to do something with %otherwin to invoke a 
supervisor-level trap?


2) Is the QEMU SPARC64 behaviour of invoking a data_access_exception 
when accessing supervisor memory with a user ASI correct?


FWIW I also tried some older OpenBSD ISOs and found that this behaviour 
was introduced between the 4.3 and 4.4 releases, and older releases 
don't exhibit this problem. Repeating the same test in 4.3, which is the 
last release that doesn't trap with the breakpoint error above, shows 
that the fill_0_normal trap is still invoked in the openfirmware() 
epilogue, however the stack pointer is now different:


(gdb) info regi sp
sp             0x1c09621        0x1c09621

And I can confirm that page 0x1c08000 exists in the TLB but compared to 
the current release above *isn't* marked as privileged, so no fault 
occurs