Hi! I'm a major contributor to the gem5 open source computer
architecture simulator, and I'm trying to get SeaBIOS and FreeDOS to
run on it. We've had at least some level of x86 support on our
simulator for a number of years now, but we've primarily focused on 64
bit mode. I've found a lot of bugs in our simulator as I've been going
along, but despite my best efforts I haven't been able to find a way
to blame my code for the bug I'm currently stuck on.
I'm using a very stripped down configuration of SeaBIOS, and am using
the serial console to interact with it since I haven't written a
simulated VGA interface yet. It reads FreeDOS from their published
QEMU disk image, and it starts up and prints a menu where it wants me
to select from 4 different boot modes. Roughly when I send a character
over the serial connection, the simulator crashes because the software
running on it tried to access an address that has nothing behind it.
I've been tracing this problem down, and I see an int 0x16 with code
0x1 happening, which is trying to check the keyboard status, I
believe. That goes along, and eventually calls check_irqs, which calls
need_hop_back which returns 1, and then calls "stack_hop_back" to call
back into itself but on the "original callers stack" I 75% know what
that's talking about, but I'm not 100%.
Anyway, once we're on the other stack, we call into the
clock_poll_irq, that calls clock_update, that calls the (inlined I
think) sercon_check_event, and when that tries to
SET_LOW(rx_buf[rx_bytes], byte) the bad access happens. At least I'm
pretty confident that's where it happens, it could also be in one of
the other lines right around there.
This is probably a bug in src/sercon.c. rx_bytes is marked VARLOW. The
following code is not tested but I think
SET_LOW(rx_buf[GET_LOW(rx_bytes)], byte);
is correct.
I used objdump -drS -m i8086 rom16.o for the disassembly listing.
f3d8: 8e c3 mov %bx,%es
if (GET_LOW(rx_bytes) < sizeof(rx_buf)) {
f3da: 26 8a 16 79 f1 mov %es:-0xe87,%dl
f3df: 80 fa 0f cmp $0xf,%dl
f3e2: 77 e6 ja f3ca <ExtraStack+0x1f2>
SET_LOW(rx_buf[GET_LOW(rx_bytes)], byte);
f3e4: 66 0f b6 ca movzbl %dl,%ecx
f3e8: 26 67 88 81 7c f1 00 mov %al,%es:0xf17c(%ecx)
f3ef: 00
SET_LOW(rx_bytes, GET_LOW(rx_bytes) + 1);
f3f0: 66 42 inc %edx
f3f2: 26 88 16 79 f1 mov %dl,%es:-0xe87
count++;
f3f7: 66 46 inc %esi
f3f9: eb cf jmp f3ca <ExtraStack+0x1f2>
With best regards,
Volker
The problem seems to be that the variable it's trying to access is
supposed to be in the "e" segment, ie with selector 0xe000 and base
address 0xe0000. The code that does this is here:
fbd43: 8e c3 mov %bx,%es
fbd45: 26 8a 16 9d f7 mov %es:-0x863,%dl
fbd4a: 80 fa 0f cmp $0xf,%dl
fbd4d: 77 e6 ja fbd35 <clock_update+0x6b>
fbd4f: 66 0f b6 0e 9d f7 movzbl -0x863,%ecx <==== where it
asplodes
fbd55: 26 67 88 81 a0 f7 00 mov %al,%es:0xf7a0(%ecx)
Note the comparison against 0xf, which I think is where it checks
against the size of rx_buf.
You can see here that this access is (I think) using the %ds register
by default. It has an operand size prefix, and a 2 byte displacement
of 0xf79d. Adding this to 0xe0000 gives 0xef79d, which from what I've
seen is a pretty valid looking address, not far below where I have the
BIOS ROM mapped in.
Unfortunately when this has problems, %ds is actually 0x9d80, which
gives a base of 0x9d800, which gives a linear address of 0xacf9d. This
is in the middle of the (not yet implemented) VGA framebuffer which is
why it dies.
I then traced down why %ds has this value, and it's from the "hop
back" step, specifically here:
asm volatile(
// Backup stack_pos and current %ss/%esp
"movl %6, %4\n"
"movw %%ss, %w3\n"
"movl %%esp, %6\n"
// Restore original callers' %ss/%esp
"movl -4(%4), %5\n"
"movl %5, %%ss\n" <======== Where %ss is set
"movw %%ds:-8(%4), %%sp\n"
"movl %5, %%ds\n" <======== Where %ds is set
// Call func
Note that in this code, *both* %ss and %ds are being set, and being
set to the same thing. This value *was* successfully pulled off the
saved data from when the int was originally called as far as I can
tell, but this value of %ds does *not* seem to be correct, since the
first time it's used it causes the bad access.
Could you please help me figure out what's going wrong here? Is this
supposed to work out somehow, and my simulator is just wrong (my bet,
but what's it doing wrong?), or is this a bug in SeaBIOS? Am I using
SeaBIOS in some way it's known not to work?
Please let me know if you need any other info, I'll be more than happy
to get this sorted out!
Gabe
_______________________________________________
SeaBIOS mailing list -- seabios@seabios.org
To unsubscribe send an email to seabios-le...@seabios.org
_______________________________________________
SeaBIOS mailing list -- seabios@seabios.org
To unsubscribe send an email to seabios-le...@seabios.org