Laszlo,
I think that one of your links is broken. Shouldn't it be:
http://www.intel.com/content/www/us/en/architecture-and-technology/unified-extensible-firmware-interface/intel-uefi-development-kit-debugger-tool.html
-Pike
________________________________
From: Laszlo Ersek <ler...@redhat.com>
To: edk2-devel@lists.sourceforge.net
Cc: Jeff Fan <jeff....@intel.com>
Sent: Sunday, 21 April 2013, 10:25
Subject: Re: [edk2] OVMF hang after application terminates
On 04/20/13 01:50, Jordan Justen wrote:
> On Fri, Apr 19, 2013 at 3:13 PM, Duane Voth <dua...@gmail.com> wrote:
>> Building OVMF with -D SOURCE_DEBUG_ENABLE is involved.
>
> Building with -D SOURCE_DEBUG_ENABLE should cause the EDK Debugger
> infrastructure to be enabled. This will try to speak a special edk2
> specific debug protocol over the serial port on OVMF.
>
> I've never been able to do much of anything with it...
>
> Maybe this page can help explain -D SOURCE_DEBUG_ENABLE?
> http://sourceforge.net/apps/mediawiki/tianocore/index.php?title=Debug_FAQ
I think I managed to set up the "debug solution" mostly, but the
emulated serial communication (on both sides) has such bad timing
characteristics that it is unusable.
This is the architecture:
                                                  debuggee
debugger ("host") VM                              ("target") VM
+-----------------------------------------------+ +---------------+
| gdb frontend <---TCP---> |                    | |               |
|                          | UDK gdb server <---serial---> OVMF   |
| netcat       <---TCP---> |                    | |               |
+-----------------------------------------------+ +---------------+
"netcat" would be used as a terminal IO frontend for OVMF.
One thing that I managed to figure out about the serial protocol between
the UDK gdb server and OVMF/DebugAgent is: bytes from OVMF with the high
bit clear are "transparent", i.e. they are not interpreted by the UDK gdb
server as debug protocol commands. Bytes with the high bit set introduce
debug protocol commands. This allows OVMF/DebugAgent to alternate
between sending debug commands and plain text on the same serial port.
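
A minimal Python sketch of that framing rule, useful for splitting a
captured byte stream into console text and debug packets. The function
name "demux" is hypothetical. The packet layout is an assumption inferred
only from the traces quoted later in this mail (for example
"fe 3f 06 00 59 ba" is six bytes long and its third byte is 06), not from
any protocol specification. A length field is needed because bytes inside
a packet may themselves have the high bit clear or set.

```python
def demux(stream: bytes):
    """Split a captured serial stream into plain text and debug packets.

    Assumptions (inferred from the traces in this mail, not from a spec):
    - a byte with the high bit set starts a debug packet;
    - the third byte of a packet holds the total packet length.

    Returns (text, packets, rest), where rest is any trailing bytes that
    could not be parsed yet (for example an incomplete packet).
    """
    text = bytearray()
    packets = []
    i = 0
    while i < len(stream):
        b = stream[i]
        if b & 0x80 == 0:
            # High bit clear: transparent console output.
            text.append(b)
            i += 1
            continue
        if i + 3 > len(stream):
            break  # header not complete yet
        length = stream[i + 2]
        if length < 3 or i + length > len(stream):
            break  # implausible length or packet not complete yet
        packets.append(stream[i:i + length])
        i += length
    return bytes(text), packets, stream[i:]


if __name__ == "__main__":
    capture = b"Hi" + bytes.fromhex("fe3f060059ba") + b"!\n"
    print(demux(capture))
```

If the length assumption is wrong for some command, the sketch stops at
that point and returns the unparsed remainder instead of guessing.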
My debugger VM was Fedora 18 (RHEL-6 is not recent enough); see the list
of supported Linux distros in the Intel UEFI Development Kit Debugger
Tool Configuration and Setup Guide (link below in (3)).
My host OS is RHEL-6. For these tests I put SELinux in permissive mode
(otherwise the unix domain socket stuff below would be denied, and
adding an SELinux module etc. is messy).
Steps to set up the debug solution:
(1) We need two VMs, with their (emulated) serial ports connected to
each other's. We could actually run the two VMs on separate hosts and
use TCP to forward the serial data, but that would only make the
latencies much worse. So I'm using a single host and a unix domain
socket to connect the VMs' serial ports.
Since I use libvirt, I edited the domain XMLs. Two nodes modified for
the debugger VM:
  <console type='pty'>
    <target type='virtio' port='0'/>
  </console>
  <serial type='unix'>
    <source mode='bind' path='/tmp/debug.sock'/>
    <target port='0'/>
  </serial>
Same two nodes modified for the debuggee VM (of course different contents):
  <console type='pty'>
    <target type='virtio' port='0'/>
  </console>
  <serial type='unix'>
    <source mode='connect' path='/tmp/debug.sock'/>
    <target port='0'/>
  </serial>
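
For readers unfamiliar with the two socket modes, here is a small Python
sketch of what "bind" and "connect" mean for a unix domain socket. It does
not involve qemu or libvirt at all; the function name and the temporary
socket path are made up for the illustration. The listening ("bind") end
has to exist before the other end can connect, which is presumably why the
debugger VM is started before the debuggee VM in the steps below.

```python
import os
import socket
import tempfile


def null_modem_demo(payload: bytes) -> bytes:
    """Send payload from the 'connect' end to the 'bind' end of a socket."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "debug.sock")  # stand-in for /tmp/debug.sock

        # The mode='bind' side creates the socket file and listens on it.
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.listen(1)

        # The mode='connect' side attaches to the existing socket file.
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)
        conn, _ = server.accept()
        try:
            client.sendall(payload)
            client.shutdown(socket.SHUT_WR)  # signal end of data
            received = b""
            while True:
                chunk = conn.recv(4096)
                if not chunk:
                    break
                received += chunk
            return received
        finally:
            conn.close()
            client.close()
            server.close()


if __name__ == "__main__":
    print(null_modem_demo(b"hello over the virtual null modem cable"))
```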
(2) Start the debugger VM.
(3) In the debugger VM, download and install the debug tools (docs
available in the same place). The v1.2 bundle is available under
http://www.intel.com/content/www/us/en/architecture-and-technology/unified-extensible-firmware-interface/intel
uefi-development-kit-debugger-tool.html
but make sure you download v1.3 instead:
http://uefidk.intel.com/develop/intel-uefi-tools-and-utilities/intel-uefi-development-kit-debugger-tool
because starting with edk2 svn r14083 from Jan 25th 2013, the edk2 Debug
Agent only works with 1.3.
(4) Still in the debugger VM, build OVMF.
I used svn r14242, with the following extra patches (the second for
gcc-4.7 in Fedora 18):
- http://thread.gmane.org/gmane.comp.bios.tianocore.devel/1419/focus=1462
-
http://sourceforge.net/mailarchive/forum.php?thread_name=1341995918-22888-1-git-send-email-pbonzini%40redhat.c
m&forum_name=edk2-buildtools-devel
Then
OvmfPkg/build.sh -p OvmfPkg/OvmfPkgX64.dsc -D FD_SIZE_2MB \
-D SOURCE_DEBUG_ENABLE -a X64 -b DEBUG -t GCC47 -n 4
(5) Copy OVMF.fd just built from the debugger VM to the host OS.
(6) In the debugger VM, start
udk-gdb-server
It should print something like
UDK GDB Server - Version 1.3
Waiting for the connection from the Target...
Debugging through serial port (/dev/ttyS0:115200:Hardware)
Redirect TARGET output to TCP port (20715).
(7) Start the debuggee VM. As boot firmware it should use the OVMF.fd
file built in step (4).
(8) [theory] When the "udk-gdb-server" command from step (6) prints
something like
GdbServer on <HOST> is waiting for connection on port 1234
Connect with 'target remote <HOST>:1234'
Now fire up two further shells in the debugger VM. In the first, start
the gdb frontend (see the docs linked in (3) for how); in the second,
connect netcat to port 20715. In gdb we'd then execute a python script
coming with the UDK GDB server installation (load symbols from the build
tree in step (4) etc), and netcat would be used for terminal
communication with OVMF.
Of course the problem is that (8) never happens. I saw the "waiting for
connection on port 1234" appear exactly once during countless attempts,
and even then the thing wasn't usable.
So what goes wrong?
(a) I verified that my "virtual null modem cable" transfers data
correctly (data-wise) by running "minicom" in both virtual machines.
Whatever I type in one appears in the other.
(b) After this I tried to "listen-in" on the communication. Of course
when the debuggee VM is booting OVMF, I can't run "minicom" in the same
debuggee VM. However I can run "minicom" on the debugger VM at the same
time the debugger VM is running the UDK GDB server, and I can eavesdrop
on the data incoming from OVMF/DebugAgent.
As I said above, the serial protocol alternates between plaintext ASCII
messages and debug protocol commands. The messages I witnessed from
OVMF/DebugAgent were:
Send INIT break packet and try to connect the HOST (Intel(R) UDK
Debugger Tool v1.3) ...
DebugPortReadBuffer(Command) timeout
DebugPortReadBuffer(SequenceNo) timeout
HOST connection is failed!
interspersed with binary stuff (debug proto commands). The messages are
printed by code in
"SourceLevelDebugPkg/Library/DebugAgent/DebugAgentCommon":
AttachHost() [DebugAgent.c]
SendCommandAndWaitForAckOK()
ReceivePacket()
DebugPortReadBuffer()
I started messing with the communication parameters:
- I recompiled "PcAtChipsetPkg/Library/SerialIoLib/SerialPortLib.c" with
"gBps" set to 9600 (with a corresponding change on the debugger VM side
in "/etc/udkdebugger.conf" -- in the latter I even tried disabling flow
control),
- I set "RetryCount" from 3 to 30 in SendCommandAndWaitForAckOK(),
- finally I changed the READ_PACKET_TIMEOUT macro in "DebugAgent.h" from
half a second to three seconds.
This final step proved decisive (even without the first two); at the
next try OVMF/DebugAgent immediately wrote
HOST connection is successful!
according to my minicom "wiretap" co-running in the debugger VM. Still
the debug session didn't start, and the message in (8) didn't appear.
(c) I turned to the debugger side. "/etc/udkdebugger.conf" contains the
stanza
[Maintenance]
# Uncomment the below line to turn on tracing
# Trace = 0x1f
After I did just that and restarted udk-gdb-server (NB ^C is not enough to
stop it, it has runaway children, use "killall"), and started the
debuggee VM (step 7) too, udk-gdb-server began to log this to the terminal:
UDK GDB Server - Version 1.3
Waiting for the connection from the Target...
Debugging through serial port (/dev/ttyS0:115200:Hardware)
Redirect TARGET output to TCP port (20715).
Send INIT break packet and try to connect the HOST (Intel(R) UDK
Debugger Tool v1.3) ...
Received data [ fe 3f 06 00 59 ba ]
Request = 0
m_WaitingAckForReset is 0 when INIT_BREAK
Sent data [ fe 80 06 00 9f66 ]
ProcessAsyncCommand(): Running = FALSE when INIT_BREAK
HandleInitBreak() called
PutDebuggerSetting() called: Key = 2 Value = 1f
Communicate-4() called
Sent data [ fe 11 08 01 ef 09 02 1f ]
Timeout...
Sent data [ fe 11 08 01 ef 09 02 1f ]
Recv timeout
Timeout...
Sent data [ fe 11 08 01 ef 09 02 1f ]
Recv timeout
Timeout...
Sent data [ fe 11 08 01 ef 09 02 1f ]
Recv timeout
HOST connection is successful!
Timeout...
Sent data [ fe 11 08 01 ef 09 02 1f ]
Recv timeout
Timeout...
Recv timeout
Sent data [ fe 11 08 01 ef 09 02 1f ]
Timeout...
Sent data [ fe 11 08 01 ef 09 02 1f ]
Recv timeout
Timeout...
Sent data [ fe 11 08 01 ef 09 02 1f ]
Recv timeout
SendAckPacket: SequenceNo = 1
Sent data [ FE 80 06 01 57 AC ]
Received data [ fe 80 06 01 57 ac ]
Communicate-4() returning
PutDebuggerSetting() returning
PutDebuggerSetting() called: Key = 1 Value = 0
Communicate-4() called
Sent data [ fe 11 08 02 50 bd 01 00 ]
TARGET: Try to get command from HOST...
and so on and so forth, with a huge number of timeouts. There's also
some stale data:
TARGET: Receive one old command[1] agaist command[1]
SendAckPacket: SequenceNo = 1
Sent data [ FE 80 06 01 57 AC ]
Received data [ fe 80 06 01 57 ac ]
Receive old response (1), ignore it
Clearly I'd have to increase the debugger-side receive timeout,
similarly to step (b).
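
To put numbers on the retransmissions, a trace like the one above could be
saved to a file and summarized with a few lines of Python. This is only a
sketch: the function name "summarize" is hypothetical, and the line
formats it matches ("Sent data [ ... ]", "Timeout...", "Recv timeout") are
simply copied from the output quoted in this mail.

```python
import re
from collections import Counter

# Matches e.g. "Sent data [ fe 11 08 01 ef 09 02 1f ]" (hex in either case).
SENT = re.compile(r"^Sent data \[\s*((?:[0-9A-Fa-f]{2}\s*)+)\]")


def summarize(trace: str):
    """Count how often each packet was sent and how many timeouts occurred.

    Returns (sends, timeouts): sends maps packet bytes to the number of
    times that exact packet was sent, timeouts counts the "Timeout..." and
    "Recv timeout" lines together.
    """
    sends = Counter()
    timeouts = 0
    for raw in trace.splitlines():
        line = raw.strip()
        match = SENT.match(line)
        if match:
            packet = bytes.fromhex(match.group(1).strip())
            sends[packet] += 1
        elif line in ("Timeout...", "Recv timeout"):
            timeouts += 1
    return sends, timeouts


if __name__ == "__main__":
    sample = """\
Sent data [ fe 11 08 01 ef 09 02 1f ]
Timeout...
Sent data [ fe 11 08 01 ef 09 02 1f ]
Recv timeout
Sent data [ FE 80 06 01 57 AC ]
"""
    sends, timeouts = summarize(sample)
    for packet, count in sends.items():
        print(packet.hex(" "), "sent", count, "time(s)")
    print("timeouts:", timeouts)
```

A packet that shows up many times with timeouts in between is one the
server kept resending because no answer arrived in time.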
Unfortunately, "udk-gdb-server" is a packed (self-extracting)
executable. It extracts to a directory called /tmp/_MEIXXXXXX (the six
X's replaced as in mktemp()). The contents of this directory are
-rwx------. 1 root root 38496 Apr 21 09:11 bz2.so
-rwx------. 1 root root 152560 Apr 21 09:11 _ctypes.so
-rwx------. 1 root root 96896 Apr 21 09:11 datetime.so
-rwx------. 1 root root 183815 Apr 21 09:11 DebugInterface.so
-rwx------. 1 root root 21680 Apr 21 09:11 _heapq.so
-rwx------. 1 root root 1852792 Apr 21 09:11 libcrypto.so.1.0.0
-rwx------. 1 root root 88384 Apr 21 09:11 libgcc_s.so.1
-rwx------. 1 root root 3059040 Apr 21 09:11 libpython2.7.so.1.0
-rwx------. 1 root root 374608 Apr 21 09:11 libssl.so.1.0.0
with the main logic in "DebugInterface.so" presumably. Some googling
fingers <http://www.pyinstaller.org/> as the packer.
In English, I don't have the source code for udk-gdb-server.
/etc/udkdebugger.conf doesn't offer a receive timeout knob either.
So I gave up. The default receive timeout seems too low for the
latencies that the two qemu processes, emulating the serial ports,
introduce, even though the "virtual null modem cable" seems clean data-wise.
Thanks,
Laszlo
------------------------------------------------------------------------------
Precog is a next-generation analytics platform capable of advanced
analytics on semi-structured data. The platform includes APIs for building
apps and a phenomenal toolset for data science. Developers can use
our toolset for easy data analysis & visualization. Get a free account!
http://www2.precog.com/precogplatform/slashdotnewsletter
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/edk2-devel