Oops, apologies for the late reply. Reading through it again, I found a few more small nits:
On Tue, Jun 06, 2023 at 00:48:39 -0500, Glenn Washburn wrote:
> Debugging GRUB can be tricky and require arcane knowledge. This will
> help those unfamiliar with the process to get started debugging GRUB
> with less effort.
>
> Signed-off-by: Glenn Washburn <developm...@efficientek.com>
> ---
> Changes from v1:
> * Add gdbinfo section
> ---
> Interdiff against v2:
> diff --git a/docs/grub-dev.texi b/docs/grub-dev.texi
> index 188ca9c7ca6e..72470b42c61a 100644
> --- a/docs/grub-dev.texi
> +++ b/docs/grub-dev.texi
> @@ -638,7 +638,7 @@ various targets using @command{gdb} and the
> @samp{gdb_grub} GDB script.
> @section i386-pc
>
> The i386-pc target is a good place to start when first debugging GRUB2
> -because in some respects its easier than EFI platforms. The reason being
> +because in some respects it's easier than EFI platforms. The reason being
> that the initial load address is always known in advance. To start
> debugging GRUB2 first QEMU must be started in GDB stub mode. The following
> command is a simple illustration:
> @@ -688,11 +688,11 @@ it does add the module symbols with the appropriate
> offset.
> @section x86_64-efi
>
> Using GDB to debug GRUB2 for the x86_64-efi target has some similarities
> with
> -the i386-pc target. Please read be familiar with the @ref{i386-pc} section
> -when reading this one. Extra care must be used to run QEMU such that it
> boots
> -a UEFI firmware. This usually involves either using the @samp{-bios} option
> -with a UEFI firmware blob (eg. @file{OVMF.fd}) or loading the firmware via
> -pflash. This document will not go further into how to do this as there are
> +the i386-pc target. Please read and familiarize yourself with the
> @ref{i386-pc}
> +section when reading this one. Extra care must be used to run QEMU such
> that it
> +boots a UEFI firmware. This usually involves either using the @samp{-bios}
> +option with a UEFI firmware blob (eg. @file{OVMF.fd}) or loading the
> firmware
> +via pflash. This document will not go further into how to do this as there
> are
> ample resource on the web.
>
> Like all EFI implementations, on x86_64-efi the (U)EFI firmware that loads
> @@ -700,7 +700,7 @@ the GRUB2 EFI application determines at runtime where
> the application will
> be loaded. This means that we do not know where to tell GDB to load the
> symbols for the GRUB2 core until the (U)EFI firmware determines it. There
> are
> two good ways of figuring this out when running in QEMU: use a @ref{OVMF
> debug log,
> -debug build of OVMF} and check the debug log or have GRUB2 say where it is
> +debug build of OVMF} and check the debug log, or have GRUB2 say where it is
> loaded. Neither of these are ideal because they both generally give the
> information after GRUB2 is already running, which makes debugging early
> boot
> infeasible. Technically, the first method does give the load address before
> @@ -734,11 +734,11 @@ application must be run via QEMU at least once prior
> in order to get the
> load address. Two methods for obtaining the load address are described in
> two subsections below. Generally speaking, the load address does not change
> between QEMU runs. There are exceptions to this, namely that different
> -GRUB2 EFI applications can be run at different addresses. Also, its been
> +GRUB2 EFI applications can be run at different addresses. Also, it has been
> observed that after running the EFI application for the first time, the
> second run will some times have a different load address, but subsequent
> runs of the same EFI application will have the same load address as the
> -second run. And its a near certainty that if the GRUB EFI binary has
> changed,
> +second run. And it's a near certainty that if the GRUB EFI binary has
> changed,
> eg. been recompiled, the load address will also be different.
>
> This ability to predict what the load address will be allows one to assume
> @@ -752,7 +752,7 @@ gdb -x gdb_grub -ex 'dynamic_load_symbols @var{address
> of .text section}'
> @end example
>
> If you load the symbols in this manner and, after continuing execution, do
> -not see output showing the loading of modules symbol, then its very likely
> +not see output showing the loading of modules symbol, then it is very
> likely
> that the load address was incorrect.
>
> Another thing to be aware of is how the loading of the GRUB image by the
> @@ -760,8 +760,8 @@ firmware affects previously set software breakpoints.
> On x86 platforms,
> software breakpoints are implemented by GDB by writing a special processor
> instruction at the location of the desired breakpoint. This special
> instruction
> when executed will stop the program execution and hand control to the
> -debugger, GDB. GDB will first saves the instruction bytes that will be
> -overwritten at the breakpoint, and will put them back when the breakpoint
> +debugger, GDB. GDB will first save the instruction bytes that are
> +overwritten at the breakpoint and will put them back when the breakpoint
> is hit. If GRUB is being run for the first time in QEMU, the firmware will
> be loading the GRUB image into memory where every byte is already set to 0.
> This means that if a breakpoint is set before GRUB is loaded, GDB will save
>
> docs/grub-dev.texi | 224 +++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 224 insertions(+)
>
> diff --git a/docs/grub-dev.texi b/docs/grub-dev.texi
> index 31eb99ea2994..72470b42c61a 100644
> --- a/docs/grub-dev.texi
> +++ b/docs/grub-dev.texi
> @@ -79,6 +79,7 @@ This edition documents version @value{VERSION}.
> * Contributing Changes::
> * Setting up and running test suite::
> * Updating External Code::
> +* Debugging::
> * Porting::
> * Error Handling::
> * Stack and heap size::
> @@ -595,6 +596,229 @@ cp minilzo-2.10/*.[hc] grub-core/lib/minilzo
> rm -r minilzo-2.10*
> @end example
>
> +@node Debugging
> +@chapter Debugging
> +
> +GRUB2 can be difficult to debug because it runs on the bare-metal and thus
> +does not have the debugging facilities normally provided by an operating
> +system. This chapter aims to provide useful information on some ways to
> +debug GRUB2 for some architectures. It by no means intends to be exhaustive.
> +The focus will be one x86_64 and i386 architectures. Luckily for some issues
> +virtual machines have made the ability to debug GRUB2 much easier, and this
> +chapter will focus debugging via the QEMU virtual machine. We will not be
> +going over debugging of the userland tools (eg. grub-install), there are
> +many tutorials on debugging programs in userland.
> +
> +You will need GDB and the QEMU binaries for your system, on Debian these
> +can be installed with the @samp{gdb} and @samp{qemu-system-x86} packages.
> +Also it is assumed that you have already successfully compiled GRUB2 from
> +source for the target specified in the section below and have some
> +familiarity with GDB. When GRUB2 is built it will create many different
> +binaries. The ones of concern will be in the @file{grub-core}
> +directory of the GRUB2 build dir. To aide in debugging we will want the
> +debugging symbols generated during the build because these symbols are not
> +kept in the binaries which get installed to the boot location. The build
> +process outputs two sets of binaries, one without symbols which gets executed
> +at boot, and another set of ELF images with debugging symbols. The built
> +images with debugging symbols will have a @file{.image} suffix, and the ones
> +without a @file{.img} suffix. Similarly, loadable modules with debugging
> +symbols will have a @file{.module} suffix, and ones without a @file{.mod}
> +suffix. In the case of the kernel the binary with symbols is named
> +@file{kernel.exec}.
> +
> +In the following sections, information will be provided on debugging on
> +various targets using @command{gdb} and the @samp{gdb_grub} GDB script.
> +
> +@menu
> +* i386-pc::
> +* x86_64-efi::
> +@end menu
> +
> +@node i386-pc
> +@section i386-pc
> +
> +The i386-pc target is a good place to start when first debugging GRUB2
> +because in some respects it's easier than EFI platforms. The reason being
> +that the initial load address is always known in advance. To start
> +debugging GRUB2 first QEMU must be started in GDB stub mode. The following
> +command is a simple illustration:
> +
> +@example
> +qemu-system-i386 -drive file=disk.img,format=raw \
> + -device virtio-scsi-pci,id=scsi0 -S -s
> +@end example
> +
> +This will start a QEMU instance booting from @file{disk.img}. It will pause
> +at start waiting for a GDB instance to attach to it. You should change
> +@file{disk.img} to something more appropriate. A block device can be used,
> +but you may need to run QEMU as a privileged user.
> +
> +To connect to this QEMU instance with GDB, the @code{target remote} GDB
> +command must be used. We also need to load a binary image, preferably with
> +symbols. This can be done using the GDB command @code{file kernel.exec}, if
> +GDB is started from the @file{grub-core} directory in the GRUB2 build
> +directory. GRUB2 developers have made this more simple by including a GDB
> +script which does much of the setup. This file at @file{grub-core/gdb_grub}
> +of the build directory and is also installed via @command{make install}.
This sentence is definitely missing an "is" or similar, but I'd write
something like:

  This file is at grub-core/gdb_grub in the build directory

> +If not building GRUB, the distribution may have a package which installs
> +this GDB script along with debug symbol binaries, such as Debian's
> +@samp{grub-pc-dbg} package. The GDB scripts is intended to by used

If it's just a single script, this should be:

  The GDB script is intended to be used

> +like so, assuming:

Did you forget to state what the assumption is?

> +
> +@example
> +cd $(dirname /path/to/script/gdb_grub)
> +gdb -x gdb_grub
> +@end example
> +
> +Once GDB has been started with the @file{gdb_grub} script it will
> +automatically connect to the QEMU instance. You can then do things you
> +normally would in GDB like set a break point on @var{grub_main}.
> +
> +Setting breakpoints in modules is trickier since they haven't been loaded
> +yet and are loaded at addresses determined at runtime. The module could be
> +loaded to different addresses in different QEMU instances. The debug symbols
> +in the modules @file{.module} binary, thus are always wrong, and GDB needs
> +to be told where to load the symbols to. But this must happen at runtime
> +after GRUB2 has determined where the module will get loaded. Luckily the
> +@file{gdb_grub} script takes care of this with the
> @command{runtime_load_module}
> +command, which configures GDB to watch for GRUB2 module loading and when
> +it does add the module symbols with the appropriate offset.
> +
> +@node x86_64-efi
> +@section x86_64-efi
> +
> +Using GDB to debug GRUB2 for the x86_64-efi target has some similarities with
> +the i386-pc target. Please read and familiarize yourself with the
> @ref{i386-pc}
> +section when reading this one. Extra care must be used to run QEMU such that
> it
> +boots a UEFI firmware. This usually involves either using the @samp{-bios}
> +option with a UEFI firmware blob (eg. @file{OVMF.fd}) or loading the firmware
> +via pflash. This document will not go further into how to do this as there
> are
> +ample resource on the web.
> +
> +Like all EFI implementations, on x86_64-efi the (U)EFI firmware that loads
> +the GRUB2 EFI application determines at runtime where the application will
> +be loaded. This means that we do not know where to tell GDB to load the
> +symbols for the GRUB2 core until the (U)EFI firmware determines it. There are
> +two good ways of figuring this out when running in QEMU: use a @ref{OVMF
> debug log,
> +debug build of OVMF} and check the debug log, or have GRUB2 say where it is
> +loaded. Neither of these are ideal because they both generally give the
> +information after GRUB2 is already running, which makes debugging early boot
> +infeasible. Technically, the first method does give the load address before
> +GRUB2 is run, but without debugging the EFI firmware with symbols, the author
> +currently does not know how to cause the OVMF firmware to pause at that point
> +to use the load address before GRUB2 is run.
> +
> +Even after getting the application load address, the loading of core symbols
> +is complicated by the fact that the debugging symbols for the kernel are in
> +an ELF binary named @file{kernel.exec} while what is in memory are sections
> +for the PE32+ EFI binary. When @command{grub-mkimage} creates the PE32+
> +binary it condenses several segments from the ELF kernel binary into one
> +.data section in the PE32+ binary. This must be taken into account to
> +properly load the other non-text sections. Otherwise, GDB will work as
> +expected when breaking on functions, but, for instance, global variables
> +will point to the wrong address in memory and thus give incorrect values
> +(which can be difficult to debug).
> +
> +The calculating of the correct offsets for sections when loading symbol
> +files are taken care of when loading the kernel symbols via the user-defined

This sentence feels a bit clumsy.
I'd write something like:

  Calculating the correct offsets for sections is taken care of
  automatically when loading the kernel symbols via the user-defined...

I was originally going to suggest "section offsets" here too, but I'm not
confident that it couldn't potentially mean something else in this context.

> +GDB command @command{dynamic_load_kernel_exec_symbols}, which takes one
> +argument, the address where the text section is loaded, as determined by

I would personally drop the second comma in "argument, ... as determined by".

> +one of the methods above. Alternatively, the command
> @command{dynamic_load_symbols}
> +with the text section address as an agrument can be called to load the
> +kernel symbols and setup loading the module symbols as they are loaded at

"setup" should probably be "set up".

> +runtime.
> +
> +In the author's experience, when debugging with QEMU and OVMF, to have
> +debugging symbols loaded at the start of GRUB2 execution the GRUB2 EFI
> +application must be run via QEMU at least once prior in order to get the
> +load address. Two methods for obtaining the load address are described in
> +two subsections below. Generally speaking, the load address does not change
> +between QEMU runs. There are exceptions to this, namely that different
> +GRUB2 EFI applications can be run at different addresses. Also, it has been
> +observed that after running the EFI application for the first time, the
> +second run will some times have a different load address, but subsequent

"some times" should probably be "sometimes".

> +runs of the same EFI application will have the same load address as the
> +second run. And it's a near certainty that if the GRUB EFI binary has
> changed,
> +eg. been recompiled, the load address will also be different.
> +
> +This ability to predict what the load address will be allows one to assume
> +the load address on subsequent runs and thus load the symbols before GRUB2
> +starts. The following command illustrates this, assuming that QEMU is
> +running and waiting for a debugger connection and the current working
> +directory is where @file{gdb_grub} resides:
> +
> +@example
> +gdb -x gdb_grub -ex 'dynamic_load_symbols @var{address of .text section}'
> +@end example
> +
> +If you load the symbols in this manner and, after continuing execution, do
> +not see output showing the loading of modules symbol, then it is very likely

Would this make more sense as "showing the module symbols loading"?

> +that the load address was incorrect.
> +
> +Another thing to be aware of is how the loading of the GRUB image by the
> +firmware affects previously set software breakpoints. On x86 platforms,
> +software breakpoints are implemented by GDB by writing a special processor
> +instruction at the location of the desired breakpoint. This special
> instruction
> +when executed will stop the program execution and hand control to the
> +debugger, GDB. GDB will first save the instruction bytes that are
> +overwritten at the breakpoint and will put them back when the breakpoint
> +is hit. If GRUB is being run for the first time in QEMU, the firmware will
> +be loading the GRUB image into memory where every byte is already set to 0.
> +This means that if a breakpoint is set before GRUB is loaded, GDB will save
> +the 0-byte(s) where the the special instruction will go. Then when the
> firmware
> +loads the GRUB image and because it is unaware of the debugger, it will
> +write the GRUB image to memory, overwriting anything that was there
> previously,
> +notably in this case the instruction that implements the software breakpoint.

I would probably split "notably in this case ..." off into its own sentence.

> +This will be confusing for the person using GDB because GDB will show the
> +breakpoint as set, but the brekapoint will never be hit. Furthermore, GDB
> +then becomes confused, such that even deleting an recreating the breakpoint
> +will not create usable breakpoints. The @file{gdb_grub} script takes care of
> +this by saving the breakpoints just before they are overwritten, and then
> +restores them at the start of GRUB execution. So breakpoints for GRUB can be
> +set before GRUB is loaded, but be mindful of this effect if you are confused
> +as to why breakpoints are not getting hit.
> +
> +Also note, that hardware breakpoints do not suffer this problem. They are
> +implemented by having the breakpoint address in special debug registers on
> +the CPU. So they can always be set freely without regard to whether GRUB has
> +been loaded or not. The reason that hardware breakpoints aren't always used
> +is because there are a limited number of them, usually around 4 on various
> +CPUs, and specifically exactly 4 for x86 CPUs. The @file{gdb_grub} script
> +goes out of its way to not use hardware breakpoints internally and when
> +needed use them as short a time as possible, thus allowing the user to have a

I'd write this as:

  The gdb_grub script goes out of its way to avoid using hardware
  breakpoints internally, and when needed, uses them as briefly as
  possible, thus allowing the user...

> +maximal number at their disposal.
> +
> +@node OVMF debug log
> +@subsection OVMF debug log
> +
> +In order to get the GRUB2 load address from OVMF, first, a debug build
> +of OVMF must be obtained
> (@uref{https://github.com/retrage/edk2-nightly/raw/master/bin/DEBUGX64_OVMF.fd,
> +here is one} which is not officially recommended). OVMF will output debug
> +messages to a special serial device, which we must add to QEMU. The following
> +QEMU command will run the debug OVMF and write the debug messages to a
> +file named @file{debug.log}. It is assumed that @file{disk.img} is a disk
> +image or block device that is setup to boot GRUB2 EFI.

This "setup" should probably be "set up" as well.
- Oskari

> +
> +@example
> +qemu-system-x86_64 -bios /path/to/debug/OVMF.fd \
> + -drive file=disk.img,format=raw \
> + -device virtio-scsi-pci,id=scsi0 \
> + -debugcon file:debug.log -global isa-debugcon.iobase=0x402
> +@end example
> +
> +If GRUB2 was started by the (U)EFI firmware, then in the @file{debug.log}
> +file one of the last lines should be a log message like:
> +@samp{Loading driver at 0x00006AEE000 EntryPoint=0x00006AEE756}. This
> +means that the GRUB2 EFI application was loaded at @samp{0x00006AEE000} and
> +its .text section is at @samp{0x00006AEE756}.
> +
> +@node Using the gdbinfo command
> +@subsection Using the gdbinfo command
> +
> +On EFI platforms the command @command{gdbinfo} will output a string that
> +is to be run in a GDB session running with the @file{gdb_grub} GDB script.
> +
> +
> @node Porting
> @chapter Porting
>
> --
> 2.34.1
>
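
As an aside, the debug.log lookup described in the "OVMF debug log"
subsection could be scripted. What follows is only a rough sketch and not
part of the patch: the helper names are hypothetical, the log format is
assumed from the single sample line quoted above, and it takes the patch's
statement that the EntryPoint value is the .text address at face value. It
also assumes the last matching line in the log belongs to GRUB2.

```python
import re

# Pattern for the OVMF log line quoted in the patch. The exact format is an
# assumption based on that single sample line.
LOAD_RE = re.compile(
    r"Loading driver at (0x[0-9A-Fa-f]+) EntryPoint=(0x[0-9A-Fa-f]+)"
)


def find_load_addresses(log_text):
    """Return (load_address, entry_point) from the last matching line, or None."""
    matches = LOAD_RE.findall(log_text)
    if not matches:
        return None
    base, entry = matches[-1]
    return int(base, 16), int(entry, 16)


def gdb_command(text_address):
    """Build the gdb invocation shown in the patch for a given .text address."""
    return "gdb -x gdb_grub -ex 'dynamic_load_symbols {:#x}'".format(text_address)


# Sample line copied from the patch text; a real run would read debug.log.
sample = "Loading driver at 0x00006AEE000 EntryPoint=0x00006AEE756\n"
addresses = find_load_addresses(sample)
if addresses is not None:
    print(gdb_command(addresses[1]))
```

With a real log, the contents of debug.log would be passed in place of the
sample string, and the printed command run from the directory containing
gdb_grub.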
_______________________________________________
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel