-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/28/2014 03:47 PM, Christoffer Dall wrote:
> On Fri, Mar 28, 2014 at 03:38:28PM -0400, Michael Casadevall
> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 03/28/2014 02:09 PM, Christoffer Dall wrote:
>>> On Fri, Mar 28, 2014 at 04:26:59AM -0400, Michael Casadevall
>>> wrote:
>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>> Hash: SHA1
>>>>
>>>> As I've made a fair bit of headway since LinaroConnect, I
>>>> wanted to drop a line on my current progress with porting
>>>> TianoCore to KVM.
>>>>
>>>> Summary (tl;dr version):
>>>>
>>>> KVM can start TianoCore, and boot all the way to shell, and
>>>> access HDDs via VirtioBlk. We can start grub and
>>>> successfully retrieve files from ext partitions, load a
>>>> device tree, and start the kernel. The kernel runs through
>>>> most of the EFI stub, but falls over during
>>>> ExitBootServices().
>>>
>>> Thanks for providing this status!
>>>
>>>>
>>>> Long Version:
>>>>
>>>> So, after much blood, sweat, and tears, we're finally at the
>>>> point of trying to actually start a kernel, though this (for
>>>> the moment) remains an elusive goal. The current problem is
>>>> that once we call EBS(), we get an exception from EFI with no
>>>> Image information, which means the exception handler doesn't
>>>> know where it came from. After several seconds, we get a
>>>> second exception from within DxeCore, and then EFI falls
>>>> over.
>>>>
>>>> Debugging EFI is difficult and error-prone, made worse by the
>>>> limited debug facilities from the gdb-stub in QEMU (no
>>>> breakpoints), and no decent way to load all of EFI itself
>>>> (you have to run add-symbol-file manually with the output of
>>>> commands printed on the console; supposedly it's possible to
>>>> generate a giant GdbSyms.dll file to import in a single go,
>>>> but I haven't succeeded at this).
>>>> This is further complicated by the fact
>>>> that it appears we're asserting somewhere in a driver, and
>>>> short of adding printfs to *every* driver, it's impossible to
>>>> know which is asserting.
>>>
>>> Maybe it's worth adding a hack-support-gdb-in-kvm
>>> implementation for this. If we go down this road, I can
>>> probably find time to help you out there.
>>>
>>> Can you do some scripting to replace assert statements with "{
>>> print("%s:%d\n", __FILE__, __LINE__); orig_assert(); }" type
>>> hack?
>>>
>>
>> That's probably a decent idea if I can find where ASSERT() is
>> defined. I'll try that in a bit.
>>
>>>>
>>>> Previous attempts to debug asserts show that EFI does "odd"
>>>> things to the stack when we hit an exception, making walking
>>>> it with GDB impossible. I need to figure out what madness EFI
>>>> does with my SP so I can get the entire stack on an
>>>> explosion, but this remains at best hopeful thinking.
>>>
>>> This sounds very strange - could it be that because you take an
>>> exception, you use a SP from a different mode and everything
>>> just messes up?
>>>
>>
>> This could be GDB just being unhappy. I've had issues walking
>> the stack in KVM in general, but even if I walk the stack by
>> hand, I don't see a pointer to the next frame when we're in an
>> exception. To my knowledge, UEFI uses the standard AArch64 C ABI,
>> but this might be a faulty assumption on my part.
>>
>>>>
>>>> Further complicating things is that during EBS, my print
>>>> debugging goes away. I might just cheat and roll a simple
>>>> assembly function to bang out messages through serial
>>>> without calling anything else. Ugly as sin, but this should
>>>> let me get useful debug output through the EBS framework.
>>>> Complicating matters is that I need to locate each and every
>>>> EBS() event function, which are spread *everywhere* in
>>>> TianoCore, and then debug them each individually.
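
A minimal sketch of the "bang out messages through serial without
calling anything else" idea, written in C rather than assembly and
assuming the serial port is a PL011. The register offsets are the
standard PL011 ones; the base address 0x09000000 is an assumption
about the QEMU virt machine, and the names raw_putc, fmt_hex64,
raw_print_ptr and DEBUG_UART_BASE are made up for illustration. None
of this is taken from TianoCore.

```c
#include <stddef.h>
#include <stdint.h>

/* PL011 register offsets (in bytes) and the "transmit FIFO full" flag. */
#define PL011_DR      0x00u
#define PL011_FR      0x18u
#define PL011_FR_TXFF (1u << 5)

/* ASSUMPTION: PL011 base on the QEMU virt machine; check your platform. */
#define DEBUG_UART_BASE ((volatile uint32_t *)(uintptr_t)0x09000000u)

/* Write one byte, spinning while the transmit FIFO is full. */
static void raw_putc(volatile uint32_t *uart, char c)
{
    while (uart[PL011_FR / 4] & PL011_FR_TXFF)
        ;
    uart[PL011_DR / 4] = (uint32_t)(unsigned char)c;
}

/* Format v as "0x" plus 16 lowercase hex digits; out needs 19 bytes. */
static void fmt_hex64(uint64_t v, char out[19])
{
    static const char digits[] = "0123456789abcdef";
    out[0] = '0';
    out[1] = 'x';
    for (int i = 0; i < 16; i++)
        out[2 + i] = digits[(v >> (60 - 4 * i)) & 0xf];
    out[18] = '\0';
}

/* Print a pointer followed by CR/LF, touching nothing but the UART. */
static void raw_print_ptr(volatile uint32_t *uart, const void *p)
{
    char buf[19];
    fmt_hex64((uint64_t)(uintptr_t)p, buf);
    for (size_t i = 0; buf[i] != '\0'; i++)
        raw_putc(uart, buf[i]);
    raw_putc(uart, '\r');
    raw_putc(uart, '\n');
}
```

Calling raw_print_ptr(DEBUG_UART_BASE, ...) on each notify function
pointer just before it is invoked would leave the address of the last
handler entered on the console when things fall over. It uses no
library calls and no heap, so it should not depend on boot services
still being available.
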
>>>
>>> I'm a little confused not knowing UEFI, is EBS() not a single
>>> function and what does it matter that it's called from
>>> multiple places?
>>>
>>
>> So, drivers and applications can enlist to get notification of
>> when ExitBootServices is called. This pushes a pointer to a
>> function into an array which is then iterated through, and each
>> pointer is then called so drivers can unregister themselves from
>> boot services, etc.
>>
>> Complicating the issue is I can't use printf once GetMemoryMap()
>> is called without breaking EBS() (I think this is a bug in UEFI;
>> Leif, 2 cents?), but I think I can twiddle the serial port
>> directly without breaking shit.
>
> yeah, just writing to the pl011 out should be trivial, or add an
> hvc temporary hack to KVM, I've done things like that when
> originally debugging kernel boot under KVM.
>

Just for the record, hvc?

>>
>> Having slept on it, it's probably easy to print out the pointers
>> as we go through them, so I can get an idea of what's listening
>> for EBS and try and narrow down my list of candidates.
>>
>
> yes, add a function that side-steps all the UEFI-weirdness (should
> be a few lines static function) that can print the pointers of the
> functions you're calling.
>

Biggest issue is now binutils doesn't like PE/AArch64 files
(addr2line and friends don't work) but I think I can muddle through
it. There are tricks at this point I can use if I have a pointer to
get an idea where UEFI is.

>>>>
>>>> I'm open to ideas on how best to accomplish this.
>>>>
>>>> On a larger scale, there are a couple of other bugs and odds
>>>> and ends which currently affect us:
>>>>
>>>> * wfi doesn't work
>>>>
>>>> This is probably the biggest w.r.t. functionality that
>>>> should work, but doesn't. The EFI event loop is built on
>>>> checking the timer, then calling wfi to check the timer
>>>> later. The problem here is we call wfi ...
>>>> and UEFI never
>>>> comes back despite events firing (I can put print code in the
>>>> interrupt handler to confirm this). This may be related to
>>>> the VGIC errors I get running kvm under foundation, but I
>>>> haven't taken the time to properly nail down the bug here.
>>>
>>> So if I understand it, the expected sequence of events is:
>>>
>>> 1. check timer (arch timer counter?)
>>> 2. WFI
>>> 3. virtual arch timer interrupt, causes wake-up from WFI
>>> 4. go to 1
>>>
>>> But you seem to get stuck at (2)?
>>>
>>
>> Exactly.
>>
>>> When you say "print code in the interrupt handler" is that the
>>> UEFI interrupt handler? In that case, you do wake up from the
>>> WFI...?
>>>
>>
>> I put a DEBUG print line in the Timer interrupt handler, which
>> prints out a message every tick letting me know the timer was
>> working. When we call wfi, the timer ticks still show up (and I
>> can see them through vgic with debugging there enabled)
>>
>
> Which timer interrupt handler? The UEFI one?
>
> If you get an interrupt for the timer in UEFI, then your WFIs are
> not hanging, the VCPU actually resumes. Assuming you receive the
> interrupts on the same CPU that did the WFI.
>

We're running uni-proc as that's all KVM supports ATM. What happens
is we wfi, the interrupt fires, the interrupt handler fires, and we
remain at the wfi.

>>> Do you see stuff happening in virt/kvm/arm/arch_timer.c:
>>> kvm_timer_inject_irq()?
>>>
>>> That should call kvm_vgic_inject_irq(), which should
>>> vgic_kick_vcpus(kvm), which is what wakes you up from your
>>> WFI.
>>>
>>
>> Hrm, I need some debug code in vgic_kick_vcpus. Thanks!
>>
>>>>
>>>> This was worked around by commenting out the wfi, turning
>>>> the event loop into a busy loop, but this has to be resolved
>>>> before we can ever consider merging it
>>>>
>>>> * No RTC
>>>>
>>>> I looked through virt.c in KVM, and as best I can tell, I've
>>>> got no RTC at all (no PL031).
>>>> It also appears that the kernel
>>>> can't get RTC as running a kernel gets me a 1970 clock. I'm
>>>> not sure if this is by design or not, but it causes GetTime()
>>>> to return EFI_ERROR, and I suspect it may be one of the
>>>> exceptions I'm getting (Shell prints a ton of warnings
>>>> that GetTime is busted).
>>>
>>> The only thing you can use to tell passing of time in mach-virt
>>> is the arch-timer counter and use a fixed starting point.
>>>
>>
>> The problem here is spec says THOU SHALL HAVE RTC. We could fake
>> it with counting up from system start and using the UEFI build
>> time as a starting point, but this is not what the spec writers
>> had in mind (nothing says GetTime() has to be accurate :-)).
>>
>> For KVM, I'm wondering if we should just stick a PL031 on the bus
>> and be done with it. For Xen, we're going to need a way to do
>> this via xenbus.
>>
>>
> So the QEMU virt platform is simply not equipped to run UEFI?
> That's interesting. Peter, any thoughts?
>

I didn't notice the lack of RTC until I went to connect it, I should
have highlighted this one sooner :-/.

> -Christoffer
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJTNdbTAAoJEGKVRgSEnX1Qp7YIAIQYBxOVlJtAigl7Wvnutqva
mAI+NsZVHOnR8FhAknQPaZGckTxcIDOzHll3GN1UT2Y3rkkF1eIqfvSUgQEC4PER
Wo0WJr1eQRSOyn8QgaOdb0HxZUSfu5lxuiS4t3gWkmgSoUmSzIstLYuOhkLJehX7
11WpQ5eABXB2kykjEs3GRjsrpPy1I+UewJP/6ZoQRddgrrIkXA1LZe6588fDysJa
pgWAlGzV6UxG20RpB/O3IESnGSijd0TF8YvAj7A/eykfPctlDpIt/KgZC55YXrrn
A6tSwrU3sPWKmKmFhtH7rS8rGBmTrNh+nad1C1n3bjETbszRnIuQwKB8AGdMbGA=
=fYhC
-----END PGP SIGNATURE-----
_______________________________________________
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev