On 02/19/15 16:52, Blank Field wrote:
> Hello everybody!
> 
> I've been experimenting with QEMU, VFIO and GPU passthrough, and,
> suddenly, encountered a bug.
> My VM (launched from the script [1]) starts booting fine, but then comes
> to a complete stop with 100% load on one CPU.
> To investigate the issue, I've connected GDB to qemu, and what I've seen
> was very strange.[2]
> There is some strange way to write zero to RAX, then we compare RAX with
> RAX and do JE back to the write-zero part.
> Using binary grep (bgrep or a similar tool), I've found that op-code
> byte sequence in OVMF-pure-efi.fd:
> 
> hexdump -C /usr/share/edk2.git/ovmf-x64/OVMF-pure-efi.fd | grep "74 f7 c9"
> 001cdaf0  90 48 8b 45 f8 48 85 c0  74 f7 c9 c3 55 48 89 e5  |.H.E.H..t...UH..|
> 
> 48 8b 45 f8 is mov -0x8(%rbp),%rax, and 74 f7 is je -0x7
> When changing JE to JNE on a live, hung VM via GDB - the VM continues to
> boot fine without any noticeable anomalies.

Yes, I found your report on the Arch Linux forum when I performed my usual
weekly Google search for new OVMF mentions. I wondered if you'd report it here.

> VM doesn't hang when VFIO device is removed from the VM.
> When that given device is connected directly to the hardware, Asus'
> proprietary UEFI successfully boots and works.
> 
> I've got my OVMF binary from [3], which appears to be a Jenkins-based
> automated build system. The build number doesn't seem to matter much,
> because I have been experiencing this problem for 3-4 months, and
> nothing has changed.

That build is okay, it's always fresh.

> I'm using Fedora 21 and QEMU emulator version 2.1.3 (qemu-2.1.3-2.fc21)
> with kernel 3.18.7-200.fc21.x86_64.
> 
> Since I'm not very familiar with GDB and debugging UEFI/firmware in
> general, can someone please determine where the actual problem is, and
> what that weird assembly code really does?
> 
> I can provide the ROM file that I am using and any other needed
> information to investigate the issue.
> 
> Links:
> [1]: http://pastebin.com/cy7ZYvjc
> [2]: http://pastebin.com/Q5Q9Fwgp
> [3]: https://www.kraxel.org/repos/

The assembly under <http://pastebin.com/Q5Q9Fwgp> and your description quite 
remind me of the implementation of CpuDeadLoop() 
[MdePkg/Library/BaseLib/CpuDeadLoop.c]:

/**
  Executes an infinite loop.

  Forces the CPU to execute an infinite loop. A debugger may be used to skip
  past the loop and the code that follows the loop must execute properly. This
  implies that the infinite loop must not cause the code that follow it to be
  optimized away.

**/
VOID
EFIAPI
CpuDeadLoop (
  VOID
  )
{
  volatile UINTN  Index;

  for (Index = 0; Index == 0;);
}
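
For what it's worth, the bytes you grepped for are consistent with that loop.
As far as I can tell, 48 8b 45 f8 loads Index from its stack slot at
[rbp-0x8] into RAX, 48 85 c0 is TEST RAX, RAX, and 74 f7 is a short JE whose
signed 8-bit displacement is counted from the end of the instruction. The
sketch below only illustrates that arithmetic; short_jump_target() is a
hypothetical helper of mine, not an edk2 function, and the address used with
it is the file offset from your hexdump line, not a runtime address.

```c
#include <stdint.h>

/* Target of a two-byte short jump (opcode byte + rel8) that starts at
   insn_addr. The displacement is signed and is applied to the address of
   the byte following the jump. The cast to int8_t assumes the usual
   two's-complement conversion. */
uint64_t short_jump_target(uint64_t insn_addr, uint8_t rel8)
{
  return insn_addr + 2 + (uint64_t)(int64_t)(int8_t)rel8;
}
```

With insn_addr = 0x1cdaf8 (where 74 f7 sits in the file) and rel8 = 0xf7,
this yields 0x1cdaf1, the start of the MOV. So every pass re-reads Index
from memory, which is what the volatile qualifier asks for.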

The most likely reason for OVMF ending up in CpuDeadLoop() is a failed 
ASSERT(). Please capture the OVMF debug output in a file, and see if there's a 
failed assertion in it.

As documented in OvmfPkg/README, the debug output can be saved with:

  -debugcon file:debug.log -global isa-debugcon.iobase=0x402
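
Once debug.log exists, finding a failed assertion is a matter of searching
it for the string "ASSERT". A minimal sketch in C, assuming the messages
contain that literal substring (the exact message format differs between
edk2 versions, and line_has_assert() and count_assert_lines() are
hypothetical helper names, not part of any existing tool):

```c
#include <stdio.h>
#include <string.h>

/* Nonzero if the line mentions an assertion. Matching the bare substring
   "ASSERT" is an assumption; it also catches ASSERT_EFI_ERROR lines. */
int line_has_assert(const char *line)
{
  return strstr(line, "ASSERT") != NULL;
}

/* Print every matching line of an already opened log and return how many
   matched. Lines longer than the buffer are examined in pieces, so the
   count is only approximate for such lines. */
int count_assert_lines(FILE *log)
{
  char buf[1024];
  int  hits = 0;

  while (fgets(buf, sizeof buf, log) != NULL) {
    if (line_has_assert(buf)) {
      fputs(buf, stdout);
      hits++;
    }
  }
  return hits;
}
```

Call count_assert_lines() on the result of fopen("debug.log", "r"); running
grep ASSERT debug.log from a shell does the same job.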

In addition, here are some (complex) tips for debugging OVMF with gdb:

  http://edk2.bluestop.org/w/tianocore/debugging-with-gdb/

(Thanks again Bruce for formatting it so nicely.)

HTH
Laszlo

_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/edk2-devel
