Re: Early EFI-related boot freeze in parse_setup_data()
On Fri, Aug 16, 2019 at 2:14 PM Daniel Drake wrote:
> Anyway, the system freeze occurs in parse_setup_data(), specifically:
>
>     data = early_memremap(pa_data, sizeof(*data));
>     data_len = data->len + sizeof(struct setup_data);
>
> Dereferencing data->len causes the system to hang. I presume it
> triggers an exception handler due to some kind of invalid memory
> access.
>
> By returning early in that function, boot continues basically fine. So
> I could then log the details: pa_data has value 0x892bb018 and
> early_memremap returns address 0xff200018. Accessing just a
> single byte at that address causes the system hang.

I noticed a complaint about NX in the logs, right where it does the
early_memremap of this data (which is now at address 0x893c0018):

    Notice: NX (Execute Disable) protection missing in CPU!
    e820: update [mem 0x893c0018-0x893cec57] usable ==> usable
    e820: update [mem 0x893c0018-0x893cec57] usable ==> usable
    e820: update [mem 0x893b3018-0x893bf057] usable ==> usable
    e820: update [mem 0x893b3018-0x893bf057] usable ==> usable

Indeed, in the BIOS setup menu, "NX Mode" was Disabled. Setting it to
Enabled avoids the hang and Linux boots as normal.

Weird!

Daniel
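For anyone wanting to check the same thing from a running system: the NX feature flag is reported in CPUID leaf 0x80000001, EDX bit 20. Below is a minimal userspace sketch of that check, assuming GCC or Clang on x86 (it uses the compiler's <cpuid.h> helper). The names edx_reports_nx() and cpu_reports_nx() are made up for illustration and are not kernel functions. The sketch only shows what the CPU advertises; it does not explain why the early mapping hangs when the flag is absent.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>              /* GCC/Clang helper, provides __get_cpuid() */
#endif

#define CPUID_EXT_FEATURES 0x80000001u  /* extended feature leaf */
#define EDX_NX_BIT         20           /* NX (Execute Disable) flag in EDX */

/* Decode the NX flag from an EDX value of leaf 0x80000001. */
static int edx_reports_nx(uint32_t edx)
{
    return (int)((edx >> EDX_NX_BIT) & 1u);
}

/*
 * Returns 1 if the CPU advertises NX, 0 if it does not, and -1 if the
 * leaf cannot be queried (old CPU or non-x86 build).
 */
static int cpu_reports_nx(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(CPUID_EXT_FEATURES, &eax, &ebx, &ecx, &edx))
        return -1;
    return edx_reports_nx(edx);
#else
    return -1;
#endif
}
```

On a machine with "NX Mode" disabled in firmware I would expect cpu_reports_nx() to return 0, matching the kernel notice above, though I have only the one log to go on. On Linux the same information shows up as the "nx" entry in the flags line of /proc/cpuinfo.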
Early EFI-related boot freeze in parse_setup_data()
Hi,

We're working with a new consumer MiniPC based on AMD E1-7010. It fails
to boot Linux when booting in EFI mode - it hangs with nothing on
screen. earlycon=efifb doesn't show any output.

Looking closer, I was able to confirm that we reach EFI
ExitBootServices() via efi_printk in the efi stub. But you can't use
EFI's console functionality after that point, so I then resorted to
inserting calls to:

    idt_invalidate(NULL);
    __asm__ __volatile__("int3");

throughout the early boot code that follows in order to force a system
reset. That way I could deduce if execution was reaching that point
(system reset) or not (system hang as before).

As a side-question I'd be curious if there is any better way to debug
such early boot failures on consumer x86 hardware without a serial
port...

Anyway, the system freeze occurs in parse_setup_data(), specifically:

    data = early_memremap(pa_data, sizeof(*data));
    data_len = data->len + sizeof(struct setup_data);

Dereferencing data->len causes the system to hang. I presume it
triggers an exception handler due to some kind of invalid memory
access.

By returning early in that function, boot continues basically fine. So
I could then log the details: pa_data has value 0x892bb018 and
early_memremap returns address 0xff200018. Accessing just a
single byte at that address causes the system hang.

This original pa_data value (from boot_params.hdr.setup_data) was set
by the EFI stub in setup_efi_pci(). I confirmed that the same
0x892bb018 value is set there, it is not being corrupted along the
way.

Any suggestions for how to diagnose further?

dmesg output: https://gist.github.com/dsd/199bed7b590e90efdf73f9f6384ca551

Thanks
Daniel
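To make the failing step easier to follow, here is a rough userspace model of the list walk that parse_setup_data() performs: map the header at pa_data, read its fields, then follow the next pointer until it is zero. This is only a sketch. The header fields mirror the kernel's struct setup_data, but struct fake_ram, fake_memremap(), put_node() and walk_setup_data() are hypothetical names invented for this example, and a byte buffer stands in for physical memory. It cannot reproduce the hang; it just shows that the header read is the first access made through the new mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field order as the kernel's setup_data header; the payload follows it. */
struct setup_data_hdr {
    uint64_t next;  /* "physical" address of the next node, 0 ends the list */
    uint32_t type;
    uint32_t len;   /* payload length, header not included */
};

/* Hypothetical stand-in for physical memory: a buffer that starts at `base`. */
struct fake_ram {
    uint8_t *bytes;
    uint64_t base;
    size_t   size;
};

/* Hypothetical stand-in for early_memremap(): NULL if the range is not backed. */
static const void *fake_memremap(const struct fake_ram *ram, uint64_t pa, size_t len)
{
    if (pa < ram->base || len > ram->size || pa - ram->base > ram->size - len)
        return NULL;
    return ram->bytes + (pa - ram->base);
}

/* Helper for building a list: write one node header at address `pa`. */
static void put_node(struct fake_ram *ram, uint64_t pa,
                     uint64_t next, uint32_t type, uint32_t len)
{
    struct setup_data_hdr hdr = { next, type, len };

    assert(fake_memremap(ram, pa, sizeof(hdr) + len) != NULL);
    memcpy(ram->bytes + (pa - ram->base), &hdr, sizeof(hdr));
}

/* Walk the list. Returns the number of nodes seen, or -1 if a header cannot be mapped. */
static int walk_setup_data(const struct fake_ram *ram, uint64_t pa_data)
{
    int count = 0;

    while (pa_data) {
        const void *p = fake_memremap(ram, pa_data, sizeof(struct setup_data_hdr));
        struct setup_data_hdr hdr;

        if (!p)
            return -1;
        memcpy(&hdr, p, sizeof(hdr));   /* the header read: the step that hung */
        count++;
        pa_data = hdr.next;
    }
    return count;
}
```

For example, with a 256-byte buffer based at 0x892bb000 and two nodes placed with put_node() at 0x892bb018 and 0x892bb080, walk_setup_data(&ram, 0x892bb018) visits both nodes, while an address outside the buffer makes it return -1 instead of touching unmapped memory.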