Re: Early EFI-related boot freeze in parse_setup_data()

2019-08-27 Thread Daniel Drake
On Fri, Aug 16, 2019 at 2:14 PM Daniel Drake wrote:
> Anyway, the system freeze occurs in parse_setup_data(), specifically:
>
> data = early_memremap(pa_data, sizeof(*data));
> data_len = data->len + sizeof(struct setup_data);
>
> Dereferencing data->len causes the system to hang. I presume it
> triggers an exception handler due to some kind of invalid memory
> access.
>
> By returning early in that function, boot continues basically fine. So
> I could then log the details: pa_data has value 0x892bb018 and
> early_memremap returns address 0xff200018. Accessing just a
> single byte at that address causes the system to hang.

I noticed a complaint about NX in the logs, right where it does the
early_memremap of this data (which is now at address 0x893c0018):

 Notice: NX (Execute Disable) protection missing in CPU!
 e820: update [mem 0x893c0018-0x893cec57] usable ==> usable
 e820: update [mem 0x893c0018-0x893cec57] usable ==> usable
 e820: update [mem 0x893b3018-0x893bf057] usable ==> usable
 e820: update [mem 0x893b3018-0x893bf057] usable ==> usable

Indeed, in the BIOS setup menu, "NX Mode" was Disabled.
Setting it to Enabled avoids the hang and Linux boots as normal. Weird!
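
One plausible reading of the NX connection, offered as an assumption rather
than a diagnosis confirmed in this thread: when NX is disabled, bit 63 of a
page-table entry is a reserved bit, so a mapping that still carries the NX
bit faults on first access. A minimal userspace sketch of the two checks
involved is below. The helper names are hypothetical; the bit positions
(CPUID leaf 0x80000001 EDX bit 20, PTE bit 63) are architectural.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID_EXT_EDX_NX (1u << 20)   /* CPUID.80000001H:EDX bit 20 */
#define PTE_NX           (1ULL << 63) /* no-execute bit in a 64-bit PTE */

/* True if the EDX value from CPUID leaf 0x80000001 advertises NX. */
static bool cpuid_edx_has_nx(uint32_t edx)
{
    return (edx & CPUID_EXT_EDX_NX) != 0;
}

/* With NX disabled, bit 63 is reserved; a PTE that sets it faults. */
static bool pte_has_reserved_nx(uint64_t pte, bool nx_enabled)
{
    return !nx_enabled && (pte & PTE_NX) != 0;
}

/* Hypothetical helper: drop the NX bit when the CPU cannot honour it. */
static uint64_t pte_strip_unsupported(uint64_t pte, bool nx_enabled)
{
    uint64_t out = nx_enabled ? pte : (pte & ~PTE_NX);

    assert(!pte_has_reserved_nx(out, nx_enabled));
    return out;
}
```

If this reading is right, the early mapping would need its protection bits
masked this way before use whenever the "Notice: NX" line above is printed.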

Daniel


Early EFI-related boot freeze in parse_setup_data()

2019-08-16 Thread Daniel Drake
Hi,

We're working with a new consumer MiniPC based on AMD E1-7010.

It fails to boot Linux when booting in EFI mode - it hangs with
nothing on screen. earlycon=efifb doesn't show any output.

Looking closer, I was able to confirm that we reach EFI
ExitBootServices() via efi_printk in the efi stub. But you can't use
EFI's console functionality after that point, so I then resorted to
inserting calls to:

   idt_invalidate(NULL); __asm__ __volatile__("int3");

throughout the early boot code that follows in order to force a system
reset. That way I could deduce if execution was reaching that point
(system reset) or not (system hang as before). As a side-question I'd
be curious if there is any better way to debug such early boot
failures on consumer x86 hardware without a serial port...

Anyway, the system freeze occurs in parse_setup_data(), specifically:

data = early_memremap(pa_data, sizeof(*data));
data_len = data->len + sizeof(struct setup_data);

Dereferencing data->len causes the system to hang. I presume it
triggers an exception handler due to some kind of invalid memory
access.

By returning early in that function, boot continues basically fine. So
I could then log the details: pa_data has value 0x892bb018 and
early_memremap returns address 0xff200018. Accessing just a
single byte at that address causes the system to hang.

This original pa_data value (from boot_params.hdr.setup_data) was set
by the EFI stub in setup_efi_pci(). I confirmed that the same
0x892bb018 value is set there, so it is not being corrupted along the
way.
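
For readers unfamiliar with the list being walked: each setup_data node
starts with a header (64-bit physical address of the next node, 32-bit type,
32-bit payload length) followed by the payload. The sketch below walks such a
list in userspace over a byte buffer that stands in for physical memory, so
there is no early_memremap() involved. The function name, the base_pa
parameter and the 1024-node cycle guard are assumptions made for
illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Header layout of a setup_data node (payload follows immediately). */
struct setup_data_hdr {
    uint64_t next; /* physical address of next node, 0 terminates */
    uint32_t type;
    uint32_t len;  /* payload length in bytes */
};

/*
 * Walk the list starting at pa_data. mem/mem_size model the physical
 * range that begins at base_pa. Returns the node count, or -1 if a
 * node header lies outside the modelled range or the list looks cyclic.
 * *total_len receives the sum of (len + header size) over all nodes.
 */
static int walk_setup_data(const uint8_t *mem, size_t mem_size,
                           uint64_t base_pa, uint64_t pa_data,
                           uint64_t *total_len)
{
    int count = 0;

    assert(mem != NULL && total_len != NULL);
    *total_len = 0;

    while (pa_data) {
        struct setup_data_hdr hdr;
        uint64_t off;

        if (pa_data < base_pa)
            return -1;
        off = pa_data - base_pa;
        if (off > mem_size || mem_size - off < sizeof(hdr))
            return -1;

        memcpy(&hdr, mem + off, sizeof(hdr));
        *total_len += (uint64_t)hdr.len + sizeof(hdr);
        pa_data = hdr.next;

        if (++count > 1024)
            return -1; /* guard against a corrupted, cyclic list */
    }
    return count;
}
```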

Any suggestions for how to diagnose further?

dmesg output:
https://gist.github.com/dsd/199bed7b590e90efdf73f9f6384ca551

Thanks
Daniel


Re: EFI reboot vs. ACPI reboot (was: Re: [tip:x86/urgent] x86/reboot, efi: Use EFI reboot for Acer TravelMate X514-51T)

2019-04-17 Thread Daniel Drake
On Wed, Apr 17, 2019 at 2:16 PM Ingo Molnar wrote:
> Ok, so acpi_gbl_reduced_hardware is set when the ... 'reduced hardware'
> bit is set:
>
> acpi_gbl_reduced_hardware = FALSE;
> if (acpi_gbl_FADT.flags & ACPI_FADT_HW_REDUCED) {
>         acpi_gbl_reduced_hardware = TRUE;
> }
>
>
> which is described as:
>
>  #define ACPI_FADT_HW_REDUCED (1<<20) /* 20: [V5] ACPI hardware is not implemented (ACPI 5.0) */
>
> That seems counter-intuitive to me: if no full ACPI hardware is
> implemented then we should assume reduced ACPI functionality, i.e. if the
> EFI runtime is otherwise available we should default to it.

It's a bit confusing, but my loose understanding is that previous
versions of the ACPI spec required system implementors to implement
the whole thing; but that's increasingly impractical today, e.g. with
ARM systems coming along, which do not gel well with some of the
historical x86-rooted design aspects that spilled over into ACPI. The
V5 spec introduces reduced mode as an opt-in new feature, but for
compatibility with pre-V5 implementations it needs to consider "full
hardware" mode as the default.

> Feel free to send a patch that makes EFI reboot the default one under
> these circumstances,

Just to check, you mean: EFI reboot (and shutdown) become the default
methods when the machine is booted in EFI mode, and EFI stuff has not
been disabled with a kernel parameter?
Even when running in full hardware ACPI mode.
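
To make the two policies concrete, here is a minimal sketch assuming the
behaviour described in this thread: today EFI reboot is chosen only when the
FADT sets the reduced-hardware flag, while the proposal keys off the EFI
runtime being available. The enum and function names are hypothetical and do
not correspond to kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ACPI_FADT_HW_REDUCED (1u << 20) /* FADT flags bit 20 (ACPI 5.0) */

enum reboot_method { REBOOT_VIA_ACPI, REBOOT_VIA_EFI };

static bool fadt_hw_reduced(uint32_t fadt_flags)
{
    return (fadt_flags & ACPI_FADT_HW_REDUCED) != 0;
}

/* Current default as described above: EFI only on reduced hardware. */
static enum reboot_method pick_reboot_current(uint32_t fadt_flags,
                                              bool efi_runtime)
{
    if (efi_runtime && fadt_hw_reduced(fadt_flags))
        return REBOOT_VIA_EFI;
    return REBOOT_VIA_ACPI;
}

/* Proposed default: EFI whenever the EFI runtime is usable. */
static enum reboot_method pick_reboot_proposed(bool efi_runtime)
{
    return efi_runtime ? REBOOT_VIA_EFI : REBOOT_VIA_ACPI;
}

/* Usage example: a full-hardware ACPI machine booted in EFI mode. */
static void reboot_policy_example(void)
{
    uint32_t full_hw_flags = 0; /* reduced-hardware bit clear */

    assert(pick_reboot_current(full_hw_flags, true) == REBOOT_VIA_ACPI);
    assert(pick_reboot_proposed(true) == REBOOT_VIA_EFI);
}
```

The two functions only disagree for full-hardware ACPI machines with the EFI
runtime enabled, which is exactly the case being asked about.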

Thanks
Daniel


Re: EFI reboot vs. ACPI reboot (was: Re: [tip:x86/urgent] x86/reboot, efi: Use EFI reboot for Acer TravelMate X514-51T)

2019-04-16 Thread Daniel Drake
On Tue, Apr 16, 2019 at 4:20 PM Ingo Molnar wrote:
> I wanted to get a second opinion from the EFI folks for this whole
> concept. On x86 we default to ACPI reboot on modern systems, and we
> default to EFI reboot on modern EFI systems, via the
> efi_reboot_required() method which keys off on acpi_gbl_reduced_hardware
> to create a barrier for older ACPI systems.

So if acpi_gbl_reduced_hardware is set, we are on a "modern EFI
system", and EFI reboot is used.

> It appears that Acer TravelMate X514-51T systems get marked as
> 'acpi_gbl_reduced_hardware' which enables ACPI-reboot, but they require
> EFI-reboot.

We will double check, but in this case I believe the system is *not*
marked as reduced hardware, which is why ACPI reboot is used.

> Should we perhaps re-think the boundary between EFI-reboot and
> ACPI-reboot systems? I.e. if the EFI runtime is enabled, shouldn't we
> just use the EFI reboot method?

I agree this is a good question.

We also previously hit a similar issue for shutdown on Acer laptops
which is still unresolved.
https://marc.info/?l=linux-acpi&m=148857214431346&w=2

Daniel


Re: Multiple Acer laptops hang on ACPI poweroff

2017-10-27 Thread Daniel Drake
On Fri, Oct 27, 2017 at 3:57 PM, Rafael J. Wysocki wrote:
>> Testing shutdown on Acer Aspire ES1-732 (Intel Apollo Lake N4200) on
>> Linux 4.14-rc6, this issue is still present.
>>
>> The FADT has:
>>
>> [0ACh 0172  12]   PM1A Control Block : [Generic Address Structure]
>> [0ACh 0172   1] Space ID : 01 [SystemIO]
>> [0ADh 0173   1]Bit Width : 10
>> [0AEh 0174   1]   Bit Offset : 00
>> [0AFh 0175   1] Encoded Access Width : 02 [Word Access:16]
>> [0B0h 0176   8]  Address : 0404
>>
>> Full ACPI tables dump:
>> https://gist.github.com/dsd/ed80d9fdd32f99e310002b2492cd6e1b
>>
>> We have tested that writing bit 13 of port 0404 under Windows 10
>> (using an app called RW everything) results in an immediate and
>> successful power down. However, writing the same bit under Linux just
>> makes the system hang.
>>
>> I am not really familiar with the guts of x86 systems. When the OS
>> writes to this port, which component of the system receives that
>> request and acts accordingly? Is it handled by the BIOS? Or an EC, or
>> ...? With more background here we may be able to approach the relevant
>> component vendor and ask for help.
>
> Writes to the PM1A register may go straight to the PMC or trigger an
> SMM trap.  In both cases the platform takes over.

Does "platform" mean the BIOS/firmware (i.e. Insyde in this case),
or are you referring to the Intel SoC?

> Is Apollo Lake the only platform affected or are there any other?

All 3 affected product families are Apollo Lake.

Thanks
Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Multiple Acer laptops hang on ACPI poweroff

2017-10-27 Thread Daniel Drake
On Sat, Mar 4, 2017 at 4:15 AM, Daniel Drake <dr...@endlessm.com> wrote:
> Some particular Acer/Packard Bell machines hang during shutdown.
> The system completely hangs while doing bit operations for turning on SLP_EN
> bit in ACPI PM1A control address and Sleep Control Register. Thus the
> normal acpi_power_off path can never complete the shutdown process.
>
> We have found a workaround to force these systems to use EFI for poweroff,
> included below, but I wonder if anything better can be done. It is especially
> not ideal because the system hangs the same way when going into suspend and
> we don't have a workaround for that.

Testing shutdown on Acer Aspire ES1-732 (Intel Apollo Lake N4200) on
Linux 4.14-rc6, this issue is still present.

The FADT has:

[0ACh 0172  12]   PM1A Control Block : [Generic Address Structure]
[0ACh 0172   1] Space ID : 01 [SystemIO]
[0ADh 0173   1]Bit Width : 10
[0AEh 0174   1]   Bit Offset : 00
[0AFh 0175   1] Encoded Access Width : 02 [Word Access:16]
[0B0h 0176   8]  Address : 0404

Full ACPI tables dump:
https://gist.github.com/dsd/ed80d9fdd32f99e310002b2492cd6e1b

We have tested that writing bit 13 of port 0404 under Windows 10
(using an app called RW everything) results in an immediate and
successful power down. However, writing the same bit under Linux just
makes the system hang.
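
For reference, "bit 13" here is SLP_EN in the PM1 control register layout
from the ACPI specification, and bits 10-12 hold SLP_TYP. A minimal sketch of
how the 16-bit value written to the port is composed is below. The function
name is hypothetical, and the SLP_TYP value for S5 is platform-specific (it
comes from the \_S5 object), so the values used in any example call are
assumptions. The sketch only computes the value; it does not touch I/O ports.

```c
#include <assert.h>
#include <stdint.h>

#define PM1_SLP_TYP_SHIFT 10
#define PM1_SLP_TYP_MASK  (0x7u << PM1_SLP_TYP_SHIFT) /* bits 10-12 */
#define PM1_SLP_EN        (1u << 13)                  /* bit 13 */

/*
 * Compose the PM1 control value that requests a sleep state:
 * keep the unrelated bits of the current register value, replace
 * SLP_TYP, and set SLP_EN.
 */
static uint16_t pm1_control_sleep_value(uint16_t current, uint8_t slp_typ)
{
    uint32_t v;

    assert(slp_typ <= 7); /* SLP_TYP is a 3-bit field */

    v = current & ~(PM1_SLP_TYP_MASK | PM1_SLP_EN);
    v |= ((uint32_t)slp_typ << PM1_SLP_TYP_SHIFT) & PM1_SLP_TYP_MASK;
    v |= PM1_SLP_EN;
    return (uint16_t)v;
}
```

With an assumed SLP_TYP of 7 and only bit 0 set beforehand,
pm1_control_sleep_value(0x0001, 7) yields 0x3C01, which is the word that
would go to port 0404 on these machines.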

I am not really familiar with the guts of x86 systems. When the OS
writes to this port, which component of the system receives that
request and acts accordingly? Is it handled by the BIOS? Or an EC, or
...? With more background here we may be able to approach the relevant
component vendor and ask for help.

Searching the web for "pm1a control 0404" I can see at
least a handful of systems with the exact same address here; could
just be a coincidence, or is there some kind of standardization?

Any other debugging suggestions would be very welcome.

Thanks
Daniel


> ---
>  drivers/firmware/efi/reboot.c | 42 +-
>  1 file changed, 41 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/efi/reboot.c b/drivers/firmware/efi/reboot.c
> index 62ead9b..6d1496d 100644
> --- a/drivers/firmware/efi/reboot.c
> +++ b/drivers/firmware/efi/reboot.c
> @@ -4,6 +4,7 @@
>   */
>  #include <linux/efi.h>
>  #include <linux/reboot.h>
> +#include <linux/dmi.h>
>
>  int efi_reboot_quirk_mode = -1;
>
> @@ -43,6 +44,45 @@ void efi_reboot(enum reboot_mode reboot_mode, const char *__unused)
> efi.reset_system(efi_mode, EFI_SUCCESS, 0, NULL);
>  }
>
> +static const struct dmi_system_id force_efi_poweroff[] = {
> +{
> +.ident = "Packard Bell Easynote ENLG81AP",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "Packard Bell"),
> +DMI_MATCH(DMI_PRODUCT_NAME, "Easynote ENLG81AP"),
> +},
> +},
> +{
> +.ident = "Packard Bell Easynote ENTE69AP",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "Packard Bell"),
> +DMI_MATCH(DMI_PRODUCT_NAME, "Easynote ENTE69AP"),
> +},
> +},
> +{
> +.ident = "Acer Aspire ES1-533",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "Acer"),
> +DMI_MATCH(DMI_PRODUCT_NAME, "Aspire ES1-533"),
> +},
> +},
> +{
> +.ident = "Acer Aspire ES1-732",
> +.matches = {
> +DMI_MATCH(DMI_SYS_VENDOR, "Acer"),
> +DMI_MATCH(DMI_PRODUCT_NAME, "Aspire ES1-732"),
> +},
> +},
> +{}
> +};
> +
> +bool efi_poweroff_forced(void)
> +{
> +   if (dmi_check_system(force_efi_poweroff))
> +   return true;
> +   return false;
> +}
> +
>  bool __weak efi_poweroff_required(void)
>  {
> return false;
> @@ -58,7 +98,7 @@ static int __init efi_shutdown_init(void)
> if (!efi_enabled(EFI_RUNTIME_SERVICES))
> return -ENODEV;
>
> -   if (efi_poweroff_required())
> +   if (efi_poweroff_required() || efi_poweroff_forced())
> pm_power_off = efi_power_off;
>
> return 0;
> --
> 2.9.3
>


Multiple Acer laptops hang on ACPI poweroff

2017-03-03 Thread Daniel Drake
Some particular Acer/Packard Bell machines hang during shutdown.
The system completely hangs while doing bit operations for turning on SLP_EN
bit in ACPI PM1A control address and Sleep Control Register. Thus the
normal acpi_power_off path can never complete the shutdown process.

We have found a workaround to force these systems to use EFI for poweroff,
included below, but I wonder if anything better can be done. It is especially
not ideal because the system hangs the same way when going into suspend and
we don't have a workaround for that.

Any debugging tips for how to diagnose such problems?

Thanks
Daniel

---
 drivers/firmware/efi/reboot.c | 42 +-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/reboot.c b/drivers/firmware/efi/reboot.c
index 62ead9b..6d1496d 100644
--- a/drivers/firmware/efi/reboot.c
+++ b/drivers/firmware/efi/reboot.c
@@ -4,6 +4,7 @@
  */
 #include <linux/efi.h>
 #include <linux/reboot.h>
+#include <linux/dmi.h>
 
 int efi_reboot_quirk_mode = -1;
 
@@ -43,6 +44,45 @@ void efi_reboot(enum reboot_mode reboot_mode, const char *__unused)
 	efi.reset_system(efi_mode, EFI_SUCCESS, 0, NULL);
 }
 
+static const struct dmi_system_id force_efi_poweroff[] = {
+	{
+		.ident = "Packard Bell Easynote ENLG81AP",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Packard Bell"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Easynote ENLG81AP"),
+		},
+	},
+	{
+		.ident = "Packard Bell Easynote ENTE69AP",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Packard Bell"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Easynote ENTE69AP"),
+		},
+	},
+	{
+		.ident = "Acer Aspire ES1-533",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Acer"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Aspire ES1-533"),
+		},
+	},
+	{
+		.ident = "Acer Aspire ES1-732",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Acer"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Aspire ES1-732"),
+		},
+	},
+	{}
+};
+
+bool efi_poweroff_forced(void)
+{
+	if (dmi_check_system(force_efi_poweroff))
+		return true;
+	return false;
+}
+
 bool __weak efi_poweroff_required(void)
 {
 	return false;
@@ -58,7 +98,7 @@ static int __init efi_shutdown_init(void)
 	if (!efi_enabled(EFI_RUNTIME_SERVICES))
 		return -ENODEV;
 
-	if (efi_poweroff_required() || efi_poweroff_forced())
+	if (efi_poweroff_required() || efi_poweroff_forced())
 		pm_power_off = efi_power_off;
 
 	return 0;
-- 
2.9.3
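
As an aside on how the quirk table above matches machines: DMI_MATCH() is a
substring match on the named DMI field, and an entry matches only when all of
its listed fields match. A userspace sketch of that lookup is below, with the
DMI strings passed in as plain arguments. The struct and function names are
hypothetical and stand in for the kernel's dmi_system_id and
dmi_check_system().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a two-field dmi_system_id entry. */
struct poweroff_quirk {
    const char *sys_vendor;
    const char *product_name;
};

static const struct poweroff_quirk force_efi_poweroff_table[] = {
    { "Packard Bell", "Easynote ENLG81AP" },
    { "Packard Bell", "Easynote ENTE69AP" },
    { "Acer",         "Aspire ES1-533" },
    { "Acer",         "Aspire ES1-732" },
};

/* True if both DMI strings contain the substrings of some table entry. */
static bool efi_poweroff_forced_for(const char *sys_vendor,
                                    const char *product_name)
{
    size_t n = sizeof(force_efi_poweroff_table) /
               sizeof(force_efi_poweroff_table[0]);

    assert(sys_vendor != NULL && product_name != NULL);

    for (size_t i = 0; i < n; i++) {
        const struct poweroff_quirk *q = &force_efi_poweroff_table[i];

        if (strstr(sys_vendor, q->sys_vendor) &&
            strstr(product_name, q->product_name))
            return true;
    }
    return false;
}
```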
