Re: [coreboot] [RFH] Status of the Lenovo X201
On 03/05/18 21:51, qtux wrote:
> I uploaded a status report for the X201 and it contains the smashed
> stack message. Since then I booted several times but was not able to
> reproduce this stack smashing issue. It seems that this kind of
> error occurs only once after flashing. Please find attached a diff of
> the notable differences of a new console log compared to the one I
> pushed to board-status. It supports that there is an issue with the
> initial raminit.
>
> Cheers,
> Matthias
>
> On 02/05/18 20:12, ron minnich wrote:
>> Yeah I think you want to hunt this stack smash error down, it's not
>> something you want to ignore.
>>
>> On Wed, May 2, 2018 at 11:09 AM Kyösti Mälkki wrote:
>>
>>> On Wed, May 2, 2018 at 8:53 PM, Nico Huber wrote:
>>>> On 02.05.2018 18:37, qtux wrote:
>>>>> Thanks for your detailed explanation. So in essence shall I ignore
>>>>> the messages or blacklist lpc_ich?
>>>>
>>>> Yes, either ;)
>>>>
>>>>> Besides, while preparing the status report, I sometimes find a
>>>>> "Smashed stack detected in romstage!" message in the console log,
>>>>> just before ramstage is starting. Is there something to worry
>>>>> about there?
>>>>
>>>> Um, yes. I think that's not good. But I wonder why it's not
>>>> happening consistently.
>>>
>>> I commented about that earlier in this thread. Seemed like actual
>>> raminit eats a lot of stack, but loading from MRC cache or equivalent
>>> does not. One could find that struct and move it to BSS, declared with
>>> CAR_GLOBAL. I would rather not extend the boundary for stack-smashing
>>> detection.
>>>
>>> Kyösti

I uploaded a proposed fix for the smashed stack issue:
https://review.coreboot.org/#/c/coreboot/+/26388/

Side note: I tried adding CAR_GLOBAL to the ram_training and the raminfo
struct. Both had no effect on the issue.

Cheers,
Matthias

-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot
Re: [coreboot] [RFH] Status of the Lenovo X201
I uploaded a status report for the X201 and it contains the smashed
stack message. Since then I booted several times but was not able to
reproduce this stack smashing issue. It seems that this kind of error
occurs only once after flashing. Please find attached a diff of the
notable differences of a new console log compared to the one I pushed to
board-status. It supports that there is an issue with the initial
raminit.

Cheers,
Matthias

On 02/05/18 20:12, ron minnich wrote:
> Yeah I think you want to hunt this stack smash error down, it's not
> something you want to ignore.
>
> On Wed, May 2, 2018 at 11:09 AM Kyösti Mälkki wrote:
>
>> On Wed, May 2, 2018 at 8:53 PM, Nico Huber wrote:
>>> On 02.05.2018 18:37, qtux wrote:
>>>> Thanks for your detailed explanation. So in essence shall I ignore
>>>> the messages or blacklist lpc_ich?
>>>
>>> Yes, either ;)
>>>
>>>> Besides, while preparing the status report, I sometimes find a
>>>> "Smashed stack detected in romstage!" message in the console log,
>>>> just before ramstage is starting. Is there something to worry about
>>>> there?
>>>
>>> Um, yes. I think that's not good. But I wonder why it's not happening
>>> consistently.
>>
>> I commented about that earlier in this thread. Seemed like actual
>> raminit eats a lot of stack, but loading from MRC cache or equivalent
>> does not. One could find that struct and move it to BSS, declared with
>> CAR_GLOBAL. I would rather not extend the boundary for stack-smashing
>> detection.
>>
>> Kyösti

diff --git a/lenovo/x201/4.7-994-ga940e384b6/2018-05-03T17_19_05Z/coreboot_console.txt b/lenovo/x201/4.7-994-ga940e384b6/2018-05-03T17_19_05Z/coreboot_console.txt
index 6c73a758a..20bffc5c7 100644
--- a/lenovo/x201/4.7-994-ga940e384b6/2018-05-03T17_19_05Z/coreboot_console.txt
+++ b/lenovo/x201/4.7-994-ga940e384b6/2018-05-03T17_19_05Z/coreboot_console.txt
@@ -99,10 +100,6 @@ ME: Error Code : No Error
 ME: Progress Phase : BUP Phase
 ME: Power Management Event : Clean Moff->Mx wake
 ME: Progress Phase State: 0x41
-Smashed stack detected in romstage!
-Smashed stack detected in romstage!
-Smashed stack detected in romstage!
-Smashed stack detected in romstage!
 MTRR Range: Start=ff80 End=0 (Size 80)
 MTRR Range: Start=0 End=100 (Size 100)
 MTRR Range: Start=bf00 End=bf80 (Size 80)
@@ -860,7 +857,7 @@ SMM Module: stub loaded at bf808000. Will call bf8101a6()
 Initializing southbridge SMI...
 ... pmbase = 0x0500
 SMI_STS: MCSMI PM1
-PM1_STS: WAK BM TMROF
+PM1_STS: WAK BM
 GPE0_STS: GPIO14 GPIO11 GPIO9 GPIO5 GPIO4 GPIO3 GPIO2 GPIO1 GPIO0
 ALT_GP_SMI_STS: GPI14 GPI13 GPI11 GPI10 GPI9 GPI7 GPI6 GPI5 GPI4 GPI3 GPI2 GPI1 GPI0
 TCO_STS:
@@ -1301,17 +1298,13 @@ Updating MRC cache data.
 CBFS: 'Master Header Locator' located CBFS at [700200:7fffc0)
 CBFS: Locating 'mrc.cache'
 CBFS: Found @ offset 1fdc0 size 1
-find_current_mrc_cache_local: picked entry 0 from cache block
-Manufacturer: c2
-SF: Detected MX25L6405D with sector size 0x1000, total 0x80
-find_next_mrc_cache: picked next entry from cache block at fff21000
-Finally: write MRC cache update to flash at fff21000
-Successfully wrote MRC cache
-BS: BS_DEV_INIT times (us): entry 5 run 145643 exit 14099
+find_current_mrc_cache_local: picked entry 1 from cache block
+MRC data in flash is up to date. No update.
+BS: BS_DEV_INIT times (us): entry 5 run 145972 exit 12026
 Finalize devices...
 PCI: 00:1f.0 final
 Devices finalized
@@ -1472,7 +1465,7 @@ SF: Detected MX25L6405D with sector size 0x1000, total 0x80
 CBFS: 'Master Header Locator' located CBFS at [700200:7fffc0)
 FMAP: Found "FLASH" version 1.1 at 70.
 FMAP: base = ff80 size = 80 #areas = 3
-Wrote coreboot table at: bf746000, 0x36c bytes, checksum acd2
+Wrote coreboot table at: bf746000, 0x36c bytes, checksum 50d2
 coreboot table: 900 bytes.
 IMD ROOT0. bf7ff000 1000
 IMD SMALL 1. bf7fe000 1000
@@ -1538,14 +1531,16 @@ AHCI controller at 00:1f.2, iobase 0xcfd26000, irq 11
 Found 0 lpt ports
 Found 0 serial ports
 Searching bootorder for: /rom@img/memtest
-Discarding ps2 data aa (status=11)
 Searching bootorder for: /pci@i0cf8/*@1f,2/drive@0/disk@0
 AHCI/0: Set transfer mode to UDMA-5
 AHCI/0: registering: "AHCI/0: M4-CT128M4SSD2 ATA-9 Hard-Disk (119 GiBytes)"
 Initialized USB HUB (0 ports used)
+WARNING - Timeout at ps2_recvbyte:182!
+Discarding ps2 data aa (status=11)
+WARNING - Timeout at ps2_recvbyte:182!
 PS2 keyboard initialized
 WARNING - Timeout at ehci_wait_td:516!
-ehci pipe=0xbf6c1080 cur=bf6b5dc0 tok=80080d80 next=bf6b5e00 td=0xbf6b5dc0 status=80080d80
+ehci pipe=0xbf6c1080 cur=bf6b4dc0 tok=80080d80 next=bf6b4e00 td=0xbf6b4dc0 status=80080d80
 Initialized USB HUB (0 ports used)
 All threads complete.
 Scan for option roms
Re: [coreboot] [RFH] Status of the Lenovo X201
Hi Nico,

On 02.05.2018 00:42, Nico Huber wrote:
> Well, you better know what you are doing ;)

that's indeed really true. :-)

Regards,
Reiner
Re: [coreboot] [RFH] Status of the Lenovo X201
Yeah I think you want to hunt this stack smash error down, it's not
something you want to ignore.

On Wed, May 2, 2018 at 11:09 AM Kyösti Mälkki wrote:

> On Wed, May 2, 2018 at 8:53 PM, Nico Huber wrote:
>> On 02.05.2018 18:37, qtux wrote:
>>> Thanks for your detailed explanation. So in essence shall I ignore
>>> the messages or blacklist lpc_ich?
>>
>> Yes, either ;)
>>
>>> Besides, while preparing the status report, I sometimes find a
>>> "Smashed stack detected in romstage!" message in the console log,
>>> just before ramstage is starting. Is there something to worry about
>>> there?
>>
>> Um, yes. I think that's not good. But I wonder why it's not happening
>> consistently.
>
> I commented about that earlier in this thread. Seemed like actual
> raminit eats a lot of stack, but loading from MRC cache or equivalent
> does not. One could find that struct and move it to BSS, declared with
> CAR_GLOBAL. I would rather not extend the boundary for stack-smashing
> detection.
>
> Kyösti
Re: [coreboot] [RFH] Status of the Lenovo X201
On Wed, May 2, 2018 at 8:53 PM, Nico Huber wrote:
> On 02.05.2018 18:37, qtux wrote:
>> Thanks for your detailed explanation. So in essence shall I ignore the
>> messages or blacklist lpc_ich?
>
> Yes, either ;)
>
>> Besides, while preparing the status report, I sometimes find a
>> "Smashed stack detected in romstage!" message in the console log, just
>> before ramstage is starting. Is there something to worry about there?
>
> Um, yes. I think that's not good. But I wonder why it's not happening
> consistently.

I commented about that earlier in this thread. Seemed like actual
raminit eats a lot of stack, but loading from MRC cache or equivalent
does not. One could find that struct and move it to BSS, declared with
CAR_GLOBAL. I would rather not extend the boundary for stack-smashing
detection.

Kyösti
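The observation above, that a full raminit run uses far more stack than the MRC-cache path, can be illustrated with a small host-side model of a guarded stack region. This is only a sketch under stated assumptions: the region size, the guard pattern, and every name in it (`STACK_WORDS`, `GUARD_WORDS`, `use_stack`, `stack_smashed`) are invented for illustration and do not reflect coreboot's actual stack check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes; real values depend on the cache-as-RAM layout. */
#define STACK_WORDS   1024u       /* words reserved for the stack */
#define GUARD_WORDS   16u         /* lowest words hold a guard pattern */
#define GUARD_PATTERN 0xdeadbeefu

/* Model of a downward-growing stack: index STACK_WORDS - 1 is the top. */
static uint32_t stack_region[STACK_WORDS];

/* Write the guard pattern into the bottom of the region, clear the rest. */
static void stack_guard_init(void)
{
	for (size_t i = 0; i < STACK_WORDS; i++)
		stack_region[i] = (i < GUARD_WORDS) ? GUARD_PATTERN : 0;
}

/* Simulate a call chain that consumes 'words' of stack from the top down. */
static void use_stack(size_t words)
{
	assert(words <= STACK_WORDS);
	for (size_t i = 0; i < words; i++)
		stack_region[STACK_WORDS - 1 - i] = 0x55555555u;
}

/* Return 1 if any guard word was overwritten, 0 otherwise. */
static int stack_smashed(void)
{
	for (size_t i = 0; i < GUARD_WORDS; i++)
		if (stack_region[i] != GUARD_PATTERN)
			return 1;
	return 0;
}
```

In this model a boot that needs only a few hundred words leaves the guard intact, while one that needs more than `STACK_WORDS - GUARD_WORDS` words trips it. Moving a large structure into static storage, as suggested above, corresponds to passing a smaller value to `use_stack()` instead of enlarging the region.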
Re: [coreboot] [RFH] Status of the Lenovo X201
On 02.05.2018 18:37, qtux wrote:
> Thanks for your detailed explanation. So in essence shall I ignore the
> messages or blacklist lpc_ich?

Yes, either ;)

> Besides, while preparing the status report, I sometimes find a "Smashed
> stack detected in romstage!" message in the console log, just before
> ramstage is starting. Is there something to worry about there?

Um, yes. I think that's not good. But I wonder why it's not happening
consistently.

> It may correlate with another issue I found: Sometimes (mostly after
> some experimentation) SeaBIOS takes a long time to load (about 10 to 20
> seconds) and is not able to find my SATA drive (though, payloads from
> cbfs can still be loaded). Restarting with Ctrl+Alt+Del is sufficient
> in these cases to solve the issue until the next time I tinker with
> coreboot (in particular experimenting with me_cleaner seems to cause
> this issue quite often).

With the ME in general: Don't write its firmware region unless you have
to test a change there. And in this case, don't touch the BIOS region.
After you have flashed the ME firmware you have to make incredibly sure
that it's reset and in a valid state (hard to tell because an error
state is expected with me_cleaner). Only then can you reason about
coreboot.

Even though some people are unhappy with limited access to the ME
firmware, the best thing to do during coreboot development is to keep it
locked (or write-protected at least).

Nico
Re: [coreboot] [RFH] Status of the Lenovo X201
Thanks for your detailed explanation. So in essence shall I ignore the
messages or blacklist lpc_ich?

Besides, while preparing the status report, I sometimes find a "Smashed
stack detected in romstage!" message in the console log, just before
ramstage is starting. Is there something to worry about there?

It may correlate with another issue I found: Sometimes (mostly after
some experimentation) SeaBIOS takes a long time to load (about 10 to 20
seconds) and is not able to find my SATA drive (though, payloads from
cbfs can still be loaded). Restarting with Ctrl+Alt+Del is sufficient in
these cases to solve the issue until the next time I tinker with
coreboot (in particular experimenting with me_cleaner seems to cause
this issue quite often).

Cheers,
Matthias

On 02/05/18 00:42, Nico Huber wrote:
> On 02.05.2018 00:03, qtux wrote:
>> ...
>> ACPI Warning: SystemIO range 0x0480-0x04AF
>> conflicts with OpRegion 0x0480-0x04EB (\GPIO)
>> (20180105/utaddress-247)
>> ACPI: If an ACPI driver is available for this device, you should use it
>> instead of the native driver
>> lpc_ich: Resource conflict(s) found affecting gpio_ich
>>
>> Maybe these are also caused by copy pasting Sandy Bridge code as I found
>> a reference to PMIO and GPIO with matching addresses in
>> src/southbridge/intel/bd82x6x/acpi/pch.asl. Do you have any ideas on
>> this issue?
>
> It's pretty simple. When the firmware was written (both coreboot and the
> one from Lenovo) this `lpc_ich` driver didn't exist in Linux and wasn't
> accounted for. From a firmware point of view, that driver shouldn't
> exist at all and our ACPI code claims the device's resources therefore.
> I don't think the driver was meant to be included into generic Linux
> distributions.
>
> Related story: The same applies to other drivers like the buggy
> intel-spi. That one even warns in Kconfig "Say N here unless you know
> what you are doing.". Due to a simple off-by-one in the code, it
> bricked[1] a lot of systems with Ubuntu 17.10 and they had to withdraw
> their images. That's what you get when you blindly enable all modules
> and ship them to humble users.
>
> Well, you better know what you are doing ;)
>
> Nico
>
> [1] It only write-protected the firmware flash by accident, the actual
>     brick was caused by the UEFI shipping on the affected systems.
Re: [coreboot] [RFH] Status of the Lenovo X201
On 02.05.2018 00:03, qtux wrote:
> ...
> ACPI Warning: SystemIO range 0x0480-0x04AF
> conflicts with OpRegion 0x0480-0x04EB (\GPIO)
> (20180105/utaddress-247)
> ACPI: If an ACPI driver is available for this device, you should use it
> instead of the native driver
> lpc_ich: Resource conflict(s) found affecting gpio_ich
>
> Maybe these are also caused by copy pasting Sandy Bridge code as I found
> a reference to PMIO and GPIO with matching addresses in
> src/southbridge/intel/bd82x6x/acpi/pch.asl. Do you have any ideas on
> this issue?

It's pretty simple. When the firmware was written (both coreboot and the
one from Lenovo) this `lpc_ich` driver didn't exist in Linux and wasn't
accounted for. From a firmware point of view, that driver shouldn't
exist at all and our ACPI code claims the device's resources therefore.
I don't think the driver was meant to be included into generic Linux
distributions.

Related story: The same applies to other drivers like the buggy
intel-spi. That one even warns in Kconfig "Say N here unless you know
what you are doing.". Due to a simple off-by-one in the code, it
bricked[1] a lot of systems with Ubuntu 17.10 and they had to withdraw
their images. That's what you get when you blindly enable all modules
and ship them to humble users.

Well, you better know what you are doing ;)

Nico

[1] It only write-protected the firmware flash by accident, the actual
    brick was caused by the UEFI shipping on the affected systems.
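The warning quoted above boils down to an overlap test between the I/O range a native driver wants and an OperationRegion that the firmware's ACPI tables already claim. A minimal sketch of such a test, using the addresses from the log; the struct and function names are hypothetical and this is not the kernel's implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive I/O port range, e.g. 0x0480-0x04AF as printed in the warning. */
struct io_range {
	uint16_t start;
	uint16_t end;
};

/* Two inclusive ranges overlap when each starts no later than the other ends. */
static int ranges_conflict(struct io_range a, struct io_range b)
{
	assert(a.start <= a.end && b.start <= b.end);
	return a.start <= b.end && b.start <= a.end;
}
```

With the values from the log, the driver range 0x0480-0x04AF lies entirely inside the \GPIO OpRegion 0x0480-0x04EB, so `ranges_conflict()` returns nonzero for that pair.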
Re: [coreboot] [RFH] Status of the Lenovo X201
Thank you Kyösti, your patch solves all irq issues and USB is working
again on my X201i.

I opened a review for adding the X201i as an X201 variant:
https://review.coreboot.org/#/c/coreboot/+/25971/

Apart from that I have the following ACPI conflict with PMIO and GPIO:

ACPI: Battery Slot [BAT0] (battery present)
ACPI: Battery Slot [BAT1] (battery absent)
ACPI: AC Adapter [AC] (on-line)
ACPI Warning: SystemIO range 0x0528-0x052F conflicts with OpRegion 0x0500-0x057F (\PMIO) (20180105/utaddress-247)
ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
ACPI Warning: SystemIO range 0x04C0-0x04CF conflicts with OpRegion 0x0480-0x04EB (\GPIO) (20180105/utaddress-247)
ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
ACPI Warning: SystemIO range 0x04B0-0x04BF conflicts with OpRegion 0x0480-0x04EB (\GPIO) (20180105/utaddress-247)
ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
ACPI Warning: SystemIO range 0x0480-0x04AF conflicts with OpRegion 0x0480-0x04EB (\GPIO) (20180105/utaddress-247)
ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
lpc_ich: Resource conflict(s) found affecting gpio_ich

Maybe these are also caused by copy pasting Sandy Bridge code as I found
a reference to PMIO and GPIO with matching addresses in
src/southbridge/intel/bd82x6x/acpi/pch.asl. Do you have any ideas on
this issue?

Cheers,
Matthias

On 01/05/18 19:35, qtux wrote:
> Beware that patch is incomplete! Coreboot dies at
> src/southbridge/intel/common/acpi_pirq_gen.c line 97:
>
>	if (!lpcb_path)
>		die("ACPI_PIRQ_GEN: Missing LPCB ACPI path\n");
>
> You have to add the lpc_acpi_name function to
> src/southbridge/intel/ibexpeak/lpc.c as in
> src/southbridge/intel/bd82x6x/lpc.c to circumvent this issue.
>
> I was preparing to upload my patch when I saw yours (which is almost
> identical). Additionally, I already tested my patch on a Lenovo X201i.
> Shall I edit your patch on Gerrit or upload mine in a separate merge
> request, or do something else?
>
> Cheers,
> Matthias
>
> On 01/05/18 18:45, Kyösti Mälkki wrote:
>> On Mon, Apr 30, 2018 at 6:46 AM, qtux wrote:
>>> I wrote a patch [0] for the finalize code issue. With that my X201i is
>>> working fine on current master besides a regression introduced in
>>> commit 7f5efd90e598320791200e03f761309ee04b58a3 [1]. With that
>>> regression USB and SD card are not working anymore and it raises the
>>> following errors:
>>
>> Thanks for patching _and_ testing, your patch for finalize was just
>> merged.
>>
>>> can't derive routing for PCI INT A
>>> PCI INT A: no GSI
>>
>> As for IRQ regressions, I think I can see where it went wrong, find my
>> attempt to fix it blind-folded [1].
>>
>> [1] https://review.coreboot.org/#/c/coreboot/+/25965
>>
>> Kyösti
Re: [coreboot] [RFH] Status of the Lenovo X201
Beware that patch is incomplete! Coreboot dies at
src/southbridge/intel/common/acpi_pirq_gen.c line 97:

	if (!lpcb_path)
		die("ACPI_PIRQ_GEN: Missing LPCB ACPI path\n");

You have to add the lpc_acpi_name function to
src/southbridge/intel/ibexpeak/lpc.c as in
src/southbridge/intel/bd82x6x/lpc.c to circumvent this issue.

I was preparing to upload my patch when I saw yours (which is almost
identical). Additionally, I already tested my patch on a Lenovo X201i.
Shall I edit your patch on Gerrit or upload mine in a separate merge
request, or do something else?

Cheers,
Matthias

On 01/05/18 18:45, Kyösti Mälkki wrote:
> On Mon, Apr 30, 2018 at 6:46 AM, qtux wrote:
>> I wrote a patch [0] for the finalize code issue. With that my X201i is
>> working fine on current master besides a regression introduced in
>> commit 7f5efd90e598320791200e03f761309ee04b58a3 [1]. With that
>> regression USB and SD card are not working anymore and it raises the
>> following errors:
>
> Thanks for patching _and_ testing, your patch for finalize was just
> merged.
>
>> can't derive routing for PCI INT A
>> PCI INT A: no GSI
>
> As for IRQ regressions, I think I can see where it went wrong, find my
> attempt to fix it blind-folded [1].
>
> [1] https://review.coreboot.org/#/c/coreboot/+/25965
>
> Kyösti
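The failure described above follows a common pattern: generic code asks a device for its ACPI name through an optional per-driver callback and gives up when the driver does not provide one. The following is a simplified, standalone model of that pattern and not the coreboot sources. `struct device_ops`, `lookup_acpi_name`, the two `ops_*` tables, and the returned string "LPCB" are assumptions chosen for illustration; only the function name `lpc_acpi_name` comes from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct device;

/* Optional per-driver hooks; a driver may leave acpi_name unset (NULL). */
struct device_ops {
	const char *(*acpi_name)(const struct device *dev);
};

struct device {
	const struct device_ops *ops;
};

/* Hook of the kind the message says is missing for ibexpeak. */
static const char *lpc_acpi_name(const struct device *dev)
{
	(void)dev;
	return "LPCB";	/* assumed name, guessed from the error message text */
}

/* Return the device's ACPI name, or NULL when the driver has no hook. */
static const char *lookup_acpi_name(const struct device *dev)
{
	if (!dev || !dev->ops || !dev->ops->acpi_name)
		return NULL;
	return dev->ops->acpi_name(dev);
}

/* A driver without the hook, and the same driver with it added. */
static const struct device_ops ops_without_name = { .acpi_name = NULL };
static const struct device_ops ops_with_name = { .acpi_name = lpc_acpi_name };
```

A caller that treats a NULL result as fatal, as the quoted `die()` call does, would halt for a device using `ops_without_name` and proceed for one using `ops_with_name`.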
Re: [coreboot] [RFH] Status of the Lenovo X201
On Mon, Apr 30, 2018 at 6:46 AM, qtux wrote:
> I wrote a patch [0] for the finalize code issue. With that my X201i is
> working fine on current master besides a regression introduced in
> commit 7f5efd90e598320791200e03f761309ee04b58a3 [1]. With that
> regression USB and SD card are not working anymore and it raises the
> following errors:

Thanks for patching _and_ testing, your patch for finalize was just
merged.

> can't derive routing for PCI INT A
> PCI INT A: no GSI

As for IRQ regressions, I think I can see where it went wrong, find my
attempt to fix it blind-folded [1].

[1] https://review.coreboot.org/#/c/coreboot/+/25965

Kyösti
Re: [coreboot] [RFH] Status of the Lenovo X201
April 30, 2018 5:51 AM, "qtux" wrote:
> I wrote a patch [0] for the finalize code issue.

It fixed master on my X201 too, thank you.

Nicola
Re: [coreboot] [RFH] Status of the Lenovo X201
I wrote a patch [0] for the finalize code issue. With that my X201i is
working fine on current master besides a regression introduced in commit
7f5efd90e598320791200e03f761309ee04b58a3 [1]. With that regression USB
and SD card are not working anymore and it raises the following errors:

[ 17.986754] usb 1-1: device not accepting address 2, error -110
[ 18.110095] usb 1-1: new high-speed USB device number 3 using ehci-pci
[ 18.200089] usb 2-1: device not accepting address 2, error -110
[ 18.323421] usb 2-1: new high-speed USB device number 3 using ehci-pci
[ 34.200083] usb 1-1: device not accepting address 3, error -110
[ 34.200169] usb usb1-port1: attempt power cycle
[ 34.413364] usb 2-1: device not accepting address 3, error -110
[ 34.413439] usb usb2-port1: attempt power cycle
[ 34.636752] usb 1-1: new high-speed USB device number 4 using ehci-pci
[ 34.850085] usb 2-1: new high-speed USB device number 4 using ehci-pci
[ 45.293417] usb 1-1: device not accepting address 4, error -110
[ 45.416732] usb 1-1: new high-speed USB device number 5 using ehci-pci
[ 45.506783] usb 2-1: device not accepting address 4, error -110
[ 45.630088] usb 2-1: new high-speed USB device number 5 using ehci-pci
[ 56.173393] usb 1-1: device not accepting address 5, error -110
[ 56.173445] usb usb1-port1: unable to enumerate USB device
[ 56.386753] usb 2-1: device not accepting address 5, error -110
[ 56.386845] usb usb2-port1: unable to enumerate USB device

Additionally there are some IRQ errors inside the kernel messages like

can't derive routing for PCI INT A
PCI INT A: no GSI

for different devices which seem related to the change in [1].

Cheers,
Matthias

[0] https://review.coreboot.org/#/c/coreboot/+/25914/
[1] https://review.coreboot.org/#/c/coreboot/+/22859/

On 29/04/18 14:14, Kyösti Mälkki wrote:
> On Sun, Apr 29, 2018 at 1:35 PM, Nicola Corna wrote:
>> April 28, 2018 5:59 PM, "Nico Huber" wrote:
>>
>>> Yes, that's very likely a problem. It looks like the whole finalize
>>> code path of the X201 was untested all the time (even on resume). I
>>> don't remember if EHCI debug works in SMM? If it does, you could
>>> enable logging for the SMI handler as well (if you want to debug it).
>>>
>>> Nico
>>
>> Attached you can find the log with the SMM debug enabled, but it
>> doesn't seem to me much different from the non-debug log.
>>
>> Nicola
>
> DEBUG_SMI does not output to EHCI, I have considered it too unstable.
>
> You can try your luck with attached patch to have DEBUG_SMI=y output
> on EHCI debug. EHCI console code does not take precautions against
> someone else touching the same register set so it's likely to fail
> once payload and/or OS loads its EHCI driver, possibly making USB
> media and keyboard unusable as well.
>
> Kyösti
Re: [coreboot] [RFH] Status of the Lenovo X201
On Sun, Apr 29, 2018 at 1:35 PM, Nicola Corna wrote:
> April 28, 2018 5:59 PM, "Nico Huber" wrote:
>
>> Yes, that's very likely a problem. It looks like the whole finalize
>> code path of the X201 was untested all the time (even on resume). I
>> don't remember if EHCI debug works in SMM? If it does, you could
>> enable logging for the SMI handler as well (if you want to debug it).
>>
>> Nico
>
> Attached you can find the log with the SMM debug enabled, but it
> doesn't seem to me much different from the non-debug log.
>
> Nicola

DEBUG_SMI does not output to EHCI, I have considered it too unstable.

You can try your luck with attached patch to have DEBUG_SMI=y output
on EHCI debug. EHCI console code does not take precautions against
someone else touching the same register set so it's likely to fail
once payload and/or OS loads its EHCI driver, possibly making USB
media and keyboard unusable as well.

Kyösti

Attachment: usbdebug-in-smi (binary data)
Re: [coreboot] [RFH] Status of the Lenovo X201
On 28.04.2018 17:34, Kyösti Mälkki wrote:
> On Sat, Apr 28, 2018 at 3:41 PM, Nicola Corna wrote:
>> April 27, 2018 12:29 PM, "Nicola Corna" wrote:
>>
>>>> With config PARALLEL_CPU_INIT=y so SMP / SMM init in
>>>> initialize_cpus() will never call wait_other_cpus() at all. That
>>>> actually regressed in my commit 0cc2ce4 [1] but I can't test if
>>>> reverting it solves this for you. I'll push a regression fix soonish
>>>> for review.
>>>>
>>>> [1] https://review.coreboot.org/c/coreboot/+/21088
>>>>
>>>> Kyösti
>>>
>>> I'm going to test https://review.coreboot.org/#/c/coreboot/+/25874
>>> soon and report back, thanks for your help.
>>
>> Unfortunately that patch doesn't fix the issue (log attached).
>> I've also tried 0cc2ce4 but with that revision it works correctly
>> (log and config attached).
>>
>> If needed I can run a git bisect to find which commit broke the boot.
>
> I thought I had seen some other message [1] on the list regarding
> X201(i) not booting or resuming. Looking at your boot log ending on
> "0:1f.0 final", it appears we are looking at the same commit which
> adds INTEL_CHIPSET_LOCKDOWN implementation in lpc_final(). The commit
> message tells it was not tested for ibexpeak so you may want to try
> revert that one before going for full bisect.

Yes, that's very likely a problem. It looks like the whole finalize code
path of the X201 was untested all the time (even on resume). I don't
remember if EHCI debug works in SMM? If it does, you could enable
logging for the SMI handler as well (if you want to debug it).

Nico
Re: [coreboot] [RFH] Status of the Lenovo X201
On Sat, Apr 28, 2018 at 3:41 PM, Nicola Corna wrote:
> April 27, 2018 12:29 PM, "Nicola Corna" wrote:
>
>>> With config PARALLEL_CPU_INIT=y so SMP / SMM init in initialize_cpus()
>>> will never call wait_other_cpus() at all. That actually regressed in
>>> my commit 0cc2ce4 [1] but I can't test if reverting it solves this for
>>> you. I'll push a regression fix soonish for review.
>>>
>>> [1] https://review.coreboot.org/c/coreboot/+/21088
>>>
>>> Kyösti
>>
>> I'm going to test https://review.coreboot.org/#/c/coreboot/+/25874
>> soon and report back, thanks for your help.
>
> Unfortunately that patch doesn't fix the issue (log attached).
> I've also tried 0cc2ce4 but with that revision it works correctly
> (log and config attached).
>
> If needed I can run a git bisect to find which commit broke the boot.

I thought I had seen some other message [1] on the list regarding
X201(i) not booting or resuming. Looking at your boot log ending on
"0:1f.0 final", it appears we are looking at the same commit [2] which
adds INTEL_CHIPSET_LOCKDOWN implementation in lpc_final(). The commit
message tells it was not tested for ibexpeak so you may want to try
revert that one before going for full bisect.

[1] https://mail.coreboot.org/pipermail/coreboot/2018-April/086564.html
[2] https://review.coreboot.org/c/coreboot/+/21129

Kyösti
Re: [coreboot] [RFH] Status of the Lenovo X201
April 26, 2018 7:21 PM, "Kyösti Mälkki" wrote:
> Well, smashed stack in romstage -error is no longer in the log,
> possibly because this boot used MRC cache now.

Could be, unfortunately I don't have the first boot log, I'll grab it in
the next test.

> With config PARALLEL_CPU_INIT=y so SMP / SMM init in initialize_cpus()
> will never call wait_other_cpus() at all. That actually regressed in
> my commit 0cc2ce4 [1] but I can't test if reverting it solves this for
> you. I'll push a regression fix soonish for review.
>
> [1] https://review.coreboot.org/c/coreboot/+/21088
>
> Kyösti

I'm going to test https://review.coreboot.org/#/c/coreboot/+/25874 soon
and report back, thanks for your help.

Nicola
Re: [coreboot] [RFH] Status of the Lenovo X201
On Wed, Apr 25, 2018 at 1:51 PM, Nicola Corna wrote:
> April 18, 2018 3:54 PM, "Kyösti Mälkki" wrote:
>
>> Having romstage stack smashed seems irrelevant for the no-boot issue.
>> That nehalem raminit code, struct raminfo, seems to eat a lot of stack
>> and an error message for that case was added with commit 2c3fd49. You
>> could try parent of that commit, but rumour is lenovo/x201 was a
>> no-boot case long time before that.
>
> I'll do some tests to find which commit caused the no-boot.

Well, smashed stack in romstage -error is no longer in the log,
possibly because this boot used MRC cache now.

>> I can see int15h messages interleaved with SMP init, that looks odd to
>> me. Also, did the log really end inside 0:1f.0 PCI finalize or was
>> usbdebug logging just interrupted for some reason at that point?
>
> The log really ends there.

With config PARALLEL_CPU_INIT=y so SMP / SMM init in initialize_cpus()
will never call wait_other_cpus() at all. That actually regressed in
my commit 0cc2ce4 [1] but I can't test if reverting it solves this for
you. I'll push a regression fix soonish for review.

[1] https://review.coreboot.org/c/coreboot/+/21088

Kyösti
Re: [coreboot] [RFH] Status of the Lenovo X201
Hi!

I'm also interested in a running version of current master.

On 25.04.2018 12:51, Nicola Corna wrote:
> If needed I can do some tests on this PC.

I also would do some testing on this machine, if that helps.

Regards,
Reiner
Re: [coreboot] [RFH] Status of the Lenovo X201
April 18, 2018 3:54 PM, "Kyösti Mälkki" wrote:
> Having romstage stack smashed seems irrelevant for the no-boot issue.
> That nehalem raminit code, struct raminfo, seems to eat a lot of stack
> and an error message for that case was added with commit 2c3fd49. You
> could try parent of that commit, but rumour is lenovo/x201 was a
> no-boot case long time before that.

I'll do some tests to find which commit caused the no-boot.

> I can see int15h messages interleaved with SMP init, that looks odd to
> me. Also, did the log really end inside 0:1f.0 PCI finalize or was
> usbdebug logging just interrupted for some reason at that point?

The log really ends there.

> Try current master with default config instead of one derived from
> last reported board_status. There have been a lot of changes to the
> framebuffer Kconfig settings. And we probably need the .config you
> used to assess what's happening there.

I've tried with the current master on a fresh config, but the result is
the same. Attached you can find the log and the .config.

If needed I can do some tests on this PC.

Nicola

Attachment: x201_config (binary data)
Attachment: x201_log (binary data)
Re: [coreboot] [RFH] Status of the Lenovo X201
Hi

On Wed, Apr 18, 2018 at 12:42 PM, Nicola Corna wrote:
> Hi Paul,
>
> I can't make my X201 boot with the most recent commit: the screen turns
> on and it shows a blinking cursor, but that's all.
> Attached you can find the debug log: as you can see it has detected a
> stack smashing and it froze at a random point.

Having romstage stack smashed seems irrelevant for the no-boot issue.
That nehalem raminit code, struct raminfo, seems to eat a lot of stack
and an error message for that case was added with commit 2c3fd49. You
could try parent of that commit, but rumour is lenovo/x201 was a
no-boot case long time before that.

I can see int15h messages interleaved with SMP init, that looks odd to
me. Also, did the log really end inside 0:1f.0 PCI finalize or was
usbdebug logging just interrupted for some reason at that point?

> I don't have the build config right now, but it is based on the latest
> config in the board_status repo.
>
> Any idea?

Try current master with default config instead of one derived from
last reported board_status. There have been a lot of changes to the
framebuffer Kconfig settings. And we probably need the .config you
used to assess what's happening there.

Kyösti
Re: [coreboot] [RFH] Status of the Lenovo X201
Hi Paul,

I can't make my X201 boot with the most recent commit: the screen turns on and it shows a blinking cursor, but that's all.
Attached you can find the debug log: as you can see it has detected a stack smashing and it froze in a random point.

I don't have the build config right now, but it is based on the latest config in the board_status repo.

Any idea?

Best,

Nicola

April 16, 2018 9:58 AM, "Paul Menzel" wrote:

> Dear coreboot users,
>
> There is some uncertainty about the state of latest coreboot on the
> Lenovo X201. Does the most recent commit from coreboot work on it, or
> are there problems or regressions?
>
> Kind regards,
>
> Paul
> --
> coreboot mailing list: coreboot@coreboot.org
> https://mail.coreboot.org/mailman/listinfo/coreboot

USB coreboot-4.7-732-g0b643d2499-6QET70WW (1.40) Fri Apr 13 16:55:14 UTC 2018 romstage starting...
PM1_CNT: 1c00
SMBus controller enabled.
CBFS: 'Master Header Locator' located CBFS at [590200:7fffc0)
CBFS: Locating 'cmos_layout.bin'
CBFS: Found @ offset 169c0 size 62c
Intel ME early init
Intel ME firmware is ready
ME: Requested 32MB UMA
SMBus controller enabled.
CBFS: 'Master Header Locator' located CBFS at [590200:7fffc0)
CBFS: Locating 'mrc.cache'
CBFS: Found @ offset 1fdc0 size 1
find_current_mrc_cache_local: No valid MRC cache found.
reg2ca9_bit0 = 0
reg274265[0][0] = 5
reg274265[0][1] = 5
reg274265[0][2] = e
reg274265[1][0] = 5
reg274265[1][1] = 5
reg274265[1][2] = e
[6dc] <= 23faff
[6e8] <= 23faff
USB coreboot-4.7-732-g0b643d2499-6QET70WW (1.40) Fri Apr 13 16:55:14 UTC 2018 romstage starting...
PM1_CNT: 1c00
SMBus controller enabled.
CBFS: 'Master Header Locator' located CBFS at [590200:7fffc0)
CBFS: Locating 'cmos_layout.bin'
CBFS: Found @ offset 169c0 size 62c
Intel ME early init
Intel ME firmware is ready
ME: Requested 32MB UMA
SMBus controller enabled.
CBFS: 'Master Header Locator' located CBFS at [590200:7fffc0)
CBFS: Locating 'mrc.cache'
CBFS: Found @ offset 1fdc0 size 1
find_current_mrc_cache_local: No valid MRC cache found.
Timings:
channel 0, slot 0, rank 0
lane 0: 20 (20) 67 (72) 68 (68) 87 (87)
lane 1: 20 (20) 64 (6f) 6a (6a) 89 (89)
lane 2: 20 (20) 6b (76) 75 (75) 92 (92)
lane 3: 20 (20) 70 (7b) 76 (76) 94 (94)
lane 4: 20 (20) 86 (91) 7f (7f) 9c (9c)
lane 5: 20 (20) 89 (94) 7a (7a) 98 (98)
lane 6: 20 (20) 99 (a4) 8a (8a) a9 (a9)
lane 7: 20 (20) a0 (ab) 83 (83) a2 (a2)
lane 8: 15 (20) 100 (10b) 80 (80) 80 (80)
channel 1, slot 0, rank 0
lane 0: 20 (20) 8c (97) 69 (69) 85 (85)
lane 1: 20 (20) 8b (96) 68 (68) 84 (84)
lane 2: 20 (20) 95 (a0) 74 (74) 8f (8f)
lane 3: 20 (20) 92 (9d) 79 (79) 95 (95)
lane 4: 20 (20) ad (b8) 84 (84) a1 (a1)
lane 5: 20 (20) ad (b8) 7f (7f) 9c (9c)
lane 6: 20 (20) b9 (c4) 90 (90) ab (ab)
lane 7: 20 (20) bb (c6) 8a (8a) a5 (a5)
lane 8: 15 (20) 100 (10b) 80 (80) 80 (80)
[178] = 38 (0)
[10b] = 0 (0)
Timings:
channel 0, slot 0, rank 0
lane 0: 20 (20) 72 (72) 68 (68) 87 (87)
lane 1: 20 (20) 6f (6f) 6a (6a) 89 (89)
lane 2: 20 (20) 76 (76) 75 (75) 92 (92)
lane 3: 20 (20) 7b (7b) 76 (76) 94 (94)
lane 4: 20 (20) 91 (91) 7f (7f) 9c (9c)
lane 5: 20 (20) 94 (94) 7a (7a) 98 (98)
lane 6: 20 (20) a4 (a4) 8a (8a) a9 (a9)
lane 7: 20 (20) ab (ab) 83 (83) a2 (a2)
lane 8: 15 (20) 100 (10b) 80 (80) 80 (80)
channel 1, slot 0, rank 0
lane 0: 20 (20) 97 (97) 69 (69) 85 (85)
lane 1: 20 (20) 96 (96) 68 (68) 84 (84)
lane 2: 20 (20) a0 (a0) 74 (74) 8f (8f)
lane 3: 20 (20) 9d (9d) 79 (79) 95 (95)
lane 4: 20 (20) b8 (b8) 84 (84) a1 (a1)
lane 5: 20 (20) b8 (b8) 7f (7f) 9c (9c)
lane 6: 20 (20) c4 (c4) 90 (90) ab (ab)
lane 7: 20 (20) c6 (c6) 8a (8a) a5 (a5)
lane 8: 15 (20) 100 (10b) 80 (80) 80 (80)
[178] = 0 (0)
[10b] = 0 (0)
Timings:
channel 0, slot 0, rank 0
lane 0: 20 (20) 72 (72) 68 (68) 87 (87)
lane 1: 20 (20) 6f (6f) 6a (6a) 89 (89)
lane 2: 20 (20) 76 (76) 75 (75) 92 (92)
lane 3: 20 (20) 7b (7b) 76 (76) 94 (94)
lane 4: 20 (20) 91 (91) 7f (7f) 9c (9c)
lane 5: 20 (20) 94 (94) 7a (7a) 98 (98)
lane 6: 20 (20) a4 (a4) 8a (8a) a9 (a9)
lane 7: 20 (20) ab (ab) 83 (83) a2 (a2)
lane 8: 15 (20) 100 (10b) 80 (80) 80 (80)
channel 1, slot 0, rank 0
lane 0: 12 (20) 89 (97) 69 (69) 85 (85)
lane 1: 20 (20) 96 (96) 68 (68) 84 (84)
lane 2: 20 (20) a0 (a0) 74 (74) 8f (8f)
lane 3: 20 (20) 9d (9d) 79 (79) 95 (95)
lane 4: 20 (20) b8 (b8) 84 (84) a1 (a1)
lane 5: 20 (20) b8 (b8) 7f (7f) 9c (9c)
lane 6: 20 (20) c4 (c4) 90 (90) ab (ab)
lane 7: 20 (20) c6 (c6) 8a (8a) a5 (a5)
lane 8: 15 (20) 100 (10b) 80 (80) 80 (80)
[178] = 0 (0)
[10b] = 0 (0)
Timings:
channel 0, slot 0, rank 0
lane 0: 20 (20) 72 (72) 68 (68) 87 (87)
lane 1: 20 (20) 6f (6f) 6a (6a) 89 (89)
lane 2: 20 (20) 76 (76) 75 (75) 92 (92)
lane 3: 20 (20) 7b (7b) 76 (76) 94 (94)
lane 4: 20 (20) 91 (91) 7f (7f) 9c (9c)
lane 5: 20 (20) 94 (94) 7a (7a) 98 (98)
lane 6: 20 (20) a4 (a4) 8a (8a) a9 (a9)
lane 7: 20 (20) ab (ab) 83 (83) a2 (a2)
lane 8: 15 (20) 100 (10b) 80 (80) 80 (80)
channel 1,