Re: linux-next: Tree for Apr 9 (x86 boot problem)
On Tue, 13 Apr 2021, Mike Rapoport wrote:
>
> I think I've found the reason. trim_snb_memory() reserved the entire first
> megabyte very early leaving no room for real mode trampoline allocation.
> Since this reservation is needed only to make sure integrated gfx does not
> access some memory, it can be safely done after memblock allocations are
> possible.
>
> I don't know if it can be fixed on the graphics device driver side, but
> from the setup_arch() perspective I think this would be the proper fix:
>
> From c05f6046137abbcbb700571ce1ac54e7abb56a7d Mon Sep 17 00:00:00 2001
> From: Mike Rapoport
> Date: Tue, 13 Apr 2021 21:08:39 +0300
> Subject: [PATCH] x86/setup: move trim_snb_memory() later in setup_arch to fix
>  boot hangs
>
> Commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
> moved reservation of the memory inaccessible by Sandy Bridge integrated
> graphics very early and as the result on systems with such devices the
> first 1M was reserved by trim_snb_memory() which prevented the allocation
> of the real mode trampoline and made the boot hang very early.
>
> Since the purpose of trim_snb_memory() is to prevent problematic pages ever
> reaching the graphics device, it is safe to reserve these pages after
> memblock allocations are possible.
>
> Move trim_snb_memory later in boot so that it will be called after
> reserve_real_mode() and make comments describing trim_snb_memory()
> operation more elaborate.
>
> Fixes: a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
> Reported-by: Randy Dunlap
> Signed-off-by: Mike Rapoport

Tested-by: Hugh Dickins

Thanks Mike and Randy. ThinkPad T420s here. I didn't notice this thread
until this morning, but had been investigating bootup panic on mmotm
yesterday. I was more fortunate than Randy, in getting some console
output which soon led to a799c2bd29d1 without bisection. Expected
to go through it line by line today, but you've saved me - thanks.

> ---
>  arch/x86/kernel/setup.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 59e5e0903b0c..ccdcfb19df1e 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -633,11 +633,16 @@ static void __init trim_snb_memory(void)
>  	printk(KERN_DEBUG "reserving inaccessible SNB gfx pages\n");
>
>  	/*
> -	 * Reserve all memory below the 1 MB mark that has not
> -	 * already been reserved.
> +	 * SandyBridge integrated graphic devices have a bug that prevents
> +	 * them from accessing certain memory ranges, namely anything below
> +	 * 1M and in the pages listed in the bad_pages.
> +	 *
> +	 * To avoid these pages being ever accessed by SNB gfx device
> +	 * reserve all memory below the 1 MB mark and bad_pages that have
> +	 * not already been reserved at boot time.
>  	 */
>  	memblock_reserve(0, 1<<20);
> -
> +
>  	for (i = 0; i < ARRAY_SIZE(bad_pages); i++) {
>  		if (memblock_reserve(bad_pages[i], PAGE_SIZE))
>  			printk(KERN_WARNING "failed to reserve 0x%08lx\n",
> @@ -746,8 +751,6 @@ static void __init early_reserve_memory(void)
>
>  	reserve_ibft_region();
>  	reserve_bios_regions();
> -
> -	trim_snb_memory();
>  }
>
>  /*
> @@ -1083,6 +1086,13 @@ void __init setup_arch(char **cmdline_p)
>
>  	reserve_real_mode();
>
> +	/*
> +	 * Reserving memory causing GPU hangs on Sandy Bridge integrated
> +	 * graphic devices should be done after we allocated memory under
> +	 * 1M for the real mode trampoline
> +	 */
> +	trim_snb_memory();
> +
>  	init_mem_mapping();
>
>  	idt_setup_early_pf();
> --
> 2.28.0
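The ordering problem described in the patch can be reproduced with a toy model. The sketch below is not kernel code: `toy_memblock`, `toy_reserve()`, `toy_alloc_below()` and the trampoline size are hypothetical names and values chosen for illustration, and the model ignores the BIOS and EBDA reservations a real machine has below 1 MB. It only shows that once the whole first megabyte is reserved, a later allocation restricted to that range has nowhere to go, while the reverse order succeeds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOW_MEM_END     0x100000ULL /* 1 MB, the 1<<20 used in the patch */
#define TOY_PAGE_SIZE   0x1000ULL
#define TRAMPOLINE_SIZE 0x6000ULL   /* made-up size, for illustration only */
#define MAX_RESERVED    16

struct range { uint64_t start, end; };  /* half-open interval [start, end) */

struct toy_memblock {
	struct range reserved[MAX_RESERVED];
	size_t count;
};

/* Return 1 if [start, end) intersects any reserved range. */
static int overlaps(const struct toy_memblock *mb, uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < mb->count; i++)
		if (start < mb->reserved[i].end && mb->reserved[i].start < end)
			return 1;
	return 0;
}

/* Record [start, start + size) as reserved; overlapping ranges are allowed. */
int toy_reserve(struct toy_memblock *mb, uint64_t start, uint64_t size)
{
	if (mb->count == MAX_RESERVED)
		return -1;
	mb->reserved[mb->count++] = (struct range){ start, start + size };
	return 0;
}

/*
 * Search top-down for a free, aligned block below `limit` and reserve it.
 * Returns the block's address, or 0 if nothing fits (address 0 itself is
 * never handed out).
 */
uint64_t toy_alloc_below(struct toy_memblock *mb, uint64_t size,
			 uint64_t align, uint64_t limit)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	if (size == 0 || size > limit)
		return 0;
	for (uint64_t addr = (limit - size) & ~(align - 1); addr >= align; addr -= align)
		if (!overlaps(mb, addr, addr + size))
			return toy_reserve(mb, addr, size) == 0 ? addr : 0;
	return 0;
}

/* Order after a799c2bd29d1 on affected machines: trim first, allocate second. */
uint64_t trampoline_after_early_trim(void)
{
	struct toy_memblock mb = { .count = 0 };

	toy_reserve(&mb, 0, LOW_MEM_END);	/* stands in for trim_snb_memory() */
	return toy_alloc_below(&mb, TRAMPOLINE_SIZE, TOY_PAGE_SIZE, LOW_MEM_END);
}

/* Order restored by the fix: allocate the trampoline, then trim. */
uint64_t trampoline_before_late_trim(void)
{
	struct toy_memblock mb = { .count = 0 };
	uint64_t addr = toy_alloc_below(&mb, TRAMPOLINE_SIZE, TOY_PAGE_SIZE, LOW_MEM_END);

	toy_reserve(&mb, 0, LOW_MEM_END);	/* stands in for trim_snb_memory() */
	return addr;
}
```

In this model trampoline_after_early_trim() returns 0 (no room) and trampoline_before_late_trim() returns a non-zero address, which mirrors why moving trim_snb_memory() after reserve_real_mode() lets the affected laptops boot.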
Re: linux-next: Tree for Apr 9 (x86 boot problem)
On 4/13/21 11:23 AM, Mike Rapoport wrote:
> On Tue, Apr 13, 2021 at 10:34:25AM -0700, Randy Dunlap wrote:
>> On 4/13/21 9:58 AM, Mike Rapoport wrote:
>>
>> Mike,
>> That works.
>>
>> Please send the next test.
>
> I think I've found the reason. trim_snb_memory() reserved the entire first
> megabyte very early leaving no room for real mode trampoline allocation.
> Since this reservation is needed only to make sure integrated gfx does not
> access some memory, it can be safely done after memblock allocations are
> possible.
>
> I don't know if it can be fixed on the graphics device driver side, but
> from the setup_arch() perspective I think this would be the proper fix:
>
> From c05f6046137abbcbb700571ce1ac54e7abb56a7d Mon Sep 17 00:00:00 2001
> From: Mike Rapoport
> Date: Tue, 13 Apr 2021 21:08:39 +0300
> Subject: [PATCH] x86/setup: move trim_snb_memory() later in setup_arch to fix
>  boot hangs
>
> Commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
> moved reservation of the memory inaccessible by Sandy Bridge integrated
> graphics very early and as the result on systems with such devices the
> first 1M was reserved by trim_snb_memory() which prevented the allocation
> of the real mode trampoline and made the boot hang very early.
>
> Since the purpose of trim_snb_memory() is to prevent problematic pages ever
> reaching the graphics device, it is safe to reserve these pages after
> memblock allocations are possible.
>
> Move trim_snb_memory later in boot so that it will be called after
> reserve_real_mode() and make comments describing trim_snb_memory()
> operation more elaborate.
>
> Fixes: a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
> Reported-by: Randy Dunlap
> Signed-off-by: Mike Rapoport

Yay!  That boots.

Tested-by: Randy Dunlap

Thanks.

> ---
>  arch/x86/kernel/setup.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 59e5e0903b0c..ccdcfb19df1e 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -633,11 +633,16 @@ static void __init trim_snb_memory(void)
>  	printk(KERN_DEBUG "reserving inaccessible SNB gfx pages\n");
>
>  	/*
> -	 * Reserve all memory below the 1 MB mark that has not
> -	 * already been reserved.
> +	 * SandyBridge integrated graphic devices have a bug that prevents
> +	 * them from accessing certain memory ranges, namely anything below
> +	 * 1M and in the pages listed in the bad_pages.
> +	 *
> +	 * To avoid these pages being ever accessed by SNB gfx device
> +	 * reserve all memory below the 1 MB mark and bad_pages that have
> +	 * not already been reserved at boot time.
>  	 */
>  	memblock_reserve(0, 1<<20);
> -
> +
>  	for (i = 0; i < ARRAY_SIZE(bad_pages); i++) {
>  		if (memblock_reserve(bad_pages[i], PAGE_SIZE))
>  			printk(KERN_WARNING "failed to reserve 0x%08lx\n",
> @@ -746,8 +751,6 @@ static void __init early_reserve_memory(void)
>
>  	reserve_ibft_region();
>  	reserve_bios_regions();
> -
> -	trim_snb_memory();
>  }
>
>  /*
> @@ -1083,6 +1086,13 @@ void __init setup_arch(char **cmdline_p)
>
>  	reserve_real_mode();
>
> +	/*
> +	 * Reserving memory causing GPU hangs on Sandy Bridge integrated
> +	 * graphic devices should be done after we allocated memory under
> +	 * 1M for the real mode trampoline
> +	 */
> +	trim_snb_memory();
> +
>  	init_mem_mapping();
>
>  	idt_setup_early_pf();
> --

~Randy
Re: linux-next: Tree for Apr 9 (x86 boot problem)
On Tue, Apr 13, 2021 at 10:34:25AM -0700, Randy Dunlap wrote:
> On 4/13/21 9:58 AM, Mike Rapoport wrote:
> > On Mon, Apr 12, 2021 at 11:21:48PM -0700, Randy Dunlap wrote:
> >> On 4/12/21 11:06 PM, Mike Rapoport wrote:
> >>> Hi Randy,
> >>>
> >>> On Mon, Apr 12, 2021 at 01:53:34PM -0700, Randy Dunlap wrote:
> >>>> On 4/12/21 10:01 AM, Mike Rapoport wrote:
> >>>>> On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
> >>>>>
> >>>>> I thought about adding some prints to see what's causing the hang, the
> >>>>> reservations or their absence. Can you replace the debug patch with this
> >>>>> one:
> >>>>>
> >>>>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> >>>>> index 776fc9b3fafe..a10ac252dbcc 100644
> >>>>> --- a/arch/x86/kernel/setup.c
> >>>>> +++ b/arch/x86/kernel/setup.c
> >>>>> @@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
> >>>>>  		return false;
> >>>>>
> >>>>>  	vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
> >>>>> +	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> >>>>> +
> >>>>> +	pr_info("%s: vendor: %x, device: %x\n", __func__, vendor,
> >>>>> device);
> >>>>
> >>>> s/device)/devid)/
> >>>
> >>> Oh, sorry.
> >>>
> >>>>> +
> >>>>>  	if (vendor != 0x8086)
> >>>>>  		return false;
> >>>>>
> >>>>> -	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> >>>>>  	for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
> >>>>>  		if (devid == snb_ids[i])
> >>>>>  			return true;
> >>>>
> >>>> That prints:
> >>>>
> >>>> [    0.000000] snb_gfx_workaround_needed: vendor: 8086, device: 126
> >>>> [    0.000000] early_reserve_memory: snb_gfx: 1
> >>>> ...
> >>>> [    0.014061] snb_gfx_workaround_needed: vendor: 8086, device: 126
> >>>> [    0.014064] reserving inaccessible SNB gfx pages
> >>>>
> >>>>
> >>>> The full boot log is attached.
> >>>
> >>> Can you please send the log with memblock=debug added to the kernel
> >>> command
> >>> line?
> >>>
> >>> Probably should have started from this...
> >>>
> >>
> >> It's attached.
> >
> > Honestly, I can't see any reason why moving these reservations around would
> > cause your laptop to hang.
> > Let's try moving the reservations back to their original place one by
> > one, e.g something like this:
> >
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 776fc9b3fafe..892ad20b8557 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -632,12 +632,6 @@ static void __init trim_snb_memory(void)
> >
> >  	printk(KERN_DEBUG "reserving inaccessible SNB gfx pages\n");
> >
> > -	/*
> > -	 * Reserve all memory below the 1 MB mark that has not
> > -	 * already been reserved.
> > -	 */
> > -	memblock_reserve(0, 1<<20);
> > -
> >  	for (i = 0; i < ARRAY_SIZE(bad_pages); i++) {
> >  		if (memblock_reserve(bad_pages[i], PAGE_SIZE))
> >  			printk(KERN_WARNING "failed to reserve 0x%08lx\n",
> > @@ -1081,6 +1075,12 @@ void __init setup_arch(char **cmdline_p)
> >
> >  	reserve_real_mode();
> >
> > +	/*
> > +	 * Reserve all memory below the 1 MB mark that has not
> > +	 * already been reserved.
> > +	 */
> > +	memblock_reserve(0, 1<<20);
> > +
> >  	init_mem_mapping();
> >
> >  	idt_setup_early_pf();
> >
>
> Mike,
> That works.
>
> Please send the next test.

I think I've found the reason. trim_snb_memory() reserved the entire first
megabyte very early leaving no room for real mode trampoline allocation.
Since this reservation is needed only to make sure integrated gfx does not
access some memory, it can be safely done after memblock allocations are
possible.

I don't know if it can be fixed on the graphics device driver side, but
from the setup_arch() perspective I think this would be the proper fix:

From c05f6046137abbcbb700571ce1ac54e7abb56a7d Mon Sep 17 00:00:00 2001
From: Mike Rapoport
Date: Tue, 13 Apr 2021 21:08:39 +0300
Subject: [PATCH] x86/setup: move trim_snb_memory() later in setup_arch to fix
 boot hangs

Commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
moved reservation of the memory inaccessible by Sandy Bridge integrated
graphics very early and as the result on systems with such devices the
first 1M was reserved by trim_snb_memory() which prevented the allocation
of the real mode trampoline and made the boot hang very early.

Since the purpose of trim_snb_memory() is to prevent problematic pages ever
reaching the graphics device, it is safe to reserve these pages after
memblock allocations are possible.

Move trim_snb_memory later in boot so that it will be called after
reserve_real_mode() and make comments describing trim_snb_memory()
operation more elaborate.

Fixes: a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
Reported-by: Randy Dunlap
Signed-off-by: Mike Rapoport
---
 arch/x86/kernel/setup.c | 20 +++++++++++++++-----
 1 file changed,
Re: linux-next: Tree for Apr 9 (x86 boot problem)
On 4/13/21 9:58 AM, Mike Rapoport wrote:
> On Mon, Apr 12, 2021 at 11:21:48PM -0700, Randy Dunlap wrote:
>> On 4/12/21 11:06 PM, Mike Rapoport wrote:
>>> Hi Randy,
>>>
>>> On Mon, Apr 12, 2021 at 01:53:34PM -0700, Randy Dunlap wrote:
>>>> On 4/12/21 10:01 AM, Mike Rapoport wrote:
>>>>> On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
>>>>>
>>>>> I thought about adding some prints to see what's causing the hang, the
>>>>> reservations or their absence. Can you replace the debug patch with this
>>>>> one:
>>>>>
>>>>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>>>>> index 776fc9b3fafe..a10ac252dbcc 100644
>>>>> --- a/arch/x86/kernel/setup.c
>>>>> +++ b/arch/x86/kernel/setup.c
>>>>> @@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
>>>>>  		return false;
>>>>>
>>>>>  	vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
>>>>> +	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
>>>>> +
>>>>> +	pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);
>>>>
>>>> s/device)/devid)/
>>>
>>> Oh, sorry.
>>>
>>>>> +
>>>>>  	if (vendor != 0x8086)
>>>>>  		return false;
>>>>>
>>>>> -	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
>>>>>  	for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
>>>>>  		if (devid == snb_ids[i])
>>>>>  			return true;
>>>>
>>>> That prints:
>>>>
>>>> [    0.000000] snb_gfx_workaround_needed: vendor: 8086, device: 126
>>>> [    0.000000] early_reserve_memory: snb_gfx: 1
>>>> ...
>>>> [    0.014061] snb_gfx_workaround_needed: vendor: 8086, device: 126
>>>> [    0.014064] reserving inaccessible SNB gfx pages
>>>>
>>>>
>>>> The full boot log is attached.
>>>
>>> Can you please send the log with memblock=debug added to the kernel command
>>> line?
>>>
>>> Probably should have started from this...
>>>
>>
>> It's attached.
>
> Honestly, I can't see any reason why moving these reservations around would
> cause your laptop to hang.
> Let's try moving the reservations back to their original place one by
> one, e.g something like this:
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 776fc9b3fafe..892ad20b8557 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -632,12 +632,6 @@ static void __init trim_snb_memory(void)
>
>  	printk(KERN_DEBUG "reserving inaccessible SNB gfx pages\n");
>
> -	/*
> -	 * Reserve all memory below the 1 MB mark that has not
> -	 * already been reserved.
> -	 */
> -	memblock_reserve(0, 1<<20);
> -
>  	for (i = 0; i < ARRAY_SIZE(bad_pages); i++) {
>  		if (memblock_reserve(bad_pages[i], PAGE_SIZE))
>  			printk(KERN_WARNING "failed to reserve 0x%08lx\n",
> @@ -1081,6 +1075,12 @@ void __init setup_arch(char **cmdline_p)
>
>  	reserve_real_mode();
>
> +	/*
> +	 * Reserve all memory below the 1 MB mark that has not
> +	 * already been reserved.
> +	 */
> +	memblock_reserve(0, 1<<20);
> +
>  	init_mem_mapping();
>
>  	idt_setup_early_pf();
>

Mike,
That works.

Please send the next test.

--
~Randy
Re: linux-next: Tree for Apr 9 (x86 boot problem)
On Mon, Apr 12, 2021 at 11:21:48PM -0700, Randy Dunlap wrote:
> On 4/12/21 11:06 PM, Mike Rapoport wrote:
> > Hi Randy,
> >
> > On Mon, Apr 12, 2021 at 01:53:34PM -0700, Randy Dunlap wrote:
> >> On 4/12/21 10:01 AM, Mike Rapoport wrote:
> >>> On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
> >>>
> >>> I thought about adding some prints to see what's causing the hang, the
> >>> reservations or their absence. Can you replace the debug patch with this
> >>> one:
> >>>
> >>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> >>> index 776fc9b3fafe..a10ac252dbcc 100644
> >>> --- a/arch/x86/kernel/setup.c
> >>> +++ b/arch/x86/kernel/setup.c
> >>> @@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
> >>>  		return false;
> >>>
> >>>  	vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
> >>> +	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> >>> +
> >>> +	pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);
> >>
> >> s/device)/devid)/
> >
> > Oh, sorry.
> >
> >>> +
> >>>  	if (vendor != 0x8086)
> >>>  		return false;
> >>>
> >>> -	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> >>>  	for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
> >>>  		if (devid == snb_ids[i])
> >>>  			return true;
> >>
> >> That prints:
> >>
> >> [    0.000000] snb_gfx_workaround_needed: vendor: 8086, device: 126
> >> [    0.000000] early_reserve_memory: snb_gfx: 1
> >> ...
> >> [    0.014061] snb_gfx_workaround_needed: vendor: 8086, device: 126
> >> [    0.014064] reserving inaccessible SNB gfx pages
> >>
> >>
> >> The full boot log is attached.
> >
> > Can you please send the log with memblock=debug added to the kernel command
> > line?
> >
> > Probably should have started from this...
> >
>
> It's attached.

Honestly, I can't see any reason why moving these reservations around would
cause your laptop to hang.
Let's try moving the reservations back to their original place one by
one, e.g something like this:

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 776fc9b3fafe..892ad20b8557 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -632,12 +632,6 @@ static void __init trim_snb_memory(void)

 	printk(KERN_DEBUG "reserving inaccessible SNB gfx pages\n");

-	/*
-	 * Reserve all memory below the 1 MB mark that has not
-	 * already been reserved.
-	 */
-	memblock_reserve(0, 1<<20);
-
 	for (i = 0; i < ARRAY_SIZE(bad_pages); i++) {
 		if (memblock_reserve(bad_pages[i], PAGE_SIZE))
 			printk(KERN_WARNING "failed to reserve 0x%08lx\n",
@@ -1081,6 +1075,12 @@ void __init setup_arch(char **cmdline_p)

 	reserve_real_mode();

+	/*
+	 * Reserve all memory below the 1 MB mark that has not
+	 * already been reserved.
+	 */
+	memblock_reserve(0, 1<<20);
+
 	init_mem_mapping();

 	idt_setup_early_pf();

--
Sincerely yours,
Mike.
Re: linux-next: Tree for Apr 9 (x86 boot problem)
On 4/12/21 11:06 PM, Mike Rapoport wrote:
> Hi Randy,
>
> On Mon, Apr 12, 2021 at 01:53:34PM -0700, Randy Dunlap wrote:
>> On 4/12/21 10:01 AM, Mike Rapoport wrote:
>>> On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
>>>
>>> I thought about adding some prints to see what's causing the hang, the
>>> reservations or their absence. Can you replace the debug patch with this
>>> one:
>>>
>>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>>> index 776fc9b3fafe..a10ac252dbcc 100644
>>> --- a/arch/x86/kernel/setup.c
>>> +++ b/arch/x86/kernel/setup.c
>>> @@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
>>>  		return false;
>>>
>>>  	vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
>>> +	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
>>> +
>>> +	pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);
>>
>> s/device)/devid)/
>
> Oh, sorry.
>
>>> +
>>>  	if (vendor != 0x8086)
>>>  		return false;
>>>
>>> -	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
>>>  	for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
>>>  		if (devid == snb_ids[i])
>>>  			return true;
>>
>> That prints:
>>
>> [    0.000000] snb_gfx_workaround_needed: vendor: 8086, device: 126
>> [    0.000000] early_reserve_memory: snb_gfx: 1
>> ...
>> [    0.014061] snb_gfx_workaround_needed: vendor: 8086, device: 126
>> [    0.014064] reserving inaccessible SNB gfx pages
>>
>>
>> The full boot log is attached.
>
> Can you please send the log with memblock=debug added to the kernel command
> line?
>
> Probably should have started from this...
>

It's attached.

--
~Randy  {bedtime}

boot0409-memblk-debug.log.gz
Description: application/gzip
Re: linux-next: Tree for Apr 9 (x86 boot problem)
Hi Randy,

On Mon, Apr 12, 2021 at 01:53:34PM -0700, Randy Dunlap wrote:
> On 4/12/21 10:01 AM, Mike Rapoport wrote:
> > On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
> >
> > I thought about adding some prints to see what's causing the hang, the
> > reservations or their absence. Can you replace the debug patch with this
> > one:
> >
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 776fc9b3fafe..a10ac252dbcc 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
> >  		return false;
> >
> >  	vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
> > +	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> > +
> > +	pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);
>
> s/device)/devid)/

Oh, sorry.

> > +
> >  	if (vendor != 0x8086)
> >  		return false;
> >
> > -	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> >  	for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
> >  		if (devid == snb_ids[i])
> >  			return true;
>
> That prints:
>
> [    0.000000] snb_gfx_workaround_needed: vendor: 8086, device: 126
> [    0.000000] early_reserve_memory: snb_gfx: 1
> ...
> [    0.014061] snb_gfx_workaround_needed: vendor: 8086, device: 126
> [    0.014064] reserving inaccessible SNB gfx pages
>
>
> The full boot log is attached.

Can you please send the log with memblock=debug added to the kernel command
line?

Probably should have started from this...

--
Sincerely yours,
Mike.
Re: linux-next: Tree for Apr 9 (x86 boot problem)
On 4/12/21 10:01 AM, Mike Rapoport wrote:
> On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
>> On 4/11/21 11:14 PM, Mike Rapoport wrote:
>>> Hi Randy,
>>>
>>> On Sun, Apr 11, 2021 at 07:41:37PM -0700, Randy Dunlap wrote:
>>>> On 4/9/21 4:51 AM, Stephen Rothwell wrote:
>>>>> Hi all,
>>>>>
>>>>> Changes since 20210408:
>>>>>
>>>>
>>>> Hi,
>>>>
>>>> I cannot boot linux-next 20210408 nor 20210409 on an antique
>>>> x86_64 laptop (Toshiba Portege).
>>>>
>>>> After many failed tests, I finally resorted to git bisect,
>>>> which led me to:
>>>>
>>>> # bad: [4c674481dcf9974834b96622fa4b079c176f36f9] x86/setup: Merge several reservations of start of memory
>>>> git bisect bad 4c674481dcf9974834b96622fa4b079c176f36f9
>>>>
>>>>
>>>> I reverted both of these patches and the laptop boots successfully:
>>>>
>>>> commit a799c2bd29d19c565f37fa038b31a0a1d44d0e4d
>>>> Author: Mike Rapoport
>>>> Date:   Tue Mar 2 12:04:05 2021 +0200
>>>>
>>>>     x86/setup: Consolidate early memory reservations
>>>>
>>>> &&
>>>>
>>>> commit 4c674481dcf9974834b96622fa4b079c176f36f9
>>>> Author: Mike Rapoport
>>>> Date:   Tue Mar 2 12:04:06 2021 +0200
>>>>
>>>>     x86/setup: Merge several reservations of start of memory
>>>>
>>>>
>>>> There is no (zero, nil) console display when I try to boot
>>>> next 0408 or 0409.  I connected a USB serial debug cable and
>>>> booted with earlyprintk=dbgp,keep and still got nothing.
>>>>
>>>> The attached boot log is linux-next 20210409 minus the 2 patches
>>>> listed above.
>>>>
>>>> Mike- what data would you like to see?
>>>
>>> Huh, with no console this would be fun :)
>>> For now the only idea I have is to "bisect" the changes and move
>>> reservations one by one back to their original place until the system boots
>>> again.
>>>
>>> I'd start with trim_snb_memory() since it's surely needed on your laptop
>>> and quite likely it is a NOP on other systems.
>>>
>>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>>> index 776fc9b3fafe..dfca9d6b1aa6 100644
>>> --- a/arch/x86/kernel/setup.c
>>> +++ b/arch/x86/kernel/setup.c
>>> @@ -746,8 +746,6 @@ static void __init early_reserve_memory(void)
>>>
>>>  	reserve_ibft_region();
>>>  	reserve_bios_regions();
>>> -
>>> -	trim_snb_memory();
>>>  }
>>>
>>>  /*
>>> @@ -1081,6 +1079,8 @@ void __init setup_arch(char **cmdline_p)
>>>
>>>  	reserve_real_mode();
>>>
>>> +	trim_snb_memory();
>>> +
>>>  	init_mem_mapping();
>>>
>>>  	idt_setup_early_pf();
>>>
>>>> --
>
> Hi Randy,
>
>> Hi Mike,
>> That works fine.
>> Can you provide another/next step?
>
> I thought about adding some prints to see what's causing the hang, the
> reservations or their absence. Can you replace the debug patch with this
> one:
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 776fc9b3fafe..a10ac252dbcc 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
>  		return false;
>
>  	vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
> +	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> +
> +	pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);

s/device)/devid)/

> +
>  	if (vendor != 0x8086)
>  		return false;
>
> -	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
>  	for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
>  		if (devid == snb_ids[i])
>  			return true;
> @@ -747,7 +750,7 @@ static void __init early_reserve_memory(void)
>  	reserve_ibft_region();
>  	reserve_bios_regions();
>
> -	trim_snb_memory();
> +	pr_info("%s: snb_gfx: %d\n", __func__, snb_gfx_workaround_needed());
>  }
>
>  /*
> @@ -1081,6 +1084,8 @@ void __init setup_arch(char **cmdline_p)
>
>  	reserve_real_mode();
>
> +	trim_snb_memory();
> +
>  	init_mem_mapping();
>
>  	idt_setup_early_pf();

That prints:

[    0.000000] snb_gfx_workaround_needed: vendor: 8086, device: 126
[    0.000000] early_reserve_memory: snb_gfx: 1
...
[    0.014061] snb_gfx_workaround_needed: vendor: 8086, device: 126
[    0.014064] reserving inaccessible SNB gfx pages

The full boot log is attached.

--
~Randy

boottest002.log.gz
Description: application/gzip
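The log lines above come from the vendor/device check in the debug patch. A minimal user-space sketch of that check follows; it is not the kernel function. The ID table is an assumption written from memory (only device 0x126 is confirmed by the log in this thread), and `snb_workaround_needed()` and `format_ids()` are hypothetical helper names. It also shows why the log reads "device: 126": the value is printed with %x, so it is hexadecimal 0x0126 with leading zeros dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define INTEL_VENDOR_ID 0x8086

/*
 * Assumed list of Sandy Bridge graphics device IDs. Only 0x0126 is
 * confirmed by the boot log quoted in this thread; treat the rest as
 * illustrative.
 */
static const uint16_t snb_ids[] = {
	0x0102, 0x0112, 0x0122, 0x0106, 0x0116, 0x0126, 0x010a,
};

/* Return 1 if the (vendor, device) pair matches an entry in snb_ids. */
int snb_workaround_needed(uint16_t vendor, uint16_t devid)
{
	if (vendor != INTEL_VENDOR_ID)
		return 0;
	for (size_t i = 0; i < sizeof(snb_ids) / sizeof(snb_ids[0]); i++)
		if (devid == snb_ids[i])
			return 1;
	return 0;
}

/* Format the IDs the way the debug pr_info() does, using %x for both. */
int format_ids(char *buf, size_t len, uint16_t vendor, uint16_t devid)
{
	assert(buf != NULL);
	return snprintf(buf, len, "vendor: %x, device: %x",
			(unsigned int)vendor, (unsigned int)devid);
}
```

With vendor 0x8086 and device 0x0126, snb_workaround_needed() returns 1, which corresponds to the "snb_gfx: 1" line in the log.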
Re: linux-next: Tree for Apr 9 (x86 boot problem)
On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
> On 4/11/21 11:14 PM, Mike Rapoport wrote:
> > Hi Randy,
> >
> > On Sun, Apr 11, 2021 at 07:41:37PM -0700, Randy Dunlap wrote:
> >> On 4/9/21 4:51 AM, Stephen Rothwell wrote:
> >>> Hi all,
> >>>
> >>> Changes since 20210408:
> >>>
> >>
> >> Hi,
> >>
> >> I cannot boot linux-next 20210408 nor 20210409 on an antique
> >> x86_64 laptop (Toshiba Portege).
> >>
> >> After many failed tests, I finally resorted to git bisect,
> >> which led me to:
> >>
> >> # bad: [4c674481dcf9974834b96622fa4b079c176f36f9] x86/setup: Merge several
> >> reservations of start of memory
> >> git bisect bad 4c674481dcf9974834b96622fa4b079c176f36f9
> >>
> >>
> >> I reverted both of these patches and the laptop boots successfully:
> >>
> >> commit a799c2bd29d19c565f37fa038b31a0a1d44d0e4d
> >> Author: Mike Rapoport
> >> Date:   Tue Mar 2 12:04:05 2021 +0200
> >>
> >>     x86/setup: Consolidate early memory reservations
> >>
> >> &&
> >>
> >> commit 4c674481dcf9974834b96622fa4b079c176f36f9
> >> Author: Mike Rapoport
> >> Date:   Tue Mar 2 12:04:06 2021 +0200
> >>
> >>     x86/setup: Merge several reservations of start of memory
> >>
> >>
> >> There is no (zero, nil) console display when I try to boot
> >> next 0408 or 0409.  I connected a USB serial debug cable and
> >> booted with earlyprintk=dbgp,keep and still got nothing.
> >>
> >> The attached boot log is linux-next 20210409 minus the 2 patches
> >> listed above.
> >>
> >> Mike- what data would you like to see?
> >
> > Huh, with no console this would be fun :)
> > For now the only idea I have is to "bisect" the changes and move
> > reservations one by one back to their original place until the system boots
> > again.
> >
> > I'd start with trim_snb_memory() since it's surely needed on your laptop
> > and quite likely it is a NOP on other systems.
> >
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 776fc9b3fafe..dfca9d6b1aa6 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -746,8 +746,6 @@ static void __init early_reserve_memory(void)
> >
> >  	reserve_ibft_region();
> >  	reserve_bios_regions();
> > -
> > -	trim_snb_memory();
> >  }
> >
> >  /*
> > @@ -1081,6 +1079,8 @@ void __init setup_arch(char **cmdline_p)
> >
> >  	reserve_real_mode();
> >
> > +	trim_snb_memory();
> > +
> >  	init_mem_mapping();
> >
> >  	idt_setup_early_pf();
> >
> >> --

Hi Randy,

> Hi Mike,
> That works fine.
> Can you provide another/next step?

I thought about adding some prints to see what's causing the hang, the
reservations or their absence. Can you replace the debug patch with this
one:

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 776fc9b3fafe..a10ac252dbcc 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
 		return false;

 	vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
+	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
+
+	pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);
+
 	if (vendor != 0x8086)
 		return false;

-	devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
 	for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
 		if (devid == snb_ids[i])
 			return true;
@@ -747,7 +750,7 @@ static void __init early_reserve_memory(void)
 	reserve_ibft_region();
 	reserve_bios_regions();

-	trim_snb_memory();
+	pr_info("%s: snb_gfx: %d\n", __func__, snb_gfx_workaround_needed());
 }

 /*
@@ -1081,6 +1084,8 @@ void __init setup_arch(char **cmdline_p)

 	reserve_real_mode();

+	trim_snb_memory();
+
 	init_mem_mapping();

 	idt_setup_early_pf();

> If not, I'll try a few things.

Sure :)

> thanks.
> --
> ~Randy
>

--
Sincerely yours,
Mike.
Re: linux-next: Tree for Apr 9 (x86 boot problem)
On 4/11/21 11:14 PM, Mike Rapoport wrote:
> Hi Randy,
>
> On Sun, Apr 11, 2021 at 07:41:37PM -0700, Randy Dunlap wrote:
>> On 4/9/21 4:51 AM, Stephen Rothwell wrote:
>>> Hi all,
>>>
>>> Changes since 20210408:
>>>
>>
>> Hi,
>>
>> I cannot boot linux-next 20210408 nor 20210409 on an antique
>> x86_64 laptop (Toshiba Portege).
>>
>> After many failed tests, I finally resorted to git bisect,
>> which led me to:
>>
>> # bad: [4c674481dcf9974834b96622fa4b079c176f36f9] x86/setup: Merge several
>> reservations of start of memory
>> git bisect bad 4c674481dcf9974834b96622fa4b079c176f36f9
>>
>>
>> I reverted both of these patches and the laptop boots successfully:
>>
>> commit a799c2bd29d19c565f37fa038b31a0a1d44d0e4d
>> Author: Mike Rapoport
>> Date:   Tue Mar 2 12:04:05 2021 +0200
>>
>>     x86/setup: Consolidate early memory reservations
>>
>> &&
>>
>> commit 4c674481dcf9974834b96622fa4b079c176f36f9
>> Author: Mike Rapoport
>> Date:   Tue Mar 2 12:04:06 2021 +0200
>>
>>     x86/setup: Merge several reservations of start of memory
>>
>>
>> There is no (zero, nil) console display when I try to boot
>> next 0408 or 0409.  I connected a USB serial debug cable and
>> booted with earlyprintk=dbgp,keep and still got nothing.
>>
>> The attached boot log is linux-next 20210409 minus the 2 patches
>> listed above.
>>
>> Mike- what data would you like to see?
>
> Huh, with no console this would be fun :)
> For now the only idea I have is to "bisect" the changes and move
> reservations one by one back to their original place until the system boots
> again.
>
> I'd start with trim_snb_memory() since it's surely needed on your laptop
> and quite likely it is a NOP on other systems.
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 776fc9b3fafe..dfca9d6b1aa6 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -746,8 +746,6 @@ static void __init early_reserve_memory(void)
>
>  	reserve_ibft_region();
>  	reserve_bios_regions();
> -
> -	trim_snb_memory();
>  }
>
>  /*
> @@ -1081,6 +1079,8 @@ void __init setup_arch(char **cmdline_p)
>
>  	reserve_real_mode();
>
> +	trim_snb_memory();
> +
>  	init_mem_mapping();
>
>  	idt_setup_early_pf();
>
>> --

Hi Mike,
That works fine.
Can you provide another/next step?

If not, I'll try a few things.

thanks.
--
~Randy
Re: linux-next: Tree for Apr 9 (x86 boot problem)
Hi Randy,

On Sun, Apr 11, 2021 at 07:41:37PM -0700, Randy Dunlap wrote:
> On 4/9/21 4:51 AM, Stephen Rothwell wrote:
> > Hi all,
> >
> > Changes since 20210408:
> >
>
> Hi,
>
> I cannot boot linux-next 20210408 nor 20210409 on an antique
> x86_64 laptop (Toshiba Portege).
>
> After many failed tests, I finally resorted to git bisect,
> which led me to:
>
> # bad: [4c674481dcf9974834b96622fa4b079c176f36f9] x86/setup: Merge several
> reservations of start of memory
> git bisect bad 4c674481dcf9974834b96622fa4b079c176f36f9
>
>
> I reverted both of these patches and the laptop boots successfully:
>
> commit a799c2bd29d19c565f37fa038b31a0a1d44d0e4d
> Author: Mike Rapoport
> Date:   Tue Mar 2 12:04:05 2021 +0200
>
>     x86/setup: Consolidate early memory reservations
>
> &&
>
> commit 4c674481dcf9974834b96622fa4b079c176f36f9
> Author: Mike Rapoport
> Date:   Tue Mar 2 12:04:06 2021 +0200
>
>     x86/setup: Merge several reservations of start of memory
>
>
> There is no (zero, nil) console display when I try to boot
> next 0408 or 0409.  I connected a USB serial debug cable and
> booted with earlyprintk=dbgp,keep and still got nothing.
>
> The attached boot log is linux-next 20210409 minus the 2 patches
> listed above.
>
> Mike- what data would you like to see?

Huh, with no console this would be fun :)
For now the only idea I have is to "bisect" the changes and move
reservations one by one back to their original place until the system boots
again.

I'd start with trim_snb_memory() since it's surely needed on your laptop
and quite likely it is a NOP on other systems.

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 776fc9b3fafe..dfca9d6b1aa6 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -746,8 +746,6 @@ static void __init early_reserve_memory(void)

 	reserve_ibft_region();
 	reserve_bios_regions();
-
-	trim_snb_memory();
 }

 /*
@@ -1081,6 +1079,8 @@ void __init setup_arch(char **cmdline_p)

 	reserve_real_mode();

+	trim_snb_memory();
+
 	init_mem_mapping();

 	idt_setup_early_pf();

> --
> ~Randy
> Reported-by: Randy Dunlap

--
Sincerely yours,
Mike.
Re: linux-next: Tree for Apr 9 (x86 boot problem)
On 4/9/21 4:51 AM, Stephen Rothwell wrote:
> Hi all,
>
> Changes since 20210408:
>

Hi,

I cannot boot linux-next 20210408 nor 20210409 on an antique
x86_64 laptop (Toshiba Portege).

After many failed tests, I finally resorted to git bisect,
which led me to:

git bisect start
# good: [e49d033bddf5b565044e2abe4241353959bc9120] Linux 5.12-rc6
git bisect good e49d033bddf5b565044e2abe4241353959bc9120
# bad: [e99d8a8495175df8cb8b739f8cf9b0fc9d0cd3b5] Add linux-next specific files for 20210409
git bisect bad e99d8a8495175df8cb8b739f8cf9b0fc9d0cd3b5
# good: [24c5f79572740c1744a7ec2e9e21b541acab6de3] Merge remote-tracking branch 'crypto/master'
git bisect good 24c5f79572740c1744a7ec2e9e21b541acab6de3
# bad: [4b90473874c7b6af320b9815f82ac305fd8807f7] Merge remote-tracking branch 'ftrace/for-next'
git bisect bad 4b90473874c7b6af320b9815f82ac305fd8807f7
# good: [9cf3382276b26848891c7e072db0a774fadd10e4] Merge remote-tracking branch 'sound/for-next'
git bisect good 9cf3382276b26848891c7e072db0a774fadd10e4
# good: [f8d16164c586548d7ccedc058ca9ae547e0cebbe] Merge remote-tracking branch 'mmc/next'
git bisect good f8d16164c586548d7ccedc058ca9ae547e0cebbe
# good: [761ab817c8710fd601d90bfc5179b0f83b1424bb] Merge remote-tracking branch 'devicetree/for-next'
git bisect good 761ab817c8710fd601d90bfc5179b0f83b1424bb
# bad: [9ed0086faca0aefcc429a219ab1bd80654093937] Merge branch 'objtool/core'
git bisect bad 9ed0086faca0aefcc429a219ab1bd80654093937
# good: [4abeb983d38461f36b0aefa909d8b420c60b05be] Merge branch 'x86/core'
git bisect good 4abeb983d38461f36b0aefa909d8b420c60b05be
# bad: [6842a3ece3b7c0d558b6664dd6bf19b9ec4fc526] Merge branch 'timers/core'
git bisect bad 6842a3ece3b7c0d558b6664dd6bf19b9ec4fc526
# bad: [5247390b761f1f9e255a59123ffab302a83a581b] Merge branch 'x86/boot'
git bisect bad 5247390b761f1f9e255a59123ffab302a83a581b
# good: [7dfe553affd0d003c7535b7ba60d09193471ea9d] x86/syscalls: Fix -Wmissing-prototypes warnings from COND_SYSCALL()
git bisect good 7dfe553affd0d003c7535b7ba60d09193471ea9d
# good: [fda215642945f0b128e91c24c9b90c567f008887] Merge branch 'x86/build'
git bisect good fda215642945f0b128e91c24c9b90c567f008887
# good: [e14cfb3bdd0f82147d09e9f46bedda6302f28ee1] x86/boot/compressed: Avoid gcc-11 -Wstringop-overread warning
git bisect good e14cfb3bdd0f82147d09e9f46bedda6302f28ee1
# bad: [4c674481dcf9974834b96622fa4b079c176f36f9] x86/setup: Merge several reservations of start of memory
git bisect bad 4c674481dcf9974834b96622fa4b079c176f36f9


I reverted both of these patches and the laptop boots successfully:

commit a799c2bd29d19c565f37fa038b31a0a1d44d0e4d
Author: Mike Rapoport
Date:   Tue Mar 2 12:04:05 2021 +0200

    x86/setup: Consolidate early memory reservations

&&

commit 4c674481dcf9974834b96622fa4b079c176f36f9
Author: Mike Rapoport
Date:   Tue Mar 2 12:04:06 2021 +0200

    x86/setup: Merge several reservations of start of memory


There is no (zero, nil) console display when I try to boot
next 0408 or 0409.  I connected a USB serial debug cable and
booted with earlyprintk=dbgp,keep and still got nothing.

The attached boot log is linux-next 20210409 minus the 2 patches
listed above.

Mike- what data would you like to see?

--
~Randy
Reported-by: Randy Dunlap

boot0409-2.log.gz
Description: application/gzip
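The bisect log above works because each good/bad verdict roughly halves the range of candidate commits. The sketch below shows that idea as a plain binary search. It is a simplification and not how git itself is implemented: it assumes a linear history numbered by index, whereas git bisect walks a commit graph with merges. The names `first_bad_commit()` and `bad_from_9000()` and the numbers used are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Predicate type: non-zero means the kernel built at this commit fails to boot. */
typedef int (*is_bad_fn)(size_t commit_index);

/*
 * Return the index of the first bad commit in (good, bad], assuming
 * commit `good` boots, commit `bad` does not, and every commit after the
 * first bad one is bad as well. The number of test boots is stored in
 * *steps.
 */
size_t first_bad_commit(size_t good, size_t bad, is_bad_fn is_bad,
			unsigned int *steps)
{
	assert(good < bad);
	*steps = 0;
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		(*steps)++;
		if (is_bad(mid))
			bad = mid;	/* regression is at mid or earlier */
		else
			good = mid;	/* regression is after mid */
	}
	return bad;
}

/* Example predicate: pretend the regression landed at commit index 9000. */
int bad_from_9000(size_t commit_index)
{
	return commit_index >= 9000;
}
```

For a range of 10000 commits this search needs at most 14 test boots, which is why a handful of verdicts was enough to narrow a full linux-next delta down to a single x86/setup commit.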