Re: linux-next: Tree for Apr 9 (x86 boot problem)

2021-04-13 Thread Hugh Dickins
On Tue, 13 Apr 2021, Mike Rapoport wrote:
> 
> I think I've found the reason. trim_snb_memory() reserved the entire first
> megabyte very early leaving no room for real mode trampoline allocation.
> Since this reservation is needed only to make sure integrated gfx does not
> access some memory, it can be safely done after memblock allocations are
> possible.
> 
> I don't know if it can be fixed on the graphics device driver side, but
> from the setup_arch() perspective I think this would be the proper fix:
> 
> From c05f6046137abbcbb700571ce1ac54e7abb56a7d Mon Sep 17 00:00:00 2001
> From: Mike Rapoport 
> Date: Tue, 13 Apr 2021 21:08:39 +0300
> Subject: [PATCH] x86/setup: move trim_snb_memory() later in setup_arch to fix
>  boot hangs
> 
> Commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
> moved reservation of the memory inaccessible by Sandy Bridge integrated
> graphics very early and as a result on systems with such devices the
> first 1M was reserved by trim_snb_memory() which prevented the allocation
> of the real mode trampoline and made the boot hang very early.
> 
> Since the purpose of trim_snb_memory() is to prevent problematic pages ever
> reaching the graphics device, it is safe to reserve these pages after
> memblock allocations are possible.
> 
> Move trim_snb_memory() later in boot so that it will be called after
> reserve_real_mode() and make comments describing trim_snb_memory()
> operation more elaborate.
> 
> Fixes: a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
> Reported-by: Randy Dunlap 
> Signed-off-by: Mike Rapoport 

Tested-by: Hugh Dickins 

Thanks Mike and Randy. ThinkPad T420s here. I didn't notice this thread
until this morning, but had been investigating bootup panic on mmotm
yesterday. I was more fortunate than Randy, in getting some console
output which soon led to a799c2bd29d1 without bisection. Expected
to go through it line by line today, but you've saved me - thanks.

> ---
>  arch/x86/kernel/setup.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 59e5e0903b0c..ccdcfb19df1e 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -633,11 +633,16 @@ static void __init trim_snb_memory(void)
>   printk(KERN_DEBUG "reserving inaccessible SNB gfx pages\n");
>  
>   /*
> -  * Reserve all memory below the 1 MB mark that has not
> -  * already been reserved.
> +  * SandyBridge integrated graphic devices have a bug that prevents
> +  * them from accessing certain memory ranges, namely anything below
> +  * 1M and in the pages listed in the bad_pages.
> +  *
> +  * To avoid these pages being ever accessed by SNB gfx device
> +  * reserve all memory below the 1 MB mark and bad_pages that have
> +  * not already been reserved at boot time.
>    */
>   memblock_reserve(0, 1<<20);
> - 
> +
>   for (i = 0; i < ARRAY_SIZE(bad_pages); i++) {
>   if (memblock_reserve(bad_pages[i], PAGE_SIZE))
>   printk(KERN_WARNING "failed to reserve 0x%08lx\n",
> @@ -746,8 +751,6 @@ static void __init early_reserve_memory(void)
>  
>   reserve_ibft_region();
>   reserve_bios_regions();
> -
> - trim_snb_memory();
>  }
>  
>  /*
> @@ -1083,6 +1086,13 @@ void __init setup_arch(char **cmdline_p)
>  
>   reserve_real_mode();
>  
> + /*
> +  * Reserving memory causing GPU hangs on Sandy Bridge integrated
> +  * graphic devices should be done after we allocated memory under
> +  * 1M for the real mode trampoline
> +  */
> + trim_snb_memory();
> +
>   init_mem_mapping();
>  
>   idt_setup_early_pf();
> -- 
> 2.28.0


Re: linux-next: Tree for Apr 9 (x86 boot problem)

2021-04-13 Thread Randy Dunlap
On 4/13/21 11:23 AM, Mike Rapoport wrote:
> On Tue, Apr 13, 2021 at 10:34:25AM -0700, Randy Dunlap wrote:
>> On 4/13/21 9:58 AM, Mike Rapoport wrote:

>>
>> Mike,
>> That works.
>>
>> Please send the next test.
> 
> I think I've found the reason. trim_snb_memory() reserved the entire first
> megabyte very early leaving no room for real mode trampoline allocation.
> Since this reservation is needed only to make sure integrated gfx does not
> access some memory, it can be safely done after memblock allocations are
> possible.
> 
> I don't know if it can be fixed on the graphics device driver side, but
> from the setup_arch() perspective I think this would be the proper fix:
> 
> From c05f6046137abbcbb700571ce1ac54e7abb56a7d Mon Sep 17 00:00:00 2001
> From: Mike Rapoport 
> Date: Tue, 13 Apr 2021 21:08:39 +0300
> Subject: [PATCH] x86/setup: move trim_snb_memory() later in setup_arch to fix
>  boot hangs
> 
> Commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
> moved reservation of the memory inaccessible by Sandy Bridge integrated
> graphics very early and as a result on systems with such devices the
> first 1M was reserved by trim_snb_memory() which prevented the allocation
> of the real mode trampoline and made the boot hang very early.
> 
> Since the purpose of trim_snb_memory() is to prevent problematic pages ever
> reaching the graphics device, it is safe to reserve these pages after
> memblock allocations are possible.
> 
> Move trim_snb_memory() later in boot so that it will be called after
> reserve_real_mode() and make comments describing trim_snb_memory()
> operation more elaborate.
> 
> Fixes: a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
> Reported-by: Randy Dunlap 
> Signed-off-by: Mike Rapoport 

Yay! That boots.

Tested-by: Randy Dunlap 

Thanks.

> ---
>  arch/x86/kernel/setup.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 59e5e0903b0c..ccdcfb19df1e 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -633,11 +633,16 @@ static void __init trim_snb_memory(void)
>   printk(KERN_DEBUG "reserving inaccessible SNB gfx pages\n");
>  
>   /*
> -  * Reserve all memory below the 1 MB mark that has not
> -  * already been reserved.
> +  * SandyBridge integrated graphic devices have a bug that prevents
> +  * them from accessing certain memory ranges, namely anything below
> +  * 1M and in the pages listed in the bad_pages.
> +  *
> +  * To avoid these pages being ever accessed by SNB gfx device
> +  * reserve all memory below the 1 MB mark and bad_pages that have
> +  * not already been reserved at boot time.
>    */
>   memblock_reserve(0, 1<<20);
> - 
> +
>   for (i = 0; i < ARRAY_SIZE(bad_pages); i++) {
>   if (memblock_reserve(bad_pages[i], PAGE_SIZE))
>   printk(KERN_WARNING "failed to reserve 0x%08lx\n",
> @@ -746,8 +751,6 @@ static void __init early_reserve_memory(void)
>  
>   reserve_ibft_region();
>   reserve_bios_regions();
> -
> - trim_snb_memory();
>  }
>  
>  /*
> @@ -1083,6 +1086,13 @@ void __init setup_arch(char **cmdline_p)
>  
>   reserve_real_mode();
>  
> + /*
> +  * Reserving memory causing GPU hangs on Sandy Bridge integrated
> +  * graphic devices should be done after we allocated memory under
> +  * 1M for the real mode trampoline
> +  */
> + trim_snb_memory();
> +
>   init_mem_mapping();
>  
>   idt_setup_early_pf();
> 


-- 
~Randy



Re: linux-next: Tree for Apr 9 (x86 boot problem)

2021-04-13 Thread Mike Rapoport
On Tue, Apr 13, 2021 at 10:34:25AM -0700, Randy Dunlap wrote:
> On 4/13/21 9:58 AM, Mike Rapoport wrote:
> > On Mon, Apr 12, 2021 at 11:21:48PM -0700, Randy Dunlap wrote:
> >> On 4/12/21 11:06 PM, Mike Rapoport wrote:
> >>> Hi Randy,
> >>>
> >>> On Mon, Apr 12, 2021 at 01:53:34PM -0700, Randy Dunlap wrote:
> >>>> On 4/12/21 10:01 AM, Mike Rapoport wrote:
> >>>>> On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
> >>>>>  
> >>>>> I thought about adding some prints to see what's causing the hang, the
> >>>>> reservations or their absence. Can you replace the debug patch with this
> >>>>> one:
> >>>>>
> >>>>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> >>>>> index 776fc9b3fafe..a10ac252dbcc 100644
> >>>>> --- a/arch/x86/kernel/setup.c
> >>>>> +++ b/arch/x86/kernel/setup.c
> >>>>> @@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
> >>>>> return false;
> >>>>>  
> >>>>> vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
> >>>>> +   devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> >>>>> +
> >>>>> +   pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);
> >>>>
> >>>> s/device)/devid)/
> >>>  
> >>> Oh, sorry.
> >>>
> >>>>> +
> >>>>> if (vendor != 0x8086)
> >>>>> return false;
> >>>>>  
> >>>>> -   devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> >>>>> for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
> >>>>> if (devid == snb_ids[i])
> >>>>> return true;
> >>>>
> >>>> That prints:
> >>>>
> >>>> [0.00] snb_gfx_workaround_needed: vendor: 8086, device: 126
> >>>> [0.00] early_reserve_memory: snb_gfx: 1
> >>>> ...
> >>>> [0.014061] snb_gfx_workaround_needed: vendor: 8086, device: 126
> >>>> [0.014064] reserving inaccessible SNB gfx pages
> >>>>
> >>>>
> >>>> The full boot log is attached.
> >>>  
> >>> Can you please send the log with memblock=debug added to the kernel command
> >>> line?
> >>>
> >>> Probably should have started from this...
> >>>
> >>
> >> It's attached.
> > 
> > Honestly, I can't see any reason why moving these reservations around would
> > cause your laptop to hang.
> > Let's try moving the reservations back to their original place one by
> > one, e.g. something like this:
> > 
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 776fc9b3fafe..892ad20b8557 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -632,12 +632,6 @@ static void __init trim_snb_memory(void)
> >  
> > printk(KERN_DEBUG "reserving inaccessible SNB gfx pages\n");
> >  
> > -   /*
> > -    * Reserve all memory below the 1 MB mark that has not
> > -    * already been reserved.
> > -    */
> > -   memblock_reserve(0, 1<<20);
> > -   
> > for (i = 0; i < ARRAY_SIZE(bad_pages); i++) {
> > if (memblock_reserve(bad_pages[i], PAGE_SIZE))
> > printk(KERN_WARNING "failed to reserve 0x%08lx\n",
> > @@ -1081,6 +1075,12 @@ void __init setup_arch(char **cmdline_p)
> >  
> > reserve_real_mode();
> >  
> > +   /*
> > +    * Reserve all memory below the 1 MB mark that has not
> > +    * already been reserved.
> > +    */
> > +   memblock_reserve(0, 1<<20);
> > +
> > init_mem_mapping();
> >  
> > idt_setup_early_pf();
> > 
> 
> Mike,
> That works.
> 
> Please send the next test.

I think I've found the reason. trim_snb_memory() reserved the entire first
megabyte very early leaving no room for real mode trampoline allocation.
Since this reservation is needed only to make sure integrated gfx does not
access some memory, it can be safely done after memblock allocations are
possible.
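
To make the ordering problem concrete, here is a toy user-space model in C.
It is only a sketch under stated assumptions: the toy_* names, the fixed-size
reservation array and the 8-page trampoline size are made up for illustration
and are not the kernel's memblock API or the real trampoline size. It shows
that a page-aligned allocation below 1M fails once [0, 1M) has already been
reserved, and succeeds when the allocation runs first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: hypothetical names, NOT the kernel memblock API. */
#define TOY_MAX_RESERVED 16
#define TOY_PAGE_SIZE    4096u
#define TOY_LOW_MEM_END  (1u << 20)	/* 1 MB */

struct toy_range {
	uint32_t base;
	uint32_t size;
};

struct toy_memblock {
	struct toy_range reserved[TOY_MAX_RESERVED];
	size_t count;
};

/* Record [base, base + size) as reserved; overlapping entries are allowed. */
bool toy_reserve(struct toy_memblock *mb, uint32_t base, uint32_t size)
{
	if (mb->count == TOY_MAX_RESERVED)
		return false;
	mb->reserved[mb->count].base = base;
	mb->reserved[mb->count].size = size;
	mb->count++;
	return true;
}

/* True if [base, base + size) overlaps no recorded reservation. */
bool toy_is_free(const struct toy_memblock *mb, uint32_t base, uint32_t size)
{
	for (size_t i = 0; i < mb->count; i++) {
		uint32_t rb = mb->reserved[i].base;
		uint32_t re = rb + mb->reserved[i].size;

		if (base < re && rb < base + size)
			return false;
	}
	return true;
}

/*
 * Search top-down for a free, page-aligned range below 1 MB, reserve it
 * and return its base. Page 0 is never handed out, so 0 means failure.
 */
uint32_t toy_alloc_below_1m(struct toy_memblock *mb, uint32_t size)
{
	assert(size > 0 && size % TOY_PAGE_SIZE == 0 && size <= TOY_LOW_MEM_END);

	for (uint32_t base = TOY_LOW_MEM_END - size; base >= TOY_PAGE_SIZE;
	     base -= TOY_PAGE_SIZE) {
		if (toy_is_free(mb, base, size) && toy_reserve(mb, base, size))
			return base;
	}
	return 0;
}

/* Problematic order: reserve the whole first 1 MB, then allocate. */
bool toy_boot_early_trim(void)
{
	struct toy_memblock mb = { .count = 0 };

	toy_reserve(&mb, 0, TOY_LOW_MEM_END);
	/* 8 pages is an assumed trampoline size for this sketch. */
	return toy_alloc_below_1m(&mb, 8 * TOY_PAGE_SIZE) != 0;
}

/* Fixed order: allocate the trampoline first, then reserve the rest. */
bool toy_boot_late_trim(void)
{
	struct toy_memblock mb = { .count = 0 };
	bool ok = toy_alloc_below_1m(&mb, 8 * TOY_PAGE_SIZE) != 0;

	toy_reserve(&mb, 0, TOY_LOW_MEM_END);
	return ok;
}
```

In this model toy_boot_early_trim() reports failure and toy_boot_late_trim()
reports success, which mirrors the hang and the fix described above. The model
ignores BIOS and other firmware reservations that exist on a real system.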

I don't know if it can be fixed on the graphics device driver side, but
from the setup_arch() perspective I think this would be the proper fix:

From c05f6046137abbcbb700571ce1ac54e7abb56a7d Mon Sep 17 00:00:00 2001
From: Mike Rapoport 
Date: Tue, 13 Apr 2021 21:08:39 +0300
Subject: [PATCH] x86/setup: move trim_snb_memory() later in setup_arch to fix
 boot hangs

Commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
moved reservation of the memory inaccessible by Sandy Bridge integrated
graphics very early and as a result on systems with such devices the
first 1M was reserved by trim_snb_memory() which prevented the allocation
of the real mode trampoline and made the boot hang very early.

Since the purpose of trim_snb_memory() is to prevent problematic pages ever
reaching the graphics device, it is safe to reserve these pages after
memblock allocations are possible.

Move trim_snb_memory() later in boot so that it will be called after
reserve_real_mode() and make comments describing trim_snb_memory()
operation more elaborate.

Fixes: a799c2bd29d1 ("x86/setup: Consolidate early memory reservations")
Reported-by: Randy Dunlap 
Signed-off-by: Mike Rapoport 
---
 arch/x86/kernel/setup.c | 20 +++-
 1 file changed, 

Re: linux-next: Tree for Apr 9 (x86 boot problem)

2021-04-13 Thread Randy Dunlap
On 4/13/21 9:58 AM, Mike Rapoport wrote:
> On Mon, Apr 12, 2021 at 11:21:48PM -0700, Randy Dunlap wrote:
>> On 4/12/21 11:06 PM, Mike Rapoport wrote:
>>> Hi Randy,
>>>
>>> On Mon, Apr 12, 2021 at 01:53:34PM -0700, Randy Dunlap wrote:
>>>> On 4/12/21 10:01 AM, Mike Rapoport wrote:
>>>>> On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
>>>>>  
>>>>> I thought about adding some prints to see what's causing the hang, the
>>>>> reservations or their absence. Can you replace the debug patch with this
>>>>> one:
>>>>>
>>>>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>>>>> index 776fc9b3fafe..a10ac252dbcc 100644
>>>>> --- a/arch/x86/kernel/setup.c
>>>>> +++ b/arch/x86/kernel/setup.c
>>>>> @@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
>>>>>   return false;
>>>>>  
>>>>>   vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
>>>>> + devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
>>>>> +
>>>>> + pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);
>>>>
>>>> s/device)/devid)/
>>>  
>>> Oh, sorry.
>>>
>>>>> +
>>>>>   if (vendor != 0x8086)
>>>>>   return false;
>>>>>  
>>>>> - devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
>>>>>   for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
>>>>>   if (devid == snb_ids[i])
>>>>>   return true;
>>>>
>>>> That prints:
>>>>
>>>> [0.00] snb_gfx_workaround_needed: vendor: 8086, device: 126
>>>> [0.00] early_reserve_memory: snb_gfx: 1
>>>> ...
>>>> [0.014061] snb_gfx_workaround_needed: vendor: 8086, device: 126
>>>> [0.014064] reserving inaccessible SNB gfx pages
>>>>
>>>>
>>>> The full boot log is attached.
>>>  
>>> Can you please send the log with memblock=debug added to the kernel command
>>> line?
>>>
>>> Probably should have started from this...
>>>
>>
>> It's attached.
> 
> Honestly, I can't see any reason why moving these reservations around would
> cause your laptop to hang.
> Let's try moving the reservations back to their original place one by
> one, e.g. something like this:
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 776fc9b3fafe..892ad20b8557 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -632,12 +632,6 @@ static void __init trim_snb_memory(void)
>  
>   printk(KERN_DEBUG "reserving inaccessible SNB gfx pages\n");
>  
> - /*
> -  * Reserve all memory below the 1 MB mark that has not
> -  * already been reserved.
> -  */
> - memblock_reserve(0, 1<<20);
> - 
>   for (i = 0; i < ARRAY_SIZE(bad_pages); i++) {
>   if (memblock_reserve(bad_pages[i], PAGE_SIZE))
>   printk(KERN_WARNING "failed to reserve 0x%08lx\n",
> @@ -1081,6 +1075,12 @@ void __init setup_arch(char **cmdline_p)
>  
>   reserve_real_mode();
>  
> + /*
> +  * Reserve all memory below the 1 MB mark that has not
> +  * already been reserved.
> +  */
> + memblock_reserve(0, 1<<20);
> +
>   init_mem_mapping();
>  
>   idt_setup_early_pf();
> 

Mike,
That works.

Please send the next test.

-- 
~Randy



Re: linux-next: Tree for Apr 9 (x86 boot problem)

2021-04-13 Thread Mike Rapoport
On Mon, Apr 12, 2021 at 11:21:48PM -0700, Randy Dunlap wrote:
> On 4/12/21 11:06 PM, Mike Rapoport wrote:
> > Hi Randy,
> > 
> > On Mon, Apr 12, 2021 at 01:53:34PM -0700, Randy Dunlap wrote:
> >> On 4/12/21 10:01 AM, Mike Rapoport wrote:
> >>> On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
> >>>  
> >>> I thought about adding some prints to see what's causing the hang, the
> >>> reservations or their absence. Can you replace the debug patch with this
> >>> one:
> >>>
> >>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> >>> index 776fc9b3fafe..a10ac252dbcc 100644
> >>> --- a/arch/x86/kernel/setup.c
> >>> +++ b/arch/x86/kernel/setup.c
> >>> @@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
> >>>   return false;
> >>>  
> >>>   vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
> >>> + devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> >>> +
> >>> + pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);
> >>
> >> s/device)/devid)/
> >  
> > Oh, sorry.
> > 
> >>> +
> >>>   if (vendor != 0x8086)
> >>>   return false;
> >>>  
> >>> - devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> >>>   for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
> >>>   if (devid == snb_ids[i])
> >>>   return true;
> >>
> >> That prints:
> >>
> >> [0.00] snb_gfx_workaround_needed: vendor: 8086, device: 126
> >> [0.00] early_reserve_memory: snb_gfx: 1
> >> ...
> >> [0.014061] snb_gfx_workaround_needed: vendor: 8086, device: 126
> >> [0.014064] reserving inaccessible SNB gfx pages
> >>
> >>
> >> The full boot log is attached.
> >  
> > Can you please send the log with memblock=debug added to the kernel command
> > line?
> > 
> > Probably should have started from this...
> > 
> 
> It's attached.

Honestly, I can't see any reason why moving these reservations around would
cause your laptop to hang.
Let's try moving the reservations back to their original place one by
one, e.g. something like this:

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 776fc9b3fafe..892ad20b8557 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -632,12 +632,6 @@ static void __init trim_snb_memory(void)
 
printk(KERN_DEBUG "reserving inaccessible SNB gfx pages\n");
 
-   /*
-    * Reserve all memory below the 1 MB mark that has not
-    * already been reserved.
-    */
-   memblock_reserve(0, 1<<20);
-   
for (i = 0; i < ARRAY_SIZE(bad_pages); i++) {
if (memblock_reserve(bad_pages[i], PAGE_SIZE))
printk(KERN_WARNING "failed to reserve 0x%08lx\n",
@@ -1081,6 +1075,12 @@ void __init setup_arch(char **cmdline_p)
 
reserve_real_mode();
 
+   /*
+    * Reserve all memory below the 1 MB mark that has not
+    * already been reserved.
+    */
+   memblock_reserve(0, 1<<20);
+
init_mem_mapping();
 
idt_setup_early_pf();

-- 
Sincerely yours,
Mike.


Re: linux-next: Tree for Apr 9 (x86 boot problem)

2021-04-13 Thread Randy Dunlap
On 4/12/21 11:06 PM, Mike Rapoport wrote:
> Hi Randy,
> 
> On Mon, Apr 12, 2021 at 01:53:34PM -0700, Randy Dunlap wrote:
>> On 4/12/21 10:01 AM, Mike Rapoport wrote:
>>> On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
>>>  
>>> I thought about adding some prints to see what's causing the hang, the
>>> reservations or their absence. Can you replace the debug patch with this
>>> one:
>>>
>>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>>> index 776fc9b3fafe..a10ac252dbcc 100644
>>> --- a/arch/x86/kernel/setup.c
>>> +++ b/arch/x86/kernel/setup.c
>>> @@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
>>> return false;
>>>  
>>> vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
>>> +   devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
>>> +
>>> +   pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);
>>
>> s/device)/devid)/
>  
> Oh, sorry.
> 
>>> +
>>> if (vendor != 0x8086)
>>> return false;
>>>  
>>> -   devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
>>> for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
>>> if (devid == snb_ids[i])
>>> return true;
>>
>> That prints:
>>
>> [0.00] snb_gfx_workaround_needed: vendor: 8086, device: 126
>> [0.00] early_reserve_memory: snb_gfx: 1
>> ...
>> [0.014061] snb_gfx_workaround_needed: vendor: 8086, device: 126
>> [0.014064] reserving inaccessible SNB gfx pages
>>
>>
>> The full boot log is attached.
>  
> Can you please send the log with memblock=debug added to the kernel command
> line?
> 
> Probably should have started from this...
> 

It's attached.

-- 
~Randy
{bedtime}



boot0409-memblk-debug.log.gz
Description: application/gzip


Re: linux-next: Tree for Apr 9 (x86 boot problem)

2021-04-13 Thread Mike Rapoport
Hi Randy,

On Mon, Apr 12, 2021 at 01:53:34PM -0700, Randy Dunlap wrote:
> On 4/12/21 10:01 AM, Mike Rapoport wrote:
> > On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
> >  
> > I thought about adding some prints to see what's causing the hang, the
> > reservations or their absence. Can you replace the debug patch with this
> > one:
> > 
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 776fc9b3fafe..a10ac252dbcc 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
> > return false;
> >  
> > vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
> > +   devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> > +
> > +   pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);
> 
> s/device)/devid)/
 
Oh, sorry.

> > +
> > if (vendor != 0x8086)
> > return false;
> >  
> > -   devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> > for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
> > if (devid == snb_ids[i])
> > return true;
> 
> That prints:
> 
> [0.00] snb_gfx_workaround_needed: vendor: 8086, device: 126
> [0.00] early_reserve_memory: snb_gfx: 1
> ...
> [0.014061] snb_gfx_workaround_needed: vendor: 8086, device: 126
> [0.014064] reserving inaccessible SNB gfx pages
> 
> 
> The full boot log is attached.
 
Can you please send the log with memblock=debug added to the kernel command
line?

Probably should have started from this...

-- 
Sincerely yours,
Mike.


Re: linux-next: Tree for Apr 9 (x86 boot problem)

2021-04-12 Thread Randy Dunlap
On 4/12/21 10:01 AM, Mike Rapoport wrote:
> On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
>> On 4/11/21 11:14 PM, Mike Rapoport wrote:
>>> Hi Randy,
>>>
>>> On Sun, Apr 11, 2021 at 07:41:37PM -0700, Randy Dunlap wrote:
>>>> On 4/9/21 4:51 AM, Stephen Rothwell wrote:
>>>>> Hi all,
>>>>>
>>>>> Changes since 20210408:
>>>>>
>>>>
>>>> Hi,
>>>>
>>>> I cannot boot linux-next 20210408 nor 20210409 on an antique
>>>> x86_64 laptop (Toshiba Portege).
>>>>
>>>> After many failed tests, I finally resorted to git bisect,
>>>> which led me to:
>>>>
>>>> # bad: [4c674481dcf9974834b96622fa4b079c176f36f9] x86/setup: Merge several reservations of start of memory
>>>> git bisect bad 4c674481dcf9974834b96622fa4b079c176f36f9
>>>>
>>>>
>>>> I reverted both of these patches and the laptop boots successfully:
>>>>
>>>> commit a799c2bd29d19c565f37fa038b31a0a1d44d0e4d
>>>> Author: Mike Rapoport
>>>> Date:   Tue Mar 2 12:04:05 2021 +0200
>>>>
>>>> x86/setup: Consolidate early memory reservations
>>>>
>>>> &&
>>>>
>>>> commit 4c674481dcf9974834b96622fa4b079c176f36f9
>>>> Author: Mike Rapoport
>>>> Date:   Tue Mar 2 12:04:06 2021 +0200
>>>>
>>>> x86/setup: Merge several reservations of start of memory
>>>>
>>>>
>>>> There is no (zero, nil) console display when I try to boot
>>>> next 0408 or 0409. I connected a USB serial debug cable and
>>>> booted with earlyprintk=dbgp,keep and still got nothing.
>>>>
>>>> The attached boot log is linux-next 20210409 minus the 2 patches
>>>> listed above.
>>>>
>>>> Mike- what data would you like to see?
>>>
>>> Huh, with no console this would be fun :)
>>> For now the only idea I have is to "bisect" the changes and move
>>> reservations one by one back to their original place until the system boots
>>> again. 
>>>
>>> I'd start with trim_snb_memory() since it's surely needed on your laptop
>>> and quite likely it is a NOP on other systems.
>>>
>>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>>> index 776fc9b3fafe..dfca9d6b1aa6 100644
>>> --- a/arch/x86/kernel/setup.c
>>> +++ b/arch/x86/kernel/setup.c
>>> @@ -746,8 +746,6 @@ static void __init early_reserve_memory(void)
>>>  
>>> reserve_ibft_region();
>>> reserve_bios_regions();
>>> -
>>> -   trim_snb_memory();
>>>  }
>>>  
>>>  /*
>>> @@ -1081,6 +1079,8 @@ void __init setup_arch(char **cmdline_p)
>>>  
>>> reserve_real_mode();
>>>  
>>> +   trim_snb_memory();
>>> +
>>> init_mem_mapping();
>>>  
>>> idt_setup_early_pf();
>>>  
>>>> -- 
> 
> Hi Randy,
>  
>> Hi Mike,
>> That works fine.
>> Can you provide another/next step?
>  
> I thought about adding some prints to see what's causing the hang, the
> reservations or their absence. Can you replace the debug patch with this
> one:
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 776fc9b3fafe..a10ac252dbcc 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
>   return false;
>  
>   vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
> + devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
> +
> + pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);

s/device)/devid)/

> +
>   if (vendor != 0x8086)
>   return false;
>  
> - devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
>   for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
>   if (devid == snb_ids[i])
>   return true;
> @@ -747,7 +750,7 @@ static void __init early_reserve_memory(void)
>   reserve_ibft_region();
>   reserve_bios_regions();
>  
> - trim_snb_memory();
> + pr_info("%s: snb_gfx: %d\n", __func__, snb_gfx_workaround_needed());
>  }
>  
>  /*
> @@ -1081,6 +1084,8 @@ void __init setup_arch(char **cmdline_p)
>  
>   reserve_real_mode();
>  
> + trim_snb_memory();
> +
>   init_mem_mapping();
>  
>   idt_setup_early_pf();

That prints:

[0.00] snb_gfx_workaround_needed: vendor: 8086, device: 126
[0.00] early_reserve_memory: snb_gfx: 1
...
[0.014061] snb_gfx_workaround_needed: vendor: 8086, device: 126
[0.014064] reserving inaccessible SNB gfx pages


The full boot log is attached.
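
For readers following the prints above, the check they come from can be
sketched in plain user-space C. This is a minimal sketch, not the kernel code:
example_snb_ids and example_snb_workaround_needed() are hypothetical names, the
vendor and device values are passed in instead of being read from PCI config
space, and the table lists only 0x0126 because that is the one ID this log
confirms as matching (the kernel's snb_ids[] table has more entries).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_PCI_VENDOR_ID_INTEL 0x8086

/*
 * Illustrative table: only 0x0126 is taken from the log in this thread
 * ("device: 126" followed by "snb_gfx: 1").
 */
static const uint16_t example_snb_ids[] = {
	0x0126,
};

/*
 * Return true when the graphics device at PCI 00:02.0 is an Intel device
 * whose ID is in the table, i.e. when the low-memory workaround applies.
 */
bool example_snb_workaround_needed(uint16_t vendor, uint16_t devid)
{
	size_t n = sizeof(example_snb_ids) / sizeof(example_snb_ids[0]);

	if (vendor != EXAMPLE_PCI_VENDOR_ID_INTEL)
		return false;

	for (size_t i = 0; i < n; i++) {
		if (devid == example_snb_ids[i])
			return true;
	}
	return false;
}
```

With the values printed above (vendor 8086, device 126) the sketch returns
true, matching "snb_gfx: 1" in the log.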


-- 
~Randy



boottest002.log.gz
Description: application/gzip


Re: linux-next: Tree for Apr 9 (x86 boot problem)

2021-04-12 Thread Mike Rapoport
On Mon, Apr 12, 2021 at 08:49:49AM -0700, Randy Dunlap wrote:
> On 4/11/21 11:14 PM, Mike Rapoport wrote:
> > Hi Randy,
> > 
> > On Sun, Apr 11, 2021 at 07:41:37PM -0700, Randy Dunlap wrote:
> >> On 4/9/21 4:51 AM, Stephen Rothwell wrote:
> >>> Hi all,
> >>>
> >>> Changes since 20210408:
> >>>
> >>
> >> Hi,
> >>
> >> I cannot boot linux-next 20210408 nor 20210409 on an antique
> >> x86_64 laptop (Toshiba Portege).
> >>
> >> After many failed tests, I finally resorted to git bisect,
> >> which led me to:
> >>
> >> # bad: [4c674481dcf9974834b96622fa4b079c176f36f9] x86/setup: Merge several reservations of start of memory
> >> git bisect bad 4c674481dcf9974834b96622fa4b079c176f36f9
> >>
> >>
> >> I reverted both of these patches and the laptop boots successfully:
> >>
> >> commit a799c2bd29d19c565f37fa038b31a0a1d44d0e4d
> >> Author: Mike Rapoport 
> >> Date:   Tue Mar 2 12:04:05 2021 +0200
> >>
> >> x86/setup: Consolidate early memory reservations
> >>
> >> &&
> >>
> >> commit 4c674481dcf9974834b96622fa4b079c176f36f9
> >> Author: Mike Rapoport 
> >> Date:   Tue Mar 2 12:04:06 2021 +0200
> >>
> >> x86/setup: Merge several reservations of start of memory
> >>
> >>
> >> There is no (zero, nil) console display when I try to boot
> >> next 0408 or 0409. I connected a USB serial debug cable and
> >> booted with earlyprintk=dbgp,keep and still got nothing.
> >>
> >> The attached boot log is linux-next 20210409 minus the 2 patches
> >> listed above.
> >>
> >> Mike- what data would you like to see?
> > 
> > Huh, with no console this would be fun :)
> > For now the only idea I have is to "bisect" the changes and move
> > reservations one by one back to their original place until the system boots
> > again. 
> > 
> > I'd start with trim_snb_memory() since it's surely needed on your laptop
> > and quite likely it is a NOP on other systems.
> > 
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 776fc9b3fafe..dfca9d6b1aa6 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -746,8 +746,6 @@ static void __init early_reserve_memory(void)
> >  
> > reserve_ibft_region();
> > reserve_bios_regions();
> > -
> > -   trim_snb_memory();
> >  }
> >  
> >  /*
> > @@ -1081,6 +1079,8 @@ void __init setup_arch(char **cmdline_p)
> >  
> > reserve_real_mode();
> >  
> > +   trim_snb_memory();
> > +
> > init_mem_mapping();
> >  
> > idt_setup_early_pf();
> >  
> >> -- 

Hi Randy,
 
> Hi Mike,
> That works fine.
> Can you provide another/next step?
 
I thought about adding some prints to see what's causing the hang, the
reservations or their absence. Can you replace the debug patch with this
one:

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 776fc9b3fafe..a10ac252dbcc 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -600,10 +600,13 @@ static bool __init snb_gfx_workaround_needed(void)
return false;
 
vendor = read_pci_config_16(0, 2, 0, PCI_VENDOR_ID);
+   devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
+
+   pr_info("%s: vendor: %x, device: %x\n", __func__, vendor, device);
+
if (vendor != 0x8086)
return false;
 
-   devid = read_pci_config_16(0, 2, 0, PCI_DEVICE_ID);
for (i = 0; i < ARRAY_SIZE(snb_ids); i++)
if (devid == snb_ids[i])
return true;
@@ -747,7 +750,7 @@ static void __init early_reserve_memory(void)
reserve_ibft_region();
reserve_bios_regions();
 
-   trim_snb_memory();
+   pr_info("%s: snb_gfx: %d\n", __func__, snb_gfx_workaround_needed());
 }
 
 /*
@@ -1081,6 +1084,8 @@ void __init setup_arch(char **cmdline_p)
 
reserve_real_mode();
 
+   trim_snb_memory();
+
init_mem_mapping();
 
idt_setup_early_pf();

> If not, I'll try a few things.

Sure :)
 
> thanks.
> -- 
> ~Randy
> 

-- 
Sincerely yours,
Mike.


Re: linux-next: Tree for Apr 9 (x86 boot problem)

2021-04-12 Thread Randy Dunlap
On 4/11/21 11:14 PM, Mike Rapoport wrote:
> Hi Randy,
> 
> On Sun, Apr 11, 2021 at 07:41:37PM -0700, Randy Dunlap wrote:
>> On 4/9/21 4:51 AM, Stephen Rothwell wrote:
>>> Hi all,
>>>
>>> Changes since 20210408:
>>>
>>
>> Hi,
>>
>> I cannot boot linux-next 20210408 nor 20210409 on an antique
>> x86_64 laptop (Toshiba Portege).
>>
>> After many failed tests, I finally resorted to git bisect,
>> which led me to:
>>
>> # bad: [4c674481dcf9974834b96622fa4b079c176f36f9] x86/setup: Merge several reservations of start of memory
>> git bisect bad 4c674481dcf9974834b96622fa4b079c176f36f9
>>
>>
>> I reverted both of these patches and the laptop boots successfully:
>>
>> commit a799c2bd29d19c565f37fa038b31a0a1d44d0e4d
>> Author: Mike Rapoport 
>> Date:   Tue Mar 2 12:04:05 2021 +0200
>>
>> x86/setup: Consolidate early memory reservations
>>
>> &&
>>
>> commit 4c674481dcf9974834b96622fa4b079c176f36f9
>> Author: Mike Rapoport 
>> Date:   Tue Mar 2 12:04:06 2021 +0200
>>
>> x86/setup: Merge several reservations of start of memory
>>
>>
>> There is no (zero, nil) console display when I try to boot
>> next 0408 or 0409. I connected a USB serial debug cable and
>> booted with earlyprintk=dbgp,keep and still got nothing.
>>
>> The attached boot log is linux-next 20210409 minus the 2 patches
>> listed above.
>>
>> Mike- what data would you like to see?
> 
> Huh, with no console this would be fun :)
> For now the only idea I have is to "bisect" the changes and move
> reservations one by one back to their original place until the system boots
> again. 
> 
> I'd start with trim_snb_memory() since it's surely needed on your laptop
> and quite likely it is a NOP on other systems.
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 776fc9b3fafe..dfca9d6b1aa6 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -746,8 +746,6 @@ static void __init early_reserve_memory(void)
>  
>   reserve_ibft_region();
>   reserve_bios_regions();
> -
> - trim_snb_memory();
>  }
>  
>  /*
> @@ -1081,6 +1079,8 @@ void __init setup_arch(char **cmdline_p)
>  
>   reserve_real_mode();
>  
> + trim_snb_memory();
> +
>   init_mem_mapping();
>  
>   idt_setup_early_pf();
>  
>> -- 

Hi Mike,
That works fine.
Can you provide another/next step?

If not, I'll try a few things.

thanks.
-- 
~Randy



Re: linux-next: Tree for Apr 9 (x86 boot problem)

2021-04-12 Thread Mike Rapoport
Hi Randy,

On Sun, Apr 11, 2021 at 07:41:37PM -0700, Randy Dunlap wrote:
> On 4/9/21 4:51 AM, Stephen Rothwell wrote:
> > Hi all,
> > 
> > Changes since 20210408:
> > 
> 
> Hi,
> 
> I cannot boot linux-next 20210408 nor 20210409 on an antique
> x86_64 laptop (Toshiba Portege).
> 
> After many failed tests, I finally resorted to git bisect,
> which led me to:
> 
> # bad: [4c674481dcf9974834b96622fa4b079c176f36f9] x86/setup: Merge several reservations of start of memory
> git bisect bad 4c674481dcf9974834b96622fa4b079c176f36f9
> 
> 
> I reverted both of these patches and the laptop boots successfully:
> 
> commit a799c2bd29d19c565f37fa038b31a0a1d44d0e4d
> Author: Mike Rapoport 
> Date:   Tue Mar 2 12:04:05 2021 +0200
> 
> x86/setup: Consolidate early memory reservations
> 
> &&
> 
> commit 4c674481dcf9974834b96622fa4b079c176f36f9
> Author: Mike Rapoport 
> Date:   Tue Mar 2 12:04:06 2021 +0200
> 
> x86/setup: Merge several reservations of start of memory
> 
> 
> There is no (zero, nil) console display when I try to boot
> next 0408 or 0409. I connected a USB serial debug cable and
> booted with earlyprintk=dbgp,keep and still got nothing.
> 
> The attached boot log is linux-next 20210409 minus the 2 patches
> listed above.
> 
> Mike- what data would you like to see?

Huh, with no console this would be fun :)
For now the only idea I have is to "bisect" the changes and move
reservations one by one back to their original place until the system boots
again. 

I'd start with trim_snb_memory() since it's surely needed on your laptop
and quite likely it is a NOP on other systems.

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 776fc9b3fafe..dfca9d6b1aa6 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -746,8 +746,6 @@ static void __init early_reserve_memory(void)
 
 	reserve_ibft_region();
 	reserve_bios_regions();
-
-	trim_snb_memory();
 }
 
 /*
@@ -1081,6 +1079,8 @@ void __init setup_arch(char **cmdline_p)
 
 	reserve_real_mode();
 
+	trim_snb_memory();
+
 	init_mem_mapping();
 
 	idt_setup_early_pf();
 
> -- 
> ~Randy
> Reported-by: Randy Dunlap 

-- 
Sincerely yours,
Mike.


Re: linux-next: Tree for Apr 9 (x86 boot problem)

2021-04-11 Thread Randy Dunlap
On 4/9/21 4:51 AM, Stephen Rothwell wrote:
> Hi all,
> 
> Changes since 20210408:
> 

Hi,

I cannot boot linux-next 20210408 nor 20210409 on an antique
x86_64 laptop (Toshiba Portege).

After many failed tests, I finally resorted to git bisect,
which led me to:

git bisect start
# good: [e49d033bddf5b565044e2abe4241353959bc9120] Linux 5.12-rc6
git bisect good e49d033bddf5b565044e2abe4241353959bc9120
# bad: [e99d8a8495175df8cb8b739f8cf9b0fc9d0cd3b5] Add linux-next specific files for 20210409
git bisect bad e99d8a8495175df8cb8b739f8cf9b0fc9d0cd3b5
# good: [24c5f79572740c1744a7ec2e9e21b541acab6de3] Merge remote-tracking branch 'crypto/master'
git bisect good 24c5f79572740c1744a7ec2e9e21b541acab6de3
# bad: [4b90473874c7b6af320b9815f82ac305fd8807f7] Merge remote-tracking branch 'ftrace/for-next'
git bisect bad 4b90473874c7b6af320b9815f82ac305fd8807f7
# good: [9cf3382276b26848891c7e072db0a774fadd10e4] Merge remote-tracking branch 'sound/for-next'
git bisect good 9cf3382276b26848891c7e072db0a774fadd10e4
# good: [f8d16164c586548d7ccedc058ca9ae547e0cebbe] Merge remote-tracking branch 'mmc/next'
git bisect good f8d16164c586548d7ccedc058ca9ae547e0cebbe
# good: [761ab817c8710fd601d90bfc5179b0f83b1424bb] Merge remote-tracking branch 'devicetree/for-next'
git bisect good 761ab817c8710fd601d90bfc5179b0f83b1424bb
# bad: [9ed0086faca0aefcc429a219ab1bd80654093937] Merge branch 'objtool/core'
git bisect bad 9ed0086faca0aefcc429a219ab1bd80654093937
# good: [4abeb983d38461f36b0aefa909d8b420c60b05be] Merge branch 'x86/core'
git bisect good 4abeb983d38461f36b0aefa909d8b420c60b05be
# bad: [6842a3ece3b7c0d558b6664dd6bf19b9ec4fc526] Merge branch 'timers/core'
git bisect bad 6842a3ece3b7c0d558b6664dd6bf19b9ec4fc526
# bad: [5247390b761f1f9e255a59123ffab302a83a581b] Merge branch 'x86/boot'
git bisect bad 5247390b761f1f9e255a59123ffab302a83a581b
# good: [7dfe553affd0d003c7535b7ba60d09193471ea9d] x86/syscalls: Fix -Wmissing-prototypes warnings from COND_SYSCALL()
git bisect good 7dfe553affd0d003c7535b7ba60d09193471ea9d
# good: [fda215642945f0b128e91c24c9b90c567f008887] Merge branch 'x86/build'
git bisect good fda215642945f0b128e91c24c9b90c567f008887
# good: [e14cfb3bdd0f82147d09e9f46bedda6302f28ee1] x86/boot/compressed: Avoid gcc-11 -Wstringop-overread warning
git bisect good e14cfb3bdd0f82147d09e9f46bedda6302f28ee1
# bad: [4c674481dcf9974834b96622fa4b079c176f36f9] x86/setup: Merge several reservations of start of memory
git bisect bad 4c674481dcf9974834b96622fa4b079c176f36f9


I reverted both of these patches and the laptop boots successfully:

commit a799c2bd29d19c565f37fa038b31a0a1d44d0e4d
Author: Mike Rapoport 
Date:   Tue Mar 2 12:04:05 2021 +0200

x86/setup: Consolidate early memory reservations

&&

commit 4c674481dcf9974834b96622fa4b079c176f36f9
Author: Mike Rapoport 
Date:   Tue Mar 2 12:04:06 2021 +0200

x86/setup: Merge several reservations of start of memory


There is no (zero, nil) console display when I try to boot
next 0408 or 0409. I connected a USB serial debug cable and
booted with earlyprintk=dbgp,keep and still got nothing.

The attached boot log is linux-next 20210409 minus the 2 patches
listed above.


Mike- what data would you like to see?

-- 
~Randy
Reported-by: Randy Dunlap 


boot0409-2.log.gz
Description: application/gzip