Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-18 Thread Erhard Furtner
On Wed, 18 Oct 2023 16:45:04 +1100
Michael Ellerman  wrote:

> Thanks. Yeah text is generally better, it archives better and can be
> grepped etc. but in this case I was going a bit mad trying to make sense
> of the oops :)
> 
> In hindsight the bug is an obvious boot-time ordering problem. Can you
> confirm this fixes it? It should apply on top of Linus' current
> master.
> 
> cheers
> 
> diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
> index 2f1026fba00d..71f16fb32ceb 100644
> --- a/arch/powerpc/kernel/setup-common.c
> +++ b/arch/powerpc/kernel/setup-common.c
> @@ -948,6 +948,7 @@ void __init setup_arch(char **cmdline_p)
>  
>  	/* Parse memory topology */
>  	mem_topology_setup();
> +	set_max_mapnr(max_pfn);
>  
>  	/*
>  	 * Release secondary cpus out of their spinloops at 0x60 now that
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 8b121df7b08f..07e8f4f1e07f 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -288,7 +288,6 @@ void __init mem_init(void)
>  #endif
>  
>  	high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
> -	set_max_mapnr(max_pfn);
>  
>  	kasan_late_init();
>  

Yes, this fix does the trick. v6.6-rc6 now boots fine on the G5 (dmesg 
attached). The patch also applies on 6.5.7 with seemingly no side effects. 
Many thanks to all involved!

I'll check whether this also helps with an older memory-related bug I bisected 
recently, and post that bug if not. ;)

Regards,
Erhard


dmesg_66-rc6_g5
Description: Binary data


Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-17 Thread Michael Ellerman
Erhard Furtner  writes:
> On Tue, 17 Oct 2023 14:40:49 +1100
> Michael Ellerman  wrote:
>
>> I think I've reproduced the crash on my Quad G5 by using your config
>> with some things tweaked, but I don't get any output on the screen :/
>
> You could try PPC_EARLY_DEBUG=y with PPC_EARLY_DEBUG_BOOTX or 
> PPC_EARLY_DEBUG_G5.

I have tried PPC_EARLY_DEBUG_BOOTX but it didn't help :/

>> Do you mind booting the commit above, taking a photo of the oops, and
>> attaching it here? The oops you transcribed didn't entirely make sense,
>> probably due to a typo here or there, so a photo would be best.
>> 
>> cheers
>
> No problem. Just thought transcribing the photo would make more sense
> for a mailing list. ;) But maybe some subtle errors crept in.
> Attached are 2 photos from the issue on current v6.6-rc6.

Thanks. Yeah text is generally better, it archives better and can be
grepped etc. but in this case I was going a bit mad trying to make sense
of the oops :)

In hindsight the bug is an obvious boot-time ordering problem. Can you
confirm this fixes it? It should apply on top of Linus' current
master.

cheers

diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 2f1026fba00d..71f16fb32ceb 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -948,6 +948,7 @@ void __init setup_arch(char **cmdline_p)
 
 	/* Parse memory topology */
 	mem_topology_setup();
+	set_max_mapnr(max_pfn);
 
 	/*
 	 * Release secondary cpus out of their spinloops at 0x60 now that
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 8b121df7b08f..07e8f4f1e07f 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -288,7 +288,6 @@ void __init mem_init(void)
 #endif
 
 	high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
-	set_max_mapnr(max_pfn);
 
 	kasan_late_init();
 

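As an illustration of what a boot-time ordering problem of this kind can look like, here is a toy model in plain C. It is a sketch under assumptions, not the kernel code: it assumes a flat-memory style pfn_valid() that is a simple bound against max_mapnr, and an early init pass (toy_init_pages(), a hypothetical name) that consults it. The thread does not spell out the exact mechanism; the only fact taken from the patch is that set_max_mapnr(max_pfn) moves from mem_init() to setup_arch().

```c
#include <stdbool.h>

#define TOY_NR_PAGES 8UL

/* Toy stand-ins for kernel state; the names mirror the patch, the bodies are assumptions. */
static unsigned long max_mapnr;             /* stays 0 until set_max_mapnr() runs */
static bool page_initialised[TOY_NR_PAGES]; /* stands in for per-page setup work */

static void set_max_mapnr(unsigned long limit)
{
	max_mapnr = limit;
}

/* Flat-memory style validity check: a plain bound against max_mapnr. */
static bool pfn_valid(unsigned long pfn)
{
	return pfn < max_mapnr;
}

/* Hypothetical early init pass that only touches pfns it believes are valid. */
static unsigned long toy_init_pages(void)
{
	unsigned long done = 0;

	for (unsigned long pfn = 0; pfn < TOY_NR_PAGES; pfn++) {
		if (!pfn_valid(pfn))
			continue;	/* every pfn is skipped while max_mapnr == 0 */
		page_initialised[pfn] = true;
		done++;
	}
	return done;
}
```

In this model, calling toy_init_pages() before set_max_mapnr() initialises nothing, yet once the limit is set pfn_valid(3) reports true for a page that was never set up. Setting the limit first, as the patch does in setup_arch(), closes that window.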

Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-16 Thread Michael Ellerman
Erhard Furtner  writes:
> On Thu, 12 Oct 2023 22:41:56 +1100
> Michael Ellerman  wrote:
>
>> Can you check out the exact commit that crash is from and do:
>> 
>>  $ make arch/powerpc/mm/book3s64/hash_utils.lst
>> 
>> And paste/attach the content of that file.
>> 
>> cheers
>
> Ok, attached the output from:
>
> git checkout 9fee28baa601f4dbf869b1373183b312d2d5ef3d
> make vmlinux -j16
> make arch/powerpc/mm/book3s64/hash_utils.lst
>
> Commit 9fee28baa601f4dbf869b1373183b312d2d5ef3d is the 1st bad commit of my 
> bisect.

Thanks.

I think I've reproduced the crash on my Quad G5 by using your config
with some things tweaked, but I don't get any output on the screen :/

Do you mind booting the commit above, taking a photo of the oops, and
attaching it here? The oops you transcribed didn't entirely make sense,
probably due to a typo here or there, so a photo would be best.

cheers


Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-12 Thread Erhard Furtner
On Thu, 12 Oct 2023 10:47:51 +1100
Michael Ellerman  wrote:

> I don't see this crash on my quad G5.
> 
> I notice that your config has CONFIG_FLATMEM=y. Can you try switching to
> SPARSEMEM and see if that helps? It might help us narrow down the bug at
> least.

Your assumption was right, interesting! With CONFIG_SPARSEMEM=y my G5 boots up 
just fine (dmesg attached).

I did set CONFIG_FLATMEM=y in the .config as it's a G5 11,2 with a Dual-Core 
970MP, not Dual-CPU as the G5 7,3. So I saw no point in using SPARSEMEM as it's 
a single-CPU machine.
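
For context, the two memory models differ in how a page frame number is judged valid, which is a matter of physical memory layout rather than of CPU count. The toy C sketch here contrasts the two ideas under simplifying assumptions: a flat model with a single upper bound, and a sparse model with a per-section "present" flag. All names prefixed toy_ are hypothetical, the section size and the hole are made up, and the real kernel implementations are more involved.

```c
#include <stdbool.h>

#define TOY_PAGES_PER_SECTION 4UL
#define TOY_NR_SECTIONS       4UL

/* Flat model: one contiguous map, valid means "below the upper bound". */
static const unsigned long toy_max_mapnr = TOY_PAGES_PER_SECTION * TOY_NR_SECTIONS;

/* Sparse model: section 2 stands for a hole (for example an I/O window) with no RAM. */
static const bool toy_section_present[TOY_NR_SECTIONS] = { true, true, false, true };

static bool toy_flat_pfn_valid(unsigned long pfn)
{
	return pfn < toy_max_mapnr;
}

static bool toy_sparse_pfn_valid(unsigned long pfn)
{
	unsigned long section = pfn / TOY_PAGES_PER_SECTION;

	return section < TOY_NR_SECTIONS && toy_section_present[section];
}
```

With these made-up numbers, pfn 9 falls in the hole: the flat check accepts it because it is below the bound, while the sparse check rejects it because its section is not present.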

Regards,
Erhard


dmesg_66-rc5_g5
Description: Binary data


Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-12 Thread Michael Ellerman
Erhard Furtner  writes:
> Greetings!
>
> Kernel 6.5.5 boots fine on my PowerMac G5 11,2 but kernel 6.6-rc3 fails to 
> boot with following dmesg shown on the OpenFirmware console (transcribed 
> screenshot):
>
> [...]
> SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> rcu: Hierarchical RCU implementation.
>  Tracing variant of Tasks RCU enabled.
> rcu: RCU calculated value of scheduler-enlistment delay is 30 jiffies.
> NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> mpic: Setting up MPIC " MPIC 1   " version 1.2 at f804, max 2 CPUs
> mpic: ISU size: 124, shift: 7, mask: 7f
> mpic: Initializing for 124 sources
> mpic: Setting up HT PICs workarounds for U3/U4
> BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe
> Faulting instruction address: 0xc005dc40
> Oops: Kernel access of bad area, sig: 11 [#1]
> BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Tainted: GT  6.6.0-rc3-PMacGS #1
> Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac
> NIP:  c005dc40 LR: c000 CTR: c0007730
> REGS: c22bf510 TRAP: 0380   Tainted: GT 
> (6.6.0-rc3-PMacGS)
> MSR:  90001032   CR: 44004242  XER: 
> IRQMASK: 3
> GPR00:  c22bf7b0 c10c0b00 01ac
> GPR04: 03c8 0300 c000f20001ae 0300
> GPR08: 0006 feffbb62ffec65ff 0001 
> GPR12: 90001032 c2362000 c0f76b80 0349ecd8
> GPR16: 02367ba8 02367f08 0006 
> GPR20: 01ac c0f6f920 c22cd985 000c
> GPR24: 0300 0003b0a3691d c0003e00803e 
> GPR28: c00c c000f20001ee feffbb62ffec65fe 01ac
> NIP [c005dc40] hash_page_do_lazy_icache+0x50/0x100
> LR [c000] __hash_page_4K+0x420/0x590
> Call Trace:
> [c22bf7e0] [] 0x
> [c22bf8c0] [c005e164] hash_page_mm+0x364/0x6f0
> [c22bf990] [c005e684] do_hash_fault+0x114/0x2b0
> [c22bf9c0] [c00078e8] data_access_common_virt+0x198/0x1f0
> --- interrupt: 300 at mpic_init+0x4bc/0x10c4
> NIP:  c2020a5c LR: c2020a04 CTR: 
> REGS: c22bf9f0 TRAP: 0300   Tainted: GT 
> (6.6.0-rc3-PMacGS)
> MSR:  90001032   CR: 24004248  XER: 
> DAR: c0003e00803e DSISR: 4000 IRQMASK: 1
> GPR00:  c22bfc90 c10c0b00 c0003e008030
> GPR04:    
> GPR08:  221b80894c06df2f  
> GPR12:  c2362000 c0f76b80 0349ecd8
> GPR16: 02367ba8 02367f08 02367c70 
> GPR20: 567ce25e8c9202b7 c0f6f920 0001 c0003e008030
> GPR24: c226f348 0004 c404c640 
> GPR28: c0003e008030 c404c000 45886d8559cb69b4 c22bfc90
> NIP [c005dc40] mpic_init+0x4bc/0x10c4
> LR [c000] mpic_init+0x464/0x10c4
> ~~~ interrupt: 300
> [c22bfd90] [c2022ae4] pmac_setup_one_mpic+0x258/0x2dc
> [c22bf2e0] [c2022df4] pmac_pic_init+0x28c/0x3d8
> [c22bfef0] [c200b750] init_IRQ+0x90/0x140
> [c22bff30] [c20053c0] start_kernel+0x57c/0x78c
> [c22bffe0] [c000cb48] start_here_common+0x1c/0x20
> Code: 0929 7c292040 4081007c fbc10020 3d220127 78843664 3929d700 ebc9 
> 7fde2214 e93e 712a0001 40820064  71232000 40820048 e93e
> ---[ end trace  ]---

Can you check out the exact commit that crash is from and do:

 $ make arch/powerpc/mm/book3s64/hash_utils.lst

And paste/attach the content of that file.

cheers


Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-11 Thread Michael Ellerman
Erhard Furtner  writes:
> Greetings!
>
> Kernel 6.5.5 boots fine on my PowerMac G5 11,2 but kernel 6.6-rc3
> fails to boot with following dmesg shown on the OpenFirmware console
> (transcribed screenshot):

Thanks for transcribing all that :)

I don't see this crash on my quad G5.

I notice that your config has CONFIG_FLATMEM=y. Can you try switching to
SPARSEMEM and see if that helps? It might help us narrow down the bug at
least.

cheers


Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-06 Thread Erhard Furtner
On Fri, 06 Oct 2023 17:38:14 +0530
"Aneesh Kumar K.V"  wrote:

> Sorry that I shared a change without build testing.  Here is the updated 
> change
> 
> diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
> index 3ba9fe411604..e563e13ffd88 100644
> --- a/arch/powerpc/mm/pgtable.c
> +++ b/arch/powerpc/mm/pgtable.c
> @@ -190,29 +190,28 @@ static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
>  void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
>  		pte_t pte, unsigned int nr)
>  {
> -	/*
> -	 * Make sure hardware valid bit is not set. We don't do
> -	 * tlb flush for this update.
> -	 */
> -	VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
> -
> -	/* Note: mm->context.id might not yet have been assigned as
> -	 * this context might not have been activated yet when this
> -	 * is called.
> -	 */
> -	pte = set_pte_filter(pte);
> -
>  	/* Perform the setting of the PTE */
> -	arch_enter_lazy_mmu_mode();
>  	for (;;) {
> +
> +		/*
> +		 * Make sure hardware valid bit is not set. We don't do
> +		 * tlb flush for this update.
> +		 */
> +		VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
> +
> +		/* Note: mm->context.id might not yet have been assigned as
> +		 * this context might not have been activated yet when this
> +		 * is called.
> +		 */
> +		pte = set_pte_filter(pte);
> +
> +		/* Perform the setting of the PTE */
>  		__set_pte_at(mm, addr, ptep, pte, 0);
>  		if (--nr == 0)
>  			break;
>  		ptep++;
> -		pte = __pte(pte_val(pte) + (1UL << PTE_RPN_SHIFT));
>  		addr += PAGE_SIZE;
>  	}
> -	arch_leave_lazy_mmu_mode();
>  }
>  
>  void unmap_kernel_page(unsigned long va)

It applies cleanly on top of 6.6-rc4 but doesn't fix the problem.

I get the same call trace and very similar dmesg output to what I posted in my 
last email.

Regards,
Erhard


Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-06 Thread Aneesh Kumar K.V
Erhard Furtner  writes:

> On Fri, 06 Oct 2023 11:04:15 +0530
> "Aneesh Kumar K.V"  wrote:
>
>> Can you check this change?
>> 
>> diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
>> index 3ba9fe411604..6d144fedd557 100644
>

...

>>  void unmap_kernel_page(unsigned long va)
>
> Thanks for having a look into the issue! Your patch applies but I got a build 
> failure:
>
>  # make
>   CALL    scripts/checksyscalls.sh
>   CC      arch/powerpc/mm/pgtable.o
> In file included from ./include/linux/mm.h:29,
>                  from arch/powerpc/mm/pgtable.c:22:
> ./include/linux/pgtable.h:247:71: error: expected declaration specifiers or '...' before numeric constant
>   247 | #define set_pte_at(mm, addr, ptep, pte) set_ptes(mm, addr, ptep, pte, 1)
>       |                                                                       ^
> arch/powerpc/mm/pgtable.c:190:13: note: in expansion of macro 'set_pte_at'
>   190 | static void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
>       |             ^~

Sorry that I shared a change without build testing.  Here is the updated change

diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 3ba9fe411604..e563e13ffd88 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -190,29 +190,28 @@ static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
 void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 		pte_t pte, unsigned int nr)
 {
-	/*
-	 * Make sure hardware valid bit is not set. We don't do
-	 * tlb flush for this update.
-	 */
-	VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
-
-	/* Note: mm->context.id might not yet have been assigned as
-	 * this context might not have been activated yet when this
-	 * is called.
-	 */
-	pte = set_pte_filter(pte);
-
 	/* Perform the setting of the PTE */
-	arch_enter_lazy_mmu_mode();
 	for (;;) {
+
+		/*
+		 * Make sure hardware valid bit is not set. We don't do
+		 * tlb flush for this update.
+		 */
+		VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
+
+		/* Note: mm->context.id might not yet have been assigned as
+		 * this context might not have been activated yet when this
+		 * is called.
+		 */
+		pte = set_pte_filter(pte);
+
+		/* Perform the setting of the PTE */
 		__set_pte_at(mm, addr, ptep, pte, 0);
 		if (--nr == 0)
 			break;
 		ptep++;
-		pte = __pte(pte_val(pte) + (1UL << PTE_RPN_SHIFT));
 		addr += PAGE_SIZE;
 	}
-	arch_leave_lazy_mmu_mode();
 }
 
 void unmap_kernel_page(unsigned long va)


Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-06 Thread Erhard Furtner
On Fri, 06 Oct 2023 11:04:15 +0530
"Aneesh Kumar K.V"  wrote:

> Can you check this change?
> 
> diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
> index 3ba9fe411604..6d144fedd557 100644
> --- a/arch/powerpc/mm/pgtable.c
> +++ b/arch/powerpc/mm/pgtable.c
> @@ -187,8 +187,8 @@ static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
>  /*
>   * set_pte stores a linux PTE into the linux page table.
>   */
> -void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
> -		pte_t pte, unsigned int nr)
> +static void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
> +		pte_t pte)
>  {
>  	/*
>  	 * Make sure hardware valid bit is not set. We don't do
> @@ -203,16 +203,23 @@ void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
>  	pte = set_pte_filter(pte);
>  
>  	/* Perform the setting of the PTE */
> -	arch_enter_lazy_mmu_mode();
> +	__set_pte_at(mm, addr, ptep, pte, 0);
> +}
> +
> +/*
> + * set_pte stores a linux PTE into the linux page table.
> + */
> +void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
> +		pte_t pte, unsigned int nr)
> +{
> +	/* Perform the setting of the PTE */
>  	for (;;) {
> -		__set_pte_at(mm, addr, ptep, pte, 0);
> +		set_pte_at(mm, addr, ptep, pte);
>  		if (--nr == 0)
>  			break;
>  		ptep++;
> -		pte = __pte(pte_val(pte) + (1UL << PTE_RPN_SHIFT));
>  		addr += PAGE_SIZE;
>  	}
> -	arch_leave_lazy_mmu_mode();
>  }
>  
>  void unmap_kernel_page(unsigned long va)

Thanks for having a look into the issue! Your patch applies but I got a build 
failure:

 # make
  CALL    scripts/checksyscalls.sh
  CC      arch/powerpc/mm/pgtable.o
In file included from ./include/linux/mm.h:29,
                 from arch/powerpc/mm/pgtable.c:22:
./include/linux/pgtable.h:247:71: error: expected declaration specifiers or '...' before numeric constant
  247 | #define set_pte_at(mm, addr, ptep, pte) set_ptes(mm, addr, ptep, pte, 1)
      |                                                                       ^
arch/powerpc/mm/pgtable.c:190:13: note: in expansion of macro 'set_pte_at'
  190 | static void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
      |             ^~
make[4]: *** [scripts/Makefile.build:243: arch/powerpc/mm/pgtable.o] Error 1
make[3]: *** [scripts/Makefile.build:480: arch/powerpc/mm] Error 2
make[2]: *** [scripts/Makefile.build:480: arch/powerpc] Error 2
make[1]: *** [/usr/src/linux-stable/Makefile:1913: .] Error 2
make: *** [Makefile:234: __sub-make] Error 2
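
One reading of the compiler output above: set_pte_at is already a function-like macro in include/linux/pgtable.h (the quoted line 247), so a function definition with the same name gets macro-expanded into an invalid parameter list. The sketch here reproduces that pattern in standalone C with hypothetical names (store_many, store_one, store_one_checked). It is not kernel code, and nobody in the thread confirms this diagnosis.

```c
#include <stddef.h>

/* Hypothetical range helper, playing the role of set_ptes(). */
static int store_many(int *slot, int value, unsigned int nr)
{
	for (unsigned int i = 0; i < nr; i++)
		slot[i] = value;
	return (int)nr;
}

/* Function-like macro, analogous to the set_pte_at() wrapper in the error. */
#define store_one(slot, value) store_many(slot, value, 1)

/*
 * Writing "static int store_one(int *slot, int value)" at this point would be
 * expanded by the preprocessor into
 *     static int store_many(int *slot, int value, 1)
 * and the stray "1" in the parameter list is what the compiler rejects.
 * Giving the helper a distinct name avoids the expansion.
 */
static int store_one_checked(int *slot, int value)
{
	if (slot == NULL)
		return 0;
	return store_one(slot, value);	/* expands to store_many(slot, value, 1) */
}
```

Under this reading, the helper needs a name that does not collide with the macro; adding a fifth argument at the call site would not change how the definition itself is expanded.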

You probably forgot to add a parameter on this line of your patch:
-   __set_pte_at(mm, addr, ptep, pte, 0);
+   set_pte_at(mm, addr, ptep, pte);

So I changed it to:
-   __set_pte_at(mm, addr, ptep, pte, 0);
+   set_pte_at(mm, addr, ptep, pte, 0);

The kernel builds after that, but on booting I still run into the issue, 
though the details of the dmesg look different now:

BUG: Unable to handle kernel data access at 0xfb6affee6dfe
Faulting instruction address: 0xc005d150
Oops: Kernel access of bad area, sig: 11 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Tainted: GT  6.6.0-rc4-PMacGS #1
Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac
NIP:  c005d150 LR: c0065a70 CTR: c0007730
REGS: c22bf4c0 TRAP: 0380   Tainted: GT 
(6.6.0-rc3-PMacGS)
MSR:  90001032   CR: 44004242  XER: 
IRQMASK: 3
GPR00:  c22bf760 c10bb900 01ac
GPR04: 03c8 0300 c000f20001ae 0300
GPR08: 0006 fb6affee6dff 0001 
GPR12: 90001032 c2362000 c0f9eb80 
GPR16:  00047fb56ef0 0006 c0f62280
GPR20: 01ac c00c c22ce985 000c
GPR24: 0300 0003b0a3691d c0003e00803e 
GPR28: c00c c000f20001ee fb6affee6dfe 01ac
NIP [c005d150] hash_page_do_lazy_icache+0x50/0x100
LR [c0065a70] __hash_page_4K+0x420/0x590
Call Trace:
[c22bf760] [c22bf7a0] 0xc22bf7a0 (unreliable)
[c22bf790] [c22bf7d0] 0xc22bf7d0
[c22bf870] [c005d55c] hash_page_mm+0x24c/0x770
[c22bf950] [c005dc0c] do_hash_fault+0x10c/0x290
[c22bf980] [c00078e8] data_access_common_virt+0x198/0x1f0
--- interrupt: 300 at mpic_init+0x530/0x1164
NIP:  c2020c10 LR: c2020b40 CTR: 
REGS: c22bf9f0 TRAP: 0300   Tainted: GT 
(6.6.0-rc4-PMacGS)
MSR:  90001032   CR: 

Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-06 Thread Bagas Sanjaya
On 06/10/2023 08:19, Matthew Wilcox wrote:
> On Fri, Oct 06, 2023 at 08:11:12AM +0700, Bagas Sanjaya wrote:
>> Matthew Wilcox, did you miss this regression report? You should look into it
>> since it is (apparently) caused by a commit of yours.
> 
> No, I didn't miss it.  I'm simply choosing to work on other things.
> All this regression tracking nonsense and being told to work on things
> by people who've appointed themselves my manager has completely sapped
> my motivation to work on bugs.  If you want me to work on things, *don't*
> harass me.
> 

OK, thanks!

-- 
An old man doll... just what I always wanted! - Clara



Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-05 Thread Aneesh Kumar K.V


Hi,

Erhard Furtner  writes:

> Greetings!
>
> Kernel 6.5.5 boots fine on my PowerMac G5 11,2 but kernel 6.6-rc3 fails to 
> boot with following dmesg shown on the OpenFirmware console (transcribed 
> screenshot):

> I bisected the issue and got 9fee28baa601f4dbf869b1373183b312d2d5ef3d as 1st 
> bad commit:
>

Can you check this change?

diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 3ba9fe411604..6d144fedd557 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -187,8 +187,8 @@ static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
 /*
  * set_pte stores a linux PTE into the linux page table.
  */
-void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
-		pte_t pte, unsigned int nr)
+static void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+		pte_t pte)
 {
 	/*
 	 * Make sure hardware valid bit is not set. We don't do
@@ -203,16 +203,23 @@ void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 	pte = set_pte_filter(pte);
 
 	/* Perform the setting of the PTE */
-	arch_enter_lazy_mmu_mode();
+	__set_pte_at(mm, addr, ptep, pte, 0);
+}
+
+/*
+ * set_pte stores a linux PTE into the linux page table.
+ */
+void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+		pte_t pte, unsigned int nr)
+{
+	/* Perform the setting of the PTE */
 	for (;;) {
-		__set_pte_at(mm, addr, ptep, pte, 0);
+		set_pte_at(mm, addr, ptep, pte);
 		if (--nr == 0)
 			break;
 		ptep++;
-		pte = __pte(pte_val(pte) + (1UL << PTE_RPN_SHIFT));
 		addr += PAGE_SIZE;
 	}
-	arch_leave_lazy_mmu_mode();
 }
 
 void unmap_kernel_page(unsigned long va)

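For readers unfamiliar with the range API touched by this patch, the following toy C model shows the shape of the loop in the pre-patch set_ptes(): it stores nr consecutive entries and bumps the page frame number between them, which corresponds to the `pte = __pte(pte_val(pte) + (1UL << PTE_RPN_SHIFT))` line. The type toy_pte_t, the function toy_set_ptes() and the shift value of 12 are assumptions made for illustration only; filtering, lazy MMU mode and the real PTE layout are left out.

```c
#include <stdint.h>

#define TOY_RPN_SHIFT 12	/* assumed: the frame number starts at bit 12 */

typedef uint64_t toy_pte_t;

/*
 * Store nr consecutive entries starting at ptep, each one pointing one page
 * frame further on than the previous. As in the original loop, nr must be
 * at least 1.
 */
static void toy_set_ptes(toy_pte_t *ptep, toy_pte_t pte, unsigned int nr)
{
	for (;;) {
		*ptep = pte;
		if (--nr == 0)
			break;
		ptep++;
		/* advance to the next page frame, low flag bits unchanged */
		pte += (toy_pte_t)1 << TOY_RPN_SHIFT;
	}
}
```

Called with a three-entry table, a starting frame of 0x100 and flag bits 0x5, this fills the table with frames 0x100, 0x101 and 0x102, all carrying the same flags.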

Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-05 Thread Matthew Wilcox
On Fri, Oct 06, 2023 at 08:11:12AM +0700, Bagas Sanjaya wrote:
> Matthew Wilcox, did you miss this regression report? You should look into it
> since it is (apparently) caused by a commit of yours.

No, I didn't miss it.  I'm simply choosing to work on other things.
All this regression tracking nonsense and being told to work on things
by people who've appointed themselves my manager has completely sapped
my motivation to work on bugs.  If you want me to work on things, *don't*
harass me.



Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-10-05 Thread Bagas Sanjaya
On Fri, Sep 29, 2023 at 01:27:50PM +0200, Erhard Furtner wrote:
> Greetings!
> 
> Kernel 6.5.5 boots fine on my PowerMac G5 11,2 but kernel 6.6-rc3 fails to 
> boot with following dmesg shown on the OpenFirmware console (transcribed 
> screenshot):
> 
> [...]
> SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> rcu: Hierarchical RCU implementation.
>  Tracing variant of Tasks RCU enabled.
> rcu: RCU calculated value of scheduler-enlistment delay is 30 jiffies.
> NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> mpic: Setting up MPIC " MPIC 1   " version 1.2 at f804, max 2 CPUs
> mpic: ISU size: 124, shift: 7, mask: 7f
> mpic: Initializing for 124 sources
> mpic: Setting up HT PICs workarounds for U3/U4
> BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe
> Faulting instruction address: 0xc005dc40
> Oops: Kernel access of bad area, sig: 11 [#1]
> BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Tainted: GT  6.6.0-rc3-PMacGS #1
> Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac
> NIP:  c005dc40 LR: c000 CTR: c0007730
> REGS: c22bf510 TRAP: 0380   Tainted: GT 
> (6.6.0-rc3-PMacGS)
> MSR:  90001032   CR: 44004242  XER: 
> IRQMASK: 3
> GPR00:  c22bf7b0 c10c0b00 01ac
> GPR04: 03c8 0300 c000f20001ae 0300
> GPR08: 0006 feffbb62ffec65ff 0001 
> GPR12: 90001032 c2362000 c0f76b80 0349ecd8
> GPR16: 02367ba8 02367f08 0006 
> GPR20: 01ac c0f6f920 c22cd985 000c
> GPR24: 0300 0003b0a3691d c0003e00803e 
> GPR28: c00c c000f20001ee feffbb62ffec65fe 01ac
> NIP [c005dc40] hash_page_do_lazy_icache+0x50/0x100
> LR [c000] __hash_page_4K+0x420/0x590
> Call Trace:
> [c22bf7e0] [] 0x
> [c22bf8c0] [c005e164] hash_page_mm+0x364/0x6f0
> [c22bf990] [c005e684] do_hash_fault+0x114/0x2b0
> [c22bf9c0] [c00078e8] data_access_common_virt+0x198/0x1f0
> --- interrupt: 300 at mpic_init+0x4bc/0x10c4
> NIP:  c2020a5c LR: c2020a04 CTR: 
> REGS: c22bf9f0 TRAP: 0300   Tainted: GT 
> (6.6.0-rc3-PMacGS)
> MSR:  90001032   CR: 24004248  XER: 
> DAR: c0003e00803e DSISR: 4000 IRQMASK: 1
> GPR00:  c22bfc90 c10c0b00 c0003e008030
> GPR04:    
> GPR08:  221b80894c06df2f  
> GPR12:  c2362000 c0f76b80 0349ecd8
> GPR16: 02367ba8 02367f08 02367c70 
> GPR20: 567ce25e8c9202b7 c0f6f920 0001 c0003e008030
> GPR24: c226f348 0004 c404c640 
> GPR28: c0003e008030 c404c000 45886d8559cb69b4 c22bfc90
> NIP [c005dc40] mpic_init+0x4bc/0x10c4
> LR [c000] mpic_init+0x464/0x10c4
> ~~~ interrupt: 300
> [c22bfd90] [c2022ae4] pmac_setup_one_mpic+0x258/0x2dc
> [c22bf2e0] [c2022df4] pmac_pic_init+0x28c/0x3d8
> [c22bfef0] [c200b750] init_IRQ+0x90/0x140
> [c22bff30] [c20053c0] start_kernel+0x57c/0x78c
> [c22bffe0] [c000cb48] start_here_common+0x1c/0x20
> Code: 0929 7c292040 4081007c fbc10020 3d220127 78843664 3929d700 ebc9 
> 7fde2214 e93e 712a0001 40820064  71232000 40820048 e93e
> ---[ end trace  ]---
> 
> Kernel panic - not syncing: Fatal exception
> Rebooting in 40 seconds..
> 
> 
> I bisected the issue and got 9fee28baa601f4dbf869b1373183b312d2d5ef3d as 1st 
> bad commit:
> 
>  # git bisect good
> 9fee28baa601f4dbf869b1373183b312d2d5ef3d is the first bad commit
> commit 9fee28baa601f4dbf869b1373183b312d2d5ef3d
> Author: Matthew Wilcox (Oracle) 
> Date:   Wed Aug 2 16:13:49 2023 +0100
> 
> powerpc: implement the new page table range API
> 
> Add set_ptes(), update_mmu_cache_range() and flush_dcache_folio().  Change
> the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to per-folio.
> 
> [wi...@infradead.org: re-export flush_dcache_icache_folio()]
>   Link: https://lkml.kernel.org/r/zmx1daywvd9em...@casper.infradead.org
> Link: 
> https://lkml.kernel.org/r/20230802151406.3735276-22-wi...@infradead.org
> Signed-off-by: Matthew Wilcox (Oracle) 
> Acked-by: Mike Rapoport (IBM) 
> Cc: Michael Ellerman 
> Cc: Nicholas Piggin 
> Cc: Christophe Leroy 
> Signed-off-by: Andrew Morton 
> 
>  

Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-09-29 Thread Linux regression tracking (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 29.09.23 13:27, Erhard Furtner wrote:
> Greetings!
> 
> Kernel 6.5.5 boots fine on my PowerMac G5 11,2 but kernel 6.6-rc3 fails to 
> boot with following dmesg shown on the OpenFirmware console (transcribed 
> screenshot):
> 
> [...]
> SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> rcu: Hierarchical RCU implementation.
>  Tracing variant of Tasks RCU enabled.
> rcu: RCU calculated value of scheduler-enlistment delay is 30 jiffies.
> NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> mpic: Setting up MPIC " MPIC 1   " version 1.2 at f804, max 2 CPUs
> mpic: ISU size: 124, shift: 7, mask: 7f
> mpic: Initializing for 124 sources
> mpic: Setting up HT PICs workarounds for U3/U4
> BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe
> Faulting instruction address: 0xc005dc40
> Oops: Kernel access of bad area, sig: 11 [#1]
> BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Tainted: GT  6.6.0-rc3-PMacGS #1
> Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac
> NIP:  c005dc40 LR: c000 CTR: c0007730
> REGS: c22bf510 TRAP: 0380   Tainted: GT 
> (6.6.0-rc3-PMacGS)
> MSR:  90001032   CR: 44004242  XER: 
> IRQMASK: 3
> GPR00:  c22bf7b0 c10c0b00 01ac
> GPR04: 03c8 0300 c000f20001ae 0300
> GPR08: 0006 feffbb62ffec65ff 0001 
> GPR12: 90001032 c2362000 c0f76b80 0349ecd8
> GPR16: 02367ba8 02367f08 0006 
> GPR20: 01ac c0f6f920 c22cd985 000c
> GPR24: 0300 0003b0a3691d c0003e00803e 
> GPR28: c00c c000f20001ee feffbb62ffec65fe 01ac
> NIP [c005dc40] hash_page_do_lazy_icache+0x50/0x100
> LR [c000] __hash_page_4K+0x420/0x590
> Call Trace:
> [c22bf7e0] [] 0x
> [c22bf8c0] [c005e164] hash_page_mm+0x364/0x6f0
> [c22bf990] [c005e684] do_hash_fault+0x114/0x2b0
> [c22bf9c0] [c00078e8] data_access_common_virt+0x198/0x1f0
> --- interrupt: 300 at mpic_init+0x4bc/0x10c4
> NIP:  c2020a5c LR: c2020a04 CTR: 
> REGS: c22bf9f0 TRAP: 0300   Tainted: GT 
> (6.6.0-rc3-PMacGS)
> MSR:  90001032   CR: 24004248  XER: 
> DAR: c0003e00803e DSISR: 4000 IRQMASK: 1
> GPR00:  c22bfc90 c10c0b00 c0003e008030
> GPR04:    
> GPR08:  221b80894c06df2f  
> GPR12:  c2362000 c0f76b80 0349ecd8
> GPR16: 02367ba8 02367f08 02367c70 
> GPR20: 567ce25e8c9202b7 c0f6f920 0001 c0003e008030
> GPR24: c226f348 0004 c404c640 
> GPR28: c0003e008030 c404c000 45886d8559cb69b4 c22bfc90
> NIP [c005dc40] mpic_init+0x4bc/0x10c4
> LR [c000] mpic_init+0x464/0x10c4
> ~~~ interrupt: 300
> [c22bfd90] [c2022ae4] pmac_setup_one_mpic+0x258/0x2dc
> [c22bf2e0] [c2022df4] pmac_pic_init+0x28c/0x3d8
> [c22bfef0] [c200b750] init_IRQ+0x90/0x140
> [c22bff30] [c20053c0] start_kernel+0x57c/0x78c
> [c22bffe0] [c000cb48] start_here_common+0x1c/0x20
> Code: 0929 7c292040 4081007c fbc10020 3d220127 78843664 3929d700 ebc9 
> 7fde2214 e93e 712a0001 40820064  71232000 40820048 e93e
> ---[ end trace  ]---
> 
> Kernel panic - not syncing: Fatal exception
> Rebooting in 40 seconds..
> 
> 
> I bisected the issue and got 9fee28baa601f4dbf869b1373183b312d2d5ef3d as 1st 
> bad commit:
> 
>  # git bisect good
> 9fee28baa601f4dbf869b1373183b312d2d5ef3d is the first bad commit
> commit 9fee28baa601f4dbf869b1373183b312d2d5ef3d
> Author: Matthew Wilcox (Oracle) 
> Date:   Wed Aug 2 16:13:49 2023 +0100
> 
> powerpc: implement the new page table range API
> 
> Add set_ptes(), update_mmu_cache_range() and flush_dcache_folio().  Change
> the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to per-folio.
> 
> [wi...@infradead.org: re-export flush_dcache_icache_folio()]
>   Link: