Re: Xorg doesn't start and some other issues with the RC1 of kernel 6.10

2024-06-01 Thread Linux regression tracking (Thorsten Leemhuis)
On 01.06.24 08:34, Greg KH wrote:
> On Fri, May 31, 2024 at 12:16:50PM +0200, Greg KH wrote:
>> On Fri, May 31, 2024 at 12:02:15PM +0200, Greg KH wrote:
>>> On Fri, May 31, 2024 at 11:19:34AM +0200, Thorsten Leemhuis wrote:
>>>> Thx, I already had an eye on this, but thought tracking would not be
>>>> needed, as Greg (now CCed) wanted to revert 8c467f3300591a ("VT: Use
>>>> macros to define ioctls") two days ago:
>>>> https://lore.kernel.org/all/2024052901-police-trash-e9f9@gregkh/
>>>>
>>>> But that commit is not yet in -next afaics. :-/
>>>>
>>>> /me meanwhile wonders if it would be wise to fix this before -rc2
>>>
>>> I do, sorry, been traveling this week with geen vrije tijd.  Will get to
>>> it tomorrow.
>>
>> Ugh, sorry for the Dutch, I have "no free time" because I am studying
>> the language this week.  It is bleeding over here into my emails now...

:-D

> Pull request now sent:
>   https://lore.kernel.org/r/zlq8ymiubtois...@kroah.com

Dank u wel![1, 2] And good luck with your studies! Ciao, Thorsten

[1] "Many thx!"

[2] I understand some Dutch (more than enough for "geen vrije tijd"),
but do not really speak it; but it was enough to get that simple phrase
right on the first attempt.

#regzbot fix: 7bc4244c882a7d7d


Re: [PATCH 04/23] scsi: initialize scsi midlayer limits before allocating the queue

2024-05-29 Thread Linux regression tracking (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

On 20.05.24 17:15, Christoph Hellwig wrote:
> Adding ben and the linuxppc list.

Hmm, no reply and no other progress to get this resolved afaics. So let's
bring Michael into the mix, he might be able to help out.

BTW TWIMC: a PowerMac G5 user reported similar symptoms here
recently: https://bugzilla.kernel.org/show_bug.cgi?id=218858

Ciao, Thorsten

> Context: pata_macio initialization now fails as we enforce that the
> segment size is set properly.
> 
> On Wed, May 15, 2024 at 04:52:29PM -0700, Guenter Roeck wrote:
>> pata_macio_common_init() Calling ata_host_activate() with limit 65280
>> ...
>> max_segment_size is 65280; PAGE_SIZE is 65536; BLK_MAX_SEGMENT_SIZE is 65536
>> WARNING: CPU: 0 PID: 12 at block/blk-settings.c:202 blk_validate_limits+0x2d4/0x364
>> ...
>>
>> This is with PPC_BOOK3S_64 which selects a default page size of 64k.
> 
> Yeah.  Did you actually manage to use pata macio previously?  Or is
> it just used because it's part of the pmac default config?
> 
>> Looking at the old code, I think it did what you suggested above,
>> but assuming that the driver requested a lower limit on purpose that
>> may not be the best solution.
>>
>> Never mind, though - I updated my test configuration to explicitly
>> configure the page size to 4k to work around the problem. With that,
>> please consider this report a note in case someone hits the problem
>> on a real system (and sorry for the noise).
> 
> Yes, the idea behind this change was to catch such errors.  So far
> most errors have been drivers setting lower limits than what the
> hardware can actually handle, but I'd love to track this down.
> 
> If the hardware can't actually handle the lower limit we should
> probably just fail the probe gracefully with a well-commented if
> statement instead.
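
For readers following along, the check behind the warning quoted above can be modelled in user space roughly as follows. This is a simplified sketch, not the kernel's code: `model_validate_max_segment_size()` and its parameters are hypothetical names, and the "zero means use the default" behaviour is an assumption. Only the comparison implied by the log (a segment limit of 65280 against a PAGE_SIZE of 65536) is taken from the report.

```c
#include <assert.h>
#include <errno.h>

/*
 * Simplified user-space model of the limit check discussed above.
 * Hypothetical helper, not the kernel's blk_validate_limits().
 *
 * Returns 0 if the limit is acceptable and stores the limit that would
 * be used in *effective; returns -EINVAL otherwise.
 */
int model_validate_max_segment_size(unsigned int requested,
                                    unsigned int page_size,
                                    unsigned int default_max,
                                    unsigned int *effective)
{
    unsigned int limit;

    assert(effective);

    /* Assumption: a driver that sets no limit gets the default. */
    limit = requested ? requested : default_max;

    /* A segment limit smaller than one page is rejected. */
    if (limit < page_size)
        return -EINVAL;

    *effective = limit;
    return 0;
}
```

With the values from the log (65280 requested, 64k pages) the model rejects the limit. With 4k pages the same request passes, which matches the workaround of configuring the page size to 4k.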


Re: [PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception

2024-04-04 Thread Linux regression tracking (Thorsten Leemhuis)
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

Was this regression ever resolved? Doesn't look like it, but maybe I
just missed something.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 20.03.24 14:43, Nicholas Piggin wrote:
> On Wed Mar 13, 2024 at 5:26 PM AEST, Vaibhav Jain wrote:
>> This reverts commit 180c6b072bf360b686e53d893d8dcf7dbbaec6bb ("KVM: PPC:
>> Book3S HV nestedv2: Do not cancel pending decrementer exception") which
>> prevented cancelling a pending HDEC exception for nestedv2 KVM guests. It
>> was done to avoid overhead of a H_GUEST_GET_STATE hcall to read the 'HDEC
>> expiry TB' register which was higher compared to handling extra decrementer
>> exceptions.
>>
>> This overhead of reading 'HDEC expiry TB' register has been mitigated
>> recently by the L0 hypervisor(PowerVM) by putting the value of this
>> register in L2 guest-state output buffer on trap to L1. From there the
>> value of this register is cached, made available in kvmhv_run_single_vcpu()
>> to compare it against host(L1) timebase and cancel the pending hypervisor
>> decrementer exception if needed.
> 
> Ah, I figured out the problem here. Guest entry never clears the
> queued dec, because it's level triggered on the DEC MSB so it
> doesn't go away when it's delivered. So upstream code is indeed
> buggy and I think I take the blame for suggesting this nestedv2
> workaround.
> 
> I actually don't think that is necessary though, we could treat it
> like other interrupts.  I think that would solve the problem without
> having to test dec here.
> 
> I am wondering though, what workload slows down that this patch
> was needed in the first place. We'd only get here after a cede
> returns, then we'd dequeue the dec and stop having to GET_STATE
> it here.
> 
> Thanks,
> Nick
> 
>>
>> Fixes: 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception")
>> Signed-off-by: Vaibhav Jain 
>> ---
>>  arch/powerpc/kvm/book3s_hv.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 0b921704da45..e47b954ce266 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -4856,7 +4856,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>>   * entering a nested guest in which case the decrementer is now owned
>>   * by L2 and the L1 decrementer is provided in hdec_expires
>>   */
>> -if (!kvmhv_is_nestedv2() && kvmppc_core_pending_dec(vcpu) &&
>> +if (kvmppc_core_pending_dec(vcpu) &&
>>  ((tb < kvmppc_dec_expires_host_tb(vcpu)) ||
>>   (trap == BOOK3S_INTERRUPT_SYSCALL &&
>>kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
> 
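
The effect of the one-line change in the quoted hunk can be illustrated with a small user-space model. This is only a sketch: `struct dec_state` and both functions are hypothetical stand-ins for the kernel helpers named in the diff, and the `enter_nested` flag folds the syscall/H_ENTER_NESTED test into a single boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the inputs the quoted condition looks at. */
struct dec_state {
    bool nestedv2;        /* models kvmhv_is_nestedv2() */
    bool dec_pending;     /* models kvmppc_core_pending_dec(vcpu) */
    uint64_t tb;          /* current host timebase */
    uint64_t dec_expires; /* models kvmppc_dec_expires_host_tb(vcpu) */
    bool enter_nested;    /* syscall trap requesting H_ENTER_NESTED */
};

/* Condition before the patch: nestedv2 guests never cancel. */
bool cancel_dec_before(const struct dec_state *s)
{
    assert(s);
    return !s->nestedv2 && s->dec_pending &&
           (s->tb < s->dec_expires || s->enter_nested);
}

/* Condition after the patch: the nestedv2 exclusion is dropped. */
bool cancel_dec_after(const struct dec_state *s)
{
    assert(s);
    return s->dec_pending &&
           (s->tb < s->dec_expires || s->enter_nested);
}
```

The two predicates differ only for nestedv2 guests with a pending decrementer that has not yet expired; for every other input they agree.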


Re: [PATCH] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

2023-11-18 Thread Linux regression tracking (Thorsten Leemhuis)
On 19.11.23 00:45, Timothy Pearson wrote:
> During floating point and vector save to thread data fr0/vs0 are clobbered
> by the FPSCR/VSCR store routine.  This leads to userspace register corruption
> and application data corruption / crash under the following rare condition:
> [...]
> Tested-by: Timothy Pearson 

Many thx for this, good to see you finally found the problem.

FWIW, you might want to add a

 Closes:
https://lore.kernel.org/all/480932026.45576726.1699374859845.javamail.zim...@raptorengineeringinc.com/

here. Yes, I care about those tags because of regression tracking. But
it only relies on Link:/Closes: tags because they were meant to be used
in the first place to link to backstories and details of a change[1].

And you and Jens did such good debugging in that thread, which is why
it's IMHO really worth linking here in case anyone ever needs to look
into the backstory later.

> Signed-off-by: Timothy Pearson 
> [..]

Thx again for all the work you put into this.

Ciao, Thorsten

[1] see Documentation/process/submitting-patches.rst
(http://docs.kernel.org/process/submitting-patches.html) and
Documentation/process/5.Posting.rst
(https://docs.kernel.org/process/5.Posting.html)

See also these mails from Linus:
https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+gsyyfbth0f-dgvd...@mail.gmail.com/
https://lore.kernel.org/all/CAHk-=wgs38ZrfPvy=nowvkvzjpm3vfu1zobp37fwd_h9iad...@mail.gmail.com/
https://lore.kernel.org/all/CAHk-=wjxzafG-=j8ot30s7upn4rhbs6tx-uvfz5rme+l5_d...@mail.gmail.com/


Re: [Bisected] PowerMac G5 fails booting kernel 6.6-rc3 (BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe)

2023-09-29 Thread Linux regression tracking (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 29.09.23 13:27, Erhard Furtner wrote:
> Greetings!
> 
> Kernel 6.5.5 boots fine on my PowerMac G5 11,2 but kernel 6.6-rc3 fails to 
> boot with following dmesg shown on the OpenFirmware console (transcribed 
> screenshot):
> 
> [...]
> SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> rcu: Hierarchical RCU implementation.
>  Tracing variant of Tasks RCU enabled.
> rcu: RCU calculated value of scheduler-enlistment delay is 30 jiffies.
> NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
> mpic: Setting up MPIC " MPIC 1   " version 1.2 at f804, max 2 CPUs
> mpic: ISU size: 124, shift: 7, mask: 7f
> mpic: Initializing for 124 sources
> mpic: Setting up HT PICs workarounds for U3/U4
> BUG: Unable to handle kernel data access at 0xfeffbb62ffec65fe
> Faulting instruction address: 0xc005dc40
> Oops: Kernel access of bad area, sig: 11 [#1]
> BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Tainted: GT  6.6.0-rc3-PMacGS #1
> Hardware name: PowerMac11,2 PPC970MP 0x440101 PowerMac
> NIP:  c005dc40 LR: c000 CTR: c0007730
> REGS: c22bf510 TRAP: 0380   Tainted: GT 
> (6.6.0-rc3-PMacGS)
> MSR:  90001032   CR: 44004242  XER: 
> IRQMASK: 3
> GPR00:  c22bf7b0 c10c0b00 01ac
> GPR04: 03c8 0300 c000f20001ae 0300
> GPR08: 0006 feffbb62ffec65ff 0001 
> GPR12: 90001032 c2362000 c0f76b80 0349ecd8
> GPR16: 02367ba8 02367f08 0006 
> GPR20: 01ac c0f6f920 c22cd985 000c
> GPR24: 0300 0003b0a3691d c0003e00803e 
> GPR28: c00c c000f20001ee feffbb62ffec65fe 01ac
> NIP [c005dc40] hash_page_do_lazy_icache+0x50/0x100
> LR [c000] __hash_page_4K+0x420/0x590
> Call Trace:
> [c22bf7e0] [] 0x
> [c22bf8c0] [c005e164] hash_page_mm+0x364/0x6f0
> [c22bf990] [c005e684] do_hash_fault+0x114/0x2b0
> [c22bf9c0] [c00078e8] data_access_common_virt+0x198/0x1f0
> --- interrupt: 300 at mpic_init+0x4bc/0x10c4
> NIP:  c2020a5c LR: c2020a04 CTR: 
> REGS: c22bf9f0 TRAP: 0300   Tainted: GT 
> (6.6.0-rc3-PMacGS)
> MSR:  90001032   CR: 24004248  XER: 
> DAR: c0003e00803e DSISR: 4000 IRQMASK: 1
> GPR00:  c22bfc90 c10c0b00 c0003e008030
> GPR04:    
> GPR08:  221b80894c06df2f  
> GPR12:  c2362000 c0f76b80 0349ecd8
> GPR16: 02367ba8 02367f08 02367c70 
> GPR20: 567ce25e8c9202b7 c0f6f920 0001 c0003e008030
> GPR24: c226f348 0004 c404c640 
> GPR28: c0003e008030 c404c000 45886d8559cb69b4 c22bfc90
> NIP [c2020a5c] mpic_init+0x4bc/0x10c4
> LR [c2020a04] mpic_init+0x464/0x10c4
> --- interrupt: 300
> [c22bfd90] [c2022ae4] pmac_setup_one_mpic+0x258/0x2dc
> [c22bf2e0] [c2022df4] pmac_pic_init+0x28c/0x3d8
> [c22bfef0] [c200b750] init_IRQ+0x90/0x140
> [c22bff30] [c20053c0] start_kernel+0x57c/0x78c
> [c22bffe0] [c000cb48] start_here_common+0x1c/0x20
> Code: 0929 7c292040 4081007c fbc10020 3d220127 78843664 3929d700 ebc9 
> 7fde2214 e93e 712a0001 40820064  71232000 40820048 e93e
> ---[ end trace  ]---
> 
> Kernel panic - not syncing: Fatal exception
> Rebooting in 40 seconds..
> 
> 
> I bisected the issue and got 9fee28baa601f4dbf869b1373183b312d2d5ef3d as 1st 
> bad commit:
> 
>  # git bisect good
> 9fee28baa601f4dbf869b1373183b312d2d5ef3d is the first bad commit
> commit 9fee28baa601f4dbf869b1373183b312d2d5ef3d
> Author: Matthew Wilcox (Oracle) 
> Date:   Wed Aug 2 16:13:49 2023 +0100
> 
> powerpc: implement the new page table range API
> 
> Add set_ptes(), update_mmu_cache_range() and flush_dcache_folio().  Change
> the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to per-folio.
> 
> [wi...@infradead.org: re-export flush_dcache_icache_folio()]
>   Link: 

Re: linux-next: Tree for Jul 13 (drivers/video/fbdev/ps3fb.c)

2023-07-31 Thread Linux regression tracking (Thorsten Leemhuis)
On 18.07.23 18:15, Randy Dunlap wrote:
> On 7/18/23 04:48, Michael Ellerman wrote:
>> Bagas Sanjaya  writes:
>>> On Thu, Jul 13, 2023 at 09:11:10AM -0700, Randy Dunlap wrote:
>>>> on ppc64:
>>>>
>>>> In file included from ../include/linux/device.h:15,
>>>>  from ../arch/powerpc/include/asm/io.h:22,
>>>>  from ../include/linux/io.h:13,
>>>>  from ../include/linux/irq.h:20,
>>>>  from ../arch/powerpc/include/asm/hardirq.h:6,
>>>>  from ../include/linux/hardirq.h:11,
>>>>  from ../include/linux/interrupt.h:11,
>>>>  from ../drivers/video/fbdev/ps3fb.c:25:
>>>> ../drivers/video/fbdev/ps3fb.c: In function 'ps3fb_probe':
>>>> ../drivers/video/fbdev/ps3fb.c:1172:40: error: 'struct fb_info' has no member named 'dev'
> [...]
>>
>> Does regzbot track issues in linux-next?

Seems your patch didn't make any progress, at least I can't see it in
-next. Is there a reason why, or did I miss anything?

And yes, sure, I'm aware that it's -next and a driver that people might
not enable regularly. But I noticed it and thought "quickly bring it up,
might be good to fix this sooner rather than later before other people
run into it (and who knows, maybe it'll switch a light in some CI system
from red to green as well)"

Ciao, Thorsten

>> The driver seems to only use info->dev in that one dev_info() line,
>> which seems purely cosmetic, so I think it could just be removed, eg:
>>
>> diff --git a/drivers/video/fbdev/ps3fb.c b/drivers/video/fbdev/ps3fb.c
>> index d4abcf8aff75..a304a39d712b 100644
>> --- a/drivers/video/fbdev/ps3fb.c
>> +++ b/drivers/video/fbdev/ps3fb.c
>> @@ -1168,8 +1168,7 @@ static int ps3fb_probe(struct ps3_system_bus_device 
>> *dev)
>>  
>>  ps3_system_bus_set_drvdata(dev, info);
>>  
>> -dev_info(info->device, "%s %s, using %u KiB of video memory\n",
>> - dev_driver_string(info->dev), dev_name(info->dev),
>> +dev_info(info->device, "using %u KiB of video memory\n",
>>   info->fix.smem_len >> 10);
>>  
>>  task = kthread_run(ps3fbd, info, DEVICE_NAME);
> 
> 
> Tested-by: Randy Dunlap  # build-tested
> 
> Thanks.
> 


Re: linux-next: Tree for Jul 13 (drivers/video/fbdev/ps3fb.c)

2023-07-18 Thread Linux regression tracking (Thorsten Leemhuis)
Michael, thx for looking into this!

On 18.07.23 13:48, Michael Ellerman wrote:
> Bagas Sanjaya  writes:
>> On Thu, Jul 13, 2023 at 09:11:10AM -0700, Randy Dunlap wrote:
>>> on ppc64:
>>>
>>> In file included from ../include/linux/device.h:15,
>>>  from ../arch/powerpc/include/asm/io.h:22,
>>>  from ../include/linux/io.h:13,
>>>  from ../include/linux/irq.h:20,
>>>  from ../arch/powerpc/include/asm/hardirq.h:6,
>>>  from ../include/linux/hardirq.h:11,
>>>  from ../include/linux/interrupt.h:11,
>>>  from ../drivers/video/fbdev/ps3fb.c:25:
>>> ../drivers/video/fbdev/ps3fb.c: In function 'ps3fb_probe':
>>> ../drivers/video/fbdev/ps3fb.c:1172:40: error: 'struct fb_info' has no 
>>> member named 'dev'
>>>  1172 |  dev_driver_string(info->dev), dev_name(info->dev),
>>>   |^~
>>> ../include/linux/dev_printk.h:110:37: note: in definition of macro 
>>> 'dev_printk_index_wrap'
>>>   110 | _p_func(dev, fmt, ##__VA_ARGS__);   
>>> \
>>>   | ^~~
>>> ../drivers/video/fbdev/ps3fb.c:1171:9: note: in expansion of macro 
>>> 'dev_info'
>>>  1171 | dev_info(info->device, "%s %s, using %u KiB of video 
>>> memory\n",
>>>   | ^~~~
>>> ../drivers/video/fbdev/ps3fb.c:1172:61: error: 'struct fb_info' has no 
>>> member named 'dev'
>>>  1172 |  dev_driver_string(info->dev), dev_name(info->dev),
>>>   | ^~
>>> ../include/linux/dev_printk.h:110:37: note: in definition of macro 
>>> 'dev_printk_index_wrap'
>>>   110 | _p_func(dev, fmt, ##__VA_ARGS__);   
>>> \
>>>   | ^~~
>>> ../drivers/video/fbdev/ps3fb.c:1171:9: note: in expansion of macro 
>>> 'dev_info'
>>>  1171 | dev_info(info->device, "%s %s, using %u KiB of video 
>>> memory\n",
>>>   | ^~~~
>>>
>>>
>>
>> Hmm, there is no response from Thomas yet. I guess we should go with
>> reverting bdb616479eff419, right? Regardless, I'm adding this build 
>> regression
>> to regzbot so that parties involved are aware of it:
>>
>> #regzbot ^introduced: bdb616479eff419
>> #regzbot title: build regression in PS3 framebuffer
> 
> Does regzbot track issues in linux-next?

It can, I made sure of that in case somebody wants to use this sooner or
later (and it wasn't much work), but I don't actively use this
functionality right now and do not plan to do so, there are more
important issues to spend time on.

> They're not really regressions because they're not in a release yet.
> 
> Anyway I don't see where bdb616479eff419 comes from.

That makes two of us :-D

> The issue was introduced by:
> 
>   701d2054fa31 fbdev: Make support for userspace interfaces configurable

Ahh, that makes a lot more sense. While at it, let me tell regzbot:

#regzbot introduced: 701d2054fa31

Ciao, Thorsten
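
The `#regzbot` lines used in this thread all share the shape "#regzbot command[:] argument". A minimal sketch of splitting such a line into its two parts is shown below. It is an illustration only and makes no claim about how regzbot itself parses mail; `parse_regzbot_line()` and its buffer-based interface are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if `line` looks like
 * "#regzbot <command>[:] <argument>", copy the command and the argument
 * into the caller's buffers and return 1; otherwise return 0.
 * A leading '^' (as in "^introduced") is kept as part of the command.
 */
int parse_regzbot_line(const char *line,
                       char *cmd, size_t cmd_len,
                       char *arg, size_t arg_len)
{
    static const char prefix[] = "#regzbot ";
    const char *p, *end;
    size_t n;

    assert(line && cmd && arg && cmd_len > 0 && arg_len > 0);

    if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
        return 0;
    p = line + sizeof(prefix) - 1;

    /* The command runs up to the first ':' or space. */
    end = p + strcspn(p, ": ");
    n = (size_t)(end - p);
    if (n == 0 || n >= cmd_len)
        return 0;
    memcpy(cmd, p, n);
    cmd[n] = '\0';

    /* Skip the separator(s); the rest of the line is the argument. */
    p = end + strspn(end, ": ");
    n = strcspn(p, "\r\n");
    if (n >= arg_len)
        return 0;
    memcpy(arg, p, n);
    arg[n] = '\0';
    return 1;
}
```

Fed the line "#regzbot introduced: 701d2054fa31", the helper yields the command "introduced" and the argument "701d2054fa31"; a line such as "#regzbot poke" yields an empty argument.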


Re: Fwd: Memory corruption in multithreaded user space program while calling fork

2023-07-05 Thread Linux regression tracking (Thorsten Leemhuis)
On 05.07.23 09:08, Greg KH wrote:
> On Tue, Jul 04, 2023 at 01:22:54PM -0700, Suren Baghdasaryan wrote:
>> On Tue, Jul 4, 2023 at 9:18 AM Andrew Morton wrote:
>>> On Tue, 4 Jul 2023 09:00:19 +0100 Greg KH wrote:
>>>>>>>> Thanks! I'll investigate this later today. After discussing with
>>>>>>>> Andrew, we would like to disable CONFIG_PER_VMA_LOCK by default until
>>>>>>>> the issue is fixed. I'll post a patch shortly.
>>>>>>>
>>>>>>> Posted at:
>>>>>>> https://lore.kernel.org/all/20230703182150.2193578-1-sur...@google.com/
>>>>>>
>>>>>> As that change fixes something in 6.4, why not cc: stable on it as well?
>>>>>
>>>>> Sorry, I thought since per-VMA locks were introduced in 6.4 and this
>>>>> patch is fixing 6.4 I didn't need to send it to stable for older
>>>>> versions. Did I miss something?
>>>>
>>>> 6.4.y is a stable kernel tree right now, so yes, it needs to be included
>>>> there :)
>>>
>>> I'm in wait-a-few-days-mode on this.  To see if we have a backportable
>>> fix rather than disabling the feature in -stable.

Andrew, how long will you remain in "wait-a-few-days-mode"? Given what
Greg said below and that we already had three reports I know of, I'd
prefer if we could fix this sooner rather than later in mainline --
especially as Arch Linux and openSUSE Tumbleweed likely have switched to
6.4.y already or will do so soon.

>> Ok, I think we have a fix posted at [2] and it applies cleanly to
>> 6.4.y stable branch as well. However fork() performance might slightly
>> regress, therefore disabling per-VMA locks by default for now seems to
>> be preferable even with this fix (see discussion at
>> https://lore.kernel.org/all/54cd9ffb-8f4b-003f-c2d6-3b6b0d2cb...@google.com/).
>> IOW, both [1] and [2] should be applied to 6.4.y stable. Both apply
>> cleanly and I CC'ed stable on [2]. Greg, should I send [1] separately
>> to stable@vger?
> 
> We can't do anything for stable until it lands in Linus's tree, so if
> you didn't happen to have the stable@ tag in the patch, just email us
> the git SHA1 and I can pick it up that way.
> 
> thanks,
> 
> greg k-h

Ciao, Thorsten


Re: [PATCH v4 29/33] x86/mm: try VMA lock-based page fault handling first

2023-07-03 Thread Linux regression tracking (Thorsten Leemhuis)
On 29.06.23 16:40, Jiri Slaby wrote:
> On 27. 02. 23, 18:36, Suren Baghdasaryan wrote:
>> Attempt VMA lock-based page fault handling first, and fall back to the
>> existing mmap_lock-based handling if that fails.
>>
>> Signed-off-by: Suren Baghdasaryan 
>> ---
>>   arch/x86/Kconfig    |  1 +
>>   arch/x86/mm/fault.c | 36 
>>   2 files changed, 37 insertions(+)
>>
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index a825bf031f49..df21fba77db1 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -27,6 +27,7 @@ config X86_64
>>   # Options that are inherently 64-bit kernel only:
>>   select ARCH_HAS_GIGANTIC_PAGE
>>   select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
>> +    select ARCH_SUPPORTS_PER_VMA_LOCK
>>   select ARCH_USE_CMPXCHG_LOCKREF
>>   select HAVE_ARCH_SOFT_DIRTY
>>   select MODULES_USE_ELF_RELA
>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>> index a498ae1fbe66..e4399983c50c 100644
>> --- a/arch/x86/mm/fault.c
>> +++ b/arch/x86/mm/fault.c
>> @@ -19,6 +19,7 @@
>>   #include     /* faulthandler_disabled()    */
>>   #include     /*
>> efi_crash_gracefully_on_page_fault()*/
>>   #include 
>> +#include     /* find_and_lock_vma() */
>>     #include     /* boot_cpu_has, ...    */
>>   #include     /* dotraplinkage, ...    */
>> @@ -1333,6 +1334,38 @@ void do_user_addr_fault(struct pt_regs *regs,
>>   }
>>   #endif
>>   +#ifdef CONFIG_PER_VMA_LOCK
>> +    if (!(flags & FAULT_FLAG_USER))
>> +    goto lock_mmap;
>> +
>> +    vma = lock_vma_under_rcu(mm, address);
>> +    if (!vma)
>> +    goto lock_mmap;
>> +
>> +    if (unlikely(access_error(error_code, vma))) {
>> +    vma_end_read(vma);
>> +    goto lock_mmap;
>> +    }
>> +    fault = handle_mm_fault(vma, address, flags |
>> FAULT_FLAG_VMA_LOCK, regs);
>> +    vma_end_read(vma);
>> +
>> +    if (!(fault & VM_FAULT_RETRY)) {
>> +    count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>> +    goto done;
>> +    }
>> +    count_vm_vma_lock_event(VMA_LOCK_RETRY);
> 
> This is apparently not strong enough as it causes go build failures like:

TWIMC & for the record: there is another report about trouble caused by
this change; for details see

https://bugzilla.kernel.org/show_bug.cgi?id=217624

And a "forward to devs and lists" thread about that report:

https://lore.kernel.org/all/facbfec3-837a-51ed-85fa-31021c17d...@gmail.com/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

> [  409s] strconv
> [  409s] releasep: m=0x579e2000 m->p=0x5781c600 p->m=0x0 p->status=2
> [  409s] fatal error: releasep: invalid p state
> [  409s]
> 
> [  325s] hash/adler32
> [  325s] hash/crc32
> [  325s] cmd/internal/codesign
> [  336s] fatal error: runtime: out of memory
> 
> There are many kinds of similar errors. It happens in 1-3 out of 20
> builds only.
> 
> If I revert the commit on top of 6.4, they all disappear. Any idea?
> 
> The downstream report:
> https://bugzilla.suse.com/show_bug.cgi?id=1212775
> 
>> +
>> +    /* Quick path to respond to signals */
>> +    if (fault_signal_pending(fault, regs)) {
>> +    if (!user_mode(regs))
>> +    kernelmode_fixup_or_oops(regs, error_code, address,
>> + SIGBUS, BUS_ADRERR,
>> + ARCH_DEFAULT_PKEY);
>> +    return;
>> +    }
>> +lock_mmap:
>> +#endif /* CONFIG_PER_VMA_LOCK */
>> +
>>   /*
>>    * Kernel-mode access to the user address space should only occur
>>    * on well-defined single instructions listed in the exception
>> @@ -1433,6 +1466,9 @@ void do_user_addr_fault(struct pt_regs *regs,
>>   }
>>     mmap_read_unlock(mm);
>> +#ifdef CONFIG_PER_VMA_LOCK
>> +done:
>> +#endif
>>   if (likely(!(fault & VM_FAULT_ERROR)))
>>   return;
>>   
> 
> thanks,
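
The control flow that the quoted hunk adds to do_user_addr_fault() can be summarised as a small decision function. The sketch below is a user-space model under stated assumptions: `enum fault_path`, `struct fault_inputs` and `model_fault_path()` are hypothetical names, and each boolean stands in for the kernel call noted in its comment. It shows only which exit the patch takes, not what the fault handlers do.

```c
#include <assert.h>
#include <stdbool.h>

enum fault_path {
    PATH_VMA_LOCK_DONE,  /* handled under the per-VMA lock ("goto done") */
    PATH_SIGNAL_RETURN,  /* retry requested, but a signal is pending */
    PATH_MMAP_LOCK       /* fall back to mmap_lock ("goto lock_mmap") */
};

struct fault_inputs {
    bool user_fault;     /* models flags & FAULT_FLAG_USER */
    bool vma_locked;     /* models lock_vma_under_rcu() returning a VMA */
    bool access_error;   /* models access_error(error_code, vma) */
    bool retry;          /* models fault & VM_FAULT_RETRY */
    bool signal_pending; /* models fault_signal_pending(fault, regs) */
};

enum fault_path model_fault_path(const struct fault_inputs *in)
{
    assert(in);

    /* Only user faults try the per-VMA lock first. */
    if (!in->user_fault)
        return PATH_MMAP_LOCK;
    /* No VMA could be locked under RCU: use the old path. */
    if (!in->vma_locked)
        return PATH_MMAP_LOCK;
    /* Access errors are left to the mmap_lock-based path. */
    if (in->access_error)
        return PATH_MMAP_LOCK;
    /* Fault handled without a retry request: done. */
    if (!in->retry)
        return PATH_VMA_LOCK_DONE;
    /* Retry requested: respond to signals first, else fall back. */
    if (in->signal_pending)
        return PATH_SIGNAL_RETURN;
    return PATH_MMAP_LOCK;
}
```

In this model every failure of the fast path ends up on the pre-existing mmap_lock route, which is why the change is described as "try first, fall back otherwise".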


Re: Fwd: Memory corruption in multithreaded user space program while calling fork

2023-07-03 Thread Linux regression tracking (Thorsten Leemhuis)
On 02.07.23 14:27, Bagas Sanjaya wrote:
> I notice a regression report on Bugzilla [1]. Quoting from it:
> 
>> After upgrading to kernel version 6.4.0 from 6.3.9, I noticed frequent but 
>> random crashes in a user space program.  After a lot of reduction, I have 
>> come up with the following reproducer program:
> [...]
>> After tuning the various parameters for my computer, exit code 2, which 
>> indicates that memory corruption was detected, occurs approximately 99% of 
>> the time.  Exit code 1, which occurs approximately 1% of the time, means it 
>> ran out of statically-allocated memory before reproducing the issue, and 
>> increasing the memory usage any more only leads to diminishing returns.  
>> There is also something like a 0.1% chance that it segfaults due to memory 
>> corruption elsewhere than in the statically-allocated buffer.
>>
>> With this reproducer in hand, I was able to perform the following bisection:
> [...]
>
> See Bugzilla for the full thread.

Additional details from
https://bugzilla.kernel.org/show_bug.cgi?id=217624#c5 :

```
I can confirm that v6.4 with 0bff0aaea03e2a3ed6bfa302155cca8a432a1829
reverted no longer causes any memory corruption with either my
reproducer or the original program.
```

FWIW: 0bff0aaea03 ("x86/mm: try VMA lock-based page fault handling
first") [merged for v6.4-rc1, authored by Suren Baghdasaryan [already CCed]]

That's the same commit that causes build problems with Go:

https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf...@kernel.org/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot introduced: 0bff0aaea03e2a3


Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)

2023-06-22 Thread Linux regression tracking (Thorsten Leemhuis)
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

As Linus will likely release 6.4 on this or the following Sunday a quick
question: is there any hope this regression might be fixed any time
soon? Doesn't look like it, as it seems nothing happened for a few days,
but maybe I missed something.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 15.06.23 06:57, Sachin Sant wrote:
> 
>>> [ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 6000 6000 
>>> 6000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728  2c2c 
>>> 41820020 7d8903a6 
>>
>>  2c:   28 07 23 e9 ld  r9,1832(r3)
>>  30:   50 00 89 e9 ld  r12,80(r9)
>>
>> Where r3 is *chip.
>> r9 is NULL, and 80 = 0x50.
>>
>> Looks like a NULL chip->ops, which oopses in:
>>
>> static int tpm_request_locality(struct tpm_chip *chip)
>> {
>> int rc;
>>
>> if (!chip->ops->request_locality)
>>
>>
>> Can you test the patch below?
>>
> 
> It proceeds further but then runs into the following crash
> 
> [  103.269574] Kernel attempted to read user page (18) - exploit attempt? (uid: 0)
> [  103.269589] BUG: Kernel NULL pointer dereference on read at 0x0018
> [  103.269595] Faulting instruction address: 0xc09dcf34
> [  103.269599] Oops: Kernel access of bad area, sig: 11 [#1]
> [  103.269602] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> [  103.269606] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) 
> nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) 
> nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) 
> nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) 
> rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) 
> aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) 
> libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) 
> crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) 
> vmx_crypto(E) fuse(E)
> [  103.269644] CPU: 18 PID: 6872 Comm: kexec Kdump: loaded Tainted: G E  6.4.0-rc6-dirty #8
> [  103.269649] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf06 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
> [  103.269653] NIP:  c09dcf34 LR: c09dd2bc CTR: c09eaa60
> [  103.269656] REGS: c000a113f510 TRAP: 0300   Tainted: GE (6.4.0-rc6-dirty)
> [  103.269660] MSR:  8280b033   CR: 88484886  XER: 0001
> [  103.269669] CFAR: c09dd2b8 DAR: 0018 DSISR: 4000 IRQMASK: 0
> [  103.269669] GPR00: c09dd2bc c000a113f7b0 c14a1500 c0009031
> [  103.269669] GPR04: c0009f77 0016 06007a01 0016
> [  103.269669] GPR08: c0009f77   8000
> [  103.269669] GPR12: c09eaa60 c0135fab7f00
> [  103.269669] GPR16:
> [  103.269669] GPR20:
> [  103.269669] GPR24:  0016 c0009031 1000
> [  103.269669] GPR28: c0009f77 7a01 c0009f77 c0009031
> [  103.269707] NIP [c09dcf34] tpm_try_transmit+0x74/0x300
> [  103.269713] LR [c09dd2bc] tpm_transmit+0xfc/0x190
> [  103.269717] Call Trace:
> [  103.269718] [c000a113f7b0] [c000a113f880] 0xc000a113f880 (unreliable)
> [  103.269724] [c000a113f840] [c09dd2bc] tpm_transmit+0xfc/0x190
> [  103.269727] [c000a113f900] [c09dd398] tpm_transmit_cmd+0x48/0x110
> [  103.269731] [c000a113f980] [c09df1b0] tpm2_get_tpm_pt+0x140/0x230
> [  103.269736] [c000a113fa20] [c09db208] tpm_amd_is_rng_defective+0xb8/0x250
> [  103.269739] [c000a113faa0] [c09db828] tpm_chip_unregister+0x138/0x160
> [  103.269743] [c000a113fae0] [c09eaa94] tpm_ibmvtpm_remove+0x34/0x130
> [  103.269748] [c000a113fb50] [c0115738] vio_bus_remove+0x58/0xd0
> [  103.269754] [c000a113fb90] [c0a01dcc] device_shutdown+0x21c/0x39c
> [  103.269758] [c000a113fc20] [c01a2684] kernel_restart_prepare+0x54/0x70
> [  103.269762] [c000a113fc40] [c0292c48] kernel_kexec+0xa8/0x100
> [  103.269766] [c000a113fcb0] [c01a2cd4] __do_sys_reboot+0x214/0x2c0
> [  103.269770] [c000a113fe10] [c0034adc] system_call_exception+0x13c/0x340
> [  103.269776] [c000a113fe50] [c000d05c] 
> 
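
The first oops in the quoted exchange was diagnosed as a NULL chip->ops being dereferenced when checking chip->ops->request_locality. A guarded version of that check might look like the sketch below. The structures are heavily reduced, hypothetical stand-ins for the kernel's TPM types, and the -EPIPE return value is an assumption made for this example; the patch actually tested in the thread is not quoted here and may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct model_chip;

struct model_ops {
    int (*request_locality)(struct model_chip *chip);
};

struct model_chip {
    const struct model_ops *ops; /* NULL once the ops have been torn down */
};

/* Example callback for this sketch: pretends the request succeeded. */
int model_locality_ok(struct model_chip *chip)
{
    (void)chip;
    return 0;
}

/*
 * Guarded variant of the check quoted above: test chip->ops before
 * looking at chip->ops->request_locality.
 */
int model_request_locality(struct model_chip *chip)
{
    assert(chip);

    /* Assumption: report an error instead of dereferencing NULL. */
    if (!chip->ops)
        return -EPIPE;

    /* No callback provided: nothing to request. */
    if (!chip->ops->request_locality)
        return 0;

    return chip->ops->request_locality(chip);
}
```

A guard of this kind only avoids the first dereference. As the follow-up trace shows, later code on the same shutdown path can still trip over the missing ops, so the sketch illustrates the diagnosis rather than a complete fix.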

Re: [PASEMI NEMO] Boot issue with the PowerPC updates 6.4-1

2023-05-08 Thread Linux regression tracking (Thorsten Leemhuis)



On 08.05.23 14:58, Bagas Sanjaya wrote:
> On Mon, May 08, 2023 at 01:29:22PM +0200, Linux regression tracking #adding 
> (Thorsten Leemhuis) wrote:
>> [CCing the regression list, as it should be in the loop for regressions:
>> https://docs.kernel.org/admin-guide/reporting-regressions.html]
>>
>> [TLDR: I'm adding this report to the list of tracked Linux kernel
>> regressions; the text you find below is based on a few template
>> paragraphs you might have encountered already in similar form.
>> See link in footer if these mails annoy you.]
>>
>> On 02.05.23 04:22, Christian Zigotzky wrote:
>>> Hello,
>>>
>>> Our PASEMI Nemo board [1] doesn't boot with the PowerPC updates 6.4-1 [2].
>>>
>>> The kernel hangs right after the booting Linux via __start() @
>>> 0x ...
>>>
>>> I was able to revert the PowerPC updates 6.4-1 [2] with the following
>>> command: git revert 70cc1b5307e8ee3076fdf2ecbeb89eb973aa0ff7 -m 1
>>>
>>  ... 
>> Thanks for the report. To be sure the issue doesn't fall through the
>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>> tracking bot:
>>
>> #regzbot ^introduced e4ab08be5b4902e5
> 
> Why and how can you conclude that the culprit is e4ab08be5b4902
> ("powerpc/isa-bridge: Remove open coded "ranges" parsing") rather than
> powerpc PR merge commit 70cc1b5307e8ee ("Merge tag 'powerpc-6.4-1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux")?

I looked at the thread and noticed it was mentioned later (
https://lore.kernel.org/all/3fa42c8c-09bd-d0f0-401b-315b484f4...@xenosoft.de/
).

Ciao, Thorsten


Re: [PASEMI NEMO] Boot issue with the PowerPC updates 6.4-1

2023-05-08 Thread Linux regression tracking (Thorsten Leemhuis)
On 08.05.23 14:49, Michael Ellerman wrote:
> "Linux regression tracking #adding (Thorsten Leemhuis)"
>  writes:
>> [CCing the regression list, as it should be in the loop for regressions:
>> https://docs.kernel.org/admin-guide/reporting-regressions.html]
>>
>> [TLDR: I'm adding this report to the list of tracked Linux kernel
>> regressions; the text you find below is based on a few template
>> paragraphs you might have encountered already in similar form.
>> See link in footer if these mails annoy you.]
> 
> Patch is in testing.
> https://lore.kernel.org/linuxppc-dev/20230505171816.3175865-1-r...@kernel.org/

Ahh, great, thx for letting me know.

Thanks to a proper Link tag, regzbot would have noticed that fix once it
landed in next, but it's nevertheless good to know that the fix is
already under review. :-D

Fun fact: sometimes I wish we would not post fixes in new threads, as
that makes it hard to find the proposed fix for anybody that runs into
reported issues and also manages to find the report (e.g. this thread).
But whatever, that's just a detail.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot monitor:
https://lore.kernel.org/linuxppc-dev/20230505171816.3175865-1-r...@kernel.org/

>> On 02.05.23 04:22, Christian Zigotzky wrote:
>>> Hello,
>>>
>>> Our PASEMI Nemo board [1] doesn't boot with the PowerPC updates 6.4-1 [2].
>>>
>>> The kernel hangs right after the booting Linux via __start() @
>>> 0x ...
>>>
>>> I was able to revert the PowerPC updates 6.4-1 [2] with the following
>>> command: git revert 70cc1b5307e8ee3076fdf2ecbeb89eb973aa0ff7 -m 1
>>>
>>> After recompiling, the kernel boots without any problems without the
>>> PowerPC updates 6.4-1 [2].
>>>
>>> Could you please explain to me what you have done in the boot area?
>>>
>>> Please find attached the kernel config.
>>
>> Thanks for the report. To be sure the issue doesn't fall through the
>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>> tracking bot:
>>
>> #regzbot ^introduced e4ab08be5b4902e5
>> #regzbot title powerpc: boot issues on PASEMI Nemo board
>> #regzbot ignore-activity
>>
>> This isn't a regression? This issue or a fix for it are already
>> discussed somewhere else? It was fixed already? You want to clarify when
>> the regression started to happen? Or point out I got the title or
>> something else totally wrong? Then just reply and tell me -- ideally
>> while also telling regzbot about it, as explained by the page listed in
>> the footer of this mail.
>>
>> Developers: When fixing the issue, remember to add 'Link:' tags pointing
>> to the report (the parent of this mail). See page linked in footer for
>> details.
>>
>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>> --
>> Everything you wanna know about Linux kernel regression tracking:
>> https://linux-regtracking.leemhuis.info/about/#tldr
>> That page also explains what to do if mails like this annoy you.
> 


Re: Probing nvme disks fails on Upstream kernels on powerpc Maxconfig

2023-04-04 Thread Linux regression tracking (Thorsten Leemhuis)
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

On 23.03.23 10:53, Srikar Dronamraju wrote:
> 
> I am unable to boot upstream kernels from v5.16 to the latest upstream
> kernel on a maxconfig system. (Machine config details given below)
> 
> At boot, we see a series of messages like the below.
> 
> dracut-initqueue[13917]: Warning: dracut-initqueue: timeout, still waiting for following initqueue hooks:
> dracut-initqueue[13917]: Warning: /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fdisk\x2fby-uuid\x2f93dc0767-18aa-467f-afa7-5b4e9c13108a.sh: "if ! grep -q After=remote-fs-pre.target /run/systemd/generator/systemd-cryptsetup@*.service 2>/dev/null; then
> dracut-initqueue[13917]: [ -e "/dev/disk/by-uuid/93dc0767-18aa-467f-afa7-5b4e9c13108a" ]
> dracut-initqueue[13917]: fi"

Alexey, did you look into this? This is apparently caused by a commit of
yours (see quoted part below) that Michael applied. Looks like it fell
through the cracks from here, but maybe I'm missing something.

Anyway, for the rest of this mail:

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 387273118714
#regzbot title powerpc/pseries/dma: Probing nvme disks fails on powerpc Maxconfig
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.
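
As a rough illustration of the 'Link:' tag request above, here is a minimal
Python sketch that checks a commit message for a Link: line pointing to a
report. It is only a sketch under assumptions: the helper name report_links,
the restriction to lore.kernel.org URLs, and the whole example commit message
(subject, message-id, sign-off) are placeholders made up for illustration and
are not taken from regzbot or any kernel tooling.

```python
import re

# Matches a "Link:" line in a commit message. Restricting it to
# lore.kernel.org is an assumption made for this sketch, not a rule
# enforced by kernel tooling.
LINK_TAG = re.compile(r"^Link:\s*(https://lore\.kernel\.org/\S+)$", re.MULTILINE)


def report_links(commit_message: str) -> list[str]:
    """Return every lore.kernel.org URL found in a Link: tag."""
    return LINK_TAG.findall(commit_message)


# Hypothetical commit message; subject, message-id and sign-off are placeholders.
example_message = """\
powerpc: fix boot hang on example board

Explain the problem and the fix here.

Link: https://lore.kernel.org/r/placeholder-message-id@example.org/
Signed-off-by: A Developer <dev@example.org>
"""

links = report_links(example_message)
print(links if links else "no Link: tag pointing to a report found")
```

A check like this could run before a patch is sent, so a missing tag is
noticed while it is still cheap to add.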

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

> journalctl shows the below warning.
> 
>  WARNING: CPU: 242 PID: 1219 at /home/srikar/work/linux.git/arch/powerpc/kernel/iommu.c:227 iommu_range_alloc+0x3d4/0x450
>  Modules linked in: lpfc(E+) nvmet_fc(E) nvmet(E) configfs(E) qla2xxx(E+) nvme_fc(E) nvme_fabrics(E) vmx_crypto(E) gf128mul(E) xhci_pci(E) xhci_pci_renesas(E) xhci_hcd(E) ipr(E+) nvme(E) usbcore(E) libata(E) nvme_core(E) t10_pi(E) scsi_transport_fc(E) usb_common(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) scsi_common(E)
>  CPU: 242 PID: 1219 Comm: kworker/u3843:0 Tainted: GW   EL 5.15.0-sp4+ #33 91e1c36ffe385108bbe4a3834506a047dc78552d
>  Workqueue: nvme-reset-wq nvme_reset_work [nvme]
>  NIP:  c005a134 LR: c005a128 CTR: 
>  REGS: c7fd4c7eb580 TRAP: 0700   Tainted: GW   EL (5.15.0-sp4+)
>  MSR:  80029033   CR: 24002424  XER: 
>  CFAR: c020972c IRQMASK: 0
>  GPR00: c005a128 c7fd4c7eb820 c2aa4b00 0001
>  GPR04: c273d648 0003 0bfbcb21 c2d88390
>  GPR08:   00f2 c2b05240
>  GPR12: 2000 cbfbdfffcb00  c7fd4c9d1c40
>  GPR16:    
>  GPR20:   c2bab580 
>  GPR24: c73b30c8   
>  GPR28: c7fd7133  0001 0001
>  NIP [c005a134] iommu_range_alloc+0x3d4/0x450
>  LR [c005a128] iommu_range_alloc+0x3c8/0x450
>  Call Trace:
>  [c7fd4c7eb820] [c005a128] iommu_range_alloc+0x3c8/0x450 (unreliable)
>  [c7fd4c7eb8e0] [c005a580] iommu_alloc+0x60/0x170
>  [c7fd4c7eb930] [c005bd4c] iommu_alloc_coherent+0x11c/0x1d0
>  [c7fd4c7eb9d0] [c00597e8] dma_iommu_alloc_coherent+0x38/0x50
>  [c7fd4c7eb9f0] [c0249ce8] dma_alloc_attrs+0x128/0x180
>  [c7fd4c7eba60] [c0080001093210d8] nvme_alloc_queue+0x90/0x2b0 [nvme]
>  [c7fd4c7ebac0] [c008000109326034] nvme_reset_work+0x44c/0x1870 [nvme]
>  [c7fd4c7ebc30] [c01870b8] process_one_work+0x388/0x730
>  [c7fd4c7ebd10] [c01874d8] worker_thread+0x78/0x5b0
>  [c7fd4c7ebda0] [c01945cc] 

Re: 6.2-rc7 fails building on Talos II: memory.c:(.text+0x2e14): undefined reference to `hash__tlb_flush'

2023-02-16 Thread Linux regression tracking (Thorsten Leemhuis)
[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

On 16.02.23 00:55, Erhard F. wrote:
> Just noticed a build failure on 6.2-rc7 for my Talos 2 (.config attached):
> 
>  # make
>   CALL    scripts/checksyscalls.sh
>   UPD     include/generated/utsversion.h
>   CC      init/version-timestamp.o
>   LD      .tmp_vmlinux.kallsyms1
> ld: ld: DWARF error: could not find abbrev number 6
> mm/memory.o: in function `unmap_page_range':
> memory.c:(.text+0x2e14): undefined reference to `hash__tlb_flush'
> ld: memory.c:(.text+0x2f8c): undefined reference to `hash__tlb_flush'
> ld: ld: DWARF error: could not find abbrev number 3117
> mm/mmu_gather.o: in function `tlb_remove_table':
> mmu_gather.c:(.text+0x584): undefined reference to `hash__tlb_flush'
> ld: mmu_gather.c:(.text+0x6c4): undefined reference to `hash__tlb_flush'
> ld: mm/mmu_gather.o: in function `tlb_flush_mmu':
> mmu_gather.c:(.text+0x80c): undefined reference to `hash__tlb_flush'
> ld: mm/mmu_gather.o:mmu_gather.c:(.text+0xbe0): more undefined references to `hash__tlb_flush' follow
> make[1]: *** [scripts/Makefile.vmlinux:35: vmlinux] Error 1
> make: *** [Makefile:1264: vmlinux] Error 2
> 
> As 6.2-rc6 was good on this machine, I did a quick bisect, which revealed
> this commit:
> 
>  # git bisect bad
> 1665c027afb225882a5a0b014c45e84290b826c2 is the first bad commit
> [...]
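
For readers unfamiliar with bisecting: the idea behind the "quick bisect"
quoted above is a binary search between a known-good and a known-bad
version. The Python sketch below shows that idea on a simplified, strictly
linear history. It is not how git bisect is implemented (git works on the
real commit graph), and the names first_bad_commit, history and build_fails
are hypothetical, as are the placeholder commit names.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search a linear history ordered oldest to newest.

    Assumes commits[0] is good, commits[-1] is bad, and every commit after
    the first bad one is bad too.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        middle = (good + bad) // 2
        if is_bad(commits[middle]):
            bad = middle   # the culprit is at or before the middle
        else:
            good = middle  # the culprit is after the middle
    return commits[bad]


# Hypothetical history: placeholders, not real kernel commits.
history = [f"commit-{n}" for n in range(1, 9)]
broken_since = 6  # pretend the build breaks from commit-6 onwards


def build_fails(commit: str) -> bool:
    """Stand-in for 'check out this commit, build it, see if it fails'."""
    return int(commit.split("-")[1]) >= broken_since


print(first_bad_commit(history, build_fails))
```

Each step halves the remaining range, so only a handful of builds is needed
even when many commits separate the good and the bad version.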

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 1665c027afb225
#regzbot title powerpc: 6.2-rc7 fails building on Talos II
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
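
The '#regzbot' lines in the mails above follow a simple shape: the marker, a
command, and an optional argument. The Python sketch below shows how such
lines could be pulled out of a mail body. It is an illustration only and not
regzbot's actual parser; the helper extract_regzbot_commands and its
behaviour (ignoring quoted lines, stripping a trailing colon from the
command) are assumptions made for this example.

```python
import re

# Hypothetical pattern, not regzbot's real grammar: marker, command, optional argument.
REGZBOT_LINE = re.compile(r"^#regzbot\s+(\S+)\s*(.*)$")


def extract_regzbot_commands(body: str) -> list[tuple[str, str]]:
    """Collect (command, argument) pairs from unquoted '#regzbot' lines."""
    commands = []
    for line in body.splitlines():
        match = REGZBOT_LINE.match(line.strip())
        if match:
            # Some commands in the mails are written with a trailing colon
            # (e.g. "fix:"); drop it so both spellings compare equal.
            command = match.group(1).rstrip(":")
            commands.append((command, match.group(2).strip()))
    return commands


# Command block copied from the Talos II mail above.
mail_body = """\
#regzbot ^introduced 1665c027afb225
#regzbot title powerpc: 6.2-rc7 fails building on Talos II
#regzbot ignore-activity
"""

for command, argument in extract_regzbot_commands(mail_body):
    print(f"{command!r}: {argument!r}")
```

Quoted copies of a command (lines starting with '>') do not match here,
which keeps a reply from being counted as a new command in this sketch.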