Re: [PATCH 6/6] powerpc/pseries: Add firmware details to dump stack arch description

2022-09-29 Thread Michael Ellerman
Nathan Lynch  writes:
> Michael Ellerman  writes:
>
>> Add firmware version details to the dump stack arch description, which
>> is printed in case of an oops.
>>
>> Currently /hypervisor only exists on KVM, so if we don't find that
>> look for something that suggests we're on phyp and if so that's
>> probably a good guess. The actual content of the ibm,fw-net-version
>> seems to be a full path so is too long to add to the description.
>
> My only reservation is that ibm,fw-net-version seems to be unspecified
> and could disappear in future firmware versions.

Yeah good point.

> /ibm,powervm-partition would be the best PAPR-specified property for
> this purpose, but I don't see it on a P8/860 partition I checked,
> unfortunately. I do see it on a P9. Presumably it's present in device
> trees on PowerVM P9 systems and later, but it's probably too new to use
> for this.

I'll look for both, it's easy enough.

> /ibm,lpar-capable "indicates that the platform is capable of supporting
> logical partitioning and is only present on such systems." This one is
> present on the P8.

But conceivably qemu/KVM could provide that property, which would defeat
the purpose here which is to differentiate which actual hypervisor we're
under.

>> eg: Hardware name: ... of:'IBM,FW860.42 (SV860_138)' hv:phyp
>
> Will this info get printed during boot as well? There are many times it
> would have been useful to me when looking at logs from non-oopsed
> kernels.

No it's not. But you're right that would often be useful.

I think we can print it at the end of probe_machine().

I'll send a v2.

cheers


Re: [PATCH v2 1/6] powerpc: Add hardware description string

2022-09-29 Thread Michael Ellerman
Nathan Lynch  writes:
> Michael Ellerman  writes:
>> Create a hardware description string, which we will use to record
>> various details of the hardware platform we are running on.
>>
>> Print the accumulated description at boot, and use it to set the generic
>> description which is printed in oopses.
>>
>> To begin with add ppc_md.name, aka the "machine description".
>>
>> Example output at boot with the full series applied:
>>
>>   Linux version 6.0.0-rc2-gcc-11.1.0-00199-g893f9007a5ce-dirty 
>> (michael@alpine1-p1) (powerpc64-linux-gcc (GCC) 11.1.0, GNU ld (GNU 
>> Binutils) 2.36.1) #844 SMP Thu Sep 29 22:29:53 AEST 2022
>>   Hardware name: model:'IBM pSeries (emulated by qemu)' cpu:'POWER8 (raw)' 
>> pvr:0x4d0200 lpvr:0xf04 of:'SLOF,HEAD' machine:pSeries
>>   printk: bootconsole [udbg0] enabled
>>
>> Signed-off-by: Michael Ellerman 
>> ---
>>  arch/powerpc/include/asm/setup.h   |  2 ++
>>  arch/powerpc/kernel/setup-common.c | 19 ++-
>>  2 files changed, 20 insertions(+), 1 deletion(-)
>>
>> v2: Print the string at boot as suggested by Nathan.
>
> Thanks!
>
> I've booted the series on P8 and P9 LPARs:
>
> Hardware name: model:'IBM,8408-E8E' cpu:'POWER8E (raw)' pvr:0x4b0201 
> lpvr:0xf04 of:'IBM,FW860.50 (SV860_146)' hv:'phyp' machine:pSeries
>
> Hardware name: model:'IBM,9040-MR9' cpu:'POWER9 (raw)' pvr:0x4e2102 
> lpvr:0xf05 of:'IBM,FW950.01 (VM950_047)' hv:'phyp' machine:pSeries
>
> Not an objection but just an FYI: we're already very close to exceeding
> the arch description buffer's size on PowerVM. Both of the above are
> over 120 bytes.

Hmm yeah that's a good point.

I was tossing up whether the tags (model:, cpu: etc) are worth the space
they consume.

I erred on the side of keeping them because although I know what the raw
values mean, I figured other folks might not.

But given we are getting tight for space I might change my mind on that
and just use the values with no tags. It will make the value harder to
parse programmatically, but we will probably never do that anyway.
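
To put a rough number on what the tags cost, here is a small userspace C sketch (not kernel code). The struct, the function names and the 128-byte capacity are assumptions made for illustration; the thread only says that ~120 bytes is already close to the limit. Both formatters use snprintf(), which returns the length the full string would need, so the two results can be compared directly.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed capacity of the arch description buffer (not stated in the thread). */
#define DESC_BUF_SIZE 128

/* Hypothetical record holding the fields seen in the boot output. */
struct hw_fields {
	const char *model, *cpu, *of, *hv, *machine;
	unsigned int pvr, lpvr;
};

/* Tagged form, mirroring the layout of the "Hardware name:" line. */
static int format_tagged(char *buf, size_t len, const struct hw_fields *f)
{
	return snprintf(buf, len,
		"model:'%s' cpu:'%s' pvr:0x%x lpvr:0x%x of:'%s' hv:'%s' machine:%s",
		f->model, f->cpu, f->pvr, f->lpvr, f->of, f->hv, f->machine);
}

/* Untagged form: same values, separated by single spaces only. */
static int format_untagged(char *buf, size_t len, const struct hw_fields *f)
{
	return snprintf(buf, len, "%s %s 0x%x 0x%x %s %s %s",
		f->model, f->cpu, f->pvr, f->lpvr, f->of, f->hv, f->machine);
}
```

With these two format strings the tags and quotes account for a fixed 41 characters whatever the field values are, which is a sizeable share of a buffer of the assumed size.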

> It also occurs to me that we'll want to rebuild the arch description
> string after partition migration. Probably immediately after processing
> the device tree updates.

OK I hadn't thought of that.

It won't be entirely straightforward because the existing code wants to
build the value up incrementally, so we have as much info as possible at
any point during the boot.

But we can probably just have a pseries specific routine that
reconstructs the value in a similar format to the existing one after
migration.

cheers


Re: [PATCH v2 6/6] powerpc/pseries: Add firmware details to the hardware description

2022-09-29 Thread Michael Ellerman
Nathan Lynch  writes:
> Michael Ellerman  writes:
>> Add firmware version details to the hardware description, which is
>> printed at boot and in case of an oops.
>>
>> Use /hypervisor if we find it, though currently it only exists if we're
>> running under qemu.
>>
>> Look for "ibm,powervm-partition" which is specified in PAPR+ v2.11 and
>> tells us we're running under PowerVM.
>>
>> Failing that look for "ibm,fw-net-version" which is seen on PowerVM
>> going back to at least Power6.
>>
>> eg: Hardware name: ... of:'IBM,FW860.42 (SV860_138)' hv:'phyp'
>>
>> Signed-off-by: Michael Ellerman 
>> ---
>>  arch/powerpc/platforms/pseries/setup.c | 30 ++
>>  1 file changed, 30 insertions(+)
>>
>> v2: Look for "ibm,powervm-partition" as suggested by Nathan.
>> Use of_property_read_string().
>
> LGTM.
>
> I noticed that we don't get an "of:" report with qemu+vof, because there's no
> /openprom node.
>
> $ qemu-system-ppc64 -nographic -vga none -M pseries,x-vof=off -kernel vmlinux 
> | grep Hardware
> Hardware name: model:'IBM pSeries (emulated by qemu)' cpu:'POWER9 (raw)' 
> pvr:0x4e1200 lpvr:0xf05 of:'SLOF,HEAD' machine:pSeries
> $ qemu-system-ppc64 -nographic -vga none -M pseries,x-vof=on -kernel vmlinux 
> | grep Hardware
> Hardware name: model:'IBM pSeries (emulated by qemu)' cpu:'POWER9 (raw)' 
> pvr:0x4e1200 lpvr:0xf05 machine:pSeries
> $ qemu-system-ppc64 --version
> QEMU emulator version 7.0.0 (qemu-7.0.0-6.fc36)
>
> I didn't see anything in the vof device tree that would help though.

OK. We don't boot via prom_init when booting with vof, so in that sense
there is no OF.

I think the combo of seeing qemu but no "of" is sufficient to recognise
that case.
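
As an illustration of that heuristic, a userspace C sketch of the check might look like the following. The function name is hypothetical, and the match strings are taken from the example "Hardware name:" lines quoted above; nothing in the series parses the string this way.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Returns true when the description mentions qemu but carries no "of:" tag,
 * which is what the x-vof=on boot above looks like. The leading space in
 * " of:'" assumes the tag is never the first field of the string.
 */
static bool looks_like_qemu_vof(const char *desc)
{
	return strstr(desc, "qemu") != NULL && strstr(desc, " of:'") == NULL;
}
```

This only works while the tags are kept in the string; with untagged values the absence of an OF entry would be harder to detect.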

cheers


Re: [Bug report] BUG: Kernel NULL pointer dereference at 0x00000069, filemap_release_folio+0x88/0xb0

2022-09-29 Thread Michael Ellerman
Matthew Wilcox  writes:
> On Tue, Sep 27, 2022 at 09:17:20AM +0800, Zorro Lang wrote:
>> Hi mm and ppc list,
>> 
>> Recently I started to hit a kernel panic [2], rarely, on *ppc64le* with *1k
>> blocksize* ext4. It's not easy to reproduce, but it can still be triggered
>> by running generic/048 in a loop on ppc64le (I'm not sure whether every kind
>> of ppc64le machine can reproduce it).
>> 
>> I've already reported a bug to ext4 [1] (see [1] for more details), but so
>> far I have only hit it on ppc64le, and I'm not sure it's an ext4 bug. It
>> looks more like a folio-related issue, so I've cc'd the mm and ppc mailing
>> lists, hoping to get more review.
>
> Argh.  This is the wrong way to do it.  Please stop using bugzilla.
> Now there's discussion in two places and there's nowhere to see all
> of it.
>
>> [ 4681.230907] BUG: Kernel NULL pointer dereference at 0x0069 
>> [ 4681.230922] Faulting instruction address: 0xc068ee0c 
>> [ 4681.230929] Oops: Kernel access of bad area, sig: 11 [#1] 
>> [ 4681.230934] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries 
>> [ 4681.230991] CPU: 0 PID: 82 Comm: kswapd0 Kdump: loaded Not tainted 
>> 6.0.0-rc6+ #1 
>> [ 4681.230999] NIP:  c068ee0c LR: c068f2b8 CTR: 
>>  
>> [ 4681.238525] REGS: c6c0b560 TRAP: 0380   Not tainted  (6.0.0-rc6+) 
>> [ 4681.238532] MSR:  8280b033   CR: 
>> 24028242  XER:  
>> [ 4681.238556] CFAR: c068edf4 IRQMASK: 0  
>> [ 4681.238556] GPR00: c068f2b8 c6c0b800 c2cf1700 
>> c00c0042f1c0  
>> [ 4681.238556] GPR04: c6c0b860  0002 
>>   
>> [ 4681.238556] GPR08: c2d404b0  c00c0042f1c0 
>>   
>> [ 4681.238556] GPR12: c01cf080 c510 c0194298 
>> c001fff9c480  
>> [ 4681.238556] GPR16: c00048cdb850 0007  
>>   
>> [ 4681.238556] GPR20: 0001 c6c0b8f8 c146b9d8 
>> 5deadbeef100  
>> [ 4681.238556] GPR24: 5deadbeef122 c00048cdb800 c6c0bc00 
>> c6c0b8e8  
>> [ 4681.238556] GPR28: c6c0b860 c00c0042f1c0 0009 
>> 0009  
>> [ 4681.238634] NIP [c068ee0c] drop_buffers.constprop.0+0x4c/0x1c0 
>> [ 4681.238643] LR [c068f2b8] try_to_free_buffers+0x128/0x150 
>> [ 4681.238650] Call Trace: 
>> [ 4681.238654] [c6c0b800] [c6c0b880] 0xc6c0b880 
>> (unreliable) 
>> [ 4681.238663] [c6c0b840] [c6c0bc00] 0xc6c0bc00 
>> [ 4681.238670] [c6c0b890] [c0498708] 
>> filemap_release_folio+0x88/0xb0 
>> [ 4681.238679] [c6c0b8b0] [c04c51c0] 
>> shrink_active_list+0x490/0x750 
>> [ 4681.238688] [c6c0b9b0] [c04c9f88] 
>> shrink_lruvec+0x3f8/0x430 
>> [ 4681.238697] [c6c0baa0] [c04ca1f4] 
>> shrink_node_memcgs+0x234/0x290 
>> [ 4681.238704] [c6c0bb10] [c04ca3c4] shrink_node+0x174/0x6b0 
>> [ 4681.238711] [c6c0bbc0] [c04cacf0] 
>> balance_pgdat+0x3f0/0x970 
>> [ 4681.238718] [c6c0bd20] [c04cb440] kswapd+0x1d0/0x450 
>> [ 4681.238726] [c6c0bdc0] [c01943d8] kthread+0x148/0x150 
>> [ 4681.238735] [c6c0be10] [c000cbe4] 
>> ret_from_kernel_thread+0x5c/0x64 
>> [ 4681.238745] Instruction dump: 
>> [ 4681.238749] fbc1fff0 f821ffc1 7c7d1b78 7c9c2378 ebc30028 7fdff378 
>> 48000018 60000000  
>> [ 4681.238765] 60000000 ebff0008 7c3ef840 41820048 <815f0060> e93f0000 
>> 5529077c 7d295378  
>
> Running that through scripts/decodecode (with some minor hacks .. how
> do PPC people do this properly?)

We've just always used our own scripts. Mine is here: 
https://github.com/mpe/misc-scripts/blob/master/ppc/ppc-disasm

I've added an issue to our tracker for us to get scripts/decodecode
working on our oopses (eventually).

> I get:
>
>    0:   fb c1 ff f0 std r30,-16(r1)
>    4:   f8 21 ff c1 stdu r1,-64(r1)
>    8:   7c 7d 1b 78 mr  r29,r3
>    c:   7c 9c 23 78 mr  r28,r4
>   10:   eb c3 00 28 ld  r30,40(r3)
>   14:   7f df f3 78 mr  r31,r30
>   18:   48 00 00 18 b   0x30
>   1c:   60 00 00 00 nop
>   20:   60 00 00 00 nop
>   24:   eb ff 00 08 ld  r31,8(r31)
>   28:   7c 3e f8 40 cmpld   r30,r31
>   2c:   41 82 00 48 beq 0x74
>   30:*  81 5f 00 60 lwz r10,96(r31) <-- trapping instruction
>   34:   e9 3f 00 00 ld  r9,0(r31)
>   38:   55 29 07 7c rlwinm  r9,r9,0,29,30
>   3c:   7d 29 53 78 or  r9,r9,r10
>
> That would seem to track; 96 is 0x60 and r31 contains 0x00..09, giving
> us an effective address of 0x69.
>
> It would be nice to know what source line that corresponds to.  Could
> you use scripts/faddr2line to turn drop_buffers.constprop.0+0x4c/0x1c0
> into a line number?  I can't because it needs the vmlinux you generated.

You'll need: 

Re: [PATCH 2/7] mm: Free device private pages have zero refcount

2022-09-29 Thread Dan Williams
Alistair Popple wrote:
> 
> Dan Williams  writes:
> 
> > Alistair Popple wrote:
> >>
> >> Jason Gunthorpe  writes:
> >>
> >> > On Mon, Sep 26, 2022 at 04:03:06PM +1000, Alistair Popple wrote:
> >> >> Since 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page
> >> >> refcount") device private pages have no longer had an extra reference
> >> >> count when the page is in use. However before handing them back to the
> >> >> owning device driver we add an extra reference count such that free
> >> >> pages have a reference count of one.
> >> >>
> >> >> This makes it difficult to tell if a page is free or not because both
> >> >> free and in use pages will have a non-zero refcount. Instead we should
> >> >> return pages to the drivers page allocator with a zero reference count.
> >> >> Kernel code can then safely use kernel functions such as
> >> >> get_page_unless_zero().
> >> >>
> >> >> Signed-off-by: Alistair Popple 
> >> >> ---
> >> >>  arch/powerpc/kvm/book3s_hv_uvmem.c   | 1 +
> >> >>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 1 +
> >> >>  drivers/gpu/drm/nouveau/nouveau_dmem.c   | 1 +
> >> >>  lib/test_hmm.c   | 1 +
> >> >>  mm/memremap.c| 5 -
> >> >>  mm/page_alloc.c  | 6 ++
> >> >>  6 files changed, 10 insertions(+), 5 deletions(-)
> >> >
> >> > I think this is a great idea, but I'm surprised no dax stuff is
> >> > touched here?
> >>
> >> free_zone_device_page() shouldn't be called for pgmap->type ==
> >> MEMORY_DEVICE_FS_DAX so I don't think we should have to worry about DAX
> >> there. Except that the folio code looks like it might have introduced a
> >> bug. AFAICT put_page() always calls
> >> put_devmap_managed_page(&folio->page) but folio_put() does not (although
> >> folios_put() does!). So it seems folio_put() won't end up calling
> >> __put_devmap_managed_page_refs() as I think it should.
> >>
> >> I think you're right about the change to __init_zone_device_page() - I
> >> should limit it to DEVICE_PRIVATE/COHERENT pages only. But I need to
> >> look at Dan's patch series more closely as I suspect it might be better
> >> to rebase this patch on top of that.
> >
> > Apologies for the delay I was travelling the past few days. Yes, I think
> > this patch slots in nicely to avoid the introduction of an init_mode
> > [1]:
> >
> > https://lore.kernel.org/nvdimm/166329940343.2786261.6047770378829215962.st...@dwillia2-xfh.jf.intel.com/
> >
> > Mind if I steal it into my series?
> 
> No problem, although I notice Andrew has already merged it into
> mm-unstable. If you end up rebasing your series on top of mine I think
> all that's needed is a patch somewhere in your series to drop the
> various `if (pgmap->type == MEMORY_DEVICE_*)` I added to (hopefully)
> avoid breaking DAX. Assuming DAX takes a pagemap reference on struct
> page allocation something like below.

Yeah, I'll go that route and rebase on top of -mm.

Thanks again.


Re: [PATCH -next] powerpc/mpic_msgr: fix cast removes address space of expression warnings

2022-09-29 Thread Ruan Jinjie
Ping.

On 2022/9/1 16:54, ruanjinjie wrote:
> When building the Linux kernel, the following warnings are encountered:
> 
> ./arch/powerpc/sysdev/mpic_msgr.c:230:38: warning: cast removes address space 
> '__iomem' of expression
> ./arch/powerpc/sysdev/mpic_msgr.c:230:27: warning: incorrect type in 
> assignment (different address spaces)
> 
> msgr->mer and msgr->base are both of type 'u32 __iomem *', but the code
> casts them directly to 'u32 *' and 'u8 *', which causes the above warnings.
> Keep the __iomem annotation in the casts to fix these warnings.
> 
> Signed-off-by: ruanjinjie 
> ---
>  arch/powerpc/sysdev/mpic_msgr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/sysdev/mpic_msgr.c b/arch/powerpc/sysdev/mpic_msgr.c
> index 698fefaaa6dd..cbb0d24f15ba 100644
> --- a/arch/powerpc/sysdev/mpic_msgr.c
> +++ b/arch/powerpc/sysdev/mpic_msgr.c
> @@ -227,7 +227,7 @@ static int mpic_msgr_probe(struct platform_device *dev)
>  
>   reg_number = block_number * MPIC_MSGR_REGISTERS_PER_BLOCK + i;
>   msgr->base = msgr_block_addr + i * MPIC_MSGR_STRIDE;
> - msgr->mer = (u32 *)((u8 *)msgr->base + MPIC_MSGR_MER_OFFSET);
> + msgr->mer = (u32 __iomem *)((u8 __iomem *)msgr->base + 
> MPIC_MSGR_MER_OFFSET);
>   msgr->in_use = MSGR_FREE;
>   msgr->num = i;
>   raw_spin_lock_init(&msgr->lock);
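
For readers unfamiliar with the warning, here is a minimal userspace C sketch of the pattern the patch applies: do the byte-offset arithmetic through a pointer that keeps the __iomem annotation, then cast back to the annotated register type. The __iomem definition below is a simplified stand-in for the kernel's (under a normal compiler it expands to nothing, and the address-space number is an assumption), and EXAMPLE_MER_OFFSET and mer_from_base() are hypothetical names, not the driver's.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's sparse annotation. */
#ifdef __CHECKER__
#define __iomem __attribute__((noderef, address_space(2)))
#else
#define __iomem
#endif

/* Hypothetical byte offset of the MER register from the message register base. */
#define EXAMPLE_MER_OFFSET 0x100

/*
 * Byte-granular pointer arithmetic that never drops the address space:
 * both the intermediate u8 pointer and the result stay __iomem.
 */
static uint32_t __iomem *mer_from_base(uint32_t __iomem *base)
{
	return (uint32_t __iomem *)((uint8_t __iomem *)base + EXAMPLE_MER_OFFSET);
}
```

The generated code is the same as with the unannotated casts; the only difference is that sparse can keep tracking the pointer as MMIO.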


Re: [PATCH 2/7] mm: Free device private pages have zero refcount

2022-09-29 Thread Alistair Popple


Dan Williams  writes:

> Alistair Popple wrote:
>>
>> Jason Gunthorpe  writes:
>>
>> > On Mon, Sep 26, 2022 at 04:03:06PM +1000, Alistair Popple wrote:
>> >> Since 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page
>> >> refcount") device private pages have no longer had an extra reference
>> >> count when the page is in use. However before handing them back to the
>> >> owning device driver we add an extra reference count such that free
>> >> pages have a reference count of one.
>> >>
>> >> This makes it difficult to tell if a page is free or not because both
>> >> free and in use pages will have a non-zero refcount. Instead we should
>> >> return pages to the drivers page allocator with a zero reference count.
>> >> Kernel code can then safely use kernel functions such as
>> >> get_page_unless_zero().
>> >>
>> >> Signed-off-by: Alistair Popple 
>> >> ---
>> >>  arch/powerpc/kvm/book3s_hv_uvmem.c   | 1 +
>> >>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 1 +
>> >>  drivers/gpu/drm/nouveau/nouveau_dmem.c   | 1 +
>> >>  lib/test_hmm.c   | 1 +
>> >>  mm/memremap.c| 5 -
>> >>  mm/page_alloc.c  | 6 ++
>> >>  6 files changed, 10 insertions(+), 5 deletions(-)
>> >
>> > I think this is a great idea, but I'm surprised no dax stuff is
>> > touched here?
>>
>> free_zone_device_page() shouldn't be called for pgmap->type ==
>> MEMORY_DEVICE_FS_DAX so I don't think we should have to worry about DAX
>> there. Except that the folio code looks like it might have introduced a
>> bug. AFAICT put_page() always calls
>> put_devmap_managed_page(&folio->page) but folio_put() does not (although
>> folios_put() does!). So it seems folio_put() won't end up calling
>> __put_devmap_managed_page_refs() as I think it should.
>>
>> I think you're right about the change to __init_zone_device_page() - I
>> should limit it to DEVICE_PRIVATE/COHERENT pages only. But I need to
>> look at Dan's patch series more closely as I suspect it might be better
>> to rebase this patch on top of that.
>
> Apologies for the delay I was travelling the past few days. Yes, I think
> this patch slots in nicely to avoid the introduction of an init_mode
> [1]:
>
> https://lore.kernel.org/nvdimm/166329940343.2786261.6047770378829215962.st...@dwillia2-xfh.jf.intel.com/
>
> Mind if I steal it into my series?

No problem, although I notice Andrew has already merged it into
mm-unstable. If you end up rebasing your series on top of mine I think
all that's needed is a patch somewhere in your series to drop the
various `if (pgmap->type == MEMORY_DEVICE_*)` I added to (hopefully)
avoid breaking DAX. Assuming DAX takes a pagemap reference on struct
page allocation something like below.

---

diff --git a/mm/memremap.c b/mm/memremap.c
index 421bec3a29ee..da1a0e0abb8b 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -507,15 +507,7 @@ void free_zone_device_page(struct page *page)
page->mapping = NULL;
page->pgmap->ops->page_free(page);

-   if (page->pgmap->type != MEMORY_DEVICE_PRIVATE &&
-   page->pgmap->type != MEMORY_DEVICE_COHERENT)
-   /*
-* Reset the page count to 1 to prepare for handing out the page
-* again.
-*/
-   set_page_count(page, 1);
-   else
-   put_dev_pagemap(page->pgmap);
+   put_dev_pagemap(page->pgmap);
 }

 void zone_device_page_init(struct page *page)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 014dbdf54d62..3e5ff06700ca 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6816,9 +6816,7 @@ static void __ref __init_zone_device_page(struct page 
*page, unsigned long pfn,
 * ZONE_DEVICE pages are released directly to the driver page allocator
 * which will set the page count to 1 when allocating the page.
 */
-   if (pgmap->type == MEMORY_DEVICE_PRIVATE ||
-   pgmap->type == MEMORY_DEVICE_COHERENT)
-   set_page_count(page, 0);
+   set_page_count(page, 0);
 }

 /*


Re: [PATCH] powerpc/pseries/vas: Pass hw_cpu_id to node associativity HCALL

2022-09-29 Thread Nathan Lynch
Haren Myneni  writes:
> Generally the hypervisor decides to allocate a window on different
> VAS instances. But if the user space wishes to allocate on the
> current VAS instance where the process is executing, the kernel has
> to pass associativity domain IDs to the allocate VAS window HCALL. To
> determine the associativity domain IDs for the current CPU, the kernel
> passes smp_processor_id() to the node associativity HCALL, which may
> return an H_P2 (-55) error during a DLPAR CPU event.
>
> This patch fixes this issue by passing hard_smp_processor_id() with
> VPHN_FLAG_VCPU flag (PAPR 14.11.6.1 H_HOME_NODE_ASSOCIATIVITY).
>
> Signed-off-by: Haren Myneni 
> ---
>  arch/powerpc/platforms/pseries/vas.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/pseries/vas.c 
> b/arch/powerpc/platforms/pseries/vas.c
> index fe33bdb620d5..533026fd1f40 100644
> --- a/arch/powerpc/platforms/pseries/vas.c
> +++ b/arch/powerpc/platforms/pseries/vas.c
> @@ -348,7 +348,7 @@ static struct vas_window *vas_allocate_window(int vas_id, 
> u64 flags,
>* So no unpacking needs to be done.
>*/
>   rc = plpar_hcall9(H_HOME_NODE_ASSOCIATIVITY, domain,
> -   VPHN_FLAG_VCPU, smp_processor_id());
> +   VPHN_FLAG_VCPU, hard_smp_processor_id());
>   if (rc != H_SUCCESS) {
>   pr_err("H_HOME_NODE_ASSOCIATIVITY error: %d\n", rc);
>   goto out;

Yes, it is always wrong to pass Linux CPU numbers to the hypervisor,
which has its own numbering for hardware threads. It usually coincides
with Linux's numbering in practice, which tends to hide bugs like this.
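
A toy userspace C sketch of why the two numberings can diverge: Linux's logical CPU numbers are dense indices, while the hardware thread ids the hypervisor expects need not be. The table contents and names here are invented for illustration and do not reflect any real system or the kernel's actual data structures.

```c
/* Invented example: four logical CPUs backed by non-contiguous hardware threads. */
#define TOY_NR_CPUS 4

static const unsigned int toy_hw_cpu_id[TOY_NR_CPUS] = { 0, 1, 8, 9 };

/*
 * Translate a logical CPU number (what smp_processor_id() would return)
 * into the id the hypervisor understands (what hard_smp_processor_id()
 * would return). Callers must pass logical < TOY_NR_CPUS.
 */
static unsigned int toy_hard_id(unsigned int logical)
{
	return toy_hw_cpu_id[logical];
}
```

In this toy layout logical CPUs 0 and 1 happen to match their hardware ids, so a call that passed the logical number would work there and only fail on CPUs 2 and 3, which is the kind of hidden bug described above.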

Reviewed-by: Nathan Lynch 


Re: [PATCH v2 6/6] powerpc/pseries: Add firmware details to the hardware description

2022-09-29 Thread Nathan Lynch
Michael Ellerman  writes:
> Add firmware version details to the hardware description, which is
> printed at boot and in case of an oops.
>
> Use /hypervisor if we find it, though currently it only exists if we're
> running under qemu.
>
> Look for "ibm,powervm-partition" which is specified in PAPR+ v2.11 and
> tells us we're running under PowerVM.
>
> Failing that look for "ibm,fw-net-version" which is seen on PowerVM
> going back to at least Power6.
>
> eg: Hardware name: ... of:'IBM,FW860.42 (SV860_138)' hv:'phyp'
>
> Signed-off-by: Michael Ellerman 
> ---
>  arch/powerpc/platforms/pseries/setup.c | 30 ++
>  1 file changed, 30 insertions(+)
>
> v2: Look for "ibm,powervm-partition" as suggested by Nathan.
> Use of_property_read_string().

LGTM.

I noticed that we don't get an "of:" report with qemu+vof, because there's no
/openprom node.

$ qemu-system-ppc64 -nographic -vga none -M pseries,x-vof=off -kernel vmlinux | 
grep Hardware
Hardware name: model:'IBM pSeries (emulated by qemu)' cpu:'POWER9 (raw)' 
pvr:0x4e1200 lpvr:0xf05 of:'SLOF,HEAD' machine:pSeries
$ qemu-system-ppc64 -nographic -vga none -M pseries,x-vof=on -kernel vmlinux | 
grep Hardware
Hardware name: model:'IBM pSeries (emulated by qemu)' cpu:'POWER9 (raw)' 
pvr:0x4e1200 lpvr:0xf05 machine:pSeries
$ qemu-system-ppc64 --version
QEMU emulator version 7.0.0 (qemu-7.0.0-6.fc36)

I didn't see anything in the vof device tree that would help though.


> diff --git a/arch/powerpc/platforms/pseries/setup.c 
> b/arch/powerpc/platforms/pseries/setup.c
> index 5e44c65a032c..83b047db35da 100644
> --- a/arch/powerpc/platforms/pseries/setup.c
> +++ b/arch/powerpc/platforms/pseries/setup.c
> @@ -41,6 +41,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -1011,6 +1012,33 @@ static void __init pSeries_cmo_feature_init(void)
>   pr_debug(" <- fw_cmo_feature_init()\n");
>  }
>  
> +static void __init pseries_add_hw_description(void)
> +{
> + struct device_node *dn;
> + const char *s;
> +
> + dn = of_find_node_by_path("/openprom");
> + if (dn) {
> + if (of_property_read_string(dn, "model", &s) == 0)
> + seq_buf_printf(&ppc_hw_desc, "of:'%s' ", s);
> +
> + of_node_put(dn);
> + }
> +
> + dn = of_find_node_by_path("/hypervisor");
> + if (dn) {
> + if (of_property_read_string(dn, "compatible", &s) == 0)
> + seq_buf_printf(&ppc_hw_desc, "hv:'%s' ", s);
> +
> + of_node_put(dn);
> + return;
> + }
> +
> + if (of_property_read_bool(of_root, "ibm,powervm-partition") ||
> + of_property_read_bool(of_root, "ibm,fw-net-version"))
> + seq_buf_printf(&ppc_hw_desc, "hv:'phyp' ");
> +}


Re: [PATCH v2 1/6] powerpc: Add hardware description string

2022-09-29 Thread Nathan Lynch
Michael Ellerman  writes:
> Create a hardware description string, which we will use to record
> various details of the hardware platform we are running on.
>
> Print the accumulated description at boot, and use it to set the generic
> description which is printed in oopses.
>
> To begin with add ppc_md.name, aka the "machine description".
>
> Example output at boot with the full series applied:
>
>   Linux version 6.0.0-rc2-gcc-11.1.0-00199-g893f9007a5ce-dirty 
> (michael@alpine1-p1) (powerpc64-linux-gcc (GCC) 11.1.0, GNU ld (GNU Binutils) 
> 2.36.1) #844 SMP Thu Sep 29 22:29:53 AEST 2022
>   Hardware name: model:'IBM pSeries (emulated by qemu)' cpu:'POWER8 (raw)' 
> pvr:0x4d0200 lpvr:0xf04 of:'SLOF,HEAD' machine:pSeries
>   printk: bootconsole [udbg0] enabled
>
> Signed-off-by: Michael Ellerman 
> ---
>  arch/powerpc/include/asm/setup.h   |  2 ++
>  arch/powerpc/kernel/setup-common.c | 19 ++-
>  2 files changed, 20 insertions(+), 1 deletion(-)
>
> v2: Print the string at boot as suggested by Nathan.

Thanks!

I've booted the series on P8 and P9 LPARs:

Hardware name: model:'IBM,8408-E8E' cpu:'POWER8E (raw)' pvr:0x4b0201 
lpvr:0xf04 of:'IBM,FW860.50 (SV860_146)' hv:'phyp' machine:pSeries

Hardware name: model:'IBM,9040-MR9' cpu:'POWER9 (raw)' pvr:0x4e2102 
lpvr:0xf05 of:'IBM,FW950.01 (VM950_047)' hv:'phyp' machine:pSeries

Not an objection but just an FYI: we're already very close to exceeding
the arch description buffer's size on PowerVM. Both of the above are
over 120 bytes.

It also occurs to me that we'll want to rebuild the arch description
string after partition migration. Probably immediately after processing
the device tree updates.

Regardless, LGTM.


Re: [Bug report] BUG: Kernel NULL pointer dereference at 0x00000069, filemap_release_folio+0x88/0xb0

2022-09-29 Thread Matthew Wilcox
On Tue, Sep 27, 2022 at 09:17:20AM +0800, Zorro Lang wrote:
> Hi mm and ppc list,
> 
> Recently I started to hit a kernel panic [2], rarely, on *ppc64le* with *1k
> blocksize* ext4. It's not easy to reproduce, but it can still be triggered
> by running generic/048 in a loop on ppc64le (I'm not sure whether every kind
> of ppc64le machine can reproduce it).
> 
> I've already reported a bug to ext4 [1] (see [1] for more details), but so
> far I have only hit it on ppc64le, and I'm not sure it's an ext4 bug. It
> looks more like a folio-related issue, so I've cc'd the mm and ppc mailing
> lists, hoping to get more review.

Argh.  This is the wrong way to do it.  Please stop using bugzilla.
Now there's discussion in two places and there's nowhere to see all
of it.

> [ 4681.230907] BUG: Kernel NULL pointer dereference at 0x0069 
> [ 4681.230922] Faulting instruction address: 0xc068ee0c 
> [ 4681.230929] Oops: Kernel access of bad area, sig: 11 [#1] 
> [ 4681.230934] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries 
> [ 4681.230991] CPU: 0 PID: 82 Comm: kswapd0 Kdump: loaded Not tainted 
> 6.0.0-rc6+ #1 
> [ 4681.230999] NIP:  c068ee0c LR: c068f2b8 CTR: 
>  
> [ 4681.238525] REGS: c6c0b560 TRAP: 0380   Not tainted  (6.0.0-rc6+) 
> [ 4681.238532] MSR:  8280b033   CR: 
> 24028242  XER:  
> [ 4681.238556] CFAR: c068edf4 IRQMASK: 0  
> [ 4681.238556] GPR00: c068f2b8 c6c0b800 c2cf1700 
> c00c0042f1c0  
> [ 4681.238556] GPR04: c6c0b860  0002 
>   
> [ 4681.238556] GPR08: c2d404b0  c00c0042f1c0 
>   
> [ 4681.238556] GPR12: c01cf080 c510 c0194298 
> c001fff9c480  
> [ 4681.238556] GPR16: c00048cdb850 0007  
>   
> [ 4681.238556] GPR20: 0001 c6c0b8f8 c146b9d8 
> 5deadbeef100  
> [ 4681.238556] GPR24: 5deadbeef122 c00048cdb800 c6c0bc00 
> c6c0b8e8  
> [ 4681.238556] GPR28: c6c0b860 c00c0042f1c0 0009 
> 0009  
> [ 4681.238634] NIP [c068ee0c] drop_buffers.constprop.0+0x4c/0x1c0 
> [ 4681.238643] LR [c068f2b8] try_to_free_buffers+0x128/0x150 
> [ 4681.238650] Call Trace: 
> [ 4681.238654] [c6c0b800] [c6c0b880] 0xc6c0b880 
> (unreliable) 
> [ 4681.238663] [c6c0b840] [c6c0bc00] 0xc6c0bc00 
> [ 4681.238670] [c6c0b890] [c0498708] 
> filemap_release_folio+0x88/0xb0 
> [ 4681.238679] [c6c0b8b0] [c04c51c0] 
> shrink_active_list+0x490/0x750 
> [ 4681.238688] [c6c0b9b0] [c04c9f88] 
> shrink_lruvec+0x3f8/0x430 
> [ 4681.238697] [c6c0baa0] [c04ca1f4] 
> shrink_node_memcgs+0x234/0x290 
> [ 4681.238704] [c6c0bb10] [c04ca3c4] shrink_node+0x174/0x6b0 
> [ 4681.238711] [c6c0bbc0] [c04cacf0] 
> balance_pgdat+0x3f0/0x970 
> [ 4681.238718] [c6c0bd20] [c04cb440] kswapd+0x1d0/0x450 
> [ 4681.238726] [c6c0bdc0] [c01943d8] kthread+0x148/0x150 
> [ 4681.238735] [c6c0be10] [c000cbe4] 
> ret_from_kernel_thread+0x5c/0x64 
> [ 4681.238745] Instruction dump: 
> [ 4681.238749] fbc1fff0 f821ffc1 7c7d1b78 7c9c2378 ebc30028 7fdff378 48000018 
> 60000000  
> [ 4681.238765] 60000000 ebff0008 7c3ef840 41820048 <815f0060> e93f0000 
> 5529077c 7d295378  

Running that through scripts/decodecode (with some minor hacks .. how
do PPC people do this properly?) I get:

   0:   fb c1 ff f0 std r30,-16(r1)
   4:   f8 21 ff c1 stdu r1,-64(r1)
   8:   7c 7d 1b 78 mr  r29,r3
   c:   7c 9c 23 78 mr  r28,r4
  10:   eb c3 00 28 ld  r30,40(r3)
  14:   7f df f3 78 mr  r31,r30
  18:   48 00 00 18 b   0x30
  1c:   60 00 00 00 nop
  20:   60 00 00 00 nop
  24:   eb ff 00 08 ld  r31,8(r31)
  28:   7c 3e f8 40 cmpld   r30,r31
  2c:   41 82 00 48 beq 0x74
  30:*  81 5f 00 60 lwz r10,96(r31) <-- trapping instruction
  34:   e9 3f 00 00 ld  r9,0(r31)
  38:   55 29 07 7c rlwinm  r9,r9,0,29,30
  3c:   7d 29 53 78 or  r9,r9,r10

That would seem to track; 96 is 0x60 and r31 contains 0x00..09, giving
us an effective address of 0x69.
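
The arithmetic can be checked mechanically. The following userspace C sketch pulls the base register number and displacement out of the trapping instruction word (815f0060) and computes the effective address the way a D-form load such as lwz does: base register value plus a sign-extended 16-bit displacement. The helper names are made up for this example.

```c
#include <stdint.h>

/* RA field of a D-form instruction: bits 16..20 counting from the LSB. */
static unsigned int dform_ra(uint32_t insn)
{
	return (insn >> 16) & 0x1f;
}

/* D field: low 16 bits, interpreted as a signed displacement. */
static int16_t dform_disp(uint32_t insn)
{
	return (int16_t)(insn & 0xffff);
}

/*
 * Effective address = base + sign-extended displacement.
 * 'base' is the value of the RA register; pass 0 when the RA field is 0,
 * since the hardware then uses a literal zero rather than r0.
 */
static uint64_t dform_effective_address(uint64_t base, int16_t disp)
{
	return base + (uint64_t)(int64_t)disp;
}
```

Feeding it the word 0x815f0060 and the r31 value of 9 from the register dump gives RA = 31, displacement = 96 and an effective address of 0x69, matching the faulting address in the oops.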

It would be nice to know what source line that corresponds to.  Could
you use scripts/faddr2line to turn drop_buffers.constprop.0+0x4c/0x1c0
into a line number?  I can't because it needs the vmlinux you generated.


Re: [PATCH 2/7] mm: Free device private pages have zero refcount

2022-09-29 Thread Dan Williams
Alistair Popple wrote:
> 
> Jason Gunthorpe  writes:
> 
> > On Mon, Sep 26, 2022 at 04:03:06PM +1000, Alistair Popple wrote:
> >> Since 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page
> >> refcount") device private pages have no longer had an extra reference
> >> count when the page is in use. However before handing them back to the
> >> owning device driver we add an extra reference count such that free
> >> pages have a reference count of one.
> >>
> >> This makes it difficult to tell if a page is free or not because both
> >> free and in use pages will have a non-zero refcount. Instead we should
> >> return pages to the drivers page allocator with a zero reference count.
> >> Kernel code can then safely use kernel functions such as
> >> get_page_unless_zero().
> >>
> >> Signed-off-by: Alistair Popple 
> >> ---
> >>  arch/powerpc/kvm/book3s_hv_uvmem.c   | 1 +
> >>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 1 +
> >>  drivers/gpu/drm/nouveau/nouveau_dmem.c   | 1 +
> >>  lib/test_hmm.c   | 1 +
> >>  mm/memremap.c| 5 -
> >>  mm/page_alloc.c  | 6 ++
> >>  6 files changed, 10 insertions(+), 5 deletions(-)
> >
> > I think this is a great idea, but I'm surprised no dax stuff is
> > touched here?
> 
> free_zone_device_page() shouldn't be called for pgmap->type ==
> MEMORY_DEVICE_FS_DAX so I don't think we should have to worry about DAX
> there. Except that the folio code looks like it might have introduced a
> bug. AFAICT put_page() always calls
> put_devmap_managed_page(&folio->page) but folio_put() does not (although
> folios_put() does!). So it seems folio_put() won't end up calling
> __put_devmap_managed_page_refs() as I think it should.
> 
> I think you're right about the change to __init_zone_device_page() - I
> should limit it to DEVICE_PRIVATE/COHERENT pages only. But I need to
> look at Dan's patch series more closely as I suspect it might be better
> to rebase this patch on top of that.

Apologies for the delay I was travelling the past few days. Yes, I think
this patch slots in nicely to avoid the introduction of an init_mode
[1]:

https://lore.kernel.org/nvdimm/166329940343.2786261.6047770378829215962.st...@dwillia2-xfh.jf.intel.com/

Mind if I steal it into my series?
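
For context on why a zero refcount for free pages matters to get_page_unless_zero(), here is a userspace C11 sketch of the "take a reference unless the count is zero" idiom. The struct and function names are toy stand-ins, and this is not how the kernel implements its page refcounting; it only shows the semantics the quoted commit message relies on.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a page with a reference count. */
struct toy_page {
	atomic_int refcount;
};

/*
 * Take a reference only if the page is currently in use (refcount > 0).
 * A free page, with refcount 0, is left untouched and false is returned.
 */
static bool toy_get_unless_zero(struct toy_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value and we retry. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}
```

If free pages sat at a refcount of one, as described for the pre-patch behaviour, this helper would succeed on them and the caller could not tell a free page from an in-use one.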


[PATCH v2 6/6] powerpc/pseries: Add firmware details to the hardware description

2022-09-29 Thread Michael Ellerman
Add firmware version details to the hardware description, which is
printed at boot and in case of an oops.

Use /hypervisor if we find it, though currently it only exists if we're
running under qemu.

Look for "ibm,powervm-partition" which is specified in PAPR+ v2.11 and
tells us we're running under PowerVM.

Failing that look for "ibm,fw-net-version" which is seen on PowerVM
going back to at least Power6.

eg: Hardware name: ... of:'IBM,FW860.42 (SV860_138)' hv:'phyp'

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/pseries/setup.c | 30 ++
 1 file changed, 30 insertions(+)

v2: Look for "ibm,powervm-partition" as suggested by Nathan.
Use of_property_read_string().

diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index 5e44c65a032c..83b047db35da 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1011,6 +1012,33 @@ static void __init pSeries_cmo_feature_init(void)
pr_debug(" <- fw_cmo_feature_init()\n");
 }
 
+static void __init pseries_add_hw_description(void)
+{
+   struct device_node *dn;
+   const char *s;
+
+   dn = of_find_node_by_path("/openprom");
+   if (dn) {
+   if (of_property_read_string(dn, "model", &s) == 0)
+   seq_buf_printf(&ppc_hw_desc, "of:'%s' ", s);
+
+   of_node_put(dn);
+   }
+
+   dn = of_find_node_by_path("/hypervisor");
+   if (dn) {
+   if (of_property_read_string(dn, "compatible", &s) == 0)
+   seq_buf_printf(&ppc_hw_desc, "hv:'%s' ", s);
+
+   of_node_put(dn);
+   return;
+   }
+
+   if (of_property_read_bool(of_root, "ibm,powervm-partition") ||
+   of_property_read_bool(of_root, "ibm,fw-net-version"))
+   seq_buf_printf(&ppc_hw_desc, "hv:'phyp' ");
+}
+
 /*
  * Early initialization.  Relocation is on but do not reference unbolted pages
  */
@@ -1018,6 +1046,8 @@ static void __init pseries_init(void)
 {
pr_debug(" -> pseries_init()\n");
 
+   pseries_add_hw_description();
+
 #ifdef CONFIG_HVC_CONSOLE
if (firmware_has_feature(FW_FEATURE_LPAR))
hvc_vio_init_early();
-- 
2.37.3
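
For readers skimming the thread, the detection order described in the 6/6
change log can be sketched in plain userspace C. This is only an
illustration, not the kernel code: struct dt_facts and describe_hypervisor()
are hypothetical names, and the flags stand in for the device-tree lookups
the patch does with of_find_node_by_path() and of_property_read_bool().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical summary of the device-tree facts the patch looks at. */
struct dt_facts {
	int has_hypervisor_node;           /* is there a /hypervisor node? */
	const char *hypervisor_compatible; /* its "compatible" string, or NULL */
	int has_powervm_partition;         /* root has "ibm,powervm-partition" */
	int has_fw_net_version;            /* root has "ibm,fw-net-version" */
};

/*
 * Write the "hv:..." fragment into out and return the number of
 * characters produced (0 if nothing was identified).
 */
static int describe_hypervisor(const struct dt_facts *dt, char *out,
			       size_t out_size)
{
	assert(dt && out && out_size > 0);
	out[0] = '\0';

	/* A /hypervisor node wins; no PowerVM guess is made after it. */
	if (dt->has_hypervisor_node) {
		if (dt->hypervisor_compatible)
			return snprintf(out, out_size, "hv:'%s' ",
					dt->hypervisor_compatible);
		return 0;
	}

	/* Otherwise either property is taken as a sign of PowerVM. */
	if (dt->has_powervm_partition || dt->has_fw_net_version)
		return snprintf(out, out_size, "hv:'phyp' ");

	return 0;
}
```

Note the early return: as in the patch, a /hypervisor node suppresses the
phyp guess even if one of the IBM properties is also present.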



[PATCH v2 5/6] powerpc/powernv: Add opal details to the hardware description

2022-09-29 Thread Michael Ellerman
Add OPAL version details to the hardware description, which is printed
at boot and in case of an oops.

eg: Hardware name: ... opal:v6.2

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/powernv/setup.c | 22 ++
 1 file changed, 22 insertions(+)

v2: Use of_property_read_string()

diff --git a/arch/powerpc/platforms/powernv/setup.c 
b/arch/powerpc/platforms/powernv/setup.c
index dac545aa0308..61ab2d38ff4b 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -207,8 +208,29 @@ static void __init pnv_setup_arch(void)
pnv_rng_init();
 }
 
+static void __init pnv_add_hw_description(void)
+{
+   struct device_node *dn;
+   const char *s;
+
+   dn = of_find_node_by_path("/ibm,opal/firmware");
+   if (!dn)
+   return;
+
+   if (of_property_read_string(dn, "version", &s) == 0 ||
+   of_property_read_string(dn, "git-id", &s) == 0)
+   seq_buf_printf(&ppc_hw_desc, "opal:%s ", s);
+
+   if (of_property_read_string(dn, "mi-version", &s) == 0)
+   seq_buf_printf(&ppc_hw_desc, "mi:%s ", s);
+
+   of_node_put(dn);
+}
+
 static void __init pnv_init(void)
 {
+   pnv_add_hw_description();
+
/*
 * Initialize the LPC bus now so that legacy serial
 * ports can be found on it
-- 
2.37.3



[PATCH v2 4/6] powerpc: Add device-tree model to the hardware description

2022-09-29 Thread Michael Ellerman
Add the model of the machine we're on to the hardware description, which
is printed at boot and in case of an oops.

eg: Hardware name: model:'IBM,8247-22L'

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/prom.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 8c4cce6dc1e8..93315c6483de 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -715,6 +715,23 @@ static void __init tm_init(void)
 static void tm_init(void) { }
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
+static int __init
+early_init_dt_scan_model(unsigned long node, const char *uname,
+int depth, void *data)
+{
+   const char *prop;
+
+   if (depth != 0)
+   return 0;
+
+   prop = of_get_flat_dt_prop(node, "model", NULL);
+   if (prop)
+   seq_buf_printf(&ppc_hw_desc, "model:'%s' ", prop);
+
+   /* break now */
+   return 1;
+}
+
 #ifdef CONFIG_PPC64
 static void __init save_fscr_to_task(void)
 {
@@ -743,6 +760,8 @@ void __init early_init_devtree(void *params)
if (!early_init_dt_verify(params))
panic("BUG: Failed verifying flat device tree, bad version?");
 
+   of_scan_flat_dt(early_init_dt_scan_model, NULL);
+
 #ifdef CONFIG_PPC_RTAS
/* Some machines might need RTAS info for debugging, grab it now. */
of_scan_flat_dt(early_init_dt_scan_rtas, NULL);
-- 
2.37.3
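
The callback in 4/6 relies on the scan stopping as soon as a callback
returns non-zero. A minimal userspace sketch of that pattern follows; it is
an assumption-laden model, not the flat device tree code: struct fake_node,
scan_nodes() and struct model_scan are made-up names standing in for the
flattened tree, of_scan_flat_dt() and the callback's data pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one node of a flattened device tree. */
struct fake_node {
	const char *name;
	int depth;         /* 0 for the root node */
	const char *model; /* "model" property, or NULL if absent */
};

typedef int (*scan_fn)(const struct fake_node *node, void *data);

/* Visit nodes in order until a callback returns non-zero. */
static int scan_nodes(const struct fake_node *nodes, size_t count,
		      scan_fn fn, void *data)
{
	int rc = 0;

	assert(fn);
	for (size_t i = 0; i < count && !rc; i++)
		rc = fn(&nodes[i], data);
	return rc;
}

/* Result of the scan, plus a call counter to show the early stop. */
struct model_scan {
	const char *model;
	int calls;
};

/* Same shape as the patch's callback: only the root node is examined. */
static int scan_model(const struct fake_node *node, void *data)
{
	struct model_scan *scan = data;

	scan->calls++;
	if (node->depth != 0)
		return 0;

	if (node->model)
		scan->model = node->model;

	/* Non-zero return ends the scan. */
	return 1;
}
```

With the root node first in the list, the callback runs once and the rest of
the tree is never visited, which is what the "break now" comment in the
patch is about.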



[PATCH v2 3/6] powerpc/64: Add logical PVR to the hardware description

2022-09-29 Thread Michael Ellerman
If we detect a logical PVR add that to the hardware description, which
is printed at boot and in case of an oops.

eg: Hardware name: ... lpvr:0xf04

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/prom.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index b42e2dbeb021..8c4cce6dc1e8 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -390,8 +390,10 @@ static int __init early_init_dt_scan_cpus(unsigned long 
node,
 */
if (!dt_cpu_ftrs_in_use()) {
prop = of_get_flat_dt_prop(node, "cpu-version", NULL);
-   if (prop && (be32_to_cpup(prop) & 0xff00) == 0x0f00)
+   if (prop && (be32_to_cpup(prop) & 0xff00) == 0x0f00) {
identify_cpu(0, be32_to_cpup(prop));
+   seq_buf_printf(&ppc_hw_desc, "lpvr:0x%04x ", be32_to_cpup(prop));
+   }
 
check_cpu_feature_properties(node);
check_cpu_features(node, "ibm,pa-features", ibm_pa_features,
-- 
2.37.3



[PATCH v2 2/6] powerpc: Add PVR & CPU name to hardware description

2022-09-29 Thread Michael Ellerman
Add the PVR and CPU name to the hardware description, which is printed
at boot and in case of an oops.

eg: Hardware name: ... cpu:'POWER8E (raw)' pvr:0x4b0201

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/prom.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 2e7a04dab2f7..b42e2dbeb021 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -819,6 +820,10 @@ void __init early_init_devtree(void *params)
 
dt_cpu_ftrs_scan();
 
+   // We can now add the CPU name & PVR to the hardware description
+   seq_buf_printf(&ppc_hw_desc, "cpu:'%s' pvr:0x%04lx ", cur_cpu_spec->cpu_name,
+  mfspr(SPRN_PVR));
+
/* Retrieve CPU related informations from the flat tree
 * (altivec support, boot CPU ID, ...)
 */
-- 
2.37.3



[PATCH v2 1/6] powerpc: Add hardware description string

2022-09-29 Thread Michael Ellerman
Create a hardware description string, which we will use to record
various details of the hardware platform we are running on.

Print the accumulated description at boot, and use it to set the generic
description which is printed in oopses.

To begin with add ppc_md.name, aka the "machine description".

Example output at boot with the full series applied:

  Linux version 6.0.0-rc2-gcc-11.1.0-00199-g893f9007a5ce-dirty 
(michael@alpine1-p1) (powerpc64-linux-gcc (GCC) 11.1.0, GNU ld (GNU Binutils) 
2.36.1) #844 SMP Thu Sep 29 22:29:53 AEST 2022
  Hardware name: model:'IBM pSeries (emulated by qemu)' cpu:'POWER8 (raw)' 
pvr:0x4d0200 lpvr:0xf04 of:'SLOF,HEAD' machine:pSeries
  printk: bootconsole [udbg0] enabled

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/setup.h   |  2 ++
 arch/powerpc/kernel/setup-common.c | 19 ++-
 2 files changed, 20 insertions(+), 1 deletion(-)

v2: Print the string at boot as suggested by Nathan.
Add some comments, update change log.

diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index 85143849a586..e29e83f8a89c 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -88,6 +88,8 @@ unsigned long __init prom_init(unsigned long r3, unsigned 
long r4,
   unsigned long pp, unsigned long r6,
   unsigned long r7, unsigned long kbase);
 
+extern struct seq_buf ppc_hw_desc;
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_SETUP_H */
diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index dd98f43bd685..99f1c52a3ca4 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -25,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -588,6 +590,15 @@ static __init int add_pcspkr(void)
 device_initcall(add_pcspkr);
 #endif /* CONFIG_PCSPKR_PLATFORM */
 
+static char ppc_hw_desc_buf[128] __initdata;
+
+struct seq_buf ppc_hw_desc __initdata = {
+   .buffer = ppc_hw_desc_buf,
+   .size = sizeof(ppc_hw_desc_buf),
+   .len = 0,
+   .readpos = 0,
+};
+
 static __init void probe_machine(void)
 {
extern struct machdep_calls __machine_desc_start;
@@ -628,7 +639,13 @@ static __init void probe_machine(void)
for (;;);
}
 
-   printk(KERN_INFO "Using %s machine description\n", ppc_md.name);
+   // Append the machine name to other info we've gathered
+   seq_buf_printf(&ppc_hw_desc, "machine:%s", ppc_md.name);
+
+   // Set the generic hardware description shown in oopses
+   dump_stack_set_arch_desc(ppc_hw_desc.buffer);
+
+   pr_info("Hardware name: %s\n", ppc_hw_desc.buffer);
 }
 
 /* Match a class of boards, not a specific device configuration. */
-- 
2.37.3
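
The series works by having each stage of early boot append a fragment to one
fixed-size buffer. A rough userspace analogue of that accumulation is
sketched below. It is not the kernel's seq_buf: struct desc_buf and
desc_buf_printf() are hypothetical names, and dropping the partial fragment
on overflow is a choice made for this sketch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for a bounded description buffer. */
struct desc_buf {
	char *buffer;
	size_t size;    /* total bytes available, including the NUL */
	size_t len;     /* bytes used so far, excluding the NUL */
	int overflowed; /* set once an append did not fit */
};

/*
 * Append formatted text. Returns 0 on success, -1 if the fragment did
 * not fit. After an overflow every later append is rejected as well.
 */
static int desc_buf_printf(struct desc_buf *b, const char *fmt, ...)
{
	va_list ap;
	size_t room;
	int n;

	assert(b && b->buffer && b->size > 0);

	if (b->overflowed)
		return -1;

	room = b->size - b->len;
	va_start(ap, fmt);
	n = vsnprintf(b->buffer + b->len, room, fmt, ap);
	va_end(ap);

	if (n < 0 || (size_t)n >= room) {
		/* Drop the truncated fragment, keep what was there before. */
		b->buffer[b->len] = '\0';
		b->overflowed = 1;
		return -1;
	}

	b->len += (size_t)n;
	return 0;
}
```

Each caller ends its fragment with a space, as the patches do, so the final
"machine:" entry can be appended without a separator of its own.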



Re: [PATCH v2 1/6] powerpc: Add hardware description string

2022-09-29 Thread Michael Ellerman
Michael Ellerman  writes:
> Create a hardware description string, which we will use to record
> various details of the hardware platform we are running on.
>
> Print the accumulated description at boot, and use it to set the generic
> description which is printed in oopses.
>
> To begin with add ppc_md.name, aka the "machine description".
>
> Example output at boot with the full series applied:
>
>   Linux version 6.0.0-rc2-gcc-11.1.0-00199-g893f9007a5ce-dirty 
> (michael@alpine1-p1) (powerpc64-linux-gcc (GCC) 11.1.0, GNU ld (GNU Binutils) 
> 2.36.1) #844 SMP Thu Sep 29 22:29:53 AEST 2022
>   Hardware name: model:'IBM pSeries (emulated by qemu)' cpu:'POWER8 (raw)' 
> pvr:0x4d0200 lpvr:0xf04 of:'SLOF,HEAD' machine:pSeries
>   printk: bootconsole [udbg0] enabled
>
> Signed-off-by: Michael Ellerman 
> ---
>  arch/powerpc/include/asm/setup.h   |  2 ++
>  arch/powerpc/kernel/setup-common.c | 19 ++-
>  2 files changed, 20 insertions(+), 1 deletion(-)
>
> v2: Print the string at boot as suggested by Nathan.
> Add some comments, update change log.

Ugh, I managed to bork the patches when sending. Don't send patches at
midnight.

New version coming.

cheers


[PATCH v2 5/6] powerpc/powernv: Add opal details to the hardware description

2022-09-29 Thread Michael Ellerman
Add OPAL version details to the hardware description, which is printed
at boot and in case of an oops.

eg: Hardware name: ... opal:v6.2

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/powernv/setup.c | 22 ++
 1 file changed, 22 insertions(+)

v2: Use of_property_read_string()

diff --git a/arch/powerpc/platforms/powernv/setup.c 
b/arch/powerpc/platforms/powernv/setup.c
index dac545aa0308..61ab2d38ff4b 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -207,8 +208,29 @@ static void __init pnv_setup_arch(void)
pnv_rng_init();
 }
 
+static void __init pnv_add_hw_description(void)
+{
+   struct device_node *dn;
+   const char *s;
+
+   dn = of_find_node_by_path("/ibm,opal/firmware");
+   if (!dn)
+   return;
+
+   if (of_property_read_string(dn, "version", &s) == 0 ||
+   of_property_read_string(dn, "git-id", &s) == 0)
+   seq_buf_printf(&ppc_hw_desc, "opal:%s ", s);
+
+   if (of_property_read_string(dn, "mi-version", &s) == 0)
+   seq_buf_printf(&ppc_hw_desc, "mi:%s ", s);
+
+   of_node_put(dn);
+}
+
 static void __init pnv_init(void)
 {
+   pnv_add_hw_description();
+
/*
 * Initialize the LPC bus now so that legacy serial
 * ports can be found on it
-- 
2.37.3



[PATCH v2 1/6] powerpc: Add hardware description string

2022-09-29 Thread Michael Ellerman
Create a hardware description string, which we will use to record
various details of the hardware platform we are running on.

Print the accumulated description at boot, and use it to set the generic
description which is printed in oopses.

To begin with add ppc_md.name, aka the "machine description".

Example output at boot with the full series applied:

  Linux version 6.0.0-rc2-gcc-11.1.0-00199-g893f9007a5ce-dirty 
(michael@alpine1-p1) (powerpc64-linux-gcc (GCC) 11.1.0, GNU ld (GNU Binutils) 
2.36.1) #844 SMP Thu Sep 29 22:29:53 AEST 2022
  Hardware name: model:'IBM pSeries (emulated by qemu)' cpu:'POWER8 (raw)' 
pvr:0x4d0200 lpvr:0xf04 of:'SLOF,HEAD' machine:pSeries
  printk: bootconsole [udbg0] enabled

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/setup.h   |  2 ++
 arch/powerpc/kernel/setup-common.c | 19 ++-
 2 files changed, 20 insertions(+), 1 deletion(-)

v2: Print the string at boot as suggested by Nathan.
Add some comments, update change log.

diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index dd98f43bd685..99f1c52a3ca4 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -588,6 +590,15 @@ static __init int add_pcspkr(void)
 device_initcall(add_pcspkr);
 #endif /* CONFIG_PCSPKR_PLATFORM */
 
+static char ppc_hw_desc_buf[128] __initdata;
+
+struct seq_buf ppc_hw_desc __initdata = {
+   .buffer = ppc_hw_desc_buf,
+   .size = sizeof(ppc_hw_desc_buf),
+   .len = 0,
+   .readpos = 0,
+};
+
 static __init void probe_machine(void)
 {
extern struct machdep_calls __machine_desc_start;
@@ -628,7 +639,13 @@ static __init void probe_machine(void)
for (;;);
}
 
-   printk(KERN_INFO "Using %s machine description\n", ppc_md.name);
+   // Append the machine name to other info we've gathered
+   seq_buf_printf(&ppc_hw_desc, "machine:%s", ppc_md.name);
+
+   // Set the generic hardware description shown in oopses
+   dump_stack_set_arch_desc(ppc_hw_desc.buffer);
+
+   pr_info("Hardware name: %s\n", ppc_hw_desc.buffer);
 }
 
 /* Match a class of boards, not a specific device configuration. */
-- 
2.37.3



[PATCH v2 2/6] powerpc: Add PVR & CPU name to hardware description

2022-09-29 Thread Michael Ellerman
Add the PVR and CPU name to the hardware description, which is printed
at boot and in case of an oops.

eg: Hardware name: ... cpu:'POWER8E (raw)' pvr:0x4b0201

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/prom.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 2e7a04dab2f7..b42e2dbeb021 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -819,6 +820,10 @@ void __init early_init_devtree(void *params)
 
dt_cpu_ftrs_scan();
 
+   // We can now add the CPU name & PVR to the hardware description
+   seq_buf_printf(&ppc_hw_desc, "cpu:'%s' pvr:0x%04lx ", cur_cpu_spec->cpu_name,
+  mfspr(SPRN_PVR));
+
/* Retrieve CPU related informations from the flat tree
 * (altivec support, boot CPU ID, ...)
 */
-- 
2.37.3



[PATCH v2 6/6] powerpc/pseries: Add firmware details to the hardware description

2022-09-29 Thread Michael Ellerman
Add firmware version details to the hardware description, which is
printed at boot and in case of an oops.

Use /hypervisor if we find it, though currently it only exists if we're
running under qemu.

Look for "ibm,powervm-partition" which is specified in PAPR+ v2.11 and
tells us we're running under PowerVM.

Failing that look for "ibm,fw-net-version" which is seen on PowerVM
going back to at least Power6.

eg: Hardware name: ... of:'IBM,FW860.42 (SV860_138)' hv:'phyp'

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/pseries/setup.c | 30 ++
 1 file changed, 30 insertions(+)

v2: Look for "ibm,powervm-partition" as suggested by Nathan.
Use of_property_read_string().

diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index 5e44c65a032c..83b047db35da 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -1018,6 +1046,8 @@ static void __init pseries_init(void)
 {
pr_debug(" -> pseries_init()\n");
 
+   pseries_add_hw_description();
+
 #ifdef CONFIG_HVC_CONSOLE
if (firmware_has_feature(FW_FEATURE_LPAR))
hvc_vio_init_early();
-- 
2.37.3



[PATCH v2 3/6] powerpc/64: Add logical PVR to the hardware description

2022-09-29 Thread Michael Ellerman
If we detect a logical PVR add that to the hardware description, which
is printed at boot and in case of an oops.

eg: Hardware name: ... lpvr:0xf04

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/prom.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index b42e2dbeb021..8c4cce6dc1e8 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -390,8 +390,10 @@ static int __init early_init_dt_scan_cpus(unsigned long 
node,
 */
if (!dt_cpu_ftrs_in_use()) {
prop = of_get_flat_dt_prop(node, "cpu-version", NULL);
-   if (prop && (be32_to_cpup(prop) & 0xff00) == 0x0f00)
+   if (prop && (be32_to_cpup(prop) & 0xff00) == 0x0f00) {
identify_cpu(0, be32_to_cpup(prop));
+   seq_buf_printf(&ppc_hw_desc, "lpvr:0x%04x ", be32_to_cpup(prop));
+   }
 
check_cpu_feature_properties(node);
check_cpu_features(node, "ibm,pa-features", ibm_pa_features,
-- 
2.37.3



[PATCH v2 4/6] powerpc: Add device-tree model to the hardware description

2022-09-29 Thread Michael Ellerman
Add the model of the machine we're on to the hardware description, which
is printed at boot and in case of an oops.

eg: Hardware name: model:'IBM,8247-22L'

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/prom.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 8c4cce6dc1e8..93315c6483de 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -715,6 +715,23 @@ static void __init tm_init(void)
 static void tm_init(void) { }
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
+static int __init
+early_init_dt_scan_model(unsigned long node, const char *uname,
+int depth, void *data)
+{
+   const char *prop;
+
+   if (depth != 0)
+   return 0;
+
+   prop = of_get_flat_dt_prop(node, "model", NULL);
+   if (prop)
+   seq_buf_printf(&ppc_hw_desc, "model:'%s' ", prop);
+
+   /* break now */
+   return 1;
+}
+
 #ifdef CONFIG_PPC64
 static void __init save_fscr_to_task(void)
 {
@@ -743,6 +760,8 @@ void __init early_init_devtree(void *params)
if (!early_init_dt_verify(params))
panic("BUG: Failed verifying flat device tree, bad version?");
 
+   of_scan_flat_dt(early_init_dt_scan_model, NULL);
+
 #ifdef CONFIG_PPC_RTAS
/* Some machines might need RTAS info for debugging, grab it now. */
of_scan_flat_dt(early_init_dt_scan_rtas, NULL);
-- 
2.37.3



Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value

2022-09-29 Thread James Clark



On 29/09/2022 09:49, Athira Rajeev wrote:
> 
> 
>> On 28-Sep-2022, at 9:05 PM, James Clark  wrote:
>>
>>
>>
> 
> Hi James,
> 
> Thanks for looking at the patch and sharing review comments.
> 
>> On 13/09/2022 12:57, Athira Rajeev wrote:
>>> perf stat includes option to specify aggr_mode to display
>>> per-socket, per-core, per-die, per-node counter details.
>>> Also there is option -A ( AGGR_NONE, --no-aggr ), where the
>>> counter values are displayed for each cpu along with "CPU"
>>> value in one field of the output.
>>>
>>> Each of the aggregate mode uses the information fetched
>>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
>>
>> I thought that this wouldn't apply to the cpu field because cpu is
>> basically interchangeable as an index in cpumap, rather than anything
>> being read from the topology file.
> 
> The cpu value is filled in this function:
> 
> Function : aggr_cpu_id__cpu
> Code: util/cpumap.c
> 
>>
>>> physical_package_id. Utility functions in "cpumap.c" fetches
>>> this information and populates the socket id, core id, cpu etc.
>>> If the platform does not expose the topology information,
>>> these values will be set to -1. Example, in case of powerpc,
>>> details like physical_package_id is restricted to be exposed
>>> in pSeries platform. So id.socket, id.core, id.cpu all will
>>> be set as -1.
>>>
>>> In case of displaying socket or die value, there is no check
>>> done in the "aggr_printout" function to see if it points to
>>> valid socket id or die. But for displaying "cpu" value, there
>>> is a check for "if (id.core > -1)". In case of powerpc pSeries
>>> where detail like physical_package_id is restricted to be
>>> exposed, id.core will be set to -1. Hence the column or field
>>> itself for CPU won't be displayed in the output.
>>>
>>> Result for per-socket:
>>>
>>> <<>>
>>> perf stat -e branches --per-socket -a true
>>>
>>> Performance counter stats for 'system wide':
>>>
>>> S-1  32416,851  branches
>>> <<>>
>>>
>>> Here S has -1 in above result. But with -A option which also
>>> expects CPU in one column in the result, below is observed.
>>>
>>> <<>>
>>> /bin/perf stat -e instructions -A -a true
>>>
>>> Performance counter stats for 'system wide':
>>>
>>>47,146  instructions
>>>45,226  instructions
>>>43,354  instructions
>>>45,184  instructions
>>> <<>>
>>>
>>> If the cpu id value is pointing to -1 also, it makes sense
>>> to display the column in the output to replicate the behaviour
>>> or to be in precedence with other aggr options(like per-socket,
>>> per-core). Remove the check "id.core" so that CPU field gets
>>> displayed in the output.
>>
>> Why would you want to print -1 out? Seems like the if statement was a
>> good one to me, otherwise the output looks a bit broken to users. Are
>> the other aggregation modes even working if -1 is set for socket and
>> die? Maybe we need to not print -1 in those cases or exit earlier with a
>> failure.
>>
>> The -1 value has a specific internal meaning which is "to not
>> aggregate". It doesn't mean "not set".
> 
> Currently, this check is done only for printing the cpu value.
> For socket/die/core values, this check is not done. Pasting an
> example snippet from a powerpc system (specifically from the pseries
> platform, where the value is set to -1):
> 
> ./perf stat --per-core -a -C 1 true
> 
>  Performance counter stats for 'system wide':
> 
> S-1-D-1-C-1  1   1.06 msec cpu-clock  
>   #1.018 CPUs utilized  
> S-1-D-1-C-1  1  2  context-switches   
>   #1.879 K/sec  
> S-1-D-1-C-1  1  0  cpu-migrations 
>   #0.000 /sec   
> 
> Here, though the value is -1, we are displaying it. Whereas in the case of
> cpu, the first column will be empty since we do a check before printing.
> 
> Example:
> 
> ./perf stat --per-core -A -C 1 true
> 
>  Performance counter stats for 'CPU(s) 1':
> 
>   0.88 msec cpu-clock#1.022 CPUs 
> utilized  
>  2  context-switches  
>  
>  0  cpu-migrations
>  
> 
> 
> Not sure whether there are scripts out there which consume the current
> format; not displaying -1 may break them. That is why we tried the change
> to remove the check for cpu, similar to other modes like socket, die,
> core etc.

I wouldn't worry about that because there are json and CSV modes which
are machine readable, and -1 is already not always displayed. If
anything this change here is also likely to break parsing by adding -1
where it wasn't before.

> 
> Also perf code ie “aggr_cpu_id__empty” in util/cpumap.c initialises the
> values to -1 . I was checking to see where we are 

Re: [PATCH v2 39/44] cpuidle,clk: Remove trace_.*_rcuidle()

2022-09-29 Thread Stephen Boyd
Quoting Peter Zijlstra (2022-09-19 03:00:18)
> OMAP was the one and only user.
> 
> Signed-off-by: Peter Zijlstra (Intel) 
> ---

Acked-by: Stephen Boyd 


Re: [RFC PATCH RESEND 00/28] per-VMA locks proposal

2022-09-29 Thread Vlastimil Babka
On 9/28/22 04:28, Suren Baghdasaryan wrote:
> On Sun, Sep 11, 2022 at 2:35 AM Vlastimil Babka  wrote:
>>
>> On 9/2/22 01:26, Suren Baghdasaryan wrote:
>> >
>> >>
>> >> Two complaints so far:
>> >>  - I don't like the vma_mark_locked() name. To me it says that the caller
>> >>already took or is taking the lock and this function is just marking 
>> >> that
>> >>we're holding the lock, but it's really taking a different type of 
>> >> lock. But
>> >>this function can block, it really is taking a lock, so it should say 
>> >> that.
>> >>
>> >>This is AFAIK a new concept, not sure I'm going to have anything good 
>> >> either,
>> >>but perhaps vma_lock_multiple()?
>> >
>> > I'm open to name suggestions but vma_lock_multiple() is a bit
>> > confusing to me. Will wait for more suggestions.
>>
>> Well, it does act like a vma_write_lock(), no? So why not that name. The
>> checking function for it is even called vma_assert_write_locked().
>>
>> We just don't provide a single vma_write_unlock(), but a
>> vma_mark_unlocked_all(), that could be instead named e.g.
>> vma_write_unlock_all().
>> But it's called on a mm, so maybe e.g. mm_vma_write_unlock_all()?
> 
> Thank you for your suggestions, Vlastimil! vma_write_lock() sounds
> good to me. For vma_mark_unlocked_all() replacement, I would prefer
> vma_write_unlock_all() which keeps the vma_write_XXX naming pattern to

OK.

> indicate that these are operating on the same locks. If the fact that
> it accepts mm_struct as a parameter is an issue then maybe
> vma_write_unlock_mm() ?

Sounds good!

>>
>>



linux-next: manual merge of the powerpc tree with the kbuild tree

2022-09-29 Thread broonie
Hi all,

Today's linux-next merge of the powerpc tree got conflicts in:

  arch/powerpc/Makefile
  arch/powerpc/kernel/Makefile

between commits:

  4f62512adbe9a ("kbuild: use obj-y instead extra-y for objects placed at the head")
  0f17eda6118db ("kbuild: remove head-y syntax")

from the kbuild tree and commits:

  dfc3095cec27f ("powerpc: Remove CONFIG_FSL_BOOKE")
  688de017efaab ("powerpc: Change CONFIG_E500 to CONFIG_PPC_E500")
  3e7318584dfec ("powerpc: Remove CONFIG_PPC_FSL_BOOK3E")
  6556fd1a1e9fc ("powerpc: Cleanup idle for e500")

from the powerpc tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

diff --cc arch/powerpc/Makefile
index 89c27827a11fb,19470d29de163..0
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
diff --cc arch/powerpc/kernel/Makefile
index ad3decb9f20ba,1f121c1888051..0
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@@ -118,12 -116,12 +116,12 @@@ obj-$(CONFIG_PPC_E500)  += cpu_setup_e5
  obj-$(CONFIG_PPC_DOORBELL)+= dbell.o
  obj-$(CONFIG_JUMP_LABEL)  += jump_label.o
  
 -extra-$(CONFIG_PPC64) := head_64.o
 -extra-$(CONFIG_PPC_BOOK3S_32) := head_book3s_32.o
 -extra-$(CONFIG_40x)   := head_40x.o
 -extra-$(CONFIG_44x)   := head_44x.o
 -extra-$(CONFIG_PPC_85xx)  := head_85xx.o
 -extra-$(CONFIG_PPC_8xx)   := head_8xx.o
 +obj-$(CONFIG_PPC64)   += head_64.o
 +obj-$(CONFIG_PPC_BOOK3S_32)   += head_book3s_32.o
 +obj-$(CONFIG_40x) += head_40x.o
 +obj-$(CONFIG_44x) += head_44x.o
- obj-$(CONFIG_FSL_BOOKE)   += head_fsl_booke.o
++obj-$(CONFIG_PPC_85xx)+= head_85xx.o
 +obj-$(CONFIG_PPC_8xx) += head_8xx.o
  extra-y   += vmlinux.lds
  
  obj-$(CONFIG_RELOCATABLE) += reloc_$(BITS).o


[PATCH] powerpc: update config files

2022-09-29 Thread Lukas Bulwahn
Clean up config files by:
  - removing configs that were deleted in the past
  - removing configs not in tree and without recently pending patches
  - adding new configs that are replacements for old configs in the file

For some detailed information, see Link.

Link: https://lore.kernel.org/kernel-janitors/20220929090645.1389-1-lukas.bulw...@gmail.com/

Signed-off-by: Lukas Bulwahn 
---
 arch/powerpc/configs/83xx/mpc837x_rdb_defconfig | 1 -
 arch/powerpc/configs/85xx/ge_imp3a_defconfig| 1 -
 arch/powerpc/configs/85xx/ppa8548_defconfig | 2 --
 arch/powerpc/configs/cell_defconfig | 1 -
 arch/powerpc/configs/g5_defconfig   | 1 -
 arch/powerpc/configs/mpc512x_defconfig  | 1 -
 arch/powerpc/configs/mpc885_ads_defconfig   | 2 +-
 arch/powerpc/configs/pasemi_defconfig   | 1 -
 arch/powerpc/configs/pmac32_defconfig   | 1 -
 arch/powerpc/configs/powernv_defconfig  | 3 ---
 arch/powerpc/configs/ppc64_defconfig| 3 ---
 arch/powerpc/configs/ppc64e_defconfig   | 3 ---
 arch/powerpc/configs/ppc6xx_defconfig   | 7 ---
 arch/powerpc/configs/ps3_defconfig  | 1 -
 arch/powerpc/configs/pseries_defconfig  | 3 ---
 arch/powerpc/configs/skiroot_defconfig  | 2 --
 arch/powerpc/configs/storcenter_defconfig   | 1 -
 17 files changed, 1 insertion(+), 33 deletions(-)

diff --git a/arch/powerpc/configs/83xx/mpc837x_rdb_defconfig 
b/arch/powerpc/configs/83xx/mpc837x_rdb_defconfig
index cbcae2a927e9..4e3373381ab6 100644
--- a/arch/powerpc/configs/83xx/mpc837x_rdb_defconfig
+++ b/arch/powerpc/configs/83xx/mpc837x_rdb_defconfig
@@ -77,6 +77,5 @@ CONFIG_NFS_FS=y
 CONFIG_NFS_V4=y
 CONFIG_ROOT_NFS=y
 CONFIG_CRC_T10DIF=y
-# CONFIG_ENABLE_MUST_CHECK is not set
 CONFIG_CRYPTO_ECB=m
 CONFIG_CRYPTO_PCBC=m
diff --git a/arch/powerpc/configs/85xx/ge_imp3a_defconfig 
b/arch/powerpc/configs/85xx/ge_imp3a_defconfig
index e7672c186325..ea719898b581 100644
--- a/arch/powerpc/configs/85xx/ge_imp3a_defconfig
+++ b/arch/powerpc/configs/85xx/ge_imp3a_defconfig
@@ -74,7 +74,6 @@ CONFIG_MTD_PHYSMAP_OF=y
 CONFIG_MTD_RAW_NAND=y
 CONFIG_MTD_NAND_FSL_ELBC=y
 CONFIG_BLK_DEV_LOOP=m
-CONFIG_BLK_DEV_CRYPTOLOOP=m
 CONFIG_BLK_DEV_NBD=m
 CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_SIZE=131072
diff --git a/arch/powerpc/configs/85xx/ppa8548_defconfig 
b/arch/powerpc/configs/85xx/ppa8548_defconfig
index 190978a5b7d5..4bd5f993d26a 100644
--- a/arch/powerpc/configs/85xx/ppa8548_defconfig
+++ b/arch/powerpc/configs/85xx/ppa8548_defconfig
@@ -7,9 +7,7 @@ CONFIG_RAPIDIO=y
 CONFIG_FSL_RIO=y
 CONFIG_RAPIDIO_DMA_ENGINE=y
 CONFIG_RAPIDIO_ENUM_BASIC=y
-CONFIG_RAPIDIO_TSI57X=y
 CONFIG_RAPIDIO_CPS_XX=y
-CONFIG_RAPIDIO_TSI568=y
 CONFIG_RAPIDIO_CPS_GEN2=y
 CONFIG_ADVANCED_OPTIONS=y
 CONFIG_LOWMEM_SIZE_BOOL=y
diff --git a/arch/powerpc/configs/cell_defconfig 
b/arch/powerpc/configs/cell_defconfig
index 7fd9e596ea33..06391cc2af3a 100644
--- a/arch/powerpc/configs/cell_defconfig
+++ b/arch/powerpc/configs/cell_defconfig
@@ -195,7 +195,6 @@ CONFIG_NLS_ISO8859_9=m
 CONFIG_NLS_ISO8859_13=m
 CONFIG_NLS_ISO8859_14=m
 CONFIG_NLS_ISO8859_15=m
-# CONFIG_ENABLE_MUST_CHECK is not set
 CONFIG_MAGIC_SYSRQ=y
 CONFIG_DEBUG_KERNEL=y
 CONFIG_DEBUG_MUTEXES=y
diff --git a/arch/powerpc/configs/g5_defconfig 
b/arch/powerpc/configs/g5_defconfig
index 9d6212a8b195..71d9d112c0b6 100644
--- a/arch/powerpc/configs/g5_defconfig
+++ b/arch/powerpc/configs/g5_defconfig
@@ -119,7 +119,6 @@ CONFIG_INPUT_EVDEV=y
 # CONFIG_SERIO_I8042 is not set
 # CONFIG_SERIO_SERPORT is not set
 # CONFIG_HW_RANDOM is not set
-CONFIG_RAW_DRIVER=y
 CONFIG_I2C_CHARDEV=y
 CONFIG_AGP=m
 CONFIG_AGP_UNINORTH=m
diff --git a/arch/powerpc/configs/mpc512x_defconfig 
b/arch/powerpc/configs/mpc512x_defconfig
index e75d3f3060c9..10fe061c5e6d 100644
--- a/arch/powerpc/configs/mpc512x_defconfig
+++ b/arch/powerpc/configs/mpc512x_defconfig
@@ -114,5 +114,4 @@ CONFIG_NFS_FS=y
 CONFIG_ROOT_NFS=y
 CONFIG_NLS_CODEPAGE_437=y
 CONFIG_NLS_ISO8859_1=y
-# CONFIG_ENABLE_MUST_CHECK is not set
 # CONFIG_CRYPTO_HW is not set
diff --git a/arch/powerpc/configs/mpc885_ads_defconfig 
b/arch/powerpc/configs/mpc885_ads_defconfig
index 700115d85d6f..56b876e418e9 100644
--- a/arch/powerpc/configs/mpc885_ads_defconfig
+++ b/arch/powerpc/configs/mpc885_ads_defconfig
@@ -78,4 +78,4 @@ CONFIG_DEBUG_VM_PGTABLE=y
 CONFIG_DETECT_HUNG_TASK=y
 CONFIG_BDI_SWITCH=y
 CONFIG_PPC_EARLY_DEBUG=y
-CONFIG_PPC_PTDUMP=y
+CONFIG_GENERIC_PTDUMP=y
diff --git a/arch/powerpc/configs/pasemi_defconfig b/arch/powerpc/configs/pasemi_defconfig
index e00a703581c3..96aa5355911f 100644
--- a/arch/powerpc/configs/pasemi_defconfig
+++ b/arch/powerpc/configs/pasemi_defconfig
@@ -92,7 +92,6 @@ CONFIG_LEGACY_PTY_COUNT=4
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_HW_RANDOM=y
-CONFIG_RAW_DRIVER=y
 CONFIG_I2C_CHARDEV=y
 CONFIG_I2C_PASEMI=y
 CONFIG_SENSORS_LM85=y
diff --git a/arch/powerpc/configs/pmac32_defconfig b/arch/powerpc/configs/pmac32_defconfig
index 

Re: [PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value

2022-09-29 Thread Athira Rajeev



> On 28-Sep-2022, at 9:05 PM, James Clark  wrote:
> 
> 
> 

Hi James,

Thanks for looking at the patch and sharing review comments.

> On 13/09/2022 12:57, Athira Rajeev wrote:
>> perf stat includes option to specify aggr_mode to display
>> per-socket, per-core, per-die, per-node counter details.
>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
>> counter values are displayed for each cpu along with "CPU"
>> value in one field of the output.
>> 
>> Each of the aggregate mode uses the information fetched
>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
> 
> I thought that this wouldn't apply to the cpu field because cpu is
> basically interchangeable as an index in cpumap, rather than anything
> being read from the topology file.

The cpu value is filled in this function:

Function : aggr_cpu_id__cpu
Code: util/cpumap.c

> 
>> physical_package_id. Utility functions in "cpumap.c" fetches
>> this information and populates the socket id, core id, cpu etc.
>> If the platform does not expose the topology information,
>> these values will be set to -1. Example, in case of powerpc,
>> details like physical_package_id is restricted to be exposed
>> in pSeries platform. So id.socket, id.core, id.cpu all will
>> be set as -1.
>> 
>> In case of displaying socket or die value, there is no check
>> done in the "aggr_printout" function to see if it points to
>> valid socket id or die. But for displaying "cpu" value, there
>> is a check for "if (id.core > -1)". In case of powerpc pSeries
>> where detail like physical_package_id is restricted to be
>> exposed, id.core will be set to -1. Hence the column or field
>> itself for CPU won't be displayed in the output.
>> 
>> Result for per-socket:
>> 
>> <<>>
>> perf stat -e branches --per-socket -a true
>> 
>> Performance counter stats for 'system wide':
>> 
>> S-1  32416,851  branches
>> <<>>
>> 
>> Here S has -1 in above result. But with -A option which also
>> expects CPU in one column in the result, below is observed.
>> 
>> <<>>
>> /bin/perf stat -e instructions -A -a true
>> 
>> Performance counter stats for 'system wide':
>> 
>>47,146  instructions
>>45,226  instructions
>>43,354  instructions
>>45,184  instructions
>> <<>>
>> 
>> If the cpu id value is pointing to -1 also, it makes sense
>> to display the column in the output to replicate the behaviour
>> or to be in precedence with other aggr options(like per-socket,
>> per-core). Remove the check "id.core" so that CPU field gets
>> displayed in the output.
> 
> Why would you want to print -1 out? Seems like the if statement was a
> good one to me, otherwise the output looks a bit broken to users. Are
> the other aggregation modes even working if -1 is set for socket and
> die? Maybe we need to not print -1 in those cases or exit earlier with a
> failure.
> 
> The -1 value has a specific internal meaning which is "to not
> aggregate". It doesn't mean "not set".

Currently, this check is done only when printing the cpu value.
It is not done for the socket/die/core values. Here is an example
snippet from a powerpc system (specifically a pseries platform, where
the value is set to -1):

./perf stat --per-core -a -C 1 true

 Performance counter stats for 'system wide':

S-1-D-1-C-1  1   1.06 msec cpu-clock         #1.018 CPUs utilized
S-1-D-1-C-1  1      2  context-switches  #1.879 K/sec
S-1-D-1-C-1  1      0  cpu-migrations    #0.000 /sec

Here, even though the value is -1, we display it. In the cpu case,
however, the first column will be empty, since we do a check before
printing.

Example:

./perf stat --per-core -A -C 1 true

 Performance counter stats for 'CPU(s) 1':

  0.88 msec cpu-clock         #1.022 CPUs utilized
     2  context-switches
     0  cpu-migrations


I am not sure whether there are scripts out there that consume the
current format; not displaying -1 may break them. That is why we tried
the change to remove the check for cpu, making it similar to the other
modes like socket, die, core, etc.
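
To make the asymmetry concrete, here is a minimal sketch of the two
printing paths. It is not the actual aggr_printout code: struct demo_id,
demo_core_prefix() and demo_cpu_prefix() are hypothetical names used only
for illustration, and the check_core flag models the "if (id.core > -1)"
test.

```c
#include <stdio.h>

/* Hypothetical stand-in for perf's aggr_cpu_id; only the fields discussed here. */
struct demo_id {
	int socket;
	int die;
	int core;
	int cpu;
};

/* --per-core style prefix: no validity check, so -1 is printed as-is. */
static int demo_core_prefix(char *buf, size_t len, struct demo_id id)
{
	return snprintf(buf, len, "S%d-D%d-C%d", id.socket, id.die, id.core);
}

/*
 * -A (no-aggr) style prefix. check_core models the "if (id.core > -1)"
 * test: when it is set and core is -1, the CPU column is left empty.
 */
static int demo_cpu_prefix(char *buf, size_t len, struct demo_id id, int check_core)
{
	if (check_core && id.core <= -1) {
		if (len > 0)
			buf[0] = '\0';
		return 0;
	}
	return snprintf(buf, len, "CPU%d", id.cpu);
}
```

With every field set to -1, demo_core_prefix() still produces a prefix,
while demo_cpu_prefix() produces one only when check_core is 0, which
corresponds to the behaviour the patch proposes.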

Also, the perf code, i.e. "aggr_cpu_id__empty" in util/cpumap.c,
initialises the values to -1. I was checking to see where we map -1 to
"to not aggregate". What I could find is that AGGR_NONE (which is for
no-aggr) has the value zero.

Reference: defined in util/stat.h

enum aggr_mode {
AGGR_NONE,
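
Putting the two observations side by side, as a sketch only: the
demo_-prefixed names below are hypothetical stand-ins, not the perf
definitions, and the struct carries only the fields discussed in this
thread.

```c
/* Hypothetical mirror of the two definitions mentioned above. */
enum demo_aggr_mode {
	DEMO_AGGR_NONE,		/* first enumerator, so its value is 0 */
	DEMO_AGGR_SOCKET,
};

struct demo_aggr_id {
	int socket;
	int die;
	int core;
	int cpu;
};

/* Models an "empty" id: every field starts out as -1. */
static struct demo_aggr_id demo_aggr_id_empty(void)
{
	struct demo_aggr_id id = { .socket = -1, .die = -1, .core = -1, .cpu = -1 };

	return id;
}
```

So in this model the no-aggr mode is 0 and the "empty" field value is
-1; they are two separate things.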

James, can you point me to a reference for that meaning, in case I have
missed anything?

Thanks
Athira

> 
>> 
>> After the fix:
>> 
>> <<>>
>> perf stat -e instructions -A -a true
>> 
>> Performance counter stats for 'system wide':
>> 
>> CPU-1  64,034  

Re: [powerpc] Build failure include/linux/compiler_types.h __alloc_size__ (next-20220928)

2022-09-29 Thread Kees Cook
On Thu, Sep 29, 2022 at 11:49:28AM +0530, Sachin Sant wrote:
> Linux-next  6.0.0-rc7-next-20220928 fails to build on powerpc with
> following error:
> 
> make -j 17 -s && make modules_install && make install
> In file included from :
> ./include/linux/percpu.h: In function '__alloc_reserved_percpu':
> ././include/linux/compiler_types.h:279:30: error: expected declaration specifiers before '__alloc_size__'
>  #define __alloc_size(x, ...) __alloc_size__(x, ## __VA_ARGS__) __malloc
>   ^~

Apologies for the breakage! This should be fixed by:

https://lore.kernel.org/lkml/20220929081642.1932200-1-keesc...@chromium.org

-- 
Kees Cook