date:20230719

Re: [PATCH V2 00/26] tools/perf: Fix shellcheck coding/formatting issues of perf tool shell scripts

2023-07-19 Thread kajoljain




On 7/20/23 10:42, Athira Rajeev wrote:
> 
> 
>> On 19-Jul-2023, at 11:16 PM, Ian Rogers  wrote:
>>
>> On Tue, Jul 18, 2023 at 11:17 PM kajoljain  wrote:
>>>
>>> Hi,
>>>
>>> Looking for review comments on this patchset.
>>>
>>> Thanks,
>>> Kajol Jain
>>>
>>>
>>> On 7/9/23 23:57, Athira Rajeev wrote:
>>>> Patchset covers a set of fixes for coding/formatting issues observed while
>>>> running shellcheck tool on the perf shell scripts.
>>>>
>>>> This cleanup is a pre-requisite to include a build option for shellcheck
>>>> discussed here:
>>>> https://www.spinics.net/lists/linux-perf-users/msg25553.html
>>>> First set of patches were posted here:
>>>> https://lore.kernel.org/linux-perf-users/53b7d823-1570-4289-a632-2205ee2b5...@linux.vnet.ibm.com/T/#t
>>>>
>>>> This patchset covers remaining set of shell scripts which needs
>>>> fix. Patch 1 is resubmission of patch 6 from the initial series.
>>>> Patch 15, 16 and 22 touches code from tools/perf/trace/beauty.
>>>> Other patches are fixes for scripts from tools/perf/tests.
>>>>
>>>> The shellcheck is run for severity level for errors and warnings.
>>>> Command used:
>>>>
>>>> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
>>>> # echo $?
>>>> 0

>>
>> I don't see anything objectionable in the changes so for the series:
>> Acked-by: Ian Rogers 
>>
>> Some thoughts:
>> - Adding "#!/bin/bash" to scripts in tools/perf/tests/lib - I think
>> we didn't do this to avoid these being included as tests. There are
>> now extra checks when finding shell tests, so I can imagine doing this
>> isn't a regression but just a heads up.
>> - I think James' comment was addressed:
>> https://lore.kernel.org/linux-perf-users/334989bf-5501-494c-f246-81878fd2f...@arm.com/
>> - Why aren't these changes being mailed to LKML? The wider community
>> on LKML have thoughts on shell scripts, plus it makes the changes miss
>> my mail filters.
>> - Can we automate this testing into the build? For example, following
>> a similar kernel build pattern we run a python test and make the log
>> output a requirement here:
>> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/Build?h=perf-tools-next#n30
>>   I think we can translate:
>> for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck
>> -S warning $F; done
>>   into a rule in make for log files that are then a dependency on the
>> perf binary. We can then parallel shellcheck during the build and
>> avoid regressions. We probably need a CONFIG_SHELLCHECK feature check
>> in the build to avoid not having shellcheck breaking the build.
> 
> Hi Ian
> 
> Thanks for the comments.
> Yes, next step after this is to include build option for shellcheck by 
> updating Makefile.
> We will surely get into that build option enablement patch once we have all 
> these corrections in place.
> 
> Thanks
> Athira
>>

Hi Ian,
   Thanks for reviewing the patches. As Athira mentioned, our next step is to
include the build option. We will work on it once all the corrections are
in place.

Thanks,
Kajol Jain

>> Thanks,
>> Ian
>>
>>>> Changelog:
>>>> v1 -> v2:
>>>>  - Rebased on top of perf-tools-next from:
>>>>    https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf-tools-next
>>>>
>>>>  - Fixed shellcheck errors and warnings reported for newly
>>>>    added changes from perf-tools-next branch
>>>>
>>>>  - Addressed review comment from James clark for patch
>>>>    number 13 from V1. The changes in patch 13 were not necessary
>>>>    since the file "tests/shell/lib/coresight.sh" is sourced from
>>>>    other test files.
>>>>
>>>> Akanksha J N (1):
>>>>   tools/perf/tests: Fix shellcheck warnings for
>>>>     trace+probe_vfs_getname.sh
>>>>
>>>> Athira Rajeev (14):
>>>>   tools/perf/tests: fix test_arm_spe_fork.sh signal case issues
>>>>   tools/perf/tests: Fix unused variable references in
>>>>     stat+csv_summary.sh testcase
>>>>   tools/perf/tests: fix shellcheck warning for
>>>>     test_perf_data_converter_json.sh testcase
>>>>   tools/perf/tests: Fix shellcheck issue for stat_bpf_counters.sh
>>>>     testcase
>>>>   tools/perf/tests: Fix shellcheck issues in
>>>>     tests/shell/stat+shadow_stat.sh tetscase
>>>>   tools/perf/tests: Fix shellcheck warnings for
>>>>     thread_loop_check_tid_10.sh
>>>>   tools/perf/tests: Fix shellcheck warnings for unroll_loop_thread_10.sh
>>>>   tools/perf/tests: Fix shellcheck warnings for lib/probe_vfs_getname.sh
>>>>   tools/perf/tests: Fix the shellcheck warnings in lib/waiting.sh
>>>>   tools/perf/trace: Fix x86_arch_prctl.sh to address shellcheck warnings
>>>>   tools/perf/arch/x86: Fix syscalltbl.sh to address shellcheck warnings
>>>>   tools/perf/tests/shell: Fix the shellcheck warnings in
>>>>     record+zstd_comp_decomp.sh
>>>>   tools/perf/tests/shell: Fix shellcheck warning for stat+std_output.sh
>>>>     testcase
>>>>   tools/perf/tests: Fix shellcheck warning for stat+std_output.sh
>>>>     testcase

Re: [PATCH V2 00/26] tools/perf: Fix shellcheck coding/formatting issues of perf tool shell scripts

2023-07-19 Thread Athira Rajeev




> On 19-Jul-2023, at 11:16 PM, Ian Rogers  wrote:
> 
> On Tue, Jul 18, 2023 at 11:17 PM kajoljain  wrote:
>> 
>> Hi,
>> 
>> Looking for review comments on this patchset.
>> 
>> Thanks,
>> Kajol Jain
>> 
>> 
>> On 7/9/23 23:57, Athira Rajeev wrote:
>>> Patchset covers a set of fixes for coding/formatting issues observed while
>>> running shellcheck tool on the perf shell scripts.
>>> 
>>> This cleanup is a pre-requisite to include a build option for shellcheck
>>> discussed here: https://www.spinics.net/lists/linux-perf-users/msg25553.html
>>> First set of patches were posted here:
>>> https://lore.kernel.org/linux-perf-users/53b7d823-1570-4289-a632-2205ee2b5...@linux.vnet.ibm.com/T/#t
>>> 
>>> This patchset covers remaining set of shell scripts which needs
>>> fix. Patch 1 is resubmission of patch 6 from the initial series.
>>> Patch 15, 16 and 22 touches code from tools/perf/trace/beauty.
>>> Other patches are fixes for scripts from tools/perf/tests.
>>> 
>>> The shellcheck is run for severity level for errors and warnings.
>>> Command used:
>>> 
>>> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
>>> # echo $?
>>> 0
>>> 
> 
> I don't see anything objectionable in the changes so for the series:
> Acked-by: Ian Rogers 
> 
> Some thoughts:
> - Adding "#!/bin/bash" to scripts in tools/perf/tests/lib - I think
> we didn't do this to avoid these being included as tests. There are
> now extra checks when finding shell tests, so I can imagine doing this
> isn't a regression but just a heads up.
> - I think James' comment was addressed:
> https://lore.kernel.org/linux-perf-users/334989bf-5501-494c-f246-81878fd2f...@arm.com/
> - Why aren't these changes being mailed to LKML? The wider community
> on LKML have thoughts on shell scripts, plus it makes the changes miss
> my mail filters.
> - Can we automate this testing into the build? For example, following
> a similar kernel build pattern we run a python test and make the log
> output a requirement here:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/Build?h=perf-tools-next#n30
>   I think we can translate:
> for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck
> -S warning $F; done
>   into a rule in make for log files that are then a dependency on the
> perf binary. We can then parallel shellcheck during the build and
> avoid regressions. We probably need a CONFIG_SHELLCHECK feature check
> in the build to avoid not having shellcheck breaking the build.

Hi Ian

Thanks for the comments.
Yes, next step after this is to include build option for shellcheck by updating 
Makefile.
We will surely get into that build option enablement patch once we have all 
these corrections in place.

Thanks
Athira
> 
> Thanks,
> Ian
> 
>>> Changelog:
>>> v1 -> v2:
>>>  - Rebased on top of perf-tools-next from:
>>>  
>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf-tools-next
>>> 
>>>  - Fixed shellcheck errors and warnings reported for newly
>>>    added changes from perf-tools-next branch
>>> 
>>>  - Addressed review comment from James clark for patch
>>>    number 13 from V1. The changes in patch 13 were not necessary
>>>    since the file "tests/shell/lib/coresight.sh" is sourced from
>>>    other test files.
>>> 
>>> Akanksha J N (1):
>>>  tools/perf/tests: Fix shellcheck warnings for
>>>    trace+probe_vfs_getname.sh
>>> 
>>> Athira Rajeev (14):
>>>  tools/perf/tests: fix test_arm_spe_fork.sh signal case issues
>>>  tools/perf/tests: Fix unused variable references in
>>>    stat+csv_summary.sh testcase
>>>  tools/perf/tests: fix shellcheck warning for
>>>    test_perf_data_converter_json.sh testcase
>>>  tools/perf/tests: Fix shellcheck issue for stat_bpf_counters.sh
>>>    testcase
>>>  tools/perf/tests: Fix shellcheck issues in
>>>    tests/shell/stat+shadow_stat.sh tetscase
>>>  tools/perf/tests: Fix shellcheck warnings for
>>>    thread_loop_check_tid_10.sh
>>>  tools/perf/tests: Fix shellcheck warnings for unroll_loop_thread_10.sh
>>>  tools/perf/tests: Fix shellcheck warnings for lib/probe_vfs_getname.sh
>>>  tools/perf/tests: Fix the shellcheck warnings in lib/waiting.sh
>>>  tools/perf/trace: Fix x86_arch_prctl.sh to address shellcheck warnings
>>>  tools/perf/arch/x86: Fix syscalltbl.sh to address shellcheck warnings
>>>  tools/perf/tests/shell: Fix the shellcheck warnings in
>>>    record+zstd_comp_decomp.sh
>>>  tools/perf/tests/shell: Fix shellcheck warning for stat+std_output.sh
>>>    testcase
>>>  tools/perf/tests: Fix shellcheck warning for stat+std_output.sh
>>>    testcase
>>> 
>>> Kajol Jain (11):
>>>  tools/perf/tests: Fix shellcheck warning for probe_vfs_getname.sh
>>>    testcase
>>>  tools/perf/tests: Fix shellcheck warning for record_offcpu.sh testcase
>>>  tools/perf/tests: Fix shellcheck issue for lock_contention.sh testcase
>>>  tools/perf/tests: Fix shellcheck issue for stat_bpf_counters_cgrp.sh
>>>

Re: [PATCH v2 3/5] mmu_notifiers: Call invalidate_range() when invalidating TLBs

2023-07-19 Thread SeongJae Park

On Thu, 20 Jul 2023 10:52:59 +1000 Alistair Popple  wrote:

> 
> SeongJae Park  writes:
> 
> > Hi Alistair,
> >
> > On Wed, 19 Jul 2023 22:18:44 +1000 Alistair Popple wrote:
> >
> >> The invalidate_range() is going to become an architecture specific mmu
> >> notifier used to keep the TLB of secondary MMUs such as an IOMMU in
> >> sync with the CPU page tables. Currently it is called from separate
> >> code paths to the main CPU TLB invalidations. This can lead to a
> >> secondary TLB not getting invalidated when required and makes it hard
> >> to reason about when exactly the secondary TLB is invalidated.
> >> 
> >> To fix this move the notifier call to the architecture specific TLB
> >> maintenance functions for architectures that have secondary MMUs
> >> requiring explicit software invalidations.
> >> 
> >> This fixes a SMMU bug on ARM64. On ARM64 PTE permission upgrades
> >> require a TLB invalidation. This invalidation is done by the
> >> architecutre specific ptep_set_access_flags() which calls
> >> flush_tlb_page() if required. However this doesn't call the notifier
> >> resulting in infinite faults being generated by devices using the SMMU
> >> if it has previously cached a read-only PTE in it's TLB.
> >> 
> >> Moving the invalidations into the TLB invalidation functions ensures
> >> all invalidations happen at the same time as the CPU invalidation. The
> >> architecture specific flush_tlb_all() routines do not call the
> >> notifier as none of the IOMMUs require this.
> >> 
> >> Signed-off-by: Alistair Popple 
> >> Suggested-by: Jason Gunthorpe 
> >
> > I found below kernel NULL-dereference issue on latest mm-unstable tree, and
> > bisect points me to the commit of this patch, namely
> > 75c400f82d347af1307010a3e06f3aa5d831d995.
> >
> > To reproduce, I use 'stress-ng --bigheap $(nproc)'.  The issue happens as soon
> > as it starts reclaiming memory.  I didn't dive deep into this yet, but
> > reporting this issue first, since you might have an idea already.
> 
> Thanks for the report SJ!
> 
> I see the problem - current->mm can (obviously!) be NULL which is what's
> leading to the NULL dereference. Instead I think on x86 I need to call
> the notifier when adding the invalidate to the tlbbatch in
> arch_tlbbatch_add_pending() which is equivalent to what ARM64 does.
> 
> The below should fix it. Will do a respin with this.

Thank you for this quick reply!  I confirm this fixes my issue.


Tested-by: SeongJae Park 

> 
> ---
> 
> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index 837e4a50281a..79c46da919b9 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -4,6 +4,7 @@
>  
>  #include <linux/mm_types.h>
>  #include <linux/sched.h>
> +#include <linux/mmu_notifier.h>

Nit.  How about putting it between mm_types.h and sched.h, so that it looks
alphabetically sorted?

>  
>  #include <asm/processor.h>
>  #include <asm/cpufeature.h>
> @@ -282,6 +283,7 @@ static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *b
>  {
>   inc_mm_tlb_gen(mm);
>   cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm));
> + mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL);
>  }
>  
>  static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm)
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 0b990fb56b66..2d253919b3e8 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -1265,7 +1265,6 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>  
>   put_flush_tlb_info();
>   put_cpu();
> - mmu_notifier_arch_invalidate_secondary_tlbs(current->mm, 0, -1UL);
>  }
>  
>  /*
> 
> 


Thanks,
SJ
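
The reasoning behind the fix can be sketched in plain userspace C. This is only a model under stated assumptions, not kernel code: `mm_model`, `task_model`, `batch_model` and the three functions are hypothetical stand-ins for the mm, the current task, the unmap batch and the notifier. It shows why notifying at queue time, where the target mm is passed in explicitly, avoids depending on the current task's mm at flush time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mm: just counts notifier calls. */
struct mm_model {
    int secondary_invalidations;
};

/* Hypothetical stand-in for a task; a kernel thread has mm == NULL. */
struct task_model {
    struct mm_model *mm;
};

/* Hypothetical stand-in for the batch of pending TLB flushes. */
struct batch_model {
    int pending;
};

/* Models the secondary-TLB notifier, which needs a valid mm. */
static void notify_secondary_tlbs(struct mm_model *mm)
{
    assert(mm != NULL);
    mm->secondary_invalidations++;
}

/* Fixed scheme: notify while queuing, where the mm is an argument. */
static void batch_add_pending(struct batch_model *batch, struct mm_model *mm)
{
    batch->pending++;
    notify_secondary_tlbs(mm);
}

/* The flush no longer looks at the current task's mm at all. */
static void batch_flush(struct batch_model *batch,
                        const struct task_model *current_task)
{
    (void)current_task; /* may have mm == NULL, e.g. during reclaim */
    batch->pending = 0;
}
```

In this model, calling `batch_flush()` from a task whose `mm` is NULL is harmless, because the notification already happened in `batch_add_pending()`.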

Re: [RFC PATCH 05/21] ubifs: Pass worst-case buffer size to compression routines

2023-07-19 Thread Zhihao Cheng


On 2023/7/19 22:38, Ard Biesheuvel wrote:

On Wed, 19 Jul 2023 at 16:23, Zhihao Cheng  wrote:


On 2023/7/19 16:33, Ard Biesheuvel wrote:

On Wed, 19 Jul 2023 at 00:38, Eric Biggers  wrote:


On Tue, Jul 18, 2023 at 02:58:31PM +0200, Ard Biesheuvel wrote:

Currently, the ubifs code allocates a worst case buffer size to
recompress a data node, but does not pass the size of that buffer to the
compression code. This means that the compression code will never use


I think you mean the 'out_len' which describes the length of 'buf' is
passed into ubifs_decompress, which affects the result of the
decompressor (e.g. lz4 uses the length to calculate the buffer end position).
So, we should pass the real length of 'buf'.



Yes, that is what I meant.

But Eric makes a good point, and looking a bit more closely, there is
really no need for the multiplication here: we know the size of the
decompressed data, so we don't need the additional space.



Right, we get 'out_len' from 'dn->size' which is the length of 
uncompressed data. ubifs_compress makes sure the compressed length is 
smaller than original length.



I intend to drop this patch, and replace it with the following:

8<--

Currently, when truncating a data node, a decompression buffer is
allocated that is twice the size of the data node's uncompressed size.
However, the fact that this space is available is not communicated to
the compression routines, as out_len itself is not updated.

The additional space is not needed even in the theoretical worst case
where compression might lead to inadvertent expansion: first of all,
increasing the size of the input buffer does not help mitigate that
issue. And given the truncation of the data node and the fact that the
original data compressed well enough to pass the UBIFS_MIN_COMPRESS_DIFF
test, there is no way on this particular code path that compression
could result in expansion beyond the original decompressed size, and so
no mitigation is necessary to begin with.

So let's just drop WORST_COMPR_FACTOR here.

Signed-off-by: Ard Biesheuvel 

diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index dc52ac0f4a345f30..0b55cbfe0c30505e 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -1489,7 +1489,7 @@ static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
 int err, dlen, compr_type, out_len, data_size;

 out_len = le32_to_cpu(dn->size);
-   buf = kmalloc_array(out_len, WORST_COMPR_FACTOR, GFP_NOFS);
+   buf = kmalloc(out_len, GFP_NOFS);
 if (!buf)
 return -ENOMEM;
.



This version looks better.

Reviewed-by: Zhihao Cheng
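
The size argument in this thread can be illustrated with a small userspace model. It is a sketch only: `stored_len()` is a hypothetical helper, and the threshold value 64 is an assumption made for illustration, since the thread names UBIFS_MIN_COMPRESS_DIFF without giving its value. The point it shows is that when data is kept compressed only if compression saves at least the threshold, the stored length can never exceed the uncompressed length, so a buffer of the uncompressed size is enough.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed threshold for illustration; not taken from the thread. */
#define MIN_COMPRESS_DIFF_MODEL 64

/*
 * Returns the length that would be stored for a node: the compressed
 * length if compression saved at least the threshold, otherwise the
 * original length (the data is stored uncompressed).
 */
static int stored_len(int in_len, int compressed_len)
{
    assert(in_len >= 0 && compressed_len >= 0);
    if (in_len - compressed_len < MIN_COMPRESS_DIFF_MODEL)
        return in_len;
    return compressed_len;
}
```

Under this model `stored_len(n, c) <= n` holds for every input, including the case where the compressor expands the data.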

Re: [RFC PATCH v11 05/29] KVM: Convert KVM_ARCH_WANT_MMU_NOTIFIER to CONFIG_KVM_GENERIC_MMU_NOTIFIER

2023-07-19 Thread Yuan Yao

On Wed, Jul 19, 2023 at 07:15:09AM -0700, Sean Christopherson wrote:
> On Wed, Jul 19, 2023, Yuan Yao wrote:
> > On Tue, Jul 18, 2023 at 04:44:48PM -0700, Sean Christopherson wrote:
> > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > index 90a0be261a5c..d2d3e083ec7f 100644
> > > --- a/include/linux/kvm_host.h
> > > +++ b/include/linux/kvm_host.h
> > > @@ -255,7 +255,9 @@ bool kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> > >  int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu);
> > >  #endif
> > >
> > > -#ifdef KVM_ARCH_WANT_MMU_NOTIFIER
> > > +struct kvm_gfn_range;
> >
> > Not sure why a declaration here, it's defined for ARCHs which defined
> > KVM_ARCH_WANT_MMU_NOTIFIER before.
>
> The forward declaration exists to handle cases where CONFIG_KVM=n, specifically
> arch/powerpc/include/asm/kvm_ppc.h's declaration of hooks to forward calls to
> uarch modules:
>
>   bool (*unmap_gfn_range)(struct kvm *kvm, struct kvm_gfn_range *range);
>   bool (*age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
>   bool (*test_age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
>   bool (*set_spte_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
>
> Prior to using a Kconfig, a forward declaration wasn't necessary because
> arch/powerpc/include/asm/kvm_host.h would #define KVM_ARCH_WANT_MMU_NOTIFIER even
> if CONFIG_KVM=n.
>
> Alternatively, kvm_ppc.h could declare the struct.  I went this route mainly to
> avoid the possibility of someone encountering the same problem on a different
> architecture.

Ah I see, thanks for your explanation!
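
For readers less familiar with the C rule being relied on, here is a minimal standalone sketch. `hooks_model`, `unmap_stub` and `make_hooks` are hypothetical names; only `struct kvm` and `struct kvm_gfn_range` are taken from the thread, and neither is defined here. A forward declaration at file scope is enough to declare function pointers that take a pointer to the struct; without it, a struct first named inside a parameter list is scoped to that prototype only, which compilers typically warn about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct kvm;           /* opaque in this sketch */
struct kvm_gfn_range; /* forward declaration: enough for pointer parameters */

/* Hypothetical hook table, modeled on the hooks quoted above. */
struct hooks_model {
    bool (*unmap_gfn_range)(struct kvm *kvm, struct kvm_gfn_range *range);
    bool (*age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
};

/* A stub that never dereferences range, so no definition is needed. */
static bool unmap_stub(struct kvm *kvm, struct kvm_gfn_range *range)
{
    (void)kvm;
    (void)range;
    return false;
}

/* Builds a table with one hook set and one left unset. */
static struct hooks_model make_hooks(void)
{
    struct hooks_model h = { .unmap_gfn_range = unmap_stub, .age_gfn = NULL };

    assert(h.unmap_gfn_range != NULL);
    return h;
}
```

The sketch compiles even though `struct kvm_gfn_range` is never defined, which mirrors the CONFIG_KVM=n situation described above.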

Re: [PATCH v2 3/5] mmu_notifiers: Call invalidate_range() when invalidating TLBs

2023-07-19 Thread Alistair Popple



SeongJae Park  writes:

> Hi Alistair,
>
> On Wed, 19 Jul 2023 22:18:44 +1000 Alistair Popple  wrote:
>
>> The invalidate_range() is going to become an architecture specific mmu
>> notifier used to keep the TLB of secondary MMUs such as an IOMMU in
>> sync with the CPU page tables. Currently it is called from separate
>> code paths to the main CPU TLB invalidations. This can lead to a
>> secondary TLB not getting invalidated when required and makes it hard
>> to reason about when exactly the secondary TLB is invalidated.
>> 
>> To fix this move the notifier call to the architecture specific TLB
>> maintenance functions for architectures that have secondary MMUs
>> requiring explicit software invalidations.
>> 
>> This fixes a SMMU bug on ARM64. On ARM64 PTE permission upgrades
>> require a TLB invalidation. This invalidation is done by the
>> architecutre specific ptep_set_access_flags() which calls
>> flush_tlb_page() if required. However this doesn't call the notifier
>> resulting in infinite faults being generated by devices using the SMMU
>> if it has previously cached a read-only PTE in it's TLB.
>> 
>> Moving the invalidations into the TLB invalidation functions ensures
>> all invalidations happen at the same time as the CPU invalidation. The
>> architecture specific flush_tlb_all() routines do not call the
>> notifier as none of the IOMMUs require this.
>> 
>> Signed-off-by: Alistair Popple 
>> Suggested-by: Jason Gunthorpe 
>
> I found below kernel NULL-dereference issue on latest mm-unstable tree, and
> bisect points me to the commit of this patch, namely
> 75c400f82d347af1307010a3e06f3aa5d831d995.
>
> To reproduce, I use 'stress-ng --bigheap $(nproc)'.  The issue happens as soon
> as it starts reclaiming memory.  I didn't dive deep into this yet, but
> reporting this issue first, since you might have an idea already.

Thanks for the report SJ!

I see the problem - current->mm can (obviously!) be NULL which is what's
leading to the NULL dereference. Instead I think on x86 I need to call
the notifier when adding the invalidate to the tlbbatch in
arch_tlbbatch_add_pending() which is equivalent to what ARM64 does.

The below should fix it. Will do a respin with this.

---

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 837e4a50281a..79c46da919b9 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -4,6 +4,7 @@
 
 #include <linux/mm_types.h>
 #include <linux/sched.h>
+#include <linux/mmu_notifier.h>
 
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
@@ -282,6 +283,7 @@ static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *b
 {
    inc_mm_tlb_gen(mm);
    cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm));
+   mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL);
 }
 
 static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm)
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 0b990fb56b66..2d253919b3e8 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1265,7 +1265,6 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 
put_flush_tlb_info();
put_cpu();
-   mmu_notifier_arch_invalidate_secondary_tlbs(current->mm, 0, -1UL);
 }
 
 /*

Re: [PATCH v3 2/2] PCI: layerscape: Add the workaround for lost link capabilities during reset.

2023-07-19 Thread Frank Li

On Wed, Jul 19, 2023 at 11:57:07AM -0400, Frank Li wrote:
> From: Xiaowei Bao 
> 
> A workaround for the issue where the PCI Express Endpoint (EP) controller
> loses the values of the Maximum Link Width and Supported Link Speed from
> the Link Capabilities Register, which initially configured by the Reset
> Configuration Word (RCW) during a link-down or hot reset event.
> 
> Fixes: a805770d8a22 ("PCI: layerscape: Add EP mode support")
> Acked-by: Manivannan Sadhasivam 
> Signed-off-by: Xiaowei Bao 
> Signed-off-by: Hou Zhiqiang 
> Signed-off-by: Frank Li 
> ---
> change from v2 to v3
>  - fix subject typo capabilities
> change from v1 to v2:
>  - add comments at restore register
>  - add fixes tag
>  - dw_pcie_writew_dbi to dw_pcie_writel_dbi
> 
>  .../pci/controller/dwc/pci-layerscape-ep.c| 21 ++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> index e0969ff2ddf7..39dbd911c3f8 100644
> --- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> @@ -45,6 +45,7 @@ struct ls_pcie_ep {
>   struct pci_epc_features *ls_epc;
>   const struct ls_pcie_ep_drvdata *drvdata;
>   int irq;
> + u32 lnkcap;
>   bool big_endian;
>  };
>  
> @@ -73,6 +74,7 @@ static irqreturn_t ls_pcie_ep_event_handler(int irq, void *dev_id)
>   struct ls_pcie_ep *pcie = dev_id;
>   struct dw_pcie *pci = pcie->pci;
>   u32 val, cfg;
> + u8 offset;
>  
>   val = ls_lut_readl(pcie, PEX_PF0_PME_MES_DR);
>   ls_lut_writel(pcie, PEX_PF0_PME_MES_DR, val);
> @@ -81,12 +83,25 @@ static irqreturn_t ls_pcie_ep_event_handler(int irq, void *dev_id)
>   return IRQ_NONE;
>  
>   if (val & PEX_PF0_PME_MES_DR_LUD) {
> +
> + offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> +
> + /*
> +  * The values of the Maximum Link Width and Supported Link
> +  * Speed from the Link Capabilities Register will be lost
> +  * during link down or hot reset. Restore initial value
> +  * that configured by the Reset Configuration Word (RCW).
> +  */
> + dw_pcie_dbi_ro_wr_en(pci);
> + dw_pcie_writel_dbi(pci, offset + PCI_EXP_LNKCAP, pcie->lnkcap);
> + dw_pcie_dbi_ro_wr_dis(pci);
> +
>   cfg = ls_lut_readl(pcie, PEX_PF0_CONFIG);
>   cfg |= PEX_PF0_CFG_READY;
>   ls_lut_writel(pcie, PEX_PF0_CONFIG, cfg);
>   dw_pcie_ep_linkup(&pci->ep);
>  
> - dev_dbg(pci->dev, "Link up\n");
> + dev_err(pci->dev, "Link up\n");

Sorry, I just noticed that I mistakenly merged some debug code.
It should be dev_dbg here; I will send an updated patch soon.

Frank

>   } else if (val & PEX_PF0_PME_MES_DR_LDD) {
>   dev_dbg(pci->dev, "Link down\n");
>   pci_epc_linkdown(pci->ep.epc);
> @@ -216,6 +231,7 @@ static int __init ls_pcie_ep_probe(struct platform_device *pdev)
>   struct ls_pcie_ep *pcie;
>   struct pci_epc_features *ls_epc;
>   struct resource *dbi_base;
> + u8 offset;
>   int ret;
>  
>   pcie = devm_kzalloc(dev, sizeof(*pcie), GFP_KERNEL);
> @@ -252,6 +268,9 @@ static int __init ls_pcie_ep_probe(struct platform_device *pdev)
>  
>   platform_set_drvdata(pdev, pcie);
>  
> + offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> + pcie->lnkcap = dw_pcie_readl_dbi(pci, offset + PCI_EXP_LNKCAP);
> +
>   ret = dw_pcie_ep_init(&pci->ep);
>   if (ret)
>   return ret;
> -- 
> 2.34.1
>
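
The save-and-restore pattern used by the quoted patch can be modeled in a few lines of userspace C. This is a sketch under assumptions: `ep_model` and its functions are hypothetical, the register is a plain variable rather than DBI space, and the model clears the whole register on link down, whereas the patch description says only the Maximum Link Width and Supported Link Speed fields are lost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the endpoint controller state. */
struct ep_model {
    uint32_t lnkcap_reg;   /* models the Link Capabilities Register */
    uint32_t saved_lnkcap; /* models the value cached at probe time */
};

/* Probe: snapshot the value that was configured at reset. */
static void ep_probe(struct ep_model *ep)
{
    assert(ep != NULL);
    ep->saved_lnkcap = ep->lnkcap_reg;
}

/* Simplified model of the hardware losing the value on link down. */
static void ep_link_down(struct ep_model *ep)
{
    ep->lnkcap_reg = 0;
}

/* Link-up handler: write the cached value back. */
static void ep_link_up(struct ep_model *ep)
{
    ep->lnkcap_reg = ep->saved_lnkcap;
}
```

The design choice mirrored here is that the snapshot is taken once, before any link event can occur, so every later link-up restores the same initial value.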

[PATCH] ASoC: fsl_spdif: Silence output on stop

2023-07-19 Thread Matus Gajdos

Clear TX registers on stop to prevent the SPDIF interface from sending
last written word over and over again.

Fixes: a2388a498ad2 ("ASoC: fsl: Add S/PDIF CPU DAI driver")
Signed-off-by: Matus Gajdos 
---
 sound/soc/fsl/fsl_spdif.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sound/soc/fsl/fsl_spdif.c b/sound/soc/fsl/fsl_spdif.c
index 95e639711eba..95bb8b10494a 100644
--- a/sound/soc/fsl/fsl_spdif.c
+++ b/sound/soc/fsl/fsl_spdif.c
@@ -755,6 +755,8 @@ static int fsl_spdif_trigger(struct snd_pcm_substream *substream,
case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
regmap_update_bits(regmap, REG_SPDIF_SCR, dmaen, 0);
regmap_update_bits(regmap, REG_SPDIF_SIE, intr, 0);
+   regmap_write(regmap, REG_SPDIF_STL, 0x0);
+   regmap_write(regmap, REG_SPDIF_STR, 0x0);
break;
default:
return -EINVAL;
-- 
2.25.1

[PATCH] ASoC: fsl_spdif: Add support for 22.05 kHz sample rate

2023-07-19 Thread Matus Gajdos

Add support for 22.05 kHz sample rate for TX.

Signed-off-by: Matus Gajdos 
---
 sound/soc/fsl/fsl_spdif.c | 8 ++--
 sound/soc/fsl/fsl_spdif.h | 6 --
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/sound/soc/fsl/fsl_spdif.c b/sound/soc/fsl/fsl_spdif.c
index 015c3708aa04..95e639711eba 100644
--- a/sound/soc/fsl/fsl_spdif.c
+++ b/sound/soc/fsl/fsl_spdif.c
@@ -514,6 +514,10 @@ static int spdif_set_sample_rate(struct snd_pcm_substream *substream,
int ret;
 
switch (sample_rate) {
+   case 22050:
+   rate = SPDIF_TXRATE_22050;
+   csfs = IEC958_AES3_CON_FS_22050;
+   break;
case 32000:
rate = SPDIF_TXRATE_32000;
csfs = IEC958_AES3_CON_FS_32000;
@@ -1422,7 +1426,7 @@ static u32 fsl_spdif_txclk_caldiv(struct fsl_spdif_priv *spdif_priv,
struct clk *clk, u64 savesub,
enum spdif_txrate index, bool round)
 {
-   static const u32 rate[] = { 32000, 44100, 48000, 88200, 96000, 176400,
+   static const u32 rate[] = { 22050, 32000, 44100, 48000, 88200, 96000, 176400,
192000, };
bool is_sysclk = clk_is_match(clk, spdif_priv->sysclk);
u64 rate_ideal, rate_actual, sub;
@@ -1483,7 +1487,7 @@ static u32 fsl_spdif_txclk_caldiv(struct fsl_spdif_priv *spdif_priv,
 static int fsl_spdif_probe_txclk(struct fsl_spdif_priv *spdif_priv,
enum spdif_txrate index)
 {
-   static const u32 rate[] = { 32000, 44100, 48000, 88200, 96000, 176400,
+   static const u32 rate[] = { 22050, 32000, 44100, 48000, 88200, 96000, 176400,
192000, };
struct platform_device *pdev = spdif_priv->pdev;
struct device *dev = &pdev->dev;
diff --git a/sound/soc/fsl/fsl_spdif.h b/sound/soc/fsl/fsl_spdif.h
index 75b42a692c90..2bc1b10c17d4 100644
--- a/sound/soc/fsl/fsl_spdif.h
+++ b/sound/soc/fsl/fsl_spdif.h
@@ -175,7 +175,8 @@ enum spdif_gainsel {
 
 /* SPDIF tx rate */
 enum spdif_txrate {
-   SPDIF_TXRATE_32000 = 0,
+   SPDIF_TXRATE_22050 = 0,
+   SPDIF_TXRATE_32000,
SPDIF_TXRATE_44100,
SPDIF_TXRATE_48000,
SPDIF_TXRATE_88200,
@@ -191,7 +192,8 @@ enum spdif_txrate {
 #define SPDIF_QSUB_SIZE (SPDIF_UBITS_SIZE / 8)
 
 
-#define FSL_SPDIF_RATES_PLAYBACK   (SNDRV_PCM_RATE_32000 | \
+#define FSL_SPDIF_RATES_PLAYBACK   (SNDRV_PCM_RATE_22050 | \
+SNDRV_PCM_RATE_32000 | \
 SNDRV_PCM_RATE_44100 | \
 SNDRV_PCM_RATE_48000 | \
 SNDRV_PCM_RATE_88200 | \
-- 
2.25.1
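
One detail of this patch is worth spelling out: the enum values are used as indexes into the `rate[]` tables, so inserting 22050 at position 0 of the enum only works because the same value is inserted at position 0 of each table. The following userspace sketch shows that coupling. The names are shortened, hypothetical versions of the driver's identifiers, and `TXRATE_COUNT` is a sentinel added for the sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the TX rate enum after the patch; names are shortened. */
enum spdif_txrate_model {
    TXRATE_22050 = 0,
    TXRATE_32000,
    TXRATE_44100,
    TXRATE_48000,
    TXRATE_88200,
    TXRATE_96000,
    TXRATE_176400,
    TXRATE_192000,
    TXRATE_COUNT /* hypothetical sentinel, not in the patch */
};

/* Rate table in the same order as the enum. */
static const uint32_t rate_table[] = {
    22050, 32000, 44100, 48000, 88200, 96000, 176400, 192000,
};

/* Fails the build if the table and the enum ever get out of step. */
_Static_assert(sizeof(rate_table) / sizeof(rate_table[0]) == TXRATE_COUNT,
               "rate_table must have one entry per enum value");

/* Looks up the sample rate in Hz for an enum index. */
static uint32_t txrate_to_hz(enum spdif_txrate_model index)
{
    assert((int)index >= 0 && index < TXRATE_COUNT);
    return rate_table[index];
}
```

A compile-time size check like the one above catches a missing table entry, although it cannot catch entries in the wrong order.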

Re: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

2023-07-19 Thread Vishal Annapurve

On Tue, Jul 18, 2023 at 4:49 PM Sean Christopherson  wrote:
> ...
> +static int kvm_gmem_error_page(struct address_space *mapping, struct page *page)
> +{
> +   struct list_head *gmem_list = &mapping->private_list;
> +   struct kvm_memory_slot *slot;
> +   struct kvm_gmem *gmem;
> +   unsigned long index;
> +   pgoff_t start, end;
> +   gfn_t gfn;
> +
> +   filemap_invalidate_lock_shared(mapping);
> +
> +   start = page->index;
> +   end = start + thp_nr_pages(page);
> +
> +   list_for_each_entry(gmem, gmem_list, entry) {
> +   xa_for_each_range(&gmem->bindings, index, slot, start, end - 1) {
> +   for (gfn = start; gfn < end; gfn++) {
> +   if (WARN_ON_ONCE(gfn < slot->base_gfn ||
> +   gfn >= slot->base_gfn + slot->npages))
> +   continue;
> +
> +   /*
> +* FIXME: Tell userspace that the *private*
> +* memory encountered an error.
> +*/
> +   send_sig_mceerr(BUS_MCEERR_AR,
> +   (void __user *)gfn_to_hva_memslot(slot, gfn),
> +   PAGE_SHIFT, current);

Does it make sense to replicate what happens with MCE handling on
tmpfs backed guest memory:
1) Unmap gpa from guest
2) On the next guest EPT fault, exit to userspace to handle/log the
mce error for the gpa.

IIUC, such MCEs could be asynchronous and "current" might not always
be the intended recipient of this signal.

> +   }
> +   }
> +   }
> +
> +   filemap_invalidate_unlock_shared(mapping);
> +
> +   return 0;
> +}
> +

Re: [RFC PATCH 05/21] ubifs: Pass worst-case buffer size to compression routines

2023-07-19 Thread Zhihao Cheng


On 2023/7/19 16:33, Ard Biesheuvel wrote:

On Wed, 19 Jul 2023 at 00:38, Eric Biggers  wrote:


On Tue, Jul 18, 2023 at 02:58:31PM +0200, Ard Biesheuvel wrote:

Currently, the ubifs code allocates a worst case buffer size to
recompress a data node, but does not pass the size of that buffer to the
compression code. This means that the compression code will never use


I think you mean the 'out_len' which describes the length of 'buf' is
passed into ubifs_decompress, which affects the result of the
decompressor (e.g. lz4 uses the length to calculate the buffer end position).

So, we should pass the real length of 'buf'.

Reviewed-by: Zhihao Cheng 


the additional space, and might fail spuriously due to lack of space.

So let's multiply out_len by WORST_COMPR_FACTOR after allocating the
buffer. Doing so is guaranteed not to overflow, given that the preceding
kmalloc_array() call would have failed otherwise.

Signed-off-by: Ard Biesheuvel 
---
  fs/ubifs/journal.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index dc52ac0f4a345f30..4e5961878f336033 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -1493,6 +1493,8 @@ static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
   if (!buf)
   return -ENOMEM;

+ out_len *= WORST_COMPR_FACTOR;
+
   dlen = le32_to_cpu(dn->ch.len) - UBIFS_DATA_NODE_SZ;
   data_size = dn_size - UBIFS_DATA_NODE_SZ;
   compr_type = le16_to_cpu(dn->compr_type);


This looks like another case where data that would be expanded by compression
should just be stored uncompressed instead.

In fact, it seems that UBIFS does that already.  ubifs_compress() has this:

 /*
  * If the data compressed only slightly, it is better to leave it
  * uncompressed to improve read speed.
  */
 if (in_len - *out_len < UBIFS_MIN_COMPRESS_DIFF)
 goto no_compr;

So it's unclear why the WORST_COMPR_FACTOR thing is needed at all.



It is not. The buffer is used for decompression in the truncation
path, so none of this logic even matters. Even if the subsequent
recompression of the truncated data node could result in expansion
beyond the uncompressed size of the original data (which seems
impossible to me), increasing the size of this buffer would not help
as it is the input buffer for the compression not the output buffer.
.
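
The overflow argument in the original patch description quoted above (multiplying out_len after a successful kmalloc_array() cannot overflow) rests on the allocator refusing any request whose element count times element size overflows. A userspace analogue might look like the sketch below. `alloc_array_checked` is a hypothetical helper, not a kernel API, and it assumes a GCC or Clang compiler for `__builtin_mul_overflow`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Userspace analogue of an overflow-checked array allocation:
 * returns NULL when n * size does not fit in size_t, otherwise
 * allocates n * size bytes and reports the byte count in *total.
 */
static void *alloc_array_checked(size_t n, size_t size, size_t *total)
{
    size_t bytes;

    assert(total != NULL);
    if (__builtin_mul_overflow(n, size, &bytes))
        return NULL;
    *total = bytes;
    return malloc(bytes);
}
```

If the call succeeds, the product is known to be representable, so repeating the same multiplication afterwards is safe; that is the property the dropped patch relied on.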

Re: [v3 1/2] PCI: layerscape: Add support for Link down notification

2023-07-19 Thread Frank Li

On Wed, Jul 19, 2023 at 10:08:16PM +0200, Markus Elfring wrote:
> > Cover letter just annoys people here.
> 
> What do you think about advice from another information source?
> 
> See also:
> https://kernelnewbies.org/PatchSeries

"You may like to include a cover letter with your patch series."

Generally, I think a cover letter is needed only if it really
helps reviewers get the main idea of the patches.

Such as my ongoing patches (with a cover letter):
  https://lore.kernel.org/imx/ZLglBiSz0meJm5os@lizhi-Precision-Tower-5810/T/#t

A similar case without a cover letter, which was accepted:
 https://lore.kernel.org/imx/20230719063425.GE151430@dragon/T/#t

I don't think a cover letter really helps reviewers review these two patches.

I would rather get comments on real problems (such as typos).

It is just a waste of time to discuss whether a cover letter is needed here.

Frank

> 
> Regards,
> Markus

Re: [v3 1/2] PCI: layerscape: Add support for Link down notification

2023-07-19 Thread Markus Elfring

> Cover letter just annoys people here.

What do you think about advice from another information source?

See also:
https://kernelnewbies.org/PatchSeries

Regards,
Markus

Re: [PATCH] ASoC: fsl_spdif: Add support for 22.05 kHz sample rate

2023-07-19 Thread Mark Brown

On Wed, 19 Jul 2023 18:31:53 +0200, Matus Gajdos wrote:
> Add support for 22.05 kHz sample rate for TX.
> 
> 

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/1] ASoC: fsl_spdif: Add support for 22.05 kHz sample rate
  commit: 65bc25b8d0904e0aff66b1c3a9dd4c0dcb8efbf1

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

Re: [PATCH v3 1/2] PCI: layerscape: Add support for Link down notification

2023-07-19 Thread Frank Li

On Wed, Jul 19, 2023 at 09:27:23PM +0200, Markus Elfring wrote:
> > Add support to pass …
> 
> Why did you omit a cover letter for the discussed patch series once more?

Your comment was
"Will a cover letter become helpful also for the presented small patch series?"

According to my understanding it is optional. I don't think a cover letter will
help in this case. Patches 1 and 2 are independent of each other.

I sent them together just because that makes them easy to test at once.

The maintainer can pick either one individually.

Cover letter just annoys people here.

Frank

> 
> Do you care for consequences according to message threading?
> 
> Regards,
> Markus

Re: [PATCH v3 1/2] PCI: layerscape: Add support for Link down notification

2023-07-19 Thread Markus Elfring

> Add support to pass …

Why did you omit a cover letter for the discussed patch series once more?

Do you care for consequences according to message threading?

Regards,
Markus

Re: Kernel Crash Dump (kdump) broken with 6.5

2023-07-19 Thread Mahesh J Salgaonkar

On 2023-07-18 23:19:23 Tue, Michael Ellerman wrote:
> Mahesh J Salgaonkar  writes:
> > On 2023-07-17 20:15:53 Mon, Sachin Sant wrote:
> >> Kdump seems to be broken with 6.5 for ppc64le.
> >> 
> >> [ 14.200412] systemd[1]: Starting dracut pre-pivot and cleanup hook...
> >> [  OK  ] Started dracut pre-pivot and cleanup hook.
> >> Starting Kdump Vmcore Save Service...
> >> [ 14.231669] systemd[1]: Started dracut pre-pivot and cleanup hook.
> >> [ 14.231801] systemd[1]: Starting Kdump Vmcore Save Service...
> >> [ 14.341035] kdump.sh[297]: kdump: saving to 
> >> /sysroot//var/crash//127.0.0.1-2023-07-14-13:32:34/
> >> [ 14.350053] EXT4-fs (sda2): re-mounted 
> >> e971a335-1ef8-4295-ab4e-3940f28e53fc r/w. Quota mode: none.
> >> [ 14.345979] kdump.sh[297]: kdump: saving vmcore-dmesg.txt to 
> >> /sysroot//var/crash//127.0.0.1-2023-07-14-13:32:34/
> >> [ 14.348742] kdump.sh[331]: Cannot open /proc/vmcore: No such file or 
> >> directory
> >> [ 14.348845] kdump.sh[297]: kdump: saving vmcore-dmesg.txt failed
> >> [ 14.349014] kdump.sh[297]: kdump: saving vmcore
> >> [ 14.443422] kdump.sh[332]: open_dump_memory: Can't open the dump 
> >> memory(/proc/vmcore). No such file or directory
> >> [ 14.456413] kdump.sh[332]: makedumpfile Failed.
> >> [ 14.456662] kdump.sh[297]: kdump: saving vmcore failed, _exitcode:1
> >> [ 14.456822] kdump.sh[297]: kdump: saving the 
> >> /run/initramfs/kexec-dmesg.log to 
> >> /sysroot//var/crash//127.0.0.1-2023-07-14-13:32:34/
> >> [ 14.487002] kdump.sh[297]: kdump: saving vmcore failed
> >> [FAILED] Failed to start Kdump Vmcore Save Service.
> >
> > Thanks Sachin for catching this.
> >
> >> 
> >> 6.4 was good. Git bisect points to following patch
> >> 
> >> commit 606787fed7268feb256957872586370b56af697a
> >> powerpc/64s: Remove support for ELFv1 little endian userspace
> >> 
> >> Reverting this patch allows a successful capture of vmcore.
> >> 
> >> Does this change require any corresponding change to kdump
> >> and/or kexec tools?
> >
> > Need to investigate that. It looks like vmcore_elf64_check_arch()
> > check from fs/proc/vmcore.c is failing after above commit.
> >
> > static int __init parse_crash_elf64_headers(void)
> > {
> > [...]
> >
> > /* Do some basic Verification. */
> > if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
> > (ehdr.e_type != ET_CORE) ||
> > !vmcore_elf64_check_arch() ||
> > [...]
> 
> Where vmcore_elf64_check_arch() calls elf_check_arch(), which was
> modified by the commit, so that makes sense.
> 
> > It looks like ehdr->e_flags are not set properly while generating vmcore
> > ELF header. I see that in kexec_file_load, ehdr->e_flags left set to 0
> > irrespective of IS_ENABLED(CONFIG_PPC64_ELF_ABI_V2) is true or false.
> 
> Does initialising it in crash_prepare_elf64_headers() fix the issue?

Yes, the change below fixes the issue. We can't use the
IS_ENABLED(CONFIG_PPC64_ELF_ABI_V2) check in common code. I see that
fs/proc/kcore.c uses ELF_CORE_EFLAGS to set e_flags. I will send out a
formal patch.

From 2d12fe7dff5dfe9035a75b1fe8d7da7da3000b90 Mon Sep 17 00:00:00 2001
From: Mahesh Salgaonkar 
Date: Wed, 19 Jul 2023 20:36:37 +0530
Subject: [PATCH] kdump fix

Signed-off-by: Mahesh Salgaonkar 
---
 kernel/kexec_file.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 881ba0d1714cc..be51560509128 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -1279,6 +1279,7 @@ int crash_prepare_elf64_headers(struct crash_mem *mem, 
int need_kernel_map,
ehdr->e_phoff = sizeof(Elf64_Ehdr);
ehdr->e_ehsize = sizeof(Elf64_Ehdr);
ehdr->e_phentsize = sizeof(Elf64_Phdr);
+   ehdr->e_flags = ELF_CORE_EFLAGS;
 
/* Prepare one phdr of type PT_NOTE for each present CPU */
for_each_present_cpu(cpu) {
-- 
2.41.0

-- 
Mahesh J Salgaonkar
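
To make the failure mode concrete, here is a small userspace model of why a
header with e_flags left at 0 is rejected once the arch check insists on
ELFv2. It is a sketch under assumptions: the struct, the helper names and
the exact mask/value are illustrative stand-ins, and the real
elf_check_arch() and ELF_CORE_EFLAGS definitions should be checked in the
powerpc headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed constants, mirroring the generic ELF definitions. */
#define SKETCH_EM_PPC64		21	/* e_machine for 64-bit PowerPC */
#define SKETCH_EF_PPC64_ABI	0x3	/* ABI version bits in e_flags */

/* Hypothetical cut-down ELF header: only the fields the check reads. */
struct sketch_ehdr {
	uint16_t e_machine;
	uint32_t e_flags;
};

/* Simplified model of an arch check that accepts only ELFv2 images. */
static bool sketch_check_arch_elfv2(const struct sketch_ehdr *ehdr)
{
	return ehdr->e_machine == SKETCH_EM_PPC64 &&
	       (ehdr->e_flags & SKETCH_EF_PPC64_ABI) == 2;
}

/* Model of the header preparation, without and with the e_flags fix. */
static struct sketch_ehdr sketch_prepare_header(bool set_flags)
{
	struct sketch_ehdr ehdr = { .e_machine = SKETCH_EM_PPC64, .e_flags = 0 };

	if (set_flags)
		ehdr.e_flags = 2;	/* stands in for ELF_CORE_EFLAGS (assumed ELFv2) */
	return ehdr;
}

/* Usage example: the unfixed header fails the check, the fixed one passes. */
static void sketch_kdump_demo(void)
{
	struct sketch_ehdr before = sketch_prepare_header(false);
	struct sketch_ehdr after = sketch_prepare_header(true);

	assert(!sketch_check_arch_elfv2(&before));
	assert(sketch_check_arch_elfv2(&after));
}
```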

Re: [PATCH] ASoC: fsl_spdif: Silence output on stop

2023-07-19 Thread Fabio Estevam

On Wed, Jul 19, 2023 at 1:48 PM Matus Gajdos  wrote:
>
> Clear TX registers on stop to prevent the SPDIF interface from sending
> last written word over and over again.
>
> Fixes: a2388a498ad2 ("ASoC: fsl: Add S/PDIF CPU DAI driver")
> Signed-off-by: Matus Gajdos 

Reviewed-by: Fabio Estevam

Re: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

2023-07-19 Thread Sean Christopherson

On Wed, Jul 19, 2023, Vishal Annapurve wrote:
> On Tue, Jul 18, 2023 at 4:49 PM Sean Christopherson  wrote:
> > ...
> > +static int kvm_gmem_error_page(struct address_space *mapping, struct page 
> > *page)
> > +{
> > +   struct list_head *gmem_list = &mapping->private_list;
> > +   struct kvm_memory_slot *slot;
> > +   struct kvm_gmem *gmem;
> > +   unsigned long index;
> > +   pgoff_t start, end;
> > +   gfn_t gfn;
> > +
> > +   filemap_invalidate_lock_shared(mapping);
> > +
> > +   start = page->index;
> > +   end = start + thp_nr_pages(page);
> > +
> > +   list_for_each_entry(gmem, gmem_list, entry) {
> > +   xa_for_each_range(&gmem->bindings, index, slot, start, end 
> > - 1) {
> > +   for (gfn = start; gfn < end; gfn++) {
> > +   if (WARN_ON_ONCE(gfn < slot->base_gfn ||
> > +   gfn >= slot->base_gfn + 
> > slot->npages))
> > +   continue;
> > +
> > +   /*
> > +* FIXME: Tell userspace that the *private*
> > +* memory encountered an error.
> > +*/
> > +   send_sig_mceerr(BUS_MCEERR_AR,
> > +   (void __user 
> > *)gfn_to_hva_memslot(slot, gfn),
> > +   PAGE_SHIFT, current);
> 
> Does it make sense to replicate what happens with MCE handling on
> tmpfs backed guest memory:
> 1) Unmap gpa from guest
> 2) On the next guest EPT fault, exit to userspace to handle/log the
> mce error for the gpa.

Hmm, yes, that would be much better.  Ah, and kvm_gmem_get_pfn() needs to check
folio_test_hwpoison() and potentially PageHWPoison().  E.g. if the folio is huge,
KVM needs to restrict the mapping to order-0 (target page isn't poisoned), or
return KVM_PFN_ERR_HWPOISON (target page IS poisoned).

Alternatively, KVM could punch a hole in kvm_gmem_error_page(), but I don't think
we want to do that because that would prevent forwarding the #MC to the guest.
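
As a rough sketch of the decision described above (not actual KVM code; the
enum, the function and its two boolean inputs are hypothetical stand-ins for
the folio_test_hwpoison() and PageHWPoison() results):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes; the names are illustrative, not KVM's API. */
enum sketch_gmem_result {
	SKETCH_MAP_FULL_ORDER,	/* folio is clean, map at its natural order */
	SKETCH_MAP_ORDER_0,	/* poison elsewhere in the folio, map one page */
	SKETCH_ERR_HWPOISON,	/* the target page itself is poisoned */
};

/*
 * folio_has_poison     - some page in the (possibly huge) folio is poisoned
 * target_page_poisoned - the page backing the faulting gfn is poisoned
 */
static enum sketch_gmem_result
sketch_gmem_check_poison(bool folio_has_poison, bool target_page_poisoned)
{
	if (target_page_poisoned)
		return SKETCH_ERR_HWPOISON;
	if (folio_has_poison)
		return SKETCH_MAP_ORDER_0;
	return SKETCH_MAP_FULL_ORDER;
}

/* Usage example covering the three cases from the text. */
static void sketch_gmem_demo(void)
{
	assert(sketch_gmem_check_poison(false, false) == SKETCH_MAP_FULL_ORDER);
	assert(sketch_gmem_check_poison(true, false) == SKETCH_MAP_ORDER_0);
	assert(sketch_gmem_check_poison(true, true) == SKETCH_ERR_HWPOISON);
}
```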

Re: [PATCH V2 00/26] tools/perf: Fix shellcheck coding/formatting issues of perf tool shell scripts

2023-07-19 Thread Ian Rogers

On Tue, Jul 18, 2023 at 11:17 PM kajoljain  wrote:
>
> Hi,
>
> Looking for review comments on this patchset.
>
> Thanks,
> Kajol Jain
>
>
> On 7/9/23 23:57, Athira Rajeev wrote:
> > Patchset covers a set of fixes for coding/formatting issues observed while
> > running shellcheck tool on the perf shell scripts.
> >
> > This cleanup is a pre-requisite to include a build option for shellcheck
> > discussed here: https://www.spinics.net/lists/linux-perf-users/msg25553.html
> > First set of patches were posted here:
> > https://lore.kernel.org/linux-perf-users/53b7d823-1570-4289-a632-2205ee2b5...@linux.vnet.ibm.com/T/#t
> >
> > This patchset covers remaining set of shell scripts which needs
> > fix. Patch 1 is resubmission of patch 6 from the initial series.
> > Patch 15, 16 and 22 touches code from tools/perf/trace/beauty.
> > Other patches are fixes for scripts from tools/perf/tests.
> >
> > The shellcheck is run for severity level for errors and warnings.
> > Command used:
> >
> > # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S 
> > warning $F; done
> > # echo $?
> > 0
> >

I don't see anything objectionable in the changes so for the series:
Acked-by: Ian Rogers 

Some thoughts:
 - Adding "#!/bin/bash" to scripts in tools/perf/tests/lib - I think
we didn't do this to avoid these being included as tests. There are
now extra checks when finding shell tests, so I can imagine doing this
isn't a regression but just a heads up.
 - I think James' comment was addressed:
https://lore.kernel.org/linux-perf-users/334989bf-5501-494c-f246-81878fd2f...@arm.com/
 - Why aren't these changes being mailed to LKML? The wider community
on LKML have thoughts on shell scripts, plus it makes the changes miss
my mail filters.
 - Can we automate this testing into the build? For example, following
a similar kernel build pattern we run a python test and make the log
output a requirement here:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/Build?h=perf-tools-next#n30
   I think we can translate:
for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck
-S warning $F; done
   into a rule in make for log files that are then a dependency on the
perf binary. We can then parallel shellcheck during the build and
avoid regressions. We probably need a CONFIG_SHELLCHECK feature check
in the build to avoid not having shellcheck breaking the build.

Thanks,
Ian

> > Changelog:
> > v1 -> v2:
> >   - Rebased on top of perf-tools-next from:
> >   
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf-tools-next
> >
> >   - Fixed shellcheck errors and warnings reported for newly
> > added changes from perf-tools-next branch
> >
> >   - Addressed review comment from James clark for patch
> > number 13 from V1. The changes in patch 13 were not necessary
> > since the file "tests/shell/lib/coresight.sh" is sourced from
> > other test files.
> >
> > Akanksha J N (1):
> >   tools/perf/tests: Fix shellcheck warnings for
> > trace+probe_vfs_getname.sh
> >
> > Athira Rajeev (14):
> >   tools/perf/tests: fix test_arm_spe_fork.sh signal case issues
> >   tools/perf/tests: Fix unused variable references in
> > stat+csv_summary.sh testcase
> >   tools/perf/tests: fix shellcheck warning for
> > test_perf_data_converter_json.sh testcase
> >   tools/perf/tests: Fix shellcheck issue for stat_bpf_counters.sh
> > testcase
> >   tools/perf/tests: Fix shellcheck issues in
> > tests/shell/stat+shadow_stat.sh tetscase
> >   tools/perf/tests: Fix shellcheck warnings for
> > thread_loop_check_tid_10.sh
> >   tools/perf/tests: Fix shellcheck warnings for unroll_loop_thread_10.sh
> >   tools/perf/tests: Fix shellcheck warnings for lib/probe_vfs_getname.sh
> >   tools/perf/tests: Fix the shellcheck warnings in lib/waiting.sh
> >   tools/perf/trace: Fix x86_arch_prctl.sh to address shellcheck warnings
> >   tools/perf/arch/x86: Fix syscalltbl.sh to address shellcheck warnings
> >   tools/perf/tests/shell: Fix the shellcheck warnings in
> > record+zstd_comp_decomp.sh
> >   tools/perf/tests/shell: Fix shellcheck warning for stat+std_output.sh
> > testcase
> >   tools/perf/tests: Fix shellcheck warning for stat+std_output.sh
> > testcase
> >
> > Kajol Jain (11):
> >   tools/perf/tests: Fix shellcheck warning for probe_vfs_getname.sh
> > testcase
> >   tools/perf/tests: Fix shellcheck warning for record_offcpu.sh testcase
> >   tools/perf/tests: Fix shellcheck issue for lock_contention.sh testcase
> >   tools/perf/tests: Fix shellcheck issue for stat_bpf_counters_cgrp.sh
> > testcase
> >   tools/perf/tests: Fix shellcheck warning for asm_pure_loop.sh shell
> > script
> >   tools/perf/tests: Fix shellcheck warning for memcpy_thread_16k_10.sh
> > shell script
> >   tools/perf/tests: Fix shellcheck warning for probe.sh shell script
> >   tools/perf/trace: Fix shellcheck issue for arch_errno_names.sh

Re: [RFC PATCH v11 04/29] KVM: PPC: Drop dead code related to KVM_ARCH_WANT_MMU_NOTIFIER

2023-07-19 Thread Paolo Bonzini


On 7/19/23 01:44, Sean Christopherson wrote:

Signed-off-by: Sean Christopherson
---
  arch/powerpc/kvm/powerpc.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 7197c8256668..5cf9e5e3112a 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -634,10 +634,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long 
ext)
case KVM_CAP_SYNC_MMU:
  #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
r = hv_enabled;


This could actually be unnecessarily conservative.  Even book3s_pr.c 
knows how to do unmap and set_spte, so it should be able to support 
KVM_CAP_SYNC_MMU.  Alex, Nick, do you remember any of this?  This would 
mean moving KVM_CAP_SYNC_MMU to virt/kvm/kvm_main.c, which is nice.


Paolo


-#elif defined(KVM_ARCH_WANT_MMU_NOTIFIER)
-   r = 1;
  #else
-   r = 0;
+#ifndef KVM_ARCH_WANT_MMU_NOTIFIER
+   BUILD_BUG();
+#endif
+   r = 1;
  #endif
break;
  #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE

Re: [RFC PATCH v11 02/29] KVM: Tweak kvm_hva_range and hva_handler_t to allow reusing for gfn ranges

2023-07-19 Thread Paolo Bonzini


On 7/19/23 01:44, Sean Christopherson wrote:

Signed-off-by: Sean Christopherson
---
  virt/kvm/kvm_main.c | 34 +++---
  1 file changed, 19 insertions(+), 15 deletions(-)


Reviewed-by: Paolo Bonzini

Re: [RFC PATCH v11 03/29] KVM: Use gfn instead of hva for mmu_notifier_retry

2023-07-19 Thread Paolo Bonzini


On 7/19/23 01:44, Sean Christopherson wrote:

From: Chao Peng

Currently in mmu_notifier invalidate path, hva range is recorded and
then checked against by mmu_notifier_retry_hva() in the page fault
handling path. However, for the to be introduced private memory, a page
fault may not have a hva associated, checking gfn(gpa) makes more sense.

For existing hva based shared memory, gfn is expected to also work. The
only downside is when aliasing multiple gfns to a single hva, the
current algorithm of checking multiple ranges could result in a much
larger range being rejected. Such aliasing should be uncommon, so the
impact is expected small.
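
One way to picture the gfn-based check and its stated downside is the
userspace model below. It is only a sketch under assumptions: the struct and
function names are invented for the example, locking is omitted, and the
invalidation range is tracked as a single [start, end) span, which is why a
gfn lying between two invalidated ranges is also rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sketch_gfn_t;

/* Hypothetical subset of the VM state used by the retry check. */
struct sketch_kvm {
	unsigned long invalidate_seq;
	long invalidate_in_progress;
	sketch_gfn_t range_start;	/* inclusive */
	sketch_gfn_t range_end;		/* exclusive */
};

static void sketch_invalidate_begin(struct sketch_kvm *kvm)
{
	kvm->invalidate_in_progress++;
	if (kvm->invalidate_in_progress == 1) {
		/* First invalidation in flight: start from an empty span. */
		kvm->range_start = UINT64_MAX;
		kvm->range_end = 0;
	}
}

/* Grow the single tracked span so that it covers [start, end). */
static void sketch_invalidate_range_add(struct sketch_kvm *kvm,
					sketch_gfn_t start, sketch_gfn_t end)
{
	if (start < kvm->range_start)
		kvm->range_start = start;
	if (end > kvm->range_end)
		kvm->range_end = end;
}

static void sketch_invalidate_end(struct sketch_kvm *kvm)
{
	kvm->invalidate_seq++;
	kvm->invalidate_in_progress--;
}

/* Returns true if the page fault must be retried. */
static bool sketch_retry_gfn(const struct sketch_kvm *kvm,
			     unsigned long seq_snapshot, sketch_gfn_t gfn)
{
	if (kvm->invalidate_in_progress &&
	    gfn >= kvm->range_start && gfn < kvm->range_end)
		return true;
	return kvm->invalidate_seq != seq_snapshot;
}

/* Usage example: two disjoint ranges are merged into one larger span. */
static void sketch_retry_demo(void)
{
	struct sketch_kvm kvm = { 0 };
	unsigned long seq = kvm.invalidate_seq;

	sketch_invalidate_begin(&kvm);
	sketch_invalidate_range_add(&kvm, 10, 20);
	sketch_invalidate_range_add(&kvm, 100, 110);

	assert(sketch_retry_gfn(&kvm, seq, 15));	/* inside first range */
	assert(sketch_retry_gfn(&kvm, seq, 50));	/* in neither, still rejected */
	assert(!sketch_retry_gfn(&kvm, seq, 200));	/* outside the merged span */

	sketch_invalidate_end(&kvm);
	assert(sketch_retry_gfn(&kvm, seq, 200));	/* sequence count moved on */
}
```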


Reviewed-by: Paolo Bonzini 


Suggested-by: Sean Christopherson
Signed-off-by: Chao Peng
Reviewed-by: Fuad Tabba
Tested-by: Fuad Tabba
[sean: convert vmx_set_apic_access_page_addr() to gfn-based API]
Signed-off-by: Sean Christopherson
---
  arch/x86/kvm/mmu/mmu.c   | 10 ++
  arch/x86/kvm/vmx/vmx.c   | 11 +--
  include/linux/kvm_host.h | 33 +
  virt/kvm/kvm_main.c  | 40 +++-
  4 files changed, 63 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d72f2b20f430..b034727c4cf9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3087,7 +3087,7 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, 
u64 *sptep)
   *
   * There are several ways to safely use this helper:
   *
- * - Check mmu_invalidate_retry_hva() after grabbing the mapping level, before
+ * - Check mmu_invalidate_retry_gfn() after grabbing the mapping level, before
   *   consuming it.  In this case, mmu_lock doesn't need to be held during the
   *   lookup, but it does need to be held while checking the MMU notifier.
   *
@@ -4400,7 +4400,7 @@ static bool is_page_fault_stale(struct kvm_vcpu *vcpu,
return true;
  
  	return fault->slot &&

-  mmu_invalidate_retry_hva(vcpu->kvm, fault->mmu_seq, fault->hva);
+  mmu_invalidate_retry_gfn(vcpu->kvm, fault->mmu_seq, fault->gfn);
  }
  
  static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)

@@ -6301,7 +6301,9 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, 
gfn_t gfn_end)
  
  	write_lock(&kvm->mmu_lock);
  
-	kvm_mmu_invalidate_begin(kvm, 0, -1ul);

+   kvm_mmu_invalidate_begin(kvm);
+
+   kvm_mmu_invalidate_range_add(kvm, gfn_start, gfn_end);
  
  	flush = kvm_rmap_zap_gfn_range(kvm, gfn_start, gfn_end);
  
@@ -6314,7 +6316,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)

if (flush)
kvm_flush_remote_tlbs_range(kvm, gfn_start, gfn_end - 
gfn_start);
  
-	kvm_mmu_invalidate_end(kvm, 0, -1ul);

+   kvm_mmu_invalidate_end(kvm);
  
  	write_unlock(&kvm->mmu_lock);

  }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0ecf4be2c6af..946380b53cf5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6729,10 +6729,10 @@ static void vmx_set_apic_access_page_addr(struct 
kvm_vcpu *vcpu)
return;
  
  	/*

-* Grab the memslot so that the hva lookup for the mmu_notifier retry
-* is guaranteed to use the same memslot as the pfn lookup, i.e. rely
-* on the pfn lookup's validation of the memslot to ensure a valid hva
-* is used for the retry check.
+* Explicitly grab the memslot using KVM's internal slot ID to ensure
+* KVM doesn't unintentionally grab a userspace memslot.  It _should_
+* be impossible for userspace to create a memslot for the APIC when
+* APICv is enabled, but paranoia won't hurt in this case.
 */
slot = id_to_memslot(slots, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT);
if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
@@ -6757,8 +6757,7 @@ static void vmx_set_apic_access_page_addr(struct kvm_vcpu 
*vcpu)
return;
  
  	read_lock(&vcpu->kvm->mmu_lock);

-   if (mmu_invalidate_retry_hva(kvm, mmu_seq,
-gfn_to_hva_memslot(slot, gfn))) {
+   if (mmu_invalidate_retry_gfn(kvm, mmu_seq, gfn)) {
kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
read_unlock(&vcpu->kvm->mmu_lock);
goto out;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b901571ab61e..90a0be261a5c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -788,8 +788,8 @@ struct kvm {
struct mmu_notifier mmu_notifier;
unsigned long mmu_invalidate_seq;
long mmu_invalidate_in_progress;
-   unsigned long mmu_invalidate_range_start;
-   unsigned long mmu_invalidate_range_end;
+   gfn_t mmu_invalidate_range_start;
+   gfn_t mmu_invalidate_range_end;
  #endif
struct list_head devices;
u64 manual_dirty_log_protect;
@@ -1371,10 +1371,9 @@ void kvm_mmu_free_memory_cache(struct 
kvm_mmu_memory_cache *mc);
  void

Re: [PATCH] powerpc/build: vdso linker warning for orphan sections

2023-07-19 Thread John Ogness

Hi Michael,

On 2023-07-19, Michael Ellerman  wrote:
> I regularly test with a gcc 5.5.0 / ld 2.29 toolchain and gcc 13.1.1 /
> ld 2.39, and I haven't seen the warning. I tried a bunch of others and
> can't reproduce it.

I will send my config in a separate email (without the lists in
CC). Building the vdso_prepare target is all that is needed.

> Can you confirm that this makes the warning go away?
>
> diff --git a/arch/powerpc/kernel/vdso/vdso64.lds.S 
> b/arch/powerpc/kernel/vdso/vdso64.lds.S
> index bda6c8cdd459..286e1597c548 100644
> --- a/arch/powerpc/kernel/vdso/vdso64.lds.S
> +++ b/arch/powerpc/kernel/vdso/vdso64.lds.S
> @@ -85,7 +85,7 @@ SECTIONS
>   *(.branch_lt)
>   *(.data .data.* .gnu.linkonce.d.* .sdata*)
>   *(.bss .sbss .dynbss .dynsbss)
> - *(.opd)
> + *(.opd .rela.opd)
>   *(.glink .iplt .plt .rela*)

Hmmm. Not sure what that would change. And indeed it does not make the
warning go away.

Doing some testing it seems that previously .rela.opd was being silently
placed in the .rela.dyn section. So doing that explicitly obviously gets
rid of the warning:

Index: linux-6.5-rc2/arch/powerpc/kernel/vdso/vdso64.lds.S
===
--- linux-6.5-rc2.orig/arch/powerpc/kernel/vdso/vdso64.lds.S
+++ linux-6.5-rc2/arch/powerpc/kernel/vdso/vdso64.lds.S
@@ -69,7 +69,7 @@ SECTIONS
.eh_frame_hdr   : { *(.eh_frame_hdr) }  :text   :eh_frame_hdr
.eh_frame   : { KEEP (*(.eh_frame)) }   :text
.gcc_except_table : { *(.gcc_except_table) }
-   .rela.dyn ALIGN(8) : { *(.rela.dyn) }
+   .rela.dyn ALIGN(8) : { *(.rela.dyn) *(.rela.opd) }
 
.got ALIGN(8)   : { *(.got .toc) }

But if the goal is to get rid of .rela.opd then the question is: why is
the linker complaining about it being discarded?

John

Re: [RFC PATCH v11 01/29] KVM: Wrap kvm_gfn_range.pte in a per-action union

2023-07-19 Thread Paolo Bonzini


On 7/19/23 01:44, Sean Christopherson wrote:

+   BUILD_BUG_ON(sizeof(gfn_range.arg) != sizeof(gfn_range.arg.raw));
+   BUILD_BUG_ON(sizeof(range->arg) != sizeof(range->arg.raw));


I think these should be static assertions near the definition of the 
structs.  However another possibility is to remove 'raw' and just assign 
the whole union.


Apart from this,

Reviewed-by: Paolo Bonzini 

Paolo


+   BUILD_BUG_ON(sizeof(gfn_range.arg) != sizeof(range->arg));
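
For reference, both suggestions can be sketched in standalone C11 as below.
The type and function names are hypothetical, and sketch_pte_t merely stands
in for the architecture-specific pte_t; in the kernel the assertion would
presumably use the static_assert() macro rather than the bare keyword.

```c
#include <assert.h>

/* Hypothetical stand-in for pte_t; the real type is architecture specific. */
typedef struct { unsigned long val; } sketch_pte_t;

union sketch_mmu_notifier_arg {
	sketch_pte_t pte;
	unsigned long raw;
};

/* Option 1: a static assertion placed next to the definition it guards. */
_Static_assert(sizeof(union sketch_mmu_notifier_arg) == sizeof(unsigned long),
	       "raw must cover every member of the union");

struct sketch_range {
	union sketch_mmu_notifier_arg arg;
};

/* Option 2: assign the whole union, so no 'raw' alias is needed for copying. */
static void sketch_copy_arg(struct sketch_range *dst,
			    const struct sketch_range *src)
{
	dst->arg = src->arg;
}

/* Usage example: the payload survives a whole-union copy. */
static void sketch_arg_demo(void)
{
	struct sketch_range src = { .arg.pte.val = 0x1234 };
	struct sketch_range dst = { .arg.raw = 0 };

	sketch_copy_arg(&dst, &src);
	assert(dst.arg.pte.val == 0x1234);
}
```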

Re: [PATCH] ASoC: fsl_spdif: Add support for 22.05 kHz sample rate

2023-07-19 Thread Fabio Estevam

On Wed, Jul 19, 2023 at 1:32 PM Matus Gajdos  wrote:
>
> Add support for 22.05 kHz sample rate for TX.
>
> Signed-off-by: Matus Gajdos 

Reviewed-by: Fabio Estevam

Re: Kernel Crash Dump (kdump) broken with 6.5

2023-07-19 Thread Linux regression tracking #adding (Thorsten Leemhuis)

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 17.07.23 16:45, Sachin Sant wrote:
> Kdump seems to be broken with 6.5 for ppc64le.
> [...]
> 
> 6.4 was good. Git bisect points to following patch
> 
> commit 606787fed7268feb256957872586370b56af697a
> powerpc/64s: Remove support for ELFv1 little endian userspace
> 
> Reverting this patch allows a successful capture of vmcore.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 606787fed7268feb256957872586370b56af69
#regzbot title powerpc/64s: Crash Dump (kdump) broken with 6.5
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

[PATCH v3 2/2] PCI: layerscape: Add the workaround for lost link capabilities during reset.

2023-07-19 Thread Frank Li

From: Xiaowei Bao 

Add a workaround for the issue where the PCI Express Endpoint (EP) controller
loses the values of the Maximum Link Width and Supported Link Speed from
the Link Capabilities Register, which are initially configured by the Reset
Configuration Word (RCW), during a link-down or hot reset event.

Fixes: a805770d8a22 ("PCI: layerscape: Add EP mode support")
Acked-by: Manivannan Sadhasivam 
Signed-off-by: Xiaowei Bao 
Signed-off-by: Hou Zhiqiang 
Signed-off-by: Frank Li 
---
change from v2 to v3
 - fix subject typo capabilities
change from v1 to v2:
 - add comments at restore register
 - add fixes tag
 - dw_pcie_writew_dbi to dw_pcie_writel_dbi

 .../pci/controller/dwc/pci-layerscape-ep.c| 21 ++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c 
b/drivers/pci/controller/dwc/pci-layerscape-ep.c
index e0969ff2ddf7..39dbd911c3f8 100644
--- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
+++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
@@ -45,6 +45,7 @@ struct ls_pcie_ep {
struct pci_epc_features *ls_epc;
const struct ls_pcie_ep_drvdata *drvdata;
int irq;
+   u32 lnkcap;
bool big_endian;
 };
 
@@ -73,6 +74,7 @@ static irqreturn_t ls_pcie_ep_event_handler(int irq, void 
*dev_id)
struct ls_pcie_ep *pcie = dev_id;
struct dw_pcie *pci = pcie->pci;
u32 val, cfg;
+   u8 offset;
 
val = ls_lut_readl(pcie, PEX_PF0_PME_MES_DR);
ls_lut_writel(pcie, PEX_PF0_PME_MES_DR, val);
@@ -81,12 +83,25 @@ static irqreturn_t ls_pcie_ep_event_handler(int irq, void 
*dev_id)
return IRQ_NONE;
 
if (val & PEX_PF0_PME_MES_DR_LUD) {
+
+   offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
+
+   /*
+* The values of the Maximum Link Width and Supported Link
+* Speed from the Link Capabilities Register will be lost
+* during link down or hot reset. Restore initial value
+* that configured by the Reset Configuration Word (RCW).
+*/
+   dw_pcie_dbi_ro_wr_en(pci);
+   dw_pcie_writel_dbi(pci, offset + PCI_EXP_LNKCAP, pcie->lnkcap);
+   dw_pcie_dbi_ro_wr_dis(pci);
+
cfg = ls_lut_readl(pcie, PEX_PF0_CONFIG);
cfg |= PEX_PF0_CFG_READY;
ls_lut_writel(pcie, PEX_PF0_CONFIG, cfg);
dw_pcie_ep_linkup(&pci->ep);
 
-   dev_dbg(pci->dev, "Link up\n");
+   dev_err(pci->dev, "Link up\n");
} else if (val & PEX_PF0_PME_MES_DR_LDD) {
dev_dbg(pci->dev, "Link down\n");
pci_epc_linkdown(pci->ep.epc);
@@ -216,6 +231,7 @@ static int __init ls_pcie_ep_probe(struct platform_device 
*pdev)
struct ls_pcie_ep *pcie;
struct pci_epc_features *ls_epc;
struct resource *dbi_base;
+   u8 offset;
int ret;
 
pcie = devm_kzalloc(dev, sizeof(*pcie), GFP_KERNEL);
@@ -252,6 +268,9 @@ static int __init ls_pcie_ep_probe(struct platform_device 
*pdev)
 
platform_set_drvdata(pdev, pcie);
 
+   offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
+   pcie->lnkcap = dw_pcie_readl_dbi(pci, offset + PCI_EXP_LNKCAP);
+
ret = dw_pcie_ep_init(&pci->ep);
if (ret)
return ret;
-- 
2.34.1

[PATCH v3 1/2] PCI: layerscape: Add support for Link down notification

2023-07-19 Thread Frank Li

Add support to pass Link down notification to Endpoint function driver
so that the LINK_DOWN event can be processed by the function.

Acked-by: Manivannan Sadhasivam 
Signed-off-by: Frank Li 
---
Change from v2 to v3
 - none
Change from v1 to v2
 - move pci_epc_linkdown() after dev_dbg()

 drivers/pci/controller/dwc/pci-layerscape-ep.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c 
b/drivers/pci/controller/dwc/pci-layerscape-ep.c
index de4c1758a6c3..e0969ff2ddf7 100644
--- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
+++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
@@ -89,6 +89,7 @@ static irqreturn_t ls_pcie_ep_event_handler(int irq, void 
*dev_id)
dev_dbg(pci->dev, "Link up\n");
} else if (val & PEX_PF0_PME_MES_DR_LDD) {
dev_dbg(pci->dev, "Link down\n");
+   pci_epc_linkdown(pci->ep.epc);
} else if (val & PEX_PF0_PME_MES_DR_HRD) {
dev_dbg(pci->dev, "Hot reset\n");
}
-- 
2.34.1

Re: [RFC PATCH v11 01/29] KVM: Wrap kvm_gfn_range.pte in a per-action union

2023-07-19 Thread Sean Christopherson

On Wed, Jul 19, 2023, Jarkko Sakkinen wrote:
> On Wed Jul 19, 2023 at 2:44 AM EEST, Sean Christopherson wrote:
> > /* Huge pages aren't expected to be modified without first being 
> > zapped. */
> > -   WARN_ON(pte_huge(range->pte) || range->start + 1 != range->end);
> > +   WARN_ON(pte_huge(range->arg.pte) || range->start + 1 != range->end);
> 
> Not familiar with this code. Just checking whether instead
> pr_{warn,err}()

The "full" WARN is desirable, this is effectively an assert on the contract between
the primary MMU, generic KVM code, and x86's TDP MMU.  The .change_pte() mmu_notifier
callback doesn't allow for hugepages, i.e. it's a (likely fatal) kernel bug if a
hugepage is encountered at this point.  Ditto for the "start + 1 == end" check,
if that fails then generic KVM likely has a fatal bug.

> combined with return false would be a more graceful option?

The return value communicates whether or not a TLB flush is needed, not whether
or not the operation was successful, i.e. there is no way to cancel the unexpected
PTE change.

Re: [PATCH v2 0/9] video: screen_info cleanups

2023-07-19 Thread Helge Deller


On 7/19/23 14:39, Arnd Bergmann wrote:

From: Arnd Bergmann 

I refreshed the first four patches that I sent before with very minor
updates, and then added some more to further disaggregate the use
of screen_info:

  - I found that powerpc wasn't using vga16fb any more

  - vgacon can be almost entirely separated from the global
screen_info, except on x86

  - similarly, the EFI framebuffer initialization can be
kept separate, except on x86.


Nice cleanup, Arnd!

You may add a
Acked-by: Helge Deller 
to the series.



I did extensive build testing on arm/arm64/x86 and the normal build bot
testing for the other architectures.



Which tree should this get merged through?


I suggest drm-misc or fbdev. Either is fine for me.

Since it applies cleanly onto git head, I can put it into the fbdev git tree
for a few days to see if any builds break. Just let me know.

Helge

Re: [PATCH v2 5/9] vgacon: remove screen_info dependency

2023-07-19 Thread Arnd Bergmann

On Wed, Jul 19, 2023, at 15:49, Philippe Mathieu-Daudé wrote:
> On 19/7/23 14:39, Arnd Bergmann wrote:

>> @@ -1074,13 +1077,13 @@ static int vgacon_resize(struct vc_data *c, unsigned 
>> int width,
>>   * Ho ho!  Someone (svgatextmode, eh?) may have reprogrammed
>>   * the video mode!  Set the new defaults then and go away.
>>   */
>> -screen_info.orig_video_cols = width;
>> -screen_info.orig_video_lines = height;
>> +vga_si->orig_video_cols = width;
>> +vga_si->orig_video_lines = height;
>>  vga_default_font_height = c->vc_cell_height;
>>  return 0;
>>  }
>> -if (width % 2 || width > screen_info.orig_video_cols ||
>> -height > (screen_info.orig_video_lines * vga_default_font_height)/
>> +if (width % 2 || width > vga_si->orig_video_cols ||
>> +height > (vga_si->orig_video_lines * vga_default_font_height)/
>>  c->vc_cell_height)
>>  return -EINVAL;
>>   
>> @@ -1110,8 +1113,8 @@ static void vgacon_save_screen(struct vc_data *c)
>>   * console initialization routines.
>>   */
>>  vga_bootup_console = 1;
>> -c->state.x = screen_info.orig_x;
>> -c->state.y = screen_info.orig_y;
>> +c->state.x = vga_si->orig_x;
>> +c->state.y = vga_si->orig_y;
>
> Not really my area, so bear with me if this is obviously not
> possible :) If using DUMMY_CONSOLE, can we trigger a save_screen
> / resize? If so, we'd reach here with vga_si=NULL.
>

I think it cannot happen because the only way that anything calls
into vgacon.c is through the "conswitchp = &vga_con;" that now happens
at the same time as the "vga_si = &screen_info;". It's definitely
possible that I'm missing something as well here.
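
The argument can be sketched in userspace C as follows. This is only a model
under stated assumptions: the struct definitions are cut-down stand-ins, and
save_cursor_x() is a hypothetical callback that represents any vgacon entry
point. Only the body of vgacon_register_screen() follows the patch:

```c
#include <stddef.h>

/* Cut-down, hypothetical stand-ins for the kernel types. */
struct screen_info { unsigned char orig_x, orig_y; };
struct consw { const char *name; };

static const struct consw vga_con = { "vga" };
static const struct consw *conswitchp;	/* currently selected console driver */
static struct screen_info *vga_si;	/* data used by the vgacon callbacks */

/* Same logic as in the patch: both pointers are set together or not at all. */
static void vgacon_register_screen(struct screen_info *si)
{
	if (!si || vga_si)
		return;

	conswitchp = &vga_con;
	vga_si = si;
}

/* Hypothetical callback: only reachable once vga_con has been selected. */
static int save_cursor_x(void)
{
	if (conswitchp != &vga_con)
		return -1;	/* vgacon callbacks are never invoked */

	return vga_si->orig_x;	/* vga_si is non-NULL whenever we get here */
}
```

In this model a callback can only see vga_si == NULL if something other than
vgacon_register_screen() selects vga_con, which is exactly the case the
question above is probing for.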

 Arnd

Re: [RFC PATCH 05/21] ubifs: Pass worst-case buffer size to compression routines

2023-07-19 Thread Ard Biesheuvel

On Wed, 19 Jul 2023 at 16:23, Zhihao Cheng  wrote:
>
> 在 2023/7/19 16:33, Ard Biesheuvel 写道:
> > On Wed, 19 Jul 2023 at 00:38, Eric Biggers  wrote:
> >>
> >> On Tue, Jul 18, 2023 at 02:58:31PM +0200, Ard Biesheuvel wrote:
> >>> Currently, the ubifs code allocates a worst case buffer size to
> >>> recompress a data node, but does not pass the size of that buffer to the
> >>> compression code. This means that the compression code will never use
>
> I think you mean the 'out_len' which describes the length of 'buf' is
> passed into ubifs_decompress, which affects the result of the
> decompressor (e.g. lz4 uses the length to calculate the buffer end position).
> So, we should pass the real length of 'buf'.
>

Yes, that is what I meant.

But Eric makes a good point, and looking a bit more closely, there is
really no need for the multiplication here: we know the size of the
decompressed data, so we don't need the additional space.

I intend to drop this patch, and replace it with the following:

8<--

Currently, when truncating a data node, a decompression buffer is
allocated that is twice the size of the data node's uncompressed size.
However, the fact that this space is available is not communicated to
the compression routines, as out_len itself is not updated.

The additional space is not needed even in the theoretical worst case
where compression might lead to inadvertent expansion: first of all,
increasing the size of the input buffer does not help mitigate that
issue. And given the truncation of the data node and the fact that the
original data compressed well enough to pass the UBIFS_MIN_COMPRESS_DIFF
test, there is no way on this particular code path that compression
could result in expansion beyond the original decompressed size, and so
no mitigation is necessary to begin with.

So let's just drop WORST_COMPR_FACTOR here.

Signed-off-by: Ard Biesheuvel 

diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index dc52ac0f4a345f30..0b55cbfe0c30505e 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -1489,7 +1489,7 @@ static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
int err, dlen, compr_type, out_len, data_size;

out_len = le32_to_cpu(dn->size);
-   buf = kmalloc_array(out_len, WORST_COMPR_FACTOR, GFP_NOFS);
+   buf = kmalloc(out_len, GFP_NOFS);
if (!buf)
return -ENOMEM;
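
For illustration only, the out_len point can be shown with a toy in/out length
convention. toy_compress() below is a hypothetical run-length encoder and not
the UBIFS or crypto API; it merely assumes the same convention, where *out_len
carries the buffer capacity on entry and the produced size on return:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toy run-length encoder that emits (count, byte) pairs.
 *
 * On entry, *out_len is the capacity of 'out'; on success it is updated to
 * the number of bytes written.  Returns 0 on success and -1 if the output
 * does not fit, in which case *out_len is left untouched.
 */
static int toy_compress(const uint8_t *in, size_t in_len,
			uint8_t *out, size_t *out_len)
{
	size_t cap = *out_len, used = 0, i = 0;

	while (i < in_len) {
		uint8_t byte = in[i];
		size_t run = 1;

		/* Extend the run, capped at what fits in one count byte. */
		while (i + run < in_len && in[i + run] == byte && run < 255)
			run++;

		/* The routine only knows about 'cap', not the real allocation. */
		if (used + 2 > cap)
			return -1;

		out[used++] = (uint8_t)run;
		out[used++] = byte;
		i += run;
	}

	*out_len = used;
	return 0;
}
```

If a caller allocates a buffer twice as large but still passes the smaller
value in *out_len, the routine fails exactly as it would with the smaller
buffer, so the extra allocation buys nothing.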

Re: [PATCH v2 9/9] efi: move screen_info into efi init code

2023-07-19 Thread Javier Martinez Canillas

Arnd Bergmann  writes:

> From: Arnd Bergmann 
>
> After the vga console no longer relies on global screen_info, there are
> only two remaining use cases:
>
>  - on the x86 architecture, it is used for multiple boot methods
>(bzImage, EFI, Xen, kexec) to commicate the initial VGA or framebuffer

communicate

>settings to a number of device drivers.
>
>  - on other architectures, it is only used as part of the EFI stub,
>and only for the three sysfb framebuffers (simpledrm, simplefb, efifb).
>
> Remove the duplicate data structure definitions by moving it into the
> efi-init.c file that sets it up initially for the EFI case, leaving x86
> as an exception that retains its own definition for non-EFI boots.
>
> The added #ifdefs here are optional, I added them to further limit the
> reach of screen_info to configurations that have at least one of the
> users enabled.
>
> Signed-off-by: Arnd Bergmann 
> ---

Reviewed-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat

Re: [PATCH v3 07/13] s390: add pte_free_defer() for pgtables sharing page

2023-07-19 Thread Claudio Imbrenda

On Tue, 11 Jul 2023 21:38:35 -0700 (PDT)
Hugh Dickins  wrote:

[...]

> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +void pte_free_defer(struct mm_struct *mm, pgtable_t pgtable)
> +{
> + struct page *page;
> +
> + page = virt_to_page(pgtable);
> + SetPageActive(page);
> + page_table_free(mm, (unsigned long *)pgtable);
> + /*
> +  * page_table_free() does not do the pgste gmap_unlink() which
> +  * page_table_free_rcu() does: warn us if pgste ever reaches here.
> +  */
> + WARN_ON_ONCE(mm_alloc_pgste(mm));

it seems I have overlooked something when we previously discussed
this...

mm_alloc_pgste() is true for all processes that have PGSTEs, not only
for processes that can run guests.

There are two ways to enable PGSTEs: an ELF header bit, and a sysctl
knob.

The ELF bit is only used by qemu, it enables PGSTE allocation only for
that single process. This is a strong indication that the process wants
to run guests.

The sysctl knob enables PGSTE allocation for every process in the system
from that moment on. In that case, the WARN_ON_ONCE would be triggered
when not necessary.

There is however another way to check if a process is actually
__using__ the PGSTEs, a.k.a. if the process is actually capable of
running guests.

Confusingly, the name of that function is mm_has_pgste(). This confused
me as well, which is why I didn't notice it when we discussed this
previously :)

in short: can you please use mm_has_pgste() instead of mm_alloc_pgste()
in the WARN_ON_ONCE ?
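
A small model of the difference, assuming the semantics described above. The
struct and the model_* helpers are hypothetical and are not the s390
definitions:

```c
#include <stdbool.h>

/* Hypothetical model of the two per-mm states discussed above. */
struct mm_model {
	bool alloc_pgste;	/* PGSTEs get allocated (ELF bit or sysctl) */
	bool has_pgste;		/* PGSTEs are really in use, can run guests */
};

/* Models mm_alloc_pgste(): true as soon as PGSTEs are allocated. */
static bool model_alloc_pgste(const struct mm_model *mm)
{
	return mm->alloc_pgste;
}

/* Models mm_has_pgste(): true only if the process can run guests. */
static bool model_has_pgste(const struct mm_model *mm)
{
	return mm->has_pgste;
}

/* An ordinary process started after the sysctl knob was enabled. */
static const struct mm_model sysctl_process = {
	.alloc_pgste = true,
	.has_pgste = false,
};

/* A process that runs guests, e.g. qemu with the ELF header bit. */
static const struct mm_model guest_process = {
	.alloc_pgste = true,
	.has_pgste = true,
};
```

With the alloc check, sysctl_process would trip the warning even though it
never runs a guest; with the has check, only guest_process would.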

> +}
> +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
> +
>  /*
>   * Base infrastructure required to generate basic asces, region, segment,
>   * and page tables that do not make use of enhanced features like EDAT1.

Re: [PATCH v2 8/9] hyperv: avoid dependency on screen_info

2023-07-19 Thread Javier Martinez Canillas

Arnd Bergmann  writes:

> From: Arnd Bergmann 
>
> The two hyperv framebuffer drivers (hyperv_fb or hyperv_drm_drv) access the
> global screen_info in order to take over from the sysfb framebuffer, which
> in turn could be handled by simplefb, simpledrm or efifb. Similarly, the
> vmbus_drv code marks the original EFI framebuffer as reserved, but this
> is not required if there is no sysfb.
>
> As a preparation for making screen_info itself more local to the sysfb
> helper code, add a compile-time conditional in all three files that relate
> to hyperv fb and just skip this code if there is no sysfb that needs to
> be unregistered.
>
> Signed-off-by: Arnd Bergmann 
> ---

Reviewed-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat

Re: [PATCH v2 7/9] vga16fb: drop powerpc support

2023-07-19 Thread Javier Martinez Canillas

Arnd Bergmann  writes:

> From: Arnd Bergmann 
>
> I noticed that commit 0db5b61e0dc07 ("fbdev/vga16fb: Create
> EGA/VGA devices in sysfb code") broke vga16fb on non-x86 platforms,
> because the sysfb code never creates a vga-framebuffer device when
> screen_info.orig_video_isVGA is set to '1' instead of VIDEO_TYPE_VGAC.
>
> However, it turns out that the only architecture that has allowed
> building vga16fb in the past 20 years is powerpc, and this only worked
> on two 32-bit platforms and never on 64-bit powerpc. The last machine
> that actually used this was removed in linux-3.10, so this is all dead
> code and can be removed.
>
> The big-endian support in vga16fb.c could also be removed, but I'd just
> leave this in place.
>
> Fixes: 933ee7119fb14 ("powerpc: remove PReP platform")
> Signed-off-by: Arnd Bergmann 
> ---

Reviewed-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat

Re: [PATCH v2 6/9] vgacon: clean up global screen_info instances

2023-07-19 Thread Javier Martinez Canillas

Arnd Bergmann  writes:

> From: Arnd Bergmann 
>
> To prepare for completely separating the VGA console screen_info from
> the one used in EFI/sysfb, rename the vgacon instances and make them
> local as much as possible.
>
> ia64 and arm both have confurations with vgacon and efi, but the contents

is this a typo for configurations ?

> never overlaps because ia64 has no EFI framebuffer, and arm only has
> vga console on legacy platforms without EFI. Renaming these is required
> before the EFI screen_info can be moved into drivers/firmware.
>
> The ia64 vga console is actually registered in two places from
> setup_arch(), but one of them is wrong, so drop the one in pcdp.c and
> the fix the one in setup.c to use the correct conditional.
>

s/the fix the/fix the

> x86 has to keep them together, as the boot protocol is used to switch
> between VGA text console and framebuffer through the screen_info data.
>
> Signed-off-by: Arnd Bergmann 
> ---

Patch looks good to me, but I'm not that familiar with some of the arches
to give a proper reviewed-by.

Acked-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat

Re: [RFC PATCH v11 07/29] KVM: Add KVM_EXIT_MEMORY_FAULT exit

2023-07-19 Thread Sean Christopherson

On Wed, Jul 19, 2023, Yuan Yao wrote:
> On Tue, Jul 18, 2023 at 04:44:50PM -0700, Sean Christopherson wrote:
> > From: Chao Peng 
> >
> > This new KVM exit allows userspace to handle memory-related errors. It
> > indicates an error happens in KVM at guest memory range [gpa, gpa+size).
> > The flags includes additional information for userspace to handle the
> > error. Currently bit 0 is defined as 'private memory' where '1'
> > indicates error happens due to private memory access and '0' indicates
> > error happens due to shared memory access.
> 
> Now it's bit 3:

Yeah, I need to update (or write) a lot of changelogs.

> #define KVM_MEMORY_EXIT_FLAG_PRIVATE (1ULL << 3)
> 
> I remember some other attributes were introduced in v10 yet:
> 
> #define KVM_MEMORY_ATTRIBUTE_READ  (1ULL << 0)
> #define KVM_MEMORY_ATTRIBUTE_WRITE (1ULL << 1)
> #define KVM_MEMORY_ATTRIBUTE_EXECUTE   (1ULL << 2)
> #define KVM_MEMORY_ATTRIBUTE_PRIVATE   (1ULL << 3)
> 
> So KVM_MEMORY_EXIT_FLAG_PRIVATE changed to bit 3 due to above things,
> or other reason ? (Sorry I didn't follow v10 too much before).

Yep, I want to reserve space for the RWX bits.
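
As a sketch of what that layout means for userspace, using the bit values
quoted above. The helper exit_is_private() is hypothetical and not part of the
proposed uAPI:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as quoted in this thread. */
#define KVM_MEMORY_ATTRIBUTE_READ	(1ULL << 0)
#define KVM_MEMORY_ATTRIBUTE_WRITE	(1ULL << 1)
#define KVM_MEMORY_ATTRIBUTE_EXECUTE	(1ULL << 2)
#define KVM_MEMORY_ATTRIBUTE_PRIVATE	(1ULL << 3)

#define KVM_MEMORY_EXIT_FLAG_PRIVATE	(1ULL << 3)

/* Keeping bits 0-2 free lets the exit flag line up with the attribute bit. */
_Static_assert(KVM_MEMORY_EXIT_FLAG_PRIVATE == KVM_MEMORY_ATTRIBUTE_PRIVATE,
	       "exit flag and attribute are expected to share bit 3");

/* Hypothetical helper: was the faulting access to private memory? */
static bool exit_is_private(uint64_t flags)
{
	return (flags & KVM_MEMORY_EXIT_FLAG_PRIVATE) != 0;
}
```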

Re: [RFC PATCH v11 05/29] KVM: Convert KVM_ARCH_WANT_MMU_NOTIFIER to CONFIG_KVM_GENERIC_MMU_NOTIFIER

2023-07-19 Thread Sean Christopherson

On Wed, Jul 19, 2023, Yuan Yao wrote:
> On Tue, Jul 18, 2023 at 04:44:48PM -0700, Sean Christopherson wrote:
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 90a0be261a5c..d2d3e083ec7f 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -255,7 +255,9 @@ bool kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> >  int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu);
> >  #endif
> >
> > -#ifdef KVM_ARCH_WANT_MMU_NOTIFIER
> > +struct kvm_gfn_range;
> 
> Not sure why a declaration here, it's defined for ARCHs which defined
> KVM_ARCH_WANT_MMU_NOTIFIER before.

The forward declaration exists to handle cases where CONFIG_KVM=n, specifically
arch/powerpc/include/asm/kvm_ppc.h's declaration of hooks to forward calls to
uarch modules:

bool (*unmap_gfn_range)(struct kvm *kvm, struct kvm_gfn_range *range);
bool (*age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
bool (*test_age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
bool (*set_spte_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);

Prior to using a Kconfig, a forward declaration wasn't necessary because
arch/powerpc/include/asm/kvm_host.h would #define KVM_ARCH_WANT_MMU_NOTIFIER
even if CONFIG_KVM=n.

Alternatively, kvm_ppc.h could declare the struct.  I went this route mainly to
avoid the possibility of someone encountering the same problem on a different
architecture.
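
For readers less familiar with the C rule involved, here is a standalone
sketch. struct model_ops and stub_unmap() are hypothetical; only the hook
signatures follow the ones quoted above. A file-scope forward declaration is
enough to declare pointers to an incomplete type, whereas a struct tag that
first appears inside a parameter list is scoped to that prototype, which
compilers typically warn about:

```c
#include <stdbool.h>
#include <stddef.h>

struct kvm;		/* opaque in this sketch */
struct kvm_gfn_range;	/* forward declaration, no definition required */

/* Hypothetical ops table modelled on the hooks quoted above. */
struct model_ops {
	bool (*unmap_gfn_range)(struct kvm *kvm, struct kvm_gfn_range *range);
	bool (*age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
};

/* A stub that never dereferences the incomplete types. */
static bool stub_unmap(struct kvm *kvm, struct kvm_gfn_range *range)
{
	(void)kvm;
	(void)range;
	return false;
}

/* Hooks that are not provided simply stay NULL. */
static const struct model_ops ops = {
	.unmap_gfn_range = stub_unmap,
};
```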

Re: [PATCH 1/6] media: v4l2: Add audio capture and output support

2023-07-19 Thread Shengjiu Wang

Hi Mark

On Fri, Jul 7, 2023 at 11:13 AM Shengjiu Wang 
wrote:

> Hi Mark
>
> On Tue, Jul 4, 2023 at 12:03 PM Shengjiu Wang 
> wrote:
>
>>
>>
>> On Tue, Jul 4, 2023 at 1:59 AM Mark Brown  wrote:
>>
>>> On Mon, Jul 03, 2023 at 03:12:55PM +0200, Hans Verkuil wrote:
>>>
>>> > My main concern is that these cross-subsystem drivers are a pain to
>>> > maintain. So there have to be good reasons to do this.
>>>
>>> > Also it is kind of weird to have to use the V4L2 API in userspace to
>>> > deal with a specific audio conversion. Quite unexpected.
>>>
>>> > But in the end, that's a decision I can't make.
>>>
>>> > So I wait for that feedback. Note that if the decision is made that
>>> this
>>> > can use V4L2, then there is quite a lot more that needs to be done:
>>> > documentation, new compliance tests, etc. It's adding a new API, and
>>> that
>>> > comes with additional work...
>>>
>>> Absolutely, I agree with all of this - my impression was that the target
>>> here would be bypass of audio streams to/from a v4l2 device, without
>>> bouncing through an application layer.  If it's purely for audio usage
>>> with no other tie to v4l2 then involving v4l2 does just seem like
>>> complication.
>>>
>>
>> This audio use case is using the v4l2 application layer. in the user space
>> I need to call below v4l2 ioctls to implement the feature:
>> VIDIOC_QUERYCAP
>> VIDIOC_TRY_FMT
>> VIDIOC_S_FMT
>> VIDIOC_REQBUFS
>> VIDIOC_QUERYBUF
>> VIDIOC_STREAMON
>> VIDIOC_QBUF
>> VIDIOC_DQBUF
>> VIDIOC_STREAMOFF
>>
>> why the driver was put in the ALSA, because previously we implemented
>> the ASRC M2P (memory to peripheral) in ALSA,  so I think it is better to
>> add M2M driver in ALSA.  The hardware IP is the same. The compatible
>> string is the same.
>>
>>
> Could you please share more of your ideas about this patch, and could
> you please take a further look at this implementation?
>
> I tried to find a good interface in ALSA for this m2m request but didn't
> find one, so I tried V4L2 and found that it suits this audio use case.
>
> However, it needs an extension to the V4L2 API.
>
> I have no idea how to proceed; could you please recommend a direction?
>
>
Should I implement the ASRC M2M driver as a separate V4L2 driver and
move it to the drivers/media folder? The ALSA part would then only need
to register the platform device.

The bridge between the ALSA and V4L2 frameworks can be a header
file in include/sound/.

Does that sound better?

Best regards
Wang Shengjiu

Re: [PATCH v7 5/8] KVM: x86/mmu: Don't pass FOLL_GET to __kvm_follow_pfn

2023-07-19 Thread Yan Zhao

On Tue, Jul 04, 2023 at 04:50:50PM +0900, David Stevens wrote:
> @@ -4451,7 +4461,8 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
>  
>  out_unlock:
>   write_unlock(&vcpu->kvm->mmu_lock);
> - kvm_release_pfn_clean(fault->pfn);
> + if (fault->is_refcounted_page)
> + kvm_set_page_accessed(pfn_to_page(fault->pfn));
For a refcounted page, since KVM now puts its ref early in kvm_faultin_pfn(),
should this kvm_set_page_accessed() be placed before unlocking mmu_lock?

Otherwise, if the user unmaps a region (which triggers kvm_unmap_gfn_range()
with mmu_lock held for write) and releases the page, and if those two
steps happen after the page_count() check in kvm_set_page_accessed() and
before mark_page_accessed(), the latter function may mark as accessed a page
that has been released or does not belong to the current process.

Is that true?

>   return r;
>  }
>  
> @@ -4529,7 +4540,8 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
>  
>  out_unlock:
>   read_unlock(&vcpu->kvm->mmu_lock);
> - kvm_release_pfn_clean(fault->pfn);
> + if (fault->is_refcounted_page)
> + kvm_set_page_accessed(pfn_to_page(fault->pfn));
>   return r;
>  }
Ditto.

Re: [PATCH v2 5/9] vgacon: remove screen_info dependency

2023-07-19 Thread Philippe Mathieu-Daudé


Hi Arnd,

On 19/7/23 14:39, Arnd Bergmann wrote:

From: Arnd Bergmann 

The vga console driver is fairly self-contained, and only used by
architectures that explicitly initialize the screen_info settings.

Change every instance that picks the vga console by setting conswitchp
to call a function instead, and pass a reference to the screen_info
there.

Signed-off-by: Arnd Bergmann 
---
  arch/alpha/kernel/setup.c  |  2 +-
  arch/arm/kernel/setup.c|  2 +-
  arch/ia64/kernel/setup.c   |  2 +-
  arch/mips/kernel/setup.c   |  2 +-
  arch/x86/kernel/setup.c|  2 +-
  drivers/firmware/pcdp.c|  2 +-
  drivers/video/console/vgacon.c | 68 --
  include/linux/console.h|  7 
  8 files changed, 53 insertions(+), 34 deletions(-)




@@ -1074,13 +1077,13 @@ static int vgacon_resize(struct vc_data *c, unsigned int width,
 * Ho ho!  Someone (svgatextmode, eh?) may have reprogrammed
 * the video mode!  Set the new defaults then and go away.
 */
-   screen_info.orig_video_cols = width;
-   screen_info.orig_video_lines = height;
+   vga_si->orig_video_cols = width;
+   vga_si->orig_video_lines = height;
vga_default_font_height = c->vc_cell_height;
return 0;
}
-   if (width % 2 || width > screen_info.orig_video_cols ||
-   height > (screen_info.orig_video_lines * vga_default_font_height)/
+   if (width % 2 || width > vga_si->orig_video_cols ||
+   height > (vga_si->orig_video_lines * vga_default_font_height)/
c->vc_cell_height)
return -EINVAL;
  
@@ -1110,8 +1113,8 @@ static void vgacon_save_screen(struct vc_data *c)

 * console initialization routines.
 */
vga_bootup_console = 1;
-   c->state.x = screen_info.orig_x;
-   c->state.y = screen_info.orig_y;
+   c->state.x = vga_si->orig_x;
+   c->state.y = vga_si->orig_y;


Not really my area, so bear with me if this is obviously not
possible :) If using DUMMY_CONSOLE, can we trigger a save_screen
/ resize? If so, we'd reach here with vga_si=NULL.


}
  
  	/* We can't copy in more than the size of the video buffer,

@@ -1204,4 +1207,13 @@ const struct consw vga_con = {
  };
  EXPORT_SYMBOL(vga_con);
  
+void vgacon_register_screen(struct screen_info *si)

+{
+   if (!si || vga_si)
+   return;
+
+   conswitchp = &vga_con;
+   vga_si = si;
+}

Re: [PATCH v2 5/9] vgacon: remove screen_info dependency

2023-07-19 Thread Javier Martinez Canillas

Arnd Bergmann  writes:

> From: Arnd Bergmann 
>
> The vga console driver is fairly self-contained, and only used by
> architectures that explicitly initialize the screen_info settings.
>
> Change every instance that picks the vga console by setting conswitchp
> to call a function instead, and pass a reference to the screen_info
> there.
>
> Signed-off-by: Arnd Bergmann 
> ---

Reviewed-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat

Re: [RFC PATCH v11 01/29] KVM: Wrap kvm_gfn_range.pte in a per-action union

2023-07-19 Thread Jarkko Sakkinen

On Wed Jul 19, 2023 at 2:44 AM EEST, Sean Christopherson wrote:
>   /* Huge pages aren't expected to be modified without first being zapped. */
> - WARN_ON(pte_huge(range->pte) || range->start + 1 != range->end);
> + WARN_ON(pte_huge(range->arg.pte) || range->start + 1 != range->end);

Not familiar with this code. Just checking whether pr_{warn,err}()
combined with return false would be a more graceful option
instead?

BR, Jarkko

Re: [PATCH v2 4/9] vgacon, arch/*: remove unused screen_info definitions

2023-07-19 Thread Philippe Mathieu-Daudé


On 19/7/23 14:39, Arnd Bergmann wrote:

From: Arnd Bergmann 

A number of architectures either kept the screen_info definition for
historical purposes as it used to be required by the generic VT code, or
they copied it from another architecture in order to build the VGA console
driver in an allmodconfig build. The mips definition is used by some
platforms, but the initialization on jazz is not needed.

Now that vgacon no longer builds on these architectures, remove the
stale definitions and initializations.

Reviewed-by: Javier Martinez Canillas 
Reviewed-by: Thomas Zimmermann 
Acked-by: Dinh Nguyen 
Acked-by: Max Filippov 
Acked-by: Palmer Dabbelt 
Acked-by: Guo Ren 
Signed-off-by: Arnd Bergmann 
---
  arch/csky/kernel/setup.c  | 12 
  arch/hexagon/kernel/Makefile  |  2 --
  arch/hexagon/kernel/screen_info.c |  3 ---
  arch/mips/jazz/setup.c|  9 -
  arch/nios2/kernel/setup.c |  5 -
  arch/sh/kernel/setup.c|  5 -
  arch/sparc/kernel/setup_32.c  | 13 -
  arch/sparc/kernel/setup_64.c  | 13 -
  arch/xtensa/kernel/setup.c| 12 
  9 files changed, 74 deletions(-)
  delete mode 100644 arch/hexagon/kernel/screen_info.c


Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH v2 9/9] efi: move screen_info into efi init code

2023-07-19 Thread Ard Biesheuvel

On Wed, 19 Jul 2023 at 14:41, Arnd Bergmann  wrote:
>
> From: Arnd Bergmann 
>
> After the vga console no longer relies on global screen_info, there are
> only two remaining use cases:
>
>  - on the x86 architecture, it is used for multiple boot methods
>(bzImage, EFI, Xen, kexec) to commicate the initial VGA or framebuffer
>settings to a number of device drivers.
>
>  - on other architectures, it is only used as part of the EFI stub,
>and only for the three sysfb framebuffers (simpledrm, simplefb, efifb).
>
> Remove the duplicate data structure definitions by moving it into the
> efi-init.c file that sets it up initially for the EFI case, leaving x86
> as an exception that retains its own definition for non-EFI boots.
>
> The added #ifdefs here are optional, I added them to further limit the
> reach of screen_info to configurations that have at least one of the
> users enabled.
>
> Signed-off-by: Arnd Bergmann 

Reviewed-by: Ard Biesheuvel 

> ---
>  arch/arm/kernel/setup.c   |  4 
>  arch/arm64/kernel/efi.c   |  4 
>  arch/arm64/kernel/image-vars.h|  2 ++
>  arch/ia64/kernel/setup.c  |  4 
>  arch/loongarch/kernel/efi.c   |  3 ++-
>  arch/loongarch/kernel/image-vars.h|  2 ++
>  arch/loongarch/kernel/setup.c |  5 -
>  arch/riscv/kernel/image-vars.h|  2 ++
>  arch/riscv/kernel/setup.c |  5 -
>  drivers/firmware/efi/efi-init.c   | 14 +-
>  drivers/firmware/efi/libstub/efi-stub-entry.c |  8 +++-
>  11 files changed, 28 insertions(+), 25 deletions(-)
>
> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> index 86c2751f56dcf..135b7eff03f72 100644
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -939,10 +939,6 @@ static struct screen_info vgacon_screen_info = {
>  };
>  #endif
>
> -#if defined(CONFIG_EFI)
> -struct screen_info screen_info;
> -#endif
> -
>  static int __init customize_machine(void)
>  {
> /*
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index 3afbe503b066f..ff2d5169d7f1f 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -71,10 +71,6 @@ static __init pteval_t create_mapping_protection(efi_memory_desc_t *md)
> return pgprot_val(PAGE_KERNEL_EXEC);
>  }
>
> -/* we will fill this structure from the stub, so don't put it in .bss */
> -struct screen_info screen_info __section(".data");
> -EXPORT_SYMBOL(screen_info);
> -
>  int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md)
>  {
> pteval_t prot_val = create_mapping_protection(md);
> diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
> index 35f3c79595137..5e4dc72ab1bda 100644
> --- a/arch/arm64/kernel/image-vars.h
> +++ b/arch/arm64/kernel/image-vars.h
> @@ -27,7 +27,9 @@ PROVIDE(__efistub__text   = _text);
>  PROVIDE(__efistub__end = _end);
>  PROVIDE(__efistub___inittext_end   = __inittext_end);
>  PROVIDE(__efistub__edata   = _edata);
> +#if defined(CONFIG_EFI_EARLYCON) || defined(CONFIG_SYSFB)
>  PROVIDE(__efistub_screen_info  = screen_info);
> +#endif
>  PROVIDE(__efistub__ctype   = _ctype);
>
>  PROVIDE(__pi___memcpy  = __pi_memcpy);
> diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
> index 82feae1323f40..e91a91b5e9142 100644
> --- a/arch/ia64/kernel/setup.c
> +++ b/arch/ia64/kernel/setup.c
> @@ -86,10 +86,6 @@ EXPORT_SYMBOL(local_per_cpu_offset);
>  #endif
>  unsigned long ia64_cycles_per_usec;
>  struct ia64_boot_param *ia64_boot_param;
> -#if defined(CONFIG_EFI)
> -/* No longer used on ia64, but needed for linking */
> -struct screen_info screen_info;
> -#endif
>  #ifdef CONFIG_VGA_CONSOLE
>  unsigned long vga_console_iobase;
>  unsigned long vga_console_membase;
> diff --git a/arch/loongarch/kernel/efi.c b/arch/loongarch/kernel/efi.c
> index 9fc10cea21e10..df7db34024e61 100644
> --- a/arch/loongarch/kernel/efi.c
> +++ b/arch/loongarch/kernel/efi.c
> @@ -115,7 +115,8 @@ void __init efi_init(void)
>
> set_bit(EFI_CONFIG_TABLES, &efi.flags);
>
> -   init_screen_info();
> +   if (IS_ENABLED(CONFIG_EFI_EARLYCON) || IS_ENABLED(CONFIG_SYSFB))
> +   init_screen_info();
>
> if (boot_memmap == EFI_INVALID_TABLE_ADDR)
> return;
> diff --git a/arch/loongarch/kernel/image-vars.h b/arch/loongarch/kernel/image-vars.h
> index e561989d02de9..5087416b9678d 100644
> --- a/arch/loongarch/kernel/image-vars.h
> +++ b/arch/loongarch/kernel/image-vars.h
> @@ -12,7 +12,9 @@ __efistub_kernel_entry= kernel_entry;
>  __efistub_kernel_asize = kernel_asize;
>  __efistub_kernel_fsize = kernel_fsize;
>  __efistub_kernel_offset= kernel_offset;
> +#if defined(CONFIG_EFI_EARLYCON) || defined(CONFIG_SYSFB)
>

Re: linux-next: Tree for Jul 13 (drivers/video/fbdev/ps3fb.c)

2023-07-19 Thread Thorsten Leemhuis

On 19.07.23 14:36, Bagas Sanjaya wrote:
> On 7/18/23 17:06, Thorsten Leemhuis wrote:
>> I'm missing something here:
>>
>> * What makes you think this is caused by bdb616479eff419? I didn't see
>> anything in the thread that claims this, but I might be missing something
>> * related: if I understand Randy right, this is only happening in -next;
>> so why is bdb616479eff419 the culprit, which is also in mainline since
>> End of June?
> 
> Actually drivers/video/fbdev/ps3fb.c only had two non-merge commits during
> the previous cycle: 25ec15abb06194 and bdb616479eff419. The former simply
> added the .owner field in ps3fb_ops (hence trivial), so I inferred that the
> culprit was likely the latter (because it was authored by Thomas).

As you can see from Michael's reply this was misguided, as it was an
external change that broke the driver. This happens all the time, so
inferring a culprit this way is not possible at all.

Ciao, Thorsten

[PATCH v2 9/9] efi: move screen_info into efi init code

2023-07-19 Thread Arnd Bergmann

From: Arnd Bergmann 

After the vga console no longer relies on global screen_info, there are
only two remaining use cases:

 - on the x86 architecture, it is used for multiple boot methods
   (bzImage, EFI, Xen, kexec) to commicate the initial VGA or framebuffer
   settings to a number of device drivers.

 - on other architectures, it is only used as part of the EFI stub,
   and only for the three sysfb framebuffers (simpledrm, simplefb, efifb).

Remove the duplicate data structure definitions by moving it into the
efi-init.c file that sets it up initially for the EFI case, leaving x86
as an exception that retains its own definition for non-EFI boots.

The added #ifdefs here are optional, I added them to further limit the
reach of screen_info to configurations that have at least one of the
users enabled.

Signed-off-by: Arnd Bergmann 
---
 arch/arm/kernel/setup.c   |  4 
 arch/arm64/kernel/efi.c   |  4 
 arch/arm64/kernel/image-vars.h|  2 ++
 arch/ia64/kernel/setup.c  |  4 
 arch/loongarch/kernel/efi.c   |  3 ++-
 arch/loongarch/kernel/image-vars.h|  2 ++
 arch/loongarch/kernel/setup.c |  5 -
 arch/riscv/kernel/image-vars.h|  2 ++
 arch/riscv/kernel/setup.c |  5 -
 drivers/firmware/efi/efi-init.c   | 14 +-
 drivers/firmware/efi/libstub/efi-stub-entry.c |  8 +++-
 11 files changed, 28 insertions(+), 25 deletions(-)

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 86c2751f56dcf..135b7eff03f72 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -939,10 +939,6 @@ static struct screen_info vgacon_screen_info = {
 };
 #endif
 
-#if defined(CONFIG_EFI)
-struct screen_info screen_info;
-#endif
-
 static int __init customize_machine(void)
 {
/*
diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 3afbe503b066f..ff2d5169d7f1f 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -71,10 +71,6 @@ static __init pteval_t create_mapping_protection(efi_memory_desc_t *md)
return pgprot_val(PAGE_KERNEL_EXEC);
 }
 
-/* we will fill this structure from the stub, so don't put it in .bss */
-struct screen_info screen_info __section(".data");
-EXPORT_SYMBOL(screen_info);
-
 int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md)
 {
pteval_t prot_val = create_mapping_protection(md);
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 35f3c79595137..5e4dc72ab1bda 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -27,7 +27,9 @@ PROVIDE(__efistub__text   = _text);
 PROVIDE(__efistub__end = _end);
 PROVIDE(__efistub___inittext_end   = __inittext_end);
 PROVIDE(__efistub__edata   = _edata);
+#if defined(CONFIG_EFI_EARLYCON) || defined(CONFIG_SYSFB)
 PROVIDE(__efistub_screen_info  = screen_info);
+#endif
 PROVIDE(__efistub__ctype   = _ctype);
 
 PROVIDE(__pi___memcpy  = __pi_memcpy);
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 82feae1323f40..e91a91b5e9142 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -86,10 +86,6 @@ EXPORT_SYMBOL(local_per_cpu_offset);
 #endif
 unsigned long ia64_cycles_per_usec;
 struct ia64_boot_param *ia64_boot_param;
-#if defined(CONFIG_EFI)
-/* No longer used on ia64, but needed for linking */
-struct screen_info screen_info;
-#endif
 #ifdef CONFIG_VGA_CONSOLE
 unsigned long vga_console_iobase;
 unsigned long vga_console_membase;
diff --git a/arch/loongarch/kernel/efi.c b/arch/loongarch/kernel/efi.c
index 9fc10cea21e10..df7db34024e61 100644
--- a/arch/loongarch/kernel/efi.c
+++ b/arch/loongarch/kernel/efi.c
@@ -115,7 +115,8 @@ void __init efi_init(void)
 
set_bit(EFI_CONFIG_TABLES, &efi.flags);
 
-   init_screen_info();
+   if (IS_ENABLED(CONFIG_EFI_EARLYCON) || IS_ENABLED(CONFIG_SYSFB))
+   init_screen_info();
 
if (boot_memmap == EFI_INVALID_TABLE_ADDR)
return;
diff --git a/arch/loongarch/kernel/image-vars.h b/arch/loongarch/kernel/image-vars.h
index e561989d02de9..5087416b9678d 100644
--- a/arch/loongarch/kernel/image-vars.h
+++ b/arch/loongarch/kernel/image-vars.h
@@ -12,7 +12,9 @@ __efistub_kernel_entry= kernel_entry;
 __efistub_kernel_asize = kernel_asize;
 __efistub_kernel_fsize = kernel_fsize;
 __efistub_kernel_offset= kernel_offset;
+#if defined(CONFIG_EFI_EARLYCON) || defined(CONFIG_SYSFB)
 __efistub_screen_info  = screen_info;
+#endif
 
 #endif
 
diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
index 77e7a3722caa6..4570c3149b849 100644
--- a/arch/loongarch/kernel/setup.c
+++ b/arch/loongarch/kernel/setup.c
@@ -16,7 +16,6 @@
 #include 
 #include 
 #include

[PATCH v2 8/9] hyperv: avoid dependency on screen_info

2023-07-19 Thread Arnd Bergmann

From: Arnd Bergmann 

The two hyperv framebuffer drivers (hyperv_fb or hyperv_drm_drv) access the
global screen_info in order to take over from the sysfb framebuffer, which
in turn could be handled by simplefb, simpledrm or efifb. Similarly, the
vmbus_drv code marks the original EFI framebuffer as reserved, but this
is not required if there is no sysfb.

As a preparation for making screen_info itself more local to the sysfb
helper code, add a compile-time conditional in all three files that relate
to hyperv fb and just skip this code if there is no sysfb that needs to
be unregistered.

Signed-off-by: Arnd Bergmann 
---
 drivers/gpu/drm/hyperv/hyperv_drm_drv.c | 7 ---
 drivers/hv/vmbus_drv.c  | 6 --
 drivers/video/fbdev/hyperv_fb.c | 8 
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
index 8026118c6e033..9a44a00effc24 100644
--- a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
+++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
@@ -73,9 +73,10 @@ static int hyperv_setup_vram(struct hyperv_drm_device *hv,
struct drm_device *dev = &hv->dev;
int ret;
 
-   drm_aperture_remove_conflicting_framebuffers(screen_info.lfb_base,
-                                                screen_info.lfb_size,
-                                                &hyperv_driver);
+   if (IS_ENABLED(CONFIG_SYSFB))
+       drm_aperture_remove_conflicting_framebuffers(screen_info.lfb_base,
+                                                    screen_info.lfb_size,
+                                                    &hyperv_driver);
 
hv->fb_size = (unsigned long)hv->mmio_megabytes * 1024 * 1024;
 
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 67f95a29aeca5..5bc059e8a9f5f 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2100,8 +2100,10 @@ static void __maybe_unused vmbus_reserve_fb(void)
 
if (efi_enabled(EFI_BOOT)) {
/* Gen2 VM: get FB base from EFI framebuffer */
-   start = screen_info.lfb_base;
-   size = max_t(__u32, screen_info.lfb_size, 0x80);
+   if (IS_ENABLED(CONFIG_SYSFB)) {
+   start = screen_info.lfb_base;
+   size = max_t(__u32, screen_info.lfb_size, 0x80);
+   }
} else {
/* Gen1 VM: get FB base from PCI */
pdev = pci_get_device(PCI_VENDOR_ID_MICROSOFT,
diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
index b331452aab4fb..7e0d1c4235549 100644
--- a/drivers/video/fbdev/hyperv_fb.c
+++ b/drivers/video/fbdev/hyperv_fb.c
@@ -1030,7 +1030,7 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info *info)
goto getmem_done;
}
pr_info("Unable to allocate enough contiguous physical memory on Gen 1 VM. Using MMIO instead.\n");
-   } else {
+   } else if (IS_ENABLED(CONFIG_SYSFB)) {
base = screen_info.lfb_base;
size = screen_info.lfb_size;
}
@@ -1076,13 +1076,13 @@ static int hvfb_getmem(struct hv_device *hdev, struct fb_info *info)
 getmem_done:
aperture_remove_conflicting_devices(base, size, KBUILD_MODNAME);
 
-   if (gen2vm) {
+   if (!gen2vm) {
+   pci_dev_put(pdev);
+   } else if (IS_ENABLED(CONFIG_SYSFB)) {
/* framebuffer is reallocated, clear screen_info to avoid misuse from kexec */
screen_info.lfb_size = 0;
screen_info.lfb_base = 0;
screen_info.orig_video_isVGA = 0;
-   } else {
-   pci_dev_put(pdev);
}
 
return 0;
-- 
2.39.2
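
The compile-time conditional described in this patch can be sketched in user space as follows. This is only an illustration under stated assumptions: SKETCH_IS_ENABLED() is a simplified stand-in for the kernel's IS_ENABLED(), SKETCH_CONFIG_SYSFB is a plain 0/1 macro rather than a real Kconfig symbol, and the struct, the sample addresses and sketch_fb_range() are hypothetical names that do not exist in the kernel. It mirrors the shape of the vmbus_reserve_fb() hunk: the screen_info fields are read only when sysfb is enabled.

```c
#include <stdint.h>

/*
 * Simplified stand-in for the kernel's IS_ENABLED(): in this sketch the
 * option is a plain 0/1 macro, which is an assumption for illustration.
 */
#ifndef SKETCH_CONFIG_SYSFB
#define SKETCH_CONFIG_SYSFB 0
#endif
#define SKETCH_IS_ENABLED(option) (option)

/* Hypothetical, reduced copy of the two screen_info fields the patch reads. */
struct sketch_screen_info {
    uint32_t lfb_base;
    uint32_t lfb_size;
};

/* Sample values only; they do not come from the patch. */
static struct sketch_screen_info sketch_si = {
    .lfb_base = 0xe0000000u,
    .lfb_size = 0x00300000u,
};

/*
 * On an EFI boot the framebuffer range is taken from screen_info only when
 * sysfb is enabled; otherwise start and size stay zero and the caller has
 * nothing to reserve.
 */
static void sketch_fb_range(int efi_boot, uint32_t *start, uint32_t *size)
{
    *start = 0;
    *size = 0;

    if (efi_boot) {
        if (SKETCH_IS_ENABLED(SKETCH_CONFIG_SYSFB)) {
            *start = sketch_si.lfb_base;
            *size = sketch_si.lfb_size;
        }
    }
}
```

Because the condition is an ordinary C expression rather than an #ifdef, the guarded branch is still parsed and type-checked in every configuration, and the compiler can drop it when the option is off.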

[PATCH v2 7/9] vga16fb: drop powerpc support

2023-07-19 Thread Arnd Bergmann

From: Arnd Bergmann 

I noticed that commit 0db5b61e0dc07 ("fbdev/vga16fb: Create
EGA/VGA devices in sysfb code") broke vga16fb on non-x86 platforms,
because the sysfb code never creates a vga-framebuffer device when
screen_info.orig_video_isVGA is set to '1' instead of VIDEO_TYPE_VGAC.

However, it turns out that the only architecture that has allowed
building vga16fb in the past 20 years is powerpc, and this only worked
on two 32-bit platforms and never on 64-bit powerpc. The last machine
that actually used this was removed in linux-3.10, so this is all dead
code and can be removed.

The big-endian support in vga16fb.c could also be removed, but I'd just
leave this in place.

Fixes: 933ee7119fb14 ("powerpc: remove PReP platform")
Signed-off-by: Arnd Bergmann 
---
 arch/powerpc/kernel/setup-common.c | 16 
 drivers/video/fbdev/Kconfig|  2 +-
 drivers/video/fbdev/vga16fb.c  |  9 +
 3 files changed, 2 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index d2a446216444f..81a6313927228 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -22,7 +22,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -98,21 +97,6 @@ int boot_cpu_hwid = -1;
 int dcache_bsize;
 int icache_bsize;
 
-/*
- * This still seems to be needed... -- paulus
- */ 
-struct screen_info screen_info = {
-   .orig_x = 0,
-   .orig_y = 25,
-   .orig_video_cols = 80,
-   .orig_video_lines = 25,
-   .orig_video_isVGA = 1,
-   .orig_video_points = 16
-};
-#if defined(CONFIG_FB_VGA16_MODULE)
-EXPORT_SYMBOL(screen_info);
-#endif
-
 /* Variables required to store legacy IO irq routing */
 int of_i8042_kbd_irq;
 EXPORT_SYMBOL_GPL(of_i8042_kbd_irq);
diff --git a/drivers/video/fbdev/Kconfig b/drivers/video/fbdev/Kconfig
index 9169ee532baf7..ebc3cdfdfca07 100644
--- a/drivers/video/fbdev/Kconfig
+++ b/drivers/video/fbdev/Kconfig
@@ -368,7 +368,7 @@ config FB_IMSTT
 
 config FB_VGA16
tristate "VGA 16-color graphics support"
-   depends on FB && (X86 || PPC)
+   depends on FB && X86
select APERTURE_HELPERS
select FB_CFB_FILLRECT
select FB_CFB_COPYAREA
diff --git a/drivers/video/fbdev/vga16fb.c b/drivers/video/fbdev/vga16fb.c
index 34d00347ad58a..8e28f9dd19044 100644
--- a/drivers/video/fbdev/vga16fb.c
+++ b/drivers/video/fbdev/vga16fb.c
@@ -185,8 +185,6 @@ static inline void setindex(int index)
 /* Check if the video mode is supported by the driver */
 static inline int check_mode_supported(const struct screen_info *si)
 {
-   /* non-x86 architectures treat orig_video_isVGA as a boolean flag */
-#if defined(CONFIG_X86)
/* only EGA and VGA in 16 color graphic mode are supported */
if (si->orig_video_isVGA != VIDEO_TYPE_EGAC &&
si->orig_video_isVGA != VIDEO_TYPE_VGAC)
@@ -197,7 +195,7 @@ static inline int check_mode_supported(const struct screen_info *si)
si->orig_video_mode != 0x10 &&  /* 640x350/4 (EGA) */
si->orig_video_mode != 0x12)/* 640x480/4 (VGA) */
return -ENODEV;
-#endif
+
return 0;
 }
 
@@ -1338,12 +1336,7 @@ static int vga16fb_probe(struct platform_device *dev)
printk(KERN_INFO "vga16fb: mapped to 0x%p\n", info->screen_base);
par = info->par;
 
-#if defined(CONFIG_X86)
par->isVGA = si->orig_video_isVGA == VIDEO_TYPE_VGAC;
-#else
-   /* non-x86 architectures treat orig_video_isVGA as a boolean flag */
-   par->isVGA = si->orig_video_isVGA;
-#endif
par->palette_blanked = 0;
par->vesa_blanked = 0;
 
-- 
2.39.2
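
The mismatch described in the commit message above can be illustrated with a small user-space sketch. It is not the driver code: the struct is a reduced, hypothetical copy of screen_info, the VIDEO_TYPE_* values are quoted from memory of include/uapi/linux/screen_info.h and should be treated as assumptions, and the list of accepted modes is taken from the "VGA16 modes" comment in the vgacon hunk of patch 5/9. It shows why a boolean-style orig_video_isVGA = 1, as the removed powerpc definition used, can never pass the type check.

```c
#include <errno.h>

/* Assumed values, recalled from include/uapi/linux/screen_info.h. */
#define VIDEO_TYPE_EGAC 0x21 /* EGA in color mode */
#define VIDEO_TYPE_VGAC 0x22 /* VGA+ in color mode */

/* Hypothetical, reduced copy of the fields that the check looks at. */
struct sketch_screen_info {
    unsigned char orig_video_mode;
    unsigned char orig_video_isVGA;
};

/* Returns 0 if the mode looks usable by a 16-color driver, else -ENODEV. */
static int sketch_check_mode_supported(const struct sketch_screen_info *si)
{
    /* Only EGA and VGA color adapters are accepted. */
    if (si->orig_video_isVGA != VIDEO_TYPE_EGAC &&
        si->orig_video_isVGA != VIDEO_TYPE_VGAC)
        return -ENODEV;

    /* Only the 16-color graphics modes are accepted. */
    if (si->orig_video_mode != 0x0D && /* 320x200/4 */
        si->orig_video_mode != 0x0E && /* 640x200/4 */
        si->orig_video_mode != 0x10 && /* 640x350/4 */
        si->orig_video_mode != 0x12)   /* 640x480/4 */
        return -ENODEV;

    return 0;
}
```

With orig_video_isVGA set to 1 the first test fails regardless of the video mode, which is consistent with the observation that the non-x86 path was already dead.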

[PATCH v2 6/9] vgacon: clean up global screen_info instances

2023-07-19 Thread Arnd Bergmann

From: Arnd Bergmann 

To prepare for completely separating the VGA console screen_info from
the one used in EFI/sysfb, rename the vgacon instances and make them
local as much as possible.

ia64 and arm both have configurations with vgacon and efi, but the contents
never overlap because ia64 has no EFI framebuffer, and arm only has
vga console on legacy platforms without EFI. Renaming these is required
before the EFI screen_info can be moved into drivers/firmware.

The ia64 vga console is actually registered in two places from
setup_arch(), but one of them is wrong, so drop the one in pcdp.c and
fix the one in setup.c to use the correct conditional.

x86 has to keep them together, as the boot protocol is used to switch
between VGA text console and framebuffer through the screen_info data.

Signed-off-by: Arnd Bergmann 
---
 arch/alpha/kernel/proto.h |  2 ++
 arch/alpha/kernel/setup.c |  6 ++--
 arch/alpha/kernel/sys_sio.c   |  6 ++--
 arch/arm/include/asm/setup.h  |  5 
 arch/arm/kernel/atags_parse.c | 18 ++--
 arch/arm/kernel/efi.c |  6 
 arch/arm/kernel/setup.c   | 10 +--
 arch/ia64/kernel/setup.c  | 49 +++
 arch/mips/kernel/setup.c  | 11 ---
 arch/mips/mti-malta/malta-setup.c |  4 ++-
 arch/mips/sibyte/swarm/setup.c| 24 ---
 arch/mips/sni/setup.c | 16 +-
 drivers/firmware/pcdp.c   |  1 -
 13 files changed, 78 insertions(+), 80 deletions(-)

diff --git a/arch/alpha/kernel/proto.h b/arch/alpha/kernel/proto.h
index 5816a31c1b386..2c89c1c557129 100644
--- a/arch/alpha/kernel/proto.h
+++ b/arch/alpha/kernel/proto.h
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include 
+#include 
 #include 
 
 /* Prototypes of functions used across modules here in this directory.  */
@@ -113,6 +114,7 @@ extern int boot_cpuid;
 #ifdef CONFIG_VERBOSE_MCHECK
 extern unsigned long alpha_verbose_mcheck;
 #endif
+extern struct screen_info vgacon_screen_info;
 
 /* srmcons.c */
 #if defined(CONFIG_ALPHA_GENERIC) || defined(CONFIG_ALPHA_SRM)
diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c
index d73b685fe9852..7b35af2ed2787 100644
--- a/arch/alpha/kernel/setup.c
+++ b/arch/alpha/kernel/setup.c
@@ -138,7 +138,7 @@ static char __initdata command_line[COMMAND_LINE_SIZE];
  * code think we're on a VGA color display.
  */
 
-struct screen_info screen_info = {
+struct screen_info vgacon_screen_info = {
.orig_x = 0,
.orig_y = 25,
.orig_video_cols = 80,
@@ -146,8 +146,6 @@ struct screen_info screen_info = {
.orig_video_isVGA = 1,
.orig_video_points = 16
 };
-
-EXPORT_SYMBOL(screen_info);
 #endif
 
 /*
@@ -655,7 +653,7 @@ setup_arch(char **cmdline_p)
 
 #ifdef CONFIG_VT
 #if defined(CONFIG_VGA_CONSOLE)
-   vgacon_register_screen(&screen_info);
+   vgacon_register_screen(&vgacon_screen_info);
 #endif
 #endif
 
diff --git a/arch/alpha/kernel/sys_sio.c b/arch/alpha/kernel/sys_sio.c
index 7de8a5d2d2066..086488ed83a7f 100644
--- a/arch/alpha/kernel/sys_sio.c
+++ b/arch/alpha/kernel/sys_sio.c
@@ -60,9 +60,9 @@ alphabook1_init_arch(void)
 #ifdef CONFIG_VGA_CONSOLE
/* The AlphaBook1 has LCD video fixed at 800x600,
   37 rows and 100 cols. */
-   screen_info.orig_y = 37;
-   screen_info.orig_video_cols = 100;
-   screen_info.orig_video_lines = 37;
+   vgacon_screen_info.orig_y = 37;
+   vgacon_screen_info.orig_video_cols = 100;
+   vgacon_screen_info.orig_video_lines = 37;
 #endif
 
lca_init_arch();
diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h
index 546af8b1e3f65..cc106f946c691 100644
--- a/arch/arm/include/asm/setup.h
+++ b/arch/arm/include/asm/setup.h
@@ -11,6 +11,7 @@
 #ifndef __ASMARM_SETUP_H
 #define __ASMARM_SETUP_H
 
+#include 
 #include 
 
 
@@ -35,4 +36,8 @@ void early_mm_init(const struct machine_desc *);
 void adjust_lowmem_bounds(void);
 void setup_dma_zone(const struct machine_desc *desc);
 
+#ifdef CONFIG_VGA_CONSOLE
+extern struct screen_info vgacon_screen_info;
+#endif
+
 #endif
diff --git a/arch/arm/kernel/atags_parse.c b/arch/arm/kernel/atags_parse.c
index 4c815da3b77b0..4ec591bde3dfa 100644
--- a/arch/arm/kernel/atags_parse.c
+++ b/arch/arm/kernel/atags_parse.c
@@ -72,15 +72,15 @@ __tagtable(ATAG_MEM, parse_tag_mem32);
 #if defined(CONFIG_ARCH_FOOTBRIDGE) && defined(CONFIG_VGA_CONSOLE)
 static int __init parse_tag_videotext(const struct tag *tag)
 {
-   screen_info.orig_x= tag->u.videotext.x;
-   screen_info.orig_y= tag->u.videotext.y;
-   screen_info.orig_video_page   = tag->u.videotext.video_page;
-   screen_info.orig_video_mode   = tag->u.videotext.video_mode;
-   screen_info.orig_video_cols   = tag->u.videotext.video_cols;
-   screen_info.orig_video_ega_bx = tag->u.videotext.video_ega_bx;
-   screen_info.orig_video_lines  = tag->u.videotext.video_lines;
-

[PATCH v2 5/9] vgacon: remove screen_info dependency

2023-07-19 Thread Arnd Bergmann

From: Arnd Bergmann 

The vga console driver is fairly self-contained, and only used by
architectures that explicitly initialize the screen_info settings.

Change every instance that picks the vga console by setting conswitchp
to call a function instead, and pass a reference to the screen_info
there.

Signed-off-by: Arnd Bergmann 
---
 arch/alpha/kernel/setup.c  |  2 +-
 arch/arm/kernel/setup.c|  2 +-
 arch/ia64/kernel/setup.c   |  2 +-
 arch/mips/kernel/setup.c   |  2 +-
 arch/x86/kernel/setup.c|  2 +-
 drivers/firmware/pcdp.c|  2 +-
 drivers/video/console/vgacon.c | 68 --
 include/linux/console.h|  7 
 8 files changed, 53 insertions(+), 34 deletions(-)

diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c
index b4d2297765c02..d73b685fe9852 100644
--- a/arch/alpha/kernel/setup.c
+++ b/arch/alpha/kernel/setup.c
@@ -655,7 +655,7 @@ setup_arch(char **cmdline_p)
 
 #ifdef CONFIG_VT
 #if defined(CONFIG_VGA_CONSOLE)
-   conswitchp = &vga_con;
+   vgacon_register_screen(&screen_info);
 #endif
 #endif
 
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 40326a35a179b..5d8a7fb3eba45 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -1192,7 +1192,7 @@ void __init setup_arch(char **cmdline_p)
 
 #ifdef CONFIG_VT
 #if defined(CONFIG_VGA_CONSOLE)
-   conswitchp = &vga_con;
+   vgacon_register_screen(&screen_info);
 #endif
 #endif
 
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index d2c66efdde560..2c9283fcd3759 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -619,7 +619,7 @@ setup_arch (char **cmdline_p)
 * memory so we can avoid this problem.
 */
if (efi_mem_type(0xA) != EFI_CONVENTIONAL_MEMORY)
-   conswitchp = &vga_con;
+   vgacon_register_screen(&screen_info);
 # endif
}
 #endif
diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 1aba7dc95132c..6c3fae62a9f6b 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -794,7 +794,7 @@ void __init setup_arch(char **cmdline_p)
 
 #if defined(CONFIG_VT)
 #if defined(CONFIG_VGA_CONSOLE)
-   conswitchp = &vga_con;
+   vgacon_register_screen(&screen_info);
 #endif
 #endif
 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index fd975a4a52006..b1ea77d504615 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1293,7 +1293,7 @@ void __init setup_arch(char **cmdline_p)
 #ifdef CONFIG_VT
 #if defined(CONFIG_VGA_CONSOLE)
if (!efi_enabled(EFI_BOOT) || (efi_mem_type(0xa) != EFI_CONVENTIONAL_MEMORY))
-   conswitchp = &vga_con;
+   vgacon_register_screen(&screen_info);
 #endif
 #endif
x86_init.oem.banner();
diff --git a/drivers/firmware/pcdp.c b/drivers/firmware/pcdp.c
index 715a45442d1cf..667a595373b2d 100644
--- a/drivers/firmware/pcdp.c
+++ b/drivers/firmware/pcdp.c
@@ -72,7 +72,7 @@ setup_vga_console(struct pcdp_device *dev)
return -ENODEV;
}
 
-   conswitchp = &vga_con;
+   vgacon_register_screen(&screen_info);
printk(KERN_INFO "PCDP: VGA console\n");
return 0;
 #else
diff --git a/drivers/video/console/vgacon.c b/drivers/video/console/vgacon.c
index e25ba523892e5..3d7fedf27ffc1 100644
--- a/drivers/video/console/vgacon.c
+++ b/drivers/video/console/vgacon.c
@@ -97,6 +97,8 @@ static intvga_video_font_height;
 static int vga_scan_lines  __read_mostly;
 static unsigned intvga_rolled_over; /* last vc_origin offset before wrap */
 
+static struct screen_info *vga_si;
+
 static bool vga_hardscroll_enabled;
 static bool vga_hardscroll_user_enable = true;
 
@@ -161,8 +163,9 @@ static const char *vgacon_startup(void)
u16 saved1, saved2;
volatile u16 *p;
 
-   if (screen_info.orig_video_isVGA == VIDEO_TYPE_VLFB ||
-   screen_info.orig_video_isVGA == VIDEO_TYPE_EFI) {
+   if (!vga_si ||
+   vga_si->orig_video_isVGA == VIDEO_TYPE_VLFB ||
+   vga_si->orig_video_isVGA == VIDEO_TYPE_EFI) {
  no_vga:
 #ifdef CONFIG_DUMMY_CONSOLE
conswitchp = &dummy_con;
@@ -172,29 +175,29 @@ static const char *vgacon_startup(void)
 #endif
}
 
-   /* boot_params.screen_info reasonably initialized? */
-   if ((screen_info.orig_video_lines == 0) ||
-   (screen_info.orig_video_cols  == 0))
+   /* vga_si reasonably initialized? */
+   if ((vga_si->orig_video_lines == 0) ||
+   (vga_si->orig_video_cols  == 0))
goto no_vga;
 
/* VGA16 modes are not handled by VGACON */
-   if ((screen_info.orig_video_mode == 0x0D) ||/* 320x200/4 */
-   (screen_info.orig_video_mode == 0x0E) ||/* 640x200/4 */
-   (screen_info.orig_video_mode == 0x10) ||/* 640x350/4 */
-   (screen_info.orig_video_mode == 0x12) ||/* 640x480/4 */
-
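
The registration approach described in this patch's commit message, where the architecture passes a screen_info reference to the console driver instead of the driver reading a global, can be sketched roughly as below. This is a user-space illustration and not the vgacon implementation: the struct is a reduced, hypothetical copy of screen_info, sketch_register_screen() and sketch_startup_uses_vga() are made-up names, and the VIDEO_TYPE_* values are quoted from memory of include/uapi/linux/screen_info.h. The checks follow the vgacon_startup() hunk shown above: no registered screen, a linear framebuffer type, or an uninitialized geometry all mean the VGA text console is not used.

```c
#include <stddef.h>

/* Assumed values, recalled from include/uapi/linux/screen_info.h. */
#define VIDEO_TYPE_VLFB 0x23 /* VESA linear framebuffer */
#define VIDEO_TYPE_EFI  0x70 /* EFI graphic mode */

/* Hypothetical, reduced copy of the fields the startup check reads. */
struct sketch_screen_info {
    unsigned char orig_video_cols;
    unsigned char orig_video_lines;
    unsigned char orig_video_isVGA;
};

/* Driver-private pointer, set once by architecture setup code. */
static const struct sketch_screen_info *sketch_vga_si;

/* Called by the architecture instead of assigning conswitchp directly. */
static void sketch_register_screen(const struct sketch_screen_info *si)
{
    sketch_vga_si = si;
}

/* Returns 1 if the VGA text console would be used, 0 for the fallback. */
static int sketch_startup_uses_vga(void)
{
    if (!sketch_vga_si ||
        sketch_vga_si->orig_video_isVGA == VIDEO_TYPE_VLFB ||
        sketch_vga_si->orig_video_isVGA == VIDEO_TYPE_EFI)
        return 0;

    /* Geometry reasonably initialized? */
    if (sketch_vga_si->orig_video_lines == 0 ||
        sketch_vga_si->orig_video_cols == 0)
        return 0;

    return 1;
}
```

Keeping the pointer private to the driver means an architecture that never registers a screen simply gets the fallback path, without needing a global symbol to link against.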

[PATCH v2 4/9] vgacon, arch/*: remove unused screen_info definitions

2023-07-19 Thread Arnd Bergmann

From: Arnd Bergmann 

A number of architectures either kept the screen_info definition for
historical purposes as it used to be required by the generic VT code, or
they copied it from another architecture in order to build the VGA console
driver in an allmodconfig build. The mips definition is used by some
platforms, but the initialization on jazz is not needed.

Now that vgacon no longer builds on these architectures, remove the
stale definitions and initializations.

Reviewed-by: Javier Martinez Canillas 
Reviewed-by: Thomas Zimmermann 
Acked-by: Dinh Nguyen 
Acked-by: Max Filippov 
Acked-by: Palmer Dabbelt 
Acked-by: Guo Ren 
Signed-off-by: Arnd Bergmann 
---
 arch/csky/kernel/setup.c  | 12 
 arch/hexagon/kernel/Makefile  |  2 --
 arch/hexagon/kernel/screen_info.c |  3 ---
 arch/mips/jazz/setup.c|  9 -
 arch/nios2/kernel/setup.c |  5 -
 arch/sh/kernel/setup.c|  5 -
 arch/sparc/kernel/setup_32.c  | 13 -
 arch/sparc/kernel/setup_64.c  | 13 -
 arch/xtensa/kernel/setup.c| 12 
 9 files changed, 74 deletions(-)
 delete mode 100644 arch/hexagon/kernel/screen_info.c

diff --git a/arch/csky/kernel/setup.c b/arch/csky/kernel/setup.c
index 106fbf0b6f3b4..51012e90780d6 100644
--- a/arch/csky/kernel/setup.c
+++ b/arch/csky/kernel/setup.c
@@ -8,22 +8,10 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
 
-#ifdef CONFIG_DUMMY_CONSOLE
-struct screen_info screen_info = {
-   .orig_video_lines   = 30,
-   .orig_video_cols= 80,
-   .orig_video_mode= 0,
-   .orig_video_ega_bx  = 0,
-   .orig_video_isVGA   = 1,
-   .orig_video_points  = 8
-};
-#endif
-
 static void __init csky_memblock_init(void)
 {
unsigned long lowmem_size = PFN_DOWN(LOWMEM_LIMIT - PHYS_OFFSET_OFFSET);
diff --git a/arch/hexagon/kernel/Makefile b/arch/hexagon/kernel/Makefile
index e73cb321630ec..3fdf937eb572e 100644
--- a/arch/hexagon/kernel/Makefile
+++ b/arch/hexagon/kernel/Makefile
@@ -17,5 +17,3 @@ obj-y += vm_vectors.o
 obj-$(CONFIG_HAS_DMA) += dma.o
 
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
-
-obj-$(CONFIG_VGA_CONSOLE) += screen_info.o
diff --git a/arch/hexagon/kernel/screen_info.c b/arch/hexagon/kernel/screen_info.c
deleted file mode 100644
index 1e1ceb18bafe7..0
--- a/arch/hexagon/kernel/screen_info.c
+++ /dev/null
@@ -1,3 +0,0 @@
-#include 
-
-struct screen_info screen_info;
diff --git a/arch/mips/jazz/setup.c b/arch/mips/jazz/setup.c
index ee044261eb223..23059ead773fc 100644
--- a/arch/mips/jazz/setup.c
+++ b/arch/mips/jazz/setup.c
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -76,14 +75,6 @@ void __init plat_mem_setup(void)
 
_machine_restart = jazz_machine_restart;
 
-#ifdef CONFIG_VT
-   screen_info = (struct screen_info) {
-   .orig_video_cols= 160,
-   .orig_video_lines   = 64,
-   .orig_video_points  = 16,
-   };
-#endif
-
add_preferred_console("ttyS", 0, "9600");
 }
 
diff --git a/arch/nios2/kernel/setup.c b/arch/nios2/kernel/setup.c
index 8582ed9658447..da122a5fa43b2 100644
--- a/arch/nios2/kernel/setup.c
+++ b/arch/nios2/kernel/setup.c
@@ -19,7 +19,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
@@ -36,10 +35,6 @@ static struct pt_regs fake_regs = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0,
0};
 
-#ifdef CONFIG_VT
-struct screen_info screen_info;
-#endif
-
 /* Copy a short hook instruction sequence to the exception address */
 static inline void copy_exception_handler(unsigned int addr)
 {
diff --git a/arch/sh/kernel/setup.c b/arch/sh/kernel/setup.c
index b3da2757faaf3..3d80515298d26 100644
--- a/arch/sh/kernel/setup.c
+++ b/arch/sh/kernel/setup.c
@@ -7,7 +7,6 @@
  *  Copyright (C) 1999  Niibe Yutaka
  *  Copyright (C) 2002 - 2010 Paul Mundt
  */
-#include 
 #include 
 #include 
 #include 
@@ -69,10 +68,6 @@ EXPORT_SYMBOL(cpu_data);
 struct sh_machine_vector sh_mv = { .mv_name = "generic", };
 EXPORT_SYMBOL(sh_mv);
 
-#ifdef CONFIG_VT
-struct screen_info screen_info;
-#endif
-
 extern int root_mountflags;
 
 #define RAMDISK_IMAGE_START_MASK   0x07FF
diff --git a/arch/sparc/kernel/setup_32.c b/arch/sparc/kernel/setup_32.c
index 34ef7febf0d56..e3b72a7b46d37 100644
--- a/arch/sparc/kernel/setup_32.c
+++ b/arch/sparc/kernel/setup_32.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -51,18 +50,6 @@
 
 #include "kernel.h"
 
-struct screen_info screen_info = {
-   0, 0,   /* orig-x, orig-y */
-   0,  /* unused */
-   0,  /* orig-video-page */
-   0,  /* orig-video-mode */
-   128,/* orig-video-cols

[PATCH v2 3/9] dummycon: limit Arm console size hack to footbridge

2023-07-19 Thread Arnd Bergmann

From: Arnd Bergmann 

The dummycon default console size used to be determined by architecture,
but now this is a Kconfig setting on everything except ARM. Tracing this
back in the historic git trees, this was used to match the size of VGA
console or VGA framebuffer on early machines, but nowadays that code is
no longer used, except probably on the old footbridge/netwinder since
that is the only one that supports vgacon.

On machines with a framebuffer, booting with DT so far results in always
using the hardcoded 80x30 size in dummycon, while on ATAGS the setting
can come from a bootloader specific override. Both seem to be worse
choices than the Kconfig setting, since the actual text size for fbcon
also depends on the selected font.

Make this work the same way as everywhere else and use the normal
Kconfig setting, except for the footbridge with vgacon, which keeps
using the traditional code. If vgacon is disabled, footbridge can
also ignore the setting. This means the screen_info only has to be
provided when either vgacon or EFI are enabled now.

To limit the number of surprises on Arm, change the Kconfig default
to the previously used 80x30 setting instead of the usual 80x25.

Reviewed-by: Thomas Zimmermann 
Tested-by: Linus Walleij 
Reviewed-by: Javier Martinez Canillas 
Signed-off-by: Arnd Bergmann 
---
 arch/arm/kernel/atags_parse.c| 2 +-
 arch/arm/kernel/setup.c  | 3 +--
 drivers/video/console/Kconfig| 5 +++--
 drivers/video/console/dummycon.c | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm/kernel/atags_parse.c b/arch/arm/kernel/atags_parse.c
index 33f6eb5213a5a..4c815da3b77b0 100644
--- a/arch/arm/kernel/atags_parse.c
+++ b/arch/arm/kernel/atags_parse.c
@@ -69,7 +69,7 @@ static int __init parse_tag_mem32(const struct tag *tag)
 
 __tagtable(ATAG_MEM, parse_tag_mem32);
 
-#if defined(CONFIG_VGA_CONSOLE) || defined(CONFIG_DUMMY_CONSOLE)
+#if defined(CONFIG_ARCH_FOOTBRIDGE) && defined(CONFIG_VGA_CONSOLE)
 static int __init parse_tag_videotext(const struct tag *tag)
 {
screen_info.orig_x= tag->u.videotext.x;
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index c66b560562b30..40326a35a179b 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -928,8 +928,7 @@ static void __init request_standard_resources(const struct machine_desc *mdesc)
request_resource(_resource, );
 }
 
-#if defined(CONFIG_VGA_CONSOLE) || defined(CONFIG_DUMMY_CONSOLE) || \
-defined(CONFIG_EFI)
+#if defined(CONFIG_VGA_CONSOLE) || defined(CONFIG_EFI)
 struct screen_info screen_info = {
  .orig_video_lines = 30,
  .orig_video_cols  = 80,
diff --git a/drivers/video/console/Kconfig b/drivers/video/console/Kconfig
index 6af90db6d2da9..b575cf54174af 100644
--- a/drivers/video/console/Kconfig
+++ b/drivers/video/console/Kconfig
@@ -52,7 +52,7 @@ config DUMMY_CONSOLE
 
 config DUMMY_CONSOLE_COLUMNS
int "Initial number of console screen columns"
-   depends on DUMMY_CONSOLE && !ARM
+   depends on DUMMY_CONSOLE && !ARCH_FOOTBRIDGE
default 160 if PARISC
default 80
help
@@ -62,8 +62,9 @@ config DUMMY_CONSOLE_COLUMNS
 
 config DUMMY_CONSOLE_ROWS
int "Initial number of console screen rows"
-   depends on DUMMY_CONSOLE && !ARM
+   depends on DUMMY_CONSOLE && !ARCH_FOOTBRIDGE
default 64 if PARISC
+   default 30 if ARM
default 25
help
  On PA-RISC, the default value is 64, which should fit a 1280x1024
diff --git a/drivers/video/console/dummycon.c b/drivers/video/console/dummycon.c
index f1711b2f9ff05..70549fecee12c 100644
--- a/drivers/video/console/dummycon.c
+++ b/drivers/video/console/dummycon.c
@@ -18,7 +18,7 @@
  *  Dummy console driver
  */
 
-#if defined(__arm__)
+#if defined(CONFIG_ARCH_FOOTBRIDGE) && defined(CONFIG_VGA_CONSOLE)
 #define DUMMY_COLUMNS  screen_info.orig_video_cols
 #define DUMMY_ROWS screen_info.orig_video_lines
 #else
-- 
2.39.2
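
The size selection that this patch leaves in dummycon.c can be sketched in user space as follows. This is an illustration under assumptions: the CONFIG_* fallbacks stand in for values that Kconfig would normally generate (80x30 being the Arm default named in the commit message), the #else branch is inferred from the DUMMY_CONSOLE_COLUMNS and DUMMY_CONSOLE_ROWS symbols rather than quoted from the file, and the struct and helper functions are hypothetical.

```c
/* Stand-ins for Kconfig-generated values; 80x30 is the Arm default. */
#ifndef CONFIG_DUMMY_CONSOLE_COLUMNS
#define CONFIG_DUMMY_CONSOLE_COLUMNS 80
#endif
#ifndef CONFIG_DUMMY_CONSOLE_ROWS
#define CONFIG_DUMMY_CONSOLE_ROWS 30
#endif

#if defined(CONFIG_ARCH_FOOTBRIDGE) && defined(CONFIG_VGA_CONSOLE)
/* Hypothetical, reduced screen_info: only footbridge with vgacon reads it. */
struct sketch_screen_info {
    unsigned char orig_video_cols;
    unsigned char orig_video_lines;
};
static struct sketch_screen_info screen_info = { 80, 30 };
#define DUMMY_COLUMNS screen_info.orig_video_cols
#define DUMMY_ROWS    screen_info.orig_video_lines
#else
/* Everything else takes the size from the Kconfig setting. */
#define DUMMY_COLUMNS CONFIG_DUMMY_CONSOLE_COLUMNS
#define DUMMY_ROWS    CONFIG_DUMMY_CONSOLE_ROWS
#endif

/* Hypothetical accessors so that the selected size can be inspected. */
static int sketch_dummy_columns(void)
{
    return DUMMY_COLUMNS;
}

static int sketch_dummy_rows(void)
{
    return DUMMY_ROWS;
}
```

Built without any CONFIG_ARCH_FOOTBRIDGE or CONFIG_VGA_CONSOLE definitions, the accessors return the Kconfig-style constants and no screen_info object is needed at all.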

[PATCH v2 2/9] vgacon: rework screen_info #ifdef checks

2023-07-19 Thread Arnd Bergmann

From: Arnd Bergmann 

On non-x86 architectures, the screen_info variable is generally only
used for the VGA console where supported, and in some cases the EFI
framebuffer or vga16fb.

Now that we have a definite list of which architectures actually use it
for what, use consistent #ifdef checks so the global variable is only
defined when it is actually used on those architectures.

Loongarch and riscv have no support for vgacon or vga16fb, but
they support EFI firmware, so only that needs to be checked, and the
initialization can be removed because that is handled by EFI.
IA64 has both vgacon and EFI, though EFI apparently never uses
a framebuffer here.

Reviewed-by: Javier Martinez Canillas 
Reviewed-by: Thomas Zimmermann 
Signed-off-by: Arnd Bergmann 
---
v2 changes:
 - split out mips/jazz change
 - improve ia64 #ifdef changes
---
 arch/alpha/kernel/setup.c  |  2 ++
 arch/alpha/kernel/sys_sio.c|  2 ++
 arch/ia64/kernel/setup.c   |  6 ++
 arch/loongarch/kernel/setup.c  |  2 ++
 arch/mips/kernel/setup.c   |  2 +-
 arch/mips/sibyte/swarm/setup.c |  2 +-
 arch/mips/sni/setup.c  |  2 +-
 arch/riscv/kernel/setup.c  | 11 ++-
 8 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c
index b650ff1cb022e..b4d2297765c02 100644
--- a/arch/alpha/kernel/setup.c
+++ b/arch/alpha/kernel/setup.c
@@ -131,6 +131,7 @@ static void determine_cpu_caches (unsigned int);
 
 static char __initdata command_line[COMMAND_LINE_SIZE];
 
+#ifdef CONFIG_VGA_CONSOLE
 /*
  * The format of "screen_info" is strange, and due to early
  * i386-setup code. This is just enough to make the console
@@ -147,6 +148,7 @@ struct screen_info screen_info = {
 };
 
 EXPORT_SYMBOL(screen_info);
+#endif
 
 /*
  * The direct map I/O window, if any.  This should be the same
diff --git a/arch/alpha/kernel/sys_sio.c b/arch/alpha/kernel/sys_sio.c
index 7c420d8dac53d..7de8a5d2d2066 100644
--- a/arch/alpha/kernel/sys_sio.c
+++ b/arch/alpha/kernel/sys_sio.c
@@ -57,11 +57,13 @@ sio_init_irq(void)
 static inline void __init
 alphabook1_init_arch(void)
 {
+#ifdef CONFIG_VGA_CONSOLE
/* The AlphaBook1 has LCD video fixed at 800x600,
   37 rows and 100 cols. */
screen_info.orig_y = 37;
screen_info.orig_video_cols = 100;
screen_info.orig_video_lines = 37;
+#endif
 
lca_init_arch();
 }
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 5a55ac82c13a4..d2c66efdde560 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -86,9 +86,13 @@ EXPORT_SYMBOL(local_per_cpu_offset);
 #endif
 unsigned long ia64_cycles_per_usec;
 struct ia64_boot_param *ia64_boot_param;
+#if defined(CONFIG_VGA_CONSOLE) || defined(CONFIG_EFI)
 struct screen_info screen_info;
+#endif
+#ifdef CONFIG_VGA_CONSOLE
 unsigned long vga_console_iobase;
 unsigned long vga_console_membase;
+#endif
 
 static struct resource data_resource = {
.name   = "Kernel data",
@@ -497,6 +501,7 @@ early_console_setup (char *cmdline)
 static void __init
 screen_info_setup(void)
 {
+#ifdef CONFIG_VGA_CONSOLE
unsigned int orig_x, orig_y, num_cols, num_rows, font_height;
 
memset(&screen_info, 0, sizeof(screen_info));
@@ -525,6 +530,7 @@ screen_info_setup(void)
screen_info.orig_video_mode = 3;/* XXX fake */
screen_info.orig_video_isVGA = 1;   /* XXX fake */
screen_info.orig_video_ega_bx = 3;  /* XXX fake */
+#endif
 }
 
 static inline void
diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
index 95e6b579dfdd1..77e7a3722caa6 100644
--- a/arch/loongarch/kernel/setup.c
+++ b/arch/loongarch/kernel/setup.c
@@ -57,7 +57,9 @@
 #define SMBIOS_CORE_PACKAGE_OFFSET 0x23
 #define LOONGSON_EFI_ENABLE(1 << 3)
 
+#ifdef CONFIG_EFI
 struct screen_info screen_info __section(".data");
+#endif
 
 unsigned long fw_arg0, fw_arg1, fw_arg2;
 DEFINE_PER_CPU(unsigned long, kernelsp);
diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index cb871eb784a7c..1aba7dc95132c 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -54,7 +54,7 @@ struct cpuinfo_mips cpu_data[NR_CPUS] __read_mostly;
 
 EXPORT_SYMBOL(cpu_data);
 
-#ifdef CONFIG_VT
+#ifdef CONFIG_VGA_CONSOLE
 struct screen_info screen_info;
 #endif
 
diff --git a/arch/mips/sibyte/swarm/setup.c b/arch/mips/sibyte/swarm/setup.c
index 76683993cdd3a..37df504d3ecbb 100644
--- a/arch/mips/sibyte/swarm/setup.c
+++ b/arch/mips/sibyte/swarm/setup.c
@@ -129,7 +129,7 @@ void __init plat_mem_setup(void)
if (m41t81_probe())
swarm_rtc_type = RTC_M41T81;
 
-#ifdef CONFIG_VT
+#ifdef CONFIG_VGA_CONSOLE
screen_info = (struct screen_info) {
.orig_video_page= 52,
.orig_video_mode= 3,
diff --git a/arch/mips/sni/setup.c b/arch/mips/sni/setup.c
index efad85c8c823b..9984cf91be7d0 100644
--- a/arch/mips/sni/setup.c
+++

[PATCH v2 1/9] vgacon: rework Kconfig dependencies

2023-07-19 Thread Arnd Bergmann

From: Arnd Bergmann 

The list of dependencies here is phrased as an opt-out, but this is missing
a lot of architectures that don't actually support VGA consoles, and some
of the entries are stale:

 - powerpc used to support VGA consoles in the old arch/ppc codebase, but
   the merged arch/powerpc never did

 - arm lists footbridge, integrator and netwinder, but netwinder is actually
   part of footbridge, and integrator does not appear to have any actual
   VGA hardware, or list it in its ATAG or DT.

 - mips has a few platforms (malta, sibyte, and sni) that initialize
   screen_info, on everything else the console is selected but cannot
   actually work.

 - csky, hexagon, loongarch, nios2, riscv and xtensa are not listed
   in the opt-out table and declare a screen_info to allow building
   vga_con, but this cannot work because the console is never selected.

Replace this with an opt-in table that lists only the platforms that
remain. This is effectively x86, plus a couple of historic workstation
and server machines that reused parts of the x86 system architecture.

Reviewed-by: Javier Martinez Canillas 
Reviewed-by: Thomas Zimmermann 
Signed-off-by: Arnd Bergmann 
---
 drivers/video/console/Kconfig | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/video/console/Kconfig b/drivers/video/console/Kconfig
index 1b5a319971ed0..6af90db6d2da9 100644
--- a/drivers/video/console/Kconfig
+++ b/drivers/video/console/Kconfig
@@ -7,9 +7,9 @@ menu "Console display driver support"
 
 config VGA_CONSOLE
bool "VGA text console" if EXPERT || !X86
-   depends on !4xx && !PPC_8xx && !SPARC && !M68K && !PARISC &&  !SUPERH && \
-   (!ARM || ARCH_FOOTBRIDGE || ARCH_INTEGRATOR || ARCH_NETWINDER) && \
-   !ARM64 && !ARC && !MICROBLAZE && !OPENRISC && !S390 && !UML
+   depends on ALPHA || IA64 || X86 || \
+   (ARM && ARCH_FOOTBRIDGE) || \
+   (MIPS && (MIPS_MALTA || SIBYTE_BCM112X || SIBYTE_SB1250 || SIBYTE_BCM1x80 || SNI_RM))
select APERTURE_HELPERS if (DRM || FB || VFIO_PCI_CORE)
default y
help
-- 
2.39.2

[PATCH v2 0/9] video: screen_info cleanups

2023-07-19 Thread Arnd Bergmann

From: Arnd Bergmann 

I refreshed the first four patches that I sent before with very minor
updates, and then added some more to further disaggregate the use
of screen_info:

 - I found that powerpc wasn't using vga16fb any more

 - vgacon can be almost entirely separated from the global
   screen_info, except on x86

 - similarly, the EFI framebuffer initialization can be
   kept separate, except on x86.

I did extensive build testing on arm/arm64/x86 and the normal build bot
testing for the other architectures.

Which tree should this get merged through?

Link: https://lore.kernel.org/lkml/20230707095415.1449376-1-a...@kernel.org/

Arnd Bergmann (9):
  vgacon: rework Kconfig dependencies
  vgacon: rework screen_info #ifdef checks
  dummycon: limit Arm console size hack to footbridge
  vgacon, arch/*: remove unused screen_info definitions
  vgacon: remove screen_info dependency
  vgacon: clean up global screen_info instances
  vga16fb: drop powerpc support
  hyperv: avoid dependency on screen_info
  efi: move screen_info into efi init code

 arch/alpha/kernel/proto.h |  2 +
 arch/alpha/kernel/setup.c |  8 +--
 arch/alpha/kernel/sys_sio.c   |  8 ++-
 arch/arm/include/asm/setup.h  |  5 ++
 arch/arm/kernel/atags_parse.c | 20 +++---
 arch/arm/kernel/efi.c |  6 --
 arch/arm/kernel/setup.c   |  7 +-
 arch/arm64/kernel/efi.c   |  4 --
 arch/arm64/kernel/image-vars.h|  2 +
 arch/csky/kernel/setup.c  | 12 
 arch/hexagon/kernel/Makefile  |  2 -
 arch/hexagon/kernel/screen_info.c |  3 -
 arch/ia64/kernel/setup.c  | 51 +++---
 arch/loongarch/kernel/efi.c   |  3 +-
 arch/loongarch/kernel/image-vars.h|  2 +
 arch/loongarch/kernel/setup.c |  3 -
 arch/mips/jazz/setup.c|  9 ---
 arch/mips/kernel/setup.c  | 11 ---
 arch/mips/mti-malta/malta-setup.c |  4 +-
 arch/mips/sibyte/swarm/setup.c| 26 ---
 arch/mips/sni/setup.c | 18 ++---
 arch/nios2/kernel/setup.c |  5 --
 arch/powerpc/kernel/setup-common.c| 16 -
 arch/riscv/kernel/setup.c | 12 
 arch/sh/kernel/setup.c|  5 --
 arch/sparc/kernel/setup_32.c  | 13 
 arch/sparc/kernel/setup_64.c  | 13 
 arch/x86/kernel/setup.c   |  2 +-
 arch/xtensa/kernel/setup.c| 12 
 drivers/firmware/efi/efi-init.c   | 14 +++-
 drivers/firmware/efi/libstub/efi-stub-entry.c |  8 ++-
 drivers/firmware/pcdp.c   |  1 -
 drivers/gpu/drm/hyperv/hyperv_drm_drv.c   |  7 +-
 drivers/hv/vmbus_drv.c|  6 +-
 drivers/video/console/Kconfig | 11 +--
 drivers/video/console/dummycon.c  |  2 +-
 drivers/video/console/vgacon.c| 68 +++
 drivers/video/fbdev/Kconfig   |  2 +-
 drivers/video/fbdev/hyperv_fb.c   |  8 +--
 drivers/video/fbdev/vga16fb.c |  9 +--
 include/linux/console.h   |  7 ++
 41 files changed, 178 insertions(+), 249 deletions(-)
 delete mode 100644 arch/hexagon/kernel/screen_info.c

-- 
2.39.2

Cc: "David S. Miller" 
Cc: "K. Y. Srinivasan" 
Cc: Ard Biesheuvel 
Cc: Borislav Petkov 
Cc: Brian Cain 
Cc: Catalin Marinas 
Cc: Christophe Leroy 
Cc: Daniel Vetter 
Cc: Dave Hansen 
Cc: David Airlie 
Cc: Deepak Rawat 
Cc: Dexuan Cui 
Cc: Dinh Nguyen 
Cc: Greg Kroah-Hartman 
Cc: Guo Ren 
Cc: Haiyang Zhang 
Cc: Helge Deller 
Cc: Huacai Chen 
Cc: Ingo Molnar 
Cc: Javier Martinez Canillas 
Cc: John Paul Adrian Glaubitz 
Cc: Khalid Aziz 
Cc: Linus Walleij 
Cc: Matt Turner 
Cc: Max Filippov 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Palmer Dabbelt 
Cc: Russell King 
Cc: Thomas Bogendoerfer 
Cc: Thomas Gleixner 
Cc: Thomas Zimmermann 
Cc: WANG Xuerui 
Cc: Wei Liu 
Cc: Will Deacon 
Cc: x...@kernel.org
Cc: linux-al...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-...@vger.kernel.org
Cc: linux-c...@vger.kernel.org
Cc: linux-hexa...@vger.kernel.org
Cc: linux-i...@vger.kernel.org
Cc: loonga...@lists.linux.dev
Cc: linux-m...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-ri...@lists.infradead.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux-hyp...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Cc: linux-fb...@vger.kernel.org

Re: linux-next: Tree for Jul 13 (drivers/video/fbdev/ps3fb.c)

2023-07-19 Thread Bagas Sanjaya

On 7/18/23 17:06, Thorsten Leemhuis wrote:
> I'm missing something here:
> 
> * What makes you think this is caused by bdb616479eff419? I didn't see
> anything in the thread that claims this, but I might be missing something
> * related: if I understand Randy right, this is only happening in -next;
> so why is bdb616479eff419 the culprit, which is also in mainline since
> End of June?
> 

Actually drivers/video/fbdev/ps3fb.c had only two non-merge commits during the
previous cycle: 25ec15abb06194 and bdb616479eff419. The former simply added
the .owner field to ps3fb_ops (hence trivial), so I inferred that the
culprit was likely the latter (since it was authored by Thomas).

Thanks for the question.

-- 
An old man doll... just what I always wanted! - Clara

Re: [PATCH 2/2] PCI: layerscape: Add the workaround for lost link capablities during reset

2023-07-19 Thread Markus Elfring

> A workaround for the issue where …

Would you like to avoid a typo in the subject for the final commit message?


Would a cover letter also be helpful for this small patch series?

Regards,
Markus

[PATCH v2 5/5] mmu_notifiers: Rename invalidate_range notifier

2023-07-19 Thread Alistair Popple

There are two main use cases for mmu notifiers. One is by KVM which
uses mmu_notifier_invalidate_range_start()/end() to manage a software
TLB.

The other is to manage hardware TLBs which need to use the
invalidate_range() callback because HW can establish new TLB entries
at any time. Hence using start/end() can lead to memory corruption as
these callbacks happen too soon/late during page unmap.

mmu notifier users should therefore either use the start()/end()
callbacks or the invalidate_range() callbacks. To make this usage
clearer rename the invalidate_range() callback to
arch_invalidate_secondary_tlbs() and update documentation.

Signed-off-by: Alistair Popple 
Suggested-by: Jason Gunthorpe 
---
 arch/arm64/include/asm/tlbflush.h   |  6 +-
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c|  2 +-
 arch/powerpc/mm/book3s64/radix_tlb.c| 10 ++--
 arch/x86/mm/tlb.c   |  4 +-
 drivers/iommu/amd/iommu_v2.c| 10 ++--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 13 ++---
 drivers/iommu/intel/svm.c   |  8 +--
 drivers/misc/ocxl/link.c|  8 +--
 include/linux/mmu_notifier.h| 48 +-
 mm/huge_memory.c|  4 +-
 mm/hugetlb.c|  7 +--
 mm/mmu_notifier.c   | 20 ++--
 12 files changed, 76 insertions(+), 64 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index a99349d..84a05a0 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -253,7 +253,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
__tlbi(aside1is, asid);
__tlbi_user(aside1is, asid);
dsb(ish);
-   mmu_notifier_invalidate_range(mm, 0, -1UL);
+   mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL);
 }
 
 static inline void __flush_tlb_page_nosync(struct mm_struct *mm,
@@ -265,7 +265,7 @@ static inline void __flush_tlb_page_nosync(struct mm_struct *mm,
addr = __TLBI_VADDR(uaddr, ASID(mm));
__tlbi(vale1is, addr);
__tlbi_user(vale1is, addr);
-   mmu_notifier_invalidate_range(mm, uaddr & PAGE_MASK,
+   mmu_notifier_arch_invalidate_secondary_tlbs(mm, uaddr & PAGE_MASK,
(uaddr & PAGE_MASK) + PAGE_SIZE);
 }
 
@@ -400,7 +400,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
scale++;
}
dsb(ish);
-   mmu_notifier_invalidate_range(vma->vm_mm, start, end);
+   mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end);
 }
 
 static inline void flush_tlb_range(struct vm_area_struct *vma,
diff --git a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c b/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
index f3fb49f..17075c7 100644
--- a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
+++ b/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
@@ -39,7 +39,7 @@ void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma, unsigned long st
radix__flush_tlb_pwc_range_psize(vma->vm_mm, start, end, psize);
else
radix__flush_tlb_range_psize(vma->vm_mm, start, end, psize);
-   mmu_notifier_invalidate_range(vma->vm_mm, start, end);
+   mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end);
 }
 
 void radix__huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 9724b26..64c11a4 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -752,7 +752,7 @@ void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmadd
return radix__local_flush_hugetlb_page(vma, vmaddr);
 #endif
radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, mmu_virtual_psize);
-   mmu_notifier_invalidate_range(vma->vm_mm, vmaddr,
+   mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, vmaddr,
vmaddr + mmu_virtual_psize);
 }
 EXPORT_SYMBOL(radix__local_flush_tlb_page);
@@ -989,7 +989,7 @@ void radix__flush_tlb_mm(struct mm_struct *mm)
}
}
preempt_enable();
-   mmu_notifier_invalidate_range(mm, 0, -1UL);
+   mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL);
 }
 EXPORT_SYMBOL(radix__flush_tlb_mm);
 
@@ -1023,7 +1023,7 @@ static void __flush_all_mm(struct mm_struct *mm, bool fullmm)
_tlbiel_pid_multicast(mm, pid, RIC_FLUSH_ALL);
}
preempt_enable();
-   mmu_notifier_invalidate_range(mm, 0, -1UL);
+   mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL);
 }
 
 void radix__flush_all_mm(struct mm_struct *mm)
@@ -1232,7 +1232,7 @@ static inline void __radix__flush_tlb_range(struct mm_struct *mm,
}
 out:

[PATCH v2 4/5] mmu_notifiers: Don't invalidate secondary TLBs as part of mmu_notifier_invalidate_range_end()

2023-07-19 Thread Alistair Popple

Secondary TLBs are now invalidated from the architecture specific TLB
invalidation functions. Therefore there is no need to explicitly
notify or invalidate as part of the range end functions. This means we
can remove mmu_notifier_invalidate_range_end_only() and some of the
ptep_*_notify() functions.

Signed-off-by: Alistair Popple 
---
 include/linux/mmu_notifier.h | 56 +
 kernel/events/uprobes.c  |  2 +-
 mm/huge_memory.c | 25 ++---
 mm/hugetlb.c |  1 +-
 mm/memory.c  |  8 +
 mm/migrate_device.c  |  9 +-
 mm/mmu_notifier.c| 25 ++---
 mm/rmap.c| 40 +--
 8 files changed, 14 insertions(+), 152 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 64a3e05..f2e9edc 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -395,8 +395,7 @@ extern int __mmu_notifier_test_young(struct mm_struct *mm,
 extern void __mmu_notifier_change_pte(struct mm_struct *mm,
  unsigned long address, pte_t pte);
 extern int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *r);
-extern void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *r,
- bool only_end);
+extern void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *r);
 extern void __mmu_notifier_invalidate_range(struct mm_struct *mm,
  unsigned long start, unsigned long end);
 extern bool
@@ -481,14 +480,7 @@ mmu_notifier_invalidate_range_end(struct mmu_notifier_range *range)
might_sleep();
 
if (mm_has_notifiers(range->mm))
-   __mmu_notifier_invalidate_range_end(range, false);
-}
-
-static inline void
-mmu_notifier_invalidate_range_only_end(struct mmu_notifier_range *range)
-{
-   if (mm_has_notifiers(range->mm))
-   __mmu_notifier_invalidate_range_end(range, true);
+   __mmu_notifier_invalidate_range_end(range);
 }
 
 static inline void mmu_notifier_invalidate_range(struct mm_struct *mm,
@@ -582,45 +574,6 @@ static inline void mmu_notifier_range_init_owner(
__young;\
 })
 
-#define ptep_clear_flush_notify(__vma, __address, __ptep)   \
-({ \
-   unsigned long ___addr = __address & PAGE_MASK;  \
-   struct mm_struct *___mm = (__vma)->vm_mm;   \
-   pte_t ___pte;   \
-   \
-   ___pte = ptep_clear_flush(__vma, __address, __ptep);\
-   mmu_notifier_invalidate_range(___mm, ___addr,   \
-   ___addr + PAGE_SIZE);   \
-   \
-   ___pte; \
-})
-
-#define pmdp_huge_clear_flush_notify(__vma, __haddr, __pmd)\
-({ \
-   unsigned long ___haddr = __haddr & HPAGE_PMD_MASK;  \
-   struct mm_struct *___mm = (__vma)->vm_mm;   \
-   pmd_t ___pmd;   \
-   \
-   ___pmd = pmdp_huge_clear_flush(__vma, __haddr, __pmd);  \
-   mmu_notifier_invalidate_range(___mm, ___haddr,  \
- ___haddr + HPAGE_PMD_SIZE);   \
-   \
-   ___pmd; \
-})
-
-#define pudp_huge_clear_flush_notify(__vma, __haddr, __pud)\
-({ \
-   unsigned long ___haddr = __haddr & HPAGE_PUD_MASK;  \
-   struct mm_struct *___mm = (__vma)->vm_mm;   \
-   pud_t ___pud;   \
-   \
-   ___pud = pudp_huge_clear_flush(__vma, __haddr, __pud);  \
-   mmu_notifier_invalidate_range(___mm, ___haddr,  \
- ___haddr + HPAGE_PUD_SIZE);   \
-   \
-   ___pud; \
-})
-
 /*
  * set_pte_at_notify() sets the pte _after_ running the notifier.
  * This is safe to start by updating the secondary MMUs, because the primary MMU
@@ -711,11 +664,6 @@

[PATCH v2 3/5] mmu_notifiers: Call invalidate_range() when invalidating TLBs

2023-07-19 Thread Alistair Popple

The invalidate_range() is going to become an architecture specific mmu
notifier used to keep the TLB of secondary MMUs such as an IOMMU in
sync with the CPU page tables. Currently it is called from separate
code paths to the main CPU TLB invalidations. This can lead to a
secondary TLB not getting invalidated when required and makes it hard
to reason about when exactly the secondary TLB is invalidated.

To fix this move the notifier call to the architecture specific TLB
maintenance functions for architectures that have secondary MMUs
requiring explicit software invalidations.

This fixes an SMMU bug on ARM64. On ARM64 PTE permission upgrades
require a TLB invalidation. This invalidation is done by the
architecture specific ptep_set_access_flags() which calls
flush_tlb_page() if required. However this doesn't call the notifier
resulting in infinite faults being generated by devices using the SMMU
if it has previously cached a read-only PTE in its TLB.

Moving the invalidations into the TLB invalidation functions ensures
all invalidations happen at the same time as the CPU invalidation. The
architecture specific flush_tlb_all() routines do not call the
notifier as none of the IOMMUs require this.

Signed-off-by: Alistair Popple 
Suggested-by: Jason Gunthorpe 
---
 arch/arm64/include/asm/tlbflush.h | 5 +
 arch/powerpc/include/asm/book3s/64/tlbflush.h | 1 +
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c  | 1 +
 arch/powerpc/mm/book3s64/radix_tlb.c  | 6 ++
 arch/x86/mm/tlb.c | 3 +++
 include/asm-generic/tlb.h | 1 -
 6 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 3456866..a99349d 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -252,6 +253,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
__tlbi(aside1is, asid);
__tlbi_user(aside1is, asid);
dsb(ish);
+   mmu_notifier_invalidate_range(mm, 0, -1UL);
 }
 
 static inline void __flush_tlb_page_nosync(struct mm_struct *mm,
@@ -263,6 +265,8 @@ static inline void __flush_tlb_page_nosync(struct mm_struct *mm,
addr = __TLBI_VADDR(uaddr, ASID(mm));
__tlbi(vale1is, addr);
__tlbi_user(vale1is, addr);
+   mmu_notifier_invalidate_range(mm, uaddr & PAGE_MASK,
+   (uaddr & PAGE_MASK) + PAGE_SIZE);
 }
 
 static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
@@ -396,6 +400,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
scale++;
}
dsb(ish);
+   mmu_notifier_invalidate_range(vma->vm_mm, start, end);
 }
 
 static inline void flush_tlb_range(struct vm_area_struct *vma,
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 0d0c144..dca0477 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -5,6 +5,7 @@
 #define MMU_NO_CONTEXT ~0UL
 
 #include 
+#include 
 #include 
 #include 
 
diff --git a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c b/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
index 5e31955..f3fb49f 100644
--- a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
+++ b/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
@@ -39,6 +39,7 @@ void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma, unsigned long st
radix__flush_tlb_pwc_range_psize(vma->vm_mm, start, end, psize);
else
radix__flush_tlb_range_psize(vma->vm_mm, start, end, psize);
+   mmu_notifier_invalidate_range(vma->vm_mm, start, end);
 }
 
 void radix__huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 0bd4866..9724b26 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -752,6 +752,8 @@ void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmadd
return radix__local_flush_hugetlb_page(vma, vmaddr);
 #endif
radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, mmu_virtual_psize);
+   mmu_notifier_invalidate_range(vma->vm_mm, vmaddr,
+   vmaddr + mmu_virtual_psize);
 }
 EXPORT_SYMBOL(radix__local_flush_tlb_page);
 
@@ -987,6 +989,7 @@ void radix__flush_tlb_mm(struct mm_struct *mm)
}
}
preempt_enable();
+   mmu_notifier_invalidate_range(mm, 0, -1UL);
 }
 EXPORT_SYMBOL(radix__flush_tlb_mm);
 
@@ -1020,6 +1023,7 @@ static void __flush_all_mm(struct mm_struct *mm, bool fullmm)
_tlbiel_pid_multicast(mm, pid, RIC_FLUSH_ALL);
}
preempt_enable();
+   mmu_notifier_invalidate_range(mm,

[PATCH v2 2/5] mmu_notifiers: Fixup comment in mmu_interval_read_begin()

2023-07-19 Thread Alistair Popple

The comment in mmu_interval_read_begin() refers to a function that
doesn't exist and uses the wrong call-back name. The op for mmu
interval notifiers is mmu_interval_notifier_ops->invalidate() so fix
the comment up to reflect that.

Signed-off-by: Alistair Popple 
---
 mm/mmu_notifier.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 50c0dde..b7ad155 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -199,7 +199,7 @@ mmu_interval_read_begin(struct mmu_interval_notifier *interval_sub)
 * invalidate_start/end and is colliding.
 *
 * The locking looks broadly like this:
-*   mn_tree_invalidate_start():  mmu_interval_read_begin():
+*   mn_itree_inv_start(): mmu_interval_read_begin():
 * spin_lock
 *  seq = READ_ONCE(interval_sub->invalidate_seq);
 *  seq == subs->invalidate_seq
@@ -207,7 +207,7 @@ mmu_interval_read_begin(struct mmu_interval_notifier *interval_sub)
 *spin_lock
 * seq = ++subscriptions->invalidate_seq
 *spin_unlock
-* op->invalidate_range():
+* op->invalidate():
 *   user_lock
 *mmu_interval_set_seq()
 * interval_sub->invalidate_seq = seq
-- 
git-series 0.9.1

[PATCH v2 1/5] arm64/smmu: Use TLBI ASID when invalidating entire range

2023-07-19 Thread Alistair Popple

The ARM SMMU has a specific command for invalidating the TLB for an
entire ASID. Currently this is used for the IO_PGTABLE API but not for
ATS when called from the MMU notifier.

The current implementation of notifiers does not attempt to invalidate
such a large address range, instead walking each VMA and invalidating
each range individually during mmap removal. However in future SMMU
TLB invalidations are going to be sent as part of the normal
flush_tlb_*() kernel calls. To better deal with that add handling to
use TLBI ASID when invalidating the entire address space.

Signed-off-by: Alistair Popple 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index a5a63b1..2a19784 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -200,10 +200,20 @@ static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn,
 * range. So do a simple translation here by calculating size correctly.
 */
size = end - start;
+   if (size == ULONG_MAX)
+   size = 0;
+
+   if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM)) {
+   if (!size)
+   arm_smmu_tlb_inv_asid(smmu_domain->smmu,
+ smmu_mn->cd->asid);
+   else
+   arm_smmu_tlb_inv_range_asid(start, size,
+   smmu_mn->cd->asid,
+   PAGE_SIZE, false,
+   smmu_domain);
+   }
 
-   if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM))
-   arm_smmu_tlb_inv_range_asid(start, size, smmu_mn->cd->asid,
-   PAGE_SIZE, false, smmu_domain);
arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, start, size);
 }
 
-- 
git-series 0.9.1
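
The size translation in the hunk above can be tried out in user space. The following is only a sketch under stated assumptions: `inv_kind`, `inv_request` and `pick_invalidation()` are hypothetical names that do not exist in the driver, and the two enum values merely stand in for the whole-ASID and ranged invalidation paths. It shows why a flush of the entire address space (start = 0, end = -1UL) ends up with size 0 and therefore takes the whole-ASID path.

```c
#include <limits.h>

/* Hypothetical stand-ins for the two invalidation paths in the patch. */
enum inv_kind { INV_WHOLE_ASID, INV_RANGE };

struct inv_request {
	enum inv_kind kind;
	unsigned long start;
	unsigned long size;	/* 0 is treated as "the whole address space" */
};

/*
 * Translate a notifier range given as (start, end) into a request.
 * A full flush arrives as start = 0, end = -1UL, so end - start is
 * ULONG_MAX; that value is mapped to size 0, as in the patch.
 */
struct inv_request pick_invalidation(unsigned long start, unsigned long end)
{
	struct inv_request req = { .start = start, .size = end - start };

	if (req.size == ULONG_MAX)
		req.size = 0;

	/* Size 0 selects the whole-ASID command, anything else a range. */
	req.kind = req.size ? INV_RANGE : INV_WHOLE_ASID;
	return req;
}
```

For example, `pick_invalidation(0, -1UL)` selects INV_WHOLE_ASID, while `pick_invalidation(0x1000, 0x3000)` selects INV_RANGE with a size of 0x2000. Note that in this sketch an empty range (start == end) also has size 0 and takes the whole-ASID path.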

[PATCH v2 0/5] Invalidate secondary IOMMU TLB on permission upgrade

2023-07-19 Thread Alistair Popple

The main change is to move secondary TLB invalidation mmu notifier
callbacks into the architecture specific TLB flushing functions. This
makes secondary TLB invalidation mostly match CPU invalidation while
still allowing efficient range based invalidations based on the
existing TLB batching code.

Changes for v2:

 - Rebased on linux-next commit 906fa30154ef ("mm/rmap: correct stale
   comment of rmap_walk_anon and rmap_walk_file") to fix a minor
   integration conflict with "arm64: support batched/deferred tlb
   shootdown during page reclamation/migration". This series will need
   to be applied after the conflicting patch.

 - Moved the function rename to the end of the series as many
   places that were getting renamed ended up being removed anyway.

 - Fixed a couple of build issues which broke bisection.

 - Added a minor patch to fix up a stale/incorrect comment.

==========
Background
==========

The arm64 architecture specifies TLB permission bits may be cached and
therefore the TLB must be invalidated during permission upgrades. For
the CPU this currently occurs in the architecture specific
ptep_set_access_flags() routine.

Secondary TLBs such as implemented by the SMMU IOMMU match the CPU
architecture specification and may also cache permission bits and
require the same TLB invalidations. This may be achieved in one of two
ways.

Some SMMU implementations implement broadcast TLB maintenance
(BTM). This snoops CPU TLB invalidates and will invalidate any
secondary TLB at the same time as the CPU. However implementations are
not required to implement BTM.

Implementations without BTM rely on mmu notifier callbacks to send
explicit TLB invalidation commands to invalidate SMMU TLB. Therefore
either generic kernel code or architecture specific code needs to call
the mmu notifier on permission upgrade.

Currently that doesn't happen so devices will fault indefinitely when
writing to a PTE that was previously read-only as nothing invalidates
the SMMU TLB.


========
Solution
========


To fix this the series first renames the .invalidate_range() callback
to .arch_invalidate_secondary_tlbs() as suggested by Jason and Sean to
make it clear this callback is only used for secondary TLBs. That was
made possible thanks to Sean's series [1] to remove KVM's incorrect
usage.

Based on feedback from Jason [2] the proposed solution to the bug is
to move the calls to mmu_notifier_arch_invalidate_secondary_tlbs()
closer to the architecture specific TLB invalidation code. This
ensures the secondary TLB won't miss invalidations, including the
existing invalidation in the ARM64 code to deal with permission
upgrade.

Currently only ARM64, PowerPC and x86 have IOMMU with secondary TLBs
requiring SW invalidation so the notifier is only called for those
architectures. It is also not called for invalidation of kernel
mappings as no secondary IOMMU implementations can access those and
hence it is not required.

[1] - https://lore.kernel.org/all/20230602011518.787006-1-sea...@google.com/
[2] - https://lore.kernel.org/linux-mm/zjmr5bw8l+bbz...@ziepe.ca/

Alistair Popple (5):
  arm64/smmu: Use TLBI ASID when invalidating entire range
  mmu_notifiers: Fixup comment in mmu_interval_read_begin()
  mmu_notifiers: Call invalidate_range() when invalidating TLBs
  mmu_notifiers: Don't invalidate secondary TLBs as part of mmu_notifier_invalidate_range_end()
  mmu_notifiers: Rename invalidate_range notifier

 arch/arm64/include/asm/tlbflush.h   |   5 +-
 arch/powerpc/include/asm/book3s/64/tlbflush.h   |   1 +-
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c|   1 +-
 arch/powerpc/mm/book3s64/radix_tlb.c|   6 +-
 arch/x86/mm/tlb.c   |   3 +-
 drivers/iommu/amd/iommu_v2.c|  10 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c |  29 +++--
 drivers/iommu/intel/svm.c   |   8 +-
 drivers/misc/ocxl/link.c|   8 +-
 include/asm-generic/tlb.h   |   1 +-
 include/linux/mmu_notifier.h| 104 -
 kernel/events/uprobes.c |   2 +-
 mm/huge_memory.c|  29 +
 mm/hugetlb.c|   8 +-
 mm/memory.c |   8 +-
 mm/migrate_device.c |   9 +-
 mm/mmu_notifier.c   |  49 +++-
 mm/rmap.c   |  40 +---
 18 files changed, 110 insertions(+), 211 deletions(-)

base-commit: 906fa30154ef42f93d28d7322860e76c6ae392ac
-- 
git-series 0.9.1
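
The failure mode described in the Background section can be illustrated with a toy model. This is a sketch only: every name below (`toy_pte`, `toy_secondary_tlb`, `toy_device_write()` and so on) is made up for illustration and none of it is kernel code. It models a single page whose read-only translation has been cached by a secondary TLB, and shows that a permission upgrade without an explicit secondary invalidation leaves the device faulting on every retry.

```c
#include <stdbool.h>

/* Toy model: one page, one page-table entry, one cached device TLB entry. */
struct toy_pte {
	bool writable;
};

struct toy_secondary_tlb {
	bool valid;		/* is a translation currently cached? */
	struct toy_pte cached;	/* copy of the PTE taken when it was filled */
};

/*
 * Device write: use the cached entry if there is one, otherwise refill
 * from the page table. Returning false models a permission fault.
 */
bool toy_device_write(struct toy_secondary_tlb *tlb, const struct toy_pte *pte)
{
	if (!tlb->valid) {
		tlb->cached = *pte;
		tlb->valid = true;
	}
	return tlb->cached.writable;
}

/* Stand-in for the notifier callback that invalidates the secondary TLB. */
void toy_invalidate_secondary_tlb(struct toy_secondary_tlb *tlb)
{
	tlb->valid = false;
}

/*
 * Permission upgrade. 'notify' selects whether the secondary TLB is
 * invalidated together with the (not modelled) CPU TLB invalidation.
 */
void toy_upgrade_to_writable(struct toy_pte *pte,
			     struct toy_secondary_tlb *tlb, bool notify)
{
	pte->writable = true;
	if (notify)
		toy_invalidate_secondary_tlb(tlb);
}
```

With `notify` set to false the stale read-only entry survives the upgrade and `toy_device_write()` keeps returning false no matter how often it is retried; with `notify` set to true the next write refills from the updated entry and succeeds.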

Re: [PATCH] powerpc/build: vdso linker warning for orphan sections

2023-07-19 Thread Michael Ellerman

John Ogness  writes:
> On 2023-07-18, Michael Ellerman  wrote:
>>> ld: warning: discarding dynamic section .rela.opd
>>>
>>> and bisects to:
>>>
>>> 8ad57add77d3 ("powerpc/build: vdso linker warning for orphan sections")
>>
>> Can you test with a newer compiler/binutils?
>
> Testing the Debian release cross compilers/binutils:
>
> Debian 10 / gcc 8.3.0  / ld 2.31.1: generates the warning
>
> Debian 11 / gcc 10.2.1 / ld 2.35.2: generates the warning
>
> Debian 12 / gcc 12.2.0 / ld 2.40:   does _not_ generate the warning
>
> I suppose moving to the newer toolchain is the workaround. Although it
> is a bit unusual to require such a modern toolchain in order to build a
> kernel without warnings.

I didn't mean that you should move to a new toolchain to avoid the
warning, I was just curious why you're the only person to see it.

I regularly test with a gcc 5.5.0 / ld 2.29 toolchain and gcc 13.1.1 /
ld 2.39, and I haven't seen the warning. I tried a bunch of others and
can't reproduce it.

Can you confirm that this makes the warning go away?

cheers


diff --git a/arch/powerpc/kernel/vdso/vdso64.lds.S b/arch/powerpc/kernel/vdso/vdso64.lds.S
index bda6c8cdd459..286e1597c548 100644
--- a/arch/powerpc/kernel/vdso/vdso64.lds.S
+++ b/arch/powerpc/kernel/vdso/vdso64.lds.S
@@ -85,7 +85,7 @@ SECTIONS
*(.branch_lt)
*(.data .data.* .gnu.linkonce.d.* .sdata*)
*(.bss .sbss .dynbss .dynsbss)
-   *(.opd)
+   *(.opd .rela.opd)
*(.glink .iplt .plt .rela*)
}
 }

[PATCH] platforms: powermac: "foo* bar" replace with "foo *bar"

2023-07-19 Thread hanyu001


Fix below checkpatch error:

/platforms/powermac/pfunc_core.c: ERROR: "foo* bar" should be "foo *bar"

Signed-off-by: Yu Han 
---
 arch/powerpc/platforms/powermac/pfunc_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powermac/pfunc_core.c b/arch/powerpc/platforms/powermac/pfunc_core.c

index 07555c1bb484..7d01352a69f5 100644
--- a/arch/powerpc/platforms/powermac/pfunc_core.c
+++ b/arch/powerpc/platforms/powermac/pfunc_core.c
@@ -105,7 +105,7 @@ static u32 pmf_next32(struct pmf_cmd *cmd)
 return value;
 }

-static const void* pmf_next_blob(struct pmf_cmd *cmd, int count)
+static const void *pmf_next_blob(struct pmf_cmd *cmd, int count)
 {
 const void *value;
 if ((cmd->cmdend - cmd->cmdptr) < count) {

[PATCH] platforms: powermac: Add require space after that ','

2023-07-19 Thread hanyu001


Fix below checkpatch errors:

platforms/powermac/pfunc_core.c:22: ERROR: space required after that ',' (ctx:VxV)
platforms/powermac/pfunc_core.c:22: ERROR: space required after that ',' (ctx:VxV)


Signed-off-by: Yu Han 
---
 arch/powerpc/platforms/powermac/pfunc_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powermac/pfunc_core.c b/arch/powerpc/platforms/powermac/pfunc_core.c

index 22741ddfd5b2..07555c1bb484 100644
--- a/arch/powerpc/platforms/powermac/pfunc_core.c
+++ b/arch/powerpc/platforms/powermac/pfunc_core.c
@@ -19,7 +19,7 @@
 /* Debug */
 #define LOG_PARSE(fmt...)
 #define LOG_ERROR(fmt...)printk(fmt)
-#define LOG_BLOB(t,b,c)
+#define LOG_BLOB(t, b, c)

 #undef DEBUG
 #ifdef DEBUG

Re: [RFC PATCH 05/21] ubifs: Pass worst-case buffer size to compression routines

2023-07-19 Thread Ard Biesheuvel

On Wed, 19 Jul 2023 at 00:38, Eric Biggers  wrote:
>
> On Tue, Jul 18, 2023 at 02:58:31PM +0200, Ard Biesheuvel wrote:
> > Currently, the ubifs code allocates a worst case buffer size to
> > recompress a data node, but does not pass the size of that buffer to the
> > compression code. This means that the compression code will never use
> > the additional space, and might fail spuriously due to lack of space.
> >
> > So let's multiply out_len by WORST_COMPR_FACTOR after allocating the
> > buffer. Doing so is guaranteed not to overflow, given that the preceding
> > kmalloc_array() call would have failed otherwise.
> >
> > Signed-off-by: Ard Biesheuvel 
> > ---
> >  fs/ubifs/journal.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
> > index dc52ac0f4a345f30..4e5961878f336033 100644
> > --- a/fs/ubifs/journal.c
> > +++ b/fs/ubifs/journal.c
> > @@ -1493,6 +1493,8 @@ static int truncate_data_node(const struct ubifs_info 
> > *c, const struct inode *in
> >   if (!buf)
> >   return -ENOMEM;
> >
> > + out_len *= WORST_COMPR_FACTOR;
> > +
> >   dlen = le32_to_cpu(dn->ch.len) - UBIFS_DATA_NODE_SZ;
> >   data_size = dn_size - UBIFS_DATA_NODE_SZ;
> >   compr_type = le16_to_cpu(dn->compr_type);
>
> This looks like another case where data that would be expanded by compression
> should just be stored uncompressed instead.
>
> In fact, it seems that UBIFS does that already.  ubifs_compress() has this:
>
> /*
>  * If the data compressed only slightly, it is better to leave it
>  * uncompressed to improve read speed.
>  */
> if (in_len - *out_len < UBIFS_MIN_COMPRESS_DIFF)
> goto no_compr;
>
> So it's unclear why the WORST_COMPR_FACTOR thing is needed at all.
>

It is not. The buffer is used for decompression in the truncation
path, so none of this logic even matters. Even if the subsequent
recompression of the truncated data node could result in expansion
beyond the uncompressed size of the original data (which seems
impossible to me), increasing the size of this buffer would not help
as it is the input buffer for the compression not the output buffer.
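
The "leave it uncompressed when the gain is small" check quoted above can be sketched in isolation. Assumptions: `toy_keep_compressed()` and `TOY_MIN_COMPRESS_DIFF` are hypothetical names, and the threshold of 64 bytes is an illustrative value, not something this thread states. Because the sketch uses unsigned `size_t`, expansion is checked explicitly before subtracting so the difference cannot wrap around.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative threshold; stands in for UBIFS_MIN_COMPRESS_DIFF. */
#define TOY_MIN_COMPRESS_DIFF 64

/*
 * Decide whether the compressed form is worth keeping.
 * in_len:  size of the original data
 * out_len: size the compressor produced
 */
bool toy_keep_compressed(size_t in_len, size_t out_len)
{
	/* Data that grew or stayed the same size is stored uncompressed. */
	if (out_len >= in_len)
		return false;

	/* Keep the compressed form only if it saves enough bytes. */
	return in_len - out_len >= TOY_MIN_COMPRESS_DIFF;
}
```

Under such a rule the stored data never exceeds the original size, which is consistent with the point made above that a larger worst-case output buffer brings no benefit.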

Re: [RFC PATCH v11 07/29] KVM: Add KVM_EXIT_MEMORY_FAULT exit

2023-07-19 Thread Yuan Yao

On Tue, Jul 18, 2023 at 04:44:50PM -0700, Sean Christopherson wrote:
> From: Chao Peng 
>
> This new KVM exit allows userspace to handle memory-related errors. It
> indicates an error happens in KVM at guest memory range [gpa, gpa+size).
> The flags includes additional information for userspace to handle the
> error. Currently bit 0 is defined as 'private memory' where '1'
> indicates error happens due to private memory access and '0' indicates
> error happens due to shared memory access.

Now it's bit 3:

#define KVM_MEMORY_EXIT_FLAG_PRIVATE (1ULL << 3)

I remember some other attributes were already introduced in v10:

#define KVM_MEMORY_ATTRIBUTE_READ  (1ULL << 0)
#define KVM_MEMORY_ATTRIBUTE_WRITE (1ULL << 1)
#define KVM_MEMORY_ATTRIBUTE_EXECUTE   (1ULL << 2)
#define KVM_MEMORY_ATTRIBUTE_PRIVATE   (1ULL << 3)

So was KVM_MEMORY_EXIT_FLAG_PRIVATE moved to bit 3 because of the above,
or for some other reason? (Sorry, I didn't follow v10 closely.)
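
To make the bit position concrete, here is a small user-space sketch of how the exit information might be decoded. Assumptions: the struct and helper names (`toy_memory_exit`, `toy_exit_is_private()`, `toy_exit_covers()`) are hypothetical, and the flag is redefined locally with the value from the quoted patch so that the sketch builds without kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as KVM_MEMORY_EXIT_FLAG_PRIVATE in the quoted patch. */
#define TOY_MEMORY_EXIT_FLAG_PRIVATE (1ULL << 3)

/* Mirrors the three fields of the 'memory' exit struct quoted below. */
struct toy_memory_exit {
	uint64_t flags;
	uint64_t gpa;
	uint64_t size;
};

/* True if the fault was caused by a private memory access (bit 3 set). */
bool toy_exit_is_private(const struct toy_memory_exit *e)
{
	return (e->flags & TOY_MEMORY_EXIT_FLAG_PRIVATE) != 0;
}

/* True if 'addr' lies in the half-open range [gpa, gpa + size). */
bool toy_exit_covers(const struct toy_memory_exit *e, uint64_t addr)
{
	return addr >= e->gpa && addr - e->gpa < e->size;
}
```

A handler that still tested bit 0, as the commit message describes, would misclassify every private fault as shared, which is why the description and the define need to agree.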

>
> When private memory is enabled, this new exit will be used for KVM to
> exit to userspace for shared <-> private memory conversion in memory
> encryption usage. In such usage, typically there are two kind of memory
> conversions:
>   - explicit conversion: happens when guest explicitly calls into KVM
> to map a range (as private or shared), KVM then exits to userspace
> to perform the map/unmap operations.
>   - implicit conversion: happens in KVM page fault handler where KVM
> exits to userspace for an implicit conversion when the page is in a
> different state than requested (private or shared).
>
> Suggested-by: Sean Christopherson 
> Co-developed-by: Yu Zhang 
> Signed-off-by: Yu Zhang 
> Signed-off-by: Chao Peng 
> Reviewed-by: Fuad Tabba 
> Tested-by: Fuad Tabba 
> Signed-off-by: Sean Christopherson 
> ---
>  Documentation/virt/kvm/api.rst | 22 ++
>  include/uapi/linux/kvm.h   |  8 
>  2 files changed, 30 insertions(+)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index c0ddd3035462..34d4ce66e0c8 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6700,6 +6700,28 @@ array field represents return values. The userspace should update the return
>  values of SBI call before resuming the VCPU. For more details on RISC-V SBI
>  spec refer, https://github.com/riscv/riscv-sbi-doc.
>
> +::
> +
> + /* KVM_EXIT_MEMORY_FAULT */
> + struct {
> +  #define KVM_MEMORY_EXIT_FLAG_PRIVATE   (1ULL << 3)
> + __u64 flags;
> + __u64 gpa;
> + __u64 size;
> + } memory;
> +
> +If exit reason is KVM_EXIT_MEMORY_FAULT then it indicates that the VCPU has
> +encountered a memory error which is not handled by KVM kernel module and
> +userspace may choose to handle it. The 'flags' field indicates the memory
> +properties of the exit.
> +
> + - KVM_MEMORY_EXIT_FLAG_PRIVATE - indicates the memory error is caused by
> +   private memory access when the bit is set. Otherwise the memory error is
> +   caused by shared memory access when the bit is clear.
> +
> +'gpa' and 'size' indicate the memory range the error occurs at. The userspace
> +may handle the error and return to KVM to retry the previous memory access.
> +
>  ::
>
>  /* KVM_EXIT_NOTIFY */
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 4d4b3de8ac55..6c6ed214b6ac 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -274,6 +274,7 @@ struct kvm_xen_exit {
>  #define KVM_EXIT_RISCV_SBI        35
>  #define KVM_EXIT_RISCV_CSR        36
>  #define KVM_EXIT_NOTIFY   37
> +#define KVM_EXIT_MEMORY_FAULT 38
>
>  /* For KVM_EXIT_INTERNAL_ERROR */
>  /* Emulate instruction failed. */
> @@ -520,6 +521,13 @@ struct kvm_run {
>  #define KVM_NOTIFY_CONTEXT_INVALID   (1 << 0)
>   __u32 flags;
>   } notify;
> + /* KVM_EXIT_MEMORY_FAULT */
> + struct {
> +#define KVM_MEMORY_EXIT_FLAG_PRIVATE (1ULL << 3)
> + __u64 flags;
> + __u64 gpa;
> + __u64 size;
> + } memory;
>   /* Fix the size of the union. */
>   char padding[256];
>   };
> --
> 2.41.0.255.g8b1d071c50-goog
>

Re: [RFC PATCH v11 05/29] KVM: Convert KVM_ARCH_WANT_MMU_NOTIFIER to CONFIG_KVM_GENERIC_MMU_NOTIFIER

2023-07-19 Thread Yuan Yao

On Tue, Jul 18, 2023 at 04:44:48PM -0700, Sean Christopherson wrote:
> Signed-off-by: Sean Christopherson 
> ---
>  arch/arm64/include/asm/kvm_host.h   |  2 --
>  arch/arm64/kvm/Kconfig  |  2 +-
>  arch/mips/include/asm/kvm_host.h|  2 --
>  arch/mips/kvm/Kconfig   |  2 +-
>  arch/powerpc/include/asm/kvm_host.h |  2 --
>  arch/powerpc/kvm/Kconfig|  8 
>  arch/powerpc/kvm/powerpc.c  |  4 +---
>  arch/riscv/include/asm/kvm_host.h   |  2 --
>  arch/riscv/kvm/Kconfig  |  2 +-
>  arch/x86/include/asm/kvm_host.h |  2 --
>  arch/x86/kvm/Kconfig|  2 +-
>  include/linux/kvm_host.h|  8 +---
>  virt/kvm/Kconfig|  4 
>  virt/kvm/kvm_main.c | 10 +-
>  14 files changed, 23 insertions(+), 29 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 8b6096753740..50d89d400bf1 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -912,8 +912,6 @@ int __kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
>  int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
> struct kvm_vcpu_events *events);
>
> -#define KVM_ARCH_WANT_MMU_NOTIFIER
> -
>  void kvm_arm_halt_guest(struct kvm *kvm);
>  void kvm_arm_resume_guest(struct kvm *kvm);
>
> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> index f531da6b362e..a650b46f4f2f 100644
> --- a/arch/arm64/kvm/Kconfig
> +++ b/arch/arm64/kvm/Kconfig
> @@ -22,7 +22,7 @@ menuconfig KVM
>   bool "Kernel-based Virtual Machine (KVM) support"
>   depends on HAVE_KVM
>   select KVM_GENERIC_HARDWARE_ENABLING
> - select MMU_NOTIFIER
> + select KVM_GENERIC_MMU_NOTIFIER
>   select PREEMPT_NOTIFIERS
>   select HAVE_KVM_CPU_RELAX_INTERCEPT
>   select HAVE_KVM_ARCH_TLB_FLUSH_ALL
> diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
> index 04cedf9f8811..22a41d941bf3 100644
> --- a/arch/mips/include/asm/kvm_host.h
> +++ b/arch/mips/include/asm/kvm_host.h
> @@ -810,8 +810,6 @@ int kvm_mips_mkclean_gpa_pt(struct kvm *kvm, gfn_t start_gfn, gfn_t end_gfn);
>  pgd_t *kvm_pgd_alloc(void);
>  void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
>
> -#define KVM_ARCH_WANT_MMU_NOTIFIER
> -
>  /* Emulation */
>  enum emulation_result update_pc(struct kvm_vcpu *vcpu, u32 cause);
>  int kvm_get_badinstr(u32 *opc, struct kvm_vcpu *vcpu, u32 *out);
> diff --git a/arch/mips/kvm/Kconfig b/arch/mips/kvm/Kconfig
> index a8cdba75f98d..c04987d2ed2e 100644
> --- a/arch/mips/kvm/Kconfig
> +++ b/arch/mips/kvm/Kconfig
> @@ -25,7 +25,7 @@ config KVM
>   select HAVE_KVM_EVENTFD
>   select HAVE_KVM_VCPU_ASYNC_IOCTL
>   select KVM_MMIO
> - select MMU_NOTIFIER
> + select KVM_GENERIC_MMU_NOTIFIER
>   select INTERVAL_TREE
>   select KVM_GENERIC_HARDWARE_ENABLING
>   help
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index 14ee0dece853..4b5c3f2acf78 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -62,8 +62,6 @@
>
>  #include 
>
> -#define KVM_ARCH_WANT_MMU_NOTIFIER
> -
>  #define HPTEG_CACHE_NUM  (1 << 15)
>  #define HPTEG_HASH_BITS_PTE  13
>  #define HPTEG_HASH_BITS_PTE_LONG 12
> diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
> index 902611954200..b33358ee6424 100644
> --- a/arch/powerpc/kvm/Kconfig
> +++ b/arch/powerpc/kvm/Kconfig
> @@ -42,7 +42,7 @@ config KVM_BOOK3S_64_HANDLER
>  config KVM_BOOK3S_PR_POSSIBLE
>   bool
>   select KVM_MMIO
> - select MMU_NOTIFIER
> + select KVM_GENERIC_MMU_NOTIFIER
>
>  config KVM_BOOK3S_HV_POSSIBLE
>   bool
> @@ -85,7 +85,7 @@ config KVM_BOOK3S_64_HV
>   tristate "KVM for POWER7 and later using hypervisor mode in host"
>   depends on KVM_BOOK3S_64 && PPC_POWERNV
>   select KVM_BOOK3S_HV_POSSIBLE
> - select MMU_NOTIFIER
> + select KVM_GENERIC_MMU_NOTIFIER
>   select CMA
>   help
> Support running unmodified book3s_64 guest kernels in
> @@ -194,7 +194,7 @@ config KVM_E500V2
>   depends on !CONTEXT_TRACKING_USER
>   select KVM
>   select KVM_MMIO
> - select MMU_NOTIFIER
> + select KVM_GENERIC_MMU_NOTIFIER
>   help
> Support running unmodified E500 guest kernels in virtual machines on
> E500v2 host processors.
> @@ -211,7 +211,7 @@ config KVM_E500MC
>   select KVM
>   select KVM_MMIO
>   select KVM_BOOKE_HV
> - select MMU_NOTIFIER
> + select KVM_GENERIC_MMU_NOTIFIER
>   help
> Support running unmodified E500MC/E5500/E6500 guest kernels in
> virtual machines on E500MC/E5500/E6500 host processors.
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 5cf9e5e3112a..f97fbac7eac9 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++

[PATCH] Revert "powerpc/64s: Remove support for ELFv1 little endian userspace"

2023-07-19 Thread Andrew Donnellan

This reverts commit 606787fed7268feb256957872586370b56af697a.

ELFv1 with LE has never been a thing, and people who try to make ELFv1 LE
binaries are maniacs who need to be stopped, but unfortunately there are
ELFv1 LE binaries out there in the wild.

One such binary is the ppc64el (as Debian calls it) helper for
arch-test[0], a tool for detecting architectures that can be executed on a
given machine by means of attempting to execute helper binaries compiled
for each architecture and seeing which binaries succeed and fail. The
helpers are small snippets of assembly, and the ppc64el assembly doesn't
include the right directives to generate an ELFv2 binary.

This results in arch-test incorrectly determining that a ppc64el kernel
can't execute a ppc64el userspace, which in turn means that a number of
developer tools such as debootstrap will break (assuming arch-test is
installed).

[0] https://github.com/kilobyte/arch-test

Signed-off-by: Andrew Donnellan 
Cc: Nicholas Piggin 
Cc: Naveen N Rao 
Cc: Christophe Leroy 
Cc: Adam Borowski 
---
 arch/powerpc/include/asm/elf.h | 6 --
 arch/powerpc/include/asm/thread_info.h | 6 +-
 2 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/elf.h b/arch/powerpc/include/asm/elf.h
index a26ca097d032..79f1c480b5eb 100644
--- a/arch/powerpc/include/asm/elf.h
+++ b/arch/powerpc/include/asm/elf.h
@@ -12,14 +12,8 @@
 
 /*
  * This is used to ensure we don't load something for the wrong architecture.
- * 64le only supports ELFv2 64-bit binaries (64be supports v1 and v2).
  */
-#if defined(CONFIG_PPC64) && defined(CONFIG_CPU_LITTLE_ENDIAN)
-#define elf_check_arch(x) (((x)->e_machine == ELF_ARCH) && \
-  (((x)->e_flags & 0x3) == 0x2))
-#else
 #define elf_check_arch(x) ((x)->e_machine == ELF_ARCH)
-#endif
 #define compat_elf_check_arch(x)   ((x)->e_machine == EM_PPC)
 
 #define CORE_DUMP_USE_REGSET
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index bc5d39a835fe..bf5dde1a4114 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -183,13 +183,9 @@ static inline bool test_thread_local_flags(unsigned int flags)
 #define clear_tsk_compat_task(tsk) do { } while (0)
 #endif
 
-#ifdef CONFIG_PPC64
-#ifdef CONFIG_CPU_BIG_ENDIAN
+#if defined(CONFIG_PPC64)
 #define is_elf2_task() (test_thread_flag(TIF_ELF2ABI))
 #else
-#define is_elf2_task() (1)
-#endif
-#else
 #define is_elf2_task() (0)
 #endif
 
-- 
2.41.0
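
To illustrate what the check removed by this revert tested: on ppc64 the
low two bits of e_flags carry the ABI version, and the deleted macro
required them to equal 0x2 (ELFv2) on little endian. A minimal sketch,
with hypothetical helper names, of why a binary assembled without the
ELFv2 directives would fail that test:

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask and value taken from the elf_check_arch() lines this revert removes. */
#define PPC64_ABI_MASK  0x3u
#define PPC64_ABI_ELFV2 0x2u

/* Hypothetical helper: extract the ABI version field from e_flags. */
static unsigned int ppc64_abi_version(uint32_t e_flags)
{
	return e_flags & PPC64_ABI_MASK;
}

/* Hypothetical helper: the extra little-endian condition the revert drops. */
static bool le_check_accepts(uint32_t e_flags)
{
	return ppc64_abi_version(e_flags) == PPC64_ABI_ELFV2;
}
```

Any e_flags value whose low two bits are not 0x2 is rejected by
le_check_accepts(), which matches the arch-test failure described in the
commit message.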

Re: [PATCH v7 5/8] KVM: x86/mmu: Don't pass FOLL_GET to __kvm_follow_pfn

2023-07-19 Thread David Stevens

On Wed, Jul 19, 2023 at 3:35 PM Yan Zhao  wrote:
>
> On Tue, Jul 04, 2023 at 04:50:50PM +0900, David Stevens wrote:
> > @@ -4451,7 +4461,8 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
> >
> >  out_unlock:
> >   write_unlock(&vcpu->kvm->mmu_lock);
> > - kvm_release_pfn_clean(fault->pfn);
> > + if (fault->is_refcounted_page)
> > + kvm_set_page_accessed(pfn_to_page(fault->pfn));
> For a refcounted page, as now KVM puts its ref early in kvm_faultin_pfn(),
> should this kvm_set_page_accessed() be placed before unlocking mmu_lock?
>
> Otherwise, if the user unmaps a region (which triggers kvm_unmap_gfn_range()
> with mmu_lock held for write) and releases the page, and if those two
> steps happen after the page_count() check in kvm_set_page_accessed() but
> before mark_page_accessed(), the latter function may mark as accessed a page
> that has been released or no longer belongs to the current process.
>
> Is it true?

Yes, good catch. During some testing last week, I actually found this
bug thanks to the WARN_ON the first patch in this series added to
kvm_is_ad_tracked_page. I'll fix it in the next revision, after Sean
gets a chance to comment on the series.

Thanks,
David

Re: [PATCH V2 00/26] tools/perf: Fix shellcheck coding/formatting issues of perf tool shell scripts

2023-07-19 Thread kajoljain

Hi,

Looking for review comments on this patchset.

Thanks,
Kajol Jain


On 7/9/23 23:57, Athira Rajeev wrote:
> Patchset covers a set of fixes for coding/formatting issues observed while
> running shellcheck tool on the perf shell scripts.
> 
> This cleanup is a pre-requisite to include a build option for shellcheck
> discussed here: https://www.spinics.net/lists/linux-perf-users/msg25553.html
> First set of patches were posted here:
> https://lore.kernel.org/linux-perf-users/53b7d823-1570-4289-a632-2205ee2b5...@linux.vnet.ibm.com/T/#t
> 
> This patchset covers the remaining shell scripts that need
> fixes. Patch 1 is a resubmission of patch 6 from the initial series.
> Patches 15, 16 and 22 touch code from tools/perf/trace/beauty.
> Other patches are fixes for scripts from tools/perf/tests.
> 
> Shellcheck is run at the severity levels for errors and warnings.
> Command used:
> 
> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
> # echo $?
> 0
> 
> Changelog:
> v1 -> v2:
>   - Rebased on top of perf-tools-next from:
>     https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf-tools-next
> 
>   - Fixed shellcheck errors and warnings reported for newly
> added changes from perf-tools-next branch
> 
>   - Addressed review comment from James Clark for patch number 13 from V1.
>     The changes in patch 13 were not necessary since the file
>     "tests/shell/lib/coresight.sh" is sourced from other test files.
> 
> Akanksha J N (1):
>   tools/perf/tests: Fix shellcheck warnings for
> trace+probe_vfs_getname.sh
> 
> Athira Rajeev (14):
>   tools/perf/tests: fix test_arm_spe_fork.sh signal case issues
>   tools/perf/tests: Fix unused variable references in
> stat+csv_summary.sh testcase
>   tools/perf/tests: fix shellcheck warning for
> test_perf_data_converter_json.sh testcase
>   tools/perf/tests: Fix shellcheck issue for stat_bpf_counters.sh
> testcase
>   tools/perf/tests: Fix shellcheck issues in
> tests/shell/stat+shadow_stat.sh testcase
>   tools/perf/tests: Fix shellcheck warnings for
> thread_loop_check_tid_10.sh
>   tools/perf/tests: Fix shellcheck warnings for unroll_loop_thread_10.sh
>   tools/perf/tests: Fix shellcheck warnings for lib/probe_vfs_getname.sh
>   tools/perf/tests: Fix the shellcheck warnings in lib/waiting.sh
>   tools/perf/trace: Fix x86_arch_prctl.sh to address shellcheck warnings
>   tools/perf/arch/x86: Fix syscalltbl.sh to address shellcheck warnings
>   tools/perf/tests/shell: Fix the shellcheck warnings in
> record+zstd_comp_decomp.sh
>   tools/perf/tests/shell: Fix shellcheck warning for stat+std_output.sh
> testcase
>   tools/perf/tests: Fix shellcheck warning for stat+std_output.sh
> testcase
> 
> Kajol Jain (11):
>   tools/perf/tests: Fix shellcheck warning for probe_vfs_getname.sh
> testcase
>   tools/perf/tests: Fix shellcheck warning for record_offcpu.sh testcase
>   tools/perf/tests: Fix shellcheck issue for lock_contention.sh testcase
>   tools/perf/tests: Fix shellcheck issue for stat_bpf_counters_cgrp.sh
> testcase
>   tools/perf/tests: Fix shellcheck warning for asm_pure_loop.sh shell
> script
>   tools/perf/tests: Fix shellcheck warning for memcpy_thread_16k_10.sh
> shell script
>   tools/perf/tests: Fix shellcheck warning for probe.sh shell script
>   tools/perf/trace: Fix shellcheck issue for arch_errno_names.sh script
>   tools/perf: Fix shellcheck issue for check-headers.sh script
>   tools/shell/coresight: Fix shellcheck warning for
> thread_loop_check_tid_2.sh shell script
>   tools/perf/tests/shell/lib: Fix shellcheck warning for stat_output.sh
> shell script
> 
>  .../arch/x86/entry/syscalls/syscalltbl.sh |  2 +-
>  tools/perf/check-headers.sh   |  6 ++--
>  .../tests/shell/coresight/asm_pure_loop.sh|  2 +-
>  .../shell/coresight/memcpy_thread_16k_10.sh   |  2 +-
>  .../coresight/thread_loop_check_tid_10.sh |  2 +-
>  .../coresight/thread_loop_check_tid_2.sh  |  2 +-
>  .../shell/coresight/unroll_loop_thread_10.sh  |  2 +-
>  tools/perf/tests/shell/lib/probe.sh   |  1 +
>  .../perf/tests/shell/lib/probe_vfs_getname.sh |  5 ++--
>  tools/perf/tests/shell/lib/stat_output.sh |  1 +
>  tools/perf/tests/shell/lib/waiting.sh |  1 +
>  tools/perf/tests/shell/lock_contention.sh | 12 
>  tools/perf/tests/shell/probe_vfs_getname.sh   |  4 +--
>  .../tests/shell/record+zstd_comp_decomp.sh| 14 +-
>  tools/perf/tests/shell/record_offcpu.sh   |  6 ++--
>  tools/perf/tests/shell/stat+csv_output.sh |  2 +-
>  tools/perf/tests/shell/stat+csv_summary.sh|  4 +--
>  tools/perf/tests/shell/stat+shadow_stat.sh|  4 +--
>  tools/perf/tests/shell/stat+std_output.sh |  3 +-
>  tools/perf/tests/shell/stat_bpf_counters.sh   |  4 +--
>  .../tests/shell/stat_bpf_counters_cgrp.sh | 28 ---
>  tools/perf/tests/shell/test_arm_spe_fork.sh

[PATCH v3 10/10] docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document affinity_domain_via_partition sysfs interface file

2023-07-19 Thread Kajol Jain

Add details of the new hv-gpci interface file called
"affinity_domain_via_partition" in the ABI documentation.

Signed-off-by: Kajol Jain 
---
 .../sysfs-bus-event_source-devices-hv_gpci| 32 +++
 1 file changed, 32 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
index 399f0a2bd546..40f7cd240591 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
@@ -208,3 +208,35 @@ Description:   admin read only
   more information.
 
* "-EFBIG" : System information exceeds PAGE_SIZE.
+
+What:  /sys/devices/hv_gpci/interface/affinity_domain_via_partition
+Date:  July 2023
+Contact:   Linux on PowerPC Developer List 
+Description:   admin read only
+   This sysfs file exposes the system topology information by making HCALL
+   H_GET_PERF_COUNTER_INFO. The HCALL is made with counter request value
+   AFFINITY_DOMAIN_INFORMATION_BY_PARTITION(0xB1).
+
+   * This sysfs file will be created only for power10 and above platforms.
+
+   * User needs root privileges to read data from this sysfs file.
+
+   * This sysfs file will be created, only when the HCALL returns "H_SUCCESS",
+ "H_AUTHORITY" or "H_PARAMETER" as the return type.
+
+ HCALL with return error type "H_AUTHORITY" can be resolved during
+ runtime by setting "Enable Performance Information Collection" option.
+
+   * The end user reading this sysfs file must decode the content as per
+ underlying platform/firmware.
+
+   Possible error codes while reading this sysfs file:
+
+   * "-EPERM" : Partition is not permitted to retrieve performance information,
+   required to set "Enable Performance Information Collection" option.
+
+   * "-EIO" : Can't retrieve system information because of invalid buffer length/invalid address
+  or because of some hardware error. Refer to getPerfCountInfo documentation for
+  more information.
+
+   * "-EFBIG" : System information exceeds PAGE_SIZE.
-- 
2.39.3
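
The error codes documented above reach userspace as errno values from
read(2). A minimal sketch of a reader that maps them back to the
documented meanings. The helper names hv_gpci_errno_hint() and
read_interface_file() are hypothetical and the hint strings paraphrase
the ABI text:

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map a read(2) errno to the documented meaning. */
static const char *hv_gpci_errno_hint(int err)
{
	switch (err) {
	case EPERM:
		return "partition not permitted: set Enable Performance Information Collection";
	case EIO:
		return "invalid buffer length/address or hardware error";
	case EFBIG:
		return "system information exceeds PAGE_SIZE";
	default:
		return "unexpected error";
	}
}

/*
 * Hypothetical helper: read one interface file into buf.
 * Returns the byte count, or -1 with *err set to the errno value.
 */
static ssize_t read_interface_file(const char *path, char *buf, size_t len, int *err)
{
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0) {
		*err = errno;
		return -1;
	}

	n = read(fd, buf, len);
	*err = (n < 0) ? errno : 0; /* save errno before close() can change it */
	close(fd);
	return n;
}
```

A caller would pass a path under /sys/devices/hv_gpci/interface/ and, on
a -1 return, print hv_gpci_errno_hint(err). The content itself still has
to be decoded per platform/firmware, as the ABI text notes.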

[PATCH v3 09/10] powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity domain via partition information

2023-07-19 Thread Kajol Jain

The hcall H_GET_PERF_COUNTER_INFO with counter request value as
AFFINITY_DOMAIN_INFORMATION_BY_PARTITION(0XB1), can be used to get
the system affinity domain via partition information. To expose the system
affinity domain via partition information, patch adds sysfs file called
"affinity_domain_via_partition" to the "/sys/devices/hv_gpci/interface/"
of hv_gpci pmu driver.

Add new entry for AFFINITY_DOMAIN_VIA_PAR in sysinfo_counter_request
array, which points to the counter request value
"affinity_domain_via_partition" in hv-gpci.c file. Also add a
new function called "affinity_domain_via_partition_result_parse" to parse
the hcall result and store it in output buffer.

The affinity_domain_via_partition sysfs file is only available for power10
and above platforms. Add a macro called
INTERFACE_AFFINITY_DOMAIN_VIA_PAR_ATTR, which points to the index of NULL
placeholder, for affinity_domain_via_partition attribute in
interface_attrs array. Also updated the value of INTERFACE_NULL_ATTR
macro in hv-gpci.c file.

Signed-off-by: Kajol Jain 
---
 arch/powerpc/perf/hv-gpci.c | 160 +++-
 1 file changed, 159 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index 326b758df7c8..f2fff166290b 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -107,7 +107,8 @@ static ssize_t cpumask_show(struct device *dev,
 #define INTERFACE_PROCESSOR_CONFIG_ATTR        7
 #define INTERFACE_AFFINITY_DOMAIN_VIA_VP_ATTR  8
 #define INTERFACE_AFFINITY_DOMAIN_VIA_DOM_ATTR 9
-#define INTERFACE_NULL_ATTR                    10
+#define INTERFACE_AFFINITY_DOMAIN_VIA_PAR_ATTR 10
+#define INTERFACE_NULL_ATTR                    11
 
 /* Counter request value to retrieve system information */
 enum {
@@ -115,6 +116,7 @@ enum {
PROCESSOR_CONFIG,
AFFINITY_DOMAIN_VIA_VP, /* affinity domain via virtual processor */
AFFINITY_DOMAIN_VIA_DOM, /* affinity domain via domain */
+   AFFINITY_DOMAIN_VIA_PAR, /* affinity domain via partition */
 };
 
 static int sysinfo_counter_request[] = {
@@ -122,6 +124,7 @@ static int sysinfo_counter_request[] = {
[PROCESSOR_CONFIG] = 0x90,
[AFFINITY_DOMAIN_VIA_VP] = 0xA0,
[AFFINITY_DOMAIN_VIA_DOM] = 0xB0,
+   [AFFINITY_DOMAIN_VIA_PAR] = 0xB1,
 };
 
 static DEFINE_PER_CPU(char, hv_gpci_reqb[HGPCI_REQ_BUFFER_SIZE]) __aligned(sizeof(uint64_t));
@@ -458,6 +461,152 @@ static ssize_t affinity_domain_via_domain_show(struct device *dev, struct device
return ret;
 }
 
+static void affinity_domain_via_partition_result_parse(int returned_values,
+   int element_size, char *buf, size_t *last_element,
+   size_t *n, struct hv_gpci_request_buffer *arg)
+{
+   size_t i = 0, j = 0;
+   size_t k, l, m;
+   uint16_t total_affinity_domain_ele, size_of_each_affinity_domain_ele;
+
+   /*
+* hcall H_GET_PERF_COUNTER_INFO populates the 'returned_values'
+* to show the total number of counter_value array elements
+* returned via hcall.
+* Unlike other request types, the data structure returned by this
+* request is variable-size. For this counter request type,
+* hcall populates 'cv_element_size' corresponds to minimum size of
+* the structure returned i.e; the size of the structure with no domain
+* information. Below loop go through all counter_value array
+* to determine the number and size of each domain array element and
+* add it to the output buffer.
+*/
+   while (i < returned_values) {
+   k = j;
+   for (; k < j + element_size; k++)
+   *n += sprintf(buf + *n,  "%02x", (u8)arg->bytes[k]);
+   *n += sprintf(buf + *n,  "\n");
+
+   total_affinity_domain_ele = (u8)arg->bytes[k - 2] << 8 | (u8)arg->bytes[k - 3];
+   size_of_each_affinity_domain_ele = (u8)arg->bytes[k] << 8 | (u8)arg->bytes[k - 1];
+
+   for (l = 0; l < total_affinity_domain_ele; l++) {
+   for (m = 0; m < size_of_each_affinity_domain_ele; m++) {
+   *n += sprintf(buf + *n,  "%02x", (u8)arg->bytes[k]);
+   k++;
+   }
+   *n += sprintf(buf + *n,  "\n");
+   }
+
+   *n += sprintf(buf + *n,  "\n");
+   i++;
+   j = k;
+   }
+
+   *last_element = k;
+}
+
+static ssize_t affinity_domain_via_partition_show(struct device *dev, struct device_attribute *attr,
+   char *buf)
+{
+   struct hv_gpci_request_buffer *arg;
+   unsigned long ret;
+   size_t n = 0;
+   size_t last_element = 0;
+   u32 starting_index;
+
+   arg = (void *)get_cpu_var(hv_gpci_reqb);
+   memset(arg, 0, HGPCI_REQ_BUFFER_SIZE);
+
+   /*
+

[PATCH v3 08/10] docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document affinity_domain_via_domain sysfs interface file

2023-07-19 Thread Kajol Jain

Add details of the new hv-gpci interface file called
"affinity_domain_via_domain" in the ABI documentation.

Signed-off-by: Kajol Jain 
---
 .../sysfs-bus-event_source-devices-hv_gpci| 32 +++
 1 file changed, 32 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
index 5ee33218be83..399f0a2bd546 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
@@ -176,3 +176,35 @@ Description:   admin read only
   more information.
 
* "-EFBIG" : System information exceeds PAGE_SIZE.
+
+What:  /sys/devices/hv_gpci/interface/affinity_domain_via_domain
+Date:  July 2023
+Contact:   Linux on PowerPC Developer List 
+Description:   admin read only
+   This sysfs file exposes the system topology information by making HCALL
+   H_GET_PERF_COUNTER_INFO. The HCALL is made with counter request value
+   AFFINITY_DOMAIN_INFORMATION_BY_DOMAIN(0xB0).
+
+   * This sysfs file will be created only for power10 and above platforms.
+
+   * User needs root privileges to read data from this sysfs file.
+
+   * This sysfs file will be created, only when the HCALL returns "H_SUCCESS",
+ "H_AUTHORITY" or "H_PARAMETER" as the return type.
+
+ HCALL with return error type "H_AUTHORITY" can be resolved during
+ runtime by setting "Enable Performance Information Collection" option.
+
+   * The end user reading this sysfs file must decode the content as per
+ underlying platform/firmware.
+
+   Possible error codes while reading this sysfs file:
+
+   * "-EPERM" : Partition is not permitted to retrieve performance information,
+   required to set "Enable Performance Information Collection" option.
+
+   * "-EIO" : Can't retrieve system information because of invalid buffer length/invalid address
+  or because of some hardware error. Refer to getPerfCountInfo documentation for
+  more information.
+
+   * "-EFBIG" : System information exceeds PAGE_SIZE.
-- 
2.39.3

[PATCH v3 07/10] powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity domain via domain information

2023-07-19 Thread Kajol Jain

The hcall H_GET_PERF_COUNTER_INFO with counter request value as
AFFINITY_DOMAIN_INFORMATION_BY_DOMAIN(0XB0), can be used to get
the system affinity domain via domain information. To expose the system
affinity domain via domain information, patch adds sysfs file called
"affinity_domain_via_domain" to the "/sys/devices/hv_gpci/interface/"
of hv_gpci pmu driver.

Add new entry for AFFINITY_DOMAIN_VIA_DOM in sysinfo_counter_request
array, which points to the counter request value
"affinity_domain_via_domain" in hv-gpci.c file.

The affinity_domain_via_domain sysfs file is only available for power10
and above platforms. Add a macro called
INTERFACE_AFFINITY_DOMAIN_VIA_DOM_ATTR, which points to the index of NULL
placeholder, for affinity_domain_via_domain attribute in interface_attrs
array. Also updated the value of INTERFACE_NULL_ATTR macro in hv-gpci.c
file.

Signed-off-by: Kajol Jain 
---
 arch/powerpc/perf/hv-gpci.c | 80 -
 1 file changed, 79 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index 68502cb18262..326b758df7c8 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -106,19 +106,22 @@ static ssize_t cpumask_show(struct device *dev,
 #define INTERFACE_PROCESSOR_BUS_TOPOLOGY_ATTR  6
 #define INTERFACE_PROCESSOR_CONFIG_ATTR        7
 #define INTERFACE_AFFINITY_DOMAIN_VIA_VP_ATTR  8
-#define INTERFACE_NULL_ATTR                    9
+#define INTERFACE_AFFINITY_DOMAIN_VIA_DOM_ATTR 9
+#define INTERFACE_NULL_ATTR                    10
 
 /* Counter request value to retrieve system information */
 enum {
PROCESSOR_BUS_TOPOLOGY,
PROCESSOR_CONFIG,
AFFINITY_DOMAIN_VIA_VP, /* affinity domain via virtual processor */
+   AFFINITY_DOMAIN_VIA_DOM, /* affinity domain via domain */
 };
 
 static int sysinfo_counter_request[] = {
[PROCESSOR_BUS_TOPOLOGY] = 0xD0,
[PROCESSOR_CONFIG] = 0x90,
[AFFINITY_DOMAIN_VIA_VP] = 0xA0,
+   [AFFINITY_DOMAIN_VIA_DOM] = 0xB0,
 };
 
 static DEFINE_PER_CPU(char, hv_gpci_reqb[HGPCI_REQ_BUFFER_SIZE]) __aligned(sizeof(uint64_t));
@@ -389,6 +392,72 @@ static ssize_t affinity_domain_via_virtual_processor_show(struct device *dev,
return ret;
 }
 
+static ssize_t affinity_domain_via_domain_show(struct device *dev, struct device_attribute *attr,
+   char *buf)
+{
+   struct hv_gpci_request_buffer *arg;
+   unsigned long ret;
+   size_t n = 0;
+
+   arg = (void *)get_cpu_var(hv_gpci_reqb);
+   memset(arg, 0, HGPCI_REQ_BUFFER_SIZE);
+
+   /*
+* Pass the counter request 0xB0 corresponds to request
+* type 'Affinity_domain_information_by_domain',
+* to retrieve the system affinity domain information.
+* starting_index value refers to the starting hardware
+* processor index.
+*/
+   ret = systeminfo_gpci_request(sysinfo_counter_request[AFFINITY_DOMAIN_VIA_DOM],
+   0, 0, buf, &n, arg);
+
+   if (!ret)
+   return n;
+
+   if (ret != H_PARAMETER)
+   goto out;
+
+   /*
+* ret value as 'H_PARAMETER' corresponds to 'GEN_BUF_TOO_SMALL', which
+* implies that buffer can't accommodate all information, and a partial buffer
+* returned. To handle that, we need to take subsequent requests
+* with next starting index to retrieve additional (missing) data.
+* Below loop do subsequent hcalls with next starting index and add it
+* to buffer until we get all the information.
+*/
+   while (ret == H_PARAMETER) {
+   int returned_values = be16_to_cpu(arg->params.returned_values);
+   int elementsize = be16_to_cpu(arg->params.cv_element_size);
+   int last_element = (returned_values - 1) * elementsize;
+
+   /*
+* Since the starting index value is part of counter_value
+* buffer elements, use the starting index value in the last
+* element and add 1 to make subsequent hcalls.
+*/
+   u32 starting_index = arg->bytes[last_element + 1] +
+   (arg->bytes[last_element] << 8) + 1;
+
+   memset(arg, 0, HGPCI_REQ_BUFFER_SIZE);
+
+   ret = systeminfo_gpci_request(sysinfo_counter_request[AFFINITY_DOMAIN_VIA_DOM],
+   starting_index, 0, buf, &n, arg);
+
+   if (!ret)
+   return n;
+
+   if (ret != H_PARAMETER)
+   goto out;
+   }
+
+   return n;
+
+out:
+   put_cpu_var(hv_gpci_reqb);
+   return ret;
+}
+
 static DEVICE_ATTR_RO(kernel_version);
 static DEVICE_ATTR_RO(cpumask);
 
@@ -420,6 +489,11 @@ static struct attribute *interface_attrs[] = {
 * attribute, set in init function if applicable.
 */
NULL,
+
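
The continuation step in the loop above derives the next starting index
from the first two bytes of the last returned element, read as a
big-endian 16-bit value, plus one. A standalone sketch of that
arithmetic follows. The function name next_starting_index() and the
zero return for an empty or too-small result are assumptions made for
illustration; the patch itself does not guard those cases:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the starting_index computation:
 * locate the last element in the counter_value bytes, read its first
 * two bytes as a big-endian index, and return that index plus one.
 * Returns 0 when there is no element to read (an added guard).
 */
static uint32_t next_starting_index(const uint8_t *bytes, size_t returned_values,
				    size_t element_size)
{
	size_t last_element;
	uint32_t index;

	if (returned_values == 0 || element_size < 2)
		return 0;

	last_element = (returned_values - 1) * element_size;
	index = ((uint32_t)bytes[last_element] << 8) | bytes[last_element + 1];
	return index + 1;
}
```

For example, with two 4-byte elements whose index fields are 0x0001 and
0x0102, the next request would start at 0x0103.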

[PATCH v3 06/10] docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document affinity_domain_via_virtual_processor sysfs interface file

2023-07-19 Thread Kajol Jain

Add details of the new hv-gpci interface file called
"affinity_domain_via_virtual_processor" in the ABI documentation.

Signed-off-by: Kajol Jain 
---
 .../sysfs-bus-event_source-devices-hv_gpci| 32 +++
 1 file changed, 32 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
index 9e81de18142f..5ee33218be83 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
@@ -144,3 +144,35 @@ Description:   admin read only
   more information.
 
* "-EFBIG" : System information exceeds PAGE_SIZE.
+
+What:  /sys/devices/hv_gpci/interface/affinity_domain_via_virtual_processor
+Date:  July 2023
+Contact:   Linux on PowerPC Developer List 
+Description:   admin read only
+   This sysfs file exposes the system topology information by making HCALL
+   H_GET_PERF_COUNTER_INFO. The HCALL is made with counter request value
+   AFFINITY_DOMAIN_INFORMATION_BY_VIRTUAL_PROCESSOR(0xA0).
+
+   * This sysfs file will be created only for power10 and above platforms.
+
+   * User needs root privileges to read data from this sysfs file.
+
+   * This sysfs file will be created, only when the HCALL returns "H_SUCCESS",
+ "H_AUTHORITY" or "H_PARAMETER" as the return type.
+
+ HCALL with return error type "H_AUTHORITY" can be resolved during
+ runtime by setting "Enable Performance Information Collection" option.
+
+   * The end user reading this sysfs file must decode the content as per
+ underlying platform/firmware.
+
+   Possible error codes while reading this sysfs file:
+
+   * "-EPERM" : Partition is not permitted to retrieve performance information,
+   required to set "Enable Performance Information Collection" option.
+
+   * "-EIO" : Can't retrieve system information because of invalid buffer length/invalid address
+  or because of some hardware error. Refer to getPerfCountInfo documentation for
+  more information.
+
+   * "-EFBIG" : System information exceeds PAGE_SIZE.
-- 
2.39.3

[PATCH v3 05/10] powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity domain via virtual processor information

2023-07-19 Thread Kajol Jain

The hcall H_GET_PERF_COUNTER_INFO, with counter request value
AFFINITY_DOMAIN_INFORMATION_BY_VIRTUAL_PROCESSOR(0XA0), can be used to get
the system affinity domain via virtual processor information. To expose
that information, this patch adds a sysfs file called
"affinity_domain_via_virtual_processor" to the
"/sys/devices/hv_gpci/interface/" directory of the hv_gpci pmu driver.

The affinity_domain_via_virtual_processor sysfs file is only available for
power10 and above platforms. Add a macro called
INTERFACE_AFFINITY_DOMAIN_VIA_VP_ATTR, which points to the index of the NULL
placeholder for the affinity_domain_via_virtual_processor attribute in the
interface_attrs array. Also update the value of the INTERFACE_NULL_ATTR macro
in the hv-gpci.c file.
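
For readers following the continuation logic in the handler below, here is a minimal user-space sketch of how the next request indices are derived from the last returned element when the hcall reports a partial buffer. The struct and function names (`next_request`, `next_request_from_last`) are hypothetical and not part of the patch; the byte layout (big-endian 16-bit starting index in bytes 0-1, secondary index in bytes 2-3 of each element) is taken from the arithmetic in the diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the indices of the follow-up request. */
struct next_request {
	uint32_t starting_index;
	uint16_t secondary_index;
};

/*
 * Derive the follow-up indices from the last element of a partial result.
 * Assumes returned_values >= 1 and element_size >= 4, so that the four
 * index bytes of the last element are inside the buffer.
 */
struct next_request next_request_from_last(const uint8_t *bytes,
					   size_t returned_values,
					   size_t element_size)
{
	size_t last = (returned_values - 1) * element_size;
	struct next_request next;

	/* Bytes 0-1 of the element: starting index, most significant byte first. */
	next.starting_index = ((uint32_t)bytes[last] << 8) | bytes[last + 1];

	/* Bytes 2-3: secondary index; resume one past the last one seen. */
	next.secondary_index =
		(uint16_t)((((uint32_t)bytes[last + 2] << 8) | bytes[last + 3]) + 1);

	return next;
}
```

For counter request 0xA0 the starting index is the partition id and the secondary index is the virtual processor index, so the follow-up request resumes with the next virtual processor of the same partition.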

Signed-off-by: Kajol Jain 
---
 arch/powerpc/perf/hv-gpci.c | 86 -
 1 file changed, 84 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index c74076d3c7a7..68502cb18262 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -105,17 +105,20 @@ static ssize_t cpumask_show(struct device *dev,
 /* Interface attribute array index to store system information */
 #define INTERFACE_PROCESSOR_BUS_TOPOLOGY_ATTR  6
 #define INTERFACE_PROCESSOR_CONFIG_ATTR  7
-#define INTERFACE_NULL_ATTR  8
+#define INTERFACE_AFFINITY_DOMAIN_VIA_VP_ATTR  8
+#define INTERFACE_NULL_ATTR  9
 
 /* Counter request value to retrieve system information */
 enum {
PROCESSOR_BUS_TOPOLOGY,
-   PROCESSOR_CONFIG
+   PROCESSOR_CONFIG,
+   AFFINITY_DOMAIN_VIA_VP, /* affinity domain via virtual processor */
 };
 
 static int sysinfo_counter_request[] = {
[PROCESSOR_BUS_TOPOLOGY] = 0xD0,
[PROCESSOR_CONFIG] = 0x90,
+   [AFFINITY_DOMAIN_VIA_VP] = 0xA0,
 };
 
 static DEFINE_PER_CPU(char, hv_gpci_reqb[HGPCI_REQ_BUFFER_SIZE]) __aligned(sizeof(uint64_t));
@@ -316,6 +319,76 @@ static ssize_t processor_config_show(struct device *dev, struct device_attribute
return ret;
 }
 
+static ssize_t affinity_domain_via_virtual_processor_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct hv_gpci_request_buffer *arg;
+   unsigned long ret;
+   size_t n = 0;
+
+   arg = (void *)get_cpu_var(hv_gpci_reqb);
+   memset(arg, 0, HGPCI_REQ_BUFFER_SIZE);
+
+   /*
+* Pass the counter request 0xA0, which corresponds to request
+* type 'Affinity_domain_information_by_virtual_processor',
+* to retrieve the system affinity domain information.
+* starting_index value refers to the starting hardware
+* processor index.
+*/
+   ret = systeminfo_gpci_request(sysinfo_counter_request[AFFINITY_DOMAIN_VIA_VP],
+   0, 0, buf, &n, arg);
+
+   if (!ret)
+   return n;
+
+   if (ret != H_PARAMETER)
+   goto out;
+
+   /*
+* ret value as 'H_PARAMETER' corresponds to 'GEN_BUF_TOO_SMALL', which
+* implies that buffer can't accommodate all information, and a partial buffer
+* returned. To handle that, we need to make subsequent requests
+* with next secondary index to retrieve additional (missing) data.
+* The loop below makes subsequent hcalls with the next secondary index and
+* adds the data to the buffer until we get all the information.
+*/
+   while (ret == H_PARAMETER) {
+   int returned_values = be16_to_cpu(arg->params.returned_values);
+   int elementsize = be16_to_cpu(arg->params.cv_element_size);
+   int last_element = (returned_values - 1) * elementsize;
+
+   /*
+* Since the starting index and secondary index type is part of the
+* counter_value buffer elements, use the starting index value in the
+* last array element as subsequent starting index, and use secondary index
+* value in the last array element plus 1 as subsequent secondary index.
+* For counter request '0xA0', starting index points to partition id
+* and secondary index points to corresponding virtual processor index.
+*/
+   u32 starting_index = arg->bytes[last_element + 1] + (arg->bytes[last_element] << 8);
+   u16 secondary_index = arg->bytes[last_element + 3] +
+   (arg->bytes[last_element + 2] << 8) + 1;
+
+   memset(arg, 0, HGPCI_REQ_BUFFER_SIZE);
+
+   ret = systeminfo_gpci_request(sysinfo_counter_request[AFFINITY_DOMAIN_VIA_VP],
+   starting_index, secondary_index, buf, &n, arg);
+
+   if (!ret)
+   return n;
+
+   if (ret != H_PARAMETER)
+   goto out;
+   }
+
+   return n;
+
+out:
+

[PATCH v3 04/10] docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document processor_config sysfs interface file

2023-07-19 Thread Kajol Jain

Add details of the new hv-gpci interface file called
"processor_config" in the ABI documentation.

Signed-off-by: Kajol Jain 
---
 .../sysfs-bus-event_source-devices-hv_gpci| 32 +++
 1 file changed, 32 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
index ba3f9aa3d68e..9e81de18142f 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
@@ -112,3 +112,35 @@ Description:   admin read only
   more information.
 
* "-EFBIG" : System information exceeds PAGE_SIZE.
+
+What:  /sys/devices/hv_gpci/interface/processor_config
+Date:  July 2023
+Contact:   Linux on PowerPC Developer List 
+Description:   admin read only
+   This sysfs file exposes the system topology information by making HCALL
+   H_GET_PERF_COUNTER_INFO. The HCALL is made with counter request value
+   PROCESSOR_CONFIG(0x90).
+
+   * This sysfs file will be created only for power10 and above platforms.
+
+   * User needs root privileges to read data from this sysfs file.
+
+   * This sysfs file will be created, only when the HCALL returns "H_SUCCESS",
+ "H_AUTHORITY" or "H_PARAMETER" as the return type.
+
+ HCALL with return error type "H_AUTHORITY" can be resolved during
+ runtime by setting "Enable Performance Information Collection" option.
+
+   * The end user reading this sysfs file must decode the content as per
+ underlying platform/firmware.
+
+   Possible error codes while reading this sysfs file:
+
+   * "-EPERM" : Partition is not permitted to retrieve performance information,
+   required to set "Enable Performance Information Collection" option.
+
+   * "-EIO" : Can't retrieve system information because of invalid buffer length/invalid address
+  or because of some hardware error. Refer to getPerfCountInfo documentation for
+  more information.
+
+   * "-EFBIG" : System information exceeds PAGE_SIZE.
-- 
2.39.3

[PATCH v3 03/10] powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show processor config information

2023-07-19 Thread Kajol Jain

The hcall H_GET_PERF_COUNTER_INFO with counter request value as
PROCESSOR_CONFIG(0X90), can be used to get the system
processor configuration information. To expose the system
processor config information, patch adds sysfs file called
"processor_config" to the "/sys/devices/hv_gpci/interface/"
of hv_gpci pmu driver.

Add enum and sysinfo_counter_request array to get required
counter request value in hv-gpci.c file.
Also add a new function called "sysinfo_device_attr_create",
which will create and return required device attribute to the
add_sysinfo_interface_files function.

The processor_config sysfs file is only available for power10
and above platforms. Add a new macro called
INTERFACE_PROCESSOR_CONFIG_ATTR, which points to the index of the
NULL placeholder for the processor_config attribute in the interface_attrs
array. Also add macro INTERFACE_NULL_ATTR, which points to the index of the
NULL attribute in the interface_attrs array.

Signed-off-by: Kajol Jain 
---
 arch/powerpc/perf/hv-gpci.c | 168 
 1 file changed, 153 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index 225f148f75fd..c74076d3c7a7 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -102,11 +102,21 @@ static ssize_t cpumask_show(struct device *dev,
    return cpumap_print_to_pagebuf(true, buf, &hv_gpci_cpumask);
 }
 
-/* Counter request value to retrieve system information */
-#define PROCESSOR_BUS_TOPOLOGY 0XD0
-
 /* Interface attribute array index to store system information */
 #define INTERFACE_PROCESSOR_BUS_TOPOLOGY_ATTR  6
+#define INTERFACE_PROCESSOR_CONFIG_ATTR  7
+#define INTERFACE_NULL_ATTR  8
+
+/* Counter request value to retrieve system information */
+enum {
+   PROCESSOR_BUS_TOPOLOGY,
+   PROCESSOR_CONFIG
+};
+
+static int sysinfo_counter_request[] = {
+   [PROCESSOR_BUS_TOPOLOGY] = 0xD0,
+   [PROCESSOR_CONFIG] = 0x90,
+};
 
 static DEFINE_PER_CPU(char, hv_gpci_reqb[HGPCI_REQ_BUFFER_SIZE]) __aligned(sizeof(uint64_t));
 
@@ -187,7 +197,8 @@ static ssize_t processor_bus_topology_show(struct device *dev, struct device_att
 * starting_index value implies the starting hardware
 * chip id.
 */
-   ret = systeminfo_gpci_request(PROCESSOR_BUS_TOPOLOGY, 0, 0, buf, &n, arg);
+   ret = systeminfo_gpci_request(sysinfo_counter_request[PROCESSOR_BUS_TOPOLOGY],
+   0, 0, buf, &n, arg);
 
if (!ret)
return n;
@@ -220,8 +231,76 @@ static ssize_t processor_bus_topology_show(struct device *dev, struct device_att
 
memset(arg, 0, HGPCI_REQ_BUFFER_SIZE);
 
-   ret = systeminfo_gpci_request(PROCESSOR_BUS_TOPOLOGY, starting_index,
-   0, buf, &n, arg);
+   ret = systeminfo_gpci_request(sysinfo_counter_request[PROCESSOR_BUS_TOPOLOGY],
+   starting_index, 0, buf, &n, arg);
+
+   if (!ret)
+   return n;
+
+   if (ret != H_PARAMETER)
+   goto out;
+   }
+
+   return n;
+
+out:
+   put_cpu_var(hv_gpci_reqb);
+   return ret;
+}
+
+static ssize_t processor_config_show(struct device *dev, struct device_attribute *attr,
+   char *buf)
+{
+   struct hv_gpci_request_buffer *arg;
+   unsigned long ret;
+   size_t n = 0;
+
+   arg = (void *)get_cpu_var(hv_gpci_reqb);
+   memset(arg, 0, HGPCI_REQ_BUFFER_SIZE);
+
+   /*
+* Pass the counter request value 0x90 corresponds to request
+* type 'Processor_config', to retrieve
+* the system processor information.
+* starting_index value implies the starting hardware
+* processor index.
+*/
+   ret = systeminfo_gpci_request(sysinfo_counter_request[PROCESSOR_CONFIG],
+   0, 0, buf, &n, arg);
+
+   if (!ret)
+   return n;
+
+   if (ret != H_PARAMETER)
+   goto out;
+
+   /*
+* ret value as 'H_PARAMETER' corresponds to 'GEN_BUF_TOO_SMALL', which
+* implies that buffer can't accommodate all information, and a partial buffer
+* returned. To handle that, we need to make subsequent requests
+* with next starting index to retrieve additional (missing) data.
+* The loop below makes subsequent hcalls with the next starting index and
+* adds the data to the buffer until we get all the information.
+*/
+   while (ret == H_PARAMETER) {
+   int returned_values = be16_to_cpu(arg->params.returned_values);
+   int elementsize = be16_to_cpu(arg->params.cv_element_size);
+   int last_element = (returned_values - 1) * elementsize;
+
+   /*
+* Since the starting index is part of counter_value
+* buffer elements, use the starting index value in the last
+

[PATCH v3 02/10] docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document processor_bus_topology sysfs interface file

2023-07-19 Thread Kajol Jain

Add details of the new hv-gpci interface file called
"processor_bus_topology" in the ABI documentation.

Signed-off-by: Kajol Jain 
---
 .../sysfs-bus-event_source-devices-hv_gpci| 32 +++
 1 file changed, 32 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
index 12e2bf92783f..ba3f9aa3d68e 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_gpci
@@ -80,3 +80,35 @@ Contact: Linux on PowerPC Developer List 

 Description:   read only
This sysfs file exposes the cpumask which is designated to make
HCALLs to retrieve hv-gpci pmu event counter data.
+
+What:  /sys/devices/hv_gpci/interface/processor_bus_topology
+Date:  July 2023
+Contact:   Linux on PowerPC Developer List 
+Description:   admin read only
+   This sysfs file exposes the system topology information by making HCALL
+   H_GET_PERF_COUNTER_INFO. The HCALL is made with counter request value
+   PROCESSOR_BUS_TOPOLOGY(0xD0).
+
+   * This sysfs file will be created only for power10 and above platforms.
+
+   * User needs root privileges to read data from this sysfs file.
+
+   * This sysfs file will be created, only when the HCALL returns "H_SUCCESS",
+ "H_AUTHORITY" or "H_PARAMETER" as the return type.
+
+ HCALL with return error type "H_AUTHORITY" can be resolved during
+ runtime by setting "Enable Performance Information Collection" option.
+
+   * The end user reading this sysfs file must decode the content as per
+ underlying platform/firmware.
+
+   Possible error codes while reading this sysfs file:
+
+   * "-EPERM" : Partition is not permitted to retrieve performance information,
+   required to set "Enable Performance Information Collection" option.
+
+   * "-EIO" : Can't retrieve system information because of invalid buffer length/invalid address
+  or because of some hardware error. Refer to getPerfCountInfo documentation for
+  more information.
+
+   * "-EFBIG" : System information exceeds PAGE_SIZE.
-- 
2.39.3

[PATCH v3 01/10] powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show processor bus topology information

2023-07-19 Thread Kajol Jain

The hcall H_GET_PERF_COUNTER_INFO with counter request value as
PROCESSOR_BUS_TOPOLOGY(0XD0), can be used to get the system
topology information. To expose the system topology information,
patch adds sysfs file called "processor_bus_topology" to the
"/sys/devices/hv_gpci/interface/" of hv_gpci pmu driver.

Add macro for PROCESSOR_BUS_TOPOLOGY counter request value
in hv-gpci.c file. Also add a new function called
"systeminfo_gpci_request", which makes the H_GET_PERF_COUNTER_INFO hcall
with the added macro and populates the output buffer.
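
As an illustration of the output format that systeminfo_gpci_request produces (each counter_value element printed as two hex digits per byte, followed by a newline), here is a small user-space sketch. The function name `format_elements` is hypothetical. Unlike the patch, which calls sprintf and compares the total against PAGE_SIZE afterwards, this sketch checks the remaining capacity before every write; that is a deliberate variation for a standalone example, not the patch's method.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Append `count` elements of `elem_size` bytes each to `out` as hex text,
 * one element per line. Returns the number of characters written (excluding
 * the terminating NUL), or -1 if the text does not fit in `cap` bytes.
 */
int format_elements(const unsigned char *bytes, size_t count,
		    size_t elem_size, char *out, size_t cap)
{
	size_t n = 0;

	if (cap == 0)
		return -1;
	out[0] = '\0';

	for (size_t i = 0; i < count; i++) {
		for (size_t j = i * elem_size; j < (i + 1) * elem_size; j++) {
			/* Two hex digits plus the NUL must fit. */
			if (n + 2 >= cap)
				return -1;
			n += (size_t)snprintf(out + n, cap - n, "%02x", bytes[j]);
		}
		/* The newline plus the NUL must fit. */
		if (n + 1 >= cap)
			return -1;
		out[n++] = '\n';
		out[n] = '\0';
	}

	return (int)n;
}
```

Two elements of two bytes each, for example, produce two lines of four hex digits.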

The processor_bus_topology sysfs file is only available for power10
and above platforms. Add a new function called
"add_sysinfo_interface_files", which will add processor_bus_topology
attribute in the interface_attrs array, only for power10 and
above platforms.
Also add macro INTERFACE_PROCESSOR_BUS_TOPOLOGY_ATTR in hv-gpci.c
file, which points to the index of the NULL placeholder for the
processor_bus_topology attribute.

Signed-off-by: Kajol Jain 
---
 arch/powerpc/perf/hv-gpci.c | 184 +++-
 1 file changed, 182 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index 7ff8ff3509f5..225f148f75fd 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -102,6 +102,141 @@ static ssize_t cpumask_show(struct device *dev,
    return cpumap_print_to_pagebuf(true, buf, &hv_gpci_cpumask);
 }
 
+/* Counter request value to retrieve system information */
+#define PROCESSOR_BUS_TOPOLOGY 0XD0
+
+/* Interface attribute array index to store system information */
+#define INTERFACE_PROCESSOR_BUS_TOPOLOGY_ATTR  6
+
+static DEFINE_PER_CPU(char, hv_gpci_reqb[HGPCI_REQ_BUFFER_SIZE]) __aligned(sizeof(uint64_t));
+
+static unsigned long systeminfo_gpci_request(u32 req, u32 starting_index,
+   u16 secondary_index, char *buf,
+   size_t *n, struct hv_gpci_request_buffer *arg)
+{
+   unsigned long ret;
+   size_t i, j;
+
+   arg->params.counter_request = cpu_to_be32(req);
+   arg->params.starting_index = cpu_to_be32(starting_index);
+   arg->params.secondary_index = cpu_to_be16(secondary_index);
+
+   ret = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO,
+   virt_to_phys(arg), HGPCI_REQ_BUFFER_SIZE);
+
+   /*
+* ret value as 'H_PARAMETER' corresponds to 'GEN_BUF_TOO_SMALL',
+* which means that the current buffer size cannot accommodate
+* all the information and a partial buffer returned.
+* hcall fails in case of ret value other than H_SUCCESS or H_PARAMETER.
+*
+* ret value as H_AUTHORITY implies that partition is not permitted to retrieve
+* performance information, and required to set
+* "Enable Performance Information Collection" option.
+*/
+   if (ret == H_AUTHORITY)
+   return -EPERM;
+
+   /*
+* hcall can fail with other possible ret value like H_PRIVILEGE/H_HARDWARE
+* because of invalid buffer-length/address or due to some hardware
+* error.
+*/
+   if (ret && (ret != H_PARAMETER))
+   return -EIO;
+
+   /*
+* hcall H_GET_PERF_COUNTER_INFO populates the 'returned_values'
+* to show the total number of counter_value array elements
+* returned via hcall.
+* hcall also populates 'cv_element_size' corresponds to individual
+* counter_value array element size. Below loop go through all
+* counter_value array elements as per their size and add it to
+* the output buffer.
+*/
+   for (i = 0; i < be16_to_cpu(arg->params.returned_values); i++) {
+   j = i * be16_to_cpu(arg->params.cv_element_size);
+
+   for (; j < (i + 1) * be16_to_cpu(arg->params.cv_element_size); j++)
+   *n += sprintf(buf + *n,  "%02x", (u8)arg->bytes[j]);
+   *n += sprintf(buf + *n,  "\n");
+   }
+
+   if (*n >= PAGE_SIZE) {
+   pr_info("System information exceeds PAGE_SIZE\n");
+   return -EFBIG;
+   }
+
+   return ret;
+}
+
+static ssize_t processor_bus_topology_show(struct device *dev, struct device_attribute *attr,
+   char *buf)
+{
+   struct hv_gpci_request_buffer *arg;
+   unsigned long ret;
+   size_t n = 0;
+
+   arg = (void *)get_cpu_var(hv_gpci_reqb);
+   memset(arg, 0, HGPCI_REQ_BUFFER_SIZE);
+
+   /*
+* Pass the counter request value 0xD0 corresponds to request
+* type 'Processor_bus_topology', to retrieve
+* the system topology information.
+* starting_index value implies the starting hardware
+* chip id.
+*/
+   ret = systeminfo_gpci_request(PROCESSOR_BUS_TOPOLOGY, 0, 0, buf, &n, arg);
+
+   if (!ret)
+   return n;
+
+   if (ret != H_PARAMETER)
+   goto out;
+
+   /*
+* ret value as

[PATCH v3 00/10] Add sysfs interface files to hv_gpci device to expose system information

2023-07-19 Thread Kajol Jain

The hcall H_GET_PERF_COUNTER_INFO can be used to get data related to
chips, dimms and system topology, by passing different counter request
values.
This patchset adds sysfs files to "/sys/devices/hv_gpci/interface/"
of the hv_gpci pmu driver, which expose system topology information
using the H_GET_PERF_COUNTER_INFO hcall. The added sysfs files are
available on power10 and above platforms and need root access
to read the data.

Patches 1,3,5,7,9 add sysfs interface files to the hv_gpci
pmu driver, to get system topology information.

List of added sysfs files:
-> processor_bus_topology (Counter request value : 0xD0)
-> processor_config (Counter request value : 0x90)
-> affinity_domain_via_virtual_processor (Counter request value : 0xA0)
-> affinity_domain_via_domain (Counter request value : 0xB0)
-> affinity_domain_via_partition (Counter request value : 0xB1)
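
A rough user-space sketch of how one of these files might be consumed is shown here. It relies only on what the series itself states: each line holds one counter_value element as hex text, and a read can fail with EPERM, EIO or EFBIG. The helper names (`hex_nibble`, `decode_hex_line`, `dump_interface_file`) and the buffer sizes are assumptions made for this example; interpreting the decoded bytes remains platform/firmware specific, as the ABI documentation notes.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Convert one hex digit to its value, or return -1 if it is not a hex digit. */
int hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode one line of hex text (up to a newline or NUL) into bytes.
 * Returns the number of bytes decoded, or -1 on malformed input or
 * if `out` is too small.
 */
int decode_hex_line(const char *line, unsigned char *out, size_t cap)
{
	size_t n = 0;

	while (*line && *line != '\n') {
		int hi = hex_nibble(line[0]);
		int lo = (hi < 0) ? -1 : hex_nibble(line[1]);

		if (hi < 0 || lo < 0 || n >= cap)
			return -1;
		out[n++] = (unsigned char)((hi << 4) | lo);
		line += 2;
	}
	return (int)n;
}

/*
 * Read an interface file line by line and report the size of each element.
 * Returns 0 on success or a negative errno value, for example -EPERM when
 * the partition lacks permission to collect performance information.
 */
int dump_interface_file(const char *path)
{
	char line[4096];
	unsigned char elem[2048];
	FILE *fp = fopen(path, "r");
	int err;

	if (!fp)
		return -errno;

	while (fgets(line, sizeof(line), fp)) {
		int len = decode_hex_line(line, elem, sizeof(elem));

		if (len < 0) {
			fclose(fp);
			return -EINVAL;
		}
		printf("element of %d bytes\n", len);
	}

	err = ferror(fp) ? -errno : 0;
	fclose(fp);
	return err;
}
```

Calling `dump_interface_file("/sys/devices/hv_gpci/interface/processor_config")` as root on a supported system would list one element per line of the file.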

Patches 2,4,6,8,10 add details of the newly added hv_gpci
interface files listed above in the ABI documentation.

Changelog:
v2 -> v3
-> Make nit changes in documentation patches as suggested by Randy Dunlap.

v1 -> v2
-> Add the sysinfo interface attributes to the interface_attrs attribute
   array only when every HCALL either succeeds or fails with an error that
   can be resolved during runtime. If the HCALL for even one counter
   request value fails with any other error, don't add any sysinfo
   attribute to the interface_attrs attribute array.
   Add the code changes to make sure the sysinfo interface is added only
   when all the requirements are met, as suggested by Michael Ellerman.
-> Make changes in documentation to add details of the error types
   which can be resolved at runtime, as suggested by Michael Ellerman.
-> Add new enum and sysinfo_counter_request array to get required
   counter request value in hv-gpci.c file.
-> Move the macros for interface attribute array index to hv-gpci.c, as
   these macros are currently only used in hv-gpci.c file.
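
The all-or-nothing rule from the v1 -> v2 changes can be sketched as follows. This is an illustration only: the function name and the SKETCH_* constants are hypothetical stand-ins with placeholder values, not the kernel's definitions, and the real driver code is not shown in this posting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder return codes for this sketch; not the kernel's definitions. */
enum {
	SKETCH_H_SUCCESS   = 0,
	SKETCH_H_HARDWARE  = -1,
	SKETCH_H_PARAMETER = -4,
	SKETCH_H_AUTHORITY = -9,
};

/*
 * Decide whether the sysinfo interface files should be created.
 * Every probe hcall must return success, or an error that can be resolved
 * at runtime (authority) or that only signals a partial buffer (parameter).
 * A single other failure means no file is created at all.
 */
bool sysinfo_files_allowed(const long *rets, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (rets[i] != SKETCH_H_SUCCESS &&
		    rets[i] != SKETCH_H_AUTHORITY &&
		    rets[i] != SKETCH_H_PARAMETER)
			return false;
	}
	return true;
}
```

With five counter request values, one hardware failure among otherwise successful probes is enough to suppress all five files.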

Kajol Jain (10):
  powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show
processor bus topology information
  docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document
processor_bus_topology sysfs interface file
  powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show
processor config information
  docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document
processor_config sysfs interface file
  powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity
domain via virtual processor information
  docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document
affinity_domain_via_virtual_processor sysfs interface file
  powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity
domain via domain information
  docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document
affinity_domain_via_domain sysfs interface file
  powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity
domain via partition information
  docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document
affinity_domain_via_partition sysfs interface file

 .../sysfs-bus-event_source-devices-hv_gpci| 160 +
 arch/powerpc/perf/hv-gpci.c   | 640 +-
 2 files changed, 798 insertions(+), 2 deletions(-)

-- 
2.39.3
