Re: [kvm-unit-tests PATCH v2 4/9] migration: use a more robust way to wait for background job

2024-02-05 Thread Nicholas Piggin
On Tue Feb 6, 2024 at 12:58 AM AEST, Marc Hartmayer wrote:
> On Fri, Feb 02, 2024 at 04:57 PM +1000, Nicholas Piggin  
> wrote:
> > Starting a pipeline of jobs in the background does not seem to have
> > a simple way to reliably find the pid of a particular process in the
> > pipeline (because not all processes are started when the shell
> > continues to execute).
> >
> > The way PID of QEMU is derived can result in a failure waiting on a
> > PID that is not running. This is easier to hit with subsequent
> > multiple-migration support. Changing this to use $! by swapping the
> > pipeline for a fifo is more robust.
> >
> > Signed-off-by: Nicholas Piggin 
> > ---
>
> […snip…]
>
> >  
> > +   # Wait until the destination has created the incoming and qmp sockets
> > +   while ! [ -S ${migsock} ] ; do sleep 0.1 ; done
> > +   while ! [ -S ${qmp2} ] ; do sleep 0.1 ; done
>
> There should be timeout implemented, otherwise we might end in an
> endless loop in case of a bug. Or is the global timeout good enough to
> handle this situation?

I was going to say it's not worthwhile since we can't recover, but
actually printing where the timeout happens if nothing else would
be pretty helpful to gather and diagnose problems especially ones
we can't reproduce locally. So, yeah good idea.

We have a bunch of potential hangs where we don't do anything already
though. Sadly it doesn't look like $BASH_LINENO can give anything
useful of the interrupted context from a SIGHUP trap. We might be able
to do something like -

timeout_handler() {
echo "Timeout $timeout_msg"
exit
}

trap timeout_handler HUP

timeout_msg="waiting for destination migration socket to be created"
while ! [ -S ${migsock} ] ; do sleep 0.1 ; done
timeout_msg="waiting for destination QMP socket to be created"
while ! [ -S ${qmp2} ] ; do sleep 0.1 ; done
timeout_msg=

Unless you have any better ideas. Not sure if there's some useful
bash debugging options that can be used. Other option is adding timeout
checks in loops and blocking commands... not sure if that's simpler and
less error prone though.

Anyway we have a bunch of potential hangs and timeouts that aren't
handled already though, so I might leave this out for a later pass at
it unless we come up with a really nice easy way to go.

Thanks,
Nick

>
> > +
> > qmp ${qmp1} '"migrate", "arguments": { "uri": "unix:'${migsock}'" }' > 
> > ${qmpout1}
> >  
> > # Wait for the migration to complete
> > -- 
> > 2.42.0
> >
> >



Re: [PATCH v2 1/2] powerpc: Add Power11 architected and raw mode

2024-02-05 Thread Madhavan Srinivasan



On 2/5/24 2:13 PM, Aneesh Kumar K.V wrote:

Madhavan Srinivasan  writes:


reg.h is updated with Power11 pvr. pvr_mask value of 0x0F07
means we are arch v3.1 compliant.


If it is called arch v3.1, it will conflict with.


#define PVR_ARCH_31 0x0f06


Nice catch. My bad, missed to include a macro for new logical PVR in the 
patch. Will fix it





This is used by phyp and
kvm when booting as a pseries guest to detect and enable
the appropriate hwcap, facility bits and PMU related fields.
Copied most of fields from Power10 table entry and added relevant
Power11 setup/restore and device tree routines.

Signed-off-by: Madhavan Srinivasan 
---
Changelog v1:
- no change in this patch.

  arch/powerpc/include/asm/cpu_setup.h  |  2 ++
  arch/powerpc/include/asm/cputable.h   |  3 ++
  arch/powerpc/include/asm/mce.h|  1 +
  arch/powerpc/include/asm/mmu.h|  1 +
  arch/powerpc/include/asm/reg.h|  1 +
  arch/powerpc/kernel/cpu_setup_power.c | 10 +++
  arch/powerpc/kernel/cpu_specs_book3s_64.h | 34 +++
  arch/powerpc/kernel/dt_cpu_ftrs.c | 15 ++
  arch/powerpc/kernel/mce_power.c   |  5 
  arch/powerpc/kernel/prom_init.c   | 10 ++-
  10 files changed, 81 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/cpu_setup.h 
b/arch/powerpc/include/asm/cpu_setup.h
index 30e2fe389502..ce800650bb8b 100644
--- a/arch/powerpc/include/asm/cpu_setup.h
+++ b/arch/powerpc/include/asm/cpu_setup.h
@@ -9,10 +9,12 @@ void __setup_cpu_power7(unsigned long offset, struct cpu_spec 
*spec);
  void __setup_cpu_power8(unsigned long offset, struct cpu_spec *spec);
  void __setup_cpu_power9(unsigned long offset, struct cpu_spec *spec);
  void __setup_cpu_power10(unsigned long offset, struct cpu_spec *spec);
+void __setup_cpu_power11(unsigned long offset, struct cpu_spec *spec);
  void __restore_cpu_power7(void);
  void __restore_cpu_power8(void);
  void __restore_cpu_power9(void);
  void __restore_cpu_power10(void);
+void __restore_cpu_power11(void);
  
  void __setup_cpu_e500v1(unsigned long offset, struct cpu_spec *spec);

  void __setup_cpu_e500v2(unsigned long offset, struct cpu_spec *spec);
diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index 8765d5158324..3bd6e6e0224c 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -454,6 +454,9 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31 | \
CPU_FTR_DAWR | CPU_FTR_DAWR1 | \
CPU_FTR_DEXCR_NPHIE)
+
+#define CPU_FTRS_POWER11   CPU_FTRS_POWER10


One of the problem with that is we have code that does the below in kvm.

if (cpu_has_feature(CPU_FTR_ARCH_31))
host_pcr_bit = PCR_ARCH_31;


How should we handle that?


This is my understanding and kindly advice if this assumption is wrong.
IIUC, we dont see different host_pcr_bit values for Power10 and Power11 
at this point,
but that said, if we see a change, we could add additional check with 
base pvr value to

differentiate the value to set right?




+
  #define CPU_FTRS_CELL (CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index c9f0936bd3c9..241eee743fc5 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -257,6 +257,7 @@ long __machine_check_early_realmode_p7(struct pt_regs 
*regs);
  long __machine_check_early_realmode_p8(struct pt_regs *regs);
  long __machine_check_early_realmode_p9(struct pt_regs *regs);
  long __machine_check_early_realmode_p10(struct pt_regs *regs);
+long __machine_check_early_realmode_p11(struct pt_regs *regs);
  #endif /* CONFIG_PPC_BOOK3S_64 */
  
  #ifdef CONFIG_PPC_BOOK3S_64

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index d8b7e246a32f..61ebe5eff2c9 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -133,6 +133,7 @@
  #define MMU_FTRS_POWER8   MMU_FTRS_POWER6
  #define MMU_FTRS_POWER9   MMU_FTRS_POWER6
  #define MMU_FTRS_POWER10  MMU_FTRS_POWER6
+#define MMU_FTRS_POWER11   MMU_FTRS_POWER6
  #define MMU_FTRS_CELL MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \
MMU_FTR_CI_LARGE_PAGE
  #define MMU_FTRS_PA6T MMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 7fd09f25452d..7a7aa24bf57a 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1364,6 +1364,7 @@
  #define PVR_HX_C2000  0x0066
  #define PVR_POWER90x004E
  #define PVR_POWER10   0x0080
+#define PVR_POWER110x0082
  #define PVR_BE0x0070
  #define PVR_PA6T  0x0090
  
diff --git 

Re: [kvm-unit-tests PATCH v2 3/9] arch-run: Clean up initrd cleanup

2024-02-05 Thread Nicholas Piggin
On Mon Feb 5, 2024 at 10:04 PM AEST, Thomas Huth wrote:
> On 02/02/2024 07.57, Nicholas Piggin wrote:
> > Rather than put a big script into the trap handler, have it call
> > a function.
> > 
> > Signed-off-by: Nicholas Piggin 
> > ---
> >   scripts/arch-run.bash | 12 +++-
> >   1 file changed, 11 insertions(+), 1 deletion(-)
> > 
> > diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
> > index f22ead6f..cc7da7c5 100644
> > --- a/scripts/arch-run.bash
> > +++ b/scripts/arch-run.bash
> > @@ -271,10 +271,20 @@ search_qemu_binary ()
> > export PATH=$save_path
> >   }
> >   
> > +initrd_cleanup ()
> > +{
> > +   if [ "$KVM_UNIT_TESTS_ENV_OLD" ]; then
> > +   export KVM_UNIT_TESTS_ENV="$KVM_UNIT_TESTS_ENV_OLD"
> > +   else
> > +   unset KVM_UNIT_TESTS_ENV
> > +   unset KVM_UNIT_TESTS_ENV_OLD
> > +   fi
> > +}
> > +
> >   initrd_create ()
> >   {
> > if [ "$ENVIRON_DEFAULT" = "yes" ]; then
> > -   trap_exit_push 'rm -f $KVM_UNIT_TESTS_ENV; [ 
> > "$KVM_UNIT_TESTS_ENV_OLD" ] && export 
> > KVM_UNIT_TESTS_ENV="$KVM_UNIT_TESTS_ENV_OLD" || unset KVM_UNIT_TESTS_ENV; 
> > unset KVM_UNIT_TESTS_ENV_OLD'
> > +   trap_exit_push 'rm -f $KVM_UNIT_TESTS_ENV; initrd_cleanup'
> >
>
> Why don't you move the 'rm -f $KVM_UNIT_TESTS_ENV' into the initrd_cleanup() 
> function, too? ... that would IMHO make more sense for a function that is 
> called *_cleanup() ?

Yeah good point, will respin.

Thanks,
Nick


Re: (subset) [PATCH 00/42] Fix coccicheck warnings

2024-02-05 Thread Martin K. Petersen
On Tue, 16 Jan 2024 12:10:47 +0800, Li Zhijian wrote:

> make coccicheck COCCI=$PWD/scripts/coccinelle/api/device_attr_show.cocci`
> complians some warnnings as following[1]:
> 
> Not sure if someone had tried these fixes, feel free to ignore this
> patch set if we have come to a *NOT-FIX* conclusion before :)
> 
> This patch set also fix a few snprintf() beside coccicheck reported.
> For example, some thing like
> xxx_show() {
>   rc = snprintf();
> ...
>   return rc;
> }
> 
> [...]

Applied to 6.9/scsi-queue, thanks!

[22/42] drivers/scsi/fnic: Convert snprintf to sysfs_emit
https://git.kernel.org/mkp/scsi/c/1ad717c92925
[25/42] drivers/scsi/ibmvscsi: Convert snprintf to sysfs_emit
https://git.kernel.org/mkp/scsi/c/29ff822f466e
[26/42] drivers/scsi/ibmvscsi_tgt: Convert snprintf to sysfs_emit
https://git.kernel.org/mkp/scsi/c/01105c23de42
[27/42] drivers/scsi/isci: Convert snprintf to sysfs_emit
https://git.kernel.org/mkp/scsi/c/5fbf37e53091
[34/42] drivers/scsi/pm8001: Convert snprintf to sysfs_emit
https://git.kernel.org/mkp/scsi/c/8179041f801d

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: Build regressions/improvements in v6.8-rc1

2024-02-05 Thread Jesse Brandeburg
On 1/23/2024 3:45 AM, Geert Uytterhoeven wrote:
> in drivers/net/ethernet/intel/ice/ice_base.c
> 
> powerpc-gcc5/ppc32_allmodconfig
> in drivers/net/ethernet/intel/ice/ice_nvm.c
> 
> aarcharm64-gcc5/arm64-allmodconfig
> powerpc-gcc5/ppc32_allmodconfig
> powerpc-gcc5/powerpc-allmodconfig
> powerpc-gcc5/ppc64le_allmodconfig
> 
> in drivers/net/ethernet/intel/ice/ice_common.c
> 
> arm64-gcc5/arm64-allmodconfig
> powerpc-gcc5/ppc32_allmodconfig
> 
> in drivers/net/ethernet/intel/i40e/i40e_dcb.c
> 
> powerpc-gcc5/powerpc-allmodconfig
> powerpc-gcc5/ppc32_allmodconfig
> powerpc-gcc5/ppc64_book3e_allmodconfig
> 
> 
> arm64-gcc5/arm64-allmodconfig
> powerpc-gcc5/powerpc-all{mod,yes}config
> powerpc-gcc5/ppc{32,64le,64_book3e}_allmodconfig
> 

Fixes for intel/i40e and intel/ice posted here:
https://lore.kernel.org/lkml/20240206022906.2194214-1-jesse.brandeb...@intel.com/





Re: [PATCH v2 2/4] PCI/AER: Handle Advisory Non-Fatal properly

2024-02-05 Thread Bjorn Helgaas
In the subject, "properly" really doesn't convey information.  I think
this patch does two things:

  - Prints error bits that might be ANFE 
  - Clears UNCOR_STATUS bits that were previously not cleared

Maybe the subject line could say something about those (clearing
UNCOR_STATUS might be more important, or maybe this could even be
split into two patches so we could see both).

On Thu, Jan 25, 2024 at 02:28:00PM +0800, Wang, Qingshun wrote:
> When processing an Advisory Non-Fatal error, ideally both correctable
> error status and uncorrectable error status should be cleared. However,
> there is no way to fully identify the UE associated with ANFE. Even
> worse, a Fatal/Non-Fatal error may set the same UE status bit as ANFE.
> Assuming an ANFE is FE/NFE is kind of bad, but assuming a FE/NFE is an
> ANFE is usually unacceptable. To avoid clearing UEs that are not ANFE by
> accident, the most conservative route is taken here: If any of the
> Fatal/Non-Fatal Error Detected bits is set in Device Status, do not
> touch UE status, they should be cleared later by the UE handler.
> Otherwise, a specific set of UEs that may be raised as ANFE according to
> the PCIe specification will be cleared if their corresponding severity
> is non-fatal. Additionally, log UEs that will be cleared.
> 
> For instance, previously when kernel receives an ANFE with Poisoned TLP
> in OS native AER mode, only status of CE will be reported and cleared:
> 
>   AER: Corrected error received: :b7:02.0
>   PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
> device [8086:0db0] error status/mask=2000/
>  [13] NonFatalErr
> 
> If the kernel receives a Malformed TLP after that, two UE will be
> reported, which is unexpected. Malformed TLP Header was lost since
> the previous ANF gated the TLP header logs:
> 
>   PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, 
> (Receiver ID)
> device [8086:0db0] error status/mask=00041000/00180020
>  [12] TLP(First)
>  [18] MalfTLP
> 
> Now, in the same scenario, both CE status and related UE status will be
> reported and cleared after ANFE:
> 
>   AER: Corrected error received: :b7:02.0
>   PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
> device [8086:0db0] error status/mask=2000/
>  [13] NonFatalErr
> Uncorrectable errors that may cause Advisory Non-Fatal:
>  [18] TLP
> 
> Signed-off-by: "Wang, Qingshun" 
> ---
>  drivers/pci/pcie/aer.c | 61 +-
>  1 file changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 6583dcf50977..713cbf625d3f 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -107,6 +107,12 @@ struct aer_stats {
>   PCI_ERR_ROOT_MULTI_COR_RCV |\
>   PCI_ERR_ROOT_MULTI_UNCOR_RCV)
>  
> +#define AER_ERR_ANFE_UNC_MASK(PCI_ERR_UNC_POISON_TLP |   
> \
> + PCI_ERR_UNC_COMP_TIME | \
> + PCI_ERR_UNC_COMP_ABORT |\
> + PCI_ERR_UNC_UNX_COMP |  \
> + PCI_ERR_UNC_UNSUP)
> +
>  static int pcie_aer_disable;
>  static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
>  
> @@ -612,6 +618,32 @@ const struct attribute_group aer_stats_attr_group = {
>   .is_visible = aer_stats_attrs_are_visible,
>  };
>  
> +static int anfe_get_related_err(struct aer_err_info *info)
> +{
> + /*
> +  * Take the most conservative route here. If there are
> +  * Non-Fatal/Fatal errors detected, do not assume any
> +  * bit in uncor_status is set by ANFE.
> +  */
> + if (info->device_status & (PCI_EXP_DEVSTA_NFED | PCI_EXP_DEVSTA_FED))
> + return 0;
> + /*
> +  * According to PCIe Base Specification Revision 6.1,
> +  * Section 6.2.3.2.4, if an UNCOR error is rasied as
> +  * Advisory Non-Fatal error, it will match the following
> +  * conditions:
> +  *  a. The severity of the error is Non-Fatal.
> +  *  b. The error is one of the following:
> +  *  1. Poisoned TLP
> +  *  2. Completion Timeout
> +  *  3. Completer Abort
> +  *  4. Unexpected Completion
> +  *  5. Unsupported Request
> +  */
> + return info->uncor_status & ~info->uncor_mask
> + & AER_ERR_ANFE_UNC_MASK & ~info->severity;
> +}
> +
>  static void pci_dev_aer_stats_incr(struct pci_dev *pdev,
>  struct aer_err_info *info)
>  {
> @@ -678,6 +710,7 @@ static void __aer_print_error(struct pci_dev *dev,
> struct aer_err_info *info)
>  {
>   unsigned long status;
> + unsigned long anfe_status;
>

Re: [PATCH v2 1/4] PCI/AER: Store more information in aer_err_info

2024-02-05 Thread Bjorn Helgaas
On Thu, Jan 25, 2024 at 02:27:59PM +0800, Wang, Qingshun wrote:
> When Advisory Non-Fatal errors are raised, both correctable and
> uncorrectable error statuses will be set. The current kernel code cannot
> store both statuses at the same time, thus failing to handle ANFE properly.
> In addition, to avoid clearing UEs that are not ANFE by accident, UE
> severity and Device Status also need to be recorded: any fatal UE cannot
> be ANFE, and if Fatal/Non-Fatal Error Detected is set in Device Status, do
> not take any assumption and let UE handler to clear UE status.
> 
> Store status and mask of both correctable and uncorrectable errors in
> aer_err_info. The severity of UEs and the values of the Device Status
> register are also recorded, which will be used to determine UEs that should
> be handled by the ANFE handler. Refactor the rest of the code to use
> cor/uncor_status and cor/uncor_mask fields instead of status and mask
> fields.

There's a lot going on in this patch.  Could it possibly be split up a
bit, e.g., first tease apart aer_err_info.status/.mask into
.cor_status/mask and .uncor_status/mask, then add .uncor_severity,
then add the device_status bit separately?  If it could be split up, I
think the ANFE case would be easier to see.

Thanks a lot for working on this area!

Bjorn


Re: [PATCH] powerpc: add crtsavres.o to always-y instead of extra-y

2024-02-05 Thread Masahiro Yamada
On Mon, Feb 5, 2024 at 10:22 PM Jan Stancek  wrote:
>
> On Mon, Feb 5, 2024 at 12:50 PM Michael Ellerman  wrote:
> >
> > Jan Stancek  writes:
> > > On Tue, Nov 21, 2023 at 10:51:34AM +1000, Nicholas Piggin wrote:
> > >>On Tue Nov 21, 2023 at 9:23 AM AEST, Masahiro Yamada wrote:
> > >>> crtsavres.o is linked to modules. However, as explained in commit
> > >>> d0e628cd817f ("kbuild: doc: clarify the difference between extra-y
> > >>> and always-y"), 'make modules' does not build extra-y.
> > >>>
> > >>> For example, the following command fails:
> > >>>
> > >>>   $ make ARCH=powerpc LLVM=1 KBUILD_MODPOST_WARN=1 mrproper 
> > >>> ps3_defconfig modules
> > >>> [snip]
> > >>> LD [M]  arch/powerpc/platforms/cell/spufs/spufs.ko
> > >>>   ld.lld: error: cannot open arch/powerpc/lib/crtsavres.o: No such file 
> > >>> or directory
> > >>>   make[3]: *** [scripts/Makefile.modfinal:56: 
> > >>> arch/powerpc/platforms/cell/spufs/spufs.ko] Error 1
> > >>>   make[2]: *** [Makefile:1844: modules] Error 2
> > >>>   make[1]: *** [/home/masahiro/workspace/linux-kbuild/Makefile:350: 
> > >>> __build_one_by_one] Error 2
> > >>>   make: *** [Makefile:234: __sub-make] Error 2
> > >>>
> > >>
> > >>Thanks. Is this the correct Fixes tag?
> > >>
> > >>Fixes: d0e628cd817f ("powerpc/64: Do not link crtsavres.o in vmlinux")
> > >>
> > >>Hmm, looks like LLD might just do this now automatically for us
> > >>too without --save-restore-funcs (https://reviews.llvm.org/D79977).
> > >>But we probably still need it for older versions, so we still need
> > >>your patch.
> > >
> > > Hi,
> > >
> > > I'm still seeing the error of crtsavres.o missing when building external 
> > > modules
> > > after "make LLVM=1 modules_prepare". Should it be built also in 
> > > archprepare?
> >
> > Or modules_prepare?
> >
> > Example patch below.
>
> I tested your patch with my setup and that works for me as well.
>




Please note 'make ARCH=powerpc clean' will remove  '*.o'
files globally.


Kbuild promised you would still be able to compile external modules
after 'make clean' (until you run 'make mrproper'), but
that would not work in this case.

So, the external module support for powerpc
is broken in another way, already.


Perhaps, an easy workaround might be to change
the suffix, but I did not test it at all.

mv arch/powerpc/lib/crtsavres.o arch/powerpc/lib/crtsavres.o.do_not_remove_me




-- 
Best Regards
Masahiro Yamada


[PATCH v3] KVM: PPC: Book3S HV: Fix L2 guest reboot failure due to empty 'arch_compat'

2024-02-05 Thread Amit Machhiwal
Currently, rebooting a pseries nested qemu-kvm guest (L2) results in
below error as L1 qemu sends PVR value 'arch_compat' == 0 via
ppc_set_compat ioctl. This triggers a condition failure in
kvmppc_set_arch_compat() resulting in an EINVAL.

qemu-system-ppc64: Unable to set CPU compatibility mode in KVM: Invalid
argument

Also, a value of 0 for arch_compat generally refers the default
compatibility of the host. But, arch_compat, being a Guest Wide Element
in nested API v2, cannot be set to 0 in GSB as PowerVM (L0) expects a
non-zero value. A value of 0 triggers a kernel trap during a reboot and
consequently causes it to fail:

[   22.106360] reboot: Restarting system
KVM: unknown exit, hardware reason ffea
NIP 0100   LR fe44 CTR  XER 
20040092 CPU#0
MSR 1000 HID0   HF 6c00 iidx 3 didx 3
TB   DECR 0
GPR00   c2a8c300 7fe0
GPR04   1002 82803033
GPR08 0a00  0004 2fff
GPR12  c2e1 000105639200 0004
GPR16  00010563a090  
GPR20 000105639e20 0001056399c8 7fffe54abab0 000105639288
GPR24  0001 0001 
GPR28   c2b30840 
CR   [ -  -  -  -  -  -  -  -  ] RES 000@
 SRR0   SRR1 PVR 00800200 VRSAVE 

SPRG0  SPRG1   SPRG2   SPRG3 

SPRG4  SPRG5   SPRG6   SPRG7 

HSRR0  HSRR1 
 CFAR 
 LPCR 00020400
 PTCR    DAR   DSISR 

 kernel:trap=0xffea | pc=0x100 | msr=0x1000

This patch updates kvmppc_set_arch_compat() to use the host PVR value if
'compat_pvr' == 0 indicating that qemu doesn't want to enforce any
specific PVR compat mode.

Fixes: 19d31c5f1157 ("KVM: PPC: Add support for nestedv2 guests")
Reviewed-by: Aneesh Kumar K.V (IBM) 
Reviewed-by: Vaibhav Jain 
Signed-off-by: Amit Machhiwal 
---

Changes v2 -> v3:
- Vaibhav: Use a 'break' instead of switch-case fallthrough
- v2: 
https://lore.kernel.org/all/20240205132607.2776637-1-amach...@linux.ibm.com/

Changes v1 -> v2:
- Added descriptive error log in the patch description when
  `arch_compat == 0` passed in GSB
- Added a helper function for PCR to capabilities mapping
- Added relevant comments around the changes being made
- v1: 
https://lore.kernel.org/lkml/20240118095653.2588129-1-amach...@linux.ibm.com/

 arch/powerpc/kvm/book3s_hv.c  | 26 --
 arch/powerpc/kvm/book3s_hv_nestedv2.c | 23 +--
 2 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 52427fc2a33f..0b921704da45 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -391,6 +391,24 @@ static void kvmppc_set_pvr_hv(struct kvm_vcpu *vcpu, u32 
pvr)
 /* Dummy value used in computing PCR value below */
 #define PCR_ARCH_31(PCR_ARCH_300 << 1)
 
+static inline unsigned long map_pcr_to_cap(unsigned long pcr)
+{
+   unsigned long cap = 0;
+
+   switch (pcr) {
+   case PCR_ARCH_300:
+   cap = H_GUEST_CAP_POWER9;
+   break;
+   case PCR_ARCH_31:
+   cap = H_GUEST_CAP_POWER10;
+   break;
+   default:
+   break;
+   }
+
+   return cap;
+}
+
 static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
 {
unsigned long host_pcr_bit = 0, guest_pcr_bit = 0, cap = 0;
@@ -424,11 +442,9 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, 
u32 arch_compat)
break;
case PVR_ARCH_300:
guest_pcr_bit = PCR_ARCH_300;
-   cap = H_GUEST_CAP_POWER9;
break;
case PVR_ARCH_31:
guest_pcr_bit = PCR_ARCH_31;
-   cap = H_GUEST_CAP_POWER10;
break;
default:
return -EINVAL;
@@ -440,6 +456,12 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, 
u32 arch_compat)
return -EINVAL;
 
if (kvmhv_on_pseries() && kvmhv_is_nestedv2()) {
+   /*
+* 'arch_compat == 0' would mean the guest should default to
+* L1's compatibility. In this case, the guest would pick
+* host's PCR and evaluate the corresponding capabilities.
+*/
+   

Re: Re: [PATCH v2] KVM: PPC: Book3S HV: Fix L2 guest reboot failure due to empty 'arch_compat'

2024-02-05 Thread Amit Machhiwal
Hi Vaibhav,

Thanks for looking into the patch.

On 2024/02/05 11:05 PM, Vaibhav Jain wrote:
> Hi Amit,
> 
> Thanks for the patch. Minor comment on the patch below:
> 
> Amit Machhiwal  writes:
> 
> 
> 
> >  
> > +static inline unsigned long map_pcr_to_cap(unsigned long pcr)
> > +{
> > +   unsigned long cap = 0;
> > +
> > +   switch (pcr) {
> > +   case PCR_ARCH_300:
> > +   cap = H_GUEST_CAP_POWER9;
> > +   break;
> > +   case PCR_ARCH_31:
> > +   cap = H_GUEST_CAP_POWER10;
> Though CONFIG_CC_IMPLICIT_FALLTHROUGH and '-Wimplicit-fallthrough'
> doesnt explicitly flag this usage, please consider using the
> 'fallthrough;' keyword here.
> 
> However you probably dont want this switch-case to fallthrough so please
> use a 'break' instead.

Sure, v3 on the way.

> 
> > +   default:
> > +   break;
> > +   }
> > +
> > +   return cap;
> > +}
> > +
> >
> 
> 
> With the suggested change above
> 
> Reviewed-by: Vaibhav Jain 

Thanks!

> 
> -- 
> Cheers
> ~ Vaibhav

~Amit


Re: [Bisected] [commit dd6c3c544126] [linux-next] [6.8.0-rc2] Task hungs for infinite time call traces

2024-02-05 Thread Tejun Heo
Hello,

On Mon, Feb 05, 2024 at 03:24:17PM +0530, Tasmiya Nalatwad wrote:
> Greetings,
> 
> [Linux-next] [6.8.0-rc2-next-20240130] [FC / XFS] Task hangs for infinite
> time while running bonnie test XFS filesystem
> 
> Bisected the issue. Git bisect points to the below commit
> commit dd6c3c5441263723305a9c52c5ccc899a4653000
>   workqueue: Move pwq_dec_nr_in_flight() to the end of work item
> handling

This should be fixed by c70e1779b73a ("Fix pwq->nr_in_flight corruption in
try_to_grab_pending()").

Thanks.

-- 
tejun


Re: [kvm-unit-tests PATCH v2 4/9] migration: use a more robust way to wait for background job

2024-02-05 Thread Marc Hartmayer
On Fri, Feb 02, 2024 at 04:57 PM +1000, Nicholas Piggin  
wrote:
> Starting a pipeline of jobs in the background does not seem to have
> a simple way to reliably find the pid of a particular process in the
> pipeline (because not all processes are started when the shell
> continues to execute).
>
> The way PID of QEMU is derived can result in a failure waiting on a
> PID that is not running. This is easier to hit with subsequent
> multiple-migration support. Changing this to use $! by swapping the
> pipeline for a fifo is more robust.
>
> Signed-off-by: Nicholas Piggin 
> ---

[…snip…]

>  
> + # Wait until the destination has created the incoming and qmp sockets
> + while ! [ -S ${migsock} ] ; do sleep 0.1 ; done
> + while ! [ -S ${qmp2} ] ; do sleep 0.1 ; done

There should be timeout implemented, otherwise we might end in an
endless loop in case of a bug. Or is the global timeout good enough to
handle this situation?

> +
>   qmp ${qmp1} '"migrate", "arguments": { "uri": "unix:'${migsock}'" }' > 
> ${qmpout1}
>  
>   # Wait for the migration to complete
> -- 
> 2.42.0
>
>
-- 
Kind regards / Beste Grüße
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Wolfgang Wendt
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


Re: [kvm-unit-tests PATCH v2 1/9] (arm|powerpc|s390x): Makefile: Fix .aux.o generation

2024-02-05 Thread Marc Hartmayer
On Fri, Feb 02, 2024 at 04:57 PM +1000, Nicholas Piggin  
wrote:
> Using all prerequisites for the source file results in the build
> dying on the second time around with:
>
> gcc: fatal error: cannot specify ‘-o’ with ‘-c’, ‘-S’ or ‘-E’ with multiple 
> files
>
> This is due to auxinfo.h becoming a prerequisite after the first
> build recorded the dependency.
>
> Use the first prerequisite for this recipe.
>
> Fixes: f2372f2d49135 ("(arm|powerpc|s390x): Makefile: add `%.aux.o` target")
> Signed-off-by: Nicholas Piggin 
> ---
>  arm/Makefile.common | 2 +-

[…snip]

Thanks a ton for fixing this!

Reviewed-by: Marc Hartmayer 

-- 
Kind regards / Beste Grüße
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Wolfgang Wendt
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


[Bisected] [commit dd6c3c544126] [linux-next] [6.8.0-rc2] Task hungs for infinite time call traces

2024-02-05 Thread Tasmiya Nalatwad

Greetings,

[Linux-next] [6.8.0-rc2-next-20240130] [FC / XFS] Task hangs for 
infinite time while running bonnie test XFS filesystem


Bisected the issue. Git bisect points to the below commit
commit dd6c3c5441263723305a9c52c5ccc899a4653000
  workqueue: Move pwq_dec_nr_in_flight() to the end of work 
item handling


--- Traces ---

[  981.280811] Call Trace:
[  981.280813] [c001d10b7560] [c6e9b934] 0xc6e9b934 
(unreliable)
[  981.280820] [c001d10b7710] [c001fbac] __switch_to+0x13c/0x220
[  981.280827] [c001d10b7770] [c1002840] __schedule+0x268/0x7c4
[  981.280832] [c001d10b7840] [c1002ddc] schedule+0x40/0x108
[  981.280836] [c001d10b78b0] [c100b748] 
schedule_timeout+0x19c/0x1c0
[  981.280841] [c001d10b7980] [c1004044] 
__wait_for_common+0x148/0x340
[  981.280845] [c001d10b7a10] [c018fa98] 
__flush_workqueue+0x15c/0x530
[  981.280852] [c001d10b7ab0] [c00808f89e9c] 
xfs_inodegc_flush+0x54/0x15c [xfs]
[  981.280913] [c001d10b7b00] [c00808f9f47c] xfs_unmountfs+0x30/0x1e4 
[xfs]
[  981.280969] [c001d10b7b80] [c00808fa825c] 
xfs_fs_put_super+0x5c/0x110 [xfs]
[  981.281023] [c001d10b7bf0] [c05c8774] 
generic_shutdown_super+0xc0/0x16c
[  981.281029] [c001d10b7c60] [c05c8a50] kill_block_super+0x30/0x68
[  981.281034] [c001d10b7c90] [c00808fa5c54] xfs_kill_sb+0x28/0x4c [xfs]
[  981.281088] [c001d10b7cc0] [c05ca9d4] 
deactivate_locked_super+0x70/0x144
[  981.281093] [c001d10b7cf0] [c0605728] cleanup_mnt+0x10c/0x1d8
[  981.281098] [c001d10b7d40] [c019b5e4] task_work_run+0xe0/0x16c
[  981.281102] [c001d10b7d90] [c0022974] 
do_notify_resume+0x134/0x13c
[  981.281107] [c001d10b7dc0] [c0032378] 
interrupt_exit_user_prepare_main+0x1ac/0x264
[  981.281112] [c001d10b7e20] [c0032580] 
syscall_exit_prepare+0x150/0x178
[  981.281116] [c001d10b7e50] [c000d068] 
system_call_vectored_common+0x168/0x2ec
[  981.281122] --- interrupt: 3000 at 0x7fffaed4c11c
[  981.281125] NIP:  7fffaed4c11c LR:  CTR: 
[  981.281128] REGS: c001d10b7e80 TRAP: 3000   Not tainted  
(6.8.0-rc2-next-20240130-auto)
[  981.281131] MSR:  8000d033   CR: 48002402  
XER: 
[  981.281139] IRQMASK: 0
[  981.281139] GPR00: 0034 7fffec649770 7fffaef07f00 

[  981.281139] GPR04:  ff00  
0001
[  981.281139] GPR08: 00014cd61390   

[  981.281139] GPR12:  7fffaefbc140 0ee6b280 
7fffec649a30
[  981.281139] GPR16: 7fffec649bd8 000118b66478  

[  981.281139] GPR20:    

[  981.281139] GPR24: 7fffec64b0b0 000118b663d8 000118b66a58 

[  981.281139] GPR28: 00014cd61250  00014cd61370 
00014cd61140
[  981.281175] NIP [7fffaed4c11c] 0x7fffaed4c11c
[  981.281177] LR [] 0x0
[  981.281179] --- interrupt: 3000

--
Regards,
Tasmiya Nalatwad
IBM Linux Technology Center


Re: [PATCH v2] KVM: PPC: Book3S HV: Fix L2 guest reboot failure due to empty 'arch_compat'

2024-02-05 Thread Vaibhav Jain
Hi Amit,

Thanks for the patch. Minor comment on the patch below:

Amit Machhiwal  writes:



>  
> +static inline unsigned long map_pcr_to_cap(unsigned long pcr)
> +{
> + unsigned long cap = 0;
> +
> + switch (pcr) {
> + case PCR_ARCH_300:
> + cap = H_GUEST_CAP_POWER9;
> + break;
> + case PCR_ARCH_31:
> + cap = H_GUEST_CAP_POWER10;
Though CONFIG_CC_IMPLICIT_FALLTHROUGH and '-Wimplicit-fallthrough'
doesnt explicitly flag this usage, please consider using the
'fallthrough;' keyword here.

However you probably dont want this switch-case to fallthrough so please
use a 'break' instead.

> + default:
> + break;
> + }
> +
> + return cap;
> +}
> +
>


With the suggested change above

Reviewed-by: Vaibhav Jain 

-- 
Cheers
~ Vaibhav


Re: [PATCH v2 1/6] net: wan: Add support for QMC HDLC

2024-02-05 Thread Herve Codina
On Mon, 05 Feb 2024 16:49:33 +0100
Paolo Abeni  wrote:

[...
> > > 
> > > In general is quite bad that the existing infra does not allow
> > > leveraging NAPI. Have you considered expanding the QMC to accomodate
> > > such user?  
> > 
> > I cannot mask/unmask the 'end of transfer' interrupt.
> > Indeed, other streams use this interrupt among them audio streams and so
> > masking the interrupt for HDLC data will also mask the interrupt for audio
> > data.  
> 
> Uhm... I fear the above makes the available options list empty :(
> 
> > At the HDLC driver level, the best I can to is to store a queue of complete
> > HDLC skbs (queue filled on interrupts) and send them to the network stack
> > when the napi poll() is called.
> > 
> > I am not sure that this kind of queue (additional level between always
> > enabled interrupts and the network stack) makes sense.
> > 
> > Do you have any opinion about this additional queue management for NAPI
> > support?  
> 
> With such idea in place, what HDLC-level data will be accessed by the
> napi context? The RX interrupts will remain unmasked after the
> interrupt and before the napi poll right? That would be
> problematic/could cause drop if the ingress pkt/interrupt rate will be
> higher that what the napi could process - and that in turn could bring
> back old bad livelock times :(

Indeed.
So the best thing to do is to keep this driver without NAPI support.

Best regards,
Hervé


Re: [PATCH v2 1/6] net: wan: Add support for QMC HDLC

2024-02-05 Thread Paolo Abeni
On Mon, 2024-02-05 at 15:22 +0100, Herve Codina wrote:
> Hi Paolo,
> 
> On Thu, 01 Feb 2024 12:41:32 +0100
> Paolo Abeni  wrote:
> 
> [...]
> > > +static inline struct qmc_hdlc *netdev_to_qmc_hdlc(struct net_device 
> > > *netdev)
> > > +{
> > > + return dev_to_hdlc(netdev)->priv;
> > > +}  
> > 
> > Please, no 'inline' function in c files. You could move this function
> > and the struct definition into a new, local, header file.
> 
> 'inline' function specifier will be removed in the next iteration on the 
> series.
> 
> > 
> > > +static int qmc_hdlc_recv_queue(struct qmc_hdlc *qmc_hdlc, struct 
> > > qmc_hdlc_desc *desc, size_t size);
> > > +
> > > +#define QMC_HDLC_RX_ERROR_FLAGS (QMC_RX_FLAG_HDLC_OVF | \
> > > +  QMC_RX_FLAG_HDLC_UNA | \
> > > +  QMC_RX_FLAG_HDLC_ABORT | \
> > > +  QMC_RX_FLAG_HDLC_CRC)
> > > +
> > > +static void qmc_hcld_recv_complete(void *context, size_t length, 
> > > unsigned int flags)
> > > +{
> > > + struct qmc_hdlc_desc *desc = context;
> > > + struct net_device *netdev = desc->netdev;
> > > + struct qmc_hdlc *qmc_hdlc = netdev_to_qmc_hdlc(netdev);
> > > + int ret;  
> > 
> > Please, respect the reverse x-mas tree order for local variable
> > definition.
> 
> desc depends on context, netdev depends on desc and qmc_hdlc depends on 
> netdev.
> I think the declaration order is correct here even it doesn't respect the 
> reverse
> x-mas tree.
> 
> [...]
> > > +static netdev_tx_t qmc_hdlc_xmit(struct sk_buff *skb, struct net_device 
> > > *netdev)
> > > +{
> > > + struct qmc_hdlc *qmc_hdlc = netdev_to_qmc_hdlc(netdev);
> > > + struct qmc_hdlc_desc *desc;
> > > + unsigned long flags;
> > > + int ret;
> > > +
> > > + spin_lock_irqsave(_hdlc->tx_lock, flags);
> > > + desc = _hdlc->tx_descs[qmc_hdlc->tx_out];
> > > + if (WARN_ONCE(!desc->skb, "No tx descriptors available\n")) {
> > > + /* Should never happen.
> > > +  * Previous xmit should have already stopped the queue.
> > > +  */
> > > + netif_stop_queue(netdev);
> > > + spin_unlock_irqrestore(_hdlc->tx_lock, flags);
> > > + return NETDEV_TX_BUSY;
> > > + }
> > > + spin_unlock_irqrestore(_hdlc->tx_lock, flags);
> > > +
> > > + desc->netdev = netdev;
> > > + desc->dma_size = skb->len;
> > > + desc->skb = skb;
> > > + ret = qmc_hdlc_xmit_queue(qmc_hdlc, desc);
> > > + if (ret) {
> > > + desc->skb = NULL; /* Release the descriptor */
> > > + if (ret == -EBUSY) {
> > > + netif_stop_queue(netdev);
> > > + return NETDEV_TX_BUSY;
> > > + }
> > > + dev_kfree_skb(skb);
> > > + netdev->stats.tx_dropped++;
> > > + return NETDEV_TX_OK;
> > > + }
> > > +
> > > + qmc_hdlc->tx_out = (qmc_hdlc->tx_out + 1) % 
> > > ARRAY_SIZE(qmc_hdlc->tx_descs);
> > > +
> > > + spin_lock_irqsave(_hdlc->tx_lock, flags);
> > > + if (qmc_hdlc->tx_descs[qmc_hdlc->tx_out].skb)
> > > + netif_stop_queue(netdev);
> > > + spin_unlock_irqrestore(_hdlc->tx_lock, flags);  
> > 
> > The locking schema is quite bad, as the drivers acquires and releases 3
> > locks for each tx packet. You could improve that using the qmc_chan-
> > > tx_lock to protect the whole qmc_hdlc_xmit() function, factoring out a  
> > lockless variant of qmc_hdlc_xmit_queue(), and using it here.
> 
> I will change on next iteration and keep 2 lock/unlock instead of 3:
>   - one in qmc_hdlc_xmit()
>   - one in qmc_hdlc_xmit_complete() 
> to protect the descriptors accesses.
> 
> > 
> > In general is quite bad that the existing infra does not allow
> > leveraging NAPI. Have you considered expanding the QMC to accomodate
> > such user?
> 
> I cannot mask/unmask the 'end of transfer' interrupt.
> Indeed, other streams use this interrupt among them audio streams and so
> masking the interrupt for HDLC data will also mask the interrupt for audio
> data.

Uhm... I fear the above makes the available options list empty :(

> At the HDLC driver level, the best I can to is to store a queue of complete
> HDLC skbs (queue filled on interrupts) and send them to the network stack
> when the napi poll() is called.
> 
> I am not sure that this kind of queue (additional level between always
> enabled interrupts and the network stack) makes sense.
> 
> Do you have any opinion about this additional queue management for NAPI
> support?

With such idea in place, what HDLC-level data will be accessed by the
napi context? The RX interrupts will remain unmasked after the
interrupt and before the napi poll right? That would be
problematic/could cause drop if the ingress pkt/interrupt rate will be
higher that what the napi could process - and that in turn could bring
back old bad livelock times :(

Cheers,

Paolo



Re: [PATCH v2 5/6] net: wan: fsl_qmc_hdlc: Add runtime timeslots changes support

2024-02-05 Thread Herve Codina
Hi Paolo,

On Thu, 01 Feb 2024 13:01:51 +0100
Paolo Abeni  wrote:

[...]
> >  
> > +static int qmc_hdlc_xlate_slot_map(struct qmc_hdlc *qmc_hdlc,
> > +  u32 slot_map, struct qmc_chan_ts_info 
> > *ts_info)
> > +{
> > +   DECLARE_BITMAP(ts_mask_avail, 64);
> > +   DECLARE_BITMAP(ts_mask, 64);
> > +   DECLARE_BITMAP(map, 64);
> > +   u32 array32[2];
> > +
> > +   /* Tx and Rx available masks must be identical */
> > +   if (ts_info->rx_ts_mask_avail != ts_info->tx_ts_mask_avail) {
> > +   dev_err(qmc_hdlc->dev, "tx and rx available timeslots mismatch 
> > (0x%llx, 0x%llx)\n",
> > +   ts_info->rx_ts_mask_avail, ts_info->tx_ts_mask_avail);
> > +   return -EINVAL;
> > +   }
> > +
> > +   bitmap_from_arr64(ts_mask_avail, _info->rx_ts_mask_avail, 64);
> > +   array32[0] = slot_map;
> > +   array32[1] = 0;
> > +   bitmap_from_arr32(map, array32, 64);  
> 
> What about using bitmap_from_u64 everywhere ?

Yes indeed.
Will be updated in the next series iteration.

Thanks for this review.
Best regards,
Hervé


Re: [PATCH v2 1/6] net: wan: Add support for QMC HDLC

2024-02-05 Thread Herve Codina
Hi Paolo,

On Thu, 01 Feb 2024 12:41:32 +0100
Paolo Abeni  wrote:

[...]
> > +static inline struct qmc_hdlc *netdev_to_qmc_hdlc(struct net_device 
> > *netdev)
> > +{
> > +   return dev_to_hdlc(netdev)->priv;
> > +}  
> 
> Please, no 'inline' function in c files. You could move this function
> and the struct definition into a new, local, header file.

'inline' function specifier will be removed in the next iteration on the series.

> 
> > +static int qmc_hdlc_recv_queue(struct qmc_hdlc *qmc_hdlc, struct 
> > qmc_hdlc_desc *desc, size_t size);
> > +
> > +#define QMC_HDLC_RX_ERROR_FLAGS (QMC_RX_FLAG_HDLC_OVF | \
> > +QMC_RX_FLAG_HDLC_UNA | \
> > +QMC_RX_FLAG_HDLC_ABORT | \
> > +QMC_RX_FLAG_HDLC_CRC)
> > +
> > +static void qmc_hcld_recv_complete(void *context, size_t length, unsigned 
> > int flags)
> > +{
> > +   struct qmc_hdlc_desc *desc = context;
> > +   struct net_device *netdev = desc->netdev;
> > +   struct qmc_hdlc *qmc_hdlc = netdev_to_qmc_hdlc(netdev);
> > +   int ret;  
> 
> Please, respect the reverse x-mas tree order for local variable
> definition.

desc depends on context, netdev depends on desc and qmc_hdlc depends on netdev.
I think the declaration order is correct here even it doesn't respect the 
reverse
x-mas tree.

[...]
> > +static netdev_tx_t qmc_hdlc_xmit(struct sk_buff *skb, struct net_device 
> > *netdev)
> > +{
> > +   struct qmc_hdlc *qmc_hdlc = netdev_to_qmc_hdlc(netdev);
> > +   struct qmc_hdlc_desc *desc;
> > +   unsigned long flags;
> > +   int ret;
> > +
> > +   spin_lock_irqsave(_hdlc->tx_lock, flags);
> > +   desc = _hdlc->tx_descs[qmc_hdlc->tx_out];
> > +   if (WARN_ONCE(!desc->skb, "No tx descriptors available\n")) {
> > +   /* Should never happen.
> > +* Previous xmit should have already stopped the queue.
> > +*/
> > +   netif_stop_queue(netdev);
> > +   spin_unlock_irqrestore(_hdlc->tx_lock, flags);
> > +   return NETDEV_TX_BUSY;
> > +   }
> > +   spin_unlock_irqrestore(_hdlc->tx_lock, flags);
> > +
> > +   desc->netdev = netdev;
> > +   desc->dma_size = skb->len;
> > +   desc->skb = skb;
> > +   ret = qmc_hdlc_xmit_queue(qmc_hdlc, desc);
> > +   if (ret) {
> > +   desc->skb = NULL; /* Release the descriptor */
> > +   if (ret == -EBUSY) {
> > +   netif_stop_queue(netdev);
> > +   return NETDEV_TX_BUSY;
> > +   }
> > +   dev_kfree_skb(skb);
> > +   netdev->stats.tx_dropped++;
> > +   return NETDEV_TX_OK;
> > +   }
> > +
> > +   qmc_hdlc->tx_out = (qmc_hdlc->tx_out + 1) % 
> > ARRAY_SIZE(qmc_hdlc->tx_descs);
> > +
> > +   spin_lock_irqsave(_hdlc->tx_lock, flags);
> > +   if (qmc_hdlc->tx_descs[qmc_hdlc->tx_out].skb)
> > +   netif_stop_queue(netdev);
> > +   spin_unlock_irqrestore(_hdlc->tx_lock, flags);  
> 
> The locking schema is quite bad, as the drivers acquires and releases 3
> locks for each tx packet. You could improve that using the qmc_chan-
> >tx_lock to protect the whole qmc_hdlc_xmit() function, factoring out a  
> lockless variant of qmc_hdlc_xmit_queue(), and using it here.

I will change on next iteration and keep 2 lock/unlock instead of 3:
  - one in qmc_hdlc_xmit()
  - one in qmc_hdlc_xmit_complete() 
to protect the descriptors accesses.

> 
> In general is quite bad that the existing infra does not allow
> leveraging NAPI. Have you considered expanding the QMC to accomodate
> such user?

I cannot mask/unmask the 'end of transfer' interrupt.
Indeed, other streams use this interrupt among them audio streams and so
masking the interrupt for HDLC data will also mask the interrupt for audio
data.

At the HDLC driver level, the best I can to is to store a queue of complete
HDLC skbs (queue filled on interrupts) and send them to the network stack
when the napi poll() is called.

I am not sure that this kind of queue (additional level between always
enabled interrupts and the network stack) makes sense.

Do you have any opinion about this additional queue management for NAPI
support?

Best regards,
Hervé


Re: Re: [PATCH v2 1/3] powerpc/code-patching: Add generic memory patching

2024-02-05 Thread Naveen N Rao
On Mon, Feb 05, 2024 at 01:30:46PM +1100, Benjamin Gray wrote:
> On Thu, 2023-11-30 at 15:55 +0530, Naveen N Rao wrote:
> > On Mon, Oct 16, 2023 at 04:01:45PM +1100, Benjamin Gray wrote:
> > > 
> > > diff --git a/arch/powerpc/include/asm/code-patching.h
> > > b/arch/powerpc/include/asm/code-patching.h
> > > index 3f881548fb61..7c6056bb1706 100644
> > > --- a/arch/powerpc/include/asm/code-patching.h
> > > +++ b/arch/powerpc/include/asm/code-patching.h
> > > @@ -75,6 +75,39 @@ int patch_branch(u32 *addr, unsigned long
> > > target, int flags);
> > >  int patch_instruction(u32 *addr, ppc_inst_t instr);
> > >  int raw_patch_instruction(u32 *addr, ppc_inst_t instr);
> > >  
> > > +/*
> > > + * patch_uint() and patch_ulong() should only be called on
> > > addresses where the
> > > + * patch does not cross a cacheline, otherwise it may not be
> > > flushed properly
> > > + * and mixes of new and stale data may be observed. It cannot
> > > cross a page
> > > + * boundary, as only the target page is mapped as writable.
> > 
> > Should we enforce alignment requirements, especially for
> > patch_ulong() 
> > on 64-bit powerpc? I am not sure if there are use cases for unaligned
> > 64-bit writes. That should also ensure that the write doesn't cross a
> > cacheline.
> 
> Yeah, the current description is more just the technical restrictions,
> not an endorsement of usage. If the caller isn't working with aligned
> data, it seems unlikely it would still be cacheline aligned. The caller
> should break it into 32bit patches if this is the case.
> 
> By enforce, are you suggesting a WARN_ON in the code too?

No, just detecting and returning an error code should help detect 
incorrect usage.


- Naveen



Re: [PATCH v2] KVM: PPC: Book3S HV: Fix L2 guest reboot failure due to empty 'arch_compat'

2024-02-05 Thread Aneesh Kumar K . V
Amit Machhiwal  writes:

> Currently, rebooting a pseries nested qemu-kvm guest (L2) results in
> below error as L1 qemu sends PVR value 'arch_compat' == 0 via
> ppc_set_compat ioctl. This triggers a condition failure in
> kvmppc_set_arch_compat() resulting in an EINVAL.
>
> qemu-system-ppc64: Unable to set CPU compatibility mode in KVM: Invalid
> argument
>
> Also, a value of 0 for arch_compat generally refers the default
> compatibility of the host. But, arch_compat, being a Guest Wide Element
> in nested API v2, cannot be set to 0 in GSB as PowerVM (L0) expects a
> non-zero value. A value of 0 triggers a kernel trap during a reboot and
> consequently causes it to fail:
>
> [   22.106360] reboot: Restarting system
> KVM: unknown exit, hardware reason ffea
> NIP 0100   LR fe44 CTR  XER 
> 20040092 CPU#0
> MSR 1000 HID0   HF 6c00 iidx 3 didx 3
> TB   DECR 0
> GPR00   c2a8c300 7fe0
> GPR04   1002 82803033
> GPR08 0a00  0004 2fff
> GPR12  c2e1 000105639200 0004
> GPR16  00010563a090  
> GPR20 000105639e20 0001056399c8 7fffe54abab0 000105639288
> GPR24  0001 0001 
> GPR28   c2b30840 
> CR   [ -  -  -  -  -  -  -  -  ] RES 000@
>  SRR0   SRR1 PVR 00800200 VRSAVE 
> 
> SPRG0  SPRG1   SPRG2   SPRG3 
> 
> SPRG4  SPRG5   SPRG6   SPRG7 
> 
> HSRR0  HSRR1 
>  CFAR 
>  LPCR 00020400
>  PTCR    DAR   DSISR 
>
>  kernel:trap=0xffea | pc=0x100 | msr=0x1000
>
> This patch updates kvmppc_set_arch_compat() to use the host PVR value if
> 'compat_pvr' == 0 indicating that qemu doesn't want to enforce any
> specific PVR compat mode.
>

Reviewed-by: Aneesh Kumar K.V (IBM) 

>
> Fixes: 19d31c5f1157 ("KVM: PPC: Add support for nestedv2 guests")
> Signed-off-by: Amit Machhiwal 
> ---
>
> Changes v1 -> v2:
> - Added descriptive error log in the patch description when
>   `arch_compat == 0` passed in GSB
> - Added a helper function for PCR to capabilities mapping
> - Added relevant comments around the changes being made
>
> v1: 
> https://lore.kernel.org/lkml/20240118095653.2588129-1-amach...@linux.ibm.com/
>
>  arch/powerpc/kvm/book3s_hv.c  | 25 +++--
>  arch/powerpc/kvm/book3s_hv_nestedv2.c | 23 +--
>  2 files changed, 44 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 52427fc2a33f..270ab9cf9a54 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -391,6 +391,23 @@ static void kvmppc_set_pvr_hv(struct kvm_vcpu *vcpu, u32 
> pvr)
>  /* Dummy value used in computing PCR value below */
>  #define PCR_ARCH_31(PCR_ARCH_300 << 1)
>  
> +static inline unsigned long map_pcr_to_cap(unsigned long pcr)
> +{
> + unsigned long cap = 0;
> +
> + switch (pcr) {
> + case PCR_ARCH_300:
> + cap = H_GUEST_CAP_POWER9;
> + break;
> + case PCR_ARCH_31:
> + cap = H_GUEST_CAP_POWER10;
> + default:
> + break;
> + }
> +
> + return cap;
> +}
> +
>  static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
>  {
>   unsigned long host_pcr_bit = 0, guest_pcr_bit = 0, cap = 0;
> @@ -424,11 +441,9 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, 
> u32 arch_compat)
>   break;
>   case PVR_ARCH_300:
>   guest_pcr_bit = PCR_ARCH_300;
> - cap = H_GUEST_CAP_POWER9;
>   break;
>   case PVR_ARCH_31:
>   guest_pcr_bit = PCR_ARCH_31;
> - cap = H_GUEST_CAP_POWER10;
>   break;
>   default:
>   return -EINVAL;
> @@ -440,6 +455,12 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, 
> u32 arch_compat)
>   return -EINVAL;
>  
>   if (kvmhv_on_pseries() && kvmhv_is_nestedv2()) {
> + /*
> +  * 'arch_compat == 0' would mean the guest should default to
> +  * L1's compatibility. In this case, the guest would pick
> +  * host's PCR and evaluate the corresponding capabilities.
> +  */
> + cap = map_pcr_to_cap(guest_pcr_bit);
>  

[PATCH v2] KVM: PPC: Book3S HV: Fix L2 guest reboot failure due to empty 'arch_compat'

2024-02-05 Thread Amit Machhiwal
Currently, rebooting a pseries nested qemu-kvm guest (L2) results in
below error as L1 qemu sends PVR value 'arch_compat' == 0 via
ppc_set_compat ioctl. This triggers a condition failure in
kvmppc_set_arch_compat() resulting in an EINVAL.

qemu-system-ppc64: Unable to set CPU compatibility mode in KVM: Invalid
argument

Also, a value of 0 for arch_compat generally refers the default
compatibility of the host. But, arch_compat, being a Guest Wide Element
in nested API v2, cannot be set to 0 in GSB as PowerVM (L0) expects a
non-zero value. A value of 0 triggers a kernel trap during a reboot and
consequently causes it to fail:

[   22.106360] reboot: Restarting system
KVM: unknown exit, hardware reason ffea
NIP 0100   LR fe44 CTR  XER 
20040092 CPU#0
MSR 1000 HID0   HF 6c00 iidx 3 didx 3
TB   DECR 0
GPR00   c2a8c300 7fe0
GPR04   1002 82803033
GPR08 0a00  0004 2fff
GPR12  c2e1 000105639200 0004
GPR16  00010563a090  
GPR20 000105639e20 0001056399c8 7fffe54abab0 000105639288
GPR24  0001 0001 
GPR28   c2b30840 
CR   [ -  -  -  -  -  -  -  -  ] RES 000@
 SRR0   SRR1 PVR 00800200 VRSAVE 

SPRG0  SPRG1   SPRG2   SPRG3 

SPRG4  SPRG5   SPRG6   SPRG7 

HSRR0  HSRR1 
 CFAR 
 LPCR 00020400
 PTCR    DAR   DSISR 

 kernel:trap=0xffea | pc=0x100 | msr=0x1000

This patch updates kvmppc_set_arch_compat() to use the host PVR value if
'compat_pvr' == 0 indicating that qemu doesn't want to enforce any
specific PVR compat mode.

Fixes: 19d31c5f1157 ("KVM: PPC: Add support for nestedv2 guests")
Signed-off-by: Amit Machhiwal 
---

Changes v1 -> v2:
- Added descriptive error log in the patch description when
  `arch_compat == 0` passed in GSB
- Added a helper function for PCR to capabilities mapping
- Added relevant comments around the changes being made

v1: 
https://lore.kernel.org/lkml/20240118095653.2588129-1-amach...@linux.ibm.com/

 arch/powerpc/kvm/book3s_hv.c  | 25 +++--
 arch/powerpc/kvm/book3s_hv_nestedv2.c | 23 +--
 2 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 52427fc2a33f..270ab9cf9a54 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -391,6 +391,23 @@ static void kvmppc_set_pvr_hv(struct kvm_vcpu *vcpu, u32 
pvr)
 /* Dummy value used in computing PCR value below */
 #define PCR_ARCH_31(PCR_ARCH_300 << 1)
 
+static inline unsigned long map_pcr_to_cap(unsigned long pcr)
+{
+   unsigned long cap = 0;
+
+   switch (pcr) {
+   case PCR_ARCH_300:
+   cap = H_GUEST_CAP_POWER9;
+   break;
+   case PCR_ARCH_31:
+   cap = H_GUEST_CAP_POWER10;
+   default:
+   break;
+   }
+
+   return cap;
+}
+
 static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
 {
unsigned long host_pcr_bit = 0, guest_pcr_bit = 0, cap = 0;
@@ -424,11 +441,9 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, 
u32 arch_compat)
break;
case PVR_ARCH_300:
guest_pcr_bit = PCR_ARCH_300;
-   cap = H_GUEST_CAP_POWER9;
break;
case PVR_ARCH_31:
guest_pcr_bit = PCR_ARCH_31;
-   cap = H_GUEST_CAP_POWER10;
break;
default:
return -EINVAL;
@@ -440,6 +455,12 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, 
u32 arch_compat)
return -EINVAL;
 
if (kvmhv_on_pseries() && kvmhv_is_nestedv2()) {
+   /*
+* 'arch_compat == 0' would mean the guest should default to
+* L1's compatibility. In this case, the guest would pick
+* host's PCR and evaluate the corresponding capabilities.
+*/
+   cap = map_pcr_to_cap(guest_pcr_bit);
if (!(cap & nested_capabilities))
return -EINVAL;
}
diff --git a/arch/powerpc/kvm/book3s_hv_nestedv2.c 
b/arch/powerpc/kvm/book3s_hv_nestedv2.c
index 5378eb40b162..6042bdc70230 

Re: [PATCH] powerpc: add crtsavres.o to always-y instead of extra-y

2024-02-05 Thread Jan Stancek
On Mon, Feb 5, 2024 at 12:50 PM Michael Ellerman  wrote:
>
> Jan Stancek  writes:
> > On Tue, Nov 21, 2023 at 10:51:34AM +1000, Nicholas Piggin wrote:
> >>On Tue Nov 21, 2023 at 9:23 AM AEST, Masahiro Yamada wrote:
> >>> crtsavres.o is linked to modules. However, as explained in commit
> >>> d0e628cd817f ("kbuild: doc: clarify the difference between extra-y
> >>> and always-y"), 'make modules' does not build extra-y.
> >>>
> >>> For example, the following command fails:
> >>>
> >>>   $ make ARCH=powerpc LLVM=1 KBUILD_MODPOST_WARN=1 mrproper ps3_defconfig 
> >>> modules
> >>> [snip]
> >>> LD [M]  arch/powerpc/platforms/cell/spufs/spufs.ko
> >>>   ld.lld: error: cannot open arch/powerpc/lib/crtsavres.o: No such file 
> >>> or directory
> >>>   make[3]: *** [scripts/Makefile.modfinal:56: 
> >>> arch/powerpc/platforms/cell/spufs/spufs.ko] Error 1
> >>>   make[2]: *** [Makefile:1844: modules] Error 2
> >>>   make[1]: *** [/home/masahiro/workspace/linux-kbuild/Makefile:350: 
> >>> __build_one_by_one] Error 2
> >>>   make: *** [Makefile:234: __sub-make] Error 2
> >>>
> >>
> >>Thanks. Is this the correct Fixes tag?
> >>
> >>Fixes: d0e628cd817f ("powerpc/64: Do not link crtsavres.o in vmlinux")
> >>
> >>Hmm, looks like LLD might just do this now automatically for us
> >>too without --save-restore-funcs (https://reviews.llvm.org/D79977).
> >>But we probably still need it for older versions, so we still need
> >>your patch.
> >
> > Hi,
> >
> > I'm still seeing the error of crtsavres.o missing when building external 
> > modules
> > after "make LLVM=1 modules_prepare". Should it be built also in archprepare?
>
> Or modules_prepare?
>
> Example patch below.

I tested your patch with my setup and that works for me as well.

>
> cheers
>
>
> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
> index 051247027da0..82cdef40a9cd 100644
> --- a/arch/powerpc/Makefile
> +++ b/arch/powerpc/Makefile
> @@ -59,6 +59,11 @@ ifeq ($(CONFIG_PPC64)$(CONFIG_LD_IS_BFD),yy)
>  KBUILD_LDFLAGS_MODULE += --save-restore-funcs
>  else
>  KBUILD_LDFLAGS_MODULE += arch/powerpc/lib/crtsavres.o
> +
> +crtsavres_prepare: scripts
> +   $(MAKE) $(build)=arch/powerpc/lib arch/powerpc/lib/crtsavres.o
> +
> +modules_prepare: crtsavres_prepare
>  endif
>
>  ifdef CONFIG_CPU_LITTLE_ENDIAN
>



Re: [kvm-unit-tests PATCH v2 3/9] arch-run: Clean up initrd cleanup

2024-02-05 Thread Thomas Huth

On 02/02/2024 07.57, Nicholas Piggin wrote:

Rather than put a big script into the trap handler, have it call
a function.

Signed-off-by: Nicholas Piggin 
---
  scripts/arch-run.bash | 12 +++-
  1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
index f22ead6f..cc7da7c5 100644
--- a/scripts/arch-run.bash
+++ b/scripts/arch-run.bash
@@ -271,10 +271,20 @@ search_qemu_binary ()
export PATH=$save_path
  }
  
+initrd_cleanup ()

+{
+   if [ "$KVM_UNIT_TESTS_ENV_OLD" ]; then
+   export KVM_UNIT_TESTS_ENV="$KVM_UNIT_TESTS_ENV_OLD"
+   else
+   unset KVM_UNIT_TESTS_ENV
+   unset KVM_UNIT_TESTS_ENV_OLD
+   fi
+}
+
  initrd_create ()
  {
if [ "$ENVIRON_DEFAULT" = "yes" ]; then
-   trap_exit_push 'rm -f $KVM_UNIT_TESTS_ENV; [ "$KVM_UNIT_TESTS_ENV_OLD" ] 
&& export KVM_UNIT_TESTS_ENV="$KVM_UNIT_TESTS_ENV_OLD" || unset KVM_UNIT_TESTS_ENV; unset 
KVM_UNIT_TESTS_ENV_OLD'
+   trap_exit_push 'rm -f $KVM_UNIT_TESTS_ENV; initrd_cleanup'



Why don't you move the 'rm -f $KVM_UNIT_TESTS_ENV' into the initrd_cleanup() 
function, too? ... that would IMHO make more sense for a function that is 
called *_cleanup() ?


 Thomas




Re: [PATCH] powerpc: add crtsavres.o to always-y instead of extra-y

2024-02-05 Thread Michael Ellerman
Jan Stancek  writes:
> On Tue, Nov 21, 2023 at 10:51:34AM +1000, Nicholas Piggin wrote:
>>On Tue Nov 21, 2023 at 9:23 AM AEST, Masahiro Yamada wrote:
>>> crtsavres.o is linked to modules. However, as explained in commit
>>> d0e628cd817f ("kbuild: doc: clarify the difference between extra-y
>>> and always-y"), 'make modules' does not build extra-y.
>>>
>>> For example, the following command fails:
>>>
>>>   $ make ARCH=powerpc LLVM=1 KBUILD_MODPOST_WARN=1 mrproper ps3_defconfig 
>>> modules
>>> [snip]
>>> LD [M]  arch/powerpc/platforms/cell/spufs/spufs.ko
>>>   ld.lld: error: cannot open arch/powerpc/lib/crtsavres.o: No such file or 
>>> directory
>>>   make[3]: *** [scripts/Makefile.modfinal:56: 
>>> arch/powerpc/platforms/cell/spufs/spufs.ko] Error 1
>>>   make[2]: *** [Makefile:1844: modules] Error 2
>>>   make[1]: *** [/home/masahiro/workspace/linux-kbuild/Makefile:350: 
>>> __build_one_by_one] Error 2
>>>   make: *** [Makefile:234: __sub-make] Error 2
>>>
>>
>>Thanks. Is this the correct Fixes tag?
>>
>>Fixes: d0e628cd817f ("powerpc/64: Do not link crtsavres.o in vmlinux")
>>
>>Hmm, looks like LLD might just do this now automatically for us
>>too without --save-restore-funcs (https://reviews.llvm.org/D79977).
>>But we probably still need it for older versions, so we still need
>>your patch.
>
> Hi,
>
> I'm still seeing the error of crtsavres.o missing when building external 
> modules
> after "make LLVM=1 modules_prepare". Should it be built also in archprepare?

Or modules_prepare?

Example patch below.

cheers


diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 051247027da0..82cdef40a9cd 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -59,6 +59,11 @@ ifeq ($(CONFIG_PPC64)$(CONFIG_LD_IS_BFD),yy)
 KBUILD_LDFLAGS_MODULE += --save-restore-funcs
 else
 KBUILD_LDFLAGS_MODULE += arch/powerpc/lib/crtsavres.o
+
+crtsavres_prepare: scripts
+   $(MAKE) $(build)=arch/powerpc/lib arch/powerpc/lib/crtsavres.o
+
+modules_prepare: crtsavres_prepare
 endif
 
 ifdef CONFIG_CPU_LITTLE_ENDIAN



Re: [kvm-unit-tests PATCH v2 1/9] (arm|powerpc|s390x): Makefile: Fix .aux.o generation

2024-02-05 Thread Thomas Huth

On 02/02/2024 10.30, Andrew Jones wrote:

On Fri, Feb 02, 2024 at 04:57:32PM +1000, Nicholas Piggin wrote:

Using all prerequisites for the source file results in the build
dying on the second time around with:

gcc: fatal error: cannot specify ‘-o’ with ‘-c’, ‘-S’ or ‘-E’ with multiple 
files

This is due to auxinfo.h becoming a prerequisite after the first
build recorded the dependency.


D'oh, of course I only tried to run "make" once when testing that patch :-/


diff --git a/arm/Makefile.common b/arm/Makefile.common
index 54cb4a63..c2ee568c 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -71,7 +71,7 @@ FLATLIBS = $(libcflat) $(LIBFDT_archive) $(libeabi)
  
  ifeq ($(CONFIG_EFI),y)

  %.aux.o: $(SRCDIR)/lib/auxinfo.c
-   $(CC) $(CFLAGS) -c -o $@ $^ \
+   $(CC) $(CFLAGS) -c -o $@ $< \
-DPROGNAME=\"$(@:.aux.o=.efi)\" -DAUXFLAGS=$(AUXFLAGS)


There are two instances of the %.aux.o target in arm/Makefile.common. We
need to fix both. We can actually pull the target out of the two arms of
the CONFIG_EFI if-else, though, by changing the .efi/.flat to .$(exe).


I went ahead and pushed this patch with the trivial fix for the else-branch 
to the repo to unbreak the build. If you think it's worthwhile to unify the 
target, please provide a patch to do so, thanks!


 Thomas



14:59,[Bisected] [commit dd6c3c544126] [linux-next] [6.8.0-rc2] Task hungs for infinite time call traces

2024-02-05 Thread Tasmiya Nalatwad
[linux-next] [6.8.0-rc2-next-20240130] [FC/XFS] Task hungs for infinite 
time while running bonnie test XFS filesystem


Bisected the issue. Git bisect points to the below commit
commit dd6c3c5441263723305a9c52c5ccc899a4653000
  workqueue: Move pwq_dec_nr_in_flight() to the end of work 
item handling


--- Traces ---

[  981.280811] Call Trace:
[  981.280813] [c001d10b7560] [c6e9b934] 0xc6e9b934 
(unreliable)
[  981.280820] [c001d10b7710] [c001fbac] __switch_to+0x13c/0x220
[  981.280827] [c001d10b7770] [c1002840] __schedule+0x268/0x7c4
[  981.280832] [c001d10b7840] [c1002ddc] schedule+0x40/0x108
[  981.280836] [c001d10b78b0] [c100b748] 
schedule_timeout+0x19c/0x1c0
[  981.280841] [c001d10b7980] [c1004044] 
__wait_for_common+0x148/0x340
[  981.280845] [c001d10b7a10] [c018fa98] 
__flush_workqueue+0x15c/0x530
[  981.280852] [c001d10b7ab0] [c00808f89e9c] 
xfs_inodegc_flush+0x54/0x15c [xfs]
[  981.280913] [c001d10b7b00] [c00808f9f47c] xfs_unmountfs+0x30/0x1e4 
[xfs]
[  981.280969] [c001d10b7b80] [c00808fa825c] 
xfs_fs_put_super+0x5c/0x110 [xfs]
[  981.281023] [c001d10b7bf0] [c05c8774] 
generic_shutdown_super+0xc0/0x16c
[  981.281029] [c001d10b7c60] [c05c8a50] kill_block_super+0x30/0x68
[  981.281034] [c001d10b7c90] [c00808fa5c54] xfs_kill_sb+0x28/0x4c [xfs]
[  981.281088] [c001d10b7cc0] [c05ca9d4] 
deactivate_locked_super+0x70/0x144
[  981.281093] [c001d10b7cf0] [c0605728] cleanup_mnt+0x10c/0x1d8
[  981.281098] [c001d10b7d40] [c019b5e4] task_work_run+0xe0/0x16c
[  981.281102] [c001d10b7d90] [c0022974] 
do_notify_resume+0x134/0x13c
[  981.281107] [c001d10b7dc0] [c0032378] 
interrupt_exit_user_prepare_main+0x1ac/0x264
[  981.281112] [c001d10b7e20] [c0032580] 
syscall_exit_prepare+0x150/0x178
[  981.281116] [c001d10b7e50] [c000d068] 
system_call_vectored_common+0x168/0x2ec
[  981.281122] --- interrupt: 3000 at 0x7fffaed4c11c
[  981.281125] NIP:  7fffaed4c11c LR:  CTR: 
[  981.281128] REGS: c001d10b7e80 TRAP: 3000   Not tainted  
(6.8.0-rc2-next-20240130-auto)
[  981.281131] MSR:  8000d033   CR: 48002402  
XER: 
[  981.281139] IRQMASK: 0
[  981.281139] GPR00: 0034 7fffec649770 7fffaef07f00 

[  981.281139] GPR04:  ff00  
0001
[  981.281139] GPR08: 00014cd61390   

[  981.281139] GPR12:  7fffaefbc140 0ee6b280 
7fffec649a30
[  981.281139] GPR16: 7fffec649bd8 000118b66478  

[  981.281139] GPR20:    

[  981.281139] GPR24: 7fffec64b0b0 000118b663d8 000118b66a58 

[  981.281139] GPR28: 00014cd61250  00014cd61370 
00014cd61140
[  981.281175] NIP [7fffaed4c11c] 0x7fffaed4c11c
[  981.281177] LR [] 0x0
[  981.281179] --- interrupt: 3000

--
Regards,
Tasmiya Nalatwad
IBM Linux Technology Center


[Bisected] [commit dd6c3c544126] [linux-next] [6.8.0-rc2] Task hungs for infinite time call traces

2024-02-05 Thread Tasmiya Nalatwad
[linux-next] [6.8.0-rc2-next-20240130] [FC/XFS] Task hungs for infinite 
time while running bonnie test XFS filesystem


Bisected the issue. Git bisect points to the below commit
commit dd6c3c5441263723305a9c52c5ccc899a4653000
  workqueue: Move pwq_dec_nr_in_flight() to the end of work 
item handling


--- Traces ---

[  981.280811] Call Trace:
[  981.280813] [c001d10b7560] [c6e9b934] 0xc6e9b934 
(unreliable)
[  981.280820] [c001d10b7710] [c001fbac] __switch_to+0x13c/0x220
[  981.280827] [c001d10b7770] [c1002840] __schedule+0x268/0x7c4
[  981.280832] [c001d10b7840] [c1002ddc] schedule+0x40/0x108
[  981.280836] [c001d10b78b0] [c100b748] 
schedule_timeout+0x19c/0x1c0
[  981.280841] [c001d10b7980] [c1004044] 
__wait_for_common+0x148/0x340
[  981.280845] [c001d10b7a10] [c018fa98] 
__flush_workqueue+0x15c/0x530
[  981.280852] [c001d10b7ab0] [c00808f89e9c] 
xfs_inodegc_flush+0x54/0x15c [xfs]
[  981.280913] [c001d10b7b00] [c00808f9f47c] xfs_unmountfs+0x30/0x1e4 
[xfs]
[  981.280969] [c001d10b7b80] [c00808fa825c] 
xfs_fs_put_super+0x5c/0x110 [xfs]
[  981.281023] [c001d10b7bf0] [c05c8774] 
generic_shutdown_super+0xc0/0x16c
[  981.281029] [c001d10b7c60] [c05c8a50] kill_block_super+0x30/0x68
[  981.281034] [c001d10b7c90] [c00808fa5c54] xfs_kill_sb+0x28/0x4c [xfs]
[  981.281088] [c001d10b7cc0] [c05ca9d4] 
deactivate_locked_super+0x70/0x144
[  981.281093] [c001d10b7cf0] [c0605728] cleanup_mnt+0x10c/0x1d8
[  981.281098] [c001d10b7d40] [c019b5e4] task_work_run+0xe0/0x16c
[  981.281102] [c001d10b7d90] [c0022974] 
do_notify_resume+0x134/0x13c
[  981.281107] [c001d10b7dc0] [c0032378] 
interrupt_exit_user_prepare_main+0x1ac/0x264
[  981.281112] [c001d10b7e20] [c0032580] 
syscall_exit_prepare+0x150/0x178
[  981.281116] [c001d10b7e50] [c000d068] 
system_call_vectored_common+0x168/0x2ec
[  981.281122] --- interrupt: 3000 at 0x7fffaed4c11c
[  981.281125] NIP:  7fffaed4c11c LR:  CTR: 
[  981.281128] REGS: c001d10b7e80 TRAP: 3000   Not tainted  
(6.8.0-rc2-next-20240130-auto)
[  981.281131] MSR:  8000d033   CR: 48002402  
XER: 
[  981.281139] IRQMASK: 0
[  981.281139] GPR00: 0034 7fffec649770 7fffaef07f00 

[  981.281139] GPR04:  ff00  
0001
[  981.281139] GPR08: 00014cd61390   

[  981.281139] GPR12:  7fffaefbc140 0ee6b280 
7fffec649a30
[  981.281139] GPR16: 7fffec649bd8 000118b66478  

[  981.281139] GPR20:    

[  981.281139] GPR24: 7fffec64b0b0 000118b663d8 000118b66a58 

[  981.281139] GPR28: 00014cd61250  00014cd61370 
00014cd61140
[  981.281175] NIP [7fffaed4c11c] 0x7fffaed4c11c
[  981.281177] LR [] 0x0
[  981.281179] --- interrupt: 3000
[ 1104.160797] INFO: taskumount:32506  blocked for more than 245 seconds.
[ 1104.160811]   Not tainted 6.8.0-rc2-next-20240130-auto #1
[ 1104.160814] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 1104.160818]task:umount   state:D  stack:0  pid:32506  tgid:32506 
 ppid:32329   flags:0x00042080
[ 1104.160826] Call Trace:
[ 1104.160829] [c001d10b7560] [c6e9b934] 0xc6e9b934 
(unreliable)
[ 1104.160836] [c001d10b7710] [c001fbac] __switch_to+0x13c/0x220
[ 1104.160844] [c001d10b7770] [c1002840] __schedule+0x268/0x7c4
[ 1104.160851] [c001d10b7840] [c1002ddc] schedule+0x40/0x108
[ 1104.160856] [c001d10b78b0] [c100b748] 
schedule_timeout+0x19c/0x1c0
[ 1104.160862] [c001d10b7980] [c1004044] 
__wait_for_common+0x148/0x340
[ 1104.160867] [c001d10b7a10] [c018fa98] 
__flush_workqueue+0x15c/0x530
[ 1104.160873] [c001d10b7ab0] [c00808f89e9c] 
xfs_inodegc_flush+0x54/0x15c [xfs]
[ 1104.160938] [c001d10b7b00] [c00808f9f47c] xfs_unmountfs+0x30/0x1e4 
[xfs]
[ 1104.160993] [c001d10b7b80] [c00808fa825c] 
xfs_fs_put_super+0x5c/0x110 [xfs]
[ 1104.161047] [c001d10b7bf0] [c05c8774] 
generic_shutdown_super+0xc0/0x16c
[ 1104.161053] [c001d10b7c60] [c05c8a50] kill_block_super+0x30/0x68
[ 1104.161059] [c001d10b7c90] [c00808fa5c54] xfs_kill_sb+0x28/0x4c [xfs]
[ 1104.161112] [c001d10b7cc0] [c05ca9d4] 
deactivate_locked_super+0x70/0x144
[ 1104.161118] [c001d10b7cf0] [c0605728] cleanup_mnt+0x10c/0x1d8
[ 1104.161123] [c001d10b7d40] [c019b5e4] task_work_run+0xe0/0x16c
[ 1104.161126] [c001d10b7d90] [c0022974] 
do_notify_resume+0x134/0x13c
[ 

Fwd: [Bisected] [commit dd6c3c544126] [linux-next] [6.8.0-rc2] Task hungs for infinite time call traces

2024-02-05 Thread Tasmiya Nalatwad




 Forwarded Message 
Subject: 	[Bisected] [commit dd6c3c544126] [linux-next] [6.8.0-rc2] Task 
hungs for infinite time call traces

Date:   Mon, 5 Feb 2024 14:10:24 +0530
From:   Tasmiya Nalatwad 
To: 	linux-ker...@vger.kernel.org , 
linuxppc-dev@lists.ozlabs.org , 
linux-bl...@vger.kernel.org , 
linux-n...@vger.kernel.org 
CC: 	t...@kernel.org, jiangshan...@gmail.com, abdha...@linux.vnet.ibm.com 
, sach...@linux.vnet.com 
, mputt...@linux.vnet.com 




Greetings,

[linux-next] [6.8.0-rc2-next-20240130] [FC/XFS] Task hungs for infinite 
time while running bonnie test XFS filesystem


Bisected the issue. Git bisect points to the below commit
commit dd6c3c5441263723305a9c52c5ccc899a4653000
Author: Tejun Heo 

--- Traces ---

[  981.280811] Call Trace:
[  981.280813] [c001d10b7560] [c6e9b934] 0xc6e9b934 
(unreliable)
[  981.280820] [c001d10b7710] [c001fbac] __switch_to+0x13c/0x220
[  981.280827] [c001d10b7770] [c1002840] __schedule+0x268/0x7c4
[  981.280832] [c001d10b7840] [c1002ddc] schedule+0x40/0x108
[  981.280836] [c001d10b78b0] [c100b748] 
schedule_timeout+0x19c/0x1c0
[  981.280841] [c001d10b7980] [c1004044] 
__wait_for_common+0x148/0x340
[  981.280845] [c001d10b7a10] [c018fa98] 
__flush_workqueue+0x15c/0x530
[  981.280852] [c001d10b7ab0] [c00808f89e9c] 
xfs_inodegc_flush+0x54/0x15c [xfs]
[  981.280913] [c001d10b7b00] [c00808f9f47c] xfs_unmountfs+0x30/0x1e4 
[xfs]
[  981.280969] [c001d10b7b80] [c00808fa825c] 
xfs_fs_put_super+0x5c/0x110 [xfs]
[  981.281023] [c001d10b7bf0] [c05c8774] 
generic_shutdown_super+0xc0/0x16c
[  981.281029] [c001d10b7c60] [c05c8a50] kill_block_super+0x30/0x68
[  981.281034] [c001d10b7c90] [c00808fa5c54] xfs_kill_sb+0x28/0x4c [xfs]
[  981.281088] [c001d10b7cc0] [c05ca9d4] 
deactivate_locked_super+0x70/0x144
[  981.281093] [c001d10b7cf0] [c0605728] cleanup_mnt+0x10c/0x1d8
[  981.281098] [c001d10b7d40] [c019b5e4] task_work_run+0xe0/0x16c
[  981.281102] [c001d10b7d90] [c0022974] 
do_notify_resume+0x134/0x13c
[  981.281107] [c001d10b7dc0] [c0032378] 
interrupt_exit_user_prepare_main+0x1ac/0x264
[  981.281112] [c001d10b7e20] [c0032580] 
syscall_exit_prepare+0x150/0x178
[  981.281116] [c001d10b7e50] [c000d068] 
system_call_vectored_common+0x168/0x2ec
[  981.281122] --- interrupt: 3000 at 0x7fffaed4c11c
[  981.281125] NIP:  7fffaed4c11c LR:  CTR: 
[  981.281128] REGS: c001d10b7e80 TRAP: 3000   Not tainted  
(6.8.0-rc2-next-20240130-auto)
[  981.281131] MSR:  8000d033   CR: 48002402  
XER: 
[  981.281139] IRQMASK: 0
[  981.281139] GPR00: 0034 7fffec649770 7fffaef07f00 

[  981.281139] GPR04:  ff00  
0001
[  981.281139] GPR08: 00014cd61390   

[  981.281139] GPR12:  7fffaefbc140 0ee6b280 
7fffec649a30
[  981.281139] GPR16: 7fffec649bd8 000118b66478  

[  981.281139] GPR20:    

[  981.281139] GPR24: 7fffec64b0b0 000118b663d8 000118b66a58 

[  981.281139] GPR28: 00014cd61250  00014cd61370 
00014cd61140
[  981.281175] NIP [7fffaed4c11c] 0x7fffaed4c11c
[  981.281177] LR [] 0x0
[  981.281179] --- interrupt: 3000
[ 1104.160797] INFO: taskumount:32506  blocked for more than 245 seconds.
[ 1104.160811]   Not tainted 6.8.0-rc2-next-20240130-auto #1
[ 1104.160814] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 1104.160818]task:umount   state:D  stack:0  pid:32506  tgid:32506 
 ppid:32329   flags:0x00042080
[ 1104.160826] Call Trace:
[ 1104.160829] [c001d10b7560] [c6e9b934] 0xc6e9b934 
(unreliable)
[ 1104.160836] [c001d10b7710] [c001fbac] __switch_to+0x13c/0x220
[ 1104.160844] [c001d10b7770] [c1002840] __schedule+0x268/0x7c4
[ 1104.160851] [c001d10b7840] [c1002ddc] schedule+0x40/0x108
[ 1104.160856] [c001d10b78b0] [c100b748] 
schedule_timeout+0x19c/0x1c0
[ 1104.160862] [c001d10b7980] [c1004044] 
__wait_for_common+0x148/0x340
[ 1104.160867] [c001d10b7a10] [c018fa98] 
__flush_workqueue+0x15c/0x530
[ 1104.160873] [c001d10b7ab0] [c00808f89e9c] 
xfs_inodegc_flush+0x54/0x15c [xfs]
[ 1104.160938] [c001d10b7b00] [c00808f9f47c] xfs_unmountfs+0x30/0x1e4 
[xfs]
[ 1104.160993] [c001d10b7b80] [c00808fa825c] 
xfs_fs_put_super+0x5c/0x110 [xfs]
[ 1104.161047] [c001d10b7bf0] [c05c8774] 
generic_shutdown_super+0xc0/0x16c
[ 1104.161053] [c001d10b7c60] [c05c8a50] 

[Bisected] [commit dd6c3c544126] [linux-next] [6.8.0-rc2] Task hungs for infinite time call traces

2024-02-05 Thread Tasmiya Nalatwad

Greetings,

[linux-next] [6.8.0-rc2-next-20240130] [FC/XFS] Task hungs for infinite 
time while running bonnie test XFS filesystem


Bisected the issue. Git bisect points to the below commit
commit dd6c3c5441263723305a9c52c5ccc899a4653000
Author: Tejun Heo 

--- Traces ---

[  981.280811] Call Trace:
[  981.280813] [c001d10b7560] [c6e9b934] 0xc6e9b934 
(unreliable)
[  981.280820] [c001d10b7710] [c001fbac] __switch_to+0x13c/0x220
[  981.280827] [c001d10b7770] [c1002840] __schedule+0x268/0x7c4
[  981.280832] [c001d10b7840] [c1002ddc] schedule+0x40/0x108
[  981.280836] [c001d10b78b0] [c100b748] 
schedule_timeout+0x19c/0x1c0
[  981.280841] [c001d10b7980] [c1004044] 
__wait_for_common+0x148/0x340
[  981.280845] [c001d10b7a10] [c018fa98] 
__flush_workqueue+0x15c/0x530
[  981.280852] [c001d10b7ab0] [c00808f89e9c] 
xfs_inodegc_flush+0x54/0x15c [xfs]
[  981.280913] [c001d10b7b00] [c00808f9f47c] xfs_unmountfs+0x30/0x1e4 
[xfs]
[  981.280969] [c001d10b7b80] [c00808fa825c] 
xfs_fs_put_super+0x5c/0x110 [xfs]
[  981.281023] [c001d10b7bf0] [c05c8774] 
generic_shutdown_super+0xc0/0x16c
[  981.281029] [c001d10b7c60] [c05c8a50] kill_block_super+0x30/0x68
[  981.281034] [c001d10b7c90] [c00808fa5c54] xfs_kill_sb+0x28/0x4c [xfs]
[  981.281088] [c001d10b7cc0] [c05ca9d4] 
deactivate_locked_super+0x70/0x144
[  981.281093] [c001d10b7cf0] [c0605728] cleanup_mnt+0x10c/0x1d8
[  981.281098] [c001d10b7d40] [c019b5e4] task_work_run+0xe0/0x16c
[  981.281102] [c001d10b7d90] [c0022974] 
do_notify_resume+0x134/0x13c
[  981.281107] [c001d10b7dc0] [c0032378] 
interrupt_exit_user_prepare_main+0x1ac/0x264
[  981.281112] [c001d10b7e20] [c0032580] 
syscall_exit_prepare+0x150/0x178
[  981.281116] [c001d10b7e50] [c000d068] 
system_call_vectored_common+0x168/0x2ec
[  981.281122] --- interrupt: 3000 at 0x7fffaed4c11c
[  981.281125] NIP:  7fffaed4c11c LR:  CTR: 
[  981.281128] REGS: c001d10b7e80 TRAP: 3000   Not tainted  
(6.8.0-rc2-next-20240130-auto)
[  981.281131] MSR:  8000d033   CR: 48002402  
XER: 
[  981.281139] IRQMASK: 0
[  981.281139] GPR00: 0034 7fffec649770 7fffaef07f00 

[  981.281139] GPR04:  ff00  
0001
[  981.281139] GPR08: 00014cd61390   

[  981.281139] GPR12:  7fffaefbc140 0ee6b280 
7fffec649a30
[  981.281139] GPR16: 7fffec649bd8 000118b66478  

[  981.281139] GPR20:    

[  981.281139] GPR24: 7fffec64b0b0 000118b663d8 000118b66a58 

[  981.281139] GPR28: 00014cd61250  00014cd61370 
00014cd61140
[  981.281175] NIP [7fffaed4c11c] 0x7fffaed4c11c
[  981.281177] LR [] 0x0
[  981.281179] --- interrupt: 3000
[ 1104.160797] INFO: taskumount:32506  blocked for more than 245 seconds.
[ 1104.160811]   Not tainted 6.8.0-rc2-next-20240130-auto #1
[ 1104.160814] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 1104.160818]task:umount   state:D  stack:0  pid:32506  tgid:32506 
 ppid:32329   flags:0x00042080
[ 1104.160826] Call Trace:
[ 1104.160829] [c001d10b7560] [c6e9b934] 0xc6e9b934 
(unreliable)
[ 1104.160836] [c001d10b7710] [c001fbac] __switch_to+0x13c/0x220
[ 1104.160844] [c001d10b7770] [c1002840] __schedule+0x268/0x7c4
[ 1104.160851] [c001d10b7840] [c1002ddc] schedule+0x40/0x108
[ 1104.160856] [c001d10b78b0] [c100b748] 
schedule_timeout+0x19c/0x1c0
[ 1104.160862] [c001d10b7980] [c1004044] 
__wait_for_common+0x148/0x340
[ 1104.160867] [c001d10b7a10] [c018fa98] 
__flush_workqueue+0x15c/0x530
[ 1104.160873] [c001d10b7ab0] [c00808f89e9c] 
xfs_inodegc_flush+0x54/0x15c [xfs]
[ 1104.160938] [c001d10b7b00] [c00808f9f47c] xfs_unmountfs+0x30/0x1e4 
[xfs]
[ 1104.160993] [c001d10b7b80] [c00808fa825c] 
xfs_fs_put_super+0x5c/0x110 [xfs]
[ 1104.161047] [c001d10b7bf0] [c05c8774] 
generic_shutdown_super+0xc0/0x16c
[ 1104.161053] [c001d10b7c60] [c05c8a50] kill_block_super+0x30/0x68
[ 1104.161059] [c001d10b7c90] [c00808fa5c54] xfs_kill_sb+0x28/0x4c [xfs]
[ 1104.161112] [c001d10b7cc0] [c05ca9d4] 
deactivate_locked_super+0x70/0x144
[ 1104.161118] [c001d10b7cf0] [c0605728] cleanup_mnt+0x10c/0x1d8
[ 1104.161123] [c001d10b7d40] [c019b5e4] task_work_run+0xe0/0x16c
[ 1104.161126] [c001d10b7d90] [c0022974] 
do_notify_resume+0x134/0x13c
[ 1104.161131] [c001d10b7dc0] [c0032378] 

Re: [PATCH v2 1/2] powerpc: Add Power11 architected and raw mode

2024-02-05 Thread Aneesh Kumar K . V
Madhavan Srinivasan  writes:

> reg.h is updated with Power11 pvr. pvr_mask value of 0x0F07
> means we are arch v3.1 compliant.
>

If it is called arch v3.1, it will conflict with. 


#define PVR_ARCH_31 0x0f06

>This is used by phyp and
> kvm when booting as a pseries guest to detect and enable
> the appropriate hwcap, facility bits and PMU related fields.
> Copied most of fields from Power10 table entry and added relevant
> Power11 setup/restore and device tree routines.
>
> Signed-off-by: Madhavan Srinivasan 
> ---
> Changelog v1:
> - no change in this patch.
>
>  arch/powerpc/include/asm/cpu_setup.h  |  2 ++
>  arch/powerpc/include/asm/cputable.h   |  3 ++
>  arch/powerpc/include/asm/mce.h|  1 +
>  arch/powerpc/include/asm/mmu.h|  1 +
>  arch/powerpc/include/asm/reg.h|  1 +
>  arch/powerpc/kernel/cpu_setup_power.c | 10 +++
>  arch/powerpc/kernel/cpu_specs_book3s_64.h | 34 +++
>  arch/powerpc/kernel/dt_cpu_ftrs.c | 15 ++
>  arch/powerpc/kernel/mce_power.c   |  5 
>  arch/powerpc/kernel/prom_init.c   | 10 ++-
>  10 files changed, 81 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/cpu_setup.h 
> b/arch/powerpc/include/asm/cpu_setup.h
> index 30e2fe389502..ce800650bb8b 100644
> --- a/arch/powerpc/include/asm/cpu_setup.h
> +++ b/arch/powerpc/include/asm/cpu_setup.h
> @@ -9,10 +9,12 @@ void __setup_cpu_power7(unsigned long offset, struct 
> cpu_spec *spec);
>  void __setup_cpu_power8(unsigned long offset, struct cpu_spec *spec);
>  void __setup_cpu_power9(unsigned long offset, struct cpu_spec *spec);
>  void __setup_cpu_power10(unsigned long offset, struct cpu_spec *spec);
> +void __setup_cpu_power11(unsigned long offset, struct cpu_spec *spec);
>  void __restore_cpu_power7(void);
>  void __restore_cpu_power8(void);
>  void __restore_cpu_power9(void);
>  void __restore_cpu_power10(void);
> +void __restore_cpu_power11(void);
>  
>  void __setup_cpu_e500v1(unsigned long offset, struct cpu_spec *spec);
>  void __setup_cpu_e500v2(unsigned long offset, struct cpu_spec *spec);
> diff --git a/arch/powerpc/include/asm/cputable.h 
> b/arch/powerpc/include/asm/cputable.h
> index 8765d5158324..3bd6e6e0224c 100644
> --- a/arch/powerpc/include/asm/cputable.h
> +++ b/arch/powerpc/include/asm/cputable.h
> @@ -454,6 +454,9 @@ static inline void cpu_feature_keys_init(void) { }
>   CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31 | \
>   CPU_FTR_DAWR | CPU_FTR_DAWR1 | \
>   CPU_FTR_DEXCR_NPHIE)
> +
> +#define CPU_FTRS_POWER11 CPU_FTRS_POWER10
>

One of the problem with that is we have code that does the below in kvm.

if (cpu_has_feature(CPU_FTR_ARCH_31))
host_pcr_bit = PCR_ARCH_31;


How should we handle that?

> +
>  #define CPU_FTRS_CELL(CPU_FTR_LWSYNC | \
>   CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
>   CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
> diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
> index c9f0936bd3c9..241eee743fc5 100644
> --- a/arch/powerpc/include/asm/mce.h
> +++ b/arch/powerpc/include/asm/mce.h
> @@ -257,6 +257,7 @@ long __machine_check_early_realmode_p7(struct pt_regs 
> *regs);
>  long __machine_check_early_realmode_p8(struct pt_regs *regs);
>  long __machine_check_early_realmode_p9(struct pt_regs *regs);
>  long __machine_check_early_realmode_p10(struct pt_regs *regs);
> +long __machine_check_early_realmode_p11(struct pt_regs *regs);
>  #endif /* CONFIG_PPC_BOOK3S_64 */
>  
>  #ifdef CONFIG_PPC_BOOK3S_64
> diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
> index d8b7e246a32f..61ebe5eff2c9 100644
> --- a/arch/powerpc/include/asm/mmu.h
> +++ b/arch/powerpc/include/asm/mmu.h
> @@ -133,6 +133,7 @@
>  #define MMU_FTRS_POWER8  MMU_FTRS_POWER6
>  #define MMU_FTRS_POWER9  MMU_FTRS_POWER6
>  #define MMU_FTRS_POWER10 MMU_FTRS_POWER6
> +#define MMU_FTRS_POWER11 MMU_FTRS_POWER6
>  #define MMU_FTRS_CELLMMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \
>   MMU_FTR_CI_LARGE_PAGE
>  #define MMU_FTRS_PA6TMMU_FTRS_DEFAULT_HPTE_ARCH_V2 | \
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index 7fd09f25452d..7a7aa24bf57a 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -1364,6 +1364,7 @@
>  #define PVR_HX_C2000 0x0066
>  #define PVR_POWER9   0x004E
>  #define PVR_POWER10  0x0080
> +#define PVR_POWER11  0x0082
>  #define PVR_BE   0x0070
>  #define PVR_PA6T 0x0090
>  
> diff --git a/arch/powerpc/kernel/cpu_setup_power.c 
> b/arch/powerpc/kernel/cpu_setup_power.c
> index 98bd4e6c1770..8c24fc67d90f 100644
> --- a/arch/powerpc/kernel/cpu_setup_power.c
> +++ b/arch/powerpc/kernel/cpu_setup_power.c
> @@ -286,3 +286,13 @@ void __restore_cpu_power10(void)
>   init_HFSCR();
>   

Re: [PATCH v5 0/6] DCP as trusted keys backend

2024-02-05 Thread David Gstir
Hi,

> On 15.12.2023, at 12:06, David Gstir  wrote:
> 
> This is a revival of the previous patch set submitted by Richard Weinberger:
> https://lore.kernel.org/linux-integrity/20210614201620.30451-1-rich...@nod.at/
> 
> v4 is here:
> https://lore.kernel.org/keyrings/20231024162024.51260-1-da...@sigma-star.at/
> 
> v4 -> v5:
> - Make Kconfig for trust source check scalable as suggested by Jarkko Sakkinen
> - Add Acked-By from Herbert Xu to patch #1 - thanks!
> v3 -> v4:
> - Split changes on MAINTAINERS and documentation into dedicated patches
> - Use more concise wording in commit messages as suggested by Jarkko Sakkinen
> v2 -> v3:
> - Addressed review comments from Jarkko Sakkinen
> v1 -> v2:
> - Revive and rebase to latest version
> - Include review comments from Ahmad Fatoum
> 
> The Data CoProcessor (DCP) is an IP core built into many NXP SoCs such
> as i.mx6ull.
> 
> Similar to the CAAM engine used in more powerful SoCs, DCP can AES-
> encrypt/decrypt user data using a unique, never-disclosed,
> device-specific key. Unlike CAAM though, it cannot directly wrap and
> unwrap blobs in hardware. As DCP offers only the bare minimum feature
> set and a blob mechanism needs aid from software. A blob in this case
> is a piece of sensitive data (e.g. a key) that is encrypted and
> authenticated using the device-specific key so that unwrapping can only
> be done on the hardware where the blob was wrapped.
> 
> This patch series adds a DCP based, trusted-key backend and is similar
> in spirit to the one by Ahmad Fatoum [0] that does the same for CAAM.
> It is of interest for similar use cases as the CAAM patch set, but for
> lower end devices, where CAAM is not available.
> 
> Because constructing and parsing the blob has to happen in software,
> we needed to decide on a blob format and chose the following:
> 
> struct dcp_blob_fmt {
> __u8 fmt_version;
> __u8 blob_key[AES_KEYSIZE_128];
> __u8 nonce[AES_KEYSIZE_128];
> __le32 payload_len;
> __u8 payload[];
> } __packed;
> 
> The `fmt_version` is currently 1.
> 
> The encrypted key is stored in the payload area. It is AES-128-GCM
> encrypted using `blob_key` and `nonce`, GCM auth tag is attached at
> the end of the payload (`payload_len` does not include the size of
> the auth tag).
> 
> The `blob_key` itself is encrypted in AES-128-ECB mode by DCP using
> the OTP or UNIQUE device key. A new `blob_key` and `nonce` are generated
> randomly, when sealing/exporting the DCP blob.
> 
> This patchset was tested with dm-crypt on an i.MX6ULL board.
> 
> [0] 
> https://lore.kernel.org/keyrings/20220513145705.2080323-1-a.fat...@pengutronix.de/
> 
> David Gstir (6):
>  crypto: mxs-dcp: Add support for hardware-bound keys
>  KEYS: trusted: improve scalability of trust source config
>  KEYS: trusted: Introduce NXP DCP-backed trusted keys
>  MAINTAINERS: add entry for DCP-based trusted keys
>  docs: document DCP-backed trusted keys kernel params
>  docs: trusted-encrypted: add DCP as new trust source
> 
> .../admin-guide/kernel-parameters.txt |  13 +
> .../security/keys/trusted-encrypted.rst   |  85 +
> MAINTAINERS   |   9 +
> drivers/crypto/mxs-dcp.c  | 104 +-
> include/keys/trusted_dcp.h|  11 +
> include/soc/fsl/dcp.h |  17 +
> security/keys/trusted-keys/Kconfig|  18 +-
> security/keys/trusted-keys/Makefile   |   2 +
> security/keys/trusted-keys/trusted_core.c |   6 +-
> security/keys/trusted-keys/trusted_dcp.c  | 311 ++
> 10 files changed, 562 insertions(+), 14 deletions(-)
> create mode 100644 include/keys/trusted_dcp.h
> create mode 100644 include/soc/fsl/dcp.h
> create mode 100644 security/keys/trusted-keys/trusted_dcp.c

Jarkko, Mimi, David do you need anything from my side for these patches to get 
them merged?

Thanks,
- David