Re: PPC476 hangs during tlb flush after calling /init in crash kernel with linux 5.4+

2021-04-27 Thread Christophe Leroy




Le 28/04/2021 à 00:42, Eddie James a écrit :

On Tue, 2021-04-27 at 19:26 +0200, Christophe Leroy wrote:

Hi Eddies,

Le 27/04/2021 à 19:03, Eddie James a écrit :

Hi all,

I'm having a problem in simulation and hardware where my PPC476
processor stops executing instructions after callling /init. In my
case
this is a bash script. The code descends to flush the TLB, and
somewhere in the loop in _tlbil_pid, the PC goes to
InstructionTLBError47x but does not go any further. This only
occurs in
the crash kernel environment, which is using the same kernel,
initramfs, and init script as the main kernel, which executed fine.
I
do not see this problem with linux 4.19 or 3.10. I do see it with
5.4
and 5.10. I see a fair amount of refactoring in the PPC memory
management area between 4.19 and 5.4. Can anyone point me in a
direction to debug this further? My stack trace is below as I can
run
gdb in simulation.


Can you bisect to pin point the culprit commit ?


Hi, thanks for your prompt reply.

Good idea! I have bisected to:

commit 9e849f231c3c72d4c3c1b07c9cd19ae789da0420 (b8-bad,
refs/bisect/bad)
Author: Christophe Leroy 
Date:   Thu Feb 21 19:08:40 2019 +

 powerpc/mm/32s: use generic mmu_mapin_ram() for all blocks.
 
 Now that mmu_mapin_ram() is able to handle other blocks

 than the one starting at 0, the WII can use it for all
 its blocks.
 
 Signed-off-by: Christophe Leroy 

 Signed-off-by: Michael Ellerman 

I also confirmed that reverting this commit resolves the issue in 5.4+.

Now, I don't understand why this is problematic or what is really
happening... Reverting is probably not the desired solution.



Can you provide the 'dmesg' or a dump of the logs printed by the kernel at 
boottime ?

The difference with this commit is that if there are several memblocks, all get mapped. Maybe your 
target doesn't like it.


You are talking about simulation, are you using QEMU ? If yes can you provide details so that I can 
try and reproduce the issue ?


Thanks
Christophe


Re: [PATCH] cpuidle/pseries: Fixup CEDE0 latency only for POWER10 onwards

2021-04-27 Thread Gautham R Shenoy
Hello Michal,

On Sun, Apr 25, 2021 at 01:07:14PM +0200, Michal Suchánek wrote:
> On Sat, Apr 24, 2021 at 01:07:16PM +0530, Vaidyanathan Srinivasan wrote:
> > * Michal Such?nek  [2021-04-23 20:42:16]:
> > 
> > > On Fri, Apr 23, 2021 at 11:59:30PM +0530, Vaidyanathan Srinivasan wrote:
> > > > * Michal Such?nek  [2021-04-23 19:45:05]:
> > > > 
> > > > > On Fri, Apr 23, 2021 at 09:29:39PM +0530, Vaidyanathan Srinivasan 
> > > > > wrote:
> > > > > > * Michal Such?nek  [2021-04-23 09:35:51]:
> > > > > > 
> > > > > > > On Thu, Apr 22, 2021 at 08:37:29PM +0530, Gautham R. Shenoy wrote:
> > > > > > > > From: "Gautham R. Shenoy" 
> > > > > > > > 
> > > > > > > > Commit d947fb4c965c ("cpuidle: pseries: Fixup exit latency for
> > > > > > > > CEDE(0)") sets the exit latency of CEDE(0) based on the latency 
> > > > > > > > values
> > > > > > > > of the Extended CEDE states advertised by the platform
> > > > > > > > 
> > > > > > > > On some of the POWER9 LPARs, the older firmwares advertise a 
> > > > > > > > very low
> > > > > > > > value of 2us for CEDE1 exit latency on a Dedicated LPAR. 
> > > > > > > > However the
> > > > > > > Can you be more specific about 'older firmwares'?
> > > > > > 
> > > > > > Hi Michal,
> > > > > > 
> > > > > > This is POWER9 vs POWER10 difference, not really an obsolete FW.  
> > > > > > The
> > > > > > key idea behind the original patch was to make the H_CEDE latency 
> > > > > > and
> > > > > > hence target residency come from firmware instead of being decided 
> > > > > > by
> > > > > > the kernel.  The advantage is such that, different type of systems 
> > > > > > in
> > > > > > POWER10 generation can adjust this value and have an optimal H_CEDE
> > > > > > entry criteria which balances good single thread performance and
> > > > > > wakeup latency.  Further we can have additional H_CEDE state to feed
> > > > > > into the cpuidle.  
> > > > > 
> > > > > So all POWER9 machines are affected by the firmware bug where firmware
> > > > > reports CEDE1 exit latency of 2us and the real latency is 5us which
> > > > > causes the kernel to prefer CEDE1 too much when relying on the values
> > > > > supplied by the firmware. It is not about 'older firmware'.
> > > > 
> > > > Correct.  All POWER9 systems running Linux as guest LPARs will see
> > > > extra usage of CEDE idle state, but not baremetal (PowerNV).
> > > > 
> > > > The correct definition of the bug or miss-match in expectation is that
> > > > firmware reports wakeup latency from a core/thread wakeup timing, but
> > > > not end-to-end time from sending a wakeup event like an IPI using
> > > > H_calls and receiving the events on the target.  Practically there are
> > > > few extra micro-seconds needed after deciding to wakeup a target
> > > > core/thread to getting the target to start executing instructions
> > > > within the LPAR instance.
> > > 
> > > Thanks for the detailed explanation.
> > > 
> > > Maybe just adding a few microseconds to the reported time would be a
> > > more reasonable workaround than using a blanket fixed value then.
> > 
> > Yes, that is an option.  But that may only reduce the difference
> > between existing kernel and new kernel unless we make it the same
> > number.  Further we are fixing this in P10 and hence we will have to
> > add "if(P9) do the compensation" and otherwise take it as is.  That
> > would not be elegant.  Given that our goal for P9 platform is to not
> > introduce changes in H_CEDE entry behaviour, we arrived at this
> > approach (this small patch) and this also makes it easy to backport to
> > various distro products.
> 
> I don't see how this is more elegent.
> 
> The current patch is
> 
> if(p9)
>   use fixed value
> 
> the suggested patch is
> 
> if(p9)
>   apply compensation


We could do that, however, from the recent measurements the default
value is closer to the latency value measured using an IPI.

As Vaidy described earlier, on POWER9 and prior platforms, the wakeup
latency advertized by the PHYP hypervisor corresponds to the latency
required to wakeup from the underlying hardware idle state (Nap in
POWER8 and stop0/1/2 on POWER9) into the hypervisor. That's 2us on
POWER9.

We need to apply two kinds of compensation,

1. Compensation for the time taken to transition the CPU from the
   Hypervisor into the LPAR post wakeup from platform idle state

2. Compensation for the time taken to send the IPI from the source CPU
   (waker) to the idle target CPU (wakee).

1. can be measured via timer idle test (I am using Pratik's
cpuidle self-test posted here
https://lore.kernel.org/lkml/20210412074309.38484-1-psam...@linux.ibm.com/)

We queue a timer, say for 1ms, and enter the CEDE state. When the
timer fires, in the timer handler we compute how much extra timer over
the expected 1ms have we consumed. This is what it looks like on
POWER9 LPAR

CEDE latency measured using a timer (numbers in ns)
===
N   Min  Median 

[RFC PATCH] usb: gadget: fsl_qe_udc: fix implicit-fallthrough warnings

2021-04-27 Thread Randy Dunlap
Quieten implicit-fallthrough warnings in fsl_qe_udc.c:

../drivers/usb/gadget/udc/fsl_qe_udc.c: In function 'qe_ep_init':
../drivers/usb/gadget/udc/fsl_qe_udc.c:542:37: warning: this statement may fall 
through [-Wimplicit-fallthrough=]
  542 |if ((max == 128) || (max == 256) || (max == 512))
../drivers/usb/gadget/udc/fsl_qe_udc.c:563:8: warning: this statement may fall 
through [-Wimplicit-fallthrough=]
  563 | if (max <= 1024)
../drivers/usb/gadget/udc/fsl_qe_udc.c:566:8: warning: this statement may fall 
through [-Wimplicit-fallthrough=]
  566 | if (max <= 64)
../drivers/usb/gadget/udc/fsl_qe_udc.c:580:8: warning: this statement may fall 
through [-Wimplicit-fallthrough=]
  580 | if (max <= 1024)
../drivers/usb/gadget/udc/fsl_qe_udc.c:596:5: warning: this statement may fall 
through [-Wimplicit-fallthrough=]
  596 | switch (max) {

This basically just documents what is currently being done.
If any of them need to do something else, just say so or
even make the change.

Signed-off-by: Randy Dunlap 
Cc: Li Yang 
Cc: linuxppc-dev@lists.ozlabs.org
---
 drivers/usb/gadget/udc/fsl_qe_udc.c |5 +
 1 file changed, 5 insertions(+)

--- linux-next-20210427.orig/drivers/usb/gadget/udc/fsl_qe_udc.c
+++ linux-next-20210427/drivers/usb/gadget/udc/fsl_qe_udc.c
@@ -541,6 +541,7 @@ static int qe_ep_init(struct qe_udc *udc
case USB_SPEED_HIGH:
if ((max == 128) || (max == 256) || (max == 512))
break;
+   fallthrough;
default:
switch (max) {
case 4:
@@ -562,9 +563,11 @@ static int qe_ep_init(struct qe_udc *udc
case USB_SPEED_HIGH:
if (max <= 1024)
break;
+   fallthrough;
case USB_SPEED_FULL:
if (max <= 64)
break;
+   fallthrough;
default:
if (max <= 8)
break;
@@ -579,6 +582,7 @@ static int qe_ep_init(struct qe_udc *udc
case USB_SPEED_HIGH:
if (max <= 1024)
break;
+   fallthrough;
case USB_SPEED_FULL:
if (max <= 1023)
break;
@@ -605,6 +609,7 @@ static int qe_ep_init(struct qe_udc *udc
default:
goto en_done;
}
+   fallthrough;
case USB_SPEED_LOW:
switch (max) {
case 1:


[powerpc:next-test] BUILD SUCCESS adb68c38d8d49a3d60805479c558649bb2182473

2021-04-27 Thread kernel test robot
alldefconfig
m68k amcore_defconfig
sh  sh7785lcr_32bit_defconfig
ia64 allmodconfig
ia64defconfig
ia64 allyesconfig
m68k allmodconfig
m68kdefconfig
m68k allyesconfig
nios2   defconfig
nds32 allnoconfig
nds32   defconfig
nios2allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
sh   allmodconfig
parisc  defconfig
s390 allyesconfig
s390 allmodconfig
s390defconfig
sparcallyesconfig
sparc   defconfig
i386defconfig
mips allyesconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc   allnoconfig
i386 randconfig-a005-20210426
i386 randconfig-a002-20210426
i386 randconfig-a001-20210426
i386 randconfig-a006-20210426
i386 randconfig-a004-20210426
i386 randconfig-a003-20210426
x86_64   randconfig-a002-20210427
x86_64   randconfig-a004-20210427
x86_64   randconfig-a001-20210427
x86_64   randconfig-a006-20210427
x86_64   randconfig-a005-20210427
x86_64   randconfig-a003-20210427
x86_64   randconfig-a015-20210426
x86_64   randconfig-a016-20210426
x86_64   randconfig-a011-20210426
x86_64   randconfig-a014-20210426
x86_64   randconfig-a012-20210426
x86_64   randconfig-a013-20210426
i386 randconfig-a014-20210426
i386 randconfig-a012-20210426
i386 randconfig-a011-20210426
i386 randconfig-a013-20210426
i386 randconfig-a015-20210426
i386 randconfig-a016-20210426
riscvnommu_k210_defconfig
riscvnommu_virt_defconfig
riscv allnoconfig
riscv   defconfig
riscv  rv32_defconfig
um   allmodconfig
umallnoconfig
um   allyesconfig
um  defconfig
x86_64rhel-8.3-kselftests
x86_64  defconfig
x86_64   rhel-8.3
x86_64  rhel-8.3-kbuiltin
x86_64  kexec

clang tested configs:
x86_64   randconfig-a002-20210426
x86_64   randconfig-a004-20210426
x86_64   randconfig-a001-20210426
x86_64   randconfig-a006-20210426
x86_64   randconfig-a005-20210426
x86_64   randconfig-a003-20210426

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[powerpc:next] BUILD SUCCESS ee1bc694fbaec1a662770703fc34a74abf418938

2021-04-27 Thread kernel test robot
  rsk7203_defconfig
microblaze  mmu_defconfig
riscvalldefconfig
m68k amcore_defconfig
sh  sh7785lcr_32bit_defconfig
ia64 allmodconfig
ia64 allyesconfig
m68k allmodconfig
m68kdefconfig
m68k allyesconfig
nios2   defconfig
nds32 allnoconfig
nds32   defconfig
nios2allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
sh   allmodconfig
parisc  defconfig
s390 allyesconfig
s390 allmodconfig
s390defconfig
sparcallyesconfig
sparc   defconfig
i386defconfig
mips allyesconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc   allnoconfig
i386 randconfig-a005-20210426
i386 randconfig-a002-20210426
i386 randconfig-a001-20210426
i386 randconfig-a006-20210426
i386 randconfig-a004-20210426
i386 randconfig-a003-20210426
x86_64   randconfig-a002-20210427
x86_64   randconfig-a004-20210427
x86_64   randconfig-a001-20210427
x86_64   randconfig-a006-20210427
x86_64   randconfig-a005-20210427
x86_64   randconfig-a003-20210427
x86_64   randconfig-a015-20210426
x86_64   randconfig-a016-20210426
x86_64   randconfig-a011-20210426
x86_64   randconfig-a014-20210426
x86_64   randconfig-a012-20210426
x86_64   randconfig-a013-20210426
i386 randconfig-a014-20210426
i386 randconfig-a012-20210426
i386 randconfig-a011-20210426
i386 randconfig-a013-20210426
i386 randconfig-a015-20210426
i386 randconfig-a016-20210426
riscvnommu_k210_defconfig
riscvnommu_virt_defconfig
riscv allnoconfig
riscv   defconfig
riscv  rv32_defconfig
um   allmodconfig
umallnoconfig
um   allyesconfig
um  defconfig
x86_64rhel-8.3-kselftests
x86_64  defconfig
x86_64   rhel-8.3
x86_64  rhel-8.3-kbuiltin
x86_64  kexec

clang tested configs:
x86_64   randconfig-a002-20210426
x86_64   randconfig-a004-20210426
x86_64   randconfig-a001-20210426
x86_64   randconfig-a006-20210426
x86_64   randconfig-a005-20210426
x86_64   randconfig-a003-20210426

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[powerpc:merge] BUILD SUCCESS 1b628dbc3ccb0b51963fcb3c60b57daac21dc016

2021-04-27 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
merge
branch HEAD: 1b628dbc3ccb0b51963fcb3c60b57daac21dc016  Automatic merge of 
'master' into merge (2021-04-27 23:47)

elapsed time: 814m

configs tested: 131
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm defconfig
arm64allyesconfig
arm64   defconfig
arm  allyesconfig
arm  allmodconfig
x86_64   allyesconfig
riscvallmodconfig
i386 allyesconfig
riscvallyesconfig
mipsmaltaup_defconfig
mips   rs90_defconfig
powerpc mpc83xx_defconfig
microblaze  defconfig
openrisc simple_smp_defconfig
mipsbcm47xx_defconfig
h8300alldefconfig
sh kfr2r09-romimage_defconfig
armspear6xx_defconfig
powerpc  g5_defconfig
m68k  sun3x_defconfig
m68km5272c3_defconfig
arm  colibri_pxa300_defconfig
xtensaxip_kc705_defconfig
sh apsh4a3a_defconfig
arm   stm32_defconfig
s390 alldefconfig
mips  maltaaprp_defconfig
powerpcamigaone_defconfig
sh   sh7724_generic_defconfig
ia64  tiger_defconfig
mipsjmr3927_defconfig
armxcep_defconfig
powerpc  mpc885_ads_defconfig
powerpc  chrp32_defconfig
powerpc powernv_defconfig
sh microdev_defconfig
armmmp2_defconfig
mips   ip22_defconfig
arm  moxart_defconfig
arm  pcm027_defconfig
arm   tegra_defconfig
xtensa   common_defconfig
sh   se7721_defconfig
powerpc   bluestone_defconfig
arm   mainstone_defconfig
arm   milbeaut_m10v_defconfig
powerpc tqm8548_defconfig
sh   alldefconfig
m68k   m5475evb_defconfig
mips  fuloong2e_defconfig
sh  sdk7780_defconfig
armkeystone_defconfig
h8300 edosk2674_defconfig
xtensa virt_defconfig
sh  rsk7203_defconfig
microblaze  mmu_defconfig
ia64 allmodconfig
ia64defconfig
ia64 allyesconfig
m68k allmodconfig
m68kdefconfig
m68k allyesconfig
nios2   defconfig
arc  allyesconfig
nds32 allnoconfig
nds32   defconfig
nios2allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
arc defconfig
sh   allmodconfig
parisc  defconfig
s390 allyesconfig
s390 allmodconfig
parisc   allyesconfig
s390defconfig
sparcallyesconfig
sparc   defconfig
i386defconfig
mips allyesconfig
mips allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc   allnoconfig
i386 randconfig-a005-20210426
i386 randconfig-a002-20210426
i386 randconfig-a001-20210426
i386 randconfig-a006-20210426
i386 randconfig-a004-20210426
i386 randconfig-a003-20210426
i386 randconfig-a005-20210427
i386 randconfig-a002-20210427
i386 randconfig-a001-20210427
i386 randconfig-a006-20210427
i386 randconfig-a004-20210427
i386 randconfig-a003-20210427
x86_64   randconfig-a015-20210

Re: PPC32: Boot regression on Nintendo Wii, after create_branch rework in 5.8

2021-04-27 Thread Jordan Niethe
On Mon, Apr 26, 2021 at 1:40 AM Jonathan Neuschäfer
 wrote:
>
> Hi,
>
> I recently booted my Wii again, and I noticed a regression at boot time.
> Output stops after the "Finalizing device tree... flat tree at 0xXX"
> message. I bisected it to this commit in the 5.8 development cycle:
>
> commit 7c95d8893fb55869882c9f68f4c94840dc43f18f
> Author: Jordan Niethe 
> Date:   Wed May 6 13:40:25 2020 +1000
>
> powerpc: Change calling convention for create_branch() et. al.
>
> create_branch(), create_cond_branch() and translate_branch() return the
> instruction that they create, or return 0 to signal an error. Separate
> these concerns in preparation for an instruction type that is not just
> an unsigned int.  Fill the created instruction to a pointer passed as
> the first parameter to the function and use a non-zero return value to
> signify an error.
>
> Signed-off-by: Jordan Niethe 
> Signed-off-by: Michael Ellerman 
> Reviewed-by: Alistair Popple 
> Link: https://lore.kernel.org/r/20200506034050.24806-6-jniet...@gmail.com
>
>  arch/powerpc/include/asm/code-patching.h |  12 +--
>  arch/powerpc/kernel/optprobes.c  |  24 +++---
>  arch/powerpc/kernel/setup_32.c   |   4 +-
>  arch/powerpc/kernel/trace/ftrace.c   |  24 +++---
>  arch/powerpc/lib/code-patching.c | 134 
> ++-
>  arch/powerpc/lib/feature-fixups.c|   5 +-
>  6 files changed, 119 insertions(+), 84 deletions(-)
>
>
> Do you have any hints on how to debug and/or fix this issue?
Thanks for bisecting and reporting.
The "Finalizing device tree... flat tree at 0xXX" message comes
from the bootwrapper so if that is the last output it must be crashing
pretty early.
Commit 7c95d8893fb5 ("powerpc: Change calling convention for
create_branch() et. al.") made a change to machine_init() in
setup_32.c which seems like it might be a likely culprit for causing
early crashing.
The branch that is created and patched is just for optimization, so to
see if that is in fact the problem it might be worth trying to boot
with a patch like below

diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -87,9 +87,6 @@ notrace void __init machine_init(u64 dt_ptr)

patch_instruction_site(&patch__memcpy_nocache, ppc_inst(PPC_INST_NOP));

-   create_cond_branch(&insn, addr, branch_target(addr), 0x82);
-   patch_instruction(addr, insn);  /* replace b by bne cr0 */
-
/* Do some early initialization based on the flat device tree */
early_init_devtree(__va(dt_ptr));

>
>
> Best regards,
> Jonathan Neuschäfer


Re: powerpc{32,64} randconfigs

2021-04-27 Thread Randy Dunlap
On 4/20/21 11:45 PM, Randy Dunlap wrote:
> On 4/20/21 11:37 PM, Christophe Leroy wrote:
>>
>>
>> Le 21/04/2021 à 01:31, Randy Dunlap a écrit :
>>> --- linux-next-20210420.orig/arch/powerpc/kernel/vdso64/Makefile
>>> +++ linux-next-20210420/arch/powerpc/kernel/vdso64/Makefile
>>> @@ -30,7 +30,7 @@ ccflags-y := -shared -fno-common -fno-bu
>>>   asflags-y := -D__VDSO64__ -s
>>>     targets += vdso64.lds
>>> -CPPFLAGS_vdso64.lds += -P -C -U$(ARCH)
>>> +CPPFLAGS_vdso64.lds += -P -C -U$(SRCARCH)
>>
>> Maybe it would be better to do -Upowerpc like in VDSO32
>>
> 
> OK, thanks for getting that done. :)
> 
>>>     # link rule for the .so file, .lds has to be first
>>>   $(obj)/vdso64.so.dbg: $(src)/vdso64.lds $(obj-vdso64) 
>>> $(obj)/vgettimeofday.o FORCE
>>> --- linux-next-20210420.orig/arch/powerpc/Makefile

Just to wrap this up, I got this method working also.
Thanks for all of your help, Christophe.

-- 
~Randy



Re: [PATCH v7] powerpc/irq: Inline call_do_irq() and call_do_softirq()

2021-04-27 Thread Nick Desaulniers
On Mon, Apr 26, 2021 at 11:39 PM Christophe Leroy
 wrote:
>
>
>
> Le 26/04/2021 à 20:50, Nathan Chancellor a écrit :
> > On Sat, Mar 20, 2021 at 11:22:27PM +1100, Michael Ellerman wrote:
> >> From: Christophe Leroy 
> >>
> >> call_do_irq() and call_do_softirq() are simple enough to be
> >> worth inlining.
> >>
> >> Inlining them avoids an mflr/mtlr pair plus a save/reload on stack. It
> >> also allows GCC to keep the saved ksp_limit in an nonvolatile reg.
> >>
> >> This is inspired from S390 arch. Several other arches do more or
> >> less the same. The way sparc arch does seems odd thought.
> >>
> >> Signed-off-by: Christophe Leroy 
> >> Signed-off-by: Michael Ellerman 
> >>
> >
> > This change caused our ppc44x_defconfig builds to hang when powering
> > down in QEMU:
> >
> > https://github.com/ClangBuiltLinux/continuous-integration2/runs/2304364629?check_suite_focus=true#logs
> >
> > This is probably something with clang given that GCC 10.3.0 works fine
> > but due to the nature of the change, I have no idea how to tell what is
> > going wrong. I tried to do some rudimentary debugging with gdb but that
> > did not really get me anywhere.
> >
> > The kernel was built with just 'CC=clang' and it is reproducible with
> > all versions of clang that the kernel supports.
> >
> > The QEMU invocation is visible at the link above, it is done with our
> > boot-qemu.sh in this repo, which also houses the rootfs:
> >
> > https://github.com/ClangBuiltLinux/boot-utils
> >
> > Happy to provide any other information or debug/test as directed!
> >
>
> With GCC:
>
> 03f0 :
>   3f0:  94 21 ff f0 stwur1,-16(r1)
>   3f4:  7c 08 02 a6 mflrr0
>   3f8:  3d 20 00 00 lis r9,0
> 3fa: R_PPC_ADDR16_HA.data..read_mostly+0x4
>   3fc:  93 e1 00 0c stw r31,12(r1)
>   400:  90 01 00 14 stw r0,20(r1)
>   404:  83 e9 00 00 lwz r31,0(r9)
> 406: R_PPC_ADDR16_LO.data..read_mostly+0x4
>   408:  94 3f 1f f0 stwur1,8176(r31)
>   40c:  7f e1 fb 78 mr  r1,r31
>   410:  48 00 00 01 bl  410 
> 410: R_PPC_REL24__do_softirq
>   414:  80 21 00 00 lwz r1,0(r1)
>   418:  80 01 00 14 lwz r0,20(r1)
>   41c:  83 e1 00 0c lwz r31,12(r1)
>   420:  38 21 00 10 addir1,r1,16
>   424:  7c 08 03 a6 mtlrr0
>   428:  4e 80 00 20 blr
>
>
> With CLANG:
>
> 03e8 :
>   3e8:  94 21 ff f0 stwur1,-16(r1)
>   3ec:  93 c1 00 08 stw r30,8(r1)
>   3f0:  3c 60 00 00 lis r3,0
> 3f2: R_PPC_ADDR16_HAsoftirq_ctx
>   3f4:  83 c3 00 00 lwz r30,0(r3)
> 3f6: R_PPC_ADDR16_LOsoftirq_ctx
>   3f8:  94 3e 1f f0 stwur1,8176(r30)
>   3fc:  7f c1 f3 78 mr  r1,r30
>   400:  48 00 00 01 bl  400 
> 400: R_PPC_REL24__do_softirq
>   404:  80 21 00 00 lwz r1,0(r1)
>   408:  83 c1 00 08 lwz r30,8(r1)
>   40c:  38 21 00 10 addir1,r1,16
>   410:  4e 80 00 20 blr
>
>
> As you can see, CLANG doesn't save/restore 'lr' allthought 'lr' is 
> explicitely listed in the
> registers clobbered by the inline assembly:

Ah, thanks for debugging this. Will follow up in
https://bugs.llvm.org/show_bug.cgi?id=50147.

>
>  >> +static __always_inline void call_do_softirq(const void *sp)
>  >> +{
>  >> +   /* Temporarily switch r1 to sp, call __do_softirq() then restore r1. 
> */
>  >> +   asm volatile (
>  >> +PPC_STLU " %%r1, %[offset](%[sp])  ;"
>  >> +   "mr %%r1, %[sp] ;"
>  >> +   "bl %[callee]   ;"
>  >> +PPC_LL "   %%r1, 0(%%r1)   ;"
>  >> +: // Outputs
>  >> +: // Inputs
>  >> +  [sp] "b" (sp), [offset] "i" (THREAD_SIZE - 
> STACK_FRAME_OVERHEAD),
>  >> +  [callee] "i" (__do_softirq)
>  >> +: // Clobbers
>  >> +  "lr", "xer", "ctr", "memory", "cr0", "cr1", "cr5", "cr6",
>  >> +  "cr7", "r0", "r3", "r4", "r5", "r6", "r7", "r8", "r9", 
> "r10",
>  >> +  "r11", "r12"
>  >> +   );
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to clang-built-linux+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/clang-built-linux/de6fc09f-97f5-c934-6393-998ec766b48a%40csgroup.eu.



-- 
Thanks,
~Nick Desaulniers


Re: [PATCH v6] powerpc/kexec_file: use current CPU info while setting up FDT

2021-04-27 Thread Tyrel Datwyler
On 4/26/21 9:51 PM, Sourabh Jain wrote:
> kexec_file_load uses initial_boot_params in setting up the device-tree
> for the kernel to be loaded. Though initial_boot_params holds info
> about CPUs at the time of boot, it doesn't account for hot added CPUs.
> 
> So, kexec'ing with kexec_file_load syscall would leave the kexec'ed
> kernel with inaccurate CPU info. Also, if kdump kernel is loaded with
> kexec_file_load syscall and the system crashes on a hot added CPU,
> capture kernel hangs failing to identify the boot CPU.
> 
>  Kernel panic - not syncing: sysrq triggered crash
>  CPU: 24 PID: 6065 Comm: echo Kdump: loaded Not tainted 5.12.0-rc5upstream #54
>  Call Trace:
>  [c000e590fac0] [c07b2400] dump_stack+0xc4/0x114 (unreliable)
>  [c000e590fb00] [c0145290] panic+0x16c/0x41c
>  [c000e590fba0] [c08892e0] sysrq_handle_crash+0x30/0x40
>  [c000e590fc00] [c0889cdc] __handle_sysrq+0xcc/0x1f0
>  [c000e590fca0] [c088a538] write_sysrq_trigger+0xd8/0x178
>  [c000e590fce0] [c05e9b7c] proc_reg_write+0x10c/0x1b0
>  [c000e590fd10] [c04f26d0] vfs_write+0xf0/0x330
>  [c000e590fd60] [c04f2aec] ksys_write+0x7c/0x140
>  [c000e590fdb0] [c0031ee0] system_call_exception+0x150/0x290
>  [c000e590fe10] [c000ca5c] system_call_common+0xec/0x278
>  --- interrupt: c00 at 0x7fff905b9664
>  NIP:  7fff905b9664 LR: 7fff905320c4 CTR: 
>  REGS: c000e590fe80 TRAP: 0c00   Not tainted  (5.12.0-rc5upstream)
>  MSR:  8280f033   CR: 28000242
>XER: 
>  IRQMASK: 0
>  GPR00: 0004 75fedf30 7fff906a7300 0001
>  GPR04: 01002a7355b0 0002 0001 75fef616
>  GPR08: 0001   
>  GPR12:  7fff9073a160  
>  GPR16:    
>  GPR20:  7fff906a4ee0 0002 0001
>  GPR24: 7fff906a0898  0002 01002a7355b0
>  GPR28: 0002 7fff906a1790 01002a7355b0 0002
>  NIP [7fff905b9664] 0x7fff905b9664
>  LR [7fff905320c4] 0x7fff905320c4
>  --- interrupt: c00
> 
> To avoid this from happening, extract current CPU info from of_root
> device node and use it for setting up the fdt in kexec_file_load case.
> 
> Fixes: 6ecd0163d360 ("powerpc/kexec_file: Add appropriate regions for memory 
> reserve map")
> 
> Signed-off-by: Sourabh Jain 
> Reviewed-by: Hari Bathini 
> Cc: 
> ---
>  arch/powerpc/kexec/file_load_64.c | 88 +++
>  1 file changed, 88 insertions(+)
> 
>  ---
> Changelog:
> 
> v1 -> v5
>   - https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-April/227950.html
> 
> v5 -> v6
>   - use exiting macro (for_each_property_of_node) to loop through all
> properties of a node.
>   - removed devtree_lock while accessing the node properties.
>   - function name update, add_node_prop to add_node_props.
>  ---
> 
> diff --git a/arch/powerpc/kexec/file_load_64.c 
> b/arch/powerpc/kexec/file_load_64.c
> index 02b9e4d0dc40..4f7d4c10f939 100644
> --- a/arch/powerpc/kexec/file_load_64.c
> +++ b/arch/powerpc/kexec/file_load_64.c
> @@ -960,6 +960,89 @@ unsigned int kexec_fdt_totalsize_ppc64(struct kimage 
> *image)
>   return fdt_size;
>  }
> 
> +/**
> + * add_node_props - Reads node properties from device node structure and add
> + *  them to fdt.
> + * @fdt:Flattened device tree of the kernel
> + * @node_offset:offset of the node to add a property at
> + * @dn: device node pointer
> + *
> + * Returns 0 on success, negative errno on error.
> + */
> +static int add_node_props(void *fdt, int node_offset, const struct 
> device_node *dn)
> +{
> + int ret = 0;
> + struct property *pp;
> +
> + if (!dn)
> + return -EINVAL;
> +
> + for_each_property_of_node(dn, pp) {
> + ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, 
> pp->length);
> + if (ret < 0) {
> + pr_err("Unable to add %s property: %s\n", pp->name, 
> fdt_strerror(ret));
> + return ret;
> + }
> + }
> + return ret;
> +}
> +
> +/**
> + * update_cpus_node - Update cpus node of flattened device tree using of_root
> + *device node.
> + * @fdt:  Flattened device tree of the kernel.
> + *
> + * Returns 0 on success, negative errno on error.
> + */
> +static int update_cpus_node(void *fdt)
> +{
> + struct device_node *cpus_node, *dn;
> + int cpus_offset, cpus_subnode_offset, ret = 0;
> +
> + cpus_offset = fdt_path_offset(fdt, "/cpus");
> + if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
> + pr_err("Malformed device tree: error reading /cpus node: %s\n",
> +

Re: powerpc{32,64} randconfigs

2021-04-27 Thread Randy Dunlap
On 4/21/21 12:15 AM, Michael Ellerman wrote:
> Randy Dunlap  writes:
>> Hi,
>>
>> Is there a way to do this?
>>
>> $ make ARCH=powerpc randconfig # and force PPC32
> 
> Sort of:
> 
> $ KCONFIG_ALLCONFIG=arch/powerpc/configs/book3s_32.config make randconfig
> 
> But that also forces BOOK3S.
> 
>> and separately
>> $ make ARCH=powerpc randconfig # and force PPC64
> 
> No.
> 
> ...
>> OK, I have a patch that seems for work as far as setting
>> PPC32=y or PPC64=y... but it has a problem during linking
>> of vmlinux:
>>
>> crosstool/gcc-9.3.0-nolibc/powerpc-linux/bin/powerpc-linux-ld:./arch/powerpc/kernel/vmlinux.lds:6:
>>  syntax error
>>
>> and the (bad) generated vmlinux.lds file says (at line 6):
>>
>> OUTPUT_ARCH(1:common)
>>
>> while it should say:
>>
>> OUTPUT_ARCH(powerpc:common)
>>
>> Does anyone have any ideas about this problem?
> 
> I guess your patch broke something? :D
> Not sure sorry.
> 
> What about something like this?
> 
> cheers
> 
> 
> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
> index 3212d076ac6a..712c5e8768ce 100644
> --- a/arch/powerpc/Makefile
> +++ b/arch/powerpc/Makefile
> @@ -376,6 +376,16 @@ PHONY += ppc64_book3e_allmodconfig
>   $(Q)$(MAKE) 
> KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/85xx-64bit.config \
>   -f $(srctree)/Makefile allmodconfig
>  
> +PHONY += ppc32_randconfig
> +ppc32_randconfig:
> + $(Q)$(MAKE) 
> KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/32-bit.config \
> + -f $(srctree)/Makefile randconfig
> +
> +PHONY += ppc64_randconfig
> +ppc64_randconfig:
> + $(Q)$(MAKE) 
> KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/64-bit.config \
> + -f $(srctree)/Makefile randconfig
> +
>  define archhelp
>@echo '* zImage  - Build default images selected by kernel config'
>@echo '  zImage.*- Compressed kernel image 
> (arch/$(ARCH)/boot/zImage.*)'
> diff --git a/arch/powerpc/configs/32-bit.config 
> b/arch/powerpc/configs/32-bit.config
> new file mode 100644
> index ..bdf833009006
> --- /dev/null
> +++ b/arch/powerpc/configs/32-bit.config
> @@ -0,0 +1 @@
> +CONFIG_PPC64=n

I used the suggested change here (above).

> diff --git a/arch/powerpc/configs/64-bit.config 
> b/arch/powerpc/configs/64-bit.config
> new file mode 100644
> index ..0fe6406929e2
> --- /dev/null
> +++ b/arch/powerpc/configs/64-bit.config
> @@ -0,0 +1 @@
> +CONFIG_PPC64=y
> 

Reviewed-by: Randy Dunlap 
Tested-by: Randy Dunlap 


Please merge this. :)

thanks.
-- 
~Randy



Re: PPC476 hangs during tlb flush after calling /init in crash kernel with linux 5.4+

2021-04-27 Thread Eddie James
On Tue, 2021-04-27 at 19:26 +0200, Christophe Leroy wrote:
> Hi Eddies,
> 
> Le 27/04/2021 à 19:03, Eddie James a écrit :
> > Hi all,
> > 
> > I'm having a problem in simulation and hardware where my PPC476
> > processor stops executing instructions after callling /init. In my
> > case
> > this is a bash script. The code descends to flush the TLB, and
> > somewhere in the loop in _tlbil_pid, the PC goes to
> > InstructionTLBError47x but does not go any further. This only
> > occurs in
> > the crash kernel environment, which is using the same kernel,
> > initramfs, and init script as the main kernel, which executed fine.
> > I
> > do not see this problem with linux 4.19 or 3.10. I do see it with
> > 5.4
> > and 5.10. I see a fair amount of refactoring in the PPC memory
> > management area between 4.19 and 5.4. Can anyone point me in a
> > direction to debug this further? My stack trace is below as I can
> > run
> > gdb in simulation.
> 
> Can you bisect to pin point the culprit commit ?

Hi, thanks for your prompt reply.

Good idea! I have bisected to:

commit 9e849f231c3c72d4c3c1b07c9cd19ae789da0420 (b8-bad,
refs/bisect/bad)
Author: Christophe Leroy 
Date:   Thu Feb 21 19:08:40 2019 +

powerpc/mm/32s: use generic mmu_mapin_ram() for all blocks.

Now that mmu_mapin_ram() is able to handle other blocks
than the one starting at 0, the WII can use it for all
its blocks.

Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 

I also confirmed that reverting this commit resolves the issue in 5.4+.

Now, I don't understand why this is problematic or what is really
happening... Reverting is probably not the desired solution.

Thanks
Eddie


> 
> Assuming the problem is in arch/powerpc/ , you should get the result
> in approx 10 steps:
> 
> [root@po15610vm linux-powerpc]# git bisect start -- arch/powerpc/
> [root@po15610vm linux-powerpc]# git bisect bad v5.4
> [root@po15610vm linux-powerpc]# git bisect good v4.19
> Bisecting: 964 revisions left to test after this (roughly 10 steps)
> 
> 
> Christophe



PPC476 hangs during tlb flush after calling /init in crash kernel with linux 5.4+

2021-04-27 Thread Eddie James
Hi all,

I'm having a problem in simulation and hardware where my PPC476
processor stops executing instructions after callling /init. In my case
this is a bash script. The code descends to flush the TLB, and
somewhere in the loop in _tlbil_pid, the PC goes to
InstructionTLBError47x but does not go any further. This only occurs in
the crash kernel environment, which is using the same kernel,
initramfs, and init script as the main kernel, which executed fine. I
do not see this problem with linux 4.19 or 3.10. I do see it with 5.4
and 5.10. I see a fair amount of refactoring in the PPC memory
management area between 4.19 and 5.4. Can anyone point me in a
direction to debug this further? My stack trace is below as I can run
gdb in simulation.

Thanks,
Eddie


#0  _tlbil_pid () at
/usr/src/kernel/arch/powerpc/mm/nohash/tlb_low.S:123
#1  0xca014864 in local_flush_tlb_mm (mm=)
at /usr/src/kernel/arch/powerpc/mm/nohash/tlb.c:201
#2  tlb_flush (tlb=)
at /usr/src/kernel/arch/powerpc/mm/nohash/tlb.c:395
#3  0xca161e48 in tlb_flush_mmu_tlbonly (tlb=)
at /usr/src/kernel/include/asm-generic/tlb.h:408
#4  tlb_flush_mmu_tlbonly (tlb=)
at /usr/src/kernel/include/asm-generic/tlb.h:403
#5  tlb_flush_mmu (tlb=0xcec2fd18) at
/usr/src/kernel/mm/mmu_gather.c:190
#6  0xca161fa8 in tlb_finish_mmu (tlb=0xcec2fd18, start=,
end=) at /usr/src/kernel/mm/mmu_gather.c:272
#7  0xca18e070 in shift_arg_pages (shift=,
vma=)
at /usr/src/kernel/fs/exec.c:678
#8  setup_arg_pages (bprm=0xcef1a000, stack_top=,
executable_stack=) at /usr/src/kernel/fs/exec.c:768
#9  0xca1f617c in load_elf_binary (bprm=0xcef1a000)
at /usr/src/kernel/fs/binfmt_elf.c:867
#10 0xca18f3d4 in search_binary_handler (bprm=)
at /usr/src/kernel/fs/exec.c:1691
#11 0xca1f458c in next_terminator (last=,
first=)
at /usr/src/kernel/fs/binfmt_script.c:29
#12 load_script (bprm=0xcef1a000) at
/usr/src/kernel/fs/binfmt_script.c:83
#13 0xca18f3d4 in search_binary_handler (bprm=)
at /usr/src/kernel/fs/exec.c:1691
#14 0xca190104 in acct_arg_size (bprm=, pages=)
at /usr/src/kernel/fs/exec.c:187
#15 __do_execve_file (fd=, filename=0xcec98000,
argv=...,
envp=..., flags=, file=)
at /usr/src/kernel/fs/exec.c:1872
#16 0xca19059c in __read_once_size (size=,
res=, p=) at
/usr/src/kernel/include/linux/compiler.h:235
#17 set_dumpable (mm=, value=) at
/usr/src/kernel/fs/exec.c:1983



Re: [PATCH] pseries/drmem: update LMBs after LPM

2021-04-27 Thread Tyrel Datwyler
On 4/27/21 11:13 AM, Laurent Dufour wrote:
> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
> updated by the hypervisor in the case the NUMA topology of the LPAR's
> memory is updated.
> 
> This is caught by the kernel, but the memory's node is updated because
> there is no way to move a memory block between nodes.
> 
> If later a memory block is added or removed, drmem_update_dt() is called
> and it is overwriting the DT node to match the added or removed LMB. But
> the LMB's associativity node has not been updated after the DT node update
> and thus the node is overwritten by the Linux's topology instead of the
> hypervisor one.
> 
> Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
> updated to force an update of the LMB's associativity.
> 
> Cc: Tyrel Datwyler 
> Signed-off-by: Laurent Dufour 
> 
> Change since V1:
>  - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
>  introducing a new hook mechanism.
> ---
>  arch/powerpc/include/asm/drmem.h  |  1 +
>  arch/powerpc/mm/drmem.c   | 35 +++
>  .../platforms/pseries/hotplug-memory.c|  4 +++
>  3 files changed, 40 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/drmem.h 
> b/arch/powerpc/include/asm/drmem.h
> index bf2402fed3e0..4265d5e95c2c 100644
> --- a/arch/powerpc/include/asm/drmem.h
> +++ b/arch/powerpc/include/asm/drmem.h
> @@ -111,6 +111,7 @@ int drmem_update_dt(void);
>  int __init
>  walk_drmem_lmbs_early(unsigned long node, void *data,
> int (*func)(struct drmem_lmb *, const __be32 **, void *));
> +void drmem_update_lmbs(struct property *prop);
>  #endif
>  
>  static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
> index 9af3832c9d8d..f0a6633132af 100644
> --- a/arch/powerpc/mm/drmem.c
> +++ b/arch/powerpc/mm/drmem.c
> @@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node, 
> void *data,
>   return ret;
>  }
>  
> +/*
> + * Update the LMB associativity index.
> + */
> +static int update_lmb(struct drmem_lmb *updated_lmb,
> +   __maybe_unused const __be32 **usm,
> +   __maybe_unused void *data)
> +{
> + struct drmem_lmb *lmb;
> +
> + /*
> +  * Brut force there may be better way to fetch the LMB
> +  */
> + for_each_drmem_lmb(lmb) {
> + if (lmb->drc_index != updated_lmb->drc_index)
> + continue;
> +
> + lmb->aa_index = updated_lmb->aa_index;
> + break;
> + }
> + return 0;
> +}
> +
> +/*
> + * Update the LMB associativity index.
> + *
> + * This needs to be called when the hypervisor is updating the
> + * dynamic-reconfiguration-memory node property.
> + */
> +void drmem_update_lmbs(struct property *prop)
> +{
> + if (!strcmp(prop->name, "ibm,dynamic-memory"))
> + __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
> + else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
> + __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
> +}
>  #endif
>  
>  static int init_drmem_lmb_size(struct device_node *dn)
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
> b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index 8377f1f7c78e..8aabaafc484b 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct notifier_block 
> *nb,
>   case OF_RECONFIG_DETACH_NODE:
>   err = pseries_remove_mem_node(rd->dn);
>   break;
> + case OF_RECONFIG_UPDATE_PROPERTY:
> + if (!strcmp(rd->dn->full_name,

Pretty much a self nit on myself since I just copied the device node name field
from your initial patch into my suggested code block.

It used to be that dn->full_name was intended to store the full device-tree path
name of the device node ane dn->name simply the base name. These days the values
of both name fields are simply the basename for pseries. Regardless,
rd->dn->name is technically correct and shorter.

-Tyrel

> + "ibm,dynamic-reconfiguration-memory"))
> + drmem_update_lmbs(rd->prop);
>   }
>   return notifier_from_errno(err);
>  }
> 



Re: [PATCH] pseries/drmem: update LMBs after LPM

2021-04-27 Thread Laurent Dufour

Michael, this is a v2 despite the mail's subject, sorry for the mess.


[PATCH] pseries/drmem: update LMBs after LPM

2021-04-27 Thread Laurent Dufour
After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
updated by the hypervisor in the case the NUMA topology of the LPAR's
memory is updated.

This is caught by the kernel, but the memory's node is updated because
there is no way to move a memory block between nodes.

If later a memory block is added or removed, drmem_update_dt() is called
and it is overwriting the DT node to match the added or removed LMB. But
the LMB's associativity node has not been updated after the DT node update
and thus the node is overwritten by the Linux's topology instead of the
hypervisor one.

Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
updated to force an update of the LMB's associativity.

Cc: Tyrel Datwyler 
Signed-off-by: Laurent Dufour 

Change since V1:
 - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
 introducing a new hook mechanism.
---
 arch/powerpc/include/asm/drmem.h  |  1 +
 arch/powerpc/mm/drmem.c   | 35 +++
 .../platforms/pseries/hotplug-memory.c|  4 +++
 3 files changed, 40 insertions(+)

diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
index bf2402fed3e0..4265d5e95c2c 100644
--- a/arch/powerpc/include/asm/drmem.h
+++ b/arch/powerpc/include/asm/drmem.h
@@ -111,6 +111,7 @@ int drmem_update_dt(void);
 int __init
 walk_drmem_lmbs_early(unsigned long node, void *data,
  int (*func)(struct drmem_lmb *, const __be32 **, void *));
+void drmem_update_lmbs(struct property *prop);
 #endif
 
 static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
index 9af3832c9d8d..f0a6633132af 100644
--- a/arch/powerpc/mm/drmem.c
+++ b/arch/powerpc/mm/drmem.c
@@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node, void 
*data,
return ret;
 }
 
+/*
+ * Update the LMB associativity index.
+ */
+static int update_lmb(struct drmem_lmb *updated_lmb,
+ __maybe_unused const __be32 **usm,
+ __maybe_unused void *data)
+{
+   struct drmem_lmb *lmb;
+
+   /*
+* Brut force there may be better way to fetch the LMB
+*/
+   for_each_drmem_lmb(lmb) {
+   if (lmb->drc_index != updated_lmb->drc_index)
+   continue;
+
+   lmb->aa_index = updated_lmb->aa_index;
+   break;
+   }
+   return 0;
+}
+
+/*
+ * Update the LMB associativity index.
+ *
+ * This needs to be called when the hypervisor is updating the
+ * dynamic-reconfiguration-memory node property.
+ */
+void drmem_update_lmbs(struct property *prop)
+{
+   if (!strcmp(prop->name, "ibm,dynamic-memory"))
+   __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
+   else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
+   __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
+}
 #endif
 
 static int init_drmem_lmb_size(struct device_node *dn)
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c 
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 8377f1f7c78e..8aabaafc484b 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct notifier_block 
*nb,
case OF_RECONFIG_DETACH_NODE:
err = pseries_remove_mem_node(rd->dn);
break;
+   case OF_RECONFIG_UPDATE_PROPERTY:
+   if (!strcmp(rd->dn->full_name,
+   "ibm,dynamic-reconfiguration-memory"))
+   drmem_update_lmbs(rd->prop);
}
return notifier_from_errno(err);
 }
-- 
2.31.1



Re: [PATCH] pseries/drmem: update LMBs after LPM

2021-04-27 Thread Laurent Dufour

Le 27/04/2021 à 19:01, Tyrel Datwyler a écrit :

On 4/27/21 8:01 AM, Laurent Dufour wrote:

After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
updated by the hypervisor in the case the NUMA topology of the LPAR's
memory is updated.

This is caught by the kernel, but the memory's node is updated because
there is no way to move a memory block between nodes.

If later a memory block is added or removed, drmem_update_dt() is called
and it is overwriting the DT node to match the added or removed LMB. But
the LMB's associativity node has not been updated after the DT node update
and thus the node is overwritten by the Linux's topology instead of the
hypervisor one.

Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
updated to force an update of the LMB's associativity.

Signed-off-by: Laurent Dufour 
---
  arch/powerpc/include/asm/drmem.h  |  1 +
  arch/powerpc/mm/drmem.c   | 48 +++
  arch/powerpc/platforms/pseries/mobility.c |  9 +
  3 files changed, 58 insertions(+)

diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
index bf2402fed3e0..55c2c25085b0 100644
--- a/arch/powerpc/include/asm/drmem.h
+++ b/arch/powerpc/include/asm/drmem.h
@@ -111,6 +111,7 @@ int drmem_update_dt(void);
  int __init
  walk_drmem_lmbs_early(unsigned long node, void *data,
  int (*func)(struct drmem_lmb *, const __be32 **, void *));
+void drmem_update_lmbs(void);
  #endif

  static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
index 9af3832c9d8d..46074bdfdb3c 100644
--- a/arch/powerpc/mm/drmem.c
+++ b/arch/powerpc/mm/drmem.c
@@ -307,6 +307,54 @@ int __init walk_drmem_lmbs_early(unsigned long node, void 
*data,
return ret;
  }

+/*
+ * Update the LMB associativity index.
+ */
+static int update_lmb(struct drmem_lmb *updated_lmb,
+ __maybe_unused const __be32 **usm,
+ __maybe_unused void *data)
+{
+   struct drmem_lmb *lmb;
+
+   /*
+* Brut force there may be better way to fetch the LMB
+*/
+   for_each_drmem_lmb(lmb) {
+   if (lmb->drc_index != updated_lmb->drc_index)
+   continue;
+
+   lmb->aa_index = updated_lmb->aa_index;
+   break;
+   }
+   return 0;
+}
+
+/*
+ * Update the LMB associativity index.
+ *
+ * This needs to be called when the hypervisor is updating the
+ * dynamic-reconfiguration-memory node property.
+ */
+void drmem_update_lmbs(void)
+{
+   struct device_node *node;
+   const __be32 *prop;
+
+   node = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
+   if (!node)
+   return;
+
+   prop = of_get_property(node, "ibm,dynamic-memory", NULL);
+   if (prop) {
+   __walk_drmem_v1_lmbs(prop, NULL, NULL, update_lmb);
+   } else {
+   prop = of_get_property(node, "ibm,dynamic-memory-v2", NULL);
+   if (prop)
+   __walk_drmem_v2_lmbs(prop, NULL, NULL, update_lmb);
+   }
+
+   of_node_put(node);
+}
  #endif

  static int init_drmem_lmb_size(struct device_node *dn)
diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index ea4d6a660e0d..c68eccc6e8df 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -25,6 +25,7 @@

  #include 
  #include 
+#include 
  #include "pseries.h"
  #include "../../kernel/cacheinfo.h"

@@ -237,6 +238,7 @@ int pseries_devicetree_update(s32 scope)
__be32 *data;
int update_nodes_token;
int rc;
+   bool drmem_updated = false;

update_nodes_token = rtas_token("ibm,update-nodes");
if (update_nodes_token == RTAS_UNKNOWN_SERVICE)
@@ -271,6 +273,10 @@ int pseries_devicetree_update(s32 scope)
continue;
}

+   if (!strcmp(np->full_name,
+   
"ibm,dynamic-reconfiguration-memory"))
+   drmem_updated = true;


Is there a reason that we can't use the existing pseries_memory_notifier()
callback in pseries/hotplug-memory.c to trigger the drmem_update_lmbs() when
either the ibm,dynamic-memory or ibm,dynamic-memory-v2 properties are updated?


Thanks a lot Tyrel!

That's far more elegant, I'll send a v2 soon.

Laurent.


Something like:

static int pseries_memory_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
 struct of_reconfig_data *rd = data;
 int err = 0;

 switch (action) {
 case OF_RECONFIG_ATTACH_NODE:
 err = pseries_add_mem_node(rd->dn);
 break;
 case OF_RECONFIG_DETACH_NODE:

Re: PPC476 hangs during tlb flush after calling /init in crash kernel with linux 5.4+

2021-04-27 Thread Christophe Leroy

Hi Eddies,

Le 27/04/2021 à 19:03, Eddie James a écrit :

Hi all,

I'm having a problem in simulation and hardware where my PPC476
processor stops executing instructions after callling /init. In my case
this is a bash script. The code descends to flush the TLB, and
somewhere in the loop in _tlbil_pid, the PC goes to
InstructionTLBError47x but does not go any further. This only occurs in
the crash kernel environment, which is using the same kernel,
initramfs, and init script as the main kernel, which executed fine. I
do not see this problem with linux 4.19 or 3.10. I do see it with 5.4
and 5.10. I see a fair amount of refactoring in the PPC memory
management area between 4.19 and 5.4. Can anyone point me in a
direction to debug this further? My stack trace is below as I can run
gdb in simulation.


Can you bisect to pin point the culprit commit ?

Assuming the problem is in arch/powerpc/ , you should get the result in approx 
10 steps:

[root@po15610vm linux-powerpc]# git bisect start -- arch/powerpc/
[root@po15610vm linux-powerpc]# git bisect bad v5.4
[root@po15610vm linux-powerpc]# git bisect good v4.19
Bisecting: 964 revisions left to test after this (roughly 10 steps)


Christophe


Re: [PATCH] pseries/drmem: update LMBs after LPM

2021-04-27 Thread Tyrel Datwyler
On 4/27/21 8:01 AM, Laurent Dufour wrote:
> After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
> updated by the hypervisor in the case the NUMA topology of the LPAR's
> memory is updated.
> 
> This is caught by the kernel, but the memory's node is updated because
> there is no way to move a memory block between nodes.
> 
> If later a memory block is added or removed, drmem_update_dt() is called
> and it is overwriting the DT node to match the added or removed LMB. But
> the LMB's associativity node has not been updated after the DT node update
> and thus the node is overwritten by the Linux's topology instead of the
> hypervisor one.
> 
> Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
> updated to force an update of the LMB's associativity.
> 
> Signed-off-by: Laurent Dufour 
> ---
>  arch/powerpc/include/asm/drmem.h  |  1 +
>  arch/powerpc/mm/drmem.c   | 48 +++
>  arch/powerpc/platforms/pseries/mobility.c |  9 +
>  3 files changed, 58 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/drmem.h 
> b/arch/powerpc/include/asm/drmem.h
> index bf2402fed3e0..55c2c25085b0 100644
> --- a/arch/powerpc/include/asm/drmem.h
> +++ b/arch/powerpc/include/asm/drmem.h
> @@ -111,6 +111,7 @@ int drmem_update_dt(void);
>  int __init
>  walk_drmem_lmbs_early(unsigned long node, void *data,
> int (*func)(struct drmem_lmb *, const __be32 **, void *));
> +void drmem_update_lmbs(void);
>  #endif
> 
>  static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
> diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
> index 9af3832c9d8d..46074bdfdb3c 100644
> --- a/arch/powerpc/mm/drmem.c
> +++ b/arch/powerpc/mm/drmem.c
> @@ -307,6 +307,54 @@ int __init walk_drmem_lmbs_early(unsigned long node, 
> void *data,
>   return ret;
>  }
> 
> +/*
> + * Update the LMB associativity index.
> + */
> +static int update_lmb(struct drmem_lmb *updated_lmb,
> +   __maybe_unused const __be32 **usm,
> +   __maybe_unused void *data)
> +{
> + struct drmem_lmb *lmb;
> +
> + /*
> +  * Brut force there may be better way to fetch the LMB
> +  */
> + for_each_drmem_lmb(lmb) {
> + if (lmb->drc_index != updated_lmb->drc_index)
> + continue;
> +
> + lmb->aa_index = updated_lmb->aa_index;
> + break;
> + }
> + return 0;
> +}
> +
> +/*
> + * Update the LMB associativity index.
> + *
> + * This needs to be called when the hypervisor is updating the
> + * dynamic-reconfiguration-memory node property.
> + */
> +void drmem_update_lmbs(void)
> +{
> + struct device_node *node;
> + const __be32 *prop;
> +
> + node = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
> + if (!node)
> + return;
> +
> + prop = of_get_property(node, "ibm,dynamic-memory", NULL);
> + if (prop) {
> + __walk_drmem_v1_lmbs(prop, NULL, NULL, update_lmb);
> + } else {
> + prop = of_get_property(node, "ibm,dynamic-memory-v2", NULL);
> + if (prop)
> + __walk_drmem_v2_lmbs(prop, NULL, NULL, update_lmb);
> + }
> +
> + of_node_put(node);
> +}
>  #endif
> 
>  static int init_drmem_lmb_size(struct device_node *dn)
> diff --git a/arch/powerpc/platforms/pseries/mobility.c 
> b/arch/powerpc/platforms/pseries/mobility.c
> index ea4d6a660e0d..c68eccc6e8df 100644
> --- a/arch/powerpc/platforms/pseries/mobility.c
> +++ b/arch/powerpc/platforms/pseries/mobility.c
> @@ -25,6 +25,7 @@
> 
>  #include 
>  #include 
> +#include 
>  #include "pseries.h"
>  #include "../../kernel/cacheinfo.h"
> 
> @@ -237,6 +238,7 @@ int pseries_devicetree_update(s32 scope)
>   __be32 *data;
>   int update_nodes_token;
>   int rc;
> + bool drmem_updated = false;
> 
>   update_nodes_token = rtas_token("ibm,update-nodes");
>   if (update_nodes_token == RTAS_UNKNOWN_SERVICE)
> @@ -271,6 +273,10 @@ int pseries_devicetree_update(s32 scope)
>   continue;
>   }
> 
> + if (!strcmp(np->full_name,
> + 
> "ibm,dynamic-reconfiguration-memory"))
> + drmem_updated = true;

Is there a reason that we can't use the existing pseries_memory_notifier()
callback in pseries/hotplug-memory.c to trigger the drmem_update_lmbs() when
either the ibm,dynamic-memory or ibm,dynamic-memory-v2 properties are updated?

Something like:

static int pseries_memory_notifier(struct notifier_block *nb,
   unsigned long action, void *data)
{
struct of_reconfig_data *rd = data;
int err = 0;

switch (action) {
case OF_RECONFIG_ATTACH_NODE:
err = pseries_add_mem_node(rd->dn);
break;
case OF_RECONFIG_DETAC

Re: [PATCH 1/2] powerpc/sstep: Add emulation support for ‘setb’ instruction

2021-04-27 Thread Naveen N. Rao

Michael Ellerman wrote:

"Naveen N. Rao"  writes:

Michael Ellerman wrote:

"Naveen N. Rao"  writes:

Daniel Axtens wrote:

Sathvika Vasireddy  writes:


This adds emulation support for the following instruction:
   * Set Boolean (setb)

Signed-off-by: Sathvika Vasireddy 

...


If you do end up respinning the patch, I think it would be good to make
the maths a bit clearer. I think it works because a left shift of 2 is
the same as multiplying by 4, but it would be easier to follow if you
used a temporary variable for btf.


Indeed. I wonder if it is better to follow the ISA itself. Per the ISA, 
the bit we are interested in is:

4 x BFA + 32

So, if we use that along with the PPC_BIT() macro, we get:
if (regs->ccr & PPC_BIT(ra + 32))


Use of PPC_BIT risks annoying your maintainer :)


Uh oh... that isn't good :)

I looked up previous discussions and I think I now understand why you 
don't prefer it.


Hah, I'd forgotten I'd written (ranted :D) about this in the past.

But, I feel it helps make it easy to follow the code when referring to 
the ISA.


That's true. But I think that's much much less common than people
reading the code in isolation.


I thought that isn't so for at least the instruction emulation 
infrastructure...




And ultimately it doesn't matter if the code (appears to) match the ISA,
it matters that the code works. My worry is that too much use of those
type of macros obscures what's actually happening.


... but, I agree on the above point. I can see why it is better to keep 
it simple.


I also see precedence for what both you and Segher are suggesting in the 
existing code in sstep.c




I'm wondering if it is just the name you dislike and if so, 
does it make sense to rename PPC_BIT() to something else? We have 
BIT_ULL(), so perhaps BIT_MSB_ULL() or MSB_BIT_ULL()?


The name is part of it. But I don't really like BIT_ULL() either, it
hides in a macro something that could just be there in front of you
ie. (1ull << x).


For this case of setb, I think I'd go with something like below. It
doesn't exactly match the ISA, but I think there's minimal obfuscation
of what's actually going on.

// ra is now bfa
ra = (ra >> 2);

// Extract 4-bit CR field
val = regs->ccr >> (CR0_SHIFT - 4 * ra);

if (val & 8)
op->val = -1;
else if (val & 4)
op->val = 1;
else
op->val = 0;


If anything could use a macro it would be the 8 and 4, eg. CR_LT, CR_GT.

Of course that's probably got a bug in it, because I just wrote it by
eye and it's 11:28 pm :)


LGTM, thanks. I'll let Sathvika decide on which variant she wants to go 
with for v2 :)



- Naveen



[PATCH] pseries/drmem: update LMBs after LPM

2021-04-27 Thread Laurent Dufour
After a LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
updated by the hypervisor in the case the NUMA topology of the LPAR's
memory is updated.

This is caught by the kernel, but the memory's node is updated because
there is no way to move a memory block between nodes.

If later a memory block is added or removed, drmem_update_dt() is called
and it is overwriting the DT node to match the added or removed LMB. But
the LMB's associativity node has not been updated after the DT node update
and thus the node is overwritten by the Linux's topology instead of the
hypervisor one.

Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
updated to force an update of the LMB's associativity.

Signed-off-by: Laurent Dufour 
---
 arch/powerpc/include/asm/drmem.h  |  1 +
 arch/powerpc/mm/drmem.c   | 48 +++
 arch/powerpc/platforms/pseries/mobility.c |  9 +
 3 files changed, 58 insertions(+)

diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
index bf2402fed3e0..55c2c25085b0 100644
--- a/arch/powerpc/include/asm/drmem.h
+++ b/arch/powerpc/include/asm/drmem.h
@@ -111,6 +111,7 @@ int drmem_update_dt(void);
 int __init
 walk_drmem_lmbs_early(unsigned long node, void *data,
  int (*func)(struct drmem_lmb *, const __be32 **, void *));
+void drmem_update_lmbs(void);
 #endif
 
 static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
index 9af3832c9d8d..46074bdfdb3c 100644
--- a/arch/powerpc/mm/drmem.c
+++ b/arch/powerpc/mm/drmem.c
@@ -307,6 +307,54 @@ int __init walk_drmem_lmbs_early(unsigned long node, void 
*data,
return ret;
 }
 
+/*
+ * Update the LMB associativity index.
+ */
+static int update_lmb(struct drmem_lmb *updated_lmb,
+ __maybe_unused const __be32 **usm,
+ __maybe_unused void *data)
+{
+   struct drmem_lmb *lmb;
+
+   /*
+* Brut force there may be better way to fetch the LMB
+*/
+   for_each_drmem_lmb(lmb) {
+   if (lmb->drc_index != updated_lmb->drc_index)
+   continue;
+
+   lmb->aa_index = updated_lmb->aa_index;
+   break;
+   }
+   return 0;
+}
+
+/*
+ * Update the LMB associativity index.
+ *
+ * This needs to be called when the hypervisor is updating the
+ * dynamic-reconfiguration-memory node property.
+ */
+void drmem_update_lmbs(void)
+{
+   struct device_node *node;
+   const __be32 *prop;
+
+   node = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
+   if (!node)
+   return;
+
+   prop = of_get_property(node, "ibm,dynamic-memory", NULL);
+   if (prop) {
+   __walk_drmem_v1_lmbs(prop, NULL, NULL, update_lmb);
+   } else {
+   prop = of_get_property(node, "ibm,dynamic-memory-v2", NULL);
+   if (prop)
+   __walk_drmem_v2_lmbs(prop, NULL, NULL, update_lmb);
+   }
+
+   of_node_put(node);
+}
 #endif
 
 static int init_drmem_lmb_size(struct device_node *dn)
diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index ea4d6a660e0d..c68eccc6e8df 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -25,6 +25,7 @@
 
 #include 
 #include 
+#include 
 #include "pseries.h"
 #include "../../kernel/cacheinfo.h"
 
@@ -237,6 +238,7 @@ int pseries_devicetree_update(s32 scope)
__be32 *data;
int update_nodes_token;
int rc;
+   bool drmem_updated = false;
 
update_nodes_token = rtas_token("ibm,update-nodes");
if (update_nodes_token == RTAS_UNKNOWN_SERVICE)
@@ -271,6 +273,10 @@ int pseries_devicetree_update(s32 scope)
continue;
}
 
+   if (!strcmp(np->full_name,
+   
"ibm,dynamic-reconfiguration-memory"))
+   drmem_updated = true;
+
switch (action) {
case DELETE_DT_NODE:
delete_dt_node(np);
@@ -293,6 +299,9 @@ int pseries_devicetree_update(s32 scope)
} while (rc == 1);
 
kfree(rtas_buf);
+
+   if (drmem_updated)
+   drmem_update_lmbs();
return rc;
 }
 
-- 
2.31.1



Re: [PATCH 4/4] powerpc/pseries: warn if recursing into the hcall tracing code

2021-04-27 Thread Naveen N. Rao

Nicholas Piggin wrote:

---
 arch/powerpc/platforms/pseries/lpar.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/lpar.c 
b/arch/powerpc/platforms/pseries/lpar.c
index 835e7f661a05..a961a7ebeab3 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -1828,8 +1828,11 @@ void hcall_tracepoint_unregfunc(void)

 /*
  * Since the tracing code might execute hcalls we need to guard against
- * recursion. H_CONFER from spin locks must be treated separately though
- * and use _notrace plpar_hcall variants, see yield_to_preempted().
+ * recursion, but this always seems risky -- __trace_hcall_entry might be
+ * ftraced, for example. So warn in this case.


__trace_hcall_[entry|exit] aren't traced anymore since they now have the 
'notrace' annotation.



+ *
+ * H_CONFER from spin locks must be treated separately though and use _notrace
+ * plpar_hcall variants, see yield_to_preempted().
  */
 static DEFINE_PER_CPU(unsigned int, hcall_trace_depth);

@@ -1843,7 +1846,7 @@ notrace void __trace_hcall_entry(unsigned long opcode, 
unsigned long *args)

depth = this_cpu_ptr(&hcall_trace_depth);

-   if (*depth)
+   if (WARN_ON_ONCE(*depth))
goto out;


I don't think this will be helpful. The hcall trace depth tracking is 
for the tracepoint and I suspect that this warning will be triggered 
quite easily. Since we have recursion protection, I don't think we 
should warn here.



- Naveen



Re: [PATCH 3/4] powerpc/pseries: use notrace hcall variant for H_CEDE idle

2021-04-27 Thread Naveen N. Rao

Nicholas Piggin wrote:

Rather than special-case H_CEDE in the hcall trace wrappers, make the
idle H_CEDE call use plpar_hcall_norets_notrace().

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/plpar_wrappers.h |  6 +-
 arch/powerpc/platforms/pseries/lpar.c | 10 --
 2 files changed, 5 insertions(+), 11 deletions(-)


Reviewed-by: Naveen N. Rao 


- Naveen



Re: [PATCH 2/4] powerpc/pseries: Don't trace hcall tracing wrapper

2021-04-27 Thread Naveen N. Rao

Nicholas Piggin wrote:

This doesn't seem very useful to trace before the recursion check, even
if the ftrace code has any recursion checks of its own. Be on the safe
side and don't trace the hcall trace wrappers.


These functions exist precisely to allow hcalls to be traced, so it 
doesn't make sense to "trace the tracer". Users wanting to know about 
hcalls are better off enabling the tracepoint here instead.




Reported-by: Naveen N. Rao 
Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/platforms/pseries/lpar.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Naveen N. Rao 


- Naveen


Re: [PATCH 1/4] powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks

2021-04-27 Thread Naveen N. Rao

Nicholas Piggin wrote:

The paravit queued spinlock slow path adds itself to the queue then
calls pv_wait to wait for the lock to become free. This is implemented
by calling H_CONFER to donate cycles.

When hcall tracing is enabled, this H_CONFER call can lead to a spin
lock being taken in the tracing code, which will result in the lock to
be taken again, which will also go to the slow path because it queues
behind itself and so won't ever make progress.

An example trace of a deadlock:

  __pv_queued_spin_lock_slowpath
  trace_clock_global
  ring_buffer_lock_reserve
  trace_event_buffer_lock_reserve
  trace_event_buffer_reserve
  trace_event_raw_event_hcall_exit
  __trace_hcall_exit
  plpar_hcall_norets_trace
  __pv_queued_spin_lock_slowpath
  trace_clock_global
  ring_buffer_lock_reserve
  trace_event_buffer_lock_reserve
  trace_event_buffer_reserve
  trace_event_raw_event_rcu_dyntick
  rcu_irq_exit
  irq_exit
  __do_irq
  call_do_irq
  do_IRQ
  hardware_interrupt_common_virt

Fix this by introducing plpar_hcall_norets_notrace(), and using that to
make SPLPAR virtual processor dispatching hcalls by the paravirt
spinlock code.

Fixes: 20c0e8269e9d ("powerpc/pseries: Implement paravirt qspinlocks for 
SPLPAR")
Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/hvcall.h   |  3 +++
 arch/powerpc/include/asm/paravirt.h | 22 +++---
 arch/powerpc/platforms/pseries/hvCall.S | 10 ++
 arch/powerpc/platforms/pseries/lpar.c   |  4 ++--
 4 files changed, 34 insertions(+), 5 deletions(-)


Thanks for the fix! Some very minor nits below, but none the less:
Reviewed-by: Naveen N. Rao 



diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index ed6086d57b22..0c92b01a3c3c 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -446,6 +446,9 @@
  */
 long plpar_hcall_norets(unsigned long opcode, ...);

+/* Variant which does not do hcall tracing */
+long plpar_hcall_norets_notrace(unsigned long opcode, ...);
+
 /**
  * plpar_hcall: - Make a pseries hypervisor call
  * @opcode: The hypervisor call to make.
diff --git a/arch/powerpc/include/asm/paravirt.h 
b/arch/powerpc/include/asm/paravirt.h
index 5d1726bb28e7..3c13c2ec70a9 100644
--- a/arch/powerpc/include/asm/paravirt.h
+++ b/arch/powerpc/include/asm/paravirt.h
@@ -30,17 +30,33 @@ static inline u32 yield_count_of(int cpu)

 static inline void yield_to_preempted(int cpu, u32 yield_count)
 {


It looks like yield_to_preempted() is only used by simple spin locks 
today. I wonder if it makes more sense to put the below comment in 
yield_to_any() which is used by the qspinlock code.



-   plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), 
yield_count);
+   /*
+* Spinlock code yields and prods, so don't trace the hcalls because
+* tracing code takes spinlocks which could recurse.
+*
+* These calls are made while the lock is not held, the lock slowpath
+* yields if it can not acquire the lock, and unlock slow path might
+* prod if a waiter has yielded). So this did not seem to be a problem
+* for simple spin locks because technically it didn't recuse on the

   ^^
   recurse


+* lock. However the queued spin lock contended path is more strictly
+* ordered: the H_CONFER hcall is made after the task has queued itself
+* on the lock, so then recursing on the lock will queue up behind that
+* (or worse: queued spinlocks uses tricks that assume a context never
+* waits on more than one spinlock, so that may cause random
+* corruption).
+*/
+   plpar_hcall_norets_notrace(H_CONFER,
+  get_hard_smp_processor_id(cpu), yield_count);


This can all be on a single line.


- Naveen



Re: [PATCH 2/9] ARM: PXA: Kill use of irq_create_strict_mappings()

2021-04-27 Thread Guenter Roeck
On 4/27/21 1:30 AM, Marc Zyngier wrote:
> Hi Guenter,
> 
> Thanks for the heads up.
> 
> On Mon, 26 Apr 2021 23:39:42 +0100,
> Guenter Roeck  wrote:
>>
>> On Tue, Apr 06, 2021 at 10:35:50AM +0100, Marc Zyngier wrote:
>>> irq_create_strict_mappings() is a poor way to allow the use of
>>> a linear IRQ domain as a legacy one. Let's be upfront about
>>> it and use a legacy domain when appropriate.
>>>
>>> Signed-off-by: Marc Zyngier 
>>> ---
>>
>> When running the "mainstone" qemu emulation, this patch results
>> in many (32, actually) runtime warnings such as the following.
>>
>> [0.528272] [ cut here ]
>> [0.528285] WARNING: CPU: 0 PID: 1 at kernel/irq/irqdomain.c:550 
>> irq_domain_associate+0x194/0x1f0
>> [0.528315] error: virq335 is not allocated
> 
> [...]
> 
> This looks like a case of CONFIG_SPARSE_IRQ, combined with a lack of
> brain engagement. I've come up with the following patch, which lets
> the kernel boot in QEMU without screaming (other than the lack of a
> rootfs...).
> 
> Please let me know if this helps.
> 

It does.

Tested-by: Guenter Roeck 

Thanks,
Guenter

> Thanks,
> 
>   M.
> 
> From 4d7f6ddbbfdff1c9f029bafca79020d3294dc32c Mon Sep 17 00:00:00 2001
> From: Marc Zyngier 
> Date: Tue, 27 Apr 2021 09:00:28 +0100
> Subject: [PATCH] ARM: PXA: Fix cplds irqdesc allocation when using legacy mode
> 
> The Mainstone PXA platform uses CONFIG_SPARSE_IRQ, and thus we
> cannot rely on the irq descriptors to be readilly allocated
> before creating the irqdomain in legacy mode. The kernel then
> complains loudly about not being able to associate the interrupt
> in the domain -- can't blame it.
> 
> Fix it by allocating the irqdescs upfront in the legacy case.
> 
> Fixes: b68761da0111 ("ARM: PXA: Kill use of irq_create_strict_mappings()")
> Reported-by: Guenter Roeck 
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm/mach-pxa/pxa_cplds_irqs.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/mach-pxa/pxa_cplds_irqs.c 
> b/arch/arm/mach-pxa/pxa_cplds_irqs.c
> index ec0d9b094744..bddfc7cd5d40 100644
> --- a/arch/arm/mach-pxa/pxa_cplds_irqs.c
> +++ b/arch/arm/mach-pxa/pxa_cplds_irqs.c
> @@ -121,8 +121,13 @@ static int cplds_probe(struct platform_device *pdev)
>   return fpga->irq;
>  
>   base_irq = platform_get_irq(pdev, 1);
> - if (base_irq < 0)
> + if (base_irq < 0) {
>   base_irq = 0;
> + } else {
> + ret = devm_irq_alloc_descs(&pdev->dev, base_irq, base_irq, 
> CPLDS_NB_IRQ, 0);
> + if (ret < 0)
> + return ret;
> + }
>  
>   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>   fpga->base = devm_ioremap_resource(&pdev->dev, res);
> 



Re: [PATCH 2/9] ARM: PXA: Kill use of irq_create_strict_mappings()

2021-04-27 Thread Marc Zyngier
Hi Guenter,

Thanks for the heads up.

On Mon, 26 Apr 2021 23:39:42 +0100,
Guenter Roeck  wrote:
> 
> On Tue, Apr 06, 2021 at 10:35:50AM +0100, Marc Zyngier wrote:
> > irq_create_strict_mappings() is a poor way to allow the use of
> > a linear IRQ domain as a legacy one. Let's be upfront about
> > it and use a legacy domain when appropriate.
> > 
> > Signed-off-by: Marc Zyngier 
> > ---
> 
> When running the "mainstone" qemu emulation, this patch results
> in many (32, actually) runtime warnings such as the following.
> 
> [0.528272] [ cut here ]
> [0.528285] WARNING: CPU: 0 PID: 1 at kernel/irq/irqdomain.c:550 
> irq_domain_associate+0x194/0x1f0
> [0.528315] error: virq335 is not allocated

[...]

This looks like a case of CONFIG_SPARSE_IRQ, combined with a lack of
brain engagement. I've come up with the following patch, which lets
the kernel boot in QEMU without screaming (other than the lack of a
rootfs...).

Please let me know if this helps.

Thanks,

M.

From 4d7f6ddbbfdff1c9f029bafca79020d3294dc32c Mon Sep 17 00:00:00 2001
From: Marc Zyngier 
Date: Tue, 27 Apr 2021 09:00:28 +0100
Subject: [PATCH] ARM: PXA: Fix cplds irqdesc allocation when using legacy mode

The Mainstone PXA platform uses CONFIG_SPARSE_IRQ, and thus we
cannot rely on the irq descriptors to be readilly allocated
before creating the irqdomain in legacy mode. The kernel then
complains loudly about not being able to associate the interrupt
in the domain -- can't blame it.

Fix it by allocating the irqdescs upfront in the legacy case.

Fixes: b68761da0111 ("ARM: PXA: Kill use of irq_create_strict_mappings()")
Reported-by: Guenter Roeck 
Signed-off-by: Marc Zyngier 
---
 arch/arm/mach-pxa/pxa_cplds_irqs.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mach-pxa/pxa_cplds_irqs.c 
b/arch/arm/mach-pxa/pxa_cplds_irqs.c
index ec0d9b094744..bddfc7cd5d40 100644
--- a/arch/arm/mach-pxa/pxa_cplds_irqs.c
+++ b/arch/arm/mach-pxa/pxa_cplds_irqs.c
@@ -121,8 +121,13 @@ static int cplds_probe(struct platform_device *pdev)
return fpga->irq;
 
base_irq = platform_get_irq(pdev, 1);
-   if (base_irq < 0)
+   if (base_irq < 0) {
base_irq = 0;
+   } else {
+   ret = devm_irq_alloc_descs(&pdev->dev, base_irq, base_irq, 
CPLDS_NB_IRQ, 0);
+   if (ret < 0)
+   return ret;
+   }
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
fpga->base = devm_ioremap_resource(&pdev->dev, res);
-- 
2.30.2


-- 
Without deviation from the norm, progress is not possible.