[PATCH v2 net-next] ibmvnic: remove default label from to_string switch

2021-05-03 Thread Michal Suchanek
This way the compiler warns when a new value is added to the enum but
not to the string translation like:

drivers/net/ethernet/ibm/ibmvnic.c: In function 'adapter_state_to_string':
drivers/net/ethernet/ibm/ibmvnic.c:832:2: warning: enumeration value 
'VNIC_FOOBAR' not handled in switch [-Wswitch]
  switch (state) {
  ^~
drivers/net/ethernet/ibm/ibmvnic.c: In function 'reset_reason_to_string':
drivers/net/ethernet/ibm/ibmvnic.c:1935:2: warning: enumeration value 
'VNIC_RESET_FOOBAR' not handled in switch [-Wswitch]
  switch (reason) {
  ^~

Signed-off-by: Michal Suchanek 
---
v2: Fix typo in commit message
---
 drivers/net/ethernet/ibm/ibmvnic.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 5788bb956d73..4d439413f6d9 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -846,9 +846,8 @@ static const char *adapter_state_to_string(enum vnic_state state)
return "REMOVING";
case VNIC_REMOVED:
return "REMOVED";
-   default:
-   return "UNKNOWN";
}
+   return "UNKNOWN";
 }
 
 static int ibmvnic_login(struct net_device *netdev)
@@ -1946,9 +1945,8 @@ static const char *reset_reason_to_string(enum ibmvnic_reset_reason reason)
return "TIMEOUT";
case VNIC_RESET_CHANGE_PARAM:
return "CHANGE_PARAM";
-   default:
-   return "UNKNOWN";
}
+   return "UNKNOWN";
 }
 
 /*
-- 
2.26.2
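
A minimal sketch of the pattern this patch relies on, assuming a made-up
enum (demo_state and demo_state_to_string() are hypothetical names, not the
driver's vnic_state code). With no default label, -Wswitch can flag any
enumerator missing from the switch, while the return after the switch still
covers values outside the enum:

```c
/* Hypothetical enum for illustration; not the driver's real vnic_state. */
enum demo_state {
	DEMO_PROBING,
	DEMO_OPEN,
	DEMO_CLOSED,
};

static const char *demo_state_to_string(enum demo_state state)
{
	/* No default label: adding an enumerator without a case warns. */
	switch (state) {
	case DEMO_PROBING:
		return "PROBING";
	case DEMO_OPEN:
		return "OPEN";
	case DEMO_CLOSED:
		return "CLOSED";
	}
	/* Reached only when state holds a value outside the enum. */
	return "UNKNOWN";
}
```

Adding, say, DEMO_FOOBAR to the enum without a matching case should then
produce a warning of the kind quoted in the commit message when building
with -Wswitch (which -Wall enables).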



Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-03 Thread Alexander Dahl
Hello Arnd,

On Mon, May 03, 2021 at 11:25:21AM +0200, Arnd Bergmann wrote:
> On Mon, May 3, 2021 at 9:35 AM Alexander Dahl  wrote:
> >
> > Desktops and servers are all nice, however I just want to make you
> > aware, there are embedded users forced to stick to older cross
> > toolchains for different reasons as well, e.g. in industrial
> > environment. :-)
> >
> > This is no show stopper for us, I just wanted to let you be aware.
> 
> Can you be more specific about what scenarios you are thinking of,
> what the motivations are for using an old compiler with a new kernel
> on embedded systems, and what you think a realistic maximum
> time would be between compiler updates?

One reason might be certification. For certain industrial applications
like support for complex field bus protocols, you need to get your
devices tested by an external partner running extensive test suites.
This is time consuming and expensive. 

Changing the toolchain of your system then, would be a massive change
which would require recertification, while you could argue just
updating a single component like the kernel and building everything
again, does not require the whole testing process again. 

Thin ice, I know.

> One scenario that I've seen previously is where user space and
> kernel are built together as a source based distribution (OE, buildroot,
> openwrt, ...), and the compiler is picked to match the original sources
> of the user space because that is best tested, but the same compiler
> then gets used to build the kernel as well because that is the default
> in the build environment.

One problem we actually ran into in BSPs like that (we build with
ptxdist, however build system doesn't matter here, it could as well
have been buildroot etc.) was things* failing to build with newer
compilers, things we could not or did not want to fix, so staying with
an older toolchain was the obvious choice. 

*Things as in bootloaders for an armv5 platform.

> There are two problems I see with this logic:
> 
> - Running the latest kernel to avoid security problems is of course
>   a good idea, but if one runs that with ten year old user space that
>   is never updated, the system is likely to end up just as insecure.
>   Not all bugs are in the kernel.

Agreed.

> - The same logic that applies to ancient user space staying with
>   an ancient compiler (it's better tested in this combination) also
>   applies to the kernel: running the latest kernel on an old compiler
>   is something that few people test, and tends to run into more bugs
>   than using the compiler that other developers used to test that
>   kernel.

What we actually did: building recent userspace and kernel with older
toolchains, because bootloader. I know, there are several
possibilities to solve this kind of lock:

- build the bootloader with a different compiler
- update bootloader
- …

As I said before, this is no problem for me now; I can work around it.
I just wanted to give an idea of what could keep people on older toolchains.

Greets
Alex



Re: [PATCH v3 1/2] KVM: PPC: Book3S HV: Sanitise vcpu registers in nested path

2021-05-03 Thread Nicholas Piggin
Excerpts from Paul Mackerras's message of May 4, 2021 2:28 pm:
> On Sat, May 01, 2021 at 11:58:36AM +1000, Nicholas Piggin wrote:
>> Excerpts from Fabiano Rosas's message of April 16, 2021 9:09 am:
>> > As one of the arguments of the H_ENTER_NESTED hypercall, the nested
>> > hypervisor (L1) prepares a structure containing the values of various
>> > hypervisor-privileged registers with which it wants the nested guest
>> > (L2) to run. Since the nested HV runs in supervisor mode it needs the
>> > host to write to these registers.
>> > 
>> > To stop a nested HV manipulating this mechanism and using a nested
>> > guest as a proxy to access a facility that has been made unavailable
>> > to it, we have a routine that sanitises the values of the HV registers
>> > before copying them into the nested guest's vcpu struct.
>> > 
>> > However, when coming out of the guest the values are copied as they
>> > were back into L1 memory, which means that any sanitisation we did
>> > during guest entry will be exposed to L1 after H_ENTER_NESTED returns.
>> > 
>> > This patch alters this sanitisation to have effect on the vcpu->arch
>> > registers directly before entering and after exiting the guest,
>> > leaving the structure that is copied back into L1 unchanged (except
>> > when we really want L1 to access the value, e.g the Cause bits of
>> > HFSCR).
>> > 
>> > Signed-off-by: Fabiano Rosas 
>> > ---
>> >  arch/powerpc/kvm/book3s_hv_nested.c | 55 ++---
>> >  1 file changed, 34 insertions(+), 21 deletions(-)
>> > 
>> > diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
>> > index 0cd0e7aad588..270552dd42c5 100644
>> > --- a/arch/powerpc/kvm/book3s_hv_nested.c
>> > +++ b/arch/powerpc/kvm/book3s_hv_nested.c
>> > @@ -102,8 +102,17 @@ static void save_hv_return_state(struct kvm_vcpu *vcpu, int trap,
>> >  {
>> >struct kvmppc_vcore *vc = vcpu->arch.vcore;
>> >  
>> > +  /*
>> > +   * When loading the hypervisor-privileged registers to run L2,
>> > +   * we might have used bits from L1 state to restrict what the
>> > +   * L2 state is allowed to be. Since L1 is not allowed to read
>> > +   * the HV registers, do not include these modifications in the
>> > +   * return state.
>> > +   */
>> > +  hr->hfscr = ((~HFSCR_INTR_CAUSE & hr->hfscr) |
>> > +   (HFSCR_INTR_CAUSE & vcpu->arch.hfscr));
>> > +
>> >hr->dpdes = vc->dpdes;
>> > -  hr->hfscr = vcpu->arch.hfscr;
>> >hr->purr = vcpu->arch.purr;
>> >hr->spurr = vcpu->arch.spurr;
>> >hr->ic = vcpu->arch.ic;
>> 
>> Do we still have the problem here that hfac interrupts due to bits cleared
>> by the hfscr sanitisation would have the cause bits returned to the L1,
>> so in theory it could probe hfscr directly that way? I don't see a good
>> solution to this except either have the L0 intercept these faults and do
>> "something" transparent, or return error from H_ENTER_NESTED (which would
>> also allow trivial probing of the facilities).
> 
> It seems to me that there are various specific reasons why L0 would
> clear HFSCR bits, and if we think about the specific reasons, what we
> should do becomes clear.  (I say "L0" but in fact the same reasoning
> applies to any hypervisor that lets its guest do hypervisor-ish
> things.)
> 
> 1. Emulating a version of the architecture which doesn't have the
> feature in question - in that case the bit should appear to L1 as a
> reserved bit in HFSCR (i.e. always read 0), the associated facility
> code should never appear in the top 8 bits of any HFSCR value that L1
> sees, and any HFU interrupt received by L0 for the facility should be
> changed into an illegal instruction interrupt (or HEAI) forwarded to
> L1.  In this case the real HFSCR should always have the enable bit for
> the facility set to 0.
> 
> 2. Lazy save/restore of the state associated with a facility - in this
> case, while the system is in the "lazy" state (i.e. the state is not
> that of the currently running guest), the real HFSCR bit for the
> facility should be 0.  On an HFU interrupt for the facility, L0 looks
> at L1's HFSCR value: if it's 0, forward the HFU interrupt to L1; if
> it's 1, load up the facility state, set the facility's bit in HFSCR,
> and resume the guest.
> 
> 3. Emulating a facility in software - in this case, the real HFSCR
> bit for the facility would always be 0.  On an HFU interrupt, L0 reads
> the instruction and emulates it, then resumes the guest.
> 
> One thing this all makes clear is that the IC field of the "virtual"
> HFSCR value seen by L1 should only ever be changed when L0 forwards a
> HFU interrupt to L1.
> 
> In fact we currently never do (1) or (2), and we only do (3) for
> msgsndp etc., so this discussion is mostly theoretical.

Yeah it's somewhat theoretical, and I guess I mostly agree with you.

Missing is the case where the L0 does not implement a feature at all.
Let's say TM is broken so it disables it, or nobody uses TAR so it 
doesn't 

Re: [FSL P50x0] Xorg always restarts again and again after the the PowerPC updates 5.13-1

2021-05-03 Thread Christophe Leroy




On 04/05/2021 at 00:25, Christian Zigotzky wrote:

Hello,

Xorg always restarts again and again after the PowerPC updates 5.13-1 [1] on my FSL P5040 Cyrus+ 
board (A-EON AmigaOne X5000) [2]. Xorg doesn't start anymore in a virtual e5500 QEMU machine [3].


I bisected today [4].

Result: powerpc/signal32: Convert do_setcontext[_tm]() to user access block 
(887f3ceb51cd34109ac17bfc98695162e299e657) [5] is the first bad commit.


Please find attached the kernel config.

Please check the first bad commit.


I'm not sure you can conclude anything here. There is a problem in that commit, but it is fixed by 
525642624783 ("powerpc/signal32: Fix erroneous SIGSEGV on RT signal return") which is the last 
commit of powerpc-5.13-1.


So any bisect from there will for sure point to 887f3ceb51cd ("powerpc/signal32: Convert 
do_setcontext[_tm]() to user access block") but that's inconclusive. If the problem is still there 
at the HEAD of powerpc-5.13-1, the problem is likely somewhere else.


I think you need to do the bisect again with a cherry-pick of 525642624783 at 
each step.

Thanks
Christophe




Thanks,
Christian

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c70a4be130de333ea079c59da41cc959712bb01c


[2] http://wiki.amiga.org/index.php?title=X5000
[3] qemu-system-ppc64 -M ppce500 -cpu e5500 -m 1024 -kernel uImage -drive 
format=raw,file=fedora28-2.img,index=0,if=virtio -netdev user,id=mynet0 -device 
virtio-net-pci,netdev=mynet0 -append "rw root=/dev/vda" -device virtio-vga -usb -device 
usb-ehci,id=ehci -device usb-tablet -device virtio-keyboard-pci -smp 4 -vnc :1

[4] https://forum.hyperion-entertainment.com/viewtopic.php?p=53101#p53101
[5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=887f3ceb51cd34109ac17bfc98695162e299e657



Re: [PATCH v3 1/2] KVM: PPC: Book3S HV: Sanitise vcpu registers in nested path

2021-05-03 Thread Paul Mackerras
On Sat, May 01, 2021 at 11:58:36AM +1000, Nicholas Piggin wrote:
> Excerpts from Fabiano Rosas's message of April 16, 2021 9:09 am:
> > As one of the arguments of the H_ENTER_NESTED hypercall, the nested
> > hypervisor (L1) prepares a structure containing the values of various
> > hypervisor-privileged registers with which it wants the nested guest
> > (L2) to run. Since the nested HV runs in supervisor mode it needs the
> > host to write to these registers.
> > 
> > To stop a nested HV manipulating this mechanism and using a nested
> > guest as a proxy to access a facility that has been made unavailable
> > to it, we have a routine that sanitises the values of the HV registers
> > before copying them into the nested guest's vcpu struct.
> > 
> > However, when coming out of the guest the values are copied as they
> > were back into L1 memory, which means that any sanitisation we did
> > during guest entry will be exposed to L1 after H_ENTER_NESTED returns.
> > 
> > This patch alters this sanitisation to have effect on the vcpu->arch
> > registers directly before entering and after exiting the guest,
> > leaving the structure that is copied back into L1 unchanged (except
> > when we really want L1 to access the value, e.g the Cause bits of
> > HFSCR).
> > 
> > Signed-off-by: Fabiano Rosas 
> > ---
> >  arch/powerpc/kvm/book3s_hv_nested.c | 55 ++---
> >  1 file changed, 34 insertions(+), 21 deletions(-)
> > 
> > diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
> > index 0cd0e7aad588..270552dd42c5 100644
> > --- a/arch/powerpc/kvm/book3s_hv_nested.c
> > +++ b/arch/powerpc/kvm/book3s_hv_nested.c
> > @@ -102,8 +102,17 @@ static void save_hv_return_state(struct kvm_vcpu *vcpu, int trap,
> >  {
> > struct kvmppc_vcore *vc = vcpu->arch.vcore;
> >  
> > +   /*
> > +* When loading the hypervisor-privileged registers to run L2,
> > +* we might have used bits from L1 state to restrict what the
> > +* L2 state is allowed to be. Since L1 is not allowed to read
> > +* the HV registers, do not include these modifications in the
> > +* return state.
> > +*/
> > +   hr->hfscr = ((~HFSCR_INTR_CAUSE & hr->hfscr) |
> > +(HFSCR_INTR_CAUSE & vcpu->arch.hfscr));
> > +
> > hr->dpdes = vc->dpdes;
> > -   hr->hfscr = vcpu->arch.hfscr;
> > hr->purr = vcpu->arch.purr;
> > hr->spurr = vcpu->arch.spurr;
> > hr->ic = vcpu->arch.ic;
> 
> Do we still have the problem here that hfac interrupts due to bits cleared
> by the hfscr sanitisation would have the cause bits returned to the L1,
> so in theory it could probe hfscr directly that way? I don't see a good
> solution to this except either have the L0 intercept these faults and do
> "something" transparent, or return error from H_ENTER_NESTED (which would
> also allow trivial probing of the facilities).

It seems to me that there are various specific reasons why L0 would
clear HFSCR bits, and if we think about the specific reasons, what we
should do becomes clear.  (I say "L0" but in fact the same reasoning
applies to any hypervisor that lets its guest do hypervisor-ish
things.)

1. Emulating a version of the architecture which doesn't have the
feature in question - in that case the bit should appear to L1 as a
reserved bit in HFSCR (i.e. always read 0), the associated facility
code should never appear in the top 8 bits of any HFSCR value that L1
sees, and any HFU interrupt received by L0 for the facility should be
changed into an illegal instruction interrupt (or HEAI) forwarded to
L1.  In this case the real HFSCR should always have the enable bit for
the facility set to 0.

2. Lazy save/restore of the state associated with a facility - in this
case, while the system is in the "lazy" state (i.e. the state is not
that of the currently running guest), the real HFSCR bit for the
facility should be 0.  On an HFU interrupt for the facility, L0 looks
at L1's HFSCR value: if it's 0, forward the HFU interrupt to L1; if
it's 1, load up the facility state, set the facility's bit in HFSCR,
and resume the guest.

3. Emulating a facility in software - in this case, the real HFSCR
bit for the facility would always be 0.  On an HFU interrupt, L0 reads
the instruction and emulates it, then resumes the guest.

One thing this all makes clear is that the IC field of the "virtual"
HFSCR value seen by L1 should only ever be changed when L0 forwards a
HFU interrupt to L1.

In fact we currently never do (1) or (2), and we only do (3) for
msgsndp etc., so this discussion is mostly theoretical.

> Returning an hfac interrupt to a hypervisor that thought it enabled the 
> bit would be strange. But so does appearing to modify the register 
> underneath it and then returning a fault.

I don't think we should ever do either of those things.  The closest
would be (1) above, but in that case the fault has to be either an
illegal instruction type program interrupt, or a HEAI.

> I think 
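
For reference, the HFSCR merge in the quoted save_hv_return_state() hunk can
be sketched in isolation. This assumes the cause field is the top 8 bits of
the register, as described above; DEMO_HFSCR_INTR_CAUSE and
demo_merge_hfscr() are illustrative names, not kernel symbols:

```c
#include <stdint.h>

/* Assumption: the interrupt cause field occupies the top 8 bits. */
#define DEMO_HFSCR_INTR_CAUSE (UINT64_C(0xFF) << 56)

/*
 * Keep every bit L1 originally supplied except the cause field, and take
 * the cause field from the value the L2 guest actually ran with.
 */
static uint64_t demo_merge_hfscr(uint64_t l1_hfscr, uint64_t vcpu_hfscr)
{
	return (l1_hfscr & ~DEMO_HFSCR_INTR_CAUSE) |
	       (vcpu_hfscr & DEMO_HFSCR_INTR_CAUSE);
}
```

With this shape, enable bits that were cleared during sanitisation never
leak back to L1; only the cause field is copied across.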

[RFC 09/10] powerpc/rtas: convert to rtas_sched_if_busy()

2021-05-03 Thread Nathan Lynch
rtas_sched_if_busy() has better behavior for RTAS_BUSY (-2) and small
extended delay values.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 4177f7385ea2..c5cc4542856f 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -743,7 +743,7 @@ int rtas_set_power_level(int powerdomain, int level, int *setlevel)
 
do {
rc = rtas_call(token, 2, 2, setlevel, powerdomain, level);
-   } while (rtas_busy_delay(rc));
+   } while (rtas_sched_if_busy(rc));
 
if (rc < 0)
return rtas_error_rc(rc);
@@ -761,7 +761,7 @@ int rtas_get_sensor(int sensor, int index, int *state)
 
do {
rc = rtas_call(token, 2, 2, state, sensor, index);
-   } while (rtas_busy_delay(rc));
+   } while (rtas_sched_if_busy(rc));
 
if (rc < 0)
return rtas_error_rc(rc);
@@ -822,7 +822,7 @@ int rtas_set_indicator(int indicator, int index, int new_value)
 
do {
rc = rtas_call(token, 3, 1, NULL, indicator, index, new_value);
-   } while (rtas_busy_delay(rc));
+   } while (rtas_sched_if_busy(rc));
 
if (rc < 0)
return rtas_error_rc(rc);
@@ -990,7 +990,7 @@ void rtas_activate_firmware(void)
 
do {
fwrc = rtas_call(token, 0, 1, NULL);
-   } while (rtas_busy_delay(fwrc));
+   } while (rtas_sched_if_busy(fwrc));
 
if (fwrc)
pr_err("ibm,activate-firmware failed (%i)\n", fwrc);
-- 
2.30.2
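
The call-again loop that every hunk above keeps intact can be sketched in
userspace as follows. Everything here is a stand-in: demo_rtas_call() fakes
a firmware call that reports busy twice before succeeding, and
demo_retry_if_busy() is a hypothetical predicate that only mimics the
"true means call again" contract, not the real rtas_sched_if_busy():

```c
#include <stdbool.h>

#define DEMO_RTAS_BUSY (-2)	/* busy status value named in the commit message */

/* Counts how many times the fake firmware call was made. */
static int demo_calls;

/* Hypothetical firmware call: busy on the first two calls, then 0. */
static int demo_rtas_call(void)
{
	return (++demo_calls <= 2) ? DEMO_RTAS_BUSY : 0;
}

/* Hypothetical predicate: true means the caller should call again. */
static bool demo_retry_if_busy(int status)
{
	return status == DEMO_RTAS_BUSY;
}

/* Same loop shape as the converted call sites. */
static int demo_do_call(void)
{
	int rc;

	do {
		rc = demo_rtas_call();
	} while (demo_retry_if_busy(rc));

	return rc;
}
```

Because the predicate owns the decision to wait or yield, swapping one
helper for another is a one-line change at each call site.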



[RFC 10/10] powerpc/rtas_flash: convert to rtas_sched_if_busy()

2021-05-03 Thread Nathan Lynch
rtas_sched_if_busy() has better behavior for RTAS_BUSY (-2) and small
extended delay values.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas_flash.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/rtas_flash.c b/arch/powerpc/kernel/rtas_flash.c
index a99179d83538..bedefb9178ec 100644
--- a/arch/powerpc/kernel/rtas_flash.c
+++ b/arch/powerpc/kernel/rtas_flash.c
@@ -378,7 +378,7 @@ static void manage_flash(struct rtas_manage_flash_t *args_buf, unsigned int op)
do {
rc = rtas_call(rtas_token("ibm,manage-flash-image"), 1, 1,
   NULL, op);
-   } while (rtas_busy_delay(rc));
+   } while (rtas_sched_if_busy(rc));
 
args_buf->status = rc;
 }
@@ -456,7 +456,7 @@ static void validate_flash(struct rtas_validate_flash_t *args_buf)
   (u32) __pa(rtas_data_buf), args_buf->buf_size);
memcpy(args_buf->buf, rtas_data_buf, VALIDATE_BUF_SIZE);
spin_unlock(&rtas_data_buf_lock);
-   } while (rtas_busy_delay(rc));
+   } while (rtas_sched_if_busy(rc));
 
args_buf->status = rc;
args_buf->update_results = update_results;
-- 
2.30.2



[RFC 08/10] powerpc/pseries/dlpar: convert to rtas_sched_if_busy()

2021-05-03 Thread Nathan Lynch
rtas_sched_if_busy() has better behavior for RTAS_BUSY (-2) and small
extended delay values.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/platforms/pseries/dlpar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 3ac70790ec7a..3ba77bc09a6e 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -167,7 +167,7 @@ struct device_node *dlpar_configure_connector(__be32 drc_index,
 
spin_unlock(&rtas_data_buf_lock);
 
-   if (rtas_busy_delay(rc))
+   if (rtas_sched_if_busy(rc))
continue;
 
switch (rc) {
-- 
2.30.2



[RFC 07/10] powerpc/pseries/iommu: convert to rtas_sched_if_busy()

2021-05-03 Thread Nathan Lynch
rtas_sched_if_busy() has better behavior for RTAS_BUSY (-2) and small
extended delay values.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/platforms/pseries/iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 0c55b991f665..0f0e7a51b863 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1016,7 +1016,7 @@ static int create_ddw(struct pci_dev *dev, const u32 *ddw_avail,
ret = rtas_call(ddw_avail[DDW_CREATE_PE_DMA_WIN], 5, 4,
(u32 *)create, cfg_addr, BUID_HI(buid),
BUID_LO(buid), page_shift, window_shift);
-   } while (rtas_busy_delay(ret));
+   } while (rtas_sched_if_busy(ret));
dev_info(&dev->dev,
"ibm,create-pe-dma-window(%x) %x %x %x %x %x returned %d "
"(liobn = 0x%x starting addr = %x %x)\n",
-- 
2.30.2



[RFC 06/10] powerpc/pseries/msi: convert to rtas_sched_if_busy()

2021-05-03 Thread Nathan Lynch
rtas_sched_if_busy() has better behavior for RTAS_BUSY (-2) and small
extended delay values.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/platforms/pseries/msi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index 637300330507..df434b8a3aa7 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -49,7 +49,7 @@ static int rtas_change_msi(struct pci_dn *pdn, u32 func, u32 num_irqs)
func, num_irqs, seq_num);
 
seq_num = rtas_ret[1];
-   } while (rtas_busy_delay(rc));
+   } while (rtas_sched_if_busy(rc));
 
/*
 * If the RTAS call succeeded, return the number of irqs allocated.
@@ -100,7 +100,7 @@ static int rtas_query_irq_number(struct pci_dn *pdn, int offset)
do {
rc = rtas_call(query_token, 4, 3, rtas_ret, addr,
   BUID_HI(buid), BUID_LO(buid), offset);
-   } while (rtas_busy_delay(rc));
+   } while (rtas_sched_if_busy(rc));
 
if (rc) {
pr_debug("rtas_msi: error (%d) querying source number\n", rc);
-- 
2.30.2



[RFC 05/10] powerpc/pseries/fadump: convert to rtas_sched_if_busy()

2021-05-03 Thread Nathan Lynch
None of these call sites need to use mdelay(); convert them to
rtas_sched_if_busy().

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/platforms/pseries/rtas-fadump.c | 22 +++-
 1 file changed, 3 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/rtas-fadump.c b/arch/powerpc/platforms/pseries/rtas-fadump.c
index f8f73b47b107..9a200d3bf5e0 100644
--- a/arch/powerpc/platforms/pseries/rtas-fadump.c
+++ b/arch/powerpc/platforms/pseries/rtas-fadump.c
@@ -129,7 +129,6 @@ static u64 rtas_fadump_get_bootmem_min(void)
 
 static int rtas_fadump_register(struct fw_dump *fadump_conf)
 {
-   unsigned int wait_time;
int rc, err = -EIO;
 
/* TODO: Add upper time limit for the delay */
@@ -137,12 +136,7 @@ static int rtas_fadump_register(struct fw_dump *fadump_conf)
rc =  rtas_call(fadump_conf->ibm_configure_kernel_dump, 3, 1,
NULL, FADUMP_REGISTER, &fdm,
sizeof(struct rtas_fadump_mem_struct));
-
-   wait_time = rtas_busy_delay_time(rc);
-   if (wait_time)
-   mdelay(wait_time);
-
-   } while (wait_time);
+   } while (rtas_sched_if_busy(rc));
 
switch (rc) {
case 0:
@@ -177,7 +171,6 @@ static int rtas_fadump_register(struct fw_dump *fadump_conf)
 
 static int rtas_fadump_unregister(struct fw_dump *fadump_conf)
 {
-   unsigned int wait_time;
int rc;
 
/* TODO: Add upper time limit for the delay */
@@ -185,11 +178,7 @@ static int rtas_fadump_unregister(struct fw_dump *fadump_conf)
rc =  rtas_call(fadump_conf->ibm_configure_kernel_dump, 3, 1,
NULL, FADUMP_UNREGISTER, &fdm,
sizeof(struct rtas_fadump_mem_struct));
-
-   wait_time = rtas_busy_delay_time(rc);
-   if (wait_time)
-   mdelay(wait_time);
-   } while (wait_time);
+   } while (rtas_sched_if_busy(rc));
 
if (rc) {
pr_err("Failed to un-register - unexpected error(%d).\n", rc);
@@ -202,7 +191,6 @@ static int rtas_fadump_unregister(struct fw_dump *fadump_conf)
 
 static int rtas_fadump_invalidate(struct fw_dump *fadump_conf)
 {
-   unsigned int wait_time;
int rc;
 
/* TODO: Add upper time limit for the delay */
@@ -210,11 +198,7 @@ static int rtas_fadump_invalidate(struct fw_dump *fadump_conf)
rc =  rtas_call(fadump_conf->ibm_configure_kernel_dump, 3, 1,
NULL, FADUMP_INVALIDATE, fdm_active,
sizeof(struct rtas_fadump_mem_struct));
-
-   wait_time = rtas_busy_delay_time(rc);
-   if (wait_time)
-   mdelay(wait_time);
-   } while (wait_time);
+   } while (rtas_sched_if_busy(rc));
 
if (rc) {
pr_err("Failed to invalidate - unexpected error (%d).\n", rc);
-- 
2.30.2



[RFC 04/10] powerpc/rtas-rtc: convert set-time-of-day to rtas_sched_if_busy()

2021-05-03 Thread Nathan Lynch
rtas_set_rtc_time() is called only in process context; convert this to
rtas_sched_if_busy().

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas-rtc.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/rtas-rtc.c b/arch/powerpc/kernel/rtas-rtc.c
index 82cb95f29a11..421b92f95669 100644
--- a/arch/powerpc/kernel/rtas-rtc.c
+++ b/arch/powerpc/kernel/rtas-rtc.c
@@ -62,7 +62,7 @@ void rtas_get_rtc_time(struct rtc_time *rtc_tm)
 
 int rtas_set_rtc_time(struct rtc_time *tm)
 {
-   int error, wait_time;
+   int error;
u64 max_wait_tb;
 
max_wait_tb = get_tb() + tb_ticks_per_usec * 1000 * MAX_RTC_WAIT;
@@ -72,13 +72,7 @@ int rtas_set_rtc_time(struct rtc_time *tm)
  tm->tm_mday, tm->tm_hour, tm->tm_min,
  tm->tm_sec, 0);
 
-   wait_time = rtas_busy_delay_time(error);
-   if (wait_time) {
-   if (in_interrupt())
-   return 1;   /* probably decrementer */
-   msleep(wait_time);
-   }
-   } while (wait_time && (get_tb() < max_wait_tb));
+   } while (rtas_sched_if_busy(error) && (get_tb() < max_wait_tb));
 
if (error != 0)
printk_ratelimited(KERN_WARNING
-- 
2.30.2



[RFC 03/10] powerpc/rtas-rtc: convert get-time-of-day to rtas_force_spin_if_busy()

2021-05-03 Thread Nathan Lynch
The functions in rtas-rtc which call get-time-of-day can be invoked in
boot, suspend, and resume paths with interrupts off. Unfortunately
get-time-of-day can return an extended delay status, so we use
rtas_force_spin_if_busy().

In the specific case of rtas_get_rtc_time(), it is not clear why
returning an incorrect result is better than calling again even if we
are in interrupt context. Remove this logic.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas-rtc.c | 28 ++--------------------------
 1 file changed, 2 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/kernel/rtas-rtc.c b/arch/powerpc/kernel/rtas-rtc.c
index a28239b8b0c0..82cb95f29a11 100644
--- a/arch/powerpc/kernel/rtas-rtc.c
+++ b/arch/powerpc/kernel/rtas-rtc.c
@@ -17,19 +17,12 @@ time64_t __init rtas_get_boot_time(void)
 {
int ret[8];
int error;
-   unsigned int wait_time;
u64 max_wait_tb;
 
max_wait_tb = get_tb() + tb_ticks_per_usec * 1000 * MAX_RTC_WAIT;
do {
error = rtas_call(rtas_token("get-time-of-day"), 0, 8, ret);
-
-   wait_time = rtas_busy_delay_time(error);
-   if (wait_time) {
-   /* This is boot time so we spin. */
-   udelay(wait_time*1000);
-   }
-   } while (wait_time && (get_tb() < max_wait_tb));
+   } while (rtas_force_spin_if_busy(error) && (get_tb() < max_wait_tb));
 
if (error != 0) {
printk_ratelimited(KERN_WARNING
@@ -41,33 +34,16 @@ time64_t __init rtas_get_boot_time(void)
return mktime64(ret[0], ret[1], ret[2], ret[3], ret[4], ret[5]);
 }
 
-/* NOTE: get_rtc_time will get an error if executed in interrupt context
- * and if a delay is needed to read the clock.  In this case we just
- * silently return without updating rtc_tm.
- */
 void rtas_get_rtc_time(struct rtc_time *rtc_tm)
 {
 int ret[8];
int error;
-   unsigned int wait_time;
u64 max_wait_tb;
 
max_wait_tb = get_tb() + tb_ticks_per_usec * 1000 * MAX_RTC_WAIT;
do {
error = rtas_call(rtas_token("get-time-of-day"), 0, 8, ret);
-
-   wait_time = rtas_busy_delay_time(error);
-   if (wait_time) {
-   if (in_interrupt()) {
-   memset(rtc_tm, 0, sizeof(struct rtc_time));
-   printk_ratelimited(KERN_WARNING
-  "error: reading clock "
-  "would delay interrupt\n");
-   return; /* delay not allowed */
-   }
-   msleep(wait_time);
-   }
-   } while (wait_time && (get_tb() < max_wait_tb));
+   } while (rtas_sched_if_busy(error) && (get_tb() < max_wait_tb));
 
if (error != 0) {
printk_ratelimited(KERN_WARNING
-- 
2.30.2



[RFC 01/10] powerpc/rtas: new APIs for busy and extended delay statuses

2021-05-03 Thread Nathan Lynch
Add new APIs for handling busy (-2) and extended delay
hint (9900...9905) statuses from RTAS. These are intended to be
drop-in replacements for existing uses of rtas_busy_delay().

A problem with rtas_busy_delay() and rtas_busy_delay_time() is that
they consider -2/busy to be equivalent to 9900 (wait 1ms). In fact,
the OS should call again as soon as it wants on -2, which at least on
PowerVM means RTAS is returning only to uphold the general requirement
that RTAS must return control to the OS in a "timely fashion" (250us).

Combine this with the fact that msleep(1) actually sleeps for more
like 20ms in practice: on busy VMs we schedule away for much longer
than necessary on -2 and 9900.

This is fixed in rtas_sched_if_busy(), which uses usleep_range() for
small delay hints, and only schedules away on -2 if there is other
work available. It also refuses to sleep longer than one second
regardless of the hinted value, on the assumption that even longer
running operations can tolerate polling at 1 Hz.

rtas_spin_if_busy() and rtas_force_spin_if_busy() are provided for
atomic contexts which need to handle busy status and extended delay
hints.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/include/asm/rtas.h |   4 +
 arch/powerpc/kernel/rtas.c  | 168 
 2 files changed, 172 insertions(+)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 9dc97d2f9d27..555ff3290f92 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -266,6 +266,10 @@ extern int rtas_set_rtc_time(struct rtc_time *rtc_time);
 extern unsigned int rtas_busy_delay_time(int status);
 extern unsigned int rtas_busy_delay(int status);
 
+bool rtas_sched_if_busy(int status);
+bool rtas_spin_if_busy(int status);
+bool rtas_force_spin_if_busy(int status);
+
 extern int early_init_dt_scan_rtas(unsigned long node,
const char *uname, int depth, void *data);
 
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 6bada744402b..4a1dfbfa51ba 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -519,6 +519,174 @@ unsigned int rtas_busy_delay(int status)
 }
 EXPORT_SYMBOL(rtas_busy_delay);
 
+/**
+ * rtas_force_spin_if_busy() - Consume a busy or extended delay status
+ * in atomic context.
+ * @status: Return value from rtas_call() or similar function.
+ *
+ * Use this function when you cannot avoid using an RTAS function
+ * which may return an extended delay hint in atomic context. If
+ * possible, use rtas_spin_if_busy() or rtas_sched_if_busy() instead
+ * of this function.
+ *
+ * Return: True if @status is -2 or 990x, in which case
+ * rtas_spin_if_busy() will have delayed an appropriate amount
+ * of time, and the caller should call the RTAS function
+ * again. False otherwise.
+ */
+bool rtas_force_spin_if_busy(int status)
+{
+   bool was_busy = true;
+
+   switch (status) {
+   case RTAS_BUSY:
+   /* OK to call again immediately; do nothing. */
+   break;
+   case RTAS_EXTENDED_DELAY_MIN...RTAS_EXTENDED_DELAY_MAX:
+   mdelay(1);
+   break;
+   default:
+   was_busy = false;
+   break;
+   }
+
+   return was_busy;
+}
+
+/**
+ * rtas_spin_if_busy() - Consume a busy status in atomic context.
+ * @status: Return value from rtas_call() or similar function.
+ *
+ * Prefer rtas_sched_if_busy() over this function. Prefer this
+ * function over rtas_force_spin_if_busy(). Use this function in
+ * atomic contexts with RTAS calls that are specified to return -2 but
+ * not 990x. This function will complain and execute a minimal delay
+ * if passed a 990x status.
+ *
+ * Return: True if @status is -2 or 990x, in which case
+ * rtas_spin_if_busy() will have delayed an appropriate amount
+ * of time, and the caller should call the RTAS function
+ * again. False otherwise.
+ */
+bool rtas_spin_if_busy(int status)
+{
+   bool was_busy = true;
+
+   switch (status) {
+   case RTAS_BUSY:
+   /* OK to call again immediately; do nothing. */
+   break;
+   case RTAS_EXTENDED_DELAY_MIN...RTAS_EXTENDED_DELAY_MAX:
+   /*
+* Generally, RTAS functions which can return this
+* status should be considered too expensive to use in
+* atomic context. Change the calling code to use
+* rtas_sched_if_busy(), or if that's not possible,
+* use rtas_force_spin_if_busy().
+*/
+   pr_warn_once("%pS may use RTAS call in atomic context which returns extended delay.\n",
+__builtin_return_address(0));
+   mdelay(1);
+   break;
+   default:
+   was_busy = false;
+   break;
+   }
+
+   return was_busy;
+}
+
+static unsigned 

[RFC 02/10] powerpc/rtas: do not schedule in rtas_os_term()

2021-05-03 Thread Nathan Lynch
rtas_os_term() is called in the panic path and should immediately
re-call the RTAS ibm,os-term function as long as it returns a busy
status. It's not safe to use rtas_busy_delay() in this context, which
potentially can schedule away. Use rtas_spin_if_busy().

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 4a1dfbfa51ba..4177f7385ea2 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -960,7 +960,7 @@ void rtas_os_term(char *str)
do {
status = rtas_call(rtas_token("ibm,os-term"), 1, 1, NULL,
   __pa(rtas_os_term_buf));
-   } while (rtas_busy_delay(status));
+   } while (rtas_spin_if_busy(status));
 
if (status != 0)
printk(KERN_EMERG "ibm,os-term call failed %d\n", status);
-- 
2.30.2
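
For readers following the thread, the retry pattern this patch relies on can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the kernel code: `struct fake_fw`, `fake_fw_call()`, `spin_if_busy()` and `call_until_done()` are hypothetical names made up for the example, and the 9900..9905 bounds for the "990x" statuses are an assumption of the sketch.

```c
#include <stdbool.h>

/* Status values as described in the thread: -2 means "busy" and 990x is
 * an extended delay hint. The exact 9900..9905 bounds are assumed here. */
#define BUSY           (-2)
#define EXT_DELAY_MIN  9900
#define EXT_DELAY_MAX  9905

/* Hypothetical stand-in for a firmware call: it reports busy a fixed
 * number of times and then succeeds with status 0. */
struct fake_fw {
	int busy_left;
	int calls;
};

static int fake_fw_call(struct fake_fw *fw)
{
	fw->calls++;
	if (fw->busy_left > 0) {
		fw->busy_left--;
		return BUSY;
	}
	return 0;
}

/* Never sleeps. Returns true when the caller should issue the call
 * again. A real atomic-context helper would also busy-wait briefly
 * in the 990x case; this model skips the wait. */
static bool spin_if_busy(int status)
{
	if (status == BUSY)
		return true;
	if (status >= EXT_DELAY_MIN && status <= EXT_DELAY_MAX)
		return true;
	return false;
}

/* Re-issue the call until it stops reporting a busy status. */
static int call_until_done(struct fake_fw *fw)
{
	int status;

	do {
		status = fake_fw_call(fw);
	} while (spin_if_busy(status));

	return status;
}
```

The point of the design is that the loop condition never yields the CPU, which is what makes it usable on a panic path where scheduling is not safe.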



[RFC 00/10] powerpc/rtas: improved busy and extended delay status handling

2021-05-03 Thread Nathan Lynch
This is an attempt at providing clearer names as discussed here:

https://github.com/linuxppc/issues/issues/164

as well as providing better behavior for RTAS_BUSY (-2) and small
extended delay values, which in my experience seem more common than
the larger ones. In testing PREEMPT_NONE kernels with CPUs busy, I see
the elapsed time for memory add operations roughly halved, while
memory remove operations' elapsed time shrinks by about 25%. This is
achieved without significantly more time spent on CPU:

(- is before, + is after)

  Performance counter stats for 'drmgr -c mem -a -q 10' (10 runs):

- 1,898  probe:rtas_call   #    0.003 M/sec            ( +-  2.20% )
-751.57 msec task-clock    #    0.289 CPUs utilized    ( +-  1.56% )
+ 1,969  probe:rtas_call   #    0.003 M/sec            ( +-  2.69% )
+766.20 msec task-clock    #    0.688 CPUs utilized    ( +-  1.99% )

- 2.605 +- 0.148 seconds time elapsed  ( +-  5.70% )
+1.1129 +- 0.0660 seconds time elapsed  ( +-  5.93% )

  Performance counter stats for 'drmgr -c mem -r -q 10' (10 runs):

-   673  probe:rtas_call   #    0.002 M/sec            ( +-  0.55% )
-318.36 msec task-clock    #    0.234 CPUs utilized    ( +-  0.42% )
+   692  probe:rtas_call   #    0.002 M/sec            ( +-  0.73% )
+320.87 msec task-clock    #    0.309 CPUs utilized    ( +-  0.34% )

- 1.362 +- 0.100 seconds time elapsed  ( +-  7.37% )
+1.0372 +- 0.0468 seconds time elapsed  ( +-  4.51% )

Questions / concerns / to do:
* I don't love the new API function names.
* Introduces three new APIs when two likely would suffice.
* Need to convert eeh_pseries and scanlog.
* rtas_busy_delay() and rtas_busy_delay_time() not yet removed.

Nathan Lynch (10):
  powerpc/rtas: new APIs for busy and extended delay statuses
  powerpc/rtas: do not schedule in rtas_os_term()
  powerpc/rtas-rtc: convert get-time-of-day to rtas_force_spin_if_busy()
  powerpc/rtas-rtc: convert set-time-of-day to rtas_sched_if_busy()
  powerpc/pseries/fadump: convert to rtas_sched_if_busy()
  powerpc/pseries/msi: convert to rtas_sched_if_busy()
  powerpc/pseries/iommu: convert to rtas_sched_if_busy()
  powerpc/pseries/dlpar: convert to rtas_sched_if_busy()
  powerpc/rtas: convert to rtas_sched_if_busy()
  powerpc/rtas_flash: convert to rtas_sched_if_busy()

 arch/powerpc/include/asm/rtas.h  |   4 +
 arch/powerpc/kernel/rtas-rtc.c   |  38 +---
 arch/powerpc/kernel/rtas.c   | 178 ++-
 arch/powerpc/kernel/rtas_flash.c |   4 +-
 arch/powerpc/platforms/pseries/dlpar.c   |   2 +-
 arch/powerpc/platforms/pseries/iommu.c   |   2 +-
 arch/powerpc/platforms/pseries/msi.c |   4 +-
 arch/powerpc/platforms/pseries/rtas-fadump.c |  22 +--
 8 files changed, 190 insertions(+), 64 deletions(-)

-- 
2.30.2
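
As background for the "small extended delay values" mentioned above, here is a rough sketch of how a 990x status can be mapped to a delay hint. It assumes the usual convention that status 990x suggests waiting about 10^x milliseconds; that convention and the 9900..9905 bounds are not spelled out in this thread, and `delay_hint_ms()` is a hypothetical helper name, not one of the new APIs.

```c
#include <assert.h>

/* Assumed bounds of the extended delay range ("990x"). */
#define EXT_DELAY_MIN 9900
#define EXT_DELAY_MAX 9905

/* Returns the hinted delay in milliseconds for an extended delay
 * status (10^x for 990x), or 0 for any other status, including the
 * plain busy status -2, which carries no delay hint. */
static unsigned int delay_hint_ms(int status)
{
	unsigned int ms = 0;

	if (status >= EXT_DELAY_MIN && status <= EXT_DELAY_MAX) {
		int order = status - EXT_DELAY_MIN;

		for (ms = 1; order > 0; order--)
			ms *= 10;
	}

	/* The largest hint in the assumed range is 10^5 ms. */
	assert(ms <= 100000);
	return ms;
}
```

Under this reading, 9900 and 9901 hint at 1 ms and 10 ms, which is the kind of short wait where the choice between spinning, yielding, and sleeping matters most.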



[PATCH] powerpc/pseries/dlpar: use rtas_get_sensor()

2021-05-03 Thread Nathan Lynch
Instead of making bare calls to get-sensor-state, use
rtas_get_sensor(), which correctly handles busy and extended delay
statuses.

Fixes: ab519a011caa ("powerpc/pseries: Kernel DLPAR Infrastructure")
Signed-off-by: Nathan Lynch 
---
 arch/powerpc/platforms/pseries/dlpar.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 3ac70790ec7a..b1f01ac0c29e 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -289,8 +289,7 @@ int dlpar_acquire_drc(u32 drc_index)
 {
int dr_status, rc;
 
-   rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
-  DR_ENTITY_SENSE, drc_index);
+   rc = rtas_get_sensor(DR_ENTITY_SENSE, drc_index, &dr_status);
if (rc || dr_status != DR_ENTITY_UNUSABLE)
return -1;
 
@@ -311,8 +310,7 @@ int dlpar_release_drc(u32 drc_index)
 {
int dr_status, rc;
 
-   rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
-  DR_ENTITY_SENSE, drc_index);
+   rc = rtas_get_sensor(DR_ENTITY_SENSE, drc_index, &dr_status);
if (rc || dr_status != DR_ENTITY_PRESENT)
return -1;
 
@@ -333,8 +331,7 @@ int dlpar_unisolate_drc(u32 drc_index)
 {
int dr_status, rc;
 
-   rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
-   DR_ENTITY_SENSE, drc_index);
+   rc = rtas_get_sensor(DR_ENTITY_SENSE, drc_index, &dr_status);
if (rc || dr_status != DR_ENTITY_PRESENT)
return -1;
 
-- 
2.30.2



[powerpc:next] BUILD SUCCESS 562d1e207d322e6346e8db91bbd11d94f16427d2

2021-05-03 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
branch HEAD: 562d1e207d322e6346e8db91bbd11d94f16427d2  powerpc/powernv: remove the nvlink support

elapsed time: 726m

configs tested: 44
configs skipped: 78

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
mips     tb0219_defconfig
mips     ip27_defconfig
sh       apsh4ad0a_defconfig
riscv    allnoconfig
arm      neponset_defconfig
arm      pxa_defconfig
arm      clps711x_defconfig
m68k     m5475evb_defconfig
mips     loongson1c_defconfig
arm      exynos_defconfig
arm      defconfig
nios2    defconfig
arc      allyesconfig
nds32    allnoconfig
parisc   defconfig
s390     allyesconfig
s390     allmodconfig
parisc   allyesconfig
s390     defconfig
i386     allyesconfig
sparc    allyesconfig
sparc    defconfig
i386     defconfig
mips     allyesconfig
mips     allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
i386     randconfig-a003-20210503
i386     randconfig-a006-20210503
i386     randconfig-a001-20210503
i386     randconfig-a005-20210503
i386     randconfig-a004-20210503
i386     randconfig-a002-20210503
um       allmodconfig
um       allnoconfig
um       allyesconfig
um       defconfig

clang tested configs:
x86_64   randconfig-a014-20210503
x86_64   randconfig-a015-20210503
x86_64   randconfig-a012-20210503
x86_64   randconfig-a011-20210503
x86_64   randconfig-a013-20210503
x86_64   randconfig-a016-20210503

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[powerpc:merge] BUILD SUCCESS 134b5c8a49b594ff6cfb4ea1a92400bb382b46d2

2021-05-03 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 134b5c8a49b594ff6cfb4ea1a92400bb382b46d2  Automatic merge of 'master' into merge (2021-05-02 23:37)

elapsed time: 2159m

configs tested: 145
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm      defconfig
arm64    allyesconfig
arm64    defconfig
arm      allyesconfig
arm      allmodconfig
riscv    allmodconfig
riscv    allyesconfig
mips     tb0219_defconfig
mips     ip27_defconfig
sh       apsh4ad0a_defconfig
riscv    allnoconfig
arm      neponset_defconfig
sh       rsk7203_defconfig
sh       sh7724_generic_defconfig
m68k     amiga_defconfig
ia64     zx1_defconfig
mips     lemote2f_defconfig
powerpc  xes_mpc85xx_defconfig
powerpc  warp_defconfig
xtensa   defconfig
mips     e55_defconfig
powerpc  mvme5100_defconfig
arm      pxa255-idp_defconfig
arm      pxa_defconfig
arm      clps711x_defconfig
m68k     m5475evb_defconfig
mips     loongson1c_defconfig
arm      exynos_defconfig
sh       polaris_defconfig
powerpc  cm5200_defconfig
sparc64  alldefconfig
powerpc  mpc7448_hpc2_defconfig
powerpc  kmeter1_defconfig
arc      allyesconfig
arm      lart_defconfig
powerpc  ep8248e_defconfig
arm      multi_v5_defconfig
arm      pxa910_defconfig
m68k     multi_defconfig
um       x86_64_defconfig
mips     omega2p_defconfig
mips     pistachio_defconfig
xtensa   common_defconfig
sh       se7619_defconfig
arm      pxa3xx_defconfig
arc      vdk_hs38_defconfig
arm      iop32x_defconfig
sh       ecovec24_defconfig
nds32    alldefconfig
i386     defconfig
arm      assabet_defconfig
arm      colibri_pxa270_defconfig
arm      spear3xx_defconfig
ia64     allmodconfig
ia64     defconfig
ia64     allyesconfig
m68k     allmodconfig
m68k     defconfig
m68k     allyesconfig
nios2    defconfig
nds32    allnoconfig
nds32    defconfig
nios2    allyesconfig
csky     defconfig
alpha    defconfig
alpha    allyesconfig
xtensa   allyesconfig
h8300    allyesconfig
arc      defconfig
sh       allmodconfig
parisc   defconfig
s390     allyesconfig
s390     allmodconfig
parisc   allyesconfig
s390     defconfig
i386     allyesconfig
sparc    allyesconfig
sparc    defconfig
mips     allyesconfig
mips     allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
x86_64   randconfig-a001-20210503
x86_64   randconfig-a005-20210503
x86_64   randconfig-a003-20210503
x86_64   randconfig-a002-20210503
x86_64   randconfig-a006-20210503
x86_64   randconfig-a004-20210503
i386     randconfig-a003-20210503
i386     randconfig-a006-20210503
i386     randconfig-a001-20210503
i386     randconfig-a005-20210503
i386     randconfig-a004-20210503
i386     randconfig-a002-20210503
i386     randconfig-a003-20210502
i386     randconfig-a006-20210502
i386     randconfig-a001-20210502
i386     randconfig-a005-20210502
i386     randconfig-a004-20210502
i386

[powerpc:next-test] BUILD SUCCESS 7905dafdefe9f1238a3ca2795cf975b311b5a5f6

2021-05-03 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
branch HEAD: 7905dafdefe9f1238a3ca2795cf975b311b5a5f6  powerpc/pseries: warn if recursing into the hcall tracing code

elapsed time: 2157m

configs tested: 108
configs skipped: 98

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm      defconfig
arm64    allyesconfig
arm64    defconfig
arm      allyesconfig
arm      allmodconfig
riscv    allyesconfig
mips     tb0219_defconfig
mips     ip27_defconfig
sh       apsh4ad0a_defconfig
riscv    allnoconfig
arm      neponset_defconfig
powerpc  xes_mpc85xx_defconfig
powerpc  warp_defconfig
xtensa   defconfig
mips     e55_defconfig
arm      pxa_defconfig
arm      clps711x_defconfig
m68k     m5475evb_defconfig
mips     loongson1c_defconfig
arm      exynos_defconfig
sh       polaris_defconfig
powerpc  cm5200_defconfig
sparc64  alldefconfig
powerpc  mpc7448_hpc2_defconfig
powerpc  kmeter1_defconfig
arc      allyesconfig
arm      lart_defconfig
powerpc  ep8248e_defconfig
arm      multi_v5_defconfig
arm      pxa910_defconfig
m68k     multi_defconfig
ia64     allmodconfig
ia64     defconfig
ia64     allyesconfig
m68k     allmodconfig
m68k     defconfig
m68k     allyesconfig
nios2    defconfig
nds32    allnoconfig
nds32    defconfig
nios2    allyesconfig
csky     defconfig
alpha    defconfig
alpha    allyesconfig
xtensa   allyesconfig
h8300    allyesconfig
arc      defconfig
sh       allmodconfig
parisc   defconfig
s390     allyesconfig
s390     allmodconfig
parisc   allyesconfig
s390     defconfig
i386     allyesconfig
sparc    allyesconfig
sparc    defconfig
i386     defconfig
mips     allyesconfig
mips     allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
i386     randconfig-a003-20210503
i386     randconfig-a006-20210503
i386     randconfig-a001-20210503
i386     randconfig-a005-20210503
i386     randconfig-a004-20210503
i386     randconfig-a002-20210503
i386     randconfig-a003-20210502
i386     randconfig-a006-20210502
i386     randconfig-a001-20210502
i386     randconfig-a005-20210502
i386     randconfig-a004-20210502
i386     randconfig-a002-20210502
x86_64   randconfig-a014-20210502
x86_64   randconfig-a015-20210502
x86_64   randconfig-a012-20210502
x86_64   randconfig-a011-20210502
x86_64   randconfig-a013-20210502
x86_64   randconfig-a016-20210502
i386     randconfig-a013-20210502
i386     randconfig-a015-20210502
i386     randconfig-a016-20210502
i386     randconfig-a014-20210502
i386     randconfig-a011-20210502
i386     randconfig-a012-20210502
um       allmodconfig
um       allnoconfig
um       allyesconfig
um       defconfig
x86_64   allyesconfig
x86_64   rhel-8.3-kselftests
x86_64   defconfig
x86_64   rhel-8.3
x86_64   rhel-8.3-kbuiltin
x86_64   kexec

clang tested configs:
x86_64   randconfig-a001-20210502
x86_64   randconfig-a005-20210502
x86_64   randconfig-a003-20210502
x86_64

Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-03 Thread Masahiro Yamada
On Mon, May 3, 2021 at 3:17 PM Christophe Leroy wrote:
>
>
>
> On 01/05/2021 at 17:15, Masahiro Yamada wrote:
> > The current minimum GCC version is 4.9 except ARCH=arm64 requiring
> > GCC 5.1.
> >
> > When we discussed last time, we agreed to raise the minimum GCC version
> > to 5.1 globally. [1]
> >
> > I'd like to propose GCC 5.2 to clean up arch/powerpc/Kconfig as well.
>
> One point I missed when I first saw your patch, but realised during
> the discussion:
>
> Up to 4.9, GCC was numbered with 3 digits: we had 4.8.0, 4.8.1, ... 4.8.5,
> 4.9.0, 4.9.1,  4.9.4
>
> Then starting at 5, GCC switched to a 2-digit scheme, with 5.0, 5.1, 5.2,
> ... 5.5
>
> So it is not GCC 5.1 or 5.2 that you should target, but only GCC 5.
> Then it is up to the user to use the latest available version of GCC 5,
> which is 5.5 at the time being, just like the user would have selected
> 4.9.4 when 4.9 was the minimum GCC version.
>
> Christophe



One line below in Documentation/process/changes.rst,
I see

 Clang/LLVM (optional)  10.0.1   clang --version



Clang 10.0.1 is a bug-fix release of Clang 10.


I do not think GCC 5.2 is strange when we
want to exclude the initial release of GCC 5.





-- 
Best Regards
Masahiro Yamada


Re: [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks

2021-05-03 Thread David Gibson
On Fri, Apr 30, 2021 at 09:09:16AM -0300, Daniel Henrique Barboza wrote:
> dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed
> by checking !DRCONF_MEM_RESERVED, and in the following loop before
> dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before
> removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and
> !DRCONF_MEM_ASSIGNED will be counted as valid, but then not be
> removed.  The function will end up not removing all 'lmbs_to_remove'
> LMBs while also not reporting any errors.
> 
> Comparing it to dlpar_memory_remove_by_count(), the validation is done
> via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump
> constraints. No additional check is made afterwards, and
> DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The
> function doesn't have the same 'check A for validation, then B for
> removal' issue as remove_by_ic(), but it's not checking if the LMB is
> reserved.
> 
> There is no reason for these functions to validate the same operation in
> two different manners.

Actually, I think there is: remove_by_ic() is handling a request to
remove a specific range of LMBs.  If any are reserved, they can't be
removed and so this needs to fail.  But if they are !ASSIGNED, that
essentially means they're *already* removed (or never added), so
"removing" them is, correctly, a no-op.

remove_by_count(), in contrast, is being asked to remove a fixed
number of LMBs from wherever they can be found, and for that it needs
to find LMBs that haven't already been removed.

Basically remove_by_ic() is an absolute request: "make this set of
LMBs be not-plugged", whereas remove_by_count() is a relative request:
"make N fewer LMBs be plugged".


So I think remove_by_ic()'s existing handling is correct.  I'm less
sure if remove_by_count() ignoring RESERVED is correct - I couldn't
quickly find under what circumstances RESERVED gets set.
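
The distinction between the absolute and the relative request can be sketched with a toy model. This is an illustration only and not the hotplug-memory.c code: the flag values, `struct lmb`, `remove_range()` and `remove_count()` are all hypothetical, and `remove_count()` deliberately ignores the reserved flag, mirroring the pre-patch behaviour questioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag bits for this sketch only. */
#define MEM_ASSIGNED 0x1u
#define MEM_RESERVED 0x2u

struct lmb {
	unsigned int flags;
};

/* Absolute request: make every LMB in [start, end) not plugged.
 * Fails with -1, changing nothing, if any LMB in the range is
 * reserved. LMBs that are already unassigned are a no-op.
 * Returns how many LMBs were actually unplugged. */
static int remove_range(struct lmb *lmbs, size_t start, size_t end)
{
	int removed = 0;
	size_t i;

	assert(start <= end);

	for (i = start; i < end; i++)
		if (lmbs[i].flags & MEM_RESERVED)
			return -1;

	for (i = start; i < end; i++) {
		if (!(lmbs[i].flags & MEM_ASSIGNED))
			continue;	/* already not plugged */
		lmbs[i].flags &= ~MEM_ASSIGNED;
		removed++;
	}

	return removed;
}

/* Relative request: unplug exactly `count` LMBs, taken from wherever
 * assigned ones can be found. Fails with -1, changing nothing, if
 * fewer than `count` assigned LMBs exist. */
static int remove_count(struct lmb *lmbs, size_t n, int count)
{
	int available = 0, removed = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (lmbs[i].flags & MEM_ASSIGNED)
			available++;

	if (available < count)
		return -1;

	for (i = 0; i < n && removed < count; i++) {
		if (!(lmbs[i].flags & MEM_ASSIGNED))
			continue;
		lmbs[i].flags &= ~MEM_ASSIGNED;
		removed++;
	}

	return removed;
}
```

In this model an unassigned LMB inside the requested range does not make `remove_range()` fail, whereas `remove_count()` has to skip it and look elsewhere, which is why the two functions can reasonably validate differently.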


> This patch addresses that by changing
> lmb_is_removable() to also check for DRCONF_MEM_RESERVED to tell if a
> lmb is removable, making dlpar_memory_remove_by_count() take the
> reservation state into account when counting the LMBs.
> lmb_is_removable() is then used in the validation step of
> dlpar_memory_remove_by_ic(), which is already checking for both states
> but in different stages, to avoid counting a LMB that is not assigned as
> eligible for removal. We can then skip the check before
> dlpar_remove_lmb() since we're validating all LMBs beforehand.
> 
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  arch/powerpc/platforms/pseries/hotplug-memory.c | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index bb98574a84a2..4e6d162c3f1a 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -348,7 +348,8 @@ static int pseries_remove_mem_node(struct device_node *np)
>  
>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>  {
> - if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
> + if ((lmb->flags & DRCONF_MEM_RESERVED) ||
> + !(lmb->flags & DRCONF_MEM_ASSIGNED))
>   return false;
>  
>  #ifdef CONFIG_FA_DUMP
> @@ -523,7 +524,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
>  
>   /* Validate that there are enough LMBs to satisfy the request */
>   for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
> - if (lmb->flags & DRCONF_MEM_RESERVED)
> + if (!lmb_is_removable(lmb))
>   break;
>  
>   lmbs_available++;
> @@ -533,9 +534,6 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
>   return -EINVAL;
>  
>   for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
> - if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
> - continue;
> -
>   rc = dlpar_remove_lmb(lmb);
>   if (rc)
>   break;

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH 1/3] powerpc/pseries: Set UNISOLATE on dlpar_memory_remove_by_ic() error

2021-05-03 Thread David Gibson
On Fri, Apr 30, 2021 at 09:09:15AM -0300, Daniel Henrique Barboza wrote:
> As previously done in dlpar_cpu_remove() for CPUs, this patch changes
> dlpar_memory_remove_by_ic() to unisolate the LMB DRC when the LMB
> fails to be removed. The hypervisor, seeing an LMB DRC that was supposed
> to be removed being unisolated instead, can do error recovery on its
> side.
> 
> This change is done in dlpar_memory_remove_by_ic() only because, as of
> today, only QEMU is using this code path for error recovery (via the
> PSERIES_HP_ELOG_ID_DRC_IC event). phyp treats it as a no-op.
> 
> Signed-off-by: Daniel Henrique Barboza 

Reviewed-by: David Gibson 

> ---
>  arch/powerpc/platforms/pseries/hotplug-memory.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index 8377f1f7c78e..bb98574a84a2 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -551,6 +551,13 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
>   if (!drmem_lmb_reserved(lmb))
>   continue;
>  
> + /*
> +  * Setting the isolation state of an UNISOLATED/CONFIGURED
> +  * device to UNISOLATE is a no-op, but the hypervisor can
> +  * use it as a hint that the LMB removal failed.
> +  */
> + dlpar_unisolate_drc(lmb->drc_index);
> +
>   rc = dlpar_add_lmb(lmb);
>   if (rc)
>   pr_err("Failed to add LMB, drc index %x\n",

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH 4/4] powerpc/powernv: Remove POWER9 PVR version check for entry and uaccess flushes

2021-05-03 Thread Joel Stanley
On Mon, 3 May 2021 at 13:04, Nicholas Piggin  wrote:
>
> These aren't necessarily POWER9 only, and it's not to say some new
> vulnerability may not get discovered on other processors for which
> we would like the flexibility of having the workaround enabled by
> firmware.
>
> Remove the restriction that they only apply to POWER9.

I was wondering how these worked which led me to reviewing your patch.
From what I could see, these are enabled by default (SEC_FTR_DEFAULT
in arch/powerpc/include/asm/security_features.h), so unless all
non-POWER9 machines have set the "please don't" bit in their firmware
this patch will enable the feature for those machines. Is that what
you wanted?
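
The concern can be illustrated with a small model of "on by default unless firmware opts out". This is a sketch under assumptions, not the powernv setup code: the bit values, `FTR_DEFAULT`, and `resolve_features()` with its `apply_pvr_check` switch are hypothetical and only mimic the before and after behaviour of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative feature bits; both are part of the default set. */
#define FTR_FLUSH_ENTRY   0x1ul
#define FTR_FLUSH_UACCESS 0x2ul
#define FTR_DEFAULT       (FTR_FLUSH_ENTRY | FTR_FLUSH_UACCESS)

/* Start from the defaults, clear whatever firmware asked to disable,
 * and optionally apply the old "not POWER9, so clear both" check. */
static unsigned long resolve_features(unsigned long fw_disable,
				      bool is_power9, bool apply_pvr_check)
{
	unsigned long ftrs = FTR_DEFAULT;

	/* Firmware may only disable bits this model knows about. */
	assert((fw_disable & ~FTR_DEFAULT) == 0);

	ftrs &= ~fw_disable;

	if (apply_pvr_check && !is_power9)
		ftrs &= ~(FTR_FLUSH_ENTRY | FTR_FLUSH_UACCESS);

	return ftrs;
}
```

In this model a non-POWER9 machine whose firmware disables nothing ends up with both flushes cleared when the check is applied and with both enabled once the check is removed, which is the behaviour change being asked about.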

>
> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/platforms/powernv/setup.c | 9 -
>  1 file changed, 9 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
> index a8db3f153063..6ec67223f8c7 100644
> --- a/arch/powerpc/platforms/powernv/setup.c
> +++ b/arch/powerpc/platforms/powernv/setup.c
> @@ -122,15 +122,6 @@ static void pnv_setup_security_mitigations(void)
> type = L1D_FLUSH_ORI;
> }
>
> -   /*
> -* If we are non-Power9 bare metal, we don't need to flush on kernel
> -* entry or after user access: they fix a P9 specific vulnerability.
> -*/
> -   if (!pvr_version_is(PVR_POWER9)) {
> -   security_ftr_clear(SEC_FTR_L1D_FLUSH_ENTRY);
> -   security_ftr_clear(SEC_FTR_L1D_FLUSH_UACCESS);
> -   }
> -
> enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && \
>  (security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR)   || \
>   security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV));
> --
> 2.23.0
>


Re: [PATCH] ibmvnic: remove default label from to_string switch

2021-05-03 Thread David Miller
From: Lijun Pan 
Date: Mon, 3 May 2021 13:21:00 -0500

> On Mon, May 3, 2021 at 5:54 AM Michal Suchanek  wrote:
>>
>> This way the compiler warns when a new value is added to the enum but
>> not the string transation like:
> 
> s/transation/translation/
> 
> This trick works.
> Since the original code does not generate gcc warnings/errors, should
> this patch be sent to net-next as an improvement?

Yes.
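
For reference, the trick under discussion looks roughly like this in isolation. The enum and function below are made-up names for illustration, not ibmvnic code. Because the switch has no default label, GCC's -Wswitch (enabled by -Wall) can report an enumerator that is added later without a matching case, while the return after the switch still covers out-of-range values at run time.

```c
/* Hypothetical enum standing in for a driver state enum. */
enum link_state {
	LINK_DOWN,
	LINK_UP,
	LINK_TESTING,
};

static const char *link_state_to_string(enum link_state state)
{
	/* No default label: every enumerator needs its own case, so a
	 * newly added enumerator without one triggers -Wswitch. */
	switch (state) {
	case LINK_DOWN:
		return "DOWN";
	case LINK_UP:
		return "UP";
	case LINK_TESTING:
		return "TESTING";
	}

	/* Reached only for values that are not enumerators. */
	return "UNKNOWN";
}
```

With a default label in place, the same missing case would be silently translated to "UNKNOWN" and the compiler would stay quiet.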


Re: [PATCH v3] pseries/drmem: update LMBs after LPM

2021-05-03 Thread Tyrel Datwyler
On 5/3/21 10:28 AM, Laurent Dufour wrote:
> On 01/05/2021 at 01:58, Tyrel Datwyler wrote:
>> On 4/30/21 9:13 AM, Laurent Dufour wrote:
>>> On 29/04/2021 at 21:12, Tyrel Datwyler wrote:
>>>> On 4/29/21 3:27 AM, Aneesh Kumar K.V wrote:
>>>>> Laurent Dufour  writes:
>>>>>

Snip

>>
>> As of today I don't have a problem with your patch. This was more of me
>> pointing out things that I think are currently wrong with our memory
>> hotplug implementation, and that we need to take a long hard look at it
>> down the road.
> 
> I do agree, there is a lot of odd things there to address in this area.
> If you're ok with that patch, do you mind to add a reviewed-by?
> 

Can you send a v4 with the fix for the duplicate update included?

-Tyrel


Re: [RFC] powerpc/pseries: delete scanlog

2021-05-03 Thread Tyrel Datwyler
On 5/3/21 10:18 AM, Nathan Lynch wrote:
> A commit from 2008 says this driver was relevant only for "older
> systems", and currently supported hardware doesn't have this
> facility. Get rid of it.

The only references I could find to scan log dump support are several Power 4+
systems, in particular the IntelliStation POWER 9114 and pSeries 615, which were
released in 2003 at the same time this code was originally introduced.

Historical Linux commit from February 2003:
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=f92e361842d5251e50562b09664082dcbd0548bb

IntelliStation and pSeries docs:
http://ps-2.retropc.se/basil.holloway/ALL%20PDF/380635.pdf
http://ps-2.kev009.com/rs6000/manuals/p/p615-6C3-6E3/6C3_and_6E3_Users_Guide_SA38-0629.pdf

Current firmware RTAS implementations have no reference to ibm,scan-log-dump,
and a long standing developer for that code has no recollection of its
existence.

This appears to be a straggler from RPA and Power 4 days. Based on my
understanding that we dropped support for Power 4 in mainline, this looks
pretty orphaned to me and a solid candidate for removal, barring any insight
from someone else who knows better.

+1

Feel free to add my RB tag to any non-RFC followup.

Reviewed-by: Tyrel Datwyler 

> 
> Signed-off-by: Nathan Lynch 
> ---
>  arch/powerpc/configs/ppc64_defconfig |   1 -
>  arch/powerpc/configs/pseries_defconfig   |   1 -
>  arch/powerpc/platforms/pseries/Kconfig   |   4 -
>  arch/powerpc/platforms/pseries/Makefile  |   1 -
>  arch/powerpc/platforms/pseries/scanlog.c | 195 ---
>  5 files changed, 202 deletions(-)
>  delete mode 100644 arch/powerpc/platforms/pseries/scanlog.c
> 
> diff --git a/arch/powerpc/configs/ppc64_defconfig b/arch/powerpc/configs/ppc64_defconfig
> index 701811c91a6f..acf13b4917c4 100644
> --- a/arch/powerpc/configs/ppc64_defconfig
> +++ b/arch/powerpc/configs/ppc64_defconfig
> @@ -26,7 +26,6 @@ CONFIG_PPC64=y
>  CONFIG_NR_CPUS=2048
>  CONFIG_PPC_SPLPAR=y
>  CONFIG_DTL=y
> -CONFIG_SCANLOG=m
>  CONFIG_PPC_SMLPAR=y
>  CONFIG_IBMEBUS=y
>  CONFIG_PPC_SVM=y
> diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig
> index 50168dde4ea5..d120321e4eea 100644
> --- a/arch/powerpc/configs/pseries_defconfig
> +++ b/arch/powerpc/configs/pseries_defconfig
> @@ -38,7 +38,6 @@ CONFIG_MODULE_SRCVERSION_ALL=y
>  CONFIG_PARTITION_ADVANCED=y
>  CONFIG_PPC_SPLPAR=y
>  CONFIG_DTL=y
> -CONFIG_SCANLOG=m
>  CONFIG_PPC_SMLPAR=y
>  CONFIG_IBMEBUS=y
>  CONFIG_PAPR_SCM=m
> diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
> index 5e037df2a3a1..bf9b612a929b 100644
> --- a/arch/powerpc/platforms/pseries/Kconfig
> +++ b/arch/powerpc/platforms/pseries/Kconfig
> @@ -61,10 +61,6 @@ config PSERIES_ENERGY
> Provides: /sys/devices/system/cpu/pseries_(de)activation_hint_list
> and /sys/devices/system/cpu/cpuN/pseries_(de)activation_hint
> 
> -config SCANLOG
> - tristate "Scanlog dump interface"
> - depends on RTAS_PROC && PPC_PSERIES
> -
>  config IO_EVENT_IRQ
>   bool "IO Event Interrupt support"
>   depends on PPC_PSERIES
> diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
> index c8a2b0b05ac0..754d1102de08 100644
> --- a/arch/powerpc/platforms/pseries/Makefile
> +++ b/arch/powerpc/platforms/pseries/Makefile
> @@ -8,7 +8,6 @@ obj-y := lpar.o hvCall.o nvram.o reconfig.o \
>  firmware.o power.o dlpar.o mobility.o rng.o \
>  pci.o pci_dlpar.o eeh_pseries.o msi.o
>  obj-$(CONFIG_SMP)+= smp.o
> -obj-$(CONFIG_SCANLOG)+= scanlog.o
>  obj-$(CONFIG_KEXEC_CORE) += kexec.o
>  obj-$(CONFIG_PSERIES_ENERGY) += pseries_energy.o
> 
> diff --git a/arch/powerpc/platforms/pseries/scanlog.c b/arch/powerpc/platforms/pseries/scanlog.c
> deleted file mode 100644
> index 2879c4f0ceb7..
> --- a/arch/powerpc/platforms/pseries/scanlog.c
> +++ /dev/null
> @@ -1,195 +0,0 @@
> -// SPDX-License-Identifier: GPL-2.0-or-later
> -/*
> - *  c 2001 PPC 64 Team, IBM Corp
> - *
> - * scan-log-data driver for PPC64  Todd Inglett 
> - *
> - * When ppc64 hardware fails the service processor dumps internal state
> - * of the system.  After a reboot the operating system can access a dump
> - * of this data using this driver.  A dump exists if the device-tree
> - * /chosen/ibm,scan-log-data property exists.
> - *
> - * This driver exports /proc/powerpc/scan-log-dump which can be read.
> - * The driver supports only sequential reads.
> - *
> - * The driver looks at a write to the driver for the single word "reset".
> - * If given, the driver will reset the scanlog so the platform can free it.
> - */
> -
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -
> -#define MODULE_VERS "1.0"
> -#define MODULE_NAME "scanlog"
> -
> -/* 

Re: [PATCH v2] powerpc/64: BE option to use ELFv2 ABI for big endian kernels

2021-05-03 Thread Michal Suchánek
On Mon, May 03, 2021 at 11:34:25AM +0200, Michal Suchánek wrote:
> On Mon, May 03, 2021 at 09:11:16AM +0200, Michal Suchánek wrote:
> > On Mon, May 03, 2021 at 10:58:33AM +1000, Nicholas Piggin wrote:
> > > Excerpts from Michal Suchánek's message of May 3, 2021 2:57 am:
> > > > On Tue, Apr 28, 2020 at 09:25:17PM +1000, Nicholas Piggin wrote:
> > > >> Provide an option to use ELFv2 ABI for big endian builds. This works on
> > > >> GCC and clang (since 2014). It is less well tested and supported by the
> > > >> GNU toolchain, but it can give some useful advantages of the ELFv2 ABI
> > > >> for BE (e.g., less stack usage). Some distros even build BE ELFv2
> > > >> userspace.
> > > > 
> > > > Fixes BTFID failure on BE for me and the ELF ABIv2 kernel boots.
> > > 
> > > What's the BTFID failure? Anything we can do to fix it on the v1 ABI
> > > or at least make it depend on BUILD_ELF_V2?
> > 
> > Looks like symbols are prefixed with a dot in ABIv1 and BTFID tool is
> > not aware of that. It can be disabled on ABIv1 easily.
> > 
> > Thanks
> > 
> > Michal
> > 
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 678c13967580..e703c26e9b80 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -305,6 +305,7 @@ config DEBUG_INFO_BTF
> > bool "Generate BTF typeinfo"
> > depends on !DEBUG_INFO_SPLIT && !DEBUG_INFO_REDUCED
> > depends on !GCC_PLUGIN_RANDSTRUCT || COMPILE_TEST
> > +   depends on !PPC64 || BUILD_ELF_V2
> > help
> >   Generate deduplicated BTF type information from DWARF debug info.
> >   Turning this on expects presence of pahole tool, which will convert
> > 
> > > 
> > > > 
> > > > Tested-by: Michal Suchánek 
> > > > 
> > > > Also can we enable mprofile on BE now?
> > > > 
> > > > I don't see anything endian-specific in the mprofile code at a glance
> > > > but don't have any idea how to test it.
> > > 
> > > AFAIK it's just a different ABI for the _mcount call so just running
> > > some ftrace and ftrace with call graph should test it reasonably well.
> 
> It does not crash and burn but there are some regressions from LE to BE
> on the ftrace kernel selftest:
> 
> --- ftraceLE.txt  2021-05-03 11:19:14.83000 +0200
> +++ ftraceBE.txt  2021-05-03 11:27:24.77000 +0200
> @@ -7,8 +7,8 @@
>  [n] Change the ringbuffer size   [PASS]
>  [n] Snapshot and tracing setting [PASS]
>  [n] trace_pipe and trace_marker  [PASS]
> -[n] Test ftrace direct functions against tracers [UNRESOLVED]
> -[n] Test ftrace direct functions against kprobes [UNRESOLVED]
> +[n] Test ftrace direct functions against tracers [FAIL]
> +[n] Test ftrace direct functions against kprobes [FAIL]
>  [n] Generic dynamic event - add/remove kprobe events [PASS]
>  [n] Generic dynamic event - add/remove synthetic events  [PASS]
>  [n] Generic dynamic event - selective clear (compatibility)  [PASS]
> @@ -16,10 +16,10 @@
>  [n] event tracing - enable/disable with event level files[PASS]
>  [n] event tracing - restricts events based on pid notrace filtering  [PASS]
>  [n] event tracing - restricts events based on pid[PASS]
> -[n] event tracing - enable/disable with subsystem level files[PASS]
> +[n] event tracing - enable/disable with subsystem level files[FAIL]
>  [n] event tracing - enable/disable with top level files  [PASS]
> -[n] Test trace_printk from module[UNRESOLVED]
> -[n] ftrace - function graph filters with stack tracer[PASS]
> +[n] Test trace_printk from module[FAIL]
> +[n] ftrace - function graph filters with stack tracer[FAIL]
>  [n] ftrace - function graph filters  [PASS]
>  [n] ftrace - function trace with cpumask [PASS]
>  [n] ftrace - test for function event triggers[PASS]
> @@ -27,7 +27,7 @@
>  [n] ftrace - function pid notrace filters[PASS]
>  [n] ftrace - function pid filters[PASS]
>  [n] ftrace - stacktrace filter command   [PASS]
> -[n] ftrace - function trace on module[UNRESOLVED]
> +[n] ftrace - function trace on module[FAIL]
>  [n] ftrace - function profiler with function tracing [PASS]
>  [n] ftrace - function profiling  [PASS]
>  [n] ftrace - test reading of set_ftrace_filter   [PASS]
> @@ -44,10 +44,10 @@
>  [n] Kprobe event argument syntax [PASS]
>  [n] Kprobe dynamic event with arguments  [PASS]
>  [n] Kprobes event arguments with types   [PASS]
> -[n] Kprobe event user-memory access  [UNSUPPORTED]
> +[n] Kprobe event user-memory access  [FAIL]
>  [n] Kprobe event auto/manual naming  [PASS]
>  [n] Kprobe dynamic event with function tracer[PASS]
> -[n] Kprobe dynamic event - probing module[UNRESOLVED]
> +[n] Kprobe dynamic event - probing module[FAIL]
>  [n] Create/delete multiprobe on kprobe event [PASS]
>  [n] Kprobe event parser error log check  [PASS]
>  [n] Kretprobe dynamic event with arguments   [PASS]
> @@ -57,11 +57,11 @@
>  [n] Kprobe events - probe points [PASS]
>  [n] Kprobe 

Re: [PATCH] ibmvnic: remove default label from to_string switch

2021-05-03 Thread Lijun Pan
On Mon, May 3, 2021 at 5:54 AM Michal Suchanek  wrote:
>
> This way the compiler warns when a new value is added to the enum but
> not the string transation like:

s/transation/translation/

This trick works.
Since the original code does not generate gcc warnings/errors, should
this patch be sent to net-next as an improvement?

>
> drivers/net/ethernet/ibm/ibmvnic.c: In function 'adapter_state_to_string':
> drivers/net/ethernet/ibm/ibmvnic.c:832:2: warning: enumeration value 
> 'VNIC_FOOBAR' not handled in switch [-Wswitch]
>   switch (state) {
>   ^~
> drivers/net/ethernet/ibm/ibmvnic.c: In function 'reset_reason_to_string':
> drivers/net/ethernet/ibm/ibmvnic.c:1935:2: warning: enumeration value 
> 'VNIC_RESET_FOOBAR' not handled in switch [-Wswitch]
>   switch (reason) {
>   ^~
>
> Signed-off-by: Michal Suchanek 
> ---

Acked-by: Lijun Pan 

>  drivers/net/ethernet/ibm/ibmvnic.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
> b/drivers/net/ethernet/ibm/ibmvnic.c
> index 5788bb956d73..4d439413f6d9 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -846,9 +846,8 @@ static const char *adapter_state_to_string(enum 
> vnic_state state)
> return "REMOVING";
> case VNIC_REMOVED:
> return "REMOVED";
> -   default:
> -   return "UNKNOWN";
> }
> +   return "UNKNOWN";
>  }
>
>  static int ibmvnic_login(struct net_device *netdev)
> @@ -1946,9 +1945,8 @@ static const char *reset_reason_to_string(enum 
> ibmvnic_reset_reason reason)
> return "TIMEOUT";
> case VNIC_RESET_CHANGE_PARAM:
> return "CHANGE_PARAM";
> -   default:
> -   return "UNKNOWN";
> }
> +   return "UNKNOWN";
>  }
>
>  /*
> --
> 2.26.2
>
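
For readers following along, the pattern under review can be sketched in isolation. This is a minimal illustration, not the driver code: the enum `sketch_state` and the function `sketch_state_to_string` are hypothetical names. It assumes GCC's -Wswitch (enabled by -Wall), which warns about an unhandled enumerator only when the switch has no default label.

```c
/* Hypothetical enum for illustration; not the driver's vnic_state. */
enum sketch_state {
	SKETCH_PROBING,
	SKETCH_OPEN,
	SKETCH_REMOVED,
};

static const char *sketch_state_to_string(enum sketch_state state)
{
	/*
	 * No default label here: if an enumerator is added to the enum
	 * without a matching case, -Wswitch reports it at build time.
	 */
	switch (state) {
	case SKETCH_PROBING:
		return "PROBING";
	case SKETCH_OPEN:
		return "OPEN";
	case SKETCH_REMOVED:
		return "REMOVED";
	}
	/* Reached only for values that match no case label. */
	return "UNKNOWN";
}
```

Moving the fallback return after the switch keeps the runtime behaviour for out-of-range values while letting the compiler check the case list.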


Re: [PATCH 1/3] lib: early_string: allow early usage of some string functions

2021-05-03 Thread Daniel Walker
On Mon, May 03, 2021 at 11:01:41AM -0700, Daniel Walker wrote:
> On Sat, May 01, 2021 at 09:31:47AM +0200, Christophe Leroy wrote:
> > 
> > > In fact, should be like in prom_init today:
> > > 
> > > #ifdef __EARLY_STRING_ENABLED
> > >  if (dsize >= count)
> > >      return count;
> > > #else
> > >  BUG_ON(dsize >= count);
> > > #endif
> > 
> > Thinking about it once more, this BUG_ON() is overkill and should be
> > avoided, see https://www.kernel.org/doc/html/latest/process/deprecated.html
> > 
> > Therefore, something like the following would make it:
> > 
> > if (dsize >= count) {
> > WARN_ON(!__is_defined(__EARLY_STRING_ENABLED));
> > 
> > return count;
> > }
> 
> I agree, it's overkill to stop the system for this condition.
> 
> how about I do something more like this for my changes,
> 
> 
> > if (WARN_ON(dsize >= count && !__is_defined(__EARLY_STRING_ENABLED)))
> > return count;

I'll have to work on this one..

Daniel


Re: [PATCH 1/3] lib: early_string: allow early usage of some string functions

2021-05-03 Thread Daniel Walker
On Sat, May 01, 2021 at 09:31:47AM +0200, Christophe Leroy wrote:
> 
> > In fact, should be like in prom_init today:
> > 
> > #ifdef __EARLY_STRING_ENABLED
> >  if (dsize >= count)
> >      return count;
> > #else
> >  BUG_ON(dsize >= count);
> > #endif
> 
> Thinking about it once more, this BUG_ON() is overkill and should be
> avoided, see https://www.kernel.org/doc/html/latest/process/deprecated.html
> 
> Therefore, something like the following would make it:
> 
>   if (dsize >= count) {
>   WARN_ON(!__is_defined(__EARLY_STRING_ENABLED));
> 
>   return count;
>   }

I agree, it's overkill to stop the system for this condition.

how about I do something more like this for my changes,


>   if (WARN_ON(dsize >= count && !__is_defined(__EARLY_STRING_ENABLED)))
>   return count;

and for generic kernel,

>   if (WARN_ON(dsize >= count))
>   return count;



Daniel
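
As a rough user-space sketch of the "generic kernel" variant above: the function below follows the shape of a strlcat-style append, but warns and returns `count` instead of halting when the destination is not terminated within `count` bytes. `sketch_warn_on`, `sketch_warn_count` and `sketch_strlcat` are hypothetical names; `sketch_warn_on` only stands in for WARN_ON() by reporting and returning the condition.

```c
#include <stdio.h>
#include <string.h>

static int sketch_warn_count;

/* Hypothetical stand-in for WARN_ON(): report, then return the condition. */
static int sketch_warn_on(int cond, const char *what)
{
	if (cond) {
		sketch_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Append src to dest, where count is the full size of the dest buffer. */
static size_t sketch_strlcat(char *dest, const char *src, size_t count)
{
	size_t dsize = 0;
	size_t len = strlen(src);
	size_t res;

	/* Bounded scan so an unterminated dest is never read past count. */
	while (dsize < count && dest[dsize] != '\0')
		dsize++;
	res = dsize + len;

	/* Warn and bail out instead of stopping the system. */
	if (sketch_warn_on(dsize >= count, "dest not terminated within count"))
		return count;

	dest += dsize;
	count -= dsize;
	if (len >= count)
		len = count - 1;
	memcpy(dest, src, len);
	dest[len] = '\0';
	return res;
}
```

The return value on the normal path is the length the combined string would have had, so a caller can detect truncation by comparing it with the buffer size.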


[PATCH] powerpc/rtas-rtc: remove unused constant

2021-05-03 Thread Nathan Lynch
RTAS_CLOCK_BUSY is unused, remove it.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas-rtc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/rtas-rtc.c b/arch/powerpc/kernel/rtas-rtc.c
index a28239b8b0c0..33c07c8af6c8 100644
--- a/arch/powerpc/kernel/rtas-rtc.c
+++ b/arch/powerpc/kernel/rtas-rtc.c
@@ -12,7 +12,7 @@
 
 
 #define MAX_RTC_WAIT 5000  /* 5 sec */
-#define RTAS_CLOCK_BUSY (-2)
+
 time64_t __init rtas_get_boot_time(void)
 {
int ret[8];
-- 
2.30.2



Re: [PATCH v3] pseries/drmem: update LMBs after LPM

2021-05-03 Thread Laurent Dufour

On 01/05/2021 at 01:58, Tyrel Datwyler wrote:

On 4/30/21 9:13 AM, Laurent Dufour wrote:

On 29/04/2021 at 21:12, Tyrel Datwyler wrote:

On 4/29/21 3:27 AM, Aneesh Kumar K.V wrote:

Laurent Dufour  writes:


After an LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
updated by the hypervisor in the case the NUMA topology of the LPAR's
memory is updated.

This is caught by the kernel, but the memory's node is not updated because
there is no way to move a memory block between nodes.

If later a memory block is added or removed, drmem_update_dt() is called
and it is overwriting the DT node to match the added or removed LMB. But
the LMB's associativity node has not been updated after the DT node update
and thus the node is overwritten by the Linux's topology instead of the
hypervisor one.

Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is
updated to force an update of the LMB's associativity.

Cc: Tyrel Datwyler 
Signed-off-by: Laurent Dufour 
---

V3:
   - Check rd->dn->name instead of rd->dn->full_name
V2:
   - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of
   introducing a new hook mechanism.
---
   arch/powerpc/include/asm/drmem.h  |  1 +
   arch/powerpc/mm/drmem.c   | 35 +++
   .../platforms/pseries/hotplug-memory.c    |  4 +++
   3 files changed, 40 insertions(+)

diff --git a/arch/powerpc/include/asm/drmem.h
b/arch/powerpc/include/asm/drmem.h
index bf2402fed3e0..4265d5e95c2c 100644
--- a/arch/powerpc/include/asm/drmem.h
+++ b/arch/powerpc/include/asm/drmem.h
@@ -111,6 +111,7 @@ int drmem_update_dt(void);
   int __init
   walk_drmem_lmbs_early(unsigned long node, void *data,
     int (*func)(struct drmem_lmb *, const __be32 **, void *));
+void drmem_update_lmbs(struct property *prop);
   #endif
     static inline void invalidate_lmb_associativity_index(struct drmem_lmb
*lmb)
diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c
index 9af3832c9d8d..f0a6633132af 100644
--- a/arch/powerpc/mm/drmem.c
+++ b/arch/powerpc/mm/drmem.c
@@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node,
void *data,
   return ret;
   }
   +/*
+ * Update the LMB associativity index.
+ */
+static int update_lmb(struct drmem_lmb *updated_lmb,
+  __maybe_unused const __be32 **usm,
+  __maybe_unused void *data)
+{
+    struct drmem_lmb *lmb;
+
+    /*
+ * Brute force, there may be a better way to fetch the LMB
+ */
+    for_each_drmem_lmb(lmb) {
+    if (lmb->drc_index != updated_lmb->drc_index)
+    continue;
+
+    lmb->aa_index = updated_lmb->aa_index;
+    break;
+    }
+    return 0;
+}
+
+/*
+ * Update the LMB associativity index.
+ *
+ * This needs to be called when the hypervisor is updating the
+ * dynamic-reconfiguration-memory node property.
+ */
+void drmem_update_lmbs(struct property *prop)
+{
+    if (!strcmp(prop->name, "ibm,dynamic-memory"))
+    __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb);
+    else if (!strcmp(prop->name, "ibm,dynamic-memory-v2"))
+    __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb);
+}
   #endif
     static int init_drmem_lmb_size(struct device_node *dn)
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c
b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 8377f1f7c78e..672ffbee2e78 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct
notifier_block *nb,
   case OF_RECONFIG_DETACH_NODE:
   err = pseries_remove_mem_node(rd->dn);
   break;
+    case OF_RECONFIG_UPDATE_PROPERTY:
+    if (!strcmp(rd->dn->name,
+    "ibm,dynamic-reconfiguration-memory"))
+    drmem_update_lmbs(rd->prop);
   }
   return notifier_from_errno(err);
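
The core of update_lmb() in the patch above can be sketched outside the kernel as follows. This is an illustration under stated assumptions: `struct sketch_lmb` keeps only the two fields the patch touches, and a plain array stands in for the list that for_each_drmem_lmb() walks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct drmem_lmb. */
struct sketch_lmb {
	uint32_t drc_index;	/* identifies the LMB */
	uint32_t aa_index;	/* associativity array index */
};

/*
 * Copy the associativity index from an LMB parsed out of the updated
 * property into the matching in-memory LMB. Returns 1 if a match was
 * found, 0 otherwise. Linear search, as in the patch.
 */
static int sketch_update_lmb(struct sketch_lmb *lmbs, size_t nr,
			     const struct sketch_lmb *updated)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (lmbs[i].drc_index != updated->drc_index)
			continue;

		lmbs[i].aa_index = updated->aa_index;
		return 1;
	}
	return 0;
}
```

Only the matching entry's aa_index changes; every other field and every other LMB is left as it was.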


How will this interact with DLPAR memory? When we dlpar memory,
ibm,configure-connector is used to fetch the new associativity details
and set drmem_lmb->aa_index correctly there. Once that is done kernel
then call drmem_update_dt() which will result in the above notifier
callback?

IIUC, the call back then will update drmem_lmb->aa_index again?


After digging through some of this code I'm a bit concerned about all the kernel
device tree manipulation around memory DLPAR both with the assoc-lookup-array
prop update and post dynamic-memory prop updating. We build a drmem_info array
of the LMBs from the device-tree at boot. I don't really understand why we are
manipulating the device tree property every time we add/remove an LMB. Not sure
what the reasoning was to write back in particular the aa_index and flags for each
LMB into the device tree when we already have them in the drmem_info array. On
the other hand the assoc-lookup-array I suppose would need to have an in kernel
representation to avoid updating the device tree property every time.


I think the 

[RFC] powerpc/pseries: delete scanlog

2021-05-03 Thread Nathan Lynch
A commit from 2008 says this driver was relevant only for "older
systems", and currently supported hardware doesn't have this
facility. Get rid of it.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/configs/ppc64_defconfig |   1 -
 arch/powerpc/configs/pseries_defconfig   |   1 -
 arch/powerpc/platforms/pseries/Kconfig   |   4 -
 arch/powerpc/platforms/pseries/Makefile  |   1 -
 arch/powerpc/platforms/pseries/scanlog.c | 195 ---
 5 files changed, 202 deletions(-)
 delete mode 100644 arch/powerpc/platforms/pseries/scanlog.c

diff --git a/arch/powerpc/configs/ppc64_defconfig 
b/arch/powerpc/configs/ppc64_defconfig
index 701811c91a6f..acf13b4917c4 100644
--- a/arch/powerpc/configs/ppc64_defconfig
+++ b/arch/powerpc/configs/ppc64_defconfig
@@ -26,7 +26,6 @@ CONFIG_PPC64=y
 CONFIG_NR_CPUS=2048
 CONFIG_PPC_SPLPAR=y
 CONFIG_DTL=y
-CONFIG_SCANLOG=m
 CONFIG_PPC_SMLPAR=y
 CONFIG_IBMEBUS=y
 CONFIG_PPC_SVM=y
diff --git a/arch/powerpc/configs/pseries_defconfig 
b/arch/powerpc/configs/pseries_defconfig
index 50168dde4ea5..d120321e4eea 100644
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -38,7 +38,6 @@ CONFIG_MODULE_SRCVERSION_ALL=y
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_PPC_SPLPAR=y
 CONFIG_DTL=y
-CONFIG_SCANLOG=m
 CONFIG_PPC_SMLPAR=y
 CONFIG_IBMEBUS=y
 CONFIG_PAPR_SCM=m
diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index 5e037df2a3a1..bf9b612a929b 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -61,10 +61,6 @@ config PSERIES_ENERGY
  Provides: /sys/devices/system/cpu/pseries_(de)activation_hint_list
  and /sys/devices/system/cpu/cpuN/pseries_(de)activation_hint
 
-config SCANLOG
-   tristate "Scanlog dump interface"
-   depends on RTAS_PROC && PPC_PSERIES
-
 config IO_EVENT_IRQ
bool "IO Event Interrupt support"
depends on PPC_PSERIES
diff --git a/arch/powerpc/platforms/pseries/Makefile 
b/arch/powerpc/platforms/pseries/Makefile
index c8a2b0b05ac0..754d1102de08 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -8,7 +8,6 @@ obj-y   := lpar.o hvCall.o nvram.o reconfig.o \
   firmware.o power.o dlpar.o mobility.o rng.o \
   pci.o pci_dlpar.o eeh_pseries.o msi.o
 obj-$(CONFIG_SMP)  += smp.o
-obj-$(CONFIG_SCANLOG)  += scanlog.o
 obj-$(CONFIG_KEXEC_CORE)   += kexec.o
 obj-$(CONFIG_PSERIES_ENERGY)   += pseries_energy.o
 
diff --git a/arch/powerpc/platforms/pseries/scanlog.c 
b/arch/powerpc/platforms/pseries/scanlog.c
deleted file mode 100644
index 2879c4f0ceb7..
--- a/arch/powerpc/platforms/pseries/scanlog.c
+++ /dev/null
@@ -1,195 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- *  c 2001 PPC 64 Team, IBM Corp
- *
- * scan-log-data driver for PPC64  Todd Inglett 
- *
- * When ppc64 hardware fails the service processor dumps internal state
- * of the system.  After a reboot the operating system can access a dump
- * of this data using this driver.  A dump exists if the device-tree
- * /chosen/ibm,scan-log-data property exists.
- *
- * This driver exports /proc/powerpc/scan-log-dump which can be read.
- * The driver supports only sequential reads.
- *
- * The driver looks at a write to the driver for the single word "reset".
- * If given, the driver will reset the scanlog so the platform can free it.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#define MODULE_VERS "1.0"
-#define MODULE_NAME "scanlog"
-
-/* Status returns from ibm,scan-log-dump */
-#define SCANLOG_COMPLETE 0
-#define SCANLOG_HWERROR -1
-#define SCANLOG_CONTINUE 1
-
-
-static unsigned int ibm_scan_log_dump; /* RTAS token */
-static unsigned int *scanlog_buffer;   /* The data buffer */
-
-static ssize_t scanlog_read(struct file *file, char __user *buf,
-   size_t count, loff_t *ppos)
-{
-   unsigned int *data = scanlog_buffer;
-   int status;
-   unsigned long len, off;
-   unsigned int wait_time;
-
-   if (count > RTAS_DATA_BUF_SIZE)
-   count = RTAS_DATA_BUF_SIZE;
-
-   if (count < 1024) {
-   /* This is the min supported by this RTAS call.  Rather
-* than do all the buffering we insist the user code handle
-* larger reads.  As long as cp works... :)
-*/
-   printk(KERN_ERR "scanlog: cannot perform a small read (%ld)\n", 
count);
-   return -EINVAL;
-   }
-
-   if (!access_ok(buf, count))
-   return -EFAULT;
-
-   for (;;) {
-   wait_time = 500;/* default wait if no data */
-   spin_lock(&rtas_data_buf_lock);
-   memcpy(rtas_data_buf, data, RTAS_DATA_BUF_SIZE);
-   status = 

[PATCH 2/2] powerpc/paca: Remove mm_ctx_id and mm_ctx_slb_addr_limit

2021-05-03 Thread Christophe Leroy
mm_ctx_id and mm_ctx_slb_addr_limit are not used anymore.

Remove them.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/paca.h | 2 --
 arch/powerpc/kernel/paca.c  | 2 --
 2 files changed, 4 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index ec18ac818e3a..ecc8d792a431 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -149,11 +149,9 @@ struct paca_struct {
 #endif /* CONFIG_PPC_BOOK3E */
 
 #ifdef CONFIG_PPC_BOOK3S
-   mm_context_id_t mm_ctx_id;
 #ifdef CONFIG_PPC_MM_SLICES
unsigned char mm_ctx_low_slices_psize[BITS_PER_LONG / BITS_PER_BYTE];
unsigned char mm_ctx_high_slices_psize[SLICE_ARRAY_SIZE];
-   unsigned long mm_ctx_slb_addr_limit;
 #else
u16 mm_ctx_user_psize;
u16 mm_ctx_sllp;
diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 7f5aae3c387d..9bd30cac852b 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -346,10 +346,8 @@ void copy_mm_to_paca(struct mm_struct *mm)
 #ifdef CONFIG_PPC_BOOK3S
mm_context_t *context = &mm->context;
 
-   get_paca()->mm_ctx_id = context->id;
 #ifdef CONFIG_PPC_MM_SLICES
VM_BUG_ON(!mm_ctx_slb_addr_limit(context));
-   get_paca()->mm_ctx_slb_addr_limit = mm_ctx_slb_addr_limit(context);
memcpy(&get_paca()->mm_ctx_low_slices_psize, mm_ctx_low_slices(context),
   LOW_SLICE_ARRAY_SZ);
memcpy(&get_paca()->mm_ctx_high_slices_psize, 
mm_ctx_high_slices(context),
-- 
2.25.0



[PATCH 1/2] powerpc/asm-offset: Remove unused items related to paca

2021-05-03 Thread Christophe Leroy
PACA_SIZE, PACACONTEXTID, PACALOWSLICESPSIZE, PACAHIGHSLICEPSIZE,
PACA_SLB_ADDR_LIMIT, MMUPSIZEDEFSIZE, PACASLBCACHE, PACASLBCACHEPTR,
PACASTABRR, PACAVMALLOCSLLP, MMUPSIZESLLP, PACACONTEXTSLLP,
PACALPPACAPTR, LPPACA_DTLIDX and PACA_DTL_RIDX are not used anymore
by ASM code.

Remove them.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/asm-offsets.c | 24 
 1 file changed, 24 deletions(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 28af4efb4587..419ab4a89114 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -197,7 +197,6 @@ int main(void)
OFFSET(ICACHEL1LOGBLOCKSIZE, ppc64_caches, l1i.log_block_size);
OFFSET(ICACHEL1BLOCKSPERPAGE, ppc64_caches, l1i.blocks_per_page);
/* paca */
-   DEFINE(PACA_SIZE, sizeof(struct paca_struct));
OFFSET(PACAPACAINDEX, paca_struct, paca_index);
OFFSET(PACAPROCSTART, paca_struct, cpu_start);
OFFSET(PACAKSAVE, paca_struct, kstack);
@@ -212,15 +211,6 @@ int main(void)
OFFSET(PACAIRQSOFTMASK, paca_struct, irq_soft_mask);
OFFSET(PACAIRQHAPPENED, paca_struct, irq_happened);
OFFSET(PACA_FTRACE_ENABLED, paca_struct, ftrace_enabled);
-#ifdef CONFIG_PPC_BOOK3S
-   OFFSET(PACACONTEXTID, paca_struct, mm_ctx_id);
-#ifdef CONFIG_PPC_MM_SLICES
-   OFFSET(PACALOWSLICESPSIZE, paca_struct, mm_ctx_low_slices_psize);
-   OFFSET(PACAHIGHSLICEPSIZE, paca_struct, mm_ctx_high_slices_psize);
-   OFFSET(PACA_SLB_ADDR_LIMIT, paca_struct, mm_ctx_slb_addr_limit);
-   DEFINE(MMUPSIZEDEFSIZE, sizeof(struct mmu_psize_def));
-#endif /* CONFIG_PPC_MM_SLICES */
-#endif
 
 #ifdef CONFIG_PPC_BOOK3E
OFFSET(PACAPGD, paca_struct, pgd);
@@ -241,21 +231,9 @@ int main(void)
 #endif /* CONFIG_PPC_BOOK3E */
 
 #ifdef CONFIG_PPC_BOOK3S_64
-   OFFSET(PACASLBCACHE, paca_struct, slb_cache);
-   OFFSET(PACASLBCACHEPTR, paca_struct, slb_cache_ptr);
-   OFFSET(PACASTABRR, paca_struct, stab_rr);
-   OFFSET(PACAVMALLOCSLLP, paca_struct, vmalloc_sllp);
-#ifdef CONFIG_PPC_MM_SLICES
-   OFFSET(MMUPSIZESLLP, mmu_psize_def, sllp);
-#else
-   OFFSET(PACACONTEXTSLLP, paca_struct, mm_ctx_sllp);
-#endif /* CONFIG_PPC_MM_SLICES */
OFFSET(PACA_EXGEN, paca_struct, exgen);
OFFSET(PACA_EXMC, paca_struct, exmc);
OFFSET(PACA_EXNMI, paca_struct, exnmi);
-#ifdef CONFIG_PPC_PSERIES
-   OFFSET(PACALPPACAPTR, paca_struct, lppaca_ptr);
-#endif
OFFSET(PACA_SLBSHADOWPTR, paca_struct, slb_shadow_ptr);
OFFSET(SLBSHADOW_STACKVSID, slb_shadow, save_area[SLB_NUM_BOLTED - 
1].vsid);
OFFSET(SLBSHADOW_STACKESID, slb_shadow, save_area[SLB_NUM_BOLTED - 
1].esid);
@@ -264,9 +242,7 @@ int main(void)
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
OFFSET(PACA_PMCINUSE, paca_struct, pmcregs_in_use);
 #endif
-   OFFSET(LPPACA_DTLIDX, lppaca, dtl_idx);
OFFSET(LPPACA_YIELDCOUNT, lppaca, yield_count);
-   OFFSET(PACA_DTL_RIDX, paca_struct, dtl_ridx);
 #endif /* CONFIG_PPC_BOOK3S_64 */
OFFSET(PACAEMERGSP, paca_struct, emergency_sp);
 #ifdef CONFIG_PPC_BOOK3S_64
-- 
2.25.0



Re: [PATCH v2] powerpc/64: BE option to use ELFv2 ABI for big endian kernels

2021-05-03 Thread Segher Boessenkool
Hi!

On Mon, May 03, 2021 at 10:51:41AM +1000, Nicholas Piggin wrote:
> Excerpts from Segher Boessenkool's message of May 3, 2021 3:55 am:
> > On Wed, Apr 29, 2020 at 10:57:16AM +1000, Nicholas Piggin wrote:
> >> Excerpts from Segher Boessenkool's message of April 29, 2020 9:40 am:
> >> I blame toolchain for -mabi=elfv2 ! And also some blame on ABI document 
> >> which is called ELF V2 ABI rather than ELF ABI V2 which would have been 
> >> unambiguous.
> > 
> > At least ELFv2 ABI is correct.  "ELF ABI v2" is not.
> > 
> >> I can go through and change all my stuff and config options to ELF_ABI_v2.
> > 
> > Please don't.  It is wrong.
> 
> Then I'm not sure what the point of your previous mail was, what did I 
> miss?

I asked if you could make it clearer to people who do not know what this
is whether they want to use it.  Or that was my intention, anyhow :-/

> > Both the original PowerPC ELF ABI and the
> > ELFv2 one have versions themselves.  Also, the base ELF standard has a
> > version, and is set up so there can be incompatible versions even!  Of
> > course it still is version 1 to this day, but :-)
> 
> The point was for people who don't know ELFv2 has a specific meaning for 
> powerpc,

It does not have *any* meaning outside of Power.  But people who do not
know what it is can assume the wrong things about it.  It isn't a great
name because of that :-(

(It's not as bad as the MIPS ABIs -- an older one is called "new" :-) )

> then ELF ABIv2 is more explanatory about it being an abi change
> rather than base elf change, even if it's not the "correct" name.

I very much disagree.  "ELF ABIv2" is completely meaningless.

> If you don't want that then good, I also prefer to just use ELFv2. I 

Good :-)

> think people who change this option can easily look up the name in 
> toolchain and other docs.

Yeah.  As long as the defaults are good, whoever blows themselves up has
only themselves to blame :-P


Segher


Re: [PATCH v3] powerpc/64: Option to use ELFv2 ABI for big-endian kernels

2021-05-03 Thread Michal Suchánek
On Mon, May 03, 2021 at 01:37:57PM +0200, Andreas Schwab wrote:
> Should this add a tag to the module vermagic?

Would the modules link even if the vermagic was not changed?

I suppose something like this might do it.

Thanks

Michal

diff --git a/arch/powerpc/include/asm/vermagic.h 
b/arch/powerpc/include/asm/vermagic.h
index b054a8576e5d..3fdaacd7a743 100644
--- a/arch/powerpc/include/asm/vermagic.h
+++ b/arch/powerpc/include/asm/vermagic.h
@@ -14,7 +14,14 @@
 #define MODULE_ARCH_VERMAGIC_RELOCATABLE   ""
 #endif
 
+
+#ifdef CONFIG_PPC64_BUILD_BIG_ENDIAN_ELF_V2_ABI
+#define MODULE_ARCH_VERMAGIC_ELF_V2_ABI"abi-elfv2 "
+#else
+#define MODULE_ARCH_VERMAGIC_ELF_V2_ABI""
+#endif
+
 #define MODULE_ARCH_VERMAGIC \
-   MODULE_ARCH_VERMAGIC_FTRACE MODULE_ARCH_VERMAGIC_RELOCATABLE
+   MODULE_ARCH_VERMAGIC_FTRACE MODULE_ARCH_VERMAGIC_RELOCATABLE 
MODULE_ARCH_VERMAGIC_ELF_V2_ABI
 
 #endif /* _ASM_VERMAGIC_H */
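
A small stand-alone sketch of how such a tag ends up in the final string: adjacent string literals are concatenated by the compiler, so each fragment only needs its own trailing space. The `SKETCH_` macro names and the first two fragment values are assumptions for illustration; only "abi-elfv2 " comes from the diff above.

```c
/* Hypothetical stand-ins for the per-feature vermagic fragments. */
#define SKETCH_VERMAGIC_FTRACE		"mprofile-kernel "
#define SKETCH_VERMAGIC_RELOCATABLE	"relocatable "
#define SKETCH_VERMAGIC_ELF_V2_ABI	"abi-elfv2 "

/* Adjacent string literals are merged into one at compile time. */
#define SKETCH_VERMAGIC \
	SKETCH_VERMAGIC_FTRACE SKETCH_VERMAGIC_RELOCATABLE \
	SKETCH_VERMAGIC_ELF_V2_ABI

static const char *sketch_vermagic(void)
{
	return SKETCH_VERMAGIC;
}
```

With the ELFv2 fragment defined as an empty string instead, the result would differ from the one above, which is what lets a string comparison tell the two builds apart.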


Re: [PATCH v5 14/16] dma-direct: Allocate memory from restricted DMA pool if available

2021-05-03 Thread Claire Chang
On Fri, Apr 23, 2021 at 9:46 PM Robin Murphy  wrote:
>
> On 2021-04-22 09:15, Claire Chang wrote:
> > The restricted DMA pool is preferred if available.
> >
> > The restricted DMA pools provide a basic level of protection against the
> > DMA overwriting buffer contents at unexpected times. However, to protect
> > against general data leakage and system memory corruption, the system
> > needs to provide a way to lock down the memory access, e.g., MPU.
> >
> > Signed-off-by: Claire Chang 
> > ---
> >   kernel/dma/direct.c | 35 ++-
> >   1 file changed, 26 insertions(+), 9 deletions(-)
> >
> > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> > index 7a27f0510fcc..29523d2a9845 100644
> > --- a/kernel/dma/direct.c
> > +++ b/kernel/dma/direct.c
> > @@ -78,6 +78,10 @@ static bool dma_coherent_ok(struct device *dev, 
> > phys_addr_t phys, size_t size)
> >   static void __dma_direct_free_pages(struct device *dev, struct page *page,
> >   size_t size)
> >   {
> > +#ifdef CONFIG_DMA_RESTRICTED_POOL
> > + if (swiotlb_free(dev, page, size))
> > + return;
> > +#endif
> >   dma_free_contiguous(dev, page, size);
> >   }
> >
> > @@ -92,7 +96,17 @@ static struct page *__dma_direct_alloc_pages(struct 
> > device *dev, size_t size,
> >
> >   gfp |= dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
> >  &phys_limit);
> > - page = dma_alloc_contiguous(dev, size, gfp);
> > +
> > +#ifdef CONFIG_DMA_RESTRICTED_POOL
> > + page = swiotlb_alloc(dev, size);
> > + if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
> > + __dma_direct_free_pages(dev, page, size);
> > + page = NULL;
> > + }
> > +#endif
> > +
> > + if (!page)
> > + page = dma_alloc_contiguous(dev, size, gfp);
> >   if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
> >   dma_free_contiguous(dev, page, size);
> >   page = NULL;
> > @@ -148,7 +162,7 @@ void *dma_direct_alloc(struct device *dev, size_t size,
> >   gfp |= __GFP_NOWARN;
> >
> >   if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
> > - !force_dma_unencrypted(dev)) {
> > + !force_dma_unencrypted(dev) && !is_dev_swiotlb_force(dev)) {
> >   page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO);
> >   if (!page)
> >   return NULL;
> > @@ -161,8 +175,8 @@ void *dma_direct_alloc(struct device *dev, size_t size,
> >   }
> >
> >   if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) &&
> > - !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
> > - !dev_is_dma_coherent(dev))
> > + !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && !dev_is_dma_coherent(dev) 
> > &&
> > + !is_dev_swiotlb_force(dev))
> >   return arch_dma_alloc(dev, size, dma_handle, gfp, attrs);
> >
> >   /*
> > @@ -172,7 +186,9 @@ void *dma_direct_alloc(struct device *dev, size_t size,
> >   if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL) &&
> >   !gfpflags_allow_blocking(gfp) &&
> >   (force_dma_unencrypted(dev) ||
> > -  (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && 
> > !dev_is_dma_coherent(dev))))
> > +  (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
> > +   !dev_is_dma_coherent(dev))) &&
> > + !is_dev_swiotlb_force(dev))
> >   return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp);
> >
> >   /* we always manually zero the memory once we are done */
> > @@ -253,15 +269,15 @@ void dma_direct_free(struct device *dev, size_t size,
> >   unsigned int page_order = get_order(size);
> >
> >   if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
> > - !force_dma_unencrypted(dev)) {
> > + !force_dma_unencrypted(dev) && !is_dev_swiotlb_force(dev)) {
> >   /* cpu_addr is a struct page cookie, not a kernel address */
> >   dma_free_contiguous(dev, cpu_addr, size);
> >   return;
> >   }
> >
> >   if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) &&
> > - !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
> > - !dev_is_dma_coherent(dev)) {
> > + !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && !dev_is_dma_coherent(dev) 
> > &&
> > + !is_dev_swiotlb_force(dev)) {
> >   arch_dma_free(dev, size, cpu_addr, dma_addr, attrs);
> >   return;
> >   }
> > @@ -289,7 +305,8 @@ struct page *dma_direct_alloc_pages(struct device *dev, 
> > size_t size,
> >   void *ret;
> >
> >   if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL) &&
> > - force_dma_unencrypted(dev) && !gfpflags_allow_blocking(gfp))
> > + force_dma_unencrypted(dev) && !gfpflags_allow_blocking(gfp) &&
> > + !is_dev_swiotlb_force(dev))
> >   return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp);
>
> Wait, this seems broken for non-coherent devices - in that case we need
> to 

Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-03 Thread Miguel Ojeda
On Mon, May 3, 2021 at 2:20 PM David Laight  wrote:
>
> It would be nice to be able to build current kernels (for local
> use) on the 'new' system - but gcc is already too old.

I have seen such environments too... However, for the kernel in
particular, you could install a newer GCC in the 'new' machine (just
for the kernel builds) or do your kernel builds in a different machine
-- a 'new' 'new' one :)

Cheers,
Miguel


[PATCH 4/4] powerpc/powernv: Remove POWER9 PVR version check for entry and uaccess flushes

2021-05-03 Thread Nicholas Piggin
These aren't necessarily POWER9 only, and it's not to say some new
vulnerability may not get discovered on other processors for which
we would like the flexibility of having the workaround enabled by
firmware.

Remove the restriction that they only apply to POWER9.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/platforms/powernv/setup.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/setup.c 
b/arch/powerpc/platforms/powernv/setup.c
index a8db3f153063..6ec67223f8c7 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -122,15 +122,6 @@ static void pnv_setup_security_mitigations(void)
type = L1D_FLUSH_ORI;
}
 
-   /*
-* If we are non-Power9 bare metal, we don't need to flush on kernel
-* entry or after user access: they fix a P9 specific vulnerability.
-*/
-   if (!pvr_version_is(PVR_POWER9)) {
-   security_ftr_clear(SEC_FTR_L1D_FLUSH_ENTRY);
-   security_ftr_clear(SEC_FTR_L1D_FLUSH_UACCESS);
-   }
-
enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && \
 (security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR)   || \
  security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV));
-- 
2.23.0



[PATCH 3/4] powerpc/pesries: Get STF barrier requirement from H_GET_CPU_CHARACTERISTICS

2021-05-03 Thread Nicholas Piggin
This allows the hypervisor / firmware to describe this workaround to
the guest.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/hvcall.h  | 1 +
 arch/powerpc/platforms/pseries/setup.c | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index f962b339865c..a60ef261f63a 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -395,6 +395,7 @@
 #define H_CPU_BEHAV_FLUSH_LINK_STACK   (1ull << 57) // IBM bit 6
 #define H_CPU_BEHAV_NO_L1D_FLUSH_ENTRY (1ull << 56) // IBM bit 7
 #define H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS (1ull << 55) // IBM bit 8
+#define H_CPU_BEHAV_NO_STF_BARRIER (1ull << 54) // IBM bit 9
 
 /* Flag values used in H_REGISTER_PROC_TBL hcall */
 #define PROC_TABLE_OP_MASK 0x18
diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index 287f33645419..631a0d57b6cd 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -555,6 +555,9 @@ static void init_cpu_char_feature_flags(struct 
h_cpu_char_result *result)
if (result->behaviour & H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS)
security_ftr_clear(SEC_FTR_L1D_FLUSH_UACCESS);
 
+   if (result->behaviour & H_CPU_BEHAV_NO_STF_BARRIER)
+   security_ftr_clear(SEC_FTR_STF_BARRIER);
+
if (!(result->behaviour & H_CPU_BEHAV_BNDS_CHK_SPEC_BAR))
security_ftr_clear(SEC_FTR_BNDS_CHK_SPEC_BAR);
 }
-- 
2.23.0
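
The "IBM bit N" comments in hvcall.h follow IBM numbering, where bit 0 is the most significant bit of the 64-bit word. A minimal sketch of that mapping, with `SKETCH_IBM_BIT` and `sketch_behaviour_has` as hypothetical helpers that are not part of the patch:

```c
#include <stdint.h>

/* IBM numbering: bit 0 is the MSB of a 64-bit word, bit 63 the LSB. */
#define SKETCH_IBM_BIT(n)	(UINT64_C(1) << (63 - (n)))

/* Test whether a behaviour word has the given IBM-numbered bit set. */
static int sketch_behaviour_has(uint64_t behaviour, unsigned int ibm_bit)
{
	return (behaviour & SKETCH_IBM_BIT(ibm_bit)) != 0;
}
```

Under this mapping IBM bit 9 is 1ull << 54, matching the new H_CPU_BEHAV_NO_STF_BARRIER definition, and IBM bits 6 to 8 give the shifts 57, 56 and 55 used by the neighbouring flags.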



[PATCH 2/4] powerpc/security: Add a security feature for STF barrier

2021-05-03 Thread Nicholas Piggin
Rather than tying this mitigation to RFI L1D flush requirement, add a
new bit for it.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/security_features.h | 4 
 arch/powerpc/kernel/security.c   | 7 ++-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/security_features.h 
b/arch/powerpc/include/asm/security_features.h
index b774a4477d5f..792eefaf230b 100644
--- a/arch/powerpc/include/asm/security_features.h
+++ b/arch/powerpc/include/asm/security_features.h
@@ -92,6 +92,9 @@ static inline bool security_ftr_enabled(u64 feature)
 // The L1-D cache should be flushed after user accesses from the kernel
 #define SEC_FTR_L1D_FLUSH_UACCESS  0x8000ull
 
+// The STF flush should be executed on privilege state switch
+#define SEC_FTR_STF_BARRIER0x0001ull
+
 // Features enabled by default
 #define SEC_FTR_DEFAULT \
(SEC_FTR_L1D_FLUSH_HV | \
@@ -99,6 +102,7 @@ static inline bool security_ftr_enabled(u64 feature)
 SEC_FTR_BNDS_CHK_SPEC_BAR | \
 SEC_FTR_L1D_FLUSH_ENTRY | \
 SEC_FTR_L1D_FLUSH_UACCESS | \
+SEC_FTR_STF_BARRIER | \
 SEC_FTR_FAVOUR_SECURITY)
 
 #endif /* _ASM_POWERPC_SECURITY_FEATURES_H */
diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index 0fdfcdd9d880..2eb257b759c6 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -300,9 +300,7 @@ static void stf_barrier_enable(bool enable)
 void setup_stf_barrier(void)
 {
enum stf_barrier_type type;
-   bool enable, hv;
-
-   hv = cpu_has_feature(CPU_FTR_HVMODE);
+   bool enable;
 
/* Default to fallback in case fw-features are not available */
if (cpu_has_feature(CPU_FTR_ARCH_300))
@@ -315,8 +313,7 @@ void setup_stf_barrier(void)
type = STF_BARRIER_NONE;
 
enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
-   (security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR) ||
-(security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) && hv));
+security_ftr_enabled(SEC_FTR_STF_BARRIER);
 
if (type == STF_BARRIER_FALLBACK) {
pr_info("stf-barrier: fallback barrier available\n");
-- 
2.23.0
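
The change to the enable condition in setup_stf_barrier() can be modelled
in userspace as below. This is only a sketch: struct sec_state and its
field names are hypothetical stand-ins for the security_ftr_enabled() and
cpu_has_feature() calls in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the inputs that setup_stf_barrier() consults. */
struct sec_state {
	bool favour_security; /* SEC_FTR_FAVOUR_SECURITY */
	bool l1d_flush_pr;    /* SEC_FTR_L1D_FLUSH_PR */
	bool l1d_flush_hv;    /* SEC_FTR_L1D_FLUSH_HV */
	bool stf_barrier;     /* SEC_FTR_STF_BARRIER, added by this patch */
	bool hv_mode;         /* cpu_has_feature(CPU_FTR_HVMODE) */
};

/* Old rule: the STF barrier is tied to the RFI L1D flush requirement. */
static bool stf_enable_old(const struct sec_state *s)
{
	return s->favour_security &&
	       (s->l1d_flush_pr || (s->l1d_flush_hv && s->hv_mode));
}

/* New rule: the STF barrier is controlled by its own feature bit. */
static bool stf_enable_new(const struct sec_state *s)
{
	return s->favour_security && s->stf_barrier;
}
```

With the new rule, firmware that still requires the L1D flush but clears
the STF barrier bit no longer gets the barrier enabled.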



[PATCH 1/4] powerpc/pseries: Get entry and uaccess flush required bits from H_GET_CPU_CHARACTERISTICS

2021-05-03 Thread Nicholas Piggin
This allows the hypervisor / firmware to describe these workarounds to
the guest.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/hvcall.h  | 2 ++
 arch/powerpc/platforms/pseries/setup.c | 6 ++
 2 files changed, 8 insertions(+)

diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index 443050906018..f962b339865c 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -393,6 +393,8 @@
 #define H_CPU_BEHAV_FAVOUR_SECURITY_H  (1ull << 60) // IBM bit 3
 #define H_CPU_BEHAV_FLUSH_COUNT_CACHE  (1ull << 58) // IBM bit 5
 #define H_CPU_BEHAV_FLUSH_LINK_STACK   (1ull << 57) // IBM bit 6
+#define H_CPU_BEHAV_NO_L1D_FLUSH_ENTRY (1ull << 56) // IBM bit 7
+#define H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS (1ull << 55) // IBM bit 8
 
 /* Flag values used in H_REGISTER_PROC_TBL hcall */
 #define PROC_TABLE_OP_MASK 0x18
diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index 754e493b7c05..287f33645419 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -549,6 +549,12 @@ static void init_cpu_char_feature_flags(struct 
h_cpu_char_result *result)
if (!(result->behaviour & H_CPU_BEHAV_L1D_FLUSH_PR))
security_ftr_clear(SEC_FTR_L1D_FLUSH_PR);
 
+   if (result->behaviour & H_CPU_BEHAV_NO_L1D_FLUSH_ENTRY)
+   security_ftr_clear(SEC_FTR_L1D_FLUSH_ENTRY);
+
+   if (result->behaviour & H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS)
+   security_ftr_clear(SEC_FTR_L1D_FLUSH_UACCESS);
+
if (!(result->behaviour & H_CPU_BEHAV_BNDS_CHK_SPEC_BAR))
security_ftr_clear(SEC_FTR_BNDS_CHK_SPEC_BAR);
 }
-- 
2.23.0
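
Note the polarity of the two new bits: the features are enabled by default
and the hypervisor sets a "NO_" bit to say a flush is not required, so a
hypervisor that never sets these bits leaves both flushes enabled. A
userspace sketch of that logic follows; the FTR_* values and
apply_behaviour() are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Behaviour bits as defined in the hvcall.h hunk of this patch. */
#define H_CPU_BEHAV_NO_L1D_FLUSH_ENTRY   (1ull << 56)
#define H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS (1ull << 55)

/* Hypothetical feature bits, for illustration only. */
#define FTR_L1D_FLUSH_ENTRY   0x1ull
#define FTR_L1D_FLUSH_UACCESS 0x2ull
#define FTR_DEFAULT           (FTR_L1D_FLUSH_ENTRY | FTR_L1D_FLUSH_UACCESS)

/* Start from the defaults and clear a feature only when told it is not needed. */
static uint64_t apply_behaviour(uint64_t behaviour)
{
	uint64_t features = FTR_DEFAULT;

	if (behaviour & H_CPU_BEHAV_NO_L1D_FLUSH_ENTRY)
		features &= ~FTR_L1D_FLUSH_ENTRY;
	if (behaviour & H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS)
		features &= ~FTR_L1D_FLUSH_UACCESS;

	return features;
}
```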



[PATCH 0/4] powerpc/security mitigation updates

2021-05-03 Thread Nicholas Piggin
This series adds support for a few bits recently added to the pseries
H_GET_CPU_CHARACTERISTICS hcall, and also removes a restriction from
powernv for some of the flushes.

This is tested mainly in qemu, where I just submitted a patch
that adds support for these bits (not upstream yet).

Nicholas Piggin (4):
  powerpc/pseries: Get entry and uaccess flush required bits from
H_GET_CPU_CHARACTERISTICS
  powerpc/security: Add a security feature for STF barrier
  powerpc/pseries: Get STF barrier requirement from
H_GET_CPU_CHARACTERISTICS
  powerpc/powernv: Remove POWER9 PVR version check for entry and uaccess
flushes

 arch/powerpc/include/asm/hvcall.h| 3 +++
 arch/powerpc/include/asm/security_features.h | 4 
 arch/powerpc/kernel/security.c   | 7 ++-
 arch/powerpc/platforms/powernv/setup.c   | 9 -
 arch/powerpc/platforms/pseries/setup.c   | 9 +
 5 files changed, 18 insertions(+), 14 deletions(-)

-- 
2.23.0



Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-03 Thread David Sterba
On Sun, May 02, 2021 at 12:15:38AM +0900, Masahiro Yamada wrote:
> The current minimum GCC version is 4.9 except ARCH=arm64 requiring
> GCC 5.1.
> 
> When we discussed last time, we agreed to raise the minimum GCC version
> to 5.1 globally. [1]

There are still a lot of comment references to old gcc releases with
workarounds or bugfixes, a quick search:

$ git grep -in 'gcc.*[234]\.x'
arch/alpha/include/asm/string.h:30:/* For gcc 3.x, we cannot have the inline 
function named "memset" because
arch/arc/include/asm/checksum.h:9: *  -gcc 4.4.x broke networking. Alias 
analysis needed to be primed.
arch/arm/Makefile:127:# Need -Uarm for gcc < 3.x
arch/ia64/lib/memcpy_mck.S:535: * Due to lack of local tag support in gcc 2.x 
assembler, it is not clear which
arch/mips/include/asm/page.h:210: * also affect MIPS so we keep this one until 
GCC 3.x has been retired
arch/x86/include/asm/page.h:53: * remove this Voodoo magic stuff. (i.e. once 
gcc3.x is deprecated)
arch/x86/kvm/x86.c:5569: * This union makes it completely explicit to 
gcc-3.x
arch/x86/mm/pgtable.c:302:  if (PREALLOCATED_PMDS == 0) /* Work around 
gcc-3.4.x bug */
drivers/net/ethernet/renesas/sh_eth.c:51: * that warning from W=1 builds. GCC 
has supported this option since 4.2.X, but
lib/xz/xz_dec_lzma2.c:494: * of the code generated by GCC 3.x decreases 10-15 
%. (GCC 4.3 doesn't care,
lib/xz/xz_dec_lzma2.c:495: * and it generates 10-20 % faster code than GCC 3.x 
from this file anyway.)
net/core/skbuff.c:32: * The functions in this file will not compile correctly 
with gcc 2.4.x

This misses version-specific quirks, but the following returns 216
results and not all are problematic (eg. just referring to gcc for some
historical reason) so I'm not pasting it here.

$ git grep -in 'gcc.*[234]\.[0-9]'
...
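
To see why the first pattern misses comments that name a specific
version, the two expressions can be tried with the POSIX regex API. git
grep uses basic regular expressions by default, and -i makes the match
case-insensitive. A small sketch, where bre_match_icase() is a
hypothetical helper and not part of any tool mentioned here:

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Match line against a POSIX basic regular expression, ignoring case.
 * Returns 1 on match, 0 on no match, -1 if the pattern does not compile.
 */
static int bre_match_icase(const char *pattern, const char *line)
{
	regex_t re;
	int rc;

	if (regcomp(&re, pattern, REG_ICASE | REG_NOSUB) != 0)
		return -1;
	rc = regexec(&re, line, 0, NULL, 0);
	regfree(&re);

	return rc == 0;
}
```

With this helper, "Need -Uarm for gcc < 3.x" matches 'gcc.*[234]\.x',
while "(GCC 4.3 doesn't care," only matches 'gcc.*[234]\.[0-9]'.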


RE: [PATCH] Raise the minimum GCC version to 5.2

2021-05-03 Thread David Laight
From: Arnd Bergmann
> Sent: 03 May 2021 10:25
...
> One scenario that I've seen previously is where user space and
> kernel are built together as a source based distribution (OE, buildroot,
> openwrt, ...), and the compiler is picked to match the original sources
> of the user space because that is best tested, but the same compiler
> then gets used to build the kernel as well because that is the default
> in the build environment.

If you are building programs for release to customers who might
be running them on old distributions then you need a system with
the original userspace headers and almost certainly a similar
vintage compiler.
Never mind RHEL7 we have customers running RHEL6.
(We've managed to get everyone off RHEL5.)
So the build machine is running a 10+ year old distro.

I did try to build on a newer system (only 5 years old)
but the complete fubar of memcpy() makes it impossible
to compile C programs that will run on an older libc.
And don't even mention C++, the 'character traits' is just
plain horrid - enough to make me want to remove every
reference to CString from the small amount of C++ we have.

To quote our makefile:
# C++ is fighting back.
# I'd like to be able to compile on a 'new' system and still be able to run
# the binaries on RHEL 6 (2.6.32 kernel 2011 era libraries).
# But even linking libstdc++ static still leaves
# an undefined C++ symbol that the dynamic loader barfs on.
# The static libstdc++ also references memcpy@GLIBC_2.14 - but that can be
# 'solved' by adding an extra .so that defines the symbol (and calls memmove()).
# I've also tried pulling a single .o out of libstc++.a. This might work if
# the .o is small and self contained.
#
# For now we statically link libstc++ and continue to build on an old system.
C++LDLIBS := -Wl,-Bstatic -lstdc++ -Wl,-Bdynamic

It would be nice to be able to build current kernels (for local
use) on the 'new' system - but gcc is already too old.

David



Re: [PATCH v3] powerpc/64: Option to use ELFv2 ABI for big-endian kernels

2021-05-03 Thread Andreas Schwab
Should this add a tag to the module vermagic?

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-03 Thread Arnd Bergmann
On Mon, May 3, 2021 at 12:32 AM Matthew Wilcox  wrote:
> On Sun, May 02, 2021 at 02:08:31PM -0700, Linus Torvalds wrote:
> > What is relevant is what version of gcc various distributions actually
> > have reasonably easily available, and how old and relevant the
> > distributions are. We did decide that (just as an example) RHEL 7 was
> > too old to worry about when we updated the gcc version requirement
> > last time.
> >
> > Last year, Arnd and Kirill (maybe others were involved too) made a
> > list of distros and older gcc versions. But I don't think anybody
> > actually _maintains_ such a list. It would be perhaps interesting to
> > have some way to check what compiler versions are being offered by
> > different distros.
>
> fwiw, Debian 9 aka Stretch released June 2017 had gcc 6.3
> Debian 10 aka Buster released June 2019 had gcc 7.4 *and* 8.3.
> Debian 8 aka Jessie had gcc-4.8.4 and gcc-4.9.2.
>
> So do we care about people who haven't bothered to upgrade userspace
> since 2017?  If so, we can't go past 4.9.

I would argue that we shouldn't care about distros that are officially
end-of-life. Jessie support ended last July according to the official
Debian pages at https://wiki.debian.org/LTS.

It's a little harder for distros that are still officially supported, like the
RHEL7 case that Linus mentioned, Debian Stretch (gcc-6.3),
Slackware 14.2 (gcc-5.3), or Ubuntu 18.04 (gcc-7.3). For any of
these you could make the argument one way or the other: either
say we care as long as the distro cares, or the users that want
to build their own kernels can be reasonably expected to either
upgrade their distro or install a newer compiler manually.

Looking at the Debian case specifically, I see these numbers
from https://popcon.debian.org/:

testing/unstable: 16730
buster/stable: 113881
stretch/oldstable: 39147
jessie/oldoldstable: 19286

Assuming the numbers of users that installed popcon are
proportional to the actual number of users, that's still a large
chunk of people running stretch or older. Presumably,
these users are actually less likely to build their own kernels.
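
For reference, the share works out as follows from the four popcon
counts quoted above. This is just the arithmetic as a sketch; the
function names are made up for the example.

```c
#include <assert.h>

/* popcon counts as quoted in this message */
enum {
	POPCON_TESTING = 16730,
	POPCON_BUSTER  = 113881,
	POPCON_STRETCH = 39147,
	POPCON_JESSIE  = 19286,
};

/* All reported installations across the four suites. */
static long popcon_total(void)
{
	return (long)POPCON_TESTING + POPCON_BUSTER + POPCON_STRETCH + POPCON_JESSIE;
}

/* Installations on stretch or older (stretch plus jessie). */
static long popcon_stretch_or_older(void)
{
	return (long)POPCON_STRETCH + POPCON_JESSIE;
}

/* Percentage of reported installations that run stretch or older. */
static double stretch_or_older_percent(void)
{
	return 100.0 * (double)popcon_stretch_or_older() / (double)popcon_total();
}
```

That is 58433 out of 189044 reported installations, a little under 31%.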

   Arnd


[PATCH] powerpc/pseries: Enable hardlockup watchdog for PowerVM partitions

2021-05-03 Thread Nicholas Piggin
PowerVM will not arbitrarily oversubscribe or stop guests, page out the
guest kernel text to a NFS volume connected by carrier pigeon to abacus
based storage, etc., as a KVM host might. So PowerVM guests are not
likely to be killed by the hard lockup watchdog in normal operation,
even with shared processor LPARs which still get a minimum allotment of
CPU time.

Enable the hard lockup detector by default on !KVM guests, which we will
assume is PowerVM. It has been useful in finding problems on bare metal
kernels.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kernel/setup_64.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index b779d25761cf..c0e234456863 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -939,15 +939,17 @@ u64 hw_nmi_get_sample_period(int watchdog_thresh)
  * disable it by default. Book3S has a soft-nmi hardlockup detector based
  * on the decrementer interrupt, so it does not suffer from this problem.
  *
- * It is likely to get false positives in VM guests, so disable it there
- * by default too.
+ * It is likely to get false positives in KVM guests, so disable it there
+ * by default too. PowerVM will not stop or arbitrarily oversubscribe
+ * CPUs, but give a minimum regular allotment even with SPLPAR, so enable
+ * the detector for non-KVM guests, assume PowerVM.
  */
 static int __init disable_hardlockup_detector(void)
 {
 #ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
hardlockup_detector_disable();
 #else
-   if (firmware_has_feature(FW_FEATURE_LPAR))
+   if (is_kvm_guest())
hardlockup_detector_disable();
 #endif
 
-- 
2.23.0
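
The effect of swapping the test in the #else branch can be sketched as a
small truth table. This is a userspace model under the assumptions stated
in the changelog (a non-KVM guest is taken to be PowerVM) and under the
further assumption that a KVM pseries guest also reports FW_FEATURE_LPAR.
struct platform and both functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two kernel predicates. */
struct platform {
	bool is_lpar;      /* firmware_has_feature(FW_FEATURE_LPAR) */
	bool is_kvm_guest; /* is_kvm_guest() */
};

/* Old default: any LPAR (KVM or PowerVM) gets the detector disabled. */
static bool detector_disabled_old(const struct platform *p)
{
	return p->is_lpar;
}

/* New default: only KVM guests get the detector disabled. */
static bool detector_disabled_new(const struct platform *p)
{
	return p->is_kvm_guest;
}
```

Only the PowerVM case (LPAR, not a KVM guest) changes: the detector was
disabled there before and is left enabled now.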



[PATCH] powerpc/64s: Make NMI record implicitly soft-masked code as irqs disabled

2021-05-03 Thread Nicholas Piggin
scv support introduced the notion of code that implicitly soft-masks
irqs due to the instruction addresses. This is required because scv
enters the kernel with MSR[EE]=1.

If a NMI (including soft-NMI) interrupt hits when we are implicitly
soft-masked then its regs->softe does not reflect this because it is
derived from the explicit soft mask state (paca->irq_soft_mask). This
makes arch_irq_disabled_regs(regs) return false.

This can trigger a warning in the soft-NMI watchdog code (shown below).
Fix it by having NMI interrupts set regs->softe to disabled in case of
interrupting an implicit soft-masked region.

  [ cut here ]
  WARNING: CPU: 41 PID: 1103 at arch/powerpc/kernel/watchdog.c:259 
soft_nmi_interrupt+0x3e4/0x5f0
  CPU: 41 PID: 1103 Comm: (spawn) Not tainted
  NIP:  c0039534 LR: c0039234 CTR: c0009a00
  REGS: c07fffbcf940 TRAP: 0700   Not tainted
  MSR:  90021033   CR: 22042482  XER: 200400ad
  CFAR: c0039260 IRQMASK: 3
  GPR00: c0039204 c07fffbcfbe0 c1d6c300 0003
  GPR04: 7a45d078  0008 0020
  GPR08: 007ffd4e  c07ceb00 7265677368657265
  GPR12: 90009033 c07ceb00 0f7075bf4480 002a
  GPR16: 0f705745a528 7a45ddd8 0f70574d0008 
  GPR20: 0f7075c58d70 0f7057459c38 0001 0040
  GPR24:  0029 c1dae058 0029
  GPR28:  0800 0009 c07fffbcfd60
  NIP [c0039534] soft_nmi_interrupt+0x3e4/0x5f0
  LR [c0039234] soft_nmi_interrupt+0xe4/0x5f0
  Call Trace:
  [c07fffbcfbe0] [c0039204] soft_nmi_interrupt+0xb4/0x5f0 
(unreliable)
  [c07fffbcfcf0] [c000c0e8] soft_nmi_common+0x138/0x1c4
  --- interrupt: 900 at end_real_trampolines+0x0/0x1000
  NIP:  c0003000 LR: 7ca426adb03c CTR: 9280f033
  REGS: c07fffbcfd60 TRAP: 0900
  MSR:  90009033   CR: 44042482  XER: 200400ad
  CFAR: 7ca426946020 IRQMASK: 0
  GPR00: 00ad 7a45d050 7ca426b07f00 0035
  GPR04: 7a45d078  0008 0020
  GPR08:  0010 1000 7a45d110
  GPR12: 0001 7ca426d4e680 0f7075bf4480 002a
  GPR16: 0f705745a528 7a45ddd8 0f70574d0008 
  GPR20: 0f7075c58d70 0f7057459c38 0001 0040
  GPR24:  0f7057473f68 0003 041b
  GPR28: 7a45d4c4 0035  0f7057473f68
  NIP [c0003000] end_real_trampolines+0x0/0x1000
  LR [7ca426adb03c] 0x7ca426adb03c
  --- interrupt: 900
  Instruction dump:
  6000 6000 6042 3861 482b3ae5 6000 e93f0138 a36d0008
  7daa6b78 71290001 7f7907b4 4082fd34 <0fe0> 4bfffd2c 6042 ea6100a8
  ---[ end trace dc75f67d819779da ]---

Fixes: 118178e62e2e ("powerpc: move NMI entry/exit code into wrapper")
Reported-by: Cédric Le Goater 
Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/interrupt.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/include/asm/interrupt.h 
b/arch/powerpc/include/asm/interrupt.h
index 44cde2e129b8..299e51337aca 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -222,6 +222,13 @@ static inline void interrupt_nmi_enter_prepare(struct 
pt_regs *regs, struct inte
local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
 
+   if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && !(regs->msr & MSR_PR) &&
+   regs->nip < (unsigned long)__end_interrupts) {
+   /* Kernel code running below __end_interrupts is implicitly
+* soft-masked */
+   regs->softe = IRQS_ALL_DISABLED;
+   }
+
/* Don't do any per-CPU operations until interrupt state is fixed */
 
if (nmi_disables_ftrace(regs)) {
-- 
2.23.0
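
The added check can be modelled in userspace as below. This is a sketch,
not the kernel code: struct regs_model is a cut-down stand-in for
struct pt_regs, the end_interrupts argument stands in for the
__end_interrupts symbol, and the MSR_PR and IRQS_ALL_DISABLED values are
assumptions chosen to match the Book3S-64 definitions ("IRQMASK: 3" in
the trace above).

```c
#include <assert.h>
#include <stdint.h>

#define MSR_PR            (1ull << 14) /* problem state (user mode); assumed value */
#define IRQS_ENABLED      0ul
#define IRQS_ALL_DISABLED 3ul          /* assumed value, see lead-in */

/* Cut-down, hypothetical model of the pt_regs fields the patch looks at. */
struct regs_model {
	uint64_t msr;
	uint64_t nip;
	unsigned long softe;
};

/*
 * Kernel code running below the end of the interrupt vectors is implicitly
 * soft-masked, so record it as irqs-disabled even though the explicit
 * soft-mask state said otherwise.
 */
static void nmi_fixup_softe(struct regs_model *regs, uint64_t end_interrupts)
{
	if (!(regs->msr & MSR_PR) && regs->nip < end_interrupts)
		regs->softe = IRQS_ALL_DISABLED;
}
```

User-mode registers and kernel addresses at or above end_interrupts are
left alone, so softe still reflects the explicit soft-mask state there.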



[PATCH v3] powerpc/64: Option to use ELFv2 ABI for big-endian kernels

2021-05-03 Thread Nicholas Piggin
Provide an option to build big-endian kernels using the ELFv2 ABI. This
works on GCC only so far; it is rumored to work with clang, but that has
not been tested yet.

This can give big-endian kernels some useful advantages of the ELFv2 ABI
(e.g., less stack usage, -mprofile-kernel, better compatibility with bpf
tools).

BE+ELFv2 is not officially supported by the GNU toolchain, but it works
fine in testing and has been used by some userspace for some time (e.g.,
Void Linux).

Tested-by: Michal Suchánek 
Reviewed-by: Segher Boessenkool 
Signed-off-by: Nicholas Piggin 
---

I didn't add the -mprofile-kernel change but I think it would be a good
one that can be merged independently if it works.

Since v2:
- Rebased, tweaked changelog.
- Changed ELF_V2 to ELF_V2_ABI in config options, to be clearer.

Since v1:
- Improved the override flavour name suggested by Segher.
- Improved changelog wording.

 arch/powerpc/Kconfig| 22 ++
 arch/powerpc/Makefile   | 18 --
 arch/powerpc/boot/Makefile  |  4 +++-
 arch/powerpc/kernel/vdso64/Makefile | 13 +
 drivers/crypto/vmx/Makefile |  8 ++--
 drivers/crypto/vmx/ppc-xlate.pl | 10 ++
 6 files changed, 62 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1e6230bea09d..d3f78d3d574d 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -160,6 +160,7 @@ config PPC
select ARCH_WEAK_RELEASE_ACQUIRE
select BINFMT_ELF
select BUILDTIME_TABLE_SORT
+   select PPC64_BUILD_ELF_V2_ABI   if PPC64 && CPU_LITTLE_ENDIAN
select CLONE_BACKWARDS
select DCACHE_WORD_ACCESS   if PPC64 && CPU_LITTLE_ENDIAN
select DMA_OPS  if PPC64
@@ -568,6 +569,27 @@ config KEXEC_FILE
 config ARCH_HAS_KEXEC_PURGATORY
def_bool KEXEC_FILE
 
+config PPC64_BUILD_ELF_V2_ABI
+   bool
+
+config PPC64_BUILD_BIG_ENDIAN_ELF_V2_ABI
+   bool "Build big-endian kernel using ELF ABI V2 (EXPERIMENTAL)"
+   depends on PPC64 && CPU_BIG_ENDIAN && EXPERT
+   depends on CC_IS_GCC && LD_VERSION >= 22400
+   default n
+   select PPC64_BUILD_ELF_V2_ABI
+   help
+ This builds the kernel image using the "Power Architecture 64-Bit ELF
+ V2 ABI Specification", which has a reduced stack overhead and faster
+ function calls. This internal kernel ABI option does not affect
+  userspace compatibility.
+
+ The V2 ABI is standard for 64-bit little-endian, but for big-endian
+ it is less well tested by kernel and toolchain. However some distros
+ build userspace this way, and it can produce a functioning kernel.
+
+ This requires GCC and binutils 2.24 or newer.
+
 config RELOCATABLE
bool "Build a relocatable kernel"
depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 3212d076ac6a..b90b5cb799aa 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -91,10 +91,14 @@ endif
 
 ifdef CONFIG_PPC64
 ifndef CONFIG_CC_IS_CLANG
-cflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call cc-option,-mabi=elfv1)
-cflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call 
cc-option,-mcall-aixdesc)
-aflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call cc-option,-mabi=elfv1)
-aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mabi=elfv2
+ifdef CONFIG_PPC64_BUILD_ELF_V2_ABI
+cflags-y   += $(call cc-option,-mabi=elfv2)
+aflags-y   += $(call cc-option,-mabi=elfv2)
+else
+cflags-y   += $(call cc-option,-mabi=elfv1)
+cflags-y   += $(call cc-option,-mcall-aixdesc)
+aflags-y   += $(call cc-option,-mabi=elfv1)
+endif
 endif
 endif
 
@@ -142,15 +146,17 @@ endif
 
 CFLAGS-$(CONFIG_PPC64) := $(call cc-option,-mtraceback=no)
 ifndef CONFIG_CC_IS_CLANG
-ifdef CONFIG_CPU_LITTLE_ENDIAN
-CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2,$(call 
cc-option,-mcall-aixdesc))
+ifdef CONFIG_PPC64_BUILD_ELF_V2_ABI
+CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2)
 AFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2)
 else
+# Keep these in synch with arch/powerpc/kernel/vdso64/Makefile
 CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv1)
 CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcall-aixdesc)
 AFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv1)
 endif
 endif
+
 CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcmodel=medium,$(call 
cc-option,-mminimal-toc))
 CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mno-pointers-to-nested-functions)
 
diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 2b8da923ceca..be84a72f8258 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -40,6 +40,9 @@ BOOTCFLAGS:= -Wall -Wundef -Wstrict-prototypes 
-Wno-trigraphs 

[PATCH] ibmvnic: remove default label from to_string switch

2021-05-03 Thread Michal Suchanek
This way the compiler warns when a new value is added to the enum but
not to the string translation, like:

drivers/net/ethernet/ibm/ibmvnic.c: In function 'adapter_state_to_string':
drivers/net/ethernet/ibm/ibmvnic.c:832:2: warning: enumeration value 
'VNIC_FOOBAR' not handled in switch [-Wswitch]
  switch (state) {
  ^~
drivers/net/ethernet/ibm/ibmvnic.c: In function 'reset_reason_to_string':
drivers/net/ethernet/ibm/ibmvnic.c:1935:2: warning: enumeration value 
'VNIC_RESET_FOOBAR' not handled in switch [-Wswitch]
  switch (reason) {
  ^~

Signed-off-by: Michal Suchanek 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 5788bb956d73..4d439413f6d9 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -846,9 +846,8 @@ static const char *adapter_state_to_string(enum vnic_state 
state)
return "REMOVING";
case VNIC_REMOVED:
return "REMOVED";
-   default:
-   return "UNKNOWN";
}
+   return "UNKNOWN";
 }
 
 static int ibmvnic_login(struct net_device *netdev)
@@ -1946,9 +1945,8 @@ static const char *reset_reason_to_string(enum 
ibmvnic_reset_reason reason)
return "TIMEOUT";
case VNIC_RESET_CHANGE_PARAM:
return "CHANGE_PARAM";
-   default:
-   return "UNKNOWN";
}
+   return "UNKNOWN";
 }
 
 /*
-- 
2.26.2
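
The pattern in isolation looks like the sketch below. enum demo_state is
a made-up enum standing in for enum vnic_state. Because the switch has no
default label, -Wswitch (part of -Wall) reports any enumerator that lacks
a case, while the return after the switch still covers out-of-range
values at run time.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enum standing in for enum vnic_state. */
enum demo_state {
	DEMO_PROBING,
	DEMO_OPEN,
	DEMO_REMOVED,
};

static const char *demo_state_to_string(enum demo_state state)
{
	/* No default label: adding an enumerator without a case triggers -Wswitch. */
	switch (state) {
	case DEMO_PROBING:
		return "PROBING";
	case DEMO_OPEN:
		return "OPEN";
	case DEMO_REMOVED:
		return "REMOVED";
	}
	/* Reached only for values that are not named enumerators. */
	return "UNKNOWN";
}
```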



Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-03 Thread Arnd Bergmann
On Mon, May 3, 2021 at 2:44 AM Segher Boessenkool
 wrote:
>
> On Sun, May 02, 2021 at 02:23:01PM -0700, Joe Perches wrote:
> > On Sun, 2021-05-02 at 15:32 -0500, Segher Boessenkool wrote:
> > > On Sun, May 02, 2021 at 01:00:28PM -0700, Joe Perches wrote:
> > []
> > > > Perhaps 8 might be best as that has a __diag warning control mechanism.
> > >
> > > I have no idea what you mean?
> >
> > ? read the last bit of compiler-gcc.h
>
> Ah, you mean
> #pragma GCC diagnostic
> (which has existed since GCC 4.2).  Does anything in this __diag stuff
> require GCC 8?  Other than that this is hardcoded here :-)

The '8' was just a kernel thing, we made it configurable to have version
specific warnings, and I have a header file that adds these macros
for all supported compilers, but the version that is in mainline only does
it for gcc-8 or later.

Early compilers only supported "#pragma GCC diagnostic", but I think
even gcc-4.6 supported the _Pragma() syntax that lets you do it inside
of a macro.

It's something we should improve with plumbing on top, e.g. I want
a macro that lets you locally turn off both -Woverride-init on gcc
and -Winitializer-overrides on clang. It's not a reason to mandate
a newer compiler though.

Arnd
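
A rough sketch of what such a macro pair could look like, using
_Pragma() so the pragmas can live inside a macro. The DIAG_* names are
made up for this example; they are not the kernel's __diag macros.

```c
#include <assert.h>

/* Hypothetical macros: locally silence the "overridden initializer" warning. */
#if defined(__clang__)
#define DIAG_PUSH_IGNORE_OVERRIDE_INIT \
	_Pragma("clang diagnostic push") \
	_Pragma("clang diagnostic ignored \"-Winitializer-overrides\"")
#define DIAG_POP _Pragma("clang diagnostic pop")
#elif defined(__GNUC__)
#define DIAG_PUSH_IGNORE_OVERRIDE_INIT \
	_Pragma("GCC diagnostic push") \
	_Pragma("GCC diagnostic ignored \"-Woverride-init\"")
#define DIAG_POP _Pragma("GCC diagnostic pop")
#else
#define DIAG_PUSH_IGNORE_OVERRIDE_INIT
#define DIAG_POP
#endif

/* Fill the table with a default, then deliberately override one slot. */
DIAG_PUSH_IGNORE_OVERRIDE_INIT
static const int demo_table[4] = {
	[0] = -1, [1] = -1, [2] = -1, [3] = -1,
	[2] = 7, /* later initializer for the same element wins */
};
DIAG_POP
```

The clang branch is tested first because clang also defines __GNUC__.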


Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-03 Thread Kirill A. Shutemov
On Sun, May 02, 2021 at 02:08:31PM -0700, Linus Torvalds wrote:
> Last year, Arnd and Kirill (maybe others were involved too) made a
> list of distros and older gcc versions. But I don't think anybody
> actually _maintains_ such a list.

Distrowatch does. I used it for checking. But you need to check it per
distro. For Debian it would be here:

https://distrowatch.com/table.php?distribution=debian

-- 
 Kirill A. Shutemov


Re: [PATCH v2] powerpc/64: BE option to use ELFv2 ABI for big endian kernels

2021-05-03 Thread Michal Suchánek
On Mon, May 03, 2021 at 09:11:16AM +0200, Michal Suchánek wrote:
> On Mon, May 03, 2021 at 10:58:33AM +1000, Nicholas Piggin wrote:
> > Excerpts from Michal Suchánek's message of May 3, 2021 2:57 am:
> > > On Tue, Apr 28, 2020 at 09:25:17PM +1000, Nicholas Piggin wrote:
> > >> Provide an option to use ELFv2 ABI for big endian builds. This works on
> > >> GCC and clang (since 2014). It is less well tested and supported by the
> > >> GNU toolchain, but it can give some useful advantages of the ELFv2 ABI
> > >> for BE (e.g., less stack usage). Some distros even build BE ELFv2
> > >> userspace.
> > > 
> > > Fixes BTFID failure on BE for me and the ELF ABIv2 kernel boots.
> > 
> > What's the BTFID failure? Anything we can do to fix it on the v1 ABI or 
> > at least make it depend on BUILD_ELF_V2?
> 
> Looks like symbols are prefixed with a dot in ABIv1 and BTFID tool is
> not aware of that. It can be disabled on ABIv1 easily.
> 
> Thanks
> 
> Michal
> 
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 678c13967580..e703c26e9b80 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -305,6 +305,7 @@ config DEBUG_INFO_BTF
>   bool "Generate BTF typeinfo"
>   depends on !DEBUG_INFO_SPLIT && !DEBUG_INFO_REDUCED
>   depends on !GCC_PLUGIN_RANDSTRUCT || COMPILE_TEST
> + depends on !PPC64 || BUILD_ELF_V2
>   help
> Generate deduplicated BTF type information from DWARF debug info.
> Turning this on expects presence of pahole tool, which will convert
> 
> > 
> > > 
> > > Tested-by: Michal Suchánek 
> > > 
> > > Also can we enable mprofile on BE now?
> > > 
> > > I don't see anything endian-specific in the mprofile code at a glance
> > > but don't have any idea how to test it.
> > 
> > AFAIK it's just a different ABI for the _mcount call so just running
> > some ftrace and ftrace with call graph should test it reasonably well.

It does not crash and burn but there are some regressions from LE to BE
on the ftrace kernel selftest:

--- ftraceLE.txt2021-05-03 11:19:14.83000 +0200
+++ ftraceBE.txt2021-05-03 11:27:24.77000 +0200
@@ -7,8 +7,8 @@
 [n] Change the ringbuffer size [PASS]
 [n] Snapshot and tracing setting   [PASS]
 [n] trace_pipe and trace_marker[PASS]
-[n] Test ftrace direct functions against tracers   [UNRESOLVED]
-[n] Test ftrace direct functions against kprobes   [UNRESOLVED]
+[n] Test ftrace direct functions against tracers   [FAIL]
+[n] Test ftrace direct functions against kprobes   [FAIL]
 [n] Generic dynamic event - add/remove kprobe events   [PASS]
 [n] Generic dynamic event - add/remove synthetic events[PASS]
 [n] Generic dynamic event - selective clear (compatibility)[PASS]
@@ -16,10 +16,10 @@
 [n] event tracing - enable/disable with event level files  [PASS]
 [n] event tracing - restricts events based on pid notrace filtering[PASS]
 [n] event tracing - restricts events based on pid  [PASS]
-[n] event tracing - enable/disable with subsystem level files  [PASS]
+[n] event tracing - enable/disable with subsystem level files  [FAIL]
 [n] event tracing - enable/disable with top level files[PASS]
-[n] Test trace_printk from module  [UNRESOLVED]
-[n] ftrace - function graph filters with stack tracer  [PASS]
+[n] Test trace_printk from module  [FAIL]
+[n] ftrace - function graph filters with stack tracer  [FAIL]
 [n] ftrace - function graph filters[PASS]
 [n] ftrace - function trace with cpumask   [PASS]
 [n] ftrace - test for function event triggers  [PASS]
@@ -27,7 +27,7 @@
 [n] ftrace - function pid notrace filters  [PASS]
 [n] ftrace - function pid filters  [PASS]
 [n] ftrace - stacktrace filter command [PASS]
-[n] ftrace - function trace on module  [UNRESOLVED]
+[n] ftrace - function trace on module  [FAIL]
 [n] ftrace - function profiler with function tracing   [PASS]
 [n] ftrace - function profiling[PASS]
 [n] ftrace - test reading of set_ftrace_filter [PASS]
@@ -44,10 +44,10 @@
 [n] Kprobe event argument syntax   [PASS]
 [n] Kprobe dynamic event with arguments[PASS]
 [n] Kprobes event arguments with types [PASS]
-[n] Kprobe event user-memory access[UNSUPPORTED]
+[n] Kprobe event user-memory access[FAIL]
 [n] Kprobe event auto/manual naming[PASS]
 [n] Kprobe dynamic event with function tracer  [PASS]
-[n] Kprobe dynamic event - probing module  [UNRESOLVED]
+[n] Kprobe dynamic event - probing module  [FAIL]
 [n] Create/delete multiprobe on kprobe event   [PASS]
 [n] Kprobe event parser error log check[PASS]
 [n] Kretprobe dynamic event with arguments [PASS]
@@ -57,11 +57,11 @@
 [n] Kprobe events - probe points   [PASS]
 [n] Kprobe dynamic event - adding and removing [PASS]
 [n] Uprobe event parser error log check[PASS]
-[n] test for the preemptirqsoff tracer [UNSUPPORTED]
-[n] Meta-selftest: Checkbashisms   [UNRESOLVED]
+[n] test for the preemptirqsoff tracer [FAIL]
+[n] 

Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-03 Thread Arnd Bergmann
On Mon, May 3, 2021 at 9:35 AM Alexander Dahl  wrote:
>
> Desktops and servers are all nice, however I just want to make you
> aware, there are embedded users forced to stick to older cross
> toolchains for different reasons as well, e.g. in industrial
> environment. :-)
>
> This is no show stopper for us, I just wanted to let you be aware.

Can you be more specific about what scenarios you are thinking of,
what the motivations are for using an old compiler with a new kernel
on embedded systems, and what you think a realistic maximum
time would be between compiler updates?

One scenario that I've seen previously is where user space and
kernel are built together as a source based distribution (OE, buildroot,
openwrt, ...), and the compiler is picked to match the original sources
of the user space because that is best tested, but the same compiler
then gets used to build the kernel as well because that is the default
in the build environment.

There are two problems I see with this logic:

- Running the latest kernel to avoid security problems is of course
  a good idea, but if one runs that with ten year old user space that
  is never updated, the system is likely to end up just as insecure.
  Not all bugs are in the kernel.

- The same logic that applies to ancient user space staying with
  an ancient compiler (it's better tested in this combination) also
  applies to the kernel: running the latest kernel on an old compiler
  is something that few people test, and tends to run into more bugs
  than using the compiler that other developers used to test that
  kernel.

   Arnd


Re: [PATCH v3] powerpc/64s/radix: Enable huge vmalloc mappings

2021-05-03 Thread Christophe Leroy




On 03/05/2021 at 11:17, Nicholas Piggin wrote:

This reduces TLB misses by nearly 30x on a `git diff` workload on a
2-node POWER9 (59,800 -> 2,100) and reduces CPU cycles by 0.54%, due
to vfs hashes being allocated with 2MB pages.

Acked-by: Michael Ellerman 
Signed-off-by: Nicholas Piggin 


Reviewed-by: Christophe Leroy 





[PATCH v3] powerpc/64s/radix: Enable huge vmalloc mappings

2021-05-03 Thread Nicholas Piggin
This reduces TLB misses by nearly 30x on a `git diff` workload on a
2-node POWER9 (59,800 -> 2,100) and reduces CPU cycles by 0.54%, due
to vfs hashes being allocated with 2MB pages.

Acked-by: Michael Ellerman 
Signed-off-by: Nicholas Piggin 
---
Since v2:
- Fix ppc32 compile bug.

Since v1:
- Don't define MODULES_VADDR which has some other side effect (e.g.,
  ptdump).
- Fixed (hopefully) kbuild warning.
- Keep __vmalloc_node_range call on 3 lines.

 .../admin-guide/kernel-parameters.txt  |  2 ++
 arch/powerpc/Kconfig   |  1 +
 arch/powerpc/kernel/module.c   | 18 +-
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1c0a3cf6fcc9..1be38b25c485 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3250,6 +3250,8 @@
 
nohugeiomap [KNL,X86,PPC,ARM64] Disable kernel huge I/O mappings.
 
+   nohugevmalloc   [PPC] Disable kernel huge vmalloc mappings.
+
nosmt   [KNL,S390] Disable symmetric multithreading (SMT).
Equivalent to smt=1.
 
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1e6230bea09d..c547a9d6a2dd 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -185,6 +185,7 @@ config PPC
select GENERIC_VDSO_TIME_NS
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_HUGE_VMAP  if PPC_BOOK3S_64 && PPC_RADIX_MMU
+   select HAVE_ARCH_HUGE_VMALLOC   if HAVE_ARCH_HUGE_VMAP
select HAVE_ARCH_JUMP_LABEL
select HAVE_ARCH_JUMP_LABEL_RELATIVE
select HAVE_ARCH_KASAN  if PPC32 && PPC_PAGE_SHIFT <= 14
diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c
index fab84024650c..3f35c8d20be7 100644
--- a/arch/powerpc/kernel/module.c
+++ b/arch/powerpc/kernel/module.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -88,17 +89,22 @@ int module_finalize(const Elf_Ehdr *hdr,
return 0;
 }
 
-#ifdef MODULES_VADDR
 static __always_inline void *
 __module_alloc(unsigned long size, unsigned long start, unsigned long end)
 {
-   return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL,
-   PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
-   __builtin_return_address(0));
+   /*
+* Don't do huge page allocations for modules yet until more testing
+* is done. STRICT_MODULE_RWX may require extra work to support this
+* too.
+*/
+   return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL, PAGE_KERNEL_EXEC,
+   VM_FLUSH_RESET_PERMS | VM_NO_HUGE_VMAP,
+   NUMA_NO_NODE, __builtin_return_address(0));
 }
 
 void *module_alloc(unsigned long size)
 {
+#ifdef MODULES_VADDR
unsigned long limit = (unsigned long)_etext - SZ_32M;
void *ptr = NULL;
 
@@ -112,5 +118,7 @@ void *module_alloc(unsigned long size)
ptr = __module_alloc(size, MODULES_VADDR, MODULES_END);
 
return ptr;
-}
+#else
+   return __module_alloc(size, VMALLOC_START, VMALLOC_END);
 #endif
+}
-- 
2.23.0
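
As a hedged illustration of the `#ifdef MODULES_VADDR` / `#else` split in
module_alloc() above, the following standalone C sketch picks a dedicated
module range when one is defined and falls back to the vmalloc range
otherwise. The addresses, `struct va_range` and `module_alloc_range()` are
hypothetical and exist only for this example; they are not kernel
definitions.

```c
/* Hypothetical address ranges, chosen only for this illustration. */
#define VMALLOC_START 0xc0000000UL
#define VMALLOC_END   0xf0000000UL

/*
 * Build with -DMODULES_VADDR=0xb0000000UL -DMODULES_END=0xc0000000UL
 * to simulate a platform that has a dedicated module region.
 */

struct va_range {
	unsigned long start;
	unsigned long end;
};

/* Pick the module region if the platform defines one, else vmalloc space. */
static struct va_range module_alloc_range(void)
{
#ifdef MODULES_VADDR
	struct va_range r = { MODULES_VADDR, MODULES_END };
#else
	struct va_range r = { VMALLOC_START, VMALLOC_END };
#endif
	return r;
}
```

Built without the -D flags, module_alloc_range() returns the vmalloc range,
which mirrors the `#else` branch added by the patch.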



Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-03 Thread Joe Perches
On Mon, 2021-05-03 at 09:34 +0200, Alexander Dahl wrote:
> Desktops and servers are all nice, however I just want to make you
> aware, there are embedded users forced to stick to older cross
> toolchains for different reasons as well, e.g. in industrial
> environment. :-)

In your embedded case, what kernel version do you use?

For older toolchains, unless it's kernel version 5.13+,
it wouldn't matter.

And all the supported architectures have gcc 10.3 available at
http://cdn.kernel.org/pub/tools/crosstool/




Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-03 Thread Alexander Dahl
Hi there,

On Sun, May 02, 2021 at 11:30:07PM +0100, Matthew Wilcox wrote:
> On Sun, May 02, 2021 at 02:08:31PM -0700, Linus Torvalds wrote:
> > What is relevant is what version of gcc various distributions actually
> > have reasonably easily available, and how old and relevant the
> > distributions are. We did decide that (just as an example) RHEL 7 was
> > too old to worry about when we updated the gcc version requirement
> > last time.
> > 
> > Last year, Arnd and Kirill (maybe others were involved too) made a
> > list of distros and older gcc versions. But I don't think anybody
> > actually _maintains_ such a list. It would be perhaps interesting to
> > have some way to check what compiler versions are being offered by
> > different distros.
> 
> fwiw, Debian 9 aka Stretch released June 2017 had gcc 6.3
> Debian 10 aka Buster released June 2019 had gcc 7.4 *and* 8.3.
> Debian 8 aka Jessie had gcc-4.8.4 and gcc-4.9.2.
> 
> So do we care about people who haven't bothered to upgrade userspace
> since 2017?  If so, we can't go past 4.9.

Desktops and servers are all well and good; however, I just want to make
you aware that there are also embedded users forced to stick with older
cross toolchains for various reasons, e.g. in industrial environments. :-)

This is not a showstopper for us; I just wanted to make you aware.

Greets
Alex



Re: [PATCH v2] powerpc/64: BE option to use ELFv2 ABI for big endian kernels

2021-05-03 Thread Michal Suchánek
On Mon, May 03, 2021 at 10:58:33AM +1000, Nicholas Piggin wrote:
> Excerpts from Michal Suchánek's message of May 3, 2021 2:57 am:
> > On Tue, Apr 28, 2020 at 09:25:17PM +1000, Nicholas Piggin wrote:
> >> Provide an option to use ELFv2 ABI for big endian builds. This works on
> >> GCC and clang (since 2014). It is less well tested and supported by the
> >> GNU toolchain, but it can give some useful advantages of the ELFv2 ABI
> >> for BE (e.g., less stack usage). Some distros even build BE ELFv2
> >> userspace.
> > 
> > Fixes BTFID failure on BE for me and the ELF ABIv2 kernel boots.
> 
> What's the BTFID failure? Anything we can do to fix it on the v1 ABI or 
> at least make it depend on BUILD_ELF_V2?

It looks like symbols are prefixed with a dot in ABIv1 and the BTFID tool is
not aware of that. BTF generation can easily be disabled on ABIv1.

Thanks

Michal

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 678c13967580..e703c26e9b80 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -305,6 +305,7 @@ config DEBUG_INFO_BTF
bool "Generate BTF typeinfo"
depends on !DEBUG_INFO_SPLIT && !DEBUG_INFO_REDUCED
depends on !GCC_PLUGIN_RANDSTRUCT || COMPILE_TEST
+   depends on !PPC64 || BUILD_ELF_V2
help
  Generate deduplicated BTF type information from DWARF debug info.
  Turning this on expects presence of pahole tool, which will convert

> 
> > 
> > Tested-by: Michal Suchánek 
> > 
> > Also can we enable mprofile on BE now?
> > 
> > I don't see anything endian-specific in the mprofile code at a glance
> > but don't have any idea how to test it.
> 
> AFAIK it's just a different ABI for the _mcount call so just running
> some ftrace and ftrace with call graph should test it reasonably well.
> 
> > 
> > Thanks
> > 
> > Michal
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 6a4ad11f6349..75b3afbfc378 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -495,7 +495,7 @@ config LD_HEAD_STUB_CATCH
> >   If unsure, say "N".
> >  
> >  config MPROFILE_KERNEL
> > -   depends on PPC64 && CPU_LITTLE_ENDIAN && FUNCTION_TRACER
> > +   depends on PPC64 && BUILD_ELF_V2 && FUNCTION_TRACER
> > def_bool 
> > $(success,$(srctree)/arch/powerpc/tools/gcc-check-mprofile-kernel.sh $(CC) 
> > -I$(srctree)/include -D__KERNEL__)
> 
> Good idea. I can't remember if I did a grep for LITTLE_ENDIAN to check 
> for other such opportunities.
> 
> Thanks,
> Nick
> 
> >  
> >  config HOTPLUG_CPU
> >> 
> >> Reviewed-by: Segher Boessenkool 
> >> Signed-off-by: Nicholas Piggin 
> >> ---
> >> Since v1:
> >> - Improved the override flavour name suggested by Segher.
> >> - Improved changelog wording.
> >> 
> >> 
> >>  arch/powerpc/Kconfig| 19 +++
> >>  arch/powerpc/Makefile   | 15 ++-
> >>  arch/powerpc/boot/Makefile  |  4 
> >>  drivers/crypto/vmx/Makefile |  8 ++--
> >>  drivers/crypto/vmx/ppc-xlate.pl | 10 ++
> >>  5 files changed, 45 insertions(+), 11 deletions(-)
> >> 
> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> >> index 924c541a9260..d9d2abc06c2c 100644
> >> --- a/arch/powerpc/Kconfig
> >> +++ b/arch/powerpc/Kconfig
> >> @@ -147,6 +147,7 @@ config PPC
> >>select ARCH_WEAK_RELEASE_ACQUIRE
> >>select BINFMT_ELF
> >>select BUILDTIME_TABLE_SORT
> >> +  select BUILD_ELF_V2 if PPC64 && CPU_LITTLE_ENDIAN
> >>select CLONE_BACKWARDS
> >>select DCACHE_WORD_ACCESS   if PPC64 && CPU_LITTLE_ENDIAN
> >>select DYNAMIC_FTRACE   if FUNCTION_TRACER
> >> @@ -541,6 +542,24 @@ config KEXEC_FILE
> >>  config ARCH_HAS_KEXEC_PURGATORY
> >>def_bool KEXEC_FILE
> >>  
> >> +config BUILD_ELF_V2
> >> +  bool
> >> +
> >> +config BUILD_BIG_ENDIAN_ELF_V2
> >> +  bool "Build big-endian kernel using ELFv2 ABI (EXPERIMENTAL)"
> >> +  depends on PPC64 && CPU_BIG_ENDIAN && EXPERT
> >> +  default n
> >> +  select BUILD_ELF_V2
> >> +  help
> >> +This builds the kernel image using the ELFv2 ABI, which has a
> >> +reduced stack overhead and faster function calls. This does not
> >> +affect the userspace ABIs.
> >> +
> >> +ELFv2 is the standard ABI for little-endian, but for big-endian
> >> +this is an experimental option that is less tested (kernel and
> >> +toolchain). This requires gcc 4.9 or newer and binutils 2.24 or
> >> +newer.
> >> +
> >>  config RELOCATABLE
> >>bool "Build a relocatable kernel"
> >>depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
> >> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
> >> index f310c32e88a4..e306b39d847e 100644
> >> --- a/arch/powerpc/Makefile
> >> +++ b/arch/powerpc/Makefile
> >> @@ -92,10 +92,14 @@ endif
> >>  
> >>  ifdef CONFIG_PPC64
> >>  ifndef CONFIG_CC_IS_CLANG
> >> -cflags-$(CONFIG_CPU_BIG_ENDIAN)   += $(call cc-option,-mabi=elfv1)
> >> -cflags-$(CONFIG_CPU_BIG_ENDIAN)   += $(call 
> >> 

Re: [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END}

2021-05-03 Thread Christophe Leroy

On 03/05/2021 at 08:26, Jordan Niethe wrote:
> On Mon, May 3, 2021 at 4:22 PM Christophe Leroy
>  wrote:
>> On 03/05/2021 at 08:16, Jordan Niethe wrote:
>>> On Mon, May 3, 2021 at 3:57 PM Christophe Leroy
>>>  wrote:
>>>> On 03/05/2021 at 07:39, Jordan Niethe wrote:
>>>>> On Thu, Apr 29, 2021 at 3:04 PM Christophe Leroy
>>>>>  wrote:
>>>>>> On 29/04/2021 at 05:15, Jordan Niethe wrote:
>>>>>>> If MODULES_{VADDR,END} are not defined set them to VMALLOC_START and
>>>>>>> VMALLOC_END respectively. This reduces the need for special cases. For
>>>>>>> example, powerpc's module_alloc() was previously predicated on
>>>>>>> MODULES_VADDR being defined but now is unconditionally defined.
>>>>>>>
>>>>>>> This will be useful reducing conditional code in other places that need
>>>>>>> to allocate from the module region (i.e., kprobes).
>>>>>>>
>>>>>>> Signed-off-by: Jordan Niethe
>>>>>>> ---
>>>>>>> v10: New to series
>>>>>>> v11: - Consider more places MODULES_VADDR was being used
>>>>>>> ---
>>>>>>>  arch/powerpc/include/asm/pgtable.h| 11 +++
>>>>>>>  arch/powerpc/kernel/module.c  |  5 +
>>>>>>>  arch/powerpc/mm/kasan/kasan_init_32.c | 10 +-
>>>>>>>  arch/powerpc/mm/ptdump/ptdump.c   |  4 ++--
>>>>>>>  4 files changed, 19 insertions(+), 11 deletions(-)
>>>>>>>
>>>>>>> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
>>>>>>> index c6a676714f04..882fda779648 100644
>>>>>>> --- a/arch/powerpc/include/asm/pgtable.h
>>>>>>> +++ b/arch/powerpc/include/asm/pgtable.h
>>>>>>> @@ -39,6 +39,17 @@ struct mm_struct;
>>>>>>>  #define __S110  PAGE_SHARED_X
>>>>>>>  #define __S111  PAGE_SHARED_X
>>>>>>>
>>>>>>> +#ifndef MODULES_VADDR
>>>>>>> +#define MODULES_VADDR VMALLOC_START
>>>>>>> +#define MODULES_END VMALLOC_END
>>>>>>> +#endif
>>>>>>> +
>>>>>>> +#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_STRICT_KERNEL_RWX)
>>>>>>
>>>>>> No no.
>>>>>>
>>>>>> TASK_SIZE > MODULES_VADDR is ALWAYS wrong, for any target, in any configuration.
>>>>>>
>>>>>> Why is it a problem to leave the test as a BUILD_BUG_ON() in module_alloc() ?
>>>>>
>>>>> On ppc64s, MODULES_VADDR is __vmalloc_start (a variable)  and
>>>>> TASK_SIZE depends on current.
>>>>> Also for nohash like 44x, MODULES_VADDR is defined based on high_memory.
>>>>> If I put it back in module_alloc() and wrap it with #ifdef
>>>>> CONFIG_PPC_BOOK3S_32 will that be fine?
>>>>
>>>> Thinking about it once more, I think the best approach is the one taken by Nick in
>>>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210502110050.324953-1-npig...@gmail.com/
>>>>
>>>> Use MODULES_VADDR/MODULES_END when it exists, use VMALLOC_START/VMALLOC_END otherwise.
>>>>
>>>> I know I suggested to always define MODULES_VADDR, but maybe that's not the best solution at the end.
>>>
>>> Sure, let's do it like that.
>>>
>>>> For kprobes, is there a way to re-use functions from modules.c in alloc_insn_page() ?
>>>
>>> Probably we can use module_alloc() then the set_memory_ functions to
>>> get the permissions right.
>>> Something like we had in v9:
>>> https://lore.kernel.org/linuxppc-dev/20210316031741.1004850-3-jniet...@gmail.com/
>>
>> Yes, more or less, but using module_alloc() instead of vmalloc().
>> And module_alloc() implies EXEC, so only the set_memory_ro() will be required.
>
> Yep.
>
>> I see no point in doing any set_memory_xxx() in free_insn_page(), because as soon as you do a
>> vfree() the page is not mapped anymore so any access will lead to a fault.
>
> Yeah, I'd not realised we had VM_FLUSH_RESET_PERMS when I added that.
> I agree it's pointless.

In the end it should be quite similar to what the S390 architecture does.
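
The allocate-then-write-protect flow discussed in this thread can be
sketched in userspace C. This is only an analogy under stated assumptions:
module_alloc(), set_memory_ro() and module_memfree() below are userspace
stand-ins built on mmap()/mprotect()/munmap(), not the kernel
implementations, and demo_page_size() is a hypothetical helper. The real
module_alloc() returns executable memory, which the stand-in does not try
to reproduce.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: page size of the running system. */
static size_t demo_page_size(void)
{
	return (size_t)sysconf(_SC_PAGESIZE);
}

/* Stand-in for module_alloc(): maps read/write memory only. */
static void *module_alloc(unsigned long size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Stand-in for set_memory_ro(): drop write permission on numpages pages. */
static int set_memory_ro(unsigned long addr, int numpages)
{
	return mprotect((void *)addr, (size_t)numpages * demo_page_size(),
			PROT_READ);
}

/* Stand-in for module_memfree(); assumes a one-page allocation. */
static void module_memfree(void *region)
{
	munmap(region, demo_page_size());
}

/* Allocate one page from the "module" allocator, then make it read-only. */
void *alloc_insn_page(void)
{
	void *page = module_alloc(demo_page_size());

	if (!page)
		return NULL;
	if (set_memory_ro((unsigned long)page, 1)) {
		module_memfree(page);
		return NULL;
	}
	return page;
}

/* No permission reset before freeing: the mapping simply goes away. */
void free_insn_page(void *page)
{
	if (page)
		module_memfree(page);
}
```

In this sketch free_insn_page() does not touch permissions, matching the
point made above that any access after the page is unmapped faults anyway.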


Re: [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END}

2021-05-03 Thread Jordan Niethe
On Mon, May 3, 2021 at 4:22 PM Christophe Leroy
 wrote:
>
>
>
> On 03/05/2021 at 08:16, Jordan Niethe wrote:
> > On Mon, May 3, 2021 at 3:57 PM Christophe Leroy
> >  wrote:
> >>
> >> On 03/05/2021 at 07:39, Jordan Niethe wrote:
> >>> On Thu, Apr 29, 2021 at 3:04 PM Christophe Leroy
> >>>  wrote:
> >>>>
> >>>> On 29/04/2021 at 05:15, Jordan Niethe wrote:
> >>>>> If MODULES_{VADDR,END} are not defined set them to VMALLOC_START and
> >>>>> VMALLOC_END respectively. This reduces the need for special cases. For
> >>>>> example, powerpc's module_alloc() was previously predicated on
> >>>>> MODULES_VADDR being defined but now is unconditionally defined.
> >>>>>
> >>>>> This will be useful reducing conditional code in other places that need
> >>>>> to allocate from the module region (i.e., kprobes).
> >>>>>
> >>>>> Signed-off-by: Jordan Niethe
> >>>>> ---
> >>>>> v10: New to series
> >>>>> v11: - Consider more places MODULES_VADDR was being used
> >>>>> ---
> >>>>>  arch/powerpc/include/asm/pgtable.h| 11 +++
> >>>>>  arch/powerpc/kernel/module.c  |  5 +
> >>>>>  arch/powerpc/mm/kasan/kasan_init_32.c | 10 +-
> >>>>>  arch/powerpc/mm/ptdump/ptdump.c   |  4 ++--
> >>>>>  4 files changed, 19 insertions(+), 11 deletions(-)
> >>>>>
> >>>>> diff --git a/arch/powerpc/include/asm/pgtable.h
> >>>>> b/arch/powerpc/include/asm/pgtable.h
> >>>>> index c6a676714f04..882fda779648 100644
> >>>>> --- a/arch/powerpc/include/asm/pgtable.h
> >>>>> +++ b/arch/powerpc/include/asm/pgtable.h
> >>>>> @@ -39,6 +39,17 @@ struct mm_struct;
> >>>>>  #define __S110  PAGE_SHARED_X
> >>>>>  #define __S111  PAGE_SHARED_X
> >>>>>
> >>>>> +#ifndef MODULES_VADDR
> >>>>> +#define MODULES_VADDR VMALLOC_START
> >>>>> +#define MODULES_END VMALLOC_END
> >>>>> +#endif
> >>>>> +
> >>>>> +#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_STRICT_KERNEL_RWX)
> >>>>
> >>>> No no.
> >>>>
> >>>> TASK_SIZE > MODULES_VADDR is ALWAYS wrong, for any target, in any
> >>>> configuration.
> >>>>
> >>>> Why is it a problem to leave the test as a BUILD_BUG_ON() in
> >>>> module_alloc() ?
> >>> On ppc64s, MODULES_VADDR is __vmalloc_start (a variable)  and
> >>> TASK_SIZE depends on current.
> >>> Also for nohash like 44x, MODULES_VADDR is defined based on high_memory.
> >>> If I put it back in module_alloc() and wrap it with #ifdef
> >>> CONFIG_PPC_BOOK3S_32 will that be fine?
> >>
> >> Thinking about it once more, I think the best approach is the one taken by 
> >> Nick in
> >> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210502110050.324953-1-npig...@gmail.com/
> >>
> >> Use MODULES_VADDR/MODULES_END when it exists, use 
> >> VMALLOC_START/VMALLOC_END otherwise.
> >>
> >> I know I suggested to always define MODULES_VADDR, but maybe that's not 
> >> the best solution at the end.
> > Sure, let's do it like that.
> >>
> >> For kprobes, is there a way to re-use functions from modules.c in 
> >> alloc_insn_page() ?
> > Probably we can use module_alloc() then the set_memory_ functions to
> > get the permissions right.
> > Something like we had in v9:
> > https://lore.kernel.org/linuxppc-dev/20210316031741.1004850-3-jniet...@gmail.com/
>
> Yes, more or less, but using module_alloc() instead of vmalloc().
> And module_alloc() implies EXEC, so only the set_memory_ro() will be required.
Yep.
>
> I see no point in doing any set_memory_xxx() in free_insn_page(), because as 
> soon as you do a
> vfree() the page is not mapped anymore so any access will lead to a fault.
Yeah, I'd not realised we had VM_FLUSH_RESET_PERMS when I added that.
I agree it's pointless.
>
> Christophe


Re: [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END}

2021-05-03 Thread Christophe Leroy

On 03/05/2021 at 08:16, Jordan Niethe wrote:
> On Mon, May 3, 2021 at 3:57 PM Christophe Leroy
>  wrote:
>> On 03/05/2021 at 07:39, Jordan Niethe wrote:
>>> On Thu, Apr 29, 2021 at 3:04 PM Christophe Leroy
>>>  wrote:
>>>> On 29/04/2021 at 05:15, Jordan Niethe wrote:
>>>>> If MODULES_{VADDR,END} are not defined set them to VMALLOC_START and
>>>>> VMALLOC_END respectively. This reduces the need for special cases. For
>>>>> example, powerpc's module_alloc() was previously predicated on
>>>>> MODULES_VADDR being defined but now is unconditionally defined.
>>>>>
>>>>> This will be useful reducing conditional code in other places that need
>>>>> to allocate from the module region (i.e., kprobes).
>>>>>
>>>>> Signed-off-by: Jordan Niethe
>>>>> ---
>>>>> v10: New to series
>>>>> v11: - Consider more places MODULES_VADDR was being used
>>>>> ---
>>>>>  arch/powerpc/include/asm/pgtable.h| 11 +++
>>>>>  arch/powerpc/kernel/module.c  |  5 +
>>>>>  arch/powerpc/mm/kasan/kasan_init_32.c | 10 +-
>>>>>  arch/powerpc/mm/ptdump/ptdump.c   |  4 ++--
>>>>>  4 files changed, 19 insertions(+), 11 deletions(-)
>>>>>
>>>>> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
>>>>> index c6a676714f04..882fda779648 100644
>>>>> --- a/arch/powerpc/include/asm/pgtable.h
>>>>> +++ b/arch/powerpc/include/asm/pgtable.h
>>>>> @@ -39,6 +39,17 @@ struct mm_struct;
>>>>>  #define __S110  PAGE_SHARED_X
>>>>>  #define __S111  PAGE_SHARED_X
>>>>>
>>>>> +#ifndef MODULES_VADDR
>>>>> +#define MODULES_VADDR VMALLOC_START
>>>>> +#define MODULES_END VMALLOC_END
>>>>> +#endif
>>>>> +
>>>>> +#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_STRICT_KERNEL_RWX)
>>>>
>>>> No no.
>>>>
>>>> TASK_SIZE > MODULES_VADDR is ALWAYS wrong, for any target, in any configuration.
>>>>
>>>> Why is it a problem to leave the test as a BUILD_BUG_ON() in module_alloc() ?
>>>
>>> On ppc64s, MODULES_VADDR is __vmalloc_start (a variable)  and
>>> TASK_SIZE depends on current.
>>> Also for nohash like 44x, MODULES_VADDR is defined based on high_memory.
>>> If I put it back in module_alloc() and wrap it with #ifdef
>>> CONFIG_PPC_BOOK3S_32 will that be fine?
>>
>> Thinking about it once more, I think the best approach is the one taken by Nick in
>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210502110050.324953-1-npig...@gmail.com/
>>
>> Use MODULES_VADDR/MODULES_END when it exists, use VMALLOC_START/VMALLOC_END otherwise.
>>
>> I know I suggested to always define MODULES_VADDR, but maybe that's not the best solution at the end.
>
> Sure, let's do it like that.
>
>> For kprobes, is there a way to re-use functions from modules.c in alloc_insn_page() ?
>
> Probably we can use module_alloc() then the set_memory_ functions to
> get the permissions right.
> Something like we had in v9:
> https://lore.kernel.org/linuxppc-dev/20210316031741.1004850-3-jniet...@gmail.com/

Yes, more or less, but using module_alloc() instead of vmalloc().
And module_alloc() implies EXEC, so only the set_memory_ro() will be required.

I see no point in doing any set_memory_xxx() in free_insn_page(), because as soon as you do a
vfree() the page is not mapped anymore so any access will lead to a fault.

Christophe


Re: [PATCH] Raise the minimum GCC version to 5.2

2021-05-03 Thread Christophe Leroy

On 01/05/2021 at 17:15, Masahiro Yamada wrote:
> The current minimum GCC version is 4.9 except ARCH=arm64 requiring
> GCC 5.1.
>
> When we discussed last time, we agreed to raise the minimum GCC version
> to 5.1 globally. [1]
>
> I'd like to propose GCC 5.2 to clean up arch/powerpc/Kconfig as well.

One point I missed when I first saw your patch, but realised during the
discussion:

Up to 4.9, GCC versions were numbered with three components: 4.8.0, 4.8.1,
... 4.8.5, 4.9.0, 4.9.1, ... 4.9.4

Then, starting with 5, GCC switched to a two-component scheme: 5.0, 5.1,
5.2, ... 5.5

So it is not GCC 5.1 or 5.2 that you should target, but simply GCC 5.
It is then up to the user to use the latest available version of GCC 5, which
is currently 5.5, just as the user would have selected 4.9.4 when 4.9 was the
minimum GCC version.
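
To illustrate the numbering point, here is a small C sketch that encodes a
version as a single integer (major * 10000 + minor * 100 + patchlevel, which
is assumed here to match the encoding used for kernel GCC version checks)
and then compares only the major number. The helper names are hypothetical
and chosen just for this example.

```c
/* Encode a compiler version as one integer, e.g. 4.9.4 -> 40904. */
static int gcc_version_code(int major, int minor, int patchlevel)
{
	return major * 10000 + minor * 100 + patchlevel;
}

/* Recover the major number from an encoded version. */
static int gcc_major(int code)
{
	return code / 10000;
}

/* "Minimum GCC 5" means any 5.x release qualifies: only the major counts. */
static int gcc_meets_minimum_major(int code, int min_major)
{
	return gcc_major(code) >= min_major;
}
```

With a minimum of "GCC 5", both 5.1 and 5.5 pass the check while 4.9.4 does
not, so no minor number needs to appear in the requirement.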


Christophe


Re: [PATCH v11 3/9] powerpc: Always define MODULES_{VADDR,END}

2021-05-03 Thread Jordan Niethe
On Mon, May 3, 2021 at 3:57 PM Christophe Leroy
 wrote:
>
>
>
> On 03/05/2021 at 07:39, Jordan Niethe wrote:
> > On Thu, Apr 29, 2021 at 3:04 PM Christophe Leroy
> >  wrote:
> >>
> >>
> >>
> >> On 29/04/2021 at 05:15, Jordan Niethe wrote:
> >>> If MODULES_{VADDR,END} are not defined set them to VMALLOC_START and
> >>> VMALLOC_END respectively. This reduces the need for special cases. For
> >>> example, powerpc's module_alloc() was previously predicated on
> >>> MODULES_VADDR being defined but now is unconditionally defined.
> >>>
> >>> This will be useful reducing conditional code in other places that need
> >>> to allocate from the module region (i.e., kprobes).
> >>>
> >>> Signed-off-by: Jordan Niethe 
> >>> ---
> >>> v10: New to series
> >>> v11: - Consider more places MODULES_VADDR was being used
> >>> ---
> >>>arch/powerpc/include/asm/pgtable.h| 11 +++
> >>>arch/powerpc/kernel/module.c  |  5 +
> >>>arch/powerpc/mm/kasan/kasan_init_32.c | 10 +-
> >>>arch/powerpc/mm/ptdump/ptdump.c   |  4 ++--
> >>>4 files changed, 19 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/arch/powerpc/include/asm/pgtable.h 
> >>> b/arch/powerpc/include/asm/pgtable.h
> >>> index c6a676714f04..882fda779648 100644
> >>> --- a/arch/powerpc/include/asm/pgtable.h
> >>> +++ b/arch/powerpc/include/asm/pgtable.h
> >>> @@ -39,6 +39,17 @@ struct mm_struct;
> >>>#define __S110  PAGE_SHARED_X
> >>>#define __S111  PAGE_SHARED_X
> >>>
> >>> +#ifndef MODULES_VADDR
> >>> +#define MODULES_VADDR VMALLOC_START
> >>> +#define MODULES_END VMALLOC_END
> >>> +#endif
> >>> +
> >>> +#if defined(CONFIG_PPC_BOOK3S_32) && defined(CONFIG_STRICT_KERNEL_RWX)
> >>
> >> No no.
> >>
> >> TASK_SIZE > MODULES_VADDR is ALWAYS wrong, for any target, in any 
> >> configuration.
> >>
> >> Why is it a problem to leave the test as a BUILD_BUG_ON() in 
> >> module_alloc() ?
> > On ppc64s, MODULES_VADDR is __vmalloc_start (a variable)  and
> > TASK_SIZE depends on current.
> > Also for nohash like 44x, MODULES_VADDR is defined based on high_memory.
> > If I put it back in module_alloc() and wrap it with #ifdef
> > CONFIG_PPC_BOOK3S_32 will that be fine?
>
> Thinking about it once more, I think the best approach is the one taken by 
> Nick in
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210502110050.324953-1-npig...@gmail.com/
>
> Use MODULES_VADDR/MODULES_END when it exists, use VMALLOC_START/VMALLOC_END 
> otherwise.
>
> I know I suggested to always define MODULES_VADDR, but maybe that's not the 
> best solution at the end.
Sure, let's do it like that.
>
> For kprobes, is there a way to re-use functions from modules.c in 
> alloc_insn_page() ?
Probably we can use module_alloc() then the set_memory_ functions to
get the permissions right.
Something like we had in v9:
https://lore.kernel.org/linuxppc-dev/20210316031741.1004850-3-jniet...@gmail.com/
>
> Christophe