Re: [PATCH] Raise the minimum GCC version to 5.2
On 04/05/2021 at 07:30, Alexander Dahl wrote:
> Hello Arnd,
>
> On Mon, May 03, 2021 at 11:25:21AM +0200, Arnd Bergmann wrote:
>> On Mon, May 3, 2021 at 9:35 AM Alexander Dahl wrote:
>>>
>>> Desktops and servers are all nice, however I just want to make you
>>> aware, there are embedded users forced to stick to older cross
>>> toolchains for different reasons as well, e.g. in industrial
>>> environment. :-)
>>>
>>> This is no show stopper for us, I just wanted to let you be aware.
>>
>> Can you be more specific about what scenarios you are thinking of,
>> what the motivations are for using an old compiler with a new kernel
>> on embedded systems, and what you think a realistic maximum
>> time would be between compiler updates?
>
> One reason might be certification. For certain industrial applications
> like support for complex field bus protocols, you need to get your
> devices tested by an external partner running extensive test suites.
> This is time consuming and expensive.
>
> Changing the toolchain of your system then, would be a massive change
> which would require recertification, while you could argue just
> updating a single component like the kernel and building everything
> again, does not require the whole testing process again.

I'm not sure I follow you. Our company provides systems for Air Traffic
Control, so we have the same kind of quality assurance process, but then
I can't understand why you would need to upgrade your kernel at all.

Today our system is based on GCC 5 and kernel 4.14. At present we are
using GCC 5.5 (latest GCC 5) and kernel 4.14.232 (latest 4.14.y).
Kernel 4.14 is maintained until 2024. The day we do an upgrade, we
upgrade everything including the toolchain, then we go for another
6 years without major changes or re-qualification, because we can't
afford a new qualification every now and then.

So really, I don't understand your approach.

Christophe
[PATCH v2 net-next] ibmvnic: remove default label from to_string switch
This way the compiler warns when a new value is added to the enum but
not to the string translation like:

drivers/net/ethernet/ibm/ibmvnic.c: In function 'adapter_state_to_string':
drivers/net/ethernet/ibm/ibmvnic.c:832:2: warning: enumeration value 'VNIC_FOOBAR' not handled in switch [-Wswitch]
  switch (state) {
  ^~
drivers/net/ethernet/ibm/ibmvnic.c: In function 'reset_reason_to_string':
drivers/net/ethernet/ibm/ibmvnic.c:1935:2: warning: enumeration value 'VNIC_RESET_FOOBAR' not handled in switch [-Wswitch]
  switch (reason) {
  ^~

Signed-off-by: Michal Suchanek
---
v2: Fix typo in commit message
---
 drivers/net/ethernet/ibm/ibmvnic.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 5788bb956d73..4d439413f6d9 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -846,9 +846,8 @@ static const char *adapter_state_to_string(enum vnic_state state)
 		return "REMOVING";
 	case VNIC_REMOVED:
 		return "REMOVED";
-	default:
-		return "UNKNOWN";
 	}
+	return "UNKNOWN";
 }
 
 static int ibmvnic_login(struct net_device *netdev)
@@ -1946,9 +1945,8 @@ static const char *reset_reason_to_string(enum ibmvnic_reset_reason reason)
 		return "TIMEOUT";
 	case VNIC_RESET_CHANGE_PARAM:
 		return "CHANGE_PARAM";
-	default:
-		return "UNKNOWN";
 	}
+	return "UNKNOWN";
 }
 
 /*
--
2.26.2
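For readers who want to try the pattern outside the driver, the following is a minimal standalone sketch. The enum and function names (demo_state, demo_state_to_string) are hypothetical and are not taken from ibmvnic. It only illustrates the idea in the commit message: with no default label, GCC's -Wswitch (enabled by -Wall) can report an enumerator that lacks a case, while the return after the switch still handles values outside the enum.

```c
/* Hypothetical enum for illustration; not the driver's vnic_state. */
enum demo_state {
	DEMO_PROBING,
	DEMO_OPEN,
	DEMO_CLOSED,
};

/*
 * No default label: if an enumerator is added to demo_state without a
 * matching case below, a build with -Wswitch (part of -Wall) warns.
 */
const char *demo_state_to_string(enum demo_state state)
{
	switch (state) {
	case DEMO_PROBING:
		return "PROBING";
	case DEMO_OPEN:
		return "OPEN";
	case DEMO_CLOSED:
		return "CLOSED";
	}
	/* Reached only when state matches none of the cases above. */
	return "UNKNOWN";
}
```

Adding, say, a DEMO_FOOBAR enumerator without a case and rebuilding with -Wall should produce a warning of the kind quoted in the commit message.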
Re: [PATCH] Raise the minimum GCC version to 5.2
Hello Arnd,

On Mon, May 03, 2021 at 11:25:21AM +0200, Arnd Bergmann wrote:
> On Mon, May 3, 2021 at 9:35 AM Alexander Dahl wrote:
> >
> > Desktops and servers are all nice, however I just want to make you
> > aware, there are embedded users forced to stick to older cross
> > toolchains for different reasons as well, e.g. in industrial
> > environment. :-)
> >
> > This is no show stopper for us, I just wanted to let you be aware.
>
> Can you be more specific about what scenarios you are thinking of,
> what the motivations are for using an old compiler with a new kernel
> on embedded systems, and what you think a realistic maximum
> time would be between compiler updates?

One reason might be certification. For certain industrial applications
like support for complex field bus protocols, you need to get your
devices tested by an external partner running extensive test suites.
This is time consuming and expensive.

Changing the toolchain of your system would then be a massive change
that requires recertification, while you could argue that just updating
a single component like the kernel and rebuilding everything does not
require the whole testing process again.

Thin ice, I know.

> One scenario that I've seen previously is where user space and
> kernel are built together as a source based distribution (OE, buildroot,
> openwrt, ...), and the compiler is picked to match the original sources
> of the user space because that is best tested, but the same compiler
> then gets used to build the kernel as well because that is the default
> in the build environment.

One problem we actually ran into in BSPs like that (we build with
ptxdist, but the build system doesn't matter here, it could as well
have been buildroot etc.) was things* failing to build with newer
compilers, things we could not or did not want to fix, so staying with
an older toolchain was the obvious choice.

*Things as in bootloaders for an armv5 platform.

> There are two problems I see with this logic:
>
> - Running the latest kernel to avoid security problems is of course
>   a good idea, but if one runs that with ten year old user space that
>   is never updated, the system is likely to end up just as insecure.
>   Not all bugs are in the kernel.

Agreed.

> - The same logic that applies to ancient user space staying with
>   an ancient compiler (it's better tested in this combination) also
>   applies to the kernel: running the latest kernel on an old compiler
>   is something that few people test, and tends to run into more bugs
>   than using the compiler that other developers used to test that
>   kernel.

What we actually did: building recent userspace and kernel with older
toolchains, because of the bootloader. I know, there are several ways
to get out of this kind of lock-in:

- build the bootloader with a different compiler
- update the bootloader
- …

As said before, this is no problem for me now, I can work around it,
but I wanted to give an idea of what could keep people on older
toolchains.

Greets
Alex
Re: [PATCH v3 1/2] KVM: PPC: Book3S HV: Sanitise vcpu registers in nested path
Excerpts from Paul Mackerras's message of May 4, 2021 2:28 pm: > On Sat, May 01, 2021 at 11:58:36AM +1000, Nicholas Piggin wrote: >> Excerpts from Fabiano Rosas's message of April 16, 2021 9:09 am: >> > As one of the arguments of the H_ENTER_NESTED hypercall, the nested >> > hypervisor (L1) prepares a structure containing the values of various >> > hypervisor-privileged registers with which it wants the nested guest >> > (L2) to run. Since the nested HV runs in supervisor mode it needs the >> > host to write to these registers. >> > >> > To stop a nested HV manipulating this mechanism and using a nested >> > guest as a proxy to access a facility that has been made unavailable >> > to it, we have a routine that sanitises the values of the HV registers >> > before copying them into the nested guest's vcpu struct. >> > >> > However, when coming out of the guest the values are copied as they >> > were back into L1 memory, which means that any sanitisation we did >> > during guest entry will be exposed to L1 after H_ENTER_NESTED returns. >> > >> > This patch alters this sanitisation to have effect on the vcpu->arch >> > registers directly before entering and after exiting the guest, >> > leaving the structure that is copied back into L1 unchanged (except >> > when we really want L1 to access the value, e.g the Cause bits of >> > HFSCR). 
>> > >> > Signed-off-by: Fabiano Rosas >> > --- >> > arch/powerpc/kvm/book3s_hv_nested.c | 55 ++--- >> > 1 file changed, 34 insertions(+), 21 deletions(-) >> > >> > diff --git a/arch/powerpc/kvm/book3s_hv_nested.c >> > b/arch/powerpc/kvm/book3s_hv_nested.c >> > index 0cd0e7aad588..270552dd42c5 100644 >> > --- a/arch/powerpc/kvm/book3s_hv_nested.c >> > +++ b/arch/powerpc/kvm/book3s_hv_nested.c >> > @@ -102,8 +102,17 @@ static void save_hv_return_state(struct kvm_vcpu >> > *vcpu, int trap, >> > { >> >struct kvmppc_vcore *vc = vcpu->arch.vcore; >> > >> > + /* >> > + * When loading the hypervisor-privileged registers to run L2, >> > + * we might have used bits from L1 state to restrict what the >> > + * L2 state is allowed to be. Since L1 is not allowed to read >> > + * the HV registers, do not include these modifications in the >> > + * return state. >> > + */ >> > + hr->hfscr = ((~HFSCR_INTR_CAUSE & hr->hfscr) | >> > + (HFSCR_INTR_CAUSE & vcpu->arch.hfscr)); >> > + >> >hr->dpdes = vc->dpdes; >> > - hr->hfscr = vcpu->arch.hfscr; >> >hr->purr = vcpu->arch.purr; >> >hr->spurr = vcpu->arch.spurr; >> >hr->ic = vcpu->arch.ic; >> >> Do we still have the problem here that hfac interrupts due to bits cleared >> by the hfscr sanitisation would have the cause bits returned to the L1, >> so in theory it could probe hfscr directly that way? I don't see a good >> solution to this except either have the L0 intercept these faults and do >> "something" transparent, or return error from H_ENTER_NESTED (which would >> also allow trivial probing of the facilities). > > It seems to me that there are various specific reasons why L0 would > clear HFSCR bits, and if we think about the specific reasons, what we > should do becomes clear. (I say "L0" but in fact the same reasoning > applies to any hypervisor that lets its guest do hypervisor-ish > things.) > > 1. 
Emulating a version of the architecture which doesn't have the > feature in question - in that case the bit should appear to L1 as a > reserved bit in HFSCR (i.e. always read 0), the associated facility > code should never appear in the top 8 bits of any HFSCR value that L1 > sees, and any HFU interrupt received by L0 for the facility should be > changed into an illegal instruction interrupt (or HEAI) forwarded to > L1. In this case the real HFSCR should always have the enable bit for > the facility set to 0. > > 2. Lazy save/restore of the state associated with a facility - in this > case, while the system is in the "lazy" state (i.e. the state is not > that of the currently running guest), the real HFSCR bit for the > facility should be 0. On an HFU interrupt for the facility, L0 looks > at L1's HFSCR value: if it's 0, forward the HFU interrupt to L1; if > it's 1, load up the facility state, set the facility's bit in HFSCR, > and resume the guest. > > 3. Emulating a facility in software - in this case, the real HFSCR > bit for the facility would always be 0. On an HFU interrupt, L0 reads > the instruction and emulates it, then resumes the guest. > > One thing this all makes clear is that the IC field of the "virtual" > HFSCR value seen by L1 should only ever be changed when L0 forwards a > HFU interrupt to L1. > > In fact we currently never do (1) or (2), and we only do (3) for > msgsndp etc., so this discussion is mostly theoretical. Yeah it's somewhat theoretical, and I guess I mostly agree with you. Missing is the case where the L0 does not implement a feature at all. Let's say TM is broken so it disables it, or nobody uses TAR so it doesn't bo
Re: [FSL P50x0] Xorg always restarts again and again after the the PowerPC updates 5.13-1
On 04/05/2021 at 00:25, Christian Zigotzky wrote:
> Hello,
>
> Xorg always restarts again and again after the PowerPC updates
> 5.13-1 [1] on my FSL P5040 Cyrus+ board (A-EON AmigaOne X5000) [2].
> Xorg doesn't start anymore in a virtual e5500 QEMU machine [3].
>
> I bisected today [4].
>
> Result: powerpc/signal32: Convert do_setcontext[_tm]() to user access
> block (887f3ceb51cd34109ac17bfc98695162e299e657) [5] is the first bad
> commit.
>
> Please find attached the kernel config.
>
> Please check the first bad commit.

I'm not sure you can conclude anything here. There is a problem in that
commit, but it is fixed by 525642624783 ("powerpc/signal32: Fix
erroneous SIGSEGV on RT signal return"), which is the last commit of
powerpc-5.13-1.

So any bisect from there will for sure point to 887f3ceb51cd
("powerpc/signal32: Convert do_setcontext[_tm]() to user access block"),
but that's inconclusive. If the problem is still there at the HEAD of
powerpc-5.13-1, the problem is likely somewhere else.

I think you need to do the bisect again with a cherry-pick of
525642624783 at each step.

Thanks
Christophe

> Thanks,
> Christian
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c70a4be130de333ea079c59da41cc959712bb01c
> [2] http://wiki.amiga.org/index.php?title=X5000
> [3] qemu-system-ppc64 -M ppce500 -cpu e5500 -m 1024 -kernel uImage -drive format=raw,file=fedora28-2.img,index=0,if=virtio -netdev user,id=mynet0 -device virtio-net-pci,netdev=mynet0 -append "rw root=/dev/vda" -device virtio-vga -usb -device usb-ehci,id=ehci -device usb-tablet -device virtio-keyboard-pci -smp 4 -vnc :1
> [4] https://forum.hyperion-entertainment.com/viewtopic.php?p=53101#p53101
> [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=887f3ceb51cd34109ac17bfc98695162e299e657
Re: [PATCH v3 1/2] KVM: PPC: Book3S HV: Sanitise vcpu registers in nested path
On Sat, May 01, 2021 at 11:58:36AM +1000, Nicholas Piggin wrote:
> Excerpts from Fabiano Rosas's message of April 16, 2021 9:09 am:
> > As one of the arguments of the H_ENTER_NESTED hypercall, the nested
> > hypervisor (L1) prepares a structure containing the values of various
> > hypervisor-privileged registers with which it wants the nested guest
> > (L2) to run. Since the nested HV runs in supervisor mode it needs the
> > host to write to these registers.
> >
> > To stop a nested HV manipulating this mechanism and using a nested
> > guest as a proxy to access a facility that has been made unavailable
> > to it, we have a routine that sanitises the values of the HV registers
> > before copying them into the nested guest's vcpu struct.
> >
> > However, when coming out of the guest the values are copied as they
> > were back into L1 memory, which means that any sanitisation we did
> > during guest entry will be exposed to L1 after H_ENTER_NESTED returns.
> >
> > This patch alters this sanitisation to have effect on the vcpu->arch
> > registers directly before entering and after exiting the guest,
> > leaving the structure that is copied back into L1 unchanged (except
> > when we really want L1 to access the value, e.g the Cause bits of
> > HFSCR).
> >
> > Signed-off-by: Fabiano Rosas
> > ---
> >  arch/powerpc/kvm/book3s_hv_nested.c | 55 ++---
> >  1 file changed, 34 insertions(+), 21 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
> > index 0cd0e7aad588..270552dd42c5 100644
> > --- a/arch/powerpc/kvm/book3s_hv_nested.c
> > +++ b/arch/powerpc/kvm/book3s_hv_nested.c
> > @@ -102,8 +102,17 @@ static void save_hv_return_state(struct kvm_vcpu *vcpu, int trap,
> >  {
> >  	struct kvmppc_vcore *vc = vcpu->arch.vcore;
> >
> > +	/*
> > +	 * When loading the hypervisor-privileged registers to run L2,
> > +	 * we might have used bits from L1 state to restrict what the
> > +	 * L2 state is allowed to be. Since L1 is not allowed to read
> > +	 * the HV registers, do not include these modifications in the
> > +	 * return state.
> > +	 */
> > +	hr->hfscr = ((~HFSCR_INTR_CAUSE & hr->hfscr) |
> > +		     (HFSCR_INTR_CAUSE & vcpu->arch.hfscr));
> > +
> >  	hr->dpdes = vc->dpdes;
> > -	hr->hfscr = vcpu->arch.hfscr;
> >  	hr->purr = vcpu->arch.purr;
> >  	hr->spurr = vcpu->arch.spurr;
> >  	hr->ic = vcpu->arch.ic;
>
> Do we still have the problem here that hfac interrupts due to bits cleared
> by the hfscr sanitisation would have the cause bits returned to the L1,
> so in theory it could probe hfscr directly that way? I don't see a good
> solution to this except either have the L0 intercept these faults and do
> "something" transparent, or return error from H_ENTER_NESTED (which would
> also allow trivial probing of the facilities).

It seems to me that there are various specific reasons why L0 would
clear HFSCR bits, and if we think about the specific reasons, what we
should do becomes clear. (I say "L0" but in fact the same reasoning
applies to any hypervisor that lets its guest do hypervisor-ish
things.)

1. Emulating a version of the architecture which doesn't have the
feature in question - in that case the bit should appear to L1 as a
reserved bit in HFSCR (i.e. always read 0), the associated facility
code should never appear in the top 8 bits of any HFSCR value that L1
sees, and any HFU interrupt received by L0 for the facility should be
changed into an illegal instruction interrupt (or HEAI) forwarded to
L1. In this case the real HFSCR should always have the enable bit for
the facility set to 0.

2. Lazy save/restore of the state associated with a facility - in this
case, while the system is in the "lazy" state (i.e. the state is not
that of the currently running guest), the real HFSCR bit for the
facility should be 0. On an HFU interrupt for the facility, L0 looks
at L1's HFSCR value: if it's 0, forward the HFU interrupt to L1; if
it's 1, load up the facility state, set the facility's bit in HFSCR,
and resume the guest.

3. Emulating a facility in software - in this case, the real HFSCR
bit for the facility would always be 0. On an HFU interrupt, L0 reads
the instruction and emulates it, then resumes the guest.

One thing this all makes clear is that the IC field of the "virtual"
HFSCR value seen by L1 should only ever be changed when L0 forwards a
HFU interrupt to L1.

In fact we currently never do (1) or (2), and we only do (3) for
msgsndp etc., so this discussion is mostly theoretical.

> Returning an hfac interrupt to a hypervisor that thought it enabled the
> bit would be strange. But so does appearing to modify the register
> underneath it and then returning a fault.

I don't think we should ever do either of those things. The closest
would be (1) above, but in that case the fault has to be either an
illegal instruction type program interrupt, or a HEAI.

> I think the
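As an aside for readers following the quoted hunk, the masking expression in save_hv_return_state() can be tried in isolation. The sketch below is not kernel code: the macro and function names are hypothetical, and the mask value assumes the interrupt-cause field is the top 8 bits of a 64-bit HFSCR, as the discussion above describes. It shows that every bit outside the cause field keeps the value L1 supplied, and only the cause field is taken from the register value left after running L2.

```c
#include <stdint.h>

/* Assumption: the interrupt-cause field occupies the top 8 bits. */
#define DEMO_HFSCR_INTR_CAUSE (UINT64_C(0xFF) << 56)

/*
 * Keep L1's own value for all bits except the cause field, which is
 * copied from the value observed in the vcpu after the guest ran.
 */
uint64_t demo_merge_hfscr(uint64_t l1_hfscr, uint64_t vcpu_hfscr)
{
	return (l1_hfscr & ~DEMO_HFSCR_INTR_CAUSE) |
	       (vcpu_hfscr & DEMO_HFSCR_INTR_CAUSE);
}
```

For example, with an L1 value of 0x00000000000000FF and a vcpu value of 0x0A00000000000001, the merged result is 0x0A000000000000FF: the low bits that L0 may have cleared for sanitisation are not visible to L1, but the cause code is.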
[RFC 09/10] powerpc/rtas: convert to rtas_sched_if_busy()
rtas_sched_if_busy() has better behavior for RTAS_BUSY (-2) and small
extended delay values.

Signed-off-by: Nathan Lynch
---
 arch/powerpc/kernel/rtas.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 4177f7385ea2..c5cc4542856f 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -743,7 +743,7 @@ int rtas_set_power_level(int powerdomain, int level, int *setlevel)
 
 	do {
 		rc = rtas_call(token, 2, 2, setlevel, powerdomain, level);
-	} while (rtas_busy_delay(rc));
+	} while (rtas_sched_if_busy(rc));
 
 	if (rc < 0)
 		return rtas_error_rc(rc);
@@ -761,7 +761,7 @@ int rtas_get_sensor(int sensor, int index, int *state)
 
 	do {
 		rc = rtas_call(token, 2, 2, state, sensor, index);
-	} while (rtas_busy_delay(rc));
+	} while (rtas_sched_if_busy(rc));
 
 	if (rc < 0)
 		return rtas_error_rc(rc);
@@ -822,7 +822,7 @@ int rtas_set_indicator(int indicator, int index, int new_value)
 
 	do {
 		rc = rtas_call(token, 3, 1, NULL, indicator, index, new_value);
-	} while (rtas_busy_delay(rc));
+	} while (rtas_sched_if_busy(rc));
 
 	if (rc < 0)
 		return rtas_error_rc(rc);
@@ -990,7 +990,7 @@ void rtas_activate_firmware(void)
 
 	do {
 		fwrc = rtas_call(token, 0, 1, NULL);
-	} while (rtas_busy_delay(fwrc));
+	} while (rtas_sched_if_busy(fwrc));
 
 	if (fwrc)
 		pr_err("ibm,activate-firmware failed (%i)\n", fwrc);
--
2.30.2
[RFC 10/10] powerpc/rtas_flash: convert to rtas_sched_if_busy()
rtas_sched_if_busy() has better behavior for RTAS_BUSY (-2) and small
extended delay values.

Signed-off-by: Nathan Lynch
---
 arch/powerpc/kernel/rtas_flash.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/rtas_flash.c b/arch/powerpc/kernel/rtas_flash.c
index a99179d83538..bedefb9178ec 100644
--- a/arch/powerpc/kernel/rtas_flash.c
+++ b/arch/powerpc/kernel/rtas_flash.c
@@ -378,7 +378,7 @@ static void manage_flash(struct rtas_manage_flash_t *args_buf, unsigned int op)
 	do {
 		rc = rtas_call(rtas_token("ibm,manage-flash-image"), 1, 1,
 			       NULL, op);
-	} while (rtas_busy_delay(rc));
+	} while (rtas_sched_if_busy(rc));
 
 	args_buf->status = rc;
 }
@@ -456,7 +456,7 @@ static void validate_flash(struct rtas_validate_flash_t *args_buf)
 			       (u32) __pa(rtas_data_buf), args_buf->buf_size);
 		memcpy(args_buf->buf, rtas_data_buf, VALIDATE_BUF_SIZE);
 		spin_unlock(&rtas_data_buf_lock);
-	} while (rtas_busy_delay(rc));
+	} while (rtas_sched_if_busy(rc));
 
 	args_buf->status = rc;
 	args_buf->update_results = update_results;
--
2.30.2
[RFC 08/10] powerpc/pseries/dlpar: convert to rtas_sched_if_busy()
rtas_sched_if_busy() has better behavior for RTAS_BUSY (-2) and small
extended delay values.

Signed-off-by: Nathan Lynch
---
 arch/powerpc/platforms/pseries/dlpar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 3ac70790ec7a..3ba77bc09a6e 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -167,7 +167,7 @@ struct device_node *dlpar_configure_connector(__be32 drc_index,
 
 		spin_unlock(&rtas_data_buf_lock);
 
-		if (rtas_busy_delay(rc))
+		if (rtas_sched_if_busy(rc))
 			continue;
 
 		switch (rc) {
--
2.30.2
[RFC 07/10] powerpc/pseries/iommu: convert to rtas_sched_if_busy()
rtas_sched_if_busy() has better behavior for RTAS_BUSY (-2) and small
extended delay values.

Signed-off-by: Nathan Lynch
---
 arch/powerpc/platforms/pseries/iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 0c55b991f665..0f0e7a51b863 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1016,7 +1016,7 @@ static int create_ddw(struct pci_dev *dev, const u32 *ddw_avail,
 		ret = rtas_call(ddw_avail[DDW_CREATE_PE_DMA_WIN], 5, 4,
 				(u32 *)create, cfg_addr, BUID_HI(buid),
 				BUID_LO(buid), page_shift, window_shift);
-	} while (rtas_busy_delay(ret));
+	} while (rtas_sched_if_busy(ret));
 	dev_info(&dev->dev,
 		"ibm,create-pe-dma-window(%x) %x %x %x %x %x returned %d "
 		"(liobn = 0x%x starting addr = %x %x)\n",
--
2.30.2
[RFC 06/10] powerpc/pseries/msi: convert to rtas_sched_if_busy()
rtas_sched_if_busy() has better behavior for RTAS_BUSY (-2) and small
extended delay values.

Signed-off-by: Nathan Lynch
---
 arch/powerpc/platforms/pseries/msi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index 637300330507..df434b8a3aa7 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -49,7 +49,7 @@ static int rtas_change_msi(struct pci_dn *pdn, u32 func, u32 num_irqs)
 					func, num_irqs, seq_num);
 
 		seq_num = rtas_ret[1];
-	} while (rtas_busy_delay(rc));
+	} while (rtas_sched_if_busy(rc));
 
 	/*
 	 * If the RTAS call succeeded, return the number of irqs allocated.
@@ -100,7 +100,7 @@ static int rtas_query_irq_number(struct pci_dn *pdn, int offset)
 	do {
 		rc = rtas_call(query_token, 4, 3, rtas_ret, addr,
 			       BUID_HI(buid), BUID_LO(buid), offset);
-	} while (rtas_busy_delay(rc));
+	} while (rtas_sched_if_busy(rc));
 
 	if (rc) {
 		pr_debug("rtas_msi: error (%d) querying source number\n", rc);
--
2.30.2
[RFC 05/10] powerpc/pseries/fadump: convert to rtas_sched_if_busy()
None of these call sites need to use mdelay(); convert them to
rtas_sched_if_busy().

Signed-off-by: Nathan Lynch
---
 arch/powerpc/platforms/pseries/rtas-fadump.c | 22 +++-------------------
 1 file changed, 3 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/rtas-fadump.c b/arch/powerpc/platforms/pseries/rtas-fadump.c
index f8f73b47b107..9a200d3bf5e0 100644
--- a/arch/powerpc/platforms/pseries/rtas-fadump.c
+++ b/arch/powerpc/platforms/pseries/rtas-fadump.c
@@ -129,7 +129,6 @@ static u64 rtas_fadump_get_bootmem_min(void)
 
 static int rtas_fadump_register(struct fw_dump *fadump_conf)
 {
-	unsigned int wait_time;
 	int rc, err = -EIO;
 
 	/* TODO: Add upper time limit for the delay */
@@ -137,12 +136,7 @@ static int rtas_fadump_register(struct fw_dump *fadump_conf)
 		rc = rtas_call(fadump_conf->ibm_configure_kernel_dump, 3, 1,
 				NULL, FADUMP_REGISTER, &fdm,
 				sizeof(struct rtas_fadump_mem_struct));
-
-		wait_time = rtas_busy_delay_time(rc);
-		if (wait_time)
-			mdelay(wait_time);
-
-	} while (wait_time);
+	} while (rtas_sched_if_busy(rc));
 
 	switch (rc) {
 	case 0:
@@ -177,7 +171,6 @@ static int rtas_fadump_register(struct fw_dump *fadump_conf)
 
 static int rtas_fadump_unregister(struct fw_dump *fadump_conf)
 {
-	unsigned int wait_time;
 	int rc;
 
 	/* TODO: Add upper time limit for the delay */
@@ -185,11 +178,7 @@ static int rtas_fadump_unregister(struct fw_dump *fadump_conf)
 		rc = rtas_call(fadump_conf->ibm_configure_kernel_dump, 3, 1,
 				NULL, FADUMP_UNREGISTER, &fdm,
 				sizeof(struct rtas_fadump_mem_struct));
-
-		wait_time = rtas_busy_delay_time(rc);
-		if (wait_time)
-			mdelay(wait_time);
-	} while (wait_time);
+	} while (rtas_sched_if_busy(rc));
 
 	if (rc) {
 		pr_err("Failed to un-register - unexpected error(%d).\n", rc);
@@ -202,7 +191,6 @@ static int rtas_fadump_unregister(struct fw_dump *fadump_conf)
 
 static int rtas_fadump_invalidate(struct fw_dump *fadump_conf)
 {
-	unsigned int wait_time;
 	int rc;
 
 	/* TODO: Add upper time limit for the delay */
@@ -210,11 +198,7 @@ static int rtas_fadump_invalidate(struct fw_dump *fadump_conf)
 		rc = rtas_call(fadump_conf->ibm_configure_kernel_dump, 3, 1,
 				NULL, FADUMP_INVALIDATE, fdm_active,
 				sizeof(struct rtas_fadump_mem_struct));
-
-		wait_time = rtas_busy_delay_time(rc);
-		if (wait_time)
-			mdelay(wait_time);
-	} while (wait_time);
+	} while (rtas_sched_if_busy(rc));
 
 	if (rc) {
 		pr_err("Failed to invalidate - unexpected error (%d).\n", rc);
--
2.30.2
[RFC 04/10] powerpc/rtas-rtc: convert set-time-of-day to rtas_sched_if_busy()
rtas_set_rtc_time() is called only in process context; convert this to
rtas_sched_if_busy().

Signed-off-by: Nathan Lynch
---
 arch/powerpc/kernel/rtas-rtc.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/rtas-rtc.c b/arch/powerpc/kernel/rtas-rtc.c
index 82cb95f29a11..421b92f95669 100644
--- a/arch/powerpc/kernel/rtas-rtc.c
+++ b/arch/powerpc/kernel/rtas-rtc.c
@@ -62,7 +62,7 @@ void rtas_get_rtc_time(struct rtc_time *rtc_tm)
 
 int rtas_set_rtc_time(struct rtc_time *tm)
 {
-	int error, wait_time;
+	int error;
 	u64 max_wait_tb;
 
 	max_wait_tb = get_tb() + tb_ticks_per_usec * 1000 * MAX_RTC_WAIT;
@@ -72,13 +72,7 @@ int rtas_set_rtc_time(struct rtc_time *tm)
 				  tm->tm_mday, tm->tm_hour, tm->tm_min,
 				  tm->tm_sec, 0);
 
-		wait_time = rtas_busy_delay_time(error);
-		if (wait_time) {
-			if (in_interrupt())
-				return 1;	/* probably decrementer */
-			msleep(wait_time);
-		}
-	} while (wait_time && (get_tb() < max_wait_tb));
+	} while (rtas_sched_if_busy(error) && (get_tb() < max_wait_tb));
 
 	if (error != 0)
 		printk_ratelimited(KERN_WARNING
--
2.30.2
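The shape of the converted loop (call, check for a busy status, retry until an upper bound is reached) can be exercised in user space with a stand-in for the firmware call. Everything in the sketch below is hypothetical: demo_fake_call() replays a scripted list of statuses instead of calling RTAS, and an attempt budget stands in for the time-base deadline (get_tb() < max_wait_tb) used in the patch. It is meant only to show the control flow, not the kernel implementation.

```c
#include <stdbool.h>

#define DEMO_RTAS_BUSY      (-2)
#define DEMO_EXT_DELAY_MIN  9900
#define DEMO_EXT_DELAY_MAX  9905

/* Scripted statuses replayed by the fake call below. */
struct demo_script {
	const int *statuses;
	unsigned int len;
	unsigned int pos;
};

/* Stand-in for rtas_call(): returns the next scripted status, then 0. */
int demo_fake_call(void *ctx)
{
	struct demo_script *s = ctx;

	if (s->pos < s->len)
		return s->statuses[s->pos++];
	return 0;
}

/* True for -2 (busy) and for the extended delay hints 9900..9905. */
bool demo_status_is_busy(int status)
{
	return status == DEMO_RTAS_BUSY ||
	       (status >= DEMO_EXT_DELAY_MIN && status <= DEMO_EXT_DELAY_MAX);
}

/*
 * Call until the status is no longer busy or the attempt budget runs
 * out. Returns the last status seen (busy if max_tries is 0).
 */
int demo_call_until_done(int (*call)(void *), void *ctx,
			 unsigned int max_tries)
{
	int rc = DEMO_RTAS_BUSY;

	while (max_tries-- > 0) {
		rc = call(ctx);
		if (!demo_status_is_busy(rc))
			break;
	}
	return rc;
}
```

A real caller would sleep or spin between attempts as the status dictates; that part is deliberately left out here.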
[RFC 03/10] powerpc/rtas-rtc: convert get-time-of-day to rtas_force_spin_if_busy()
The functions in rtas-rtc which call get-time-of-day can be invoked in
boot, suspend, and resume paths with interrupts off. Unfortunately
get-time-of-day can return an extended delay status, so we use
rtas_force_spin_if_busy().

In the specific case of rtas_get_rtc_time(), it is not clear why
returning an incorrect result is better than calling again even if we
are in interrupt context. Remove this logic.

Signed-off-by: Nathan Lynch
---
 arch/powerpc/kernel/rtas-rtc.c | 28 ++--------------------------
 1 file changed, 2 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/kernel/rtas-rtc.c b/arch/powerpc/kernel/rtas-rtc.c
index a28239b8b0c0..82cb95f29a11 100644
--- a/arch/powerpc/kernel/rtas-rtc.c
+++ b/arch/powerpc/kernel/rtas-rtc.c
@@ -17,19 +17,12 @@ time64_t __init rtas_get_boot_time(void)
 {
 	int ret[8];
 	int error;
-	unsigned int wait_time;
 	u64 max_wait_tb;
 
 	max_wait_tb = get_tb() + tb_ticks_per_usec * 1000 * MAX_RTC_WAIT;
 	do {
 		error = rtas_call(rtas_token("get-time-of-day"), 0, 8, ret);
-
-		wait_time = rtas_busy_delay_time(error);
-		if (wait_time) {
-			/* This is boot time so we spin. */
-			udelay(wait_time*1000);
-		}
-	} while (wait_time && (get_tb() < max_wait_tb));
+	} while (rtas_force_spin_if_busy(error) && (get_tb() < max_wait_tb));
 
 	if (error != 0) {
 		printk_ratelimited(KERN_WARNING
@@ -41,33 +34,16 @@ time64_t __init rtas_get_boot_time(void)
 	return mktime64(ret[0], ret[1], ret[2], ret[3], ret[4], ret[5]);
 }
 
-/* NOTE: get_rtc_time will get an error if executed in interrupt context
- * and if a delay is needed to read the clock. In this case we just
- * silently return without updating rtc_tm.
- */
 void rtas_get_rtc_time(struct rtc_time *rtc_tm)
 {
 	int ret[8];
 	int error;
-	unsigned int wait_time;
 	u64 max_wait_tb;
 
 	max_wait_tb = get_tb() + tb_ticks_per_usec * 1000 * MAX_RTC_WAIT;
 	do {
 		error = rtas_call(rtas_token("get-time-of-day"), 0, 8, ret);
-
-		wait_time = rtas_busy_delay_time(error);
-		if (wait_time) {
-			if (in_interrupt()) {
-				memset(rtc_tm, 0, sizeof(struct rtc_time));
-				printk_ratelimited(KERN_WARNING
-						   "error: reading clock "
-						   "would delay interrupt\n");
-				return;	/* delay not allowed */
-			}
-			msleep(wait_time);
-		}
-	} while (wait_time && (get_tb() < max_wait_tb));
+	} while (rtas_sched_if_busy(error) && (get_tb() < max_wait_tb));
 
 	if (error != 0) {
 		printk_ratelimited(KERN_WARNING
--
2.30.2
[RFC 01/10] powerpc/rtas: new APIs for busy and extended delay statuses
Add new APIs for handling busy (-2) and extended delay hint
(9900...9905) statuses from RTAS. These are intended to be drop-in
replacements for existing uses of rtas_busy_delay().

A problem with rtas_busy_delay() and rtas_busy_delay_time() is that
they consider -2/busy to be equivalent to 9900 (wait 1ms). In fact, the
OS should call again as soon as it wants on -2, which at least on
PowerVM means RTAS is returning only to uphold the general requirement
that RTAS must return control to the OS in a "timely fashion" (250us).

Combine this with the fact that msleep(1) actually sleeps for more like
20ms in practice: on busy VMs we schedule away for much longer than
necessary on -2 and 9900.

This is fixed in rtas_sched_if_busy(), which uses usleep_range() for
small delay hints, and only schedules away on -2 if there is other work
available. It also refuses to sleep longer than one second regardless
of the hinted value, on the assumption that even longer running
operations can tolerate polling at 1HZ.

rtas_spin_if_busy() and rtas_force_spin_if_busy() are provided for
atomic contexts which need to handle busy status and extended delay
hints.
Signed-off-by: Nathan Lynch
---
 arch/powerpc/include/asm/rtas.h |   4 +
 arch/powerpc/kernel/rtas.c      | 168
 2 files changed, 172 insertions(+)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 9dc97d2f9d27..555ff3290f92 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -266,6 +266,10 @@ extern int rtas_set_rtc_time(struct rtc_time *rtc_time);
 extern unsigned int rtas_busy_delay_time(int status);
 extern unsigned int rtas_busy_delay(int status);
 
+bool rtas_sched_if_busy(int status);
+bool rtas_spin_if_busy(int status);
+bool rtas_force_spin_if_busy(int status);
+
 extern int early_init_dt_scan_rtas(unsigned long node,
 		const char *uname, int depth, void *data);
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 6bada744402b..4a1dfbfa51ba 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -519,6 +519,174 @@ unsigned int rtas_busy_delay(int status)
 }
 EXPORT_SYMBOL(rtas_busy_delay);
 
+/**
+ * rtas_force_spin_if_busy() - Consume a busy or extended delay status
+ *                             in atomic context.
+ * @status: Return value from rtas_call() or similar function.
+ *
+ * Use this function when you cannot avoid using an RTAS function
+ * which may return an extended delay hint in atomic context. If
+ * possible, use rtas_spin_if_busy() or rtas_sched_if_busy() instead
+ * of this function.
+ *
+ * Return: True if @status is -2 or 990x, in which case
+ *         rtas_spin_if_busy() will have delayed an appropriate amount
+ *         of time, and the caller should call the RTAS function
+ *         again. False otherwise.
+ */
+bool rtas_force_spin_if_busy(int status)
+{
+	bool was_busy = true;
+
+	switch (status) {
+	case RTAS_BUSY:
+		/* OK to call again immediately; do nothing. */
+		break;
+	case RTAS_EXTENDED_DELAY_MIN...RTAS_EXTENDED_DELAY_MAX:
+		mdelay(1);
+		break;
+	default:
+		was_busy = false;
+		break;
+	}
+
+	return was_busy;
+}
+
+/**
+ * rtas_spin_if_busy() - Consume a busy status in atomic context.
+ * @status: Return value from rtas_call() or similar function.
+ *
+ * Prefer rtas_sched_if_busy() over this function. Prefer this
+ * function over rtas_force_spin_if_busy(). Use this function in
+ * atomic contexts with RTAS calls that are specified to return -2 but
+ * not 990x. This function will complain and execute a minimal delay
+ * if passed a 990x status.
+ *
+ * Return: True if @status is -2 or 990x, in which case
+ *         rtas_spin_if_busy() will have delayed an appropriate amount
+ *         of time, and the caller should call the RTAS function
+ *         again. False otherwise.
+ */
+bool rtas_spin_if_busy(int status)
+{
+	bool was_busy = true;
+
+	switch (status) {
+	case RTAS_BUSY:
+		/* OK to call again immediately; do nothing. */
+		break;
+	case RTAS_EXTENDED_DELAY_MIN...RTAS_EXTENDED_DELAY_MAX:
+		/*
+		 * Generally, RTAS functions which can return this
+		 * status should be considered too expensive to use in
+		 * atomic context. Change the calling code to use
+		 * rtas_sched_if_busy(), or if that's not possible,
+		 * use rtas_force_spin_if_busy().
+		 */
+		pr_warn_once("%pS may use RTAS call in atomic context which returns extended delay.\n",
+			     __builtin_return_address(0));
+		mdelay(1);
+		break;
+	default:
+		was_busy = false;
+		break;
+	}
+
+	return was_busy;
+}
+
+static unsigned l
[RFC 02/10] powerpc/rtas: do not schedule in rtas_os_term()
rtas_os_term() is called in the panic path and should immediately
re-call the RTAS ibm,os-term function as long as it returns a busy
status. It's not safe to use rtas_busy_delay() in this context, which
potentially can schedule away. Use rtas_spin_if_busy().

Signed-off-by: Nathan Lynch
---
 arch/powerpc/kernel/rtas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 4a1dfbfa51ba..4177f7385ea2 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -960,7 +960,7 @@ void rtas_os_term(char *str)
 	do {
 		status = rtas_call(rtas_token("ibm,os-term"), 1, 1, NULL,
 				   __pa(rtas_os_term_buf));
-	} while (rtas_busy_delay(status));
+	} while (rtas_spin_if_busy(status));
 
 	if (status != 0)
 		printk(KERN_EMERG "ibm,os-term call failed %d\n", status);
--
2.30.2
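
For readers following along outside the kernel tree, here is a minimal userspace sketch of the retry idiom this patch relies on. It is a model only, not the kernel code: the model_* and fake_* names are hypothetical, the firmware call is faked, and the mapping of 990x to 10^x ms (capped at one second, as the series describes for rtas_sched_if_busy()) is an assumption consistent with 9900 meaning "wait 1ms".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Status values as described in the series. */
#define RTAS_BUSY               (-2)
#define RTAS_EXTENDED_DELAY_MIN 9900
#define RTAS_EXTENDED_DELAY_MAX 9905

/*
 * Hypothetical helper: delay hint in milliseconds for an extended delay
 * status, or 0 if @status carries no hint. Assumes 990x means 10^x ms
 * and applies the one-second cap mentioned in the series.
 */
static unsigned int model_delay_ms(int status)
{
	unsigned int ms = 1;
	int order;

	if (status < RTAS_EXTENDED_DELAY_MIN || status > RTAS_EXTENDED_DELAY_MAX)
		return 0;

	for (order = status - RTAS_EXTENDED_DELAY_MIN; order > 0; order--)
		ms *= 10;

	return ms > 1000 ? 1000 : ms;
}

/* Hypothetical helper: true if the caller should issue the call again. */
static bool model_should_retry(int status)
{
	return status == RTAS_BUSY || model_delay_ms(status) != 0;
}

/* Fake firmware: reports busy a fixed number of times, then success. */
struct fake_fw {
	int busy_left;
	int calls;
};

static int fake_call(struct fake_fw *fw)
{
	assert(fw != NULL);
	fw->calls++;
	if (fw->busy_left > 0) {
		fw->busy_left--;
		return RTAS_BUSY;
	}
	return 0;
}

/* Same shape as the loop in rtas_os_term(): re-call while busy, never sleep. */
static int call_until_done(struct fake_fw *fw)
{
	int status;

	do {
		status = fake_call(fw);
	} while (model_should_retry(status));

	return status;
}
```

With a fake firmware that reports busy three times, call_until_done() issues four calls in total and returns 0 without ever yielding, which is the property wanted in the panic path.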
[RFC 00/10] powerpc/rtas: improved busy and extended delay status handling
This is an attempt at providing clearer names as discussed here:

https://github.com/linuxppc/issues/issues/164

as well as providing better behavior for RTAS_BUSY (-2) and small
extended delay values, which in my experience seem more common than
the larger ones.

In testing PREEMPT_NONE kernels with CPUs busy, I see the elapsed time
for memory add operations roughly halved, while memory remove
operations' elapsed time shrinks by about 25%. This is achieved
without significantly more time spent on CPU:

(- is before, + is after)

 Performance counter stats for 'drmgr -c mem -a -q 10' (10 runs):

-             1,898      probe:rtas_call    #    0.003 M/sec          ( +-  2.20% )
-            751.57 msec task-clock         #    0.289 CPUs utilized  ( +-  1.56% )
+             1,969      probe:rtas_call    #    0.003 M/sec          ( +-  2.69% )
+            766.20 msec task-clock         #    0.688 CPUs utilized  ( +-  1.99% )

-             2.605 +- 0.148 seconds time elapsed  ( +-  5.70% )
+            1.1129 +- 0.0660 seconds time elapsed  ( +-  5.93% )

 Performance counter stats for 'drmgr -c mem -r -q 10' (10 runs):

-               673      probe:rtas_call    #    0.002 M/sec          ( +-  0.55% )
-            318.36 msec task-clock         #    0.234 CPUs utilized  ( +-  0.42% )
+               692      probe:rtas_call    #    0.002 M/sec          ( +-  0.73% )
+            320.87 msec task-clock         #    0.309 CPUs utilized  ( +-  0.34% )

-             1.362 +- 0.100 seconds time elapsed  ( +-  7.37% )
+            1.0372 +- 0.0468 seconds time elapsed  ( +-  4.51% )

Questions / concerns / to do:

* I don't love the new API function names.
* Introduces three new APIs when two likely would suffice.
* Need to convert eeh_pseries and scanlog.
* rtas_busy_delay() and rtas_busy_delay_time() not yet removed.

Nathan Lynch (10):
  powerpc/rtas: new APIs for busy and extended delay statuses
  powerpc/rtas: do not schedule in rtas_os_term()
  powerpc/rtas-rtc: convert get-time-of-day to rtas_force_spin_if_busy()
  powerpc/rtas-rtc: convert set-time-of-day to rtas_sched_if_busy()
  powerpc/pseries/fadump: convert to rtas_sched_if_busy()
  powerpc/pseries/msi: convert to rtas_sched_if_busy()
  powerpc/pseries/iommu: convert to rtas_sched_if_busy()
  powerpc/pseries/dlpar: convert to rtas_sched_if_busy()
  powerpc/rtas: convert to rtas_sched_if_busy()
  powerpc/rtas_flash: convert to rtas_sched_if_busy()

 arch/powerpc/include/asm/rtas.h              |   4 +
 arch/powerpc/kernel/rtas-rtc.c               |  38 +---
 arch/powerpc/kernel/rtas.c                   | 178 ++-
 arch/powerpc/kernel/rtas_flash.c             |   4 +-
 arch/powerpc/platforms/pseries/dlpar.c       |   2 +-
 arch/powerpc/platforms/pseries/iommu.c       |   2 +-
 arch/powerpc/platforms/pseries/msi.c         |   4 +-
 arch/powerpc/platforms/pseries/rtas-fadump.c |  22 +--
 8 files changed, 190 insertions(+), 64 deletions(-)

--
2.30.2
[PATCH] powerpc/pseries/dlpar: use rtas_get_sensor()
Instead of making bare calls to get-sensor-state, use
rtas_get_sensor(), which correctly handles busy and extended delay
statuses.

Fixes: ab519a011caa ("powerpc/pseries: Kernel DLPAR Infrastructure")
Signed-off-by: Nathan Lynch
---
 arch/powerpc/platforms/pseries/dlpar.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 3ac70790ec7a..b1f01ac0c29e 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -289,8 +289,7 @@ int dlpar_acquire_drc(u32 drc_index)
 {
 	int dr_status, rc;
 
-	rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
-		       DR_ENTITY_SENSE, drc_index);
+	rc = rtas_get_sensor(DR_ENTITY_SENSE, drc_index, &dr_status);
 	if (rc || dr_status != DR_ENTITY_UNUSABLE)
 		return -1;
 
@@ -311,8 +310,7 @@ int dlpar_release_drc(u32 drc_index)
 {
 	int dr_status, rc;
 
-	rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
-		       DR_ENTITY_SENSE, drc_index);
+	rc = rtas_get_sensor(DR_ENTITY_SENSE, drc_index, &dr_status);
 	if (rc || dr_status != DR_ENTITY_PRESENT)
 		return -1;
 
@@ -333,8 +331,7 @@ int dlpar_unisolate_drc(u32 drc_index)
 {
 	int dr_status, rc;
 
-	rc = rtas_call(rtas_token("get-sensor-state"), 2, 2, &dr_status,
-		       DR_ENTITY_SENSE, drc_index);
+	rc = rtas_get_sensor(DR_ENTITY_SENSE, drc_index, &dr_status);
 	if (rc || dr_status != DR_ENTITY_PRESENT)
 		return -1;
 
--
2.30.2
[powerpc:next] BUILD SUCCESS 562d1e207d322e6346e8db91bbd11d94f16427d2
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
branch HEAD: 562d1e207d322e6346e8db91bbd11d94f16427d2  powerpc/powernv: remove the nvlink support

elapsed time: 726m

configs tested: 44
configs skipped: 78

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
mips       tb0219_defconfig
mips       ip27_defconfig
sh         apsh4ad0a_defconfig
riscv      allnoconfig
arm        neponset_defconfig
arm        pxa_defconfig
arm        clps711x_defconfig
m68k       m5475evb_defconfig
mips       loongson1c_defconfig
arm        exynos_defconfig
arm        defconfig
nios2      defconfig
arc        allyesconfig
nds32      allnoconfig
parisc     defconfig
s390       allyesconfig
s390       allmodconfig
parisc     allyesconfig
s390       defconfig
i386       allyesconfig
sparc      allyesconfig
sparc      defconfig
i386       defconfig
mips       allyesconfig
mips       allmodconfig
powerpc    allyesconfig
powerpc    allmodconfig
powerpc    allnoconfig
i386       randconfig-a003-20210503
i386       randconfig-a006-20210503
i386       randconfig-a001-20210503
i386       randconfig-a005-20210503
i386       randconfig-a004-20210503
i386       randconfig-a002-20210503
um         allmodconfig
um         allnoconfig
um         allyesconfig
um         defconfig

clang tested configs:
x86_64     randconfig-a014-20210503
x86_64     randconfig-a015-20210503
x86_64     randconfig-a012-20210503
x86_64     randconfig-a011-20210503
x86_64     randconfig-a013-20210503
x86_64     randconfig-a016-20210503

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
[powerpc:merge] BUILD SUCCESS 134b5c8a49b594ff6cfb4ea1a92400bb382b46d2
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge branch HEAD: 134b5c8a49b594ff6cfb4ea1a92400bb382b46d2 Automatic merge of 'master' into merge (2021-05-02 23:37) elapsed time: 2159m configs tested: 145 configs skipped: 2 The following configs have been built successfully. More configs may be tested in the coming days. gcc tested configs: arm defconfig arm64allyesconfig arm64 defconfig arm allyesconfig arm allmodconfig riscvallmodconfig riscvallyesconfig mips tb0219_defconfig mips ip27_defconfig shapsh4ad0a_defconfig riscv allnoconfig armneponset_defconfig sh rsk7203_defconfig sh sh7724_generic_defconfig m68k amiga_defconfig ia64zx1_defconfig mips lemote2f_defconfig powerpc xes_mpc85xx_defconfig powerpcwarp_defconfig xtensa defconfig mipse55_defconfig powerpcmvme5100_defconfig arm pxa255-idp_defconfig arm pxa_defconfig armclps711x_defconfig m68k m5475evb_defconfig mips loongson1c_defconfig arm exynos_defconfig sh polaris_defconfig powerpc cm5200_defconfig sparc64 alldefconfig powerpcmpc7448_hpc2_defconfig powerpc kmeter1_defconfig arc allyesconfig armlart_defconfig powerpc ep8248e_defconfig armmulti_v5_defconfig arm pxa910_defconfig m68k multi_defconfig um x86_64_defconfig mipsomega2p_defconfig mips pistachio_defconfig xtensa common_defconfig sh se7619_defconfig arm pxa3xx_defconfig arcvdk_hs38_defconfig arm iop32x_defconfig sh ecovec24_defconfig nds32alldefconfig i386defconfig arm assabet_defconfig arm colibri_pxa270_defconfig armspear3xx_defconfig ia64 allmodconfig ia64defconfig ia64 allyesconfig m68k allmodconfig m68kdefconfig m68k allyesconfig nios2 defconfig nds32 allnoconfig nds32 defconfig nios2allyesconfig cskydefconfig alpha defconfig alphaallyesconfig xtensa allyesconfig h8300allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allyesconfig s390 allmodconfig parisc allyesconfig s390defconfig i386 allyesconfig sparcallyesconfig sparc defconfig mips allyesconfig mips allmodconfig powerpc allyesconfig powerpc 
allmodconfig powerpc allnoconfig x86_64 randconfig-a001-20210503 x86_64 randconfig-a005-20210503 x86_64 randconfig-a003-20210503 x86_64 randconfig-a002-20210503 x86_64 randconfig-a006-20210503 x86_64 randconfig-a004-20210503 i386 randconfig-a003-20210503 i386 randconfig-a006-20210503 i386 randconfig-a001-20210503 i386 randconfig-a005-20210503 i386 randconfig-a004-20210503 i386 randconfig-a002-20210503 i386 randconfig-a003-20210502 i386 randconfig-a006-20210502 i386 randconfig-a001-20210502 i386 randconfig-a005-20210502 i386 randconfig-a004-202
[powerpc:next-test] BUILD SUCCESS 7905dafdefe9f1238a3ca2795cf975b311b5a5f6
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test branch HEAD: 7905dafdefe9f1238a3ca2795cf975b311b5a5f6 powerpc/pseries: warn if recursing into the hcall tracing code elapsed time: 2157m configs tested: 108 configs skipped: 98 The following configs have been built successfully. More configs may be tested in the coming days. gcc tested configs: arm defconfig arm64allyesconfig arm64 defconfig arm allyesconfig arm allmodconfig riscvallyesconfig mips tb0219_defconfig mips ip27_defconfig shapsh4ad0a_defconfig riscv allnoconfig armneponset_defconfig powerpc xes_mpc85xx_defconfig powerpcwarp_defconfig xtensa defconfig mipse55_defconfig arm pxa_defconfig armclps711x_defconfig m68k m5475evb_defconfig mips loongson1c_defconfig arm exynos_defconfig sh polaris_defconfig powerpc cm5200_defconfig sparc64 alldefconfig powerpcmpc7448_hpc2_defconfig powerpc kmeter1_defconfig arc allyesconfig armlart_defconfig powerpc ep8248e_defconfig armmulti_v5_defconfig arm pxa910_defconfig m68k multi_defconfig ia64 allmodconfig ia64defconfig ia64 allyesconfig m68k allmodconfig m68kdefconfig m68k allyesconfig nios2 defconfig nds32 allnoconfig nds32 defconfig nios2allyesconfig cskydefconfig alpha defconfig alphaallyesconfig xtensa allyesconfig h8300allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allyesconfig s390 allmodconfig parisc allyesconfig s390defconfig i386 allyesconfig sparcallyesconfig sparc defconfig i386defconfig mips allyesconfig mips allmodconfig powerpc allyesconfig powerpc allmodconfig powerpc allnoconfig i386 randconfig-a003-20210503 i386 randconfig-a006-20210503 i386 randconfig-a001-20210503 i386 randconfig-a005-20210503 i386 randconfig-a004-20210503 i386 randconfig-a002-20210503 i386 randconfig-a003-20210502 i386 randconfig-a006-20210502 i386 randconfig-a001-20210502 i386 randconfig-a005-20210502 i386 randconfig-a004-20210502 i386 randconfig-a002-20210502 x86_64 randconfig-a014-20210502 x86_64 randconfig-a015-20210502 
x86_64 randconfig-a012-20210502 x86_64 randconfig-a011-20210502 x86_64 randconfig-a013-20210502 x86_64 randconfig-a016-20210502 i386 randconfig-a013-20210502 i386 randconfig-a015-20210502 i386 randconfig-a016-20210502 i386 randconfig-a014-20210502 i386 randconfig-a011-20210502 i386 randconfig-a012-20210502 um allmodconfig umallnoconfig um allyesconfig um defconfig x86_64 allyesconfig x86_64rhel-8.3-kselftests x86_64 defconfig x86_64 rhel-8.3 x86_64 rhel-8.3-kbuiltin x86_64 kexec clang tested configs: x86_64 randconfig-a001-20210502 x86_64 randconfig-a005-20210502 x86_64 randconfig-a003-20210502 x86_64
Re: [PATCH] Raise the minimum GCC version to 5.2
On Mon, May 3, 2021 at 3:17 PM Christophe Leroy wrote:
>
> On 01/05/2021 at 17:15, Masahiro Yamada wrote:
> > The current minimum GCC version is 4.9 except ARCH=arm64 requiring
> > GCC 5.1.
> >
> > When we discussed last time, we agreed to raise the minimum GCC version
> > to 5.1 globally. [1]
> >
> > I'd like to propose GCC 5.2 to clean up arch/powerpc/Kconfig as well.
>
> One point I missed when I first saw your patch, but realised during
> the discussion:
>
> Up to 4.9, GCC was numbered with 3 digits: we had 4.8.0, 4.8.1, ... 4.8.5,
> 4.9.0, 4.9.1, 4.9.4
>
> Then starting at 5, GCC switched to a 2-digit scheme, with 5.0, 5.1, 5.2,
> ... 5.5
>
> So it is not GCC 5.1 or 5.2 that you should target, but only GCC 5.
> Then it is up to the user to use the latest available version of GCC 5,
> which is 5.5 at the time being, just like the user would have selected
> 4.9.4 when 4.9 was the minimum GCC version.
>
> Christophe

One line below in Documentation/process/changes.rst, I see

  Clang/LLVM (optional)  10.0.1           clang --version

Clang 10.0.1 is a bug fix release of Clang 10.

I do not think GCC 5.2 is strange when we want to exclude
the initial release of GCC 5.

--
Best Regards
Masahiro Yamada
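
As a side note on what "minimum 5.2" means in practice: a version floor is usually checked by folding major, minor and patchlevel into one integer, in the same way the kernel's GCC_VERSION macro combines __GNUC__, __GNUC_MINOR__ and __GNUC_PATCHLEVEL__. The sketch below is only an illustration of that comparison; encode_version() and gcc_is_new_enough() are hypothetical names, and the 5.2.0 floor is simply the value proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Fold a version triple into one comparable integer:
 * major * 10000 + minor * 100 + patchlevel.
 */
static int encode_version(int major, int minor, int patch)
{
	/* Each of minor and patchlevel must fit in two decimal digits. */
	assert(minor >= 0 && minor < 100);
	assert(patch >= 0 && patch < 100);
	return major * 10000 + minor * 100 + patch;
}

/* Hypothetical check against the floor proposed in the thread (5.2.0). */
static bool gcc_is_new_enough(int major, int minor, int patch)
{
	return encode_version(major, minor, patch) >= encode_version(5, 2, 0);
}
```

Under this encoding 4.9.4 and 5.1.0 fall below the floor, while 5.2.0 and 5.5.0 pass, so a 5.2 minimum leaves out only the first GCC 5 release and still accepts every later one.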
Re: [PATCH 2/3] hotplug-memory.c: enhance dlpar_memory_remove* LMB checks
On Fri, Apr 30, 2021 at 09:09:16AM -0300, Daniel Henrique Barboza wrote:
> dlpar_memory_remove_by_ic() validates the amount of LMBs to be removed
> by checking !DRCONF_MEM_RESERVED, and in the following loop before
> dlpar_remove_lmb() a check for DRCONF_MEM_ASSIGNED is made before
> removing it. This means that a LMB that is both !DRCONF_MEM_RESERVED and
> !DRCONF_MEM_ASSIGNED will be counted as valid, but then not being
> removed. The function will end up not removing all 'lmbs_to_remove'
> LMBs while also not reporting any errors.
>
> Comparing it to dlpar_memory_remove_by_count(), the validation is done
> via lmb_is_removable(), which checks for DRCONF_MEM_ASSIGNED and fadump
> constraints. No additional check is made afterwards, and
> DRCONF_MEM_RESERVED is never checked before dlpar_remove_lmb(). The
> function doesn't have the same 'check A for validation, then B for
> removal' issue as remove_by_ic(), but it's not checking if the LMB is
> reserved.
>
> There is no reason for these functions to validate the same operation in
> two different manners.

Actually, I think there is: remove_by_ic() is handling a request to
remove a specific range of LMBs. If any are reserved, they can't be
removed and so this needs to fail. But if they are !ASSIGNED, that
essentially means they're *already* removed (or never added), so
"removing" them is, correctly, a no-op.

remove_by_count(), in contrast, is being asked to remove a fixed
number of LMBs from wherever they can be found, and for that it needs
to find LMBs that haven't already been removed.

Basically remove_by_ic() is an absolute request: "make this set of
LMBs be not-plugged", whereas remove_by_count() is a relative request
"make N fewer LMBs be plugged".

So I think remove_by_ic()'s existing handling is correct. I'm less
sure if remove_by_count() ignoring RESERVED is correct - I couldn't
quickly find under what circumstances RESERVED gets set.

> This patch addresses that by changing
> lmb_is_removable() to also check for DRCONF_MEM_RESERVED to tell if a
> lmb is removable, making dlpar_memory_remove_by_count() take the
> reservation state into account when counting the LMBs.
> lmb_is_removable() is then used in the validation step of
> dlpar_memory_remove_by_ic(), which is already checking for both states
> but in different stages, to avoid counting a LMB that is not assigned as
> eligible for removal. We can then skip the check before
> dlpar_remove_lmb() since we're validating all LMBs beforehand.
>
> Signed-off-by: Daniel Henrique Barboza
> ---
>  arch/powerpc/platforms/pseries/hotplug-memory.c | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index bb98574a84a2..4e6d162c3f1a 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -348,7 +348,8 @@ static int pseries_remove_mem_node(struct device_node *np)
>
>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>  {
> -	if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
> +	if ((lmb->flags & DRCONF_MEM_RESERVED) ||
> +		!(lmb->flags & DRCONF_MEM_ASSIGNED))
>  		return false;
>
>  #ifdef CONFIG_FA_DUMP
> @@ -523,7 +524,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
>
>  	/* Validate that there are enough LMBs to satisfy the request */
>  	for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
> -		if (lmb->flags & DRCONF_MEM_RESERVED)
> +		if (!lmb_is_removable(lmb))
>  			break;
>
>  		lmbs_available++;
> @@ -533,9 +534,6 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
>  		return -EINVAL;
>
>  	for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
> -		if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
> -			continue;
> -
>  		rc = dlpar_remove_lmb(lmb);
>  		if (rc)
>  			break;

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
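
The absolute-versus-relative distinction drawn in this reply can be sketched as a small standalone model. This is an illustration only, not the hotplug-memory.c code: struct lmb_model, the LMB_* flag values and all three helper names are hypothetical, and the by-count helper mirrors the existing behaviour described above (it looks at ASSIGNED only).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits; the real DRCONF_MEM_* values are not used here. */
#define LMB_ASSIGNED 0x1u
#define LMB_RESERVED 0x2u

struct lmb_model {
	unsigned int flags;
};

/*
 * Absolute request ("make this range be not-plugged"): only a reserved
 * LMB makes the request invalid. An unassigned LMB is already in the
 * requested state.
 */
static bool range_request_valid(const struct lmb_model *lmbs, size_t n)
{
	size_t i;

	assert(lmbs != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (lmbs[i].flags & LMB_RESERVED)
			return false;
	return true;
}

/* Within a valid range, only assigned LMBs need an actual removal. */
static size_t range_removals_needed(const struct lmb_model *lmbs, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (lmbs[i].flags & LMB_ASSIGNED)
			count++;
	return count;
}

/*
 * Relative request ("make N fewer LMBs be plugged"): candidates are the
 * LMBs that are still plugged, i.e. assigned.
 */
static size_t count_request_candidates(const struct lmb_model *lmbs, size_t n)
{
	return range_removals_needed(lmbs, n);
}
```

For a range holding one assigned and one unassigned LMB, the range request is valid and needs exactly one removal; adding a reserved LMB to the range makes the whole request invalid.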
Re: [PATCH 1/3] powerpc/pseries: Set UNISOLATE on dlpar_memory_remove_by_ic() error
On Fri, Apr 30, 2021 at 09:09:15AM -0300, Daniel Henrique Barboza wrote:
> As previously done in dlpar_cpu_remove() for CPUs, this patch changes
> dlpar_memory_remove_by_ic() to unisolate the LMB DRC when the LMB is
> failed to be removed. The hypervisor, seeing a LMB DRC that was supposed
> to be removed being unisolated instead, can do error recovery on its
> side.
>
> This change is done in dlpar_memory_remove_by_ic() only because, as of
> today, only QEMU is using this code path for error recovery (via the
> PSERIES_HP_ELOG_ID_DRC_IC event). phyp treats it as a no-op.
>
> Signed-off-by: Daniel Henrique Barboza

Reviewed-by: David Gibson

> ---
>  arch/powerpc/platforms/pseries/hotplug-memory.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index 8377f1f7c78e..bb98574a84a2 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -551,6 +551,13 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
>  			if (!drmem_lmb_reserved(lmb))
>  				continue;
>
> +			/*
> +			 * Setting the isolation state of an UNISOLATED/CONFIGURED
> +			 * device to UNISOLATE is a no-op, but the hypervisor can
> +			 * use it as a hint that the LMB removal failed.
> +			 */
> +			dlpar_unisolate_drc(lmb->drc_index);
> +
>  			rc = dlpar_add_lmb(lmb);
>  			if (rc)
>  				pr_err("Failed to add LMB, drc index %x\n",

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
Re: [PATCH 4/4] powerpc/powernv: Remove POWER9 PVR version check for entry and uaccess flushes
On Mon, 3 May 2021 at 13:04, Nicholas Piggin wrote:
>
> These aren't necessarily POWER9 only, and it's not to say some new
> vulnerability may not get discovered on other processors for which
> we would like the flexibility of having the workaround enabled by
> firmware.
>
> Remove the restriction that they only apply to POWER9.

I was wondering how these worked, which led me to reviewing your patch.

From what I could see, these are enabled by default (SEC_FTR_DEFAULT
in arch/powerpc/include/asm/security_features.h), so unless all
non-POWER9 machines have set the "please don't" bit in their firmware,
this patch will enable the feature for those machines. Is that what
you wanted?

>
> Signed-off-by: Nicholas Piggin
> ---
>  arch/powerpc/platforms/powernv/setup.c | 9 -
>  1 file changed, 9 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
> index a8db3f153063..6ec67223f8c7 100644
> --- a/arch/powerpc/platforms/powernv/setup.c
> +++ b/arch/powerpc/platforms/powernv/setup.c
> @@ -122,15 +122,6 @@ static void pnv_setup_security_mitigations(void)
>  			type = L1D_FLUSH_ORI;
>  	}
>
> -	/*
> -	 * If we are non-Power9 bare metal, we don't need to flush on kernel
> -	 * entry or after user access: they fix a P9 specific vulnerability.
> -	 */
> -	if (!pvr_version_is(PVR_POWER9)) {
> -		security_ftr_clear(SEC_FTR_L1D_FLUSH_ENTRY);
> -		security_ftr_clear(SEC_FTR_L1D_FLUSH_UACCESS);
> -	}
> -
>  	enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && \
>  		 (security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR)   || \
>  		  security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV));
> --
> 2.23.0
>
Re: [PATCH] ibmvnic: remove default label from to_string switch
From: Lijun Pan
Date: Mon, 3 May 2021 13:21:00 -0500

> On Mon, May 3, 2021 at 5:54 AM Michal Suchanek wrote:
>>
>> This way the compiler warns when a new value is added to the enum but
>> not the string transation like:
>
> s/transation/translation/
>
> This trick works.
> Since the original code does not generate gcc warnings/errors, should
> this patch be sent to net-next as an improvement?

Yes.
Re: [PATCH v3] pseries/drmem: update LMBs after LPM
On 5/3/21 10:28 AM, Laurent Dufour wrote:
> On 01/05/2021 at 01:58, Tyrel Datwyler wrote:
>> On 4/30/21 9:13 AM, Laurent Dufour wrote:
>>> On 29/04/2021 at 21:12, Tyrel Datwyler wrote:
>>>> On 4/29/21 3:27 AM, Aneesh Kumar K.V wrote:
>>>>> Laurent Dufour writes:
>>>>>

Snip

>>
>> As of today I don't have a problem with your patch. This was more of me
>> pointing out things that I think are currently wrong with our memory
>> hotplug implementation, and that we need to take a long hard look at it
>> down the road.
>
> I do agree, there are a lot of odd things to address in this area.
> If you're ok with that patch, would you mind adding a reviewed-by?
>

Can you send a v4 with the fix for the duplicate update included?

-Tyrel
Re: [RFC] powerpc/pseries: delete scanlog
On 5/3/21 10:18 AM, Nathan Lynch wrote:
> A commit from 2008 says this driver was relevant only for "older
> systems", and currently supported hardware doesn't have this
> facility. Get rid of it.

The only references I could find to scan log dump support are for several
Power 4+ systems, in particular the IntelliStation POWER 9114 and pSeries
615, which were released in 2003, at the same time this code was originally
introduced.

Historical Linux commit from February 2003:
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=f92e361842d5251e50562b09664082dcbd0548bb

IntelliStation and pSeries docs:
http://ps-2.retropc.se/basil.holloway/ALL%20PDF/380635.pdf
http://ps-2.kev009.com/rs6000/manuals/p/p615-6C3-6E3/6C3_and_6E3_Users_Guide_SA38-0629.pdf

Current firmware RTAS implementations have no reference to
ibm,scan-log-dump, and a long-standing developer for that code has no
recollection of its existence. This appears to be a straggler from the RPA
and Power 4 days. Based on my understanding that we dropped support for
Power 4 in mainline, this looks pretty orphaned to me and a solid candidate
for removal, barring any insight from someone else who knows better.

+1

Feel free to add my RB tag to any non-RFC followup.
Reviewed-by: Tyrel Datwyler > > Signed-off-by: Nathan Lynch > --- > arch/powerpc/configs/ppc64_defconfig | 1 - > arch/powerpc/configs/pseries_defconfig | 1 - > arch/powerpc/platforms/pseries/Kconfig | 4 - > arch/powerpc/platforms/pseries/Makefile | 1 - > arch/powerpc/platforms/pseries/scanlog.c | 195 --- > 5 files changed, 202 deletions(-) > delete mode 100644 arch/powerpc/platforms/pseries/scanlog.c > > diff --git a/arch/powerpc/configs/ppc64_defconfig > b/arch/powerpc/configs/ppc64_defconfig > index 701811c91a6f..acf13b4917c4 100644 > --- a/arch/powerpc/configs/ppc64_defconfig > +++ b/arch/powerpc/configs/ppc64_defconfig > @@ -26,7 +26,6 @@ CONFIG_PPC64=y > CONFIG_NR_CPUS=2048 > CONFIG_PPC_SPLPAR=y > CONFIG_DTL=y > -CONFIG_SCANLOG=m > CONFIG_PPC_SMLPAR=y > CONFIG_IBMEBUS=y > CONFIG_PPC_SVM=y > diff --git a/arch/powerpc/configs/pseries_defconfig > b/arch/powerpc/configs/pseries_defconfig > index 50168dde4ea5..d120321e4eea 100644 > --- a/arch/powerpc/configs/pseries_defconfig > +++ b/arch/powerpc/configs/pseries_defconfig > @@ -38,7 +38,6 @@ CONFIG_MODULE_SRCVERSION_ALL=y > CONFIG_PARTITION_ADVANCED=y > CONFIG_PPC_SPLPAR=y > CONFIG_DTL=y > -CONFIG_SCANLOG=m > CONFIG_PPC_SMLPAR=y > CONFIG_IBMEBUS=y > CONFIG_PAPR_SCM=m > diff --git a/arch/powerpc/platforms/pseries/Kconfig > b/arch/powerpc/platforms/pseries/Kconfig > index 5e037df2a3a1..bf9b612a929b 100644 > --- a/arch/powerpc/platforms/pseries/Kconfig > +++ b/arch/powerpc/platforms/pseries/Kconfig > @@ -61,10 +61,6 @@ config PSERIES_ENERGY > Provides: /sys/devices/system/cpu/pseries_(de)activation_hint_list > and /sys/devices/system/cpu/cpuN/pseries_(de)activation_hint > > -config SCANLOG > - tristate "Scanlog dump interface" > - depends on RTAS_PROC && PPC_PSERIES > - > config IO_EVENT_IRQ > bool "IO Event Interrupt support" > depends on PPC_PSERIES > diff --git a/arch/powerpc/platforms/pseries/Makefile > b/arch/powerpc/platforms/pseries/Makefile > index c8a2b0b05ac0..754d1102de08 100644 > --- 
a/arch/powerpc/platforms/pseries/Makefile > +++ b/arch/powerpc/platforms/pseries/Makefile > @@ -8,7 +8,6 @@ obj-y := lpar.o hvCall.o nvram.o reconfig.o \ > firmware.o power.o dlpar.o mobility.o rng.o \ > pci.o pci_dlpar.o eeh_pseries.o msi.o > obj-$(CONFIG_SMP)+= smp.o > -obj-$(CONFIG_SCANLOG)+= scanlog.o > obj-$(CONFIG_KEXEC_CORE) += kexec.o > obj-$(CONFIG_PSERIES_ENERGY) += pseries_energy.o > > diff --git a/arch/powerpc/platforms/pseries/scanlog.c > b/arch/powerpc/platforms/pseries/scanlog.c > deleted file mode 100644 > index 2879c4f0ceb7.. > --- a/arch/powerpc/platforms/pseries/scanlog.c > +++ /dev/null > @@ -1,195 +0,0 @@ > -// SPDX-License-Identifier: GPL-2.0-or-later > -/* > - * c 2001 PPC 64 Team, IBM Corp > - * > - * scan-log-data driver for PPC64 Todd Inglett > - * > - * When ppc64 hardware fails the service processor dumps internal state > - * of the system. After a reboot the operating system can access a dump > - * of this data using this driver. A dump exists if the device-tree > - * /chosen/ibm,scan-log-data property exists. > - * > - * This driver exports /proc/powerpc/scan-log-dump which can be read. > - * The driver supports only sequential reads. > - * > - * The driver looks at a write to the driver for the single word "reset". > - * If given, the driver will reset the scanlog so the platform can free it. > - */ > - > -#include > -#include > -#include > -#include > -#include > -#include > -#include > -#include > -#include > -#include > - > -#define MODULE_VERS "1.0" > -#define MODULE_NAME "scanlog" > - > -/* Statu
Re: [PATCH v2] powerpc/64: BE option to use ELFv2 ABI for big endian kernels
On Mon, May 03, 2021 at 11:34:25AM +0200, Michal Suchánek wrote: > On Mon, May 03, 2021 at 09:11:16AM +0200, Michal Suchánek wrote: > > On Mon, May 03, 2021 at 10:58:33AM +1000, Nicholas Piggin wrote: > > > Excerpts from Michal Suchánek's message of May 3, 2021 2:57 am: > > > > On Tue, Apr 28, 2020 at 09:25:17PM +1000, Nicholas Piggin wrote: > > > >> Provide an option to use ELFv2 ABI for big endian builds. This works on > > > >> GCC and clang (since 2014). It is less well tested and supported by the > > > >> GNU toolchain, but it can give some useful advantages of the ELFv2 ABI > > > >> for BE (e.g., less stack usage). Some distros even build BE ELFv2 > > > >> userspace. > > > > > > > > Fixes BTFID failure on BE for me and the ELF ABIv2 kernel boots. > > > > > > What's the BTFID failure? Anything we can do to fix it on the v1 ABI or > > > at least make it depend on BUILD_ELF_V2? > > > > Looks like symbols are prefixed with a dot in ABIv1 and BTFID tool is > > not aware of that. It can be disabled on ABIv1 easily. > > > > Thanks > > > > Michal > > > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > > index 678c13967580..e703c26e9b80 100644 > > --- a/lib/Kconfig.debug > > +++ b/lib/Kconfig.debug > > @@ -305,6 +305,7 @@ config DEBUG_INFO_BTF > > bool "Generate BTF typeinfo" > > depends on !DEBUG_INFO_SPLIT && !DEBUG_INFO_REDUCED > > depends on !GCC_PLUGIN_RANDSTRUCT || COMPILE_TEST > > + depends on !PPC64 || BUILD_ELF_V2 > > help > > Generate deduplicated BTF type information from DWARF debug info. > > Turning this on expects presence of pahole tool, which will convert > > > > > > > > > > > > > Tested-by: Michal Suchánek > > > > > > > > Also can we enable mprofile on BE now? > > > > > > > > I don't see anything endian-specific in the mprofile code at a glance > > > > but don't have any idea how to test it. > > > > > > AFAIK it's just a different ABI for the _mcount call so just running > > > some ftrace and ftrace with call graph should test it reasonably well. 
> > It does not crash and burn but there are some regressions from LE to BE > on the ftrace kernel selftest: > > --- ftraceLE.txt 2021-05-03 11:19:14.83000 +0200 > +++ ftraceBE.txt 2021-05-03 11:27:24.77000 +0200 > @@ -7,8 +7,8 @@ > [n] Change the ringbuffer size [PASS] > [n] Snapshot and tracing setting [PASS] > [n] trace_pipe and trace_marker [PASS] > -[n] Test ftrace direct functions against tracers [UNRESOLVED] > -[n] Test ftrace direct functions against kprobes [UNRESOLVED] > +[n] Test ftrace direct functions against tracers [FAIL] > +[n] Test ftrace direct functions against kprobes [FAIL] > [n] Generic dynamic event - add/remove kprobe events [PASS] > [n] Generic dynamic event - add/remove synthetic events [PASS] > [n] Generic dynamic event - selective clear (compatibility) [PASS] > @@ -16,10 +16,10 @@ > [n] event tracing - enable/disable with event level files[PASS] > [n] event tracing - restricts events based on pid notrace filtering [PASS] > [n] event tracing - restricts events based on pid[PASS] > -[n] event tracing - enable/disable with subsystem level files[PASS] > +[n] event tracing - enable/disable with subsystem level files[FAIL] > [n] event tracing - enable/disable with top level files [PASS] > -[n] Test trace_printk from module[UNRESOLVED] > -[n] ftrace - function graph filters with stack tracer[PASS] > +[n] Test trace_printk from module[FAIL] > +[n] ftrace - function graph filters with stack tracer[FAIL] > [n] ftrace - function graph filters [PASS] > [n] ftrace - function trace with cpumask [PASS] > [n] ftrace - test for function event triggers[PASS] > @@ -27,7 +27,7 @@ > [n] ftrace - function pid notrace filters[PASS] > [n] ftrace - function pid filters[PASS] > [n] ftrace - stacktrace filter command [PASS] > -[n] ftrace - function trace on module[UNRESOLVED] > +[n] ftrace - function trace on module[FAIL] > [n] ftrace - function profiler with function tracing [PASS] > [n] ftrace - function profiling [PASS] > [n] ftrace - test reading of 
set_ftrace_filter [PASS] > @@ -44,10 +44,10 @@ > [n] Kprobe event argument syntax [PASS] > [n] Kprobe dynamic event with arguments [PASS] > [n] Kprobes event arguments with types [PASS] > -[n] Kprobe event user-memory access [UNSUPPORTED] > +[n] Kprobe event user-memory access [FAIL] > [n] Kprobe event auto/manual naming [PASS] > [n] Kprobe dynamic event with function tracer[PASS] > -[n] Kprobe dynamic event - probing module[UNRESOLVED] > +[n] Kprobe dynamic event - probing module[FAIL] > [n] Create/delete multiprobe on kprobe event [PASS] > [n] Kprobe event parser error log check [PASS] > [n] Kretprobe dynamic event with arguments [PASS] > @@ -57,11 +57,11 @@ > [n] Kprobe events - probe points [PASS] > [n] Kprobe dyna
Re: [PATCH] ibmvnic: remove default label from to_string switch
On Mon, May 3, 2021 at 5:54 AM Michal Suchanek wrote: > > This way the compiler warns when a new value is added to the enum but > not the string transation like: s/transation/translation/ This trick works. Since the original code does not generate gcc warnings/errors, should this patch be sent to net-next as an improvement? > > drivers/net/ethernet/ibm/ibmvnic.c: In function 'adapter_state_to_string': > drivers/net/ethernet/ibm/ibmvnic.c:832:2: warning: enumeration value > 'VNIC_FOOBAR' not handled in switch [-Wswitch] > switch (state) { > ^~ > drivers/net/ethernet/ibm/ibmvnic.c: In function 'reset_reason_to_string': > drivers/net/ethernet/ibm/ibmvnic.c:1935:2: warning: enumeration value > 'VNIC_RESET_FOOBAR' not handled in switch [-Wswitch] > switch (reason) { > ^~ > > Signed-off-by: Michal Suchanek > --- Acked-by: Lijun Pan > drivers/net/ethernet/ibm/ibmvnic.c | 6 ++ > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c > b/drivers/net/ethernet/ibm/ibmvnic.c > index 5788bb956d73..4d439413f6d9 100644 > --- a/drivers/net/ethernet/ibm/ibmvnic.c > +++ b/drivers/net/ethernet/ibm/ibmvnic.c > @@ -846,9 +846,8 @@ static const char *adapter_state_to_string(enum > vnic_state state) > return "REMOVING"; > case VNIC_REMOVED: > return "REMOVED"; > - default: > - return "UNKNOWN"; > } > + return "UNKNOWN"; > } > > static int ibmvnic_login(struct net_device *netdev) > @@ -1946,9 +1945,8 @@ static const char *reset_reason_to_string(enum > ibmvnic_reset_reason reason) > return "TIMEOUT"; > case VNIC_RESET_CHANGE_PARAM: > return "CHANGE_PARAM"; > - default: > - return "UNKNOWN"; > } > + return "UNKNOWN"; > } > > /* > -- > 2.26.2 >
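The effect described in the commit message can be reproduced outside the driver. What follows is only a minimal sketch, assuming a made-up enum: demo_state and demo_state_to_string() are hypothetical names, not ibmvnic code. With no default label, building with -Wswitch (which GCC enables as part of -Wall) reports any enumerator that is later added to the enum but not to the switch, while the trailing return still covers out-of-range values at run time.

```c
#include <string.h>

/* Hypothetical enum for illustration; not the driver's real state list. */
enum demo_state {
	DEMO_PROBING,
	DEMO_OPEN,
	DEMO_CLOSED,
};

/*
 * No "default:" label on purpose: with -Wswitch the compiler warns when an
 * enumerator of enum demo_state is not handled by one of the cases below.
 */
const char *demo_state_to_string(enum demo_state state)
{
	switch (state) {
	case DEMO_PROBING:
		return "PROBING";
	case DEMO_OPEN:
		return "OPEN";
	case DEMO_CLOSED:
		return "CLOSED";
	}
	/* Reached only for values that are not valid enumerators. */
	return "UNKNOWN";
}
```

Adding, say, DEMO_FOOBAR to the enum without touching the switch should then produce a warning of the same shape as the ones quoted in the patch.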
Re: [PATCH 1/3] lib: early_string: allow early usage of some string functions
On Mon, May 03, 2021 at 11:01:41AM -0700, Daniel Walker wrote: > On Sat, May 01, 2021 at 09:31:47AM +0200, Christophe Leroy wrote: > > > > > In fact, should be like in prom_init today: > > > > > > #ifdef __EARLY_STRING_ENABLED > > > if (dsize >= count) > > > return count; > > > #else > > > BUG_ON(dsize >= count); > > > #endif > > > > Thinking about it once more, this BUG_ON() is overkill and should be > > avoided, see https://www.kernel.org/doc/html/latest/process/deprecated.html > > > > Therefore, something like the following would make it: > > > > if (dsize >= count) { > > WARN_ON(!__is_defined(__EARLY_STRING_ENABLED)); > > > > return count; > > } > > I agree, it's overkill to stop the system for this condition. > > how about I do something more like this for my changes, > > > > if (WARN_ON(dsize >= count && !__is_defined(__EARLY_STRING_ENABLED))) > > return count; I'll have to work on this one.. Daniel
Re: [PATCH 1/3] lib: early_string: allow early usage of some string functions
On Sat, May 01, 2021 at 09:31:47AM +0200, Christophe Leroy wrote:
>
> > In fact, should be like in prom_init today:
> >
> > #ifdef __EARLY_STRING_ENABLED
> > 	if (dsize >= count)
> > 		return count;
> > #else
> > 	BUG_ON(dsize >= count);
> > #endif
>
> Thinking about it once more, this BUG_ON() is overkill and should be
> avoided, see https://www.kernel.org/doc/html/latest/process/deprecated.html
>
> Therefore, something like the following would make it:
>
> 	if (dsize >= count) {
> 		WARN_ON(!__is_defined(__EARLY_STRING_ENABLED));
>
> 		return count;
> 	}

I agree, it's overkill to stop the system for this condition.

How about I do something more like this for my changes,

> 	if (WARN_ON(dsize >= count && !__is_defined(__EARLY_STRING_ENABLED)))
> 		return count;

and for generic kernel,

> 	if (WARN_ON(dsize >= count))
> 		return count;

Daniel
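To make the "warn and return" variant concrete, here is a hedged userspace sketch. DEMO_WARN_ON() and demo_strlcat() are hypothetical stand-ins written for this illustration; they are not the kernel's WARN_ON() or strlcat(), and the copy logic is only assumed to resemble the kernel helper. The point is the control flow: the overflow condition is reported, the function returns count, and execution continues instead of stopping the system.

```c
#include <stdio.h>
#include <string.h>

/*
 * Userspace stand-in for WARN_ON(): report the condition and evaluate to 1
 * when it is true, without terminating the program.
 */
#define DEMO_WARN_ON(cond) \
	((cond) ? (fprintf(stderr, "WARNING: %s\n", #cond), 1) : 0)

/*
 * Hypothetical strlcat()-style helper. If dest already fills the buffer,
 * warn and return count rather than aborting.
 */
size_t demo_strlcat(char *dest, const char *src, size_t count)
{
	size_t dsize = strlen(dest);
	size_t len = strlen(src);
	size_t res = dsize + len;

	if (DEMO_WARN_ON(dsize >= count))
		return count;

	/* Append as much of src as fits, always leaving room for the NUL. */
	dest += dsize;
	count -= dsize;
	if (len >= count)
		len = count - 1;
	memcpy(dest, src, len);
	dest[len] = '\0';
	return res;
}
```

A caller can still detect truncation or misuse by comparing the return value against the buffer size, which is the usual strlcat() convention.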
[PATCH] powerpc/rtas-rtc: remove unused constant
RTAS_CLOCK_BUSY is unused, remove it. Signed-off-by: Nathan Lynch --- arch/powerpc/kernel/rtas-rtc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/rtas-rtc.c b/arch/powerpc/kernel/rtas-rtc.c index a28239b8b0c0..33c07c8af6c8 100644 --- a/arch/powerpc/kernel/rtas-rtc.c +++ b/arch/powerpc/kernel/rtas-rtc.c @@ -12,7 +12,7 @@ #define MAX_RTC_WAIT 5000 /* 5 sec */ -#define RTAS_CLOCK_BUSY (-2) + time64_t __init rtas_get_boot_time(void) { int ret[8]; -- 2.30.2
Re: [PATCH v3] pseries/drmem: update LMBs after LPM
On 01/05/2021 at 01:58, Tyrel Datwyler wrote:
On 4/30/21 9:13 AM, Laurent Dufour wrote:
On 29/04/2021 at 21:12, Tyrel Datwyler wrote:
On 4/29/21 3:27 AM, Aneesh Kumar K.V wrote:
Laurent Dufour writes:

After an LPM, the device tree node ibm,dynamic-reconfiguration-memory may be updated by the hypervisor in the case the NUMA topology of the LPAR's memory is updated. This is caught by the kernel, but the memory's node is not updated because there is no way to move a memory block between nodes. If later a memory block is added or removed, drmem_update_dt() is called and it is overwriting the DT node to match the added or removed LMB. But the LMB's associativity node has not been updated after the DT node update and thus the node is overwritten by the Linux's topology instead of the hypervisor one.

Introduce a hook called when the ibm,dynamic-reconfiguration-memory node is updated to force an update of the LMB's associativity.

Cc: Tyrel Datwyler
Signed-off-by: Laurent Dufour
---
V3:
 - Check rd->dn->name instead of rd->dn->full_name
V2:
 - Take Tyrel's idea to rely on OF_RECONFIG_UPDATE_PROPERTY instead of introducing a new hook mechanism.
--- arch/powerpc/include/asm/drmem.h | 1 + arch/powerpc/mm/drmem.c | 35 +++ .../platforms/pseries/hotplug-memory.c | 4 +++ 3 files changed, 40 insertions(+) diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h index bf2402fed3e0..4265d5e95c2c 100644 --- a/arch/powerpc/include/asm/drmem.h +++ b/arch/powerpc/include/asm/drmem.h @@ -111,6 +111,7 @@ int drmem_update_dt(void); int __init walk_drmem_lmbs_early(unsigned long node, void *data, int (*func)(struct drmem_lmb *, const __be32 **, void *)); +void drmem_update_lmbs(struct property *prop); #endif static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb) diff --git a/arch/powerpc/mm/drmem.c b/arch/powerpc/mm/drmem.c index 9af3832c9d8d..f0a6633132af 100644 --- a/arch/powerpc/mm/drmem.c +++ b/arch/powerpc/mm/drmem.c @@ -307,6 +307,41 @@ int __init walk_drmem_lmbs_early(unsigned long node, void *data, return ret; } +/* + * Update the LMB associativity index. + */ +static int update_lmb(struct drmem_lmb *updated_lmb, + __maybe_unused const __be32 **usm, + __maybe_unused void *data) +{ + struct drmem_lmb *lmb; + + /* + * Brut force there may be better way to fetch the LMB + */ + for_each_drmem_lmb(lmb) { + if (lmb->drc_index != updated_lmb->drc_index) + continue; + + lmb->aa_index = updated_lmb->aa_index; + break; + } + return 0; +} + +/* + * Update the LMB associativity index. + * + * This needs to be called when the hypervisor is updating the + * dynamic-reconfiguration-memory node property. 
+ */ +void drmem_update_lmbs(struct property *prop) +{ + if (!strcmp(prop->name, "ibm,dynamic-memory")) + __walk_drmem_v1_lmbs(prop->value, NULL, NULL, update_lmb); + else if (!strcmp(prop->name, "ibm,dynamic-memory-v2")) + __walk_drmem_v2_lmbs(prop->value, NULL, NULL, update_lmb); +} #endif static int init_drmem_lmb_size(struct device_node *dn) diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index 8377f1f7c78e..672ffbee2e78 100644 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -949,6 +949,10 @@ static int pseries_memory_notifier(struct notifier_block *nb, case OF_RECONFIG_DETACH_NODE: err = pseries_remove_mem_node(rd->dn); break; + case OF_RECONFIG_UPDATE_PROPERTY: + if (!strcmp(rd->dn->name, + "ibm,dynamic-reconfiguration-memory")) + drmem_update_lmbs(rd->prop); } return notifier_from_errno(err); How will this interact with DLPAR memory? When we dlpar memory, ibm,configure-connector is used to fetch the new associativity details and set drmem_lmb->aa_index correctly there. Once that is done kernel then call drmem_update_dt() which will result in the above notifier callback? IIUC, the call back then will update drmem_lmb->aa_index again? After digging through some of this code I'm a bit concerned about all the kernel device tree manipulation around memory DLPAR both with the assoc-lookup-array prop update and post dynamic-memory prop updating. We build a drmem_info array of the LMBs from the device-tree at boot. I don't really understand why we are manipulating the device tree property every time we add/remove an LMB. Not sure the reasoning was to write back in particular the aa_index and flags for each LMB into the device tree when we already have them in the drmem_info array. On the other hand the assoc-lookup-array I suppose would need to have an in kernel representation to avoid updating the device tree property every time. 
I think the rea
[RFC] powerpc/pseries: delete scanlog
A commit from 2008 says this driver was relevant only for "older systems", and currently supported hardware doesn't have this facility. Get rid of it. Signed-off-by: Nathan Lynch --- arch/powerpc/configs/ppc64_defconfig | 1 - arch/powerpc/configs/pseries_defconfig | 1 - arch/powerpc/platforms/pseries/Kconfig | 4 - arch/powerpc/platforms/pseries/Makefile | 1 - arch/powerpc/platforms/pseries/scanlog.c | 195 --- 5 files changed, 202 deletions(-) delete mode 100644 arch/powerpc/platforms/pseries/scanlog.c diff --git a/arch/powerpc/configs/ppc64_defconfig b/arch/powerpc/configs/ppc64_defconfig index 701811c91a6f..acf13b4917c4 100644 --- a/arch/powerpc/configs/ppc64_defconfig +++ b/arch/powerpc/configs/ppc64_defconfig @@ -26,7 +26,6 @@ CONFIG_PPC64=y CONFIG_NR_CPUS=2048 CONFIG_PPC_SPLPAR=y CONFIG_DTL=y -CONFIG_SCANLOG=m CONFIG_PPC_SMLPAR=y CONFIG_IBMEBUS=y CONFIG_PPC_SVM=y diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig index 50168dde4ea5..d120321e4eea 100644 --- a/arch/powerpc/configs/pseries_defconfig +++ b/arch/powerpc/configs/pseries_defconfig @@ -38,7 +38,6 @@ CONFIG_MODULE_SRCVERSION_ALL=y CONFIG_PARTITION_ADVANCED=y CONFIG_PPC_SPLPAR=y CONFIG_DTL=y -CONFIG_SCANLOG=m CONFIG_PPC_SMLPAR=y CONFIG_IBMEBUS=y CONFIG_PAPR_SCM=m diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index 5e037df2a3a1..bf9b612a929b 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig @@ -61,10 +61,6 @@ config PSERIES_ENERGY Provides: /sys/devices/system/cpu/pseries_(de)activation_hint_list and /sys/devices/system/cpu/cpuN/pseries_(de)activation_hint -config SCANLOG - tristate "Scanlog dump interface" - depends on RTAS_PROC && PPC_PSERIES - config IO_EVENT_IRQ bool "IO Event Interrupt support" depends on PPC_PSERIES diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile index c8a2b0b05ac0..754d1102de08 100644 --- 
a/arch/powerpc/platforms/pseries/Makefile +++ b/arch/powerpc/platforms/pseries/Makefile @@ -8,7 +8,6 @@ obj-y := lpar.o hvCall.o nvram.o reconfig.o \ firmware.o power.o dlpar.o mobility.o rng.o \ pci.o pci_dlpar.o eeh_pseries.o msi.o obj-$(CONFIG_SMP) += smp.o -obj-$(CONFIG_SCANLOG) += scanlog.o obj-$(CONFIG_KEXEC_CORE) += kexec.o obj-$(CONFIG_PSERIES_ENERGY) += pseries_energy.o diff --git a/arch/powerpc/platforms/pseries/scanlog.c b/arch/powerpc/platforms/pseries/scanlog.c deleted file mode 100644 index 2879c4f0ceb7.. --- a/arch/powerpc/platforms/pseries/scanlog.c +++ /dev/null @@ -1,195 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-or-later -/* - * c 2001 PPC 64 Team, IBM Corp - * - * scan-log-data driver for PPC64 Todd Inglett - * - * When ppc64 hardware fails the service processor dumps internal state - * of the system. After a reboot the operating system can access a dump - * of this data using this driver. A dump exists if the device-tree - * /chosen/ibm,scan-log-data property exists. - * - * This driver exports /proc/powerpc/scan-log-dump which can be read. - * The driver supports only sequential reads. - * - * The driver looks at a write to the driver for the single word "reset". - * If given, the driver will reset the scanlog so the platform can free it. 
- */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#define MODULE_VERS "1.0" -#define MODULE_NAME "scanlog" - -/* Status returns from ibm,scan-log-dump */ -#define SCANLOG_COMPLETE 0 -#define SCANLOG_HWERROR -1 -#define SCANLOG_CONTINUE 1 - - -static unsigned int ibm_scan_log_dump; /* RTAS token */ -static unsigned int *scanlog_buffer; /* The data buffer */ - -static ssize_t scanlog_read(struct file *file, char __user *buf, - size_t count, loff_t *ppos) -{ - unsigned int *data = scanlog_buffer; - int status; - unsigned long len, off; - unsigned int wait_time; - - if (count > RTAS_DATA_BUF_SIZE) - count = RTAS_DATA_BUF_SIZE; - - if (count < 1024) { - /* This is the min supported by this RTAS call. Rather -* than do all the buffering we insist the user code handle -* larger reads. As long as cp works... :) -*/ - printk(KERN_ERR "scanlog: cannot perform a small read (%ld)\n", count); - return -EINVAL; - } - - if (!access_ok(buf, count)) - return -EFAULT; - - for (;;) { - wait_time = 500;/* default wait if no data */ - spin_lock(&rtas_data_buf_lock); - memcpy(rtas_data_buf, data, RTAS_DATA_BUF_SIZE); - status = rtas_call(ibm_
[PATCH 2/2] powerpc/paca: Remove mm_ctx_id and mm_ctx_slb_addr_limit
mm_ctx_id and mm_ctx_slb_addr_limit are not used anymore. Remove them. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/paca.h | 2 -- arch/powerpc/kernel/paca.c | 2 -- 2 files changed, 4 deletions(-) diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h index ec18ac818e3a..ecc8d792a431 100644 --- a/arch/powerpc/include/asm/paca.h +++ b/arch/powerpc/include/asm/paca.h @@ -149,11 +149,9 @@ struct paca_struct { #endif /* CONFIG_PPC_BOOK3E */ #ifdef CONFIG_PPC_BOOK3S - mm_context_id_t mm_ctx_id; #ifdef CONFIG_PPC_MM_SLICES unsigned char mm_ctx_low_slices_psize[BITS_PER_LONG / BITS_PER_BYTE]; unsigned char mm_ctx_high_slices_psize[SLICE_ARRAY_SIZE]; - unsigned long mm_ctx_slb_addr_limit; #else u16 mm_ctx_user_psize; u16 mm_ctx_sllp; diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c index 7f5aae3c387d..9bd30cac852b 100644 --- a/arch/powerpc/kernel/paca.c +++ b/arch/powerpc/kernel/paca.c @@ -346,10 +346,8 @@ void copy_mm_to_paca(struct mm_struct *mm) #ifdef CONFIG_PPC_BOOK3S mm_context_t *context = &mm->context; - get_paca()->mm_ctx_id = context->id; #ifdef CONFIG_PPC_MM_SLICES VM_BUG_ON(!mm_ctx_slb_addr_limit(context)); - get_paca()->mm_ctx_slb_addr_limit = mm_ctx_slb_addr_limit(context); memcpy(&get_paca()->mm_ctx_low_slices_psize, mm_ctx_low_slices(context), LOW_SLICE_ARRAY_SZ); memcpy(&get_paca()->mm_ctx_high_slices_psize, mm_ctx_high_slices(context), -- 2.25.0
[PATCH 1/2] powerpc/asm-offset: Remove unused items related to paca
PACA_SIZE, PACACONTEXTID, PACALOWSLICESPSIZE, PACAHIGHSLICEPSIZE, PACA_SLB_ADDR_LIMIT, MMUPSIZEDEFSIZE, PACASLBCACHE, PACASLBCACHEPTR, PACASTABRR, PACAVMALLOCSLLP, MMUPSIZESLLP, PACACONTEXTSLLP, PACALPPACAPTR, LPPACA_DTLIDX and PACA_DTL_RIDX are not used anymore by ASM code. Remove them. Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/asm-offsets.c | 24 1 file changed, 24 deletions(-) diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 28af4efb4587..419ab4a89114 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -197,7 +197,6 @@ int main(void) OFFSET(ICACHEL1LOGBLOCKSIZE, ppc64_caches, l1i.log_block_size); OFFSET(ICACHEL1BLOCKSPERPAGE, ppc64_caches, l1i.blocks_per_page); /* paca */ - DEFINE(PACA_SIZE, sizeof(struct paca_struct)); OFFSET(PACAPACAINDEX, paca_struct, paca_index); OFFSET(PACAPROCSTART, paca_struct, cpu_start); OFFSET(PACAKSAVE, paca_struct, kstack); @@ -212,15 +211,6 @@ int main(void) OFFSET(PACAIRQSOFTMASK, paca_struct, irq_soft_mask); OFFSET(PACAIRQHAPPENED, paca_struct, irq_happened); OFFSET(PACA_FTRACE_ENABLED, paca_struct, ftrace_enabled); -#ifdef CONFIG_PPC_BOOK3S - OFFSET(PACACONTEXTID, paca_struct, mm_ctx_id); -#ifdef CONFIG_PPC_MM_SLICES - OFFSET(PACALOWSLICESPSIZE, paca_struct, mm_ctx_low_slices_psize); - OFFSET(PACAHIGHSLICEPSIZE, paca_struct, mm_ctx_high_slices_psize); - OFFSET(PACA_SLB_ADDR_LIMIT, paca_struct, mm_ctx_slb_addr_limit); - DEFINE(MMUPSIZEDEFSIZE, sizeof(struct mmu_psize_def)); -#endif /* CONFIG_PPC_MM_SLICES */ -#endif #ifdef CONFIG_PPC_BOOK3E OFFSET(PACAPGD, paca_struct, pgd); @@ -241,21 +231,9 @@ int main(void) #endif /* CONFIG_PPC_BOOK3E */ #ifdef CONFIG_PPC_BOOK3S_64 - OFFSET(PACASLBCACHE, paca_struct, slb_cache); - OFFSET(PACASLBCACHEPTR, paca_struct, slb_cache_ptr); - OFFSET(PACASTABRR, paca_struct, stab_rr); - OFFSET(PACAVMALLOCSLLP, paca_struct, vmalloc_sllp); -#ifdef CONFIG_PPC_MM_SLICES - OFFSET(MMUPSIZESLLP, mmu_psize_def, sllp); 
-#else - OFFSET(PACACONTEXTSLLP, paca_struct, mm_ctx_sllp); -#endif /* CONFIG_PPC_MM_SLICES */ OFFSET(PACA_EXGEN, paca_struct, exgen); OFFSET(PACA_EXMC, paca_struct, exmc); OFFSET(PACA_EXNMI, paca_struct, exnmi); -#ifdef CONFIG_PPC_PSERIES - OFFSET(PACALPPACAPTR, paca_struct, lppaca_ptr); -#endif OFFSET(PACA_SLBSHADOWPTR, paca_struct, slb_shadow_ptr); OFFSET(SLBSHADOW_STACKVSID, slb_shadow, save_area[SLB_NUM_BOLTED - 1].vsid); OFFSET(SLBSHADOW_STACKESID, slb_shadow, save_area[SLB_NUM_BOLTED - 1].esid); @@ -264,9 +242,7 @@ int main(void) #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE OFFSET(PACA_PMCINUSE, paca_struct, pmcregs_in_use); #endif - OFFSET(LPPACA_DTLIDX, lppaca, dtl_idx); OFFSET(LPPACA_YIELDCOUNT, lppaca, yield_count); - OFFSET(PACA_DTL_RIDX, paca_struct, dtl_ridx); #endif /* CONFIG_PPC_BOOK3S_64 */ OFFSET(PACAEMERGSP, paca_struct, emergency_sp); #ifdef CONFIG_PPC_BOOK3S_64 -- 2.25.0
Re: [PATCH v2] powerpc/64: BE option to use ELFv2 ABI for big endian kernels
Hi! On Mon, May 03, 2021 at 10:51:41AM +1000, Nicholas Piggin wrote: > Excerpts from Segher Boessenkool's message of May 3, 2021 3:55 am: > > On Wed, Apr 29, 2020 at 10:57:16AM +1000, Nicholas Piggin wrote: > >> Excerpts from Segher Boessenkool's message of April 29, 2020 9:40 am: > >> I blame toolchain for -mabi=elfv2 ! And also some blame on ABI document > >> which is called ELF V2 ABI rather than ELF ABI V2 which would have been > >> unambiguous. > > > > At least ELFv2 ABI is correct. "ELF ABI v2" is not. > > > >> I can go through and change all my stuff and config options to ELF_ABI_v2. > > > > Please don't. It is wrong. > > Then I'm not sure what the point of your previous mail was, what did I > miss? I asked if you could make it clearer to people who do not know what this is whether they want to use it. Or that was my intention, anyhow :-/ > > Both the original PowerPC ELF ABI and the > > ELFv2 one have versions themselves. Also, the base ELF standard has a > > version, and is set up so there can be incompatible versions even! Of > > course it still is version 1 to this day, but :-) > > The point was for people who don't know ELFv2 has a specific meaning for > powerpc, It does not have *any* meaning outside of Power. But people who do not know what it is can assume the wrong things about it. It isn't a great name because of that :-( (It's not as bad as the MIPS ABIs -- an older one is called "new" :-) ) > then ELF ABIv2 is more explanatory about it being an abi change > rather than base elf change, even if it's not the "correct" name. I very much disagree. "ELF ABIv2" is completely meaningless. > If you don't want that then good, I also prefer to just use ELFv2. I Good :-) > think people who change this option can easily look up the name in > toolchain and other docs. Yeah. As long as the defaults are good, whoever blows themselves up has only themselves to blame :-P Segher
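For readers who are unsure which of the two ABIs a given build actually targets, a small hedged sketch may help. GCC for 64-bit PowerPC predefines the macro _CALL_ELF (1 for the original ABI, 2 for ELFv2); demo_elf_abi_version() is a hypothetical helper written for this note, and on compilers or architectures that do not define the macro it simply reports 0.

```c
/*
 * Return the ELF ABI version reported by the compiler through _CALL_ELF,
 * or 0 when the macro is not defined (for example on non-PowerPC targets).
 */
int demo_elf_abi_version(void)
{
#if defined(_CALL_ELF)
	return _CALL_ELF;
#else
	return 0;
#endif
}
```

On a big-endian powerpc64 toolchain, building this with and without -mabi=elfv2 should show the difference the thread is discussing, assuming the toolchain supports that option.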
Re: [PATCH v3] powerpc/64: Option to use ELFv2 ABI for big-endian kernels
On Mon, May 03, 2021 at 01:37:57PM +0200, Andreas Schwab wrote:
> Should this add a tag to the module vermagic?

Would the modules link even if the vermagic was not changed?

I suppose something like this might do it.

Thanks

Michal

diff --git a/arch/powerpc/include/asm/vermagic.h b/arch/powerpc/include/asm/vermagic.h
index b054a8576e5d..3fdaacd7a743 100644
--- a/arch/powerpc/include/asm/vermagic.h
+++ b/arch/powerpc/include/asm/vermagic.h
@@ -14,7 +14,14 @@
 #define MODULE_ARCH_VERMAGIC_RELOCATABLE	""
 #endif

+
+#ifdef CONFIG_PPC64_BUILD_BIG_ENDIAN_ELF_V2_ABI
+#define MODULE_ARCH_VERMAGIC_ELF_V2_ABI	"abi-elfv2 "
+#else
+#define MODULE_ARCH_VERMAGIC_ELF_V2_ABI	""
+#endif
+
 #define MODULE_ARCH_VERMAGIC \
-		MODULE_ARCH_VERMAGIC_FTRACE MODULE_ARCH_VERMAGIC_RELOCATABLE
+		MODULE_ARCH_VERMAGIC_FTRACE MODULE_ARCH_VERMAGIC_RELOCATABLE MODULE_ARCH_VERMAGIC_ELF_V2_ABI

 #endif /* _ASM_VERMAGIC_H */
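The patch relies on adjacent string literals being concatenated by the compiler, so each optional tag is either a short string or an empty one. The sketch below illustrates that mechanism in isolation. All DEMO_ names, the DEMO_ELF_V2_ABI switch and the tag values are assumptions made for this example, and the comparison function is only a simplified picture of a vermagic check, not the module loader's actual code.

```c
#include <string.h>

/* Hypothetical switch standing in for a Kconfig symbol. */
#define DEMO_ELF_V2_ABI 1

/* Illustrative tag values; each ends with a space so tags can be chained. */
#define DEMO_VERMAGIC_FTRACE		"mprofile-kernel "
#define DEMO_VERMAGIC_RELOCATABLE	"relocatable "

#if DEMO_ELF_V2_ABI
#define DEMO_VERMAGIC_ELF_V2_ABI	"abi-elfv2 "
#else
#define DEMO_VERMAGIC_ELF_V2_ABI	""
#endif

/* Adjacent string literals are merged into one string at compile time. */
#define DEMO_ARCH_VERMAGIC \
	DEMO_VERMAGIC_FTRACE DEMO_VERMAGIC_RELOCATABLE DEMO_VERMAGIC_ELF_V2_ABI

/* Simplified check: a module is accepted only if its tag string matches. */
int demo_vermagic_matches(const char *module_magic)
{
	return strcmp(module_magic, DEMO_ARCH_VERMAGIC) == 0;
}
```

With the extra tag in place, a module built without the ELFv2 option carries a shorter string and no longer matches, which is the mismatch the proposed change is meant to surface.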
Re: [PATCH v5 14/16] dma-direct: Allocate memory from restricted DMA pool if available
On Fri, Apr 23, 2021 at 9:46 PM Robin Murphy wrote: > > On 2021-04-22 09:15, Claire Chang wrote: > > The restricted DMA pool is preferred if available. > > > > The restricted DMA pools provide a basic level of protection against the > > DMA overwriting buffer contents at unexpected times. However, to protect > > against general data leakage and system memory corruption, the system > > needs to provide a way to lock down the memory access, e.g., MPU. > > > > Signed-off-by: Claire Chang > > --- > > kernel/dma/direct.c | 35 ++- > > 1 file changed, 26 insertions(+), 9 deletions(-) > > > > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c > > index 7a27f0510fcc..29523d2a9845 100644 > > --- a/kernel/dma/direct.c > > +++ b/kernel/dma/direct.c > > @@ -78,6 +78,10 @@ static bool dma_coherent_ok(struct device *dev, > > phys_addr_t phys, size_t size) > > static void __dma_direct_free_pages(struct device *dev, struct page *page, > > size_t size) > > { > > +#ifdef CONFIG_DMA_RESTRICTED_POOL > > + if (swiotlb_free(dev, page, size)) > > + return; > > +#endif > > dma_free_contiguous(dev, page, size); > > } > > > > @@ -92,7 +96,17 @@ static struct page *__dma_direct_alloc_pages(struct > > device *dev, size_t size, > > > > gfp |= dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask, > > &phys_limit); > > - page = dma_alloc_contiguous(dev, size, gfp); > > + > > +#ifdef CONFIG_DMA_RESTRICTED_POOL > > + page = swiotlb_alloc(dev, size); > > + if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) { > > + __dma_direct_free_pages(dev, page, size); > > + page = NULL; > > + } > > +#endif > > + > > + if (!page) > > + page = dma_alloc_contiguous(dev, size, gfp); > > if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) { > > dma_free_contiguous(dev, page, size); > > page = NULL; > > @@ -148,7 +162,7 @@ void *dma_direct_alloc(struct device *dev, size_t size, > > gfp |= __GFP_NOWARN; > > > > if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) && > > - !force_dma_unencrypted(dev)) { > 
> + !force_dma_unencrypted(dev) && !is_dev_swiotlb_force(dev)) { > > page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO); > > if (!page) > > return NULL; > > @@ -161,8 +175,8 @@ void *dma_direct_alloc(struct device *dev, size_t size, > > } > > > > if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) && > > - !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && > > - !dev_is_dma_coherent(dev)) > > + !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && !dev_is_dma_coherent(dev) > > && > > + !is_dev_swiotlb_force(dev)) > > return arch_dma_alloc(dev, size, dma_handle, gfp, attrs); > > > > /* > > @@ -172,7 +186,9 @@ void *dma_direct_alloc(struct device *dev, size_t size, > > if (IS_ENABLED(CONFIG_DMA_COHERENT_POOL) && > > !gfpflags_allow_blocking(gfp) && > > (force_dma_unencrypted(dev) || > > - (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && > > !dev_is_dma_coherent(dev > > + (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && > > + !dev_is_dma_coherent(dev))) && > > + !is_dev_swiotlb_force(dev)) > > return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp); > > > > /* we always manually zero the memory once we are done */ > > @@ -253,15 +269,15 @@ void dma_direct_free(struct device *dev, size_t size, > > unsigned int page_order = get_order(size); > > > > if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) && > > - !force_dma_unencrypted(dev)) { > > + !force_dma_unencrypted(dev) && !is_dev_swiotlb_force(dev)) { > > /* cpu_addr is a struct page cookie, not a kernel address */ > > dma_free_contiguous(dev, cpu_addr, size); > > return; > > } > > > > if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) && > > - !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && > > - !dev_is_dma_coherent(dev)) { > > + !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && !dev_is_dma_coherent(dev) > > && > > + !is_dev_swiotlb_force(dev)) { > > arch_dma_free(dev, size, cpu_addr, dma_addr, attrs); > > return; > > } > > @@ -289,7 +305,8 @@ struct page *dma_direct_alloc_pages(struct device *dev, > > size_t size, > > void *ret; > > > > if 
(IS_ENABLED(CONFIG_DMA_COHERENT_POOL) && > > - force_dma_unencrypted(dev) && !gfpflags_allow_blocking(gfp)) > > + force_dma_unencrypted(dev) && !gfpflags_allow_blocking(gfp) && > > + !is_dev_swiotlb_force(dev)) > > return dma_direct_alloc_from_pool(dev, size, dma_handle, gfp); > > Wait, this seems broken for non-coherent devices - in that case we need >
Re: [PATCH] Raise the minimum GCC version to 5.2
On Mon, May 3, 2021 at 2:20 PM David Laight wrote: > > It would be nice to be able to build current kernels (for local > use) on the 'new' system - but gcc is already too old. I have seen such environments too... However, for the kernel in particular, you could install a newer GCC in the 'new' machine (just for the kernel builds) or do your kernel builds in a different machine -- a 'new' 'new' one :) Cheers, Miguel
[PATCH 4/4] powerpc/powernv: Remove POWER9 PVR version check for entry and uaccess flushes
These aren't necessarily POWER9 only, and it's not to say some new vulnerability may not get discovered on other processors for which we would like the flexibility of having the workaround enabled by firmware. Remove the restriction that they only apply to POWER9. Signed-off-by: Nicholas Piggin --- arch/powerpc/platforms/powernv/setup.c | 9 - 1 file changed, 9 deletions(-) diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c index a8db3f153063..6ec67223f8c7 100644 --- a/arch/powerpc/platforms/powernv/setup.c +++ b/arch/powerpc/platforms/powernv/setup.c @@ -122,15 +122,6 @@ static void pnv_setup_security_mitigations(void) type = L1D_FLUSH_ORI; } - /* -* If we are non-Power9 bare metal, we don't need to flush on kernel -* entry or after user access: they fix a P9 specific vulnerability. -*/ - if (!pvr_version_is(PVR_POWER9)) { - security_ftr_clear(SEC_FTR_L1D_FLUSH_ENTRY); - security_ftr_clear(SEC_FTR_L1D_FLUSH_UACCESS); - } - enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && \ (security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR) || \ security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV)); -- 2.23.0
[PATCH 3/4] powerpc/pseries: Get STF barrier requirement from H_GET_CPU_CHARACTERISTICS
This allows the hypervisor / firmware to describe this workaround to
the guest.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/hvcall.h      | 1 +
 arch/powerpc/platforms/pseries/setup.c | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index f962b339865c..a60ef261f63a 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -395,6 +395,7 @@
 #define H_CPU_BEHAV_FLUSH_LINK_STACK	(1ull << 57) // IBM bit 6
 #define H_CPU_BEHAV_NO_L1D_FLUSH_ENTRY	(1ull << 56) // IBM bit 7
 #define H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS (1ull << 55) // IBM bit 8
+#define H_CPU_BEHAV_NO_STF_BARRIER	(1ull << 54) // IBM bit 9

 /* Flag values used in H_REGISTER_PROC_TBL hcall */
 #define PROC_TABLE_OP_MASK	0x18
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 287f33645419..631a0d57b6cd 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -555,6 +555,9 @@ static void init_cpu_char_feature_flags(struct h_cpu_char_result *result)
 	if (result->behaviour & H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS)
 		security_ftr_clear(SEC_FTR_L1D_FLUSH_UACCESS);

+	if (result->behaviour & H_CPU_BEHAV_NO_STF_BARRIER)
+		security_ftr_clear(SEC_FTR_STF_BARRIER);
+
 	if (!(result->behaviour & H_CPU_BEHAV_BNDS_CHK_SPEC_BAR))
 		security_ftr_clear(SEC_FTR_BNDS_CHK_SPEC_BAR);
 }
--
2.23.0
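Two conventions in this patch are easy to miss: the "IBM bit n" comments count from the most significant bit of a 64-bit word, so IBM bit 9 is 1ull << 54, and the new behaviour bits are negative ("NO_..."), so a feature stays enabled unless the hypervisor sets the bit. The following is a rough sketch of both, using hypothetical DEMO_ names and made-up local feature values rather than the kernel's definitions.

```c
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit word. */
#define DEMO_IBM_BIT(n)	(1ull << (63 - (n)))

/* Behaviour bits as described in the patch comments (IBM bits 7, 8, 9). */
#define DEMO_BEHAV_NO_L1D_FLUSH_ENTRY	DEMO_IBM_BIT(7)
#define DEMO_BEHAV_NO_L1D_FLUSH_UACCESS	DEMO_IBM_BIT(8)
#define DEMO_BEHAV_NO_STF_BARRIER	DEMO_IBM_BIT(9)

/* Hypothetical local feature flags; values chosen only for this example. */
#define DEMO_FTR_L1D_FLUSH_ENTRY	0x1ull
#define DEMO_FTR_L1D_FLUSH_UACCESS	0x2ull
#define DEMO_FTR_STF_BARRIER		0x4ull
#define DEMO_FTR_DEFAULT \
	(DEMO_FTR_L1D_FLUSH_ENTRY | DEMO_FTR_L1D_FLUSH_UACCESS | \
	 DEMO_FTR_STF_BARRIER)

/*
 * Start from "everything enabled" and clear a mitigation only when the
 * hypervisor explicitly says it is not needed.
 */
uint64_t demo_apply_behaviour(uint64_t behaviour)
{
	uint64_t features = DEMO_FTR_DEFAULT;

	if (behaviour & DEMO_BEHAV_NO_L1D_FLUSH_ENTRY)
		features &= ~DEMO_FTR_L1D_FLUSH_ENTRY;
	if (behaviour & DEMO_BEHAV_NO_L1D_FLUSH_UACCESS)
		features &= ~DEMO_FTR_L1D_FLUSH_UACCESS;
	if (behaviour & DEMO_BEHAV_NO_STF_BARRIER)
		features &= ~DEMO_FTR_STF_BARRIER;

	return features;
}
```

The default-on design means an older hypervisor that never sets these bits leaves all three mitigations enabled.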
[PATCH 2/4] powerpc/security: Add a security feature for STF barrier
Rather than tying this mitigation to RFI L1D flush requirement, add a new bit for it. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/security_features.h | 4 arch/powerpc/kernel/security.c | 7 ++- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/include/asm/security_features.h b/arch/powerpc/include/asm/security_features.h index b774a4477d5f..792eefaf230b 100644 --- a/arch/powerpc/include/asm/security_features.h +++ b/arch/powerpc/include/asm/security_features.h @@ -92,6 +92,9 @@ static inline bool security_ftr_enabled(u64 feature) // The L1-D cache should be flushed after user accesses from the kernel #define SEC_FTR_L1D_FLUSH_UACCESS 0x8000ull +// The STF flush should be executed on privilege state switch +#define SEC_FTR_STF_BARRIER0x0001ull + // Features enabled by default #define SEC_FTR_DEFAULT \ (SEC_FTR_L1D_FLUSH_HV | \ @@ -99,6 +102,7 @@ static inline bool security_ftr_enabled(u64 feature) SEC_FTR_BNDS_CHK_SPEC_BAR | \ SEC_FTR_L1D_FLUSH_ENTRY | \ SEC_FTR_L1D_FLUSH_UACCESS | \ +SEC_FTR_STF_BARRIER | \ SEC_FTR_FAVOUR_SECURITY) #endif /* _ASM_POWERPC_SECURITY_FEATURES_H */ diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c index 0fdfcdd9d880..2eb257b759c6 100644 --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c @@ -300,9 +300,7 @@ static void stf_barrier_enable(bool enable) void setup_stf_barrier(void) { enum stf_barrier_type type; - bool enable, hv; - - hv = cpu_has_feature(CPU_FTR_HVMODE); + bool enable; /* Default to fallback in case fw-features are not available */ if (cpu_has_feature(CPU_FTR_ARCH_300)) @@ -315,8 +313,7 @@ void setup_stf_barrier(void) type = STF_BARRIER_NONE; enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && - (security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR) || -(security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) && hv)); +security_ftr_enabled(SEC_FTR_STF_BARRIER); if (type == STF_BARRIER_FALLBACK) { pr_info("stf-barrier: fallback barrier available\n"); -- 
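The change in the enable condition can be compared side by side in a small standalone sketch. The flag values and the demo_ function names below are assumptions for illustration only; the two functions mirror the old and new boolean expressions from the diff, nothing more.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the real ones live in security_features.h. */
#define DEMO_FTR_L1D_FLUSH_HV		0x1ull
#define DEMO_FTR_L1D_FLUSH_PR		0x2ull
#define DEMO_FTR_FAVOUR_SECURITY	0x4ull
#define DEMO_FTR_STF_BARRIER		0x8ull

static bool demo_ftr_enabled(uint64_t features, uint64_t bit)
{
	return (features & bit) != 0;
}

/* Old rule: the STF barrier follows the RFI L1D flush requirement. */
bool demo_stf_enable_old(uint64_t features, bool hv)
{
	return demo_ftr_enabled(features, DEMO_FTR_FAVOUR_SECURITY) &&
	       (demo_ftr_enabled(features, DEMO_FTR_L1D_FLUSH_PR) ||
		(demo_ftr_enabled(features, DEMO_FTR_L1D_FLUSH_HV) && hv));
}

/* New rule: the STF barrier is controlled by its own feature bit. */
bool demo_stf_enable_new(uint64_t features)
{
	return demo_ftr_enabled(features, DEMO_FTR_FAVOUR_SECURITY) &&
	       demo_ftr_enabled(features, DEMO_FTR_STF_BARRIER);
}
```

The two rules diverge when firmware reports that no L1D flush is required but the STF barrier still is: the old expression disables the barrier, the new one keeps it.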
2.23.0
[PATCH 1/4] powerpc/pseries: Get entry and uaccess flush required bits from H_GET_CPU_CHARACTERISTICS
This allows the hypervisor / firmware to describe these workarounds to the guest. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/hvcall.h | 2 ++ arch/powerpc/platforms/pseries/setup.c | 6 ++ 2 files changed, 8 insertions(+) diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h index 443050906018..f962b339865c 100644 --- a/arch/powerpc/include/asm/hvcall.h +++ b/arch/powerpc/include/asm/hvcall.h @@ -393,6 +393,8 @@ #define H_CPU_BEHAV_FAVOUR_SECURITY_H (1ull << 60) // IBM bit 3 #define H_CPU_BEHAV_FLUSH_COUNT_CACHE (1ull << 58) // IBM bit 5 #define H_CPU_BEHAV_FLUSH_LINK_STACK (1ull << 57) // IBM bit 6 +#define H_CPU_BEHAV_NO_L1D_FLUSH_ENTRY (1ull << 56) // IBM bit 7 +#define H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS (1ull << 55) // IBM bit 8 /* Flag values used in H_REGISTER_PROC_TBL hcall */ #define PROC_TABLE_OP_MASK 0x18 diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 754e493b7c05..287f33645419 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -549,6 +549,12 @@ static void init_cpu_char_feature_flags(struct h_cpu_char_result *result) if (!(result->behaviour & H_CPU_BEHAV_L1D_FLUSH_PR)) security_ftr_clear(SEC_FTR_L1D_FLUSH_PR); + if (result->behaviour & H_CPU_BEHAV_NO_L1D_FLUSH_ENTRY) + security_ftr_clear(SEC_FTR_L1D_FLUSH_ENTRY); + + if (result->behaviour & H_CPU_BEHAV_NO_L1D_FLUSH_UACCESS) + security_ftr_clear(SEC_FTR_L1D_FLUSH_UACCESS); + if (!(result->behaviour & H_CPU_BEHAV_BNDS_CHK_SPEC_BAR)) security_ftr_clear(SEC_FTR_BNDS_CHK_SPEC_BAR); } -- 2.23.0
[PATCH 0/4] powerpc/security mitigation updates
This series adds a few bits that were recently added to the pseries
H_GET_CPU_CHARACTERISTICS hcall and implements them. It also removes a
restriction from powernv for some of the flushes.

This is tested mainly in qemu, where I just submitted a patch that adds
support for these bits (not upstream yet).

Nicholas Piggin (4):
  powerpc/pseries: Get entry and uaccess flush required bits from H_GET_CPU_CHARACTERISTICS
  powerpc/security: Add a security feature for STF barrier
  powerpc/pseries: Get STF barrier requirement from H_GET_CPU_CHARACTERISTICS
  powerpc/powernv: Remove POWER9 PVR version check for entry and uaccess flushes

 arch/powerpc/include/asm/hvcall.h            | 3 +++
 arch/powerpc/include/asm/security_features.h | 4
 arch/powerpc/kernel/security.c               | 7 ++-
 arch/powerpc/platforms/powernv/setup.c       | 9 -
 arch/powerpc/platforms/pseries/setup.c       | 9 +
 5 files changed, 18 insertions(+), 14 deletions(-)

-- 
2.23.0
Re: [PATCH] Raise the minimum GCC version to 5.2
On Sun, May 02, 2021 at 12:15:38AM +0900, Masahiro Yamada wrote:
> The current minimum GCC version is 4.9 except ARCH=arm64 requiring
> GCC 5.1.
>
> When we discussed last time, we agreed to raise the minimum GCC version
> to 5.1 globally. [1]

There are still a lot of comment references to old gcc releases with
workarounds or bugfixes, a quick search:

$ git grep -in 'gcc.*[234]\.x'
arch/alpha/include/asm/string.h:30:/* For gcc 3.x, we cannot have the inline function named "memset" because
arch/arc/include/asm/checksum.h:9: * -gcc 4.4.x broke networking. Alias analysis needed to be primed.
arch/arm/Makefile:127:# Need -Uarm for gcc < 3.x
arch/ia64/lib/memcpy_mck.S:535: * Due to lack of local tag support in gcc 2.x assembler, it is not clear which
arch/mips/include/asm/page.h:210: * also affect MIPS so we keep this one until GCC 3.x has been retired
arch/x86/include/asm/page.h:53: * remove this Voodoo magic stuff. (i.e. once gcc3.x is deprecated)
arch/x86/kvm/x86.c:5569: * This union makes it completely explicit to gcc-3.x
arch/x86/mm/pgtable.c:302: if (PREALLOCATED_PMDS == 0) /* Work around gcc-3.4.x bug */
drivers/net/ethernet/renesas/sh_eth.c:51: * that warning from W=1 builds. GCC has supported this option since 4.2.X, but
lib/xz/xz_dec_lzma2.c:494: * of the code generated by GCC 3.x decreases 10-15 %. (GCC 4.3 doesn't care,
lib/xz/xz_dec_lzma2.c:495: * and it generates 10-20 % faster code than GCC 3.x from this file anyway.)
net/core/skbuff.c:32: * The functions in this file will not compile correctly with gcc 2.4.x

This misses version-specific quirks, but the following returns 216
results and not all are problematic (e.g. just referring to gcc for some
historical reason) so I'm not pasting it here.

$ git grep -in 'gcc.*[234]\.[0-9]'
...
RE: [PATCH] Raise the minimum GCC version to 5.2
From: Arnd Bergmann
> Sent: 03 May 2021 10:25
...
> One scenario that I've seen previously is where user space and
> kernel are built together as a source based distribution (OE, buildroot,
> openwrt, ...), and the compiler is picked to match the original sources
> of the user space because that is best tested, but the same compiler
> then gets used to build the kernel as well because that is the default
> in the build environment.

If you are building programs for release to customers who might be
running them on old distributions then you need a system with the
original userspace headers and almost certainly a similar vintage
compiler.

Never mind RHEL7, we have customers running RHEL6.
(We've managed to get everyone off RHEL5.)
So the build machine is running a 10+ year old distro.

I did try to build on a newer system (only 5 years old) but the
complete fubar of memcpy() makes it impossible to compile C programs
that will run on an older libc.

And don't even mention C++, the 'character traits' is just plain
horrid - enough to make me want to remove every reference to CString
from the small amount of C++ we have.

To quote our makefile:

# C++ is fighting back.
# I'd like to be able to compile on a 'new' system and still be able to run
# the binaries on RHEL 6 (2.6.32 kernel 2011 era libraries).
# But even linking libstdc++ static still leaves
# an undefined C++ symbol that the dynamic loader barfs on.
# The static libstdc++ also references memcpy@GLIBC_2.14 - but that can be
# 'solved' by adding an extra .so that defines the symbol (and calls memmove()).
# I've also tried pulling a single .o out of libstc++.a. This might work if
# the .o is small and self contained.
#
# For now we statically link libstc++ and continue to build on an old system.
C++LDLIBS := -Wl,-Bstatic -lstdc++ -Wl,-Bdynamic

It would be nice to be able to build current kernels (for local use)
on the 'new' system - but gcc is already too old.
David
Re: [PATCH v3] powerpc/64: Option to use ELFv2 ABI for big-endian kernels
Should this add a tag to the module vermagic? Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different."
Re: [PATCH] Raise the minimum GCC version to 5.2
On Mon, May 3, 2021 at 12:32 AM Matthew Wilcox wrote: > On Sun, May 02, 2021 at 02:08:31PM -0700, Linus Torvalds wrote: > > What is relevant is what version of gcc various distributions actually > > have reasonably easily available, and how old and relevant the > > distributions are. We did decide that (just as an example) RHEL 7 was > > too old to worry about when we updated the gcc version requirement > > last time. > > > > Last year, Arnd and Kirill (maybe others were involved too) made a > > list of distros and older gcc versions. But I don't think anybody > > actually _maintains_ such a list. It would be perhaps interesting to > > have some way to check what compiler versions are being offered by > > different distros. > > fwiw, Debian 9 aka Stretch released June 2017 had gcc 6.3 > Debian 10 aka Buster released June 2019 had gcc 7.4 *and* 8.3. > Debian 8 aka Jessie had gcc-4.8.4 and gcc-4.9.2. > > So do we care about people who haven't bothered to upgrade userspace > since 2017? If so, we can't go past 4.9. I would argue that we shouldn't care about distros that are officially end-of-life. Jessie support ended last July according to the official Debian pages at https://wiki.debian.org/LTS. It's a little harder for distros that are still officially supported, like the RHEL7 case that Linus mentioned, Debian Stretch (gcc-6.3), Slackware 14.2 (gcc-5.3), or Ubuntu 18.04 (gcc-7.3). For any of these you could make the argument one way or the other: either say we care as long as the distro cares, or the users that want to build their own kernels can be reasonably expected to either upgrade their distro or install a newer compiler manually. 
Looking at the Debian case specifically, I see these numbers from https://popcon.debian.org/: testing/unstable: 16730 buster/stable: 113881 stretch/oldstable: 39147 jessie/oldoldstable: 19286 Assuming the numbers of users that installed popcon are proportional to the actual number of users, that's still a large chunk of people running stretch or older. Presumably, these users are actually less likely to build their own kernels. Arnd
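To put a rough number on "a large chunk", the popcon counts quoted above can be added up as below. This is just a back-of-the-envelope sketch; the `demo_` names are made up for the example, and the result is only as meaningful as the assumption that popcon reports are proportional to real users.

```c
/* Popcon report counts as quoted in the mail above. */
static const unsigned long demo_popcon[] = {
	16730,  /* testing/unstable */
	113881, /* buster/stable */
	39147,  /* stretch/oldstable */
	19286,  /* jessie/oldoldstable */
};

/* Share of reports coming from stretch or older, in tenths of a percent. */
static unsigned long demo_old_share_tenths(void)
{
	unsigned long total = 0, old;
	unsigned int i;

	for (i = 0; i < sizeof(demo_popcon) / sizeof(demo_popcon[0]); i++)
		total += demo_popcon[i];

	old = demo_popcon[2] + demo_popcon[3]; /* stretch + jessie */
	return old * 1000 / total;
}
```

With these figures, 58433 of 189044 reports come from stretch or jessie, so the function returns 309, i.e. about 30.9%.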
[PATCH] powerpc/pseries: Enable hardlockup watchdog for PowerVM partitions
PowerVM will not arbitrarily oversubscribe or stop guests, page out the guest kernel text to a NFS volume connected by carrier pigeon to abacus based storage, etc., as a KVM host might. So PowerVM guests are not likely to be killed by the hard lockup watchdog in normal operation, even with shared processor LPARs which still get a minimum allotment of CPU time. Enable the hard lockup detector by default on !KVM guests, which we will assume is PowerVM. It has been useful in finding problems on bare metal kernels. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/setup_64.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c index b779d25761cf..c0e234456863 100644 --- a/arch/powerpc/kernel/setup_64.c +++ b/arch/powerpc/kernel/setup_64.c @@ -939,15 +939,17 @@ u64 hw_nmi_get_sample_period(int watchdog_thresh) * disable it by default. Book3S has a soft-nmi hardlockup detector based * on the decrementer interrupt, so it does not suffer from this problem. * - * It is likely to get false positives in VM guests, so disable it there - * by default too. + * It is likely to get false positives in KVM guests, so disable it there + * by default too. PowerVM will not stop or arbitrarily oversubscribe + * CPUs, but give a minimum regular allotment even with SPLPAR, so enable + * the detector for non-KVM guests, assume PowerVM. */ static int __init disable_hardlockup_detector(void) { #ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF hardlockup_detector_disable(); #else - if (firmware_has_feature(FW_FEATURE_LPAR)) + if (is_kvm_guest()) hardlockup_detector_disable(); #endif -- 2.23.0
[PATCH] powerpc/64s: Make NMI record implicitly soft-masked code as irqs disabled
scv support introduced the notion of code that implicitly soft-masks irqs due to the instruction addresses. This is required because scv enters the kernel with MSR[EE]=1. If a NMI (including soft-NMI) interrupt hits when we are implicitly soft-masked then its regs->softe does not reflect this because it is derived from the explicit soft mask state (paca->irq_soft_mask). This makes arch_irq_disabled_regs(regs) return false. This can trigger a warning in the soft-NMI watchdog code (shown below). Fix it by having NMI interrupts set regs->softe to disabled in case of interrupting an implicit soft-masked region. [ cut here ] WARNING: CPU: 41 PID: 1103 at arch/powerpc/kernel/watchdog.c:259 soft_nmi_interrupt+0x3e4/0x5f0 CPU: 41 PID: 1103 Comm: (spawn) Not tainted NIP: c0039534 LR: c0039234 CTR: c0009a00 REGS: c07fffbcf940 TRAP: 0700 Not tainted MSR: 90021033 CR: 22042482 XER: 200400ad CFAR: c0039260 IRQMASK: 3 GPR00: c0039204 c07fffbcfbe0 c1d6c300 0003 GPR04: 7a45d078 0008 0020 GPR08: 007ffd4e c07ceb00 7265677368657265 GPR12: 90009033 c07ceb00 0f7075bf4480 002a GPR16: 0f705745a528 7a45ddd8 0f70574d0008 GPR20: 0f7075c58d70 0f7057459c38 0001 0040 GPR24: 0029 c1dae058 0029 GPR28: 0800 0009 c07fffbcfd60 NIP [c0039534] soft_nmi_interrupt+0x3e4/0x5f0 LR [c0039234] soft_nmi_interrupt+0xe4/0x5f0 Call Trace: [c07fffbcfbe0] [c0039204] soft_nmi_interrupt+0xb4/0x5f0 (unreliable) [c07fffbcfcf0] [c000c0e8] soft_nmi_common+0x138/0x1c4 --- interrupt: 900 at end_real_trampolines+0x0/0x1000 NIP: c0003000 LR: 7ca426adb03c CTR: 9280f033 REGS: c07fffbcfd60 TRAP: 0900 MSR: 90009033 CR: 44042482 XER: 200400ad CFAR: 7ca426946020 IRQMASK: 0 GPR00: 00ad 7a45d050 7ca426b07f00 0035 GPR04: 7a45d078 0008 0020 GPR08: 0010 1000 7a45d110 GPR12: 0001 7ca426d4e680 0f7075bf4480 002a GPR16: 0f705745a528 7a45ddd8 0f70574d0008 GPR20: 0f7075c58d70 0f7057459c38 0001 0040 GPR24: 0f7057473f68 0003 041b GPR28: 7a45d4c4 0035 0f7057473f68 NIP [c0003000] end_real_trampolines+0x0/0x1000 LR [7ca426adb03c] 
0x7ca426adb03c --- interrupt: 900 Instruction dump: 6000 6000 6042 3861 482b3ae5 6000 e93f0138 a36d0008 7daa6b78 71290001 7f7907b4 4082fd34 <0fe0> 4bfffd2c 6042 ea6100a8 ---[ end trace dc75f67d819779da ]--- Fixes: 118178e62e2e ("powerpc: move NMI entry/exit code into wrapper") Reported-by: Cédric Le Goater Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/interrupt.h | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h index 44cde2e129b8..299e51337aca 100644 --- a/arch/powerpc/include/asm/interrupt.h +++ b/arch/powerpc/include/asm/interrupt.h @@ -222,6 +222,13 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte local_paca->irq_soft_mask = IRQS_ALL_DISABLED; local_paca->irq_happened |= PACA_IRQ_HARD_DIS; + if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && !(regs->msr & MSR_PR) && + regs->nip < (unsigned long)__end_interrupts) { + /* Kernel code running below __end_interrupts is implicitly +* soft-masked */ + regs->softe = IRQS_ALL_DISABLED; + } + /* Don't do any per-CPU operations until interrupt state is fixed */ if (nmi_disables_ftrace(regs)) { -- 2.23.0
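The check added by this patch can be modelled in plain C as a sketch. Everything here is hypothetical scaffolding: `struct demo_regs`, the constant values and the `end_interrupts` parameter stand in for the kernel's `struct pt_regs`, `MSR_PR`, `IRQS_ALL_DISABLED` and `__end_interrupts`, and are not the real definitions.

```c
/* Stand-ins for the kernel definitions; values are illustrative only. */
#define DEMO_MSR_PR            0x4000ul /* set when the CPU was in user mode */
#define DEMO_IRQS_ENABLED      0ul
#define DEMO_IRQS_ALL_DISABLED 3ul

struct demo_regs {
	unsigned long nip;   /* interrupted instruction address */
	unsigned long msr;   /* machine state at the time of the interrupt */
	unsigned long softe; /* recorded soft-mask state */
};

/*
 * Kernel code below the end of the interrupt vectors is implicitly
 * soft-masked, so an NMI that lands there should record irqs as disabled
 * even if the explicit soft-mask state said otherwise.
 */
static void demo_nmi_fixup_softe(struct demo_regs *regs,
				 unsigned long end_interrupts)
{
	if (!(regs->msr & DEMO_MSR_PR) && regs->nip < end_interrupts)
		regs->softe = DEMO_IRQS_ALL_DISABLED;
}
```

User-mode addresses and kernel addresses above the boundary are left untouched, which matches the two conditions in the patch.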
[PATCH v3] powerpc/64: Option to use ELFv2 ABI for big-endian kernels
Provide an option to build big-endian kernels using the ELFv2 ABI. This works on GCC only so far, although it is rumored to work with clang that's not been tested yet. This can give big-endian kernels some useful advantages of the ELFv2 ABI (e.g., less stack usage, -mprofile-kernel, better compatibility with bpf tools). BE+ELFv2 is not officially supported by the GNU toolchain, but it works fine in testing and has been used by some userspace for some time (e.g., Void Linux). Tested-by: Michal Suchánek Reviewed-by: Segher Boessenkool Signed-off-by: Nicholas Piggin --- I didn't add the -mprofile-kernel change but I think it would be a good one that can be merged independently if it works. Since v2: - Rebased, tweaked changelog. - Changed ELF_V2 to ELF_V2_ABI in config options, to be clearer. Since v1: - Improved the override flavour name suggested by Segher. - Improved changelog wording. arch/powerpc/Kconfig| 22 ++ arch/powerpc/Makefile | 18 -- arch/powerpc/boot/Makefile | 4 +++- arch/powerpc/kernel/vdso64/Makefile | 13 + drivers/crypto/vmx/Makefile | 8 ++-- drivers/crypto/vmx/ppc-xlate.pl | 10 ++ 6 files changed, 62 insertions(+), 13 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 1e6230bea09d..d3f78d3d574d 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -160,6 +160,7 @@ config PPC select ARCH_WEAK_RELEASE_ACQUIRE select BINFMT_ELF select BUILDTIME_TABLE_SORT + select PPC64_BUILD_ELF_V2_ABI if PPC64 && CPU_LITTLE_ENDIAN select CLONE_BACKWARDS select DCACHE_WORD_ACCESS if PPC64 && CPU_LITTLE_ENDIAN select DMA_OPS if PPC64 @@ -568,6 +569,27 @@ config KEXEC_FILE config ARCH_HAS_KEXEC_PURGATORY def_bool KEXEC_FILE +config PPC64_BUILD_ELF_V2_ABI + bool + +config PPC64_BUILD_BIG_ENDIAN_ELF_V2_ABI + bool "Build big-endian kernel using ELF ABI V2 (EXPERIMENTAL)" + depends on PPC64 && CPU_BIG_ENDIAN && EXPERT + depends on CC_IS_GCC && LD_VERSION >= 22400 + default n + select PPC64_BUILD_ELF_V2_ABI + help + This builds the kernel 
image using the "Power Architecture 64-Bit ELF + V2 ABI Specification", which has a reduced stack overhead and faster + function calls. This internal kernel ABI option does not affect + userspace compatibility. + + The V2 ABI is standard for 64-bit little-endian, but for big-endian + it is less well tested by kernel and toolchain. However some distros + build userspace this way, and it can produce a functioning kernel. + + This requires GCC and binutils 2.24 or newer. + config RELOCATABLE bool "Build a relocatable kernel" depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE)) diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 3212d076ac6a..b90b5cb799aa 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -91,10 +91,14 @@ endif ifdef CONFIG_PPC64 ifndef CONFIG_CC_IS_CLANG -cflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call cc-option,-mabi=elfv1) -cflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call cc-option,-mcall-aixdesc) -aflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call cc-option,-mabi=elfv1) -aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mabi=elfv2 +ifdef CONFIG_PPC64_BUILD_ELF_V2_ABI +cflags-y += $(call cc-option,-mabi=elfv2) +aflags-y += $(call cc-option,-mabi=elfv2) +else +cflags-y += $(call cc-option,-mabi=elfv1) +cflags-y += $(call cc-option,-mcall-aixdesc) +aflags-y += $(call cc-option,-mabi=elfv1) +endif endif endif @@ -142,15 +146,17 @@ endif CFLAGS-$(CONFIG_PPC64) := $(call cc-option,-mtraceback=no) ifndef CONFIG_CC_IS_CLANG -ifdef CONFIG_CPU_LITTLE_ENDIAN -CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2,$(call cc-option,-mcall-aixdesc)) +ifdef CONFIG_PPC64_BUILD_ELF_V2_ABI +CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2) AFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2) else +# Keep these in synch with arch/powerpc/kernel/vdso64/Makefile CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv1) CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcall-aixdesc) AFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv1) endif endif + CFLAGS-$(CONFIG_PPC64) += 
$(call cc-option,-mcmodel=medium,$(call cc-option,-mminimal-toc)) CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mno-pointers-to-nested-functions) diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile index 2b8da923ceca..be84a72f8258 100644 --- a/arch/powerpc/boot/Makefile +++ b/arch/powerpc/boot/Makefile @@ -40,6 +40,9 @@ BOOTCFLAGS:= -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
[PATCH] ibmvnic: remove default label from to_string switch
This way the compiler warns when a new value is added to the enum but
not to the string translation like:

drivers/net/ethernet/ibm/ibmvnic.c: In function 'adapter_state_to_string':
drivers/net/ethernet/ibm/ibmvnic.c:832:2: warning: enumeration value 'VNIC_FOOBAR' not handled in switch [-Wswitch]
  switch (state) {
  ^~
drivers/net/ethernet/ibm/ibmvnic.c: In function 'reset_reason_to_string':
drivers/net/ethernet/ibm/ibmvnic.c:1935:2: warning: enumeration value 'VNIC_RESET_FOOBAR' not handled in switch [-Wswitch]
  switch (reason) {
  ^~

Signed-off-by: Michal Suchanek
---
 drivers/net/ethernet/ibm/ibmvnic.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 5788bb956d73..4d439413f6d9 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -846,9 +846,8 @@ static const char *adapter_state_to_string(enum vnic_state state)
 		return "REMOVING";
 	case VNIC_REMOVED:
 		return "REMOVED";
-	default:
-		return "UNKNOWN";
 	}
+	return "UNKNOWN";
 }
 
 static int ibmvnic_login(struct net_device *netdev)
@@ -1946,9 +1945,8 @@ static const char *reset_reason_to_string(enum ibmvnic_reset_reason reason)
 		return "TIMEOUT";
 	case VNIC_RESET_CHANGE_PARAM:
 		return "CHANGE_PARAM";
-	default:
-		return "UNKNOWN";
 	}
+	return "UNKNOWN";
 }
 
 /*
-- 
2.26.2
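For readers who have not seen the idiom, here is a minimal stand-alone sketch of it. The enum and function names are hypothetical and unrelated to ibmvnic; the point is only that without a `default:` label, GCC and Clang can report a missing enumerator via -Wswitch (enabled by -Wall), while the trailing `return` still covers out-of-range values at run time.

```c
/* Hypothetical enum, used only to illustrate the idiom. */
enum demo_state {
	DEMO_PROBING,
	DEMO_OPEN,
	DEMO_CLOSED,
};

static const char *demo_state_to_string(enum demo_state state)
{
	/*
	 * No default label here: if a new enumerator is added and not
	 * handled below, -Wswitch flags this switch at build time.
	 */
	switch (state) {
	case DEMO_PROBING:
		return "PROBING";
	case DEMO_OPEN:
		return "OPEN";
	case DEMO_CLOSED:
		return "CLOSED";
	}
	/* Only reached for values that are not valid enumerators. */
	return "UNKNOWN";
}
```

Adding, say, `DEMO_FOOBAR` to the enum without a matching `case` should then produce a warning like the ones quoted in the commit message, whereas a `default:` label would have silenced it.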
Re: [PATCH] Raise the minimum GCC version to 5.2
On Mon, May 3, 2021 at 2:44 AM Segher Boessenkool wrote: > > On Sun, May 02, 2021 at 02:23:01PM -0700, Joe Perches wrote: > > On Sun, 2021-05-02 at 15:32 -0500, Segher Boessenkool wrote: > > > On Sun, May 02, 2021 at 01:00:28PM -0700, Joe Perches wrote: > > [] > > > > Perhaps 8 might be best as that has a __diag warning control mechanism. > > > > > > I have no idea what you mean? > > > > ? read the last bit of compiler-gcc.h > > Ah, you mean > #pragma GCC diagnostic > (which has existed since GCC 4.2). Does anything in this __diag stuff > require GCC 8? Other than that this is hardcoded here :-) The '8' was just a kernel thing, we made it configurable to have version specific warnings, and I have a header file that adds these macros for all supported compilers, but the version that is in mainline only does it for gcc-8 or later. Early compilers only supported "#pragma GCC diagnostic", but I think even gcc-4.6 supported the _Pragma() syntax that lets you do it inside of a macro. It's something we should improve with plumbing on top, e.g. I want a macro that lets you locally turn off both -Woverride-init on gcc and -Winitializer-overrides on clang. It's not a reason to mandate a newer compiler though. Arnd
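As a rough idea of what such a macro could look like (a sketch only, not the kernel's __diag implementation; all `DEMO_` names are invented here), `_Pragma()` lets the push/ignore/pop sequence be wrapped so that one spelling covers both compilers:

```c
/* Turn a token sequence into a _Pragma() string. */
#define DEMO_PRAGMA(x) _Pragma(#x)

#if defined(__clang__)
/* clang also defines __GNUC__, so it has to be tested first. */
#define DEMO_OVERRIDE_INIT_OFF \
	DEMO_PRAGMA(clang diagnostic push) \
	DEMO_PRAGMA(clang diagnostic ignored "-Winitializer-overrides")
#define DEMO_OVERRIDE_INIT_ON DEMO_PRAGMA(clang diagnostic pop)
#elif defined(__GNUC__)
#define DEMO_OVERRIDE_INIT_OFF \
	DEMO_PRAGMA(GCC diagnostic push) \
	DEMO_PRAGMA(GCC diagnostic ignored "-Woverride-init")
#define DEMO_OVERRIDE_INIT_ON DEMO_PRAGMA(GCC diagnostic pop)
#else
#define DEMO_OVERRIDE_INIT_OFF
#define DEMO_OVERRIDE_INIT_ON
#endif

/* The later initializer for [2] intentionally overrides the earlier one. */
DEMO_OVERRIDE_INIT_OFF
static const int demo_table[3] = {
	[0] = -1,
	[1] = -1,
	[2] = -1,
	[2] = 7,
};
DEMO_OVERRIDE_INIT_ON
```

The warning is only silenced between the two macros, so the rest of the file keeps whatever diagnostics the build enables.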
Re: [PATCH] Raise the minimum GCC version to 5.2
On Sun, May 02, 2021 at 02:08:31PM -0700, Linus Torvalds wrote: > Last year, Arnd and Kirill (maybe others were involved too) made a > list of distros and older gcc versions. But I don't think anybody > actually _maintains_ such a list. Distrowatch does. I used it for checking. But you need to check it per distro. For Debian it would be here: https://distrowatch.com/table.php?distribution=debian -- Kirill A. Shutemov
Re: [PATCH v2] powerpc/64: BE option to use ELFv2 ABI for big endian kernels
On Mon, May 03, 2021 at 09:11:16AM +0200, Michal Suchánek wrote: > On Mon, May 03, 2021 at 10:58:33AM +1000, Nicholas Piggin wrote: > > Excerpts from Michal Suchánek's message of May 3, 2021 2:57 am: > > > On Tue, Apr 28, 2020 at 09:25:17PM +1000, Nicholas Piggin wrote: > > >> Provide an option to use ELFv2 ABI for big endian builds. This works on > > >> GCC and clang (since 2014). It is less well tested and supported by the > > >> GNU toolchain, but it can give some useful advantages of the ELFv2 ABI > > >> for BE (e.g., less stack usage). Some distros even build BE ELFv2 > > >> userspace. > > > > > > Fixes BTFID failure on BE for me and the ELF ABIv2 kernel boots. > > > > What's the BTFID failure? Anything we can do to fix it on the v1 ABI or > > at least make it depend on BUILD_ELF_V2? > > Looks like symbols are prefixed with a dot in ABIv1 and BTFID tool is > not aware of that. It can be disabled on ABIv1 easily. > > Thanks > > Michal > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > index 678c13967580..e703c26e9b80 100644 > --- a/lib/Kconfig.debug > +++ b/lib/Kconfig.debug > @@ -305,6 +305,7 @@ config DEBUG_INFO_BTF > bool "Generate BTF typeinfo" > depends on !DEBUG_INFO_SPLIT && !DEBUG_INFO_REDUCED > depends on !GCC_PLUGIN_RANDSTRUCT || COMPILE_TEST > + depends on !PPC64 || BUILD_ELF_V2 > help > Generate deduplicated BTF type information from DWARF debug info. > Turning this on expects presence of pahole tool, which will convert > > > > > > > > > Tested-by: Michal Suchánek > > > > > > Also can we enable mprofile on BE now? > > > > > > I don't see anything endian-specific in the mprofile code at a glance > > > but don't have any idea how to test it. > > > > AFAIK it's just a different ABI for the _mcount call so just running > > some ftrace and ftrace with call graph should test it reasonably well. 
It does not crash and burn but there are some regressions from LE to BE on the ftrace kernel selftest: --- ftraceLE.txt2021-05-03 11:19:14.83000 +0200 +++ ftraceBE.txt2021-05-03 11:27:24.77000 +0200 @@ -7,8 +7,8 @@ [n] Change the ringbuffer size [PASS] [n] Snapshot and tracing setting [PASS] [n] trace_pipe and trace_marker[PASS] -[n] Test ftrace direct functions against tracers [UNRESOLVED] -[n] Test ftrace direct functions against kprobes [UNRESOLVED] +[n] Test ftrace direct functions against tracers [FAIL] +[n] Test ftrace direct functions against kprobes [FAIL] [n] Generic dynamic event - add/remove kprobe events [PASS] [n] Generic dynamic event - add/remove synthetic events[PASS] [n] Generic dynamic event - selective clear (compatibility)[PASS] @@ -16,10 +16,10 @@ [n] event tracing - enable/disable with event level files [PASS] [n] event tracing - restricts events based on pid notrace filtering[PASS] [n] event tracing - restricts events based on pid [PASS] -[n] event tracing - enable/disable with subsystem level files [PASS] +[n] event tracing - enable/disable with subsystem level files [FAIL] [n] event tracing - enable/disable with top level files[PASS] -[n] Test trace_printk from module [UNRESOLVED] -[n] ftrace - function graph filters with stack tracer [PASS] +[n] Test trace_printk from module [FAIL] +[n] ftrace - function graph filters with stack tracer [FAIL] [n] ftrace - function graph filters[PASS] [n] ftrace - function trace with cpumask [PASS] [n] ftrace - test for function event triggers [PASS] @@ -27,7 +27,7 @@ [n] ftrace - function pid notrace filters [PASS] [n] ftrace - function pid filters [PASS] [n] ftrace - stacktrace filter command [PASS] -[n] ftrace - function trace on module [UNRESOLVED] +[n] ftrace - function trace on module [FAIL] [n] ftrace - function profiler with function tracing [PASS] [n] ftrace - function profiling[PASS] [n] ftrace - test reading of set_ftrace_filter [PASS] @@ -44,10 +44,10 @@ [n] Kprobe event argument syntax [PASS] 
[n] Kprobe dynamic event with arguments[PASS] [n] Kprobes event arguments with types [PASS] -[n] Kprobe event user-memory access[UNSUPPORTED] +[n] Kprobe event user-memory access[FAIL] [n] Kprobe event auto/manual naming[PASS] [n] Kprobe dynamic event with function tracer [PASS] -[n] Kprobe dynamic event - probing module [UNRESOLVED] +[n] Kprobe dynamic event - probing module [FAIL] [n] Create/delete multiprobe on kprobe event [PASS] [n] Kprobe event parser error log check[PASS] [n] Kretprobe dynamic event with arguments [PASS] @@ -57,11 +57,11 @@ [n] Kprobe events - probe points [PASS] [n] Kprobe dynamic event - adding and removing [PASS] [n] Uprobe event parser error log check[PASS] -[n] test for the preemptirqsoff tracer [UNSUPPORTED] -[n] Meta-selftest: Checkbashisms [UNRESOLVED] +[n] test for the preemptirqsoff tracer [FAIL] +[n] Meta-s
Re: [PATCH] Raise the minimum GCC version to 5.2
On Mon, May 3, 2021 at 9:35 AM Alexander Dahl wrote: > > Desktops and servers are all nice, however I just want to make you > aware, there are embedded users forced to stick to older cross > toolchains for different reasons as well, e.g. in industrial > environment. :-) > > This is no show stopper for us, I just wanted to let you be aware. Can you be more specific about what scenarios you are thinking of, what the motivations are for using an old compiler with a new kernel on embedded systems, and what you think a realistic maximum time would be between compiler updates? One scenario that I've seen previously is where user space and kernel are built together as a source based distribution (OE, buildroot, openwrt, ...), and the compiler is picked to match the original sources of the user space because that is best tested, but the same compiler then gets used to build the kernel as well because that is the default in the build environment. There are two problems I see with this logic: - Running the latest kernel to avoid security problems is of course a good idea, but if one runs that with ten year old user space that is never updated, the system is likely to end up just as insecure. Not all bugs are in the kernel. - The same logic that applies to ancient user space staying with an ancient compiler (it's better tested in this combination) also applies to the kernel: running the latest kernel on an old compiler is something that few people test, and tends to run into more bugs than using the compiler that other developers used to test that kernel. Arnd
Re: [PATCH v3] powerpc/64s/radix: Enable huge vmalloc mappings
On 03/05/2021 at 11:17, Nicholas Piggin wrote: This reduces TLB misses by nearly 30x on a `git diff` workload on a 2-node POWER9 (59,800 -> 2,100) and reduces CPU cycles by 0.54%, due to vfs hashes being allocated with 2MB pages. Acked-by: Michael Ellerman Signed-off-by: Nicholas Piggin Reviewed-by: Christophe Leroy --- Since v2: - Fix ppc32 compile bug. Since v1: - Don't define MODULES_VADDR which has some other side effect (e.g., ptdump). - Fixed (hopefully) kbuild warning. - Keep __vmalloc_node_range call on 3 lines. .../admin-guide/kernel-parameters.txt | 2 ++ arch/powerpc/Kconfig | 1 + arch/powerpc/kernel/module.c | 18 +- 3 files changed, 16 insertions(+), 5 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 1c0a3cf6fcc9..1be38b25c485 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3250,6 +3250,8 @@ nohugeiomap [KNL,X86,PPC,ARM64] Disable kernel huge I/O mappings. + nohugevmalloc [PPC] Disable kernel huge vmalloc mappings. + nosmt [KNL,S390] Disable symmetric multithreading (SMT). Equivalent to smt=1. 
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 1e6230bea09d..c547a9d6a2dd 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -185,6 +185,7 @@ config PPC select GENERIC_VDSO_TIME_NS select HAVE_ARCH_AUDITSYSCALL select HAVE_ARCH_HUGE_VMAP if PPC_BOOK3S_64 && PPC_RADIX_MMU + select HAVE_ARCH_HUGE_VMALLOC if HAVE_ARCH_HUGE_VMAP select HAVE_ARCH_JUMP_LABEL select HAVE_ARCH_JUMP_LABEL_RELATIVE select HAVE_ARCH_KASAN if PPC32 && PPC_PAGE_SHIFT <= 14 diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c index fab84024650c..3f35c8d20be7 100644 --- a/arch/powerpc/kernel/module.c +++ b/arch/powerpc/kernel/module.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include #include @@ -88,17 +89,22 @@ int module_finalize(const Elf_Ehdr *hdr, return 0; } -#ifdef MODULES_VADDR static __always_inline void * __module_alloc(unsigned long size, unsigned long start, unsigned long end) { - return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL, - PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS, NUMA_NO_NODE, - __builtin_return_address(0)); + /* +* Don't do huge page allocations for modules yet until more testing +* is done. STRICT_MODULE_RWX may require extra work to support this +* too. +*/ + return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL, PAGE_KERNEL_EXEC, + VM_FLUSH_RESET_PERMS | VM_NO_HUGE_VMAP, + NUMA_NO_NODE, __builtin_return_address(0)); } void *module_alloc(unsigned long size) { +#ifdef MODULES_VADDR unsigned long limit = (unsigned long)_etext - SZ_32M; void *ptr = NULL; @@ -112,5 +118,7 @@ void *module_alloc(unsigned long size) ptr = __module_alloc(size, MODULES_VADDR, MODULES_END); return ptr; -} +#else + return __module_alloc(size, VMALLOC_START, VMALLOC_END); #endif +}
[PATCH v3] powerpc/64s/radix: Enable huge vmalloc mappings
This reduces TLB misses by nearly 30x on a `git diff` workload on a 2-node POWER9 (59,800 -> 2,100) and reduces CPU cycles by 0.54%, due to vfs hashes being allocated with 2MB pages. Acked-by: Michael Ellerman Signed-off-by: Nicholas Piggin --- Since v2: - Fix ppc32 compile bug. Since v1: - Don't define MODULES_VADDR which has some other side effect (e.g., ptdump). - Fixed (hopefully) kbuild warning. - Keep __vmalloc_node_range call on 3 lines. .../admin-guide/kernel-parameters.txt | 2 ++ arch/powerpc/Kconfig | 1 + arch/powerpc/kernel/module.c | 18 +- 3 files changed, 16 insertions(+), 5 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 1c0a3cf6fcc9..1be38b25c485 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3250,6 +3250,8 @@ nohugeiomap [KNL,X86,PPC,ARM64] Disable kernel huge I/O mappings. + nohugevmalloc [PPC] Disable kernel huge vmalloc mappings. + nosmt [KNL,S390] Disable symmetric multithreading (SMT). Equivalent to smt=1. 
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 1e6230bea09d..c547a9d6a2dd 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -185,6 +185,7 @@ config PPC select GENERIC_VDSO_TIME_NS select HAVE_ARCH_AUDITSYSCALL select HAVE_ARCH_HUGE_VMAP if PPC_BOOK3S_64 && PPC_RADIX_MMU + select HAVE_ARCH_HUGE_VMALLOC if HAVE_ARCH_HUGE_VMAP select HAVE_ARCH_JUMP_LABEL select HAVE_ARCH_JUMP_LABEL_RELATIVE select HAVE_ARCH_KASAN if PPC32 && PPC_PAGE_SHIFT <= 14 diff --git a/arch/powerpc/kernel/module.c b/arch/powerpc/kernel/module.c index fab84024650c..3f35c8d20be7 100644 --- a/arch/powerpc/kernel/module.c +++ b/arch/powerpc/kernel/module.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include #include @@ -88,17 +89,22 @@ int module_finalize(const Elf_Ehdr *hdr, return 0; } -#ifdef MODULES_VADDR static __always_inline void * __module_alloc(unsigned long size, unsigned long start, unsigned long end) { - return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL, - PAGE_KERNEL_EXEC, VM_FLUSH_RESET_PERMS, NUMA_NO_NODE, - __builtin_return_address(0)); + /* +* Don't do huge page allocations for modules yet until more testing +* is done. STRICT_MODULE_RWX may require extra work to support this +* too. +*/ + return __vmalloc_node_range(size, 1, start, end, GFP_KERNEL, PAGE_KERNEL_EXEC, + VM_FLUSH_RESET_PERMS | VM_NO_HUGE_VMAP, + NUMA_NO_NODE, __builtin_return_address(0)); } void *module_alloc(unsigned long size) { +#ifdef MODULES_VADDR unsigned long limit = (unsigned long)_etext - SZ_32M; void *ptr = NULL; @@ -112,5 +118,7 @@ void *module_alloc(unsigned long size) ptr = __module_alloc(size, MODULES_VADDR, MODULES_END); return ptr; -} +#else + return __module_alloc(size, VMALLOC_START, VMALLOC_END); #endif +} -- 2.23.0
Re: [PATCH] Raise the minimum GCC version to 5.2
On Mon, 2021-05-03 at 09:34 +0200, Alexander Dahl wrote:
> Desktops and servers are all nice, however I just want to make you
> aware, there are embedded users forced to stick to older cross
> toolchains for different reasons as well, e.g. in industrial
> environment. :-)

In your embedded case, what kernel version do you use?

For older toolchains, unless it's kernel version 5.13+, it wouldn't
matter.

And all the supported architectures have gcc 10.3 available at
http://cdn.kernel.org/pub/tools/crosstool/
Re: [PATCH] Raise the minimum GCC version to 5.2
Hello,

On Sun, May 02, 2021 at 11:30:07PM +0100, Matthew Wilcox wrote:
> On Sun, May 02, 2021 at 02:08:31PM -0700, Linus Torvalds wrote:
> > What is relevant is what version of gcc various distributions actually
> > have reasonably easily available, and how old and relevant the
> > distributions are. We did decide that (just as an example) RHEL 7 was
> > too old to worry about when we updated the gcc version requirement
> > last time.
> >
> > Last year, Arnd and Kirill (maybe others were involved too) made a
> > list of distros and older gcc versions. But I don't think anybody
> > actually _maintains_ such a list. It would be perhaps interesting to
> > have some way to check what compiler versions are being offered by
> > different distros.
>
> fwiw, Debian 9 aka Stretch released June 2017 had gcc 6.3
> Debian 10 aka Buster released June 2019 had gcc 7.4 *and* 8.3.
> Debian 8 aka Jessie had gcc-4.8.4 and gcc-4.9.2.
>
> So do we care about people who haven't bothered to upgrade userspace
> since 2017? If so, we can't go past 4.9.

Desktops and servers are all nice, however I just want to make you
aware, there are embedded users forced to stick to older cross
toolchains for different reasons as well, e.g. in industrial
environment. :-)

This is no show stopper for us, I just wanted to let you be aware.

Greets
Alex
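The version cut-offs discussed in this thread can be compared mechanically. A minimal sketch, assuming the usual major * 10000 + minor * 100 + patchlevel encoding (the same scheme as the kernel's GCC_VERSION macro); the SKETCH_* names and the helper are hypothetical, and 5.2.0 is simply the minimum proposed in the subject line:

```c
/* Encode a compiler version as a single comparable integer. */
#define SKETCH_VERSION(major, minor, patch) \
	((major) * 10000 + (minor) * 100 + (patch))

/* Minimum proposed in the subject line of this thread (assumption: patchlevel 0). */
#define SKETCH_MIN_GCC	SKETCH_VERSION(5, 2, 0)

/* Returns non-zero when the given GCC version meets the proposed minimum. */
static int sketch_gcc_new_enough(int major, int minor, int patch)
{
	return SKETCH_VERSION(major, minor, patch) >= SKETCH_MIN_GCC;
}
```

Fed the Debian versions quoted above, gcc 6.3 (Stretch) passes the check while gcc 4.9.2 (Jessie) does not. Inside a real build the inputs would come from the compiler's predefined __GNUC__, __GNUC_MINOR__ and __GNUC_PATCHLEVEL__ macros rather than from function arguments.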
Re: [PATCH v2] powerpc/64: BE option to use ELFv2 ABI for big endian kernels
On Mon, May 03, 2021 at 10:58:33AM +1000, Nicholas Piggin wrote:
> Excerpts from Michal Suchánek's message of May 3, 2021 2:57 am:
> > On Tue, Apr 28, 2020 at 09:25:17PM +1000, Nicholas Piggin wrote:
> >> Provide an option to use ELFv2 ABI for big endian builds. This works on
> >> GCC and clang (since 2014). It is less well tested and supported by the
> >> GNU toolchain, but it can give some useful advantages of the ELFv2 ABI
> >> for BE (e.g., less stack usage). Some distros even build BE ELFv2
> >> userspace.
> >
> > Fixes BTFID failure on BE for me and the ELF ABIv2 kernel boots.
>
> What's the BTFID failure? Anything we can do to fix it on the v1 ABI or
> at least make it depend on BUILD_ELF_V2?

Looks like symbols are prefixed with a dot in ABIv1 and BTFID tool is
not aware of that.

It can be disabled on ABIv1 easily.

Thanks

Michal

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 678c13967580..e703c26e9b80 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -305,6 +305,7 @@ config DEBUG_INFO_BTF
 	bool "Generate BTF typeinfo"
 	depends on !DEBUG_INFO_SPLIT && !DEBUG_INFO_REDUCED
 	depends on !GCC_PLUGIN_RANDSTRUCT || COMPILE_TEST
+	depends on !PPC64 || BUILD_ELF_V2
 	help
 	  Generate deduplicated BTF type information from DWARF debug info.
 	  Turning this on expects presence of pahole tool, which will convert

>
> >
> > Tested-by: Michal Suchánek
> >
> > Also can we enable mprofile on BE now?
> >
> > I don't see anything endian-specific in the mprofile code at a glance
> > but don't have any idea how to test it.
>
> AFAIK it's just a different ABI for the _mcount call so just running
> some ftrace and ftrace with call graph should test it reasonably well.
>
> >
> > Thanks
> >
> > Michal
> >
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 6a4ad11f6349..75b3afbfc378 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -495,7 +495,7 @@ config LD_HEAD_STUB_CATCH
> >  	  If unsure, say "N".
> >
> >  config MPROFILE_KERNEL
> > -	depends on PPC64 && CPU_LITTLE_ENDIAN && FUNCTION_TRACER
> > +	depends on PPC64 && BUILD_ELF_V2 && FUNCTION_TRACER
> >  	def_bool
> > $(success,$(srctree)/arch/powerpc/tools/gcc-check-mprofile-kernel.sh $(CC)
> > -I$(srctree)/include -D__KERNEL__)
>
> Good idea. I can't remember if I did a grep for LITTLE_ENDIAN to check
> for other such opportunities.
>
> Thanks,
> Nick
>
> >
> >  config HOTPLUG_CPU
> >>
> >> Reviewed-by: Segher Boessenkool
> >> Signed-off-by: Nicholas Piggin
> >> ---
> >> Since v1:
> >> - Improved the override flavour name suggested by Segher.
> >> - Improved changelog wording.
> >>
> >>
> >>  arch/powerpc/Kconfig            | 19 +++
> >>  arch/powerpc/Makefile           | 15 ++-
> >>  arch/powerpc/boot/Makefile      |  4
> >>  drivers/crypto/vmx/Makefile     |  8 ++--
> >>  drivers/crypto/vmx/ppc-xlate.pl | 10 ++
> >>  5 files changed, 45 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> >> index 924c541a9260..d9d2abc06c2c 100644
> >> --- a/arch/powerpc/Kconfig
> >> +++ b/arch/powerpc/Kconfig
> >> @@ -147,6 +147,7 @@ config PPC
> >>  	select ARCH_WEAK_RELEASE_ACQUIRE
> >>  	select BINFMT_ELF
> >>  	select BUILDTIME_TABLE_SORT
> >> +	select BUILD_ELF_V2			if PPC64 && CPU_LITTLE_ENDIAN
> >>  	select CLONE_BACKWARDS
> >>  	select DCACHE_WORD_ACCESS		if PPC64 && CPU_LITTLE_ENDIAN
> >>  	select DYNAMIC_FTRACE			if FUNCTION_TRACER
> >> @@ -541,6 +542,24 @@ config KEXEC_FILE
> >>  config ARCH_HAS_KEXEC_PURGATORY
> >>  	def_bool KEXEC_FILE
> >>
> >> +config BUILD_ELF_V2
> >> +	bool
> >> +
> >> +config BUILD_BIG_ENDIAN_ELF_V2
> >> +	bool "Build big-endian kernel using ELFv2 ABI (EXPERIMENTAL)"
> >> +	depends on PPC64 && CPU_BIG_ENDIAN && EXPERT
> >> +	default n
> >> +	select BUILD_ELF_V2
> >> +	help
> >> +	  This builds the kernel image using the ELFv2 ABI, which has a
> >> +	  reduced stack overhead and faster function calls. This does not
> >> +	  affect the userspace ABIs.
> >> +
> >> +	  ELFv2 is the standard ABI for little-endian, but for big-endian
> >> +	  this is an experimental option that is less tested (kernel and
> >> +	  toolchain). This requires gcc 4.9 or newer and binutils 2.24 or
> >> +	  newer.
> >> +
> >>  config RELOCATABLE
> >>  	bool "Build a relocatable kernel"
> >>  	depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
> >> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
> >> index f310c32e88a4..e306b39d847e 100644
> >> --- a/arch/powerpc/Makefile
> >> +++ b/arch/powerpc/Makefile
> >> @@ -92,10 +92,14 @@ endif
> >>
> >>  ifdef CONFIG_PPC64
> >>  ifndef CONFIG_CC_IS_CLANG
> >> -cflags-$(CONFIG_CPU_BIG_ENDIAN)	+= $(call cc-option,-mabi=elfv1)
> >> -cflags-$(CONFIG_CPU_BIG_ENDIAN)	+= $(call
> >> cc-option,-mcall-a