Re: [PATCH v2 -next] powerpc: kernel/time.c - cleanup warnings
On 24/03/2021 at 07:14, Christophe Leroy wrote:
> Le 24/03/2021 à 00:05, Alexandre Belloni a écrit :
>> On 23/03/2021 23:18:17+0100, Alexandre Belloni wrote:
>>> Hello,
>>>
>>> On 23/03/2021 05:12:57-0400, He Ying wrote:
>>>> We found these warnings in arch/powerpc/kernel/time.c as follows:
>>>> warning: symbol 'decrementer_max' was not declared. Should it be static?
>>>> warning: symbol 'rtc_lock' was not declared. Should it be static?
>>>> warning: symbol 'dtl_consumer' was not declared. Should it be static?
>>>>
>>>> Declare 'decrementer_max' and 'rtc_lock' in powerpc asm/time.h.
>>>> Rename 'rtc_lock' in drivers/rtc/rtc-vr41xx.c to 'vr41xx_rtc_lock' to
>>>> avoid the conflict with the variable in powerpc asm/time.h.
>>>> Move the 'dtl_consumer' definition behind "include " because it is
>>>> declared there.
>>>>
>>>> Reported-by: Hulk Robot
>>>> Signed-off-by: He Ying
>>>> ---
>>>> v2:
>>>> - Instead of including linux/mc146818rtc.h in powerpc kernel/time.c,
>>>>   declare rtc_lock in powerpc asm/time.h.
>>>
>>> V1 was actually the correct thing to do. rtc_lock is there exactly
>>> because chrp and maple are using mc146818-compatible RTCs. This is then
>>> useful because then drivers/char/nvram.c is enabled. The proper fix
>>> would be to scrap all of that and use rtc-cmos for those platforms, as
>>> this drives the RTC properly and exposes the NVRAM for the mc146818.
>>>
>>> Or at least, if there are no users for the char/nvram driver on those
>>> two platforms, remove the spinlock and stop enabling CONFIG_NVRAM, or
>>> more likely rename the symbol, as it seems to be abused by both chrp
>>> and powermac.
>>
>> Ok, so rtc_lock is not even used by the char/nvram.c driver as it is
>> completely compiled out. I guess it is fine having it move to the
>> individual platforms as, looking very quickly at the Kconfig, it is
>> not possible to select both simultaneously.
>>
>> Tentative patch:
>
> Looking at it once more, it looks like including linux/mc146818rtc.h is
> the thing to do, at least for now.
>
> Several platforms define rtc_lock exactly the same way as powerpc does,
> and include mc146818rtc.h.
>
> I think that to get it clean, this change should go in a dedicated
> patch that does a bit more and explains exactly what is being done and
> why. I'll try to draft something for it.
>
> He Y., can you make a version v3 of your patch excluding the rtc_lock
> change?

Finally, I think there are not enough changes to justify a separate
patch. So you can send a V3 based on your V1.

In addition to the changes you had in V1, please remove the declaration
of rtc_lock in arch/powerpc/platforms/chrp/chrp.h.

Christophe
Re: [PATCH v2 -next] powerpc: kernel/time.c - cleanup warnings
Le 24/03/2021 à 00:05, Alexandre Belloni a écrit :
> On 23/03/2021 23:18:17+0100, Alexandre Belloni wrote:
>> Hello,
>>
>> On 23/03/2021 05:12:57-0400, He Ying wrote:
>>> We found these warnings in arch/powerpc/kernel/time.c as follows:
>>> warning: symbol 'decrementer_max' was not declared. Should it be static?
>>> warning: symbol 'rtc_lock' was not declared. Should it be static?
>>> warning: symbol 'dtl_consumer' was not declared. Should it be static?
>>>
>>> Declare 'decrementer_max' and 'rtc_lock' in powerpc asm/time.h.
>>> Rename 'rtc_lock' in drivers/rtc/rtc-vr41xx.c to 'vr41xx_rtc_lock' to
>>> avoid the conflict with the variable in powerpc asm/time.h.
>>> Move the 'dtl_consumer' definition behind "include " because it is
>>> declared there.
>>>
>>> Reported-by: Hulk Robot
>>> Signed-off-by: He Ying
>>> ---
>>> v2:
>>> - Instead of including linux/mc146818rtc.h in powerpc kernel/time.c,
>>>   declare rtc_lock in powerpc asm/time.h.
>>
>> V1 was actually the correct thing to do. rtc_lock is there exactly
>> because chrp and maple are using mc146818-compatible RTCs. This is then
>> useful because then drivers/char/nvram.c is enabled. The proper fix
>> would be to scrap all of that and use rtc-cmos for those platforms, as
>> this drives the RTC properly and exposes the NVRAM for the mc146818.
>>
>> Or at least, if there are no users for the char/nvram driver on those
>> two platforms, remove the spinlock and stop enabling CONFIG_NVRAM, or
>> more likely rename the symbol, as it seems to be abused by both chrp
>> and powermac.
>
> Ok, so rtc_lock is not even used by the char/nvram.c driver as it is
> completely compiled out. I guess it is fine having it move to the
> individual platforms as, looking very quickly at the Kconfig, it is
> not possible to select both simultaneously.
>
> Tentative patch:

Looking at it once more, it looks like including linux/mc146818rtc.h is
the thing to do, at least for now.

Several platforms define rtc_lock exactly the same way as powerpc does,
and include mc146818rtc.h.

I think that to get it clean, this change should go in a dedicated
patch that does a bit more and explains exactly what is being done and
why. I'll try to draft something for it.

He Y., can you make a version v3 of your patch excluding the rtc_lock
change?

Christophe
Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig
Sure, here it is:

snow / # lspci -vxxx -s 7.0
00:07.0 ISA bridge: Contaq Microsystems 82c693
	Flags: bus master, medium devsel, latency 0
	Kernel modules: pata_cypress
00: 80 10 93 c6 47 00 80 02 00 00 01 06 00 00 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
40: 03 02 00 00 26 60 00 01 f0 60 00 80 80 71 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Best regards,
Barnabas

ps.: let me know if there is anything else I can do.
On Tue, Mar 23, 2021 at 7:43 PM Russell King - ARM Linux admin wrote: > > On Mon, Mar 22, 2021 at 06:10:01PM +0100, Cye Borg wrote: > > PWS 500au: > > > > snow / # lspci -vvx -s 7.1 > > 00:07.1 IDE interface: Contaq Microsystems 82c693 (prog-if 80 [ISA > > Compatibility mode-only controller, supports bus mastering]) > > Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- > > ParErr+ Stepping- SERR- FastB2B- DisINTx- > > Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium > > >TAbort- SERR- > Latency: 0 > > Interrupt: pin A routed to IRQ 0 > > Region 0: I/O ports at 01f0 [size=8] > > Region 1: I/O ports at 03f4 > > Region 4: I/O ports at 9080 [size=16] > > Kernel driver in use: pata_cypress > > Kernel modules: pata_cypress > > 00: 80 10 93 c6 45 00 80 02 00 80 01 01 00 00 80 00 > > 10: f1 01 00 00 f5 03 00 00 00 00 00 00 00 00 00 00 > > 20: 81 90 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 > > > > snow / # lspci -vvx -s 7.2 > > 00:07.2 IDE interface: Contaq Microsystems 82c693 (prog-if 00 [ISA > > Compatibility mode-only controller]) > > Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- > > ParErr+ Stepping- SERR- FastB2B- DisINTx- > > Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium > > >TAbort- SERR- > Latency: 0 > > Interrupt: pin B routed to IRQ 0 > > Region 0: I/O ports at 0170 [size=8] > > Region 1: I/O ports at 0374 > > Region 4: Memory at 0c24 (32-bit, non-prefetchable) > > [disabled] [size=64K] > > Kernel modules: pata_cypress > > 00: 80 10 93 c6 45 00 80 02 00 00 01 01 00 00 80 00 > > 10: 71 01 00 00 75 03 00 00 00 00 00 00 00 00 00 00 > > 20: 00 00 24 0c 00 00 00 00 00 00 00 00 00 00 00 00 > > 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 > > Thanks very much. > > Could I also ask for the output of: > > # lspci -vxxx -s 7.0 > > as well please - this will dump all 256 bytes for the ISA bridge, which > contains a bunch of configuration registers. Thanks. 
> > -- > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH V2 1/5] powerpc/perf: Expose processor pipeline stage cycles using PERF_SAMPLE_WEIGHT_STRUCT
On 3/22/21 8:27 PM, Athira Rajeev wrote:
> Performance Monitoring Unit (PMU) registers in powerpc provide
> information on cycles elapsed between different stages in the
> pipeline. This can be used for application tuning. On ISA v3.1
> platforms, this information is exposed by sampling registers.
> This patch adds kernel support to capture two of the cycle counters
> as part of the perf sample, using the sample type
> PERF_SAMPLE_WEIGHT_STRUCT.
>
> The power PMU function 'get_mem_weight' currently uses the 64-bit
> weight field of perf_sample_data to capture memory latency. But
> following the introduction of PERF_SAMPLE_WEIGHT_TYPE, the weight
> field could contain a 64-bit or 32-bit value depending on the
> architecture's support for PERF_SAMPLE_WEIGHT_STRUCT. This patch uses
> WEIGHT_STRUCT to expose the pipeline stage cycles info, hence the
> ppmu functions are updated to work with both 64-bit and 32-bit weight
> values. If the sample type is PERF_SAMPLE_WEIGHT, the 64-bit weight
> field is used. If the sample type is PERF_SAMPLE_WEIGHT_STRUCT,
> memory subsystem latency is stored in the low 32 bits of the
> perf_sample_weight structure. Also, for CPU_FTR_ARCH_31, the two
> cycle counter values are captured in the two 16-bit fields of the
> perf_sample_weight structure.

Changes look fine to me.

Reviewed-by: Madhavan Srinivasan

> Signed-off-by: Athira Rajeev
> ---
>  arch/powerpc/include/asm/perf_event_server.h |  2 +-
>  arch/powerpc/perf/core-book3s.c              |  4 ++--
>  arch/powerpc/perf/isa207-common.c            | 29 +---
>  arch/powerpc/perf/isa207-common.h            |  6 +-
>  4 files changed, 34 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
> index 00e7e671bb4b..112cf092d7b3 100644
> --- a/arch/powerpc/include/asm/perf_event_server.h
> +++ b/arch/powerpc/include/asm/perf_event_server.h
> @@ -43,7 +43,7 @@ struct power_pmu {
>  			u64 alt[]);
>  	void		(*get_mem_data_src)(union perf_mem_data_src *dsrc,
>  			u32 flags, struct pt_regs *regs);
> -	void		(*get_mem_weight)(u64 *weight);
> +	void		(*get_mem_weight)(u64 *weight, u64 type);
>  	unsigned long	group_constraint_mask;
>  	unsigned long	group_constraint_val;
>  	u64		(*bhrb_filter_map)(u64 branch_sample_type);
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 766f064f00fb..6936763246bd 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -2206,9 +2206,9 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
>  	    ppmu->get_mem_data_src)
>  		ppmu->get_mem_data_src(&data.data_src, ppmu->flags, regs);
>
> -	if (event->attr.sample_type & PERF_SAMPLE_WEIGHT &&
> +	if (event->attr.sample_type & PERF_SAMPLE_WEIGHT_TYPE &&
>  	    ppmu->get_mem_weight)
> -		ppmu->get_mem_weight(&data.weight.full);
> +		ppmu->get_mem_weight(&data.weight.full, event->attr.sample_type);
>
>  	if (perf_event_overflow(event, &data, regs))
>  		power_pmu_stop(event, 0);
> diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
> index e4f577da33d8..5dcbdbd54598 100644
> --- a/arch/powerpc/perf/isa207-common.c
> +++ b/arch/powerpc/perf/isa207-common.c
> @@ -284,8 +284,10 @@ void isa207_get_mem_data_src(union perf_mem_data_src *dsrc, u32 flags,
>  	}
>  }
>
> -void isa207_get_mem_weight(u64 *weight)
> +void isa207_get_mem_weight(u64 *weight, u64 type)
>  {
> +	union perf_sample_weight *weight_fields;
> +	u64 weight_lat;
>  	u64 mmcra = mfspr(SPRN_MMCRA);
>  	u64 exp = MMCRA_THR_CTR_EXP(mmcra);
>  	u64 mantissa = MMCRA_THR_CTR_MANT(mmcra);
> @@ -296,9 +298,30 @@ void isa207_get_mem_weight(u64 *weight)
>  		mantissa = P10_MMCRA_THR_CTR_MANT(mmcra);
>
>  	if (val == 0 || val == 7)
> -		*weight = 0;
> +		weight_lat = 0;
>  	else
> -		*weight = mantissa << (2 * exp);
> +		weight_lat = mantissa << (2 * exp);
> +
> +	/*
> +	 * Use the 64-bit weight field (full) if sample type is WEIGHT.
> +	 *
> +	 * If sample type is WEIGHT_STRUCT:
> +	 * - store memory latency in the lower 32 bits.
> +	 * - For ISA v3.1, use the remaining two 16-bit fields of
> +	 *   perf_sample_weight to store cycle counter values from sier2.
> +	 */
> +	weight_fields = (union perf_sample_weight *)weight;
> +	if (type & PERF_SAMPLE_WEIGHT)
> +		weight_fields->full = weight_lat;
> +	else {
> +		weight_fields->var1_dw = (u32)weight_lat;
> +		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +			weight_fields->var2_w = P10_SIER2_FINISH_CYC(mfspr(SPRN_SIER2));
> +			weight_fields->var3_w =
Re: [PATCH v2 -next] powerpc: kernel/time.c - cleanup warnings
Dear,

On 2021/3/24 6:18, Alexandre Belloni wrote:
> Hello,
>
> On 23/03/2021 05:12:57-0400, He Ying wrote:
>> We found these warnings in arch/powerpc/kernel/time.c as follows:
>> warning: symbol 'decrementer_max' was not declared. Should it be static?
>> warning: symbol 'rtc_lock' was not declared. Should it be static?
>> warning: symbol 'dtl_consumer' was not declared. Should it be static?
>>
>> Declare 'decrementer_max' and 'rtc_lock' in powerpc asm/time.h.
>> Rename 'rtc_lock' in drivers/rtc/rtc-vr41xx.c to 'vr41xx_rtc_lock' to
>> avoid the conflict with the variable in powerpc asm/time.h.
>> Move the 'dtl_consumer' definition behind "include " because it is
>> declared there.
>>
>> Reported-by: Hulk Robot
>> Signed-off-by: He Ying
>> ---
>> v2:
>> - Instead of including linux/mc146818rtc.h in powerpc kernel/time.c,
>>   declare rtc_lock in powerpc asm/time.h.
>
> V1 was actually the correct thing to do. rtc_lock is there exactly
> because chrp and maple are using mc146818-compatible RTCs. This is then
> useful because then drivers/char/nvram.c is enabled. The proper fix
> would be to scrap all of that and use rtc-cmos for those platforms, as
> this drives the RTC properly and exposes the NVRAM for the mc146818.

Do you mean that the 'rtc_lock' declared in linux/mc146818rtc.h points
to the same thing as the one defined in powerpc kernel/time.c? And you
think V1 was correct? Oh, I should have added you to the recipients of
my patch V1 :)

> Or at least, if there are no users for the char/nvram driver on those
> two platforms, remove the spinlock and stop enabling CONFIG_NVRAM, or
> more likely rename the symbol, as it seems to be abused by both chrp
> and powermac.
>
> I'm not completely against the rename in vr41xx but the fix for the
> warnings can and should be contained in arch/powerpc.

Yes, I agree with you. But I had no choice because there is a compile
error. Maybe there's a better way.

So, what about my patch V1? Should I resend it and add you to the
recipients?

Thanks.
Re: [PATCH v4 44/46] KVM: PPC: Book3S HV P9: implement hash guest support
Excerpts from Fabiano Rosas's message of March 24, 2021 1:53 am: > Nicholas Piggin writes: > >> Guest entry/exit has to restore and save/clear the SLB, plus several >> other bits to accommodate hash guests in the P9 path. >> >> Radix host, hash guest support is removed from the P7/8 path. >> >> Signed-off-by: Nicholas Piggin >> --- > > > >> diff --git a/arch/powerpc/kvm/book3s_hv_interrupt.c >> b/arch/powerpc/kvm/book3s_hv_interrupt.c >> index cd84d2c37632..03fbfef708a8 100644 >> --- a/arch/powerpc/kvm/book3s_hv_interrupt.c >> +++ b/arch/powerpc/kvm/book3s_hv_interrupt.c >> @@ -55,6 +55,50 @@ static void __accumulate_time(struct kvm_vcpu *vcpu, >> struct kvmhv_tb_accumulator >> #define accumulate_time(vcpu, next) do {} while (0) >> #endif >> >> +static inline void mfslb(unsigned int idx, u64 *slbee, u64 *slbev) >> +{ >> +asm volatile("slbmfev %0,%1" : "=r" (*slbev) : "r" (idx)); >> +asm volatile("slbmfee %0,%1" : "=r" (*slbee) : "r" (idx)); >> +} >> + >> +static inline void __mtslb(u64 slbee, u64 slbev) >> +{ >> +asm volatile("slbmte %0,%1" :: "r" (slbev), "r" (slbee)); >> +} >> + >> +static inline void mtslb(unsigned int idx, u64 slbee, u64 slbev) >> +{ >> +BUG_ON((slbee & 0xfff) != idx); >> + >> +__mtslb(slbee, slbev); >> +} >> + >> +static inline void slb_invalidate(unsigned int ih) >> +{ >> +asm volatile("slbia %0" :: "i"(ih)); >> +} > > Fyi, in my environment the assembler complains: > > {standard input}: Assembler messages: > {standard input}:1293: Error: junk at end of line: `6' > > {standard input}:2138: Error: junk at end of line: `6' > make[3]: *** [../scripts/Makefile.build:271: > arch/powerpc/kvm/book3s_hv_interrupt.o] Error 1 > > This works: > > - asm volatile("slbia %0" :: "i"(ih)); > + asm volatile(PPC_SLBIA(%0) :: "i"(ih)); > > But I don't know what is going on. Ah yes, we still need to use PPC_SLBIA. IH parameter to slbia was only added in binutils 2.27 and we support down to 2.23. Thanks for the fix I'll add it. Thanks, Nick
Re: [PATCH v4 22/46] KVM: PPC: Book3S HV P9: Stop handling hcalls in real-mode in the P9 path
Excerpts from Fabiano Rosas's message of March 24, 2021 8:57 am: > Nicholas Piggin writes: > >> In the interest of minimising the amount of code that is run in >> "real-mode", don't handle hcalls in real mode in the P9 path. >> >> POWER8 and earlier are much more expensive to exit from HV real mode >> and switch to host mode, because on those processors HV interrupts get >> to the hypervisor with the MMU off, and the other threads in the core >> need to be pulled out of the guest, and SLBs all need to be saved, >> ERATs invalidated, and host SLB reloaded before the MMU is re-enabled >> in host mode. Hash guests also require a lot of hcalls to run. The >> XICS interrupt controller requires hcalls to run. >> >> By contrast, POWER9 has independent thread switching, and in radix mode >> the hypervisor is already in a host virtual memory mode when the HV >> interrupt is taken. Radix + xive guests don't need hcalls to handle >> interrupts or manage translations. >> >> So it's much less important to handle hcalls in real mode in P9. >> >> Signed-off-by: Nicholas Piggin >> --- > > > >> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c >> index fa7614c37e08..17739aaee3d8 100644 >> --- a/arch/powerpc/kvm/book3s_hv.c >> +++ b/arch/powerpc/kvm/book3s_hv.c >> @@ -1142,12 +1142,13 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu) >> } >> >> /* >> - * Handle H_CEDE in the nested virtualization case where we haven't >> - * called the real-mode hcall handlers in book3s_hv_rmhandlers.S. >> + * Handle H_CEDE in the P9 path where we don't call the real-mode hcall >> + * handlers in book3s_hv_rmhandlers.S. >> + * >> * This has to be done early, not in kvmppc_pseries_do_hcall(), so >> * that the cede logic in kvmppc_run_single_vcpu() works properly. 
>> */ >> -static void kvmppc_nested_cede(struct kvm_vcpu *vcpu) >> +static void kvmppc_cede(struct kvm_vcpu *vcpu) >> { >> vcpu->arch.shregs.msr |= MSR_EE; >> vcpu->arch.ceded = 1; >> @@ -1403,9 +1404,15 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu >> *vcpu, >> /* hcall - punt to userspace */ >> int i; >> >> -/* hypercall with MSR_PR has already been handled in rmode, >> - * and never reaches here. >> - */ >> +if (unlikely(vcpu->arch.shregs.msr & MSR_PR)) { >> +/* >> + * Guest userspace executed sc 1, reflect it back as a >> + * privileged program check interrupt. >> + */ >> +kvmppc_core_queue_program(vcpu, SRR1_PROGPRIV); >> +r = RESUME_GUEST; >> +break; >> +} > > This patch bypasses sc_1_fast_return so it breaks KVM-PR. L1 loops with > the following output: > > [9.503929][ T3443] Couldn't emulate instruction 0x4e800020 (op 19 xop 16) > [9.503990][ T3443] kvmppc_exit_pr_progint: emulation at 48f4 failed > (4e800020) > [9.504080][ T3443] Couldn't emulate instruction 0x4e800020 (op 19 xop 16) > [9.504170][ T3443] kvmppc_exit_pr_progint: emulation at 48f4 failed > (4e800020) > > 0x4e800020 is a blr after a sc 1 in SLOF. > > For KVM-PR we need to inject a 0xc00 at some point, either here or > before branching to no_try_real in book3s_hv_rmhandlers.S. Ah, I didn't know about that PR KVM (I suppose I should test it but I haven't been able to get it running in the past). Should be able to deal with that. This patch probably shouldn't change the syscall behaviour like this anyway. Thanks, Nick
Re: [PATCH v4 22/46] KVM: PPC: Book3S HV P9: Stop handling hcalls in real-mode in the P9 path
Excerpts from Fabiano Rosas's message of March 24, 2021 4:03 am: > Nicholas Piggin writes: > >> In the interest of minimising the amount of code that is run in >> "real-mode", don't handle hcalls in real mode in the P9 path. >> >> POWER8 and earlier are much more expensive to exit from HV real mode >> and switch to host mode, because on those processors HV interrupts get >> to the hypervisor with the MMU off, and the other threads in the core >> need to be pulled out of the guest, and SLBs all need to be saved, >> ERATs invalidated, and host SLB reloaded before the MMU is re-enabled >> in host mode. Hash guests also require a lot of hcalls to run. The >> XICS interrupt controller requires hcalls to run. >> >> By contrast, POWER9 has independent thread switching, and in radix mode >> the hypervisor is already in a host virtual memory mode when the HV >> interrupt is taken. Radix + xive guests don't need hcalls to handle >> interrupts or manage translations. >> >> So it's much less important to handle hcalls in real mode in P9. >> >> Signed-off-by: Nicholas Piggin > > I tried this again in the L2 with xive=off and it works as expected now. > > Tested-by: Fabiano Rosas Oh good, thanks for spotting the problem and re testing. Thanks, Nick
Re: [PATCH v4 22/46] KVM: PPC: Book3S HV P9: Stop handling hcalls in real-mode in the P9 path
Excerpts from Cédric Le Goater's message of March 23, 2021 11:23 pm: > On 3/23/21 2:02 AM, Nicholas Piggin wrote: >> In the interest of minimising the amount of code that is run in >> "real-mode", don't handle hcalls in real mode in the P9 path. >> >> POWER8 and earlier are much more expensive to exit from HV real mode >> and switch to host mode, because on those processors HV interrupts get >> to the hypervisor with the MMU off, and the other threads in the core >> need to be pulled out of the guest, and SLBs all need to be saved, >> ERATs invalidated, and host SLB reloaded before the MMU is re-enabled >> in host mode. Hash guests also require a lot of hcalls to run. The >> XICS interrupt controller requires hcalls to run. >> >> By contrast, POWER9 has independent thread switching, and in radix mode >> the hypervisor is already in a host virtual memory mode when the HV >> interrupt is taken. Radix + xive guests don't need hcalls to handle >> interrupts or manage translations. >> >> So it's much less important to handle hcalls in real mode in P9. 
>> >> Signed-off-by: Nicholas Piggin >> --- >> arch/powerpc/include/asm/kvm_ppc.h | 5 ++ >> arch/powerpc/kvm/book3s_hv.c| 57 >> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 5 ++ >> arch/powerpc/kvm/book3s_xive.c | 70 + >> 4 files changed, 127 insertions(+), 10 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/kvm_ppc.h >> b/arch/powerpc/include/asm/kvm_ppc.h >> index 73b1ca5a6471..db6646c2ade2 100644 >> --- a/arch/powerpc/include/asm/kvm_ppc.h >> +++ b/arch/powerpc/include/asm/kvm_ppc.h >> @@ -607,6 +607,7 @@ extern void kvmppc_free_pimap(struct kvm *kvm); >> extern int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall); >> extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu); >> extern int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd); >> +extern int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req); >> extern u64 kvmppc_xics_get_icp(struct kvm_vcpu *vcpu); >> extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval); >> extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev, >> @@ -639,6 +640,8 @@ static inline int kvmppc_xics_enabled(struct kvm_vcpu >> *vcpu) >> static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { } >> static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd) >> { return 0; } >> +static inline int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req) >> +{ return 0; } >> #endif >> >> #ifdef CONFIG_KVM_XIVE >> @@ -673,6 +676,7 @@ extern int kvmppc_xive_set_irq(struct kvm *kvm, int >> irq_source_id, u32 irq, >> int level, bool line_status); >> extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu); >> extern void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu); >> +extern void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu); >> >> static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu) >> { >> @@ -714,6 +718,7 @@ static inline int kvmppc_xive_set_irq(struct kvm *kvm, >> int irq_source_id, u32 ir >>int level, bool line_status) { return >> -ENODEV; } >> static inline void 
kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { } >> static inline void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu) { } >> +static inline void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu) { } >> >> static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu) >> { return 0; } >> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c >> index fa7614c37e08..17739aaee3d8 100644 >> --- a/arch/powerpc/kvm/book3s_hv.c >> +++ b/arch/powerpc/kvm/book3s_hv.c >> @@ -1142,12 +1142,13 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu) >> } >> >> /* >> - * Handle H_CEDE in the nested virtualization case where we haven't >> - * called the real-mode hcall handlers in book3s_hv_rmhandlers.S. >> + * Handle H_CEDE in the P9 path where we don't call the real-mode hcall >> + * handlers in book3s_hv_rmhandlers.S. >> + * >> * This has to be done early, not in kvmppc_pseries_do_hcall(), so >> * that the cede logic in kvmppc_run_single_vcpu() works properly. >> */ >> -static void kvmppc_nested_cede(struct kvm_vcpu *vcpu) >> +static void kvmppc_cede(struct kvm_vcpu *vcpu) >> { >> vcpu->arch.shregs.msr |= MSR_EE; >> vcpu->arch.ceded = 1; >> @@ -1403,9 +1404,15 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu >> *vcpu, >> /* hcall - punt to userspace */ >> int i; >> >> -/* hypercall with MSR_PR has already been handled in rmode, >> - * and never reaches here. >> - */ >> +if (unlikely(vcpu->arch.shregs.msr & MSR_PR)) { >> +/* >> + * Guest userspace executed sc 1, reflect it back as a >> + * privileged program check interrupt. >> + */ >> +kvmppc_co
[powerpc:next-test] BUILD SUCCESS 8a83feefbd5254ae7f13aff3e4097dd7d8723bce
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
branch HEAD: 8a83feefbd5254ae7f13aff3e4097dd7d8723bce  cxl: Fix couple of spellings

elapsed time: 725m

configs tested: 109
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm      defconfig
arm64    allyesconfig
arm64    defconfig
arm      allyesconfig
arm      allmodconfig
x86_64   allyesconfig
riscv    allmodconfig
i386     allyesconfig
riscv    allyesconfig
h8300    allyesconfig
arc      nsimosci_hs_smp_defconfig
powerpc  ppc40x_defconfig
powerpc  makalu_defconfig
m68k     m5208evb_defconfig
mips     cu1000-neo_defconfig
powerpc  ksi8560_defconfig
arm      mps2_defconfig
powerpc  walnut_defconfig
arm      rpc_defconfig
mips     jmr3927_defconfig
arm      am200epdkit_defconfig
powerpc  currituck_defconfig
sh       sh7710voipgw_defconfig
arc      vdk_hs38_defconfig
mips     bmips_stb_defconfig
ia64     generic_defconfig
arc      nsim_700_defconfig
arm      pxa910_defconfig
xtensa   nommu_kc705_defconfig
powerpc  mpc8272_ads_defconfig
powerpc  linkstation_defconfig
powerpc  rainier_defconfig
mips     maltaup_defconfig
arm      pxa168_defconfig
arm      collie_defconfig
arm      pxa_defconfig
powerpc  tqm8555_defconfig
powerpc  eiger_defconfig
arm      aspeed_g5_defconfig
powerpc  pseries_defconfig
arm      pxa255-idp_defconfig
arm      exynos_defconfig
h8300    alldefconfig
sh       se7780_defconfig
ia64     allmodconfig
ia64     defconfig
ia64     allyesconfig
m68k     allmodconfig
m68k     defconfig
m68k     allyesconfig
nios2    defconfig
arc      allyesconfig
nds32    allnoconfig
nds32    defconfig
nios2    allyesconfig
csky     defconfig
alpha    defconfig
alpha    allyesconfig
xtensa   allyesconfig
arc      defconfig
sh       allmodconfig
parisc   defconfig
s390     allyesconfig
s390     allmodconfig
parisc   allyesconfig
s390     defconfig
sparc    allyesconfig
sparc    defconfig
i386     tinyconfig
i386     defconfig
mips     allyesconfig
mips     allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
x86_64   randconfig-a006-20210323
x86_64   randconfig-a004-20210323
x86_64   randconfig-a005-20210323
i386     randconfig-a003-20210323
i386     randconfig-a004-20210323
i386     randconfig-a001-20210323
i386     randconfig-a002-20210323
i386     randconfig-a006-20210323
i386     randconfig-a005-20210323
i386     randconfig-a014-20210323
i386     randconfig-a011-20210323
i386     randconfig-a015-20210323
i386     randconfig-a016-20210323
i386     randconfig-a012-20210323
i386     randconfig-a013-20210323
x86_64   randconfig-a002-20210323
x86_64   randconfig-a003-20210323
x86_64   randconfig-a001-20210323
riscv    nommu_k210_defconfig
riscv    nommu_virt_defconfig
riscv    allnoconfig
riscv    defconfig
riscv    rv32_defconfig
x86_64   rhel-7.6-kselftests
x86_64   defconfig
x86_64   rhel
[powerpc:merge] BUILD SUCCESS 909b15d4ac3524a89c6df8c60e0cb0b4d5a3c248
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 909b15d4ac3524a89c6df8c60e0cb0b4d5a3c248  Automatic merge of 'fixes' into merge (2021-03-23 22:53)

elapsed time: 725m

configs tested: 127
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm        defconfig
arm64      allyesconfig
arm64      defconfig
arm        allyesconfig
arm        allmodconfig
x86_64     allyesconfig
i386       allyesconfig
powerpc    maple_defconfig
powerpc    lite5200b_defconfig
sparc      alldefconfig
mips       maltaup_xpa_defconfig
powerpc    mpc836x_rdk_defconfig
arm        lart_defconfig
m68k       m5208evb_defconfig
mips       cu1000-neo_defconfig
powerpc    ksi8560_defconfig
arm        mps2_defconfig
powerpc    tqm8xx_defconfig
m68k       alldefconfig
powerpc    mgcoge_defconfig
sh         se7751_defconfig
mips       omega2p_defconfig
powerpc    ppc64e_defconfig
powerpc    walnut_defconfig
arm        rpc_defconfig
mips       jmr3927_defconfig
arm        am200epdkit_defconfig
powerpc    currituck_defconfig
sh         sh7710voipgw_defconfig
powerpc    socrates_defconfig
nds32      allnoconfig
arm        imx_v6_v7_defconfig
arm        neponset_defconfig
sh         hp6xx_defconfig
arm        orion5x_defconfig
mips       malta_qemu_32r6_defconfig
mips       capcella_defconfig
arm        lubbock_defconfig
sh         alldefconfig
powerpc    ep8248e_defconfig
powerpc    tqm8540_defconfig
arm        integrator_defconfig
riscv      rv32_defconfig
powerpc    mpc866_ads_defconfig
arm        mainstone_defconfig
sh         sh03_defconfig
m68k       multi_defconfig
arm        pxa_defconfig
powerpc    tqm8555_defconfig
powerpc    eiger_defconfig
arm        pxa168_defconfig
mips       cu1830-neo_defconfig
powerpc    obs600_defconfig
powerpc64  defconfig
mips       ath25_defconfig
arm        aspeed_g5_defconfig
powerpc    pseries_defconfig
arm        pxa255-idp_defconfig
arm        exynos_defconfig
h8300      alldefconfig
sh         se7780_defconfig
ia64       allmodconfig
ia64       defconfig
ia64       allyesconfig
m68k       allmodconfig
m68k       defconfig
m68k       allyesconfig
nds32      defconfig
nios2      allyesconfig
csky       defconfig
alpha      defconfig
alpha      allyesconfig
xtensa     allyesconfig
h8300      allyesconfig
arc        defconfig
sh         allmodconfig
parisc     defconfig
s390       allyesconfig
s390       allmodconfig
parisc     allyesconfig
s390       defconfig
sparc      allyesconfig
sparc      defconfig
i386       tinyconfig
i386       defconfig
nios2      defconfig
arc        allyesconfig
mips       allyesconfig
mips       allmodconfig
powerpc    allyesconfig
powerpc    allmodconfig
powerpc    allnoconfig
x86_64     randconfig-a002-20210323
x86_64     randconfig-a003-20210323
x86_64     randconfig-a006-20210323
x86_64     randconfig-a001-20210323
x86_64     randconfig-a004-20210323
x86_64     randconfig-a005-20210323
i386       randconfig-a003-20210323
i386       randconfig-a004-202
[powerpc:fixes-test] BUILD SUCCESS 274cb1ca2e7ce02cab56f5f4c61a74aeb566f931
obs600_defconfig
m68k       mvme16x_defconfig
nios2      3c120_defconfig
sh         landisk_defconfig
sh         secureedge5410_defconfig
arm        integrator_defconfig
powerpc    mpc836x_mds_defconfig
powerpc    mpc8272_ads_defconfig
powerpc    linkstation_defconfig
powerpc    rainier_defconfig
mips       maltaup_defconfig
arm        pxa168_defconfig
arm        collie_defconfig
arm        lpc18xx_defconfig
sh         ecovec24-romimage_defconfig
mips       rs90_defconfig
sh         sh7785lcr_defconfig
sh         se7721_defconfig
arm        davinci_all_defconfig
powerpc    ppc6xx_defconfig
powerpc    mpc834x_mds_defconfig
sh         rsk7201_defconfig
powerpc    tqm8541_defconfig
powerpc    mpc834x_itx_defconfig
sh         rsk7203_defconfig
mips       loongson1b_defconfig
arm        pxa_defconfig
powerpc    tqm8555_defconfig
powerpc    eiger_defconfig
mips       cu1830-neo_defconfig
powerpc64  defconfig
mips       ath25_defconfig
arm        axm55xx_defconfig
arc        nsimosci_hs_smp_defconfig
powerpc    asp8347_defconfig
arc        hsdk_defconfig
ia64       allmodconfig
ia64       defconfig
ia64       allyesconfig
m68k       allmodconfig
m68k       defconfig
nios2      defconfig
arc        allyesconfig
nds32      defconfig
nios2      allyesconfig
csky       defconfig
alpha      defconfig
alpha      allyesconfig
xtensa     allyesconfig
h8300      allyesconfig
arc        defconfig
sh         allmodconfig
parisc     defconfig
s390       allyesconfig
s390       allmodconfig
parisc     allyesconfig
s390       defconfig
sparc      allyesconfig
sparc      defconfig
i386       tinyconfig
i386       defconfig
mips       allyesconfig
mips       allmodconfig
powerpc    allyesconfig
powerpc    allmodconfig
powerpc    allnoconfig
x86_64     randconfig-a002-20210323
x86_64     randconfig-a003-20210323
x86_64     randconfig-a006-20210323
x86_64     randconfig-a001-20210323
x86_64     randconfig-a004-20210323
x86_64     randconfig-a005-20210323
i386       randconfig-a003-20210323
i386       randconfig-a001-20210323
i386       randconfig-a002-20210323
i386       randconfig-a004-20210323
i386       randconfig-a006-20210323
i386       randconfig-a005-20210323
i386       randconfig-a004-20210324
i386       randconfig-a003-20210324
i386       randconfig-a001-20210324
i386       randconfig-a002-20210324
i386       randconfig-a006-20210324
i386       randconfig-a005-20210324
i386       randconfig-a015-20210323
i386       randconfig-a016-20210323
i386       randconfig-a014-20210323
i386       randconfig-a011-20210323
i386       randconfig-a012-20210323
i386       randconfig-a013-20210323
riscv      nommu_virt_defconfig
riscv      rv32_defconfig
riscv      nommu_k210_defconfig
riscv      allnoconfig
riscv      defconfig
x86_64     rhel-7.6-kselftests
x86_64     defconfig
x86_64     rhel-8.3
x86_64     rhel-8.3-kbuiltin
x86_64     kexec

clang tested configs:
x86_64     randconfig-a012-20210323
x86_64     randconfig-a015-20210323
x86_64     randconfig-a013-20210323
x86_64     randconfig-a014-20210323
x86_64     randconfig-a011-20210323
x86_64     randconfig-a016-20210323

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01
Re: [PATCH v2 -next] powerpc: kernel/time.c - cleanup warnings
On 23/03/2021 23:18:17+0100, Alexandre Belloni wrote: > Hello, > > On 23/03/2021 05:12:57-0400, He Ying wrote: > > We found these warnings in arch/powerpc/kernel/time.c as follows: > > warning: symbol 'decrementer_max' was not declared. Should it be static? > > warning: symbol 'rtc_lock' was not declared. Should it be static? > > warning: symbol 'dtl_consumer' was not declared. Should it be static? > > > > Declare 'decrementer_max' and 'rtc_lock' in powerpc asm/time.h. > > Rename 'rtc_lock' in drivers/rtc/rtc-vr41xx.c to 'vr41xx_rtc_lock' to > > avoid the conflict with the variable in powerpc asm/time.h. > > Move 'dtl_consumer' definition behind "include <asm/dtl.h>" because it > > is declared there. > > > > Reported-by: Hulk Robot > > Signed-off-by: He Ying > > --- > > v2: > > - Instead of including linux/mc146818rtc.h in powerpc kernel/time.c, declare > > rtc_lock in powerpc asm/time.h. > > > > V1 was actually the correct thing to do. rtc_lock is there exactly > because chrp and maple are using mc146818-compatible RTCs. This is > useful because drivers/char/nvram.c is then enabled. The proper fix > would be to scrap all of that and use rtc-cmos for those platforms as > this drives the RTC properly and exposes the NVRAM for the mc146818. > > Or at least, if there are no users for the char/nvram driver on those > two platforms, remove the spinlock and stop enabling CONFIG_NVRAM or, > more likely, rename the symbol as it seems to be abused by both chrp and > powermac. > OK, so rtc_lock is not even used by the char/nvram.c driver as it is completely compiled out. I guess it is fine moving it to the individual platforms since, looking very quickly at the Kconfig, it is not possible to select both simultaneously. 
Tentative patch: 8<- From dfa59b6f44fdfdefafffa7666aec89e62bbd5c80 Mon Sep 17 00:00:00 2001 From: Alexandre Belloni Date: Wed, 24 Mar 2021 00:00:03 +0100 Subject: [PATCH] powerpc: move rtc_lock to specific platforms Signed-off-by: Alexandre Belloni --- arch/powerpc/kernel/time.c | 3 --- arch/powerpc/platforms/chrp/time.c | 2 +- arch/powerpc/platforms/maple/time.c | 2 ++ 3 files changed, 3 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 67feb3524460..d3bb189ea7f4 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -123,9 +123,6 @@ EXPORT_SYMBOL(tb_ticks_per_usec); unsigned long tb_ticks_per_sec; EXPORT_SYMBOL(tb_ticks_per_sec); /* for cputime_t conversions */ -DEFINE_SPINLOCK(rtc_lock); -EXPORT_SYMBOL_GPL(rtc_lock); - static u64 tb_to_ns_scale __read_mostly; static unsigned tb_to_ns_shift __read_mostly; static u64 boot_tb __read_mostly; diff --git a/arch/powerpc/platforms/chrp/time.c b/arch/powerpc/platforms/chrp/time.c index acde7bbe0716..ea90c15f5edd 100644 --- a/arch/powerpc/platforms/chrp/time.c +++ b/arch/powerpc/platforms/chrp/time.c @@ -30,7 +30,7 @@ #include -extern spinlock_t rtc_lock; +DEFINE_SPINLOCK(rtc_lock); #define NVRAM_AS0 0x74 #define NVRAM_AS1 0x75 diff --git a/arch/powerpc/platforms/maple/time.c b/arch/powerpc/platforms/maple/time.c index 78209bb7629c..ddda02010d86 100644 --- a/arch/powerpc/platforms/maple/time.c +++ b/arch/powerpc/platforms/maple/time.c @@ -34,6 +34,8 @@ #define DBG(x...) #endif +DEFINE_SPINLOCK(rtc_lock); + static int maple_rtc_addr; static int maple_clock_read(int addr) -- 2.25.1 > I'm not completely against the rename in vr41xx but the fix for the > warnings can and should be contained in arch/powerpc. 
> > > arch/powerpc/include/asm/time.h | 3 +++ > > arch/powerpc/kernel/time.c | 6 ++ > > drivers/rtc/rtc-vr41xx.c| 22 +++--- > > 3 files changed, 16 insertions(+), 15 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/time.h > > b/arch/powerpc/include/asm/time.h > > index 8dd3cdb25338..64a3ef0b4270 100644 > > --- a/arch/powerpc/include/asm/time.h > > +++ b/arch/powerpc/include/asm/time.h > > @@ -12,6 +12,7 @@ > > #ifdef __KERNEL__ > > #include > > #include > > +#include > > > > #include > > #include > > @@ -22,6 +23,8 @@ extern unsigned long tb_ticks_per_jiffy; > > extern unsigned long tb_ticks_per_usec; > > extern unsigned long tb_ticks_per_sec; > > extern struct clock_event_device decrementer_clockevent; > > +extern u64 decrementer_max; > > +extern spinlock_t rtc_lock; > > > > > > extern void generic_calibrate_decr(void); > > diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c > > index b67d93a609a2..60b6ac7d3685 100644 > > --- a/arch/powerpc/kernel/time.c > > +++ b/arch/powerpc/kernel/time.c > > @@ -150,10 +150,6 @@ bool tb_invalid; > > u64 __cputime_usec_factor; > > EXPORT_SYMBOL(__cputime_usec_factor); > > > > -#ifdef CONFIG_PPC_SPLPAR > > -void (*dtl_consumer)(struct dtl_entry *, u64); > > -#endif > > - > > static void calc_cputime_factors(void) > > { > > struct div_result res; > > @@ -179,6 +175,8 @@ static inline uns
Re: [PATCH v4 22/46] KVM: PPC: Book3S HV P9: Stop handling hcalls in real-mode in the P9 path
Nicholas Piggin writes: > In the interest of minimising the amount of code that is run in > "real-mode", don't handle hcalls in real mode in the P9 path. > > POWER8 and earlier are much more expensive to exit from HV real mode > and switch to host mode, because on those processors HV interrupts get > to the hypervisor with the MMU off, and the other threads in the core > need to be pulled out of the guest, and SLBs all need to be saved, > ERATs invalidated, and host SLB reloaded before the MMU is re-enabled > in host mode. Hash guests also require a lot of hcalls to run. The > XICS interrupt controller requires hcalls to run. > > By contrast, POWER9 has independent thread switching, and in radix mode > the hypervisor is already in a host virtual memory mode when the HV > interrupt is taken. Radix + xive guests don't need hcalls to handle > interrupts or manage translations. > > So it's much less important to handle hcalls in real mode in P9. > > Signed-off-by: Nicholas Piggin > --- > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c > index fa7614c37e08..17739aaee3d8 100644 > --- a/arch/powerpc/kvm/book3s_hv.c > +++ b/arch/powerpc/kvm/book3s_hv.c > @@ -1142,12 +1142,13 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu) > } > > /* > - * Handle H_CEDE in the nested virtualization case where we haven't > - * called the real-mode hcall handlers in book3s_hv_rmhandlers.S. > + * Handle H_CEDE in the P9 path where we don't call the real-mode hcall > + * handlers in book3s_hv_rmhandlers.S. > + * > * This has to be done early, not in kvmppc_pseries_do_hcall(), so > * that the cede logic in kvmppc_run_single_vcpu() works properly. 
> */ > -static void kvmppc_nested_cede(struct kvm_vcpu *vcpu) > +static void kvmppc_cede(struct kvm_vcpu *vcpu) > { > vcpu->arch.shregs.msr |= MSR_EE; > vcpu->arch.ceded = 1; > @@ -1403,9 +1404,15 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu, > /* hcall - punt to userspace */ > int i; > > - /* hypercall with MSR_PR has already been handled in rmode, > - * and never reaches here. > - */ > + if (unlikely(vcpu->arch.shregs.msr & MSR_PR)) { > + /* > + * Guest userspace executed sc 1, reflect it back as a > + * privileged program check interrupt. > + */ > + kvmppc_core_queue_program(vcpu, SRR1_PROGPRIV); > + r = RESUME_GUEST; > + break; > + } This patch bypasses sc_1_fast_return so it breaks KVM-PR. L1 loops with the following output: [9.503929][ T3443] Couldn't emulate instruction 0x4e800020 (op 19 xop 16) [9.503990][ T3443] kvmppc_exit_pr_progint: emulation at 48f4 failed (4e800020) [9.504080][ T3443] Couldn't emulate instruction 0x4e800020 (op 19 xop 16) [9.504170][ T3443] kvmppc_exit_pr_progint: emulation at 48f4 failed (4e800020) 0x4e800020 is a blr after a sc 1 in SLOF. For KVM-PR we need to inject a 0xc00 at some point, either here or before branching to no_try_real in book3s_hv_rmhandlers.S. > > run->papr_hcall.nr = kvmppc_get_gpr(vcpu, 3); > for (i = 0; i < 9; ++i) > @@ -3663,6 +3670,12 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu > *vcpu, u64 time_limit, > return trap; > } > > +static inline bool hcall_is_xics(unsigned long req) > +{ > + return (req == H_EOI || req == H_CPPR || req == H_IPI || > + req == H_IPOLL || req == H_XIRR || req == H_XIRR_X); > +} > + > /* > * Virtual-mode guest entry for POWER9 and later when the host and > * guest are both using the radix MMU. The LPIDR has already been set. 
> @@ -3774,15 +3787,36 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu > *vcpu, u64 time_limit, > /* H_CEDE has to be handled now, not later */ > if (trap == BOOK3S_INTERRUPT_SYSCALL && !vcpu->arch.nested && > kvmppc_get_gpr(vcpu, 3) == H_CEDE) { > - kvmppc_nested_cede(vcpu); > + kvmppc_cede(vcpu); > kvmppc_set_gpr(vcpu, 3, 0); > trap = 0; > } > } else { > kvmppc_xive_push_vcpu(vcpu); > trap = kvmhv_load_hv_regs_and_go(vcpu, time_limit, lpcr); > + if (trap == BOOK3S_INTERRUPT_SYSCALL && !vcpu->arch.nested && > + !(vcpu->arch.shregs.msr & MSR_PR)) { > + unsigned long req = kvmppc_get_gpr(vcpu, 3); > + > + /* H_CEDE has to be handled now, not later */ > + if (req == H_CEDE) { > + kvmppc_cede(vcpu); > + kvmppc_xive_cede_vcpu(vcpu); /* may un-cede */ > + kvmppc_set_gpr(vcpu, 3, 0); > + trap = 0; > + > + /* XICS hca
Re: [PATCH v2 -next] powerpc: kernel/time.c - cleanup warnings
Hello, On 23/03/2021 05:12:57-0400, He Ying wrote: > We found these warnings in arch/powerpc/kernel/time.c as follows: > warning: symbol 'decrementer_max' was not declared. Should it be static? > warning: symbol 'rtc_lock' was not declared. Should it be static? > warning: symbol 'dtl_consumer' was not declared. Should it be static? > > Declare 'decrementer_max' and 'rtc_lock' in powerpc asm/time.h. > Rename 'rtc_lock' in drivers/rtc/rtc-vr41xx.c to 'vr41xx_rtc_lock' to > avoid the conflict with the variable in powerpc asm/time.h. > Move 'dtl_consumer' definition behind "include <asm/dtl.h>" because it > is declared there. > > Reported-by: Hulk Robot > Signed-off-by: He Ying > --- > v2: > - Instead of including linux/mc146818rtc.h in powerpc kernel/time.c, declare > rtc_lock in powerpc asm/time.h. > V1 was actually the correct thing to do. rtc_lock is there exactly because chrp and maple are using mc146818-compatible RTCs. This is useful because drivers/char/nvram.c is then enabled. The proper fix would be to scrap all of that and use rtc-cmos for those platforms as this drives the RTC properly and exposes the NVRAM for the mc146818. Or at least, if there are no users for the char/nvram driver on those two platforms, remove the spinlock and stop enabling CONFIG_NVRAM or, more likely, rename the symbol as it seems to be abused by both chrp and powermac. I'm not completely against the rename in vr41xx but the fix for the warnings can and should be contained in arch/powerpc. 
> arch/powerpc/include/asm/time.h | 3 +++ > arch/powerpc/kernel/time.c | 6 ++ > drivers/rtc/rtc-vr41xx.c| 22 +++--- > 3 files changed, 16 insertions(+), 15 deletions(-) > > diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h > index 8dd3cdb25338..64a3ef0b4270 100644 > --- a/arch/powerpc/include/asm/time.h > +++ b/arch/powerpc/include/asm/time.h > @@ -12,6 +12,7 @@ > #ifdef __KERNEL__ > #include > #include > +#include > > #include > #include > @@ -22,6 +23,8 @@ extern unsigned long tb_ticks_per_jiffy; > extern unsigned long tb_ticks_per_usec; > extern unsigned long tb_ticks_per_sec; > extern struct clock_event_device decrementer_clockevent; > +extern u64 decrementer_max; > +extern spinlock_t rtc_lock; > > > extern void generic_calibrate_decr(void); > diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c > index b67d93a609a2..60b6ac7d3685 100644 > --- a/arch/powerpc/kernel/time.c > +++ b/arch/powerpc/kernel/time.c > @@ -150,10 +150,6 @@ bool tb_invalid; > u64 __cputime_usec_factor; > EXPORT_SYMBOL(__cputime_usec_factor); > > -#ifdef CONFIG_PPC_SPLPAR > -void (*dtl_consumer)(struct dtl_entry *, u64); > -#endif > - > static void calc_cputime_factors(void) > { > struct div_result res; > @@ -179,6 +175,8 @@ static inline unsigned long read_spurr(unsigned long tb) > > #include > > +void (*dtl_consumer)(struct dtl_entry *, u64); > + > /* > * Scan the dispatch trace log and count up the stolen time. > * Should be called with interrupts disabled. 
> diff --git a/drivers/rtc/rtc-vr41xx.c b/drivers/rtc/rtc-vr41xx.c > index 5a9f9ad86d32..cc31db058197 100644 > --- a/drivers/rtc/rtc-vr41xx.c > +++ b/drivers/rtc/rtc-vr41xx.c > @@ -72,7 +72,7 @@ static void __iomem *rtc2_base; > > static unsigned long epoch = 1970; /* Jan 1 1970 00:00:00 */ > > -static DEFINE_SPINLOCK(rtc_lock); > +static DEFINE_SPINLOCK(vr41xx_rtc_lock); > static char rtc_name[] = "RTC"; > static unsigned long periodic_count; > static unsigned int alarm_enabled; > @@ -101,13 +101,13 @@ static inline time64_t read_elapsed_second(void) > > static inline void write_elapsed_second(time64_t sec) > { > - spin_lock_irq(&rtc_lock); > + spin_lock_irq(&vr41xx_rtc_lock); > > rtc1_write(ETIMELREG, (uint16_t)(sec << 15)); > rtc1_write(ETIMEMREG, (uint16_t)(sec >> 1)); > rtc1_write(ETIMEHREG, (uint16_t)(sec >> 17)); > > - spin_unlock_irq(&rtc_lock); > + spin_unlock_irq(&vr41xx_rtc_lock); > } > > static int vr41xx_rtc_read_time(struct device *dev, struct rtc_time *time) > @@ -139,14 +139,14 @@ static int vr41xx_rtc_read_alarm(struct device *dev, > struct rtc_wkalrm *wkalrm) > unsigned long low, mid, high; > struct rtc_time *time = &wkalrm->time; > > - spin_lock_irq(&rtc_lock); > + spin_lock_irq(&vr41xx_rtc_lock); > > low = rtc1_read(ECMPLREG); > mid = rtc1_read(ECMPMREG); > high = rtc1_read(ECMPHREG); > wkalrm->enabled = alarm_enabled; > > - spin_unlock_irq(&rtc_lock); > + spin_unlock_irq(&vr41xx_rtc_lock); > > rtc_time64_to_tm((high << 17) | (mid << 1) | (low >> 15), time); > > @@ -159,7 +159,7 @@ static int vr41xx_rtc_set_alarm(struct device *dev, > struct rtc_wkalrm *wkalrm) > > alarm_sec = rtc_tm_to_time64(&wkalrm->time); > > - spin_lock_irq(&rtc_lock); > + spin_lock_irq(&vr41xx_rtc_lock); > > if (alarm_enabled) >
Re: [PATCH] macintosh: A typo fix
On 3/23/21 1:46 PM, Bhaskar Chowdhury wrote: > > s/coment/comment/ > > Signed-off-by: Bhaskar Chowdhury Acked-by: Randy Dunlap > --- > drivers/macintosh/windfarm_smu_controls.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/macintosh/windfarm_smu_controls.c > b/drivers/macintosh/windfarm_smu_controls.c > index 79cb1ad09bfd..75966052819a 100644 > --- a/drivers/macintosh/windfarm_smu_controls.c > +++ b/drivers/macintosh/windfarm_smu_controls.c > @@ -94,7 +94,7 @@ static int smu_set_fan(int pwm, u8 id, u16 value) > return rc; > wait_for_completion(&comp); > > - /* Handle fallback (see coment above) */ > + /* Handle fallback (see comment above) */ > if (cmd.status != 0 && smu_supports_new_fans_ops) { > printk(KERN_WARNING "windfarm: SMU failed new fan command " > "falling back to old method\n"); > -- -- ~Randy
[PATCH v2 1/1] hotplug-cpu.c: show 'last online CPU' error in dlpar_cpu_offline()
One of the reasons that dlpar_cpu_offline() can fail is when attempting to offline the last online CPU of the kernel. This can be observed in a pseries QEMU guest that has hotplugged CPUs. If the user offlines all other CPUs of the guest, and a hotplugged CPU is now the last online CPU, trying to reclaim it will fail. See [1] for an example. The current error message in this situation returns rc with -EBUSY and a generic explanation, e.g.: pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9, rc: -16 EBUSY can be caused by other conditions, such as cpu_hotplug_disable being true. Throwing a more specific error message for this case, instead of just "Failed to offline CPU", makes it clearer that the error is in fact a known error situation instead of some other generic/unknown cause. This patch adds a 'last online' check in dlpar_cpu_offline() to catch the 'last online CPU' offline error, returning a more informative error message: pseries-hotplug-cpu: Unable to remove last online CPU PowerPC,POWER9 [1] https://bugzilla.redhat.com/1911414 Signed-off-by: Daniel Henrique Barboza --- arch/powerpc/platforms/pseries/hotplug-cpu.c | 13 + 1 file changed, 13 insertions(+) diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 12cbffd3c2e3..3ac7e904385c 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -271,6 +271,18 @@ static int dlpar_offline_cpu(struct device_node *dn) if (!cpu_online(cpu)) break; + /* device_offline() will return -EBUSY (via cpu_down()) +* if there is only one CPU left. Check it here to fail +* earlier and with a more informative error message, +* while also retaining the cpu_add_remove_lock to be sure +* that no CPUs are being online/offlined during this +* check. 
*/ + if (num_online_cpus() == 1) { + pr_warn("Unable to remove last online CPU %pOFn\n", dn); + rc = -EBUSY; + goto out_unlock; + } + cpu_maps_update_done(); rc = device_offline(get_cpu_device(cpu)); if (rc) @@ -283,6 +295,7 @@ static int dlpar_offline_cpu(struct device_node *dn) thread); } } +out_unlock: cpu_maps_update_done(); out: -- 2.30.2
[PATCH v2 0/1] show 'last online CPU' error in dlpar_cpu_offline()
changes in v2 after Michael Ellerman review: - moved the verification code from dlpar_cpu_remove() to dlpar_cpu_offline(), while holding cpu_add_remove_lock - reworded the commit message and code comment v1 link: https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210305173845.451158-1-danielhb...@gmail.com/ Daniel Henrique Barboza (1): hotplug-cpu.c: show 'last online CPU' error in dlpar_cpu_offline() arch/powerpc/platforms/pseries/hotplug-cpu.c | 13 + 1 file changed, 13 insertions(+) -- 2.30.2
[PATCH] macintosh: A typo fix
s/coment/comment/ Signed-off-by: Bhaskar Chowdhury --- drivers/macintosh/windfarm_smu_controls.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/macintosh/windfarm_smu_controls.c b/drivers/macintosh/windfarm_smu_controls.c index 79cb1ad09bfd..75966052819a 100644 --- a/drivers/macintosh/windfarm_smu_controls.c +++ b/drivers/macintosh/windfarm_smu_controls.c @@ -94,7 +94,7 @@ static int smu_set_fan(int pwm, u8 id, u16 value) return rc; wait_for_completion(&comp); - /* Handle fallback (see coment above) */ + /* Handle fallback (see comment above) */ if (cmd.status != 0 && smu_supports_new_fans_ops) { printk(KERN_WARNING "windfarm: SMU failed new fan command " "falling back to old method\n"); -- 2.30.1
Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig
On Mon, Mar 22, 2021 at 06:10:01PM +0100, Cye Borg wrote: > PWS 500au: > > snow / # lspci -vvx -s 7.1 > 00:07.1 IDE interface: Contaq Microsystems 82c693 (prog-if 80 [ISA > Compatibility mode-only controller, supports bus mastering]) > Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- > ParErr+ Stepping- SERR- FastB2B- DisINTx- > Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium > >TAbort- SERR- Latency: 0 > Interrupt: pin A routed to IRQ 0 > Region 0: I/O ports at 01f0 [size=8] > Region 1: I/O ports at 03f4 > Region 4: I/O ports at 9080 [size=16] > Kernel driver in use: pata_cypress > Kernel modules: pata_cypress > 00: 80 10 93 c6 45 00 80 02 00 80 01 01 00 00 80 00 > 10: f1 01 00 00 f5 03 00 00 00 00 00 00 00 00 00 00 > 20: 81 90 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 > > snow / # lspci -vvx -s 7.2 > 00:07.2 IDE interface: Contaq Microsystems 82c693 (prog-if 00 [ISA > Compatibility mode-only controller]) > Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- > ParErr+ Stepping- SERR- FastB2B- DisINTx- > Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium > >TAbort- SERR- Latency: 0 > Interrupt: pin B routed to IRQ 0 > Region 0: I/O ports at 0170 [size=8] > Region 1: I/O ports at 0374 > Region 4: Memory at 0c24 (32-bit, non-prefetchable) > [disabled] [size=64K] > Kernel modules: pata_cypress > 00: 80 10 93 c6 45 00 80 02 00 00 01 01 00 00 80 00 > 10: 71 01 00 00 75 03 00 00 00 00 00 00 00 00 00 00 > 20: 00 00 24 0c 00 00 00 00 00 00 00 00 00 00 00 00 > 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 Thanks very much. Could I also ask for the output of: # lspci -vxxx -s 7.0 as well please - this will dump all 256 bytes for the ISA bridge, which contains a bunch of configuration registers. Thanks. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH v4 02/46] KVM: PPC: Book3S HV: Add a function to filter guest LPCR bits
Nicholas Piggin writes: > Guest LPCR depends on hardware type, and future changes will add > restrictions based on errata and guest MMU mode. Move this logic > to a common function and use it for the cases where the guest > wants to update its LPCR (or the LPCR of a nested guest). > > Signed-off-by: Nicholas Piggin Reviewed-by: Fabiano Rosas > --- > arch/powerpc/include/asm/kvm_book3s.h | 2 + > arch/powerpc/kvm/book3s_hv.c | 60 ++- > arch/powerpc/kvm/book3s_hv_nested.c | 3 +- > 3 files changed, 45 insertions(+), 20 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_book3s.h > b/arch/powerpc/include/asm/kvm_book3s.h > index 2f5f919f6cd3..3eec3ef6f083 100644 > --- a/arch/powerpc/include/asm/kvm_book3s.h > +++ b/arch/powerpc/include/asm/kvm_book3s.h > @@ -258,6 +258,8 @@ extern long kvmppc_hv_get_dirty_log_hpt(struct kvm *kvm, > extern void kvmppc_harvest_vpa_dirty(struct kvmppc_vpa *vpa, > struct kvm_memory_slot *memslot, > unsigned long *map); > +extern unsigned long kvmppc_filter_lpcr_hv(struct kvmppc_vcore *vc, > + unsigned long lpcr); > extern void kvmppc_update_lpcr(struct kvm *kvm, unsigned long lpcr, > unsigned long mask); > extern void kvmppc_set_fscr(struct kvm_vcpu *vcpu, u64 fscr); > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c > index 13bad6bf4c95..c4539c38c639 100644 > --- a/arch/powerpc/kvm/book3s_hv.c > +++ b/arch/powerpc/kvm/book3s_hv.c > @@ -1635,6 +1635,27 @@ static int kvm_arch_vcpu_ioctl_set_sregs_hv(struct > kvm_vcpu *vcpu, > return 0; > } > > +/* > + * Enforce limits on guest LPCR values based on hardware availability, > + * guest configuration, and possibly hypervisor support and security > + * concerns. 
> + */ > +unsigned long kvmppc_filter_lpcr_hv(struct kvmppc_vcore *vc, unsigned long > lpcr) > +{ > + /* On POWER8 and above, userspace can modify AIL */ > + if (!cpu_has_feature(CPU_FTR_ARCH_207S)) > + lpcr &= ~LPCR_AIL; > + > + /* > + * On POWER9, allow userspace to enable large decrementer for the > + * guest, whether or not the host has it enabled. > + */ > + if (!cpu_has_feature(CPU_FTR_ARCH_300)) > + lpcr &= ~LPCR_LD; > + > + return lpcr; > +} > + > static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 new_lpcr, > bool preserve_top32) > { > @@ -1643,6 +1664,23 @@ static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 > new_lpcr, > u64 mask; > > spin_lock(&vc->lock); > + > + /* > + * Userspace can only modify > + * DPFD (default prefetch depth), ILE (interrupt little-endian), > + * TC (translation control), AIL (alternate interrupt location), > + * LD (large decrementer). > + * These are subject to restrictions from kvmppc_filter_lcpr_hv(). > + */ > + mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD; > + > + /* Broken 32-bit version of LPCR must not clear top bits */ > + if (preserve_top32) > + mask &= 0x; > + > + new_lpcr = kvmppc_filter_lpcr_hv(vc, > + (vc->lpcr & ~mask) | (new_lpcr & mask)); > + > /* >* If ILE (interrupt little-endian) has changed, update the >* MSR_LE bit in the intr_msr for each vcpu in this vcore. > @@ -1661,25 +1699,8 @@ static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 > new_lpcr, > } > } > > - /* > - * Userspace can only modify DPFD (default prefetch depth), > - * ILE (interrupt little-endian) and TC (translation control). > - * On POWER8 and POWER9 userspace can also modify AIL (alt. interrupt > loc.). > - */ > - mask = LPCR_DPFD | LPCR_ILE | LPCR_TC; > - if (cpu_has_feature(CPU_FTR_ARCH_207S)) > - mask |= LPCR_AIL; > - /* > - * On POWER9, allow userspace to enable large decrementer for the > - * guest, whether or not the host has it enabled. 
> - */ > - if (cpu_has_feature(CPU_FTR_ARCH_300)) > - mask |= LPCR_LD; > + vc->lpcr = new_lpcr; > > - /* Broken 32-bit version of LPCR must not clear top bits */ > - if (preserve_top32) > - mask &= 0x; > - vc->lpcr = (vc->lpcr & ~mask) | (new_lpcr & mask); > spin_unlock(&vc->lock); > } > > @@ -4641,8 +4662,9 @@ void kvmppc_update_lpcr(struct kvm *kvm, unsigned long > lpcr, unsigned long mask) > struct kvmppc_vcore *vc = kvm->arch.vcores[i]; > if (!vc) > continue; > + > spin_lock(&vc->lock); > - vc->lpcr = (vc->lpcr & ~mask) | lpcr; > + vc->lpcr = kvmppc_filter_lpcr_hv(vc, (vc->lpcr & ~mask) | lpcr); > spin_unlock(&vc->lock); > if (++cores_done >= kvm->arch.online_vcores) > break; > diff --git a/arch/powerpc/kvm/book3s_hv_nested.c > b/arch/powerpc/
Re: [PATCH v4 01/46] KVM: PPC: Book3S HV: Nested move LPCR sanitising to sanitise_hv_regs
Nicholas Piggin writes: > This will get a bit more complicated in future patches. Move it > into the helper function. > > Signed-off-by: Nicholas Piggin Reviewed-by: Fabiano Rosas > --- > arch/powerpc/kvm/book3s_hv_nested.c | 18 -- > 1 file changed, 12 insertions(+), 6 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_nested.c > b/arch/powerpc/kvm/book3s_hv_nested.c > index 0cd0e7aad588..2fe1fea4c934 100644 > --- a/arch/powerpc/kvm/book3s_hv_nested.c > +++ b/arch/powerpc/kvm/book3s_hv_nested.c > @@ -134,6 +134,16 @@ static void save_hv_return_state(struct kvm_vcpu *vcpu, > int trap, > > static void sanitise_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state > *hr) > { > + struct kvmppc_vcore *vc = vcpu->arch.vcore; > + u64 mask; > + > + /* > + * Don't let L1 change LPCR bits for the L2 except these: > + */ > + mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD | > + LPCR_LPES | LPCR_MER; > + hr->lpcr = (vc->lpcr & ~mask) | (hr->lpcr & mask); > + > /* >* Don't let L1 enable features for L2 which we've disabled for L1, >* but preserve the interrupt cause field. 
> @@ -271,8 +281,6 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu) > u64 hv_ptr, regs_ptr; > u64 hdec_exp; > s64 delta_purr, delta_spurr, delta_ic, delta_vtb; > - u64 mask; > - unsigned long lpcr; > > if (vcpu->kvm->arch.l1_ptcr == 0) > return H_NOT_AVAILABLE; > @@ -321,9 +329,7 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu) > vcpu->arch.nested_vcpu_id = l2_hv.vcpu_token; > vcpu->arch.regs = l2_regs; > vcpu->arch.shregs.msr = vcpu->arch.regs.msr; > - mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD | > - LPCR_LPES | LPCR_MER; > - lpcr = (vc->lpcr & ~mask) | (l2_hv.lpcr & mask); > + > sanitise_hv_regs(vcpu, &l2_hv); > restore_hv_regs(vcpu, &l2_hv); > > @@ -335,7 +341,7 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu) > r = RESUME_HOST; > break; > } > - r = kvmhv_run_single_vcpu(vcpu, hdec_exp, lpcr); > + r = kvmhv_run_single_vcpu(vcpu, hdec_exp, l2_hv.lpcr); > } while (is_kvmppc_resume_guest(r)); > > /* save L2 state for return */
Re: [PATCH v4 22/46] KVM: PPC: Book3S HV P9: Stop handling hcalls in real-mode in the P9 path
Nicholas Piggin writes: > In the interest of minimising the amount of code that is run in > "real-mode", don't handle hcalls in real mode in the P9 path. > > POWER8 and earlier are much more expensive to exit from HV real mode > and switch to host mode, because on those processors HV interrupts get > to the hypervisor with the MMU off, and the other threads in the core > need to be pulled out of the guest, and SLBs all need to be saved, > ERATs invalidated, and host SLB reloaded before the MMU is re-enabled > in host mode. Hash guests also require a lot of hcalls to run. The > XICS interrupt controller requires hcalls to run. > > By contrast, POWER9 has independent thread switching, and in radix mode > the hypervisor is already in a host virtual memory mode when the HV > interrupt is taken. Radix + xive guests don't need hcalls to handle > interrupts or manage translations. > > So it's much less important to handle hcalls in real mode in P9. > > Signed-off-by: Nicholas Piggin I tried this again in the L2 with xive=off and it works as expected now. 
Tested-by: Fabiano Rosas > --- > arch/powerpc/include/asm/kvm_ppc.h | 5 ++ > arch/powerpc/kvm/book3s_hv.c| 57 > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 5 ++ > arch/powerpc/kvm/book3s_xive.c | 70 + > 4 files changed, 127 insertions(+), 10 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_ppc.h > b/arch/powerpc/include/asm/kvm_ppc.h > index 73b1ca5a6471..db6646c2ade2 100644 > --- a/arch/powerpc/include/asm/kvm_ppc.h > +++ b/arch/powerpc/include/asm/kvm_ppc.h > @@ -607,6 +607,7 @@ extern void kvmppc_free_pimap(struct kvm *kvm); > extern int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall); > extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu); > extern int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd); > +extern int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req); > extern u64 kvmppc_xics_get_icp(struct kvm_vcpu *vcpu); > extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval); > extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev, > @@ -639,6 +640,8 @@ static inline int kvmppc_xics_enabled(struct kvm_vcpu > *vcpu) > static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { } > static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd) > { return 0; } > +static inline int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req) > + { return 0; } > #endif > > #ifdef CONFIG_KVM_XIVE > @@ -673,6 +676,7 @@ extern int kvmppc_xive_set_irq(struct kvm *kvm, int > irq_source_id, u32 irq, > int level, bool line_status); > extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu); > extern void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu); > +extern void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu); > > static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu) > { > @@ -714,6 +718,7 @@ static inline int kvmppc_xive_set_irq(struct kvm *kvm, > int irq_source_id, u32 ir > int level, bool line_status) { return > -ENODEV; } > static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { } > static inline void 
kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu) { } > +static inline void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu) { } > > static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu) > { return 0; } > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c > index fa7614c37e08..17739aaee3d8 100644 > --- a/arch/powerpc/kvm/book3s_hv.c > +++ b/arch/powerpc/kvm/book3s_hv.c > @@ -1142,12 +1142,13 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu) > } > > /* > - * Handle H_CEDE in the nested virtualization case where we haven't > - * called the real-mode hcall handlers in book3s_hv_rmhandlers.S. > + * Handle H_CEDE in the P9 path where we don't call the real-mode hcall > + * handlers in book3s_hv_rmhandlers.S. > + * > * This has to be done early, not in kvmppc_pseries_do_hcall(), so > * that the cede logic in kvmppc_run_single_vcpu() works properly. > */ > -static void kvmppc_nested_cede(struct kvm_vcpu *vcpu) > +static void kvmppc_cede(struct kvm_vcpu *vcpu) > { > vcpu->arch.shregs.msr |= MSR_EE; > vcpu->arch.ceded = 1; > @@ -1403,9 +1404,15 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu, > /* hcall - punt to userspace */ > int i; > > - /* hypercall with MSR_PR has already been handled in rmode, > - * and never reaches here. > - */ > + if (unlikely(vcpu->arch.shregs.msr & MSR_PR)) { > + /* > + * Guest userspace executed sc 1, reflect it back as a > + * privileged program check interrupt. > + */ > + kvmppc_core_queue_program(vcpu, SRR1_PROGPRIV); > + r = RESUME_GUEST; > +
[PATCH v2] powerpc/papr_scm: Implement support for H_SCM_FLUSH hcall
Add support for ND_REGION_ASYNC capability if the device tree indicates 'ibm,hcall-flush-required' property in the NVDIMM node. Flush is done by issuing H_SCM_FLUSH hcall to the hypervisor. If the flush request failed, the hypervisor is expected to reflect the problem in the subsequent dimm health request call. This patch prevents mmap of namespaces with MAP_SYNC flag if the nvdimm requires explicit flush[1]. References: [1] https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/ndctl.py.data/map_sync.c Signed-off-by: Shivaprasad G Bhat --- v1 - https://www.spinics.net/lists/kvm-ppc/msg18272.html Changes from v1: - Hcall semantics finalized, all changes are to accommodate them. Documentation/powerpc/papr_hcalls.rst | 14 ++ arch/powerpc/include/asm/hvcall.h |3 +- arch/powerpc/platforms/pseries/papr_scm.c | 39 + 3 files changed, 55 insertions(+), 1 deletion(-) diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst index 48fcf1255a33..648f278eea8f 100644 --- a/Documentation/powerpc/papr_hcalls.rst +++ b/Documentation/powerpc/papr_hcalls.rst @@ -275,6 +275,20 @@ Health Bitmap Flags: Given a DRC Index collect the performance statistics for NVDIMM and copy them to the resultBuffer. +**H_SCM_FLUSH** + +| Input: *drcIndex, continue-token* +| Out: *continue-token* +| Return Value: *H_SUCCESS, H_Parameter, H_P2, H_BUSY* + +Given a DRC Index, flush the data to the backend NVDIMM device. + +The hcall returns H_BUSY when the flush takes a long time and the hcall needs +to be issued multiple times in order to be completely serviced. The +*continue-token* from the output is to be passed in the argument list of +subsequent hcalls to the hypervisor until the hcall is completely serviced +at which point H_SUCCESS or other error is returned by the hypervisor. + References == ..
[1] "Power Architecture Platform Reference" diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h index ed6086d57b22..9f7729a97ebd 100644 --- a/arch/powerpc/include/asm/hvcall.h +++ b/arch/powerpc/include/asm/hvcall.h @@ -315,7 +315,8 @@ #define H_SCM_HEALTH0x400 #define H_SCM_PERFORMANCE_STATS 0x418 #define H_RPT_INVALIDATE 0x448 -#define MAX_HCALL_OPCODE H_RPT_INVALIDATE +#define H_SCM_FLUSH0x44C +#define MAX_HCALL_OPCODE H_SCM_FLUSH /* Scope args for H_SCM_UNBIND_ALL */ #define H_UNBIND_SCOPE_ALL (0x1) diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c index 835163f54244..f0407e135410 100644 --- a/arch/powerpc/platforms/pseries/papr_scm.c +++ b/arch/powerpc/platforms/pseries/papr_scm.c @@ -93,6 +93,7 @@ struct papr_scm_priv { uint64_t block_size; int metadata_size; bool is_volatile; + bool hcall_flush_required; uint64_t bound_addr; @@ -117,6 +118,38 @@ struct papr_scm_priv { size_t stat_buffer_len; }; +static int papr_scm_pmem_flush(struct nd_region *nd_region, + struct bio *bio __maybe_unused) +{ + struct papr_scm_priv *p = nd_region_provider_data(nd_region); + unsigned long ret_buf[PLPAR_HCALL_BUFSIZE]; + uint64_t token = 0; + int64_t rc; + + do { + rc = plpar_hcall(H_SCM_FLUSH, ret_buf, p->drc_index, token); + token = ret_buf[0]; + + /* Check if we are stalled for some time */ + if (H_IS_LONG_BUSY(rc)) { + msleep(get_longbusy_msecs(rc)); + rc = H_BUSY; + } else if (rc == H_BUSY) { + cond_resched(); + } + + } while (rc == H_BUSY); + + if (rc) { + dev_err(&p->pdev->dev, "flush error: %lld", rc); + rc = -EIO; + } else { + dev_dbg(&p->pdev->dev, "flush drc 0x%x complete", p->drc_index); + } + + return rc; +} + static LIST_HEAD(papr_nd_regions); static DEFINE_MUTEX(papr_ndr_lock); @@ -943,6 +976,11 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p) ndr_desc.num_mappings = 1; ndr_desc.nd_set = &p->nd_set; + if (p->hcall_flush_required) { + set_bit(ND_REGION_ASYNC, 
&ndr_desc.flags); + ndr_desc.flush = papr_scm_pmem_flush; + } + if (p->is_volatile) p->region = nvdimm_volatile_region_create(p->bus, &ndr_desc); else { @@ -1088,6 +1126,7 @@ static int papr_scm_probe(struct platform_device *pdev) p->block_size = block_size; p->blocks = blocks; p->is_volatile = !of_property_read_bool(dn, "ibm,cache-flush-required"); + p->hcall_flush_required = of_property_read_bool(dn, "ibm,hcall-flush-required"); /* We just need to ensure that set cookies are unique across */ uuid_parse(uuid_str, (uuid_t *) uuid);
Re: [PATCH] xsysace: Remove SYSACE driver
On 3/23/21 5:28 PM, Jens Axboe wrote: > On 3/23/21 10:25 AM, Michal Simek wrote: >> >> >> On 3/23/21 5:23 PM, Jens Axboe wrote: >>> On 3/22/21 6:04 PM, Davidlohr Bueso wrote: Hi, On Mon, 09 Nov 2020, Michal Simek wrote: > Sysace IP is no longer used on Xilinx PowerPC 405/440 and Microblaze > systems. The driver is not regularly tested and very likely not working > for > quite a long time that's why remove it. Is there a reason this patch was never merged? can the driver be removed? I ran into this as a potential tasklet user that can be replaced/removed. >>> >>> I'd be happy to merge it for 5.13. >>> >> >> Can you just take this version? Or do you want me to send it again? > > Minor edits needed for fuzz, but I've applied this version. Thanks, Michal
Re: [PATCH v4 44/46] KVM: PPC: Book3S HV P9: implement hash guest support
Nicholas Piggin writes: > Guest entry/exit has to restore and save/clear the SLB, plus several > other bits to accommodate hash guests in the P9 path. > > Radix host, hash guest support is removed from the P7/8 path. > > Signed-off-by: Nicholas Piggin > --- > diff --git a/arch/powerpc/kvm/book3s_hv_interrupt.c > b/arch/powerpc/kvm/book3s_hv_interrupt.c > index cd84d2c37632..03fbfef708a8 100644 > --- a/arch/powerpc/kvm/book3s_hv_interrupt.c > +++ b/arch/powerpc/kvm/book3s_hv_interrupt.c > @@ -55,6 +55,50 @@ static void __accumulate_time(struct kvm_vcpu *vcpu, > struct kvmhv_tb_accumulator > #define accumulate_time(vcpu, next) do {} while (0) > #endif > > +static inline void mfslb(unsigned int idx, u64 *slbee, u64 *slbev) > +{ > + asm volatile("slbmfev %0,%1" : "=r" (*slbev) : "r" (idx)); > + asm volatile("slbmfee %0,%1" : "=r" (*slbee) : "r" (idx)); > +} > + > +static inline void __mtslb(u64 slbee, u64 slbev) > +{ > + asm volatile("slbmte %0,%1" :: "r" (slbev), "r" (slbee)); > +} > + > +static inline void mtslb(unsigned int idx, u64 slbee, u64 slbev) > +{ > + BUG_ON((slbee & 0xfff) != idx); > + > + __mtslb(slbee, slbev); > +} > + > +static inline void slb_invalidate(unsigned int ih) > +{ > + asm volatile("slbia %0" :: "i"(ih)); > +} Fyi, in my environment the assembler complains: {standard input}: Assembler messages: {standard input}:1293: Error: junk at end of line: `6' {standard input}:2138: Error: junk at end of line: `6' make[3]: *** [../scripts/Makefile.build:271: arch/powerpc/kvm/book3s_hv_interrupt.o] Error 1 This works: - asm volatile("slbia %0" :: "i"(ih)); + asm volatile(PPC_SLBIA(%0) :: "i"(ih)); But I don't know what is going on.
[PATCH] powerpc: Switch to relative jump labels
Convert powerpc to relative jump labels. Before the patch, pseries_defconfig vmlinux.o has: 9074 __jump_table 0003f2a0 01321fa8 2**0 With the patch, the same config gets: 9074 __jump_table 0002a0e0 01321fb4 2**0 Size is 258720 without the patch, 172256 with the patch. That's a 33% size reduction. Largely copied from commit c296146c058c ("arm64/kernel: jump_label: Switch to relative references") Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/jump_label.h | 21 ++--- arch/powerpc/kernel/jump_label.c | 4 ++-- 3 files changed, 9 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index d46db0bfb998..a52938c0f85b 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -182,6 +182,7 @@ config PPC select HAVE_ARCH_AUDITSYSCALL select HAVE_ARCH_HUGE_VMAP if PPC_BOOK3S_64 && PPC_RADIX_MMU select HAVE_ARCH_JUMP_LABEL + select HAVE_ARCH_JUMP_LABEL_RELATIVE select HAVE_ARCH_KASAN if PPC32 && PPC_PAGE_SHIFT <= 14 select HAVE_ARCH_KASAN_VMALLOC if PPC32 && PPC_PAGE_SHIFT <= 14 select HAVE_ARCH_KGDB diff --git a/arch/powerpc/include/asm/jump_label.h b/arch/powerpc/include/asm/jump_label.h index 09297ec9fa52..2d5c6bec2b4f 100644 --- a/arch/powerpc/include/asm/jump_label.h +++ b/arch/powerpc/include/asm/jump_label.h @@ -20,7 +20,8 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran asm_volatile_goto("1:\n\t" "nop # arch_static_branch\n\t" ".pushsection __jump_table, \"aw\"\n\t" -JUMP_ENTRY_TYPE "1b, %l[l_yes], %c0\n\t" +".long 1b - ., %l[l_yes] - .\n\t" +JUMP_ENTRY_TYPE "%c0 - .\n\t" ".popsection \n\t" : : "i" (&((char *)key)[branch]) : : l_yes); @@ -34,7 +35,8 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool asm_volatile_goto("1:\n\t" "b %l[l_yes] # arch_static_branch_jump\n\t" ".pushsection __jump_table, \"aw\"\n\t" -JUMP_ENTRY_TYPE "1b, %l[l_yes], %c0\n\t" +".long 1b - ., %l[l_yes] - .\n\t" +JUMP_ENTRY_TYPE "%c0 - .\n\t" 
".popsection \n\t" : : "i" (&((char *)key)[branch]) : : l_yes); @@ -43,23 +45,12 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool return true; } -#ifdef CONFIG_PPC64 -typedef u64 jump_label_t; -#else -typedef u32 jump_label_t; -#endif - -struct jump_entry { - jump_label_t code; - jump_label_t target; - jump_label_t key; -}; - #else #define ARCH_STATIC_BRANCH(LABEL, KEY) \ 1098: nop;\ .pushsection __jump_table, "aw";\ - FTR_ENTRY_LONG 1098b, LABEL, KEY; \ + .long 1098b - ., LABEL - .; \ + FTR_ENTRY_LONG KEY; \ .popsection #endif diff --git a/arch/powerpc/kernel/jump_label.c b/arch/powerpc/kernel/jump_label.c index 144858027fa3..ce87dc5ea23c 100644 --- a/arch/powerpc/kernel/jump_label.c +++ b/arch/powerpc/kernel/jump_label.c @@ -11,10 +11,10 @@ void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type type) { - struct ppc_inst *addr = (struct ppc_inst *)(unsigned long)entry->code; + struct ppc_inst *addr = (struct ppc_inst *)jump_entry_code(entry); if (type == JUMP_LABEL_JMP) - patch_branch(addr, entry->target, 0); + patch_branch(addr, jump_entry_target(entry), 0); else patch_instruction(addr, ppc_inst(PPC_INST_NOP)); } -- 2.25.0
Re: [PATCH next v1 2/3] printk: remove safe buffers
On Wed 2021-03-17 00:33:25, John Ogness wrote: > With @logbuf_lock removed, the high level printk functions for > storing messages are lockless. Messages can be stored from any > context, so there is no need for the NMI and safe buffers anymore. > Remove the NMI and safe buffers. > > Although the safe buffers are removed, the NMI and safe context > tracking is still in place. In these contexts, store the message > immediately but still use irq_work to defer the console printing. > > Since printk recursion tracking is in place, safe context tracking > for most of printk is not needed. Remove it. Only safe context > tracking relating to the console lock is left in place. This is > because the console lock is needed for the actual printing. I have two more questions after actually checking the entire patch. See below. > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -1084,7 +1069,6 @@ void __init setup_log_buf(int early) > struct printk_record r; > size_t new_descs_size; > size_t new_infos_size; > - unsigned long flags; > char *new_log_buf; > unsigned int free; > u64 seq; > @@ -1142,8 +1126,6 @@ void __init setup_log_buf(int early) >new_descs, ilog2(new_descs_count), >new_infos); > > - printk_safe_enter_irqsave(flags); > - > log_buf_len = new_log_buf_len; > log_buf = new_log_buf; > new_log_buf_len = 0; > @@ -1159,8 +1141,6 @@ void __init setup_log_buf(int early) >*/ > prb = &printk_rb_dynamic; > > - printk_safe_exit_irqrestore(flags); This will allow new messages to be added from the IRQ context while we are copying them to the new buffer. They might get lost in the small race window. Also the messages from NMI might get lost because they are no longer stored in the per-CPU buffer.
A possible solution might be to do something like this: prb_for_each_record(0, &printk_rb_static, seq, &r) free -= add_to_rb(&printk_rb_dynamic, &r); prb = &printk_rb_dynamic; /* * Copy the remaining messages that might have appeared * from IRQ or NMI context after we ended copying and * before we switched the buffers. They must be finalized * because only one CPU is up at this stage. */ prb_for_each_record(seq, &printk_rb_static, seq, &r) free -= add_to_rb(&printk_rb_dynamic, &r); > - > if (seq != prb_next_seq(&printk_rb_static)) { > pr_err("dropped %llu messages\n", > prb_next_seq(&printk_rb_static) - seq); > @@ -2666,7 +2631,6 @@ void console_unlock(void) > size_t ext_len = 0; > size_t len; > > - printk_safe_enter_irqsave(flags); > skip: > if (!prb_read_valid(prb, console_seq, &r)) > break; > @@ -2711,6 +2675,8 @@ void console_unlock(void) > printk_time); > console_seq++; > > + printk_safe_enter_irqsave(flags); What is the purpose of the printk_safe context here, please? I guess that you wanted to prevent calling console drivers recursively. But it is already serialized by console_lock(). IMHO, the only risk is when manipulating console_sem->lock or console_owner_lock. But they are already guarded by printk_safe context, for example, in console_lock() or console_lock_spinning_enable(). Do I miss something, please? > + > /* >* While actively printing out messages, if another printk() >* were to occur on another CPU, it may wait for this one to > @@ -2745,8 +2711,6 @@ void console_unlock(void) >* flush, no worries. >*/ > retry = prb_read_valid(prb, console_seq, NULL); > - printk_safe_exit_irqrestore(flags); > - > if (retry && console_trylock()) > goto again; > } Heh, all these patches feels like stripping printk of an armour. I hope that we trained it enough to be flexible and avoid any damage. Best Regards, Petr
Re: [PATCH 0/4] Rust for Linux for ppc64le
On Tue, Mar 23, 2021 at 1:16 PM Michael Ellerman wrote: > > It would be nice to be in the CI. I was building natively so I haven't > tried cross compiling yet (which we'll need for CI). Indeed -- in the CI we already cross-compile arm64 (and run under QEMU both arm64 as well as x86_64), so it is easy to add new ones to the matrix. > I can send a pull request if that's easiest. No worries, I will pick the patches. But, of course, feel free to join us in GitHub! :-) Cheers, Miguel
Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig
On Mon, Mar 22, 2021 at 04:33:14PM +0100, Christoph Hellwig wrote: > On Mon, Mar 22, 2021 at 04:18:23PM +0100, Christoph Hellwig wrote: > > On Mon, Mar 22, 2021 at 03:15:03PM +, Russell King - ARM Linux admin > > wrote: > > > It gets worse than that though - due to a change to remove > > > pcibios_min_io from the generic code, moving it into the ARM > > > architecture code, this has caused a regression that prevents the > > > legacy resources being registered against the bus resource. So even > > > if they are there, they cause probe failures. I haven't found a > > > reasonable way to solve this yet, but until there is, there is no > > > way that the PATA driver can be used as the "legacy mode" support > > > is effectively done via the PCI code assigning virtual IO port > > > resources. > > > > > > I'm quite surprised that the CY82C693 even works on Alpha - I've > > > asked for a lspci for that last week but nothing has yet been > > > forthcoming from whoever responded to your patch for Alpha - so I > > > can't compare what I'm seeing with what's happening with Alpha. > > > > That sounds like something we could fix with a quirk for function 2 > > in the PCI resource assignment code. Can you show what vendor and > > device ID function 2 has so that I could try to come up with one? > > Something like this: That solves the problem for the IDE driver, which knows how to deal with legacy mode, but not the PATA driver, which doesn't. The PATA driver needs these resources. As I say, having these resources presents a problem on ARM. 
A previous commit (3c5d1699887b) changed the way the bus resources are set up, which results in /proc/ioports containing:

0000-000f : dma1
0020-003f : pic1
0060-006f : i8042
0070-0073 : rtc_cmos
  0070-0073 : rtc0
0080-008f : dma low page
00a0-00bf : pic2
00c0-00df : dma2
0213-0213 : ISAPnP
02f8-02ff : serial8250.0
  02f8-02ff : serial
03c0-03df : vga+
03f8-03ff : serial8250.0
  03f8-03ff : serial
0480-048f : dma high page
0a79-0a79 : isapnp write
1000- : PCI0 I/O
  1000-107f : 0000:00:08.0
    1000-107f : 3c59x
  1080-108f : 0000:00:06.1
  1090-109f : 0000:00:07.0
    1090-109f : pata_it821x
  10a0-10a7 : 0000:00:07.0
    10a0-10a7 : pata_it821x
  10a8-10af : 0000:00:07.0
    10a8-10af : pata_it821x
  10b0-10b3 : 0000:00:07.0
    10b0-10b3 : pata_it821x
  10b4-10b7 : 0000:00:07.0
    10b4-10b7 : pata_it821x

The "PCI0 I/O" resource is the bus level resource, and the legacy resources cannot be claimed against that. Without these resources, the PATA cypress driver doesn't work. As I said previously, the reason this regression was not picked up earlier is because I don't upgrade the kernel on this machine very often; the machine has had uptimes into thousands of days. I need to try reverting Rob's commit to find out if anything breaks on this platform - it's completely wrong from a technical point of view for any case where we have a PCI southbridge, since the southbridge provides ISA based resources. I'm not entirely sure what the point of it was, since we still have the PCIBIOS_MIN_IO macro which still uses pcibios_min_io. I'm looking at some of the other changes Rob made back at that time which also look wrong, such as 8ef6e6201b26 which has the effect of locating the 21285 IO resources to PCI address 0, over the top of the ISA southbridge resources. I've no idea what Rob was thinking when he removed the csrio allocation code in that commit, but looking at it today, it's so obviously wrong even to a casual glance. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up.
Decent connectivity at last!
Re: [PATCH next v1 2/3] printk: remove safe buffers
On Mon 2021-03-22 22:58:47, John Ogness wrote: > On 2021-03-22, Petr Mladek wrote: > > On Mon 2021-03-22 12:16:15, John Ogness wrote: > >> On 2021-03-21, Sergey Senozhatsky wrote: > >> >> @@ -369,7 +70,10 @@ __printf(1, 0) int vprintk_func(const char *fmt, > >> >> va_list args) > >> >> * Use the main logbuf even in NMI. But avoid calling console > >> >> * drivers that might have their own locks. > >> >> */ > >> >> - if ((this_cpu_read(printk_context) & > >> >> PRINTK_NMI_DIRECT_CONTEXT_MASK)) { > >> >> + if (this_cpu_read(printk_context) & > >> >> + (PRINTK_NMI_DIRECT_CONTEXT_MASK | > >> >> +PRINTK_NMI_CONTEXT_MASK | > >> >> +PRINTK_SAFE_CONTEXT_MASK)) { > >> > > >> But I suppose I could switch > >> the 1 printk_nmi_direct_enter() user to printk_nmi_enter() so that > >> PRINTK_NMI_DIRECT_CONTEXT_MASK can be removed now. I would do this in a > >> 4th patch of the series. > > > > Yes, please unify the PRINTK_NMI_CONTEXT. One is enough. > > Agreed. (But I'll go even further. See below.) > > > I wonder if it would make sense to go even further at this stage. > > What is possible? > > > > 1. We could get rid of printk_nmi_enter()/exit() and > >PRINTK_NMI_CONTEXT completely already now. It is enough > >to check in_nmi() in printk_func(). > > > > Agreed. in_nmi() within vprintk_emit() is enough to detect if the > console code should be skipped: > > if (!in_sched && !in_nmi()) { > ... > } Well, we also need to make sure that the irq work is scheduled to call the console later. We should keep this decision in printk_func(). I mean to replace the current if (this_cpu_read(printk_context) & (PRINTK_NMI_DIRECT_CONTEXT_MASK | PRINTK_NMI_CONTEXT_MASK | PRINTK_SAFE_CONTEXT_MASK)) { with /* * Avoid calling console drivers in recursive printk() * and in NMI context. */ if (this_cpu_read(printk_context) || in_nmi()) { That said, I am not sure how this fits your further rework. I do not want to complicate it too much.
I am just afraid that the discussion about console rework might take some time. And this would remove some complexity before we started the more complicated or controversial changes. > > 2. I thought about unifying printk_safe_enter()/exit() and > >printk_enter()/exit(). They both count recursion with > >IRQs disabled, have similar name. But they are used > >different way. > > > >But better might be to rename printk_safe_enter()/exit() to > >console_enter()/exit() or to printk_deferred_enter()/exit(). > >It would make more clear what it does now. And it might help > >to better distinguish it from the new printk_enter()/exit(). > > > >I am not sure if it is worth it. > > I am also not sure if it is worth the extra "noise" just to give the > function a more appropriate name. The plan is to remove it completely > soon anyway. My vote is to leave the name as it is. OK, let's keep printk_safe() name. It was just an idea. I wrote it primary to sort my thoughts. Best Regards, Petr
Re: [PATCH v4 39/46] KVM: PPC: Book3S HV: Remove virt mode checks from real mode handlers
On 3/23/21 2:02 AM, Nicholas Piggin wrote: > Now that the P7/8 path no longer supports radix, real-mode handlers > do not need to deal with being called in virt mode. > > This change effectively reverts commit acde25726bc6 ("KVM: PPC: Book3S > HV: Add radix checks in real-mode hypercall handlers"). > > It removes a few more real-mode tests in rm hcall handlers, which also > allows the indirect ops for the xive module to be removed from the > built-in xics rm handlers. > > kvmppc_h_random is renamed to kvmppc_rm_h_random to be a bit more > descriptive of its function. > > Cc: Cédric Le Goater > Signed-off-by: Nicholas Piggin Reviewed-by: Cédric Le Goater > --- > arch/powerpc/include/asm/kvm_ppc.h | 10 +-- > arch/powerpc/kvm/book3s.c | 11 +-- > arch/powerpc/kvm/book3s_64_vio_hv.c | 12 > arch/powerpc/kvm/book3s_hv_builtin.c| 91 ++--- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +- > arch/powerpc/kvm/book3s_xive.c | 18 - > arch/powerpc/kvm/book3s_xive.h | 7 -- > arch/powerpc/kvm/book3s_xive_native.c | 10 --- > 8 files changed, 23 insertions(+), 138 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_ppc.h > b/arch/powerpc/include/asm/kvm_ppc.h > index db6646c2ade2..5dfb3f167f2c 100644 > --- a/arch/powerpc/include/asm/kvm_ppc.h > +++ b/arch/powerpc/include/asm/kvm_ppc.h > @@ -659,8 +659,6 @@ extern int kvmppc_xive_get_xive(struct kvm *kvm, u32 irq, > u32 *server, > u32 *priority); > extern int kvmppc_xive_int_on(struct kvm *kvm, u32 irq); > extern int kvmppc_xive_int_off(struct kvm *kvm, u32 irq); > -extern void kvmppc_xive_init_module(void); > -extern void kvmppc_xive_exit_module(void); > > extern int kvmppc_xive_connect_vcpu(struct kvm_device *dev, > struct kvm_vcpu *vcpu, u32 cpu); > @@ -686,8 +684,6 @@ static inline int kvmppc_xive_enabled(struct kvm_vcpu > *vcpu) > extern int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev, > struct kvm_vcpu *vcpu, u32 cpu); > extern void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu); > -extern void 
kvmppc_xive_native_init_module(void); > -extern void kvmppc_xive_native_exit_module(void); > extern int kvmppc_xive_native_get_vp(struct kvm_vcpu *vcpu, >union kvmppc_one_reg *val); > extern int kvmppc_xive_native_set_vp(struct kvm_vcpu *vcpu, > @@ -701,8 +697,6 @@ static inline int kvmppc_xive_get_xive(struct kvm *kvm, > u32 irq, u32 *server, > u32 *priority) { return -1; } > static inline int kvmppc_xive_int_on(struct kvm *kvm, u32 irq) { return -1; } > static inline int kvmppc_xive_int_off(struct kvm *kvm, u32 irq) { return -1; > } > -static inline void kvmppc_xive_init_module(void) { } > -static inline void kvmppc_xive_exit_module(void) { } > > static inline int kvmppc_xive_connect_vcpu(struct kvm_device *dev, > struct kvm_vcpu *vcpu, u32 cpu) { > return -EBUSY; } > @@ -725,8 +719,6 @@ static inline int kvmppc_xive_enabled(struct kvm_vcpu > *vcpu) > static inline int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev, > struct kvm_vcpu *vcpu, u32 cpu) { return -EBUSY; } > static inline void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu) { } > -static inline void kvmppc_xive_native_init_module(void) { } > -static inline void kvmppc_xive_native_exit_module(void) { } > static inline int kvmppc_xive_native_get_vp(struct kvm_vcpu *vcpu, > union kvmppc_one_reg *val) > { return 0; } > @@ -762,7 +754,7 @@ long kvmppc_rm_h_stuff_tce(struct kvm_vcpu *vcpu, > unsigned long tce_value, unsigned long npages); > long int kvmppc_rm_h_confer(struct kvm_vcpu *vcpu, int target, > unsigned int yield_count); > -long kvmppc_h_random(struct kvm_vcpu *vcpu); > +long kvmppc_rm_h_random(struct kvm_vcpu *vcpu); > void kvmhv_commence_exit(int trap); > void kvmppc_realmode_machine_check(struct kvm_vcpu *vcpu); > void kvmppc_subcore_enter_guest(void); > diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c > index 44bf567b6589..1888aedfd410 100644 > --- a/arch/powerpc/kvm/book3s.c > +++ b/arch/powerpc/kvm/book3s.c > @@ -1046,13 +1046,10 @@ static int 
kvmppc_book3s_init(void) > #ifdef CONFIG_KVM_XICS > #ifdef CONFIG_KVM_XIVE > if (xics_on_xive()) { > - kvmppc_xive_init_module(); > kvm_register_device_ops(&kvm_xive_ops, KVM_DEV_TYPE_XICS); > - if (kvmppc_xive_native_supported()) { > - kvmppc_xive_native_init_module(); > + if (kvmppc_xive_native_supported()) > kvm_register_device_ops(&kvm_xive_native_ops, > KVM_DEV_TYPE_XIVE); > - } > } else
Re: [PATCH v4 22/46] KVM: PPC: Book3S HV P9: Stop handling hcalls in real-mode in the P9 path
On 3/23/21 2:02 AM, Nicholas Piggin wrote: > In the interest of minimising the amount of code that is run in > "real-mode", don't handle hcalls in real mode in the P9 path. > > POWER8 and earlier are much more expensive to exit from HV real mode > and switch to host mode, because on those processors HV interrupts get > to the hypervisor with the MMU off, and the other threads in the core > need to be pulled out of the guest, and SLBs all need to be saved, > ERATs invalidated, and host SLB reloaded before the MMU is re-enabled > in host mode. Hash guests also require a lot of hcalls to run. The > XICS interrupt controller requires hcalls to run. > > By contrast, POWER9 has independent thread switching, and in radix mode > the hypervisor is already in a host virtual memory mode when the HV > interrupt is taken. Radix + xive guests don't need hcalls to handle > interrupts or manage translations. > > So it's much less important to handle hcalls in real mode in P9. > > Signed-off-by: Nicholas Piggin > --- > arch/powerpc/include/asm/kvm_ppc.h | 5 ++ > arch/powerpc/kvm/book3s_hv.c| 57 > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 5 ++ > arch/powerpc/kvm/book3s_xive.c | 70 + > 4 files changed, 127 insertions(+), 10 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_ppc.h > b/arch/powerpc/include/asm/kvm_ppc.h > index 73b1ca5a6471..db6646c2ade2 100644 > --- a/arch/powerpc/include/asm/kvm_ppc.h > +++ b/arch/powerpc/include/asm/kvm_ppc.h > @@ -607,6 +607,7 @@ extern void kvmppc_free_pimap(struct kvm *kvm); > extern int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall); > extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu); > extern int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd); > +extern int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req); > extern u64 kvmppc_xics_get_icp(struct kvm_vcpu *vcpu); > extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval); > extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev, > @@ -639,6 +640,8 
@@ static inline int kvmppc_xics_enabled(struct kvm_vcpu > *vcpu) > static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { } > static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd) > { return 0; } > +static inline int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req) > + { return 0; } > #endif > > #ifdef CONFIG_KVM_XIVE > @@ -673,6 +676,7 @@ extern int kvmppc_xive_set_irq(struct kvm *kvm, int > irq_source_id, u32 irq, > int level, bool line_status); > extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu); > extern void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu); > +extern void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu); > > static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu) > { > @@ -714,6 +718,7 @@ static inline int kvmppc_xive_set_irq(struct kvm *kvm, > int irq_source_id, u32 ir > int level, bool line_status) { return > -ENODEV; } > static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { } > static inline void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu) { } > +static inline void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu) { } > > static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu) > { return 0; } > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c > index fa7614c37e08..17739aaee3d8 100644 > --- a/arch/powerpc/kvm/book3s_hv.c > +++ b/arch/powerpc/kvm/book3s_hv.c > @@ -1142,12 +1142,13 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu) > } > > /* > - * Handle H_CEDE in the nested virtualization case where we haven't > - * called the real-mode hcall handlers in book3s_hv_rmhandlers.S. > + * Handle H_CEDE in the P9 path where we don't call the real-mode hcall > + * handlers in book3s_hv_rmhandlers.S. > + * > * This has to be done early, not in kvmppc_pseries_do_hcall(), so > * that the cede logic in kvmppc_run_single_vcpu() works properly. 
> */ > -static void kvmppc_nested_cede(struct kvm_vcpu *vcpu) > +static void kvmppc_cede(struct kvm_vcpu *vcpu) > { > vcpu->arch.shregs.msr |= MSR_EE; > vcpu->arch.ceded = 1; > @@ -1403,9 +1404,15 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu, > /* hcall - punt to userspace */ > int i; > > - /* hypercall with MSR_PR has already been handled in rmode, > - * and never reaches here. > - */ > + if (unlikely(vcpu->arch.shregs.msr & MSR_PR)) { > + /* > + * Guest userspace executed sc 1, reflect it back as a > + * privileged program check interrupt. > + */ > + kvmppc_core_queue_program(vcpu, SRR1_PROGPRIV); > + r = RESUME_GUEST; > + break; > + } > > run->papr
Re: [PATCH v11 0/6] KASAN for powerpc64 radix
Le 23/03/2021 à 02:21, Daniel Axtens a écrit : Hi Christophe, In the discussion we had a long time ago, https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20190806233827.16454-5-...@axtens.net/#2321067 , I challenged you on why it was not possible to implement things the same way as other architectures, in extenso with an early mapping. Your first answer was that too many things were done in real mode at startup. After some discussion you said that finally there were not that many things at startup but the issue was KVM. Now you say that instrumentation on KVM is fully disabled. So my question is, if KVM is not a problem anymore, why not go the standard way with an early shadow? Then you could also support inline instrumentation. Fair enough, I've had some trouble both understanding the problem myself and clearly articulating it. Let me try again. We need translations on to access the shadow area. We reach setup_64.c::early_setup() with translations off. At this point we don't know what MMU we're running under, or our CPU features. What do you need to know? Whether it is Hash or Radix, or more/different details? IIUC, today we only support KASAN on Radix. Would it make sense to say that a kernel built with KASAN can only run on processors having the Radix capability? Then select CONFIG_PPC_RADIX_MMU_DEFAULT when KASAN is set, and accept that the kernel crashes if Radix is not available? To determine our MMU and CPU features, early_setup() calls functions (dt_cpu_ftrs_init, early_init_devtree) that call out to generic code like of_scan_flat_dt. We need to do this before we turn on translations because we can't set up the MMU until we know what MMU we have. So this puts us in a bind: - We can't set up an early shadow until we have translations on, which requires that the MMU is set up. - We can't set up an MMU until we call out to generic code for FDT parsing. So there will be calls to generic FDT parsing code that happen before the early shadow is set up. 
I see some logic in kernel/prom_init.c for detecting the MMU. Can we get the information from there in order to set up the MMU? The setup code also prints a bunch of information about the platform with printk() while translations are off, so it wouldn't even be enough to disable instrumentation for bits of the generic DT code on ppc64. I'm sure the printk() stuff can be avoided or delayed without much trouble; I guess the main problem is the DT code, isn't it? As far as I can see the code only uses udbg_printf() before the MMU is on, and this could be simply skipped when KASAN is selected; I see no situation where you need early printk together with KASAN. Does that make sense? If you can figure out how to 'square the circle' here I'm all ears. Yes, it is a lot clearer now, thank you. I gave a few ideas above, do they help? Other notes: - There's a comment about printk() being 'safe' in early_setup(); that refers to having a valid PACA, it doesn't mean that it's safe in any other sense. - KVM does indeed also run stuff with translations off but we can catch all of that by disabling instrumentation on the real-mode handlers: it doesn't seem to leak out to generic code. So you are right that KVM is no longer an issue. Christophe
[PATCH] soc/fsl: qbman: fix conflicting alignment attributes
From: Arnd Bergmann When building with W=1, gcc points out that the __packed attribute on struct qm_eqcr_entry conflicts with the 8-byte alignment attribute on struct qm_fd inside it: drivers/soc/fsl/qbman/qman.c:189:1: error: alignment 1 of 'struct qm_eqcr_entry' is less than 8 [-Werror=packed-not-aligned] I assume that the alignment attribute is the correct one, and that qm_eqcr_entry cannot actually be unaligned in memory, so add the same alignment on the outer struct. Fixes: c535e923bb97 ("soc/fsl: Introduce DPAA 1.x QMan device driver") Signed-off-by: Arnd Bergmann --- drivers/soc/fsl/qbman/qman.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c index a1b9be1d105a..fde4edd83c14 100644 --- a/drivers/soc/fsl/qbman/qman.c +++ b/drivers/soc/fsl/qbman/qman.c @@ -186,7 +186,7 @@ struct qm_eqcr_entry { __be32 tag; struct qm_fd fd; u8 __reserved3[32]; -} __packed; +} __packed __aligned(8); #define QM_EQCR_VERB_VBIT 0x80 #define QM_EQCR_VERB_CMD_MASK 0x61/* but only one value; */ #define QM_EQCR_VERB_CMD_ENQUEUE 0x01 -- 2.29.2
Re: [PATCH 0/4] Rust for Linux for ppc64le
Miguel Ojeda writes: > Hi Michael, > > On Tue, Mar 23, 2021 at 4:27 AM Michael Ellerman wrote: >> >> Hi all, >> >> Here's a first attempt at getting the kernel Rust support building on >> powerpc. > > Thanks a *lot*! It is great to have more architectures rolling. No worries. >> It's powerpc64le only for now, as that's what I can easily test given the >> distros I have installed. Though powerpc and powerpc64 are also Tier 2 >> platforms > > Even if it is just 64-bit, it is very good to have it! > >> so in theory should work. Supporting those would require something more >> complicated than just pointing rustc at arch/$(ARCH)/rust/target.json. > > Yeah, the arch/$(ARCH)/rust/target.json dance is a placeholder -- I > need to figure out how to do that more cleanly, likely generating them > on the fly. Yeah that's a good idea. That way they can be made to exactly match the kernel configuration. >> This is based on 832575d934a2 from the Rust-for-Linux tree. Anything newer >> gives >> me errors about symbol name lengths. I figured I'd send this anyway, as it >> seems >> like those errors are probably not powerpc specific. > > Sure, feel free to send things even if they don't work completely. > > I will take a look at the symbol name lengths -- I increased that > limit to 512 and added support for 2-byte lengths in the tables, but > perhaps something is missing. If I manage to make it work, I can add > ppc64le to our CI! :-) It would be nice to be in the CI. I was building natively so I haven't tried cross compiling yet (which we'll need for CI). 
>> Michael Ellerman (4): >> rust: Export symbols in initialized data section >> rust: Add powerpc64 as a 64-bit target_arch in c_types.rs >> powerpc/rust: Add target.json for ppc64le >> rust: Enable for ppc64le > > Regarding the development process: at least until the RFC we are > working with the usual GitHub PR workflow (for several reasons: having > a quick CI setup, getting new Rust developers on-board, having a list > of "issues", cross-reference with the Rust repo, etc.). > > I can take patches from the list, of course, but since we are pre-RFC, > do you mind if they get rebased etc. through there? No I don't mind at all. I just sent patches so other ppc folks could see what I had, and it's kind of the process I'm used to. I can send a pull request if that's easiest. cheers
Re: [PATCH 0/2] handle premature return from H_JOIN in pseries mobility code
On Mon, 15 Mar 2021 03:00:43 -0500, Nathan Lynch wrote: > pseries VMs in shared processor mode are susceptible to failed > migrations because stray H_PRODs from the paravirt spinlock > implementation can bump threads out of joining state before the > suspend has occurred. Fix this by adding a small amount of shared > state and ordering accesses to it with respect to H_PROD and H_JOIN. > > Nathan Lynch (2): > powerpc/pseries/mobility: use struct for shared state > powerpc/pseries/mobility: handle premature return from H_JOIN > > [...] Applied to powerpc/fixes. [1/2] powerpc/pseries/mobility: use struct for shared state https://git.kernel.org/powerpc/c/e834df6cfc71d8e5ce2c27a0184145ea125c3f0f [2/2] powerpc/pseries/mobility: handle premature return from H_JOIN https://git.kernel.org/powerpc/c/274cb1ca2e7ce02cab56f5f4c61a74aeb566f931 cheers
Re: [PATCH v4 28/46] KVM: PPC: Book3S HV P9: Reduce irq_work vs guest decrementer races
Excerpts from Nicholas Piggin's message of March 23, 2021 8:36 pm: > Excerpts from Alexey Kardashevskiy's message of March 23, 2021 8:13 pm: >> >> >> On 23/03/2021 12:02, Nicholas Piggin wrote: >>> irq_work's use of the DEC SPR is racy with guest<->host switch and guest >>> entry which flips the DEC interrupt to guest, which could lose a host >>> work interrupt. >>> >>> This patch closes one race, and attempts to comment another class of >>> races. >>> >>> Signed-off-by: Nicholas Piggin >>> --- >>> arch/powerpc/kvm/book3s_hv.c | 15 ++- >>> 1 file changed, 14 insertions(+), 1 deletion(-) >>> >>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c >>> index 1f38a0abc611..989a1ff5ad11 100644 >>> --- a/arch/powerpc/kvm/book3s_hv.c >>> +++ b/arch/powerpc/kvm/book3s_hv.c >>> @@ -3745,6 +3745,18 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu >>> *vcpu, u64 time_limit, >>> if (!(vcpu->arch.ctrl & 1)) >>> mtspr(SPRN_CTRLT, mfspr(SPRN_CTRLF) & ~1); >>> >>> + /* >>> +* When setting DEC, we must always deal with irq_work_raise via NMI vs >>> +* setting DEC. The problem occurs right as we switch into guest mode >>> +* if a NMI hits and sets pending work and sets DEC, then that will >>> +* apply to the guest and not bring us back to the host. >>> +* >>> +* irq_work_raise could check a flag (or possibly LPCR[HDICE] for >>> +* example) and set HDEC to 1? That wouldn't solve the nested hv >>> +* case which needs to abort the hcall or zero the time limit. >>> +* >>> +* XXX: Another day's problem. 
>>> +*/ >>> mtspr(SPRN_DEC, vcpu->arch.dec_expires - tb); >>> >>> if (kvmhv_on_pseries()) { >>> @@ -3879,7 +3891,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu >>> *vcpu, u64 time_limit, >>> vc->entry_exit_map = 0x101; >>> vc->in_guest = 0; >>> >>> - mtspr(SPRN_DEC, local_paca->kvm_hstate.dec_expires - tb); >>> + set_dec_or_work(local_paca->kvm_hstate.dec_expires - tb); >> >> >> set_dec_or_work() will write local_paca->kvm_hstate.dec_expires - tb - 1 >> to SPRN_DEC which is not exactly the same, is this still alright? >> >> I asked in v3 but it is probably lost :) > > Oh I did see that then forgot. > > It will write dec_expires - tb, then it will write 1 if it found irq_work > was pending. Ah, you were actually asking about set_dec writing val - 1. I totally missed that. Yes, that was an unintentional change. It is the way the timer.c code works with respect to the decrementers_next_tb value, so it seems like it should be okay (and it's better to bring the KVM code up to match the timer code rather than have them differ, or to change things the other way around). The difference should be noted in the changelog though. Thanks, Nick
Re: [PATCH v4 28/46] KVM: PPC: Book3S HV P9: Reduce irq_work vs guest decrementer races
Excerpts from Alexey Kardashevskiy's message of March 23, 2021 8:13 pm: > > > On 23/03/2021 12:02, Nicholas Piggin wrote: >> irq_work's use of the DEC SPR is racy with guest<->host switch and guest >> entry which flips the DEC interrupt to guest, which could lose a host >> work interrupt. >> >> This patch closes one race, and attempts to comment another class of >> races. >> >> Signed-off-by: Nicholas Piggin >> --- >> arch/powerpc/kvm/book3s_hv.c | 15 ++- >> 1 file changed, 14 insertions(+), 1 deletion(-) >> >> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c >> index 1f38a0abc611..989a1ff5ad11 100644 >> --- a/arch/powerpc/kvm/book3s_hv.c >> +++ b/arch/powerpc/kvm/book3s_hv.c >> @@ -3745,6 +3745,18 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu >> *vcpu, u64 time_limit, >> if (!(vcpu->arch.ctrl & 1)) >> mtspr(SPRN_CTRLT, mfspr(SPRN_CTRLF) & ~1); >> >> +/* >> + * When setting DEC, we must always deal with irq_work_raise via NMI vs >> + * setting DEC. The problem occurs right as we switch into guest mode >> + * if a NMI hits and sets pending work and sets DEC, then that will >> + * apply to the guest and not bring us back to the host. >> + * >> + * irq_work_raise could check a flag (or possibly LPCR[HDICE] for >> + * example) and set HDEC to 1? That wouldn't solve the nested hv >> + * case which needs to abort the hcall or zero the time limit. >> + * >> + * XXX: Another day's problem. >> + */ >> mtspr(SPRN_DEC, vcpu->arch.dec_expires - tb); >> >> if (kvmhv_on_pseries()) { >> @@ -3879,7 +3891,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, >> u64 time_limit, >> vc->entry_exit_map = 0x101; >> vc->in_guest = 0; >> >> -mtspr(SPRN_DEC, local_paca->kvm_hstate.dec_expires - tb); >> +set_dec_or_work(local_paca->kvm_hstate.dec_expires - tb); > > > set_dec_or_work() will write local_paca->kvm_hstate.dec_expires - tb - 1 > to SPRN_DEC which is not exactly the same, is this still alright? 
> > I asked in v3 but it is probably lost :) Oh I did see that then forgot. It will write dec_expires - tb, then it will write 1 if it found irq_work was pending. The change is intentional, to fix one of the lost irq_work races. Thanks, Nick
[PATCH] sound:ppc: fix spelling typo of values
From: caizhichao vaules -> values Signed-off-by: caizhichao --- sound/ppc/snd_ps3_reg.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/ppc/snd_ps3_reg.h b/sound/ppc/snd_ps3_reg.h index 566a318..e2212b7 100644 --- a/sound/ppc/snd_ps3_reg.h +++ b/sound/ppc/snd_ps3_reg.h @@ -308,7 +308,7 @@ each interrupt in this register. Writing 1b to a field containing 1b clears field and de-asserts interrupt. Writing 0b to a field has no effect. -Field vaules are the following: +Field values are the following: 0 - Interrupt hasn't occurred. 1 - Interrupt has occurred. -- 1.9.1
Re: [PATCH v4 28/46] KVM: PPC: Book3S HV P9: Reduce irq_work vs guest decrementer races
On 23/03/2021 12:02, Nicholas Piggin wrote: irq_work's use of the DEC SPR is racy with guest<->host switch and guest entry which flips the DEC interrupt to guest, which could lose a host work interrupt. This patch closes one race, and attempts to comment another class of races. Signed-off-by: Nicholas Piggin --- arch/powerpc/kvm/book3s_hv.c | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 1f38a0abc611..989a1ff5ad11 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -3745,6 +3745,18 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit, if (!(vcpu->arch.ctrl & 1)) mtspr(SPRN_CTRLT, mfspr(SPRN_CTRLF) & ~1); + /* +* When setting DEC, we must always deal with irq_work_raise via NMI vs +* setting DEC. The problem occurs right as we switch into guest mode +* if a NMI hits and sets pending work and sets DEC, then that will +* apply to the guest and not bring us back to the host. +* +* irq_work_raise could check a flag (or possibly LPCR[HDICE] for +* example) and set HDEC to 1? That wouldn't solve the nested hv +* case which needs to abort the hcall or zero the time limit. +* +* XXX: Another day's problem. +*/ mtspr(SPRN_DEC, vcpu->arch.dec_expires - tb); if (kvmhv_on_pseries()) { @@ -3879,7 +3891,8 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit, vc->entry_exit_map = 0x101; vc->in_guest = 0; - mtspr(SPRN_DEC, local_paca->kvm_hstate.dec_expires - tb); + set_dec_or_work(local_paca->kvm_hstate.dec_expires - tb); set_dec_or_work() will write local_paca->kvm_hstate.dec_expires - tb - 1 to SPRN_DEC which is not exactly the same, is this still alright? I asked in v3 but it is probably lost :) + mtspr(SPRN_SPRG_VDSO_WRITE, local_paca->sprg_vdso); kvmhv_load_host_pmu(); -- Alexey
Re: [PATCH 0/4] Rust for Linux for ppc64le
Hi Michael, On Tue, Mar 23, 2021 at 4:27 AM Michael Ellerman wrote: > > Hi all, > > Here's a first attempt at getting the kernel Rust support building on powerpc. Thanks a *lot*! It is great to have more architectures rolling. > It's powerpc64le only for now, as that's what I can easily test given the > distros I have installed. Though powerpc and powerpc64 are also Tier 2 > platforms Even if it is just 64-bit, it is very good to have it! > so in theory should work. Supporting those would require something more > complicated than just pointing rustc at arch/$(ARCH)/rust/target.json. Yeah, the arch/$(ARCH)/rust/target.json dance is a placeholder -- I need to figure out how to do that more cleanly, likely generating them on the fly. > This is based on 832575d934a2 from the Rust-for-Linux tree. Anything newer > gives > me errors about symbol name lengths. I figured I'd send this anyway, as it > seems > like those errors are probably not powerpc specific. Sure, feel free to send things even if they don't work completely. I will take a look at the symbol name lengths -- I increased that limit to 512 and added support for 2-byte lengths in the tables, but perhaps something is missing. If I manage to make it work, I can add ppc64le to our CI! :-) > Michael Ellerman (4): > rust: Export symbols in initialized data section > rust: Add powerpc64 as a 64-bit target_arch in c_types.rs > powerpc/rust: Add target.json for ppc64le > rust: Enable for ppc64le Regarding the development process: at least until the RFC we are working with the usual GitHub PR workflow (for several reasons: having a quick CI setup, getting new Rust developers on-board, having a list of "issues", cross-reference with the Rust repo, etc.). I can take patches from the list, of course, but since we are pre-RFC, do you mind if they get rebased etc. through there? Thanks again! Cheers, Miguel
Re: [PATCH v4 22/46] KVM: PPC: Book3S HV P9: Stop handling hcalls in real-mode in the P9 path
Excerpts from Alexey Kardashevskiy's message of March 23, 2021 7:24 pm: > > > On 23/03/2021 20:16, Nicholas Piggin wrote: >> Excerpts from Alexey Kardashevskiy's message of March 23, 2021 7:02 pm: >>> >>> >>> On 23/03/2021 12:02, Nicholas Piggin wrote: diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index c11597f815e4..2d0d14ed1d92 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -1397,9 +1397,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) mr r4,r9 bge fast_guest_return 2: + /* If we came in through the P9 short path, no real mode hcalls */ + lwz r0, STACK_SLOT_SHORT_PATH(r1) + cmpwi r0, 0 + bne no_try_real >>> >>> >>> btw is mmu on at this point? or it gets enabled by rfid at the end of >>> guest_exit_short_path? >> >> Hash guest it's off. Radix guest it can be on or off depending on the >> interrupt type and MSR and LPCR[AIL] values. > > What I meant was - what do we expect here on p9? mmu on? ^w^w^w^w^w^w^w^w^w P9 radix can be on or off. If the guest had MSR[IR] or MSR[DR] clear, or if the guest is running AIL=0 mode, or if this is a machine check, system reset, or HMI interrupt then the MMU will be off here. > I just realized - it is radix so there is no problem with vmalloc > addresses in real mode as these do not use top 2 bits as on hash and the > exact mmu state is less important here. Cheers. We still can't use vmalloc addresses in real mode on radix because they don't translate with the page tables. Thanks, Nick
Re: [PATCH 3/4] powerpc/rust: Add target.json for ppc64le
On Tue, Mar 23, 2021 at 4:27 AM Michael Ellerman wrote: > > ppc64le only for now. We'll eventually need to come up with some way to > change the target.json that's used based on more than just $(ARCH). Indeed, it is one reason I didn't tackle e.g. x86 32-bit, because I wanted to figure out how to do the whole `target.json` cleanly (i.e. likely have a script generate them on the fly), so I thought it was better to wait post-RFC. Cheers, Miguel
Re: [PATCH v4 22/46] KVM: PPC: Book3S HV P9: Stop handling hcalls in real-mode in the P9 path
On 23/03/2021 20:16, Nicholas Piggin wrote: Excerpts from Alexey Kardashevskiy's message of March 23, 2021 7:02 pm: On 23/03/2021 12:02, Nicholas Piggin wrote: In the interest of minimising the amount of code that is run in "real-mode", don't handle hcalls in real mode in the P9 path. POWER8 and earlier are much more expensive to exit from HV real mode and switch to host mode, because on those processors HV interrupts get to the hypervisor with the MMU off, and the other threads in the core need to be pulled out of the guest, and SLBs all need to be saved, ERATs invalidated, and host SLB reloaded before the MMU is re-enabled in host mode. Hash guests also require a lot of hcalls to run. The XICS interrupt controller requires hcalls to run. By contrast, POWER9 has independent thread switching, and in radix mode the hypervisor is already in a host virtual memory mode when the HV interrupt is taken. Radix + xive guests don't need hcalls to handle interrupts or manage translations. So it's much less important to handle hcalls in real mode in P9. 
Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/kvm_ppc.h | 5 ++ arch/powerpc/kvm/book3s_hv.c| 57 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 5 ++ arch/powerpc/kvm/book3s_xive.c | 70 + 4 files changed, 127 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 73b1ca5a6471..db6646c2ade2 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -607,6 +607,7 @@ extern void kvmppc_free_pimap(struct kvm *kvm); extern int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall); extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu); extern int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd); +extern int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req); extern u64 kvmppc_xics_get_icp(struct kvm_vcpu *vcpu); extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval); extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev, @@ -639,6 +640,8 @@ static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu) static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { } static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd) { return 0; } +static inline int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req) + { return 0; } #endif #ifdef CONFIG_KVM_XIVE @@ -673,6 +676,7 @@ extern int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level, bool line_status); extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu); extern void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu); +extern void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu); static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu) { @@ -714,6 +718,7 @@ static inline int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 ir int level, bool line_status) { return -ENODEV; } static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { } static inline void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu) { } +static inline void 
kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu) { } static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu) { return 0; } diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index fa7614c37e08..17739aaee3d8 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -1142,12 +1142,13 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu) } /* - * Handle H_CEDE in the nested virtualization case where we haven't - * called the real-mode hcall handlers in book3s_hv_rmhandlers.S. + * Handle H_CEDE in the P9 path where we don't call the real-mode hcall + * handlers in book3s_hv_rmhandlers.S. + * * This has to be done early, not in kvmppc_pseries_do_hcall(), so * that the cede logic in kvmppc_run_single_vcpu() works properly. */ -static void kvmppc_nested_cede(struct kvm_vcpu *vcpu) +static void kvmppc_cede(struct kvm_vcpu *vcpu) { vcpu->arch.shregs.msr |= MSR_EE; vcpu->arch.ceded = 1; @@ -1403,9 +1404,15 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu, /* hcall - punt to userspace */ int i; - /* hypercall with MSR_PR has already been handled in rmode, -* and never reaches here. -*/ + if (unlikely(vcpu->arch.shregs.msr & MSR_PR)) { + /* +* Guest userspace executed sc 1, reflect it back as a +* privileged program check interrupt. +*/ + kvmppc_core_queue_program(vcpu, SRR1_PROGPRIV); + r = RESUME_GUEST; + break; + } run->
Re: [PATCH v4 22/46] KVM: PPC: Book3S HV P9: Stop handling hcalls in real-mode in the P9 path
Excerpts from Alexey Kardashevskiy's message of March 23, 2021 7:02 pm: > > > On 23/03/2021 12:02, Nicholas Piggin wrote: >> In the interest of minimising the amount of code that is run in >> "real-mode", don't handle hcalls in real mode in the P9 path. >> >> POWER8 and earlier are much more expensive to exit from HV real mode >> and switch to host mode, because on those processors HV interrupts get >> to the hypervisor with the MMU off, and the other threads in the core >> need to be pulled out of the guest, and SLBs all need to be saved, >> ERATs invalidated, and host SLB reloaded before the MMU is re-enabled >> in host mode. Hash guests also require a lot of hcalls to run. The >> XICS interrupt controller requires hcalls to run. >> >> By contrast, POWER9 has independent thread switching, and in radix mode >> the hypervisor is already in a host virtual memory mode when the HV >> interrupt is taken. Radix + xive guests don't need hcalls to handle >> interrupts or manage translations. >> >> So it's much less important to handle hcalls in real mode in P9. 
>> >> Signed-off-by: Nicholas Piggin >> --- >> arch/powerpc/include/asm/kvm_ppc.h | 5 ++ >> arch/powerpc/kvm/book3s_hv.c| 57 >> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 5 ++ >> arch/powerpc/kvm/book3s_xive.c | 70 + >> 4 files changed, 127 insertions(+), 10 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/kvm_ppc.h >> b/arch/powerpc/include/asm/kvm_ppc.h >> index 73b1ca5a6471..db6646c2ade2 100644 >> --- a/arch/powerpc/include/asm/kvm_ppc.h >> +++ b/arch/powerpc/include/asm/kvm_ppc.h >> @@ -607,6 +607,7 @@ extern void kvmppc_free_pimap(struct kvm *kvm); >> extern int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall); >> extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu); >> extern int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd); >> +extern int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req); >> extern u64 kvmppc_xics_get_icp(struct kvm_vcpu *vcpu); >> extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval); >> extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev, >> @@ -639,6 +640,8 @@ static inline int kvmppc_xics_enabled(struct kvm_vcpu >> *vcpu) >> static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { } >> static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd) >> { return 0; } >> +static inline int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req) >> +{ return 0; } >> #endif >> >> #ifdef CONFIG_KVM_XIVE >> @@ -673,6 +676,7 @@ extern int kvmppc_xive_set_irq(struct kvm *kvm, int >> irq_source_id, u32 irq, >> int level, bool line_status); >> extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu); >> extern void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu); >> +extern void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu); >> >> static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu) >> { >> @@ -714,6 +718,7 @@ static inline int kvmppc_xive_set_irq(struct kvm *kvm, >> int irq_source_id, u32 ir >>int level, bool line_status) { return >> -ENODEV; } >> static inline void 
kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { } >> static inline void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu) { } >> +static inline void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu) { } >> >> static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu) >> { return 0; } >> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c >> index fa7614c37e08..17739aaee3d8 100644 >> --- a/arch/powerpc/kvm/book3s_hv.c >> +++ b/arch/powerpc/kvm/book3s_hv.c >> @@ -1142,12 +1142,13 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu) >> } >> >> /* >> - * Handle H_CEDE in the nested virtualization case where we haven't >> - * called the real-mode hcall handlers in book3s_hv_rmhandlers.S. >> + * Handle H_CEDE in the P9 path where we don't call the real-mode hcall >> + * handlers in book3s_hv_rmhandlers.S. >> + * >>* This has to be done early, not in kvmppc_pseries_do_hcall(), so >>* that the cede logic in kvmppc_run_single_vcpu() works properly. >>*/ >> -static void kvmppc_nested_cede(struct kvm_vcpu *vcpu) >> +static void kvmppc_cede(struct kvm_vcpu *vcpu) >> { >> vcpu->arch.shregs.msr |= MSR_EE; >> vcpu->arch.ceded = 1; >> @@ -1403,9 +1404,15 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu >> *vcpu, >> /* hcall - punt to userspace */ >> int i; >> >> -/* hypercall with MSR_PR has already been handled in rmode, >> - * and never reaches here. >> - */ >> +if (unlikely(vcpu->arch.shregs.msr & MSR_PR)) { >> +/* >> + * Guest userspace executed sc 1, reflect it back as a >> + * privileged program check interrupt. >> +
[PATCH v2 -next] powerpc: kernel/time.c - cleanup warnings
We found these warnings in arch/powerpc/kernel/time.c as follows: warning: symbol 'decrementer_max' was not declared. Should it be static? warning: symbol 'rtc_lock' was not declared. Should it be static? warning: symbol 'dtl_consumer' was not declared. Should it be static? Declare 'decrementer_max' and 'rtc_lock' in powerpc asm/time.h. Rename 'rtc_lock' in drivers/rtc/rtc-vr41xx.c to 'vr41xx_rtc_lock' to avoid the conflict with the variable in powerpc asm/time.h. Move 'dtl_consumer' definition behind "include " because it is declared there. Reported-by: Hulk Robot Signed-off-by: He Ying --- v2: - Instead of including linux/mc146818rtc.h in powerpc kernel/time.c, declare rtc_lock in powerpc asm/time.h. arch/powerpc/include/asm/time.h | 3 +++ arch/powerpc/kernel/time.c | 6 ++ drivers/rtc/rtc-vr41xx.c| 22 +++--- 3 files changed, 16 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h index 8dd3cdb25338..64a3ef0b4270 100644 --- a/arch/powerpc/include/asm/time.h +++ b/arch/powerpc/include/asm/time.h @@ -12,6 +12,7 @@ #ifdef __KERNEL__ #include #include +#include #include #include @@ -22,6 +23,8 @@ extern unsigned long tb_ticks_per_jiffy; extern unsigned long tb_ticks_per_usec; extern unsigned long tb_ticks_per_sec; extern struct clock_event_device decrementer_clockevent; +extern u64 decrementer_max; +extern spinlock_t rtc_lock; extern void generic_calibrate_decr(void); diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index b67d93a609a2..60b6ac7d3685 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -150,10 +150,6 @@ bool tb_invalid; u64 __cputime_usec_factor; EXPORT_SYMBOL(__cputime_usec_factor); -#ifdef CONFIG_PPC_SPLPAR -void (*dtl_consumer)(struct dtl_entry *, u64); -#endif - static void calc_cputime_factors(void) { struct div_result res; @@ -179,6 +175,8 @@ static inline unsigned long read_spurr(unsigned long tb) #include +void (*dtl_consumer)(struct dtl_entry *, 
u64); + /* * Scan the dispatch trace log and count up the stolen time. * Should be called with interrupts disabled. diff --git a/drivers/rtc/rtc-vr41xx.c b/drivers/rtc/rtc-vr41xx.c index 5a9f9ad86d32..cc31db058197 100644 --- a/drivers/rtc/rtc-vr41xx.c +++ b/drivers/rtc/rtc-vr41xx.c @@ -72,7 +72,7 @@ static void __iomem *rtc2_base; static unsigned long epoch = 1970; /* Jan 1 1970 00:00:00 */ -static DEFINE_SPINLOCK(rtc_lock); +static DEFINE_SPINLOCK(vr41xx_rtc_lock); static char rtc_name[] = "RTC"; static unsigned long periodic_count; static unsigned int alarm_enabled; @@ -101,13 +101,13 @@ static inline time64_t read_elapsed_second(void) static inline void write_elapsed_second(time64_t sec) { - spin_lock_irq(&rtc_lock); + spin_lock_irq(&vr41xx_rtc_lock); rtc1_write(ETIMELREG, (uint16_t)(sec << 15)); rtc1_write(ETIMEMREG, (uint16_t)(sec >> 1)); rtc1_write(ETIMEHREG, (uint16_t)(sec >> 17)); - spin_unlock_irq(&rtc_lock); + spin_unlock_irq(&vr41xx_rtc_lock); } static int vr41xx_rtc_read_time(struct device *dev, struct rtc_time *time) @@ -139,14 +139,14 @@ static int vr41xx_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *wkalrm) unsigned long low, mid, high; struct rtc_time *time = &wkalrm->time; - spin_lock_irq(&rtc_lock); + spin_lock_irq(&vr41xx_rtc_lock); low = rtc1_read(ECMPLREG); mid = rtc1_read(ECMPMREG); high = rtc1_read(ECMPHREG); wkalrm->enabled = alarm_enabled; - spin_unlock_irq(&rtc_lock); + spin_unlock_irq(&vr41xx_rtc_lock); rtc_time64_to_tm((high << 17) | (mid << 1) | (low >> 15), time); @@ -159,7 +159,7 @@ static int vr41xx_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *wkalrm) alarm_sec = rtc_tm_to_time64(&wkalrm->time); - spin_lock_irq(&rtc_lock); + spin_lock_irq(&vr41xx_rtc_lock); if (alarm_enabled) disable_irq(aie_irq); @@ -173,7 +173,7 @@ static int vr41xx_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *wkalrm) alarm_enabled = wkalrm->enabled; - spin_unlock_irq(&rtc_lock); + spin_unlock_irq(&vr41xx_rtc_lock); return 0; } @@ -202,7 
+202,7 @@ static int vr41xx_rtc_ioctl(struct device *dev, unsigned int cmd, unsigned long static int vr41xx_rtc_alarm_irq_enable(struct device *dev, unsigned int enabled) { - spin_lock_irq(&rtc_lock); + spin_lock_irq(&vr41xx_rtc_lock); if (enabled) { if (!alarm_enabled) { enable_irq(aie_irq); @@ -214,7 +214,7 @@ static int vr41xx_rtc_alarm_irq_enable(struct device *dev, unsigned int enabled) alarm_enabled = 0; } } - spin_unlock_irq(&rtc_lock); + spin_unlock_irq(&vr41xx_rtc_lock); return 0; } @@ -296,7 +296,7 @@ static
Re: [PATCH v4 22/46] KVM: PPC: Book3S HV P9: Stop handling hcalls in real-mode in the P9 path
On 23/03/2021 12:02, Nicholas Piggin wrote:

In the interest of minimising the amount of code that is run in
"real-mode", don't handle hcalls in real mode in the P9 path.

POWER8 and earlier are much more expensive to exit from HV real mode
and switch to host mode, because on those processors HV interrupts get
to the hypervisor with the MMU off, and the other threads in the core
need to be pulled out of the guest, and SLBs all need to be saved,
ERATs invalidated, and host SLB reloaded before the MMU is re-enabled
in host mode. Hash guests also require a lot of hcalls to run. The
XICS interrupt controller requires hcalls to run.

By contrast, POWER9 has independent thread switching, and in radix mode
the hypervisor is already in a host virtual memory mode when the HV
interrupt is taken. Radix + xive guests don't need hcalls to handle
interrupts or manage translations.

So it's much less important to handle hcalls in real mode in P9.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/kvm_ppc.h      |  5 ++
 arch/powerpc/kvm/book3s_hv.c            | 57 
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |  5 ++
 arch/powerpc/kvm/book3s_xive.c          | 70 +
 4 files changed, 127 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 73b1ca5a6471..db6646c2ade2 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -607,6 +607,7 @@ extern void kvmppc_free_pimap(struct kvm *kvm);
 extern int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall);
 extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu);
 extern int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd);
+extern int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req);
 extern u64 kvmppc_xics_get_icp(struct kvm_vcpu *vcpu);
 extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval);
 extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev,
@@ -639,6 +640,8 @@ static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu)
 static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { }
 static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd)
 	{ return 0; }
+static inline int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req)
+	{ return 0; }
 #endif

 #ifdef CONFIG_KVM_XIVE
@@ -673,6 +676,7 @@ extern int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 irq,
 			       int level, bool line_status);
 extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu);
 extern void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu);
+extern void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu);

 static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
 {
@@ -714,6 +718,7 @@ static inline int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 irq,
 				      int level, bool line_status) { return -ENODEV; }
 static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { }
 static inline void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu) { }
+static inline void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu) { }

 static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu) { return 0; }

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index fa7614c37e08..17739aaee3d8 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1142,12 +1142,13 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
 }

 /*
- * Handle H_CEDE in the nested virtualization case where we haven't
- * called the real-mode hcall handlers in book3s_hv_rmhandlers.S.
+ * Handle H_CEDE in the P9 path where we don't call the real-mode hcall
+ * handlers in book3s_hv_rmhandlers.S.
+ *
  * This has to be done early, not in kvmppc_pseries_do_hcall(), so
  * that the cede logic in kvmppc_run_single_vcpu() works properly.
  */
-static void kvmppc_nested_cede(struct kvm_vcpu *vcpu)
+static void kvmppc_cede(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.shregs.msr |= MSR_EE;
 	vcpu->arch.ceded = 1;
@@ -1403,9 +1404,15 @@ static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
 		/* hcall - punt to userspace */
 		int i;

-		/* hypercall with MSR_PR has already been handled in rmode,
-		 * and never reaches here.
-		 */
+		if (unlikely(vcpu->arch.shregs.msr & MSR_PR)) {
+			/*
+			 * Guest userspace executed sc 1, reflect it back as a
+			 * privileged program check interrupt.
+			 */
+			kvmppc_core_queue_program(vcpu, SRR1_PROGPRIV);
+			r = RESUME_GUEST;
+			break;
+		}
 		run->papr_hcall.nr = kvmppc_get_gpr(vcpu, 3);
 		for (i = 0; i < 9; ++i)
@@ -3663,6 +3670,12 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vc
Re: [PATCH v4 04/46] KVM: PPC: Book3S HV: Prevent radix guests from setting LPCR[TC]
On 23/03/2021 12:02, Nicholas Piggin wrote:

This bit only applies to hash partitions.

Signed-off-by: Nicholas Piggin

Reviewed-by: Alexey Kardashevskiy

---
 arch/powerpc/kvm/book3s_hv.c        | 6 ++
 arch/powerpc/kvm/book3s_hv_nested.c | 3 +--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index c5de7e3f22b6..1ffb0902e779 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1645,6 +1645,12 @@ static int kvm_arch_vcpu_ioctl_set_sregs_hv(struct kvm_vcpu *vcpu,
  */
 unsigned long kvmppc_filter_lpcr_hv(struct kvmppc_vcore *vc, unsigned long lpcr)
 {
+	struct kvm *kvm = vc->kvm;
+
+	/* LPCR_TC only applies to HPT guests */
+	if (kvm_is_radix(kvm))
+		lpcr &= ~LPCR_TC;
+
 	/* On POWER8 and above, userspace can modify AIL */
 	if (!cpu_has_feature(CPU_FTR_ARCH_207S))
 		lpcr &= ~LPCR_AIL;
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index f7b441b3eb17..851e3f527eb2 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -140,8 +140,7 @@ static void sanitise_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr)
 	/*
 	 * Don't let L1 change LPCR bits for the L2 except these:
 	 */
-	mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD |
-		LPCR_LPES | LPCR_MER;
+	mask = LPCR_DPFD | LPCR_ILE | LPCR_AIL | LPCR_LD | LPCR_LPES | LPCR_MER;
 	hr->lpcr = kvmppc_filter_lpcr_hv(vc, (vc->lpcr & ~mask) | (hr->lpcr & mask));

--
Alexey
Re: [RFC Qemu PATCH v2 1/2] spapr: drc: Add support for async hcalls at the drc level
Hi David,

Sorry about the delay.

On 2/8/21 11:51 AM, David Gibson wrote:

On Tue, Jan 19, 2021 at 12:40:31PM +0530, Shivaprasad G Bhat wrote:

Thanks for the comments!

On 12/28/20 2:08 PM, David Gibson wrote:

On Mon, Dec 21, 2020 at 01:08:53PM +0100, Greg Kurz wrote:
...
The overall idea looks good but I think you should consider using a
thread pool to implement it. See below.

I am not convinced, however. Specifically, attaching this to the DRC
doesn't make sense to me. We're adding exactly one DRC related async
hcall, and I can't really see much call for another one. We could have
other async hcalls - indeed we already have one for HPT resizing - but
attaching this to DRCs doesn't help for those.

The semantics of the hcall made me think, if this is going to be
re-usable for future if implemented at DRC level.

It would only be re-usable for operations that are actually connected
to DRCs. It doesn't seem to me particularly likely that we'll ever have
more asynchronous hcalls that are also associated with DRCs.

Okay

Other option is to move the async-hcall-state/list into the NVDIMMState
structure in include/hw/mem/nvdimm.h and handle it with
machine->nvdimms_state at a global level.

I'm ok with either of two options:

A) Implement this ad-hoc for this specific case, making whatever
simplifications you can based on this specific case.

I am simplifying it to nvdimm use-case alone and limiting the scope.

B) Implement a general mechanism for async hcalls that is *not* tied
to DRCs. Then use that for the existing H_RESIZE_HPT_PREPARE call as
well as this new one.

Hope you are okay with using the pool based approach that Greg

Honestly a thread pool seems like it might be overkill for this
application.

I think it's appropriate here as that is what is being done by
virtio-pmem too for flush requests. The aio infrastructure simplifies a
lot of the thread handling usage. Please suggest if you think there are
better ways.

I am sending the next version addressing all the comments from you and Greg.

Thanks,
Shivaprasad
Re: [PATCH 1/1] powerpc/iommu: Enable remaining IOMMU Pagesizes present in LoPAR
On 23/03/2021 06:09, Leonardo Bras wrote:

According to LoPAR, ibm,query-pe-dma-window output named "IO Page Sizes"
will let the OS know all possible pagesizes that can be used for creating
a new DDW.

Currently Linux will only try using 3 of the 8 available options:
4K, 64K and 16M. According to LoPAR, Hypervisor may also offer 32M, 64M,
128M, 256M and 16G.

Enabling bigger pages would be interesting for direct mapping systems
with a lot of RAM, while using less TCE entries.

Signed-off-by: Leonardo Bras
---
 arch/powerpc/include/asm/iommu.h       |  8 
 arch/powerpc/platforms/pseries/iommu.c | 28 +++---
 2 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index deef7c94d7b6..c170048b7a1b 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -19,6 +19,14 @@
 #include
 #include

+#define IOMMU_PAGE_SHIFT_16G	34
+#define IOMMU_PAGE_SHIFT_256M	28
+#define IOMMU_PAGE_SHIFT_128M	27
+#define IOMMU_PAGE_SHIFT_64M	26
+#define IOMMU_PAGE_SHIFT_32M	25
+#define IOMMU_PAGE_SHIFT_16M	24
+#define IOMMU_PAGE_SHIFT_64K	16

These are not very descriptive, these are just normal shifts, could be as
simple as __builtin_ctz(SZ_4K) (gcc will optimize this) and so on.

OTOH the PAPR page sizes need macros as they are the ones which are weird
and screaming for macros. I'd steal/rework spapr_page_mask_to_query_mask()
from QEMU.

Thanks,

+
 #define IOMMU_PAGE_SHIFT_4K 12
 #define IOMMU_PAGE_SIZE_4K (ASM_CONST(1) << IOMMU_PAGE_SHIFT_4K)
 #define IOMMU_PAGE_MASK_4K (~((1 << IOMMU_PAGE_SHIFT_4K) - 1))
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 9fc5217f0c8e..02958e80aa91 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1099,6 +1099,24 @@ static void reset_dma_window(struct pci_dev *dev, struct device_node *par_dn)
 		ret);
 }

+/* Returns page shift based on "IO Page Sizes" output at ibm,query-pe-dma-window. See LoPAR */
+static int iommu_get_page_shift(u32 query_page_size)
+{
+	const int shift[] = {IOMMU_PAGE_SHIFT_4K, IOMMU_PAGE_SHIFT_64K, IOMMU_PAGE_SHIFT_16M,
+			     IOMMU_PAGE_SHIFT_32M, IOMMU_PAGE_SHIFT_64M, IOMMU_PAGE_SHIFT_128M,
+			     IOMMU_PAGE_SHIFT_256M, IOMMU_PAGE_SHIFT_16G};
+	int i = ARRAY_SIZE(shift) - 1;
+
+	/* Looks for the largest page size supported */
+	for (; i >= 0; i--) {
+		if (query_page_size & (1 << i))
+			return shift[i];
+	}
+
+	/* No valid page size found. */
+	return 0;
+}
+
 /*
  * If the PE supports dynamic dma windows, and there is space for a table
  * that can map all pages in a linear offset, then setup such a table,
@@ -1206,13 +1224,9 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn)
 			goto out_failed;
 		}
 	}
-	if (query.page_size & 4) {
-		page_shift = 24;	/* 16MB */
-	} else if (query.page_size & 2) {
-		page_shift = 16;	/* 64kB */
-	} else if (query.page_size & 1) {
-		page_shift = 12;	/* 4kB */
-	} else {
+
+	page_shift = iommu_get_page_shift(query.page_size);
+	if (!page_shift) {
 		dev_dbg(&dev->dev, "no supported direct page size in mask %x",
 			query.page_size);
 		goto out_failed;

--
Alexey
Re: [PATCH v3 19/41] KVM: PPC: Book3S HV P9: Stop handling hcalls in real-mode in the P9 path
On 3/22/21 7:22 PM, Nicholas Piggin wrote:
> Excerpts from Cédric Le Goater's message of March 23, 2021 2:01 am:
>> On 3/22/21 2:15 PM, Nicholas Piggin wrote:
>>> Excerpts from Alexey Kardashevskiy's message of March 22, 2021 5:30 pm:
>>>> On 06/03/2021 02:06, Nicholas Piggin wrote:
>>>>> In the interest of minimising the amount of code that is run in
>>>>> "real-mode", don't handle hcalls in real mode in the P9 path.
>>>>>
>>>>> POWER8 and earlier are much more expensive to exit from HV real mode
>>>>> and switch to host mode, because on those processors HV interrupts get
>>>>> to the hypervisor with the MMU off, and the other threads in the core
>>>>> need to be pulled out of the guest, and SLBs all need to be saved,
>>>>> ERATs invalidated, and host SLB reloaded before the MMU is re-enabled
>>>>> in host mode. Hash guests also require a lot of hcalls to run. The
>>>>> XICS interrupt controller requires hcalls to run.
>>>>>
>>>>> By contrast, POWER9 has independent thread switching, and in radix mode
>>>>> the hypervisor is already in a host virtual memory mode when the HV
>>>>> interrupt is taken. Radix + xive guests don't need hcalls to handle
>>>>> interrupts or manage translations.
>>
>> Do we need to handle the host-is-a-P9-without-xive case ?
>
> I'm not sure really. Is there an intention for OPAL to be able to
> provide a fallback layer in the worst case?

yes. OPAL has a XICS-on-XIVE emulation for P9, implemented for bringup,
and it still boots, XICS guest can run. P10 doesn't have it though.

> Maybe microwatt grows HV capability before XIVE?

I don't know if we should develop the same XIVE logic for microwatt. It's
awfully complex and we have the XICS interface which works already.

>>>>> So it's much less important to handle hcalls in real mode in P9.
>>>>
>>>> So acde25726bc6034b (which added if(kvm_is_radix(vcpu->kvm))return
>>>> H_TOO_HARD) can be reverted, pretty much?
>>>
>>> Yes. Although that calls attention to the fact I missed doing
>>> a P9 h_random handler in this patch. I'll fix that, then I think
>>> acde2572 could be reverted entirely.
>>>
>>> [...]
>>>
>>>>> } else {
>>>>> 	kvmppc_xive_push_vcpu(vcpu);
>>>>> 	trap = kvmhv_load_hv_regs_and_go(vcpu, time_limit, lpcr);
>>>>> -	kvmppc_xive_pull_vcpu(vcpu);
>>>>> +	/* H_CEDE has to be handled now, not later */
>>>>> +	/* XICS hcalls must be handled before xive is pulled */
>>>>> +	if (trap == BOOK3S_INTERRUPT_SYSCALL &&
>>>>> +	    !(vcpu->arch.shregs.msr & MSR_PR)) {
>>>>> +		unsigned long req = kvmppc_get_gpr(vcpu, 3);
>>>>>
>>>>> +		if (req == H_CEDE) {
>>>>> +			kvmppc_cede(vcpu);
>>>>> +			kvmppc_xive_cede_vcpu(vcpu); /* may un-cede */
>>>>> +			kvmppc_set_gpr(vcpu, 3, 0);
>>>>> +			trap = 0;
>>>>> +		}
>>>>> +		if (req == H_EOI || req == H_CPPR ||
>>>>
>>>> else if (req == H_EOI ... ?
>>>
>>> Hummm, sure.
>>
>> you could integrate the H_CEDE in the switch statement below.
>
> Below is in a different file just for the emulation calls.
>
>>>
>>> [...]
>>>
>>>>> +void kvmppc_xive_cede_vcpu(struct kvm_vcpu *vcpu)
>>>>> +{
>>>>> +	void __iomem *esc_vaddr = (void __iomem *)vcpu->arch.xive_esc_vaddr;
>>>>> +
>>>>> +	if (!esc_vaddr)
>>>>> +		return;
>>>>> +
>>>>> +	/* we are using XIVE with single escalation */
>>>>> +
>>>>> +	if (vcpu->arch.xive_esc_on) {
>>>>> +		/*
>>>>> +		 * If we still have a pending escalation, abort the cede,
>>>>> +		 * and we must set PQ to 10 rather than 00 so that we don't
>>>>> +		 * potentially end up with two entries for the escalation
>>>>> +		 * interrupt in the XIVE interrupt queue. In that case
>>>>> +		 * we also don't want to set xive_esc_on to 1 here in
>>>>> +		 * case we race with xive_esc_irq().
>>>>> +		 */
>>>>> +		vcpu->arch.ceded = 0;
>>>>> +		/*
>>>>> +		 * The escalation interrupts are special as we don't EOI them.
>>>>> +		 * There is no need to use the load-after-store ordering offset
>>>>> +		 * to set PQ to 10 as we won't use StoreEOI.
>>>>> +		 */
>>>>> +		__raw_readq(esc_vaddr + XIVE_ESB_SET_PQ_10);
>>>>> +	} else {
>>>>> +		vcpu->arch.xive_esc_on = true;
>>>>> +		mb();
>>>>> +		__raw_readq(esc_vaddr + XIVE_ESB_SET_PQ_00);
>>>>> +	}
>>>>> +	mb();
>>>>
>>>> Uff. Thanks for cut-n-pasting the comments, helped a lot to match this
>>>> c to that asm!
>>>
>>> Glad it helped.
>>>
>>>>> +}
>>
>> I had to do the PowerNV models in QEMU to start understanding that stuff ...
>>
>>>>> +EXPORT_SYMBOL_GPL(kvmppc_xive_cede_vcpu);
>>>>> +
>>>>> /*
>>>>>  * This is a simple trigger for a generic XIVE IRQ. This must
>>>>>  * only be called for interrupts that support a trigger page
>>>>> @@ -2106,6 +214