Re: [PATCH v2 1/2] clocksource: arm_arch_timer: Use stable count reader in erratum sne

2020-12-03 Thread zhukeqian
Hi Marc,

On 2020/12/3 22:58, Marc Zyngier wrote:
> On 2020-08-18 04:28, Keqian Zhu wrote:
>> In commit 0ea415390cd3 ("clocksource/arm_arch_timer: Use arch_timer_read_counter
>> to access stable counters"), we separated the stable and normal count readers
>> to omit unnecessary overhead on systems that have no timer erratum.
>>
>> However, in erratum_set_next_event_tval_generic(), the count reader used is the
>> normal reader. Convert it to the stable reader.
>>
>> Fixes: 0ea415390cd3 ("clocksource/arm_arch_timer: Use
>>arch_timer_read_counter to access stable counters")
> 
> On a single line.
Addressed. Thanks.

> 
>> Signed-off-by: Keqian Zhu 
>> ---
>>  drivers/clocksource/arm_arch_timer.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
>> index 6c3e841..777d38c 100644
>> --- a/drivers/clocksource/arm_arch_timer.c
>> +++ b/drivers/clocksource/arm_arch_timer.c
>> @@ -396,10 +396,10 @@ static void erratum_set_next_event_tval_generic(const int access, unsigned long
>>  ctrl &= ~ARCH_TIMER_CTRL_IT_MASK;
>>
>>  if (access == ARCH_TIMER_PHYS_ACCESS) {
>> -cval = evt + arch_counter_get_cntpct();
>> +cval = evt + arch_counter_get_cntpct_stable();
>>  write_sysreg(cval, cntp_cval_el0);
>>  } else {
>> -cval = evt + arch_counter_get_cntvct();
>> +cval = evt + arch_counter_get_cntvct_stable();
>>  write_sysreg(cval, cntv_cval_el0);
>>  }
> 
> With that fixed:
> 
> Acked-by: Marc Zyngier 
> 
> This should go via the clocksource tree.
Added Cc to its maintainers, thanks.

> 
> Thanks,
> 
> M.
Cheers,
Keqian


[PATCH v3 2/2] clocksource: arm_arch_timer: Correct fault programming of CNTKCTL_EL1.EVNTI

2020-12-03 Thread Keqian Zhu
The ARM virtual counter supports an event stream: it can only trigger an event
when the trigger bit (selected by the value of CNTKCTL_EL1.EVNTI) of CNTVCT_EL0
changes, so the actual period of the event stream is 2^(cntkctl_evnti + 1)
counter cycles. For example, when the trigger bit is 0, the virtual counter
triggers an event every two cycles.
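
[ Illustration, not part of the patch: the same divider computation as a
  small stand-alone C program. The 50 MHz counter rate and the 10 kHz
  event-stream target are assumed example values. ]

	#include <stdio.h>

	int main(void)
	{
		unsigned int rate = 50000000;     /* assumed counter rate (Hz) */
		unsigned int stream_freq = 10000; /* assumed target stream freq (Hz) */
		int evt_stream_div, lsb;

		/*
		 * An event fires when bit EVNTI of the counter toggles,
		 * i.e. every 2^(EVNTI + 1) cycles, so the divider must be
		 * computed against half the counter frequency.
		 */
		evt_stream_div = rate / stream_freq / 2;   /* 2500 */

		/*
		 * Closest power of two: take the last set bit, round up
		 * when the next lower bit is also set (here it is not).
		 */
		lsb = 31 - __builtin_clz(evt_stream_div);  /* 11 */
		if (lsb > 0 && (evt_stream_div & (1 << (lsb - 1))))
			lsb++;

		/* EVNTI = 11 -> period 2^12 = 4096 cycles, ~12.2 kHz */
		printf("EVNTI = %d, period = 2^%d cycles\n", lsb, lsb + 1);
		return 0;
	}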

Fixes: 037f637767a8 ("drivers: clocksource: add support for ARM architected timer event stream")
Suggested-by: Marc Zyngier 
Signed-off-by: Keqian Zhu 
---
 drivers/clocksource/arm_arch_timer.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 777d38cb39b0..d0177824c518 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -822,15 +822,24 @@ static void arch_timer_evtstrm_enable(int divider)
 
 static void arch_timer_configure_evtstream(void)
 {
-   int evt_stream_div, pos;
+   int evt_stream_div, lsb;
+
+   /*
+* As the event stream can at most be generated at half the frequency
+* of the counter, use half the frequency when computing the divider.
+*/
+   evt_stream_div = arch_timer_rate / ARCH_TIMER_EVT_STREAM_FREQ / 2;
+
+   /*
+* Find the closest power of two to the divisor. If the adjacent bit
+* of lsb (last set bit, starts from 0) is set, then we use (lsb + 1).
+*/
+   lsb = fls(evt_stream_div) - 1;
+   if (lsb > 0 && (evt_stream_div & BIT(lsb - 1)))
+   lsb++;
 
-   /* Find the closest power of two to the divisor */
-   evt_stream_div = arch_timer_rate / ARCH_TIMER_EVT_STREAM_FREQ;
-   pos = fls(evt_stream_div);
-   if (pos > 1 && !(evt_stream_div & (1 << (pos - 2))))
-   pos--;
/* enable event stream */
-   arch_timer_evtstrm_enable(min(pos, 15));
+   arch_timer_evtstrm_enable(max(0, min(lsb, 15)));
 }
 
 static void arch_counter_set_user_access(void)
-- 
2.23.0



[PATCH v3 1/2] clocksource: arm_arch_timer: Use stable count reader in erratum sne

2020-12-03 Thread Keqian Zhu
In commit 0ea415390cd3 ("clocksource/arm_arch_timer: Use arch_timer_read_counter
to access stable counters"), we separated the stable and normal count readers to
omit unnecessary overhead on systems that have no timer erratum.

However, in erratum_set_next_event_tval_generic(), the count reader used is the
normal reader. Convert it to the stable reader.
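
[ For context -- a simplified sketch of the two reader flavours, not the
  driver's literal code; the per-CPU workaround variable and the hook name
  follow the driver's naming. ]

	/* Plain reader: raw sysreg access, may be unstable on broken timers. */
	static u64 arch_counter_get_cntvct(void)
	{
		return read_sysreg(cntvct_el0);
	}

	/* Stable reader: use the erratum-safe hook when one is registered. */
	static u64 arch_counter_get_cntvct_stable(void)
	{
		const struct arch_timer_erratum_workaround *wa =
			__this_cpu_read(timer_unstable_counter_workaround);

		if (wa && wa->read_cntvct_el0)
			return wa->read_cntvct_el0();

		return read_sysreg(cntvct_el0);
	}

An erratum code path that uses the plain reader therefore re-introduces the
very instability the workaround exists to hide.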

Fixes: 0ea415390cd3 ("clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable counters")
Acked-by: Marc Zyngier 
Signed-off-by: Keqian Zhu 
---
 drivers/clocksource/arm_arch_timer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 6c3e84180146..777d38cb39b0 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -396,10 +396,10 @@ static void erratum_set_next_event_tval_generic(const int access, unsigned long
ctrl &= ~ARCH_TIMER_CTRL_IT_MASK;
 
if (access == ARCH_TIMER_PHYS_ACCESS) {
-   cval = evt + arch_counter_get_cntpct();
+   cval = evt + arch_counter_get_cntpct_stable();
write_sysreg(cval, cntp_cval_el0);
} else {
-   cval = evt + arch_counter_get_cntvct();
+   cval = evt + arch_counter_get_cntvct_stable();
write_sysreg(cval, cntv_cval_el0);
}
 
-- 
2.23.0



[PATCH v3 0/2] clocksource: arm_arch_timer: Some fixes

2020-12-03 Thread Keqian Zhu
change log:

v3:
 - Address Marc's comments.
 - Rework arch_timer_configure_evtstream.

v2:
 - Do not revert commit 0ea415390cd3, fix it instead.
 - Correct the tags of second patch.

Keqian Zhu (2):
  clocksource: arm_arch_timer: Use stable count reader in erratum sne
  clocksource: arm_arch_timer: Correct fault programming of
CNTKCTL_EL1.EVNTI

 drivers/clocksource/arm_arch_timer.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

-- 
2.23.0



Re: [PATCH v2 2/2] clocksource: arm_arch_timer: Correct fault programming of CNTKCTL_EL1.EVNTI

2020-12-03 Thread zhukeqian
Hi Marc,

On 2020/12/3 22:57, Marc Zyngier wrote:
> On 2020-08-18 04:28, Keqian Zhu wrote:
>> ARM virtual counter supports event stream, it can only trigger an event
>> when the trigger bit (the value of CNTKCTL_EL1.EVNTI) of CNTVCT_EL0 changes,
>> so the actual period of event stream is 2^(cntkctl_evnti + 1). For example,
>> when the trigger bit is 0, then virtual counter trigger an event for every
>> two cycles.
>>
>> Fixes: 037f637767a8 ("drivers: clocksource: add support for
>>ARM architected timer event stream")
> 
> Fixes: tags should be on a single line.
Will do.

> 
>> Suggested-by: Marc Zyngier 
>> Signed-off-by: Keqian Zhu 
>> ---
>>  drivers/clocksource/arm_arch_timer.c | 10 +++++++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
>> index 777d38c..e3b2ee0 100644
>> --- a/drivers/clocksource/arm_arch_timer.c
>> +++ b/drivers/clocksource/arm_arch_timer.c
>> @@ -824,10 +824,14 @@ static void arch_timer_configure_evtstream(void)
>>  {
>>  int evt_stream_div, pos;
>>
>> -/* Find the closest power of two to the divisor */
>> -evt_stream_div = arch_timer_rate / ARCH_TIMER_EVT_STREAM_FREQ;
>> +/*
>> + * Find the closest power of two to the divisor. As the event
>> + * stream can at most be generated at half the frequency of the
>> + * counter, use half the frequency when computing the divider.
>> + */
>> +evt_stream_div = arch_timer_rate / ARCH_TIMER_EVT_STREAM_FREQ / 2;
>>  pos = fls(evt_stream_div);
>> -if (pos > 1 && !(evt_stream_div & (1 << (pos - 2))))
>> +if ((pos == 1) || (pos > 1 && !(evt_stream_div & (1 << (pos - 2)))))
>>  pos--;
> 
> You don't explain why you are special-casing pos == 1.
The logic is not clear here; I will try to rework it.

> 
>>  /* enable event stream */
>>  arch_timer_evtstrm_enable(min(pos, 15));
> 
> Also, please Cc the subsystem maintainers:
> 
> CLOCKSOURCE, CLOCKEVENT DRIVERS
> M:  Daniel Lezcano 
> M:  Thomas Gleixner 
> L:  linux-ker...@vger.kernel.org
> S:  Supported
> T:  git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/core
> F:  Documentation/devicetree/bindings/timer/
> F:  drivers/clocksource/
> 
Will do.

> Thanks,
> 
> M.

Thanks,
Keqian


Re: [PATCH v2 1/2] KVM: arm64: Some fixes of PV-time interface document

2020-12-03 Thread zhukeqian



On 2020/12/3 23:04, Marc Zyngier wrote:
> On 2020-08-17 12:07, Keqian Zhu wrote:
>> Rename PV_FEATURES to PV_TIME_FEATURES.
>>
>> Signed-off-by: Keqian Zhu 
>> Reviewed-by: Andrew Jones 
>> Reviewed-by: Steven Price 
>> ---
>>  Documentation/virt/kvm/arm/pvtime.rst | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/virt/kvm/arm/pvtime.rst b/Documentation/virt/kvm/arm/pvtime.rst
>> index 687b60d..94bffe2 100644
>> --- a/Documentation/virt/kvm/arm/pvtime.rst
>> +++ b/Documentation/virt/kvm/arm/pvtime.rst
>> @@ -3,7 +3,7 @@
>>  Paravirtualized time support for arm64
>>  ======================================
>>
>> -Arm specification DEN0057/A defines a standard for paravirtualised time
>> +Arm specification DEN0057/A defines a standard for paravirtualized time
>>  support for AArch64 guests:
> 
> nit: I do object to this change (some of us are British! ;-).
Oh, I will pay attention to this. Thanks!

Keqian
> 
> M.


Re: [PATCH v4 00/26] Opt-in always-on nVHE hypervisor

2020-12-03 Thread Marc Zyngier
On Wed, 2 Dec 2020 18:40:56 +, David Brazdil wrote:
> As we progress towards being able to keep guest state private to the
> host running nVHE hypervisor, this series allows the hypervisor to
> install itself on newly booted CPUs before the host is allowed to run
> on them.
> 
> All functionality described below is opt-in, guarded by an early param
> 'kvm-arm.mode=protected'. Future patches specific to the new protected
> mode should be hidden behind the same param.
> 
> [...]

Applied to kvm-arm64/psci-relay, thanks!

Note that although I pushed it to -next, I still need people to
eyeball it and give it some Acks. The commit-IDs below will
thus change as I apply tags, if any.

[01/26] KVM: arm64: Add kvm-arm.mode early kernel parameter
commit: 2d4b0ce5c9b62398522b4b078cfd2cc0fa3fb604
[02/26] KVM: arm64: Add ARM64_KVM_PROTECTED_MODE CPU capability
commit: 44e88d43c442adcebebec5b9e23f260a03a25120
[03/26] psci: Support psci_ops.get_version for v0.1
commit: 47e4000e4f6ea4496accf7e3e68c29f38ca4e179
[04/26] psci: Split functions to v0.1 and v0.2+ variants
commit: 1fbb7db86fb5f1cd7a2c9ec9c477acb67ac986a7
[05/26] psci: Replace psci_function_id array with a struct
commit: c801a91084f382ab8f9707bd33e6ccb7012e1e50
[06/26] psci: Add accessor for psci_0_1_function_ids
commit: 26c9988c7330b2225ba39cae9de43b0bfff57e2a
[07/26] arm64: Make cpu_logical_map() take unsigned int
commit: 2346f8b8ea0bb140d67ba6f06b67aec06e238dde
[08/26] arm64: Extract parts of el2_setup into a macro
commit: 9c322020286c60fbdd97f6a8c41362be5f4f8bb9
[09/26] KVM: arm64: Remove vector_ptr param of hyp-init
commit: 1db5bd14716029c8859551e9c38fe76818959b7b
[10/26] KVM: arm64: Move hyp-init params to a per-CPU struct
commit: 4a836c1e69dbeb14f69d554e1fe36d2e619d94fc
[11/26] KVM: arm64: Init MAIR/TCR_EL2 from params struct
commit: 5e664b8539c396dbceaccb6bef2a9ed48964906a
[12/26] KVM: arm64: Add .hyp.data..ro_after_init ELF section
commit: 89f3705ca070900a127f181ce724aa6c1e9c9479
[13/26] KVM: arm64: Support per_cpu_ptr in nVHE hyp code
commit: 2091f4271a400169d8fa8004bf743aa815c3c5d4
[14/26] KVM: arm64: Create nVHE copy of cpu_logical_map
commit: 626aa81e14f9d723fe91fdb5c1030f73f929d0ad
[15/26] KVM: arm64: Add SMC handler in nVHE EL2
commit: 0ec63d737071f483ab6fc63e2d9b59d0d4cc59fd
[16/26] KVM: arm64: Bootstrap PSCI SMC handler in nVHE EL2
commit: 5988416e2234db36b80c510c1ae99a6de0c1431d
[17/26] KVM: arm64: Add offset for hyp VA <-> PA conversion
commit: bf9dc203286ce42de948dbb0d3fdaea51e2ab37f
[18/26] KVM: arm64: Forward safe PSCI SMCs coming from host
commit: 0e11d688605f1772098add3a755503688db2d06f
[19/26] KVM: arm64: Extract __do_hyp_init into a helper function
commit: 294f71ad53625f75531dd43d775efc3507cd9b0a
[20/26] KVM: arm64: Add function to enter host from KVM nVHE hyp code
commit: cb9773719fc405e8cc2041cd457fcd8655863a78
[21/26] KVM: arm64: Intercept host's CPU_ON SMCs
commit: 6ed1b8bd3c623d4e0e4441a2a73dbda162e3ebe7
[22/26] KVM: arm64: Intercept host's CPU_SUSPEND PSCI SMCs
commit: 5f51e7f65258cea36833c793625f4fb6d0e38426
[23/26] KVM: arm64: Intercept host's SYSTEM_SUSPEND PSCI SMCs
commit: dfa751cfd54b3f9ac1d89050cf0ad6c6bc3a9dc5
[24/26] KVM: arm64: Keep nVHE EL2 vector installed
commit: 0c8078f56aa99ab4350d9ae3dabd3504d2f11fbd
[25/26] KVM: arm64: Trap host SMCs in protected mode
commit: 4e3e6c3acb741a9692e0b772e92368fee85dced8
[26/26] KVM: arm64: Fix EL2 mode availability checks
commit: 5e7953174eb1966d4cdc70caf3708afc8c4dd5f9

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH v2 0/2] KVM: arm64: Some fixes and code adjustments for pvtime ST

2020-12-03 Thread Marc Zyngier
On Mon, 17 Aug 2020 19:07:26 +0800, Keqian Zhu wrote:
> While picking up pvtime LPT support for arm64, I made some trivial fixes for
> pvtime ST.
> 
> change log:
> 
> v2:
>  - Add Andrew's and Steven's R-b.
>  - Correct commit message of the first patch.
>  - Drop the second patch.
> 
> [...]

Applied to kvm-arm64/misc-5.11, thanks!

[1/2] KVM: arm64: Some fixes of PV-time interface document
  commit: 94558543213ae8c83be5d01b83c1fe7530e8a1a0
[2/2] KVM: arm64: Use kvm_write_guest_lock when init stolen time
  commit: 652d0b701d136ede6bc8a977b3abbe2d420226b9

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




RE: [PATCH v13 07/15] iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs

2020-12-03 Thread Shameerali Kolothum Thodi
Hi Eric,

> -Original Message-
> From: kvmarm-boun...@lists.cs.columbia.edu
> [mailto:kvmarm-boun...@lists.cs.columbia.edu] On Behalf Of Auger Eric
> Sent: 01 December 2020 13:59
> To: wangxingang 
> Cc: Xieyingtai ; jean-phili...@linaro.org;
> k...@vger.kernel.org; m...@kernel.org; j...@8bytes.org; w...@kernel.org;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> vivek.gau...@arm.com; alex.william...@redhat.com;
> zhangfei@linaro.org; robin.mur...@arm.com;
> kvmarm@lists.cs.columbia.edu; eric.auger@gmail.com
> Subject: Re: [PATCH v13 07/15] iommu/smmuv3: Allow stage 1 invalidation with
> unmanaged ASIDs
> 
> Hi Xingang,
> 
> On 12/1/20 2:33 PM, Xingang Wang wrote:
> > Hi Eric
> >
> > On  Wed, 18 Nov 2020 12:21:43, Eric Auger wrote:
> >> @@ -1710,7 +1710,11 @@ static void arm_smmu_tlb_inv_context(void
> *cookie)
> >> * insertion to guarantee those are observed before the TLBI. Do be
> >> * careful, 007.
> >> */
> >> -  if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
> >> +  if (ext_asid >= 0) { /* guest stage 1 invalidation */
> >> +  cmd.opcode  = CMDQ_OP_TLBI_NH_ASID;
> >> +  cmd.tlbi.asid   = ext_asid;
> >> +  cmd.tlbi.vmid   = smmu_domain->s2_cfg.vmid;
> >> +  } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
> >
> > Found a problem here, the cmd for guest stage 1 invalidation is built,
> > but it is not delivered to smmu.
> >
> 
> Thank you for the report. I will fix that soon. With that fixed, have
> you been able to run vSVA on top of the series. Do you need other stuff
> to be fixed at SMMU level? 

I am seeing another issue with this series. This is when you have the vSMMU
in non-strict mode (iommu.strict=0). Any network pass-through dev with an iperf
run is enough to reproduce the issue. It may randomly stop/hang.

It looks like the .flush_iotlb_all from guest is not propagated down to the host
correctly. I have a temp hack to fix this in Qemu wherein CMDQ_OP_TLBI_NH_ASID
will result in a CACHE_INVALIDATE with IOMMU_INV_GRANU_PASID flag and archid
set.
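
[ For illustration, the shape of that hack on the QEMU side, using the
  cache invalidation UAPI from this series. Only IOMMU_INV_GRANU_PASID,
  archid and CMDQ_OP_TLBI_NH_ASID are taken from the text above; the
  surrounding structure and field names are assumptions from the UAPI of
  that era, and the helper itself is hypothetical. ]

	#include <stdint.h>
	#include <linux/iommu.h>

	/* Sketch: build the invalidation to pass down to the host when the
	 * vSMMU traps CMDQ_OP_TLBI_NH_ASID from the guest, instead of
	 * dropping it. */
	static struct iommu_cache_invalidate_info
	tlbi_nh_asid_to_inv(uint32_t asid)
	{
		struct iommu_cache_invalidate_info inv = {
			.version     = IOMMU_CACHE_INVALIDATE_INFO_VERSION_1,
			.cache       = IOMMU_CACHE_INV_TYPE_IOTLB,
			.granularity = IOMMU_INV_GRANU_PASID,
		};

		/* Per-ASID invalidation: archid carries the guest ASID. */
		inv.granu.pasid_info.flags  = IOMMU_INV_PASID_FLAGS_ARCHID;
		inv.granu.pasid_info.archid = asid;

		return inv;
	}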

Please take a look and let me know. 

> As I am going to respin soon, please let me
> know what is the best branch to rebase to alleviate your integration.

Please find the latest kernel and Qemu branch with vSVA support added here,

https://github.com/hisilicon/kernel-dev/tree/5.10-rc4-2stage-v13-vsva
https://github.com/hisilicon/qemu/tree/v5.2.0-rc1-2stage-rfcv7-vsva

I have done some basic minimum vSVA tests on a HiSilicon D06 board with
a zip dev that supports STALL. All looks good so far apart from the issues
that have been already reported/discussed.

The kernel branch is actually a rebase of sva/uacce related patches from a
Linaro branch here,

https://github.com/Linaro/linux-kernel-uadk/tree/uacce-devel-5.10

I think going forward it will be good (if possible) to respin your series on
top of an SVA branch with STALL/PRI support added.

Hi Jean/zhangfei,
Is it possible to have a branch with minimum required SVA/UACCE related patches
that are already public and can be a "stable" candidate for future respin of 
Eric's series?
Please share your thoughts.

Thanks,
Shameer 

> Best Regards
> 
> Eric
> 


Re: [RFC PATCH 16/27] KVM: arm64: Prepare Hyp memory protection

2020-12-03 Thread Fuad Tabba
Hi Quentin,

On Tue, Nov 17, 2020 at 6:17 PM 'Quentin Perret' via kernel-team
 wrote:
>
> When memory protection is enabled, the Hyp code needs the ability to
> create and manage its own page-table. To do so, introduce a new set of
> hypercalls to initialize Hyp memory protection.
>
> During the init hcall, the hypervisor runs with the host-provided
> page-table and uses the trivial early page allocator to create its own
> set of page-tables, using a memory pool that was donated by the host.
> Specifically, the hypervisor creates its own mappings for __hyp_text,
> the Hyp memory pool, the __hyp_bss, the portion of hyp_vmemmap
> corresponding to the Hyp pool, among other things. It then jumps back in
> the idmap page, switches to use the newly-created pgd (instead of the
> temporary one provided by the host) and then installs the full-fledged
> buddy allocator which will then be the only one in used from then on.
>
> Note that for the sake of simplifying the review, this only introduces
> the code doing this operation, without actually being called by anything
> yet. This will be done in a subsequent patch, which will introduce the
> necessary host kernel changes.
>
> Credits to Will for __kvm_init_switch_pgd.
>
> Co-authored-by: Will Deacon 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/include/asm/kvm_asm.h |   6 +-
>  arch/arm64/include/asm/kvm_host.h|   8 +
>  arch/arm64/include/asm/kvm_hyp.h |   8 +
>  arch/arm64/kernel/cpufeature.c   |   2 +-
>  arch/arm64/kernel/image-vars.h   |  19 +++
>  arch/arm64/kvm/hyp/Makefile  |   2 +-
>  arch/arm64/kvm/hyp/include/nvhe/memory.h |   6 +
>  arch/arm64/kvm/hyp/include/nvhe/mm.h |  79 +
>  arch/arm64/kvm/hyp/nvhe/Makefile |   4 +-
>  arch/arm64/kvm/hyp/nvhe/hyp-init.S   |  30 
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c   |  44 +
>  arch/arm64/kvm/hyp/nvhe/mm.c | 175 
>  arch/arm64/kvm/hyp/nvhe/psci-relay.c |   2 -
>  arch/arm64/kvm/hyp/nvhe/setup.c  | 196 +++
>  arch/arm64/kvm/hyp/reserved_mem.c|  75 +
>  arch/arm64/kvm/mmu.c |   2 +-
>  arch/arm64/mm/init.c |   3 +
>  17 files changed, 653 insertions(+), 8 deletions(-)
>  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/mm.h
>  create mode 100644 arch/arm64/kvm/hyp/nvhe/mm.c
>  create mode 100644 arch/arm64/kvm/hyp/nvhe/setup.c
>  create mode 100644 arch/arm64/kvm/hyp/reserved_mem.c
>
> diff --git a/arch/arm64/include/asm/kvm_asm.h 
> b/arch/arm64/include/asm/kvm_asm.h
> index e4934f5e4234..9266b17f8ba9 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -57,6 +57,10 @@
>  #define __KVM_HOST_SMCCC_FUNC___kvm_get_mdcr_el2   12
>  #define __KVM_HOST_SMCCC_FUNC___vgic_v3_save_aprs  13
>  #define __KVM_HOST_SMCCC_FUNC___vgic_v3_restore_aprs   14
> +#define __KVM_HOST_SMCCC_FUNC___kvm_hyp_protect15
> +#define __KVM_HOST_SMCCC_FUNC___hyp_create_mappings16
> +#define __KVM_HOST_SMCCC_FUNC___hyp_create_private_mapping 17
> +#define __KVM_HOST_SMCCC_FUNC___hyp_cpu_set_vector 18
>
>  #ifndef __ASSEMBLY__
>
> @@ -171,7 +175,7 @@ struct kvm_vcpu;
>  struct kvm_s2_mmu;
>
>  DECLARE_KVM_NVHE_SYM(__kvm_hyp_init);
> -DECLARE_KVM_NVHE_SYM(__kvm_hyp_host_vector);
> +DECLARE_KVM_HYP_SYM(__kvm_hyp_host_vector);
>  DECLARE_KVM_HYP_SYM(__kvm_hyp_vector);
>  #define __kvm_hyp_init CHOOSE_NVHE_SYM(__kvm_hyp_init)
>  #define __kvm_hyp_host_vector  CHOOSE_NVHE_SYM(__kvm_hyp_host_vector)
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index 7a5d5f4b3351..ee8bb8021637 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -742,4 +742,12 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
>  #define kvm_vcpu_has_pmu(vcpu) \
> (test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))
>
> +#ifdef CONFIG_KVM
> +extern phys_addr_t hyp_mem_base;
> +extern phys_addr_t hyp_mem_size;
> +void __init reserve_kvm_hyp(void);
> +#else
> +static inline void reserve_kvm_hyp(void) { }
> +#endif
> +
>  #endif /* __ARM64_KVM_HOST_H__ */
> diff --git a/arch/arm64/include/asm/kvm_hyp.h 
> b/arch/arm64/include/asm/kvm_hyp.h
> index 95a2bbbcc7e1..dbd2ef86afa9 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -105,5 +105,13 @@ void __noreturn hyp_panic(void);
>  void __noreturn __hyp_do_panic(bool restore_host, u64 spsr, u64 elr, u64 
> par);
>  #endif
>
> +#ifdef __KVM_NVHE_HYPERVISOR__
> +void __kvm_init_switch_pgd(phys_addr_t phys, unsigned long size,
> +  phys_addr_t pgd, void *sp, void *cont_fn);
> +int __kvm_hyp_protect(phys_addr_t phys, unsigned long size,
> + unsigned long nr_cpus, 

Re: [PATCH v6 1/2] arm64: kvm: Save/restore MTE registers

2020-12-03 Thread Marc Zyngier



diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index e2ef4c2edf06..b6668ffa04d9 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -569,7 +569,8 @@
 #define SCTLR_ELx_M(BIT(0))

 #define SCTLR_ELx_FLAGS(SCTLR_ELx_M  | SCTLR_ELx_A | SCTLR_ELx_C | \
-SCTLR_ELx_SA | SCTLR_ELx_I | SCTLR_ELx_IESB)
+SCTLR_ELx_SA | SCTLR_ELx_I | SCTLR_ELx_IESB | \
+SCTLR_ELx_ITFSB)

 /* SCTLR_EL2 specific flags. */
 #define SCTLR_EL2_RES1	((BIT(4))  | (BIT(5))  | (BIT(11)) | (BIT(16)) | \

diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
index cce43bfe158f..45255ba60152 100644
--- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
+++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
@@ -18,6 +18,11 @@
 static inline void __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
 {
ctxt_sys_reg(ctxt, MDSCR_EL1)   = read_sysreg(mdscr_el1);
+   if (system_supports_mte()) {


Please move the per-VM predicate to this patch so that it can be used
not to save/restore the MTE registers if we don't need to.


+   ctxt_sys_reg(ctxt, RGSR_EL1)= read_sysreg_s(SYS_RGSR_EL1);
+   ctxt_sys_reg(ctxt, GCR_EL1) = read_sysreg_s(SYS_GCR_EL1);
+   ctxt_sys_reg(ctxt, TFSRE0_EL1)  = read_sysreg_s(SYS_TFSRE0_EL1);
+   }
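
[ For illustration, one possible shape for such a per-VM predicate -- a
  sketch only, with the helper name and the use of __hyp_running_vcpu /
  container_of assumed rather than taken from the series: ]

	static inline bool ctxt_has_mte(struct kvm_cpu_context *ctxt)
	{
		struct kvm_vcpu *vcpu = ctxt->__hyp_running_vcpu;

		/* Guest context: recover the vCPU embedding this context. */
		if (!vcpu)
			vcpu = container_of(ctxt, struct kvm_vcpu, arch.ctxt);

		/* Save/restore MTE state only for VMs that enabled MTE. */
		return kvm_has_mte(kern_hyp_va(vcpu->kvm));
	}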


Overall, I still don't understand how this is going to work once
we have MTE in the kernel. You mentioned having the ability to
turn off the tag checks at times, but I don't see that
in this patch (and I'm not sure we want that either).

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH v6 0/2] MTE support for KVM guest

2020-12-03 Thread Mark Rutland
On Thu, Dec 03, 2020 at 04:49:49PM +, Steven Price wrote:
> On 03/12/2020 16:09, Mark Rutland wrote:
> > On Fri, Nov 27, 2020 at 03:21:11PM +, Steven Price wrote:
> > > It's been a week, and I think the comments on v5 made it clear that
> > > enforcing PROT_MTE requirements on the VMM was probably the wrong
> > > approach. So since I've got swap working correctly without that I
> > > thought I'd post a v6 which hopefully addresses all the comments so far.
> > > 
> > > This series adds support for Arm's Memory Tagging Extension (MTE) to
> > > KVM, allowing KVM guests to make use of it. This builds on the existing
> > > user space support already in v5.10-rc4, see [1] for an overview.
> > 
> > >   arch/arm64/include/asm/kvm_emulate.h   |  3 +++
> > >   arch/arm64/include/asm/kvm_host.h  |  8 
> > >   arch/arm64/include/asm/pgtable.h   |  2 +-
> > >   arch/arm64/include/asm/sysreg.h|  3 ++-
> > >   arch/arm64/kernel/mte.c| 18 +-
> > >   arch/arm64/kvm/arm.c   |  9 +
> > >   arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 14 ++
> > >   arch/arm64/kvm/mmu.c   | 16 
> > >   arch/arm64/kvm/sys_regs.c  | 20 +++-
> > >   include/uapi/linux/kvm.h   |  1 +
> > >   10 files changed, 82 insertions(+), 12 deletions(-)
> > 
> > I note that doesn't fixup arch/arm64/kvm/inject_fault.c, where in
> > enter_exception64() we have:
> > 
> > | // TODO: TCO (if/when ARMv8.5-MemTag is exposed to guests)
> > 
> > ... and IIUC when MTE is present, TCO should be set when delivering an
> > exception, so I believe that needs to be updated to set TCO.
> 
> Well spotted! As you say TCO should be set when delivering an exception, so
> we need the following:
> 
> -   // TODO: TCO (if/when ARMv8.5-MemTag is exposed to guests)
> +   if (kvm_has_mte(vcpu->kvm))
> +   new |= PSR_TCO_BIT;

Something of that sort, yes.

It'd be worth a look for any mention of TCO or MTE in case there are
other bits that need a fixup.

> > Given that MTE-capable HW does that unconditionally, this is going to be
> > a mess for big.LITTLE. :/
> 
> I'm not sure I follow. Either all CPUs support MTE, in which case this isn't a
> problem, or the MTE feature just isn't exposed. We don't support a mix of
> MTE and non-MTE CPUs. There are several aspects of MTE which effectively mean
> it's an all-or-nothing feature for the system.

So long as the host requires uniform MTE support, I agree that's not a
problem.

The fun is that the CPUs themselves will set TCO upon a real exception
regardless of whether the host is aware, and on a mismatched system some
CPUs will do that while others will not. In such a case the host and
guest will end up seeing the SPSR TCO bit set sometimes upon exceptions
from EL1 or EL2, and I hope that MTE-unaware CPUs ignore the bit upon
ERET, or we're going to have significant problems.

Thanks,
Mark.


Re: [PATCH v6 0/2] MTE support for KVM guest

2020-12-03 Thread Steven Price

On 03/12/2020 16:09, Mark Rutland wrote:

On Fri, Nov 27, 2020 at 03:21:11PM +, Steven Price wrote:

It's been a week, and I think the comments on v5 made it clear that
enforcing PROT_MTE requirements on the VMM was probably the wrong
approach. So since I've got swap working correctly without that I
thought I'd post a v6 which hopefully addresses all the comments so far.

This series adds support for Arm's Memory Tagging Extension (MTE) to
KVM, allowing KVM guests to make use of it. This builds on the existing
user space support already in v5.10-rc4, see [1] for an overview.



  arch/arm64/include/asm/kvm_emulate.h   |  3 +++
  arch/arm64/include/asm/kvm_host.h  |  8 
  arch/arm64/include/asm/pgtable.h   |  2 +-
  arch/arm64/include/asm/sysreg.h|  3 ++-
  arch/arm64/kernel/mte.c| 18 +-
  arch/arm64/kvm/arm.c   |  9 +
  arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 14 ++
  arch/arm64/kvm/mmu.c   | 16 
  arch/arm64/kvm/sys_regs.c  | 20 +++-
  include/uapi/linux/kvm.h   |  1 +
  10 files changed, 82 insertions(+), 12 deletions(-)


I note that doesn't fixup arch/arm64/kvm/inject_fault.c, where in
enter_exception64() we have:

| // TODO: TCO (if/when ARMv8.5-MemTag is exposed to guests)

... and IIUC when MTE is present, TCO should be set when delivering an
exception, so I believe that needs to be updated to set TCO.


Well spotted! As you say TCO should be set when delivering an exception, 
so we need the following:


-   // TODO: TCO (if/when ARMv8.5-MemTag is exposed to guests)
+   if (kvm_has_mte(vcpu->kvm))
+   new |= PSR_TCO_BIT;


Given that MTE-capable HW does that unconditionally, this is going to be
a mess for big.LITTLE. :/


I'm not sure I follow. Either all CPUs support MTE, in which case this isn't
a problem, or the MTE feature just isn't exposed. We don't support a mix
of MTE and non-MTE CPUs. There are several aspects of MTE which
effectively mean it's an all-or-nothing feature for the system.


Thanks,

Steve


Re: [PATCH v6 0/2] MTE support for KVM guest

2020-12-03 Thread Mark Rutland
On Fri, Nov 27, 2020 at 03:21:11PM +, Steven Price wrote:
> It's been a week, and I think the comments on v5 made it clear that
> enforcing PROT_MTE requirements on the VMM was probably the wrong
> approach. So since I've got swap working correctly without that I
> thought I'd post a v6 which hopefully addresses all the comments so far.
> 
> This series adds support for Arm's Memory Tagging Extension (MTE) to
> KVM, allowing KVM guests to make use of it. This builds on the existing
> user space support already in v5.10-rc4, see [1] for an overview.

>  arch/arm64/include/asm/kvm_emulate.h   |  3 +++
>  arch/arm64/include/asm/kvm_host.h  |  8 
>  arch/arm64/include/asm/pgtable.h   |  2 +-
>  arch/arm64/include/asm/sysreg.h|  3 ++-
>  arch/arm64/kernel/mte.c| 18 +-
>  arch/arm64/kvm/arm.c   |  9 +
>  arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 14 ++
>  arch/arm64/kvm/mmu.c   | 16 
>  arch/arm64/kvm/sys_regs.c  | 20 +++-
>  include/uapi/linux/kvm.h   |  1 +
>  10 files changed, 82 insertions(+), 12 deletions(-)

I note that doesn't fixup arch/arm64/kvm/inject_fault.c, where in
enter_exception64() we have:

| // TODO: TCO (if/when ARMv8.5-MemTag is exposed to guests)

... and IIUC when MTE is present, TCO should be set when delivering an
exception, so I believe that needs to be updated to set TCO.

Given that MTE-capable HW does that unconditionally, this is going to be
a mess for big.LITTLE. :/

Thanks,
Mark.


Re: [PATCH v2 1/2] KVM: arm64: Some fixes of PV-time interface document

2020-12-03 Thread Marc Zyngier

On 2020-08-17 12:07, Keqian Zhu wrote:

Rename PV_FEATURES to PV_TIME_FEATURES.

Signed-off-by: Keqian Zhu 
Reviewed-by: Andrew Jones 
Reviewed-by: Steven Price 
---
 Documentation/virt/kvm/arm/pvtime.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/arm/pvtime.rst b/Documentation/virt/kvm/arm/pvtime.rst
index 687b60d..94bffe2 100644
--- a/Documentation/virt/kvm/arm/pvtime.rst
+++ b/Documentation/virt/kvm/arm/pvtime.rst
@@ -3,7 +3,7 @@
 Paravirtualized time support for arm64
 ======================================

-Arm specification DEN0057/A defines a standard for paravirtualised time
+Arm specification DEN0057/A defines a standard for paravirtualized time
 support for AArch64 guests:


nit: I do object to this change (some of us are British! ;-).

M.
--
Jazz is not dead. It just smells funny...


Re: [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests

2020-12-03 Thread Auger Eric
Hi Alexandru,
On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The LPI code validates a result similarly to the IPI tests, by checking if
> the target CPU received the interrupt with the expected interrupt number.
> However, the LPI tests invent their own way of checking the test results by
> creating a global struct (lpi_stats), using a separate interrupt handler
> (lpi_handler) and test function (check_lpi_stats).
> 
> There are several areas that can be improved in the LPI code, which are
> already covered by the IPI tests:
> 
> - check_lpi_stats() doesn't take into account that the target CPU can
>   receive the correct interrupt multiple times.
> - check_lpi_stats() doesn't take into account the scenarios where all
>   online CPUs can receive the interrupt, but the target CPU is the last CPU
>   that touches lpi_stats.observed.
> - Insufficient or missing memory synchronization.
> 
> Instead of duplicating code, let's convert the LPI tests to use
> check_acked() and the same interrupt handler as the IPI tests, which has
> been renamed to irq_handler() to avoid any confusion.
> 
> check_lpi_stats() has been replaced with check_acked() which, together with
> using irq_handler(), instantly gives us more correctness checks and proper
> memory synchronization between threads. lpi_stats.expected has been
> replaced by the CPU mask and the expected interrupt number arguments to
> check_acked(), with no change in semantics.
> 
> lpi_handler() aborted the test if the interrupt number was not an LPI. This
> was changed in favor of allowing the test to continue, as it will fail in
> check_acked(), but possibly print information useful for debugging. If the
> test receives spurious interrupts, those are reported via report_info() at
> the end of the test for consistency with the IPI tests, which don't treat
> spurious interrupts as critical errors.
> 
> In the spirit of code reuse, secondary_lpi_tests() has been replaced with
> ipi_recv() because the two are now identical; ipi_recv() has been renamed
> to irq_recv(), similarly to irq_handler(), to avoid confusion.
> 
> CC: Eric Auger 
> Signed-off-by: Alexandru Elisei 
> ---
> With this change, I get the following failure for its-trigger on a
> rockpro64 (running on the little cores):
> 
> $ taskset -c 0-3 arm/run arm/gic.flat -smp 4 -machine gic-version=3 -append 
> its-trigger
> /usr/bin/qemu-system-aarch64 -nodefaults -machine 
> virt,gic-version=host,accel=kvm -cpu host -device virtio-serial-device 
> -device virtconsole,chardev=ctd -chardev testdev,id=ctd -device pci-testdev 
> -display none -serial stdio -kernel arm/gic.flat -smp 4 -machine 
> gic-version=3 -append its-trigger # -initrd /tmp/tmp.wWW0iJY6DS
> ITS: MAPD devid=2 size = 0x8 itt=0x403a valid=1
> ITS: MAPD devid=7 size = 0x8 itt=0x403b valid=1
> MAPC col_id=3 target_addr = 0x3 valid=1
> MAPC col_id=2 target_addr = 0x2 valid=1
> INVALL col_id=2
> INVALL col_id=3
> MAPTI dev_id=2 event_id=20 -> phys_id=8195, col_id=3
> MAPTI dev_id=7 event_id=255 -> phys_id=8196, col_id=2
> INT dev_id=2 event_id=20
> PASS: gicv3: its-trigger: int: dev=2, eventid=20  -> lpi= 8195, col=3
> INT dev_id=7 event_id=255
> PASS: gicv3: its-trigger: int: dev=7, eventid=255 -> lpi= 8196, col=2
> INV dev_id=2 event_id=20
> INT dev_id=2 event_id=20
> PASS: gicv3: its-trigger: inv/invall: dev2/eventid=20 does not trigger any LPI
> INT dev_id=2 event_id=20
> PASS: gicv3: its-trigger: inv/invall: dev2/eventid=20 still does not trigger 
> any LPI
> INVALL col_id=3
> INT dev_id=2 event_id=20
> INFO: gicv3: its-trigger: inv/invall: ACKS: missing=0 extra=1 unexpected=0
> FAIL: gicv3: its-trigger: inv/invall: dev2/eventid=20 now triggers an LPI
> ITS: MAPD devid=2 size = 0x8 itt=0x403a valid=0
> INT dev_id=2 event_id=20
> PASS: gicv3: its-trigger: mapd valid=false: no LPI after device unmap
> SUMMARY: 6 tests, 1 unexpected failures
> 
> The reason for the failure is that the test "dev2/eventid=20 now triggers
> an LPI" triggers 2 LPIs, not one. This behavior was present before this
> patch, but it was ignored because check_lpi_stats() wasn't looking at the
> acked array.
> 
> I'm not familiar with the ITS so I'm not sure if this is expected, if the
> test is incorrect or if there is something wrong with KVM emulation.
> 
> Did some more testing on an Ampere eMAG (fast out-of-order cores) using
> qemu and kvmtool and Linux v5.8, here's what I found:
> 
> - Using qemu and gic.flat built from *master*: error encountered 864 times
>   out of 1088 runs.
> - Using qemu: error encountered 852 times out of 1027 runs.
> - Using kvmtool: error encountered 8164 times out of 10602 runs.
> 
> Looks to me like it's consistent between master and this series, and
> between qemu and kvmtool.
> 
> Here's the diff that I used for testing master (I removed the diff line
> because it causes trouble when applying the main patch):
> 
> @@ -772,8 +772,12 @@ static void test_its_trigger(void)
> /* Now call the invall and check 

Re: [kvm-unit-tests PATCH 09/10] arm/arm64: gic: Make check_acked() more generic

2020-12-03 Thread Auger Eric
Hi,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> Testing that an interrupt is received as expected is done in three places:
> in check_ipi_sender(), check_irqnr() and check_acked(). check_irqnr()
> compares the interrupt ID with IPI_IRQ and records a failure in bad_irq,
> and check_ipi_sender() compares the sender with IPI_SENDER and writes to
> bad_sender when they don't match.
> 
> Let's move all the checks to check_acked() by renaming
> bad_sender->irq_sender and bad_irq->irq_number and changing their semantics
> so they record the interrupt sender and the irq number, respectively.
> check_acked() now takes two new parameters: the expected interrupt number
> and sender.
> 
> This has two distinct advantages:
> 
> 1. check_acked() and ipi_handler() can now be used for interrupts other
>than IPIs.
> 2. Correctness checks are consolidated in one function.
> 
> CC: Andre Przywara 
> Signed-off-by: Alexandru Elisei 
Reviewed-by: Eric Auger 

Thanks

Eric

> ---
>  arm/gic.c | 68 +++++++++++++++++++++++++++++++++-----------------------------------
>  1 file changed, 33 insertions(+), 35 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index dcdab7d5f39a..da7b42da5449 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -35,7 +35,7 @@ struct gic {
>  
>  static struct gic *gic;
>  static int acked[NR_CPUS], spurious[NR_CPUS];
> -static int bad_sender[NR_CPUS], bad_irq[NR_CPUS];
> +static int irq_sender[NR_CPUS], irq_number[NR_CPUS];
>  static cpumask_t ready;
>  
>  static void nr_cpu_check(int nr)
> @@ -57,8 +57,8 @@ static void stats_reset(void)
>  
>   for (i = 0; i < nr_cpus; ++i) {
>   acked[i] = 0;
> - bad_sender[i] = -1;
> - bad_irq[i] = -1;
> + irq_sender[i] = -1;
> + irq_number[i] = -1;
>   }
>  }
>  
> @@ -92,9 +92,10 @@ static void wait_for_interrupts(cpumask_t *mask)
>   report_info("interrupts timed-out (5s)");
>  }
>  
> -static bool check_acked(cpumask_t *mask)
> +static bool check_acked(cpumask_t *mask, int sender, int irqnum)
>  {
>   int missing = 0, extra = 0, unexpected = 0;
> + bool has_gicv2 = (gic_version() == 2);
>   bool pass = true;
>   int cpu;
>  
> @@ -108,17 +109,19 @@ static bool check_acked(cpumask_t *mask)
>   if (acked[cpu])
>   ++unexpected;
>   }
> + if (!acked[cpu])
> + continue;
>   smp_rmb(); /* pairs with smp_wmb in ipi_handler */
>  
> - if (bad_sender[cpu] != -1) {
> + if (has_gicv2 && irq_sender[cpu] != sender) {
>   report_info("cpu%d received IPI from wrong sender %d",
> - cpu, bad_sender[cpu]);
> + cpu, irq_sender[cpu]);
>   pass = false;
>   }
>  
> - if (bad_irq[cpu] != -1) {
> + if (irq_number[cpu] != irqnum) {
>   report_info("cpu%d received wrong irq %d",
> - cpu, bad_irq[cpu]);
> + cpu, irq_number[cpu]);
>   pass = false;
>   }
>   }
> @@ -143,26 +146,18 @@ static void check_spurious(void)
>   }
>  }
>  
> -static void check_ipi_sender(u32 irqstat, int sender)
> +static int gic_get_sender(int irqstat)
>  {
> - if (gic_version() == 2) {
> - int src = (irqstat >> 10) & 7;
> -
> - if (src != sender)
> - bad_sender[smp_processor_id()] = src;
> - }
> -}
> -
> -static void check_irqnr(u32 irqnr)
> -{
> - if (irqnr != IPI_IRQ)
> - bad_irq[smp_processor_id()] = irqnr;
> + if (gic_version() == 2)
> + return (irqstat >> 10) & 7;
> + return -1;
>  }
>  
>  static void ipi_handler(struct pt_regs *regs __unused)
>  {
>   u32 irqstat = gic_read_iar();
>   u32 irqnr = gic_iar_irqnr(irqstat);
> + int this_cpu = smp_processor_id();
>  
>   if (irqnr != GICC_INT_SPURIOUS) {
>   gic_write_eoir(irqstat);
> @@ -173,12 +168,12 @@ static void ipi_handler(struct pt_regs *regs __unused)
>*/
>   if (gic_version() == 2)
>   smp_rmb();
> - check_ipi_sender(irqstat, IPI_SENDER);
> - check_irqnr(irqnr);
> + irq_sender[this_cpu] = gic_get_sender(irqstat);
> + irq_number[this_cpu] = irqnr;
>   smp_wmb(); /* pairs with smp_rmb in check_acked */
> - ++acked[smp_processor_id()];
> + ++acked[this_cpu];
>   } else {
> - ++spurious[smp_processor_id()];
> + ++spurious[this_cpu];
>   }
>  
>   /* Wait for writes to acked/spurious to complete */
> @@ -311,40 +306,42 @@ static void gicv3_ipi_send_broadcast(void)
>  
>  static void ipi_test_self(void)
>  {
> + int this_cpu = smp_processor_id();
>   cpumask_t mask;
>  
>   

Re: [PATCH v2 1/2] clocksource: arm_arch_timer: Use stable count reader in erratum sne

2020-12-03 Thread Marc Zyngier

On 2020-08-18 04:28, Keqian Zhu wrote:
In commit 0ea415390cd3 ("clocksource/arm_arch_timer: Use arch_timer_read_counter
to access stable counters"), we separated the stable and normal count readers
to omit unnecessary overhead on systems that have no timer erratum.

However, in erratum_set_next_event_tval_generic(), the count reader used is the
normal reader. Convert it to the stable reader.

Fixes: 0ea415390cd3 ("clocksource/arm_arch_timer: Use
   arch_timer_read_counter to access stable counters")


On a single line.


Signed-off-by: Keqian Zhu 
---
 drivers/clocksource/arm_arch_timer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 6c3e841..777d38c 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -396,10 +396,10 @@ static void erratum_set_next_event_tval_generic(const int access, unsigned long
ctrl &= ~ARCH_TIMER_CTRL_IT_MASK;

if (access == ARCH_TIMER_PHYS_ACCESS) {
-   cval = evt + arch_counter_get_cntpct();
+   cval = evt + arch_counter_get_cntpct_stable();
write_sysreg(cval, cntp_cval_el0);
} else {
-   cval = evt + arch_counter_get_cntvct();
+   cval = evt + arch_counter_get_cntvct_stable();
write_sysreg(cval, cntv_cval_el0);
}


With that fixed:

Acked-by: Marc Zyngier 

This should go via the clocksource tree.

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH v2 2/2] clocksource: arm_arch_timer: Correct fault programming of CNTKCTL_EL1.EVNTI

2020-12-03 Thread Marc Zyngier

On 2020-08-18 04:28, Keqian Zhu wrote:

The ARM virtual counter supports an event stream: it can only trigger an event
when the trigger bit (selected by the value of CNTKCTL_EL1.EVNTI) of CNTVCT_EL0
changes, so the actual period of the event stream is 2^(cntkctl_evnti + 1)
counter cycles. For example, when the trigger bit is 0, the virtual counter
triggers an event every two cycles.

Fixes: 037f637767a8 ("drivers: clocksource: add support for
   ARM architected timer event stream")


Fixes: tags should be on a single line.


Suggested-by: Marc Zyngier 
Signed-off-by: Keqian Zhu 
---
 drivers/clocksource/arm_arch_timer.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 777d38c..e3b2ee0 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -824,10 +824,14 @@ static void arch_timer_configure_evtstream(void)
 {
int evt_stream_div, pos;

-   /* Find the closest power of two to the divisor */
-   evt_stream_div = arch_timer_rate / ARCH_TIMER_EVT_STREAM_FREQ;
+   /*
+* Find the closest power of two to the divisor. As the event
+* stream can at most be generated at half the frequency of the
+* counter, use half the frequency when computing the divider.
+*/
+   evt_stream_div = arch_timer_rate / ARCH_TIMER_EVT_STREAM_FREQ / 2;
pos = fls(evt_stream_div);
-   if (pos > 1 && !(evt_stream_div & (1 << (pos - 2))))
+   if ((pos == 1) || (pos > 1 && !(evt_stream_div & (1 << (pos - 2)))))
pos--;


You don't explain why you are special-casing pos == 1.


/* enable event stream */
arch_timer_evtstrm_enable(min(pos, 15));


Also, please Cc the subsystem maintainers:

CLOCKSOURCE, CLOCKEVENT DRIVERS
M:  Daniel Lezcano 
M:  Thomas Gleixner 
L:  linux-ker...@vger.kernel.org
S:  Supported
T:  git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/core

F:  Documentation/devicetree/bindings/timer/
F:  drivers/clocksource/

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH v4 06/26] psci: Add accessor for psci_0_1_function_ids

2020-12-03 Thread David Brazdil
On Thu, Dec 03, 2020 at 10:47:12AM +, Mark Rutland wrote:
> On Wed, Dec 02, 2020 at 06:41:02PM +, David Brazdil wrote:
> > Make it possible to retrieve a copy of the psci_0_1_function_ids struct.
> > This is useful for KVM if it is configured to intercept host's PSCI SMCs.
> > 
> > Signed-off-by: David Brazdil 
> 
> Acked-by: Mark Rutland 
> 
> ... just to check, does KVM snapshot which function IDs are valid, or do
> we want to add that state here too? That can be a follow-up if
> necessary.

Ah, that's a good point. It doesn't, but it can infer that from psci_ops.
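
[ For reference, the accessor under discussion is essentially a
  copy-returning getter -- a sketch modelled on the patch description,
  not quoted from the patch itself: ]

	static struct psci_0_1_function_ids psci_0_1_function_ids;

	/* Hand out a copy so callers (e.g. KVM) can snapshot the IDs
	 * without keeping a pointer into PSCI internals. */
	struct psci_0_1_function_ids get_psci_0_1_function_ids(void)
	{
		return psci_0_1_function_ids;
	}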


Re: [PATCH v13 05/15] iommu/smmuv3: Get prepared for nested stage support

2020-12-03 Thread Kunkun Jiang


On 2020/12/3 21:01, Auger Eric wrote:

Hi Kunkun,

On 12/3/20 1:32 PM, Kunkun Jiang wrote:

Hi Eric,

On 2020/11/18 19:21, Eric Auger wrote:

When nested stage translation is setup, both s1_cfg and
s2_cfg are set.

We introduce a new smmu domain abort field that will be set
upon guest stage1 configuration passing.

arm_smmu_write_strtab_ent() is modified to write both stage
fields in the STE and deal with the abort field.

In nested mode, only stage 2 is "finalized" as the host does
not own/configure the stage 1 context descriptor; guest does.

Signed-off-by: Eric Auger 

---
v10 -> v11:
- Fix an issue reported by Shameer when switching from a config with
   vSMMU to one without vSMMU. Although the spec does not seem to mention
   it, it seems to be needed to reset the 2 high 64b when switching from
   an S1+S2 cfg to S1 only. Especially dst[3] needs to be reset (S2TTB).
   On some implementations, if the S2TTB is not reset, this causes
   a C_BAD_STE error.
---
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 64 +
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 +
  2 files changed, 56 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 18ac5af1b284..412ea1bafa50 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1181,8 +1181,10 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_master *master, u32 sid,
 * three cases at the moment:

Now, it should be *five cases*.

 *
 * 1. Invalid (all zero) -> bypass/fault (init)
-* 2. Bypass/fault -> translation/bypass (attach)
-* 3. Translation/bypass -> bypass/fault (detach)
+* 2. Bypass/fault -> single stage translation/bypass (attach)
+* 3. Single or nested stage Translation/bypass -> bypass/fault (detach)
+* 4. S2 -> S1 + S2 (attach_pasid_table)

I was testing this series on one of our hardware boards with SMMUv3, and
I found that while trying to "attach_pasid_table", the sequence of the
STE (host) config (bit[3:1]) is "S2 -> abort -> S1 + S2", because the
maintenance is "Write everything apart from dword 0, sync, write dword 0,
sync" when we update the STE (guest). Does the sequence meet your
expectation?

yes it does. I will fix the comments accordingly.

Is there anything to correct in the code or was it functional?

Thanks

Eric

No, thanks. I just want to clarify the sequence.   :)

+* 5. S1 + S2 -> S2 (detach_pasid_table)
 *
 * Given that we can't update the STE atomically and the SMMU
 * doesn't read the thing in a defined order, that leaves us
@@ -1193,7 +1195,8 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_master *master, u32 sid,
 * 3. Update Config, sync
 */
u64 val = le64_to_cpu(dst[0]);
-   bool ste_live = false;
+   bool s1_live = false, s2_live = false, ste_live;
+   bool abort, nested = false, translate = false;
struct arm_smmu_device *smmu = NULL;
struct arm_smmu_s1_cfg *s1_cfg;
struct arm_smmu_s2_cfg *s2_cfg;
@@ -1233,6 +1236,8 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_master *master, u32 sid,
default:
break;
}
+   nested = s1_cfg->set && s2_cfg->set;
+   translate = s1_cfg->set || s2_cfg->set;
}
  
  	if (val & STRTAB_STE_0_V) {

@@ -1240,23 +1245,36 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_master *master, u32 sid,
case STRTAB_STE_0_CFG_BYPASS:
break;
case STRTAB_STE_0_CFG_S1_TRANS:
+   s1_live = true;
+   break;
case STRTAB_STE_0_CFG_S2_TRANS:
-   ste_live = true;
+   s2_live = true;
+   break;
+   case STRTAB_STE_0_CFG_NESTED:
+   s1_live = true;
+   s2_live = true;
break;
case STRTAB_STE_0_CFG_ABORT:
-   BUG_ON(!disable_bypass);
break;
default:
BUG(); /* STE corruption */
}
}
  
+	ste_live = s1_live || s2_live;

+
/* Nuke the existing STE_0 value, as we're going to rewrite it */
val = STRTAB_STE_0_V;
  
  	/* Bypass/fault */

-   if (!smmu_domain || !(s1_cfg->set || s2_cfg->set)) {
-   if (!smmu_domain && disable_bypass)
+
+   if (!smmu_domain)
+   abort = disable_bypass;
+   else
+   abort = smmu_domain->abort;
+
+   if (abort || !translate) {
+   if (abort)
val |= FIELD_PREP(STRTAB_STE_0_CFG, 
STRTAB_STE_0_CFG_ABORT);
else
val |= FIELD_PREP(STRTAB_STE_0_CFG, 
STRTAB_STE_0_CFG_BYPASS);
@@ 

Re: [PATCH v13 05/15] iommu/smmuv3: Get prepared for nested stage support

2020-12-03 Thread Kunkun Jiang

Hi Eric,

On 2020/11/18 19:21, Eric Auger wrote:

When nested stage translation is setup, both s1_cfg and
s2_cfg are set.

We introduce a new smmu domain abort field that will be set
upon guest stage1 configuration passing.

arm_smmu_write_strtab_ent() is modified to write both stage
fields in the STE and deal with the abort field.

In nested mode, only stage 2 is "finalized" as the host does
not own/configure the stage 1 context descriptor; guest does.

Signed-off-by: Eric Auger 

---
v10 -> v11:
- Fix an issue reported by Shameer when switching from a config with
   vSMMU to one without vSMMU. Although the spec does not seem to mention
   it, it seems to be needed to reset the 2 high 64b when switching from
   an S1+S2 cfg to S1 only. Especially dst[3] needs to be reset (S2TTB).
   On some implementations, if the S2TTB is not reset, this causes
   a C_BAD_STE error.
---
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 64 +
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 +
  2 files changed, 56 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 18ac5af1b284..412ea1bafa50 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1181,8 +1181,10 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_master *master, u32 sid,
 * three cases at the moment:

Now, it should be *five cases*.

 *
 * 1. Invalid (all zero) -> bypass/fault (init)
-* 2. Bypass/fault -> translation/bypass (attach)
-* 3. Translation/bypass -> bypass/fault (detach)
+* 2. Bypass/fault -> single stage translation/bypass (attach)
+* 3. Single or nested stage Translation/bypass -> bypass/fault (detach)
+* 4. S2 -> S1 + S2 (attach_pasid_table)


I was testing this series on one of our hardware boards with SMMUv3, and
I found that while trying to "attach_pasid_table", the sequence of the
STE (host) config (bit[3:1]) is "S2 -> abort -> S1 + S2", because the
maintenance is "Write everything apart from dword 0, sync, write dword 0,
sync" when we update the STE (guest). Does the sequence meet your
expectation?



+* 5. S1 + S2 -> S2 (detach_pasid_table)
 *
 * Given that we can't update the STE atomically and the SMMU
 * doesn't read the thing in a defined order, that leaves us
@@ -1193,7 +1195,8 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_master *master, u32 sid,
 * 3. Update Config, sync
 */
u64 val = le64_to_cpu(dst[0]);
-   bool ste_live = false;
+   bool s1_live = false, s2_live = false, ste_live;
+   bool abort, nested = false, translate = false;
struct arm_smmu_device *smmu = NULL;
struct arm_smmu_s1_cfg *s1_cfg;
struct arm_smmu_s2_cfg *s2_cfg;
@@ -1233,6 +1236,8 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_master *master, u32 sid,
default:
break;
}
+   nested = s1_cfg->set && s2_cfg->set;
+   translate = s1_cfg->set || s2_cfg->set;
}
  
  	if (val & STRTAB_STE_0_V) {

@@ -1240,23 +1245,36 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_master *master, u32 sid,
case STRTAB_STE_0_CFG_BYPASS:
break;
case STRTAB_STE_0_CFG_S1_TRANS:
+   s1_live = true;
+   break;
case STRTAB_STE_0_CFG_S2_TRANS:
-   ste_live = true;
+   s2_live = true;
+   break;
+   case STRTAB_STE_0_CFG_NESTED:
+   s1_live = true;
+   s2_live = true;
break;
case STRTAB_STE_0_CFG_ABORT:
-   BUG_ON(!disable_bypass);
break;
default:
BUG(); /* STE corruption */
}
}
  
+	ste_live = s1_live || s2_live;

+
/* Nuke the existing STE_0 value, as we're going to rewrite it */
val = STRTAB_STE_0_V;
  
  	/* Bypass/fault */

-   if (!smmu_domain || !(s1_cfg->set || s2_cfg->set)) {
-   if (!smmu_domain && disable_bypass)
+
+   if (!smmu_domain)
+   abort = disable_bypass;
+   else
+   abort = smmu_domain->abort;
+
+   if (abort || !translate) {
+   if (abort)
val |= FIELD_PREP(STRTAB_STE_0_CFG, 
STRTAB_STE_0_CFG_ABORT);
else
val |= FIELD_PREP(STRTAB_STE_0_CFG, 
STRTAB_STE_0_CFG_BYPASS);
@@ -1274,8 +1292,16 @@ static void arm_smmu_write_strtab_ent(struct 
arm_smmu_master *master, u32 sid,
return;
}
  
+	BUG_ON(ste_live && !nested);

+
+   if (ste_live) {
+   /* First invalidate the live STE */
+   dst[0] = 

Re: [PATCH v2 0/2] clocksource: arm_arch_timer: Some fixes

2020-12-03 Thread zhukeqian
Hi Marc,

I found that this bugfix series has not been applied yet.
Does it need some modification? I hope you can pick it up :-)

Thanks,
Keqian

On 2020/8/18 11:28, Keqian Zhu wrote:
> change log:
> 
> v2:
>  - Do not revert commit 0ea415390cd3, fix it instead.
>  - Correct the tags of second patch.
> 
> Keqian Zhu (2):
>   clocksource: arm_arch_timer: Use stable count reader in erratum sne
>   clocksource: arm_arch_timer: Correct fault programming of
> CNTKCTL_EL1.EVNTI
> 
>  drivers/clocksource/arm_arch_timer.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 


Re: [PATCH v2 0/2] KVM: arm64: Some fixes and code adjustments for pvtime ST

2020-12-03 Thread zhukeqian
Hi Marc,

I found that this series has not been applied yet.
Does it need some modification? I hope you can pick it up :-)

Thanks,
Keqian

On 2020/8/17 19:07, Keqian Zhu wrote:
> While picking up pvtime LPT support for arm64, I made some trivial fixes for
> pvtime ST.
> 
> change log:
> 
> v2:
>  - Add Andrew's and Steven's R-b.
>  - Correct commit message of the first patch.
>  - Drop the second patch.
> 
> Keqian Zhu (2):
>   KVM: arm64: Some fixes of PV-time interface document
>   KVM: arm64: Use kvm_write_guest_lock when init stolen time
> 
>  Documentation/virt/kvm/arm/pvtime.rst | 6 +++---
>  arch/arm64/kvm/pvtime.c   | 6 +-
>  2 files changed, 4 insertions(+), 8 deletions(-)
> 


Re: [PATCH v4 16/26] kvm: arm64: Bootstrap PSCI SMC handler in nVHE EL2

2020-12-03 Thread Marc Zyngier

A couple of cosmetic comments below, none of which require immediate
addressing.

[...]


diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
new file mode 100644
index ..61375d4571c2
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
@@ -0,0 +1,100 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 - Google LLC
+ * Author: David Brazdil 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 


nit: is there an ordering issue that requires this to be out
of order?


+#include 
+
+#include 
+
+/* Config options set by the host. */
+__ro_after_init u32 kvm_host_psci_version;
+__ro_after_init struct psci_0_1_function_ids kvm_host_psci_0_1_function_ids;


nit: we usually place attributes after the type.
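
For example (illustrating the convention only, not the final patch):

	u32 kvm_host_psci_version __ro_after_init;
	struct psci_0_1_function_ids kvm_host_psci_0_1_function_ids __ro_after_init;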

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [kvm-unit-tests PATCH 08/10] arm/arm64: gic: Split check_acked() into two functions

2020-12-03 Thread Auger Eric



On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> check_acked() has several peculiarities: it is the only function among the
> check_* functions which calls report() directly; it does two things
> (waits for interrupts and checks for misfired interrupts) and it also
> mixes printf, report_info and report calls.
> 
> check_acked() also reports a pass and returns as soon as all the target CPUs
> have received interrupts. However, a CPU not having received an interrupt
> *now* does not guarantee not receiving an eroneous interrupt if we wait
erroneous
> long enough.
> 
> Rework the function by splitting it into two separate functions, each with
> a single responsibility: wait_for_interrupts(), which waits for the
> expected interrupts to fire, and check_acked() which checks that interrupts
> have been received as expected.
> 
> wait_for_interrupts() also waits an extra 100 milliseconds after the
> expected interrupts have been received in an effort to make sure we don't
> miss misfiring interrupts.
> 
> Splitting check_acked() into two functions will also allow us to
> customize the behavior of each function in the future more easily
> without using an unnecessarily long list of arguments for check_acked().
> 
> CC: Andre Przywara 
> Signed-off-by: Alexandru Elisei 
> ---
>  arm/gic.c | 73 +++
>  1 file changed, 47 insertions(+), 26 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index 544c283f5f47..dcdab7d5f39a 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -62,41 +62,42 @@ static void stats_reset(void)
>   }
>  }
>  
> -static void check_acked(const char *testname, cpumask_t *mask)
> +static void wait_for_interrupts(cpumask_t *mask)
>  {
> - int missing = 0, extra = 0, unexpected = 0;
>   int nr_pass, cpu, i;
> - bool bad = false;
>  
>   /* Wait up to 5s for all interrupts to be delivered */
> - for (i = 0; i < 50; ++i) {
> + for (i = 0; i < 50; i++) {
>   mdelay(100);
>   nr_pass = 0;
>   for_each_present_cpu(cpu) {
> + /*
> +  * A CPU having receied more than one interrupts will
received
> +  * show up in check_acked(), and no matter how long we
> +  * wait it cannot un-receive it. Consier at least one
consider
> +  * interrupt as a pass.
> +  */
>   nr_pass += cpumask_test_cpu(cpu, mask) ?
> - acked[cpu] == 1 : acked[cpu] == 0;
> - smp_rmb(); /* pairs with smp_wmb in ipi_handler */
> -
> - if (bad_sender[cpu] != -1) {
> - printf("cpu%d received IPI from wrong sender 
> %d\n",
> - cpu, bad_sender[cpu]);
> - bad = true;
> - }
> -
> - if (bad_irq[cpu] != -1) {
> - printf("cpu%d received wrong irq %d\n",
> - cpu, bad_irq[cpu]);
> - bad = true;
> - }
> + acked[cpu] >= 1 : acked[cpu] == 0;
>   }
> +
>   if (nr_pass == nr_cpus) {
> - report(!bad, "%s", testname);
>   if (i)
> - report_info("took more than %d ms", i * 100);
> + report_info("interrupts took more than %d ms", 
> i * 100);
> + mdelay(100);
>   return;
>   }
>   }
>  
> + report_info("interrupts timed-out (5s)");
> +}
> +
> +static bool check_acked(cpumask_t *mask)
> +{
> + int missing = 0, extra = 0, unexpected = 0;
> + bool pass = true;
> + int cpu;
> +
>   for_each_present_cpu(cpu) {
>   if (cpumask_test_cpu(cpu, mask)) {
>   if (!acked[cpu])
> @@ -107,11 +108,28 @@ static void check_acked(const char *testname, cpumask_t 
> *mask)
>   if (acked[cpu])
>   ++unexpected;
>   }
> + smp_rmb(); /* pairs with smp_wmb in ipi_handler */
> +
> + if (bad_sender[cpu] != -1) {
> + report_info("cpu%d received IPI from wrong sender %d",
> + cpu, bad_sender[cpu]);
> + pass = false;
> + }
> +
> + if (bad_irq[cpu] != -1) {
> + report_info("cpu%d received wrong irq %d",
> + cpu, bad_irq[cpu]);
> + pass = false;
> + }
> + }
> +
> + if (missing || extra || unexpected) {
> + report_info("ACKS: missing=%d extra=%d unexpected=%d",
> + missing, extra, unexpected);
> + pass = false;
>   }
>  
> - report(false, "%s", testname);

Re: [PATCH v4 15/26] kvm: arm64: Add SMC handler in nVHE EL2

2020-12-03 Thread Marc Zyngier

On 2020-12-02 18:41, David Brazdil wrote:

Add handler of host SMCs in KVM nVHE trap handler. Forward all SMCs to
EL3 and propagate the result back to EL1. This is done in preparation
for validating host SMCs in KVM protected mode.

The implementation assumes that firmware uses SMCCC v1.2 or older. That
means x0-x17 can be used both for arguments and results, other GPRs are
preserved.

Signed-off-by: David Brazdil 
---
 arch/arm64/kvm/hyp/nvhe/host.S | 38 ++
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 35 ---
 2 files changed, 70 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/host.S b/arch/arm64/kvm/hyp/nvhe/host.S
index fe2740b224cf..2b56f0bdf874 100644
--- a/arch/arm64/kvm/hyp/nvhe/host.S
+++ b/arch/arm64/kvm/hyp/nvhe/host.S
@@ -180,3 +180,41 @@ SYM_CODE_START(__kvm_hyp_host_vector)
invalid_host_el1_vect   // FIQ 32-bit EL1
invalid_host_el1_vect   // Error 32-bit EL1
 SYM_CODE_END(__kvm_hyp_host_vector)
+
+/*
+ * Forward SMC with arguments in struct kvm_cpu_context, and
+ * store the result into the same struct. Assumes SMCCC 1.2 or older.
+ *
+ * x0: struct kvm_cpu_context*
+ */
+SYM_CODE_START(__kvm_hyp_host_forward_smc)
+   /*
+* Use x18 to keep the pointer to the host context because
+* x18 is callee-saved in SMCCC but not in AAPCS64.
+*/
+   mov x18, x0
+
+   ldp x0, x1,   [x18, #CPU_XREG_OFFSET(0)]
+   ldp x2, x3,   [x18, #CPU_XREG_OFFSET(2)]
+   ldp x4, x5,   [x18, #CPU_XREG_OFFSET(4)]
+   ldp x6, x7,   [x18, #CPU_XREG_OFFSET(6)]
+   ldp x8, x9,   [x18, #CPU_XREG_OFFSET(8)]
+   ldp x10, x11, [x18, #CPU_XREG_OFFSET(10)]
+   ldp x12, x13, [x18, #CPU_XREG_OFFSET(12)]
+   ldp x14, x15, [x18, #CPU_XREG_OFFSET(14)]
+   ldp x16, x17, [x18, #CPU_XREG_OFFSET(16)]
+
+   smc #0
+
+   stp x0, x1,   [x18, #CPU_XREG_OFFSET(0)]
+   stp x2, x3,   [x18, #CPU_XREG_OFFSET(2)]
+   stp x4, x5,   [x18, #CPU_XREG_OFFSET(4)]
+   stp x6, x7,   [x18, #CPU_XREG_OFFSET(6)]
+   stp x8, x9,   [x18, #CPU_XREG_OFFSET(8)]
+   stp x10, x11, [x18, #CPU_XREG_OFFSET(10)]
+   stp x12, x13, [x18, #CPU_XREG_OFFSET(12)]
+   stp x14, x15, [x18, #CPU_XREG_OFFSET(14)]
+   stp x16, x17, [x18, #CPU_XREG_OFFSET(16)]
+
+   ret
+SYM_CODE_END(__kvm_hyp_host_forward_smc)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index a4f1cac714d7..f25680ede080 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -18,6 +18,8 @@

 DEFINE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);

+void __kvm_hyp_host_forward_smc(struct kvm_cpu_context *host_ctxt);
+
 static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
 {
DECLARE_REG(struct kvm_vcpu *, vcpu, host_ctxt, 1);
@@ -152,12 +154,39 @@ static void handle_host_hcall(struct
kvm_cpu_context *host_ctxt)
cpu_reg(host_ctxt, 0) = SMCCC_RET_NOT_SUPPORTED;
 }

+static void default_host_smc_handler(struct kvm_cpu_context *host_ctxt)
+{
+   __kvm_hyp_host_forward_smc(host_ctxt);
+}
+
+static void skip_host_instruction(void)
+{
+   write_sysreg_el2(read_sysreg_el2(SYS_ELR) + 4, SYS_ELR);
+}


Just for the sake of keeping things together, it'd be good to
move this helper to include/hyp/adjust_pc.h. Nothing urgent though.

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [kvm-unit-tests PATCH 07/10] arm/arm64: gic: Wait for writes to acked or spurious to complete

2020-12-03 Thread Auger Eric
Hi Alexandru,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The IPI test has two parts: in the first part, it tests that the sender CPU
> can send an IPI to itself (ipi_test_self()), and in the second part it
> sends interrupts to even-numbered CPUs (ipi_test_smp()). When acknowledging
> an interrupt, if we read back a spurious interrupt ID (1023), the handler
> increments the index in the static array spurious corresponding to the CPU
> ID that the handler is running on; if we get the expected interrupt ID, we
> increment the same index in the acked array.
> 
> Reads of the spurious and acked arrays are synchronized with writes
> performed before sending the IPI. The synchronization is done either in the
> IPI sender function (GICv3), either by creating a data dependency (GICv2).
> 
> At the end of the test, the sender CPU reads from the acked and spurious
> arrays to check against the expected behaviour. We need to make sure
> that the writes in ipi_handler() are observable by the sender CPU. Use a DSB
> ISHST to make sure that the writes have completed.
> 
> One might rightfully argue that there are no guarantees regarding when the
> DSB instruction completes, just like there are no guarantees regarding when
> the value is observed by the other CPUs. However, let's do our best and
> instruct the CPU to complete the memory access when we know that it will be
> needed.
> 
> We still need to follow the message passing pattern for the acked,
> respectively bad_irq and bad_sender, because DSB guarantees that all memory
> accesses that come before the barrier have completed, not that they have
> completed in program order.
I guess the removal of the smp_rmb in check_spurious should belong to
that patch?
> 
> Signed-off-by: Alexandru Elisei 
Besides, AFAIU

Reviewed-by: Eric Auger 

Thanks

Eric
> ---
>  arm/gic.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index 5727d72a0ef3..544c283f5f47 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -161,8 +161,10 @@ static void ipi_handler(struct pt_regs *regs __unused)
>   ++acked[smp_processor_id()];
>   } else {
>   ++spurious[smp_processor_id()];
> - smp_wmb();
>   }
> +
> + /* Wait for writes to acked/spurious to complete */
> + dsb(ishst);
>  }
>  
>  static void setup_irq(irq_handler_fn handler)
> 



Re: [kvm-unit-tests PATCH 05/10] arm/arm64: gic: Use correct memory ordering for the IPI test

2020-12-03 Thread Alexandru Elisei
Hi Eric,

On 12/3/20 1:10 PM, Auger Eric wrote:
> Hi Alexandru,
>
> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
>> The IPI test works by sending IPIs to even numbered CPUs from the
>> IPI_SENDER CPU (CPU1), and then checking that the other CPUs received the
>> interrupts as expected. The check is done in check_acked() by the
>> IPI_SENDER CPU with the help of three arrays:
>>
>> - acked, where acked[i] == 1 means that CPU i received the interrupt.
>> - bad_irq, where bad_irq[i] == -1 means that the interrupt received by CPU
>>   i had the expected interrupt number (IPI_IRQ).
>> - bad_sender, where bad_sender[i] == -1 means that the interrupt received
>>   by CPU i was from the expected sender (IPI_SENDER, GICv2 only).
>>
>> The assumption made by check_acked() is that if a CPU acked an interrupt,
>> then bad_sender and bad_irq have also been updated. This is a common
>> inter-thread communication pattern called message passing.  For message
>> passing to work correctly on weakly consistent memory model architectures,
>> like arm and arm64, barriers or address dependencies are required. This is
>> described in ARM DDI 0487F.b, in "Armv7 compatible approaches for ordering,
>> using DMB and DSB barriers" (page K11-7993), in the section with a single
>> observer, which is in our case the IPI_SENDER CPU.
>>
>> The IPI test attempts to enforce the correct ordering using memory
>> barriers, but it's not enough. For example, the program execution below is
>> valid from an architectural point of view:
>>
>> 3 online CPUs, initial state (from stats_reset()):
>>
>> acked[2] = 0;
>> bad_sender[2] = -1;
>> bad_irq[2] = -1;
>>
>> CPU1 (in check_acked())  | CPU2 (in ipi_handler())
>>  |
>> smp_rmb() // DMB ISHLD   | acked[2]++;
>> read 1 from acked[2] |
>> nr_pass++ // nr_pass = 3 |
>> read -1 from bad_sender[2]   |
>> read -1 from bad_irq[2]  |
>>  | // in check_ipi_sender()
>>  | bad_sender[2] = 
>>  | // in check_irqnr()
>>  | bad_irq[2] = 
>>  | smp_wmb() // DMB ISHST
>> nr_pass == nr_cpus, return   |
>>
>> In this scenario, CPU1 will read the updated acked value, but it will read
>> the initial bad_sender and bad_irq values. This is permitted because the
>> memory barriers do not create a data dependency between the value read from
> acked and the values read from bad_irq and bad_sender on CPU1, respectively
>> between the values written to acked, bad_sender and bad_irq on CPU2.
>>
>> To avoid this situation, let's reorder the barriers and accesses to the
>> arrays to create the needed dependencies that ensure that message passing
>> behaves as expected.
>>
>> In the interrupt handler, the writes to bad_sender and bad_irq are
>> reordered before the write to acked and a smp_wmb() barrier is added. This
>> ensures that if other PEs observe the write to acked, then they will also
>> observe the writes to the other two arrays.
>>
>> In check_acked(), put the smp_rmb() barrier after the read from acked to
>> ensure that the subsequent reads from bad_sender, respectively bad_irq,
>> aren't reordered locally by the PE.
>>
>> With these changes, the expected ordering of accesses is respected and we
>> end up with the pattern described in the Arm ARM and also in the Linux
>> litmus test MP+fencewmbonceonce+fencermbonceonce.litmus from
>> tools/memory-model/litmus-tests. More examples and explanations can be
>> found in the Linux source tree, in Documentation/memory-barriers.txt, in
>> the sections "SMP BARRIER PAIRING" and "READ MEMORY BARRIERS VS LOAD
>> SPECULATION".
>>
>> For consistency with ipi_handler(), the array accesses in
>> ipi_clear_active_handler() have also been reordered. This shouldn't affect
>> the functionality of that test.
>>
>> Signed-off-by: Alexandru Elisei 
>> ---
>>  arm/gic.c | 9 -
>>  1 file changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/arm/gic.c b/arm/gic.c
>> index 7befda2a8673..bcb834406d23 100644
>> --- a/arm/gic.c
>> +++ b/arm/gic.c
>> @@ -73,9 +73,9 @@ static void check_acked(const char *testname, cpumask_t 
>> *mask)
>>  mdelay(100);
>>  nr_pass = 0;
>>  for_each_present_cpu(cpu) {
>> -smp_rmb();
>>  nr_pass += cpumask_test_cpu(cpu, mask) ?
>>  acked[cpu] == 1 : acked[cpu] == 0;
>> +smp_rmb(); /* pairs with smp_wmb in ipi_handler */
>>  
>>  if (bad_sender[cpu] != -1) {
>>  printf("cpu%d received IPI from wrong sender 
>> %d\n",
>> @@ -118,7 +118,6 @@ static void check_spurious(void)
>>  {
>>  int cpu;
>>  
>> -smp_rmb();
> this change is not documented in the commit msg.

You are right. I think this is a rebasing mistake and should actually be part of
#7 ("arm/arm64: gic: Wait 

Re: [kvm-unit-tests PATCH 04/10] arm/arm64: gic: Remove unnecessary synchronization with stats_reset()

2020-12-03 Thread Auger Eric
Hi,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The GICv3 driver executes a DSB barrier before sending an IPI, which
> ensures that memory accesses have completed. This removes the need to
> enforce ordering with respect to stats_reset() in the IPI handler.
> 
> For GICv2, we still need the DMB to ensure ordering between the read of the
> GICC_IAR MMIO register and the read from the acked array. It also matches
> what the Linux GICv2 driver does in gic_handle_irq().
> 
> Signed-off-by: Alexandru Elisei 
Reviewed-by: Eric Auger 

Eric
> ---
>  arm/gic.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index 4e947e8516a2..7befda2a8673 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -60,7 +60,6 @@ static void stats_reset(void)
>   bad_sender[i] = -1;
>   bad_irq[i] = -1;
>   }
> - smp_wmb();
>  }
>  
>  static void check_acked(const char *testname, cpumask_t *mask)
> @@ -150,7 +149,13 @@ static void ipi_handler(struct pt_regs *regs __unused)
>  
>   if (irqnr != GICC_INT_SPURIOUS) {
>   gic_write_eoir(irqstat);
> - smp_rmb(); /* pairs with wmb in stats_reset */
> + /*
> +  * Make sure data written before the IPI was triggered is
> +  * observed after the IAR is read. Pairs with the smp_wmb
> +  * when sending the IPI.
> +  */
> + if (gic_version() == 2)
> + smp_rmb();
>   ++acked[smp_processor_id()];
>   check_ipi_sender(irqstat);
>   check_irqnr(irqnr);
> 



Re: [kvm-unit-tests PATCH 06/10] arm/arm64: gic: Check spurious and bad_sender in the active test

2020-12-03 Thread Auger Eric
Hi Alexandru,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The gicv{2,3}-active test sends an IPI from the boot CPU to itself, then
> checks that the interrupt has been received as expected. The
> ipi_clear_active_handler() clears the active state of the interrupt with a
> write to the GICD_ICACTIVER register instead of writing the to EOI
> register.
> 
> When acknowledging the interrupt it is possible to get back an spurious
> interrupt ID (ID 1023), and the interrupt handler increments the number of
> spurious interrupts received on the current processor. However, this is not
> checked at the end of the test. Let's also check for spurious interrupts,
> like the IPI test does.
> 
> For IPIs on GICv2, the value returned by a read of the GICC_IAR register
> performed when acknowledging the interrupt also contains the sender CPU
> ID. Add a check for that too.
> 
> Signed-off-by: Alexandru Elisei 
Reviewed-by: Eric Auger 

Eric
> ---
>  arm/gic.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index bcb834406d23..5727d72a0ef3 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -125,12 +125,12 @@ static void check_spurious(void)
>   }
>  }
>  
> -static void check_ipi_sender(u32 irqstat)
> +static void check_ipi_sender(u32 irqstat, int sender)
>  {
>   if (gic_version() == 2) {
>   int src = (irqstat >> 10) & 7;
>  
> - if (src != IPI_SENDER)
> + if (src != sender)
>   bad_sender[smp_processor_id()] = src;
>   }
>  }
> @@ -155,7 +155,7 @@ static void ipi_handler(struct pt_regs *regs __unused)
>*/
>   if (gic_version() == 2)
>   smp_rmb();
> - check_ipi_sender(irqstat);
> + check_ipi_sender(irqstat, IPI_SENDER);
>   check_irqnr(irqnr);
>   smp_wmb(); /* pairs with smp_rmb in check_acked */
>   ++acked[smp_processor_id()];
> @@ -382,6 +382,7 @@ static void ipi_clear_active_handler(struct pt_regs *regs 
> __unused)
>  
>   writel(val, base + GICD_ICACTIVER);
>  
> + check_ipi_sender(irqstat, smp_processor_id());
>   check_irqnr(irqnr);
>   ++acked[smp_processor_id()];
>   } else {
> @@ -394,6 +395,7 @@ static void run_active_clear_test(void)
>   report_prefix_push("active");
>   setup_irq(ipi_clear_active_handler);
>   ipi_test_self();
> + check_spurious();
>   report_prefix_pop();
>  }
>  
> 



Re: [kvm-unit-tests PATCH 05/10] arm/arm64: gic: Use correct memory ordering for the IPI test

2020-12-03 Thread Auger Eric
Hi Alexandru,

On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The IPI test works by sending IPIs to even numbered CPUs from the
> IPI_SENDER CPU (CPU1), and then checking that the other CPUs received the
> interrupts as expected. The check is done in check_acked() by the
> IPI_SENDER CPU with the help of three arrays:
> 
> - acked, where acked[i] == 1 means that CPU i received the interrupt.
> - bad_irq, where bad_irq[i] == -1 means that the interrupt received by CPU
>   i had the expected interrupt number (IPI_IRQ).
> - bad_sender, where bad_sender[i] == -1 means that the interrupt received
>   by CPU i was from the expected sender (IPI_SENDER, GICv2 only).
> 
> The assumption made by check_acked() is that if a CPU acked an interrupt,
> then bad_sender and bad_irq have also been updated. This is a common
> inter-thread communication pattern called message passing.  For message
> passing to work correctly on weakly consistent memory model architectures,
> like arm and arm64, barriers or address dependencies are required. This is
> described in ARM DDI 0487F.b, in "Armv7 compatible approaches for ordering,
> using DMB and DSB barriers" (page K11-7993), in the section with a single
> observer, which is in our case the IPI_SENDER CPU.
> 
> The IPI test attempts to enforce the correct ordering using memory
> barriers, but it's not enough. For example, the program execution below is
> valid from an architectural point of view:
> 
> 3 online CPUs, initial state (from stats_reset()):
> 
> acked[2] = 0;
> bad_sender[2] = -1;
> bad_irq[2] = -1;
> 
> CPU1 (in check_acked())   | CPU2 (in ipi_handler())
>   |
> smp_rmb() // DMB ISHLD| acked[2]++;
> read 1 from acked[2]  |
> nr_pass++ // nr_pass = 3  |
> read -1 from bad_sender[2]|
> read -1 from bad_irq[2]   |
>   | // in check_ipi_sender()
>   | bad_sender[2] = 
>   | // in check_irqnr()
>   | bad_irq[2] = 
>   | smp_wmb() // DMB ISHST
> nr_pass == nr_cpus, return|
> 
> In this scenario, CPU1 will read the updated acked value, but it will read
> the initial bad_sender and bad_irq values. This is permitted because the
> memory barriers do not create a data dependency between the value read from
> acked and the values read from bad_irq and bad_sender on CPU1, respectively
> between the values written to acked, bad_sender and bad_irq on CPU2.
> 
> To avoid this situation, let's reorder the barriers and accesses to the
> arrays to create the needed dependencies that ensure that message passing
> behaves as expected.
> 
> In the interrupt handler, the writes to bad_sender and bad_irq are
> reordered before the write to acked and a smp_wmb() barrier is added. This
> ensures that if other PEs observe the write to acked, then they will also
> observe the writes to the other two arrays.
> 
> In check_acked(), put the smp_rmb() barrier after the read from acked to
> ensure that the subsequent reads from bad_sender, respectively bad_irq,
> aren't reordered locally by the PE.
> 
> With these changes, the expected ordering of accesses is respected and we
> end up with the pattern described in the Arm ARM and also in the Linux
> litmus test MP+fencewmbonceonce+fencermbonceonce.litmus from
> tools/memory-model/litmus-tests. More examples and explanations can be
> found in the Linux source tree, in Documentation/memory-barriers.txt, in
> the sections "SMP BARRIER PAIRING" and "READ MEMORY BARRIERS VS LOAD
> SPECULATION".
> 
> For consistency with ipi_handler(), the array accesses in
> ipi_clear_active_handler() have also been reordered. This shouldn't affect
> the functionality of that test.
> 
> Signed-off-by: Alexandru Elisei 
> ---
>  arm/gic.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index 7befda2a8673..bcb834406d23 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -73,9 +73,9 @@ static void check_acked(const char *testname, cpumask_t 
> *mask)
>   mdelay(100);
>   nr_pass = 0;
>   for_each_present_cpu(cpu) {
> - smp_rmb();
>   nr_pass += cpumask_test_cpu(cpu, mask) ?
>   acked[cpu] == 1 : acked[cpu] == 0;
> + smp_rmb(); /* pairs with smp_wmb in ipi_handler */
>  
>   if (bad_sender[cpu] != -1) {
>   printf("cpu%d received IPI from wrong sender 
> %d\n",
> @@ -118,7 +118,6 @@ static void check_spurious(void)
>  {
>   int cpu;
>  
> - smp_rmb();
this change is not documented in the commit msg.
>   for_each_present_cpu(cpu) {
>   if (spurious[cpu])
>   report_info("WARN: cpu%d got %d spurious interrupts",
> @@ -156,10 +155,10 @@ static void ipi_handler(struct pt_regs *regs __unused)

Re: [PATCH v13 05/15] iommu/smmuv3: Get prepared for nested stage support

2020-12-03 Thread Auger Eric
Hi Kunkun,

On 12/3/20 1:32 PM, Kunkun Jiang wrote:
> Hi Eric,
> 
> On 2020/11/18 19:21, Eric Auger wrote:
>> When nested stage translation is setup, both s1_cfg and
>> s2_cfg are set.
>>
>> We introduce a new smmu domain abort field that will be set
>> upon guest stage1 configuration passing.
>>
>> arm_smmu_write_strtab_ent() is modified to write both stage
>> fields in the STE and deal with the abort field.
>>
>> In nested mode, only stage 2 is "finalized" as the host does
>> not own/configure the stage 1 context descriptor; guest does.
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
>> v10 -> v11:
>> - Fix an issue reported by Shameer when switching from with vSMMU
>>   to without vSMMU. Despite the spec does not seem to mention it
>>   seems to be needed to reset the 2 high 64b when switching from
>>   S1+S2 cfg to S1 only. Especially dst[3] needs to be reset (S2TTB).
>>   On some implementations, if the S2TTB is not reset, this causes
>>   a C_BAD_STE error
>> ---
>>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 64 +
>>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 +
>>  2 files changed, 56 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> index 18ac5af1b284..412ea1bafa50 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> @@ -1181,8 +1181,10 @@ static void arm_smmu_write_strtab_ent(struct 
>> arm_smmu_master *master, u32 sid,
>>   * three cases at the moment:
> Now, it should be *five cases*.
>>   *
>>   * 1. Invalid (all zero) -> bypass/fault (init)
>> - * 2. Bypass/fault -> translation/bypass (attach)
>> - * 3. Translation/bypass -> bypass/fault (detach)
>> + * 2. Bypass/fault -> single stage translation/bypass (attach)
>> + * 3. Single or nested stage Translation/bypass -> bypass/fault (detach)
>> + * 4. S2 -> S1 + S2 (attach_pasid_table)
> 
> I was testing this series on one of our hardware boards with SMMUv3, and
> I found that while trying to "attach_pasid_table", the sequence of STE
> (host) config (bits[3:1]) is "S2 -> abort -> S1 + S2", because the
> maintenance is "Write everything apart from dword 0, sync, write dword 0,
> sync" when we update the STE (guest). Does the sequence meet your
> expectation?

yes it does. I will fix the comments accordingly.

Is there anything to correct in the code or was it functional?
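
For reference, the dword ordering being described is roughly the
following (a simplified sketch of the tail of arm_smmu_write_strtab_ent(),
not a verbatim copy):

	/* Write dwords 1..3 first, then sync so the SMMU observes them */
	for (i = 1; i < STRTAB_STE_DWORDS; i++)
		dst[i] = cpu_to_le64(ste[i]);
	arm_smmu_sync_ste_for_sid(smmu, sid);

	/* Only then publish dword 0 (which holds V and Config), and sync */
	WRITE_ONCE(dst[0], cpu_to_le64(ste[0]));
	arm_smmu_sync_ste_for_sid(smmu, sid);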

Thanks

Eric
> 
>> + * 5. S1 + S2 -> S2 (detach_pasid_table)
>>   *
>>   * Given that we can't update the STE atomically and the SMMU
>>   * doesn't read the thing in a defined order, that leaves us
>> @@ -1193,7 +1195,8 @@ static void arm_smmu_write_strtab_ent(struct 
>> arm_smmu_master *master, u32 sid,
>>   * 3. Update Config, sync
>>   */
>>  u64 val = le64_to_cpu(dst[0]);
>> -bool ste_live = false;
>> +bool s1_live = false, s2_live = false, ste_live;
>> +bool abort, nested = false, translate = false;
>>  struct arm_smmu_device *smmu = NULL;
>>  struct arm_smmu_s1_cfg *s1_cfg;
>>  struct arm_smmu_s2_cfg *s2_cfg;
>> @@ -1233,6 +1236,8 @@ static void arm_smmu_write_strtab_ent(struct 
>> arm_smmu_master *master, u32 sid,
>>  default:
>>  break;
>>  }
>> +nested = s1_cfg->set && s2_cfg->set;
>> +translate = s1_cfg->set || s2_cfg->set;
>>  }
>>  
>>  if (val & STRTAB_STE_0_V) {
>> @@ -1240,23 +1245,36 @@ static void arm_smmu_write_strtab_ent(struct 
>> arm_smmu_master *master, u32 sid,
>>  case STRTAB_STE_0_CFG_BYPASS:
>>  break;
>>  case STRTAB_STE_0_CFG_S1_TRANS:
>> +s1_live = true;
>> +break;
>>  case STRTAB_STE_0_CFG_S2_TRANS:
>> -ste_live = true;
>> +s2_live = true;
>> +break;
>> +case STRTAB_STE_0_CFG_NESTED:
>> +s1_live = true;
>> +s2_live = true;
>>  break;
>>  case STRTAB_STE_0_CFG_ABORT:
>> -BUG_ON(!disable_bypass);
>>  break;
>>  default:
>>  BUG(); /* STE corruption */
>>  }
>>  }
>>  
>> +ste_live = s1_live || s2_live;
>> +
>>  /* Nuke the existing STE_0 value, as we're going to rewrite it */
>>  val = STRTAB_STE_0_V;
>>  
>>  /* Bypass/fault */
>> -if (!smmu_domain || !(s1_cfg->set || s2_cfg->set)) {
>> -if (!smmu_domain && disable_bypass)
>> +
>> +if (!smmu_domain)
>> +abort = disable_bypass;
>> +else
>> +abort = smmu_domain->abort;
>> +
>> +if (abort || !translate) {
>> +if (abort)
>>  val |= FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_ABORT);
>>  else
>>  val 

Re: [PATCH v4 16/26] kvm: arm64: Bootstrap PSCI SMC handler in nVHE EL2

2020-12-03 Thread Mark Rutland
On Wed, Dec 02, 2020 at 06:41:12PM +, David Brazdil wrote:
> Add a handler of PSCI SMCs in nVHE hyp code. The handler is initialized
> with the version used by the host's PSCI driver and the function IDs it
> was configured with. If the SMC function ID matches one of the
> configured PSCI calls (for v0.1) or falls into the PSCI function ID
> range (for v0.2+), the SMC is handled by the PSCI handler. For now, all
> SMCs return PSCI_RET_NOT_SUPPORTED.
> 
> Signed-off-by: David Brazdil 

> +static bool is_psci_0_1_call(u64 func_id)
> +{
> + return (func_id == kvm_host_psci_0_1_function_ids.cpu_suspend) ||
> +(func_id == kvm_host_psci_0_1_function_ids.cpu_on) ||
> +(func_id == kvm_host_psci_0_1_function_ids.cpu_off) ||
> +(func_id == kvm_host_psci_0_1_function_ids.migrate);
> +}

One minor thing, as I just spotted on an earlier patch: if FW doesn't
implement one of these, the ID will be 0, so we might need to snapshot
whether or not the function is enabled to stop spurious calls to FID 0.

To be clear, that can be done in a follow-up if necessary.
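
As a sketch, something along these lines (not necessarily the shape of
the eventual fix):

	static bool is_psci_0_1_call(u64 func_id)
	{
		/*
		 * A function ID of 0 means FW did not provide the
		 * function, so never treat it as a PSCI call.
		 */
		if (!func_id)
			return false;

		return (func_id == kvm_host_psci_0_1_function_ids.cpu_suspend) ||
		       (func_id == kvm_host_psci_0_1_function_ids.cpu_on) ||
		       (func_id == kvm_host_psci_0_1_function_ids.cpu_off) ||
		       (func_id == kvm_host_psci_0_1_function_ids.migrate);
	}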

Thanks,
Mark.


Re: [PATCH v4 06/26] psci: Add accessor for psci_0_1_function_ids

2020-12-03 Thread Mark Rutland
On Wed, Dec 02, 2020 at 06:41:02PM +, David Brazdil wrote:
> Make it possible to retrieve a copy of the psci_0_1_function_ids struct.
> This is useful for KVM if it is configured to intercept host's PSCI SMCs.
> 
> Signed-off-by: David Brazdil 

Acked-by: Mark Rutland 

... just to check, does KVM snapshot which function IDs are valid, or do
we want to add that state here too? That can be a follow-up if
necessary.

Thanks,
Mark.

> ---
>  drivers/firmware/psci/psci.c | 12 +---
>  include/linux/psci.h |  9 +
>  2 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
> index 593fdd0e09a2..f5fc429cae3f 100644
> --- a/drivers/firmware/psci/psci.c
> +++ b/drivers/firmware/psci/psci.c
> @@ -58,15 +58,13 @@ typedef unsigned long (psci_fn)(unsigned long, unsigned 
> long,
>   unsigned long, unsigned long);
>  static psci_fn *invoke_psci_fn;
>  
> -struct psci_0_1_function_ids {
> - u32 cpu_suspend;
> - u32 cpu_on;
> - u32 cpu_off;
> - u32 migrate;
> -};
> -
>  static struct psci_0_1_function_ids psci_0_1_function_ids;
>  
> +struct psci_0_1_function_ids get_psci_0_1_function_ids(void)
> +{
> + return psci_0_1_function_ids;
> +}
> +
>  #define PSCI_0_2_POWER_STATE_MASK\
>   (PSCI_0_2_POWER_STATE_ID_MASK | \
>   PSCI_0_2_POWER_STATE_TYPE_MASK | \
> diff --git a/include/linux/psci.h b/include/linux/psci.h
> index 2a1bfb890e58..4ca0060a3fc4 100644
> --- a/include/linux/psci.h
> +++ b/include/linux/psci.h
> @@ -34,6 +34,15 @@ struct psci_operations {
>  
>  extern struct psci_operations psci_ops;
>  
> +struct psci_0_1_function_ids {
> + u32 cpu_suspend;
> + u32 cpu_on;
> + u32 cpu_off;
> + u32 migrate;
> +};
> +
> +struct psci_0_1_function_ids get_psci_0_1_function_ids(void);
> +
>  #if defined(CONFIG_ARM_PSCI_FW)
>  int __init psci_dt_init(void);
>  #else
> -- 
> 2.29.2.454.gaff20da3a2-goog
> 


Re: [PATCH v4 04/26] psci: Split functions to v0.1 and v0.2+ variants

2020-12-03 Thread Mark Rutland
On Wed, Dec 02, 2020 at 06:41:00PM +, David Brazdil wrote:
> Refactor implementation of v0.1+ functions (CPU_SUSPEND, CPU_OFF,
> CPU_ON, MIGRATE) to have two functions psci_0_1_foo / psci_0_2_foo that
> select the function ID and call a common helper __psci_foo.
> 
> This is a small cleanup so that the function ID array is only used for
> v0.1 configurations.
> 
> Signed-off-by: David Brazdil 

Acked-by: Mark Rutland 

Mark.

> ---
>  drivers/firmware/psci/psci.c | 94 +++-
>  1 file changed, 60 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
> index ace5b9ac676c..13b9ed71b446 100644
> --- a/drivers/firmware/psci/psci.c
> +++ b/drivers/firmware/psci/psci.c
> @@ -168,46 +168,80 @@ int psci_set_osi_mode(bool enable)
>   return psci_to_linux_errno(err);
>  }
>  
> -static int psci_cpu_suspend(u32 state, unsigned long entry_point)
> +static int __psci_cpu_suspend(u32 fn, u32 state, unsigned long entry_point)
>  {
>   int err;
> - u32 fn;
>  
> - fn = psci_function_id[PSCI_FN_CPU_SUSPEND];
>   err = invoke_psci_fn(fn, state, entry_point, 0);
>   return psci_to_linux_errno(err);
>  }
>  
> -static int psci_cpu_off(u32 state)
> +static int psci_0_1_cpu_suspend(u32 state, unsigned long entry_point)
> +{
> + return __psci_cpu_suspend(psci_function_id[PSCI_FN_CPU_SUSPEND],
> +   state, entry_point);
> +}
> +
> +static int psci_0_2_cpu_suspend(u32 state, unsigned long entry_point)
> +{
> + return __psci_cpu_suspend(PSCI_FN_NATIVE(0_2, CPU_SUSPEND),
> +   state, entry_point);
> +}
> +
> +static int __psci_cpu_off(u32 fn, u32 state)
>  {
>   int err;
> - u32 fn;
>  
> - fn = psci_function_id[PSCI_FN_CPU_OFF];
>   err = invoke_psci_fn(fn, state, 0, 0);
>   return psci_to_linux_errno(err);
>  }
>  
> -static int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
> +static int psci_0_1_cpu_off(u32 state)
> +{
> + return __psci_cpu_off(psci_function_id[PSCI_FN_CPU_OFF], state);
> +}
> +
> +static int psci_0_2_cpu_off(u32 state)
> +{
> + return __psci_cpu_off(PSCI_0_2_FN_CPU_OFF, state);
> +}
> +
> +static int __psci_cpu_on(u32 fn, unsigned long cpuid, unsigned long entry_point)
>  {
>   int err;
> - u32 fn;
>  
> - fn = psci_function_id[PSCI_FN_CPU_ON];
>   err = invoke_psci_fn(fn, cpuid, entry_point, 0);
>   return psci_to_linux_errno(err);
>  }
>  
> -static int psci_migrate(unsigned long cpuid)
> +static int psci_0_1_cpu_on(unsigned long cpuid, unsigned long entry_point)
> +{
> + return __psci_cpu_on(psci_function_id[PSCI_FN_CPU_ON], cpuid, 
> entry_point);
> +}
> +
> +static int psci_0_2_cpu_on(unsigned long cpuid, unsigned long entry_point)
> +{
> + return __psci_cpu_on(PSCI_FN_NATIVE(0_2, CPU_ON), cpuid, entry_point);
> +}
> +
> +static int __psci_migrate(u32 fn, unsigned long cpuid)
>  {
>   int err;
> - u32 fn;
>  
> - fn = psci_function_id[PSCI_FN_MIGRATE];
>   err = invoke_psci_fn(fn, cpuid, 0, 0);
>   return psci_to_linux_errno(err);
>  }
>  
> +static int psci_0_1_migrate(unsigned long cpuid)
> +{
> + return __psci_migrate(psci_function_id[PSCI_FN_MIGRATE], cpuid);
> +}
> +
> +static int psci_0_2_migrate(unsigned long cpuid)
> +{
> + return __psci_migrate(PSCI_FN_NATIVE(0_2, MIGRATE), cpuid);
> +}
> +
>  static int psci_affinity_info(unsigned long target_affinity,
>   unsigned long lowest_affinity_level)
>  {
> @@ -352,7 +386,7 @@ static void __init psci_init_system_suspend(void)
>  
>  static void __init psci_init_cpu_suspend(void)
>  {
> - int feature = psci_features(psci_function_id[PSCI_FN_CPU_SUSPEND]);
> + int feature = psci_features(PSCI_FN_NATIVE(0_2, CPU_SUSPEND));
>  
>   if (feature != PSCI_RET_NOT_SUPPORTED)
>   psci_cpu_suspend_feature = feature;
> @@ -426,24 +460,16 @@ static void __init psci_init_smccc(void)
>  static void __init psci_0_2_set_functions(void)
>  {
>   pr_info("Using standard PSCI v0.2 function IDs\n");
> - psci_ops.get_version = psci_0_2_get_version;
> -
> - psci_function_id[PSCI_FN_CPU_SUSPEND] =
> - PSCI_FN_NATIVE(0_2, CPU_SUSPEND);
> - psci_ops.cpu_suspend = psci_cpu_suspend;
> -
> - psci_function_id[PSCI_FN_CPU_OFF] = PSCI_0_2_FN_CPU_OFF;
> - psci_ops.cpu_off = psci_cpu_off;
> -
> - psci_function_id[PSCI_FN_CPU_ON] = PSCI_FN_NATIVE(0_2, CPU_ON);
> - psci_ops.cpu_on = psci_cpu_on;
>  
> - psci_function_id[PSCI_FN_MIGRATE] = PSCI_FN_NATIVE(0_2, MIGRATE);
> - psci_ops.migrate = psci_migrate;
> -
> - psci_ops.affinity_info = psci_affinity_info;
> -
> - psci_ops.migrate_info_type = psci_migrate_info_type;
> + psci_ops = (struct psci_operations){
> + .get_version = psci_0_2_get_version,
> + .cpu_suspend = psci_0_2_cpu_suspend,
> + .cpu_off = 

Re: [RFC PATCH v3 04/16] arm64: Introduce CPU SPE feature

2020-12-03 Thread Alexandru Elisei
Hi Will,

On 12/2/20 5:23 PM, Will Deacon wrote:
> On Wed, Dec 02, 2020 at 02:29:31PM +, Alexandru Elisei wrote:
>> On 11/19/20 4:58 PM, James Morse wrote:
>>> On 27/10/2020 17:26, Alexandru Elisei wrote:
 Detect Statistical Profiling Extension (SPE) support using the cpufeatures
 framework. The presence of SPE is reported via the ARM64_SPE capability.

 The feature will be necessary for emulating SPE in KVM, because KVM needs
 all CPUs to have SPE hardware to avoid scheduling a VCPU on a CPU without
 support. For this reason, the feature type ARM64_CPUCAP_SYSTEM_FEATURE has
 been selected to disallow hotplugging a CPU which doesn't support SPE.
>>> Can you mention the existing driver in the commit message? Surprisingly it 
>>> doesn't use
>>> cpufeature at all. It looks like arm_spe_pmu_dev_init() goes out of its way 
>>> to support
>>> mismatched systems. (otherwise the significance of the new behaviour isn't 
>>> clear!)
>>>
>>> I read it as: the host is fine with mismatched systems, and the existing 
>>> drivers supports
>>> this. But KVM is not. After this patch you can't make the system mismatched 
>>> 'late'.
>> That was exactly my intention. Certainly, I'll try to make the commit message
>> clearer by mentioning the SPE driver.
> Hmm, so are you saying that with this patch applied, a machine where KVM
> isn't even being used can no longer late-online CPUs without SPE if the boot
> CPUs had it? If so, then I don't think that's acceptable, unfortunately.

Yes, the idea was to prevent hotplugging CPUs that don't have the capability so
the kernel won't schedule a SPE-enabled guest on a CPU which doesn't have SPE.

>
> As James points out, the current driver is very careful to support
> big.LITTLE misconfigurations and I don't see why KVM support should change
> that.

That makes sense, thank you for making it clear from the start that this 
approach
is not the right one.

There was a discussion about supporting KVM SPE on heterogeneous systems [1]. I
chose to use a capability because the focus for this iteration was to ensure the
correctness of SPE emulation, and the capability looked like the easiest way to
get KVM SPE up and running for testing.

The idea discussed previously [1] was to have userspace configure the VM with a
cpumask representing the CPUs the VM is allowed to run on. KVM detects if the 
VCPU
is scheduled on a physical CPU not in the cpumask, and returns from KVM_RUN with
an error code. That looks like a good solution to me and generic enough that it
can be used for all sorts of mismatched features. I will try to implement it in
the next iteration, after I get more feedback on the current series.

[1] https://www.spinics.net/lists/arm-kernel/msg778477.html
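
As a rough sketch of that idea (the field name and error code below are
hypothetical, not actual KVM code):

	/* Called on the KVM_RUN path, with preemption disabled. */
	static int kvm_vcpu_check_cpu_allowed(struct kvm_vcpu *vcpu)
	{
		/* supported_cpus would be configured by userspace. */
		if (!cpumask_test_cpu(smp_processor_id(),
				      &vcpu->kvm->arch.supported_cpus))
			return -ENOEXEC; /* KVM_RUN returns to userspace */

		return 0;
	}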

Thanks,
Alex


Re: [kvm-unit-tests PATCH 03/10] arm/arm64: gic: Remove memory synchronization from ipi_clear_active_handler()

2020-12-03 Thread Auger Eric
Hi Alexandru,

On 12/2/20 3:14 PM, Alexandru Elisei wrote:
> Hi,
> 
> On 12/2/20 2:02 PM, Alexandru Elisei wrote:
> 
>> Hi Eric,
>>
>> On 12/1/20 4:37 PM, Auger Eric wrote:
>>> Hi Alexandru,
>>>
>>> On 11/25/20 4:51 PM, Alexandru Elisei wrote:
 The gicv{2,3}-active test sends an IPI from the boot CPU to itself, then
 checks that the interrupt has been received as expected. There is no need
 to use inter-processor memory synchronization primitives on code that runs
 on the same CPU, so remove the unneeded memory barriers.

 The arrays are modified asynchronously (in the interrupt handler) and it is
 possible for the compiler to infer that they won't be changed during normal
 program flow and try to perform harmful optimizations (like stashing a
 previous read in a register and reusing it). To prevent this, for GICv2,
 the smp_wmb() in gicv2_ipi_send_self() is replaced with a compiler barrier.
 For GICv3, the wmb() barrier in gic_ipi_send_single() already implies a
 compiler barrier.

 Signed-off-by: Alexandru Elisei 
 ---
  arm/gic.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

 diff --git a/arm/gic.c b/arm/gic.c
 index 401ffafe4299..4e947e8516a2 100644
 --- a/arm/gic.c
 +++ b/arm/gic.c
 @@ -12,6 +12,7 @@
   * This work is licensed under the terms of the GNU LGPL, version 2.
   */
  #include 
 +#include 
  #include 
  #include 
  #include 
 @@ -260,7 +261,8 @@ static void check_lpi_hits(int *expected, const char 
 *msg)
  
  static void gicv2_ipi_send_self(void)
  {> -  smp_wmb();
>>> nit: previous patch added it and this patch removes it. maybe squash the
>>> modifs into the previous patch saying only a barrier() is needed for self()?
>> You're right, this does look out of place. I'll merge this change into the
>> previous patch.
 +  /* Prevent the compiler from optimizing memory accesses */
 +  barrier();
writel(2 << 24 | IPI_IRQ, gicv2_dist_base() + GICD_SGIR);
  }
  
 @@ -359,6 +361,7 @@ static struct gic gicv3 = {
},
  };
  
 +/* Runs on the same CPU as the sender, no need for memory synchronization 
 */
  static void ipi_clear_active_handler(struct pt_regs *regs __unused)
  {
u32 irqstat = gic_read_iar();
 @@ -375,13 +378,10 @@ static void ipi_clear_active_handler(struct pt_regs 
 *regs __unused)
  
writel(val, base + GICD_ICACTIVER);
  
 -  smp_rmb(); /* pairs with wmb in stats_reset */
>>> the comment says it is paired with wmb in stats_reset. So is it OK to
>>> leave the associated wmb?
>> This patch removes multi-processor synchronization from the functions that 
>> run on
>> the same CPU. stats_reset() can be called from one CPU (the IPI_SENDER CPU) 
>> and
>> the variables it modifies accessed by the interrupt handlers running on 
>> different
>> CPUs, like it happens for the IPI tests. In that case we do need the proper
>> barriers in place.
> 
> Sorry, got confused about what you were asking. The next patch removes the
> smp_wmb() from stats_reset() which became redundant after the barriers
> added to the GIC functions that send IPIs. This patch is about removing
> barriers that were never needed in the first place because the functions
> were running on the same CPU; it's not dependent on any GIC changes.

OK, I get it. I was just confused by this pairing comment, as we remove
one item and not the other, but that's not an issue here since we do not
need the barrier in that case.

Feel free to add my R-b w/ or wo the squash:
Reviewed-by: Eric Auger 

Thanks

Eric
> 
> Thanks,
> Alex
> 
