Re: [PATCH v2 00/94] KVM: arm64: ARMv8.3/8.4 Nested Virtualization support
Hi Zengtao,

On 2020-04-16 02:38, Zengtao (B) wrote:
> Hi Marc:
>
> Got it. Really a big patch set :)

Well, yeah... ;-)

> BTW, I have done a basic kvm unit test
> (git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git)
> and I find that after applying the patch "KVM: arm64: VNCR-ize ELR_EL1",
> the psci test fails for some reason. I can't understand why; this is only
> the test result. (found the patch by git bisect + kvm unit test)

If it is that mechanical, we should be able to quickly nail that one.

> My platform: Hisilicon D06 board.
> Linux kernel: Linux 5.6-rc6 + nv patches (some rebases)
> Could you help to take a look?

I'll have a look tomorrow. I'm in the middle of refactoring the series
for 5.7, and things have changed quite a bit. Hopefully this isn't a
VHE vs non-VHE issue.

Thanks,
M.
--
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH 8/8] arm64: cpufeature: Add an overview comment for the cpufeature framework
On Thu, Apr 16, 2020 at 03:59:39PM +0100, Suzuki K Poulose wrote:
> On 04/14/2020 10:31 PM, Will Deacon wrote:
> > Now that Suzuki isn't within throwing distance, I thought I'd better add
> > a rough overview comment to cpufeature.c so that it doesn't take me days
> > to remember how it works next time.
> >
> > Signed-off-by: Will Deacon
> > ---
> >  arch/arm64/kernel/cpufeature.c | 43 ++
> >  1 file changed, 43 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index 680a453ca8c4..421ca99dc8fc 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -3,6 +3,49 @@
> >  * Contains CPU feature definitions
> >  *
> >  * Copyright (C) 2015 ARM Ltd.
> > + *
> > + * A note for the weary kernel hacker: the code here is confusing and hard to
> > + * follow! That's partly because it's solving a nasty problem, but also because
> > + * there's a little bit of over-abstraction that tends to obscure what's going
> > + * on behind a maze of helper functions and macros.
>
> Thanks for writing this up !

It's purely a selfish thing ;)

> > + * The basic problem is that hardware folks have started gluing together CPUs
> > + * with distinct architectural features; in some cases even creating SoCs where
> > + * user-visible instructions are available only on a subset of the available
> > + * cores. We try to address this by snapshotting the feature registers of the
> > + * boot CPU and comparing these with the feature registers of each secondary
> > + * CPU when bringing them up. If there is a mismatch, then we update the
> > + * snapshot state to indicate the lowest-common denominator of the feature,
> > + * known as the "safe" value. This snapshot state can be queried to view the
>
> I am not sure if the following is implied above.
>
> 1) Against the "snapshot" state, where mismatches triggers updating
>    the "snapshot" state to reflect the "safe" value.
>
> 2) Compared against the CPU feature registers of *the boot CPU* for
>    "FTR_STRICT" fields and any mismatch triggers TAINT_CPU_OUT_OF_SPEC.
>    This makes sure that warning is generated for each OUT_OF_SPEC
>    secondary CPU.

I was trying to avoid talking about the consequences of a mismatch in
that paragraph, and instead cover them below:

> > + * The sanitised register values are used to decide which capabilities we
> > + * have in the system. These may be in the form of traditional "hwcaps"
> > + * advertised to userspace or internal "cpucaps" which are used to configure
> > + * things like alternative patching and static keys. While a feature mismatch
> > + * may result in a TAINT_CPU_OUT_OF_SPEC kernel taint, a capability mismatch
> > + * may prevent a CPU from being onlined at all.

Do you think something is missing here?

> > + *
> > + * Some implementation details worth remembering:
> > + *
> > + * - Mismatched features are *always* sanitised to a "safe" value, which
> > + *   usually indicates that the feature is not supported.
> > + *
> > + * - A mismatched feature marked with FTR_STRICT will cause a "SANITY CHECK"
> > + *   warning when onlining an offending CPU and the kernel will be tainted
> > + *   with TAINT_CPU_OUT_OF_SPEC.
>
> As mentioned above, this check is against that of the "boot CPU"
> register state, which may not be implicit from the statement.

Hmm, I'm trying to figure out if this matters. I suppose this means you
get a SANITY CHECK warning for every mismatching secondary CPU, but
that's implied by the above. Is there something else I'm missing?

> > + *
> > + * - Features marked as FTR_VISIBLE have their sanitised value visible to
> > + *   userspace. FTR_VISIBLE features in registers that are only visible
> > + *   to EL0 by trapping *must* have a corresponding HWCAP so that late
> > + *   onlining of CPUs cannot lead to features disappearing at runtime.
> > + *
>
> As you mentioned in the other response we could add information about
> the guest view, something like :
>
>     - KVM exposes the sanitised value of the feature registers to the
>       guests and is not affected by the FTR_VISIBLE. However,
>       depending on the individual feature support in the hypervisor,
>       some of the fields may be capped/limited.

In light of Marc's comment, I'll add something here along the lines of:

  "KVM exposes its own view of the feature registers to guest operating
   systems regardless of FTR_VISIBLE. This is typically driven from the
   sanitised register values to allow virtual CPUs to be migrated between
   arbitrary physical CPUs, but some features not present on the host are
   also advertised and emulated. Look at sys_reg_descs[] for the gory
   details."

Will
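To make the "safe"-value idea above concrete, here is a hedged, minimal sketch in plain C. This is *not* the kernel's implementation — the real logic in cpufeature.c is table-driven via arm64_ftr_bits descriptors and also handles signed, HIGHER_SAFE, and EXACT fields — it only models the common unsigned LOWER_SAFE case, where each 4-bit feature field is reduced to the lowest value seen across CPUs:

```c
#include <stdint.h>

/*
 * Illustrative sketch (invented helper, not kernel code): reduce every
 * 4-bit feature-register field to the minimum of the current snapshot
 * and a newly-onlined CPU's value, i.e. the lowest common denominator.
 */
static uint64_t sanitise_lower_safe(uint64_t snapshot, uint64_t new_cpu)
{
	uint64_t safe = 0;
	int shift;

	for (shift = 0; shift < 64; shift += 4) {
		uint64_t a = (snapshot >> shift) & 0xf;
		uint64_t b = (new_cpu >> shift) & 0xf;

		/* Keep the smaller field value: the "safe" one. */
		safe |= (a < b ? a : b) << shift;
	}
	return safe;
}
```

Each secondary CPU would fold its registers into the snapshot this way at online time; the snapshot is then what hwcap/cpucap decisions read.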
Re: [PATCH] KVM: arm64: Drop PTE_S2_MEMATTR_MASK
On 2020-04-16 18:05, Will Deacon wrote: On Thu, Apr 16, 2020 at 04:36:05PM +0100, Marc Zyngier wrote: On Wed, 15 Apr 2020 18:57:46 +0800 Zenghui Yu wrote: > The only user of PTE_S2_MEMATTR_MASK macro had been removed since > commit a501e32430d4 ("arm64: Clean up the default pgprot setting"). > It has been about six years and no one has used it again. > > Let's drop it. > > Signed-off-by: Zenghui Yu > --- > arch/arm64/include/asm/pgtable-hwdef.h | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h > index 6bf5e650da78..99315bdca0e6 100644 > --- a/arch/arm64/include/asm/pgtable-hwdef.h > +++ b/arch/arm64/include/asm/pgtable-hwdef.h > @@ -190,7 +190,6 @@ > * Memory Attribute override for Stage-2 (MemAttr[3:0]) > */ > #define PTE_S2_MEMATTR(t) (_AT(pteval_t, (t)) << 2) > -#define PTE_S2_MEMATTR_MASK (_AT(pteval_t, 0xf) << 2) > > /* > * EL2/HYP PTE/PMD definitions Looks good to me. Catalin, Will: do you want to take this directly? If so please add my: Acked-by: Marc Zyngier Otherwise, I'll route it via the KVM tree. I can take it for 5.8 if it's not urgent. It has been there 6 years, I think we can cope with another few months... ;-) Thanks, M. -- Jazz is not dead. It just smells funny... ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH 2/2] KVM: arm64: vgic-its: Fix memory leak on the error path of vgic_add_lpi()
On 2020-04-16 02:17, Zenghui Yu wrote:
> On 2020/4/14 11:03, Zenghui Yu wrote:
>> If we're going to fail out the vgic_add_lpi(), let's make sure the
>> allocated vgic_irq memory is also freed. Though it seems that both
>> cases are unlikely to fail.
>>
>> Cc: Zengruan Ye
>> Signed-off-by: Zenghui Yu
>> ---
>>  virt/kvm/arm/vgic/vgic-its.c | 8 ++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>> index d53d34a33e35..3c3b6a0f2dce 100644
>> --- a/virt/kvm/arm/vgic/vgic-its.c
>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>> @@ -98,12 +98,16 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
>>  	 * the respective config data from memory here upon mapping the LPI.
>>  	 */
>>  	ret = update_lpi_config(kvm, irq, NULL, false);
>> -	if (ret)
>> +	if (ret) {
>> +		kfree(irq);
>>  		return ERR_PTR(ret);
>> +	}
>>
>>  	ret = vgic_v3_lpi_sync_pending_status(kvm, irq);
>> -	if (ret)
>> +	if (ret) {
>> +		kfree(irq);
>>  		return ERR_PTR(ret);
>> +	}
>
> Looking at it again, I realized that this error handling is still not
> complete. Maybe we should use a vgic_put_irq() instead so that we can
> also properly delete the vgic_irq from lpi_list.

Yes, this is a more correct fix indeed. There is still a bit of a
bizarre behaviour if you have two vgic_add_lpi() racing to create the
same interrupt, which is pretty dodgy anyway (it means we have two MAPI
at the same time...). You end up with re-reading the state from
memory... Oh well.

> Marc, what do you think? Could you please help to fix it, or I can
> resend it.

I've fixed it as such (with a comment for good measure):

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 3c3b6a0f2dce..c012a52b19f5 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -96,16 +96,19 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
 	 * We "cache" the configuration table entries in our struct vgic_irq's.
 	 * However we only have those structs for mapped IRQs, so we read in
 	 * the respective config data from memory here upon mapping the LPI.
+	 *
+	 * Should any of these fail, behave as if we couldn't create the LPI
+	 * by dropping the refcount and returning the error.
 	 */
 	ret = update_lpi_config(kvm, irq, NULL, false);
 	if (ret) {
-		kfree(irq);
+		vgic_put_irq(kvm, irq);
 		return ERR_PTR(ret);
 	}

 	ret = vgic_v3_lpi_sync_pending_status(kvm, irq);
 	if (ret) {
-		kfree(irq);
+		vgic_put_irq(kvm, irq);
 		return ERR_PTR(ret);
 	}

Let me know if you agree with that.

Thanks,
M.
--
Jazz is not dead. It just smells funny...
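As a hedged toy model of the reasoning behind this fix (ordinary C with invented names, not kernel code): once an object has been published on a list with a reference held, error paths must unwind through the same "put" path so the list entry is unlinked too — a bare kfree() would leave the list pointing at freed memory.

```c
#include <stdlib.h>

/* Toy object: a refcount plus a stand-in for the lpi_list linkage. */
struct obj {
	int refcount;
	int on_list;
};

static struct obj *obj_create(void)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;
	o->refcount = 1;
	o->on_list = 1;	/* published: other lookups can now find it */
	return o;
}

/*
 * The only correct teardown once the object is published: drop the
 * reference, and unlink before freeing when it hits zero. Calling
 * free() directly instead would orphan the list entry.
 */
static void obj_put(struct obj *o)
{
	if (--o->refcount == 0) {
		o->on_list = 0;	/* unlink, then free */
		free(o);
	}
}
```

In the vgic case, vgic_put_irq() plays the role of obj_put(), which is why it replaces kfree() on both error paths above.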
Re: [PATCH] KVM: arm64: Drop PTE_S2_MEMATTR_MASK
On Thu, Apr 16, 2020 at 04:36:05PM +0100, Marc Zyngier wrote: > On Wed, 15 Apr 2020 18:57:46 +0800 > Zenghui Yu wrote: > > > The only user of PTE_S2_MEMATTR_MASK macro had been removed since > > commit a501e32430d4 ("arm64: Clean up the default pgprot setting"). > > It has been about six years and no one has used it again. > > > > Let's drop it. > > > > Signed-off-by: Zenghui Yu > > --- > > arch/arm64/include/asm/pgtable-hwdef.h | 1 - > > 1 file changed, 1 deletion(-) > > > > diff --git a/arch/arm64/include/asm/pgtable-hwdef.h > > b/arch/arm64/include/asm/pgtable-hwdef.h > > index 6bf5e650da78..99315bdca0e6 100644 > > --- a/arch/arm64/include/asm/pgtable-hwdef.h > > +++ b/arch/arm64/include/asm/pgtable-hwdef.h > > @@ -190,7 +190,6 @@ > > * Memory Attribute override for Stage-2 (MemAttr[3:0]) > > */ > > #define PTE_S2_MEMATTR(t) (_AT(pteval_t, (t)) << 2) > > -#define PTE_S2_MEMATTR_MASK(_AT(pteval_t, 0xf) << 2) > > > > /* > > * EL2/HYP PTE/PMD definitions > > Looks good to me. Catalin, Will: do you want to take this directly? If > so please add my: > > Acked-by: Marc Zyngier > > Otherwise, I'll route it via the KVM tree. I can take it for 5.8 if it's not urgent. Will ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v2] KVM/arm64: Support enabling dirty log gradually in small chunks
On 16/04/20 17:09, Marc Zyngier wrote: > On Wed, 15 Apr 2020 18:13:56 +0200 > Paolo Bonzini wrote: > >> On 13/04/20 14:20, Keqian Zhu wrote: >>> There is already support of enabling dirty log graually in small chunks >>> for x86 in commit 3c9bd4006bfc ("KVM: x86: enable dirty log gradually in >>> small chunks"). This adds support for arm64. >>> >>> x86 still writes protect all huge pages when DIRTY_LOG_INITIALLY_ALL_SET >>> is eanbled. However, for arm64, both huge pages and normal pages can be >>> write protected gradually by userspace. >>> >>> Under the Huawei Kunpeng 920 2.6GHz platform, I did some tests on 128G >>> Linux VMs with different page size. The memory pressure is 127G in each >>> case. The time taken of memory_global_dirty_log_start in QEMU is listed >>> below: >>> >>> Page Size BeforeAfter Optimization >>> 4K650ms 1.8ms >>> 2M 4ms 1.8ms >>> 1G 2ms 1.8ms >>> >>> Besides the time reduction, the biggest income is that we will minimize >>> the performance side effect (because of dissloving huge pages and marking >>> memslots dirty) on guest after enabling dirty log. >>> >>> Signed-off-by: Keqian Zhu >>> --- >>> Documentation/virt/kvm/api.rst| 2 +- >>> arch/arm64/include/asm/kvm_host.h | 3 +++ >>> virt/kvm/arm/mmu.c| 12 ++-- >>> 3 files changed, 14 insertions(+), 3 deletions(-) >>> >>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst >>> index efbbe570aa9b..0017f63fa44f 100644 >>> --- a/Documentation/virt/kvm/api.rst >>> +++ b/Documentation/virt/kvm/api.rst >>> @@ -5777,7 +5777,7 @@ will be initialized to 1 when created. This also >>> improves performance because >>> dirty logging can be enabled gradually in small chunks on the first call >>> to KVM_CLEAR_DIRTY_LOG. KVM_DIRTY_LOG_INITIALLY_SET depends on >>> KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE (it is also only available on >>> -x86 for now). >>> +x86 and arm64 for now). 
>>> >>> KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 was previously available under the name >>> KVM_CAP_MANUAL_DIRTY_LOG_PROTECT, but the implementation had bugs that make >>> diff --git a/arch/arm64/include/asm/kvm_host.h >>> b/arch/arm64/include/asm/kvm_host.h >>> index 32c8a675e5a4..a723f84fab83 100644 >>> --- a/arch/arm64/include/asm/kvm_host.h >>> +++ b/arch/arm64/include/asm/kvm_host.h >>> @@ -46,6 +46,9 @@ >>> #define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3) >>> #define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4) >>> >>> +#define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | >>> \ >>> +KVM_DIRTY_LOG_INITIALLY_SET) >>> + >>> DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use); >>> >>> extern unsigned int kvm_sve_max_vl; >>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c >>> index e3b9ee268823..1077f653a611 100644 >>> --- a/virt/kvm/arm/mmu.c >>> +++ b/virt/kvm/arm/mmu.c >>> @@ -2265,8 +2265,16 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, >>> * allocated dirty_bitmap[], dirty pages will be be tracked while the >>> * memory slot is write protected. >>> */ >>> - if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES) >>> - kvm_mmu_wp_memory_region(kvm, mem->slot); >>> + if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES) { >>> + /* >>> +* If we're with initial-all-set, we don't need to write >>> +* protect any pages because they're all reported as dirty. >>> +* Huge pages and normal pages will be write protect gradually. >>> +*/ >>> + if (!kvm_dirty_log_manual_protect_and_init_set(kvm)) { >>> + kvm_mmu_wp_memory_region(kvm, mem->slot); >>> + } >>> + } >>> } >>> >>> int kvm_arch_prepare_memory_region(struct kvm *kvm, >>> >> >> Marc, what is the status of this patch? > > I just had a look at it. Is there any urgency for merging it? No, I thought I was still replying to the v1. Paolo ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH] KVM: arm64: Drop PTE_S2_MEMATTR_MASK
On Wed, 15 Apr 2020 18:57:46 +0800 Zenghui Yu wrote: > The only user of PTE_S2_MEMATTR_MASK macro had been removed since > commit a501e32430d4 ("arm64: Clean up the default pgprot setting"). > It has been about six years and no one has used it again. > > Let's drop it. > > Signed-off-by: Zenghui Yu > --- > arch/arm64/include/asm/pgtable-hwdef.h | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/arch/arm64/include/asm/pgtable-hwdef.h > b/arch/arm64/include/asm/pgtable-hwdef.h > index 6bf5e650da78..99315bdca0e6 100644 > --- a/arch/arm64/include/asm/pgtable-hwdef.h > +++ b/arch/arm64/include/asm/pgtable-hwdef.h > @@ -190,7 +190,6 @@ > * Memory Attribute override for Stage-2 (MemAttr[3:0]) > */ > #define PTE_S2_MEMATTR(t)(_AT(pteval_t, (t)) << 2) > -#define PTE_S2_MEMATTR_MASK (_AT(pteval_t, 0xf) << 2) > > /* > * EL2/HYP PTE/PMD definitions Looks good to me. Catalin, Will: do you want to take this directly? If so please add my: Acked-by: Marc Zyngier Otherwise, I'll route it via the KVM tree. Thanks, M. -- Jazz is not dead. It just smells funny... ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH 8/8] arm64: cpufeature: Add an overview comment for the cpufeature framework
Hi Suzuki,

On 2020-04-16 15:59, Suzuki K Poulose wrote:

[...]

> As you mentioned in the other response we could add information about
> the guest view, something like :
>
>     - KVM exposes the sanitised value of the feature registers to the
>       guests and is not affected by the FTR_VISIBLE. However,
>       depending on the individual feature support in the hypervisor,
>       some of the fields may be capped/limited.

Although in most cases what KVM exposes is indeed a strict subset of the
host's features, there are a few corner cases where we expose features
that do not necessarily exist on the host. For example, ARMv8.5-GTG and
ARMv8.4-TTL get exposed by the NV patches even if they don't exist on
the host, as KVM will actually emulate them.

Not a big deal, but I just wanted to outline that it isn't as clear-cut
as it may seem...

Thanks,
M.
--
Jazz is not dead. It just smells funny...
Re: [PATCH v2] KVM/arm64: Support enabling dirty log gradually in small chunks
On Wed, 15 Apr 2020 18:13:56 +0200 Paolo Bonzini wrote: > On 13/04/20 14:20, Keqian Zhu wrote: > > There is already support of enabling dirty log graually in small chunks > > for x86 in commit 3c9bd4006bfc ("KVM: x86: enable dirty log gradually in > > small chunks"). This adds support for arm64. > > > > x86 still writes protect all huge pages when DIRTY_LOG_INITIALLY_ALL_SET > > is eanbled. However, for arm64, both huge pages and normal pages can be > > write protected gradually by userspace. > > > > Under the Huawei Kunpeng 920 2.6GHz platform, I did some tests on 128G > > Linux VMs with different page size. The memory pressure is 127G in each > > case. The time taken of memory_global_dirty_log_start in QEMU is listed > > below: > > > > Page Size BeforeAfter Optimization > > 4K650ms 1.8ms > > 2M 4ms 1.8ms > > 1G 2ms 1.8ms > > > > Besides the time reduction, the biggest income is that we will minimize > > the performance side effect (because of dissloving huge pages and marking > > memslots dirty) on guest after enabling dirty log. > > > > Signed-off-by: Keqian Zhu > > --- > > Documentation/virt/kvm/api.rst| 2 +- > > arch/arm64/include/asm/kvm_host.h | 3 +++ > > virt/kvm/arm/mmu.c| 12 ++-- > > 3 files changed, 14 insertions(+), 3 deletions(-) > > > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst > > index efbbe570aa9b..0017f63fa44f 100644 > > --- a/Documentation/virt/kvm/api.rst > > +++ b/Documentation/virt/kvm/api.rst > > @@ -5777,7 +5777,7 @@ will be initialized to 1 when created. This also > > improves performance because > > dirty logging can be enabled gradually in small chunks on the first call > > to KVM_CLEAR_DIRTY_LOG. KVM_DIRTY_LOG_INITIALLY_SET depends on > > KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE (it is also only available on > > -x86 for now). > > +x86 and arm64 for now). 
> > > > KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 was previously available under the name > > KVM_CAP_MANUAL_DIRTY_LOG_PROTECT, but the implementation had bugs that make > > diff --git a/arch/arm64/include/asm/kvm_host.h > > b/arch/arm64/include/asm/kvm_host.h > > index 32c8a675e5a4..a723f84fab83 100644 > > --- a/arch/arm64/include/asm/kvm_host.h > > +++ b/arch/arm64/include/asm/kvm_host.h > > @@ -46,6 +46,9 @@ > > #define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3) > > #define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4) > > > > +#define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | > > \ > > +KVM_DIRTY_LOG_INITIALLY_SET) > > + > > DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use); > > > > extern unsigned int kvm_sve_max_vl; > > diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c > > index e3b9ee268823..1077f653a611 100644 > > --- a/virt/kvm/arm/mmu.c > > +++ b/virt/kvm/arm/mmu.c > > @@ -2265,8 +2265,16 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, > > * allocated dirty_bitmap[], dirty pages will be be tracked while the > > * memory slot is write protected. > > */ > > - if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES) > > - kvm_mmu_wp_memory_region(kvm, mem->slot); > > + if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES) { > > + /* > > +* If we're with initial-all-set, we don't need to write > > +* protect any pages because they're all reported as dirty. > > +* Huge pages and normal pages will be write protect gradually. > > +*/ > > + if (!kvm_dirty_log_manual_protect_and_init_set(kvm)) { > > + kvm_mmu_wp_memory_region(kvm, mem->slot); > > + } > > + } > > } > > > > int kvm_arch_prepare_memory_region(struct kvm *kvm, > > > > Marc, what is the status of this patch? I just had a look at it. Is there any urgency for merging it? Thanks, M. -- Jazz is not dead. It just smells funny... ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v2] KVM/arm64: Support enabling dirty log gradually in small chunks
On Mon, 13 Apr 2020 20:20:23 +0800
Keqian Zhu wrote:

> There is already support of enabling dirty log graually in small chunks

gradually

> for x86 in commit 3c9bd4006bfc ("KVM: x86: enable dirty log gradually in
> small chunks"). This adds support for arm64.
>
> x86 still writes protect all huge pages when DIRTY_LOG_INITIALLY_ALL_SET
> is eanbled. However, for arm64, both huge pages and normal pages can be

enabled

> write protected gradually by userspace.
>
> Under the Huawei Kunpeng 920 2.6GHz platform, I did some tests on 128G
> Linux VMs with different page size. The memory pressure is 127G in each
> case. The time taken of memory_global_dirty_log_start in QEMU is listed
> below:
>
> Page Size      Before    After Optimization
>   4K            650ms            1.8ms
>   2M              4ms            1.8ms
>   1G              2ms            1.8ms

These numbers are different from what you have advertised before. What
changed?

> Besides the time reduction, the biggest income is that we will minimize

s/income/improvement/

> the performance side effect (because of dissloving huge pages and marking

dissolving

> memslots dirty) on guest after enabling dirty log.
>
> Signed-off-by: Keqian Zhu
> ---
>  Documentation/virt/kvm/api.rst    | 2 +-
>  arch/arm64/include/asm/kvm_host.h | 3 +++
>  virt/kvm/arm/mmu.c                | 12 ++--
>  3 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index efbbe570aa9b..0017f63fa44f 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -5777,7 +5777,7 @@ will be initialized to 1 when created. This also improves performance because
>  dirty logging can be enabled gradually in small chunks on the first call
>  to KVM_CLEAR_DIRTY_LOG. KVM_DIRTY_LOG_INITIALLY_SET depends on
>  KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE (it is also only available on
> -x86 for now).
> +x86 and arm64 for now).
>
>  KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 was previously available under the name
>  KVM_CAP_MANUAL_DIRTY_LOG_PROTECT, but the implementation had bugs that make
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 32c8a675e5a4..a723f84fab83 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -46,6 +46,9 @@
>  #define KVM_REQ_RECORD_STEAL	KVM_ARCH_REQ(3)
>  #define KVM_REQ_RELOAD_GICv4	KVM_ARCH_REQ(4)
>
> +#define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
> +	KVM_DIRTY_LOG_INITIALLY_SET)
> +
>  DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
>
>  extern unsigned int kvm_sve_max_vl;
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index e3b9ee268823..1077f653a611 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -2265,8 +2265,16 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
>  	 * memory slot is write protected.
>  	 */
> -	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
> -		kvm_mmu_wp_memory_region(kvm, mem->slot);
> +	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES) {
> +		/*
> +		 * If we're with initial-all-set, we don't need to write
> +		 * protect any pages because they're all reported as dirty.
> +		 * Huge pages and normal pages will be write protect gradually.
> +		 */
> +		if (!kvm_dirty_log_manual_protect_and_init_set(kvm)) {
> +			kvm_mmu_wp_memory_region(kvm, mem->slot);
> +		}
> +	}
>  }
>
>  int kvm_arch_prepare_memory_region(struct kvm *kvm,

As it is, it is pretty good. The one thing that isn't clear to me is why
we have a difference in behaviour between x86 and arm64. What prevents
x86 from having the same behaviour as arm64?

Thanks,
M.
--
Jazz is not dead. It just smells funny...
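As a hedged illustration of the "initially all set" idea under discussion (plain C with invented names, not the KVM implementation): the dirty bitmap starts with every page marked dirty, so enabling dirty logging needs no up-front write-protection pass; pages are then write-protected gradually, chunk by chunk, as userspace clears parts of the bitmap (the KVM_CLEAR_DIRTY_LOG step).

```c
#include <string.h>
#include <stdint.h>

#define NPAGES 64

/* Toy memslot: which pages report dirty, which are write-protected. */
struct slot {
	uint8_t dirty[NPAGES];
	uint8_t write_protected[NPAGES];
};

/*
 * With DIRTY_LOG_INITIALLY_SET semantics, enabling logging marks every
 * page dirty and skips the expensive full write-protection pass.
 */
static void enable_dirty_log_initially_set(struct slot *s)
{
	memset(s->dirty, 1, NPAGES);
	memset(s->write_protected, 0, NPAGES);
}

/*
 * Userspace clears a chunk of the dirty log; only then are those pages
 * write-protected so future writes get tracked.
 */
static void clear_dirty_chunk(struct slot *s, int first, int n)
{
	int i;

	for (i = first; i < first + n; i++) {
		s->dirty[i] = 0;
		s->write_protected[i] = 1;
	}
}
```

This is why the mmu.c hunk above can skip kvm_mmu_wp_memory_region() entirely when the initial-all-set cap is enabled: the cost is amortised over later clear operations instead of paid at enable time.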
Re: [PATCH 8/8] arm64: cpufeature: Add an overview comment for the cpufeature framework
Hi Will, On 04/14/2020 10:31 PM, Will Deacon wrote: Now that Suzuki isn't within throwing distance, I thought I'd better add a rough overview comment to cpufeature.c so that it doesn't take me days to remember how it works next time. Signed-off-by: Will Deacon --- arch/arm64/kernel/cpufeature.c | 43 ++ 1 file changed, 43 insertions(+) diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 680a453ca8c4..421ca99dc8fc 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -3,6 +3,49 @@ * Contains CPU feature definitions * * Copyright (C) 2015 ARM Ltd. + * + * A note for the weary kernel hacker: the code here is confusing and hard to + * follow! That's partly because it's solving a nasty problem, but also because + * there's a little bit of over-abstraction that tends to obscure what's going + * on behind a maze of helper functions and macros. Thanks for writing this up ! + * + * The basic problem is that hardware folks have started gluing together CPUs + * with distinct architectural features; in some cases even creating SoCs where + * user-visible instructions are available only on a subset of the available + * cores. We try to address this by snapshotting the feature registers of the + * boot CPU and comparing these with the feature registers of each secondary + * CPU when bringing them up. If there is a mismatch, then we update the + * snapshot state to indicate the lowest-common denominator of the feature, + * known as the "safe" value. This snapshot state can be queried to view the I am not sure if the following is implied above. 1) Against the "snapshot" state, where mismatches triggers updating the "snapshot" state to reflect the "safe" value. 2) Compared against the CPU feature registers of *the boot CPU* for "FTR_STRICT" fields and any mismatch triggers TAINT_CPU_OUT_OF_SPEC. This makes sure that warning is generated for each OUT_OF_SPEC secondary CPU. + * "sanitised" value of a feature register. 
+ * + * The sanitised register values are used to decide which capabilities we + * have in the system. These may be in the form of traditional "hwcaps" + * advertised to userspace or internal "cpucaps" which are used to configure + * things like alternative patching and static keys. While a feature mismatch + * may result in a TAINT_CPU_OUT_OF_SPEC kernel taint, a capability mismatch + * may prevent a CPU from being onlined at all. + * + * Some implementation details worth remembering: + * + * - Mismatched features are *always* sanitised to a "safe" value, which + * usually indicates that the feature is not supported. + * + * - A mismatched feature marked with FTR_STRICT will cause a "SANITY CHECK" + * warning when onlining an offending CPU and the kernel will be tainted + * with TAINT_CPU_OUT_OF_SPEC. As mentioned above, this check is against that of the "boot CPU" register state, which may not be implicit from the statement. + * + * - Features marked as FTR_VISIBLE have their sanitised value visible to + * userspace. FTR_VISIBLE features in registers that are only visible + * to EL0 by trapping *must* have a corresponding HWCAP so that late + * onlining of CPUs cannot lead to features disappearing at runtime. + * As you mentioned in the other response we could add information about the guest view, something like : - KVM exposes the sanitised value of the feature registers to the guests and is not affected by the FTR_VISIBLE. However, depending on the individual feature support in the hypervisor, some of the fields may be capped/limited. Cheers Suzuki ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v2] KVM: Optimize kvm_arch_vcpu_ioctl_run function
Re: [PATCH] KVM: handle the right RAS SEA(Synchronous External Abort) type
Hi Geng, On 16/04/2020 13:07, gengdongjiu wrote: > On 2020/4/14 20:18, James Morse wrote: >> On 11/04/2020 13:17, Dongjiu Geng wrote: >>> When the RAS Extension is implemented, b0b011000, 0b011100, >>> 0b011101, 0b00, and 0b01, are not used and reserved >>> to the DFSC[5:0] of ESR_ELx, but the code still checks these >>> unused bits, so remove them. >> >> They aren't unused: CPUs without the RAS extensions may still generate these. >> >> kvm_handle_guest_abort() wants to know if this is an external abort. >> KVM doesn't really care if the CPU has the RAS extensions or not, its the >> arch code's job >> to sort all that out. > > No, handle_guest_sea() ---> ghes_notify_sea ---> apei driver You missed the arch code's apei_claim_sea(). This is where kvm washes its hands of the exception, its up to the arch code to deal with, in the same way it deals with errors for user-space. > If it is an external abort, it will call apei driver to handle it, but it > should be only SEA will call the apei driver. > other type of external abort should not call apei driver. > I am not see arch code sort all that out. The other kind is _asynchronous_ external abort, which doesn't get handled in here. Do you mean 'Synchronous external abort on translation table walk, nth level'? KVM shouldn't care, this is still up to the arch code to deal with. Do you mean 'Synchronous parity or ECC error on memory access, not on translation table walk'? (and all its translation table walk friends...) These really are still external abort! See shared/functions/aborts/IsExternalAbort() in the psuedo code. (J12-7735 of DDI0487F.a) This means they are trapped by SCR_EL3.EA. On a firmware-first system if we see one of these, its because firmware re-injected it. The arch code needs to get APEI to check for CPER records when it sees one. (KVM ... doesn't care) (I agree not having 'external' in the name is counter-intuitive. I think this is because the component that triggers them is 'in' the CPU, (e.g. 
the L1 cache). This is how you can know it was a parity or ECC error, instead of 'something outside' (i.e. external). An OS that is deeply tied to the CPU micro-architecture could use the difference to read imp-def sysregs for one, and poke around in imp-def mmio for the external case. Linux isn't tied to the micro-architecture, so ignores this 'internal/external' hint.) >>> If the handling of guest ras data error fails, it should >>> inject data instead of SError to let the guest recover as >>> much as possible. > > In some hardware platform, it supports RAS, but the RAS error address will be > not recorded, In what circumstances does this happen? Wouldn't these errors all be uncontained? (in which case the host should panic()). e.g. A write went to the wrong address, we don't know where it went... ... Is this because your platform describes memory-errors that happen inside the CPU as processor errors, instead of memory-errors? Linux's APEI code ignores them, because it doesn't know what they are. This means you are missing the whole memory_failure(), page-unmapping, user-space signalling ... and guest error injection ... part of the RAS handling. > so it is better to inject a data abort instead of SError for thtat platform. Not at all. The SError here is KVM's response to having nothing else it can do. Having the host handle the error is the best way of solving the problem. > because guest will try to do recovery for the Synchronous data abort, such as > kill the error > application. But for SError, guest will be panic. I'd rather we fix this by having the host handle the error. Botching in a synchronous exception here would mean Qemu can't properly emulate a machine that has SCR_EL3.EA set ... which you need to provide firmware-first for the guest. [...] > Yes, some platform supports RAS but will not record the error address, so the > host has failed > to handle the error. I don't think its reasonable to expect KVM to do anything useful in this case. 
We should fix the host error handling. In what circumstances does this happen? Thanks, James ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
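James's point above — that the parity/ECC DFSC encodings really are still external aborts, trapped by SCR_EL3.EA — can be sketched as a stand-alone classifier. This is an illustrative model of the pre-FEAT_RAS ESR_ELx DFSC encodings under discussion, not the kernel's actual kvm_vcpu_dabt_isextabt() implementation; the constants are local to the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative classifier for the ESR_ELx DFSC (Data Fault Status Code)
 * values discussed above. Pre-FEAT_RAS encodings, per the Arm ARM:
 *   0b010000  synchronous external abort, not on translation table walk
 *   0b0101LL  synchronous external abort on table walk, level LL
 *   0b011000  synchronous parity/ECC error on memory access
 *   0b0111LL  synchronous parity/ECC error on table walk, level LL
 * All of these satisfy IsExternalAbort() and are trapped by SCR_EL3.EA.
 */
static bool dfsc_is_external_abort(uint8_t dfsc)
{
	if (dfsc == 0x10)                  /* 0b010000: SEA, not on walk */
		return true;
	if (dfsc >= 0x14 && dfsc <= 0x17)  /* 0b0101LL: SEA on table walk */
		return true;
	if (dfsc == 0x18)                  /* 0b011000: parity/ECC on access */
		return true;
	if (dfsc >= 0x1c && dfsc <= 0x1f)  /* 0b0111LL: parity/ECC on walk */
		return true;
	return false;                      /* e.g. translation/permission faults */
}
```

A translation fault (e.g. DFSC 0b000100) is not an external abort, while a parity error is — which is why dropping the 0b011000/0b0111LL checks would change behaviour on CPUs without the RAS extensions.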
Re: [PATCH] KVM: handle the right RAS SEA(Synchronous External Abort) type
Hi James On 2020/4/14 20:18, James Morse wrote: > Hi Geng, > > On 11/04/2020 13:17, Dongjiu Geng wrote: >> When the RAS Extension is implemented, b0b011000, 0b011100, >> 0b011101, 0b00, and 0b01, are not used and reserved >> to the DFSC[5:0] of ESR_ELx, but the code still checks these >> unused bits, so remove them. > > They aren't unused: CPUs without the RAS extensions may still generate these. > > kvm_handle_guest_abort() wants to know if this is an external abort. > KVM doesn't really care if the CPU has the RAS extensions or not, its the > arch code's job > to sort all that out. No, handle_guest_sea() ---> ghes_notify_sea ---> apei driver If it is an external abort, it will call apei driver to handle it, but it should be only SEA will call the apei driver. other type of external abort should not call apei driver. I am not see arch code sort all that out. /* Synchronous External Abort? */ if (kvm_vcpu_dabt_isextabt(vcpu)) { /* * For RAS the host kernel may handle this abort. * There is no need to pass the error into the guest. */ if (!handle_guest_sea(fault_ipa, kvm_vcpu_get_hsr(vcpu))) return 1; } > > >> If the handling of guest ras data error fails, it should >> inject data instead of SError to let the guest recover as >> much as possible. In some hardware platform, it supports RAS, but the RAS error address will be not recorded, so it is better to inject a data abort instead of SError for thtat platform. because guest will try to do recovery for the Synchronous data abort, such as kill the error application. But for SError, guest will be panic. > > (I don't quite follow your point here). > > If KVM injected a synchronous external abort due to a RAS error here, then > you wouldn't be > able to support firmware-first RAS with Qemu. I don't think this is what you > want. > > > The handling is (and should be) decoupled. > > KVM guests aren't special. Whatever happens for a normal user-space process > is what should > happen here. 
KVM is just doing the plumbing: > > When the hypervisor takes an external abort due to the guest, it should plumb > the error > into the arch code to be handled. This is what would happen for a normal EL0 > process. > This is what do_sea() and kvm_handle_guest_sea() do with apei_claim_sea(). > > If the RAS code says it handled this error, then we can continue. For > user-space, we > return to user-space. For a guest, we return to the guest. (for user-space > this piece is > not quite complete in mainline, see: > https://lore.kernel.org/linux-acpi/20200228174817.74278-4-james.mo...@arm.com/ > ) > > This first part happens even if the errors are notified by IRQs, or found in > a polled buffer. > > The RAS code may have 'handled' the memory by unmapping it, and marking the > corresponding > page as HWPOISONed. If user-space tries to access this, it will be give an > SIGBUS:MCEERR_AR. If a guest tries to do this, the same things happens. (The > signal goes > to Qemu). > (See do_page_fault()s use of the MCEERR si_code's, and > kvm_send_hwpoison_signal) > > This second part is the same regardless of how the kernel discovered the RAS > error in the > first place. > > > If the RAS code says it did not handle this error, it means it wasn't a RAS > error, or your > platform doesn't support RAS. For an external-abort there is very little the > hypervisor > can do in this situation. It does what KVM has always done: inject an > asynchronous > external abort. > This should only happen if the host has failed to handle the error. KVM's use > of > asynchronous abort is the simplest one size fits all. > > Are you seeing this happen? If so, what are the circumstances. Did the host > handle the > error? (if not: why not!) Yes, some platform supports RAS but will not record the error address, so the host has failed to handle the error. > > > Thanks, > > James > . > ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH 8/8] arm64: cpufeature: Add an overview comment for the cpufeature framework
Hi Suzuki, On Tue, Apr 14, 2020 at 10:31:14PM +0100, Will Deacon wrote: > Now that Suzuki isn't within throwing distance, I thought I'd better add > a rough overview comment to cpufeature.c so that it doesn't take me days > to remember how it works next time. > > Signed-off-by: Will Deacon > --- > arch/arm64/kernel/cpufeature.c | 43 ++ > 1 file changed, 43 insertions(+) Any chance you can look at this one, please? I don't trust myself to get all of the details right here! I'm also wondering whether we should mention something about KVM and the guest view of the registers. What do you think? Will
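The "lowest common denominator" snapshotting that the new comment describes can be modelled in a few lines. This is a deliberately simplified sketch: it treats every 4-bit ID-register field as an unsigned "lower is safer" (FTR_LOWER_SAFE) field, whereas the real framework describes each field individually via arm64_ftr_bits and supports other policies (HIGHER_SAFE, EXACT) too.

```c
#include <stdint.h>

/*
 * Toy model of the boot-CPU snapshot being combined with a secondary
 * CPU's ID register. Assumption: every 4-bit field is unsigned and
 * "lower is safer"; the real arm64 cpufeature framework is per-field
 * and policy-driven.
 */
static uint64_t ftr_field(uint64_t reg, unsigned int shift)
{
	return (reg >> shift) & 0xfULL;
}

static uint64_t combine_safe(uint64_t snapshot, uint64_t secondary)
{
	uint64_t safe = 0;
	unsigned int shift;

	for (shift = 0; shift < 64; shift += 4) {
		uint64_t a = ftr_field(snapshot, shift);
		uint64_t b = ftr_field(secondary, shift);

		safe |= (a < b ? a : b) << shift; /* lowest common denominator */
	}
	return safe;
}
```

Querying the snapshot after every CPU has been folded in then gives the system-wide "safe" view of each feature.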
Re: [PATCH v2] KVM: Optimize kvm_arch_vcpu_ioctl_run function
On 2020/4/16 16:50, Cornelia Huck wrote: On Thu, 16 Apr 2020 16:45:33 +0800 Tianjia Zhang wrote: On 2020/4/16 16:28, Marc Zyngier wrote: On 2020-04-16 08:03, Vitaly Kuznetsov wrote: Tianjia Zhang writes: [...] Overall, there is a large set of cleanups to be done when both the vcpu and the run structures are passed as parameters at the same time. Just grepping the tree for kvm_run is pretty instructive. M. Sorry, it's my mistake, I only compiled the x86 platform, I will submit patch again. I think it's completely fine (and even preferable) to do cleanups like that on top. [FWIW, I compiled s390 here.] Very good, I will do a comprehensive cleanup of this type of code. Thanks, Tianjia
Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
On 2020/4/14 11:05 PM, Eric Auger wrote: This version fixes an issue observed by Shameer on an SMMU 3.2, when moving from dual stage config to stage 1 only config. The 2 high 64b of the STE now get reset. Otherwise, leaving the S2TTB set may cause a C_BAD_STE error. This series can be found at: https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1 (including the VFIO part) The companion QEMU series can still be found at: https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6 Users have expressed interest in that work and tested v9/v10: - https://patchwork.kernel.org/cover/11039995/#23012381 - https://patchwork.kernel.org/cover/11039995/#23197235 Background: This series brings the IOMMU part of HW nested paging support in the SMMUv3. The VFIO part is submitted separately. The IOMMU API is extended to support 2 new API functionalities: 1) pass the guest stage 1 configuration 2) pass stage 1 MSI bindings Those capabilities then get implemented in the SMMUv3 driver. The virtualizer passes information through the VFIO user API, which cascades it to the iommu subsystem. This allows the guest to own stage 1 tables and context descriptors (so-called PASID table) while the host owns stage 2 tables and main configuration structures (STE). Thanks Eric Tested v11 on a Hisilicon kunpeng920 board via the hardware zip accelerator. 1. no-sva works, where the guest app directly uses physical addresses via ioctl. 2. vSVA still does not work, same as v10. 3. the v10 issue reported by Shameer has been solved: first start qemu with iommu=smmuv3, then start qemu without iommu=smmuv3. 4. no-sva also works without iommu=smmuv3. Test details in https://docs.qq.com/doc/DRU5oR1NtUERseFNL Thanks
RE: [PATCH v2 00/94] KVM: arm64: ARMv8.3/8.4 Nested Virtualization support
Hi Marc: Got it. Really a bit patch set :) BTW, I have done a basic kvm unit test git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git And I find that after apply the patch KVM: arm64: VNCR-ize ELR_EL1, The psci test failed for some reason, I can't understand why, this is only the test result.(find the patch by git bisect + kvm test) My platform: Hisilicon D06 board. Linux kernel: Linux 5.6-rc6 + nv patches(some rebases) Could you help to take a look? Thanks Zengtao > -Original Message- > From: Marc Zyngier [mailto:m...@kernel.org] > Sent: Saturday, April 11, 2020 5:24 PM > To: Zengtao (B) > Cc: George Cherian; dave.mar...@arm.com; alexandru.eli...@arm.com; > andre.przyw...@arm.com; christoffer.d...@arm.com; > james.mo...@arm.com; jint...@cs.columbia.edu; > julien.thierry.k...@gmail.com; k...@vger.kernel.org; > kvmarm@lists.cs.columbia.edu; linux-arm-ker...@lists.infradead.org; > suzuki.poul...@arm.com; Anil Kumar Reddy H; Ganapatrao Kulkarni > Subject: Re: [PATCH v2 00/94] KVM: arm64: ARMv8.3/8.4 Nested > Virtualization support > > Hi Zengtao, > > On Sat, 11 Apr 2020 05:10:05 +0100, > "Zengtao (B)" wrote: > > > > Hi Marc: > > > > Since it's a very large patch series, I want to test it on my platform > > which don't support nv, and want to make sure if this patch series > > affects the existed virtualization functions or not. > > > > Any suggestion about the test focus? > > Not really. Given that the NV patches affect absolutely every > architectural parts of KVM/arm64, everything needs careful > testing. But more than testing, it needs reviewing. > > Thanks, > > M. > > -- > Jazz is not dead, it just smells funny.
Re: [PATCH] KVM: Optimize kvm_arch_vcpu_ioctl_run function
On 2020/4/15 22:53, Paolo Bonzini wrote: On 15/04/20 11:07, Vitaly Kuznetsov wrote: In case this is no longer needed I'd suggest we drop 'kvm_run' parameter and extract it from 'struct kvm_vcpu' when needed. This looks like a natural add-on to your cleanup patch. I agree, though I think it should be _instead_ of Tianjia's patch rather than on top. Paolo Thank you very much for the comments of Vitaly and Paolo, I will make a v2 patch. Thanks and best, Tianjia
[PATCH v2] KVM: Optimize kvm_arch_vcpu_ioctl_run function
In earlier versions of kvm, 'kvm_run' is an independent structure and is not included in the vcpu structure. At present, 'kvm_run' is already included in the vcpu structure, so the parameter 'kvm_run' is redundant. This patch simplify the function definition, removes the extra 'kvm_run' parameter, and extract it from the 'kvm_vcpu' structure if necessary. Signed-off-by: Tianjia Zhang --- v2 change: remove 'kvm_run' parameter and extract it from 'kvm_vcpu' arch/mips/kvm/mips.c | 3 ++- arch/powerpc/kvm/powerpc.c | 3 ++- arch/s390/kvm/kvm-s390.c | 3 ++- arch/x86/kvm/x86.c | 11 ++- include/linux/kvm_host.h | 2 +- virt/kvm/arm/arm.c | 6 +++--- virt/kvm/kvm_main.c| 2 +- 7 files changed, 17 insertions(+), 13 deletions(-) diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c index 8f05dd0a0f4e..ec24adf4857e 100644 --- a/arch/mips/kvm/mips.c +++ b/arch/mips/kvm/mips.c @@ -439,8 +439,9 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, return -ENOIOCTLCMD; } -int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) +int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) { + struct kvm_run *run = vcpu->run; int r = -EINTR; vcpu_load(vcpu); diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index e15166b0a16d..7e24691e138a 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -1764,8 +1764,9 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg) return r; } -int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) +int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) { + struct kvm_run *run = vcpu->run; int r; vcpu_load(vcpu); diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 19a81024fe16..443af3ead739 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -4333,8 +4333,9 @@ static void store_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) store_regs_fmt2(vcpu, kvm_run); } -int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, 
struct kvm_run *kvm_run) +int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) { + struct kvm_run *kvm_run = vcpu->run; int rc; if (kvm_run->immediate_exit) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 3bf2ecafd027..a0338e86c90f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8707,8 +8707,9 @@ static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu) trace_kvm_fpu(0); } -int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) +int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) { + struct kvm_run *kvm_run = vcpu->run; int r; vcpu_load(vcpu); @@ -8726,18 +8727,18 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) r = -EAGAIN; if (signal_pending(current)) { r = -EINTR; - vcpu->run->exit_reason = KVM_EXIT_INTR; + kvm_run->exit_reason = KVM_EXIT_INTR; ++vcpu->stat.signal_exits; } goto out; } - if (vcpu->run->kvm_valid_regs & ~KVM_SYNC_X86_VALID_FIELDS) { + if (kvm_run->kvm_valid_regs & ~KVM_SYNC_X86_VALID_FIELDS) { r = -EINVAL; goto out; } - if (vcpu->run->kvm_dirty_regs) { + if (kvm_run->kvm_dirty_regs) { r = sync_regs(vcpu); if (r != 0) goto out; @@ -8767,7 +8768,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) out: kvm_put_guest_fpu(vcpu); - if (vcpu->run->kvm_valid_regs) + if (kvm_run->kvm_valid_regs) store_regs(vcpu); post_kvm_run_save(vcpu); kvm_sigset_deactivate(vcpu); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 6d58beb65454..1e17ef719595 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -866,7 +866,7 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, struct kvm_mp_state *mp_state); int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg); -int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run); +int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu); int kvm_arch_init(void *opaque); void kvm_arch_exit(void); diff --git a/virt/kvm/arm/arm.c 
b/virt/kvm/arm/arm.c index 48d0ec44ad77..f5390ac2165b 100644 --- a/virt/kvm/arm/arm.c +++ b/virt/kvm/arm/arm.c @@ -639,7 +639,6 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu) /** * kvm_arch_vcpu_ioctl_run - the main VCPU run function to execute guest code * @vcpu: The VCPU pointer - * @run: The kvm_run structure pointer used for userspace
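The shape of the change in this patch can be shown with a minimal stand-alone mock; the two structs below are simplified stand-ins, not the real kernel definitions:

```c
/* Simplified mocks of the kernel structures, for illustration only. */
struct kvm_run { int exit_reason; };
struct kvm_vcpu { struct kvm_run *run; };  /* run is already vcpu state */

/* Before: the caller passes both, even though run is always vcpu->run. */
static int run_ioctl_old(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
	(void)vcpu;
	return run->exit_reason;
}

/* After: the redundant parameter is gone; derive run from the vcpu. */
static int run_ioctl_new(struct kvm_vcpu *vcpu)
{
	struct kvm_run *run = vcpu->run;

	return run->exit_reason;
}
```

Because every caller already holds the vcpu, the second parameter carries no extra information, which is exactly why the patch can drop it across all architectures.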
Re: [PATCH RFCv1 0/7] Support Async Page Fault
Hi Mark, On 4/14/20 9:05 PM, Mark Rutland wrote: On Tue, Apr 14, 2020 at 03:39:56PM +1000, Gavin Shan wrote: On 4/10/20 10:52 PM, Marc Zyngier wrote: On 2020-04-10 09:58, Gavin Shan wrote: In order to fulfil the control flow and convey signals between host and guest. A IMPDEF system register (SYS_ASYNC_PF_EL1) is introduced. The register accepts control block's physical address, plus requested features. Also, the signal is sent using data abort with the specific IMPDEF Data Fault Status Code (DFSC). The specific signal is stored in the control block by host, to be consumed by guest. - We don't add IMPDEF sysregs, period. That's reserved for the HW. If you want to trap, there's the HVC instruction to that effect. I really don't understand how IMPDEF sysreg is used by hardware vendors. Do we have an existing functionality, which depends on IMPDEF sysreg? I was thinking the IMPDEF sysreg can be used by software either, but it seems I'm wrong. The key is in the name: an IMPLEMENTATION DEFINED register is defined by the implementation (i.e. the specific CPU microarchitecture), so it's wrong for software to come up with an arbitrary semantic as this will differ from the implementation's defined semantic for the register. Typically, IMP DEF resgisters are used for things that firmware needs to do (e.g. enter/exit coherency), or for bringup-time debug (e.g. poking into TLB/cache internals), and are not usually intended for general purpose software. Linux generally avoids the use of IMP DEF registers, but does so in some cases (e.g. for PMUs) after FW explicitly describes that those are safe to access. Thanks for the explanation and details, which make things much clear. Since the IMPDEF system register can't be used like this way, hypercall (HVC) would be considered to serve same purpose - deliver signals from host to guest. However, the hypercall number and behaviors are guarded by specification. 
For example, the hypercalls used by para-virtualized stolen time, which are defined in include/linux/arm-smccc.h, are specified by ARM DEN0057A [1]. So do I need a specification to be created, where the hypercalls used by this feature are defined? If it's not needed, can I pick hypercalls that aren't used and define their behaviors by myself? [1] http://infocenter.arm.com/help/topic/com.arm.doc.den0057a/DEN0057A_Paravirtualized_Time_for_Arm_based_Systems_v1_0.pdf Another thing I want to check is about ESR_EL1[DFSC]. In this series, the asynchronous page fault is identified by an IMPDEF DFSC (Data Fault Status Code) in ESR_EL1. According to what we discussed, the IMPDEF DFSC shouldn't be fired (produced) by software. Should it be produced only by hardware, then? What I understood is that IMPDEF is hardware behavior. If this is true, I need to avoid using the IMPDEF DFSC in the next revision :) Thanks, Gavin
Re: [PATCH 0/8] Relax sanity checking for mismatched AArch32 EL1
On 2020-04-16 14:09, Sai Prakash Ranjan wrote: [...] For the entire series, Tested-by: saiprakash.ran...@codeaurora.org Urgh sorry, it should be Tested-by: Sai Prakash Ranjan -Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH 0/8] Relax sanity checking for mismatched AArch32 EL1
On 2020-04-15 03:01, Will Deacon wrote: Hi all, For better or worse, there are SoCs in production where some, but not all of the CPUs, support AArch32 at EL1 and above. Right now, that results in "SANITY CHECK" warnings during boot and an unconditional kernel taint. This patch series tries to do a bit better: the only time we care about AArch32 at EL1 is for KVM, so rather than throw our toys out of the pram, we can instead just disable support for 32-bit guests on these systems. In the unlikely scenario of a late CPU hotplug being the first time we notice that AArch32 is not available, then we fail the hotplug (right now we let the thing come online, which leads to hilarious results for any pre-existing 32-bit guests). Feedback welcome, Will Thanks Will, tested this series on QCOM SC7180 and SM8150 SoCs. For the entire series, Tested-by: saiprakash.ran...@codeaurora.org -Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH RFCv1 0/7] Support Async Page Fault
On Thu, Apr 16, 2020 at 10:16:22AM +0100, Mark Rutland wrote: > On Thu, Apr 16, 2020 at 05:59:33PM +1000, Gavin Shan wrote: > > However, the hypercall number and behaviors are guarded by > > specification. For example, the hypercalls used by para-virtualized > > stolen time, which are defined in include/linux/arm-smccc.h, are > > specified by ARM DEN0057A [1]. So I need a specification to be > > created, where the hypercalls used by this feature are defined? If > > it's not needed, can I pick hypercalls that aren't used and define > > their behaviors by myself? > > > > [1] > > http://infocenter.arm.com/help/topic/com.arm.doc.den0057a/DEN0057A_Paravirtualized_Time_for_Arm_based_Systems_v1_0.pdf > > Take a look at the SMCCC / SMC Calling Convention: > > https://developer.arm.com/docs/den0028/c > > ... that defines ranges set aside for hypervisor-specific usage, and > despite its name it also applies to HVC calls. > > There's been intermittent work to add a probing story for that, so that > part is subject to change, but for prototyping you can just choose an > arbitrary number in that range -- just be sure to mention in the commit > and cover letter that this part isn't complete. Right, might be simplest to start off with: https://android-kvm.googlesource.com/linux/+/refs/heads/willdeacon/hvc Will
Re: [PATCH RFCv1 0/7] Support Async Page Fault
On Thu, Apr 16, 2020 at 05:59:33PM +1000, Gavin Shan wrote: > On 4/14/20 9:05 PM, Mark Rutland wrote: > > On Tue, Apr 14, 2020 at 03:39:56PM +1000, Gavin Shan wrote: > > > On 4/10/20 10:52 PM, Marc Zyngier wrote: > > > > On 2020-04-10 09:58, Gavin Shan wrote: > > > > > In order to fulfil the control flow and convey signals between host > > > > > and guest. A IMPDEF system register (SYS_ASYNC_PF_EL1) is introduced. > > > > > The register accepts control block's physical address, plus requested > > > > > features. Also, the signal is sent using data abort with the specific > > > > > IMPDEF Data Fault Status Code (DFSC). The specific signal is stored > > > > > in the control block by host, to be consumed by guest. > > > > > > - We don't add IMPDEF sysregs, period. That's reserved for the HW. If > > > > you want to trap, there's the HVC instruction to that effect. > > > > > I really don't understand how IMPDEF sysreg is used by hardware vendors. > > > Do we have an existing functionality, which depends on IMPDEF sysreg? > > > I was thinking the IMPDEF sysreg can be used by software either, but > > > it seems I'm wrong. > > > > The key is in the name: an IMPLEMENTATION DEFINED register is defined by > > the implementation (i.e. the specific CPU microarchitecture), so it's > > wrong for software to come up with an arbitrary semantic as this will > > differ from the implementation's defined semantic for the register. > > > > Typically, IMP DEF registers are used for things that firmware needs to > > do (e.g. enter/exit coherency), or for bringup-time debug (e.g. poking > > into TLB/cache internals), and are not usually intended for general > > purpose software. > > > > Linux generally avoids the use of IMP DEF registers, but does so in some > > cases (e.g. for PMUs) after FW explicitly describes that those are safe > > to access. > > Thanks for the explanation and details, which make things much clearer. 
Since > the IMPDEF system register can't be used like this way, hypercall (HVC) would > be considered to serve same purpose - deliver signals from host to guest. I'm not sure I follow how you'd use HVC to inject a signal into a guest; the HVC would have to be issued by the guest to the host. Unless you're injecting the signal via some other mechanism (e.g. an interrupt), and the guest issues the HVC in response to that? > However, the hypercall number and behaviors are guarded by > specification. For example, the hypercalls used by para-virtualized > stolen time, which are defined in include/linux/arm-smccc.h, are > specified by ARM DEN0057A [1]. So I need a specification to be > created, where the hypercalls used by this feature are defined? If > it's not needed, can I pick hypercalls that aren't used and define > their behaviors by myself? > > [1] > http://infocenter.arm.com/help/topic/com.arm.doc.den0057a/DEN0057A_Paravirtualized_Time_for_Arm_based_Systems_v1_0.pdf Take a look at the SMCCC / SMC Calling Convention: https://developer.arm.com/docs/den0028/c ... that defines ranges set aside for hypervisor-specific usage, and despite its name it also applies to HVC calls. There's been intermittent work to add a probing story for that, so that part is subject to change, but for prototyping you can just choose an arbitrary number in that range -- just be sure to mention in the commit and cover letter that this part isn't complete. > Another thing I want to check is about the ESR_EL1[DFSC]. In this series, > the asynchronous page fault is identified by IMPDEF DFSC (Data Fault Status > Code) in ESR_EL1. According to what we discussed, the IMPDEF DFSC shouldn't > be fired (produced) by software. It should be produced by hardware either? > What I understood is IMPDEF is hardware behavior. If this is true, I need > to avoid using IMPDEF DFSC in next revision :) Yes, similar applies here. 
If the guest is making a hypercall, you can return the fault info as the response in GPRs, so I don't think you need to touch any architectural fault registers. Thanks, Mark.
Re: [PATCH v2] KVM: Optimize kvm_arch_vcpu_ioctl_run function
On 2020-04-16 09:45, Tianjia Zhang wrote: On 2020/4/16 16:28, Marc Zyngier wrote: [...] Overall, there is a large set of cleanups to be done when both the vcpu and the run structures are passed as parameters at the same time. Just grepping the tree for kvm_run is pretty instructive. M. Sorry, it's my mistake, I only compiled the x86 platform, I will submit patch again. Not a mistake. All I'm saying is that there is an opportunity for a larger series that cleans up the code base, rather than just doing a couple of localized changes. Thanks, M. -- Jazz is not dead. It just smells funny...
Re: [PATCH v2] KVM: Optimize kvm_arch_vcpu_ioctl_run function
On Thu, 16 Apr 2020 16:45:33 +0800 Tianjia Zhang wrote: > On 2020/4/16 16:28, Marc Zyngier wrote: > > On 2020-04-16 08:03, Vitaly Kuznetsov wrote: > >> Tianjia Zhang writes: > >> > >>> In earlier versions of kvm, 'kvm_run' is an independent structure > >>> and is not included in the vcpu structure. At present, 'kvm_run' > >>> is already included in the vcpu structure, so the parameter > >>> 'kvm_run' is redundant. > >>> > >>> This patch simplify the function definition, removes the extra > >>> 'kvm_run' parameter, and extract it from the 'kvm_vcpu' structure > >>> if necessary. > >>> > >>> Signed-off-by: Tianjia Zhang > >>> --- > >>> > >>> v2 change: > >>> remove 'kvm_run' parameter and extract it from 'kvm_vcpu' > >>> > >>> arch/mips/kvm/mips.c | 3 ++- > >>> arch/powerpc/kvm/powerpc.c | 3 ++- > >>> arch/s390/kvm/kvm-s390.c | 3 ++- > >>> arch/x86/kvm/x86.c | 11 ++- > >>> include/linux/kvm_host.h | 2 +- > >>> virt/kvm/arm/arm.c | 6 +++--- > >>> virt/kvm/kvm_main.c | 2 +- > >>> 7 files changed, 17 insertions(+), 13 deletions(-) > > Overall, there is a large set of cleanups to be done when both the vcpu > > and the run > > structures are passed as parameters at the same time. Just grepping the > > tree for > > kvm_run is pretty instructive. > > > > M. > > Sorry, it's my mistake, I only compiled the x86 platform, I will submit > patch again. I think it's completely fine (and even preferable) to do cleanups like that on top. [FWIW, I compiled s390 here.]
Re: [PATCH v2] KVM: Optimize kvm_arch_vcpu_ioctl_run function
On 2020-04-16 08:03, Vitaly Kuznetsov wrote: Tianjia Zhang writes:

In earlier versions of kvm, 'kvm_run' is an independent structure and is not included in the vcpu structure. At present, 'kvm_run' is already included in the vcpu structure, so the parameter 'kvm_run' is redundant.

This patch simplify the function definition, removes the extra 'kvm_run' parameter, and extract it from the 'kvm_vcpu' structure if necessary.

Signed-off-by: Tianjia Zhang
---

v2 change:
remove 'kvm_run' parameter and extract it from 'kvm_vcpu'

 arch/mips/kvm/mips.c       |  3 ++-
 arch/powerpc/kvm/powerpc.c |  3 ++-
 arch/s390/kvm/kvm-s390.c   |  3 ++-
 arch/x86/kvm/x86.c         | 11 ++-
 include/linux/kvm_host.h   |  2 +-
 virt/kvm/arm/arm.c         |  6 +++---
 virt/kvm/kvm_main.c        |  2 +-
 7 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 8f05dd0a0f4e..ec24adf4857e 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -439,8 +439,9 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 	return -ENOIOCTLCMD;
 }
 
-int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
+int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 {
+	struct kvm_run *run = vcpu->run;
 	int r = -EINTR;
 
 	vcpu_load(vcpu);
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index e15166b0a16d..7e24691e138a 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -1764,8 +1764,9 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
 	return r;
 }
 
-int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
+int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 {
+	struct kvm_run *run = vcpu->run;
 	int r;
 
 	vcpu_load(vcpu);
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 19a81024fe16..443af3ead739 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4333,8 +4333,9 @@ static void store_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	store_regs_fmt2(vcpu, kvm_run);
 }
 
-int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 {
+	struct kvm_run *kvm_run = vcpu->run;
 	int rc;
 
 	if (kvm_run->immediate_exit)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3bf2ecafd027..a0338e86c90f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8707,8 +8707,9 @@ static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 	trace_kvm_fpu(0);
 }
 
-int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 {
+	struct kvm_run *kvm_run = vcpu->run;
 	int r;
 
 	vcpu_load(vcpu);
@@ -8726,18 +8727,18 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		r = -EAGAIN;
 		if (signal_pending(current)) {
 			r = -EINTR;
-			vcpu->run->exit_reason = KVM_EXIT_INTR;
+			kvm_run->exit_reason = KVM_EXIT_INTR;
 			++vcpu->stat.signal_exits;
 		}
 		goto out;
 	}
 
-	if (vcpu->run->kvm_valid_regs & ~KVM_SYNC_X86_VALID_FIELDS) {
+	if (kvm_run->kvm_valid_regs & ~KVM_SYNC_X86_VALID_FIELDS) {
 		r = -EINVAL;
 		goto out;
 	}
 
-	if (vcpu->run->kvm_dirty_regs) {
+	if (kvm_run->kvm_dirty_regs) {
 		r = sync_regs(vcpu);
 		if (r != 0)
 			goto out;
@@ -8767,7 +8768,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 out:
 	kvm_put_guest_fpu(vcpu);
-	if (vcpu->run->kvm_valid_regs)
+	if (kvm_run->kvm_valid_regs)
 		store_regs(vcpu);
 	post_kvm_run_save(vcpu);
 	kvm_sigset_deactivate(vcpu);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6d58beb65454..1e17ef719595 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -866,7 +866,7 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 				    struct kvm_mp_state *mp_state);
 int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 					struct kvm_guest_debug *dbg);
-int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run);
+int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu);
 int kvm_arch_init(void *opaque);
 void kvm_arch_exit(void);
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 48d0ec44ad77..f5390ac2165b 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -639,7 +639,6 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
 /**
  * kvm_arch_vcpu_ioctl_run - the main VCPU run function to execute guest code
  * @vcpu: The
Re: [PATCH v2] KVM: Optimize kvm_arch_vcpu_ioctl_run function
On Thu, 16 Apr 2020 13:10:57 +0800 Tianjia Zhang wrote: > In earlier versions of kvm, 'kvm_run' is an independent structure > and is not included in the vcpu structure. At present, 'kvm_run' > is already included in the vcpu structure, so the parameter > 'kvm_run' is redundant. > > This patch simplify the function definition, removes the extra s/simplify/simplifies/ > 'kvm_run' parameter, and extract it from the 'kvm_vcpu' structure s/extract/extracts/ > if necessary. > > Signed-off-by: Tianjia Zhang > --- > > v2 change: > remove 'kvm_run' parameter and extract it from 'kvm_vcpu' > > arch/mips/kvm/mips.c | 3 ++- > arch/powerpc/kvm/powerpc.c | 3 ++- > arch/s390/kvm/kvm-s390.c | 3 ++- > arch/x86/kvm/x86.c | 11 ++- > include/linux/kvm_host.h | 2 +- > virt/kvm/arm/arm.c | 6 +++--- > virt/kvm/kvm_main.c| 2 +- > 7 files changed, 17 insertions(+), 13 deletions(-) > Reviewed-by: Cornelia Huck
Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
Hi Zhangfei, On 4/16/20 6:25 AM, Zhangfei Gao wrote: > > > On 2020/4/14 下午11:05, Eric Auger wrote: >> This version fixes an issue observed by Shameer on an SMMU 3.2, >> when moving from dual stage config to stage 1 only config. >> The 2 high 64b of the STE now get reset. Otherwise, leaving the >> S2TTB set may cause a C_BAD_STE error. >> >> This series can be found at: >> https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1 >> (including the VFIO part) >> The fellow QEMU series can still be found at: >> https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6 >> >> Users have expressed interest in that work and tested v9/v10: >> - https://patchwork.kernel.org/cover/11039995/#23012381 >> - https://patchwork.kernel.org/cover/11039995/#23197235 >> >> Background: >> >> This series brings the IOMMU part of HW nested paging support >> in the SMMUv3. The VFIO part is submitted separately. >> >> The IOMMU API is extended to support 2 new API functionalities: >> 1) pass the guest stage 1 configuration >> 2) pass stage 1 MSI bindings >> >> Then those capabilities get implemented in the SMMUv3 driver. >> >> The virtualizer passes information through the VFIO user API >> which cascades them to the iommu subsystem. This allows the guest >> to own stage 1 tables and context descriptors (so-called PASID >> table) while the host owns stage 2 tables and main configuration >> structures (STE). >> >> > > Thanks Eric > > Tested v11 on Hisilicon kunpeng920 board via hardware zip accelerator. > 1. no-sva works, where guest app directly use physical address via ioctl. Thank you for the testing. Glad it works for you. > 2. vSVA still not work, same as v10, Yes, that's normal: this series is not meant to support vSVA at this stage. I intend to add the missing pieces during the next weeks. Thanks Eric > 3. the v10 issue reported by Shameer has been solved, first start qemu > with iommu=smmuv3, then start qemu without iommu=smmuv3 4. 
no-sva also works without iommu=smmuv3 > > Test details in https://docs.qq.com/doc/DRU5oR1NtUERseFNL > > Thanks >