from:"Shameerali Kolothum Thodi"

RE: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU schedule out

2021-10-11 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Shameerali Kolothum Thodi
> Sent: 11 August 2021 09:48
> To: 'Will Deacon' 
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org;
> alexandru.eli...@arm.com; qper...@google.com; Linuxarm
> 
> Subject: RE: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU
> schedule out
> 
> Hi Will,
> 
> > -Original Message-
> > From: Will Deacon [mailto:w...@kernel.org]
> > Sent: 03 August 2021 16:31
> > To: Shameerali Kolothum Thodi 
> > Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> > linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> > james.mo...@arm.com; julien.thierry.k...@gmail.com;
> > suzuki.poul...@arm.com; jean-phili...@linaro.org;
> > alexandru.eli...@arm.com; qper...@google.com; Linuxarm
> > 
> > Subject: Re: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU
> > schedule out
> 
> [...]
> 
> > I think we have to be really careful not to run into the "suspended
> > animation" problem described in ae120d9edfe9 ("ARM: 7767/1: let the ASID
> > allocator handle suspended animation") if we go down this road.
> >
> > Maybe something along the lines of:
> >
> > ROLLOVER
> >
> >   * Take lock
> >   * Inc generation
> > => This will force everybody down the slow path
> >   * Record active VMIDs
> >   * Broadcast TLBI
> > => Only active VMIDs can be dirty
> > => Reserve active VMIDs and mark as allocated
> >
> > VCPU SCHED IN
> >
> >   * Set active VMID
> >   * Check generation
> >   * If mismatch then:
> > * Take lock
> > * Try to match a reserved VMID
> > * If no reserved VMID, allocate new
> >
> > VCPU SCHED OUT
> >
> >   * Clear active VMID
> >
> > but I'm not daft enough to think I got it right first time. I think it
> > needs both implementing *and* modelling in TLA+ before we merge it!
> 
> I attempted to implement the above algo as below. It seems to be
> working in both 16-bit vmid and 4-bit vmid test setup. 

It is not :(. I did an extended, overnight test run and it fails.
It looks to me in my below implementation there is no synchronization
on setting the active VMID and a concurrent rollover. I will have another go.

Thanks,
Shameer

Though I am
> not quite sure this Is exactly what you had in mind above and covers
> all corner cases.
> 
> Please take a look and let me know.
> (The diff below is against this v3 series)
> 
> Thanks,
> Shameer
> 
> --->8<
> 
> --- a/arch/arm64/kvm/vmid.c
> +++ b/arch/arm64/kvm/vmid.c
> @@ -43,7 +43,7 @@ static void flush_context(void)
> bitmap_clear(vmid_map, 0, NUM_USER_VMIDS);
> 
> for_each_possible_cpu(cpu) {
> -   vmid = atomic64_xchg_relaxed(_cpu(active_vmids,
> cpu), 0);
> +   vmid = atomic64_read(_cpu(active_vmids, cpu));
> 
> /* Preserve reserved VMID */
> if (vmid == 0)
> @@ -125,32 +125,17 @@ void kvm_arm_vmid_clear_active(void)
>  void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid)
>  {
> unsigned long flags;
> -   u64 vmid, old_active_vmid;
> +   u64 vmid;
> 
> vmid = atomic64_read(_vmid->id);
> -
> -   /*
> -* Please refer comments in check_and_switch_context() in
> -* arch/arm64/mm/context.c.
> -*/
> -   old_active_vmid = atomic64_read(this_cpu_ptr(_vmids));
> -   if (old_active_vmid && vmid_gen_match(vmid) &&
> -   atomic64_cmpxchg_relaxed(this_cpu_ptr(_vmids),
> -old_active_vmid, vmid))
> +   if (vmid_gen_match(vmid)) {
> +   atomic64_set(this_cpu_ptr(_vmids), vmid);
> return;
> -
> -   raw_spin_lock_irqsave(_vmid_lock, flags);
> -
> -   /* Check that our VMID belongs to the current generation. */
> -   vmid = atomic64_read(_vmid->id);
> -   if (!vmid_gen_match(vmid)) {
> -   vmid = new_vmid(kvm_vmid);
> -   atomic64_set(_vmid->id, vmid);
> }
> 
> -
> +   raw_spin_lock_irqsave(_vmid_lock, flags);
> +   vmid = new_vmid(kvm_vmid);
> +   atomic64_set(_vmid->id, vmid);
> atomic64_set(this_cpu_ptr(_vmids), vmid);
> raw_spin_unlock_irqrestore(_vmid_lock, flags);
>  }
> --->8<
> 
> 
> 

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH 0/4] Fix racing TLBI with ASID/VMID reallocation

2021-09-10 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Will Deacon [mailto:w...@kernel.org]
> Sent: 06 August 2021 12:31
> To: linux-arm-ker...@lists.infradead.org
> Cc: kernel-t...@android.com; Will Deacon ; Catalin
> Marinas ; Marc Zyngier ; Jade
> Alglave ; Shameerali Kolothum Thodi
> ; kvmarm@lists.cs.columbia.edu;
> linux-a...@vger.kernel.org
> Subject: [PATCH 0/4] Fix racing TLBI with ASID/VMID reallocation
> 
> Hi all,
> 
> While reviewing Shameer's reworked VMID allocator [1] and discussing
> with Marc, we spotted a race between TLB invalidation (which typically
> takes an ASID or VMID argument) and reallocation of ASID/VMID for the
> context being targetted.
> 
> The first patch spells out an example with try_to_unmap_one() in a
> comment, which Catalin has kindly modelled in TLA+ at [2].
> 
> Although I'm posting all this together for ease of review, the intention
> is that the first patch will go via arm64 with the latter going via kvm.
> 
> Cheers,
> 
> Will
> 
> [1]
> https://lore.kernel.org/r/20210729104009.382-1-shameerali.kolothum.thodi
> @huawei.com
> [2]
> https://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/kernel-tla.git/commi
> t/

Hi Catalin,

I am going through the ASID TLA+ model and in the above commit, it appears that 
the
different ASID check(=> ActiveAsid(c1) # ActiveAsid(c2)) for the Invariant
UniqueASIDActiveTask is now removed.

Just wondering why that is not relevant anymore?

Thanks,
Shameer

> Cc: Catalin Marinas 
> Cc: Marc Zyngier 
> Cc: Jade Alglave 
> Cc: Shameer Kolothum 
> Cc: 
> Cc: 
> 
> --->8
> 
> Marc Zyngier (3):
>   KVM: arm64: Move kern_hyp_va() usage in __load_guest_stage2() into the
> callers
>   KVM: arm64: Convert the host S2 over to __load_guest_stage2()
>   KVM: arm64: Upgrade VMID accesses to {READ,WRITE}_ONCE
> 
> Will Deacon (1):
>   arm64: mm: Fix TLBI vs ASID rollover
> 
>  arch/arm64/include/asm/kvm_mmu.h  | 17 ++-
>  arch/arm64/include/asm/mmu.h  | 29
> ---
>  arch/arm64/include/asm/tlbflush.h | 11 +++
>  arch/arm64/kvm/arm.c  |  2 +-
>  arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  2 +-
>  arch/arm64/kvm/hyp/nvhe/mem_protect.c |  6 ++--
>  arch/arm64/kvm/hyp/nvhe/switch.c  |  4 ++-
>  arch/arm64/kvm/hyp/nvhe/tlb.c |  2 +-
>  arch/arm64/kvm/hyp/vhe/switch.c   |  2 +-
>  arch/arm64/kvm/hyp/vhe/tlb.c  |  2 +-
>  arch/arm64/kvm/mmu.c  |  2 +-
>  11 files changed, 52 insertions(+), 27 deletions(-)
> 
> --
> 2.32.0.605.g8dce9f2422-goog

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU schedule out

2021-08-11 Thread Shameerali Kolothum Thodi

Hi Will,

> -Original Message-
> From: Will Deacon [mailto:w...@kernel.org]
> Sent: 03 August 2021 16:31
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org;
> alexandru.eli...@arm.com; qper...@google.com; Linuxarm
> 
> Subject: Re: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU
> schedule out

[...]
 
> I think we have to be really careful not to run into the "suspended
> animation" problem described in ae120d9edfe9 ("ARM: 7767/1: let the ASID
> allocator handle suspended animation") if we go down this road.
> 
> Maybe something along the lines of:
> 
> ROLLOVER
> 
>   * Take lock
>   * Inc generation
> => This will force everybody down the slow path
>   * Record active VMIDs
>   * Broadcast TLBI
> => Only active VMIDs can be dirty
> => Reserve active VMIDs and mark as allocated
> 
> VCPU SCHED IN
> 
>   * Set active VMID
>   * Check generation
>   * If mismatch then:
> * Take lock
> * Try to match a reserved VMID
> * If no reserved VMID, allocate new
> 
> VCPU SCHED OUT
> 
>   * Clear active VMID
> 
> but I'm not daft enough to think I got it right first time. I think it
> needs both implementing *and* modelling in TLA+ before we merge it!

I attempted to implement the above algo as below. It seems to be
working in both 16-bit vmid and 4-bit vmid test setup. Though I am
not quite sure this Is exactly what you had in mind above and covers
all corner cases.

Please take a look and let me know.
(The diff below is against this v3 series)

Thanks,
Shameer

--->8<

--- a/arch/arm64/kvm/vmid.c
+++ b/arch/arm64/kvm/vmid.c
@@ -43,7 +43,7 @@ static void flush_context(void)
bitmap_clear(vmid_map, 0, NUM_USER_VMIDS);

for_each_possible_cpu(cpu) {
-   vmid = atomic64_xchg_relaxed(_cpu(active_vmids, cpu), 0);
+   vmid = atomic64_read(_cpu(active_vmids, cpu));

/* Preserve reserved VMID */
if (vmid == 0)
@@ -125,32 +125,17 @@ void kvm_arm_vmid_clear_active(void)
 void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid)
 {
unsigned long flags;
-   u64 vmid, old_active_vmid;
+   u64 vmid;

vmid = atomic64_read(_vmid->id);
-
-   /*
-* Please refer comments in check_and_switch_context() in
-* arch/arm64/mm/context.c.
-*/
-   old_active_vmid = atomic64_read(this_cpu_ptr(_vmids));
-   if (old_active_vmid && vmid_gen_match(vmid) &&
-   atomic64_cmpxchg_relaxed(this_cpu_ptr(_vmids),
-old_active_vmid, vmid))
+   if (vmid_gen_match(vmid)) {
+   atomic64_set(this_cpu_ptr(_vmids), vmid);
return;
-
-   raw_spin_lock_irqsave(_vmid_lock, flags);
-
-   /* Check that our VMID belongs to the current generation. */
-   vmid = atomic64_read(_vmid->id);
-   if (!vmid_gen_match(vmid)) {
-   vmid = new_vmid(kvm_vmid);
-   atomic64_set(_vmid->id, vmid);
}

-
+   raw_spin_lock_irqsave(_vmid_lock, flags);
+   vmid = new_vmid(kvm_vmid);
+   atomic64_set(_vmid->id, vmid);
atomic64_set(this_cpu_ptr(_vmids), vmid);
raw_spin_unlock_irqrestore(_vmid_lock, flags);
 }
--->8<




___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU schedule out

2021-08-09 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Will Deacon [mailto:w...@kernel.org]
> Sent: 09 August 2021 14:09
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org;
> alexandru.eli...@arm.com; qper...@google.com; Linuxarm
> 
> Subject: Re: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU
> schedule out
> 
> On Fri, Aug 06, 2021 at 12:24:36PM +, Shameerali Kolothum Thodi
> wrote:
> > These are some test numbers with and without this patch, run on two
> > different test setups.
> >
> >
> > a)Test Setup -1
> > ---
> >
> > Platform: HiSilicon D06 with 128 CPUs, VMID bits = 16
> > Run 128 VMs concurrently each with 2 vCPUs. Each Guest will execute
> hackbench
> > 5 times before exiting.
> >
> > Measurements taken avg. of 10 Runs.
> >
> > Image : 5.14-rc3
> > ---
> >   Time(s)   44.43813888
> >   No. of exits145,348,264
> >
> > Image: 5.14-rc3 + vmid-v3
> > 
> >   Time(s)46.59789034
> >   No. of exits 133,587,307
> >
> > %diff against 5.14-rc3
> >   Time: 4.8% more
> >   Exits: 8% less
> >
> > Image: 5.14-rc3 + vmid-v3 + Without active_asid clear
> > ---
> >   Time(s) 44.5031782
> >   No. of exits  144,443,188
> >
> > %diff against 5.14-rc3
> >   Time: 0.15% more
> >   Exits: 2.42% less
> >
> > b)Test Setup -2
> > ---
> >
> > Platform: HiSilicon D06 + Kernel with maxcpus set to 8 and VMID bits set to
> 4.
> > Run 40 VMs concurrently each with 2 vCPUs. Each Guest will execute
> hackbench
> > 5 times before exiting.
> >
> > Measurements taken avg. of 10 Runs.
> >
> > Image : 5.14-rc3-vmid4bit
> > 
> >   Time(s)46.19963266
> >   No. of exits 23,699,546
> >
> > Image: 5.14-rc3-vmid4bit + vmid-v3
> > ---
> >   Time(s)  45.83307736
> >   No. of exits  23,260,203
> >
> > %diff against 5.14-rc3-vmid4bit
> >   Time: 0.8% less
> >   Exits: 1.85% less
> >
> > Image: 5.14-rc3-vmid4bit + vmid-v3 + Without active_asid clear
> > -
> >   Time(s)   44.5031782
> >   No. of exits144,443,188
> 
> Really? The *exact* same numbers as the "Image: 5.14-rc3 + vmid-v3 +
> Without
> active_asid clear" configuration? Guessing a copy-paste error here.
> 
> > %diff against 5.14-rc3-vmid4bit
> >   Time: 1.05% less
> >   Exits: 2.06% less
> >
> > As expected, the active_asid clear on schedule out is not helping.
> > But without this patch, the numbers seems to be better than the
> > vanilla kernel when we force the setup(cpus=8, vmd=4bits)
> > to perform rollover.
> 
> I'm struggling a bit to understand these numbers. Are you saying that
> clearing the active_asid helps in the 16-bit VMID case but not in the
> 4-bit case?

Nope, the other way around.. The point I was trying to make is that
clearing the active_vmids definitely have an impact in 16-bit vmid
case, where rollover is not happening, as it ends up taking the slow
path more frequently.

Test setup-1, case 2(with active_vmids clear): Around 4.8% more time
to finish the test compared to vanilla kernel.

Test setup-1, case 3(Without clear): 0.15% more time compared to
vanilla kernel.

For the 4-bit vmid case, the impact of clearing vmids is not that obvious
probably because we have more rollovers.

Test setup-2, case 2(with active_vmids clear):0.8% less time compared to 
vanilla.
Test setup-2, case 3(Without clear): 1.05% less time compared to vanilla kernel.
 
So between the two(with and without clearing the active_vmids), the "without" 
one has better numbers for both Test setups.

> Why would the active_asid clear have any impact on the number of exits?

In 16 bit vmid case, it looks like the no. of exits is considerably lower if we 
clear
active_vmids. . Not sure it is because of the frequent slow path or not. But 
anyway,
the time to finish the test is higher.
 
> The problem I see with not having the active_asid clear is that we will
> roll over more frequently as the number of reserved VMIDs increases.

Ok. The idea of running the 4-bit test setup was to capture that. It doesn't
look like it has a major impact when compared to the original kernel. May be
I should take an average of more test runs. Please let me know if there is a 
better way to measure that impact.

Hope, I am clear.

Thanks,
Shameer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU schedule out

2021-08-06 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Shameerali Kolothum Thodi
> Sent: 03 August 2021 16:57
> To: 'Will Deacon' 
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org;
> alexandru.eli...@arm.com; qper...@google.com; Linuxarm
> 
> Subject: RE: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU
> schedule out
> 
> 
> 
> > -Original Message-
> > From: Will Deacon [mailto:w...@kernel.org]
> > Sent: 03 August 2021 16:31
> > To: Shameerali Kolothum Thodi 
> > Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> > linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> > james.mo...@arm.com; julien.thierry.k...@gmail.com;
> > suzuki.poul...@arm.com; jean-phili...@linaro.org;
> > alexandru.eli...@arm.com; qper...@google.com; Linuxarm
> > 
> > Subject: Re: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU
> > schedule out
> >
> > On Tue, Aug 03, 2021 at 12:55:25PM +, Shameerali Kolothum Thodi
> > wrote:
> > > > > diff --git a/arch/arm64/kvm/vmid.c b/arch/arm64/kvm/vmid.c
> > > > > index 5584e84aed95..5fd51f5445c1 100644
> > > > > --- a/arch/arm64/kvm/vmid.c
> > > > > +++ b/arch/arm64/kvm/vmid.c
> > > > > @@ -116,6 +116,12 @@ static u64 new_vmid(struct kvm_vmid
> > > > *kvm_vmid)
> > > > >   return idx2vmid(vmid) | generation;
> > > > >  }
> > > > >
> > > > > +/* Call with preemption disabled */
> > > > > +void kvm_arm_vmid_clear_active(void)
> > > > > +{
> > > > > + atomic64_set(this_cpu_ptr(_vmids), 0);
> > > > > +}
> > > >
> > > > I think this is very broken, as it will force everybody to take the
> > > > slow-path when they see an active_vmid of 0.
> > >
> > > Yes. I have seen that happening in my test setup.
> >
> > Why didn't you say so?!
> 
> Sorry. I thought of getting some performance numbers with and
> without this patch and measure the impact. But didn't quite get time
> to finish it yet.

These are some test numbers with and without this patch, run on two
different test setups.


a)Test Setup -1
---

Platform: HiSilicon D06 with 128 CPUs, VMID bits = 16
Run 128 VMs concurrently each with 2 vCPUs. Each Guest will execute hackbench
5 times before exiting.

Measurements taken avg. of 10 Runs.

Image : 5.14-rc3
---
  Time(s)   44.43813888
  No. of exits145,348,264

Image: 5.14-rc3 + vmid-v3

  Time(s)46.59789034
  No. of exits 133,587,307

%diff against 5.14-rc3
  Time: 4.8% more
  Exits: 8% less 

Image: 5.14-rc3 + vmid-v3 + Without active_asid clear
---
  Time(s) 44.5031782
  No. of exits  144,443,188

%diff against 5.14-rc3
  Time: 0.15% more
  Exits: 2.42% less

b)Test Setup -2
---

Platform: HiSilicon D06 + Kernel with maxcpus set to 8 and VMID bits set to 4.
Run 40 VMs concurrently each with 2 vCPUs. Each Guest will execute hackbench
5 times before exiting.

Measurements taken avg. of 10 Runs.

Image : 5.14-rc3-vmid4bit

  Time(s)46.19963266
  No. of exits 23,699,546

Image: 5.14-rc3-vmid4bit + vmid-v3
---
  Time(s)  45.83307736
  No. of exits  23,260,203

%diff against 5.14-rc3-vmid4bit
  Time: 0.8% less
  Exits: 1.85% less 

Image: 5.14-rc3-vmid4bit + vmid-v3 + Without active_asid clear
-
  Time(s)   44.5031782
  No. of exits144,443,188

%diff against 5.14-rc3-vmid4bit
  Time: 1.05% less
  Exits: 2.06% less

As expected, the active_asid clear on schedule out is not helping.
But without this patch, the numbers seems to be better than the
vanilla kernel when we force the setup(cpus=8, vmd=4bits)
to perform rollover.

Please let me know your thoughts.

Thanks,
Shameer

> 
> >
> > > > It also doesn't solve the issue I mentioned before, as an active_vmid 
> > > > of 0
> > > > means that the reserved vmid is preserved.
> > > >
> > > > Needs more thought...
> > >
> > > How about we clear all the active_vmids in kvm_arch_free_vm() if it
> > > matches the kvm_vmid->id ? But we may have to hold the lock
> > > there
&

RE: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU schedule out

2021-08-03 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Will Deacon [mailto:w...@kernel.org]
> Sent: 03 August 2021 16:31
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org;
> alexandru.eli...@arm.com; qper...@google.com; Linuxarm
> 
> Subject: Re: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU
> schedule out
> 
> On Tue, Aug 03, 2021 at 12:55:25PM +, Shameerali Kolothum Thodi
> wrote:
> > > > diff --git a/arch/arm64/kvm/vmid.c b/arch/arm64/kvm/vmid.c
> > > > index 5584e84aed95..5fd51f5445c1 100644
> > > > --- a/arch/arm64/kvm/vmid.c
> > > > +++ b/arch/arm64/kvm/vmid.c
> > > > @@ -116,6 +116,12 @@ static u64 new_vmid(struct kvm_vmid
> > > *kvm_vmid)
> > > > return idx2vmid(vmid) | generation;
> > > >  }
> > > >
> > > > +/* Call with preemption disabled */
> > > > +void kvm_arm_vmid_clear_active(void)
> > > > +{
> > > > +   atomic64_set(this_cpu_ptr(_vmids), 0);
> > > > +}
> > >
> > > I think this is very broken, as it will force everybody to take the
> > > slow-path when they see an active_vmid of 0.
> >
> > Yes. I have seen that happening in my test setup.
> 
> Why didn't you say so?!

Sorry. I thought of getting some performance numbers with and
without this patch and measure the impact. But didn't quite get time
to finish it yet.
 
> 
> > > It also doesn't solve the issue I mentioned before, as an active_vmid of 0
> > > means that the reserved vmid is preserved.
> > >
> > > Needs more thought...
> >
> > How about we clear all the active_vmids in kvm_arch_free_vm() if it
> > matches the kvm_vmid->id ? But we may have to hold the lock
> > there
> 
> I think we have to be really careful not to run into the "suspended
> animation" problem described in ae120d9edfe9 ("ARM: 7767/1: let the ASID
> allocator handle suspended animation") if we go down this road.


Ok. I will go through that.
 
> Maybe something along the lines of:
> 
> ROLLOVER
> 
>   * Take lock
>   * Inc generation
> => This will force everybody down the slow path
>   * Record active VMIDs
>   * Broadcast TLBI
> => Only active VMIDs can be dirty
> => Reserve active VMIDs and mark as allocated
> 
> VCPU SCHED IN
> 
>   * Set active VMID
>   * Check generation
>   * If mismatch then:
> * Take lock
> * Try to match a reserved VMID
> * If no reserved VMID, allocate new
> 
> VCPU SCHED OUT
> 
>   * Clear active VMID
> 
> but I'm not daft enough to think I got it right first time. I think it
> needs both implementing *and* modelling in TLA+ before we merge it!
> 

Ok. I need some time to digest the above first :).

On another note, how serious do you think is the problem of extra
reservation of the VMID space? Just wondering if we can skip this
patch for now or not..

Thanks,
Shameer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU schedule out

2021-08-03 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Will Deacon [mailto:w...@kernel.org]
> Sent: 03 August 2021 12:41
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org;
> alexandru.eli...@arm.com; qper...@google.com; Linuxarm
> 
> Subject: Re: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU
> schedule out
> 
> On Thu, Jul 29, 2021 at 11:40:09AM +0100, Shameer Kolothum wrote:
> > Like ASID allocator, we copy the active_vmids into the
> > reserved_vmids on a rollover. But it's unlikely that
> > every CPU will have a vCPU as current task and we may
> > end up unnecessarily reserving the VMID space.
> >
> > Hence, clear active_vmids when scheduling out a vCPU.
> >
> > Suggested-by: Will Deacon 
> > Signed-off-by: Shameer Kolothum
> 
> > ---
> >  arch/arm64/include/asm/kvm_host.h | 1 +
> >  arch/arm64/kvm/arm.c  | 1 +
> >  arch/arm64/kvm/vmid.c | 6 ++
> >  3 files changed, 8 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h
> b/arch/arm64/include/asm/kvm_host.h
> > index bb993bce1363..d93141cb8d16 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -687,6 +687,7 @@ extern unsigned int kvm_arm_vmid_bits;
> >  int kvm_arm_vmid_alloc_init(void);
> >  void kvm_arm_vmid_alloc_free(void);
> >  void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid);
> > +void kvm_arm_vmid_clear_active(void);
> >
> >  static inline void kvm_arm_pvtime_vcpu_init(struct kvm_vcpu_arch
> *vcpu_arch)
> >  {
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 077e55a511a9..b134a1b89c84 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -435,6 +435,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > kvm_timer_vcpu_put(vcpu);
> > kvm_vgic_put(vcpu);
> > kvm_vcpu_pmu_restore_host(vcpu);
> > +   kvm_arm_vmid_clear_active();
> >
> > vcpu->cpu = -1;
> >  }
> > diff --git a/arch/arm64/kvm/vmid.c b/arch/arm64/kvm/vmid.c
> > index 5584e84aed95..5fd51f5445c1 100644
> > --- a/arch/arm64/kvm/vmid.c
> > +++ b/arch/arm64/kvm/vmid.c
> > @@ -116,6 +116,12 @@ static u64 new_vmid(struct kvm_vmid
> *kvm_vmid)
> > return idx2vmid(vmid) | generation;
> >  }
> >
> > +/* Call with preemption disabled */
> > +void kvm_arm_vmid_clear_active(void)
> > +{
> > +   atomic64_set(this_cpu_ptr(_vmids), 0);
> > +}
> 
> I think this is very broken, as it will force everybody to take the
> slow-path when they see an active_vmid of 0.

Yes. I have seen that happening in my test setup.

> It also doesn't solve the issue I mentioned before, as an active_vmid of 0
> means that the reserved vmid is preserved.
> 
> Needs more thought...

How about we clear all the active_vmids in kvm_arch_free_vm() if it
matches the kvm_vmid->id ? But we may have to hold the lock 
there.

Thanks,
Shameer
 

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v3 1/4] KVM: arm64: Introduce a new VMID allocator for KVM

2021-08-03 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Will Deacon [mailto:w...@kernel.org]
> Sent: 03 August 2021 12:39
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org;
> alexandru.eli...@arm.com; qper...@google.com; Linuxarm
> 
> Subject: Re: [PATCH v3 1/4] KVM: arm64: Introduce a new VMID allocator for
> KVM
> 
> On Thu, Jul 29, 2021 at 11:40:06AM +0100, Shameer Kolothum wrote:
> > A new VMID allocator for arm64 KVM use. This is based on
> > arm64 ASID allocator algorithm.
> >
> > One major deviation from the ASID allocator is the way we
> > flush the context. Unlike ASID allocator, we expect less
> > frequent rollover in the case of VMIDs. Hence, instead of
> > marking the CPU as flush_pending and issuing a local context
> > invalidation on the next context switch, we broadcast TLB
> > flush + I-cache invalidation over the inner shareable domain
> > on rollover.
> >
> > Signed-off-by: Shameer Kolothum
> 
> > ---
> 
> [...]
> 
> > +void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid)
> > +{
> > +   unsigned long flags;
> > +   unsigned int cpu;
> > +   u64 vmid, old_active_vmid;
> > +
> > +   vmid = atomic64_read(_vmid->id);
> > +
> > +   /*
> > +* Please refer comments in check_and_switch_context() in
> > +* arch/arm64/mm/context.c.
> > +*/
> > +   old_active_vmid = atomic64_read(this_cpu_ptr(_vmids));
> > +   if (old_active_vmid && vmid_gen_match(vmid) &&
> > +   atomic64_cmpxchg_relaxed(this_cpu_ptr(_vmids),
> > +old_active_vmid, vmid))
> > +   return;
> > +
> > +   raw_spin_lock_irqsave(_vmid_lock, flags);
> > +
> > +   /* Check that our VMID belongs to the current generation. */
> > +   vmid = atomic64_read(_vmid->id);
> > +   if (!vmid_gen_match(vmid)) {
> > +   vmid = new_vmid(kvm_vmid);
> > +   atomic64_set(_vmid->id, vmid);
> 
> new_vmid() can just set kvm_vmid->id directly

Ok.
> 
> > +   }
> > +
> > +   cpu = smp_processor_id();
> 
> Why?

Left over from previous one. Forgot to remove
as we don't have the tlb_flush_pending check anymore.

Thanks,
Shameer

> 
> Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [Linuxarm] Re: [RFC PATCH 4/5] iommu/arm-smmu-v3: Use pinned VMID for NESTED stage with BTM

2021-07-23 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Jean-Philippe Brucker [mailto:jean-phili...@linaro.org]
> Sent: 22 July 2021 17:46
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; io...@lists.linux-foundation.org;
> kvmarm@lists.cs.columbia.edu; m...@kernel.org;
> alex.william...@redhat.com; eric.au...@redhat.com;
> zhangfei@linaro.org; Jonathan Cameron
> ; Zengtao (B) ;
> linux...@openeuler.org; Linuxarm 
> Subject: [Linuxarm] Re: [RFC PATCH 4/5] iommu/arm-smmu-v3: Use pinned
> VMID for NESTED stage with BTM
> 
> Hi Shameer,
> 
> On Wed, Jul 21, 2021 at 08:54:00AM +, Shameerali Kolothum Thodi
> wrote:
> > > More generally I think this pinned VMID set conflicts with that of
> > > stage-2-only domains (which is the default state until a guest attaches a
> > > PASID table). Say you have one guest using DOMAIN_NESTED without
> PASID
> > > table, just DMA to IPA using VMID 0x8000. Now another guest attaches a
> > > PASID table and obtains the same VMID from KVM. The stage-2 translation
> > > might use TLB entries from the other guest, no?  They'll both create
> > > stage-2 TLB entries with {StreamWorld=NS-EL1, VMID=0x8000}
> >
> > Now that we are trying to align the KVM VMID allocation algorithm similar
> to
> > that of the ASID allocator [1], I attempted to use that for the SMMU pinned
> > VMID allocation. But the issue you have mentioned above is still valid.
> >
> > And as a solution what I have tried now is follow what pinned ASID is doing
> > in SVA,
> >  -Use xarray for private VMIDs
> >  -Get pinned VMID from KVM for DOMAIN_NESTED with PASID table
> >  -If the new pinned VMID is in use by private, then update the private
> >   VMID(VMID update to a live STE).
> >
> > This seems to work, but still need to run more tests with this though.
> >
> > > It's tempting to allocate all VMIDs through KVM instead, but that will
> > > force a dependency on KVM to use VFIO_TYPE1_NESTING_IOMMU and
> might
> > > break
> > > existing users of that extension (though I'm not sure there are any).
> > > Instead we might need to restrict the SMMU VMID bitmap to match the
> > > private VMID set in KVM.
> >
> > Another solution I have in mind is, make the new KVM VMID allocator
> common
> > between SMMUv3 and KVM. This will help to avoid all the private and
> shared
> > VMID splitting, also no need for live updates to STE VMID. One possible
> drawback
> > is less number of available KVM VMIDs but with 16 bit VMID space I am not
> sure
> > how much that is a concern.
> 
> Yes I think that works too. In practice there shouldn't be many VMIDs on
> the SMMU side, the feature's only enabled when a user wants to assign
> devices with nesting translation (unlike ASIDs where each device in the
> system gets a private ASID by default).

Ok. What about implementations that supports only stage 2? Do we
need a private VMID allocator for those or can use the same common
KVM VMID allocator?

> Note that you still need to pin all VMIDs used by the SMMU, otherwise
> you'll have to update the STE after rollover.

Sure.

> The problem we have with VFIO_TYPE1_NESTING_IOMMU might be solved by
> the
> upcoming deprecation of VFIO_*_IOMMU [2]. We need a specific sequence
> from
> userspace:
> 1. Attach VFIO group to KVM (KVM_DEV_VFIO_GROUP_ADD)
> 2. Create nesting IOMMU domain and attach the group to it
>(VFIO_GROUP_SET_CONTAINER, VFIO_SET_IOMMU becomes
> IOMMU_IOASID_ALLOC, VFIO_DEVICE_ATTACH_IOASID)
> Currently QEMU does 2 then 1, which would cause the SMMU to allocate a
> separate VMID.

Yes. I have observed this with my current implementation. I have a check
to see the private S2 config VMID belongs to the same domain s2_cfg, then
skip the live update to the STE VMID.

> If we wanted to extend VFIO_TYPE1_NESTING_IOMMU with
> PASID
> tables we'd need to mandate 1-2 and may break existing users. In the new
> design we can require from the start that creating a nesting IOMMU
> container through /dev/iommu *must* come with a KVM context, that way
> we're sure to reuse the existing VMID.

Ok. That helps.

Thanks,
Shameer
 
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v2 3/3] kvm/arm: Align the VMID allocation with the arm64 ASID one

2021-07-22 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Will Deacon [mailto:w...@kernel.org]
> Sent: 21 July 2021 17:32
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org;
> alexandru.eli...@arm.com; Linuxarm ;
> qper...@google.com
> Subject: Re: [PATCH v2 3/3] kvm/arm: Align the VMID allocation with the
> arm64 ASID one
> 
> [+Quentin]
> 
> On Wed, Jun 16, 2021 at 04:56:06PM +0100, Shameer Kolothum wrote:
> > From: Julien Grall 
> >
> > At the moment, the VMID algorithm will send an SGI to all the CPUs to
> > force an exit and then broadcast a full TLB flush and I-Cache
> > invalidation.
> >
> > This patch use the new VMID allocator. The
> > benefits are:
> > - CPUs are not forced to exit at roll-over. Instead the VMID will be
> > marked reserved and the context will be flushed at next exit. This
> > will reduce the IPIs traffic.
> > - Context invalidation is now per-CPU rather than broadcasted.
> > - Catalin has a formal model of the ASID allocator.
> >
> > With the new algo, the code is now adapted:
> > - The function __kvm_flush_vm_context() has been renamed to
> > __kvm_tlb_flush_local_all() and now only flushing the current CPU
> > context.
> > - The call to update_vmid() will be done with preemption disabled
> > as the new algo requires to store information per-CPU.
> > - The TLBs associated to EL1 will be flushed when booting a CPU to
> > deal with stale information. This was previously done on the
> > allocation of the first VMID of a new generation.
> >
> > Signed-off-by: Julien Grall 
> > Signed-off-by: Shameer Kolothum
> 
> > ---
> >  arch/arm64/include/asm/kvm_asm.h  |   4 +-
> >  arch/arm64/include/asm/kvm_host.h |   6 +-
> >  arch/arm64/include/asm/kvm_mmu.h  |   3 +-
> >  arch/arm64/kvm/Makefile   |   2 +-
> >  arch/arm64/kvm/arm.c  | 115 +++---
> >  arch/arm64/kvm/hyp/nvhe/hyp-main.c|   6 +-
> >  arch/arm64/kvm/hyp/nvhe/mem_protect.c |   3 +-
> >  arch/arm64/kvm/hyp/nvhe/tlb.c |  10 +--
> >  arch/arm64/kvm/hyp/vhe/tlb.c  |  10 +--
> >  arch/arm64/kvm/mmu.c  |   1 -
> >  10 files changed, 52 insertions(+), 108 deletions(-)
> 
> [...]
> 
> > diff --git a/arch/arm64/include/asm/kvm_host.h
> b/arch/arm64/include/asm/kvm_host.h
> > index 75a7e8071012..d96284da8571 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -70,9 +70,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
> >  void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu);
> >
> >  struct kvm_vmid {
> > -   /* The VMID generation used for the virt. memory system */
> > -   u64vmid_gen;
> > -   u32vmid;
> > +   atomic64_t id;
> 
> Maybe a typedef would be better if this is the only member of the structure?

Ok.

> 
> > diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > index 4b60c0056c04..a02c4877a055 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > @@ -106,8 +106,7 @@ int kvm_host_prepare_stage2(void *mem_pgt_pool,
> void *dev_pgt_pool)
> > mmu->pgd_phys = __hyp_pa(host_kvm.pgt.pgd);
> > mmu->arch = _kvm.arch;
> > mmu->pgt = _kvm.pgt;
> > -   mmu->vmid.vmid_gen = 0;
> > -   mmu->vmid.vmid = 0;
> > +   atomic64_set(>vmid.id, 0);
> 
> I think this is the first atomic64 use in the EL2 object, which may pull in
> some fatal KCSAN instrumentation. Quentin, have you run into this before?
> 
> Might be simple just to zero-initialise mmu for now, if it isn't already.

I will check that.

> 
> > diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c
> b/arch/arm64/kvm/hyp/nvhe/tlb.c
> > index 83dc3b271bc5..42df9931ed9a 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/tlb.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
> > @@ -140,10 +140,10 @@ void __kvm_flush_cpu_context(struct
> kvm_s2_mmu *mmu)
> > __tlb_switch_to_host();
> >  }
> >
> > -void __kvm_flush_vm_context(void)
> > +void __kvm_tlb_flush_local_all(void)
> >  {
> > -   dsb(ishst);
> > -   __tlbi(alle1is);
> > +   dsb(nshst);
> > +   __tlbi(alle1);
> >
> > /*
> >  * VIPT and P

RE: [PATCH v2 2/3] kvm/arm: Introduce a new vmid allocator for KVM

2021-07-22 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Will Deacon [mailto:w...@kernel.org]
> Sent: 21 July 2021 17:06
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org;
> alexandru.eli...@arm.com; Linuxarm 
> Subject: Re: [PATCH v2 2/3] kvm/arm: Introduce a new vmid allocator for KVM
> 
> On Wed, Jun 16, 2021 at 04:56:05PM +0100, Shameer Kolothum wrote:
> > A new VMID allocator for arm64 KVM use. This is based on
> > arm64 asid allocator algorithm.
> >
> > Signed-off-by: Shameer Kolothum
> 
> > ---
> >  arch/arm64/include/asm/kvm_host.h |   4 +
> >  arch/arm64/kvm/vmid.c | 206
> ++
> >  2 files changed, 210 insertions(+)
> >  create mode 100644 arch/arm64/kvm/vmid.c
> 
> Generally, I prefer this to the alternative of creating a library. However,
> I'd probably remove all the duplicated comments in favour of a reference
> to the ASID allocator. That way, we can just comment any VMID-specific
> behaviour in here.

Agree. I retained the comments mainly for myself as its very difficult at times
to follow :)

> 
> Some comments below...
> 
> > diff --git a/arch/arm64/include/asm/kvm_host.h
> b/arch/arm64/include/asm/kvm_host.h
> > index 7cd7d5c8c4bc..75a7e8071012 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -680,6 +680,10 @@ int kvm_arm_pvtime_get_attr(struct kvm_vcpu
> *vcpu,
> >  int kvm_arm_pvtime_has_attr(struct kvm_vcpu *vcpu,
> > struct kvm_device_attr *attr);
> >
> > +int kvm_arm_vmid_alloc_init(void);
> > +void kvm_arm_vmid_alloc_free(void);
> > +void kvm_arm_update_vmid(atomic64_t *id);
> > +
> >  static inline void kvm_arm_pvtime_vcpu_init(struct kvm_vcpu_arch
> *vcpu_arch)
> >  {
> > vcpu_arch->steal.base = GPA_INVALID;
> > diff --git a/arch/arm64/kvm/vmid.c b/arch/arm64/kvm/vmid.c
> > new file mode 100644
> > index ..687e18d33130
> > --- /dev/null
> > +++ b/arch/arm64/kvm/vmid.c
> > @@ -0,0 +1,206 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * VMID allocator.
> > + *
> > + * Based on arch/arm64/mm/context.c
> > + *
> > + * Copyright (C) 2002-2003 Deep Blue Solutions Ltd, all rights reserved.
> > + * Copyright (C) 2012 ARM Ltd.
> > + */
> > +
> > +#include 
> > +#include 
> > +
> > +#include 
> > +#include 
> > +
> > +static u32 vmid_bits;
> > +static DEFINE_RAW_SPINLOCK(cpu_vmid_lock);
> > +
> > +static atomic64_t vmid_generation;
> > +static unsigned long *vmid_map;
> > +
> > +static DEFINE_PER_CPU(atomic64_t, active_vmids);
> > +static DEFINE_PER_CPU(u64, reserved_vmids);
> > +static cpumask_t tlb_flush_pending;
> > +
> > +#define VMID_MASK  (~GENMASK(vmid_bits - 1, 0))
> > +#define VMID_FIRST_VERSION (1UL << vmid_bits)
> > +
> > +#define NUM_USER_VMIDS VMID_FIRST_VERSION
> > +#define vmid2idx(vmid) ((vmid) & ~VMID_MASK)
> > +#define idx2vmid(idx)  vmid2idx(idx)
> > +
> > +#define vmid_gen_match(vmid) \
> > +   (!(((vmid) ^ atomic64_read(_generation)) >> vmid_bits))
> > +
> > +static void flush_context(void)
> > +{
> > +   int cpu;
> > +   u64 vmid;
> > +
> > +   bitmap_clear(vmid_map, 0, NUM_USER_VMIDS);
> > +
> > +   for_each_possible_cpu(cpu) {
> > +   vmid = atomic64_xchg_relaxed(_cpu(active_vmids, cpu), 0);
> > +   /*
> > +* If this CPU has already been through a
> > +* rollover, but hasn't run another task in
> > +* the meantime, we must preserve its reserved
> > +* VMID, as this is the only trace we have of
> > +* the process it is still running.
> > +*/
> > +   if (vmid == 0)
> > +   vmid = per_cpu(reserved_vmids, cpu);
> > +   __set_bit(vmid2idx(vmid), vmid_map);
> > +   per_cpu(reserved_vmids, cpu) = vmid;
> > +   }
> 
> Hmm, so here we're copying the active_vmids into the reserved_vmids on a
> rollover, but I wonder if that's overly pessismistic? For the ASID
> allocator, every CPU tends to have a current task so it makes sense, but
> I'm not sure it's necessarily the case that every CPU tends to have a
> vCPU as the current

RE: [PATCH v2 1/3] arch/arm64: Introduce a capability to tell whether 16-bit VMID is available

2021-07-22 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Will Deacon [mailto:w...@kernel.org]
> Sent: 21 July 2021 16:23
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org;
> alexandru.eli...@arm.com; Linuxarm 
> Subject: Re: [PATCH v2 1/3] arch/arm64: Introduce a capability to tell whether
> 16-bit VMID is available
> 
> On Wed, Jun 16, 2021 at 04:56:04PM +0100, Shameer Kolothum wrote:
> > From: Julien Grall 
> >
> > At the moment, the function kvm_get_vmid_bits() is looking up for the
> > sanitized value of ID_AA64MMFR1_EL1 and extract the information
> > regarding the number of VMID bits supported.
> >
> > This is fine as the function is mainly used during VMID roll-over. New
> > use in a follow-up patch will require the function to be called a every
> > context switch so we want the function to be more efficient.
> >
> > A new capability is introduced to tell whether 16-bit VMID is
> > available.
> 
> I don't really buy this rationale. The VMID allocator introduced later on
> caches this value in the static 'vmid_bits' variable, and that gets used
> on vCPU enter via vmid_gen_match() in the kvm_arm_update_vmid() fastpath.
> 
> So I would prefer that we just expose an accessor for that than introduce
> a static key and new cpufeature just for kvm_get_vttbr().

Ok. Will change it to an accessor.

Thanks,
Shameer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [RFC PATCH 4/5] iommu/arm-smmu-v3: Use pinned VMID for NESTED stage with BTM

2021-07-21 Thread Shameerali Kolothum Thodi

Hi Jean,

> -Original Message-
> From: Jean-Philippe Brucker [mailto:jean-phili...@linaro.org]
> Sent: 04 March 2021 17:11
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; io...@lists.linux-foundation.org;
> kvmarm@lists.cs.columbia.edu; m...@kernel.org;
> alex.william...@redhat.com; eric.au...@redhat.com;
> zhangfei@linaro.org; Jonathan Cameron
> ; Zengtao (B) ;
> linux...@openeuler.org
> Subject: Re: [RFC PATCH 4/5] iommu/arm-smmu-v3: Use pinned VMID for
> NESTED stage with BTM

[...]

> >
> > kfree(smmu_domain);
> > @@ -3199,6 +3230,17 @@ static int arm_smmu_attach_pasid_table(struct
> iommu_domain *domain,
> > !(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB))
> > goto out;
> >
> > +   if (smmu->features & ARM_SMMU_FEAT_BTM) {
> > +   ret = arm_smmu_pinned_vmid_get(smmu_domain);
> > +   if (ret < 0)
> > +   goto out;
> > +
> > +   if (smmu_domain->s2_cfg.vmid)
> > +   arm_smmu_bitmap_free(smmu->vmid_map,
> smmu_domain->s2_cfg.vmid);
> > +
> > +   smmu_domain->s2_cfg.vmid = (u16)ret;
> 
> That will require a TLB invalidation on the old VMID, once the STE is
> rewritten.
> 
> More generally I think this pinned VMID set conflicts with that of
> stage-2-only domains (which is the default state until a guest attaches a
> PASID table). Say you have one guest using DOMAIN_NESTED without PASID
> table, just DMA to IPA using VMID 0x8000. Now another guest attaches a
> PASID table and obtains the same VMID from KVM. The stage-2 translation
> might use TLB entries from the other guest, no?  They'll both create
> stage-2 TLB entries with {StreamWorld=NS-EL1, VMID=0x8000}

Now that we are trying to align the KVM VMID allocation algorithm similar to
that of the ASID allocator [1], I attempted to use that for the SMMU pinned 
VMID allocation. But the issue you have mentioned above is still valid. 

And as a solution what I have tried now is follow what pinned ASID is doing 
in SVA,
 -Use xarray for private VMIDs
 -Get pinned VMID from KVM for DOMAIN_NESTED with PASID table
 -If the new pinned VMID is in use by private, then update the private
  VMID(VMID update to a live STE).

This seems to work, but still need to run more tests with this though.  

> It's tempting to allocate all VMIDs through KVM instead, but that will
> force a dependency on KVM to use VFIO_TYPE1_NESTING_IOMMU and might
> break
> existing users of that extension (though I'm not sure there are any).
> Instead we might need to restrict the SMMU VMID bitmap to match the
> private VMID set in KVM.

Another solution I have in mind is, make the new KVM VMID allocator common
between SMMUv3 and KVM. This will help to avoid all the private and shared
VMID splitting, also no need for live updates to STE VMID. One possible drawback
is less number of available KVM VMIDs but with 16 bit VMID space I am not sure
how much that is a concern.

Please let me know your thoughts.

Thanks,
Shameer

[1]. 
https://lore.kernel.org/kvmarm/20210616155606.2806-1-shameerali.kolothum.th...@huawei.com/

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v2 0/3] kvm/arm: New VMID allocator based on asid

2021-07-13 Thread Shameerali Kolothum Thodi

Hi Marc,

> -Original Message-
> From: Shameerali Kolothum Thodi
> Sent: 16 June 2021 16:56
> To: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org
> Cc: m...@kernel.org; w...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org;
> alexandru.eli...@arm.com; Linuxarm 
> Subject: [PATCH v2 0/3] kvm/arm: New VMID allocator based on asid
> 
> Hi,
> 
> RFCv1 --> v2
>- Dropped "pinned VMID" support for now.
>- Dropped RFC tag.
> 
> Sanity tested on HiSilicon D06 board.

A gentle ping on this one. Please let me know if you had a chance to look
at this and have any feedback. I could do a rebase to 5.14-rc1 if required.

Thanks,
Shameer

> 
> Thanks,
> Shameer
> 
> History:
> ---
> Please find the RFC series here,
> https://lore.kernel.org/kvmarm/20210506165232.1969-1-shameerali.kolothu
> m.th...@huawei.com/
> 
> From RFCv1:
> 
> This is based on a suggestion from Will [0] to try out the asid
> based kvm vmid solution as a separate VMID allocator instead of
> the shared lib approach attempted in v4[1].
> 
> The idea is to compare both the approaches and see whether the
> shared lib solution with callbacks make sense or not.
> 
> Though we are not using the pinned vmids yet, patch #2 has
> code for pinned vmid support. This is just to help the comparison.
> 
> Test Setup/Results
> 
> The measurement was made with maxcpus set to 8 and with the
> number of VMID limited to 4-bit. The test involves running
> concurrently 40 guests with 2 vCPUs. Each guest will then
> execute hackbench 5 times before exiting.
> 
> The performance difference between the current algo and the
> new one are(avg. of 10 runs):
> - 1.9% less entry/exit from the guest
> - 0.5% faster
> 
> This is more or less comparable to v4 numbers.
> 
> For the complete series, please see,
> https://github.com/hisilicon/kernel-dev/tree/private-v5.12-rc7-vmid-2nd-rfc
> 
> and for the shared asid lib v4 solution,
> https://github.com/hisilicon/kernel-dev/tree/private-v5.12-rc7-asid-v4
> 
> As you can see there are ofcourse code duplication with this
> approach but may be it is more easy to maintain considering
> the complexity involved.
> 
> [0] https://lore.kernel.org/lkml/20210422160846.GB2214@willie-the-truck/
> [1]
> https://lore.kernel.org/lkml/20210414112312.13704-1-shameerali.kolothum.t
> h...@huawei.com/
> 
> Julien Grall (2):
>   arch/arm64: Introduce a capability to tell whether 16-bit VMID is
> available
>   kvm/arm: Align the VMID allocation with the arm64 ASID one
> 
> Shameer Kolothum (1):
>   kvm/arm: Introduce a new vmid allocator for KVM
> 
>  arch/arm64/include/asm/kvm_asm.h  |   4 +-
>  arch/arm64/include/asm/kvm_host.h |  10 +-
>  arch/arm64/include/asm/kvm_mmu.h  |   7 +-
>  arch/arm64/kernel/cpufeature.c|   9 ++
>  arch/arm64/kvm/Makefile   |   2 +-
>  arch/arm64/kvm/arm.c  | 115 --
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c|   6 +-
>  arch/arm64/kvm/hyp/nvhe/mem_protect.c |   3 +-
>  arch/arm64/kvm/hyp/nvhe/tlb.c |  10 +-
>  arch/arm64/kvm/hyp/vhe/tlb.c  |  10 +-
>  arch/arm64/kvm/mmu.c  |   1 -
>  arch/arm64/kvm/vmid.c | 206
> ++
>  arch/arm64/tools/cpucaps  |   1 +
>  13 files changed, 273 insertions(+), 111 deletions(-)
>  create mode 100644 arch/arm64/kvm/vmid.c
> 
> --
> 2.17.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [RFC PATCH 0/3] kvm/arm: New VMID allocator based on asid(2nd approach)

2021-06-04 Thread Shameerali Kolothum Thodi

Hi Marc,

> -Original Message-
> From: Marc Zyngier [mailto:m...@kernel.org]
> Sent: 04 June 2021 14:55
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; w...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org; Alexandru Elisei
> ; Linuxarm 
> Subject: Re: [RFC PATCH 0/3] kvm/arm: New VMID allocator based on asid(2nd
> approach)
> 
> On Fri, 04 Jun 2021 09:13:10 +0100,
> Shameerali Kolothum Thodi 
> wrote:
> >
> > Hi,
> >
> > > -Original Message-
> > > From: Shameerali Kolothum Thodi
> > > Sent: 06 May 2021 17:52
> > > To: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> > > linux-ker...@vger.kernel.org
> > > Cc: m...@kernel.org; w...@kernel.org; catalin.mari...@arm.com;
> > > james.mo...@arm.com; julien.thierry.k...@gmail.com;
> > > suzuki.poul...@arm.com; jean-phili...@linaro.org; Linuxarm
> > > 
> > > Subject: [RFC PATCH 0/3] kvm/arm: New VMID allocator based on asid(2nd
> > > approach)
> > >
> > > This is based on a suggestion from Will [0] to try out the asid
> > > based kvm vmid solution as a separate VMID allocator instead of
> > > the shared lib approach attempted in v4[1].
> > >
> > > The idea is to compare both the approaches and see whether the
> > > shared lib solution with callbacks make sense or not.
> >
> > A gentle ping on this. Please take a look and let me know.
> 
> I had a look and I don't overly dislike it. I'd like to see the code
> without the pinned stuff though, at least to ease the reviewing. I
> haven't tested it in anger, but I have pushed the rebased code at [1]
> as it really didn't apply to 5.13-rc4.

Thanks for taking a look and the rebase. I will remove the pinned stuff
in the next revision as that was added just to compare against the previous
version.

> 
> One thing I'm a bit worried about is that we so far relied on VMID 0
> never being allocated to a guest, which is now crucial for protected
> KVM. I can't really convince myself that this can never happen with
> this.

Hmm..not sure I quite follow that. As per the current logic vmid 0 is
reserved and is not allocated to Guest. 

> Plus, I've found this nugget:
> 
>max_pinned_vmids = NUM_USER_VMIDS - num_possible_cpus() - 2;
> 
> 
> What is this "- 2"? My hunch is that it should really be "- 1" as VMID
> 0 is reserved, and we have no equivalent of KPTI for S2.

I think this is more related to the "pinned vmid" stuff and was borrowed from
the asid_update_limit() fn in arch/arm64/mm/context.c. But I missed the
comment that explains the reason behind it. It says,

---x---
/*
 * There must always be an ASID available after rollover. Ensure that,
 * even if all CPUs have a reserved ASID and the maximum number of ASIDs
 * are pinned, there still is at least one empty slot in the ASID map.
 */
max_pinned_asids = num_available_asids - num_possible_cpus() - 2;
---x---

So this is to make sure we will have at least one VMID available after rollover
in case we have pinned the max number of VMIDs. I will include that comment
to make it clear.

Thanks,
Shameer

> 
>   M.
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h
> =kvm-arm64/mmu/vmid
> 
> --
> Without deviation from the norm, progress is not possible.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [RFC PATCH 0/3] kvm/arm: New VMID allocator based on asid(2nd approach)

2021-06-04 Thread Shameerali Kolothum Thodi

Hi,

> -Original Message-
> From: Shameerali Kolothum Thodi
> Sent: 06 May 2021 17:52
> To: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org
> Cc: m...@kernel.org; w...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org; Linuxarm
> 
> Subject: [RFC PATCH 0/3] kvm/arm: New VMID allocator based on asid(2nd
> approach)
> 
> This is based on a suggestion from Will [0] to try out the asid
> based kvm vmid solution as a separate VMID allocator instead of
> the shared lib approach attempted in v4[1].
> 
> The idea is to compare both the approaches and see whether the
> shared lib solution with callbacks make sense or not.

A gentle ping on this. Please take a look and let me know.

Thanks,
Shameer

> Though we are not yet using the pinned vmids yet, patch #2 has
> code for pinned vmid support. This is just to help the comparison.
> 
> Test Setup/Results
> 
> The measurement was made with maxcpus set to 8 and with the
> number of VMID limited to 4-bit. The test involves running
> concurrently 40 guests with 2 vCPUs. Each guest will then
> execute hackbench 5 times before exiting.
> 
> The performance difference between the current algo and the
> new one are(avg. of 10 runs):
> - 1.9% less entry/exit from the guest
> - 0.5% faster
> This is more or less comparable to v4 numbers.
> 
> For the complete series, please see,
> https://github.com/hisilicon/kernel-dev/tree/private-v5.12-rc7-vmid-2nd-rfc
> 
> and for the shared asid lib v4 solution,
> https://github.com/hisilicon/kernel-dev/tree/private-v5.12-rc7-asid-v4
> 
> As you can see there are of course code duplication with this
> approach but may be this one is more easy to maintain considering
> the complexity involved.
> 
> Please take a look and let me know your feedback.
> 
> Thanks,
> Shameer
> 
> 
> [0] https://lore.kernel.org/lkml/20210422160846.GB2214@willie-the-truck/
> [1]
> https://lore.kernel.org/lkml/20210414112312.13704-1-shameerali.kolothum.t
> h...@huawei.com/
> 
> Julien Grall (2):
>   arch/arm64: Introduce a capability to tell whether 16-bit VMID is
> available
>   kvm/arm: Align the VMID allocation with the arm64 ASID one
> 
> Shameer Kolothum (1):
>   kvm/arm: Introduce a new vmid allocator for KVM
> 
>  arch/arm64/include/asm/cpucaps.h   |   3 +-
>  arch/arm64/include/asm/kvm_asm.h   |   4 +-
>  arch/arm64/include/asm/kvm_host.h  |  11 +-
>  arch/arm64/include/asm/kvm_mmu.h   |   7 +-
>  arch/arm64/kernel/cpufeature.c |   9 +
>  arch/arm64/kvm/Makefile|   2 +-
>  arch/arm64/kvm/arm.c   | 115 
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c |   6 +-
>  arch/arm64/kvm/hyp/nvhe/tlb.c  |  10 +-
>  arch/arm64/kvm/hyp/vhe/tlb.c   |  10 +-
>  arch/arm64/kvm/mmu.c   |   1 -
>  arch/arm64/kvm/vmid.c  | 285
> +
>  12 files changed, 354 insertions(+), 109 deletions(-)
>  create mode 100644 arch/arm64/kvm/vmid.c
> 
> --
> 2.17.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v4 00/16] kvm/arm: Align the VMID allocation with the arm64 ASID one

2021-04-23 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Will Deacon [mailto:w...@kernel.org]
> Sent: 22 April 2021 18:09
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org; jul...@xen.org; Linuxarm
> 
> Subject: Re: [PATCH v4 00/16] kvm/arm: Align the VMID allocation with the
> arm64 ASID one
> 
> On Wed, Apr 14, 2021 at 12:22:56PM +0100, Shameer Kolothum wrote:
> > Hi,
> >
> > This is an attempt to revive this series originally posted by
> > Julien Grall[1]. The main motive to work on this now is because
> > of the requirement to have Pinned KVM VMIDs and the RFC discussion
> > for the same basically suggested[2] to have a common/better vmid
> > allocator for KVM which this series provides.
> >
> > Major Changes from v3:
> >
> > -Changes related to Pinned ASID support.
> > -Changes to take care KPTI related bits reservation.
> > -Dropped support for 32 bit KVM.
> > -Rebase to 5.12-rc7
> >
> > Individual patches have change history for any major changes
> > from v3.
> >
> > Tests were performed on a HiSilicon D06 platform and so far not observed
> > any regressions.
> >
> > For ASID allocation,
> >
> > Avg of 10 runs(hackbench -s 512 -l 200 -g 300 -f 25 -P),
> > 5.12-rc7: Time:18.8119
> > 5.12-rc7+v4: Time: 18.459
> >
> > ~1.8% improvement.
> >
> > For KVM VMID,
> >
> > The measurement was made with maxcpus set to 8 and with the
> > number of VMID limited to 4-bit. The test involves running
> > concurrently 40 guests with 2 vCPUs. Each guest will then
> > execute hackbench 5 times before exiting.
> >
> > The performance difference between the current algo and the
> > new one are(ag. of 10 runs):
> > - 1.9% less exit from the guest
> > - 0.7% faster
> >
> > For complete series, please see,
> >  https://github.com/hisilicon/kernel-dev/tree/private-v5.12-rc7-asid-v4
> >
> > Please take a look and let me know your feedback.
> 
> Although I think aligning the two algorithms makes sense, I'm not completely
> sold on the need to abstract all this into a library and whether the
> additional indirection is justified.
> 
> It would be great to compare this approach with one where portions of the
> code are duplicated into a separate VMID allocator. Have you tried that to
> see what it looks like? Doesn't need to be a proper patch set, but comparing
> the end result might help to evaluate the proposal here.

Ok. I will give it a go and get back.

Thanks,
Shameer

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v15 00/12] SMMUv3 Nested Stage Setup (IOMMU part)

2021-04-14 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: wangxingang
> Sent: 14 April 2021 03:36
> To: Eric Auger ; eric.auger@gmail.com;
> jean-phili...@linaro.org; io...@lists.linux-foundation.org;
> linux-ker...@vger.kernel.org; k...@vger.kernel.org;
> kvmarm@lists.cs.columbia.edu; w...@kernel.org; m...@kernel.org;
> robin.mur...@arm.com; j...@8bytes.org; alex.william...@redhat.com;
> t...@semihalf.com; zhukeqian 
> Cc: jacob.jun@linux.intel.com; yi.l@intel.com; 
> zhangfei@linaro.org;
> zhangfei....@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum
> Thodi ; yuzenghui
> ; nicoleots...@gmail.com; lushenming
> ; vse...@nvidia.com; chenxiang (M)
> ; vdu...@nvidia.com; jiangkunkun
> 
> Subject: Re: [PATCH v15 00/12] SMMUv3 Nested Stage Setup (IOMMU part)
> 
> Hi Eric, Jean-Philippe
> 
> On 2021/4/11 19:12, Eric Auger wrote:
> > SMMUv3 Nested Stage Setup (IOMMU part)
> >
> > This series brings the IOMMU part of HW nested paging support
> > in the SMMUv3. The VFIO part is submitted separately.
> >
> > This is based on Jean-Philippe's
> > [PATCH v14 00/10] iommu: I/O page faults for SMMUv3
> > https://www.spinics.net/lists/arm-kernel/msg886518.html
> > (including the patches that were not pulled for 5.13)
> >
> > The IOMMU API is extended to support 2 new API functionalities:
> > 1) pass the guest stage 1 configuration
> > 2) pass stage 1 MSI bindings
> >
> > Then those capabilities gets implemented in the SMMUv3 driver.
> >
> > The virtualizer passes information through the VFIO user API
> > which cascades them to the iommu subsystem. This allows the guest
> > to own stage 1 tables and context descriptors (so-called PASID
> > table) while the host owns stage 2 tables and main configuration
> > structures (STE).
> >
> > Best Regards
> >
> > Eric
> >
> > This series can be found at:
> > v5.12-rc6-jean-iopf-14-2stage-v15
> > (including the VFIO part in its last version: v13)
> >
> 
> I am testing the performance of an accelerator with/without SVA/vSVA,
> and found there might be some potential performance loss risk for SVA/vSVA.
> 
> I use a Network and computing encryption device (SEC), and send 1MB
> request for 1 times.
> 
> I trigger mm fault before I send the request, so there should be no iopf.
> 
> Here's what I got:
> 
> physical scenario:
> performance:  SVA:9MB/s   NOSVA:9MB/s
> tlb_miss: SVA:302,651 NOSVA:1,223
> trans_table_walk_access:SVA:302,276   NOSVA:1,237
> 
> VM scenario:
> performance:  vSVA:9MB/s  NOvSVA:6MB/s  about 30~40% loss
> tlb_miss: vSVA:4,423,897  NOvSVA:1,907
> trans_table_walk_access:vSVA:61,928,430   NOvSVA:21,948
> 
> In physical scenario, there's almost no performance loss, but the
> tlb_miss and trans_table_walk_access of stage 1 for SVA is quite high,
> comparing to NOSVA.
> 
> In VM scenario, there's about 30~40% performance loss, this is because
> the two stage tlb_miss and trans_table_walk_access is even higher, and
> impact the performance.
> 
> I compare the procedure of building page table of SVA and NOSVA, and
> found that NOSVA uses 2MB mapping as far as possible, while SVA uses
> only 4KB.
> 
> I retest with huge page, and huge page could solve this problem, the
> performance of SVA/vSVA is almost the same as NOSVA.
> 
> I am wondering do you have any other solution for the performance loss
> of vSVA, or any other method to reduce the tlb_miss/trans_table_walk.

Hi Xingang,

Just curious, do you have DVM enabled on this board or does it use explicit
SMMU TLB invalidations?

Thanks,
Shameer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v14 13/13] iommu/smmuv3: Accept configs with more than one context descriptor

2021-04-01 Thread Shameerali Kolothum Thodi



> -Original Message-
> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: 01 April 2021 12:49
> To: yuzenghui 
> Cc: eric.auger@gmail.com; io...@lists.linux-foundation.org;
> linux-ker...@vger.kernel.org; k...@vger.kernel.org;
> kvmarm@lists.cs.columbia.edu; w...@kernel.org; m...@kernel.org;
> robin.mur...@arm.com; j...@8bytes.org; alex.william...@redhat.com;
> t...@semihalf.com; zhukeqian ;
> jacob.jun@linux.intel.com; yi.l@intel.com; wangxingang
> ; jiangkunkun ;
> jean-phili...@linaro.org; zhangfei@linaro.org; zhangfei@gmail.com;
> vivek.gau...@arm.com; Shameerali Kolothum Thodi
> ; nicoleots...@gmail.com;
> lushenming ; vse...@nvidia.com; Wanghaibin (D)
> 
> Subject: Re: [PATCH v14 13/13] iommu/smmuv3: Accept configs with more than
> one context descriptor
> 
> Hi Zenghui,
> 
> On 3/30/21 11:23 AM, Zenghui Yu wrote:
> > Hi Eric,
> >
> > On 2021/2/24 4:56, Eric Auger wrote:
> >> In preparation for vSVA, let's accept userspace provided configs
> >> with more than one CD. We check the max CD against the host iommu
> >> capability and also the format (linear versus 2 level).
> >>
> >> Signed-off-by: Eric Auger 
> >> Signed-off-by: Shameer Kolothum
> 
> >> ---
> >>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 13 -
> >>   1 file changed, 8 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> index 332d31c0680f..ab74a0289893 100644
> >> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> @@ -3038,14 +3038,17 @@ static int
> arm_smmu_attach_pasid_table(struct
> >> iommu_domain *domain,
> >>   if (smmu_domain->s1_cfg.set)
> >>   goto out;
> >>   -    /*
> >> - * we currently support a single CD so s1fmt and s1dss
> >> - * fields are also ignored
> >> - */
> >> -    if (cfg->pasid_bits)
> >> +    list_for_each_entry(master, _domain->devices,
> >> domain_head) {
> >> +    if (cfg->pasid_bits > master->ssid_bits)
> >> +    goto out;
> >> +    }
> >> +    if (cfg->vendor_data.smmuv3.s1fmt ==
> >> STRTAB_STE_0_S1FMT_64K_L2 &&
> >> +    !(smmu->features &
> ARM_SMMU_FEAT_2_LVL_CDTAB))
> >>   goto out;
> >>     smmu_domain->s1_cfg.cdcfg.cdtab_dma = cfg->base_ptr;
> >> +    smmu_domain->s1_cfg.s1cdmax = cfg->pasid_bits;
> >> +    smmu_domain->s1_cfg.s1fmt =
> cfg->vendor_data.smmuv3.s1fmt;
> >
> > And what about the SIDSS field?
> >
> I added this patch upon Shameer's request, to be more vSVA friendly.
> Hower this series does not really target multiple CD support. At the
> moment the driver only supports STRTAB_STE_1_S1DSS_SSID0 (0x2) I think.
> At this moment maybe I can only check the s1dss field is 0x2. Or simply
> removes this patch?
> 
> Thoughts?

Right. This was useful for vSVA tests. But yes, to properly support multiple CDs
we need to pass the S1DSS from Qemu. And that requires further changes.
So I think it's better to remove this patch and reject S1CDMAX != 0 cases.

Thanks,
Shameer
   
> 
> Eric

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH for-stable-5.10 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility

2021-03-25 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Marc Zyngier [mailto:m...@kernel.org]
> Sent: 25 March 2021 09:33
> To: Shameerali Kolothum Thodi 
> Cc: kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org;
> linux-arm-ker...@lists.infradead.org; sta...@vger.kernel.org;
> pbonz...@redhat.com; Linuxarm 
> Subject: Re: [PATCH for-stable-5.10 2/2] KVM: arm64: Workaround firmware
> wrongly advertising GICv2-on-v3 compatibility
> 
> On 2021-03-25 09:14, Shameer Kolothum wrote:
> > From: Marc Zyngier 
> >
> > commit 9739f6ef053f104a997165701c6e15582c4307ee upstream.
> >
> > It looks like we have broken firmware out there that wrongly
> > advertises a GICv2 compatibility interface, despite the CPUs not being
> > able to deal with it.
> >
> > To work around this, check that the CPU initialising KVM is actually
> > able to switch to MMIO instead of system registers, and use that as a
> > precondition to enable GICv2 compatibility in KVM.
> >
> > Note that the detection happens on a single CPU. If the firmware is
> > lying *and* that the CPUs are asymetric, all hope is lost anyway.
> >
> > Cc: sta...@vger.kernel.org #5.10
> > Reported-by: Shameerali Kolothum Thodi
> > 
> > Tested-by: Shameer Kolothum 
> > Signed-off-by: Marc Zyngier 
> > Message-Id: <20210305185254.3730990-8-...@kernel.org>
> > Signed-off-by: Paolo Bonzini 
> > Signed-off-by: Shameer Kolothum 
> 
> Please hold on on that.
> 
> This patch causes a regression, and needs a fix that is currently queued for 
> 5.12
> [1]. Once this hits upstream, please add the fix to the series and post it as 
> a
> whole.

Ok. Yes, I noted that. But was thinking if this goes through first and then we 
can have a 
stable tag for that one, we can manage it. Anyway, will wait now.

Thanks,
Shameer
 
> Thanks,
> 
>  M.
> 
> [1] https://lore.kernel.org/r/20210323162301.2049595-1-...@kernel.org
> --
> Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH] KVM: arm64: Fix CPU interface MMIO compatibility detection

2021-03-24 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Marc Zyngier [mailto:m...@kernel.org]
> Sent: 23 March 2021 16:23
> To: kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org;
> linux-arm-ker...@lists.infradead.org
> Cc: Ard Biesheuvel ; kernel-t...@android.com; Shameerali
> Kolothum Thodi 
> Subject: [PATCH] KVM: arm64: Fix CPU interface MMIO compatibility detection
> 
> In order to detect whether a GICv3 CPU interface is MMIO capable,
> we switch ICC_SRE_EL1.SRE to 0 and check whether it sticks.
> 
> However, this is only possible if *ALL* of the HCR_EL2 interrupt
> overrides are set, and the CPU is perfectly allowed to ignore
> the write to ICC_SRE_EL1 otherwise. This leads KVM to pretend
> that a whole bunch of ARMv8.0 CPUs aren't MMIO-capable, and
> breaks VMs that should work correctly otherwise.
> 
> Fix this by setting IMO/FMO/IMO before touching ICC_SRE_EL1,
> and clear them afterwards. This allows us to reliably detect
> the CPU interface capabilities.
> 

Tested on HiSilicon D06 platform where the original issue(firmware wrongly
advertising GICv2-on-v3 compatibility) was reported and all seems to be fine.

Though not sure whether this fix is relevant to this particular platform,
FWIW,

  Tested-by: Shameer Kolothum 

Thanks,
Shameer

> Cc: Shameerali Kolothum Thodi 
> Fixes: 9739f6ef053f ("KVM: arm64: Workaround firmware wrongly advertising
> GICv2-on-v3 compatibility")
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index ee3682b9873c..39f8f7f9227c 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -429,6 +429,13 @@ u64 __vgic_v3_get_gic_config(void)
>   if (has_vhe())
>   flags = local_daif_save();
> 
> + /*
> +  * Table 11-2 "Permitted ICC_SRE_ELx.SRE settings" indicates
> +  * that to be able to set ICC_SRE_EL1.SRE to 0, all the
> +  * interrupt overrides must be set. You've got to love this.
> +  */
> + sysreg_clear_set(hcr_el2, 0, HCR_AMO | HCR_FMO | HCR_IMO);
> + isb();
>   write_gicreg(0, ICC_SRE_EL1);
>   isb();
> 
> @@ -436,6 +443,8 @@ u64 __vgic_v3_get_gic_config(void)
> 
>   write_gicreg(sre, ICC_SRE_EL1);
>   isb();
> + sysreg_clear_set(hcr_el2, HCR_AMO | HCR_FMO | HCR_IMO, 0);
> + isb();
> 
>   if (has_vhe())
>   local_daif_restore(flags);
> --
> 2.30.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH 7/8] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility

2021-03-15 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Marc Zyngier [mailto:m...@kernel.org]
> Sent: 05 March 2021 18:53
> To: Paolo Bonzini 
> Cc: Alexandru Elisei ; Andre Przywara
> ; Andrew Scull ; Catalin
> Marinas ; Christoffer Dall
> ; Howard Zhang ; Jia
> He ; Mark Rutland ; Quentin
> Perret ; Shameerali Kolothum Thodi
> ; Suzuki K Poulose
> ; Will Deacon ; James Morse
> ; Julien Thierry ;
> kernel-t...@android.com; linux-arm-ker...@lists.infradead.org;
> kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org
> Subject: [PATCH 7/8] KVM: arm64: Workaround firmware wrongly advertising
> GICv2-on-v3 compatibility
> 
> It looks like we have broken firmware out there that wrongly advertises
> a GICv2 compatibility interface, despite the CPUs not being able to deal
> with it.
> 
> To work around this, check that the CPU initialising KVM is actually able
> to switch to MMIO instead of system registers, and use that as a
> precondition to enable GICv2 compatibility in KVM.
> 
> Note that the detection happens on a single CPU. If the firmware is
> lying *and* that the CPUs are asymetric, all hope is lost anyway.
> 
> Reported-by: Shameerali Kolothum Thodi
> 
> Tested-by: Shameer Kolothum 
> Signed-off-by: Marc Zyngier 

Is it possible to add stable tag for this? Looks like we do have systems out 
there
and reports issues.

Thanks,
Shameer

> ---
>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 35 +++--
>  arch/arm64/kvm/vgic/vgic-v3.c   |  8 ++--
>  2 files changed, 39 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index 005daa0c9dd7..ee3682b9873c 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -408,11 +408,42 @@ void __vgic_v3_init_lrs(void)
>  /*
>   * Return the GIC CPU configuration:
>   * - [31:0]  ICH_VTR_EL2
> - * - [63:32] RES0
> + * - [62:32] RES0
> + * - [63]MMIO (GICv2) capable
>   */
>  u64 __vgic_v3_get_gic_config(void)
>  {
> - return read_gicreg(ICH_VTR_EL2);
> + u64 val, sre = read_gicreg(ICC_SRE_EL1);
> + unsigned long flags = 0;
> +
> + /*
> +  * To check whether we have a MMIO-based (GICv2 compatible)
> +  * CPU interface, we need to disable the system register
> +  * view. To do that safely, we have to prevent any interrupt
> +  * from firing (which would be deadly).
> +  *
> +  * Note that this only makes sense on VHE, as interrupts are
> +  * already masked for nVHE as part of the exception entry to
> +  * EL2.
> +  */
> + if (has_vhe())
> + flags = local_daif_save();
> +
> + write_gicreg(0, ICC_SRE_EL1);
> + isb();
> +
> + val = read_gicreg(ICC_SRE_EL1);
> +
> + write_gicreg(sre, ICC_SRE_EL1);
> + isb();
> +
> + if (has_vhe())
> + local_daif_restore(flags);
> +
> + val  = (val & ICC_SRE_EL1_SRE) ? 0 : (1ULL << 63);
> + val |= read_gicreg(ICH_VTR_EL2);
> +
> + return val;
>  }
> 
>  u64 __vgic_v3_read_vmcr(void)
> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> index c3e6c3fd333b..6f530925a231 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> @@ -575,8 +575,10 @@ early_param("kvm-arm.vgic_v4_enable",
> early_gicv4_enable);
>  int vgic_v3_probe(const struct gic_kvm_info *info)
>  {
>   u64 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_gic_config);
> + bool has_v2;
>   int ret;
> 
> + has_v2 = ich_vtr_el2 >> 63;
>   ich_vtr_el2 = (u32)ich_vtr_el2;
> 
>   /*
> @@ -596,13 +598,15 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
>gicv4_enable ? "en" : "dis");
>   }
> 
> + kvm_vgic_global_state.vcpu_base = 0;
> +
>   if (!info->vcpu.start) {
>   kvm_info("GICv3: no GICV resource entry\n");
> - kvm_vgic_global_state.vcpu_base = 0;
> + } else if (!has_v2) {
> + pr_warn(FW_BUG "CPU interface incapable of MMIO access\n");
>   } else if (!PAGE_ALIGNED(info->vcpu.start)) {
>   pr_warn("GICV physical address 0x%llx not page aligned\n",
>   (unsigned long long)info->vcpu.start);
> - kvm_vgic_global_state.vcpu_base = 0;
>   } else {
>   kvm_vgic_global_state.vcpu_base = info->vcpu.start;
>   kvm_vgic_global_state.can_emulate_gicv2 = true;
> --
> 2.29.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [RFC PATCH 3/5] KVM: ARM64: Add support for pinned VMIDs

2021-03-09 Thread Shameerali Kolothum Thodi

Hi Marc,

> -Original Message-
> From: Marc Zyngier [mailto:m...@kernel.org]
> Sent: 09 March 2021 10:33
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; io...@lists.linux-foundation.org;
> kvmarm@lists.cs.columbia.edu; alex.william...@redhat.com;
> jean-phili...@linaro.org; eric.au...@redhat.com; zhangfei@linaro.org;
> Jonathan Cameron ; Zengtao (B)
> ; linux...@openeuler.org; Will Deacon
> 
> Subject: Re: [RFC PATCH 3/5] KVM: ARM64: Add support for pinned VMIDs
> 
> Hi Shameer,
> 
> [+Will]
> 
> On Mon, 22 Feb 2021 15:53:36 +,
> Shameer Kolothum  wrote:
> >
> > On an ARM64 system with a SMMUv3 implementation that fully supports
> > Broadcast TLB Maintenance(BTM) feature, the CPU TLB invalidate
> > instructions are received by SMMU. This is very useful when the
> > SMMU shares the page tables with the CPU(eg: Guest SVA use case).
> > For this to work, the SMMU must use the same VMID that is allocated
> > by KVM to configure the stage 2 translations.
> >
> > At present KVM VMID allocations are recycled on rollover and may
> > change as a result. This will create issues if we have to share
> > the KVM VMID with SMMU. Hence, we spilt the KVM VMID space into
> > two, the first half follows the normal recycle on rollover policy
> > while the second half of the VMID pace is used to allocate pinned
> > VMIDs. This feature is enabled based on a command line option
> > "kvm-arm.pinned_vmid_enable".
> 
> I think this is the wrong approach. Instead of shoving the notion of
> pinned VMID into the current allocator, which really isn't designed
> for this, it'd be a lot better if we aligned the KVM VMID allocator
> with the ASID allocator, which already has support for pinning and is
> in general much more efficient.

Ok. Agree that this is not efficient, but was easy to prototype something :)

> Julien Grall worked on such a series[1] a long while ago, which got
> stalled because of the 32bit KVM port. Since we don't have this burden
> anymore, I'd rather you look in that direction instead of wasting half
> of the VMID space on potentially pinned VMIDs.

Sure. I will check that and work on it.

Thanks,
Shameer

> Thanks,
> 
>   M.
> 
> [1]
> https://patchwork.kernel.org/project/linux-arm-kernel/cover/20190724162534
> .7390-1-julien.gr...@arm.com/
> 
> 
> --
> Without deviation from the norm, progress is not possible.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [RFC PATCH 4/5] iommu/arm-smmu-v3: Use pinned VMID for NESTED stage with BTM

2021-03-05 Thread Shameerali Kolothum Thodi

Hi Jean,

> -Original Message-
> From: Jean-Philippe Brucker [mailto:jean-phili...@linaro.org]
> Sent: 04 March 2021 17:11
> To: Shameerali Kolothum Thodi 
> Cc: linux-arm-ker...@lists.infradead.org; io...@lists.linux-foundation.org;
> kvmarm@lists.cs.columbia.edu; m...@kernel.org;
> alex.william...@redhat.com; eric.au...@redhat.com;
> zhangfei@linaro.org; Jonathan Cameron
> ; Zengtao (B) ;
> linux...@openeuler.org
> Subject: Re: [RFC PATCH 4/5] iommu/arm-smmu-v3: Use pinned VMID for
> NESTED stage with BTM
> 
> Hi Shameer,
> 
> On Mon, Feb 22, 2021 at 03:53:37PM +, Shameer Kolothum wrote:
> > If the SMMU supports BTM and the device belongs to NESTED domain
> > with shared pasid table, we need to use the VMID allocated by the
> > KVM for the s2 configuration. Hence, request a pinned VMID from KVM.
> >
> > Signed-off-by: Shameer Kolothum 
> > ---
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 49
> -
> >  1 file changed, 47 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 26bf7da1bcd0..04f83f7c8319 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -28,6 +28,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #include 
> >
> > @@ -2195,6 +2196,33 @@ static void arm_smmu_bitmap_free(unsigned
> long *map, int idx)
> > clear_bit(idx, map);
> >  }
> >
> > +static int arm_smmu_pinned_vmid_get(struct arm_smmu_domain
> *smmu_domain)
> > +{
> > +   struct arm_smmu_master *master;
> > +
> > +   master = list_first_entry_or_null(_domain->devices,
> > + struct arm_smmu_master, domain_head);
> 
> This probably needs to hold devices_lock while using master.

Ok.

> 
> > +   if (!master)
> > +   return -EINVAL;
> > +
> > +   return kvm_pinned_vmid_get(master->dev);
> > +}
> > +
> > +static int arm_smmu_pinned_vmid_put(struct arm_smmu_domain
> *smmu_domain)
> > +{
> > +   struct arm_smmu_master *master;
> > +
> > +   master = list_first_entry_or_null(_domain->devices,
> > + struct arm_smmu_master, domain_head);
> > +   if (!master)
> > +   return -EINVAL;
> > +
> > +   if (smmu_domain->s2_cfg.vmid)
> > +   return kvm_pinned_vmid_put(master->dev);
> > +
> > +   return 0;
> > +}
> > +
> >  static void arm_smmu_domain_free(struct iommu_domain *domain)
> >  {
> > struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > @@ -2215,8 +2243,11 @@ static void arm_smmu_domain_free(struct
> iommu_domain *domain)
> > mutex_unlock(_smmu_asid_lock);
> > }
> > if (s2_cfg->set) {
> > -   if (s2_cfg->vmid)
> > -   arm_smmu_bitmap_free(smmu->vmid_map, s2_cfg->vmid);
> > +   if (s2_cfg->vmid) {
> > +   if (!(smmu->features & ARM_SMMU_FEAT_BTM) &&
> > +   smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
> > +   arm_smmu_bitmap_free(smmu->vmid_map,
> s2_cfg->vmid);
> > +   }
> > }
> >
> > kfree(smmu_domain);
> > @@ -3199,6 +3230,17 @@ static int arm_smmu_attach_pasid_table(struct
> iommu_domain *domain,
> > !(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB))
> > goto out;
> >
> > +   if (smmu->features & ARM_SMMU_FEAT_BTM) {
> > +   ret = arm_smmu_pinned_vmid_get(smmu_domain);
> > +   if (ret < 0)
> > +   goto out;
> > +
> > +   if (smmu_domain->s2_cfg.vmid)
> > +   arm_smmu_bitmap_free(smmu->vmid_map,
> smmu_domain->s2_cfg.vmid);
> > +
> > +   smmu_domain->s2_cfg.vmid = (u16)ret;
> 
> That will require a TLB invalidation on the old VMID, once the STE is
> rewritten.

True. Will add that.

> More generally I think this pinned VMID set conflicts with that of
> stage-2-only domains (which is the default state until a guest attaches a
> PASID table). Say you have one guest using DOMAIN_NESTED without PASID
> table, just DMA to IPA using VMID 0x8000. Now another guest attaches a
> PASID table and obtains the same VMID from KVM. The stage-2 transla

RE: [PATCH v13 00/15] SMMUv3 Nested Stage Setup (IOMMU part)

2021-02-22 Thread Shameerali Kolothum Thodi



> -Original Message-
> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: 21 February 2021 18:21
> To: Shameerali Kolothum Thodi ;
> eric.auger@gmail.com; io...@lists.linux-foundation.org;
> linux-ker...@vger.kernel.org; k...@vger.kernel.org;
> kvmarm@lists.cs.columbia.edu; w...@kernel.org; j...@8bytes.org;
> m...@kernel.org; robin.mur...@arm.com; alex.william...@redhat.com
> Cc: jean-phili...@linaro.org; zhangfei@linaro.org;
> zhangfei@gmail.com; vivek.gau...@arm.com;
> jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com;
> nicoleots...@gmail.com; yuzenghui ; Zengtao (B)
> ; linux...@openeuler.org
> Subject: Re: [PATCH v13 00/15] SMMUv3 Nested Stage Setup (IOMMU part)
> 
> Hi Shameer,
> On 1/8/21 6:05 PM, Shameerali Kolothum Thodi wrote:
> > Hi Eric,
> >
> >> -Original Message-
> >> From: Eric Auger [mailto:eric.au...@redhat.com]
> >> Sent: 18 November 2020 11:22
> >> To: eric.auger@gmail.com; eric.au...@redhat.com;
> >> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> >> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> >> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com;
> >> alex.william...@redhat.com
> >> Cc: jean-phili...@linaro.org; zhangfei@linaro.org;
> >> zhangfei@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum
> >> Thodi ;
> >> jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com;
> >> nicoleots...@gmail.com; yuzenghui 
> >> Subject: [PATCH v13 00/15] SMMUv3 Nested Stage Setup (IOMMU part)
> >>
> >> This series brings the IOMMU part of HW nested paging support
> >> in the SMMUv3. The VFIO part is submitted separately.
> >>
> >> The IOMMU API is extended to support 2 new API functionalities:
> >> 1) pass the guest stage 1 configuration
> >> 2) pass stage 1 MSI bindings
> >>
> >> Then those capabilities gets implemented in the SMMUv3 driver.
> >>
> >> The virtualizer passes information through the VFIO user API
> >> which cascades them to the iommu subsystem. This allows the guest
> >> to own stage 1 tables and context descriptors (so-called PASID
> >> table) while the host owns stage 2 tables and main configuration
> >> structures (STE).
> >
> > I am seeing an issue with Guest testpmd run with this series.
> > I have two different setups and testpmd works fine with the
> > first one but not with the second.
> >
> > 1). Guest doesn't have kernel driver built-in for pass-through dev.
> >
> > root@ubuntu:/# lspci -v
> > ...
> > 00:02.0 Ethernet controller: Huawei Technologies Co., Ltd. Device a22e (rev
> 21)
> > Subsystem: Huawei Technologies Co., Ltd. Device 
> > Flags: fast devsel
> > Memory at 800010 (64-bit, prefetchable) [disabled] [size=64K]
> > Memory at 80 (64-bit, prefetchable) [disabled] [size=1M]
> > Capabilities: [40] Express Root Complex Integrated Endpoint, MSI 00
> > Capabilities: [a0] MSI-X: Enable- Count=67 Masked-
> > Capabilities: [b0] Power Management version 3
> > Capabilities: [100] Access Control Services
> > Capabilities: [300] Transaction Processing Hints
> >
> > root@ubuntu:/# echo vfio-pci >
> /sys/bus/pci/devices/:00:02.0/driver_override
> > root@ubuntu:/# echo :00:02.0 > /sys/bus/pci/drivers_probe
> >
> > root@ubuntu:/mnt/dpdk/build/app# ./testpmd -w :00:02.0 --file-prefix
> socket0  -l 0-1 -n 2 -- -i
> > EAL: Detected 8 lcore(s)
> > EAL: Detected 1 NUMA nodes
> > EAL: Multi-process socket /var/run/dpdk/socket0/mp_socket
> > EAL: Selected IOVA mode 'VA'
> > EAL: No available hugepages reported in hugepages-32768kB
> > EAL: No available hugepages reported in hugepages-64kB
> > EAL: No available hugepages reported in hugepages-1048576kB
> > EAL: Probing VFIO support...
> > EAL: VFIO support initialized
> > EAL:   Invalid NUMA socket, default to 0
> > EAL:   using IOMMU type 1 (Type 1)
> > EAL: Probe PCI driver: net_hns3_vf (19e5:a22e) device: :00:02.0 (socket
> 0)
> > EAL: No legacy callbacks, legacy socket not created
> > Interactive-mode selected
> > testpmd: create a new mbuf pool : n=155456,
> size=2176, socket=0
> > testpmd: preferred mempool ops selected: ring_mp_mc
> >
> > Warning! port-topology=paired and odd forward ports number, the last port
> will pair with itself.
> >
> > Configuring Port 0 (socket 0)
> > Port 0: 8E:A6:8C:43:43:45
> > Checking link statuses...
> > Done
> > te

RE: [PATCH v11 12/13] vfio/pci: Register a DMA fault response region

2021-02-18 Thread Shameerali Kolothum Thodi

Hi Eric,

> > -Original Message-
> > From: Eric Auger [mailto:eric.au...@redhat.com]
> > Sent: 16 November 2020 11:00
> > To: eric.auger@gmail.com; eric.au...@redhat.com;
> > io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> > k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> > j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com;
> > alex.william...@redhat.com
> > Cc: jean-phili...@linaro.org; zhangfei@linaro.org;
> > zhangfei....@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum
> > Thodi ;
> > jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com;
> > nicoleots...@gmail.com; yuzenghui 
> > Subject: [PATCH v11 12/13] vfio/pci: Register a DMA fault response
> > region
> >
> > In preparation for vSVA, let's register a DMA fault response region,
> > where the userspace will push the page responses and increment the
> > head of the buffer. The kernel will pop those responses and inject
> > them on iommu side.
> >
> > Signed-off-by: Eric Auger 
> > ---
> >  drivers/vfio/pci/vfio_pci.c | 114 +---
> >  drivers/vfio/pci/vfio_pci_private.h |   5 ++
> >  drivers/vfio/pci/vfio_pci_rdwr.c|  39 ++
> >  include/uapi/linux/vfio.h   |  32 
> >  4 files changed, 181 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> > index 65a83fd0e8c0..e9a904ce3f0d 100644
> > --- a/drivers/vfio/pci/vfio_pci.c
> > +++ b/drivers/vfio/pci/vfio_pci.c
> > @@ -318,9 +318,20 @@ static void vfio_pci_dma_fault_release(struct
> > vfio_pci_device *vdev,
> > kfree(vdev->fault_pages);
> >  }
> >
> > -static int vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
> > -  struct vfio_pci_region *region,
> > -  struct vm_area_struct *vma)
> > +static void
> > +vfio_pci_dma_fault_response_release(struct vfio_pci_device *vdev,
> > +   struct vfio_pci_region *region) {
> > +   if (vdev->dma_fault_response_wq)
> > +   destroy_workqueue(vdev->dma_fault_response_wq);
> > +   kfree(vdev->fault_response_pages);
> > +   vdev->fault_response_pages = NULL;
> > +}
> > +
> > +static int __vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
> > +struct vfio_pci_region *region,
> > +struct vm_area_struct *vma,
> > +u8 *pages)
> >  {
> > u64 phys_len, req_len, pgoff, req_start;
> > unsigned long long addr;
> > @@ -333,14 +344,14 @@ static int vfio_pci_dma_fault_mmap(struct
> > vfio_pci_device *vdev,
> > ((1U << (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
> > req_start = pgoff << PAGE_SHIFT;
> >
> > -   /* only the second page of the producer fault region is mmappable */
> > +   /* only the second page of the fault region is mmappable */
> > if (req_start < PAGE_SIZE)
> > return -EINVAL;
> >
> > if (req_start + req_len > phys_len)
> > return -EINVAL;
> >
> > -   addr = virt_to_phys(vdev->fault_pages);
> > +   addr = virt_to_phys(pages);
> > vma->vm_private_data = vdev;
> > vma->vm_pgoff = (addr >> PAGE_SHIFT) + pgoff;
> >
> > @@ -349,13 +360,29 @@ static int vfio_pci_dma_fault_mmap(struct
> > vfio_pci_device *vdev,
> > return ret;
> >  }
> >
> > -static int vfio_pci_dma_fault_add_capability(struct vfio_pci_device *vdev,
> > -struct vfio_pci_region *region,
> > -struct vfio_info_cap *caps)
> > +static int vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
> > +  struct vfio_pci_region *region,
> > +  struct vm_area_struct *vma)
> > +{
> > +   return __vfio_pci_dma_fault_mmap(vdev, region, vma,
> > vdev->fault_pages);
> > +}
> > +
> > +static int
> > +vfio_pci_dma_fault_response_mmap(struct vfio_pci_device *vdev,
> > +   struct vfio_pci_region *region,
> > +   struct vm_area_struct *vma)
> > +{
> > +   return __vfio_pci_dma_fault_mmap(vdev, region, vma,
> > vdev->fault_response_pages);
> > +}
> > +
> > +static int __vfio_pci_dma_fault_add_capability(struct vfio_pci_device 
> > *vdev

RE: [PATCH v2 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility

2021-01-19 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Marc Zyngier [mailto:m...@kernel.org]
> Sent: 15 January 2021 14:15
> To: Ard Biesheuvel 
> Cc: Linux ARM ; kvmarm
> ; Shameerali Kolothum Thodi
> ; James Morse
> ; Julien Thierry ;
> Suzuki K Poulose ; Android Kernel Team
> 
> Subject: Re: [PATCH v2 2/2] KVM: arm64: Workaround firmware wrongly
> advertising GICv2-on-v3 compatibility
> 
> On 2021-01-15 14:08, Ard Biesheuvel wrote:
> > On Fri, 15 Jan 2021 at 15:03, Marc Zyngier  wrote:
> 
> [...]
> 
> >> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c
> >> b/arch/arm64/kvm/vgic/vgic-v3.c index 8e7bf3151057..67b27b47312b
> >> 100644
> >> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> >> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> >> @@ -584,8 +584,10 @@ early_param("kvm-arm.vgic_v4_enable",
> >> early_gicv4_enable);
> >>  int vgic_v3_probe(const struct gic_kvm_info *info)  {
> >> u64 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_gic_config);
> >> +   bool has_v2;
> >> int ret;
> >>
> >> +   has_v2 = ich_vtr_el2 >> 63;
> >> ich_vtr_el2 = (u32)ich_vtr_el2;
> >>
> >> /*
> >> @@ -605,13 +607,15 @@ int vgic_v3_probe(const struct gic_kvm_info
> >> *info)
> >>  gicv4_enable ? "en" : "dis");
> >> }
> >>
> >> +   kvm_vgic_global_state.vcpu_base = 0;
> >> +
> >> if (!info->vcpu.start) {
> >> kvm_info("GICv3: no GICV resource entry\n");
> >> -   kvm_vgic_global_state.vcpu_base = 0;
> >> +   } else if (!has_v2) {
> >> +   pr_warn("CPU interface incapable of MMIO access\n");
> >
> > Could we include FW_BUG here to stress that this is a firmware problem?
> 
> Absolutely! That's what it now looks like:
> 
> [2.648452] kvm [1]: IPA Size Limit: 40 bits
> [2.649259] [Firmware Bug]: CPU interface incapable of MMIO access
> [2.649620] kvm [1]: disabling GICv2 emulation
> [2.650227] kvm [1]: GIC system register CPU interface enabled
> [2.652004] kvm [1]: vgic interrupt IRQ9
> [2.655623] kvm [1]: VHE mode initialized successfully
> 
> Updated version pushed out.

Is there a v3 for this series? I couldn't find one.

Anyways, tested this series on a D06 with faulty firmware and it is working as 
expected.
FWIW,

   Tested-by: Shameer Kolothum 

Thanks,
Shameer

> Thanks,
> 
>  M.
> --
> Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v13 07/15] iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs

2021-01-14 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: 14 January 2021 16:58
> To: Shameerali Kolothum Thodi ;
> Jean-Philippe Brucker 
> Cc: Xieyingtai ; alex.william...@redhat.com;
> wangxingang ; k...@vger.kernel.org;
> m...@kernel.org; linux-ker...@vger.kernel.org; vivek.gau...@arm.com;
> io...@lists.linux-foundation.org; qubingbing ;
> Zengtao (B) ; zhangfei@linaro.org;
> eric.auger@gmail.com; w...@kernel.org; kvmarm@lists.cs.columbia.edu;
> robin.mur...@arm.com
> Subject: Re: [PATCH v13 07/15] iommu/smmuv3: Allow stage 1 invalidation with
> unmanaged ASIDs
> 
> Hi Shameer, Jean-Philippe,
> 
> On 12/4/20 11:23 AM, Auger Eric wrote:
> > Hi Shameer, Jean-Philippe,
> >
> > On 12/4/20 11:20 AM, Shameerali Kolothum Thodi wrote:
> >> Hi Jean,
> >>
> >>> -Original Message-
> >>> From: Jean-Philippe Brucker [mailto:jean-phili...@linaro.org]
> >>> Sent: 04 December 2020 09:54
> >>> To: Shameerali Kolothum Thodi
> 
> >>> Cc: Auger Eric ; wangxingang
> >>> ; Xieyingtai ;
> >>> k...@vger.kernel.org; m...@kernel.org; j...@8bytes.org;
> w...@kernel.org;
> >>> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> >>> vivek.gau...@arm.com; alex.william...@redhat.com;
> >>> zhangfei@linaro.org; robin.mur...@arm.com;
> >>> kvmarm@lists.cs.columbia.edu; eric.auger@gmail.com; Zengtao (B)
> >>> ; qubingbing 
> >>> Subject: Re: [PATCH v13 07/15] iommu/smmuv3: Allow stage 1 invalidation
> with
> >>> unmanaged ASIDs
> >>>
> >>> Hi Shameer,
> >>>
> >>> On Thu, Dec 03, 2020 at 06:42:57PM +, Shameerali Kolothum Thodi
> wrote:
> >>>> Hi Jean/zhangfei,
> >>>> Is it possible to have a branch with minimum required SVA/UACCE related
> >>> patches
> >>>> that are already public and can be a "stable" candidate for future respin
> of
> >>> Eric's series?
> >>>> Please share your thoughts.
> >>>
> >>> By "stable" you mean a fixed branch with the latest SVA/UACCE patches
> >>> based on mainline?
> >>
> >> Yes.
> >>
> >>  The uacce-devel branches from
> >>> https://github.com/Linaro/linux-kernel-uadk do provide this at the moment
> >>> (they track the latest sva/zip-devel branch
> >>> https://jpbrucker.net/git/linux/ which is roughly based on mainline.)
> As I plan to respin shortly, please could you confirm the best branch to
> rebase on still is that one (uacce-devel from the linux-kernel-uadk git
> repo). Is it up to date? Commits seem to be quite old there.

I think it is the uacce-devel-5.11 branch, but will wait for Jean or Zhangfei
to confirm.

Thanks,
Shameer

> Thanks
> 
> Eric
> >>
> >> Thanks.
> >>
> >> Hi Eric,
> >>
> >> Could you please take a look at the above branches and see whether it make
> sense
> >> to rebase on top of either of those?
> >>
> >> From vSVA point of view, it will be less rebase hassle if we can do that.
> >
> > Sure. I will rebase on top of this ;-)
> >
> > Thanks
> >
> > Eric
> >>
> >> Thanks,
> >> Shameer
> >>
> >>> Thanks,
> >>> Jean
> >>
> >
> > ___
> > iommu mailing list
> > io...@lists.linux-foundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/iommu
> >

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility

2021-01-11 Thread Shameerali Kolothum Thodi

Hi Marc,

> -Original Message-
> From: Marc Zyngier [mailto:m...@kernel.org]
> Sent: 08 January 2021 17:12
> To: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu
> Cc: Shameerali Kolothum Thodi ;
> James Morse ; Julien Thierry
> ; Suzuki K Poulose
> ; Ard Biesheuvel ;
> kernel-t...@android.com
> Subject: [PATCH 2/2] KVM: arm64: Workaround firmware wrongly advertising
> GICv2-on-v3 compatibility
> 
> It looks like we have broken firmware out there that wrongly advertises
> a GICv2 compatibility interface, despite the CPUs not being able to deal
> with it.
> 
> To work around this, check that the CPU initialising KVM is actually able
> to switch to MMIO instead of system registers, and use that as a
> precondition to enable GICv2 compatibility in KVM.
> 
> Note that the detection happens on a single CPU. If the firmware is
> lying *and* that the CPUs are asymetric, all hope is lost anyway.
> 
> Reported-by: Shameerali Kolothum Thodi
> 
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/kvm/hyp/vgic-v3-sr.c | 34 +++--
>  arch/arm64/kvm/vgic/vgic-v3.c   |  8 ++--
>  2 files changed, 38 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index 005daa0c9dd7..d504499ab917 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -408,11 +408,41 @@ void __vgic_v3_init_lrs(void)
>  /*
>   * Return the GIC CPU configuration:
>   * - [31:0]  ICH_VTR_EL2
> - * - [63:32] RES0
> + * - [62:32] RES0
> + * - [63]MMIO (GICv2) capable
>   */
>  u64 __vgic_v3_get_gic_config(void)
>  {
> - return read_gicreg(ICH_VTR_EL2);
> + u64 sre = read_gicreg(ICC_SRE_EL1);
> + unsigned long flags = 0;
> + bool v2_capable;
> +
> + /*
> +  * To check whether we have a MMIO-based (GICv2 compatible)
> +  * CPU interface, we need to disable the system register
> +  * view. To do that safely, we have to prevent any interrupt
> +  * from firing (which would be deadly).
> +  *
> +  * Note that this only makes sense on VHE, as interrupts are
> +  * already masked for nVHE as part of the exception entry to
> +  * EL2.
> +  */
> + if (has_vhe())
> + flags = local_daif_save();
> +
> + write_gicreg(0, ICC_SRE_EL1);
> + isb();
> +
> + v2_capable = !(read_gicreg(ICC_SRE_EL1) & ICC_SRE_EL1_SRE);
> +
> + write_gicreg(sre, ICC_SRE_EL1);
> + isb();
> +
> + if (has_vhe())
> + local_daif_restore(flags);
> +
> + return (read_gicreg(ICH_VTR_EL2) |
> + v2_capable ? (1ULL << 63) : 0);
>  }

Thanks for sending this out. I had a go with this series and unfortunately
it didn't work on a system with faulty BIOS. It looks like the culprit here is
the ?: operator. There seems to be an operator precedence at play here
and it returns,
  vgic_v3_probe: ich_vtr_el2 0x8000

And with the below change,

return (read_gicreg(ICH_VTR_EL2) |
-   v2_capable ? (1ULL << 63) : 0);
+   (v2_capable ? (1ULL << 63) : 0));
 }

It returns,
  vgic_v3_probe: ich_vtr_el2 0x90080003

and works correctly.
[   18.918738] kvm [1]: disabling GICv2 emulation

Thanks,
Shameer

>  u64 __vgic_v3_read_vmcr(void)
> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> index 8e7bf3151057..67b27b47312b 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> @@ -584,8 +584,10 @@ early_param("kvm-arm.vgic_v4_enable",
> early_gicv4_enable);
>  int vgic_v3_probe(const struct gic_kvm_info *info)
>  {
>   u64 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_gic_config);
> + bool has_v2;
>   int ret;
> 
> + has_v2 = ich_vtr_el2 >> 63;
>   ich_vtr_el2 = (u32)ich_vtr_el2;
> 
>   /*
> @@ -605,13 +607,15 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
>gicv4_enable ? "en" : "dis");
>   }
> 
> + kvm_vgic_global_state.vcpu_base = 0;
> +
>   if (!info->vcpu.start) {
>   kvm_info("GICv3: no GICV resource entry\n");
> - kvm_vgic_global_state.vcpu_base = 0;
> + } else if (!has_v2) {
> + pr_warn("CPU interface incapable of MMIO access\n");
>   } else if (!PAGE_ALIGNED(info->vcpu.start)) {
>   pr_warn("GICV physical address 0x%llx not page aligned\n",
>   (unsigned long long)info->vcpu.start);
> - kvm_vgic_global_state.vcpu_base = 0;
>   } else {
>   kvm_vgic_global_state.vcpu_base = info->vcpu.start;
>   kvm_vgic_global_state.can_emulate_gicv2 = true;
> --
> 2.29.2

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v13 00/15] SMMUv3 Nested Stage Setup (IOMMU part)

2021-01-08 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Eric Auger [mailto:eric.au...@redhat.com]
> Sent: 18 November 2020 11:22
> To: eric.auger@gmail.com; eric.au...@redhat.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com;
> alex.william...@redhat.com
> Cc: jean-phili...@linaro.org; zhangfei@linaro.org;
> zhangfei@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum
> Thodi ;
> jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com;
> nicoleots...@gmail.com; yuzenghui 
> Subject: [PATCH v13 00/15] SMMUv3 Nested Stage Setup (IOMMU part)
> 
> This series brings the IOMMU part of HW nested paging support
> in the SMMUv3. The VFIO part is submitted separately.
> 
> The IOMMU API is extended to support 2 new API functionalities:
> 1) pass the guest stage 1 configuration
> 2) pass stage 1 MSI bindings
> 
> Then those capabilities gets implemented in the SMMUv3 driver.
> 
> The virtualizer passes information through the VFIO user API
> which cascades them to the iommu subsystem. This allows the guest
> to own stage 1 tables and context descriptors (so-called PASID
> table) while the host owns stage 2 tables and main configuration
> structures (STE).

I am seeing an issue with Guest testpmd run with this series.
I have two different setups and testpmd works fine with the
first one but not with the second.

1). Guest doesn't have kernel driver built-in for pass-through dev.

root@ubuntu:/# lspci -v
...
00:02.0 Ethernet controller: Huawei Technologies Co., Ltd. Device a22e (rev 21)
Subsystem: Huawei Technologies Co., Ltd. Device 
Flags: fast devsel
Memory at 800010 (64-bit, prefetchable) [disabled] [size=64K]
Memory at 80 (64-bit, prefetchable) [disabled] [size=1M]
Capabilities: [40] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [a0] MSI-X: Enable- Count=67 Masked-
Capabilities: [b0] Power Management version 3
Capabilities: [100] Access Control Services
Capabilities: [300] Transaction Processing Hints

root@ubuntu:/# echo vfio-pci > /sys/bus/pci/devices/:00:02.0/driver_override
root@ubuntu:/# echo :00:02.0 > /sys/bus/pci/drivers_probe

root@ubuntu:/mnt/dpdk/build/app# ./testpmd -w :00:02.0 --file-prefix 
socket0  -l 0-1 -n 2 -- -i
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/socket0/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: No available hugepages reported in hugepages-32768kB
EAL: No available hugepages reported in hugepages-64kB
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL:   Invalid NUMA socket, default to 0
EAL:   using IOMMU type 1 (Type 1)
EAL: Probe PCI driver: net_hns3_vf (19e5:a22e) device: :00:02.0 (socket 0)
EAL: No legacy callbacks, legacy socket not created
Interactive-mode selected
testpmd: create a new mbuf pool : n=155456, size=2176, 
socket=0
testpmd: preferred mempool ops selected: ring_mp_mc

Warning! port-topology=paired and odd forward ports number, the last port will 
pair with itself.

Configuring Port 0 (socket 0)
Port 0: 8E:A6:8C:43:43:45
Checking link statuses...
Done
testpmd>

2). Guest have kernel driver built-in for pass-through dev.

root@ubuntu:/# lspci -v
...
00:02.0 Ethernet controller: Huawei Technologies Co., Ltd. Device a22e (rev 21)
Subsystem: Huawei Technologies Co., Ltd. Device 
Flags: bus master, fast devsel, latency 0
Memory at 800010 (64-bit, prefetchable) [size=64K]
Memory at 80 (64-bit, prefetchable) [size=1M]
Capabilities: [40] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [a0] MSI-X: Enable+ Count=67 Masked-
Capabilities: [b0] Power Management version 3
Capabilities: [100] Access Control Services
Capabilities: [300] Transaction Processing Hints
Kernel driver in use: hns3

root@ubuntu:/# echo vfio-pci > /sys/bus/pci/devices/:00:02.0/driver_override
root@ubuntu:/# echo :00:02.0 > /sys/bus/pci/drivers/hns3/unbind
root@ubuntu:/# echo :00:02.0 > /sys/bus/pci/drivers_probe

root@ubuntu:/mnt/dpdk/build/app# ./testpmd -w :00:02.0 --file-prefix 
socket0 -l 0-1 -n 2 -- -i
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/socket0/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: No available hugepages reported in hugepages-32768kB
EAL: No available hugepages reported in hugepages-64kB
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL:   Invalid NUMA socket, default to 0
EAL:   using IOMMU type 1 (Type 1)
EAL: Probe PCI driver: net_hns3_vf (19e5:a22e) device: :00:02.0 (socket 0)
:00:02.0 hns3_get_mbx_resp(): VF could not get mbx(11,0) head(1) tail(0) 
lost(1) from P

RE: [PATCH v11 12/13] vfio/pci: Register a DMA fault response region

2021-01-08 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Eric Auger [mailto:eric.au...@redhat.com]
> Sent: 16 November 2020 11:00
> To: eric.auger@gmail.com; eric.au...@redhat.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com;
> alex.william...@redhat.com
> Cc: jean-phili...@linaro.org; zhangfei@linaro.org;
> zhangfei@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum
> Thodi ;
> jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com;
> nicoleots...@gmail.com; yuzenghui 
> Subject: [PATCH v11 12/13] vfio/pci: Register a DMA fault response region
> 
> In preparation for vSVA, let's register a DMA fault response region,
> where the userspace will push the page responses and increment the
> head of the buffer. The kernel will pop those responses and inject them
> on iommu side.
> 
> Signed-off-by: Eric Auger 
> ---
>  drivers/vfio/pci/vfio_pci.c | 114 +---
>  drivers/vfio/pci/vfio_pci_private.h |   5 ++
>  drivers/vfio/pci/vfio_pci_rdwr.c|  39 ++
>  include/uapi/linux/vfio.h   |  32 
>  4 files changed, 181 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 65a83fd0e8c0..e9a904ce3f0d 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -318,9 +318,20 @@ static void vfio_pci_dma_fault_release(struct
> vfio_pci_device *vdev,
>   kfree(vdev->fault_pages);
>  }
> 
> -static int vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
> -struct vfio_pci_region *region,
> -struct vm_area_struct *vma)
> +static void
> +vfio_pci_dma_fault_response_release(struct vfio_pci_device *vdev,
> + struct vfio_pci_region *region)
> +{
> + if (vdev->dma_fault_response_wq)
> + destroy_workqueue(vdev->dma_fault_response_wq);
> + kfree(vdev->fault_response_pages);
> + vdev->fault_response_pages = NULL;
> +}
> +
> +static int __vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
> +  struct vfio_pci_region *region,
> +  struct vm_area_struct *vma,
> +  u8 *pages)
>  {
>   u64 phys_len, req_len, pgoff, req_start;
>   unsigned long long addr;
> @@ -333,14 +344,14 @@ static int vfio_pci_dma_fault_mmap(struct
> vfio_pci_device *vdev,
>   ((1U << (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
>   req_start = pgoff << PAGE_SHIFT;
> 
> - /* only the second page of the producer fault region is mmappable */
> + /* only the second page of the fault region is mmappable */
>   if (req_start < PAGE_SIZE)
>   return -EINVAL;
> 
>   if (req_start + req_len > phys_len)
>   return -EINVAL;
> 
> - addr = virt_to_phys(vdev->fault_pages);
> + addr = virt_to_phys(pages);
>   vma->vm_private_data = vdev;
>   vma->vm_pgoff = (addr >> PAGE_SHIFT) + pgoff;
> 
> @@ -349,13 +360,29 @@ static int vfio_pci_dma_fault_mmap(struct
> vfio_pci_device *vdev,
>   return ret;
>  }
> 
> -static int vfio_pci_dma_fault_add_capability(struct vfio_pci_device *vdev,
> -  struct vfio_pci_region *region,
> -  struct vfio_info_cap *caps)
> +static int vfio_pci_dma_fault_mmap(struct vfio_pci_device *vdev,
> +struct vfio_pci_region *region,
> +struct vm_area_struct *vma)
> +{
> + return __vfio_pci_dma_fault_mmap(vdev, region, vma,
> vdev->fault_pages);
> +}
> +
> +static int
> +vfio_pci_dma_fault_response_mmap(struct vfio_pci_device *vdev,
> + struct vfio_pci_region *region,
> + struct vm_area_struct *vma)
> +{
> + return __vfio_pci_dma_fault_mmap(vdev, region, vma,
> vdev->fault_response_pages);
> +}
> +
> +static int __vfio_pci_dma_fault_add_capability(struct vfio_pci_device *vdev,
> +struct vfio_pci_region *region,
> +struct vfio_info_cap *caps,
> +u32 cap_id)
>  {
>   struct vfio_region_info_cap_sparse_mmap *sparse = NULL;
>   struct vfio_region_info_cap_fault cap = {
> - .header.id = VFIO_REGION_INFO_CAP_DMA_FAULT,
> +

RE: [PATCH v13 05/15] iommu/smmuv3: Get prepared for nested stage support

2020-12-09 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Eric Auger [mailto:eric.au...@redhat.com]
> Sent: 18 November 2020 11:22
> To: eric.auger@gmail.com; eric.au...@redhat.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com;
> alex.william...@redhat.com
> Cc: jean-phili...@linaro.org; zhangfei@linaro.org;
> zhangfei@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum
> Thodi ;
> jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com;
> nicoleots...@gmail.com; yuzenghui 
> Subject: [PATCH v13 05/15] iommu/smmuv3: Get prepared for nested stage
> support
> 
> When nested stage translation is setup, both s1_cfg and
> s2_cfg are set.
> 
> We introduce a new smmu domain abort field that will be set
> upon guest stage1 configuration passing.
> 
> arm_smmu_write_strtab_ent() is modified to write both stage
> fields in the STE and deal with the abort field.
> 
> In nested mode, only stage 2 is "finalized" as the host does
> not own/configure the stage 1 context descriptor; guest does.
> 
> Signed-off-by: Eric Auger 
> 
> ---
> v10 -> v11:
> - Fix an issue reported by Shameer when switching from with vSMMU
>   to without vSMMU. Despite the spec does not seem to mention it
>   seems to be needed to reset the 2 high 64b when switching from
>   S1+S2 cfg to S1 only. Especially dst[3] needs to be reset (S2TTB).
>   On some implementations, if the S2TTB is not reset, this causes
>   a C_BAD_STE error
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 64
> +
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 +
>  2 files changed, 56 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 18ac5af1b284..412ea1bafa50 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -1181,8 +1181,10 @@ static void arm_smmu_write_strtab_ent(struct
> arm_smmu_master *master, u32 sid,
>* three cases at the moment:
>*
>* 1. Invalid (all zero) -> bypass/fault (init)
> -  * 2. Bypass/fault -> translation/bypass (attach)
> -  * 3. Translation/bypass -> bypass/fault (detach)
> +  * 2. Bypass/fault -> single stage translation/bypass (attach)
> +  * 3. Single or nested stage Translation/bypass -> bypass/fault (detach)
> +  * 4. S2 -> S1 + S2 (attach_pasid_table)
> +  * 5. S1 + S2 -> S2 (detach_pasid_table)
>*
>* Given that we can't update the STE atomically and the SMMU
>* doesn't read the thing in a defined order, that leaves us
> @@ -1193,7 +1195,8 @@ static void arm_smmu_write_strtab_ent(struct
> arm_smmu_master *master, u32 sid,
>* 3. Update Config, sync
>*/
>   u64 val = le64_to_cpu(dst[0]);
> - bool ste_live = false;
> + bool s1_live = false, s2_live = false, ste_live;
> + bool abort, nested = false, translate = false;
>   struct arm_smmu_device *smmu = NULL;
>   struct arm_smmu_s1_cfg *s1_cfg;
>   struct arm_smmu_s2_cfg *s2_cfg;
> @@ -1233,6 +1236,8 @@ static void arm_smmu_write_strtab_ent(struct
> arm_smmu_master *master, u32 sid,
>   default:
>   break;
>   }
> + nested = s1_cfg->set && s2_cfg->set;

This is a problem when the Guest is booted with iommu.passthrough = 1 as we
set s1_cfg.set = false for IOMMU_PASID_CONFIG_BYPASS. 

Results in BUG_ON(ste_live && !nested).

Can we instead have nested = true set a bit above in the code, where we set
s2_cfg->set = true for the ARM_SMMU_DOMAIN_NESTED case?

Please take a look.

Thanks,
Shameer

> + translate = s1_cfg->set || s2_cfg->set;
>   }
> 
>   if (val & STRTAB_STE_0_V) {
> @@ -1240,23 +1245,36 @@ static void arm_smmu_write_strtab_ent(struct
> arm_smmu_master *master, u32 sid,
>   case STRTAB_STE_0_CFG_BYPASS:
>   break;
>   case STRTAB_STE_0_CFG_S1_TRANS:
> + s1_live = true;
> + break;
>   case STRTAB_STE_0_CFG_S2_TRANS:
> - ste_live = true;
> + s2_live = true;
> + break;
> + case STRTAB_STE_0_CFG_NESTED:
> + s1_live = true;
> + s2_live = true;
>   break;
>   case STRTAB_STE_0_CFG_ABORT:
> - BUG_ON(!disable_bypass);

RE: [PATCH v13 07/15] iommu/smmuv3: Allow stage 1 invalidation with unmanaged ASIDs

2020-12-03 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: kvmarm-boun...@lists.cs.columbia.edu
> [mailto:kvmarm-boun...@lists.cs.columbia.edu] On Behalf Of Auger Eric
> Sent: 01 December 2020 13:59
> To: wangxingang 
> Cc: Xieyingtai ; jean-phili...@linaro.org;
> k...@vger.kernel.org; m...@kernel.org; j...@8bytes.org; w...@kernel.org;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> vivek.gau...@arm.com; alex.william...@redhat.com;
> zhangfei@linaro.org; robin.mur...@arm.com;
> kvmarm@lists.cs.columbia.edu; eric.auger@gmail.com
> Subject: Re: [PATCH v13 07/15] iommu/smmuv3: Allow stage 1 invalidation with
> unmanaged ASIDs
> 
> Hi Xingang,
> 
> On 12/1/20 2:33 PM, Xingang Wang wrote:
> > Hi Eric
> >
> > On  Wed, 18 Nov 2020 12:21:43, Eric Auger wrote:
> >> @@ -1710,7 +1710,11 @@ static void arm_smmu_tlb_inv_context(void
> *cookie)
> >> * insertion to guarantee those are observed before the TLBI. Do be
> >> * careful, 007.
> >> */
> >> -  if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
> >> +  if (ext_asid >= 0) { /* guest stage 1 invalidation */
> >> +  cmd.opcode  = CMDQ_OP_TLBI_NH_ASID;
> >> +  cmd.tlbi.asid   = ext_asid;
> >> +  cmd.tlbi.vmid   = smmu_domain->s2_cfg.vmid;
> >> +  } else if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
> >
> > Found a problem here, the cmd for guest stage 1 invalidation is built,
> > but it is not delivered to smmu.
> >
> 
> Thank you for the report. I will fix that soon. With that fixed, have
> you been able to run vSVA on top of the series. Do you need other stuff
> to be fixed at SMMU level? 

I am seeing another issue with this series. This is when you have the vSMMU
in non-strict mode(iommu.strict=0). Any network pass-through dev with iperf run 
will be enough to reproduce the issue. It may randomly stop/hang.

It looks like the .flush_iotlb_all from guest is not propagated down to the host
correctly. I have a temp hack to fix this in Qemu wherein CMDQ_OP_TLBI_NH_ASID
will result in a CACHE_INVALIDATE with IOMMU_INV_GRANU_PASID flag and archid
set.

Please take a look and let me know. 

As I am going to respin soon, please let me
> know what is the best branch to rebase to alleviate your integration.

Please find the latest kernel and Qemu branch with vSVA support added here,

https://github.com/hisilicon/kernel-dev/tree/5.10-rc4-2stage-v13-vsva
https://github.com/hisilicon/qemu/tree/v5.2.0-rc1-2stage-rfcv7-vsva

I have done some basic minimum vSVA tests on a HiSilicon D06 board with
a zip dev that supports STALL. All looks good so far apart from the issues
that have been already reported/discussed.

The kernel branch is actually a rebase of sva/uacce related patches from a
Linaro branch here,

https://github.com/Linaro/linux-kernel-uadk/tree/uacce-devel-5.10

I think going forward it will be good(if possible) to respin your series on top 
of
a sva branch with STALL/PRI support added. 

Hi Jean/zhangfei,
Is it possible to have a branch with minimum required SVA/UACCE related patches
that are already public and can be a "stable" candidate for future respin of 
Eric's series?
Please share your thoughts.

Thanks,
Shameer 

> Best Regards
> 
> Eric
> 
> ___
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v11 08/13] vfio/pci: Add framework for custom interrupt indices

2020-11-23 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Eric Auger [mailto:eric.au...@redhat.com]
> Sent: 16 November 2020 11:00
> To: eric.auger@gmail.com; eric.au...@redhat.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com;
> alex.william...@redhat.com
> Cc: jean-phili...@linaro.org; zhangfei@linaro.org;
> zhangfei@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum
> Thodi ;
> jacob.jun@linux.intel.com; yi.l@intel.com; t...@semihalf.com;
> nicoleots...@gmail.com; yuzenghui 
> Subject: [PATCH v11 08/13] vfio/pci: Add framework for custom interrupt
> indices
> 
> Implement IRQ capability chain infrastructure. All interrupt
> indexes beyond VFIO_PCI_NUM_IRQS are handled as extended
> interrupts. They are registered with a specific type/subtype
> and supported flags.
> 
> Signed-off-by: Eric Auger 
> ---
>  drivers/vfio/pci/vfio_pci.c | 99 +++--
>  drivers/vfio/pci/vfio_pci_intrs.c   | 62 ++
>  drivers/vfio/pci/vfio_pci_private.h | 14 
>  3 files changed, 157 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 2a6cc1a87323..93e03a4a5f32 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -608,6 +608,14 @@ static void vfio_pci_disable(struct vfio_pci_device
> *vdev)
> 
>   WARN_ON(iommu_unregister_device_fault_handler(>pdev->dev));
> 
> + for (i = 0; i < vdev->num_ext_irqs; i++)
> + vfio_pci_set_irqs_ioctl(vdev, VFIO_IRQ_SET_DATA_NONE |
> + VFIO_IRQ_SET_ACTION_TRIGGER,
> + VFIO_PCI_NUM_IRQS + i, 0, 0, NULL);
> + vdev->num_ext_irqs = 0;
> + kfree(vdev->ext_irqs);
> + vdev->ext_irqs = NULL;
> +
>   /* Device closed, don't need mutex here */
>   list_for_each_entry_safe(ioeventfd, ioeventfd_tmp,
>>ioeventfds_list, next) {
> @@ -823,6 +831,9 @@ static int vfio_pci_get_irq_count(struct vfio_pci_device
> *vdev, int irq_type)
>   return 1;
>   } else if (irq_type == VFIO_PCI_REQ_IRQ_INDEX) {
>   return 1;
> + } else if (irq_type >= VFIO_PCI_NUM_IRQS &&
> +irq_type < VFIO_PCI_NUM_IRQS + vdev->num_ext_irqs) {
> + return 1;
>   }
> 
>   return 0;
> @@ -1008,7 +1019,7 @@ static long vfio_pci_ioctl(void *device_data,
>   info.flags |= VFIO_DEVICE_FLAGS_RESET;
> 
>   info.num_regions = VFIO_PCI_NUM_REGIONS + vdev->num_regions;
> - info.num_irqs = VFIO_PCI_NUM_IRQS;
> + info.num_irqs = VFIO_PCI_NUM_IRQS + vdev->num_ext_irqs;
> 
>   if (IS_ENABLED(CONFIG_VFIO_PCI_ZDEV)) {
>   int ret = vfio_pci_info_zdev_add_caps(vdev, );
> @@ -1187,36 +1198,87 @@ static long vfio_pci_ioctl(void *device_data,
> 
>   } else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) {
>   struct vfio_irq_info info;
> + struct vfio_info_cap caps = { .buf = NULL, .size = 0 };
> + unsigned long capsz;
> 
>   minsz = offsetofend(struct vfio_irq_info, count);
> 
> + /* For backward compatibility, cannot require this */
> + capsz = offsetofend(struct vfio_irq_info, cap_offset);
> +
>   if (copy_from_user(, (void __user *)arg, minsz))
>   return -EFAULT;
> 
> - if (info.argsz < minsz || info.index >= VFIO_PCI_NUM_IRQS)
> + if (info.argsz < minsz ||
> + info.index >= VFIO_PCI_NUM_IRQS + vdev->num_ext_irqs)
>   return -EINVAL;
> 
> - switch (info.index) {
> - case VFIO_PCI_INTX_IRQ_INDEX ... VFIO_PCI_MSIX_IRQ_INDEX:
> - case VFIO_PCI_REQ_IRQ_INDEX:
> - break;
> - case VFIO_PCI_ERR_IRQ_INDEX:
> - if (pci_is_pcie(vdev->pdev))
> - break;
> - fallthrough;
> - default:
> - return -EINVAL;
> - }
> + if (info.argsz >= capsz)
> + minsz = capsz;
> 
>   info.flags = VFIO_IRQ_INFO_EVENTFD;
> 
> - info.count = vfio_pci_get_irq_count(vdev, info.index);
> -
> - if (info.index == VFIO_PCI_INTX_IRQ_INDEX)
> + switch (info.index) {
> +

RE: [PATCH v12 04/15] iommu/smmuv3: Dynamically allocate s1_cfg and s2_cfg

2020-11-17 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Eric Auger [mailto:eric.au...@redhat.com]
> Sent: 16 November 2020 10:43
> To: eric.auger@gmail.com; eric.au...@redhat.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com
> Cc: jean-phili...@linaro.org; zhangfei@linaro.org;
> zhangfei@gmail.com; vivek.gau...@arm.com; Shameerali Kolothum
> Thodi ;
> alex.william...@redhat.com; jacob.jun@linux.intel.com;
> yi.l@intel.com; t...@semihalf.com; nicoleots...@gmail.com
> Subject: [PATCH v12 04/15] iommu/smmuv3: Dynamically allocate s1_cfg and
> s2_cfg
> 
> In preparation for the introduction of nested stages
> let's turn s1_cfg and s2_cfg fields into pointers which are
> dynamically allocated depending on the smmu_domain stage.

This will break compile if we have CONFIG_ARM_SMMU_V3_SVA
because ,
https://github.com/eauger/linux/blob/5.10-rc4-2stage-v12/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c#L40

Do we really need to make these pointers?

Thanks,
Shameer
 
> In nested mode, both stages will coexist and s1_cfg will
> be allocated when the guest configuration gets passed.
> 
> Signed-off-by: Eric Auger 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 83 -
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  6 +-
>  2 files changed, 48 insertions(+), 41 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index d828d6cbeb0e..4baf9fafe462 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -953,9 +953,9 @@ static __le64 *arm_smmu_get_cd_ptr(struct
> arm_smmu_domain *smmu_domain,
>   unsigned int idx;
>   struct arm_smmu_l1_ctx_desc *l1_desc;
>   struct arm_smmu_device *smmu = smmu_domain->smmu;
> - struct arm_smmu_ctx_desc_cfg *cdcfg = _domain->s1_cfg.cdcfg;
> + struct arm_smmu_ctx_desc_cfg *cdcfg =
> _domain->s1_cfg->cdcfg;
> 
> - if (smmu_domain->s1_cfg.s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
> + if (smmu_domain->s1_cfg->s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
>   return cdcfg->cdtab + ssid * CTXDESC_CD_DWORDS;
> 
>   idx = ssid >> CTXDESC_SPLIT;
> @@ -990,7 +990,7 @@ int arm_smmu_write_ctx_desc(struct
> arm_smmu_domain *smmu_domain, int ssid,
>   __le64 *cdptr;
>   struct arm_smmu_device *smmu = smmu_domain->smmu;
> 
> - if (WARN_ON(ssid >= (1 << smmu_domain->s1_cfg.s1cdmax)))
> + if (WARN_ON(ssid >= (1 << smmu_domain->s1_cfg->s1cdmax)))
>   return -E2BIG;
> 
>   cdptr = arm_smmu_get_cd_ptr(smmu_domain, ssid);
> @@ -1056,7 +1056,7 @@ static int arm_smmu_alloc_cd_tables(struct
> arm_smmu_domain *smmu_domain)
>   size_t l1size;
>   size_t max_contexts;
>   struct arm_smmu_device *smmu = smmu_domain->smmu;
> - struct arm_smmu_s1_cfg *cfg = _domain->s1_cfg;
> + struct arm_smmu_s1_cfg *cfg = smmu_domain->s1_cfg;
>   struct arm_smmu_ctx_desc_cfg *cdcfg = >cdcfg;
> 
>   max_contexts = 1 << cfg->s1cdmax;
> @@ -1104,7 +1104,7 @@ static void arm_smmu_free_cd_tables(struct
> arm_smmu_domain *smmu_domain)
>   int i;
>   size_t size, l1size;
>   struct arm_smmu_device *smmu = smmu_domain->smmu;
> - struct arm_smmu_ctx_desc_cfg *cdcfg = _domain->s1_cfg.cdcfg;
> + struct arm_smmu_ctx_desc_cfg *cdcfg =
> _domain->s1_cfg->cdcfg;
> 
>   if (cdcfg->l1_desc) {
>   size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3);
> @@ -1211,17 +1211,8 @@ static void arm_smmu_write_strtab_ent(struct
> arm_smmu_master *master, u32 sid,
>   }
> 
>   if (smmu_domain) {
> - switch (smmu_domain->stage) {
> - case ARM_SMMU_DOMAIN_S1:
> - s1_cfg = _domain->s1_cfg;
> - break;
> - case ARM_SMMU_DOMAIN_S2:
> - case ARM_SMMU_DOMAIN_NESTED:
> - s2_cfg = _domain->s2_cfg;
> - break;
> - default:
> - break;
> - }
> + s1_cfg = smmu_domain->s1_cfg;
> + s2_cfg = smmu_domain->s2_cfg;
>   }
> 
>   if (val & STRTAB_STE_0_V) {
> @@ -1664,10 +1655,10 @@ static void arm_smmu_tlb_inv_context(void
> *cookie)
>* careful, 007.
>*/
>   if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
> - arm_smmu_tlb_inv_asid(smmu, smmu_domain->s1_cfg.cd.asid);
> +

RE: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2020-11-17 Thread Shameerali Kolothum Thodi

Hi Eric,

First, many thanks for the respin. I will go through all of 
these(iommu/vfio/Qemu)
and will do a thorough verification/tests on our hardware. 

> -Original Message-
> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: 17 November 2020 08:40
> To: Shameerali Kolothum Thodi ;
> Zhangfei Gao ; eric.auger@gmail.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com
> Cc: jean-phili...@linaro.org; alex.william...@redhat.com;
> jacob.jun@linux.intel.com; yi.l@intel.com; peter.mayd...@linaro.org;
> t...@semihalf.com; bbhush...@marvell.com
> Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> 
> Hi Shameer,
> 
> On 5/13/20 5:57 PM, Shameerali Kolothum Thodi wrote:
> > Hi Eric,
> >
> >> -Original Message-
> >> From: Auger Eric [mailto:eric.au...@redhat.com]
> >> Sent: 13 May 2020 14:29
> >> To: Shameerali Kolothum Thodi ;
> >> Zhangfei Gao ; eric.auger@gmail.com;
> >> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> >> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> >> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com
> >> Cc: jean-phili...@linaro.org; alex.william...@redhat.com;
> >> jacob.jun@linux.intel.com; yi.l@intel.com;
> peter.mayd...@linaro.org;
> >> t...@semihalf.com; bbhush...@marvell.com
> >> Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> >>
> > [...]
> >
> >>>>> Yes that's normal this series is not meant to support vSVM at this 
> >>>>> stage.
> >>>>>
> >>>>> I intend to add the missing pieces during the next weeks.
> >>>>
> >>>> Thanks for that. I have made an attempt to add the vSVA based on
> >>>> your v10 + JPBs sva patches. The host kernel and Qemu changes can
> >>>> be found here[1][2].
> >>>>
> >>>> This basically adds multiple pasid support on top of your changes.
> >>>> I have done some basic sanity testing and we have some initial success
> >>>> with the zip vf dev on our D06 platform. Please note that the STALL event
> is
> >>>> not yet supported though, but works fine if we mlock() guest usr mem.
> >>>
> >>> I have added STALL support for our vSVA prototype and it seems to be
> >>> working(on our hardware). I have updated the kernel and qemu branches
> >> with
> >>> the same[1][2]. I should warn you though that these are prototype code
> and I
> >> am pretty
> >>> much re-using the VFIO_IOMMU_SET_PASID_TABLE interface for almost
> >> everything.
> >>> But thought of sharing, in case if it is useful somehow!.
> >>
> >> Thank you again for sharing the POC. I looked at the kernel and QEMU
> >> branches.
> >>
> >> Here are some preliminary comments:
> >> - "arm-smmu-v3: Reset S2TTB while switching back from nested stage":
> as
> >> you mentionned S2TTB reset now is featured in v11
> >
> > Yes.
> >
> >> - "arm-smmu-v3: Add support for multiple pasid in nested mode": I could
> >> easily integrate this into my series. Update the iommu api first and
> >> pass multiple CD info in a separate patch
> >
> > Ok.
> in v12, I added
> [PATCH v12 14/15] iommu/smmuv3: Accept configs with more than one
> context descriptor
> 
> I don't think you need to add s1cdmax addition as we already have
> pasid_bits which should do the job.

Ok.
 
> >> - "arm-smmu-v3: Add support to Invalidate CD": CD invalidation should be
> >> cascaded to host through the PASID cache invalidation uapi (no pb you
> >> warned us for the POC you simply used VFIO_IOMMU_SET_PASID_TABLE). I
> >> think I should add this support in my original series although it does
> >> not seem to trigger any issue up to now.
> >
> > Agree. Cache invalidation uapi is a better interface for this. Also I don’t 
> > think
> > this matters for non-vsva cases as Guest kernel table/CD(pasid 0) will never
> > get invalidated.
> in v12 I added [PATCH v12 15/15] iommu/smmuv3: Add PASID cache
> invalidation per PASID. I have not tested it though.

Ok. Will verify this.

> >> - "arm-smmu-v3: Remove duplication of fault propagation". I understand
> >> the transcode is done somewhere else with SVA but we

RE: [PATCH v10 01/11] vfio: VFIO_IOMMU_SET_PASID_TABLE

2020-10-27 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: iommu [mailto:iommu-boun...@lists.linux-foundation.org] On Behalf Of
> Auger Eric
> Sent: 23 September 2020 12:47
> To: yuzenghui ; eric.auger@gmail.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; j...@8bytes.org;
> alex.william...@redhat.com; jacob.jun@linux.intel.com;
> yi.l@intel.com; robin.mur...@arm.com
> Subject: Re: [PATCH v10 01/11] vfio: VFIO_IOMMU_SET_PASID_TABLE

...

> > Besides, before going through the whole series [1][2], I'd like to
> > know if this is the latest version of your Nested-Stage-Setup work in
> > case I had missed something.
> >
> > [1]
> > https://lore.kernel.org/r/20200320161911.27494-1-eric.au...@redhat.com
> > [2]
> > https://lore.kernel.org/r/20200414150607.28488-1-eric.au...@redhat.com
> 
> yes those 2 series are the last ones. Thank you for reviewing.
> 
> FYI, I intend to respin within a week or 2 on top of Jacob's  [PATCH v10 0/7]
> IOMMU user API enhancement. 

Thanks for that. Also is there any plan to respin the related Qemu series as 
well?
I know dual stage interface proposals are still under discussion, but it would 
be
nice to have a testable solution based on new interfaces for ARM64 as well.
Happy to help with any tests or verifications.

Please let me know.

Thanks,
Shameer
  

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2020-05-13 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: 13 May 2020 14:29
> To: Shameerali Kolothum Thodi ;
> Zhangfei Gao ; eric.auger@gmail.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com
> Cc: jean-phili...@linaro.org; alex.william...@redhat.com;
> jacob.jun@linux.intel.com; yi.l@intel.com; peter.mayd...@linaro.org;
> t...@semihalf.com; bbhush...@marvell.com
> Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> 
[...]

> >>> Yes that's normal this series is not meant to support vSVM at this stage.
> >>>
> >>> I intend to add the missing pieces during the next weeks.
> >>
> >> Thanks for that. I have made an attempt to add the vSVA based on
> >> your v10 + JPBs sva patches. The host kernel and Qemu changes can
> >> be found here[1][2].
> >>
> >> This basically adds multiple pasid support on top of your changes.
> >> I have done some basic sanity testing and we have some initial success
> >> with the zip vf dev on our D06 platform. Please note that the STALL event 
> >> is
> >> not yet supported though, but works fine if we mlock() guest usr mem.
> >
> > I have added STALL support for our vSVA prototype and it seems to be
> > working(on our hardware). I have updated the kernel and qemu branches
> with
> > the same[1][2]. I should warn you though that these are prototype code and I
> am pretty
> > much re-using the VFIO_IOMMU_SET_PASID_TABLE interface for almost
> everything.
> > But thought of sharing, in case if it is useful somehow!.
> 
> Thank you again for sharing the POC. I looked at the kernel and QEMU
> branches.
> 
> Here are some preliminary comments:
> - "arm-smmu-v3: Reset S2TTB while switching back from nested stage":  as
> you mentionned S2TTB reset now is featured in v11

Yes.

> - "arm-smmu-v3: Add support for multiple pasid in nested mode": I could
> easily integrate this into my series. Update the iommu api first and
> pass multiple CD info in a separate patch

Ok.
> - "arm-smmu-v3: Add support to Invalidate CD": CD invalidation should be
> cascaded to host through the PASID cache invalidation uapi (no pb you
> warned us for the POC you simply used VFIO_IOMMU_SET_PASID_TABLE). I
> think I should add this support in my original series although it does
> not seem to trigger any issue up to now.

Agree. Cache invalidation uapi is a better interface for this. Also I don’t 
think
this matters for non-vsva cases as Guest kernel table/CD(pasid 0) will never
get invalidated. 

> - "arm-smmu-v3: Remove duplication of fault propagation". I understand
> the transcode is done somewhere else with SVA but we still need to do it
> if a single CD is used, right? I will review the SVA code to better
> understand.

Hmm..not sure. Need to take another look to see whether we need a special
handling for single CD or not.

> - for the STALL response injection I would tend to use a new VFIO region
> for responses. At the moment there is a single VFIO region for reporting
> the fault.

Sure. That will be much cleaner and probably improve the context switch
latency. Another thing I noted with STALL is that pasid_valid flag needs to be
taken care in the SVA kernel path. 

"iommu: Remove pasid validity check for STALL model page response msg"
Not sure this one is a proper way to handle this.
 
> On QEMU side:
> - I am currently working on 3.2 range invalidation support which is
> needed for DPDK/VFIO
> - While at it I will look at how to incrementally introduce some of the
> features you need in this series.

Ok. 

Thanks for taking a look at the POC.

Cheers,
Shameer

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2020-05-07 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Shameerali Kolothum Thodi
> Sent: 30 April 2020 10:38
> To: 'Auger Eric' ; Zhangfei Gao
> ; eric.auger@gmail.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com
> Cc: jean-phili...@linaro.org; alex.william...@redhat.com;
> jacob.jun@linux.intel.com; yi.l@intel.com; peter.mayd...@linaro.org;
> t...@semihalf.com; bbhush...@marvell.com
> Subject: RE: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> 
> Hi Eric,
> 
> > -Original Message-
> > From: Auger Eric [mailto:eric.au...@redhat.com]
> > Sent: 16 April 2020 08:45
> > To: Zhangfei Gao ; eric.auger@gmail.com;
> > io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> > k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> > j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com
> > Cc: jean-phili...@linaro.org; Shameerali Kolothum Thodi
> > ; alex.william...@redhat.com;
> > jacob.jun@linux.intel.com; yi.l@intel.com; peter.mayd...@linaro.org;
> > t...@semihalf.com; bbhush...@marvell.com
> > Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> >
> > Hi Zhangfei,
> >
> > On 4/16/20 6:25 AM, Zhangfei Gao wrote:
> > >
> > >
> > > On 2020/4/14 下午11:05, Eric Auger wrote:
> > >> This version fixes an issue observed by Shameer on an SMMU 3.2,
> > >> when moving from dual stage config to stage 1 only config.
> > >> The 2 high 64b of the STE now get reset. Otherwise, leaving the
> > >> S2TTB set may cause a C_BAD_STE error.
> > >>
> > >> This series can be found at:
> > >> https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1
> > >> (including the VFIO part)
> > >> The QEMU fellow series still can be found at:
> > >> https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6
> > >>
> > >> Users have expressed interest in that work and tested v9/v10:
> > >> - https://patchwork.kernel.org/cover/11039995/#23012381
> > >> - https://patchwork.kernel.org/cover/11039995/#23197235
> > >>
> > >> Background:
> > >>
> > >> This series brings the IOMMU part of HW nested paging support
> > >> in the SMMUv3. The VFIO part is submitted separately.
> > >>
> > >> The IOMMU API is extended to support 2 new API functionalities:
> > >> 1) pass the guest stage 1 configuration
> > >> 2) pass stage 1 MSI bindings
> > >>
> > >> Then those capabilities gets implemented in the SMMUv3 driver.
> > >>
> > >> The virtualizer passes information through the VFIO user API
> > >> which cascades them to the iommu subsystem. This allows the guest
> > >> to own stage 1 tables and context descriptors (so-called PASID
> > >> table) while the host owns stage 2 tables and main configuration
> > >> structures (STE).
> > >>
> > >>
> > >
> > > Thanks Eric
> > >
> > > Tested v11 on Hisilicon kunpeng920 board via hardware zip accelerator.
> > > 1. no-sva works, where guest app directly use physical address via ioctl.
> > Thank you for the testing. Glad it works for you.
> > > 2. vSVA still not work, same as v10,
> > Yes that's normal this series is not meant to support vSVM at this stage.
> >
> > I intend to add the missing pieces during the next weeks.
> 
> Thanks for that. I have made an attempt to add the vSVA based on
> your v10 + JPBs sva patches. The host kernel and Qemu changes can
> be found here[1][2].
> 
> This basically adds multiple pasid support on top of your changes.
> I have done some basic sanity testing and we have some initial success
> with the zip vf dev on our D06 platform. Please note that the STALL event is
> not yet supported though, but works fine if we mlock() guest usr mem.

I have added STALL support for our vSVA prototype and it seems to be
working(on our hardware). I have updated the kernel and qemu branches with
the same[1][2]. I should warn you though that these are prototype code and I am 
pretty
much re-using the VFIO_IOMMU_SET_PASID_TABLE interface for almost everything.
But thought of sharing, in case if it is useful somehow!.

Thanks,
Shameer

[1]https://github.com/hisilicon/kernel-dev/commits/vsva-prototype-host-v1

[2]https://github.com/hisilicon/qemu/tree/v4.2.0-2stage-rfcv6-vsva-prototype-v1
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)

2020-04-30 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: 16 April 2020 08:45
> To: Zhangfei Gao ; eric.auger@gmail.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; w...@kernel.org;
> j...@8bytes.org; m...@kernel.org; robin.mur...@arm.com
> Cc: jean-phili...@linaro.org; Shameerali Kolothum Thodi
> ; alex.william...@redhat.com;
> jacob.jun@linux.intel.com; yi.l@intel.com; peter.mayd...@linaro.org;
> t...@semihalf.com; bbhush...@marvell.com
> Subject: Re: [PATCH v11 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> 
> Hi Zhangfei,
> 
> On 4/16/20 6:25 AM, Zhangfei Gao wrote:
> >
> >
> > On 2020/4/14 下午11:05, Eric Auger wrote:
> >> This version fixes an issue observed by Shameer on an SMMU 3.2,
> >> when moving from dual stage config to stage 1 only config.
> >> The 2 high 64b of the STE now get reset. Otherwise, leaving the
> >> S2TTB set may cause a C_BAD_STE error.
> >>
> >> This series can be found at:
> >> https://github.com/eauger/linux/tree/v5.6-2stage-v11_10.1
> >> (including the VFIO part)
> >> The QEMU fellow series still can be found at:
> >> https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6
> >>
> >> Users have expressed interest in that work and tested v9/v10:
> >> - https://patchwork.kernel.org/cover/11039995/#23012381
> >> - https://patchwork.kernel.org/cover/11039995/#23197235
> >>
> >> Background:
> >>
> >> This series brings the IOMMU part of HW nested paging support
> >> in the SMMUv3. The VFIO part is submitted separately.
> >>
> >> The IOMMU API is extended to support 2 new API functionalities:
> >> 1) pass the guest stage 1 configuration
> >> 2) pass stage 1 MSI bindings
> >>
> >> Then those capabilities gets implemented in the SMMUv3 driver.
> >>
> >> The virtualizer passes information through the VFIO user API
> >> which cascades them to the iommu subsystem. This allows the guest
> >> to own stage 1 tables and context descriptors (so-called PASID
> >> table) while the host owns stage 2 tables and main configuration
> >> structures (STE).
> >>
> >>
> >
> > Thanks Eric
> >
> > Tested v11 on Hisilicon kunpeng920 board via hardware zip accelerator.
> > 1. no-sva works, where guest app directly use physical address via ioctl.
> Thank you for the testing. Glad it works for you.
> > 2. vSVA still not work, same as v10,
> Yes that's normal this series is not meant to support vSVM at this stage.
> 
> I intend to add the missing pieces during the next weeks.

Thanks for that. I have made an attempt to add the vSVA based on 
your v10 + JPBs sva patches. The host kernel and Qemu changes can 
be found here[1][2].

This basically adds multiple pasid support on top of your changes.
I have done some basic sanity testing and we have some initial success
with the zip vf dev on our D06 platform. Please note that the STALL event is
not yet supported though, but works fine if we mlock() guest usr mem.

I also noted that Intel patches for vSVA has couple of changes in the vfio 
interfaces
and hope there will be a convergence soon. Please let me know your plans
of a respin of this series and see whether incorporating the changes for 
multiple
pasid make sense or not for now.

Thanks,
Shameer

[1]https://github.com/hisilicon/qemu/tree/v4.2.0-2stage-rfcv6-vsva-prototype-v1
[2]https://github.com/hisilicon/kernel-dev/tree/vsva-prototype-host-v1

> Thanks
> 
> Eric
> > 3.  the v10 issue reported by Shameer has been solved,  first start qemu
> > with  iommu=smmuv3, then start qemu without  iommu=smmuv3
> > 4. no-sva also works without  iommu=smmuv3
> >
> > Test details in https://docs.qq.com/doc/DRU5oR1NtUERseFNL
> >
> > Thanks
> >

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)

2019-11-13 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: 12 November 2019 20:35
> To: Shameerali Kolothum Thodi ;
> eric.auger@gmail.com; io...@lists.linux-foundation.org;
> linux-ker...@vger.kernel.org; k...@vger.kernel.org;
> kvmarm@lists.cs.columbia.edu; j...@8bytes.org;
> alex.william...@redhat.com; jacob.jun@linux.intel.com;
> yi.l@intel.com; jean-philippe.bruc...@arm.com; will.dea...@arm.com;
> robin.mur...@arm.com
> Cc: kevin.t...@intel.com; vincent.ste...@arm.com; ashok@intel.com;
> marc.zyng...@arm.com; tina.zh...@intel.com; Linuxarm
> ; xuwei (O) 
> Subject: Re: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)
> 
> Hi Shameer,
> 

[..]

> >
> > I just noted that CMDQ_OP_TLBI_NH_VA is missing the vmid filed which
> seems
> > to be the cause for single IOVA TLBI not working properly.
> >
> > I had this fix in arm-smmuv3.c,
> >
> > @@ -947,6 +947,7 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd,
> struct arm_smmu_cmdq_ent *ent)
> > cmd[1] |= FIELD_PREP(CMDQ_CFGI_1_RANGE, 31);
> > break;
> > case CMDQ_OP_TLBI_NH_VA:
> > +   cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, ent->tlbi.vmid);
> Damn, I did not see that! That's it. ASID invalidation fills this field
> indeed. You may post an independent patch for that.

Sure. Just did that.
" iommu/arm-smmu-v3: Populate VMID field for CMDQ_OP_TLBI_NH_VA"

>   cmd[0] |=
> FIELD_PREP(CMDQ_TLBI_0_ASID, ent->tlbi.asid);
> > cmd[1] |= FIELD_PREP(CMDQ_TLBI_1_LEAF, ent->tlbi.leaf);
> > cmd[1] |= ent->tlbi.addr & CMDQ_TLBI_1_VA_MASK;
> >
> >
> > With this, your original qemu branch is working.
> >
> > root@ubuntu:~# iperf -c 10.202.225.185
> > 
> > Client connecting to 10.202.225.185, TCP port 5001 TCP window size: 85.0
> KByte (default)
> > 
> > [  3] local 10.202.225.169 port 44894 connected with 10.202.225.185 port
> 5001
> > [ ID] Interval   Transfer Bandwidth
> > [  3]  0.0-10.0 sec  3.21 GBytes  2.76 Gbits/sec
> >
> > Could you please check this...
> >
> > I also have a rebase of your patches on top of 5.4-rc5. This has some
> optimizations
> > From Will such as batched TLBI inv. Please find it here,
> >
> > https://github.com/hisilicon/kernel-dev/tree/private-vSMMUv3-v9-v5.4-rc5
> >
> > This gives me a better performance with iperf,
> >
> > root@ubuntu:~# iperf -c 10.202.225.185
> > 
> > Client connecting to 10.202.225.185, TCP port 5001 TCP window size: 85.0
> KByte (default)
> > 
> > [  3] local 10.202.225.169 port 55450 connected with 10.202.225.185 port
> 5001
> > [ ID] Interval   Transfer Bandwidth
> > [  3]  0.0-10.0 sec  4.91 GBytes  4.22 Gbits/sec root@ubuntu:~#
> >
> > If possible please check this branch as well.
> 
> To be honest I don't really know what to do with this work. Despite the
> efforts, this has suffered from a lack of traction in the community. My
> last attempt to explain the use cases, upon Will's request at Plumber,
> has not received any comment (https://lkml.org/lkml/2019/9/20/104).
> 
> I think I will post a rebased version with your fix, as a matter to get
> a clean snapshot.

Thanks. That makes sense.

 If you think this work is useful for your projects,
> please let it know on the ML.

Right. While SVA use case is definitely the one we are very much interested, I 
will
check within our team the priority for use case 1(native drivers in Guest) you
mentioned in the above link. 

Cheers,
Shameer

> Thank you again!
> 
> Eric
> >
> > Thanks,
> > Shameer
> >
> >> Thanks,
> >> Shameer
> >>
> >>
> >>> Thanks
> >>>
> >>> Eric
> >>>>
> >>>> Cheers,
> >>>> Shameer
> >>>>
> >>>>> Thanks
> >>>>>
> >>>>> Eric
> >>>>>
> >>>>>
> >>>>>
> >>>>>>
> >>>>>> Please let me know.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Shameer
> >>>>>>
> >>>>>>> Best Regards
> >>>>>>>
> >>>>>>> Eric
> >>>>>>>
> >

RE: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)

2019-11-12 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Shameerali Kolothum Thodi
> Sent: 12 November 2019 14:21
> To: 'Auger Eric' ; eric.auger@gmail.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; j...@8bytes.org;
> alex.william...@redhat.com; jacob.jun@linux.intel.com;
> yi.l@intel.com; jean-philippe.bruc...@arm.com; will.dea...@arm.com;
> robin.mur...@arm.com
> Cc: kevin.t...@intel.com; vincent.ste...@arm.com; ashok@intel.com;
> marc.zyng...@arm.com; tina.zh...@intel.com; Linuxarm
> ; xuwei (O) 
> Subject: RE: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)
> 
[...]
> > >>> I am trying to get this running on one of our platform that has smmuv3
> dual
> > >>> stage support. I am seeing some issues with this when an ixgbe vf dev is
> > >>> made pass-through and is behind a vSMMUv3 in Guest.
> > >>>
> > >>> Kernel used : https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9
> > >>> Qemu: https://github.com/eauger/qemu/tree/v4.1.0-rc0-2stage-rfcv5
> > >>>
> > >>> And this is my Qemu cmd line,
> > >>>
> > >>> ./qemu-system-aarch64
> > >>> -machine virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 -cpu host
> \
> > >>> -kernel Image \
> > >>> -drive if=none,file=ubuntu,id=fs \
> > >>> -device virtio-blk-device,drive=fs \
> > >>> -device vfio-pci,host=:01:10.1 \
> > >>> -bios QEMU_EFI.fd \
> > >>> -net none \
> > >>> -m 4G \
> > >>> -nographic -D -d -enable-kvm \
> > >>> -append "console=ttyAMA0 root=/dev/vda rw acpi=force"
> > >>>
> > >>> The basic ping from Guest works fine,
> > >>> root@ubuntu:~# ping 10.202.225.185
> > >>> PING 10.202.225.185 (10.202.225.185) 56(84) bytes of data.
> > >>> 64 bytes from 10.202.225.185: icmp_seq=2 ttl=64 time=0.207 ms
> > >>> 64 bytes from 10.202.225.185: icmp_seq=3 ttl=64 time=0.203 ms
> > >>> ...
> > >>>
> > >>> But if I increase ping packet size,
> > >>>
> > >>> root@ubuntu:~# ping -s 1024 10.202.225.185
> > >>> PING 10.202.225.185 (10.202.225.185) 1024(1052) bytes of data.
> > >>> 1032 bytes from 10.202.225.185: icmp_seq=22 ttl=64 time=0.292 ms
> > >>> 1032 bytes from 10.202.225.185: icmp_seq=23 ttl=64 time=0.207 ms
> > >>> From 10.202.225.169 icmp_seq=66 Destination Host Unreachable
> > >>> From 10.202.225.169 icmp_seq=67 Destination Host Unreachable
> > >>> From 10.202.225.169 icmp_seq=68 Destination Host Unreachable
> > >>> From 10.202.225.169 icmp_seq=69 Destination Host Unreachable
> > >>>
> > >>> And from Host kernel I get,
> > >>> [  819.970742] ixgbe :01:00.1 enp1s0f1: 3 Spoofed packets
> detected
> > >>> [  824.002707] ixgbe :01:00.1 enp1s0f1: 1 Spoofed packets
> detected
> > >>> [  828.034683] ixgbe :01:00.1 enp1s0f1: 1 Spoofed packets
> detected
> > >>> [  830.050673] ixgbe :01:00.1 enp1s0f1: 4 Spoofed packets
> detected
> > >>> [  832.066659] ixgbe :01:00.1 enp1s0f1: 1 Spoofed packets
> detected
> > >>> [  834.082640] ixgbe :01:00.1 enp1s0f1: 3 Spoofed packets
> detected
> > >>>
> > >>> Also noted that iperf cannot work as it fails to establish the 
> > >>> connection
> > with
> > >> iperf
> > >>> server.
> > >>>
> > >>> Please find attached the trace logs(vfio*, smmuv3*) from Qemu for your
> > >> reference.
> > >>> I haven't debugged this further yet and thought of checking with you if
> this
> > is
> > >>> something you have seen already or not. Or maybe I am missing
> something
> > >> here?
> > >>
> > >> Please can you try to edit and modify hw/vfio/common.c, function
> > >> vfio_iommu_unmap_notify
> > >>
> > >>
> > >> /*
> > >> if (size <= 0x1) {
> > >> ustruct.info.cache = IOMMU_CACHE_INV_TYPE_IOTLB;
> > >> ustruct.info.granularity = IOMMU_INV_GRANU_ADDR;
> > >> ustruct.info.addr_info.flags =
> > IOMMU_INV_ADDR_FLAGS_ARCHID;
> > >> if (iotlb->leaf) {
> > >> ustruct.info.addr_info.flags |=
> > >&

RE: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)

2019-11-12 Thread Shameerali Kolothum Thodi




> -Original Message-
> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: 12 November 2019 13:22
> To: Shameerali Kolothum Thodi ;
> eric.auger@gmail.com; io...@lists.linux-foundation.org;
> linux-ker...@vger.kernel.org; k...@vger.kernel.org;
> kvmarm@lists.cs.columbia.edu; j...@8bytes.org;
> alex.william...@redhat.com; jacob.jun@linux.intel.com;
> yi.l@intel.com; jean-philippe.bruc...@arm.com; will.dea...@arm.com;
> robin.mur...@arm.com
> Cc: kevin.t...@intel.com; vincent.ste...@arm.com; ashok@intel.com;
> marc.zyng...@arm.com; tina.zh...@intel.com; Linuxarm
> ; xuwei (O) 
> Subject: Re: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)
> 
> Hi Shameer,
> 
> On 11/12/19 2:06 PM, Shameerali Kolothum Thodi wrote:
> > Hi Eric,
> >
> >> -Original Message-
> >> From: Auger Eric [mailto:eric.au...@redhat.com]
> >> Sent: 12 November 2019 11:29
> >> To: Shameerali Kolothum Thodi ;
> >> eric.auger@gmail.com; io...@lists.linux-foundation.org;
> >> linux-ker...@vger.kernel.org; k...@vger.kernel.org;
> >> kvmarm@lists.cs.columbia.edu; j...@8bytes.org;
> >> alex.william...@redhat.com; jacob.jun@linux.intel.com;
> >> yi.l@intel.com; jean-philippe.bruc...@arm.com; will.dea...@arm.com;
> >> robin.mur...@arm.com
> >> Cc: kevin.t...@intel.com; vincent.ste...@arm.com; ashok@intel.com;
> >> marc.zyng...@arm.com; tina.zh...@intel.com; Linuxarm
> >> ; xuwei (O) 
> >> Subject: Re: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)
> >>
> >> Hi Shameer,
> >> On 11/12/19 12:08 PM, Shameerali Kolothum Thodi wrote:
> >>> Hi Eric,
> >>>
> >>>> -Original Message-
> >>>> From: kvmarm-boun...@lists.cs.columbia.edu
> >>>> [mailto:kvmarm-boun...@lists.cs.columbia.edu] On Behalf Of Eric Auger
> >>>> Sent: 11 July 2019 14:56
> >>>> To: eric.auger@gmail.com; eric.au...@redhat.com;
> >>>> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> >>>> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; j...@8bytes.org;
> >>>> alex.william...@redhat.com; jacob.jun@linux.intel.com;
> >>>> yi.l@intel.com; jean-philippe.bruc...@arm.com;
> will.dea...@arm.com;
> >>>> robin.mur...@arm.com
> >>>> Cc: kevin.t...@intel.com; vincent.ste...@arm.com;
> ashok@intel.com;
> >>>> marc.zyng...@arm.com; tina.zh...@intel.com
> >>>> Subject: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)
> >>>>
> >>>> This series brings the VFIO part of HW nested paging support
> >>>> in the SMMUv3.
> >>>>
> >>>> The series depends on:
> >>>> [PATCH v9 00/14] SMMUv3 Nested Stage Setup (IOMMU part)
> >>>> (https://www.spinics.net/lists/kernel/msg3187714.html)
> >>>>
> >>>> 3 new IOCTLs are introduced that allow the userspace to
> >>>> 1) pass the guest stage 1 configuration
> >>>> 2) pass stage 1 MSI bindings
> >>>> 3) invalidate stage 1 related caches
> >>>>
> >>>> They map onto the related new IOMMU API functions.
> >>>>
> >>>> We introduce the capability to register specific interrupt
> >>>> indexes (see [1]). A new DMA_FAULT interrupt index allows to register
> >>>> an eventfd to be signaled whenever a stage 1 related fault
> >>>> is detected at physical level. Also a specific region allows
> >>>> to expose the fault records to the user space.
> >>>
> >>> I am trying to get this running on one of our platform that has smmuv3 
> >>> dual
> >>> stage support. I am seeing some issues with this when an ixgbe vf dev is
> >>> made pass-through and is behind a vSMMUv3 in Guest.
> >>>
> >>> Kernel used : https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9
> >>> Qemu: https://github.com/eauger/qemu/tree/v4.1.0-rc0-2stage-rfcv5
> >>>
> >>> And this is my Qemu cmd line,
> >>>
> >>> ./qemu-system-aarch64
> >>> -machine virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 -cpu host \
> >>> -kernel Image \
> >>> -drive if=none,file=ubuntu,id=fs \
> >>> -device virtio-blk-device,drive=fs \
> >>> -device vfio-pci,host=:01:10.1 \
> >>> -bios QEMU_EFI.fd \
> >>> -net none \
> >>> -m 4G \
> >>> -nographic

RE: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)

2019-11-12 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: 12 November 2019 11:29
> To: Shameerali Kolothum Thodi ;
> eric.auger@gmail.com; io...@lists.linux-foundation.org;
> linux-ker...@vger.kernel.org; k...@vger.kernel.org;
> kvmarm@lists.cs.columbia.edu; j...@8bytes.org;
> alex.william...@redhat.com; jacob.jun@linux.intel.com;
> yi.l@intel.com; jean-philippe.bruc...@arm.com; will.dea...@arm.com;
> robin.mur...@arm.com
> Cc: kevin.t...@intel.com; vincent.ste...@arm.com; ashok@intel.com;
> marc.zyng...@arm.com; tina.zh...@intel.com; Linuxarm
> ; xuwei (O) 
> Subject: Re: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)
> 
> Hi Shameer,
> On 11/12/19 12:08 PM, Shameerali Kolothum Thodi wrote:
> > Hi Eric,
> >
> >> -Original Message-
> >> From: kvmarm-boun...@lists.cs.columbia.edu
> >> [mailto:kvmarm-boun...@lists.cs.columbia.edu] On Behalf Of Eric Auger
> >> Sent: 11 July 2019 14:56
> >> To: eric.auger@gmail.com; eric.au...@redhat.com;
> >> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> >> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; j...@8bytes.org;
> >> alex.william...@redhat.com; jacob.jun@linux.intel.com;
> >> yi.l@intel.com; jean-philippe.bruc...@arm.com; will.dea...@arm.com;
> >> robin.mur...@arm.com
> >> Cc: kevin.t...@intel.com; vincent.ste...@arm.com; ashok@intel.com;
> >> marc.zyng...@arm.com; tina.zh...@intel.com
> >> Subject: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)
> >>
> >> This series brings the VFIO part of HW nested paging support
> >> in the SMMUv3.
> >>
> >> The series depends on:
> >> [PATCH v9 00/14] SMMUv3 Nested Stage Setup (IOMMU part)
> >> (https://www.spinics.net/lists/kernel/msg3187714.html)
> >>
> >> 3 new IOCTLs are introduced that allow the userspace to
> >> 1) pass the guest stage 1 configuration
> >> 2) pass stage 1 MSI bindings
> >> 3) invalidate stage 1 related caches
> >>
> >> They map onto the related new IOMMU API functions.
> >>
> >> We introduce the capability to register specific interrupt
> >> indexes (see [1]). A new DMA_FAULT interrupt index allows to register
> >> an eventfd to be signaled whenever a stage 1 related fault
> >> is detected at physical level. Also a specific region allows
> >> to expose the fault records to the user space.
> >
> > I am trying to get this running on one of our platform that has smmuv3 dual
> > stage support. I am seeing some issues with this when an ixgbe vf dev is
> > made pass-through and is behind a vSMMUv3 in Guest.
> >
> > Kernel used : https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9
> > Qemu: https://github.com/eauger/qemu/tree/v4.1.0-rc0-2stage-rfcv5
> >
> > And this is my Qemu cmd line,
> >
> > ./qemu-system-aarch64
> > -machine virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 -cpu host \
> > -kernel Image \
> > -drive if=none,file=ubuntu,id=fs \
> > -device virtio-blk-device,drive=fs \
> > -device vfio-pci,host=:01:10.1 \
> > -bios QEMU_EFI.fd \
> > -net none \
> > -m 4G \
> > -nographic -D -d -enable-kvm \
> > -append "console=ttyAMA0 root=/dev/vda rw acpi=force"
> >
> > The basic ping from Guest works fine,
> > root@ubuntu:~# ping 10.202.225.185
> > PING 10.202.225.185 (10.202.225.185) 56(84) bytes of data.
> > 64 bytes from 10.202.225.185: icmp_seq=2 ttl=64 time=0.207 ms
> > 64 bytes from 10.202.225.185: icmp_seq=3 ttl=64 time=0.203 ms
> > ...
> >
> > But if I increase ping packet size,
> >
> > root@ubuntu:~# ping -s 1024 10.202.225.185
> > PING 10.202.225.185 (10.202.225.185) 1024(1052) bytes of data.
> > 1032 bytes from 10.202.225.185: icmp_seq=22 ttl=64 time=0.292 ms
> > 1032 bytes from 10.202.225.185: icmp_seq=23 ttl=64 time=0.207 ms
> > From 10.202.225.169 icmp_seq=66 Destination Host Unreachable
> > From 10.202.225.169 icmp_seq=67 Destination Host Unreachable
> > From 10.202.225.169 icmp_seq=68 Destination Host Unreachable
> > From 10.202.225.169 icmp_seq=69 Destination Host Unreachable
> >
> > And from Host kernel I get,
> > [  819.970742] ixgbe :01:00.1 enp1s0f1: 3 Spoofed packets detected
> > [  824.002707] ixgbe :01:00.1 enp1s0f1: 1 Spoofed packets detected
> > [  828.034683] ixgbe :01:00.1 enp1s0f1: 1 Spoofed packets detected
> > [  830.050673] ixgbe :01:00.1 enp1s0f1: 4 Spoofed packets detected
> > [  832.06

RE: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)

2019-11-12 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: kvmarm-boun...@lists.cs.columbia.edu
> [mailto:kvmarm-boun...@lists.cs.columbia.edu] On Behalf Of Eric Auger
> Sent: 11 July 2019 14:56
> To: eric.auger@gmail.com; eric.au...@redhat.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> k...@vger.kernel.org; kvmarm@lists.cs.columbia.edu; j...@8bytes.org;
> alex.william...@redhat.com; jacob.jun@linux.intel.com;
> yi.l@intel.com; jean-philippe.bruc...@arm.com; will.dea...@arm.com;
> robin.mur...@arm.com
> Cc: kevin.t...@intel.com; vincent.ste...@arm.com; ashok@intel.com;
> marc.zyng...@arm.com; tina.zh...@intel.com
> Subject: [PATCH v9 00/11] SMMUv3 Nested Stage Setup (VFIO part)
> 
> This series brings the VFIO part of HW nested paging support
> in the SMMUv3.
> 
> The series depends on:
> [PATCH v9 00/14] SMMUv3 Nested Stage Setup (IOMMU part)
> (https://www.spinics.net/lists/kernel/msg3187714.html)
> 
> 3 new IOCTLs are introduced that allow the userspace to
> 1) pass the guest stage 1 configuration
> 2) pass stage 1 MSI bindings
> 3) invalidate stage 1 related caches
> 
> They map onto the related new IOMMU API functions.
> 
> We introduce the capability to register specific interrupt
> indexes (see [1]). A new DMA_FAULT interrupt index allows to register
> an eventfd to be signaled whenever a stage 1 related fault
> is detected at physical level. Also a specific region allows
> to expose the fault records to the user space.

I am trying to get this running on one of our platform that has smmuv3 dual
stage support. I am seeing some issues with this when an ixgbe vf dev is 
made pass-through and is behind a vSMMUv3 in Guest.

Kernel used : https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9
Qemu: https://github.com/eauger/qemu/tree/v4.1.0-rc0-2stage-rfcv5

And this is my Qemu cmd line,

./qemu-system-aarch64
-machine virt,kernel_irqchip=on,gic-version=3,iommu=smmuv3 -cpu host \
-kernel Image \
-drive if=none,file=ubuntu,id=fs \
-device virtio-blk-device,drive=fs \
-device vfio-pci,host=:01:10.1 \
-bios QEMU_EFI.fd \
-net none \
-m 4G \
-nographic -D -d -enable-kvm \
-append "console=ttyAMA0 root=/dev/vda rw acpi=force"

The basic ping from Guest works fine,
root@ubuntu:~# ping 10.202.225.185
PING 10.202.225.185 (10.202.225.185) 56(84) bytes of data.
64 bytes from 10.202.225.185: icmp_seq=2 ttl=64 time=0.207 ms
64 bytes from 10.202.225.185: icmp_seq=3 ttl=64 time=0.203 ms
...

But if I increase ping packet size, 

root@ubuntu:~# ping -s 1024 10.202.225.185
PING 10.202.225.185 (10.202.225.185) 1024(1052) bytes of data.
1032 bytes from 10.202.225.185: icmp_seq=22 ttl=64 time=0.292 ms
1032 bytes from 10.202.225.185: icmp_seq=23 ttl=64 time=0.207 ms
>From 10.202.225.169 icmp_seq=66 Destination Host Unreachable
>From 10.202.225.169 icmp_seq=67 Destination Host Unreachable
>From 10.202.225.169 icmp_seq=68 Destination Host Unreachable
>From 10.202.225.169 icmp_seq=69 Destination Host Unreachable

And from Host kernel I get,
[  819.970742] ixgbe :01:00.1 enp1s0f1: 3 Spoofed packets detected
[  824.002707] ixgbe :01:00.1 enp1s0f1: 1 Spoofed packets detected
[  828.034683] ixgbe :01:00.1 enp1s0f1: 1 Spoofed packets detected
[  830.050673] ixgbe :01:00.1 enp1s0f1: 4 Spoofed packets detected
[  832.066659] ixgbe :01:00.1 enp1s0f1: 1 Spoofed packets detected
[  834.082640] ixgbe :01:00.1 enp1s0f1: 3 Spoofed packets detected

Also noted that iperf cannot work as it fails to establish the connection with 
iperf
server. 

Please find attached the trace logs(vfio*, smmuv3*) from Qemu for your 
reference.
I haven't debugged this further yet and thought of checking with you if this is
something you have seen already or not. Or maybe I am missing something here?

Please let me know.

Thanks,
Shameer

> Best Regards
> 
> Eric
> 
> This series can be found at:
> https://github.com/eauger/linux/tree/v5.3.0-rc0-2stage-v9
> 
> It series includes Tina's patch steming from
> [1] "[RFC PATCH v2 1/3] vfio: Use capability chains to handle device
> specific irq" plus patches originally contributed by Yi.
> 
> History:
> 
> v8 -> v9:
> - introduce specific irq framework
> - single fault region
> - iommu_unregister_device_fault_handler failure case not handled
>   yet.
> 
> v7 -> v8:
> - rebase on top of v5.2-rc1 and especially
>   8be39a1a04c1  iommu/arm-smmu-v3: Add a master->domain pointer
> - dynamic alloc of s1_cfg/s2_cfg
> - __arm_smmu_tlb_inv_asid/s1_range_nosync
> - check there is no HW MSI regions
> - asid invalidation using pasid extended struct (change in the uapi)
> - add s1_live/s2_live checks
> - move check about support of nested stages in domain finalise
> - fixes in error reporting according to the discussion with Robin
> - reordered the patches to have first iommu/smmuv3 patches and then
>   VFIO patches
> 
> v6 -> v7:
> - removed device handle from bind/unbind_guest_msi
> - added "iommu/smmuv3: Nested mode single MSI doorbell per domain
>   enforcement"
>

RE: [PATCH] KVM: arm/arm64: Set ICH_HCR_EN in guest anyway when using gicv4

2019-03-13 Thread Shameerali Kolothum Thodi

Hi Marc,

> -Original Message-
> From: kvmarm-boun...@lists.cs.columbia.edu
> [mailto:kvmarm-boun...@lists.cs.columbia.edu] On Behalf Of Marc Zyngier
> Sent: 13 March 2019 11:59
> To: Tangnianyao (ICT) ; Christoffer Dall
> ; james.mo...@arm.com; julien.thie...@arm.com;
> suzuki.poul...@arm.com; catalin.mari...@arm.com; will.dea...@arm.com;
> alex.ben...@linaro.org; mark.rutl...@arm.com; andre.przyw...@arm.com;
> Zhangshaokun ; keesc...@chromium.org;
> linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org
> Subject: Re: [PATCH] KVM: arm/arm64: Set ICH_HCR_EN in guest anyway when
> using gicv4
> 
> On 12/03/2019 15:48, Marc Zyngier wrote:
> > Nianyao,
> >
> > Please do not send patches as HTML. Or any email as HTML. Please make
> > sure that you only send plain text emails on any mailing list (see point
> > #6 in Documentation/process/submitting-patches.rst).
> >
> > On 12/03/2019 12:22, Tangnianyao (ICT) wrote:
> >> In gicv4, direct vlpi may be forward to PE without using LR or ap_list. If
> >>
> >> ICH_HCR_EL2.En is 0 in guest, direct vlpi cannot be accepted by PE.
> >>
> >> It will take long time for direct vlpi to be forwarded in some cases.
> >>
> >> Direct vlpi has to wait for ICH_HCR_EL2.En=1 where some other interrupts
> >>
> >> using LR need to be forwarded, because in kvm_vgic_flush_hwstate,
> >>
> >> if ap_list is empty, it will return quickly not setting ICH_HCR_EL2.En.
> >>
> >> To fix this, set ICH_HCR_EL2.En to 1 before returning to guest when
> >>
> >> using GICv4.
> >>
> >>
> >>
> >> Signed-off-by: Nianyao Tang 
> >>
> >> ---
> >>
> >> arch/arm64/include/asm/kvm_asm.h |  1 +
> >>
> >> virt/kvm/arm/hyp/vgic-v3-sr.c    | 10 ++
> >>
> >> virt/kvm/arm/vgic/vgic-v4.c  |  8 
> >>
> >> 3 files changed, 19 insertions(+)
> >>
> >>
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_asm.h
> >> b/arch/arm64/include/asm/kvm_asm.h
> >>
> >> index f5b79e9..0581c4d 100644
> >>
> >> --- a/arch/arm64/include/asm/kvm_asm.h
> >>
> >> +++ b/arch/arm64/include/asm/kvm_asm.h
> >>
> >> @@ -79,6 +79,7 @@
> >>
> >> extern void __vgic_v3_init_lrs(void);
> >>
> >>  extern u32 __kvm_get_mdcr_el2(void);
> >>
> >> +extern void __vgic_v3_write_hcr(u32 vmcr);
> >>
> >>  /* Home-grown __this_cpu_{ptr,read} variants that always work at HYP */
> >>
> >> #define
> >>
> __hyp_this_cpu_ptr(sym)
> 
> >> \
> >>
> >> diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c
> >>
> >> index 264d92d..12027af 100644
> >>
> >> --- a/virt/kvm/arm/hyp/vgic-v3-sr.c
> >>
> >> +++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
> >>
> >> @@ -434,6 +434,16 @@ void __hyp_text __vgic_v3_write_vmcr(u32 vmcr)
> >>
> >>    write_gicreg(vmcr, ICH_VMCR_EL2);
> >>
> >> }
> >>
> >> +u64 __hyp_text __vgic_v3_read_hcr(void)
> >>
> >> +{
> >>
> >> +   return read_gicreg(ICH_HCR_EL2);
> >>
> >> +}
> >>
> >> +
> >>
> >> +void __hyp_text __vgic_v3_write_hcr(u32 vmcr)
> >>
> >> +{
> >>
> >> +   write_gicreg(vmcr, ICH_HCR_EL2);
> >>
> >> +}
> >
> > This is HYP code...
> >
> >>
> >> +
> >>
> >> #ifdef CONFIG_ARM64
> >>
> >>  static int __hyp_text __vgic_v3_bpr_min(void)
> >>
> >> diff --git a/virt/kvm/arm/vgic/vgic-v4.c b/virt/kvm/arm/vgic/vgic-v4.c
> >>
> >> index 1ed5f22..515171a 100644
> >>
> >> --- a/virt/kvm/arm/vgic/vgic-v4.c
> >>
> >> +++ b/virt/kvm/arm/vgic/vgic-v4.c
> >>
> >> @@ -208,6 +208,8 @@ int vgic_v4_sync_hwstate(struct kvm_vcpu *vcpu)
> >>
> >>    if (!vgic_supports_direct_msis(vcpu->kvm))
> >>
> >>     return 0;
> >>
> >> +   __vgic_v3_write_hcr(vcpu->arch.vgic_cpu.vgic_v3.vgic_hcr &
> >> ~ICH_HCR_EN);
> >
> > And you've now crashed your non-VHE system by branching directly to code
> > that cannot be executed at EL1.
> >
> >>
> >> +
> >>
> >>    return its_schedule_vpe(>arch.vgic_cpu.vgic_v3.its_vpe,
> false);
> >>
> >> }
> >>
> >> @@ -220,6 +222,12 @@ int vgic_v4_flush_hwstate(struct kvm_vcpu *vcpu)
> >>
> >>     return 0;
> >>
> >>     /*
> >>
> >> +   * Enable ICH_HCR_EL.En so that guest can accpet and handle
> direct
> >>
> >> +   * vlpi.
> >>
> >> +   */
> >>
> >> +   __vgic_v3_write_hcr(vcpu->arch.vgic_cpu.vgic_v3.vgic_hcr);
> >>
> >> +
> >>
> >> +   /*
> >>
> >>     * Before making the VPE resident, make sure the redistributor
> >>
> >>     * corresponding to our current CPU expects us here. See the
> >>
> >>     * doc in drivers/irqchip/irq-gic-v4.c to understand how this
> >>
> >> --
> >>
> >> 1.9.1
> >>
> >>
> >>
> >>
> >>
> >
> > Overall, this looks like the wrong approach. It is not the GICv4 code's
> > job to do this, but the low-level code (either the load/put code for VHE
> > or the save/restore code for non-VHE).
> >
> > It would certainly help if you could describe which context you're in
> > (VHE, non-VHE). I suppose the former...
> 
> Can you please give the following patch a go? I can't test it, but hopefully
> you can.

Thanks for your quick response. I just did a

RE: [PATCH] KVM: arm/arm64: vgic: Don't notify EOI for non-SPIs

2016-12-01 Thread Shameerali Kolothum Thodi



> -Original Message-
> From: Marc Zyngier [mailto:marc.zyng...@arm.com]
> Sent: Thursday, December 01, 2016 10:58 AM
> To: Shameerali Kolothum Thodi; Paolo Bonzini; Radim Krčmář
> Cc: Catalin Marinas; kvmarm@lists.cs.columbia.edu; linux-arm-
> ker...@lists.infradead.org; k...@vger.kernel.org
> Subject: Re: [PATCH] KVM: arm/arm64: vgic: Don't notify EOI for non-
> SPIs
> 
> On 01/12/16 10:28, Shameerali Kolothum Thodi wrote:
> > Hi Marc,
> >
> >> -Original Message-
> >> From: kvmarm-boun...@lists.cs.columbia.edu [mailto:kvmarm-
> >> boun...@lists.cs.columbia.edu] On Behalf Of Marc Zyngier
> >> Sent: Thursday, December 01, 2016 9:26 AM
> >> To: Paolo Bonzini; Radim Krčmář
> >> Cc: Catalin Marinas; kvmarm@lists.cs.columbia.edu; linux-arm-
> >> ker...@lists.infradead.org; k...@vger.kernel.org
> >> Subject: [PATCH] KVM: arm/arm64: vgic: Don't notify EOI for non-SPIs
> >>
> >> When we inject a level triggerered interrupt (and unless it is
> backed
> >> by the physical distributor - timer style), we request a maintenance
> >> interrupt. Part of the processing for that interrupt is to feed to
> the
> >> rest of KVM (and to the eventfd subsystem) the information that the
> >> interrupt has been EOIed.
> >>
> >> But that notification only makes sense for SPIs, and not PPIs (such
> as
> >> the PMU interrupt). Skip over the notification if the interrupt is
> not
> >> an SPI.
> >
> > Just to clarify my understanding, the maintenance interrupt is
> generated
> > for cases where there is no mapping of virt to phys interrupts
> > (ie, ICH_LR HW bit is not set). And I was under the impression that
> > kvm_notify_acked_irq will eventually deactivate the interrupt on
> distributor
> > for such cases. Its not clear to me how the deactivation is done
> > otherwise.
> >
> > Could you please help me to understand this better.
> 
> kvm_notify_acked_irq() doesn't do *anything* at the distributor level,
> ever (it has no idea of anything GIC-specific anyway). It's sole job is
> to signal the rest of the stack that an interrupt has been EOIed in the
> guest.

Thanks Marc. Understood. I got confused by the kvm_set_irq in the 
kvm_notify_acked_irq path. 

> For these interrupts, which are purely virtual, there is absolutely
> nothing to do at the physical distributor level anyway. Furthermore,
> kvm_notify_acked_irq doesn't know about per-cpu interrupt, which is why
> we cannot notify them.

Just to confirm, that means for any phys interrupt(PPI/SPI) to be injected
to the Guest, the mapping bit has(HW bit set) to be used.

Thanks,
Shameer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

RE: [PATCH] KVM: arm/arm64: vgic: Don't notify EOI for non-SPIs

2016-12-01 Thread Shameerali Kolothum Thodi

Hi Marc,

> -Original Message-
> From: kvmarm-boun...@lists.cs.columbia.edu [mailto:kvmarm-
> boun...@lists.cs.columbia.edu] On Behalf Of Marc Zyngier
> Sent: Thursday, December 01, 2016 9:26 AM
> To: Paolo Bonzini; Radim Krčmář
> Cc: Catalin Marinas; kvmarm@lists.cs.columbia.edu; linux-arm-
> ker...@lists.infradead.org; k...@vger.kernel.org
> Subject: [PATCH] KVM: arm/arm64: vgic: Don't notify EOI for non-SPIs
> 
> When we inject a level triggerered interrupt (and unless it is backed
> by the physical distributor - timer style), we request a maintenance
> interrupt. Part of the processing for that interrupt is to feed to the
> rest of KVM (and to the eventfd subsystem) the information that the
> interrupt has been EOIed.
> 
> But that notification only makes sense for SPIs, and not PPIs (such as
> the PMU interrupt). Skip over the notification if the interrupt is not
> an SPI.

Just to clarify my understanding, the maintenance interrupt is generated 
for cases where there is no mapping of virt to phys interrupts
(ie, ICH_LR HW bit is not set). And I was under the impression that
kvm_notify_acked_irq will eventually deactivate the interrupt on distributor
for such cases. Its not clear to me how the deactivation is done
otherwise.

Could you please help me to understand this better.

Thanks,
Shameer


> Cc: sta...@vger.kernel.org # 4.7+
> Fixes: 140b086dd197 ("KVM: arm/arm64: vgic-new: Add GICv2 world switch
> backend")
> Fixes: 59529f69f504 ("KVM: arm/arm64: vgic-new: Add GICv3 world switch
> backend")
> Reported-by: Catalin Marinas 
> Tested-by: Catalin Marinas 
> Acked-by: Christoffer Dall 
> Signed-off-by: Marc Zyngier 
> ---
>  virt/kvm/arm/vgic/vgic-v2.c | 6 --
>  virt/kvm/arm/vgic/vgic-v3.c | 6 --
>  2 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
> index 0a063af..9bab867 100644
> --- a/virt/kvm/arm/vgic/vgic-v2.c
> +++ b/virt/kvm/arm/vgic/vgic-v2.c
> @@ -50,8 +50,10 @@ void vgic_v2_process_maintenance(struct kvm_vcpu
> *vcpu)
> 
>   WARN_ON(cpuif->vgic_lr[lr] & GICH_LR_STATE);
> 
> - kvm_notify_acked_irq(vcpu->kvm, 0,
> -  intid - VGIC_NR_PRIVATE_IRQS);
> + /* Only SPIs require notification */
> + if (vgic_valid_spi(vcpu->kvm, intid))
> + kvm_notify_acked_irq(vcpu->kvm, 0,
> +  intid - 
> VGIC_NR_PRIVATE_IRQS);
>   }
>   }
> 
> diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
> index 9f0dae3..5c9f974 100644
> --- a/virt/kvm/arm/vgic/vgic-v3.c
> +++ b/virt/kvm/arm/vgic/vgic-v3.c
> @@ -41,8 +41,10 @@ void vgic_v3_process_maintenance(struct kvm_vcpu
> *vcpu)
> 
>   WARN_ON(cpuif->vgic_lr[lr] & ICH_LR_STATE);
> 
> - kvm_notify_acked_irq(vcpu->kvm, 0,
> -  intid - VGIC_NR_PRIVATE_IRQS);
> + /* Only SPIs require notification */
> + if (vgic_valid_spi(vcpu->kvm, intid))
> + kvm_notify_acked_irq(vcpu->kvm, 0,
> +  intid - 
> VGIC_NR_PRIVATE_IRQS);
>   }
> 
>   /*
> --
> 2.1.4
> 
> ___
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

48 matches

Mail list logo