Re: I{S,C}ACTIVER implementation question

2020-04-10 Thread Julien Grall




On 06/04/2020 16:14, Marc Zyngier wrote:

Hi Julien,


Hi Marc,



Thanks for the heads up.

On 2020-04-06 14:16, Julien Grall wrote:

Hi,

The Xen community is currently reviewing a new implementation for reading
the I{S,C}ACTIVER registers (see [1]).

The implementation is based on vgic_mmio_read_active() in KVM, i.e. the
active state of the interrupts is based on the vGIC state stored in
memory.

While reviewing the patch on xen-devel, I noticed a potential deadlock,
at least with the Xen implementation. I know that the Xen vGIC and KVM
vGIC are quite different, so I looked at the KVM implementation to see
how this is dealt with.

With my limited knowledge of KVM, I wasn't able to rule it out. I am
curious to know if I missed anything.

vCPU A may read the active state of an interrupt routed to vCPU B.
When vCPU A is reading the state, it will read the state stored in
memory.

The only way the memory state can get synced with the HW state is when
vCPU B exits guest context.

AFAICT, vCPU B will not exit when deactivating HW-mapped interrupts
and virtual edge interrupts. So vCPU B may run for an arbitrarily long
time before exiting and syncing the memory state with the HW state.


So while I agree that this is definitely not ideal, I don't think we end up
with a deadlock (or rather a livelock) either. That's because we are
guaranteed to exit eventually, if only because the kernel's own timer
interrupt (or any other host interrupt routed to the same physical CPU)
will fire and get us out of there. On its own, this is enough to allow
the polling vCPU to make forward progress.


That's a good point. I think in Xen we can't rely on this because in 
some setups (such as a pCPU dedicated to a vCPU) there will be close 
to zero host interrupts (the timer is only used for scheduling).




Now, it is obvious that we should improve on the current situation. I just
hacked together a patch that provides the same guarantees as the one we
already have on the write side (kick all vcpus out of the guest, snapshot
the state, kick everyone back in). I boot-tested it, so it is obviously
perfect and won't eat your data at all! ;-)


Thank you for the patch! This is similar to what I had in mind.
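
For the archive, here is a rough sketch of the kick-and-snapshot idea
described above. It is untested and the helper names are made up for
illustration only; it is not taken from Marc's patch:

/*
 * Illustrative only: force every vCPU out of the guest so the in-memory
 * vGIC state is resynchronised with the HW active state, snapshot the
 * active bits, then let the vCPUs resume.
 */
static u32 snapshot_isactiver(struct kvm *kvm, u32 first_intid)
{
	u32 val = 0;
	unsigned int i;

	kick_all_vcpus_and_wait(kvm);		/* hypothetical helper */

	for (i = 0; i < 32; i++) {
		if (vgic_irq_is_active(kvm, first_intid + i))	/* hypothetical helper */
			val |= BIT(i);
	}

	resume_all_vcpus(kvm);			/* hypothetical helper */

	return val;
}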

Cheers,

--
Julien Grall


I{S,C}ACTIVER implementation question

2020-04-06 Thread Julien Grall

Hi,

The Xen community is currently reviewing a new implementation for reading 
the I{S,C}ACTIVER registers (see [1]).


The implementation is based on vgic_mmio_read_active() in KVM, i.e. the 
active state of the interrupts is based on the vGIC state stored in memory.


While reviewing the patch on xen-devel, I noticed a potential deadlock, 
at least with the Xen implementation. I know that the Xen vGIC and KVM 
vGIC are quite different, so I looked at the KVM implementation to see 
how this is dealt with.


With my limited knowledge of KVM, I wasn't able to rule it out. I am 
curious to know if I missed anything.


vCPU A may read the active state of an interrupt routed to vCPU B. When 
vCPU A is reading the state, it will read the state stored in memory.


The only way the memory state can get synced with the HW state is when 
vCPU B exits guest context.


AFAICT, vCPU B will not exit when deactivating HW-mapped interrupts and 
virtual edge interrupts. So vCPU B may run for an arbitrarily long time 
before exiting and syncing the memory state with the HW state.


Looking at Linux (5.4 and onwards) use of the active state, vCPU A would 
loop until the interrupt is not active anymore. So wouldn't the task on 
vCPU A be blocked for an arbitrarily long time?
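
To make the concern concrete, the kind of loop in question looks roughly
like the sketch below (loosely modelled on the synchronize_irq() /
irq_get_irqchip_state() path, not a verbatim copy of the Linux code);
every iteration ends up trapping on the GICD_ISACTIVERn read and only
ever sees the stale in-memory state:

#include <linux/interrupt.h>

/* Illustrative sketch only. */
static void wait_until_irq_inactive(unsigned int irq)
{
	bool active;

	do {
		/* The GIC driver reads GICD_ISACTIVERn here, which traps. */
		if (irq_get_irqchip_state(irq, IRQCHIP_STATE_ACTIVE, &active))
			break;		/* state not retrievable, give up */
		cpu_relax();
	} while (active);
}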


Cheers,

[1] 
https://lists.xenproject.org/archives/html/xen-devel/2020-03/msg01844.html


--
Julien Grall


Re: [PATCH v3] KVM: arm: VGIC: properly initialise private IRQ affinity

2019-08-23 Thread Julien Grall

Hi Andre,

On 23/08/2019 11:34, Andre Przywara wrote:

At the moment we initialise the target *mask* of a virtual IRQ to the
VCPU it belongs to, even though this mask is only defined for GICv2 and
quickly runs out of bits for many GICv3 guests.
This behaviour triggers an UBSAN complaint for more than 32 VCPUs:
--
[ 5659.462377] UBSAN: Undefined behaviour in 
virt/kvm/arm/vgic/vgic-init.c:223:21
[ 5659.471689] shift exponent 32 is too large for 32-bit type 'unsigned int'
--
Also for GICv3 guests the reporting of TARGET in the "vgic-state" debugfs
dump is wrong, due to this very same problem.

Because there is no requirement to create the VGIC device before the
VCPUs (and QEMU actually does it the other way round), we can't safely
initialise mpidr or targets in kvm_vgic_vcpu_init(). But since we touch
every private IRQ for each VCPU anyway later (in vgic_init()), we can
just move the initialisation of those fields into there, where we
definitely know the VGIC type.

On the way make sure we really have either a VGICv2 or a VGICv3 device,
since the existing code is just checking for "VGICv3 or not", silently
ignoring the uninitialised case.

Signed-off-by: Andre Przywara 
Reported-by: Dave Martin 


I have tested with combinations of GICv2/GICv3 and kvmtool/QEMU. I can 
confirm the UBSAN warning is not present anymore. Feel free to add my tested-by:


Tested-by: Julien Grall 

Cheers,

--
Julien Grall


Re: [RESEND PATCH] KVM: arm: VGIC: properly initialise private IRQ affinity

2019-08-21 Thread Julien Grall

Hi Andre,

On 21/08/2019 18:00, Andre Przywara wrote:

At the moment we initialise the target *mask* of a virtual IRQ to the
VCPU it belongs to, even though this mask is only defined for GICv2 and
quickly runs out of bits for many GICv3 guests.
This behaviour triggers an UBSAN complaint for more than 32 VCPUs:
--
[ 5659.462377] UBSAN: Undefined behaviour in 
virt/kvm/arm/vgic/vgic-init.c:223:21
[ 5659.471689] shift exponent 32 is too large for 32-bit type 'unsigned int'
--
Also for GICv3 guests the reporting of TARGET in the "vgic-state" debugfs
dump is wrong, due to this very same problem.

Fix both issues by only initialising vgic_irq->targets for a vGICv2 guest,
and by initialising vgic_irq->mpidr for vGICv3 guests instead. We can't
use the actual MPIDR for that, as the VCPU's system register is not
initialised at this point yet. This is not really an issue, as ->mpidr
is just used for the debugfs output and the IROUTER MMIO register, which
does not exist in redistributors (dealing with SGIs and PPIs).

Signed-off-by: Andre Przywara 
Reported-by: Dave Martin 


Tested-by: Julien Grall 

Cheers,


---
Hi,

this came up here again, I think it fell through the cracks back in
March:
http://lists.infradead.org/pipermail/linux-arm-kernel/2019-March/637209.html

Cheers,
Andre.

  virt/kvm/arm/vgic/vgic-init.c | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 80127ca9269f..8bce2f75e0c1 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -210,7 +210,6 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
irq->intid = i;
irq->vcpu = NULL;
irq->target_vcpu = vcpu;
-   irq->targets = 1U << vcpu->vcpu_id;
kref_init(&irq->refcount);
if (vgic_irq_is_sgi(i)) {
/* SGIs */
@@ -221,10 +220,14 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
irq->config = VGIC_CONFIG_LEVEL;
}
  
-		if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)

+   if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
irq->group = 1;
-   else
+   /* The actual MPIDR is not initialised at this point. */
+   irq->mpidr = 0;
+   } else {
irq->group = 0;
+   irq->targets = 1U << vcpu->vcpu_id;
+       }
}
  
  	if (!irqchip_in_kernel(vcpu->kvm))




--
Julien Grall


Re: KVM Arm64 and Linux-RT issues

2019-08-20 Thread Julien Grall

Hi Sebastian,

On 19/08/2019 08:33, Sebastian Andrzej Siewior wrote:

On 2019-08-16 17:32:38 [+0100], Julien Grall wrote:

Hi Sebastian,

Hi Julien,


hrtimer_callback_running() will be returning true as the callback is
running somewhere else. This means hrtimer_try_to_cancel()
would return -1. Therefore hrtimer_grab_expiry_lock() would
be called.

Did I miss anything?


nope, you are right. I assumed that we had code to deal with this but
didn't find it…

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 7d7db88021311..40d83c709503e 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -934,7 +934,7 @@ void hrtimer_grab_expiry_lock(const struct hrtimer *timer)
  {
struct hrtimer_clock_base *base = timer->base;
  
-	if (base && base->cpu_base) {

+   if (base && base->cpu_base && base->index < MASK_SHIFT) {


Lower indexes are used for hard interrupts. So this would need to be 
base->index >= MASK_SHIFT.


But I was wondering whether checking timer->is_soft would make the code more 
readable?


While investigating how this is meant to work, I noticed a few other things.

timer->base could potentially change under our feet at any point in time (we 
don't hold any lock). So it would be valid to have base == migration_base.


migration_cpu_base does not have softirq_expiry_lock initialized. So we would 
end up using an uninitialized lock. Note that migration_base->index is always 
0, so the check base->index >= MASK_SHIFT would hide it.


Alternatively, we could initialize the spin lock for migration_cpu_base, 
avoiding relying on a side effect of the check.
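
For what it's worth, an untested sketch of that alternative (the initializer
is abbreviated here, and whether __SPIN_LOCK_UNLOCKED() is usable for the
RT spinlock_t is an assumption) would be:

/*
 * Statically initialize the expiry lock so hrtimer_grab_expiry_lock()
 * can never see it uninitialized, even for migration_cpu_base.
 */
static struct hrtimer_cpu_base migration_cpu_base = {
	.softirq_expiry_lock =
		__SPIN_LOCK_UNLOCKED(migration_cpu_base.softirq_expiry_lock),
	.clock_base = { {
		.cpu_base = &migration_cpu_base,
	}, },
};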


Another potential issue is that the compiler is free to reload timer->base at 
any time. So I think we want an ACCESS_ONCE(...).


Lastly, timer->base cannot be NULL. From the comment on top of 
migration_cpu_base, timer->base->cpu_base will not be NULL either.


So I think the function can be reworked as:

void hrtimer_grab_expiry_lock(const struct hrtimer *timer)
{
	struct hrtimer_clock_base *base = ACCESS_ONCE(timer->base);

	if (timer->is_soft && base != &migration_base) {
		spin_lock(&base->cpu_base->softirq_expiry_lock);
		spin_unlock(&base->cpu_base->softirq_expiry_lock);
	}
}



spin_lock(&base->cpu_base->softirq_expiry_lock);
spin_unlock(&base->cpu_base->softirq_expiry_lock);
}

This should deal with it.


Cheers,


Sebastian



--
Julien Grall


Re: KVM Arm64 and Linux-RT issues

2019-08-16 Thread Julien Grall
Hi Sebastian,

On 16/08/2019 16:23, Sebastian Andrzej Siewior wrote:
> On 2019-08-16 16:18:20 [+0100], Julien Grall wrote:
>> Sadly, I managed to hit the same BUG_ON() today with this patch
>> applied on top of v5.2-rt1-rebase. :/ Although, it is more difficult
>> to hit than previously.
>>
>> [  157.449545] 000: BUG: sleeping function called from invalid context at 
>> kernel/locking/rtmutex.c:968
>> [  157.449569] 000: in_atomic(): 1, irqs_disabled(): 0, pid: 990, name: 
>> kvm-vcpu-1
>> [  157.449579] 000: 2 locks held by kvm-vcpu-1/990:
>> [  157.449592] 000:  #0: c2fc8217 (&vcpu->mutex){+.+.}, at: 
>> kvm_vcpu_ioctl+0x70/0xae0
>> [  157.449638] 000:  #1: 96863801 
>> (&cpu_base->softirq_expiry_lock){+.+.}, at: 
>> hrtimer_grab_expiry_lock+0x24/0x40
>> [  157.449677] 000: Preemption disabled at:
>> [  157.449679] 000: [] schedule+0x30/0xd8
>> [  157.449702] 000: CPU: 0 PID: 990 Comm: kvm-vcpu-1 Tainted: GW 
>> 5.2.0-rt1-1-gd368139e892f #104
>> [  157.449712] 000: Hardware name: ARM LTD ARM Juno Development Platform/ARM 
>> Juno Development Platform, BIOS EDK II Jan 23 2017
>> [  157.449718] 000: Call trace:
>> [  157.449722] 000:  dump_backtrace+0x0/0x130
>> [  157.449730] 000:  show_stack+0x14/0x20
>> [  157.449738] 000:  dump_stack+0xbc/0x104
>> [  157.449747] 000:  ___might_sleep+0x198/0x238
>> [  157.449756] 000:  rt_spin_lock+0x5c/0x70
>> [  157.449765] 000:  hrtimer_grab_expiry_lock+0x24/0x40
>> [  157.449773] 000:  hrtimer_cancel+0x1c/0x38
>> [  157.449780] 000:  kvm_timer_vcpu_load+0x78/0x3e0
> 
> …
>> I will do some debug and see what I can find.
> 
> which timer is this? Is there another one?

It looks like the timer is the background timer (bg_timer),
although the BUG() seems to happen with the other ones as
well, just less often. All of them have already been converted.

Interestingly, hrtimer_grab_expiry_lock() may be called for a
timer even if is_soft (I assume this means the softirq will
not be used) is 0.

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 7d7db8802131..fe05e553dea2 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -934,6 +934,9 @@ void hrtimer_grab_expiry_lock(const struct hrtimer *timer)
 {
struct hrtimer_clock_base *base = timer->base;
 
+   WARN(!preemptible(), "is_soft %u base %p base->cpu_base %p\n",
+timer->is_soft, base, base ? base->cpu_base : NULL);
+
if (base && base->cpu_base) {
spin_lock(&base->cpu_base->softirq_expiry_lock);
spin_unlock(&base->cpu_base->softirq_expiry_lock);

[  576.291886] 004: is_soft 0 base 80097eed44c0 base->cpu_base 
80097eed4380

Because the hrtimer is started when scheduling out the
vCPU and canceled when scheduling it back in, there is no
guarantee the hrtimer will be running on the same pCPU.
So I think the following can happen:

CPU0                                          |  CPU1
                                              |
                                              |  hrtimer_interrupt()
                                              |    raw_spin_lock_irqsave(&cpu_base->lock)
hrtimer_cancel()                              |    __hrtimer_run_queues()
  hrtimer_try_to_cancel()                     |      __run_hrtimer()
    lock_hrtimer_base()                       |        base->running = timer;
                                              |        raw_spin_unlock_irqrestore(&cpu_base->lock)
      raw_spin_lock_irqsave(&cpu_base->lock)  |        fn(timer);
        hrtimer_callback_running()            |

hrtimer_callback_running() will be returning true as the callback is
running somewhere else. This means hrtimer_try_to_cancel()
would return -1. Therefore hrtimer_grab_expiry_lock() would
be called.

Did I miss anything?

Cheers,

-- 
Julien Grall


Re: KVM Arm64 and Linux-RT issues

2019-08-16 Thread Julien Grall
Hi all,

On 13/08/2019 17:24, Marc Zyngier wrote:
> On Tue, 13 Aug 2019 16:44:21 +0100,
> Julien Grall  wrote:
>>
>> Hi Sebastian,
>>
>> On 8/13/19 1:58 PM, bige...@linutronix.de wrote:
>>> On 2019-07-27 14:37:11 [+0100], Julien Grall wrote:
>>>>>> 8<
>>>>>> --- a/virt/kvm/arm/arch_timer.c
>>>>>> +++ b/virt/kvm/arm/arch_timer.c
>>>>>> @@ -80,7 +80,7 @@ static inline bool userspace_irqchip(str
>>>>>> static void soft_timer_start(struct hrtimer *hrt, u64 ns)
>>>>>> {
>>>>>>  hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
>>>>>> -  HRTIMER_MODE_ABS);
>>>>>> +  HRTIMER_MODE_ABS_HARD);
>>>>>> }
>>>>>
>>>>> That's pretty neat, and matches the patch you already have for
>>>>> x86. Feel free to add my
>>>>>
>>>>> Acked-by: Marc Zyngier 
>>>>
>>>> I can confirm the warning now disappeared. Feel free to add my tested-by:
>>>>
>>>> Tested-by: Julien Grall 
>>>>
>>>
>>> |kvm_hrtimer_expire()
>>> | kvm_timer_update_irq()
>>> |   kvm_vgic_inject_irq()
>>> | vgic_lazy_init()
>>> |if (unlikely(!vgic_initialized(kvm))) {
>>> | if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
>>> | return -EBUSY;
>>> |
>>> | mutex_lock(&kvm->lock);
>>>
>>> Is this possible path of any concern? This should throw a warning also
>>> for !RT so probably not…
>>
>> Hmmm, theoretically yes. In practice, it looks like the hrtimer will
>> not be started before kvm_vcpu_first_run_init() is called on the first
>> run.
> 
> Exactly. Even if you restore the timer in a "firing" configuration,
> you'll have to execute the vgic init before any background timer gets
> programmed, let alone expired.
> 
> Yes, the interface is terrible.
> 
>> The function will call kvm_vgic_map_resources() which will initialize
>> the vgic if not already done.
>>
>> Looking around, I think this is here to cater for the case where
>> KVM_IRQ_LINE is called before running.
>>
>> I am not yet familiar with the vgic, so I may have missed something.
>>
>>>
>>> I prepared the patch below. This one could go straight to tglx's timer tree
>>> since he has the _HARD bits there. I *think* it requires setting the _HARD
>>> bits during _init() and _start(), otherwise there is (or was) a warning…
>>>
>>> Sebastian
>>> 8<
>>>
>>> From: Thomas Gleixner 
>>> Date: Tue, 13 Aug 2019 14:29:41 +0200
>>> Subject: [PATCH] KVM: arm/arm64: Let the timer expire in hardirq context on 
>>> RT
>>>
>>> The timers are canceled from a preempt-notifier which is invoked with
>>> disabled preemption, which is not allowed on PREEMPT_RT.
>>> The timer callback is short so it could be invoked in hard-IRQ context
>>> on -RT.
>>>
>>> Let the timer expire on hard-IRQ context even on -RT.
>>>
>>> Signed-off-by: Thomas Gleixner 
>>> Acked-by: Marc Zyngier 
>>> Tested-by: Julien Grall 
>>> Signed-off-by: Sebastian Andrzej Siewior 
>>> ---
>>>virt/kvm/arm/arch_timer.c | 8 
>>>1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>>> index 1be486d5d7cb4..0bfa7c5b5c890 100644
>>> --- a/virt/kvm/arm/arch_timer.c
>>> +++ b/virt/kvm/arm/arch_timer.c
>>> @@ -80,7 +80,7 @@ static inline bool userspace_irqchip(struct kvm *kvm)
>>>static void soft_timer_start(struct hrtimer *hrt, u64 ns)
>>>{
>>> hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
>>> - HRTIMER_MODE_ABS);
>>> + HRTIMER_MODE_ABS_HARD);
>>>}
>>>  static void soft_timer_cancel(struct hrtimer *hrt)
>>> @@ -697,11 +697,11 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
>>> update_vtimer_cntvoff(vcpu, kvm_phys_timer_read());
>>> ptimer->cntvoff = 0;
>>> -	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
>>> +	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
>>> timer->bg_timer.function = kvm_bg_t

Re: KVM Arm64 and Linux-RT issues

2019-08-13 Thread Julien Grall

Hi Sebastian,

On 8/13/19 1:58 PM, bige...@linutronix.de wrote:

On 2019-07-27 14:37:11 [+0100], Julien Grall wrote:

8<
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -80,7 +80,7 @@ static inline bool userspace_irqchip(str
   static void soft_timer_start(struct hrtimer *hrt, u64 ns)
   {
hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
- HRTIMER_MODE_ABS);
+ HRTIMER_MODE_ABS_HARD);
   }


That's pretty neat, and matches the patch you already have for
x86. Feel free to add my

Acked-by: Marc Zyngier 


I can confirm the warning now disappeared. Feel free to add my tested-by:

Tested-by: Julien Grall 



|kvm_hrtimer_expire()
| kvm_timer_update_irq()
|   kvm_vgic_inject_irq()
| vgic_lazy_init()
|if (unlikely(!vgic_initialized(kvm))) {
| if (kvm->arch.vgic.vgic_model != KVM_DEV_TYPE_ARM_VGIC_V2)
| return -EBUSY;
|
| mutex_lock(&kvm->lock);

Is this possible path of any concern? This should throw a warning also
for !RT so probably not…


Hmmm, theoretically yes. In practice, it looks like the hrtimer will not 
be started before kvm_vcpu_first_run_init() is called on the first run.


The function will call kvm_vgic_map_resources() which will initialize 
the vgic if not already done.


Looking around, I think this is here to cater for the case where 
KVM_IRQ_LINE is called before running.


I am not yet familiar with the vgic, so I may have missed something.



I prepared the patch below. This one could go straight to tglx's timer tree
since he has the _HARD bits there. I *think* it requires setting the _HARD
bits during _init() and _start(), otherwise there is (or was) a warning…

Sebastian
8<

From: Thomas Gleixner 
Date: Tue, 13 Aug 2019 14:29:41 +0200
Subject: [PATCH] KVM: arm/arm64: Let the timer expire in hardirq context on RT

The timers are canceled from a preempt-notifier which is invoked with
disabled preemption, which is not allowed on PREEMPT_RT.
The timer callback is short so it could be invoked in hard-IRQ context
on -RT.

Let the timer expire on hard-IRQ context even on -RT.

Signed-off-by: Thomas Gleixner 
Acked-by: Marc Zyngier 
Tested-by: Julien Grall 
Signed-off-by: Sebastian Andrzej Siewior 
---
  virt/kvm/arm/arch_timer.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 1be486d5d7cb4..0bfa7c5b5c890 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -80,7 +80,7 @@ static inline bool userspace_irqchip(struct kvm *kvm)
  static void soft_timer_start(struct hrtimer *hrt, u64 ns)
  {
hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
- HRTIMER_MODE_ABS);
+ HRTIMER_MODE_ABS_HARD);
  }
  
  static void soft_timer_cancel(struct hrtimer *hrt)

@@ -697,11 +697,11 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
 	update_vtimer_cntvoff(vcpu, kvm_phys_timer_read());
 	ptimer->cntvoff = 0;
 
-	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
 	timer->bg_timer.function = kvm_bg_timer_expire;
 
-	hrtimer_init(&vtimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
-	hrtimer_init(&ptimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+	hrtimer_init(&vtimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
+	hrtimer_init(&ptimer->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
 	vtimer->hrtimer.function = kvm_hrtimer_expire;
 	ptimer->hrtimer.function = kvm_hrtimer_expire;
  



--
Julien Grall


Re: KVM Arm64 and Linux-RT issues

2019-07-27 Thread Julien Grall

Hi,

On 7/27/19 12:13 PM, Marc Zyngier wrote:

On Fri, 26 Jul 2019 23:58:38 +0100,
Thomas Gleixner  wrote:


On Wed, 24 Jul 2019, Marc Zyngier wrote:

On 23/07/2019 18:58, Julien Grall wrote:
It really feels like a change in hrtimer_cancel semantics. From what I
understand, this is used to avoid racing against the softirq, but boy it
breaks things.

If this cannot be avoided, this means we can't cancel the background
timer (which is used to emulate the vcpu timer while it is blocked
waiting for an interrupt), then we must move this canceling to the point
where the vcpu is unblocked (instead of scheduled), which may have some
side effects -- I'll have a look.

But that's not the only problem: We also have hrtimers used to emulate
timers while the vcpu is running, and these timers are canceled in
kvm_timer_vcpu_put(), which is also called from a preempt notifier.
Unfortunately, I don't have a reasonable solution for that (other than
putting this hrtimer_cancel in a workqueue and start chasing the
resulting races).


The fix is simple. See below. We'll add that to the next RT release. That
will take a while as I'm busy with posting RT stuff for upstream :)


Ah, thanks for that! And yes, looking forward to RT upstream, it's
just about time! ;-)



Thanks,

tglx

8<
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -80,7 +80,7 @@ static inline bool userspace_irqchip(str
  static void soft_timer_start(struct hrtimer *hrt, u64 ns)
  {
hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
- HRTIMER_MODE_ABS);
+ HRTIMER_MODE_ABS_HARD);
  }
  


That's pretty neat, and matches the patch you already have for
x86. Feel free to add my

Acked-by: Marc Zyngier 


I can confirm the warning now disappeared. Feel free to add my tested-by:

Tested-by: Julien Grall 

Thank you both for the help!

Cheers,

--
Julien Grall


[PATCH v3 15/15] kvm/arm: Align the VMID allocation with the arm64 ASID one

2019-07-24 Thread Julien Grall
At the moment, the VMID algorithm will send an SGI to all the CPUs to
force an exit and then broadcast a full TLB flush and I-Cache
invalidation.

This patch re-uses the new ASID allocator. The benefits are:
- CPUs are not forced to exit at roll-over. Instead the VMID will be
marked reserved and the context will be flushed at the next exit. This
will reduce the IPI traffic.
- Context invalidation is now per-CPU rather than broadcast.
- Catalin has a formal model of the ASID allocator.

With the new algorithm, the code is adapted as follows:
- The function __kvm_flush_vm_context() has been renamed to
__kvm_tlb_flush_local_all() and now only flushes the current CPU
context.
- The call to update_vttbr() will be done with preemption disabled,
as the new algorithm requires storing information per-CPU.
- The TLBs associated with EL1 will be flushed when booting a CPU to
deal with stale information. This was previously done on the
allocation of the first VMID of a new generation.

The measurements were made on a Seattle-based SoC (8 CPUs), with the
number of VMIDs limited to 4 bits. The test involves concurrently running
40 guests with 2 vCPUs each. Each guest executes hackbench 5 times
before exiting.

The performance differences between the current algorithm and the new
one are:
- 2.5% fewer exits from the guest
- 22.4% more flushes, although they are now local rather than
broadcast
- 0.11% faster (just for the record)

Signed-off-by: Julien Grall 


Looking at __kvm_flush_vm_context(), it might be possible to
reduce the overhead further by removing the I-Cache flush for cache
types other than VIPT. This has been left aside for now.
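
For reviewers, a hedged sketch (not a hunk from this patch; vmid_info and
the exact field names are assumptions) of how kvm_get_vttbr() can refresh
the VMID through the shared allocator before building the VTTBR value, in
line with the smp_processor_id() note in the changelog below:

static u64 kvm_get_vttbr(struct kvm *kvm)
{
	struct kvm_vmid *vmid = &kvm->arch.vmid;
	u64 vmid_field, baddr;

	/* Re-validate (and, after a roll-over, re-allocate) the VMID locally. */
	asid_check_context(&vmid_info, &vmid->id, smp_processor_id());

	baddr = kvm->arch.pgd_phys;
	vmid_field = (atomic64_read(&vmid->id) &
		      GENMASK(kvm_get_vmid_bits() - 1, 0)) << VTTBR_VMID_SHIFT;
	return kvm_phys_to_vttbr(baddr) | vmid_field;
}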

Changes in v3:
- Free resource if initialization failed
- s/__kvm_flush_cpu_vmid_context/__kvm_tlb_flush_local_all/
- s/asid/id/ in kvm_vmid to avoid confusion
- Generate the VMID in kvm_get_vttbr() rather than using a
callback in the ASID allocator
- Use smp_processor_id() rather than {get, put}_cpu() as the
code should already be called from non-preemptible context
- Mention the formal model in the commit message
---
 arch/arm/include/asm/kvm_asm.h|   2 +-
 arch/arm/include/asm/kvm_host.h   |   5 +-
 arch/arm/include/asm/kvm_hyp.h|   1 +
 arch/arm/include/asm/kvm_mmu.h|   3 +-
 arch/arm/kvm/hyp/tlb.c|   8 +--
 arch/arm64/include/asm/kvm_asid.h |   8 +++
 arch/arm64/include/asm/kvm_asm.h  |   2 +-
 arch/arm64/include/asm/kvm_host.h |   5 +-
 arch/arm64/include/asm/kvm_mmu.h  |   3 +-
 arch/arm64/kvm/hyp/tlb.c  |  10 +--
 virt/kvm/arm/arm.c| 125 ++
 11 files changed, 70 insertions(+), 102 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_asid.h

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index f615830f9f57..b6342258b466 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -53,10 +53,10 @@ struct kvm_vcpu;
 extern char __kvm_hyp_init[];
 extern char __kvm_hyp_init_end[];
 
-extern void __kvm_flush_vm_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
 extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
 extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu);
+extern void __kvm_tlb_flush_local_all(void);
 
 extern void __kvm_timer_set_cntvoff(u32 cntvoff_low, u32 cntvoff_high);
 
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 8a37c8e89777..9b534f73725f 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -49,9 +49,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
 void kvm_reset_coprocs(struct kvm_vcpu *vcpu);
 
 struct kvm_vmid {
-   /* The VMID generation used for the virt. memory system */
-   u64vmid_gen;
-   u32vmid;
+   atomic64_t id;
 };
 
 struct kvm_arch {
@@ -257,7 +255,6 @@ unsigned long __kvm_call_hyp(void *hypfn, ...);
ret;\
})
 
-void force_vm_exit(const cpumask_t *mask);
 int __kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
  struct kvm_vcpu_events *events);
 
diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
index 40e9034db601..46484a516e76 100644
--- a/arch/arm/include/asm/kvm_hyp.h
+++ b/arch/arm/include/asm/kvm_hyp.h
@@ -64,6 +64,7 @@
 #define TLBIALLIS  __ACCESS_CP15(c8, 0, c3, 0)
 #define TLBIALL__ACCESS_CP15(c8, 0, c7, 0)
 #define TLBIALLNSNHIS  __ACCESS_CP15(c8, 4, c3, 4)
+#define TLBIALLNSNH__ACCESS_CP15(c8, 4, c7, 4)
 #define PRRR   __ACCESS_CP15(c10, 0, c2, 0)
 #define NMRR   __ACCESS_CP15(c10, 0, c2, 1)
 #define AMAIR0 __ACCESS_CP15(c10, 0, c3, 0)
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 0d84d50bf9ba..d7208e7b01bd 100644
--- a/arch/arm

[PATCH v3 13/15] arm/kvm: Introduce a new VMID allocator

2019-07-24 Thread Julien Grall
A follow-up patch will replace the KVM VMID allocator with the arm64 ASID
allocator.

To avoid duplication as much as possible, the arm KVM code will directly
compile arch/arm64/lib/asid.c. The header is a verbatim copy, to
avoid breaking the assumption that an architecture port has self-contained
headers.

Signed-off-by: Julien Grall 
Cc: Russell King 

---
I hit a warning when compiling the ASID code:

linux/arch/arm/kvm/../../arm64/lib/asid.c:17: warning: "ASID_MASK" redefined
 #define ASID_MASK(info)   (~GENMASK((info)->bits - 1, 0))

In file included from linux/include/linux/mm_types.h:18,
 from linux/include/linux/mmzone.h:21,
 from linux/include/linux/gfp.h:6,
 from linux/include/linux/slab.h:15,
 from linux/arch/arm/kvm/../../arm64/lib/asid.c:11:
linux/arch/arm/include/asm/mmu.h:26: note: this is the location of the previous 
definition
 #define ASID_MASK ((~0ULL) << ASID_BITS)

I haven't resolved it yet because I am not sure of the best way to go.
AFAICT ASID_MASK is only used in mm/context.c. So I am wondering whether
it would be acceptable to move the define.
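
One untested alternative, if moving the define turns out to be awkward,
would be to drop the arm32 definition locally before the allocator defines
its own, e.g.:

--- a/arch/arm64/lib/asid.c
+++ b/arch/arm64/lib/asid.c
@@
+/* arm32's asm/mmu.h exposes a flat ASID_MASK; ours is per-allocator. */
+#undef ASID_MASK
 #define ASID_MASK(info)			(~GENMASK((info)->bits - 1, 0))

(this is only a sketch of the idea, not a tested patch).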

Changes in v3:
- Resync arm32 with the arm64 header

Changes in v2:
- Re-use arm64/lib/asid.c rather than duplicating the code.
---
 arch/arm/include/asm/lib_asid.h | 79 +
 arch/arm/kvm/Makefile   |  1 +
 2 files changed, 80 insertions(+)
 create mode 100644 arch/arm/include/asm/lib_asid.h

diff --git a/arch/arm/include/asm/lib_asid.h b/arch/arm/include/asm/lib_asid.h
new file mode 100644
index ..e3233d37f5db
--- /dev/null
+++ b/arch/arm/include/asm/lib_asid.h
@@ -0,0 +1,79 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ARM_LIB_ASID_H__
+#define __ARM_LIB_ASID_H__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct asid_info
+{
+   atomic64_t  generation;
+   unsigned long   *map;
+   atomic64_t __percpu *active;
+   u64 __percpu*reserved;
+   u32 bits;
+   /* Lock protecting the structure */
+   raw_spinlock_t  lock;
+   /* Which CPU requires context flush on next call */
+   cpumask_t   flush_pending;
+   /* Number of ASID allocated by context (shift value) */
+   unsigned intctxt_shift;
+   /* Callback to locally flush the context. */
+   void(*flush_cpu_ctxt_cb)(void);
+};
+
+#define NUM_ASIDS(info)(1UL << ((info)->bits))
+#define NUM_CTXT_ASIDS(info)   (NUM_ASIDS(info) >> (info)->ctxt_shift)
+
+#define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
+
+void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+ unsigned int cpu);
+
+/*
+ * Check the ASID is still valid for the context. If not generate a new ASID.
+ *
+ * @pasid: Pointer to the current ASID batch
+ * @cpu: current CPU ID. Must have been acquired through get_cpu()
+ */
+static inline void asid_check_context(struct asid_info *info,
+ atomic64_t *pasid, unsigned int cpu)
+{
+   u64 asid, old_active_asid;
+
+   asid = atomic64_read(pasid);
+
+   /*
+* The memory ordering here is subtle.
+* If our active_asid is non-zero and the ASID matches the current
+* generation, then we update the active_asid entry with a relaxed
+* cmpxchg. Racing with a concurrent rollover means that either:
+*
+* - We get a zero back from the cmpxchg and end up waiting on the
+*   lock. Taking the lock synchronises with the rollover and so
+*   we are forced to see the updated generation.
+*
+* - We get a valid ASID back from the cmpxchg, which means the
+*   relaxed xchg in flush_context will treat us as reserved
+*   because atomic RmWs are totally ordered for a given location.
+*/
+   old_active_asid = atomic64_read(&active_asid(info, cpu));
+   if (old_active_asid &&
+   !((asid ^ atomic64_read(&info->generation)) >> info->bits) &&
+   atomic64_cmpxchg_relaxed(&active_asid(info, cpu),
+old_active_asid, asid))
+   return;
+
+   asid_new_context(info, pasid, cpu);
+}
+
+int asid_allocator_init(struct asid_info *info,
+   u32 bits, unsigned int asid_per_ctxt,
+   void (*flush_cpu_ctxt_cb)(void));
+
+void asid_allocator_free(struct asid_info *info);
+
+#endif /* __ARM_LIB_ASID_H__ */
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index 531e59f5be9c..6ab49bd84531 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -40,3 +40,4 @@ obj-y += $(KVM)/arm/vgic/vgic-its.o
 obj-y += $(KVM)/arm/vgic/vgic-debug.o
 obj-y += $(KVM)/irqchip.o
 obj-y +=

[PATCH v3 11/15] arm64: Move the ASID allocator code in a separate file

2019-07-24 Thread Julien Grall
We will want to re-use the ASID allocator in a separate context (e.g
allocating VMID). So move the code in a new file.

The function asid_check_context has been moved into the header as a static
inline function because we want to avoid adding a branch when checking if the
ASID is still valid.

Signed-off-by: Julien Grall 

---

This code will be used in the virt code for allocating VMIDs. I am not
entirely sure where to place it. Lib could potentially be a good place, but I
am not entirely convinced the algorithm as it is could be used by other
architectures.

Looking at x86, it seems that it will not be possible to re-use it because
the number of PCIDs (aka ASIDs) could be smaller than the number of CPUs.
See commit 10af6235e0d327d42e1bad974385197817923dc1 "x86/mm:
Implement PCID based optimization: try to preserve old TLB entries using
PCID".

Changes in v3:
- Correctly move ASID_FIRST_VERSION to the new file

Changes in v2:
- Rename the header from asid.h to lib_asid.h
---
 arch/arm64/include/asm/lib_asid.h |  77 +
 arch/arm64/lib/Makefile   |   2 +
 arch/arm64/lib/asid.c | 185 ++
 arch/arm64/mm/context.c   | 235 +-
 4 files changed, 267 insertions(+), 232 deletions(-)
 create mode 100644 arch/arm64/include/asm/lib_asid.h
 create mode 100644 arch/arm64/lib/asid.c

diff --git a/arch/arm64/include/asm/lib_asid.h 
b/arch/arm64/include/asm/lib_asid.h
new file mode 100644
index ..c18e9eca500e
--- /dev/null
+++ b/arch/arm64/include/asm/lib_asid.h
@@ -0,0 +1,77 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_ASM_LIB_ASID_H
+#define __ASM_ASM_LIB_ASID_H
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct asid_info
+{
+   atomic64_t  generation;
+   unsigned long   *map;
+   atomic64_t __percpu *active;
+   u64 __percpu*reserved;
+   u32 bits;
+   /* Lock protecting the structure */
+   raw_spinlock_t  lock;
+   /* Which CPU requires context flush on next call */
+   cpumask_t   flush_pending;
+   /* Number of ASID allocated by context (shift value) */
+   unsigned intctxt_shift;
+   /* Callback to locally flush the context. */
+   void(*flush_cpu_ctxt_cb)(void);
+};
+
+#define NUM_ASIDS(info)(1UL << ((info)->bits))
+#define NUM_CTXT_ASIDS(info)   (NUM_ASIDS(info) >> (info)->ctxt_shift)
+
+#define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
+
+void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+ unsigned int cpu);
+
+/*
+ * Check the ASID is still valid for the context. If not generate a new ASID.
+ *
+ * @pasid: Pointer to the current ASID batch
+ * @cpu: current CPU ID. Must have been acquired through get_cpu()
+ */
+static inline void asid_check_context(struct asid_info *info,
+ atomic64_t *pasid, unsigned int cpu)
+{
+   u64 asid, old_active_asid;
+
+   asid = atomic64_read(pasid);
+
+   /*
+* The memory ordering here is subtle.
+* If our active_asid is non-zero and the ASID matches the current
+* generation, then we update the active_asid entry with a relaxed
+* cmpxchg. Racing with a concurrent rollover means that either:
+*
+* - We get a zero back from the cmpxchg and end up waiting on the
+*   lock. Taking the lock synchronises with the rollover and so
+*   we are forced to see the updated generation.
+*
+* - We get a valid ASID back from the cmpxchg, which means the
+*   relaxed xchg in flush_context will treat us as reserved
+*   because atomic RmWs are totally ordered for a given location.
+*/
+   old_active_asid = atomic64_read(&active_asid(info, cpu));
+   if (old_active_asid &&
+   !((asid ^ atomic64_read(&info->generation)) >> info->bits) &&
+   atomic64_cmpxchg_relaxed(&active_asid(info, cpu),
+old_active_asid, asid))
+   return;
+
+   asid_new_context(info, pasid, cpu);
+}
+
+int asid_allocator_init(struct asid_info *info,
+   u32 bits, unsigned int asid_per_ctxt,
+   void (*flush_cpu_ctxt_cb)(void));
+
+#endif
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 33c2a4abda04..37169d541ab5 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -5,6 +5,8 @@ lib-y   := clear_user.o delay.o copy_from_user.o
\
   memcmp.o strcmp.o strncmp.o strlen.o strnlen.o   \
   strchr.o strrchr.o tishift.o
 
+lib-y  += asid.o
+
 ifeq ($(CONFIG_KERNEL_MODE_NEON), y)
 obj-$(CONFIG_XOR_BLOCKS)   += xor-neon.o
 CFLAGS_REMOVE_xor-neon.o 

[PATCH v3 12/15] arm64/lib: Add an helper to free memory allocated by the ASID allocator

2019-07-24 Thread Julien Grall
Some users of the ASID allocator (e.g. VMID) may need to free any
resources if the initialization fails. So introduce a function that allows
freeing any memory allocated by the ASID allocator.

Signed-off-by: Julien Grall 

---
Changes in v3:
- Patch added
---
 arch/arm64/include/asm/lib_asid.h | 2 ++
 arch/arm64/lib/asid.c | 5 +
 2 files changed, 7 insertions(+)

diff --git a/arch/arm64/include/asm/lib_asid.h 
b/arch/arm64/include/asm/lib_asid.h
index c18e9eca500e..ff78865a6823 100644
--- a/arch/arm64/include/asm/lib_asid.h
+++ b/arch/arm64/include/asm/lib_asid.h
@@ -74,4 +74,6 @@ int asid_allocator_init(struct asid_info *info,
u32 bits, unsigned int asid_per_ctxt,
void (*flush_cpu_ctxt_cb)(void));
 
+void asid_allocator_free(struct asid_info *info);
+
 #endif
diff --git a/arch/arm64/lib/asid.c b/arch/arm64/lib/asid.c
index 0b3a99c4aed4..d23f0df656c1 100644
--- a/arch/arm64/lib/asid.c
+++ b/arch/arm64/lib/asid.c
@@ -183,3 +183,8 @@ int asid_allocator_init(struct asid_info *info,
 
return 0;
 }
+
+void asid_allocator_free(struct asid_info *info)
+{
+   kfree(info->map);
+}
-- 
2.11.0



[PATCH v3 14/15] arch/arm64: Introduce a capability to tell whether 16-bit VMID is available

2019-07-24 Thread Julien Grall
At the moment, the function kvm_get_vmid_bits() looks up the
sanitized value of ID_AA64MMFR1_EL1 and extracts the information
regarding the number of VMID bits supported.

This is fine as the function is mainly used during VMID roll-over. A new
use in a follow-up patch will require the function to be called at every
context switch, so we want the function to be more efficient.

A new capability is introduced to tell whether 16-bit VMID is
available.

Signed-off-by: Julien Grall 

---
Changes in v3:
- Patch added
---
 arch/arm64/include/asm/cpucaps.h | 3 ++-
 arch/arm64/include/asm/kvm_mmu.h | 4 +---
 arch/arm64/kernel/cpufeature.c   | 9 +
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index f19fe4b9acc4..af8ab758b252 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -52,7 +52,8 @@
 #define ARM64_HAS_IRQ_PRIO_MASKING 42
 #define ARM64_HAS_DCPODP   43
 #define ARM64_WORKAROUND_1463225   44
+#define ARM64_HAS_16BIT_VMID   45
 
-#define ARM64_NCAPS45
+#define ARM64_NCAPS46
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index befe37d4bc0e..2ce8055a84b8 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -413,9 +413,7 @@ static inline void __kvm_extend_hypmap(pgd_t *boot_hyp_pgd,
 
 static inline unsigned int kvm_get_vmid_bits(void)
 {
-   int reg = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
-
-   return (cpuid_feature_extract_unsigned_field(reg, 
ID_AA64MMFR1_VMIDBITS_SHIFT) == 2) ? 16 : 8;
+   return cpus_have_const_cap(ARM64_HAS_16BIT_VMID) ? 16 : 8;
 }
 
 /*
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index f29f36a65175..b401e56af35a 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1548,6 +1548,15 @@ static const struct arm64_cpu_capabilities 
arm64_features[] = {
.min_field_value = 1,
},
 #endif
+   {
+   .capability = ARM64_HAS_16BIT_VMID,
+   .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+   .sys_reg = SYS_ID_AA64MMFR1_EL1,
+   .field_pos = ID_AA64MMFR1_VMIDBITS_SHIFT,
+   .sign = FTR_UNSIGNED,
+   .min_field_value = ID_AA64MMFR1_VMIDBITS_16,
+   .matches = has_cpuid_feature,
+   },
{},
 };
 
-- 
2.11.0



[PATCH v3 06/15] arm64/mm: Store the number of asid allocated per context

2019-07-24 Thread Julien Grall
Currently the number of ASIDs allocated per context is determined at
compilation time. As the algorithm is becoming generic, the user may
want to instantiate the ASID allocator multiple times with a different
number of ASIDs allocated.

Add a field in asid_info to track the number of ASIDs allocated per context.
This is stored as a shift amount to avoid divisions in the code.

This means the number of ASIDs allocated per context should be a power of
two.

At the same time rename NUM_USER_ASIDS to NUM_CTXT_ASIDS to make the
name more generic.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index dfb0da35a541..2e1e495cd1d8 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -26,6 +26,8 @@ static struct asid_info
raw_spinlock_t  lock;
/* Which CPU requires context flush on next call */
cpumask_t   flush_pending;
+   /* Number of ASID allocated by context (shift value) */
+   unsigned intctxt_shift;
 } asid_info;
 
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
@@ -38,15 +40,15 @@ static DEFINE_PER_CPU(u64, reserved_asids);
 #define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
-#define NUM_USER_ASIDS(info)   (ASID_FIRST_VERSION(info) >> 1)
-#define asid2idx(info, asid)   (((asid) & ~ASID_MASK(info)) >> 1)
-#define idx2asid(info, idx)(((idx) << 1) & ~ASID_MASK(info))
+#define ASID_PER_CONTEXT   2
 #else
-#define NUM_USER_ASIDS(info)   (ASID_FIRST_VERSION(info))
-#define asid2idx(info, asid)   ((asid) & ~ASID_MASK(info))
-#define idx2asid(info, idx)asid2idx(info, idx)
+#define ASID_PER_CONTEXT   1
 #endif
 
+#define NUM_CTXT_ASIDS(info)   (ASID_FIRST_VERSION(info) >> 
(info)->ctxt_shift)
+#define asid2idx(info, asid)   (((asid) & ~ASID_MASK(info)) >> 
(info)->ctxt_shift)
+#define idx2asid(info, idx)(((idx) << (info)->ctxt_shift) & 
~ASID_MASK(info))
+
 /* Get the ASIDBits supported by the current CPU */
 static u32 get_cpu_asid_bits(void)
 {
@@ -91,7 +93,7 @@ static void flush_context(struct asid_info *info)
u64 asid;
 
/* Update the list of reserved ASIDs and the ASID bitmap. */
-   bitmap_clear(info->map, 0, NUM_USER_ASIDS(info));
+   bitmap_clear(info->map, 0, NUM_CTXT_ASIDS(info));
 
for_each_possible_cpu(i) {
asid = atomic64_xchg_relaxed(&active_asid(info, i), 0);
@@ -171,8 +173,8 @@ static u64 new_context(struct asid_info *info, atomic64_t 
*pasid)
 * a reserved TTBR0 for the init_mm and we allocate ASIDs in even/odd
 * pairs.
 */
-   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS(info), cur_idx);
-   if (asid != NUM_USER_ASIDS(info))
+   asid = find_next_zero_bit(info->map, NUM_CTXT_ASIDS(info), cur_idx);
+   if (asid != NUM_CTXT_ASIDS(info))
goto set_asid;
 
/* We're out of ASIDs, so increment the global generation count */
@@ -181,7 +183,7 @@ static u64 new_context(struct asid_info *info, atomic64_t 
*pasid)
flush_context(info);
 
/* We have more ASIDs than CPUs, so this will always succeed */
-   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS(info), 1);
+   asid = find_next_zero_bit(info->map, NUM_CTXT_ASIDS(info), 1);
 
 set_asid:
__set_bit(asid, info->map);
@@ -261,17 +263,18 @@ static int asids_init(void)
struct asid_info *info = &asid_info;
 
info->bits = get_cpu_asid_bits();
+   info->ctxt_shift = ilog2(ASID_PER_CONTEXT);
/*
 * Expect allocation after rollover to fail if we don't have at least
 * one more ASID than CPUs. ASID #0 is reserved for init_mm.
 */
-   WARN_ON(NUM_USER_ASIDS(info) - 1 <= num_possible_cpus());
+   WARN_ON(NUM_CTXT_ASIDS(info) - 1 <= num_possible_cpus());
atomic64_set(&info->generation, ASID_FIRST_VERSION(info));
-   info->map = kcalloc(BITS_TO_LONGS(NUM_USER_ASIDS(info)),
+   info->map = kcalloc(BITS_TO_LONGS(NUM_CTXT_ASIDS(info)),
sizeof(*info->map), GFP_KERNEL);
if (!info->map)
panic("Failed to allocate bitmap for %lu ASIDs\n",
- NUM_USER_ASIDS(info));
+ NUM_CTXT_ASIDS(info));
 
info->active = &active_asids;
info->reserved = &reserved_asids;
@@ -279,7 +282,7 @@ static int asids_init(void)
raw_spin_lock_init(&info->lock);
 
pr_info("ASID allocator initialised with %lu entries\n",
-   NUM_USER_ASIDS(info));
+   NUM_CTXT_ASIDS(info));
return 0;
 }
 early_initcall(asids

[PATCH v3 08/15] arm64/mm: Split asid_inits in 2 parts

2019-07-24 Thread Julien Grall
Move the common initialization of the ASID allocator out into a separate
function.

Signed-off-by: Julien Grall 

---
Changes in v3:
- Allow bisection (asid_allocator_init() returns 0 on success, not an
  error!).
---
 arch/arm64/mm/context.c | 43 +++
 1 file changed, 31 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 3b40ac4a2541..27e328fffdb1 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -260,31 +260,50 @@ asmlinkage void post_ttbr_update_workaround(void)
CONFIG_CAVIUM_ERRATUM_27456));
 }
 
-static int asids_init(void)
+/*
+ * Initialize the ASID allocator
+ *
+ * @info: Pointer to the asid allocator structure
+ * @bits: Number of ASIDs available
+ * @asid_per_ctxt: Number of ASIDs to allocate per-context. ASIDs are
+ * allocated contiguously for a given context. This value should be a power of
+ * 2.
+ */
+static int asid_allocator_init(struct asid_info *info,
+  u32 bits, unsigned int asid_per_ctxt)
 {
-   struct asid_info *info = &asid_info;
-
-   info->bits = get_cpu_asid_bits();
-   info->ctxt_shift = ilog2(ASID_PER_CONTEXT);
+   info->bits = bits;
+   info->ctxt_shift = ilog2(asid_per_ctxt);
/*
 * Expect allocation after rollover to fail if we don't have at least
-* one more ASID than CPUs. ASID #0 is reserved for init_mm.
+* one more ASID than CPUs. ASID #0 is always reserved.
 */
WARN_ON(NUM_CTXT_ASIDS(info) - 1 <= num_possible_cpus());
atomic64_set(&info->generation, ASID_FIRST_VERSION(info));
info->map = kcalloc(BITS_TO_LONGS(NUM_CTXT_ASIDS(info)),
sizeof(*info->map), GFP_KERNEL);
if (!info->map)
-   panic("Failed to allocate bitmap for %lu ASIDs\n",
- NUM_CTXT_ASIDS(info));
-
-   info->active = &active_asids;
-   info->reserved = &reserved_asids;
+   return -ENOMEM;
 
raw_spin_lock_init(&info->lock);
 
+   return 0;
+}
+
+static int asids_init(void)
+{
+   u32 bits = get_cpu_asid_bits();
+
+   if (asid_allocator_init(&asid_info, bits, ASID_PER_CONTEXT))
+   panic("Unable to initialize ASID allocator for %lu ASIDs\n",
+ 1UL << bits);
+
+   asid_info.active = &active_asids;
+   asid_info.reserved = &reserved_asids;
+
pr_info("ASID allocator initialised with %lu entries\n",
-   NUM_CTXT_ASIDS(info));
+   NUM_CTXT_ASIDS(&asid_info));
+
return 0;
 }
 early_initcall(asids_init);
-- 
2.11.0



[PATCH v3 10/15] arm64/mm: Introduce a callback to flush the local context

2019-07-24 Thread Julien Grall
Flushing the local context will vary depending on the actual user of the ASID
allocator. Introduce a new callback to flush the local context and move
the call to flush the local TLB into it.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 5e8b381ab67f..ac10893b403c 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -28,6 +28,8 @@ static struct asid_info
cpumask_t   flush_pending;
/* Number of ASID allocated by context (shift value) */
unsigned intctxt_shift;
+   /* Callback to locally flush the context. */
+   void(*flush_cpu_ctxt_cb)(void);
 } asid_info;
 
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
@@ -255,7 +257,7 @@ static void asid_new_context(struct asid_info *info, 
atomic64_t *pasid,
}
 
if (cpumask_test_and_clear_cpu(cpu, &info->flush_pending))
-   local_flush_tlb_all();
+   info->flush_cpu_ctxt_cb();
 
atomic64_set(&active_asid(info, cpu), asid);
raw_spin_unlock_irqrestore(&info->lock, flags);
@@ -287,6 +289,11 @@ asmlinkage void post_ttbr_update_workaround(void)
CONFIG_CAVIUM_ERRATUM_27456));
 }
 
+static void asid_flush_cpu_ctxt(void)
+{
+   local_flush_tlb_all();
+}
+
 /*
  * Initialize the ASID allocator
  *
@@ -297,10 +304,12 @@ asmlinkage void post_ttbr_update_workaround(void)
  * 2.
  */
 static int asid_allocator_init(struct asid_info *info,
-  u32 bits, unsigned int asid_per_ctxt)
+  u32 bits, unsigned int asid_per_ctxt,
+  void (*flush_cpu_ctxt_cb)(void))
 {
info->bits = bits;
info->ctxt_shift = ilog2(asid_per_ctxt);
+   info->flush_cpu_ctxt_cb = flush_cpu_ctxt_cb;
/*
 * Expect allocation after rollover to fail if we don't have at least
 * one more ASID than CPUs. ASID #0 is always reserved.
@@ -321,7 +330,8 @@ static int asids_init(void)
 {
u32 bits = get_cpu_asid_bits();
 
-   if (asid_allocator_init(&asid_info, bits, ASID_PER_CONTEXT))
+   if (asid_allocator_init(&asid_info, bits, ASID_PER_CONTEXT,
+asid_flush_cpu_ctxt))
panic("Unable to initialize ASID allocator for %lu ASIDs\n",
  1UL << bits);
 
-- 
2.11.0



[PATCH v3 09/15] arm64/mm: Split the function check_and_switch_context in 3 parts

2019-07-24 Thread Julien Grall
The function check_and_switch_context is used to:
1) Check whether the ASID is still valid
2) Generate a new one if it is not valid
3) Switch the context

While the latter is specific to the MM subsystem, the rest could be part
of the generic ASID allocator.

After this patch, the function is now split in 3 parts, which correspond
to the following functions:
1) asid_check_context: Check if the ASID is still valid
2) asid_new_context: Generate a new ASID for the context
3) check_and_switch_context: Call 1) and 2) and switch the context

1) and 2) have not been merged into a single function because we want to
avoid adding a branch when the ASID is still valid. This will matter
when the code is moved into a separate file later on, as 1) will reside
in the header as a static inline function.

Signed-off-by: Julien Grall 

---

Will wants to avoid adding a branch when the ASID is still valid. So
1) and 2) are in separate functions. The former will move to a new
header and be made static inline.
---
 arch/arm64/mm/context.c | 51 +
 1 file changed, 39 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 27e328fffdb1..5e8b381ab67f 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -193,16 +193,21 @@ static u64 new_context(struct asid_info *info, atomic64_t 
*pasid)
return idx2asid(info, asid) | generation;
 }
 
-void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
+static void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+unsigned int cpu);
+
+/*
+ * Check the ASID is still valid for the context. If not generate a new ASID.
+ *
+ * @pasid: Pointer to the current ASID batch
+ * @cpu: current CPU ID. Must have been acquired through get_cpu()
+ */
+static void asid_check_context(struct asid_info *info,
+  atomic64_t *pasid, unsigned int cpu)
 {
-   unsigned long flags;
u64 asid, old_active_asid;
-   struct asid_info *info = &asid_info;
 
-   if (system_supports_cnp())
-   cpu_set_reserved_ttbr0();
-
-   asid = atomic64_read(&mm->context.id);
+   asid = atomic64_read(pasid);
 
/*
 * The memory ordering here is subtle.
@@ -223,14 +228,30 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
!((asid ^ atomic64_read(&info->generation)) >> info->bits) &&
atomic64_cmpxchg_relaxed(&active_asid(info, cpu),
 old_active_asid, asid))
-   goto switch_mm_fastpath;
+   return;
+
+   asid_new_context(info, pasid, cpu);
+}
+
+/*
+ * Generate a new ASID for the context.
+ *
+ * @pasid: Pointer to the current ASID batch allocated. It will be updated
+ * with the new ASID batch.
+ * @cpu: current CPU ID. Must have been acquired through get_cpu()
+ */
+static void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+unsigned int cpu)
+{
+   unsigned long flags;
+   u64 asid;
 
raw_spin_lock_irqsave(&info->lock, flags);
/* Check that our ASID belongs to the current generation. */
-   asid = atomic64_read(&mm->context.id);
+   asid = atomic64_read(pasid);
if ((asid ^ atomic64_read(&info->generation)) >> info->bits) {
-   asid = new_context(info, &mm->context.id);
-   atomic64_set(&mm->context.id, asid);
+   asid = new_context(info, pasid);
+   atomic64_set(pasid, asid);
}
 
if (cpumask_test_and_clear_cpu(cpu, &info->flush_pending))
@@ -238,8 +259,14 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 
atomic64_set(&active_asid(info, cpu), asid);
raw_spin_unlock_irqrestore(&info->lock, flags);
+}
+
+void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
+{
+   if (system_supports_cnp())
+   cpu_set_reserved_ttbr0();
 
-switch_mm_fastpath:
+   asid_check_context(&asid_info, &mm->context.id, cpu);
 
arm64_apply_bp_hardening();
 
-- 
2.11.0



[PATCH v3 03/15] arm64/mm: Move bits to asid_info

2019-07-24 Thread Julien Grall
The variable bits holds information for a given ASID allocator. So move
it to the asid_info structure.

Because most of the macros were relying on bits, they are now taking an
extra parameter that is a pointer to the asid_info structure.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 59 +
 1 file changed, 30 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 3de028803284..49fff350e12f 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -16,7 +16,6 @@
 #include 
 #include 
 
-static u32 asid_bits;
 static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
 
 static struct asid_info
@@ -25,6 +24,7 @@ static struct asid_info
unsigned long   *map;
atomic64_t __percpu *active;
u64 __percpu*reserved;
+   u32 bits;
 } asid_info;
 
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
@@ -35,17 +35,17 @@ static DEFINE_PER_CPU(u64, reserved_asids);
 
 static cpumask_t tlb_flush_pending;
 
-#define ASID_MASK  (~GENMASK(asid_bits - 1, 0))
-#define ASID_FIRST_VERSION (1UL << asid_bits)
+#define ASID_MASK(info)(~GENMASK((info)->bits - 1, 0))
+#define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
-#define NUM_USER_ASIDS (ASID_FIRST_VERSION >> 1)
-#define asid2idx(asid) (((asid) & ~ASID_MASK) >> 1)
-#define idx2asid(idx)  (((idx) << 1) & ~ASID_MASK)
+#define NUM_USER_ASIDS(info)   (ASID_FIRST_VERSION(info) >> 1)
+#define asid2idx(info, asid)   (((asid) & ~ASID_MASK(info)) >> 1)
+#define idx2asid(info, idx)(((idx) << 1) & ~ASID_MASK(info))
 #else
-#define NUM_USER_ASIDS (ASID_FIRST_VERSION)
-#define asid2idx(asid) ((asid) & ~ASID_MASK)
-#define idx2asid(idx)  asid2idx(idx)
+#define NUM_USER_ASIDS(info)   (ASID_FIRST_VERSION(info))
+#define asid2idx(info, asid)   ((asid) & ~ASID_MASK(info))
+#define idx2asid(info, idx)asid2idx(info, idx)
 #endif
 
 /* Get the ASIDBits supported by the current CPU */
@@ -75,13 +75,13 @@ void verify_cpu_asid_bits(void)
 {
u32 asid = get_cpu_asid_bits();
 
-   if (asid < asid_bits) {
+   if (asid < asid_info.bits) {
/*
 * We cannot decrease the ASID size at runtime, so panic if we 
support
 * fewer ASID bits than the boot CPU.
 */
pr_crit("CPU%d: smaller ASID size(%u) than boot CPU (%u)\n",
-   smp_processor_id(), asid, asid_bits);
+   smp_processor_id(), asid, asid_info.bits);
cpu_panic_kernel();
}
 }
@@ -92,7 +92,7 @@ static void flush_context(struct asid_info *info)
u64 asid;
 
/* Update the list of reserved ASIDs and the ASID bitmap. */
-   bitmap_clear(info->map, 0, NUM_USER_ASIDS);
+   bitmap_clear(info->map, 0, NUM_USER_ASIDS(info));
 
for_each_possible_cpu(i) {
asid = atomic64_xchg_relaxed(&active_asid(info, i), 0);
@@ -105,7 +105,7 @@ static void flush_context(struct asid_info *info)
 */
if (asid == 0)
asid = reserved_asid(info, i);
-   __set_bit(asid2idx(asid), info->map);
+   __set_bit(asid2idx(info, asid), info->map);
reserved_asid(info, i) = asid;
}
 
@@ -148,7 +148,7 @@ static u64 new_context(struct asid_info *info, struct 
mm_struct *mm)
u64 generation = atomic64_read(&info->generation);
 
if (asid != 0) {
-   u64 newasid = generation | (asid & ~ASID_MASK);
+   u64 newasid = generation | (asid & ~ASID_MASK(info));
 
/*
 * If our current ASID was active during a rollover, we
@@ -161,7 +161,7 @@ static u64 new_context(struct asid_info *info, struct 
mm_struct *mm)
 * We had a valid ASID in a previous life, so try to re-use
 * it if possible.
 */
-   if (!__test_and_set_bit(asid2idx(asid), info->map))
+   if (!__test_and_set_bit(asid2idx(info, asid), info->map))
return newasid;
}
 
@@ -172,22 +172,22 @@ static u64 new_context(struct asid_info *info, struct 
mm_struct *mm)
 * a reserved TTBR0 for the init_mm and we allocate ASIDs in even/odd
 * pairs.
 */
-   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS, cur_idx);
-   if (asid != NUM_USER_ASIDS)
+   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS(info), cur_idx);
+   if (asid != NUM_USER_ASIDS(info))
goto set_asid;
 
/* We're out of ASIDs, so increment the global gener

[PATCH v3 07/15] arm64/mm: Introduce NUM_ASIDS

2019-07-24 Thread Julien Grall
At the moment, ASID_FIRST_VERSION is used to know the number of ASIDs
supported. As we are going to move the ASID allocator into a separate file, it
would be better to use a different name for external users.

This patch adds NUM_ASIDS and implements ASID_FIRST_VERSION using it.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 2e1e495cd1d8..3b40ac4a2541 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -37,7 +37,9 @@ static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
 
 #define ASID_MASK(info)(~GENMASK((info)->bits - 1, 0))
-#define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))
+#define NUM_ASIDS(info)(1UL << ((info)->bits))
+
+#define ASID_FIRST_VERSION(info)   NUM_ASIDS(info)
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
 #define ASID_PER_CONTEXT   2
@@ -45,7 +47,7 @@ static DEFINE_PER_CPU(u64, reserved_asids);
 #define ASID_PER_CONTEXT   1
 #endif
 
-#define NUM_CTXT_ASIDS(info)   (ASID_FIRST_VERSION(info) >> 
(info)->ctxt_shift)
+#define NUM_CTXT_ASIDS(info)   (NUM_ASIDS(info) >> (info)->ctxt_shift)
 #define asid2idx(info, asid)   (((asid) & ~ASID_MASK(info)) >> 
(info)->ctxt_shift)
 #define idx2asid(info, idx)(((idx) << (info)->ctxt_shift) & 
~ASID_MASK(info))
 
-- 
2.11.0



[PATCH v3 04/15] arm64/mm: Move the variable lock and tlb_flush_pending to asid_info

2019-07-24 Thread Julien Grall
The variables lock and tlb_flush_pending hold information for a given
ASID allocator. So move them to the asid_info structure.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 49fff350e12f..b50f52a09baf 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -16,8 +16,6 @@
 #include 
 #include 
 
-static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
-
 static struct asid_info
 {
atomic64_t  generation;
@@ -25,6 +23,9 @@ static struct asid_info
atomic64_t __percpu *active;
u64 __percpu*reserved;
u32 bits;
+   raw_spinlock_t  lock;
+   /* Which CPU requires context flush on next call */
+   cpumask_t   flush_pending;
 } asid_info;
 
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
@@ -33,8 +34,6 @@ static struct asid_info
 static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
 
-static cpumask_t tlb_flush_pending;
-
 #define ASID_MASK(info)(~GENMASK((info)->bits - 1, 0))
 #define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))
 
@@ -113,7 +112,7 @@ static void flush_context(struct asid_info *info)
 * Queue a TLB invalidation for each CPU to perform on next
 * context-switch
 */
-   cpumask_setall(_flush_pending);
+   cpumask_setall(>flush_pending);
 }
 
 static bool check_update_reserved_asid(struct asid_info *info, u64 asid,
@@ -222,7 +221,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 old_active_asid, asid))
goto switch_mm_fastpath;
 
-   raw_spin_lock_irqsave(_asid_lock, flags);
+   raw_spin_lock_irqsave(>lock, flags);
/* Check that our ASID belongs to the current generation. */
asid = atomic64_read(>context.id);
if ((asid ^ atomic64_read(>generation)) >> info->bits) {
@@ -230,11 +229,11 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
atomic64_set(>context.id, asid);
}
 
-   if (cpumask_test_and_clear_cpu(cpu, _flush_pending))
+   if (cpumask_test_and_clear_cpu(cpu, >flush_pending))
local_flush_tlb_all();
 
atomic64_set(_asid(info, cpu), asid);
-   raw_spin_unlock_irqrestore(_asid_lock, flags);
+   raw_spin_unlock_irqrestore(>lock, flags);
 
 switch_mm_fastpath:
 
@@ -277,6 +276,8 @@ static int asids_init(void)
info->active = _asids;
info->reserved = _asids;
 
+   raw_spin_lock_init(>lock);
+
pr_info("ASID allocator initialised with %lu entries\n",
NUM_USER_ASIDS(info));
return 0;
-- 
2.11.0



[PATCH v3 02/15] arm64/mm: Move active_asids and reserved_asids to asid_info

2019-07-24 Thread Julien Grall
The variables active_asids and reserved_asids hold information for a
given ASID allocator. So move them to the structure asid_info.

At the same time, introduce wrappers to access the active and reserved
ASIDs to make the code clearer.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 34 ++
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index b0789f30d03b..3de028803284 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -23,10 +23,16 @@ static struct asid_info
 {
atomic64_t  generation;
unsigned long   *map;
+   atomic64_t __percpu *active;
+   u64 __percpu*reserved;
 } asid_info;
 
+#define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
+#define reserved_asid(info, cpu) *per_cpu_ptr((info)->reserved, cpu)
+
 static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
+
 static cpumask_t tlb_flush_pending;
 
 #define ASID_MASK  (~GENMASK(asid_bits - 1, 0))
@@ -89,7 +95,7 @@ static void flush_context(struct asid_info *info)
bitmap_clear(info->map, 0, NUM_USER_ASIDS);
 
for_each_possible_cpu(i) {
-   asid = atomic64_xchg_relaxed(_cpu(active_asids, i), 0);
+   asid = atomic64_xchg_relaxed(_asid(info, i), 0);
/*
 * If this CPU has already been through a
 * rollover, but hasn't run another task in
@@ -98,9 +104,9 @@ static void flush_context(struct asid_info *info)
 * the process it is still running.
 */
if (asid == 0)
-   asid = per_cpu(reserved_asids, i);
+   asid = reserved_asid(info, i);
__set_bit(asid2idx(asid), info->map);
-   per_cpu(reserved_asids, i) = asid;
+   reserved_asid(info, i) = asid;
}
 
/*
@@ -110,7 +116,8 @@ static void flush_context(struct asid_info *info)
cpumask_setall(_flush_pending);
 }
 
-static bool check_update_reserved_asid(u64 asid, u64 newasid)
+static bool check_update_reserved_asid(struct asid_info *info, u64 asid,
+  u64 newasid)
 {
int cpu;
bool hit = false;
@@ -125,9 +132,9 @@ static bool check_update_reserved_asid(u64 asid, u64 
newasid)
 * generation.
 */
for_each_possible_cpu(cpu) {
-   if (per_cpu(reserved_asids, cpu) == asid) {
+   if (reserved_asid(info, cpu) == asid) {
hit = true;
-   per_cpu(reserved_asids, cpu) = newasid;
+   reserved_asid(info, cpu) = newasid;
}
}
 
@@ -147,7 +154,7 @@ static u64 new_context(struct asid_info *info, struct 
mm_struct *mm)
 * If our current ASID was active during a rollover, we
 * can continue to use it and this was just a false alarm.
 */
-   if (check_update_reserved_asid(asid, newasid))
+   if (check_update_reserved_asid(info, asid, newasid))
return newasid;
 
/*
@@ -196,8 +203,8 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 
/*
 * The memory ordering here is subtle.
-* If our active_asids is non-zero and the ASID matches the current
-* generation, then we update the active_asids entry with a relaxed
+* If our active_asid is non-zero and the ASID matches the current
+* generation, then we update the active_asid entry with a relaxed
 * cmpxchg. Racing with a concurrent rollover means that either:
 *
 * - We get a zero back from the cmpxchg and end up waiting on the
@@ -208,10 +215,10 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 *   relaxed xchg in flush_context will treat us as reserved
 *   because atomic RmWs are totally ordered for a given location.
 */
-   old_active_asid = atomic64_read(_cpu(active_asids, cpu));
+   old_active_asid = atomic64_read(_asid(info, cpu));
if (old_active_asid &&
!((asid ^ atomic64_read(>generation)) >> asid_bits) &&
-   atomic64_cmpxchg_relaxed(_cpu(active_asids, cpu),
+   atomic64_cmpxchg_relaxed(_asid(info, cpu),
 old_active_asid, asid))
goto switch_mm_fastpath;
 
@@ -226,7 +233,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
if (cpumask_test_and_clear_cpu(cpu, _flush_pending))
local_flush_tlb_all();
 
-   atomic64_set(_cpu(active_asids, cpu), asid);
+   atomic64_set(_asid(info, cpu), asid);
raw_spin_unlock_irqrestore(_asid_lock, flags);
 
 switch_mm_fastpath:
@@ -267,6 +274,9 @@ static int a

[PATCH v3 05/15] arm64/mm: Remove dependency on MM in new_context

2019-07-24 Thread Julien Grall
The function new_context will be part of a generic ASID allocator. At
the moment, the MM structure is only used to fetch the ASID.

To remove the dependency on MM, it is possible to just pass a pointer to
the current ASID.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index b50f52a09baf..dfb0da35a541 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -140,10 +140,10 @@ static bool check_update_reserved_asid(struct asid_info 
*info, u64 asid,
return hit;
 }
 
-static u64 new_context(struct asid_info *info, struct mm_struct *mm)
+static u64 new_context(struct asid_info *info, atomic64_t *pasid)
 {
static u32 cur_idx = 1;
-   u64 asid = atomic64_read(>context.id);
+   u64 asid = atomic64_read(pasid);
u64 generation = atomic64_read(>generation);
 
if (asid != 0) {
@@ -225,7 +225,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
/* Check that our ASID belongs to the current generation. */
asid = atomic64_read(>context.id);
if ((asid ^ atomic64_read(>generation)) >> info->bits) {
-   asid = new_context(info, mm);
+   asid = new_context(info, >context.id);
atomic64_set(>context.id, asid);
}
 
-- 
2.11.0



[PATCH v3 00/15] kvm/arm: Align the VMID allocation with the arm64 ASID one

2019-07-24 Thread Julien Grall
Hi all,

This patch series moves the ASID allocator out into a separate file in order
to re-use it for the VMID. The benefits are:
- CPUs are not forced to exit on a roll-over.
- Context invalidation is now per-CPU rather than broadcast.

There are no performance regressions on the fastpath for ASID allocation.
Actually, on the hackbench measurement (300 hackbench) it was 0.7% faster.

The measurement was made on a Seattle based SoC (8 CPUs), with the
number of VMID limited to 4-bit. The test involves running concurrently 40
guests with 2 vCPUs. Each guest will then execute hackbench 5 times
before exiting.

The performance differences (on 5.1-rc1) between the current algorithm and the
new one are:
- 2.5% fewer exits from the guest
- 22.4% more flushes, although they are now local rather than broadcast
- 0.11% faster (just for the record)

The ASID allocator rework to make it generic has been divided into multiple
patches to make the review easier.

A branch with the patches, based on 5.3-rc1, can be found here:

http://xenbits.xen.org/gitweb/?p=people/julieng/linux-arm.git;a=shortlog;h=refs/heads/vmid-rework/v3

For all the changes, see each patch.
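
For readers who want a feel for the resulting API before diving into the
patches, a minimal consumer of the generic allocator ends up looking roughly
like the sketch below. This is only an illustration pieced together from the
lib_asid.h/asid.c interfaces introduced by this rework; the my_* names and
the 16-bit width are made up for the example, and the optional
update-context callback added later in the series is ignored here.

#include <linux/init.h>
#include <linux/percpu.h>
#include <linux/smp.h>
#include <asm/lib_asid.h>
#include <asm/tlbflush.h>

static struct asid_info my_allocator;

static DEFINE_PER_CPU(atomic64_t, my_active_ids);
static DEFINE_PER_CPU(u64, my_reserved_ids);

/* Called with the allocator lock held when the local context is stale. */
static void my_flush_cpu_ctxt(void)
{
	local_flush_tlb_all();
}

static int __init my_allocator_setup(void)
{
	my_allocator.active = &my_active_ids;
	my_allocator.reserved = &my_reserved_ids;

	/* 16 ID bits, one ID allocated per context, local flush callback. */
	return asid_allocator_init(&my_allocator, 16, 1, my_flush_cpu_ctxt);
}

/* The fast path is inline and lock-free; roll-over takes the allocator lock. */
static void my_switch_context(atomic64_t *id)
{
	unsigned int cpu = get_cpu();

	asid_check_context(&my_allocator, id, cpu);
	put_cpu();
}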

Best regards,

Cc: Russell King 

Julien Grall (15):
  arm64/mm: Introduce asid_info structure and move
asid_generation/asid_map to it
  arm64/mm: Move active_asids and reserved_asids to asid_info
  arm64/mm: Move bits to asid_info
  arm64/mm: Move the variable lock and tlb_flush_pending to asid_info
  arm64/mm: Remove dependency on MM in new_context
  arm64/mm: Store the number of asid allocated per context
  arm64/mm: Introduce NUM_ASIDS
  arm64/mm: Split asid_inits in 2 parts
  arm64/mm: Split the function check_and_switch_context in 3 parts
  arm64/mm: Introduce a callback to flush the local context
  arm64: Move the ASID allocator code in a separate file
  arm64/lib: Add an helper to free memory allocated by the ASID
allocator
  arm/kvm: Introduce a new VMID allocator
  arch/arm64: Introduce a capability to tell whether 16-bit VMID is
available
  kvm/arm: Align the VMID allocation with the arm64 ASID one

 arch/arm/include/asm/kvm_asm.h|   2 +-
 arch/arm/include/asm/kvm_host.h   |   5 +-
 arch/arm/include/asm/kvm_hyp.h|   1 +
 arch/arm/include/asm/kvm_mmu.h|   3 +-
 arch/arm/include/asm/lib_asid.h   |  79 +++
 arch/arm/kvm/Makefile |   1 +
 arch/arm/kvm/hyp/tlb.c|   8 +-
 arch/arm64/include/asm/cpucaps.h  |   3 +-
 arch/arm64/include/asm/kvm_asid.h |   8 ++
 arch/arm64/include/asm/kvm_asm.h  |   2 +-
 arch/arm64/include/asm/kvm_host.h |   5 +-
 arch/arm64/include/asm/kvm_mmu.h  |   7 +-
 arch/arm64/include/asm/lib_asid.h |  79 +++
 arch/arm64/kernel/cpufeature.c|   9 ++
 arch/arm64/kvm/hyp/tlb.c  |  10 +-
 arch/arm64/lib/Makefile   |   2 +
 arch/arm64/lib/asid.c | 190 
 arch/arm64/mm/context.c   | 200 +-
 virt/kvm/arm/arm.c| 125 +---
 19 files changed, 458 insertions(+), 281 deletions(-)
 create mode 100644 arch/arm/include/asm/lib_asid.h
 create mode 100644 arch/arm64/include/asm/kvm_asid.h
 create mode 100644 arch/arm64/include/asm/lib_asid.h
 create mode 100644 arch/arm64/lib/asid.c

-- 
2.11.0



[PATCH v3 01/15] arm64/mm: Introduce asid_info structure and move asid_generation/asid_map to it

2019-07-24 Thread Julien Grall
In an attempt to make the ASID allocator generic, create a new structure
asid_info to store all the information necessary for the allocator.

For now, move the variables asid_generation and asid_map to the new structure
asid_info. Follow-up patches will move more variables.

Note: to avoid more renaming afterwards, a local variable 'info' has been
created and is a pointer to the ASID allocator structure.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Turn asid_info into a static variable
---
 arch/arm64/mm/context.c | 46 ++
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index b5e329fde2dd..b0789f30d03b 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -19,8 +19,11 @@
 static u32 asid_bits;
 static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
 
-static atomic64_t asid_generation;
-static unsigned long *asid_map;
+static struct asid_info
+{
+   atomic64_t  generation;
+   unsigned long   *map;
+} asid_info;
 
 static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
@@ -77,13 +80,13 @@ void verify_cpu_asid_bits(void)
}
 }
 
-static void flush_context(void)
+static void flush_context(struct asid_info *info)
 {
int i;
u64 asid;
 
/* Update the list of reserved ASIDs and the ASID bitmap. */
-   bitmap_clear(asid_map, 0, NUM_USER_ASIDS);
+   bitmap_clear(info->map, 0, NUM_USER_ASIDS);
 
for_each_possible_cpu(i) {
asid = atomic64_xchg_relaxed(_cpu(active_asids, i), 0);
@@ -96,7 +99,7 @@ static void flush_context(void)
 */
if (asid == 0)
asid = per_cpu(reserved_asids, i);
-   __set_bit(asid2idx(asid), asid_map);
+   __set_bit(asid2idx(asid), info->map);
per_cpu(reserved_asids, i) = asid;
}
 
@@ -131,11 +134,11 @@ static bool check_update_reserved_asid(u64 asid, u64 
newasid)
return hit;
 }
 
-static u64 new_context(struct mm_struct *mm)
+static u64 new_context(struct asid_info *info, struct mm_struct *mm)
 {
static u32 cur_idx = 1;
u64 asid = atomic64_read(>context.id);
-   u64 generation = atomic64_read(_generation);
+   u64 generation = atomic64_read(>generation);
 
if (asid != 0) {
u64 newasid = generation | (asid & ~ASID_MASK);
@@ -151,7 +154,7 @@ static u64 new_context(struct mm_struct *mm)
 * We had a valid ASID in a previous life, so try to re-use
 * it if possible.
 */
-   if (!__test_and_set_bit(asid2idx(asid), asid_map))
+   if (!__test_and_set_bit(asid2idx(asid), info->map))
return newasid;
}
 
@@ -162,20 +165,20 @@ static u64 new_context(struct mm_struct *mm)
 * a reserved TTBR0 for the init_mm and we allocate ASIDs in even/odd
 * pairs.
 */
-   asid = find_next_zero_bit(asid_map, NUM_USER_ASIDS, cur_idx);
+   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS, cur_idx);
if (asid != NUM_USER_ASIDS)
goto set_asid;
 
/* We're out of ASIDs, so increment the global generation count */
generation = atomic64_add_return_relaxed(ASID_FIRST_VERSION,
-_generation);
-   flush_context();
+>generation);
+   flush_context(info);
 
/* We have more ASIDs than CPUs, so this will always succeed */
-   asid = find_next_zero_bit(asid_map, NUM_USER_ASIDS, 1);
+   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS, 1);
 
 set_asid:
-   __set_bit(asid, asid_map);
+   __set_bit(asid, info->map);
cur_idx = asid;
return idx2asid(asid) | generation;
 }
@@ -184,6 +187,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 {
unsigned long flags;
u64 asid, old_active_asid;
+   struct asid_info *info = _info;
 
if (system_supports_cnp())
cpu_set_reserved_ttbr0();
@@ -206,7 +210,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 */
old_active_asid = atomic64_read(_cpu(active_asids, cpu));
if (old_active_asid &&
-   !((asid ^ atomic64_read(_generation)) >> asid_bits) &&
+   !((asid ^ atomic64_read(>generation)) >> asid_bits) &&
atomic64_cmpxchg_relaxed(_cpu(active_asids, cpu),
 old_active_asid, asid))
goto switch_mm_fastpath;
@@ -214,8 +218,8 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
raw_spin_lock_irqsave(_asid_lock, flags);
/* Check that our ASID belongs to the current generation. */
asid = ato

KVM Arm64 and Linux-RT issues

2019-07-23 Thread Julien Grall
.565031] 000:  kvm_arch_vcpu_ioctl_run+0x658/0xbc0
[  122.565032] 000:  kvm_vcpu_ioctl+0x3a0/0xae0
[  122.565034] 000:  do_vfs_ioctl+0xbc/0x910
[  122.565036] 000:  ksys_ioctl+0x78/0xa8
[  122.565038] 000:  __arm64_sys_ioctl+0x1c/0x28
[  122.565040] 000:  el0_svc_common.constprop.0+0x90/0x188
[  122.565042] 000:  el0_svc_handler+0x28/0x78
[  122.565045] 000:  el0_svc+0x8/0xc
[  122.565048] 000: Code: 88107c31 35b0 d65f03c0 f9800031 (885f7c31)
[  122.565052] 000: ---[ end trace 0005 ]---
[  122.565060] 000: note: kvm-vcpu-1[1430] exited with preempt_count 1

The first problem "BUG: sleeping function called from invalid context at 
kernel/locking/rtmutex.c:968" seem to be related to RT-specific commit 
d628c3c56cab "hrtimer: Introduce expiry spin lock".


From my understanding, the problem is that hrtimer_cancel() is called from a 
preempt notifier, where preemption is disabled. The patch mentioned 
above actually requires hrtimer_cancel() to be called from a preemptible context.


Do you have any thoughts on how the problem should be addressed?

The second problem seems to hint that migrate_enable() was called on a task that is 
not pinned (-1). This will result in dereferencing an invalid value. I need to 
investigate how this can happen.


Looking at the other RT tree, I think 5.0 RT now has the same problem.

Cheers,

--
Julien Grall


Re: [RFC v2 14/14] kvm/arm: Align the VMID allocation with the arm64 ASID one

2019-07-15 Thread Julien Grall

On 03/07/2019 18:36, James Morse wrote:

Hi Julien,


Hi James,


On 20/06/2019 14:06, Julien Grall wrote:

At the moment, the VMID algorithm will send an SGI to all the CPUs to
force an exit and then broadcast a full TLB flush and I-Cache
invalidation.

This patch re-uses the new ASID allocator. The
benefits are:
 - CPUs are not forced to exit at roll-over. Instead the VMID will be
 marked reserved and the context will be flushed at the next exit. This
 will reduce the IPI traffic.
 - Context invalidation is now per-CPU rather than broadcast.


+ Catalin has a model of the asid-allocator.


That's a good point :).





With the new algo, the code is now adapted:
 - The function __kvm_flush_vm_context() has been renamed to
 __kvm_flush_cpu_vmid_context and now only flushing the current CPU context.
 - The call to update_vttbr() will be done with preemption disabled
 as the new algo requires to store information per-CPU.
 - The TLBs associated to EL1 will be flushed when booting a CPU to
 deal with stale information. This was previously done on the
 allocation of the first VMID of a new generation.

The measurement was made on a Seattle based SoC (8 CPUs), with the
number of VMID limited to 4-bit. The test involves running concurrently 40
guests with 2 vCPUs. Each guest will then execute hackbench 5 times
before exiting.



diff --git a/arch/arm64/include/asm/kvm_asid.h 
b/arch/arm64/include/asm/kvm_asid.h
new file mode 100644
index ..8b586e43c094
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_asid.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ARM64_KVM_ASID_H__
+#define __ARM64_KVM_ASID_H__
+
+#include 
+
+#endif /* __ARM64_KVM_ASID_H__ */
+
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index ff73f5462aca..06821f548c0f 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -62,7 +62,7 @@ extern char __kvm_hyp_init_end[];
  
  extern char __kvm_hyp_vector[];
  
-extern void __kvm_flush_vm_context(void);

+extern void __kvm_flush_cpu_vmid_context(void);
  extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);


As we've got a __kvm_tlb_flush_local_vmid(), would __kvm_tlb_flush_local_all() 
fit in
better? (This mirrors local_flush_tlb_all() too)


I am happy with the renaming here.





  extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
  extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 4bcd9c1291d5..7ef45b7da4eb 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -68,8 +68,8 @@ int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long 
ext);
  void __extended_idmap_trampoline(phys_addr_t boot_pgd, phys_addr_t 
idmap_start);
  
  struct kvm_vmid {

-   /* The VMID generation used for the virt. memory system */
-   u64vmid_gen;
+   /* The ASID used for the ASID allocator */
+   atomic64_t asid;


Can we call this 'id' as happens in mm_context_t? (calling it asid is confusing)


I am fine with this suggestion.




u32vmid;


Can we filter out the generation bits in kvm_get_vttbr() in the same way the 
arch code
does in cpu_do_switch_mm().

I think this saves writing back a cached pre-filtered version every time, or 
needing
special hooks to know when the value changed. (so we can remove this variable)


[...]


+static void vmid_update_ctxt(void *ctxt)
  {
+   struct kvm_vmid *vmid = ctxt;
+   u64 asid = atomic64_read(>asid);



+   vmid->vmid = asid & ((1ULL << kvm_get_vmid_bits()) - 1);


I don't like having to poke this through the asid-allocator as a kvm-specific 
hack. Can we
do it in kvm_get_vttbr()?


I will have a look.
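
Roughly something like the following is what I would try first (a sketch
only, not tested; it assumes the existing arm64 kvm_get_vttbr() and
kvm_get_vmid_bits() helpers and the 'id' rename you suggested above, and it
omits the CnP handling for brevity):

static __always_inline u64 kvm_get_vttbr(struct kvm *kvm)
{
	struct kvm_vmid *vmid = &kvm->arch.vmid;
	u64 baddr = kvm->arch.pgd_phys;
	u64 vmid_field;

	/*
	 * Drop the generation bits at read time, like cpu_do_switch_mm()
	 * does for the ASID, so no pre-filtered copy needs to be cached.
	 */
	vmid_field = (atomic64_read(&vmid->id) &
		      GENMASK(kvm_get_vmid_bits() - 1, 0)) << VTTBR_VMID_SHIFT;

	return kvm_phys_to_vttbr(baddr) | vmid_field;
}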





  }



@@ -487,48 +467,11 @@ static bool need_new_vmid_gen(struct kvm_vmid *vmid)


(git made a mess of the diff here... squashed to just the new code:)


  static void update_vmid(struct kvm_vmid *vmid)
  {



+   int cpu = get_cpu();
  
+	asid_check_context(_info, >asid, cpu, vmid);
  
+	put_cpu();


If we're calling update_vmid() in a pre-emptible context, aren't we already 
doomed?


Yes we are. This made me realize that Linux-RT replaced the preempt_disable() in 
the caller with migrate_disable(). The latter will prevent the task from moving to 
another CPU but will allow preemption.


This patch will likely make things awfully broken for Linux-RT. I will have a 
look to see if we can call this from a preempt notifier.




Could we use smp_processor_id() instead.



  }




@@ -1322,6 +1271,8 @@ static void cpu_init_hyp_mode(void *dummy)
  
  	__cpu_init_hyp_mode(pgd_ptr, hyp_stack_ptr, vector_ptr);

__cpu_init_stage2();




+   kvm_call_hyp(__kvm_flush_cpu_vmid_context);


I think we only need to do this for VHE systems too. cpu_hyp_reinit() only does 
the call
to cpu_init_hyp_mode() if !is_kernel_in_hyp_mo

Re: [RFC v2 11/14] arm64: Move the ASID allocator code in a separate file

2019-07-15 Thread Julien Grall

On 04/07/2019 15:56, James Morse wrote:

Hi Julien,


Hi James,

Thank you for the review.



On 20/06/2019 14:06, Julien Grall wrote:

We will want to re-use the ASID allocator in a separate context (e.g.
allocating VMID). So move the code to a new file.

The function asid_check_context has been moved to the header as a static
inline function because we want to avoid adding a branch when checking if the
ASID is still valid.



diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 3df63a28856c..b745cf356fe1 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -23,46 +23,21 @@



-#define ASID_FIRST_VERSION(info)   NUM_ASIDS(info)



diff --git a/arch/arm64/lib/asid.c b/arch/arm64/lib/asid.c
new file mode 100644
index ..7252e4fdd5e9
--- /dev/null
+++ b/arch/arm64/lib/asid.c
@@ -0,0 +1,185 @@



+#define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))


(oops!)


Good catch, I will fix it in the next version.





@@ -344,7 +115,7 @@ static int asids_init(void)
if (!asid_allocator_init(_info, bits, ASID_PER_CONTEXT,
 asid_flush_cpu_ctxt))
panic("Unable to initialize ASID allocator for %lu ASIDs\n",
- 1UL << bits);
+ NUM_ASIDS(_info));


Could this go in the patch that adds NUM_ASIDS()?


Actually this change is potentially wrong. This relies on asid_allocator_init() 
to set asid_info.bits even if the function fails.


So I think it would be best to keep 1UL << bits here.

Cheers,

--
Julien Grall


[RFC v2 12/14] arm64/lib: asid: Allow user to update the context under the lock

2019-06-20 Thread Julien Grall
Some users of the ASID allocator (e.g. VMID) will need to update the
context when a new ASID is generated. This has to be protected by a lock
to prevent concurrent modification.

Rather than introducing yet another lock, it is possible to re-use the
allocator lock for that purpose. This patch introduces a new callback
that will be called when updating the context.

Signed-off-by: Julien Grall 
---
 arch/arm64/include/asm/lib_asid.h | 12 
 arch/arm64/lib/asid.c | 10 --
 arch/arm64/mm/context.c   | 11 ---
 3 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/lib_asid.h 
b/arch/arm64/include/asm/lib_asid.h
index c18e9eca500e..810f0b05a8da 100644
--- a/arch/arm64/include/asm/lib_asid.h
+++ b/arch/arm64/include/asm/lib_asid.h
@@ -23,6 +23,8 @@ struct asid_info
unsigned intctxt_shift;
/* Callback to locally flush the context. */
void(*flush_cpu_ctxt_cb)(void);
+   /* Callback to call when a context is updated */
+   void(*update_ctxt_cb)(void *ctxt);
 };
 
 #define NUM_ASIDS(info)(1UL << ((info)->bits))
@@ -31,7 +33,7 @@ struct asid_info
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
 
 void asid_new_context(struct asid_info *info, atomic64_t *pasid,
- unsigned int cpu);
+ unsigned int cpu, void *ctxt);
 
 /*
  * Check the ASID is still valid for the context. If not generate a new ASID.
@@ -40,7 +42,8 @@ void asid_new_context(struct asid_info *info, atomic64_t 
*pasid,
  * @cpu: current CPU ID. Must have been acquired throught get_cpu()
  */
 static inline void asid_check_context(struct asid_info *info,
- atomic64_t *pasid, unsigned int cpu)
+  atomic64_t *pasid, unsigned int cpu,
+  void *ctxt)
 {
u64 asid, old_active_asid;
 
@@ -67,11 +70,12 @@ static inline void asid_check_context(struct asid_info 
*info,
 old_active_asid, asid))
return;
 
-   asid_new_context(info, pasid, cpu);
+   asid_new_context(info, pasid, cpu, ctxt);
 }
 
 int asid_allocator_init(struct asid_info *info,
u32 bits, unsigned int asid_per_ctxt,
-   void (*flush_cpu_ctxt_cb)(void));
+   void (*flush_cpu_ctxt_cb)(void),
+   void (*update_ctxt_cb)(void *ctxt));
 
 #endif
diff --git a/arch/arm64/lib/asid.c b/arch/arm64/lib/asid.c
index 7252e4fdd5e9..dd2c6e4c1ff0 100644
--- a/arch/arm64/lib/asid.c
+++ b/arch/arm64/lib/asid.c
@@ -130,9 +130,10 @@ static u64 new_context(struct asid_info *info, atomic64_t 
*pasid)
  * @pasid: Pointer to the current ASID batch allocated. It will be updated
  * with the new ASID batch.
  * @cpu: current CPU ID. Must have been acquired through get_cpu()
+ * @ctxt: Context to update when calling update_context
  */
 void asid_new_context(struct asid_info *info, atomic64_t *pasid,
- unsigned int cpu)
+ unsigned int cpu, void *ctxt)
 {
unsigned long flags;
u64 asid;
@@ -149,6 +150,9 @@ void asid_new_context(struct asid_info *info, atomic64_t 
*pasid,
info->flush_cpu_ctxt_cb();
 
atomic64_set(_asid(info, cpu), asid);
+
+   info->update_ctxt_cb(ctxt);
+
raw_spin_unlock_irqrestore(>lock, flags);
 }
 
@@ -163,11 +167,13 @@ void asid_new_context(struct asid_info *info, atomic64_t 
*pasid,
  */
 int asid_allocator_init(struct asid_info *info,
u32 bits, unsigned int asid_per_ctxt,
-   void (*flush_cpu_ctxt_cb)(void))
+   void (*flush_cpu_ctxt_cb)(void),
+   void (*update_ctxt_cb)(void *ctxt))
 {
info->bits = bits;
info->ctxt_shift = ilog2(asid_per_ctxt);
info->flush_cpu_ctxt_cb = flush_cpu_ctxt_cb;
+   info->update_ctxt_cb = update_ctxt_cb;
/*
 * Expect allocation after rollover to fail if we don't have at least
 * one more ASID than CPUs. ASID #0 is always reserved.
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index b745cf356fe1..527ea82983d7 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -82,7 +82,7 @@ void check_and_switch_context(struct mm_struct *mm, unsigned 
int cpu)
if (system_supports_cnp())
cpu_set_reserved_ttbr0();
 
-   asid_check_context(_info, >context.id, cpu);
+   asid_check_context(_info, >context.id, cpu, mm);
 
arm64_apply_bp_hardening();
 
@@ -108,12 +108,17 @@ static void asid_flush_cpu_ctxt(void)
local_flush_tlb_all();
 }
 
+static void asid_update_ctxt(void *ctxt)
+{
+   /* Nothing to do */
+}
+
 static int asids_init(void)

[RFC v2 13/14] arm/kvm: Introduce a new VMID allocator

2019-06-20 Thread Julien Grall
A follow-up patch will replace the KVM VMID allocator with the arm64 ASID
allocator.

To avoid duplication as much as possible, the arm KVM code will directly
compile arch/arm64/lib/asid.c. The header is a verbatim copy, to
avoid breaking the assumption that an architecture port has self-contained
headers.

Signed-off-by: Julien Grall 
Cc: Russell King 

---
I hit a warning when compiling the ASID code:

linux/arch/arm/kvm/../../arm64/lib/asid.c:17: warning: "ASID_MASK" redefined
 #define ASID_MASK(info)   (~GENMASK((info)->bits - 1, 0))

In file included from linux/include/linux/mm_types.h:18,
 from linux/include/linux/mmzone.h:21,
 from linux/include/linux/gfp.h:6,
 from linux/include/linux/slab.h:15,
 from linux/arch/arm/kvm/../../arm64/lib/asid.c:11:
linux/arch/arm/include/asm/mmu.h:26: note: this is the location of the previous 
definition
 #define ASID_MASK ((~0ULL) << ASID_BITS)

I haven't resolved it yet because I am not sure of the best way to go.
AFAICT ASID_MASK is only used in mm/context.c, so I am wondering whether
it would be acceptable to move the define.
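
To illustrate the idea (this is only a sketch of the move, not a real or
tested patch): since ASID_MASK appears to be used only in
arch/arm/mm/context.c, the 32-bit definition could simply move out of the
shared header into its only user, assuming ASID_BITS stays visible there
through the existing includes:

--- a/arch/arm/include/asm/mmu.h
+++ b/arch/arm/include/asm/mmu.h
-#define ASID_MASK ((~0ULL) << ASID_BITS)

--- a/arch/arm/mm/context.c
+++ b/arch/arm/mm/context.c
+#define ASID_MASK ((~0ULL) << ASID_BITS)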

Changes in v2:
- Re-use arm64/lib/asid.c rather than duplication the code.
---
 arch/arm/include/asm/lib_asid.h | 81 +
 arch/arm/kvm/Makefile   |  1 +
 2 files changed, 82 insertions(+)
 create mode 100644 arch/arm/include/asm/lib_asid.h

diff --git a/arch/arm/include/asm/lib_asid.h b/arch/arm/include/asm/lib_asid.h
new file mode 100644
index ..79bce4686d21
--- /dev/null
+++ b/arch/arm/include/asm/lib_asid.h
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ARM_LIB_ASID_H__
+#define __ARM_LIB_ASID_H__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct asid_info
+{
+   atomic64_t  generation;
+   unsigned long   *map;
+   atomic64_t __percpu *active;
+   u64 __percpu*reserved;
+   u32 bits;
+   /* Lock protecting the structure */
+   raw_spinlock_t  lock;
+   /* Which CPU requires context flush on next call */
+   cpumask_t   flush_pending;
+   /* Number of ASID allocated by context (shift value) */
+   unsigned intctxt_shift;
+   /* Callback to locally flush the context. */
+   void(*flush_cpu_ctxt_cb)(void);
+   /* Callback to call when a context is updated */
+   void(*update_ctxt_cb)(void *ctxt);
+};
+
+#define NUM_ASIDS(info)(1UL << ((info)->bits))
+#define NUM_CTXT_ASIDS(info)   (NUM_ASIDS(info) >> (info)->ctxt_shift)
+
+#define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
+
+void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+ unsigned int cpu, void *ctxt);
+
+/*
+ * Check the ASID is still valid for the context. If not generate a new ASID.
+ *
+ * @pasid: Pointer to the current ASID batch
+ * @cpu: current CPU ID. Must have been acquired throught get_cpu()
+ */
+static inline void asid_check_context(struct asid_info *info,
+  atomic64_t *pasid, unsigned int cpu,
+  void *ctxt)
+{
+   u64 asid, old_active_asid;
+
+   asid = atomic64_read(pasid);
+
+   /*
+* The memory ordering here is subtle.
+* If our active_asid is non-zero and the ASID matches the current
+* generation, then we update the active_asid entry with a relaxed
+* cmpxchg. Racing with a concurrent rollover means that either:
+*
+* - We get a zero back from the cmpxchg and end up waiting on the
+*   lock. Taking the lock synchronises with the rollover and so
+*   we are forced to see the updated generation.
+*
+* - We get a valid ASID back from the cmpxchg, which means the
+*   relaxed xchg in flush_context will treat us as reserved
+*   because atomic RmWs are totally ordered for a given location.
+*/
+   old_active_asid = atomic64_read(_asid(info, cpu));
+   if (old_active_asid &&
+   !((asid ^ atomic64_read(>generation)) >> info->bits) &&
+   atomic64_cmpxchg_relaxed(_asid(info, cpu),
+old_active_asid, asid))
+   return;
+
+   asid_new_context(info, pasid, cpu, ctxt);
+}
+
+int asid_allocator_init(struct asid_info *info,
+   u32 bits, unsigned int asid_per_ctxt,
+   void (*flush_cpu_ctxt_cb)(void),
+   void (*update_ctxt_cb)(void *ctxt));
+
+#endif /* __ARM_LIB_ASID_H__ */
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index 531e59f5be9c..6ab49bd84531 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -40,3 +40,4 @@ obj-y += $(KVM)/arm/vg

[RFC v2 10/14] arm64/mm: Introduce a callback to flush the local context

2019-06-20 Thread Julien Grall
Flushing the local context will vary depending on the actual user of the ASID
allocator. Introduce a new callback to flush the local context and move
the call that flushes the local TLB into it.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index fbef5a5c5624..3df63a28856c 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -39,6 +39,8 @@ static struct asid_info
cpumask_t   flush_pending;
/* Number of ASID allocated by context (shift value) */
unsigned intctxt_shift;
+   /* Callback to locally flush the context. */
+   void(*flush_cpu_ctxt_cb)(void);
 } asid_info;
 
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
@@ -266,7 +268,7 @@ static void asid_new_context(struct asid_info *info, 
atomic64_t *pasid,
}
 
if (cpumask_test_and_clear_cpu(cpu, >flush_pending))
-   local_flush_tlb_all();
+   info->flush_cpu_ctxt_cb();
 
atomic64_set(_asid(info, cpu), asid);
raw_spin_unlock_irqrestore(>lock, flags);
@@ -298,6 +300,11 @@ asmlinkage void post_ttbr_update_workaround(void)
CONFIG_CAVIUM_ERRATUM_27456));
 }
 
+static void asid_flush_cpu_ctxt(void)
+{
+   local_flush_tlb_all();
+}
+
 /*
  * Initialize the ASID allocator
  *
@@ -308,10 +315,12 @@ asmlinkage void post_ttbr_update_workaround(void)
  * 2.
  */
 static int asid_allocator_init(struct asid_info *info,
-  u32 bits, unsigned int asid_per_ctxt)
+  u32 bits, unsigned int asid_per_ctxt,
+  void (*flush_cpu_ctxt_cb)(void))
 {
info->bits = bits;
info->ctxt_shift = ilog2(asid_per_ctxt);
+   info->flush_cpu_ctxt_cb = flush_cpu_ctxt_cb;
/*
 * Expect allocation after rollover to fail if we don't have at least
 * one more ASID than CPUs. ASID #0 is always reserved.
@@ -332,7 +341,8 @@ static int asids_init(void)
 {
u32 bits = get_cpu_asid_bits();
 
-   if (!asid_allocator_init(_info, bits, ASID_PER_CONTEXT))
+   if (!asid_allocator_init(_info, bits, ASID_PER_CONTEXT,
+asid_flush_cpu_ctxt))
panic("Unable to initialize ASID allocator for %lu ASIDs\n",
  1UL << bits);
 
-- 
2.11.0



[RFC v2 08/14] arm64/mm: Split asid_inits in 2 parts

2019-06-20 Thread Julien Grall
Move the common initialization of the ASID allocator out into a separate
function.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 43 +++
 1 file changed, 31 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index beba8e5b4100..81bc3d365436 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -271,31 +271,50 @@ asmlinkage void post_ttbr_update_workaround(void)
CONFIG_CAVIUM_ERRATUM_27456));
 }
 
-static int asids_init(void)
+/*
+ * Initialize the ASID allocator
+ *
+ * @info: Pointer to the asid allocator structure
+ * @bits: Number of ASIDs available
+ * @asid_per_ctxt: Number of ASIDs to allocate per-context. ASIDs are
+ * allocated contiguously for a given context. This value should be a power of
+ * 2.
+ */
+static int asid_allocator_init(struct asid_info *info,
+  u32 bits, unsigned int asid_per_ctxt)
 {
-   struct asid_info *info = _info;
-
-   info->bits = get_cpu_asid_bits();
-   info->ctxt_shift = ilog2(ASID_PER_CONTEXT);
+   info->bits = bits;
+   info->ctxt_shift = ilog2(asid_per_ctxt);
/*
 * Expect allocation after rollover to fail if we don't have at least
-* one more ASID than CPUs. ASID #0 is reserved for init_mm.
+* one more ASID than CPUs. ASID #0 is always reserved.
 */
WARN_ON(NUM_CTXT_ASIDS(info) - 1 <= num_possible_cpus());
atomic64_set(>generation, ASID_FIRST_VERSION(info));
info->map = kcalloc(BITS_TO_LONGS(NUM_CTXT_ASIDS(info)),
sizeof(*info->map), GFP_KERNEL);
if (!info->map)
-   panic("Failed to allocate bitmap for %lu ASIDs\n",
- NUM_CTXT_ASIDS(info));
-
-   info->active = _asids;
-   info->reserved = _asids;
+   return -ENOMEM;
 
raw_spin_lock_init(>lock);
 
+   return 0;
+}
+
+static int asids_init(void)
+{
+   u32 bits = get_cpu_asid_bits();
+
+   if (!asid_allocator_init(_info, bits, ASID_PER_CONTEXT))
+   panic("Unable to initialize ASID allocator for %lu ASIDs\n",
+ 1UL << bits);
+
+   asid_info.active = _asids;
+   asid_info.reserved = _asids;
+
pr_info("ASID allocator initialised with %lu entries\n",
-   NUM_CTXT_ASIDS(info));
+   NUM_CTXT_ASIDS(_info));
+
return 0;
 }
 early_initcall(asids_init);
-- 
2.11.0



[RFC v2 11/14] arm64: Move the ASID allocator code in a separate file

2019-06-20 Thread Julien Grall
We will want to re-use the ASID allocator in a separate context (e.g.
allocating VMID). So move the code to a new file.

The function asid_check_context has been moved to the header as a static
inline function because we want to avoid adding a branch when checking if the
ASID is still valid.

Signed-off-by: Julien Grall 

---

This code will be used in the virt code for allocating VMIDs. I am not
entirely sure where to place it. Lib could potentially be a good place, but I
am not entirely convinced the algorithm as it is could be used by other
architectures.

Looking at x86, it seems that it will not be possible to re-use it because
the number of PCIDs (aka ASIDs) could be smaller than the number of CPUs.
See commit 10af6235e0d327d42e1bad974385197817923dc1 ("x86/mm:
Implement PCID based optimization: try to preserve old TLB entries using
PCID").

Changes in v2:
- Rename the header from asid.h to lib_asid.h
---
 arch/arm64/include/asm/lib_asid.h |  77 +
 arch/arm64/lib/Makefile   |   2 +
 arch/arm64/lib/asid.c | 185 ++
 arch/arm64/mm/context.c   | 235 +-
 4 files changed, 267 insertions(+), 232 deletions(-)
 create mode 100644 arch/arm64/include/asm/lib_asid.h
 create mode 100644 arch/arm64/lib/asid.c

diff --git a/arch/arm64/include/asm/lib_asid.h 
b/arch/arm64/include/asm/lib_asid.h
new file mode 100644
index ..c18e9eca500e
--- /dev/null
+++ b/arch/arm64/include/asm/lib_asid.h
@@ -0,0 +1,77 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_ASM_LIB_ASID_H
+#define __ASM_ASM_LIB_ASID_H
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct asid_info
+{
+   atomic64_t  generation;
+   unsigned long   *map;
+   atomic64_t __percpu *active;
+   u64 __percpu*reserved;
+   u32 bits;
+   /* Lock protecting the structure */
+   raw_spinlock_t  lock;
+   /* Which CPU requires context flush on next call */
+   cpumask_t   flush_pending;
+   /* Number of ASID allocated by context (shift value) */
+   unsigned intctxt_shift;
+   /* Callback to locally flush the context. */
+   void(*flush_cpu_ctxt_cb)(void);
+};
+
+#define NUM_ASIDS(info)(1UL << ((info)->bits))
+#define NUM_CTXT_ASIDS(info)   (NUM_ASIDS(info) >> (info)->ctxt_shift)
+
+#define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
+
+void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+ unsigned int cpu);
+
+/*
+ * Check the ASID is still valid for the context. If not generate a new ASID.
+ *
+ * @pasid: Pointer to the current ASID batch
+ * @cpu: current CPU ID. Must have been acquired throught get_cpu()
+ */
+static inline void asid_check_context(struct asid_info *info,
+ atomic64_t *pasid, unsigned int cpu)
+{
+   u64 asid, old_active_asid;
+
+   asid = atomic64_read(pasid);
+
+   /*
+* The memory ordering here is subtle.
+* If our active_asid is non-zero and the ASID matches the current
+* generation, then we update the active_asid entry with a relaxed
+* cmpxchg. Racing with a concurrent rollover means that either:
+*
+* - We get a zero back from the cmpxchg and end up waiting on the
+*   lock. Taking the lock synchronises with the rollover and so
+*   we are forced to see the updated generation.
+*
+* - We get a valid ASID back from the cmpxchg, which means the
+*   relaxed xchg in flush_context will treat us as reserved
+*   because atomic RmWs are totally ordered for a given location.
+*/
+   old_active_asid = atomic64_read(_asid(info, cpu));
+   if (old_active_asid &&
+   !((asid ^ atomic64_read(>generation)) >> info->bits) &&
+   atomic64_cmpxchg_relaxed(_asid(info, cpu),
+old_active_asid, asid))
+   return;
+
+   asid_new_context(info, pasid, cpu);
+}
+
+int asid_allocator_init(struct asid_info *info,
+   u32 bits, unsigned int asid_per_ctxt,
+   void (*flush_cpu_ctxt_cb)(void));
+
+#endif
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 33c2a4abda04..37169d541ab5 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -5,6 +5,8 @@ lib-y   := clear_user.o delay.o copy_from_user.o
\
   memcmp.o strcmp.o strncmp.o strlen.o strnlen.o   \
   strchr.o strrchr.o tishift.o
 
+lib-y  += asid.o
+
 ifeq ($(CONFIG_KERNEL_MODE_NEON), y)
 obj-$(CONFIG_XOR_BLOCKS)   += xor-neon.o
 CFLAGS_REMOVE_xor-neon.o   += -mgeneral-regs-only
diff --git a/arch/arm64/lib/asid.c b/arch/arm64/lib/a

[RFC v2 14/14] kvm/arm: Align the VMID allocation with the arm64 ASID one

2019-06-20 Thread Julien Grall
At the moment, the VMID algorithm will send an SGI to all the CPUs to
force an exit and then broadcast a full TLB flush and I-Cache
invalidation.

This patch re-uses the new ASID allocator. The
benefits are:
- CPUs are not forced to exit at roll-over. Instead the VMID will be
marked reserved and the context will be flushed at the next exit. This
will reduce the IPI traffic.
- Context invalidation is now per-CPU rather than broadcast.

With the new algo, the code is now adapted:
- The function __kvm_flush_vm_context() has been renamed to
__kvm_flush_cpu_vmid_context and now only flushing the current CPU context.
- The call to update_vttbr() will be done with preemption disabled
as the new algo requires to store information per-CPU.
- The TLBs associated to EL1 will be flushed when booting a CPU to
deal with stale information. This was previously done on the
allocation of the first VMID of a new generation.

The measurement was made on a Seattle based SoC (8 CPUs), with the
number of VMID limited to 4-bit. The test involves running concurrently 40
guests with 2 vCPUs. Each guest will then execute hackbench 5 times
before exiting.

The performance differences between the current algorithm and the new one are:
- 2.5% fewer exits from the guest
- 22.4% more flushes, although they are now local rather than
broadcast
- 0.11% faster (just for the record)

Signed-off-by: Julien Grall 


Looking at __kvm_flush_vm_context(), it might be possible to
reduce the overhead further by removing the I-Cache flush for caches
other than VIPT. This has been left aside for now.
---
 arch/arm/include/asm/kvm_asm.h|   2 +-
 arch/arm/include/asm/kvm_host.h   |   5 +-
 arch/arm/include/asm/kvm_hyp.h|   1 +
 arch/arm/kvm/hyp/tlb.c|   8 +--
 arch/arm64/include/asm/kvm_asid.h |   8 +++
 arch/arm64/include/asm/kvm_asm.h  |   2 +-
 arch/arm64/include/asm/kvm_host.h |   5 +-
 arch/arm64/kvm/hyp/tlb.c  |  10 ++--
 virt/kvm/arm/arm.c| 112 +-
 9 files changed, 61 insertions(+), 92 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_asid.h

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index f615830f9f57..c2a2e6ef1e2f 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -53,7 +53,7 @@ struct kvm_vcpu;
 extern char __kvm_hyp_init[];
 extern char __kvm_hyp_init_end[];
 
-extern void __kvm_flush_vm_context(void);
+extern void __kvm_flush_cpu_vmid_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
 extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
 extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu);
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index f80418ddeb60..7b894ff16688 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -50,8 +50,8 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
 void kvm_reset_coprocs(struct kvm_vcpu *vcpu);
 
 struct kvm_vmid {
-   /* The VMID generation used for the virt. memory system */
-   u64vmid_gen;
+   /* The ASID used for the ASID allocator */
+   atomic64_t asid;
u32vmid;
 };
 
@@ -259,7 +259,6 @@ unsigned long __kvm_call_hyp(void *hypfn, ...);
ret;\
})
 
-void force_vm_exit(const cpumask_t *mask);
 int __kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
  struct kvm_vcpu_events *events);
 
diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
index 87bcd18df8d5..c3d1011ca1bf 100644
--- a/arch/arm/include/asm/kvm_hyp.h
+++ b/arch/arm/include/asm/kvm_hyp.h
@@ -75,6 +75,7 @@
 #define TLBIALLIS  __ACCESS_CP15(c8, 0, c3, 0)
 #define TLBIALL__ACCESS_CP15(c8, 0, c7, 0)
 #define TLBIALLNSNHIS  __ACCESS_CP15(c8, 4, c3, 4)
+#define TLBIALLNSNH__ACCESS_CP15(c8, 4, c7, 4)
 #define PRRR   __ACCESS_CP15(c10, 0, c2, 0)
 #define NMRR   __ACCESS_CP15(c10, 0, c2, 1)
 #define AMAIR0 __ACCESS_CP15(c10, 0, c3, 0)
diff --git a/arch/arm/kvm/hyp/tlb.c b/arch/arm/kvm/hyp/tlb.c
index 8e4afba73635..42b9ab47fc94 100644
--- a/arch/arm/kvm/hyp/tlb.c
+++ b/arch/arm/kvm/hyp/tlb.c
@@ -71,9 +71,9 @@ void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu 
*vcpu)
write_sysreg(0, VTTBR);
 }
 
-void __hyp_text __kvm_flush_vm_context(void)
+void __hyp_text __kvm_flush_cpu_vmid_context(void)
 {
-   write_sysreg(0, TLBIALLNSNHIS);
-   write_sysreg(0, ICIALLUIS);
-   dsb(ish);
+   write_sysreg(0, TLBIALLNSNH);
+   write_sysreg(0, ICIALLU);
+   dsb(nsh);
 }
diff --git a/arch/arm64/include/asm/kvm_asid.h 
b/arch/arm64/include/asm/kvm_asid.h
new file mode 100644
index ..8b586e43c094
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_asid.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL

[RFC v2 05/14] arm64/mm: Remove dependency on MM in new_context

2019-06-20 Thread Julien Grall
The function new_context will be part of a generic ASID allocator. At
the moment, the MM structure is only used to fetch the ASID.

To remove the dependency on MM, it is possible to just pass a pointer to
the current ASID.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 6457a9310fe4..a9cc59288b08 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -151,10 +151,10 @@ static bool check_update_reserved_asid(struct asid_info 
*info, u64 asid,
return hit;
 }
 
-static u64 new_context(struct asid_info *info, struct mm_struct *mm)
+static u64 new_context(struct asid_info *info, atomic64_t *pasid)
 {
static u32 cur_idx = 1;
-   u64 asid = atomic64_read(>context.id);
+   u64 asid = atomic64_read(pasid);
u64 generation = atomic64_read(>generation);
 
if (asid != 0) {
@@ -236,7 +236,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
/* Check that our ASID belongs to the current generation. */
asid = atomic64_read(>context.id);
if ((asid ^ atomic64_read(>generation)) >> info->bits) {
-   asid = new_context(info, mm);
+   asid = new_context(info, >context.id);
atomic64_set(>context.id, asid);
}
 
-- 
2.11.0



[RFC v2 02/14] arm64/mm: Move active_asids and reserved_asids to asid_info

2019-06-20 Thread Julien Grall
The variables active_asids and reserved_asids hold information for a
given ASID allocator. So move them to the structure asid_info.

At the same time, introduce wrappers to access the active and reserved
ASIDs to make the code clearer.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 34 ++
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 8167c369172d..6bacfc295f6e 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -34,10 +34,16 @@ static struct asid_info
 {
atomic64_t  generation;
unsigned long   *map;
+   atomic64_t __percpu *active;
+   u64 __percpu*reserved;
 } asid_info;
 
+#define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
+#define reserved_asid(info, cpu) *per_cpu_ptr((info)->reserved, cpu)
+
 static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
+
 static cpumask_t tlb_flush_pending;
 
 #define ASID_MASK  (~GENMASK(asid_bits - 1, 0))
@@ -100,7 +106,7 @@ static void flush_context(struct asid_info *info)
bitmap_clear(info->map, 0, NUM_USER_ASIDS);
 
for_each_possible_cpu(i) {
-   asid = atomic64_xchg_relaxed(_cpu(active_asids, i), 0);
+   asid = atomic64_xchg_relaxed(_asid(info, i), 0);
/*
 * If this CPU has already been through a
 * rollover, but hasn't run another task in
@@ -109,9 +115,9 @@ static void flush_context(struct asid_info *info)
 * the process it is still running.
 */
if (asid == 0)
-   asid = per_cpu(reserved_asids, i);
+   asid = reserved_asid(info, i);
__set_bit(asid2idx(asid), info->map);
-   per_cpu(reserved_asids, i) = asid;
+   reserved_asid(info, i) = asid;
}
 
/*
@@ -121,7 +127,8 @@ static void flush_context(struct asid_info *info)
cpumask_setall(_flush_pending);
 }
 
-static bool check_update_reserved_asid(u64 asid, u64 newasid)
+static bool check_update_reserved_asid(struct asid_info *info, u64 asid,
+  u64 newasid)
 {
int cpu;
bool hit = false;
@@ -136,9 +143,9 @@ static bool check_update_reserved_asid(u64 asid, u64 
newasid)
 * generation.
 */
for_each_possible_cpu(cpu) {
-   if (per_cpu(reserved_asids, cpu) == asid) {
+   if (reserved_asid(info, cpu) == asid) {
hit = true;
-   per_cpu(reserved_asids, cpu) = newasid;
+   reserved_asid(info, cpu) = newasid;
}
}
 
@@ -158,7 +165,7 @@ static u64 new_context(struct asid_info *info, struct 
mm_struct *mm)
 * If our current ASID was active during a rollover, we
 * can continue to use it and this was just a false alarm.
 */
-   if (check_update_reserved_asid(asid, newasid))
+   if (check_update_reserved_asid(info, asid, newasid))
return newasid;
 
/*
@@ -207,8 +214,8 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 
/*
 * The memory ordering here is subtle.
-* If our active_asids is non-zero and the ASID matches the current
-* generation, then we update the active_asids entry with a relaxed
+* If our active_asid is non-zero and the ASID matches the current
+* generation, then we update the active_asid entry with a relaxed
 * cmpxchg. Racing with a concurrent rollover means that either:
 *
 * - We get a zero back from the cmpxchg and end up waiting on the
@@ -219,10 +226,10 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 *   relaxed xchg in flush_context will treat us as reserved
 *   because atomic RmWs are totally ordered for a given location.
 */
-   old_active_asid = atomic64_read(_cpu(active_asids, cpu));
+   old_active_asid = atomic64_read(_asid(info, cpu));
if (old_active_asid &&
!((asid ^ atomic64_read(>generation)) >> asid_bits) &&
-   atomic64_cmpxchg_relaxed(_cpu(active_asids, cpu),
+   atomic64_cmpxchg_relaxed(_asid(info, cpu),
 old_active_asid, asid))
goto switch_mm_fastpath;
 
@@ -237,7 +244,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
if (cpumask_test_and_clear_cpu(cpu, _flush_pending))
local_flush_tlb_all();
 
-   atomic64_set(_cpu(active_asids, cpu), asid);
+   atomic64_set(_asid(info, cpu), asid);
raw_spin_unlock_irqrestore(_asid_lock, flags);
 
 switch_mm_fastpath:
@@ -278,6 +285,9 @@ static int a

[RFC v2 09/14] arm64/mm: Split the function check_and_switch_context in 3 parts

2019-06-20 Thread Julien Grall
The function check_and_switch_context is used to:
1) Check whether the ASID is still valid
2) Generate a new one if it is not valid
3) Switch the context

While the latter is specific to the MM subsystem, the rest could be part
of the generic ASID allocator.

After this patch, the function is now split in 3 parts which corresponds
to the use of the functions:
1) asid_check_context: Check if the ASID is still valid
2) asid_new_context: Generate a new ASID for the context
3) check_and_switch_context: Call 1) and 2) and switch the context

1) and 2) have not been merged into a single function because we want to
avoid adding a branch when the ASID is still valid. This will matter
when the code is moved into a separate file later on, as 1) will reside
in the header as a static inline function.

Signed-off-by: Julien Grall 

---

Will wants to avoid adding a branch when the ASID is still valid, so
1) and 2) are in separate functions. The former will move to a new
header and become a static inline function.
---
 arch/arm64/mm/context.c | 51 +
 1 file changed, 39 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 81bc3d365436..fbef5a5c5624 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -204,16 +204,21 @@ static u64 new_context(struct asid_info *info, atomic64_t 
*pasid)
return idx2asid(info, asid) | generation;
 }
 
-void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
+static void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+unsigned int cpu);
+
+/*
+ * Check the ASID is still valid for the context. If not generate a new ASID.
+ *
+ * @pasid: Pointer to the current ASID batch
+ * @cpu: current CPU ID. Must have been acquired throught get_cpu()
+ */
+static void asid_check_context(struct asid_info *info,
+  atomic64_t *pasid, unsigned int cpu)
 {
-   unsigned long flags;
u64 asid, old_active_asid;
-   struct asid_info *info = _info;
 
-   if (system_supports_cnp())
-   cpu_set_reserved_ttbr0();
-
-   asid = atomic64_read(>context.id);
+   asid = atomic64_read(pasid);
 
/*
 * The memory ordering here is subtle.
@@ -234,14 +239,30 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
!((asid ^ atomic64_read(>generation)) >> info->bits) &&
atomic64_cmpxchg_relaxed(_asid(info, cpu),
 old_active_asid, asid))
-   goto switch_mm_fastpath;
+   return;
+
+   asid_new_context(info, pasid, cpu);
+}
+
+/*
+ * Generate a new ASID for the context.
+ *
+ * @pasid: Pointer to the current ASID batch allocated. It will be updated
+ * with the new ASID batch.
+ * @cpu: current CPU ID. Must have been acquired through get_cpu()
+ */
+static void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+unsigned int cpu)
+{
+   unsigned long flags;
+   u64 asid;
 
raw_spin_lock_irqsave(>lock, flags);
/* Check that our ASID belongs to the current generation. */
-   asid = atomic64_read(>context.id);
+   asid = atomic64_read(pasid);
if ((asid ^ atomic64_read(>generation)) >> info->bits) {
-   asid = new_context(info, >context.id);
-   atomic64_set(>context.id, asid);
+   asid = new_context(info, pasid);
+   atomic64_set(pasid, asid);
}
 
if (cpumask_test_and_clear_cpu(cpu, >flush_pending))
@@ -249,8 +270,14 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 
atomic64_set(_asid(info, cpu), asid);
raw_spin_unlock_irqrestore(>lock, flags);
+}
+
+void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
+{
+   if (system_supports_cnp())
+   cpu_set_reserved_ttbr0();
 
-switch_mm_fastpath:
+   asid_check_context(_info, >context.id, cpu);
 
arm64_apply_bp_hardening();
 
-- 
2.11.0
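
As an aside for readers following the refactoring: the split above boils
down to a lock-free fast path (asid_check_context) plus a locked slow path
(asid_new_context). The standalone C model below uses plain C11 atomics, a
single allocator and made-up numbers purely to illustrate that shape; it is
not the kernel implementation.

    /* Illustrative model of the fast-path/slow-path split (not kernel code). */
    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdio.h>

    #define ASID_BITS 4
    #define NCPUS     2

    static _Atomic uint64_t generation;
    static _Atomic uint64_t active_asid[NCPUS];

    /* Slow path: in the kernel this runs under the allocator lock. */
    static void asid_new_context(_Atomic uint64_t *pasid, int cpu)
    {
        uint64_t asid = atomic_load(&generation) | 1;  /* pretend index 1 was free */

        atomic_store(pasid, asid);
        atomic_store(&active_asid[cpu], asid);
        printf("cpu%d: slow path, new asid %#llx\n", cpu, (unsigned long long)asid);
    }

    /* Fast path: mirrors the shape of asid_check_context() above. */
    static void asid_check_context(_Atomic uint64_t *pasid, int cpu)
    {
        uint64_t asid = atomic_load(pasid);
        uint64_t old = atomic_load(&active_asid[cpu]);

        /* Non-zero active entry and same generation: keep it, no lock taken. */
        if (old && !((asid ^ atomic_load(&generation)) >> ASID_BITS) &&
            atomic_compare_exchange_strong(&active_asid[cpu], &old, asid)) {
            printf("cpu%d: fast path, kept asid %#llx\n", cpu, (unsigned long long)asid);
            return;
        }

        asid_new_context(pasid, cpu);
    }

    int main(void)
    {
        _Atomic uint64_t ctx_id = 0;

        atomic_store(&generation, 1UL << ASID_BITS);
        asid_check_context(&ctx_id, 0);   /* first use: slow path */
        asid_check_context(&ctx_id, 0);   /* second use: fast path */
        return 0;
    }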



[RFC v2 03/14] arm64/mm: Move bits to asid_info

2019-06-20 Thread Julien Grall
The variable bits holds information for a given ASID allocator, so move
it into the asid_info structure.

Because most of the macros relied on bits, they now take an extra
parameter: a pointer to the asid_info structure.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 59 +
 1 file changed, 30 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 6bacfc295f6e..7883347ece52 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -27,7 +27,6 @@
 #include 
 #include 
 
-static u32 asid_bits;
 static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
 
 static struct asid_info
@@ -36,6 +35,7 @@ static struct asid_info
unsigned long   *map;
atomic64_t __percpu *active;
u64 __percpu*reserved;
+   u32 bits;
 } asid_info;
 
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
@@ -46,17 +46,17 @@ static DEFINE_PER_CPU(u64, reserved_asids);
 
 static cpumask_t tlb_flush_pending;
 
-#define ASID_MASK  (~GENMASK(asid_bits - 1, 0))
-#define ASID_FIRST_VERSION (1UL << asid_bits)
+#define ASID_MASK(info)(~GENMASK((info)->bits - 1, 0))
+#define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
-#define NUM_USER_ASIDS (ASID_FIRST_VERSION >> 1)
-#define asid2idx(asid) (((asid) & ~ASID_MASK) >> 1)
-#define idx2asid(idx)  (((idx) << 1) & ~ASID_MASK)
+#define NUM_USER_ASIDS(info)   (ASID_FIRST_VERSION(info) >> 1)
+#define asid2idx(info, asid)   (((asid) & ~ASID_MASK(info)) >> 1)
+#define idx2asid(info, idx)(((idx) << 1) & ~ASID_MASK(info))
 #else
-#define NUM_USER_ASIDS (ASID_FIRST_VERSION)
-#define asid2idx(asid) ((asid) & ~ASID_MASK)
-#define idx2asid(idx)  asid2idx(idx)
+#define NUM_USER_ASIDS(info)   (ASID_FIRST_VERSION(info))
+#define asid2idx(info, asid)   ((asid) & ~ASID_MASK(info))
+#define idx2asid(info, idx)asid2idx(info, idx)
 #endif
 
 /* Get the ASIDBits supported by the current CPU */
@@ -86,13 +86,13 @@ void verify_cpu_asid_bits(void)
 {
u32 asid = get_cpu_asid_bits();
 
-   if (asid < asid_bits) {
+   if (asid < asid_info.bits) {
/*
 * We cannot decrease the ASID size at runtime, so panic if we 
support
 * fewer ASID bits than the boot CPU.
 */
pr_crit("CPU%d: smaller ASID size(%u) than boot CPU (%u)\n",
-   smp_processor_id(), asid, asid_bits);
+   smp_processor_id(), asid, asid_info.bits);
cpu_panic_kernel();
}
 }
@@ -103,7 +103,7 @@ static void flush_context(struct asid_info *info)
u64 asid;
 
/* Update the list of reserved ASIDs and the ASID bitmap. */
-   bitmap_clear(info->map, 0, NUM_USER_ASIDS);
+   bitmap_clear(info->map, 0, NUM_USER_ASIDS(info));
 
for_each_possible_cpu(i) {
asid = atomic64_xchg_relaxed(_asid(info, i), 0);
@@ -116,7 +116,7 @@ static void flush_context(struct asid_info *info)
 */
if (asid == 0)
asid = reserved_asid(info, i);
-   __set_bit(asid2idx(asid), info->map);
+   __set_bit(asid2idx(info, asid), info->map);
reserved_asid(info, i) = asid;
}
 
@@ -159,7 +159,7 @@ static u64 new_context(struct asid_info *info, struct 
mm_struct *mm)
u64 generation = atomic64_read(>generation);
 
if (asid != 0) {
-   u64 newasid = generation | (asid & ~ASID_MASK);
+   u64 newasid = generation | (asid & ~ASID_MASK(info));
 
/*
 * If our current ASID was active during a rollover, we
@@ -172,7 +172,7 @@ static u64 new_context(struct asid_info *info, struct 
mm_struct *mm)
 * We had a valid ASID in a previous life, so try to re-use
 * it if possible.
 */
-   if (!__test_and_set_bit(asid2idx(asid), info->map))
+   if (!__test_and_set_bit(asid2idx(info, asid), info->map))
return newasid;
}
 
@@ -183,22 +183,22 @@ static u64 new_context(struct asid_info *info, struct 
mm_struct *mm)
 * a reserved TTBR0 for the init_mm and we allocate ASIDs in even/odd
 * pairs.
 */
-   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS, cur_idx);
-   if (asid != NUM_USER_ASIDS)
+   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS(info), cur_idx);
+   if (asid != NUM_USER_ASIDS(info))
goto set_asid;
 
/* We're out of ASIDs, so increment the global gener
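
The diff above is cut short by the archive. For readers trying to follow
the macro changes, here is a small worked example of what the parameterised
macros evaluate to; it assumes the common arm64 case of 16 ASID bits and is
an illustration only, not kernel code.

    /* What the info-parameterised macros work out to for bits == 16. */
    #include <stdio.h>

    #define ASID_MASK(bits)          (~((1UL << (bits)) - 1))  /* ~GENMASK(bits - 1, 0) */
    #define ASID_FIRST_VERSION(bits) (1UL << (bits))

    int main(void)
    {
        unsigned int bits = 16;   /* asid_info.bits on most arm64 implementations */

        printf("ASID_MASK          = %#lx\n", ASID_MASK(bits));
        printf("ASID_FIRST_VERSION = %#lx\n", ASID_FIRST_VERSION(bits));
        /* With CONFIG_UNMAP_KERNEL_AT_EL0 (even/odd pairs) vs. without: */
        printf("NUM_USER_ASIDS     = %#lx or %#lx\n",
               ASID_FIRST_VERSION(bits) >> 1, ASID_FIRST_VERSION(bits));
        return 0;
    }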

[RFC v2 07/14] arm64/mm: Introduce NUM_ASIDS

2019-06-20 Thread Julien Grall
At the moment ASID_FIRST_VERSION is used to know the number of ASIDs
supported. As we are going to move the ASID allocator into a separate file,
it would be better to use a different name for external users.

This patch adds NUM_ASIDS and implements ASID_FIRST_VERSION using it.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index d128f02644b0..beba8e5b4100 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -48,7 +48,9 @@ static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
 
 #define ASID_MASK(info)(~GENMASK((info)->bits - 1, 0))
-#define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))
+#define NUM_ASIDS(info)(1UL << ((info)->bits))
+
+#define ASID_FIRST_VERSION(info)   NUM_ASIDS(info)
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
 #define ASID_PER_CONTEXT   2
@@ -56,7 +58,7 @@ static DEFINE_PER_CPU(u64, reserved_asids);
 #define ASID_PER_CONTEXT   1
 #endif
 
-#define NUM_CTXT_ASIDS(info)   (ASID_FIRST_VERSION(info) >> 
(info)->ctxt_shift)
+#define NUM_CTXT_ASIDS(info)   (NUM_ASIDS(info) >> (info)->ctxt_shift)
 #define asid2idx(info, asid)   (((asid) & ~ASID_MASK(info)) >> 
(info)->ctxt_shift)
 #define idx2asid(info, idx)(((idx) << (info)->ctxt_shift) & 
~ASID_MASK(info))
 
-- 
2.11.0



[RFC v2 04/14] arm64/mm: Move the variable lock and tlb_flush_pending to asid_info

2019-06-20 Thread Julien Grall
The variables lock and tlb_flush_pending hold information for a given
ASID allocator, so move them into the asid_info structure.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 7883347ece52..6457a9310fe4 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -27,8 +27,6 @@
 #include 
 #include 
 
-static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
-
 static struct asid_info
 {
atomic64_t  generation;
@@ -36,6 +34,9 @@ static struct asid_info
atomic64_t __percpu *active;
u64 __percpu*reserved;
u32 bits;
+   raw_spinlock_t  lock;
+   /* Which CPU requires context flush on next call */
+   cpumask_t   flush_pending;
 } asid_info;
 
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
@@ -44,8 +45,6 @@ static struct asid_info
 static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
 
-static cpumask_t tlb_flush_pending;
-
 #define ASID_MASK(info)(~GENMASK((info)->bits - 1, 0))
 #define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))
 
@@ -124,7 +123,7 @@ static void flush_context(struct asid_info *info)
 * Queue a TLB invalidation for each CPU to perform on next
 * context-switch
 */
-   cpumask_setall(_flush_pending);
+   cpumask_setall(>flush_pending);
 }
 
 static bool check_update_reserved_asid(struct asid_info *info, u64 asid,
@@ -233,7 +232,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 old_active_asid, asid))
goto switch_mm_fastpath;
 
-   raw_spin_lock_irqsave(_asid_lock, flags);
+   raw_spin_lock_irqsave(>lock, flags);
/* Check that our ASID belongs to the current generation. */
asid = atomic64_read(>context.id);
if ((asid ^ atomic64_read(>generation)) >> info->bits) {
@@ -241,11 +240,11 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
atomic64_set(>context.id, asid);
}
 
-   if (cpumask_test_and_clear_cpu(cpu, _flush_pending))
+   if (cpumask_test_and_clear_cpu(cpu, >flush_pending))
local_flush_tlb_all();
 
atomic64_set(_asid(info, cpu), asid);
-   raw_spin_unlock_irqrestore(_asid_lock, flags);
+   raw_spin_unlock_irqrestore(>lock, flags);
 
 switch_mm_fastpath:
 
@@ -288,6 +287,8 @@ static int asids_init(void)
info->active = _asids;
info->reserved = _asids;
 
+   raw_spin_lock_init(>lock);
+
pr_info("ASID allocator initialised with %lu entries\n",
NUM_USER_ASIDS(info));
return 0;
-- 
2.11.0
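
These two fields implement the rollover handshake: a rollover marks every
CPU in flush_pending, and each CPU performs its local TLB invalidation the
next time it takes the slow path. The standalone model below (a plain
bitmask instead of a cpumask, printf instead of a flush) only illustrates
that handshake; it is not the kernel code.

    /* Illustration of the flush_pending handshake (not kernel code). */
    #include <stdio.h>

    #define NCPUS 4

    static unsigned long flush_pending;   /* stands in for the cpumask */

    static void rollover(void)
    {
        flush_pending = (1UL << NCPUS) - 1;   /* cpumask_setall() */
        printf("rollover: every CPU marked for a local flush\n");
    }

    static void slow_path(int cpu)
    {
        if (flush_pending & (1UL << cpu)) {   /* cpumask_test_and_clear_cpu() */
            flush_pending &= ~(1UL << cpu);
            printf("cpu%d: local TLB flush\n", cpu);
        } else {
            printf("cpu%d: nothing pending\n", cpu);
        }
    }

    int main(void)
    {
        rollover();
        slow_path(0);
        slow_path(0);   /* already flushed, nothing to do */
        slow_path(3);
        return 0;
    }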



[RFC v2 06/14] arm64/mm: Store the number of asid allocated per context

2019-06-20 Thread Julien Grall
Currently the number of ASIDs allocated per context is determined at
compile time. As the algorithm is becoming generic, the user may want to
instantiate the ASID allocator multiple times with a different number of
ASIDs allocated per context.

Add a field in asid_info to track the number of ASIDs allocated per
context. This is stored as a shift amount to avoid divisions in the code,
which means the number of ASIDs allocated per context must be a power of
two.

At the same time, rename NUM_USER_ASIDS to NUM_CTXT_ASIDS to make the
name more generic.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index a9cc59288b08..d128f02644b0 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -37,6 +37,8 @@ static struct asid_info
raw_spinlock_t  lock;
/* Which CPU requires context flush on next call */
cpumask_t   flush_pending;
+   /* Number of ASID allocated by context (shift value) */
+   unsigned intctxt_shift;
 } asid_info;
 
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
@@ -49,15 +51,15 @@ static DEFINE_PER_CPU(u64, reserved_asids);
 #define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
-#define NUM_USER_ASIDS(info)   (ASID_FIRST_VERSION(info) >> 1)
-#define asid2idx(info, asid)   (((asid) & ~ASID_MASK(info)) >> 1)
-#define idx2asid(info, idx)(((idx) << 1) & ~ASID_MASK(info))
+#define ASID_PER_CONTEXT   2
 #else
-#define NUM_USER_ASIDS(info)   (ASID_FIRST_VERSION(info))
-#define asid2idx(info, asid)   ((asid) & ~ASID_MASK(info))
-#define idx2asid(info, idx)asid2idx(info, idx)
+#define ASID_PER_CONTEXT   1
 #endif
 
+#define NUM_CTXT_ASIDS(info)   (ASID_FIRST_VERSION(info) >> 
(info)->ctxt_shift)
+#define asid2idx(info, asid)   (((asid) & ~ASID_MASK(info)) >> 
(info)->ctxt_shift)
+#define idx2asid(info, idx)(((idx) << (info)->ctxt_shift) & 
~ASID_MASK(info))
+
 /* Get the ASIDBits supported by the current CPU */
 static u32 get_cpu_asid_bits(void)
 {
@@ -102,7 +104,7 @@ static void flush_context(struct asid_info *info)
u64 asid;
 
/* Update the list of reserved ASIDs and the ASID bitmap. */
-   bitmap_clear(info->map, 0, NUM_USER_ASIDS(info));
+   bitmap_clear(info->map, 0, NUM_CTXT_ASIDS(info));
 
for_each_possible_cpu(i) {
asid = atomic64_xchg_relaxed(_asid(info, i), 0);
@@ -182,8 +184,8 @@ static u64 new_context(struct asid_info *info, atomic64_t 
*pasid)
 * a reserved TTBR0 for the init_mm and we allocate ASIDs in even/odd
 * pairs.
 */
-   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS(info), cur_idx);
-   if (asid != NUM_USER_ASIDS(info))
+   asid = find_next_zero_bit(info->map, NUM_CTXT_ASIDS(info), cur_idx);
+   if (asid != NUM_CTXT_ASIDS(info))
goto set_asid;
 
/* We're out of ASIDs, so increment the global generation count */
@@ -192,7 +194,7 @@ static u64 new_context(struct asid_info *info, atomic64_t 
*pasid)
flush_context(info);
 
/* We have more ASIDs than CPUs, so this will always succeed */
-   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS(info), 1);
+   asid = find_next_zero_bit(info->map, NUM_CTXT_ASIDS(info), 1);
 
 set_asid:
__set_bit(asid, info->map);
@@ -272,17 +274,18 @@ static int asids_init(void)
struct asid_info *info = _info;
 
info->bits = get_cpu_asid_bits();
+   info->ctxt_shift = ilog2(ASID_PER_CONTEXT);
/*
 * Expect allocation after rollover to fail if we don't have at least
 * one more ASID than CPUs. ASID #0 is reserved for init_mm.
 */
-   WARN_ON(NUM_USER_ASIDS(info) - 1 <= num_possible_cpus());
+   WARN_ON(NUM_CTXT_ASIDS(info) - 1 <= num_possible_cpus());
atomic64_set(>generation, ASID_FIRST_VERSION(info));
-   info->map = kcalloc(BITS_TO_LONGS(NUM_USER_ASIDS(info)),
+   info->map = kcalloc(BITS_TO_LONGS(NUM_CTXT_ASIDS(info)),
sizeof(*info->map), GFP_KERNEL);
if (!info->map)
panic("Failed to allocate bitmap for %lu ASIDs\n",
- NUM_USER_ASIDS(info));
+ NUM_CTXT_ASIDS(info));
 
info->active = _asids;
info->reserved = _asids;
@@ -290,7 +293,7 @@ static int asids_init(void)
raw_spin_lock_init(>lock);
 
pr_info("ASID allocator initialised with %lu entries\n",
-   NUM_USER_ASIDS(info));
+   NUM_CTXT_ASIDS(info));
return 0;
 }
 early_initcall(asids
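
The diff above is truncated by the archive. The arithmetic it introduces is
simple enough to show in isolation: with ASID_PER_CONTEXT == 2 (the KPTI
case) ctxt_shift is 1, contexts get even/odd ASID pairs, and only half of
the ASID space counts as allocatable. A worked example, assuming 16 ASID
bits (illustration only, not kernel code):

    /* Worked example of the ctxt_shift arithmetic (illustration only). */
    #include <stdio.h>

    int main(void)
    {
        unsigned int bits = 16;
        unsigned int asid_per_ctxt = 2;                         /* KPTI: even/odd pairs */
        unsigned int ctxt_shift = __builtin_ctz(asid_per_ctxt); /* ilog2() for powers of two */
        unsigned long asid_mask = ~((1UL << bits) - 1);
        unsigned long num_ctxt_asids = (1UL << bits) >> ctxt_shift;

        unsigned long idx = 42;
        unsigned long asid = (idx << ctxt_shift) & ~asid_mask;  /* idx2asid() */

        printf("NUM_CTXT_ASIDS = %#lx\n", num_ctxt_asids);
        printf("idx %lu -> asid %#lx -> idx %lu\n",
               idx, asid, (asid & ~asid_mask) >> ctxt_shift);   /* asid2idx() */
        return 0;
    }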

[RFC v2 01/14] arm64/mm: Introduce asid_info structure and move asid_generation/asid_map to it

2019-06-20 Thread Julien Grall
In an attempt to make the ASID allocator generic, create a new structure
asid_info to store all the information necessary for the allocator.

For now, move the variables asid_generation and asid_map to the new structure
asid_info. Follow-up patches will move more variables.

Note that, to avoid more renaming afterwards, a local variable 'info' has
been created; it is a pointer to the ASID allocator structure.

Signed-off-by: Julien Grall 

---
Changes in v2:
- Turn asid_info into a static variable
---
 arch/arm64/mm/context.c | 46 ++
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 1f0ea2facf24..8167c369172d 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -30,8 +30,11 @@
 static u32 asid_bits;
 static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
 
-static atomic64_t asid_generation;
-static unsigned long *asid_map;
+static struct asid_info
+{
+   atomic64_t  generation;
+   unsigned long   *map;
+} asid_info;
 
 static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
@@ -88,13 +91,13 @@ void verify_cpu_asid_bits(void)
}
 }
 
-static void flush_context(void)
+static void flush_context(struct asid_info *info)
 {
int i;
u64 asid;
 
/* Update the list of reserved ASIDs and the ASID bitmap. */
-   bitmap_clear(asid_map, 0, NUM_USER_ASIDS);
+   bitmap_clear(info->map, 0, NUM_USER_ASIDS);
 
for_each_possible_cpu(i) {
asid = atomic64_xchg_relaxed(_cpu(active_asids, i), 0);
@@ -107,7 +110,7 @@ static void flush_context(void)
 */
if (asid == 0)
asid = per_cpu(reserved_asids, i);
-   __set_bit(asid2idx(asid), asid_map);
+   __set_bit(asid2idx(asid), info->map);
per_cpu(reserved_asids, i) = asid;
}
 
@@ -142,11 +145,11 @@ static bool check_update_reserved_asid(u64 asid, u64 
newasid)
return hit;
 }
 
-static u64 new_context(struct mm_struct *mm)
+static u64 new_context(struct asid_info *info, struct mm_struct *mm)
 {
static u32 cur_idx = 1;
u64 asid = atomic64_read(>context.id);
-   u64 generation = atomic64_read(_generation);
+   u64 generation = atomic64_read(>generation);
 
if (asid != 0) {
u64 newasid = generation | (asid & ~ASID_MASK);
@@ -162,7 +165,7 @@ static u64 new_context(struct mm_struct *mm)
 * We had a valid ASID in a previous life, so try to re-use
 * it if possible.
 */
-   if (!__test_and_set_bit(asid2idx(asid), asid_map))
+   if (!__test_and_set_bit(asid2idx(asid), info->map))
return newasid;
}
 
@@ -173,20 +176,20 @@ static u64 new_context(struct mm_struct *mm)
 * a reserved TTBR0 for the init_mm and we allocate ASIDs in even/odd
 * pairs.
 */
-   asid = find_next_zero_bit(asid_map, NUM_USER_ASIDS, cur_idx);
+   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS, cur_idx);
if (asid != NUM_USER_ASIDS)
goto set_asid;
 
/* We're out of ASIDs, so increment the global generation count */
generation = atomic64_add_return_relaxed(ASID_FIRST_VERSION,
-_generation);
-   flush_context();
+>generation);
+   flush_context(info);
 
/* We have more ASIDs than CPUs, so this will always succeed */
-   asid = find_next_zero_bit(asid_map, NUM_USER_ASIDS, 1);
+   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS, 1);
 
 set_asid:
-   __set_bit(asid, asid_map);
+   __set_bit(asid, info->map);
cur_idx = asid;
return idx2asid(asid) | generation;
 }
@@ -195,6 +198,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 {
unsigned long flags;
u64 asid, old_active_asid;
+   struct asid_info *info = _info;
 
if (system_supports_cnp())
cpu_set_reserved_ttbr0();
@@ -217,7 +221,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 */
old_active_asid = atomic64_read(_cpu(active_asids, cpu));
if (old_active_asid &&
-   !((asid ^ atomic64_read(_generation)) >> asid_bits) &&
+   !((asid ^ atomic64_read(>generation)) >> asid_bits) &&
atomic64_cmpxchg_relaxed(_cpu(active_asids, cpu),
 old_active_asid, asid))
goto switch_mm_fastpath;
@@ -225,8 +229,8 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
raw_spin_lock_irqsave(_asid_lock, flags);
/* Check that our ASID belongs to the current generation. */
asid = ato

[RFC v2 00/14] kvm/arm: Align the VMID allocation with the arm64 ASID one

2019-06-20 Thread Julien Grall
Hi all,

This patch series moves the ASID allocator out into a separate file in order
to re-use it for the VMID. The benefits are:
- CPUs are not forced to exit on a roll-over.
- Context invalidation is now per-CPU rather than
  broadcasted.

There is no performance regression on the fast path for ASID allocation.
Actually, on the hackbench measurement (300 hackbench) it was 0.7% faster.

The measurement was made on a Seattle-based SoC (8 CPUs), with the
number of VMIDs limited to 4 bits. The test involves running 40 guests
with 2 vCPUs concurrently. Each guest then executes hackbench 5 times
before exiting.

The performance differences (on 5.1-rc1) between the current algorithm and
the new one are:
- 2.5% less exit from the guest
- 22.4% more flush, although they are now local rather than broadcasted
- 0.11% faster (just for the record)

The ASID allocator rework to make it generic has been divided in multiple
patches to make the review easier.

Compared to the first RFC, arm does not duplicate most of the code anymore.
Instead, arm will build the version from arm64.

A branch with the patch based on 5.2-rc5 can be found:

http://xenbits.xen.org/gitweb/?p=people/julieng/linux-arm.git;a=shortlog;h=refs/heads/vmid-rework/rfc-v2

Best regards,

Cc: Russell King 

Julien Grall (14):
  arm64/mm: Introduce asid_info structure and move
asid_generation/asid_map to it
  arm64/mm: Move active_asids and reserved_asids to asid_info
  arm64/mm: Move bits to asid_info
  arm64/mm: Move the variable lock and tlb_flush_pending to asid_info
  arm64/mm: Remove dependency on MM in new_context
  arm64/mm: Store the number of asid allocated per context
  arm64/mm: Introduce NUM_ASIDS
  arm64/mm: Split asid_inits in 2 parts
  arm64/mm: Split the function check_and_switch_context in 3 parts
  arm64/mm: Introduce a callback to flush the local context
  arm64: Move the ASID allocator code in a separate file
  arm64/lib: asid: Allow user to update the context under the lock
  arm/kvm: Introduce a new VMID allocator
  kvm/arm: Align the VMID allocation with the arm64 ASID one

 arch/arm/include/asm/kvm_asm.h|   2 +-
 arch/arm/include/asm/kvm_host.h   |   5 +-
 arch/arm/include/asm/kvm_hyp.h|   1 +
 arch/arm/include/asm/lib_asid.h   |  81 +++
 arch/arm/kvm/Makefile |   1 +
 arch/arm/kvm/hyp/tlb.c|   8 +-
 arch/arm64/include/asm/kvm_asid.h |   8 ++
 arch/arm64/include/asm/kvm_asm.h  |   2 +-
 arch/arm64/include/asm/kvm_host.h |   5 +-
 arch/arm64/include/asm/lib_asid.h |  81 +++
 arch/arm64/kvm/hyp/tlb.c  |  10 +-
 arch/arm64/lib/Makefile   |   2 +
 arch/arm64/lib/asid.c | 191 +++
 arch/arm64/mm/context.c   | 205 ++
 virt/kvm/arm/arm.c| 112 +++--
 15 files changed, 447 insertions(+), 267 deletions(-)
 create mode 100644 arch/arm/include/asm/lib_asid.h
 create mode 100644 arch/arm64/include/asm/kvm_asid.h
 create mode 100644 arch/arm64/include/asm/lib_asid.h
 create mode 100644 arch/arm64/lib/asid.c

-- 
2.11.0



Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file

2019-06-19 Thread Julien Grall

Hi Guo,

On 19/06/2019 12:51, Guo Ren wrote:

On Wed, Jun 19, 2019 at 4:54 PM Julien Grall  wrote:




On 6/19/19 9:07 AM, Guo Ren wrote:

Hi Julien,


Hi,



You forgot CCing C-SKY folks :P


I wasn't aware you could be interested :).



Move arm asid allocator code in a generic one is a good idea, I've
made a patchset for C-SKY and test is on processing, See:
https://lore.kernel.org/linux-csky/1560930553-26502-1-git-send-email-guo...@kernel.org/

If you plan to seperate it into generic one, I could co-work with you.


Did the ASID allocator work out of the box on C-SKY?

Almost done, but one question:
arm64 remove the code in switch_mm:
   cpumask_clear_cpu(cpu, mm_cpumask(prev));
   cpumask_set_cpu(cpu, mm_cpumask(next));





Why? Although arm64 cache operations could affect all harts with CTC
method of interconnect, I think we should
keep these code for primitive integrity in linux. Because cpu_bitmap
is in mm_struct instead of mm->context.


I will let Will answer to this.

[...]


If so, I can easily move the code in a generic place (maybe lib/asid.c).

I think it's OK.


Will raised concerns about moving the code into lib. So I will stick with
what I currently have.


Cheers,

--
Julien Grall


Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file

2019-06-19 Thread Julien Grall




On 6/19/19 9:07 AM, Guo Ren wrote:

Hi Julien,


Hi,



You forgot CCing C-SKY folks :P


I wasn't aware you could be interested :).



Move arm asid allocator code in a generic one is a good idea, I've
made a patchset for C-SKY and test is on processing, See:
https://lore.kernel.org/linux-csky/1560930553-26502-1-git-send-email-guo...@kernel.org/

If you plan to seperate it into generic one, I could co-work with you.


Did the ASID allocator work out of the box on C-SKY? If so, I can easily
move the code to a generic place (maybe lib/asid.c).


Cheers,

--
Julien Grall


Re: [PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file

2019-06-05 Thread Julien Grall

Hi,

I am CCing RISC-V folks to see if there are an interest to share the code.

@RISC-V: I noticed you are discussing importing a version of the ASID
allocator into RISC-V. At first look, the code looks quite similar. Would the
library below help you?


Cheers,

On 21/03/2019 16:36, Julien Grall wrote:

We will want to re-use the ASID allocator in a separate context (e.g.
allocating VMIDs), so move the code into a new file.

The function asid_check_context has been moved into the header as a static
inline function because we want to avoid adding a branch when checking if the
ASID is still valid.

Signed-off-by: Julien Grall 

---

This code will be used in the virt code for allocating VMIDs. I am not
entirely sure where to place it. Lib could potentially be a good place, but I
am not entirely convinced the algorithm as it is could be used by other
architectures.

Looking at x86, it seems that it will not be possible to re-use it because
the number of PCIDs (aka ASIDs) could be smaller than the number of CPUs.
See commit 10af6235e0d327d42e1bad974385197817923dc1 ("x86/mm:
Implement PCID based optimization: try to preserve old TLB entries using
PCID").
---
  arch/arm64/include/asm/asid.h |  77 ++
  arch/arm64/lib/Makefile   |   2 +
  arch/arm64/lib/asid.c | 185 +
  arch/arm64/mm/context.c   | 235 +-
  4 files changed, 267 insertions(+), 232 deletions(-)
  create mode 100644 arch/arm64/include/asm/asid.h
  create mode 100644 arch/arm64/lib/asid.c

diff --git a/arch/arm64/include/asm/asid.h b/arch/arm64/include/asm/asid.h
new file mode 100644
index ..bb62b587f37f
--- /dev/null
+++ b/arch/arm64/include/asm/asid.h
@@ -0,0 +1,77 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_ASM_ASID_H
+#define __ASM_ASM_ASID_H
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct asid_info
+{
+   atomic64_t  generation;
+   unsigned long   *map;
+   atomic64_t __percpu *active;
+   u64 __percpu*reserved;
+   u32 bits;
+   /* Lock protecting the structure */
+   raw_spinlock_t  lock;
+   /* Which CPU requires context flush on next call */
+   cpumask_t   flush_pending;
+   /* Number of ASID allocated by context (shift value) */
+   unsigned intctxt_shift;
+   /* Callback to locally flush the context. */
+   void(*flush_cpu_ctxt_cb)(void);
+};
+
+#define NUM_ASIDS(info)(1UL << ((info)->bits))
+#define NUM_CTXT_ASIDS(info)   (NUM_ASIDS(info) >> (info)->ctxt_shift)
+
+#define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
+
+void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+ unsigned int cpu);
+
+/*
+ * Check the ASID is still valid for the context. If not generate a new ASID.
+ *
+ * @pasid: Pointer to the current ASID batch
+ * @cpu: current CPU ID. Must have been acquired through get_cpu()
+ */
+static inline void asid_check_context(struct asid_info *info,
+ atomic64_t *pasid, unsigned int cpu)
+{
+   u64 asid, old_active_asid;
+
+   asid = atomic64_read(pasid);
+
+   /*
+* The memory ordering here is subtle.
+* If our active_asid is non-zero and the ASID matches the current
+* generation, then we update the active_asid entry with a relaxed
+* cmpxchg. Racing with a concurrent rollover means that either:
+*
+* - We get a zero back from the cmpxchg and end up waiting on the
+*   lock. Taking the lock synchronises with the rollover and so
+*   we are forced to see the updated generation.
+*
+* - We get a valid ASID back from the cmpxchg, which means the
+*   relaxed xchg in flush_context will treat us as reserved
+*   because atomic RmWs are totally ordered for a given location.
+*/
+   old_active_asid = atomic64_read(&active_asid(info, cpu));
+   if (old_active_asid &&
+   !((asid ^ atomic64_read(&info->generation)) >> info->bits) &&
+   atomic64_cmpxchg_relaxed(&active_asid(info, cpu),
+old_active_asid, asid))
+   return;
+
+   asid_new_context(info, pasid, cpu);
+}
+
+int asid_allocator_init(struct asid_info *info,
+   u32 bits, unsigned int asid_per_ctxt,
+   void (*flush_cpu_ctxt_cb)(void));
+
+#endif
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 5540a1638baf..720df5ee2aa2 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -5,6 +5,8 @@ lib-y   := clear_user.o delay.o copy_from_user.o
\
   memcmp.o strcmp.o strncmp.o strlen.o strnlen.o   \
   strchr.o strrchr.o tishift.o
  
+lib-y
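
The diffstat and Makefile hunk above are cut short by the archive. To make
the intent of the new header concrete, here is a rough sketch of how a
second user - for instance the VMID allocator the series is heading
towards - could plug into it. The vmid_* names, the number of bits and the
flush callback body are invented for illustration; this is not a patch from
the series and is not expected to compile outside a kernel tree.

    /* Sketch only: a hypothetical second user of the <asm/asid.h> shown above. */
    #include <linux/percpu.h>
    #include <linux/smp.h>
    #include <asm/asid.h>
    #include <asm/tlbflush.h>

    static DEFINE_PER_CPU(atomic64_t, active_vmids);
    static DEFINE_PER_CPU(u64, reserved_vmids);

    static struct asid_info vmid_info;

    static void vmid_flush_cpu_ctxt(void)
    {
            /* Whatever local invalidation this user needs on rollover. */
            local_flush_tlb_all();
    }

    static int vmid_allocator_init(void)
    {
            vmid_info.active = &active_vmids;
            vmid_info.reserved = &reserved_vmids;

            /* 8 VMID bits, one VMID per context, callback above (all invented). */
            return asid_allocator_init(&vmid_info, 8, 1, vmid_flush_cpu_ctxt);
    }

    static void vmid_check(atomic64_t *vmid)
    {
            unsigned int cpu = get_cpu();

            /* Inline fast path; falls back to asid_new_context() on rollover. */
            asid_check_context(&vmid_info, vmid, cpu);
            put_cpu();
    }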

KVM Arm Device passthrough and linux-rt

2019-06-04 Thread Julien Grall
Hi,

While trying device passthrough on Linux-rt with KVM Arm, I had
the following splat.

[  363.410141] 000: BUG: sleeping function called from invalid context at 
kernel/locking/rtmutex.c:974
[  363.410150] 000: in_atomic(): 0, irqs_disabled(): 128, pid: 2916, name: 
qemu-system-aar
[  363.410153] 000: 4 locks held by qemu-system-aar/2916:
[  363.410157] 000:  #0: 8007bd248100 (&vcpu->mutex){+.+.}, at: 
kvm_vcpu_ioctl+0x70/0xae0
[  363.410171] 000:  #1: 8007bd1e2b20 (&kvm->irq_srcu){}, at: 
kvm_notify_acked_irq+0x7c/0x300
[  363.410179] 000:  #2: 8007bd1e2b20 (&kvm->irq_srcu){}, at: 
irqfd_resampler_ack+0x0/0xd8
[  363.410187] 000:  #3: 8007c2b27d28 (&ctx->wqh#2){+.+.}, at: 
eventfd_signal+0x24/0x78
[  363.410196] 000: irq event stamp: 4033894
[  363.410197] 000: hardirqs last  enabled at (4033893): [] 
_raw_spin_unlock_irqrestore+0x88/0x90
[  363.410203] 000: hardirqs last disabled at (4033894): [] 
kvm_arch_vcpu_ioctl_run+0x2a8/0xc08
[  363.410207] 000: softirqs last  enabled at (0): [] 
copy_process.isra.1.part.2+0x8d8/0x1958
[  363.410212] 000: softirqs last disabled at (0): [<>]  (null)
[  363.410216] 000: CPU: 0 PID: 2916 Comm: qemu-system-aar Tainted: GW  
 5.0.14-rt9-00013-g4b2a13c8a804 #84
[  363.410219] 000: Hardware name: AMD Seattle (Rev.B0) Development Board 
(Overdrive) (DT)
[  363.410221] 000: Call trace:
[  363.410222] 000:  dump_backtrace+0x0/0x158
[  363.410225] 000:  show_stack+0x14/0x20
[  363.410227] 000:  dump_stack+0xa0/0xd4
[  363.410230] 000:  ___might_sleep+0x16c/0x1f8
[  363.410234] 000:  rt_spin_lock+0x5c/0x70
[  363.410237] 000:  eventfd_signal+0x24/0x78
[  363.410238] 000:  irqfd_resampler_ack+0x94/0xd8
[  363.410241] 000:  kvm_notify_acked_irq+0xf8/0x300
[  363.410243] 000:  vgic_v2_fold_lr_state+0x174/0x1e0
[  363.410246] 000:  kvm_vgic_sync_hwstate+0x5c/0x2b8
[  363.410249] 000:  kvm_arch_vcpu_ioctl_run+0x624/0xc08
[  363.410250] 000:  kvm_vcpu_ioctl+0x3a0/0xae0
[  363.410252] 000:  do_vfs_ioctl+0xbc/0x910
[  363.410255] 000:  ksys_ioctl+0x78/0xa8
[  363.410257] 000:  __arm64_sys_ioctl+0x1c/0x28
[  363.410260] 000:  el0_svc_common+0x90/0x118
[  363.410263] 000:  el0_svc_handler+0x2c/0x80
[  363.410265] 000:  el0_svc+0x8/0xc

This is happening because vgic_v2_fold_lr_state() is expected
to be called with interrupts disabled. However, some of the paths
(e.g. eventfd) will take a spinlock.

The spinlock is from the waitqueue, so using a raw_spin_lock cannot
even be considered.

Do you have any input on how this could be solved?

Cheers,

-- 
Julien Grall


Re: [PATCH v5 3/3] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-06-04 Thread Julien Grall

Hi Catalin,

On 6/3/19 10:21 PM, Catalin Marinas wrote:

On Mon, Jun 03, 2019 at 05:25:34PM +0100, Catalin Marinas wrote:

On Tue, May 21, 2019 at 06:21:39PM +0100, Julien Grall wrote:

Since a softirq is supposed to check may_use_simd() anyway before
attempting to use FPSIMD/SVE, there is limited reason to keep softirq
disabled when touching the FPSIMD/SVE context. Instead, we can simply
disable preemption and mark the FPSIMD/SVE context as in use by setting
CPU's fpsimd_context_busy flag.

[...]

+static void get_cpu_fpsimd_context(void)
+{
+   preempt_disable();
+   __get_cpu_fpsimd_context();
+}


Is there anything that prevents a softirq being invoked between
preempt_disable() and __get_cpu_fpsimd_context()?


Actually, it shouldn't matter as the softirq finishes using the fpsimd
before the thread is resumed.


If softirqs are handled in a thread (i.e. ksoftirqd), then
preempt_disable() will prevent them from running.


Softirqs running on return from interrupt context will finish using
FPSIMD before the thread is resumed.


Softirqs running after __get_cpu_fpsimd_context() is called will not be
able to use FPSIMD (may_use_simd() returns false).


Cheers,

--
Julien Grall


[PATCH v5 3/3] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-05-21 Thread Julien Grall
When the kernel is compiled with CONFIG_KERNEL_MODE_NEON, some part of
the kernel may be able to use FPSIMD/SVE. This is for instance the case
for crypto code.

Any use of FPSIMD/SVE in the kernel is clearly marked by using the
functions kernel_neon_{begin, end}. Furthermore, this can only be used
when may_use_simd() returns true.

The current implementation of may_use_simd() allows softirq to use
FPSIMD/SVE unless it is currently in use (i.e kernel_neon_busy is true).
When in use, softirqs usually fall back to a software method.

At the moment, as a softirq may use FPSIMD/SVE, softirqs are disabled
when touching the FPSIMD/SVE context. This has the drawback of disabling
all softirqs even if they are not using FPSIMD/SVE.

Since a softirq is supposed to check may_use_simd() anyway before
attempting to use FPSIMD/SVE, there is limited reason to keep softirqs
disabled when touching the FPSIMD/SVE context. Instead, we can simply
disable preemption and mark the FPSIMD/SVE context as in use by setting
CPU's fpsimd_context_busy flag.

Two new helpers {get, put}_cpu_fpsimd_context are introduced to mark
the area using FPSIMD/SVE context and they are used to replace
local_bh_{disable, enable}. The functions kernel_neon_{begin, end} are
also re-implemented to use the new helpers.

Additionally, double-underscored versions of the helpers are provided to be
called when preemption is already disabled. These are only relevant on
paths where irqs are disabled anyway, so they are not needed for
correctness in the current code. Let's use them anyway though: this
marks critical sections clearly and will help to avoid mistakes during
future maintenance.

The change has been benchmarked on Linux 5.1-rc4 with defconfig.

On Juno2:
* hackbench 100 process 1000 (10 times)
* .7% quicker

On ThunderX 2:
* hackbench 1000 process 1000 (20 times)
* 3.4% quicker

Signed-off-by: Julien Grall 
Reviewed-by: Dave Martin 

---
Changes in v5:
- Update commit message
- Add Dave's reviewed-by

Changes in v4:
- Clarify the comment on top of get_cpu_fpsimd_context()
- Use double-underscore version in fpsimd_save_and_flush_cpu_state()

Changes in v3:
- Fix typoes in the commit message
- Rework a bit the commit message
- Use imperative mood
- Rename kernel_neon_busy to fpsimd_context_busy
- Remove debug code
- Update comments
- Don't require preemption when calling 
fpsimd_save_and_flush_cpu_state()

Changes in v2:
- Remove spurious call to kernel_neon_enable in kernel_neon_begin.
- Rename kernel_neon_{enable, disable} to {get, put}_cpu_fpsimd_context
- Introduce a double-underscore version of the helpers for case
where preemption is already disabled
- Introduce have_cpu_fpsimd_context() and use it in WARN_ON(...)
- Surround more places in the code with the new helpers
- Rework the comments
- Update the commit message with the benchmark result
---
 arch/arm64/include/asm/simd.h |  10 ++--
 arch/arm64/kernel/fpsimd.c| 124 --
 2 files changed, 89 insertions(+), 45 deletions(-)

diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
index 6495cc51246f..a6307e43b8c2 100644
--- a/arch/arm64/include/asm/simd.h
+++ b/arch/arm64/include/asm/simd.h
@@ -15,9 +15,9 @@
 #include 
 #include 
 
-#ifdef CONFIG_KERNEL_MODE_NEON
+DECLARE_PER_CPU(bool, fpsimd_context_busy);
 
-DECLARE_PER_CPU(bool, kernel_neon_busy);
+#ifdef CONFIG_KERNEL_MODE_NEON
 
 /*
  * may_use_simd - whether it is allowable at this time to issue SIMD
@@ -29,15 +29,15 @@ DECLARE_PER_CPU(bool, kernel_neon_busy);
 static __must_check inline bool may_use_simd(void)
 {
/*
-* kernel_neon_busy is only set while preemption is disabled,
+* fpsimd_context_busy is only set while preemption is disabled,
 * and is clear whenever preemption is enabled. Since
-* this_cpu_read() is atomic w.r.t. preemption, kernel_neon_busy
+* this_cpu_read() is atomic w.r.t. preemption, fpsimd_context_busy
 * cannot change under our feet -- if it's set we cannot be
 * migrated, and if it's clear we cannot be migrated to a CPU
 * where it is set.
 */
return !in_irq() && !irqs_disabled() && !in_nmi() &&
-   !this_cpu_read(kernel_neon_busy);
+   !this_cpu_read(fpsimd_context_busy);
 }
 
 #else /* ! CONFIG_KERNEL_MODE_NEON */
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 6448921a2f59..c7c454df2779 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -92,7 +92,8 @@
  * To prevent this from racing with the manipulation of the task's FPSIMD state
  * from task context and thereby corrupting the state, it is necessary to
  * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE
- * fl
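
The diff above is truncated by the archive before it reaches the new
helpers. Going only by the commit message (disable preemption, then mark
the per-CPU fpsimd_context_busy flag declared in the simd.h hunk), the
get/put pair has roughly the following shape; treat it as a reconstruction
for readability, not as the exact hunk from the patch.

    /* Rough shape of the helpers described above (reconstruction, not the patch). */
    static void __get_cpu_fpsimd_context(void)
    {
            bool busy = __this_cpu_xchg(fpsimd_context_busy, true);

            WARN_ON(busy);  /* someone else already owned the context */
    }

    static void get_cpu_fpsimd_context(void)
    {
            preempt_disable();
            __get_cpu_fpsimd_context();
    }

    static void __put_cpu_fpsimd_context(void)
    {
            bool busy = __this_cpu_xchg(fpsimd_context_busy, false);

            WARN_ON(!busy); /* no matching get_cpu_fpsimd_context()? */
    }

    static void put_cpu_fpsimd_context(void)
    {
            __put_cpu_fpsimd_context();
            preempt_enable();
    }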

[PATCH v5 0/3] arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

2019-05-21 Thread Julien Grall
Hi all,

This patch series keeps softirqs enabled while touching FPSIMD/SVE state.
For more details on the impact see patch #3.

This patch series has been benchmarked on Linux 5.1-rc4 with defconfig.

On Juno2:
* hackbench 100 process 1000 (10 times)
* .7% quicker

On ThunderX 2:
* hackbench 1000 process 1000 (20 times)
* 3.4% quicker

Note that while the benchmark has been done on 5.1-rc4, the patch series is
based on 5.2-rc1.

Cheers,

Julien Grall (3):
  arm64/fpsimd: Remove the prototype for sve_flush_cpu_state()
  arch/arm64: fpsimd: Introduce fpsimd_save_and_flush_cpu_state() and
use it
  arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state

 arch/arm64/include/asm/fpsimd.h |   5 +-
 arch/arm64/include/asm/simd.h   |  10 +--
 arch/arm64/kernel/fpsimd.c  | 139 +++-
 arch/arm64/kvm/fpsimd.c |   4 +-
 4 files changed, 103 insertions(+), 55 deletions(-)

-- 
2.11.0



[PATCH v5 2/3] arch/arm64: fpsimd: Introduce fpsimd_save_and_flush_cpu_state() and use it

2019-05-21 Thread Julien Grall
The only external user of fpsimd_save() and fpsimd_flush_cpu_state() is
the KVM FPSIMD code.

A following patch will introduce a mechanism to acquire ownership of the
FPSIMD/SVE context for performing context management operations. Rather
than having to export the new helpers to get/put the context, we can just
introduce a new function to combine fpsimd_save() and
fpsimd_flush_cpu_state().

This also has the advantage of removing any external call to fpsimd_save()
and fpsimd_flush_cpu_state(), so they can be made static.

Lastly, the new function can also be used in the PM notifier.

Signed-off-by: Julien Grall 
Reviewed-by: Dave Martin 

---
kernel_neon_begin() does not use fpsimd_save_and_flush_cpu_state()
because the next patch will modify the function to also grab the
FPSIMD/SVE context.

Changes in v4:
- Remove newline before the new prototype
- Add Dave's reviewed-by

Changes in v3:
- Rework the commit message
- Move the prototype of fpsimd_save_and_flush_cpu_state()
further down in the header
- Remove comment in kvm_arch_vcpu_put_fp()

Changes in v2:
- Patch added
---
 arch/arm64/include/asm/fpsimd.h |  4 +---
 arch/arm64/kernel/fpsimd.c  | 17 +
 arch/arm64/kvm/fpsimd.c |  4 +---
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index b73d12fcc7f9..4154851c21ab 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -48,8 +48,6 @@ struct task_struct;
 extern void fpsimd_save_state(struct user_fpsimd_state *state);
 extern void fpsimd_load_state(struct user_fpsimd_state *state);
 
-extern void fpsimd_save(void);
-
 extern void fpsimd_thread_switch(struct task_struct *next);
 extern void fpsimd_flush_thread(void);
 
@@ -63,7 +61,7 @@ extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state 
*state,
 void *sve_state, unsigned int sve_vl);
 
 extern void fpsimd_flush_task_state(struct task_struct *target);
-extern void fpsimd_flush_cpu_state(void);
+extern void fpsimd_save_and_flush_cpu_state(void);
 
 /* Maximum VL that SVE VL-agnostic software can transparently support */
 #define SVE_VL_ARCH_MAX 0x100
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index a38bf74bcca8..6448921a2f59 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -246,7 +246,7 @@ static void task_fpsimd_load(void)
  *
  * Softirqs (and preemption) must be disabled.
  */
-void fpsimd_save(void)
+static void fpsimd_save(void)
 {
struct fpsimd_last_state_struct const *last =
this_cpu_ptr(_last_state);
@@ -1122,12 +1122,22 @@ void fpsimd_flush_task_state(struct task_struct *t)
  * Invalidate any task's FPSIMD state that is present on this cpu.
  * This function must be called with softirqs disabled.
  */
-void fpsimd_flush_cpu_state(void)
+static void fpsimd_flush_cpu_state(void)
 {
__this_cpu_write(fpsimd_last_state.st, NULL);
set_thread_flag(TIF_FOREIGN_FPSTATE);
 }
 
+/*
+ * Save the FPSIMD state to memory and invalidate cpu view.
+ * This function must be called with softirqs (and preemption) disabled.
+ */
+void fpsimd_save_and_flush_cpu_state(void)
+{
+   fpsimd_save();
+   fpsimd_flush_cpu_state();
+}
+
 #ifdef CONFIG_KERNEL_MODE_NEON
 
 DEFINE_PER_CPU(bool, kernel_neon_busy);
@@ -1284,8 +1294,7 @@ static int fpsimd_cpu_pm_notifier(struct notifier_block 
*self,
 {
switch (cmd) {
case CPU_PM_ENTER:
-   fpsimd_save();
-   fpsimd_flush_cpu_state();
+   fpsimd_save_and_flush_cpu_state();
break;
case CPU_PM_EXIT:
break;
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 6e3c9c8b2df9..525010504f9d 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -112,9 +112,7 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
u64 *guest_zcr = >arch.ctxt.sys_regs[ZCR_EL1];
 
-   /* Clean guest FP state to memory and invalidate cpu view */
-   fpsimd_save();
-   fpsimd_flush_cpu_state();
+   fpsimd_save_and_flush_cpu_state();
 
if (guest_has_sve)
*guest_zcr = read_sysreg_s(SYS_ZCR_EL12);
-- 
2.11.0



[PATCH v5 1/3] arm64/fpsimd: Remove the prototype for sve_flush_cpu_state()

2019-05-21 Thread Julien Grall
The function sve_flush_cpu_state() has been removed in commit 21cdd7fd76e3
("KVM: arm64: Remove eager host SVE state saving").

So remove the associated prototype in asm/fpsimd.h.

Signed-off-by: Julien Grall 
Reviewed-by: Dave Martin 

---
Changes in v3:
- Add Dave's reviewed-by
- Fix checkpatch style error when mentioning a commit

Changes in v2:
- Patch added
---
 arch/arm64/include/asm/fpsimd.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index df62bbd33a9a..b73d12fcc7f9 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -64,7 +64,6 @@ extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state 
*state,
 
 extern void fpsimd_flush_task_state(struct task_struct *target);
 extern void fpsimd_flush_cpu_state(void);
-extern void sve_flush_cpu_state(void);
 
 /* Maximum VL that SVE VL-agnostic software can transparently support */
 #define SVE_VL_ARCH_MAX 0x100
-- 
2.11.0



Re: [PATCH RFC 01/14] arm64/mm: Introduce asid_info structure and move asid_generation/asid_map to it

2019-03-21 Thread Julien Grall

On 3/21/19 5:03 PM, Suzuki K Poulose wrote:

Hi Julien,


Hi Suzuki,


On 21/03/2019 16:36, Julien Grall wrote:

In an attempt to make the ASID allocator generic, create a new structure
asid_info to store all the information necessary for the allocator.

For now, move the variables asid_generation and asid_map to the new 
structure

asid_info. Follow-up patches will move more variables.

Note that, to avoid more renaming afterwards, a local variable 'info' has
been created and is a pointer to the ASID allocator structure.

Signed-off-by: Julien Grall 
---
  arch/arm64/mm/context.c | 46 
++

  1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 1f0ea2facf24..34db54f1a39a 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -30,8 +30,11 @@
  static u32 asid_bits;
  static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
-static atomic64_t asid_generation;
-static unsigned long *asid_map;
+struct asid_info
+{
+    atomic64_t    generation;
+    unsigned long    *map;
+} asid_info;


Shouldn't this be static ? Rest looks fine.


Yes it should be static. I have updated my code.

Thank you for the review!

Cheers,



Cheers
Suzuki


--
Julien Grall


[PATCH RFC 10/14] arm64/mm: Introduce a callback to flush the local context

2019-03-21 Thread Julien Grall
Flushing the local context will vary depending on the actual user of the ASID
allocator. Introduce a new callback to flush the local context and move
the call that flushes the local TLB into it.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index cbf1c24cb3ee..678a57b77c91 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -39,6 +39,8 @@ struct asid_info
cpumask_t   flush_pending;
/* Number of ASID allocated by context (shift value) */
unsigned intctxt_shift;
+   /* Callback to locally flush the context. */
+   void(*flush_cpu_ctxt_cb)(void);
 } asid_info;
 
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
@@ -266,7 +268,7 @@ static void asid_new_context(struct asid_info *info, 
atomic64_t *pasid,
}
 
if (cpumask_test_and_clear_cpu(cpu, >flush_pending))
-   local_flush_tlb_all();
+   info->flush_cpu_ctxt_cb();
 
atomic64_set(_asid(info, cpu), asid);
raw_spin_unlock_irqrestore(>lock, flags);
@@ -298,6 +300,11 @@ asmlinkage void post_ttbr_update_workaround(void)
CONFIG_CAVIUM_ERRATUM_27456));
 }
 
+static void asid_flush_cpu_ctxt(void)
+{
+   local_flush_tlb_all();
+}
+
 /*
  * Initialize the ASID allocator
  *
@@ -308,10 +315,12 @@ asmlinkage void post_ttbr_update_workaround(void)
  * 2.
  */
 static int asid_allocator_init(struct asid_info *info,
-  u32 bits, unsigned int asid_per_ctxt)
+  u32 bits, unsigned int asid_per_ctxt,
+  void (*flush_cpu_ctxt_cb)(void))
 {
info->bits = bits;
info->ctxt_shift = ilog2(asid_per_ctxt);
+   info->flush_cpu_ctxt_cb = flush_cpu_ctxt_cb;
/*
 * Expect allocation after rollover to fail if we don't have at least
 * one more ASID than CPUs. ASID #0 is always reserved.
@@ -332,7 +341,8 @@ static int asids_init(void)
 {
u32 bits = get_cpu_asid_bits();
 
-   if (!asid_allocator_init(_info, bits, ASID_PER_CONTEXT))
+   if (!asid_allocator_init(_info, bits, ASID_PER_CONTEXT,
+asid_flush_cpu_ctxt))
panic("Unable to initialize ASID allocator for %lu ASIDs\n",
  1UL << bits);
 
-- 
2.11.0
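
The point of the callback is that the core rollover logic no longer
hard-codes local_flush_tlb_all(), so a second allocator can invalidate
something else entirely (for VMIDs that would be stage-2 rather than
stage-1 translations). A trivial standalone illustration of that seam,
with invented flush functions:

    /* Standalone illustration (not kernel code) of the per-allocator callback. */
    #include <stdio.h>

    struct allocator {
        const char *name;
        void (*flush_cpu_ctxt_cb)(void);
    };

    static void flush_stage1_tlb(void) { printf("  local stage-1 TLB flush\n"); }
    static void flush_stage2_tlb(void) { printf("  local stage-2 TLB flush\n"); }

    static void rollover(struct allocator *a)
    {
        printf("%s: rollover\n", a->name);
        a->flush_cpu_ctxt_cb();          /* was a hard-coded local_flush_tlb_all() */
    }

    int main(void)
    {
        struct allocator asid = { "asid", flush_stage1_tlb };
        struct allocator vmid = { "vmid", flush_stage2_tlb };

        rollover(&asid);
        rollover(&vmid);
        return 0;
    }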



[PATCH RFC 09/14] arm64/mm: Split the function check_and_switch_context in 3 parts

2019-03-21 Thread Julien Grall
The function check_and_switch_context is used to:
1) Check whether the ASID is still valid
2) Generate a new one if it is not valid
3) Switch the context

While the latter is specific to the MM subsystem, the rest could be part
of the generic ASID allocator.

After this patch, the function is split into 3 parts, which correspond
to the following functions:
1) asid_check_context: Check if the ASID is still valid
2) asid_new_context: Generate a new ASID for the context
3) check_and_switch_context: Call 1) and 2) and switch the context

1) and 2) have not been merged into a single function because we want to
avoid adding a branch when the ASID is still valid. This will matter
when the code is moved into a separate file later on, as 1) will reside
in the header as a static inline function.

Signed-off-by: Julien Grall 

---

Will wants to avoid adding a branch when the ASID is still valid. So
1) and 2) are kept in separate functions. The former will move to a new
header and become a static inline.
---
 arch/arm64/mm/context.c | 51 +
 1 file changed, 39 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index b071a1b3469e..cbf1c24cb3ee 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -204,16 +204,21 @@ static u64 new_context(struct asid_info *info, atomic64_t 
*pasid)
return idx2asid(info, asid) | generation;
 }
 
-void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
+static void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+unsigned int cpu);
+
+/*
+ * Check the ASID is still valid for the context. If not generate a new ASID.
+ *
+ * @pasid: Pointer to the current ASID batch
+ * @cpu: current CPU ID. Must have been acquired through get_cpu()
+ */
+static void asid_check_context(struct asid_info *info,
+  atomic64_t *pasid, unsigned int cpu)
 {
-   unsigned long flags;
u64 asid, old_active_asid;
-   struct asid_info *info = _info;
 
-   if (system_supports_cnp())
-   cpu_set_reserved_ttbr0();
-
-   asid = atomic64_read(>context.id);
+   asid = atomic64_read(pasid);
 
/*
 * The memory ordering here is subtle.
@@ -234,14 +239,30 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
!((asid ^ atomic64_read(>generation)) >> info->bits) &&
atomic64_cmpxchg_relaxed(_asid(info, cpu),
 old_active_asid, asid))
-   goto switch_mm_fastpath;
+   return;
+
+   asid_new_context(info, pasid, cpu);
+}
+
+/*
+ * Generate a new ASID for the context.
+ *
+ * @pasid: Pointer to the current ASID batch allocated. It will be updated
+ * with the new ASID batch.
+ * @cpu: current CPU ID. Must have been acquired through get_cpu()
+ */
+static void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+unsigned int cpu)
+{
+   unsigned long flags;
+   u64 asid;
 
raw_spin_lock_irqsave(>lock, flags);
/* Check that our ASID belongs to the current generation. */
-   asid = atomic64_read(>context.id);
+   asid = atomic64_read(pasid);
if ((asid ^ atomic64_read(>generation)) >> info->bits) {
-   asid = new_context(info, >context.id);
-   atomic64_set(>context.id, asid);
+   asid = new_context(info, pasid);
+   atomic64_set(pasid, asid);
}
 
if (cpumask_test_and_clear_cpu(cpu, >flush_pending))
@@ -249,8 +270,14 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 
atomic64_set(_asid(info, cpu), asid);
raw_spin_unlock_irqrestore(>lock, flags);
+}
+
+void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
+{
+   if (system_supports_cnp())
+   cpu_set_reserved_ttbr0();
 
-switch_mm_fastpath:
+   asid_check_context(_info, >context.id, cpu);
 
arm64_apply_bp_hardening();
 
-- 
2.11.0



[PATCH RFC 05/14] arm64/mm: Remove dependency on MM in new_context

2019-03-21 Thread Julien Grall
The function new_context will be part of a generic ASID allocator. At
the moment, the MM structure is only used to fetch the ASID.

To remove the dependency on MM, it is possible to just pass a pointer to
the current ASID.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index e98ab348b9cb..488845c39c39 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -151,10 +151,10 @@ static bool check_update_reserved_asid(struct asid_info 
*info, u64 asid,
return hit;
 }
 
-static u64 new_context(struct asid_info *info, struct mm_struct *mm)
+static u64 new_context(struct asid_info *info, atomic64_t *pasid)
 {
static u32 cur_idx = 1;
-   u64 asid = atomic64_read(>context.id);
+   u64 asid = atomic64_read(pasid);
u64 generation = atomic64_read(>generation);
 
if (asid != 0) {
@@ -236,7 +236,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
/* Check that our ASID belongs to the current generation. */
asid = atomic64_read(>context.id);
if ((asid ^ atomic64_read(>generation)) >> info->bits) {
-   asid = new_context(info, mm);
+   asid = new_context(info, >context.id);
atomic64_set(>context.id, asid);
}
 
-- 
2.11.0
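
The practical effect is that the allocator now only needs "a 64-bit ID
slot", not an mm_struct. The standalone sketch below (invented types,
trivial allocation) is only meant to show why that makes the helper
reusable for something like a VMID; it is not kernel code.

    /* Illustration (not kernel code): the helper only touches the ID slot. */
    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdio.h>

    struct proc_ctx { _Atomic uint64_t id; };    /* stands in for mm->context.id */
    struct vm_ctx   { _Atomic uint64_t vmid; };  /* hypothetical VMID holder */

    static uint64_t next_id = 1;

    /* Mirrors the new new_context() shape: no mm_struct in sight. */
    static uint64_t new_context(_Atomic uint64_t *pasid)
    {
        uint64_t id = next_id++;

        atomic_store(pasid, id);
        return id;
    }

    int main(void)
    {
        struct proc_ctx p = { 0 };
        struct vm_ctx v = { 0 };
        uint64_t proc_asid = new_context(&p.id);
        uint64_t vm_vmid = new_context(&v.vmid);

        printf("process got %llu, vm got %llu\n",
               (unsigned long long)proc_asid, (unsigned long long)vm_vmid);
        return 0;
    }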



[PATCH RFC 03/14] arm64/mm: Move bits to asid_info

2019-03-21 Thread Julien Grall
The variable bits holds information for a given ASID allocator, so move
it into the asid_info structure.

Because most of the macros relied on bits, they now take an extra
parameter: a pointer to the asid_info structure.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 59 +
 1 file changed, 30 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index cfe4c5f7abf3..da17ed6c7117 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -27,7 +27,6 @@
 #include 
 #include 
 
-static u32 asid_bits;
 static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
 
 struct asid_info
@@ -36,6 +35,7 @@ struct asid_info
unsigned long   *map;
atomic64_t __percpu *active;
u64 __percpu*reserved;
+   u32 bits;
 } asid_info;
 
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
@@ -46,17 +46,17 @@ static DEFINE_PER_CPU(u64, reserved_asids);
 
 static cpumask_t tlb_flush_pending;
 
-#define ASID_MASK  (~GENMASK(asid_bits - 1, 0))
-#define ASID_FIRST_VERSION (1UL << asid_bits)
+#define ASID_MASK(info)(~GENMASK((info)->bits - 1, 0))
+#define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
-#define NUM_USER_ASIDS (ASID_FIRST_VERSION >> 1)
-#define asid2idx(asid) (((asid) & ~ASID_MASK) >> 1)
-#define idx2asid(idx)  (((idx) << 1) & ~ASID_MASK)
+#define NUM_USER_ASIDS(info)   (ASID_FIRST_VERSION(info) >> 1)
+#define asid2idx(info, asid)   (((asid) & ~ASID_MASK(info)) >> 1)
+#define idx2asid(info, idx)(((idx) << 1) & ~ASID_MASK(info))
 #else
-#define NUM_USER_ASIDS (ASID_FIRST_VERSION)
-#define asid2idx(asid) ((asid) & ~ASID_MASK)
-#define idx2asid(idx)  asid2idx(idx)
+#define NUM_USER_ASIDS(info)   (ASID_FIRST_VERSION(info))
+#define asid2idx(info, asid)   ((asid) & ~ASID_MASK(info))
+#define idx2asid(info, idx)asid2idx(info, idx)
 #endif
 
 /* Get the ASIDBits supported by the current CPU */
@@ -86,13 +86,13 @@ void verify_cpu_asid_bits(void)
 {
u32 asid = get_cpu_asid_bits();
 
-   if (asid < asid_bits) {
+   if (asid < asid_info.bits) {
/*
 * We cannot decrease the ASID size at runtime, so panic if we 
support
 * fewer ASID bits than the boot CPU.
 */
pr_crit("CPU%d: smaller ASID size(%u) than boot CPU (%u)\n",
-   smp_processor_id(), asid, asid_bits);
+   smp_processor_id(), asid, asid_info.bits);
cpu_panic_kernel();
}
 }
@@ -103,7 +103,7 @@ static void flush_context(struct asid_info *info)
u64 asid;
 
/* Update the list of reserved ASIDs and the ASID bitmap. */
-   bitmap_clear(info->map, 0, NUM_USER_ASIDS);
+   bitmap_clear(info->map, 0, NUM_USER_ASIDS(info));
 
for_each_possible_cpu(i) {
asid = atomic64_xchg_relaxed(_asid(info, i), 0);
@@ -116,7 +116,7 @@ static void flush_context(struct asid_info *info)
 */
if (asid == 0)
asid = reserved_asid(info, i);
-   __set_bit(asid2idx(asid), info->map);
+   __set_bit(asid2idx(info, asid), info->map);
reserved_asid(info, i) = asid;
}
 
@@ -159,7 +159,7 @@ static u64 new_context(struct asid_info *info, struct 
mm_struct *mm)
u64 generation = atomic64_read(>generation);
 
if (asid != 0) {
-   u64 newasid = generation | (asid & ~ASID_MASK);
+   u64 newasid = generation | (asid & ~ASID_MASK(info));
 
/*
 * If our current ASID was active during a rollover, we
@@ -172,7 +172,7 @@ static u64 new_context(struct asid_info *info, struct 
mm_struct *mm)
 * We had a valid ASID in a previous life, so try to re-use
 * it if possible.
 */
-   if (!__test_and_set_bit(asid2idx(asid), info->map))
+   if (!__test_and_set_bit(asid2idx(info, asid), info->map))
return newasid;
}
 
@@ -183,22 +183,22 @@ static u64 new_context(struct asid_info *info, struct 
mm_struct *mm)
 * a reserved TTBR0 for the init_mm and we allocate ASIDs in even/odd
 * pairs.
 */
-   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS, cur_idx);
-   if (asid != NUM_USER_ASIDS)
+   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS(info), cur_idx);
+   if (asid != NUM_USER_ASIDS(info))
goto set_asid;
 
/* We're out of ASIDs, so increment the global gener

[PATCH RFC 07/14] arm64/mm: Introduce NUM_ASIDS

2019-03-21 Thread Julien Grall
At the moment ASID_FIRST_VERSION is used to know the number of ASIDs
supported. As we are going to move the ASID allocator into a separate file,
it would be better to use a different name for external users.

This patch adds NUM_ASIDS and implements ASID_FIRST_VERSION using it.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 5a4c2b1aac71..fb13bc249951 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -48,7 +48,9 @@ static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
 
 #define ASID_MASK(info)(~GENMASK((info)->bits - 1, 0))
-#define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))
+#define NUM_ASIDS(info)(1UL << ((info)->bits))
+
+#define ASID_FIRST_VERSION(info)   NUM_ASIDS(info)
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
 #define ASID_PER_CONTEXT   2
@@ -56,7 +58,7 @@ static DEFINE_PER_CPU(u64, reserved_asids);
 #define ASID_PER_CONTEXT   1
 #endif
 
-#define NUM_CTXT_ASIDS(info)   (ASID_FIRST_VERSION(info) >> 
(info)->ctxt_shift)
+#define NUM_CTXT_ASIDS(info)   (NUM_ASIDS(info) >> (info)->ctxt_shift)
 #define asid2idx(info, asid)   (((asid) & ~ASID_MASK(info)) >> 
(info)->ctxt_shift)
 #define idx2asid(info, idx)(((idx) << (info)->ctxt_shift) & 
~ASID_MASK(info))
 
-- 
2.11.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
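
As an illustration of how NUM_ASIDS relates to the generation/ASID packing used
by the allocator, here is a small stand-alone C sketch (not part of the patch;
the macro names mirror the ones above, everything else is simplified):

#include <stdint.h>
#include <stdio.h>

#define NUM_ASIDS(bits)			(1UL << (bits))
#define ASID_MASK(bits)			(~(NUM_ASIDS(bits) - 1))
#define ASID_FIRST_VERSION(bits)	NUM_ASIDS(bits)

int main(void)
{
	unsigned int bits = 16;
	/* The generation counter advances in steps of ASID_FIRST_VERSION... */
	uint64_t generation = 3 * ASID_FIRST_VERSION(bits);
	/* ...so it never collides with the ASID kept in the low bits. */
	uint64_t ctx = generation | 42;

	printf("asid       = %lu\n", (unsigned long)(ctx & ~ASID_MASK(bits)));
	printf("generation = %lu\n", (unsigned long)(ctx & ASID_MASK(bits)));
	return 0;
}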


[PATCH RFC 08/14] arm64/mm: Split asid_inits in 2 parts

2019-03-21 Thread Julien Grall
Move the common initialization of the ASID allocator out into a separate
function.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 43 +++
 1 file changed, 31 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index fb13bc249951..b071a1b3469e 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -271,31 +271,50 @@ asmlinkage void post_ttbr_update_workaround(void)
CONFIG_CAVIUM_ERRATUM_27456));
 }
 
-static int asids_init(void)
+/*
+ * Initialize the ASID allocator
+ *
+ * @info: Pointer to the asid allocator structure
+ * @bits: Number of ASIDs available
+ * @asid_per_ctxt: Number of ASIDs to allocate per-context. ASIDs are
+ * allocated contiguously for a given context. This value should be a power of
+ * 2.
+ */
+static int asid_allocator_init(struct asid_info *info,
+  u32 bits, unsigned int asid_per_ctxt)
 {
-   struct asid_info *info = &asid_info;
-
-   info->bits = get_cpu_asid_bits();
-   info->ctxt_shift = ilog2(ASID_PER_CONTEXT);
+   info->bits = bits;
+   info->ctxt_shift = ilog2(asid_per_ctxt);
/*
 * Expect allocation after rollover to fail if we don't have at least
-* one more ASID than CPUs. ASID #0 is reserved for init_mm.
+* one more ASID than CPUs. ASID #0 is always reserved.
 */
WARN_ON(NUM_CTXT_ASIDS(info) - 1 <= num_possible_cpus());
	atomic64_set(&info->generation, ASID_FIRST_VERSION(info));
info->map = kcalloc(BITS_TO_LONGS(NUM_CTXT_ASIDS(info)),
sizeof(*info->map), GFP_KERNEL);
if (!info->map)
-   panic("Failed to allocate bitmap for %lu ASIDs\n",
- NUM_CTXT_ASIDS(info));
-
-   info->active = &active_asids;
-   info->reserved = &reserved_asids;
+   return -ENOMEM;
 
	raw_spin_lock_init(&info->lock);
 
+   return 0;
+}
+
+static int asids_init(void)
+{
+   u32 bits = get_cpu_asid_bits();
+
+   if (!asid_allocator_init(&asid_info, bits, ASID_PER_CONTEXT))
+   panic("Unable to initialize ASID allocator for %lu ASIDs\n",
+ 1UL << bits);
+
+   asid_info.active = &active_asids;
+   asid_info.reserved = &reserved_asids;
+
pr_info("ASID allocator initialised with %lu entries\n",
-   NUM_CTXT_ASIDS(info));
+   NUM_CTXT_ASIDS(&asid_info));
+
return 0;
 }
 early_initcall(asids_init);
-- 
2.11.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
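
To show why the initialisation is being split, here is a simplified user-space
sketch (not the kernel code; the structure and bitmap handling are heavily
reduced) of how the same generic init could back two allocator instances with
different parameters, e.g. the MM ASIDs and a future VMID allocator:

#include <stdio.h>
#include <stdlib.h>

struct asid_info {
	unsigned int bits;
	unsigned int ctxt_shift;
	unsigned long *map;
	unsigned long nr_ctxt_asids;
};

static int asid_allocator_init(struct asid_info *info,
			       unsigned int bits, unsigned int asid_per_ctxt)
{
	size_t nlongs;

	info->bits = bits;
	info->ctxt_shift = __builtin_ctz(asid_per_ctxt);	/* stands in for ilog2() */
	info->nr_ctxt_asids = (1UL << bits) >> info->ctxt_shift;
	nlongs = (info->nr_ctxt_asids + 8 * sizeof(unsigned long) - 1) /
		 (8 * sizeof(unsigned long));
	info->map = calloc(nlongs, sizeof(unsigned long));
	return info->map ? 0 : -1;
}

int main(void)
{
	struct asid_info mm_asids, vmids;

	/* 16-bit ASIDs, two per context (kernel/user split). */
	if (asid_allocator_init(&mm_asids, 16, 2))
		return 1;
	/* 8-bit VMIDs, one per guest. */
	if (asid_allocator_init(&vmids, 8, 1))
		return 1;

	printf("mm contexts: %lu, vm contexts: %lu\n",
	       mm_asids.nr_ctxt_asids, vmids.nr_ctxt_asids);
	return 0;
}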


[PATCH RFC 14/14] kvm/arm: Align the VMID allocation with the arm64 ASID one

2019-03-21 Thread Julien Grall
At the moment, the VMID algorithm will send an SGI to all the CPUs to
force an exit and then broadcast a full TLB flush and I-Cache
invalidation.

This patch re-uses the new ASID allocator. The
benefits are:
- CPUs are not forced to exit at roll-over. Instead the VMID will be
marked reserved and the context will be flushed at the next exit. This
will reduce the IPI traffic.
- Context invalidation is now per-CPU rather than broadcasted.

With the new algorithm, the code is now adapted:
- The function __kvm_flush_vm_context() has been renamed to
__kvm_flush_cpu_vmid_context and now only flushes the current CPU context.
- The call to update_vttbr() will be done with preemption disabled,
as the new algorithm requires storing information per-CPU.
- The TLBs associated with EL1 will be flushed when booting a CPU to
deal with stale information. This was previously done on the
allocation of the first VMID of a new generation.

The measurement was made on a Seattle based SoC (8 CPUs), with the
number of VMID limited to 4-bit. The test involves running concurrently 40
guests with 2 vCPUs. Each guest will then execute hackbench 5 times
before exiting.

The performance differences between the current algorithm and the new one are:
- 2.5% fewer exits from the guest
- 22.4% more flushes, although they are now local rather than
broadcast
- 0.11% faster (just for the record)

Signed-off-by: Julien Grall 


Looking at __kvm_flush_vm_context, it might be possible to
reduce the overhead further by removing the I-Cache flush for caches
other than VIPT. This has been left aside for now.
---
 arch/arm/include/asm/kvm_asm.h|   2 +-
 arch/arm/include/asm/kvm_host.h   |   5 +-
 arch/arm/include/asm/kvm_hyp.h|   1 +
 arch/arm/kvm/hyp/tlb.c|   8 +--
 arch/arm64/include/asm/kvm_asid.h |   8 +++
 arch/arm64/include/asm/kvm_asm.h  |   2 +-
 arch/arm64/include/asm/kvm_host.h |   5 +-
 arch/arm64/kvm/hyp/tlb.c  |  10 ++--
 virt/kvm/arm/arm.c| 112 +-
 9 files changed, 61 insertions(+), 92 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_asid.h

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 35491af87985..ce60a4a46fcc 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -65,7 +65,7 @@ struct kvm_vcpu;
 extern char __kvm_hyp_init[];
 extern char __kvm_hyp_init_end[];
 
-extern void __kvm_flush_vm_context(void);
+extern void __kvm_flush_cpu_vmid_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
 extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
 extern void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu);
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 770d73257ad9..e2c3a4a7b020 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -59,8 +59,8 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
 void kvm_reset_coprocs(struct kvm_vcpu *vcpu);
 
 struct kvm_vmid {
-   /* The VMID generation used for the virt. memory system */
-   u64vmid_gen;
+   /* The ASID used for the ASID allocator */
+   atomic64_t asid;
u32vmid;
 };
 
@@ -264,7 +264,6 @@ unsigned long __kvm_call_hyp(void *hypfn, ...);
ret;\
})
 
-void force_vm_exit(const cpumask_t *mask);
 int __kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
  struct kvm_vcpu_events *events);
 
diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
index 87bcd18df8d5..c3d1011ca1bf 100644
--- a/arch/arm/include/asm/kvm_hyp.h
+++ b/arch/arm/include/asm/kvm_hyp.h
@@ -75,6 +75,7 @@
 #define TLBIALLIS  __ACCESS_CP15(c8, 0, c3, 0)
 #define TLBIALL__ACCESS_CP15(c8, 0, c7, 0)
 #define TLBIALLNSNHIS  __ACCESS_CP15(c8, 4, c3, 4)
+#define TLBIALLNSNH__ACCESS_CP15(c8, 4, c7, 4)
 #define PRRR   __ACCESS_CP15(c10, 0, c2, 0)
 #define NMRR   __ACCESS_CP15(c10, 0, c2, 1)
 #define AMAIR0 __ACCESS_CP15(c10, 0, c3, 0)
diff --git a/arch/arm/kvm/hyp/tlb.c b/arch/arm/kvm/hyp/tlb.c
index 8e4afba73635..42b9ab47fc94 100644
--- a/arch/arm/kvm/hyp/tlb.c
+++ b/arch/arm/kvm/hyp/tlb.c
@@ -71,9 +71,9 @@ void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu 
*vcpu)
write_sysreg(0, VTTBR);
 }
 
-void __hyp_text __kvm_flush_vm_context(void)
+void __hyp_text __kvm_flush_cpu_vmid_context(void)
 {
-   write_sysreg(0, TLBIALLNSNHIS);
-   write_sysreg(0, ICIALLUIS);
-   dsb(ish);
+   write_sysreg(0, TLBIALLNSNH);
+   write_sysreg(0, ICIALLU);
+   dsb(nsh);
 }
diff --git a/arch/arm64/include/asm/kvm_asid.h 
b/arch/arm64/include/asm/kvm_asid.h
new file mode 100644
index ..8b586e43c094
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_asid.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL
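
The effect on the VMID itself can be pictured with a small stand-alone sketch
(this is not code from the patch; it only illustrates that, under the allocator
scheme described in the commit message, the value handed back is
"generation | vmid" and only the low VMID bits ever reach the hardware
register):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	unsigned int vmid_bits = 4;			/* test setup from the cover letter */
	uint64_t allocated = (7ULL << vmid_bits) | 0x3;	/* generation 7, vmid 3 */
	uint64_t vttbr_vmid = allocated & ((1ULL << vmid_bits) - 1);

	printf("VTTBR VMID field: %llu\n", (unsigned long long)vttbr_vmid);
	return 0;
}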

[PATCH RFC 02/14] arm64/mm: Move active_asids and reserved_asids to asid_info

2019-03-21 Thread Julien Grall
The variables active_asids and reserved_asids hold information for a
given ASID allocator. So move them to the structure asid_info.

At the same time, introduce wrappers to access the active and reserved
ASIDs to make the code clearer.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 34 ++
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 34db54f1a39a..cfe4c5f7abf3 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -34,10 +34,16 @@ struct asid_info
 {
atomic64_t  generation;
unsigned long   *map;
+   atomic64_t __percpu *active;
+   u64 __percpu*reserved;
 } asid_info;
 
+#define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
+#define reserved_asid(info, cpu) *per_cpu_ptr((info)->reserved, cpu)
+
 static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
+
 static cpumask_t tlb_flush_pending;
 
 #define ASID_MASK  (~GENMASK(asid_bits - 1, 0))
@@ -100,7 +106,7 @@ static void flush_context(struct asid_info *info)
bitmap_clear(info->map, 0, NUM_USER_ASIDS);
 
for_each_possible_cpu(i) {
-   asid = atomic64_xchg_relaxed(&per_cpu(active_asids, i), 0);
+   asid = atomic64_xchg_relaxed(&active_asid(info, i), 0);
/*
 * If this CPU has already been through a
 * rollover, but hasn't run another task in
@@ -109,9 +115,9 @@ static void flush_context(struct asid_info *info)
 * the process it is still running.
 */
if (asid == 0)
-   asid = per_cpu(reserved_asids, i);
+   asid = reserved_asid(info, i);
__set_bit(asid2idx(asid), info->map);
-   per_cpu(reserved_asids, i) = asid;
+   reserved_asid(info, i) = asid;
}
 
/*
@@ -121,7 +127,8 @@ static void flush_context(struct asid_info *info)
cpumask_setall(_flush_pending);
 }
 
-static bool check_update_reserved_asid(u64 asid, u64 newasid)
+static bool check_update_reserved_asid(struct asid_info *info, u64 asid,
+  u64 newasid)
 {
int cpu;
bool hit = false;
@@ -136,9 +143,9 @@ static bool check_update_reserved_asid(u64 asid, u64 
newasid)
 * generation.
 */
for_each_possible_cpu(cpu) {
-   if (per_cpu(reserved_asids, cpu) == asid) {
+   if (reserved_asid(info, cpu) == asid) {
hit = true;
-   per_cpu(reserved_asids, cpu) = newasid;
+   reserved_asid(info, cpu) = newasid;
}
}
 
@@ -158,7 +165,7 @@ static u64 new_context(struct asid_info *info, struct 
mm_struct *mm)
 * If our current ASID was active during a rollover, we
 * can continue to use it and this was just a false alarm.
 */
-   if (check_update_reserved_asid(asid, newasid))
+   if (check_update_reserved_asid(info, asid, newasid))
return newasid;
 
/*
@@ -207,8 +214,8 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 
/*
 * The memory ordering here is subtle.
-* If our active_asids is non-zero and the ASID matches the current
-* generation, then we update the active_asids entry with a relaxed
+* If our active_asid is non-zero and the ASID matches the current
+* generation, then we update the active_asid entry with a relaxed
 * cmpxchg. Racing with a concurrent rollover means that either:
 *
 * - We get a zero back from the cmpxchg and end up waiting on the
@@ -219,10 +226,10 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 *   relaxed xchg in flush_context will treat us as reserved
 *   because atomic RmWs are totally ordered for a given location.
 */
-   old_active_asid = atomic64_read(&per_cpu(active_asids, cpu));
+   old_active_asid = atomic64_read(&active_asid(info, cpu));
	if (old_active_asid &&
	    !((asid ^ atomic64_read(&info->generation)) >> asid_bits) &&
-	    atomic64_cmpxchg_relaxed(&per_cpu(active_asids, cpu),
+	    atomic64_cmpxchg_relaxed(&active_asid(info, cpu),
 old_active_asid, asid))
goto switch_mm_fastpath;
 
@@ -237,7 +244,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
if (cpumask_test_and_clear_cpu(cpu, _flush_pending))
local_flush_tlb_all();
 
-   atomic64_set(&per_cpu(active_asids, cpu), asid);
+   atomic64_set(&active_asid(info, cpu), asid);
	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
 
 switch_mm_fastpath:
@@ -278,6 +285,9 @@ static int a

[PATCH RFC 12/14] arm64/lib: asid: Allow user to update the context under the lock

2019-03-21 Thread Julien Grall
Some users of the ASID allocator (e.g. VMID) will need to update the
context when a new ASID is generated. This has to be protected by a lock
to prevent concurrent modification.

Rather than introducing yet another lock, it is possible to re-use the
allocator lock for that purpose. This patch introduces a new callback
that will be called when the context is updated.

Signed-off-by: Julien Grall 
---
 arch/arm64/include/asm/asid.h | 12 
 arch/arm64/lib/asid.c | 10 --
 arch/arm64/mm/context.c   | 11 ---
 3 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/asid.h b/arch/arm64/include/asm/asid.h
index bb62b587f37f..d8d9dc875bec 100644
--- a/arch/arm64/include/asm/asid.h
+++ b/arch/arm64/include/asm/asid.h
@@ -23,6 +23,8 @@ struct asid_info
unsigned intctxt_shift;
/* Callback to locally flush the context. */
void(*flush_cpu_ctxt_cb)(void);
+   /* Callback to call when a context is updated */
+   void(*update_ctxt_cb)(void *ctxt);
 };
 
 #define NUM_ASIDS(info)(1UL << ((info)->bits))
@@ -31,7 +33,7 @@ struct asid_info
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
 
 void asid_new_context(struct asid_info *info, atomic64_t *pasid,
- unsigned int cpu);
+ unsigned int cpu, void *ctxt);
 
 /*
  * Check the ASID is still valid for the context. If not generate a new ASID.
@@ -40,7 +42,8 @@ void asid_new_context(struct asid_info *info, atomic64_t 
*pasid,
  * @cpu: current CPU ID. Must have been acquired through get_cpu()
  */
 static inline void asid_check_context(struct asid_info *info,
- atomic64_t *pasid, unsigned int cpu)
+  atomic64_t *pasid, unsigned int cpu,
+  void *ctxt)
 {
u64 asid, old_active_asid;
 
@@ -67,11 +70,12 @@ static inline void asid_check_context(struct asid_info 
*info,
 old_active_asid, asid))
return;
 
-   asid_new_context(info, pasid, cpu);
+   asid_new_context(info, pasid, cpu, ctxt);
 }
 
 int asid_allocator_init(struct asid_info *info,
u32 bits, unsigned int asid_per_ctxt,
-   void (*flush_cpu_ctxt_cb)(void));
+   void (*flush_cpu_ctxt_cb)(void),
+   void (*update_ctxt_cb)(void *ctxt));
 
 #endif
diff --git a/arch/arm64/lib/asid.c b/arch/arm64/lib/asid.c
index 72b71bfb32be..b47e6769c1bc 100644
--- a/arch/arm64/lib/asid.c
+++ b/arch/arm64/lib/asid.c
@@ -130,9 +130,10 @@ static u64 new_context(struct asid_info *info, atomic64_t 
*pasid)
  * @pasid: Pointer to the current ASID batch allocated. It will be updated
  * with the new ASID batch.
  * @cpu: current CPU ID. Must have been acquired through get_cpu()
+ * @ctxt: Context to update when calling update_context
  */
 void asid_new_context(struct asid_info *info, atomic64_t *pasid,
- unsigned int cpu)
+ unsigned int cpu, void *ctxt)
 {
unsigned long flags;
u64 asid;
@@ -149,6 +150,9 @@ void asid_new_context(struct asid_info *info, atomic64_t 
*pasid,
info->flush_cpu_ctxt_cb();
 
	atomic64_set(&active_asid(info, cpu), asid);
+
+   info->update_ctxt_cb(ctxt);
+
	raw_spin_unlock_irqrestore(&info->lock, flags);
 }
 
@@ -163,11 +167,13 @@ void asid_new_context(struct asid_info *info, atomic64_t 
*pasid,
  */
 int asid_allocator_init(struct asid_info *info,
u32 bits, unsigned int asid_per_ctxt,
-   void (*flush_cpu_ctxt_cb)(void))
+   void (*flush_cpu_ctxt_cb)(void),
+   void (*update_ctxt_cb)(void *ctxt))
 {
info->bits = bits;
info->ctxt_shift = ilog2(asid_per_ctxt);
info->flush_cpu_ctxt_cb = flush_cpu_ctxt_cb;
+   info->update_ctxt_cb = update_ctxt_cb;
/*
 * Expect allocation after rollover to fail if we don't have at least
 * one more ASID than CPUs. ASID #0 is always reserved.
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 95ee7711a2ef..737b4bd7bbe7 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -82,7 +82,7 @@ void check_and_switch_context(struct mm_struct *mm, unsigned 
int cpu)
if (system_supports_cnp())
cpu_set_reserved_ttbr0();
 
-   asid_check_context(&asid_info, &mm->context.id, cpu);
+   asid_check_context(&asid_info, &mm->context.id, cpu, mm);
 
arm64_apply_bp_hardening();
 
@@ -108,12 +108,17 @@ static void asid_flush_cpu_ctxt(void)
local_flush_tlb_all();
 }
 
+static void asid_update_ctxt(void *ctxt)
+{
+   /* Nothing to do */
+}
+
 static int asids_init(void)
 {
u32 bits = get_cpu_asid_bits();
 
-
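
The locking idea can be illustrated with a simplified user-space sketch (not
the kernel code; a pthread mutex stands in for the allocator's raw spinlock):
the context-update callback runs while the allocator lock is already held, so
its user does not need a second lock of its own.

#include <pthread.h>
#include <stdio.h>

struct allocator {
	pthread_mutex_t lock;
	unsigned long next;
	void (*update_ctxt_cb)(void *ctxt, unsigned long new_id);
};

static void allocator_new_context(struct allocator *a, void *ctxt)
{
	unsigned long id;

	pthread_mutex_lock(&a->lock);
	id = ++a->next;			/* stand-in for new_context() */
	a->update_ctxt_cb(ctxt, id);	/* runs under the allocator lock */
	pthread_mutex_unlock(&a->lock);
}

struct vmid { unsigned long vmid; };

static void vmid_update_ctxt(void *ctxt, unsigned long new_id)
{
	((struct vmid *)ctxt)->vmid = new_id & 0xf;	/* e.g. a 4-bit VMID */
}

int main(void)
{
	struct allocator a = {
		.lock = PTHREAD_MUTEX_INITIALIZER,
		.update_ctxt_cb = vmid_update_ctxt,
	};
	struct vmid v = { 0 };

	allocator_new_context(&a, &v);
	printf("vmid = %lu\n", v.vmid);
	return 0;
}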

[PATCH RFC 04/14] arm64/mm: Move the variable lock and tlb_flush_pending to asid_info

2019-03-21 Thread Julien Grall
The variables lock and tlb_flush_pending hold information for a given
ASID allocator. So move them to the asid_info structure.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index da17ed6c7117..e98ab348b9cb 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -27,8 +27,6 @@
 #include 
 #include 
 
-static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
-
 struct asid_info
 {
atomic64_t  generation;
@@ -36,6 +34,9 @@ struct asid_info
atomic64_t __percpu *active;
u64 __percpu*reserved;
u32 bits;
+   raw_spinlock_t  lock;
+   /* Which CPU requires context flush on next call */
+   cpumask_t   flush_pending;
 } asid_info;
 
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
@@ -44,8 +45,6 @@ struct asid_info
 static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
 
-static cpumask_t tlb_flush_pending;
-
 #define ASID_MASK(info)(~GENMASK((info)->bits - 1, 0))
 #define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))
 
@@ -124,7 +123,7 @@ static void flush_context(struct asid_info *info)
 * Queue a TLB invalidation for each CPU to perform on next
 * context-switch
 */
-   cpumask_setall(&tlb_flush_pending);
+   cpumask_setall(&info->flush_pending);
 }
 
 static bool check_update_reserved_asid(struct asid_info *info, u64 asid,
@@ -233,7 +232,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 old_active_asid, asid))
goto switch_mm_fastpath;
 
-   raw_spin_lock_irqsave(&cpu_asid_lock, flags);
+   raw_spin_lock_irqsave(&info->lock, flags);
/* Check that our ASID belongs to the current generation. */
	asid = atomic64_read(&mm->context.id);
	if ((asid ^ atomic64_read(&info->generation)) >> info->bits) {
@@ -241,11 +240,11 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
		atomic64_set(&mm->context.id, asid);
}
 
-   if (cpumask_test_and_clear_cpu(cpu, &tlb_flush_pending))
+   if (cpumask_test_and_clear_cpu(cpu, &info->flush_pending))
local_flush_tlb_all();
 
	atomic64_set(&active_asid(info, cpu), asid);
-   raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
+   raw_spin_unlock_irqrestore(&info->lock, flags);
 
 switch_mm_fastpath:
 
@@ -288,6 +287,8 @@ static int asids_init(void)
	info->active = &active_asids;
	info->reserved = &reserved_asids;
 
+   raw_spin_lock_init(&info->lock);
+
pr_info("ASID allocator initialised with %lu entries\n",
NUM_USER_ASIDS(info));
return 0;
-- 
2.11.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
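
For reference, here is a tiny stand-alone sketch of the deferred-flush pattern
that flush_pending implements (not the kernel code; a plain bitmask stands in
for the cpumask): a rollover marks every CPU as needing a flush, and each CPU
clears its own bit and flushes locally the next time it installs an ASID.

#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8

struct allocator {
	unsigned long flush_pending;	/* one bit per CPU */
};

static void rollover(struct allocator *a)
{
	a->flush_pending = (1UL << NR_CPUS) - 1;	/* cpumask_setall() */
}

/* cpumask_test_and_clear_cpu() equivalent */
static bool test_and_clear_pending(struct allocator *a, int cpu)
{
	bool was_set = a->flush_pending & (1UL << cpu);

	a->flush_pending &= ~(1UL << cpu);
	return was_set;
}

int main(void)
{
	struct allocator a = { 0 };

	rollover(&a);
	if (test_and_clear_pending(&a, 3))
		printf("CPU3: local TLB flush before installing the new ASID\n");
	if (!test_and_clear_pending(&a, 3))
		printf("CPU3: no flush needed on the next switch\n");
	return 0;
}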


[PATCH RFC 13/14] arm/kvm: Introduce a new VMID allocator

2019-03-21 Thread Julien Grall
A follow-up patch will replace the KVM VMID allocator with the arm64 ASID
allocator. It is not yet clear how the code can be shared between arm
and arm64, so this is a verbatim copy of arch/arm64/lib/asid.c.

Signed-off-by: Julien Grall 
---
 arch/arm/include/asm/kvm_asid.h |  81 +
 arch/arm/kvm/Makefile   |   1 +
 arch/arm/kvm/asid.c | 191 
 3 files changed, 273 insertions(+)
 create mode 100644 arch/arm/include/asm/kvm_asid.h
 create mode 100644 arch/arm/kvm/asid.c

diff --git a/arch/arm/include/asm/kvm_asid.h b/arch/arm/include/asm/kvm_asid.h
new file mode 100644
index ..f312a6d7543c
--- /dev/null
+++ b/arch/arm/include/asm/kvm_asid.h
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ARM_KVM_ASID_H__
+#define __ARM_KVM_ASID_H__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct asid_info
+{
+   atomic64_t  generation;
+   unsigned long   *map;
+   atomic64_t __percpu *active;
+   u64 __percpu*reserved;
+   u32 bits;
+   /* Lock protecting the structure */
+   raw_spinlock_t  lock;
+   /* Which CPU requires context flush on next call */
+   cpumask_t   flush_pending;
+   /* Number of ASID allocated by context (shift value) */
+   unsigned intctxt_shift;
+   /* Callback to locally flush the context. */
+   void(*flush_cpu_ctxt_cb)(void);
+   /* Callback to call when a context is updated */
+   void(*update_ctxt_cb)(void *ctxt);
+};
+
+#define NUM_ASIDS(info)(1UL << ((info)->bits))
+#define NUM_CTXT_ASIDS(info)   (NUM_ASIDS(info) >> (info)->ctxt_shift)
+
+#define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
+
+void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+ unsigned int cpu, void *ctxt);
+
+/*
+ * Check the ASID is still valid for the context. If not generate a new ASID.
+ *
+ * @pasid: Pointer to the current ASID batch
+ * @cpu: current CPU ID. Must have been acquired through get_cpu()
+ */
+static inline void asid_check_context(struct asid_info *info,
+  atomic64_t *pasid, unsigned int cpu,
+  void *ctxt)
+{
+   u64 asid, old_active_asid;
+
+   asid = atomic64_read(pasid);
+
+   /*
+* The memory ordering here is subtle.
+* If our active_asid is non-zero and the ASID matches the current
+* generation, then we update the active_asid entry with a relaxed
+* cmpxchg. Racing with a concurrent rollover means that either:
+*
+* - We get a zero back from the cmpxchg and end up waiting on the
+*   lock. Taking the lock synchronises with the rollover and so
+*   we are forced to see the updated generation.
+*
+* - We get a valid ASID back from the cmpxchg, which means the
+*   relaxed xchg in flush_context will treat us as reserved
+*   because atomic RmWs are totally ordered for a given location.
+*/
+   old_active_asid = atomic64_read(&active_asid(info, cpu));
+   if (old_active_asid &&
+	    !((asid ^ atomic64_read(&info->generation)) >> info->bits) &&
+	    atomic64_cmpxchg_relaxed(&active_asid(info, cpu),
+old_active_asid, asid))
+   return;
+
+   asid_new_context(info, pasid, cpu, ctxt);
+}
+
+int asid_allocator_init(struct asid_info *info,
+   u32 bits, unsigned int asid_per_ctxt,
+   void (*flush_cpu_ctxt_cb)(void),
+   void (*update_ctxt_cb)(void *ctxt));
+
+#endif /* __ARM_KVM_ASID_H__ */
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index 531e59f5be9c..35d2d4c67827 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -21,6 +21,7 @@ obj-$(CONFIG_KVM_ARM_HOST) += hyp/
 
 obj-y += kvm-arm.o init.o interrupts.o
 obj-y += handle_exit.o guest.o emulate.o reset.o
+obj-y += asid.o
 obj-y += coproc.o coproc_a15.o coproc_a7.o   vgic-v3-coproc.o
 obj-y += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
 obj-y += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
diff --git a/arch/arm/kvm/asid.c b/arch/arm/kvm/asid.c
new file mode 100644
index ..60a25270163a
--- /dev/null
+++ b/arch/arm/kvm/asid.c
@@ -0,0 +1,191 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Generic ASID allocator.
+ *
+ * Based on arch/arm/mm/context.c
+ *
+ * Copyright (C) 2002-2003 Deep Blue Solutions Ltd, all rights reserved.
+ * Copyright (C) 2012 ARM Ltd.
+ */
+
+#include 
+
+#include 
+
+#define reserved_asid(info, cpu) *per_cpu_ptr((info)->reserved, cpu)
+
+#define ASID_MASK(info)(~GENMASK((info)->bits - 1, 0))
+#define ASID_FIRST_VERSION(info)   (1UL << ((in

[PATCH RFC 11/14] arm64: Move the ASID allocator code in a separate file

2019-03-21 Thread Julien Grall
We will want to re-use the ASID allocator in a separate context (e.g.
allocating VMIDs). So move the code to a new file.

The function asid_check_context has been moved into the header as a static
inline function because we want to avoid adding a branch when checking
whether the ASID is still valid.

Signed-off-by: Julien Grall 

---

This code will be used in the virt code for allocating VMIDs. I am not
entirely sure where to place it. Lib could potentially be a good place, but I
am not entirely convinced the algorithm as it is could be used by other
architectures.

Looking at x86, it seems that it will not be possible to re-use it because
the number of PCIDs (aka ASIDs) could be smaller than the number of CPUs.
See commit message 10af6235e0d327d42e1bad974385197817923dc1 "x86/mm:
Implement PCID based optimization: try to preserve old TLB entries using
PCID".
---
 arch/arm64/include/asm/asid.h |  77 ++
 arch/arm64/lib/Makefile   |   2 +
 arch/arm64/lib/asid.c | 185 +
 arch/arm64/mm/context.c   | 235 +-
 4 files changed, 267 insertions(+), 232 deletions(-)
 create mode 100644 arch/arm64/include/asm/asid.h
 create mode 100644 arch/arm64/lib/asid.c

diff --git a/arch/arm64/include/asm/asid.h b/arch/arm64/include/asm/asid.h
new file mode 100644
index ..bb62b587f37f
--- /dev/null
+++ b/arch/arm64/include/asm/asid.h
@@ -0,0 +1,77 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_ASM_ASID_H
+#define __ASM_ASM_ASID_H
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct asid_info
+{
+   atomic64_t  generation;
+   unsigned long   *map;
+   atomic64_t __percpu *active;
+   u64 __percpu*reserved;
+   u32 bits;
+   /* Lock protecting the structure */
+   raw_spinlock_t  lock;
+   /* Which CPU requires context flush on next call */
+   cpumask_t   flush_pending;
+   /* Number of ASID allocated by context (shift value) */
+   unsigned intctxt_shift;
+   /* Callback to locally flush the context. */
+   void(*flush_cpu_ctxt_cb)(void);
+};
+
+#define NUM_ASIDS(info)(1UL << ((info)->bits))
+#define NUM_CTXT_ASIDS(info)   (NUM_ASIDS(info) >> (info)->ctxt_shift)
+
+#define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
+
+void asid_new_context(struct asid_info *info, atomic64_t *pasid,
+ unsigned int cpu);
+
+/*
+ * Check the ASID is still valid for the context. If not generate a new ASID.
+ *
+ * @pasid: Pointer to the current ASID batch
+ * @cpu: current CPU ID. Must have been acquired through get_cpu()
+ */
+static inline void asid_check_context(struct asid_info *info,
+ atomic64_t *pasid, unsigned int cpu)
+{
+   u64 asid, old_active_asid;
+
+   asid = atomic64_read(pasid);
+
+   /*
+* The memory ordering here is subtle.
+* If our active_asid is non-zero and the ASID matches the current
+* generation, then we update the active_asid entry with a relaxed
+* cmpxchg. Racing with a concurrent rollover means that either:
+*
+* - We get a zero back from the cmpxchg and end up waiting on the
+*   lock. Taking the lock synchronises with the rollover and so
+*   we are forced to see the updated generation.
+*
+* - We get a valid ASID back from the cmpxchg, which means the
+*   relaxed xchg in flush_context will treat us as reserved
+*   because atomic RmWs are totally ordered for a given location.
+*/
+   old_active_asid = atomic64_read(&active_asid(info, cpu));
+   if (old_active_asid &&
+	    !((asid ^ atomic64_read(&info->generation)) >> info->bits) &&
+	    atomic64_cmpxchg_relaxed(&active_asid(info, cpu),
+old_active_asid, asid))
+   return;
+
+   asid_new_context(info, pasid, cpu);
+}
+
+int asid_allocator_init(struct asid_info *info,
+   u32 bits, unsigned int asid_per_ctxt,
+   void (*flush_cpu_ctxt_cb)(void));
+
+#endif
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 5540a1638baf..720df5ee2aa2 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -5,6 +5,8 @@ lib-y   := clear_user.o delay.o copy_from_user.o
\
   memcmp.o strcmp.o strncmp.o strlen.o strnlen.o   \
   strchr.o strrchr.o tishift.o
 
+lib-y  += asid.o
+
 ifeq ($(CONFIG_KERNEL_MODE_NEON), y)
 obj-$(CONFIG_XOR_BLOCKS)   += xor-neon.o
 CFLAGS_REMOVE_xor-neon.o   += -mgeneral-regs-only
diff --git a/arch/arm64/lib/asid.c b/arch/arm64/lib/asid.c
new file mode 100644
index ..72b71bfb32be
--- /dev/null
+++ b/arch/arm64/lib/as

[PATCH RFC 06/14] arm64/mm: Store the number of asid allocated per context

2019-03-21 Thread Julien Grall
Currently the number of ASIDs allocated per context is determined at
compile time. As the algorithm is becoming generic, the user may
want to instantiate the ASID allocator multiple times with a different
number of ASIDs allocated.

Add a field in asid_info to track the number of ASIDs allocated per context.
This is stored as a shift amount to avoid divisions in the code.

This means the number of ASIDs allocated per context should be a power of
two.

At the same time, rename NUM_USER_ASIDS to NUM_CTXT_ASIDS to make the
name more generic.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 488845c39c39..5a4c2b1aac71 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -37,6 +37,8 @@ struct asid_info
raw_spinlock_t  lock;
/* Which CPU requires context flush on next call */
cpumask_t   flush_pending;
+   /* Number of ASID allocated by context (shift value) */
+   unsigned intctxt_shift;
 } asid_info;
 
 #define active_asid(info, cpu) *per_cpu_ptr((info)->active, cpu)
@@ -49,15 +51,15 @@ static DEFINE_PER_CPU(u64, reserved_asids);
 #define ASID_FIRST_VERSION(info)   (1UL << ((info)->bits))
 
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
-#define NUM_USER_ASIDS(info)   (ASID_FIRST_VERSION(info) >> 1)
-#define asid2idx(info, asid)   (((asid) & ~ASID_MASK(info)) >> 1)
-#define idx2asid(info, idx)(((idx) << 1) & ~ASID_MASK(info))
+#define ASID_PER_CONTEXT   2
 #else
-#define NUM_USER_ASIDS(info)   (ASID_FIRST_VERSION(info))
-#define asid2idx(info, asid)   ((asid) & ~ASID_MASK(info))
-#define idx2asid(info, idx)asid2idx(info, idx)
+#define ASID_PER_CONTEXT   1
 #endif
 
+#define NUM_CTXT_ASIDS(info)   (ASID_FIRST_VERSION(info) >> 
(info)->ctxt_shift)
+#define asid2idx(info, asid)   (((asid) & ~ASID_MASK(info)) >> 
(info)->ctxt_shift)
+#define idx2asid(info, idx)(((idx) << (info)->ctxt_shift) & 
~ASID_MASK(info))
+
 /* Get the ASIDBits supported by the current CPU */
 static u32 get_cpu_asid_bits(void)
 {
@@ -102,7 +104,7 @@ static void flush_context(struct asid_info *info)
u64 asid;
 
/* Update the list of reserved ASIDs and the ASID bitmap. */
-   bitmap_clear(info->map, 0, NUM_USER_ASIDS(info));
+   bitmap_clear(info->map, 0, NUM_CTXT_ASIDS(info));
 
for_each_possible_cpu(i) {
		asid = atomic64_xchg_relaxed(&active_asid(info, i), 0);
@@ -182,8 +184,8 @@ static u64 new_context(struct asid_info *info, atomic64_t 
*pasid)
 * a reserved TTBR0 for the init_mm and we allocate ASIDs in even/odd
 * pairs.
 */
-   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS(info), cur_idx);
-   if (asid != NUM_USER_ASIDS(info))
+   asid = find_next_zero_bit(info->map, NUM_CTXT_ASIDS(info), cur_idx);
+   if (asid != NUM_CTXT_ASIDS(info))
goto set_asid;
 
/* We're out of ASIDs, so increment the global generation count */
@@ -192,7 +194,7 @@ static u64 new_context(struct asid_info *info, atomic64_t 
*pasid)
flush_context(info);
 
/* We have more ASIDs than CPUs, so this will always succeed */
-   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS(info), 1);
+   asid = find_next_zero_bit(info->map, NUM_CTXT_ASIDS(info), 1);
 
 set_asid:
__set_bit(asid, info->map);
@@ -272,17 +274,18 @@ static int asids_init(void)
	struct asid_info *info = &asid_info;
 
info->bits = get_cpu_asid_bits();
+   info->ctxt_shift = ilog2(ASID_PER_CONTEXT);
/*
 * Expect allocation after rollover to fail if we don't have at least
 * one more ASID than CPUs. ASID #0 is reserved for init_mm.
 */
-   WARN_ON(NUM_USER_ASIDS(info) - 1 <= num_possible_cpus());
+   WARN_ON(NUM_CTXT_ASIDS(info) - 1 <= num_possible_cpus());
	atomic64_set(&info->generation, ASID_FIRST_VERSION(info));
-   info->map = kcalloc(BITS_TO_LONGS(NUM_USER_ASIDS(info)),
+   info->map = kcalloc(BITS_TO_LONGS(NUM_CTXT_ASIDS(info)),
sizeof(*info->map), GFP_KERNEL);
if (!info->map)
panic("Failed to allocate bitmap for %lu ASIDs\n",
- NUM_USER_ASIDS(info));
+ NUM_CTXT_ASIDS(info));
 
	info->active = &active_asids;
	info->reserved = &reserved_asids;
@@ -290,7 +293,7 @@ static int asids_init(void)
	raw_spin_lock_init(&info->lock);
 
pr_info("ASID allocator initialised with %lu entries\n",
-   NUM_USER_ASIDS(info));
+   NUM_CTXT_ASIDS(info));
return 0;
 }
 early_initcall(asids_init);
-
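
As a worked example of the shift-based scheme (a stand-alone sketch, not part
of the patch): with 16-bit ASIDs and ASID_PER_CONTEXT == 2, ctxt_shift is 1,
NUM_CTXT_ASIDS is 32768, and asid2idx()/idx2asid() reduce to a shift instead
of a division.

#include <stdio.h>

#define BITS		16
#define CTXT_SHIFT	1				/* ilog2(ASID_PER_CONTEXT) */
#define ASID_MASK	(~((1UL << BITS) - 1))
#define NUM_CTXT_ASIDS	((1UL << BITS) >> CTXT_SHIFT)
#define asid2idx(asid)	(((asid) & ~ASID_MASK) >> CTXT_SHIFT)
#define idx2asid(idx)	(((idx) << CTXT_SHIFT) & ~ASID_MASK)

int main(void)
{
	unsigned long idx = 1234;

	printf("NUM_CTXT_ASIDS = %lu\n", NUM_CTXT_ASIDS);
	printf("idx %lu -> asid %lu -> idx %lu\n",
	       idx, idx2asid(idx), asid2idx(idx2asid(idx)));
	return 0;
}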

[PATCH RFC 00/14] kvm/arm: Align the VMID allocation with the arm64 ASID one

2019-03-21 Thread Julien Grall
This patch series moves the ASID allocator out into a separate file in order
to re-use it for the VMID. The benefits are:
- CPUs are not forced to exit at roll-over.
- Context invalidation is now per-CPU rather than
  broadcast.

There is no performance regression on the fast path for ASID allocation.
Actually, on the hackbench measurement (300 hackbench) it was 0.7% faster.

The measurement was made on a Seattle based SoC (8 CPUs), with the
number of VMID limited to 4-bit. The test involves running concurrently 40
guests with 2 vCPUs. Each guest will then execute hackbench 5 times
before exiting.

The performance differences between the current algorithm and the new one are:
- 2.5% fewer exits from the guest
- 22.4% more flushes, although they are now local rather than
broadcast
- 0.11% faster (just for the record)

The ASID allocator rework to make it generic has been divided in multiple
patches to make the review easier.

A branch with the patch based on 5.1-rc1 can be found:

http://xenbits.xen.org/gitweb/?p=people/julieng/linux-arm.git;a=shortlog;h=refs/heads/vmid-rework/rfc

Cheers,

Julien Grall (14):
  arm64/mm: Introduce asid_info structure and move
asid_generation/asid_map to it
  arm64/mm: Move active_asids and reserved_asids to asid_info
  arm64/mm: Move bits to asid_info
  arm64/mm: Move the variable lock and tlb_flush_pending to asid_info
  arm64/mm: Remove dependency on MM in new_context
  arm64/mm: Store the number of asid allocated per context
  arm64/mm: Introduce NUM_ASIDS
  arm64/mm: Split asid_inits in 2 parts
  arm64/mm: Split the function check_and_switch_context in 3 parts
  arm64/mm: Introduce a callback to flush the local context
  arm64: Move the ASID allocator code in a separate file
  arm64/lib: asid: Allow user to update the context under the lock
  arm/kvm: Introduce a new VMID allocator
  kvm/arm: Align the VMID allocation with the arm64 ASID one

 arch/arm/include/asm/kvm_asid.h   |  81 +++
 arch/arm/include/asm/kvm_asm.h|   2 +-
 arch/arm/include/asm/kvm_host.h   |   5 +-
 arch/arm/include/asm/kvm_hyp.h|   1 +
 arch/arm/kvm/Makefile |   1 +
 arch/arm/kvm/asid.c   | 191 +++
 arch/arm/kvm/hyp/tlb.c|   8 +-
 arch/arm64/include/asm/asid.h |  81 +++
 arch/arm64/include/asm/kvm_asid.h |   8 ++
 arch/arm64/include/asm/kvm_asm.h  |   2 +-
 arch/arm64/include/asm/kvm_host.h |   5 +-
 arch/arm64/kvm/hyp/tlb.c  |  10 +-
 arch/arm64/lib/Makefile   |   2 +
 arch/arm64/lib/asid.c | 191 +++
 arch/arm64/mm/context.c   | 205 ++
 virt/kvm/arm/arm.c| 112 +++--
 16 files changed, 638 insertions(+), 267 deletions(-)
 create mode 100644 arch/arm/include/asm/kvm_asid.h
 create mode 100644 arch/arm/kvm/asid.c
 create mode 100644 arch/arm64/include/asm/asid.h
 create mode 100644 arch/arm64/include/asm/kvm_asid.h
 create mode 100644 arch/arm64/lib/asid.c

-- 
2.11.0

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH RFC 01/14] arm64/mm: Introduce asid_info structure and move asid_generation/asid_map to it

2019-03-21 Thread Julien Grall
In an attempt to make the ASID allocator generic, create a new structure
asid_info to store all the information necessary for the allocator.

For now, move the variables asid_generation and asid_map to the new structure
asid_info. Follow-up patches will move more variables.

Note: to avoid more renaming afterwards, a local variable 'info' has been
created and is a pointer to the ASID allocator structure.

Signed-off-by: Julien Grall 
---
 arch/arm64/mm/context.c | 46 ++
 1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 1f0ea2facf24..34db54f1a39a 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -30,8 +30,11 @@
 static u32 asid_bits;
 static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
 
-static atomic64_t asid_generation;
-static unsigned long *asid_map;
+struct asid_info
+{
+   atomic64_t  generation;
+   unsigned long   *map;
+} asid_info;
 
 static DEFINE_PER_CPU(atomic64_t, active_asids);
 static DEFINE_PER_CPU(u64, reserved_asids);
@@ -88,13 +91,13 @@ void verify_cpu_asid_bits(void)
}
 }
 
-static void flush_context(void)
+static void flush_context(struct asid_info *info)
 {
int i;
u64 asid;
 
/* Update the list of reserved ASIDs and the ASID bitmap. */
-   bitmap_clear(asid_map, 0, NUM_USER_ASIDS);
+   bitmap_clear(info->map, 0, NUM_USER_ASIDS);
 
for_each_possible_cpu(i) {
-   asid = atomic64_xchg_relaxed(&per_cpu(active_asids, i), 0);
@@ -107,7 +110,7 @@ static void flush_context(void)
 */
if (asid == 0)
asid = per_cpu(reserved_asids, i);
-   __set_bit(asid2idx(asid), asid_map);
+   __set_bit(asid2idx(asid), info->map);
per_cpu(reserved_asids, i) = asid;
}
 
@@ -142,11 +145,11 @@ static bool check_update_reserved_asid(u64 asid, u64 
newasid)
return hit;
 }
 
-static u64 new_context(struct mm_struct *mm)
+static u64 new_context(struct asid_info *info, struct mm_struct *mm)
 {
static u32 cur_idx = 1;
	u64 asid = atomic64_read(&mm->context.id);
-   u64 generation = atomic64_read(&asid_generation);
+   u64 generation = atomic64_read(&info->generation);
 
if (asid != 0) {
u64 newasid = generation | (asid & ~ASID_MASK);
@@ -162,7 +165,7 @@ static u64 new_context(struct mm_struct *mm)
 * We had a valid ASID in a previous life, so try to re-use
 * it if possible.
 */
-   if (!__test_and_set_bit(asid2idx(asid), asid_map))
+   if (!__test_and_set_bit(asid2idx(asid), info->map))
return newasid;
}
 
@@ -173,20 +176,20 @@ static u64 new_context(struct mm_struct *mm)
 * a reserved TTBR0 for the init_mm and we allocate ASIDs in even/odd
 * pairs.
 */
-   asid = find_next_zero_bit(asid_map, NUM_USER_ASIDS, cur_idx);
+   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS, cur_idx);
if (asid != NUM_USER_ASIDS)
goto set_asid;
 
/* We're out of ASIDs, so increment the global generation count */
generation = atomic64_add_return_relaxed(ASID_FIRST_VERSION,
-						 &asid_generation);
-   flush_context();
+						 &info->generation);
+   flush_context(info);
 
/* We have more ASIDs than CPUs, so this will always succeed */
-   asid = find_next_zero_bit(asid_map, NUM_USER_ASIDS, 1);
+   asid = find_next_zero_bit(info->map, NUM_USER_ASIDS, 1);
 
 set_asid:
-   __set_bit(asid, asid_map);
+   __set_bit(asid, info->map);
cur_idx = asid;
return idx2asid(asid) | generation;
 }
@@ -195,6 +198,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 {
unsigned long flags;
u64 asid, old_active_asid;
+   struct asid_info *info = &asid_info;
 
if (system_supports_cnp())
cpu_set_reserved_ttbr0();
@@ -217,7 +221,7 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
 */
	old_active_asid = atomic64_read(&per_cpu(active_asids, cpu));
	if (old_active_asid &&
-	    !((asid ^ atomic64_read(&asid_generation)) >> asid_bits) &&
+	    !((asid ^ atomic64_read(&info->generation)) >> asid_bits) &&
	    atomic64_cmpxchg_relaxed(&per_cpu(active_asids, cpu),
 old_active_asid, asid))
goto switch_mm_fastpath;
@@ -225,8 +229,8 @@ void check_and_switch_context(struct mm_struct *mm, 
unsigned int cpu)
	raw_spin_lock_irqsave(&cpu_asid_lock, flags);
/* Check that our ASID belongs to the current generation. */
-   asid = atomic64_read(&mm->context.id);
-   if ((asid ^ atomic64_read(&asid_generation)) 

Re: [PATCH 05/11] KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded

2019-03-04 Thread Julien Grall

Hi,

On 04/03/2019 17:06, Marc Zyngier wrote:

On 04/03/2019 16:30, Julien Grall wrote:

Hi,

I noticed some issues with this patch when rebooting a guest after using perf.

[  577.513447] BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:908
[  577.521926] in_atomic(): 1, irqs_disabled(): 0, pid: 2323, name: qemu-system 
aar
[  577.529354] 1 lock held by qemu-system-aar/2323:
[  577.533998]  #0: f4f96804 (&vcpu->mutex){+.+.}, at:
kvm_vcpu_ioctl+0x74/0xac0
[  577.541865] Preemption disabled at:
[  577.541871] [] kvm_reset_vcpu+0x1c/0x1d0
[  577.550882] CPU: 6 PID: 2323 Comm: qemu-system-aar Tainted: GW  5.0.0
#1277
[  577.559137] Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive)
(DT)
[  577.566698] Call trace:
[  577.569138]  dump_backtrace+0x0/0x140
[  577.572793]  show_stack+0x14/0x20
[  577.576103]  dump_stack+0xa0/0xd4
[  577.579412]  ___might_sleep+0x1e4/0x2b0
[  577.583241]  __might_sleep+0x60/0xb8
[  577.586810]  __mutex_lock+0x58/0x860
[  577.590378]  mutex_lock_nested+0x1c/0x28
[  577.594294]  perf_event_ctx_lock_nested+0xf4/0x238
[  577.599078]  perf_event_read_value+0x24/0x60
[  577.603341]  kvm_pmu_get_counter_value+0x80/0xe8
[  577.607950]  kvm_pmu_stop_counter+0x2c/0x98
[  577.612126]  kvm_pmu_vcpu_reset+0x58/0xd0
[  577.616128]  kvm_reset_vcpu+0xec/0x1d0
[  577.619869]  kvm_arch_vcpu_ioctl+0x6b0/0x860
[  577.624131]  kvm_vcpu_ioctl+0xe0/0xac0
[  577.627876]  do_vfs_ioctl+0xbc/0x910
[  577.631443]  ksys_ioctl+0x78/0xa8
[  577.634751]  __arm64_sys_ioctl+0x1c/0x28
[  577.638667]  el0_svc_common+0x90/0x118
[  577.642408]  el0_svc_handler+0x2c/0x80
[  577.646150]  el0_svc+0x8/0xc

This is happening because the vCPU reset code is now running with preemption
disabled. However, the perf code cannot be called with preemption disabled as it
is using a mutex.

Do you have any suggestion on the way to fix this potential issue?


Given that the PMU is entirely emulated, it never has any state loaded
on the CPU. It thus doesn't need to be part of the non-preemptible section.

Can you please give this (untested) patchlet one a go? It's not exactly
pretty, but I believe it will do the trick.


It does the trick. Are you going to submit the patch?

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 05/11] KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded

2019-03-04 Thread Julien Grall

Hi,

I noticed some issues with this patch when rebooting a guest after using perf.

[  577.513447] BUG: sleeping function called from invalid context at 
kernel/locking/mutex.c:908

[  577.521926] in_atomic(): 1, irqs_disabled(): 0, pid: 2323, name: qemu-system 
aar
[  577.529354] 1 lock held by qemu-system-aar/2323:
[  577.533998]  #0: f4f96804 (&vcpu->mutex){+.+.}, at: 
kvm_vcpu_ioctl+0x74/0xac0

[  577.541865] Preemption disabled at:
[  577.541871] [] kvm_reset_vcpu+0x1c/0x1d0
[  577.550882] CPU: 6 PID: 2323 Comm: qemu-system-aar Tainted: GW  5.0.0 
#1277
[  577.559137] Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) 
(DT)

[  577.566698] Call trace:
[  577.569138]  dump_backtrace+0x0/0x140
[  577.572793]  show_stack+0x14/0x20
[  577.576103]  dump_stack+0xa0/0xd4
[  577.579412]  ___might_sleep+0x1e4/0x2b0
[  577.583241]  __might_sleep+0x60/0xb8
[  577.586810]  __mutex_lock+0x58/0x860
[  577.590378]  mutex_lock_nested+0x1c/0x28
[  577.594294]  perf_event_ctx_lock_nested+0xf4/0x238
[  577.599078]  perf_event_read_value+0x24/0x60
[  577.603341]  kvm_pmu_get_counter_value+0x80/0xe8
[  577.607950]  kvm_pmu_stop_counter+0x2c/0x98
[  577.612126]  kvm_pmu_vcpu_reset+0x58/0xd0
[  577.616128]  kvm_reset_vcpu+0xec/0x1d0
[  577.619869]  kvm_arch_vcpu_ioctl+0x6b0/0x860
[  577.624131]  kvm_vcpu_ioctl+0xe0/0xac0
[  577.627876]  do_vfs_ioctl+0xbc/0x910
[  577.631443]  ksys_ioctl+0x78/0xa8
[  577.634751]  __arm64_sys_ioctl+0x1c/0x28
[  577.638667]  el0_svc_common+0x90/0x118
[  577.642408]  el0_svc_handler+0x2c/0x80
[  577.646150]  el0_svc+0x8/0xc

This is happening because the vCPU reset code is now running with preemption 
disabled. However, the perf code cannot be called with preemption disabled as it 
is using a mutex.


Do you have any suggestion on the way to fix this potential issue?

Cheers,

On 07/02/2019 13:18, Marc Zyngier wrote:

From: Christoffer Dall 

We have two ways to reset a vcpu:
- either through VCPU_INIT
- or through a PSCI_ON call

The first one is easy to reason about. The second one is implemented
in a more bizarre way, as it is the vcpu that handles PSCI_ON that
resets the vcpu that is being powered-on. As we need to turn the logic
around and have the target vcpu to reset itself, we must take some
preliminary steps.

Resetting the VCPU state modifies the system register state in memory,
but this may interact with vcpu_load/vcpu_put if running with preemption
disabled, which in turn may lead to corrupted system register state.

Address this by disabling preemption and doing put/load if required
around the reset logic.

Reviewed-by: Andrew Jones 
Signed-off-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
  arch/arm64/kvm/reset.c | 26 --
  1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index b72a3dd56204..f21a2a575939 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -105,16 +105,33 @@ int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, 
long ext)
   * This function finds the right table above and sets the registers on
   * the virtual CPU struct to their architecturally defined reset
   * values.
+ *
+ * Note: This function can be called from two paths: The KVM_ARM_VCPU_INIT
+ * ioctl or as part of handling a request issued by another VCPU in the PSCI
+ * handling code.  In the first case, the VCPU will not be loaded, and in the
+ * second case the VCPU will be loaded.  Because this function operates purely
+ * on the memory-backed values of system registers, we want to do a full put if
+ * we were loaded (handling a request) and load the values back at the end of
+ * the function.  Otherwise we leave the state alone.  In both cases, we
+ * disable preemption around the vcpu reset as we would otherwise race with
+ * preempt notifiers which also call put/load.
   */
  int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
  {
const struct kvm_regs *cpu_reset;
+   int ret = -EINVAL;
+   bool loaded;
+
+   preempt_disable();
+   loaded = (vcpu->cpu != -1);
+   if (loaded)
+   kvm_arch_vcpu_put(vcpu);
  
  	switch (vcpu->arch.target) {

default:
if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
if (!cpu_has_32bit_el1())
-   return -EINVAL;
+   goto out;
cpu_reset = _regs_reset32;
} else {
cpu_reset = _regs_reset;
@@ -137,7 +154,12 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
  
  	/* Reset timer */

-   return kvm_timer_vcpu_reset(vcpu);
+   ret = kvm_timer_vcpu_reset(vcpu);
+out:
+   if (loaded)
+   kvm_arch_vcpu_load(vcpu, smp_processor_id());
+   preempt_enable();
+   return ret;
  }
  
  void kvm_set_ipa_limit(void)





Re: [PATCH v5 13/26] KVM: arm64/sve: System register context switch and access support

2019-02-27 Thread Julien Grall

Hi Dave,

On 2/27/19 1:50 PM, Dave Martin wrote:

On Wed, Feb 27, 2019 at 12:02:46PM +, Julien Grall wrote:

Hi Dave,

On 2/26/19 5:01 PM, Dave Martin wrote:

On Tue, Feb 26, 2019 at 04:32:30PM +, Julien Grall wrote:

On 18/02/2019 19:52, Dave Martin wrote:
We seem to already have code for handling invariant registers as well as
reading ID register. I guess the only reason you can't use them is because
of the check the vcpu is using SVE.

However, AFAICT the restrictions callback would prevent you to enter the
{get, set}_id if the vCPU does not support SVE. So the check should not be
reachable.


Hmmm, those checks were inherited from before this refactoring.

You're right: the checks are now done a common place, so the checks in
the actual accessors should be redundant.

I could demote them to WARN(), but it may make sense simply to delete
them.


I think removing the WARN() would be best as it would avoid introducing
most of the wrappers for the registers.



The access_id_aa64zfr0_el1() should still be reachable, since we don't
have REG_NO_GUEST for this.


__access_id_reg is taking a boolean to tell whether the register is RAZ or
not. So you probably could re-use it passing !vcpu_has_sve(vcpu).

It feels to me we would introduce a new restriction to tell whether the
register should be RAZ. Anyway, the new restriction is probably for a
follow-up patch.


It's true that we should be able to handle these as regular ID regs in
the get()/set() case, when SVE is enabled for the vcpu.  I'll have a
think about how to reduce the amount of special-case code here maybe
we can indeed get of some of these accessors entitely now that access
is rejected earlier, in a more generic way.

The access() case for this register still has to be custom though; I
don't see a trivial solution for that.


I believe you can implement access_id_aa64zfr0_el1 in one line:

return __access_id_reg(vcpu, p, r, !vcpu_has_sve(vcpu));

Another possibility is to introduce REG_GUEST_RAZ and use the 
restrictions callback to set it when the vCPU is not using SVE.


Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 13/26] KVM: arm64/sve: System register context switch and access support

2019-02-27 Thread Julien Grall

Hi Dave,

On 2/26/19 5:01 PM, Dave Martin wrote:

On Tue, Feb 26, 2019 at 04:32:30PM +, Julien Grall wrote:

On 18/02/2019 19:52, Dave Martin wrote:
We seem to already have code for handling invariant registers as well as
reading ID register. I guess the only reason you can't use them is because
of the check the vcpu is using SVE.

However, AFAICT the restrictions callback would prevent you to enter the
{get, set}_id if the vCPU does not support SVE. So the check should not be
reachable.


Hmmm, those checks were inherited from before this refactoring.

You're right: the checks are now done a common place, so the checks in
the actual accessors should be redundant.

I could demote them to WARN(), but it may make sense simply to delete
them.


I think removing the WARN() would be best as it would avoid introducing 
most of the wrappers for the registers.




The access_id_aa64zfr0_el1() should still be reachable, since we don't
have REG_NO_GUEST for this.


__access_id_reg is taking a boolean to tell whether the register is RAZ 
or not. So you probably could re-use it passing !vcpu_has_sve(vcpu).


It feels to me we would introduce a new restriction to tell whether the 
register should be RAZ. Anyway, the new restriction is probably for a 
follow-up patch.


Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 14/26] KVM: arm64/sve: Context switch the SVE registers

2019-02-26 Thread Julien Grall

Hi Dave,

On 26/02/2019 12:13, Dave Martin wrote:

On Wed, Feb 20, 2019 at 04:46:57PM +, Julien Thierry wrote:



On 18/02/2019 19:52, Dave Martin wrote:

In order to give each vcpu its own view of the SVE registers, this
patch adds context storage via a new sve_state pointer in struct
vcpu_arch.  An additional member sve_max_vl is also added for each
vcpu, to determine the maximum vector length visible to the guest
and thus the value to be configured in ZCR_EL2.LEN while the is


"While the  is active"?


Hmmm, yes.  Thanks for deciphering that.  Done.
I think it would be more consistent if you use "vcpu" over "guest". After all 
ZCR_EL2.LEN is per vCPU.


Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 13/26] KVM: arm64/sve: System register context switch and access support

2019-02-26 Thread Julien Grall

Hi Dave,

On 18/02/2019 19:52, Dave Martin wrote:

@@ -1091,6 +1088,95 @@ static int reg_from_user(u64 *val, const void __user 
*uaddr, u64 id);
  static int reg_to_user(void __user *uaddr, const u64 *val, u64 id);
  static u64 sys_reg_to_index(const struct sys_reg_desc *reg);
  
+static unsigned int sve_restrictions(const struct kvm_vcpu *vcpu,

+const struct sys_reg_desc *rd)
+{
+   return vcpu_has_sve(vcpu) ? 0 : REG_NO_USER | REG_NO_GUEST;
+}
+
+static unsigned int sve_id_restrictions(const struct kvm_vcpu *vcpu,
+   const struct sys_reg_desc *rd)
+{
+   return vcpu_has_sve(vcpu) ? 0 : REG_NO_USER;
+}
+
+static int get_zcr_el1(struct kvm_vcpu *vcpu,
+  const struct sys_reg_desc *rd,
+  const struct kvm_one_reg *reg, void __user *uaddr)
+{
+   if (WARN_ON(!vcpu_has_sve(vcpu)))
+   return -ENOENT;
+
+   return reg_to_user(uaddr, &vcpu->arch.ctxt.sys_regs[ZCR_EL1],
+  reg->id);
+}
+
+static int set_zcr_el1(struct kvm_vcpu *vcpu,
+  const struct sys_reg_desc *rd,
+  const struct kvm_one_reg *reg, void __user *uaddr)
+{
+   if (WARN_ON(!vcpu_has_sve(vcpu)))
+   return -ENOENT;
+
+   return reg_from_user(&vcpu->arch.ctxt.sys_regs[ZCR_EL1], uaddr,
+reg->id);
+}
+
+/* Generate the emulated ID_AA64ZFR0_EL1 value exposed to the guest */
+static u64 guest_id_aa64zfr0_el1(const struct kvm_vcpu *vcpu)
+{
+   if (!vcpu_has_sve(vcpu))
+   return 0;
+
+   return read_sanitised_ftr_reg(SYS_ID_AA64ZFR0_EL1);
+}
+
+static bool access_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
+  struct sys_reg_params *p,
+  const struct sys_reg_desc *rd)
+{
+   if (p->is_write)
+   return write_to_read_only(vcpu, p, rd);
+
+   p->regval = guest_id_aa64zfr0_el1(vcpu);
+   return true;
+}
+
+static int get_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
+   const struct sys_reg_desc *rd,
+   const struct kvm_one_reg *reg, void __user *uaddr)
+{
+   u64 val;
+
+   if (!vcpu_has_sve(vcpu))
+   return -ENOENT;
+
+   val = guest_id_aa64zfr0_el1(vcpu);
+   return reg_to_user(uaddr, &val, reg->id);
+}
+
+static int set_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
+   const struct sys_reg_desc *rd,
+   const struct kvm_one_reg *reg, void __user *uaddr)
+{
+   const u64 id = sys_reg_to_index(rd);
+   int err;
+   u64 val;
+
+   if (!vcpu_has_sve(vcpu))
+   return -ENOENT;
+
+   err = reg_from_user(&val, uaddr, id);
+   if (err)
+   return err;
+
+   /* This is what we mean by invariant: you can't change it. */
+   if (val != guest_id_aa64zfr0_el1(vcpu))
+   return -EINVAL;
+
+   return 0;
+}


We seem to already have code for handling invariant registers as well as reading 
ID registers. I guess the only reason you can't use them is because of the check 
that the vcpu is using SVE.


However, AFAICT the restrictions callback would prevent you from entering the {get, 
set}_id if the vCPU does not support SVE. So the check should not be reachable.


Did I miss anything?

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 08/26] arm64/sve: Enable SVE state tracking for non-task contexts

2019-02-26 Thread Julien Grall

Hi Dave,

On 26/02/2019 15:58, Dave Martin wrote:

On Tue, Feb 26, 2019 at 03:49:00PM +, Julien Grall wrote:

Hi Dave,

On 26/02/2019 12:07, Dave Martin wrote:

On Fri, Feb 22, 2019 at 03:26:51PM +, Julien Grall wrote:

Hi Dave,

On 18/02/2019 19:52, Dave Martin wrote:

The current FPSIMD/SVE context handling support for non-task (i.e.,
KVM vcpu) contexts does not take SVE into account.  This means that


NIT: Double-space before "This".


See patch 2...

[...]

Does the code look reasonable to you?  This interacts with FPSIMD/SVE
context switch in the host, so it would be good to have your view on it.


I wanted to look at the rest before giving my reviewed-by tag.
FWIW, this patch looks reasonable to me.


OK, does that amount to a Reviewed-by, or do you have other comments?


I have no further comments on this patch.

Reviewed-by: Julien Grall 

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 08/26] arm64/sve: Enable SVE state tracking for non-task contexts

2019-02-26 Thread Julien Grall

Hi Dave,

On 26/02/2019 12:07, Dave Martin wrote:

On Fri, Feb 22, 2019 at 03:26:51PM +, Julien Grall wrote:

Hi Dave,

On 18/02/2019 19:52, Dave Martin wrote:

The current FPSIMD/SVE context handling support for non-task (i.e.,
KVM vcpu) contexts does not take SVE into account.  This means that


NIT: Double-space before "This".


See patch 2...

[...]

Does the code look reasonable to you?  This interacts with FPSIMD/SVE
context switch in the host, so it would be good to have your view on it.


I wanted to look at the rest before giving my reviewed-by tag.
FWIW, this patch looks reasonable to me.

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 06/26] arm64/sve: Check SVE virtualisability

2019-02-26 Thread Julien Grall

Hi Dave,

On 26/02/2019 12:06, Dave Martin wrote:

On Thu, Feb 21, 2019 at 01:36:26PM +, Julien Grall wrote:

Hi Dave,

On 18/02/2019 19:52, Dave Martin wrote:

+   /*
+* Mismatches above sve_max_virtualisable_vl are fine, since
+* no guest is allowed to configure ZCR_EL2.LEN to exceed this:
+*/
+   if (sve_vl_from_vq(bit_to_vq(b)) <= sve_max_virtualisable_vl) {
+   pr_warn("SVE: cpu%d: Unsupported vector length(s) present\n",
+   smp_processor_id());


Would it be worth printing the unsupported vector length?


Possibly not, but admittedly the intent is a bit unclear in this patch.

See my reply to Julien Thierry (and respond on that subthread if you
have comments, so that we don't end up with two subthreads discussing
the same thing...)


I will have a look at the thread.

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 02/26] arm64: fpsimd: Always set TIF_FOREIGN_FPSTATE on task state flush

2019-02-26 Thread Julien Grall




On 26/02/2019 12:06, Dave Martin wrote:

On Thu, Feb 21, 2019 at 12:39:39PM +, Julien Grall wrote:

Hi Dave,

On 18/02/2019 19:52, Dave Martin wrote:

This patch updates fpsimd_flush_task_state() to mirror the new
semantics of fpsimd_flush_cpu_state() introduced by commit
d8ad71fa38a9 ("arm64: fpsimd: Fix TIF_FOREIGN_FPSTATE after
invalidating cpu regs").  Both functions now implicitly set


NIT: Double-space before "Both"


TIF_FOREIGN_FPSTATE to indicate that the task's FPSIMD state is not
loaded into the cpu.

As a side-effect, fpsimd_flush_task_state() now sets
TIF_FOREIGN_FPSTATE even for non-running tasks.  In the case of


NIT: Double-space before "In".


non-running tasks this is not useful but also harmless, because the
flag is live only while the corresponding task is running.  This
function is not called from fast paths, so special-casing this for
the task == current case is not really worth it.

Compiler barriers previously present in restore_sve_fpsimd_context()
are pulled into fpsimd_flush_task_state() so that it can be safely
called with preemption enabled if necessary.

Explicit calls to set TIF_FOREIGN_FPSTATE that accompany
fpsimd_flush_task_state() calls and are now redundant are removed
as appropriate.

fpsimd_flush_task_state() is used to get exclusive access to the
representation of the task's state via task_struct, for the purpose
of replacing the state.  Thus, the call to this function should


NIT: Double-space before "Thus".


happen before manipulating fpsimd_state or sve_state etc. in
task_struct.  Anomalous cases are reordered appropriately in order


NIT: Double-space before "Anomalous".


A habit rather than a mistake [1], and I don't propose to change it ;)


I wasn't aware of this. Thank you for the pointer! Please ignore my comments 
about it :).



Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 11/26] KVM: arm64: Extend reset_unknown() to handle mixed RES0/UNKNOWN registers

2019-02-22 Thread Julien Grall

Hi Dave,

On 18/02/2019 19:52, Dave Martin wrote:

The reset_unknown() system register helper initialises a guest
register to a distinctive junk value on vcpu reset, to help expose
and debug deficient register initialisation within the guest.

Some registers such as the SVE control register ZCR_EL1 contain a
mixture of UNKNOWN fields and RES0 bits.  For these,
reset_unknown() does not work at present, since it sets all bits to
junk values instead of just the wanted bits.

There is no need to craft another special helper just for that,
since reset_unknown() almost does the appropriate thing anyway.
This patch takes advantage of the unused val field in struct
sys_reg_desc to specify a mask of bits that should be initialised
to zero instead of junk.

All existing users of reset_unknown() do not (and should not)
define a value for val, so they will implicitly set it to zero,
resulting in all bits being made UNKNOWN by this function: thus,
this patch makes no functional change for currently defined
registers.

Future patches will make use of non-zero val.

Signed-off-by: Dave Martin 


Reviewed-by: Julien Grall 

Cheers,


---
  arch/arm64/kvm/sys_regs.h | 11 +--
  1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index 3b1bc7f..174ffc0 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -56,7 +56,12 @@ struct sys_reg_desc {
/* Index into sys_reg[], or 0 if we don't need to save it. */
int reg;
  
-	/* Value (usually reset value) */

+   /*
+* Value (usually reset value)
+* For reset_unknown, each bit set to 1 in val is treated as
+* RES0 in the register: the corresponding register bit is
+* reset to 0 instead of "unknown".
+*/
u64 val;
  
  	/* Custom get/set_user functions, fallback to generic if NULL */

@@ -92,7 +97,9 @@ static inline void reset_unknown(struct kvm_vcpu *vcpu,
  {
BUG_ON(!r->reg);
BUG_ON(r->reg >= NR_SYS_REGS);
-   __vcpu_sys_reg(vcpu, r->reg) = 0x1de7ec7edbadc0deULL;
+
+   /* If non-zero, r->val specifies which register bits are RES0: */
+   __vcpu_sys_reg(vcpu, r->reg) = 0x1de7ec7edbadc0deULL & ~r->val;
  }
  
  static inline void reset_val(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
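
As a purely hypothetical illustration of a future user (not taken from this 
series), a register with mixed UNKNOWN/RES0 fields such as ZCR_EL1 could then 
pass its RES0 mask through the val field. The access handler name below is 
made up:

	/* ZCR_EL1.LEN (bits [3:0]) is UNKNOWN on reset, everything else is RES0 */
	{ SYS_DESC(SYS_ZCR_EL1), .access = access_zcr_el1,
	  .reset = reset_unknown, .reg = ZCR_EL1, .val = ~0xfULL },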




--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 08/26] arm64/sve: Enable SVE state tracking for non-task contexts

2019-02-22 Thread Julien Grall

Hi Dave,

On 18/02/2019 19:52, Dave Martin wrote:

The current FPSIMD/SVE context handling support for non-task (i.e.,
KVM vcpu) contexts does not take SVE into account.  This means that


NIT: Double-space before "This".


only task contexts can safely use SVE at present.

In preparation for enabling KVM guests to use SVE, it is necessary
to keep track of SVE state for non-task contexts too.

This patch adds the necessary support, removing assumptions from
the context switch code about the location of the SVE context
storage.

When binding a vcpu context, its vector length is arbitrarily
specified as SVE_VL_MIN for now.  In any case, because TIF_SVE is


NIT: Double-space before "In".


presently cleared at vcpu context bind time, the specified vector
length will not be used for anything yet.  In later patches TIF_SVE


NIT: Double-space before "In".


will be set here as appropriate, and the appropriate maximum vector
length for the vcpu will be passed when binding.


Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 06/14] KVM: arm/arm64: Factor out VMID into struct kvm_vmid

2019-02-22 Thread Julien Grall

Hi Marc,

On 22/02/2019 09:18, Marc Zyngier wrote:

On Thu, 21 Feb 2019 11:02:56 +
Julien Grall  wrote:

Hi Julien,


Hi Christoffer,

On 24/01/2019 14:00, Christoffer Dall wrote:

Note that to avoid mapping the kvm_vmid_bits variable into hyp, we
simply forego the masking of the vmid value in kvm_get_vttbr and rely on
update_vmid to always assign a valid vmid value (within the supported
range).


[...]


-   kvm->arch.vmid = kvm_next_vmid;
+   vmid->vmid = kvm_next_vmid;
kvm_next_vmid++;
-   kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
-
-   /* update vttbr to be used with the new vmid */
-   pgd_phys = virt_to_phys(kvm->arch.pgd);
-   BUG_ON(pgd_phys & ~kvm_vttbr_baddr_mask(kvm));
-   vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & 
VTTBR_VMID_MASK(kvm_vmid_bits);
-   kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid | cnp;
+   kvm_next_vmid &= (1 << kvm_get_vmid_bits()) - 1;


The arm64 version of kvm_get_vmid_bits() does not look cheap. Indeed, it requires
reading the sanitized value of SYS_ID_AA64MMFR1_EL1, which is implemented using
bsearch().

So wouldn't it be better to keep the kvm_vmid_bits variable for use in
update_vttbr()?


How often does this happen? Can you measure this overhead at all?

My understanding is that we hit this path on rollover only, having IPIed
all CPUs and invalidated all TLBs. I seriously doubt you can observe
any sort of overhead at all, given that it is so incredibly rare. But
feel free to prove me wrong!


That would happen on roll-over and the first time you allocate a VMID for the VM.

I am planning to run some tests with 3-bit VMIDs and provide the results next week.

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 07/26] arm64/sve: Clarify role of the VQ map maintenance functions

2019-02-21 Thread Julien Grall

Hi Dave,

On 18/02/2019 19:52, Dave Martin wrote:

The roles of sve_init_vq_map(), sve_update_vq_map() and
sve_verify_vq_map() are highly non-obvious to anyone who has not dug
through cpufeatures.c in detail.

Since the way these functions interact with each other is more
important here than a full understanding of the cpufeatures code, this
patch adds comments to make the functions' roles clearer.

No functional change.

Signed-off-by: Dave Martin 


Reviewed-by: Julien Grall 

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 06/26] arm64/sve: Check SVE virtualisability

2019-02-21 Thread Julien Grall

Hi Dave,

On 18/02/2019 19:52, Dave Martin wrote:

+   /*
+* Mismatches above sve_max_virtualisable_vl are fine, since
+* no guest is allowed to configure ZCR_EL2.LEN to exceed this:
+*/
+   if (sve_vl_from_vq(bit_to_vq(b)) <= sve_max_virtualisable_vl) {
+   pr_warn("SVE: cpu%d: Unsupported vector length(s) present\n",
+   smp_processor_id());


Would it be worth printing the unsupported vector length?

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 02/26] arm64: fpsimd: Always set TIF_FOREIGN_FPSTATE on task state flush

2019-02-21 Thread Julien Grall

Hi Dave,

On 18/02/2019 19:52, Dave Martin wrote:

This patch updates fpsimd_flush_task_state() to mirror the new
semantics of fpsimd_flush_cpu_state() introduced by commit
d8ad71fa38a9 ("arm64: fpsimd: Fix TIF_FOREIGN_FPSTATE after
invalidating cpu regs").  Both functions now implicitly set


NIT: Double-space before "Both"


TIF_FOREIGN_FPSTATE to indicate that the task's FPSIMD state is not
loaded into the cpu.

As a side-effect, fpsimd_flush_task_state() now sets
TIF_FOREIGN_FPSTATE even for non-running tasks.  In the case of


NIT: Double-space before "In".


non-running tasks this is not useful but also harmless, because the
flag is live only while the corresponding task is running.  This
function is not called from fast paths, so special-casing this for
the task == current case is not really worth it.

Compiler barriers previously present in restore_sve_fpsimd_context()
are pulled into fpsimd_flush_task_state() so that it can be safely
called with preemption enabled if necessary.

Explicit calls to set TIF_FOREIGN_FPSTATE that accompany
fpsimd_flush_task_state() calls and are now redundant are removed
as appropriate.

fpsimd_flush_task_state() is used to get exclusive access to the
representation of the task's state via task_struct, for the purpose
of replacing the state.  Thus, the call to this function should


NIT: Double-space before "Thus".


happen before manipulating fpsimd_state or sve_state etc. in
task_struct.  Anomalous cases are reordered appropriately in order


NIT: Double-space before "Anomalous".


to make the code more consistent, although there should be no
functional difference since these cases are protected by
local_bh_disable() anyway.

Signed-off-by: Dave Martin 
Reviewed-by: Alex Bennée 


Reviewed-by: Julien Grall 

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v5 01/26] KVM: Documentation: Document arm64 core registers in detail

2019-02-21 Thread Julien Grall

Hi Dave,

On 18/02/2019 19:52, Dave Martin wrote:

Since the the sizes of individual members of the core arm64
registers vary, the list of register encodings that make sense is
not a simple linear sequence.

To clarify which encodings to use, this patch adds a brief list
to the documentation.

Signed-off-by: Dave Martin 


Reviewed-by: Julien Grall 

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 06/14] KVM: arm/arm64: Factor out VMID into struct kvm_vmid

2019-02-21 Thread Julien Grall
Hi Christoffer,

On 24/01/2019 14:00, Christoffer Dall wrote:
> Note that to avoid mapping the kvm_vmid_bits variable into hyp, we
> simply forego the masking of the vmid value in kvm_get_vttbr and rely on
> update_vmid to always assign a valid vmid value (within the supported
> range).

[...]

> - kvm->arch.vmid = kvm_next_vmid;
> + vmid->vmid = kvm_next_vmid;
>   kvm_next_vmid++;
> - kvm_next_vmid &= (1 << kvm_vmid_bits) - 1;
> -
> - /* update vttbr to be used with the new vmid */
> - pgd_phys = virt_to_phys(kvm->arch.pgd);
> - BUG_ON(pgd_phys & ~kvm_vttbr_baddr_mask(kvm));
> - vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & 
> VTTBR_VMID_MASK(kvm_vmid_bits);
> - kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid | cnp;
> + kvm_next_vmid &= (1 << kvm_get_vmid_bits()) - 1;

The arm64 version of kvm_get_vmid_bits() does not look cheap. Indeed, it requires 
reading the sanitized value of SYS_ID_AA64MMFR1_EL1, which is implemented using 
bsearch().

So wouldn't it be better to keep the kvm_vmid_bits variable for use in 
update_vttbr()?
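
To be clear, by "keep the kvm_vmid_bits variable" I mean something along these 
lines (sketch only, the init hook name is made up):

/*
 * Cache the VMID field width once at init time rather than re-reading
 * the sanitized ID register on every VMID rollover.
 */
static unsigned int kvm_vmid_bits __read_mostly;

void kvm_arm_vmid_bits_init(void)
{
	kvm_vmid_bits = kvm_get_vmid_bits();
}

static u32 vmid_mask(void)
{
	return (1 << kvm_vmid_bits) - 1;
}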

Cheers,

-- 
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 1/3] KVM: arm/arm64: vgic: Make vgic_irq->irq_lock a raw_spinlock

2019-02-01 Thread Julien Grall

Hi Julia,

On 01/02/2019 17:36, Julia Cartwright wrote:

On Fri, Feb 01, 2019 at 03:30:58PM +, Julien Grall wrote:

Hi Julien,

On 07/01/2019 15:06, Julien Thierry wrote:

vgic_irq->irq_lock must always be taken with interrupts disabled as
it is used in interrupt context.


I am a bit confused by the reasoning here. The commit message mentions that
ap_list_lock could be taken from the timer interrupt handler. I assume this
refers to kvm_arch_timer_handler(). Looking at how that interrupt is
configured, the IRQF_NO_THREAD flag is not set, so the interrupt should be
force-threaded when CONFIG_PREEMPT_RT_FULL is set. If my understanding is
correct, this means the interrupt thread could sleep if it took the spinlock.

Did I miss anything? Do you have an exact path where the vGIC is actually
called from interrupt context?


The part you're missing is that percpu interrupts are not force
threaded:

static int irq_setup_forced_threading(struct irqaction *new)
{
if (!force_irqthreads)
return 0;
if (new->flags & (IRQF_NO_THREAD | IRQF_PERCPU | IRQF_ONESHOT))
return 0;

/* ...*/
}


Thank you for the pointer! I think it would be worth mentioning in the commit 
message that per-CPU interrupts are not force-threaded.


Best regards,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 1/3] KVM: arm/arm64: vgic: Make vgic_irq->irq_lock a raw_spinlock

2019-02-01 Thread Julien Grall
7 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)
struct kvm_vcpu *target_vcpu, *vcpuA, *vcpuB;
bool target_vcpu_needs_kick = false;
  
-		spin_lock(&irq->irq_lock);

+		raw_spin_lock(&irq->irq_lock);
  
  		BUG_ON(vcpu != irq->vcpu);
  
@@ -616,7 +616,7 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)

 */
		list_del(&irq->ap_list);
irq->vcpu = NULL;
-			spin_unlock(&irq->irq_lock);
+			raw_spin_unlock(&irq->irq_lock);
  
  			/*

 * This vgic_put_irq call matches the
@@ -631,13 +631,13 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)
  
  		if (target_vcpu == vcpu) {

/* We're on the right CPU */
-			spin_unlock(&irq->irq_lock);
+			raw_spin_unlock(&irq->irq_lock);
continue;
}
  
  		/* This interrupt looks like it has to be migrated. */
  
-		spin_unlock(&irq->irq_lock);

+		raw_spin_unlock(&irq->irq_lock);
		spin_unlock(&vgic_cpu->ap_list_lock);
  
  		/*

@@ -655,7 +655,7 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)
		spin_lock(&vcpuA->arch.vgic_cpu.ap_list_lock);
		spin_lock_nested(&vcpuB->arch.vgic_cpu.ap_list_lock,
				 SINGLE_DEPTH_NESTING);
-		spin_lock(&irq->irq_lock);
+		raw_spin_lock(&irq->irq_lock);
  
  		/*

 * If the affinity has been preserved, move the
@@ -675,7 +675,7 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)
target_vcpu_needs_kick = true;
}
  
-		spin_unlock(&irq->irq_lock);

+		raw_spin_unlock(&irq->irq_lock);
		spin_unlock(&vcpuB->arch.vgic_cpu.ap_list_lock);
		spin_unlock(&vcpuA->arch.vgic_cpu.ap_list_lock);
  
@@ -741,10 +741,10 @@ static int compute_ap_list_depth(struct kvm_vcpu *vcpu,

	list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
int w;
  
-		spin_lock(&irq->irq_lock);

+		raw_spin_lock(&irq->irq_lock);
/* GICv2 SGIs can count for more than one... */
w = vgic_irq_get_lr_count(irq);
-		spin_unlock(&irq->irq_lock);
+		raw_spin_unlock(&irq->irq_lock);
  
  		count += w;

*multi_sgi |= (w > 1);
@@ -770,7 +770,7 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
count = 0;
  
	list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {

-		spin_lock(&irq->irq_lock);
+		raw_spin_lock(&irq->irq_lock);
  
  		/*

 * If we have multi-SGIs in the pipeline, we need to
@@ -780,7 +780,7 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
 * the AP list has been sorted already.
 */
if (multi_sgi && irq->priority > prio) {
-			spin_unlock(&irq->irq_lock);
+			_raw_spin_unlock(&irq->irq_lock);
break;
}
  
@@ -791,7 +791,7 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)

prio = irq->priority;
}
  
-		spin_unlock(&irq->irq_lock);

+		raw_spin_unlock(&irq->irq_lock);
  
  		if (count == kvm_vgic_global_state.nr_lr) {

			if (!list_is_last(&irq->ap_list,
@@ -921,11 +921,11 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
	spin_lock_irqsave(&vgic_cpu->ap_list_lock, flags);
  
	list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {

-		spin_lock(&irq->irq_lock);
+		raw_spin_lock(&irq->irq_lock);
pending = irq_is_pending(irq) && irq->enabled &&
  !irq->active &&
  irq->priority < vmcr.pmr;
-		spin_unlock(&irq->irq_lock);
+		raw_spin_unlock(&irq->irq_lock);
  
  		if (pending)

break;
@@ -963,11 +963,10 @@ bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, 
unsigned int vintid)
return false;
  
  	irq = vgic_get_irq(vcpu->kvm, vcpu, vintid);

-	spin_lock_irqsave(&irq->irq_lock, flags);
+	raw_spin_lock_irqsave(&irq->irq_lock, flags);
map_is_active = irq->hw && irq->active;
-	spin_unlock_irqrestore(&irq->irq_lock, flags);
+	raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
vgic_put_irq(vcpu->kvm, irq);
  
  	return map_is_active;

  }
-



--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 4/4] arm64: KVM: Implement workaround for Cortex-A76 erratum 1165522

2018-11-21 Thread Julien Grall

Hi Marc,

On 05/11/2018 14:36, Marc Zyngier wrote:

Early versions of Cortex-A76 can end-up with corrupt TLBs if they
speculate an AT instruction during a guest switch while the
S1/S2 system registers are in an inconsistent state.

Work around it by:
- Mandating VHE
- Make sure that S1 and S2 system registers are consistent before
   clearing HCR_EL2.TGE, which allows AT to target the EL1 translation
   regime

These two things together ensure that we cannot hit this erratum.

Signed-off-by: Marc Zyngier 
---
  Documentation/arm64/silicon-errata.txt |  1 +
  arch/arm64/Kconfig | 12 
  arch/arm64/include/asm/cpucaps.h   |  3 ++-
  arch/arm64/include/asm/kvm_host.h  |  3 +++
  arch/arm64/include/asm/kvm_hyp.h   |  6 ++
  arch/arm64/kernel/cpu_errata.c |  8 
  arch/arm64/kvm/hyp/switch.c| 14 ++
  7 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/Documentation/arm64/silicon-errata.txt 
b/Documentation/arm64/silicon-errata.txt
index 76ccded8b74c..04f0bc4690c6 100644
--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -57,6 +57,7 @@ stable kernels.
  | ARM| Cortex-A73  | #858921 | ARM64_ERRATUM_858921   
 |
  | ARM| Cortex-A55  | #1024718| ARM64_ERRATUM_1024718  
 |
  | ARM| Cortex-A76  | #1188873| ARM64_ERRATUM_1188873  
 |
+| ARM| Cortex-A76  | #1165522| ARM64_ERRATUM_1165522   
|
  | ARM| MMU-500 | #841119,#826419 | N/A
 |
  || | |
 |
  | Cavium | ThunderX ITS| #22375, #24313  | CAVIUM_ERRATUM_22375   
 |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 787d7850e064..a68bc6cc2167 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -497,6 +497,18 @@ config ARM64_ERRATUM_1188873
  
  	  If unsure, say Y.
  
+config ARM64_ERRATUM_1165522

+   bool "Cortex-A76: Speculative AT instruction using out-of-context 
translation regime could cause subsequent request to generate an incorrect 
translation"
+   default y
+   help
+ This option adds work arounds for ARM Cortex-A76 erratum 1165522
+
+ Affected Cortex-A76 cores (r0p0, r1p0, r2p0) could end-up with
+ corrupted TLBs by speculating an AT instruction during a guest
+ context switch.
+
+ If unsure, say Y.


Most of the code in the patch is not guarded by #ifdef ARM64_*. So is there any 
benefit in adding a Kconfig option for this?


Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvmtool test PATCH 22/24] kvmtool: arm64: Add support for guest physical address size

2018-07-05 Thread Julien Grall

Hi Will,

On 04/07/18 16:52, Will Deacon wrote:

On Wed, Jul 04, 2018 at 04:00:11PM +0100, Julien Grall wrote:

On 04/07/18 15:09, Will Deacon wrote:

On Fri, Jun 29, 2018 at 12:15:42PM +0100, Suzuki K Poulose wrote:

Add an option to specify the physical address size used by this
VM.

Signed-off-by: Suzuki K Poulose 
---
  arm/aarch64/include/kvm/kvm-config-arch.h | 5 -
  arm/include/arm-common/kvm-config-arch.h  | 1 +
  2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h 
b/arm/aarch64/include/kvm/kvm-config-arch.h
index 04be43d..dabd22c 100644
--- a/arm/aarch64/include/kvm/kvm-config-arch.h
+++ b/arm/aarch64/include/kvm/kvm-config-arch.h
@@ -8,7 +8,10 @@
"Create PMUv3 device"),   \
OPT_U64('\0', "kaslr-seed", &(cfg)->kaslr_seed,\
"Specify random seed for Kernel Address Space "   \
-   "Layout Randomization (KASLR)"),
+   "Layout Randomization (KASLR)"),  \
+   OPT_INTEGER('\0', "phys-shift", &(cfg)->phys_shift,\
+   "Specify maximum physical address size (not " \
+   "the amount of memory)"),


Given that this is a shift value, I think the help message could be more
informative. Something like:

"Specify maximum number of bits in a guest physical address"

I think I'd actually leave out any mention of memory, because this does
actually have an effect on the amount of addressable memory in a way that I
don't think we want to describe in half of a usage message line :)

Is there any particular reason to expose this option to the user?

I have recently sent a series to allow the user to specify the position
of the RAM [1]. With that series in mind, I think the user would not really
need to specify the maximum physical shift. Instead we could find it
automatically.


Marc makes a good point that it doesn't help for MMIO regions, so I'm trying
to understand whether we can do something differently there and avoid
sacrificing the type parameter.


I am not sure I understand this. kvmtool knows the guest's memory layout 
(including the MMIO regions), so couldn't it work out the maximum 
physical shift from that?
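
Something along these lines is what I have in mind (rough sketch; the function 
name and the 40/52-bit bounds are assumptions, not existing kvmtool code):

/*
 * Derive the guest's required IPA size from the highest guest physical
 * address kvmtool will map (RAM or MMIO).
 */
static unsigned int guess_max_phys_shift(u64 highest_gpa)
{
	unsigned int shift = 40;	/* today's fixed IPA size */

	while (shift < 52 && (highest_gpa >> shift))
		shift++;

	return shift;
}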


Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvmtool test PATCH 22/24] kvmtool: arm64: Add support for guest physical address size

2018-07-04 Thread Julien Grall

Hi,

On 04/07/18 15:09, Will Deacon wrote:

On Fri, Jun 29, 2018 at 12:15:42PM +0100, Suzuki K Poulose wrote:

Add an option to specify the physical address size used by this
VM.

Signed-off-by: Suzuki K Poulose 
---
  arm/aarch64/include/kvm/kvm-config-arch.h | 5 -
  arm/include/arm-common/kvm-config-arch.h  | 1 +
  2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h 
b/arm/aarch64/include/kvm/kvm-config-arch.h
index 04be43d..dabd22c 100644
--- a/arm/aarch64/include/kvm/kvm-config-arch.h
+++ b/arm/aarch64/include/kvm/kvm-config-arch.h
@@ -8,7 +8,10 @@
"Create PMUv3 device"),   \
OPT_U64('\0', "kaslr-seed", &(cfg)->kaslr_seed,\
"Specify random seed for Kernel Address Space "   \
-   "Layout Randomization (KASLR)"),
+   "Layout Randomization (KASLR)"),  \
+   OPT_INTEGER('\0', "phys-shift", &(cfg)->phys_shift,\
+   "Specify maximum physical address size (not " \
+   "the amount of memory)"),


Given that this is a shift value, I think the help message could be more
informative. Something like:

"Specify maximum number of bits in a guest physical address"

I think I'd actually leave out any mention of memory, because this does
actually have an effect on the amount of addressable memory in a way that I
don't think we want to describe in half of a usage message line :)

Is there any particular reason to expose this option to the user?

I have recently sent a series to allow the user to specify the position
of the RAM [1]. With that series in mind, I think the user would not 
really need to specify the maximum physical shift. Instead we could 
find it automatically.


Cheers,

[1] 
http://archive.armlinux.org.uk/lurker/message/20180510.140428.1c295b5b.en.html




Will



--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 08/14] arm64: ssbd: Disable mitigation on CPU resume if required by user

2018-05-23 Thread Julien Grall

Hi,

On 05/22/2018 04:06 PM, Marc Zyngier wrote:

On a system where firmware can dynamically change the state of the
mitigation, the CPU will always come up with the mitigation enabled,
including when coming back from suspend.

If the user has requested "no mitigation" via a command line option,
let's enforce it by calling into the firmware again to disable it.

Signed-off-by: Marc Zyngier <marc.zyng...@arm.com>


Reviewed-by: Julien Grall <julien.gr...@arm.com>

Cheers,


---
  arch/arm64/include/asm/cpufeature.h | 6 ++
  arch/arm64/kernel/cpu_errata.c  | 8 
  arch/arm64/kernel/suspend.c | 8 
  3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/cpufeature.h 
b/arch/arm64/include/asm/cpufeature.h
index 1bacdf57f0af..d9dcb683259e 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -553,6 +553,12 @@ static inline int arm64_get_ssbd_state(void)
  #endif
  }
  
+#ifdef CONFIG_ARM64_SSBD

+void arm64_set_ssbd_mitigation(bool state);
+#else
+static inline void arm64_set_ssbd_mitigation(bool state) {}
+#endif
+
  #endif /* __ASSEMBLY__ */
  
  #endif

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 8f686f39b9c1..b4c12e9140f0 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -297,7 +297,7 @@ void __init arm64_enable_wa2_handling(struct alt_instr *alt,
*updptr = cpu_to_le32(aarch64_insn_gen_nop());
  }
  
-static void do_ssbd(bool state)

+void arm64_set_ssbd_mitigation(bool state)
  {
switch (psci_ops.conduit) {
case PSCI_CONDUIT_HVC:
@@ -371,20 +371,20 @@ static bool has_ssbd_mitigation(const struct 
arm64_cpu_capabilities *entry,
switch (ssbd_state) {
case ARM64_SSBD_FORCE_DISABLE:
pr_info_once("%s disabled from command-line\n", entry->desc);
-   do_ssbd(false);
+   arm64_set_ssbd_mitigation(false);
required = false;
break;
  
  	case ARM64_SSBD_EL1_ENTRY:

if (required) {
__this_cpu_write(arm64_ssbd_callback_required, 1);
-   do_ssbd(true);
+   arm64_set_ssbd_mitigation(true);
}
break;
  
  	case ARM64_SSBD_FORCE_ENABLE:

pr_info_once("%s forced from command-line\n", entry->desc);
-   do_ssbd(true);
+   arm64_set_ssbd_mitigation(true);
required = true;
break;
  
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c

index a307b9e13392..70c283368b64 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -62,6 +62,14 @@ void notrace __cpu_suspend_exit(void)
 */
if (hw_breakpoint_restore)
hw_breakpoint_restore(cpu);
+
+   /*
+* On resume, firmware implementing dynamic mitigation will
+* have turned the mitigation on. If the user has forcefully
+* disabled it, make sure their wishes are obeyed.
+*/
+   if (arm64_get_ssbd_state() == ARM64_SSBD_FORCE_DISABLE)
+   arm64_set_ssbd_mitigation(false);
  }
  
  /*




--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 07/14] arm64: ssbd: Skip apply_ssbd if not using dynamic mitigation

2018-05-23 Thread Julien Grall

Hi Marc,

On 05/22/2018 04:06 PM, Marc Zyngier wrote:

In order to avoid checking arm64_ssbd_callback_required on each
kernel entry/exit even if no mitigation is required, let's
add yet another alternative that by default jumps over the mitigation,
and that gets nop'ed out if we're doing dynamic mitigation.

Think of it as a poor man's static key...

Signed-off-by: Marc Zyngier <marc.zyng...@arm.com>


Reviewed-by: Julien Grall <julien.gr...@arm.com>

Cheers,



---
  arch/arm64/kernel/cpu_errata.c | 14 ++
  arch/arm64/kernel/entry.S  |  3 +++
  2 files changed, 17 insertions(+)

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index f1d4e75b0ddd..8f686f39b9c1 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -283,6 +283,20 @@ void __init arm64_update_smccc_conduit(struct alt_instr 
*alt,
*updptr = cpu_to_le32(insn);
  }
  
+void __init arm64_enable_wa2_handling(struct alt_instr *alt,

+ __le32 *origptr, __le32 *updptr,
+ int nr_inst)
+{
+   BUG_ON(nr_inst != 1);
+   /*
+* Only allow mitigation on EL1 entry/exit and guest
+* ARCH_WORKAROUND_2 handling if the SSBD state allows it to
+* be flipped.
+*/
+   if (arm64_get_ssbd_state() == ARM64_SSBD_EL1_ENTRY)
+   *updptr = cpu_to_le32(aarch64_insn_gen_nop());
+}
+
  static void do_ssbd(bool state)
  {
switch (psci_ops.conduit) {
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 29ad672a6abd..e6f6e2339b22 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -142,6 +142,9 @@ alternative_else_nop_endif
// to save/restore them if required.
.macro  apply_ssbd, state, targ, tmp1, tmp2
  #ifdef CONFIG_ARM64_SSBD
+alternative_cb arm64_enable_wa2_handling
+   b   \targ
+alternative_cb_end
ldr_this_cpu\tmp2, arm64_ssbd_callback_required, \tmp1
cbz \tmp2, \targ
mov w0, #ARM_SMCCC_ARCH_WORKAROUND_2



--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 06/14] arm64: ssbd: Add global mitigation state accessor

2018-05-23 Thread Julien Grall

Hi Marc,

On 05/22/2018 04:06 PM, Marc Zyngier wrote:

We're about to need the mitigation state in various parts of the
kernel in order to do the right thing for userspace and guests.

Let's expose an accessor that will let other subsystems know
about the state.

Signed-off-by: Marc Zyngier <marc.zyng...@arm.com>


Reviewed-by: Julien Grall <julien.gr...@arm.com>

Cheers,


---
  arch/arm64/include/asm/cpufeature.h | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/arch/arm64/include/asm/cpufeature.h 
b/arch/arm64/include/asm/cpufeature.h
index 9bc548e22784..1bacdf57f0af 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -543,6 +543,16 @@ static inline u64 read_zcr_features(void)
  #define ARM64_SSBD_FORCE_ENABLE   2
  #define ARM64_SSBD_MITIGATED  3
  
+static inline int arm64_get_ssbd_state(void)

+{
+#ifdef CONFIG_ARM64_SSBD
+   extern int ssbd_state;
+   return ssbd_state;
+#else
+   return ARM64_SSBD_UNKNOWN;
+#endif
+}
+
  #endif /* __ASSEMBLY__ */
  
  #endif




--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 05/14] arm64: Add 'ssbd' command-line option

2018-05-23 Thread Julien Grall

Hi Marc,

On 05/22/2018 04:06 PM, Marc Zyngier wrote:

On a system where the firmware implements ARCH_WORKAROUND_2,
it may be useful to either permanently enable or disable the
workaround for cases where the user decides that they'd rather
not get a trap overhead, and keep the mitigation permanently
on or off instead of switching it on exception entry/exit.

In any case, default to the mitigation being enabled.

Signed-off-by: Marc Zyngier <marc.zyng...@arm.com>


Reviewed-by: Julien Grall <julien.gr...@arm.com>

Cheers,


---
  Documentation/admin-guide/kernel-parameters.txt |  17 
  arch/arm64/include/asm/cpufeature.h |   6 ++
  arch/arm64/kernel/cpu_errata.c  | 102 
  3 files changed, 109 insertions(+), 16 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index f2040d46f095..646e112c6f63 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4092,6 +4092,23 @@
expediting.  Set to zero to disable automatic
expediting.
  
+	ssbd=		[ARM64,HW]

+   Speculative Store Bypass Disable control
+
+   On CPUs that are vulnerable to the Speculative
+   Store Bypass vulnerability and offer a
+   firmware based mitigation, this parameter
+   indicates how the mitigation should be used:
+
+   force-on:  Unconditionnaly enable mitigation for
+  for both kernel and userspace
+   force-off: Unconditionnaly disable mitigation for
+  for both kernel and userspace
+   kernel:Always enable mitigation in the
+  kernel, and offer a prctl interface
+  to allow userspace to register its
+  interest in being mitigated too.
+
stack_guard_gap=[MM]
override the default stack gap protection. The value
is in page units and it defines how many pages prior
diff --git a/arch/arm64/include/asm/cpufeature.h 
b/arch/arm64/include/asm/cpufeature.h
index 09b0f2a80c8f..9bc548e22784 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -537,6 +537,12 @@ static inline u64 read_zcr_features(void)
return zcr;
  }
  
+#define ARM64_SSBD_UNKNOWN		-1

+#define ARM64_SSBD_FORCE_DISABLE   0
+#define ARM64_SSBD_EL1_ENTRY   1
+#define ARM64_SSBD_FORCE_ENABLE2
+#define ARM64_SSBD_MITIGATED   3
+
  #endif /* __ASSEMBLY__ */
  
  #endif

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 7fd6d5b001f5..f1d4e75b0ddd 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -235,6 +235,38 @@ enable_smccc_arch_workaround_1(const struct 
arm64_cpu_capabilities *entry)
  #ifdef CONFIG_ARM64_SSBD
  DEFINE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);
  
+int ssbd_state __read_mostly = ARM64_SSBD_EL1_ENTRY;

+
+static const struct ssbd_options {
+   const char  *str;
+   int state;
+} ssbd_options[] = {
+   { "force-on", ARM64_SSBD_FORCE_ENABLE, },
+   { "force-off",ARM64_SSBD_FORCE_DISABLE, },
+   { "kernel",   ARM64_SSBD_EL1_ENTRY, },
+};
+
+static int __init ssbd_cfg(char *buf)
+{
+   int i;
+
+   if (!buf || !buf[0])
+   return -EINVAL;
+
+   for (i = 0; i < ARRAY_SIZE(ssbd_options); i++) {
+   int len = strlen(ssbd_options[i].str);
+
+   if (strncmp(buf, ssbd_options[i].str, len))
+   continue;
+
+   ssbd_state = ssbd_options[i].state;
+   return 0;
+   }
+
+   return -EINVAL;
+}
+early_param("ssbd", ssbd_cfg);
+
  void __init arm64_update_smccc_conduit(struct alt_instr *alt,
   __le32 *origptr, __le32 *updptr,
   int nr_inst)
@@ -272,44 +304,82 @@ static bool has_ssbd_mitigation(const struct 
arm64_cpu_capabilities *entry,
int scope)
  {
struct arm_smccc_res res;
-   bool supported = true;
+   bool required = true;
+   s32 val;
  
  	WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
  
-	if (psci_ops.smccc_version == SMCCC_VERSION_1_0)

+   if (psci_ops.smccc_version == SMCCC_VERSION_1_0) {
+   ssbd_state = ARM64_SSBD_UNKNOWN;
return false;
+   }
  
-	/*

-* The probe function return value is either negative
-* (unsupported or mitigated), positive (unaffected), or zero
-* (requires mitigation). 

Re: [PATCH 04/14] arm64: Add ARCH_WORKAROUND_2 probing

2018-05-23 Thread Julien Grall

Hi Marc,

On 05/22/2018 04:06 PM, Marc Zyngier wrote:

As for Spectre variant-2, we rely on SMCCC 1.1 to provide the
discovery mechanism for detecting the SSBD mitigation.

A new capability is also allocated for that purpose, and a
config option.

Signed-off-by: Marc Zyngier <marc.zyng...@arm.com>
---
  arch/arm64/Kconfig   |  9 ++
  arch/arm64/include/asm/cpucaps.h |  3 +-
  arch/arm64/kernel/cpu_errata.c   | 69 
  3 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index eb2cf4938f6d..b2103b4df467 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -938,6 +938,15 @@ config HARDEN_EL2_VECTORS
  
  	  If unsure, say Y.
  
+config ARM64_SSBD

+   bool "Speculative Store Bypass Disable" if EXPERT
+   default y
+   help
+ This enables mitigation of the bypassing of previous stores
+ by speculative loads.
+
+ If unsure, say Y.
+
  menuconfig ARMV8_DEPRECATED
bool "Emulate deprecated/obsolete ARMv8 instructions"
depends on COMPAT
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index bc51b72fafd4..5b2facf786ba 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -48,7 +48,8 @@
  #define ARM64_HAS_CACHE_IDC   27
  #define ARM64_HAS_CACHE_DIC   28
  #define ARM64_HW_DBM  29
+#define ARM64_SSBD 30


NIT: Could you indent 30 the same way as the other numbers?

Reviewed-by: Julien Grall <julien.gr...@arm.com>

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 03/14] arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2

2018-05-23 Thread Julien Grall

Hi Marc,

On 05/22/2018 04:06 PM, Marc Zyngier wrote:

In a heterogeneous system, we can end up with both affected and
unaffected CPUs. Let's check their status before calling into the
firmware.

Signed-off-by: Marc Zyngier <marc.zyng...@arm.com>


Reviewed-by: Julien Grall <julien.gr...@arm.com>

Cheers,


---
  arch/arm64/kernel/cpu_errata.c |  2 ++
  arch/arm64/kernel/entry.S  | 11 +++
  2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 46b3aafb631a..0288d6cf560e 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -233,6 +233,8 @@ enable_smccc_arch_workaround_1(const struct 
arm64_cpu_capabilities *entry)
  #endif/* CONFIG_HARDEN_BRANCH_PREDICTOR */
  
  #ifdef CONFIG_ARM64_SSBD

+DEFINE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);
+
  void __init arm64_update_smccc_conduit(struct alt_instr *alt,
   __le32 *origptr, __le32 *updptr,
   int nr_inst)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index f33e6aed3037..29ad672a6abd 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -140,8 +140,10 @@ alternative_else_nop_endif
  
  	// This macro corrupts x0-x3. It is the caller's duty

// to save/restore them if required.
-   .macro  apply_ssbd, state
+   .macro  apply_ssbd, state, targ, tmp1, tmp2
  #ifdef CONFIG_ARM64_SSBD
+   ldr_this_cpu\tmp2, arm64_ssbd_callback_required, \tmp1
+   cbz \tmp2, \targ
mov w0, #ARM_SMCCC_ARCH_WORKAROUND_2
mov w1, #\state
  alternative_cbarm64_update_smccc_conduit
@@ -176,12 +178,13 @@ alternative_cb_end
ldr x19, [tsk, #TSK_TI_FLAGS]   // since we can unmask debug
disable_step_tsk x19, x20   // exceptions when scheduling.
  
-	apply_ssbd 1

+   apply_ssbd 1, 1f, x22, x23
  
  #ifdef CONFIG_ARM64_SSBD

ldp x0, x1, [sp, #16 * 0]
ldp x2, x3, [sp, #16 * 1]
  #endif
+1:
  
  	mov	x29, xzr			// fp pointed to user-space

.else
@@ -323,8 +326,8 @@ alternative_if ARM64_WORKAROUND_845719
  alternative_else_nop_endif
  #endif
  3:
-   apply_ssbd 0
-
+   apply_ssbd 0, 5f, x0, x1
+5:
.endif
  
  	msr	elr_el1, x21			// set up the return data




--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 02/14] arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1

2018-05-23 Thread Julien Grall

Hi Marc,

On 05/22/2018 04:06 PM, Marc Zyngier wrote:

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index ec2ee720e33e..f33e6aed3037 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -18,6 +18,7 @@
   * along with this program.  If not, see <http://www.gnu.org/licenses/>.
   */
  
+#include <linux/arm-smccc.h>

 #include <linux/init.h>
 #include <linux/linkage.h>
  
@@ -137,6 +138,18 @@ alternative_else_nop_endif

add \dst, \dst, #(\sym - .entry.tramp.text)
.endm
  
+	// This macro corrupts x0-x3. It is the caller's duty

+   // to save/restore them if required.


NIT: Shouldn't you use /* ... */ for multi-line comments?

Regardless that:

Reviewed-by: Julien Grall <julien.gr...@arm.com>

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [kvmtool PATCH 21/17] kvmtool: arm: Add support for creating VM with PA size

2018-04-30 Thread Julien Grall

Hi,

On 27/03/18 14:15, Suzuki K Poulose wrote:

diff --git a/arm/kvm.c b/arm/kvm.c
index 5701d41..a9a9140 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -11,6 +11,8 @@
  #include 
  #include 
  
+unsigned long kvm_arm_type;

+
  struct kvm_ext kvm_req_ext[] = {
{ DEFINE_KVM_EXT(KVM_CAP_IRQCHIP) },
{ DEFINE_KVM_EXT(KVM_CAP_ONE_REG) },
@@ -18,6 +20,25 @@ struct kvm_ext kvm_req_ext[] = {
{ 0, 0 },
  };
  
+#ifndef KVM_ARM_GET_MAX_VM_PHYS_SHIFT

+#define KVM_ARM_GET_MAX_VM_PHYS_SHIFT  _IO(KVMIO, 0x0a)
+#endif
+
+void kvm__arch_init_hyp(struct kvm *kvm)
+{
+   unsigned max_ipa;
+
+   max_ipa = ioctl(kvm->sys_fd, KVM_ARM_GET_MAX_VM_PHYS_SHIFT);
+   if (max_ipa < 0)


Another issue spotted while doing some testing: the check will always be 
false because max_ipa is unsigned.


I think we want to make max_ipa signed.
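
Something like this (sketch only; the error handling is illustrative):

void kvm__arch_init_hyp(struct kvm *kvm)
{
	/* Signed, so that a negative ioctl() return value can be detected. */
	int max_ipa;

	max_ipa = ioctl(kvm->sys_fd, KVM_ARM_GET_MAX_VM_PHYS_SHIFT);
	if (max_ipa < 0)
		die("Failed to get the maximum physical address size");

	/* ... rest of the function unchanged ... */
}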

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v2 08/17] kvm: arm/arm64: Prepare for VM specific stage2 translations

2018-04-27 Thread Julien Grall

Hi Suzuki,

On 27/04/18 16:58, Suzuki K Poulose wrote:

On 27/04/18 16:22, Suzuki K Poulose wrote:

On 26/04/18 14:35, Julien Grall wrote:

Hi Suzuki,

On 27/03/18 14:15, Suzuki K Poulose wrote:

Right now the stage2 page table for a VM is hard coded, assuming
an IPA of 40bits. As we are about to add support for per VM IPA,
prepare the stage2 page table helpers to accept the kvm instance
to make the right decision for the VM. No functional changes.



diff --git a/arch/arm/include/asm/kvm_arm.h 
b/arch/arm/include/asm/kvm_arm.h

index 3ab8b37..c3f1f9b 100644
--- a/arch/arm/include/asm/kvm_arm.h
+++ b/arch/arm/include/asm/kvm_arm.h
@@ -133,8 +133,7 @@
   * space.
   */
  #define KVM_PHYS_SHIFT    (40)
-#define KVM_PHYS_SIZE    (_AC(1, ULL) << KVM_PHYS_SHIFT)
-#define KVM_PHYS_MASK    (KVM_PHYS_SIZE - _AC(1, ULL))


I assume you are moving them to kvm_mmu.h in order to match the arm64 
side, right? If so, would it not make sense to move KVM_PHYS_SHIFT 
with them?


[...]


I am moving all the macros that depend on the "kvm" instance to 
kvm_mmu.h.

I will see if I can move the KVM_PHYS_SHIFT without much trouble.


It looks like we can't do that easily. KVM_PHYS_SHIFT is used for KVM_T0SZ
on arm, even though that can be simply hard coded to avoid the 
dependency on
KVM_PHYS_SHIFT (like we did for arm64, T0SZ is defined to 24). I would 
leave it

as it is to avoid the noise.


Fine. That was only a suggestion :).

Cheers,

--
Julien Grall
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

