Re: [PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-04-22 Thread Stefano Stabellini
On Wed, 21 Apr 2021, Julien Grall wrote:
> Hi Stefano,
> 
> On 21/04/2021 01:38, Stefano Stabellini wrote:
> > On Tue, 20 Apr 2021, Julien Grall wrote:
> > > AFAICT, there is nothing in the implementation of
> > > XEN_DOMCTL_getvcpucontext
> > > that justify the extra barrier (assuming we consider vcpu_pause() as a
> > > full
> > > memory barrier).
> > > 
> > >  From your e-mail, I also could not infer what is your exact concern
> > > regarding
> > > the memory ordering. If you have something in mind, then would you mind to
> > > provide a sketch showing what could go wrong?
> > 
> > Let me start with what I had in mind:
> > 
> > initialized = v->is_initialized;
> > smp_rmb();
> > if ( !initialized )
> >goto getvcpucontext_out;
> > 
> > 
> > Without smp_rmb there are no guarantees that we see an up-to-date value
> > of v->is_initialized, right?
> 
> No. The smp_*mb() barriers only guarantee an ordering, it doesn't say anything
> about when the values will be seen. The write may in fact still be in the
> write buffer of the other CPU.
> 
> > This READ of v->is_initialized is supposed
> > to pair with the WRITE in arch_set_info_guest.
> 
> I am assuming you meant the access and not the barrier, right? Regardless, the
> barriers used, the access will happen in any order. IOW the following two
> ordering is possible:
> 
> Sequence 1:
> 
> CPU0: v->vcpu_initialized = 1
> CPU1: read v->vcpu_initialized
> 
> Sequence 2:
> 
> CPU1: read v->vcpu_initialized
> CPU0: v->vcpu_initialized = 0
> 
> In the first sequence, CPU1 will observe 0 and therefore bailout. In the
> second sequence, we will be observing 1 and continue.
> 
> > 
> > Relying on the the barrier in vcpu_pause is OK only if is_initialised
> > was already read as "1".
> > 
> > I think it might be OK in this case, because probably nothing wrong
> > happens if we read the old value of v->is_initialized as "0" but it
> > doesn't seem to be a good idea.
> 
> The smp_rmb() is not going to give you that guarantee. If you need it, then
> you mostly likely want to use a synchronization primitives such as
> spin_lock().
> 
> > Theoretically it could pass a very long
> > time before we see v->is_initialized as "1" without the smp_rmb.
> 
> I have the impression that there might be confusion about what I am aiming to
> achieve.
> 
> The code today looks roughly like:
> 
> 1) if ( v->is_initialized )
> 2)   return;
> 3) vcpu_pause();
> 4) /* Access registers */
> 
> My aim is to ensure that a processor will not READ any value for 4) are not
> happening before 1). If this happens, then we may provide garbagge to the
> domctl.
> 
> Returning an error is a lot better because the caller of the domctl will know
> the vCPU is not yet setup and can provide the infrastructure to try again.
> 
> From this PoV, your proposal and my two proposals are functionally equivalent.
> The main differences is whether there is an extra barrier and how easy is it
> to figure out the ordering.
> 
> Speaking with Ash, your proposal is probably the easiest one to understand.
> However, it also adds a barrier for the failure path (which is pointless).
> 
> I would like to avoid barriers when they are not necessary. Hence why I prefer
> the vcpu_pause() solution (we may want to add a comment).
> 
> Although, I could live with your proposal if the others are happy with it
> (Andrew? Jan?) because this is not an hot path.

I prefer my proposal because it is easier to understand its correctness
and, like you wrote, it is OK to have one more barrier than strictly
necessary in this case. But I wouldn't argue over this.



Re: [PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-04-21 Thread Julien Grall

Hi Stefano,

On 21/04/2021 01:38, Stefano Stabellini wrote:

On Tue, 20 Apr 2021, Julien Grall wrote:

AFAICT, there is nothing in the implementation of XEN_DOMCTL_getvcpucontext
that justify the extra barrier (assuming we consider vcpu_pause() as a full
memory barrier).

 From your e-mail, I also could not infer what is your exact concern regarding
the memory ordering. If you have something in mind, then would you mind to
provide a sketch showing what could go wrong?


Let me start with what I had in mind:

initialized = v->is_initialized;
smp_rmb();
if ( !initialized )
   goto getvcpucontext_out;


Without smp_rmb there are no guarantees that we see an up-to-date value
of v->is_initialized, right?


No. The smp_*mb() barriers only guarantee an ordering, it doesn't say 
anything about when the values will be seen. The write may in fact still 
be in the write buffer of the other CPU.



This READ of v->is_initialized is supposed
to pair with the WRITE in arch_set_info_guest.


I am assuming you meant the access and not the barrier, right? 
Regardless, the barriers used, the access will happen in any order. IOW 
the following two ordering is possible:


Sequence 1:

CPU0: v->vcpu_initialized = 1
CPU1: read v->vcpu_initialized

Sequence 2:

CPU1: read v->vcpu_initialized
CPU0: v->vcpu_initialized = 0

In the first sequence, CPU1 will observe 0 and therefore bailout. In the 
second sequence, we will be observing 1 and continue.




Relying on the the barrier in vcpu_pause is OK only if is_initialised
was already read as "1".

I think it might be OK in this case, because probably nothing wrong
happens if we read the old value of v->is_initialized as "0" but it
doesn't seem to be a good idea.


The smp_rmb() is not going to give you that guarantee. If you need it, 
then you mostly likely want to use a synchronization primitives such as 
spin_lock().



Theoretically it could pass a very long
time before we see v->is_initialized as "1" without the smp_rmb.


I have the impression that there might be confusion about what I am 
aiming to achieve.


The code today looks roughly like:

1) if ( v->is_initialized )
2)   return;
3) vcpu_pause();
4) /* Access registers */

My aim is to ensure that a processor will not READ any value for 4) are 
not happening before 1). If this happens, then we may provide garbagge 
to the domctl.


Returning an error is a lot better because the caller of the domctl will 
know the vCPU is not yet setup and can provide the infrastructure to try 
again.


From this PoV, your proposal and my two proposals are functionally 
equivalent. The main differences is whether there is an extra barrier 
and how easy is it to figure out the ordering.


Speaking with Ash, your proposal is probably the easiest one to 
understand. However, it also adds a barrier for the failure path (which 
is pointless).


I would like to avoid barriers when they are not necessary. Hence why I 
prefer the vcpu_pause() solution (we may want to add a comment).


Although, I could live with your proposal if the others are happy with 
it (Andrew? Jan?) because this is not an hot path.


Cheers,

--
Julien Grall



Re: [PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-04-20 Thread Stefano Stabellini
On Tue, 20 Apr 2021, Julien Grall wrote:
> (+ Andrew and Jan)
> 
> Hi Stefano,
> 
> On 20/04/2021 20:38, Stefano Stabellini wrote:
> > On Fri, 16 Apr 2021, Julien Grall wrote:
> > > > I think your explanation is correct. However, don't we need a smp_rmb()
> > > > barrier after reading v->is_initialised in xen/common/domctl.c:do_domctl
> > > > ? That would be the barrier that pairs with smp_wmb in regards to
> > > > v->is_initialised.
> > > 
> > > There is already a smp_mb() in sync_vcpu_exec_state() which is called from
> > > vcpu_pause() -> vcpu_sleep_sync().
> > 
> > But that's too late, isn't? The v->is_initialized check is done before
> > the vcpu_pause() call. We might end up taking the wrong code path:
> > 
> > https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/domctl.c#L587
> > https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/domctl.c#L598
> 
> I am a bit confused what you mean by "wrong path" here. There is no timing
> guarantee with a memory barrier. What the barrier will guarantee you is
> v->is_initialized will be read *before* anything after the barrier.
> 
> Are you worried that some variables in vcpu_pause() may be read before
> v->is_initialized?
> 
> > 
> > > I don't think we can ever remove the memory barrier in
> > > sync_vcpu_exec_state()
> > > because the vCPU paused may have run (or initialized) on a different pCPU.
> > > So
> > > I would like to rely on the barrier rather than adding an extra one (even
> > > thought it is not a fast path).
> > > 
> > > I am thinking to add a comment on top of vcpu_pause() to clarify that
> > > after
> > > the call, the vCPU context will be observed without extra synchronization
> > > required.
> > 
> > Given that an if.. goto is involved, even if both code paths lead to
> > good results, I think it would be best to have the smp_rmb() barrier
> > above after the first v->is_initialised read to make sure we take the
> > right path.
> 
> I don't understand what you are referring by "wrong" and "right" path. The
> processor will always execute/commit the path based on the value of
> v->is_initialized. What may happen is the processor may start to speculate the
> wrong path and then backtrack if it discovers the value is not the expected
> one. But, AFAIK, smp_rmb() is not going to protect you against speculation...
> smp_rmb() is only going to enforce a memory ordering between read.
> 
> > Instead of having to make sure that even the "wrong" path
> > behaves correctly.
> Just for clarification, I think you meant writing the following code:
> 
> if ( !v->is_initialized )
>   goto getvcpucontext_out;
> 
> smp_rmb();

No, sorry, I'll try to be clearer, see below


> ...
> 
> vcpu_pause();
> 
> AFAICT, there is nothing in the implementation of XEN_DOMCTL_getvcpucontext
> that justify the extra barrier (assuming we consider vcpu_pause() as a full
> memory barrier).
> 
> From your e-mail, I also could not infer what is your exact concern regarding
> the memory ordering. If you have something in mind, then would you mind to
> provide a sketch showing what could go wrong?

Let me start with what I had in mind:

initialized = v->is_initialized;
smp_rmb();
if ( !initialized )
  goto getvcpucontext_out;


Without smp_rmb there are no guarantees that we see an up-to-date value
of v->is_initialized, right? This READ of v->is_initialized is supposed
to pair with the WRITE in arch_set_info_guest.

Relying on the the barrier in vcpu_pause is OK only if is_initialised
was already read as "1".

I think it might be OK in this case, because probably nothing wrong
happens if we read the old value of v->is_initialized as "0" but it
doesn't seem to be a good idea. Theoretically it could pass a very long
time before we see v->is_initialized as "1" without the smp_rmb.

Please let me know if I am missing something.



Re: [PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-04-20 Thread Julien Grall

(+ Andrew and Jan)

Hi Stefano,

On 20/04/2021 20:38, Stefano Stabellini wrote:

On Fri, 16 Apr 2021, Julien Grall wrote:

I think your explanation is correct. However, don't we need a smp_rmb()
barrier after reading v->is_initialised in xen/common/domctl.c:do_domctl
? That would be the barrier that pairs with smp_wmb in regards to
v->is_initialised.


There is already a smp_mb() in sync_vcpu_exec_state() which is called from
vcpu_pause() -> vcpu_sleep_sync().


But that's too late, isn't? The v->is_initialized check is done before
the vcpu_pause() call. We might end up taking the wrong code path:

https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/domctl.c#L587
https://gitlab.com/xen-project/xen/-/blob/staging/xen/common/domctl.c#L598


I am a bit confused what you mean by "wrong path" here. There is no 
timing guarantee with a memory barrier. What the barrier will guarantee 
you is v->is_initialized will be read *before* anything after the barrier.


Are you worried that some variables in vcpu_pause() may be read before 
v->is_initialized?





I don't think we can ever remove the memory barrier in sync_vcpu_exec_state()
because the vCPU paused may have run (or initialized) on a different pCPU. So
I would like to rely on the barrier rather than adding an extra one (even
thought it is not a fast path).

I am thinking to add a comment on top of vcpu_pause() to clarify that after
the call, the vCPU context will be observed without extra synchronization
required.


Given that an if.. goto is involved, even if both code paths lead to
good results, I think it would be best to have the smp_rmb() barrier
above after the first v->is_initialised read to make sure we take the
right path.


I don't understand what you are referring by "wrong" and "right" path. 
The processor will always execute/commit the path based on the value of 
v->is_initialized. What may happen is the processor may start to 
speculate the wrong path and then backtrack if it discovers the value is 
not the expected one. But, AFAIK, smp_rmb() is not going to protect you 
against speculation... smp_rmb() is only going to enforce a memory 
ordering between read.



Instead of having to make sure that even the "wrong" path
behaves correctly.

Just for clarification, I think you meant writing the following code:

if ( !v->is_initialized )
  goto getvcpucontext_out;

smp_rmb();

...

vcpu_pause();

AFAICT, there is nothing in the implementation of 
XEN_DOMCTL_getvcpucontext that justify the extra barrier (assuming we 
consider vcpu_pause() as a full memory barrier).


From your e-mail, I also could not infer what is your exact concern 
regarding the memory ordering. If you have something in mind, then would 
you mind to provide a sketch showing what could go wrong?


Cheers,

--
Julien Grall



Re: [PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-04-20 Thread Stefano Stabellini
On Fri, 16 Apr 2021, Julien Grall wrote:
> Hi Stefano,
> 
> On 13/04/2021 23:43, Stefano Stabellini wrote:
> > On Sat, 20 Mar 2021, Julien Grall wrote:
> > > On 20/03/2021 00:01, Stefano Stabellini wrote:
> > > > On Sat, 27 Feb 2021, Julien Grall wrote:
> > > > > (+ Dario and George)
> > > > > 
> > > > > Hi Stefano,
> > > > > 
> > > > > I have added Dario and George to get some inputs from the scheduling
> > > > > part.
> > > > > 
> > > > > On 27/02/2021 01:58, Stefano Stabellini wrote:
> > > > > > On Fri, 26 Feb 2021, Julien Grall wrote:
> > > > > > > From: Julien Grall 
> > > > > > > 
> > > > > > > A vCPU can get scheduled as soon as _VPF_down is cleared. As there
> > > > > > > is
> > > > > > > currently not ordering guarantee in arch_set_info_guest(), it may
> > > > > > > be
> > > > > > > possible that flag can be observed cleared before the new values
> > > > > > > of
> > > > > > > vCPU
> > > > > > > registers are observed.
> > > > > > > 
> > > > > > > Add an smp_mb() before the flag is cleared to prevent re-ordering.
> > > > > > > 
> > > > > > > Signed-off-by: Julien Grall 
> > > > > > > 
> > > > > > > ---
> > > > > > > 
> > > > > > > Barriers should work in pair. However, I am not entirely sure
> > > > > > > whether
> > > > > > > to
> > > > > > > put the other half. Maybe at the beginning of context_switch_to()?
> > > > > > 
> > > > > > It should be right after VGCF_online is set or cleared, right?
> > > > > 
> > > > > vcpu_guest_context_t is variable allocated on the heap just for the
> > > > > purpose of
> > > > > this call. So an ordering with VGFC_online is not going to do
> > > > > anything.
> > > > > 
> > > > > > So it
> > > > > > would be:
> > > > > > 
> > > > > > xen/arch/arm/domctl.c:arch_get_info_guest
> > > > > > xen/arch/arm/vpsci.c:do_common_cpu_on
> > > > > > 
> > > > > > But I think it is impossible that either of them get called at the
> > > > > > same
> > > > > > time as arch_set_info_guest, which makes me wonder if we actually
> > > > > > need
> > > > > > the barrier...
> > > > > arch_get_info_guest() is called without the domain lock held and I
> > > > > can't
> > > > > see
> > > > > any other lock that could prevent it to be called in // of
> > > > > arch_set_info_guest().
> > > > > 
> > > > > So you could technically get corrupted information from
> > > > > XEN_DOMCTL_getvcpucontext. For this case, we would want a smp_wmb()
> > > > > before
> > > > > writing to v->is_initialised. The corresponding read barrier would be
> > > > > in
> > > > > vcpu_pause() -> vcpu_sleep_sync() -> sync_vcpu_execstate().
> > > > > 
> > > > > But this is not the issue I was originally trying to solve. Currently,
> > > > > do_common_cpu_on() will roughly do:
> > > > > 
> > > > >1) domain_lock(d)
> > > > > 
> > > > >2) v->arch.sctlr = ...
> > > > >   v->arch.ttbr0 = ...
> > > > > 
> > > > >3) clear_bit(_VFP_down, >pause_flags);
> > > > > 
> > > > >4) domain_unlock(d)
> > > > > 
> > > > >5) vcpu_wake(v);
> > > > > 
> > > > > If we had only one pCPU on the system, then we would only wake the
> > > > > vCPU in
> > > > > step 5. We would be fine in this situation. But that's not the
> > > > > interesting
> > > > > case.
> > > > > 
> > > > > If you add a second pCPU in the story, it may be possible to have
> > > > > vcpu_wake()
> > > > > happening in // (see more below). As there is no memory barrier, step
> > > > > 3
> > > > > may be
> > > > > observed before 2. So, assuming the vcpu is runnable, we could start
> > > > > to
> > > > > schedule a vCPU before any update to the registers (step 2) are
> > > > > observed.
> > > > > 
> > > > > This means that when context_switch_to() is called, we may end up to
> > > > > restore
> > > > > some old values.
> > > > > 
> > > > > Now the question is can vcpu_wake() be called in // from another pCPU?
> > > > > AFAICT,
> > > > > it would be only called if a given flag in v->pause_flags is cleared
> > > > > (e.g.
> > > > > _VFP_blocked). But can we rely on that?
> > > > > 
> > > > > Even if we can rely on it, v->pause_flags has other flags in it. I
> > > > > couldn't
> > > > > rule out that _VPF_down cannot be set at the same time as the other
> > > > > _VPF_*.
> > > > > 
> > > > > Therefore, I think a barrier is necessary to ensure the ordering.
> > > > > 
> > > > > Do you agree with this analysis?
> > > >Yes, I think this makes sense. The corresponding barrier in the
> > > > scheduling code would have to be after reading _VPF_down and before
> > > > reading v->arch.sctlr, etc.
> > > > 
> > > > 
> > > > > > > The issues described here is also quite theoritical because there
> > > > > > > are
> > > > > > > hundreds of instructions executed between the time a vCPU is seen
> > > > > > > runnable and scheduled. But better be safe than sorry :).
> > > > > > > ---
> > > > > > > xen/arch/arm/domain.c | 7 +++
> > > > > > > 1 file changed, 7 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/xen/arch/arm/domain.c 

Re: [PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-04-16 Thread Julien Grall

Hi Stefano,

On 13/04/2021 23:43, Stefano Stabellini wrote:

On Sat, 20 Mar 2021, Julien Grall wrote:

On 20/03/2021 00:01, Stefano Stabellini wrote:

On Sat, 27 Feb 2021, Julien Grall wrote:

(+ Dario and George)

Hi Stefano,

I have added Dario and George to get some inputs from the scheduling part.

On 27/02/2021 01:58, Stefano Stabellini wrote:

On Fri, 26 Feb 2021, Julien Grall wrote:

From: Julien Grall 

A vCPU can get scheduled as soon as _VPF_down is cleared. As there is
currently not ordering guarantee in arch_set_info_guest(), it may be
possible that flag can be observed cleared before the new values of
vCPU
registers are observed.

Add an smp_mb() before the flag is cleared to prevent re-ordering.

Signed-off-by: Julien Grall 

---

Barriers should work in pair. However, I am not entirely sure whether
to
put the other half. Maybe at the beginning of context_switch_to()?


It should be right after VGCF_online is set or cleared, right?


vcpu_guest_context_t is variable allocated on the heap just for the
purpose of
this call. So an ordering with VGFC_online is not going to do anything.


So it
would be:

xen/arch/arm/domctl.c:arch_get_info_guest
xen/arch/arm/vpsci.c:do_common_cpu_on

But I think it is impossible that either of them get called at the same
time as arch_set_info_guest, which makes me wonder if we actually need
the barrier...

arch_get_info_guest() is called without the domain lock held and I can't
see
any other lock that could prevent it to be called in // of
arch_set_info_guest().

So you could technically get corrupted information from
XEN_DOMCTL_getvcpucontext. For this case, we would want a smp_wmb() before
writing to v->is_initialised. The corresponding read barrier would be in
vcpu_pause() -> vcpu_sleep_sync() -> sync_vcpu_execstate().

But this is not the issue I was originally trying to solve. Currently,
do_common_cpu_on() will roughly do:

   1) domain_lock(d)

   2) v->arch.sctlr = ...
  v->arch.ttbr0 = ...

   3) clear_bit(_VFP_down, >pause_flags);

   4) domain_unlock(d)

   5) vcpu_wake(v);

If we had only one pCPU on the system, then we would only wake the vCPU in
step 5. We would be fine in this situation. But that's not the interesting
case.

If you add a second pCPU in the story, it may be possible to have
vcpu_wake()
happening in // (see more below). As there is no memory barrier, step 3
may be
observed before 2. So, assuming the vcpu is runnable, we could start to
schedule a vCPU before any update to the registers (step 2) are observed.

This means that when context_switch_to() is called, we may end up to
restore
some old values.

Now the question is can vcpu_wake() be called in // from another pCPU?
AFAICT,
it would be only called if a given flag in v->pause_flags is cleared (e.g.
_VFP_blocked). But can we rely on that?

Even if we can rely on it, v->pause_flags has other flags in it. I
couldn't
rule out that _VPF_down cannot be set at the same time as the other
_VPF_*.

Therefore, I think a barrier is necessary to ensure the ordering.

Do you agree with this analysis?

   Yes, I think this makes sense. The corresponding barrier in the
scheduling code would have to be after reading _VPF_down and before
reading v->arch.sctlr, etc.



The issues described here is also quite theoritical because there are
hundreds of instructions executed between the time a vCPU is seen
runnable and scheduled. But better be safe than sorry :).
---
xen/arch/arm/domain.c | 7 +++
1 file changed, 7 insertions(+)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index bdd3d3e5b5d5..2b705e66be81 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -914,7 +914,14 @@ int arch_set_info_guest(
v->is_initialised = 1;
  if ( ctxt->flags & VGCF_online )
+{
+/*
+ * The vCPU can be scheduled as soon as _VPF_down is cleared.
+ * So clear the bit *after* the context was loaded.
+ */
+smp_mb();


  From the discussion above, I would move this barrier before
v->is_initialised.
So we also take care of the issue with arch_get_info_guest().

This barrier also can be reduced to a smp_wmb() as we only expect an
ordering
between writes.

The barrier would be paired with the barrier in:
 - sync_vcpu_execstate() in the case of arch_get_vcpu_info_guest().
 - context_switch_to() in the case of scheduling (The exact barrier is
TDB).


OK, this makes sense, but why before:

v->is_initialised = 1;

instead of right after it? It is just v->pause_flags we care about,
right?


The issue I originally tried to address was a race with v->pause_flags. But I
also discovered one with v->initialised while answering to your previous
e-mail. This was only briefly mentioned so let me expand it.

A toolstack can take a snapshot of the vCPU context using
XEN_DOMCTL_get_vcpucontext. The helper will bail out if v->is_initialized is
0.

If v->is_initialized is 1, it will temporarily pause the vCPU and then call

Re: [PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-04-13 Thread Stefano Stabellini
On Sat, 20 Mar 2021, Julien Grall wrote:
> On 20/03/2021 00:01, Stefano Stabellini wrote:
> > On Sat, 27 Feb 2021, Julien Grall wrote:
> > > (+ Dario and George)
> > > 
> > > Hi Stefano,
> > > 
> > > I have added Dario and George to get some inputs from the scheduling part.
> > > 
> > > On 27/02/2021 01:58, Stefano Stabellini wrote:
> > > > On Fri, 26 Feb 2021, Julien Grall wrote:
> > > > > From: Julien Grall 
> > > > > 
> > > > > A vCPU can get scheduled as soon as _VPF_down is cleared. As there is
> > > > > currently not ordering guarantee in arch_set_info_guest(), it may be
> > > > > possible that flag can be observed cleared before the new values of
> > > > > vCPU
> > > > > registers are observed.
> > > > > 
> > > > > Add an smp_mb() before the flag is cleared to prevent re-ordering.
> > > > > 
> > > > > Signed-off-by: Julien Grall 
> > > > > 
> > > > > ---
> > > > > 
> > > > > Barriers should work in pair. However, I am not entirely sure whether
> > > > > to
> > > > > put the other half. Maybe at the beginning of context_switch_to()?
> > > > 
> > > > It should be right after VGCF_online is set or cleared, right?
> > > 
> > > vcpu_guest_context_t is variable allocated on the heap just for the
> > > purpose of
> > > this call. So an ordering with VGFC_online is not going to do anything.
> > > 
> > > > So it
> > > > would be:
> > > > 
> > > > xen/arch/arm/domctl.c:arch_get_info_guest
> > > > xen/arch/arm/vpsci.c:do_common_cpu_on
> > > > 
> > > > But I think it is impossible that either of them get called at the same
> > > > time as arch_set_info_guest, which makes me wonder if we actually need
> > > > the barrier...
> > > arch_get_info_guest() is called without the domain lock held and I can't
> > > see
> > > any other lock that could prevent it to be called in // of
> > > arch_set_info_guest().
> > > 
> > > So you could technically get corrupted information from
> > > XEN_DOMCTL_getvcpucontext. For this case, we would want a smp_wmb() before
> > > writing to v->is_initialised. The corresponding read barrier would be in
> > > vcpu_pause() -> vcpu_sleep_sync() -> sync_vcpu_execstate().
> > > 
> > > But this is not the issue I was originally trying to solve. Currently,
> > > do_common_cpu_on() will roughly do:
> > > 
> > >   1) domain_lock(d)
> > > 
> > >   2) v->arch.sctlr = ...
> > >  v->arch.ttbr0 = ...
> > > 
> > >   3) clear_bit(_VFP_down, >pause_flags);
> > > 
> > >   4) domain_unlock(d)
> > > 
> > >   5) vcpu_wake(v);
> > > 
> > > If we had only one pCPU on the system, then we would only wake the vCPU in
> > > step 5. We would be fine in this situation. But that's not the interesting
> > > case.
> > > 
> > > If you add a second pCPU in the story, it may be possible to have
> > > vcpu_wake()
> > > happening in // (see more below). As there is no memory barrier, step 3
> > > may be
> > > observed before 2. So, assuming the vcpu is runnable, we could start to
> > > schedule a vCPU before any update to the registers (step 2) are observed.
> > > 
> > > This means that when context_switch_to() is called, we may end up to
> > > restore
> > > some old values.
> > > 
> > > Now the question is can vcpu_wake() be called in // from another pCPU?
> > > AFAICT,
> > > it would be only called if a given flag in v->pause_flags is cleared (e.g.
> > > _VFP_blocked). But can we rely on that?
> > > 
> > > Even if we can rely on it, v->pause_flags has other flags in it. I
> > > couldn't
> > > rule out that _VPF_down cannot be set at the same time as the other
> > > _VPF_*.
> > > 
> > > Therefore, I think a barrier is necessary to ensure the ordering.
> > > 
> > > Do you agree with this analysis?
> >   Yes, I think this makes sense. The corresponding barrier in the
> > scheduling code would have to be after reading _VPF_down and before
> > reading v->arch.sctlr, etc.
> > 
> > 
> > > > > The issues described here is also quite theoritical because there are
> > > > > hundreds of instructions executed between the time a vCPU is seen
> > > > > runnable and scheduled. But better be safe than sorry :).
> > > > > ---
> > > > >xen/arch/arm/domain.c | 7 +++
> > > > >1 file changed, 7 insertions(+)
> > > > > 
> > > > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> > > > > index bdd3d3e5b5d5..2b705e66be81 100644
> > > > > --- a/xen/arch/arm/domain.c
> > > > > +++ b/xen/arch/arm/domain.c
> > > > > @@ -914,7 +914,14 @@ int arch_set_info_guest(
> > > > >v->is_initialised = 1;
> > > > >  if ( ctxt->flags & VGCF_online )
> > > > > +{
> > > > > +/*
> > > > > + * The vCPU can be scheduled as soon as _VPF_down is cleared.
> > > > > + * So clear the bit *after* the context was loaded.
> > > > > + */
> > > > > +smp_mb();
> > > 
> > >  From the discussion above, I would move this barrier before
> > > v->is_initialised.
> > > So we also take care of the issue with arch_get_info_guest().
> > > 
> > > This barrier also can be reduced to a smp_wmb() as we 

Re: [PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-04-01 Thread Julien Grall

Hi,

On 20/03/2021 11:47, Julien Grall wrote:



On 20/03/2021 00:01, Stefano Stabellini wrote:

On Sat, 27 Feb 2021, Julien Grall wrote:

(+ Dario and George)

Hi Stefano,

I have added Dario and George to get some inputs from the scheduling 
part.


On 27/02/2021 01:58, Stefano Stabellini wrote:

On Fri, 26 Feb 2021, Julien Grall wrote:

From: Julien Grall 

A vCPU can get scheduled as soon as _VPF_down is cleared. As there is
currently not ordering guarantee in arch_set_info_guest(), it may be
possible that flag can be observed cleared before the new values of 
vCPU

registers are observed.

Add an smp_mb() before the flag is cleared to prevent re-ordering.

Signed-off-by: Julien Grall 

---

Barriers should work in pair. However, I am not entirely sure 
whether to

put the other half. Maybe at the beginning of context_switch_to()?


It should be right after VGCF_online is set or cleared, right?


vcpu_guest_context_t is variable allocated on the heap just for the 
purpose of

this call. So an ordering with VGFC_online is not going to do anything.


So it
would be:

xen/arch/arm/domctl.c:arch_get_info_guest
xen/arch/arm/vpsci.c:do_common_cpu_on

But I think it is impossible that either of them get called at the same
time as arch_set_info_guest, which makes me wonder if we actually need
the barrier...
arch_get_info_guest() is called without the domain lock held and I 
can't see

any other lock that could prevent it to be called in // of
arch_set_info_guest().

So you could technically get corrupted information from
XEN_DOMCTL_getvcpucontext. For this case, we would want a smp_wmb() 
before

writing to v->is_initialised. The corresponding read barrier would be in
vcpu_pause() -> vcpu_sleep_sync() -> sync_vcpu_execstate().

But this is not the issue I was originally trying to solve. Currently,
do_common_cpu_on() will roughly do:

  1) domain_lock(d)

  2) v->arch.sctlr = ...
 v->arch.ttbr0 = ...

  3) clear_bit(_VFP_down, >pause_flags);

  4) domain_unlock(d)

  5) vcpu_wake(v);

If we had only one pCPU on the system, then we would only wake the 
vCPU in
step 5. We would be fine in this situation. But that's not the 
interesting

case.

If you add a second pCPU in the story, it may be possible to have 
vcpu_wake()
happening in // (see more below). As there is no memory barrier, step 
3 may be

observed before 2. So, assuming the vcpu is runnable, we could start to
schedule a vCPU before any update to the registers (step 2) are 
observed.


This means that when context_switch_to() is called, we may end up to 
restore

some old values.

Now the question is can vcpu_wake() be called in // from another 
pCPU? AFAICT,
it would be only called if a given flag in v->pause_flags is cleared 
(e.g.

_VFP_blocked). But can we rely on that?

Even if we can rely on it, v->pause_flags has other flags in it. I 
couldn't
rule out that _VPF_down cannot be set at the same time as the other 
_VPF_*.


Therefore, I think a barrier is necessary to ensure the ordering.

Do you agree with this analysis?

Yes, I think this makes sense. The corresponding barrier in the
scheduling code would have to be after reading _VPF_down and before
reading v->arch.sctlr, etc.



The issues described here is also quite theoritical because there are
hundreds of instructions executed between the time a vCPU is seen
runnable and scheduled. But better be safe than sorry :).
---
   xen/arch/arm/domain.c | 7 +++
   1 file changed, 7 insertions(+)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index bdd3d3e5b5d5..2b705e66be81 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -914,7 +914,14 @@ int arch_set_info_guest(
   v->is_initialised = 1;
 if ( ctxt->flags & VGCF_online )
+    {
+    /*
+ * The vCPU can be scheduled as soon as _VPF_down is cleared.
+ * So clear the bit *after* the context was loaded.
+ */
+    smp_mb();


 From the discussion above, I would move this barrier before 
v->is_initialised.

So we also take care of the issue with arch_get_info_guest().

This barrier also can be reduced to a smp_wmb() as we only expect an 
ordering

between writes.

The barrier would be paired with the barrier in:
    - sync_vcpu_execstate() in the case of arch_get_vcpu_info_guest().
    - context_switch_to() in the case of scheduling (The exact 
barrier is TDB).


OK, this makes sense, but why before:

   v->is_initialised = 1;

instead of right after it? It is just v->pause_flags we care about,
right?


The issue I originally tried to address was a race with v->pause_flags. 
But I also discovered one with v->initialised while answering to your 
previous e-mail. This was only briefly mentioned so let me expand it.


A toolstack can take a snapshot of the vCPU context using 
XEN_DOMCTL_get_vcpucontext. The helper will bail out if 
v->is_initialized is 0.


If v->is_initialized is 1, it will temporarily pause the vCPU and then 
call arch_get_info_guest().


AFAICT, 

Re: [PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-03-20 Thread Julien Grall




On 20/03/2021 00:01, Stefano Stabellini wrote:

On Sat, 27 Feb 2021, Julien Grall wrote:

(+ Dario and George)

Hi Stefano,

I have added Dario and George to get some inputs from the scheduling part.

On 27/02/2021 01:58, Stefano Stabellini wrote:

On Fri, 26 Feb 2021, Julien Grall wrote:

From: Julien Grall 

A vCPU can get scheduled as soon as _VPF_down is cleared. As there is
currently not ordering guarantee in arch_set_info_guest(), it may be
possible that flag can be observed cleared before the new values of vCPU
registers are observed.

Add an smp_mb() before the flag is cleared to prevent re-ordering.

Signed-off-by: Julien Grall 

---

Barriers should work in pair. However, I am not entirely sure whether to
put the other half. Maybe at the beginning of context_switch_to()?


It should be right after VGCF_online is set or cleared, right?


vcpu_guest_context_t is variable allocated on the heap just for the purpose of
this call. So an ordering with VGFC_online is not going to do anything.


So it
would be:

xen/arch/arm/domctl.c:arch_get_info_guest
xen/arch/arm/vpsci.c:do_common_cpu_on

But I think it is impossible that either of them get called at the same
time as arch_set_info_guest, which makes me wonder if we actually need
the barrier...

arch_get_info_guest() is called without the domain lock held and I can't see
any other lock that could prevent it to be called in // of
arch_set_info_guest().

So you could technically get corrupted information from
XEN_DOMCTL_getvcpucontext. For this case, we would want a smp_wmb() before
writing to v->is_initialised. The corresponding read barrier would be in
vcpu_pause() -> vcpu_sleep_sync() -> sync_vcpu_execstate().

But this is not the issue I was originally trying to solve. Currently,
do_common_cpu_on() will roughly do:

  1) domain_lock(d)

  2) v->arch.sctlr = ...
 v->arch.ttbr0 = ...

  3) clear_bit(_VFP_down, >pause_flags);

  4) domain_unlock(d)

  5) vcpu_wake(v);

If we had only one pCPU on the system, then we would only wake the vCPU in
step 5. We would be fine in this situation. But that's not the interesting
case.

If you add a second pCPU in the story, it may be possible to have vcpu_wake()
happening in // (see more below). As there is no memory barrier, step 3 may be
observed before 2. So, assuming the vcpu is runnable, we could start to
schedule a vCPU before any update to the registers (step 2) are observed.

This means that when context_switch_to() is called, we may end up to restore
some old values.

Now the question is can vcpu_wake() be called in // from another pCPU? AFAICT,
it would be only called if a given flag in v->pause_flags is cleared (e.g.
_VFP_blocked). But can we rely on that?

Even if we can rely on it, v->pause_flags has other flags in it. I couldn't
rule out that _VPF_down cannot be set at the same time as the other _VPF_*.

Therefore, I think a barrier is necessary to ensure the ordering.

Do you agree with this analysis?
  
Yes, I think this makes sense. The corresponding barrier in the

scheduling code would have to be after reading _VPF_down and before
reading v->arch.sctlr, etc.



The issues described here is also quite theoritical because there are
hundreds of instructions executed between the time a vCPU is seen
runnable and scheduled. But better be safe than sorry :).
---
   xen/arch/arm/domain.c | 7 +++
   1 file changed, 7 insertions(+)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index bdd3d3e5b5d5..2b705e66be81 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -914,7 +914,14 @@ int arch_set_info_guest(
   v->is_initialised = 1;
 if ( ctxt->flags & VGCF_online )
+{
+/*
+ * The vCPU can be scheduled as soon as _VPF_down is cleared.
+ * So clear the bit *after* the context was loaded.
+ */
+smp_mb();


 From the discussion above, I would move this barrier before v->is_initialised.
So we also take care of the issue with arch_get_info_guest().

This barrier also can be reduced to a smp_wmb() as we only expect an ordering
between writes.

The barrier would be paired with the barrier in:
- sync_vcpu_execstate() in the case of arch_get_vcpu_info_guest().
- context_switch_to() in the case of scheduling (The exact barrier is TDB).


OK, this makes sense, but why before:

   v->is_initialised = 1;

instead of right after it? It is just v->pause_flags we care about,
right?


The issue I originally tried to address was a race with v->pause_flags. 
But I also discovered one with v->initialised while answering to your 
previous e-mail. This was only briefly mentioned so let me expand it.


A toolstack can take a snapshot of the vCPU context using 
XEN_DOMCTL_get_vcpucontext. The helper will bail out if 
v->is_initialized is 0.


If v->is_initialized is 1, it will temporarily pause the vCPU and then 
call arch_get_info_guest().


AFAICT, arch_get_info_guest() and arch_set_info_guest() (called from 
PSCI CPU on) can run 

Re: [PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-03-19 Thread Stefano Stabellini
On Sat, 27 Feb 2021, Julien Grall wrote:
> (+ Dario and George)
> 
> Hi Stefano,
> 
> I have added Dario and George to get some inputs from the scheduling part.
> 
> On 27/02/2021 01:58, Stefano Stabellini wrote:
> > On Fri, 26 Feb 2021, Julien Grall wrote:
> > > From: Julien Grall 
> > > 
> > > A vCPU can get scheduled as soon as _VPF_down is cleared. As there is
> > > currently not ordering guarantee in arch_set_info_guest(), it may be
> > > possible that flag can be observed cleared before the new values of vCPU
> > > registers are observed.
> > > 
> > > Add an smp_mb() before the flag is cleared to prevent re-ordering.
> > > 
> > > Signed-off-by: Julien Grall 
> > > 
> > > ---
> > > 
> > > Barriers should work in pair. However, I am not entirely sure whether to
> > > put the other half. Maybe at the beginning of context_switch_to()?
> > 
> > It should be right after VGCF_online is set or cleared, right?
> 
> vcpu_guest_context_t is variable allocated on the heap just for the purpose of
> this call. So an ordering with VGFC_online is not going to do anything.
> 
> > So it
> > would be:
> > 
> > xen/arch/arm/domctl.c:arch_get_info_guest
> > xen/arch/arm/vpsci.c:do_common_cpu_on
> > 
> > But I think it is impossible that either of them get called at the same
> > time as arch_set_info_guest, which makes me wonder if we actually need
> > the barrier...
> arch_get_info_guest() is called without the domain lock held and I can't see
> any other lock that could prevent it to be called in // of
> arch_set_info_guest().
> 
> So you could technically get corrupted information from
> XEN_DOMCTL_getvcpucontext. For this case, we would want a smp_wmb() before
> writing to v->is_initialised. The corresponding read barrier would be in
> vcpu_pause() -> vcpu_sleep_sync() -> sync_vcpu_execstate().
> 
> But this is not the issue I was originally trying to solve. Currently,
> do_common_cpu_on() will roughly do:
> 
>  1) domain_lock(d)
> 
>  2) v->arch.sctlr = ...
> v->arch.ttbr0 = ...
> 
>  3) clear_bit(_VFP_down, >pause_flags);
> 
>  4) domain_unlock(d)
> 
>  5) vcpu_wake(v);
> 
> If we had only one pCPU on the system, then we would only wake the vCPU in
> step 5. We would be fine in this situation. But that's not the interesting
> case.
> 
> If you add a second pCPU in the story, it may be possible to have vcpu_wake()
> happening in // (see more below). As there is no memory barrier, step 3 may be
> observed before 2. So, assuming the vcpu is runnable, we could start to
> schedule a vCPU before any update to the registers (step 2) are observed.
> 
> This means that when context_switch_to() is called, we may end up to restore
> some old values.
> 
> Now the question is can vcpu_wake() be called in // from another pCPU? AFAICT,
> it would be only called if a given flag in v->pause_flags is cleared (e.g.
> _VFP_blocked). But can we rely on that?
> 
> Even if we can rely on it, v->pause_flags has other flags in it. I couldn't
> rule out that _VPF_down cannot be set at the same time as the other _VPF_*.
> 
> Therefore, I think a barrier is necessary to ensure the ordering.
> 
> Do you agree with this analysis?
 
Yes, I think this makes sense. The corresponding barrier in the
scheduling code would have to be after reading _VPF_down and before
reading v->arch.sctlr, etc.


> > > The issues described here is also quite theoritical because there are
> > > hundreds of instructions executed between the time a vCPU is seen
> > > runnable and scheduled. But better be safe than sorry :).
> > > ---
> > >   xen/arch/arm/domain.c | 7 +++
> > >   1 file changed, 7 insertions(+)
> > > 
> > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> > > index bdd3d3e5b5d5..2b705e66be81 100644
> > > --- a/xen/arch/arm/domain.c
> > > +++ b/xen/arch/arm/domain.c
> > > @@ -914,7 +914,14 @@ int arch_set_info_guest(
> > >   v->is_initialised = 1;
> > > if ( ctxt->flags & VGCF_online )
> > > +{
> > > +/*
> > > + * The vCPU can be scheduled as soon as _VPF_down is cleared.
> > > + * So clear the bit *after* the context was loaded.
> > > + */
> > > +smp_mb();
> 
> From the discussion above, I would move this barrier before v->is_initialised.
> So we also take care of the issue with arch_get_info_guest().
> 
> This barrier also can be reduced to a smp_wmb() as we only expect an ordering
> between writes.
> 
> The barrier would be paired with the barrier in:
>- sync_vcpu_execstate() in the case of arch_get_vcpu_info_guest().
>- context_switch_to() in the case of scheduling (The exact barrier is TDB).

OK, this makes sense, but why before:

  v->is_initialised = 1;

instead of right after it? It is just v->pause_flags we care about,
right?



> > >   clear_bit(_VPF_down, >pause_flags);
> > > +}
> > >   else
> > >   set_bit(_VPF_down, >pause_flags);
> > >   




Re: [PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-02-27 Thread Julien Grall

(+ Dario and George)

Hi Stefano,

I have added Dario and George to get some inputs from the scheduling part.

On 27/02/2021 01:58, Stefano Stabellini wrote:

On Fri, 26 Feb 2021, Julien Grall wrote:

From: Julien Grall 

A vCPU can get scheduled as soon as _VPF_down is cleared. As there is
currently not ordering guarantee in arch_set_info_guest(), it may be
possible that flag can be observed cleared before the new values of vCPU
registers are observed.

Add an smp_mb() before the flag is cleared to prevent re-ordering.

Signed-off-by: Julien Grall 

---

Barriers should work in pair. However, I am not entirely sure whether to
put the other half. Maybe at the beginning of context_switch_to()?


It should be right after VGCF_online is set or cleared, right?


vcpu_guest_context_t is variable allocated on the heap just for the 
purpose of this call. So an ordering with VGFC_online is not going to do 
anything.



So it
would be:

xen/arch/arm/domctl.c:arch_get_info_guest
xen/arch/arm/vpsci.c:do_common_cpu_on

But I think it is impossible that either of them get called at the same
time as arch_set_info_guest, which makes me wonder if we actually need
the barrier...
arch_get_info_guest() is called without the domain lock held and I can't 
see any other lock that could prevent it to be called in // of 
arch_set_info_guest().


So you could technically get corrupted information from 
XEN_DOMCTL_getvcpucontext. For this case, we would want a smp_wmb() 
before writing to v->is_initialised. The corresponding read barrier 
would be in vcpu_pause() -> vcpu_sleep_sync() -> sync_vcpu_execstate().


But this is not the issue I was originally trying to solve. Currently, 
do_common_cpu_on() will roughly do:


 1) domain_lock(d)

 2) v->arch.sctlr = ...
v->arch.ttbr0 = ...

 3) clear_bit(_VFP_down, >pause_flags);

 4) domain_unlock(d)

 5) vcpu_wake(v);

If we had only one pCPU on the system, then we would only wake the vCPU 
in step 5. We would be fine in this situation. But that's not the 
interesting case.


If you add a second pCPU in the story, it may be possible to have 
vcpu_wake() happening in // (see more below). As there is no memory 
barrier, step 3 may be observed before 2. So, assuming the vcpu is 
runnable, we could start to schedule a vCPU before any update to the 
registers (step 2) are observed.


This means that when context_switch_to() is called, we may end up to 
restore some old values.


Now the question is can vcpu_wake() be called in // from another pCPU? 
AFAICT, it would be only called if a given flag in v->pause_flags is 
cleared (e.g. _VFP_blocked). But can we rely on that?


Even if we can rely on it, v->pause_flags has other flags in it. I 
couldn't rule out that _VPF_down cannot be set at the same time as the 
other _VPF_*.


Therefore, I think a barrier is necessary to ensure the ordering.

Do you agree with this analysis?






The issues described here is also quite theoritical because there are
hundreds of instructions executed between the time a vCPU is seen
runnable and scheduled. But better be safe than sorry :).
---
  xen/arch/arm/domain.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index bdd3d3e5b5d5..2b705e66be81 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -914,7 +914,14 @@ int arch_set_info_guest(
  v->is_initialised = 1;
  
  if ( ctxt->flags & VGCF_online )

+{
+/*
+ * The vCPU can be scheduled as soon as _VPF_down is cleared.
+ * So clear the bit *after* the context was loaded.
+ */
+smp_mb();


From the discussion above, I would move this barrier before 
v->is_initialised. So we also take care of the issue with 
arch_get_info_guest().


This barrier also can be reduced to a smp_wmb() as we only expect an 
ordering between writes.


The barrier would be paired with the barrier in:
   - sync_vcpu_execstate() in the case of arch_get_vcpu_info_guest().
   - context_switch_to() in the case of scheduling (The exact barrier 
is TDB).



  clear_bit(_VPF_down, >pause_flags);
+}
  else
  set_bit(_VPF_down, >pause_flags);
  
--

2.17.1



Cheers,

--
Julien Grall



Re: [PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-02-26 Thread Stefano Stabellini
On Fri, 26 Feb 2021, Julien Grall wrote:
> From: Julien Grall 
> 
> A vCPU can get scheduled as soon as _VPF_down is cleared. As there is
> currently not ordering guarantee in arch_set_info_guest(), it may be
> possible that flag can be observed cleared before the new values of vCPU
> registers are observed.
> 
> Add an smp_mb() before the flag is cleared to prevent re-ordering.
> 
> Signed-off-by: Julien Grall 
> 
> ---
> 
> Barriers should work in pair. However, I am not entirely sure whether to
> put the other half. Maybe at the beginning of context_switch_to()?

It should be right after VGCF_online is set or cleared, right? So it
would be:

xen/arch/arm/domctl.c:arch_get_info_guest
xen/arch/arm/vpsci.c:do_common_cpu_on

But I think it is impossible that either of them get called at the same
time as arch_set_info_guest, which makes me wonder if we actually need
the barrier...



> The issues described here is also quite theoritical because there are
> hundreds of instructions executed between the time a vCPU is seen
> runnable and scheduled. But better be safe than sorry :).
> ---
>  xen/arch/arm/domain.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index bdd3d3e5b5d5..2b705e66be81 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -914,7 +914,14 @@ int arch_set_info_guest(
>  v->is_initialised = 1;
>  
>  if ( ctxt->flags & VGCF_online )
> +{
> +/*
> + * The vCPU can be scheduled as soon as _VPF_down is cleared.
> + * So clear the bit *after* the context was loaded.
> + */
> +smp_mb();
>  clear_bit(_VPF_down, >pause_flags);
> +}
>  else
>  set_bit(_VPF_down, >pause_flags);
>  
> -- 
> 2.17.1
> 



[PATCH] xen/arm: Ensure the vCPU context is seen before clearing the _VPF_down

2021-02-26 Thread Julien Grall
From: Julien Grall 

A vCPU can get scheduled as soon as _VPF_down is cleared. As there is
currently not ordering guarantee in arch_set_info_guest(), it may be
possible that flag can be observed cleared before the new values of vCPU
registers are observed.

Add an smp_mb() before the flag is cleared to prevent re-ordering.

Signed-off-by: Julien Grall 

---

Barriers should work in pair. However, I am not entirely sure whether to
put the other half. Maybe at the beginning of context_switch_to()?

The issues described here is also quite theoritical because there are
hundreds of instructions executed between the time a vCPU is seen
runnable and scheduled. But better be safe than sorry :).
---
 xen/arch/arm/domain.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index bdd3d3e5b5d5..2b705e66be81 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -914,7 +914,14 @@ int arch_set_info_guest(
 v->is_initialised = 1;
 
 if ( ctxt->flags & VGCF_online )
+{
+/*
+ * The vCPU can be scheduled as soon as _VPF_down is cleared.
+ * So clear the bit *after* the context was loaded.
+ */
+smp_mb();
 clear_bit(_VPF_down, >pause_flags);
+}
 else
 set_bit(_VPF_down, >pause_flags);
 
-- 
2.17.1