Re: [Xen-devel] [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC

2018-04-20 Thread Jan Beulich
>>> On 19.04.18 at 20:18,  wrote:
> On 19/04/18 16:54, Natarajan, Janakarajan wrote:
>> On 4/13/2018 12:57 PM, Andrew Cooper wrote:
>>> On 04/04/18 00:01, Janakarajan Natarajan wrote:
 @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d,
 unsigned int index)
   return >avic_physical_id_table[index];
   }
   +static void avic_vcpu_load(struct vcpu *v)
 +{
 +unsigned long tmp;
 +struct arch_svm_struct *s = >arch.hvm_svm;
 +int h_phy_apic_id;
 +struct avic_physical_id_entry *entry = (struct
 avic_physical_id_entry *)
 +
 +ASSERT(!test_bit(_VPF_blocked, >pause_flags));
 +
 +/*
 + * Note: APIC ID = 0xff is used for broadcast.
 + *   APIC ID > 0xff is reserved.
 + */
 +h_phy_apic_id = cpu_data[v->processor].apicid;
 +ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
 +
 +tmp = read_atomic((u64*)(s->avic_last_phy_id));
 +entry->host_phy_apic_id = h_phy_apic_id;
 +entry->is_running = 1;
 +write_atomic((u64*)(s->avic_last_phy_id), tmp);
>>> What is the purpose of s->avic_last_phy_id ?
>>>
>>> As far as I can tell, it is always an unchanging pointer into the
>>> physical ID table, which is only ever updated synchronously in current
>>> context.
>>>
>>> If so, I don't see why it needs any of these hoops to be jumped though.
>>
>> s->avic_last_phy_id is used to quickly access the entry in the table.
>>
>> When the code was pushed for Linux, memory barriers were used and it
>> was suggested that atomic operations
>> be used instead to ensure compiler ordering. The same is done here.
> 
> Ok - summing up a conversation on IRC, and some digging around the manual.
> 
> Per VM, there is a single Physical APIC Table, which lives in a 4k
> page.  This table is referenced by the VMCB, and read by hardware when
> processing guest actions.
> 
> The contents of this table a list of 64bit entries,
> 
> struct __packed avic_physical_id_entry {
> u64 host_phy_apic_id  : 8;
> u64 res1  : 4;
> u64 bk_pg_ptr_mfn : 40;
> u64 res2  : 10;
> u64 is_running: 1;
> u64 valid : 1;
> };
> 
> which are indexed by guest APIC_ID.
> 
> AMD hardware allows writes to the APIC_ID register, but OSes don't do
> this in practice (the register is read/discard on some hardware, and
> strictly read-only in x2apic).  The implementation in Xen is to crash
> the domain if we see a write here, and that is reasonable behaviour
> which I don't expect to change going forwards.
> 
> As a result, the layout of the Physical APIC Table is fixed based on the
> APIC assignment during domain creation.  Also, the bk_pg_ptr_mfn and its
> valid bit (valid) are set up during construction, and remain unchanged
> for the lifetime of the domain.
> 
> The only fields which change during runtime are the host_phys_apic_id,
> and its valid bit (is_running), and these change on vcpu context switch.
> 
> Therefore, on ctxt_switch_from(), we want a straight __clear_bit() on
> e->is_running to signify that the vcpu isn't allocated to a pcpu.
> 
> On ctxt_switch_to(), we want a simple
> 
> e->host_phy_apic_id = this_pcpu_apic_id;
> smp_wmb();
> __set_bit(e->is_running);
> 
> which guarantees that the host physical apic id field is valid and up to
> date, before hardware sees it being reported as valid.  As these changes
> are only made in current context, there are no other ordering or
> atomicity concerns.

Besides the above not going to compile (due to the bitfield nature),
why two accesses when one suffices? After all both fields needing
updating live within the same qword.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC

2018-04-20 Thread Andrew Cooper
On 20/04/18 00:04, Boris Ostrovsky wrote:
> On 04/19/2018 02:18 PM, Andrew Cooper wrote:
>> On 19/04/18 16:54, Natarajan, Janakarajan wrote:
>>> On 4/13/2018 12:57 PM, Andrew Cooper wrote:
 On 04/04/18 00:01, Janakarajan Natarajan wrote:
> @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d,
> unsigned int index)
>   return >avic_physical_id_table[index];
>   }
>   +static void avic_vcpu_load(struct vcpu *v)
> +{
> +    unsigned long tmp;
> +    struct arch_svm_struct *s = >arch.hvm_svm;
> +    int h_phy_apic_id;
> +    struct avic_physical_id_entry *entry = (struct
> avic_physical_id_entry *)
> +
> +    ASSERT(!test_bit(_VPF_blocked, >pause_flags));
> +
> +    /*
> + * Note: APIC ID = 0xff is used for broadcast.
> + *   APIC ID > 0xff is reserved.
> + */
> +    h_phy_apic_id = cpu_data[v->processor].apicid;
> +    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
> +
> +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
> +    entry->host_phy_apic_id = h_phy_apic_id;
> +    entry->is_running = 1;
> +    write_atomic((u64*)(s->avic_last_phy_id), tmp);
 What is the purpose of s->avic_last_phy_id ?

 As far as I can tell, it is always an unchanging pointer into the
 physical ID table, which is only ever updated synchronously in current
 context.

 If so, I don't see why it needs any of these hoops to be jumped though.
>>> s->avic_last_phy_id is used to quickly access the entry in the table.
>>>
>>> When the code was pushed for Linux, memory barriers were used and it
>>> was suggested that atomic operations
>>> be used instead to ensure compiler ordering. The same is done here.
>> Ok - summing up a conversation on IRC, and some digging around the manual.
>>
>> Per VM, there is a single Physical APIC Table, which lives in a 4k
>> page.  This table is referenced by the VMCB, and read by hardware when
>> processing guest actions.
>>
>> The contents of this table a list of 64bit entries,
>>
>> struct __packed avic_physical_id_entry {
>> u64 host_phy_apic_id  : 8;
>> u64 res1  : 4;
>> u64 bk_pg_ptr_mfn : 40;
>> u64 res2  : 10;
>> u64 is_running: 1;
>> u64 valid : 1;
>> };
>>
>> which are indexed by guest APIC_ID.
>>
>> AMD hardware allows writes to the APIC_ID register, but OSes don't do
>> this in practice (the register is read/discard on some hardware, and
>> strictly read-only in x2apic).  The implementation in Xen is to crash
>> the domain if we see a write here, and that is reasonable behaviour
>> which I don't expect to change going forwards.
>>
>> As a result, the layout of the Physical APIC Table is fixed based on the
>> APIC assignment during domain creation.  Also, the bk_pg_ptr_mfn and its
>> valid bit (valid) are set up during construction, and remain unchanged
>> for the lifetime of the domain.
>>
>> The only fields which change during runtime are the host_phys_apic_id,
>> and its valid bit (is_running), and these change on vcpu context switch.
>>
>> Therefore, on ctxt_switch_from(), we want a straight __clear_bit() on
>> e->is_running to signify that the vcpu isn't allocated to a pcpu.
>>
>> On ctxt_switch_to(), we want a simple
>>
>> e->host_phy_apic_id = this_pcpu_apic_id;
>> smp_wmb();
>> __set_bit(e->is_running);
>>
>> which guarantees that the host physical apic id field is valid and up to
>> date, before hardware sees it being reported as valid.  As these changes
>> are only made in current context, there are no other ordering or
>> atomicity concerns.
>>
>> This table is expected to live in regular WB RAM, and the manual has no
>> comment/reference to requiring special accesses.  Therefore, I'm
>> moderately confident that the above ordering is sufficient for correct
>> behaviour, and no explicitly atomic actions are required.
>>
>> Thoughts/comments/suggestions?
>
> The entry can also be written as a single raw 64-bit value (I think you
> suggested in one of the reviews to make it a union with a uint64_t).

Yes it can, but the handling for that is more complicated and (AFAICT)
unnecessary.

You'd need something like:

e = ACCESS_ONCE(*ptr);
e->host_phy_apic_id = ...;
e->is_running = 1;
ACCESS_ONCE(*ptr) = e;

which would be the most flexible way of expressing the result.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC

2018-04-19 Thread Boris Ostrovsky
On 04/19/2018 02:18 PM, Andrew Cooper wrote:
> On 19/04/18 16:54, Natarajan, Janakarajan wrote:
>> On 4/13/2018 12:57 PM, Andrew Cooper wrote:
>>> On 04/04/18 00:01, Janakarajan Natarajan wrote:
 @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d,
 unsigned int index)
   return >avic_physical_id_table[index];
   }
   +static void avic_vcpu_load(struct vcpu *v)
 +{
 +    unsigned long tmp;
 +    struct arch_svm_struct *s = >arch.hvm_svm;
 +    int h_phy_apic_id;
 +    struct avic_physical_id_entry *entry = (struct
 avic_physical_id_entry *)
 +
 +    ASSERT(!test_bit(_VPF_blocked, >pause_flags));
 +
 +    /*
 + * Note: APIC ID = 0xff is used for broadcast.
 + *   APIC ID > 0xff is reserved.
 + */
 +    h_phy_apic_id = cpu_data[v->processor].apicid;
 +    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
 +
 +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
 +    entry->host_phy_apic_id = h_phy_apic_id;
 +    entry->is_running = 1;
 +    write_atomic((u64*)(s->avic_last_phy_id), tmp);
>>> What is the purpose of s->avic_last_phy_id ?
>>>
>>> As far as I can tell, it is always an unchanging pointer into the
>>> physical ID table, which is only ever updated synchronously in current
>>> context.
>>>
>>> If so, I don't see why it needs any of these hoops to be jumped though.
>> s->avic_last_phy_id is used to quickly access the entry in the table.
>>
>> When the code was pushed for Linux, memory barriers were used and it
>> was suggested that atomic operations
>> be used instead to ensure compiler ordering. The same is done here.
> Ok - summing up a conversation on IRC, and some digging around the manual.
>
> Per VM, there is a single Physical APIC Table, which lives in a 4k
> page.  This table is referenced by the VMCB, and read by hardware when
> processing guest actions.
>
> The contents of this table a list of 64bit entries,
>
> struct __packed avic_physical_id_entry {
> u64 host_phy_apic_id  : 8;
> u64 res1  : 4;
> u64 bk_pg_ptr_mfn : 40;
> u64 res2  : 10;
> u64 is_running: 1;
> u64 valid : 1;
> };
>
> which are indexed by guest APIC_ID.
>
> AMD hardware allows writes to the APIC_ID register, but OSes don't do
> this in practice (the register is read/discard on some hardware, and
> strictly read-only in x2apic).  The implementation in Xen is to crash
> the domain if we see a write here, and that is reasonable behaviour
> which I don't expect to change going forwards.
>
> As a result, the layout of the Physical APIC Table is fixed based on the
> APIC assignment during domain creation.  Also, the bk_pg_ptr_mfn and its
> valid bit (valid) are set up during construction, and remain unchanged
> for the lifetime of the domain.
>
> The only fields which change during runtime are the host_phys_apic_id,
> and its valid bit (is_running), and these change on vcpu context switch.
>
> Therefore, on ctxt_switch_from(), we want a straight __clear_bit() on
> e->is_running to signify that the vcpu isn't allocated to a pcpu.
>
> On ctxt_switch_to(), we want a simple
>
> e->host_phy_apic_id = this_pcpu_apic_id;
> smp_wmb();
> __set_bit(e->is_running);
>
> which guarantees that the host physical apic id field is valid and up to
> date, before hardware sees it being reported as valid.  As these changes
> are only made in current context, there are no other ordering or
> atomicity concerns.
>
> This table is expected to live in regular WB RAM, and the manual has no
> comment/reference to requiring special accesses.  Therefore, I'm
> moderately confident that the above ordering is sufficient for correct
> behaviour, and no explicitly atomic actions are required.
>
> Thoughts/comments/suggestions?


The entry can also be written as a single raw 64-bit value (I think you
suggested in one of the reviews to make it a union with a uint64_t).

-boris


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC

2018-04-19 Thread Andrew Cooper
On 19/04/18 16:54, Natarajan, Janakarajan wrote:
> On 4/13/2018 12:57 PM, Andrew Cooper wrote:
>> On 04/04/18 00:01, Janakarajan Natarajan wrote:
>>> @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d,
>>> unsigned int index)
>>>   return >avic_physical_id_table[index];
>>>   }
>>>   +static void avic_vcpu_load(struct vcpu *v)
>>> +{
>>> +    unsigned long tmp;
>>> +    struct arch_svm_struct *s = >arch.hvm_svm;
>>> +    int h_phy_apic_id;
>>> +    struct avic_physical_id_entry *entry = (struct
>>> avic_physical_id_entry *)
>>> +
>>> +    ASSERT(!test_bit(_VPF_blocked, >pause_flags));
>>> +
>>> +    /*
>>> + * Note: APIC ID = 0xff is used for broadcast.
>>> + *   APIC ID > 0xff is reserved.
>>> + */
>>> +    h_phy_apic_id = cpu_data[v->processor].apicid;
>>> +    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
>>> +
>>> +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
>>> +    entry->host_phy_apic_id = h_phy_apic_id;
>>> +    entry->is_running = 1;
>>> +    write_atomic((u64*)(s->avic_last_phy_id), tmp);
>> What is the purpose of s->avic_last_phy_id ?
>>
>> As far as I can tell, it is always an unchanging pointer into the
>> physical ID table, which is only ever updated synchronously in current
>> context.
>>
>> If so, I don't see why it needs any of these hoops to be jumped though.
>
> s->avic_last_phy_id is used to quickly access the entry in the table.
>
> When the code was pushed for Linux, memory barriers were used and it
> was suggested that atomic operations
> be used instead to ensure compiler ordering. The same is done here.

Ok - summing up a conversation on IRC, and some digging around the manual.

Per VM, there is a single Physical APIC Table, which lives in a 4k
page.  This table is referenced by the VMCB, and read by hardware when
processing guest actions.

The contents of this table a list of 64bit entries,

struct __packed avic_physical_id_entry {
u64 host_phy_apic_id  : 8;
u64 res1  : 4;
u64 bk_pg_ptr_mfn : 40;
u64 res2  : 10;
u64 is_running: 1;
u64 valid : 1;
};

which are indexed by guest APIC_ID.

AMD hardware allows writes to the APIC_ID register, but OSes don't do
this in practice (the register is read/discard on some hardware, and
strictly read-only in x2apic).  The implementation in Xen is to crash
the domain if we see a write here, and that is reasonable behaviour
which I don't expect to change going forwards.

As a result, the layout of the Physical APIC Table is fixed based on the
APIC assignment during domain creation.  Also, the bk_pg_ptr_mfn and its
valid bit (valid) are set up during construction, and remain unchanged
for the lifetime of the domain.

The only fields which change during runtime are the host_phys_apic_id,
and its valid bit (is_running), and these change on vcpu context switch.

Therefore, on ctxt_switch_from(), we want a straight __clear_bit() on
e->is_running to signify that the vcpu isn't allocated to a pcpu.

On ctxt_switch_to(), we want a simple

e->host_phy_apic_id = this_pcpu_apic_id;
smp_wmb();
__set_bit(e->is_running);

which guarantees that the host physical apic id field is valid and up to
date, before hardware sees it being reported as valid.  As these changes
are only made in current context, there are no other ordering or
atomicity concerns.

This table is expected to live in regular WB RAM, and the manual has no
comment/reference to requiring special accesses.  Therefore, I'm
moderately confident that the above ordering is sufficient for correct
behaviour, and no explicitly atomic actions are required.

Thoughts/comments/suggestions?

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC

2018-04-19 Thread Natarajan, Janakarajan

On 4/13/2018 12:57 PM, Andrew Cooper wrote:

On 04/04/18 00:01, Janakarajan Natarajan wrote:

@@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d, unsigned 
int index)
  return >avic_physical_id_table[index];
  }
  
+static void avic_vcpu_load(struct vcpu *v)

+{
+unsigned long tmp;
+struct arch_svm_struct *s = >arch.hvm_svm;
+int h_phy_apic_id;
+struct avic_physical_id_entry *entry = (struct avic_physical_id_entry 
*)
+
+ASSERT(!test_bit(_VPF_blocked, >pause_flags));
+
+/*
+ * Note: APIC ID = 0xff is used for broadcast.
+ *   APIC ID > 0xff is reserved.
+ */
+h_phy_apic_id = cpu_data[v->processor].apicid;
+ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
+
+tmp = read_atomic((u64*)(s->avic_last_phy_id));
+entry->host_phy_apic_id = h_phy_apic_id;
+entry->is_running = 1;
+write_atomic((u64*)(s->avic_last_phy_id), tmp);

What is the purpose of s->avic_last_phy_id ?

As far as I can tell, it is always an unchanging pointer into the
physical ID table, which is only ever updated synchronously in current
context.

If so, I don't see why it needs any of these hoops to be jumped though.


s->avic_last_phy_id is used to quickly access the entry in the table.

When the code was pushed for Linux, memory barriers were used and it was 
suggested that atomic operations

be used instead to ensure compiler ordering. The same is done here.



~Andrew



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC

2018-04-13 Thread Andrew Cooper
On 04/04/18 00:01, Janakarajan Natarajan wrote:
> @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d, unsigned 
> int index)
>  return >avic_physical_id_table[index];
>  }
>  
> +static void avic_vcpu_load(struct vcpu *v)
> +{
> +unsigned long tmp;
> +struct arch_svm_struct *s = >arch.hvm_svm;
> +int h_phy_apic_id;
> +struct avic_physical_id_entry *entry = (struct avic_physical_id_entry 
> *)
> +
> +ASSERT(!test_bit(_VPF_blocked, >pause_flags));
> +
> +/*
> + * Note: APIC ID = 0xff is used for broadcast.
> + *   APIC ID > 0xff is reserved.
> + */
> +h_phy_apic_id = cpu_data[v->processor].apicid;
> +ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
> +
> +tmp = read_atomic((u64*)(s->avic_last_phy_id));
> +entry->host_phy_apic_id = h_phy_apic_id;
> +entry->is_running = 1;
> +write_atomic((u64*)(s->avic_last_phy_id), tmp);

What is the purpose of s->avic_last_phy_id ?

As far as I can tell, it is always an unchanging pointer into the
physical ID table, which is only ever updated synchronously in current
context.

If so, I don't see why it needs any of these hoops to be jumped though.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel