Re: [Xen-devel] [PATCH v5] x86/mm: Suppresses vm_events caused by page-walks

2018-04-30 Thread Razvan Cojocaru
On 04/30/2018 11:11 AM, Jan Beulich wrote:
 On 28.04.18 at 08:13,  wrote:
>> On 04/28/2018 12:30 AM, Tamas K Lengyel wrote:
>>> On Mon, Apr 23, 2018 at 2:00 AM, Alexandru Isaila
>>>  wrote:
 This patch is adding a way to enable/disable inguest pagefault
 events. It introduces the xc_monitor_inguest_pagefault function
 and adds the inguest_pagefault_disabled in the monitor structure.
 This is needed by the introspection so it will only get gla
 faults and not get spammed with other faults.
 In p2m_mem_access_check() we emulate so no event will get sent.
>>>
>>> This looks good to me, but is the emulator able to handle all
>>> instructions that may trigger it here?
>>
>> That's a very good question. We think not, but we now have the 
>> UNIMPLEMENTED emulator event. The thought here is that the emulator 
>> would be able to handle most cases, and then the ones it can't handle we 
>> can handle with altp2m.
>>
>> Of course, it's not ideal - we'd rather have a mechanism that's 
>> consistently foolproof, but I believe that Jan's objection is correct: 
>> we can't really be sure that the first time we get into access_check() 
>> with a specific [RIP:GLA] pair we need to set the A bit and the second 
>> time the D bit (interrupts may trip this logic up).
> 
> Interrupts are only one aspect. Insns sent back to guest context for
> retry (like AVX2 gathers would commonly do) are another afaict.
> 
>> Furthermore, with 
>> SVM the GLA is not available for page faults (although that's fixable by 
>> comparing GPAs).
> 
> I may not have enough context here, but is that true when multiple
> linear addresses are mapped to the same physical page?

No, you are right. Quite possibly a case like that can happen where
comparing GPAs is not enough.

So as far as I can tell, we can either do this best-effort thing with
trying to emulate the instruction and hope for the best (and handle
UNIMPLEMENTED when necessary), or A) know exactly when we need to set
the A bit and when the D bit - I've not been able to find a foolproof
way of doing that -, or B) single-step GPT page faults directly on
hardware _in_the_hypervisor_, for which there is currently no mechanism
- although one can be seen as doable on top of the altp2m infrastructure
in the future.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5] x86/mm: Suppresses vm_events caused by page-walks

2018-04-30 Thread Jan Beulich
>>> On 28.04.18 at 08:13,  wrote:
> On 04/28/2018 12:30 AM, Tamas K Lengyel wrote:
>> On Mon, Apr 23, 2018 at 2:00 AM, Alexandru Isaila
>>  wrote:
>>> This patch is adding a way to enable/disable inguest pagefault
>>> events. It introduces the xc_monitor_inguest_pagefault function
>>> and adds the inguest_pagefault_disabled in the monitor structure.
>>> This is needed by the introspection so it will only get gla
>>> faults and not get spammed with other faults.
>>> In p2m_mem_access_check() we emulate so no event will get sent.
>> 
>> This looks good to me, but is the emulator able to handle all
>> instructions that may trigger it here?
> 
> That's a very good question. We think not, but we now have the 
> UNIMPLEMENTED emulator event. The thought here is that the emulator 
> would be able to handle most cases, and then the ones it can't handle we 
> can handle with altp2m.
> 
> Of course, it's not ideal - we'd rather have a mechanism that's 
> consistently foolproof, but I believe that Jan's objection is correct: 
> we can't really be sure that the first time we get into access_check() 
> with a specific [RIP:GLA] pair we need to set the A bit and the second 
> time the D bit (interrupts may trip this logic up).

Interrupts are only one aspect. Insns sent back to guest context for
retry (like AVX2 gathers would commonly do) are another afaict.

> Furthermore, with 
> SVM the GLA is not available for page faults (although that's fixable by 
> comparing GPAs).

I may not have enough context here, but is that true when multiple
linear addresses are mapped to the same physical page?

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5] x86/mm: Suppresses vm_events caused by page-walks

2018-04-28 Thread Tamas K Lengyel
On Sat, Apr 28, 2018 at 12:13 AM, Razvan Cojocaru
 wrote:
> On 04/28/2018 12:30 AM, Tamas K Lengyel wrote:
>>
>> On Mon, Apr 23, 2018 at 2:00 AM, Alexandru Isaila
>>  wrote:
>>>
>>> This patch is adding a way to enable/disable inguest pagefault
>>> events. It introduces the xc_monitor_inguest_pagefault function
>>> and adds the inguest_pagefault_disabled in the monitor structure.
>>> This is needed by the introspection so it will only get gla
>>> faults and not get spammed with other faults.
>>> In p2m_mem_access_check() we emulate so no event will get sent.
>>
>>
>> This looks good to me, but is the emulator able to handle all
>> instructions that may trigger it here?
>
>
> That's a very good question. We think not, but we now have the UNIMPLEMENTED
> emulator event. The thought here is that the emulator would be able to
> handle most cases, and then the ones it can't handle we can handle with
> altp2m.
>
> Of course, it's not ideal - we'd rather have a mechanism that's consistently
> foolproof, but I believe that Jan's objection is correct: we can't really be
> sure that the first time we get into access_check() with a specific
> [RIP:GLA] pair we need to set the A bit and the second time the D bit
> (interrupts may trip this logic up). Furthermore, with SVM the GLA is not
> available for page faults (although that's fixable by comparing GPAs).
>
> If there's a better way to do this optimization that we haven't seen we'd be
> happy to switch the implementation. Suggestions are welcome.

My only worry is that this interface as-is will give someone the false
impression that this will "just work" all the time. Which is not the
case. If it needs the UNIMPLEMENTED event to be enabled as well I
would either make that a requirement for enabling this feature, or at
the minimum extensively documenting how that should be setup,
including an example in xen-access.c

Tamas

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5] x86/mm: Suppresses vm_events caused by page-walks

2018-04-28 Thread Razvan Cojocaru

On 04/28/2018 09:13 AM, Razvan Cojocaru wrote:

On 04/28/2018 12:30 AM, Tamas K Lengyel wrote:

On Mon, Apr 23, 2018 at 2:00 AM, Alexandru Isaila
 wrote:

This patch is adding a way to enable/disable inguest pagefault
events. It introduces the xc_monitor_inguest_pagefault function
and adds the inguest_pagefault_disabled in the monitor structure.
This is needed by the introspection so it will only get gla
faults and not get spammed with other faults.
In p2m_mem_access_check() we emulate so no event will get sent.


This looks good to me, but is the emulator able to handle all
instructions that may trigger it here?


That's a very good question. We think not, but we now have the 
UNIMPLEMENTED emulator event. The thought here is that the emulator 
would be able to handle most cases, and then the ones it can't handle we 
can handle with altp2m.


Of course, it's not ideal - we'd rather have a mechanism that's 
consistently foolproof, but I believe that Jan's objection is correct: 
we can't really be sure that the first time we get into access_check() 
with a specific [RIP:GLA] pair we need to set the A bit and the second 
time the D bit (interrupts may trip this logic up). Furthermore, with 
SVM the GLA is not available for page faults (although that's fixable by 
comparing GPAs).


If there's a better way to do this optimization that we haven't seen 
we'd be happy to switch the implementation. Suggestions are welcome.


The only surefire way we have here is to single-step with the monitor 
trap flag set and with an unrestricted altp2m - or similar. We're doing 
something like that with XenServer:


https://github.com/xenserver/xen.pg/blob/XS-7.1.x/master/xen-reexecute-instn-under-monitor-trap.patch

But IIRC that patch had basically been nacked at the time, and I don't 
see a nice way of making the same logic work with alt2pm _without_ 
having vm_events triggered (which is what this optimization is about), 
due to, mainly, the fact that hostp2m changes propagate into altp2ms.



Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5] x86/mm: Suppresses vm_events caused by page-walks

2018-04-28 Thread Razvan Cojocaru

On 04/28/2018 12:30 AM, Tamas K Lengyel wrote:

On Mon, Apr 23, 2018 at 2:00 AM, Alexandru Isaila
 wrote:

This patch is adding a way to enable/disable inguest pagefault
events. It introduces the xc_monitor_inguest_pagefault function
and adds the inguest_pagefault_disabled in the monitor structure.
This is needed by the introspection so it will only get gla
faults and not get spammed with other faults.
In p2m_mem_access_check() we emulate so no event will get sent.


This looks good to me, but is the emulator able to handle all
instructions that may trigger it here?


That's a very good question. We think not, but we now have the 
UNIMPLEMENTED emulator event. The thought here is that the emulator 
would be able to handle most cases, and then the ones it can't handle we 
can handle with altp2m.


Of course, it's not ideal - we'd rather have a mechanism that's 
consistently foolproof, but I believe that Jan's objection is correct: 
we can't really be sure that the first time we get into access_check() 
with a specific [RIP:GLA] pair we need to set the A bit and the second 
time the D bit (interrupts may trip this logic up). Furthermore, with 
SVM the GLA is not available for page faults (although that's fixable by 
comparing GPAs).


If there's a better way to do this optimization that we haven't seen 
we'd be happy to switch the implementation. Suggestions are welcome.



Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5] x86/mm: Suppresses vm_events caused by page-walks

2018-04-27 Thread Tamas K Lengyel
On Mon, Apr 23, 2018 at 2:00 AM, Alexandru Isaila
 wrote:
> This patch is adding a way to enable/disable inguest pagefault
> events. It introduces the xc_monitor_inguest_pagefault function
> and adds the inguest_pagefault_disabled in the monitor structure.
> This is needed by the introspection so it will only get gla
> faults and not get spammed with other faults.
> In p2m_mem_access_check() we emulate so no event will get sent.

This looks good to me, but is the emulator able to handle all
instructions that may trigger it here?

>
> Signed-off-by: Alexandru Isaila 
>
> ---
> Changes since V4:
> - Added the hvm_emulate_one_vm_event() call in
>   p2m_mem_access_check()
> - Removed p2m_set_ad_bits().
> ---
>  tools/libxc/include/xenctrl.h |  2 ++
>  tools/libxc/xc_monitor.c  | 14 ++
>  xen/arch/x86/mm/mem_access.c  |  9 +
>  xen/arch/x86/monitor.c| 13 +
>  xen/include/asm-x86/domain.h  |  5 +
>  xen/include/asm-x86/monitor.h |  3 ++-
>  xen/include/public/domctl.h   |  2 ++
>  7 files changed, 47 insertions(+), 1 deletion(-)
>
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 09e1363..20c2813 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2056,6 +2056,8 @@ int xc_monitor_descriptor_access(xc_interface *xch, 
> uint32_t domain_id,
>   bool enable);
>  int xc_monitor_guest_request(xc_interface *xch, uint32_t domain_id,
>   bool enable, bool sync, bool allow_userspace);
> +int xc_monitor_inguest_pagefault(xc_interface *xch, uint32_t domain_id,
> + bool disable);
>  int xc_monitor_debug_exceptions(xc_interface *xch, uint32_t domain_id,
>  bool enable, bool sync);
>  int xc_monitor_cpuid(xc_interface *xch, uint32_t domain_id, bool enable);
> diff --git a/tools/libxc/xc_monitor.c b/tools/libxc/xc_monitor.c
> index 0233b87..4ac823e 100644
> --- a/tools/libxc/xc_monitor.c
> +++ b/tools/libxc/xc_monitor.c
> @@ -163,6 +163,20 @@ int xc_monitor_guest_request(xc_interface *xch, uint32_t 
> domain_id, bool enable,
>  return do_domctl(xch, );
>  }
>
> +int xc_monitor_inguest_pagefault(xc_interface *xch, uint32_t domain_id,
> +bool disable)
> +{
> +DECLARE_DOMCTL;
> +
> +domctl.cmd = XEN_DOMCTL_monitor_op;
> +domctl.domain = domain_id;
> +domctl.u.monitor_op.op = disable ? XEN_DOMCTL_MONITOR_OP_ENABLE
> +: XEN_DOMCTL_MONITOR_OP_DISABLE;
> +domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_INGUEST_PAGEFAULT;
> +
> +return do_domctl(xch, );
> +}
> +
>  int xc_monitor_emulate_each_rep(xc_interface *xch, uint32_t domain_id,
>  bool enable)
>  {
> diff --git a/xen/arch/x86/mm/mem_access.c b/xen/arch/x86/mm/mem_access.c
> index c0cd017..febe38d 100644
> --- a/xen/arch/x86/mm/mem_access.c
> +++ b/xen/arch/x86/mm/mem_access.c
> @@ -28,6 +28,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>
>  #include "mm-locks.h"
> @@ -207,6 +208,14 @@ bool p2m_mem_access_check(paddr_t gpa, unsigned long gla,
>  return true;
>  }
>  }
> +if ( vm_event_check_ring(d->vm_event_monitor) &&
> + d->arch.monitor.inguest_pagefault_disabled &&
> + npfec.kind != npfec_kind_with_gla ) /* don't send a mem_event */
> +{
> +hvm_emulate_one_vm_event(EMUL_KIND_NORMAL, TRAP_invalid_op, 
> X86_EVENT_NO_EC);
> +
> +return true;
> +}
>
>  *req_ptr = NULL;
>  req = xzalloc(vm_event_request_t);
> diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c
> index f229e69..ce5d1ba 100644
> --- a/xen/arch/x86/monitor.c
> +++ b/xen/arch/x86/monitor.c
> @@ -241,6 +241,19 @@ int arch_monitor_domctl_event(struct domain *d,
>  break;
>  }
>
> +case XEN_DOMCTL_MONITOR_EVENT_INGUEST_PAGEFAULT:
> +{
> +bool old_status = ad->monitor.inguest_pagefault_disabled;
> +
> +if ( unlikely(old_status == requested_status) )
> +return -EEXIST;
> +
> +domain_pause(d);
> +ad->monitor.inguest_pagefault_disabled = requested_status;
> +domain_unpause(d);
> +break;
> +}
> +
>  case XEN_DOMCTL_MONITOR_EVENT_DESC_ACCESS:
>  {
>  bool old_status = ad->monitor.descriptor_access_enabled;
> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> index 4679d54..fbdc392 100644
> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -412,6 +412,11 @@ struct arch_domain
>  unsigned int descriptor_access_enabled : 
> 1;
>  unsigned int guest_request_userspace_enabled   : 
> 1;
>  unsigned int emul_unimplemented_enabled: 
> 1;

Re: [Xen-devel] [PATCH v5] x86/mm: Suppresses vm_events caused by page-walks

2018-04-23 Thread Wei Liu
On Mon, Apr 23, 2018 at 11:00:50AM +0300, Alexandru Isaila wrote:
> This patch is adding a way to enable/disable inguest pagefault
> events. It introduces the xc_monitor_inguest_pagefault function
> and adds the inguest_pagefault_disabled in the monitor structure.
> This is needed by the introspection so it will only get gla
> faults and not get spammed with other faults.
> In p2m_mem_access_check() we emulate so no event will get sent.
> 
> Signed-off-by: Alexandru Isaila 
> 
> ---
> Changes since V4:
>   - Added the hvm_emulate_one_vm_event() call in
> p2m_mem_access_check()
> - Removed p2m_set_ad_bits().
> ---
>  tools/libxc/include/xenctrl.h |  2 ++
>  tools/libxc/xc_monitor.c  | 14 ++

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v5] x86/mm: Suppresses vm_events caused by page-walks

2018-04-23 Thread Alexandru Isaila
This patch is adding a way to enable/disable inguest pagefault
events. It introduces the xc_monitor_inguest_pagefault function
and adds the inguest_pagefault_disabled in the monitor structure.
This is needed by the introspection so it will only get gla
faults and not get spammed with other faults.
In p2m_mem_access_check() we emulate so no event will get sent.

Signed-off-by: Alexandru Isaila 

---
Changes since V4:
- Added the hvm_emulate_one_vm_event() call in
  p2m_mem_access_check()
- Removed p2m_set_ad_bits().
---
 tools/libxc/include/xenctrl.h |  2 ++
 tools/libxc/xc_monitor.c  | 14 ++
 xen/arch/x86/mm/mem_access.c  |  9 +
 xen/arch/x86/monitor.c| 13 +
 xen/include/asm-x86/domain.h  |  5 +
 xen/include/asm-x86/monitor.h |  3 ++-
 xen/include/public/domctl.h   |  2 ++
 7 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 09e1363..20c2813 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2056,6 +2056,8 @@ int xc_monitor_descriptor_access(xc_interface *xch, 
uint32_t domain_id,
  bool enable);
 int xc_monitor_guest_request(xc_interface *xch, uint32_t domain_id,
  bool enable, bool sync, bool allow_userspace);
+int xc_monitor_inguest_pagefault(xc_interface *xch, uint32_t domain_id,
+ bool disable);
 int xc_monitor_debug_exceptions(xc_interface *xch, uint32_t domain_id,
 bool enable, bool sync);
 int xc_monitor_cpuid(xc_interface *xch, uint32_t domain_id, bool enable);
diff --git a/tools/libxc/xc_monitor.c b/tools/libxc/xc_monitor.c
index 0233b87..4ac823e 100644
--- a/tools/libxc/xc_monitor.c
+++ b/tools/libxc/xc_monitor.c
@@ -163,6 +163,20 @@ int xc_monitor_guest_request(xc_interface *xch, uint32_t 
domain_id, bool enable,
 return do_domctl(xch, );
 }
 
+int xc_monitor_inguest_pagefault(xc_interface *xch, uint32_t domain_id,
+bool disable)
+{
+DECLARE_DOMCTL;
+
+domctl.cmd = XEN_DOMCTL_monitor_op;
+domctl.domain = domain_id;
+domctl.u.monitor_op.op = disable ? XEN_DOMCTL_MONITOR_OP_ENABLE
+: XEN_DOMCTL_MONITOR_OP_DISABLE;
+domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_INGUEST_PAGEFAULT;
+
+return do_domctl(xch, );
+}
+
 int xc_monitor_emulate_each_rep(xc_interface *xch, uint32_t domain_id,
 bool enable)
 {
diff --git a/xen/arch/x86/mm/mem_access.c b/xen/arch/x86/mm/mem_access.c
index c0cd017..febe38d 100644
--- a/xen/arch/x86/mm/mem_access.c
+++ b/xen/arch/x86/mm/mem_access.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "mm-locks.h"
@@ -207,6 +208,14 @@ bool p2m_mem_access_check(paddr_t gpa, unsigned long gla,
 return true;
 }
 }
+if ( vm_event_check_ring(d->vm_event_monitor) &&
+ d->arch.monitor.inguest_pagefault_disabled &&
+ npfec.kind != npfec_kind_with_gla ) /* don't send a mem_event */
+{
+hvm_emulate_one_vm_event(EMUL_KIND_NORMAL, TRAP_invalid_op, 
X86_EVENT_NO_EC);
+
+return true;
+}
 
 *req_ptr = NULL;
 req = xzalloc(vm_event_request_t);
diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c
index f229e69..ce5d1ba 100644
--- a/xen/arch/x86/monitor.c
+++ b/xen/arch/x86/monitor.c
@@ -241,6 +241,19 @@ int arch_monitor_domctl_event(struct domain *d,
 break;
 }
 
+case XEN_DOMCTL_MONITOR_EVENT_INGUEST_PAGEFAULT:
+{
+bool old_status = ad->monitor.inguest_pagefault_disabled;
+
+if ( unlikely(old_status == requested_status) )
+return -EEXIST;
+
+domain_pause(d);
+ad->monitor.inguest_pagefault_disabled = requested_status;
+domain_unpause(d);
+break;
+}
+
 case XEN_DOMCTL_MONITOR_EVENT_DESC_ACCESS:
 {
 bool old_status = ad->monitor.descriptor_access_enabled;
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 4679d54..fbdc392 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -412,6 +412,11 @@ struct arch_domain
 unsigned int descriptor_access_enabled : 1;
 unsigned int guest_request_userspace_enabled   : 1;
 unsigned int emul_unimplemented_enabled: 1;
+/*
+ * By default all events are sent.
+ * This is used to filter out pagefaults.
+ */
+unsigned int inguest_pagefault_disabled: 1;
 struct monitor_msr_bitmap *msr_bitmap;
 uint64_t write_ctrlreg_mask[4];
 } monitor;
diff --git a/xen/include/asm-x86/monitor.h b/xen/include/asm-x86/monitor.h
index a0444d1..647df4a 100644
---