Re: [Xen-devel] [PATCH 4/4] vm_event: Add support for multi-page ring buffer

2018-09-24 Thread Petre Pircalabu
On Tue, 2018-09-18 at 06:58 -0600, Jan Beulich wrote:
> > > > On 13.09.18 at 17:02,  wrote:
> > 
> > --- a/xen/arch/x86/domain_page.c
> > +++ b/xen/arch/x86/domain_page.c
> > @@ -331,10 +331,9 @@ void *__map_domain_pages_global(const struct
> > page_info *pg, unsigned int nr)
> >  {
> >  mfn_t mfn[nr];
> >  int i;
> > -struct page_info *cur_pg = (struct page_info *)&pg[0];
> >  
> >  for (i = 0; i < nr; i++)
> > -mfn[i] = page_to_mfn(cur_pg++);
> > +mfn[i] = page_to_mfn(pg++);
> >  
> >  return map_domain_pages_global(mfn, nr);
> >  }
> 
> Please could you avoid having such random unrelated changes in your
> patches?
> 
Thank you very much for noticing it, I've completely missed it. However,
I will drop the map_domain_pages functionality altogether, as Andrew pointed
out it is not necessary to have a physically contiguous area.

> > @@ -4415,6 +4416,19 @@ int arch_acquire_resource(struct domain *d,
> > unsigned int type,
> >  }
> >  #endif
> >  
> > +case XENMEM_resource_vm_event:
> > +{
> > +rc = vm_event_get_ring_frames(d, id, frame, nr_frames,
> > mfn_list);
> > +if ( rc )
> > +break;
> > +/*
> > + * The frames will have been assigned to the domain that
> > created
> > + * the ioreq server.
> > + */
> > +*flags |= XENMEM_rsrc_acq_caller_owned;
> > +break;
> > +}
> 
> Isn't it wrong to assume that the monitor application will run in the
> same
> domain as the / one ioreq server?

I will remove the ioreq server comment as it's misleading. This call does
not use an ioreq server; it only allocates the memory for the vm_event
ring buffer.

> > --- a/xen/common/vm_event.c
> > +++ b/xen/common/vm_event.c
> > @@ -39,16 +39,66 @@
> >  #define vm_event_ring_lock(_ved)   spin_lock(&(_ved)->ring_lock)
> >  #define vm_event_ring_unlock(_ved) spin_unlock(&(_ved)->ring_lock)
> >  
> > +#define XEN_VM_EVENT_ALLOC_FROM_DOMHEAP 0x
> 
> Note this constant's type vs ...
> 
> >  static int vm_event_enable(
> >  struct domain *d,
> > -struct xen_domctl_vm_event_op *vec,
> >  struct vm_event_domain **ved,
> > +unsigned long param,
> 
> ... the function parameter type here. I don't see why this can't be
> "unsigned int".
Fixed.
> 
> > @@ -88,12 +151,12 @@ static int vm_event_enable(
> >  if ( rc < 0 )
> >  goto err;
> >  
> > -(*ved)->xen_port = vec->port = rc;
> > +(*ved)->xen_port =  rc;
> 
> Yet another unrelated change? It looks to be replaced by code further
> down (in the callers), but it's not clear to me why the change is
> needed
> (here). Perhaps worth splitting up the patch a little?
vm_event_enable can be called from both vm_event_domctl and
vm_event_get_ring_frames (the acquire_resource call path). I have removed
the vec parameter (struct xen_domctl_vm_event_op) to make the function
caller-agnostic.
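For illustration, a minimal sketch of that split, using the monitor ring as
an example (hypothetical, inferred from the quoted hunks; the pause-flag
argument and other details of the real patch are omitted):

/* Sketch: vm_event_enable() no longer sees the domctl structure at all. */
static int vm_event_enable(struct domain *d, struct vm_event_domain **ved,
                           unsigned int param, /* HVM param or domheap selector */
                           xen_event_channel_notification_t notification_fn);

/* vm_event_domctl() path: the caller copies the port into the reply. */
rc = vm_event_enable(d, &d->vm_event_monitor, HVM_PARAM_MONITOR_RING_PFN,
                     monitor_notification);
if ( rc == 0 )
    vec->port = d->vm_event_monitor->xen_port;

/* vm_event_get_ring_frames() path: no xen_domctl_vm_event_op involved. */
rc = vm_event_enable(d, &d->vm_event_monitor, XEN_VM_EVENT_ALLOC_FROM_DOMHEAP,
                     monitor_notification);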

Many thanks for your comments,
Petre
> 
> Jan
> 
> 


Re: [Xen-devel] [PATCH 4/4] vm_event: Add support for multi-page ring buffer

2018-09-24 Thread Petre Pircalabu
On Mon, 2018-09-17 at 15:41 +0100, Andrew Cooper wrote:
> On 13/09/18 16:02, Petre Pircalabu wrote:
> > In high throughput introspection scenarios where lots of monitor
> > vm_events are generated, the ring buffer can fill up before the
> > monitor
> > application gets a chance to handle all the requests thus blocking
> > other vcpus which will have to wait for a slot to become available.
> > 
> > This patch adds support for extending the ring buffer by allocating
> > a
> > number of pages from domheap and mapping them to the monitor
> > application's domain using the foreignmemory_map_resource
> > interface.
> > Unlike the current implementation, the ring buffer pages are not
> > part of
> > the introspected DomU, so they will not be reclaimed when the
> > monitor is
> > disabled.
> > 
> > Signed-off-by: Petre Pircalabu 
> 
> What about the slotted format for the synchronous events?  While this
> is
> fine for the async bits, I don't think we want to end up changing the
> mapping API twice.
> 
> Simply increasing the size of the ring puts more pressure on the
I'm currently investigating this approach and I will send an
implementation proposal with the next version of the patch.
> 
> > diff --git a/xen/arch/x86/domain_page.c
> > b/xen/arch/x86/domain_page.c
> > index 0d23e52..2a9cbf3 100644
> > --- a/xen/arch/x86/domain_page.c
> > +++ b/xen/arch/x86/domain_page.c
> > @@ -331,10 +331,9 @@ void *__map_domain_pages_global(const struct
> > page_info *pg, unsigned int nr)
> >  {
> >  mfn_t mfn[nr];
> >  int i;
> > -struct page_info *cur_pg = (struct page_info *)&pg[0];
> >  
> >  for (i = 0; i < nr; i++)
> > -mfn[i] = page_to_mfn(cur_pg++);
> > +mfn[i] = page_to_mfn(pg++);
> 
> This hunk looks like it should be in the previous patch?  That
> said...
Yep, I've completely missed it. This piece of code will be removed
along with the map_domain_pages_global patch.
> 
> >  
> >  return map_domain_pages_global(mfn, nr);
> >  }
> > diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
> > index 4793aac..faece3c 100644
> > --- a/xen/common/vm_event.c
> > +++ b/xen/common/vm_event.c
> > @@ -39,16 +39,66 @@
> >  #define vm_event_ring_lock(_ved)   spin_lock(&(_ved)->ring_lock)
> >  #define vm_event_ring_unlock(_ved) spin_unlock(&(_ved)->ring_lock)
> >  
> > +#define XEN_VM_EVENT_ALLOC_FROM_DOMHEAP 0x
> > +
> > +static int vm_event_alloc_ring(struct domain *d, struct
> > vm_event_domain *ved)
> > +{
> > +struct page_info *page;
> > +void *va = NULL;
> > +int i, rc = -ENOMEM;
> > +
> > +page = alloc_domheap_pages(d, ved->ring_order,
> > MEMF_no_refcount);
> > +if ( !page )
> > +return -ENOMEM;
> 
> ... what is wrong with vzalloc()?
> 
> You don't want to be making a ring_order allocation, especially as
> the
> order grows.  All you need are some mappings which are virtually
> contiguous, not physically contiguous.
Unfortunately, vzalloc doesn't work here: the acquire_resource call
succeeds, but the subsequent mapping fails:
(XEN) mm.c:1024:d0v5 pg_owner d0 l1e_owner d0, but real_pg_owner d-1
(XEN) mm.c:1100:d0v5 Error getting mfn dd24d (pfn )
from L1 entry 8000dd24d227 for l1e_owner d0, pg_owner d0
(XEN) mm.c:1024:d0v5 pg_owner d0 l1e_owner d0, but real_pg_owner d-1
(XEN) mm.c:1100:d0v5 Error getting mfn dd24c (pfn )
from L1 entry 8000dd24c227 for l1e_owner d0, pg_owner d0

However, allocating each page with alloc_domheap_page and then mapping
them using vmap does the trick.
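For reference, a minimal sketch of that approach (illustrative only, not the
code pushed to the branch; nr_frames and the pages/mfns arrays are
placeholder names):

#include <xen/domain_page.h>
#include <xen/mm.h>
#include <xen/sched.h>
#include <xen/vmap.h>

/*
 * Allocate nr_frames individual domheap pages owned by d and stitch them
 * into one virtually contiguous mapping with vmap(); no physical
 * contiguity is required.
 */
static void *alloc_ring_sketch(struct domain *d, unsigned int nr_frames,
                               struct page_info **pages, mfn_t *mfns)
{
    unsigned int i;
    void *va;

    for ( i = 0; i < nr_frames; i++ )
    {
        pages[i] = alloc_domheap_page(d, MEMF_no_refcount);
        if ( !pages[i] )
            goto err;
        mfns[i] = page_to_mfn(pages[i]);
        clear_domain_page(mfns[i]);     /* start with a zeroed ring */
    }

    va = vmap(mfns, nr_frames);         /* one virtually contiguous area */
    if ( va )
        return va;

 err:
    while ( i-- )
        free_domheap_page(pages[i]);
    return NULL;
}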
Until the next version is ready (slotted format for synchronous events),
I have pushed an intermediate version which addresses the issues raised
by you and Jan to my github fork of the xen repository:
https://github.com/petrepircalabu/xen/tree/multi_page_ring_buffer/devel_new

> ~Andrew

Many thanks for your support,
Petre


Re: [Xen-devel] [PATCH 4/4] vm_event: Add support for multi-page ring buffer

2018-09-18 Thread Jan Beulich
>>> On 13.09.18 at 17:02,  wrote:
> --- a/xen/arch/x86/domain_page.c
> +++ b/xen/arch/x86/domain_page.c
> @@ -331,10 +331,9 @@ void *__map_domain_pages_global(const struct page_info 
> *pg, unsigned int nr)
>  {
>  mfn_t mfn[nr];
>  int i;
> -struct page_info *cur_pg = (struct page_info *)&pg[0];
>  
>  for (i = 0; i < nr; i++)
> -mfn[i] = page_to_mfn(cur_pg++);
> +mfn[i] = page_to_mfn(pg++);
>  
>  return map_domain_pages_global(mfn, nr);
>  }

Please could you avoid having such random unrelated changes in your patches?

> @@ -4415,6 +4416,19 @@ int arch_acquire_resource(struct domain *d, unsigned 
> int type,
>  }
>  #endif
>  
> +case XENMEM_resource_vm_event:
> +{
> +rc = vm_event_get_ring_frames(d, id, frame, nr_frames, mfn_list);
> +if ( rc )
> +break;
> +/*
> + * The frames will have been assigned to the domain that created
> + * the ioreq server.
> + */
> +*flags |= XENMEM_rsrc_acq_caller_owned;
> +break;
> +}

Isn't it wrong to assume that the monitor application will run in the same
domain as the / one ioreq server?

> --- a/xen/common/vm_event.c
> +++ b/xen/common/vm_event.c
> @@ -39,16 +39,66 @@
>  #define vm_event_ring_lock(_ved)   spin_lock(&(_ved)->ring_lock)
>  #define vm_event_ring_unlock(_ved) spin_unlock(&(_ved)->ring_lock)
>  
> +#define XEN_VM_EVENT_ALLOC_FROM_DOMHEAP 0x

Note this constant's type vs ...

>  static int vm_event_enable(
>  struct domain *d,
> -struct xen_domctl_vm_event_op *vec,
>  struct vm_event_domain **ved,
> +unsigned long param,

... the function parameter type here. I don't see why this can't be
"unsigned int".

> @@ -88,12 +151,12 @@ static int vm_event_enable(
>  if ( rc < 0 )
>  goto err;
>  
> -(*ved)->xen_port = vec->port = rc;
> +(*ved)->xen_port =  rc;

Yet another unrelated change? It looks to be replaced by code further
down (in the callers), but it's not clear to me why the change is needed
(here). Perhaps worth splitting up the patch a little?

Jan




Re: [Xen-devel] [PATCH 4/4] vm_event: Add support for multi-page ring buffer

2018-09-17 Thread Andrew Cooper
On 13/09/18 16:02, Petre Pircalabu wrote:
> In high throughput introspection scenarios where lots of monitor
> vm_events are generated, the ring buffer can fill up before the monitor
> application gets a chance to handle all the requests thus blocking
> other vcpus which will have to wait for a slot to become available.
>
> This patch adds support for extending the ring buffer by allocating a
> number of pages from domheap and mapping them to the monitor
> application's domain using the foreignmemory_map_resource interface.
> Unlike the current implementation, the ring buffer pages are not part of
> the introspected DomU, so they will not be reclaimed when the monitor is
> disabled.
>
> Signed-off-by: Petre Pircalabu 

What about the slotted format for the synchronous events?  While this is
fine for the async bits, I don't think we want to end up changing the
mapping API twice.

Simply increasing the size of the ring puts more pressure on the

> diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
> index 0d23e52..2a9cbf3 100644
> --- a/xen/arch/x86/domain_page.c
> +++ b/xen/arch/x86/domain_page.c
> @@ -331,10 +331,9 @@ void *__map_domain_pages_global(const struct page_info 
> *pg, unsigned int nr)
>  {
>  mfn_t mfn[nr];
>  int i;
> -struct page_info *cur_pg = (struct page_info *)&pg[0];
>  
>  for (i = 0; i < nr; i++)
> -mfn[i] = page_to_mfn(cur_pg++);
> +mfn[i] = page_to_mfn(pg++);

This hunk looks like it should be in the previous patch?  That said...

>  
>  return map_domain_pages_global(mfn, nr);
>  }
> diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
> index 4793aac..faece3c 100644
> --- a/xen/common/vm_event.c
> +++ b/xen/common/vm_event.c
> @@ -39,16 +39,66 @@
>  #define vm_event_ring_lock(_ved)   spin_lock(&(_ved)->ring_lock)
>  #define vm_event_ring_unlock(_ved) spin_unlock(&(_ved)->ring_lock)
>  
> +#define XEN_VM_EVENT_ALLOC_FROM_DOMHEAP 0x
> +
> +static int vm_event_alloc_ring(struct domain *d, struct vm_event_domain *ved)
> +{
> +struct page_info *page;
> +void *va = NULL;
> +int i, rc = -ENOMEM;
> +
> +page = alloc_domheap_pages(d, ved->ring_order, MEMF_no_refcount);
> +if ( !page )
> +return -ENOMEM;

... what is wrong with vzalloc()?

You don't want to be making a ring_order allocation, especially as the
order grows.  All you need are some mappings which are virtually
contiguous, not physically contiguous.

~Andrew


Re: [Xen-devel] [PATCH 4/4] vm_event: Add support for multi-page ring buffer

2018-09-14 Thread Petre Pircalabu
On Thu, 2018-09-13 at 10:42 -0600, Tamas K Lengyel wrote:
> On Thu, Sep 13, 2018 at 9:02 AM Petre Pircalabu
>  wrote:
> > 
> > 
> > In high throughput introspection scenarios where lots of monitor
> > vm_events are generated, the ring buffer can fill up before the
> > monitor
> > application gets a chance to handle all the requests thus blocking
> > other vcpus which will have to wait for a slot to become available.
> > 
> > This patch adds support for extending the ring buffer by allocating
> > a
> > number of pages from domheap and mapping them to the monitor
> > application's domain using the foreignmemory_map_resource
> > interface.
> > Unlike the current implementation, the ring buffer pages are not
> > part of
> > the introspected DomU, so they will not be reclaimed when the
> > monitor is
> > disabled.
> > 
> > Signed-off-by: Petre Pircalabu 
> Thanks for this addition, it has been on the TODO for a long while
> now. Could you also please push the patches as a git branch
> somewhere?
I've pushed it to my github repository (branch: multi_page_ring_buffer/devel_new):
https://github.com/petrepircalabu/xen/tree/multi_page_ring_buffer/devel_new
> 
> > 
> > ---
> >  tools/libxc/include/xenctrl.h       |   2 +
> >  tools/libxc/xc_monitor.c            |   7 +
> >  tools/libxc/xc_private.h            |   3 +
> >  tools/libxc/xc_vm_event.c           |  49 +++
> > 
> > +xenaccess->vm_event.domain_id,
> > +xenaccess->vm_event.ring_page_count,
> > +&xenaccess->vm_event.evtchn_port);
> > +
> > +if (xenaccess->vm_event.ring_buffer == NULL && errno == EOPNOTSUPP)
> How would this situation ever arise? If there is a chance that you
> can't setup multi-page rings, perhaps adding a hypercall that would
> tell the user how many pages are max available for the ring is the
> better route. This just seems like guessing right now.
> 
The multi-page ring buffer is mapped using
xenforeignmemory_map_resource, which relies on IOCTL_PRIVCMD_MMAP_RESOURCE.
This ioctl was added in kernel 4.18.1, which is a relatively new
kernel. If the monitor domain's kernel doesn't support it, the call sets
errno to EOPNOTSUPP.
> > 
> > +{
> > +xenaccess->vm_event.ring_page_count = 1;
> > +xenaccess->vm_event.ring_buffer =
> >  xc_monitor_enable(xenaccess->xc_handle,
> >    xenaccess->vm_event.domain_id,
> >    &xenaccess->vm_event.evtchn_port);
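Putting the two halves of the quoted hunk together, the fallback described
above looks roughly like this (a sketch only; xc_monitor_enable_ex() and the
page-count argument follow this patch's proposal and may change in the next
version):

/* Try the multi-page ring first; this needs IOCTL_PRIVCMD_MMAP_RESOURCE,
 * i.e. a >= 4.18 kernel in the monitor domain. */
xenaccess->vm_event.ring_buffer =
    xc_monitor_enable_ex(xenaccess->xc_handle,
                         xenaccess->vm_event.domain_id,
                         xenaccess->vm_event.ring_page_count,
                         &xenaccess->vm_event.evtchn_port);

/* Older kernel: fall back to the classic single-page ring. */
if (xenaccess->vm_event.ring_buffer == NULL && errno == EOPNOTSUPP)
{
    xenaccess->vm_event.ring_page_count = 1;
    xenaccess->vm_event.ring_buffer =
        xc_monitor_enable(xenaccess->xc_handle,
                          xenaccess->vm_event.domain_id,
                          &xenaccess->vm_event.evtchn_port);
}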


Re: [Xen-devel] [PATCH 4/4] vm_event: Add support for multi-page ring buffer

2018-09-13 Thread Tamas K Lengyel
On Thu, Sep 13, 2018 at 9:02 AM Petre Pircalabu
 wrote:
>
> In high throughput introspection scenarios where lots of monitor
> vm_events are generated, the ring buffer can fill up before the monitor
> application gets a chance to handle all the requests thus blocking
> other vcpus which will have to wait for a slot to become available.
>
> This patch adds support for extending the ring buffer by allocating a
> number of pages from domheap and mapping them to the monitor
> application's domain using the foreignmemory_map_resource interface.
> Unlike the current implementation, the ring buffer pages are not part of
> the introspected DomU, so they will not be reclaimed when the monitor is
> disabled.
>
> Signed-off-by: Petre Pircalabu 

Thanks for this addition, it has been on the TODO for a long while
now. Could you also please push the patches as a git branch somewhere?

> ---
>  tools/libxc/include/xenctrl.h       |   2 +
>  tools/libxc/xc_monitor.c            |   7 +
>  tools/libxc/xc_private.h            |   3 +
>  tools/libxc/xc_vm_event.c           |  49 +++
>  tools/tests/xen-access/xen-access.c |  33 +++--
>  xen/arch/x86/domain_page.c          |   3 +-
>  xen/arch/x86/mm.c                   |  14 ++
>  xen/common/vm_event.c               | 258 +++-
>  xen/include/public/domctl.h         |   1 +
>  xen/include/public/memory.h         |   1 +
>  xen/include/xen/sched.h             |   5 +-
>  xen/include/xen/vm_event.h          |   4 +
>  12 files changed, 305 insertions(+), 75 deletions(-)
>
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index bb75bcc..4f91ee9 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2005,6 +2005,8 @@ int xc_get_mem_access(xc_interface *xch, uint32_t 
> domain_id,
>   * Caller has to unmap this page when done.
>   */
>  void *xc_monitor_enable(xc_interface *xch, uint32_t domain_id, uint32_t 
> *port);
> +void *xc_monitor_enable_ex(xc_interface *xch, uint32_t domain_id,
> +   int order, uint32_t *port);

I don't think there is value in keeping both versions of the API
around. As libxc is not considered to be a stable API, extending the
existing API would be the route I would prefer. Also, perhaps instead
of "order", name the input variable "page_count" to make it clearer
what it's about.
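A hypothetical shape for that suggestion (not an existing libxc prototype;
existing single-page users would simply pass 1):

/* Extend the existing call instead of adding an _ex variant. */
void *xc_monitor_enable(xc_interface *xch, uint32_t domain_id,
                        uint32_t page_count, uint32_t *port);

/* e.g. in xen-access: */
ring = xc_monitor_enable(xch, domain_id, 1, &port);           /* old behaviour   */
ring = xc_monitor_enable(xch, domain_id, page_count, &port);  /* multi-page ring */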

>  int xc_monitor_disable(xc_interface *xch, uint32_t domain_id);
>  int xc_monitor_resume(xc_interface *xch, uint32_t domain_id);
>  /*
> diff --git a/tools/libxc/xc_monitor.c b/tools/libxc/xc_monitor.c
> index 15e6a0e..5188835 100644
> --- a/tools/libxc/xc_monitor.c
> +++ b/tools/libxc/xc_monitor.c
> @@ -28,6 +28,13 @@ void *xc_monitor_enable(xc_interface *xch, uint32_t 
> domain_id, uint32_t *port)
>port);
>  }
>
> +void *xc_monitor_enable_ex(xc_interface *xch, uint32_t domain_id, int order,
> +   uint32_t *port)
> +{
> +return xc_vm_event_enable_ex(xch, domain_id, XEN_VM_EVENT_TYPE_MONITOR,
> + order, port);
> +}
> +
>  int xc_monitor_disable(xc_interface *xch, uint32_t domain_id)
>  {
>  return xc_vm_event_control(xch, domain_id,
> diff --git a/tools/libxc/xc_private.h b/tools/libxc/xc_private.h
> index be22986..03d9460 100644
> --- a/tools/libxc/xc_private.h
> +++ b/tools/libxc/xc_private.h
> @@ -436,6 +436,9 @@ int xc_vm_event_control(xc_interface *xch, uint32_t 
> domain_id, unsigned int op,
>  void *xc_vm_event_enable(xc_interface *xch, uint32_t domain_id, int type,
>   uint32_t *port);
>
> +void *xc_vm_event_enable_ex(xc_interface *xch, uint32_t domain_id, int type,
> +int order, uint32_t *port);
> +
>  int do_dm_op(xc_interface *xch, uint32_t domid, unsigned int nr_bufs, ...);
>
>  #endif /* __XC_PRIVATE_H__ */
> diff --git a/tools/libxc/xc_vm_event.c b/tools/libxc/xc_vm_event.c
> index de37ca5..216bbe2 100644
> --- a/tools/libxc/xc_vm_event.c
> +++ b/tools/libxc/xc_vm_event.c
> @@ -21,6 +21,7 @@
>   */
>
>  #include "xc_private.h"
> +#include "xenforeignmemory.h"
>
>  int xc_vm_event_control(xc_interface *xch, uint32_t domain_id, unsigned int 
> op,
>  unsigned int mode, uint32_t *port)
> @@ -184,6 +185,54 @@ void *xc_vm_event_enable(xc_interface *xch, uint32_t 
> domain_id, int type,
>  return ring_page;
>  }
>
> +void *xc_vm_event_enable_ex(xc_interface *xch, uint32_t domain_id, int type,
> +int order, uint32_t *port)
> +{
> +xenforeignmemory_resource_handle *fres = NULL;
> +int saved_errno;
> +void *ring_buffer = NULL;
> +
> +if ( !port )
> +{
> +errno = EINVAL;
> +return NULL;
> +}
> +
> +/* Pause the domain for ring page setup */
> +if ( xc_domain_pause(xch, domain_id) )
> +{
> +PERROR("Unable to pause domain\n");
> +return NULL;
> +}
> +
> +fres =