Re: [Xen-devel] [PATCH 07/16] SUPPORT.md: Add virtual devices common to ARM and x86

2017-11-21 Thread Paul Durrant
> -----Original Message-----
[snip]
> > +### PV keyboard (frontend)
> > +
> > +Status, Linux (xen-kbdfront): Supported
> > +Status, Windows: Supported
> > +
> > +Guest-side driver capable of speaking the Xen PV keyboard protocol
> 
> Are these three active/usable in guests regardless of whether the
> guest is being run PV, PVH, or HVM? If not, wouldn't this need
> spelling out?
> 

I believe the necessary patches to make the PV vkbd protocol usable
independently of vfb are at least queued for upstream QEMU.

Stefano, am I correct?

Cheers,

  Paul

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen-netfront: remove warning when unloading module

2017-11-20 Thread Paul Durrant
> -----Original Message-----
> From: Eduardo Otubo [mailto:ot...@redhat.com]
> Sent: 20 November 2017 10:41
> To: xen-de...@lists.xenproject.org
> Cc: net...@vger.kernel.org; Paul Durrant <paul.durr...@citrix.com>; Wei
> Liu <wei.l...@citrix.com>; linux-ker...@vger.kernel.org;
> vkuzn...@redhat.com; cav...@redhat.com; che...@redhat.com;
> mga...@redhat.com; Eduardo Otubo <ot...@redhat.com>
> Subject: [PATCH] xen-netfront: remove warning when unloading module
> 
> When unloading module xen_netfront from guest, dmesg would output
> warning messages like below:
> 
>   [  105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use!
>   [  105.236839] deferring g.e. 0x903 (pfn 0x35805)
> 
> This problem relies on netfront and netback being out of sync. By the time
> netfront revokes the g.e.'s netback didn't have enough time to free all of
> them, hence displaying the warnings on dmesg.
> 
> The trick here is to make netfront wait until netback frees all the g.e.'s
> and only then continue the cleanup for the module removal. This is done by
> manipulating both device states.
> 
> Signed-off-by: Eduardo Otubo <ot...@redhat.com>
> ---
>  drivers/net/xen-netfront.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 8b8689c6d887..b948e2a1ce40 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -2130,6 +2130,17 @@ static int xennet_remove(struct xenbus_device
> *dev)
> 
>   dev_dbg(&dev->dev, "%s\n", dev->nodename);
> 
> + xenbus_switch_state(dev, XenbusStateClosing);
> + while (xenbus_read_driver_state(dev->otherend) !=
> XenbusStateClosing){
> + cpu_relax();
> + schedule();
> + }
> + xenbus_switch_state(dev, XenbusStateClosed);
> + while (dev->xenbus_state != XenbusStateClosed){
> + cpu_relax();
> + schedule();
> + }
> +

Waiting for closing should be ok but waiting for closed is risky. As soon as a
backend is in the closed state then a toolstack can completely remove the
backend xenstore area, resulting in a state of XenbusStateUnknown, which would
cause your second loop to spin forever.

  Paul

>   xennet_disconnect_backend(info);
> 
>   unregister_netdev(info->netdev);
> --
> 2.13.6
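
One way to address the concern raised above would be to treat
XenbusStateUnknown as a terminal state alongside XenbusStateClosed, so that
the wait ends even if the toolstack has already removed the backend area.
What follows is only a minimal userspace sketch of that exit condition, not
a proposed patch. The enum mirrors the values in Xen's public io/xenbus.h;
backend_gone_or_closed() and polls_until_done() are hypothetical names used
purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local mirror of the XenbusState values from Xen's public io/xenbus.h,
 * redefined here only so the sketch compiles outside the kernel. */
enum xenbus_state {
    XenbusStateUnknown = 0,
    XenbusStateInitialising = 1,
    XenbusStateInitWait = 2,
    XenbusStateInitialised = 3,
    XenbusStateConnected = 4,
    XenbusStateClosing = 5,
    XenbusStateClosed = 6,
};

/* Hypothetical helper: true once there is nothing left to wait for. */
static bool backend_gone_or_closed(enum xenbus_state s)
{
    return s == XenbusStateClosed || s == XenbusStateUnknown;
}

/* Hypothetical polling loop over a recorded sequence of backend states.
 * Returns the number of polls made before the wait ended, or -1 if the
 * sequence ran out first (standing in for "would still be waiting"). */
static int polls_until_done(const enum xenbus_state *seen, size_t n)
{
    assert(seen != NULL);
    for (size_t i = 0; i < n; i++) {
        if (backend_gone_or_closed(seen[i]))
            return (int)i + 1;
    }
    return -1;
}
```

A loop that only compares against XenbusStateClosed would never leave the
Closing, Unknown sequence; with the extra check it ends on the second poll.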




Re: [Xen-devel] [PATCH v3] xen-disk: use an IOThread per instance

2017-11-16 Thread Paul Durrant
> -----Original Message-----
> From: Stefano Stabellini [mailto:sstabell...@kernel.org]
> Sent: 16 November 2017 01:11
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: qemu-de...@nongnu.org; xen-de...@lists.xenproject.org; Stefano
> Stabellini <sstabell...@kernel.org>; Anthony Perard
> <anthony.per...@citrix.com>; Kevin Wolf <kw...@redhat.com>; Max Reitz
> <mre...@redhat.com>
> Subject: RE: [PATCH v3] xen-disk: use an IOThread per instance
> 
> On Wed, 15 Nov 2017, Paul Durrant wrote:
> > Anthony, Stefano,
> >
> >   Ping?
> 
> Acked-by: Stefano Stabellini <sstabell...@kernel.org>
> 
> Unless Anthony or somebody else object, I'll queue it up in my "next"
> branch (which I'll send upstream after 2.11 is out).

Great. Thanks,

  Paul

> 
> Cheers,
> 
> Stefano
> 
> 
> > > -----Original Message-----
> > > From: Paul Durrant [mailto:paul.durr...@citrix.com]
> > > Sent: 07 November 2017 10:47
> > > To: qemu-de...@nongnu.org; xen-de...@lists.xenproject.org
> > > Cc: Paul Durrant <paul.durr...@citrix.com>; Stefano Stabellini
> > > <sstabell...@kernel.org>; Anthony Perard <anthony.per...@citrix.com>;
> > > Kevin Wolf <kw...@redhat.com>; Max Reitz <mre...@redhat.com>
> > > Subject: [PATCH v3] xen-disk: use an IOThread per instance
> > >
> > > This patch allocates an IOThread object for each xen_disk instance and
> > > sets the AIO context appropriately on connect. This allows processing
> > > of I/O to proceed in parallel.
> > >
> > > > The patch also adds tracepoints into xen_disk to make it possible to
> > > > follow the state transitions of an instance in the log.
> > >
> > > Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> > > ---
> > > Cc: Stefano Stabellini <sstabell...@kernel.org>
> > > Cc: Anthony Perard <anthony.per...@citrix.com>
> > > Cc: Kevin Wolf <kw...@redhat.com>
> > > Cc: Max Reitz <mre...@redhat.com>
> > >
> > > v3:
> > >  - Use new iothread_create/destroy() functions
> > >
> > > v2:
> > >  - explicitly acquire and release AIO context in qemu_aio_complete() and
> > >blk_bh()
> > > ---
> > >  hw/block/trace-events |  7 +++
> > >  hw/block/xen_disk.c   | 53
> > > ---
> > >  2 files changed, 53 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/hw/block/trace-events b/hw/block/trace-events
> > > index cb6767b3ee..962a3bfa24 100644
> > > --- a/hw/block/trace-events
> > > +++ b/hw/block/trace-events
> > > @@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *vdev, void *mrb,
> int
> > > start, int num_reqs, uint6
> > >  # hw/block/hd-geometry.c
> > >  hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p
> > > LCHS %d %d %d"
> > >  hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t
> secs,
> > > int trans) "blk %p CHS %u %u %u trans %d"
> > > +
> > > +# hw/block/xen_disk.c
> > > +xen_disk_alloc(char *name) "%s"
> > > +xen_disk_init(char *name) "%s"
> > > +xen_disk_connect(char *name) "%s"
> > > +xen_disk_disconnect(char *name) "%s"
> > > +xen_disk_free(char *name) "%s"
> > > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> > > index e431bd89e8..f74fcd42d1 100644
> > > --- a/hw/block/xen_disk.c
> > > +++ b/hw/block/xen_disk.c
> > > @@ -27,10 +27,12 @@
> > >  #include "hw/xen/xen_backend.h"
> > >  #include "xen_blkif.h"
> > >  #include "sysemu/blockdev.h"
> > > +#include "sysemu/iothread.h"
> > >  #include "sysemu/block-backend.h"
> > >  #include "qapi/error.h"
> > >  #include "qapi/qmp/qdict.h"
> > >  #include "qapi/qmp/qstring.h"
> > > +#include "trace.h"
> > >
> > >  /* - */
> > >
> > > @@ -125,6 +127,9 @@ struct XenBlkDev {
> > >  DriveInfo   *dinfo;
> > >  BlockBackend*blk;
> > >  QEMUBH  *bh;
> > > +
> > > +IOThread*iothread;
> > > +AioContext  *ctx;
> > >  };
> > >
> > >  /* -

Re: [Xen-devel] [PATCH v3] xen-disk: use an IOThread per instance

2017-11-15 Thread Paul Durrant
Anthony, Stefano,

  Ping?

> -----Original Message-----
> From: Paul Durrant [mailto:paul.durr...@citrix.com]
> Sent: 07 November 2017 10:47
> To: qemu-de...@nongnu.org; xen-de...@lists.xenproject.org
> Cc: Paul Durrant <paul.durr...@citrix.com>; Stefano Stabellini
> <sstabell...@kernel.org>; Anthony Perard <anthony.per...@citrix.com>;
> Kevin Wolf <kw...@redhat.com>; Max Reitz <mre...@redhat.com>
> Subject: [PATCH v3] xen-disk: use an IOThread per instance
> 
> This patch allocates an IOThread object for each xen_disk instance and
> sets the AIO context appropriately on connect. This allows processing
> of I/O to proceed in parallel.
> 
> The patch also adds tracepoints into xen_disk to make it possible to
> follow the state transitions of an instance in the log.
> 
> Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> ---
> Cc: Stefano Stabellini <sstabell...@kernel.org>
> Cc: Anthony Perard <anthony.per...@citrix.com>
> Cc: Kevin Wolf <kw...@redhat.com>
> Cc: Max Reitz <mre...@redhat.com>
> 
> v3:
>  - Use new iothread_create/destroy() functions
> 
> v2:
>  - explicitly acquire and release AIO context in qemu_aio_complete() and
>blk_bh()
> ---
>  hw/block/trace-events |  7 +++
>  hw/block/xen_disk.c   | 53
> ---
>  2 files changed, 53 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index cb6767b3ee..962a3bfa24 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *vdev, void *mrb, int
> start, int num_reqs, uint6
>  # hw/block/hd-geometry.c
>  hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p
> LCHS %d %d %d"
>  hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs,
> int trans) "blk %p CHS %u %u %u trans %d"
> +
> +# hw/block/xen_disk.c
> +xen_disk_alloc(char *name) "%s"
> +xen_disk_init(char *name) "%s"
> +xen_disk_connect(char *name) "%s"
> +xen_disk_disconnect(char *name) "%s"
> +xen_disk_free(char *name) "%s"
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index e431bd89e8..f74fcd42d1 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -27,10 +27,12 @@
>  #include "hw/xen/xen_backend.h"
>  #include "xen_blkif.h"
>  #include "sysemu/blockdev.h"
> +#include "sysemu/iothread.h"
>  #include "sysemu/block-backend.h"
>  #include "qapi/error.h"
>  #include "qapi/qmp/qdict.h"
>  #include "qapi/qmp/qstring.h"
> +#include "trace.h"
> 
>  /* - */
> 
> @@ -125,6 +127,9 @@ struct XenBlkDev {
>  DriveInfo   *dinfo;
>  BlockBackend*blk;
>  QEMUBH  *bh;
> +
> +IOThread*iothread;
> +AioContext  *ctx;
>  };
> 
>  /* - */
> @@ -596,9 +601,12 @@ static int ioreq_runio_qemu_aio(struct ioreq
> *ioreq);
>  static void qemu_aio_complete(void *opaque, int ret)
>  {
>  struct ioreq *ioreq = opaque;
> +struct XenBlkDev *blkdev = ioreq->blkdev;
> +
> +aio_context_acquire(blkdev->ctx);
> 
>  if (ret != 0) {
> -xen_pv_printf(&ioreq->blkdev->xendev, 0, "%s I/O error\n",
> +xen_pv_printf(&blkdev->xendev, 0, "%s I/O error\n",
>ioreq->req.operation == BLKIF_OP_READ ? "read" : 
> "write");
>  ioreq->aio_errors++;
>  }
> @@ -607,10 +615,10 @@ static void qemu_aio_complete(void *opaque, int
> ret)
>  if (ioreq->presync) {
>  ioreq->presync = 0;
>  ioreq_runio_qemu_aio(ioreq);
> -return;
> +goto done;
>  }
>  if (ioreq->aio_inflight > 0) {
> -return;
> +goto done;
>  }
> 
>  if (xen_feature_grant_copy) {
> @@ -647,16 +655,19 @@ static void qemu_aio_complete(void *opaque, int
> ret)
>  }
>  case BLKIF_OP_READ:
>  if (ioreq->status == BLKIF_RSP_OKAY) {
> -block_acct_done(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct);
> +block_acct_done(blk_get_stats(blkdev->blk), &ioreq->acct);
>  } else {
> -block_acct_failed(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct);
> +block_acct_failed(blk_get_stats(blkdev->blk), &ioreq->acct);
>  }
>  break;
>  

Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Paul Durrant
> -----Original Message-----
> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> Sent: 13 November 2017 16:34
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: net...@vger.kernel.org; Wei Liu <wei.l...@citrix.com>; xen-
> de...@lists.xenproject.org
> Subject: Re: [PATCH net-next v1] xen-netback: make copy batch size
> configurable
> 
> On Mon, Nov 13, 2017 at 11:58:03AM +, Paul Durrant wrote:
> > On Mon, Nov 13, 2017 at 11:54:00AM +, Joao Martins wrote:
> > > On 11/13/2017 10:33 AM, Paul Durrant wrote:
> > > > On 11/10/2017 19:35 PM, Joao Martins wrote:
> 
> [snip]
> 
> > > >> diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-
> netback/rx.c
> > > >> index b1cf7c6f407a..793a85f61f9d 100644
> > > >> --- a/drivers/net/xen-netback/rx.c
> > > >> +++ b/drivers/net/xen-netback/rx.c
> > > >> @@ -168,11 +168,14 @@ static void xenvif_rx_copy_add(struct
> > > >> xenvif_queue *queue,
> > > >>   struct xen_netif_rx_request *req,
> > > >>   unsigned int offset, void *data, size_t 
> > > >> len)
> > > >>  {
> > > >> +  unsigned int batch_size;
> > > >>struct gnttab_copy *op;
> > > >>struct page *page;
> > > >>struct xen_page_foreign *foreign;
> > > >>
> > > >> -  if (queue->rx_copy.num == COPY_BATCH_SIZE)
> > > >> +  batch_size = min(xenvif_copy_batch_size, queue-
> >rx_copy.size);
> > > >
> > > > Surely queue->rx_copy.size and xenvif_copy_batch_size are always
> > > > identical? Why do you need this statement (and hence stack variable)?
> > > >
> > > This statement was to allow to be changed dynamically and would
> > > affect all newly created guests or running guests if value happened
> > > to be smaller than initially allocated. But I suppose I should make
> > > behaviour more consistent with the other params we have right now
> > > and just look at initially allocated one `queue->rx_copy.batch_size` ?
> >
> > Yes, that would certainly be consistent but I can see value in
> > allowing it to be dynamically tuned, so perhaps adding some re-allocation
> > code to allow the batch to be grown as well as shrunk might be nice.
> 
> The shrink one we potentially risk losing data, so we need to gate the
> reallocation whenever `rx_copy.num` is less than the new requested
> batch. Worst case means guestrx_thread simply uses the initial
> allocated value.

Can't you just re-alloc immediately after the flush (when num is guaranteed to 
be zero)?

  Paul
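
The suggestion above can be sketched in userspace C as follows. This is an
illustration under stated assumptions rather than netback code: struct
copy_state and the copy_state_*() functions are hypothetical stand-ins for
the rx_copy state in the patch, and malloc/realloc take the place of the
kernel allocators. The point it shows is that the resize happens only after
the flush has reset num to zero, so neither growing nor shrinking can drop
a queued entry.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the copy state in the patch: 'ops' plays the
 * role of the op[] array, 'num' counts queued entries and 'batch_size'
 * is the allocated capacity. */
struct copy_state {
    int *ops;
    unsigned int num;
    unsigned int batch_size;
};

static int copy_state_init(struct copy_state *cs, unsigned int batch_size)
{
    assert(batch_size > 0);
    cs->ops = malloc(batch_size * sizeof(*cs->ops));
    if (!cs->ops)
        return -1;
    cs->num = 0;
    cs->batch_size = batch_size;
    return 0;
}

/* Flush the batch, then resize if a different size was requested.
 * Because num is zero at this point no queued entry can be lost. On
 * allocation failure the old array and size are simply kept. */
static void copy_state_flush(struct copy_state *cs, unsigned int wanted)
{
    /* ... issuing the queued copies would happen here ... */
    cs->num = 0;

    if (wanted != 0 && wanted != cs->batch_size) {
        int *ops = realloc(cs->ops, wanted * sizeof(*ops));
        if (ops) {
            cs->ops = ops;
            cs->batch_size = wanted;
        }
    }
}

/* Queue one entry, flushing first when the batch is full. */
static void copy_state_add(struct copy_state *cs, int op, unsigned int wanted)
{
    if (cs->num == cs->batch_size)
        copy_state_flush(cs, wanted);
    cs->ops[cs->num++] = op;
}
```

In this sketch a new size only takes effect at the next flush, which keeps
the add path free of any allocation unless the batch is already full.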



Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Paul Durrant
> -----Original Message-----
> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> Sent: 13 November 2017 11:54
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: net...@vger.kernel.org; Wei Liu <wei.l...@citrix.com>; xen-
> de...@lists.xenproject.org
> Subject: Re: [PATCH net-next v1] xen-netback: make copy batch size
> configurable
> 
> On 11/13/2017 10:33 AM, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> >> Sent: 10 November 2017 19:35
> >> To: net...@vger.kernel.org
> >> Cc: Joao Martins <joao.m.mart...@oracle.com>; Wei Liu
> >> <wei.l...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>; xen-
> >> de...@lists.xenproject.org
> >> Subject: [PATCH net-next v1] xen-netback: make copy batch size
> >> configurable
> >>
> >> Commit eb1723a29b9a ("xen-netback: refactor guest rx") refactored Rx
> >> handling and as a result decreased max grant copy ops from 4352 to 64.
> >> Before this commit it would drain the rx_queue (while there are
> >> enough slots in the ring to put packets) then copy to all pages and write
> >> responses on the ring. With the refactor we do almost the same albeit
> >> the last two steps are done every COPY_BATCH_SIZE (64) copies.
> >>
> >> For big packets, the value of 64 means copying 3 packets best case
> scenario
> >> (17 copies) and worst-case only 1 packet (34 copies, i.e. if all frags
> >> plus head cross the 4k grant boundary) which could be the case when
> >> packets go from local backend process.
> >>
> >> Instead of making it static to 64 grant copies, lets allow the user to
> >> select its value (while keeping the current as default) by introducing
> >> the `copy_batch_size` module parameter. This allows users to select
> >> the higher batches (i.e. for better throughput with big packets) as it
> >> was prior to the above mentioned commit.
> >>
> >> Signed-off-by: Joao Martins <joao.m.mart...@oracle.com>
> >> ---
> >>  drivers/net/xen-netback/common.h|  6 --
> >>  drivers/net/xen-netback/interface.c | 25
> -
> >>  drivers/net/xen-netback/netback.c   |  5 +
> >>  drivers/net/xen-netback/rx.c|  5 -
> >>  4 files changed, 37 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-
> >> netback/common.h
> >> index a46a1e94505d..a5fe36e098a7 100644
> >> --- a/drivers/net/xen-netback/common.h
> >> +++ b/drivers/net/xen-netback/common.h
> >> @@ -129,8 +129,9 @@ struct xenvif_stats {
> >>  #define COPY_BATCH_SIZE 64
> >>
> >>  struct xenvif_copy_state {
> >> -  struct gnttab_copy op[COPY_BATCH_SIZE];
> >> -  RING_IDX idx[COPY_BATCH_SIZE];
> >> +  struct gnttab_copy *op;
> >> +  RING_IDX *idx;
> >> +  unsigned int size;
> >
> > Could you name this batch_size, or something like that to make it clear
> what it means?
> >
> Yeap, will change it.
> 
> >>unsigned int num;
> >>struct sk_buff_head *completed;
> >>  };
> >> @@ -381,6 +382,7 @@ extern unsigned int rx_drain_timeout_msecs;
> >>  extern unsigned int rx_stall_timeout_msecs;
> >>  extern unsigned int xenvif_max_queues;
> >>  extern unsigned int xenvif_hash_cache_size;
> >> +extern unsigned int xenvif_copy_batch_size;
> >>
> >>  #ifdef CONFIG_DEBUG_FS
> >>  extern struct dentry *xen_netback_dbg_root;
> >> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> >> netback/interface.c
> >> index d6dff347f896..a558868a883f 100644
> >> --- a/drivers/net/xen-netback/interface.c
> >> +++ b/drivers/net/xen-netback/interface.c
> >> @@ -516,7 +516,20 @@ struct xenvif *xenvif_alloc(struct device *parent,
> >> domid_t domid,
> >>
> >>  int xenvif_init_queue(struct xenvif_queue *queue)
> >>  {
> >> +  int size = xenvif_copy_batch_size;
> >
> > unsigned int
> >>  int err, i;
> >> +  void *addr;
> >> +
> >> +  addr = vzalloc(size * sizeof(struct gnttab_copy));
> >
> > Does the memory need to be zeroed?
> >
> It doesn't need to be but given that xenvif_queue is zeroed (which included
> this
> region) thus thought I would leave the same way.

Ok.

> 
> >> +  if (!addr)
> >> +  goto err;
> >> + 

Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Paul Durrant
> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 13 November 2017 10:50
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Wei Liu <wei.l...@citrix.com>; xen-de...@lists.xenproject.org; 'Joao
> Martins' <joao.m.mart...@oracle.com>; net...@vger.kernel.org
> Subject: Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch
> size configurable
> 
> >>> On 13.11.17 at 11:33, <paul.durr...@citrix.com> wrote:
> >> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> >> Sent: 10 November 2017 19:35
> >> --- a/drivers/net/xen-netback/netback.c
> >> +++ b/drivers/net/xen-netback/netback.c
> >> @@ -96,6 +96,11 @@ unsigned int xenvif_hash_cache_size =
> >> XENVIF_HASH_CACHE_SIZE_DEFAULT;
> >>  module_param_named(hash_cache_size, xenvif_hash_cache_size, uint,
> >> 0644);
> 
> Isn't the "owner-write" permission here ...
> 
> >> --- a/drivers/net/xen-netback/rx.c
> >> +++ b/drivers/net/xen-netback/rx.c
> >> @@ -168,11 +168,14 @@ static void xenvif_rx_copy_add(struct
> >> xenvif_queue *queue,
> >>   struct xen_netif_rx_request *req,
> >>   unsigned int offset, void *data, size_t len)
> >>  {
> >> +  unsigned int batch_size;
> >>struct gnttab_copy *op;
> >>struct page *page;
> >>struct xen_page_foreign *foreign;
> >>
> >> -  if (queue->rx_copy.num == COPY_BATCH_SIZE)
> >> +  batch_size = min(xenvif_copy_batch_size, queue->rx_copy.size);
> >
> > Surely queue->rx_copy.size and xenvif_copy_batch_size are always
> identical?
> > Why do you need this statement (and hence stack variable)?
> 
> ... the answer to your question?

Yes, I guess it could be... but since there's no re-alloc code for the arrays I 
wonder whether the intention was to make this dynamic or not.

  Paul
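
To make the point concrete: with a parameter that is writable at run time
(mode 0644) but arrays that are sized once at queue initialisation, the
min() in the patch acts as a clamp. The sketch below shows that clamp in
isolation; effective_batch_size() is a hypothetical name and the function
is not taken from the patch.

```c
#include <assert.h>

/* Hypothetical helper: the limit actually applied to a batch when the
 * tunable can change at run time but the arrays were sized once at
 * queue initialisation. 'param' is the current tunable value and
 * 'allocated' is the array capacity. */
static unsigned int effective_batch_size(unsigned int param,
                                         unsigned int allocated)
{
    assert(allocated > 0);
    /* Never index past the arrays; without re-allocation a value larger
     * than the capacity has no effect. */
    return param < allocated ? param : allocated;
}
```

So lowering the parameter shrinks the batch for existing queues, while
raising it above the allocated size changes nothing until the arrays are
re-allocated.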

> 
> Jan




Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Paul Durrant
> -----Original Message-----
> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> Sent: 10 November 2017 19:35
> To: net...@vger.kernel.org
> Cc: Joao Martins <joao.m.mart...@oracle.com>; Wei Liu
> <wei.l...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>; xen-
> de...@lists.xenproject.org
> Subject: [PATCH net-next v1] xen-netback: make copy batch size
> configurable
> 
> Commit eb1723a29b9a ("xen-netback: refactor guest rx") refactored Rx
> handling and as a result decreased max grant copy ops from 4352 to 64.
> Before this commit it would drain the rx_queue (while there are
> enough slots in the ring to put packets) then copy to all pages and write
> responses on the ring. With the refactor we do almost the same albeit
> the last two steps are done every COPY_BATCH_SIZE (64) copies.
> 
> For big packets, the value of 64 means copying 3 packets best case scenario
> (17 copies) and worst-case only 1 packet (34 copies, i.e. if all frags
> plus head cross the 4k grant boundary) which could be the case when
> packets go from local backend process.
> 
> Instead of making it static to 64 grant copies, lets allow the user to
> select its value (while keeping the current as default) by introducing
> the `copy_batch_size` module parameter. This allows users to select
> the higher batches (i.e. for better throughput with big packets) as it
> was prior to the above mentioned commit.
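
The packet counts quoted in the commit message follow from integer division
of the batch size by the number of grant copies one packet needs. A small
sketch of that arithmetic, where packets_per_batch() is a hypothetical
helper and not part of the patch:

```c
#include <assert.h>

/* Whole packets that fit in one batch of grant copies. */
static unsigned int packets_per_batch(unsigned int batch_size,
                                      unsigned int copies_per_packet)
{
    assert(copies_per_packet > 0);
    return batch_size / copies_per_packet;
}
```

With the figures above, packets_per_batch(64, 17) is 3 and
packets_per_batch(64, 34) is 1.
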
> 
> Signed-off-by: Joao Martins <joao.m.mart...@oracle.com>
> ---
>  drivers/net/xen-netback/common.h|  6 --
>  drivers/net/xen-netback/interface.c | 25 -
>  drivers/net/xen-netback/netback.c   |  5 +
>  drivers/net/xen-netback/rx.c|  5 -
>  4 files changed, 37 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-
> netback/common.h
> index a46a1e94505d..a5fe36e098a7 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -129,8 +129,9 @@ struct xenvif_stats {
>  #define COPY_BATCH_SIZE 64
> 
>  struct xenvif_copy_state {
> - struct gnttab_copy op[COPY_BATCH_SIZE];
> - RING_IDX idx[COPY_BATCH_SIZE];
> + struct gnttab_copy *op;
> + RING_IDX *idx;
> + unsigned int size;

Could you name this batch_size, or something like that to make it clear what it 
means?

>   unsigned int num;
>   struct sk_buff_head *completed;
>  };
> @@ -381,6 +382,7 @@ extern unsigned int rx_drain_timeout_msecs;
>  extern unsigned int rx_stall_timeout_msecs;
>  extern unsigned int xenvif_max_queues;
>  extern unsigned int xenvif_hash_cache_size;
> +extern unsigned int xenvif_copy_batch_size;
> 
>  #ifdef CONFIG_DEBUG_FS
>  extern struct dentry *xen_netback_dbg_root;
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> index d6dff347f896..a558868a883f 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -516,7 +516,20 @@ struct xenvif *xenvif_alloc(struct device *parent,
> domid_t domid,
> 
>  int xenvif_init_queue(struct xenvif_queue *queue)
>  {
> + int size = xenvif_copy_batch_size;

unsigned int

>   int err, i;
> + void *addr;
> +
> + addr = vzalloc(size * sizeof(struct gnttab_copy));

Does the memory need to be zeroed?

> + if (!addr)
> + goto err;
> + queue->rx_copy.op = addr;
> +
> + addr = vzalloc(size * sizeof(RING_IDX));

Likewise.

> + if (!addr)
> + goto err;
> + queue->rx_copy.idx = addr;
> + queue->rx_copy.size = size;
> 
>   queue->credit_bytes = queue->remaining_credit = ~0UL;
>   queue->credit_usec  = 0UL;
> @@ -544,7 +557,7 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>queue->mmap_pages);
>   if (err) {
>   netdev_err(queue->vif->dev, "Could not reserve
> mmap_pages\n");
> - return -ENOMEM;
> + goto err;
>   }
> 
>   for (i = 0; i < MAX_PENDING_REQS; i++) {
> @@ -556,6 +569,13 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>   }
> 
>   return 0;
> +
> +err:
> + if (queue->rx_copy.op)
> + vfree(queue->rx_copy.op);

vfree is safe to be called with NULL.

> + if (queue->rx_copy.idx)
> + vfree(queue->rx_copy.idx);
> + return -ENOMEM;
>  }
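
Taking the review comments on this function together (no need to zero the
arrays, no need to NULL-check before freeing), the init path could look
roughly like the userspace sketch below. It is an analogue only: malloc and
free stand in for the kernel allocators, free(NULL) being a no-op in
standard C, and struct copy_arrays and copy_arrays_init() are hypothetical
names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace analogue of the two arrays in the patch. */
struct copy_arrays {
    int *op;
    unsigned int *idx;
    unsigned int batch_size;
};

/* Returns 0 on success, -1 on allocation failure. The caller is assumed
 * to have zeroed *ca beforehand, so untouched pointers are NULL when the
 * error path runs. */
static int copy_arrays_init(struct copy_arrays *ca, unsigned int batch_size)
{
    assert(batch_size > 0);

    /* Non-zeroing allocation: this assumes every entry is written
     * before it is read. */
    ca->op = malloc(batch_size * sizeof(*ca->op));
    if (!ca->op)
        goto err;

    ca->idx = malloc(batch_size * sizeof(*ca->idx));
    if (!ca->idx)
        goto err;

    ca->batch_size = batch_size;
    return 0;

err:
    /* free(NULL) is a no-op, so no NULL checks are needed here. */
    free(ca->op);
    free(ca->idx);
    ca->op = NULL;
    ca->idx = NULL;
    return -1;
}
```
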
> 
>  void xenvif_carrier_on(struct xenvif *vif)
> @@ -788,6 +808,9 @@ void xenvif_disconnect_ctrl(struct xenvif *vif)
>   */
>  void xenvif_deinit_queue(struct xenvif_queue *queue)
>  {
> +

Re: [Xen-devel] [BUG] blkback reporting incorrect number of sectors, unable to boot

2017-11-10 Thread Paul Durrant
> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 10 November 2017 09:53
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Anthony Perard <anthony.per...@citrix.com>; Roger Pau Monne
> <roger@citrix.com>; Mike Reardon <m...@inso.org>; Stefano Stabellini
> <sstabell...@kernel.org>; xen-devel@lists.xen.org; Konrad Rzeszutek Wilk
> <konrad.w...@oracle.com>
> Subject: RE: [Xen-devel] [BUG] blkback reporting incorrect number of
> sectors, unable to boot
> 
> >>> On 10.11.17 at 10:40, <paul.durr...@citrix.com> wrote:
> >> Anthony PERARD
> >> Sent: 09 November 2017 17:50
> >> The problem is that QEMU 4.10 has a lock on the disk image. When
> >> booting an HVM guest with a qdisk backend, the disk is opened twice, but
> >> can only be locked once, so when the PV disk is being initialized, the
> >> initialisation kind of fails.
> >> Unfortunately, OVMF will wait indefinitely until the PV disk is
> >> initialized.
> >
> > That's presumably because the OVMF frontend leaves the emulated disk
> plugged
> > in despite talking via PV?
> 
> Well, how could it not? It can't know whether the OS to be booted
> is going to have PV drivers, and iirc the unplug is not reversible.

Oh, quite, but this is a fundamental problem if QEMU believes that it is not 
safe to open the underlying storage shared read-write (which would be the case 
for a qcow, or vhd where there is metadata to worry about). QEMU will open the 
storage as soon as the emulated device is realised so when xen_disk tries to 
open it again at connect time, it's always going to fail.

  Paul
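
The failure mode described above (the same image opened twice, with the
second opener refused an exclusive lock) can be reproduced in miniature
with flock(). This is an analogy only: QEMU's image locking is implemented
differently, and second_open_is_locked_out() is a hypothetical name. The
sketch relies on flock() locks belonging to the open file description, so
two separate open() calls in one process conflict with each other.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/file.h>
#include <unistd.h>

/* Opens 'path' twice (creating it if missing) and tries to take an
 * exclusive, non-blocking flock() on each descriptor. Returns true if
 * the first lock succeeds and the second is refused. */
static bool second_open_is_locked_out(const char *path)
{
    assert(path != NULL);
    bool refused = false;
    int first = open(path, O_RDWR | O_CREAT, 0600);
    int second = open(path, O_RDWR | O_CREAT, 0600);

    if (first >= 0 && second >= 0 &&
        flock(first, LOCK_EX | LOCK_NB) == 0)
        refused = flock(second, LOCK_EX | LOCK_NB) != 0;

    /* Closing the descriptors also releases any lock that was taken. */
    if (second >= 0)
        close(second);
    if (first >= 0)
        close(first);
    return refused;
}
```

The first opener here plays the part of the emulated device and the second
that of xen_disk at connect time.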

> Shouldn't OVMF close the blkif connection, with the backend
> responding to this by unlocking (and maybe closing) the image?
> 
> Jan




Re: [Xen-devel] [BUG] blkback reporting incorrect number of sectors, unable to boot

2017-11-10 Thread Paul Durrant
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Anthony PERARD
> Sent: 09 November 2017 17:50
> To: Roger Pau Monne 
> Cc: Mike Reardon ; xen-devel@lists.xen.org; Stefano
> Stabellini ; Jan Beulich ;
> Konrad Rzeszutek Wilk 
> Subject: Re: [Xen-devel] [BUG] blkback reporting incorrect number of
> sectors, unable to boot
> 
> On Thu, Nov 09, 2017 at 05:03:18PM +, Roger Pau Monné wrote:
> > On Thu, Nov 09, 2017 at 08:15:52AM -0700, Mike Reardon wrote:
> > > On Thu, Nov 9, 2017 at 2:30 AM, Roger Pau Monné
> 
> > > wrote:
> > >
> > > > Please try to avoid top-posting.
> > > >
> > > > On Wed, Nov 08, 2017 at 08:27:17PM -0700, Mike Reardon wrote:
> > > > > So am I correct in reading this that for at least the foreseeable 
> > > > > future
> > > > > storage using 4k sector sizes is not gonna happen?  I'm just trying to
> > > > > figure out if I need to get some different hardware.
> > > >
> > > > Have you tried to use qdisk instead of blkback for the storage
> > > > backend?
> > > >
> > > > You will have to change your disk configuration line to add
> > > > backendtype=qdisk.
> > > >
> > > > Roger.
> > > >
> > >
> > > Sorry I didn't realize my client was defaulting to top post.
> > >
> > > If I add that to the disk config line, the system just hangs on the ovmf
> > > bios screen.  This appears in the qemu-dm log:
> > >
> > >
> > > xen be: qdisk-51712: xen be: qdisk-51712: error: Failed to get "write" 
> > > lock
> > > error: Failed to get "write" lock
> > > xen be: qdisk-51712: xen be: qdisk-51712: initialise() failed
> > > initialise() failed
> 
> :(, I never saw those error messages, maybe we should increase the
> verbosity of the qemu backends.
> 
> > Hm, that doesn't seem related to the issue at hand. Adding Anthony and
> > Stefano (the QEMU maintainers).
> >
> > Is there a known issue when booting a HVM guest with qdisk and UEFI?
> 
> I know of the issue, I don't know what to do about it yet.
> 
> The problem is that QEMU 4.10 has a lock on the disk image. When
> booting an HVM guest with a qdisk backend, the disk is opened twice, but
> can only be locked once, so when the PV disk is being initialized, the
> initialisation kind of fails.
> Unfortunately, OVMF will wait indefinitely until the PV disk is
> initialized.

That's presumably because the OVMF frontend leaves the emulated disk plugged in 
despite talking via PV?

  Paul

> 
> --
> Anthony PERARD
> 


Re: [Xen-devel] [BUG] blkback reporting incorrect number of sectors, unable to boot

2017-11-09 Thread Paul Durrant
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Roger Pau Monné
> Sent: 09 November 2017 09:30
> To: Mike Reardon 
> Cc: Konrad Rzeszutek Wilk ; Jan Beulich
> ; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] [BUG] blkback reporting incorrect number of
> sectors, unable to boot
> 
> Please try to avoid top-posting.
> 
> On Wed, Nov 08, 2017 at 08:27:17PM -0700, Mike Reardon wrote:
> > So am I correct in reading this that for at least the foreseeable future
> > storage using 4k sector sizes is not gonna happen?  I'm just trying to
> > figure out if I need to get some different hardware.
> 
> Have you tried to use qdisk instead of blkback for the storage
> backend?
> 
> You will have to change your disk configuration line to add
> backendtype=qdisk.

From my reading qdisk (i.e. xen_disk.c in the QEMU source) hard-codes its block 
size to 512, but at least it looks like it won't mis-report the number of 
sectors.

  Paul

> 
> Roger.
> 


Re: [Xen-devel] [Qemu-devel] [PATCH v3] xen-disk: use an IOThread per instance

2017-11-09 Thread Paul Durrant
> -----Original Message-----
> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> Sent: 08 November 2017 17:42
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: qemu-de...@nongnu.org; xen-de...@lists.xenproject.org; Anthony
> Perard <anthony.per...@citrix.com>; Kevin Wolf <kw...@redhat.com>;
> Stefano Stabellini <sstabell...@kernel.org>; Max Reitz <mre...@redhat.com>
> Subject: Re: [Qemu-devel] [PATCH v3] xen-disk: use an IOThread per
> instance
> 
> On Tue, Nov 07, 2017 at 05:46:53AM -0500, Paul Durrant wrote:
> > This patch allocates an IOThread object for each xen_disk instance and
> > sets the AIO context appropriately on connect. This allows processing
> > of I/O to proceed in parallel.
> >
> > The patch also adds tracepoints into xen_disk to make it possible to
> > follow the state transitions of an instance in the log.
> 
> virtio-blk and virtio-scsi allow the user to specify an IOThread object.
> This allows users to configure the device<->IOThread mapping any way
> they like (e.g. 1:1, M:N).  Are you sure you want to hard-code the
> IOThread mapping?

Stefan,

  Realistically it's the only option at the moment. Xen PV backends are not
configured by QAPI... so no ability to control them from the command line or
QMP. This is something I seriously intend to address in the near future but,
for now, I simply want to unlock the performance boost that IOThread can
provide.

  Cheers,

Paul



Re: [Xen-devel] [BUG] blkback reporting incorrect number of sectors, unable to boot

2017-11-07 Thread Paul Durrant
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Roger Pau Monné
> Sent: 07 November 2017 10:30
> To: Jan Beulich 
> Cc: Mike Reardon ; xen-devel@lists.xen.org; Konrad
> Rzeszutek Wilk 
> Subject: Re: [Xen-devel] [BUG] blkback reporting incorrect number of
> sectors, unable to boot
> 
> On Mon, Nov 06, 2017 at 05:33:37AM -0700, Jan Beulich wrote:
> > >>> On 04.11.17 at 05:48,  wrote:
> > > I added some additional storage to my server with some native 4k sector
> > > size disks.  The LVM volumes on that array seem to work fine when mounted
> > > by the host, and when passed through to any of the Linux guests, but
> > > Windows guests aren't able to use them when using PV drivers.  They work
> > > fine to install when I first install Windows (Windows 10, latest build)
> > > but once I install the PV drivers it will no longer boot and gives an
> > > inaccessible boot device error.  If I assign the storage to a different
> > > Windows guest that already has the drivers installed (as secondary storage,
> > > not as the boot device) I see the disk listed in disk management, but the
> > > size of the disk is 8x larger than it should be.  After looking into it a
> > > bit, the disk is reporting 8x the number of sectors it should have when I
> > > run xenstore-ls.  Here is the info from xenstore-ls for the relevant volume:
> > >
> > >   51712 = ""
> > >frontend = "/local/domain/8/device/vbd/51712"
> > >params = "/dev/tv_storage/main-storage"
> > >script = "/etc/xen/scripts/block"
> > >frontend-id = "8"
> > >online = "1"
> > >removable = "0"
> > >bootable = "1"
> > >state = "2"
> > >dev = "xvda"
> > >type = "phy"
> > >mode = "w"
> > >device-type = "disk"
> > >discard-enable = "1"
> > >feature-max-indirect-segments = "256"
> > >multi-queue-max-queues = "12"
> > >max-ring-page-order = "4"
> > >physical-device = "fe:0"
> > >physical-device-path = "/dev/dm-0"
> > >hotplug-status = "connected"
> > >feature-flush-cache = "1"
> > >feature-discard = "0"
> > >feature-barrier = "1"
> > >feature-persistent = "1"
> > >sectors = "34359738368"
> > >info = "0"
> > >sector-size = "4096"
> > >physical-sector-size = "4096"
> > >
> > >
> > > Here are the numbers for the volume as reported by fdisk:
> > >
> > > Disk /dev/tv_storage/main-storage: 16 TiB, 17592186044416 bytes, 4294967296 sectors
> > > Units: sectors of 1 * 4096 = 4096 bytes
> > > Sector size (logical/physical): 4096 bytes / 4096 bytes
> > > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > > Disklabel type: dos
> > > Disk identifier: 0x
> > >
> > > Device                        Boot Start        End    Sectors Size Id Type
> > > /dev/tv_storage/main-storage1          1 4294967295 4294967295  16T ee GPT
> > >
> > >
> > > As with the size reported in Windows disk management, the number of sectors
> > > from xenstore seems to be 8x higher than what it should be.  The disks aren't
> > > using 512b sector emulation, they are natively 4k, so I have no idea where
> > > the 8x increase is coming from.
> >
> > Hmm, looks like a backend problem indeed: struct hd_struct's
> > nr_sects (which get_capacity() returns) looks to be in 512-byte
> > units, regardless of actual sector size. Hence the plain
> > get_capacity() use as well as the (wrongly open coded) use of
> > part_nr_sects_read() look insufficient in vbd_sz(). Roger,
> > Konrad?
> 
> Hm, AFAICT sector-size should always be set to 512.
> 
> > Question of course is whether the Linux frontend then
> > also needs adjustment, and hence whether the backend can
> > be corrected in a compatible way in the first place.
> 
> blkfront uses set_capacity, which also seems to expect the sectors to
> be hardcoded to 512.
> 

Oh dear. No wonder it's all quite broken.

  Paul
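
For what it's worth, the unit mismatch can be made concrete with the numbers
from the report above. This is only a sketch of the arithmetic: the helper
names are hypothetical and are not taken from blkback or blkfront, and the
only assumption is that the advertised "sectors" value is counted in 512-byte
units while "sector-size" says 4096.

```c
#include <stdint.h>

/* Hypothetical helpers; not part of blkback or blkfront. */

/* Capacity in bytes when "sectors" is interpreted in 512-byte units. */
static uint64_t bytes_from_512_units(uint64_t sectors)
{
    return sectors * 512ULL;
}

/* Capacity in bytes when "sectors" is multiplied by the advertised sector-size. */
static uint64_t bytes_from_sector_size(uint64_t sectors, uint32_t sector_size)
{
    return sectors * (uint64_t)sector_size;
}

/* Logical sector count for a capacity that was counted in 512-byte units. */
static uint64_t logical_sectors(uint64_t sectors_512, uint32_t sector_size)
{
    return sectors_512 / (sector_size / 512U);
}
```

With sectors = 34359738368, the 512-byte reading gives the 17592186044416
bytes that fdisk reports, whereas multiplying by sector-size = 4096 gives a
capacity 8x larger, which matches what Windows disk management shows.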

> Roger.
> 


[Xen-devel] [PATCH v3] xen-disk: use an IOThread per instance

2017-11-07 Thread Paul Durrant
This patch allocates an IOThread object for each xen_disk instance and
sets the AIO context appropriately on connect. This allows processing
of I/O to proceed in parallel.

The patch also adds tracepoints into xen_disk to make it possible to
follow the state transitions of an instance in the log.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Anthony Perard <anthony.per...@citrix.com>
Cc: Kevin Wolf <kw...@redhat.com>
Cc: Max Reitz <mre...@redhat.com>

v3:
 - Use new iothread_create/destroy() functions

v2:
 - explicitly acquire and release AIO context in qemu_aio_complete() and
   blk_bh()
---
 hw/block/trace-events |  7 +++
 hw/block/xen_disk.c   | 53 ---
 2 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/hw/block/trace-events b/hw/block/trace-events
index cb6767b3ee..962a3bfa24 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *vdev, void *mrb, int start, 
int num_reqs, uint6
 # hw/block/hd-geometry.c
 hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p LCHS 
%d %d %d"
 hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int 
trans) "blk %p CHS %u %u %u trans %d"
+
+# hw/block/xen_disk.c
+xen_disk_alloc(char *name) "%s"
+xen_disk_init(char *name) "%s"
+xen_disk_connect(char *name) "%s"
+xen_disk_disconnect(char *name) "%s"
+xen_disk_free(char *name) "%s"
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index e431bd89e8..f74fcd42d1 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -27,10 +27,12 @@
 #include "hw/xen/xen_backend.h"
 #include "xen_blkif.h"
 #include "sysemu/blockdev.h"
+#include "sysemu/iothread.h"
 #include "sysemu/block-backend.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qstring.h"
+#include "trace.h"
 
 /* - */
 
@@ -125,6 +127,9 @@ struct XenBlkDev {
 DriveInfo   *dinfo;
 BlockBackend*blk;
 QEMUBH  *bh;
+
+IOThread*iothread;
+AioContext  *ctx;
 };
 
 /* - */
@@ -596,9 +601,12 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq);
 static void qemu_aio_complete(void *opaque, int ret)
 {
 struct ioreq *ioreq = opaque;
+struct XenBlkDev *blkdev = ioreq->blkdev;
+
+aio_context_acquire(blkdev->ctx);
 
 if (ret != 0) {
-xen_pv_printf(&ioreq->blkdev->xendev, 0, "%s I/O error\n",
+xen_pv_printf(&blkdev->xendev, 0, "%s I/O error\n",
   ioreq->req.operation == BLKIF_OP_READ ? "read" : 
"write");
 ioreq->aio_errors++;
 }
@@ -607,10 +615,10 @@ static void qemu_aio_complete(void *opaque, int ret)
 if (ioreq->presync) {
 ioreq->presync = 0;
 ioreq_runio_qemu_aio(ioreq);
-return;
+goto done;
 }
 if (ioreq->aio_inflight > 0) {
-return;
+goto done;
 }
 
 if (xen_feature_grant_copy) {
@@ -647,16 +655,19 @@ static void qemu_aio_complete(void *opaque, int ret)
 }
 case BLKIF_OP_READ:
 if (ioreq->status == BLKIF_RSP_OKAY) {
-block_acct_done(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct);
+block_acct_done(blk_get_stats(blkdev->blk), &ioreq->acct);
 } else {
-block_acct_failed(blk_get_stats(ioreq->blkdev->blk), &ioreq->acct);
+block_acct_failed(blk_get_stats(blkdev->blk), &ioreq->acct);
 }
 break;
 case BLKIF_OP_DISCARD:
 default:
 break;
 }
-qemu_bh_schedule(ioreq->blkdev->bh);
+qemu_bh_schedule(blkdev->bh);
+
+done:
+aio_context_release(blkdev->ctx);
 }
 
 static bool blk_split_discard(struct ioreq *ioreq, blkif_sector_t 
sector_number,
@@ -913,17 +924,29 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 static void blk_bh(void *opaque)
 {
 struct XenBlkDev *blkdev = opaque;
+
+aio_context_acquire(blkdev->ctx);
 blk_handle_requests(blkdev);
+aio_context_release(blkdev->ctx);
 }
 
 static void blk_alloc(struct XenDevice *xendev)
 {
 struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
+Error *err = NULL;
+
+trace_xen_disk_alloc(xendev->name);
 
 QLIST_INIT(&blkdev->inflight);
 QLIST_INIT(&blkdev->finished);
 QLIST_INIT(&blkdev->freelist);
-blkdev->bh = qemu_bh_new(blk_bh, blkdev);
+
+blkdev->iothread = iothread_create(xendev->name, &err);
+assert(!err);
+
+blkdev->ctx = iothread_get_aio_context(blkdev->iothread);
+blkdev-&
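
The archived copy of the patch is cut off at this point. As a side note on
the v2 change, the reason the early returns in qemu_aio_complete() become
"goto done" is that every exit path has to release the context it acquired.
A minimal sketch of that pattern follows, using a counter as a stand-in for
the AioContext; the fake_* names are hypothetical and none of this is QEMU
API.

```c
/* Stand-in for an AioContext: counts how many times it is currently held. */
struct fake_ctx {
    int held;
};

static void ctx_acquire(struct fake_ctx *ctx) { ctx->held++; }
static void ctx_release(struct fake_ctx *ctx) { ctx->held--; }

/* Hypothetical request state, loosely modelled on the fields the patch tests. */
struct fake_req {
    struct fake_ctx *ctx;
    int presync;
    int inflight;
    int completed;
};

/* Completion callback with a single exit so the context is always released. */
static void fake_complete(struct fake_req *req)
{
    ctx_acquire(req->ctx);

    if (req->presync) {
        req->presync = 0;
        goto done;          /* early exit still passes through the release */
    }
    if (req->inflight > 0) {
        goto done;
    }

    req->completed = 1;

done:
    ctx_release(req->ctx);
}
```

A plain "return" in either early branch would leave the context held, which
is the situation the "goto done" changes avoid.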

Re: [Xen-devel] [PATCH RFC 2/8] public/io/netif: add directory for backend parameters

2017-11-06 Thread Paul Durrant
> -Original Message-
> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> Sent: 02 November 2017 18:06
> To: Xen Development List <xen-devel@lists.xen.org>
> Cc: Joao Martins <joao.m.mart...@oracle.com>; Konrad Rzeszutek Wilk
> <konrad.w...@oracle.com>; Paul Durrant <paul.durr...@citrix.com>; Wei Liu
> <wei.l...@citrix.com>
> Subject: [PATCH RFC 2/8] public/io/netif: add directory for backend
> parameters
> 
> The proposed directory provides a mechanism for tools to control the
> maximum feature set of the device being provisioned by backend.
> The parameters/features include offloading features, number of
> queues etc.
> 
> Signed-off-by: Joao Martins <joao.m.mart...@oracle.com>
> ---
>  xen/include/public/io/netif.h | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h
> index 2454448baa..a412e4771d 100644
> --- a/xen/include/public/io/netif.h
> +++ b/xen/include/public/io/netif.h
> @@ -161,6 +161,22 @@
>   */
> 
>  /*
> + * The directory "require" may be created in the backend path by the tools
> + * domain to override the maximum feature set that the backend provides to the
> + * frontend. The child entries within this directory are feature names
> + * and the corresponding values that should be used by the backend as
> + * defaults, e.g.:
> + *
> + * /local/domain/X/backend///require
> + * /local/domain/X/backend///require/multi-queue-
> max-queues = "2"
> + * /local/domain/X/backend///require/feature-no-csum-
> offload = "1"
> + *
> + * In the example above, the network backend will negotiate up to a maximum of
> + * two queues with the frontend and will disable IPv4 checksum offloading.
> + *
> + * This directory and its child entries shall only be visible to the backend.
> + */
> +

What should happen if the toolstack sets something in 'require' that the 
backend cannot provide? I don't see anything in your RFC patches to check that 
the backend has responded appropriately to the keys.

  Paul
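
To illustrate the kind of check I mean, here is a rough sketch of one possible
answer for the queue limit. It is purely hypothetical: the function name, the
"0 means no override" convention and the choice to fail rather than clamp are
my assumptions and are not in the RFC.

```c
#include <errno.h>

/*
 * Hypothetical helper, not part of the RFC: decide the queue limit the
 * backend advertises, given its own capability and an optional "require"
 * value. require == 0 models the toolstack having written no override.
 */
static int apply_required_max_queues(unsigned int backend_max,
                                     unsigned int require,
                                     unsigned int *advertised)
{
    if (require == 0) {
        *advertised = backend_max;
        return 0;
    }
    if (require > backend_max) {
        return -EINVAL;     /* request exceeds what the backend can provide */
    }
    *advertised = require;
    return 0;
}
```

Failing in the over-limit case at least gives the toolstack something to
notice; silently clamping would leave it unable to tell whether its key was
honoured.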

> +/*
>   * Control ring
>   * 
>   *
> --
> 2.11.0




[Xen-devel] [PATCH v4] xen: support priv-mapping in an HVM tools domain

2017-11-03 Thread Paul Durrant
If the domain has XENFEAT_auto_translated_physmap then use of the PV-
specific HYPERVISOR_mmu_update hypercall is clearly incorrect.

This patch adds checks in xen_remap_domain_gfn_array() and
xen_unmap_domain_gfn_array() which call through to the appropriate
xlate_mmu function if the feature is present. A check is also added
to xen_remap_domain_gfn_range() to fail with -EOPNOTSUPP since this
should not be used in an HVM tools domain.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>

v4:
 - Restore v1 commit comment.

v3:
 - As v1 but with additional stubs in xen/xen-ops.h to handle
   configurations without CONFIG_XEN_AUTO_XLATE.
---
 arch/x86/xen/mmu.c| 14 --
 include/xen/xen-ops.h | 24 
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 3e15345abfe7..d33e7dbe3129 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -172,6 +172,9 @@ int xen_remap_domain_gfn_range(struct vm_area_struct *vma,
   pgprot_t prot, unsigned domid,
   struct page **pages)
 {
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return -EOPNOTSUPP;
+
return do_remap_gfn(vma, addr, &gfn, nr, NULL, prot, domid, pages);
 }
 EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
@@ -182,6 +185,10 @@ int xen_remap_domain_gfn_array(struct vm_area_struct *vma,
   int *err_ptr, pgprot_t prot,
   unsigned domid, struct page **pages)
 {
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return xen_xlate_remap_gfn_array(vma, addr, gfn, nr, err_ptr,
+prot, domid, pages);
+
/* We BUG_ON because it's a programmer error to pass a NULL err_ptr,
 * and the consequences later is quite hard to detect what the actual
 * cause of "wrong memory was mapped in".
@@ -193,9 +200,12 @@ EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_array);
 
 /* Returns: 0 success */
 int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
-  int numpgs, struct page **pages)
+  int nr, struct page **pages)
 {
-   if (!pages || !xen_feature(XENFEAT_auto_translated_physmap))
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return xen_xlate_unmap_gfn_range(vma, nr, pages);
+
+   if (!pages)
return 0;
 
return -EINVAL;
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index 218e6aae5433..18b25631a113 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -103,6 +103,8 @@ int xen_remap_domain_gfn_range(struct vm_area_struct *vma,
   struct page **pages);
 int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
   int numpgs, struct page **pages);
+
+#ifdef CONFIG_XEN_AUTO_XLATE
 int xen_xlate_remap_gfn_array(struct vm_area_struct *vma,
  unsigned long addr,
  xen_pfn_t *gfn, int nr,
@@ -111,6 +113,28 @@ int xen_xlate_remap_gfn_array(struct vm_area_struct *vma,
  struct page **pages);
 int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
  int nr, struct page **pages);
+#else
+/*
+ * These two functions are called from arch/x86/xen/mmu.c and so stubs
+ * are needed for a configuration not specifying CONFIG_XEN_AUTO_XLATE.
+ */
+static inline int xen_xlate_remap_gfn_array(struct vm_area_struct *vma,
+   unsigned long addr,
+   xen_pfn_t *gfn, int nr,
+   int *err_ptr, pgprot_t prot,
+   unsigned int domid,
+   struct page **pages)
+{
+   return -EOPNOTSUPP;
+}
+
+static inline int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
+   int nr, struct page **pages)
+{
+   return -EOPNOTSUPP;
+}
+#endif
+
 int xen_xlate_map_ballooned_pages(xen_pfn_t **pfns, void **vaddr,
  unsigned long nr_grant_frames);
 
-- 
2.11.0
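
For readers skimming the thread, the shape of the change can be sketched
outside the kernel as below. Everything here is a stand-in: the flag models
XENFEAT_auto_translated_physmap, HAVE_XLATE models CONFIG_XEN_AUTO_XLATE, and
the function names are hypothetical rather than the real xen-ops interfaces.

```c
#include <errno.h>
#include <stdbool.h>

/* Models xen_feature(XENFEAT_auto_translated_physmap). */
static bool auto_translated;

#ifdef HAVE_XLATE              /* models CONFIG_XEN_AUTO_XLATE */
static int xlate_remap(int nr) { return nr; }
#else
/* Stub so callers still compile when the translated path is not built in. */
static inline int xlate_remap(int nr) { (void)nr; return -EOPNOTSUPP; }
#endif

/* Models the PV path that ends in the mmu_update hypercall. */
static int pv_remap(int nr) { return nr; }

/* Array variant: use the translated path when the feature is present. */
static int remap_array(int nr)
{
    if (auto_translated)
        return xlate_remap(nr);
    return pv_remap(nr);
}

/* Range variant: PV-only, so refuse on an auto-translated domain. */
static int remap_range(int nr)
{
    if (auto_translated)
        return -EOPNOTSUPP;
    return pv_remap(nr);
}
```

The inline stubs are what let the callers build without #ifdefs at each call
site when the translated path is configured out.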




Re: [Xen-devel] [PATCH v3] xen: support priv-mapping in an HVM tools domain

2017-11-03 Thread Paul Durrant
> -Original Message-
> From: Paul Durrant [mailto:paul.durr...@citrix.com]
> Sent: 03 November 2017 16:58
> To: x...@kernel.org; xen-de...@lists.xenproject.org; linux-
> ker...@vger.kernel.org
> Cc: Paul Durrant <paul.durr...@citrix.com>; Boris Ostrovsky
> <boris.ostrov...@oracle.com>; Juergen Gross <jgr...@suse.com>; Thomas
> Gleixner <t...@linutronix.de>; Ingo Molnar <mi...@redhat.com>; H. Peter
> Anvin <h...@zytor.com>
> Subject: [PATCH v3] xen: support priv-mapping in an HVM tools domain
> 
> If the domain has XENFEAT_auto_translated_physmap then use of the PV-
> specific HYPERVISOR_mmu_update hypercall is clearly incorrect.
> 
> This patch adds checks in xen_remap_domain_gfn_array() and
> xen_unmap_domain_gfn_array() which call through to the appropriate
> xlate_mmu function if the feature is present.
> 
> This patch also moves xen_remap_domain_gfn_range() into the PV-only MMU
> code and #ifdefs the (only) calling code in privcmd accordingly.

 I realise now that this paragraph refers to the code in the v2 patch. 
I'll send v4.

  Paul

> 
> Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> ---
> Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
> Cc: Juergen Gross <jgr...@suse.com>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: Ingo Molnar <mi...@redhat.com>
> Cc: "H. Peter Anvin" <h...@zytor.com>
> 
> v3:
>  - As v1 but with additional stubs in xen/xen-ops.h to handle
>configurations without CONFIG_XEN_AUTO_XLATE.
> ---
>  arch/x86/xen/mmu.c| 14 --
>  include/xen/xen-ops.h | 24 
>  2 files changed, 36 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> index 3e15345abfe7..d33e7dbe3129 100644
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -172,6 +172,9 @@ int xen_remap_domain_gfn_range(struct
> vm_area_struct *vma,
>  pgprot_t prot, unsigned domid,
>  struct page **pages)
>  {
> + if (xen_feature(XENFEAT_auto_translated_physmap))
> + return -EOPNOTSUPP;
> +
>   return do_remap_gfn(vma, addr, , nr, NULL, prot, domid,
> pages);
>  }
>  EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
> @@ -182,6 +185,10 @@ int xen_remap_domain_gfn_array(struct
> vm_area_struct *vma,
>  int *err_ptr, pgprot_t prot,
>  unsigned domid, struct page **pages)
>  {
> + if (xen_feature(XENFEAT_auto_translated_physmap))
> + return xen_xlate_remap_gfn_array(vma, addr, gfn, nr,
> err_ptr,
> +  prot, domid, pages);
> +
>   /* We BUG_ON because it's a programmer error to pass a NULL
> err_ptr,
>* and the consequences later is quite hard to detect what the actual
>* cause of "wrong memory was mapped in".
> @@ -193,9 +200,12 @@
> EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_array);
> 
>  /* Returns: 0 success */
>  int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
> -int numpgs, struct page **pages)
> +int nr, struct page **pages)
>  {
> - if (!pages || !xen_feature(XENFEAT_auto_translated_physmap))
> + if (xen_feature(XENFEAT_auto_translated_physmap))
> + return xen_xlate_unmap_gfn_range(vma, nr, pages);
> +
> + if (!pages)
>   return 0;
> 
>   return -EINVAL;
> diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
> index 218e6aae5433..18b25631a113 100644
> --- a/include/xen/xen-ops.h
> +++ b/include/xen/xen-ops.h
> @@ -103,6 +103,8 @@ int xen_remap_domain_gfn_range(struct
> vm_area_struct *vma,
>  struct page **pages);
>  int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
>  int numpgs, struct page **pages);
> +
> +#ifdef CONFIG_XEN_AUTO_XLATE
>  int xen_xlate_remap_gfn_array(struct vm_area_struct *vma,
> unsigned long addr,
> xen_pfn_t *gfn, int nr,
> @@ -111,6 +113,28 @@ int xen_xlate_remap_gfn_array(struct
> vm_area_struct *vma,
> struct page **pages);
>  int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
> int nr, struct page **pages);
> +#else
> +/*
> + * These two functions are called from arch/x86/xen/mmu.c and so stubs
> + * are needed for a configuration not specifying
> CONFIG_XEN_AUTO_XLATE.
> + */
> +static inline int xen_xlate_remap_gfn_array(struct vm_area_struct *vma,
> + 

[Xen-devel] [PATCH v3] xen: support priv-mapping in an HVM tools domain

2017-11-03 Thread Paul Durrant
If the domain has XENFEAT_auto_translated_physmap then use of the PV-
specific HYPERVISOR_mmu_update hypercall is clearly incorrect.

This patch adds checks in xen_remap_domain_gfn_array() and
xen_unmap_domain_gfn_array() which call through to the appropriate
xlate_mmu function if the feature is present.

This patch also moves xen_remap_domain_gfn_range() into the PV-only MMU
code and #ifdefs the (only) calling code in privcmd accordingly.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>

v3:
 - As v1 but with additional stubs in xen/xen-ops.h to handle
   configurations without CONFIG_XEN_AUTO_XLATE.
---
 arch/x86/xen/mmu.c| 14 --
 include/xen/xen-ops.h | 24 
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 3e15345abfe7..d33e7dbe3129 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -172,6 +172,9 @@ int xen_remap_domain_gfn_range(struct vm_area_struct *vma,
   pgprot_t prot, unsigned domid,
   struct page **pages)
 {
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return -EOPNOTSUPP;
+
return do_remap_gfn(vma, addr, &gfn, nr, NULL, prot, domid, pages);
 }
 EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
@@ -182,6 +185,10 @@ int xen_remap_domain_gfn_array(struct vm_area_struct *vma,
   int *err_ptr, pgprot_t prot,
   unsigned domid, struct page **pages)
 {
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return xen_xlate_remap_gfn_array(vma, addr, gfn, nr, err_ptr,
+prot, domid, pages);
+
/* We BUG_ON because it's a programmer error to pass a NULL err_ptr,
 * and the consequences later is quite hard to detect what the actual
 * cause of "wrong memory was mapped in".
@@ -193,9 +200,12 @@ EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_array);
 
 /* Returns: 0 success */
 int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
-  int numpgs, struct page **pages)
+  int nr, struct page **pages)
 {
-   if (!pages || !xen_feature(XENFEAT_auto_translated_physmap))
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return xen_xlate_unmap_gfn_range(vma, nr, pages);
+
+   if (!pages)
return 0;
 
return -EINVAL;
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index 218e6aae5433..18b25631a113 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -103,6 +103,8 @@ int xen_remap_domain_gfn_range(struct vm_area_struct *vma,
   struct page **pages);
 int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
   int numpgs, struct page **pages);
+
+#ifdef CONFIG_XEN_AUTO_XLATE
 int xen_xlate_remap_gfn_array(struct vm_area_struct *vma,
  unsigned long addr,
  xen_pfn_t *gfn, int nr,
@@ -111,6 +113,28 @@ int xen_xlate_remap_gfn_array(struct vm_area_struct *vma,
  struct page **pages);
 int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
  int nr, struct page **pages);
+#else
+/*
+ * These two functions are called from arch/x86/xen/mmu.c and so stubs
+ * are needed for a configuration not specifying CONFIG_XEN_AUTO_XLATE.
+ */
+static inline int xen_xlate_remap_gfn_array(struct vm_area_struct *vma,
+   unsigned long addr,
+   xen_pfn_t *gfn, int nr,
+   int *err_ptr, pgprot_t prot,
+   unsigned int domid,
+   struct page **pages)
+{
+   return -EOPNOTSUPP;
+}
+
+static inline int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
+   int nr, struct page **pages)
+{
+   return -EOPNOTSUPP;
+}
+#endif
+
 int xen_xlate_map_ballooned_pages(xen_pfn_t **pfns, void **vaddr,
  unsigned long nr_grant_frames);
 
-- 
2.11.0




Re: [Xen-devel] Commit moratorium to staging

2017-11-02 Thread Paul Durrant
> -Original Message-
> From: Roger Pau Monne
> Sent: 02 November 2017 09:42
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Ian Jackson <ian.jack...@citrix.com>; Lars Kurth
> <lars.ku...@citrix.com>; Wei Liu <wei.l...@citrix.com>; Julien Grall
> <julien.gr...@linaro.org>; committ...@xenproject.org; xen-devel <xen-de...@lists.xenproject.org>
> Subject: Re: [Xen-devel] Commit moratorium to staging
> 
> On Thu, Nov 02, 2017 at 09:20:10AM +, Paul Durrant wrote:
> > > -Original Message-
> > > From: Roger Pau Monne
> > > Sent: 02 November 2017 09:15
> > > To: Roger Pau Monne <roger@citrix.com>
> > > Cc: Ian Jackson <ian.jack...@citrix.com>; Lars Kurth
> > > <lars.ku...@citrix.com>; Wei Liu <wei.l...@citrix.com>; Julien Grall
> > > <julien.gr...@linaro.org>; Paul Durrant <paul.durr...@citrix.com>;
> > > committ...@xenproject.org; xen-devel <xen-de...@lists.xenproject.org>
> > > Subject: Re: [Xen-devel] Commit moratorium to staging
> > >
> > > On Wed, Nov 01, 2017 at 04:17:10PM +, Roger Pau Monné wrote:
> > > > On Wed, Nov 01, 2017 at 02:07:48PM +, Ian Jackson wrote:
> > > > > * Affected hosts differ from unaffected hosts according to cpuid.
> > > > >   Roger has repro'd the bug on an unaffected host by masking out
> > > > >   certain cpuid bits.  There are 6 implicated bits and he is working
> > > > >   to narrow that down.
> > > >
> > > > I'm currently trying to narrow this down and make sure the above is
> > > > accurate.
> > >
> > > So I was wrong with this, I guess I've run the tests on the wrong
> > > host. Even when masking the different cpuid bits in the guest the
> > > tests still succeed.
> > >
> > > AFAICT the tests fail or succeed reliably depending on the host
> > > hardware. I don't really have many ideas about what to do next, but I
> > > think it would be useful to create a manual osstest flight that runs
> > > the win16 job in all the different hosts in the colo. I would also
> > > capture the normal information that Xen collects after each test (xl
> > > info, /proc/cpuid, serial logs...).
> > >
> > > Is there anything else not captured by ts-logs-capture that would be
> > > interesting in order to help debug the issue?
> >
> > Does the shutdown reliably complete prior to migrate and then only fail
> > intermittently after a localhost migrate?
> 
> AFAICT yes, but it can also be added to the test in order to be sure.
> 
> > It might be useful to know what cpuid info is seen by the guest before and
> > after migrate.
> 
> Is there any way to get that from windows in an automatic way? If not I
> could test that with a Debian guest. In fact it might even be a good
> thing for a Linux-based guest to be added to the regular migration tests
> in order to make sure cpuid bits don't change across migrations.
> 

I found this for windows:

https://www.cpuid.com/downloads/cpu-z/cpu-z_1.81-en.exe

It can generate a text or html report as well as being run interactively. But 
you may get more mileage from using a debian HVM guest. I guess it may also be 
useful if we can get a scan of available MSRs and content before and after
migrate too.

> > Another datapoint... does the shutdown fail if you insert a delay of a couple
> > of minutes between the migrate and the shutdown?
> 
> Sometimes, after a variable number of calls to xl shutdown ... the
> guest usually ends up shutting down.
> 

Hmm. I wonder whether the guest is actually healthy after the migrate. One 
could imagine a situation where the storage device model (IDE in our case I 
guess) gets stuck in some way but recovers after a timeout in the guest storage 
stack. Thus, if you happen to try to shut down while it is still stuck, Windows
starts trying to shut down but can't. Try after the timeout, though, and it can.
In the past we did make attempts to support Windows without PV drivers in 
XenServer but xenrt would never reliably pass VM lifecycle tests using emulated 
devices. That was with qemu trad, but I wonder whether upstream qemu is 
actually any better particularly if using older device models such as IDE and 
RTL8139 (which are probably largely unmodified from trad).

  Paul
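
On the before/after comparison: once the cpuid values have been captured by
whatever means (cpu-z report, a small tool in a Debian guest, ...), the check
itself is just an XOR of the registers per leaf. A minimal sketch, assuming
both captures list the same leaves in the same order; the struct and function
names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* One captured cpuid leaf; how the values are captured is outside this sketch. */
struct cpuid_leaf {
    uint32_t leaf;
    uint32_t eax, ebx, ecx, edx;
};

/*
 * Returns the number of leaves whose registers differ between two captures
 * of the same length and ordering. A mismatched leaf number also counts as
 * a difference.
 */
static size_t cpuid_count_changed(const struct cpuid_leaf *before,
                                  const struct cpuid_leaf *after,
                                  size_t n)
{
    size_t changed = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t diff = (before[i].eax ^ after[i].eax) |
                        (before[i].ebx ^ after[i].ebx) |
                        (before[i].ecx ^ after[i].ecx) |
                        (before[i].edx ^ after[i].edx);

        if (before[i].leaf != after[i].leaf || diff != 0)
            changed++;
    }
    return changed;
}
```

A non-zero result after a localhost migrate would be a strong hint that the
guest is seeing different features than it booted with.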

> Roger.



Re: [Xen-devel] [PATCH v2] xen: support priv-mapping in an HVM tools domain

2017-11-02 Thread Paul Durrant
> -Original Message-
> From: Boris Ostrovsky [mailto:boris.ostrov...@oracle.com]
> Sent: 01 November 2017 18:19
> To: Juergen Gross <jgr...@suse.com>; Paul Durrant
> <paul.durr...@citrix.com>; x...@kernel.org; xen-
> de...@lists.xenproject.org; linux-ker...@vger.kernel.org
> Cc: Thomas Gleixner <t...@linutronix.de>; Ingo Molnar
> <mi...@redhat.com>; H. Peter Anvin <h...@zytor.com>
> Subject: Re: [PATCH v2] xen: support priv-mapping in an HVM tools domain
> 
> On 11/01/2017 11:37 AM, Juergen Gross wrote:
> >
> > TBH I like V1 better, too.
> >
> > Boris, do you feel strongly about the #ifdef part?
> 
> Having looked at what this turned into I now like V1 better too ;-)
> 
> Sorry, Paul.

That's ok. Are you happy with v1 as-is or do you want me to submit a v3 with 
any tweaks?

  Paul

> 
> 
> -boris


Re: [Xen-devel] Commit moratorium to staging

2017-11-02 Thread Paul Durrant
> -Original Message-
> From: Roger Pau Monne
> Sent: 02 November 2017 09:15
> To: Roger Pau Monne <roger@citrix.com>
> Cc: Ian Jackson <ian.jack...@citrix.com>; Lars Kurth
> <lars.ku...@citrix.com>; Wei Liu <wei.l...@citrix.com>; Julien Grall
> <julien.gr...@linaro.org>; Paul Durrant <paul.durr...@citrix.com>;
> committ...@xenproject.org; xen-devel <xen-de...@lists.xenproject.org>
> Subject: Re: [Xen-devel] Commit moratorium to staging
> 
> On Wed, Nov 01, 2017 at 04:17:10PM +, Roger Pau Monné wrote:
> > On Wed, Nov 01, 2017 at 02:07:48PM +, Ian Jackson wrote:
> > > * Affected hosts differ from unaffected hosts according to cpuid.
> > >   Roger has repro'd the bug on an unaffected host by masking out
> > >   certain cpuid bits.  There are 6 implicated bits and he is working
> > >   to narrow that down.
> >
> > I'm currently trying to narrow this down and make sure the above is
> > accurate.
> 
> So I was wrong with this, I guess I've run the tests on the wrong
> host. Even when masking the different cpuid bits in the guest the
> tests still succeed.
> 
> AFAICT the tests fail or succeed reliably depending on the host
> hardware. I don't really have many ideas about what to do next, but I
> think it would be useful to create a manual osstest flight that runs
> the win16 job in all the different hosts in the colo. I would also
> capture the normal information that Xen collects after each test (xl
> info, /proc/cpuid, serial logs...).
> 
> Is there anything else not captured by ts-logs-capture that would be
> interesting in order to help debug the issue?

Does the shutdown reliably complete prior to migrate and then only fail 
intermittently after a localhost migrate? It might be useful to know what cpuid 
info is seen by the guest before and after migrate. Another datapoint... does 
the shutdown fail if you insert a delay of a couple of minutes between the 
migrate and the shutdown?

  Paul

> 
> Regards, Roger.



Re: [Xen-devel] [PATCH v3 for-next 4/4] xen: Convert __page_to_mfn and __mfn_to_page to use typesafe MFN

2017-11-01 Thread Paul Durrant
> -Original Message-
> From: Julien Grall [mailto:julien.gr...@linaro.org]
> Sent: 01 November 2017 14:03
> To: xen-devel@lists.xen.org
> Cc: Julien Grall <julien.gr...@linaro.org>; Stefano Stabellini
> <sstabell...@kernel.org>; Julien Grall <julien.gr...@arm.com>; Andrew
> Cooper <andrew.coop...@citrix.com>; George Dunlap
> <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Jan
> Beulich <jbeul...@suse.com>; Konrad Rzeszutek Wilk
> <konrad.w...@oracle.com>; Tim (Xen.org) <t...@xen.org>; Wei Liu
> <wei.l...@citrix.com>; Razvan Cojocaru <rcojoc...@bitdefender.com>;
> Tamas K Lengyel <ta...@tklengyel.com>; Paul Durrant
> <paul.durr...@citrix.com>; Boris Ostrovsky <boris.ostrov...@oracle.com>;
> Suravee Suthikulpanit <suravee.suthikulpa...@amd.com>; Jun Nakajima
> <jun.nakaj...@intel.com>; Kevin Tian <kevin.t...@intel.com>; George
> Dunlap <george.dun...@citrix.com>; Gang Wei <gang@intel.com>;
> Shane Wang <shane.w...@intel.com>
> Subject: [PATCH v3 for-next 4/4] xen: Convert __page_to_mfn and
> __mfn_to_page to use typesafe MFN
> 
> Most of the users of page_to_mfn and mfn_to_page are either overriding
> the macros to make them work with mfn_t or use mfn_x/_mfn because the
> rest of the function use mfn_t.
> 
> So make __page_to_mfn and __mfn_to_page return mfn_t by default.
> 
> Only reasonable clean-ups are done in this patch because it is
> already quite big. So some of the files now override page_to_mfn and
> mfn_to_page to avoid using mfn_t.
> 
> Lastly, domain_page_to_mfn is also converted to use mfn_t given that
> most of the callers are now switched to _mfn(domain_page_to_mfn(...)).
> 
> Signed-off-by: Julien Grall <julien.gr...@linaro.org>
> 

emulate bits...

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>
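
For anyone following along who hasn't met the typesafe wrappers, the idea is
roughly the following. This is a simplified illustration only: Xen generates
the real mfn_t, _mfn() and mfn_x() via a macro, and mfn_next() here is a
hypothetical helper added for the example.

```c
/* Simplified illustration of a typesafe frame number; not Xen's definition. */
typedef struct { unsigned long m; } mfn_t;

/* Wrap a raw frame number. */
static inline mfn_t _mfn(unsigned long m) { return (mfn_t){ m }; }

/* Unwrap back to the raw value. */
static inline unsigned long mfn_x(mfn_t mfn) { return mfn.m; }

/*
 * Hypothetical helper: because it takes mfn_t, a raw integer (or a different
 * wrapped type such as a guest frame number) cannot be passed by mistake
 * without the compiler complaining.
 */
static inline mfn_t mfn_next(mfn_t mfn) { return _mfn(mfn_x(mfn) + 1); }
```

Having __page_to_mfn and __mfn_to_page deal in the wrapped type by default
removes the need for callers to sprinkle _mfn()/mfn_x() conversions around.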

> ---
> 
> Andrew suggested to drop IS_VALID_PAGE in xen/tmem_xen.h. His comment
> was:
> 
> "/sigh  This is tautological.  The definition of a "valid mfn" in this
> case is one for which we have frametable entry, and by having a struct
> page_info in our hands, this is by definition true (unless you have a
> wild pointer, at which point your bug is elsewhere).
> 
> IS_VALID_PAGE() is only ever used in assertions and never usefully, so
> instead I would remove it entirely rather than trying to fix it up."
> 
> I can remove the function in a separate patch at the beginning of the
> series if Konrad (TMEM maintainer) is happy with that.
> 
> Cc: Stefano Stabellini <sstabell...@kernel.org>
> Cc: Julien Grall <julien.gr...@arm.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>
> Cc: George Dunlap <george.dun...@eu.citrix.com>
> Cc: Ian Jackson <ian.jack...@eu.citrix.com>
> Cc: Jan Beulich <jbeul...@suse.com>
> Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
> Cc: Tim Deegan <t...@xen.org>
> Cc: Wei Liu <wei.l...@citrix.com>
> Cc: Razvan Cojocaru <rcojoc...@bitdefender.com>
> Cc: Tamas K Lengyel <ta...@tklengyel.com>
> Cc: Paul Durrant <paul.durr...@citrix.com>
> Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpa...@amd.com>
> Cc: Jun Nakajima <jun.nakaj...@intel.com>
> Cc: Kevin Tian <kevin.t...@intel.com>
> Cc: George Dunlap <george.dun...@eu.citrix.com>
> Cc: Gang Wei <gang@intel.com>
> Cc: Shane Wang <shane.w...@intel.com>
> 
> Changes in v3:
> - Rebase on the latest staging and fix some conflicts. Tags
> haven't been retained.
> - Switch the printf format to PRI_mfn
> 
> Changes in v2:
> - Some parts have been moved into a separate patch
> - Remove one spurious comment
> - Convert domain_page_to_mfn to use mfn_t
> ---
>  xen/arch/arm/domain_build.c |  2 --
>  xen/arch/arm/kernel.c   |  2 +-
>  xen/arch/arm/mem_access.c   |  2 +-
>  xen/arch/arm/mm.c   |  8 
>  xen/arch/arm/p2m.c  | 10 ++
>  xen/arch/x86/cpu/vpmu.c |  4 ++--
>  xen/arch/x86/domain.c   | 21 +++--
>  xen/arch/x86/domain_page.c  |  6 +++---
>  xen/arch/x86/domctl.c   |  2 +-
>  xen/arch/x86/hvm/dm.c   |  2 +-
>  xen/arch/x86/hvm/dom0_build.c   |  6 +++---
>  xen/arch/x86/hvm/emulate.c  |  6 +++---
>  xen/arch/x86/hvm/hvm.c  | 16 
>  xen/arch/x86/hvm/ioreq.c|  6 +++---
>  xen/arch/x86/hvm/stdvga.c  

Re: [Xen-devel] [PATCH v2] xen: support priv-mapping in an HVM tools domain

2017-11-01 Thread Paul Durrant
> -Original Message-
> From: Juergen Gross [mailto:jgr...@suse.com]
> Sent: 01 November 2017 13:40
> To: Paul Durrant <paul.durr...@citrix.com>; x...@kernel.org; xen-
> de...@lists.xenproject.org; linux-ker...@vger.kernel.org
> Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>; Thomas Gleixner
> <t...@linutronix.de>; Ingo Molnar <mi...@redhat.com>; H. Peter Anvin
> <h...@zytor.com>
> Subject: Re: [PATCH v2] xen: support priv-mapping in an HVM tools domain
> 
> On 01/11/17 12:31, Paul Durrant wrote:
> > If the domain has XENFEAT_auto_translated_physmap then use of the PV-
> > specific HYPERVISOR_mmu_update hypercall is clearly incorrect.
> >
> > This patch adds checks in xen_remap_domain_gfn_array() and
> > xen_unmap_domain_gfn_array() which call through to the appropriate
> > xlate_mmu function if the feature is present.
> >
> > This patch also moves xen_remap_domain_gfn_range() into the PV-only
> MMU
> > code and #ifdefs the (only) calling code in privcmd accordingly.
> >
> > Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> > ---
> > Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
> > Cc: Juergen Gross <jgr...@suse.com>
> > Cc: Thomas Gleixner <t...@linutronix.de>
> > Cc: Ingo Molnar <mi...@redhat.com>
> > Cc: "H. Peter Anvin" <h...@zytor.com>
> > ---
> >  arch/x86/xen/mmu.c| 36 +---
> >  arch/x86/xen/mmu_pv.c | 11 +++
> >  drivers/xen/privcmd.c | 17 +
> >  include/xen/xen-ops.h |  7 +++
> >  4 files changed, 48 insertions(+), 23 deletions(-)
> >
> > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> > index 3e15345abfe7..01837c36e293 100644
> > --- a/arch/x86/xen/mmu.c
> > +++ b/arch/x86/xen/mmu.c
> > @@ -91,12 +91,12 @@ static int remap_area_mfn_pte_fn(pte_t *ptep,
> pgtable_t token,
> > return 0;
> >  }
> >
> > -static int do_remap_gfn(struct vm_area_struct *vma,
> > -   unsigned long addr,
> > -   xen_pfn_t *gfn, int nr,
> > -   int *err_ptr, pgprot_t prot,
> > -   unsigned domid,
> > -   struct page **pages)
> > +int xen_remap_gfn(struct vm_area_struct *vma,
> > + unsigned long addr,
> > + xen_pfn_t *gfn, int nr,
> > + int *err_ptr, pgprot_t prot,
> > + unsigned int domid,
> > + struct page **pages)
> >  {
> > int err = 0;
> > struct remap_data rmd;
> > @@ -166,36 +166,34 @@ static int do_remap_gfn(struct vm_area_struct
> *vma,
> > return err < 0 ? err : mapped;
> >  }
> >
> > -int xen_remap_domain_gfn_range(struct vm_area_struct *vma,
> > -  unsigned long addr,
> > -  xen_pfn_t gfn, int nr,
> > -  pgprot_t prot, unsigned domid,
> > -  struct page **pages)
> > -{
> > -   return do_remap_gfn(vma, addr, &gfn, nr, NULL, prot, domid,
> pages);
> > -}
> > -EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
> > -
> >  int xen_remap_domain_gfn_array(struct vm_area_struct *vma,
> >unsigned long addr,
> >xen_pfn_t *gfn, int nr,
> >int *err_ptr, pgprot_t prot,
> >unsigned domid, struct page **pages)
> >  {
> > +   if (xen_feature(XENFEAT_auto_translated_physmap))
> > +   return xen_xlate_remap_gfn_array(vma, addr, gfn, nr,
> err_ptr,
> > +prot, domid, pages);
> > +
> > /* We BUG_ON because it's a programmer error to pass a NULL
> err_ptr,
> >  * and the consequences later is quite hard to detect what the actual
> >  * cause of "wrong memory was mapped in".
> >  */
> > BUG_ON(err_ptr == NULL);
> > -   return do_remap_gfn(vma, addr, gfn, nr, err_ptr, prot, domid,
> pages);
> > +   return xen_remap_gfn(vma, addr, gfn, nr, err_ptr, prot, domid,
> > +pages);
> >  }
> >  EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_array);
> >
> >  /* Returns: 0 success */
> >  int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
> > -  int numpgs, struct page **pages)
> > +  int nr, struct page **pages)
> >  {
> > -   if (!pages || !xen_feature(XENFEAT_auto_translated_physmap))
> > +   if (xen

[Xen-devel] [PATCH v2] xen: support priv-mapping in an HVM tools domain

2017-11-01 Thread Paul Durrant
If the domain has XENFEAT_auto_translated_physmap then use of the PV-
specific HYPERVISOR_mmu_update hypercall is clearly incorrect.

This patch adds checks in xen_remap_domain_gfn_array() and
xen_unmap_domain_gfn_array() which call through to the appropriate
xlate_mmu function if the feature is present.

This patch also moves xen_remap_domain_gfn_range() into the PV-only MMU
code and #ifdefs the (only) calling code in privcmd accordingly.
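
The dispatch described above can be sketched in isolation. This is a minimal, self-contained model and not the kernel code: auto_translated stands in for xen_feature(XENFEAT_auto_translated_physmap), and toy_xlate_remap(), toy_pv_remap() and toy_remap_gfn_array() are hypothetical names that exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for xen_feature(XENFEAT_auto_translated_physmap). */
static bool auto_translated;

/* Tags returned by the toy back ends so the chosen path is observable. */
enum toy_path { TOY_PATH_XLATE = 1, TOY_PATH_PV = 2 };

/* Hypothetical stand-in for the xlate_mmu implementation. */
static int toy_xlate_remap(const unsigned long *gfn, int nr, int *err_ptr)
{
    (void)gfn; (void)nr; (void)err_ptr;
    return TOY_PATH_XLATE;
}

/* Hypothetical stand-in for the PV-only path (HYPERVISOR_mmu_update). */
static int toy_pv_remap(const unsigned long *gfn, int nr, int *err_ptr)
{
    (void)gfn; (void)nr; (void)err_ptr;
    return TOY_PATH_PV;
}

/* An auto-translated domain must never reach the PV-only path. */
static int toy_remap_gfn_array(const unsigned long *gfn, int nr, int *err_ptr)
{
    if (auto_translated)
        return toy_xlate_remap(gfn, nr, err_ptr);

    /* Models the BUG_ON in the patch: a NULL err_ptr is a caller bug. */
    assert(err_ptr != NULL);
    return toy_pv_remap(gfn, nr, err_ptr);
}
```

Setting auto_translated to true routes every call to the xlate stand-in; with it false, the PV stand-in is used and the err_ptr check applies.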

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
---
 arch/x86/xen/mmu.c| 36 +---
 arch/x86/xen/mmu_pv.c | 11 +++
 drivers/xen/privcmd.c | 17 +
 include/xen/xen-ops.h |  7 +++
 4 files changed, 48 insertions(+), 23 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 3e15345abfe7..01837c36e293 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -91,12 +91,12 @@ static int remap_area_mfn_pte_fn(pte_t *ptep, pgtable_t 
token,
return 0;
 }
 
-static int do_remap_gfn(struct vm_area_struct *vma,
-   unsigned long addr,
-   xen_pfn_t *gfn, int nr,
-   int *err_ptr, pgprot_t prot,
-   unsigned domid,
-   struct page **pages)
+int xen_remap_gfn(struct vm_area_struct *vma,
+ unsigned long addr,
+ xen_pfn_t *gfn, int nr,
+ int *err_ptr, pgprot_t prot,
+ unsigned int domid,
+ struct page **pages)
 {
int err = 0;
struct remap_data rmd;
@@ -166,36 +166,34 @@ static int do_remap_gfn(struct vm_area_struct *vma,
return err < 0 ? err : mapped;
 }
 
-int xen_remap_domain_gfn_range(struct vm_area_struct *vma,
-  unsigned long addr,
-  xen_pfn_t gfn, int nr,
-  pgprot_t prot, unsigned domid,
-  struct page **pages)
-{
-   return do_remap_gfn(vma, addr, &gfn, nr, NULL, prot, domid, pages);
-}
-EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
-
 int xen_remap_domain_gfn_array(struct vm_area_struct *vma,
   unsigned long addr,
   xen_pfn_t *gfn, int nr,
   int *err_ptr, pgprot_t prot,
   unsigned domid, struct page **pages)
 {
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return xen_xlate_remap_gfn_array(vma, addr, gfn, nr, err_ptr,
+prot, domid, pages);
+
/* We BUG_ON because it's a programmer error to pass a NULL err_ptr,
 * and the consequences later is quite hard to detect what the actual
 * cause of "wrong memory was mapped in".
 */
BUG_ON(err_ptr == NULL);
-   return do_remap_gfn(vma, addr, gfn, nr, err_ptr, prot, domid, pages);
+   return xen_remap_gfn(vma, addr, gfn, nr, err_ptr, prot, domid,
+pages);
 }
 EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_array);
 
 /* Returns: 0 success */
 int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
-  int numpgs, struct page **pages)
+  int nr, struct page **pages)
 {
-   if (!pages || !xen_feature(XENFEAT_auto_translated_physmap))
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return xen_xlate_unmap_gfn_range(vma, nr, pages);
+
+   if (!pages)
return 0;
 
return -EINVAL;
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 71495f1a86d7..4974d8a6c2b4 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -2670,3 +2670,14 @@ phys_addr_t paddr_vmcoreinfo_note(void)
return __pa(vmcoreinfo_note);
 }
 #endif /* CONFIG_KEXEC_CORE */
+
+int xen_remap_domain_gfn_range(struct vm_area_struct *vma,
+  unsigned long addr,
+  xen_pfn_t gfn, int nr,
+  pgprot_t prot, unsigned int domid,
+  struct page **pages)
+{
+   return xen_remap_gfn(vma, addr, &gfn, nr, NULL, prot, domid,
+pages);
+}
+EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index feca75b07fdd..b58a1719b606 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -215,6 +215,8 @@ static int traverse_pages_block(unsigned nelem, size_t size,
return ret;
 }
 
+#ifdef CONFIG_XEN_PV
+
 struct mmap_gfn_state {
unsigned long va;
struct vm_area_struct *vma;
@@ -261,10 +263,6 @@ static lon

Re: [Xen-devel] Commit moratorium to staging

2017-11-01 Thread Paul Durrant
> -Original Message-
> From: Wei Liu [mailto:wei.l...@citrix.com]
> Sent: 01 November 2017 10:48
> To: Roger Pau Monne <roger@citrix.com>
> Cc: Julien Grall <julien.gr...@linaro.org>; committ...@xenproject.org; xen-
> devel <xen-de...@lists.xenproject.org>; Lars Kurth <lars.ku...@citrix.com>;
> Paul Durrant <paul.durr...@citrix.com>; Wei Liu <wei.l...@citrix.com>
> Subject: Re: Commit moratorium to staging
> 
> On Tue, Oct 31, 2017 at 04:52:37PM +, Roger Pau Monné wrote:
> >
> > I have to admit I have no idea why Windows clears the STS power bit
> > and then completely ignores it on certain occasions.
> >
> > I'm also afraid I have no idea how to debug Windows in order to know
> > why this event is acknowledged but ignored.
> >
> > I've also tried to reproduce the same with a Debian guest, by doing
> > the same amount of save/restores and migrations, and finally issuing a
> > xl trigger  power, but Debian has always worked fine and
> > shut down.
> >
> > Any comments are welcome.
> 
> After googling around, some articles suggest Windows can ignore ACPI
> events under certain circumstances. Is it worth checking in the Windows
> event log to see if an event is received but ignored for reason X?

Dumping the event logs would definitely be a useful thing to do.

> 
> For Windows Server 2012:
> https://serverfault.com/questions/534042/windows-2012-how-to-make-
> power-button-work-in-every-cases
> 
> Can't find anything for Windows Server 2016.

No, I couldn't either. I did find 
https://ethertubes.com/unattended-acpi-shutdown-of-windows-server/ too which 
seems to have some potentially useful suggestions.

  Paul



[Xen-devel] [PATCH v13 11/11] tools/libxenctrl: use new xenforeignmemory API to seed grant table

2017-10-30 Thread Paul Durrant
A previous patch added support for priv-mapping guest resources directly
(rather than having to foreign-map, which requires P2M modification for
HVM guests).

This patch makes use of the new API to seed the guest grant table unless
the underlying infrastructure (i.e. privcmd) doesn't support it, in which
case the old scheme is used.

NOTE: The call to xc_dom_gnttab_hvm_seed() in hvm_build_set_params() was
  actually unnecessary, as the grant table has already been seeded
  by a prior call to xc_dom_gnttab_init() made by libxl__build_dom().
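
The "use the new API unless the infrastructure doesn't support it" logic can be sketched as a try-then-fall-back pattern. This is a toy model under stated assumptions: toy_map_resource(), toy_set_entry() and toy_seed() are hypothetical names, and using errno == EOPNOTSUPP as the "not supported" signal is an assumption of this sketch, not something shown in the excerpt above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_ENTRIES 4

/* Toy grant entry: just enough fields to observe what was written. */
struct toy_entry { unsigned int domid; unsigned long frame; };

static struct toy_entry toy_table[NR_ENTRIES];

/* Hypothetical switch modelling whether direct resource mapping works. */
static bool toy_resource_mapping_supported;

/* Hypothetical new-style mapping call; fails with EOPNOTSUPP if absent. */
static struct toy_entry *toy_map_resource(void)
{
    if (!toy_resource_mapping_supported) {
        errno = EOPNOTSUPP;
        return NULL;
    }
    return toy_table;
}

/* Shared helper, in the spirit of xc_dom_set_gnttab_entry(). */
static void toy_set_entry(struct toy_entry *tab, unsigned int idx,
                          unsigned int guest_domid,
                          unsigned int backend_domid, unsigned long frame)
{
    assert(idx < NR_ENTRIES);

    /* No grant is needed when the guest is its own backend. */
    if (guest_domid == backend_domid)
        return;

    tab[idx].domid = backend_domid;
    tab[idx].frame = frame;
}

enum seed_path { SEED_FAILED = -1, SEED_DIRECT, SEED_COMPAT };

static enum seed_path toy_seed(unsigned int guest, unsigned int backend,
                               unsigned long frame)
{
    struct toy_entry *tab = toy_map_resource();

    if (tab != NULL) {
        toy_set_entry(tab, 0, guest, backend, frame);
        return SEED_DIRECT;
    }

    /* Only "not supported" triggers the old scheme; other errors fail. */
    if (errno != EOPNOTSUPP)
        return SEED_FAILED;

    toy_set_entry(toy_table, 0, guest, backend, frame); /* old-scheme stand-in */
    return SEED_COMPAT;
}
```

Both paths go through the same entry-writing helper, so the fallback differs only in how the table is reached.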

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marma...@invisiblethingslab.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
---
Cc: Ian Jackson <ian.jack...@eu.citrix.com>

v13:
 - Re-base.

v10:
 - Use new id constant for grant table.

v4:
 - Minor cosmetic fix suggested by Roger.

v3:
 - Introduced xc_dom_set_gnttab_entry() to avoid duplicated code.
---
 tools/libxc/include/xc_dom.h|   8 +--
 tools/libxc/xc_dom_boot.c   | 114 +---
 tools/libxc/xc_sr_restore_x86_hvm.c |  10 ++--
 tools/libxc/xc_sr_restore_x86_pv.c  |   2 +-
 tools/libxl/libxl_dom.c |   1 -
 tools/python/xen/lowlevel/xc/xc.c   |   6 +-
 6 files changed, 92 insertions(+), 49 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index cdcdd07d2b..45c9d676c7 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -325,12 +325,8 @@ void *xc_dom_boot_domU_map(struct xc_dom_image *dom, 
xen_pfn_t pfn,
 int xc_dom_boot_image(struct xc_dom_image *dom);
 int xc_dom_compat_check(struct xc_dom_image *dom);
 int xc_dom_gnttab_init(struct xc_dom_image *dom);
-int xc_dom_gnttab_hvm_seed(xc_interface *xch, uint32_t domid,
-   xen_pfn_t console_gmfn,
-   xen_pfn_t xenstore_gmfn,
-   uint32_t console_domid,
-   uint32_t xenstore_domid);
-int xc_dom_gnttab_seed(xc_interface *xch, uint32_t domid,
+int xc_dom_gnttab_seed(xc_interface *xch, uint32_t guest_domid,
+   bool is_hvm,
xen_pfn_t console_gmfn,
xen_pfn_t xenstore_gmfn,
uint32_t console_domid,
diff --git a/tools/libxc/xc_dom_boot.c b/tools/libxc/xc_dom_boot.c
index 40eb5185a9..01e9a1e185 100644
--- a/tools/libxc/xc_dom_boot.c
+++ b/tools/libxc/xc_dom_boot.c
@@ -282,11 +282,29 @@ static xen_pfn_t xc_dom_gnttab_setup(xc_interface *xch, 
uint32_t domid)
 return gmfn;
 }
 
-int xc_dom_gnttab_seed(xc_interface *xch, uint32_t domid,
-   xen_pfn_t console_gmfn,
-   xen_pfn_t xenstore_gmfn,
-   uint32_t console_domid,
-   uint32_t xenstore_domid)
+static void xc_dom_set_gnttab_entry(xc_interface *xch,
+grant_entry_v1_t *gnttab,
+unsigned int idx,
+uint32_t guest_domid,
+uint32_t backend_domid,
+xen_pfn_t backend_gmfn)
+{
+if ( guest_domid == backend_domid || backend_gmfn == -1)
+return;
+
+xc_dom_printf(xch, "%s: [%u] -> 0x%"PRI_xen_pfn,
+  __FUNCTION__, idx, backend_gmfn);
+
+gnttab[idx].flags = GTF_permit_access;
+gnttab[idx].domid = backend_domid;
+gnttab[idx].frame = backend_gmfn;
+}
+
+static int compat_gnttab_seed(xc_interface *xch, uint32_t domid,
+  xen_pfn_t console_gmfn,
+  xen_pfn_t xenstore_gmfn,
+  uint32_t console_domid,
+  uint32_t xenstore_domid)
 {
 
 xen_pfn_t gnttab_gmfn;
@@ -310,18 +328,10 @@ int xc_dom_gnttab_seed(xc_interface *xch, uint32_t domid,
 return -1;
 }
 
-if ( domid != console_domid  && console_gmfn != -1)
-{
-gnttab[GNTTAB_RESERVED_CONSOLE].flags = GTF_permit_access;
-gnttab[GNTTAB_RESERVED_CONSOLE].domid = console_domid;
-gnttab[GNTTAB_RESERVED_CONSOLE].frame = console_gmfn;
-}
-if ( domid != xenstore_domid && xenstore_gmfn != -1)
-{
-gnttab[GNTTAB_RESERVED_XENSTORE].flags = GTF_permit_access;
-gnttab[GNTTAB_RESERVED_XENSTORE].domid = xenstore_domid;
-gnttab[GNTTAB_RESERVED_XENSTORE].frame = xenstore_gmfn;
-}
+xc_dom_set_gnttab_entry(xch, gnttab, GNTTAB_RESERVED_CONSOLE,
+domid, console_domid, console_gmfn);
+xc_dom_set_gnttab_entry(xch, gnttab, GNTTAB_RESERVED_XENSTORE,
+domid, xenstore_domid, xenstore_gmfn);
 
 if ( munmap(gnttab, PAGE_SIZE) == -1 )
 {
@@ -339,11 +349,11 @@ int xc_dom_gnttab_seed(

[Xen-devel] [PATCH v13 10/11] common: add a new mappable resource type: XENMEM_resource_grant_table

2017-10-30 Thread Paul Durrant
This patch allows grant table frames to be mapped using the
XENMEM_acquire_resource memory op.

NOTE: This patch expands the on-stack mfn_list array in acquire_resource()
  but it is still small enough to remain on-stack.
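
The internal re-work in this patch separates "which kind of frame" from "which index", so that only the legacy entry point has to decode the status flag from the index. A minimal sketch of that split follows. It is a toy model: struct toy_gnttab and the toy_* functions are hypothetical, table growth is left out, and the value of MAPIDX_STATUS_FLAG is assumed for illustration rather than taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumed flag bit, for illustration only. */
#define MAPIDX_STATUS_FLAG 0x80000000UL

/* Toy grant table: frame counts only, no real frames, no growth. */
struct toy_gnttab {
    unsigned int version;
    unsigned long nr_grant_frames;
    unsigned long nr_status_frames;
};

/* Inner helper: takes an explicit is_status argument, never the flag bit. */
static int toy_get_frame(const struct toy_gnttab *gt, bool is_status,
                         unsigned long idx)
{
    assert(!(idx & MAPIDX_STATUS_FLAG));

    if (is_status) {
        /* Status frames only exist for version 2 tables. */
        if (gt->version != 2)
            return -EINVAL;
        return idx < gt->nr_status_frames ? 0 : -EINVAL;
    }

    return idx < gt->nr_grant_frames ? 0 : -EINVAL;
}

/* Legacy entry point: the only place that decodes the flag. */
static int toy_map_frame(const struct toy_gnttab *gt, unsigned long idx)
{
    bool is_status = false;

    if (idx & MAPIDX_STATUS_FLAG) {
        is_status = true;
        idx &= ~MAPIDX_STATUS_FLAG;
    }

    return toy_get_frame(gt, is_status, idx);
}
```

A new caller can pass is_status directly to the inner helper, which is the role gnttab_get_grant_frame() and gnttab_get_status_frame() play in the diff below.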

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: Wei Liu <wei.l...@citrix.com>

v13:
 - Re-work the internals to avoid using the XENMAPIDX_grant_table_status
   hack.

v12:
 - Dropped limit checks as requested by Jan.

v10:
 - Addressed comments from Jan.

v8:
 - The functionality was originally incorporated into the earlier patch
   "x86/mm: add HYPERVISOR_memory_op to acquire guest resources".
---
 xen/common/grant_table.c  | 63 +--
 xen/common/memory.c   | 45 ++-
 xen/include/public/memory.h   |  6 +
 xen/include/xen/grant_table.h |  4 +++
 4 files changed, 109 insertions(+), 9 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 0558de9ce8..429c421cd9 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3761,21 +3761,21 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, 
grant_ref_t ref,
 }
 #endif
 
-int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
- mfn_t *mfn)
+/* Caller must hold write lock as version may change and table may grow */
+static int gnttab_get_frame(struct domain *d, bool is_status,
+unsigned long idx, mfn_t *mfn)
 {
-int rc = 0;
 struct grant_table *gt = d->grant_table;
-
-grant_write_lock(gt);
+int rc = 0;
 
 if ( gt->gt_version == 0 )
 gt->gt_version = 1;
 
-if ( gt->gt_version == 2 &&
- (idx & XENMAPIDX_grant_table_status) )
+if ( is_status )
 {
-idx &= ~XENMAPIDX_grant_table_status;
+if ( gt->gt_version != 2 )
+return -EINVAL;
+
 if ( idx < nr_status_frames(gt) )
 *mfn = _mfn(virt_to_mfn(gt->status[idx]));
 else
@@ -3792,6 +3792,25 @@ int gnttab_map_frame(struct domain *d, unsigned long 
idx, gfn_t gfn,
 rc = -EINVAL;
 }
 
+return rc;
+}
+
+int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
+ mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+bool is_status = false;
+int rc;
+
+grant_write_lock(gt);
+
+if ( idx & XENMAPIDX_grant_table_status )
+{
+is_status = true;
+idx &= ~XENMAPIDX_grant_table_status;
+}
+
+rc = gnttab_get_frame(d, is_status, idx, mfn);
 if ( !rc )
 gnttab_set_frame_gfn(gt, idx, gfn);
 
@@ -3800,6 +3819,34 @@ int gnttab_map_frame(struct domain *d, unsigned long 
idx, gfn_t gfn,
 return rc;
 }
 
+int gnttab_get_grant_frame(struct domain *d, unsigned long idx,
+   mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+int rc;
+
+/* write lock required as version may change and/or table may grow */
+grant_write_lock(gt);
+rc = gnttab_get_frame(d, false, idx, mfn);
+grant_write_unlock(gt);
+
+return rc;
+}
+
+int gnttab_get_status_frame(struct domain *d, unsigned long idx,
+mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+int rc;
+
+/* write lock required as version may change and/or table may grow */
+grant_write_lock(gt);
+rc = gnttab_get_frame(d, true, idx, mfn);
+grant_write_unlock(gt);
+
+return rc;
+}
+
 static void gnttab_usage_print(struct domain *rd)
 {
 int first = 1;
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 1c6932fd85..8097d85be3 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -965,6 +966,43 @@ static long xatp_permission_check(struct domain *d, 
unsigned int space)
 return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
 }
 
+static int acquire_grant_table(struct domain *d, unsigned int id,
+   unsigned long frame,
+   unsigned int nr_frames,
+   xen_pfn_t mfn_list[])
+{
+unsigned int i = nr_frames;
+
+/* Iterate backwards in case table needs to grow */
+while ( i-- != 0 )
+{
+mfn_t mfn = INVALID_MFN;
+int rc;
+
+switch ( id )
+{
+case XENMEM_resource_grant_table_id_grant:
+rc = gnttab_get_grant_frame(d, frame + i, &mfn);
+break;
+
+

[Xen-devel] [PATCH v13 02/11] x86/hvm/ioreq: simplify code and use consistent naming

2017-10-30 Thread Paul Durrant
This patch re-works much of the ioreq server initialization and teardown
code:

- The hvm_map/unmap_ioreq_gfn() functions are expanded to call through
  to hvm_alloc/free_ioreq_gfn() rather than expecting them to be called
  separately by outer functions.
- Several functions now test the validity of the hvm_ioreq_page gfn value
  to determine whether they need to act. This means they can be safely
  called for the bufioreq page even when it is not used.
- hvm_add/remove_ioreq_gfn() simply return in the case of the default
  IOREQ server so callers no longer need to test before calling.
- hvm_ioreq_server_setup_pages() is renamed to hvm_ioreq_server_map_pages()
  to mirror the existing hvm_ioreq_server_unmap_pages().

All of this significantly shortens the code.
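
Two of the ideas above, an allocator that returns a sentinel instead of an error code and an unmap that checks the gfn to decide whether to act, can be sketched together. This is a simplified, non-atomic model: the patch uses test_and_clear_bit()/set_bit(), whereas the toy below uses plain bit operations, and TOY_INVALID_GFN, struct toy_gfn_pool and struct toy_page_slot are hypothetical names.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical sentinel playing the role of gfn_x(INVALID_GFN). */
#define TOY_INVALID_GFN (~0UL)

/* A set bit in mask means "gfn base + bit index is free". */
struct toy_gfn_pool { unsigned long base; unsigned long mask; };

/* Per-page state; gfn == TOY_INVALID_GFN means "nothing mapped". */
struct toy_page_slot { unsigned long gfn; };

/* Returns a free gfn, or the sentinel when the pool is exhausted. */
static unsigned long toy_alloc_gfn(struct toy_gfn_pool *p)
{
    for (unsigned int i = 0; i < sizeof(p->mask) * CHAR_BIT; i++) {
        unsigned long bit = 1UL << i;

        if (p->mask & bit) {
            p->mask &= ~bit;
            return p->base + i;
        }
    }

    return TOY_INVALID_GFN;
}

static void toy_free_gfn(struct toy_gfn_pool *p, unsigned long gfn)
{
    assert(gfn != TOY_INVALID_GFN);
    p->mask |= 1UL << (gfn - p->base);
}

/* Safe to call on an unused slot: the gfn check makes it a no-op. */
static void toy_unmap_slot(struct toy_gfn_pool *p, struct toy_page_slot *s)
{
    if (s->gfn == TOY_INVALID_GFN)
        return;

    toy_free_gfn(p, s->gfn);
    s->gfn = TOY_INVALID_GFN;
}
```

Because the unmap is idempotent, error paths can call it unconditionally, which is one way callers end up shorter.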

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Reviewed-by: Wei Liu <wei.l...@citrix.com>
Acked-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>

v3:
 - Rebased on top of 's->is_default' to 'IS_DEFAULT(s)' changes.
 - Minor updates in response to review comments from Roger.
---
 xen/arch/x86/hvm/ioreq.c | 182 ++-
 1 file changed, 69 insertions(+), 113 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index da31918bb1..c21fa9f280 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -210,63 +210,75 @@ bool handle_hvm_io_completion(struct vcpu *v)
 return true;
 }
 
-static int hvm_alloc_ioreq_gfn(struct domain *d, unsigned long *gfn)
+static unsigned long hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
 {
+struct domain *d = s->domain;
 unsigned int i;
-int rc;
 
-rc = -ENOMEM;
+ASSERT(!IS_DEFAULT(s));
+
 for ( i = 0; i < sizeof(d->arch.hvm_domain.ioreq_gfn.mask) * 8; i++ )
 {
 if ( test_and_clear_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask) )
-{
-*gfn = d->arch.hvm_domain.ioreq_gfn.base + i;
-rc = 0;
-break;
-}
+return d->arch.hvm_domain.ioreq_gfn.base + i;
 }
 
-return rc;
+return gfn_x(INVALID_GFN);
 }
 
-static void hvm_free_ioreq_gfn(struct domain *d, unsigned long gfn)
+static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s,
+   unsigned long gfn)
 {
+struct domain *d = s->domain;
 unsigned int i = gfn - d->arch.hvm_domain.ioreq_gfn.base;
 
-if ( gfn != gfn_x(INVALID_GFN) )
-set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
+ASSERT(!IS_DEFAULT(s));
+ASSERT(gfn != gfn_x(INVALID_GFN));
+
+set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 }
 
-static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool buf)
+static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
+if ( iorp->gfn == gfn_x(INVALID_GFN) )
+return;
+
 destroy_ring_for_helper(&iorp->va, iorp->page);
+iorp->page = NULL;
+
+if ( !IS_DEFAULT(s) )
+hvm_free_ioreq_gfn(s, iorp->gfn);
+
+iorp->gfn = gfn_x(INVALID_GFN);
 }
 
-static int hvm_map_ioreq_page(
-struct hvm_ioreq_server *s, bool buf, unsigned long gfn)
+static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
 struct domain *d = s->domain;
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
-struct page_info *page;
-void *va;
 int rc;
 
-if ( (rc = prepare_ring_for_helper(d, gfn, &page, &va)) )
-return rc;
-
-if ( (iorp->va != NULL) || d->is_dying )
-{
-destroy_ring_for_helper(&va, page);
+if ( d->is_dying )
 return -EINVAL;
-}
 
-iorp->va = va;
-iorp->page = page;
-iorp->gfn = gfn;
+if ( IS_DEFAULT(s) )
+iorp->gfn = buf ?
+d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
+d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
+else
+iorp->gfn = hvm_alloc_ioreq_gfn(s);
 
-return 0;
+if ( iorp->gfn == gfn_x(INVALID_GFN) )
+return -ENOMEM;
+
+rc = prepare_ring_for_helper(d, iorp->gfn, &iorp->page, &iorp->va);
+
+if ( rc )
+hvm_unmap_ioreq_gfn(s, buf);
+
+return rc;
 }
 
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
@@ -279,8 +291,7 @@ bool is_ioreq_server_page(struct domain *d, const struct 
page_info *page)
 
 FOR_EACH_IOREQ_SERVER(d, id, s)
 {
-if ( (s->ioreq.va && s->ioreq.page == page) ||
- (s->bufioreq.va && s->bufioreq.page == page) )
+if ( (s->ioreq.page == page) || (s->bufioreq.page == page) )
 {
 found = true;
 break;
@@ -292,20 +303,30 @@ bool is_ioreq_server_page(struct domain *d, const struct 
page_info *page)
 retu

[Xen-devel] [PATCH v13 04/11] x86/hvm/ioreq: defer mapping gfns until they are actually requested

2017-10-30 Thread Paul Durrant
A subsequent patch will introduce a new scheme to allow an emulator to
map ioreq server pages directly from Xen rather than the guest P2M.

This patch lays the groundwork for that change by deferring mapping of
gfns until their values are requested by an emulator. To that end, the
pad field of the xen_dm_op_get_ioreq_server_info structure is re-purposed
to a flags field and a new flag, XEN_DMOP_no_gfns, is defined which modifies the
behaviour of XEN_DMOP_get_ioreq_server_info to allow the caller to avoid
requesting the gfn values.
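
The flag handling described above can be sketched as follows. This is a toy model, not the hypervisor code: TOY_NO_GFNS, struct toy_server_info and the toy_* functions are hypothetical, the returned gfn and port values are made up, and toy_map_count exists only to make the deferred mapping observable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value standing in for XEN_DMOP_no_gfns. */
#define TOY_NO_GFNS (1u << 0)

struct toy_server_info {
    uint16_t flags;        /* formerly padding, which had to be zero */
    uint64_t ioreq_gfn;
    uint64_t bufioreq_gfn;
    uint32_t bufioreq_port;
};

/* Counts how often the (toy) pages had to be mapped. */
static unsigned int toy_map_count;

/* Every output pointer is optional; gfns are only mapped if asked for. */
static int toy_get_server_info(uint64_t *ioreq_gfn, uint64_t *bufioreq_gfn,
                               uint32_t *bufioreq_port)
{
    if (ioreq_gfn != NULL || bufioreq_gfn != NULL) {
        toy_map_count++;
        if (ioreq_gfn != NULL)
            *ioreq_gfn = 0x100;
        if (bufioreq_gfn != NULL)
            *bufioreq_gfn = 0x101;
    }

    if (bufioreq_port != NULL)
        *bufioreq_port = 7;

    return 0;
}

static int toy_handle_get_info(struct toy_server_info *data)
{
    const uint16_t valid_flags = TOY_NO_GFNS;
    int no_gfns;

    assert(data != NULL);

    /* Unknown flag bits are rejected, as non-zero padding used to be. */
    if (data->flags & ~valid_flags)
        return -EINVAL;

    no_gfns = data->flags & TOY_NO_GFNS;

    return toy_get_server_info(no_gfns ? NULL : &data->ioreq_gfn,
                               no_gfns ? NULL : &data->bufioreq_gfn,
                               &data->bufioreq_port);
}
```

Reusing the padding field keeps the structure layout unchanged for callers that pass zero there, which is presumably why the flag was placed in it.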

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
Reviewed-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>

v8:
 - For safety make all of the pointers passed to
   hvm_get_ioreq_server_info() optional.
 - Shrink bufioreq_handling down to a uint8_t.

v3:
 - Updated in response to review comments from Wei and Roger.
 - Added a HANDLE_BUFIOREQ macro to make the code neater.
 - This patch no longer introduces a security vulnerability since there
   is now an explicit limit on the number of ioreq servers that may be
   created for any one domain.
---
 tools/libs/devicemodel/core.c   |  8 +
 tools/libs/devicemodel/include/xendevicemodel.h |  6 ++--
 xen/arch/x86/hvm/dm.c   |  9 +++--
 xen/arch/x86/hvm/ioreq.c| 47 ++---
 xen/include/asm-x86/hvm/domain.h|  2 +-
 xen/include/public/hvm/dm_op.h  | 32 ++---
 6 files changed, 63 insertions(+), 41 deletions(-)

diff --git a/tools/libs/devicemodel/core.c b/tools/libs/devicemodel/core.c
index b66d4f9294..e684e657b6 100644
--- a/tools/libs/devicemodel/core.c
+++ b/tools/libs/devicemodel/core.c
@@ -204,6 +204,14 @@ int xendevicemodel_get_ioreq_server_info(
 
 data->id = id;
 
+/*
+ * If the caller is not requesting gfn values then instruct the
+ * hypercall not to retrieve them as this may cause them to be
+ * mapped.
+ */
+if (!ioreq_gfn && !bufioreq_gfn)
+data->flags |= XEN_DMOP_no_gfns;
+
 rc = xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
 if (rc)
 return rc;
diff --git a/tools/libs/devicemodel/include/xendevicemodel.h 
b/tools/libs/devicemodel/include/xendevicemodel.h
index dda0bc7695..fffee3a4a0 100644
--- a/tools/libs/devicemodel/include/xendevicemodel.h
+++ b/tools/libs/devicemodel/include/xendevicemodel.h
@@ -61,11 +61,11 @@ int xendevicemodel_create_ioreq_server(
  * @parm domid the domain id to be serviced
  * @parm id the IOREQ Server id.
  * @parm ioreq_gfn pointer to a xen_pfn_t to receive the synchronous ioreq
- *  gfn
+ *  gfn. (May be NULL if not required)
  * @parm bufioreq_gfn pointer to a xen_pfn_t to receive the buffered ioreq
- *gfn
+ *gfn. (May be NULL if not required)
  * @parm bufioreq_port pointer to a evtchn_port_t to receive the buffered
- * ioreq event channel
+ * ioreq event channel. (May be NULL if not required)
  * @return 0 on success, -1 on failure.
  */
 int xendevicemodel_get_ioreq_server_info(
diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index 32ade9541d..4d10e91adb 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -416,16 +416,19 @@ static int dm_op(const struct dmop_args *op_args)
 {
 struct xen_dm_op_get_ioreq_server_info *data =
 &op.u.get_ioreq_server_info;
+const uint16_t valid_flags = XEN_DMOP_no_gfns;
 
 const_op = false;
 
 rc = -EINVAL;
-if ( data->pad )
+if ( data->flags & ~valid_flags )
 break;
 
 rc = hvm_get_ioreq_server_info(d, data->id,
-   &data->ioreq_gfn,
-   &data->bufioreq_gfn,
+   (data->flags & XEN_DMOP_no_gfns) ?
+   NULL : &data->ioreq_gfn,
+   (data->flags & XEN_DMOP_no_gfns) ?
+   NULL : &data->bufioreq_gfn,
 &data->bufioreq_port);
 break;
 }
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index eec4e4771e..39de659ddf 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -350,6 +350,9 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server 
*s,
 }
 }
 
+#define HANDLE_BUFIOREQ(s) \
+((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
+

[Xen-devel] [PATCH v13 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-30 Thread Paul Durrant
Certain memory resources associated with a guest are not necessarily
present in the guest P2M.

This patch adds the boilerplate for a new memory op to allow such a resource
to be priv-mapped directly, by either a PV or HVM tools domain.

NOTE: Whilst the new op is not intrinsically specific to the x86 architecture,
  I have no means to test it on an ARM platform and so cannot verify
  that it functions correctly.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: Wei Liu <wei.l...@citrix.com>
Cc: Daniel De Graaf <dgde...@tycho.nsa.gov>
Cc: Julien Grall <julien.gr...@arm.com>

v13:
 - Use xen_pfn_t for mfn_list.
 - Addressed further comments from Jan and Julien.

v12:
 - Addressed more comments from Jan.
 - Removed #ifdef CONFIG_X86 from common code and instead introduced a
   stub set_foreign_p2m_entry() in asm-arm/p2m.h returning -EOPNOTSUPP.
 - Restricted mechanism for querying implementation limit on nr_frames
   and simplified compat code.

v11:
 - Addressed more comments from Jan.

v9:
 - Addressed more comments from Jan.

v8:
 - Move the code into common as requested by Jan.
 - Make the gmfn_list handle a 64-bit type to avoid limiting the MFN
   range for a 32-bit tools domain.
 - Add missing pad.
 - Add compat code.
 - Make this patch deal with purely boilerplate.
 - Drop George's A-b and Wei's R-b because the changes are non-trivial,
   and update Cc list now the boilerplate is common.

v5:
 - Switched __copy_to/from_guest_offset() to copy_to/from_guest_offset().
---
 tools/flask/policy/modules/xen.if   |  4 +-
 xen/arch/x86/mm/p2m.c   |  3 +-
 xen/common/compat/memory.c  | 95 +
 xen/common/memory.c | 93 
 xen/include/asm-arm/p2m.h   | 10 
 xen/include/asm-x86/p2m.h   |  3 ++
 xen/include/public/memory.h | 43 -
 xen/include/xlat.lst|  1 +
 xen/include/xsm/dummy.h |  6 +++
 xen/include/xsm/xsm.h   |  6 +++
 xen/xsm/dummy.c |  1 +
 xen/xsm/flask/hooks.c   |  6 +++
 xen/xsm/flask/policy/access_vectors |  2 +
 13 files changed, 269 insertions(+), 4 deletions(-)

diff --git a/tools/flask/policy/modules/xen.if 
b/tools/flask/policy/modules/xen.if
index 55437496f6..07cba8a15d 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -52,7 +52,8 @@ define(`create_domain_common', `
settime setdomainhandle getvcpucontext set_misc_info };
allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim
set_max_evtchn set_vnumainfo get_vnumainfo cacheflush
-   psr_cmt_op psr_cat_op soft_reset set_gnttab_limits };
+   psr_cmt_op psr_cat_op soft_reset set_gnttab_limits
+   resource_map };
allow $1 $2:security check_context;
allow $1 $2:shadow enable;
allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage 
mmuext_op updatemp };
@@ -152,6 +153,7 @@ define(`device_model', `
allow $1 $2_target:domain { getdomaininfo shutdown };
allow $1 $2_target:mmu { map_read map_write adjust physmap target_hack 
};
allow $1 $2_target:hvm { getparam setparam hvmctl cacheattr dm };
+   allow $1 $2_target:domain2 resource_map;
 ')
 
 # make_device_model(priv, dm_dom, hvm_dom)
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index c72a3cdebb..71bb9b4f93 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1132,8 +1132,7 @@ static int set_typed_p2m_entry(struct domain *d, unsigned 
long gfn_l,
 }
 
 /* Set foreign mfn in the given guest's p2m table. */
-static int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
- mfn_t mfn)
+int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
 {
 return set_typed_p2m_entry(d, gfn, mfn, PAGE_ORDER_4K, p2m_map_foreign,
p2m_get_hostp2m(d)->default_access);
diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index 35bb259808..9a7cb1a71b 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -71,6 +71,7 @@ int compat_memory_op(unsigned int cmd, 
XEN_GUEST_HANDLE_PARAM(void) compat)
 struct xen_remove_from_physmap *xrfp;
 struct xen_vnuma_topology_info *vnuma;
 struct xen_mem_access_op *mao;
+struct xen_mem_acquire_resource *mar;

[Xen-devel] [PATCH v13 09/11] tools/libxenforeignmemory: reduce xenforeignmemory_restrict code footprint

2017-10-30 Thread Paul Durrant
By using a static inline stub in private.h for OS where this functionality
is not implemented, the various duplicate stubs in the OS-specific source
modules can be avoided.
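
For illustration, the pattern described above can be sketched roughly as follows. This is a minimal stand-alone sketch, not the library code: `sketch_handle`, `sketch_osdep_restrict()` and the `SKETCH_OS_HAS_RESTRICT` switch are hypothetical names standing in for the real handle type, the osdep function and the `__linux__` test used in private.h.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for xenforeignmemory_handle. */
struct sketch_handle { int fd; };

#ifdef SKETCH_OS_HAS_RESTRICT
/* An OS-specific source file supplies the real implementation. */
int sketch_osdep_restrict(struct sketch_handle *h, unsigned int domid);
#else
/*
 * One shared stub for every OS lacking the feature. Note that errno
 * takes the positive error value; only the return value is negative.
 */
static inline int sketch_osdep_restrict(struct sketch_handle *h,
                                        unsigned int domid)
{
    (void)h;
    (void)domid;
    errno = EOPNOTSUPP;
    return -1;
}
#endif
```

Keeping the stub in the shared header means each unsupported OS needs no code of its own, and the errno convention is fixed in one place.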

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
---
Cc: Ian Jackson <ian.jack...@eu.citrix.com>

v4:
 - Removed extraneous freebsd code.

v3:
 - Patch added in response to review comments.
---
 tools/libs/foreignmemory/freebsd.c |  7 ---
 tools/libs/foreignmemory/minios.c  |  7 ---
 tools/libs/foreignmemory/netbsd.c  |  7 ---
 tools/libs/foreignmemory/private.h | 12 +---
 tools/libs/foreignmemory/solaris.c |  7 ---
 5 files changed, 9 insertions(+), 31 deletions(-)

diff --git a/tools/libs/foreignmemory/freebsd.c 
b/tools/libs/foreignmemory/freebsd.c
index dec447485a..6e6bc4b11f 100644
--- a/tools/libs/foreignmemory/freebsd.c
+++ b/tools/libs/foreignmemory/freebsd.c
@@ -95,13 +95,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle 
*fmem,
 return munmap(addr, num << PAGE_SHIFT);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid)
-{
-errno = -EOPNOTSUPP;
-return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/minios.c 
b/tools/libs/foreignmemory/minios.c
index 75f340122e..43341ca301 100644
--- a/tools/libs/foreignmemory/minios.c
+++ b/tools/libs/foreignmemory/minios.c
@@ -58,13 +58,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle 
*fmem,
 return munmap(addr, num << PAGE_SHIFT);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid)
-{
-errno = -EOPNOTSUPP;
-return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/netbsd.c 
b/tools/libs/foreignmemory/netbsd.c
index 9bf95ef4f0..54a418ebd6 100644
--- a/tools/libs/foreignmemory/netbsd.c
+++ b/tools/libs/foreignmemory/netbsd.c
@@ -100,13 +100,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle 
*fmem,
 return munmap(addr, num*XC_PAGE_SIZE);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid)
-{
-errno = -EOPNOTSUPP;
-return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/private.h 
b/tools/libs/foreignmemory/private.h
index b191000b49..b06ce12583 100644
--- a/tools/libs/foreignmemory/private.h
+++ b/tools/libs/foreignmemory/private.h
@@ -35,9 +35,6 @@ void *osdep_xenforeignmemory_map(xenforeignmemory_handle 
*fmem,
 int osdep_xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
  void *addr, size_t num);
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid);
-
 #if defined(__NetBSD__) || defined(__sun__)
 /* Strictly compat for those two only only */
 void *compat_mapforeign_batch(xenforeignmem_handle *fmem, uint32_t dom,
@@ -57,6 +54,13 @@ struct xenforeignmemory_resource_handle {
 };
 
 #ifndef __linux__
+static inline int osdep_xenforeignmemory_restrict(xenforeignmemory_handle 
*fmem,
+  domid_t domid)
+{
+errno = EOPNOTSUPP;
+return -1;
+}
+
 static inline int osdep_xenforeignmemory_map_resource(
 xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
 {
@@ -70,6 +74,8 @@ static inline int osdep_xenforeignmemory_unmap_resource(
 return 0;
 }
 #else
+int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
+domid_t domid);
 int osdep_xenforeignmemory_map_resource(
 xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres);
 int osdep_xenforeignmemory_unmap_resource(
diff --git a/tools/libs/foreignmemory/solaris.c 
b/tools/libs/foreignmemory/solaris.c
index a33decb4ae..ee8aae4fbd 100644
--- a/tools/libs/foreignmemory/solaris.c
+++ b/tools/libs/foreignmemory/solaris.c
@@ -97,13 +97,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle 
*fmem,
 return munmap(addr, num*XC_PAGE_SIZE);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid)
-{
-errno = -EOPNOTSUPP;
-return -1;
-}
-
 /*
  * Local variables:
  * mode: C
-- 
2.11.0




[Xen-devel] [PATCH v13 08/11] tools/libxenforeignmemory: add support for resource mapping

2017-10-30 Thread Paul Durrant
A previous patch introduced a new HYPERVISOR_memory_op to acquire guest
resources for direct priv-mapping.

This patch adds new functionality into libxenforeignmemory to make use
of a new privcmd ioctl [1] that uses the new memory op to make such
resources available via mmap(2).

[1] 
http://xenbits.xen.org/gitweb/?p=people/pauldu/linux.git;a=commit;h=ce59a05e6712
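
Since the resource is ultimately exposed through mmap(2), the caller-supplied flags are validated before anything else happens. The check performed by xenforeignmemory_map_resource() in the diff below can be sketched in isolation as follows; `sketch_check_map_flags()` is a hypothetical helper name used only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>

/*
 * Accept only MAP_SHARED and MAP_PRIVATE; any other bit is rejected
 * with EINVAL before any allocation or ioctl takes place.
 */
static int sketch_check_map_flags(int flags)
{
    if (flags & ~(MAP_SHARED | MAP_PRIVATE)) {
        errno = EINVAL;
        return -1;
    }

    return 0;
}
```

Rejecting unknown bits up front keeps the interface portable, because the OS-specific backend never sees a flag it may not understand.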

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Reviewed-by: Wei Liu <wei.l...@citrix.com>
---
Cc: Ian Jackson <ian.jack...@eu.citrix.com>

v4:
 - Fixed errno and removed single-use label
 - The unmap call now returns a status
 - Use C99 initialization for ioctl struct

v2:
 - Bump minor version up to 3.
---
 tools/include/xen-sys/Linux/privcmd.h  | 11 +
 tools/libs/foreignmemory/Makefile  |  2 +-
 tools/libs/foreignmemory/core.c| 53 ++
 .../libs/foreignmemory/include/xenforeignmemory.h  | 41 +
 tools/libs/foreignmemory/libxenforeignmemory.map   |  5 ++
 tools/libs/foreignmemory/linux.c   | 45 ++
 tools/libs/foreignmemory/private.h | 31 +
 7 files changed, 187 insertions(+), 1 deletion(-)

diff --git a/tools/include/xen-sys/Linux/privcmd.h 
b/tools/include/xen-sys/Linux/privcmd.h
index 732ff7c15a..9531b728f9 100644
--- a/tools/include/xen-sys/Linux/privcmd.h
+++ b/tools/include/xen-sys/Linux/privcmd.h
@@ -86,6 +86,15 @@ typedef struct privcmd_dm_op {
const privcmd_dm_op_buf_t __user *ubufs;
 } privcmd_dm_op_t;
 
+typedef struct privcmd_mmap_resource {
+   domid_t dom;
+   __u32 type;
+   __u32 id;
+   __u32 idx;
+   __u64 num;
+   __u64 addr;
+} privcmd_mmap_resource_t;
+
 /*
  * @cmd: IOCTL_PRIVCMD_HYPERCALL
  * @arg: &privcmd_hypercall_t
@@ -103,5 +112,7 @@ typedef struct privcmd_dm_op {
_IOC(_IOC_NONE, 'P', 5, sizeof(privcmd_dm_op_t))
 #define IOCTL_PRIVCMD_RESTRICT \
_IOC(_IOC_NONE, 'P', 6, sizeof(domid_t))
+#define IOCTL_PRIVCMD_MMAP_RESOURCE\
+   _IOC(_IOC_NONE, 'P', 7, sizeof(privcmd_mmap_resource_t))
 
 #endif /* __LINUX_PUBLIC_PRIVCMD_H__ */
diff --git a/tools/libs/foreignmemory/Makefile 
b/tools/libs/foreignmemory/Makefile
index cbe815fce8..ee5c3fd67e 100644
--- a/tools/libs/foreignmemory/Makefile
+++ b/tools/libs/foreignmemory/Makefile
@@ -2,7 +2,7 @@ XEN_ROOT = $(CURDIR)/../../..
 include $(XEN_ROOT)/tools/Rules.mk
 
 MAJOR= 1
-MINOR= 2
+MINOR= 3
 SHLIB_LDFLAGS += -Wl,--version-script=libxenforeignmemory.map
 
 CFLAGS   += -Werror -Wmissing-prototypes
diff --git a/tools/libs/foreignmemory/core.c b/tools/libs/foreignmemory/core.c
index 79b24d273b..efa915015c 100644
--- a/tools/libs/foreignmemory/core.c
+++ b/tools/libs/foreignmemory/core.c
@@ -17,6 +17,8 @@
 #include 
 #include 
 
+#include 
+
 #include "private.h"
 
 static int all_restrict_cb(Xentoolcore__Active_Handle *ah, domid_t domid) {
@@ -135,6 +137,57 @@ int xenforeignmemory_restrict(xenforeignmemory_handle 
*fmem,
 return osdep_xenforeignmemory_restrict(fmem, domid);
 }
 
+xenforeignmemory_resource_handle *xenforeignmemory_map_resource(
+xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+unsigned int id, unsigned long frame, unsigned long nr_frames,
+void **paddr, int prot, int flags)
+{
+xenforeignmemory_resource_handle *fres;
+int rc;
+
+/* Check flags only contains POSIX defined values */
+if ( flags & ~(MAP_SHARED | MAP_PRIVATE) )
+{
+errno = EINVAL;
+return NULL;
+}
+
+fres = calloc(1, sizeof(*fres));
+if ( !fres )
+{
+errno = ENOMEM;
+return NULL;
+}
+
+fres->domid = domid;
+fres->type = type;
+fres->id = id;
+fres->frame = frame;
+fres->nr_frames = nr_frames;
+fres->addr = *paddr;
+fres->prot = prot;
+fres->flags = flags;
+
+rc = osdep_xenforeignmemory_map_resource(fmem, fres);
+if ( rc )
+{
+free(fres);
+fres = NULL;
+} else
+*paddr = fres->addr;
+
+return fres;
+}
+
+int xenforeignmemory_unmap_resource(
+xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
+{
+int rc = osdep_xenforeignmemory_unmap_resource(fmem, fres);
+
+free(fres);
+return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/include/xenforeignmemory.h 
b/tools/libs/foreignmemory/include/xenforeignmemory.h
index f4814c390f..d594be8df0 100644
--- a/tools/libs/foreignmemory/include/xenforeignmemory.h
+++ b/tools/libs/foreignmemory/include/xenforeignmemory.h
@@ -138,6 +138,47 @@ int xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
 int xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
   domid_t domid);
 
+typedef struct x

[Xen-devel] [PATCH v13 06/11] x86/hvm/ioreq: add a new mappable resource type...

2017-10-30 Thread Paul Durrant
... XENMEM_resource_ioreq_server

This patch adds support for a new resource type that can be mapped using
the XENMEM_acquire_resource memory op.

If an emulator makes use of this resource type then, instead of mapping
gfns, the IOREQ server will allocate pages from the heap. These pages
will never be present in the P2M of the guest at any point and so are
not vulnerable to any direct attack by the guest. They are only ever
accessible by Xen and any domain that has mapping privilege over the
guest (which may or may not be limited to the domain running the emulator).

NOTE: Use of the new resource type is not compatible with use of
  XEN_DMOP_get_ioreq_server_info unless the XEN_DMOP_no_gfns flag is
  set.
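
The mutual exclusion between the two modes can be modelled as a small state check. The sketch below is a simplified model under stated assumptions, not the hypervisor code: `sketch_ioreq_page` reduces struct hvm_ioreq_page to two booleans, and the function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical reduction of struct hvm_ioreq_page to the two facts that matter. */
struct sketch_ioreq_page {
    bool has_page; /* a backing page exists */
    bool has_gfn;  /* the page sits at a guest frame (gfn != INVALID_GFN) */
};

/* Legacy mode: back the ioreq page with a guest frame. */
static int sketch_map_gfn(struct sketch_ioreq_page *p)
{
    if (p->has_page)
        /* A heap-allocated page is already in place: refuse. */
        return p->has_gfn ? 0 : -EPERM;

    p->has_page = true;
    p->has_gfn = true;
    return 0;
}

/* New mode: back the ioreq page with a page from the heap. */
static int sketch_alloc_mfn(struct sketch_ioreq_page *p)
{
    if (p->has_page)
        /* A guest frame is already mapped: refuse. */
        return p->has_gfn ? -EPERM : 0;

    p->has_page = true;
    p->has_gfn = false;
    return 0;
}
```

Whichever call arrives first fixes the mode; repeating the same call is harmless, while the other call fails with -EPERM.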

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: Julien Grall <julien.gr...@arm.com>

v13:
 - Introduce an arch_acquire_resource() as suggested by Julien (and have
   the ARM variant simply return -EOPNOTSUPP).
 - Check for ioreq server id truncation as requested by Jan.
 - Not added Jan's R-b due to substantive change from v12.

v12:
 - Addressed more comments from Jan.
 - Dropped George's A-b and Wei's R-b because of material change.

v11:
 - Addressed more comments from Jan.

v10:
 - Addressed comments from Jan.

v8:
 - Re-base on new boilerplate.
 - Adjust function signature of hvm_get_ioreq_server_frame(), and test
   whether the bufioreq page is present.

v5:
 - Use get_ioreq_server() function rather than indexing array directly.
 - Add more explanation into comments to state that mapping guest frames
   and allocation of pages for ioreq servers are not simultaneously
   permitted.
 - Add a comment into asm/ioreq.h stating the meaning of the index
   value passed to hvm_get_ioreq_server_frame().
---
 xen/arch/x86/hvm/ioreq.c| 156 
 xen/arch/x86/mm.c   |  49 +
 xen/common/memory.c |   3 +-
 xen/include/asm-arm/mm.h|   7 ++
 xen/include/asm-x86/hvm/ioreq.h |   2 +
 xen/include/asm-x86/mm.h|   5 ++
 xen/include/public/hvm/dm_op.h  |   4 ++
 xen/include/public/memory.h |   9 +++
 8 files changed, 234 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 39de659ddf..d991ac9cdc 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -259,6 +259,19 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 int rc;
 
+if ( iorp->page )
+{
+/*
+ * If a page has already been allocated (which will happen on
+ * demand if hvm_get_ioreq_server_frame() is called), then
+ * mapping a guest frame is not permitted.
+ */
+if ( gfn_eq(iorp->gfn, INVALID_GFN) )
+return -EPERM;
+
+return 0;
+}
+
 if ( d->is_dying )
 return -EINVAL;
 
@@ -281,6 +294,70 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 return rc;
 }
 
+static int hvm_alloc_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
+{
+struct domain *currd = current->domain;
+struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+
+if ( iorp->page )
+{
+/*
+ * If a guest frame has already been mapped (which may happen
+ * on demand if hvm_get_ioreq_server_info() is called), then
+ * allocating a page is not permitted.
+ */
+if ( !gfn_eq(iorp->gfn, INVALID_GFN) )
+return -EPERM;
+
+return 0;
+}
+
+/*
+ * Allocated IOREQ server pages are assigned to the emulating
+ * domain, not the target domain. This is because the emulator is
+ * likely to be destroyed after the target domain has been torn
+ * down, and we must use MEMF_no_refcount otherwise page allocation
+ * could fail if the emulating domain has already reached its
+ * maximum allocation.
+ */
+iorp->page = alloc_domheap_page(currd, MEMF_no_refcount);
+if ( !iorp->page )
+return -ENOMEM;
+
+if ( !get_page_type(iorp->page, PGT_writable_page) )
+{
+ASSERT_UNREACHABLE();
+put_page(iorp->page);
+iorp->page = NULL;
+return -ENOMEM;
+}
+
+iorp->va = __map_domain_page_global(iorp->page);
+if ( !iorp->va )
+{
+put_page_and_type(iorp->page);
+iorp->page = NULL;
+return -ENOMEM;
+}
+
+clear_page(iorp->va);
+return 0;
+}
+
+static void hvm_free_ioreq_mfn(st

[Xen-devel] [PATCH v13 01/11] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list

2017-10-30 Thread Paul Durrant
A subsequent patch will remove the current implicit limitation on creation
of ioreq servers which is due to the allocation of gfns for the ioreq
structures and buffered ioreq ring.

It will therefore be necessary to introduce an explicit limit and, since
this limit should be small, it simplifies the code to maintain an array of
that size rather than using a list.

Also, by reserving an array slot for the default server and populating
array slots early in create, the need to pass an 'is_default' boolean
to sub-functions can be avoided.
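
As a rough stand-alone illustration of the approach, the sketch below keeps servers in a small fixed array with one slot reserved for the default server. All names and the limit of 8 are assumptions made for this example; the real helpers appear in the diff further down.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SERVERS 8 /* assumed limit, for illustration only */
#define SKETCH_DEFAULT_ID  0 /* assumed slot reserved for the default server */

struct sketch_server { int unused; };

struct sketch_domain {
    struct sketch_server *server[SKETCH_MAX_SERVERS];
};

/* Bounds-checked lookup: an out-of-range id simply yields NULL. */
static struct sketch_server *sketch_get_server(const struct sketch_domain *d,
                                               unsigned int id)
{
    return id < SKETCH_MAX_SERVERS ? d->server[id] : NULL;
}

/* A slot may be filled only when empty, or cleared by passing NULL. */
static void sketch_set_server(struct sketch_domain *d, unsigned int id,
                              struct sketch_server *s)
{
    assert(id < SKETCH_MAX_SERVERS);
    assert(!s || !d->server[id]);
    d->server[id] = s;
}

/* Being the default is a property of the slot, so no flag is passed around. */
static int sketch_is_default(const struct sketch_domain *d,
                             const struct sketch_server *s)
{
    return s && s == d->server[SKETCH_DEFAULT_ID];
}

/* Iterate over every slot, skipping the empty ones. */
static unsigned int sketch_count_servers(const struct sketch_domain *d)
{
    unsigned int id, n = 0;

    for (id = 0; id < SKETCH_MAX_SERVERS; id++)
        if (d->server[id])
            n++;

    return n;
}
```

With the server populated into its slot early in creation, any sub-function can tell whether it is handling the default server by comparing against the reserved slot.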

Some function return values are changed by this patch: Specifically, in
the case where the id of the default ioreq server is passed in, -EOPNOTSUPP
is now returned rather than -ENOENT.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Reviewed-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>

v10:
 - modified FOR_EACH... macro as suggested by Jan.
 - check for NULL in IS_DEFAULT macro as suggested by Jan.

v9:
 - modified FOR_EACH... macro as requested by Andrew.

v8:
 - Addressed various comments from Jan.

v7:
 - Fixed assertion failure found in testing.

v6:
 - Updated according to comments made by Roger on v4 that I'd missed.

v5:
 - Switched GET/SET_IOREQ_SERVER() macros to get/set_ioreq_server()
   functions to avoid possible double-evaluation issues.

v4:
 - Introduced more helper macros and relocated them to the top of the
   code.

v3:
 - New patch (replacing "move is_default into struct hvm_ioreq_server") in
   response to review comments.
---
 xen/arch/x86/hvm/ioreq.c | 502 +++
 xen/include/asm-x86/hvm/domain.h |  10 +-
 2 files changed, 245 insertions(+), 267 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index d5afe20cc8..da31918bb1 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -33,6 +33,37 @@
 
 #include 
 
+static void set_ioreq_server(struct domain *d, unsigned int id,
+ struct hvm_ioreq_server *s)
+{
+ASSERT(id < MAX_NR_IOREQ_SERVERS);
+ASSERT(!s || !d->arch.hvm_domain.ioreq_server.server[id]);
+
+d->arch.hvm_domain.ioreq_server.server[id] = s;
+}
+
+#define GET_IOREQ_SERVER(d, id) \
+(d)->arch.hvm_domain.ioreq_server.server[id]
+
+static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
+ unsigned int id)
+{
+if ( id >= MAX_NR_IOREQ_SERVERS )
+return NULL;
+
+return GET_IOREQ_SERVER(d, id);
+}
+
+#define IS_DEFAULT(s) \
+((s) && (s) == GET_IOREQ_SERVER((s)->domain, DEFAULT_IOSERVID))
+
+/* Iterate over all possible ioreq servers */
+#define FOR_EACH_IOREQ_SERVER(d, id, s) \
+for ( (id) = 0; (id) < MAX_NR_IOREQ_SERVERS; (id)++ ) \
+if ( !(s = GET_IOREQ_SERVER(d, id)) ) \
+continue; \
+else
+
 static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
 {
 shared_iopage_t *p = s->ioreq.va;
@@ -47,10 +78,9 @@ bool hvm_io_pending(struct vcpu *v)
 {
 struct domain *d = v->domain;
 struct hvm_ioreq_server *s;
+unsigned int id;
 
-list_for_each_entry ( s,
-  &d->arch.hvm_domain.ioreq_server.list,
-  list_entry )
+FOR_EACH_IOREQ_SERVER(d, id, s)
 {
 struct hvm_ioreq_vcpu *sv;
 
@@ -127,10 +157,9 @@ bool handle_hvm_io_completion(struct vcpu *v)
 struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
 struct hvm_ioreq_server *s;
 enum hvm_io_completion io_completion;
+unsigned int id;
 
-  list_for_each_entry ( s,
-  &d->arch.hvm_domain.ioreq_server.list,
-  list_entry )
+FOR_EACH_IOREQ_SERVER(d, id, s)
 {
 struct hvm_ioreq_vcpu *sv;
 
@@ -243,13 +272,12 @@ static int hvm_map_ioreq_page(
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
 {
 const struct hvm_ioreq_server *s;
+unsigned int id;
 bool found = false;
 
 spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
-list_for_each_entry ( s,
-  &d->arch.hvm_domain.ioreq_server.list,
-  list_entry )
+FOR_EACH_IOREQ_SERVER(d, id, s)
 {
 if ( (s->ioreq.va && s->ioreq.page == page) ||
  (s->bufioreq.va && s->bufioreq.page == page) )
@@ -302,7 +330,7 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server 
*s,
 }
 
 static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
- bool is_default, struct vcpu *v)
+ struct vcpu *v)
 {
 struct hvm_ioreq_vcpu *sv;
 int rc;
@@ -331,7 +359,7 @@ static int hvm_ioreq_server_add_vcpu(struct 
hvm_ioreq_server *s,
 goto fa

[Xen-devel] [PATCH v13 00/11] x86: guest resource mapping

2017-10-30 Thread Paul Durrant
This series introduces support for direct mapping of guest resources.
The resources are:
 - IOREQ server pages
 - Grant tables

v13:
 - Responded to more comments from Jan and Julien.
 - Build-tested using ARM cross-compilation.

v12:
 - Responded to more comments from Jan.

v11:
 - Responded to more comments from Jan.

v10:
 - Responded to comments from Jan.

v9:
 - Change to patch #1 only.

v8:
 - Re-ordered series and dropped two patches that have already been
committed.

v7:
 - Fixed assertion failure hit during domain destroy.

v6:
 - Responded to missed comments from Roger.

v5:
 - Responded to review comments from Wei.

v4:
 - Responded to further review comments from Roger.

v3:
 - Dropped original patch #1 since it is covered by Juergen's patch.
 - Added new xenforeignmemorycleanup patch (#4).
 - Replaced the patch introducing the ioreq server 'is_default' flag with
   one that changes the ioreq server list into an array (#8).
  
Paul Durrant (11):
  x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  x86/hvm/ioreq: simplify code and use consistent naming
  x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page
  x86/hvm/ioreq: defer mapping gfns until they are actually requsted
  x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  x86/hvm/ioreq: add a new mappable resource type...
  x86/mm: add an extra command to HYPERVISOR_mmu_update...
  tools/libxenforeignmemory: add support for resource mapping
  tools/libxenforeignmemory: reduce xenforeignmemory_restrict code
footprint
  common: add a new mappable resource type: XENMEM_resource_grant_table
  tools/libxenctrl: use new xenforeignmemory API to seed grant table

 tools/flask/policy/modules/xen.if  |   4 +-
 tools/include/xen-sys/Linux/privcmd.h  |  11 +
 tools/libs/devicemodel/core.c  |   8 +
 tools/libs/devicemodel/include/xendevicemodel.h|   6 +-
 tools/libs/foreignmemory/Makefile  |   2 +-
 tools/libs/foreignmemory/core.c|  53 ++
 tools/libs/foreignmemory/freebsd.c |   7 -
 .../libs/foreignmemory/include/xenforeignmemory.h  |  41 +
 tools/libs/foreignmemory/libxenforeignmemory.map   |   5 +
 tools/libs/foreignmemory/linux.c   |  45 ++
 tools/libs/foreignmemory/minios.c  |   7 -
 tools/libs/foreignmemory/netbsd.c  |   7 -
 tools/libs/foreignmemory/private.h |  43 +-
 tools/libs/foreignmemory/solaris.c |   7 -
 tools/libxc/include/xc_dom.h   |   8 +-
 tools/libxc/xc_dom_boot.c  | 114 ++-
 tools/libxc/xc_sr_restore_x86_hvm.c|  10 +-
 tools/libxc/xc_sr_restore_x86_pv.c |   2 +-
 tools/libxl/libxl_dom.c|   1 -
 tools/python/xen/lowlevel/xc/xc.c  |   6 +-
 xen/arch/x86/hvm/dm.c  |   9 +-
 xen/arch/x86/hvm/ioreq.c   | 831 -
 xen/arch/x86/mm.c  |  62 +-
 xen/arch/x86/mm/p2m.c  |   3 +-
 xen/common/compat/memory.c |  95 +++
 xen/common/grant_table.c   |  63 +-
 xen/common/memory.c| 137 
 xen/include/asm-arm/mm.h   |   7 +
 xen/include/asm-arm/p2m.h  |  10 +
 xen/include/asm-x86/hvm/domain.h   |  14 +-
 xen/include/asm-x86/hvm/ioreq.h|   2 +
 xen/include/asm-x86/mm.h   |   5 +
 xen/include/asm-x86/p2m.h  |   3 +
 xen/include/public/hvm/dm_op.h |  36 +-
 xen/include/public/memory.h|  58 +-
 xen/include/public/xen.h   |  12 +-
 xen/include/xen/grant_table.h  |   4 +
 xen/include/xlat.lst   |   1 +
 xen/include/xsm/dummy.h|   6 +
 xen/include/xsm/xsm.h  |   6 +
 xen/xsm/dummy.c|   1 +
 xen/xsm/flask/hooks.c  |   6 +
 xen/xsm/flask/policy/access_vectors|   2 +
 43 files changed, 1265 insertions(+), 495 deletions(-)

---
Cc: Daniel De Graaf <dgde...@tycho.nsa.gov>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: "Marek Marczykowski-Górecki" <marma...@invisiblethingslab.com>
Cc: Paul Durrant <paul.durr...@citrix.com>
Cc: George Dunlap <g

[Xen-devel] [PATCH v13 07/11] x86/mm: add an extra command to HYPERVISOR_mmu_update...

2017-10-30 Thread Paul Durrant
...to allow the calling domain to prevent translation of a specified l1e
value.

Despite what the comment in public/xen.h might imply, specifying a
command value of MMU_NORMAL_PT_UPDATE will not simply update an l1e with
the specified value. Instead, mod_l1_entry() tests whether foreign_dom
has PG_translate set in its paging mode and, if it does, assumes that the
pfn value in the l1e is a gfn rather than an mfn.

To allow a PV tools domain to map mfn values from a previously issued
HYPERVISOR_memory_op:XENMEM_acquire_resource, there needs to be a way
to tell HYPERVISOR_mmu_update that the specific l1e value does not
require translation regardless of the paging mode of foreign_dom. This
patch therefore defines a new command value, MMU_PT_UPDATE_NO_TRANSLATE,
which has the same semantics as MMU_NORMAL_PT_UPDATE except that the
paging mode of foreign_dom is ignored and the l1e value is used verbatim.
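
The resulting decision can be sketched as two small predicates on the command value. This is an illustrative model only: the `SKETCH_` names are hypothetical, and the value 3 for the new command is an assumption, since its definition is cut off in the diff as quoted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Command values carried in ptr[1:0]; the last one is assumed. */
enum sketch_mmu_cmd {
    SKETCH_NORMAL_PT_UPDATE       = 0,
    SKETCH_MACHPHYS_UPDATE        = 1,
    SKETCH_PT_UPDATE_PRESERVE_AD  = 2,
    SKETCH_PT_UPDATE_NO_TRANSLATE = 3 /* assumption, see lead-in */
};

/*
 * The frame number in the new l1e is treated as a gfn, and translated,
 * only if the foreign domain is in translated mode AND the caller did
 * not ask for the value to be used verbatim.
 */
static bool sketch_l1e_needs_translation(unsigned int cmd,
                                         bool foreign_dom_translated)
{
    return cmd != SKETCH_PT_UPDATE_NO_TRANSLATE && foreign_dom_translated;
}

/* A/D preservation is now derived from the command, not passed separately. */
static bool sketch_l1e_preserve_ad(unsigned int cmd)
{
    return cmd == SKETCH_PT_UPDATE_PRESERVE_AD;
}
```

Passing the command itself down to the l1e update, rather than a single preserve_ad flag, is what lets one parameter carry both decisions.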

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: Wei Liu <wei.l...@citrix.com>

v13:
 - Re-base.

v8:
 - New in this version, replacing "allow a privileged PV domain to map
   guest mfns".
---
 xen/arch/x86/mm.c| 13 -
 xen/include/public/xen.h | 12 +---
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a8c207b973..503bca552c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1724,9 +1724,10 @@ void page_unlock(struct page_info *page)
 
 /* Update the L1 entry at pl1e to new value nl1e. */
 static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e,
-unsigned long gl1mfn, int preserve_ad,
+unsigned long gl1mfn, unsigned int cmd,
 struct vcpu *pt_vcpu, struct domain *pg_dom)
 {
+bool preserve_ad = (cmd == MMU_PT_UPDATE_PRESERVE_AD);
 l1_pgentry_t ol1e;
 struct domain *pt_dom = pt_vcpu->domain;
 int rc = 0;
@@ -1748,7 +1749,8 @@ static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t 
nl1e,
 }
 
 /* Translate foreign guest address. */
-if ( paging_mode_translate(pg_dom) )
+if ( cmd != MMU_PT_UPDATE_NO_TRANSLATE &&
+ paging_mode_translate(pg_dom) )
 {
 p2m_type_t p2mt;
 p2m_query_t q = l1e_get_flags(nl1e) & _PAGE_RW ?
@@ -3438,6 +3440,7 @@ long do_mmu_update(
  */
 case MMU_NORMAL_PT_UPDATE:
 case MMU_PT_UPDATE_PRESERVE_AD:
+case MMU_PT_UPDATE_NO_TRANSLATE:
 {
 p2m_type_t p2mt;
 
@@ -3497,8 +3500,7 @@ long do_mmu_update(
 {
 case PGT_l1_page_table:
 rc = mod_l1_entry(va, l1e_from_intpte(req.val), mfn,
-  cmd == MMU_PT_UPDATE_PRESERVE_AD, v,
-  pg_owner);
+  cmd, v, pg_owner);
 break;
 case PGT_l2_page_table:
 rc = mod_l2_entry(va, l2e_from_intpte(req.val), mfn,
@@ -3773,7 +3775,8 @@ static int __do_update_va_mapping(
 goto out;
 }
 
-rc = mod_l1_entry(pl1e, val, mfn_x(gl1mfn), 0, v, pg_owner);
+rc = mod_l1_entry(pl1e, val, mfn_x(gl1mfn), MMU_NORMAL_PT_UPDATE, v,
+  pg_owner);
 
 page_unlock(gl1pg);
 put_page(gl1pg);
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 308109f176..fb1df8f293 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -268,6 +268,10 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
  * As MMU_NORMAL_PT_UPDATE above, but A/D bits currently in the PTE are ORed
  * with those in @val.
  *
+ * ptr[1:0] == MMU_PT_UPDATE_NO_TRANSLATE:
+ * As MMU_NORMAL_PT_UPDATE above, but @val is not translated through FD
+ * page tables.
+ *
  * @val is usually the machine frame number along with some attributes.
  * The attributes by default follow the architecture defined bits. Meaning that
  * if this is a X86_64 machine and four page table layout is used, the layout
@@ -334,9 +338,11 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
  *
  * PAT (bit 7 on) --> PWT (bit 3 on) and clear bit 7.
  */
-#define MMU_NORMAL_PT_UPDATE  0 /* checked '*ptr = val'. ptr is MA.  */
-#define MMU_MACHPHYS_UPDATE   1 /* ptr = MA of frame to modify entry for */
-#define MMU_PT_UPDATE_PRESERVE_AD 2 /* atomically: *ptr = val | (*ptr&(A|D)) */
+#define MMU_NORMAL_PT_UPDATE   0 /* checked '*ptr = val'. ptr is MA.  
*/
+#define MMU_MACHPHYS_UPDATE1 /* ptr = MA of frame to modify entry for 
*/
+#def

[Xen-devel] [PATCH v13 03/11] x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page

2017-10-30 Thread Paul Durrant
This patch adjusts the ioreq server code to use type-safe gfn_t values
where possible. No functional change.
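
For readers unfamiliar with the idiom, type safety here means wrapping the raw frame number in a single-member struct so that it cannot be mixed up with other integers by accident. The sketch below illustrates the idea with hypothetical `sketch_` names; it is not the definition used in the Xen headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Wrapping the number in a struct makes the compiler reject accidental mixing. */
typedef struct { unsigned long gfn; } sketch_gfn_t;

/* Box a raw frame number. */
static inline sketch_gfn_t sketch_gfn(unsigned long n)
{
    return (sketch_gfn_t){ n };
}

/* Unbox, for the places that still need the raw value. */
static inline unsigned long sketch_gfn_x(sketch_gfn_t g)
{
    return g.gfn;
}

/* Structs cannot be compared with ==, so equality needs a helper. */
static inline bool sketch_gfn_eq(sketch_gfn_t a, sketch_gfn_t b)
{
    return a.gfn == b.gfn;
}

/* Sentinel for "no frame"; the all-ones value is an assumption of this sketch. */
#define SKETCH_INVALID_GFN sketch_gfn(~0UL)
```

This is why comparisons in the diff below change from `==` on unwrapped values to a `gfn_eq()` call, and why boxing and unboxing only remain at the boundaries with untyped interfaces.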

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Reviewed-by: Wei Liu <wei.l...@citrix.com>
Acked-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>
---
 xen/arch/x86/hvm/ioreq.c | 44 
 xen/include/asm-x86/hvm/domain.h |  2 +-
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index c21fa9f280..eec4e4771e 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -210,7 +210,7 @@ bool handle_hvm_io_completion(struct vcpu *v)
 return true;
 }
 
-static unsigned long hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
+static gfn_t hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
 {
 struct domain *d = s->domain;
 unsigned int i;
@@ -220,20 +220,19 @@ static unsigned long hvm_alloc_ioreq_gfn(struct 
hvm_ioreq_server *s)
 for ( i = 0; i < sizeof(d->arch.hvm_domain.ioreq_gfn.mask) * 8; i++ )
 {
 if ( test_and_clear_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask) )
-return d->arch.hvm_domain.ioreq_gfn.base + i;
+return _gfn(d->arch.hvm_domain.ioreq_gfn.base + i);
 }
 
-return gfn_x(INVALID_GFN);
+return INVALID_GFN;
 }
 
-static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s,
-   unsigned long gfn)
+static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s, gfn_t gfn)
 {
 struct domain *d = s->domain;
-unsigned int i = gfn - d->arch.hvm_domain.ioreq_gfn.base;
+unsigned int i = gfn_x(gfn) - d->arch.hvm_domain.ioreq_gfn.base;
 
 ASSERT(!IS_DEFAULT(s));
-ASSERT(gfn != gfn_x(INVALID_GFN));
+ASSERT(!gfn_eq(gfn, INVALID_GFN));
 
 set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 }
@@ -242,7 +241,7 @@ static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 {
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
-if ( iorp->gfn == gfn_x(INVALID_GFN) )
+if ( gfn_eq(iorp->gfn, INVALID_GFN) )
 return;
 
 destroy_ring_for_helper(&iorp->va, iorp->page);
@@ -251,7 +250,7 @@ static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 if ( !IS_DEFAULT(s) )
 hvm_free_ioreq_gfn(s, iorp->gfn);
 
-iorp->gfn = gfn_x(INVALID_GFN);
+iorp->gfn = INVALID_GFN;
 }
 
 static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
@@ -264,16 +263,17 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 return -EINVAL;
 
 if ( IS_DEFAULT(s) )
-iorp->gfn = buf ?
-d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
-d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
+iorp->gfn = _gfn(buf ?
+ d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
+ d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN]);
 else
 iorp->gfn = hvm_alloc_ioreq_gfn(s);
 
-if ( iorp->gfn == gfn_x(INVALID_GFN) )
+if ( gfn_eq(iorp->gfn, INVALID_GFN) )
 return -ENOMEM;
 
-rc = prepare_ring_for_helper(d, iorp->gfn, &iorp->page, &iorp->va);
+rc = prepare_ring_for_helper(d, gfn_x(iorp->gfn), &iorp->page,
+ &iorp->va);
 
 if ( rc )
 hvm_unmap_ioreq_gfn(s, buf);
@@ -309,10 +309,10 @@ static void hvm_remove_ioreq_gfn(struct hvm_ioreq_server 
*s, bool buf)
 struct domain *d = s->domain;
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
-if ( IS_DEFAULT(s) || iorp->gfn == gfn_x(INVALID_GFN) )
+if ( IS_DEFAULT(s) || gfn_eq(iorp->gfn, INVALID_GFN) )
 return;
 
-if ( guest_physmap_remove_page(d, _gfn(iorp->gfn),
+if ( guest_physmap_remove_page(d, iorp->gfn,
_mfn(page_to_mfn(iorp->page)), 0) )
 domain_crash(d);
 clear_page(iorp->va);
@@ -324,12 +324,12 @@ static int hvm_add_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 int rc;
 
-if ( IS_DEFAULT(s) || iorp->gfn == gfn_x(INVALID_GFN) )
+if ( IS_DEFAULT(s) || gfn_eq(iorp->gfn, INVALID_GFN) )
 return 0;
 
 clear_page(iorp->va);
 
-rc = guest_physmap_add_page(d, _gfn(iorp->gfn),
+rc = guest_physmap_add_page(d, iorp->gfn,
 _mfn(page_to_mfn(iorp->page)), 0);
 if ( rc == 0 )
 paging_mark_dirty(d, _mfn(page_to_mfn(iorp->page)));
@@ -590,8 +590,8 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
 INIT_LIST_HEAD(&s->ioreq_vcpu_list);
 spin_lock_init(&s->bufioreq_lock);
 
-s->ioreq.gfn = gfn_x(INVALID_GFN);
-s->

Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-30 Thread Paul Durrant
> -Original Message-
> From: Julien Grall [mailto:julien.gr...@linaro.org]
> Sent: 30 October 2017 12:09
> To: Paul Durrant <paul.durr...@citrix.com>; Jan Beulich
> <jbeul...@suse.com>
> Cc: Julien Grall <julien.gr...@arm.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Wei Liu <wei.l...@citrix.com>; George
> Dunlap <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>;
> Stefano Stabellini <sstabell...@kernel.org>; xen-de...@lists.xenproject.org;
> Konrad Rzeszutek Wilk <konrad.w...@oracle.com>; Daniel De Graaf
> <dgde...@tycho.nsa.gov>; Tim (Xen.org) <t...@xen.org>
> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> HYPERVISOR_memory_op to acquire guest resources
> 
> Hi Paul,
> 
> On 27/10/17 16:19, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Julien Grall [mailto:julien.gr...@linaro.org]
> >> Sent: 27 October 2017 12:46
> >> To: Jan Beulich <jbeul...@suse.com>; Paul Durrant
> >> <paul.durr...@citrix.com>
> >> Cc: Julien Grall <julien.gr...@arm.com>; Andrew Cooper
> >> <andrew.coop...@citrix.com>; Wei Liu <wei.l...@citrix.com>; George
> >> Dunlap <george.dun...@citrix.com>; Ian Jackson
> <ian.jack...@citrix.com>;
> >> Stefano Stabellini <sstabell...@kernel.org>; xen-
> de...@lists.xenproject.org;
> >> Konrad Rzeszutek Wilk <konrad.w...@oracle.com>; Daniel De Graaf
> >> <dgde...@tycho.nsa.gov>; Tim (Xen.org) <t...@xen.org>
> >> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> >> HYPERVISOR_memory_op to acquire guest resources
> >>
> >> Hi,
> >>
> >> On 26/10/17 16:39, Jan Beulich wrote:
> >>>>>> On 26.10.17 at 17:32, <julien.gr...@linaro.org> wrote:
> >>>> On 26/10/17 16:26, Jan Beulich wrote:
> >>>>>>>> On 17.10.17 at 15:24, <paul.durr...@citrix.com> wrote:
> >>>>>> +/* IN/OUT - If the tools domain is PV then, upon return, frame_list
> >>>>>> + *  will be populated with the MFNs of the resource.
> >>>>>> + *  If the tools domain is HVM then it is expected that, on
> >>>>>> + *  entry, frame_list will be populated with a list of GFNs
> >>>>>> + *  that will be mapped to the MFNs of the resource.
> >>>>>> + *  If -EIO is returned then the frame_list has only been
> >>>>>> + *  partially mapped and it is up to the caller to unmap all
> >>>>>> + *  the GFNs.
> >>>>>> + *  This parameter may be NULL if nr_frames is 0.
> >>>>>> + */
> >>>>>> +XEN_GUEST_HANDLE(xen_ulong_t) frame_list;
> >>>>>
> >>>>> This is still xen_ulong_t, which I can live with, but then you shouldn't
> >>>>> copy into / out of arrays of other types in acquire_resource() (the
> >>>>> more that this is common code, and iirc xen_ulong_t and
> >>>>> unsigned long aren't the same thing on ARM32).
> >>>>
> >>>> xen_ulong_t is always 64-bit on Arm (32-bit and 64-bit). But shouldn't
> >>>> we use xen_pfn_t here?
> >>>
> >>> I had put this question up earlier, but iirc Paul didn't like it.
> >>
> >> I'd like to understand why Paul doesn't like it. We should never assume
> >> that a frame fits in xen_ulong_t. xen_pfn_t was exactly introduced for
> >> that purpose.
> >
> > My reservation is whether xen_pfn_t is intended to hold either gfns or
> > mfns, since this hypercall uses the same array for both. If it is suitable
> > then I am happy to change it, but Andrew led me to believe otherwise.
> 
> Looking at the public headers, xen_pfn_t is being used for both MFN (see
> xenpf_add_memtype) and GFN (see gnttab_setup_table).
> 
> So I think it would be fine to do the same here.

Yes, I'm going to change it in the next version.
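
For illustration only, the concern about copying between arrays of different
element types can be sketched as below. The names guest_frame_t and
copy_frames_out() are hypothetical and do not appear in the patch;
guest_frame_t merely stands in for a frame type that is 64 bits wide
regardless of the native word size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a guest-visible frame type that is always 64-bit. */
typedef uint64_t guest_frame_t;

/*
 * Copy nr frame numbers from a native 'unsigned long' array into the
 * guest-visible array one element at a time, so that each value is
 * converted to the destination width. A raw memcpy() of
 * nr * sizeof(unsigned long) bytes would only be right if both element
 * types happened to have the same size.
 */
void copy_frames_out(guest_frame_t *dst, const unsigned long *src, size_t nr)
{
    size_t i;

    /* A NULL array is acceptable only when there is nothing to copy. */
    assert(nr == 0 || (dst != NULL && src != NULL));

    for ( i = 0; i < nr; i++ )
        dst[i] = src[i];
}
```

Using one frame type on both sides of the interface would make a conversion
loop like this unnecessary.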

  Cheers,

Paul

> Cheers,
> 
> --
> Julien Grall
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-30 Thread Paul Durrant
> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 26 October 2017 16:27
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Julien Grall <julien.gr...@arm.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Wei Liu <wei.l...@citrix.com>; George
> Dunlap <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>;
> Stefano Stabellini <sstabell...@kernel.org>; xen-de...@lists.xenproject.org;
> Konrad Rzeszutek Wilk <konrad.w...@oracle.com>; Daniel De Graaf
> <dgde...@tycho.nsa.gov>; Tim (Xen.org) <t...@xen.org>
> Subject: Re: [PATCH v12 05/11] x86/mm: add HYPERVISOR_memory_op to
> acquire guest resources
> 
> >>> On 17.10.17 at 15:24, <paul.durr...@citrix.com> wrote:
> > @@ -535,6 +588,48 @@ int compat_memory_op(unsigned int cmd,
> XEN_GUEST_HANDLE_PARAM(void) compat)
> >  rc = -EFAULT;
> >  break;
> >
> > +case XENMEM_acquire_resource:
> > +{
> > +const xen_ulong_t *xen_frame_list =
> > +(xen_ulong_t *)(nat.mar + 1);
> > +compat_ulong_t *compat_frame_list =
> > +(compat_ulong_t *)(nat.mar + 1);
> > +
> > +if ( cmp.mar.nr_frames == 0 )
> 
> Doesn't this need to be compat_handle_is_null(cmp.mar.frame_list), or
> a combination of both?

Sorry, yes this was a hang-over from the old scheme.

> 
> > +{
> > +DEFINE_XEN_GUEST_HANDLE(compat_mem_acquire_resource_t);
> > +
> > +if ( __copy_field_to_guest(
> > + guest_handle_cast(compat,
> > +   compat_mem_acquire_resource_t),
> > + &cmp.mar, nr_frames) )
> > +return -EFAULT;
> > +}
> > +else
> > +{
> > +/*
> > + * NOTE: the smaller compat array overwrites the native
> > + *   array.
> > + */
> 
> I think I had already asked for a respective BUILD_BUG_ON().

You asked for the comment. I can't find where you asked for a BUILD_BUG_ON() 
but I can certainly add one.
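
As a rough sketch of what such a check guards (not the code from the patch):
the in-place conversion is only safe when the compat element is no larger
than the native one. The types native_frame_t and compat_frame_t and the
function narrow_in_place() are hypothetical stand-ins, and _Static_assert
plays the role of BUILD_BUG_ON() here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the native and compat element types. */
typedef uint64_t native_frame_t;
typedef uint32_t compat_frame_t;

/* Compile-time check in the spirit of BUILD_BUG_ON(). */
_Static_assert(sizeof(compat_frame_t) <= sizeof(native_frame_t),
               "compat array must not be larger than the native array");

/*
 * Overwrite an array of nr native entries with the equivalent compat
 * entries, in the same buffer. Walking forwards is safe because entry i
 * of the smaller array never reaches a native entry that has not been
 * read yet.
 * Returns 0 on success, -1 if a value does not fit the compat type (in
 * which case the entries before it have already been converted).
 */
int narrow_in_place(unsigned char *buf, unsigned int nr)
{
    unsigned int i;

    for ( i = 0; i < nr; i++ )
    {
        native_frame_t wide;
        compat_frame_t narrow;

        memcpy(&wide, buf + i * sizeof(wide), sizeof(wide));
        narrow = (compat_frame_t)wide;
        if ( narrow != wide )
            return -1;
        memcpy(buf + i * sizeof(narrow), &narrow, sizeof(narrow));
    }

    return 0;
}
```

If the size relationship were ever reversed, the forward walk would clobber
entries it had yet to read, which is what the build-time check is meant to
catch.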

> 
> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -965,6 +965,95 @@ static long xatp_permission_check(struct domain
> *d, unsigned int space)
> >  return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
> >  }
> >
> > +static int acquire_resource(
> > +XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
> > +{
> > +struct domain *d, *currd = current->domain;
> > +xen_mem_acquire_resource_t xmar;
> > +unsigned long mfn_list[2];
> > +int rc;
> > +
> > +if ( copy_from_guest(&xmar, arg, 1) )
> > +return -EFAULT;
> > +
> > +if ( xmar.pad != 0 )
> > +return -EINVAL;
> > +
> > +if ( guest_handle_is_null(xmar.frame_list) )
> > +{
> > +/* Special case for querying implementation limit */
> > +if ( xmar.nr_frames == 0 )
> 
> Perhaps invert the condition to reduce ...
> 
> > +{
> > +xmar.nr_frames = ARRAY_SIZE(mfn_list);
> > +
> > +if ( __copy_field_to_guest(arg, &xmar, nr_frames) )
> > +return -EFAULT;
> > +
> > +return 0;
> > +}
> 
> ... overall indentation?
> 
> > +return -EINVAL;
> > +}
> > +
> > +if ( xmar.nr_frames == 0 )
> > +return -EINVAL;
> 
> Why? (Almost?) everywhere else zero counts are simply no-ops, which
> result in success returns.

Ok, I'll drop the check.

> 
> > +if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) )
> > +return -E2BIG;
> > +
> > +d = rcu_lock_domain_by_any_id(xmar.domid);
> 
> This being a tools only interface, why "by_any_id" instead of
> "remote_domain_by_id"? In particular ...
> 
> > +if ( d == NULL )
> > +return -ESRCH;
> > +
> > +rc = xsm_domain_resource_map(XSM_DM_PRIV, d);
> 
> ... an unprivileged dm domain should probably not be permitted to
> invoke this on itself.

True.

> 
> > +if ( rc )
> > +goto out;
> > +
> > +switch ( xmar.type )
> > +{
> > +default:
> > +rc = -EOPNOTSUPP;
> > +break;
> > +}
> > +
> > +if ( rc )
> > +goto out;
> > +
> > +if ( !paging_mode_translate(currd) )
> > +

Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-27 Thread Paul Durrant
> -----Original Message-----
> From: Julien Grall [mailto:julien.gr...@linaro.org]
> Sent: 27 October 2017 12:46
> To: Jan Beulich <jbeul...@suse.com>; Paul Durrant
> <paul.durr...@citrix.com>
> Cc: Julien Grall <julien.gr...@arm.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Wei Liu <wei.l...@citrix.com>; George
> Dunlap <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>;
> Stefano Stabellini <sstabell...@kernel.org>; xen-de...@lists.xenproject.org;
> Konrad Rzeszutek Wilk <konrad.w...@oracle.com>; Daniel De Graaf
> <dgde...@tycho.nsa.gov>; Tim (Xen.org) <t...@xen.org>
> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> HYPERVISOR_memory_op to acquire guest resources
> 
> Hi,
> 
> On 26/10/17 16:39, Jan Beulich wrote:
> >>>> On 26.10.17 at 17:32, <julien.gr...@linaro.org> wrote:
> >> On 26/10/17 16:26, Jan Beulich wrote:
> >>>>>> On 17.10.17 at 15:24, <paul.durr...@citrix.com> wrote:
> >>>> +/* IN/OUT - If the tools domain is PV then, upon return, frame_list
> >>>> + *  will be populated with the MFNs of the resource.
> >>>> + *  If the tools domain is HVM then it is expected that, on
> >>>> + *  entry, frame_list will be populated with a list of GFNs
> >>>> + *  that will be mapped to the MFNs of the resource.
> >>>> + *  If -EIO is returned then the frame_list has only been
> >>>> + *  partially mapped and it is up to the caller to unmap all
> >>>> + *  the GFNs.
> >>>> + *  This parameter may be NULL if nr_frames is 0.
> >>>> + */
> >>>> +XEN_GUEST_HANDLE(xen_ulong_t) frame_list;
> >>>
> >>> This is still xen_ulong_t, which I can live with, but then you shouldn't
> >>> copy into / out of arrays of other types in acquire_resource() (the
> >>> more that this is common code, and iirc xen_ulong_t and
> >>> unsigned long aren't the same thing on ARM32).
> >>
> >> xen_ulong_t is always 64-bit on Arm (32-bit and 64-bit). But shouldn't
> >> we use xen_pfn_t here?
> >
> > I had put this question up earlier, but iirc Paul didn't like it.
> 
> I'd like to understand why Paul doesn't like it. We should never assume
> that a frame fits in xen_ulong_t. xen_pfn_t was exactly introduced for
> that purpose.

My reservation is whether xen_pfn_t is intended to hold either gfns or mfns, 
since this hypercall uses the same array for both. If it is suitable then I am 
happy to change it, but Andrew led me to believe otherwise.

  Paul

> 
> Cheers,
> 
> --
> Julien Grall


Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-25 Thread Paul Durrant
> -----Original Message-----
> From: Julien Grall [mailto:julien.gr...@linaro.org]
> Sent: 23 October 2017 20:04
> To: Paul Durrant <paul.durr...@citrix.com>; 'Jan Beulich'
> <jbeul...@suse.com>
> Cc: Julien Grall <julien.gr...@arm.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; George Dunlap
> <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Roger
> Pau Monne <roger@citrix.com>; Wei Liu <wei.l...@citrix.com>; Stefano
> Stabellini <sstabell...@kernel.org>; xen-de...@lists.xenproject.org; Konrad
> Rzeszutek Wilk <konrad.w...@oracle.com>; Daniel De Graaf
> <dgde...@tycho.nsa.gov>; Tim (Xen.org) <t...@xen.org>
> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> HYPERVISOR_memory_op to acquire guest resources
> 
> 
> 
> On 20/10/17 11:10, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Julien Grall [mailto:julien.gr...@linaro.org]
> >> Sent: 20 October 2017 11:00
> >> To: Paul Durrant <paul.durr...@citrix.com>; 'Jan Beulich'
> >> <jbeul...@suse.com>
> >> Cc: Julien Grall <julien.gr...@arm.com>; Andrew Cooper
> >> <andrew.coop...@citrix.com>; George Dunlap
> >> <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>;
> Roger
> >> Pau Monne <roger@citrix.com>; Wei Liu <wei.l...@citrix.com>;
> Stefano
> >> Stabellini <sstabell...@kernel.org>; xen-de...@lists.xenproject.org;
> Konrad
> >> Rzeszutek Wilk <konrad.w...@oracle.com>; Daniel De Graaf
> >> <dgde...@tycho.nsa.gov>; Tim (Xen.org) <t...@xen.org>
> >> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> >> HYPERVISOR_memory_op to acquire guest resources
> >>
> >> Hi Paul,
> >>
> >> On 20/10/17 09:26, Paul Durrant wrote:
> >>>> -----Original Message-----
> >>>> From: Jan Beulich [mailto:jbeul...@suse.com]
> >>>> Sent: 20 October 2017 07:25
> >>>> To: Julien Grall <julien.gr...@linaro.org>
> >>>> Cc: Julien Grall <julien.gr...@arm.com>; Andrew Cooper
> >>>> <andrew.coop...@citrix.com>; George Dunlap
> >>>> <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>;
> Paul
> >>>> Durrant <paul.durr...@citrix.com>; Roger Pau Monne
> >>>> <roger@citrix.com>; Wei Liu <wei.l...@citrix.com>; Stefano
> Stabellini
> >>>> <sstabell...@kernel.org>; xen-de...@lists.xenproject.org; Konrad
> >> Rzeszutek
> >>>> Wilk <konrad.w...@oracle.com>; Daniel De Graaf
> >> <dgde...@tycho.nsa.gov>;
> >>>> Tim (Xen.org) <t...@xen.org>
> >>>> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> >>>> HYPERVISOR_memory_op to acquire guest resources
> >>>>
> >>>>>>> On 19.10.17 at 18:21, <julien.gr...@linaro.org> wrote:
> >>>>> Looking a bit more at the resource you can acquire from this hypercall.
> >>>>> Some of them are allocated using alloc_xenheap_page() so not assigned
> >>>>> to a domain.
> >>>>>
> >>>>> So I am not sure how you can expect a function set_foreign_p2m_entry
> >>>>> to take reference in that case.
> >>>>
> >>>> Hmm, with the domain parameter added, DOMID_XEN there (for
> >>>> Xen heap pages) could identify no references to be taken, if that
> >>>> was really the intended behavior in that case. However, even for
> >>>> Xen heap pages life time tracking ought to be done - it is for a
> >>>> reason that share_xen_page_with_guest() assigns the target
> >>>> domain as the owner of such pages, as that allows get_page() to
> >>>> succeed for them.
> >>>>
> >>>
> >
> > Hi Julien,
> >
> >>> So, nothing I'm doing here is making anything worse, right? Grant tables
> >>> are assigned to the guest, and IOREQ server pages are allocated with
> >>> alloc_domheap_page() so nothing is anonymous.
> >>
> >> I don't think grant tables are assigned to the guest today. They are
> >> allocated using xenheap_pages() and I can't find
> >> share_xen_page_with_guest().
> >
> > The guest would not be able to map them if they were not assigned in
> > some way!
> 
> Do you mean for PV? For HVM/PVH, we don't check whether 

Re: [Xen-devel] [PATCH v2 2/5] xen: Provide XEN_DMOP_add_to_physmap

2017-10-23 Thread Paul Durrant
> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 23 October 2017 13:18
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>; George Dunlap
> <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Ross
> Lagerwall <ross.lagerw...@citrix.com>; Wei Liu <wei.l...@citrix.com>;
> Stefano Stabellini <sstabell...@kernel.org>; xen-devel@lists.xen.org; Konrad
> Rzeszutek Wilk <konrad.w...@oracle.com>; Tim (Xen.org) <t...@xen.org>
> Subject: RE: [Xen-devel] [PATCH v2 2/5] xen: Provide
> XEN_DMOP_add_to_physmap
> 
> >>> On 23.10.17 at 14:03, <paul.durr...@citrix.com> wrote:
> >> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> >> Ross Lagerwall
> >> Sent: 23 October 2017 10:05
> >> --- a/xen/include/public/hvm/dm_op.h
> >> +++ b/xen/include/public/hvm/dm_op.h
> >> @@ -368,6 +368,22 @@ struct xen_dm_op_remote_shutdown {
> >> /* (Other reason values are not blocked) */
> >>  };
> >>
> >> +/*
> >> + * XEN_DMOP_add_to_physmap : Sets the GPFNs at which a page range appears in
> >> + *   the specified guest's pseudophysical address
> >> + *   space. Identical to XENMEM_add_to_physmap with
> >> + *   space == XENMAPSPACE_gmfn_range.
> >> + */
> >> +#define XEN_DMOP_add_to_physmap 17
> >> +
> >> +struct xen_dm_op_add_to_physmap {
> >> +uint16_t size; /* Number of GMFNs to process. */
> >> +uint16_t pad0;
> >> +uint32_t pad1;
> >
> > I think you can lose pad1 by putting idx and gpfn above size rather than
> > below (since IIRC we only need pad up to the next 4 byte boundary).
> 
> No, tail padding would then still be wanted, I think.

Ok.  I stand corrected :-)
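
For reference, a minimal sketch of why the tail padding remains, in plain
C11. The struct names are made up for illustration, and _Alignas(8) on a
uint64_t member is used here as an assumed stand-in for uint64_aligned_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as posted: explicit padding ahead of the 64-bit fields. */
struct layout_posted {
    uint16_t size;
    uint16_t pad0;
    uint32_t pad1;
    _Alignas(8) uint64_t idx;
    _Alignas(8) uint64_t gpfn;
};

/* Suggested reordering with pad1 dropped. */
struct layout_reordered {
    _Alignas(8) uint64_t idx;
    _Alignas(8) uint64_t gpfn;
    uint16_t size;
    uint16_t pad0;
};

/*
 * Bytes the compiler adds after the last member of the reordered layout.
 * The struct inherits the 8-byte alignment of its 64-bit members, so its
 * size is rounded up to a multiple of 8.
 */
size_t implicit_tail_padding(void)
{
    size_t used = offsetof(struct layout_reordered, pad0) + sizeof(uint16_t);

    return sizeof(struct layout_reordered) - used;
}
```

Both layouts come out at 24 bytes; the reordered one just moves the last
4 bytes from an explicit field to implicit padding, which the pad checks
could not then validate.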

  Paul

> 
> Jan




Re: [Xen-devel] [PATCH v2 4/5] tools: libxendevicemodel: Provide xendevicemodel_add_to_physmap

2017-10-23 Thread Paul Durrant
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Ross Lagerwall
> Sent: 23 October 2017 10:05
> To: xen-devel@lists.xen.org
> Cc: Ross Lagerwall <ross.lagerw...@citrix.com>; Ian Jackson
> <ian.jack...@citrix.com>; Wei Liu <wei.l...@citrix.com>
> Subject: [Xen-devel] [PATCH v2 4/5] tools: libxendevicemodel: Provide
> xendevicemodel_add_to_physmap
> 
> Signed-off-by: Ross Lagerwall <ross.lagerw...@citrix.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
> 
> Changed in v2:
> * Make it operate on a range.
> 
>  tools/libs/devicemodel/Makefile |  2 +-
>  tools/libs/devicemodel/core.c   | 21 +
>  tools/libs/devicemodel/include/xendevicemodel.h | 15 +++
>  tools/libs/devicemodel/libxendevicemodel.map|  5 +
>  4 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libs/devicemodel/Makefile
> b/tools/libs/devicemodel/Makefile
> index 342371a..5b2df7a 100644
> --- a/tools/libs/devicemodel/Makefile
> +++ b/tools/libs/devicemodel/Makefile
> @@ -2,7 +2,7 @@ XEN_ROOT = $(CURDIR)/../../..
>  include $(XEN_ROOT)/tools/Rules.mk
> 
>  MAJOR= 1
> -MINOR= 1
> +MINOR= 2
>  SHLIB_LDFLAGS += -Wl,--version-script=libxendevicemodel.map
> 
>  CFLAGS   += -Werror -Wmissing-prototypes
> diff --git a/tools/libs/devicemodel/core.c b/tools/libs/devicemodel/core.c
> index b66d4f9..07953d3 100644
> --- a/tools/libs/devicemodel/core.c
> +++ b/tools/libs/devicemodel/core.c
> @@ -564,6 +564,27 @@ int xendevicemodel_shutdown(
>  return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
>  }
> 
> +int xendevicemodel_add_to_physmap(
> +xendevicemodel_handle *dmod, domid_t domid, uint16_t size, uint64_t
> idx,
> +uint64_t gpfn)
> +{
> +struct xen_dm_op op;
> +struct xen_dm_op_add_to_physmap *data;
> +
> +memset(&op, 0, sizeof(op));
> +
> +op.op = XEN_DMOP_add_to_physmap;
> +data = &op.u.add_to_physmap;
> +
> +data->size = size;
> +data->pad0 = 0;
> +data->pad1 = 0;
> +data->idx = idx;
> +data->gpfn = gpfn;
> +
> +return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
> +}
> +
>  int xendevicemodel_restrict(xendevicemodel_handle *dmod, domid_t
> domid)
>  {
>  return osdep_xendevicemodel_restrict(dmod, domid);
> diff --git a/tools/libs/devicemodel/include/xendevicemodel.h
> b/tools/libs/devicemodel/include/xendevicemodel.h
> index dda0bc7..6967e58 100644
> --- a/tools/libs/devicemodel/include/xendevicemodel.h
> +++ b/tools/libs/devicemodel/include/xendevicemodel.h
> @@ -326,6 +326,21 @@ int xendevicemodel_shutdown(
>  xendevicemodel_handle *dmod, domid_t domid, unsigned int reason);
> 
>  /**
> + * Sets the GPFNs at which a page range appears in the domain's
> + * pseudophysical address space.
> + *
> + * @parm dmod a handle to an open devicemodel interface.
> + * @parm domid the domain id to be serviced
> + * @parm size Number of GMFNs to process
> + * @parm idx Index into GMFN space
> + * @parm gpfn Starting GPFN where the GMFNs should appear
> + * @return 0 on success, -1 on failure.
> + */
> +int xendevicemodel_add_to_physmap(
> +xendevicemodel_handle *dmod, domid_t domid, uint16_t size, uint64_t
> idx,
> +uint64_t gpfn);
> +
> +/**
>   * This function restricts the use of this handle to the specified
>   * domain.
>   *
> diff --git a/tools/libs/devicemodel/libxendevicemodel.map
> b/tools/libs/devicemodel/libxendevicemodel.map
> index cefd32b..4a19ecb 100644
> --- a/tools/libs/devicemodel/libxendevicemodel.map
> +++ b/tools/libs/devicemodel/libxendevicemodel.map
> @@ -27,3 +27,8 @@ VERS_1.1 {
>   global:
>   xendevicemodel_shutdown;
>  } VERS_1.0;
> +
> +VERS_1.2 {
> + global:
> + xendevicemodel_add_to_physmap;
> +} VERS_1.1;
> --
> 2.9.5
> 
> 


Re: [Xen-devel] [PATCH v2 3/5] xen: Provide XEN_DMOP_pin_memory_cacheattr

2017-10-23 Thread Paul Durrant
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Ross Lagerwall
> Sent: 23 October 2017 10:05
> To: xen-devel@lists.xen.org
> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Tim
> (Xen.org) <t...@xen.org>; Ross Lagerwall <ross.lagerw...@citrix.com>; Jan
> Beulich <jbeul...@suse.com>
> Subject: [Xen-devel] [PATCH v2 3/5] xen: Provide
> XEN_DMOP_pin_memory_cacheattr
> 
> Provide XEN_DMOP_pin_memory_cacheattr to allow a deprivileged QEMU to
> pin the caching type of RAM after moving the VRAM. It is equivalent to
> XEN_DOMCTL_pin_memory_cacheattr.
> 
> Signed-off-by: Ross Lagerwall <ross.lagerw...@citrix.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
> 
> Changed in v2:
> * Check pad is 0.
> 
>  xen/arch/x86/hvm/dm.c  | 18 ++
>  xen/include/public/hvm/dm_op.h | 14 ++
>  xen/include/xlat.lst   |  1 +
>  3 files changed, 33 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
> index 0027567..42d02cc 100644
> --- a/xen/arch/x86/hvm/dm.c
> +++ b/xen/arch/x86/hvm/dm.c
> @@ -21,6 +21,7 @@
> 
>  #include 
>  #include 
> +#include 
>  #include 
> 
>  #include 
> @@ -670,6 +671,22 @@ static int dm_op(const struct dmop_args *op_args)
>  break;
>  }
> 
> +case XEN_DMOP_pin_memory_cacheattr:
> +{
> +const struct xen_dm_op_pin_memory_cacheattr *data =
> +&op.u.pin_memory_cacheattr;
> +
> +if ( data->pad )
> +{
> +rc = -EINVAL;
> +break;
> +}
> +
> +rc = hvm_set_mem_pinned_cacheattr(d, data->start, data->end,
> +  data->type);
> +break;
> +}
> +
>  default:
>  rc = -EOPNOTSUPP;
>  break;
> @@ -700,6 +717,7 @@ CHECK_dm_op_inject_event;
>  CHECK_dm_op_inject_msi;
>  CHECK_dm_op_remote_shutdown;
>  CHECK_dm_op_add_to_physmap;
> +CHECK_dm_op_pin_memory_cacheattr;
> 
>  int compat_dm_op(domid_t domid,
>   unsigned int nr_bufs,
> diff --git a/xen/include/public/hvm/dm_op.h
> b/xen/include/public/hvm/dm_op.h
> index f685110..f9c86b8 100644
> --- a/xen/include/public/hvm/dm_op.h
> +++ b/xen/include/public/hvm/dm_op.h
> @@ -384,6 +384,19 @@ struct xen_dm_op_add_to_physmap {
>  uint64_aligned_t gpfn; /* Starting GPFN where the GMFNs should appear.
> */
>  };
> 
> +/*
> + * XEN_DMOP_pin_memory_cacheattr : Pin caching type of RAM space.
> + * Identical to XEN_DOMCTL_pin_mem_cacheattr.
> + */
> +#define XEN_DMOP_pin_memory_cacheattr 18
> +
> +struct xen_dm_op_pin_memory_cacheattr {
> +uint64_aligned_t start; /* Start gfn. */
> +uint64_aligned_t end;   /* End gfn. */
> +uint32_t type;  /* XEN_DOMCTL_MEM_CACHEATTR_* */
> +uint32_t pad;
> +};
> +
>  struct xen_dm_op {
>  uint32_t op;
>  uint32_t pad;
> @@ -406,6 +419,7 @@ struct xen_dm_op {
>  map_mem_type_to_ioreq_server;
>  struct xen_dm_op_remote_shutdown remote_shutdown;
>  struct xen_dm_op_add_to_physmap add_to_physmap;
> +struct xen_dm_op_pin_memory_cacheattr pin_memory_cacheattr;
>  } u;
>  };
> 
> diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
> index d40bac6..fffb308 100644
> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -65,6 +65,7 @@
>  ?dm_op_inject_msihvm/dm_op.h
>  ?dm_op_ioreq_server_rangehvm/dm_op.h
>  ?dm_op_modified_memory   hvm/dm_op.h
> +?dm_op_pin_memory_cacheattr  hvm/dm_op.h
>  ?dm_op_remote_shutdown   hvm/dm_op.h
>  ?dm_op_set_ioreq_server_statehvm/dm_op.h
>  ?dm_op_set_isa_irq_level hvm/dm_op.h
> --
> 2.9.5
> 
> 


Re: [Xen-devel] [PATCH v2 2/5] xen: Provide XEN_DMOP_add_to_physmap

2017-10-23 Thread Paul Durrant
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Ross Lagerwall
> Sent: 23 October 2017 10:05
> To: xen-devel@lists.xen.org
> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Tim
> (Xen.org) <t...@xen.org>; Ross Lagerwall <ross.lagerw...@citrix.com>; Jan
> Beulich <jbeul...@suse.com>
> Subject: [Xen-devel] [PATCH v2 2/5] xen: Provide
> XEN_DMOP_add_to_physmap
> 
> Provide XEN_DMOP_add_to_physmap, a limited version of
> XENMEM_add_to_physmap to allow a deprivileged QEMU to move VRAM when a
> guest programs its BAR. It is equivalent to XENMEM_add_to_physmap with
> space == XENMAPSPACE_gmfn_range.
> 
> Signed-off-by: Ross Lagerwall <ross.lagerw...@citrix.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

...with one observation below...

> ---
> 
> Changed in v2:
> * Make it operate on a range.
> 
>  xen/arch/x86/hvm/dm.c  | 31 +++
>  xen/include/public/hvm/dm_op.h | 17 +
>  xen/include/xlat.lst   |  1 +
>  3 files changed, 49 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
> index 32ade95..0027567 100644
> --- a/xen/arch/x86/hvm/dm.c
> +++ b/xen/arch/x86/hvm/dm.c
> @@ -640,6 +640,36 @@ static int dm_op(const struct dmop_args *op_args)
>  break;
>  }
> 
> +case XEN_DMOP_add_to_physmap:
> +{
> +struct xen_dm_op_add_to_physmap *data =
> +&op.u.add_to_physmap;
> +struct xen_add_to_physmap xatp = {
> +.domid = op_args->domid,
> +.size = data->size,
> +.space = XENMAPSPACE_gmfn_range,
> +.idx = data->idx,
> +.gpfn = data->gpfn,
> +};
> +
> +if ( data->pad0 || data->pad1 )
> +{
> +rc = -EINVAL;
> +break;
> +}
> +
> +rc = xenmem_add_to_physmap(d, &xatp, 0);
> +if ( rc > 0 )
> +{
> +data->size -= rc;
> +data->idx += rc;
> +data->gpfn += rc;
> +const_op = false;
> +rc = -ERESTART;
> +}
> +break;
> +}
> +
>  default:
>  rc = -EOPNOTSUPP;
>  break;
> @@ -669,6 +699,7 @@ CHECK_dm_op_set_mem_type;
>  CHECK_dm_op_inject_event;
>  CHECK_dm_op_inject_msi;
>  CHECK_dm_op_remote_shutdown;
> +CHECK_dm_op_add_to_physmap;
> 
>  int compat_dm_op(domid_t domid,
>   unsigned int nr_bufs,
> diff --git a/xen/include/public/hvm/dm_op.h
> b/xen/include/public/hvm/dm_op.h
> index e173085..f685110 100644
> --- a/xen/include/public/hvm/dm_op.h
> +++ b/xen/include/public/hvm/dm_op.h
> @@ -368,6 +368,22 @@ struct xen_dm_op_remote_shutdown {
> /* (Other reason values are not blocked) */
>  };
> 
> +/*
> + * XEN_DMOP_add_to_physmap : Sets the GPFNs at which a page range
> appears in
> + *   the specified guest's pseudophysical address
> + *   space. Identical to XENMEM_add_to_physmap with
> + *   space == XENMAPSPACE_gmfn_range.
> + */
> +#define XEN_DMOP_add_to_physmap 17
> +
> +struct xen_dm_op_add_to_physmap {
> +uint16_t size; /* Number of GMFNs to process. */
> +uint16_t pad0;
> +uint32_t pad1;

I think you can lose pad1 by putting idx and gpfn above size rather than below 
(since IIRC we only need pad up to the next 4 byte boundary).

  Paul

> +uint64_aligned_t idx;  /* Index into GMFN space. */
> +uint64_aligned_t gpfn; /* Starting GPFN where the GMFNs should
> appear. */
> +};
> +
>  struct xen_dm_op {
>  uint32_t op;
>  uint32_t pad;
> @@ -389,6 +405,7 @@ struct xen_dm_op {
>  struct xen_dm_op_map_mem_type_to_ioreq_server
>  map_mem_type_to_ioreq_server;
>  struct xen_dm_op_remote_shutdown remote_shutdown;
> +struct xen_dm_op_add_to_physmap add_to_physmap;
>  } u;
>  };
> 
> diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
> index 4346cbe..d40bac6 100644
> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -57,6 +57,7 @@
>  ?grant_entry_v2  grant_table.h
>  ?gnttab_swap_grant_ref   grant_table.h
>  !dm_op_buf   hvm/dm_op.h
> +?dm_op_add_to_physmaphvm/dm_op.h
>  ?dm_op_create_ioreq_server   hvm/dm_op.h
>  ?dm_op_destroy_ioreq_server  hvm/dm_op.h
>  ?dm_op_get_ioreq_server_info hvm/dm_op.h
> --
> 2.9.5
> 
> 


Re: [Xen-devel] [PATCH v2 1/5] xen/mm: Make xenmem_add_to_physmap global

2017-10-23 Thread Paul Durrant
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Ross Lagerwall
> Sent: 23 October 2017 10:05
> To: xen-devel@lists.xen.org
> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Tim
> (Xen.org) <t...@xen.org>; Ross Lagerwall <ross.lagerw...@citrix.com>; Jan
> Beulich <jbeul...@suse.com>
> Subject: [Xen-devel] [PATCH v2 1/5] xen/mm: Make
> xenmem_add_to_physmap global
> 
> Make it global in preparation to be called by a new dmop.
> 
> Signed-off-by: Ross Lagerwall <ross.lagerw...@citrix.com>
> 
> ---

You need to delete the above '---' otherwise this R-b will not get carried 
through into the commit.

  Paul

> Reviewed-by: Paul Durrant <paul.durr...@citrix.com>
> ---
>  xen/common/memory.c  | 5 ++---
>  xen/include/xen/mm.h | 3 +++
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index ad987e0..c4f05c7 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -741,9 +741,8 @@ static long
> memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange
> _t) arg)
>  return rc;
>  }
> 
> -static int xenmem_add_to_physmap(struct domain *d,
> - struct xen_add_to_physmap *xatp,
> - unsigned int start)
> +int xenmem_add_to_physmap(struct domain *d, struct
> xen_add_to_physmap *xatp,
> +  unsigned int start)
>  {
>  unsigned int done = 0;
>  long rc = 0;
> diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
> index e813c07..0e0e511 100644
> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -579,6 +579,9 @@ int xenmem_add_to_physmap_one(struct domain
> *d, unsigned int space,
>union xen_add_to_physmap_batch_extra extra,
>unsigned long idx, gfn_t gfn);
> 
> +int xenmem_add_to_physmap(struct domain *d, struct
> xen_add_to_physmap *xatp,
> +  unsigned int start);
> +
>  /* Return 0 on success, or negative on error. */
>  int __must_check guest_remove_page(struct domain *d, unsigned long
> gmfn);
>  int __must_check steal_page(struct domain *d, struct page_info *page,
> --
> 2.9.5
> 
> 


Re: [Xen-devel] Xen 4.9 is broken with last version of Win10

2017-10-23 Thread Paul Durrant
De-htmling...
Moving to xen-users (xen-devel to bcc)...

-
From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Berillions
Sent: 21 October 2017 17:50
To: xen-devel@lists.xen.org
Subject: [Xen-devel] Xen 4.9 is broken with last version of Win10

Hi guys,
I am sending this message to warn you that the latest official version of 
Windows 10 is broken with Xen. This version, called the "Fall Creators Update", 
was released a few days ago.
I ran my tests with this version (1709) and with the previous version (1703), 
called the "Creators Update", using SEABIOS.
Windows 10 version 1709:
SEABIOS: It boots from the CD/DVD-ROM, but when you have to choose the disk 
on which to install the system, Windows says that these drivers are obsolete 
and does not find the disk.
http://hpics.li/0082aa8
Here is my attempt at translating the French message:
Load a driver
Your computer needs a media driver which is missing. It can be a DVD, 
USB disk or hard disk driver. If you have a CD or a USB key with the driver, 
insert it now.

Windows 10 version 1703 :
SEABIOS : All works correctly.
http://hpics.li/0b9aaaf
This problem affects QEMU/KVM too; see here:
http://lists.nongnu.org/archive/html/qemu-discuss/2017-10/msg00044.html

Cheers,
Maxime
-

Hi,

  I just downloaded a copy of 1709 and I don't see any particular problem. What 
does your xl.cfg look like? I guess the problem is your choice of system disk 
emulation, which is why you see the same issue with KVM.

  Paul


Re: [Xen-devel] [PATCH] x86/xen: support priv-mapping in an HVM tools domain

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Boris Ostrovsky
> Sent: 20 October 2017 16:09
> To: Paul Durrant <paul.durr...@citrix.com>; x...@kernel.org; xen-
> de...@lists.xenproject.org; linux-ker...@vger.kernel.org
> Cc: Juergen Gross <jgr...@suse.com>; Thomas Gleixner
> <t...@linutronix.de>; Ingo Molnar <mi...@redhat.com>; H. Peter Anvin
> <h...@zytor.com>
> Subject: Re: [Xen-devel] [PATCH] x86/xen: support priv-mapping in an HVM
> tools domain
> 
> On 10/20/2017 04:35 AM, Paul Durrant wrote:
> >> -Original Message-
> >> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> >> Boris Ostrovsky
> >> Sent: 19 October 2017 18:45
> >> To: Paul Durrant <paul.durr...@citrix.com>; x...@kernel.org; xen-
> >> de...@lists.xenproject.org; linux-ker...@vger.kernel.org
> >> Cc: Juergen Gross <jgr...@suse.com>; Thomas Gleixner
> >> <t...@linutronix.de>; Ingo Molnar <mi...@redhat.com>; H. Peter Anvin
> >> <h...@zytor.com>
> >> Subject: Re: [Xen-devel] [PATCH] x86/xen: support priv-mapping in an
> HVM
> >> tools domain
> >>
> >> On 10/19/2017 11:26 AM, Paul Durrant wrote:
> >>> If the domain has XENFEAT_auto_translated_physmap then use of the PV-
> >>> specific HYPERVISOR_mmu_update hypercall is clearly incorrect.
> >>>
> >>> This patch adds checks in xen_remap_domain_gfn_array() and
> >>> xen_unmap_domain_gfn_array() which call through to the appropriate
> >>> xlate_mmu function if the feature is present. A check is also added
> >>> to xen_remap_domain_gfn_range() to fail with -EOPNOTSUPP since this
> >>> should not be used in an HVM tools domain.
> >>>
> >>> Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> >>> ---
> >>> Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
> >>> Cc: Juergen Gross <jgr...@suse.com>
> >>> Cc: Thomas Gleixner <t...@linutronix.de>
> >>> Cc: Ingo Molnar <mi...@redhat.com>
> >>> Cc: "H. Peter Anvin" <h...@zytor.com>
> >>> ---
> >>>  arch/x86/xen/mmu.c | 14 ++++++++++++--
> >>>  1 file changed, 12 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> >>> index 3e15345abfe7..d33e7dbe3129 100644
> >>> --- a/arch/x86/xen/mmu.c
> >>> +++ b/arch/x86/xen/mmu.c
> >>> @@ -172,6 +172,9 @@ int xen_remap_domain_gfn_range(struct
> >> vm_area_struct *vma,
> >>>  pgprot_t prot, unsigned domid,
> >>>  struct page **pages)
> >>>  {
> >>> + if (xen_feature(XENFEAT_auto_translated_physmap))
> >>> + return -EOPNOTSUPP;
> >>> +
> >> This is never called on XENFEAT_auto_translated_physmap domains, there
> >> is a check in privcmd_ioctl_mmap() for that.
> > Yes, that's true but it seems like the wrong place for such a check. I could
> > remove that one if you'd prefer.
> 
> I actually think that perhaps we could wrap privcmd_ioctl_mmap() with
> "#ifdef CONFIG_XEN_PV" (#else return -ENOSYS) and move
> xen_remap_domain_gfn_range() to mmu_pv.c. We can then remove it from ARM
> code too.
> 
> >
> >>>   return do_remap_gfn(vma, addr, &gfn, nr, NULL, prot, domid, pages);
> >>>  }
> >>>  EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
> >>> @@ -182,6 +185,10 @@ int xen_remap_domain_gfn_array(struct
> >> vm_area_struct *vma,
> >>>  int *err_ptr, pgprot_t prot,
> >>>  unsigned domid, struct page **pages)
> >>>  {
> >>> + if (xen_feature(XENFEAT_auto_translated_physmap))
> >>> + return xen_xlate_remap_gfn_array(vma, addr, gfn, nr,
> >> err_ptr,
> >>> +  prot, domid, pages);
> >>> +
> >> So how did this work before? In fact, I don't see any callers of
> >> xen_xlate_{re|un}map_gfn_range().
> > I assume you mean 'array' for the map since there is no
> > xen_xlate_remap_gfn_range() function. I'm not quite sure what you're
> > asking? Without this patch the mmu code in an x86 domain simply assumes
> > the domain is PV... the xlate code is currently only used via the arm mmu
> > code (where it clearly knows it's not PV). AFAICS this is 

Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Julien Grall [mailto:julien.gr...@linaro.org]
> Sent: 20 October 2017 11:00
> To: Paul Durrant <paul.durr...@citrix.com>; 'Jan Beulich'
> <jbeul...@suse.com>
> Cc: Julien Grall <julien.gr...@arm.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; George Dunlap
> <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Roger
> Pau Monne <roger@citrix.com>; Wei Liu <wei.l...@citrix.com>; Stefano
> Stabellini <sstabell...@kernel.org>; xen-de...@lists.xenproject.org; Konrad
> Rzeszutek Wilk <konrad.w...@oracle.com>; Daniel De Graaf
> <dgde...@tycho.nsa.gov>; Tim (Xen.org) <t...@xen.org>
> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> HYPERVISOR_memory_op to acquire guest resources
> 
> Hi Paul,
> 
> On 20/10/17 09:26, Paul Durrant wrote:
> >> -Original Message-
> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> Sent: 20 October 2017 07:25
> >> To: Julien Grall <julien.gr...@linaro.org>
> >> Cc: Julien Grall <julien.gr...@arm.com>; Andrew Cooper
> >> <andrew.coop...@citrix.com>; George Dunlap
> >> <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Paul
> >> Durrant <paul.durr...@citrix.com>; Roger Pau Monne
> >> <roger@citrix.com>; Wei Liu <wei.l...@citrix.com>; Stefano Stabellini
> >> <sstabell...@kernel.org>; xen-de...@lists.xenproject.org; Konrad
> Rzeszutek
> >> Wilk <konrad.w...@oracle.com>; Daniel De Graaf
> <dgde...@tycho.nsa.gov>;
> >> Tim (Xen.org) <t...@xen.org>
> >> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> >> HYPERVISOR_memory_op to acquire guest resources
> >>
> >>>>> On 19.10.17 at 18:21, <julien.gr...@linaro.org> wrote:
> >>> Looking a bit more at the resource you can acquire from this hypercall.
> >>> Some of them are allocated using alloc_xenheap_page() so not assigned
> to
> >>> a domain.
> >>>
> >>> So I am not sure how you can expect a function set_foreign_p2m_entry
> to
> >>> take reference in that case.
> >>
> >> Hmm, with the domain parameter added, DOMID_XEN there (for
> >> Xen heap pages) could identify no references to be taken, if that
> >> was really the intended behavior in that case. However, even for
> >> Xen heap pages life time tracking ought to be done - it is for a
> >> reason that share_xen_page_with_guest() assigns the target
> >> domain as the owner of such pages, as that allows get_page() to
> >> succeed for them.
> >>
> >

Hi Julien,

> > So, nothing I'm doing here is making anything worse, right? Grant tables are
> > assigned to the guest, and IOREQ server pages are allocated with
> > alloc_domheap_page() so nothing is anonymous.
> 
> I don't think grant tables are assigned to the guest today. They are
> allocated using xenheap_pages() and I can't find
> share_xen_page_with_guest().

The guest would not be able to map them if they were not assigned in some way!
See the code block at 
http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/common/grant_table.c;hb=HEAD#l1716
It calls gnttab_create_shared_page() which is what calls through to 
share_xen_page_with_guest().

> 
> Anyway, I discussed it with Stefano. set_foreign_p2m_entry is
> going to be left unimplemented on Arm until someone has time to implement
> the function correctly.
> 

That makes sense. Do you still have any issues with this patch apart from the 
cosmetic ones you spotted in the header?

Cheers,

  Paul

> Cheers,
> 
> --
> Julien Grall


Re: [Xen-devel] [PATCH v1 2/5] xen: Provide XEN_DMOP_add_to_physmap

2017-10-20 Thread Paul Durrant


> -Original Message-
> From: Ross Lagerwall [mailto:ross.lagerw...@citrix.com]
> Sent: 20 October 2017 10:37
> To: Paul Durrant <paul.durr...@citrix.com>; Xen-devel <xen-de...@lists.xen.org>
> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Tim
> (Xen.org) <t...@xen.org>; Jan Beulich <jbeul...@suse.com>
> Subject: Re: [Xen-devel] [PATCH v1 2/5] xen: Provide
> XEN_DMOP_add_to_physmap
> 
> On 10/20/2017 10:15 AM, Paul Durrant wrote:
> >> -Original Message-
> <snip>
> >> diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
> >> index 32ade95..432a863 100644
> >> --- a/xen/arch/x86/hvm/dm.c
> >> +++ b/xen/arch/x86/hvm/dm.c
> >> @@ -640,6 +640,22 @@ static int dm_op(const struct dmop_args
> *op_args)
> >>   break;
> >>   }
> >>
> >> +case XEN_DMOP_add_to_physmap:
> >> +{
> >> +const struct xen_dm_op_add_to_physmap *data =
> >> +&op.u.add_to_physmap;
> >> +struct xen_add_to_physmap xatp = {
> >> +.domid = op_args->domid,
> >> +.space = XENMAPSPACE_gmfn,
> >> +.idx = data->idx,
> >> +.gpfn = data->gpfn,
> >> +};
> >> +
> >
> > Where does xatp.size get set? Looks like you're missing a parameter.
> >
> xatp.size is only used for XENMAPSPACE_gmfn_range which is not supported
> by this interface. size gets set to 0 by the C99 designated initializer.
> 
> Based on your other comments, would it make sense to instead use
> XENMAPSPACE_gmfn_range and have the caller set the size?

Yes... my eyes had read XENMAPSPACE_gmfn_range in the first place, hence my 
confusion over the size parameter.
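
The point about the designated initializer can be checked in isolation. The
following is a minimal sketch only: struct xatp_example is a hypothetical
stand-in whose field names follow the quoted hunk, not the real
struct xen_add_to_physmap, and the value used for .space is a placeholder.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in; field names mirror the quoted hunk only. */
struct xatp_example {
    uint16_t domid;
    uint16_t size;      /* only meaningful for a range mapping */
    unsigned int space;
    uint64_t idx;
    uint64_t gpfn;
};

/* Build the structure the way the quoted hunk does: 'size' is never named. */
static struct xatp_example make_xatp(uint16_t domid, uint64_t idx,
                                     uint64_t gpfn)
{
    struct xatp_example xatp = {
        .domid = domid,
        .space = 1, /* placeholder standing in for XENMAPSPACE_gmfn */
        .idx = idx,
        .gpfn = gpfn,
    };

    /* C99: members without an explicit initializer are zero-initialized. */
    assert(xatp.size == 0);
    return xatp;
}
```

So an unnamed member is not left indeterminate; it simply reads as zero, which
matches what Ross describes above.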

> 
> As it is currently, QEMU does only populate VRAM one page at a time
> (using xen_xc_domain_add_to_physmap)

Ouch, yes, I'd forgotten that.

> so it is already slow but it could
> be improved.

Indeed. I think we should shoot for a better semantic given that it's a new op.

  Paul

> 
> --
> Ross Lagerwall


Re: [Xen-devel] [PATCH v7 for-next 10/12] vpci/msi: add MSI handlers

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Roger Pau Monne [mailto:roger@citrix.com]
> Sent: 18 October 2017 12:41
> To: xen-de...@lists.xenproject.org
> Cc: konrad.w...@oracle.com; boris.ostrov...@oracle.com; Roger Pau Monne
> <roger@citrix.com>; Jan Beulich <jbeul...@suse.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>
> Subject: [PATCH v7 for-next 10/12] vpci/msi: add MSI handlers
> 
> Add handlers for the MSI control, address, data and mask fields in
> order to detect accesses to them and setup the interrupts as requested
> by the guest.
> 
> Note that the pending register is not trapped, and the guest can
> freely read/write to it.
> 
> Signed-off-by: Roger Pau Monné <roger@citrix.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>
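
For readers unfamiliar with the MSI control field the handlers trap, here is a
rough sketch of how it decodes. The bit layout is the standard MSI Message
Control layout; the macro names, struct msi_view and msi_decode_control() are
illustrative assumptions and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard MSI Message Control bits (names here are illustrative). */
#define MSI_CTRL_ENABLE   0x0001u /* MSI enable */
#define MSI_CTRL_MMC      0x000eu /* multiple message capable, log2 */
#define MSI_CTRL_MME      0x0070u /* multiple message enable, log2 */
#define MSI_CTRL_64BIT    0x0080u /* 64-bit address capable */
#define MSI_CTRL_MASKING  0x0100u /* per-vector masking capable */

/* Hypothetical decoded view; not the vpci_msi layout from the patch. */
struct msi_view {
    bool enabled;
    bool address64;
    bool masking;
    uint8_t max_vectors;
    uint8_t vectors;
};

static struct msi_view msi_decode_control(uint16_t ctrl)
{
    struct msi_view m;

    m.enabled = (ctrl & MSI_CTRL_ENABLE) != 0;
    m.address64 = (ctrl & MSI_CTRL_64BIT) != 0;
    m.masking = (ctrl & MSI_CTRL_MASKING) != 0;

    /* Both vector counts are encoded as a power of two. */
    m.max_vectors = (uint8_t)(1u << ((ctrl & MSI_CTRL_MMC) >> 1));
    m.vectors = (uint8_t)(1u << ((ctrl & MSI_CTRL_MME) >> 4));

    /* A guest may ask for more vectors than the device offers; clamp. */
    if (m.vectors > m.max_vectors)
        m.vectors = m.max_vectors;

    assert(m.vectors >= 1 && m.vectors <= m.max_vectors);
    return m;
}
```

For instance, a control value of 0x00a7 decodes as enabled, 64-bit capable,
up to 8 vectors supported and 4 vectors configured.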

> ---
> Cc: Jan Beulich <jbeul...@suse.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>
> Cc: Paul Durrant <paul.durr...@citrix.com>
> ---
> Changes since v6:
>  - Use domain_spin_lock_irq_desc instead of open coding it.
>  - Reduce the size of printed debug messages.
>  - Constify domain in vpci_dump_msi.
>  - Lock domlist_read_lock before iterating over the list of domains.
>  - Make max_vectors and vectors uint8_t.
>  - Drop the vpci_ prefix from the static functions in msi.c.
>  - Turn the booleans in vpci_msi into bitfields.
>  - Apply the mask bits to all vectors when enabling msi.
>  - Remove the pos field.
>  - Remove the usage of __msi_set_{enable/disable}.
>  - Update the bindings when the message or data fields are updated.
>  - Make vpci_msi_arch_disable return void, it wasn't returning any
>    error.
>  - Prevent the guest from writing to the pending bits field, it's read
>    only as defined in the spec.
>  - Add the must_check attribute to vpci_msi_arch_enable.
> 
> Changes since v5:
>  - Update to new lock usage.
>  - Change handlers to match the new type.
>  - s/msi_flags/msi_gflags/, remove the local variables and use the new
>    DOMCTL_VMSI_* defines.
>  - Change the MSI arch function to take a vpci_msi instead of a
>    vpci_arch_msi as parameter.
>  - Fix the calculation of the guest vector for MSI injection to take
>    into account the number of bits that can be modified.
>  - Use INVALID_PIRQ everywhere.
>  - Simplify exit path of vpci_msi_disable.
>  - Remove the conditional when setting address64 and masking fields.
>  - Add a process_pending_softirqs to the MSI dump loop.
>  - Place the prototypes for the MSI arch-specific functions in
>    xen/vpci.h.
>  - Add parentheses around the INVALID_PIRQ definition.
> 
> Changes since v4:
>  - Fix commit message.
>  - Change the ASSERTs in vpci_msi_arch_mask into ifs.
>  - Introduce INVALID_PIRQ.
>  - Destroy the partially created bindings in case of failure in
>    vpci_msi_arch_enable.
>  - Just take the pcidevs lock once in vpci_msi_arch_disable.
>  - Print an error message in case of failure of pt_irq_destroy_bind.
>  - Make vpci_msi_arch_init return void.
>  - Constify the arch parameter of vpci_msi_arch_print.
>  - Use fixed instead of cpu for msi redirection.
>  - Separate the header includes in vpci/msi.c between xen and asm.
>  - Store the number of configured vectors even if MSI is not enabled
>    and always return it in vpci_msi_control_read.
>  - Fix/add comments in vpci_msi_control_write to clarify intended
>    behavior.
>  - Simplify usage of masks in vpci_msi_address_{upper_}write.
>  - Add comment to vpci_msi_mask_{read/write}.
>  - Don't use MASK_EXTR in vpci_msi_mask_write.
>  - s/msi_offset/pos/ in vpci_init_msi.
>  - Move control variable setup closer to its usage.
>  - Use d%d in vpci_dump_msi.
>  - Fix printing of bitfield mask in vpci_dump_msi.
>  - Fix definition of MSI_ADDR_REDIRECTION_MASK.
>  - Shuffle the layout of vpci_msi to minimize gaps.
>  - Remove the error label in vpci_init_msi.
> 
> Changes since v3:
>  - Propagate changes from previous versions: drop xen_ prefix, drop
>    return value from handlers, use the new vpci_val fields.
>  - Use MASK_EXTR.
>  - Remove the usage of GENMASK.
>  - Add GFLAGS_SHIFT_DEST_ID and use it in msi_flags.
>  - Add "arch" to the MSI arch specific functions.
>  - Move the dumping of vPCI MSI information to dump_msi (key 'M').
>  - Remove the guest_vectors field.
>  - Allow the guest to change the number of active vectors without
>    having to disable and enable MSI.
>  - Check the number of active vectors when parsing the disable
>    mask.
>  - Remove the debug messages from vpci_init_msi.
>  - Move the arch-specific part of the dump handler to x86/hvm/vmsi.c.
>  - Use trylock in the dump handler to get the vpci lock

Re: [Xen-devel] [PATCH v7 for-next 04/12] x86/mmcfg: add handlers for the PVH Dom0 MMCFG areas

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Roger Pau Monne [mailto:roger@citrix.com]
> Sent: 18 October 2017 12:40
> To: xen-de...@lists.xenproject.org
> Cc: konrad.w...@oracle.com; boris.ostrov...@oracle.com; Roger Pau Monne
> <roger@citrix.com>; Jan Beulich <jbeul...@suse.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>
> Subject: [PATCH v7 for-next 04/12] x86/mmcfg: add handlers for the PVH
> Dom0 MMCFG areas
> 
> Introduce a set of handlers for the accesses to the MMCFG areas. Those
> areas are setup based on the contents of the hardware MMCFG tables,
> and the list of handled MMCFG areas is stored inside of the hvm_domain
> struct.
> 
> The read/writes are forwarded to the generic vpci handlers once the
> address is decoded in order to obtain the device and register the
> guest is trying to access.
> 
> Signed-off-by: Roger Pau Monné <roger@citrix.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>
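
As background on the address decoding the commit message mentions, an MMCFG
(ECAM) window lays out config space so that each function gets a 4 KiB slice.
The sketch below assumes the window base corresponds to bus 0 of the segment;
struct ecam_target, ecam_decode() and ecam_access_ok() are hypothetical names
for illustration, not the helpers from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type; the patch uses its own helpers and types. */
struct ecam_target {
    uint8_t bus;   /* bus number relative to the window base */
    uint8_t dev;   /* device (slot), 0-31 */
    uint8_t func;  /* function, 0-7 */
    uint16_t reg;  /* register offset inside the 4 KiB config space */
};

/* Split an address inside an ECAM window into bus/device/function/register. */
static struct ecam_target ecam_decode(uint64_t base, uint64_t addr)
{
    struct ecam_target t;
    uint64_t off;

    assert(addr >= base);
    off = addr - base;

    t.bus = (uint8_t)((off >> 20) & 0xff);
    t.dev = (uint8_t)((off >> 15) & 0x1f);
    t.func = (uint8_t)((off >> 12) & 0x7);
    t.reg = (uint16_t)(off & 0xfff);
    return t;
}

/* An access must stay inside one function's 4 KiB config space. */
static bool ecam_access_ok(uint16_t reg, unsigned int size)
{
    return size != 0 && (unsigned int)reg + size <= 0x1000u;
}
```

The second helper corresponds loosely to the "reg + size doesn't cross a
device boundary" check listed in the v4 changes below.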

> ---
> Cc: Jan Beulich <jbeul...@suse.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>
> Cc: Paul Durrant <paul.durr...@citrix.com>
> ---
> Changes since v6:
>  - Move allocation of mmcfg outside of the locked region.
>  - Do proper overlap checks when adding mmcfg regions.
>  - Return _RETRY if the mcfg region cannot be found in the read/write
>    handlers. This means the mcfg area has been removed between the
>    accept and the read/write calls.
> 
> Changes since v5:
>  - Switch to use pci_sbdf_t.
>  - Switch to the new per vpci locks.
>  - Move the mmcfg related external definitions to asm-x86/pci.h.
> 
> Changes since v4:
>  - Change the attribute of pvh_setup_mmcfg to __hwdom_init.
>  - Try to add as many MMCFG regions as possible, even if one fails to
>    add.
>  - Change some fields of the hvm_mmcfg struct: turn size into a
>    unsigned int, segment into uint16_t and bus into uint8_t.
>  - Convert some address parameters from unsigned long to paddr_t for
>    consistency.
>  - Make vpci_mmcfg_decode_addr return the decoded register in the
>    return of the function.
>  - Introduce a new macro to convert a MMCFG address into a BDF, and
>    use it in vpci_mmcfg_decode_addr to clarify the logic.
>  - In vpci_mmcfg_{read/write} unify the logic for 8B accesses and
>    smaller ones.
>  - Add the __hwdom_init attribute to register_vpci_mmcfg_handler.
>  - Test that reg + size doesn't cross a device boundary.
> 
> Changes since v3:
>  - Propagate changes from previous patches: drop xen_ prefix for vpci
>    functions, pass slot and func instead of devfn and fix the error
>    paths of the MMCFG handlers.
>  - s/ecam/mmcfg/.
>  - Move the destroy code to a separate function, so the hvm_mmcfg
>    struct can be private to hvm/io.c.
>  - Constify the return of vpci_mmcfg_find.
>  - Use d instead of v->domain in vpci_mmcfg_accept.
>  - Allow 8byte accesses to the mmcfg.
> 
> Changes since v1:
>  - Added locking.
> ---
>  xen/arch/x86/hvm/dom0_build.c|  21 +
>  xen/arch/x86/hvm/hvm.c   |   4 +
>  xen/arch/x86/hvm/io.c| 174
> ++-
>  xen/arch/x86/x86_64/mmconfig.h   |   4 -
>  xen/include/asm-x86/hvm/domain.h |   4 +
>  xen/include/asm-x86/hvm/io.h |   7 ++
>  xen/include/asm-x86/pci.h|   6 ++
>  7 files changed, 215 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/dom0_build.c
> b/xen/arch/x86/hvm/dom0_build.c
> index a67071c739..9e841c103d 100644
> --- a/xen/arch/x86/hvm/dom0_build.c
> +++ b/xen/arch/x86/hvm/dom0_build.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> 
>  #include 
> @@ -1049,6 +1050,24 @@ static int __init pvh_setup_acpi(struct domain *d,
> paddr_t start_info)
>  return 0;
>  }
> 
> +static void __hwdom_init pvh_setup_mmcfg(struct domain *d)
> +{
> +unsigned int i;
> +int rc;
> +
> +for ( i = 0; i < pci_mmcfg_config_num; i++ )
> +{
> +rc = register_vpci_mmcfg_handler(d, pci_mmcfg_config[i].address,
> + pci_mmcfg_config[i].start_bus_number,
> + pci_mmcfg_config[i].end_bus_number,
> + pci_mmcfg_config[i].pci_segment);
> +if ( rc )
> +printk("Unable to setup MMCFG handler at %#lx for segment %u\n",
> +   pci_mmcfg_config[i].address,
> +   pci_mmcfg_config[i].pci_segment);
> +}
> +}
> +
>  int __init dom0_construct_pvh(struct domain *d, const module_t *image,
>  

Re: [Xen-devel] [PATCH v7 for-next 03/12] vpci: introduce basic handlers to trap accesses to the PCI config space

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Roger Pau Monne [mailto:roger@citrix.com]
> Sent: 18 October 2017 12:40
> To: xen-de...@lists.xenproject.org
> Cc: konrad.w...@oracle.com; boris.ostrov...@oracle.com; Roger Pau Monne
> <roger@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Wei Liu
> <wei.l...@citrix.com>; Jan Beulich <jbeul...@suse.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>
> Subject: [PATCH v7 for-next 03/12] vpci: introduce basic handlers to trap
> accesses to the PCI config space
> 
> This functionality is going to reside in vpci.c (and the corresponding
> vpci.h header), and should be arch-agnostic. The handlers introduced
> in this patch setup the basic functionality required in order to trap
> accesses to the PCI config space, and allow decoding the address and
> finding the corresponding handler that should handle the access
> (although no handlers are implemented).
> 
> Note that the traps to the PCI IO ports registers (0xcf8/0xcfc) are
> setup inside of a x86 HVM file, since that's not shared with other
> arches.
> 
> A new XEN_X86_EMU_VPCI x86 domain flag is added in order to signal Xen
> whether a domain should use the newly introduced vPCI handlers, this
> is only enabled for PVH Dom0 at the moment.
> 
> A very simple user-space test is also provided, so that the basic
> functionality of the vPCI traps can be asserted. This has been proven
> quite helpful during development, since the logic to handle partial
> accesses or accesses that expand across multiple registers is not
> trivial.
> 
> The handlers for the registers are added to a linked list that's kept
> sorted at all times. Both the read and write handlers support accesses
> that expand across multiple emulated registers and contain gaps not
> emulated.
> 
> Signed-off-by: Roger Pau Monné <roger@citrix.com>
> Acked-by: Wei Liu <wei.l...@citrix.com>

io parts:

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>
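
For context on the 0xcf8/0xcfc traps mentioned in the commit message, the
legacy mechanism latches an address in port 0xcf8 and then moves data through
ports 0xcfc-0xcff. The following is a simplified sketch of that decode;
struct cf8_target and cf8_decode() are hypothetical and are not the
hvm_pci_decode_addr() from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of a legacy config access. */
struct cf8_target {
    bool enabled;  /* bit 31 of the latched 0xcf8 value */
    uint8_t bus;
    uint8_t dev;
    uint8_t func;
    uint16_t reg;  /* dword register from 0xcf8 plus byte offset from port */
};

/*
 * cf8:  the value last written to I/O port 0xcf8.
 * port: the data port being accessed, which must be 0xcfc..0xcff.
 */
static struct cf8_target cf8_decode(uint32_t cf8, uint16_t port)
{
    struct cf8_target t;

    /* Masking with ~3 strips the byte offset and leaves the base port. */
    assert((port & ~3u) == 0xcfcu);

    t.enabled = (cf8 & 0x80000000u) != 0;
    t.bus = (uint8_t)((cf8 >> 16) & 0xff);
    t.dev = (uint8_t)((cf8 >> 11) & 0x1f);
    t.func = (uint8_t)((cf8 >> 8) & 0x7);
    t.reg = (uint16_t)((cf8 & 0xfcu) | (port & 3u));
    return t;
}
```

A caller would ignore the access, or pass it elsewhere, when the enabled flag
is clear.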

> ---
> Cc: Ian Jackson <ian.jack...@eu.citrix.com>
> Cc: Wei Liu <wei.l...@citrix.com>
> Cc: Jan Beulich <jbeul...@suse.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>
> Cc: Paul Durrant <paul.durr...@citrix.com>
> ---
> Changes since v6:
>  - Align the vpci handlers in the linker script.
>  - Switch add/remove register functions to take a vpci parameter
>    instead of a pci_dev.
>  - Expand comment of merge_result.
>  - Return X86EMUL_UNHANDLEABLE if accessing cfc and cf8 is disabled.
> 
> Changes since v5:
>  - Use a spinlock per pci device.
>  - Use the recently introduced pci_sbdf_t type.
>  - Fix test harness to use the right handler type and the newly
>    introduced lock.
>  - Move the position of the vpci sections in the linker scripts.
>  - Constify domain and pci_dev in vpci_{read/write}.
>  - Fix typos in comments.
>  - Use _XEN_VPCI_H_ as header guard.
> 
> Changes since v4:
> * User-space test harness:
>  - Do not redirect the output of the test.
>  - Add main.c and emul.h as dependencies of the Makefile target.
>  - Use the same rule to modify the vpci and list headers.
>  - Remove underscores from local macro variables.
>  - Add _check suffix to the test harness multiread function.
>  - Change the value written by every different size in the multiwrite
>    test.
>  - Use { } to initialize the r16 and r20 arrays (instead of { 0 }).
>  - Perform some of the read checks with the local variable directly.
>  - Expand some comments.
>  - Implement a dummy rwlock.
> * Hypervisor code:
>  - Guard the linker script changes with CONFIG_HAS_PCI.
>  - Rename vpci_access_check to vpci_access_allowed and make it return
>    bool.
>  - Make hvm_pci_decode_addr return the register as return value.
>  - Use ~3 instead of 0xfffc to remove the register offset when
>    checking accesses to IO ports.
>  - s/head/prev in vpci_add_register.
>  - Add parentheses around & in vpci_add_register.
>  - Fix register removal.
>  - Change the BUGs in vpci_{read/write}_hw helpers to
>    ASSERT_UNREACHABLE.
>  - Make merge_result static and change the computation of the mask to
>    avoid using a uint64_t.
>  - Modify vpci_read to only read from hardware the not-emulated gaps.
>  - Remove the vpci_val union and use a uint32_t instead.
>  - Change handler read type to return a uint32_t instead of modifying
>    a variable passed by reference.
>  - Constify the data opaque parameter of read handlers.
>  - Change the size parameter of the vpci_{read/write} functions to
>    unsigned int.
>  - Place the array of initialization handlers in init.rodata or
>    .rodata depending on whether late-h

Re: [Xen-devel] [PATCH v7 for-next 01/12] x86/pio: allow internal PIO handlers to return RETRY

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Roger Pau Monne [mailto:roger@citrix.com]
> Sent: 18 October 2017 12:40
> To: xen-de...@lists.xenproject.org
> Cc: konrad.w...@oracle.com; boris.ostrov...@oracle.com; Roger Pau Monne
> <roger....@citrix.com>; Paul Durrant <paul.durr...@citrix.com>; Jan
> Beulich <jbeul...@suse.com>; Andrew Cooper
> <andrew.coop...@citrix.com>
> Subject: [PATCH v7 for-next 01/12] x86/pio: allow internal PIO handlers to
> return RETRY
> 
> Fix handle_pio so internal PIO handlers can return X86EMUL_RETRY and
> it is properly handled by not advancing the IP.
> 
> Signed-off-by: Roger Pau Monné <roger@citrix.com>

I *think* this is safe.

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>
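
To make the condition in the hunk below easier to reason about, here is a
simplified model of the decision it encodes. The enum, struct vcpu_model and
pio_advances_ip() are hypothetical stand-ins for X86EMUL_*,
curr->domain->is_shutting_down and hvm_io_pending(curr); the real handle_pio()
has further cases that this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the emulation return codes. */
enum emul_rc { EMUL_OKAY, EMUL_RETRY };

/* Hypothetical model of the per-vCPU state consulted by the check. */
struct vcpu_model {
    bool domain_shutting_down; /* models curr->domain->is_shutting_down */
    bool io_pending;           /* models hvm_io_pending(curr) */
};

/* Returns true when the instruction pointer may be advanced. */
static bool pio_advances_ip(enum emul_rc rc, const struct vcpu_model *v)
{
    assert(v != NULL);

    switch (rc) {
    case EMUL_OKAY:
        return true;
    case EMUL_RETRY:
        /*
         * Do not advance if the domain is shutting down, or if no I/O is
         * pending, which means an internal handler asked for the retry.
         */
        if (v->domain_shutting_down || !v->io_pending)
            return false;
        return true;
    }
    return false;
}
```

In this model a RETRY with I/O pending still advances, which is the behaviour
the code had before the patch for requests waiting on completion.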

> ---
> Cc: Paul Durrant <paul.durr...@citrix.com>
> Cc: Jan Beulich <jbeul...@suse.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>
> ---
> Note this is not an issue currently because no internal handlers
> return RETRY.
> ---
>  xen/arch/x86/hvm/io.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index e449b4196e..10e1e2db45 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -157,8 +157,11 @@ bool handle_pio(uint16_t port, unsigned int size, int
> dir)
>  break;
> 
>  case X86EMUL_RETRY:
> -/* We should not advance RIP/EIP if the domain is shutting down */
> -if ( curr->domain->is_shutting_down )
> +/*
> + * We should not advance RIP/EIP if the domain is shutting down or
> + * if X86EMUL_RETRY has been returned by an internal handler.
> + */
> +if ( curr->domain->is_shutting_down || !hvm_io_pending(curr) )
>  return false;
>  break;
> 
> --
> 2.13.5 (Apple Git-94)



Re: [Xen-devel] [PATCH v1 5/5] tools: libxendevicemodel: Provide xendevicemodel_pin_memory_cacheattr

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Ross Lagerwall
> Sent: 18 October 2017 15:04
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Ross Lagerwall <ross.lagerw...@citrix.com>; Ian Jackson
> <ian.jack...@citrix.com>; Wei Liu <wei.l...@citrix.com>
> Subject: [Xen-devel] [PATCH v1 5/5] tools: libxendevicemodel: Provide
> xendevicemodel_pin_memory_cacheattr
> 
> Signed-off-by: Ross Lagerwall <ross.lagerw...@citrix.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  tools/libs/devicemodel/core.c   | 19 +++
>  tools/libs/devicemodel/include/xendevicemodel.h | 14 ++
>  tools/libs/devicemodel/libxendevicemodel.map|  1 +
>  3 files changed, 34 insertions(+)
> 
> diff --git a/tools/libs/devicemodel/core.c b/tools/libs/devicemodel/core.c
> index 2a23077..dada57e 100644
> --- a/tools/libs/devicemodel/core.c
> +++ b/tools/libs/devicemodel/core.c
> @@ -581,6 +581,25 @@ int xendevicemodel_add_to_physmap(
>  return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
>  }
> 
> +int xendevicemodel_pin_memory_cacheattr(
> +xendevicemodel_handle *dmod, domid_t domid, uint64_t start, uint64_t
> end,
> +uint32_t type)
> +{
> +struct xen_dm_op op;
> +struct xen_dm_op_pin_memory_cacheattr *data;
> +
> +memset(&op, 0, sizeof(op));
> +
> +op.op = XEN_DMOP_pin_memory_cacheattr;
> +data = &op.u.pin_memory_cacheattr;
> +
> +data->start = start;
> +data->end = end;
> +data->type = type;
> +
> +return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
> +}
> +
>  int xendevicemodel_restrict(xendevicemodel_handle *dmod, domid_t
> domid)
>  {
>  return osdep_xendevicemodel_restrict(dmod, domid);
> diff --git a/tools/libs/devicemodel/include/xendevicemodel.h
> b/tools/libs/devicemodel/include/xendevicemodel.h
> index 2c4e392..9de6d46 100644
> --- a/tools/libs/devicemodel/include/xendevicemodel.h
> +++ b/tools/libs/devicemodel/include/xendevicemodel.h
> @@ -339,6 +339,20 @@ int xendevicemodel_add_to_physmap(
>  xendevicemodel_handle *dmod, domid_t domid, uint64_t idx, uint64_t
> gpfn);
> 
>  /**
> + * Pins caching type of RAM space.
> + *
> + * @parm dmod a handle to an open devicemodel interface.
> + * @parm domid the domain id to be serviced
> + * @parm start Start gfn
> + * @parm end End gfn
> + * @parm type XEN_DOMCTL_MEM_CACHEATTR_*
> + * @return 0 on success, -1 on failure.
> + */
> +int xendevicemodel_pin_memory_cacheattr(
> +xendevicemodel_handle *dmod, domid_t domid, uint64_t start, uint64_t
> end,
> +uint32_t type);
> +
> +/**
>   * This function restricts the use of this handle to the specified
>   * domain.
>   *
> diff --git a/tools/libs/devicemodel/libxendevicemodel.map
> b/tools/libs/devicemodel/libxendevicemodel.map
> index 4a19ecb..e820b77 100644
> --- a/tools/libs/devicemodel/libxendevicemodel.map
> +++ b/tools/libs/devicemodel/libxendevicemodel.map
> @@ -31,4 +31,5 @@ VERS_1.1 {
>  VERS_1.2 {
>   global:
>   xendevicemodel_add_to_physmap;
> + xendevicemodel_pin_memory_cacheattr;
>  } VERS_1.1;
> --
> 2.9.5
> 
> 


Re: [Xen-devel] [PATCH v1 4/5] tools: libxendevicemodel: Provide xendevicemodel_add_to_physmap

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Ross Lagerwall
> Sent: 18 October 2017 15:04
> To: Xen-devel 
> Cc: Ross Lagerwall ; Ian Jackson
> ; Wei Liu 
> Subject: [Xen-devel] [PATCH v1 4/5] tools: libxendevicemodel: Provide
> xendevicemodel_add_to_physmap
> 
> Signed-off-by: Ross Lagerwall 
> ---
>  tools/libs/devicemodel/Makefile |  2 +-
>  tools/libs/devicemodel/core.c   | 17 +
>  tools/libs/devicemodel/include/xendevicemodel.h | 13 +
>  tools/libs/devicemodel/libxendevicemodel.map|  5 +
>  4 files changed, 36 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libs/devicemodel/Makefile
> b/tools/libs/devicemodel/Makefile
> index 342371a..5b2df7a 100644
> --- a/tools/libs/devicemodel/Makefile
> +++ b/tools/libs/devicemodel/Makefile
> @@ -2,7 +2,7 @@ XEN_ROOT = $(CURDIR)/../../..
>  include $(XEN_ROOT)/tools/Rules.mk
> 
>  MAJOR= 1
> -MINOR= 1
> +MINOR= 2
>  SHLIB_LDFLAGS += -Wl,--version-script=libxendevicemodel.map
> 
>  CFLAGS   += -Werror -Wmissing-prototypes
> diff --git a/tools/libs/devicemodel/core.c b/tools/libs/devicemodel/core.c
> index b66d4f9..2a23077 100644
> --- a/tools/libs/devicemodel/core.c
> +++ b/tools/libs/devicemodel/core.c
> @@ -564,6 +564,23 @@ int xendevicemodel_shutdown(
>  return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
>  }
> 
> +int xendevicemodel_add_to_physmap(
> +xendevicemodel_handle *dmod, domid_t domid, uint64_t idx, uint64_t
> gpfn)

Do you really want this to be single page? Populating VRAM is not going to be 
fast if you need to make 16MB / 4K hypercalls to do it.

  Paul
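
To put a number on the concern above: with 4 KiB pages, 16 MiB of VRAM is 4096
pages, so a one-page-per-call interface needs 4096 calls. The helpers below
are a hypothetical sketch of that arithmetic and of how a batched interface
would reduce the count; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_4K 4096u

/* Number of 4 KiB pages needed to cover 'bytes', rounding up. */
static uint64_t pages_needed(uint64_t bytes)
{
    return (bytes + PAGE_SIZE_4K - 1) / PAGE_SIZE_4K;
}

/* Number of calls needed if each call can move up to 'pages_per_call'. */
static uint64_t calls_needed(uint64_t bytes, uint64_t pages_per_call)
{
    assert(pages_per_call > 0);
    return (pages_needed(bytes) + pages_per_call - 1) / pages_per_call;
}
```

With pages_per_call set to 1 this gives 4096 calls for 16 MiB; a range-based
operation covering the whole region would need a single call.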

> +{
> +struct xen_dm_op op;
> +struct xen_dm_op_add_to_physmap *data;
> +
> +memset(&op, 0, sizeof(op));
> +
> +op.op = XEN_DMOP_add_to_physmap;
> +data = &op.u.add_to_physmap;
> +
> +data->idx = idx;
> +data->gpfn = gpfn;
> +
> +return xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
> +}
> +
>  int xendevicemodel_restrict(xendevicemodel_handle *dmod, domid_t
> domid)
>  {
>  return osdep_xendevicemodel_restrict(dmod, domid);
> diff --git a/tools/libs/devicemodel/include/xendevicemodel.h
> b/tools/libs/devicemodel/include/xendevicemodel.h
> index dda0bc7..2c4e392 100644
> --- a/tools/libs/devicemodel/include/xendevicemodel.h
> +++ b/tools/libs/devicemodel/include/xendevicemodel.h
> @@ -326,6 +326,19 @@ int xendevicemodel_shutdown(
>  xendevicemodel_handle *dmod, domid_t domid, unsigned int reason);
> 
>  /**
> + * Sets the GPFN at which a particular page appears in the domain's
> + * pseudophysical address space.
> + *
> + * @parm dmod a handle to an open devicemodel interface.
> + * @parm domid the domain id to be serviced
> + * @parm idx Index into GMFN space
> + * @parm gpfn GPFN in domid where the GMFN should appear
> + * @return 0 on success, -1 on failure.
> + */
> +int xendevicemodel_add_to_physmap(
> +xendevicemodel_handle *dmod, domid_t domid, uint64_t idx, uint64_t
> gpfn);
> +
> +/**
>   * This function restricts the use of this handle to the specified
>   * domain.
>   *
> diff --git a/tools/libs/devicemodel/libxendevicemodel.map
> b/tools/libs/devicemodel/libxendevicemodel.map
> index cefd32b..4a19ecb 100644
> --- a/tools/libs/devicemodel/libxendevicemodel.map
> +++ b/tools/libs/devicemodel/libxendevicemodel.map
> @@ -27,3 +27,8 @@ VERS_1.1 {
>   global:
>   xendevicemodel_shutdown;
>  } VERS_1.0;
> +
> +VERS_1.2 {
> + global:
> + xendevicemodel_add_to_physmap;
> +} VERS_1.1;
> --
> 2.9.5
> 
> 
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 2/5] xen: Provide XEN_DMOP_add_to_physmap

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Paul Durrant
> Sent: 20 October 2017 10:16
> To: Ross Lagerwall <ross.lagerw...@citrix.com>; Xen-devel <xen-devel@lists.xen.org>
> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> Andrew Cooper <andrew.coop...@citrix.com>; Tim (Xen.org)
> <t...@xen.org>; George Dunlap <george.dun...@citrix.com>; Ross
> Lagerwall <ross.lagerw...@citrix.com>; Jan Beulich <jbeul...@suse.com>; Ian
> Jackson <ian.jack...@citrix.com>
> Subject: Re: [Xen-devel] [PATCH v1 2/5] xen: Provide
> XEN_DMOP_add_to_physmap
> 
> > -Original Message-
> > From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> > Ross Lagerwall
> > Sent: 18 October 2017 15:04
> > To: Xen-devel <xen-devel@lists.xen.org>
> > Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> > <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> > George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> > <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Tim
> > (Xen.org) <t...@xen.org>; Ross Lagerwall <ross.lagerw...@citrix.com>; Jan
> > Beulich <jbeul...@suse.com>
> > Subject: [Xen-devel] [PATCH v1 2/5] xen: Provide
> > XEN_DMOP_add_to_physmap
> >
> > Provide XEN_DMOP_add_to_physmap, a limited version of
> > XENMEM_add_to_physmap to allow a deprivileged QEMU to move VRAM
> > when a
> > guest programs its BAR. It is equivalent to XENMEM_add_to_physmap with
> > space == XENMAPSPACE_gmfn.
> >
> > Signed-off-by: Ross Lagerwall <ross.lagerw...@citrix.com>
> > ---
> >  xen/arch/x86/hvm/dm.c  | 17 +
> >  xen/include/public/hvm/dm_op.h | 14 ++
> >  xen/include/xlat.lst   |  1 +
> >  3 files changed, 32 insertions(+)
> >
> > diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
> > index 32ade95..432a863 100644
> > --- a/xen/arch/x86/hvm/dm.c
> > +++ b/xen/arch/x86/hvm/dm.c
> > @@ -640,6 +640,22 @@ static int dm_op(const struct dmop_args *op_args)
> >  break;
> >  }
> >
> > +case XEN_DMOP_add_to_physmap:
> > +{
> > +const struct xen_dm_op_add_to_physmap *data =
> > +&op.u.add_to_physmap;
> > +struct xen_add_to_physmap xatp = {
> > +.domid = op_args->domid,
> > +.space = XENMAPSPACE_gmfn,
> > +.idx = data->idx,
> > +.gpfn = data->gpfn,
> > +};
> > +
> 
> Where does xatp.size get set? Looks like you're missing a parameter.
> 
>   Paul
> 
> > +rc = xenmem_add_to_physmap(d, &xatp,
> > +   XENMEM_add_to_physmap >> MEMOP_EXTENT_SHIFT);

... Also looking at this slightly odd argument, assuming that the additional 
size parameter could be arbitrarily large, you're going to need to handle 
continuations.

  Paul
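
For illustration only, the shape I have in mind is roughly the following. This is a standalone sketch, not Xen code: `struct range_op`, `map_one()`, `range_op_step()` and the budget of 64 pages per call are all hypothetical names and values chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PER_CALL 64u   /* hypothetical per-call work budget */

struct range_op {
    uint64_t idx;    /* first source frame still to move */
    uint64_t gpfn;   /* first destination frame still to fill */
    uint32_t size;   /* frames remaining */
};

/* Stand-in for the per-page work; it only counts the pages handled. */
static unsigned long pages_done;

static int map_one(uint64_t idx, uint64_t gpfn)
{
    (void)idx;
    (void)gpfn;
    pages_done++;
    return 0;
}

/*
 * Process at most MAX_PER_CALL frames. Returns 0 when the whole range
 * is done, 1 when the caller must re-issue the op (which has been
 * advanced in place), or a negative value on error.
 */
static int range_op_step(struct range_op *op)
{
    unsigned int n = 0;

    assert(op != NULL);

    while (op->size > 0) {
        int rc;

        if (n == MAX_PER_CALL)
            return 1;   /* budget used up: ask for a continuation */

        rc = map_one(op->idx, op->gpfn);
        if (rc < 0)
            return rc;

        op->idx++;
        op->gpfn++;
        op->size--;
        n++;
    }

    return 0;
}
```

The point is just that the op carries its own progress (idx, gpfn and size are updated in place), so re-issuing it picks up where the previous call stopped.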

> > +break;
> > +}
> > +
> >  default:
> >  rc = -EOPNOTSUPP;
> >  break;
> > @@ -669,6 +685,7 @@ CHECK_dm_op_set_mem_type;
> >  CHECK_dm_op_inject_event;
> >  CHECK_dm_op_inject_msi;
> >  CHECK_dm_op_remote_shutdown;
> > +CHECK_dm_op_add_to_physmap;
> >
> >  int compat_dm_op(domid_t domid,
> >   unsigned int nr_bufs,
> > diff --git a/xen/include/public/hvm/dm_op.h
> > b/xen/include/public/hvm/dm_op.h
> > index e173085..88aace7 100644
> > --- a/xen/include/public/hvm/dm_op.h
> > +++ b/xen/include/public/hvm/dm_op.h
> > @@ -368,6 +368,19 @@ struct xen_dm_op_remote_shutdown {
> > /* (Other reason values are not blocked) */
> >  };
> >
> > +/*
> > + * XEN_DMOP_add_to_physmap : Sets the GPFN at which a particular page appears
> > + *   in the specified guest's pseudophysical address
> > + *   space. Identical to XENMEM_add_to_physmap with
> > + *   space == XENMAPSPACE_gmfn.
> > + */
> > +#define XEN_DMOP_add_to_physmap 17
> > +
> > +struct xen_dm_op_add_to_physmap {
> > +uint64_aligned_t idx;  /* Index into GMFN space. */
> > +uint64_aligned_t gpfn; /* GPFN in domid where the GMFN should
> > appear. */
> > +};
> > +
> >  struct xen_dm_op {
> >  uint32_t op;
> >

Re: [Xen-devel] [PATCH v1 3/5] xen: Provide XEN_DMOP_pin_memory_cacheattr

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Ross Lagerwall
> Sent: 18 October 2017 15:04
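> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Tim
> (Xen.org) <t...@xen.org>; Ross Lagerwall <ross.lagerw...@citrix.com>; Jan
> Beulich <jbeul...@suse.com>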
> Subject: [Xen-devel] [PATCH v1 3/5] xen: Provide
> XEN_DMOP_pin_memory_cacheattr
> 
> Provide XEN_DMOP_pin_memory_cacheattr to allow a deprivileged QEMU
> to
> pin the caching type of RAM after moving the VRAM. It is equivalent to
> XEN_DOMCTL_pin_memory_cacheattr.
> 
> Signed-off-by: Ross Lagerwall <ross.lagerw...@citrix.com>
> ---
>  xen/arch/x86/hvm/dm.c  | 12 
>  xen/include/public/hvm/dm_op.h | 14 ++
>  xen/include/xlat.lst   |  1 +
>  3 files changed, 27 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
> index 432a863..eebcbcc 100644
> --- a/xen/arch/x86/hvm/dm.c
> +++ b/xen/arch/x86/hvm/dm.c
> @@ -21,6 +21,7 @@
> 
>  #include 
>  #include 
> +#include 
>  #include 
> 
>  #include 
> @@ -656,6 +657,16 @@ static int dm_op(const struct dmop_args *op_args)
>  break;
>  }
> 
> +case XEN_DMOP_pin_memory_cacheattr:
> +{
> +const struct xen_dm_op_pin_memory_cacheattr *data =
> +&op.u.pin_memory_cacheattr;
> +

You need to check data->pad is 0 here.

  Paul
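
Something along these lines, as a standalone sketch rather than a drop-in hunk. The struct below is a local mirror of the one in the patch (plain uint64_t instead of uint64_aligned_t so it builds outside Xen), and `set_cacheattr_stub()` is a hypothetical stand-in for hvm_set_mem_pinned_cacheattr().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct xen_dm_op_pin_memory_cacheattr from the patch. */
struct pin_memory_cacheattr {
    uint64_t start;   /* Start gfn. */
    uint64_t end;     /* End gfn. */
    uint32_t type;    /* Cache attribute type. */
    uint32_t pad;     /* Must be zero. */
};

/* Hypothetical stand-in for hvm_set_mem_pinned_cacheattr(). */
static int set_cacheattr_stub(uint64_t start, uint64_t end, uint32_t type)
{
    (void)type;
    return start <= end ? 0 : -EINVAL;
}

static int handle_pin_memory_cacheattr(const struct pin_memory_cacheattr *data)
{
    assert(data != NULL);

    /* Reject non-zero padding so the field can be given a meaning later
     * without breaking callers that passed garbage in it. */
    if (data->pad != 0)
        return -EINVAL;

    return set_cacheattr_stub(data->start, data->end, data->type);
}
```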

> +rc = hvm_set_mem_pinned_cacheattr(d, data->start, data->end,
> +  data->type);
> +break;
> +}
> +
>  default:
>  rc = -EOPNOTSUPP;
>  break;
> @@ -686,6 +697,7 @@ CHECK_dm_op_inject_event;
>  CHECK_dm_op_inject_msi;
>  CHECK_dm_op_remote_shutdown;
>  CHECK_dm_op_add_to_physmap;
> +CHECK_dm_op_pin_memory_cacheattr;
> 
>  int compat_dm_op(domid_t domid,
>   unsigned int nr_bufs,
> diff --git a/xen/include/public/hvm/dm_op.h
> b/xen/include/public/hvm/dm_op.h
> index 88aace7..11bb386 100644
> --- a/xen/include/public/hvm/dm_op.h
> +++ b/xen/include/public/hvm/dm_op.h
> @@ -381,6 +381,19 @@ struct xen_dm_op_add_to_physmap {
>  uint64_aligned_t gpfn; /* GPFN in domid where the GMFN should appear.
> */
>  };
> 
> +/*
> + * XEN_DMOP_pin_memory_cacheattr : Pin caching type of RAM space.
> + * Identical to XEN_DOMCTL_pin_mem_cacheattr.
> + */
> +#define XEN_DMOP_pin_memory_cacheattr 18
> +
> +struct xen_dm_op_pin_memory_cacheattr {
> +uint64_aligned_t start; /* Start gfn. */
> +uint64_aligned_t end;   /* End gfn. */
> +uint32_t type;  /* XEN_DOMCTL_MEM_CACHEATTR_* */
> +uint32_t pad;
> +};
> +
>  struct xen_dm_op {
>  uint32_t op;
>  uint32_t pad;
> @@ -403,6 +416,7 @@ struct xen_dm_op {
>  map_mem_type_to_ioreq_server;
>  struct xen_dm_op_remote_shutdown remote_shutdown;
>  struct xen_dm_op_add_to_physmap add_to_physmap;
> +struct xen_dm_op_pin_memory_cacheattr pin_memory_cacheattr;
>  } u;
>  };
> 
> diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
> index d40bac6..fffb308 100644
> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -65,6 +65,7 @@
>  ?dm_op_inject_msihvm/dm_op.h
>  ?dm_op_ioreq_server_rangehvm/dm_op.h
>  ?dm_op_modified_memory   hvm/dm_op.h
> +?dm_op_pin_memory_cacheattr  hvm/dm_op.h
>  ?dm_op_remote_shutdown   hvm/dm_op.h
>  ?dm_op_set_ioreq_server_statehvm/dm_op.h
>  ?dm_op_set_isa_irq_level hvm/dm_op.h
> --
> 2.9.5
> 
> 


Re: [Xen-devel] [PATCH v1 2/5] xen: Provide XEN_DMOP_add_to_physmap

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Ross Lagerwall
> Sent: 18 October 2017 15:04
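> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Tim
> (Xen.org) <t...@xen.org>; Ross Lagerwall <ross.lagerw...@citrix.com>; Jan
> Beulich <jbeul...@suse.com>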
> Subject: [Xen-devel] [PATCH v1 2/5] xen: Provide
> XEN_DMOP_add_to_physmap
> 
> Provide XEN_DMOP_add_to_physmap, a limited version of
> XENMEM_add_to_physmap to allow a deprivileged QEMU to move VRAM
> when a
> guest programs its BAR. It is equivalent to XENMEM_add_to_physmap with
> space == XENMAPSPACE_gmfn.
> 
> Signed-off-by: Ross Lagerwall <ross.lagerw...@citrix.com>
> ---
>  xen/arch/x86/hvm/dm.c  | 17 +
>  xen/include/public/hvm/dm_op.h | 14 ++
>  xen/include/xlat.lst   |  1 +
>  3 files changed, 32 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
> index 32ade95..432a863 100644
> --- a/xen/arch/x86/hvm/dm.c
> +++ b/xen/arch/x86/hvm/dm.c
> @@ -640,6 +640,22 @@ static int dm_op(const struct dmop_args *op_args)
>  break;
>  }
> 
> +case XEN_DMOP_add_to_physmap:
> +{
> +const struct xen_dm_op_add_to_physmap *data =
> +&op.u.add_to_physmap;
> +struct xen_add_to_physmap xatp = {
> +.domid = op_args->domid,
> +.space = XENMAPSPACE_gmfn,
> +.idx = data->idx,
> +.gpfn = data->gpfn,
> +};
> +

Where does xatp.size get set? Looks like you're missing a parameter.

  Paul

> +rc = xenmem_add_to_physmap(d, &xatp,
> +   XENMEM_add_to_physmap >> MEMOP_EXTENT_SHIFT);
> +break;
> +}
> +
>  default:
>  rc = -EOPNOTSUPP;
>  break;
> @@ -669,6 +685,7 @@ CHECK_dm_op_set_mem_type;
>  CHECK_dm_op_inject_event;
>  CHECK_dm_op_inject_msi;
>  CHECK_dm_op_remote_shutdown;
> +CHECK_dm_op_add_to_physmap;
> 
>  int compat_dm_op(domid_t domid,
>   unsigned int nr_bufs,
> diff --git a/xen/include/public/hvm/dm_op.h
> b/xen/include/public/hvm/dm_op.h
> index e173085..88aace7 100644
> --- a/xen/include/public/hvm/dm_op.h
> +++ b/xen/include/public/hvm/dm_op.h
> @@ -368,6 +368,19 @@ struct xen_dm_op_remote_shutdown {
> /* (Other reason values are not blocked) */
>  };
> 
> +/*
> + * XEN_DMOP_add_to_physmap : Sets the GPFN at which a particular page
> appears
> + *   in the specified guest's pseudophysical address
> + *   space. Identical to XENMEM_add_to_physmap with
> + *   space == XENMAPSPACE_gmfn.
> + */
> +#define XEN_DMOP_add_to_physmap 17
> +
> +struct xen_dm_op_add_to_physmap {
> +uint64_aligned_t idx;  /* Index into GMFN space. */
> +uint64_aligned_t gpfn; /* GPFN in domid where the GMFN should
> appear. */
> +};
> +
>  struct xen_dm_op {
>  uint32_t op;
>  uint32_t pad;
> @@ -389,6 +402,7 @@ struct xen_dm_op {
>  struct xen_dm_op_map_mem_type_to_ioreq_server
>  map_mem_type_to_ioreq_server;
>  struct xen_dm_op_remote_shutdown remote_shutdown;
> +struct xen_dm_op_add_to_physmap add_to_physmap;
>  } u;
>  };
> 
> diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
> index 4346cbe..d40bac6 100644
> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -57,6 +57,7 @@
>  ?grant_entry_v2  grant_table.h
>  ?gnttab_swap_grant_ref   grant_table.h
>  !dm_op_buf   hvm/dm_op.h
> +?dm_op_add_to_physmaphvm/dm_op.h
>  ?dm_op_create_ioreq_server   hvm/dm_op.h
>  ?dm_op_destroy_ioreq_server  hvm/dm_op.h
>  ?dm_op_get_ioreq_server_info hvm/dm_op.h
> --
> 2.9.5
> 
> 


Re: [Xen-devel] [PATCH v1 1/5] xen/mm: Make xenmem_add_to_physmap global

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Ross Lagerwall
> Sent: 18 October 2017 15:04
> To: Xen-devel <xen-devel@lists.xen.org>
> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Tim
> (Xen.org) <t...@xen.org>; Ross Lagerwall <ross.lagerw...@citrix.com>; Jan
> Beulich <jbeul...@suse.com>
> Subject: [Xen-devel] [PATCH v1 1/5] xen/mm: Make
> xenmem_add_to_physmap global
> 
> Make it global in preparation to be called by a new dmop.
> 
> Signed-off-by: Ross Lagerwall <ross.lagerw...@citrix.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  xen/common/memory.c  | 5 ++---
>  xen/include/xen/mm.h | 3 +++
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index ad987e0..c4f05c7 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -741,9 +741,8 @@ static long
> memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange
> _t) arg)
>  return rc;
>  }
> 
> -static int xenmem_add_to_physmap(struct domain *d,
> - struct xen_add_to_physmap *xatp,
> - unsigned int start)
> +int xenmem_add_to_physmap(struct domain *d, struct
> xen_add_to_physmap *xatp,
> +  unsigned int start)
>  {
>  unsigned int done = 0;
>  long rc = 0;
> diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
> index e813c07..0e0e511 100644
> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -579,6 +579,9 @@ int xenmem_add_to_physmap_one(struct domain
> *d, unsigned int space,
>union xen_add_to_physmap_batch_extra extra,
>unsigned long idx, gfn_t gfn);
> 
> +int xenmem_add_to_physmap(struct domain *d, struct
> xen_add_to_physmap *xatp,
> +  unsigned int start);
> +
>  /* Return 0 on success, or negative on error. */
>  int __must_check guest_remove_page(struct domain *d, unsigned long
> gmfn);
>  int __must_check steal_page(struct domain *d, struct page_info *page,
> --
> 2.9.5
> 
> 


Re: [Xen-devel] [PATCH] x86/xen: support priv-mapping in an HVM tools domain

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Boris Ostrovsky
> Sent: 19 October 2017 18:45
> To: Paul Durrant <paul.durr...@citrix.com>; x...@kernel.org; xen-
> de...@lists.xenproject.org; linux-ker...@vger.kernel.org
> Cc: Juergen Gross <jgr...@suse.com>; Thomas Gleixner
> <t...@linutronix.de>; Ingo Molnar <mi...@redhat.com>; H. Peter Anvin
> <h...@zytor.com>
> Subject: Re: [Xen-devel] [PATCH] x86/xen: support priv-mapping in an HVM
> tools domain
> 
> On 10/19/2017 11:26 AM, Paul Durrant wrote:
> > If the domain has XENFEAT_auto_translated_physmap then use of the PV-
> > specific HYPERVISOR_mmu_update hypercall is clearly incorrect.
> >
> > This patch adds checks in xen_remap_domain_gfn_array() and
> > xen_unmap_domain_gfn_array() which call through to the appropriate
> > xlate_mmu function if the feature is present. A check is also added
> > to xen_remap_domain_gfn_range() to fail with -EOPNOTSUPP since this
> > should not be used in an HVM tools domain.
> >
> > Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> > ---
> > Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
> > Cc: Juergen Gross <jgr...@suse.com>
> > Cc: Thomas Gleixner <t...@linutronix.de>
> > Cc: Ingo Molnar <mi...@redhat.com>
> > Cc: "H. Peter Anvin" <h...@zytor.com>
> > ---
> >  arch/x86/xen/mmu.c | 14 --
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> > index 3e15345abfe7..d33e7dbe3129 100644
> > --- a/arch/x86/xen/mmu.c
> > +++ b/arch/x86/xen/mmu.c
> > @@ -172,6 +172,9 @@ int xen_remap_domain_gfn_range(struct
> vm_area_struct *vma,
> >pgprot_t prot, unsigned domid,
> >struct page **pages)
> >  {
> > +   if (xen_feature(XENFEAT_auto_translated_physmap))
> > +   return -EOPNOTSUPP;
> > +
> 
> This is never called on XENFEAT_auto_translated_physmap domains, there
> is a check in privcmd_ioctl_mmap() for that.

Yes, that's true but it seems like the wrong place for such a check. I could 
remove that one if you'd prefer.

> 
> > return do_remap_gfn(vma, addr, &gfn, nr, NULL, prot, domid,
> pages);
> >  }
> >  EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
> > @@ -182,6 +185,10 @@ int xen_remap_domain_gfn_array(struct
> vm_area_struct *vma,
> >int *err_ptr, pgprot_t prot,
> >unsigned domid, struct page **pages)
> >  {
> > +   if (xen_feature(XENFEAT_auto_translated_physmap))
> > +   return xen_xlate_remap_gfn_array(vma, addr, gfn, nr,
> err_ptr,
> > +prot, domid, pages);
> > +
> 
> So how did this work before? In fact, I don't see any callers of
> xen_xlate_{re|un}map_gfn_range().

I assume you mean 'array' for the map, since there is no xen_xlate_remap_gfn_range() 
function. I'm not quite sure what you're asking. Without this patch the mmu 
code in an x86 domain simply assumes the domain is PV; the xlate code is 
currently only used via the arm mmu code (where it clearly knows it's not PV). 
AFAICS this is just a straightforward buggy assumption in the x86 code.

  Paul
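
In case it helps, the intent of the patch boils down to the dispatch below. This is a toy model rather than kernel code: `auto_translated` stands in for xen_feature(XENFEAT_auto_translated_physmap), and `pv_remap()` / `xlate_remap()` are hypothetical placeholders for the PV (mmu_update-based) and xlate_mmu paths.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stands in for xen_feature(XENFEAT_auto_translated_physmap). */
static bool auto_translated;

enum path { PATH_NONE, PATH_PV, PATH_XLATE };
static enum path last_path = PATH_NONE;

/* Placeholder for the PV path (HYPERVISOR_mmu_update based). */
static int pv_remap(int nr)
{
    last_path = PATH_PV;
    return nr;
}

/* Placeholder for the xlate_mmu path used by translated domains. */
static int xlate_remap(int nr)
{
    last_path = PATH_XLATE;
    return nr;
}

/* Array variant: pick the implementation by feature flag. */
static int remap_gfn_array(int nr)
{
    assert(nr >= 0);

    if (auto_translated)
        return xlate_remap(nr);

    return pv_remap(nr);
}

/* Range variant: there is no translated equivalent, so refuse. */
static int remap_gfn_range(int nr)
{
    if (auto_translated)
        return -EOPNOTSUPP;

    return pv_remap(nr);
}
```

The array variant routes to whichever implementation matches the domain type, while the range variant fails cleanly in a translated domain instead of issuing a PV-only hypercall.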

> 
> -boris
> 
> 
> > /* We BUG_ON because it's a programmer error to pass a NULL
> err_ptr,
> >  * and the consequences later is quite hard to detect what the actual
> >  * cause of "wrong memory was mapped in".
> > @@ -193,9 +200,12 @@
> EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_array);
> >
> >  /* Returns: 0 success */
> >  int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
> > -  int numpgs, struct page **pages)
> > +  int nr, struct page **pages)
> >  {
> > -   if (!pages || !xen_feature(XENFEAT_auto_translated_physmap))
> > +   if (xen_feature(XENFEAT_auto_translated_physmap))
> > +   return xen_xlate_unmap_gfn_range(vma, nr, pages);
> > +
> > +   if (!pages)
> > return 0;
> >
> > return -EINVAL;
> 
> 


Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-20 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 20 October 2017 07:25
> To: Julien Grall <julien.gr...@linaro.org>
> Cc: Julien Grall <julien.gr...@arm.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; George Dunlap
> <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Paul
> Durrant <paul.durr...@citrix.com>; Roger Pau Monne
> <roger@citrix.com>; Wei Liu <wei.l...@citrix.com>; Stefano Stabellini
> <sstabell...@kernel.org>; xen-de...@lists.xenproject.org; Konrad Rzeszutek
> Wilk <konrad.w...@oracle.com>; Daniel De Graaf <dgde...@tycho.nsa.gov>;
> Tim (Xen.org) <t...@xen.org>
> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> HYPERVISOR_memory_op to acquire guest resources
> 
> >>> On 19.10.17 at 18:21, <julien.gr...@linaro.org> wrote:
> > Looking a bit more at the resource you can acquire from this hypercall.
> > Some of them are allocated using alloc_xenheap_page() so not assigned to
> > a domain.
> >
> > So I am not sure how you can expect a function set_foreign_p2m_entry to
> > take reference in that case.
> 
> Hmm, with the domain parameter added, DOMID_XEN there (for
> Xen heap pages) could identify no references to be taken, if that
> was really the intended behavior in that case. However, even for
> Xen heap pages life time tracking ought to be done - it is for a
> reason that share_xen_page_with_guest() assigns the target
> domain as the owner of such pages, as that allows get_page() to
> succeed for them.
> 

So, nothing I'm doing here is making anything worse, right? Grant tables are 
assigned to the guest, and IOREQ server pages are allocated with 
alloc_domheap_page() so nothing is anonymous.

  Paul

> Jan




[Xen-devel] [PATCH] x86/xen: support priv-mapping in an HVM tools domain

2017-10-19 Thread Paul Durrant
If the domain has XENFEAT_auto_translated_physmap then use of the PV-
specific HYPERVISOR_mmu_update hypercall is clearly incorrect.

This patch adds checks in xen_remap_domain_gfn_array() and
xen_unmap_domain_gfn_array() which call through to the appropriate
xlate_mmu function if the feature is present. A check is also added
to xen_remap_domain_gfn_range() to fail with -EOPNOTSUPP since this
should not be used in an HVM tools domain.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
---
 arch/x86/xen/mmu.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 3e15345abfe7..d33e7dbe3129 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -172,6 +172,9 @@ int xen_remap_domain_gfn_range(struct vm_area_struct *vma,
   pgprot_t prot, unsigned domid,
   struct page **pages)
 {
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return -EOPNOTSUPP;
+
    return do_remap_gfn(vma, addr, &gfn, nr, NULL, prot, domid, pages);
 }
 EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
@@ -182,6 +185,10 @@ int xen_remap_domain_gfn_array(struct vm_area_struct *vma,
   int *err_ptr, pgprot_t prot,
   unsigned domid, struct page **pages)
 {
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return xen_xlate_remap_gfn_array(vma, addr, gfn, nr, err_ptr,
+prot, domid, pages);
+
/* We BUG_ON because it's a programmer error to pass a NULL err_ptr,
 * and the consequences later is quite hard to detect what the actual
 * cause of "wrong memory was mapped in".
@@ -193,9 +200,12 @@ EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_array);
 
 /* Returns: 0 success */
 int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
-  int numpgs, struct page **pages)
+  int nr, struct page **pages)
 {
-   if (!pages || !xen_feature(XENFEAT_auto_translated_physmap))
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return xen_xlate_unmap_gfn_range(vma, nr, pages);
+
+   if (!pages)
return 0;
 
return -EINVAL;
-- 
2.11.0




[Xen-devel] [PATCH] x86/xen: support priv-mapping in an HVM tools domain

2017-10-19 Thread Paul Durrant
If the domain has XENFEAT_auto_translated_physmap then use of the PV-
specific HYPERVISOR_mmu_update hypercall is clearly incorrect.

This patch adds checks in xen_remap_domain_gfn_array() and
xen_unmap_domain_gfn_array() which call through to the appropriate
xlate_mmu function if the feature is present. A check is also added
to xen_remap_domain_gfn_range() to fail with -EOPNOTSUPP since this
should not be used in an HVM tools domain.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Boris Ostrovsky <boris.ostrov...@oracle.com>
Juergen Gross <jgr...@suse.com>
Thomas Gleixner <t...@linutronix.de>
Ingo Molnar <mi...@redhat.com>
"H. Peter Anvin" <h...@zytor.com>
---
 arch/x86/xen/mmu.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 3e15345abfe7..d33e7dbe3129 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -172,6 +172,9 @@ int xen_remap_domain_gfn_range(struct vm_area_struct *vma,
   pgprot_t prot, unsigned domid,
   struct page **pages)
 {
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return -EOPNOTSUPP;
+
    return do_remap_gfn(vma, addr, &gfn, nr, NULL, prot, domid, pages);
 }
 EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
@@ -182,6 +185,10 @@ int xen_remap_domain_gfn_array(struct vm_area_struct *vma,
   int *err_ptr, pgprot_t prot,
   unsigned domid, struct page **pages)
 {
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return xen_xlate_remap_gfn_array(vma, addr, gfn, nr, err_ptr,
+prot, domid, pages);
+
/* We BUG_ON because it's a programmer error to pass a NULL err_ptr,
 * and the consequences later is quite hard to detect what the actual
 * cause of "wrong memory was mapped in".
@@ -193,9 +200,12 @@ EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_array);
 
 /* Returns: 0 success */
 int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
-  int numpgs, struct page **pages)
+  int nr, struct page **pages)
 {
-   if (!pages || !xen_feature(XENFEAT_auto_translated_physmap))
+   if (xen_feature(XENFEAT_auto_translated_physmap))
+   return xen_xlate_unmap_gfn_range(vma, nr, pages);
+
+   if (!pages)
return 0;
 
return -EINVAL;
-- 
2.11.0




Re: [Xen-devel] [PATCH] x86/xen: support priv-mapping in an HVM tools domain

2017-10-19 Thread Paul Durrant
Apologies... I misformatted this. I will re-send.

  Paul
> -Original Message-
> From: Paul Durrant [mailto:paul.durr...@citrix.com]
> Sent: 19 October 2017 16:24
> To: x...@kernel.org; xen-de...@lists.xenproject.org; linux-
> ker...@vger.kernel.org
> Cc: Paul Durrant <paul.durr...@citrix.com>
> Subject: [PATCH] x86/xen: support priv-mapping in an HVM tools domain
> 
> If the domain has XENFEAT_auto_translated_physmap then use of the PV-
> specific HYPERVISOR_mmu_update hypercall is clearly incorrect.
> 
> This patch adds checks in xen_remap_domain_gfn_array() and
> xen_unmap_domain_gfn_array() which call through to the appropriate
> xlate_mmu function if the feature is present. A check is also added
> to xen_remap_domain_gfn_range() to fail with -EOPNOTSUPP since this
> should not be used in an HVM tools domain.
> 
> Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> ---
> Boris Ostrovsky <boris.ostrov...@oracle.com>
> Juergen Gross <jgr...@suse.com>
> Thomas Gleixner <t...@linutronix.de>
> Ingo Molnar <mi...@redhat.com>
> "H. Peter Anvin" <h...@zytor.com>
> ---
>  arch/x86/xen/mmu.c | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> index 3e15345abfe7..d33e7dbe3129 100644
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -172,6 +172,9 @@ int xen_remap_domain_gfn_range(struct
> vm_area_struct *vma,
>  pgprot_t prot, unsigned domid,
>  struct page **pages)
>  {
> + if (xen_feature(XENFEAT_auto_translated_physmap))
> + return -EOPNOTSUPP;
> +
>   return do_remap_gfn(vma, addr, &gfn, nr, NULL, prot, domid,
> pages);
>  }
>  EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_range);
> @@ -182,6 +185,10 @@ int xen_remap_domain_gfn_array(struct
> vm_area_struct *vma,
>  int *err_ptr, pgprot_t prot,
>  unsigned domid, struct page **pages)
>  {
> + if (xen_feature(XENFEAT_auto_translated_physmap))
> + return xen_xlate_remap_gfn_array(vma, addr, gfn, nr,
> err_ptr,
> +  prot, domid, pages);
> +
>   /* We BUG_ON because it's a programmer error to pass a NULL
> err_ptr,
>* and the consequences later is quite hard to detect what the actual
>* cause of "wrong memory was mapped in".
> @@ -193,9 +200,12 @@
> EXPORT_SYMBOL_GPL(xen_remap_domain_gfn_array);
> 
>  /* Returns: 0 success */
>  int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
> -int numpgs, struct page **pages)
> +int nr, struct page **pages)
>  {
> - if (!pages || !xen_feature(XENFEAT_auto_translated_physmap))
> + if (xen_feature(XENFEAT_auto_translated_physmap))
> + return xen_xlate_unmap_gfn_range(vma, nr, pages);
> +
> + if (!pages)
>   return 0;
> 
>   return -EINVAL;
> --
> 2.11.0




Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-19 Thread Paul Durrant
> -Original Message-
[snip]
> >
> > I'd prefer to make the whole thing x86-only since that's the only platform
> on which I can test it, and indeed the code used to be x86-only. Jan objected
> to this so all I'm trying to achieve is that it builds for ARM. Please can 
> you and
> Jan reach agreement on where the code should live and how, if at all, it
> should be #ifdef-ed?
> 
> I am quite surprised of "it is tools-only" so it is fine to not protect
> it even if it is x86 only. That's probably going to bite us in the future.
> 

So, this appears to have reached an impasse. I don't know how to proceed 
without having to also fix priv mapping for x86, which is a yak far too large 
for me to shave at the moment.

  Paul


Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-19 Thread Paul Durrant
> -Original Message-
> From: Julien Grall [mailto:julien.gr...@linaro.org]
> Sent: 19 October 2017 14:29
> To: Paul Durrant <paul.durr...@citrix.com>; xen-de...@lists.xenproject.org
> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Tim
> (Xen.org) <t...@xen.org>; Julien Grall <julien.gr...@arm.com>; Jan Beulich
> <jbeul...@suse.com>; Daniel De Graaf <dgde...@tycho.nsa.gov>; Roger
> Pau Monne <roger@citrix.com>
> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> HYPERVISOR_memory_op to acquire guest resources
> 
> Hi,
> 
> On 10/19/2017 01:57 PM, Paul Durrant wrote:
> >> -Original Message-
> >> From: Julien Grall [mailto:julien.gr...@linaro.org]
> >> Sent: 19 October 2017 13:23
> >> To: Paul Durrant <paul.durr...@citrix.com>; xen-
> de...@lists.xenproject.org
> >> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> >> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk
> <konrad.w...@oracle.com>;
> >> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> >> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>;
> Tim
> >> (Xen.org) <t...@xen.org>; Julien Grall <julien.gr...@arm.com>; Jan
> Beulich
> >> <jbeul...@suse.com>; Daniel De Graaf <dgde...@tycho.nsa.gov>; Roger
> >> Pau Monne <roger@citrix.com>
> >> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> >> HYPERVISOR_memory_op to acquire guest resources
> >>
> >> Hi,
> >>
> >> On 17/10/17 14:24, Paul Durrant wrote:
> >>> Certain memory resources associated with a guest are not necessarily
> >>> present in the guest P2M.
> >>>
> >>> This patch adds the boilerplate for new memory op to allow such a
> >> resource
> >>> to be priv-mapped directly, by either a PV or HVM tools domain.
> >>>
> >>> NOTE: Whilst the new op is not intrinsically specific to the x86
> >>> architecture,
> >>> I have no means to test it on an ARM platform and so cannot verify
> >>> that it functions correctly.
> >>>
> >>> Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> >>> ---
> >>
> >> [...]
> >>
> >>> diff --git a/xen/common/memory.c b/xen/common/memory.c
> >>> index ad987e0f29..cdd2e030cf 100644
> >>> --- a/xen/common/memory.c
> >>> +++ b/xen/common/memory.c
> >>> @@ -965,6 +965,95 @@ static long xatp_permission_check(struct
> domain
> >> *d, unsigned int space)
> >>
> >> [...]
> >>
> >>> +if ( rc )
> >>> +goto out;
> >>> +
> >>> +if ( !paging_mode_translate(currd) )
> >>> +{
> >>> +if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
> >>> +rc = -EFAULT;
> >>> +}
> >>> +else
> >>> +{
> >>> +xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
> >>> +unsigned int i;
> >>> +
> >>> +rc = -EFAULT;
> >>> +if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
> >>> +goto out;
> >>> +
> >>> +for ( i = 0; i < xmar.nr_frames; i++ )
> >>> +{
> >>> +rc = set_foreign_p2m_entry(currd, gfn_list[i],
> >>> +   _mfn(mfn_list[i]));
> >>
> >> Something looks a bit odd to me here. When I read foreign mapping, I
> >> directly associate to mapping from a foreign domain.
> >>
> >> On Arm, we will always get a reference on that page to prevent it
> >> disappearing if the foreign domain is destroyed but the mapping is still
> >> present.
> >>
> >> This reference will either be put with an unmapped hypercall or while
> >> teardown the domain.
> >>
> >> Per my understanding, this MFN does not belong to any domain (or at
> >> least currd). Right?
> >
> > No. The mfns do belong to the target domain.
> 
> To be fully safe, you need to take a reference on each page you mapped.
> So who is going to get a ref

Re: [Xen-devel] [PATCH v12 06/11] x86/hvm/ioreq: add a new mappable resource type...

2017-10-19 Thread Paul Durrant
> -Original Message-
> From: Julien Grall [mailto:julien.gr...@linaro.org]
> Sent: 19 October 2017 14:08
> To: Paul Durrant <paul.durr...@citrix.com>; xen-de...@lists.xenproject.org
> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Tim
> (Xen.org) <t...@xen.org>; Jan Beulich <jbeul...@suse.com>
> Subject: Re: [Xen-devel] [PATCH v12 06/11] x86/hvm/ioreq: add a new
> mappable resource type...
> 
> Hi Paul,
> 
> On 10/19/2017 01:58 PM, Paul Durrant wrote:
> >> -Original Message-----
> >> From: Julien Grall [mailto:julien.gr...@linaro.org]
> >> Sent: 19 October 2017 13:31
> >> To: Paul Durrant <paul.durr...@citrix.com>; xen-
> de...@lists.xenproject.org
> >> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> >> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk
> <konrad.w...@oracle.com>;
> >> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> >> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>;
> Tim
> >> (Xen.org) <t...@xen.org>; Jan Beulich <jbeul...@suse.com>
> >> Subject: Re: [Xen-devel] [PATCH v12 06/11] x86/hvm/ioreq: add a new
> >> mappable resource type...
> >>
> >> Hi,
> >>
> >> On 17/10/17 14:24, Paul Durrant wrote:
> >>> diff --git a/xen/common/memory.c b/xen/common/memory.c
> >>> index cdd2e030cf..b27a71c4f1 100644
> >>> --- a/xen/common/memory.c
> >>> +++ b/xen/common/memory.c
> >>> @@ -1011,6 +1011,11 @@ static int acquire_resource(
> >>>
> >>>switch ( xmar.type )
> >>>{
> >>> +case XENMEM_resource_ioreq_server:
> >>> +rc = xenmem_acquire_ioreq_server(d, xmar.id, xmar.frame,
> >>> + xmar.nr_frames, mfn_list);
> >>> +break;
> >>
> >> I fully appreciate you are not able to test on Arm. However, I would
> >> have expected you to at least build test it.
> >>
> >
> > I don't actually know how to set up cross-compilation, which is why I'd
> originally #ifdef-ed this.
> 
> It is quite trivial. See
> https://wiki.xenproject.org/wiki/Xen_ARM_with_Virtualization_Extensions#
> Cross_Compiling
> 

Oh, that's way more trivial than I'd imagined. Thanks!

  Paul

> --
> Julien Grall


Re: [Xen-devel] [PATCH v12 06/11] x86/hvm/ioreq: add a new mappable resource type...

2017-10-19 Thread Paul Durrant
> -Original Message-
> From: Julien Grall [mailto:julien.gr...@linaro.org]
> Sent: 19 October 2017 13:31
> To: Paul Durrant <paul.durr...@citrix.com>; xen-de...@lists.xenproject.org
> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Tim
> (Xen.org) <t...@xen.org>; Jan Beulich <jbeul...@suse.com>
> Subject: Re: [Xen-devel] [PATCH v12 06/11] x86/hvm/ioreq: add a new
> mappable resource type...
> 
> Hi,
> 
> On 17/10/17 14:24, Paul Durrant wrote:
> > diff --git a/xen/common/memory.c b/xen/common/memory.c
> > index cdd2e030cf..b27a71c4f1 100644
> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -1011,6 +1011,11 @@ static int acquire_resource(
> >
> >   switch ( xmar.type )
> >   {
> > +case XENMEM_resource_ioreq_server:
> > +rc = xenmem_acquire_ioreq_server(d, xmar.id, xmar.frame,
> > + xmar.nr_frames, mfn_list);
> > +break;
> 
> I fully appreciate you are not able to test on Arm. However, I would
> have expected you to at least build test it.
> 

I don't actually know how to set up cross-compilation, which is why I'd 
originally #ifdef-ed this.

> For instance, here you introduce this function on x86 and call it in common
> code, but do not introduce it on Arm.
> 
> Although, I don't think we should introduce it for Arm. Instead we
> should provide arch helpers as we do for other memory operations.
> 

That would be cleaner in this case.

  Paul

> Cheers,
> 
> --
> Julien Grall


Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-19 Thread Paul Durrant
> -Original Message-
> From: Julien Grall [mailto:julien.gr...@linaro.org]
> Sent: 19 October 2017 13:23
> To: Paul Durrant <paul.durr...@citrix.com>; xen-de...@lists.xenproject.org
> Cc: Stefano Stabellini <sstabell...@kernel.org>; Wei Liu
> <wei.l...@citrix.com>; Konrad Rzeszutek Wilk <konrad.w...@oracle.com>;
> George Dunlap <george.dun...@citrix.com>; Andrew Cooper
> <andrew.coop...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Tim
> (Xen.org) <t...@xen.org>; Julien Grall <julien.gr...@arm.com>; Jan Beulich
> <jbeul...@suse.com>; Daniel De Graaf <dgde...@tycho.nsa.gov>; Roger
> Pau Monne <roger@citrix.com>
> Subject: Re: [Xen-devel] [PATCH v12 05/11] x86/mm: add
> HYPERVISOR_memory_op to acquire guest resources
> 
> Hi,
> 
> On 17/10/17 14:24, Paul Durrant wrote:
> > Certain memory resources associated with a guest are not necessarily
> > present in the guest P2M.
> >
> > This patch adds the boilerplate for a new memory op to allow such a
> resource
> > to be priv-mapped directly, by either a PV or HVM tools domain.
> >
> > NOTE: Whilst the new op is not intrinsically specific to the x86 architecture,
> >I have no means to test it on an ARM platform and so cannot verify
> >that it functions correctly.
> >
> > Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> > ---
> 
> [...]
> 
> > diff --git a/xen/common/memory.c b/xen/common/memory.c
> > index ad987e0f29..cdd2e030cf 100644
> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -965,6 +965,95 @@ static long xatp_permission_check(struct domain
> *d, unsigned int space)
> 
> [...]
> 
> > +if ( rc )
> > +goto out;
> > +
> > +if ( !paging_mode_translate(currd) )
> > +{
> > +if ( copy_to_guest(xmar.frame_list, mfn_list, xmar.nr_frames) )
> > +rc = -EFAULT;
> > +}
> > +else
> > +{
> > +xen_pfn_t gfn_list[ARRAY_SIZE(mfn_list)];
> > +unsigned int i;
> > +
> > +rc = -EFAULT;
> > +if ( copy_from_guest(gfn_list, xmar.frame_list, xmar.nr_frames) )
> > +goto out;
> > +
> > +for ( i = 0; i < xmar.nr_frames; i++ )
> > +{
> > +rc = set_foreign_p2m_entry(currd, gfn_list[i],
> > +   _mfn(mfn_list[i]));
> 
> Something looks a bit odd to me here. When I read foreign mapping, I
> directly associate it with mapping from a foreign domain.
> 
> On Arm, we will always get a reference on that page to prevent it
> disappearing if the foreign domain is destroyed but the mapping is still
> present.
> 
> This reference will either be put with an unmap hypercall or while
> tearing down the domain.
> 
> Per my understanding, this MFN does not belong to any domain (or at
> least currd). Right?

No. The mfns do belong to the target domain.

> So there is no way to get/put a reference on that
> page. So I am unconvinced that this is very safe.
> 
> Also looking at the x86 side, I can't find such reference in the foreign
> path in p2m_add_foreign. Did I miss anything?

No, I don't think there is any reference counting there... but this is no 
different to priv mapping. I'm not trying to fix the mapping infrastructure at 
this point.

> 
> Note that x86 does not handle p2m teardown with foreign map at the
> moment (see p2m_add_foreign).
> 
> You are by-passing this check and I can't see how this would be safe for
> the x86 side too.
> 

I don't follow. What check am I by-passing that is covered when priv mapping?

> > +if ( rc )
> > +{
> > +/*
> > + * Make sure rc is -EIO for any iteration other than
> > + * the first.
> > + */
> > +rc = (i != 0) ? -EIO : rc;
> > +break;
> > +}
> > +}
> > +}
> > +
> > + out:
> > +rcu_unlock_domain(d);
> > +return rc;
> > +}
> > +
> >   long do_memory_op(unsigned long cmd,
> XEN_GUEST_HANDLE_PARAM(void) arg)
> >   {
> >   struct domain *d, *curr_d = current->domain;
> > @@ -1406,6 +1495,11 @@ long do_memory_op(unsigned long cmd,
> XEN_GUEST_HANDLE_PARAM(void) arg)
> >   }
> >   #endif
> >
> > +case XENMEM_acquire_resource:
> > +rc = acquire_resource(
> > +guest_handle_cast(arg, xen_mem_acquire_resource_t));
> > +break;
> > +
> >

[Xen-devel] [PATCH v12 11/11] tools/libxenctrl: use new xenforeignmemory API to seed grant table

2017-10-17 Thread Paul Durrant
A previous patch added support for priv-mapping guest resources directly
(rather than having to foreign-map, which requires P2M modification for
HVM guests).

This patch makes use of the new API to seed the guest grant table unless
the underlying infrastructure (i.e. privcmd) doesn't support it, in which
case the old scheme is used.

NOTE: The call to xc_dom_gnttab_hvm_seed() in hvm_build_set_params() was
  actually unnecessary, as the grant table has already been seeded
  by a prior call to xc_dom_gnttab_init() made by libxl__build_dom().
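
The fallback described above (try the new resource-mapping path, use the
old scheme only if the infrastructure lacks support) could be sketched
roughly as below. This is an illustration only, not the libxc code:
seed_via_resource(), seed_via_foreign_map() and the resource_map_supported
switch are hypothetical stand-ins, and treating EOPNOTSUPP as the "not
supported" signal is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical switch: false simulates a privcmd without resource mapping. */
static bool resource_map_supported = true;

/* Hypothetical new path: fails with EOPNOTSUPP when support is missing. */
static int seed_via_resource(void)
{
    if ( !resource_map_supported )
    {
        errno = EOPNOTSUPP;
        return -1;
    }

    return 0;
}

/* Hypothetical old path: assumed to be always available. */
static int seed_via_foreign_map(void)
{
    return 0;
}

/* Returns 0 on success; *used_compat reports which path did the seeding. */
static int seed_gnttab(bool *used_compat)
{
    int rc;

    assert(used_compat != NULL);
    *used_compat = false;

    rc = seed_via_resource();
    if ( rc == 0 )
        return 0;

    /* Any error other than "not supported" is a real failure. */
    if ( errno != EOPNOTSUPP )
        return rc;

    *used_compat = true;
    return seed_via_foreign_map();
}
```

Keeping the errno check explicit means a genuine failure of the new path
is reported rather than being masked by a silent retry via the old one.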

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marma...@invisiblethingslab.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
---
Cc: Ian Jackson <ian.jack...@eu.citrix.com>

v10:
 - Use new id constant for grant table.

v4:
 - Minor cosmetic fix suggested by Roger.

v3:
 - Introduced xc_dom_set_gnttab_entry() to avoid duplicated code.
---
 tools/libxc/include/xc_dom.h|   8 +--
 tools/libxc/xc_dom_boot.c   | 114 +---
 tools/libxc/xc_sr_restore_x86_hvm.c |  10 ++--
 tools/libxc/xc_sr_restore_x86_pv.c  |   2 +-
 tools/libxl/libxl_dom.c |   1 -
 tools/python/xen/lowlevel/xc/xc.c   |   6 +-
 6 files changed, 92 insertions(+), 49 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 6e06ef1dec..4216d63462 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -325,12 +325,8 @@ void *xc_dom_boot_domU_map(struct xc_dom_image *dom, 
xen_pfn_t pfn,
 int xc_dom_boot_image(struct xc_dom_image *dom);
 int xc_dom_compat_check(struct xc_dom_image *dom);
 int xc_dom_gnttab_init(struct xc_dom_image *dom);
-int xc_dom_gnttab_hvm_seed(xc_interface *xch, domid_t domid,
-   xen_pfn_t console_gmfn,
-   xen_pfn_t xenstore_gmfn,
-   domid_t console_domid,
-   domid_t xenstore_domid);
-int xc_dom_gnttab_seed(xc_interface *xch, domid_t domid,
+int xc_dom_gnttab_seed(xc_interface *xch, domid_t guest_domid,
+   bool is_hvm,
xen_pfn_t console_gmfn,
xen_pfn_t xenstore_gmfn,
domid_t console_domid,
diff --git a/tools/libxc/xc_dom_boot.c b/tools/libxc/xc_dom_boot.c
index 8a376d097c..0fe94aa255 100644
--- a/tools/libxc/xc_dom_boot.c
+++ b/tools/libxc/xc_dom_boot.c
@@ -282,11 +282,29 @@ static xen_pfn_t xc_dom_gnttab_setup(xc_interface *xch, 
domid_t domid)
 return gmfn;
 }
 
-int xc_dom_gnttab_seed(xc_interface *xch, domid_t domid,
-   xen_pfn_t console_gmfn,
-   xen_pfn_t xenstore_gmfn,
-   domid_t console_domid,
-   domid_t xenstore_domid)
+static void xc_dom_set_gnttab_entry(xc_interface *xch,
+grant_entry_v1_t *gnttab,
+unsigned int idx,
+domid_t guest_domid,
+domid_t backend_domid,
+xen_pfn_t backend_gmfn)
+{
+if ( guest_domid == backend_domid || backend_gmfn == -1)
+return;
+
+xc_dom_printf(xch, "%s: [%u] -> 0x%"PRI_xen_pfn,
+  __FUNCTION__, idx, backend_gmfn);
+
+gnttab[idx].flags = GTF_permit_access;
+gnttab[idx].domid = backend_domid;
+gnttab[idx].frame = backend_gmfn;
+}
+
+static int compat_gnttab_seed(xc_interface *xch, domid_t domid,
+  xen_pfn_t console_gmfn,
+  xen_pfn_t xenstore_gmfn,
+  domid_t console_domid,
+  domid_t xenstore_domid)
 {
 
 xen_pfn_t gnttab_gmfn;
@@ -310,18 +328,10 @@ int xc_dom_gnttab_seed(xc_interface *xch, domid_t domid,
 return -1;
 }
 
-if ( domid != console_domid  && console_gmfn != -1)
-{
-gnttab[GNTTAB_RESERVED_CONSOLE].flags = GTF_permit_access;
-gnttab[GNTTAB_RESERVED_CONSOLE].domid = console_domid;
-gnttab[GNTTAB_RESERVED_CONSOLE].frame = console_gmfn;
-}
-if ( domid != xenstore_domid && xenstore_gmfn != -1)
-{
-gnttab[GNTTAB_RESERVED_XENSTORE].flags = GTF_permit_access;
-gnttab[GNTTAB_RESERVED_XENSTORE].domid = xenstore_domid;
-gnttab[GNTTAB_RESERVED_XENSTORE].frame = xenstore_gmfn;
-}
+xc_dom_set_gnttab_entry(xch, gnttab, GNTTAB_RESERVED_CONSOLE,
+domid, console_domid, console_gmfn);
+xc_dom_set_gnttab_entry(xch, gnttab, GNTTAB_RESERVED_XENSTORE,
+domid, xenstore_domid, xenstore_gmfn);
 
 if ( munmap(gnttab, PAGE_SIZE) == -1 )
 {
@@ -339,11 +349,11 @@ int xc_dom_gnttab_seed(xc_interface *xch, domid_t domid,
 return 0;

[Xen-devel] [PATCH v12 10/11] common: add a new mappable resource type: XENMEM_resource_grant_table

2017-10-17 Thread Paul Durrant
This patch allows grant table frames to be mapped using the
XENMEM_acquire_resource memory op.
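
As a rough illustration of why the frame list in this patch is filled
from the highest index downwards, consider a toy table that grows on
demand. Nothing below is Xen code: struct toy_table, toy_get_frame() and
toy_acquire() are made up for the example, and the assumption that growth
happens in one step to cover the requested index is mine.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 8

/* Toy table: frames [0, nr_frames) exist, the rest appear on demand. */
struct toy_table {
    unsigned int nr_frames;
    unsigned int grow_calls;   /* number of times the table had to grow */
};

/* Grow the table so that frame idx exists, then return a fake frame number. */
static int toy_get_frame(struct toy_table *t, unsigned int idx,
                         unsigned long *frame)
{
    if ( idx >= MAX_FRAMES )
        return -1;

    if ( idx >= t->nr_frames )
    {
        t->nr_frames = idx + 1;
        t->grow_calls++;
    }

    *frame = 0x1000ul + idx;
    return 0;
}

/* Fill list[0..nr) for frames first..first+nr-1, highest index first. */
static int toy_acquire(struct toy_table *t, unsigned int first,
                       unsigned int nr, unsigned long list[])
{
    unsigned int i = nr;

    assert(list != NULL);

    while ( i-- != 0 )
    {
        int rc = toy_get_frame(t, first + i, &list[i]);

        if ( rc )
            return rc;
    }

    return 0;
}
```

Asking for the last frame first means the table grows at most once per
request, and an out-of-range request fails before any smaller growth has
been done.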

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: Wei Liu <wei.l...@citrix.com>

v12:
 - Dropped limit checks as requested by Jan.

v10:
 - Addressed comments from Jan.

v8:
 - The functionality was originally incorporated into the earlier patch
   "x86/mm: add HYPERVISOR_memory_op to acquire guest resources".
---
 xen/common/grant_table.c  | 49 ++-
 xen/common/memory.c   | 45 ++-
 xen/include/public/memory.h   |  6 ++
 xen/include/xen/grant_table.h |  4 
 4 files changed, 98 insertions(+), 6 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 6d20b17739..886579a7b0 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3756,13 +3756,12 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, 
grant_ref_t ref,
 }
 #endif
 
-int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
- mfn_t *mfn)
+/* Caller must hold write lock as version may change and table may grow */
+static int gnttab_get_frame(struct domain *d, unsigned long idx,
+mfn_t *mfn)
 {
-int rc = 0;
 struct grant_table *gt = d->grant_table;
-
-grant_write_lock(gt);
+int rc = 0;
 
 if ( gt->gt_version == 0 )
 gt->gt_version = 1;
@@ -3787,6 +3786,18 @@ int gnttab_map_frame(struct domain *d, unsigned long 
idx, gfn_t gfn,
 rc = -EINVAL;
 }
 
+return rc;
+}
+
+int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
+ mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+int rc;
+
+grant_write_lock(gt);
+
+rc = gnttab_get_frame(d, idx, mfn);
 if ( !rc )
 gnttab_set_frame_gfn(gt, idx, gfn);
 
@@ -3795,6 +3806,34 @@ int gnttab_map_frame(struct domain *d, unsigned long 
idx, gfn_t gfn,
 return rc;
 }
 
+int gnttab_get_grant_frame(struct domain *d, unsigned long idx,
+   mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+int rc;
+
+/* write lock required as version may change and/or table may grow */
+grant_write_lock(gt);
+rc = gnttab_get_frame(d, idx, mfn);
+grant_write_unlock(gt);
+
+return rc;
+}
+
+int gnttab_get_status_frame(struct domain *d, unsigned long idx,
+mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+int rc;
+
+/* write lock required as version may change and/or table may grow */
+grant_write_lock(gt);
+rc = gnttab_get_frame(d, idx | XENMAPIDX_grant_table_status, mfn);
+grant_write_unlock(gt);
+
+return rc;
+}
+
 static void gnttab_usage_print(struct domain *rd)
 {
 int first = 1;
diff --git a/xen/common/memory.c b/xen/common/memory.c
index b27a71c4f1..ae46d95885 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -965,12 +966,49 @@ static long xatp_permission_check(struct domain *d, 
unsigned int space)
 return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
 }
 
+static int acquire_grant_table(struct domain *d, unsigned int id,
+   unsigned long frame,
+   unsigned int nr_frames,
+   unsigned long mfn_list[])
+{
+unsigned int i = nr_frames;
+
+/* Iterate backwards in case table needs to grow */
+while ( i-- != 0 )
+{
+mfn_t mfn = INVALID_MFN;
+int rc;
+
+switch ( id )
+{
+case XENMEM_resource_grant_table_id_grant:
+rc = gnttab_get_grant_frame(d, frame + i, &mfn);
+break;
+
+case XENMEM_resource_grant_table_id_status:
+rc = gnttab_get_status_frame(d, frame + i, &mfn);
+break;
+
+default:
+rc = -EINVAL;
+break;
+}
+
+if ( rc )
+return rc;
+
+mfn_list[i] = mfn_x(mfn);
+}
+
+return 0;
+}
+
 static int acquire_resource(
 XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
 {
 struct domain *d, *currd = current->domain;
 xen_mem_acquire_resource_t xmar;
-unsigned long mfn_list[2];
+unsigned long mfn_list[32];
 int rc;
 
 if ( copy_from_guest(&xmar, arg, 1) )
@@ -1016,6 +1054,11 @@ static int acquire_resource(
  xmar.nr_frames, mfn_list);
 break;
 
+ca

[Xen-devel] [PATCH v12 02/11] x86/hvm/ioreq: simplify code and use consistent naming

2017-10-17 Thread Paul Durrant
This patch re-works much of the ioreq server initialization and teardown
code:

- The hvm_map/unmap_ioreq_gfn() functions are expanded to call through
  to hvm_alloc/free_ioreq_gfn() rather than expecting them to be called
  separately by outer functions.
- Several functions now test the validity of the hvm_ioreq_page gfn value
  to determine whether they need to act. This means they can be safely called
  for the bufioreq page even when it is not used.
- hvm_add/remove_ioreq_gfn() simply return in the case of the default
  IOREQ server so callers no longer need to test before calling.
- hvm_ioreq_server_setup_pages() is renamed to hvm_ioreq_server_map_pages()
  to mirror the existing hvm_ioreq_server_unmap_pages().

All of this significantly shortens the code.
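
To illustrate the allocator shape the reworked hvm_alloc/free_ioreq_gfn()
move to (return the gfn directly, with an invalid value meaning "none
left", instead of an rc plus out-parameter), here is a stand-alone
sketch. The names struct toy_gfn_pool, toy_alloc_gfn(), toy_free_gfn()
and TOY_INVALID_GFN are hypothetical, and unlike the hypervisor's
test_and_clear_bit()/set_bit() the bit operations here are not atomic.

```c
#include <assert.h>
#include <limits.h>

#define TOY_INVALID_GFN (~0ul)

struct toy_gfn_pool {
    unsigned long base;   /* first gfn in the pool */
    unsigned long mask;   /* bit i set => gfn (base + i) is free */
};

/* Claim the lowest free gfn, or return TOY_INVALID_GFN if none is left. */
static unsigned long toy_alloc_gfn(struct toy_gfn_pool *p)
{
    unsigned int i;

    for ( i = 0; i < sizeof(p->mask) * CHAR_BIT; i++ )
    {
        if ( p->mask & (1ul << i) )
        {
            p->mask &= ~(1ul << i);
            return p->base + i;
        }
    }

    return TOY_INVALID_GFN;
}

/* Return a previously allocated gfn to the pool. */
static void toy_free_gfn(struct toy_gfn_pool *p, unsigned long gfn)
{
    assert(gfn != TOY_INVALID_GFN);

    p->mask |= 1ul << (gfn - p->base);
}
```

With the sentinel return, a caller needs only one comparison against the
invalid value to detect exhaustion, which is what lets the map function
fold allocation and error handling together.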

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Reviewed-by: Wei Liu <wei.l...@citrix.com>
Acked-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>

v3:
 - Rebased on top of 's->is_default' to 'IS_DEFAULT(s)' changes.
 - Minor updates in response to review comments from Roger.
---
 xen/arch/x86/hvm/ioreq.c | 182 ++-
 1 file changed, 69 insertions(+), 113 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index e6ccc7572a..6d81018369 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -210,63 +210,75 @@ bool handle_hvm_io_completion(struct vcpu *v)
 return true;
 }
 
-static int hvm_alloc_ioreq_gfn(struct domain *d, unsigned long *gfn)
+static unsigned long hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
 {
+struct domain *d = s->domain;
 unsigned int i;
-int rc;
 
-rc = -ENOMEM;
+ASSERT(!IS_DEFAULT(s));
+
 for ( i = 0; i < sizeof(d->arch.hvm_domain.ioreq_gfn.mask) * 8; i++ )
 {
 if ( test_and_clear_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask) )
-{
-*gfn = d->arch.hvm_domain.ioreq_gfn.base + i;
-rc = 0;
-break;
-}
+return d->arch.hvm_domain.ioreq_gfn.base + i;
 }
 
-return rc;
+return gfn_x(INVALID_GFN);
 }
 
-static void hvm_free_ioreq_gfn(struct domain *d, unsigned long gfn)
+static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s,
+   unsigned long gfn)
 {
+struct domain *d = s->domain;
 unsigned int i = gfn - d->arch.hvm_domain.ioreq_gfn.base;
 
-if ( gfn != gfn_x(INVALID_GFN) )
-set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
+ASSERT(!IS_DEFAULT(s));
+ASSERT(gfn != gfn_x(INVALID_GFN));
+
+set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 }
 
-static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool buf)
+static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
+if ( iorp->gfn == gfn_x(INVALID_GFN) )
+return;
+
 destroy_ring_for_helper(&iorp->va, iorp->page);
+iorp->page = NULL;
+
+if ( !IS_DEFAULT(s) )
+hvm_free_ioreq_gfn(s, iorp->gfn);
+
+iorp->gfn = gfn_x(INVALID_GFN);
 }
 
-static int hvm_map_ioreq_page(
-struct hvm_ioreq_server *s, bool buf, unsigned long gfn)
+static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
 struct domain *d = s->domain;
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
-struct page_info *page;
-void *va;
 int rc;
 
-if ( (rc = prepare_ring_for_helper(d, gfn, &page, &va)) )
-return rc;
-
-if ( (iorp->va != NULL) || d->is_dying )
-{
-destroy_ring_for_helper(&va, page);
+if ( d->is_dying )
 return -EINVAL;
-}
 
-iorp->va = va;
-iorp->page = page;
-iorp->gfn = gfn;
+if ( IS_DEFAULT(s) )
+iorp->gfn = buf ?
+d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
+d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
+else
+iorp->gfn = hvm_alloc_ioreq_gfn(s);
 
-return 0;
+if ( iorp->gfn == gfn_x(INVALID_GFN) )
+return -ENOMEM;
+
+rc = prepare_ring_for_helper(d, iorp->gfn, &iorp->page, &iorp->va);
+
+if ( rc )
+hvm_unmap_ioreq_gfn(s, buf);
+
+return rc;
 }
 
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
@@ -279,8 +291,7 @@ bool is_ioreq_server_page(struct domain *d, const struct 
page_info *page)
 
 FOR_EACH_IOREQ_SERVER(d, id, s)
 {
-if ( (s->ioreq.va && s->ioreq.page == page) ||
- (s->bufioreq.va && s->bufioreq.page == page) )
+if ( (s->ioreq.page == page) || (s->bufioreq.page == page) )
 {
 found = true;
 break;
@@ -292,20 +303,30 @@ bool is_ioreq_server_page(struct domain *d, const struct 
page_info *page)
 retu

[Xen-devel] [PATCH v12 00/11] x86: guest resource mapping

2017-10-17 Thread Paul Durrant
This series introduces support for direct mapping of guest resources.
The resources are:
 - IOREQ server pages
 - Grant tables

v12:
 - Responded to more comments from Jan.

v11:
 - Responded to more comments from Jan.

v10:
 - Responded to comments from Jan.

v9:
 - Change to patch #1 only.

v8:
 - Re-ordered series and dropped two patches that have already been
committed.

v7:
 - Fixed assertion failure hit during domain destroy.

v6:
 - Responded to missed comments from Roger.

v5:
 - Responded to review comments from Wei.

v4:
 - Responded to further review comments from Roger.

v3:
 - Dropped original patch #1 since it is covered by Juergen's patch.
 - Added new xenforeignmemorycleanup patch (#4).
 - Replaced the patch introducing the ioreq server 'is_default' flag with
   one that changes the ioreq server list into an array (#8).
  
Paul Durrant (11):
  x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  x86/hvm/ioreq: simplify code and use consistent naming
  x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page
  x86/hvm/ioreq: defer mapping gfns until they are actually requested
  x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  x86/hvm/ioreq: add a new mappable resource type...
  x86/mm: add an extra command to HYPERVISOR_mmu_update...
  tools/libxenforeignmemory: add support for resource mapping
  tools/libxenforeignmemory: reduce xenforeignmemory_restrict code
footprint
  common: add a new mappable resource type: XENMEM_resource_grant_table
  tools/libxenctrl: use new xenforeignmemory API to seed grant table

 tools/flask/policy/modules/xen.if  |   4 +-
 tools/include/xen-sys/Linux/privcmd.h  |  11 +
 tools/libs/devicemodel/core.c  |   8 +
 tools/libs/devicemodel/include/xendevicemodel.h|   6 +-
 tools/libs/foreignmemory/Makefile  |   2 +-
 tools/libs/foreignmemory/core.c|  53 ++
 tools/libs/foreignmemory/freebsd.c |   7 -
 .../libs/foreignmemory/include/xenforeignmemory.h  |  41 +
 tools/libs/foreignmemory/libxenforeignmemory.map   |   5 +
 tools/libs/foreignmemory/linux.c   |  45 ++
 tools/libs/foreignmemory/minios.c  |   7 -
 tools/libs/foreignmemory/netbsd.c  |   7 -
 tools/libs/foreignmemory/private.h |  43 +-
 tools/libs/foreignmemory/solaris.c |   7 -
 tools/libxc/include/xc_dom.h   |   8 +-
 tools/libxc/xc_dom_boot.c  | 114 ++-
 tools/libxc/xc_sr_restore_x86_hvm.c|  10 +-
 tools/libxc/xc_sr_restore_x86_pv.c |   2 +-
 tools/libxl/libxl_dom.c|   1 -
 tools/python/xen/lowlevel/xc/xc.c  |   6 +-
 xen/arch/x86/hvm/dm.c  |   9 +-
 xen/arch/x86/hvm/ioreq.c   | 831 -
 xen/arch/x86/mm.c  |  39 +-
 xen/arch/x86/mm/p2m.c  |   3 +-
 xen/common/compat/memory.c |  95 +++
 xen/common/grant_table.c   |  49 +-
 xen/common/memory.c| 142 
 xen/include/asm-arm/p2m.h  |   6 +
 xen/include/asm-x86/hvm/domain.h   |  14 +-
 xen/include/asm-x86/hvm/ioreq.h|   2 +
 xen/include/asm-x86/mm.h   |   5 +
 xen/include/asm-x86/p2m.h  |   3 +
 xen/include/public/hvm/dm_op.h |  36 +-
 xen/include/public/memory.h|  58 +-
 xen/include/public/xen.h   |  12 +-
 xen/include/xen/grant_table.h  |   4 +
 xen/include/xlat.lst   |   1 +
 xen/include/xsm/dummy.h|   6 +
 xen/include/xsm/xsm.h  |   6 +
 xen/xsm/dummy.c|   1 +
 xen/xsm/flask/hooks.c  |   6 +
 xen/xsm/flask/policy/access_vectors|   2 +
 42 files changed, 1223 insertions(+), 494 deletions(-)

---
Cc: Daniel De Graaf <dgde...@tycho.nsa.gov>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: "Marek Marczykowski-Górecki" <marma...@invisiblethingslab.com>
Cc: Paul Durrant <paul.durr...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Julien Grall <julien.gr...@arm.com>

-- 
2.11.0




[Xen-devel] [PATCH v12 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-17 Thread Paul Durrant
Certain memory resources associated with a guest are not necessarily
present in the guest P2M.

This patch adds the boilerplate for a new memory op to allow such a resource
to be priv-mapped directly, by either a PV or HVM tools domain.

NOTE: Whilst the new op is not intrinsically specific to the x86 architecture,
  I have no means to test it on an ARM platform and so cannot verify
  that it functions correctly.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: Wei Liu <wei.l...@citrix.com>
Cc: Daniel De Graaf <dgde...@tycho.nsa.gov>
Cc: Julien Grall <julien.gr...@arm.com>

v12:
 - Addressed more comments from Jan.
 - Removed #ifdef CONFIG_X86 from common code and instead introduced a
   stub set_foreign_p2m_entry() in asm-arm/p2m.h returning -EOPNOTSUPP.
 - Restricted mechanism for querying implementation limit on nr_frames
   and simplified compat code.

v11:
 - Addressed more comments from Jan.

v9:
 - Addressed more comments from Jan.

v8:
 - Move the code into common as requested by Jan.
 - Make the gmfn_list handle a 64-bit type to avoid limiting the MFN
   range for a 32-bit tools domain.
 - Add missing pad.
 - Add compat code.
 - Make this patch deal with purely boilerplate.
 - Drop George's A-b and Wei's R-b because the changes are non-trivial,
   and update Cc list now the boilerplate is common.

v5:
 - Switched __copy_to/from_guest_offset() to copy_to/from_guest_offset().
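
The v8 note about making the gmfn_list handle a 64-bit type can be
illustrated with a small stand-alone example. It assumes, purely for
illustration, that a 32-bit caller would otherwise store frame numbers in
a 32-bit integer; the toy_* names are hypothetical and none of this is
the actual hypercall structure.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed width of a frame number as stored by a 32-bit caller. */
typedef uint32_t toy_frame32_t;
/* Fixed-width alternative, identical for 32-bit and 64-bit callers. */
typedef uint64_t toy_frame64_t;

/* Copy frame numbers into a 32-bit list; high bits are silently lost. */
static void toy_fill32(toy_frame32_t dst[], const uint64_t src[],
                       unsigned int nr)
{
    unsigned int i;

    assert(dst && src);

    for ( i = 0; i < nr; i++ )
        dst[i] = (toy_frame32_t)src[i];
}

/* Copy frame numbers into a 64-bit list; every value survives intact. */
static void toy_fill64(toy_frame64_t dst[], const uint64_t src[],
                       unsigned int nr)
{
    unsigned int i;

    assert(dst && src);

    for ( i = 0; i < nr; i++ )
        dst[i] = src[i];
}
```

A frame number of 2^32 or above cannot be represented in the narrower
list, so a fixed 64-bit element type keeps the interface independent of
the caller's word size.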
---
 tools/flask/policy/modules/xen.if   |  4 +-
 xen/arch/x86/mm/p2m.c   |  3 +-
 xen/common/compat/memory.c  | 95 +
 xen/common/memory.c | 94 
 xen/include/asm-arm/p2m.h   |  6 +++
 xen/include/asm-x86/p2m.h   |  3 ++
 xen/include/public/memory.h | 43 -
 xen/include/xlat.lst|  1 +
 xen/include/xsm/dummy.h |  6 +++
 xen/include/xsm/xsm.h   |  6 +++
 xen/xsm/dummy.c |  1 +
 xen/xsm/flask/hooks.c   |  6 +++
 xen/xsm/flask/policy/access_vectors |  2 +
 13 files changed, 266 insertions(+), 4 deletions(-)

diff --git a/tools/flask/policy/modules/xen.if 
b/tools/flask/policy/modules/xen.if
index 55437496f6..07cba8a15d 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -52,7 +52,8 @@ define(`create_domain_common', `
settime setdomainhandle getvcpucontext set_misc_info };
allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim
set_max_evtchn set_vnumainfo get_vnumainfo cacheflush
-   psr_cmt_op psr_cat_op soft_reset set_gnttab_limits };
+   psr_cmt_op psr_cat_op soft_reset set_gnttab_limits
+   resource_map };
allow $1 $2:security check_context;
allow $1 $2:shadow enable;
allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage 
mmuext_op updatemp };
@@ -152,6 +153,7 @@ define(`device_model', `
allow $1 $2_target:domain { getdomaininfo shutdown };
allow $1 $2_target:mmu { map_read map_write adjust physmap target_hack 
};
allow $1 $2_target:hvm { getparam setparam hvmctl cacheattr dm };
+   allow $1 $2_target:domain2 resource_map;
 ')
 
 # make_device_model(priv, dm_dom, hvm_dom)
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index c72a3cdebb..71bb9b4f93 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1132,8 +1132,7 @@ static int set_typed_p2m_entry(struct domain *d, unsigned 
long gfn_l,
 }
 
 /* Set foreign mfn in the given guest's p2m table. */
-static int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
- mfn_t mfn)
+int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
 {
 return set_typed_p2m_entry(d, gfn, mfn, PAGE_ORDER_4K, p2m_map_foreign,
p2m_get_hostp2m(d)->default_access);
diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index 35bb259808..7f2e2e3107 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -71,6 +71,7 @@ int compat_memory_op(unsigned int cmd, 
XEN_GUEST_HANDLE_PARAM(void) compat)
 struct xen_remove_from_physmap *xrfp;
 struct xen_vnuma_topology_info *vnuma;
 struct xen_mem_access_op *mao;
+struct xen_mem_acquire_resource *mar;
 } nat;
 union {
 struct compat_memory_reservation rsrv;
@@ -79,6 +

[Xen-devel] [PATCH v12 09/11] tools/libxenforeignmemory: reduce xenforeignmemory_restrict code footprint

2017-10-17 Thread Paul Durrant
By using a static inline stub in private.h for OSes where this functionality
is not implemented, the various duplicate stubs in the OS-specific source
modules can be avoided.
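
The pattern can be sketched in isolation as follows. This is not the
private.h content: struct toy_handle, toy_osdep_restrict(),
toy_restrict() and the TOY_OS_HAS_RESTRICT macro are hypothetical names
used only to show one shared stub replacing per-OS copies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_handle {
    int fd;
};

#ifdef TOY_OS_HAS_RESTRICT
/* Real implementation lives in the OS-specific source file. */
int toy_osdep_restrict(struct toy_handle *h, unsigned int domid);
#else
/* Single shared stub for every OS without an implementation. */
static inline int toy_osdep_restrict(struct toy_handle *h, unsigned int domid)
{
    (void)h;
    (void)domid;

    errno = EOPNOTSUPP;
    return -1;
}
#endif

/* Public entry point: identical regardless of which variant is built. */
static int toy_restrict(struct toy_handle *h, unsigned int domid)
{
    assert(h != NULL);

    return toy_osdep_restrict(h, domid);
}
```

Because the stub is static inline in the shared header, an OS that lacks
the feature needs no code of its own, and adding support later only means
defining the macro and supplying the function.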

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
---
Cc: Ian Jackson <ian.jack...@eu.citrix.com>

v4:
 - Removed extraneous freebsd code.

v3:
 - Patch added in response to review comments.
---
 tools/libs/foreignmemory/freebsd.c |  7 ---
 tools/libs/foreignmemory/minios.c  |  7 ---
 tools/libs/foreignmemory/netbsd.c  |  7 ---
 tools/libs/foreignmemory/private.h | 12 +---
 tools/libs/foreignmemory/solaris.c |  7 ---
 5 files changed, 9 insertions(+), 31 deletions(-)

diff --git a/tools/libs/foreignmemory/freebsd.c 
b/tools/libs/foreignmemory/freebsd.c
index dec447485a..6e6bc4b11f 100644
--- a/tools/libs/foreignmemory/freebsd.c
+++ b/tools/libs/foreignmemory/freebsd.c
@@ -95,13 +95,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle 
*fmem,
 return munmap(addr, num << PAGE_SHIFT);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid)
-{
-errno = -EOPNOTSUPP;
-return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/minios.c 
b/tools/libs/foreignmemory/minios.c
index 75f340122e..43341ca301 100644
--- a/tools/libs/foreignmemory/minios.c
+++ b/tools/libs/foreignmemory/minios.c
@@ -58,13 +58,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle 
*fmem,
 return munmap(addr, num << PAGE_SHIFT);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid)
-{
-errno = -EOPNOTSUPP;
-return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/netbsd.c 
b/tools/libs/foreignmemory/netbsd.c
index 9bf95ef4f0..54a418ebd6 100644
--- a/tools/libs/foreignmemory/netbsd.c
+++ b/tools/libs/foreignmemory/netbsd.c
@@ -100,13 +100,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle 
*fmem,
 return munmap(addr, num*XC_PAGE_SIZE);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid)
-{
-errno = -EOPNOTSUPP;
-return -1;
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/private.h 
b/tools/libs/foreignmemory/private.h
index 80b22bdbfc..b5d5f0a354 100644
--- a/tools/libs/foreignmemory/private.h
+++ b/tools/libs/foreignmemory/private.h
@@ -32,9 +32,6 @@ void *osdep_xenforeignmemory_map(xenforeignmemory_handle 
*fmem,
 int osdep_xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
  void *addr, size_t num);
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid);
-
 #if defined(__NetBSD__) || defined(__sun__)
 /* Strictly compat for those two only only */
 void *compat_mapforeign_batch(xenforeignmem_handle *fmem, uint32_t dom,
@@ -54,6 +51,13 @@ struct xenforeignmemory_resource_handle {
 };
 
 #ifndef __linux__
+static inline int osdep_xenforeignmemory_restrict(xenforeignmemory_handle 
*fmem,
+  domid_t domid)
+{
+errno = EOPNOTSUPP;
+return -1;
+}
+
 static inline int osdep_xenforeignmemory_map_resource(
 xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
 {
@@ -67,6 +71,8 @@ static inline int osdep_xenforeignmemory_unmap_resource(
 return 0;
 }
 #else
+int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
+domid_t domid);
 int osdep_xenforeignmemory_map_resource(
 xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres);
 int osdep_xenforeignmemory_unmap_resource(
diff --git a/tools/libs/foreignmemory/solaris.c 
b/tools/libs/foreignmemory/solaris.c
index a33decb4ae..ee8aae4fbd 100644
--- a/tools/libs/foreignmemory/solaris.c
+++ b/tools/libs/foreignmemory/solaris.c
@@ -97,13 +97,6 @@ int osdep_xenforeignmemory_unmap(xenforeignmemory_handle 
*fmem,
 return munmap(addr, num*XC_PAGE_SIZE);
 }
 
-int osdep_xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
-domid_t domid)
-{
-errno = -EOPNOTSUPP;
-return -1;
-}
-
 /*
  * Local variables:
  * mode: C
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
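
For readers skimming the diff above, the stub pattern can be sketched in isolation. This is a minimal standalone illustration, not code from the series: the demo_ names and the DEMO_HAVE_RESTRICT switch are hypothetical stand-ins for the library's real types and its __linux__ test. Note that errno takes the positive EOPNOTSUPP value, as the new stub in private.h does.

```c
#include <errno.h>

/* Hypothetical stand-ins for the library's handle and domid types. */
typedef struct demo_handle { int fd; } demo_handle;
typedef unsigned short demo_domid_t;

#ifdef DEMO_HAVE_RESTRICT
/* An OS that implements the operation supplies a real definition elsewhere. */
int demo_osdep_restrict(demo_handle *h, demo_domid_t domid);
#else
/* Every other OS shares this single stub instead of carrying its own copy. */
static inline int demo_osdep_restrict(demo_handle *h, demo_domid_t domid)
{
    (void)h;
    (void)domid;
    errno = EOPNOTSUPP; /* errno values are positive */
    return -1;
}
#endif
```

A caller sees -1 with errno set to EOPNOTSUPP wherever the operation is not implemented, without any per-OS source file needing its own stub.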


[Xen-devel] [PATCH v12 01/11] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list

2017-10-17 Thread Paul Durrant
A subsequent patch will remove the current implicit limitation on creation
of ioreq servers which is due to the allocation of gfns for the ioreq
structures and buffered ioreq ring.

It will therefore be necessary to introduce an explicit limit and, since
this limit should be small, it simplifies the code to maintain an array of
that size rather than using a list.

Also, by reserving an array slot for the default server and populating
array slots early in create, the need to pass an 'is_default' boolean
to sub-functions can be avoided.

Some function return values are changed by this patch: Specifically, in
the case where the id of the default ioreq server is passed in, -EOPNOTSUPP
is now returned rather than -ENOENT.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Reviewed-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>

v10:
 - modified FOR_EACH... macro as suggested by Jan.
 - check for NULL in IS_DEFAULT macro as suggested by Jan.

v9:
 - modified FOR_EACH... macro as requested by Andrew.

v8:
 - Addressed various comments from Jan.

v7:
 - Fixed assertion failure found in testing.

v6:
 - Updated according to comments made by Roger on v4 that I'd missed.

v5:
 - Switched GET/SET_IOREQ_SERVER() macros to get/set_ioreq_server()
   functions to avoid possible double-evaluation issues.

v4:
 - Introduced more helper macros and relocated them to the top of the
   code.

v3:
 - New patch (replacing "move is_default into struct hvm_ioreq_server") in
   response to review comments.
---
 xen/arch/x86/hvm/ioreq.c | 502 +++
 xen/include/asm-x86/hvm/domain.h |  10 +-
 2 files changed, 245 insertions(+), 267 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index f2e0b3f74a..e6ccc7572a 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -33,6 +33,37 @@
 
 #include 
 
+static void set_ioreq_server(struct domain *d, unsigned int id,
+ struct hvm_ioreq_server *s)
+{
+ASSERT(id < MAX_NR_IOREQ_SERVERS);
+ASSERT(!s || !d->arch.hvm_domain.ioreq_server.server[id]);
+
+d->arch.hvm_domain.ioreq_server.server[id] = s;
+}
+
+#define GET_IOREQ_SERVER(d, id) \
+(d)->arch.hvm_domain.ioreq_server.server[id]
+
+static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
+ unsigned int id)
+{
+if ( id >= MAX_NR_IOREQ_SERVERS )
+return NULL;
+
+return GET_IOREQ_SERVER(d, id);
+}
+
+#define IS_DEFAULT(s) \
+((s) && (s) == GET_IOREQ_SERVER((s)->domain, DEFAULT_IOSERVID))
+
+/* Iterate over all possible ioreq servers */
+#define FOR_EACH_IOREQ_SERVER(d, id, s) \
+for ( (id) = 0; (id) < MAX_NR_IOREQ_SERVERS; (id)++ ) \
+if ( !(s = GET_IOREQ_SERVER(d, id)) ) \
+continue; \
+else
+
 static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
 {
 shared_iopage_t *p = s->ioreq.va;
@@ -47,10 +78,9 @@ bool hvm_io_pending(struct vcpu *v)
 {
 struct domain *d = v->domain;
 struct hvm_ioreq_server *s;
+unsigned int id;
 
-list_for_each_entry ( s,
-  &d->arch.hvm_domain.ioreq_server.list,
-  list_entry )
+FOR_EACH_IOREQ_SERVER(d, id, s)
 {
 struct hvm_ioreq_vcpu *sv;
 
@@ -127,10 +157,9 @@ bool handle_hvm_io_completion(struct vcpu *v)
 struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
 struct hvm_ioreq_server *s;
 enum hvm_io_completion io_completion;
+unsigned int id;
 
-  list_for_each_entry ( s,
-  &d->arch.hvm_domain.ioreq_server.list,
-  list_entry )
+FOR_EACH_IOREQ_SERVER(d, id, s)
 {
 struct hvm_ioreq_vcpu *sv;
 
@@ -243,13 +272,12 @@ static int hvm_map_ioreq_page(
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
 {
 const struct hvm_ioreq_server *s;
+unsigned int id;
 bool found = false;
 
 spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
-list_for_each_entry ( s,
-  &d->arch.hvm_domain.ioreq_server.list,
-  list_entry )
+FOR_EACH_IOREQ_SERVER(d, id, s)
 {
 if ( (s->ioreq.va && s->ioreq.page == page) ||
  (s->bufioreq.va && s->bufioreq.page == page) )
@@ -302,7 +330,7 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server 
*s,
 }
 
 static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
- bool is_default, struct vcpu *v)
+ struct vcpu *v)
 {
 struct hvm_ioreq_vcpu *sv;
 int rc;
@@ -331,7 +359,7 @@ static int hvm_ioreq_server_add_vcpu(struct 
hvm_ioreq_server *s,
 goto fa
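
The array-of-slots scheme introduced by this patch can be illustrated with a small standalone sketch. It is a simplified model under stated assumptions, not the Xen code: the demo_ names and the limit of 8 are hypothetical, and assert() stands in for Xen's ASSERT().

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_SERVERS 8 /* hypothetical limit, not Xen's value */

struct demo_server { unsigned int tag; };

struct demo_domain {
    struct demo_server *server[DEMO_MAX_SERVERS];
};

/* Store s in slot id; a slot may only be filled when empty, or cleared. */
static void demo_set_server(struct demo_domain *d, unsigned int id,
                            struct demo_server *s)
{
    assert(id < DEMO_MAX_SERVERS);
    assert(!s || !d->server[id]);
    d->server[id] = s;
}

/* Bounds-checked lookup: out-of-range ids yield NULL. */
static struct demo_server *demo_get_server(const struct demo_domain *d,
                                           unsigned int id)
{
    if ( id >= DEMO_MAX_SERVERS )
        return NULL;
    return d->server[id];
}

/* Visit only populated slots; the trailing 'else' binds the loop body. */
#define DEMO_FOR_EACH_SERVER(d, id, s) \
    for ( (id) = 0; (id) < DEMO_MAX_SERVERS; (id)++ ) \
        if ( !((s) = (d)->server[id]) ) \
            continue; \
        else

/* Count populated slots using the iteration macro. */
static unsigned int demo_count_servers(struct demo_domain *d)
{
    struct demo_server *s;
    unsigned int id, n = 0;

    DEMO_FOR_EACH_SERVER(d, id, s)
    {
        (void)s;
        n++;
    }
    return n;
}
```

The if/continue/else shape of the macro lets callers follow it with an ordinary braced block while empty slots are skipped, which is why no is_default flag or list head needs to be passed around.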

[Xen-devel] [PATCH v12 08/11] tools/libxenforeignmemory: add support for resource mapping

2017-10-17 Thread Paul Durrant
A previous patch introduced a new HYPERVISOR_memory_op to acquire guest
resources for direct priv-mapping.

This patch adds new functionality into libxenforeignmemory to make use
of a new privcmd ioctl [1] that uses the new memory op to make such
resources available via mmap(2).

[1] 
http://xenbits.xen.org/gitweb/?p=people/pauldu/linux.git;a=commit;h=ce59a05e6712

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Reviewed-by: Wei Liu <wei.l...@citrix.com>
---
Cc: Ian Jackson <ian.jack...@eu.citrix.com>

v4:
 - Fixed errno and removed single-use label
 - The unmap call now returns a status
 - Use C99 initialization for ioctl struct

v2:
 - Bump minor version up to 3.
---
 tools/include/xen-sys/Linux/privcmd.h  | 11 +
 tools/libs/foreignmemory/Makefile  |  2 +-
 tools/libs/foreignmemory/core.c| 53 ++
 .../libs/foreignmemory/include/xenforeignmemory.h  | 41 +
 tools/libs/foreignmemory/libxenforeignmemory.map   |  5 ++
 tools/libs/foreignmemory/linux.c   | 45 ++
 tools/libs/foreignmemory/private.h | 31 +
 7 files changed, 187 insertions(+), 1 deletion(-)

diff --git a/tools/include/xen-sys/Linux/privcmd.h 
b/tools/include/xen-sys/Linux/privcmd.h
index 732ff7c15a..9531b728f9 100644
--- a/tools/include/xen-sys/Linux/privcmd.h
+++ b/tools/include/xen-sys/Linux/privcmd.h
@@ -86,6 +86,15 @@ typedef struct privcmd_dm_op {
const privcmd_dm_op_buf_t __user *ubufs;
 } privcmd_dm_op_t;
 
+typedef struct privcmd_mmap_resource {
+   domid_t dom;
+   __u32 type;
+   __u32 id;
+   __u32 idx;
+   __u64 num;
+   __u64 addr;
+} privcmd_mmap_resource_t;
+
 /*
  * @cmd: IOCTL_PRIVCMD_HYPERCALL
  * @arg: &privcmd_hypercall_t
@@ -103,5 +112,7 @@ typedef struct privcmd_dm_op {
_IOC(_IOC_NONE, 'P', 5, sizeof(privcmd_dm_op_t))
 #define IOCTL_PRIVCMD_RESTRICT \
_IOC(_IOC_NONE, 'P', 6, sizeof(domid_t))
+#define IOCTL_PRIVCMD_MMAP_RESOURCE\
+   _IOC(_IOC_NONE, 'P', 7, sizeof(privcmd_mmap_resource_t))
 
 #endif /* __LINUX_PUBLIC_PRIVCMD_H__ */
diff --git a/tools/libs/foreignmemory/Makefile 
b/tools/libs/foreignmemory/Makefile
index ab7f873f26..5c7f78f61d 100644
--- a/tools/libs/foreignmemory/Makefile
+++ b/tools/libs/foreignmemory/Makefile
@@ -2,7 +2,7 @@ XEN_ROOT = $(CURDIR)/../../..
 include $(XEN_ROOT)/tools/Rules.mk
 
 MAJOR= 1
-MINOR= 2
+MINOR= 3
 SHLIB_LDFLAGS += -Wl,--version-script=libxenforeignmemory.map
 
 CFLAGS   += -Werror -Wmissing-prototypes
diff --git a/tools/libs/foreignmemory/core.c b/tools/libs/foreignmemory/core.c
index a6897dc561..8d3f9f178f 100644
--- a/tools/libs/foreignmemory/core.c
+++ b/tools/libs/foreignmemory/core.c
@@ -17,6 +17,8 @@
 #include 
 #include 
 
+#include 
+
 #include "private.h"
 
 xenforeignmemory_handle *xenforeignmemory_open(xentoollog_logger *logger,
@@ -120,6 +122,57 @@ int xenforeignmemory_restrict(xenforeignmemory_handle 
*fmem,
 return osdep_xenforeignmemory_restrict(fmem, domid);
 }
 
+xenforeignmemory_resource_handle *xenforeignmemory_map_resource(
+xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+unsigned int id, unsigned long frame, unsigned long nr_frames,
+void **paddr, int prot, int flags)
+{
+xenforeignmemory_resource_handle *fres;
+int rc;
+
+/* Check flags only contains POSIX defined values */
+if ( flags & ~(MAP_SHARED | MAP_PRIVATE) )
+{
+errno = EINVAL;
+return NULL;
+}
+
+fres = calloc(1, sizeof(*fres));
+if ( !fres )
+{
+errno = ENOMEM;
+return NULL;
+}
+
+fres->domid = domid;
+fres->type = type;
+fres->id = id;
+fres->frame = frame;
+fres->nr_frames = nr_frames;
+fres->addr = *paddr;
+fres->prot = prot;
+fres->flags = flags;
+
+rc = osdep_xenforeignmemory_map_resource(fmem, fres);
+if ( rc )
+{
+free(fres);
+fres = NULL;
+} else
+*paddr = fres->addr;
+
+return fres;
+}
+
+int xenforeignmemory_unmap_resource(
+xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
+{
+int rc = osdep_xenforeignmemory_unmap_resource(fmem, fres);
+
+free(fres);
+return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/include/xenforeignmemory.h 
b/tools/libs/foreignmemory/include/xenforeignmemory.h
index f4814c390f..d594be8df0 100644
--- a/tools/libs/foreignmemory/include/xenforeignmemory.h
+++ b/tools/libs/foreignmemory/include/xenforeignmemory.h
@@ -138,6 +138,47 @@ int xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
 int xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
   domid_t domid);
 
+typedef struct x
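
The control flow of xenforeignmemory_map_resource() in core.c above (validate flags, allocate a handle, call the OS-dependent hook, free on failure, pass the address back) can be modelled without the library. The following is a sketch under stated assumptions: struct demo_res, the backend function-pointer hook and both demo backends are hypothetical and exist only to make the example self-contained.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>

struct demo_res { void *addr; int prot; int flags; };

/* Hypothetical hook standing in for the OS-dependent mapping call. */
typedef int (*demo_backend_fn)(struct demo_res *res);

static char demo_window[16]; /* stand-in for a mapped region */

/* Backend that "maps" successfully and reports where. */
static int demo_backend_ok(struct demo_res *res)
{
    res->addr = demo_window;
    return 0;
}

/* Backend for an OS without support. */
static int demo_backend_fail(struct demo_res *res)
{
    (void)res;
    errno = EOPNOTSUPP;
    return -1;
}

static struct demo_res *demo_map(demo_backend_fn backend, void **paddr,
                                 int prot, int flags)
{
    struct demo_res *res;

    /* Only the POSIX-defined sharing flags are accepted. */
    if ( flags & ~(MAP_SHARED | MAP_PRIVATE) )
    {
        errno = EINVAL;
        return NULL;
    }

    res = calloc(1, sizeof(*res));
    if ( !res )
    {
        errno = ENOMEM;
        return NULL;
    }

    res->addr = *paddr; /* caller's address hint */
    res->prot = prot;
    res->flags = flags;

    if ( backend(res) )
    {
        free(res);
        return NULL;
    }

    *paddr = res->addr; /* report where the mapping landed */
    return res;
}
```

Using paddr as an in/out parameter keeps the return value free to carry the handle, which the caller later hands to the unmap function.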

[Xen-devel] [PATCH v12 07/11] x86/mm: add an extra command to HYPERVISOR_mmu_update...

2017-10-17 Thread Paul Durrant
...to allow the calling domain to prevent translation of specified l1e
value.

Despite what the comment in public/xen.h might imply, specifying a
command value of MMU_NORMAL_PT_UPDATE will not simply update an l1e with
the specified value. Instead, mod_l1_entry() tests whether foreign_dom
has PG_translate set in its paging mode and, if it does, assumes that the
pfn value in the l1e is a gfn rather than an mfn.

To allow a PV tools domain to map mfn values from a previously issued
HYPERVISOR_memory_op:XENMEM_acquire_resource, there needs to be a way
to tell HYPERVISOR_mmu_update that the specific l1e value does not
require translation regardless of the paging mode of foreign_dom. This
patch therefore defines a new command value, MMU_PT_UPDATE_NO_TRANSLATE,
which has the same semantics as MMU_NORMAL_PT_UPDATE except that the
paging mode of foreign_dom is ignored and the l1e value is used verbatim.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: Wei Liu <wei.l...@citrix.com>

v8:
 - New in this version, replacing "allow a privileged PV domain to map
   guest mfns".
---
 xen/arch/x86/mm.c| 17 ++---
 xen/include/public/xen.h | 12 +---
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 1d15ae2a15..63539d5d0b 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1619,9 +1619,10 @@ void page_unlock(struct page_info *page)
 
 /* Update the L1 entry at pl1e to new value nl1e. */
 static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e,
-unsigned long gl1mfn, int preserve_ad,
+unsigned long gl1mfn, unsigned int cmd,
 struct vcpu *pt_vcpu, struct domain *pg_dom)
 {
+bool preserve_ad = (cmd == MMU_PT_UPDATE_PRESERVE_AD);
 l1_pgentry_t ol1e;
 struct domain *pt_dom = pt_vcpu->domain;
 int rc = 0;
@@ -1643,7 +1644,8 @@ static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t 
nl1e,
 return -EINVAL;
 }
 
-if ( paging_mode_translate(pg_dom) )
+if ( cmd != MMU_PT_UPDATE_NO_TRANSLATE &&
+ paging_mode_translate(pg_dom) )
 {
 page = get_page_from_gfn(pg_dom, l1e_get_pfn(nl1e), NULL, 
P2M_ALLOC);
 if ( !page )
@@ -3258,6 +3260,7 @@ long do_mmu_update(
  */
 case MMU_NORMAL_PT_UPDATE:
 case MMU_PT_UPDATE_PRESERVE_AD:
+case MMU_PT_UPDATE_NO_TRANSLATE:
 {
 p2m_type_t p2mt;
 
@@ -3323,7 +3326,8 @@ long do_mmu_update(
 p2m_query_t q = (l1e_get_flags(l1e) & _PAGE_RW) ?
 P2M_UNSHARE : P2M_ALLOC;
 
-if ( paging_mode_translate(pg_owner) )
+if ( cmd != MMU_PT_UPDATE_NO_TRANSLATE &&
+ paging_mode_translate(pg_owner) )
 target = get_page_from_gfn(pg_owner, l1e_get_pfn(l1e),
 &l1e_p2mt, q);
 
@@ -3350,9 +3354,7 @@ long do_mmu_update(
 break;
 }
 
-rc = mod_l1_entry(va, l1e, mfn,
-  cmd == MMU_PT_UPDATE_PRESERVE_AD, v,
-  pg_owner);
+rc = mod_l1_entry(va, l1e, mfn, cmd, v, pg_owner);
 if ( target )
 put_page(target);
 }
@@ -3630,7 +3632,8 @@ static int __do_update_va_mapping(
 goto out;
 }
 
-rc = mod_l1_entry(pl1e, val, mfn_x(gl1mfn), 0, v, pg_owner);
+rc = mod_l1_entry(pl1e, val, mfn_x(gl1mfn), MMU_NORMAL_PT_UPDATE, v,
+  pg_owner);
 
 page_unlock(gl1pg);
 put_page(gl1pg);
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 2ac6b1e24d..d2014a39eb 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -268,6 +268,10 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
  * As MMU_NORMAL_PT_UPDATE above, but A/D bits currently in the PTE are ORed
  * with those in @val.
  *
+ * ptr[1:0] == MMU_PT_UPDATE_NO_TRANSLATE:
+ * As MMU_NORMAL_PT_UPDATE above, but @val is not translated though FD
+ * page tables.
+ *
  * @val is usually the machine frame number along with some attributes.
  * The attributes by default follow the architecture defined bits. Meaning that
  * if this is a X86_64 machine and four page table layout is used, the layout
@@ -334,9 +338,11 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
  *
  * PAT (bit 7 on) --> PWT (bit 3 on) and
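
The decision this patch changes can be reduced to a few lines. The sketch below is a standalone model, not hypervisor code: the DEMO_ command values are illustrative (the real ones live in public/xen.h) and the helper names are hypothetical. It shows the command being taken from ptr[1:0] and the two properties mod_l1_entry() now derives from it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative command encodings carried in ptr[1:0]. */
#define DEMO_NORMAL_PT_UPDATE       0u
#define DEMO_MACHPHYS_UPDATE        1u
#define DEMO_PT_UPDATE_PRESERVE_AD  2u
#define DEMO_PT_UPDATE_NO_TRANSLATE 3u

/* The command travels in the two low bits of the request's ptr field. */
static unsigned int demo_cmd_of(uint64_t ptr)
{
    return (unsigned int)(ptr & 3);
}

/*
 * Mirror of the patched condition: the l1e value is treated as a gfn only
 * if the foreign domain is in translated mode and the caller has not
 * opted out of translation.
 */
static bool demo_needs_translation(unsigned int cmd, bool dom_translated)
{
    return cmd != DEMO_PT_UPDATE_NO_TRANSLATE && dom_translated;
}

/* A/D preservation is now derived from the command inside the callee. */
static bool demo_preserve_ad(unsigned int cmd)
{
    return cmd == DEMO_PT_UPDATE_PRESERVE_AD;
}
```

Passing the whole command down, rather than a single preserve_ad boolean, is what lets one extra value change the translation behaviour without widening the function signature further.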

[Xen-devel] [PATCH v12 03/11] x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page

2017-10-17 Thread Paul Durrant
This patch adjusts the ioreq server code to use type-safe gfn_t values
where possible. No functional change.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Reviewed-by: Wei Liu <wei.l...@citrix.com>
Acked-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>
---
 xen/arch/x86/hvm/ioreq.c | 44 
 xen/include/asm-x86/hvm/domain.h |  2 +-
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 6d81018369..64bb13cec9 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -210,7 +210,7 @@ bool handle_hvm_io_completion(struct vcpu *v)
 return true;
 }
 
-static unsigned long hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
+static gfn_t hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
 {
 struct domain *d = s->domain;
 unsigned int i;
@@ -220,20 +220,19 @@ static unsigned long hvm_alloc_ioreq_gfn(struct 
hvm_ioreq_server *s)
 for ( i = 0; i < sizeof(d->arch.hvm_domain.ioreq_gfn.mask) * 8; i++ )
 {
 if ( test_and_clear_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask) )
-return d->arch.hvm_domain.ioreq_gfn.base + i;
+return _gfn(d->arch.hvm_domain.ioreq_gfn.base + i);
 }
 
-return gfn_x(INVALID_GFN);
+return INVALID_GFN;
 }
 
-static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s,
-   unsigned long gfn)
+static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s, gfn_t gfn)
 {
 struct domain *d = s->domain;
-unsigned int i = gfn - d->arch.hvm_domain.ioreq_gfn.base;
+unsigned int i = gfn_x(gfn) - d->arch.hvm_domain.ioreq_gfn.base;
 
 ASSERT(!IS_DEFAULT(s));
-ASSERT(gfn != gfn_x(INVALID_GFN));
+ASSERT(!gfn_eq(gfn, INVALID_GFN));
 
 set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 }
@@ -242,7 +241,7 @@ static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 {
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
-if ( iorp->gfn == gfn_x(INVALID_GFN) )
+if ( gfn_eq(iorp->gfn, INVALID_GFN) )
 return;
 
 destroy_ring_for_helper(&iorp->va, iorp->page);
@@ -251,7 +250,7 @@ static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 if ( !IS_DEFAULT(s) )
 hvm_free_ioreq_gfn(s, iorp->gfn);
 
-iorp->gfn = gfn_x(INVALID_GFN);
+iorp->gfn = INVALID_GFN;
 }
 
 static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
@@ -264,16 +263,17 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 return -EINVAL;
 
 if ( IS_DEFAULT(s) )
-iorp->gfn = buf ?
-d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
-d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
+iorp->gfn = _gfn(buf ?
+ d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
+ d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN]);
 else
 iorp->gfn = hvm_alloc_ioreq_gfn(s);
 
-if ( iorp->gfn == gfn_x(INVALID_GFN) )
+if ( gfn_eq(iorp->gfn, INVALID_GFN) )
 return -ENOMEM;
 
-rc = prepare_ring_for_helper(d, iorp->gfn, &iorp->page, &iorp->va);
+rc = prepare_ring_for_helper(d, gfn_x(iorp->gfn), &iorp->page,
+ &iorp->va);
 
 if ( rc )
 hvm_unmap_ioreq_gfn(s, buf);
@@ -309,10 +309,10 @@ static void hvm_remove_ioreq_gfn(struct hvm_ioreq_server 
*s, bool buf)
 struct domain *d = s->domain;
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
-if ( IS_DEFAULT(s) || iorp->gfn == gfn_x(INVALID_GFN) )
+if ( IS_DEFAULT(s) || gfn_eq(iorp->gfn, INVALID_GFN) )
 return;
 
-if ( guest_physmap_remove_page(d, _gfn(iorp->gfn),
+if ( guest_physmap_remove_page(d, iorp->gfn,
_mfn(page_to_mfn(iorp->page)), 0) )
 domain_crash(d);
 clear_page(iorp->va);
@@ -324,12 +324,12 @@ static int hvm_add_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 int rc;
 
-if ( IS_DEFAULT(s) || iorp->gfn == gfn_x(INVALID_GFN) )
+if ( IS_DEFAULT(s) || gfn_eq(iorp->gfn, INVALID_GFN) )
 return 0;
 
 clear_page(iorp->va);
 
-rc = guest_physmap_add_page(d, _gfn(iorp->gfn),
+rc = guest_physmap_add_page(d, iorp->gfn,
 _mfn(page_to_mfn(iorp->page)), 0);
 if ( rc == 0 )
 paging_mark_dirty(d, _mfn(page_to_mfn(iorp->page)));
@@ -590,8 +590,8 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
 INIT_LIST_HEAD(&s->ioreq_vcpu_list);
 spin_lock_init(&s->bufioreq_lock);
 
-s->ioreq.gfn = gfn_x(INVALID_GFN);
-s->
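
The type-safety idea behind gfn_t can be shown with a standalone sketch. This is a simplified model, not the Xen definitions: the demo_ names are hypothetical, and the allocator uses plain bit operations where the patch uses test_and_clear_bit().

```c
#include <stdbool.h>

/* Wrapping the integer in a struct makes frame numbers a distinct type. */
typedef struct { unsigned long gfn; } demo_gfn_t;

static inline demo_gfn_t demo_gfn(unsigned long n)
{
    demo_gfn_t g = { n };
    return g;
}

static inline unsigned long demo_gfn_x(demo_gfn_t g)
{
    return g.gfn;
}

static inline bool demo_gfn_eq(demo_gfn_t a, demo_gfn_t b)
{
    return a.gfn == b.gfn;
}

#define DEMO_INVALID_GFN demo_gfn(~0UL)

/* Claim the first free slot in the mask and return it as a typed gfn. */
static demo_gfn_t demo_alloc_gfn(unsigned long base, unsigned long *mask)
{
    unsigned int i;

    for ( i = 0; i < sizeof(*mask) * 8; i++ )
    {
        if ( *mask & (1UL << i) )
        {
            *mask &= ~(1UL << i);
            return demo_gfn(base + i);
        }
    }

    return DEMO_INVALID_GFN;
}
```

Because demo_gfn_t is a struct, comparing it with == or passing it where an unsigned long is expected fails to compile, so every conversion has to go through demo_gfn(), demo_gfn_x() or demo_gfn_eq(), just as the diff above replaces raw comparisons with gfn_eq().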

[Xen-devel] [PATCH v12 06/11] x86/hvm/ioreq: add a new mappable resource type...

2017-10-17 Thread Paul Durrant
... XENMEM_resource_ioreq_server

This patch adds support for a new resource type that can be mapped using
the XENMEM_acquire_resource memory op.

If an emulator makes use of this resource type then, instead of mapping
gfns, the IOREQ server will allocate pages from the heap. These pages
will never be present in the P2M of the guest at any point and so are
not vulnerable to any direct attack by the guest. They are only ever
accessible by Xen and any domain that has mapping privilege over the
guest (which may or may not be limited to the domain running the emulator).

NOTE: Use of the new resource type is not compatible with use of
  XEN_DMOP_get_ioreq_server_info unless the XEN_DMOP_no_gfns flag is
  set.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>

v12:
 - Addressed more comments from Jan.
 - Dropped George's A-b and Wei's R-b because of material change.

v11:
 - Addressed more comments from Jan.

v10:
 - Addressed comments from Jan.

v8:
 - Re-base on new boilerplate.
 - Adjust function signature of hvm_get_ioreq_server_frame(), and test
   whether the bufioreq page is present.

v5:
 - Use get_ioreq_server() function rather than indexing array directly.
 - Add more explanation into comments to state that mapping guest frames
   and allocation of pages for ioreq servers are not simultaneously
   permitted.
 - Add a comment into asm/ioreq.h stating the meaning of the index
   value passed to hvm_get_ioreq_server_frame().
---
 xen/arch/x86/hvm/ioreq.c| 156 
 xen/arch/x86/mm.c   |  22 ++
 xen/common/memory.c |   5 ++
 xen/include/asm-x86/hvm/ioreq.h |   2 +
 xen/include/asm-x86/mm.h|   5 ++
 xen/include/public/hvm/dm_op.h  |   4 ++
 xen/include/public/memory.h |   9 +++
 7 files changed, 203 insertions(+)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index f654e7796c..2c611fbffa 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -259,6 +259,19 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 int rc;
 
+if ( iorp->page )
+{
+/*
+ * If a page has already been allocated (which will happen on
+ * demand if hvm_get_ioreq_server_frame() is called), then
+ * mapping a guest frame is not permitted.
+ */
+if ( gfn_eq(iorp->gfn, INVALID_GFN) )
+return -EPERM;
+
+return 0;
+}
+
 if ( d->is_dying )
 return -EINVAL;
 
@@ -281,6 +294,70 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 return rc;
 }
 
+static int hvm_alloc_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
+{
+struct domain *currd = current->domain;
+struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+
+if ( iorp->page )
+{
+/*
+ * If a guest frame has already been mapped (which may happen
+ * on demand if hvm_get_ioreq_server_info() is called), then
+ * allocating a page is not permitted.
+ */
+if ( !gfn_eq(iorp->gfn, INVALID_GFN) )
+return -EPERM;
+
+return 0;
+}
+
+/*
+ * Allocated IOREQ server pages are assigned to the emulating
+ * domain, not the target domain. This is because the emulator is
+ * likely to be destroyed after the target domain has been torn
+ * down, and we must use MEMF_no_refcount otherwise page allocation
+ * could fail if the emulating domain has already reached its
+ * maximum allocation.
+ */
+iorp->page = alloc_domheap_page(currd, MEMF_no_refcount);
+if ( !iorp->page )
+return -ENOMEM;
+
+if ( !get_page_type(iorp->page, PGT_writable_page) )
+{
+ASSERT_UNREACHABLE();
+put_page(iorp->page);
+iorp->page = NULL;
+return -ENOMEM;
+}
+
+iorp->va = __map_domain_page_global(iorp->page);
+if ( !iorp->va )
+{
+put_page_and_type(iorp->page);
+iorp->page = NULL;
+return -ENOMEM;
+}
+
+clear_page(iorp->va);
+return 0;
+}
+
+static void hvm_free_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
+{
+struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+
+if ( !iorp->page )
+return;
+
+unmap_domain_page_global(iorp->va);
+iorp->va = NULL;
+
+put_page_and_type(iorp->page);
+iorp->page = NULL;
+}
+
 bool is_ioreq_server_page(struct domain *d, const struct page_info 
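
The mutual exclusion between the two backing schemes (guest frame versus heap page) is the core invariant in the hunks above, and it can be modelled in isolation. The sketch below is a toy state machine under stated assumptions: the demo_ names are hypothetical, and a static buffer stands in for a real page.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_INVALID_GFN (~0UL)

struct demo_ioreq_page {
    unsigned long gfn; /* DEMO_INVALID_GFN unless a guest frame is mapped */
    void *page;        /* non-NULL once backed by either scheme */
};

static char demo_backing[1]; /* stand-in for a real page */

/* Guest-frame path: refuse if the page came from the heap instead. */
static int demo_map_gfn(struct demo_ioreq_page *p, unsigned long gfn)
{
    if ( p->page )
        return (p->gfn == DEMO_INVALID_GFN) ? -EPERM : 0;

    p->gfn = gfn;
    p->page = demo_backing;
    return 0;
}

/* Heap path: refuse if a guest frame has already been mapped. */
static int demo_alloc_mfn(struct demo_ioreq_page *p)
{
    if ( p->page )
        return (p->gfn != DEMO_INVALID_GFN) ? -EPERM : 0;

    p->page = demo_backing; /* gfn deliberately stays invalid */
    return 0;
}
```

Both paths are idempotent for their own scheme and return -EPERM for the other, so whichever request arrives first fixes how the page is backed.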

[Xen-devel] [PATCH v12 04/11] x86/hvm/ioreq: defer mapping gfns until they are actually requested

2017-10-17 Thread Paul Durrant
A subsequent patch will introduce a new scheme to allow an emulator to
map ioreq server pages directly from Xen rather than the guest P2M.

This patch lays the groundwork for that change by deferring mapping of
gfns until their values are requested by an emulator. To that end, the
pad field of the xen_dm_op_get_ioreq_server_info structure is re-purposed
to a flags field and a new flag, XEN_DMOP_no_gfns, is defined which modifies the
behaviour of XEN_DMOP_get_ioreq_server_info to allow the caller to avoid
requesting the gfn values.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
Reviewed-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>

v8:
 - For safety make all of the pointers passed to
   hvm_get_ioreq_server_info() optional.
 - Shrink bufioreq_handling down to a uint8_t.

v3:
 - Updated in response to review comments from Wei and Roger.
 - Added a HANDLE_BUFIOREQ macro to make the code neater.
 - This patch no longer introduces a security vulnerability since there
   is now an explicit limit on the number of ioreq servers that may be
   created for any one domain.
---
 tools/libs/devicemodel/core.c   |  8 +
 tools/libs/devicemodel/include/xendevicemodel.h |  6 ++--
 xen/arch/x86/hvm/dm.c   |  9 +++--
 xen/arch/x86/hvm/ioreq.c| 47 ++---
 xen/include/asm-x86/hvm/domain.h|  2 +-
 xen/include/public/hvm/dm_op.h  | 32 ++---
 6 files changed, 63 insertions(+), 41 deletions(-)

diff --git a/tools/libs/devicemodel/core.c b/tools/libs/devicemodel/core.c
index 0f2c1a791f..91c69d103b 100644
--- a/tools/libs/devicemodel/core.c
+++ b/tools/libs/devicemodel/core.c
@@ -188,6 +188,14 @@ int xendevicemodel_get_ioreq_server_info(
 
 data->id = id;
 
+/*
+ * If the caller is not requesting gfn values then instruct the
+ * hypercall not to retrieve them as this may cause them to be
+ * mapped.
+ */
+if (!ioreq_gfn && !bufioreq_gfn)
+data->flags |= XEN_DMOP_no_gfns;
+
 rc = xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
 if (rc)
 return rc;
diff --git a/tools/libs/devicemodel/include/xendevicemodel.h 
b/tools/libs/devicemodel/include/xendevicemodel.h
index 13216db04a..d73a76da35 100644
--- a/tools/libs/devicemodel/include/xendevicemodel.h
+++ b/tools/libs/devicemodel/include/xendevicemodel.h
@@ -61,11 +61,11 @@ int xendevicemodel_create_ioreq_server(
  * @parm domid the domain id to be serviced
  * @parm id the IOREQ Server id.
  * @parm ioreq_gfn pointer to a xen_pfn_t to receive the synchronous ioreq
- *  gfn
+ *  gfn. (May be NULL if not required)
  * @parm bufioreq_gfn pointer to a xen_pfn_t to receive the buffered ioreq
- *gfn
+ *gfn. (May be NULL if not required)
  * @parm bufioreq_port pointer to a evtchn_port_t to receive the buffered
- * ioreq event channel
+ * ioreq event channel. (May be NULL if not required)
  * @return 0 on success, -1 on failure.
  */
 int xendevicemodel_get_ioreq_server_info(
diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index 9cf53b551c..22fa5b51e3 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -416,16 +416,19 @@ static int dm_op(const struct dmop_args *op_args)
 {
 struct xen_dm_op_get_ioreq_server_info *data =
 &op.u.get_ioreq_server_info;
+const uint16_t valid_flags = XEN_DMOP_no_gfns;
 
 const_op = false;
 
 rc = -EINVAL;
-if ( data->pad )
+if ( data->flags & ~valid_flags )
 break;
 
 rc = hvm_get_ioreq_server_info(d, data->id,
-   &data->ioreq_gfn,
-   &data->bufioreq_gfn,
+   (data->flags & XEN_DMOP_no_gfns) ?
+   NULL : &data->ioreq_gfn,
+   (data->flags & XEN_DMOP_no_gfns) ?
+   NULL : &data->bufioreq_gfn,
 &data->bufioreq_port);
 break;
 }
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 64bb13cec9..f654e7796c 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -350,6 +350,9 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server 
*s,
 }
 }
 
+#define HANDLE_BUFIOREQ(s) \
+((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
+

Re: [Xen-devel] [PATCH v11 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-17 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 17 October 2017 13:53
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>; George Dunlap
> <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Wei Liu
> <wei.l...@citrix.com>; Stefano Stabellini <sstabell...@kernel.org>; xen-
> de...@lists.xenproject.org; KonradRzeszutek Wilk
> <konrad.w...@oracle.com>; Daniel de Graaf <dgde...@tycho.nsa.gov>; Tim
> (Xen.org) <t...@xen.org>
> Subject: RE: [PATCH v11 05/11] x86/mm: add HYPERVISOR_memory_op to
> acquire guest resources
> 
> >>> On 17.10.17 at 14:28, <paul.durr...@citrix.com> wrote:
> >>  -Original Message-
> >>
> >> > --- a/xen/include/xsm/dummy.h
> >> > +++ b/xen/include/xsm/dummy.h
> >> > @@ -724,3 +724,9 @@ static XSM_INLINE int xsm_xen_version
> >> (XSM_DEFAULT_ARG uint32_t op)
> >> >  return xsm_default_action(XSM_PRIV, current->domain, NULL);
> >> >  }
> >> >  }
> >> > +
> >> > +static XSM_INLINE int
> xsm_domain_resource_map(XSM_DEFAULT_ARG
> >> struct domain *d)
> >> > +{
> >> > +XSM_ASSERT_ACTION(XSM_DM_PRIV);
> >> > +return xsm_default_action(action, current->domain, d);
> >> > +}
> >>
> >> Perhaps better place this near something similar/related (also for
> >> some of the other additions further down)?
> >
> > Looking at this again it seems that various related things, e.g.
> > domain_memory_map, are x86 only so adding at the end seems like the
> best
> > thing to do.
> 
> Well, okay then (unless Daniel, whom it looks like you forgot to Cc,
> has a better suggestion).
> 

Yes, I realised that I forgot to cc him in since the code was added. He's on 
the v12 list.

  Paul

> Jan




Re: [Xen-devel] [PATCH v11 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-17 Thread Paul Durrant
> -Original Message-
> 
> > --- a/xen/include/xsm/dummy.h
> > +++ b/xen/include/xsm/dummy.h
> > @@ -724,3 +724,9 @@ static XSM_INLINE int xsm_xen_version
> (XSM_DEFAULT_ARG uint32_t op)
> >  return xsm_default_action(XSM_PRIV, current->domain, NULL);
> >  }
> >  }
> > +
> > +static XSM_INLINE int xsm_domain_resource_map(XSM_DEFAULT_ARG
> struct domain *d)
> > +{
> > +XSM_ASSERT_ACTION(XSM_DM_PRIV);
> > +return xsm_default_action(action, current->domain, d);
> > +}
> 
> Perhaps better place this near something similar/related (also for
> some of the other additions further down)?

Looking at this again it seems that various related things, e.g. 
domain_memory_map, are x86 only so adding at the end seems like the best thing 
to do.

  Paul

> 
> Jan



Re: [Xen-devel] [PATCH v11 10/11] common: add a new mappable resource type: XENMEM_resource_grant_table

2017-10-17 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 17 October 2017 10:06
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>; George Dunlap
> <george.dun...@citrix.com>; Ian Jackson <ian.jack...@citrix.com>; Wei Liu
> <wei.l...@citrix.com>; sstabell...@kernel.org; xen-
> de...@lists.xenproject.org; konrad.w...@oracle.com; Tim (Xen.org)
> <t...@xen.org>
> Subject: RE: [PATCH v11 10/11] common: add a new mappable resource type:
> XENMEM_resource_grant_table
> 
> >>> On 17.10.17 at 10:30, <paul.durr...@citrix.com> wrote:
> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> Sent: 17 October 2017 07:43
> >> >>> Paul Durrant <paul.durr...@citrix.com> 10/12/17 6:28 PM >>>
> >> >+int gnttab_get_grant_frame(struct domain *d, unsigned long idx,
> >> >+   mfn_t *mfn)
> >> >+{
> >> >+struct grant_table *gt = d->grant_table;
> >> >+int rc;
> >> >+
> >> >+/* write lock required as version may change and/or table may grow
> */
> >> >+grant_write_lock(gt);
> >> >+
> >> >+rc = (gt->gt_version == 2 &&
> >> >+  idx > XENMAPIDX_grant_table_status) ?
> >>
> >> I don't understand this check - why does XENMAPIDX_grant_table_status
> >> matter here at all? Same in gnttab_get_status_frame() then.
> >>
> >
> > Well, the current legal range of grant table frames for v2 is 0 - (1 <<
> > XENMAPIDX_grant_table_status) whereas it appears that for v1 there is no
> > limit. As for status frames, they are a v2-only concept but I agree that the
> > range check there is wrong.
> 
> I don't think the range limitation from the other interface should
> impose any restriction for this new one.

Ok. I'll drop the check.

> 
> Oh, one other thing I only notice now - could you please also
> attach a brief comment to the array that you grow to 32
> entries making clear that this is a pretty arbitrary choice?
> 

Sure.

  Paul

> Jan




Re: [Xen-devel] [PATCH v11 10/11] common: add a new mappable resource type: XENMEM_resource_grant_table

2017-10-17 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 17 October 2017 07:43
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>; Wei Liu
> <wei.l...@citrix.com>; George Dunlap <george.dun...@citrix.com>; Ian
> Jackson <ian.jack...@citrix.com>; sstabell...@kernel.org; xen-
> de...@lists.xenproject.org; konrad.w...@oracle.com; Tim (Xen.org)
> <t...@xen.org>
> Subject: Re: [PATCH v11 10/11] common: add a new mappable resource
> type: XENMEM_resource_grant_table
> 
> >>> Paul Durrant <paul.durr...@citrix.com> 10/12/17 6:28 PM >>>
> >@@ -1608,7 +1608,8 @@ fault:
> >}
> >
> >static int
> >-gnttab_populate_status_frames(struct domain *d, struct grant_table *gt,
> >+gnttab_populate_status_frames(struct domain *d,
> >+  struct grant_table *gt,
> >unsigned int req_nr_frames)
> 
> What is this change about?
> 

It must have crept in accidentally. I'll get rid of it.

> >+int gnttab_get_grant_frame(struct domain *d, unsigned long idx,
> >+   mfn_t *mfn)
> >+{
> >+struct grant_table *gt = d->grant_table;
> >+int rc;
> >+
> >+/* write lock required as version may change and/or table may grow */
> >+grant_write_lock(gt);
> >+
> >+rc = (gt->gt_version == 2 &&
> >+  idx > XENMAPIDX_grant_table_status) ?
> 
> I don't understand this check - why does XENMAPIDX_grant_table_status
> matter here at all? Same in gnttab_get_status_frame() then.
> 

Well, the current legal range of grant table frames for v2 is 0 - (1 << 
XENMAPIDX_grant_table_status) whereas it appears that for v1 there is no limit. 
As for status frames, they are a v2-only concept but I agree that the range 
check there is wrong.
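
For illustration only (not hypervisor code): a minimal standalone sketch of
the check as posted in gnttab_get_grant_frame(). The struct and function
names are hypothetical, and the constant's value is an assumption made for
the example. It models only the version/index test, not the frame lookup.

```c
#include <assert.h>
#include <errno.h>

/* Assumed value for the example; check the public header before relying on it. */
#define MODEL_XENMAPIDX_GRANT_TABLE_STATUS 0x80000000UL

/* Hypothetical stand-in for struct grant_table: only the version matters here. */
struct gt_model {
    unsigned int gt_version;
};

/*
 * Model of the posted check: for a v2 table an index above the status
 * marker is rejected; for v1 no limit is applied at this point.
 * Returns 0 if the index passes the check, -EINVAL otherwise.
 */
int grant_frame_idx_check(const struct gt_model *gt, unsigned long idx)
{
    if ( gt->gt_version == 2 && idx > MODEL_XENMAPIDX_GRANT_TABLE_STATUS )
        return -EINVAL;

    return 0;
}
```

With this model a v1 table accepts any index, while a v2 table rejects
anything above the marker, which is the asymmetry being questioned.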

  Paul

> Jan




Re: [Xen-devel] [PATCH v11 06/11] x86/hvm/ioreq: add a new mappable resource type...

2017-10-16 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 16 October 2017 15:07
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>; Ian Jackson
> <ian.jack...@citrix.com>; Stefano Stabellini <sstabell...@kernel.org>; xen-
> de...@lists.xenproject.org; Konrad Rzeszutek Wilk
> <konrad.w...@oracle.com>; Tim (Xen.org) <t...@xen.org>
> Subject: Re: [Xen-devel] [PATCH v11 06/11] x86/hvm/ioreq: add a new
> mappable resource type...
> 
> >>> On 12.10.17 at 18:25, <paul.durr...@citrix.com> wrote:
> > ... XENMEM_resource_ioreq_server
> >
> > This patch adds support for a new resource type that can be mapped using
> > the XENMEM_acquire_resource memory op.
> >
> > If an emulator makes use of this resource type then, instead of mapping
> > gfns, the IOREQ server will allocate pages from the heap. These pages
> > will never be present in the P2M of the guest at any point and so are
> > not vulnerable to any direct attack by the guest. They are only ever
> > accessible by Xen and any domain that has mapping privilege over the
> > guest (which may or may not be limited to the domain running the
> emulator).
> >
> > NOTE: Use of the new resource type is not compatible with use of
> >   XEN_DMOP_get_ioreq_server_info unless the XEN_DMOP_no_gfns
> flag is
> >   set.
> >
> > Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> > Acked-by: George Dunlap <george.dun...@eu.citrix.com>
> > Reviewed-by: Wei Liu <wei.l...@citrix.com>
> 
> Can you have validly retained this?

I didn't think the structure of this particular patch had changed that 
fundamentally.

> 
> > --- a/xen/arch/x86/hvm/ioreq.c
> > +++ b/xen/arch/x86/hvm/ioreq.c
> > @@ -281,6 +294,69 @@ static int hvm_map_ioreq_gfn(struct
> hvm_ioreq_server *s, bool buf)
> >  return rc;
> >  }
> >
> > +static int hvm_alloc_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
> > +{
> > +struct domain *currd = current->domain;
> > +struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
> > +
> > +if ( iorp->page )
> > +{
> > +/*
> > + * If a guest frame has already been mapped (which may happen
> > + * on demand if hvm_get_ioreq_server_info() is called), then
> > + * allocating a page is not permitted.
> > + */
> > +if ( !gfn_eq(iorp->gfn, INVALID_GFN) )
> > +return -EPERM;
> > +
> > +return 0;
> > +}
> > +
> > +/*
> > + * Allocated IOREQ server pages are assigned to the emulating
> > + * domain, not the target domain. This is because the emulator is
> > + * likely to be destroyed after the target domain has been torn
> > + * down, and we must use MEMF_no_refcount otherwise page
> allocation
> > + * could fail if the emulating domain has already reached its
> > + * maximum allocation.
> > + */
> > +iorp->page = alloc_domheap_page(currd, MEMF_no_refcount);
> > +if ( !iorp->page )
> > +return -ENOMEM;
> > +
> > +if ( !get_page_type(iorp->page, PGT_writable_page) )
> > +{
> 
> ASSERT_UNREACHABLE() ?

Ok.

> 
> > @@ -777,6 +886,51 @@ int hvm_get_ioreq_server_info(struct domain *d,
> ioservid_t id,
> >  return rc;
> >  }
> >
> > +int hvm_get_ioreq_server_frame(struct domain *d, ioservid_t id,
> > +   unsigned long idx, mfn_t *mfn)
> > +{
> > +struct hvm_ioreq_server *s;
> > +int rc;
> > +
> > +spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
> > +
> > +if ( id == DEFAULT_IOSERVID )
> > +return -EOPNOTSUPP;
> > +
> > +s = get_ioreq_server(d, id);
> > +
> > +ASSERT(!IS_DEFAULT(s));
> > +
> > +rc = hvm_ioreq_server_alloc_pages(s);
> > +if ( rc )
> > +goto out;
> > +
> > +switch ( idx )
> > +{
> > +case XENMEM_resource_ioreq_server_frame_bufioreq:
> > +rc = -ENOENT;
> > +if ( !HANDLE_BUFIOREQ(s) )
> > +goto out;
> > +
> > +*mfn = _mfn(page_to_mfn(s->bufioreq.page));
> > +rc = 0;
> > +break;
> 
> How about
> 
> if ( HANDLE_BUFIOREQ(s) )
> *mfn = _mfn(page_to_mfn(s->bufioreq.page));
> else
> rc = -ENOENT;
> break;
> 

Looking a

Re: [Xen-devel] [PATCH v11 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-16 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 16 October 2017 14:53
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Andrew Cooper <andrew.coop...@citrix.com>; Wei Liu
> <wei.l...@citrix.com>; George Dunlap <george.dun...@citrix.com>; Ian
> Jackson <ian.jack...@citrix.com>; Stefano Stabellini
> <sstabell...@kernel.org>; xen-de...@lists.xenproject.org; Konrad Rzeszutek
> Wilk <konrad.w...@oracle.com>; Tim (Xen.org) <t...@xen.org>
> Subject: Re: [PATCH v11 05/11] x86/mm: add HYPERVISOR_memory_op to
> acquire guest resources
> 
> >>> On 12.10.17 at 18:25, <paul.durr...@citrix.com> wrote:
> > @@ -402,14 +469,56 @@ int compat_memory_op(unsigned int cmd,
> XEN_GUEST_HANDLE_PARAM(void) compat)
> >  rc = do_memory_op(cmd, nat.hnd);
> >  if ( rc < 0 )
> >  {
> > -if ( rc == -ENOBUFS && op == XENMEM_get_vnumainfo )
> > +switch ( op)
> 
> Missing blank.

Oh yes.

> 
> >  {
> > -cmp.vnuma.nr_vnodes = nat.vnuma->nr_vnodes;
> > -cmp.vnuma.nr_vcpus = nat.vnuma->nr_vcpus;
> > -cmp.vnuma.nr_vmemranges = nat.vnuma->nr_vmemranges;
> > -if ( __copy_to_guest(compat, , 1) )
> > -rc = -EFAULT;
> > +case XENMEM_get_vnumainfo:
> > +if ( rc == -ENOBUFS )
> > +{
> > +cmp.vnuma.nr_vnodes = nat.vnuma->nr_vnodes;
> > +cmp.vnuma.nr_vcpus = nat.vnuma->nr_vcpus;
> > +cmp.vnuma.nr_vmemranges = nat.vnuma->nr_vmemranges;
> > +if ( __copy_to_guest(compat, , 1) )
> > +rc = -EFAULT;
> > +}
> > +
> > +break;
> > +
> > +case XENMEM_acquire_resource:
> > +{
> > +xen_ulong_t *xen_frame_list = (xen_ulong_t *)(nat.mar + 1);
> 
> const

Ok.

> 
> > +if ( rc == -EINVAL && xen_frame_list[0] != 0 )
> 
> I think this will go wrong if you get -EINVAL for other than the
> specific reason you consider here, in particular when caller
> passed in a valid array. You'd need to also check for
> cmp.mar.nr_frames being zero. But see also below.
> 
> > +{
> > +/*
> > + * The value of nr_frames passed to the implementation
> > + * was not the value passed by the caller, it was
> > + * overridden.
> > + * The value in xen_frame_list[0] is the maximum
> > + * number of frames that can be bounced so we need
> > + * to set cmp.nr_frames to the minimum of this and
> > + * the maximum number of frames allowed by the
> > + * implementation before passing back to the caller.
> > + */
> > +cmp.mar.nr_frames = min_t(unsigned int,
> > +  xen_frame_list[0],
> > +  nat.mar->nr_frames);
> > +rc = -E2BIG;
> > +}
> > +
> > +/* In either of these cases nr_frames is an OUT value */
> > +if ( rc == -EINVAL || rc == -E2BIG )
> > +{
> > +if ( copy_to_guest(compat, , 1) )
> > +rc = -EFAULT;
> 
> The two if()s should be combined. Also - __copy_field_to_guest()?

Yes, maybe that would be neater.

> 
> > +}
> > +
> > +break;
> > +}
> > +default:
> > +break;
> 
> No real need for a default label. Yet if you want to keep it, please
> have a blank line ahead of it.
> 

Ok.

> > @@ -535,6 +644,30 @@ int compat_memory_op(unsigned int cmd,
> XEN_GUEST_HANDLE_PARAM(void) compat)
> >  rc = -EFAULT;
> >  break;
> >
> > +case XENMEM_acquire_resource:
> > +{
> > +xen_ulong_t *xen_frame_list = (xen_ulong_t *)(nat.mar + 1);
> 
> const

Ok.

> 
> > +compat_ulong_t *compat_frame_list =
> > +(compat_ulong_t *)(nat.mar + 1);
> > +
> > +/* NOTE: the compat array overwrites the native array */
> 
> Perhaps "the smaller compat array ..."?

Ok.

> 
> > +  

Re: [Xen-devel] [PATCH 3/8] xen: defer call to xen_restrict until just before os_setup_post

2017-10-13 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Andrew Cooper
> Sent: 13 October 2017 10:00
> To: Ross Lagerwall ; Ian Jackson
> ; qemu-de...@nongnu.org
> Cc: Anthony Perard ; Juergen Gross
> ; Stefano Stabellini ; xen-
> de...@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH 3/8] xen: defer call to xen_restrict until 
> just
> before os_setup_post
> 
> On 13/10/2017 09:37, Ross Lagerwall wrote:
> > On 10/09/2017 05:01 PM, Ian Jackson wrote:
> >> We need to restrict *all* the control fds that qemu opens.  Looking in
> >> /proc/PID/fd shows there are many; their allocation seems scattered
> >> throughout Xen support code in qemu.
> >>
> >> We must postpone the restrict call until roughly the same time as qemu
> >> changes its uid, chroots (if applicable), and so on.
> >>
> >> There doesn't seem to be an appropriate hook already.  The RunState
> >> change hook fires at different times depending on exactly what mode
> >> qemu is operating in.
> >>
> >> And it appears that no-one but the Xen code wants a hook at this phase
> >> of execution.  So, introduce a bare call to a new function
> >> xen_setup_post, just before os_setup_post.  Also provide the
> >> appropriate stub for when Xen compilation is disabled.
> >>
> >> We do the restriction before rather than after os_setup_post, because
> >> xen_restrict may need to open /dev/null, and os_setup_post might have
> >> called chroot.
> >>
> > This works for normally starting a VM but doesn't seem to work when
> > resuming/migrating.
> >
> > Here is the order of selected operations when starting a VM normally:
> >     VNC server running on 127.0.0.1:5901
> >     xen_change_state_handler()
> >     xenstore_record_dm_state: running
> >     xen_setup_post()
> >     xentoolcore_restrict_all: rc = 0
> >     os_setup_post()
> >     main_loop()
> >
> > Here is the order of selected operations when starting QEMU with
> > -incoming fd:... :
> >     VNC server running on 127.0.0.1:5902
> >     migration_fd_incoming()
> >     xen_setup_post()
> >     xentoolcore_restrict_all: rc = 0
> >     os_setup_post()
> >     main_loop()
> >     migration_set_incoming_channel()
> >     migrate_set_state()
> >     xen_change_state_handler()
> >     xenstore_record_dm_state: running
> >     error recording dm state
> >     qemu exited with code 1
> >
> > The issue is that QEMU needs xenstore access to write "running" but
> > this is after it has already been restricted. Moving xen_setup_post()
> > into xen_change_state_handler() works fine. The only issue is that in
> > the migration case, it executes after os_setup_post() so QEMU might be
> > chrooted in which case opening /dev/null to restrict fds doesn't work
> > (unless its new root has a /dev/null).
> >
> 
> Wasn't the agreement in the end to remove all use of xenstore from the
> device model?  This running notification can and should be QMP, at which
> point we break a causal dependency.
> 

Yes, that was the agreement. One problem is that there is not yet adequate 
separation between QEMU's common and pv/hvm init routines to do this. Nor, I 
believe, is there support in libxl to spawn separate xenpv and xenfv instances 
of QEMU for the same guest.
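
To illustrate the ordering problem in Ross's traces quoted above, here is a
toy C model. All names are hypothetical; none of these are QEMU, libxl or
xentoolcore functions. It assumes only what the traces show: recording the
"running" state fails once restriction has already happened.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device-model state; purely illustrative. */
struct dm_model {
    bool restricted;      /* set once the control fds have been restricted */
    bool state_recorded;  /* set once "running" has been written */
};

void model_restrict(struct dm_model *dm)
{
    dm->restricted = true;
}

/* In this model, writing the state needs unrestricted xenstore access. */
int model_record_running(struct dm_model *dm)
{
    if ( dm->restricted )
        return -1;

    dm->state_recorded = true;
    return 0;
}

/* Normal start: the state is recorded first, then fds are restricted. */
int model_normal_start(struct dm_model *dm)
{
    int rc = model_record_running(dm);

    model_restrict(dm);
    return rc;
}

/* Incoming migration: restriction happens before the state is recorded. */
int model_incoming_migration(struct dm_model *dm)
{
    model_restrict(dm);
    return model_record_running(dm);
}
```

The second ordering is the one that fails, which is why moving the
notification off xenstore would break the dependency.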

  Paul

> For safety reasons, qemu needs to have restricted/dropped/etc all
> permissions before it looks at a single byte of incoming migration data,
> to protect against buggy or malicious alterations to the migration stream.
> 
> ~Andrew
> 


[Xen-devel] [PATCH v11 11/11] tools/libxenctrl: use new xenforeignmemory API to seed grant table

2017-10-12 Thread Paul Durrant
A previous patch added support for priv-mapping guest resources directly
(rather than having to foreign-map, which requires P2M modification for
HVM guests).

This patch makes use of the new API to seed the guest grant table unless
the underlying infrastructure (i.e. privcmd) doesn't support it, in which
case the old scheme is used.

NOTE: The call to xc_dom_gnttab_hvm_seed() in hvm_build_set_params() was
  actually unnecessary, as the grant table has already been seeded
  by a prior call to xc_dom_gnttab_init() made by libxl__build_dom().
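
For illustration only (not part of the patch): a minimal sketch of the
"try the new scheme, fall back to the old one" pattern described above. The
callback types and function names are hypothetical and are not the libxc or
xenforeignmemory API; the sketch also assumes that lack of support is
reported as -EOPNOTSUPP.

```c
#include <assert.h>
#include <errno.h>

/* Which path ended up seeding the table in this model. */
enum seed_path { SEED_DIRECT, SEED_COMPAT, SEED_FAILED };

/*
 * try_direct() and try_compat() are stand-ins returning 0 on success or a
 * negative errno value. Only "not supported" triggers the fallback; any
 * other failure of the direct path is reported as a failure.
 */
enum seed_path seed_gnttab(int (*try_direct)(void), int (*try_compat)(void))
{
    int rc = try_direct();

    if ( rc == 0 )
        return SEED_DIRECT;

    if ( rc == -EOPNOTSUPP && try_compat() == 0 )
        return SEED_COMPAT;

    return SEED_FAILED;
}

/* Stub behaviours used to exercise the three outcomes. */
int direct_ok(void)          { return 0; }
int direct_unsupported(void) { return -EOPNOTSUPP; }
int direct_broken(void)      { return -ENOMEM; }
int compat_ok(void)          { return 0; }
```

Restricting the fallback to the "unsupported" case keeps genuine errors on
the new path visible instead of silently masking them.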

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marma...@invisiblethingslab.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
---
Cc: Ian Jackson <ian.jack...@eu.citrix.com>

v10:
 - Use new id constant for grant table.

v4:
 - Minor cosmetic fix suggested by Roger.

v3:
 - Introduced xc_dom_set_gnttab_entry() to avoid duplicated code.
---
 tools/libxc/include/xc_dom.h|   8 +--
 tools/libxc/xc_dom_boot.c   | 114 +---
 tools/libxc/xc_sr_restore_x86_hvm.c |  10 ++--
 tools/libxc/xc_sr_restore_x86_pv.c  |   2 +-
 tools/libxl/libxl_dom.c |   1 -
 tools/python/xen/lowlevel/xc/xc.c   |   6 +-
 6 files changed, 92 insertions(+), 49 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 6e06ef1dec..4216d63462 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -325,12 +325,8 @@ void *xc_dom_boot_domU_map(struct xc_dom_image *dom, 
xen_pfn_t pfn,
 int xc_dom_boot_image(struct xc_dom_image *dom);
 int xc_dom_compat_check(struct xc_dom_image *dom);
 int xc_dom_gnttab_init(struct xc_dom_image *dom);
-int xc_dom_gnttab_hvm_seed(xc_interface *xch, domid_t domid,
-   xen_pfn_t console_gmfn,
-   xen_pfn_t xenstore_gmfn,
-   domid_t console_domid,
-   domid_t xenstore_domid);
-int xc_dom_gnttab_seed(xc_interface *xch, domid_t domid,
+int xc_dom_gnttab_seed(xc_interface *xch, domid_t guest_domid,
+   bool is_hvm,
xen_pfn_t console_gmfn,
xen_pfn_t xenstore_gmfn,
domid_t console_domid,
diff --git a/tools/libxc/xc_dom_boot.c b/tools/libxc/xc_dom_boot.c
index 8a376d097c..0fe94aa255 100644
--- a/tools/libxc/xc_dom_boot.c
+++ b/tools/libxc/xc_dom_boot.c
@@ -282,11 +282,29 @@ static xen_pfn_t xc_dom_gnttab_setup(xc_interface *xch, 
domid_t domid)
 return gmfn;
 }
 
-int xc_dom_gnttab_seed(xc_interface *xch, domid_t domid,
-   xen_pfn_t console_gmfn,
-   xen_pfn_t xenstore_gmfn,
-   domid_t console_domid,
-   domid_t xenstore_domid)
+static void xc_dom_set_gnttab_entry(xc_interface *xch,
+grant_entry_v1_t *gnttab,
+unsigned int idx,
+domid_t guest_domid,
+domid_t backend_domid,
+xen_pfn_t backend_gmfn)
+{
+if ( guest_domid == backend_domid || backend_gmfn == -1)
+return;
+
+xc_dom_printf(xch, "%s: [%u] -> 0x%"PRI_xen_pfn,
+  __FUNCTION__, idx, backend_gmfn);
+
+gnttab[idx].flags = GTF_permit_access;
+gnttab[idx].domid = backend_domid;
+gnttab[idx].frame = backend_gmfn;
+}
+
+static int compat_gnttab_seed(xc_interface *xch, domid_t domid,
+  xen_pfn_t console_gmfn,
+  xen_pfn_t xenstore_gmfn,
+  domid_t console_domid,
+  domid_t xenstore_domid)
 {
 
 xen_pfn_t gnttab_gmfn;
@@ -310,18 +328,10 @@ int xc_dom_gnttab_seed(xc_interface *xch, domid_t domid,
 return -1;
 }
 
-if ( domid != console_domid  && console_gmfn != -1)
-{
-gnttab[GNTTAB_RESERVED_CONSOLE].flags = GTF_permit_access;
-gnttab[GNTTAB_RESERVED_CONSOLE].domid = console_domid;
-gnttab[GNTTAB_RESERVED_CONSOLE].frame = console_gmfn;
-}
-if ( domid != xenstore_domid && xenstore_gmfn != -1)
-{
-gnttab[GNTTAB_RESERVED_XENSTORE].flags = GTF_permit_access;
-gnttab[GNTTAB_RESERVED_XENSTORE].domid = xenstore_domid;
-gnttab[GNTTAB_RESERVED_XENSTORE].frame = xenstore_gmfn;
-}
+xc_dom_set_gnttab_entry(xch, gnttab, GNTTAB_RESERVED_CONSOLE,
+domid, console_domid, console_gmfn);
+xc_dom_set_gnttab_entry(xch, gnttab, GNTTAB_RESERVED_XENSTORE,
+domid, xenstore_domid, xenstore_gmfn);
 
 if ( munmap(gnttab, PAGE_SIZE) == -1 )
 {
@@ -339,11 +349,11 @@ int xc_dom_gnttab_seed(xc_interface *xch, domid_t domid,
 return 0;

[Xen-devel] [PATCH v11 10/11] common: add a new mappable resource type: XENMEM_resource_grant_table

2017-10-12 Thread Paul Durrant
This patch allows grant table frames to be mapped using the
XENMEM_acquire_resource memory op.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: Wei Liu <wei.l...@citrix.com>

v10:
 - Addressed comments from Jan.

v8:
 - The functionality was originally incorporated into the earlier patch
   "x86/mm: add HYPERVISOR_memory_op to acquire guest resources".
---
 xen/common/grant_table.c  | 63 ++-
 xen/common/memory.c   | 44 +-
 xen/include/public/memory.h   |  6 +
 xen/include/xen/grant_table.h |  4 +++
 4 files changed, 110 insertions(+), 7 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 6d20b17739..e42c1b6bf3 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1608,7 +1608,8 @@ fault:
 }
 
 static int
-gnttab_populate_status_frames(struct domain *d, struct grant_table *gt,
+gnttab_populate_status_frames(struct domain *d,
+  struct grant_table *gt,
   unsigned int req_nr_frames)
 {
 unsigned i;
@@ -3756,13 +3757,12 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, 
grant_ref_t ref,
 }
 #endif
 
-int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
- mfn_t *mfn)
+/* Caller must hold write lock as version may change and table may grow */
+static int gnttab_get_frame(struct domain *d, unsigned long idx,
+mfn_t *mfn)
 {
-int rc = 0;
 struct grant_table *gt = d->grant_table;
-
-grant_write_lock(gt);
+int rc = 0;
 
 if ( gt->gt_version == 0 )
 gt->gt_version = 1;
@@ -3787,6 +3787,19 @@ int gnttab_map_frame(struct domain *d, unsigned long 
idx, gfn_t gfn,
 rc = -EINVAL;
 }
 
+return rc;
+}
+
+int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
+ mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+int rc;
+
+grant_write_lock(gt);
+
+rc = gnttab_get_frame(d, idx, mfn);
+
 if ( !rc )
 gnttab_set_frame_gfn(gt, idx, gfn);
 
@@ -3795,6 +3808,44 @@ int gnttab_map_frame(struct domain *d, unsigned long 
idx, gfn_t gfn,
 return rc;
 }
 
+int gnttab_get_grant_frame(struct domain *d, unsigned long idx,
+   mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+int rc;
+
+/* write lock required as version may change and/or table may grow */
+grant_write_lock(gt);
+
+rc = (gt->gt_version == 2 &&
+  idx > XENMAPIDX_grant_table_status) ?
+-EINVAL :
+gnttab_get_frame(d, idx, mfn);
+
+grant_write_unlock(gt);
+
+return rc;
+}
+
+int gnttab_get_status_frame(struct domain *d, unsigned long idx,
+mfn_t *mfn)
+{
+struct grant_table *gt = d->grant_table;
+int rc;
+
+/* write lock required as version may change and/or table may grow */
+grant_write_lock(gt);
+
+rc = (gt->gt_version != 2 ||
+  idx > XENMAPIDX_grant_table_status) ?
+-EINVAL :
+gnttab_get_frame(d, idx & XENMAPIDX_grant_table_status, mfn);
+
+grant_write_unlock(gt);
+
+return rc;
+}
+
 static void gnttab_usage_print(struct domain *rd)
 {
 int first = 1;
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 1a9872b75c..a50d93d006 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -965,11 +966,47 @@ static long xatp_permission_check(struct domain *d, 
unsigned int space)
 return xsm_add_to_physmap(XSM_TARGET, current->domain, d);
 }
 
+static int acquire_grant_table(struct domain *d, unsigned int id,
+   unsigned long frame,
+   unsigned int nr_frames,
+   unsigned long mfn_list[])
+{
+unsigned int i = nr_frames;
+
+while ( i-- != 0 )
+{
+mfn_t mfn = INVALID_MFN;
+int rc;
+
+switch ( id )
+{
+case XENMEM_resource_grant_table_id_grant:
+rc = gnttab_get_grant_frame(d, frame + i, &mfn);
+break;
+
+case XENMEM_resource_grant_table_id_status:
+rc = gnttab_get_status_frame(d, frame + i, &mfn);
+break;
+
+default:
+rc = -EINVAL;
+break;
+}
+
+if ( rc )
+return rc;
+
+mfn_list[i] = mfn_x(mfn);
+}
+
+return 0;
+}
+
 static int acquire_resource(XEN_GUES

[Xen-devel] [PATCH v11 01/11] x86/hvm/ioreq: maintain an array of ioreq servers rather than a list

2017-10-12 Thread Paul Durrant
A subsequent patch will remove the current implicit limitation on creation
of ioreq servers which is due to the allocation of gfns for the ioreq
structures and buffered ioreq ring.

It will therefore be necessary to introduce an explicit limit and, since
this limit should be small, it simplifies the code to maintain an array of
that size rather than using a list.

Also, by reserving an array slot for the default server and populating
array slots early in create, the need to pass an 'is_default' boolean
to sub-functions can be avoided.

Some function return values are changed by this patch: Specifically, in
the case where the id of the default ioreq server is passed in, -EOPNOTSUPP
is now returned rather than -ENOENT.
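
For illustration only (not part of the patch): a minimal standalone sketch
of the fixed-size array plus iteration macro described above. The type
names, the limit of 8 and the use of slot 0 for the default server are
assumptions made for the example, not values taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_SERVERS  8  /* assumed small explicit limit */
#define MODEL_DEFAULT_SLOT 0  /* assumed slot reserved for the default server */

struct server_model { int tag; };

struct domain_model {
    struct server_model *server[MODEL_MAX_SERVERS];
};

/*
 * Visit only populated slots. The trailing 'else' makes the statement that
 * follows the macro the loop body, so callers can use braces as usual.
 */
#define FOR_EACH_SERVER(d, id, s) \
    for ( (id) = 0; (id) < MODEL_MAX_SERVERS; (id)++ ) \
        if ( !((s) = (d)->server[id]) ) \
            continue; \
        else

unsigned int count_servers(struct domain_model *d)
{
    struct server_model *s;
    unsigned int id, n = 0;

    FOR_EACH_SERVER(d, id, s)
    {
        (void)s; /* a real caller would act on s here */
        n++;
    }

    return n;
}

/* "Default" is a property of the slot, so no is_default flag is passed around. */
int is_default(const struct domain_model *d, const struct server_model *s)
{
    return s && s == d->server[MODEL_DEFAULT_SLOT];
}
```

Because the default server is identified by its slot, helper functions can
derive that fact from the pointer alone.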

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Reviewed-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>

v10:
 - modified FOR_EACH... macro as suggested by Jan.
 - check for NULL in IS_DEFAULT macro as suggested by Jan.

v9:
 - modified FOR_EACH... macro as requested by Andrew.

v8:
 - Addressed various comments from Jan.

v7:
 - Fixed assertion failure found in testing.

v6:
 - Updated according to comments made by Roger on v4 that I'd missed.

v5:
 - Switched GET/SET_IOREQ_SERVER() macros to get/set_ioreq_server()
   functions to avoid possible double-evaluation issues.

v4:
 - Introduced more helper macros and relocated them to the top of the
   code.

v3:
 - New patch (replacing "move is_default into struct hvm_ioreq_server") in
   response to review comments.
---
 xen/arch/x86/hvm/ioreq.c | 502 +++
 xen/include/asm-x86/hvm/domain.h |  10 +-
 2 files changed, 245 insertions(+), 267 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index f2e0b3f74a..e6ccc7572a 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -33,6 +33,37 @@
 
 #include 
 
+static void set_ioreq_server(struct domain *d, unsigned int id,
+ struct hvm_ioreq_server *s)
+{
+ASSERT(id < MAX_NR_IOREQ_SERVERS);
+ASSERT(!s || !d->arch.hvm_domain.ioreq_server.server[id]);
+
+d->arch.hvm_domain.ioreq_server.server[id] = s;
+}
+
+#define GET_IOREQ_SERVER(d, id) \
+(d)->arch.hvm_domain.ioreq_server.server[id]
+
+static struct hvm_ioreq_server *get_ioreq_server(const struct domain *d,
+ unsigned int id)
+{
+if ( id >= MAX_NR_IOREQ_SERVERS )
+return NULL;
+
+return GET_IOREQ_SERVER(d, id);
+}
+
+#define IS_DEFAULT(s) \
+((s) && (s) == GET_IOREQ_SERVER((s)->domain, DEFAULT_IOSERVID))
+
+/* Iterate over all possible ioreq servers */
+#define FOR_EACH_IOREQ_SERVER(d, id, s) \
+for ( (id) = 0; (id) < MAX_NR_IOREQ_SERVERS; (id)++ ) \
+if ( !(s = GET_IOREQ_SERVER(d, id)) ) \
+continue; \
+else
+
 static ioreq_t *get_ioreq(struct hvm_ioreq_server *s, struct vcpu *v)
 {
 shared_iopage_t *p = s->ioreq.va;
@@ -47,10 +78,9 @@ bool hvm_io_pending(struct vcpu *v)
 {
 struct domain *d = v->domain;
 struct hvm_ioreq_server *s;
+unsigned int id;
 
-list_for_each_entry ( s,
-  &d->arch.hvm_domain.ioreq_server.list,
-  list_entry )
+FOR_EACH_IOREQ_SERVER(d, id, s)
 {
 struct hvm_ioreq_vcpu *sv;
 
@@ -127,10 +157,9 @@ bool handle_hvm_io_completion(struct vcpu *v)
 struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
 struct hvm_ioreq_server *s;
 enum hvm_io_completion io_completion;
+unsigned int id;
 
-  list_for_each_entry ( s,
-  &d->arch.hvm_domain.ioreq_server.list,
-  list_entry )
+FOR_EACH_IOREQ_SERVER(d, id, s)
 {
 struct hvm_ioreq_vcpu *sv;
 
@@ -243,13 +272,12 @@ static int hvm_map_ioreq_page(
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
 {
 const struct hvm_ioreq_server *s;
+unsigned int id;
 bool found = false;
 
 spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
 
-list_for_each_entry ( s,
-  &d->arch.hvm_domain.ioreq_server.list,
-  list_entry )
+FOR_EACH_IOREQ_SERVER(d, id, s)
 {
 if ( (s->ioreq.va && s->ioreq.page == page) ||
  (s->bufioreq.va && s->bufioreq.page == page) )
@@ -302,7 +330,7 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server 
*s,
 }
 
 static int hvm_ioreq_server_add_vcpu(struct hvm_ioreq_server *s,
- bool is_default, struct vcpu *v)
+ struct vcpu *v)
 {
 struct hvm_ioreq_vcpu *sv;
 int rc;
@@ -331,7 +359,7 @@ static int hvm_ioreq_server_add_vcpu(struct 
hvm_ioreq_server *s,
 goto fa

[Xen-devel] [PATCH v11 05/11] x86/mm: add HYPERVISOR_memory_op to acquire guest resources

2017-10-12 Thread Paul Durrant
Certain memory resources associated with a guest are not necessarily
present in the guest P2M.

This patch adds the boilerplate for a new memory op to allow such a resource
to be priv-mapped directly, by either a PV or HVM tools domain.

NOTE: Whilst the new op is not intrinsically specific to the x86 architecture,
  I have no means to test it on an ARM platform and so cannot verify
  that it functions correctly. Hence it is currently only implemented
  for x86.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: Wei Liu <wei.l...@citrix.com>

v11:
 - Addressed more comments from Jan.

v9:
 - Addressed more comments from Jan.

v8:
 - Move the code into common as requested by Jan.
 - Make the gmfn_list handle a 64-bit type to avoid limiting the MFN
   range for a 32-bit tools domain.
 - Add missing pad.
 - Add compat code.
 - Make this patch deal with purely boilerplate.
 - Drop George's A-b and Wei's R-b because the changes are non-trivial,
   and update Cc list now the boilerplate is common.

v5:
 - Switched __copy_to/from_guest_offset() to copy_to/from_guest_offset().
---
 tools/flask/policy/modules/xen.if   |   4 +-
 xen/arch/x86/mm/p2m.c   |   3 +-
 xen/common/compat/memory.c  | 145 ++--
 xen/common/memory.c |  90 ++
 xen/include/asm-x86/p2m.h   |   3 +
 xen/include/public/memory.h |  43 ++-
 xen/include/xlat.lst|   1 +
 xen/include/xsm/dummy.h |   6 ++
 xen/include/xsm/xsm.h   |   6 ++
 xen/xsm/dummy.c |   1 +
 xen/xsm/flask/hooks.c   |   6 ++
 xen/xsm/flask/policy/access_vectors |   2 +
 12 files changed, 300 insertions(+), 10 deletions(-)

diff --git a/tools/flask/policy/modules/xen.if 
b/tools/flask/policy/modules/xen.if
index 55437496f6..07cba8a15d 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -52,7 +52,8 @@ define(`create_domain_common', `
settime setdomainhandle getvcpucontext set_misc_info };
allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim
set_max_evtchn set_vnumainfo get_vnumainfo cacheflush
-   psr_cmt_op psr_cat_op soft_reset set_gnttab_limits };
+   psr_cmt_op psr_cat_op soft_reset set_gnttab_limits
+   resource_map };
allow $1 $2:security check_context;
allow $1 $2:shadow enable;
allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage 
mmuext_op updatemp };
@@ -152,6 +153,7 @@ define(`device_model', `
allow $1 $2_target:domain { getdomaininfo shutdown };
allow $1 $2_target:mmu { map_read map_write adjust physmap target_hack 
};
allow $1 $2_target:hvm { getparam setparam hvmctl cacheattr dm };
+   allow $1 $2_target:domain2 resource_map;
 ')
 
 # make_device_model(priv, dm_dom, hvm_dom)
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index c72a3cdebb..71bb9b4f93 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1132,8 +1132,7 @@ static int set_typed_p2m_entry(struct domain *d, unsigned 
long gfn_l,
 }
 
 /* Set foreign mfn in the given guest's p2m table. */
-static int set_foreign_p2m_entry(struct domain *d, unsigned long gfn,
- mfn_t mfn)
+int set_foreign_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
 {
 return set_typed_p2m_entry(d, gfn, mfn, PAGE_ORDER_4K, p2m_map_foreign,
p2m_get_hostp2m(d)->default_access);
diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index 35bb259808..031d1a48ae 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -71,6 +71,7 @@ int compat_memory_op(unsigned int cmd, 
XEN_GUEST_HANDLE_PARAM(void) compat)
 struct xen_remove_from_physmap *xrfp;
 struct xen_vnuma_topology_info *vnuma;
 struct xen_mem_access_op *mao;
+struct xen_mem_acquire_resource *mar;
 } nat;
 union {
 struct compat_memory_reservation rsrv;
@@ -79,6 +80,7 @@ int compat_memory_op(unsigned int cmd, 
XEN_GUEST_HANDLE_PARAM(void) compat)
 struct compat_add_to_physmap_batch atpb;
 struct compat_vnuma_topology_info vnuma;
 struct compat_mem_access_op mao;
+struct compat_mem_acquire_resource mar;
 } cmp;
 
 set_xen_guest_handle(nat.hnd, COMPAT_ARG_XLAT_VIRT_BA

[Xen-devel] [PATCH v11 02/11] x86/hvm/ioreq: simplify code and use consistent naming

2017-10-12 Thread Paul Durrant
This patch re-works much of the ioreq server initialization and teardown
code:

- The hvm_map/unmap_ioreq_gfn() functions are expanded to call through
  to hvm_alloc/free_ioreq_gfn() rather than expecting them to be called
  separately by outer functions.
- Several functions now test the validity of the hvm_ioreq_page gfn value
  to determine whether they need to act. This means they can be safely called
  for the bufioreq page even when it is not used.
- hvm_add/remove_ioreq_gfn() simply return in the case of the default
  IOREQ server so callers no longer need to test before calling.
- hvm_ioreq_server_setup_pages() is renamed to hvm_ioreq_server_map_pages()
  to mirror the existing hvm_ioreq_server_unmap_pages().

All of this significantly shortens the code.
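
To illustrate the calling-convention change for the gfn allocator (an rc
plus out-parameter replaced by a sentinel return value), here is a minimal
standalone sketch. The names gfn_pool, pool_alloc_gfn, pool_free_gfn and
INVALID_GFN_RAW are hypothetical stand-ins, not Xen identifiers, and the
sketch uses plain bit operations where the patch uses the atomic
test_and_clear_bit()/set_bit() helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for gfn_x(INVALID_GFN). */
#define INVALID_GFN_RAW (~0UL)

struct gfn_pool {
    unsigned long base;   /* first gfn covered by the mask */
    uint32_t mask;        /* bit i set => gfn (base + i) is free */
};

/* Return a free gfn, or INVALID_GFN_RAW if the pool is exhausted. */
static unsigned long pool_alloc_gfn(struct gfn_pool *p)
{
    unsigned int i;

    for ( i = 0; i < sizeof(p->mask) * 8; i++ )
    {
        uint32_t bit = (uint32_t)1 << i;

        if ( p->mask & bit )
        {
            p->mask &= ~bit;          /* non-atomic, unlike the real code */
            return p->base + i;
        }
    }

    return INVALID_GFN_RAW;
}

/* Hand a previously allocated gfn back to the pool. */
static void pool_free_gfn(struct gfn_pool *p, unsigned long gfn)
{
    assert(gfn != INVALID_GFN_RAW);
    p->mask |= (uint32_t)1 << (gfn - p->base);
}
```

With this shape a caller only has to compare the result against the
sentinel, which is what lets the map/unmap functions test the stored gfn
to decide whether there is anything to do.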

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Reviewed-by: Wei Liu <wei.l...@citrix.com>
Acked-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>

v3:
 - Rebased on top of 's->is_default' to 'IS_DEFAULT(s)' changes.
 - Minor updates in response to review comments from Roger.
---
 xen/arch/x86/hvm/ioreq.c | 182 ++-
 1 file changed, 69 insertions(+), 113 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index e6ccc7572a..6d81018369 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -210,63 +210,75 @@ bool handle_hvm_io_completion(struct vcpu *v)
 return true;
 }
 
-static int hvm_alloc_ioreq_gfn(struct domain *d, unsigned long *gfn)
+static unsigned long hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
 {
+struct domain *d = s->domain;
 unsigned int i;
-int rc;
 
-rc = -ENOMEM;
+ASSERT(!IS_DEFAULT(s));
+
 for ( i = 0; i < sizeof(d->arch.hvm_domain.ioreq_gfn.mask) * 8; i++ )
 {
 if ( test_and_clear_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask) )
-{
-*gfn = d->arch.hvm_domain.ioreq_gfn.base + i;
-rc = 0;
-break;
-}
+return d->arch.hvm_domain.ioreq_gfn.base + i;
 }
 
-return rc;
+return gfn_x(INVALID_GFN);
 }
 
-static void hvm_free_ioreq_gfn(struct domain *d, unsigned long gfn)
+static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s,
+   unsigned long gfn)
 {
+struct domain *d = s->domain;
 unsigned int i = gfn - d->arch.hvm_domain.ioreq_gfn.base;
 
-if ( gfn != gfn_x(INVALID_GFN) )
-set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
+ASSERT(!IS_DEFAULT(s));
+ASSERT(gfn != gfn_x(INVALID_GFN));
+
+set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 }
 
-static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool buf)
+static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
+if ( iorp->gfn == gfn_x(INVALID_GFN) )
+return;
+
 destroy_ring_for_helper(&iorp->va, iorp->page);
+iorp->page = NULL;
+
+if ( !IS_DEFAULT(s) )
+hvm_free_ioreq_gfn(s, iorp->gfn);
+
+iorp->gfn = gfn_x(INVALID_GFN);
 }
 
-static int hvm_map_ioreq_page(
-struct hvm_ioreq_server *s, bool buf, unsigned long gfn)
+static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
 {
 struct domain *d = s->domain;
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
-struct page_info *page;
-void *va;
 int rc;
 
-if ( (rc = prepare_ring_for_helper(d, gfn, &page, &va)) )
-return rc;
-
-if ( (iorp->va != NULL) || d->is_dying )
-{
-destroy_ring_for_helper(&va, page);
+if ( d->is_dying )
 return -EINVAL;
-}
 
-iorp->va = va;
-iorp->page = page;
-iorp->gfn = gfn;
+if ( IS_DEFAULT(s) )
+iorp->gfn = buf ?
+d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
+d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
+else
+iorp->gfn = hvm_alloc_ioreq_gfn(s);
 
-return 0;
+if ( iorp->gfn == gfn_x(INVALID_GFN) )
+return -ENOMEM;
+
+rc = prepare_ring_for_helper(d, iorp->gfn, &iorp->page, &iorp->va);
+
+if ( rc )
+hvm_unmap_ioreq_gfn(s, buf);
+
+return rc;
 }
 
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
@@ -279,8 +291,7 @@ bool is_ioreq_server_page(struct domain *d, const struct 
page_info *page)
 
 FOR_EACH_IOREQ_SERVER(d, id, s)
 {
-if ( (s->ioreq.va && s->ioreq.page == page) ||
- (s->bufioreq.va && s->bufioreq.page == page) )
+if ( (s->ioreq.page == page) || (s->bufioreq.page == page) )
 {
 found = true;
 break;
@@ -292,20 +303,30 @@ bool is_ioreq_server_page(struct domain *d, const struct 
page_info *page)
 retu

[Xen-devel] [PATCH v11 06/11] x86/hvm/ioreq: add a new mappable resource type...

2017-10-12 Thread Paul Durrant
... XENMEM_resource_ioreq_server

This patch adds support for a new resource type that can be mapped using
the XENMEM_acquire_resource memory op.

If an emulator makes use of this resource type then, instead of mapping
gfns, the IOREQ server will allocate pages from the heap. These pages
will never be present in the P2M of the guest at any point and so are
not vulnerable to any direct attack by the guest. They are only ever
accessible by Xen and any domain that has mapping privilege over the
guest (which may or may not be limited to the domain running the emulator).

NOTE: Use of the new resource type is not compatible with use of
  XEN_DMOP_get_ioreq_server_info unless the XEN_DMOP_no_gfns flag is
  set.
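
The incompatibility noted above comes down to the two ways of populating
an ioreq page being mutually exclusive. A minimal standalone model of
that rule follows; demo_page, demo_map_gfn and demo_alloc_page are
hypothetical names that mirror only the -EPERM checks added to
hvm_map_ioreq_gfn() and hvm_alloc_ioreq_mfn(), not the real allocation
or mapping work.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct demo_page {
    bool have_page;   /* models iorp->page != NULL */
    bool have_gfn;    /* models !gfn_eq(iorp->gfn, INVALID_GFN) */
};

/* Legacy path: back the ioreq page with a guest frame (gfn). */
static int demo_map_gfn(struct demo_page *p)
{
    if ( p->have_page )
        return p->have_gfn ? 0 : -EPERM;   /* a page was allocated instead */

    p->have_page = true;
    p->have_gfn = true;
    return 0;
}

/* New path: allocate a page that never appears in the guest P2M. */
static int demo_alloc_page(struct demo_page *p)
{
    if ( p->have_page )
        return p->have_gfn ? -EPERM : 0;   /* a gfn was mapped instead */

    p->have_page = true;
    return 0;
}
```

Whichever path runs first wins; repeating the same path is a no-op, and
switching to the other path fails with -EPERM.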

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Acked-by: George Dunlap <george.dun...@eu.citrix.com>
Reviewed-by: Wei Liu <wei.l...@citrix.com>
---
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>

v11:
 - Addressed more comments from Jan.

v10:
 - Addressed comments from Jan.

v8:
 - Re-base on new boilerplate.
 - Adjust function signature of hvm_get_ioreq_server_frame(), and test
   whether the bufioreq page is present.

v5:
 - Use get_ioreq_server() function rather than indexing array directly.
 - Add more explanation into comments to state that mapping guest frames
   and allocation of pages for ioreq servers are not simultaneously
   permitted.
 - Add a comment into asm/ioreq.h stating the meaning of the index
   value passed to hvm_get_ioreq_server_frame().
---
 xen/arch/x86/hvm/ioreq.c| 154 
 xen/arch/x86/mm.c   |  22 ++
 xen/common/memory.c |   5 ++
 xen/include/asm-x86/hvm/ioreq.h |   2 +
 xen/include/asm-x86/mm.h|   5 ++
 xen/include/public/hvm/dm_op.h  |   4 ++
 xen/include/public/memory.h |   9 +++
 7 files changed, 201 insertions(+)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index f654e7796c..ff41312455 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -259,6 +259,19 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 int rc;
 
+if ( iorp->page )
+{
+/*
+ * If a page has already been allocated (which will happen on
+ * demand if hvm_get_ioreq_server_frame() is called), then
+ * mapping a guest frame is not permitted.
+ */
+if ( gfn_eq(iorp->gfn, INVALID_GFN) )
+return -EPERM;
+
+return 0;
+}
+
 if ( d->is_dying )
 return -EINVAL;
 
@@ -281,6 +294,69 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 return rc;
 }
 
+static int hvm_alloc_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
+{
+struct domain *currd = current->domain;
+struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+
+if ( iorp->page )
+{
+/*
+ * If a guest frame has already been mapped (which may happen
+ * on demand if hvm_get_ioreq_server_info() is called), then
+ * allocating a page is not permitted.
+ */
+if ( !gfn_eq(iorp->gfn, INVALID_GFN) )
+return -EPERM;
+
+return 0;
+}
+
+/*
+ * Allocated IOREQ server pages are assigned to the emulating
+ * domain, not the target domain. This is because the emulator is
+ * likely to be destroyed after the target domain has been torn
+ * down, and we must use MEMF_no_refcount otherwise page allocation
+ * could fail if the emulating domain has already reached its
+ * maximum allocation.
+ */
+iorp->page = alloc_domheap_page(currd, MEMF_no_refcount);
+if ( !iorp->page )
+return -ENOMEM;
+
+if ( !get_page_type(iorp->page, PGT_writable_page) )
+{
+put_page(iorp->page);
+iorp->page = NULL;
+return -ENOMEM;
+}
+
+iorp->va = __map_domain_page_global(iorp->page);
+if ( !iorp->va )
+{
+put_page_and_type(iorp->page);
+iorp->page = NULL;
+return -ENOMEM;
+}
+
+clear_page(iorp->va);
+return 0;
+}
+
+static void hvm_free_ioreq_mfn(struct hvm_ioreq_server *s, bool buf)
+{
+struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
+
+if ( !iorp->page )
+return;
+
+unmap_domain_page_global(iorp->va);
+iorp->va = NULL;
+
+put_page_and_type(iorp->page);
+iorp->page = NULL;
+}
+
 bool is_ioreq_server_page(struct domain *d, const struct page_info *page)
 {
 const struct hvm_ioreq_server *s;
@@ -484,6 +560,27 @@ static void hvm_ioreq_server_unmap_pages(struct 
hvm_ior

[Xen-devel] [PATCH v11 08/11] tools/libxenforeignmemory: add support for resource mapping

2017-10-12 Thread Paul Durrant
A previous patch introduced a new HYPERVISOR_memory_op to acquire guest
resources for direct priv-mapping.

This patch adds new functionality into libxenforeignmemory to make use
of a new privcmd ioctl [1] that uses the new memory op to make such
resources available via mmap(2).

[1] 
http://xenbits.xen.org/gitweb/?p=people/pauldu/linux.git;a=commit;h=ce59a05e6712

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Reviewed-by: Wei Liu <wei.l...@citrix.com>
---
Cc: Ian Jackson <ian.jack...@eu.citrix.com>

v4:
 - Fixed errno and removed single-use label
 - The unmap call now returns a status
 - Use C99 initialization for ioctl struct

v2:
 - Bump minor version up to 3.
---
 tools/include/xen-sys/Linux/privcmd.h  | 11 +
 tools/libs/foreignmemory/Makefile  |  2 +-
 tools/libs/foreignmemory/core.c| 53 ++
 .../libs/foreignmemory/include/xenforeignmemory.h  | 41 +
 tools/libs/foreignmemory/libxenforeignmemory.map   |  5 ++
 tools/libs/foreignmemory/linux.c   | 45 ++
 tools/libs/foreignmemory/private.h | 31 +
 7 files changed, 187 insertions(+), 1 deletion(-)

diff --git a/tools/include/xen-sys/Linux/privcmd.h 
b/tools/include/xen-sys/Linux/privcmd.h
index 732ff7c15a..9531b728f9 100644
--- a/tools/include/xen-sys/Linux/privcmd.h
+++ b/tools/include/xen-sys/Linux/privcmd.h
@@ -86,6 +86,15 @@ typedef struct privcmd_dm_op {
const privcmd_dm_op_buf_t __user *ubufs;
 } privcmd_dm_op_t;
 
+typedef struct privcmd_mmap_resource {
+   domid_t dom;
+   __u32 type;
+   __u32 id;
+   __u32 idx;
+   __u64 num;
+   __u64 addr;
+} privcmd_mmap_resource_t;
+
 /*
  * @cmd: IOCTL_PRIVCMD_HYPERCALL
  * @arg: _hypercall_t
@@ -103,5 +112,7 @@ typedef struct privcmd_dm_op {
_IOC(_IOC_NONE, 'P', 5, sizeof(privcmd_dm_op_t))
 #define IOCTL_PRIVCMD_RESTRICT \
_IOC(_IOC_NONE, 'P', 6, sizeof(domid_t))
+#define IOCTL_PRIVCMD_MMAP_RESOURCE\
+   _IOC(_IOC_NONE, 'P', 7, sizeof(privcmd_mmap_resource_t))
 
 #endif /* __LINUX_PUBLIC_PRIVCMD_H__ */
diff --git a/tools/libs/foreignmemory/Makefile 
b/tools/libs/foreignmemory/Makefile
index ab7f873f26..5c7f78f61d 100644
--- a/tools/libs/foreignmemory/Makefile
+++ b/tools/libs/foreignmemory/Makefile
@@ -2,7 +2,7 @@ XEN_ROOT = $(CURDIR)/../../..
 include $(XEN_ROOT)/tools/Rules.mk
 
 MAJOR= 1
-MINOR= 2
+MINOR= 3
 SHLIB_LDFLAGS += -Wl,--version-script=libxenforeignmemory.map
 
 CFLAGS   += -Werror -Wmissing-prototypes
diff --git a/tools/libs/foreignmemory/core.c b/tools/libs/foreignmemory/core.c
index a6897dc561..8d3f9f178f 100644
--- a/tools/libs/foreignmemory/core.c
+++ b/tools/libs/foreignmemory/core.c
@@ -17,6 +17,8 @@
 #include 
 #include 
 
+#include 
+
 #include "private.h"
 
 xenforeignmemory_handle *xenforeignmemory_open(xentoollog_logger *logger,
@@ -120,6 +122,57 @@ int xenforeignmemory_restrict(xenforeignmemory_handle 
*fmem,
 return osdep_xenforeignmemory_restrict(fmem, domid);
 }
 
+xenforeignmemory_resource_handle *xenforeignmemory_map_resource(
+xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+unsigned int id, unsigned long frame, unsigned long nr_frames,
+void **paddr, int prot, int flags)
+{
+xenforeignmemory_resource_handle *fres;
+int rc;
+
+/* Check flags only contains POSIX defined values */
+if ( flags & ~(MAP_SHARED | MAP_PRIVATE) )
+{
+errno = EINVAL;
+return NULL;
+}
+
+fres = calloc(1, sizeof(*fres));
+if ( !fres )
+{
+errno = ENOMEM;
+return NULL;
+}
+
+fres->domid = domid;
+fres->type = type;
+fres->id = id;
+fres->frame = frame;
+fres->nr_frames = nr_frames;
+fres->addr = *paddr;
+fres->prot = prot;
+fres->flags = flags;
+
+rc = osdep_xenforeignmemory_map_resource(fmem, fres);
+if ( rc )
+{
+free(fres);
+fres = NULL;
+} else
+*paddr = fres->addr;
+
+return fres;
+}
+
+int xenforeignmemory_unmap_resource(
+xenforeignmemory_handle *fmem, xenforeignmemory_resource_handle *fres)
+{
+int rc = osdep_xenforeignmemory_unmap_resource(fmem, fres);
+
+free(fres);
+return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/foreignmemory/include/xenforeignmemory.h 
b/tools/libs/foreignmemory/include/xenforeignmemory.h
index f4814c390f..d594be8df0 100644
--- a/tools/libs/foreignmemory/include/xenforeignmemory.h
+++ b/tools/libs/foreignmemory/include/xenforeignmemory.h
@@ -138,6 +138,47 @@ int xenforeignmemory_unmap(xenforeignmemory_handle *fmem,
 int xenforeignmemory_restrict(xenforeignmemory_handle *fmem,
   domid_t domid);
 
+typedef struct x

[Xen-devel] [PATCH v11 07/11] x86/mm: add an extra command to HYPERVISOR_mmu_update...

2017-10-12 Thread Paul Durrant
...to allow the calling domain to prevent translation of a specified l1e
value.

Despite what the comment in public/xen.h might imply, specifying a
command value of MMU_NORMAL_PT_UPDATE will not simply update an l1e with
the specified value. Instead, mod_l1_entry() tests whether foreign_dom
has PG_translate set in its paging mode and, if it does, assumes that the
pfn value in the l1e is a gfn rather than an mfn.

To allow a PV tools domain to map mfn values from a previously issued
HYPERVISOR_memory_op:XENMEM_acquire_resource, there needs to be a way
to tell HYPERVISOR_mmu_update that the specific l1e value does not
require translation regardless of the paging mode of foreign_dom. This
patch therefore defines a new command value, MMU_PT_UPDATE_NO_TRANSLATE,
which has the same semantics as MMU_NORMAL_PT_UPDATE except that the
paging mode of foreign_dom is ignored and the l1e value is used verbatim.
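
The decision described above can be sketched in isolation as follows.
The demo_* names are hypothetical, and the numeric command values are my
recollection of public/xen.h rather than something stated in this mail,
so treat them as assumptions. The command is taken from the low two bits
of the request's ptr field, matching the "ptr[1:0]" notation used in the
header comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed command encodings (low two bits of ptr). */
enum demo_cmd {
    DEMO_NORMAL_PT_UPDATE       = 0,
    DEMO_PT_UPDATE_PRESERVE_AD  = 2,
    DEMO_PT_UPDATE_NO_TRANSLATE = 3,
};

/* Extract the command from the ptr field of an update request. */
static enum demo_cmd demo_cmd_of(unsigned long ptr)
{
    return (enum demo_cmd)(ptr & 3);
}

/*
 * Should the frame number in the new l1e be treated as a gfn and
 * translated through the foreign domain's P2M?
 */
static bool demo_needs_translation(enum demo_cmd cmd, bool translated_domain)
{
    return cmd != DEMO_PT_UPDATE_NO_TRANSLATE && translated_domain;
}
```

Only the new command suppresses translation; the two existing commands
keep their current behaviour for both translated and non-translated
foreign domains.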

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: Wei Liu <wei.l...@citrix.com>

v8:
 - New in this version, replacing "allow a privileged PV domain to map
   guest mfns".
---
 xen/arch/x86/mm.c| 17 ++---
 xen/include/public/xen.h | 12 +---
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index c9bc4a4e92..3dd5b2c00f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1619,9 +1619,10 @@ void page_unlock(struct page_info *page)
 
 /* Update the L1 entry at pl1e to new value nl1e. */
 static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t nl1e,
-unsigned long gl1mfn, int preserve_ad,
+unsigned long gl1mfn, unsigned int cmd,
 struct vcpu *pt_vcpu, struct domain *pg_dom)
 {
+bool preserve_ad = (cmd == MMU_PT_UPDATE_PRESERVE_AD);
 l1_pgentry_t ol1e;
 struct domain *pt_dom = pt_vcpu->domain;
 int rc = 0;
@@ -1643,7 +1644,8 @@ static int mod_l1_entry(l1_pgentry_t *pl1e, l1_pgentry_t 
nl1e,
 return -EINVAL;
 }
 
-if ( paging_mode_translate(pg_dom) )
+if ( cmd != MMU_PT_UPDATE_NO_TRANSLATE &&
+ paging_mode_translate(pg_dom) )
 {
 page = get_page_from_gfn(pg_dom, l1e_get_pfn(nl1e), NULL, 
P2M_ALLOC);
 if ( !page )
@@ -3258,6 +3260,7 @@ long do_mmu_update(
  */
 case MMU_NORMAL_PT_UPDATE:
 case MMU_PT_UPDATE_PRESERVE_AD:
+case MMU_PT_UPDATE_NO_TRANSLATE:
 {
 p2m_type_t p2mt;
 
@@ -3323,7 +3326,8 @@ long do_mmu_update(
 p2m_query_t q = (l1e_get_flags(l1e) & _PAGE_RW) ?
 P2M_UNSHARE : P2M_ALLOC;
 
-if ( paging_mode_translate(pg_owner) )
+if ( cmd != MMU_PT_UPDATE_NO_TRANSLATE &&
+ paging_mode_translate(pg_owner) )
 target = get_page_from_gfn(pg_owner, l1e_get_pfn(l1e),
_p2mt, q);
 
@@ -3350,9 +3354,7 @@ long do_mmu_update(
 break;
 }
 
-rc = mod_l1_entry(va, l1e, mfn,
-  cmd == MMU_PT_UPDATE_PRESERVE_AD, v,
-  pg_owner);
+rc = mod_l1_entry(va, l1e, mfn, cmd, v, pg_owner);
 if ( target )
 put_page(target);
 }
@@ -3630,7 +3632,8 @@ static int __do_update_va_mapping(
 goto out;
 }
 
-rc = mod_l1_entry(pl1e, val, mfn_x(gl1mfn), 0, v, pg_owner);
+rc = mod_l1_entry(pl1e, val, mfn_x(gl1mfn), MMU_NORMAL_PT_UPDATE, v,
+  pg_owner);
 
 page_unlock(gl1pg);
 put_page(gl1pg);
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index 2ac6b1e24d..d2014a39eb 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -268,6 +268,10 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
  * As MMU_NORMAL_PT_UPDATE above, but A/D bits currently in the PTE are ORed
  * with those in @val.
  *
+ * ptr[1:0] == MMU_PT_UPDATE_NO_TRANSLATE:
+ * As MMU_NORMAL_PT_UPDATE above, but @val is not translated through FD
+ * page tables.
+ *
  * @val is usually the machine frame number along with some attributes.
  * The attributes by default follow the architecture defined bits. Meaning that
  * if this is a X86_64 machine and four page table layout is used, the layout
@@ -334,9 +338,11 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
  *
  * PAT (bit 7 on) --> PWT (bit 3 on) and

[Xen-devel] [PATCH v11 04/11] x86/hvm/ioreq: defer mapping gfns until they are actually requested

2017-10-12 Thread Paul Durrant
A subsequent patch will introduce a new scheme to allow an emulator to
map ioreq server pages directly from Xen rather than the guest P2M.

This patch lays the groundwork for that change by deferring mapping of
gfns until their values are requested by an emulator. To that end, the
pad field of the xen_dm_op_get_ioreq_server_info structure is re-purposed
as a flags field, and a new flag, XEN_DMOP_no_gfns, is defined. The flag
modifies the behaviour of XEN_DMOP_get_ioreq_server_info so that the caller
can avoid requesting the gfn values.
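
As a rough standalone model of the flag handling: unknown flag bits are
rejected, and when the no-gfns flag is set the gfn outputs are passed as
NULL so that nothing needs to be mapped. DEMO_DMOP_NO_GFNS, demo_info,
demo_get_info and demo_dm_op are hypothetical names, and the flag value
and returned gfns are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value; the real flag lives in public/hvm/dm_op.h. */
#define DEMO_DMOP_NO_GFNS (1u << 0)

struct demo_info {
    uint16_t flags;
    uint64_t ioreq_gfn;
    uint64_t bufioreq_gfn;
};

/* Stand-in for hvm_get_ioreq_server_info(): fills only non-NULL outputs. */
static int demo_get_info(uint64_t *ioreq_gfn, uint64_t *bufioreq_gfn)
{
    if ( ioreq_gfn )
        *ioreq_gfn = 0x100;
    if ( bufioreq_gfn )
        *bufioreq_gfn = 0x101;
    return 0;
}

static int demo_dm_op(struct demo_info *data)
{
    const uint16_t valid_flags = DEMO_DMOP_NO_GFNS;
    int no_gfns;

    /* Reject any flag bits this version does not understand. */
    if ( data->flags & ~valid_flags )
        return -EINVAL;

    no_gfns = data->flags & DEMO_DMOP_NO_GFNS;
    return demo_get_info(no_gfns ? NULL : &data->ioreq_gfn,
                         no_gfns ? NULL : &data->bufioreq_gfn);
}
```

Re-using the old pad field this way stays compatible with existing
callers, since they were already required to pass zero there.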

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
Reviewed-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>

v8:
 - For safety make all of the pointers passed to
   hvm_get_ioreq_server_info() optional.
 - Shrink bufioreq_handling down to a uint8_t.

v3:
 - Updated in response to review comments from Wei and Roger.
 - Added a HANDLE_BUFIOREQ macro to make the code neater.
 - This patch no longer introduces a security vulnerability since there
   is now an explicit limit on the number of ioreq servers that may be
   created for any one domain.
---
 tools/libs/devicemodel/core.c   |  8 +
 tools/libs/devicemodel/include/xendevicemodel.h |  6 ++--
 xen/arch/x86/hvm/dm.c   |  9 +++--
 xen/arch/x86/hvm/ioreq.c| 47 ++---
 xen/include/asm-x86/hvm/domain.h|  2 +-
 xen/include/public/hvm/dm_op.h  | 32 ++---
 6 files changed, 63 insertions(+), 41 deletions(-)

diff --git a/tools/libs/devicemodel/core.c b/tools/libs/devicemodel/core.c
index 0f2c1a791f..91c69d103b 100644
--- a/tools/libs/devicemodel/core.c
+++ b/tools/libs/devicemodel/core.c
@@ -188,6 +188,14 @@ int xendevicemodel_get_ioreq_server_info(
 
 data->id = id;
 
+/*
+ * If the caller is not requesting gfn values then instruct the
+ * hypercall not to retrieve them as this may cause them to be
+ * mapped.
+ */
+if (!ioreq_gfn && !bufioreq_gfn)
+data->flags |= XEN_DMOP_no_gfns;
+
 rc = xendevicemodel_op(dmod, domid, 1, &op, sizeof(op));
 if (rc)
 return rc;
diff --git a/tools/libs/devicemodel/include/xendevicemodel.h 
b/tools/libs/devicemodel/include/xendevicemodel.h
index 13216db04a..d73a76da35 100644
--- a/tools/libs/devicemodel/include/xendevicemodel.h
+++ b/tools/libs/devicemodel/include/xendevicemodel.h
@@ -61,11 +61,11 @@ int xendevicemodel_create_ioreq_server(
  * @parm domid the domain id to be serviced
  * @parm id the IOREQ Server id.
  * @parm ioreq_gfn pointer to a xen_pfn_t to receive the synchronous ioreq
- *  gfn
+ *  gfn. (May be NULL if not required)
  * @parm bufioreq_gfn pointer to a xen_pfn_t to receive the buffered ioreq
- *gfn
+ *gfn. (May be NULL if not required)
  * @parm bufioreq_port pointer to a evtchn_port_t to receive the buffered
- * ioreq event channel
+ * ioreq event channel. (May be NULL if not required)
  * @return 0 on success, -1 on failure.
  */
 int xendevicemodel_get_ioreq_server_info(
diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index 9cf53b551c..22fa5b51e3 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -416,16 +416,19 @@ static int dm_op(const struct dmop_args *op_args)
 {
 struct xen_dm_op_get_ioreq_server_info *data =
 _ioreq_server_info;
+const uint16_t valid_flags = XEN_DMOP_no_gfns;
 
 const_op = false;
 
 rc = -EINVAL;
-if ( data->pad )
+if ( data->flags & ~valid_flags )
 break;
 
 rc = hvm_get_ioreq_server_info(d, data->id,
-   &data->ioreq_gfn,
-   &data->bufioreq_gfn,
+   (data->flags & XEN_DMOP_no_gfns) ?
+   NULL : &data->ioreq_gfn,
+   (data->flags & XEN_DMOP_no_gfns) ?
+   NULL : &data->bufioreq_gfn,
 &data->bufioreq_port);
 break;
 }
diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 64bb13cec9..f654e7796c 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -350,6 +350,9 @@ static void hvm_update_ioreq_evtchn(struct hvm_ioreq_server 
*s,
 }
 }
 
+#define HANDLE_BUFIOREQ(s) \
+((s)->bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF)
+

[Xen-devel] [PATCH v11 00/11] x86: guest resource mapping

2017-10-12 Thread Paul Durrant
This series introduces support for direct mapping of guest resources.
The resources are:
 - IOREQ server pages
 - Grant tables

v10:
 - Responded to comments from Jan.

v9:
 - Change to patch #1 only.

v8:
 - Re-ordered series and dropped two patches that have already been
committed.

v7:
 - Fixed assertion failure hit during domain destroy.

v6:
 - Responded to missed comments from Roger.

v5:
 - Responded to review comments from Wei.

v4:
 - Responded to further review comments from Roger.

v3:
 - Dropped original patch #1 since it is covered by Juergen's patch.
 - Added new xenforeignmemory cleanup patch (#4).
 - Replaced the patch introducing the ioreq server 'is_default' flag with
   one that changes the ioreq server list into an array (#8).
  
Paul Durrant (11):
  x86/hvm/ioreq: maintain an array of ioreq servers rather than a list
  x86/hvm/ioreq: simplify code and use consistent naming
  x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page
  x86/hvm/ioreq: defer mapping gfns until they are actually requested
  x86/mm: add HYPERVISOR_memory_op to acquire guest resources
  x86/hvm/ioreq: add a new mappable resource type...
  x86/mm: add an extra command to HYPERVISOR_mmu_update...
  tools/libxenforeignmemory: add support for resource mapping
  tools/libxenforeignmemory: reduce xenforeignmemory_restrict code
footprint
  common: add a new mappable resource type: XENMEM_resource_grant_table
  tools/libxenctrl: use new xenforeignmemory API to seed grant table

 tools/flask/policy/modules/xen.if  |   4 +-
 tools/include/xen-sys/Linux/privcmd.h  |  11 +
 tools/libs/devicemodel/core.c  |   8 +
 tools/libs/devicemodel/include/xendevicemodel.h|   6 +-
 tools/libs/foreignmemory/Makefile  |   2 +-
 tools/libs/foreignmemory/core.c|  53 ++
 tools/libs/foreignmemory/freebsd.c |   7 -
 .../libs/foreignmemory/include/xenforeignmemory.h  |  41 +
 tools/libs/foreignmemory/libxenforeignmemory.map   |   5 +
 tools/libs/foreignmemory/linux.c   |  45 ++
 tools/libs/foreignmemory/minios.c  |   7 -
 tools/libs/foreignmemory/netbsd.c  |   7 -
 tools/libs/foreignmemory/private.h |  43 +-
 tools/libs/foreignmemory/solaris.c |   7 -
 tools/libxc/include/xc_dom.h   |   8 +-
 tools/libxc/xc_dom_boot.c  | 114 ++-
 tools/libxc/xc_sr_restore_x86_hvm.c|  10 +-
 tools/libxc/xc_sr_restore_x86_pv.c |   2 +-
 tools/libxl/libxl_dom.c|   1 -
 tools/python/xen/lowlevel/xc/xc.c  |   6 +-
 xen/arch/x86/hvm/dm.c  |   9 +-
 xen/arch/x86/hvm/ioreq.c   | 829 -
 xen/arch/x86/mm.c  |  39 +-
 xen/arch/x86/mm/p2m.c  |   3 +-
 xen/common/compat/memory.c | 145 +++-
 xen/common/grant_table.c   |  63 +-
 xen/common/memory.c| 137 
 xen/include/asm-x86/hvm/domain.h   |  14 +-
 xen/include/asm-x86/hvm/ioreq.h|   2 +
 xen/include/asm-x86/mm.h   |   5 +
 xen/include/asm-x86/p2m.h  |   3 +
 xen/include/public/hvm/dm_op.h |  36 +-
 xen/include/public/memory.h|  58 +-
 xen/include/public/xen.h   |  12 +-
 xen/include/xen/grant_table.h  |   4 +
 xen/include/xlat.lst   |   1 +
 xen/include/xsm/dummy.h|   6 +
 xen/include/xsm/xsm.h  |   6 +
 xen/xsm/dummy.c|   1 +
 xen/xsm/flask/hooks.c  |   6 +
 xen/xsm/flask/policy/access_vectors|   2 +
 41 files changed, 1267 insertions(+), 501 deletions(-)

---
Cc: Andrew Cooper <andrew.coop...@citrix.com>
Cc: George Dunlap <george.dun...@eu.citrix.com>
Cc: Ian Jackson <ian.jack...@eu.citrix.com>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Stefano Stabellini <sstabell...@kernel.org>
Cc: Tim Deegan <t...@xen.org>
Cc: Wei Liu <wei.l...@citrix.com>

-- 
2.11.0




[Xen-devel] [PATCH v11 03/11] x86/hvm/ioreq: use gfn_t in struct hvm_ioreq_page

2017-10-12 Thread Paul Durrant
This patch adjusts the ioreq server code to use type-safe gfn_t values
where possible. No functional change.
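
For readers unfamiliar with the idiom, a type-safe frame number can be
modelled as a one-member struct with explicit wrap/unwrap helpers, so
that a raw unsigned long cannot be passed where a gfn is expected
without the compiler complaining. The sketch below is a simplified
stand-in; as far as I recall Xen generates the real type and accessors
with a macro, and only the helper names are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of a type-safe guest frame number. */
typedef struct { unsigned long g; } gfn_t;

/* Wrap a raw frame number. */
static inline gfn_t _gfn(unsigned long g) { return (gfn_t){ g }; }

/* Unwrap to the raw frame number. */
static inline unsigned long gfn_x(gfn_t g) { return g.g; }

/* Structs cannot be compared with ==, hence an explicit helper. */
static inline bool gfn_eq(gfn_t a, gfn_t b) { return gfn_x(a) == gfn_x(b); }

#define INVALID_GFN _gfn(~0UL)

/* Hypothetical helper showing the comparison style used in the patch. */
static inline bool slot_is_mapped(gfn_t gfn)
{
    return !gfn_eq(gfn, INVALID_GFN);
}
```

This is why comparisons of the form "gfn == gfn_x(INVALID_GFN)" become
"gfn_eq(gfn, INVALID_GFN)" throughout the diff.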

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reviewed-by: Roger Pau Monné <roger@citrix.com>
Reviewed-by: Wei Liu <wei.l...@citrix.com>
Acked-by: Jan Beulich <jbeul...@suse.com>
---
Cc: Andrew Cooper <andrew.coop...@citrix.com>
---
 xen/arch/x86/hvm/ioreq.c | 44 
 xen/include/asm-x86/hvm/domain.h |  2 +-
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
index 6d81018369..64bb13cec9 100644
--- a/xen/arch/x86/hvm/ioreq.c
+++ b/xen/arch/x86/hvm/ioreq.c
@@ -210,7 +210,7 @@ bool handle_hvm_io_completion(struct vcpu *v)
 return true;
 }
 
-static unsigned long hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
+static gfn_t hvm_alloc_ioreq_gfn(struct hvm_ioreq_server *s)
 {
 struct domain *d = s->domain;
 unsigned int i;
@@ -220,20 +220,19 @@ static unsigned long hvm_alloc_ioreq_gfn(struct 
hvm_ioreq_server *s)
 for ( i = 0; i < sizeof(d->arch.hvm_domain.ioreq_gfn.mask) * 8; i++ )
 {
 if ( test_and_clear_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask) )
-return d->arch.hvm_domain.ioreq_gfn.base + i;
+return _gfn(d->arch.hvm_domain.ioreq_gfn.base + i);
 }
 
-return gfn_x(INVALID_GFN);
+return INVALID_GFN;
 }
 
-static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s,
-   unsigned long gfn)
+static void hvm_free_ioreq_gfn(struct hvm_ioreq_server *s, gfn_t gfn)
 {
 struct domain *d = s->domain;
-unsigned int i = gfn - d->arch.hvm_domain.ioreq_gfn.base;
+unsigned int i = gfn_x(gfn) - d->arch.hvm_domain.ioreq_gfn.base;
 
 ASSERT(!IS_DEFAULT(s));
-ASSERT(gfn != gfn_x(INVALID_GFN));
+ASSERT(!gfn_eq(gfn, INVALID_GFN));
 
 set_bit(i, &d->arch.hvm_domain.ioreq_gfn.mask);
 }
@@ -242,7 +241,7 @@ static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 {
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
-if ( iorp->gfn == gfn_x(INVALID_GFN) )
+if ( gfn_eq(iorp->gfn, INVALID_GFN) )
 return;
 
 destroy_ring_for_helper(&iorp->va, iorp->page);
@@ -251,7 +250,7 @@ static void hvm_unmap_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 if ( !IS_DEFAULT(s) )
 hvm_free_ioreq_gfn(s, iorp->gfn);
 
-iorp->gfn = gfn_x(INVALID_GFN);
+iorp->gfn = INVALID_GFN;
 }
 
 static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, bool buf)
@@ -264,16 +263,17 @@ static int hvm_map_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 return -EINVAL;
 
 if ( IS_DEFAULT(s) )
-iorp->gfn = buf ?
-d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
-d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN];
+iorp->gfn = _gfn(buf ?
+ d->arch.hvm_domain.params[HVM_PARAM_BUFIOREQ_PFN] :
+ d->arch.hvm_domain.params[HVM_PARAM_IOREQ_PFN]);
 else
 iorp->gfn = hvm_alloc_ioreq_gfn(s);
 
-if ( iorp->gfn == gfn_x(INVALID_GFN) )
+if ( gfn_eq(iorp->gfn, INVALID_GFN) )
 return -ENOMEM;
 
-rc = prepare_ring_for_helper(d, iorp->gfn, &iorp->page, &iorp->va);
+rc = prepare_ring_for_helper(d, gfn_x(iorp->gfn), &iorp->page,
+ &iorp->va);
 
 if ( rc )
 hvm_unmap_ioreq_gfn(s, buf);
@@ -309,10 +309,10 @@ static void hvm_remove_ioreq_gfn(struct hvm_ioreq_server 
*s, bool buf)
 struct domain *d = s->domain;
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 
-if ( IS_DEFAULT(s) || iorp->gfn == gfn_x(INVALID_GFN) )
+if ( IS_DEFAULT(s) || gfn_eq(iorp->gfn, INVALID_GFN) )
 return;
 
-if ( guest_physmap_remove_page(d, _gfn(iorp->gfn),
+if ( guest_physmap_remove_page(d, iorp->gfn,
_mfn(page_to_mfn(iorp->page)), 0) )
 domain_crash(d);
 clear_page(iorp->va);
@@ -324,12 +324,12 @@ static int hvm_add_ioreq_gfn(struct hvm_ioreq_server *s, 
bool buf)
 struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
 int rc;
 
-if ( IS_DEFAULT(s) || iorp->gfn == gfn_x(INVALID_GFN) )
+if ( IS_DEFAULT(s) || gfn_eq(iorp->gfn, INVALID_GFN) )
 return 0;
 
 clear_page(iorp->va);
 
-rc = guest_physmap_add_page(d, _gfn(iorp->gfn),
+rc = guest_physmap_add_page(d, iorp->gfn,
 _mfn(page_to_mfn(iorp->page)), 0);
 if ( rc == 0 )
 paging_mark_dirty(d, _mfn(page_to_mfn(iorp->page)));
@@ -590,8 +590,8 @@ static int hvm_ioreq_server_init(struct hvm_ioreq_server *s,
 INIT_LIST_HEAD(&s->ioreq_vcpu_list);
 spin_lock_init(&s->bufioreq_lock);
 
-s->ioreq.gfn = gfn_x(INVALID_GFN);
-s->
