date:20150417

Re: [Xen-devel] Question about DMA on 1:1 mapping dom0 of arm64

2015-04-17 Thread Chen Baozi

Hi Stefano,

On Fri, Apr 17, 2015 at 03:32:20PM +0100, Stefano Stabellini wrote:
> On Fri, 17 Apr 2015, Chen Baozi wrote:
> > Hi all,
> > 
> > According to my recent experience, there might be a problem with swiotlb
> > DMA mapping on a 1:1-mapped arm64 dom0 with large memory. The issue is as
> > follows:
> > 
> > On arm64 servers with large memory, it is possible to set dom0_mem > 4G
> > (e.g. I have one set to 16G). In this case, according to my understanding,
> > there is a chance that the dom0 kernel needs to map some buffers above 4G
> > to do DMA operations (e.g. in the snps,dwmac ethernet driver). However, most
> > DMA engines support only 32-bit physical addresses and thus cannot operate
> > directly on that memory. IIUC, swiotlb is implemented to solve this (using
> > a bounce buffer) if there is no IOMMU or the IOMMU is not enabled on the
> > system. Sadly, it seems that xen_swiotlb_map_page in my dom0 kernel
> > allocates the buffers for DMA above 4G (start_dma_addr = 0x94480), which
> > fails the dma_capable() check, so xen_swiotlb_map_page() is unable to
> > return successfully.
> >
> > If I set dom0_mem to a small value (e.g. 512M), which puts all physical
> > memory of dom0 below 4G, everything works fine.
> > 
> > I am not familiar with swiotlb-xen, so I may have misunderstood the
> > current situation. Correct me if I did or understood anything wrong.
> > 
> > Any ideas?
> 
> I think that the problem is that xen_swiotlb_init doesn't necessarily allocate
> memory under 4G on arm/arm64.
> 
> xen_swiotlb_init calls __get_free_pages to allocate memory, so the pages
> could easily be above 4G.  Subsequently xen_swiotlb_fixup is called on
> the allocated memory range, calling xen_create_contiguous_region and
> passing an address_bits mask. However xen_create_contiguous_region
> doesn't actually do anything at all on ARM.
> 
> I think that given that dom0 is mapped 1:1 on ARM, the easiest and best
> fix would be to simply allocate memory under 4G to begin with. Something
> like (maybe with an ifdef ARM around it):
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 810ad41..22ac33a 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -235,7 +235,7 @@ retry:
>  #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
>  #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
>   while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
> - xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
> + xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN|__GFP_DMA32, order);
>   if (xen_io_tlb_start)
>   break;
>   order--;

I don't know whether something is wrong with __GFP_DMA32 on arm64, but it looks
like it doesn't help...

Here is the memory info about what xen has populated to dom0 (I did some hacks
to allocate_memory_11 to make it map some low memory banks to dom0):

(XEN) Allocating 1:1 mappings totalling 16384MB for dom0:
(XEN) BANK[0] 0x008800-0x009800 (256MB)
(XEN) BANK[1] 0x00a000-0x00f800 (1408MB)
(XEN) BANK[2] 0x04-0x06 (8192MB)
(XEN) BANK[3] 0x068000-0x07 (2048MB)
(XEN) BANK[4] 0x08-0x09 (4096MB)
(XEN) BANK[5] 0x094000-0x095800 (384MB)

And here is the printk output I got when trying to map a DMA page:

enter xen_swiotlb_map_page.
phys = 0x9444e4042, dev_addr = 0x9444e4042, size = 0x600
start_dma_addr = 0x94480
virt_to_phys(xen_io_tlb_start) = 0x94480
Oh Well, have to allocate and map a bounce buffer.
map = 0x94480
dev_addr = 0x94480
*dev->dma_mask = 0x
!dma_capable(0xffc8bd384810, 0x94480, 0x600)

And the patch I used for dom0 hacking:

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 810ad41..96465cf 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -235,7 +235,7 @@ retry:
 #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-   xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
+   xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN|__GFP_DMA32, order);
if (xen_io_tlb_start)
break;
order--;
@@ -391,6 +391,13 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
dma_addr_t dev_addr = xen_phys_to_bus(phys);
 
BUG_ON(dir == DMA_NONE);
+   printk("enter xen_swiotlb_map_page.\n");
+   printk("phys = 0x%lx, dev_addr = 0x%lx, size = 0x%lx\n",
+   phys, dev_addr, size);
+   printk("start_dma_addr = 0x%lx\n", start_dma_addr);
+   printk("

Re: [Xen-devel] converting gatewaydev= from domU.cfg to libvirt.xml fails

2015-04-17 Thread Olaf Hering

On Fri, Apr 17, Jim Fehlig wrote:

> On 04/17/2015 12:50 PM, Olaf Hering wrote:
> >How should this be converted?
> >
> >/etc/init.d/boot.local
> >tap=xentap
> >tunctl -pt ${tap}
> >ip addr add 1.1.1.1/29 dev ${tap}
> >ip link set up dev ${tap}
> >
> >domU.cfg
> >vif=[
> >'mac=00:16:3e:13:01:00,ip=1.1.1.2,type=vif,gatewaydev=xentap,script=vif-route'
> >]
> >
> >
> >The result from "convert-xml xen-xl domU.cfg" is:
> >
> > 
> >   
> >   
> >   
> > 
> >
> >gatewaydev= is missing, the guest will not start.
> 
> libvirt doesn't support 'gatewaydev'.  I'm certainly no expert in this area,
> but is it possible to convert your host config in boot.local to a libvirt
> routed network [1] and then have the guest use it with something like
> 
>   
> 
> ...
>   

I will have to check how this is supposed to work, but it seems the
xen-xl to xml converter should recognize this variant.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] libxl: initialize vfb defbools in libxlMakeVfb

2015-04-17 Thread Olaf Hering

On Fri, Apr 17, Jim Fehlig wrote:

> On 04/17/2015 11:59 AM, Olaf Hering wrote:
> >On Fri, Apr 17, Olaf Hering wrote:
> >
> >>If the domU config has sdl enabled, libvirtd crashes:
> >>libvirtd[5158]: libvirtd: libxl.c:343: libxl_defbool_val: Assertion `!libxl_defbool_is_default(db)' failed.
> >>
> >>Initialize the relevant defbool variables in libxl_device_vfb.
> >Fix one crash, find another:
> >
> >Does libvirt have a representation for this one?
> >
> >   libxl_defbool_val(vfb.sdl.opengl));
> 
> I'm not aware of any way to specify OpenGL in libvirt domXML.

The qemu-dm process runs without DISPLAY=0:0, which leads to a failure
if sdl is enabled.

Once I find the time, I will see whether setting DISPLAY= actually
helps. xl.cfg(5) states that display= and xauthority= are currently
not handled anyway, and libvirt lacks such functionality as well, if I
understand the docs correctly.

Olaf

Re: [Xen-devel] [PATCH] libxl: initialize vfb defbools in libxlMakeVfb

2015-04-17 Thread Olaf Hering

On Fri, Apr 17, Jim Fehlig wrote:

> On 04/17/2015 11:19 AM, Olaf Hering wrote:
> >+libxl_defbool_set(&x_vfb->vnc.enable, 0);
> Not shown here, but just before the switch is
> 
>  libxl_device_vfb_init(x_vfb);
> 
> which IIUC (looking at the impl in $xensrc/tools/libxl/_libxl_types.c)
> should initialize the vfb struct with default values.

Yes, but the default values lead to the assert.
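
The assertion can be understood with a small model of a three-state boolean. The sketch below only illustrates the idea, using hypothetical sketch_* names; it is not libxl's implementation. A value starts out as "default" (undecided), and reading it in that state trips an assert, which is why every such value that is read later has to be set explicitly, or given a default, first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state boolean: 0 means "default", i.e. not decided yet. */
typedef struct {
    int val;
} sketch_defbool;

/* Start in the undecided state, as a generic init function would. */
static void sketch_defbool_init(sketch_defbool *db)
{
    db->val = 0;
}

static bool sketch_defbool_is_default(sketch_defbool db)
{
    return db.val == 0;
}

/* Explicitly decide the value. */
static void sketch_defbool_set(sketch_defbool *db, bool b)
{
    db->val = b ? 1 : -1;
}

/* Decide the value only if nobody has decided it yet. */
static void sketch_defbool_setdefault(sketch_defbool *db, bool b)
{
    if (sketch_defbool_is_default(*db))
        sketch_defbool_set(db, b);
}

/* Reading an undecided value is treated as a programming error. */
static bool sketch_defbool_val(sketch_defbool db)
{
    assert(!sketch_defbool_is_default(db));
    return db.val > 0;
}
```

In this model, initialising a structure leaves its flags undecided on purpose, so a caller that enables one flag and then reads another must still decide the second one before reading it.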

Olaf

Re: [Xen-devel] [PATCH v5 p2 04/19] xen/arm: Implement hypercall DOMCTL_{, un}bind_pt_pirq

2015-04-17 Thread Daniel De Graaf


On 04/16/2015 10:55 AM, Ian Campbell wrote:

On Thu, 2015-04-09 at 16:09 +0100, Julien Grall wrote:

From: Julien Grall 


I've left the XSM related quotes untrimmed and CCd Daniel. I think it's
all code motion (making x86 specific things generic), so perhaps no ack
needed but an opportunity to nack instead ;-)


This seems correct to me.  My initial thought when looking at this problem
was that a distinct XSM hook for the ARM hypercall would be better, but
other than the minor overhead from doing the IRQ label lookups twice, there
is no real reason to prefer that method.  This method has the advantage
of not making more architecture-specific hooks which are sometimes harder
to test/maintain.

Acked-by: Daniel De Graaf 

Re: [Xen-devel] [PATCH v2 0/3] libxl: fd events: Recheck with poll

2015-04-17 Thread Konrad Rzeszutek Wilk

On Thu, Apr 16, 2015 at 07:24:44PM +0100, Ian Jackson wrote:
> Ian Jackson writes
> ("<21807.61130.841852.546...@mariner.uk.xensource.com>"):
> 
> Gah, mangled the subject line.

I've tested those three patches and they work fine. To confirm that
they fix the problem, I went back to Xen before these patches
to reproduce the failure... and of course I can't.

I am going to try next week to reproduce the error case
to make sure the patches do fix the issue.

> 
> Ian.

Re: [Xen-devel] converting gatewaydev= from domU.cfg to libvirt.xml fails

2015-04-17 Thread Jim Fehlig


On 04/17/2015 12:50 PM, Olaf Hering wrote:

How should this be converted?

/etc/init.d/boot.local
tap=xentap
tunctl -pt ${tap}
ip addr add 1.1.1.1/29 dev ${tap}
ip link set up dev ${tap}

domU.cfg
vif=[
'mac=00:16:3e:13:01:00,ip=1.1.1.2,type=vif,gatewaydev=xentap,script=vif-route'
]


The result from "convert-xml xen-xl domU.cfg" is:

 
   
   
   
 

gatewaydev= is missing, the guest will not start.


libvirt doesn't support 'gatewaydev'.  I'm certainly no expert in this area, but 
is it possible to convert your host config in boot.local to a libvirt routed 
network [1] and then have the guest use it with something like


  

...
  

Regards,
Jim

[1] http://libvirt.org/formatnetwork.html#examplesRoute


Re: [Xen-devel] [PATCH 2/3] libxl: fd events: Break out fd_occurs

2015-04-17 Thread Konrad Rzeszutek Wilk

On Thu, Apr 16, 2015 at 07:23:27PM +0100, Ian Jackson wrote:
> No functional change, only code motion.
> 
> Currently, contrary to this function's name, there are two sites where
> efd->func() is called so one of them doesn't go through here just yet.
> That will be dealt with in the next commit.

s/next commit/"libxl: fd events: Suppress spurious fd events"/

> 
> Signed-off-by: Ian Jackson 
> CC: Jim Fehlig 
> CC: Konrad Rzeszutek Wilk 
> ---
>  tools/libxl/libxl_event.c |   13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
> index 3efb357..2b2254e 100644
> --- a/tools/libxl/libxl_event.c
> +++ b/tools/libxl/libxl_event.c
> @@ -1121,6 +1121,14 @@ static int afterpoll_check_fd(libxl__poller *poller,
>  return revents;
>  }
>  
> +static void fd_occurs(libxl__egc *egc, libxl__ev_fd *efd, short revents)
> +{
> +DBG("ev_fd=%p occurs fd=%d events=%x revents=%x",
> +efd, efd->fd, efd->events, revents);
> +
> +efd->func(egc, efd, efd->fd, efd->events, revents);
> +}
> +
>  static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller,
> int nfds, const struct pollfd *fds,
> struct timeval now)
> @@ -1183,10 +1191,7 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller,
>  break;
>  
>  found_fd_event:
> -DBG("ev_fd=%p occurs fd=%d events=%x revents=%x",
> -efd, efd->fd, efd->events, revents);
> -
> -efd->func(egc, efd, efd->fd, efd->events, revents);
> +fd_occurs(egc, efd, revents);
>  }
>  
>  if (afterpoll_check_fd(poller,fds,nfds, poller->wakeup_pipe[0],POLLIN)) {
> -- 
> 1.7.10.4
> 

Re: [Xen-devel] [PATCH] libxl: include a XLU_Config in _libxlDriverConfig

2015-04-17 Thread Jim Fehlig


On 04/17/2015 11:44 AM, Olaf Hering wrote:

Upcoming changes for vscsi will use libxlutil.so to prepare the
configuration for libxl. The helpers need an xlu struct for logging.
Provide one and reuse the existing output as the log target.

Signed-off-by: Olaf Hering 
Cc: Jim Fehlig 
---
  src/libxl/libxl_conf.c | 6 ++++++
  src/libxl/libxl_conf.h | 2 ++
  2 files changed, 8 insertions(+)

diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index 8b76fc7..43712b3 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -1421,6 +1421,12 @@ libxlDriverConfigNew(void)
  goto error;
  }
  
+cfg->xlu = xlu_cfg_init(cfg->logger_file, "libvirt");

+if (!cfg->xlu) {
+VIR_ERROR(_("cannot create xlu for libxenlight, disabling driver"));
+goto error;
+}
+
  if (libxl_ctx_alloc(&cfg->ctx, LIBXL_VERSION, 0, cfg->logger)) {
  VIR_ERROR(_("cannot initialize libxenlight context, probably not "
  "running in a Xen Dom0, disabling driver"));
diff --git a/src/libxl/libxl_conf.h b/src/libxl/libxl_conf.h
index 59389d1..fd2459f 100644
--- a/src/libxl/libxl_conf.h
+++ b/src/libxl/libxl_conf.h
@@ -27,6 +27,7 @@
  # define LIBXL_CONF_H
  
  # include 

+# include 


I needed this too for parsing xl disk config, but until recently it was not installed


http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=8ff079803677b82195addebc0e88f1630cb7354b

I added a check for libxlutil.h to libvirt's configure script. HAVE_LIBXLUTIL_H 
will be defined if it exists.  Currently it is only used in src/xenconfig/xen_xl.c.


Regards,
Jim

  
  # include "internal.h"

  # include "libvirt_internal.h"
@@ -90,6 +91,7 @@ struct _libxlDriverConfig {
  /* log stream for driver-wide libxl ctx */
  FILE *logger_file;
  xentoollog_logger *logger;
+XLU_Config *xlu;
  /* libxl ctx for driver wide ops; getVersion, getNodeInfo, ... */
  libxl_ctx *ctx;
  




Re: [Xen-devel] [PATCH] libxl: initialize vfb defbools in libxlMakeVfb

2015-04-17 Thread Jim Fehlig


On 04/17/2015 11:59 AM, Olaf Hering wrote:

On Fri, Apr 17, Olaf Hering wrote:


If the domU config has sdl enabled, libvirtd crashes:
libvirtd[5158]: libvirtd: libxl.c:343: libxl_defbool_val: Assertion `!libxl_defbool_is_default(db)' failed.

Initialize the relevant defbool variables in libxl_device_vfb.

Fix one crash, find another:

Does libvirt have a representation for this one?

   libxl_defbool_val(vfb.sdl.opengl));


I'm not aware of any way to specify OpenGL in libvirt domXML.


If not, it should be initialized to false in libxlMakeVfb.


As before, seems like the init function should handle this.

Regards,
Jim


Re: [Xen-devel] [PATCH] libxl: initialize vfb defbools in libxlMakeVfb

2015-04-17 Thread Jim Fehlig


On 04/17/2015 11:19 AM, Olaf Hering wrote:

If the domU config has sdl enabled, libvirtd crashes:
libvirtd[5158]: libvirtd: libxl.c:343: libxl_defbool_val: Assertion `!libxl_defbool_is_default(db)' failed.


The assertion seems harsh considering the offense...



Initialize the relevant defbool variables in libxl_device_vfb.

Signed-off-by: Olaf Hering 
Cc: Jim Fehlig 
---

Seen in 1.2.14.

  src/libxl/libxl_conf.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index 9b3c949..6feb7d9 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -1232,6 +1232,7 @@ libxlMakeVfb(virPortAllocatorPtr graphicsports,
  switch (l_vfb->type) {
  case VIR_DOMAIN_GRAPHICS_TYPE_SDL:
  libxl_defbool_set(&x_vfb->sdl.enable, 1);
+libxl_defbool_set(&x_vfb->vnc.enable, 0);


Not shown here, but just before the switch is

 libxl_device_vfb_init(x_vfb);

which IIUC (looking at the impl in $xensrc/tools/libxl/_libxl_types.c) should 
initialize the vfb struct with default values.


Regards,
Jim



  if (VIR_STRDUP(x_vfb->sdl.display, l_vfb->data.sdl.display) < 0)
  return -1;
  if (VIR_STRDUP(x_vfb->sdl.xauthority, l_vfb->data.sdl.xauth) < 0)
@@ -1239,6 +1240,7 @@ libxlMakeVfb(virPortAllocatorPtr graphicsports,
  break;
  case  VIR_DOMAIN_GRAPHICS_TYPE_VNC:
  libxl_defbool_set(&x_vfb->vnc.enable, 1);
+libxl_defbool_set(&x_vfb->sdl.enable, 0);
  /* driver handles selection of free port */
  libxl_defbool_set(&x_vfb->vnc.findunused, 0);
  if (l_vfb->data.vnc.autoport) {




Re: [Xen-devel] Question about the tools/misc/xen-hptool

2015-04-17 Thread Andrew Cooper

On 17/04/15 19:48, Meng Xu wrote:
> Hi,
>
> I'm trying to use the xen-hptool to make one memory page offline. I
> tried to search around to see how to use this tool but didn't find any
> manual/tutorial about it.

First off, let me say that I have never used this interface before, so
the rest of this email is somewhat guesswork.

>
> I had a look at the code of xen-hptool.c and tried the command
> 'xen-hptool mem-offline', but it does not work (probably because of
> incorrect system configuration, IMO).  I would really appreciate it if
> anyone could guide me a little bit on how to use this tool. (Maybe a wiki
> page about how to use this tool would be beneficial?)
>
> I used the latest Xen code in the staging branch of the xen repository. I
> created a guest PV domain vm1 on Xen with the credit scheduler.
>
> I first found one page of dom1, mfn=0x16c47a, by using the xen-mfndump tool.
> # xen-mfndump dump-p2m 1
>
>  --- Dumping P2M for domain 1 ---
>
>  Guest Width: 8, PT Levels: 4 P2M size: = 262144
>
> ...(omit the output of other pfn entries)
>
>   pfn=0x1a ==> mfn=0x16c47a (type 0x0)
>
> ...(omit the output of other pfn entries)
>
>
> I want to make mfn 0x16c47a offline, so I issued the following command:
>
> # xen-hptool mem-offline 0x16c47a
>
> Prepare to offline MEMORY mfn 16c47a
>
> DOM1: No suspend port, try live migration
>
> Failed to suspend guest 1 for mfn 16c47a
>
>
> I checked the output of 'xenstore-ls', and found that  dom1 does not
> set up the suspend channel.
>
> This is the part of output of xenstore-ls for dom1
>
> device = ""
>
> suspend = ""
>
>  event-channel = ""
>
> I'm guessing this is the reason why xen-hptool mem-offline does not work.
>
>
> ===My questions are:===
> 1) How should I set up dom1 so that xen-hptool mem-offline
> works and the above issue is solved? In other words, how should I
> configure the suspend event-channel for dom1?
> (I'm not so familiar with the event channel mechanism in Xen. Any
> advice/help on how to make this work is really appreciated.)

It is part of the suspend/resume mechanism, and is strictly optional according
to docs/misc/xenstore-paths.markdown.

>
> 2) Can I use the libxl_domain_suspend(ctx, domid, fd, 0, NULL) to
> suspend the domain instead of using the event channel?
>
> 3) Do I have to suspend the guest domain before making one page of the
> domain offline? Can I just pause the domain and make the page offline?

It would appear that the idea is to simulate a suspend without actually
doing a full suspend, so that when the guest resumes and rescans its
p2m, it notices the broken page.

This a) can only work for PV guests, and b) seems very silly.  A better
alternative would surely be to deliver a RAM MCE to the guest, which
would then also work for HVM guests.

~Andrew

Re: [Xen-devel] tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]

2015-04-17 Thread Prashant Sreedharan

On Fri, 2015-04-17 at 15:12 -0400, David Miller wrote:
> From: Konrad Rzeszutek Wilk 
> Date: Fri, 17 Apr 2015 15:04:48 -0400
> 
> > From 9e417af099e3cee2b219ab28ffc1e96b0564b213 Mon Sep 17 00:00:00 2001
> > From: Konrad Rzeszutek Wilk 
> > Date: Fri, 17 Apr 2015 14:55:47 -0400
> > Subject: [PATCH] config: Enable NEED_DMA_MAP_STATE when SWIOTLB is selected
> > 
> > A huge number of NIC drivers use the DMA API; however, if compiled
> > for 32-bit, a very important part of the DMA API can be omitted, leading
> > to the drivers not working at all (especially if used with
> > 'swiotlb=force iommu=soft').
> > 
> > As Prashant Sreedharan explains it: "the driver [tg3] uses
> > DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma
> > "mapping" and dma_unmap_addr() to get the "mapping" value. On most of
> > the platforms this is a no-op, but ... with "iommu=soft and
> > swiotlb=force" this house keeping is required, ... otherwise
> > we pass 0 while calling pci_unmap_/pci_dma_sync_ instead of the
> > DMA address."
> > 
> > As such enable this even when using 32-bit kernels.
> > 
> > Reported-by: Ian Jackson 
> > Signed-off-by: Konrad Rzeszutek Wilk 
> 
> Acked-by: David S. Miller 

Acked-by: Prashant Sreedharan 



Re: [Xen-devel] tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]

2015-04-17 Thread David Miller

From: Konrad Rzeszutek Wilk 
Date: Fri, 17 Apr 2015 15:04:48 -0400

> From 9e417af099e3cee2b219ab28ffc1e96b0564b213 Mon Sep 17 00:00:00 2001
> From: Konrad Rzeszutek Wilk 
> Date: Fri, 17 Apr 2015 14:55:47 -0400
> Subject: [PATCH] config: Enable NEED_DMA_MAP_STATE when SWIOTLB is selected
> 
> A huge number of NIC drivers use the DMA API; however, if compiled
> for 32-bit, a very important part of the DMA API can be omitted, leading
> to the drivers not working at all (especially if used with
> 'swiotlb=force iommu=soft').
> 
> As Prashant Sreedharan explains it: "the driver [tg3] uses
> DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma
> "mapping" and dma_unmap_addr() to get the "mapping" value. On most of
> the platforms this is a no-op, but ... with "iommu=soft and
> swiotlb=force" this house keeping is required, ... otherwise
> we pass 0 while calling pci_unmap_/pci_dma_sync_ instead of the
> DMA address."
> 
> As such enable this even when using 32-bit kernels.
> 
> Reported-by: Ian Jackson 
> Signed-off-by: Konrad Rzeszutek Wilk 

Acked-by: David S. Miller 

Re: [Xen-devel] tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]

2015-04-17 Thread Konrad Rzeszutek Wilk

On Fri, Apr 17, 2015 at 10:46:09AM -0700, Michael Chan wrote:
> On Fri, 2015-04-17 at 13:19 -0400, David Miller wrote:
> > So the gist of the situation is that NEED_DMA_MAP_STATE can be 'n' in
> > situations where we might actually need it to be 'y' based upon kernel
> > command line boot options given.
> > 
> > Right?
> 
> Yes.

Would this work?

Peter, Ingo, Thomas, pls see Prashant's thread:

http://www.spinics.net/lists/netdev/msg325645.html
http://www.spinics.net/lists/netdev/msg325774.html

Thank you.

From 9e417af099e3cee2b219ab28ffc1e96b0564b213 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk 
Date: Fri, 17 Apr 2015 14:55:47 -0400
Subject: [PATCH] config: Enable NEED_DMA_MAP_STATE when SWIOTLB is selected

A huge number of NIC drivers use the DMA API; however, if compiled
for 32-bit, a very important part of the DMA API can be omitted, leading
to the drivers not working at all (especially if used with
'swiotlb=force iommu=soft').

As Prashant Sreedharan explains it: "the driver [tg3] uses
DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma
"mapping" and dma_unmap_addr() to get the "mapping" value. On most of
the platforms this is a no-op, but ... with "iommu=soft and
swiotlb=force" this house keeping is required, ... otherwise
we pass 0 while calling pci_unmap_/pci_dma_sync_ instead of the
DMA address."

As such enable this even when using 32-bit kernels.

Reported-by: Ian Jackson 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 arch/x86/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b7d31ca..570c71d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -177,7 +177,7 @@ config SBUS

 config NEED_DMA_MAP_STATE
def_bool y
-   depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG
+   depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG || SWIOTLB

 config NEED_SG_DMA_LENGTH
def_bool y
-- 
2.1.0
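
The effect Prashant describes can be reproduced in a small standalone sketch. It mimics the pattern of the kernel's DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() and dma_unmap_addr() helpers with hypothetical sketch_* names. SKETCH_NEED_DMA_MAP_STATE stands in for CONFIG_NEED_DMA_MAP_STATE and is an assumption of this example, not a real kernel symbol. When the switch is off, the set operation compiles to nothing and the get operation always yields 0, which is the value that would then reach the unmap call.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sketch_dma_addr_t;

#ifdef SKETCH_NEED_DMA_MAP_STATE
/* State is kept: the field exists and the accessors touch it. */
#define SKETCH_DEFINE_DMA_UNMAP_ADDR(name)         sketch_dma_addr_t name;
#define sketch_dma_unmap_addr(ptr, name)           ((ptr)->name)
#define sketch_dma_unmap_addr_set(ptr, name, val)  (((ptr)->name) = (val))
#else
/* State is dropped: no field, set is a no-op, get always yields 0. */
#define SKETCH_DEFINE_DMA_UNMAP_ADDR(name)
#define sketch_dma_unmap_addr(ptr, name)           (0)
#define sketch_dma_unmap_addr_set(ptr, name, val)  do { } while (0)
#endif

/* Hypothetical per-buffer bookkeeping of a driver. */
struct sketch_rx_buf {
    void *data;
    SKETCH_DEFINE_DMA_UNMAP_ADDR(mapping) /* the macro supplies its own ';' */
};

/* Record the DMA address returned by a map call. */
static void sketch_remember_mapping(struct sketch_rx_buf *buf,
                                    sketch_dma_addr_t addr)
{
    assert(buf);
    (void)buf;  /* both parameters are unused when the state is dropped */
    (void)addr;
    sketch_dma_unmap_addr_set(buf, mapping, addr);
}

/* The address a driver would later hand to its unmap or sync call. */
static sketch_dma_addr_t sketch_addr_for_unmap(struct sketch_rx_buf *buf)
{
    assert(buf);
    (void)buf;
    return sketch_dma_unmap_addr(buf, mapping);
}
```

Built without -DSKETCH_NEED_DMA_MAP_STATE, sketch_addr_for_unmap() returns 0 no matter what was recorded; built with it, the recorded address comes back. That is harmless only as long as the unmap path never needs the address, which is not the case once bounce buffers are forced.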

[Xen-devel] converting gatewaydev= from domU.cfg to libvirt.xml fails

2015-04-17 Thread Olaf Hering


How should this be converted?

/etc/init.d/boot.local
tap=xentap
tunctl -pt ${tap}
ip addr add 1.1.1.1/29 dev ${tap}
ip link set up dev ${tap}

domU.cfg
vif=[
'mac=00:16:3e:13:01:00,ip=1.1.1.2,type=vif,gatewaydev=xentap,script=vif-route'
]


The result from "convert-xml xen-xl domU.cfg" is:


  
  
  


gatewaydev= is missing, the guest will not start.

Olaf

[Xen-devel] Question about the tools/misc/xen-hptool

2015-04-17 Thread Meng Xu

Hi,

I'm trying to use the xen-hptool to make one memory page offline. I
tried to search around to see how to use this tool but didn't find any
manual/tutorial about it.

I had a look at the code of xen-hptool.c and tried the command
'xen-hptool mem-offline', but it does not work (probably because of
incorrect system configuration, IMO).  I would really appreciate it if
anyone could guide me a little bit on how to use this tool. (Maybe a wiki
page about how to use this tool would be beneficial?)

I used the latest Xen code in the staging branch of the xen repository. I
created a guest PV domain vm1 on Xen with the credit scheduler.

I first found one page of dom1, mfn=0x16c47a, by using the xen-mfndump tool.
# xen-mfndump dump-p2m 1

 --- Dumping P2M for domain 1 ---

 Guest Width: 8, PT Levels: 4 P2M size: = 262144

...(omit the output of other pfn entries)

  pfn=0x1a ==> mfn=0x16c47a (type 0x0)

...(omit the output of other pfn entries)


I want to make mfn 0x16c47a offline, so I issued the following command:

# xen-hptool mem-offline 0x16c47a

Prepare to offline MEMORY mfn 16c47a

DOM1: No suspend port, try live migration

Failed to suspend guest 1 for mfn 16c47a


I checked the output of 'xenstore-ls', and found that  dom1 does not
set up the suspend channel.

This is the part of output of xenstore-ls for dom1

device = ""

suspend = ""

 event-channel = ""

I'm guessing this is the reason why xen-hptool mem-offline does not work.


===My questions are:===
1) How should I set up dom1 so that xen-hptool mem-offline
works and the above issue is solved? In other words, how should I
configure the suspend event-channel for dom1?
(I'm not so familiar with the event channel mechanism in Xen. Any
advice/help on how to make this work is really appreciated.)

2) Can I use the libxl_domain_suspend(ctx, domid, fd, 0, NULL) to
suspend the domain instead of using the event channel?

3) Do I have to suspend the guest domain before making one page of the
domain offline? Can I just pause the domain and make the page offline?
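
As a sanity check on the numbers in the xen-mfndump output above, the frame counts can be converted to byte sizes. The sketch assumes 4 KiB pages (the guest here is x86_64) and uses hypothetical helper names; it is not part of any Xen tool.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */

/* Start address of a frame, given its frame number (pfn or mfn). */
static uint64_t sketch_frame_to_addr(uint64_t frame)
{
    return frame << SKETCH_PAGE_SHIFT;
}

/* Amount of memory, in KiB, covered by a number of page frames. */
static uint64_t sketch_frames_to_kib(uint64_t nr_frames)
{
    assert(nr_frames > 0);
    return (nr_frames << SKETCH_PAGE_SHIFT) / 1024;
}
```

Under these assumptions, a P2M of 262144 entries corresponds to 1048576 KiB, which agrees with the memory/static-max value in the xenstore-ls output, and mfn 0x16c47a starts at machine address 0x16c47a000.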


===
Below is the output of the command  `xenstore-ls` in case you need
more information.

local = ""

 domain = ""

  0 = ""

   domid = "0"

   name = "Domain-0"

   device-model = ""

0 = ""

 state = "running"

   libxl = ""

disable_udev = "1"

   backend = ""

vbd = ""

 1 = ""

  51712 = ""

   frontend = "/local/domain/1/device/vbd/51712"

   params = "/home/pennpanda/Research/rt-xen/images/vm1.img"

   script = "/etc/xen/scripts/block"

   frontend-id = "1"

   online = "1"

   removable = "0"

   bootable = "1"

   state = "4"

   dev = "xvda"

   type = "phy"

   mode = "w"

   device-type = "disk"

   discard-enable = "1"

   node = "/dev/loop0"

   physical-device = "7:0"

   hotplug-status = "connected"

   feature-flush-cache = "1"

   discard-granularity = "4096"

   discard-alignment = "0"

   discard-secure = "0"

   feature-discard = "1"

   feature-barrier = "1"

   feature-persistent = "1"

   feature-max-indirect-segments = "256"

   sectors = "52428800"

   info = "0"

   sector-size = "512"

   physical-sector-size = "512"

console = ""

 1 = ""

  0 = ""

   frontend = "/local/domain/1/console"

   frontend-id = "1"

   online = "1"

   state = "1"

   protocol = "vt100"

vif = ""

 1 = ""

  0 = ""

   frontend = "/local/domain/1/device/vif/0"

   frontend-id = "1"

   online = "1"

   state = "4"

   script = "/etc/xen/scripts/vif-bridge"

   mac = "00:16:3e:0f:49:33"

   bridge = "xenbr0"

   handle = "0"

   type = "vif"

   feature-sg = "1"

   feature-gso-tcpv4 = "1"

   feature-rx-copy = "1"

   feature-rx-flip = "0"

   feature-split-event-channels = "1"

   hotplug-status = "connected"

  1 = ""

   vm = "/vm/72df1959-a67e-4606-8a0b-cc18a9864f31"

   name = "vm1"

   cpu = ""

0 = ""

 availability = "online"

1 = ""

 availability = "online"

   memory = ""

static-max = "1048576"

target = "1048576"

videoram = "0"

   device = ""

suspend = ""

 event-channel = ""

vbd = ""

 51712 = ""

  backend = "/local/domain/0/backend/vbd/1/51712"

  backend-id = "0"

  state = "4"

  virtual-device = "51712"

  device-type = "disk"

  protocol = "x86_64-abi"

  ring-ref = "8"

  event-channel = "15"

vif = ""

 0 = ""

  backend = "/local/domain/0/backend/vif/1/0"

  backend-id = "0"

  state = "4"

  handle = "0"

  mac = "00:16:3e:0f:49:33"

  tx-ring-ref = "768"

  rx-ring-ref = "769"

  event-channel = "16"

  request-rx-copy = "1"

  feature-rx-notify = "1"

  feature-sg = "1"

  feature-gso-tcpv4 = "1"

   control = ""

shutdown = ""

platform-feature-multiprocessor-suspend = "1"

plat

Re: [Xen-devel] Is it ok to routing periperal irq to any Domain0's vCPU on Xen ARM 4.5.x?

2015-04-17 Thread 신정섭

 
No.

By "peripheral IRQ routing" I mean that Xen itself selects one of
Domain0's vCPUs to inject a peripheral IRQ into.

So the simple peripheral IRQ routing code below is an example of peripheral
IRQ routing: the peripheral IRQ is injected into Domain0's vcpu0 or vcpu1
without using vGIC information.

I know that a peripheral IRQ can be processed on any CPU in Linux,
so all of Domain0's vCPUs can process a peripheral IRQ injected by Xen.

On Xen 4.4.1 my simple peripheral IRQ routing code (below) works well,
but on Xen 4.5.0 it doesn't.
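
The experiment boils down to alternating the target vCPU on every injection. A minimal sketch of just that selection logic, outside Xen, is given here. The names are hypothetical, and a real hypervisor path would also have to make the counter update safe against interrupts arriving concurrently on several physical CPUs, which this sketch does not attempt.

```c
#include <assert.h>

/* Hypothetical global counter, playing the role of 'flag' in the mail. */
static unsigned int sketch_flag;

/*
 * Pick the next vCPU index in round-robin order. Because the counter is
 * incremented before use (as with ++flag % 2), the first pick for two
 * vCPUs is index 1, then 0, then 1, and so on.
 */
static unsigned int sketch_next_vcpu(unsigned int nr_vcpus)
{
    assert(nr_vcpus > 0);
    return ++sketch_flag % nr_vcpus;
}
```

The sketch only shows which index is chosen; it says nothing about whether the rest of the interrupt controller emulation expects a given IRQ to stay on one vCPU.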
 
 
-Original Message-
From: "Ian Campbell"
To: "신정섭"
Cc: "Stefano Stabellini"
Sent: 2015-04-17 (Fri) 18:49:39
Subject: Re: [Xen-devel] Is it ok to routing periperal irq to any Domain0's vCPU on Xen ARM 4.5.x?
 
On Fri, 2015-04-17 at 11:36 +0900, 신정섭 wrote:
>  
> 
> I'm studying peripheral IRQ routing to Domain0's vCPUs

What do you mean by "peripheral irq routing"? Do you mean supporting the
guest writing to GICD_ITARGETSR to cause an interrupt to be injected to a
specific vcpu?

I thought that was supposed to work, Stefano?

> 
>  
> 
> I'm testing on an Arndale Board, and Domain 0 has 2 vCPUs.
> 
> So Xen can select vcpu0 or vcpu1 to inject the peripheral IRQ into.
> 
> I tested peripheral IRQ routing on Xen 4.4.1 and it works well.
> 
> But when I tested peripheral IRQ routing on Xen 4.5.0, the IRQ doesn't
> work well.
> 
> So I tested very simple peripheral IRQ routing code like this.
> 
> 'flag' is a global variable.
> 
>  
> 
> * In the "do_IRQ" function on Xen 4.4.1
> 
> -
> 
> - from
> 
>     if ( desc->status & IRQ_GUEST )
>     {
>         struct domain *d = action->dev_id;
> 
>         desc->handler->end(desc);
> 
>         desc->status = IRQ_INPROGRESS;
>         desc->arch.eoi_cpu = smp_processor_id();
> 
>         /* XXX: inject irq into all guest vcpus */
>         vgic_vcpu_inject_irq(d->vcpu[0], irq, 0);
>         goto out_no_end;
>     }
> 
> - to
> 
>     if ( desc->status & IRQ_GUEST )
>     {
>         struct domain *d = action->dev_id;
> 
>         desc->handler->end(desc);
> 
>         desc->status = IRQ_INPROGRESS;
>         desc->arch.eoi_cpu = smp_processor_id();
> 
>         /* XXX: inject irq into all guest vcpus */
>         vgic_vcpu_inject_irq(d->vcpu[++flag % 2], irq, 0);
>         goto out_no_end;
>     }
> 
> -
> 
>  
> 
> * In the "vgic_vcpu_inject_spi" function on Xen 4.5.0
> 
> -
> 
> - from
> 
>     void vgic_vcpu_inject_spi(struct domain *d, unsigned int irq)
>     {
>         struct vcpu *v;
> 
>         /* the IRQ needs to be an SPI */
>         ASSERT(irq >= 32 && irq <= gic_number_lines());
> 
>         v = vgic_get_target_vcpu(d->vcpu[0], irq);
>         vgic_vcpu_inject_irq(v, irq);
>     }
> 
> - to
> 
>     void vgic_vcpu_inject_spi(struct domain *d, unsigned int irq)
>     {
>         struct vcpu *v;
> 
>         /* the IRQ needs to be an SPI */
>         ASSERT(irq >= 32 && irq <= gic_number_lines());
> 
>         vgic_vcpu_inject_irq(d->vcpu[++flag % 2], irq);
>     }
> 
> -
> 
> so the peripheral irq is injected to Domain0's vCPU0 or vCPU1.
>
> On Xen 4.4.1 it works well and I can confirm that the peripheral irq
> is routed to vcpu0 or vcpu1 by using the cat /proc/interrupts command.
>
> * cat /proc/interrupts command on Xen 4.4.1
> --
>         CPU0   CPU1
>   27:   8690   8558  GIC 27 arch_timer
>   31:     34      1  GIC 31 events
>   65:      0      0  GIC 65 1080.mdma
>   66:      0      0  GIC 66 121a.pdma
>   67:      0      0  GIC 67 121b.pdma
>   74:      0      0  GIC 74 101d.watchdog
>   75:      0      0  GIC 75 s3c2410-rtc alarm
>   76:      0      0  GIC 76 s3c2410-rtc tick
>   77:      0      0  GIC 77 1340.pinctrl
>   78:      0      0  GIC 78 1140.pinctrl
>   79:      0      0  GIC 79 386.pinctrl
>   82:      0      0  GIC 82 10d1.pinctrl
>   88:    229    233  GIC 88 12c6.i2c
>   90:      0      0  GIC 90 12c8.i2c
>   91:      0      0  GIC 91 12c9.i2c
>   96:      0      0  GIC 96 12ce.i2c
>   97:      0      0  GIC 97 1006.tmu
>  103:    257    246  GIC 103 ehci_hcd:usb3, ohci_hcd:usb4
>  104:      0      0  GIC 104 xhci-hcd:usb1
>  107:    710    710  GIC 107 dw-mci
>  109:   9602   9610  GIC 109 dw-mci
>  156:      0      0  GIC 156 11c1.mdma
>  160:      0      0  xen-dyn-event xenbus
>  183:      1      0  exynos_wkup_irq_chip 2 s5m8767
>  184:     33      0  xen-percpu-virq hvc_console
>  185:      0      0  s5m8767 12 rtc-alarm0
>  186:      0      0  exynos_wkup_irq_chip 4 SW-TACT2
>  187:      0      0  exynos_wkup_irq_chip 5 SW-TACT3
>  188:      0      0  exynos_wkup_irq_chip 6 SW-TACT4
>  189:      0      0  exynos_wkup_irq_chip 7 SW-TACT5
>  190:      0      0  exynos_wkup_irq_chip 0 SW-TACT6
>  191:      0      0  exynos_wkup_irq_chip 1 SW-TACT7
> IPI0:      0      0  CPU wakeup interrupts
> IPI1:      0      0  Timer broadcast interrupts
> IPI2:   6660   6920  Rescheduling interrupts
> IPI3:      0      0  Function call interrupts
> IPI4:      9      3  Single function call interrupts
> IPI5:      0      0  CPU stop interrupts
> IPI6:      0      0  IRQ work interrupts

Re: [Xen-devel] [PATCH] libxl: initialize vfb defbools in libxlMakeVfb

2015-04-17 Thread Olaf Hering

On Fri, Apr 17, Olaf Hering wrote:

> If the domU config has sdl enabled, libvirtd crashes:
> libvirtd[5158]: libvirtd: libxl.c:343: libxl_defbool_val: Assertion `!libxl_defbool_is_default(db)' failed.
> 
> Initialize the relevant defbool variables in libxl_device_vfb.

Fix one crash, find another:

Does libvirt have a representation for this one?

  libxl_defbool_val(vfb.sdl.opengl));

If not, it should be initialized to false in libxlMakeVfb.
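
To spell out why the assertion fires: a defbool is tri-state (default,
false, true), and reading its value while it is still "default" is treated
as a caller bug. Below is a minimal stand-alone model of that behaviour.
The model_* names are hypothetical stand-ins, not the libxl API, and the
0 / -1 / 1 encoding is an assumption chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a tri-state boolean: 0 = default (unset). */
typedef struct {
    int val; /* 0 = default, -1 = false, 1 = true */
} model_defbool;

static bool model_defbool_is_default(model_defbool db)
{
    return db.val == 0;
}

static void model_defbool_set(model_defbool *db, bool b)
{
    db->val = b ? 1 : -1;
}

/* Only assigns a value if none has been chosen yet. */
static void model_defbool_setdefault(model_defbool *db, bool b)
{
    if (model_defbool_is_default(*db))
        model_defbool_set(db, b);
}

/* Reading an unset value is a programming error, hence the assert. */
static bool model_defbool_val(model_defbool db)
{
    assert(!model_defbool_is_default(db));
    return db.val > 0;
}
```

In this model, calling model_defbool_setdefault(&db, false) before the first
model_defbool_val(db) is enough to avoid the abort, which is the shape of
the fix suggested above for sdl.opengl.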

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] libxl: initialize allocated libxl_device_vfb array

2015-04-17 Thread Olaf Hering

On Fri, Apr 17, Olaf Hering wrote:

> It's already allocated by calloc, but the init function sets ->devid.

Scratch that one, libxlMakeVfb already calls libxl_device_vfb_init.
Somehow I missed that.

Olaf


Re: [Xen-devel] tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]

2015-04-17 Thread Michael Chan

On Fri, 2015-04-17 at 13:19 -0400, David Miller wrote:
> So the gist of the situation is, that NEED_DMA_MAP_STATE can be 'n' in
> situations where we might actually need it to be 'y' based upon kernel
> command line boot options given.
> 
> Right?

Yes.
> 
> 



[Xen-devel] [PATCH] libxl: include a XLU_Config in _libxlDriverConfig

2015-04-17 Thread Olaf Hering

Upcoming changes for vscsi will use libxlutil.so to prepare the
configuration for libxl. The helpers need an xlu struct for logging.
Provide one and reuse the existing output as log target.

Signed-off-by: Olaf Hering 
Cc: Jim Fehlig 
---
 src/libxl/libxl_conf.c | 6 ++
 src/libxl/libxl_conf.h | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index 8b76fc7..43712b3 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -1421,6 +1421,12 @@ libxlDriverConfigNew(void)
 goto error;
 }
 
+cfg->xlu = xlu_cfg_init(cfg->logger_file, "libvirt");
+if (!cfg->xlu) {
+VIR_ERROR(_("cannot create xlu for libxenlight, disabling driver"));
+goto error;
+}
+
 if (libxl_ctx_alloc(&cfg->ctx, LIBXL_VERSION, 0, cfg->logger)) {
 VIR_ERROR(_("cannot initialize libxenlight context, probably not "
 "running in a Xen Dom0, disabling driver"));
diff --git a/src/libxl/libxl_conf.h b/src/libxl/libxl_conf.h
index 59389d1..fd2459f 100644
--- a/src/libxl/libxl_conf.h
+++ b/src/libxl/libxl_conf.h
@@ -27,6 +27,7 @@
 # define LIBXL_CONF_H
 
 # include 
+# include 
 
 # include "internal.h"
 # include "libvirt_internal.h"
@@ -90,6 +91,7 @@ struct _libxlDriverConfig {
 /* log stream for driver-wide libxl ctx */
 FILE *logger_file;
 xentoollog_logger *logger;
+XLU_Config *xlu;
 /* libxl ctx for driver wide ops; getVersion, getNodeInfo, ... */
 libxl_ctx *ctx;
 


[Xen-devel] [PATCH] libxl: initialize allocated libxl_device_vfb array

2015-04-17 Thread Olaf Hering

It's already allocated by calloc, but the init function sets ->devid.
Just in case anything cares.

Signed-off-by: Olaf Hering 
Cc: Jim Fehlig 
---
 src/libxl/libxl_conf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index 6feb7d9..8b76fc7 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -1288,6 +1288,7 @@ libxlMakeVfbList(virPortAllocatorPtr graphicsports,
 }
 
 for (i = 0; i < nvfbs; i++) {
+libxl_device_vfb_init(&x_vfbs[i]);
 libxl_device_vkb_init(&x_vkbs[i]);
 
 if (libxlMakeVfb(graphicsports, l_vfbs[i], &x_vfbs[i]) < 0)


Re: [Xen-devel] tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]

2015-04-17 Thread David Miller

From: Ian Jackson 
Date: Fri, 17 Apr 2015 17:29:28 +0100

> Prashant Sreedharan writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 
> 3 more messages]"):
>> Ok this is what is causing the problem, the driver uses
>> DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma
>> "mapping" and dma_unmap_addr() to get the "mapping" value. On most of
>> the platforms this is a no-op, but it appears with "iommu=soft and
>> swiotlb=force" this house keeping is required, when I pass the correct
>> dma_addr instead of 0 while calling pci_unmap_/pci_dma_sync_ I don't see
>> the corruption. ie If you set CONFIG_NEED_DMA_MAP_STATE=y in your kernel
>> config you should not see the problem. Can you confirm ? Thanks
> 
> That kernel config option is an automatically computed one:
> 
> config NEED_DMA_MAP_STATE
> def_bool y
> depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG
> 
> and grepping my .config shows:
> 
> # CONFIG_INTEL_IOMMU is not set
> # CONFIG_DMA_API_DEBUG is not set
> 
> It's a 32-bit kernel so it hasn't got X86_64 enabled either.
> 
> Arguably at least some of osstest's kernels should have INTEL_IOMMU
> enabled to detect conflicts between Xen's use of the iommu and
> possible attempts by Linux to do the same thing, but not having it
> enabled should not cause a driver bug.

So the gist of the situation is, that NEED_DMA_MAP_STATE can be 'n' in
situations where we might actually need it to be 'y' based upon kernel
command line boot options given.

Right?
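
To make the mechanism concrete, here is a stand-alone model of how the
unmap-state helpers collapse when the option is off. It is a sketch only:
the MODEL_ / model_ names are hypothetical, and the macro bodies mirror my
understanding of the kernel's definitions rather than quoting them. With
the toggle undefined, the stored address is discarded and a later read
yields 0, which matches the "0 instead of the correct dma_addr" symptom
described above.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t model_dma_addr_t;

/* Uncomment to model CONFIG_NEED_DMA_MAP_STATE=y. */
/* #define MODEL_NEED_DMA_MAP_STATE 1 */

#ifdef MODEL_NEED_DMA_MAP_STATE
#define MODEL_DEFINE_DMA_UNMAP_ADDR(name)      model_dma_addr_t name
#define model_dma_unmap_addr(ptr, name)        ((ptr)->name)
#define model_dma_unmap_addr_set(ptr, name, v) (((ptr)->name) = (v))
#else
/* State is compiled out: no field, reads give 0, writes do nothing. */
#define MODEL_DEFINE_DMA_UNMAP_ADDR(name)
#define model_dma_unmap_addr(ptr, name)        (0)
#define model_dma_unmap_addr_set(ptr, name, v) do { } while (0)
#endif

/* Hypothetical per-buffer bookkeeping, loosely like a driver's ring entry. */
struct model_ring_info {
    void *skb;
    MODEL_DEFINE_DMA_UNMAP_ADDR(mapping);
};

/* Store a DMA address at map time and read it back at unmap time. */
static model_dma_addr_t model_roundtrip(model_dma_addr_t addr)
{
    struct model_ring_info ri = { 0 };

    (void)ri;
    (void)addr;
    model_dma_unmap_addr_set(&ri, mapping, addr);
    return model_dma_unmap_addr(&ri, mapping);
}
```

So a boot-time choice such as swiotlb=force cannot bring the state back
once it has been compiled out; the decision is made at build time.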


[Xen-devel] [PATCH] libxl: initialize vfb defbools in libxlMakeVfb

2015-04-17 Thread Olaf Hering

If the domU config has sdl enabled, libvirtd crashes:
libvirtd[5158]: libvirtd: libxl.c:343: libxl_defbool_val: Assertion `!libxl_defbool_is_default(db)' failed.

Initialize the relevant defbool variables in libxl_device_vfb.

Signed-off-by: Olaf Hering 
Cc: Jim Fehlig 
---

Seen in 1.2.14.

 src/libxl/libxl_conf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index 9b3c949..6feb7d9 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -1232,6 +1232,7 @@ libxlMakeVfb(virPortAllocatorPtr graphicsports,
 switch (l_vfb->type) {
 case VIR_DOMAIN_GRAPHICS_TYPE_SDL:
 libxl_defbool_set(&x_vfb->sdl.enable, 1);
+libxl_defbool_set(&x_vfb->vnc.enable, 0);
 if (VIR_STRDUP(x_vfb->sdl.display, l_vfb->data.sdl.display) < 0)
 return -1;
 if (VIR_STRDUP(x_vfb->sdl.xauthority, l_vfb->data.sdl.xauth) < 0)
@@ -1239,6 +1240,7 @@ libxlMakeVfb(virPortAllocatorPtr graphicsports,
 break;
 case  VIR_DOMAIN_GRAPHICS_TYPE_VNC:
 libxl_defbool_set(&x_vfb->vnc.enable, 1);
+libxl_defbool_set(&x_vfb->sdl.enable, 0);
 /* driver handles selection of free port */
 libxl_defbool_set(&x_vfb->vnc.findunused, 0);
 if (l_vfb->data.vnc.autoport) {


Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology

2015-04-17 Thread Andrew Cooper

On 17/04/15 17:59, Boris Ostrovsky wrote:
> Signed-off-by: Boris Ostrovsky 
> Acked-by: Daniel De Graaf 

Reviewed-by: Andrew Cooper 


Re: [Xen-devel] [PATCH v7 1/5] sysctl: Make XEN_SYSCTL_numainfo a little more efficient

2015-04-17 Thread Andrew Cooper

On 17/04/15 17:59, Boris Ostrovsky wrote:
> A number of changes to XEN_SYSCTL_numainfo interface:
>
> * Make sysctl NUMA topology query use fewer copies by combining some
>   fields into a single structure and copying distances for each node
>   in a single copy.
> * NULL meminfo and distance handles are a request for maximum number
>   of nodes (num_nodes). If those handles are valid and num_nodes is
>   smaller than the number of nodes in the system then -ENOBUFS is
>   returned (and correct num_nodes is provided)
> * Instead of using max_node_index for passing number of nodes keep this
>   value in num_nodes: almost all uses of max_node_index required adding
>   or subtracting one to eventually get to number of nodes anyway.
> * Replace INVALID_NUMAINFO_ID with XEN_INVALID_MEM_SZ and add
>   XEN_INVALID_NODE_DIST.
>
> Signed-off-by: Boris Ostrovsky 
> Acked-by: Ian Campbell 

Reviewed-by: Andrew Cooper 


[Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology

2015-04-17 Thread Boris Ostrovsky

Signed-off-by: Boris Ostrovsky 
Acked-by: Daniel De Graaf 
---

Changes in v7:
* Break from the loop when -ENODEV is encountered

 docs/misc/xsm-flask.txt |1 +
 xen/common/sysctl.c |   59 +++
 xen/include/public/sysctl.h |   30 ++
 xen/xsm/flask/hooks.c   |1 +
 xen/xsm/flask/policy/access_vectors |1 +
 5 files changed, 92 insertions(+), 0 deletions(-)

diff --git a/docs/misc/xsm-flask.txt b/docs/misc/xsm-flask.txt
index d63a8a7..e0a3dcc 100644
--- a/docs/misc/xsm-flask.txt
+++ b/docs/misc/xsm-flask.txt
@@ -121,6 +121,7 @@ __HYPERVISOR_sysctl (xen/include/public/sysctl.h)
  * XEN_SYSCTL_cpupool_op
  * XEN_SYSCTL_scheduler_op
  * XEN_SYSCTL_coverage_op
+ * XEN_SYSCTL_pcitopoinfo
 
 __HYPERVISOR_memory_op (xen/include/public/memory.h)
 
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index b025a90..8f3a58a 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -411,6 +411,65 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
 break;
 #endif
 
+#ifdef HAS_PCI
+case XEN_SYSCTL_pcitopoinfo:
+{
+xen_sysctl_pcitopoinfo_t *ti = &op->u.pcitopoinfo;
+unsigned dev_cnt = 0;
+
+if ( guest_handle_is_null(ti->devs) ||
+ guest_handle_is_null(ti->nodes) ||
+ (ti->first_dev > ti->num_devs) )
+{
+ret = -EINVAL;
+break;
+}
+
+while ( ti->first_dev < ti->num_devs )
+{
+physdev_pci_device_t dev;
+uint32_t node;
+struct pci_dev *pdev;
+
+if ( copy_from_guest_offset(&dev, ti->devs, ti->first_dev, 1) )
+{
+ret = -EFAULT;
+break;
+}
+
+spin_lock(&pcidevs_lock);
+pdev = pci_get_pdev(dev.seg, dev.bus, dev.devfn);
+if ( !pdev )
+{
+ret = -ENODEV;
+node = XEN_INVALID_NODE_ID;
+}
+else if ( pdev->node == NUMA_NO_NODE )
+node = XEN_INVALID_NODE_ID;
+else
+node = pdev->node;
+spin_unlock(&pcidevs_lock);
+
+if ( copy_to_guest_offset(ti->nodes, ti->first_dev, &node, 1) )
+{
+ret = -EFAULT;
+break;
+}
+
+ti->first_dev++;
+
+if ( (ret == -ENODEV) ||
+ ((++dev_cnt > 0x3f) && hypercall_preempt_check()) )
+break;
+}
+
+if ( (!ret || (ret == -ENODEV)) &&
+ __copy_field_to_guest(u_sysctl, op, u.pcitopoinfo.first_dev) )
+ret = -EFAULT;
+}
+break;
+#endif
+
 default:
 ret = arch_do_sysctl(op, u_sysctl);
 copyback = 0;
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 021d505..58e72c3 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -33,6 +33,7 @@
 
 #include "xen.h"
 #include "domctl.h"
+#include "physdev.h"
 
 #define XEN_SYSCTL_INTERFACE_VERSION 0x000C
 
@@ -669,6 +670,33 @@ struct xen_sysctl_psr_cmt_op {
 typedef struct xen_sysctl_psr_cmt_op xen_sysctl_psr_cmt_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cmt_op_t);
 
+/* XEN_SYSCTL_pcitopoinfo */
+struct xen_sysctl_pcitopoinfo {
+/* IN: Number of elements in 'pcitopo' and 'nodes' arrays. */
+uint32_t num_devs;
+
+/*
+ * IN/OUT:
+ *   IN: First element of pcitopo array that needs to be processed by
+ *   the hypervisor.
+ *  OUT: Index of the first still unprocessed element of pcitopo array.
+ */
+uint32_t first_dev;
+
+/* IN: list of devices for which node IDs are requested. */
+XEN_GUEST_HANDLE_64(physdev_pci_device_t) devs;
+
+/*
+ * OUT: node identifier for each device.
+ * If information for a particular device is not avalable then set
+ * to XEN_INVALID_NODE_ID. In addition, if device is not known to the
+ * hypervisor, sysctl will stop further processing and return -ENODEV.
+ */
+XEN_GUEST_HANDLE_64(uint32) nodes;
+};
+typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
+
 struct xen_sysctl {
 uint32_t cmd;
 #define XEN_SYSCTL_readconsole1
@@ -691,12 +719,14 @@ struct xen_sysctl {
 #define XEN_SYSCTL_scheduler_op  19
 #define XEN_SYSCTL_coverage_op   20
 #define XEN_SYSCTL_psr_cmt_op21
+#define XEN_SYSCTL_pcitopoinfo   22
 uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
 union {
 struct xen_sysctl_readconsole   readconsole;
 struct xen_sysctl_tbuf_op   tbuf_op;
 struct xen_sysctl_physinfo  physinfo;
 struct xen_sysctl_cputopoinfo   cputopoinfo;
+struct xen_sysctl_pcitopoinfo   pcitopoinfo;
 struct xen_sysctl

[Xen-devel] [PATCH v7 0/5] Display IO topology when PXM data is available (plus some cleanup)

2015-04-17 Thread Boris Ostrovsky

Changes in v7:
* Allow one of arguments to NUMA info sysctls to be NULL, in which case only
  the non-NULL buffer will be filled in by hypervisor (patches 1 and 4)
* Properly handle -ENODEV in PCI topology sysctl (patch 2)
* Error handling changes in patch 5

Changes in v6:
* PCI topology interface changes: no continuations, userspace will be dealing
  with "unfinished" sysctl (patches 2 and 5)
* Unknown device will cause ENODEV in sysctl
* No NULL tests in libxc
* Loop control initialization fix (similar to commit 26da081ac91a)
* Other minor changes (see per-patch notes)

Changes in v5:
* Make CPU topology and NUMA info sysctls behave more like
  XEN_DOMCTL_get_vcpu_msrs when passed NULL buffers. This required toolstack
  changes as well
* Don't use 8-bit data types in interfaces
* Fold interface version update into patch#3

Changes in v4:
* Split cputopology and NUMA info changes into separate patches
* Added patch#1 (partly because patch#4 needs to know when distance is
  invalid, i.e. NUMA_NO_DISTANCE)
* Split sysctl version update into a separate patch
* Other changes are listed in each patch
* NOTE: I did not test python's xc changes since I don't think I know how.

Changes in v3:
* Added patch #1 to more consistently define nodes as a u8 and properly
  use NUMA_NO_NODE.
* Make changes to xen_sysctl_numainfo, similar to those made to
  xen_sysctl_topologyinfo. (Q: I kept both sets of changes in the same
  patch #3 to avoid bumping interface version twice. Perhaps it's better
  to split it into two?)
* Instead of copying data for each loop index allocate a buffer and copy
  once for all three queries in sysctl.c.
* Move hypercall buffer management from libxl to libxc (as requested by
  Dario, patches #5 and #6).
* Report topology info for offlined CPUs as well
* Added LIBXL_HAVE_PCITOPO macro

Changes in v2:
* Split topology sysctls into two --- one for CPU topology and the other
  for devices
* Avoid long loops in the hypervisor by using continuations. (I am not
  particularly happy about using first_dev in the interface, suggestions
  for a better interface would be appreciated)
* Use proper libxl conventions for interfaces
* Avoid hypervisor stack corruption when copying PXM data from guest


A few patches that add interface for querying hypervisor about device
topology and allow 'xl info -n' display this information if PXM object
is provided by ACPI.

This series also makes some optimizations and cleanup of current CPU
topology and NUMA sysctl queries.
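
As a rough illustration of the "unfinished sysctl" protocol used for the PCI
topology query (no continuations; the caller reissues the call until
first_dev reaches num_devs, and an unknown device yields ENODEV with an
invalid node ID), here is a self-contained mock. Everything named mock_* or
MOCK_* is hypothetical, including the batch size of 2 and the node
assignment; it only models the control flow, not the hypervisor code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MOCK_BATCH        2u    /* devices handled per call (assumption) */
#define MOCK_INVALID_NODE (~0u) /* stand-in for an invalid node ID */

struct mock_pcitopo {
    uint32_t num_devs;   /* IN: number of entries in known[] and nodes[] */
    uint32_t first_dev;  /* IN: where to start; OUT: first unprocessed */
    const int *known;    /* IN: nonzero if the device is known */
    uint32_t *nodes;     /* OUT: node per device */
};

/* Mock "hypervisor" side: stops after a batch or at an unknown device. */
static int mock_sysctl(struct mock_pcitopo *ti)
{
    unsigned done = 0;

    while (ti->first_dev < ti->num_devs) {
        uint32_t i = ti->first_dev++;

        if (!ti->known[i]) {
            ti->nodes[i] = MOCK_INVALID_NODE;
            return -ENODEV;         /* entry consumed, caller may resume */
        }
        ti->nodes[i] = i % 2;       /* arbitrary node assignment */
        if (++done >= MOCK_BATCH)
            break;                  /* models the preemption check */
    }
    return 0;
}

/* Mock caller side: reissue until every entry has been consumed. */
static int mock_query_all(struct mock_pcitopo *ti)
{
    ti->first_dev = 0;
    while (ti->first_dev < ti->num_devs) {
        int ret = mock_sysctl(ti);

        if (ret && ret != -ENODEV)
            return ret;             /* real error: give up */
    }
    return 0;
}
```

The point of advancing first_dev before returning -ENODEV is that the caller
can skip the unknown device and continue without tracking its own index.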


Boris Ostrovsky (5):
  sysctl: Make XEN_SYSCTL_numainfo a little more efficient
  sysctl: Add sysctl interface for querying PCI topology
  libxl/libxc: Move libxl_get_cpu_topology()'s hypercall buffer
    management to libxc
  libxl/libxc: Move libxl_get_numainfo()'s hypercall buffer management
    to libxc
  libxl: Add interface for querying hypervisor about PCI topology

 docs/misc/xsm-flask.txt |1 +
 tools/libxc/include/xenctrl.h   |   12 ++-
 tools/libxc/xc_misc.c   |   97 ++---
 tools/libxl/libxl.c |  160 ++-
 tools/libxl/libxl.h |   12 +++
 tools/libxl/libxl_freebsd.c |   12 +++
 tools/libxl/libxl_internal.h|5 +
 tools/libxl/libxl_linux.c   |   70 +++
 tools/libxl/libxl_netbsd.c  |   12 +++
 tools/libxl/libxl_types.idl |7 ++
 tools/libxl/libxl_utils.c   |8 ++
 tools/libxl/xl_cmdimpl.c|   40 +++--
 tools/misc/xenpm.c  |   51 +--
 tools/python/xen/lowlevel/xc/xc.c   |   74 ++--
 xen/common/sysctl.c |  143 ---
 xen/include/public/sysctl.h |   84 +-
 xen/xsm/flask/hooks.c   |1 +
 xen/xsm/flask/policy/access_vectors |1 +
 18 files changed, 560 insertions(+), 230 deletions(-)



[Xen-devel] [PATCH v7 5/5] libxl: Add interface for querying hypervisor about PCI topology

2015-04-17 Thread Boris Ostrovsky

.. and use this new interface to display it along with CPU topology
and NUMA information when 'xl info -n' command is issued

The output will look like
...
cpu_topology   :
cpu:    core    socket     node
  0:       0        0        0
...
device topology:
device   node
:00:00.0  0
:00:01.0  0
...

Signed-off-by: Boris Ostrovsky 
---

Changes in v7:
* Replaced LOG with LOGE in libxl_get_pci_topology()
* Replaced LOGE with LOG and set errno in libxl__pci_topology_init()
* Test correct pointer for non-NULL in output_topologyinfo() (pciinfo, not
  cpuinfo)


 tools/libxc/include/xenctrl.h |3 ++
 tools/libxc/xc_misc.c |   44 +
 tools/libxl/libxl.c   |   42 
 tools/libxl/libxl.h   |   12 +++
 tools/libxl/libxl_freebsd.c   |   12 +++
 tools/libxl/libxl_internal.h  |5 +++
 tools/libxl/libxl_linux.c |   70 +
 tools/libxl/libxl_netbsd.c|   12 +++
 tools/libxl/libxl_types.idl   |7 
 tools/libxl/libxl_utils.c |8 +
 tools/libxl/xl_cmdimpl.c  |   40 +++
 11 files changed, 248 insertions(+), 7 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 520a284..2e7787a 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1231,6 +1231,7 @@ typedef xen_sysctl_physinfo_t xc_physinfo_t;
 typedef xen_sysctl_cputopo_t xc_cputopo_t;
 typedef xen_sysctl_numainfo_t xc_numainfo_t;
 typedef xen_sysctl_meminfo_t xc_meminfo_t;
+typedef xen_sysctl_pcitopoinfo_t xc_pcitopoinfo_t;
 
 typedef uint32_t xc_cpu_to_node_t;
 typedef uint32_t xc_cpu_to_socket_t;
@@ -1244,6 +1245,8 @@ int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
xc_cputopo_t *cputopo);
 int xc_numainfo(xc_interface *xch, unsigned *max_nodes,
 xc_meminfo_t *meminfo, uint32_t *distance);
+int xc_pcitopoinfo(xc_interface *xch, unsigned num_devs,
+   physdev_pci_device_t *devs, uint32_t *nodes);
 
 int xc_sched_id(xc_interface *xch,
 int *sched_id);
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index 94b70b4..3a3e366 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -238,6 +238,50 @@ out:
 return ret;
 }
 
+int xc_pcitopoinfo(xc_interface *xch, unsigned num_devs,
+   physdev_pci_device_t *devs,
+   uint32_t *nodes)
+{
+int ret = 0;
+DECLARE_SYSCTL;
+DECLARE_HYPERCALL_BOUNCE(devs, num_devs * sizeof(*devs),
+ XC_HYPERCALL_BUFFER_BOUNCE_IN);
+DECLARE_HYPERCALL_BOUNCE(nodes, num_devs* sizeof(*nodes),
+ XC_HYPERCALL_BUFFER_BOUNCE_BOTH);
+
+if ( (ret = xc_hypercall_bounce_pre(xch, devs)) )
+goto out;
+if ( (ret = xc_hypercall_bounce_pre(xch, nodes)) )
+goto out;
+
+sysctl.u.pcitopoinfo.first_dev = 0;
+sysctl.u.pcitopoinfo.num_devs = num_devs;
+set_xen_guest_handle(sysctl.u.pcitopoinfo.devs, devs);
+set_xen_guest_handle(sysctl.u.pcitopoinfo.nodes, nodes);
+
+sysctl.cmd = XEN_SYSCTL_pcitopoinfo;
+
+while ( sysctl.u.pcitopoinfo.first_dev < num_devs )
+{
+if ( (ret = do_sysctl(xch, &sysctl)) != 0 )
+{
+/*
+ * node[] is set to XEN_INVALID_NODE_ID for invalid devices,
+ * we can just skip those entries.
+ */
+if ( errno == ENODEV )
+errno = ret = 0;
+else
+break;
+}
+}
+
+ out:
+xc_hypercall_bounce_post(xch, devs);
+xc_hypercall_bounce_post(xch, nodes);
+
+return ret;
+}
 
 int xc_sched_id(xc_interface *xch,
 int *sched_id)
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 1abcbf1..86aff8e 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5139,6 +5139,48 @@ libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out)
 return ret;
 }
 
+libxl_pcitopology *libxl_get_pci_topology(libxl_ctx *ctx, int *num_devs)
+{
+GC_INIT(ctx);
+physdev_pci_device_t *devs;
+uint32_t *nodes;
+libxl_pcitopology *ret = NULL;
+int i;
+
+*num_devs = libxl__pci_numdevs(gc);
+if (*num_devs < 0) {
+LOGE(ERROR, "Unable to determine number of PCI devices");
+goto out;
+}
+
+devs = libxl__zalloc(gc, sizeof(*devs) * *num_devs);
+nodes = libxl__zalloc(gc, sizeof(*nodes) * *num_devs);
+
+if (libxl__pci_topology_init(gc, devs, *num_devs)) {
+LOGE(ERROR, "Cannot initialize PCI hypercall structure");
+goto out;
+}
+
+if (xc_pcitopoinfo(ctx->xch, *num_devs, devs, nodes) != 0) {
+LOGE(ERROR, "PCI topology info hypercall failed");
+goto out;
+}
+
+ret = libxl__zalloc(NOGC, sizeof(libxl_pcitopology) * *num_devs);
+
+for (i = 0; i < *num_devs; i++) {
+ret[i].seg = devs

[Xen-devel] [PATCH v7 3/5] libxl/libxc: Move libxl_get_cpu_topology()'s hypercall buffer management to libxc

2015-04-17 Thread Boris Ostrovsky

xc_cputopoinfo() is not expected to be used on a hot path and therefore
hypercall buffer management can be pushed into libxc. This will simplify
life for callers.
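
For readers unfamiliar with the idea, a bounce buffer here means: the caller
passes an ordinary buffer, the library copies it into hypercall-safe memory
before the call (for IN data) and copies results back afterwards (for OUT
data). The sketch below models that pre/post pairing with plain malloc. All
mock_* names are hypothetical and this is not the libxc implementation; in
particular, treating a NULL caller buffer as "nothing to bounce" is an
assumption made for the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum mock_dir {
    MOCK_BOUNCE_IN   = 1,  /* copy caller -> safe buffer before the call */
    MOCK_BOUNCE_OUT  = 2,  /* copy safe buffer -> caller after the call */
    MOCK_BOUNCE_BOTH = 3
};

struct mock_bounce {
    void *user;            /* caller's ordinary buffer */
    void *safe;            /* stand-in for hypercall-safe memory */
    size_t size;
    enum mock_dir dir;
};

/* Allocate the safe copy and, for IN data, fill it from the caller. */
static int mock_bounce_pre(struct mock_bounce *b)
{
    b->safe = NULL;
    if (b->user == NULL || b->size == 0)
        return 0;          /* NULL buffer: nothing to bounce */
    b->safe = malloc(b->size);
    if (b->safe == NULL)
        return -1;
    if (b->dir & MOCK_BOUNCE_IN)
        memcpy(b->safe, b->user, b->size);
    return 0;
}

/* For OUT data, copy results back to the caller; always release. */
static void mock_bounce_post(struct mock_bounce *b)
{
    if (b->safe == NULL)
        return;
    if (b->dir & MOCK_BOUNCE_OUT)
        memcpy(b->user, b->safe, b->size);
    free(b->safe);
    b->safe = NULL;
}
```

Keeping this pairing inside the library wrapper is what lets callers hand in
plain allocated arrays, at the cost of an extra copy per call.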

Also update error reporting macros.

Signed-off-by: Boris Ostrovsky 
Acked-by: Ian Campbell 
---
 tools/libxc/include/xenctrl.h |5 ++-
 tools/libxc/xc_misc.c |   23 +++-
 tools/libxl/libxl.c   |   37 --
 tools/misc/xenpm.c|   51 -
 tools/python/xen/lowlevel/xc/xc.c |   20 ++
 5 files changed, 61 insertions(+), 75 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 02d0db8..b476fda 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1228,7 +1228,7 @@ int xc_readconsolering(xc_interface *xch,
 int xc_send_debug_keys(xc_interface *xch, char *keys);
 
 typedef xen_sysctl_physinfo_t xc_physinfo_t;
-typedef xen_sysctl_cputopoinfo_t xc_cputopoinfo_t;
+typedef xen_sysctl_cputopo_t xc_cputopo_t;
 typedef xen_sysctl_numainfo_t xc_numainfo_t;
 
 typedef uint32_t xc_cpu_to_node_t;
@@ -1239,7 +1239,8 @@ typedef uint64_t xc_node_to_memfree_t;
 typedef uint32_t xc_node_to_node_dist_t;
 
 int xc_physinfo(xc_interface *xch, xc_physinfo_t *info);
-int xc_cputopoinfo(xc_interface *xch, xc_cputopoinfo_t *info);
+int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
+   xc_cputopo_t *cputopo);
 int xc_numainfo(xc_interface *xch, xc_numainfo_t *info);
 
 int xc_sched_id(xc_interface *xch,
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index be68291..630a86c 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -177,22 +177,31 @@ int xc_physinfo(xc_interface *xch,
 return 0;
 }
 
-int xc_cputopoinfo(xc_interface *xch,
-   xc_cputopoinfo_t *put_info)
+int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
+   xc_cputopo_t *cputopo)
 {
 int ret;
 DECLARE_SYSCTL;
+DECLARE_HYPERCALL_BOUNCE(cputopo, *max_cpus * sizeof(*cputopo),
+ XC_HYPERCALL_BUFFER_BOUNCE_OUT);
 
-sysctl.cmd = XEN_SYSCTL_cputopoinfo;
+if ( (ret = xc_hypercall_bounce_pre(xch, cputopo)) )
+goto out;
 
-memcpy(&sysctl.u.cputopoinfo, put_info, sizeof(*put_info));
+sysctl.u.cputopoinfo.num_cpus = *max_cpus;
+set_xen_guest_handle(sysctl.u.cputopoinfo.cputopo, cputopo);
+
+sysctl.cmd = XEN_SYSCTL_cputopoinfo;
 
 if ( (ret = do_sysctl(xch, &sysctl)) != 0 )
-return ret;
+goto out;
 
-memcpy(put_info, &sysctl.u.cputopoinfo, sizeof(*put_info));
+*max_cpus = sysctl.u.cputopoinfo.num_cpus;
 
-return 0;
+out:
+xc_hypercall_bounce_post(xch, cputopo);
+
+return ret;
 }
 
 int xc_numainfo(xc_interface *xch,
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 2ff46b4..f348afd 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5102,37 +5102,28 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo *physinfo)
 libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out)
 {
 GC_INIT(ctx);
-xc_cputopoinfo_t tinfo;
-DECLARE_HYPERCALL_BUFFER(xen_sysctl_cputopo_t, cputopo);
+xc_cputopo_t *cputopo;
 libxl_cputopology *ret = NULL;
 int i;
+unsigned num_cpus;
 
-/* Setting buffer to NULL makes the hypercall return number of CPUs */
-set_xen_guest_handle(tinfo.cputopo, HYPERCALL_BUFFER_NULL);
-if (xc_cputopoinfo(ctx->xch, &tinfo) != 0)
+/* Setting buffer to NULL makes the call return number of CPUs */
+if (xc_cputopoinfo(ctx->xch, &num_cpus, NULL))
 {
-LIBXL__LOG(ctx, XTL_ERROR, "Unable to determine number of CPUS");
-ret = NULL;
+LOGEV(ERROR, errno, "Unable to determine number of CPUS");
 goto out;
 }
 
-cputopo = xc_hypercall_buffer_alloc(ctx->xch, cputopo,
-sizeof(*cputopo) * tinfo.num_cpus);
-if (cputopo == NULL) {
-LIBXL__LOG_ERRNOVAL(ctx, XTL_ERROR, ENOMEM,
-"Unable to allocate hypercall arguments");
-goto fail;
-}
-set_xen_guest_handle(tinfo.cputopo, cputopo);
+cputopo = libxl__zalloc(gc, sizeof(*cputopo) * num_cpus);
 
-if (xc_cputopoinfo(ctx->xch, &tinfo) != 0) {
-LIBXL__LOG_ERRNO(ctx, XTL_ERROR, "CPU topology info hypercall failed");
-goto fail;
+if (xc_cputopoinfo(ctx->xch, &num_cpus, cputopo)) {
+LOGEV(ERROR, errno, "CPU topology info hypercall failed");
+goto out;
 }
 
-ret = libxl__zalloc(NOGC, sizeof(libxl_cputopology) * tinfo.num_cpus);
+ret = libxl__zalloc(NOGC, sizeof(libxl_cputopology) * num_cpus);
 
-for (i = 0; i < tinfo.num_cpus; i++) {
+for (i = 0; i < num_cpus; i++) {
 #define V(map, i, invalid) ( cputopo[i].map == invalid) ? \
LIBXL_CPUTOPOLOGY_INVALID_ENTRY : cputopo[i].map
 ret[i].core = V(core, i, XEN_INVALID_CORE_I

[Xen-devel] [PATCH v7 4/5] libxl/libxc: Move libxl_get_numainfo()'s hypercall buffer management to libxc

2015-04-17 Thread Boris Ostrovsky

xc_numainfo() is not expected to be used on a hot path and therefore
hypercall buffer management can be pushed into libxc. This will simplify
life for callers.

Also update error logging macros.

Signed-off-by: Boris Ostrovsky 
Acked-by: Ian Campbell 
---

Changes in v7:
* Dropped '!!meminfo ^ !!distance' test in xc_numainfo() since one non-NULL 
  argument is now allowed

 tools/libxc/include/xenctrl.h |4 ++-
 tools/libxc/xc_misc.c |   30 -
 tools/libxl/libxl.c   |   51 
 tools/python/xen/lowlevel/xc/xc.c |   38 ++-
 4 files changed, 57 insertions(+), 66 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index b476fda..520a284 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1230,6 +1230,7 @@ int xc_send_debug_keys(xc_interface *xch, char *keys);
 typedef xen_sysctl_physinfo_t xc_physinfo_t;
 typedef xen_sysctl_cputopo_t xc_cputopo_t;
 typedef xen_sysctl_numainfo_t xc_numainfo_t;
+typedef xen_sysctl_meminfo_t xc_meminfo_t;
 
 typedef uint32_t xc_cpu_to_node_t;
 typedef uint32_t xc_cpu_to_socket_t;
@@ -1241,7 +1242,8 @@ typedef uint32_t xc_node_to_node_dist_t;
 int xc_physinfo(xc_interface *xch, xc_physinfo_t *info);
 int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
xc_cputopo_t *cputopo);
-int xc_numainfo(xc_interface *xch, xc_numainfo_t *info);
+int xc_numainfo(xc_interface *xch, unsigned *max_nodes,
+xc_meminfo_t *meminfo, uint32_t *distance);
 
 int xc_sched_id(xc_interface *xch,
 int *sched_id);
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index 630a86c..94b70b4 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -204,22 +204,38 @@ out:
 return ret;
 }
 
-int xc_numainfo(xc_interface *xch,
-xc_numainfo_t *put_info)
+int xc_numainfo(xc_interface *xch, unsigned *max_nodes,
+xc_meminfo_t *meminfo, uint32_t *distance)
 {
 int ret;
 DECLARE_SYSCTL;
+DECLARE_HYPERCALL_BOUNCE(meminfo, *max_nodes * sizeof(*meminfo),
+ XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+DECLARE_HYPERCALL_BOUNCE(distance,
+ *max_nodes * *max_nodes * sizeof(*distance),
+ XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+
+if ( (ret = xc_hypercall_bounce_pre(xch, meminfo)) )
+goto out;
+if ((ret = xc_hypercall_bounce_pre(xch, distance)) )
+goto out;
+
+sysctl.u.numainfo.num_nodes = *max_nodes;
+set_xen_guest_handle(sysctl.u.numainfo.meminfo, meminfo);
+set_xen_guest_handle(sysctl.u.numainfo.distance, distance);
 
 sysctl.cmd = XEN_SYSCTL_numainfo;
 
-memcpy(&sysctl.u.numainfo, put_info, sizeof(*put_info));
+if ( (ret = do_sysctl(xch, &sysctl)) != 0 )
+goto out;
 
-if ((ret = do_sysctl(xch, &sysctl)) != 0)
-return ret;
+*max_nodes = sysctl.u.numainfo.num_nodes;
 
-memcpy(put_info, &sysctl.u.numainfo, sizeof(*put_info));
+out:
+xc_hypercall_bounce_post(xch, meminfo);
+xc_hypercall_bounce_post(xch, distance);
 
-return 0;
+return ret;
 }
 
 
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index f348afd..1abcbf1 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5142,61 +5142,44 @@ libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out)
 libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr)
 {
 GC_INIT(ctx);
-xc_numainfo_t ninfo;
-DECLARE_HYPERCALL_BUFFER(xen_sysctl_meminfo_t, meminfo);
-DECLARE_HYPERCALL_BUFFER(uint32_t, distance);
+xc_meminfo_t *meminfo;
+uint32_t *distance;
 libxl_numainfo *ret = NULL;
 int i, j;
+unsigned num_nodes;
 
-set_xen_guest_handle(ninfo.meminfo, HYPERCALL_BUFFER_NULL);
-set_xen_guest_handle(ninfo.distance, HYPERCALL_BUFFER_NULL);
-if (xc_numainfo(ctx->xch, &ninfo) != 0) {
-LIBXL__LOG(ctx, XTL_ERROR, "Unable to determine number of NODES");
-ret = NULL;
+if (xc_numainfo(ctx->xch, &num_nodes, NULL, NULL)) {
+LOGEV(ERROR, errno, "Unable to determine number of nodes");
 goto out;
 }
 
-meminfo = xc_hypercall_buffer_alloc(ctx->xch, meminfo,
-sizeof(*meminfo) * ninfo.num_nodes);
-distance = xc_hypercall_buffer_alloc(ctx->xch, distance,
- sizeof(*distance) *
- ninfo.num_nodes * ninfo.num_nodes);
-if ((meminfo == NULL) || (distance == NULL)) {
-LIBXL__LOG_ERRNOVAL(ctx, XTL_ERROR, ENOMEM,
-"Unable to allocate hypercall arguments");
-goto fail;
-}
+meminfo = libxl__zalloc(gc, sizeof(*meminfo) * num_nodes);
+distance = libxl__zalloc(gc, sizeof(*distance) * num_nodes * num_nodes);
 
-set_xen_guest_handle(ninfo.meminfo, meminfo);
-set_xe

[Xen-devel] [PATCH v7 1/5] sysctl: Make XEN_SYSCTL_numainfo a little more efficient

2015-04-17 Thread Boris Ostrovsky

A number of changes to XEN_SYSCTL_numainfo interface:

* Make sysctl NUMA topology query use fewer copies by combining some
  fields into a single structure and copying distances for each node
  in a single copy.
* NULL meminfo and distance handles are a request for the maximum number
  of nodes (num_nodes). If those handles are valid and num_nodes is
  smaller than the number of nodes in the system, then -ENOBUFS is
  returned (and the correct num_nodes is provided).
* Instead of using max_node_index for passing number of nodes keep this
  value in num_nodes: almost all uses of max_node_index required adding
  or subtracting one to eventually get to number of nodes anyway.
* Replace INVALID_NUMAINFO_ID with XEN_INVALID_MEM_SZ and add
  XEN_INVALID_NODE_DIST.
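
The NULL-handle convention in the list above amounts to a two-step query: ask for the node count first, then allocate and ask again. The following is a minimal, self-contained model in C and not the real interface: `mock_numainfo`, `struct mock_meminfo`, `MOCK_SYSTEM_NODES`, `query_topology` and the memory and distance values are all hypothetical stand-ins.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node count of the modelled system. */
#define MOCK_SYSTEM_NODES 2u

/* Hypothetical stand-in for the combined per-node memory record. */
struct mock_meminfo {
    uint64_t memsize;
    uint64_t memfree;
};

/*
 * Models the semantics described in the list above (assumed, simplified):
 *  - both buffers NULL: only report the node count in *num_nodes;
 *  - *num_nodes too small: report the correct count, fail with -ENOBUFS;
 *  - otherwise fill whichever buffers are non-NULL.
 */
static int mock_numainfo(unsigned *num_nodes, struct mock_meminfo *meminfo,
                         uint32_t *distance)
{
    unsigned i, j;

    if (meminfo == NULL && distance == NULL) {
        *num_nodes = MOCK_SYSTEM_NODES;
        return 0;
    }
    if (*num_nodes < MOCK_SYSTEM_NODES) {
        *num_nodes = MOCK_SYSTEM_NODES;
        return -ENOBUFS;
    }
    *num_nodes = MOCK_SYSTEM_NODES;
    for (i = 0; i < MOCK_SYSTEM_NODES; i++) {
        if (meminfo) {
            meminfo[i].memsize = 1024; /* made-up values */
            meminfo[i].memfree = 512;
        }
        if (distance)
            for (j = 0; j < MOCK_SYSTEM_NODES; j++)
                distance[i * MOCK_SYSTEM_NODES + j] = (i == j) ? 10 : 20;
    }
    return 0;
}

/* Caller side: size query first, then allocate and run the full query. */
static int query_topology(struct mock_meminfo **meminfo_out,
                          uint32_t **distance_out, unsigned *num_nodes_out)
{
    unsigned n = 0;
    struct mock_meminfo *mi;
    uint32_t *dist;
    int rc = mock_numainfo(&n, NULL, NULL);

    if (rc)
        return rc;

    mi = calloc(n, sizeof(*mi));
    dist = calloc((size_t)n * n, sizeof(*dist));
    if (mi == NULL || dist == NULL) {
        free(mi);
        free(dist);
        return -ENOMEM;
    }

    rc = mock_numainfo(&n, mi, dist);
    if (rc) {
        free(mi);
        free(dist);
        return rc;
    }

    *meminfo_out = mi;
    *distance_out = dist;
    *num_nodes_out = n;
    return 0;
}
```

A caller that guesses too small a count gets -ENOBUFS together with the corrected count, so it can retry without a separate size query.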

Signed-off-by: Boris Ostrovsky 
Acked-by: Ian Campbell 
---

Changes in v7:
* Allow one of the arguments to the NUMA info sysctls to be NULL, in which case
  only the non-NULL buffer will be filled in by the hypervisor (changes in sysctl.[ch])

 tools/libxl/libxl.c   |   66 +
 tools/python/xen/lowlevel/xc/xc.c |   58 --
 xen/common/sysctl.c   |   84 +++--
 xen/include/public/sysctl.h   |   54 ++--
 4 files changed, 141 insertions(+), 121 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 511eef1..2ff46b4 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5156,65 +5156,59 @@ libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr)
 {
 GC_INIT(ctx);
 xc_numainfo_t ninfo;
-DECLARE_HYPERCALL_BUFFER(xc_node_to_memsize_t, memsize);
-DECLARE_HYPERCALL_BUFFER(xc_node_to_memfree_t, memfree);
-DECLARE_HYPERCALL_BUFFER(uint32_t, node_dists);
+DECLARE_HYPERCALL_BUFFER(xen_sysctl_meminfo_t, meminfo);
+DECLARE_HYPERCALL_BUFFER(uint32_t, distance);
 libxl_numainfo *ret = NULL;
-int i, j, max_nodes;
+int i, j;
 
-max_nodes = libxl_get_max_nodes(ctx);
-if (max_nodes < 0)
-{
+set_xen_guest_handle(ninfo.meminfo, HYPERCALL_BUFFER_NULL);
+set_xen_guest_handle(ninfo.distance, HYPERCALL_BUFFER_NULL);
+if (xc_numainfo(ctx->xch, &ninfo) != 0) {
 LIBXL__LOG(ctx, XTL_ERROR, "Unable to determine number of NODES");
 ret = NULL;
 goto out;
 }
 
-memsize = xc_hypercall_buffer_alloc
-(ctx->xch, memsize, sizeof(*memsize) * max_nodes);
-memfree = xc_hypercall_buffer_alloc
-(ctx->xch, memfree, sizeof(*memfree) * max_nodes);
-node_dists = xc_hypercall_buffer_alloc
-(ctx->xch, node_dists, sizeof(*node_dists) * max_nodes * max_nodes);
-if ((memsize == NULL) || (memfree == NULL) || (node_dists == NULL)) {
+meminfo = xc_hypercall_buffer_alloc(ctx->xch, meminfo,
+sizeof(*meminfo) * ninfo.num_nodes);
+distance = xc_hypercall_buffer_alloc(ctx->xch, distance,
+ sizeof(*distance) *
+ ninfo.num_nodes * ninfo.num_nodes);
+if ((meminfo == NULL) || (distance == NULL)) {
 LIBXL__LOG_ERRNOVAL(ctx, XTL_ERROR, ENOMEM,
 "Unable to allocate hypercall arguments");
 goto fail;
 }
 
-set_xen_guest_handle(ninfo.node_to_memsize, memsize);
-set_xen_guest_handle(ninfo.node_to_memfree, memfree);
-set_xen_guest_handle(ninfo.node_to_node_distance, node_dists);
-ninfo.max_node_index = max_nodes - 1;
+set_xen_guest_handle(ninfo.meminfo, meminfo);
+set_xen_guest_handle(ninfo.distance, distance);
 if (xc_numainfo(ctx->xch, &ninfo) != 0) {
 LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "getting numainfo");
 goto fail;
 }
 
-if (ninfo.max_node_index < max_nodes - 1)
-max_nodes = ninfo.max_node_index + 1;
+*nr = ninfo.num_nodes;
 
-*nr = max_nodes;
+ret = libxl__zalloc(NOGC, sizeof(libxl_numainfo) * ninfo.num_nodes);
+for (i = 0; i < ninfo.num_nodes; i++)
+ret[i].dists = libxl__calloc(NOGC, ninfo.num_nodes, sizeof(*distance));
 
-ret = libxl__zalloc(NOGC, sizeof(libxl_numainfo) * max_nodes);
-for (i = 0; i < max_nodes; i++)
-ret[i].dists = libxl__calloc(NOGC, max_nodes, sizeof(*node_dists));
-
-for (i = 0; i < max_nodes; i++) {
-#define V(mem, i) (mem[i] == INVALID_NUMAINFO_ID) ? \
-LIBXL_NUMAINFO_INVALID_ENTRY : mem[i]
-ret[i].size = V(memsize, i);
-ret[i].free = V(memfree, i);
-ret[i].num_dists = max_nodes;
-for (j = 0; j < ret[i].num_dists; j++)
-ret[i].dists[j] = V(node_dists, i * max_nodes + j);
+for (i = 0; i < ninfo.num_nodes; i++) {
+#define V(val, invalid) (val == invalid) ? \
+   LIBXL_NUMAINFO_INVALID_ENTRY : val
+ret[i].size = V(meminfo[i].memsize, XEN_INVALID_MEM_SZ);
+ret[i].free = V(meminfo[i].memfree, XEN_INVALID_MEM_SZ);
+ret[i].num_dists = ninfo.num_nodes;
+for (j

Re: [Xen-devel] Question about DMA on 1:1 mapping dom0 of arm64

2015-04-17 Thread Chen Baozi

On Fri, Apr 17, 2015 at 02:21:45PM +0100, Ian Campbell wrote:
> On Fri, 2015-04-17 at 19:24 +0800, Chen Baozi wrote:
> > Hi all,
> > 
> > According to my recent experience, there might be some problems of swiotlb
> > dma map on 1:1 mapping arm64 dom0 with large memory. The issue is like 
> > below:
> > 
> > For those arm64 server with large memory, it is possible to set dom0_mem >
> > 4G (e.g. I have one set with 16G). In this case, according to my 
> > understanding,
> > there is chance that the dom0 kernel needs to map some buffers above 4G to 
> > do
> 
>  ^below?
> 
> > DMA operations (e.g. in snps,dwmac ethernet driver). However, most DMA 
> > engines
> > support only 32-bit physical address, thus aren't able to operate directly 
> > on
> > those memory.
> 
> Even on arm64 systems with RAM above 4GB? That seems short-sighted.
> Oh well, I suppose we have to live with it.

As I understand it, on most ARM SoCs the DMA engines come from third-party IP
companies and are independent of arm32/arm64. Thus, DMA engines with 32-bit
addressing should be common even on arm64 systems. The preferred way is to
use/enable the SMMU (IOMMU). However, we are focusing on 1:1 mapping right now...

> 
> >  IIUC, swiotlb is implemented to solve this (using bounce buffer),
> > if there is no IOMMU or IOMMU is not enabled on the system. Sadly, it seems
> > that xen_swiotlb_map_page in my dom0 kernel allocates
> > (start_dma_addr = 0x94480) the buffers for DMA above 4G which fails
> > dma_capable() checking and was then unable to return from 
> > xen_swiotlb_map_page()
> > successfully.
> 
> The swiotlb bounce buffer have been allocated below 4GB? 

I have no idea about the exact behavior of the bounce buffer. But I don't think
it has been allocated below 4GB on my board, because in that case it wouldn't
fail the dma_capable() check at the end of xen_swiotlb_map_page().

> I suspect that
> xen_swiotlb_init is buggy for ARM -- it allocates some random pages and
> then swizzles the backing pages for ones < 4G, but that won't work on an
> ARM dom0 with a 1:1 mapping, I don't think. Do you see error messages
> along those lines?
> 
> Essentially I think either xen_swiotlb_fixup is unable to work on ARM,
> or the following:
> start_dma_addr = xen_virt_to_bus(xen_io_tlb_start);
> is returning 1:1 and not reflecting the fixup.

Yes, that seems very likely to be what happened on my system.

> 
> > If I set dom0_mem to a small value (e.g. 512M), which makes all physical 
> > memory
> > of dom0 below 4G, everything goes fine.
> 
> So you are getting allocated memory below 4G?

If all the banks of memory that xen populates for dom0 are below 4G, yes.
However, if some banks of memory for dom0 are above 4G, usually not.

> 
> You message on IRC suggested you weren't, did you hack around this?

Yes. I did some hacks earlier to help understand my situation. What I have done
and observed is as follows:

1. At the very beginning, I booted the system with the default dom0_mem value,
which is 128M, and I didn't notice the DMA buffer problem.

2. I then tried a larger dom0_mem (16G). The ethernet driver reported that it
could not initialise its rx buffers (DMA buffers). I also found that
allocate_memory_11 didn't populate any banks of memory below 4G for dom0.
At that time, I guessed the failure was caused by no memory banks below 4G
being populated. (There is only a 2GB address space below 4G for physical
memory on my platform, and there is a hole for the PCI memory address space
above 4G before the memory address space continues.)

3. So I hacked allocate_memory_11 to set lowmem=true manually, which makes xen
on arm64 behave as it does on arm32 and populate at least one bank of memory
below 4G for dom0. (This is the point at which I sent you the message on IRC.)
I thought that would solve the problem, but it didn't.

4. Then I found that once xen populates any banks of memory above 4G, the
ethernet driver is likely (almost every time with dom0_mem=16G) to use buffers
above 4G, regardless of whether dom0 has banks of memory below 4G.

> 
> I think we have two options, either xen_swiotlb_init allocates pages
> below 4GB (e.g. __GFP_DMA) or we do something to allow xen_swiotlb_fixup
> to actually work even on a 1:1 dom0.
> 
> Although the first option seems preferable at first glance it has the
> short coming that it requires dom0 to have some memory below 4GB, which
> might not necessarily be the case. The second option seems like it might
> be uglier but doesn't suffer from this issue.
> 
> Can you please look and find out if the IPA at 0x94480 is actually
> backed by 1:1 RAM or if xen_swiotlb_fixup has done it's job and updated
> things such that the associated PAs are below 4GB?

I am at home now and will check it out tomorrow. But I guess it should be
the first situation you mentioned.

Cheers,

Baozi.
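
The failure described in this message comes down to a range check against the device's DMA mask. Below is a minimal user-space model of that check in C, assuming a 32-bit mask. The names `dma_range_ok`, `needs_bounce`, `bounce_pool_usable` and `MODEL_DMA_MASK_32` are illustrative and this is not the kernel's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask of a device that can only drive 32 address bits (assumption). */
#define MODEL_DMA_MASK_32 0xffffffffULL

/*
 * True if the whole range [addr, addr + size) is reachable under dma_mask.
 * Written so that addr + size cannot overflow.
 */
static bool dma_range_ok(uint64_t dma_mask, uint64_t addr, uint64_t size)
{
    if (size == 0 || addr > dma_mask)
        return false;
    return size - 1 <= dma_mask - addr;
}

/* A buffer needs bouncing when the device cannot reach it directly. */
static bool needs_bounce(uint64_t dma_mask, uint64_t addr, uint64_t size)
{
    return !dma_range_ok(dma_mask, addr, size);
}

/*
 * Bouncing only helps if the bounce pool itself passes the same check.
 * With a 1:1 mapping the pool's bus address equals its physical address,
 * so a pool placed above 4 GiB is unusable for a 32-bit device.
 */
static bool bounce_pool_usable(uint64_t dma_mask, uint64_t pool_start,
                               uint64_t pool_size)
{
    return dma_range_ok(dma_mask, pool_start, pool_size);
}
```

In this model a pool that starts at or above 4 GiB fails `bounce_pool_usable` for a 32-bit mask, which matches the symptom reported above: the mapping cannot succeed no matter where the original buffer was.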

Re: [Xen-devel] Question about DMA on 1:1 mapping dom0 of arm64

2015-04-17 Thread Ian Campbell

On Fri, 2015-04-17 at 17:13 +0100, Stefano Stabellini wrote:
> On Fri, 17 Apr 2015, Ian Campbell wrote:
> > On Fri, 2015-04-17 at 15:34 +0100, Stefano Stabellini wrote:
> > > > > If I set dom0_mem to a small value (e.g. 512M), which makes all 
> > > > > physical memory
> > > > > of dom0 below 4G, everything goes fine.
> > > > 
> > > > So you are getting allocated memory below 4G?
> > > > 
> > > > You message on IRC suggested you weren't, did you hack around this?
> > > > 
> > > > I think we have two options, either xen_swiotlb_init allocates pages
> > > > below 4GB (e.g. __GFP_DMA) or we do something to allow xen_swiotlb_fixup
> > > > to actually work even on a 1:1 dom0.
> > > 
> > > I don't think that making xen_swiotlb_fixup work on ARM is a good idea:
> > > it would break the 1:1.
> > 
> > This would actually work though, I think, because this is the swiotlb so
> > we definitely have the opportunity to return the actual DMA address
> > whenever we use this buffer and the device will use it in the right
> > places for sure.
> 
> The code is pretty complex as is -- I would rather avoid adding more
> complexity to it.  For example we would need to bring back a mechanism
> to track dma address -> pseudo-physical address mappings on arm, even
> though it would be far simpler of course.

There's no need for any of that, just initialise start_dma_addr with the
correct DMA addr when you do the fixup and the swiotlb code will already
take care of everything, won't it?

That DMA addr would be given to us by Xen as we do the flip.

> Also I think it makes sense to use the swiotlb buffer for its original
> purpose.
> 
> If we could introduce a mechanism to get a lower than 4G buffer in dom0,
> but matching the 1:1, I think it would make the maintenance much easier
> on the linux side.
> 
> 
> > The swiotlb buffer can't ever get reused for anything else so we don't
> > even need to worry about undoing the damage later.
> > 
> > > > Although the first option seems preferable at first glance it has the
> > > > short coming that it requires dom0 to have some memory below 4GB, which
> > > > might not necessarily be the case.
> > > 
> > > I think we should arrange dom0 to get some memory under 4G to begin
> > > with, not necessarily all of it.
> > 
> > It's another option for sure, the question is how to decide how much,
> > and how to make it configurable etc.
>  
> Adding a memory region below 4G should be relatively easy, right?  Is it
> sizing it up the problem? FYI the default size of the swiotlb buffer in
> Linux is 64M, but it should be able to cope with less than that.

Picking a size might be tricky, you can't just say "Linux is 64M", what
if that 64M gets used for something else first.

Given a machine with 2GB of low RAM and 126GB of high RAM and a 27G dom0,
which fraction do you put below 4GB?

But maybe it works to just say that dom0 gets as much lowmem as the host
memory map and our ability to allocate allows for. That's not entirely
trivial either though, in case Xen has (by chance) allocated any lowmem
for itself the bank bin packing stuff will get even more complex.
Likewise firmware reserved regions.

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]

2015-04-17 Thread Ian Jackson

Prashant Sreedharan writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]"):
> Ok this is what is causing the problem, the driver uses
> DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma
> "mapping" and dma_unmap_addr() to get the "mapping" value. On most of
> the platforms this is a no-op, but it appears with "iommu=soft and
> swiotlb=force" this house keeping is required, when I pass the correct
> dma_addr instead of 0 while calling pci_unmap_/pci_dma_sync_ I don't see
> the corruption. ie If you set CONFIG_NEED_DMA_MAP_STATE=y in your kernel
> config you should not see the problem. Can you confirm ? Thanks

That kernel config option is an automatically computed one:

config NEED_DMA_MAP_STATE
def_bool y
depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG

and grepping my .config shows:

# CONFIG_INTEL_IOMMU is not set
# CONFIG_DMA_API_DEBUG is not set

It's a 32-bit kernel so it hasn't got X86_64 enabled either.

Arguably at least some of osstest's kernels should have INTEL_IOMMU
enabled to detect conflicts between Xen's use of the iommu and
possible attempts by Linux to do the same thing, but not having it
enabled should not cause a driver bug.

Ian.
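
Why the bookkeeping can silently vanish is easier to see with the macros spelled out. The sketch below is a user-space model in C of how the three helpers behave with and without the config option. `MODEL_NEED_DMA_MAP_STATE`, `struct ring_info` and `remember_and_recall` are hypothetical names, and the macro bodies are simplified from memory of the kernel headers rather than copied from them (the model puts the semicolon inside the definition macro so that the empty variant stays strictly valid C).

```c
#include <stdint.h>

typedef uint64_t dma_addr_t;

/* Define MODEL_NEED_DMA_MAP_STATE to model CONFIG_NEED_DMA_MAP_STATE=y. */
#ifdef MODEL_NEED_DMA_MAP_STATE
#define DEFINE_DMA_UNMAP_ADDR(name)        dma_addr_t name;
#define dma_unmap_addr(ptr, name)          ((ptr)->name)
#define dma_unmap_addr_set(ptr, name, val) (((ptr)->name) = (val))
#else
/* Without the option the field does not exist and reads yield 0. */
#define DEFINE_DMA_UNMAP_ADDR(name)
#define dma_unmap_addr(ptr, name)          (0)
#define dma_unmap_addr_set(ptr, name, val) do { } while (0)
#endif

/* Hypothetical per-descriptor driver state. */
struct ring_info {
    void *skb;
    DEFINE_DMA_UNMAP_ADDR(mapping)
};

/* Store a DMA handle, then read back what the unmap path would see. */
static dma_addr_t remember_and_recall(struct ring_info *ri, dma_addr_t handle)
{
    dma_unmap_addr_set(ri, mapping, handle);
    return dma_unmap_addr(ri, mapping);
}
```

Built without `MODEL_NEED_DMA_MAP_STATE`, `remember_and_recall` returns 0 for any handle, so an unmap or sync call fed from it would be handed address 0. That is harmless when unmapping is a no-op, but not when a bounce buffer has to be found from the address.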

Re: [Xen-devel] Question about DMA on 1:1 mapping dom0 of arm64

2015-04-17 Thread Stefano Stabellini

On Fri, 17 Apr 2015, Ian Campbell wrote:
> On Fri, 2015-04-17 at 15:34 +0100, Stefano Stabellini wrote:
> > > > If I set dom0_mem to a small value (e.g. 512M), which makes all 
> > > > physical memory
> > > > of dom0 below 4G, everything goes fine.
> > > 
> > > So you are getting allocated memory below 4G?
> > > 
> > > You message on IRC suggested you weren't, did you hack around this?
> > > 
> > > I think we have two options, either xen_swiotlb_init allocates pages
> > > below 4GB (e.g. __GFP_DMA) or we do something to allow xen_swiotlb_fixup
> > > to actually work even on a 1:1 dom0.
> > 
> > I don't think that making xen_swiotlb_fixup work on ARM is a good idea:
> > it would break the 1:1.
> 
> This would actually work though, I think, because this is the swiotlb so
> we definitely have the opportunity to return the actual DMA address
> whenever we use this buffer and the device will use it in the right
> places for sure.

The code is pretty complex as is -- I would rather avoid adding more
complexity to it.  For example we would need to bring back a mechanism
to track dma address -> pseudo-physical address mappings on arm, even
though it would be far simpler of course.

Also I think it makes sense to use the swiotlb buffer for its original
purpose.

If we could introduce a mechanism to get a lower than 4G buffer in dom0,
but matching the 1:1, I think it would make the maintenance much easier
on the linux side.

> The swiotlb buffer can't ever get reused for anything else so we don't
> even need to worry about undoing the damage later.
> 
> > > Although the first option seems preferable at first glance it has the
> > > short coming that it requires dom0 to have some memory below 4GB, which
> > > might not necessarily be the case.
> > 
> > I think we should arrange dom0 to get some memory under 4G to begin
> > with, not necessarily all of it.
> 
> It's another option for sure, the question is how to decide how much,
> and how to make it configurable etc.

Adding a memory region below 4G should be relatively easy, right?  Is it
sizing it up the problem? FYI the default size of the swiotlb buffer in
Linux is 64M, but it should be able to cope with less than that.

Re: [Xen-devel] crash in efi_runtime_call

2015-04-17 Thread Konrad Rzeszutek Wilk

On Fri, Apr 17, 2015 at 04:31:18PM +0200, Olaf Hering wrote:
> On Fri, Apr 17, Konrad Rzeszutek Wilk wrote:
> 
> > The /noexitboot will inhibit Xen from calling ExitBootServices.
> 
> How is that supposed to be passed to xen.efi? Looks like I have no
> cmdline interface.

EFI Shell. 

Or you can run 'efibootmgr' to add the arguments to xen.efi

> 
> Olaf

Re: [Xen-devel] Commit 1aeb1156fa43fe2cd2b5003995b20466cd19a622: "x86 don't change affinity with interrupt unmasked", APCI errors and assorted pci trouble

2015-04-17 Thread Sander Eikelenboom


Friday, April 17, 2015, 5:46:56 PM, you wrote:

> On 17/04/15 16:20, Jan Beulich wrote:
> On 17.04.15 at 17:11,  wrote:
>>> Friday, April 17, 2015, 1:43:32 PM, you wrote:
>>>> --- unstable.orig/xen/drivers/passthrough/amd/iommu_intr.c
>>>> +++ unstable/xen/drivers/passthrough/amd/iommu_intr.c
>>>> @@ -365,15 +365,17 @@ unsigned int amd_iommu_read_ioapic_from_
>>>>  unsigned int apic, unsigned int reg)
>>>>  {
>>>>  unsigned int val = __io_apic_read(apic, reg);
>>>> +unsigned int pin = (reg - 0x10) / 2;
>>>> +unsigned int offset = ioapic_sbdf[IO_APIC_ID(apic)].pin_2_idx[pin];
>>>>  
>>>> -if ( !(reg & 1) )
>>>> +if ( !(reg & 1) && offset < INTREMAP_ENTRIES )
>>>>  {
>>>> -unsigned int offset = val & (INTREMAP_ENTRIES - 1);
>>>>  u16 bdf = ioapic_sbdf[IO_APIC_ID(apic)].bdf;
>>>>  u16 seg = ioapic_sbdf[IO_APIC_ID(apic)].seg;
>>>>  u16 req_id = get_intremap_requestor_id(seg, bdf);
>>>>  const u32 *entry = get_intremap_entry(seg, req_id, offset);
>>>>  
>>>> +ASSERT(offset == (val & (INTREMAP_ENTRIES - 1)));
>>>>  val &= ~(INTREMAP_ENTRIES - 1);
>>>>  val |= get_field_from_reg_u32(*entry,
>>>>INT_REMAP_ENTRY_INTTYPE_MASK,
>>>
>>> Hmmm can this patch or tim's patch make andrew's patch ineffective ?
>> I can't see how either would.

> Tim indicated that he thought my patch might be racy, so I might not be
> surprised if a problem still exists.

> ~Andrew

Just reverted Tim's, but it still crashes.
So that's probably it then.

--
Sander


Re: [Xen-devel] Commit 1aeb1156fa43fe2cd2b5003995b20466cd19a622: "x86 don't change affinity with interrupt unmasked", APCI errors and assorted pci trouble

2015-04-17 Thread Andrew Cooper

On 17/04/15 16:20, Jan Beulich wrote:
 On 17.04.15 at 17:11,  wrote:
>> Friday, April 17, 2015, 1:43:32 PM, you wrote:
>>> --- unstable.orig/xen/drivers/passthrough/amd/iommu_intr.c
>>> +++ unstable/xen/drivers/passthrough/amd/iommu_intr.c
>>> @@ -365,15 +365,17 @@ unsigned int amd_iommu_read_ioapic_from_
>>>  unsigned int apic, unsigned int reg)
>>>  {
>>>  unsigned int val = __io_apic_read(apic, reg);
>>> +unsigned int pin = (reg - 0x10) / 2;
>>> +unsigned int offset = ioapic_sbdf[IO_APIC_ID(apic)].pin_2_idx[pin];
>>>  
>>> -if ( !(reg & 1) )
>>> +if ( !(reg & 1) && offset < INTREMAP_ENTRIES )
>>>  {
>>> -unsigned int offset = val & (INTREMAP_ENTRIES - 1);
>>>  u16 bdf = ioapic_sbdf[IO_APIC_ID(apic)].bdf;
>>>  u16 seg = ioapic_sbdf[IO_APIC_ID(apic)].seg;
>>>  u16 req_id = get_intremap_requestor_id(seg, bdf);
>>>  const u32 *entry = get_intremap_entry(seg, req_id, offset);
>>>  
>>> +ASSERT(offset == (val & (INTREMAP_ENTRIES - 1)));
>>>  val &= ~(INTREMAP_ENTRIES - 1);
>>>  val |= get_field_from_reg_u32(*entry,
>>>INT_REMAP_ENTRY_INTTYPE_MASK,
>>
>> Hmmm can this patch or tim's patch make andrew's patch ineffective ?
> I can't see how either would.

Tim indicated that he thought my patch might be racy, so I might not be
surprised if a problem still exists.

~Andrew

[Xen-devel] [PATCH] libxl: document foreground '-F' option of create command

2015-04-17 Thread Giuseppe Mazzotta

commit 482b29753946125bbcf892a5daa57b86620f68b1
Author: Giuseppe Mazzotta 
Date:   Fri Apr 17 17:10:03 2015 +0200

libxl: document foreground '-F' option of create command

Signed-off-by: Giuseppe Mazzotta 

diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index 9284887..7f4759b 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -30,6 +30,7 @@ struct cmd_spec cmd_table[] = {
   "-n, --dryrunDry run - prints the resulting configuration\n"
   " (deprecated in favour of global -N option).\n"
   "-d  Enable debug messages.\n"
+  "-F  Run in foreground until death of the domain.\n"
   "-e  Do not wait in the background for the death of the domain.\n"
   "-V, --vncviewer Connect to the VNC display after the domain is created.\n"
   "-A, --vncviewer-autopass\n"

[Xen-devel] [PATCH] AMD IOMMU: only translate remapped IO-APIC RTEs

2015-04-17 Thread Jan Beulich

1aeb1156fa ("x86 don't change affinity with interrupt unmasked")
introducing RTE reads prior to the respective interrupt having got
enabled for the first time uncovered a bug in 2ca9fbd739 ("AMD IOMMU:
allocate IRTE entries instead of using a static mapping"): We obviously
shouldn't be translating RTEs for which remapping didn't get set up
yet.

Reported-by: Sander Eikelenboom 
Signed-off-by: Jan Beulich 

--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -365,15 +365,17 @@ unsigned int amd_iommu_read_ioapic_from_
 unsigned int apic, unsigned int reg)
 {
 unsigned int val = __io_apic_read(apic, reg);
+unsigned int pin = (reg - 0x10) / 2;
+unsigned int offset = ioapic_sbdf[IO_APIC_ID(apic)].pin_2_idx[pin];
 
-if ( !(reg & 1) )
+if ( !(reg & 1) && offset < INTREMAP_ENTRIES )
 {
-unsigned int offset = val & (INTREMAP_ENTRIES - 1);
 u16 bdf = ioapic_sbdf[IO_APIC_ID(apic)].bdf;
 u16 seg = ioapic_sbdf[IO_APIC_ID(apic)].seg;
 u16 req_id = get_intremap_requestor_id(seg, bdf);
 const u32 *entry = get_intremap_entry(seg, req_id, offset);
 
+ASSERT(offset == (val & (INTREMAP_ENTRIES - 1)));
 val &= ~(INTREMAP_ENTRIES - 1);
 val |= get_field_from_reg_u32(*entry,
   INT_REMAP_ENTRY_INTTYPE_MASK,




Re: [Xen-devel] Commit 1aeb1156fa43fe2cd2b5003995b20466cd19a622: "x86 don't change affinity with interrupt unmasked", APCI errors and assorted pci trouble

2015-04-17 Thread Jan Beulich

>>> On 17.04.15 at 17:11,  wrote:
> Friday, April 17, 2015, 1:43:32 PM, you wrote:
>> --- unstable.orig/xen/drivers/passthrough/amd/iommu_intr.c
>> +++ unstable/xen/drivers/passthrough/amd/iommu_intr.c
>> @@ -365,15 +365,17 @@ unsigned int amd_iommu_read_ioapic_from_
>>  unsigned int apic, unsigned int reg)
>>  {
>>  unsigned int val = __io_apic_read(apic, reg);
>> +unsigned int pin = (reg - 0x10) / 2;
>> +unsigned int offset = ioapic_sbdf[IO_APIC_ID(apic)].pin_2_idx[pin];
>>  
>> -if ( !(reg & 1) )
>> +if ( !(reg & 1) && offset < INTREMAP_ENTRIES )
>>  {
>> -unsigned int offset = val & (INTREMAP_ENTRIES - 1);
>>  u16 bdf = ioapic_sbdf[IO_APIC_ID(apic)].bdf;
>>  u16 seg = ioapic_sbdf[IO_APIC_ID(apic)].seg;
>>  u16 req_id = get_intremap_requestor_id(seg, bdf);
>>  const u32 *entry = get_intremap_entry(seg, req_id, offset);
>>  
>> +ASSERT(offset == (val & (INTREMAP_ENTRIES - 1)));
>>  val &= ~(INTREMAP_ENTRIES - 1);
>>  val |= get_field_from_reg_u32(*entry,
>>INT_REMAP_ENTRY_INTTYPE_MASK,
> 
> 
> Hmmm can this patch or tim's patch make andrew's patch ineffective ?

I can't see how either would.

Jan
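
The quoted patch relies on the IO-APIC register layout: redirection table entries start at register 0x10 and each pin occupies two consecutive 32-bit registers, low half first. The following minimal C sketch shows that arithmetic and the added guard. `rte_reg_to_pin`, `rte_reg_is_low_half` and `should_translate` are illustrative names, and the table size passed in is a parameter here rather than the real INTREMAP_ENTRIES value.

```c
#include <stdbool.h>

/* First redirection table register of an IO-APIC. */
#define IOAPIC_REDIR_BASE 0x10u

/* Two registers per pin, so the pin number is the register pair index. */
static unsigned int rte_reg_to_pin(unsigned int reg)
{
    return (reg - IOAPIC_REDIR_BASE) / 2;
}

/* Even registers hold the low 32 bits of an entry (vector, delivery mode). */
static bool rte_reg_is_low_half(unsigned int reg)
{
    return !(reg & 1);
}

/*
 * Model of the guard added by the patch: only translate the value read back
 * when remapping has been set up for the pin. The assumption here is that a
 * pin without an allocated remap entry carries an index at or above the
 * table size.
 */
static bool should_translate(unsigned int reg, unsigned int remap_idx,
                             unsigned int nr_entries)
{
    return rte_reg_is_low_half(reg) && remap_idx < nr_entries;
}
```

Registers 0x10 and 0x11 both map to pin 0, but only 0x10 is a candidate for translation, and only once the pin has a valid remap index.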


Re: [Xen-devel] Commit 1aeb1156fa43fe2cd2b5003995b20466cd19a622: "x86 don't change affinity with interrupt unmasked", APCI errors and assorted pci trouble

2015-04-17 Thread Sander Eikelenboom


Friday, April 17, 2015, 1:43:32 PM, you wrote:

 On 14.04.15 at 14:46,  wrote:
>> I just had a hunch .. could it be related to the kernel apci/irq refactoring
>> series of Jiang Liu, that already caused a lot of trouble in 3.17, 3.18 and 
>> 3.19
>> with Xen.  And yes that seems to be the case:
>> 
>> On Xen without "x86 don't change affinity with interrupt unmasked"
>> - 3.16 && 3.19 && 4.0 all work fine 
>> 
>> On Xen with "x86 don't change affinity with interrupt unmasked" 
>> - 3.16 (which is before that kernel refactoring series) works fine.
>> - 3.19, 4.0 both give the dom0 kernel hangs and the :
>> (XEN) [2015-03-26 20:35:42.205] APIC error on CPU0: 00(40)
>> (XEN) [2015-03-26 20:35:42.372] APIC error on CPU0: 40(40)
>> 
>> (haven't tested 3.17 and 3.18 because these have assorted problems due to that
>>  series that weren't fixed in time before stable updates ended.)
>> 
>> So it seems Jan's patch seems to interfere with that patch series.

> That's rather odd a finding - the patch in question in fact uncovered
> a bug introduced in 2ca9fbd739 ("AMD IOMMU: allocate IRTE entries
> instead of using a static mapping") in that IO-APIC RTE reads would
> unconditionally translate the data (i.e. regardless of whether the
> entry was already in translated format). The patch below fixes this
> for me - can you please give this a try too?

> Thanks, Jan

> --- unstable.orig/xen/drivers/passthrough/amd/iommu_intr.c
> +++ unstable/xen/drivers/passthrough/amd/iommu_intr.c
> @@ -365,15 +365,17 @@ unsigned int amd_iommu_read_ioapic_from_
>  unsigned int apic, unsigned int reg)
>  {
>  unsigned int val = __io_apic_read(apic, reg);
> +unsigned int pin = (reg - 0x10) / 2;
> +unsigned int offset = ioapic_sbdf[IO_APIC_ID(apic)].pin_2_idx[pin];
>  
> -if ( !(reg & 1) )
> +if ( !(reg & 1) && offset < INTREMAP_ENTRIES )
>  {
> -unsigned int offset = val & (INTREMAP_ENTRIES - 1);
>  u16 bdf = ioapic_sbdf[IO_APIC_ID(apic)].bdf;
>  u16 seg = ioapic_sbdf[IO_APIC_ID(apic)].seg;
>  u16 req_id = get_intremap_requestor_id(seg, bdf);
>  const u32 *entry = get_intremap_entry(seg, req_id, offset);
>  
> +ASSERT(offset == (val & (INTREMAP_ENTRIES - 1)));
>  val &= ~(INTREMAP_ENTRIES - 1);
>  val |= get_field_from_reg_u32(*entry,
>INT_REMAP_ENTRY_INTTYPE_MASK,


Hmmm can this patch or tim's patch make andrew's patch ineffective ?

I now have applied:
Jan's:
diff --git a/xen/drivers/passthrough/amd/iommu_intr.c b/xen/drivers/passthrough/amd/iommu_intr.c
index c1b76fb..879698e 100644
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -365,15 +365,17 @@ unsigned int amd_iommu_read_ioapic_from_ire(
 unsigned int apic, unsigned int reg)
 {
 unsigned int val = __io_apic_read(apic, reg);
+unsigned int pin = (reg - 0x10) / 2;
+unsigned int offset = ioapic_sbdf[IO_APIC_ID(apic)].pin_2_idx[pin];

-if ( !(reg & 1) )
+if ( !(reg & 1) && offset < INTREMAP_ENTRIES )
 {
-unsigned int offset = val & (INTREMAP_ENTRIES - 1);
 u16 bdf = ioapic_sbdf[IO_APIC_ID(apic)].bdf;
 u16 seg = ioapic_sbdf[IO_APIC_ID(apic)].seg;
 u16 req_id = get_intremap_requestor_id(seg, bdf);
 const u32 *entry = get_intremap_entry(seg, req_id, offset);

+ASSERT(offset == (val & (INTREMAP_ENTRIES - 1)));
 val &= ~(INTREMAP_ENTRIES - 1);
 val |= get_field_from_reg_u32(*entry,
   INT_REMAP_ENTRY_INTTYPE_MASK,


Tim's:
@@ -529,10 +531,11 @@ int amd_iommu_msi_msg_update_ire(
 } while ( PCI_SLOT(bdf) == PCI_SLOT(pdev->devfn) );

 if ( !rc )
+{
 for ( i = 1; i < nr; ++i )
 msi_desc[i].remap_index = msi_desc->remap_index + i;
-
-msg->data = data;
+msg->data = data;
+}
 return rc;
 }



Andrew's:
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 9eb8d33..3aee00c 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -56,9 +56,9 @@ int arch_iommu_populate_page_table(struct domain *d)

 while ( !rc && (page = page_list_remove_head(&d->page_list)) )
 {
-if ( has_hvm_container_domain(d) ||
-(page->u.inuse.type_info & PGT_type_mask) == PGT_writable_page )
-{
+if ( (mfn_to_gmfn(d, page_to_mfn(page)) != INVALID_MFN) &&
+(has_hvm_container_domain(d) ||
+ ((page->u.inuse.type_info & PGT_type_mask) == PGT_writable_page)) )
 BUG_ON(SHARED_M2P(mfn_to_gmfn(d, page_to_mfn(page))));
 rc = hd->platform_ops->map_page(
 d, mfn_to_gmfn(d, page_to_mfn(page)), page_to_mfn(page),



And I now have this one again (which Andrew's patch should prevent):
(XEN) [2015-04-17 15:00:55.954] Xen call trace:
(XEN) [2015-04-17 15:00:55.954][] 
iommu_pde_from_

Re: [Xen-devel] Question about DMA on 1:1 mapping dom0 of arm64

2015-04-17 Thread Ian Campbell

On Fri, 2015-04-17 at 15:34 +0100, Stefano Stabellini wrote:
> > > If I set dom0_mem to a small value (e.g. 512M), which makes all physical 
> > > memory
> > > of dom0 below 4G, everything goes fine.
> > 
> > So you are getting allocated memory below 4G?
> > 
> > You message on IRC suggested you weren't, did you hack around this?
> > 
> > I think we have two options, either xen_swiotlb_init allocates pages
> > below 4GB (e.g. __GFP_DMA) or we do something to allow xen_swiotlb_fixup
> > to actually work even on a 1:1 dom0.
> 
> I don't think that making xen_swiotlb_fixup work on ARM is a good idea:
> it would break the 1:1.

This would actually work though, I think, because this is the swiotlb so
we definitely have the opportunity to return the actual DMA address
whenever we use this buffer and the device will use it in the right
places for sure.

The swiotlb buffer can't ever get reused for anything else so we don't
even need to worry about undoing the damage later.

> > Although the first option seems preferable at first glance it has the
> > short coming that it requires dom0 to have some memory below 4GB, which
> > might not necessarily be the case.
> 
> I think we should arrange dom0 to get some memory under 4G to begin
> with, not necessarily all of it.

It's another option for sure, the question is how to decide how much,
and how to make it configurable etc.

Ian.

Re: [Xen-devel] [PATCH v4 12/12] docs: add xl-psr.markdown

2015-04-17 Thread Chao Peng

On Thu, Apr 16, 2015 at 12:58:16PM +0100, Ian Campbell wrote:
> On Thu, 2015-04-09 at 17:18 +0800, Chao Peng wrote:
> 
> BTW, do you know if someone is planning to work on libvirt integration
> for this stuff?

As far as I know, people from Intel will take care of this.

> (Aside: "cache-occupancy" would be more in keeping with the interfaces,
> oh well, you can fix if you feel like it, or not bother if you like).

It's fixed in v5.

> > +
> > +Detailed information please refer to Intel SDM chapter 17.15.
> 
> Perhaps a few simple examples, would make the basics clearer without
> having to hit the SDM for the full gory detail e.g.
> 
> For example, assuming a system with 8 portions and 3 domains:
> 
> A CBM of 0xff for every domain means each domain can access the
> whole cache. This is the default.
> 
> Giving one domain a CBM of 0x0F and the other two domain's 0xF0
> means that the first domain gets exclusive access to half of the
> cache (half of the portions) and the other two will share the
> other half.
> 
> Giving one domain a CBM of 0x0F, one 0x30 and the last 0xc0
> would give the first domain exclusive access to half the cache,
> and the other two exclusive access to one quarter each.
> 
> Then have the reference the SDM for more detailed stuff.

Thank you for typing so much ...

Chao
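
The CBM examples quoted above can be checked mechanically. The following is a minimal standalone sketch, not code from the patch series: the helper names (cbm_portions, cbm_shared, cbm_is_exclusive) are made up for illustration. It treats each set bit as one cache portion and treats two domains as sharing the portions where their masks overlap.

```c
#include <assert.h>
#include <stdint.h>

/* Number of cache portions (set bits) that a capacity bitmask covers. */
static unsigned int cbm_portions(uint64_t cbm)
{
    unsigned int n = 0;

    for ( ; cbm; cbm &= cbm - 1 )   /* clear the lowest set bit each round */
        n++;
    return n;
}

/* Portions that two domains can both allocate into (the mask overlap). */
static uint64_t cbm_shared(uint64_t a, uint64_t b)
{
    return a & b;
}

/* Non-zero if 'mine' does not overlap with any of the 'n' other masks. */
static int cbm_is_exclusive(uint64_t mine, const uint64_t *others,
                            unsigned int n)
{
    unsigned int i;

    for ( i = 0; i < n; i++ )
        if ( cbm_shared(mine, others[i]) )
            return 0;
    return 1;
}
```

With 8 portions, 0x0F against two domains at 0xF0 is exclusive for the first domain, while the other two overlap completely with each other. With 0x0F, 0x30 and 0xc0, no pair of masks overlaps.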

Re: [Xen-devel] Question about DMA on 1:1 mapping dom0 of arm64

2015-04-17 Thread Ian Campbell

On Fri, 2015-04-17 at 15:32 +0100, Stefano Stabellini wrote:
> I think that given that dom0 is mapped 1:1 on ARM, the easiest and best
> fix would be to simply allocate memory under 4G to begin with.

Not necessarily best, see my reply (hint: dom0 might not have RAM under
4GB even if the host does).
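
For reference, the property a bounce buffer has to satisfy for a 32-bit-only DMA engine can be sketched as below. This is a standalone illustration and not the kernel's dma_capable(): the names dma_mask_bits and dma_addr_ok are hypothetical, and it assumes the device address equals the physical address (the 1:1 case discussed in this thread).

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low 'bits' address bits set, e.g. 32 -> 0xffffffff. */
static uint64_t dma_mask_bits(unsigned int bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? ~0ull : (1ull << bits) - 1;
}

/* Non-zero if the whole buffer [addr, addr + size) is reachable under 'mask'. */
static int dma_addr_ok(uint64_t addr, uint64_t size, uint64_t mask)
{
    if ( size == 0 || addr > mask )
        return 0;
    /* Compare without computing addr + size, which could wrap around. */
    return size - 1 <= mask - addr;
}
```

A buffer that starts at or above 4G fails this check for a 32-bit mask however small it is, which is why the bounce buffer itself has to be backed by memory below 4G.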



Re: [Xen-devel] Question about DMA on 1:1 mapping dom0 of arm64

2015-04-17 Thread Stefano Stabellini

On Fri, 17 Apr 2015, Ian Campbell wrote:
> On Fri, 2015-04-17 at 19:24 +0800, Chen Baozi wrote:
> > Hi all,
> > 
> > According to my recent experience, there might be some problems of swiotlb
> > dma map on 1:1 mapping arm64 dom0 with large memory. The issue is like 
> > below:
> > 
> > For those arm64 server with large memory, it is possible to set dom0_mem >
> > 4G (e.g. I have one set with 16G). In this case, according to my 
> > understanding,
> > there is chance that the dom0 kernel needs to map some buffers above 4G to 
> > do
> 
>  ^below?
> 
> > DMA operations (e.g. in snps,dwmac ethernet driver). However, most DMA 
> > engines
> > support only 32-bit physical address, thus aren't able to operate directly 
> > on
> > those memory.
> 
> Even on arm64 systems with RAM above 4GB? That seems short-sighted.
> Oh well, I suppose we have to live with it.
> 
> >  IIUC, swiotlb is implemented to solve this (using bounce buffer),
> > if there is no IOMMU or IOMMU is not enabled on the system. Sadly, it seems
> > that xen_swiotlb_map_page in my dom0 kernel allocates
> > (start_dma_addr = 0x94480) the buffers for DMA above 4G which fails
> > dma_capable() checking and was then unable to return from 
> > xen_swiotlb_map_page()
> > successfully.
> 
> The swiotlb bounce buffer have been allocated below 4GB? I suspect that
> xen_swiotlb_init is buggy for ARM -- it allocates some random pages and
> then swizzles the backing pages for ones < 4G, but that won't work on an
> ARM dom0 with a 1:1 mapping, I don't think. Do you see error messages
> along those lines?
> 
> Essentially I think either xen_swiotlb_fixup is unable to work on ARM,
> or the following:
> start_dma_addr = xen_virt_to_bus(xen_io_tlb_start);
> is returning 1:1 and not reflecting the fixup.

The swiotlb on arm doesn't necessarily get memory under 4G, see my other reply.


> > If I set dom0_mem to a small value (e.g. 512M), which makes all physical memory
> > of dom0 below 4G, everything goes fine.
> 
> So you are getting allocated memory below 4G?
> 
> You message on IRC suggested you weren't, did you hack around this?
> 
> I think we have two options, either xen_swiotlb_init allocates pages
> below 4GB (e.g. __GFP_DMA) or we do something to allow xen_swiotlb_fixup
> to actually work even on a 1:1 dom0.

I don't think that making xen_swiotlb_fixup work on ARM is a good idea:
it would break the 1:1.


> Although the first option seems preferable at first glance it has the
> short coming that it requires dom0 to have some memory below 4GB, which
> might not necessarily be the case.

I think we should arrange dom0 to get some memory under 4G to begin
with, not necessarily all of it.


> The second option seems like it might
> be uglier but doesn't suffer from this issue.
> 
> Can you please look and find out if the IPA at 0x94480 is actually
> backed by 1:1 RAM or if xen_swiotlb_fixup has done it's job and updated
> things such that the associated PAs are below 4GB?

[Xen-devel] [PATCH v5 12/13] tools: add tools support for Intel CAT

2015-04-17 Thread Chao Peng

These are the xc/xl changes to support Intel Cache Allocation
Technology (CAT). Three commands are introduced:
- xl psr-cat-hwinfo
  Show CAT hardware information.
- xl psr-cat-cbm-set [-s socket]  
  Set cache capacity bitmasks(CBM) for a domain.
- xl psr-cat-show 
  Show CAT domain information.

Examples:
[root@vmm-psr vmm]# xl psr-cat-hwinfo
Cache Allocation Technology (CAT):
Socket ID   : 0
L3 Cache: 12288KB
Maximum COS : 15
CBM length  : 12
Default CBM : 0xfff

[root@vmm-psr vmm]# xl psr-cat-cbm-set 0 0xff

[root@vmm-psr vmm]# xl psr-cat-show
Socket ID   : 0
L3 Cache: 12288KB
Default CBM : 0xfff
   ID NAME     CBM
    0 Domain-0 0xff

Signed-off-by: Chao Peng 
---
Changes in v5:
* Add psr-cat-hwinfo.
* Add libxl_psr_cat_info_list_free.
* malloc => libxl__malloc
* Other comments from Ian/Wei.
Changes in v4:
* Add example output in commit message.
* Make libxl__count_physical_sockets private to libxl_psr.c.
* Set errno in several error cases.
* Change libxl_psr_cat_get_l3_info to return all sockets information.
* Remove unused libxl_domain_info call.
Changes in v3:
* Add manpage.
* libxl_psr_cat_set/get_domain_data => libxl_psr_cat_set/get_cbm.
* Move libxl_count_physical_sockets into separate patch.
* Support LIBXL_PSR_TARGET_ALL for libxl_psr_cat_set_cbm.
* Clean up the print codes.
---
 docs/man/xl.pod.1 |  35 +
 tools/libxc/include/xenctrl.h |  15 
 tools/libxc/xc_psr.c  |  76 +++
 tools/libxl/libxl.h   |  26 +++
 tools/libxl/libxl_psr.c   | 170 --
 tools/libxl/libxl_types.idl   |  10 +++
 tools/libxl/xl.h  |   5 ++
 tools/libxl/xl_cmdimpl.c  | 169 +
 tools/libxl/xl_cmdtable.c |  19 +
 9 files changed, 518 insertions(+), 7 deletions(-)

diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 640788f..efc6599 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -1517,6 +1517,41 @@ monitor types are:
 
 =back
 
+=head1 CACHE ALLOCATION TECHNOLOGY
+
+Intel Broadwell and later server platforms offer capabilities to configure and
+make use of the Cache Allocation Technology (CAT) mechanisms, which enable more
+cache resources (i.e. L3 cache) to be made available for high priority
+applications. In Xen implementation, CAT is used to control cache allocation
+on VM basis. To enforce cache on a specific domain, just set capacity bitmasks
+(CBM) for the domain.
+
+=over 4
+
+=item B
+
+Show CAT hardware information.
+
+=item B [I] I I
+
+Set cache capacity bitmasks(CBM) for a domain.
+
+B
+
+=over 4
+
+=item B<-s SOCKET>, B<--socket=SOCKET>
+
+Specify the socket to process, otherwise all sockets are processed.
+
+=back
+
+=item B [I]
+
+Show CAT settings for a certain domain or all domains.
+
+=back
+
 =head1 TO BE DOCUMENTED
 
 We need better documentation for:
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 02d0db8..077cc1b 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2696,6 +2696,12 @@ enum xc_psr_cmt_type {
 XC_PSR_CMT_LOCAL_MEM_COUNT,
 };
 typedef enum xc_psr_cmt_type xc_psr_cmt_type;
+
+enum xc_psr_cat_type {
+XC_PSR_CAT_L3_CBM = 1,
+};
+typedef enum xc_psr_cat_type xc_psr_cat_type;
+
 int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid);
 int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid);
 int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid,
@@ -2710,6 +2716,15 @@ int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, uint32_t cpu,
 uint32_t psr_cmt_type, uint64_t *monitor_data,
 uint64_t *tsc);
 int xc_psr_cmt_enabled(xc_interface *xch);
+
+int xc_psr_cat_set_domain_data(xc_interface *xch, uint32_t domid,
+   xc_psr_cat_type type, uint32_t target,
+   uint64_t data);
+int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid,
+   xc_psr_cat_type type, uint32_t target,
+   uint64_t *data);
+int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
+   uint32_t *cos_max, uint32_t *cbm_len);
 #endif
 
 #endif /* XENCTRL_H */
diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
index e367a80..d8b3a51 100644
--- a/tools/libxc/xc_psr.c
+++ b/tools/libxc/xc_psr.c
@@ -248,6 +248,82 @@ int xc_psr_cmt_enabled(xc_interface *xch)
 
 return 0;
 }
+int xc_psr_cat_set_domain_data(xc_interface *xch, uint32_t domid,
+   xc_psr_cat_type type, uint32_t target,
+   uint64_t data)
+{
+DECLARE_DOMCTL;
+uint32_t cmd;
+
+switch ( type )
+{
+case XC_PSR_CAT_L3_CBM:
+cmd = XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM;
+break;
+default:
+errno = EINVAL;
+return -1;

[Xen-devel] [PATCH v5 06/13] x86: expose CBM length and COS number information

2015-04-17 Thread Chao Peng

General CAT information such as maximum COS and CBM length is exposed to
user space by a SYSCTL hypercall, to help user space construct the CBM.
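
As an illustration of how user space might use the reported CBM length, here is a minimal sketch that builds a contiguous mask which fits within cbm_len bits. The helper name cbm_build and its parameters are assumptions for this example and are not part of the xc/libxl interfaces in this series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a CBM with 'count' contiguous bits starting at bit 'first'.
 * Returns 0 if the request does not fit within 'cbm_len' bits.
 */
static uint64_t cbm_build(unsigned int cbm_len, unsigned int first,
                          unsigned int count)
{
    if ( cbm_len == 0 || cbm_len > 64 || count == 0 ||
         first >= cbm_len || count > cbm_len - first )
        return 0;

    if ( count == 64 )          /* avoid an undefined 64-bit shift */
        return ~0ull;
    return ((1ull << count) - 1) << first;
}
```

For a CBM length of 12, cbm_build(12, 0, 12) yields 0xfff (the whole cache) and cbm_build(12, 0, 8) yields 0xff.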

Signed-off-by: Chao Peng 
Reviewed-by: Andrew Cooper 
---
 xen/arch/x86/psr.c  | 31 +++
 xen/arch/x86/sysctl.c   | 18 ++
 xen/include/asm-x86/psr.h   |  3 +++
 xen/include/public/sysctl.h | 16 
 4 files changed, 68 insertions(+)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 592d610..d784efb 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -215,6 +215,37 @@ void psr_ctxt_switch_to(struct domain *d)
 }
 }
 
+static int get_cat_socket_info(unsigned int socket,
+   struct psr_cat_socket_info **info)
+{
+if ( !cat_socket_info )
+return -ENODEV;
+
+if ( socket >= nr_sockets )
+return -EBADSLT;
+
+if ( !cat_socket_info[socket].enabled )
+return -ENOENT;
+
+*info = cat_socket_info + socket;
+return 0;
+}
+
+int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
+uint32_t *cos_max)
+{
+struct psr_cat_socket_info *info;
+int ret = get_cat_socket_info(socket, &info);
+
+if ( ret )
+return ret;
+
+*cbm_len = info->cbm_len;
+*cos_max = info->cos_max;
+
+return 0;
+}
+
 /* Called with domain lock held, no psr specific lock needed */
 static void psr_free_cos(struct domain *d)
 {
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 611a291..8a9e120 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -171,6 +171,24 @@ long arch_do_sysctl(
 
 break;
 
+case XEN_SYSCTL_psr_cat_op:
+switch ( sysctl->u.psr_cat_op.cmd )
+{
+case XEN_SYSCTL_PSR_CAT_get_l3_info:
+ret = psr_get_cat_l3_info(sysctl->u.psr_cat_op.target,
+  &sysctl->u.psr_cat_op.u.l3_info.cbm_len,
+  &sysctl->u.psr_cat_op.u.l3_info.cos_max);
+
+if ( !ret && __copy_to_guest(u_sysctl, sysctl, 1) )
+ret = -EFAULT;
+
+break;
+default:
+ret = -EOPNOTSUPP;
+break;
+}
+break;
+
 default:
 ret = -ENOSYS;
 break;
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index 45392bf..3a8a406 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -52,6 +52,9 @@ void psr_free_rmid(struct domain *d);
 
 void psr_ctxt_switch_to(struct domain *d);
 
+int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
+uint32_t *cos_max);
+
 int psr_domain_init(struct domain *d);
 void psr_domain_free(struct domain *d);
 
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 711441f..f28e460 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -661,6 +661,20 @@ struct xen_sysctl_psr_cmt_op {
 typedef struct xen_sysctl_psr_cmt_op xen_sysctl_psr_cmt_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cmt_op_t);
 
+#define XEN_SYSCTL_PSR_CAT_get_l3_info   0
+struct xen_sysctl_psr_cat_op {
+uint32_t cmd;   /* IN: XEN_SYSCTL_PSR_CAT_* */
+uint32_t target;/* IN: socket to be operated on */
+union {
+struct {
+uint32_t cbm_len;   /* OUT: CBM length */
+uint32_t cos_max;   /* OUT: Maximum COS */
+} l3_info;
+} u;
+};
+typedef struct xen_sysctl_psr_cat_op xen_sysctl_psr_cat_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_psr_cat_op_t);
+
 struct xen_sysctl {
 uint32_t cmd;
 #define XEN_SYSCTL_readconsole1
@@ -683,6 +697,7 @@ struct xen_sysctl {
 #define XEN_SYSCTL_scheduler_op  19
 #define XEN_SYSCTL_coverage_op   20
 #define XEN_SYSCTL_psr_cmt_op21
+#define XEN_SYSCTL_psr_cat_op22
 uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
 union {
 struct xen_sysctl_readconsole   readconsole;
@@ -705,6 +720,7 @@ struct xen_sysctl {
 struct xen_sysctl_scheduler_op  scheduler_op;
 struct xen_sysctl_coverage_op   coverage_op;
 struct xen_sysctl_psr_cmt_oppsr_cmt_op;
+struct xen_sysctl_psr_cat_oppsr_cat_op;
 uint8_t pad[128];
 } u;
 };
-- 
1.9.1


[Xen-devel] [PATCH v5 13/13] docs: add xl-psr.markdown

2015-04-17 Thread Chao Peng

Add a document introducing the basic concepts and terms of the PSR family
of technologies and the xl interfaces.

Signed-off-by: Chao Peng 
---
Changes in v5:
* Address comments from Andrew/Ian.
---
 docs/man/xl.pod.1 |  10 +++-
 docs/misc/xl-psr.markdown | 134 ++
 2 files changed, 143 insertions(+), 1 deletion(-)
 create mode 100644 docs/misc/xl-psr.markdown

diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index efc6599..b024d45 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -1493,6 +1493,9 @@ occupancy monitoring share the same set of underlying monitoring service. Once
 a domain is attached to the monitoring service, monitoring data can be showed
 for any of these monitoring types.
 
+See L for more
+information.
+
 =over 4
 
 =item B
@@ -1526,6 +1529,9 @@ applications. In Xen implementation, CAT is used to control cache allocation
 on VM basis. To enforce cache on a specific domain, just set capacity bitmasks
 (CBM) for the domain.
 
+See L for more
+information.
+
 =over 4
 
 =item B
@@ -1534,7 +1540,8 @@ Show CAT hardware information.
 
 =item B [I] I I
 
-Set cache capacity bitmasks(CBM) for a domain.
+Set cache capacity bitmasks(CBM) for a domain. For how to specify I
+please refer to the link above.
 
 B
 
@@ -1575,6 +1582,7 @@ And the following documents on the xen.org website:
 L
 L
 L
+L
 
 For systems that don't automatically bring CPU online:
 
diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown
new file mode 100644
index 000..d167b84
--- /dev/null
+++ b/docs/misc/xl-psr.markdown
@@ -0,0 +1,134 @@
+# Intel Platform Shared Resource Monitoring/Control in xl
+
+This document introduces Intel Platform Shared Resource Monitoring/Control
+technologies, their basic concepts and the xl interfaces.
+
+## Cache Monitoring Technology (CMT)
+
+Cache Monitoring Technology (CMT) is a new feature available on Intel Haswell
+and later server platforms that allows an OS or Hypervisor/VMM to determine
+the usage of cache(currently only L3 cache supported) by applications running
+on the platform. A Resource Monitoring ID (RMID) is the abstraction of the
+application(s) that will be monitored for its cache usage. The CMT hardware
+tracks cache utilization of memory accesses according to the RMID and reports
+monitored data via a counter register.
+
+For more detailed information please refer to Intel SDM chapter
+"17.14 - Platform Shared Resource Monitoring: Cache Monitoring Technology".
+
+In Xen's implementation, each domain in the system can be assigned a RMID
+independently, while RMID=0 is reserved for monitoring domains that don't
+have CMT service attached. RMID is opaque for xl/libxl and is only used in
+hypervisor.
+
+### xl interfaces
+
+A domain is assigned a RMID implicitly by attaching it to CMT service:
+
+`xl psr-cmt-attach `
+
+After that, cache usage for the domain can be shown by:
+
+`xl psr-cmt-show cache-occupancy `
+
+Once monitoring is not needed any more, the domain can be detached from the
+CMT service by:
+
+`xl psr-cmt-detach `
+
+An attach may fail because of no free RMID available. In such case unused
+RMID(s) can be freed by detaching corresponding domains from CMT service.
+
+Maximum RMID and supported monitor types in the system can be obtained by:
+
+`xl psr-cmt-hwinfo`
+
+## Memory Bandwidth Monitoring (MBM)
+
+Memory Bandwidth Monitoring(MBM) is a new hardware feature available on Intel
+Broadwell and later server platforms which builds on the CMT infrastructure to
+allow monitoring of system memory bandwidth. It introduces two new monitoring
+event type to monitor system total/local memory bandwidth. The same RMID can
+be used to monitor both cache usage and memory bandwidth at the same time.
+
+For more detailed information please refer to Intel SDM chapter
+"17.14 - Platform Shared Resource Monitoring: Cache Monitoring Technology".
+
+In Xen's implementation, MBM shares the same set of underlying monitoring
+service with CMT and can be used to monitor memory bandwidth on a per domain
+basis.
+
+The xl interfaces are the same with that of CMT. The difference is the
+monitor type is corresponding memory monitoring type (local-mem-bandwidth/
+total-mem-bandwidth instead of cache-occupancy). E.g. after a `xl psr-cmt-attach`:
+
+`xl psr-cmt-show local-mem-bandwidth `
+
+`xl psr-cmt-show total-mem-bandwidth `
+
+## Cache Allocation Technology (CAT)
+
+Cache Allocation Technology (CAT) is a new feature available on Intel
+Broadwell and later server platforms that allows an OS or Hypervisor/VMM to
+partition cache allocation (i.e. L3 cache) based on a

[Xen-devel] [PATCH v5 07/13] x86: dynamically get/set CBM for a domain

2015-04-17 Thread Chao Peng

For CAT, COS is maintained in the hypervisor only, while CBM is exposed to
user space directly to allow getting/setting a domain's cache capacity.
For each specified CBM, the hypervisor will either use an existing COS which
has the same CBM or allocate a new one if the same CBM is not found. If
the allocation fails because not enough COS are available, an error is
returned. Getting/setting always operates on a specified socket.
For multi-socket systems, the interface may be called several times.
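
The reuse-or-allocate policy described above can be sketched outside the hypervisor as follows. This is a simplified illustration under stated assumptions, not the code in this patch: struct cos_entry and cos_find_or_alloc are hypothetical names, locking and the MSR write are omitted, and releasing the domain's previous COS is left out.

```c
#include <assert.h>
#include <stdint.h>

struct cos_entry {
    uint64_t cbm;      /* capacity bitmask programmed for this COS */
    unsigned int ref;  /* number of domains currently using this COS */
};

/*
 * Return the COS to use for 'cbm', or -1 if none is available.
 * COS 0 holds the default mask and is never handed out as a new slot.
 */
static int cos_find_or_alloc(struct cos_entry *map, unsigned int cos_max,
                             uint64_t cbm)
{
    int free_cos = -1;
    unsigned int cos;

    for ( cos = 0; cos <= cos_max; cos++ )
    {
        /* Reuse an in-use COS (or the default COS 0) with the same CBM. */
        if ( (cos == 0 || map[cos].ref) && map[cos].cbm == cbm )
        {
            map[cos].ref++;
            return cos;
        }
        /* Remember the first unused slot in case no match is found. */
        if ( free_cos < 0 && cos != 0 && map[cos].ref == 0 )
            free_cos = cos;
    }

    if ( free_cos < 0 )
        return -1;              /* every COS is in use with another CBM */

    map[free_cos].cbm = cbm;
    map[free_cos].ref = 1;
    return free_cos;
}
```

Two domains asking for the same CBM end up sharing one COS, so the number of distinct masks in use, rather than the number of domains, is what is limited by cos_max.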

Signed-off-by: Chao Peng 
---
Changes in v5:
* Add spin_lock to protect cbm_map.
---
 xen/arch/x86/domctl.c   |  20 ++
 xen/arch/x86/psr.c  | 139 +++-
 xen/include/asm-x86/msr-index.h |   1 +
 xen/include/asm-x86/psr.h   |   2 +
 xen/include/public/domctl.h |  12 
 5 files changed, 172 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 9450795..7ffa650 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1334,6 +1334,26 @@ long arch_do_domctl(
 }
 break;
 
+case XEN_DOMCTL_psr_cat_op:
+switch ( domctl->u.psr_cat_op.cmd )
+{
+case XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM:
+ret = psr_set_l3_cbm(d, domctl->u.psr_cat_op.target,
+ domctl->u.psr_cat_op.data);
+break;
+
+case XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM:
+ret = psr_get_l3_cbm(d, domctl->u.psr_cat_op.target,
+ &domctl->u.psr_cat_op.data);
+copyback = 1;
+break;
+
+default:
+ret = -EOPNOTSUPP;
+break;
+}
+break;
+
 default:
 ret = iommu_do_domctl(domctl, d, u_domctl);
 break;
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index d784efb..2b08269 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -32,6 +32,7 @@ struct psr_cat_socket_info {
 unsigned int cbm_len;
 unsigned int cos_max;
 struct psr_cat_cbm *cos_to_cbm;
+spinlock_t cbm_lock;
 };
 
 struct psr_assoc {
@@ -47,6 +48,14 @@ static unsigned int opt_cos_max = 255;
 static uint64_t rmid_mask;
 static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
 
+static unsigned int get_socket_cpu(unsigned int socket)
+{
+if ( socket < nr_sockets )
+return cpumask_any(&socket_to_cpumask[socket]);
+
+return nr_cpu_ids;
+}
+
 static void __init parse_psr_bool(char *s, char *value, char *feature,
   unsigned int mask)
 {
@@ -246,24 +255,148 @@ int psr_get_cat_l3_info(unsigned int socket, uint32_t *cbm_len,
 return 0;
 }
 
+int psr_get_l3_cbm(struct domain *d, unsigned int socket, uint64_t *cbm)
+{
+unsigned int cos;
+struct psr_cat_socket_info *info;
+int ret = get_cat_socket_info(socket, &info);
+
+if ( ret )
+return ret;
+
+cos = d->arch.psr_cos_ids[socket];
+*cbm = info->cos_to_cbm[cos].cbm;
+return 0;
+}
+
+static bool_t psr_check_cbm(unsigned int cbm_len, uint64_t cbm)
+{
+unsigned int first_bit, zero_bit;
+
+/* Set bits should only in the range of [0, cbm_len). */
+if ( cbm & (~0ull << cbm_len) )
+return 0;
+
+first_bit = find_first_bit(&cbm, cbm_len);
+zero_bit = find_next_zero_bit(&cbm, cbm_len, first_bit);
+
+/* Set bits should be contiguous. */
+if ( zero_bit < cbm_len &&
+ find_next_bit(&cbm, cbm_len, zero_bit) < cbm_len )
+return 0;
+
+return 1;
+}
+
+struct cos_cbm_info
+{
+unsigned int cos;
+uint64_t cbm;
+};
+
+static void do_write_l3_cbm(void *data)
+{
+struct cos_cbm_info *info = data;
+
+wrmsrl(MSR_IA32_PSR_L3_MASK(info->cos), info->cbm);
+}
+
+static int write_l3_cbm(unsigned int socket, unsigned int cos, uint64_t cbm)
+{
+struct cos_cbm_info info = { .cos = cos, .cbm = cbm };
+
+if ( socket == cpu_to_socket(smp_processor_id()) )
+do_write_l3_cbm(&info);
+else
+{
+unsigned int cpu = get_socket_cpu(socket);
+
+if ( cpu >= nr_cpu_ids )
+return -EBADSLT;
+on_selected_cpus(cpumask_of(cpu), do_write_l3_cbm, &info, 1);
+}
+
+return 0;
+}
+
+int psr_set_l3_cbm(struct domain *d, unsigned int socket, uint64_t cbm)
+{
+unsigned int old_cos, cos;
+struct psr_cat_cbm *map, *find;
+struct psr_cat_socket_info *info;
+int ret = get_cat_socket_info(socket, &info);
+
+if ( ret )
+return ret;
+
+if ( !psr_check_cbm(info->cbm_len, cbm) )
+return -EINVAL;
+
+old_cos = d->arch.psr_cos_ids[socket];
+map = info->cos_to_cbm;
+find = NULL;
+
+for ( cos = 0; cos <= info->cos_max; cos++ )
+{
+/* If still not found, then keep unused one. */
+if ( !find && cos != 0 && map[cos].ref == 0 )
+find = map + cos;
+else if ( map[cos].cbm == cbm )
+{
+if ( unlikely(cos == old_cos) )
+return 0;
+fin

[Xen-devel] [PATCH v5 05/13] x86: add COS information for each domain

2015-04-17 Thread Chao Peng

In Xen's implementation, the CAT enforcement granularity is per domain.
Because the length of CBM and the number of COS may differ between sockets,
each domain has a COS ID for each socket. The domain gets COS=0 by default,
and at runtime its COS is allocated dynamically when the user specifies
a CBM for the domain.

Signed-off-by: Chao Peng 
Reviewed-by: Andrew Cooper 
---
 xen/arch/x86/domain.c|  6 +-
 xen/arch/x86/psr.c   | 42 ++
 xen/include/asm-x86/domain.h |  5 -
 xen/include/asm-x86/psr.h|  3 +++
 4 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index c26c732..a0b5e25 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -617,6 +617,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 /* 64-bit PV guest by default. */
 d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 0;
 
+if ( (rc = psr_domain_init(d)) != 0 )
+goto fail;
+
 /* initialize default tsc behavior in case tools don't */
 tsc_set_info(d, TSC_MODE_DEFAULT, 0UL, 0, 0);
 spin_lock_init(&d->arch.vtsc_lock);
@@ -635,6 +638,7 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 free_perdomain_mappings(d);
 if ( is_pv_domain(d) )
 free_xenheap_page(d->arch.pv_domain.gdt_ldt_l1tab);
+psr_domain_free(d);
 return rc;
 }
 
@@ -658,7 +662,7 @@ void arch_domain_destroy(struct domain *d)
 free_xenheap_page(d->shared_info);
 cleanup_domain_irq_mapping(d);
 
-psr_free_rmid(d);
+psr_domain_free(d);
 }
 
 void arch_domain_shutdown(struct domain *d)
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 11e44c4..592d610 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -215,6 +215,48 @@ void psr_ctxt_switch_to(struct domain *d)
 }
 }
 
+/* Called with domain lock held, no psr specific lock needed */
+static void psr_free_cos(struct domain *d)
+{
+unsigned int socket;
+unsigned int cos;
+
+if( !d->arch.psr_cos_ids )
+return;
+
+for ( socket = 0; socket < nr_sockets; socket++ )
+{
+if ( !cat_socket_info[socket].enabled )
+continue;
+
+if ( (cos = d->arch.psr_cos_ids[socket]) == 0 )
+continue;
+
+cat_socket_info[socket].cos_to_cbm[cos].ref--;
+}
+
+xfree(d->arch.psr_cos_ids);
+d->arch.psr_cos_ids = NULL;
+}
+
+int psr_domain_init(struct domain *d)
+{
+if ( cat_socket_info )
+{
+d->arch.psr_cos_ids = xzalloc_array(unsigned int, nr_sockets);
+if ( !d->arch.psr_cos_ids )
+return -ENOMEM;
+}
+
+return 0;
+}
+
+void psr_domain_free(struct domain *d)
+{
+psr_free_rmid(d);
+psr_free_cos(d);
+}
+
 static void cat_cpu_init(void)
 {
 unsigned int eax, ebx, ecx, edx;
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index e5102cc..324011d 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -333,7 +333,10 @@ struct arch_domain
 struct e820entry *e820;
 unsigned int nr_e820;
 
-unsigned int psr_rmid; /* RMID assigned to the domain for CMT */
+/* RMID assigned to the domain for CMT */
+unsigned int psr_rmid;
+/* COS assigned to the domain for each socket */
+unsigned int *psr_cos_ids;
 
 /* Shared page for notifying that explicit PIRQ EOI is required. */
 unsigned long *pirq_eoi_map;
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index 3bc5496..45392bf 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -52,6 +52,9 @@ void psr_free_rmid(struct domain *d);
 
 void psr_ctxt_switch_to(struct domain *d);
 
+int psr_domain_init(struct domain *d);
+void psr_domain_free(struct domain *d);
+
 #endif /* __ASM_PSR_H__ */
 
 /*
-- 
1.9.1


[Xen-devel] [PATCH v5 08/13] x86: add scheduling support for Intel CAT

2015-04-17 Thread Chao Peng

On context switch, write the domain's Class of Service (COS) to MSR
IA32_PQR_ASSOC, to notify hardware to use the new COS.

For performance reasons, the COS mask for the current cpu is also cached in
the local per-CPU variable.
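
The register update can be illustrated in isolation. The sketch below mirrors the masking done by psr_assoc_cos() in this patch, where the COS field starts at bit 32, but it is a standalone example: COS_SHIFT, cos_field_mask and assoc_set_cos are names chosen here, and the field width is passed in directly instead of being derived from cos_max.

```c
#include <assert.h>
#include <stdint.h>

#define COS_SHIFT 32  /* the patch places the COS field at bit 32 and up */

/* Mask covering a COS field that is 'width' bits wide. */
static uint64_t cos_field_mask(unsigned int width)
{
    assert(width >= 1 && width <= 32);
    return ((1ull << width) - 1) << COS_SHIFT;
}

/* Replace only the COS field of 'reg'; other bits (e.g. the RMID) are kept. */
static uint64_t assoc_set_cos(uint64_t reg, unsigned int cos,
                              uint64_t cos_mask)
{
    return (reg & ~cos_mask) | (((uint64_t)cos << COS_SHIFT) & cos_mask);
}
```

Because the new value is compared with the cached one before the MSR write, a context switch between two domains that share both RMID and COS does not need to touch the register.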

Signed-off-by: Chao Peng 
---
Changes in v5:
* Remove the need to cache socket.
Changes in v2:
* merge common scheduling changes into scheduling improvement patch.
* use readable expr for psra->cos_mask.
---
 xen/arch/x86/psr.c | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 2b08269..ac1c2b7 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -37,6 +37,7 @@ struct psr_cat_socket_info {
 
 struct psr_assoc {
 uint64_t val;
+uint64_t cos_mask;
 };
 
 struct psr_cmt *__read_mostly psr_cmt;
@@ -200,7 +201,17 @@ static inline void psr_assoc_init(void)
 {
 struct psr_assoc *psra = &this_cpu(psr_assoc);
 
-if ( psr_cmt_enabled() )
+if ( cat_socket_info )
+{
+struct psr_cat_socket_info *info = cat_socket_info +
+   cpu_to_socket(smp_processor_id());
+
+if ( info->enabled )
+psra->cos_mask = ((1ull << get_count_order(info->cos_max)) - 1)
+ << 32;
+}
+
+if ( psr_cmt_enabled() || psra->cos_mask )
 rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
 }
 
@@ -209,6 +220,12 @@ static inline void psr_assoc_rmid(uint64_t *reg, unsigned int rmid)
 *reg = (*reg & ~rmid_mask) | (rmid & rmid_mask);
 }
 
+static inline void psr_assoc_cos(uint64_t *reg, unsigned int cos,
+ uint64_t cos_mask)
+{
+*reg = (*reg & ~cos_mask) | (((uint64_t)cos << 32) & cos_mask);
+}
+
 void psr_ctxt_switch_to(struct domain *d)
 {
 struct psr_assoc *psra = &this_cpu(psr_assoc);
@@ -217,6 +234,11 @@ void psr_ctxt_switch_to(struct domain *d)
 if ( psr_cmt_enabled() )
 psr_assoc_rmid(&reg, d->arch.psr_rmid);
 
+if ( psra->cos_mask )
+psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
+  d->arch.psr_cos_ids[cpu_to_socket(smp_processor_id())] :
+  0, psra->cos_mask);
+
 if ( reg != psra->val )
 {
 wrmsrl(MSR_IA32_PSR_ASSOC, reg);
-- 
1.9.1


[Xen-devel] [PATCH v5 11/13] tools/libxl: add command to show CMT hardware info

2015-04-17 Thread Chao Peng

Add a dedicated command to show hardware information.

[root@vmm-psr]xl psr-cmt-hwinfo
Cache Monitoring Technology (CMT):
Enabled : 1
Total RMID  : 63
Supported monitor types:
cache-occupancy
total-mem-bandwidth
local-mem-bandwidth

Signed-off-by: Chao Peng 
---
 docs/man/xl.pod.1 |  4 
 tools/libxl/xl.h  |  1 +
 tools/libxl/xl_cmdimpl.c  | 35 +++
 tools/libxl/xl_cmdtable.c |  5 +
 4 files changed, 45 insertions(+)

diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 16783c8..640788f 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -1495,6 +1495,10 @@ for any of these monitoring types.
 
 =over 4
 
+=item B
+
+Show CMT hardware information.
+
 =item B [I]
 
 attach: Attach the platform shared resource monitoring service to a domain.
diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
index 5bc138c..1c526dc 100644
--- a/tools/libxl/xl.h
+++ b/tools/libxl/xl.h
@@ -113,6 +113,7 @@ int main_remus(int argc, char **argv);
 #endif
 int main_devd(int argc, char **argv);
 #ifdef LIBXL_HAVE_PSR_CMT
+int main_psr_cmt_hwinfo(int argc, char **argv);
 int main_psr_cmt_attach(int argc, char **argv);
 int main_psr_cmt_detach(int argc, char **argv);
 int main_psr_cmt_show(int argc, char **argv);
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index c666d84..4ab5e8b 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -8014,6 +8014,36 @@ out:
 }
 
 #ifdef LIBXL_HAVE_PSR_CMT
+static int psr_cmt_hwinfo(void)
+{
+int rc;
+int enabled;
+uint32_t total_rmid;
+
+printf("Cache Monitoring Technology (CMT):\n");
+
+enabled = libxl_psr_cmt_enabled(ctx);
+printf("%-16s: %s\n", "Enabled", enabled ? "1" : "0");
+if (!enabled)
+return 0;
+
+rc = libxl_psr_cmt_get_total_rmid(ctx, &total_rmid);
+if (rc) {
+fprintf(stderr, "Failed to get max RMID value\n");
+return rc;
+}
+printf("%-16s: %u\n", "Total RMID", total_rmid);
+
+printf("Supported monitor types:\n");
+if (libxl_psr_cmt_type_supported(ctx, LIBXL_PSR_CMT_TYPE_CACHE_OCCUPANCY))
+printf("cache-occupancy\n");
+if (libxl_psr_cmt_type_supported(ctx, LIBXL_PSR_CMT_TYPE_TOTAL_MEM_COUNT))
+printf("total-mem-bandwidth\n");
+if (libxl_psr_cmt_type_supported(ctx, LIBXL_PSR_CMT_TYPE_LOCAL_MEM_COUNT))
+printf("local-mem-bandwidth\n");
+
+return rc;
+}
 
 #define MBM_SAMPLE_RETRY_MAX 4
 static int psr_cmt_get_mem_bandwidth(uint32_t domid,
@@ -8180,6 +8210,11 @@ static int psr_cmt_show(libxl_psr_cmt_type type, uint32_t domid)
 return 0;
 }
 
+int main_psr_cmt_hwinfo(int argc, char **argv)
+{
+return psr_cmt_hwinfo();
+}
+
 int main_psr_cmt_attach(int argc, char **argv)
 {
 uint32_t domid;
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index 9284887..dc25d1f 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -524,6 +524,11 @@ struct cmd_spec cmd_table[] = {
   "-F  Run in the foreground",
 },
 #ifdef LIBXL_HAVE_PSR_CMT
+{ "psr-cmt-hwinfo",
+  &main_psr_cmt_hwinfo, 0, 1,
+  "Show hardware information for Cache Monitoring Technology",
+  "",
+},
 { "psr-cmt-attach",
   &main_psr_cmt_attach, 0, 1,
   "Attach Cache Monitoring Technology service to a domain",
-- 
1.9.1


[Xen-devel] [PATCH v5 10/13] tools/libxl: minor name changes for CMT commands

2015-04-17 Thread Chao Peng

Use "-" instead of  "_" for monitor types.

Signed-off-by: Chao Peng 
---
 tools/libxl/xl_cmdimpl.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 394b55d..c666d84 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -8220,11 +8220,11 @@ int main_psr_cmt_show(int argc, char **argv)
 /* No options */
 }
 
-if (!strcmp(argv[optind], "cache_occupancy"))
+if (!strcmp(argv[optind], "cache-occupancy"))
 type = LIBXL_PSR_CMT_TYPE_CACHE_OCCUPANCY;
-else if (!strcmp(argv[optind], "total_mem_bandwidth"))
+else if (!strcmp(argv[optind], "total-mem-bandwidth"))
 type = LIBXL_PSR_CMT_TYPE_TOTAL_MEM_COUNT;
-else if (!strcmp(argv[optind], "local_mem_bandwidth"))
+else if (!strcmp(argv[optind], "local-mem-bandwidth"))
 type = LIBXL_PSR_CMT_TYPE_LOCAL_MEM_COUNT;
 else {
 help("psr-cmt-show");
-- 
1.9.1
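
The rename above can also be read as a name-to-type table. The following is only a minimal sketch of that idea, not the xl code: the enum and the function cmt_type_from_name() are hypothetical stand-ins for the LIBXL_PSR_CMT_TYPE_* values and the strcmp() chain in main_psr_cmt_show().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the LIBXL_PSR_CMT_TYPE_* values. */
enum cmt_type {
    CMT_TYPE_INVALID = 0,
    CMT_TYPE_CACHE_OCCUPANCY,
    CMT_TYPE_TOTAL_MEM_COUNT,
    CMT_TYPE_LOCAL_MEM_COUNT,
};

/*
 * Map a hyphenated monitor type name to its type. Anything else,
 * including the old underscore spellings, yields CMT_TYPE_INVALID.
 */
static enum cmt_type cmt_type_from_name(const char *name)
{
    static const struct {
        const char *name;
        enum cmt_type type;
    } names[] = {
        { "cache-occupancy",     CMT_TYPE_CACHE_OCCUPANCY },
        { "total-mem-bandwidth", CMT_TYPE_TOTAL_MEM_COUNT },
        { "local-mem-bandwidth", CMT_TYPE_LOCAL_MEM_COUNT },
    };
    size_t i;

    assert(name != NULL);

    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
        if (!strcmp(name, names[i].name))
            return names[i].type;

    return CMT_TYPE_INVALID;
}
```

With such a table the accepted names live in one place, which would also let a hwinfo-style command print the same strings it parses.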



[Xen-devel] [PATCH v5 00/13] enable Cache Allocation Technology (CAT) for VMs

2015-04-17 Thread Chao Peng

Changes in v5:
* Address comments from Andrew and Ian (details in patches).
* Add socket_to_cpumask.
* Add xl psr-cmt/cat-hwinfo.
* Add some libxl CMT enhancements.
Changes in v4:
* Address comments from Andrew and Ian (details in patches).
* Split COS/CBM management patch into 4 small patches.
* Add documentation xl-psr.markdown.
Changes in v3:
* Address comments from Jan and Ian (details in patches).
* Add xl sample output in cover letter.
Changes in v2:
* Address comments from Konrad and Jan (details in patches).
* Move all CAT-unrelated changes into the preparation patches.

This patch series enables the new Cache Allocation Technology (CAT) feature
found in Intel Broadwell and later server platforms. In Xen's implementation,
CAT is used to control cache allocation on a per-VM basis.

The detailed hardware spec can be found in section 17.15 of the Intel SDM [1].
The design for Xen can be found at [2].

patch1-2:   preparation.
patch3-9:   real work for CAT.
patch10-11: enhancement for CMT.
patch12:    xl document for CMT/MBM/CAT.

[1] Intel SDM (http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf)
[2] CAT design for Xen (http://lists.xen.org/archives/html/xen-devel/2014-12/msg01382.html)

Chao Peng (13):
  x86: add socket_to_cpumask
  x86: improve psr scheduling code
  x86: detect and initialize Intel CAT feature
  x86: maintain COS to CBM mapping for each socket
  x86: add COS information for each domain
  x86: expose CBM length and COS number information
  x86: dynamically get/set CBM for a domain
  x86: add scheduling support for Intel CAT
  xsm: add CAT related xsm policies
  tools/libxl: minor name changes for CMT commands
  tools/libxl: add command to show CMT hardware info
  tools: add tools support for Intel CAT
  docs: add xl-psr.markdown

 docs/man/xl.pod.1|  47 
 docs/misc/xen-command-line.markdown  |  15 +-
 docs/misc/xl-psr.markdown| 134 ++
 tools/flask/policy/policy/modules/xen/xen.if |   2 +-
 tools/flask/policy/policy/modules/xen/xen.te |   4 +-
 tools/libxc/include/xenctrl.h|  15 ++
 tools/libxc/xc_psr.c |  76 ++
 tools/libxl/libxl.h  |  26 ++
 tools/libxl/libxl_psr.c  | 170 +++-
 tools/libxl/libxl_types.idl  |  10 +
 tools/libxl/xl.h |   6 +
 tools/libxl/xl_cmdimpl.c | 210 ++-
 tools/libxl/xl_cmdtable.c|  24 ++
 xen/arch/x86/domain.c|  13 +-
 xen/arch/x86/domctl.c|  20 ++
 xen/arch/x86/psr.c   | 370 +--
 xen/arch/x86/smpboot.c   |  15 ++
 xen/arch/x86/sysctl.c|  18 ++
 xen/include/asm-x86/cpufeature.h |   1 +
 xen/include/asm-x86/domain.h |   5 +-
 xen/include/asm-x86/msr-index.h  |   1 +
 xen/include/asm-x86/psr.h|  14 +-
 xen/include/asm-x86/smp.h|   8 +
 xen/include/public/domctl.h  |  12 +
 xen/include/public/sysctl.h  |  16 ++
 xen/xsm/flask/hooks.c|   6 +
 xen/xsm/flask/policy/access_vectors  |   4 +
 27 files changed, 1201 insertions(+), 41 deletions(-)
 create mode 100644 docs/misc/xl-psr.markdown

-- 
1.9.1



[Xen-devel] [PATCH v5 09/13] xsm: add CAT related xsm policies

2015-04-17 Thread Chao Peng

Add xsm policies for the Cache Allocation Technology (CAT) related hypercalls
to restrict their visibility to the control domain only.

Signed-off-by: Chao Peng 
Acked-by:  Daniel De Graaf 
---
 tools/flask/policy/policy/modules/xen/xen.if | 2 +-
 tools/flask/policy/policy/modules/xen/xen.te | 4 +++-
 xen/xsm/flask/hooks.c| 6 ++
 xen/xsm/flask/policy/access_vectors  | 4 
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index 620d151..aa5eb72 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -51,7 +51,7 @@ define(`create_domain_common', `
getaffinity setaffinity setvcpuextstate };
allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim
set_max_evtchn set_vnumainfo get_vnumainfo cacheflush
-   psr_cmt_op };
+   psr_cmt_op psr_cat_op };
allow $1 $2:security check_context;
allow $1 $2:shadow enable;
allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te
index e555d11..6dcf953 100644
--- a/tools/flask/policy/policy/modules/xen/xen.te
+++ b/tools/flask/policy/policy/modules/xen/xen.te
@@ -67,6 +67,7 @@ allow dom0_t xen_t:xen {
 allow dom0_t xen_t:xen2 {
 resource_op
 psr_cmt_op
+psr_cat_op
 };
 allow dom0_t xen_t:mmu memorymap;
 
@@ -80,7 +81,8 @@ allow dom0_t dom0_t:domain {
getpodtarget setpodtarget set_misc_info set_virq_handler
 };
 allow dom0_t dom0_t:domain2 {
-   set_cpuid gettsc settsc setscheduler set_max_evtchn set_vnumainfo get_vnumainfo psr_cmt_op
+   set_cpuid gettsc settsc setscheduler set_max_evtchn set_vnumainfo
+   get_vnumainfo psr_cmt_op psr_cat_op
 };
 allow dom0_t dom0_t:resource { add remove };
 
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index ab5141d..72fe9b3 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -729,6 +729,9 @@ static int flask_domctl(struct domain *d, int cmd)
 case XEN_DOMCTL_psr_cmt_op:
 return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__PSR_CMT_OP);
 
+case XEN_DOMCTL_psr_cat_op:
+return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__PSR_CAT_OP);
+
 default:
 printk("flask_domctl: Unknown op %d\n", cmd);
 return -EPERM;
@@ -787,6 +790,9 @@ static int flask_sysctl(int cmd)
 case XEN_SYSCTL_psr_cmt_op:
 return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
 XEN2__PSR_CMT_OP, NULL);
+case XEN_SYSCTL_psr_cat_op:
+return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
+XEN2__PSR_CAT_OP, NULL);
 
 default:
 printk("flask_sysctl: Unknown op %d\n", cmd);
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index 128250e..bdf496e 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -84,6 +84,8 @@ class xen2
 resource_op
 # XEN_SYSCTL_psr_cmt_op
 psr_cmt_op
+# XEN_SYSCTL_psr_cat_op
+psr_cat_op
 }
 
 # Classes domain and domain2 consist of operations that a domain performs on
@@ -219,6 +221,8 @@ class domain2
 get_vnumainfo
 # XEN_DOMCTL_psr_cmt_op
 psr_cmt_op
+# XEN_DOMCTL_psr_cat_op
+psr_cat_op
 }
 
 # Similar to class domain, but primarily contains domctls related to HVM domains
-- 
1.9.1



[Xen-devel] [PATCH v5 01/13] x86: add socket_to_cpumask

2015-04-17 Thread Chao Peng

Maintain socket_to_cpumask which contains all the HT and core siblings
in the same socket.

Signed-off-by: Chao Peng 
---
 xen/arch/x86/smpboot.c| 15 +++
 xen/include/asm-x86/smp.h |  8 
 2 files changed, 23 insertions(+)

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 116c8f8..d236f18 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -59,6 +59,9 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
 cpumask_t cpu_online_map __read_mostly;
 EXPORT_SYMBOL(cpu_online_map);
 
+unsigned int nr_sockets __read_mostly;
+cpumask_t *socket_to_cpumask __read_mostly;
+
 struct cpuinfo_x86 cpu_data[NR_CPUS];
 
 u32 x86_cpu_to_apicid[NR_CPUS] __read_mostly =
@@ -301,6 +304,8 @@ static void set_cpu_sibling_map(int cpu)
 }
 }
 }
+
+cpumask_set_cpu(cpu, &socket_to_cpumask[cpu_to_socket(cpu)]);
 }
 
 void start_secondary(void *unused)
@@ -717,6 +722,14 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 
 stack_base[0] = stack_start;
 
+nr_sockets = DIV_ROUND_UP(nr_cpu_ids, boot_cpu_data.x86_max_cores *
+  boot_cpu_data.x86_num_siblings);
+ASSERT(nr_sockets > 0);
+
+socket_to_cpumask = xzalloc_array(cpumask_t, nr_sockets);
+if ( !socket_to_cpumask )
+panic("No memory for socket CPU siblings map");
+
 if ( !zalloc_cpumask_var(&per_cpu(cpu_sibling_mask, 0)) ||
  !zalloc_cpumask_var(&per_cpu(cpu_core_mask, 0)) )
 panic("No memory for boot CPU sibling/core maps");
@@ -782,6 +795,8 @@ remove_siblinginfo(int cpu)
 int sibling;
 struct cpuinfo_x86 *c = cpu_data;
 
+cpumask_clear_cpu(cpu, &socket_to_cpumask[cpu_to_socket(cpu)]);
+
 for_each_cpu ( sibling, per_cpu(cpu_core_mask, cpu) )
 {
 cpumask_clear_cpu(cpu, per_cpu(cpu_core_mask, sibling));
diff --git a/xen/include/asm-x86/smp.h b/xen/include/asm-x86/smp.h
index 67518cf..3ffddde 100644
--- a/xen/include/asm-x86/smp.h
+++ b/xen/include/asm-x86/smp.h
@@ -58,6 +58,14 @@ int hard_smp_processor_id(void);
 
 void __stop_this_cpu(void);
 
+/*
+ * This value is considered to not change from the initial startup.
+ * Otherwise all the relevant places need to be retrofitted.
+ */
+extern unsigned int nr_sockets;
+
+/* Representing HT and core siblings in each socket */
+extern cpumask_t *socket_to_cpumask;
 #endif /* !__ASSEMBLY__ */
 
 #endif
-- 
1.9.1



[Xen-devel] [PATCH v5 02/13] x86: improve psr scheduling code

2015-04-17 Thread Chao Peng

Switching the RMID from the previous vcpu to the next vcpu only needs a single
write to MSR_IA32_PSR_ASSOC. Writing the value for the next vcpu is enough;
there is no need to write '0' first. The idle domain has its RMID set to 0 and,
because the MSR is already updated lazily, it is simply switched like any other
domain.

Also move the initialization of the per-CPU variable used for the lazy
update from context switch to CPU starting.

Signed-off-by: Chao Peng 
Reviewed-by: Andrew Cooper 
---
Changes in v5:
* use this_cpu() rather than per_cpu().
Changes in v4:
* Move psr_assoc_reg_read/psr_assoc_reg_write into psr_ctxt_switch_to.
* Use 0 instead of smp_processor_id() for boot cpu.
* add cpu parameter to psr_assoc_init.
Changes in v2:
* Move initialization for psr_assoc from context switch to CPU_STARTING.
---
 xen/arch/x86/domain.c |  7 ++---
 xen/arch/x86/psr.c| 71 +--
 xen/include/asm-x86/psr.h |  3 +-
 3 files changed, 54 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index fcea94b..c26c732 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1444,8 +1444,6 @@ static void __context_switch(void)
 {
 memcpy(&p->arch.user_regs, stack_regs, CTXT_SWITCH_STACK_BYTES);
 vcpu_save_fpu(p);
-if ( psr_cmt_enabled() )
-psr_assoc_rmid(0);
 p->arch.ctxt_switch_from(p);
 }
 
@@ -1470,11 +1468,10 @@ static void __context_switch(void)
 }
 vcpu_restore_fpu_eager(n);
 n->arch.ctxt_switch_to(n);
-
-if ( psr_cmt_enabled() && n->domain->arch.psr_rmid > 0 )
-psr_assoc_rmid(n->domain->arch.psr_rmid);
 }
 
+psr_ctxt_switch_to(n->domain);
+
 gdt = !is_pv_32on64_vcpu(n) ? per_cpu(gdt_table, cpu) :
   per_cpu(compat_gdt_table, cpu);
 if ( need_full_gdt(n) )
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 344de3c..2490d22 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -22,7 +22,6 @@
 
 struct psr_assoc {
 uint64_t val;
-bool_t initialized;
 };
 
 struct psr_cmt *__read_mostly psr_cmt;
@@ -122,14 +121,6 @@ static void __init init_psr_cmt(unsigned int rmid_max)
 printk(XENLOG_INFO "Cache Monitoring Technology enabled\n");
 }
 
-static int __init init_psr(void)
-{
-if ( (opt_psr & PSR_CMT) && opt_rmid_max )
-init_psr_cmt(opt_rmid_max);
-return 0;
-}
-__initcall(init_psr);
-
 /* Called with domain lock held, no psr specific lock needed */
 int psr_alloc_rmid(struct domain *d)
 {
@@ -175,27 +166,65 @@ void psr_free_rmid(struct domain *d)
 d->arch.psr_rmid = 0;
 }
 
-void psr_assoc_rmid(unsigned int rmid)
+static inline void psr_assoc_init(void)
 {
-uint64_t val;
-uint64_t new_val;
 struct psr_assoc *psra = &this_cpu(psr_assoc);
 
-if ( !psra->initialized )
-{
+if ( psr_cmt_enabled() )
 rdmsrl(MSR_IA32_PSR_ASSOC, psra->val);
-psra->initialized = 1;
-}
-val = psra->val;
+}
+
+static inline void psr_assoc_rmid(uint64_t *reg, unsigned int rmid)
+{
+*reg = (*reg & ~rmid_mask) | (rmid & rmid_mask);
+}
+
+void psr_ctxt_switch_to(struct domain *d)
+{
+struct psr_assoc *psra = &this_cpu(psr_assoc);
+uint64_t reg = psra->val;
+
+if ( psr_cmt_enabled() )
+psr_assoc_rmid(&reg, d->arch.psr_rmid);
 
-new_val = (val & ~rmid_mask) | (rmid & rmid_mask);
-if ( val != new_val )
+if ( reg != psra->val )
 {
-wrmsrl(MSR_IA32_PSR_ASSOC, new_val);
-psra->val = new_val;
+wrmsrl(MSR_IA32_PSR_ASSOC, reg);
+psra->val = reg;
 }
 }
 
+static void psr_cpu_init(void)
+{
+psr_assoc_init();
+}
+
+static int cpu_callback(
+struct notifier_block *nfb, unsigned long action, void *hcpu)
+{
+if ( action == CPU_STARTING )
+psr_cpu_init();
+
+return NOTIFY_DONE;
+}
+
+static struct notifier_block cpu_nfb = {
+.notifier_call = cpu_callback
+};
+
+static int __init psr_presmp_init(void)
+{
+if ( (opt_psr & PSR_CMT) && opt_rmid_max )
+init_psr_cmt(opt_rmid_max);
+
+psr_cpu_init();
+if ( psr_cmt_enabled() )
+register_cpu_notifier(&cpu_nfb);
+
+return 0;
+}
+presmp_initcall(psr_presmp_init);
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-x86/psr.h b/xen/include/asm-x86/psr.h
index c6076e9..585350c 100644
--- a/xen/include/asm-x86/psr.h
+++ b/xen/include/asm-x86/psr.h
@@ -46,7 +46,8 @@ static inline bool_t psr_cmt_enabled(void)
 
 int psr_alloc_rmid(struct domain *d);
 void psr_free_rmid(struct domain *d);
-void psr_assoc_rmid(unsigned int rmid);
+
+void psr_ctxt_switch_to(struct domain *d);
 
 #endif /* __ASM_PSR_H__ */
 
-- 
1.9.1



[Xen-devel] [PATCH v5 04/13] x86: maintain COS to CBM mapping for each socket

2015-04-17 Thread Chao Peng

For each socket, a COS to CBM mapping structure is maintained for each
COS. The mapping is indexed by COS and the value is the corresponding
CBM. Different VMs may use the same CBM; a reference count is used to
indicate whether the CBM is available.

Signed-off-by: Chao Peng 
Reviewed-by: Andrew Cooper 
---
Changes in v5:
* rename cos_cbm_map to cos_to_cbm.
---
 xen/arch/x86/psr.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 96456de..11e44c4 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -21,11 +21,17 @@
 #define PSR_CMT        (1<<0)
 #define PSR_CAT        (1<<1)
 
+struct psr_cat_cbm {
+unsigned int ref;
+uint64_t cbm;
+};
+
 struct psr_cat_socket_info {
 bool_t initialized;
 bool_t enabled;
 unsigned int cbm_len;
 unsigned int cos_max;
+struct psr_cat_cbm *cos_to_cbm;
 };
 
 struct psr_assoc {
@@ -236,6 +242,14 @@ static void cat_cpu_init(void)
 info->cbm_len = (eax & 0x1f) + 1;
 info->cos_max = min(opt_cos_max, edx & 0x);
 
+info->cos_to_cbm = xzalloc_array(struct psr_cat_cbm,
+  info->cos_max + 1UL);
+if ( !info->cos_to_cbm )
+return;
+
+/* cos=0 is reserved as default cbm(all ones). */
+info->cos_to_cbm[0].cbm = (1ull << info->cbm_len) - 1;
+
 info->enabled = 1;
 printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n",
socket, info->cos_max, info->cbm_len);
-- 
1.9.1



[Xen-devel] [PATCH v5 03/13] x86: detect and initialize Intel CAT feature

2015-04-17 Thread Chao Peng

Detect the Intel Cache Allocation Technology (CAT) feature and store the
cpuid information for later use. Currently only L3 cache allocation is
supported. The L3 CAT features may vary among sockets, so per-socket
feature information is stored. The initialization can happen either at
boot time or when CPUs are hot-plugged after booting.

Signed-off-by: Chao Peng 
Reviewed-by: Andrew Cooper 
---
Changes in v5:
* Add cos_max boot option.
Changes in v4:
* check X86_FEATURE_CAT available before doing initialization.
Changes in v3:
* Remove the num_sockets boot option; calculate it at boot time instead.
* Name hardcoded CAT cpuid leaf as PSR_CPUID_LEVEL_CAT.
Changes in v2:
* socket_num => num_sockets and fix several documentation issues.
* refactor boot line parameters parsing into standalone patch.
* set opt_num_sockets = NR_CPUS when opt_num_sockets > NR_CPUS.
* replace CPU_ONLINE with CPU_STARTING and integrate that into scheduling
  improvement patch.
* reimplement get_max_socket() with cpu_to_socket();
* cbm is still uint64 as there is a path forward for supporting long masks.
---
 docs/misc/xen-command-line.markdown | 15 +++--
 xen/arch/x86/psr.c  | 63 +++--
 xen/include/asm-x86/cpufeature.h|  1 +
 xen/include/asm-x86/psr.h   |  3 ++
 4 files changed, 78 insertions(+), 4 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 1dda1f0..a3deb36 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1122,9 +1122,9 @@ This option can be specified more than once (up to 8 times at present).
 > `= `
 
 ### psr (Intel)
-> `= List of ( cmt: | rmid_max: )`
+> `= List of ( cmt: | rmid_max: | cat: | cos_max: )`
 
-> Default: `psr=cmt:0,rmid_max:255`
+> Default: `psr=cmt:0,rmid_max:255,cat:0,cos_max:255`
 
 Platform Shared Resource(PSR) Services.  Intel Haswell and later server
 platforms offer information about the sharing of resources.
@@ -1134,6 +1134,12 @@ Monitoring ID(RMID) is used to bind the domain to corresponding shared
 resource.  RMID is a hardware-provided layer of abstraction between software
 and logical processors.
 
+To use the PSR cache allocation service for a certain domain, a capacity
+bitmask (CBM) is used to bind the domain to the corresponding shared resource.
+A CBM represents cache capacity and indicates the degree of overlap and
+isolation between domains. In the hypervisor, a Class of Service (COS) ID is
+allocated for each unique CBM.
+
 The following resources are available:
 
 * Cache Monitoring Technology (Haswell and later).  Information regarding the
@@ -1144,6 +1150,11 @@ The following resources are available:
   total/local memory bandwidth. Follow the same options with Cache Monitoring
   Technology.
 
+* Cache Allocation Technology (Broadwell and later).  Information regarding
+  the cache allocation.
+  * `cat` instructs Xen to enable/disable Cache Allocation Technology.
+  * `cos_max` indicates the max value for COS ID.
+
 ### reboot
 > `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`
 
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 2490d22..96456de 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -19,14 +19,25 @@
 #include 
 
 #define PSR_CMT        (1<<0)
+#define PSR_CAT        (1<<1)
+
+struct psr_cat_socket_info {
+bool_t initialized;
+bool_t enabled;
+unsigned int cbm_len;
+unsigned int cos_max;
+};
 
 struct psr_assoc {
 uint64_t val;
 };
 
 struct psr_cmt *__read_mostly psr_cmt;
+static struct psr_cat_socket_info *__read_mostly cat_socket_info;
+
 static unsigned int __initdata opt_psr;
 static unsigned int __initdata opt_rmid_max = 255;
+static unsigned int opt_cos_max = 255;
 static uint64_t rmid_mask;
 static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
 
@@ -63,10 +74,14 @@ static void __init parse_psr_param(char *s)
 *val_str++ = '\0';
 
 parse_psr_bool(s, val_str, "cmt", PSR_CMT);
+parse_psr_bool(s, val_str, "cat", PSR_CAT);
 
 if ( val_str && !strcmp(s, "rmid_max") )
 opt_rmid_max = simple_strtoul(val_str, NULL, 0);
 
+if ( val_str && !strcmp(s, "cos_max") )
+opt_cos_max = simple_strtoul(val_str, NULL, 0);
+
 s = ss + 1;
 } while ( ss );
 }
@@ -194,8 +209,49 @@ void psr_ctxt_switch_to(struct domain *d)
 }
 }
 
+static void cat_cpu_init(void)
+{
+unsigned int eax, ebx, ecx, edx;
+struct psr_cat_socket_info *info;
+unsigned int socket;
+unsigned int cpu = smp_processor_id();
+const struct cpuinfo_x86 *c = cpu_data + cpu;
+
+if ( !cpu_has(c, X86_FEATURE_CAT) )
+return;
+
+socket = cpu_to_socket(cpu);
+ASSERT(socket < nr_sockets);
+
+info = cat_socket_info + socket;
+
+/* Avoid initializing the same socket more than once. */
+if ( test_and_set_bool(info->initialized) )
+return;
+
+cpuid_count(PSR_CPUID

Re: [Xen-devel] Question about DMA on 1:1 mapping dom0 of arm64

2015-04-17 Thread Stefano Stabellini

On Fri, 17 Apr 2015, Chen Baozi wrote:
> Hi all,
> 
> According to my recent experience, there might be some problems of swiotlb
> dma map on 1:1 mapping arm64 dom0 with large memory. The issue is like below:
> 
> For those arm64 server with large memory, it is possible to set dom0_mem >
> 4G (e.g. I have one set with 16G). In this case, according to my understanding,
> there is chance that the dom0 kernel needs to map some buffers above 4G to do
> DMA operations (e.g. in snps,dwmac ethernet driver). However, most DMA engines
> support only 32-bit physical address, thus aren't able to operate directly on
> those memory. IIUC, swiotlb is implemented to solve this (using bounce buffer),
> if there is no IOMMU or IOMMU is not enabled on the system. Sadly, it seems
> that xen_swiotlb_map_page in my dom0 kernel allocates
> (start_dma_addr = 0x94480) the buffers for DMA above 4G which fails
> dma_capable() checking and was then unable to return from xen_swiotlb_map_page()
> successfully.
>
> If I set dom0_mem to a small value (e.g. 512M), which makes all physical memory
> of dom0 below 4G, everything goes fine.
> 
> I am not familiar with swiotlb-xen, so there would be misunderstanding about
> the current situation. Fix me if I did/understood anything wrong.
> 
> Any ideas?

I think that the problem is that xen_swiotlb_init doesn't necessarily allocate
memory under 4G on arm/arm64.

xen_swiotlb_init calls __get_free_pages to allocate memory, so the pages
could easily be above 4G.  Subsequently xen_swiotlb_fixup is called on
the allocated memory range, calling xen_create_contiguous_region and
passing an address_bits mask. However xen_create_contiguous_region
doesn't actually do anything at all on ARM.

I think that given that dom0 is mapped 1:1 on ARM, the easiest and best
fix would be to simply allocate memory under 4G to begin with. Something
like (maybe with an ifdef ARM around it):

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 810ad41..22ac33a 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -235,7 +235,7 @@ retry:
 #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-   xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
+   xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN|__GFP_DMA32, order);
if (xen_io_tlb_start)
break;
order--;
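
To show why a bounce buffer above 4G is of no use to a 32-bit device, here is a minimal sketch of the kind of range check that dma_capable() performs. It is modelled on the idea only and is not the kernel's code: buffer_dma_capable() and DMA_MASK_32BIT are hypothetical names, and the addresses in the usage note are invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Addressing limit of a device that can only drive 32 address bits. */
#define DMA_MASK_32BIT UINT64_C(0xffffffff)

/*
 * True if every byte of [addr, addr + size) is reachable by a device
 * with the given DMA mask. Overflow of addr + size is not handled in
 * this sketch.
 */
static bool buffer_dma_capable(uint64_t dma_mask, uint64_t addr, size_t size)
{
    assert(size > 0);

    return addr + size - 1 <= dma_mask;
}
```

A page at 0x80000000 passes the check and a page at 0x100000000 does not. With dom0 mapped 1:1, a swiotlb pool that was itself allocated above 4G fails this test for every slot, which matches the failure described in the report.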


Re: [Xen-devel] crash in efi_runtime_call

2015-04-17 Thread Olaf Hering

On Fri, Apr 17, Konrad Rzeszutek Wilk wrote:

> The /noexitboot will inhibit Xen from calling ExitBootServices.

How is that supposed to be passed to xen.efi? Looks like I have no
cmdline interface.

Olaf


Re: [Xen-devel] Commit 1aeb1156fa43fe2cd2b5003995b20466cd19a622: "x86 don't change affinity with interrupt unmasked", APCI errors and assorted pci trouble

2015-04-17 Thread Sander Eikelenboom


Friday, April 17, 2015, 1:43:32 PM, you wrote:

 On 14.04.15 at 14:46,  wrote:
>> I just had a hunch .. could it be related to the kernel apci/irq refactoring
>> series of Jiang Liu, that already caused a lot of trouble in 3.17, 3.18 and 3.19
>> with Xen.  And yes that seems to be the case:
>> 
>> On Xen without "x86 don't change affinity with interrupt unmasked"
>> - 3.16 && 3.19 && 4.0 all work fine 
>> 
>> On Xen with "x86 don't change affinity with interrupt unmasked" 
>> - 3.16 (which is before that kernel refactoring series) works fine.
>> - 3.19, 4.0 both give the dom0 kernel hangs and the :
>> (XEN) [2015-03-26 20:35:42.205] APIC error on CPU0: 00(40)
>> (XEN) [2015-03-26 20:35:42.372] APIC error on CPU0: 40(40)
>> 
>> (haven't tested 3.17 and 3.18 because these have assorted problems due to that
>>  series that weren't fixed in time before stable updates ended.)
>> 
>> So it seems Jan's patch seems to interfere with that patch series.

> That's rather odd a finding - the patch in question in fact uncovered
> a bug introduced in 2ca9fbd739 ("AMD IOMMU: allocate IRTE entries
> instead of using a static mapping") in that IO-APIC RTE reads would
> unconditionally translate the data (i.e. regardless of whether the
> entry was already in translated format). The patch below fixes this
> for me - can you please give this a try too?

> Thanks, Jan

Hi Jan,

For me as well, thanks again !

--
Sander


> --- unstable.orig/xen/drivers/passthrough/amd/iommu_intr.c
> +++ unstable/xen/drivers/passthrough/amd/iommu_intr.c
> @@ -365,15 +365,17 @@ unsigned int amd_iommu_read_ioapic_from_
>  unsigned int apic, unsigned int reg)
>  {
>  unsigned int val = __io_apic_read(apic, reg);
> +unsigned int pin = (reg - 0x10) / 2;
> +unsigned int offset = ioapic_sbdf[IO_APIC_ID(apic)].pin_2_idx[pin];
>  
> -if ( !(reg & 1) )
> +if ( !(reg & 1) && offset < INTREMAP_ENTRIES )
>  {
> -unsigned int offset = val & (INTREMAP_ENTRIES - 1);
>  u16 bdf = ioapic_sbdf[IO_APIC_ID(apic)].bdf;
>  u16 seg = ioapic_sbdf[IO_APIC_ID(apic)].seg;
>  u16 req_id = get_intremap_requestor_id(seg, bdf);
>  const u32 *entry = get_intremap_entry(seg, req_id, offset);
>  
> +ASSERT(offset == (val & (INTREMAP_ENTRIES - 1)));
>  val &= ~(INTREMAP_ENTRIES - 1);
>  val |= get_field_from_reg_u32(*entry,
>INT_REMAP_ENTRY_INTTYPE_MASK,






[Xen-devel] [PATCH v2 18/19] xen: arm: Annotate registers trapped when CNTHCTL_EL2.EL1PCEN == 0

2015-04-17 Thread Ian Campbell

Signed-off-by: Ian Campbell 
---
 xen/arch/arm/traps.c |   18 ++
 1 file changed, 18 insertions(+)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index c869b96..ad6ff05 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1653,6 +1653,12 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 
 switch ( hsr.bits & HSR_CP32_REGS_MASK )
 {
+/*
+ * !CNTHCTL_EL2.EL1PCEN / !CNTHCTL.PL1PCEN
+ *
+ * ARMv7 (DDI 0406C.b): B4.1.22
+ * ARMv8 (DDI 0487A.d): D1-1510 Table D1-60
+ */
 case HSR_CPREG32(CNTP_CTL):
 case HSR_CPREG32(CNTP_TVAL):
 if ( !vtimer_emulate(regs, hsr) )
@@ -1768,6 +1774,12 @@ static void do_cp15_64(struct cpu_user_regs *regs,
 
 switch ( hsr.bits & HSR_CP64_REGS_MASK )
 {
+/*
+ * !CNTHCTL_EL2.EL1PCEN / !CNTHCTL.PL1PCEN
+ *
+ * ARMv7 (DDI 0406C.b): B4.1.22
+ * ARMv8 (DDI 0487A.d): D1-1510 Table D1-60
+ */
 case HSR_CPREG64(CNTP_CVAL):
 if ( !vtimer_emulate(regs, hsr) )
 return inject_undef_exception(regs, hsr);
@@ -2130,12 +2142,18 @@ static void do_sysreg(struct cpu_user_regs *regs,
  */
 return handle_raz_wi(regs, x, hsr.sysreg.read, hsr, 1);
 
+/*
+ * !CNTHCTL_EL2.EL1PCEN
+ *
+ * ARMv8 (DDI 0487A.d): D1-1510 Table D1-60
+ */
 case HSR_SYSREG_CNTP_CTL_EL0:
 case HSR_SYSREG_CNTP_TVAL_EL0:
 case HSR_SYSREG_CNTP_CVAL_EL0:
 if ( !vtimer_emulate(regs, hsr) )
 return inject_undef_exception(regs, hsr);
 break;
+
 case HSR_SYSREG_ICC_SGI1R_EL1:
 if ( !vgic_emulate(regs, hsr) )
 {
-- 
1.7.10.4



Re: [Xen-devel] crash in efi_runtime_call

2015-04-17 Thread Konrad Rzeszutek Wilk

On Fri, Apr 17, 2015 at 02:45:23PM +0100, Andrew Cooper wrote:
> On 17/04/15 14:40, Konrad Rzeszutek Wilk wrote:
> > On Fri, Apr 17, 2015 at 01:54:28PM +0100, Andrew Cooper wrote:
> >> On 17/04/15 13:39, Jan Beulich wrote:
> >> On 17.04.15 at 13:59,  wrote:
>  On 17/04/15 12:17, Olaf Hering wrote:
> > Since booting xen fails on my ProBook unless I specify "maxcpus=1" I
> > tried the EFI firmware today. To my surprise it boots and finds all
> > cpus. But once some efi driver in dom0 is loaded xen crashes. The same
> > happens with xen-4.4 as included in SLE12.
> >
> > ...
> > (XEN) Xen call trace:
> > (XEN)[] aec1e8e1
> > (XEN)[] efi_runtime_call+0x7f0/0x890
> > (XEN)[] do_platform_op+0x679/0x1670
> > (XEN)[] syscall_enter+0xa9/0xae
> > 
> >
> > Can I do anything about it, or is this a firmware bug? I will move the
> > offending efi driver away and try again.
> >
> > Olaf
>  This is a firmware bug.
> >>> +1 (and I'm surprised how common this is)
> >> The bug is present in the reference implementation code, which means it
> >> is present in a lot of real firmware.  We have kit from 3 different
> >> vendors which are affected, including latest available firmware.
> >>
> > (XEN)  1-23fff type=7 attr=000
> > (XEN)  0fec1-0fec10fff type=11 attr=8001
> > (XEN)  0fff4-0fff46fff type=11 attr=8000
> > (XEN) Unknown cachability for MFNs 0xfff40-0xfff46
>  This unknown cacheability causes Xen not to make pagetables for the region.
> 
>  There is a patch or two floating around the list, but currently no
>  resolution on the argument it created.
> 
>  https://github.com/xenserver/xen-4.5.pg/blob/master/master/unknown-cacheability.patch
>  is the XenServer fix.
> >>> Now that's surely wrong
> >> Right or wrong, this is (apparently; I have not checked) what Linux does.
> >>
> >>>  - if anything, unknown should be treated as
> >>> UC (and quite likely specifically in a case like the one Olaf reports here,
> >>> as the offending memory range pretty likely is other than normal RAM).
> >>> What I'd accept as a patch would be the addition of a command line
> >>> option enforcing the mapping of such unknown cacheability areas with
> >>> a certain caching type (default then being UC).
> >> If I can find some copious free time, I will see about making this happen.
> > I actually did cobble a patch like this, but it is based on Daniel's Multibootv2
> > so it won't apply cleanly. See attached patchset with various 'work-arounds'.
> >
> > Jan if you are OK with them (well the 'idea' behind them) I can refresh
> > it against staging and post them?
> 
> I was planning to make one efi= command line option along the
> psr/ept/iommu line, rather than having a large number of top-level
> options (and folding our one efi-rs option into it).

That does sound more sensible than a bunch of 'efi-XYZ'.
> 
> But otherwise, that sounds like a plan.

Great. Next week I will post it.
> 
> ~Andrew
> 


[Xen-devel] [PATCH v2 12/19] xen: arm: Annotate the handlers for HSTR_EL2.T15

2015-04-17 Thread Ian Campbell

Signed-off-by: Ian Campbell 
---
v2: s/Tx/T15/
---
 xen/arch/arm/traps.c |   10 ++
 1 file changed, 10 insertions(+)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index a2bae51..86b5655 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1720,6 +1720,11 @@ static void do_cp15_32(struct cpu_user_regs *regs,
  * ARMv7 (DDI 0406C.b): B1.14.12
  * ARMv8 (DDI 0487A.d): N/A
  *
+ * HSTR_EL2.T15
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.14
+ * ARMv8 (DDI 0487A.d): D1-1507 Table D1-55
+ *
  * And all other unknown registers.
  */
 default:
@@ -1758,6 +1763,11 @@ static void do_cp15_64(struct cpu_user_regs *regs,
  * ARMv7 (DDI 0406C.b): B1.14.12
  * ARMv8 (DDI 0487A.d): N/A
  *
+ * HSTR_EL2.T15
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.14
+ * ARMv8 (DDI 0487A.d): D1-1507 Table D1-55
+ *
  * And all other unknown registers.
  */
 default:
-- 
1.7.10.4



[Xen-devel] [PATCH v2 15/19] xen: arm: Annotate registers trapped by MDCR_EL2.TDA

2015-04-17 Thread Ian Campbell

Gather the affected handlers in a single place per trap type.

Signed-off-by: Ian Campbell 
---
 xen/arch/arm/traps.c |   60 +-
 1 file changed, 49 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 7606bff..97cde45 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1816,6 +1816,28 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
 case HSR_CPREG32(DBGOSDLR):
 return handle_raz_wi(regs, r, cp32.read, hsr, 1);
 
+/*
+ * MDCR_EL2.TDA
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.15
+ * ARMv8 (DDI 0487A.d): D1-1510 Table D1-59
+ *
+ * Unhandled:
+ *DBGDCCINT
+ *DBGDTRRXint
+ *DBGDTRTXint
+ *DBGWFAR
+ *DBGDTRTXext
+ *DBGDTRRXext
+ *DBGBXVR
+ *DBGCLAIMSET
+ *DBGCLAIMCLR
+ *DBGAUTHSTATUS
+ *DBGDEVID
+ *DBGDEVID1
+ *DBGDEVID2
+ *DBGOSECCR
+ */
 case HSR_CPREG32(DBGDIDR):
 /*
  * Read-only register. Accessible by EL0 if DBGDSCRext.UDCCdis
@@ -2014,15 +2036,38 @@ static void do_sysreg(struct cpu_user_regs *regs,
 case HSR_SYSREG_OSDLR_EL1:
 return handle_raz_wi(regs, x, hsr.sysreg.read, hsr, 1);
 
-/* RAZ/WI registers: */
-/*  - Debug */
+/*
+ * MDCR_EL2.TDA
+ *
+ * ARMv8 (DDI 0487A.d): D1-1510 Table D1-59
+ *
+ * Unhandled:
+ *MDCCINT_EL1
+ *DBGDTR_EL0
+ *DBGDTRRX_EL0
+ *DBGDTRTX_EL0
+ *OSDTRRX_EL1
+ *OSDTRTX_EL1
+ *OSECCR_EL1
+ *DBGCLAIMSET_EL1
+ *DBGCLAIMCLR_EL1
+ *DBGAUTHSTATUS_EL1
+ */
 case HSR_SYSREG_MDSCR_EL1:
-/*  - Breakpoints */
+return handle_raz_wi(regs, x, hsr.sysreg.read, hsr, 1);
+case HSR_SYSREG_MDCCSR_EL0:
+/*
+ * Accessible at EL0 only if MDSCR_EL1.TDCC is set to 0. We emulate that
+ * register as RAZ/WI above. So RO at both EL0 and EL1.
+ */
+return handle_ro_raz(regs, x, hsr.sysreg.read, hsr, 0);
 HSR_SYSREG_DBG_CASES(DBGBVR):
 HSR_SYSREG_DBG_CASES(DBGBCR):
-/*  - Watchpoints */
 HSR_SYSREG_DBG_CASES(DBGWVR):
 HSR_SYSREG_DBG_CASES(DBGWCR):
+return handle_raz_wi(regs, x, hsr.sysreg.read, hsr, 1);
+
+/* RAZ/WI registers: */
 /*  - Perf monitors */
 case HSR_SYSREG_PMINTENSET_EL1:
 case HSR_SYSREG_PMINTENCLR_EL1:
@@ -2032,13 +2077,6 @@ static void do_sysreg(struct cpu_user_regs *regs,
  */
 return handle_raz_wi(regs, x, hsr.sysreg.read, hsr, 1);
 
-case HSR_SYSREG_MDCCSR_EL0:
-/*
- * Accessible at EL0 only if MDSCR_EL1.TDCC is set to 0. We emulate that
- * register as RAZ/WI above. So RO at both EL0 and EL1.
- */
-return handle_ro_raz(regs, x, hsr.sysreg.read, hsr, 0);
-
 /* - Perf monitors */
 case HSR_SYSREG_PMUSERENR_EL0:
 /* RO at EL0. RAZ/WI at EL1 */
-- 
1.7.10.4



[Xen-devel] [PATCH v2 16/19] xen: arm: Annotate registers trapped by MDCR_EL2.TPM and TPMCR

2015-04-17 Thread Ian Campbell

Signed-off-by: Ian Campbell 
---
 xen/arch/arm/traps.c |   39 ++-
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 97cde45..d4505b5 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1672,6 +1672,24 @@ static void do_cp15_32(struct cpu_user_regs *regs,
*r = v->arch.actlr;
 break;
 
+/*
+ * MDCR_EL2.TPM
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.17
+ * ARMv8 (DDI 0487A.d): D1-1511 Table D1-61
+ *
+ * Unhandled:
+ *PMEVCNTR
+ *PMEVTYPER
+ *PMCCFILTR
+ *
+ * MDCR_EL2.TPMCR
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.17
+ * ARMv8 (DDI 0487A.d): D1-1511 Table D1-62
+ *
+ * NB: Both MDCR_EL2.TPM and MDCR_EL2.TPMCR cause trapping of PMCR.
+ */
 /* We could trap ID_DFR0 and tell the guest we don't support
  * performance monitoring, but Linux doesn't check the ID_DFR0.
  * Therefore it will read PMCR.
@@ -1686,7 +1704,6 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 return handle_ro_raz(regs, r, cp32.read, hsr, 0);
 else
 return handle_raz_wi(regs, r, cp32.read, hsr, 1);
-
 case HSR_CPREG32(PMINTENSET):
 case HSR_CPREG32(PMINTENCLR):
 /* EL1 only, however MDCR_EL2.TPM==1 means EL0 may trap here also. */
@@ -2067,8 +2084,22 @@ static void do_sysreg(struct cpu_user_regs *regs,
 HSR_SYSREG_DBG_CASES(DBGWCR):
 return handle_raz_wi(regs, x, hsr.sysreg.read, hsr, 1);
 
-/* RAZ/WI registers: */
-/*  - Perf monitors */
+/*
+ * MDCR_EL2.TPM
+ *
+ * ARMv8 (DDI 0487A.d): D1-1511 Table D1-61
+ *
+ * Unhandled:
+ *PMEVCNTR_EL0
+ *PMEVTYPER_EL0
+ *PMCCFILTR_EL0
+ * MDCR_EL2.TPMCR
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.17
+ * ARMv8 (DDI 0487A.d): D1-1511 Table D1-62
+ *
+ * NB: Both MDCR_EL2.TPM and MDCR_EL2.TPMCR cause trapping of PMCR.
+ */
 case HSR_SYSREG_PMINTENSET_EL1:
 case HSR_SYSREG_PMINTENCLR_EL1:
 /*
@@ -2076,8 +2107,6 @@ static void do_sysreg(struct cpu_user_regs *regs,
  * undef.
  */
 return handle_raz_wi(regs, x, hsr.sysreg.read, hsr, 1);
-
-/* - Perf monitors */
 case HSR_SYSREG_PMUSERENR_EL0:
 /* RO at EL0. RAZ/WI at EL1 */
 if ( psr_mode_is_user(regs) )
-- 
1.7.10.4



[Xen-devel] [PATCH v2 17/19] xen: arm: Remove CNTPCT_EL0 trap handling.

2015-04-17 Thread Ian Campbell

We set CNTHCTL_EL2.EL1PCTEN and therefore according to ARMv8 (DDI
0487A.d) D1-1510 Table D1-60 we are not trapping this.
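
For context, the handler removed by this patch produced the guest-visible
physical count in software: it subtracted the domain's timer base offset from
the current time and converted the result from nanoseconds to counter ticks.
Since the register is not trapped, the hardware supplies the value instead.
Below is a minimal sketch of that conversion. The names ns_to_ticks_sketch()
and emulated_cntpct() and the caller-supplied frequency are assumptions made
for illustration; Xen's real ns_to_ticks() may be implemented differently.

```c
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: convert nanoseconds to counter ticks at freq_hz.
 * Whole seconds and the sub-second remainder are scaled separately so the
 * sub-second intermediate product cannot overflow 64 bits.
 */
static uint64_t ns_to_ticks_sketch(uint64_t ns, uint32_t freq_hz)
{
    uint64_t secs = ns / NS_PER_SEC;
    uint64_t rem  = ns % NS_PER_SEC;

    return secs * freq_hz + (rem * freq_hz) / NS_PER_SEC;
}

/* Guest-visible count: time elapsed since the domain's timer base. */
static uint64_t emulated_cntpct(uint64_t now_ns, uint64_t base_offset_ns,
                                uint32_t freq_hz)
{
    return ns_to_ticks_sketch(now_ns - base_offset_ns, freq_hz);
}
```

With a 24 MHz counter, for example, one second of elapsed time maps to
24000000 ticks.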

Signed-off-by: Ian Campbell 
Reviewed-by: Julien Grall 
---
 xen/arch/arm/traps.c  |1 -
 xen/arch/arm/vtimer.c |   30 --
 2 files changed, 31 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index d4505b5..c869b96 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1768,7 +1768,6 @@ static void do_cp15_64(struct cpu_user_regs *regs,
 
 switch ( hsr.bits & HSR_CP64_REGS_MASK )
 {
-case HSR_CPREG64(CNTPCT):
 case HSR_CPREG64(CNTP_CVAL):
 if ( !vtimer_emulate(regs, hsr) )
 return inject_undef_exception(regs, hsr);
diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
index be65c9f..685bfea 100644
--- a/xen/arch/arm/vtimer.c
+++ b/xen/arch/arm/vtimer.c
@@ -243,28 +243,6 @@ static int vtimer_cntp_cval(struct cpu_user_regs *regs, uint64_t *r, int read)
 }
 return 1;
 }
-static int vtimer_cntpct(struct cpu_user_regs *regs, uint64_t *r, int read)
-{
-struct vcpu *v = current;
-uint64_t ticks;
-s_time_t now;
-
-if ( read )
-{
-if ( !ACCESS_ALLOWED(regs, EL0PCTEN) )
-return 0;
-now = NOW() - v->domain->arch.phys_timer_base.offset;
-ticks = ns_to_ticks(now);
-*r = ticks;
-return 1;
-}
-else
-{
-gprintk(XENLOG_DEBUG, "WRITE to R/O CNTPCT\n");
-return 0;
-}
-}
-
 
 static int vtimer_emulate_cp32(struct cpu_user_regs *regs, union hsr hsr)
 {
@@ -303,11 +281,6 @@ static int vtimer_emulate_cp64(struct cpu_user_regs *regs, union hsr hsr)
 
 switch ( hsr.bits & HSR_CP64_REGS_MASK )
 {
-case HSR_CPREG64(CNTPCT):
-if ( !vtimer_cntpct(regs, &x, cp64.read) )
-return 0;
-break;
-
 case HSR_CPREG64(CNTP_CVAL):
 if ( !vtimer_cntp_cval(regs, &x, cp64.read) )
 return 0;
@@ -356,9 +329,6 @@ static int vtimer_emulate_sysreg(struct cpu_user_regs *regs, union hsr hsr)
 case HSR_SYSREG_CNTP_CVAL_EL0:
 return vtimer_cntp_cval(regs, x, sysreg.read);
 
-case HSR_SYSREG_CNTPCT_EL0:
-return vtimer_cntpct(regs, x, sysreg.read);
-
 default:
 return 0;
 }
-- 
1.7.10.4



[Xen-devel] [PATCH v2 00/19] xen: arm: cleanup traps.c

2015-04-17 Thread Ian Campbell

While working on reenabling 32-bit user space on arm64 I concluded that
the trap handling in traps.c had grown into a twisty confusing mess.
Let's try and sort that out.

This series contains two halves (after a couple of preparatory
cleanups).

First clean up the goto maze which we've found ourselves in, by
providing a selection of handle_* helpers e.g. for raz/ro etc and by
calling those and the existing inject_* helpers directly instead of
trying to have only one call to each of the latter by using goto. The
handle_* helpers can also deal with the minimum allowable exception
level, which simplifies things further.
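
A minimal, self-contained sketch of the helper pattern described above. All
names here (trap_result, raz_wi, ro_raz, wo_wi, pmuserenr_like) are
hypothetical stand-ins. The sketch returns an outcome enum so the behaviour
can be checked in isolation, whereas the helpers in the series return void
and call inject_undef_exception() or advance_pc() themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome type; the real helpers return void. */
enum trap_result { TRAP_EMULATED, TRAP_UNDEF };

/* Common prologue: reject accesses from below the minimum level. */
static bool below_min_el(bool user_mode, int min_el)
{
    assert(min_el == 0 || min_el == 1);
    return min_el > 0 && user_mode;
}

/* Read-as-zero / write-ignored. */
static enum trap_result raz_wi(bool user_mode, uint64_t *reg, bool read,
                               int min_el)
{
    if (below_min_el(user_mode, min_el))
        return TRAP_UNDEF;
    if (read)
        *reg = 0;
    /* else: write ignored */
    return TRAP_EMULATED;
}

/* Read-only + read-as-zero: writes are undefined. */
static enum trap_result ro_raz(bool user_mode, uint64_t *reg, bool read,
                               int min_el)
{
    if (below_min_el(user_mode, min_el))
        return TRAP_UNDEF;
    if (!read)
        return TRAP_UNDEF;
    *reg = 0;
    return TRAP_EMULATED;
}

/* Write-only + write-ignored: reads are undefined. */
static enum trap_result wo_wi(bool user_mode, uint64_t *reg, bool read,
                              int min_el)
{
    (void)reg; /* never written: reads fault, writes are dropped */
    if (below_min_el(user_mode, min_el))
        return TRAP_UNDEF;
    return read ? TRAP_UNDEF : TRAP_EMULATED;
}

/* A register that is RO at EL0 and RAZ/WI at EL1 picks a helper per mode. */
static enum trap_result pmuserenr_like(bool user_mode, uint64_t *reg,
                                       bool read)
{
    if (user_mode)
        return ro_raz(user_mode, reg, read, 0);
    return raz_wi(user_mode, reg, read, 1);
}
```

Each case in the switch then becomes a single call, with the privilege check
folded into the min_el argument rather than repeated at every call site.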

To keep things simpler I've used "return handle_..." when the caller and
callee both return void, since that avoids the need for 3 more lines (2
braces and the return); I think this improves clarity.
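
The two forms can be compared in a small sketch. The function names are
hypothetical. Note that ISO C formally forbids a return with an expression in
a function returning void; GCC and Clang accept it as an extension and only
diagnose it under -pedantic, so this sketch assumes one of those compilers.

```c
/* Hypothetical counter so the effect of the helper is observable. */
static int undef_injected;

/* Stand-in for a void helper such as inject_undef_exception(). */
static void inject_undef(void)
{
    undef_injected++;
}

/* Verbose form: two braces plus a separate return statement. */
static void handler_verbose(int is_user)
{
    if (is_user)
    {
        inject_undef();
        return;
    }
    /* ... emulate the access ... */
}

/* Compact form: return the void call directly. */
static void handler_compact(int is_user)
{
    if (is_user)
        return inject_undef();
    /* ... emulate the access ... */
}
```

Both handlers behave identically; the compact form only saves the three
lines mentioned above.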

Second go through init_traps and for each bit there consolidate the
handling for each type of trap (e.g. do_cp15_32, do_cp15_64, do_sysreg
etc) such that all the registers whose traps are associated with that
bit are kept together beneath a comment which documents why those bits
are trapped, references the appropriate section of the ARMv7 and ARMv8
ARM (the v8 one in particular has a series of very useful tables per
bit) and notes which registers are not explicitly handled (and therefore
take the default case).

For traps which have no explicit handling (i.e. those which trap
implementation defined registers) and which always hit the default case
add the comment above that instead.

Do the same for the GICv3 ICC traps and timer traps.

There is probably scope for doing more, i.e. refactoring related
functionality into subsystem helpers (like we do for vtimer) and even
moving into separate files, but I think this is a good start.

This is a lot of patches, sorry, because I wanted to mostly go through
the trap bits one at a time per patch to keep each one manageable,
although I did end up compressing some of the more obvious ones.

Since last time I've addressed all (I hope!) of Julien's comments.

(R)eviewed

R   xen: arm: constify union hsr and struct hsr_* where possible.
R   xen: arm: Fix handling of ICC_{SGI1R,SGI0R,ASGI1R}_EL1
R   xen: arm: call inject_undef_exception directly
 M  xen: arm: provide and use a handle_raz_wi helper
 M  xen: arm: Add and use r/o+raz and w/o+wi helpers
 M  xen: arm: add minimum exception level argument to trap handler helpers
 M  xen: arm: Annotate trap handler for HCR_EL2.{TWI,TWE,TSC}
 M  xen: arm: implement handling of ACTLR_EL1 trap
RM  xen: arm: Annotate registers trapped by HCR_EL2.TIDCP
 M  xen: arm: implement handling of registers trapped by CPTR_EL2.TTA
 M  xen: arm: Annotate handlers for CPTR_EL2.Tx
 M  xen: arm: Annotate the handlers for HSTR_EL2.T15
 M  xen: arm: Annotate registers trapped by MDCR_EL2.TDRA
 M  xen: arm: Annotate registers trapped by MDCR_EL2.TDOSA
xen: arm: Annotate registers trapped by MDCR_EL2.TDA
xen: arm: Annotate registers trapped by MDCR_EL2.TPM and TPMCR
R   xen: arm: Remove CNTPCT_EL0 trap handling.
xen: arm: Annotate registers trapped when CNTHCTL_EL2.EL1PCEN == 0
R   xen: arm: Annotate source of ICC SGI register trapping

Ian.



[Xen-devel] [PATCH v2 03/19] xen: arm: call inject_undef_exception directly

2015-04-17 Thread Ian Campbell

This considerably reduces the goto maze.
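
With this change inject_undef_exception() receives the whole syndrome value
and reads the instruction-length bit from it, instead of each caller passing
hsr.len. For readers unfamiliar with the layout, here is a minimal sketch of
decoding the three top-level syndrome fields with shifts and masks. The
struct and function names are hypothetical; Xen's own union hsr exposes these
as named members (.iss, .len, .ec) rather than through a helper like this.

```c
#include <stdint.h>

/* Hypothetical decoded view of a 32-bit syndrome value. */
struct hsr_fields {
    uint32_t iss;   /* bits [24:0]:  instruction-specific syndrome */
    uint32_t len;   /* bit  [25]:    0 = 16-bit, 1 = 32-bit instruction */
    uint32_t ec;    /* bits [31:26]: exception class */
};

/* Split a raw syndrome value into its fields without modifying it. */
static struct hsr_fields decode_hsr(const uint32_t bits)
{
    const struct hsr_fields f = {
        .iss = bits & 0x1ffffffu,
        .len = (bits >> 25) & 0x1u,
        .ec  = (bits >> 26) & 0x3fu,
    };
    return f;
}
```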

Signed-off-by: Ian Campbell 
Reviewed-by: Julien Grall 
---
 xen/arch/arm/traps.c |   56 +++---
 1 file changed, 26 insertions(+), 30 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 99ceaea..7270116 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -518,13 +518,13 @@ static void inject_iabt64_exception(struct cpu_user_regs *regs,
 #endif
 
 static void inject_undef_exception(struct cpu_user_regs *regs,
-   int instr_len)
+   const union hsr hsr)
 {
 if ( is_32bit_domain(current->domain) )
 inject_undef32_exception(regs);
 #ifdef CONFIG_ARM_64
 else
-inject_undef64_exception(regs, instr_len);
+inject_undef64_exception(regs, hsr.len);
 #endif
 }
 
@@ -1592,11 +1592,11 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 case HSR_CPREG32(CNTP_CTL):
 case HSR_CPREG32(CNTP_TVAL):
 if ( !vtimer_emulate(regs, hsr) )
-goto undef_cp15_32;
+return inject_undef_exception(regs, hsr);
 break;
 case HSR_CPREG32(ACTLR):
 if ( psr_mode_is_user(regs) )
-goto undef_cp15_32;
+return inject_undef_exception(regs, hsr);
 if ( cp32.read )
*r = v->arch.actlr;
 break;
@@ -1612,14 +1612,14 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 case HSR_CPREG32(PMUSERENR):
 /* RO at EL0. RAZ/WI at EL1 */
 if ( psr_mode_is_user(regs) && !hsr.cp32.read )
-goto undef_cp15_32;
+return inject_undef_exception(regs, hsr);
 goto cp15_32_raz_wi;
 
 case HSR_CPREG32(PMINTENSET):
 case HSR_CPREG32(PMINTENCLR):
 /* EL1 only, however MDCR_EL2.TPM==1 means EL0 may trap here also. */
 if ( psr_mode_is_user(regs) )
-goto undef_cp15_32;
+return inject_undef_exception(regs, hsr);
 goto cp15_32_raz_wi;
 case HSR_CPREG32(PMCR):
 case HSR_CPREG32(PMCNTENSET):
@@ -1638,7 +1638,7 @@ static void do_cp15_32(struct cpu_user_regs *regs,
  * emulate that register as 0 above.
  */
 if ( psr_mode_is_user(regs) )
-goto undef_cp15_32;
+return inject_undef_exception(regs, hsr);
  cp15_32_raz_wi:
 if ( cp32.read )
 *r = 0;
@@ -1652,8 +1652,7 @@ static void do_cp15_32(struct cpu_user_regs *regs,
  cp32.op1, cp32.reg, cp32.crn, cp32.crm, cp32.op2, regs->pc);
 gdprintk(XENLOG_ERR, "unhandled 32-bit CP15 access %#x\n",
  hsr.bits & HSR_CP32_REGS_MASK);
- undef_cp15_32:
-inject_undef_exception(regs, hsr.len);
+inject_undef_exception(regs, hsr);
 return;
 }
 advance_pc(regs, hsr);
@@ -1673,7 +1672,7 @@ static void do_cp15_64(struct cpu_user_regs *regs,
 case HSR_CPREG64(CNTPCT):
 case HSR_CPREG64(CNTP_CVAL):
 if ( !vtimer_emulate(regs, hsr) )
-goto undef_cp15_64;
+return inject_undef_exception(regs, hsr);
 break;
 default:
 {
@@ -1685,8 +1684,7 @@ static void do_cp15_64(struct cpu_user_regs *regs,
  cp64.op1, cp64.reg1, cp64.reg2, cp64.crm, regs->pc);
 gdprintk(XENLOG_ERR, "unhandled 64-bit CP15 access %#x\n",
  hsr.bits & HSR_CP64_REGS_MASK);
- undef_cp15_64:
-inject_undef_exception(regs, hsr.len);
+inject_undef_exception(regs, hsr);
 return;
 }
 }
@@ -1713,7 +1711,7 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
  * is set to 0, which we emulated below.
  */
 if ( !cp32.read )
-goto undef_cp14_32;
+return inject_undef_exception(regs, hsr);
 
 /* Implement the minimum requirements:
  *  - Number of watchpoints: 1
@@ -1731,14 +1729,14 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
  * is set to 0, which we emulated below.
  */
 if ( !cp32.read )
-goto undef_cp14_32;
+return inject_undef_exception(regs, hsr);
 
 *r = 0;
 break;
 
 case HSR_CPREG32(DBGDSCREXT):
 if ( usr_mode(regs) )
-goto undef_cp14_32;
+return inject_undef_exception(regs, hsr);
 
 /* Implement debug status and control register as RAZ/WI.
  * The OS won't use Hardware debug if MDBGen not set
@@ -1756,7 +1754,7 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
 case HSR_CPREG32(DBGBCR1):
 case HSR_CPREG32(DBGOSDLR):
 if ( usr_mode(regs) )
-goto undef_cp14_32;
+return inject_undef_exception(regs, hsr);
 /* RAZ/WI */
 if ( cp32.read )
 *r = 0;
@@ -1764,10 +1762,10 @@ static void do_cp14_32(struct cpu_user_regs *regs,

[Xen-devel] [PATCH v2 05/19] xen: arm: Add and use r/o+raz and w/o+wi helpers

2015-04-17 Thread Ian Campbell

Signed-off-by: Ian Campbell 
---
v2: Move last parameter of a handle_ro_raz call to next patch where it
belongs.
---
 xen/arch/arm/traps.c |   52 --
 1 file changed, 33 insertions(+), 19 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 8b1846a..b54aef6 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1587,6 +1587,34 @@ static void handle_raz_wi(struct cpu_user_regs *regs,
 advance_pc(regs, hsr);
 }
 
+/* Write only + write ignore */
+static void handle_wo_wi(struct cpu_user_regs *regs,
+ register_t *reg,
+ bool_t read,
+ const union hsr hsr)
+{
+if ( read )
+return inject_undef_exception(regs, hsr);
+/* else: ignore */
+
+advance_pc(regs, hsr);
+}
+
+/* Read only + read as zero */
+static void handle_ro_raz(struct cpu_user_regs *regs,
+  register_t *reg,
+  bool_t read,
+  const union hsr hsr)
+{
+if ( !read )
+return inject_undef_exception(regs, hsr);
+/* else: raz */
+
+*reg = 0;
+
+advance_pc(regs, hsr);
+}
+
 static void do_cp15_32(struct cpu_user_regs *regs,
const union hsr hsr)
 {
@@ -1737,11 +1765,7 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
  * Read-only register. Accessible by EL0 if DBGDSCRext.UDCCdis
  * is set to 0, which we emulated below.
  */
-if ( !cp32.read )
-return inject_undef_exception(regs, hsr);
-
-*r = 0;
-break;
+return handle_ro_raz(regs, r, cp32.read, hsr);
 
 case HSR_CPREG32(DBGDSCREXT):
 if ( usr_mode(regs) )
@@ -1768,11 +1792,7 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
 case HSR_CPREG32(DBGOSLAR):
 if ( usr_mode(regs) )
 return inject_undef_exception(regs, hsr);
-/* WO */
-if ( cp32.read )
-return inject_undef_exception(regs, hsr);
-/* else: ignore */
-break;
+return handle_wo_wi(regs, r, cp32.read, hsr);
 default:
 gdprintk(XENLOG_ERR,
  "%s p14, %d, r%d, cr%d, cr%d, %d @ 0x%"PRIregister"\n",
@@ -1857,11 +1877,7 @@ static void do_sysreg(struct cpu_user_regs *regs,
  * Accessible at EL0 only if MDSCR_EL1.TDCC is set to 0. We emulate that
  * register as RAZ/WI above. So RO at both EL0 and EL1.
  */
-if ( !hsr.sysreg.read )
-return inject_undef_exception(regs, hsr);
-
-*x = 0;
-break;
+return handle_ro_raz(regs, x, hsr.sysreg.read, hsr);
 
 /* - Perf monitors */
 case HSR_SYSREG_PMUSERENR_EL0:
@@ -1891,10 +1907,8 @@ static void do_sysreg(struct cpu_user_regs *regs,
 
 /* Write only, Write ignore registers: */
 case HSR_SYSREG_OSLAR_EL1:
-if ( hsr.sysreg.read )
-return inject_undef_exception(regs, hsr);
-/* else: write ignored */
-break;
+return handle_wo_wi(regs, x, hsr.sysreg.read, hsr);
+
 case HSR_SYSREG_CNTP_CTL_EL0:
 case HSR_SYSREG_CNTP_TVAL_EL0:
 case HSR_SYSREG_CNTP_CVAL_EL0:
-- 
1.7.10.4



[Xen-devel] [PATCH v2 06/19] xen: arm: add minimum exception level argument to trap handler helpers

2015-04-17 Thread Ian Campbell

Removes a load of boilerplate.

Signed-off-by: Ian Campbell 
---
v2: Move last parameter of a call to handle_ro_raz here where it
belongs.
Added asserts for valid min_el values
---
 xen/arch/arm/traps.c |   73 +++---
 1 file changed, 39 insertions(+), 34 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index b54aef6..7110c66 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1578,8 +1578,14 @@ static void advance_pc(struct cpu_user_regs *regs, const union hsr hsr)
 static void handle_raz_wi(struct cpu_user_regs *regs,
   register_t *reg,
   bool_t read,
-  const union hsr hsr)
+  const union hsr hsr,
+  int min_el)
 {
+ASSERT((min_el == 0) || (min_el == 1));
+
+if ( min_el > 0 && psr_mode_is_user(regs) )
+return inject_undef_exception(regs, hsr);
+
 if ( read )
 *reg = 0;
 /* else: write ignored */
@@ -1591,8 +1597,14 @@ static void handle_raz_wi(struct cpu_user_regs *regs,
 static void handle_wo_wi(struct cpu_user_regs *regs,
  register_t *reg,
  bool_t read,
- const union hsr hsr)
+ const union hsr hsr,
+ int min_el)
 {
+ASSERT((min_el == 0) || (min_el == 1));
+
+if ( min_el > 0 && psr_mode_is_user(regs) )
+return inject_undef_exception(regs, hsr);
+
 if ( read )
 return inject_undef_exception(regs, hsr);
 /* else: ignore */
@@ -1604,8 +1616,14 @@ static void handle_wo_wi(struct cpu_user_regs *regs,
 static void handle_ro_raz(struct cpu_user_regs *regs,
   register_t *reg,
   bool_t read,
-  const union hsr hsr)
+  const union hsr hsr,
+  int min_el)
 {
+ASSERT((min_el == 0) || (min_el == 1));
+
+if ( min_el > 0 && psr_mode_is_user(regs) )
+return inject_undef_exception(regs, hsr);
+
 if ( !read )
 return inject_undef_exception(regs, hsr);
 /* else: raz */
@@ -1652,16 +1670,15 @@ static void do_cp15_32(struct cpu_user_regs *regs,
  */
 case HSR_CPREG32(PMUSERENR):
 /* RO at EL0. RAZ/WI at EL1 */
-if ( psr_mode_is_user(regs) && !hsr.cp32.read )
-return inject_undef_exception(regs, hsr);
-return handle_raz_wi(regs, r, cp32.read, hsr);
+if ( psr_mode_is_user(regs) )
+return handle_ro_raz(regs, r, cp32.read, hsr, 0);
+else
+return handle_raz_wi(regs, r, cp32.read, hsr, 1);
 
 case HSR_CPREG32(PMINTENSET):
 case HSR_CPREG32(PMINTENCLR):
 /* EL1 only, however MDCR_EL2.TPM==1 means EL0 may trap here also. */
-if ( psr_mode_is_user(regs) )
-return inject_undef_exception(regs, hsr);
-return handle_raz_wi(regs, r, cp32.read, hsr);
+return handle_raz_wi(regs, r, cp32.read, hsr, 1);
 case HSR_CPREG32(PMCR):
 case HSR_CPREG32(PMCNTENSET):
 case HSR_CPREG32(PMCNTENCLR):
@@ -1678,9 +1695,7 @@ static void do_cp15_32(struct cpu_user_regs *regs,
  * Accessible at EL0 only if PMUSERENR_EL0.EN is set. We
  * emulate that register as 0 above.
  */
-if ( psr_mode_is_user(regs) )
-return inject_undef_exception(regs, hsr);
-return handle_raz_wi(regs, r, cp32.read, hsr);
+return handle_raz_wi(regs, r, cp32.read, hsr, 1);
 
 default:
 gdprintk(XENLOG_ERR,
@@ -1765,17 +1780,14 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
  * Read-only register. Accessible by EL0 if DBGDSCRext.UDCCdis
  * is set to 0, which we emulated below.
  */
-return handle_ro_raz(regs, r, cp32.read, hsr);
+return handle_ro_raz(regs, r, cp32.read, hsr, 1);
 
 case HSR_CPREG32(DBGDSCREXT):
-if ( usr_mode(regs) )
-return inject_undef_exception(regs, hsr);
-
 /*
  * Implement debug status and control register as RAZ/WI.
  * The OS won't use Hardware debug if MDBGen not set.
  */
-return handle_raz_wi(regs, r, cp32.read, hsr);
+return handle_raz_wi(regs, r, cp32.read, hsr, 1);
 
 case HSR_CPREG32(DBGVCR):
 case HSR_CPREG32(DBGBVR0):
@@ -1785,14 +1797,10 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
 case HSR_CPREG32(DBGBVR1):
 case HSR_CPREG32(DBGBCR1):
 case HSR_CPREG32(DBGOSDLR):
-if ( usr_mode(regs) )
-return inject_undef_exception(regs, hsr);
-return handle_raz_wi(regs, r, cp32.read, hsr);
+return handle_raz_wi(regs, r, cp32.read, hsr, 1);
 
 case HSR_CPREG32(DBGOSLAR):
-if ( usr_mode(regs) )
-return inject_undef_exception(regs, hsr);
-return handle_wo_

[Xen-devel] [PATCH v2 14/19] xen: arm: Annotate registers trapped by MDCR_EL2.TDOSA

2015-04-17 Thread Ian Campbell

Gather the affected handlers in a single place per trap type.

Add some HSR_SYSREG and AArch32 defines for those registers (because
I'd already typed them in when I realised I didn't need them).

Signed-off-by: Ian Campbell 
---
v2: Move comment block in cp14_dbg handler from incorrect place in
next patch
Drop stray comment
---
 xen/arch/arm/traps.c  |   51 +
 xen/include/asm-arm/cpregs.h  |2 ++
 xen/include/asm-arm/sysregs.h |2 ++
 3 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 17ddcd0..7606bff 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1801,6 +1801,21 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
 
 switch ( hsr.bits & HSR_CP32_REGS_MASK )
 {
+/*
+ * MDCR_EL2.TDOSA
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.15
+ * ARMv8 (DDI 0487A.d): D1-1509 Table D1-58
+ *
+ * Unhandled:
+ *DBGOSLSR
+ *DBGPRCR
+ */
+case HSR_CPREG32(DBGOSLAR):
+return handle_wo_wi(regs, r, cp32.read, hsr, 1);
+case HSR_CPREG32(DBGOSDLR):
+return handle_raz_wi(regs, r, cp32.read, hsr, 1);
+
 case HSR_CPREG32(DBGDIDR):
 /*
  * Read-only register. Accessible by EL0 if DBGDSCRext.UDCCdis
@@ -1840,12 +1855,8 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
 case HSR_CPREG32(DBGWCR0):
 case HSR_CPREG32(DBGBVR1):
 case HSR_CPREG32(DBGBCR1):
-case HSR_CPREG32(DBGOSDLR):
 return handle_raz_wi(regs, r, cp32.read, hsr, 1);
 
-case HSR_CPREG32(DBGOSLAR):
-return handle_wo_wi(regs, r, cp32.read, hsr, 1);
-
 /*
  * CPTR_EL2.TTA
  *
@@ -1923,6 +1934,18 @@ static void do_cp14_dbg(struct cpu_user_regs *regs, const union hsr hsr)
 return;
 }
 
+/*
+ * MDCR_EL2.TDOSA
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.15
+ * ARMv8 (DDI 0487A.d): D1-1509 Table D1-58
+ *
+ * Unhandled:
+ *DBGDTRTXint
+ *DBGDTRRXint
+ *
+ * And all other unknown registers.
+ */
 gdprintk(XENLOG_ERR,
  "%s p14, %d, r%d, r%d, cr%d @ 0x%"PRIregister"\n",
  cp64.read ? "mrrc" : "mcrr",
@@ -1977,6 +2000,20 @@ static void do_sysreg(struct cpu_user_regs *regs,
 case HSR_SYSREG_MDRAR_EL1:
 return handle_ro_raz(regs, x, hsr.sysreg.read, hsr, 1);
 
+/*
+ * MDCR_EL2.TDOSA
+ *
+ * ARMv8 (DDI 0487A.d): D1-1509 Table D1-58
+ *
+ * Unhandled:
+ *OSLSR_EL1
+ *DBGPRCR_EL1
+ */
+case HSR_SYSREG_OSLAR_EL1:
+return handle_wo_wi(regs, x, hsr.sysreg.read, hsr, 1);
+case HSR_SYSREG_OSDLR_EL1:
+return handle_raz_wi(regs, x, hsr.sysreg.read, hsr, 1);
+
 /* RAZ/WI registers: */
 /*  - Debug */
 case HSR_SYSREG_MDSCR_EL1:
@@ -1986,8 +2023,6 @@ static void do_sysreg(struct cpu_user_regs *regs,
 /*  - Watchpoints */
 HSR_SYSREG_DBG_CASES(DBGWVR):
 HSR_SYSREG_DBG_CASES(DBGWCR):
-/*  - Double Lock Register */
-case HSR_SYSREG_OSDLR_EL1:
 /*  - Perf monitors */
 case HSR_SYSREG_PMINTENSET_EL1:
 case HSR_SYSREG_PMINTENCLR_EL1:
@@ -2029,10 +2064,6 @@ static void do_sysreg(struct cpu_user_regs *regs,
  */
 return handle_raz_wi(regs, x, hsr.sysreg.read, hsr, 1);
 
-/* Write only, Write ignore registers: */
-case HSR_SYSREG_OSLAR_EL1:
-return handle_wo_wi(regs, x, hsr.sysreg.read, hsr, 1);
-
 case HSR_SYSREG_CNTP_CTL_EL0:
 case HSR_SYSREG_CNTP_TVAL_EL0:
 case HSR_SYSREG_CNTP_CVAL_EL0:
diff --git a/xen/include/asm-arm/cpregs.h b/xen/include/asm-arm/cpregs.h
index 9db8cfd..e5cb00c 100644
--- a/xen/include/asm-arm/cpregs.h
+++ b/xen/include/asm-arm/cpregs.h
@@ -83,7 +83,9 @@
 #define DBGBVR1 p14,0,c0,c1,4   /* Breakpoint Value 1 */
 #define DBGBCR1 p14,0,c0,c1,5   /* Breakpoint Control 1 */
 #define DBGOSLARp14,0,c1,c0,4   /* OS Lock Access */
+#define DBGOSLSRp14,0,c1,c1,4   /* OS Lock Status Register */
 #define DBGOSDLRp14,0,c1,c3,4   /* OS Double Lock */
+#define DBGPRCR p14,0,c1,c4,4   /* Debug Power Control Register */
 
 /* CP14 CR0: */
 #define TEECR   p14,6,c0,c0,0   /* ThumbEE Configuration Register */
diff --git a/xen/include/asm-arm/sysregs.h b/xen/include/asm-arm/sysregs.h
index 55457fd..570f43e 100644
--- a/xen/include/asm-arm/sysregs.h
+++ b/xen/include/asm-arm/sysregs.h
@@ -47,7 +47,9 @@
 #define HSR_SYSREG_MDSCR_EL1  HSR_SYSREG(2,0,c0,c2,2)
 #define HSR_SYSREG_MDRAR_EL1  HSR_SYSREG(2,0,c1,c0,0)
 #define HSR_SYSREG_OSLAR_EL1  HSR_SYSREG(2,0,c1,c0,4)
+#define HSR_SYSREG_OSLSR_EL1  HSR_SYSREG(2,0,c1,c1,4)
 #define HSR_SYSREG_OSDLR_EL1  HSR_SYSREG(2,0,c1,c3,4)
+#define HSR_SYSREG_DBGPRCR_EL1HSR_SYSREG(2,0,c1,c4,4)
 #define HSR_SYSREG_MDCCSR_EL0 HSR_SYSREG(2,3,c0,c1,0)
 
 #define HSR_SYSREG_DBGBVRn_EL1(n) HSR_SYSREG(2,0,c0,c##n,4)

[Xen-devel] [PATCH v2 07/19] xen: arm: Annotate trap handler for HCR_EL2.{TWI, TWE, TSC}

2015-04-17 Thread Ian Campbell

Reference the bit which enables the trap and the section/page which
describes what that bit enables.

These ones are pretty trivial, included for completeness.

Signed-off-by: Ian Campbell 
---
v2: s/HSR_EL2/HCR_EL2/
---
 xen/arch/arm/traps.c |   17 +
 1 file changed, 17 insertions(+)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 7110c66..7b79990 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2089,6 +2089,12 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs *regs)
 
 switch (hsr.ec) {
 case HSR_EC_WFI_WFE:
+/*
+ * HCR_EL2.TWI, HCR_EL2.TWE
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.9
+ * ARMv8 (DDI 0487A.d): D1-1505 Table D1-51
+ */
 if ( !check_conditional_instr(regs, hsr) )
 {
 advance_pc(regs, hsr);
@@ -2131,6 +2137,12 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs *regs)
 do_cp(regs, hsr);
 break;
 case HSR_EC_SMC32:
+/*
+ * HCR_EL2.TSC
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.8
+ * ARMv8 (DDI 0487A.d): D1-1501 Table D1-44
+ */
 GUEST_BUG_ON(!psr_mode_is_32bit(regs->cpsr));
 perfc_incr(trap_smc32);
 inject_undef32_exception(regs);
@@ -2159,6 +2171,11 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs *regs)
 do_trap_hypercall(regs, &regs->x16, hsr.iss);
 break;
 case HSR_EC_SMC64:
+/*
+ * HCR_EL2.TSC
+ *
+ * ARMv8 (DDI 0487A.d): D1-1501 Table D1-44
+ */
 GUEST_BUG_ON(psr_mode_is_32bit(regs->cpsr));
 perfc_incr(trap_smc64);
 inject_undef64_exception(regs, hsr.len);
-- 
1.7.10.4



[Xen-devel] [PATCH v2 09/19] xen: arm: Annotate registers trapped by HCR_EL2.TIDCP

2015-04-17 Thread Ian Campbell

This traps a variety of implementation-defined registers, so add a note
to the default case of the respective handler.

Signed-off-by: Ian Campbell 
Reviewed-by: Julien Grall 
---
v2: Typo in subject
---
 xen/arch/arm/traps.c |   16 
 1 file changed, 16 insertions(+)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 522701b..d908738 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1704,6 +1704,14 @@ static void do_cp15_32(struct cpu_user_regs *regs,
  */
 return handle_raz_wi(regs, r, cp32.read, hsr, 1);
 
+/*
+ * HCR_EL2.TIDCP
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.3
+ * ARMv8 (DDI 0487A.d): D1-1501 Table D1-43
+ *
+ * And all other unknown registers.
+ */
 default:
 gdprintk(XENLOG_ERR,
  "%s p15, %d, r%d, cr%d, cr%d, %d @ 0x%"PRIregister"\n",
@@ -1954,6 +1962,14 @@ static void do_sysreg(struct cpu_user_regs *regs,
 dprintk(XENLOG_WARNING,
 "Emulation of sysreg ICC_SGI0R_EL1/ASGI1R_EL1 not supported\n");
 return inject_undef64_exception(regs, hsr.len);
+
+/*
+ * HCR_EL2.TIDCP
+ *
+ * ARMv8 (DDI 0487A.d): D1-1501 Table D1-43
+ *
+ * And all other unknown registers.
+ */
 default:
 {
 const struct hsr_sysreg sysreg = hsr.sysreg;
-- 
1.7.10.4



[Xen-devel] [PATCH v2 01/19] xen: arm: constify union hsr and struct hsr_* where possible.

2015-04-17 Thread Ian Campbell

Signed-off-by: Ian Campbell 
Reviewed-by: Julien Grall 
---
 xen/arch/arm/traps.c |   41 +
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index aaa9d93..69b9513 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -447,7 +447,7 @@ static vaddr_t exception_handler64(struct cpu_user_regs *regs, vaddr_t offset)
 static void inject_undef64_exception(struct cpu_user_regs *regs, int instr_len)
 {
 vaddr_t handler;
-union hsr esr = {
+const union hsr esr = {
 .iss = 0,
 .len = instr_len,
 .ec = HSR_EC_UNKNOWN,
@@ -1141,7 +1141,7 @@ int do_bug_frame(struct cpu_user_regs *regs, vaddr_t pc)
 }
 
 #ifdef CONFIG_ARM_64
-static void do_trap_brk(struct cpu_user_regs *regs, union hsr hsr)
+static void do_trap_brk(struct cpu_user_regs *regs, const union hsr hsr)
 {
 /* HCR_EL2.TGE and MDCR_EL2.TDE are not set so we never receive
  * software breakpoint exception for EL1 and EL0 here.
@@ -1488,7 +1488,8 @@ static const unsigned short cc_map[16] = {
 0   /* NV */
 };
 
-static int check_conditional_instr(struct cpu_user_regs *regs, union hsr hsr)
+static int check_conditional_instr(struct cpu_user_regs *regs,
+   const union hsr hsr)
 {
 unsigned long cpsr, cpsr_cond;
 int cond;
@@ -1533,7 +1534,7 @@ static int check_conditional_instr(struct cpu_user_regs *regs, union hsr hsr)
 return 1;
 }
 
-static void advance_pc(struct cpu_user_regs *regs, union hsr hsr)
+static void advance_pc(struct cpu_user_regs *regs, const union hsr hsr)
 {
 unsigned long itbits, cond, cpsr = regs->cpsr;
 
@@ -1574,9 +1575,9 @@ static void advance_pc(struct cpu_user_regs *regs, union hsr hsr)
 }
 
 static void do_cp15_32(struct cpu_user_regs *regs,
-   union hsr hsr)
+   const union hsr hsr)
 {
-struct hsr_cp32 cp32 = hsr.cp32;
+const struct hsr_cp32 cp32 = hsr.cp32;
 uint32_t *r = (uint32_t*)select_user_reg(regs, cp32.reg);
 struct vcpu *v = current;
 
@@ -1659,7 +1660,7 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 }
 
 static void do_cp15_64(struct cpu_user_regs *regs,
-   union hsr hsr)
+   const union hsr hsr)
 {
 if ( !check_conditional_instr(regs, hsr) )
 {
@@ -1676,7 +1677,7 @@ static void do_cp15_64(struct cpu_user_regs *regs,
 break;
 default:
 {
-struct hsr_cp64 cp64 = hsr.cp64;
+const struct hsr_cp64 cp64 = hsr.cp64;
 
 gdprintk(XENLOG_ERR,
  "%s p15, %d, r%d, r%d, cr%d @ 0x%"PRIregister"\n",
@@ -1692,9 +1693,9 @@ static void do_cp15_64(struct cpu_user_regs *regs,
 advance_pc(regs, hsr);
 }
 
-static void do_cp14_32(struct cpu_user_regs *regs, union hsr hsr)
+static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
 {
-struct hsr_cp32 cp32 = hsr.cp32;
+const struct hsr_cp32 cp32 = hsr.cp32;
 uint32_t *r = (uint32_t *)select_user_reg(regs, cp32.reg);
 struct domain *d = current->domain;
 
@@ -1784,9 +1785,9 @@ static void do_cp14_32(struct cpu_user_regs *regs, union hsr hsr)
 advance_pc(regs, hsr);
 }
 
-static void do_cp14_dbg(struct cpu_user_regs *regs, union hsr hsr)
+static void do_cp14_dbg(struct cpu_user_regs *regs, const union hsr hsr)
 {
-struct hsr_cp64 cp64 = hsr.cp64;
+const struct hsr_cp64 cp64 = hsr.cp64;
 
 if ( !check_conditional_instr(regs, hsr) )
 {
@@ -1804,9 +1805,9 @@ static void do_cp14_dbg(struct cpu_user_regs *regs, union hsr hsr)
 inject_undef_exception(regs, hsr.len);
 }
 
-static void do_cp(struct cpu_user_regs *regs, union hsr hsr)
+static void do_cp(struct cpu_user_regs *regs, const union hsr hsr)
 {
-struct hsr_cp cp = hsr.cp;
+const struct hsr_cp cp = hsr.cp;
 
 if ( !check_conditional_instr(regs, hsr) )
 {
@@ -1821,7 +1822,7 @@ static void do_cp(struct cpu_user_regs *regs, union hsr hsr)
 
 #ifdef CONFIG_ARM_64
 static void do_sysreg(struct cpu_user_regs *regs,
-  union hsr hsr)
+  const union hsr hsr)
 {
 register_t *x = select_user_reg(regs, hsr.sysreg.reg);
 
@@ -1918,7 +1919,7 @@ static void do_sysreg(struct cpu_user_regs *regs,
 inject_undef64_exception(regs, hsr.len);
 default:
 {
-struct hsr_sysreg sysreg = hsr.sysreg;
+const struct hsr_sysreg sysreg = hsr.sysreg;
 
 gdprintk(XENLOG_ERR,
  "%s %d, %d, c%d, c%d, %d %s x%d @ 0x%"PRIregister"\n",
@@ -1997,16 +1998,16 @@ done:
 }
 
 static void do_trap_instr_abort_guest(struct cpu_user_regs *regs,
-  union hsr hsr)
+  const union hsr hsr)
 {
 register_t addr = READ_SYSREG(FAR_EL2);
 inject_iabt_exception(regs, addr, hsr.len);
 }
 
 static void d

[Xen-devel] [PATCH v2 08/19] xen: arm: implement handling of ACTLR_EL1 trap

2015-04-17 Thread Ian Campbell

While annotating ACTLR I noticed that we don't appear to handle the
64-bit version of this trap. Do so and annotate everything.

Signed-off-by: Ian Campbell 
---
v2: s/TASC/TACR/ and s/HSR/HCR/
---
 xen/arch/arm/traps.c  |   20 
 xen/include/asm-arm/sysregs.h |1 +
 2 files changed, 21 insertions(+)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 7b79990..522701b 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1653,6 +1653,13 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 if ( !vtimer_emulate(regs, hsr) )
 return inject_undef_exception(regs, hsr);
 break;
+
+/*
+ * HCR_EL2.TACR / HCR.TAC
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.6
+ * ARMv8 (DDI 0487A.d): G6.2.1
+ */
 case HSR_CPREG32(ACTLR):
 if ( psr_mode_is_user(regs) )
 return inject_undef_exception(regs, hsr);
@@ -1855,9 +1862,22 @@ static void do_sysreg(struct cpu_user_regs *regs,
   const union hsr hsr)
 {
 register_t *x = select_user_reg(regs, hsr.sysreg.reg);
+struct vcpu *v = current;
 
 switch ( hsr.bits & HSR_SYSREG_REGS_MASK )
 {
+/*
+ * HCR_EL2.TACR
+ *
+ * ARMv8 (DDI 0487A.d): D7.2.1
+ */
+case HSR_SYSREG_ACTLR_EL1:
+if ( psr_mode_is_user(regs) )
+return inject_undef_exception(regs, hsr);
+if ( hsr.sysreg.read )
+   *x = v->arch.actlr;
+break;
+
 /* RAZ/WI registers: */
 /*  - Debug */
 case HSR_SYSREG_MDSCR_EL1:
diff --git a/xen/include/asm-arm/sysregs.h b/xen/include/asm-arm/sysregs.h
index 2284fd7..d75e154 100644
--- a/xen/include/asm-arm/sysregs.h
+++ b/xen/include/asm-arm/sysregs.h
@@ -72,6 +72,7 @@
   case HSR_SYSREG_##REG##n_EL1(15)
 
 #define HSR_SYSREG_SCTLR_EL1  HSR_SYSREG(3,0,c1, c0,0)
+#define HSR_SYSREG_ACTLR_EL1  HSR_SYSREG(3,0,c1, c0,1)
 #define HSR_SYSREG_TTBR0_EL1  HSR_SYSREG(3,0,c2, c0,0)
 #define HSR_SYSREG_TTBR1_EL1  HSR_SYSREG(3,0,c2, c0,1)
 #define HSR_SYSREG_TCR_EL1    HSR_SYSREG(3,0,c2, c0,2)
-- 
1.7.10.4
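
For readers unfamiliar with how a define such as HSR_SYSREG_ACTLR_EL1
ends up matching a trapped access, here is a minimal sketch. It is not
Xen's actual macro; the helper names are hypothetical. It assumes the
ISS layout for trapped MSR/MRS instructions from the ARMv8 ARM (Op0 in
bits [21:20], Op2 in [19:17], Op1 in [16:14], CRn in [13:10], Rt in
[9:5], CRm in [4:1], direction in bit 0). Masking out Rt and the
direction bit leaves a key that identifies the register only.

```c
#include <stdint.h>

/* Assumed ISS field positions for a trapped MSR/MRS (ARMv8 ARM). */
#define SK_ISS_OP0_SHIFT 20
#define SK_ISS_OP2_SHIFT 17
#define SK_ISS_OP1_SHIFT 14
#define SK_ISS_CRN_SHIFT 10
#define SK_ISS_RT_SHIFT   5
#define SK_ISS_CRM_SHIFT  1
#define SK_ISS_READ       0x1u

/* Hypothetical stand-in for HSR_SYSREG(op0, op1, crn, crm, op2). */
static inline uint32_t sk_sysreg_key(uint32_t op0, uint32_t op1,
                                     uint32_t crn, uint32_t crm,
                                     uint32_t op2)
{
    return (op0 << SK_ISS_OP0_SHIFT) | (op2 << SK_ISS_OP2_SHIFT) |
           (op1 << SK_ISS_OP1_SHIFT) | (crn << SK_ISS_CRN_SHIFT) |
           (crm << SK_ISS_CRM_SHIFT);
}

/* Hypothetical stand-in for HSR_SYSREG_REGS_MASK: all register-id bits. */
static inline uint32_t sk_sysreg_regs_mask(void)
{
    return sk_sysreg_key(3, 7, 15, 15, 7);
}

/* Build an ISS the way hardware would report it: key + Rt + direction. */
static inline uint32_t sk_make_iss(uint32_t key, uint32_t rt, int is_read)
{
    return key | (rt << SK_ISS_RT_SHIFT) | (is_read ? SK_ISS_READ : 0u);
}
```

With this layout, ACTLR_EL1 (3,0,c1,c0,1) and SCTLR_EL1 (3,0,c1,c0,0)
differ only in the Op2 field, which is why the new define sits right
next to the existing one.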


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v2 19/19] xen: arm: Annotate source of ICC SGI register trapping

2015-04-17 Thread Ian Campbell

I was unable to find an ARMv8 ARM reference to this, so refer to the
GIC Architecture Specification instead.

ARMv8 ARM does cover other ways of trapping these accesses via
ICH_HCR_EL2 but we don't use those and they trap additional registers
as well.

Signed-off-by: Ian Campbell 
Reviewed-by: Julien Grall 
---
 xen/arch/arm/traps.c |6 ++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index ad6ff05..6fe9b7a 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2154,6 +2154,12 @@ static void do_sysreg(struct cpu_user_regs *regs,
 return inject_undef_exception(regs, hsr);
 break;
 
+/*
+ * HCR_EL2.FMO or HCR_EL2.IMO
+ *
+ * ARMv8: GIC Architecture Specification (PRD03-GENC-010745 24.0)
+ *Section 4.6.8.
+ */
 case HSR_SYSREG_ICC_SGI1R_EL1:
 if ( !vgic_emulate(regs, hsr) )
 {
-- 
1.7.10.4



[Xen-devel] [PATCH v2 10/19] xen: arm: implement handling of registers trapped by CPTR_EL2.TTA

2015-04-17 Thread Ian Campbell

Add explicit handler for 64-bit CP14 accesses, with more relevant
debug message (as per other handlers) and to provide a place for a
comment.

Signed-off-by: Ian Campbell 
---
v2: Changed title from "xen: arm: Annotate registers trapped by
CPTR_EL2.TTA"
Add "And all other unknown registers" to new annotation.
---
 xen/arch/arm/traps.c |   45 +-
 xen/include/asm-arm/perfc_defn.h |1 +
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index d908738..afa8a95 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1816,6 +1816,15 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
 
 case HSR_CPREG32(DBGOSLAR):
 return handle_wo_wi(regs, r, cp32.read, hsr, 1);
+
+/*
+ * CPTR_EL2.TTA
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.16
+ * ARMv8 (DDI 0487A.d): D1-1507 Table D1-54
+ *
+ * And all other unknown registers.
+ */
 default:
 gdprintk(XENLOG_ERR,
  "%s p14, %d, r%d, cr%d, cr%d, %d @ 0x%"PRIregister"\n",
@@ -1830,7 +1839,7 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
 advance_pc(regs, hsr);
 }
 
-static void do_cp14_dbg(struct cpu_user_regs *regs, const union hsr hsr)
+static void do_cp14_64(struct cpu_user_regs *regs, const union hsr hsr)
 {
 const struct hsr_cp64 cp64 = hsr.cp64;
 
@@ -1840,12 +1849,37 @@ static void do_cp14_dbg(struct cpu_user_regs *regs, const union hsr hsr)
 return;
 }
 
+/*
+ * CPTR_EL2.TTA
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.16
+ * ARMv8 (DDI 0487A.d): D1-1507 Table D1-54
+ */
 gdprintk(XENLOG_ERR,
  "%s p14, %d, r%d, r%d, cr%d @ 0x%"PRIregister"\n",
  cp64.read ? "mrrc" : "mcrr",
  cp64.op1, cp64.reg1, cp64.reg2, cp64.crm, regs->pc);
 gdprintk(XENLOG_ERR, "unhandled 64-bit CP14 access %#x\n",
  hsr.bits & HSR_CP64_REGS_MASK);
+inject_undef_exception(regs, hsr);
+}
+
+static void do_cp14_dbg(struct cpu_user_regs *regs, const union hsr hsr)
+{
+struct hsr_cp64 cp64 = hsr.cp64;
+
+if ( !check_conditional_instr(regs, hsr) )
+{
+advance_pc(regs, hsr);
+return;
+}
+
+gdprintk(XENLOG_ERR,
+ "%s p14, %d, r%d, r%d, cr%d @ 0x%"PRIregister"\n",
+ cp64.read ? "mrrc" : "mcrr",
+ cp64.op1, cp64.reg1, cp64.reg2, cp64.crm, regs->pc);
+gdprintk(XENLOG_ERR, "unhandled 64-bit CP14 DBG access %#x\n",
+ hsr.bits & HSR_CP64_REGS_MASK);
 
 inject_undef_exception(regs, hsr);
 }
@@ -1968,6 +2002,10 @@ static void do_sysreg(struct cpu_user_regs *regs,
  *
  * ARMv8 (DDI 0487A.d): D1-1501 Table D1-43
  *
+ * CPTR_EL2.TTA
+ *
+ * ARMv8 (DDI 0487A.d): D1-1507 Table D1-54
+ *
  * And all other unknown registers.
  */
 default:
@@ -2162,6 +2200,11 @@ asmlinkage void do_trap_hypervisor(struct cpu_user_regs *regs)
 perfc_incr(trap_cp14_32);
 do_cp14_32(regs, hsr);
 break;
+case HSR_EC_CP14_64:
+GUEST_BUG_ON(!psr_mode_is_32bit(regs->cpsr));
+perfc_incr(trap_cp14_64);
+do_cp14_64(regs, hsr);
+break;
 case HSR_EC_CP14_DBG:
 GUEST_BUG_ON(!psr_mode_is_32bit(regs->cpsr));
 perfc_incr(trap_cp14_dbg);
diff --git a/xen/include/asm-arm/perfc_defn.h b/xen/include/asm-arm/perfc_defn.h
index 46015f5..69fabe7 100644
--- a/xen/include/asm-arm/perfc_defn.h
+++ b/xen/include/asm-arm/perfc_defn.h
@@ -9,6 +9,7 @@ PERFCOUNTER(trap_wfe,  "trap: wfe")
 PERFCOUNTER(trap_cp15_32,  "trap: cp15 32-bit access")
 PERFCOUNTER(trap_cp15_64,  "trap: cp15 64-bit access")
 PERFCOUNTER(trap_cp14_32,  "trap: cp14 32-bit access")
+PERFCOUNTER(trap_cp14_64,  "trap: cp14 64-bit access")
 PERFCOUNTER(trap_cp14_dbg, "trap: cp14 dbg access")
 PERFCOUNTER(trap_cp,   "trap: cp access")
 PERFCOUNTER(trap_smc32,"trap: 32-bit smc")
-- 
1.7.10.4
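
The dispatch added to do_trap_hypervisor() keys off the exception class
in the syndrome register. A minimal sketch of that idea follows. It
assumes the EC field occupies bits [31:26] and uses the EC values given
in the ARM ARM for the CP14 trap classes; the enum and function names
are hypothetical, not Xen's.

```c
#include <stdint.h>

/* Exception class values for coprocessor traps (assumed from the ARM ARM). */
enum sk_ec {
    SK_EC_CP15_32  = 0x03,  /* MCR/MRC to CP15 */
    SK_EC_CP15_64  = 0x04,  /* MCRR/MRRC to CP15 */
    SK_EC_CP14_32  = 0x05,  /* MCR/MRC to CP14 */
    SK_EC_CP14_DBG = 0x06,  /* LDC/STC to CP14 */
    SK_EC_CP14_64  = 0x0c   /* MRRC to CP14 */
};

/* Extract the exception class from a raw syndrome value. */
static inline unsigned int sk_hsr_ec(uint32_t hsr)
{
    return (hsr >> 26) & 0x3fu;
}

/* Name the handler that the patch routes each CP14 class to. */
static const char *sk_cp14_handler(uint32_t hsr)
{
    switch ( sk_hsr_ec(hsr) )
    {
    case SK_EC_CP14_32:  return "do_cp14_32";
    case SK_EC_CP14_64:  return "do_cp14_64";
    case SK_EC_CP14_DBG: return "do_cp14_dbg";
    default:             return "not a cp14 trap";
    }
}
```

Before this patch the 64-bit class had no handler of its own, so it had
no dedicated debug message or perf counter.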



[Xen-devel] [PATCH v2 13/19] xen: arm: Annotate registers trapped by MDCR_EL2.TDRA

2015-04-17 Thread Ian Campbell

DBGDRAR and DBGDSAR are actually two cp or sys registers each, one
32-bit and one 64-bit. The cpregs #define is suffixed "64" and
annotations are added to both handlers.

MDRAR_EL1 (arm64 version of DBGDRAR) wasn't handled, so add that here.

Signed-off-by: Ian Campbell 
---
v2: Move comment next to default label where it belongs.
Clarify DBGDRAR vs DBGDRAR64
---
 xen/arch/arm/traps.c  |   28 
 xen/include/asm-arm/cpregs.h  |4 
 xen/include/asm-arm/sysregs.h |1 +
 3 files changed, 33 insertions(+)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 86b5655..17ddcd0 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1852,6 +1852,15 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
  * ARMv7 (DDI 0406C.b): B1.14.16
  * ARMv8 (DDI 0487A.d): D1-1507 Table D1-54
  *
+ * MDCR_EL2.TDRA
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.15
+ * ARMv8 (DDI 0487A.d): D1-1508 Table D1-57
+ *
+ * Unhandled:
+ *DBGDRAR (32-bit accesses)
+ *DBGDSAR (32-bit accesses)
+ *
  * And all other unknown registers.
  */
 default:
@@ -1883,6 +1892,17 @@ static void do_cp14_64(struct cpu_user_regs *regs, const union hsr hsr)
  *
  * ARMv7 (DDI 0406C.b): B1.14.16
  * ARMv8 (DDI 0487A.d): D1-1507 Table D1-54
+ *
+ * MDCR_EL2.TDRA
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.15
+ * ARMv8 (DDI 0487A.d): D1-1508 Table D1-57
+ *
+ * Unhandled:
+ *DBGDRAR (64-bit accesses)
+ *DBGDSAR (64-bit accesses)
+ *
+ * And all other unknown registers.
  */
 gdprintk(XENLOG_ERR,
  "%s p14, %d, r%d, r%d, cr%d @ 0x%"PRIregister"\n",
@@ -1949,6 +1969,14 @@ static void do_sysreg(struct cpu_user_regs *regs,
*x = v->arch.actlr;
 break;
 
+/*
+ * MDCR_EL2.TDRA
+ *
+ * ARMv8 (DDI 0487A.d): D1-1508 Table D1-57
+ */
+case HSR_SYSREG_MDRAR_EL1:
+return handle_ro_raz(regs, x, hsr.sysreg.read, hsr, 1);
+
 /* RAZ/WI registers: */
 /*  - Debug */
 case HSR_SYSREG_MDSCR_EL1:
diff --git a/xen/include/asm-arm/cpregs.h b/xen/include/asm-arm/cpregs.h
index afe9148..9db8cfd 100644
--- a/xen/include/asm-arm/cpregs.h
+++ b/xen/include/asm-arm/cpregs.h
@@ -89,10 +89,14 @@
 #define TEECR   p14,6,c0,c0,0   /* ThumbEE Configuration Register */
 
 /* CP14 CR1: */
+#define DBGDRAR64   p14,0,c1        /* Debug ROM Address Register (64-bit access) */
+#define DBGDRAR     p14,0,c1,c0,0   /* Debug ROM Address Register (32-bit access) */
 #define TEEHBR  p14,6,c1,c0,0   /* ThumbEE Handler Base Register */
 #define JOSCR   p14,7,c1,c0,0   /* Jazelle OS Control Register */
 
 /* CP14 CR2: */
+#define DBGDSAR64   p14,0,c2        /* Debug Self Address Offset Register (64-bit access) */
+#define DBGDSAR     p14,0,c2,c0,0   /* Debug Self Address Offset Register (32-bit access) */
 #define JMCR        p14,7,c2,c0,0   /* Jazelle Main Configuration Register */
 
 
diff --git a/xen/include/asm-arm/sysregs.h b/xen/include/asm-arm/sysregs.h
index d75e154..55457fd 100644
--- a/xen/include/asm-arm/sysregs.h
+++ b/xen/include/asm-arm/sysregs.h
@@ -45,6 +45,7 @@
 #define HSR_SYSREG_DCCISW HSR_SYSREG(1,0,c7,c14,2)
 
 #define HSR_SYSREG_MDSCR_EL1  HSR_SYSREG(2,0,c0,c2,2)
+#define HSR_SYSREG_MDRAR_EL1  HSR_SYSREG(2,0,c1,c0,0)
 #define HSR_SYSREG_OSLAR_EL1  HSR_SYSREG(2,0,c1,c0,4)
 #define HSR_SYSREG_OSDLR_EL1  HSR_SYSREG(2,0,c1,c3,4)
 #define HSR_SYSREG_MDCCSR_EL0 HSR_SYSREG(2,3,c0,c1,0)
-- 
1.7.10.4



[Xen-devel] [PATCH v2 04/19] xen: arm: provide and use a handle_raz_wi helper

2015-04-17 Thread Ian Campbell

Reduces the use of goto in the trap handlers to none.

Some explicitly 32-bit types become register_t here, but that's OK, on
32-bit they are 32-bit already and on 64-bit it is fine/harmless to
set the larger register, a 32-bit guest won't see the top half in any
case.

Per section B1.2.1 (ARMv8 DDI0487 A.d) writes to wN registers are zero
extended, so there is no risk of leaking the top half here.

Unlike the previous code the advancing of PC is handled within the
helper, rather than after the end of the switch as before. So return
as the handler is called.

Signed-off-by: Ian Campbell 
---
v2: Added reference to B1.2.1
---
 xen/arch/arm/traps.c |   51 +-
 1 file changed, 26 insertions(+), 25 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 7270116..8b1846a 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1574,11 +1574,24 @@ static void advance_pc(struct cpu_user_regs *regs, const union hsr hsr)
 regs->pc += hsr.len ? 4 : 2;
 }
 
+/* Read as zero + write ignore */
+static void handle_raz_wi(struct cpu_user_regs *regs,
+  register_t *reg,
+  bool_t read,
+  const union hsr hsr)
+{
+if ( read )
+*reg = 0;
+/* else: write ignored */
+
+advance_pc(regs, hsr);
+}
+
 static void do_cp15_32(struct cpu_user_regs *regs,
const union hsr hsr)
 {
 const struct hsr_cp32 cp32 = hsr.cp32;
-uint32_t *r = (uint32_t*)select_user_reg(regs, cp32.reg);
+register_t *r = select_user_reg(regs, cp32.reg);
 struct vcpu *v = current;
 
 if ( !check_conditional_instr(regs, hsr) )
@@ -1613,14 +1626,14 @@ static void do_cp15_32(struct cpu_user_regs *regs,
 /* RO at EL0. RAZ/WI at EL1 */
 if ( psr_mode_is_user(regs) && !hsr.cp32.read )
 return inject_undef_exception(regs, hsr);
-goto cp15_32_raz_wi;
+return handle_raz_wi(regs, r, cp32.read, hsr);
 
 case HSR_CPREG32(PMINTENSET):
 case HSR_CPREG32(PMINTENCLR):
 /* EL1 only, however MDCR_EL2.TPM==1 means EL0 may trap here also. */
 if ( psr_mode_is_user(regs) )
 return inject_undef_exception(regs, hsr);
-goto cp15_32_raz_wi;
+return handle_raz_wi(regs, r, cp32.read, hsr);
 case HSR_CPREG32(PMCR):
 case HSR_CPREG32(PMCNTENSET):
 case HSR_CPREG32(PMCNTENCLR):
@@ -1639,11 +1652,7 @@ static void do_cp15_32(struct cpu_user_regs *regs,
  */
 if ( psr_mode_is_user(regs) )
 return inject_undef_exception(regs, hsr);
- cp15_32_raz_wi:
-if ( cp32.read )
-*r = 0;
-/* else: write ignored */
-break;
+return handle_raz_wi(regs, r, cp32.read, hsr);
 
 default:
 gdprintk(XENLOG_ERR,
@@ -1694,7 +1703,7 @@ static void do_cp15_64(struct cpu_user_regs *regs,
 static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
 {
 const struct hsr_cp32 cp32 = hsr.cp32;
-uint32_t *r = (uint32_t *)select_user_reg(regs, cp32.reg);
+register_t *r = select_user_reg(regs, cp32.reg);
 struct domain *d = current->domain;
 
 if ( !check_conditional_instr(regs, hsr) )
@@ -1738,12 +1747,11 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
 if ( usr_mode(regs) )
 return inject_undef_exception(regs, hsr);
 
-/* Implement debug status and control register as RAZ/WI.
- * The OS won't use Hardware debug if MDBGen not set
+/*
+ * Implement debug status and control register as RAZ/WI.
+ * The OS won't use Hardware debug if MDBGen not set.
  */
-if ( cp32.read )
-   *r = 0;
-break;
+return handle_raz_wi(regs, r, cp32.read, hsr);
 
 case HSR_CPREG32(DBGVCR):
 case HSR_CPREG32(DBGBVR0):
@@ -1755,10 +1763,7 @@ static void do_cp14_32(struct cpu_user_regs *regs, const union hsr hsr)
 case HSR_CPREG32(DBGOSDLR):
 if ( usr_mode(regs) )
 return inject_undef_exception(regs, hsr);
-/* RAZ/WI */
-if ( cp32.read )
-*r = 0;
-break;
+return handle_raz_wi(regs, r, cp32.read, hsr);
 
 case HSR_CPREG32(DBGOSLAR):
 if ( usr_mode(regs) )
@@ -1845,7 +1850,7 @@ static void do_sysreg(struct cpu_user_regs *regs,
  */
 if ( psr_mode_is_user(regs) )
 return inject_undef_exception(regs, hsr);
-goto sysreg_raz_wi;
+return handle_raz_wi(regs, x, hsr.sysreg.read, hsr);
 
 case HSR_SYSREG_MDCCSR_EL0:
 /*
@@ -1863,7 +1868,7 @@ static void do_sysreg(struct cpu_user_regs *regs,
 /* RO at EL0. RAZ/WI at EL1 */
 if ( psr_mode_is_user(regs) && !hsr.sysreg.read )
 return inject_undef_exception(regs, hsr);
-goto sysreg_raz_wi;
+return handle_raz_wi(regs, x, hsr.sysreg.read, hsr);
 case HSR_SYSREG_PMCR_EL0:
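
The behavioural point of the handle_raz_wi patch (the helper both
emulates the access and advances PC, so callers return right after
calling it) can be shown in a standalone sketch. The types below are
simplified stand-ins rather than Xen's cpu_user_regs or union hsr, and
the PC step ignores the Thumb IT-state handling that the real
advance_pc() also performs.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sk_reg_t;            /* stand-in for register_t on arm64 */

struct sk_regs {
    sk_reg_t pc;
    sk_reg_t x[31];
};

/* Step over the trapped instruction: 4 bytes if 32-bit wide, else 2. */
static void sk_advance_pc(struct sk_regs *regs, bool insn_is_32bit)
{
    regs->pc += insn_is_32bit ? 4 : 2;
}

/* Read as zero + write ignore, then skip the instruction. */
static void sk_handle_raz_wi(struct sk_regs *regs, sk_reg_t *reg,
                             bool read, bool insn_is_32bit)
{
    if ( read )
        *reg = 0;          /* whole register is written, so no stale top half */
    /* else: write ignored */

    sk_advance_pc(regs, insn_is_32bit);
}
```

Writing the full-width register on a read is what makes the move from
uint32_t * to register_t * harmless: the upper half is zeroed rather
than left with whatever was there before.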

Re: [Xen-devel] [PATCH v2 2/4] x86/MSI-X: access MSI-X table only after having enabled MSI-X

2015-04-17 Thread Jan Beulich

>>> On 15.04.15 at 19:41,  wrote:
> On Mon, Apr 13, 2015 at 10:05:14AM +0100, Jan Beulich wrote:
>> >>> On 10.04.15 at 22:02,  wrote:
>> > On Wed, Mar 25, 2015 at 04:39:49PM +, Jan Beulich wrote:
>> >> As done in Linux by f598282f51 ("PCI: Fix the NIU MSI-X problem in a
>> >> better way") and its broken predecessor, make sure we don't access the
>> >> MSI-X table without having enabled MSI-X first, using the mask-all flag
>> >> instead to prevent interrupts from occurring.
>> > 
>> > This causes a regression with a Linux guest that has the XSA120 + XSA120
>> > addendum with PV guests (haven't tried HVM yet).
>> 
>> You mentioning XSA-120 and its addendum - are these requirements
>> for the problem to be seen? I admit I may have tested a PV guest
>> only with an SR-IOV VF (and only a HVM guest also with an "ordinary"
>> device), but I'd like to be clear about the validity of the connection.
> 
> No. I just tried with v4.0-rc5 (and then also v4.0) and just 
> using SR-IOV to make this simpler.
> 
> With staging  + two of your patches:
> a10cc68 TODO: drop //temp-s
> 1b8721c x86/MSI-X: be more careful during teardown
> 
> When trying to enable SR-IOV I get this error:
> 
> failed to echo 1 > 
> /sys/devices/pci:00/:00:01.0/:0a:00.0/sriov_numvfs, rc: 1
> (hadn't tried just passing in an HVM guest).
> 
> Attached is the 'xl dmesg'.

Could you replace the patch I handed you earlier on by this one
and try again? I actually was able to determine that I did try a
(SUSE) PV guest without seeing an issue. I just now tried again,
and I don't see either of the two debug warnings. So this is quite
clearly an indication of a pvops problem.

Jan

x86/MSI-X: access MSI-X table only after having enabled MSI-X

As done in Linux by f598282f51 ("PCI: Fix the NIU MSI-X problem in a
better way") and its broken predecessor, make sure we don't access the
MSI-X table without having enabled MSI-X first, using the mask-all flag
instead to prevent interrupts from occurring.

Signed-off-by: Jan Beulich 
---
v3: temporarily enable MSI-X in setup_msi_irq() if not already enabled

--- unstable.orig/xen/arch/x86/msi.c
+++ unstable/xen/arch/x86/msi.c
@@ -142,6 +142,21 @@ static bool_t memory_decoded(const struc
   PCI_COMMAND_MEMORY);
 }
 
+static bool_t msix_memory_decoded(const struct pci_dev *dev, unsigned int pos)
+{
+u16 control = pci_conf_read16(dev->seg, dev->bus, PCI_SLOT(dev->devfn),
+  PCI_FUNC(dev->devfn), msix_control_reg(pos));
+
+if ( !(control & PCI_MSIX_FLAGS_ENABLE) )
+{//temp
+ static bool_t warned;
+ WARN_ON(!test_and_set_bool(warned));
+return 0;
+}
+
+return memory_decoded(dev);
+}
+
 /*
  * MSI message composition
  */
@@ -219,7 +234,8 @@ static bool_t read_msi_msg(struct msi_de
 void __iomem *base;
 base = entry->mask_base;
 
-if ( unlikely(!memory_decoded(entry->dev)) )
+if ( unlikely(!msix_memory_decoded(entry->dev,
+   entry->msi_attrib.pos)) )
 return 0;
 msg->address_lo = readl(base + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
 msg->address_hi = readl(base + PCI_MSIX_ENTRY_UPPER_ADDR_OFFSET);
@@ -285,7 +301,8 @@ static int write_msi_msg(struct msi_desc
 void __iomem *base;
 base = entry->mask_base;
 
-if ( unlikely(!memory_decoded(entry->dev)) )
+if ( unlikely(!msix_memory_decoded(entry->dev,
+   entry->msi_attrib.pos)) )
 return -ENXIO;
 writel(msg->address_lo,
base + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
@@ -379,7 +396,7 @@ static bool_t msi_set_mask_bit(struct ir
 {
 struct msi_desc *entry = desc->msi_desc;
 struct pci_dev *pdev;
-u16 seg;
+u16 seg, control;
 u8 bus, slot, func;
 
 ASSERT(spin_is_locked(&desc->lock));
@@ -401,35 +418,38 @@ static bool_t msi_set_mask_bit(struct ir
 }
 break;
 case PCI_CAP_ID_MSIX:
+control = pci_conf_read16(seg, bus, slot, func,
+  msix_control_reg(entry->msi_attrib.pos));
+if ( unlikely(!(control & PCI_MSIX_FLAGS_ENABLE)) )
+pci_conf_write16(seg, bus, slot, func,
+ msix_control_reg(entry->msi_attrib.pos),
+ control | (PCI_MSIX_FLAGS_ENABLE |
+PCI_MSIX_FLAGS_MASKALL));
 if ( likely(memory_decoded(pdev)) )
 {
 writel(flag, entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
 readl(entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
-break;
+if ( likely(control & PCI_MSIX_FLAGS_ENABLE) )
+break;
+flag = 1;
 }
-if ( flag )
+else if ( flag && !(control & PCI_MSIX_FLAGS_MASKALL) )
 {
-u16 control;
 domid_t domid = pdev->domain->domain_id;
 
-control = pci
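
The control-register manipulation in the msi_set_mask_bit() hunk above
can be summarised in a small standalone sketch. It assumes the standard
MSI-X Message Control bit positions (enable in bit 15, function mask in
bit 14); the macro and function names are hypothetical and no config
space access is modelled.

```c
#include <stdint.h>

#define SK_MSIX_FLAGS_ENABLE  0x8000u   /* MSI-X Enable */
#define SK_MSIX_FLAGS_MASKALL 0x4000u   /* Function Mask ("mask-all") */

static inline int sk_msix_enabled(uint16_t control)
{
    return (control & SK_MSIX_FLAGS_ENABLE) != 0;
}

/*
 * Value to write to the control register before touching the MSI-X
 * table. If MSI-X is already on, nothing changes. Otherwise enable it
 * with mask-all set, so the table is accessible but no interrupt fires.
 */
static inline uint16_t sk_msix_control_for_table_access(uint16_t control)
{
    if ( sk_msix_enabled(control) )
        return control;
    return (uint16_t)(control | SK_MSIX_FLAGS_ENABLE | SK_MSIX_FLAGS_MASKALL);
}
```

A caller would restore the original control value afterwards when the
enable was only temporary, as the v3 note describes for setup_msi_irq().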

[Xen-devel] [PATCH v2 11/19] xen: arm: Annotate handlers for CPTR_EL2.Tx

2015-04-17 Thread Ian Campbell

Also expand on the comment when writing CPTR_EL2 to mention that most
of the bits we are setting are RES1 on arm64 anyway.

Signed-off-by: Ian Campbell 
---
v2: s/PCTR/CPTR/
Expand the comment when writing to CPTR_EL2
---
 xen/arch/arm/traps.c |   23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index afa8a95..a2bae51 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -110,8 +110,13 @@ void __cpuinit init_traps(void)
 /* Trap CP15 c15 used for implementation defined registers */
 WRITE_SYSREG(HSTR_T(15), HSTR_EL2);
 
-/* Trap all coprocessor registers (0-13) except cp10 and cp11 for VFP
- * /!\ All processors except cp10 and cp11 cannot be used in Xen
+/* Trap all coprocessor registers (0-13) except cp10 and
+ * cp11 for VFP.
+ *
+ * /!\ All coprocessors except cp10 and cp11 cannot be used in Xen.
+ *
+ * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
+ * RES1, i.e. they would trap whether we did this write or not.
  */
 WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) | HCPTR_TTA,
  CPTR_EL2);
@@ -1710,6 +1715,11 @@ static void do_cp15_32(struct cpu_user_regs *regs,
  * ARMv7 (DDI 0406C.b): B1.14.3
  * ARMv8 (DDI 0487A.d): D1-1501 Table D1-43
  *
+ * CPTR_EL2.T{0..9,12..13}
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.12
+ * ARMv8 (DDI 0487A.d): N/A
+ *
  * And all other unknown registers.
  */
 default:
@@ -1741,6 +1751,15 @@ static void do_cp15_64(struct cpu_user_regs *regs,
 if ( !vtimer_emulate(regs, hsr) )
 return inject_undef_exception(regs, hsr);
 break;
+
+/*
+ * CPTR_EL2.T{0..9,12..13}
+ *
+ * ARMv7 (DDI 0406C.b): B1.14.12
+ * ARMv8 (DDI 0487A.d): N/A
+ *
+ * And all other unknown registers.
+ */
 default:
 {
 const struct hsr_cp64 cp64 = hsr.cp64;
-- 
1.7.10.4
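
To make the "(0..9,12,13)" remark in the new comment concrete, this
sketch recomputes the value written to CPTR_EL2. The macro definitions
are assumptions that mirror the names used in the patch (one trap bit
per coprocessor in bits 0..13, TTA in bit 20); they are not copied from
Xen's headers.

```c
#include <stdint.h>

/* Assumed definitions, mirroring HCPTR_CP(), HCPTR_CP_MASK, HCPTR_TTA. */
#define SK_HCPTR_CP(x)    (1u << (x))
#define SK_HCPTR_CP_MASK  ((1u << 14) - 1)
#define SK_HCPTR_TTA      (1u << 20)

/* Trap every coprocessor except cp10/cp11 (VFP), plus trace accesses. */
static inline uint32_t sk_cptr_trap_value(void)
{
    return (SK_HCPTR_CP_MASK & ~(SK_HCPTR_CP(10) | SK_HCPTR_CP(11))) |
           SK_HCPTR_TTA;
}
```

Under these assumptions the result has bits 0..9, 12 and 13 set, bits
10 and 11 clear, and bit 20 set.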



[Xen-devel] [PATCH v2 02/19] xen: arm: Fix handling of ICC_{SGI1R, SGI0R, ASGI1R}_EL1

2015-04-17 Thread Ian Campbell

Having injected an undefined instruction we don't want to also advance
pc. So return.

The ICC_{SGI0R,ASGI1R}_EL1 case was previously missing a break, so
would have fallen through to the default case and injected a second
undef, corrupting SPSR_EL1 and ELR_EL1 for the guest.

Signed-off-by: Ian Campbell 
Reviewed-by: Julien Grall 
---
v2: Remove vestigial second commit message
---
 xen/arch/arm/traps.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 69b9513..99ceaea 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1908,7 +1908,7 @@ static void do_sysreg(struct cpu_user_regs *regs,
 {
 dprintk(XENLOG_WARNING,
 "failed emulation of sysreg ICC_SGI1R_EL1 access\n");
-inject_undef64_exception(regs, hsr.len);
+return inject_undef64_exception(regs, hsr.len);
 }
 break;
 case HSR_SYSREG_ICC_SGI0R_EL1:
@@ -1916,7 +1916,7 @@ static void do_sysreg(struct cpu_user_regs *regs,
 /* TBD: Implement to support secure grp0/1 SGI forwarding */
 dprintk(XENLOG_WARNING,
 "Emulation of sysreg ICC_SGI0R_EL1/ASGI1R_EL1 not supported\n");
-inject_undef64_exception(regs, hsr.len);
+return inject_undef64_exception(regs, hsr.len);
 default:
 {
 const struct hsr_sysreg sysreg = hsr.sysreg;
-- 
1.7.10.4
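
The two bugs described in the commit message (advancing PC after
injecting an undef, and a missing break that falls through to the
default case) are easy to reproduce in a toy model. Everything below is
a hypothetical sketch: the counters stand in for guest-visible side
effects, and the default case returning after injection is an
assumption about the surrounding code.

```c
enum sk_reg { SK_SGI1R, SK_SGI0R, SK_OTHER };

struct sk_state {
    int undef_injected;   /* how many undefs the guest received */
    int pc_advanced;      /* how many times PC was stepped */
};

static void sk_inject_undef(struct sk_state *s) { s->undef_injected++; }

/* Shape of the code before the patch. */
static void sk_sysreg_buggy(struct sk_state *s, enum sk_reg reg, int emulated)
{
    switch ( reg )
    {
    case SK_SGI1R:
        if ( !emulated )
            sk_inject_undef(s);   /* no return: PC is advanced below */
        break;
    case SK_SGI0R:
        sk_inject_undef(s);
        /* fall through - the missing break is the bug */
    default:
        sk_inject_undef(s);
        return;
    }
    s->pc_advanced++;
}

/* Shape of the code after the patch: return right after injecting. */
static void sk_sysreg_fixed(struct sk_state *s, enum sk_reg reg, int emulated)
{
    switch ( reg )
    {
    case SK_SGI1R:
        if ( !emulated )
        {
            sk_inject_undef(s);
            return;
        }
        break;
    case SK_SGI0R:
        sk_inject_undef(s);
        return;
    default:
        sk_inject_undef(s);
        return;
    }
    s->pc_advanced++;
}
```

In the buggy shape an SGI0R access produces two injections, which is
the double undef the commit message says corrupts SPSR_EL1 and ELR_EL1.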



Re: [Xen-devel] crash in efi_runtime_call

2015-04-17 Thread Jan Beulich

>>> On 17.04.15 at 15:40,  wrote:
> I actually did cobble a patch like this, but it is based on Daniel's Multibootv2
> so it won't apply cleanly. See attached patchset with various 'work-arounds'.
> 
> Jan if you are OK with them (well the 'idea' behind them) I can refresh
> it against staging and post them?

Yeah, the 3rd of these patches looks pretty close to what could be
taken right away (largely subject to whether you'd want to follow
Andrew's suggestion to put everything behind a single efi= option,
and perhaps allowing cacheability to be other than UC; neither of
the two would be a strict requirement for acceptance though).

Jan


Re: [Xen-devel] crash in efi_runtime_call

2015-04-17 Thread Andrew Cooper

On 17/04/15 14:40, Konrad Rzeszutek Wilk wrote:
> On Fri, Apr 17, 2015 at 01:54:28PM +0100, Andrew Cooper wrote:
>> On 17/04/15 13:39, Jan Beulich wrote:
>>>>>> On 17.04.15 at 13:59,  wrote:
>>>> On 17/04/15 12:17, Olaf Hering wrote:
>>>>> Since booting xen fails on my ProBook unless I specify "maxcpus=1" I
>>>>> tried the EFI firmware today. To my surprise it boots and finds all
>>>>> cpus. But once some efi driver in dom0 is loaded xen crashes. The same
>>>>> happens with xen-4.4 as included in SLE12.
>>>>>
>>>>> ...
>>>>> (XEN) Xen call trace:
>>>>> (XEN)[] aec1e8e1
>>>>> (XEN)[] efi_runtime_call+0x7f0/0x890
>>>>> (XEN)[] do_platform_op+0x679/0x1670
>>>>> (XEN)[] syscall_enter+0xa9/0xae
>>>>>
>>>>>
>>>>> Can I do anything about it, or is this a firmware bug? I will move the
>>>>> offending efi driver away and try again.
>>>>>
>>>>> Olaf
>>>> This is a firmware bug.
>>> +1 (and I'm surprised how common this is)
>> The bug is present in the reference implementation code, which means it
>> is present in a lot of real firmware.  We have kit from 3 different
>> vendors which are affected, including latest available firmware.
>>
>>>>> (XEN)  1-23fff type=7 attr=000
>>>>> (XEN)  0fec1-0fec10fff type=11 attr=8001
>>>>> (XEN)  0fff4-0fff46fff type=11 attr=8000
>>>>> (XEN) Unknown cachability for MFNs 0xfff40-0xfff46
>>>> This unknown cacheability causes Xen not to make pagetables for the region.
>>>>
>>>> There is a patch or two floating around the list, but currently no
>>>> resolution on the argument it created.
>>>>
>>>> https://github.com/xenserver/xen-4.5.pg/blob/master/master/unknown-cacheability.patch
>>>> is the XenServer fix.
>>> Now that's surely wrong
>> Right or wrong, this is (apparently; I have not checked) what Linux does.
>>
>>>  - if anything, unknown should be treated as
>>> UC (and quite likely specifically in a case like the one Olaf reports here,
>>> as the offending memory range pretty likely is other than normal RAM).
>>> What I'd accept as a patch would be the addition of a command line
>>> option enforcing the mapping of such unknown cacheability areas with
>>> a certain caching type (default then being UC).
>> If I can find some copious free time, I will see about making this happen.
> I actually did cobble a patch like this, but it is based on Daniel's Multibootv2
> so it won't apply cleanly. See attached patchset with various 'work-arounds'.
>
> Jan if you are OK with them (well the 'idea' behind them) I can refresh
> it against staging and post them?

I was planning to make one efi= command line option along the
psr/ept/iommu line, rather than having a large number of top-level
options (and folding our one efi-rs option into it).

But otherwise, that sounds like a plan.

~Andrew
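
For illustration only, the policy Jan describes (map regions of unknown
cacheability with a type chosen on the command line, defaulting to UC)
could look like the sketch below. The attribute bit values are the ones
defined by the UEFI specification; the enum, the function name and the
order in which the bits are tested are assumptions made for this
example, not Xen's code.

```c
#include <stdint.h>

/* UEFI memory descriptor attribute bits. */
#define SK_EFI_MEMORY_UC      0x0000000000000001ULL
#define SK_EFI_MEMORY_WC      0x0000000000000002ULL
#define SK_EFI_MEMORY_WT      0x0000000000000004ULL
#define SK_EFI_MEMORY_WB      0x0000000000000008ULL
#define SK_EFI_MEMORY_RUNTIME 0x8000000000000000ULL

enum sk_cache { SK_CACHE_SKIP, SK_CACHE_UC, SK_CACHE_WC, SK_CACHE_WT, SK_CACHE_WB };

/*
 * Pick a caching type for a runtime region. When firmware advertises
 * none of the cacheability bits, use the caller-supplied fallback:
 * SK_CACHE_SKIP models the current "don't map it" behaviour, while
 * SK_CACHE_UC models the proposed default.
 */
static enum sk_cache sk_pick_cache(uint64_t attr, enum sk_cache fallback)
{
    if ( attr & SK_EFI_MEMORY_WB )
        return SK_CACHE_WB;
    if ( attr & SK_EFI_MEMORY_WT )
        return SK_CACHE_WT;
    if ( attr & SK_EFI_MEMORY_WC )
        return SK_CACHE_WC;
    if ( attr & SK_EFI_MEMORY_UC )
        return SK_CACHE_UC;
    return fallback;
}
```

A region flagged as runtime but carrying no cacheability bits, which is
what the "Unknown cachability" message suggests, would then be mapped
UC rather than left without pagetables.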



Re: [Xen-devel] [PATCH v2 2/3] xen: arm: propagate gic's #interrupt-cells property to dom0.

2015-04-17 Thread Ian Campbell

On Mon, 2015-03-16 at 16:01 +, Julien Grall wrote:
> Hi Ian,
> 
> On 12/03/15 17:17, Ian Campbell wrote:
> > This is similar to 816f5bb1f074 "xen: arm: propagate gic's
> > should propagate (rather than invent our own value) since this value
> > is used to size fields within other properties within the tree.
> > I'm not sure why I didn't do this as part of 816f5bb1f074. I think
> > probably just because #interrupt-cells must always be 3 for a GIC
> > whereas #address-cells can legitimately differ. Regardless, I think we
> > might as well do this in common code.
> 
> Hmmm... We are creating some interrupts ourselves, assuming the number of
> interrupt cells is 3. So it makes sense to hard-code (not really invent)
> the value.

I'll move the addition to common code but leave it as hard coded then.

> 
> > Signed-off-by: Ian Campbell 
> > ---
> >  xen/arch/arm/domain_build.c |   18 +-
> >  xen/arch/arm/gic-hip04.c|4 
> >  xen/arch/arm/gic-v2.c   |4 
> >  xen/arch/arm/gic-v3.c   |4 
> >  4 files changed, 13 insertions(+), 17 deletions(-)
> > 
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index ab4ad65..2a2fc2b 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -784,8 +784,8 @@ static int make_gic_node(const struct domain *d, void *fdt,
> >  {
> >  const struct dt_device_node *gic = dt_interrupt_controller;
> >  int res = 0;
> > -const void *addrcells;
> > -u32 addrcells_len;
> > +const void *cells;
> > +u32 cells_len;
> >  
> >  /*
> >   * Xen currently supports only a single GIC. Discard any secondary
> > @@ -815,10 +815,18 @@ static int make_gic_node(const struct domain *d, void *fdt,
> >  return res;
> >  }
> >  
> > -addrcells = dt_get_property(gic, "#address-cells", &addrcells_len);
> > -if ( addrcells )
> > +cells = dt_get_property(gic, "#address-cells", &cells_len);
> > +if ( cells )
> >  {
> > -res = fdt_property(fdt, "#address-cells", addrcells, addrcells_len);
> > +res = fdt_property(fdt, "#address-cells", cells, cells_len);
> > +if ( res )
> > +return res;
> > +}
> > +
> > +cells = dt_get_property(gic, "#interrupt-cells", &cells_len);
> > +if ( cells )
> > +{
> > +res = fdt_property(fdt, "#interrupt-cells", cells, cells_len);
> 
> The #interrupt-cells property has to be present at all times for the GIC. So I
> don't think it's worth checking whether it is present. Maybe an ASSERT would be
> enough?

With the change discussed above it becomes moot.

> Also, I would check somewhere that the value is effectively 3 otherwise
> we are in trouble for the timer/evtchn interrupt creation. Though, it
> was there before too.

Probably somewhere should but I'm not sure where.

> > diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> > index ab80670..528500a 100644
> > --- a/xen/arch/arm/gic-v3.c
> > +++ b/xen/arch/arm/gic-v3.c
> > @@ -1102,10 +1102,6 @@ static int gicv3_make_dt_node(const struct domain *d,
> >  if ( res )
> >  return res;
> >  
> > -res = fdt_property_cell(fdt, "#interrupt-cells", 3);
> > -if ( res )
> > -return res;
> > -
> >  res = fdt_property(fdt, "interrupt-controller", NULL, 0);
> >  if ( res )
> >  return res;
> > 
> 
> While you are moving #interrupt-cells to common code, could you move
> interrupt-controller too?

I suppose I may as well.

Ian.
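
As background to Julien's point about interrupts that Xen creates itself
assuming three cells: a GIC interrupt specifier in the device tree is
three big-endian 32-bit cells (type, number, flags). The sketch below
encodes one by hand. It does not use libfdt, and all names in it are
hypothetical; the cell meanings follow the standard ARM GIC binding.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_GIC_INTERRUPT_CELLS 3   /* the hard-coded value under discussion */

/* Store one cell in big-endian byte order, as the device tree requires. */
static void sk_put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Encode <type number flags> into buf, which must hold 12 bytes.
 * type: 0 = SPI, 1 = PPI. Returns the number of bytes written.
 */
static size_t sk_encode_gic_irq(uint8_t *buf, uint32_t type,
                                uint32_t number, uint32_t flags)
{
    sk_put_be32(buf + 0, type);
    sk_put_be32(buf + 4, number);
    sk_put_be32(buf + 8, flags);
    return SK_GIC_INTERRUPT_CELLS * sizeof(uint32_t);
}
```

If the node advertised a different #interrupt-cells, a consumer would
misparse any property built this way, which is why the value either has
to stay hard-coded or be checked against 3.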




Re: [Xen-devel] crash in efi_runtime_call

2015-04-17 Thread Konrad Rzeszutek Wilk

On Fri, Apr 17, 2015 at 01:54:28PM +0100, Andrew Cooper wrote:
> On 17/04/15 13:39, Jan Beulich wrote:
> >>>> On 17.04.15 at 13:59,  wrote:
> >> On 17/04/15 12:17, Olaf Hering wrote:
> >>> Since booting xen fails on my ProBook unless I specify "maxcpus=1" I
> >>> tried the EFI firmware today. To my surprise it boots and finds all
> >>> cpus. But once some efi driver in dom0 is loaded xen crashes. The same
> >>> happens with xen-4.4 as included in SLE12.
> >>>
> >>> ...
> >>> (XEN) Xen call trace:
> >>> (XEN)[] aec1e8e1
> >>> (XEN)[] efi_runtime_call+0x7f0/0x890
> >>> (XEN)[] do_platform_op+0x679/0x1670
> >>> (XEN)[] syscall_enter+0xa9/0xae
> >>> 
> >>>
> >>> Can I do anything about it, or is this a firmware bug? I will move the
> >>> offending efi driver away and try again.
> >>>
> >>> Olaf
> >> This is a firmware bug.
> > +1 (and I'm surprised how common this is)
> 
> The bug is present in the reference implementation code, which means it
> is present in a lot of real firmware.  We have kit from 3 different
> vendors which are affected, including latest available firmware.
> 
> >
> >>> (XEN)  1-23fff type=7 attr=000
> >>> (XEN)  0fec1-0fec10fff type=11 attr=8001
> >>> (XEN)  0fff4-0fff46fff type=11 attr=8000
> >>> (XEN) Unknown cachability for MFNs 0xfff40-0xfff46
> >> This unknown cacheability causes Xen not to make pagetables for the region.
> >>
> >> There is a patch or two floating around the list, but currently no
> >> resolution on the argument it created.
> >>
> >> https://github.com/xenserver/xen-4.5.pg/blob/master/master/unknown-cacheability.patch
> >> is the XenServer fix.
> > Now that's surely wrong
> 
> Right or wrong, this is (apparently; I have not checked) what Linux does.
> 
> >  - if anything, unknown should be treated as
> > UC (and quite likely specifically in a case like the one Olaf reports here,
> > as the offending memory range pretty likely is other than normal RAM).
> > What I'd accept as a patch would be the addition of a command line
> > option enforcing the mapping of such unknown cacheability areas with
> > a certain caching type (default then being UC).
> 
> If I can find some copious free time, I will see about making this happen.

I actually did cobble together a patch like this, but it is based on Daniel's
Multibootv2 so it won't apply cleanly. See attached patchset with various
'work-arounds'.

Jan if you are OK with them (well the 'idea' behind them) I can refresh
it against staging and post them?
From 33badf8e314251e9d9c3b768c0b7a34b225aa45c Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk 
Date: Tue, 3 Feb 2015 11:56:33 -0500
Subject: [PATCH 1/3] EFI/early: Implement /noexit to not ExitBootServices

The /noexitboot will inhibit Xen from calling ExitBootServices.
This allows GetNextVariableName to be used in 1-1 mapping mode on the
Lenovo ThinkPad x230.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 xen/arch/x86/efi/efi-boot.h |  2 +-
 xen/common/efi/boot.c   | 15 +++
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
index f50c10a..0fbc4de 100644
--- a/xen/arch/x86/efi/efi-boot.h
+++ b/xen/arch/x86/efi/efi-boot.h
@@ -676,7 +676,7 @@ void __init efi_multiboot2(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable
 setup_efi_pci();
 efi_variables();
 efi_set_gop_mode(gop, gop_mode);
-efi_exit_boot(ImageHandle, SystemTable);
+efi_exit_boot(ImageHandle, SystemTable, 0);
 }
 
 /*
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index d1d06d7..2389a1a 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -86,7 +86,7 @@ static void efi_tables(void);
 static void setup_efi_pci(void);
 static void efi_variables(void);
 static void efi_set_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop, UINTN gop_mode);
-static void efi_exit_boot(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable);
+static void efi_exit_boot(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable, int exit_boot_services);
 
 static const EFI_BOOT_SERVICES *__initdata efi_bs;
 static EFI_HANDLE __initdata efi_ih;
@@ -882,7 +882,7 @@ static void __init efi_set_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop, UINTN gop
 efi_arch_video_init(gop, info_size, mode_info);
 }
 
-static void __init efi_exit_boot(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
+static void __init efi_exit_boot(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable, int exit_boot_services)
 {
 EFI_STATUS status;
 UINTN map_key;
@@ -906,7 +906,10 @@ static void __init efi_exit_boot(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *Syste
 
 efi_arch_pre_exit_boot();
 
-status = efi_bs->ExitBootServices(ImageHandle, map_key);
+if ( exit_boot_services )
+status = efi_bs->ExitBootServices(ImageHandle, map_key);
+else
+status = 0

Re: [Xen-devel] Archiving Xen on ARM and PVOPS subprojects

2015-04-17 Thread Stefano Stabellini

On Wed, 15 Apr 2015, Lars Kurth wrote:
> Hi all,
> I wanted to make the proposal to archive the following two subprojects on the 
> grounds that they completed their goals
> 
> a) http://xenproject.org/developers/teams/pvops.html
> b) http://xenproject.org/developers/teams/arm-hypervisor.html
> 
> In the case of a) the goal was to establish Xen support in Linux, which has 
> been achieved
> In the case of b) the goal was to establish ARM support in the Hypervisor 
> which has been achieved
> 
> According to http://xenproject.org/governance.html we would need to perform 
> an archivation review. In this case, the situation is
> quite clear IMHO and I believe that we do not need to make an extensive case, 
> besides the one above. So my proposal would be to
> just have a committer vote on these. I would be happy if committers and 
> maintainers listed in a), b)
> and http://xenproject.org/developers/teams/hypervisor.html were to just reply 
> with the usual +1, 0, -1 to this thread.
> 
> In practical terms, the following would happen. I would remove a) from the 
> website and merge bits of b)
> into http://xenproject.org/developers/teams/hypervisor.html
> 
> There is no mailing list or repo impact for both projects, as both do not 
> have separate lists.
> 
> We would probably archive some pages in 
> http://wiki.xenproject.org/wiki/Category:PVOPS - I would need David and 
> Konrad's help for
> that. Probably 15 minutes should be fine. Maybe in fact it may make sense to 
> rename the Category to Linux.
> 
> As far as I can see, none of the pages in 
> http://wiki.xenproject.org/wiki/Category:XenARM need to be
> archived. http://wiki.xenproject.org/wiki/Archived/Xen_ARM_(PV) is already 
> archived.
> 
> We would probably also put together a blog post, once all of this has been 
> done and maybe put together a press release (up to
> Sarah and the Advisory Board).
> 
> Any views?

+1 on both counts

Re: [Xen-devel] Question about DMA on 1:1 mapping dom0 of arm64

2015-04-17 Thread Ian Campbell

On Fri, 2015-04-17 at 19:24 +0800, Chen Baozi wrote:
> Hi all,
> 
> According to my recent experience, there might be some problems of swiotlb
> dma map on 1:1 mapping arm64 dom0 with large memory. The issue is like below:
> 
> For those arm64 server with large memory, it is possible to set dom0_mem >
> 4G (e.g. I have one set with 16G). In this case, according to my 
> understanding,
> there is chance that the dom0 kernel needs to map some buffers above 4G to do

 ^below?

> DMA operations (e.g. in snps,dwmac ethernet driver). However, most DMA engines
> support only 32-bit physical address, thus aren't able to operate directly on
> those memory.

Even on arm64 systems with RAM above 4GB? That seems short-sighted.
Oh well, I suppose we have to live with it.

>  IIUC, swiotlb is implemented to solve this (using bounce buffer),
> if there is no IOMMU or IOMMU is not enabled on the system. Sadly, it seems
> that xen_swiotlb_map_page in my dom0 kernel allocates
> (start_dma_addr = 0x94480) the buffers for DMA above 4G which fails
> dma_capable() checking and was then unable to return from 
> xen_swiotlb_map_page()
> successfully.

Have the swiotlb bounce buffers been allocated below 4GB? I suspect that
xen_swiotlb_init is buggy for ARM -- it allocates some random pages and
then swizzles the backing pages for ones < 4G, but that won't work on an
ARM dom0 with a 1:1 mapping, I don't think. Do you see error messages
along those lines?

Essentially I think either xen_swiotlb_fixup is unable to work on ARM,
or the following:
start_dma_addr = xen_virt_to_bus(xen_io_tlb_start);
is returning 1:1 and not reflecting the fixup.

> If I set dom0_mem to a small value (e.g. 512M), which makes all physical 
> memory
> of dom0 below 4G, everything goes fine.

So you are getting allocated memory below 4G?

Your message on IRC suggested you weren't; did you hack around this?

I think we have two options, either xen_swiotlb_init allocates pages
below 4GB (e.g. __GFP_DMA) or we do something to allow xen_swiotlb_fixup
to actually work even on a 1:1 dom0.

Although the first option seems preferable at first glance it has the
shortcoming that it requires dom0 to have some memory below 4GB, which
might not necessarily be the case. The second option seems like it might
be uglier but doesn't suffer from this issue.

Can you please look and find out if the IPA at 0x94480 is actually
backed by 1:1 RAM or if xen_swiotlb_fixup has done its job and updated
things such that the associated PAs are below 4GB?

Ian.

> I am not familiar with swiotlb-xen, so there would be misunderstanding about
> the current situation. Fix me if I did/understood anything wrong.
> 
> Any ideas?
> 
> Cheers,
> 
> Chen Baozi
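
The failure discussed in this thread comes down to a range check: a buffer is only usable for DMA if its last byte still fits under the device's DMA mask. Below is a minimal sketch of that check, assuming a 32-bit capable device. The helper names (dma_bit_mask, buffer_is_dma_capable) are hypothetical and only modelled on the kernel's DMA_BIT_MASK() and dma_capable(); this is not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: mask covering the low n address bits. */
static inline uint64_t dma_bit_mask(unsigned int n)
{
    return n >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/*
 * Hypothetical helper: true if every byte of [addr, addr + size) is
 * addressable by a device whose DMA mask is 'mask'.
 */
static inline bool buffer_is_dma_capable(uint64_t mask, uint64_t addr,
                                         size_t size)
{
    if (size == 0)
        return addr <= mask;
    /* Reject ranges that would wrap around the 64-bit address space. */
    if (addr > UINT64_MAX - (uint64_t)(size - 1))
        return false;
    return addr + (uint64_t)(size - 1) <= mask;
}
```

With a 32-bit mask, a page at 0x80000000 passes while a page at or above 0x100000000 does not, which is why a bounce buffer that a 1:1 mapped dom0 ends up placing above 4GB cannot satisfy such a device.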


Re: [Xen-devel] [PATCHv2 3/6] xen: generic xadd() for ticket locks

2015-04-17 Thread David Vrabel

On 17/04/15 14:09, Ian Campbell wrote:
> On Fri, 2015-04-17 at 13:34 +0100, David Vrabel wrote:
>>
>> Can you use
>>
>>   git://xenbits.xen.org/people/dvrabel/xen.git ticketlocks-v3

git://xenbits.xen.org/people/dvrabel/xen.git ticketlock-v3

> I tried that and it built and booted just fine on both arm32 and arm64.
> 
> I eyeballed the assembly produced via the use of __sync_fetch_and_add
> (for _spin_lock only) and it is exactly what I would have written in my
> own versions.
> 
> I was using gcc 4.8.3 in both cases. For arm64 I'm pretty sure we don't
> want to consider anything earlier.
> 
> For arm32 I have also tried gcc 4.6.3 (Debian Wheezy's compiler) and it
> built and booted, and eyeballing shows the same asm. I think that's the
> earliest we really need to worry about.
> 
> IOW I'm not going to bother with custom versions of these functions on
> ARM. If you wanted you could drop the #ifndef xadd from
> asm-arm/system.h.
> 
> Perhaps it would be useful to add some of the info from my tests
> reported above, or a reference to this mail, to the commit log?
> 
> in either case you can add:
> 
> Acked-by: Ian Campbell 

Thanks, Ian!

David



[Xen-devel] [PATCH 2/2] raisin: introduce seabios component

2015-04-17 Thread Stefano Stabellini

Build SeaBIOS as a separate component.
Pass --with-system-seabios to the xen configure script.

Signed-off-by: Stefano Stabellini 
---
 components/seabios |   57 
 components/series  |1 +
 components/xen |3 ++-
 defconfig  |2 ++
 4 files changed, 62 insertions(+), 1 deletion(-)
 create mode 100644 components/seabios

diff --git a/components/seabios b/components/seabios
new file mode 100644
index 000..960a538
--- /dev/null
+++ b/components/seabios
@@ -0,0 +1,57 @@
+#!/usr/bin/env bash
+
+function seabios_check_package() {
+local DEP_Debian_common="build-essential iasl"
+local DEP_Debian_x86_32="$DEP_Debian_common"
+local DEP_Debian_x86_64="$DEP_Debian_common"
+local DEP_Debian_arm32="$DEP_Debian_common"
+local DEP_Debian_arm64="$DEP_Debian_common"
+
+local DEP_Fedora_common="make gcc acpica-tools"
+local DEP_Fedora_x86_32="$DEP_Fedora_common"
+local DEP_Fedora_x86_64="$DEP_Fedora_common"
+
+
+if [[ $ARCH != "x86_64" && $ARCH != "x86_32" ]]
+then
+echo seabios is only supported on x86_32 and x86_64
+return
+fi
+echo Checking SeaBIOS dependencies
+eval check-package \$DEP_"$DISTRO"_"$ARCH"
+}
+
+
+function seabios_build() {
+if [[ $ARCH != "x86_64" && $ARCH != "x86_32" ]]
+then
+echo seabios is only supported on x86_32 and x86_64
+return
+fi
+
+cd "$BASEDIR"
+git-checkout $SEABIOS_URL $SEABIOS_REVISION seabios-dir
+cd seabios-dir
+$RAISIN_MAKE defconfig
+$RAISIN_MAKE
+cd "$BASEDIR"
+}
+
+function seabios_clean() {
+cd "$BASEDIR"
+if [[ -d seabios-dir ]]
+then
+cd seabios-dir
+$RAISIN_MAKE distclean
+cd ..
+rm -rf seabios-dir
+fi
+}
+
+function seabios_configure() {
+:
+}
+
+function seabios_unconfigure() {
+:
+}
diff --git a/components/series b/components/series
index d21243a..f0f3cfa 100644
--- a/components/series
+++ b/components/series
@@ -1,3 +1,4 @@
+seabios
 xen
 qemu
 qemu_traditional
diff --git a/components/xen b/components/xen
index d150efb..f64afe6 100644
--- a/components/xen
+++ b/components/xen
@@ -24,7 +24,8 @@ function xen_build() {
 git-checkout $XEN_URL $XEN_REVISION xen-dir
 cd xen-dir
 ./configure --prefix=$PREFIX --with-system-qemu=$PREFIX/lib/xen/bin/qemu-system-i386 \
---disable-qemu-traditional --enable-rombios
+--disable-qemu-traditional --enable-rombios \
+--with-system-seabios="$BASEDIR"/seabios-dir/out/bios.bin
 $RAISIN_MAKE
 $RAISIN_MAKE install DESTDIR="$INST_DIR"
 cd "$BASEDIR"
diff --git a/defconfig b/defconfig
index d45e2df..b1a0590 100644
--- a/defconfig
+++ b/defconfig
@@ -16,6 +16,7 @@ DESTDIR=dist
 XEN_URL="git://xenbits.xen.org/xen.git"
 QEMU_URL="git://git.qemu.org/qemu.git"
 QEMU_TRADITIONAL_URL="git://xenbits.xen.org/qemu-xen-unstable.git"
+SEABIOS_URL="git://xenbits.xen.org/seabios.git"
 GRUB_URL="git://git.savannah.gnu.org/grub.git"
 LIBVIRT_URL="git://libvirt.org/libvirt.git"
 
@@ -25,5 +26,6 @@ LIBVIRT_URL="git://libvirt.org/libvirt.git"
 XEN_REVISION="master"
 QEMU_REVISION="master"
 QEMU_TRADITIONAL_REVISION="master"
+SEABIOS_REVISION="master"
 GRUB_REVISION="master"
 LIBVIRT_REVISION="master"
-- 
1.7.10.4



[Xen-devel] [PATCH v2 1/2] raisin: add a component to build qemu_traditional

2015-04-17 Thread Stefano Stabellini

Introduce a component to build qemu-traditional out of xen-unstable.
Do not compile qemu-traditional from xen-unstable by passing the right
command line option to configure.

Signed-off-by: Stefano Stabellini 
---
 components/qemu_traditional |   49 +++
 components/series   |1 +
 components/xen  |3 ++-
 defconfig   |2 ++
 4 files changed, 54 insertions(+), 1 deletion(-)
 create mode 100644 components/qemu_traditional

diff --git a/components/qemu_traditional b/components/qemu_traditional
new file mode 100644
index 000..500cbed
--- /dev/null
+++ b/components/qemu_traditional
@@ -0,0 +1,49 @@
+#!/usr/bin/env bash
+
+function qemu_traditional_check_package() {
+local DEP_Debian_common="build-essential zlib1g-dev pciutils-dev pkg-config \
+  libncurses5-dev"
+local DEP_Debian_x86_32="$DEP_Debian_common"
+local DEP_Debian_x86_64="$DEP_Debian_common"
+local DEP_Debian_arm32="$DEP_Debian_common"
+local DEP_Debian_arm64="$DEP_Debian_common"
+
+local DEP_Fedora_common="make gcc zlib-devel ncurses-devel pciutils-devel"
+local DEP_Fedora_x86_32="$DEP_Fedora_common"
+local DEP_Fedora_x86_64="$DEP_Fedora_common"
+
+echo Checking QEMU dependencies
+eval check-package \$DEP_"$DISTRO"_"$ARCH"
+}
+
+function qemu_traditional_build() {
+cd "$BASEDIR"
+git-checkout $QEMU_TRADITIONAL_URL $QEMU_TRADITIONAL_REVISION qemu_traditional-dir
+cd qemu_traditional-dir
+
+export CONFIG_BLKTAP1=n
+export XEN_ROOT="$BASEDIR"/xen-dir
+./xen-setup
+$RAISIN_MAKE all
+$RAISIN_MAKE install DESTDIR="$INST_DIR"
+cd "$BASEDIR"
+}
+
+function qemu_traditional_clean() {
+cd "$BASEDIR"
+if [[ -d qemu_traditional-dir ]]
+then
+cd qemu_traditional-dir
+$RAISIN_MAKE distclean
+cd ..
+rm -rf qemu_traditional-dir
+fi
+}
+
+function qemu_traditional_configure() {
+:
+}
+
+function qemu_traditional_unconfigure() {
+:
+}
diff --git a/components/series b/components/series
index 8f614f0..d21243a 100644
--- a/components/series
+++ b/components/series
@@ -1,4 +1,5 @@
 xen
 qemu
+qemu_traditional
 grub
 libvirt
diff --git a/components/xen b/components/xen
index f8959be..d150efb 100644
--- a/components/xen
+++ b/components/xen
@@ -23,7 +23,8 @@ function xen_build() {
 cd "$BASEDIR"
 git-checkout $XEN_URL $XEN_REVISION xen-dir
 cd xen-dir
-./configure --prefix=$PREFIX --with-system-qemu=$PREFIX/lib/xen/bin/qemu-system-i386
+./configure --prefix=$PREFIX --with-system-qemu=$PREFIX/lib/xen/bin/qemu-system-i386 \
+--disable-qemu-traditional --enable-rombios
 $RAISIN_MAKE
 $RAISIN_MAKE install DESTDIR="$INST_DIR"
 cd "$BASEDIR"
diff --git a/defconfig b/defconfig
index 23c76eb..d45e2df 100644
--- a/defconfig
+++ b/defconfig
@@ -15,6 +15,7 @@ DESTDIR=dist
 #LIBVIRT_URL="https://gitorious.org/libvirt/libvirt.git";
 XEN_URL="git://xenbits.xen.org/xen.git"
 QEMU_URL="git://git.qemu.org/qemu.git"
+QEMU_TRADITIONAL_URL="git://xenbits.xen.org/qemu-xen-unstable.git"
 GRUB_URL="git://git.savannah.gnu.org/grub.git"
 LIBVIRT_URL="git://libvirt.org/libvirt.git"
 
@@ -23,5 +24,6 @@ LIBVIRT_URL="git://libvirt.org/libvirt.git"
 # Grub and Libvirt needs Xen to build and run.
 XEN_REVISION="master"
 QEMU_REVISION="master"
+QEMU_TRADITIONAL_REVISION="master"
 GRUB_REVISION="master"
 LIBVIRT_REVISION="master"
-- 
1.7.10.4



[Xen-devel] [PATCH 0/2] raisin: build qemu-traditional and seabios

2015-04-17 Thread Stefano Stabellini

Hi all,

This patch series builds qemu-traditional and seabios separately from the
Xen tree. It also changes the QEMU build to be more Xen specific,
installing the QEMU binary under /usr/lib/xen/bin.


Changes compared to the previous version of the qemu-traditional patch:
- --enable-rombios (otherwise automatically disabled by
  --disable-qemu-traditional)
- suppress unhelpful echo in configure and unconfigure



Stefano Stabellini (2):
  raisin: add a component to build qemu_traditional
  raisin: introduce seabios component

 components/qemu_traditional |   49 +
 components/seabios  |   57 +++
 components/series   |2 ++
 components/xen  |4 ++-
 defconfig   |4 +++
 5 files changed, 115 insertions(+), 1 deletion(-)
 create mode 100644 components/qemu_traditional
 create mode 100644 components/seabios



Cheers,

Stefano


Re: [Xen-devel] [PATCHv2 3/6] xen: generic xadd() for ticket locks

2015-04-17 Thread Ian Campbell

On Fri, 2015-04-17 at 13:34 +0100, David Vrabel wrote:
> On 17/04/15 13:32, Ian Campbell wrote:
> > On Thu, 2015-04-16 at 16:28 +0100, Jan Beulich wrote:
> > >>> On 10.04.15 at 16:19,  wrote:
> >>> +#define xadd(ptr, v) generic_xaddl((ptr), (v))
> >>
> >> I think it is at least confusing to call the thing xadd (looking to be
> >> size generic) and then expand to generic_xaddl (only supporting
> >> 32-bit operations), yet subsequently implementing a size-generic
> >> xadd() for x86.
> > 
> > Indeed, and I went to build on arm32 prior to hacking up a proper xadd
> > and:
> > 
> > spinlock.c: In function ‘_spin_lock’:
> > spinlock.c:145:5: error: passing argument 1 of ‘generic_xaddl’ from 
> > incompatible pointer type [-Werror]
> >  tickets.head_tail = xadd(&lock->tickets, tickets.head_tail);
> >  ^
> > spinlock.c:15:12: note: expected ‘volatile u32 *’ but argument is of type 
> > ‘union spinlock_tickets_t *’
> >  static u32 generic_xaddl(volatile u32 *ptr, u32 v)
> > ^
> > 
> > (I hope to knock up the arm asm version in the next hour or so, so you
> > may not care...)
> 
> Can you use
> 
>   git://xenbits.xen.org/people/dvrabel/xen.git ticketlocks-v3
> 
> as a base instead?

I tried that and it built and booted just fine on both arm32 and arm64.

I eyeballed the assembly produced via the use of __sync_fetch_and_add
(for _spin_lock only) and it is exactly what I would have written in my
own versions.

I was using gcc 4.8.3 in both cases. For arm64 I'm pretty sure we don't
want to consider anything earlier.

For arm32 I have also tried gcc 4.6.3 (Debian Wheezy's compiler) and it
built and booted, and eyeballing shows the same asm. I think that's the
earliest we really need to worry about.

IOW I'm not going to bother with custom versions of these functions on
ARM. If you wanted you could drop the #ifndef xadd from
asm-arm/system.h.

Perhaps it would be useful to add some of the info from my tests
reported above, or a reference to this mail, to the commit log?

in either case you can add:

Acked-by: Ian Campbell 

to the patch below. 

commit b08cf3fa4791d7ff0d01fb932192e02078ce670a
Author: David Vrabel 
Date:   Thu Apr 16 15:31:18 2015 +0100

arm: provide xadd()

xadd() atomically adds a value and returns the previous value.  This
is needed to implement ticket locks.

This generic arm implementation uses the GCC __sync_fetch_and_add()
builtin, but an arm32 or arm64 specific variant could be provided in the
future (e.g., if required to support older versions of GCC).

Signed-off-by: David Vrabel 

diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
index ce3d38a..f037e84 100644
--- a/xen/include/asm-arm/system.h
+++ b/xen/include/asm-arm/system.h
@@ -51,6 +51,10 @@
 # error "unknown ARM variant"
 #endif

+#ifndef xadd
+#  define xadd(x, v) __sync_fetch_and_add(x, v)
+#endif
+
 extern struct vcpu *__context_switch(struct vcpu *prev, struct vcpu *next);

 #endif

Ian.
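
For readers following along, here is a rough sketch of how a fetch-and-add primitive like the xadd() above is typically used to build a ticket lock. The xadd() definition matches the patch; the struct layout and the function names are hypothetical and are not Xen's spinlock_tickets_t or _spin_lock(). It assumes a GCC-compatible compiler that provides the __sync builtins.

```c
#include <stdint.h>

/* As in the patch: atomically add v to *x and return the previous value. */
#define xadd(x, v) __sync_fetch_and_add(x, v)

/* Hypothetical ticket lock layout, for illustration only. */
struct ticket_lock {
    volatile uint16_t head;  /* ticket currently being served */
    volatile uint16_t tail;  /* next ticket to hand out */
};

static inline void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Take a ticket; the old tail value is our place in the queue. */
    uint16_t me = xadd(&l->tail, 1);

    /* Wait until our ticket is the one being served. */
    while (l->head != me)
        ;

    /* Keep the critical section from being reordered before the wait. */
    __sync_synchronize();
}

static inline void ticket_lock_release(struct ticket_lock *l)
{
    /* Serve the next ticket; the builtin also acts as a full barrier. */
    (void)xadd(&l->head, 1);
}
```

Waiters are served in the order in which they took their tickets, which is the fairness property ticket locks are chosen for.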


Re: [Xen-devel] crash in efi_runtime_call

2015-04-17 Thread Andrew Cooper

On 17/04/15 13:39, Jan Beulich wrote:
>>> On 17.04.15 at 13:59,  wrote:
>> On 17/04/15 12:17, Olaf Hering wrote:
>>> Since booting xen fails on my ProBook unless I specify "maxcpus=1" I
>>> tried the EFI firmware today. To my surprise it boots and finds all
>>> cpus. But once some efi driver in dom0 is loaded xen crashes. The same
>>> happens with xen-4.4 as included in SLE12.
>>>
>>> ...
>>> (XEN) Xen call trace:
>>> (XEN)[] aec1e8e1
>>> (XEN)[] efi_runtime_call+0x7f0/0x890
>>> (XEN)[] do_platform_op+0x679/0x1670
>>> (XEN)[] syscall_enter+0xa9/0xae
>>> 
>>>
>>> Can I do anything about it, or is this a firmware bug? I will move the
>>> offending efi driver away and try again.
>>>
>>> Olaf
>> This is a firmware bug.
> +1 (and I'm surprised how common this is)

The bug is present in the reference implementation code, which means it
is present in a lot of real firmware.  We have kit from 3 different
vendors which are affected, including latest available firmware.

>
>>> (XEN)  1-23fff type=7 attr=000
>>> (XEN)  0fec1-0fec10fff type=11 attr=8001
>>> (XEN)  0fff4-0fff46fff type=11 attr=8000
>>> (XEN) Unknown cachability for MFNs 0xfff40-0xfff46
>> This unknown cacheability causes Xen not to make pagetables for the region.
>>
>> There is a patch or two floating around the list, but currently no
>> resolution on the argument it created.
>>
>> https://github.com/xenserver/xen-4.5.pg/blob/master/master/unknown-cacheability.patch
>> is the XenServer fix.
> Now that's surely wrong

Right or wrong, this is (apparently; I have not checked) what Linux does.

>  - if anything, unknown should be treated as
> UC (and quite likely specifically in a case like the one Olaf reports here,
> as the offending memory range pretty likely is other than normal RAM).
> What I'd accept as a patch would be the addition of a command line
> option enforcing the mapping of such unknown cacheability areas with
> a certain caching type (default then being UC).

If I can find some copious free time, I will see about making this happen.

~Andrew


Re: [Xen-devel] crash in efi_runtime_call

2015-04-17 Thread Jan Beulich

>>> On 17.04.15 at 13:59,  wrote:
> On 17/04/15 12:17, Olaf Hering wrote:
>> Since booting xen fails on my ProBook unless I specify "maxcpus=1" I
>> tried the EFI firmware today. To my surprise it boots and finds all
>> cpus. But once some efi driver in dom0 is loaded xen crashes. The same
>> happens with xen-4.4 as included in SLE12.
>>
>> ...
>> (XEN) Xen call trace:
>> (XEN)[] aec1e8e1
>> (XEN)[] efi_runtime_call+0x7f0/0x890
>> (XEN)[] do_platform_op+0x679/0x1670
>> (XEN)[] syscall_enter+0xa9/0xae
>> 
>>
>> Can I do anything about it, or is this a firmware bug? I will move the
>> offending efi driver away and try again.
>>
>> Olaf
> 
> This is a firmware bug.

+1 (and I'm surprised how common this is)

>> (XEN)  1-23fff type=7 attr=000
>> (XEN)  0fec1-0fec10fff type=11 attr=8001
>> (XEN)  0fff4-0fff46fff type=11 attr=8000
>> (XEN) Unknown cachability for MFNs 0xfff40-0xfff46
> 
> This unknown cacheability causes Xen not to make pagetables for the region.
> 
> There is a patch or two floating around the list, but currently no
> resolution on the argument it created.
> 
> https://github.com/xenserver/xen-4.5.pg/blob/master/master/unknown-cacheability.patch
> is the XenServer fix.

Now that's surely wrong - if anything, unknown should be treated as
UC (and quite likely specifically in a case like the one Olaf reports here,
as the offending memory range pretty likely is other than normal RAM).
What I'd accept as a patch would be the addition of a command line
option enforcing the mapping of such unknown cacheability areas with
a certain caching type (default then being UC).

Jan
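
To make the suggestion above concrete, here is a rough sketch of what "use the firmware's attribute when there is one, otherwise fall back to a configurable default" could look like. The EFI_MEMORY_* values are the cacheability bits defined by the UEFI specification; the enum, the function name, and the preference order among the bits are assumptions made for illustration and are not Xen code.

```c
#include <stdint.h>

/* Cacheability attribute bits of an EFI memory descriptor (UEFI spec). */
#define EFI_MEMORY_UC 0x1ULL
#define EFI_MEMORY_WC 0x2ULL
#define EFI_MEMORY_WT 0x4ULL
#define EFI_MEMORY_WB 0x8ULL

/* Hypothetical caching types a hypervisor might map a region with. */
enum cache_type { CACHE_UC, CACHE_WC, CACHE_WT, CACHE_WB };

/*
 * Hypothetical helper: pick a caching type from the descriptor's attribute
 * bits. If the firmware set none of the cacheability bits, return the
 * caller's fallback (e.g. one chosen on the command line, defaulting to UC).
 */
static enum cache_type pick_cache_type(uint64_t attr, enum cache_type fallback)
{
    if (attr & EFI_MEMORY_WB)
        return CACHE_WB;
    if (attr & EFI_MEMORY_WT)
        return CACHE_WT;
    if (attr & EFI_MEMORY_WC)
        return CACHE_WC;
    if (attr & EFI_MEMORY_UC)
        return CACHE_UC;
    return fallback;
}
```

A descriptor that carries only the runtime bit and no cacheability bits would take the fallback path here, instead of being left unmapped.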

