date:20160126

Re: [Xen-devel] [PATCH 2/2] tools: avoid redefinining xenevtchn_handle typedef for xc_suspend_*

2016-01-26 Thread Ian Campbell

On Mon, 2016-01-25 at 12:36 -0500, Boris Ostrovsky wrote:
> On 01/25/2016 12:22 PM, Ian Jackson wrote:
> > Ian Campbell writes ("[PATCH 2/2] tools: avoid redefinining
> > xenevtchn_handle typedef for xc_suspend_*"):
> > > Similar to the previous xentoollog case this is not allowed. Switch
> > > to
> > > a forward decl of the struct and use of it in the APIs.
> > Both of these
> > 
> > Acked-by: Ian Jackson 
> > Committed-by: Ian Jackson 
> 
> 
> It's too late by now, so really FYI:
> 
> Tested-by: Boris Ostrovsky 
> 
> (for all three patches)

Thanks Boris and Olaf for testing. Hopefully that's the last of this batch
of issues.

/me stands on one leg, touches wood, crosses fingers.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
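
A minimal sketch of the approach the patch description names (forward-declare the struct and use it directly in the API, rather than repeating a typedef). The names `struct demo_evtchn_handle` and `demo_suspend_port()` and the struct layout are hypothetical, chosen only for illustration; they are not the actual libxc declarations.

```c
#include <stddef.h>

/*
 * Two headers each need to mention the handle type. Repeating
 *     typedef struct demo_evtchn_handle demo_evtchn_handle;
 * in both is a typedef redefinition, which C99 and earlier reject
 * (C11 permits it). Repeating a plain forward declaration is always fine.
 */
struct demo_evtchn_handle;   /* as if from header A */
struct demo_evtchn_handle;   /* as if from header B: harmless repeat */

/* Hypothetical API: takes the struct pointer instead of the typedef name. */
int demo_suspend_port(struct demo_evtchn_handle *xce);

/* Implementation side: only here is the layout known (made up for the demo). */
struct demo_evtchn_handle {
    int port;
};

int demo_suspend_port(struct demo_evtchn_handle *xce)
{
    return xce != NULL ? xce->port : -1;
}
```

Callers that only pass the pointer around never need the full definition, which is why the forward declaration is enough in the public headers.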

Re: [Xen-devel] pre Sandy bridge IOMMU support (gm45)

2016-01-26 Thread Jan Beulich

>>> On 25.01.16 at 22:49,  wrote:
> The case is 1) disabling iommu for IGD, unilaterally since i915 + gm45
> doesn't play well together. Iommu is still desired to isolate usb and
> network devices, so we don't want to disable iommu completely. The side
> effect of this would be to have IGD only for dom0, which would also
> completely make sense in this use case.
> 
> The point is the iommu=no-igfx doesn't fix the issue, since remapping seems
> to still happen for IGD. Does that make sense ?

It certainly may make sense, just that in what you have written so
far I don't think I've been able to spot any evidence thereof. Since,
as you say, nothing interesting gets logged by Xen, you must be
drawing this conclusion from something (or else you wouldn't say
"doesn't fix the issue").

Jan


[Xen-devel] [PATCH] arm: p2m.c bug-fix: hypervisor hang on __p2m_get_mem_access

2016-01-26 Thread Corneliu ZUZU

When __p2m_get_mem_access gets called, the p2m lock is already taken
by either get_page_from_gva or p2m_get_mem_access.

Possible code paths:
1)  -> get_page_from_gva
      -> p2m_mem_access_check_and_get_page
        -> __p2m_get_mem_access
2)  -> p2m_get_mem_access
      -> __p2m_get_mem_access

In both cases if __p2m_get_mem_access subsequently gets to
call p2m_lookup (happens if !radix_tree_lookup(...)), a hypervisor
hang will occur, since p2m_lookup also spin-locks on the p2m lock.

This bug-fix simply replaces the p2m_lookup call from __p2m_get_mem_access
with a call to __p2m_lookup.

Signed-off-by: Corneliu ZUZU 
---
 xen/arch/arm/p2m.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 2190908..a9157e5 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -490,7 +490,7 @@ static int __p2m_get_mem_access(struct domain *d, gfn_t gfn,
  * No setting was found in the Radix tree. Check if the
  * entry exists in the page-tables.
  */
-paddr_t maddr = p2m_lookup(d, gfn_x(gfn) << PAGE_SHIFT, NULL);
+paddr_t maddr = __p2m_lookup(d, gfn_x(gfn) << PAGE_SHIFT, NULL);
 if ( INVALID_PADDR == maddr )
 return -ESRCH;
 
-- 
2.5.0
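
The hang described in the commit message can be sketched with a toy model, assuming a non-recursive lock and a locking/unlocked pair of lookup helpers. Every `demo_*` name here is hypothetical and nothing below is Xen code; the toy lock reports a second acquisition instead of spinning so that the problem is observable rather than fatal.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_WOULD_DEADLOCK (-2L)

/* Toy non-recursive lock: a second acquisition fails instead of spinning. */
struct demo_lock { bool held; };

bool demo_lock_acquire(struct demo_lock *l)
{
    if (l->held)
        return false;          /* a real spinlock would spin here forever */
    l->held = true;
    return true;
}

void demo_lock_release(struct demo_lock *l) { l->held = false; }

struct demo_p2m {
    struct demo_lock lock;
    long table[4];
};

/* Unlocked variant: the caller must already hold p2m->lock. */
long demo_lookup_locked(const struct demo_p2m *p2m, size_t idx)
{
    return idx < 4 ? p2m->table[idx] : -1L;
}

/* Locking variant: takes and drops the lock itself. */
long demo_lookup(struct demo_p2m *p2m, size_t idx)
{
    long v;

    if (!demo_lock_acquire(&p2m->lock))
        return DEMO_WOULD_DEADLOCK;
    v = demo_lookup_locked(p2m, idx);
    demo_lock_release(&p2m->lock);
    return v;
}

/* Shape of the bug: the lock is held, then the locking variant is called. */
long demo_get_access_buggy(struct demo_p2m *p2m, size_t idx)
{
    long v;

    if (!demo_lock_acquire(&p2m->lock))
        return DEMO_WOULD_DEADLOCK;
    v = demo_lookup(p2m, idx);         /* tries to take the lock again */
    demo_lock_release(&p2m->lock);
    return v;
}

/* Shape of the fix: call the unlocked variant while holding the lock. */
long demo_get_access_fixed(struct demo_p2m *p2m, size_t idx)
{
    long v;

    if (!demo_lock_acquire(&p2m->lock))
        return DEMO_WOULD_DEADLOCK;
    v = demo_lookup_locked(p2m, idx);
    demo_lock_release(&p2m->lock);
    return v;
}
```

With a real spinlock the buggy variant never returns, which matches the hypervisor hang the patch reports.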



[Xen-devel] xen-4.6: xenstored crashes during domain->interface access

2016-01-26 Thread Stefan Bader

Hi,

while playing around with xen-4.6 I stumbled over an odd problem and am
wondering whether anybody has seen the same. A method to relatively quickly
reproduce this for me seems to be:

- Start one domU (PV or HVM does not seem to matter)
- Repeatedly call xenstore-ls a few times

I think I never got beyond 10 repeats when the xenstore-ls call suddenly locks
up and xenstored crashes with a SIGBUS error. In the majority of cases (I think
I saw one different), the crash happens while accessing conn->domain->interface
in tools/xenstore/xenstored_domain.c:domain_can_read().
Looking at the corefile produced by xenstored I now got at least one case where
the pointer still matches the previously mapped value. Though I think I had also
at least one run (with less debugging added) where it seemed to be really wrong.
There is more info at [1] in case someone is interested.

I need to repeat a few more times to see how consistent the whole thing is. Does
this happen for anybody else? Any advice what I should look at (in the sense of
gathering better data)?

Thanks,
Stefan

[1] https://bugs.launchpad.net/ubuntu/+source/xen/+bug/1538049




Re: [Xen-devel] Error booting Xen

2016-01-26 Thread Dario Faggioli

On Mon, 2016-01-25 at 06:42 -0700, Jan Beulich wrote:
> > > > On 21.01.16 at 16:14,  wrote:
> > On Wed, 2016-01-20 at 03:06 -0700, Jan Beulich wrote:

> > > But of course another question then is why the XRSTORS faults
> > > in the first place. I guess we'll need a debugging patch to dump
> > > the full state to understand that.
> > > 
> > If someone can produce and send such patch, I'm sure Harmandeep
> > will be
> > happy to give it a try on her hardware.
> 
> So here you go. Instead of a debugging one, I hope I have at
> once fixed the issue in a suitable way. Whether we'd like to keep
> the debugging output we can decide later on.
> 
Great, thanks!

> Both patches need to be applied; while the order shouldn't matter,
> the alignment one is a prereq to the actual change.
> 
Ok. Harmandeep, can you give these patches a try when you get a chance?
(contact me, by email or on IRC, if you need help doing so).

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




Re: [Xen-devel] [PATCH v3 14/17] XEN: EFI: Move x86 specific codes to architecture directory

2016-01-26 Thread Shannon Zhao



On 2016/1/26 19:31, Stefano Stabellini wrote:
> On Tue, 26 Jan 2016, Shannon Zhao wrote:
>> > On 2016/1/26 0:44, Stefano Stabellini wrote:
>>> > > On Sat, 23 Jan 2016, Shannon Zhao wrote:
> > >> > From: Shannon Zhao 
> > >> > 
> > >> > Move x86 specific codes to architecture directory and export those 
> > >> > EFI
> > >> > runtime service functions. This will be useful for initializing 
> > >> > runtime
> > >> > service on ARM later.
> > >> > 
> > >> > Signed-off-by: Shannon Zhao 
>>> > > This patch causes a build breakage on x86:
>>> > > 
>>> > > arch/x86/xen/efi.c: In function ‘xen_efi_probe’:
>>> > > arch/x86/xen/efi.c:101:2: error: implicit declaration of function 
>>> > > ‘HYPERVISOR_platform_op’ [-Werror=implicit-function-declaration]
>>> > > 
>> > This patch is based on following patch [1]. Maybe you need to update
>> > your branch. :)
>> > 
>> > [1] xen: rename dom0_op to platform_op
>> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=cfafae940381207d48b11a73a211142dba5947d3
> Sorry, I made a mistake rebasing the series. It doesn't help that I
> couldn't find any Linux RCs on top of which it applies cleanly. As Linus
> tags RCs often enough, it is usually helpful to base one's work on a tag
> so that it is easier for other people to work on it.
>
Ah, I'll rebase this series on top of the newest master at next version
since the relevant patches are in master.

> Please add my Reviewed-by to this patch.
Thanks a lot!

-- 
Shannon



Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-26 Thread Peter Zijlstra

On Mon, Jan 25, 2016 at 10:03:22PM -0800, Paul E. McKenney wrote:
> On Mon, Jan 25, 2016 at 04:42:43PM +, Will Deacon wrote:
> > On Fri, Jan 15, 2016 at 01:58:53PM -0800, Paul E. McKenney wrote:
> > > On Fri, Jan 15, 2016 at 10:27:14PM +0100, Peter Zijlstra wrote:

> > > > Yes, that seems a good start. But yesterday you raised the 'fun' point
> > > > of two globally ordered sequences connected by a single local link.
> > > 
> > > The conclusion that I am slowly coming to is that litmus tests should
> > > not be thought of as linear chains, but rather as cycles.  If you think
> > > of it as a cycle, then it doesn't matter where the local link is, just
> > > how many of them and how they are connected.
> > 
> > Do you have some examples of this? I'm struggling to make it work in my
> > mind, or are you talking specifically in the context of the kernel
> > memory model?
> 
> Now that you mention it, maybe it would be best to keep the transitive
> and non-transitive separate for the time being anyway.  Just because it
> might be possible to deal with does not necessarily mean that we should
> be encouraging it.  ;-)

So isn't smp_mb__after_unlock_lock() exactly such a scenario? And would
not someone trying to implement RCsc locks using locally transitive
RELEASE/ACQUIRE operations need exactly this stuff?

That is, I am afraid we need to cover the mix of local and global
transitive operations at least in overview.


Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-26 Thread Will Deacon

On Mon, Jan 25, 2016 at 10:03:22PM -0800, Paul E. McKenney wrote:
> On Mon, Jan 25, 2016 at 04:42:43PM +, Will Deacon wrote:
> > On Fri, Jan 15, 2016 at 01:58:53PM -0800, Paul E. McKenney wrote:
> > > PPC Overlapping Group-B sets version 4
> > > ""
> > > (* When the Group-B sets from two different barriers involve instructions 
> > > in
> > >the same thread, within that thread one set must contain the other.
> > > 
> > >   P0        P1        P2
> > >   Rx=1      Wy=1      Wz=2
> > >   dep.      lwsync    lwsync
> > >   Ry=0      Wz=1      Wx=1
> > >   Rz=1
> > > 
> > >   assert(!(z=2))
> > > 
> > >Forbidden by ppcmem, allowed by herd.
> > > *)
> > > {
> > > 0:r1=x; 0:r2=y; 0:r3=z;
> > > 1:r1=x; 1:r2=y; 1:r3=z; 1:r4=1;
> > > 2:r1=x; 2:r2=y; 2:r3=z; 2:r4=1; 2:r5=2;
> > > }
> > >  P0   | P1| P2;
> > >  lwz r6,0(r1) | stw r4,0(r2)  | stw r5,0(r3)  ;
> > >  xor r7,r6,r6 | lwsync| lwsync;
> > >  lwzx r7,r7,r2| stw r4,0(r3)  | stw r4,0(r1)  ;
> > >  lwz r8,0(r3) |   |   ;
> > > 
> > > exists
> > > (z=2 /\ 0:r6=1 /\ 0:r7=0 /\ 0:r8=1)
> > 
> > That really hurts. Assuming that the "assert(!(z=2))" is actually there
> > to constrain the coherence order of z to be {0->1->2}, then I think that
> > this test is forbidden on arm using dmb instead of lwsync. That said, I
> > also don't think the Rz=1 in P0 changes anything.
> 
> What about the smp_wmb() variant of dmb that orders only stores?

Tricky, but I think it still works out if the coherence order of z is as
I described above. The line of reasoning is weird though -- I ended up
considering the two cases where P0 reads z before and after it reads x
and what that means for the read of y.

Will


Re: [Xen-devel] pre Sandy bridge IOMMU support (gm45)

2016-01-26 Thread Jan Beulich

>>> On 26.01.16 at 12:57,  wrote:
> Only dom0 talks directly to the i915 driver, other appvm being pv, which is
> why I put in question the complete deactivation of IGD by iommu=no-igfx.
> 
> Is there anything I can provide to troubleshoot?

Hard to tell. The VT-d maintainers may be able to give you better
guidance.

Jan



Re: [Xen-devel] [PATCH v3 16/17] FDT: Add a helper to get specified name subnode

2016-01-26 Thread Shannon Zhao



On 2016/1/26 20:11, Stefano Stabellini wrote:
> On Sat, 23 Jan 2016, Shannon Zhao wrote:
>> > From: Shannon Zhao 
>> > 
>> > Sometimes it needs to check if there is a node in FDT by full path.
>> > Introduce this helper to get the specified name subnode if it exists.
>> > 
>> > Signed-off-by: Shannon Zhao 
>> > ---
>> > CC: Rob Herring 
>> > ---
>> >  drivers/of/fdt.c   | 35 +++
>> >  include/linux/of_fdt.h |  2 ++
>> >  2 files changed, 37 insertions(+)
>> > 
>> > diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
>> > index 655f79d..112ec16 100644
>> > --- a/drivers/of/fdt.c
>> > +++ b/drivers/of/fdt.c
>> > @@ -645,6 +645,41 @@ int __init of_scan_flat_dt(int (*it)(unsigned long 
>> > node,
>> >  }
>> >  
>> >  /**
>> > + * of_get_flat_dt_subnode_by_name - get subnode of specified node by name
>> > + *
>> > + * @node: the parent node
>> > + * @uname: the name of subnode
>> > + * @return offset of the subnode, or -FDT_ERR_NOTFOUND if there is none
>> > + */
>> > +
>> > +int of_get_flat_dt_subnode_by_name(unsigned long node, const char *uname)
>> > +{
>> > +  const void *blob = initial_boot_params;
>> > +  int offset;
>> > +  const char *pathp;
>> > +
>> > +  /* Find first subnode if it exists */
>> > +  offset = fdt_first_subnode(blob, node);
>> > +  if (offset < 0)
>> > +  return -FDT_ERR_NOTFOUND;
>> > +  pathp = fdt_get_name(blob, offset, NULL);
>> > +  if (strncmp(pathp, uname, strlen(uname)) == 0)
>> > +  return offset;
> Wouldn't this check succeed even if uname is "uefi" and the node
> name is actually "uefi"?  You might have to use strcmp.
> 
Ah, yes. Will fix this.
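
The concern about the prefix comparison can be illustrated with a small sketch. The node name "uefi-extra" is a made-up example, and `demo_prefix_match()` / `demo_exact_match()` are hypothetical helpers, not functions from the patch.

```c
#include <stdbool.h>
#include <string.h>

/* Mirrors the check in the patch: compares only strlen(uname) bytes,
 * so any node name that merely starts with uname also matches. */
bool demo_prefix_match(const char *node_name, const char *uname)
{
    return strncmp(node_name, uname, strlen(uname)) == 0;
}

/* Whole-string comparison, as the review suggests. */
bool demo_exact_match(const char *node_name, const char *uname)
{
    return strcmp(node_name, uname) == 0;
}
```

For example, `demo_prefix_match("uefi-extra", "uefi")` is true while `demo_exact_match("uefi-extra", "uefi")` is false.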

> 
>> > +  /* Find other subnodes */
>> > +  do {
>> > +  offset = fdt_next_subnode(blob, offset);
>> > +  if (offset < 0)
>> > +  return -FDT_ERR_NOTFOUND;
>> > +  pathp = fdt_get_name(blob, offset, NULL);
>> > +  if (strncmp(pathp, uname, strlen(uname)) == 0)
>> > +  return offset;
>> > +  } while (offset >= 0);
> Rather than writing the name check twice, I think it would be best to
> code this loop as:
> 
> for (offset = fdt_first_subnode(blob, offset);
>  offset >= 0;
>  offset = fdt_next_subnode(blob, offset)) {
> 
>  /* do name check */
> 
> 
> 
Thanks for your suggestion. Will change this.
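
A rough, self-contained sketch of the single-loop shape suggested in the review. The `demo_*` helpers are toy stand-ins over a fixed array of names and only imitate the first/next iteration style of the subnode helpers used in the patch; they are assumptions for illustration and do not parse a device tree.

```c
#include <string.h>

#define DEMO_ERR_NOTFOUND 1

/* Toy "subnodes": the offset is simply the array index. */
static const char *const demo_names[] = { "memory", "uefi-extra", "uefi" };
enum { DEMO_COUNT = 3 };

int demo_first_subnode(void)
{
    return DEMO_COUNT > 0 ? 0 : -DEMO_ERR_NOTFOUND;
}

int demo_next_subnode(int offset)
{
    return offset + 1 < DEMO_COUNT ? offset + 1 : -DEMO_ERR_NOTFOUND;
}

const char *demo_get_name(int offset)
{
    return demo_names[offset];
}

/* One loop, one name check, exact comparison. */
int demo_subnode_by_name(const char *uname)
{
    int offset;

    for (offset = demo_first_subnode();
         offset >= 0;
         offset = demo_next_subnode(offset)) {
        if (strcmp(demo_get_name(offset), uname) == 0)
            return offset;
    }
    return -DEMO_ERR_NOTFOUND;
}
```

Because the name check appears once, switching it from a prefix compare to an exact compare only has to be done in one place.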

-- 
Shannon



Re: [Xen-devel] [PATCH 5/5] Allow all user to create a file under the directory /var/lib/xen

2016-01-26 Thread Ian Campbell

On Tue, 2016-01-26 at 00:00 +, Andrew Cooper wrote:
> On 25/01/2016 20:36, Konrad Rzeszutek Wilk wrote:
> > On Wed, Dec 30, 2015 at 11:00:52AM +, Andrew Cooper wrote:
> > > On 30/12/2015 05:25, Wen Congyang wrote:
> > > > On 12/30/2015 12:11 PM, Doug Goldstein wrote:
> > > > > On 12/29/15 8:39 PM, Wen Congyang wrote:
> > > > > > We may use non-root user to run qemu, and the qemu needs to
> > > > > > write
> > > > > > save file to /var/lib/xen. So we should allow all user to
> > > > > > create
> > > > > > a file under the directory /var/lib/xen
> > > > > > 
> > > > > > Signed-off-by: Wen Congyang 
> > > > > > ---
> > > > > >  tools/Makefile | 2 +-
> > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/tools/Makefile b/tools/Makefile
> > > > > > index 820ca40..402b417 100644
> > > > > > --- a/tools/Makefile
> > > > > > +++ b/tools/Makefile
> > > > > > @@ -60,7 +60,7 @@ build all: subdirs-all
> > > > > >  install: subdirs-install
> > > > > >     $(INSTALL_DIR) -m 700 $(DESTDIR)$(XEN_DUMP_DIR)
> > > > > >     $(INSTALL_DIR) $(DESTDIR)/var/log/xen
> > > > > > -   $(INSTALL_DIR) $(DESTDIR)/var/lib/xen
> > > > > > +   $(INSTALL_DIR) -m 777 $(DESTDIR)/var/lib/xen
> > > > > >  .PHONY: uninstall
> > > > > >  uninstall: D=$(DESTDIR)
> > > > > > 
> > > > > I could be wrong but this doesn't seem like something that you'd
> > > > > want to
> > > > > do given what's stored in there. Could you do something with
> > > > > permissions
> > > > > on sub-directories to achieve what you need?
> > > > > 
> > > > The save file's path is:
> > > > #define LIBXL_DEVICE_MODEL_SAVE_FILE "/var/lib/xen/qemu-save" /*
> > > > .$domid */
> > > > 
> > > > So all user must have write permission on the directory
> > > > /var/lib/xen/, otherwise,
> > > > the migration will fail.
> > > For now, I would avoid running qemu as a non-root user.  It doesn't
> > > gain you
> > > > any meaningful security at present (at the expense of a warning which
> > > can't
> > > be turned off).
> > > 
> > > As to this bug, marking the directory 0777 is not an option, as save
> > > records
> > > necessarily contain sensitive data.
> > > 
> > > Longterm, (and already identified in one of the threads in the past),
> > > the
> > > best course of action is to switch away from having files, and
> > > passing file
> > > descriptors instead.  This is more flexible (currently libxl can't
> > > function
> > > on a read-only root filesystem), and would allow a privileged entity
> > > to open
> > > the file descriptor and pass it to a non-privileged entity to
> > > use.  This
> > > allows the non-privileged entity to function, and maintains security.
> > Wen,
> > 
> > Could you mention the use case for wanting to write files there?
> > Looking
> > at the patches you had sent for COLO and Remus they use an file
> > descriptor - so
> > what is the use-case here?
> 
> This is a bug in existing code.  It is not a COLO specific issue.
> 
> The current protocol for live migration requires Qemu to write its save
> file here.
> 
> Until this issue is resolved, live migration is inoperable with Qemu
> running as a non-root user.

Stefano, is this already on your list of issues to address?

In any case creating a world writeable directory is clearly a non-starter.
We might need the toolstack to create a directory with suitable permissions
until we can rework things to work with fds only.

Ian.
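
Two points from this thread can be sketched in a few lines of POSIX C, under stated assumptions: `demo_world_writable()` shows the mode-bit test behind the objection to `-m 777`, and `demo_save_to_fd()` shows a writer that is handed an already-open descriptor instead of a path. Both names are hypothetical and neither is libxl or qemu code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* True if "other" users may create or remove entries in a directory
 * with this mode, which is what 0777 grants. */
bool demo_world_writable(mode_t mode)
{
    return (mode & S_IWOTH) != 0;
}

/* Writes the whole buffer to a descriptor that a more privileged
 * process opened on the caller's behalf. The writer needs no
 * permission on the containing directory. Returns 0 or -1. */
int demo_save_to_fd(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted: retry the write */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

How the descriptor reaches the unprivileged process (inheritance across exec, or passing over a Unix socket) is left open here, as it is in the thread.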



Re: [Xen-devel] [ovmf bisection] complete test-amd64-i386-xl-qemuu-ovmf-amd64

2016-01-26 Thread Ian Campbell

On Sat, 2016-01-09 at 00:33 +, osstest service owner wrote:

According to
http://logs.test-lab.xenproject.org/osstest/results/all-branch-statuses.txt
the ovmf push gate has been broken for a while (48 days).

The bisector seems to have fingered the commit below (there are some other
intermittent issues, but this one seems to be rather persistent).

An example of the failure can be seen in:
http://logs.test-lab.xenproject.org/osstest/logs/78929/test-amd64-i386-xl-qemuu-ovmf-amd64/info.html

I wouldn't mind betting that this is something similar:
http://logs.test-lab.xenproject.org/osstest/logs/78929/test-amd64-amd64-xl-qemuu-ovmf-amd64/info.html

Ian.

> branch xen-unstable
> xenbranch xen-unstable
> job test-amd64-i386-xl-qemuu-ovmf-amd64
> testid guest-start/debianhvm.repeat
> 
> Tree: linux git://xenbits.xen.org/linux-pvops.git
> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
> Tree: ovmf https://github.com/tianocore/edk2.git
> Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> Tree: xen git://xenbits.xen.org/xen.git
> 
> *** Found and reproduced problem changeset ***
> 
>   Bug is in tree:  ovmf https://github.com/tianocore/edk2.git
>   Bug introduced:  b0fa5d29d08e61fd7f2178aa3b455e41374b36c4
>   Bug not present: fa25cf38d988778ef3237e17fc93c1fa0c9e9f8a
>   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/77418/
> 
> 
>   commit b0fa5d29d08e61fd7f2178aa3b455e41374b36c4
>   Author: Michael Kinney 
>   Date:   Tue Dec 8 05:24:18 2015 +
>   
>   UefiCpuPkg/MtrrLib: Reduce hardware init when program variable MTRRs
>   
>   When MtrrSetMemoryAttribute() programs variable MTRRs, it may 
> disable/enable
>   cache and disable/enable MTRRs several times. This updating tries to do
>   operation in local variable and does the hardware initialization one 
> time only.
>   
>   Cc: Feng Tian 
>   Cc: Michael Kinney 
>   Contributed-under: TianoCore Contribution Agreement 1.0
>   Signed-off-by: Michael Kinney 
>   Signed-off-by: Jeff Fan 
>   Reviewed-by: Feng Tian 
>   
>   git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@19158 
> 6f19259b-4bc3-4df7-8a09-765794883524
> 
> 
> For bisection revision-tuple graph see:
>    http://logs.test-lab.xenproject.org/osstest/results/bisect/ovmf/test-amd64-i386-xl-qemuu-ovmf-amd64.guest-start--debianhvm.repeat.html
> Revision IDs in each graph node refer, respectively, to the Trees above.
> 
> 
> Running cs-bisection-step --graph-
> out=/home/logs/results/bisect/ovmf/test-amd64-i386-xl-qemuu-ovmf-
> amd64.guest-start--debianhvm.repeat --summary-out=tmp/77418.bisection-
> summary --basis-template=65543 --blessings=real,real-bisect ovmf test-
> amd64-i386-xl-qemuu-ovmf-amd64 guest-start/debianhvm.repeat
> Searching for failure / basis pass:
>  77229 fail [host=rimava1] / 66401 [host=huxelrebe0] 65677
> [host=huxelrebe1] 65624 [host=baroque0] 65593 [host=italia1] 65543
> [host=pinot1] 65468 [host=fiano0] 65386 [host=fiano1] 65359
> [host=nocera1] 65336 [host=italia0] 65319 ok.
> Failure / basis pass flights: 77229 / 65319
> (tree with no url: seabios)
> Tree: linux git://xenbits.xen.org/linux-pvops.git
> Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
> Tree: ovmf https://github.com/tianocore/edk2.git
> Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> Tree: xen git://xenbits.xen.org/xen.git
> Latest 5d7b0fcc26d66db767a477574effc764022c19ac
> c530a75c1e6a472b0eb9558310b518f0dfcd8860
> c2a892d7c8a78143006bb7fdc95fb18f7e2fc685
> a82794b1d5a6da06062a333b1db404e2448345dd
> f165e581d9a6f7cf81aa7496d3eee1e31212c8ad
> bf925a9f1254391749f569c1b8fc606036340488
> Basis pass 769b79eb206ad5b0249a08665fefb913c3d1998e
> c530a75c1e6a472b0eb9558310b518f0dfcd8860
> dcb2e4bb61931e2dee1739bb76aba315002f0a82
> bc00cad75d8bcc3ba696992bec219c21db8406aa
> 3fb401edbd8e9741c611bfddf6a2032ca91f55ed
> 713b7e4ef2aa4ec3ae697cde9c81d5a57548f9b1
> Generating revisions with ./adhoc-revtuple-generator  git://xenbits.xen.o
> rg/linux-pvops.git#769b79eb206ad5b0249a08665fefb913c3d1998e-
> 5d7b0fcc26d66db767a477574effc764022c19ac git://xenbits.xen.org/osstest/li
> nux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-
> c530a75c1e6a472b0eb9558310b518f0dfcd8860 https://github.com/tianocore/edk
> 2.git#dcb2e4bb61931e2dee1739bb76aba315002f0a82-
> c2a892d7c8a78143006bb7fdc95fb18f7e2fc685 git://xenbits.xen.org/qemu-xen-t
> raditional.git#bc00cad75d8bcc3ba696992bec219c21db8406aa-
> a82794b1d5a6da06062a333b1db404e2448345dd git://xenbits.xen.org/qemu-xen.g
> it#3fb401edbd8e9741c611bfddf6a2032ca91f55ed-
> f165e581d9a6f7cf81aa7496d3eee1e31212c8ad git://xenbits.xen.org/xen.git#71
>

Re: [Xen-devel] [PATCH 2/2] xenalyze: remove cr3_compare_total

2016-01-26 Thread George Dunlap

On 22/01/16 14:27, Ian Campbell wrote:
> gcc-6 complains:
> xenalyze.c:4132:9: error: 'cr3_compare_total' defined but not used 
> [-Werror=unused-function]
>  int cr3_compare_total(const void *_a, const void *_b) {
>  ^
> 
> I believe it is correct.
> 
> Signed-off-by: Ian Campbell 

Thanks,

Reviewed-by: George Dunlap 



Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-26 Thread Will Deacon

On Mon, Jan 25, 2016 at 05:06:46PM -0800, Paul E. McKenney wrote:
> On Mon, Jan 25, 2016 at 02:41:34PM +, Will Deacon wrote:
> > On Fri, Jan 15, 2016 at 11:28:45AM -0800, Paul E. McKenney wrote:
> > > On Fri, Jan 15, 2016 at 09:54:01AM -0800, Paul E. McKenney wrote:
> > > > On Fri, Jan 15, 2016 at 10:24:32AM +, Will Deacon wrote:
> > > > > See my earlier reply [1] (but also, your WRC Linux example looks more
> > > > > like a variant on WWC and I couldn't really follow it).
> > > > 
> > > > I will revisit my WRC Linux example.  And yes, creating litmus tests
> > > > that use non-fake dependencies is still a bit of an undertaking.  :-/
> > > > I am sure that it will seem more natural with time and experience...
> > > 
> > > Hmmm...  You are quite right, I did do WWC.  I need to change cpu2()'s
> > > last access from a store to a load to get WRC.  Plus the levels of
> > > indirection definitely didn't match up, did they?
> > 
> > Nope, it was pretty baffling!
> 
> "It is a service that I provide."  ;-)
> 
> > >   struct foo {
> > >   struct foo *next;
> > >   };
> > >   struct foo a;
> > >   struct foo b;
> > >   struct foo c = { &a };
> > >   struct foo d = { &b };
> > >   struct foo x = { &c };
> > >   struct foo y = { &d };
> > >   struct foo *r1, *r2, *r3;
> > > 
> > >   void cpu0(void)
> > >   {
> > >   WRITE_ONCE(x.next, &y);
> > >   }
> > > 
> > >   void cpu1(void)
> > >   {
> > >   r1 = lockless_dereference(x.next);
> > >   WRITE_ONCE(r1->next, &x);
> > >   }
> > > 
> > >   void cpu2(void)
> > >   {
> > >   r2 = lockless_dereference(y.next);
> > >   r3 = READ_ONCE(r2->next);
> > >   }
> > > 
> > > In this case, it is legal to end the run with:
> > > 
> > >   r1 == &y && r2 == &x && r3 == &c
> > > 
> > > Please see below for a ppcmem litmus test.
> > > 
> > > So, did I get it right this time?  ;-)
> > 
> > The code above looks correct to me (in that it matches WRC+addrs),
> > but your litmus test:
> > 
> > > PPC WRCnf+addrs
> > > ""
> > > {
> > > 0:r2=x; 0:r3=y;
> > > 1:r2=x; 1:r3=y;
> > > 2:r2=x; 2:r3=y;
> > > c=a; d=b; x=c; y=d;
> > > }
> > >  P0   | P1| P2;
> > >  stw r3,0(r2) | lwz r8,0(r2)  | lwz r8,0(r3)  ;
> > >   | stw r2,0(r3)  | lwz r9,0(r8)  ;
> > > exists
> > > (1:r8=y /\ 2:r8=x /\ 2:r9=c)
> > 
> > Seems to be missing the address dependency on P1.
> 
> You are quite correct!  How about the following?

I think that's it!

> As before, both herd and ppcmem say that the cycle is allowed, as
> expected, given non-transitive ordering.  To prohibit the cycle, P1
> needs a suitable memory-barrier instruction.
> 
> 
> 
> PPC WRCnf+addrs
> ""
> {
> 0:r2=x; 0:r3=y;
> 1:r2=x; 1:r3=y;
> 2:r2=x; 2:r3=y;
> c=a; d=b; x=c; y=d;
> }
>  P0   | P1| P2;
>  stw r3,0(r2) | lwz r8,0(r2)  | lwz r8,0(r3)  ;
>   | stw r2,0(r8)  | lwz r9,0(r8)  ;
> exists
> (1:r8=y /\ 2:r8=x /\ 2:r9=c)

Agreed.

Will


Re: [Xen-devel] [PATCH v3 17/17] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-01-26 Thread Matt Fleming

On Sat, 23 Jan, at 11:19:44AM, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Add a new function to parse DT parameters for Xen specific UEFI just
> like the way for normal UEFI. Then it could reuse the existing codes.
> 
> If Xen supports EFI, initialize runtime services.
> 
> Signed-off-by: Shannon Zhao 
> ---
> CC: Matt Fleming 
> ---
>  arch/arm/xen/enlighten.c   |  6 ++
>  arch/arm64/kernel/efi.c| 17 -
>  drivers/firmware/efi/efi.c | 45 ++---
>  3 files changed, 56 insertions(+), 12 deletions(-)
 
Looks OK to me, but I've added some people to Cc that have touched
these files in the past, and it would be good to get their ACKs.

Reviewed-by: Matt Fleming 

> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index cdc0bd2..608d735 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -245,6 +245,12 @@ static int __init fdt_find_hyper_node(unsigned long 
> node, const char *uname,
>   !strncmp(hyper_node.prefix, s, strlen(hyper_node.prefix)))
>   hyper_node.version = s + strlen(hyper_node.prefix);
>  
> + if (IS_ENABLED(CONFIG_XEN_EFI)) {
> + /* Check if Xen supports EFI */
> + if (of_get_flat_dt_subnode_by_name(node, "uefi") > 0)
> + set_bit(EFI_PARAVIRT, );
> + }
> +
>   return 0;
>  }
>  
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index 4eeb171..3c46129 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -33,6 +33,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  struct efi_memory_map memmap;
>  
> @@ -308,13 +309,19 @@ static int __init arm64_enable_runtime_services(void)
>   }
>   set_bit(EFI_SYSTEM_TABLES, );
>  
> - if (!efi_virtmap_init()) {
> - pr_err("No UEFI virtual mapping was installed -- runtime 
> services will not be available\n");
> - return -ENOMEM;
> + if (IS_ENABLED(CONFIG_XEN_EFI) && efi_enabled(EFI_PARAVIRT)) {
> + /* Set up runtime services function pointers for Xen Dom0 */
> + xen_efi_runtime_setup();
> + } else {
> + if (!efi_virtmap_init()) {
> + pr_err("No UEFI virtual mapping was installed -- 
> runtime services will not be available\n");
> + return -ENOMEM;
> + }
> +
> + /* Set up runtime services function pointers */
> + efi_native_runtime_setup();
>   }
>  
> - /* Set up runtime services function pointers */
> - efi_native_runtime_setup();
>   set_bit(EFI_RUNTIME_SERVICES, );
>  
>   efi.runtime_version = efi.systab->hdr.revision;
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 027ca21..bdcf6d7 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -498,12 +498,14 @@ device_initcall(efi_load_efivars);
>   FIELD_SIZEOF(struct efi_fdt_params, field) \
>   }
>  
> -static __initdata struct {
> +struct params {
>   const char name[32];
>   const char propname[32];
>   int offset;
>   int size;
> -} dt_params[] = {
> +};
> +
> +static struct params fdt_params[] __initdata = {
>   UEFI_PARAM("System Table", "linux,uefi-system-table", system_table),
>   UEFI_PARAM("MemMap Address", "linux,uefi-mmap-start", mmap),
>   UEFI_PARAM("MemMap Size", "linux,uefi-mmap-size", mmap_size),
> @@ -511,24 +513,45 @@ static __initdata struct {
>   UEFI_PARAM("MemMap Desc. Version", "linux,uefi-mmap-desc-ver", desc_ver)
>  };
>  
> +static struct params xen_fdt_params[] __initdata = {
> + UEFI_PARAM("System Table", "xen,uefi-system-table", system_table),
> + UEFI_PARAM("MemMap Address", "xen,uefi-mmap-start", mmap),
> + UEFI_PARAM("MemMap Size", "xen,uefi-mmap-size", mmap_size),
> + UEFI_PARAM("MemMap Desc. Size", "xen,uefi-mmap-desc-size", desc_size),
> + UEFI_PARAM("MemMap Desc. Version", "xen,uefi-mmap-desc-ver", desc_ver)
> +};
> +
>  struct param_info {
>   int found;
>   void *params;
> + struct params *dt_params;
> + int size;
>  };
>  
>  static int __init fdt_find_uefi_params(unsigned long node, const char *uname,
>  int depth, void *data)
>  {
>   struct param_info *info = data;
> + struct params *dt_params = info->dt_params;
>   const void *prop;
>   void *dest;
>   u64 val;
> - int i, len;
> + int i, len, offset;
>  
> - if (depth != 1 || strcmp(uname, "chosen") != 0)
> - return 0;
> + if (efi_enabled(EFI_PARAVIRT)) {
> + if (depth != 1 || strcmp(uname, "hypervisor") != 0)
> + return 0;
>  
> - for (i = 0; i < ARRAY_SIZE(dt_params); i++) {
> + offset = of_get_flat_dt_subnode_by_name(node, "uefi");
> +

Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-26 Thread Peter Zijlstra

On Thu, Jan 14, 2016 at 02:20:46PM -0800, Paul E. McKenney wrote:
> On Thu, Jan 14, 2016 at 01:24:34PM -0800, Leonid Yegoshin wrote:
> > On 01/14/2016 12:48 PM, Paul E. McKenney wrote:
> > >
> > >So SYNC_RMB is intended to implement smp_rmb(), correct?
> > Yes.
> > >
> > >You could use SYNC_ACQUIRE() to implement read_barrier_depends() and
> > >smp_read_barrier_depends(), but SYNC_RMB probably does not suffice.
> > 
> > If smp_read_barrier_depends() is used to separate not only two reads
> > but read pointer and WRITE basing on that pointer (example below) -
> > > yes. I just don't see any example of this in famous
> > Documentation/memory-barriers.txt and had no chance to know what you
> > use it in this way too.
> 
> Well, Documentation/memory-barriers.txt was intended as a guide for Linux
> kernel hackers, and not for hardware architects.

Yeah, this goes under the header: memory-barriers.txt is _NOT_ a
specification (I seem to keep repeating this).

> 
> 
> commit 955720966e216b00613fcf60188d507c103f0e80
> Author: Paul E. McKenney 
> Date:   Thu Jan 14 14:17:04 2016 -0800
> 
> documentation: Subsequent writes ordered by rcu_dereference()
> 
> The current memory-barriers.txt does not address the possibility of
> a write to a dereferenced pointer.  This should be rare, 

How are these rare? Isn't:

	rcu_read_lock()
	obj = rcu_dereference(ptr);
	if (!atomic_inc_not_zero(&obj->ref))
		obj = NULL;
	rcu_read_unlock();

a _very_ common thing to do?
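
For readers outside the kernel, a userspace approximation of the "take a reference unless the count is already zero" step, written with C11 atomics. This is only a sketch: `demo_inc_not_zero()` and `demo_tryget()` are hypothetical names, the memory orders are an assumption, and the kernel's own atomic_inc_not_zero() has its own ordering guarantees. The RCU read-side section is not modelled at all.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_obj {
    atomic_int ref;
};

/* Increment *ref unless it is zero. Note that this is a write to memory
 * reached through the pointer, not just a read of it. */
bool demo_inc_not_zero(atomic_int *ref)
{
    int old = atomic_load_explicit(ref, memory_order_relaxed);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak_explicit(ref, &old, old + 1,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed))
            return true;
    }
    return false;
}

/* Returns obj with an extra reference, or NULL if it was already dying. */
struct demo_obj *demo_tryget(struct demo_obj *obj)
{
    if (obj != NULL && !demo_inc_not_zero(&obj->ref))
        obj = NULL;
    return obj;
}
```

The point relevant to the thread is that the pattern stores through the dereferenced pointer, which is the case the quoted documentation patch calls rare.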

Re: [Xen-devel] [PATCH v2 3/3] tools: introduce parameter max_wp_ram_ranges.

2016-01-26 Thread David Vrabel

On 22/01/16 03:20, Yu Zhang wrote:
> --- a/docs/man/xl.cfg.pod.5
> +++ b/docs/man/xl.cfg.pod.5
> @@ -962,6 +962,24 @@ FIFO-based event channel ABI support up to 131,071 event 
> channels.
>  Other guests are limited to 4095 (64-bit x86 and ARM) or 1023 (32-bit
>  x86).
>  
> +=item

Re: [Xen-devel] pre Sandy bridge IOMMU support (gm45)

2016-01-26 Thread Jan Beulich

(re-adding xen-devel)

>>> On 26.01.16 at 12:28,  wrote:
> Iommu=0 let the whole Qubes system work, without enforcing hardware
> compartmentalisation (iommu is enforced in software mode)
> 
> When iommu=no-igfx is enforced, shell console boot up works flawlessly. All
> domu machines get booted up. A system hang will happen at the moment a domu
> machine does graphic rendering,

And this is (other than I originally implied) without passing through
the IGD to the DomU? If so, I can't see the difference between a
guest rendering to its display (and vncviewer or whatever frontend
you use converting this to rendering on the host) and rendering
which originates in the host.

Jan

> which often results in a tray icon being
> fuzzy just before the system becomes unresponsive (netvm showing that it
> got connected through nm-applet rendering), or a notification starting to
> show up while the system hangs before it disappears with some minor/major
> visual glitch being visible (usb-vm showing device attribution to another
> vm).
> 
> Again, if iommu=0 is passed to xen, there is no system hang, but there is
> also no added isolation security from usb devices being in one domu and
> network devices being in another one, while applications sit in separate
> ones. This is why Qubes strongly suggests but doesn't require iommu: stronger
> isolation.
> IGD has a bad history of iommu support. A quick list :
> 
> -http://lists.freedesktop.org/archives/dri-devel/2013-January/033662.html 
> -https://lists.ubuntu.com/archives/kernel-team/2013-February/024796.html 
> 
> Isolation of netvm and usb is a required use case in Qubes. IGD passthrough
> would be nice to have, but isn't required. I don't really see why someone
> would want to pass through IGD to a Windows domu; gm45 based laptops are
> definitely not gaming laptops. Since i915 and gm45 have a bad iommu
> history, just being able to completely disable iommu for IGD would suffice.
> 
> 
> Thierry
> 
> On Tue, 26 Jan 2016 at 05:52, Jan Beulich wrote:
> 
>> >>> On 25.01.16 at 22:49,  wrote:
>> > The case is 1) disabling iommu for IGD, unilaterally since i915 + gm45
>> > doesn't play well together. Iommu is still desired to isolate usb and
>> > network devices, so we don't want to disable iommu completely. The side
>> > effect of this would be to have IGD only for dom0, which would also
>> > completely make sense in this use case.
>> >
>> > The point is the iommu=no-igfx doesn't fix the issue, since remapping
>> seems
>> > to still happen for IGD. Does that make sense ?
>>
>> It certainly may make sense, just that in what you have written so
>> far I don't think I've been able to spot any evidence thereof. Since,
>> as you say, nothing interesting gets logged by Xen, you must be
>> drawing this conclusion from something (or else you wouldn't say
>> "doesn't fix the issue").
>>
>> Jan
>>
>>



Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-26 Thread George Dunlap

On Thu, Jan 21, 2016 at 2:52 PM, Jan Beulich  wrote:
 On 21.01.16 at 15:01,  wrote:
>> On 01/21/16 03:25, Jan Beulich wrote:
>>> >>> On 21.01.16 at 10:10,  wrote:
>>> > b) some _DSMs control PMEM so you should filter out these kind of _DSMs 
>>> > and
>>> > handle them in hypervisor.
>>>
>>> Not if (see above) following the model we currently have in place.
>>>
>>
>> You mean let dom0 linux evaluate those _DSMs and interact with
>> hypervisor if necessary (e.g. XENPF_mem_hotadd for memory hotplug)?
>
> Yes.
>
>>> > c) hypervisor should manage PMEM resource pool and partition it to multiple
>>> > VMs.
>>>
>>> Yes.
>>>
>>
>> But I still do not quite understand this part: why must pmem resource
>> management and partition be done in hypervisor?
>
> Because that's where memory management belongs. And PMEM,
> other than PBLK, is just another form of RAM.

I haven't looked more deeply into the details of this, but this
argument doesn't seem right to me.

Normal RAM in Xen is what might be called "fungible" -- at boot, all
RAM is zeroed, and it basically doesn't matter at all what RAM is
given to what guest.  (There are restrictions of course: lowmem for
DMA, contiguous superpages, etc., but within those groups, it doesn't
matter *which* bit of lowmem you get, as long as you get enough to do
your job.)  If you reboot your guest or hand RAM back to the
hypervisor, you assume that everything in it will disappear.  When you
ask for RAM, you can request some parameters that it will have
(lowmem, on a specific node, etc.), but you can't request a specific
page that you had before.

This is not the case for PMEM.  The whole point of PMEM (correct me if
I'm wrong) is to be used for long-term storage that survives over
reboot.  It matters very much that a guest be given the same PRAM
after the host is rebooted that it was given before.  It doesn't make
any sense to manage it the way Xen currently manages RAM (i.e., that
you request a page and get whatever Xen happens to give you).

So if Xen is going to use PMEM, it will have to invent an entirely new
interface for guests, and it will have to keep track of those
resources across host reboots.  In other words, it will have to
duplicate all the work that Linux already does.  What do we gain from
that duplication?  Why not just leverage what's already implemented in
dom0?

>> I mean if we allow the following steps of operations (for example)
>> (1) partition pmem in dom 0
>> (2) get address and size of each partition (part_addr, part_size)
>> (3) call a hypercall like nvdimm_memory_mapping(d, part_addr, part_size,
>> gpfn) to
>> map a partition to the address gpfn in dom d.
>> Only the last step requires hypervisor. Would anything be wrong if we
>> allow above operations?
>
> The main issue is that this would imo be a layering violation. I'm
> sure it can be made work, but that doesn't mean that's the way
> it ought to work.

Jan, from a toolstack <-> Xen perspective, I'm not sure what
alternative there is to the interface above.  Won't the toolstack have to
1) figure out what nvdimm regions there are and 2) tell Xen how and
where to assign them to the guest no matter what we do?  And if we
want to assign arbitrary regions to arbitrary guests, then (part_addr,
part_size) and (gpfn) are going to be necessary bits of information.
The only difference would be whether part_addr is the machine address
or some abstracted address space (possibly starting at 0).
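
To make the shape of such an interface concrete, here is a rough sketch of what a mapping request carrying (part_addr, part_size) and (gpfn) might look like. Everything in it is hypothetical: the struct, its field widths, the 4 KiB page size and the validation rules are assumptions for illustration, not an existing Xen hypercall.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/* Hypothetical request: map one PMEM partition into a guest. */
struct nvdimm_map_req {
    uint16_t domid;      /* target domain */
    uint64_t part_addr;  /* partition start (machine or abstracted address) */
    uint64_t part_size;  /* partition size in bytes */
    uint64_t gpfn;       /* first guest frame to map at */
};

/* Basic sanity checks: non-empty, page aligned, no address wrap-around. */
static bool nvdimm_map_req_valid(const struct nvdimm_map_req *req)
{
    if (req->part_size == 0)
        return false;
    if ((req->part_addr | req->part_size) & (SKETCH_PAGE_SIZE - 1))
        return false;
    if (req->part_addr + req->part_size < req->part_addr)
        return false;
    return true;
}

/* Number of guest frames the request would populate. */
static uint64_t nvdimm_map_req_frames(const struct nvdimm_map_req *req)
{
    return req->part_size >> SKETCH_PAGE_SHIFT;
}
```

Whether part_addr is a machine address or an offset into an abstracted space changes only how the receiving side interprets the field, not the layout of the request.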

What does your ideal toolstack <-> Xen interface look like?

 -George

[Xen-devel] [linux-next test] 78997: regressions - trouble: broken/fail/pass

2016-01-26 Thread osstest service owner

flight 78997 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/78997/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-libvirt-xsm  8 leak-check/basis(8)   fail REGR. vs. 78857

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-rumpuserxen-i386 10 guest-start  fail blocked in 78857
 test-amd64-amd64-xl-credit2 15 guest-localmigrate  fail blocked in 78857
 test-amd64-amd64-xl 15 guest-localmigrate  fail blocked in 78857
 test-amd64-amd64-xl-xsm 15 guest-localmigrate  fail blocked in 78857
 test-amd64-i386-xl 15 guest-localmigrate  fail blocked in 78857
 test-amd64-i386-libvirt-xsm 15 guest-saverestore.2  fail blocked in 78857
 test-amd64-i386-xl-xsm 15 guest-localmigrate  fail blocked in 78857
 test-amd64-amd64-libvirt 15 guest-saverestore.2  fail blocked in 78857
 test-amd64-amd64-xl-multivcpu 15 guest-localmigrate  fail blocked in 78857
 test-amd64-amd64-libvirt-xsm 15 guest-saverestore.2  fail blocked in 78857
 test-amd64-amd64-rumpuserxen-amd64 15 rumpuserxen-demo-xenstorels/xenstorels.repeat  fail blocked in 78857
 test-amd64-amd64-pair 22 guest-migrate/dst_host/src_host  fail blocked in 78857
 test-amd64-amd64-libvirt-pair 22 guest-migrate/dst_host/src_host  fail blocked in 78857
 test-armhf-armhf-xl 8 leak-check/basis(8)  fail blocked in 78857
 test-armhf-armhf-xl-xsm 8 leak-check/basis(8)  fail blocked in 78857
 test-armhf-armhf-xl-cubietruck 8 leak-check/basis(8)  fail blocked in 78857
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 15 guest-localmigrate/x10  fail blocked in 78857
 test-armhf-armhf-xl-multivcpu 8 leak-check/basis(8)  fail blocked in 78857
 test-amd64-amd64-xl-rtds 15 guest-localmigrate  fail blocked in 78857
 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeat  fail blocked in 78857
 test-amd64-i386-pair 22 guest-migrate/dst_host/src_host  fail blocked in 78857
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop  fail blocked in 78857
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail blocked in 78857
 test-amd64-amd64-xl-qemut-win7-amd64 12 guest-saverestore  fail blocked in 78857
 test-amd64-i386-libvirt 15 guest-saverestore.2  fail blocked in 78857
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail blocked in 78857
 test-armhf-armhf-xl-credit2 8 leak-check/basis(8)  fail blocked in 78857
 test-amd64-i386-libvirt-pair 22 guest-migrate/dst_host/src_host  fail blocked in 78857
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 17 guest-start/debianhvm.repeat  fail blocked in 78857
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 15 guest-localmigrate/x10  fail blocked in 78857
 test-armhf-armhf-xl-vhd 9 debian-di-install  fail blocked in 78857
 test-armhf-armhf-libvirt-raw 9 debian-di-install  fail blocked in 78857
 test-amd64-amd64-xl-qemuu-winxpsp3 12 guest-saverestore  fail blocked in 78857

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestore  fail never pass
 test-amd64-amd64-xl-pvh-amd 11 guest-start  fail never pass
 test-amd64-i386-libvirt-xsm 12 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check  fail never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check  fail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestore  fail never pass
 test-armhf-armhf-libvirt 14 guest-saverestore  fail never pass
 test-armhf-armhf-libvirt 12 migrate-support-check  fail never pass
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1  fail never pass
 test-armhf-armhf-xl-arndale 12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-arndale 13 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check  fail never pass
 test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1  fail never pass
 test-amd64-i386-libvirt 12 migrate-support-check  fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check  fail never pass

version targeted for testing:
 linux                e216cada8e1b4d6c278a2a9af051aeba9b7f1bbe
baseline version:
 linux                b82dde0230439215b55e545880e90337ee16f51a

Last test of basis  (not found) 
Failing since 0  1970-01-01 00:00:00 Z 16826 days
Testing same since

Re: [Xen-devel] [PATCH V2 1/1] Improved RTDS scheduler

2016-01-26 Thread Dario Faggioli

On Mon, 2016-01-25 at 17:04 -0500, Tianyang Chen wrote:
> I have removed some of the Ccs so they won't get bothered as we 
> discussed previously.
> 
Yeah... I said you should have done that in the first place, and then
Cc-ed them myself! Sorry... :-P

> On 1/25/2016 4:00 AM, Dario Faggioli wrote:
> > On Thu, 2015-12-31 at 05:20 -0500, Tianyang Chen wrote:
> > > 
> > So, there's always only one timer... Even if we have multiple
> > cpupool
> > with RTDS as their scheduler, they share the replenishment timer? I
> > think it makes more sense to make this per-scheduler.
> > 
> Yeah, I totally ignored the case for cpu-pools. It looks like when a 
> cpu-pool is created, it copies the scheduler struct and calls
> rt_init() 
> where a private field is initialized. So I assume the timer should
> be 
> put inside the scheduler private struct? 
>
Yes, I think it should be there. We certainly don't want different
cpupools to share the timer.

> Now that I think about it, the 
> timer is hard-coded to run on cpu0.
>
It is. Well, in your patch it is "hard-coded" to cpu0. When considering
cpupools, you just hard-code it to one cpu of the pool.

In fact, the fact that the timer is sort-of pinned to a pcpu is a
(potential) issue (overhead on that pcpu, what happens if that pcpu
goes offline), but let's deal with all this later. For now, make the
code cpupools-safe.

>  If there're lots of cpu-pools but 
> the replenishment can only be done on the same pcpu, would that be a 
> problem? Should we keep track of all instances of schedulers
> (nr_rt_ops 
> counts how many) and just put timers on different pcpus?
>
One timer per cpupool is what we want, at least for now.

> > About the actual startup of the timer (no matter whether for first
> > time
> > or not). Here, you were doing it in _vcpu_insert() and not in
> > _vcpu_wake(); in v3 you're doing it in _vcpu_wake() and not in
> > _runq_insert()... Which one is the proper way?
> > 
> 
> Correct me if I'm wrong, at the beginning of the boot process, all
> vcpus 
> are put to sleep/not_runnable after insertions. Therefore, the timer 
> should start when the first vcpu wakes up. I think the wake() in v3 
> should be correct.
> 
Check when the insert_vcpu is called in schedule.c (hint, this also has
to do with cpupools()). I think that starting it in wake() is ok, but,
really, do double check (and, once you're ready for that, test things
by creating multiple pools and moving domains around between them).

> > Mmm... I'll think about this more and let you know... But out of
> > the
> > top of my head, I think the tickling has to stay? You preempted a
> > vcpu
> > from the pcpu where it was running, maybe some other pcpu is either
> > idle or running a vcpu with a later deadline, and should come and
> > pick
> > this one up?
> > 
> gEDF allows this but there is overhead and may not be worth it. I
> have 
> no stats to support this but there are some papers on restricting
> what 
> tasks can migrate. We can discuss more if we need extra logic here.
> 
Ok (more on this in the reply to Meng's email).

> > Oh, and one thing: the use of the term "release time" is IMO a bit
> > misleading. Release of what? Typically, the release time of an RT
> > task
> > (or job) is when the task (or job) is declared ready to run... But
> > I
> > don't think it's used like this in here.
> > 
> > I propose to just get rid of it.
> > 
> The "release time" here means the next time when a deferrable server
> is 
> released and ready to serve. It happens every period. Maybe the term 
> "inter-release time" is more appropriate?
>
Perhaps, but I think this part of the DS algorithm can be implemented
in a smart enough way to avoid having to deal with this "explicitly"
(and in particular, having to scan the running or ready queues during
replenishment).

> > > +if( min_repl> svc->cur_deadline )
> > > +{
> > > +min_repl = svc->cur_deadline;
> > > +}
> > > +/* reinsert the vcpu if its deadline is updated */
> > > +__q_remove(svc);
> > > +__runq_insert(ops, svc);
> > > 
> > One more proof of what I was trying to say. Is it really this
> > handler's
> > job to --basically-- re-sort the runqueue? I don't think so.
> > 
> > What is the specific situation that you are trying to handle like
> > this?
> > 
> Right, if we want to count deadline misses, it could be done when a
> vcpu 
> is picked. However, when selecting the most imminent "inter-release 
> time" of all runnable vcpu, the head of the runq could be missing
> its 
> deadline and the cur-deadline could be in the past. How do we handle 
> this situation? We still need to scan the runq right?
> 
I'll do my best to avoid that we'll end up scanning the runqueue in the
replenishment timer handler, and in fact I still don't think this is
going to be necessary.

Let's discuss more about this specific point when replying to Meng's
email.
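
One possible way to keep the replenishment handler from scanning the run queue is to keep a separate queue ordered by next replenishment time, so the handler only ever touches entries that have actually expired. The sketch below is a toy model under that assumption; struct toy_vcpu, replq_insert() and repl_handler() are made-up names and do not come from the patch under review.

```c
#include <stddef.h>
#include <stdint.h>

typedef int64_t s_time_t;   /* nanoseconds, as a toy stand-in */

struct toy_vcpu {
    s_time_t period;        /* replenishment period, must be > 0 */
    s_time_t budget;        /* budget granted per period */
    s_time_t cur_budget;    /* budget left in the current period */
    s_time_t cur_deadline;  /* end of the current period */
    struct toy_vcpu *next;  /* link in the replenishment queue */
};

/* Insert v into a singly linked list kept sorted by cur_deadline. */
static void replq_insert(struct toy_vcpu **head, struct toy_vcpu *v)
{
    while (*head && (*head)->cur_deadline <= v->cur_deadline)
        head = &(*head)->next;
    v->next = *head;
    *head = v;
}

/*
 * Timer handler: replenish only the vcpus whose period has ended.
 * Returns the time the timer should fire next, or -1 if the queue is empty.
 */
static s_time_t repl_handler(struct toy_vcpu **head, s_time_t now)
{
    while (*head && (*head)->cur_deadline <= now) {
        struct toy_vcpu *v = *head;

        *head = v->next;                  /* pop the expired entry */
        do {                              /* skip any missed periods */
            v->cur_deadline += v->period;
        } while (v->cur_deadline <= now);
        v->cur_budget = v->budget;
        replq_insert(head, v);            /* requeue for the next period */
    }
    return *head ? (*head)->cur_deadline : -1;
}
```

The handler's cost is proportional to the number of expired entries plus the sorted reinsertion, and the head of the queue always gives the next timer expiry without looking at the run queue at all.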

> > But I don't think I understand. When a vcpu runs out of budget,
>

Re: [Xen-devel] [PATCH v3 16/17] FDT: Add a helper to get specified name subnode

2016-01-26 Thread Stefano Stabellini

On Sat, 23 Jan 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Sometimes it needs to check if there is a node in FDT by full path.
> Introduce this helper to get the specified name subnode if it exists.
> 
> Signed-off-by: Shannon Zhao 
> ---
> CC: Rob Herring 
> ---
>  drivers/of/fdt.c   | 35 +++
>  include/linux/of_fdt.h |  2 ++
>  2 files changed, 37 insertions(+)
> 
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index 655f79d..112ec16 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -645,6 +645,41 @@ int __init of_scan_flat_dt(int (*it)(unsigned long node,
>  }
>  
>  /**
> + * of_get_flat_dt_subnode_by_name - get subnode of specified node by name
> + *
> + * @node: the parent node
> + * @uname: the name of subnode
> + * @return offset of the subnode, or -FDT_ERR_NOTFOUND if there is none
> + */
> +
> +int of_get_flat_dt_subnode_by_name(unsigned long node, const char *uname)
> +{
> + const void *blob = initial_boot_params;
> + int offset;
> + const char *pathp;
> +
> + /* Find first subnode if it exists */
> + offset = fdt_first_subnode(blob, node);
> + if (offset < 0)
> + return -FDT_ERR_NOTFOUND;
> + pathp = fdt_get_name(blob, offset, NULL);
> + if (strncmp(pathp, uname, strlen(uname)) == 0)
> + return offset;

Wouldn't this check succeed even if uname is "uefi" and the node
name is actually longer and merely starts with "uefi"?  You might have to
use strcmp.
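
The difference between the two comparisons can be seen in isolation. This is a minimal sketch; the helper names and the node name "uefi-extra" used to exercise it are invented for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* The check from the patch: true whenever 'name' merely starts with 'uname'. */
static bool name_matches_prefix(const char *name, const char *uname)
{
    return strncmp(name, uname, strlen(uname)) == 0;
}

/* Exact comparison: true only if the two names are identical. */
static bool name_matches_exact(const char *name, const char *uname)
{
    return strcmp(name, uname) == 0;
}
```

With uname set to "uefi", both helpers accept a node called "uefi", but only the prefix variant also accepts "uefi-extra".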


> + /* Find other subnodes */
> + do {
> + offset = fdt_next_subnode(blob, offset);
> + if (offset < 0)
> + return -FDT_ERR_NOTFOUND;
> + pathp = fdt_get_name(blob, offset, NULL);
> + if (strncmp(pathp, uname, strlen(uname)) == 0)
> + return offset;
> + } while (offset >= 0);

Rather than writing the name check twice, I think it would be best to
code this loop as:

for (offset = fdt_first_subnode(blob, node);
     offset >= 0;
     offset = fdt_next_subnode(blob, offset)) {

     /* do name check */
}



> + return -FDT_ERR_NOTFOUND;
> +}
> +
> +/**
>   * of_get_flat_dt_root - find the root node in the flat blob
>   */
>  unsigned long __init of_get_flat_dt_root(void)
> diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
> index df9ef38..fc28162 100644
> --- a/include/linux/of_fdt.h
> +++ b/include/linux/of_fdt.h
> @@ -52,6 +52,8 @@ extern char __dtb_end[];
>  extern int of_scan_flat_dt(int (*it)(unsigned long node, const char *uname,
>int depth, void *data),
>  void *data);
> +extern int of_get_flat_dt_subnode_by_name(unsigned long node,
> +   const char *uname);
>  extern const void *of_get_flat_dt_prop(unsigned long node, const char *name,
>  int *size);
>  extern int of_flat_dt_is_compatible(unsigned long node, const char *name);
> -- 
> 2.0.4
> 
> 

Re: [Xen-devel] [PATCH v5 1/8] Kconfig: import kconfig.h from Linux 4.3

2016-01-26 Thread Shannon Zhao



On 2016/1/25 22:35, Jan Beulich wrote:
 On 23.01.16 at 09:00,  wrote:
>> > --- a/xen/include/xen/config.h
>> > +++ b/xen/include/xen/config.h
>> > @@ -7,7 +7,7 @@
>> >  #ifndef __XEN_CONFIG_H__
>> >  #define __XEN_CONFIG_H__
>> >  
>> > -#include 
>> > +#include 
> Why? I don't see why all source files need to include this new
> header, no matter whether they make use of any of the
> definitions therein.
Will fix this. Thanks.

-- 
Shannon


Re: [Xen-devel] [PATCH RFC 1/6] public/xen.h: add flags field to vcpu_time_info

2016-01-26 Thread Joao Martins



On 01/25/2016 08:11 PM, Konrad Rzeszutek Wilk wrote:
> On Mon, Dec 28, 2015 at 04:59:40PM +0000, Joao Martins wrote:
>> This field has two possible flags (as of latest pvclock ABI
>> shared with KVM).
> 
> 
> 
> Wish they had CC-ed xen-devel instead of just doing their
> change
Indeed, Andrew was suggesting that an entry could perhaps be added to the
maintainers file with xen-devel, to avoid situations like this.


>>
>> flags: bits in this field indicate extended capabilities
>> coordinated between the guest and the hypervisor.  Specifically
>> on KVM, availability of specific flags has to be checked in
>> 0x40000001 cpuid leaf. On Xen, we don't have that but we can
>> still check some of the flags after registering the time info
>> page since a force_update_vcpu_system_time is performed.
>>
>> Current flags are:
>>
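>>  flag bit   | cpuid bit    | meaning
>> -------------------------------------------------------------
>>             |              | time measures taken across
>>      0      |      24      | multiple cpus are guaranteed to
>>             |              | be monotonic
>> -------------------------------------------------------------
>>             |              | guest vcpu has been paused by
>>      1      |     N/A      | the host
>>             |              |
>> -------------------------------------------------------------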
>>
>> Signed-off-by: Joao Martins 
> 
> Reviewed-by: Konrad Rzeszutek Wilk 
Thanks!
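
As a rough illustration of how a guest might consume the new field, the sketch below tests the two flag bits. The flag names and the tail of the structure follow the quoted patch; the leading fields are reproduced from memory of the public header and should be checked against it, and the two helper functions are hypothetical. A real reader would also have to follow the version-based retry protocol, which is omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

#define PVCLOCK_TSC_STABLE_BIT  (1 << 0)
#define PVCLOCK_GUEST_STOPPED   (1 << 1)

/* Layout after the patch: 'flags' takes the first of the three pad bytes. */
struct vcpu_time_info {
    uint32_t version;
    uint32_t pad0;
    uint64_t tsc_timestamp;
    uint64_t system_time;
    uint32_t tsc_to_system_mul;
    int8_t   tsc_shift;
    int8_t   flags;
    int8_t   pad1[2];
}; /* 32 bytes */

/* Hypothetical helper: are readings monotonic across cpus? */
static bool pvclock_tsc_stable(const struct vcpu_time_info *t)
{
    return (t->flags & PVCLOCK_TSC_STABLE_BIT) != 0;
}

/* Hypothetical helper: has the host paused this vcpu? */
static bool pvclock_guest_stopped(const struct vcpu_time_info *t)
{
    return (t->flags & PVCLOCK_GUEST_STOPPED) != 0;
}
```

Because the flag byte reuses existing padding, the structure stays at 32 bytes and older guests that ignore the byte are unaffected.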

>> ---
>>  xen/include/public/xen.h | 6 +-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
>> index ff5547e..1223686 100644
>> --- a/xen/include/public/xen.h
>> +++ b/xen/include/public/xen.h
>> @@ -601,10 +601,14 @@ struct vcpu_time_info {
>>   */
>>  uint32_t tsc_to_system_mul;
>>  int8_t   tsc_shift;
>> -int8_t   pad1[3];
>> +int8_t   flags;
>> +int8_t   pad1[2];
>>  }; /* 32 bytes */
>>  typedef struct vcpu_time_info vcpu_time_info_t;
>>  
>> +#define PVCLOCK_TSC_STABLE_BIT  (1 << 0)
>> +#define PVCLOCK_GUEST_STOPPED   (1 << 1)
>> +
>>  struct vcpu_info {
>>  /*
>>   * 'evtchn_upcall_pending' is written non-zero by Xen to indicate
>> -- 
>> 2.1.4
>>
>>

Re: [Xen-devel] [PATCH RFC 3/6] x86/time: streamline platform time init on plt_init()

2016-01-26 Thread Joao Martins



On 01/25/2016 08:26 PM, Konrad Rzeszutek Wilk wrote:
>> +if ( clocksource_is_tsc )
>> +{
>> +plt_init();
>> +}
>> +else
>> +{
>> +plt_overflow_period = scale_delta(
>> +1ull << (pts->counter_bits-1), &plt_scale);
>> +init_timer(&plt_overflow_timer, plt_overflow, NULL, 0);
>> +plt_overflow(NULL);
>> +
>> +printk("Platform timer overflow period is %lu secs\n",
>> +   plt_overflow_period/1000000000);
> 
> s/1000000000/SECONDS(1) ?
> 
Yeah, looks much better that way.
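
For reference, the arithmetic behind the printed value can be sketched stand-alone. The SECONDS() definition below is written from memory of Xen's time header and should be treated as an assumption, and overflow_period_ns() is a made-up helper that replaces scale_delta() with plain integer division.

```c
#include <stdint.h>

typedef int64_t s_time_t;                       /* nanoseconds */
#define SECONDS(_s) ((s_time_t)((_s) * 1000000000ULL))

/*
 * Time for half of an N-bit counter's range to elapse, in nanoseconds,
 * for a counter running at 'hz' ticks per second.
 */
static s_time_t overflow_period_ns(unsigned int counter_bits, uint64_t hz)
{
    uint64_t ticks = 1ULL << (counter_bits - 1);

    /* Split the division so the intermediate products stay small for
     * typical platform timer widths and rates. */
    return (s_time_t)((ticks / hz) * 1000000000ULL +
                      (ticks % hz) * 1000000000ULL / hz);
}
```

Dividing the result by SECONDS(1) rather than by a bare literal keeps the unit conversion readable, which is the point of the suggestion above.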

>> +}
>>  
>>  platform_timer_stamp = plt_stamp64;
>>  stime_platform_stamp = NOW();
>> -- 
>> 2.1.4
>>
>>

Re: [Xen-devel] [PATCH v2 3/3] tools: introduce parameter max_wp_ram_ranges.

2016-01-26 Thread Jan Beulich

>>> On 26.01.16 at 08:32,  wrote:
> On 1/22/2016 4:01 PM, Jan Beulich wrote:
> On 22.01.16 at 04:20,  wrote:
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -940,6 +940,10 @@ static int hvm_ioreq_server_alloc_rangesets(struct
>>> hvm_ioreq_server *s,
>>>   {
>>>   unsigned int i;
>>>   int rc;
>>> +unsigned int max_wp_ram_ranges =
>>> +( s->domain->arch.hvm_domain.params[HVM_PARAM_MAX_WP_RAM_RANGES] > 
>>> 0 ) ?
>>> +s->domain->arch.hvm_domain.params[HVM_PARAM_MAX_WP_RAM_RANGES] :
>>> +MAX_NR_IO_RANGES;
>>
>> Besides this having stray blanks inside the parentheses it truncates
>> the value from 64 to 32 bits and would benefit from using the gcc
>> extension of omitting the middle operand of ?:. But even better
>> would imo be if you avoided the local variable and ...
>>
> On second thought, how about we define a default value for this
> parameter in libx.h, and initialize the parameter when creating the
> domain with default value if it's not configured.

No, I don't think the tool stack should be determining the default
here (unless you want the default to be zero, and have zero
indeed mean zero).

> About this local variable, we keep it, and ...
> 
>>> @@ -962,7 +966,10 @@ static int hvm_ioreq_server_alloc_rangesets(struct 
> hvm_ioreq_server *s,
>>>   if ( !s->range[i] )
>>>   goto fail;
>>>
>>> -rangeset_limit(s->range[i], MAX_NR_IO_RANGES);
>>> +if ( i == HVMOP_IO_RANGE_WP_MEM )
>>> +rangeset_limit(s->range[i], max_wp_ram_ranges);
>>> +else
>>> +rangeset_limit(s->range[i], MAX_NR_IO_RANGES);
>>
>> ... did the entire computation here, using ?: for the second argument
>> of the function invocation.
>>
> ... replace the if/else pair with sth. like:
>  rangeset_limit(s->range[i],
> ((i == HVMOP_IO_RANGE_WP_MEM)?
>  max_wp_ram_ranges:
>  MAX_NR_IO_RANGES));
> This 'max_wp_ram_ranges' has no particular use, but the string
> "s->domain->arch.hvm_domain.params[HVM_PARAM_MAX_WP_RAM_RANGES] "
> is too lengthy, and can easily break the 80 column limit. :)
> Does this approach sound OK? :)

Seems better than the original, so okay.
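
The limit selection being discussed, together with the 64-to-32-bit truncation concern raised earlier in the review, can be sketched as a small helper. The enum, the value 256 for MAX_NR_IO_RANGES and the decision to clamp oversized values are all assumptions made for this illustration, not what the patch does.

```c
#include <limits.h>
#include <stdint.h>

#define MAX_NR_IO_RANGES 256   /* illustrative value for this sketch */

enum range_type {              /* toy stand-ins for the HVMOP_IO_RANGE_* ids */
    RANGE_PORT,
    RANGE_MEMORY,
    RANGE_PCI,
    RANGE_WP_MEM,
};

/*
 * Pick the rangeset limit for one range type. 'wp_param' is the raw 64-bit
 * parameter; zero means "not configured". Values that do not fit in an
 * unsigned int are clamped instead of being silently truncated.
 */
static unsigned int range_limit(enum range_type type, uint64_t wp_param)
{
    if (type != RANGE_WP_MEM || wp_param == 0)
        return MAX_NR_IO_RANGES;
    return wp_param > UINT_MAX ? UINT_MAX : (unsigned int)wp_param;
}
```

Without the clamp, a parameter with only high bits set would convert to 0 where unsigned int is 32 bits wide, which is the kind of surprise the truncation remark points at.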

>>> @@ -6009,6 +6016,7 @@ static int hvm_allow_set_param(struct domain *d,
>>>   case HVM_PARAM_IOREQ_SERVER_PFN:
>>>   case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
>>>   case HVM_PARAM_ALTP2M:
>>> +case HVM_PARAM_MAX_WP_RAM_RANGES:
>>>   if ( value != 0 && a->value != value )
>>>   rc = -EEXIST;
>>>   break;
>>
>> Is there a particular reason you want this limit to be unchangeable
>> after having got set once?
>>
> Well, not exactly. :)
> I added this limit because by now we do not have any approach to
> change the max range numbers inside ioreq server during run-time.
> I can add another patch to introduce an xl command, which can change
> it dynamically. But I doubt the necessity of this new command and
> also wonder if this new command would cause more confusion for
> the user...

And I didn't say you need to expose this to the user. All I asked
was whether you really mean the value to be a set-once one. If
yes, the code above is fine. If no, the code above should be
changed, but there's then still no need to expose a way to
"manually" adjust the value until a need for such arises.
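
The set-once behaviour in question boils down to a few lines. The helper below is a hypothetical stand-alone rendering of that rule, not the hypervisor code itself.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Set-once semantics: a parameter may be written while it is still zero,
 * and re-writing the same value is harmless, but changing a non-zero
 * value is refused with -EEXIST.
 */
static int set_once_param(uint64_t *cur, uint64_t new_value)
{
    if (*cur != 0 && *cur != new_value)
        return -EEXIST;
    *cur = new_value;
    return 0;
}
```

Making the value adjustable later would amount to dropping the -EEXIST branch for this one parameter; no user-facing command is needed for that.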

Jan


Re: [Xen-devel] [PATCH v11 2/3] Differentiate IO/mem resources tracked by ioreq server

2016-01-26 Thread Jan Beulich

>>> On 26.01.16 at 08:59,  wrote:

> 
> On 1/22/2016 7:43 PM, Jan Beulich wrote:
> On 22.01.16 at 04:20,  wrote:
>>> @@ -2601,6 +2605,16 @@ struct hvm_ioreq_server 
> *hvm_select_ioreq_server(struct domain *d,
>>>   type = (p->type == IOREQ_TYPE_PIO) ?
>>>   HVMOP_IO_RANGE_PORT : HVMOP_IO_RANGE_MEMORY;
>>>   addr = p->addr;
>>> +if ( type == HVMOP_IO_RANGE_MEMORY )
>>> +{
>>> + ram_page = get_page_from_gfn(d, p->addr >> PAGE_SHIFT,
>>> +  , P2M_UNSHARE);
>>
>> It seems to me like I had asked before: Why P2M_UNSHARE instead
>> of just P2M_QUERY? (This could surely be fixed up while committing,
>> the more that I've already done some cleanup here, but I'd like to
>> understand this before it goes in.)
>>
> Hah, sorry for my bad memory. :)
> I did not find P2M_QUERY; only P2M_UNSHARE and P2M_ALLOC are
> defined. But after reading the code in ept_get_entry(), I guess the
> P2M_UNSHARE is not accurate, maybe I should use 0 here for the
> p2m_query_t parameter in get_page_from_gfn()?

Ah, sorry for the misnamed suggestion. I'm not sure whether using
zero here actually matches your needs; P2M_UNSHARE though
seems odd in any case, so at least switching to P2M_ALLOC (to
populate PoD pages) would seem to be necessary.

>>> @@ -2642,6 +2656,11 @@ struct hvm_ioreq_server 
>>> *hvm_select_ioreq_server(struct domain *d,
>>>   }
>>>
>>>   break;
>>> +case HVMOP_IO_RANGE_WP_MEM:
>>> +if ( rangeset_contains_singleton(r, PFN_DOWN(addr)) )
>>> +return s;
>>
>> Considering you've got p2m_mmio_write_dm above - can this
>> validly return false here?
> 
> Well, if we have multiple ioreq servers defined, it will...

Ah, right. That's fine then.
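
Why the lookup can miss with several ioreq servers is easy to see in a toy model: each server only claims its own ranges, so a write-protected page claimed by one server is absent from the others. The types and function names below are invented for this sketch, and the 4 KiB page size is assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PFN_DOWN(addr) ((addr) >> TOY_PAGE_SHIFT)

struct toy_range {
    uint64_t start, end;             /* inclusive range of page frame numbers */
};

struct toy_server {
    int id;
    const struct toy_range *wp_mem;  /* write-protected ranges this server claimed */
    size_t nr_ranges;
};

/* Does any range claimed by 's' contain 'pfn'? */
static int server_claims_pfn(const struct toy_server *s, uint64_t pfn)
{
    for (size_t i = 0; i < s->nr_ranges; i++)
        if (pfn >= s->wp_mem[i].start && pfn <= s->wp_mem[i].end)
            return 1;
    return 0;
}

/* Return the first server that claimed the page containing 'addr', or NULL. */
static const struct toy_server *select_server(const struct toy_server *servers,
                                              size_t nr_servers, uint64_t addr)
{
    for (size_t i = 0; i < nr_servers; i++)
        if (server_claims_pfn(&servers[i], TOY_PFN_DOWN(addr)))
            return &servers[i];
    return NULL;
}
```

The per-server check returning false simply moves the search on to the next server.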

Jan


Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-26 Thread Will Deacon

On Tue, Jan 26, 2016 at 11:32:00AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 26, 2016 at 11:24:02AM +0100, Peter Zijlstra wrote:
> 
> > Yeah, this goes under the header: memory-barriers.txt is _NOT_ a
> > specification (I seem to keep repeating this).
> 
> Do we want this ?
> 
> ---
>  Documentation/memory-barriers.txt | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/Documentation/memory-barriers.txt 
> b/Documentation/memory-barriers.txt
> index a61be39c7b51..433326ebdc26 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -1,3 +1,4 @@
> +
>
>LINUX KERNEL MEMORY BARRIERS
>
> @@ -5,6 +6,22 @@
>  By: David Howells 
>  Paul E. McKenney 
>  
> +==
> +DISCLAIMER
> +==
> +
> +This document is not a specification; it is intentionally (for the sake of
> +brevity) and unintentionally (due to being human) incomplete. This document 
> is
> +meant as a guide to using the various memory barriers provided by Linux, but
> +in case of any doubt (and there are many) please ask.

It might be worth adding you and me to the top of the file, to save Paul
Cc'ing us on questions (get_maintainer.pl points at poor old Corbet for
this file).

But yes, it seems that something like this is required.

Will

Re: [Xen-devel] [PATCH v3 17/17] Xen: EFI: Parse DT parameters for Xen specific UEFI

2016-01-26 Thread Stefano Stabellini

On Sat, 23 Jan 2016, Shannon Zhao wrote:
> From: Shannon Zhao 
> 
> Add a new function to parse DT parameters for Xen specific UEFI just
> like the way for normal UEFI. Then it could reuse the existing codes.
> 
> If Xen supports EFI, initialize runtime services.
> 
> Signed-off-by: Shannon Zhao 

Reviewed-by: Stefano Stabellini 


> CC: Matt Fleming 
> ---
>  arch/arm/xen/enlighten.c   |  6 ++
>  arch/arm64/kernel/efi.c| 17 -
>  drivers/firmware/efi/efi.c | 45 ++---
>  3 files changed, 56 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index cdc0bd2..608d735 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -245,6 +245,12 @@ static int __init fdt_find_hyper_node(unsigned long 
> node, const char *uname,
>   !strncmp(hyper_node.prefix, s, strlen(hyper_node.prefix)))
>   hyper_node.version = s + strlen(hyper_node.prefix);
>  
> + if (IS_ENABLED(CONFIG_XEN_EFI)) {
> + /* Check if Xen supports EFI */
> + if (of_get_flat_dt_subnode_by_name(node, "uefi") > 0)
> + set_bit(EFI_PARAVIRT, &efi.flags);
> + }
> +
>   return 0;
>  }
>  
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index 4eeb171..3c46129 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -33,6 +33,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  struct efi_memory_map memmap;
>  
> @@ -308,13 +309,19 @@ static int __init arm64_enable_runtime_services(void)
>   }
>   set_bit(EFI_SYSTEM_TABLES, &efi.flags);
>  
> - if (!efi_virtmap_init()) {
> - pr_err("No UEFI virtual mapping was installed -- runtime 
> services will not be available\n");
> - return -ENOMEM;
> + if (IS_ENABLED(CONFIG_XEN_EFI) && efi_enabled(EFI_PARAVIRT)) {
> + /* Set up runtime services function pointers for Xen Dom0 */
> + xen_efi_runtime_setup();
> + } else {
> + if (!efi_virtmap_init()) {
> + pr_err("No UEFI virtual mapping was installed -- 
> runtime services will not be available\n");
> + return -ENOMEM;
> + }
> +
> + /* Set up runtime services function pointers */
> + efi_native_runtime_setup();
>   }
>  
> - /* Set up runtime services function pointers */
> - efi_native_runtime_setup();
>   set_bit(EFI_RUNTIME_SERVICES, );
>  
>   efi.runtime_version = efi.systab->hdr.revision;
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 027ca21..bdcf6d7 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -498,12 +498,14 @@ device_initcall(efi_load_efivars);
>   FIELD_SIZEOF(struct efi_fdt_params, field) \
>   }
>  
> -static __initdata struct {
> +struct params {
>   const char name[32];
>   const char propname[32];
>   int offset;
>   int size;
> -} dt_params[] = {
> +};
> +
> +static struct params fdt_params[] __initdata = {
>   UEFI_PARAM("System Table", "linux,uefi-system-table", system_table),
>   UEFI_PARAM("MemMap Address", "linux,uefi-mmap-start", mmap),
>   UEFI_PARAM("MemMap Size", "linux,uefi-mmap-size", mmap_size),
> @@ -511,24 +513,45 @@ static __initdata struct {
>   UEFI_PARAM("MemMap Desc. Version", "linux,uefi-mmap-desc-ver", desc_ver)
>  };
>  
> +static struct params xen_fdt_params[] __initdata = {
> + UEFI_PARAM("System Table", "xen,uefi-system-table", system_table),
> + UEFI_PARAM("MemMap Address", "xen,uefi-mmap-start", mmap),
> + UEFI_PARAM("MemMap Size", "xen,uefi-mmap-size", mmap_size),
> + UEFI_PARAM("MemMap Desc. Size", "xen,uefi-mmap-desc-size", desc_size),
> + UEFI_PARAM("MemMap Desc. Version", "xen,uefi-mmap-desc-ver", desc_ver)
> +};
> +
>  struct param_info {
>   int found;
>   void *params;
> + struct params *dt_params;
> + int size;
>  };
>  
>  static int __init fdt_find_uefi_params(unsigned long node, const char *uname,
>  int depth, void *data)
>  {
>   struct param_info *info = data;
> + struct params *dt_params = info->dt_params;
>   const void *prop;
>   void *dest;
>   u64 val;
> - int i, len;
> + int i, len, offset;
>  
> - if (depth != 1 || strcmp(uname, "chosen") != 0)
> - return 0;
> + if (efi_enabled(EFI_PARAVIRT)) {
> + if (depth != 1 || strcmp(uname, "hypervisor") != 0)
> + return 0;
>  
> - for (i = 0; i < ARRAY_SIZE(dt_params); i++) {
> + offset = of_get_flat_dt_subnode_by_name(node, "uefi");
> + if (offset < 0)
> + return 0;
> + node = offset;
> + } else {
> + if (depth

[Xen-devel] [linux-mingo-tip-master test] 79067: regressions - FAIL

2016-01-26 Thread osstest service owner

flight 79067 linux-mingo-tip-master real [real]
http://logs.test-lab.xenproject.org/osstest/logs/79067/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-xsm       15 guest-localmigrate              fail REGR. vs. 60684
 test-amd64-amd64-libvirt-xsm  15 guest-saverestore.2             fail REGR. vs. 60684
 test-amd64-amd64-xl-multivcpu 15 guest-localmigrate              fail REGR. vs. 60684
 test-amd64-amd64-xl           15 guest-localmigrate              fail REGR. vs. 60684
 test-amd64-amd64-xl-credit2   15 guest-localmigrate              fail REGR. vs. 60684
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. vs. 60684
 test-amd64-amd64-pair         22 guest-migrate/dst_host/src_host fail REGR. vs. 60684
 test-amd64-amd64-libvirt      15 guest-saverestore.2             fail REGR. vs. 60684
 test-amd64-i386-rumpuserxen-i386 10 guest-start                  fail REGR. vs. 60684

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds      15 guest-localmigrate              fail REGR. vs. 60684
 test-amd64-amd64-libvirt-pair 22 guest-migrate/dst_host/src_host fail blocked in 60684
 test-amd64-i386-libvirt       15 guest-saverestore.2             fail blocked in 60684
 test-amd64-i386-xl-xsm        15 guest-localmigrate              fail blocked in 60684
 test-amd64-i386-libvirt-xsm   15 guest-saverestore.2             fail blocked in 60684
 test-amd64-i386-xl            15 guest-localmigrate              fail blocked in 60684
 test-amd64-i386-libvirt-pair  22 guest-migrate/dst_host/src_host fail blocked in 60684
 test-amd64-i386-pair          22 guest-migrate/dst_host/src_host fail blocked in 60684

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestore     fail never pass
 test-amd64-amd64-xl-pvh-amd   11 guest-start           fail never pass
 test-amd64-amd64-libvirt-xsm  12 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop     fail never pass
 test-amd64-amd64-libvirt      12 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1       fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd  11 migrate-support-check fail never pass
 test-amd64-i386-libvirt       12 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm   12 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1     fail never pass

version targeted for testing:
 linux                eec9b33534ab3ffd0b118a2b7988788ee58496e1
baseline version:
 linux                69f75ebe3b1d1e636c4ce0a0ee248edacc69cbe0

Last test of basis    60684  2015-08-13 04:21:46 Z  166 days
Failing since         60712  2015-08-15 18:33:48 Z  164 days  114 attempts
Testing same since    79067  2016-01-26 04:28:34 Z    0 days    1 attempts

jobs:
 build-amd64-xsm                                       pass
 build-i386-xsm                                        pass
 build-amd64                                           pass
 build-i386                                            pass
 build-amd64-libvirt                                   pass
 build-i386-libvirt                                    pass
 build-amd64-pvops                                     pass
 build-i386-pvops                                      pass
 build-amd64-rumpuserxen                               pass
 build-i386-rumpuserxen                                pass
 test-amd64-amd64-xl                                   fail
 test-amd64-i386-xl                                    fail
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm         pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm          pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm    pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm     pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm         pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm          pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm pass
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm  fail
 test-amd64-amd64-libvirt-xsm                          fail
 test-amd64-i386-libvirt-xsm                           fail
 test-amd64-amd64-xl-xsm                               fail
 test-amd64-i386-xl-xsm                                fail
 test-amd64-amd64-qemuu-nested-amd                     fail
 test-amd64-amd64-xl-pvh-amd                           fail
 test-amd64-i386-qemut-rhel6hvm-amd

Re: [Xen-devel] [PATCH v9 03/25] libxc/migration: Specification update for DIRTY_PFN_LIST records

2016-01-26 Thread Wen Congyang

On 01/27/2016 04:44 AM, Konrad Rzeszutek Wilk wrote:
>> + 0x000F: DIRTY_PFN_LIST
>> +
> 
> Perhaps make it part of the optional and prefix it with CHECKPOINT?

IIUC, optional record can be ignored, but this record cannot be ignored.

To Andrew Cooper:
Should I mark this record as optional record?

Thanks
Wen Congyang

> 
>> + 0x0010 - 0x7FFF: Reserved for future _mandatory_
>>   records.
>>  
>>   0x8000 - 0x: Reserved for future _optional_
> 
> 
> .
> 
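
The mandatory/optional distinction raised above can be sketched as a receiver-side check. This is a minimal illustration, not the libxc code: it assumes a 32-bit record type whose top bit marks optional records (the exact numeric ranges are cut off in the quoted text), and `REC_TYPE_OPTIONAL`, `rec_action` and `classify_record()` are names chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the top bit of a 32-bit type marks an optional record. */
#define REC_TYPE_OPTIONAL 0x80000000u

enum rec_action {
    REC_HANDLE,  /* receiver understands the record */
    REC_IGNORE,  /* unknown but optional: safe to skip */
    REC_FAIL     /* unknown and mandatory: the stream must be rejected */
};

/* Decide what a receiver does with a record, given the types it knows. */
static enum rec_action classify_record(uint32_t type,
                                       const uint32_t *known,
                                       unsigned int nr_known)
{
    assert(known != 0 || nr_known == 0);

    for (unsigned int i = 0; i < nr_known; i++)
        if (known[i] == type)
            return REC_HANDLE;

    return (type & REC_TYPE_OPTIONAL) ? REC_IGNORE : REC_FAIL;
}
```

Under these assumptions, a record that an older receiver must not silently skip has to sit in the mandatory range, which is the concern expressed in the reply.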





Re: [Xen-devel] [Intel-gfx] [Announcement] 2015-Q4 release of XenGT - a Mediated Graphics Passthrough Solution from Intel

2016-01-26 Thread Jike Song

Hi all,

We are pleased to announce another update of Intel GVT-g for Xen.

Intel GVT-g is a full GPU virtualization solution with mediated pass-through, 
starting from 4th generation Intel Core(TM) processors with Intel Graphics 
processors. A virtual GPU instance is maintained for each VM, with part of 
performance critical resources directly assigned. The capability of running 
native graphics driver inside a VM, without hypervisor intervention in 
performance critical paths, achieves a good balance among performance, feature, 
and sharing capability. Xen is currently supported on Intel Processor Graphics 
(a.k.a. XenGT).

Repositories
-

Kernel: https://github.com/01org/igvtg-kernel (2015q4-4.2.0 branch)
Xen: https://github.com/01org/igvtg-xen (2015q4-4.5 branch)
Qemu: https://github.com/01org/igvtg-qemu (xengt_public2015q4 branch)

This update consists of:

- 6th generation Intel Core Processor (code name: Skylake) is 
preliminarily supported in this release. Users can run multiple Windows 
/ Linux virtual machines simultaneously and switch the display among them.
- Backward compatibility with 4th generation Intel Core Processor 
(code name: Haswell) and 5th generation Intel Core Processor (code name: 
Broadwell).
- Kernel update from drm-intel 3.18.0 to drm-intel 4.2.0.

Known issues:
   - At least 2GB memory is suggested for a VM to run most 3D workloads.
   - Keymap might be incorrect in guest. Config file may need to explicitly 
specify "keymap='en-us'". Although it looks like the default value, earlier we 
saw the problem of wrong keymap code if it is not explicitly set.
   - The mouse pointer cannot be moved smoothly by default in a guest launched 
in VNC mode. The configuration file needs to explicitly specify "usb=1" to 
enable a USB bus, and "usbdevice='tablet'" to add a pointer device using 
absolute coordinates.
   - Running heavy 3D workloads in multiple guests for a couple of hours may 
cause stability issues.
   - There are still stability issues on Skylake.


Next update will be around early April, 2016.

GVT-g project portal: https://01.org/igvt-g
Please subscribe to the mailing list: https://lists.01.org/mailman/listinfo/igvt-g


More information about the background, architecture, and other aspects of 
Intel GVT-g can be found at:

https://01.org/igvt-g
https://www.usenix.org/conference/atc14/technical-sessions/presentation/tian

http://events.linuxfoundation.org/sites/events/files/slides/XenGT-Xen%20Summit-v7_0.pdf

http://events.linuxfoundation.org/sites/events/files/slides/XenGT-Xen%20Summit-REWRITE%203RD%20v4.pdf
https://01.org/xen/blogs/srclarkx/2013/graphics-virtualization-xengt


Note: The XenGT project should be considered a work in progress. As such it is 
not a complete product nor should it be considered one. Extra care should be 
taken when testing and configuring a system to use the XenGT project.


--
Thanks,
Jike

On 10/27/2015 05:25 PM, Jike Song wrote:
> Hi all,
> 
> We are pleased to announce another update of Intel GVT-g for Xen.
> 
> Intel GVT-g is a full GPU virtualization solution with mediated pass-through, 
> starting from 4th generation Intel Core(TM) processors with Intel Graphics 
> processors. A virtual GPU instance is maintained for each VM, with part of 
> performance critical resources directly assigned. The capability of running 
> native graphics driver inside a VM, without hypervisor intervention in 
> performance critical paths, achieves a good balance among performance, 
> feature, and sharing capability. Xen is currently supported on Intel 
> Processor Graphics (a.k.a. XenGT); and the core logic can be easily ported to 
> other hypervisors.
> 
> 
> Repositories
> 
>  Kernel: https://github.com/01org/igvtg-kernel (2015q3-3.18.0 branch)
>  Xen: https://github.com/01org/igvtg-xen (2015q3-4.5 branch)
>  Qemu: https://github.com/01org/igvtg-qemu (xengt_public2015q3 branch)
> 
> 
> This update consists of:
> 
>  - XenGT is now merged with KVMGT in unified repositories(kernel and 
> qemu), but currently
>different branches for qemu.  XenGT and KVMGT share same iGVT-g core 
> logic.
>  - fix sysfs/debugfs access seldom crash issue
>  - fix a BUG in XenGT I/O emulation logic
>  - improve 3d workload stability
> 
> Next update will be around early Jan, 2016.
> 
> 
> Known issues:
> 
>  - At least 2GB memory is suggested for VM to run most 3D workloads.
>  - Keymap might be incorrect in guest. Config file may need to explicitly 
> specify "keymap='en-us'". Although it looks like the default value, earlier 
> we saw the problem of wrong keymap code if it is not explicitly set.
>  - When using three monitors, doing hotplug between Guest pause/unpause 
> may not be able to lightup all monitors automatically. Some specific monitor 
> issues.
>  - Cannot move mouse pointer smoothly in guest by default launched by VNC 
> mode. Configuration file need to explicitly specify

Re: [Xen-devel] [PATCH v2 3/3] tools: introduce parameter max_wp_ram_ranges.

2016-01-26 Thread Yu, Zhang




On 1/26/2016 7:16 PM, David Vrabel wrote:

On 22/01/16 03:20, Yu Zhang wrote:

--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -962,6 +962,24 @@ FIFO-based event channel ABI support up to 131,071 event 
channels.
  Other guests are limited to 4095 (64-bit x86 and ARM) or 1023 (32-bit
  x86).

+=item

Re: [Xen-devel] [PATCH V6 4/5] xen/mm: Clean up pfec handling in gva_to_gfn

2016-01-26 Thread Han, Huaitong

On Tue, 2016-01-26 at 14:30 +, Tim Deegan wrote:
> Hi,
> 
> At 15:30 +0800 on 19 Jan (1453217458), Huaitong Han wrote:
> > At the moment, the pfec argument to gva_to_gfn has two functions:
> > 
> > * To inform guest_walk what kind of access is happening
> > 
> > * As a value to pass back into the guest in the event of a fault.
> > 
> > Unfortunately this is not quite treated consistently: the
> > hvm_fetch_*
> > function will "pre-clear" the PFEC_insn_fetch flag before calling
> > gva_to_gfn; meaning guest_walk doesn't actually know whether a
> > given
> > access is an instruction fetch or not.  This works now, but will
> > cause
> > issues when pkeys are introduced, since guest_walk will need to
> > know
> > whether an access is an instruction fetch even if it doesn't return
> > PFEC_insn_fetch.
> > 
> > Fix this by making a clean separation for in and out
> > functionalities
> > of the pfec argument:
> > 
> > 1. Always pass in the access type to gva_to_gfn
> > 
> > 2. Filter out inappropriate access flags before returning from
> > gva_to_gfn.
> 
> This seems OK.  But can you please:
>  - Add this new adjustment once, in paging_gva_to_gfn(), instead of
>adding it to each implementation; and
>  - Adjust the comment above the declaration of paging_gva_to_gfn() in
>paging.h to describe this new behaviour.
Although adding the adjustment in paging_gva_to_gfn would reduce code
duplication, adding it to each implementation is more readable, because
the other parts of pfec are handled in each implementation.
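
For comparison, the single-place variant under discussion might look roughly like the sketch below. It is not Xen code: the `PFEC_*` values mirror the x86 page-fault error code bits, while `walk_fn`, `toy_walk()`, `gva_to_gfn_filtered()` and the `guest_sees_fetch` flag are hypothetical stand-ins for the per-mode walkers and for whatever condition decides if the guest may see the instruction-fetch bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PFEC_PRESENT    (1u << 0)
#define PFEC_WRITE      (1u << 1)
#define PFEC_USER       (1u << 2)
#define PFEC_INSN_FETCH (1u << 4)

#define INVALID_GFN (~0ul)

/* Hypothetical per-mode walker: gets the full access type in *pfec. */
typedef unsigned long (*walk_fn)(unsigned long gva, uint32_t *pfec);

/* Toy walker: only the first 16 pages are mapped, identity style. */
static unsigned long toy_walk(unsigned long gva, uint32_t *pfec)
{
    unsigned long page = gva >> 12;

    assert(pfec != 0);
    if (page >= 16)
        return INVALID_GFN;   /* fault: *pfec keeps the access type */
    return page;
}

/* Common wrapper: "in" direction passes the real access type to the
 * walker, "out" direction filters flags the guest must not see. */
static unsigned long gva_to_gfn_filtered(walk_fn walk, unsigned long gva,
                                         uint32_t *pfec,
                                         bool guest_sees_fetch)
{
    unsigned long gfn = walk(gva, pfec);

    if (!guest_sees_fetch)
        *pfec &= ~PFEC_INSN_FETCH;
    return gfn;
}
```

The trade-off is the one stated above: the filter lives in one place, but it is separated from the rest of the pfec handling inside each walker.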

> 
> Cheers,
> 
> Tim.

Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-26 Thread Haozhong Zhang

On 01/26/16 14:32, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 26, 2016 at 09:34:13AM -0700, Jan Beulich wrote:
> > >>> On 26.01.16 at 16:57,  wrote:
> > > On 01/26/16 08:37, Jan Beulich wrote:
> > >> >>> On 26.01.16 at 15:44,  wrote:
> > >> >>  Last year at Linux Plumbers Conference I attended a session dedicated
> > >> >> to NVDIMM support. I asked the very same question and the INTEL guy
> > >> >> there told me there is indeed something like a partition table meant
> > >> >> to describe the layout of the memory areas and their contents.
> > >> > 
> > >> > It is described in details at pmem.io, look at  Documents, see
> > >> > http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf see Namespaces 
> > >> > section.
> > >> 
> > >> Well, that's about how PMEM and PBLK ranges get marked, but not
> > >> about how use of the space inside a PMEM range is coordinated.
> > >>
> > > 
> > > How a NVDIMM is partitioned into pmem and pblk is described by ACPI NFIT 
> > > table.
> > > Namespace to pmem is something like partition table to disk.
> > 
> > But I'm talking about sub-dividing the space inside an individual
> > PMEM range.
> 
> The namespaces are it.
>

Because only one persistent memory namespace is allowed per individual
pmem range, namespaces cannot be used to sub-divide it.

> Once you have done them you can mount the PMEM range under say /dev/pmem0
> and then put a filesystem on it (ext4, xfs) - and enable DAX support.
> The DAX just means that the FS will bypass the page cache and write directly
> to the virtual address.
> 
> then one can create giant 'dd' images on this filesystem and pass it
> to QEMU to .. expose as NVDIMM to the guest. Because it is a file - the blocks
> (or MFNs) for the contents of the file are most certainly discontingous.
>

Though the 'dd' image may occupy discontiguous MFNs on host pmem, we can map
them to contiguous guest PFNs.
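
A toy model of that mapping, assuming the file's blocks are described as a list of MFN extents (`struct extent` and `map_extents()` are hypothetical names, not an existing Xen or QEMU interface):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One physically contiguous run of host frames backing part of the file. */
struct extent {
    uint64_t first_mfn;
    uint64_t nr;
};

/* Fill pfn_to_mfn[] so that guest frame i is backed by the i-th frame of
 * the file, whatever host frame that happens to be.  Returns the number
 * of guest frames mapped, or 0 if the table is too small. */
static size_t map_extents(const struct extent *ext, size_t nr_ext,
                          uint64_t *pfn_to_mfn, size_t table_len)
{
    size_t pfn = 0;

    assert(ext != NULL && pfn_to_mfn != NULL);
    for (size_t i = 0; i < nr_ext; i++) {
        for (uint64_t j = 0; j < ext[i].nr; j++) {
            if (pfn == table_len)
                return 0;
            pfn_to_mfn[pfn++] = ext[i].first_mfn + j;
        }
    }
    return pfn;
}
```

The guest then sees frames 0..n-1 as one contiguous range even though the backing MFNs jump around.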


[Xen-devel] [PATCH 0/3] x86: xsave{c,s} fixes and adjustments

2016-01-26 Thread Jan Beulich

1: xstate: don't unintentionally clear compaction bit
2: adjust xsave structure attributes
3: xstate: fix fault behavior on XRSTORS

Signed-off-by: Jan Beulich 



Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-26 Thread Peter Zijlstra

On Tue, Jan 26, 2016 at 02:33:40PM -0800, Linus Torvalds wrote:

> If it turns out that some architecture does actually need a barrier
> between a read and a dependent write, then that will mean that
> 
>  (a) we'll have to make up a _new_ barrier, because
> "smp_read_barrier_depends()" is not that barrier. We'll presumably
> then have to make that new barrier part of "rcu_dereference()" and
> friends.
> 
>  (b) we will have found an architecture with even worse memory
> ordering semantics than alpha, and we'll have to stop castigating
> alpha for being the worst memory ordering ever.
> 
> but I sincerely hope that we'll never find that kind of broken architecture.

So for a moment it looked like MIPS wanted to equal or surpass Alpha in
this respect.

And Paul made the point that smp_read_barrier_depends() really should
be smp_acquire_barrier_depends() in that we rely on both dependent reads
and writes to be ordered against the initial pointer load.

Now, as you've made abundantly clear, Alpha does this, although it needs
the little extra help in the dependent read department.

The 'problem' is that someone seemed to have used our
Documentation/memory-barriers.txt as a specification for what hardware
is permitted and we require. And in that light Paul noted that
read_barrier_depends really should be considered an
acquire_barrier_depends and order both dependent reads and writes
against the (prior) read (if nothing else already does).

Now clearly, any sane architecture doesn't need anything like this, but
again our document doesn't seem to judge. That is, from reading the
document one can get the impression this is a perfectly fine thing to do.
Nowhere does our disdain for this thing show.
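
To make the "dependent read" and "dependent write" cases concrete, here is a small userspace sketch using C11 atomics. It is deliberately not the kernel primitives: an acquire load stands in for rcu_dereference()/smp_read_barrier_depends(), which is stronger than strictly needed, and `struct item`, `publish()` and `consume()` are names made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct item {
    int a;     /* payload, read through the pointer (dependent read)  */
    int seen;  /* written through the pointer (dependent write)       */
};

static struct item slot;
static _Atomic(struct item *) gp = NULL;

/* Writer: initialise the item, then publish the pointer with release
 * semantics so the initialisation is visible before the pointer is. */
static void publish(int v)
{
    slot.a = v;
    slot.seen = 0;
    atomic_store_explicit(&gp, &slot, memory_order_release);
}

/* Reader: load the pointer, then read and write through it.  Both
 * accesses depend on the loaded pointer and must be ordered after it. */
static int consume(void)
{
    struct item *p = atomic_load_explicit(&gp, memory_order_acquire);

    if (p == NULL)
        return -1;

    int v = p->a;   /* dependent read  */
    p->seen = 1;    /* dependent write */
    return v;
}
```

The argument above is about whether hardware may reorder either of the two accesses in consume() before the pointer load when no explicit barrier is present.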


Re: [Xen-devel] [PATCH v9 03/25] libxc/migration: Specification update for DIRTY_PFN_LIST records

2016-01-26 Thread Wen Congyang

On 01/27/2016 04:44 AM, Konrad Rzeszutek Wilk wrote:
>> + 0x000F: DIRTY_PFN_LIST
>> +
> 
> Perhaps make it part of the optional and prefix it with CHECKPOINT?

Will be fixed in the next version.

Thanks
Wen Congyang

> 
>> + 0x0010 - 0x7FFF: Reserved for future _mandatory_
>>   records.
>>  
>>   0x8000 - 0x: Reserved for future _optional_
> 
> 
> .
> 





Re: [Xen-devel] [PATCH v9 02/25] docs/libxl: Introduce COLO_CONTEXT to support migration v2 colo streams

2016-01-26 Thread Wen Congyang

On 01/27/2016 04:40 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 30, 2015 at 10:37:32AM +0800, Wen Congyang wrote:
>> It is the negotiation record for COLO.
>> Primary->Secondary:
>> control_id  0x: Secondary VM is out of sync, start a new 
>> checkpoint
>> Secondary->Primary:
>> 0x0001: Secondary VM is suspended
>> 0x0002: Secondary VM is ready
>> 0x0003: Secondary VM is resumed
>>
>> Signed-off-by: Wen Congyang 
>> Signed-off-by: Yang Hongyang 
>> ---
>>  docs/specs/libxl-migration-stream.pandoc | 25 +++--
>>  tools/libxl/libxl_sr_stream_format.h | 11 +++
>>  tools/python/xen/migration/libxl.py  |  9 +
>>  3 files changed, 43 insertions(+), 2 deletions(-)
>>
>> diff --git a/docs/specs/libxl-migration-stream.pandoc 
>> b/docs/specs/libxl-migration-stream.pandoc
>> index 2c97d86..5166d66 100644
>> --- a/docs/specs/libxl-migration-stream.pandoc
>> +++ b/docs/specs/libxl-migration-stream.pandoc
>> @@ -1,6 +1,6 @@
>>  % LibXenLight Domain Image Format
>>  % Andrew Cooper <>
>> -% Revision 1
>> +% Revision 2
>>  
>>  Introduction
>>  
>> @@ -119,7 +119,9 @@ type 0x: END
>>  
>>   0x0004: CHECKPOINT_END
>>  
>> - 0x0005 - 0x7FFF: Reserved for future _mandatory_
>> + 0x0005: CHECKPOINT_STATE
>> +
>> + 0x0006 - 0x7FFF: Reserved for future _mandatory_
> 
> This is in the 'mandatory' records. Should it be part of optional records?
> 
> Would this checkpoint state always present on non-COLO guest migration?

No. Will be fixed in the next version

Thanks
Wen Congyang

> 
> 
> 
> .
> 





Re: [Xen-devel] [qubes-devel] Re: pre Sandy bridge IOMMU support (gm45)

2016-01-26 Thread Thierry Laurion

On Tue, 26 Jan 2016 at 21:10, Thierry Laurion
wrote:

> I just tested freshly compiled xen.gz file produced from patched source,
> as recommended by ktempkin. (Previous post xen.diff attached file got
> applied to disable pmr).
>
> The same behavior was observed with iommu=no-igfx: when the net-vm tray icon
> gets rendered (corrupted graphics) and notifications are drawn on screen,
> the system hangs without logging any error.
>
> I will compile xen with debugging options.
>
> If you guys have any insight or people I should talk to, please advise. It
> would be greatly appreciated. :)
>
> Thierry
>
>
> On Sun, 24 Jan 2016 at 18:45, Marek Marczykowski-Górecki <
> marma...@invisiblethingslab.com> wrote:
>
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA256
>>
>> On Sun, Jan 24, 2016 at 06:21:05PM +, Thierry Laurion wrote:
>> > Hi devs!
>> >
>> > XEN devs:
>> > As per short discussion with ktemkin earlier in January in #xen:
>> >
>> > "ktemkin Jan 10, 2016 16:21:50
>> > This test patch did appear to make the system work, though:
>> > https://gist.github.com/ktemkin/0e81b93654ae800a5609
>> >
>> > ktemkin Jan 10, 2016 16:24:55
>> > Only real difference I see between that and the upstream behavior
>> (besides
>> > limiting things to dom0 so things weren't accidentally passed through)
>> is
>> > the call to disable_pmr on line 117 before aborting."
>> >
>> >
>> >
>> > Makes total sense to my early understanding, since it seems that it is
>> said
>> > that vt-d engine gets disabled, but disable_pmr(iommu) function is not
>> > called to enforce.
>> >
>> > What do you think?
>> >
>> > QUBES devs:
>> > I'm still trying to understand how to apply this patch to qubes_builder
>> to
>> > actually build a test iso or xen.gz image and report. All Qubes patches
>> > seem to be applied from git to local directory structure. Looking inside
>> > the code to understand how to generate the provided patch to git can
>> apply
>> > it to local chrooted environment when building. Any documentation you
>> could
>> > point me to would be greatly appreciated, as any feedback to actually
>> fix
>> > the issue stopping this laptop from being a nearly perfect candidate for
>> > Qubes.
>>
>> Actually for testing patched hypervisor, you can build xen the standard
>> way (http://wiki.xenproject.org/wiki/Compiling_Xen_From_Source). And
>> then copy just xen.gz. Qubes-specific patches are only for the
>> toolstack, not the hypervisor.
>>
>> But if you want to build full xen package, simply place patches
>> somewhere in qubes-builder/qubes-src/vmm-xen (patches.misc subdir?) and
>> add them to series.conf. Then execute "make vmm-xen" from qubes-builder
>> directory.
>>
>> >
>> > Thierry
>> >
>> > On Sat, 23 Jan 2016 at 02:37, Thierry Laurion <
>> thierry.laur...@gmail.com>
>> > wrote:
>> >
>> > > Hey devs,
>> > >
>> > > Thinkpad x200 p8600 laptops have vt-d, vt-x and tpm. They also have
>> intel
>> > > integrated graphics 4 Series (gm45 chipset), supported through i915
>> driver.
>> > >
>> > > In December, a fix got introduced to Xen 4.6 through iommu=no-igfx
>> switch.
>> > > Before that fix, it was impossible to boot xen without passing
>> iommu=0.
>> > >
>> > > With iommu=no-igfx passed on, Qubes boots xen, kernel, dom0 and domu
>> until
>> > > some graphic rendering is done from a domu to dom0 xserver.
>> > >
>> > > I'm trying to push forward IOMMU support of gm45 chipset here. The
>> problem
>> > > is between i915 and xen iommu support for sure, but there is no crash
>> or
>> > > interesting debugging information given on a serial console.
>> > >
>> > > Any dev help is welcome since that beast and t400 would be excellent
>> Qubes
>> > > candidates once that problem is fixed. I posted in December on the
>> list
>> > > just before Christmas but I guess the timing wasn't right;)
>> > >
>> > > Thanks for your help.
>> > > Thierry
>> > >
>> >
>>
>>
>>
>> - --
>> Best Regards,
>> Marek Marczykowski-Górecki
>> Invisible Things Lab
>> A: Because it messes up the order in which people normally read text.
>> Q: Why is top-posting such a bad thing?
>> -BEGIN PGP SIGNATURE-
>> Version: GnuPG v2
>>
>> iQEcBAEBCAAGBQJWpWIjAAoJENuP0xzK19csmBcH/jAkYioso8K0POq+hIPop9Ft
>> E9h0b964j/jaZsgqofmnZFj8ZA4zI/qr4mQEIuNdk+dUgN69awn/Ffa+/bxTtv0B
>> 7AnCv65s+xMAOn8YHIc/pcwmL1/FymK1NAoVdk4wWXdWhxOW1PdGp+OCvFGFpOd1
>> L0rWwuY+EAV1UnUmd4OyPBLVh4f5fFG7B4tXnd1LaZ18noeSOaJpj5/o55zuwpgC
>> Fx3CtxtAlMLOpu7W1S/MzC73aOajKpFwoaS4RAMD8/Wby3nvtgcBJ6jmBmmSdn/J
>> 9YUOxO9cflIKjKbqXmYZJFceK1CmGNYhYEjTI8m1K9e+ian3vWa3GOwEfBk1oIo=
>> =F+Eh
>> -END PGP SIGNATURE-
>>
>
Here is the debug trace of xen (compiled with debug options in Config.mk and
rules.mk as instructed here
)
when launched from grub2 with:

multiboot /xen-4.6.0-debug.gz placeholder console=none dom0_mem=min:1024M
dom0_mem=max:4096M console_timestamps=datems loglvl=all guest_loglvl=all
sync_console console_to_ring

Re: [Xen-devel] [PATCH v1 04/12] xen/hvmlite: Bootstrap HVMlite guest

2016-01-26 Thread Luis R. Rodriguez

On Jan 26, 2016 6:16 PM, "Luis R. Rodriguez"  wrote:
>
> On Tue, Jan 26, 2016 at 4:04 PM, Luis R. Rodriguez 
wrote:
> > You go:
> >
> > hvmlite_start_xen() -->
> > HVM stub
> > startup_64() | (startup_32()
>
> Hrm, does HVMlite work well with load_ucode_bsp(), note the patches to
> rebrand pv_enabled() to pv_legacy() or whatever, this PV type will not
> be legacy or crap / old, so we'd need a way to catch it if we should
> not use that code for this PV type. This begs the question, are you
> also sure other callers in startup_32() or startup_64() might be OK as
> well where previously guarded with pv_enabled() ?

Actually this call can't be used, and if early code used it prior to
setup_arch() it'd be a bug, as it's only properly set later. Vetting
all callers for correctness is still required though, and perhaps we do
need something to catch this PV type in early code such as this one if
we don't want it. From what I've gathered before on other bsp ucode, we
don't want ucode loaded for PV guest types through these mechanisms.

  Luis

[Xen-devel] [linux-3.14 baseline-only test] 38705: regressions - FAIL

2016-01-26 Thread Platform Team regression test user

This run is configured for baseline tests only.

flight 38705 linux-3.14 real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/38705/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl   19 guest-start/debian.repeat fail REGR. vs. 38511

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemut-winxpsp3  9 windows-install     fail blocked in 38511
 test-amd64-i386-rumpuserxen-i386 10 guest-start           fail like 38511
 test-amd64-amd64-rumpuserxen-amd64 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail like 38511
 test-amd64-amd64-xl-credit2  19 guest-start/debian.repeat fail like 38511

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 11 guest-start           fail never pass
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail never pass
 test-amd64-amd64-xl-pvh-amd   11 guest-start           fail never pass
 test-amd64-i386-libvirt       12 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm   12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt      12 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop     fail never pass
 test-amd64-amd64-libvirt-xsm  12 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop      fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop      fail never pass
 test-amd64-amd64-libvirt-vhd  11 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop     fail never pass
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail never pass

version targeted for testing:
 linux                e9977508d75a36c78c2167800bc9d19d174f7585
baseline version:
 linux                5d7b0fcc26d66db767a477574effc764022c19ac

Last test of basis    38511  2015-12-14 10:38:54 Z   43 days
Testing same since    38705  2016-01-26 17:25:42 Z    0 days    1 attempts


People who touched revisions under test:
  Aaro Koskinen 
  Alan Stern 
  Alexey Khoroshilov 
  Andrew Morton 
  Ben Hutchings 
  Benjamin Coddington 
  Bjørn Mork 
  Cong Wang 
  Daeho Jeong 
  Daniel Borkmann 
  Daniele Palmas 
  David Howells 
  David S. Miller 
  Dmitry Vyukov 
  Don Zickus 
  Eric Dumazet 
  Filipe Manana 
  Greg Kroah-Hartman 
  Hannes Frederic Sowa 
  Hobin Woo 
  James Morris 
  Jarod Wilson 
  Jason A. Donenfeld 
  Jason Wang 
  Jeff Layton 
  Jeff Layton 
  Johan Hovold 
  Jonas Jonsson 
  Junxiao Bi 
  Kamal Mostafa 
  Konstantin Shkolnyy 
  Linus Torvalds 
  lucien 
  Lukas Czerner 
  Marcelo Ricardo Leitner 
  Michal Hocko 
  Michal Kubecek 
  Michal Kubeček 
  Neil Horman 
  Nicolas Dichtel 
  Nikolay Aleksandrov 
  Pavel Machek 
  Prarit Bhargava 
  Rainer Weikusat 
  Sergei Shtylyov 
  Stefan Richter 
  Theodore Ts'o 
  Trond Myklebust 
  Vlad Yasevich 
  Vladislav Yasevich 
  Vladislav Yasevich 
  WANG Cong 
  Willem de Bruijn 
  Xin Long 
  Yevgeny Pats 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386

Re: [Xen-devel] [PATCH v2 3/3] tools: introduce parameter max_wp_ram_ranges.

2016-01-26 Thread Yu, Zhang




On 1/26/2016 7:00 PM, Jan Beulich wrote:

On 26.01.16 at 08:32,  wrote:

On 1/22/2016 4:01 PM, Jan Beulich wrote:

On 22.01.16 at 04:20,  wrote:

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -940,6 +940,10 @@ static int hvm_ioreq_server_alloc_rangesets(struct
hvm_ioreq_server *s,
   {
   unsigned int i;
   int rc;
+unsigned int max_wp_ram_ranges =
+( s->domain->arch.hvm_domain.params[HVM_PARAM_MAX_WP_RAM_RANGES] > 0 ) 
?
+s->domain->arch.hvm_domain.params[HVM_PARAM_MAX_WP_RAM_RANGES] :
+MAX_NR_IO_RANGES;


Besides this having stray blanks inside the parentheses it truncates
the value from 64 to 32 bits and would benefit from using the gcc
extension of omitting the middle operand of ?:. But even better
would imo be if you avoided the local variable and ...


After second thought, how about we define a default value for this
parameter in libx.h, and initialize the parameter when creating the
domain with default value if it's not configured.


No, I don't think the tool stack should be determining the default
here (unless you want the default to be zero, and have zero
indeed mean zero).


Thank you, Jan.
If we do not provide a default value in tool stack, the code above
should be kept, to initialize the local variable with either the one
set in the configuration file, or with MAX_NR_IO_RANGES. Is this OK?


About this local variable, we keep it, and ...


@@ -962,7 +966,10 @@ static int hvm_ioreq_server_alloc_rangesets(struct

hvm_ioreq_server *s,

   if ( !s->range[i] )
   goto fail;

-rangeset_limit(s->range[i], MAX_NR_IO_RANGES);
+if ( i == HVMOP_IO_RANGE_WP_MEM )
+rangeset_limit(s->range[i], max_wp_ram_ranges);
+else
+rangeset_limit(s->range[i], MAX_NR_IO_RANGES);


... did the entire computation here, using ?: for the second argument
of the function invocation.


... replace the if/else pair with sth. like:
  rangeset_limit(s->range[i],
 ((i == HVMOP_IO_RANGE_WP_MEM)?
  max_wp_ram_ranges:
  MAX_NR_IO_RANGES));
This 'max_wp_ram_ranges' has no particular usages, but the string
"s->domain->arch.hvm_domain.params[HVM_PARAM_MAX_WP_RAM_RANGES]"
is too lengthy, and can easily break the 80 column limitation. :)
Does this approach sound OK? :)


Seems better than the original, so okay.


@@ -6009,6 +6016,7 @@ static int hvm_allow_set_param(struct domain *d,
   case HVM_PARAM_IOREQ_SERVER_PFN:
   case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
   case HVM_PARAM_ALTP2M:
+case HVM_PARAM_MAX_WP_RAM_RANGES:
   if ( value != 0 && a->value != value )
   rc = -EEXIST;
   break;


Is there a particular reason you want this limit to be unchangeable
after having got set once?


Well, not exactly. :)
I added this limit because by now we do not have any approach to
change the max range numbers inside ioreq server during run-time.
I can add another patch to introduce an xl command, which can change
it dynamically. But I doubt the necessity of this new command and
also wonder whether this new command would cause more confusion for
the user...


And I didn't say you need to expose this to the user. All I asked
was whether you really mean the value to be a set-once one. If
yes, the code above is fine. If no, the code above should be
changed, but there's then still no need to expose a way to
"manually" adjust the value until a need for such arises.



I see. The constraint is not necessary. And I'll remove this code. :)


Jan




B.R.
Yu
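
The limit selection discussed in this message can be sketched as below. This is a minimal illustration, not the patch itself: every `demo_`/`DEMO_` name and the value 256 are placeholders, and `wp_param` only models the 64-bit HVM parameter, with zero taken to mean "not configured". Keeping the value 64 bits wide avoids the truncation pointed out in the review.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real constants live in the Xen headers. */
#define DEMO_MAX_NR_IO_RANGES 256u

enum demo_range_type {
    DEMO_RANGE_PORT,
    DEMO_RANGE_MEMORY,
    DEMO_RANGE_PCI,
    DEMO_RANGE_WP_MEM,
    DEMO_NR_RANGE_TYPES
};

/*
 * Pick the rangeset limit for one range type.  Only the write-protected
 * memory type honours the configurable parameter; all other types keep
 * the fixed default.
 */
static uint64_t demo_range_limit(enum demo_range_type type, uint64_t wp_param)
{
    assert(type < DEMO_NR_RANGE_TYPES);

    if ( type != DEMO_RANGE_WP_MEM )
        return DEMO_MAX_NR_IO_RANGES;

    /* Standard C spelling of gcc's "wp_param ?: DEMO_MAX_NR_IO_RANGES". */
    return wp_param ? wp_param : DEMO_MAX_NR_IO_RANGES;
}
```

With this shape the caller needs neither a local variable nor an if/else pair: the result can be passed straight to the function that applies the limit.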


Re: [Xen-devel] [PATCH v11 2/3] Differentiate IO/mem resources tracked by ioreq server

2016-01-26 Thread Yu, Zhang




On 1/26/2016 7:24 PM, Jan Beulich wrote:

On 26.01.16 at 08:59,  wrote:




On 1/22/2016 7:43 PM, Jan Beulich wrote:

On 22.01.16 at 04:20,  wrote:

@@ -2601,6 +2605,16 @@ struct hvm_ioreq_server

*hvm_select_ioreq_server(struct domain *d,

   type = (p->type == IOREQ_TYPE_PIO) ?
   HVMOP_IO_RANGE_PORT : HVMOP_IO_RANGE_MEMORY;
   addr = p->addr;
+if ( type == HVMOP_IO_RANGE_MEMORY )
+{
+ ram_page = get_page_from_gfn(d, p->addr >> PAGE_SHIFT,
+  &p2mt, P2M_UNSHARE);


It seems to me like I had asked before: Why P2M_UNSHARE instead
of just P2M_QUERY? (This could surely be fixed up while committing,
the more that I've already done some cleanup here, but I'd like to
understand this before it goes in.)


Hah, sorry for my bad memory. :)
I did not find P2M_QUERY; only P2M_UNSHARE and P2M_ALLOC are
defined. But after reading the code in ept_get_entry(), I guess the
P2M_UNSHARE is not accurate, maybe I should use 0 here for the
p2m_query_t parameter in get_page_from_gfn()?


Ah, sorry for the misnamed suggestion. I'm not sure whether using
zero here actually matches your needs; P2M_UNSHARE though
seems odd in any case, so at least switching to P2M_ALLOC (to
populate PoD pages) would seem to be necessary.



Thanks, Jan.  :)
And now I believe we should use zero here. By now XenGT does not
support PoD and here all we care about is whether the p2m type of this
gfn is p2m_mmio_write_dm.


@@ -2642,6 +2656,11 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct 
domain *d,
   }

   break;
+case HVMOP_IO_RANGE_WP_MEM:
+if ( rangeset_contains_singleton(r, PFN_DOWN(addr)) )
+return s;


Considering you've got p2m_mmio_write_dm above - can this
validly return false here?


Well, if we have multiple ioreq servers defined, it will...


Ah, right. That's fine then.

Jan




B.R.
Yu
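
Why the per-server range check can fail even for a write-protected page can be sketched as below. This is an illustration under simplifying assumptions: each hypothetical `demo_server` tracks a single inclusive PFN range rather than a full rangeset, and none of the names come from the Xen sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical server record: one inclusive range of tracked page frames. */
struct demo_server {
    const char *name;
    uint64_t first_pfn;
    uint64_t last_pfn;
};

/*
 * Return the first server that tracks pfn, or NULL when none does.
 * With several servers, the check against any single server can fail
 * because the page may belong to a different one.
 */
static const struct demo_server *
demo_select_server(const struct demo_server *servers, size_t n, uint64_t pfn)
{
    assert(servers || n == 0);

    for ( size_t i = 0; i < n; i++ )
        if ( pfn >= servers[i].first_pfn && pfn <= servers[i].last_pfn )
            return &servers[i];

    return NULL;
}
```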


Re: [Xen-devel] [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc

2016-01-26 Thread Wen Congyang

On 01/26/2016 03:41 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 30, 2015 at 10:29:02AM +0800, Wen Congyang wrote:
>> In COLO mode, both VMs are running, and are considered in sync if the
>> visible network traffic is identical.  After some time, they fall out of
>> sync.
>>
>> At this point, the two VMs have definitely diverged.  Lets call the
>> primary dirty bitmap set A, while the secondary dirty bitmap set B.
>>
>> Sets A and B are different.
>>
>> Under normal migration, the page data for set A will be sent form the
> 
> s/form/from/
> 
>> primary to the secondary.
>>
>> However, the set difference B - A (lets call this C) is out-of-date on
>> the secondary (with respect to the primary) and will not be sent by the
>> primary, as it was not memory dirtied by the primary.  The secondary
> 
> s/primary/primary (to secondary)/
> 
>> needs the page data for C to reconstruct an exact copy of the primary at
> 
> s/the page data/C page data/
> 
>> the checkpoint.
>>
>> The secondary cannot calculate C as it doesn't know A.  Instead, the
>> secondary must send B to the primary, at which point the primary
>> calculates the union of A and B (lets call this D) which is all the
>> pages dirtied by both the primary and the secondary, and sends all page
>> data covered by D.
> 
> You could invert this - the primary could send A to secondary? I presume
> this non-optimal as the 'A' set is much much bigger than 'C' set?

'C' set is the one in 'B' set but not in 'A' set.

> 
> It may be good to include this in the commit description.
> 
>>
>> In the general case, D is a superset of both A and B.  Without the
>> backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
>> copy of the primary.
>>
>> We transfer the dirty bitmap on libxc side, so we need to introduce back
>> channel to libxc.
> 
>>
>> Note: it is different from the paper. We change the original design to
>> the current one, according to our following concerns:
>> 1. The original design needs extra memory on Secondary host. When there's
>>multiple backups on one host, the memory cost is high.
>> 2. The memory cache code will be another 1k+, it will make the review
>>more time consuming.
> 
> Well, that 2) is a very good reason :-)
>>
>> Signed-off-by: Yang Hongyang 
>> commit message:
> 
> ? Huh?

I don't know what it is. Will remove it in the next version.

> 
>> Signed-off-by: Andrew Cooper 
>> CC: Ian Campbell 
>> CC: Ian Jackson 
>> CC: Wei Liu 
> 
> .. snip..
>> index 05159bb..d4dc501 100644
>> --- a/tools/libxc/xc_sr_restore.c
>> +++ b/tools/libxc/xc_sr_restore.c
>> @@ -722,7 +722,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, 
>> uint32_t dom,
>>unsigned long *console_gfn, domid_t console_domid,
>>unsigned int hvm, unsigned int pae, int superpages,
>>int checkpointed_stream,
>> -  struct restore_callbacks *callbacks)
>> +  struct restore_callbacks *callbacks, int back_fd)
>>  {
>>  struct xc_sr_context ctx =
>>  {
>> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
>> index 8ffd71d..a49d083 100644
>> --- a/tools/libxc/xc_sr_save.c
>> +++ b/tools/libxc/xc_sr_save.c
>> @@ -824,7 +824,7 @@ static int save(struct xc_sr_context *ctx, uint16_t 
>> guest_type)
>>  int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
>> uint32_t max_iters, uint32_t max_factor, uint32_t flags,
>> struct save_callbacks* callbacks, int hvm,
>> -   int checkpointed_stream)
>> +   int checkpointed_stream, int back_fd)
>>  {
>>  struct xc_sr_context ctx =
>>  {
> 
> 
> But where is the code?
> 
> Or is that suppose to be done in another patch? If so you may want to
> mention that in the commit description?

Do you mean where is the code that uses back_fd? It is in another series:
http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02904.html

Thanks
Wen Congyang
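
The set arithmetic in the quoted commit message can be sketched as below, assuming the dirty bitmaps are plain arrays of 64-bit words with one bit per page. The `demo_` helpers are hypothetical and are not the libxc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* D = A | B: every page dirtied on either side has to be resent. */
static void demo_bitmap_union(uint64_t *d, const uint64_t *a,
                              const uint64_t *b, size_t nwords)
{
    assert(d && a && b);

    for ( size_t i = 0; i < nwords; i++ )
        d[i] = a[i] | b[i];
}

/* C = B - A: pages stale on the secondary that the primary never dirtied. */
static void demo_bitmap_diff(uint64_t *c, const uint64_t *b,
                             const uint64_t *a, size_t nwords)
{
    assert(c && a && b);

    for ( size_t i = 0; i < nwords; i++ )
        c[i] = b[i] & ~a[i];
}
```

Only the primary holds A, so only the primary can compute D once B arrives over the back channel. C is always a subset of D, which is why sending D is sufficient.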






Re: [Xen-devel] [PATCH v11 2/3] Differentiate IO/mem resources tracked by ioreq server

2016-01-26 Thread Yu, Zhang




On 1/22/2016 7:43 PM, Jan Beulich wrote:

On 22.01.16 at 04:20,  wrote:

@@ -2601,6 +2605,16 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct 
domain *d,
  type = (p->type == IOREQ_TYPE_PIO) ?
  HVMOP_IO_RANGE_PORT : HVMOP_IO_RANGE_MEMORY;
  addr = p->addr;
+if ( type == HVMOP_IO_RANGE_MEMORY )
+{
+ ram_page = get_page_from_gfn(d, p->addr >> PAGE_SHIFT,
+  &p2mt, P2M_UNSHARE);


It seems to me like I had asked before: Why P2M_UNSHARE instead
of just P2M_QUERY? (This could surely be fixed up while committing,
the more that I've already done some cleanup here, but I'd like to
understand this before it goes in.)


Hah, sorry for my bad memory. :)
I did not find P2M_QUERY; only P2M_UNSHARE and P2M_ALLOC are
defined. But after reading the code in ept_get_entry(), I guess the
P2M_UNSHARE is not accurate, maybe I should use 0 here for the
p2m_query_t parameter in get_page_from_gfn()?


+ if ( p2mt == p2m_mmio_write_dm )
+ type = HVMOP_IO_RANGE_WP_MEM;
+
+ if ( ram_page )
+ put_page(ram_page);
+}
  }

  list_for_each_entry ( s,
@@ -2642,6 +2656,11 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct 
domain *d,
  }

  break;
+case HVMOP_IO_RANGE_WP_MEM:
+if ( rangeset_contains_singleton(r, PFN_DOWN(addr)) )
+return s;


Considering you've got p2m_mmio_write_dm above - can this
validly return false here?


Well, if we have multiple ioreq servers defined, it will...
Currently, this p2m type is only used in XenGT, which has only one
ioreq server other than qemu for the vGPU. But suppose there will
be more devices using this type and more ioreq servers introduced
for them, it can return false.


Jan



B.R.
Yu


Re: [Xen-devel] [PATCH v3 14/17] XEN: EFI: Move x86 specific codes to architecture directory

2016-01-26 Thread Shannon Zhao



On 2016/1/26 0:44, Stefano Stabellini wrote:
> On Sat, 23 Jan 2016, Shannon Zhao wrote:
>> > From: Shannon Zhao 
>> > 
>> > Move x86 specific codes to architecture directory and export those EFI
>> > runtime service functions. This will be useful for initializing runtime
>> > service on ARM later.
>> > 
>> > Signed-off-by: Shannon Zhao 
> This patch causes a build breakage on x86:
> 
> arch/x86/xen/efi.c: In function ‘xen_efi_probe’:
> arch/x86/xen/efi.c:101:2: error: implicit declaration of function 
> ‘HYPERVISOR_platform_op’ [-Werror=implicit-function-declaration]
> 
This patch is based on following patch [1]. Maybe you need to update
your branch. :)

[1] xen: rename dom0_op to platform_op
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=cfafae940381207d48b11a73a211142dba5947d3

-- 
Shannon



Re: [Xen-devel] [PATCH V6 0/5] x86/hvm: pkeys, add memory protection-key support

2016-01-26 Thread Han, Huaitong

On Mon, 2016-01-25 at 08:25 -0700, Jan Beulich wrote:
> > > > On 19.01.16 at 08:30,  wrote:
> > Changes in v6:
> > *2 patches merged are not included.
> > *Don't write XSTATE_PKRU to PV's xcr0.
> > *Use "if()" instead of "?:" in cpuid handling patch.
> > *Update read_pkru function.
> > *Use value 4 instead of CONFIG_PAGING_LEVELS.
> > *Add George's patch for PFEC_insn_fetch handling.
> 
> How does this last item match up with ...

"At the moment PFEC_insn_fetch is only set in
hvm_fetch_from_guest_virt() if hvm_nx_enabled() or hvm_smep_enabled()
are true.  Which means that if you *don't* have nx or smep enabled,
then the patch series as is will fault on instruction fetches when it
shouldn't.  (I don't remember anyone mentioning nx or smep being
enabled as a prerequisite for pkeys.)"

I think realistically the only way to address this is to start making
the clean separation between "pfec in" and "pfec out" I mentioned in
the previous discussion.

I've coded up the attached patch, but only compile-tested it.  Can you
give it a look to see if you think it is correct, test it, include it
in your next patch series?

--from George's comments on V5 patches.

> 
> > Huaitong Han (5):
> >   x86/hvm: pkeys, disable pkeys for guests in non-paging mode
> >   x86/hvm: pkeys, add pkeys support for guest_walk_tables
> >   x86/hvm: pkeys, add xstate support for pkeys
> >   xen/mm: Clean up pfec handling in gva_to_gfn
> >   x86/hvm: pkeys, add pkeys support for cpuid handling
> 
> ... all five patches being yours?
I will update a patch author to George.
> 
> Jan
> 

Re: [Xen-devel] [PATCH v6 16/18] tools/libxl: store remus_ops in checkpoint device state

2016-01-26 Thread Wen Congyang

On 01/26/2016 03:55 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 30, 2015 at 10:29:06AM +0800, Wen Congyang wrote:
>> Checkpoint device is an abstract layer to do checkpoint.
>> COLO can also use it to do checkpoint. But there are
>> still some codes in checkpoint device which touch remus.
>>
>> This patch and the following 2 will seperate remus from
> 
> s/and the following 2/and:
> 
>  tools/libxl: move remus state into a seperate structure 
>  tools/libxl: seperate device init/cleanup from checkpoint device layer
> 
>> checkpoint device layer.
>>
>> We use remus ops directly in checkpoint device. Store it
>> in checkpoint device state so that we do not aware of
>> remus_ops in the checkpoint device layer.
>>
>> it is pure refactoring and no functional changes.
> s/it/It/
> 
>>
>> Signed-off-by: Wen Congyang 
>> Signed-off-by: Yang Hongyang 
>> Acked-by:Ian Campbell 
> 
> Reviewed-by: Konrad Rzeszutek Wilk 
> 
> with the changes I mentioned.

OK, will fix it in the next version.

Thanks
Wen Congyang

>> ---
>>  tools/libxl/libxl_checkpoint_device.c | 10 +-
>>  tools/libxl/libxl_internal.h  |  2 ++
>>  tools/libxl/libxl_remus.c |  9 +
>>  3 files changed, 12 insertions(+), 9 deletions(-)
>>
>> diff --git a/tools/libxl/libxl_checkpoint_device.c 
>> b/tools/libxl/libxl_checkpoint_device.c
>> index 226f159..bbc6dc4 100644
>> --- a/tools/libxl/libxl_checkpoint_device.c
>> +++ b/tools/libxl/libxl_checkpoint_device.c
>> @@ -17,14 +17,6 @@
>>  
>>  #include "libxl_internal.h"
>>  
>> -extern const libxl__checkpoint_device_instance_ops remus_device_nic;
>> -extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
>> -static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
>> -&remus_device_nic,
>> -&remus_device_drbd_disk,
>> -NULL,
>> -};
>> -
>>  /*- helper functions -*/
>>  
>>  static int init_device_subkind(libxl__checkpoint_devices_state *cds)
>> @@ -172,7 +164,7 @@ static void device_setup_iterate(libxl__egc *egc, 
>> libxl__ao_device *aodev)
>>  goto out;
>>  
>>  do {
>> -dev->ops = remus_ops[++dev->ops_index];
>> +dev->ops = dev->cds->ops[++dev->ops_index];
>>  if (!dev->ops) {
>>  libxl_device_nic * nic = NULL;
>>  libxl_device_disk * disk = NULL;
>> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>> index 5b99d6e..914ce94 100644
>> --- a/tools/libxl/libxl_internal.h
>> +++ b/tools/libxl/libxl_internal.h
>> @@ -2895,6 +2895,8 @@ struct libxl__checkpoint_devices_state {
>>  uint32_t domid;
>>  libxl__checkpoint_callback *callback;
>>  int device_kind_flags;
>> +/* The ops must be pointer array, and the last ops must be NULL */
> 
> s/NULL/NULL./
> 
>> +const libxl__checkpoint_device_instance_ops **ops;
>>  
>>  /*- private for abstract layer only -*/
>>  
>> diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
>> index d088dad..3375331 100644
>> --- a/tools/libxl/libxl_remus.c
>> +++ b/tools/libxl/libxl_remus.c
>> @@ -18,6 +18,14 @@
>>  
>>  #include "libxl_internal.h"
>>  
>> +extern const libxl__checkpoint_device_instance_ops remus_device_nic;
>> +extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
>> +static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
>> +&remus_device_nic,
>> +&remus_device_drbd_disk,
>> +NULL,
>> +};
>> +
>>  /* Remus setup and teardown -*/
>>  
>>  static void remus_setup_done(libxl__egc *egc,
>> @@ -50,6 +58,7 @@ void libxl__remus_setup(libxl__egc *egc,
>>  cds->ao = ao;
>>  cds->domid = dss->domid;
>>  cds->callback = remus_setup_done;
>> +cds->ops = remus_ops;
>>  
>>  dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>>  
>> -- 
>> 2.5.0
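
The refactoring in the patch above follows a common pattern: the generic layer walks a NULL-terminated array of ops pointers that its user supplies, instead of naming one user's table directly. A minimal sketch, with all `demo_` types and the integer device kinds invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device-type operations. */
struct demo_ops {
    const char *name;
    int (*match)(int kind);
};

static int demo_match_nic(int kind)  { return kind == 1; }
static int demo_match_disk(int kind) { return kind == 2; }

static const struct demo_ops demo_nic  = { "nic",  demo_match_nic };
static const struct demo_ops demo_disk = { "disk", demo_match_disk };

/* Owned by the user of the layer; the last entry must be NULL. */
static const struct demo_ops *demo_remus_ops[] = {
    &demo_nic,
    &demo_disk,
    NULL,
};

/* Generic layer state: it only sees the array it was handed. */
struct demo_state {
    const struct demo_ops **ops;
};

/* Walk the array until the NULL terminator and return the first match. */
static const struct demo_ops *
demo_find_ops(const struct demo_state *st, int kind)
{
    assert(st && st->ops);

    for ( const struct demo_ops **p = st->ops; *p; p++ )
        if ( (*p)->match(kind) )
            return *p;

    return NULL;
}
```

A second user of the layer would supply its own array in the same way, and the generic code would not need to change.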





Re: [Xen-devel] [PATCH v5 COLOPre 14/18] tools/libxl: fix backword compatibility after the automatic renaming

2016-01-26 Thread Wen Congyang

On 01/26/2016 03:43 AM, Konrad Rzeszutek Wilk wrote:
> On Thu, Dec 17, 2015 at 03:48:18PM +0800, Wen Congyang wrote:
>> From: Yang Hongyang 
>>
>> The error code ERROR_REMUS_XXX was introduced in Xen 4.5, and
>> changed to ERROR_CHECKPOINT_XXX after previous renaming.
>> The patch fix the backword compatibility.
>>
>> Signed-off-by: Yang Hongyang 
>> Signed-off-by: Wen Congyang 
>> ---
>>  tools/libxl/libxl.h | 13 +
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
>> index 67a4ad7..2a26ba2 100644
>> --- a/tools/libxl/libxl.h
>> +++ b/tools/libxl/libxl.h
>> @@ -883,6 +883,19 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, 
>> libxl_mac *src);
>>   */
>>  #define LIBXL_HAVE_CHECKPOINTED_STREAM 1
>>  
>> +/* Remus stuff */
> 
> /sRemus stuff//

Do you mean remove this line?

> 
> .. as it is obvious that this is remus related.
>> +/*
>> + * ERROR_REMUS_XXX error code only exists from Xen 4.5, and in Xen 4.6
> 
> s/4.6/4.7/

Will fix it in the next version.

> 
>> + * it is changed to ERROR_CHECKPOINT_XXX
>> + */
>> +#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040500 \
>> +   && LIBXL_API_VERSION < 0x040600
> 
> s/040600/040700/

Will fix it in the next version.

Thanks
Wen Congyang

> 
>> +#define ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
>> +ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
>> +#define ERROR_REMUS_DEVICE_NOT_SUPPORTED \
>> +ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
>> +#endif
>> +
>>  typedef char **libxl_string_list;
>>  void libxl_string_list_dispose(libxl_string_list *sl);
>>  int libxl_string_list_length(const libxl_string_list *sl);
>> -- 
>> 2.5.0
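
The version-gated compatibility macros in the patch above can be sketched as below. All `DEMO_` names and the numeric error values are placeholders chosen for illustration; the upper bound 0x040700 follows the change requested in the review, not the patch as posted.

```c
#include <assert.h>

/* Hypothetical: the API version the application was written against. */
#ifndef DEMO_API_VERSION
#define DEMO_API_VERSION 0x040500
#endif

/* New names; the values are arbitrary placeholders. */
enum demo_error {
    DEMO_ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH = -22,
    DEMO_ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED  = -23,
};

/* Old names stay available only to callers that ask for an old API version. */
#if defined(DEMO_API_VERSION) && DEMO_API_VERSION >= 0x040500 \
    && DEMO_API_VERSION < 0x040700
#define DEMO_ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
    DEMO_ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
#define DEMO_ERROR_REMUS_DEVICE_NOT_SUPPORTED \
    DEMO_ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
#endif

/* Code written against the old API keeps compiling unchanged. */
static int demo_old_caller(void)
{
    return DEMO_ERROR_REMUS_DEVICE_NOT_SUPPORTED;
}
```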





Re: [Xen-devel] Error booting Xen

2016-01-26 Thread Jan Beulich

>>> On 26.01.16 at 13:51,  wrote:
> I tried the patch and I am very happy to inform you
> all that the patch has solved my problem. Now I am
> able to boot Xen without disabling XSAVE. I have
> full log of boot at http://paste2.org/gVW0Z9nm (if
> any one is interested. also "XXX Hello, this is my
> first mod :)" is printed by my mod, so ignore that
> one).

Thanks for trying it out, but while I'm glad it helped I'm afraid
we're not done here. With (for every vCPU)

(XEN) traps.c:3290: GPF (): 82d0801c1d08 -> 82d080252e5c
(XEN) d0v1 fault#1: mxcsr=1f80
(XEN) d0v1 xs= xc=
(XEN) d0v1 r0= r1=
(XEN) d0v1 r2= r3=
(XEN) d0v1 r4= r5=
(XEN) traps.c:3290: GPF (): 82d0801c1d08 -> 82d080252e5c
(XEN) d0v1 fault#2: mxcsr=1f80
(XEN) d0v1 xs= xc=
(XEN) d0v1 r0= r1=
(XEN) d0v1 r2= r3=
(XEN) d0v1 r4= r5=

it continues to be unclear why bit 63 in the value printed as
xc= isn't set from the beginning. Or wait, I think I see where
the problem is. Here's a 3rd patch, to try together with the
other two. The expectation would be for the above log
messages to then disappear altogether. (And then the patch
should also work on its own, i.e. with the other two removed
again.) Please let us know.

Thanks, Jan

x86/xstate: don't unintentionally clear compaction bit

When the VGCF_I387_VALID flag is clear in arch_set_info_guest()'s input
we must not clear the compaction bit when using XSAVES/XRSTORS. Split
initialization of xcomp_bv from the other FPU/SSE/AVX related state
setup in this function.

Signed-off-by: Jan Beulich 

--- unstable.orig/xen/arch/x86/domain.c
+++ unstable/xen/arch/x86/domain.c
@@ -922,15 +922,10 @@ int arch_set_info_guest(
 {
 memcpy(v->arch.fpu_ctxt, &c.nat->fpu_ctxt, sizeof(c.nat->fpu_ctxt));
 if ( v->arch.xsave_area )
-{
 v->arch.xsave_area->xsave_hdr.xstate_bv = XSTATE_FP_SSE;
-v->arch.xsave_area->xsave_hdr.xcomp_bv =
-cpu_has_xsaves ? XSTATE_COMPACTION_ENABLED : 0;
-}
 }
 else if ( v->arch.xsave_area )
-memset(&v->arch.xsave_area->xsave_hdr, 0,
-   sizeof(v->arch.xsave_area->xsave_hdr));
+v->arch.xsave_area->xsave_hdr.xstate_bv = 0;
 else
 {
 typeof(v->arch.xsave_area->fpu_sse) *fpu_sse = v->arch.fpu_ctxt;
@@ -939,6 +934,11 @@ int arch_set_info_guest(
 fpu_sse->fcw = FCW_DEFAULT;
 fpu_sse->mxcsr = MXCSR_DEFAULT;
 }
+if ( cpu_has_xsaves )
+{
+ASSERT(v->arch.xsave_area);
+v->arch.xsave_area->xsave_hdr.xcomp_bv = XSTATE_COMPACTION_ENABLED;
+}
 
 if ( !compat )
 {
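
The idea behind the patch can be sketched in isolation as below. This is an illustration only: the `demo_` struct models just two header fields, and the constants assume FP and SSE are state components 0 and 1 and that the compaction flag is bit 63 of `xcomp_bv`, matching the discussion of bit 63 above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_XSTATE_FP_SSE      0x3ULL        /* components 0 (FP) and 1 (SSE) */
#define DEMO_COMPACTION_ENABLED (1ULL << 63)  /* compacted-format flag */

/* Hypothetical model of the two header fields involved. */
struct demo_xsave_hdr {
    uint64_t xstate_bv;
    uint64_t xcomp_bv;
};

static void demo_init_hdr(struct demo_xsave_hdr *hdr,
                          bool fpu_valid, bool has_xsaves)
{
    assert(hdr);

    /* Only the state-component bitmap depends on the caller's flag. */
    hdr->xstate_bv = fpu_valid ? DEMO_XSTATE_FP_SSE : 0;

    /*
     * The format bit depends only on the CPU feature, so it is set on
     * every path instead of being wiped by a memset() of the header.
     */
    if ( has_xsaves )
        hdr->xcomp_bv = DEMO_COMPACTION_ENABLED;
}
```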





Re: [Xen-devel] Error booting Xen

2016-01-26 Thread Jan Beulich

>>> On 26.01.16 at 14:13,  wrote:
> The patch as I already said is letting me boot
> into the Xen, but the system is now resetting
> stating XSAVE as the cause. I have attached
> links to two cases where system was reset as
> the result. I don't think that problem is fully
> solved yet.
> http://paste2.org/Ky56Z92g 
> http://paste2.org/3hcbG6L7 

I guess this would also get resolved by the 3rd patch just sent.

Jan



Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.

2016-01-26 Thread Xu, Quan

> On January 25, 2016 at 10:09pm,  wrote:
> >>> On 22.01.16 at 16:57,  wrote:
> >>  On January 22, 2016 at 12:31am,  wrote:
> >> >>> On 21.01.16 at 17:16,  wrote:
> >> >>  On January 20, 2016 at 7:29 pm,   wrote:
> >> >> >>> On 20.01.16 at 11:26,  wrote:
> >> >> >> On January 15, 2016 at 9:10,  wrote:
> >> >> >> >>> On 23.12.15 at 09:25,  wrote:
> >> >> >> Also note that unused slots hold zero, i.e. there's at least a
> >> >> >> theoretical
> >> >> > risk of
> >> >> >> problems here when you don't also look at
> >> >> >> iommu->domid_bitmap.
> >> >> >>
> >> >> > I am not clear how to fix this point. Do you have good idea?
> >> >> > Add a lock to 'iommu->domid_bitmap'?
> >> >>
> >> >> How would a lock help avoiding mistaking an unused slot to mean Dom0?
> >> >> As already suggested - I think you simply need to consult the
> >> >> bitmap along with
> >> > the
> >> >> domain ID array.
> >> >>
> >> >
> >> > Try to get domain id with iommu->domid_map[did] ?
> >>
> >> ???
> >>
> > +if ( iommu->domid_map )
> > +   d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> >
> > Is it right?
> 
> I don't see what this changes. Again - what your code has been lacking so far 
> is
> some mechanism to guarantee that what you read from domid_map[] is
> actually a valid domain ID. I can only once again point your attention to
> domid_bitmap, which afaics gives you exactly that valid/invalid indication.
> 
Jan,
Sorry, I was confused about this point as I didn't understand the domid_bitmap. 
I printed out the iommu->domid_bitmap[did], which was tricky to me.
Now I get it. 
As you mentioned , I simply need to consult the bitmap along with the domain ID 
array.

+if ( test_bit(did, iommu->domid_bitmap) && iommu->domid_map[did] >= 0 )
+   d = rcu_lock_domain_by_id(iommu->domid_map[did]);

Is it right now?

> >> >> >> > +{
> >> >> >> > +list_del(>domain_list);
> >> >> >>
> >> >> >> This should happen under pcidevs_lock - you need to either
> >> >> >> acquire it or
> >> >> >> ASSERT() it being held.
> >> >> >>
> >> >> >
> >> >> > I should check whether it is under pcidevs_lock -- with
> >> >> > spin_is_locked(&pcidevs_lock)
> >> >> > If it is not under pcidevs_lock, I should acquire it.
> >> >> > I think ASSERT() is not a good idea. Hypervisor acquires this
> >> >> > lock and then remove the resource.
> >> >>
> >> >> I don't understand this last sentence.
> >> >>
> >> > For example: in
> >> > pci_remove_device()
> >> > {
> >> > ...
> >> > spin_lock(&pcidevs_lock);
> >> > ..
> >> > iommu_remove_device()..
> >> > ..
> >> > spin_unlock(&pcidevs_lock);
> >> > }
> >> >
> >> > A Device-TLB flush error may occur in iommu_remove_device()..
> >> > Then it is under pcidevs_lock..
> >> > In this case, I need to check whether it is under pcidevs_lock.
> >> > If not, I need to acquire the pcidevs_lock.
> >>
> >> Ah, okay. But you can't use spin_is_locked() for that purpose.
> >>
> > If I introduce a new parameter 'lock'.
> > + int lock = spin_is_locked(&pcidevs_lock);
> >
> >
> > + if ( !lock )
> > +spin_lock(&pcidevs_lock);
> > ...
> > + if ( !lock )
> > +spin_unlock(&pcidevs_lock);
> >
> > Is it right?
> > Jan, do you have some better idea?
> 
> If indeed different locking state is possible for different call trees, 

Indeed different. For example,
It is _not_under_ lock for the following call tree:
$ flush_iotlb_qi()--- iommu_flush_iotlb_psi() -- __intel_iommu_iotlb_flush() 
--intel_iommu_iotlb_flush() --iommu_iotlb_flush() 
--xenmem_add_to_physmap()--do_memory_op() 

It is _under_ lock for the following call tree:
$flush_iotlb_qi()--iommu_flush_iotlb_dsi()--domain_context_unmap_one()--domain_context_unmap()--reassign_device_ownership()--deassign_device()-iommu_do_pci_domctl()



> then I'm
> afraid you won't get around passing down flags to indicate whether the needed
> lock is already being held.
> 
Agreed.

At first, I am open for any solution.
pcidevs_lock is quite a big lock. For this point, it looks much better to add a 
new flag to delay hiding device.
I am also afraid that it may raise further security issues.


Jan, thanks!!
-Quan
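
Passing the locking state down explicitly, as agreed above, can be sketched as below. The lock here is a toy, non-recursive stand-in and every `demo_` name is hypothetical. Asking the lock whether it is held does not work for this purpose, because such a query reports whether anyone holds the lock, not whether the current caller does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock; the asserts catch double acquire or release. */
struct demo_lock {
    int held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
    assert(l->held == 0);
    l->held = 1;
}

static void demo_lock_release(struct demo_lock *l)
{
    assert(l->held == 1);
    l->held = 0;
}

static struct demo_lock demo_pcidevs_lock;
static int demo_hidden_devices;

/*
 * lock_held tells the callee whether its caller already owns the lock,
 * so both call trees can share one implementation.
 */
static void demo_hide_device(bool lock_held)
{
    if ( !lock_held )
        demo_lock_acquire(&demo_pcidevs_lock);

    assert(demo_pcidevs_lock.held);
    demo_hidden_devices++;        /* work done under the lock */

    if ( !lock_held )
        demo_lock_release(&demo_pcidevs_lock);
}
```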


Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-26 Thread George Dunlap

On 26/01/16 12:44, Jan Beulich wrote:
 On 26.01.16 at 12:44,  wrote:
>> On Thu, Jan 21, 2016 at 2:52 PM, Jan Beulich  wrote:
>> On 21.01.16 at 15:01,  wrote:
 On 01/21/16 03:25, Jan Beulich wrote:
 On 21.01.16 at 10:10,  wrote:
>> c) hypervisor should manage PMEM resource pool and partition it to multiple
>> VMs.
>
> Yes.
>

 But I Still do not quite understand this part: why must pmem resource
 management and partition be done in hypervisor?
>>>
>>> Because that's where memory management belongs. And PMEM,
>>> other than PBLK, is just another form of RAM.
>>
>> I haven't looked more deeply into the details of this, but this
>> argument doesn't seem right to me.
>>
>> Normal RAM in Xen is what might be called "fungible" -- at boot, all
>> RAM is zeroed, and it basically doesn't matter at all what RAM is
>> given to what guest.  (There are restrictions of course: lowmem for
>> DMA, contiguous superpages,  but within those groups, it doesn't
>> matter *which* bit of lowmem you get, as long as you get enough to do
>> your job.)  If you reboot your guest or hand RAM back to the
>> hypervisor, you assume that everything in it will disappear.  When you
>> ask for RAM, you can request some parameters that it will have
>> (lowmem, on a specific node, ), but you can't request a specific
>> page that you had before.
>>
>> This is not the case for PMEM.  The whole point of PMEM (correct me if
>> I'm wrong) is to be used for long-term storage that survives over
>> reboot.  It matters very much that a guest be given the same PRAM
>> after the host is rebooted that it was given before.  It doesn't make
>> any sense to manage it the way Xen currently manages RAM (i.e., that
>> you request a page and get whatever Xen happens to give you).
> 
> Interesting. This isn't the usage model I have been thinking about
> so far. Having just gone back to the original 0/4 mail, I'm afraid
> we're really left guessing, and you guessed differently than I did.
> My understanding of the intentions of PMEM so far was that this
> is a high-capacity, slower than DRAM but much faster than e.g.
> swapping to disk alternative to normal RAM. I.e. the persistent
> aspect of it wouldn't matter at all in this case (other than for PBLK,
> obviously).

Oh, right -- yes, if the usage model of PRAM is just "cheap slow RAM",
then you're right -- it is just another form of RAM, that should be
treated no differently than say, lowmem: a fungible resource that can be
requested by setting a flag.

Haozhong?

 -George



Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.

2016-01-26 Thread Jan Beulich

>>> On 26.01.16 at 14:47,  wrote:
> As you mentioned , I simply need to consult the bitmap along with the domain 
> ID array.
> 
> +if ( test_bit(did, iommu->domid_bitmap) && iommu->domid_map[did] >= 0 )
> +   d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> 
> Is it right now?

Mostly, except that I don't understand the >= 0 part.

> At first, I am open for any solution.
> pcidevs_lock is quite a big lock. For this point, it looks much better to 
> add a new flag to delay hiding device.
> I am also afraid that it may raise further security issues.

Well, I'd say just go and see which one turns out to be less
cumbersome and/or less intrusive.

Jan
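
Consulting the bitmap before trusting a slot of the ID array can be sketched as below. The `demo_` types and the slot count are hypothetical. The sketch assumes domain IDs are stored in an unsigned 16-bit type, in which case a `>= 0` comparison would always be true and the bitmap alone decides validity.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_SLOTS      64
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_NR_WORDS \
    ((DEMO_NR_SLOTS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Hypothetical model: a bitmap of used slots plus the per-slot domain ID. */
struct demo_iommu {
    unsigned long domid_bitmap[DEMO_NR_WORDS];
    uint16_t domid_map[DEMO_NR_SLOTS];
};

static bool demo_test_bit(unsigned int nr, const unsigned long *map)
{
    return (map[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL;
}

/*
 * Store the domain ID and return true only if slot did is in use.
 * An unused slot holds zero, which would otherwise be mistaken for Dom0.
 */
static bool demo_lookup_domid(const struct demo_iommu *iommu,
                              unsigned int did, uint16_t *domid)
{
    assert(iommu && domid);

    if ( did >= DEMO_NR_SLOTS || !demo_test_bit(did, iommu->domid_bitmap) )
        return false;

    *domid = iommu->domid_map[did];
    return true;
}
```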



Re: [Xen-devel] [PATCH V2 1/1] Improved RTDS scheduler

2016-01-26 Thread Dario Faggioli

On Mon, 2016-01-25 at 18:00 -0500, Meng Xu wrote:
> On Mon, Jan 25, 2016 at 5:04 PM, Tianyang Chen 
> wrote:
> > I have removed some of the Ccs so they won't get bothered as we
> > discussed
> > previously.
> > 
> > On 1/25/2016 4:00 AM, Dario Faggioli wrote:
> > > 
> > > On Thu, 2015-12-31 at 05:20 -0500, Tianyang Chen wrote:
> > > > 
> > > > 
> > > > @@ -147,6 +148,16 @@ static unsigned int nr_rt_ops;
> > > >    * Global lock is referenced by schedule_data.schedule_lock
> > > > from all
> > > >    * physical cpus. It can be grabbed via
> > > > vcpu_schedule_lock_irq()
> > > >    */
> > > > +
> > > > +/* dedicated timer for replenishment */
> > > > +static struct timer repl_timer;
> > > > +
> > > 
> > > So, there's always only one timer... Even if we have multiple
> > > cpupool
> > > with RTDS as their scheduler, they share the replenishment timer?
> > > I
> > > think it makes more sense to make this per-scheduler.
> > > 
> > Yeah, I totally ignored the case for cpu-pools. It looks like when
> > a
> > cpu-pool is created, it copies the scheduler struct and calls
> > rt_init()
> > where a private field is initialized. So I assume the timer should
> > be put
> > inside the scheduler private struct? Now that I think about it, the
> > timer is
> > hard-coded to run on cpu0. If there're lots of cpu-pools but the
> > replenishment can only be done on the same pcpu, would that be a
> > problem?
> > Should we keep track of all instances of schedulers (nr_rt_ops
> > counts how
> > many) and just put times on different pcpus?
> > 
> > > > +/* controls when to first start the timer*/
> > > > +static int timer_started;
> > > > +
> > > 
> > > I don't like this, and I don't think we need it. In fact, you
> > > removed
> > > it yourself from v3, AFAICT.
> > > 
> > > > @@ -635,6 +652,13 @@ rt_vcpu_insert(const struct scheduler
> > > > *ops,
> > > > struct vcpu *vc)
> > > > 
> > > >   /* add rt_vcpu svc to scheduler-specific vcpu list of the
> > > > dom */
> > > >   list_add_tail(&svc->sdom_elem, &svc->sdom->vcpu);
> > > > +
> > > > +if(!timer_started)
> > > > +{
> > > > +/* the first vcpu starts the timer for the first
> > > > time*/
> > > > +timer_started = 1;
> > > > > +set_timer(&repl_timer,svc->cur_deadline);
> > > > +}
> > > >   }
> > > > 
> > > This also seems to be gone in v3, which is good. In fact, it uses
> > > timer_started, which I already said I didn't like.
> > > 
> > > About the actual startup of the timer (no matter whether for
> > > first time
> > > or not). Here, you were doing it in _vcpu_insert() and not in
> > > _vcpu_wake(); in v3 you're doing it in _vcpu_wake() and not in
> > > _runq_insert()... Which one is the proper way?
> > > 
> > 
> > Correct me if I'm wrong, at the beginning of the boot process, all
> > vcpus are
> > put to sleep/not_runnable after insertions. Therefore, the timer
> > should
> > start when the first vcpu wakes up. I think the wake() in v3 should
> > be
> > correct.
> > 
> > 
> > > > @@ -792,44 +816,6 @@ __runq_pick(const struct scheduler *ops,
> > > > const
> > > > cpumask_t *mask)
> > > >   }
> > > > 
> > > >   /*
> > > > - * Update vcpu's budget and
> > > > - * sort runq by insert the modifed vcpu back to runq
> > > > - * lock is grabbed before calling this function
> > > > - */
> > > > -static void
> > > > -__repl_update(const struct scheduler *ops, s_time_t now)
> > > > -{
> > > > 
> > > Please, allow me to say that seeing this function going away,
> > > fills my
> > > heart with pure joy!! :-D
> > > 
> > > > @@ -889,7 +874,7 @@ rt_schedule(const struct scheduler *ops,
> > > > s_time_t
> > > > now, bool_t tasklet_work_sched
> > > >   }
> > > >   }
> > > > 
> > > > -ret.time = MIN(snext->budget, MAX_SCHEDULE); /* sched
> > > > quantum */
> > > > +ret.time = snext->budget; /* invoke the scheduler next
> > > > time */
> > > >   ret.task = snext->vcpu;
> > > > 
> > > This is ok as it is done in v3 (i.e., snext->budget if !idle, -1
> > > if
> > > idle).
> > > 
> > > > @@ -1074,14 +1055,7 @@ rt_vcpu_wake(const struct scheduler
> > > > *ops,
> > > > struct vcpu *vc)
> > > >   /* insert svc to runq/depletedq because svc is not in
> > > > queue now
> > > > */
> > > >   __runq_insert(ops, svc);
> > > > 
> > > > -__repl_update(ops, now);
> > > > -
> > > > > -ASSERT(!list_empty(&prv->sdom));
> > > > -sdom = list_entry(prv->sdom.next, struct rt_dom,
> > > > sdom_elem);
> > > > -online = cpupool_scheduler_cpumask(sdom->dom->cpupool);
> > > > -snext = __runq_pick(ops, online); /* pick snext from ALL
> > > > valid
> > > > cpus */
> > > > -
> > > > -runq_tickle(ops, snext);
> > > > +runq_tickle(ops, svc);
> > > > 
> > > And this is another thing I especially like of this patch: it
> > > makes the
> > > wakeup path a lot simpler and a lot more similar to how it looks
> > > like
> > > in the other schedulers.
> > > 
> > > Good job with this. :-)
> > > 
> > > > @@ -1108,15 +1078,8 @@

Re: [Xen-devel] [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()

2016-01-26 Thread Konrad Rzeszutek Wilk

On Tue, Jan 26, 2016 at 03:04:39PM +0800, Wen Congyang wrote:
> On 01/26/2016 02:59 AM, Konrad Rzeszutek Wilk wrote:
> > On Wed, Dec 30, 2015 at 10:28:59AM +0800, Wen Congyang wrote:
> >> Secondary vm is running in colo mode, we need to send
> >> secondary vm's dirty page information to master at checkpoint,
> > 
> > In previous patch you called it primary, so perhaps:
> > s/master/primary/ ?
> > 
> >> so we have to enable qemu logdirty on secondary.
> >>
> >> libxl__domain_suspend_common_switch_qemu_logdirty() is to enable
> >> qemu logdirty. But it uses domain_save_state, and calls
> > 
> > s/domain_save_state/libxl__domain_save_state/
> >> libxl__xc_domain_saverestore_async_callback_done()
> >> before exits. This can not be used for secondary vm.
> >>
> >> Update libxl__domain_suspend_common_switch_qemu_logdirty() to
> >> introduce a new API libxl__domain_common_switch_qemu_logdirty().
> >> This API only uses libxl__logdirty_switch, and calls
> >> lds->callback before exits.
> > 
> > One question - that perhaps had been part of the review earlier
> > (if so it may be good to include this in the description
> > so I don't ask silly questions):
> > 
> > Why add this extra API? You could squash 
> > libxl__domain_suspend_common_switch_qemu_logdirty
> > and libxl__domain_common_switch_qemu_logdirty code together
> > and call it libxl_domain_common_and_suspend_common_switch_qemu_logdirty
> > (ok, just kidding on the name). But - why not have one function
> > instead of splitting the functionality in two?
> 
> Do you mean that auto switch qemu logdirty when suspend the guest?

Squash the two functions - libxl__domain_common_switch_qemu_logdirty and
libxl__domain_suspend_common_switch_qemu_logdirty together?


Re: [Xen-devel] [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc

2016-01-26 Thread Konrad Rzeszutek Wilk

> > Or is that suppose to be done in another patch? If so you may want to
> > mention that in the commit description?
> 
> Do you mean where is the code that uses back_fd? It is in another series:
> http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02904.html

Ah right that big patchset one. Hadn't looked at that yet - it is a bit hard
without having a git tree on which the foundation patches (this patch
series) are applied so I can look at the contents of the functions.




Re: [Xen-devel] [PATCH V6 4/5] xen/mm: Clean up pfec handling in gva_to_gfn

2016-01-26 Thread Tim Deegan

Hi,

At 15:30 +0800 on 19 Jan (1453217458), Huaitong Han wrote:
> At the moment, the pfec argument to gva_to_gfn has two functions:
> 
> * To inform guest_walk what kind of access is happening
> 
> * As a value to pass back into the guest in the event of a fault.
> 
> Unfortunately this is not quite treated consistently: the hvm_fetch_*
> function will "pre-clear" the PFEC_insn_fetch flag before calling
> gva_to_gfn; meaning guest_walk doesn't actually know whether a given
> access is an instruction fetch or not.  This works now, but will cause
> issues when pkeys are introduced, since guest_walk will need to know
> whether an access is an instruction fetch even if it doesn't return
> PFEC_insn_fetch.
> 
> Fix this by making a clean separation for in and out functionalities
> of the pfec argument:
> 
> 1. Always pass in the access type to gva_to_gfn
> 
> 2. Filter out inappropriate access flags before returning from gva_to_gfn.

This seems OK.  But can you please:
 - Add this new adjustment once, in paging_gva_to_gfn(), instead of
   adding it to each implementation; and
 - Adjust the comment above the declaration of paging_gva_to_gfn() in
   paging.h to describe this new behaviour.

Also:

> diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
> index 58f7e72..bbbc706 100644
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3668,6 +3668,12 @@ sh_gva_to_gfn(struct vcpu *v, struct p2m_domain *p2m,
>  pfec[0] &= ~PFEC_page_present;
>  if ( missing & _PAGE_INVALID_BITS )
>  pfec[0] |= PFEC_reserved_bit;
> +/*
> + * Intel 64 Volume 3, Section 4.7: The PFEC_insn_fetch flag is
> + * set only when NX or SMEP are enabled.
> + */
> +if ( !hvm_nx_enabled(v) && !hvm_smep_enabled(v) )
> +pfec[0] &= ~PFEC_insn_fetch;

This needs to either DTRT for PV guests or assert that it always sees
a HVM guest (I think this is the case but haven't tested).
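To make the two requests concrete, here is a minimal sketch, assuming a
single common wrapper is the right place for the filter. Everything named
*_sketch is hypothetical and only stands in for Xen's real vcpu state and
per-mode walkers; the one thing taken as given is that PFEC_insn_fetch is
bit 4 of the x86 page-fault error code. The HVM-only assertion is just one
of the two options mentioned above, not a tested conclusion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PFEC_insn_fetch (1u << 4)   /* x86 page-fault error code, I/D bit */

/* Hypothetical stand-in for the vcpu state the real code would query. */
struct vcpu_sketch {
    bool is_hvm;
    bool nx_enabled;
    bool smep_enabled;
};

/* Hypothetical signature for a per-mode walker (shadow, HAP, ...). */
typedef unsigned long (*walker_sketch_fn)(const struct vcpu_sketch *v,
                                          unsigned long va, uint32_t *pfec);

/*
 * Common wrapper: the caller always passes in the real access type, the
 * walker sees it unmodified, and the "out" value is filtered in one place.
 */
static unsigned long gva_to_gfn_sketch(const struct vcpu_sketch *v,
                                       walker_sketch_fn walk,
                                       unsigned long va, uint32_t *pfec)
{
    unsigned long gfn;

    /* Assumption: only HVM guests reach this path. */
    assert(v->is_hvm);

    gfn = walk(v, va, pfec);

    /* The insn-fetch flag is reported only when NX or SMEP is enabled. */
    if ( !v->nx_enabled && !v->smep_enabled )
        *pfec &= ~PFEC_insn_fetch;

    return gfn;
}

/* Toy walker for illustration: always fails and leaves *pfec untouched. */
static unsigned long failing_walker_sketch(const struct vcpu_sketch *v,
                                           unsigned long va, uint32_t *pfec)
{
    (void)v; (void)va; (void)pfec;
    return ~0ul;
}
```

With this shape the individual implementations need no copy of the
filter, and the comment in paging.h only has to describe one behaviour.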

Cheers,

Tim.


Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag

2016-01-26 Thread Paul Durrant

> -Original Message-
> From: Wei Liu [mailto:wei.l...@citrix.com]
> Sent: 21 January 2016 15:46
> To: Ian Campbell
> Cc: Paul Durrant; xen-de...@lists.xenproject.org; Ian Jackson; Jan Beulich;
> Keir (Xen.org); Tim (Xen.org); Wei Liu; Roger Pau Monne
> Subject: Re: [PATCH] public/io/netif.h: change semantics of "request-
> multicast-control" flag
> 
> On Thu, Jan 21, 2016 at 03:29:36PM +, Ian Campbell wrote:
> > On Wed, 2016-01-20 at 12:50 +, Paul Durrant wrote:
> > > My patch b2700877 "move and amend multicast control documentation"
> > > clarified use of the multicast control protocol between frontend and
> > > backend. However, it transpires that the restrictions that documentation
> > > placed on the "request-multicast-control" flag make it hard for a
> > > frontend to enable 'all multicast' promiscuous mode, in that to do so
> > > would require the frontend and backend to disconnect and re-connect.
> > >
> > > This patch adds a new "feature-dynamic-multicast-control" flag to allow
> > > a backend to advertise that it will watch "request-multicast-control"
> hence
> > > allowing it to be meaningfully modified by the frontend at any time rather
> > > than only when the frontend and backend are disconnected.
> > >
> > > Signed-off-by: Paul Durrant 
> > > Cc: Ian Campbell 
> > > Cc: Ian Jackson 
> > > Cc: Jan Beulich 
> > > Cc: Keir Fraser 
> > > Cc: Tim Deegan 
> >
> >
> > This looks good to me, but also adding Wei (Linux netback + BSD stuff) and
> > Roger (BSD stuff) for their perspective.
> >
> > I should probably have done that for the last set of netif.h changes too,
> > since apart from the nominal maintainers of xen/include/public/io/*.h it's
> > worth getting input from the maintainers of the consumers. Not sure we
> can
> > express that very well in MAINTAINERS :-(.
> >
> > Ian.
> 
> LGTM
> 
> Acked-by: Wei Liu 

Ping? I notice this patch is not yet applied.

  Paul



[Xen-devel] XSAVE flavors

2016-01-26 Thread Jan Beulich

Shuai,

originally I only meant to inquire about the state of the promised
alternatives improvement to the XSAVE code. However, while
looking over the code in question again I stumbled across a
separate issue: XSAVES, just like XSAVEOPT, may use the
"modified" optimization. However, the fcs and fds handling code
that has been present around the use of XSAVEOPT did not also
get applied to the XSAVES path. I suppose this was just an
oversight?

With this another question then is whether, when both XSAVEC
and XSAVEOPT are available, it is indeed always better to use
XSAVEC (as the code is doing after your enabling).

Thanks, Jan



Re: [Xen-devel] [PATCH v6 04/18] libxl/save: Refactor libxl__domain_suspend_state

2016-01-26 Thread Konrad Rzeszutek Wilk

On Tue, Jan 26, 2016 at 10:23:52AM +0800, Wen Congyang wrote:
> On 01/26/2016 01:29 AM, Konrad Rzeszutek Wilk wrote:
> > .snip..
> >> --- a/tools/libxl/libxl_dom_suspend.c
> >> +++ b/tools/libxl/libxl_dom_suspend.c
> >> @@ -19,14 +19,71 @@
> >>  
> >>  /*== Domain suspend ===*/
> >>  
> >> +int libxl__domain_suspend_init(libxl__egc *egc,
> >> +   libxl__domain_suspend_state *dsps)
> >> +{
> >> +STATE_AO_GC(dsps->ao);
> >> +int rc = ERROR_FAIL;
> >> +int port;
> >> +libxl_domain_type type;
> >> +
> >> +/* Convenience aliases */
> >> +const uint32_t domid = dsps->domid;
> >> +
> >> +type = libxl__domain_type(gc, domid);
> >> +switch (type) {
> >> +case LIBXL_DOMAIN_TYPE_HVM: {
> >> +dsps->hvm = 1;
> >> +break;
> >> +}
> >> +case LIBXL_DOMAIN_TYPE_PV:
> >> +dsps->hvm = 0;
> >> +break;
> >> +default:
> >> +goto out;
> > 
> > This will mean we return back to libxl__domain_save which will goto out 
> > which calls:
> > domain_save_done. And that will try to use the dsps->guestevtchn leading to 
> > a crash since:
> 
> Yes, thanks for pointing it out. In which case, the type is not HVM or PV?

If you call those init routines (libxl__xswait_init() and so on) before the
switch statement, then you can still goto out.
> 
> >> +}
> >> +
> >> +libxl__xswait_init(&dsps->pvcontrol);
> >> +libxl__ev_evtchn_init(&dsps->guest_evtchn);
> > 
> > we initialize them here.
> >> +libxl__ev_xswatch_init(&dsps->guest_watch);
> >> +libxl__ev_time_init(&dsps->guest_timeout);
> > 
> > I would instead recommend you move these initialization routines above the
> > 'type' check.
> 
> I think we should not return ERROR_FAIL when the type is not PV or HVM. We 
> should abort the program
> like what we do in libxl__domain_save().

I would rather return - this is a library after all - so the controlling
program should take such drastic measures, not a library.
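A rough sketch of the ordering suggested in this reply, with made-up types
and names (nothing here is libxl's real API): every handle is put into a
known idle state first, so an early 'goto out' on an unexpected domain
type returns an error to the caller, and the later cleanup path never sees
uninitialised fields.

```c
#include <assert.h>
#include <stdbool.h>

#define ERROR_FAIL_SKETCH (-3)   /* hypothetical error code */

enum dom_type_sketch { DOM_INVALID = -1, DOM_HVM = 1, DOM_PV = 2 };

/* Hypothetical, cut-down suspend state. */
struct suspend_state_sketch {
    int  evtchn_port;     /* -1 means "nothing bound" */
    int  evtchn_lockfd;   /* -1 means "no lock file open" */
    bool is_hvm;
};

static int suspend_init_sketch(struct suspend_state_sketch *s,
                               enum dom_type_sketch type)
{
    int rc = ERROR_FAIL_SKETCH;

    assert(s);

    /* Put every handle into its idle state before anything can fail. */
    s->evtchn_port = -1;
    s->evtchn_lockfd = -1;
    s->is_hvm = false;

    switch (type) {
    case DOM_HVM: s->is_hvm = true;  break;
    case DOM_PV:  s->is_hvm = false; break;
    default:      goto out;   /* report the error, do not abort() */
    }

    rc = 0;
out:
    return rc;
}

/* Cleanup path; returns true if it had something to release. */
static bool suspend_done_sketch(struct suspend_state_sketch *s)
{
    bool released = false;

    if (s->evtchn_port >= 0) {
        s->evtchn_port = -1;
        released = true;
    }
    if (s->evtchn_lockfd >= 0) {
        s->evtchn_lockfd = -1;
        released = true;
    }
    return released;
}
```

Because the idle values are written before the type check, calling the
cleanup after a failed init is a harmless no-op.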

> 
> > 
> >> +
> >> +dsps->guest_evtchn.port = -1;
> >> +dsps->guest_evtchn_lockfd = -1;
> >> +dsps->guest_responded = 0;
> >> +dsps->dm_savefile = libxl__device_model_savefile(gc, domid);
> >> +
> >> +port = xs_suspend_evtchn_port(domid);
> >> +
> >> +if (port >= 0) {
> >> +rc = libxl__ctx_evtchn_init(gc);
> >> +if (rc) goto out;
> >> +
> >> +dsps->guest_evtchn.port =
> >> +xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
> >> +domid, port, 
> >> &dsps->guest_evtchn_lockfd);
> >> +
> >> +if (dsps->guest_evtchn.port < 0) {
> >> +LOG(WARN, "Suspend event channel initialization failed");
> >> +rc = ERROR_FAIL;
> >> +goto out;
> >> +}
> >> +}
> >> +
> >> +rc = 0;
> >> +
> >> +out:
> >> +return rc;
> >> +}
> >> +
> > 
> > .. snip..
> >>  struct libxl__domain_suspend_state {
> >> +/* set by caller of libxl__domain_suspend_init */
> >> +libxl__ao *ao;
> >> +uint32_t domid;
> >> +
> >> +/* private */
> >> +int hvm;
> > 
> > How about 'is_hvm' and just use 'libxl_domain_type' type?
> > instead of having an int? You can just do:
> 
> In dss, it is 'int hvm'.
> Before this patch:
> if (dss->hvm) ...
> After this patch:
> if (dsps->hvm) ...

Right..
> 
> Thanks
> Wen Congyang
> 
> > 
> > if (type == LIBXL_DOMAIN_TYPE_HVM) ..

But what if you use that? As in dsps->type == LIBXL_DOMAIN_TYPE_HVM, for
example?

> > 
> > And to check for non-conforming types - you can make  
> > libxl__domain_suspend_init
> > do this:
> > 
> > if (type == LIBXL_DOMAIN_TYPE_INVALID) {
> > rc = ERROR_FAIL;
> > goto out; 
> > }
> > 
> > ?
> > 
> > 
> > .
> > 
> 
> 
> 


Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-26 Thread Boqun Feng

Hi Will,

On Tue, Jan 26, 2016 at 12:16:09PM +, Will Deacon wrote:
> On Mon, Jan 25, 2016 at 10:03:22PM -0800, Paul E. McKenney wrote:
> > On Mon, Jan 25, 2016 at 04:42:43PM +, Will Deacon wrote:
> > > On Fri, Jan 15, 2016 at 01:58:53PM -0800, Paul E. McKenney wrote:
> > > > PPC Overlapping Group-B sets version 4
> > > > ""
> > > > (* When the Group-B sets from two different barriers involve 
> > > > instructions in
> > > >the same thread, within that thread one set must contain the other.
> > > > 
> > > > > P0      P1      P2
> > > > > Rx=1    Wy=1    Wz=2
> > > > > dep.    lwsync  lwsync
> > > > > Ry=0    Wz=1    Wx=1
> > > > > Rz=1
> > > > 
> > > > assert(!(z=2))
> > > > 
> > > >Forbidden by ppcmem, allowed by herd.
> > > > *)
> > > > {
> > > > 0:r1=x; 0:r2=y; 0:r3=z;
> > > > 1:r1=x; 1:r2=y; 1:r3=z; 1:r4=1;
> > > > 2:r1=x; 2:r2=y; 2:r3=z; 2:r4=1; 2:r5=2;
> > > > }
> > > >  P0 | P1| P2;
> > > >  lwz r6,0(r1)   | stw r4,0(r2)  | stw r5,0(r3)  ;
> > > >  xor r7,r6,r6   | lwsync| lwsync;
> > > >  lwzx r7,r7,r2  | stw r4,0(r3)  | stw r4,0(r1)  ;
> > > >  lwz r8,0(r3)   |   |   ;
> > > > 
> > > > exists
> > > > (z=2 /\ 0:r6=1 /\ 0:r7=0 /\ 0:r8=1)
> > > 
> > > That really hurts. Assuming that the "assert(!(z=2))" is actually there
> > > to constrain the coherence order of z to be {0->1->2}, then I think that
> > > this test is forbidden on arm using dmb instead of lwsync. That said, I
> > > also don't think the Rz=1 in P0 changes anything.
> > 
> > What about the smp_wmb() variant of dmb that orders only stores?
> 
> Tricky, but I think it still works out if the coherence order of z is as
> I described above. The line of reasoning is weird though -- I ended up
> considering the two cases where P0 reads z before and after it reads x
 ^^^
Because of the fact that two reads on the same processors can't be
executed simultaneously? I feel like this is exactly something herd
missed.

> and what that means for the read of y.
> 

And the reasoning on PPC is similar, so looks like the read of z on P0
is a necessary condition for the exists clause to be forbidden.

Regards,
Boqun

> Will



Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-26 Thread Konrad Rzeszutek Wilk

> Last year at Linux Plumbers Conference I attended a session dedicated
> to NVDIMM support. I asked the very same question and the INTEL guy
> there told me there is indeed something like a partition table meant
> to describe the layout of the memory areas and their contents.

It is described in detail at pmem.io; look under Documents, in particular the
Namespaces section of http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf

Then I would recommend you read:
http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf

followed by http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

And then for dessert:
https://www.kernel.org/doc/Documentation/nvdimm/nvdimm.txt
which explains it in more technical terms.
> 
> It would be nice to have a pointer to such information. Without anything
> like this it might be rather difficult to find the best solution how to
> implement NVDIMM support in Xen or any other product.


[Xen-devel] [PATCH] tools/libxl: improve logging on domain create failure.

2016-01-26 Thread Ian Campbell

A user reported[0] that xl create failed with just:
libxl: error: libxl_create.c:892:initiate_domain_create: Unable to set domain build info defaults
and some resulting fallout, but without indicating why it was unable
to set the defaults, even in verbose mode[1].

Go through libxl__domain_{create,build}_info_setdefault and ensure
that each error path logs something.

In most cases this involved simply adding a call to LOG.

In two cases this involved switching from strdup to
libxl__strdup(NOGC) and removing the existing error handling.

When switching from qemu-xen to qemu-xen-traditional (because the
former is not available) log at level INFO rather than VERBOSE, so
the message would normally be printed. Also tweak the language here.

I'm not sure all these messages are reachable (some might be shadowed
by previous error paths) but it seems better to err on the side of
caution.

[0] http://lists.xen.org/archives/html/xen-users/2016-01/msg00125.html
[1] http://lists.xen.org/archives/html/xen-users/2016-01/msg00129.html

Signed-off-by: Ian Campbell 
Cc: suse@fea.st
---
suse.dev, this might help diagnose the issue you are seeing.

Given the usability issue, I think this ought to be backported.
---
 tools/libxl/libxl_create.c | 48 +-
 1 file changed, 30 insertions(+), 18 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index e491d83..de5d27f 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -30,8 +30,10 @@
 int libxl__domain_create_info_setdefault(libxl__gc *gc,
  libxl_domain_create_info *c_info)
 {
-if (!c_info->type)
+if (!c_info->type) {
+LOG(ERROR, "domain type unspecified");
 return ERROR_INVAL;
+}
 
 if (c_info->type == LIBXL_DOMAIN_TYPE_HVM) {
 libxl_defbool_setdefault(&c_info->hap, true);
@@ -66,8 +68,10 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 int i;
 
 if (b_info->type != LIBXL_DOMAIN_TYPE_HVM &&
-b_info->type != LIBXL_DOMAIN_TYPE_PV)
+b_info->type != LIBXL_DOMAIN_TYPE_PV) {
+LOG(ERROR, "invalid domain type");
 return ERROR_INVAL;
+}
 
 libxl_defbool_setdefault(&b_info->device_model_stubdomain, false);
 
@@ -97,8 +101,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 if (rc < 0) {
 /* qemu-xen unavailable, use qemu-xen-traditional */
 if (errno == ENOENT) {
-LOGE(VERBOSE, "qemu-xen is unavailable"
- ", use qemu-xen-traditional instead");
+LOGE(INFO, "qemu-xen is unavailable"
+ ", using qemu-xen-traditional instead");
 b_info->device_model_version =
 LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL;
 } else {
@@ -121,18 +125,24 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 b_info->u.hvm.bios = LIBXL_BIOS_TYPE_SEABIOS; break;
 case LIBXL_DEVICE_MODEL_VERSION_NONE:
 break;
-default:return ERROR_INVAL;
+default:
+LOG(ERROR, "unknown device model version");
+return ERROR_INVAL;
 }
 
 /* Enforce BIOS<->Device Model version relationship */
 switch (b_info->device_model_version) {
 case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
-if (b_info->u.hvm.bios != LIBXL_BIOS_TYPE_ROMBIOS)
+if (b_info->u.hvm.bios != LIBXL_BIOS_TYPE_ROMBIOS) {
+LOG(ERROR, "qemu-xen-traditional requires bios=rombios.");
 return ERROR_INVAL;
+}
 break;
 case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-if (b_info->u.hvm.bios == LIBXL_BIOS_TYPE_ROMBIOS)
+if (b_info->u.hvm.bios == LIBXL_BIOS_TYPE_ROMBIOS) {
+LOG(ERROR, "qemu-xen does not support bios=rombios.");
 return ERROR_INVAL;
+}
 break;
 case LIBXL_DEVICE_MODEL_VERSION_NONE:
 break;
@@ -160,19 +170,25 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 if (!b_info->max_vcpus)
 b_info->max_vcpus = 1;
 if (!b_info->avail_vcpus.size) {
-if (libxl_cpu_bitmap_alloc(CTX, &b_info->avail_vcpus, 1))
+if (libxl_cpu_bitmap_alloc(CTX, &b_info->avail_vcpus, 1)) {
+LOG(ERROR, "unable to allocate avail_vcpus bitmap");
 return ERROR_FAIL;
+}
 libxl_bitmap_set(&b_info->avail_vcpus, 0);
-} else if (b_info->avail_vcpus.size > HVM_MAX_VCPUS)
+} else if (b_info->avail_vcpus.size > HVM_MAX_VCPUS) {
+LOG(ERROR, "avail_vcpus bitmap contains too many VCPUS");
 return ERROR_FAIL;
+}
 
 /* In libxl internals, we want to deal with vcpu_hard_affinity only! */
 if (b_info->cpumap.size &&

Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-26 Thread Konrad Rzeszutek Wilk

On Tue, Jan 26, 2016 at 01:58:35PM +, George Dunlap wrote:
> On 26/01/16 12:44, Jan Beulich wrote:
>  On 26.01.16 at 12:44,  wrote:
> >> On Thu, Jan 21, 2016 at 2:52 PM, Jan Beulich  wrote:
> >> On 21.01.16 at 15:01,  wrote:
>  On 01/21/16 03:25, Jan Beulich wrote:
>  On 21.01.16 at 10:10,  wrote:
> >> c) hypervisor should manage PMEM resource pool and partition it to
> >> multiple
> >> VMs.
> >
> > Yes.
> >
> 
>  But I still do not quite understand this part: why must pmem resource
>  management and partition be done in hypervisor?
> >>>
> >>> Because that's where memory management belongs. And PMEM,
> >>> other than PBLK, is just another form of RAM.
> >>
> >> I haven't looked more deeply into the details of this, but this
> >> argument doesn't seem right to me.
> >>
> >> Normal RAM in Xen is what might be called "fungible" -- at boot, all
> >> RAM is zeroed, and it basically doesn't matter at all what RAM is
> >> given to what guest.  (There are restrictions of course: lowmem for
> >> DMA, contiguous superpages,  but within those groups, it doesn't
> >> matter *which* bit of lowmem you get, as long as you get enough to do
> >> your job.)  If you reboot your guest or hand RAM back to the
> >> hypervisor, you assume that everything in it will disappear.  When you
> >> ask for RAM, you can request some parameters that it will have
> >> (lowmem, on a specific node, ), but you can't request a specific
> >> page that you had before.
> >>
> >> This is not the case for PMEM.  The whole point of PMEM (correct me if
> >> I'm wrong) is to be used for long-term storage that survives over
> >> reboot.  It matters very much that a guest be given the same PRAM
> >> after the host is rebooted that it was given before.  It doesn't make
> >> any sense to manage it the way Xen currently manages RAM (i.e., that
> >> you request a page and get whatever Xen happens to give you).
> > 
> > Interesting. This isn't the usage model I have been thinking about
> > so far. Having just gone back to the original 0/4 mail, I'm afraid
> > we're really left guessing, and you guessed differently than I did.
> > My understanding of the intentions of PMEM so far was that this
> > is a high-capacity, slower than DRAM but much faster than e.g.
> > swapping to disk alternative to normal RAM. I.e. the persistent
> > aspect of it wouldn't matter at all in this case (other than for PBLK,
> > obviously).
> 
> Oh, right -- yes, if the usage model of PRAM is just "cheap slow RAM",
> then you're right -- it is just another form of RAM, that should be
> treated no differently than say, lowmem: a fungible resource that can be
> requested by setting a flag.

I would think of it as MMIO ranges rather than RAM. Yes, it is behind an MMC,
but there are subtle things such as the new instructions (pcommit, clflushopt,
and others) that impact it.

Furthermore, ranges (contiguous and most likely discontiguous)
of this "RAM" have to be shared with guests (at least dom0)
and with others (multiple HVM guests).


> 
> Haozhong?
> 
>  -George
> 
> 

Re: [Xen-devel] [PATCH v5 COLOPre 14/18] tools/libxl: fix backword compatibility after the automatic renaming

2016-01-26 Thread Konrad Rzeszutek Wilk

> >> +/* Remus stuff */
> > 
> > /sRemus stuff//
> 
> Do you mean remove this line?




Re: [Xen-devel] [PATCH v5 COLOPre 08/18] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state

2016-01-26 Thread Konrad Rzeszutek Wilk

On Tue, Jan 26, 2016 at 02:17:08PM +0800, Wen Congyang wrote:
> On 01/26/2016 02:41 AM, Konrad Rzeszutek Wilk wrote:
> > On Thu, Dec 17, 2015 at 03:48:12PM +0800, Wen Congyang wrote:
> >> From: Yang Hongyang 
> >>
> >> In normal migration, the qemu state was passed to qemu as a parameter.
> > 
> > /was/is/
> > 
> >> With COLO, Secondary vm is running. So we will do the following steps
> >> at every checkpoint:
> >> 1. suspend both primary vm and secondary vm
> >> 2. sync the state
> >> 3. resume both primary vm and secondary vm
> >> Primary will send qemu's state in step2, and
> >> Secondary's qemu should read it and restore the state before it
> > 
> > s/Secondary/secondary/
> >> is resumed. We can not pass the state to qemu as a parameter because
> >> Secondary QEMU already started at this point, so we introduce
> > 
> > s/Secondary/secondary/
> >> libxl__domain_restore_device_model() to do it.
> >> This API should be called before resuming secondary vm.
> > 
> > s/before/MUST/
> 
> s/should/MUST/?

 Yes. Thank you!


Re: [Xen-devel] XSAVE flavors

2016-01-26 Thread Jan Beulich

>>> On 26.01.16 at 15:33,  wrote:
> originally I only meant to inquire about the state of the promised
> alternatives improvement to the XSAVE code. However, while
> looking over the code in question again I stumbled across a
> separate issue: XSAVES, just like XSAVEOPT, may use the
> "modified" optimization. However, the fcs and fds handling code
> that has been present around the use of XSAVEOPT did not also
> get applied to the XSAVES path. I suppose this was just an
> oversight?
> 
> With this another question then is whether, when both XSAVEC
> and XSAVEOPT are available, it is indeed always better to use
> XSAVEC (as the code is doing after your enabling).

And I'm afraid there's yet one more issue: If my reading of the
SDM is right, then the offsets at which components get saved
by XSAVEC / XSAVES aren't fixed, but depend on RFBM (as that's
what gets stored into xcomp_bv[62:0]). xstate_comp_offsets[],
otoh, gets computed based on all available features, irrespective
of vcpu_xsave_mask() returning four different values depending
on current guest state. I can't see how get_xsave_addr() can
work correctly without honoring xcomp_bv. Nor can I convince
myself that state can't get corrupted / lost, e.g. when a save
with v->fpu_dirtied set is followed by one with v->fpu_dirtied
clear.

Am I misunderstanding what the SDM writes?
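To illustrate the concern, here is a small sketch of how I read the
compacted-format layout rules: extended components start right after the
512-byte legacy region and the 64-byte header, and follow each other in
ascending bit order for exactly those bits set in xcomp_bv, with optional
64-byte alignment per component. The function name and the size[] and
align64[] tables are hypothetical; real code would derive them from CPUID
leaf 0xD rather than take them as parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XSAVE_EXT_BASE 576u   /* 512-byte legacy region + 64-byte header */

/*
 * Offset of state component idx inside a compacted XSAVE area whose header
 * holds xcomp_bv, or 0 if that component is not part of the image.
 * size[] and align64[] are hypothetical per-component tables with nr entries.
 */
static uint32_t compacted_offset_sketch(uint64_t xcomp_bv, unsigned int idx,
                                        const uint32_t size[],
                                        const bool align64[],
                                        unsigned int nr)
{
    uint32_t off = XSAVE_EXT_BASE;
    unsigned int i;

    assert(nr <= 63);   /* bit 63 of xcomp_bv is the "compacted" flag */

    if ( idx < 2 || idx >= nr || !(xcomp_bv & (1ull << idx)) )
        return 0;

    for ( i = 2; i <= idx; i++ )
    {
        if ( !(xcomp_bv & (1ull << i)) )
            continue;               /* absent components take no space */
        if ( align64[i] )
            off = (off + 63) & ~63u;
        if ( i == idx )
            break;                  /* off is now the start of idx */
        off += size[i];
    }

    return off;
}
```

For example, with illustrative sizes of 256 bytes for component 2 and 64
bytes for component 5 (and no alignment requirements), component 6 starts
at 896 when component 2 is in xcomp_bv and at 640 when it is not. A table
computed once from all available features cannot describe both images.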

Jan


Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.

2016-01-26 Thread Xu, Quan

> On January 26, 2016 at 10:00pm,  wrote:
> >>> On 26.01.16 at 14:47,  wrote:
> > As you mentioned , I simply need to consult the bitmap along with the
> > domain ID array.
> >
> > +If ( test_bit(did, iommu->domid_bitmap) && iommu->domid_map[did] >= 0 )
> > +   d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> >
> > Is it right now?
> 
> Mostly, except that I don't understand the >= 0 part.
> 
A domain ID should always be >= 0.
If the check is redundant, I can remove it.
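For reference, a self-contained sketch of the lookup under discussion. It
assumes the map stores unsigned 16-bit domain IDs (as Xen's domid_t does),
in which case a ">= 0" comparison can never be false and the bitmap alone
says whether a slot is in use. The struct, the table size and the helper
names are hypothetical, not the VT-d code itself.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_DIDS_SKETCH 256u   /* hypothetical number of hardware domain IDs */
#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_LONGS_SKETCH \
    ((NR_DIDS_SKETCH + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH)

/* Hypothetical, cut-down view of the per-IOMMU bookkeeping. */
struct iommu_sketch {
    unsigned long domid_bitmap[BITMAP_LONGS_SKETCH]; /* slot in use? */
    uint16_t domid_map[NR_DIDS_SKETCH];              /* did -> domain ID */
};

static bool test_bit_sketch(unsigned int nr, const unsigned long *map)
{
    return (map[nr / BITS_PER_LONG_SKETCH] >> (nr % BITS_PER_LONG_SKETCH)) & 1ul;
}

/*
 * Translate a hardware domain ID into a Xen domain ID.
 * Returns false if the slot is out of range or not allocated.
 */
static bool did_to_domid_sketch(const struct iommu_sketch *iommu,
                                unsigned int did, uint16_t *domid)
{
    assert(iommu && domid);

    if ( did >= NR_DIDS_SKETCH || !test_bit_sketch(did, iommu->domid_bitmap) )
        return false;

    *domid = iommu->domid_map[did];
    return true;
}
```

The caller would then pass the returned ID to rcu_lock_domain_by_id() and
still has to cope with that returning NULL.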

> > At first, I am open for any solution.
> > pcidevs_lock is quite a big lock. For this point, it looks much better
> > to add a new flag to delay hiding device.
> > I am also afraid that it may raise further security issues.
> 
> Well, I'd say just go and see which one turns out to be less cumbersome and/or
> less intrusive.
> 
Any good ideas for this lock?
IMO, I can start by adding a new flag to delay hiding the device.

-Quan



Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-26 Thread Haozhong Zhang

On 01/26/16 05:44, Jan Beulich wrote:
> >>> On 26.01.16 at 12:44,  wrote:
> > On Thu, Jan 21, 2016 at 2:52 PM, Jan Beulich  wrote:
> > On 21.01.16 at 15:01,  wrote:
> >>> On 01/21/16 03:25, Jan Beulich wrote:
>  >>> On 21.01.16 at 10:10,  wrote:
>  > c) hypervisor should manage PMEM resource pool and partition it to
>  > multiple
>  > VMs.
> 
>  Yes.
> 
> >>>
> >>> But I still do not quite understand this part: why must pmem resource
> >>> management and partition be done in hypervisor?
> >>
> >> Because that's where memory management belongs. And PMEM,
> >> other than PBLK, is just another form of RAM.
> > 
> > I haven't looked more deeply into the details of this, but this
> > argument doesn't seem right to me.
> > 
> > Normal RAM in Xen is what might be called "fungible" -- at boot, all
> > RAM is zeroed, and it basically doesn't matter at all what RAM is
> > given to what guest.  (There are restrictions of course: lowmem for
> > DMA, contiguous superpages,  but within those groups, it doesn't
> > matter *which* bit of lowmem you get, as long as you get enough to do
> > your job.)  If you reboot your guest or hand RAM back to the
> > hypervisor, you assume that everything in it will disappear.  When you
> > ask for RAM, you can request some parameters that it will have
> > (lowmem, on a specific node, ), but you can't request a specific
> > page that you had before.
> > 
> > This is not the case for PMEM.  The whole point of PMEM (correct me if
> > I'm wrong) is to be used for long-term storage that survives over
> > reboot.  It matters very much that a guest be given the same PRAM
> > after the host is rebooted that it was given before.  It doesn't make
> > any sense to manage it the way Xen currently manages RAM (i.e., that
> > you request a page and get whatever Xen happens to give you).
> 
> Interesting. This isn't the usage model I have been thinking about
> so far. Having just gone back to the original 0/4 mail, I'm afraid
> we're really left guessing, and you guessed differently than I did.
> My understanding of the intentions of PMEM so far was that this
> is a high-capacity, slower than DRAM but much faster than e.g.
> swapping to disk alternative to normal RAM. I.e. the persistent
> aspect of it wouldn't matter at all in this case (other than for PBLK,
> obviously).
>

Of course, pmem could be used in the way you describe because of its
'ram' aspect. But I think the more meaningful usage comes from its
persistent aspect. For example, a journaling file system could store
its logs in pmem rather than in normal ram, so that if a power failure
happens before those in-memory logs are completely written to the disk,
there would still be a chance to restore them from pmem after the next
boot (rather than abandoning all of them).

(I'm still writing the design doc, which will include more details of
the underlying hardware and the software interface of nvdimm exposed by
current linux)

> However, thinking through your usage model I have problems
> seeing it work in a reasonable way even with virtualization left
> aside: To my knowledge there's no established protocol on how
> multiple parties (different versions of the same OS, or even
> completely different OSes) would arbitrate using such memory
> ranges. And even for a single OS it is, other than for disks (and
> hence PBLK), not immediately clear how it would communicate
> from one boot to another what information got stored where,
> or how it would react to some or all of this storage having
> disappeared (just like a disk which got removed, which - unless
> it held the boot partition - would normally have pretty little
> effect on the OS coming back up).
>

The label storage area is a persistent area on the NVDIMM that can be
used to store partition information. It's not part of pmem (the part
that is mapped into the system address space). Instead, it can only be
accessed through the NVDIMM _DSM method [1]. However, what contents are
stored and how they are interpreted are left to software. One way is
to follow the NVDIMM Namespace Specification [2] and store an array of
labels that describe the start address (from the base 0 of pmem) and
the size of each partition, which is called a namespace. On Linux,
each namespace is exposed as a /dev/pmemXX device.
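
As a rough illustration of the label idea described above, the sketch
below models each label as a (start, size) pair and looks up which
namespace covers a given pmem offset. The struct layout and the names
pmem_label and pmem_find_namespace are assumptions made for this
example; they are not the on-media format defined by the NVDIMM
Namespace Specification.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified label. This is NOT the on-media layout from
 * the NVDIMM Namespace Specification; it only captures the two fields
 * mentioned in the mail.
 */
struct pmem_label {
    uint64_t start; /* offset from base 0 of pmem, in bytes */
    uint64_t size;  /* length of the namespace, in bytes */
};

/*
 * Return the index of the namespace that contains 'offset',
 * or -1 if no label covers it.
 */
static int pmem_find_namespace(const struct pmem_label *labels, size_t n,
                               uint64_t offset)
{
    for (size_t i = 0; i < n; i++) {
        /* Written as a subtraction so start + size cannot overflow. */
        if (offset >= labels[i].start &&
            offset - labels[i].start < labels[i].size) {
            return (int)i;
        }
    }
    return -1;
}
```

A consumer would read the label array once (through _DSM on real
hardware) and then use a lookup like this to decide which namespace an
access belongs to.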

With virtualization, the (virtual) label storage area of the vNVDIMM and
the corresponding _DSM method are emulated by QEMU. The virtual label
storage area is not written to the host one. Instead, we can reserve an
area on pmem for the virtual one.

Besides namespaces, we can also create DAX file systems on pmem and
use files to partition it.

Haozhong

> > So if Xen is going to use PMEM, it will have to invent an entirely new
> > interface for guests, and it will have to keep track of those
> > resources across host reboots.  In other words, it will have

[Xen-devel] [linux-3.10 baseline-only test] 38702: trouble: blocked/broken

2016-01-26 Thread Platform Team regression test user

This run is configured for baseline tests only.

flight 38702 linux-3.10 real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/38702/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-xsm   3 host-install(3) broken REGR. vs. 38513
 build-i386-pvops  3 host-install(3) broken REGR. vs. 38513
 build-i386   3 host-install(3) broken REGR. vs. 38513
 build-amd64   3 host-install(3) broken REGR. vs. 38513
 build-amd64-pvops 3 host-install(3) broken REGR. vs. 38513
 build-amd64-xsm   3 host-install(3) broken REGR. vs. 38513

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-win7-amd64  1 build-check(1)  blocked n/a
 build-i386-rumpuserxen   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 build-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-qemut-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 build-amd64-rumpuserxen   1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-nested-intel  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-amd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)

Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-26 Thread Haozhong Zhang

On 01/26/16 23:30, Haozhong Zhang wrote:
> On 01/26/16 05:44, Jan Beulich wrote:
> > >>> On 26.01.16 at 12:44,  wrote:
> > > On Thu, Jan 21, 2016 at 2:52 PM, Jan Beulich  wrote:
> > > On 21.01.16 at 15:01,  wrote:
> > >>> On 01/21/16 03:25, Jan Beulich wrote:
> >  >>> On 21.01.16 at 10:10,  wrote:
> >  > c) hypervisor should manage PMEM resource pool and partition it to
> >  > multiple VMs.
> > 
> >  Yes.
> > 
> > >>>
> > >>> But I still do not quite understand this part: why must pmem resource
> > >>> management and partition be done in hypervisor?
> > >>
> > >> Because that's where memory management belongs. And PMEM,
> > >> other than PBLK, is just another form of RAM.
> > > 
> > > I haven't looked more deeply into the details of this, but this
> > > argument doesn't seem right to me.
> > > 
> > > Normal RAM in Xen is what might be called "fungible" -- at boot, all
> > > RAM is zeroed, and it basically doesn't matter at all what RAM is
> > > given to what guest.  (There are restrictions of course: lowmem for
> > > DMA, contiguous superpages, etc., but within those groups, it doesn't
> > > matter *which* bit of lowmem you get, as long as you get enough to do
> > > your job.)  If you reboot your guest or hand RAM back to the
> > > hypervisor, you assume that everything in it will disappear.  When you
> > > ask for RAM, you can request some parameters that it will have
> > > (lowmem, on a specific node, etc.), but you can't request a specific
> > > page that you had before.
> > > 
> > > This is not the case for PMEM.  The whole point of PMEM (correct me if
> > > I'm wrong) is to be used for long-term storage that survives over
> > > reboot.  It matters very much that a guest be given the same PRAM
> > > after the host is rebooted that it was given before.  It doesn't make
> > > any sense to manage it the way Xen currently manages RAM (i.e., that
> > > you request a page and get whatever Xen happens to give you).
> > 
> > Interesting. This isn't the usage model I have been thinking about
> > so far. Having just gone back to the original 0/4 mail, I'm afraid
> > we're really left guessing, and you guessed differently than I did.
> > My understanding of the intentions of PMEM so far was that this
> > is a high-capacity, slower than DRAM but much faster than e.g.
> > swapping to disk alternative to normal RAM. I.e. the persistent
> > aspect of it wouldn't matter at all in this case (other than for PBLK,
> > obviously).
> >
> 
> Of course, pmem could be used in the way you describe because of its
> 'ram' aspect. But I think the more meaningful usage comes from its
> persistent aspect. For example, a journaling file system could store
> its logs in pmem rather than in normal ram, so that if a power failure
> happens before those in-memory logs are completely written to the
> disk, there would still be a chance to restore them from pmem after
> the next boot (rather than abandoning all of them).
> 
> (I'm still writing the design doc, which will include more details of
> the underlying hardware and the software interface of nvdimm exposed by
> current linux)
> 
> > However, thinking through your usage model I have problems
> > seeing it work in a reasonable way even with virtualization left
> > aside: To my knowledge there's no established protocol on how
> > multiple parties (different versions of the same OS, or even
> > completely different OSes) would arbitrate using such memory
> > ranges. And even for a single OS it is, other than for disks (and
> > hence PBLK), not immediately clear how it would communicate
> > from one boot to another what information got stored where,
> > or how it would react to some or all of this storage having
> > disappeared (just like a disk which got removed, which - unless
> > it held the boot partition - would normally have pretty little
> > effect on the OS coming back up).
> >
> 
> The label storage area is a persistent area on the NVDIMM that can be
> used to store partition information. It's not part of pmem (the part
> that is mapped into the system address space). Instead, it can only be
> accessed through the NVDIMM _DSM method [1]. However, what contents are
> stored and how they are interpreted are left to software. One way is
> to follow the NVDIMM Namespace Specification [2] and store an array of
> labels that describe the start address (from the base 0 of pmem) and
> the size of each partition, which is called a namespace. On Linux,
> each namespace is exposed as a /dev/pmemXX device.
> 
> With virtualization, the (virtual) label storage area of the vNVDIMM and
> the corresponding _DSM method are emulated by QEMU. The virtual label
> storage area is not written to the host one. Instead, we can reserve an
> area on pmem for the virtual one.
> 
> Besides namespaces, we can also create DAX file systems on pmem and
> use files to

Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-26 Thread Jan Beulich

>>> On 26.01.16 at 15:44,  wrote:
>>  Last year at Linux Plumbers Conference I attended a session dedicated
>> to NVDIMM support. I asked the very same question and the INTEL guy
>> there told me there is indeed something like a partition table meant
>> to describe the layout of the memory areas and their contents.
> 
> It is described in detail at pmem.io; look at Documents, see
> http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf and its Namespaces section.

Well, that's about how PMEM and PBLK ranges get marked, but not
about how use of the space inside a PMEM range is coordinated.

Jan



[Xen-devel] [PULL 3/8] xen: Switch to libxengnttab interface for compat shims.

2016-01-26 Thread Stefano Stabellini

From: Ian Campbell 

In Xen 4.7 we are refactoring parts of libxenctrl into a number of
separate libraries which will provide backward and forward API and ABI
compatibility.

One such library will be libxengnttab which provides access to grant
tables.

In preparation for this, switch the compatibility layer in xen_common.h
(which supports building with older versions of Xen) to use what will
be the new library API. This means that the gnttab shim will disappear
for versions of Xen which include libxengnttab.

To simplify things for the <= 4.0.0 support we wrap the int fd in a
malloc(sizeof int) such that the handle is always a pointer. This
leads to fewer typedef headaches and avoids the need for
XC_HANDLER_INITIAL_VALUE etc for these interfaces.
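
The fd-wrapping trick described above can be sketched as follows. This
is only an illustration of the idea, not the code from xen_common.h:
the names compat_handle, compat_wrap_fd, compat_handle_fd and
compat_handle_free are hypothetical. It shows why callers can then test
against NULL instead of a per-version "initial value".

```c
#include <stdlib.h>

/* Hypothetical handle type: callers only ever see a compat_handle *. */
typedef int compat_handle;

/*
 * Wrap an already-open file descriptor in a heap-allocated handle.
 * Returns NULL if the fd is invalid or the allocation fails, so a
 * failed open looks the same as it does for a real pointer handle.
 */
static compat_handle *compat_wrap_fd(int fd)
{
    compat_handle *h;

    if (fd < 0) {
        return NULL;
    }
    h = malloc(sizeof(*h));
    if (h != NULL) {
        *h = fd;
    }
    return h;
}

/* Recover the underlying fd, or -1 for a NULL handle. */
static int compat_handle_fd(const compat_handle *h)
{
    return h != NULL ? *h : -1;
}

/*
 * Release the wrapper only; closing the fd itself is left to the
 * caller in this sketch.
 */
static void compat_handle_free(compat_handle *h)
{
    free(h);
}
```

With a wrapper like this, old and new Xen versions both expose a
pointer-typed handle, and a plain NULL check works for either.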

Note that this patch does not add any support for actually using
libxengnttab, it just adjusts the existing shims.

Signed-off-by: Ian Campbell 
Reviewed-by: Stefano Stabellini 
---
 hw/block/xen_disk.c  |   38 --
 hw/char/xen_console.c|4 ++--
 hw/net/xen_nic.c |   18 +-
 hw/xen/xen_backend.c |   10 +-
 include/hw/xen/xen_backend.h |2 +-
 include/hw/xen/xen_common.h  |   42 --
 6 files changed, 69 insertions(+), 45 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 571f651..7bd5bde 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -161,11 +161,11 @@ static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
 static void destroy_grant(gpointer pgnt)
 {
 PersistentGrant *grant = pgnt;
-XenGnttab gnt = grant->blkdev->xendev.gnttabdev;
+xengnttab_handle *gnt = grant->blkdev->xendev.gnttabdev;
 
-if (xc_gnttab_munmap(gnt, grant->page, 1) != 0) {
+if (xengnttab_unmap(gnt, grant->page, 1) != 0) {
 xen_be_printf(&grant->blkdev->xendev, 0,
-  "xc_gnttab_munmap failed: %s\n",
+  "xengnttab_unmap failed: %s\n",
   strerror(errno));
 }
 grant->blkdev->persistent_gnt_count--;
@@ -178,11 +178,11 @@ static void remove_persistent_region(gpointer data, gpointer dev)
 {
 PersistentRegion *region = data;
 struct XenBlkDev *blkdev = dev;
-XenGnttab gnt = blkdev->xendev.gnttabdev;
+xengnttab_handle *gnt = blkdev->xendev.gnttabdev;
 
-if (xc_gnttab_munmap(gnt, region->addr, region->num) != 0) {
+if (xengnttab_unmap(gnt, region->addr, region->num) != 0) {
 xen_be_printf(&blkdev->xendev, 0,
-  "xc_gnttab_munmap region %p failed: %s\n",
+  "xengnttab_unmap region %p failed: %s\n",
   region->addr, strerror(errno));
 }
 xen_be_printf(&blkdev->xendev, 3,
@@ -317,7 +317,7 @@ err:
 
 static void ioreq_unmap(struct ioreq *ioreq)
 {
-XenGnttab gnt = ioreq->blkdev->xendev.gnttabdev;
+xengnttab_handle *gnt = ioreq->blkdev->xendev.gnttabdev;
 int i;
 
 if (ioreq->num_unmap == 0 || ioreq->mapped == 0) {
@@ -327,8 +327,9 @@ static void ioreq_unmap(struct ioreq *ioreq)
 if (!ioreq->pages) {
 return;
 }
-if (xc_gnttab_munmap(gnt, ioreq->pages, ioreq->num_unmap) != 0) {
-xen_be_printf(&ioreq->blkdev->xendev, 0, "xc_gnttab_munmap failed: %s\n",
+if (xengnttab_unmap(gnt, ioreq->pages, ioreq->num_unmap) != 0) {
+xen_be_printf(&ioreq->blkdev->xendev, 0,
+  "xengnttab_unmap failed: %s\n",
   strerror(errno));
 }
 ioreq->blkdev->cnt_map -= ioreq->num_unmap;
@@ -338,8 +339,9 @@ static void ioreq_unmap(struct ioreq *ioreq)
 if (!ioreq->page[i]) {
 continue;
 }
-if (xc_gnttab_munmap(gnt, ioreq->page[i], 1) != 0) {
-xen_be_printf(&ioreq->blkdev->xendev, 0, "xc_gnttab_munmap failed: %s\n",
+if (xengnttab_unmap(gnt, ioreq->page[i], 1) != 0) {
+xen_be_printf(&ioreq->blkdev->xendev, 0,
+  "xengnttab_unmap failed: %s\n",
   strerror(errno));
 }
 ioreq->blkdev->cnt_map--;
@@ -351,7 +353,7 @@ static void ioreq_unmap(struct ioreq *ioreq)
 
 static int ioreq_map(struct ioreq *ioreq)
 {
-XenGnttab gnt = ioreq->blkdev->xendev.gnttabdev;
+xengnttab_handle *gnt = ioreq->blkdev->xendev.gnttabdev;
 uint32_t domids[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 uint32_t refs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 void *page[BLKIF_MAX_SEGMENTS_PER_REQUEST];
@@ -402,7 +404,7 @@ static int ioreq_map(struct ioreq *ioreq)
 }
 
 if (batch_maps && new_maps) {
-ioreq->pages = xc_gnttab_map_grant_refs
+ioreq->pages = xengnttab_map_grant_refs
 (gnt, new_maps, domids, refs, ioreq->prot);
 if (ioreq->pages == NULL) {
 xen_be_printf(&ioreq->blkdev->xendev, 0,
@@

[Xen-devel] [PULL 1/8] xen_console: correctly cleanup primary console on teardown.

2016-01-26 Thread Stefano Stabellini

From: Ian Campbell 

All of the work in con_disconnect applies to the primary console case
(when xendev->dev is NULL). Therefore remove the early check and bail
and allow it to fall through. All of the existing code is correctly
conditional already.

The ->dev and ->gnttabdev handles are either both set or neither. For
consistency with con_initialise() switch to the former here too.

With this con_initialise and con_disconnect now mirror each other.

Fix up a hard tab in the function while editing.

Signed-off-by: Ian Campbell 
Reviewed-by: Stefano Stabellini 
---
 hw/char/xen_console.c |7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index eb7f450..63ade33 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -265,9 +265,6 @@ static void con_disconnect(struct XenDevice *xendev)
 {
 struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
 
-if (!xendev->dev) {
-return;
-}
 if (con->chr) {
 qemu_chr_add_handlers(con->chr, NULL, NULL, NULL, NULL);
 qemu_chr_fe_release(con->chr);
@@ -275,12 +272,12 @@ static void con_disconnect(struct XenDevice *xendev)
 xen_be_unbind_evtchn(&con->xendev);
 
 if (con->sring) {
-if (!xendev->gnttabdev) {
+if (!xendev->dev) {
 munmap(con->sring, XC_PAGE_SIZE);
 } else {
 xc_gnttab_munmap(xendev->gnttabdev, con->sring, 1);
 }
-   con->sring = NULL;
+con->sring = NULL;
 }
 }
 
-- 
1.7.10.4



[Xen-devel] [PULL 8/8] xen: make it possible to build without the Xen PV domain builder

2016-01-26 Thread Stefano Stabellini

From: Ian Campbell 

Until the previous patch this relied on xc_fd(), which was only
implemented for Xen 4.0 and earlier.

Given this hasn't worked since Xen 4.0, I have marked this as disabled
by default.

Removing this support drops the use of a bunch of symbols from
libxenctrl, specifically:

  - xc_domain_create
  - xc_domain_destroy
  - xc_domain_getinfo
  - xc_domain_max_vcpus
  - xc_domain_setmaxmem
  - xc_domain_unpause
  - xc_evtchn_alloc_unbound
  - xc_linux_build

This is another step towards only using Xen libraries which provide a
stable interface.
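
To illustrate the shape of the change, here is a minimal sketch of a
mode switch whose "create" arm is only compiled in when a build option
is set, with a default arm catching modes that were compiled out. The
enum and function names (demo_mode, demo_init) are hypothetical
stand-ins for this example and are not QEMU identifiers; only the
CONFIG_XEN_PV_DOMAIN_BUILD macro name comes from the patch.

```c
#include <stdio.h>

/* Uncomment to mimic a build with --enable-xen-pv-domain-build. */
/* #define CONFIG_XEN_PV_DOMAIN_BUILD 1 */

/* Hypothetical stand-in for the machine's Xen mode. */
enum demo_mode { DEMO_ATTACH, DEMO_CREATE, DEMO_EMULATE };

/* Returns 0 on success, -1 if the mode is unsupported in this build. */
static int demo_init(enum demo_mode mode)
{
    switch (mode) {
    case DEMO_ATTACH:
        /* Nothing to do: the domain already exists. */
        return 0;
#ifdef CONFIG_XEN_PV_DOMAIN_BUILD
    case DEMO_CREATE:
        /* The domain builder would be invoked here. */
        return 0;
#endif
    case DEMO_EMULATE:
        fprintf(stderr, "emulation not implemented\n");
        return -1;
    default:
        /* Reached for DEMO_CREATE when the builder is compiled out. */
        fprintf(stderr, "unhandled mode %d\n", (int)mode);
        return -1;
    }
}
```

The default arm matters here: without it, a compiled-out mode would
fall out of the switch silently instead of failing loudly.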

Signed-off-by: Ian Campbell 
Reviewed-by: Stefano Stabellini 
---
 configure   |   15 +++
 hw/xenpv/Makefile.objs  |4 +++-
 hw/xenpv/xen_machine_pv.c   |   15 +++
 include/hw/xen/xen_common.h |2 ++
 4 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/configure b/configure
index 9ead31d..3506e44 100755
--- a/configure
+++ b/configure
@@ -250,6 +250,7 @@ vnc_jpeg=""
 vnc_png=""
 xen=""
 xen_ctrl_version=""
+xen_pv_domain_build="no"
 xen_pci_passthrough=""
 linux_aio=""
 cap_ng=""
@@ -927,6 +928,10 @@ for opt do
   ;;
   --enable-xen-pci-passthrough) xen_pci_passthrough="yes"
   ;;
+  --disable-xen-pv-domain-build) xen_pv_domain_build="no"
+  ;;
+  --enable-xen-pv-domain-build) xen_pv_domain_build="yes"
+  ;;
   --disable-brlapi) brlapi="no"
   ;;
   --enable-brlapi) brlapi="yes"
@@ -2229,6 +2234,12 @@ if test "$xen_pci_passthrough" != "no"; then
   fi
 fi
 
+if test "$xen_pv_domain_build" = "yes" &&
+   test "$xen" != "yes"; then
+error_exit "User requested Xen PV domain builder support" \
+  "which requires Xen support."
+fi
+
 ##
 # libtool probe
 
@@ -4848,6 +4859,7 @@ fi
 echo "xen support   $xen"
 if test "$xen" = "yes" ; then
   echo "xen ctrl version  $xen_ctrl_version"
+  echo "pv dom build  $xen_pv_domain_build"
 fi
 echo "brlapi support    $brlapi"
 echo "bluez  support    $bluez"
@@ -5219,6 +5231,9 @@ fi
 if test "$xen" = "yes" ; then
   echo "CONFIG_XEN_BACKEND=y" >> $config_host_mak
   echo "CONFIG_XEN_CTRL_INTERFACE_VERSION=$xen_ctrl_version" >> $config_host_mak
+  if test "$xen_pv_domain_build" = "yes" ; then
+echo "CONFIG_XEN_PV_DOMAIN_BUILD=y" >> $config_host_mak
+  fi
 fi
 if test "$linux_aio" = "yes" ; then
   echo "CONFIG_LINUX_AIO=y" >> $config_host_mak
diff --git a/hw/xenpv/Makefile.objs b/hw/xenpv/Makefile.objs
index 49f6e9e..bbf5873 100644
--- a/hw/xenpv/Makefile.objs
+++ b/hw/xenpv/Makefile.objs
@@ -1,2 +1,4 @@
 # Xen PV machine support
-obj-$(CONFIG_XEN) += xen_domainbuild.o xen_machine_pv.o
+obj-$(CONFIG_XEN) += xen_machine_pv.o
+# Xen PV machine builder support
+obj-$(CONFIG_XEN_PV_DOMAIN_BUILD) += xen_domainbuild.o
diff --git a/hw/xenpv/xen_machine_pv.c b/hw/xenpv/xen_machine_pv.c
index 23d6ef0..3250b94 100644
--- a/hw/xenpv/xen_machine_pv.c
+++ b/hw/xenpv/xen_machine_pv.c
@@ -30,9 +30,6 @@
 
 static void xen_init_pv(MachineState *machine)
 {
-const char *kernel_filename = machine->kernel_filename;
-const char *kernel_cmdline = machine->kernel_cmdline;
-const char *initrd_filename = machine->initrd_filename;
 DriveInfo *dinfo;
 int i;
 
@@ -46,17 +43,27 @@ static void xen_init_pv(MachineState *machine)
 case XEN_ATTACH:
 /* nothing to do, xend handles everything */
 break;
-case XEN_CREATE:
+#ifdef CONFIG_XEN_PV_DOMAIN_BUILD
+case XEN_CREATE: {
+const char *kernel_filename = machine->kernel_filename;
+const char *kernel_cmdline = machine->kernel_cmdline;
+const char *initrd_filename = machine->initrd_filename;
 if (xen_domain_build_pv(kernel_filename, initrd_filename,
 kernel_cmdline) < 0) {
 fprintf(stderr, "xen pv domain creation failed\n");
 exit(1);
 }
 break;
+}
+#endif
 case XEN_EMULATE:
 fprintf(stderr, "xen emulation not implemented (yet)\n");
 exit(1);
 break;
+default:
+fprintf(stderr, "unhandled xen_mode %d\n", xen_mode);
+exit(1);
+break;
 }
 
 xen_be_register("console", &xen_console_ops);
diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
index be7a915..0d83891 100644
--- a/include/hw/xen/xen_common.h
+++ b/include/hw/xen/xen_common.h
@@ -505,6 +505,7 @@ static inline int xen_xc_domain_add_to_physmap(XenXC xch, uint32_t domid,
 }
 #endif
 
+#ifdef CONFIG_XEN_PV_DOMAIN_BUILD
 #if CONFIG_XEN_CTRL_INTERFACE_VERSION < 470
 static inline int xen_domain_create(XenXC xc, uint32_t ssidref,
 xen_domain_handle_t handle, uint32_t flags,
@@ -520,6 +521,7 @@ static inline int xen_domain_create(XenXC xc, uint32_t ssidref,
 return xc_domain_create(xc, ssidref, handle, flags, pdomid, NULL);
 }
 #endif
+#endif
 
 #if

[Xen-devel] [PULL 6/8] xen: Use stable library interfaces when they are available.

2016-01-26 Thread Stefano Stabellini

From: Ian Campbell 

In Xen 4.7 we are refactoring parts of libxenctrl into a number of
separate libraries which will provide backward and forward API and ABI
compatibility.

Specifically libxenevtchn, libxengnttab and libxenforeignmemory.

Previous patches have already laid the groundwork for using these by
switching the existing compatibility shims to reflect the interfaces to
these libraries.

So all that remains is to update configure to detect the libraries
and enable their use. Although they are notionally independent, we take
an all-or-nothing approach to the three libraries since they were
added at the same time.

The only non-obvious bit is that we now open a proper xenforeignmemory
handle for xen_fmem instead of reusing the xen_xc handle.

Build tested with 4.0 .. 4.6 (inclusive) and the patches targeting
4.7 which add these libraries.

This uses CONFIG_XEN_CTRL_INTERFACE_VERSION == 471 to cover the
introduction of these new interfaces.
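
The version gating can be pictured as a three-way ladder. The sketch
below expresses it as a runtime function so that each tier can be
exercised; the header itself makes the same decision at compile time
with #if/#elif on CONFIG_XEN_CTRL_INTERFACE_VERSION. The enum and
function names are hypothetical, and the lower boundary (<= 400) is an
assumption based on the "<= 4.0.0 support" wording used elsewhere in
this series.

```c
/* Hypothetical names for the three compatibility tiers. */
enum xen_api_tier {
    XEN_TIER_LEGACY, /* assumed: <= 400, int fds wrapped in handles */
    XEN_TIER_LIBXC,  /* 4.1 thru 4.6: shims on top of libxenctrl */
    XEN_TIER_STABLE  /* >= 471: real libxenevtchn/gnttab/foreignmemory */
};

/* Map an interface version number to the tier of shims it needs. */
static enum xen_api_tier xen_api_tier_for(int version)
{
    if (version <= 400) {
        return XEN_TIER_LEGACY;
    }
    if (version < 471) {
        return XEN_TIER_LIBXC;
    }
    return XEN_TIER_STABLE;
}
```

Treating 471 as a single threshold matches the all-or-nothing approach
to the three libraries described above.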

Signed-off-by: Ian Campbell 
Reviewed-by: Stefano Stabellini 
---
 configure   |   55 +++
 include/hw/xen/xen_common.h |   35 +--
 2 files changed, 88 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 44ac9ab..9ead31d 100755
--- a/configure
+++ b/configure
@@ -1938,6 +1938,7 @@ fi
 
 if test "$xen" != "no" ; then
   xen_libs="-lxenstore -lxenctrl -lxenguest"
+  xen_stable_libs="-lxenforeignmemory -lxengnttab -lxenevtchn"
 
   # First we test whether Xen headers and libraries are available.
   # If no, we are done and there is no Xen support.
@@ -1960,6 +1961,57 @@ EOF
   # Xen unstable
   elif
   cat > $TMPC <
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#if !defined(HVM_MAX_VCPUS)
+# error HVM_MAX_VCPUS not defined
+#endif
+int main(void) {
+  xc_interface *xc = NULL;
+  xenforeignmemory_handle *xfmem;
+  xenevtchn_handle *xe;
+  xengnttab_handle *xg;
+  xen_domain_handle_t handle;
+
+  xs_daemon_open();
+
+  xc = xc_interface_open(0, 0, 0);
+  xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0);
+  xc_domain_add_to_physmap(0, 0, XENMAPSPACE_gmfn, 0, 0);
+  xc_hvm_inject_msi(xc, 0, 0xf000, 0x);
+  xc_hvm_create_ioreq_server(xc, 0, HVM_IOREQSRV_BUFIOREQ_ATOMIC, NULL);
+  xc_domain_create(xc, 0, handle, 0, NULL, NULL);
+
+  xfmem = xenforeignmemory_open(0, 0);
+  xenforeignmemory_map(xfmem, 0, 0, 0, 0, 0);
+
+  xe = xenevtchn_open(0, 0);
+  xenevtchn_fd(xe);
+
+  xg = xengnttab_open(0, 0);
+  xengnttab_map_grant_ref(xg, 0, 0, 0);
+
+  return 0;
+}
+EOF
+  compile_prog "" "$xen_libs $xen_stable_libs"
+then
+xen_ctrl_version=471
+xen=yes
+  elif
+  cat > $TMPC <
 #include 
 int main(void) {
@@ -2153,6 +2205,9 @@ EOF
   fi
 
   if test "$xen" = yes; then
+if test $xen_ctrl_version -ge 471  ; then
+  libs_softmmu="$xen_stable_libs $libs_softmmu"
+fi
 libs_softmmu="$xen_libs $libs_softmmu"
   fi
 fi
diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
index 95275b3..19f1577 100644
--- a/include/hw/xen/xen_common.h
+++ b/include/hw/xen/xen_common.h
@@ -6,6 +6,15 @@
 #include 
 #include 
 
+/*
+ * If we have new enough libxenctrl then we do not want/need these compat
+ * interfaces, despite what the user supplied cflags might say. They
+ * must be undefined before including xenctrl.h
+ */
+#undef XC_WANT_COMPAT_EVTCHN_API
+#undef XC_WANT_COMPAT_GNTTAB_API
+#undef XC_WANT_COMPAT_MAP_FOREIGN_API
+
 #include 
 #if CONFIG_XEN_CTRL_INTERFACE_VERSION < 420
 #  include 
@@ -148,8 +157,8 @@ static inline void xs_close(struct xs_handle *xsh)
 }
 
 
-/* Xen 4.1 */
-#else
+/* Xen 4.1 thru 4.6 */
+#elif CONFIG_XEN_CTRL_INTERFACE_VERSION < 471
 
 typedef xc_interface *XenXC;
 typedef xc_interface *xenforeignmemory_handle;
@@ -184,6 +193,28 @@ static inline XenXC xen_xc_interface_open(void *logger, void *dombuild_logger,
 
 /* See below for xenforeignmemory_* APIs */
 
+/* FIXME There is no way to have the xen fd */
+static inline int xc_fd(xc_interface *xen_xc)
+{
+return -1;
+}
+#else /* CONFIG_XEN_CTRL_INTERFACE_VERSION >= 471 */
+
+typedef xc_interface *XenXC;
+
+#  define XC_INTERFACE_FMT "%p"
+#  define XC_HANDLER_INITIAL_VALUE NULL
+
+#include 
+#include 
+#include 
+
+static inline XenXC xen_xc_interface_open(void *logger, void *dombuild_logger,
+  unsigned int open_flags)
+{
+return xc_interface_open(logger, dombuild_logger, open_flags);
+}
+
 /* FIXME There is now way to have the xen fd */
 static inline int xc_fd(xc_interface *xen_xc)
 {
-- 
1.7.10.4



[Xen-devel] [PULL 2/8] xen: Switch to libxenevtchn interface for compat shims.

2016-01-26 Thread Stefano Stabellini

From: Ian Campbell 

In Xen 4.7 we are refactoring parts of libxenctrl into a number of
separate libraries which will provide backward and forward API and ABI
compatibility.

One such library will be libxenevtchn which provides access to event
channels.

In preparation for this, switch the compatibility layer in xen_common.h
(which supports building with older versions of Xen) to use what will
be the new library API. This means that the evtchn shim will disappear
for versions of Xen which include libxenevtchn.

To simplify things for the <= 4.0.0 support we wrap the int fd in a
malloc(sizeof int) such that the handle is always a pointer. This
leads to fewer typedef headaches and avoids the need for
XC_HANDLER_INITIAL_VALUE etc for these interfaces.

Note that this patch does not add any support for actually using
libxenevtchn, it just adjusts the existing shims.

Note that xc_evtchn_alloc_unbound functionality remains in libxenctrl,
since that functionality is not exposed by /dev/xen/evtchn.

Signed-off-by: Ian Campbell 
Reviewed-by: Stefano Stabellini 
---
 hw/xen/xen_backend.c |   31 +++--
 include/hw/xen/xen_backend.h |2 +-
 include/hw/xen/xen_common.h  |   44 --
 xen-hvm.c|   25 
 4 files changed, 64 insertions(+), 38 deletions(-)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index 2510e2e..ae2a1f0 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -243,19 +243,19 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
 xendev->debug  = debug;
 xendev->local_port = -1;
 
-xendev->evtchndev = xen_xc_evtchn_open(NULL, 0);
-if (xendev->evtchndev == XC_HANDLER_INITIAL_VALUE) {
+xendev->evtchndev = xenevtchn_open(NULL, 0);
+if (xendev->evtchndev == NULL) {
 xen_be_printf(NULL, 0, "can't open evtchn device\n");
 g_free(xendev);
 return NULL;
 }
-fcntl(xc_evtchn_fd(xendev->evtchndev), F_SETFD, FD_CLOEXEC);
+fcntl(xenevtchn_fd(xendev->evtchndev), F_SETFD, FD_CLOEXEC);
 
 if (ops->flags & DEVOPS_FLAG_NEED_GNTDEV) {
 xendev->gnttabdev = xen_xc_gnttab_open(NULL, 0);
 if (xendev->gnttabdev == XC_HANDLER_INITIAL_VALUE) {
 xen_be_printf(NULL, 0, "can't open gnttab device\n");
-xc_evtchn_close(xendev->evtchndev);
+xenevtchn_close(xendev->evtchndev);
 g_free(xendev);
 return NULL;
 }
@@ -306,8 +306,8 @@ static struct XenDevice *xen_be_del_xendev(int dom, int dev)
 g_free(xendev->fe);
 }
 
-if (xendev->evtchndev != XC_HANDLER_INITIAL_VALUE) {
-xc_evtchn_close(xendev->evtchndev);
+if (xendev->evtchndev != NULL) {
+xenevtchn_close(xendev->evtchndev);
 }
 if (xendev->gnttabdev != XC_HANDLER_INITIAL_VALUE) {
 xc_gnttab_close(xendev->gnttabdev);
@@ -691,13 +691,14 @@ static void xen_be_evtchn_event(void *opaque)
 struct XenDevice *xendev = opaque;
 evtchn_port_t port;
 
-port = xc_evtchn_pending(xendev->evtchndev);
+port = xenevtchn_pending(xendev->evtchndev);
 if (port != xendev->local_port) {
-xen_be_printf(xendev, 0, "xc_evtchn_pending returned %d (expected %d)\n",
+xen_be_printf(xendev, 0,
+  "xenevtchn_pending returned %d (expected %d)\n",
   port, xendev->local_port);
 return;
 }
-xc_evtchn_unmask(xendev->evtchndev, port);
+xenevtchn_unmask(xendev->evtchndev, port);
 
 if (xendev->ops->event) {
 xendev->ops->event(xendev);
@@ -740,14 +741,14 @@ int xen_be_bind_evtchn(struct XenDevice *xendev)
 if (xendev->local_port != -1) {
 return 0;
 }
-xendev->local_port = xc_evtchn_bind_interdomain
+xendev->local_port = xenevtchn_bind_interdomain
 (xendev->evtchndev, xendev->dom, xendev->remote_port);
 if (xendev->local_port == -1) {
-xen_be_printf(xendev, 0, "xc_evtchn_bind_interdomain failed\n");
+xen_be_printf(xendev, 0, "xenevtchn_bind_interdomain failed\n");
 return -1;
 }
 xen_be_printf(xendev, 2, "bind evtchn port %d\n", xendev->local_port);
-qemu_set_fd_handler(xc_evtchn_fd(xendev->evtchndev),
+qemu_set_fd_handler(xenevtchn_fd(xendev->evtchndev),
 xen_be_evtchn_event, NULL, xendev);
 return 0;
 }
@@ -757,15 +758,15 @@ void xen_be_unbind_evtchn(struct XenDevice *xendev)
 if (xendev->local_port == -1) {
 return;
 }
-qemu_set_fd_handler(xc_evtchn_fd(xendev->evtchndev), NULL, NULL, NULL);
-xc_evtchn_unbind(xendev->evtchndev, xendev->local_port);
+qemu_set_fd_handler(xenevtchn_fd(xendev->evtchndev), NULL, NULL, NULL);
+xenevtchn_unbind(xendev->evtchndev, xendev->local_port);
 xen_be_printf(xendev, 2, "unbind

[Xen-devel] [PULL 0/8] xen-20160126

2016-01-26 Thread Stefano Stabellini

The following changes since commit 1535a6d699487740b490369e44f9ca8d305463cd:

  Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging (2016-01-26 09:16:07 +)

are available in the git repository at:


  git://xenbits.xen.org/people/sstabellini/qemu-dm.git tags/xen-20160126

for you to fetch changes up to f4297d663d92844f87aeb6ea762244167490dadb:

  xen: make it possible to build without the Xen PV domain builder (2016-01-26 15:03:38 +)


Xen 2016/01/26


Ian Campbell (8):
  xen_console: correctly cleanup primary console on teardown.
  xen: Switch to libxenevtchn interface for compat shims.
  xen: Switch to libxengnttab interface for compat shims.
  xen: Switch uses of xc_map_foreign_range into xc_map_foreign_pages
  xen: Switch uses of xc_map_foreign_{pages,bulk} to use libxenforeignmemory API.
  xen: Use stable library interfaces when they are available.
  xen: domainbuild: reopen libxenctrl interface after forking for domain watcher.
  xen: make it possible to build without the Xen PV domain builder

 configure                    |   70
 hw/block/xen_disk.c          |   38 ++-
 hw/char/xen_console.c        |   19 +++---
 hw/display/xenfb.c           |   28
 hw/net/xen_nic.c             |   18 ++---
 hw/xen/xen_backend.c         |   44 +++--
 hw/xenpv/Makefile.objs       |    4 +-
 hw/xenpv/xen_domainbuild.c   |    9 ++-
 hw/xenpv/xen_machine_pv.c    |   15 +++--
 include/hw/xen/xen_backend.h |    5 +-
 include/hw/xen/xen_common.h  |  149 +-
 xen-common.c                 |    6 ++
 xen-hvm.c                    |   39 +--
 xen-mapcache.c               |    6 +-
 14 files changed, 315 insertions(+), 135 deletions(-)

[Xen-devel] [PULL 5/8] xen: Switch uses of xc_map_foreign_{pages, bulk} to use libxenforeignmemory API.

2016-01-26 Thread Stefano Stabellini

From: Ian Campbell 

In Xen 4.7 we are refactoring parts of libxenctrl into a number of
separate libraries which will provide backward and forward API and ABI
compatibility.

One such library will be libxenforeignmemory which provides access to
privileged foreign mappings and which will provide an interface
equivalent to xc_map_foreign_{pages,bulk}.

The new xenforeignmemory_map() function behaves like
xc_map_foreign_pages() when the err argument is NULL and like
xc_map_foreign_bulk() when err is non-NULL, which maps into the shim
here onto checking err == NULL and calling the appropriate old
function.

Note that xenforeignmemory_map() takes the number of pages before the
arrays themselves, in order to support potentially future use of
variable-length-arrays in the prototype (in the future, when Xen's
baseline toolchain requirements are new enough to ensure VLAs are
supported).
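
As a rough illustration of the dispatch described above (a sketch only: demo_map_pages, demo_map_bulk and demo_foreignmemory_map are made-up stand-ins, not the real libxenctrl or libxenforeignmemory calls), the shim could look something like this, with the page count placed before the arrays:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame-number type for this sketch. */
typedef uint64_t demo_pfn_t;

static int demo_last_call;   /* 1 = "pages" path taken, 2 = "bulk" path taken */
static char demo_page[16];   /* dummy object whose address stands in for a mapping */

/* Stand-in for the legacy call without per-page error reporting. */
static void *demo_map_pages(const demo_pfn_t *arr, size_t pages)
{
    (void)arr;
    (void)pages;
    demo_last_call = 1;
    return demo_page;
}

/* Stand-in for the legacy call that fills in one status per page. */
static void *demo_map_bulk(const demo_pfn_t *arr, int *err, size_t pages)
{
    (void)arr;
    for (size_t i = 0; i < pages; i++)
        err[i] = 0;          /* every page "succeeds" in this mock */
    demo_last_call = 2;
    return demo_page;
}

/* The shim: page count first, then the arrays; err == NULL picks the
 * "pages" behaviour and a non-NULL err picks the "bulk" behaviour. */
static void *demo_foreignmemory_map(size_t pages,
                                    const demo_pfn_t arr[/*pages*/],
                                    int err[/*pages*/])
{
    if (err == NULL)
        return demo_map_pages(arr, pages);
    return demo_map_bulk(arr, err, pages);
}
```

A caller that does not care about per-page errors passes NULL for err, much as the converted call sites in this patch do.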

In preparation for adding support for libxenforeignmemory add support
to the <=4.0 and <=4.6 compat code in xen_common.h to allow us to
switch to using the new API. These shims will disappear for versions
of Xen which include libxenforeignmemory.

Since libxenforeignmemory will have its own handle type but for <= 4.6
the functionality is provided by using a libxenctrl handle we
introduce a new global xen_fmem alongside the existing xen_xc. In fact
we make xen_fmem a pointer to the existing xen_xc, which then works
correctly with both <=4.0 (xc handle is an int) and <=4.6 (xc handle
is a pointer). In the latter case xen_fmem is actually a double
indirect pointer, but it all falls out in the wash.
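
To make the "pointer to the existing handle" idea a bit more concrete, here is a small sketch. All names (demo_handle_int, demo_handle_ptr and so on) are invented for illustration and are not the real QEMU or libxenctrl definitions; the int and pointer flavours only mimic the two handle styles mentioned above.

```c
#include <stddef.h>

/* Hypothetical handle flavours: an int (older style) or a pointer (newer style). */
typedef int demo_handle_int;
typedef struct demo_iface { int id; } *demo_handle_ptr;

/* The pre-existing global handles. */
static demo_handle_int demo_xc_int = 7;
static struct demo_iface demo_iface_obj = { 42 };
static demo_handle_ptr demo_xc_ptr = &demo_iface_obj;

/* The "foreign memory" handle is simply the address of the existing global. */
static demo_handle_int *demo_fmem_int = &demo_xc_int;
static demo_handle_ptr *demo_fmem_ptr = &demo_xc_ptr;   /* pointer to a pointer */

/* Shim-side helpers dereference once to get back to the underlying handle. */
static int demo_use_int(demo_handle_int *fmem) { return *fmem; }
static int demo_use_ptr(demo_handle_ptr *fmem) { return (*fmem)->id; }
```

One consequence of this layout in the sketch is that reassigning the underlying global is automatically visible through the derived handle.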

Unlike libxenctrl libxenforeignmemory has an explicit unmap function,
rather than just specifying that munmap should be used, so the unmap
paths are updated to use xenforeignmemory_unmap, which is a shim for
munmap on these versions of xen. The mappings in xen-hvm.c do not
appear to be unmapped (which makes sense for a qemu-dm process)

In fb_disconnect this results in a change from simply mmap over the
existing mapping (with an implicit munmap) to explicitly unmapping with
xenforeignmemory_unmap and then mapping the required anonymous memory
in the same hole. I don't think this is a problem since any other
thread which was racily touching this region would already be running
the risk of hitting the mapping halfway through the call. If this is
thought to be a problem then we could consider adding an extra API to
the libxenforeignmemory interface to replace a foreign mapping with
anonymous shared memory, but I'd prefer not to.
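
The unmap-then-remap sequence can be sketched with plain POSIX calls. This is only an illustration under assumptions: demo_replace_with_anon is a made-up helper, munmap stands in for the unmap shim, and the use of MAP_FIXED to land in the same hole is my assumption rather than something taken from the patch.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Explicitly unmap the region at addr, then map fresh anonymous shared
 * memory at the same address. Between the two calls the range is
 * unmapped, which is the window discussed in the text above.
 * Returns the new mapping, or MAP_FAILED on error. */
static void *demo_replace_with_anon(void *addr, size_t len)
{
    if (munmap(addr, len) != 0)
        return MAP_FAILED;
    return mmap(addr, len, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
}
```

In a single-threaded caller the new mapping comes back at the old address and reads as zeroes.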

Signed-off-by: Ian Campbell 
Reviewed-by: Stefano Stabellini 
---
 hw/char/xen_console.c        |    8
 hw/display/xenfb.c           |   17 +
 hw/xen/xen_backend.c         |    3 ++-
 include/hw/xen/xen_backend.h |    1 +
 include/hw/xen/xen_common.h  |   25 +
 xen-common.c                 |    6 ++
 xen-hvm.c                    |   12 ++--
 xen-mapcache.c               |    6 +++---
 8 files changed, 56 insertions(+), 22 deletions(-)

diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index 3e8a57b..b92d0c6 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -229,9 +229,9 @@ static int con_initialise(struct XenDevice *xendev)
 
 if (!xendev->dev) {
 xen_pfn_t mfn = con->ring_ref;
-con->sring = xc_map_foreign_pages(xen_xc, con->xendev.dom,
- PROT_READ|PROT_WRITE,
- &mfn, 1);
+con->sring = xenforeignmemory_map(xen_fmem, con->xendev.dom,
+  PROT_READ|PROT_WRITE,
+  1, &mfn, NULL);
 } else {
 con->sring = xengnttab_map_grant_ref(xendev->gnttabdev, con->xendev.dom,
  con->ring_ref,
@@ -273,7 +273,7 @@ static void con_disconnect(struct XenDevice *xendev)
 
 if (con->sring) {
 if (!xendev->dev) {
-munmap(con->sring, XC_PAGE_SIZE);
+xenforeignmemory_unmap(xen_fmem, con->sring, 1);
 } else {
 xengnttab_unmap(xendev->gnttabdev, con->sring, 1);
 }
diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
index aa38803..1676660 100644
--- a/hw/display/xenfb.c
+++ b/hw/display/xenfb.c
@@ -106,8 +106,8 @@ static int common_bind(struct common *c)
 if (xenstore_read_fe_int(&c->xendev, "event-channel", &c->xendev.remote_port) == -1)
return -1;
 
-c->page = xc_map_foreign_pages(xen_xc, c->xendev.dom,
-   PROT_READ | PROT_WRITE, &mfn, 1);
+c->page = xenforeignmemory_map(xen_fmem, c->xendev.dom,
+   PROT_READ | PROT_WRITE, 1, &mfn, NULL);
 if (c->page == NULL)
return -1;
 
@@ -122,7 +122,7 @@ static void common_unbind(struct common *c)
 {
 xen_be_unbind_evtchn(&c->xendev);
 if (c->page) {
-

[Xen-devel] [PULL 7/8] xen: domainbuild: reopen libxenctrl interface after forking for domain watcher.

2016-01-26 Thread Stefano Stabellini

From: Ian Campbell 

Using an existing libxenctrl handle after a fork was never
particularly safe (especially if foreign mappings existed at the time
of the fork) and the xc fd has been unavailable for many releases.

Reopen the handle after fork and therefore do away with xc_fd().
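
For illustration, the close-everything-then-reopen pattern might be sketched as below. This is not the QEMU code: demo_iface_open is a hypothetical stand-in that opens /dev/null instead of a Xen control interface, and the cap on the descriptor scan is an arbitrary choice to keep the sketch quick.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for opening a control interface. */
static int demo_iface_open(void)
{
    return open("/dev/null", O_RDWR);
}

/* Fork a child which closes every inherited descriptor from 3 upwards
 * and then opens its own fresh handle instead of reusing the parent's.
 * Returns 0 if the child obtained a usable handle, -1 otherwise. */
static int demo_fork_and_reopen(void)
{
    int status = 0;
    int handle = demo_iface_open();          /* the parent's handle */
    pid_t pid;

    if (handle < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(handle);
        return -1;
    }

    if (pid == 0) {
        long n = sysconf(_SC_OPEN_MAX);
        if (n < 0 || n > 1024)
            n = 1024;                        /* arbitrary cap for the sketch */
        for (int i = 3; i < n; i++)
            close(i);                        /* the inherited handle goes too */
        handle = demo_iface_open();          /* fresh handle owned by the child */
        _exit(handle >= 0 ? 0 : 1);
    }

    close(handle);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The point of the sketch is only that the child never touches the descriptor it inherited; it closes it along with everything else and starts from a new open.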

Signed-off-by: Ian Campbell 
Acked-by: Stefano Stabellini 
---
 hw/xenpv/xen_domainbuild.c  |    9 ++---
 include/hw/xen/xen_common.h |   17 -
 2 files changed, 6 insertions(+), 20 deletions(-)

diff --git a/hw/xenpv/xen_domainbuild.c b/hw/xenpv/xen_domainbuild.c
index ac0e5ac..f9be029 100644
--- a/hw/xenpv/xen_domainbuild.c
+++ b/hw/xenpv/xen_domainbuild.c
@@ -174,12 +174,15 @@ static int xen_domain_watcher(void)
 for (i = 3; i < n; i++) {
 if (i == fd[0])
 continue;
-if (i == xc_fd(xen_xc)) {
-continue;
-}
 close(i);
 }
 
+/*
+ * Reopen xc interface, since the original is unsafe after fork
+ * and was closed above.
+ */
+xen_xc = xc_interface_open(0, 0, 0);
+
 /* ignore term signals */
 signal(SIGINT,  SIG_IGN);
 signal(SIGTERM, SIG_IGN);
diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
index 19f1577..be7a915 100644
--- a/include/hw/xen/xen_common.h
+++ b/include/hw/xen/xen_common.h
@@ -116,12 +116,6 @@ static inline XenXC xen_xc_interface_open(void *logger, void *dombuild_logger,
 
 /* See below for xenforeignmemory_* APIs */
 
-static inline int xc_fd(int xen_xc)
-{
-return xen_xc;
-}
-
-
 static inline int xc_domain_populate_physmap_exact
 (XenXC xc_handle, uint32_t domid, unsigned long nr_extents,
  unsigned int extent_order, unsigned int mem_flags, xen_pfn_t *extent_start)
@@ -193,11 +187,6 @@ static inline XenXC xen_xc_interface_open(void *logger, void *dombuild_logger,
 
 /* See below for xenforeignmemory_* APIs */
 
-/* FIXME There is no way to have the xen fd */
-static inline int xc_fd(xc_interface *xen_xc)
-{
-return -1;
-}
 #else /* CONFIG_XEN_CTRL_INTERFACE_VERSION >= 471 */
 
 typedef xc_interface *XenXC;
@@ -214,12 +203,6 @@ static inline XenXC xen_xc_interface_open(void *logger, void *dombuild_logger,
 {
 return xc_interface_open(logger, dombuild_logger, open_flags);
 }
-
-/* FIXME There is now way to have the xen fd */
-static inline int xc_fd(xc_interface *xen_xc)
-{
-return -1;
-}
 #endif
 
 /* Xen before 4.2 */
-- 
1.7.10.4


Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.

2016-01-26 Thread Jan Beulich

>>> On 26.01.16 at 16:27,  wrote:
>>  On January 26, 2016 at 10:00pm,  wrote:
>> >>> On 26.01.16 at 14:47,  wrote:
>> > As you mentioned , I simply need to consult the bitmap along with the
>> > domain ID array.
>> >
>> > +If ( test_bit(did, iommu->domid_bitmap) && iommu->domid_map[did] >= 0 )
>> > +   d = rcu_lock_domain_by_id(iommu->domid_map[did]);
>> >
>> > Is it right now?
>> 
>> Mostly, except that I don't understand the >= 0 part.
>> 
> Domain ID should be >= 0..
> If it is redundant, I can remove it.

Quan, please: You have the code, so you can check whether
negative values ever get stored there. And had you checked,
you'd have found that domid_map is declared as u16 *. The
"u" here, as I hope you know, stands for "unsigned". (Of
course this really should be domid_t, but I'm sure the VT-d
maintainers won't care at all about such inconsistencies.)
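
To spell the point out, here is a minimal sketch of a bitmap-guarded lookup into an unsigned ID array. The structure and helper names (demo_iommu, demo_test_bit, demo_lookup) are hypothetical miniatures made up for this note, not the actual VT-d code.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_DOMIDS 64
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical miniature: a "slot in use" bitmap plus a uint16_t ID array. */
struct demo_iommu {
    unsigned long domid_bitmap[DEMO_NR_DOMIDS / (sizeof(unsigned long) * CHAR_BIT) + 1];
    uint16_t domid_map[DEMO_NR_DOMIDS];
};

static bool demo_test_bit(unsigned int nr, const unsigned long *map)
{
    return (map[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL;
}

static void demo_set_bit(unsigned int nr, unsigned long *map)
{
    map[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

/* Returns true and stores the domain ID if slot did is marked in use.
 * Only the bitmap decides validity: the stored value is unsigned, so a
 * ">= 0" comparison on it could never be false. */
static bool demo_lookup(const struct demo_iommu *iommu, unsigned int did,
                        uint16_t *domid)
{
    if (did >= DEMO_NR_DOMIDS || !demo_test_bit(did, iommu->domid_bitmap))
        return false;
    *domid = iommu->domid_map[did];
    return true;
}
```

Even the largest 16-bit value is returned as a valid ID here, because only the bitmap says whether a slot is populated.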

>> > At first, I am open for any solution.
>> > pcidevs_lock is quite a big lock. For this point, it looks much better
>> > to add a new flag to delay hiding device.
>> > I am also afraid that it may raise further security issues.
>> 
>> Well, I'd say just go and see which one turns out to be less cumbersome and/or
>> less intrusive.
>> 
> For this lock, any good idea?
> IMO, I can get started to add a new flag to delay hiding device.

Once again: Before getting started, please assess which route is
going to be the better one. Remember that we had already
discussed and put aside some form of deferring the hiding of
devices, so if you come back with a patch doing that again, you'll
have to be able to explain why the alternative(s) are worse.

Jan

Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-26 Thread Haozhong Zhang

On 01/26/16 08:37, Jan Beulich wrote:
> >>> On 26.01.16 at 15:44,  wrote:
> >>  Last year at Linux Plumbers Conference I attended a session dedicated
> >> to NVDIMM support. I asked the very same question and the INTEL guy
> >> there told me there is indeed something like a partition table meant
> >> to describe the layout of the memory areas and their contents.
> > 
> > It is described in details at pmem.io, look at  Documents, see
> > http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf see Namespaces section.
> 
> Well, that's about how PMEM and PBLK ranges get marked, but not
> about how use of the space inside a PMEM range is coordinated.
>

How an NVDIMM is partitioned into pmem and pblk is described by the ACPI NFIT table.
A namespace is to pmem roughly what a partition table is to a disk.

Haozhong


[Xen-devel] [PULL 4/8] xen: Switch uses of xc_map_foreign_range into xc_map_foreign_pages

2016-01-26 Thread Stefano Stabellini

From: Ian Campbell 

In Xen 4.7 we are refactoring parts of libxenctrl into a number of
separate libraries which will provide backward and forward API and ABI
compatibility.

One such library will be libxenforeignmemory which provides access to
privileged foreign mappings and which will provide an interface
equivalent to xc_map_foreign_{pages,bulk}.

In preparation for this, switch all uses of xc_map_foreign_range to
xc_map_foreign_pages. Size was always XC_PAGE_SIZE, so the necessary
adjustments are trivial:

  * Pass &mfn (an array of length 1) instead of mfn. The function
    takes a pointer to const, so there is no possibility of mfn changing
    due to this change.
  * Pass nr_pages=1 instead of size=XC_PAGE_SIZE

There is one wrinkle in xen_console.c:con_initialise() where
con->ring_ref is an int but can in some code paths (when !xendev->dev)
be treated as an mfn. I think this is an existing latent truncation
hazard on platforms where xen_pfn_t is 64-bit and int is 32-bit (e.g.
amd64, both arm* variants). I'm unsure under what circumstances
xendev->dev can be NULL or if anything elsewhere ensures the value
fits into an int. For now I just use a temporary xen_pfn_t to in
effect upcast the pointer from int* to xen_pfn_t*.
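
A small sketch of why the temporary is needed, under assumed types: demo_pfn_t is a hypothetical 64-bit frame-number type and demo_sum_frames is a made-up stand-in for any call that takes an array of frame numbers.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_pfn_t;   /* hypothetical 64-bit frame-number type */

/* Stand-in for a call that consumes an array of frame numbers. */
static demo_pfn_t demo_sum_frames(const demo_pfn_t *arr, size_t nr)
{
    demo_pfn_t sum = 0;
    for (size_t i = 0; i < nr; i++)
        sum += arr[i];
    return sum;
}

static demo_pfn_t demo_map_ring(int ring_ref)
{
    /* Casting &ring_ref to demo_pfn_t * would make the callee read
     * 8 bytes from a 4-byte object. Copying the value into a correctly
     * typed temporary gives it a real one-element array to read.
     * (A negative ring_ref would still sign-extend; the sketch does not
     * attempt to address that.) */
    demo_pfn_t mfn = ring_ref;
    return demo_sum_frames(&mfn, 1);
}
```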

In xenfb.c:common_bind we now explicitly launder the mfn into a
xen_pfn_t, so it has the correct type to be passed to
xc_map_foreign_pages and doesn't provoke warnings on 32-bit x86.
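
The laundering step can be sketched as a checked narrowing conversion. Assumptions: demo_pfn32_t is a hypothetical 32-bit frame-number type (as on a 32-bit build), and the sketch returns false where the patch itself uses an assert.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t demo_pfn32_t;   /* hypothetical 32-bit frame-number type */

/* Narrow a 64-bit value read from elsewhere into the frame-number type
 * and report whether it survived the round trip unchanged. */
static bool demo_narrow_pfn(uint64_t val, demo_pfn32_t *out)
{
    demo_pfn32_t mfn = (demo_pfn32_t)val;

    if ((uint64_t)mfn != val)
        return false;            /* the value would have been truncated */
    *out = mfn;
    return true;
}
```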

Signed-off-by: Ian Campbell 
Reviewed-by: Stefano Stabellini 
---
 hw/char/xen_console.c |    8
 hw/display/xenfb.c    |   15 ---
 xen-hvm.c             |   14 +++---
 3 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index ac1b324..3e8a57b 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -228,10 +228,10 @@ static int con_initialise(struct XenDevice *xendev)
con->buffer.max_capacity = limit;
 
 if (!xendev->dev) {
-con->sring = xc_map_foreign_range(xen_xc, con->xendev.dom,
-  XC_PAGE_SIZE,
-  PROT_READ|PROT_WRITE,
-  con->ring_ref);
+xen_pfn_t mfn = con->ring_ref;
+con->sring = xc_map_foreign_pages(xen_xc, con->xendev.dom,
+ PROT_READ|PROT_WRITE,
+ &mfn, 1);
 } else {
 con->sring = xengnttab_map_grant_ref(xendev->gnttabdev, con->xendev.dom,
  con->ring_ref,
diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
index 8eb3046..aa38803 100644
--- a/hw/display/xenfb.c
+++ b/hw/display/xenfb.c
@@ -95,23 +95,24 @@ struct XenFB {
 
 static int common_bind(struct common *c)
 {
-uint64_t mfn;
+uint64_t val;
+xen_pfn_t mfn;
 
-if (xenstore_read_fe_uint64(&c->xendev, "page-ref", &mfn) == -1)
+if (xenstore_read_fe_uint64(&c->xendev, "page-ref", &val) == -1)
return -1;
-assert(mfn == (xen_pfn_t)mfn);
+mfn = (xen_pfn_t)val;
+assert(val == mfn);
 
 if (xenstore_read_fe_int(&c->xendev, "event-channel", &c->xendev.remote_port) == -1)
return -1;
 
-c->page = xc_map_foreign_range(xen_xc, c->xendev.dom,
-  XC_PAGE_SIZE,
-  PROT_READ | PROT_WRITE, mfn);
+c->page = xc_map_foreign_pages(xen_xc, c->xendev.dom,
+   PROT_READ | PROT_WRITE, &mfn, 1);
 if (c->page == NULL)
return -1;
 
 xen_be_bind_evtchn(&c->xendev);
-xen_be_printf(&c->xendev, 1, "ring mfn %"PRIx64", remote-port %d, local-port %d\n",
+xen_be_printf(&c->xendev, 1, "ring mfn %"PRI_xen_pfn", remote-port %d, local-port %d\n",
  mfn, c->xendev.remote_port, c->xendev.local_port);
 
 return 0;
diff --git a/xen-hvm.c b/xen-hvm.c
index 1b6fa9e..878ae0a 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -1242,8 +1242,9 @@ void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory)
 DPRINTF("buffered io page at pfn %lx\n", bufioreq_pfn);
 DPRINTF("buffered io evtchn is %x\n", bufioreq_evtchn);
 
-state->shared_page = xc_map_foreign_range(xen_xc, xen_domid, XC_PAGE_SIZE,
-  PROT_READ|PROT_WRITE, ioreq_pfn);
+state->shared_page = xc_map_foreign_pages(xen_xc, xen_domid,
+  PROT_READ|PROT_WRITE,
+  &ioreq_pfn, 1);
 if (state->shared_page == NULL) {
 error_report("map shared IO page returned error %d handle=" XC_INTERFACE_FMT,
  errno, xen_xc);
@@ -1254,8 +1255,8 @@ void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory)
 if (!rc) {
 DPRINTF("shared vmport page at pfn %lx\n", ioreq_pfn);
 state->shared_vmport_page =
-

Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-26 Thread Jan Beulich

>>> On 26.01.16 at 16:30,  wrote:
> On 01/26/16 05:44, Jan Beulich wrote:
>> Interesting. This isn't the usage model I have been thinking about
>> so far. Having just gone back to the original 0/4 mail, I'm afraid
>> we're really left guessing, and you guessed differently than I did.
>> My understanding of the intentions of PMEM so far was that this
>> is a high-capacity, slower than DRAM but much faster than e.g.
>> swapping to disk alternative to normal RAM. I.e. the persistent
>> aspect of it wouldn't matter at all in this case (other than for PBLK,
>> obviously).
> 
> Of course, pmem could be used in the way you thought because of its
> 'ram' aspect. But I think the more meaningful usage is from its
> persistent aspect. For example, the implementation of some journal
> file systems could store logs in pmem rather than the normal ram, so
> that if a power failure happens before those in-memory logs are
> completely written to the disk, there would still be chance to restore
> them from pmem after next booting (rather than abandoning all of
> them).

Well, that leaves open how that file system would find its log
after reboot, or how that log is protected from clobbering by
another OS booted in between.

>> However, thinking through your usage model I have problems
>> seeing it work in a reasonable way even with virtualization left
>> aside: To my knowledge there's no established protocol on how
>> multiple parties (different versions of the same OS, or even
>> completely different OSes) would arbitrate using such memory
>> ranges. And even for a single OS it is, other than for disks (and
>> hence PBLK), not immediately clear how it would communicate
>> from one boot to another what information got stored where,
>> or how it would react to some or all of this storage having
>> disappeared (just like a disk which got removed, which - unless
>> it held the boot partition - would normally have pretty little
>> effect on the OS coming back up).
> 
> Label storage area is a persistent area on NVDIMM and can be used to
> store partitions information. It's not included in pmem (that part
> that is mapped into the system address space). Instead, it can be only
> accessed through NVDIMM _DSM method [1]. However, what contents are
> stored and how they are interpreted are left to software. One way is
> to follow NVDIMM Namespace Specification [2] to store an array of
> labels that describe the start address (from the base 0 of pmem) and
> the size of each partition, which is called as namespace. On Linux,
> each namespace is exposed as a /dev/pmemXX device.

According to what I've just read in one of the documents Konrad
pointed us to, there can be just one PMEM label per DIMM. Unless
I misread of course...

Jan


Re: [Xen-devel] [PATCH V13 3/5] libxl: add pvusb API

2016-01-26 Thread Olaf Hering

On Tue, Jan 19, Chunyan Liu wrote:

> +++ b/tools/libxl/libxl.c
> @@ -3204,7 +3204,7 @@ void libxl__device_disk_local_initiate_detach(libxl__egc *egc,
>  aodev->dev = device;
>  aodev->callback = local_device_detach_cb;
>  aodev->force = 0;
> -libxl__initiate_device_remove(egc, aodev);
> +libxl__initiate_device_generic_remove(egc, aodev);
>  return;
>  }
>  
> @@ -4172,8 +4172,10 @@ out:
>   * libxl_device_vkb_destroy
>   * libxl_device_vfb_remove
>   * libxl_device_vfb_destroy
> + * libxl_device_usbctrl_remove
> + * libxl_device_usbctrl_destroy

This should be moved down to DEFINE_DEVICE_REMOVE_CUSTOM.

>   */
> -#define DEFINE_DEVICE_REMOVE(type, removedestroy, f)\
> +#define DEFINE_DEVICE_REMOVE_EXT(type, remtype, removedestroy, f)\
>  int libxl_device_##type##_##removedestroy(libxl_ctx *ctx,   \
>  uint32_t domid, libxl_device_##type *type,  \
>  const libxl_asyncop_how *ao_how)\
> @@ -4193,13 +4195,19 @@ out:
>  aodev->dev = device;\
>  aodev->callback = device_addrm_aocomplete;  \
>  aodev->force = f;   \
> -libxl__initiate_device_remove(egc, aodev);  \
> +libxl__initiate_device_##remtype##_remove(egc, aodev);  \
>  \
>  out:\
> -if (rc) return AO_CREATE_FAIL(rc);   
>  \
> +if (rc) return AO_CREATE_FAIL(rc);  \
>  return AO_INPROGRESS;   \
>  }
>  
> +#define DEFINE_DEVICE_REMOVE(type, removedestroy, f) \
> +DEFINE_DEVICE_REMOVE_EXT(type, generic, removedestroy, f)
> +
> +#define DEFINE_DEVICE_REMOVE_CUSTOM(type, removedestroy, f)  \
> +DEFINE_DEVICE_REMOVE_EXT(type, type, removedestroy, f)
> +
>  /* Define all remove/destroy functions and undef the macro */
>  
>  /* disk */


If this is the way to move forward, please split this out into a
separate change which can be applied to staging independent of any pvusb
changes.

I think the patch also needs the #undef.
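
To illustrate what I mean, here is a cut-down sketch of the macro layering with the matching #undef lines. All DEMO_/demo_ names are invented for this mail and the generated functions are trivial; it only shows the token pasting and the clean-up, not the real libxl code.

```c
/* Hypothetical removal back ends: one generic, one type specific. */
static int demo_initiate_generic_remove(int force) { return force ? 11 : 10; }
static int demo_initiate_usbctrl_remove(int force) { return force ? 21 : 20; }

/* remtype selects which back end the generated function calls. */
#define DEMO_DEFINE_REMOVE_EXT(type, remtype, removedestroy, f)      \
    static int demo_device_##type##_##removedestroy(void)            \
    {                                                                \
        return demo_initiate_##remtype##_remove(f);                  \
    }

#define DEMO_DEFINE_REMOVE(type, removedestroy, f) \
    DEMO_DEFINE_REMOVE_EXT(type, generic, removedestroy, f)

#define DEMO_DEFINE_REMOVE_CUSTOM(type, removedestroy, f) \
    DEMO_DEFINE_REMOVE_EXT(type, type, removedestroy, f)

/* Generic devices go through the generic back end ... */
DEMO_DEFINE_REMOVE(vtpm, remove, 0)
DEMO_DEFINE_REMOVE(vtpm, destroy, 1)

/* ... while usbctrl uses its own back end. */
DEMO_DEFINE_REMOVE_CUSTOM(usbctrl, remove, 0)
DEMO_DEFINE_REMOVE_CUSTOM(usbctrl, destroy, 1)

/* Once every function is generated, drop all three helper macros. */
#undef DEMO_DEFINE_REMOVE_CUSTOM
#undef DEMO_DEFINE_REMOVE
#undef DEMO_DEFINE_REMOVE_EXT
```

With three macros defined instead of one, each of them wants its own #undef after the last use.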


> @@ -4223,6 +4231,10 @@ DEFINE_DEVICE_REMOVE(vfb, destroy, 1)
>  DEFINE_DEVICE_REMOVE(vtpm, remove, 0)
>  DEFINE_DEVICE_REMOVE(vtpm, destroy, 1)
>  
> +/* usbctrl */
> +DEFINE_DEVICE_REMOVE_CUSTOM(usbctrl, remove, 0)
> +DEFINE_DEVICE_REMOVE_CUSTOM(usbctrl, destroy, 1)
> +
>  /* channel/console hotunplug is not implemented. There are 2 possibilities:
>   * 1. add support for secondary consoles to xenconsoled
>   * 2. dynamically add/remove qemu chardevs via qmp messages. */

A comment should mention both libxl_device_usbctrl_remove/destroy and
libxl__initiate_device_usbctrl_remove/destroy.

Olaf

Re: [Xen-devel] [PATCH] arm: p2m.c bug-fix: hypervisor hang on __p2m_get_mem_access

2016-01-26 Thread Ian Campbell

On Tue, 2016-01-26 at 13:46 +0200, Corneliu ZUZU wrote:
> When __p2m_get_mem_access gets called, the p2m lock is already taken
> by either get_page_from_gva or p2m_get_mem_access.
> 
> Possible code paths:
> 1)-> get_page_from_gva
>   -> p2m_mem_access_check_and_get_page
>   -> __p2m_get_mem_access
> 2)-> p2m_get_mem_access
>   -> __p2m_get_mem_access

What about:
-> p2m_mem_access_check
-> p2m_get_mem_access
I can't see the lock being taken in that path.

As well as fixing that I think it would be wise to add an assert that the
lock is held before the call to p2m_lookup which you are changing into the
unlocked variant.
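
Roughly what I have in mind, as a sketch with a mock lock. The struct and function names below are made up, and a plain flag stands in for the real spinlock, so this only demonstrates the locked/unlocked split and the assert, not the actual p2m code.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock non-recursive lock: a flag is enough for a single-threaded sketch. */
struct demo_p2m {
    bool locked;
    int entry;
};

static void demo_lock(struct demo_p2m *p)
{
    assert(!p->locked);      /* a real spinlock would spin forever here */
    p->locked = true;
}

static void demo_unlock(struct demo_p2m *p)
{
    assert(p->locked);
    p->locked = false;
}

/* Unlocked variant: the caller must already hold the lock. */
static int demo_lookup_unlocked(struct demo_p2m *p)
{
    assert(p->locked);
    return p->entry;
}

/* Locked wrapper for callers that do not hold the lock. */
static int demo_lookup(struct demo_p2m *p)
{
    int ret;

    demo_lock(p);
    ret = demo_lookup_unlocked(p);
    demo_unlock(p);
    return ret;
}

/* Helper that runs with the lock held: it documents that with an assert
 * and uses the unlocked lookup, since demo_lookup() would self-deadlock. */
static int demo_get_access_locked(struct demo_p2m *p)
{
    assert(p->locked);
    return demo_lookup_unlocked(p);
}
```

With the assert in place, a caller that reaches the helper without the lock fails loudly instead of silently racing.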

> 
> In both cases if __p2m_get_mem_access subsequently gets to
> call p2m_lookup (happens if !radix_tree_lookup(...)), a hypervisor
> hang will occur, since p2m_lookup also spin-locks on the p2m lock.
> 
> This bug-fix simply replaces the p2m_lookup call from
> __p2m_get_mem_access
> with a call to __p2m_lookup.
> 
> Signed-off-by: Corneliu ZUZU 
> ---
>  xen/arch/arm/p2m.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 2190908..a9157e5 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -490,7 +490,7 @@ static int __p2m_get_mem_access(struct domain *d,
> gfn_t gfn,
>   * No setting was found in the Radix tree. Check if the
>   * entry exists in the page-tables.
>   */
> -paddr_t maddr = p2m_lookup(d, gfn_x(gfn) << PAGE_SHIFT, NULL);
> +paddr_t maddr = __p2m_lookup(d, gfn_x(gfn) << PAGE_SHIFT, NULL);
>  if ( INVALID_PADDR == maddr )
>  return -ESRCH;
>  

Re: [Xen-devel] [PATCH v3 4/5] tools/libxc: error handling for the postcopy() callback

2016-01-26 Thread Ian Campbell

On Tue, 2016-01-26 at 15:02 +0800, Yang Hongyang wrote:
> 
> On 01/26/2016 02:48 PM, Wen Congyang wrote:
> > On 01/26/2016 02:45 PM, Yang Hongyang wrote:
> > > ditto
> > > 
> > > Reviewed-by: Yang Hongyang 
> > 
> > The newest version is v5, and this series is in the staging now.
> 
> Sorry for the noise...I saw the series too late, please ignore my
> comments...

Is an update to the MAINTAINERS file required?

Ian.

Re: [Xen-devel] [PATCH 1/3] libxc/xc_domain_resume: Update comment.

2016-01-26 Thread Ian Campbell

On Mon, 2016-01-25 at 16:06 -0500, Konrad Rzeszutek Wilk wrote:
> To hopefully clarify what it meant.
> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  tools/libxc/xc_resume.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
> index 87d4324..19ba2a3 100644
> --- a/tools/libxc/xc_resume.c
> +++ b/tools/libxc/xc_resume.c
> @@ -248,9 +248,12 @@ out:
>  /*
>   * Resume execution of a domain after suspend shutdown.
>   * This can happen in one of two ways:
> - *  1. Resume with special return code.
> - *  2. Reset guest environment so it believes it is resumed in a new
> + *  1. (fast=1) Resume with special return code (1) that the guest
> + * gets from SCHEDOP_shutdown:SHUTDOWN_suspend.

"SCHEDOP_shutdown(SHUTDOWN_suspend)" looks more like the function call
which this in effect is.

I think I'd say "Resume the guest without resetting the domain environment.
The guest's call to SCHEDOP_shutdown(SHUTDOWN_suspend) will return 1".

(assuming that is true re resetting)

> + *
> + *  2. (fast=0) Reset guest environment so it believes it is resumed in a new
>   * domain context.

With the above I would suggest adding "The guest's call to
SCHEDOP_shutdown(SHUTDOWN_suspend) will return 0".

> + *
>   * (2) should be used only for guests which cannot handle the special
>   * new return code. (1) is always safe (but slower).

Is this correct? I'd have said (2) was always safe but slow?

And I would invert the first, that is to say that (1) should be used in
preference with guests which support it.

Ian.

Re: [Xen-devel] [PATCH V13 3/5] libxl: add pvusb API

2016-01-26 Thread George Dunlap

On Tue, Jan 26, 2016 at 7:43 AM, Chun Yan Liu  wrote:
>
>
 On 1/20/2016 at 12:56 PM, in message
> <569f83f502660009e...@relay2.provo.novell.com>, "Chun Yan Liu"
>  wrote:
>
>>
> On 1/19/2016 at 11:48 PM, in message
>> <22174.23240.402164.635...@mariner.uk.xensource.com>, Ian Jackson
>>  wrote:
>> > Chunyan Liu writes ("[PATCH V13 3/5] libxl: add pvusb API"):
>> > > Add pvusb APIs, including:
>> > >  - attach/detach (create/destroy) virtual usb controller.
>> > >  - attach/detach usb device
>> > >  - list usb controller and usb devices
>> > >  - some other helper functions
>> >
>> >
>> > Thanks.  This is making progress but I'm afraid we're not quite there
>> > yet.
>> >
>> >
>> > > +static int usbback_dev_unassign(libxl__gc *gc, const char *busid)
>> > > +{
>> > ...
>> > > +/* Till here, USB device has been unbound from USBBACK and
>> > > + * removed from xenstore, usb list couldn't show it anymore,
>> > > + * so no matter removimg driver path successfully or not,
>> > > + * we will report operation success.
>> > > + */
>> >
>> > I'm still unconvinced by this and this may mean that the code in this
>> > function is in the wrong order.  Earlier we had this exchange:
>> >
>> > > > Ought this function to really report success if these calls fail ?
>> > >
>> > > I think so. Till here, the USB device has already been unbound from
>> > > usbback and removed from xenstore. usb-list cannot list it any more.
>> >
>> > The problem is that I think that if this function fails, it can leave
>> >  - debris in xenstore (the usbback path)
>> Yes, it's true.
>>
>> >  - the interface bound to the wrong driver
>> No, it won't be bound to 'wrong' driver, only maybe not bound to any driver
>> (Already unbound from usbback, but failed to rebound to its original
>> driver).
>> In this case, we would report warning: failed to rebind to driver xxx.
>>
>> > And then there is no way for the user to get libxl to re-attempt the
>> > operation, or clean up.  Am I right ?
>>
>> Yes. No way to re-attempt usbdev-detach or cleanup driver path in
>> xenstore. But won't affect next time usbdev-attach the same device.
>>
>> >
>> > One way to avoid this kind of problem is to deal with the xenstore
>> > path last.  That way the device will still appear as attached to the
>> > domain.
>>
>> I'm afraid if the side effect is acceptable? In my testing, some USB
>> bluetooth
>> device always fails to rebind to 'btusb' driver after it's unbound
>> from 'usbback'. In this case, we can't detach it from the domain then.
>
> Ian J., any opinion on this? If it's still thought to be better, I'll update patch.

I think Ian may be waiting for me to reply and express an opinion; but
unfortunately that will have to wait until next week. :-(

 -George

Re: [Xen-devel] [PATCH 2/3] libxl/remus: Move the assert before the info is used.

2016-01-26 Thread Ian Campbell

On Mon, 2016-01-25 at 16:06 -0500, Konrad Rzeszutek Wilk wrote:
> The assert(info) is after quite a lot of manipulations
> on 'info' - which makes the assert pointless because if
> info was NULL it would have crashed earlier.
> 
> Move it earlier so that it guards before we try using
> the 'info' structure.

That assert (wherever it is placed) is rather aggressive for an application
provided argument. ERROR_INVALID would be more normal I think.
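
For illustration only, returning an error for a NULL argument could look like the sketch below. The error codes, struct and function here are hypothetical and deliberately not the real libxl ones; the default value is arbitrary.

```c
#include <stddef.h>

/* Hypothetical error codes and info structure, not the real libxl ones. */
enum { DEMO_OK = 0, DEMO_ERROR_INVALID = -1 };

struct demo_remus_info {
    int interval_ms;
};

/* Validate the caller-supplied pointer before any field is touched,
 * and report a bad argument through the return value rather than
 * aborting the calling application. */
static int demo_remus_start(struct demo_remus_info *info)
{
    if (info == NULL)
        return DEMO_ERROR_INVALID;
    if (info->interval_ms <= 0)
        info->interval_ms = 200;    /* arbitrary default for the sketch */
    return DEMO_OK;
}
```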

> 
> CC: Wen Congyang 
> CC: Yang Hongyang 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  tools/libxl/libxl.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 2bde0f5..60974cc 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -855,6 +855,8 @@ int libxl_domain_remus_start(libxl_ctx *ctx,
> libxl_domain_remus_info *info,
>  goto out;
>  }
>  
> +assert(info);
> +
>  libxl_defbool_setdefault(&info->allow_unsafe, false);
>  libxl_defbool_setdefault(&info->blackhole, false);
>  libxl_defbool_setdefault(&info->compression, true);
> @@ -883,8 +885,6 @@ int libxl_domain_remus_start(libxl_ctx *ctx,
> libxl_domain_remus_info *info,
>  dss->debug = 0;
>  dss->remus = info;
>  
> -assert(info);
> -
>  /* Point of no return */
>  libxl__remus_setup(egc, dss);
>  return AO_INPROGRESS;

Re: [Xen-devel] [PATCH 1/3] libxc/xc_domain_resume: Update comment.

2016-01-26 Thread Ian Jackson

Ian Campbell writes ("Re: [PATCH 1/3] libxc/xc_domain_resume: Update comment."):
> On Mon, 2016-01-25 at 16:06 -0500, Konrad Rzeszutek Wilk wrote:
> > To hopefully clarify what it meant.
...
> > + *  1. (fast=1) Resume with special return code (1) that the guest
> > + * gets from SCHEDOP_shutdown:SHUTDOWN_suspend.
> 
> "SCHEDOP_shutdown(SHUTDOWN_suspend)" looks more like the function call
> which this in effect is.
> 
> I think I'd say "Resume the guest without resetting the domain environment.
> The guests's call to SCHEDOP_shutdown(SHUTDOWN_suspend) will return 1".
> 
> (assuming that is true re resetting)

I'm not sure that `will return 1' is correct.  IIRC there is some
... unpleasantness here, with something effectively corrupting the
guest state in a way that the guest is supposed to expect and
cooperate with.

I haven't investigated the details recently.  I do remember it being
fiddly.

Ian.

Re: [Xen-devel] [PATCH 2/3] libxl/remus: Move the assert before the info is used.

2016-01-26 Thread Ian Jackson

Ian Campbell writes ("Re: [PATCH 2/3] libxl/remus: Move the assert before the info is used."):
> On Mon, 2016-01-25 at 16:06 -0500, Konrad Rzeszutek Wilk wrote:
> > The assert(info) is after quite a lot of manipulations
> > on 'info' - which makes the assert pointless because if
> > info was NULL it would have crashed earlier.
> > 
> > Move it earlier so that it guards before we try using
> > the 'info' structure.
> 
> That assert (wherever it is placed) is rather aggressive for an application
> provided argument. ERROR_INVALID would be more normal I think.

I think the assert should simply be removed.  We don't assert() other
pointer parameters for non-NULL-ness.

Certainly turning null pointer bugs into ERROR_INVALID is very
unfriendly.

Ian.

Re: [Xen-devel] [ovmf bisection] complete test-amd64-i386-xl-qemuu-ovmf-amd64

2016-01-26 Thread Wei Liu

On Tue, Jan 26, 2016 at 11:05:50AM +0000, Ian Campbell wrote:
> On Sat, 2016-01-09 at 00:33 +0000, osstest service owner wrote:
> 
> According to http://logs.test-lab.xenproject.org/osstest/results/all-branch-statuses.txt
> the ovmf push gate has been broken for a while (48 days).
> 
> The bisector seems to have fingered the commit below (there are some other
> intermittent issues, but this one seems to be rather persistent).
> 
> An example of the failure can be seen in:
> http://logs.test-lab.xenproject.org/osstest/logs/78929/test-amd64-i386-xl-qemuu-ovmf-amd64/info.html
> 
> I wouldn't mind betting that this is something similar:
> http://logs.test-lab.xenproject.org/osstest/logs/78929/test-amd64-amd64-xl-qemuu-ovmf-amd64/info.html
> 
> Ian.
> 

The screenshot and serial log of the guest look OK.

I also installed a guest with the same version of Debian, with latest
OVMF binary that contains the changeset below, the guest worked fine.

Wei.

> > branch xen-unstable
> > xenbranch xen-unstable
> > job test-amd64-i386-xl-qemuu-ovmf-amd64
> > testid guest-start/debianhvm.repeat
> > 
> > Tree: linux git://xenbits.xen.org/linux-pvops.git
> > Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
> > Tree: ovmf https://github.com/tianocore/edk2.git
> > Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
> > Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> > Tree: xen git://xenbits.xen.org/xen.git
> > 
> > *** Found and reproduced problem changeset ***
> > 
> >   Bug is in tree:  ovmf https://github.com/tianocore/edk2.git
> >   Bug introduced:  b0fa5d29d08e61fd7f2178aa3b455e41374b36c4
> >   Bug not present: fa25cf38d988778ef3237e17fc93c1fa0c9e9f8a
> >   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/77418/
> > 
> > 
> >   commit b0fa5d29d08e61fd7f2178aa3b455e41374b36c4
> >   Author: Michael Kinney 
> >   Date:   Tue Dec 8 05:24:18 2015 +0000
> >   
> >   UefiCpuPkg/MtrrLib: Reduce hardware init when program variable MTRRs
> >   
> >   When MtrrSetMemoryAttribute() programs variable MTRRs, it may disable/enable
> >   cache and disable/enable MTRRs several times. This updating tries to do
> >   operation in local variable and does the hardware initialization one time only.
> >   
> >   Cc: Feng Tian 
> >   Cc: Michael Kinney 
> >   Contributed-under: TianoCore Contribution Agreement 1.0
> >   Signed-off-by: Michael Kinney 
> >   Signed-off-by: Jeff Fan 
> >   Reviewed-by: Feng Tian 
> >   
> >   git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@19158 6f19259b-4bc3-4df7-8a09-765794883524
> > 
> > 
> > For bisection revision-tuple graph see:
> >    http://logs.test-lab.xenproject.org/osstest/results/bisect/ovmf/test-amd64-i386-xl-qemuu-ovmf-amd64.guest-start--debianhvm.repeat.html
> > Revision IDs in each graph node refer, respectively, to the Trees above.
> > 
> > 
> > Running cs-bisection-step --graph-out=/home/logs/results/bisect/ovmf/test-amd64-i386-xl-qemuu-ovmf-amd64.guest-start--debianhvm.repeat --summary-out=tmp/77418.bisection-summary --basis-template=65543 --blessings=real,real-bisect ovmf test-amd64-i386-xl-qemuu-ovmf-amd64 guest-start/debianhvm.repeat
> > Searching for failure / basis pass:
> >  77229 fail [host=rimava1] / 66401 [host=huxelrebe0] 65677
> > [host=huxelrebe1] 65624 [host=baroque0] 65593 [host=italia1] 65543
> > [host=pinot1] 65468 [host=fiano0] 65386 [host=fiano1] 65359
> > [host=nocera1] 65336 [host=italia0] 65319 ok.
> > Failure / basis pass flights: 77229 / 65319
> > (tree with no url: seabios)
> > Tree: linux git://xenbits.xen.org/linux-pvops.git
> > Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
> > Tree: ovmf https://github.com/tianocore/edk2.git
> > Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
> > Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> > Tree: xen git://xenbits.xen.org/xen.git
> > Latest 5d7b0fcc26d66db767a477574effc764022c19ac
> > c530a75c1e6a472b0eb9558310b518f0dfcd8860
> > c2a892d7c8a78143006bb7fdc95fb18f7e2fc685
> > a82794b1d5a6da06062a333b1db404e2448345dd
> > f165e581d9a6f7cf81aa7496d3eee1e31212c8ad
> > bf925a9f1254391749f569c1b8fc606036340488
> > Basis pass 769b79eb206ad5b0249a08665fefb913c3d1998e
> > c530a75c1e6a472b0eb9558310b518f0dfcd8860
> > dcb2e4bb61931e2dee1739bb76aba315002f0a82
> > bc00cad75d8bcc3ba696992bec219c21db8406aa
> > 3fb401edbd8e9741c611bfddf6a2032ca91f55ed
> > 713b7e4ef2aa4ec3ae697cde9c81d5a57548f9b1
> > Generating revisions with ./adhoc-revtuple-generator  git://xenbits.xen.org/linux-pvops.git#769b79eb206ad5b0249a08665fefb913c3d1998e-5d7b0fcc26d66db767a477574effc764022c19ac git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-
> >

Re: [Xen-devel] [PATCH 3/3] tools/libxl: run_helper - add #define for arguments.

2016-01-26 Thread Ian Campbell

On Mon, 2016-01-25 at 16:06 -0500, Konrad Rzeszutek Wilk wrote:
> Describe what the four (or more in the future) arguments
> are for.

I'd say that a code comment on the definition would be sufficient here, but
I'll defer to Ian J as author of this code.
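
For reference, the argument layout the patch comment describes can be sketched as follows. This is an illustration under assumptions, not the libxl implementation: `SKETCH_FIXED_ARGS`, `ARG_BUF_LEN` and `build_helper_args()` are hypothetical names, and the caller-supplied buffers stand in for whatever allocation the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Three leading fixed arguments (helper path, mode, stream fd) plus the
 * terminating NULL pointer: four slots that do not depend on num_argnums. */
#define SKETCH_FIXED_ARGS 4
#define ARG_BUF_LEN 24  /* hypothetical; enough for a decimal unsigned long */

/*
 * Fill args[] (capacity SKETCH_FIXED_ARGS + num_argnums) and return the
 * number of slots used, including the NULL terminator. bufs must provide
 * 1 + num_argnums string buffers for the formatted numbers.
 */
size_t build_helper_args(const char **args, char (*bufs)[ARG_BUF_LEN],
                         const char *path, const char *mode_arg,
                         int stream_fd,
                         const unsigned long *argnums, size_t num_argnums)
{
    const char **arg = args;
    size_t i;

    *arg++ = path;                                   /* 1) helper path */
    *arg++ = mode_arg;                               /* 2) mode switch */
    snprintf(bufs[0], ARG_BUF_LEN, "%d", stream_fd);
    *arg++ = bufs[0];                                /* 3) stream fd */

    for (i = 0; i < num_argnums; i++) {              /* n) specific args */
        snprintf(bufs[1 + i], ARG_BUF_LEN, "%lu", argnums[i]);
        *arg++ = bufs[1 + i];
    }

    *arg++ = NULL;                                   /* 4) terminator */

    /* The slot count must match the size used to declare the array. */
    assert((size_t)(arg - args) == SKETCH_FIXED_ARGS + num_argnums);
    return (size_t)(arg - args);
}
```

The closing assert ties the macro to the code that fills the array, which is the relationship a bare `4` leaves implicit.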

> 
> Signed-off-by: Konrad Rzeszutek Wilk 
> ---
>  tools/libxl/libxl_save_callout.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libxl/libxl_save_callout.c
> b/tools/libxl/libxl_save_callout.c
> index 3af99af..45b9727 100644
> --- a/tools/libxl/libxl_save_callout.c
> +++ b/tools/libxl/libxl_save_callout.c
> @@ -119,13 +119,22 @@ void
> libxl__save_helper_init(libxl__save_helper_state *shs)
>  
>  /*- helper execution -*/
>  
> +/*
> + * Both save and restore share four parameters:
> + * 1) Path to libxl-save-helper.
> + * 2) --[restore|save]-domain.
> + * 3) stream file descriptor.
> + * n) save/restore specific parameters.
> + * 4) A \0 at the end.
> + */
> +#define HELPER_NR_ARGS 4
>  static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> const char *mode_arg, int stream_fd,
> const int *preserve_fds, int num_preserve_fds,
> const unsigned long *argnums, int num_argnums)
>  {
>  STATE_AO_GC(shs->ao);
> -const char *args[4 + num_argnums];
> +const char *args[HELPER_NR_ARGS + num_argnums];
>  const char **arg = args;
>  int i, rc;
>  


[Xen-devel] [PATCHv2 4/4] spinlock: fair read-write locks

2016-01-26 Thread David Vrabel

From: Jennifer Herbert 

The current rwlocks are write-biased and unfair.  This allows writers
to starve readers in situations where there are many writers (e.g.,
p2m type changes from log dirty updates during domain save).

Replace the current implementation with queued read-write locks which use
a fair spinlock (a ticket lock in this case) to ensure fairness between
readers and writers when they are contended.

This implementation is from the Linux commit 70af2f8a4f48 by Waiman
Long and Peter Zijlstra.

locking/rwlocks: Introduce 'qrwlocks' - fair, queued rwlocks

This rwlock uses the arch_spin_lock_t as a waitqueue, and assuming
the arch_spin_lock_t is a fair lock (ticket,mcs etc..) the
resulting rwlock is a fair lock.

It fits in the same 8 bytes as the regular rwlock_t by folding the
reader and writer count into a single integer, using the remaining
4 bytes for the arch_spinlock_t.

Architectures that can single-copy address bytes can optimize
queue_write_unlock() with a 0 write to the LSB (the write count).

We do not yet make use of the architecture-specific optimization noted
above.
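
As a rough illustration of the folded reader/writer count, here is a minimal sketch in C11 atomics. It is not the Xen code: the constants are assumed to follow the Linux qrwlock layout (writer state in the low byte, reader count above it), all names are hypothetical, and only the uncontended try-lock paths are shown; the queueing slow paths are omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed layout: low byte holds the writer state, upper bits count readers. */
#define QW_WAITING 0x01u            /* a writer is queued */
#define QW_LOCKED  0xffu            /* a writer holds the lock */
#define QW_WMASK   0xffu            /* mask covering the writer byte */
#define QR_SHIFT   8
#define QR_BIAS    (1u << QR_SHIFT) /* one reader */

static_assert((QW_LOCKED & QW_WMASK) == QW_LOCKED,
              "writer states must fit in the writer byte");

struct qrw_counts {
    atomic_uint cnts;
};

/* Reader fast path: succeed only if no writer holds or awaits the lock. */
bool sketch_read_trylock(struct qrw_counts *l)
{
    unsigned int cnts = atomic_load(&l->cnts);

    if (cnts & QW_WMASK)
        return false;
    cnts = atomic_fetch_add(&l->cnts, QR_BIAS);
    if (!(cnts & QW_WMASK))
        return true;
    /* A writer slipped in between the load and the add: back out. */
    atomic_fetch_sub(&l->cnts, QR_BIAS);
    return false;
}

void sketch_read_unlock(struct qrw_counts *l)
{
    atomic_fetch_sub(&l->cnts, QR_BIAS);
}

/* Writer fast path: succeed only if there are no readers and no writer. */
bool sketch_write_trylock(struct qrw_counts *l)
{
    unsigned int expected = 0;

    return atomic_compare_exchange_strong(&l->cnts, &expected, QW_LOCKED);
}

void sketch_write_unlock(struct qrw_counts *l)
{
    atomic_fetch_sub(&l->cnts, QW_LOCKED);
}

unsigned int sketch_reader_count(struct qrw_counts *l)
{
    return atomic_load(&l->cnts) >> QR_SHIFT;
}
```

Because both counts live in one word, a single compare-and-exchange against 0 is enough for a writer to check "no readers and no writer" atomically.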

Signed-off-by: Jennifer Herbert 
Signed-off-by: David Vrabel 
---
v2:
- Remove in_irq() special case from read slow path.
- Comment on why 8-bit write mask.
- Style.
- Use 0 in various cmpxchg() calls where appropriate.
---
 xen/common/rwlock.c| 102 +++
 xen/common/spinlock.c  | 204 -
 xen/include/xen/rwlock.h   | 182 
 xen/include/xen/spinlock.h |  32 ---
 4 files changed, 284 insertions(+), 236 deletions(-)

diff --git a/xen/common/rwlock.c b/xen/common/rwlock.c
index 410d4dc..1fd8ea0 100644
--- a/xen/common/rwlock.c
+++ b/xen/common/rwlock.c
@@ -1,6 +1,108 @@
 #include 
 #include 
 
+/*
+ * rspin_until_writer_unlock - spin until writer is gone.
+ * @lock  : Pointer to queue rwlock structure.
+ * @cnts: Current queue rwlock writer status byte.
+ *
+ * In interrupt context or at the head of the queue, the reader will just
+ * increment the reader count & wait until the writer releases the lock.
+ */
+static inline void rspin_until_writer_unlock(rwlock_t *lock, u32 cnts)
+{
+while ( (cnts & _QW_WMASK) == _QW_LOCKED )
+{
+cpu_relax();
+smp_rmb();
+cnts = atomic_read(&lock->cnts);
+}
+}
+
+/*
+ * queue_read_lock_slowpath - acquire read lock of a queue rwlock.
+ * @lock: Pointer to queue rwlock structure.
+ */
+void queue_read_lock_slowpath(rwlock_t *lock)
+{
+u32 cnts;
+
+/*
+ * Readers come here when they cannot get the lock without waiting.
+ */
+atomic_sub(_QR_BIAS, &lock->cnts);
+
+/*
+ * Put the reader into the wait queue.
+ */
+spin_lock(&lock->lock);
+
+/*
+ * At the head of the wait queue now, wait until the writer state
+ * goes to 0 and then try to increment the reader count and get
+ * the lock. It is possible that an incoming writer may steal the
+ * lock in the interim, so it is necessary to check the writer byte
+ * to make sure that the write lock isn't taken.
+ */
+while ( atomic_read(&lock->cnts) & _QW_WMASK )
+cpu_relax();
+
+cnts = atomic_add_return(_QR_BIAS, &lock->cnts) - _QR_BIAS;
+rspin_until_writer_unlock(lock, cnts);
+
+/*
+ * Signal the next one in queue to become queue head.
+ */
+spin_unlock(&lock->lock);
+}
+
+/*
+ * queue_write_lock_slowpath - acquire write lock of a queue rwlock
+ * @lock : Pointer to queue rwlock structure.
+ */
+void queue_write_lock_slowpath(rwlock_t *lock)
+{
+u32 cnts;
+
+/* Put the writer into the wait queue. */
+spin_lock(&lock->lock);
+
+/* Try to acquire the lock directly if no reader is present. */
+if ( !atomic_read(&lock->cnts) &&
+ (atomic_cmpxchg(&lock->cnts, 0, _QW_LOCKED) == 0) )
+goto unlock;
+
+/*
+ * Set the waiting flag to notify readers that a writer is pending,
+ * or wait for a previous writer to go away.
+ */
+for (;;)
+{
+cnts = atomic_read(&lock->cnts);
+if ( !(cnts & _QW_WMASK) &&
+ (atomic_cmpxchg(&lock->cnts, cnts,
+ cnts | _QW_WAITING) == cnts) )
+break;
+
+cpu_relax();
+}
+
+/* When no more readers, set the locked flag. */
+for (;;)
+{
+cnts = atomic_read(&lock->cnts);
+if ( (cnts == _QW_WAITING) &&
+ (atomic_cmpxchg(&lock->cnts, _QW_WAITING,
+ _QW_LOCKED) == _QW_WAITING) )
+break;
+
+cpu_relax();
+}
+ unlock:
+spin_unlock(&lock->lock);
+}
+
+
 static DEFINE_PER_CPU(cpumask_t, percpu_rwlock_readers);
 
 void _percpu_write_lock(percpu_rwlock_t **per_cpudata,
diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 7b0cf6c..a43fa84 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -288,210 +288,6 @@ void

[Xen-devel] [PATCHv2 1/4] atomic: replace atomic_compareandswap() with atomic_cmpxchg()

2016-01-26 Thread David Vrabel

atomic_compareandswap() used atomic_t as the new, old and returned
values, which is less convenient than using just int.
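
To show why plain integer arguments make the call sites simpler, here is a minimal sketch of the same reference-count pattern written with C11 atomics. It is an illustration under assumptions, not Xen code: `sketch_cmpxchg()`, `sketch_get_ref()`, `sketch_destroy()` and the `SKETCH_DESTROYED` flag bit are hypothetical stand-ins, and unsigned values are used here where the patch uses int.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

static_assert(sizeof(unsigned int) * CHAR_BIT >= 32,
              "the flag bit below assumes at least 32-bit unsigned int");

/* Assumed flag bit, standing in for DOMAIN_DESTROYED. */
#define SKETCH_DESTROYED (1u << 31)

/*
 * Compare-and-exchange that returns the value observed in *v, in the
 * style of atomic_cmpxchg(): the swap happened iff the result == old.
 */
unsigned int sketch_cmpxchg(atomic_uint *v, unsigned int old,
                            unsigned int new_val)
{
    /* On failure, C11 writes the observed value back into 'old'. */
    atomic_compare_exchange_strong(v, &old, new_val);
    return old;
}

/* Take a reference unless the object is already marked destroyed. */
int sketch_get_ref(atomic_uint *refcnt)
{
    unsigned int old, seen = atomic_load(refcnt);

    do {
        old = seen;
        if (old & SKETCH_DESTROYED)
            return 0;
        seen = sketch_cmpxchg(refcnt, old, old + 1);
    } while (seen != old);

    return 1;
}

/* Mark destroyed only if no references remain; returns 1 on success. */
int sketch_destroy(atomic_uint *refcnt)
{
    return sketch_cmpxchg(refcnt, 0, SKETCH_DESTROYED) == 0;
}
```

With integer operands the expected and new values can be written inline, so the temporaries that the struct-based interface required disappear.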

Signed-off-by: David Vrabel 
---
v2:
- arm/arm64 already provided atomic_cmpxchg()
---
 xen/common/domain.c  |  5 +
 xen/include/asm-arm/atomic.h |  8 
 xen/include/asm-x86/atomic.h | 24 
 xen/include/xen/sched.h  |  9 -
 4 files changed, 21 insertions(+), 25 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 2979c1b..93e77f5 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -857,14 +857,11 @@ static void complete_domain_destroy(struct rcu_head *head)
 void domain_destroy(struct domain *d)
 {
 struct domain **pd;
-atomic_t old = ATOMIC_INIT(0);
-atomic_t new = ATOMIC_INIT(DOMAIN_DESTROYED);
 
 BUG_ON(!d->is_dying);
 
 /* May be already destroyed, or get_domain() can race us. */
-old = atomic_compareandswap(old, new, &d->refcnt);
-if ( _atomic_read(old) != 0 )
+if ( atomic_cmpxchg(&d->refcnt, 0, DOMAIN_DESTROYED) != 0 )
 return;
 
 cpupool_rm_domain(d);
diff --git a/xen/include/asm-arm/atomic.h b/xen/include/asm-arm/atomic.h
index 5a38c67..29ab265 100644
--- a/xen/include/asm-arm/atomic.h
+++ b/xen/include/asm-arm/atomic.h
@@ -138,14 +138,6 @@ static inline void _atomic_set(atomic_t *v, int i)
 # error "unknown ARM variant"
 #endif
 
-static inline atomic_t atomic_compareandswap(
-atomic_t old, atomic_t new, atomic_t *v)
-{
-atomic_t rc;
-rc.counter = __cmpxchg(&v->counter, old.counter, new.counter, sizeof(int));
-return rc;
-}
-
 #endif /* __ARCH_ARM_ATOMIC__ */
 /*
  * Local variables:
diff --git a/xen/include/asm-x86/atomic.h b/xen/include/asm-x86/atomic.h
index 2b8c877..d246b70 100644
--- a/xen/include/asm-x86/atomic.h
+++ b/xen/include/asm-x86/atomic.h
@@ -135,6 +135,10 @@ static inline void _atomic_set(atomic_t *v, int i)
 v->counter = i;
 }
 
+static inline int atomic_cmpxchg(atomic_t *v, int old, int new)
+{
+return cmpxchg(&v->counter, old, new);
+}
 
 /**
  * atomic_add - add integer to atomic variable
@@ -152,6 +156,18 @@ static inline void atomic_add(int i, atomic_t *v)
 }
 
 /**
+ * atomic_add_return - add integer and return
+ * @i: integer value to add
+ * @v: pointer of type atomic_t
+ *
+ * Atomically adds @i to @v and returns @i + @v
+ */
+static inline int atomic_add_return(int i, atomic_t *v)
+{
+return i + arch_fetch_and_add(&v->counter, i);
+}
+
+/**
  * atomic_sub - subtract the atomic variable
  * @i: integer value to subtract
  * @v: pointer of type atomic_t
@@ -272,12 +288,4 @@ static inline int atomic_add_negative(int i, atomic_t *v)
 return c;
 }
 
-static inline atomic_t atomic_compareandswap(
-atomic_t old, atomic_t new, atomic_t *v)
-{
-atomic_t rc;
-rc.counter = __cmpxchg(&v->counter, old.counter, new.counter, sizeof(int));
-return rc;
-}
-
 #endif /* __ARCH_X86_ATOMIC__ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 82b6dd1..5870745 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -483,16 +483,15 @@ extern struct vcpu *idle_vcpu[NR_CPUS];
  */
 static always_inline int get_domain(struct domain *d)
 {
-atomic_t old, new, seen = d->refcnt;
+int old, seen = atomic_read(&d->refcnt);
 do
 {
 old = seen;
-if ( unlikely(_atomic_read(old) & DOMAIN_DESTROYED) )
+if ( unlikely(old & DOMAIN_DESTROYED) )
 return 0;
-_atomic_set(&new, _atomic_read(old) + 1);
-seen = atomic_compareandswap(old, new, &d->refcnt);
+seen = atomic_cmpxchg(&d->refcnt, old, old + 1);
 }
-while ( unlikely(_atomic_read(seen) != _atomic_read(old)) );
+while ( unlikely(seen != old) );
 return 1;
 }
 
-- 
2.1.4



[Xen-devel] [PATCHv2 2/4] spinlock: shrink struct lock_debug

2016-01-26 Thread David Vrabel

From: Jennifer Herbert 

Reduce the size of struct lock_debug so increases in other lock
structures don't increase the size of struct domain too much.
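
The effect on structure size can be checked with a small sketch. This is illustrative only: the enclosing lock layout is hypothetical, `int16_t` stands in for Xen's `s16`, and the sizes assume a common ABI where `int` is 4 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* The debug field before and after the change. */
struct lock_debug_int {
    int irq_safe;
};

struct lock_debug_s16 {
    int16_t irq_safe;
};

/* Hypothetical enclosing lock: a pair of 16-bit tickets plus debug state. */
struct sketch_lock_int {
    uint16_t head, tail;
    struct lock_debug_int debug;
};

struct sketch_lock_s16 {
    uint16_t head, tail;
    struct lock_debug_s16 debug;
};

/* Holds wherever int is wider than 16 bits. */
static_assert(sizeof(struct lock_debug_s16) < sizeof(struct lock_debug_int),
              "16-bit field should shrink the debug struct");
```

The saving matters because every embedded lock carries its own copy of the debug struct, so it is multiplied by the number of locks in struct domain.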

Signed-off-by: Jennifer Herbert 
Signed-off-by: David Vrabel 
---
v2:
- Continue to use int for local variable.
---
 xen/include/xen/spinlock.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
index 8b2590f..22c4fc2 100644
--- a/xen/include/xen/spinlock.h
+++ b/xen/include/xen/spinlock.h
@@ -8,7 +8,7 @@
 
 #ifndef NDEBUG
 struct lock_debug {
-int irq_safe; /* +1: IRQ-safe; 0: not IRQ-safe; -1: don't know yet */
+s16 irq_safe; /* +1: IRQ-safe; 0: not IRQ-safe; -1: don't know yet */
 };
 #define _LOCK_DEBUG { -1 }
 void spin_debug_enable(void);
-- 
2.1.4



[Xen-devel] [PATCHv2 3/4] spinlock: move rwlock API and per-cpu rwlocks into their own files

2016-01-26 Thread David Vrabel

From: Jennifer Herbert 

In preparation for a replacement read-write lock implementation, move
the API and the per-cpu read-write locks into their own files.

Signed-off-by: Jennifer Herbert 
Signed-off-by: David Vrabel 
---
v2:
- new
---
 xen/arch/x86/mm/mem_sharing.c |   1 +
 xen/common/Makefile   |   1 +
 xen/common/rwlock.c   |  47 +
 xen/common/spinlock.c |  45 -
 xen/include/asm-x86/mm.h  |   1 +
 xen/include/xen/grant_table.h |   1 +
 xen/include/xen/rwlock.h  | 150 ++
 xen/include/xen/sched.h   |   1 +
 xen/include/xen/spinlock.h| 143 
 9 files changed, 202 insertions(+), 188 deletions(-)
 create mode 100644 xen/common/rwlock.c
 create mode 100644 xen/include/xen/rwlock.h

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index a95e105..a522423 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 4df71ee..6e82b33 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -30,6 +30,7 @@ obj-y += rangeset.o
 obj-y += radix-tree.o
 obj-y += rbtree.o
 obj-y += rcupdate.o
+obj-y += rwlock.o
 obj-$(CONFIG_SCHED_ARINC653) += sched_arinc653.o
 obj-$(CONFIG_SCHED_CREDIT) += sched_credit.o
 obj-$(CONFIG_SCHED_CREDIT2) += sched_credit2.o
diff --git a/xen/common/rwlock.c b/xen/common/rwlock.c
new file mode 100644
index 000..410d4dc
--- /dev/null
+++ b/xen/common/rwlock.c
@@ -0,0 +1,47 @@
+#include 
+#include 
+
+static DEFINE_PER_CPU(cpumask_t, percpu_rwlock_readers);
+
+void _percpu_write_lock(percpu_rwlock_t **per_cpudata,
+percpu_rwlock_t *percpu_rwlock)
+{
+unsigned int cpu;
+cpumask_t *rwlock_readers = &this_cpu(percpu_rwlock_readers);
+
+/* Validate the correct per_cpudata variable has been provided. */
+_percpu_rwlock_owner_check(per_cpudata, percpu_rwlock);
+
+/*
+ * First take the write lock to protect against other writers or slow
+ * path readers.
+ */
+write_lock(&percpu_rwlock->rwlock);
+
+/* Now set the global variable so that readers start using read_lock. */
+percpu_rwlock->writer_activating = 1;
+smp_mb();
+
+/* Using a per cpu cpumask is only safe if there is no nesting. */
+ASSERT(!in_irq());
+cpumask_copy(rwlock_readers, &cpu_online_map);
+
+/* Check if there are any percpu readers in progress on this rwlock. */
+for ( ; ; )
+{
+for_each_cpu(cpu, rwlock_readers)
+{
+/*
+ * Remove any percpu readers not contending on this rwlock
+ * from our check mask.
+ */
+if ( per_cpu_ptr(per_cpudata, cpu) != percpu_rwlock )
+__cpumask_clear_cpu(cpu, rwlock_readers);
+}
+/* Check if we've cleared all percpu readers from check mask. */
+if ( cpumask_empty(rwlock_readers) )
+break;
+/* Give the coherency fabric a break. */
+cpu_relax();
+};
+}
diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index bab1f95..7b0cf6c 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -10,8 +10,6 @@
 #include 
 #include 
 
-static DEFINE_PER_CPU(cpumask_t, percpu_rwlock_readers);
-
 #ifndef NDEBUG
 
 static atomic_t spin_debug __read_mostly = ATOMIC_INIT(0);
@@ -494,49 +492,6 @@ int _rw_is_write_locked(rwlock_t *lock)
 return (lock->lock == RW_WRITE_FLAG); /* writer in critical section? */
 }
 
-void _percpu_write_lock(percpu_rwlock_t **per_cpudata,
-percpu_rwlock_t *percpu_rwlock)
-{
-unsigned int cpu;
-cpumask_t *rwlock_readers = &this_cpu(percpu_rwlock_readers);
-
-/* Validate the correct per_cpudata variable has been provided. */
-_percpu_rwlock_owner_check(per_cpudata, percpu_rwlock);
-
-/* 
- * First take the write lock to protect against other writers or slow 
- * path readers.
- */
-write_lock(&percpu_rwlock->rwlock);
-
-/* Now set the global variable so that readers start using read_lock. */
-percpu_rwlock->writer_activating = 1;
-smp_mb();
-
-/* Using a per cpu cpumask is only safe if there is no nesting. */
-ASSERT(!in_irq());
-cpumask_copy(rwlock_readers, &cpu_online_map);
-
-/* Check if there are any percpu readers in progress on this rwlock. */
-for ( ; ; )
-{
-for_each_cpu(cpu, rwlock_readers)
-{
-/* 
- * Remove any percpu readers not contending on this rwlock
- * from our check mask.
- */
-if ( per_cpu_ptr(per_cpudata, cpu) != percpu_rwlock )
-__cpumask_clear_cpu(cpu, rwlock_readers);
-}
-/* Check if we've cleared all percpu readers from check mask. */
-
