date:20160407

Re: [Xen-devel] [PATCH] x86/emulate: Check current->arch.vm_event in hvmemul_virtual_to_linear()

2016-04-07 Thread Razvan Cojocaru

On 04/08/16 00:17, Jan Beulich wrote:
 On 07.04.16 at 19:54,  wrote:
>> On 04/07/16 20:27, Jan Beulich wrote:
>> On 07.04.16 at 10:39,  wrote:
 Theoretically it is possible for mem_access_emulate_each_rep to be
 true even when current->arch.vm_event == NULL, so add an extra
 check to hvmemul_virtual_to_linear().
>>>
>>> Mind saying what those theoretical conditions are when this might
>>> happen?
>>
>> This could happen if someone were to call xc_monitor_emulate_each_rep(),
>> but not xc_monitor_enable() (when current->arch.vm_event gets
>> allocated), or after someone called both, but afterwards called
>> xc_monitor_disable() (when current->arch.vm_event gets freed).
> 
> Then wouldn't the correct action be to fail
> xc_monitor_emulate_each_rep() (i.e. whatever hypercall this
> resolves to) when the monitor is not enabled? (You did already
> clarify that the other variant isn't applicable)?

Yes, I think that's better. I'll do that instead.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 04/14] x86/rtc: replace paravirt rtc check with platform legacy quirk

2016-04-07 Thread Juergen Gross

On 08/04/16 02:32, Luis R. Rodriguez wrote:
> On Thu, Apr 07, 2016 at 08:55:54AM -0400, Boris Ostrovsky wrote:
>> On 04/06/2016 08:06 PM, Luis R. Rodriguez wrote:
>>> We have 4 types of x86 platforms that disable RTC:
>>>
>>>   * Intel MID
>>>   * Lguest - uses paravirt
>>>   * Xen dom-U - uses paravirt
>>>   * x86 on legacy systems annotated with an ACPI legacy flag
>>>
>>> We can consolidate all of these into a platform specific legacy
>>> quirk set early in boot through i386_start_kernel() and through
>>> x86_64_start_reservations(). This deals with the RTC quirks which
>>> we can rely on through the hardware subarch, the ACPI check can
>>> be dealt with separately.
>>>
>>> v2: split the subarch check from the ACPI check, clarify
>>> on the ACPI change commit log why ordering works
>>>
>>> Suggested-by: Ingo Molnar 
>>> Signed-off-by: Luis R. Rodriguez 
> 
> <-- snip -->
> 
>>> diff --git a/arch/x86/kernel/platform-quirks.c 
>>> b/arch/x86/kernel/platform-quirks.c
>>> new file mode 100644
>>> index ..1b114ac5996f
>>> --- /dev/null
>>> +++ b/arch/x86/kernel/platform-quirks.c
>>> @@ -0,0 +1,18 @@
>>> +#include 
>>> +#include 
>>> +
>>> +#include 
>>> +#include 
>>> +
>>> +void __init x86_early_init_platform_quirks(void)
>>> +{
>>> +   x86_platform.legacy.rtc = 1;
>>> +
>>> +   switch (boot_params.hdr.hardware_subarch) {
>>> +   case X86_SUBARCH_XEN:
>>> +   case X86_SUBARCH_LGUEST:
>>> +   case X86_SUBARCH_INTEL_MID:
>>> +   x86_platform.legacy.rtc = 0;
>>> +   break;
>>> +   }
>>> +}
>>
>> What about Xen dom0 (aka initial domain)?
> 
> Indeed, thanks for catching this, the hunk below removes the re-enablement of
> the the RTC for dom0:
> 
>>> --- a/arch/x86/xen/enlighten.c
>>> +++ b/arch/x86/xen/enlighten.c
>>> @@ -1192,7 +1192,6 @@ static const struct pv_info xen_info __initconst = {
>>>  #ifdef CONFIG_X86_64
>>> .extra_user_64bit_cs = FLAT_USER_CS64,
>>>  #endif
>>> -   .features = 0,
>>> .name = "Xen",
>>>  };
>>> @@ -1525,8 +1524,6 @@ asmlinkage __visible void __init 
>>> xen_start_kernel(void)
>>> /* Install Xen paravirt ops */
>>> pv_info = xen_info;
>>> -   if (xen_initial_domain())
>>> -   pv_info.features |= PV_SUPPORTED_RTC;
>>> pv_init_ops = xen_init_ops;
>>> if (!xen_pvh_domain()) {
>>> pv_cpu_ops = xen_cpu_ops;
> 
> This should then break dom0 unless of course you have the respective next
> patch applied and that disabled the RTC due to an ACPI setting on your
> platform. Juergen, can you check to see if that was the case for your
> testing platform on dom0 ?

Are you sure it would break? Wouldn't it just fall back to another
clock source, e.g. hpet?

I looked into my test system: seems as if add_rtc_cmos() is returning
before the .legacy.rtc test.

> This highlights a semantic gap issue. From a quick cursory review, I think
> we can address this temporarily by just using a check:
> 
> void __init x86_early_init_platform_quirks(void)
> {
>   x86_platform.legacy.rtc = 1;
> 
>   switch (boot_params.hdr.hardware_subarch) {
>   case X86_SUBARCH_XEN:
>   case X86_SUBARCH_LGUEST:
>   case X86_SUBARCH_INTEL_MID:
> - x86_platform.legacy.rtc = 0;
> + if (x86_init.mpparse.get_smp_config != x86_init_uint_noop)
> + x86_platform.legacy.rtc = 0;

No! Why don't you just use the explicit test xen_initial_domain() ?


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen

2016-04-07 Thread Haozhong Zhang

On 03/29/16 04:49, Jan Beulich wrote:
> >>> On 29.03.16 at 12:10,  wrote:
> > On 03/29/16 03:11, Jan Beulich wrote:
> >> >>> On 29.03.16 at 10:47,  wrote:
[..]
> >> > I still cannot find a neat approach to manage guest permissions for
> >> > nvdimm pages. A possible one is to use a per-domain bitmap to track
> >> > permissions: each bit corresponding to an nvdimm page. The bitmap can
> >> > save lots of spaces and even be stored in the normal ram, but
> >> > operating it for a large nvdimm range, especially for a contiguous
> >> > one, is slower than rangeset.
> >> 
> >> I don't follow: What would a single bit in that bitmap mean? Any
> >> guest may access the page? That surely wouldn't be what we
> >> need.
> >>
> > 
> > For a host having a N pages of nvdimm, each domain will have a N bits
> > bitmap. If the m'th bit of a domain's bitmap is set, then that domain
> > has the permission to access the m'th host nvdimm page.
> 
> Which will be more overhead as soon as there are enough such
> domains in a system.
>

Sorry for the late reply.

I think we can make some optimization to reduce the space consumed by
the bitmap.

A per-domain bitmap covering the entire host NVDIMM address range is
wasteful especially if the actual used ranges are congregated. We may
take following ways to reduce its space.

1) Split the per-domain bitmap into multiple sub-bitmap and each
   sub-bitmap covers a smaller and contiguous sub host NVDIMM address
   range. In the beginning, no sub-bitmap is allocated for the
   domain. If the access permission to a host NVDIMM page in a sub
   host address range is added to a domain, only the sub-bitmap for
   that address range is allocated for the domain. If access
   permissions to all host NVDIMM pages in a sub range are removed
   from a domain, the corresponding sub-bitmap can be freed.

2) If a domain has access permissions to all host NVDIMM pages in a
   sub range, the corresponding sub-bitmap will be replaced by a range
   struct. If range structs are used to track adjacent ranges, they
   will be merged into one range struct. If access permissions to some
   pages in that sub range are removed from a domain, the range struct
   should be converted back to bitmap segment(s).

3) Because there might be lots of above bitmap segments and range
   structs per-domain, we can organize them in a balanced interval
   tree to quickly search/add/remove an individual structure.

In the worst case that each sub range has non-contiguous pages
assigned to a domain, above solution will use all sub-bitmaps and
consume more space than a single bitmap because of the extra space for
organization. I assume that the sysadmin should be responsible to
ensure the host nvdimm ranges assigned to each domain as contiguous
and congregated as possible in order to avoid the worst case. However,
if the worst case does happen, xen hypervisor should refuse to assign
nvdimm to guest when it runs out of memory.

Haozhong

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [linux-mingo-tip-master test] 89303: regressions - FAIL

2016-04-07 Thread osstest service owner

flight 89303 linux-mingo-tip-master real [real]
http://logs.test-lab.xenproject.org/osstest/logs/89303/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-xsm  15 guest-localmigratefail REGR. vs. 60684
 build-amd64-rumpuserxen   6 xen-build fail REGR. vs. 60684
 test-amd64-amd64-xl-multivcpu 15 guest-localmigrate   fail REGR. vs. 60684
 test-amd64-amd64-libvirt 14 guest-saverestore fail REGR. vs. 60684
 test-amd64-amd64-xl  17 guest-localmigrate/x10fail REGR. vs. 60684
 test-amd64-amd64-xl-credit2  14 guest-saverestore fail REGR. vs. 60684
 test-amd64-amd64-libvirt-xsm 15 guest-saverestore.2   fail REGR. vs. 60684
 build-i386-rumpuserxen6 xen-build fail REGR. vs. 60684
 test-amd64-amd64-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail REGR. vs. 
60684
 test-amd64-amd64-pair  21 guest-migrate/src_host/dst_host fail REGR. vs. 60684

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 17 guest-localmigrate/x10fail REGR. vs. 60684
 test-amd64-i386-libvirt-xsm  15 guest-saverestore.2  fail blocked in 60684
 test-amd64-i386-xl   15 guest-localmigrate   fail blocked in 60684
 test-amd64-i386-libvirt  15 guest-saverestore.2  fail blocked in 60684
 test-amd64-i386-xl-xsm   15 guest-localmigrate   fail blocked in 60684
 test-amd64-i386-pair  22 guest-migrate/dst_host/src_host fail blocked in 60684
 test-amd64-i386-libvirt-pair 22 guest-migrate/dst_host/src_host fail blocked 
in 60684
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop   fail blocked in 60684
 test-amd64-amd64-libvirt-pair 21 guest-migrate/src_host/dst_host fail like 
60684
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 60684
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 60684

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestorefail  never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail never pass
 test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1   fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass

version targeted for testing:
 linux8cdaa4153bbd0354bcd06636101daeef42312d2d
baseline version:
 linux69f75ebe3b1d1e636c4ce0a0ee248edacc69cbe0

Last test of basis60684  2015-08-13 04:21:46 Z  238 days
Failing since 60712  2015-08-15 18:33:48 Z  236 days  180 attempts
Testing same since89303  2016-04-07 04:43:29 Z0 days1 attempts

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-xl  fail
 test-amd64-i386-xl   fail
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm

Re: [Xen-devel] [PATCH v3 08/11] xen: sched: allow for choosing credit2 runqueues configuration at boot

2016-04-07 Thread Juergen Gross

On 08/04/16 03:24, Dario Faggioli wrote:
> In fact, credit2 uses CPU topology to decide how to arrange
> its internal runqueues. Before this change, only 'one runqueue
> per socket' was allowed. However, experiments have shown that,
> for instance, having one runqueue per physical core improves
> performance, especially in case hyperthreading is available.
> 
> In general, it makes sense to allow users to pick one runqueue
> arrangement at boot time, so that:
>  - more experiments can be easily performed to even better
>assess and improve performance;
>  - one can select the best configuration for his specific
>use case and/or hardware.
> 
> This patch enables the above.
> 
> Note that, for correctly arranging runqueues to be per-core,
> just checking cpu_to_core() on the host CPUs is not enough.
> In fact, cores (and hyperthreads) on different sockets, can
> have the same core (and thread) IDs! We, therefore, need to
> check whether the full topology of two CPUs matches, for
> them to be put in the same runqueue.
> 
> Note also that the default (although not functional) for
> credit2, since now, has been per-socket runqueue. This patch
> leaves things that way, to avoid mixing policy and technical
> changes.
> 
> Finally, it would be a nice feature to be able to select
> a particular runqueue arrangement, even when creating a
> Credit2 cpupool. This is left as future work.
> 
> Signed-off-by: Dario Faggioli 
> Signed-off-by: Uma Sharma 

Some nits below.

> ---
> Cc: George Dunlap 
> Cc: Uma Sharma 
> Cc: Juergen Gross 
> ---
> Changes from v2:
>  * valid strings  are now in an array, that we scan during
>parameter parsing, as suggested during review.
> 
> Cahnges from v1:
>  * fix bug in parameter parsing, and start using strcmp()
>for that, as requested during review.
> ---
>  docs/misc/xen-command-line.markdown |   19 
>  xen/common/sched_credit2.c  |   83 
> +--
>  2 files changed, 97 insertions(+), 5 deletions(-)
> 
> diff --git a/docs/misc/xen-command-line.markdown 
> b/docs/misc/xen-command-line.markdown
> index ca77e3b..0047f94 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -469,6 +469,25 @@ combination with the `low_crashinfo` command line option.
>  ### credit2\_load\_window\_shift
>  > `= `
>  
> +### credit2\_runqueue
> +> `= core | socket | node | all`
> +
> +> Default: `socket`
> +
> +Specify how host CPUs are arranged in runqueues. Runqueues are kept
> +balanced with respect to the load generated by the vCPUs running on
> +them. Smaller runqueues (as in with `core`) means more accurate load
> +balancing (for instance, it will deal better with hyperthreading),
> +but also more overhead.
> +
> +Available alternatives, with their meaning, are:
> +* `core`: one runqueue per each physical core of the host;
> +* `socket`: one runqueue per each physical socket (which often,
> +but not always, matches a NUMA node) of the host;
> +* `node`: one runqueue per each NUMA node of the host;
> +* `all`: just one runqueue shared by all the logical pCPUs of
> + the host
> +
>  ### dbgp
>  > `= ehci[  | @pci:. ]`
>  
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index a61a45a..eeb3f54 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -81,10 +81,6 @@
>   * Credits are "reset" when the next vcpu in the runqueue is less than
>   * or equal to zero.  At that point, everyone's credits are "clipped"
>   * to a small value, and a fixed credit is added to everyone.
> - *
> - * The plan is for all cores that share an L2 will share the same
> - * runqueue.  At the moment, there is one global runqueue for all
> - * cores.
>   */
>  
>  /*
> @@ -193,6 +189,63 @@ static int __read_mostly opt_overload_balance_tolerance 
> = -3;
>  integer_param("credit2_balance_over", opt_overload_balance_tolerance);
>  
>  /*
> + * Runqueue organization.
> + *
> + * The various cpus are to be assigned each one to a runqueue, and we
> + * want that to happen basing on topology. At the moment, it is possible
> + * to choose to arrange runqueues to be:
> + *
> + * - per-core: meaning that there will be one runqueue per each physical
> + * core of the host. This will happen if the opt_runqueue
> + * parameter is set to 'core';
> + *
> + * - per-node: meaning that there will be one runqueue per each physical
> + * NUMA node of the host. This will happen if the opt_runqueue
> + * parameter is set to 'node';
> + *
> + * - per-socket: meaning that there will be one runqueue per each physical
> + *   socket (AKA package, which often, but not always, also
> + *   matches a NUMA node) of the host; This will happen if
> + *   the opt_runqueue parameter

Re: [Xen-devel] [PATCH] bind_usbintf: do not reuse 'path'

2016-04-07 Thread Chun Yan Liu



>>> On 4/8/2016 at 12:52 AM, in message
<22278.36941.470901.631...@mariner.uk.xensource.com>, Ian Jackson
 wrote: 
> Chunyan Liu writes ("[PATCH] bind_usbintf: do not reuse 'path'"): 
> > To avoid confusion, add a new variable "intf_path" to indicate 
> > driver/interface path, let "path" indicate driver/bind path only. 
>  
> I think it would be better to rename both variables, as I suggested in 
> my mail yesterday.  That makes it much more obvious that each place 
> where `path' was used has been checked and the reference to the 
> proper variable substituted. 

Updated. Sent V2.

>  
> Thanks, 
> Ian. 
>  
>  



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH V2] bind_usbintf: do not reuse 'path'

2016-04-07 Thread Chunyan Liu

To avoid confusion, use "intf_path" to indicate driver/interface path,
and "bind_path" indicate driver/bind path.

Signed-off-by: Chunyan Liu 
CC: Simon Cao 
CC: George Dunlap 
CC: Ian Jackson 
---
 tools/libxl/libxl_pvusb.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_pvusb.c b/tools/libxl/libxl_pvusb.c
index 6447639..4c1967d 100644
--- a/tools/libxl/libxl_pvusb.c
+++ b/tools/libxl/libxl_pvusb.c
@@ -1035,33 +1035,33 @@ out:
 
 static int bind_usbintf(libxl__gc *gc, const char *intf, const char *drvpath)
 {
-char *path;
+char *bind_path, *intf_path;
 struct stat st;
 int fd = -1;
 int rc, r;
 
-path = GCSPRINTF("%s/%s", drvpath, intf);
+intf_path = GCSPRINTF("%s/%s", drvpath, intf);
 
 /* check through lstat, if intf already exists under drvpath,
  * it's already bound, return directly; if it doesn't exist,
  * continue to do bind work; otherwise, return error.
  */
-r = lstat(path, );
+r = lstat(intf_path, );
 if (r == 0)
 return 0;
 if (r < 0 && errno != ENOENT)
 return ERROR_FAIL;
 
-path = GCSPRINTF("%s/bind", drvpath);
+bind_path = GCSPRINTF("%s/bind", drvpath);
 
-fd = open(path, O_WRONLY);
+fd = open(bind_path, O_WRONLY);
 if (fd < 0) {
-LOGE(ERROR, "open file failed: '%s'", path);
+LOGE(ERROR, "open file failed: '%s'", bind_path);
 rc = ERROR_FAIL;
 goto out;
 }
 
-if (libxl_write_exactly(CTX, fd, intf, strlen(intf), path, intf)) {
+if (libxl_write_exactly(CTX, fd, intf, strlen(intf), bind_path, intf)) {
 rc = ERROR_FAIL;
 goto out;
 }
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC PATCH] Data integrity extension support for xen-block

2016-04-07 Thread Juergen Gross

On 08/04/16 03:24, Bob Liu wrote:
> 
> On 04/07/2016 11:55 PM, Juergen Gross wrote:
>> On 07/04/16 12:00, Bob Liu wrote:
>>> * What's data integrity extension and why?
>>> Modern filesystems feature checksumming of data and metadata to protect 
>>> against
>>> data corruption.  However, the detection of the corruption is done at read 
>>> time
>>> which could potentially be months after the data was written.  At that 
>>> point the
>>> original data that the application tried to write is most likely lost.
>>>
>>> The solution in Linux is the data integrity framework which enables 
>>> protection
>>> information to be pinned to I/Os and sent to/received from controllers that
>>> support it. struct bio has been extended with a pointer to a struct bip 
>>> which
>>> in turn contains the integrity metadata. The bip is essentially a trimmed 
>>> down
>>> bio with a bio_vec and some housekeeping.
>>>
>>> * Issues when xen-block get involved.
>>> xen-blkfront only transmits the normal data of struct bio while the 
>>> integrity
>>> metadata buffer(struct bio_integrity_payload in each bio) is ignored.
>>>
>>> * Proposal of transmitting bio integrity payload.
>>> Adding an extra request following the normal data request, this extra 
>>> request
>>> contains the integrity payload.
>>> The xen-blkback will reconstruct an new bio with both received normal data 
>>> and
>>> integrity metadata.
>>>
>>> Welcome any better ideas, thank you!
>>>
>>> [1] http://lwn.net/Articles/280023/
>>> [2] https://www.kernel.org/doc/Documentation/block/data-integrity.txt
>>>
>>> Signed-off-by: Bob Liu 
>>> ---
>>>  xen/include/public/io/blkif.h |   50 
>>> +
>>>  1 file changed, 50 insertions(+)
>>>
>>> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
>>> index 99f0326..3d8d39f 100644
>>> --- a/xen/include/public/io/blkif.h
>>> +++ b/xen/include/public/io/blkif.h
>>> @@ -635,6 +635,28 @@
>>>  #define BLKIF_OP_INDIRECT  6
>>>  
>>>  /*
>>> + * Recognized only if "feature-extra-request" is present in backend xenbus 
>>> info.
>>> + * A request with BLKIF_OP_EXTRA_FLAG indicates an extra request is 
>>> followed
>>> + * in the shared ring buffer.
>>> + *
>>> + * By this way, extra data like bio integrity payload can be transmitted 
>>> from
>>> + * frontend to backend.
>>> + *
>>> + * The 'wire' format is like:
>>> + *  Request 1: xen_blkif_request
>>> + * [Request 2: xen_blkif_extra_request](only if request 1 has 
>>> BLKIF_OP_EXTRA_FLAG)
>>> + *  Request 3: xen_blkif_request
>>> + *  Request 4: xen_blkif_request
>>> + * [Request 5: xen_blkif_extra_request](only if request 4 has 
>>> BLKIF_OP_EXTRA_FLAG)
>>> + *  ...
>>> + *  Request N: xen_blkif_request
>>> + *
>>> + * If a backend does not recognize BLKIF_OP_EXTRA_FLAG, it should *not* 
>>> create the
>>> + * "feature-extra-request" node!
>>> + */
>>> +#define BLKIF_OP_EXTRA_FLAG (0x80)
>>> +
>>> +/*
>>>   * Maximum scatter/gather segments per request.
>>>   * This is carefully chosen so that sizeof(blkif_ring_t) <= PAGE_SIZE.
>>>   * NB. This could be 12 if the ring indexes weren't stored in the same 
>>> page.
>>> @@ -703,6 +725,34 @@ struct blkif_request_indirect {
>>>  };
>>>  typedef struct blkif_request_indirect blkif_request_indirect_t;
>>>  
>>> +enum blkif_extra_request_type {
>>> +   BLKIF_EXTRA_TYPE_DIX = 1,   /* Data integrity extension 
>>> payload.  */
>>> +};
>>> +
>>> +struct bio_integrity_req {
>>> +   /*
>>> +* Grant mapping for transmitting bio integrity payload to backend.
>>> +*/
>>> +   grant_ref_t *gref;
>>> +   unsigned int nr_grefs;
>>> +   unsigned int len;
>>> +};
>>
>> How does the payload look like? It's structure should be defined here
>> or a reference to it's definition in case it is a standard should be
>> given.
>>
> 
> The payload is also described using struct bio_vec(the same as bio).
> 
> /*
>  * bio integrity payload
>  */
> struct bio_integrity_payload {
>   struct bio  *bip_bio;   /* parent bio */

And e.g. Windows guests know how those look like? And BSDs? And others?
All Linux versions with all possible kernel configurations have the
same layout?

I don't think so.

> 
>   struct bvec_iterbip_iter;
> 
>   bio_end_io_t*bip_end_io;/* saved I/O completion fn */
> 
>   unsigned short  bip_slab;   /* slab the bip came from */
>   unsigned short  bip_vcnt;   /* # of integrity bio_vecs */
>   unsigned short  bip_max_vcnt;   /* integrity bio_vec slots */
>   unsigned short  bip_flags;  /* control flags */
> 
>   struct work_struct  bip_work;   /* I/O completion */
> 
>   struct bio_vec  *bip_vec;
>   struct bio_vec  bip_inline_vecs[0];/* embedded bvec array */
> };
> 
>>> +
>>> +/*
>>> + * Extra request, must follow a normal-request and a normal-request can
>>> + * only be followed by

Re: [Xen-devel] [PATCH 1/2] libxl: Set rc on failure of usbdev_busaddr_to_busid

2016-04-07 Thread Chun Yan Liu



>>> On 4/8/2016 at 01:04 AM, in message
<22278.37677.975595.101...@mariner.uk.xensource.com>, Ian Jackson
 wrote: 
> Chun Yan Liu writes ("Re: [PATCH 1/2] libxl: Set rc on failure of  
> usbdev_busaddr_to_busid"): 
> > Thanks, Ian! 
>  
> Should I take that as a Reviewed-by ? 

Ah, yes. Acked. :-)

Chunyan

>  
> Thanks, 
> Ian. 
>  
>  



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/4] a fix in libxl_device_usbdev_list

2016-04-07 Thread Chun Yan Liu



>>> On 4/8/2016 at 12:45 AM, in message
<22278.36492.245114.295...@mariner.uk.xensource.com>, Ian Jackson
 wrote: 
> Chunyan Liu writes ("[PATCH 1/4] a fix in libxl_device_usbdev_list"): 
> > In testing with libvirt pvusb functionality, found a rc check 
> > error in libxl_device_usbdev_list. Correct it. 
>  
> Thanks.  But now that I look at this code I'm not sure your fix is 
> complete. 
>  
> > diff --git a/tools/libxl/libxl_pvusb.c b/tools/libxl/libxl_pvusb.c 
> > index 5f92628..04e41b4 100644 
> > --- a/tools/libxl/libxl_pvusb.c 
> > +++ b/tools/libxl/libxl_pvusb.c 
> > @@ -701,13 +701,13 @@ libxl_device_usbdev_list(libxl_ctx *ctx, uint32_t  
> domid, int *num) 
> >  usbctrls = libxl__xs_directory(gc, XBT_NULL, path, ); 
> >   
> >  for (i = 0; i < nc; i++) { 
> > -int r, nd = 0; 
> > +int rc, nd = 0; 
> >  libxl_device_usbdev *tmp = NULL; 
> >   
> > -r = libxl__device_usbdev_list_for_usbctrl(gc, domid, 
> > +rc = libxl__device_usbdev_list_for_usbctrl(gc, domid, 
> >atoi(usbctrls[i]), 
> >, ); 
> > -if (!r || !nd) continue; 
> > +if (rc || !nd) continue; 
>  
> If libxl__device_usbdev_list_for_usbctrl fails, shouldn't 
> libxl_device_usbdev_list fail too ? 

Following the similar definitions of other device types, the return value of
this function is "libxl_device_usbdev *", to treat the above case as failure,
it cannot be reflected through return value, we can only set return value
to NULL, but that will be confusing with a real no-device case.

- Chunyan
>  
> Ian. 
>  
>  



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 6/6] xl: improve exit codes of domain creation related functions

2016-04-07 Thread Dario Faggioli

by making them more consistent with other examples in xl.

Signed-off-by: Dario Faggioli 
Signed-off-by: Harmandeep Kaur 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
v3: In create_domain(), only deal with exit codes, internal returns
need more work than what was being done in the original patch.
Shorten changelog.

v2: Add create_domain()
Remove main_sharing()
---
 tools/libxl/xl_cmdimpl.c |   32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index d2ad8c8..31e2f30 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -2964,7 +2964,7 @@ static int create_domain(struct domain_create *dom_info)
 if (!json) {
 fprintf(stderr,
 "Failed to convert domain configuration to JSON\n");
-exit(1);
+exit(EXIT_FAILURE);
 }
 fputs(json, cfg_print_fh);
 free(json);
@@ -3212,7 +3212,7 @@ out:
  * already happened in the parent.
  */
 if ( daemonize && !need_daemon )
-exit(ret);
+exit(EXIT_SUCCESS);
 
 return ret;
 }
@@ -5345,7 +5345,7 @@ static void string_realloc_append(char **accumulate, 
const char *more)
 size_t morelen = strlen(more) + 1/*nul*/;
 if (oldlen > SSIZE_MAX || morelen > SSIZE_MAX - oldlen) {
 fprintf(stderr,"Additional config data far too large\n");
-exit(-ERROR_FAIL);
+exit(EXIT_FAILURE);
 }
 
 *accumulate = xrealloc(*accumulate, oldlen + morelen);
@@ -5420,7 +5420,7 @@ int main_create(int argc, char **argv)
 } else {
 help("create");
 free(dom_info.extra_config);
-return 2;
+return EXIT_FAILURE;
 }
 }
 
@@ -5440,11 +5440,11 @@ int main_create(int argc, char **argv)
 rc = create_domain(_info);
 if (rc < 0) {
 free(dom_info.extra_config);
-return -rc;
+return EXIT_FAILURE;
 }
 
 free(dom_info.extra_config);
-return 0;
+return EXIT_SUCCESS;
 }
 
 int main_config_update(int argc, char **argv)
@@ -5465,7 +5465,7 @@ int main_config_update(int argc, char **argv)
 if (argc < 2) {
 fprintf(stderr, "xl config-update requires a domain argument\n");
 help("config-update");
-exit(1);
+exit(EXIT_FAILURE);
 }
 
 fprintf(stderr, "WARNING: xl now has better capability to manage domain 
configuration, "
@@ -5497,7 +5497,7 @@ int main_config_update(int argc, char **argv)
 } else {
 help("create");
 free(extra_config);
-return 2;
+return EXIT_FAILURE;
 }
 }
 if (filename) {
@@ -5506,25 +5506,25 @@ int main_config_update(int argc, char **argv)
   _data, _len);
 if (rc) { fprintf(stderr, "Failed to read config file: %s: %s\n",
filename, strerror(errno));
-  free(extra_config); return ERROR_FAIL; }
+  free(extra_config); return EXIT_FAILURE; }
 if (extra_config && strlen(extra_config)) {
 if (config_len > INT_MAX - (strlen(extra_config) + 2 + 1)) {
 fprintf(stderr, "Failed to attach extra configuration\n");
-exit(1);
+exit(EXIT_FAILURE);
 }
 /* allocate space for the extra config plus two EOLs plus \0 */
 config_data = realloc(config_data, config_len
 + strlen(extra_config) + 2 + 1);
 if (!config_data) {
 fprintf(stderr, "Failed to realloc config_data\n");
-exit(1);
+exit(EXIT_FAILURE);
 }
 config_len += sprintf(config_data + config_len, "\n%s\n",
 extra_config);
 }
 } else {
 fprintf(stderr, "Config file not specified\n");
-exit(1);
+exit(EXIT_FAILURE);
 }
 
 libxl_domain_config_init(_config);
@@ -5540,7 +5540,7 @@ int main_config_update(int argc, char **argv)
config_data, config_len);
 if (rc) {
 fprintf(stderr, "failed to update configuration\n");
-exit(1);
+exit(EXIT_FAILURE);
 }
 }
 
@@ -5548,7 +5548,7 @@ int main_config_update(int argc, char **argv)
 
 free(config_data);
 free(extra_config);
-return 0;
+return EXIT_SUCCESS;
 }
 
 static void button_press(uint32_t domid, const char *b)
@@ -7034,10 +7034,10 @@ int main_rename(int argc, char **argv)
 domid = find_domain(dom);
 if (libxl_domain_rename(ctx, domid, common_domname, new_name)) {
 fprintf(stderr, "Can't rename domain '%s'.\n", dom);
-return 1;
+return EXIT_FAILURE;
 }
 
-return 0;
+return EXIT_SUCCESS;
 }

[Xen-devel] [PATCH v3 4/6] xl : improve exit codes of debug related functions

2016-04-07 Thread Dario Faggioli

From: Harmandeep Kaur 

by making them more consistent with other examples in xl.

Signed-off-by: Harmandeep Kaur 
Signed-off-by: Dario Faggioli 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
v3: Shorten changelog.

v2: Add main_sysrq(), main_debug_keys(), main_dmesg()
Remove xvasprintf(), main_remus()
---
 tools/libxl/xl_cmdimpl.c |   16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 4a560d7..823cb46 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -5561,7 +5561,7 @@ static void button_press(uint32_t domid, const char *b)
 trigger = LIBXL_TRIGGER_SLEEP;
 } else {
 fprintf(stderr, "%s is an invalid button identifier\n", b);
-exit(2);
+exit(EXIT_FAILURE);
 }
 
 libxl_send_trigger(ctx, domid, trigger, 0);
@@ -7058,7 +7058,7 @@ int main_trigger(int argc, char **argv)
 trigger_name = argv[optind++];
 if (libxl_trigger_from_string(trigger_name, )) {
 fprintf(stderr, "Invalid trigger \"%s\"\n", trigger_name);
-return -1;
+return EXIT_FAILURE;
 }
 
 if (argv[optind]) {
@@ -7070,7 +7070,7 @@ int main_trigger(int argc, char **argv)
 
 libxl_send_trigger(ctx, domid, trigger, vcpuid);
 
-return 0;
+return EXIT_SUCCESS;
 }
 
 
@@ -7091,12 +7091,12 @@ int main_sysrq(int argc, char **argv)
 if (sysrq[1] != '\0') {
 fprintf(stderr, "Invalid sysrq.\n\n");
 help("sysrq");
-return 1;
+return EXIT_FAILURE;
 }
 
 libxl_send_sysrq(ctx, domid, sysrq[0]);
 
-return 0;
+return EXIT_SUCCESS;
 }
 
 int main_debug_keys(int argc, char **argv)
@@ -7112,10 +7112,10 @@ int main_debug_keys(int argc, char **argv)
 
 if (libxl_send_debug_keys(ctx, keys)) {
 fprintf(stderr, "cannot send debug keys: %s\n", keys);
-return 1;
+return EXIT_FAILURE;
 }
 
-return 0;
+return EXIT_SUCCESS;
 }
 
 int main_dmesg(int argc, char **argv)
@@ -7141,7 +7141,7 @@ int main_dmesg(int argc, char **argv)
 finish:
 if (cr)
 libxl_xen_console_read_finish(ctx, cr);
-return ret;
+return ret ? EXIT_FAILURE : EXIT_SUCCESS;
 }
 
 int main_top(int argc, char **argv)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 2/6] xl: improve exit codes of save/restore and migration functions

2016-04-07 Thread Dario Faggioli

From: Harmandeep Kaur 

by making them more consistent with other examples in xl.

Macros CHK_ERRNOVAL, CHK_SYSCALL, MUST are also updated.

Signed-off-by: Harmandeep Kaur 
Signed-off-by: Dario Faggioli 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
v3: This is patches 2 and 3 squashed together.
Shorten changelog.

v2: Add main_remus().
Remove create_domain().
---
 tools/libxl/xl_cmdimpl.c |   80 +++---
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 9612b00..9cd3144 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -50,7 +50,7 @@
 else if (chk_errnoval > 0) {\
 fprintf(stderr,"xl: fatal error: %s:%d: %s: %s\n",  \
 __FILE__,__LINE__, strerror(chk_errnoval), #call);  \
-exit(-ERROR_FAIL);  \
+exit(EXIT_FAILURE); \
 }   \
 })
 
@@ -59,7 +59,7 @@
 if ((call) == -1) { \
 fprintf(stderr,"xl: fatal error: %s:%d: %s: %s\n",  \
 __FILE__,__LINE__, strerror(errno), #call); \
-exit(-ERROR_FAIL);  \
+exit(EXIT_FAILURE); \
 }   \
 })
 
@@ -68,7 +68,7 @@
 if (must_rc < 0) {  \
 fprintf(stderr,"xl: fatal error: %s:%d, rc=%d: %s\n",   \
 __FILE__,__LINE__, must_rc, #call); \
-exit(-must_rc); \
+exit(EXIT_FAILURE); \
 }   \
 })
 
@@ -376,7 +376,7 @@ static void xvasprintf(char **strp, const char *fmt, 
va_list ap)
 int r = vasprintf(strp, fmt, ap);
 if (r == -1) {
 perror("asprintf failed");
-exit(-ERROR_FAIL);
+exit(EXIT_FAILURE);
 }
 }
 
@@ -4350,7 +4350,7 @@ static void save_domain_core_begin(uint32_t domid,
   _v, config_len_r);
 if (rc) {
 fprintf(stderr, "unable to read overridden config file\n");
-exit(2);
+exit(EXIT_FAILURE);
 }
 parse_config_data(override_config_file, config_v, *config_len_r,
   _config);
@@ -4359,14 +4359,14 @@ static void save_domain_core_begin(uint32_t domid,
 rc = libxl_retrieve_domain_configuration(ctx, domid, _config);
 if (rc) {
 fprintf(stderr, "unable to retrieve domain configuration\n");
-exit(2);
+exit(EXIT_FAILURE);
 }
 }
 
 config_c = libxl_domain_config_to_json(ctx, _config);
 if (!config_c) {
 fprintf(stderr, "unable to convert config file to JSON\n");
-exit(2);
+exit(EXIT_FAILURE);
 }
 *config_data_r = (uint8_t *)config_c;
 *config_len_r = strlen(config_c) + 1; /* including trailing '\0' */
@@ -4436,7 +4436,7 @@ static int save_domain(uint32_t domid, const char 
*filename, int checkpoint,
 fd = open(filename, O_WRONLY|O_CREAT|O_TRUNC, 0644);
 if (fd < 0) {
 fprintf(stderr, "Failed to open temp file %s for writing\n", filename);
-exit(2);
+exit(EXIT_FAILURE);
 }
 
 save_domain_core_writeconfig(fd, filename, config_data, config_len);
@@ -4456,7 +4456,7 @@ static int save_domain(uint32_t domid, const char 
*filename, int checkpoint,
 else
 libxl_domain_destroy(ctx, domid, 0);
 
-exit(rc < 0 ? 1 : 0);
+exit(rc < 0 ? EXIT_FAILURE : EXIT_SUCCESS);
 }
 
 static pid_t create_migration_child(const char *rune, int *send_fd,
@@ -4480,7 +4480,7 @@ static pid_t create_migration_child(const char *rune, int 
*send_fd,
 close(recvpipe[0]); close(recvpipe[1]);
 execlp("sh","sh","-c",rune,(char*)0);
 perror("failed to exec sh");
-exit(-1);
+exit(EXIT_FAILURE);
 }
 
 close(sendpipe[0]);
@@ -4503,14 +4503,14 @@ static int migrate_read_fixedmessage(int fd, const void 
*msg, int msgsz,
 
 stream = rune ? "migration receiver stream" : "migration stream";
 rc = libxl_read_exactly(ctx, fd, buf, msgsz, stream, what);
-if (rc) return ERROR_FAIL;
+if (rc) return 1;
 
 if (memcmp(buf, msg, msgsz)) {
 fprintf(stderr, "%s contained unexpected data instead of %s\n",
 stream, what);
 if (rune)
 fprintf(stderr,

[Xen-devel] [PATCH v3 5/6] xl: make return type of create_domain() more consistent.

2016-04-07 Thread Dario Faggioli

create_domain() is of uint32_t return type, because on
success it returns the domid of the new domain, and
uint32_t is what we typically use for domid-s.

However, on failure, it returns ERROR_FAIL or ERROR_INVAL,
which are -3 and -6. Callers assign the return value to an
'int rc' variable and then check for '(rc < 0)'.

Although things work, and no tool (compiler, Coverity, ecc.)
is complaining, using 'int' as return type seems better.

Signed-off-by: Dario Faggioli 
---
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Andrew Cooper 
---
 tools/libxl/xl_cmdimpl.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 823cb46..d2ad8c8 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -2777,7 +2777,7 @@ static void evdisable_disk_ejects(libxl_evgen_disk_eject 
**diskws,
 }
 }
 
-static uint32_t create_domain(struct domain_create *dom_info)
+static int create_domain(struct domain_create *dom_info)
 {
 uint32_t domid = INVALID_DOMID;
 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 3/6] xl: improve exit codes of some of the domain handling functions

2016-04-07 Thread Dario Faggioli

From: Harmandeep Kaur 

by making them more consistent with other examples in xl.

Affected functions are the ones related to console, vnc,
dump, destroy, shutdown, list, domid and domname.

Signed-off-by: Harmandeep Kaur 
Signed-off-by: Dario Fagggioli 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
V3: This is patches 4, 5, 6 and 9 squashed into one.
Shorten changelog.

V2: Add autoconnect_vncviewer().
---
 tools/libxl/xl_cmdimpl.c |   72 +++---
 1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 9cd3144..4a560d7 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -174,7 +174,7 @@ static uint32_t find_domain(const char *p)
 rc = libxl_domain_qualifier_to_domid(ctx, p, );
 if (rc) {
 fprintf(stderr, "%s is an invalid domain identifier (rc=%d)\n", p, rc);
-exit(2);
+exit(EXIT_FAILURE);
 }
 common_domname = libxl_domid_to_name(ctx, domid);
 return domid;
@@ -221,7 +221,7 @@ static void autoconnect_vncviewer(uint32_t domid, int 
autopass)
 
 sleep(1);
 vncviewer(domid, autopass);
-_exit(1);
+_exit(EXIT_FAILURE);
 }
 
 static int acquire_lock(void)
@@ -435,7 +435,7 @@ static void flush_stream(FILE *fh)
 
 if (ferror(fh) || fflush(fh)) {
 perror(fh_name);
-exit(-1);
+exit(EXIT_FAILURE);
 }
 }
 
@@ -3735,7 +3735,7 @@ int main_console(int argc, char **argv)
 type = LIBXL_CONSOLE_TYPE_SERIAL;
 else {
 fprintf(stderr, "console type supported are: pv, serial\n");
-return 2;
+return EXIT_FAILURE;
 }
 break;
 case 'n':
@@ -3749,7 +3749,7 @@ int main_console(int argc, char **argv)
 else
 libxl_console_exec(ctx, domid, num, type);
 fprintf(stderr, "Unable to attach console\n");
-return 1;
+return EXIT_FAILURE;
 }
 
 int main_vncviewer(int argc, char **argv)
@@ -3771,8 +3771,8 @@ int main_vncviewer(int argc, char **argv)
 domid = find_domain(argv[optind]);
 
 if (vncviewer(domid, autopass))
-return 1;
-return 0;
+return EXIT_FAILURE;
+return EXIT_SUCCESS;
 }
 
 static void pcilist(uint32_t domid)
@@ -4010,10 +4010,10 @@ static void destroy_domain(uint32_t domid, int force)
 fprintf(stderr, "Not destroying domain 0; use -f to force.\n"
 "This can only be done when using a disaggregated "
 "hardware domain and toolstack.\n\n");
-exit(-1);
+exit(EXIT_FAILURE);
 }
 rc = libxl_domain_destroy(ctx, domid, 0);
-if (rc) { fprintf(stderr,"destroy failed (rc=%d)\n",rc); exit(-1); }
+if (rc) { fprintf(stderr,"destroy failed (rc=%d)\n",rc); 
exit(EXIT_FAILURE); }
 }
 
 static void wait_for_domain_deaths(libxl_evgen_domain_death **deathws, int nr)
@@ -4025,7 +4025,7 @@ static void 
wait_for_domain_deaths(libxl_evgen_domain_death **deathws, int nr)
 rc = libxl_event_wait(ctx, , LIBXL_EVENTMASK_ALL, 0,0);
 if (rc) {
 LOG("Failed to get event, quitting (rc=%d)", rc);
-exit(-1);
+exit(EXIT_FAILURE);
 }
 
 switch (event->type) {
@@ -4070,14 +4070,14 @@ static void shutdown_domain(uint32_t domid,
 }
 
 if (rc) {
-fprintf(stderr,"shutdown failed (rc=%d)\n",rc);exit(-1);
+fprintf(stderr,"shutdown failed (rc=%d)\n",rc);exit(EXIT_FAILURE);
 }
 
 if (deathw) {
 rc = libxl_evenable_domain_death(ctx, domid, for_user, deathw);
 if (rc) {
 fprintf(stderr,"wait for death failed (evgen, rc=%d)\n",rc);
-exit(-1);
+exit(EXIT_FAILURE);
 }
 }
 }
@@ -4101,14 +4101,14 @@ static void reboot_domain(uint32_t domid, 
libxl_evgen_domain_death **deathw,
 }
 }
 if (rc) {
-fprintf(stderr,"reboot failed (rc=%d)\n",rc);exit(-1);
+fprintf(stderr,"reboot failed (rc=%d)\n",rc);exit(EXIT_FAILURE);
 }
 
 if (deathw) {
 rc = libxl_evenable_domain_death(ctx, domid, for_user, deathw);
 if (rc) {
 fprintf(stderr,"wait for death failed (evgen, rc=%d)\n",rc);
-exit(-1);
+exit(EXIT_FAILURE);
 }
 }
 }
@@ -4241,12 +4241,12 @@ static void list_domains(bool verbose, bool context, 
bool claim, bool numa,
 if (numa) {
 if (libxl_node_bitmap_alloc(ctx, , 0)) {
 fprintf(stderr, "libxl_node_bitmap_alloc_failed.\n");
-exit(1);
+exit(EXIT_FAILURE);
 }
 if (libxl_get_physinfo(ctx, ) != 0) {
 fprintf(stderr, "libxl_physinfo failed.\n");
 libxl_bitmap_dispose();
-exit(1);
+exit(EXIT_FAILURE);
 }
 
 printf(" NODE Affinity");
@@ -4310,7 +4310,7

[Xen-devel] [PATCH v3 1/6] xl: improve return and exit codes of memory related functions

2016-04-07 Thread Dario Faggioli

From: Harmandeep Kaur 

by making them more consistent with other examples in xl.

While there, make freemem() of boolean return type, which
looks more natural, and add comment explaining why
parse_mem_size_kb() needs to diverge from the pattern.

Signed-off-by: Harmandeep Kaur 
Signed-off-by: Dario Faggioli 
---
Cc: Ian Jackson 
Cc: Wei Liu 
---
v3: Shorten changelog.
Make freemem() boolean return type.

v2: Add comment to explain return vaule of parse_mem_size_kb().
Add freemem() and main_sharing().
Remove find_domain().
---
 tools/libxl/xl_cmdimpl.c |   39 ++-
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 2ee6c74..9612b00 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -2680,40 +2680,45 @@ static int preserve_domain(uint32_t *r_domid, 
libxl_event *event,
 return rc == 0 ? 1 : 0;
 }
 
-static int freemem(uint32_t domid, libxl_domain_build_info *b_info)
+/*
+ * Returns false if memory can't be freed, but also if we encounter errors.
+ * Returns true in case there is already, or we manage to free it, enough
+ * memory, but also if autoballoon is false.
+ */
+static bool freemem(uint32_t domid, libxl_domain_build_info *b_info)
 {
 int rc, retries = 3;
 uint32_t need_memkb, free_memkb;
 
 if (!autoballoon)
-return 0;
+return true;
 
 rc = libxl_domain_need_memory(ctx, b_info, _memkb);
 if (rc < 0)
-return rc;
+return false;
 
 do {
 rc = libxl_get_free_memory(ctx, _memkb);
 if (rc < 0)
-return rc;
+return false;
 
 if (free_memkb >= need_memkb)
-return 0;
+return true;
 
 rc = libxl_set_memory_target(ctx, 0, free_memkb - need_memkb, 1, 0);
 if (rc < 0)
-return rc;
+return false;
 
 /* wait until dom0 reaches its target, as long as we are making
  * progress */
 rc = libxl_wait_for_memory_target(ctx, 0, 10);
 if (rc < 0)
-return rc;
+return false;
 
 retries--;
 } while (retries > 0);
 
-return ERROR_NOMEM;
+return false;
 }
 
 static void autoconnect_console(libxl_ctx *ctx_ignored,
@@ -2980,8 +2985,7 @@ start:
 goto error_out;
 
 if (domid_soft_reset == INVALID_DOMID) {
-ret = freemem(domid, _config.b_info);
-if (ret < 0) {
+if (!freemem(domid, _config.b_info)) {
 fprintf(stderr, "failed to free memory for the domain\n");
 ret = ERROR_FAIL;
 goto error_out;
@@ -3245,6 +3249,7 @@ void help(const char *command)
 }
 }
 
+/* Returns -1 on failure; the amount of memory on success. */
 static int64_t parse_mem_size_kb(const char *mem)
 {
 char *endptr;
@@ -3391,7 +3396,7 @@ static int set_memory_max(uint32_t domid, const char *mem)
 memorykb = parse_mem_size_kb(mem);
 if (memorykb == -1) {
 fprintf(stderr, "invalid memory size: %s\n", mem);
-exit(3);
+exit(EXIT_FAILURE);
 }
 
 if (libxl_domain_setmaxmem(ctx, domid, memorykb)) {
@@ -3425,7 +3430,7 @@ static int set_memory_target(uint32_t domid, const char 
*mem)
 memorykb = parse_mem_size_kb(mem);
 if (memorykb == -1)  {
 fprintf(stderr, "invalid memory size: %s\n", mem);
-exit(3);
+exit(EXIT_FAILURE);
 }
 
 if (libxl_set_memory_target(ctx, domid, memorykb, 0, /* enforce */ 1)) {
@@ -6132,7 +6137,7 @@ int main_sharing(int argc, char **argv)
 info = libxl_list_domain(ctx, _domain);
 if (!info) {
 fprintf(stderr, "libxl_list_domain failed.\n");
-return 1;
+return EXIT_FAILURE;
 }
 info_free = info;
 } else if (optind == argc-1) {
@@ -6141,17 +6146,17 @@ int main_sharing(int argc, char **argv)
 if (rc == ERROR_DOMAIN_NOTFOUND) {
 fprintf(stderr, "Error: Domain \'%s\' does not exist.\n",
 argv[optind]);
-return -rc;
+return EXIT_FAILURE;
 }
 if (rc) {
 fprintf(stderr, "libxl_domain_info failed (code %d).\n", rc);
-return -rc;
+return EXIT_FAILURE;
 }
 info = _buf;
 nb_domain = 1;
 } else {
 help("sharing");
-return 2;
+return EXIT_FAILURE;
 }
 
 sharing(info, nb_domain);
@@ -6161,7 +6166,7 @@ int main_sharing(int argc, char **argv)
 else
 libxl_dominfo_dispose(info);
 
-return 0;
+return EXIT_SUCCESS;
 }
 
 static int sched_domain_get(libxl_scheduler sched, int domid,


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 0/6] xl: convert exit codes related to domain subcommands to EXIT_[SUCCESS|FAILURE]

2016-04-07 Thread Dario Faggioli

Hi tools maintainers,

So, this series from Harmandeep seems to have fallen through the cracks (and
that's my fault, as I said I'd have handled it). In any case, it was posted
before last posting day, it looks simple enough to me and it's a nice cleanup.

This is about using EXIT_SUCCESS or EXIT_FAILURE when calling exit(), in xl.
main_foo() functions are treated as they were main(), and so they also return
EXIT_SUCCESS or EXIT_FAILURE. Internal functions are also refactored. The idea
was to have them return either 0 or 1, but there needs to be exceptions (that
are documented). TBF, this series mostly deals with exit codes, and the amount
of internal function refactoring is rather limited.

Unfortunately, even after this series, there will still be more similar cleanup
to do... but that would have to wait 4.8.

Version 2 was already almost ok, but there were a few adjustments needed.
Therefore, instead than asking for them, I've done them myself. That basically
means that I've reorganized the series a bit, squashing some patches together.
I've also reduced the scope of "xl: improve return and exit codes of dom create
related functions" to only deal with exit()-s from within that function, as
quite a few more work than what was being done in that patch in v2 was
necessary to propetly refactoring the internal return paths.

I've also added patch "xl: make return type of create_domain() more
consistent.", doing what we agreed upon on IRC (about making the return type
'int'). I could have folded that change in the other patch, but I thought the
thing would deserve its own changelog.

v2 is here:
 http://lists.xenproject.org/archives/html/xen-devel/2016-03/msg01037.html

and there is a git branch with the series applied here:
 git://xenbits.xen.org/people/dariof/xen.git   rel/xl/exit-code-cleanup-v3
 
http://xenbits.xen.org/gitweb/?p=people/dariof/xen.git;a=shortlog;h=refs/heads/rel/xl/exit-code-cleanup-v3

Let me know what you think.

Thanks and Regards,
Dario
---
Dario Faggioli (2):
  xl: make return type of create_domain() more consistent.
  xl: improve return and exit codes of dom create related functions

Harmandeep Kaur (4):
  xl: improve return and exit codes of memory related functions
  xl: improve exit codes of save/restore and migration functions
  xl: improve exit codes of some of the domain handling functions
  xl: improve exit codes of debug related functions


 tools/libxl/xl_cmdimpl.c |  241 +++---
 1 file changed, 123 insertions(+), 118 deletions(-)
--
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v9 2/3] VT-d: wrap a _sync version for all VT-d flush interfaces

2016-04-07 Thread Xu, Quan

On April 07, 2016 11:29pm, Jan Beulich  wrote:
> >>> On 07.04.16 at 09:44,  wrote:
> > On April 05, 2016 5:35pm, Jan Beulich  wrote:
> >> >>> On 01.04.16 at 16:47,  wrote:
> >> > +{
> >> > +queue_invalidate_context(iommu, did, source_id,
> >> > + function_mask, granu);
> >> > +
> >> > +return invalidate_sync(iommu); }
> >>
> >> Further down you replace the only call to
> >> queue_invalidate_context() - why keep both functions instead of just
> >> making
> > the
> >> existing one do the sync? (That would the likely also apply to
> >> qinval_device_iotlb() and others below.)
> >>
> >
> > It is optional.
> >  I think:
> > 1. in the long term, we may need no _sync version.
> > 2. At least, the current wrap looks good to me. e.g.
> > queue_invalidate_context() is for context-cache Invalidate Descriptor,
> > and the
> > invalidate_sync() is for Invalidation Wait Descriptor. It is much clearer.
> 
> I don't really agree, but will leave it to the VT-d maintainers to judge.
> 

+to Kevin and Feng, I am open for it.


> >> > +if ( ret )
> >> > +return ret;
> >> > +
> >> >  if ( flush_dev_iotlb )
> >> >  ret = dev_invalidate_iotlb(iommu, did, addr, size_order,
> type);
> >> > -rc = invalidate_sync(iommu);
> >> > -if ( !ret )
> >> > -ret = rc;
> >> >  }
> >>
> >> I think leaving the existing logic as is would be better - best
> >> effort
> > invalidation
> >> even when an error has occurred.
> >>
> >
> > I have an open:
> > As vt-d spec(:Queued Invalidation Ordering Considerations) said,
> >  1. If the Fence(FN) flag is 1 in a inv_wait_dsc, hardware must
> > execute descriptors following the inv_wait_dsc only after wait command is
> completed.
> >  2. when a Device-TLB invalidation timeout is detected, hardware
> > must not complete any pending inv_wait_dsc commands.
> > In current code, the Fence(FN) is always 1.
> > if a Device-TLB invalidation timeout is detected, this additional
> > inv_wait_dsc is not completed.
> > __iiuc__,
> > the new coming descriptors, in that queue, _might_ be not executed any
> > more, waiting for this additional inv_wait_dsc which is not completed.
> > is it true?
> 
> That's not a question to me, is it?

To community, but vt-d maintainers are someone who can explain to me.

Quan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 19/28] build_id: Provide ld-embedded build-ids

2016-04-07 Thread Konrad Rzeszutek Wilk

On Thu, Apr 07, 2016 at 08:18:27PM -0400, Konrad Rzeszutek Wilk wrote:
> > > +build_id.o: $(TARGET)-syms
> > > + $(OBJCOPY) -O binary --only-section=.note $(BASEDIR)/xen-syms $@.bin
> > 
> > Considering your xen.lds.S changes, won't this potentially copy quite
> > a bit more than just the build ID (i.e. all notes)? This may be okay, and
> > may be even intended, but then the generate file's name should
> > reflect this to not misguide people.
> 
> I changed it notes.o
> 
> > 
> > > + $(OBJCOPY) -I binary -O elf64-x86-64 -B i386:x86-64 \
> > > + --rename-section=.data=.note.gnu.build-id -S $@.bin $@
> > 
> > Since you put the notes into .rodata anyway, why name the
> > section .note.*? Just name it .rodata.*, avoiding to mislead
> > others who also think that a section's name has much of a
> > meaning.
> 
> Way back last year:
> http://lists.xenproject.org/archives/html/xen-devel/2016-01/msg01264.html
> 
> which is where the .note came about. I can put it all in .rodata
> and not have it for normal ELF builds if you would like.

.rodata.notes for both ELF and EFI looks to have done the trick.
Now it is just the matter of testing it.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Bug in x86 instruction emulator?

2016-04-07 Thread wogiz


On 2016-04-07 04:04, Jan Beulich wrote:

We'd need to know which exact exception (including error code and,
in the case of #PF, CR2 value) gets raised to the guest by what
specific piece of code in the hypervisor. That'll likely mean some
instrumentation of the hypervisor code.

Jan


No problem with instrumentation as it's not a production system.

I don't have the knowledge to write code, but if a patch can be created 
or if only a few lines of code is required, I can edit source files 
manually, rebuild, install Xen and collect the required information.


William Z.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 13/28] xsplice, symbols: Implement symbol name resolution on address.

2016-04-07 Thread Konrad Rzeszutek Wilk

On Thu, Apr 07, 2016 at 09:46:49AM -0600, Jan Beulich wrote:
> >>> On 07.04.16 at 05:14,  wrote:
> > On Fri, Apr 01, 2016 at 09:11:40AM -0600, Jan Beulich wrote:
> >> >>> On 24.03.16 at 21:00,  wrote:
> >> > --- a/xen/arch/x86/Makefile
> >> > +++ b/xen/arch/x86/Makefile
> >> > @@ -113,12 +113,14 @@ $(TARGET)-syms: prelink.o xen.lds 
> >> > $(BASEDIR)/common/symbols-dummy.o
> >> >  $(LD) $(LDFLAGS) -T xen.lds -N prelink.o \
> >> >  $(BASEDIR)/common/symbols-dummy.o -o $(@D)/.$(@F).0
> >> >  $(NM) -pa --format=sysv $(@D)/.$(@F).0 \
> >> > -| $(BASEDIR)/tools/symbols --sysv --sort 
> >> > >$(@D)/.$(@F).0.S
> >> > +| $(BASEDIR)/tools/symbols --all-symbols --sysv --sort \
> >> > +>$(@D)/.$(@F).0.S
> >> >  $(MAKE) -f $(BASEDIR)/Rules.mk $(@D)/.$(@F).0.o
> >> >  $(LD) $(LDFLAGS) -T xen.lds -N prelink.o \
> >> >  $(@D)/.$(@F).0.o -o $(@D)/.$(@F).1
> >> >  $(NM) -pa --format=sysv $(@D)/.$(@F).1 \
> >> > -| $(BASEDIR)/tools/symbols --sysv --sort --warn-dup 
> >> > >$(@D)/.$(@F).1.S
> >> > +| $(BASEDIR)/tools/symbols --all-symbols --sysv --sort 
> >> > --warn-dup \
> >> > +>$(@D)/.$(@F).1.S
> >> 
> >> This addition should be dependent on CONFIG_XSPLICE, not the
> >> least because I expect it to bloat the symbol table quite a bit. And
> >> then - how come this is needed here, but not in the xen.efi rule?
> > 
> > I added it to xen.efi rule and got:
> > 
> > home/konrad/xen/xen/.xen.efi.0s.S: Assembler messages:
> > /home/konrad/xen/xen/.xen.efi.0s.S:21: Warning: value 0x7d2f8543 
> > truncated to 0x8543
> > /home/konrad/xen/xen/.xen.efi.0s.S:22: Warning: value 0x7d2f88b2 
> > truncated to 0x88b2
> > /home/konrad/xen/xen/.xen.efi.0s.S:23: Warning: value 0x7d2f88b4 
> > truncated to 0x88b4
> > /home/konrad/xen/xen/.xen.efi.0s.S:24: Warning: value 0x7d2f88b9 
> > truncated to 0x88b9
> > /home/konrad/xen/xen/.xen.efi.0s.S:25: Warning: value 0x7d2f8000103f 
> > truncated to 0x8000103f
> > /home/konrad/xen/xen/.xen.efi.0s.S:26: Warning: value 0x7d2f80001043 
> > truncated to 0x80001043
> > /home/konrad/xen/xen/.xen.efi.0s.S:27: Warning: value 0x7d2f80001047 
> > truncated to 0x80001047
> > /home/konrad/xen/xen/.xen.efi.0s.S:6746: Warning: value 0x10065 
> > truncated to 0x65
> > 
> > and so on.. Not sure why. The xen.efi file boots thought?
> 
> It's the kallsyms symbol table that suffers, so the image booting
> fine is not really surprising. But we'd need to understand what
> specific data objects these warnings originate from - perhaps
> linker generated symbols not sitting inside sections?

From the .xen.efi.0s.S:
.globl symbols_offsets
ALGN
symbols_offsets:
#endif
PTR 0x544 - SYMBOLS_ORIGIN
PTR 0xa01 - SYMBOLS_ORIGIN
PTR 0xa03 - SYMBOLS_ORIGIN
PTR 0xa08 - SYMBOLS_ORIGIN
PTR 0x118e - SYMBOLS_ORIGIN
PTR 0x1192 - SYMBOLS_ORIGIN
PTR 0x1196 - SYMBOLS_ORIGIN
PTR 0x82d08010 - SYMBOLS_ORIGIN
PTR 0x82d08010 - SYMBOLS_ORIGIN

which corresponds to:
multiboot1_header_start|0544|   ?  |  | | |
multiboot1_header_start|0a01|   ?  |  | | |
multiboot1_header_start|0a03|   ?  |  | | |
multiboot1_header_start|0a08|   ?  |  | | |
multiboot1_header_start|118e|   ?  |  | | |
multiboot1_header_start|1192|   ?  |  | | |
multiboot1_header_start|1196|   ?  |  | | |

which is also found:
multiboot1_header_start|82d08018|   t  |  | | | 

> 
> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 10/11] xen: sched: privde some scratch space for not putting cpumasks on stack

2016-04-07 Thread Dario Faggioli

directly, from schedule.c, for any scheduler that needs
it to use it.

In fact, Credit1 and RTDS needs this already. Credit2 is
also going to need it, for supporting hard affinity
(which is, typically, what requires a lot of cpumask
manipulations, inside various functions).

Therefore, let's define the scratch space at a broader
scope, to limit code duplication in handling it.

Signed-off-by: Dario Faggioli 
Reviewed-by: George Dunlap 
---
Cc: Andrew Cooper 
---
Changes from v1:
* scratch space for cpumask is not "global", and defined
  in schedule.c, as suggested during review.
---
 xen/common/sched_credit.c  |   34 ++---
 xen/common/sched_credit2.c |1 +
 xen/common/sched_rt.c  |   59 +++-
 xen/common/schedule.c  |8 ++
 xen/include/xen/sched-if.h |4 +++
 5 files changed, 25 insertions(+), 81 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 490f10b..bc36837 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -171,20 +171,9 @@ struct csched_pcpu {
 struct timer ticker;
 unsigned int tick;
 unsigned int idle_bias;
-/* Store this here to avoid having too many cpumask_var_t-s on stack */
-cpumask_var_t balance_mask;
 };
 
 /*
- * Convenience macro for accessing the per-PCPU cpumask we need for
- * implementing the two steps (soft and hard affinity) balancing logic.
- * It is stored in csched_pcpu so that serialization is not an issue,
- * as there is a csched_pcpu for each PCPU, and we always hold the
- * runqueue lock for the proper PCPU when using this.
- */
-#define csched_balance_mask(c) (CSCHED_PCPU(c)->balance_mask)
-
-/*
  * Virtual CPU
  */
 struct csched_vcpu {
@@ -416,10 +405,10 @@ static inline void __runq_tickle(struct csched_vcpu *new)
 
 /* Are there idlers suitable for new (for this balance step)? */
 csched_balance_cpumask(new->vcpu, balance_step,
-   csched_balance_mask(cpu));
-cpumask_and(csched_balance_mask(cpu),
-csched_balance_mask(cpu), _mask);
-new_idlers_empty = cpumask_empty(csched_balance_mask(cpu));
+   cpumask_scratch_cpu(cpu));
+cpumask_and(cpumask_scratch_cpu(cpu),
+cpumask_scratch_cpu(cpu), _mask);
+new_idlers_empty = cpumask_empty(cpumask_scratch_cpu(cpu));
 
 /*
  * Let's not be too harsh! If there aren't idlers suitable
@@ -445,8 +434,8 @@ static inline void __runq_tickle(struct csched_vcpu *new)
 if ( new_idlers_empty && new->pri > cur->pri )
 {
 csched_balance_cpumask(cur->vcpu, balance_step,
-   csched_balance_mask(cpu));
-if ( cpumask_intersects(csched_balance_mask(cpu),
+   cpumask_scratch_cpu(cpu));
+if ( cpumask_intersects(cpumask_scratch_cpu(cpu),
 _mask) )
 {
 SCHED_VCPU_STAT_CRANK(cur, kicked_away);
@@ -519,7 +508,6 @@ csched_free_pdata(const struct scheduler *ops, void *pcpu, 
int cpu)
 
 spin_unlock_irqrestore(>lock, flags);
 
-free_cpumask_var(spc->balance_mask);
 xfree(spc);
 }
 
@@ -533,12 +521,6 @@ csched_alloc_pdata(const struct scheduler *ops, int cpu)
 if ( spc == NULL )
 return ERR_PTR(-ENOMEM);
 
-if ( !alloc_cpumask_var(>balance_mask) )
-{
-xfree(spc);
-return ERR_PTR(-ENOMEM);
-}
-
 return spc;
 }
 
@@ -1592,9 +1574,9 @@ csched_runq_steal(int peer_cpu, int cpu, int pri, int 
balance_step)
  && !__vcpu_has_soft_affinity(vc, vc->cpu_hard_affinity) )
 continue;
 
-csched_balance_cpumask(vc, balance_step, csched_balance_mask(cpu));
+csched_balance_cpumask(vc, balance_step, cpumask_scratch_cpu(cpu));
 if ( __csched_vcpu_is_migrateable(vc, cpu,
-  csched_balance_mask(cpu)) )
+  cpumask_scratch_cpu(cpu)) )
 {
 /* We got a candidate. Grab it! */
 TRACE_3D(TRC_CSCHED_STOLEN_VCPU, peer_cpu,
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index d44cc3d..8617c9b 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -2259,6 +2259,7 @@ csched2_init(struct scheduler *ops)
 if ( prv == NULL )
 return -ENOMEM;
 ops->sched_data = prv;
+
 spin_lock_init(>lock);
 INIT_LIST_HEAD(>sdom);
 
diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 3bb8c71..673fc92 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -155,24 +155,6 @@
 #define TRC_RTDS_BUDGET_REPLENISH

[Xen-devel] [PATCH v3 11/11] xen: sched: implement vcpu hard affinity in Credit2

2016-04-07 Thread Dario Faggioli

From: Justin Weaver 

as it was still missing.

Note that this patch "only" implements hard affinity,
i.e., the possibility of specifying on what pCPUs a
certain vCPU can run. Soft affinity (which express a
preference for vCPUs to run on certain pCPUs) is still
not supported by Credit2, even after this patch.

Signed-off-by: Justin Weaver 
Signed-off-by: Dario Faggioli 
Acked-by: George Dunlap 
---
 xen/common/sched_credit2.c |  131 ++--
 1 file changed, 102 insertions(+), 29 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 8617c9b..59eb3db 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -318,6 +318,36 @@ struct csched2_dom {
 uint16_t nr_vcpus;
 };
 
+/*
+ * When a hard affinity change occurs, we may not be able to check some
+ * (any!) of the other runqueues, when looking for the best new processor
+ * for svc (as trylock-s in choose_cpu() can fail). If that happens, we
+ * pick, in order of decreasing preference:
+ *  - svc's current pcpu;
+ *  - another pcpu from svc's current runq;
+ *  - any cpu.
+ */
+static int get_fallback_cpu(struct csched2_vcpu *svc)
+{
+int cpu;
+
+if ( likely(cpumask_test_cpu(svc->vcpu->processor,
+ svc->vcpu->cpu_hard_affinity)) )
+return svc->vcpu->processor;
+
+cpumask_and(cpumask_scratch, svc->vcpu->cpu_hard_affinity,
+>rqd->active);
+cpu = cpumask_first(cpumask_scratch);
+if ( likely(cpu < nr_cpu_ids) )
+return cpu;
+
+cpumask_and(cpumask_scratch, svc->vcpu->cpu_hard_affinity,
+cpupool_domain_cpumask(svc->vcpu->domain));
+
+ASSERT(!cpumask_empty(cpumask_scratch));
+
+return cpumask_first(cpumask_scratch);
+}
 
 /*
  * Time-to-credit, credit-to-time.
@@ -551,8 +581,9 @@ runq_tickle(const struct scheduler *ops, unsigned int cpu, 
struct csched2_vcpu *
 goto tickle;
 }
 
-/* Get a mask of idle, but not tickled */
+/* Get a mask of idle, but not tickled, that new is allowed to run on. */
 cpumask_andnot(, >idle, >tickled);
+cpumask_and(, , new->vcpu->cpu_hard_affinity);
 
 /* If it's not empty, choose one */
 i = cpumask_cycle(cpu, );
@@ -563,9 +594,11 @@ runq_tickle(const struct scheduler *ops, unsigned int cpu, 
struct csched2_vcpu *
 }
 
 /* Otherwise, look for the non-idle cpu with the lowest credit,
- * skipping cpus which have been tickled but not scheduled yet */
+ * skipping cpus which have been tickled but not scheduled yet,
+ * that new is allowed to run on. */
 cpumask_andnot(, >active, >idle);
 cpumask_andnot(, , >tickled);
+cpumask_and(, , new->vcpu->cpu_hard_affinity);
 
 for_each_cpu(i, )
 {
@@ -1115,9 +1148,8 @@ choose_cpu(const struct scheduler *ops, struct vcpu *vc)
 d2printk("%pv -\n", svc->vcpu);
 clear_bit(__CSFLAG_runq_migrate_request, >flags);
 }
-/* Leave it where it is for now.  When we actually pay attention
- * to affinity we'll have to figure something out... */
-return vc->processor;
+
+return get_fallback_cpu(svc);
 }
 
 /* First check to see if we're here because someone else suggested a place
@@ -1128,45 +1160,56 @@ choose_cpu(const struct scheduler *ops, struct vcpu *vc)
 {
 printk("%s: Runqueue migrate aborted because target runqueue 
disappeared!\n",
__func__);
-/* Fall-through to normal cpu pick */
 }
 else
 {
-d2printk("%pv +\n", svc->vcpu);
-new_cpu = cpumask_cycle(vc->processor, >migrate_rqd->active);
-goto out_up;
+cpumask_and(cpumask_scratch, vc->cpu_hard_affinity,
+>migrate_rqd->active);
+new_cpu = cpumask_any(cpumask_scratch);
+if ( new_cpu < nr_cpu_ids )
+{
+d2printk("%pv +\n", svc->vcpu);
+goto out_up;
+}
 }
+/* Fall-through to normal cpu pick */
 }
 
-/* FIXME: Pay attention to cpu affinity */ 
 
-
 min_avgload = MAX_LOAD;
 
 /* Find the runqueue with the lowest instantaneous load */
 for_each_cpu(i, >active_queues)
 {
 struct csched2_runqueue_data *rqd;
-s_time_t rqd_avgload;
+s_time_t rqd_avgload = MAX_LOAD;
 
 rqd = prv->rqd + i;
 
-/* If checking a different runqueue, grab the lock,
- * read the avg, and then release the lock.
+/*
+ * If checking a different runqueue, grab the lock, check hard
+ * affinity, read the avg, and then release the lock.
  *
  * If on our own runqueue, don't grab or release the lock;
  * but

[Xen-devel] [PATCH v3 08/11] xen: sched: allow for choosing credit2 runqueues configuration at boot

2016-04-07 Thread Dario Faggioli

In fact, credit2 uses CPU topology to decide how to arrange
its internal runqueues. Before this change, only 'one runqueue
per socket' was allowed. However, experiments have shown that,
for instance, having one runqueue per physical core improves
performance, especially in case hyperthreading is available.

In general, it makes sense to allow users to pick one runqueue
arrangement at boot time, so that:
 - more experiments can be easily performed to even better
   assess and improve performance;
 - one can select the best configuration for his specific
   use case and/or hardware.

This patch enables the above.

Note that, for correctly arranging runqueues to be per-core,
just checking cpu_to_core() on the host CPUs is not enough.
In fact, cores (and hyperthreads) on different sockets, can
have the same core (and thread) IDs! We, therefore, need to
check whether the full topology of two CPUs matches, for
them to be put in the same runqueue.

Note also that the default (although not functional) for
credit2, since now, has been per-socket runqueue. This patch
leaves things that way, to avoid mixing policy and technical
changes.

Finally, it would be a nice feature to be able to select
a particular runqueue arrangement, even when creating a
Credit2 cpupool. This is left as future work.

Signed-off-by: Dario Faggioli 
Signed-off-by: Uma Sharma 
---
Cc: George Dunlap 
Cc: Uma Sharma 
Cc: Juergen Gross 
---
Changes from v2:
 * valid strings  are now in an array, that we scan during
   parameter parsing, as suggested during review.

Cahnges from v1:
 * fix bug in parameter parsing, and start using strcmp()
   for that, as requested during review.
---
 docs/misc/xen-command-line.markdown |   19 
 xen/common/sched_credit2.c  |   83 +--
 2 files changed, 97 insertions(+), 5 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index ca77e3b..0047f94 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -469,6 +469,25 @@ combination with the `low_crashinfo` command line option.
 ### credit2\_load\_window\_shift
 > `= `
 
+### credit2\_runqueue
+> `= core | socket | node | all`
+
+> Default: `socket`
+
+Specify how host CPUs are arranged in runqueues. Runqueues are kept
+balanced with respect to the load generated by the vCPUs running on
+them. Smaller runqueues (as in with `core`) means more accurate load
+balancing (for instance, it will deal better with hyperthreading),
+but also more overhead.
+
+Available alternatives, with their meaning, are:
+* `core`: one runqueue per each physical core of the host;
+* `socket`: one runqueue per each physical socket (which often,
+but not always, matches a NUMA node) of the host;
+* `node`: one runqueue per each NUMA node of the host;
+* `all`: just one runqueue shared by all the logical pCPUs of
+ the host
+
 ### dbgp
 > `= ehci[  | @pci:. ]`
 
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index a61a45a..eeb3f54 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -81,10 +81,6 @@
  * Credits are "reset" when the next vcpu in the runqueue is less than
  * or equal to zero.  At that point, everyone's credits are "clipped"
  * to a small value, and a fixed credit is added to everyone.
- *
- * The plan is for all cores that share an L2 will share the same
- * runqueue.  At the moment, there is one global runqueue for all
- * cores.
  */
 
 /*
@@ -193,6 +189,63 @@ static int __read_mostly opt_overload_balance_tolerance = 
-3;
 integer_param("credit2_balance_over", opt_overload_balance_tolerance);
 
 /*
+ * Runqueue organization.
+ *
+ * The various cpus are to be assigned each one to a runqueue, and we
+ * want that to happen basing on topology. At the moment, it is possible
+ * to choose to arrange runqueues to be:
+ *
+ * - per-core: meaning that there will be one runqueue per each physical
+ * core of the host. This will happen if the opt_runqueue
+ * parameter is set to 'core';
+ *
+ * - per-node: meaning that there will be one runqueue per each physical
+ * NUMA node of the host. This will happen if the opt_runqueue
+ * parameter is set to 'node';
+ *
+ * - per-socket: meaning that there will be one runqueue per each physical
+ *   socket (AKA package, which often, but not always, also
+ *   matches a NUMA node) of the host; This will happen if
+ *   the opt_runqueue parameter is set to 'socket';
+ *
+ * - global: meaning that there will be only one runqueue to which all the
+ *   (logical) processors of the host belongs. This will happen if
+ *   the opt_runqueue parameter is set to 'all'.
+ *
+ * Depending on the value of opt_runqueue, therefore, cpus that

[Xen-devel] [PATCH v3 09/11] xen: sched: per-core runqueues as default in credit2

2016-04-07 Thread Dario Faggioli

Experiments have shown that arranging the scheduing
runqueues on a per-core basis yields better results,
in most cases.

Such evaluation has been done, for the first time,
by Uma Sharma, during her participation to OPW. Some
of the results she got are summarized here:

 http://lists.xen.org/archives/html/xen-devel/2015-03/msg01499.html

Signed-off-by: Dario Faggioli 
Signed-off-by: Uma Sharma 
Reviewed-by: Juergen Gross 
Acked-by: George Dunlap 
---
 docs/misc/xen-command-line.markdown |2 +-
 xen/common/sched_credit2.c  |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index 0047f94..5d801b8 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -472,7 +472,7 @@ combination with the `low_crashinfo` command line option.
 ### credit2\_runqueue
 > `= core | socket | node | all`
 
-> Default: `socket`
+> Default: `core`
 
 Specify how host CPUs are arranged in runqueues. Runqueues are kept
 balanced with respect to the load generated by the vCPUs running on
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index eeb3f54..d44cc3d 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -226,7 +226,7 @@ static const char *const opt_runqueue_str[] = {
 [OPT_RUNQUEUE_NODE] = "node",
 [OPT_RUNQUEUE_ALL] = "all"
 };
-static int __read_mostly opt_runqueue = OPT_RUNQUEUE_SOCKET;
+static int __read_mostly opt_runqueue = OPT_RUNQUEUE_CORE;
 
 static void parse_credit2_runqueue(const char *s)
 {


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 05/11] xen: sched: improve credit2 bootparams' scope, placement and signedness

2016-04-07 Thread Dario Faggioli

From: Uma Sharma 

and, while we are adjusting signedness of opt_load_window_shift,
make also prv->load_window_shift unsigned, as approapriate.

Signed-off-by: Uma Sharma 
Signed-off-by: Dario Faggioli 
Reviewed-by: Juergen Gross 
Acked-by: George Dunlap 
---
Cc: Jan Beulich 
---
 xen/common/sched_credit2.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 60c6f5b..7286e50 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -162,7 +162,7 @@
 #define CSFLAG_runq_migrate_request (1<<__CSFLAG_runq_migrate_request)
 
 
-int opt_migrate_resist=500;
+static unsigned int __read_mostly opt_migrate_resist = 500;
 integer_param("sched_credit2_migrate_resist", opt_migrate_resist);
 
 /*
@@ -185,12 +185,12 @@ integer_param("sched_credit2_migrate_resist", 
opt_migrate_resist);
  *   to a load of 1.
  */
 #define LOADAVG_GRANULARITY_SHIFT (10)
-int opt_load_window_shift=18;
+static unsigned int __read_mostly opt_load_window_shift = 18;
 #define  LOADAVG_WINDOW_SHIFT_MIN 4
 integer_param("credit2_load_window_shift", opt_load_window_shift);
-int opt_underload_balance_tolerance=0;
+static int __read_mostly opt_underload_balance_tolerance = 0;
 integer_param("credit2_balance_under", opt_underload_balance_tolerance);
-int opt_overload_balance_tolerance=-3;
+static int __read_mostly opt_overload_balance_tolerance = -3;
 integer_param("credit2_balance_over", opt_overload_balance_tolerance);
 
 /*
@@ -227,7 +227,7 @@ struct csched2_private {
 cpumask_t active_queues; /* Queues which may have active cpus */
 struct csched2_runqueue_data rqd[NR_CPUS];
 
-int load_window_shift;
+unsigned int load_window_shift;
 };
 
 /*


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 04/11] xen: sched: close potential races when switching scheduler to CPUs

2016-04-07 Thread Dario Faggioli

In short, the point is making sure that the actual switch
of scheduler and the remapping of the scheduler's runqueue
lock occur in the same critical section, protected by the
"old" scheduler's lock (and not, e.g., in the free_pdata
hook, as it is now for Credit2 and RTDS).

Not doing  so, is (at least) racy. In fact, for instance,
if we switch cpu X from, Credit2 to Credit, we do:

 schedule_cpu_switch(x, csched2 --> csched):
   //scheduler[x] is csched2
   //schedule_lock[x] is csched2_lock
   csched_alloc_pdata(x)
   csched_init_pdata(x)
   pcpu_schedule_lock(x) > takes csched2_lock
   scheduler[X] = csched
   pcpu_schedule_unlock(x) --> unlocks csched2_lock
   [1]
   csched2_free_pdata(x)
 pcpu_schedule_lock(x) --> takes csched2_lock
 schedule_lock[x] = csched_lock
 spin_unlock(csched2_lock)

While, if we switch cpu X from, Credit to Credit2, we do:

 schedule_cpu_switch(X, csched --> csched2):
   //scheduler[x] is csched
   //schedule_lock[x] is csched_lock
   csched2_alloc_pdata(x)
   csched2_init_pdata(x)
 pcpu_schedule_lock(x) --> takes csched_lock
 schedule_lock[x] = csched2_lock
 spin_unlock(csched_lock)
   [2]
   pcpu_schedule_lock(x) > takes csched2_lock
   scheduler[X] = csched2
   pcpu_schedule_unlock(x) --> unlocks csched2_lock
   csched_free_pdata(x)

And if we switch cpu X from RTDS to Credit2, we do:

 schedule_cpu_switch(X, RTDS --> csched2):
   //scheduler[x] is rtds
   //schedule_lock[x] is rtds_lock
   csched2_alloc_pdata(x)
   csched2_init_pdata(x)
 pcpu_schedule_lock(x) --> takes rtds_lock
 schedule_lock[x] = csched2_lock
 spin_unlock(rtds_lock)
   pcpu_schedule_lock(x) > takes csched2_lock
   scheduler[x] = csched2
   pcpu_schedule_unlock(x) --> unlocks csched2_lock
   rtds_free_pdata(x)
 spin_lock(rtds_lock)
 ASSERT(schedule_lock[x] == rtds_lock) [3]
 schedule_lock[x] = DEFAULT_SCHEDULE_LOCK [4]
 spin_unlock(rtds_lock)

So, the first problem is that, if anything related to
scheduling, and involving CPU, happens at [1] or [2], we:
 - take csched2_lock,
 - operate on Credit1 functions and data structures,
which is no good!

The second problem is that the ASSERT at [3] triggers, and
the third that at [4], we screw up the lock remapping we've
done for ourself in csched2_init_pdata()!

The first problem arises because there is a window during
which the lock is already the new one, but the scheduler is
still the old one. The other two, becase we let schedulers
mess with the lock (re)mapping done by others.

This patch, therefore, introduces a new hook in the scheduler
interface, called switch_sched, meant at being used when
switching scheduler on a CPU, and implements it for the
various schedulers, so that things are done in the proper
order and under the protection of the best suited (set of)
lock(s). It is necessary to add the hook (as compared to
keep doing things in generic code), because different
schedulers may have different locking schemes.

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Josh Whitehead 
Cc: Robert VanVossen 
Cc: Meng Xu 
Cc: Tianyang Chen 
---
Changes from v2:
 - the hook is implemented in ARINC653 too, as necessary and
   as requested during review.

Changes from v1:

new patch, basically, coming from squashing what were
4 patches in v1. In any case, with respect to those 4
patches:
 - runqueue lock is back being taken in schedule_cpu_switch(),
   as suggested during review;
 - add barriers for making sure all initialization is done
   when the new lock is assigned, as sugested during review;
 - add comments and ASSERT-s about how and why the adopted
   locking scheme is safe, as suggested during review.
---
 xen/common/sched_arinc653.c |   34 ++
 xen/common/sched_credit.c   |   44 +++
 xen/common/sched_credit2.c  |   81 ---
 xen/common/sched_rt.c   |   45 +---
 xen/common/schedule.c   |   41 +-
 xen/include/xen/sched-if.h  |3 ++
 6 files changed, 206 insertions(+), 42 deletions(-)

diff --git a/xen/common/sched_arinc653.c b/xen/common/sched_arinc653.c
index b79fcdf..ebd2090 100644
--- a/xen/common/sched_arinc653.c
+++ b/xen/common/sched_arinc653.c
@@ -652,6 +652,38 @@ a653sched_pick_cpu(const struct scheduler *ops, struct 
vcpu *vc)
 }
 
 /**
+ * Xen scheduler callback to change the scheduler of a cpu
+ *
+ * @param new_ops   Pointer to this instance of the scheduler structure
+ * @param cpu   The cpu that is changing scheduler
+ * @param pdata scheduler specific PCPU data (we don't have any)
+ * @param vdata scheduler specific VCPU data of the idle vcpu
+ */
+static void
+a653_switch_sched(struct scheduler *new_ops, unsigned int cpu,
+  void *pdata, void *vdata)
+{
+

[Xen-devel] [PATCH v3 03/11] xen: sched: move pCPU initialization in an helper

2016-04-07 Thread Dario Faggioli

That will turn out useful in following patches, where such
code will need to be called more than just once. Create an
helper now, and move the code there, to avoid mixing code
motion and functional changes later.

In Credit2, some style cleanup is also done.

No functional change intended.

Signed-off-by: Dario Faggioli 
Reviewed-by: George Dunlap 
---
 xen/common/sched_credit.c  |   20 
 xen/common/sched_credit2.c |   26 --
 2 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index f503e73..96a245d 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -543,17 +543,12 @@ csched_alloc_pdata(const struct scheduler *ops, int cpu)
 }
 
 static void
-csched_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
+init_pdata(struct csched_private *prv, struct csched_pcpu *spc, int cpu)
 {
-struct csched_private *prv = CSCHED_PRIV(ops);
-struct csched_pcpu * const spc = pdata;
-unsigned long flags;
-
-/* cpu data needs to be allocated, but STILL uninitialized */
+ASSERT(spin_is_locked(>lock));
+/* cpu data needs to be allocated, but STILL uninitialized. */
 ASSERT(spc && spc->runq.next == NULL && spc->runq.prev == NULL);
 
-spin_lock_irqsave(>lock, flags);
-
 /* Initialize/update system-wide config */
 prv->credit += prv->credits_per_tslice;
 prv->ncpus++;
@@ -576,7 +571,16 @@ csched_init_pdata(const struct scheduler *ops, void 
*pdata, int cpu)
 /* Start off idling... */
 BUG_ON(!is_idle_vcpu(curr_on_cpu(cpu)));
 cpumask_set_cpu(cpu, prv->idlers);
+}
 
+static void
+csched_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
+{
+unsigned long flags;
+struct csched_private *prv = CSCHED_PRIV(ops);
+
+spin_lock_irqsave(>lock, flags);
+init_pdata(prv, pdata, cpu);
 spin_unlock_irqrestore(>lock, flags);
 }
 
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 8a56953..8989eea 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1972,16 +1972,13 @@ static void deactivate_runqueue(struct csched2_private 
*prv, int rqi)
 }
 
 static void
-csched2_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
+init_pdata(struct csched2_private *prv, unsigned int cpu)
 {
 unsigned rqi;
-unsigned long flags;
-struct csched2_private *prv = CSCHED2_PRIV(ops);
 struct csched2_runqueue_data *rqd;
 spinlock_t *old_lock;
 
-spin_lock_irqsave(>lock, flags);
-
+ASSERT(spin_is_locked(>lock));
 ASSERT(!cpumask_test_cpu(cpu, >initialized));
 
 /* Figure out which runqueue to put it in */
@@ -2001,7 +1998,7 @@ csched2_init_pdata(const struct scheduler *ops, void 
*pdata, int cpu)
 BUG();
 }
 
-rqd=prv->rqd + rqi;
+rqd = prv->rqd + rqi;
 
 printk("Adding cpu %d to runqueue %d\n", cpu, rqi);
 if ( ! cpumask_test_cpu(rqi, >active_queues) )
@@ -2011,13 +2008,13 @@ csched2_init_pdata(const struct scheduler *ops, void 
*pdata, int cpu)
 }
 
 /* IRQs already disabled */
-old_lock=pcpu_schedule_lock(cpu);
+old_lock = pcpu_schedule_lock(cpu);
 
 /* Move spinlock to new runq lock.  */
 per_cpu(schedule_data, cpu).schedule_lock = >lock;
 
 /* Set the runqueue map */
-prv->runq_map[cpu]=rqi;
+prv->runq_map[cpu] = rqi;
 
 cpumask_set_cpu(cpu, >idle);
 cpumask_set_cpu(cpu, >active);
@@ -2027,12 +2024,21 @@ csched2_init_pdata(const struct scheduler *ops, void 
*pdata, int cpu)
 
 cpumask_set_cpu(cpu, >initialized);
 
-spin_unlock_irqrestore(>lock, flags);
-
 return;
 }
 
 static void
+csched2_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
+{
+struct csched2_private *prv = CSCHED2_PRIV(ops);
+unsigned long flags;
+
+spin_lock_irqsave(>lock, flags);
+init_pdata(prv, cpu);
+spin_unlock_irqrestore(>lock, flags);
+}
+
+static void
 csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
 {
 unsigned long flags;


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 01/11] xen: sched: make implementing .alloc_pdata optional

2016-04-07 Thread Dario Faggioli

The .alloc_pdata scheduler hook must, before this change,
be implemented by all schedulers --even those ones that
don't need to allocate anything.

Make it possible to just use the SCHED_OP(), like for
the other hooks, by using ERR_PTR() and IS_ERR() for
error reporting. This:
 - makes NULL a variant of success;
 - allows for errors other than ENOMEM to be properly
   communicated (if ever necessary).

This, in turn, means that schedulers not needing to
allocate any per-pCPU data, can avoid implementing the
hook. In fact, the artificial implementation of
.alloc_pdata in the ARINC653 is removed (and, while there,
nuke .free_pdata too, as it is equally useless).

Signed-off-by: Dario Faggioli 
Reviewed-by: Meng Xu 
Reviewed-by: Juergen Gross 
Acked-by: George Dunlap 
---
Cc: Robert VanVossen 
Cc: Josh Whitehead 
Cc: Jan Beulich 
---
Changes from v1:
 * only update sd->sched_priv if alloc_pdata does not return
   IS_ERR, so that xfree() can always be safely called on
   sd->sched_priv itself, as requested during review;
 * xen/err.h included in .c files that actually need it,
   instead than in sched-if.h.
---
 xen/common/sched_arinc653.c |   31 ---
 xen/common/sched_credit.c   |5 +++--
 xen/common/sched_credit2.c  |2 +-
 xen/common/sched_rt.c   |8 
 xen/common/schedule.c   |   27 +--
 5 files changed, 25 insertions(+), 48 deletions(-)

diff --git a/xen/common/sched_arinc653.c b/xen/common/sched_arinc653.c
index 8a11a2f..b79fcdf 100644
--- a/xen/common/sched_arinc653.c
+++ b/xen/common/sched_arinc653.c
@@ -456,34 +456,6 @@ a653sched_free_vdata(const struct scheduler *ops, void 
*priv)
 }
 
 /**
- * This function allocates scheduler-specific data for a physical CPU
- *
- * We do not actually make use of any per-CPU data but the hypervisor expects
- * a non-NULL return value
- *
- * @param ops   Pointer to this instance of the scheduler structure
- *
- * @return  Pointer to the allocated data
- */
-static void *
-a653sched_alloc_pdata(const struct scheduler *ops, int cpu)
-{
-/* return a non-NULL value to keep schedule.c happy */
-return SCHED_PRIV(ops);
-}
-
-/**
- * This function frees scheduler-specific data for a physical CPU
- *
- * @param ops   Pointer to this instance of the scheduler structure
- */
-static void
-a653sched_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
-{
-/* nop */
-}
-
-/**
  * This function allocates scheduler-specific data for a domain
  *
  * We do not actually make use of any per-domain data but the hypervisor
@@ -737,9 +709,6 @@ static const struct scheduler sched_arinc653_def = {
 .free_vdata = a653sched_free_vdata,
 .alloc_vdata= a653sched_alloc_vdata,
 
-.free_pdata = a653sched_free_pdata,
-.alloc_pdata= a653sched_alloc_pdata,
-
 .free_domdata   = a653sched_free_domdata,
 .alloc_domdata  = a653sched_alloc_domdata,
 
diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 4c4927f..63a4a63 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 
 /*
@@ -532,12 +533,12 @@ csched_alloc_pdata(const struct scheduler *ops, int cpu)
 /* Allocate per-PCPU info */
 spc = xzalloc(struct csched_pcpu);
 if ( spc == NULL )
-return NULL;
+return ERR_PTR(-ENOMEM);
 
 if ( !alloc_cpumask_var(>balance_mask) )
 {
 xfree(spc);
-return NULL;
+return ERR_PTR(-ENOMEM);
 }
 
 spin_lock_irqsave(>lock, flags);
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index b8c8e40..e97d8be 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -2047,7 +2047,7 @@ csched2_alloc_pdata(const struct scheduler *ops, int cpu)
 printk("%s: cpu %d not online yet, deferring initializatgion\n",
__func__, cpu);
 
-return (void *)1;
+return NULL;
 }
 
 static void
diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 321b0a5..aece318 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -681,7 +682,7 @@ rt_alloc_pdata(const struct scheduler *ops, int cpu)
 spin_unlock_irqrestore(old_lock, flags);
 
 if ( !alloc_cpumask_var(&_cpumask_scratch[cpu]) )
-return NULL;
+return ERR_PTR(-ENOMEM);
 
 if ( prv->repl_timer == NULL )
 {
@@ -689,13 +690,12 @@ rt_alloc_pdata(const struct scheduler *ops, int cpu)
 prv->repl_timer = xzalloc(struct timer);
 
 if ( prv->repl_timer == NULL )
-return NULL;
+return ERR_PTR(-ENOMEM);
 
 init_timer(prv->repl_timer, repl_timer_handler, (void *)ops,

Re: [Xen-devel] [RFC PATCH] Data integrity extension support for xen-block

2016-04-07 Thread Bob Liu


On 04/07/2016 11:55 PM, Juergen Gross wrote:
> On 07/04/16 12:00, Bob Liu wrote:
>> * What's data integrity extension and why?
>> Modern filesystems feature checksumming of data and metadata to protect 
>> against
>> data corruption.  However, the detection of the corruption is done at read 
>> time
>> which could potentially be months after the data was written.  At that point 
>> the
>> original data that the application tried to write is most likely lost.
>>
>> The solution in Linux is the data integrity framework which enables 
>> protection
>> information to be pinned to I/Os and sent to/received from controllers that
>> support it. struct bio has been extended with a pointer to a struct bip which
>> in turn contains the integrity metadata. The bip is essentially a trimmed 
>> down
>> bio with a bio_vec and some housekeeping.
>>
>> * Issues when xen-block get involved.
>> xen-blkfront only transmits the normal data of struct bio while the integrity
>> metadata buffer(struct bio_integrity_payload in each bio) is ignored.
>>
>> * Proposal of transmitting bio integrity payload.
>> Adding an extra request following the normal data request, this extra request
>> contains the integrity payload.
>> The xen-blkback will reconstruct an new bio with both received normal data 
>> and
>> integrity metadata.
>>
>> Welcome any better ideas, thank you!
>>
>> [1] http://lwn.net/Articles/280023/
>> [2] https://www.kernel.org/doc/Documentation/block/data-integrity.txt
>>
>> Signed-off-by: Bob Liu 
>> ---
>>  xen/include/public/io/blkif.h |   50 
>> +
>>  1 file changed, 50 insertions(+)
>>
>> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
>> index 99f0326..3d8d39f 100644
>> --- a/xen/include/public/io/blkif.h
>> +++ b/xen/include/public/io/blkif.h
>> @@ -635,6 +635,28 @@
>>  #define BLKIF_OP_INDIRECT  6
>>  
>>  /*
>> + * Recognized only if "feature-extra-request" is present in backend xenbus 
>> info.
>> + * A request with BLKIF_OP_EXTRA_FLAG indicates an extra request is followed
>> + * in the shared ring buffer.
>> + *
>> + * By this way, extra data like bio integrity payload can be transmitted 
>> from
>> + * frontend to backend.
>> + *
>> + * The 'wire' format is like:
>> + *  Request 1: xen_blkif_request
>> + * [Request 2: xen_blkif_extra_request](only if request 1 has 
>> BLKIF_OP_EXTRA_FLAG)
>> + *  Request 3: xen_blkif_request
>> + *  Request 4: xen_blkif_request
>> + * [Request 5: xen_blkif_extra_request](only if request 4 has 
>> BLKIF_OP_EXTRA_FLAG)
>> + *  ...
>> + *  Request N: xen_blkif_request
>> + *
>> + * If a backend does not recognize BLKIF_OP_EXTRA_FLAG, it should *not* 
>> create the
>> + * "feature-extra-request" node!
>> + */
>> +#define BLKIF_OP_EXTRA_FLAG (0x80)
>> +
>> +/*
>>   * Maximum scatter/gather segments per request.
>>   * This is carefully chosen so that sizeof(blkif_ring_t) <= PAGE_SIZE.
>>   * NB. This could be 12 if the ring indexes weren't stored in the same page.
>> @@ -703,6 +725,34 @@ struct blkif_request_indirect {
>>  };
>>  typedef struct blkif_request_indirect blkif_request_indirect_t;
>>  
>> +enum blkif_extra_request_type {
>> +BLKIF_EXTRA_TYPE_DIX = 1,   /* Data integrity extension 
>> payload.  */
>> +};
>> +
>> +struct bio_integrity_req {
>> +/*
>> + * Grant mapping for transmitting bio integrity payload to backend.
>> + */
>> +grant_ref_t *gref;
>> +unsigned int nr_grefs;
>> +unsigned int len;
>> +};
> 
> How does the payload look like? It's structure should be defined here
> or a reference to it's definition in case it is a standard should be
> given.
> 

The payload is also described using struct bio_vec(the same as bio).

/*
 * bio integrity payload
 */
struct bio_integrity_payload {
struct bio  *bip_bio;   /* parent bio */

struct bvec_iterbip_iter;

bio_end_io_t*bip_end_io;/* saved I/O completion fn */

unsigned short  bip_slab;   /* slab the bip came from */
unsigned short  bip_vcnt;   /* # of integrity bio_vecs */
unsigned short  bip_max_vcnt;   /* integrity bio_vec slots */
unsigned short  bip_flags;  /* control flags */

struct work_struct  bip_work;   /* I/O completion */

struct bio_vec  *bip_vec;
struct bio_vec  bip_inline_vecs[0];/* embedded bvec array */
};

>> +
>> +/*
>> + * Extra request, must follow a normal-request and a normal-request can
>> + * only be followed by one extra request.
>> + */
> 
> Wouldn't it be possible to include this in the original request (e.g.
> it could be the first or last indirect segment) ?
> 

Yes, that's also an option.

The common way for block layer/driver to handle integrity metadata is using two 
scatterlists.
One containing the data as usual, and one containing the integrity metadata.

[Xen-devel] [PATCH v3 07/11] xen: sched: fix per-socket runqueue creation in credit2

2016-04-07 Thread Dario Faggioli

The credit2 scheduler tries to setup runqueues in such
a way that there is one of them per each socket. However,
that does not work. The issue is described in bug #36
"credit2 only uses one runqueue instead of one runq per
socket" (http://bugs.xenproject.org/xen/bug/36), and a
solution has been attempted by an old patch series:

 http://lists.xen.org/archives/html/xen-devel/2014-08/msg02168.html

Here, we take advantage of the fact that now initialization
happens (for all schedulers) during CPU_STARTING, so we
have all the topology information available when necessary.

This is true for all the pCPUs _except_ the boot CPU. That
is not an issue, though. In fact, no runqueue exists yet
when the boot CPU is initialized, so we can just create
one and put the boot CPU in there.

Signed-off-by: Dario Faggioli 
Reviewed-by: George Dunlap 
---
Cc: Justin Weaver 
---
Changes from v1:
 * fixed a typo in a comment.
---
 xen/common/sched_credit2.c |   59 
 1 file changed, 43 insertions(+), 16 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index b207d84..a61a45a 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -53,7 +53,6 @@
  *  http://wiki.xen.org/wiki/Credit2_Scheduler_Development
  * TODO:
  * + Multiple sockets
- *  - Detect cpu layout and make runqueue map, one per L2 (make_runq_map())
  *  - Simple load balancer / runqueue assignment
  *  - Runqueue load measurement
  *  - Load-based load balancer
@@ -1975,6 +1974,48 @@ static void deactivate_runqueue(struct csched2_private 
*prv, int rqi)
 cpumask_clear_cpu(rqi, >active_queues);
 }
 
+static unsigned int
+cpu_to_runqueue(struct csched2_private *prv, unsigned int cpu)
+{
+struct csched2_runqueue_data *rqd;
+unsigned int rqi;
+
+for ( rqi = 0; rqi < nr_cpu_ids; rqi++ )
+{
+unsigned int peer_cpu;
+
+/*
+ * As soon as we come across an uninitialized runqueue, use it.
+ * In fact, either:
+ *  - we are initializing the first cpu, and we assign it to
+ *runqueue 0. This is handy, especially if we are dealing
+ *with the boot cpu (if credit2 is the default scheduler),
+ *as we would not be able to use cpu_to_socket() and similar
+ *helpers anyway (they're result of which is not reliable yet);
+ *  - we have gone through all the active runqueues, and have not
+ *found anyone whose cpus' topology matches the one we are
+ *dealing with, so activating a new runqueue is what we want.
+ */
+if ( prv->rqd[rqi].id == -1 )
+break;
+
+rqd = prv->rqd + rqi;
+BUG_ON(cpumask_empty(>active));
+
+peer_cpu = cpumask_first(>active);
+BUG_ON(cpu_to_socket(cpu) == XEN_INVALID_SOCKET_ID ||
+   cpu_to_socket(peer_cpu) == XEN_INVALID_SOCKET_ID);
+
+if ( cpu_to_socket(cpumask_first(>active)) == cpu_to_socket(cpu) )
+break;
+}
+
+/* We really expect to be able to assign each cpu to a runqueue. */
+BUG_ON(rqi >= nr_cpu_ids);
+
+return rqi;
+}
+
 /* Returns the ID of the runqueue the cpu is assigned to. */
 static unsigned
 init_pdata(struct csched2_private *prv, unsigned int cpu)
@@ -1986,21 +2027,7 @@ init_pdata(struct csched2_private *prv, unsigned int cpu)
 ASSERT(!cpumask_test_cpu(cpu, >initialized));
 
 /* Figure out which runqueue to put it in */
-rqi = 0;
-
-/* Figure out which runqueue to put it in */
-/* NB: cpu 0 doesn't get a STARTING callback, so we hard-code it to 
runqueue 0. */
-if ( cpu == 0 )
-rqi = 0;
-else
-rqi = cpu_to_socket(cpu);
-
-if ( rqi == XEN_INVALID_SOCKET_ID )
-{
-printk("%s: cpu_to_socket(%d) returned %d!\n",
-   __func__, cpu, rqi);
-BUG();
-}
+rqi = cpu_to_runqueue(prv, cpu);
 
 rqd = prv->rqd + rqi;
 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 06/11] xen: sched: on Credit2, don't reprogram the timer if idle

2016-04-07 Thread Dario Faggioli

as other schedulers are doing already: if the idle vcpu
is picked and scheduled, there is no need to reprogram the
scheduler timer to fire and invoke csched2_schedule()
again in future.

Tickling or external events will serve as pokes, when
necessary, but until we can, we should just stay idle.

Signed-off-by: Dario Faggioli 
Reported-by: Tianyang Chen 
Suggested-by: George Dunlap 
Acked-by: George Dunlap 
---
Cc: Tianyang Chen 
---
Changes from v1:
 * use George's work address (sorry again!).
---
 xen/common/sched_credit2.c |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 7286e50..b207d84 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1543,8 +1543,12 @@ csched2_runtime(const struct scheduler *ops, int cpu, 
struct csched2_vcpu *snext
 struct csched2_runqueue_data *rqd = RQD(ops, cpu);
 struct list_head *runq = >runq;
 
+/*
+ * If we're idle, just stay so. Others (or external events)
+ * will poke us when necessary.
+ */
 if ( is_idle_vcpu(snext->vcpu) )
-return CSCHED2_MAX_TIMER;
+return -1;
 
 /* General algorithm:
  * 1) Run until snext's credit will be 0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 00/11] Fixes and improvement (including hard affinity!) for Credit2

2016-04-07 Thread Dario Faggioli

And here it comes take 3.

I've addressed George's review comments (and, indirectly, while doing that,
Juergen's one as well).

Now it's only these two patches that need being Acked:

  04/11 xen: sched: close potential races when switching scheduler to CPUs
  08/11 xen: sched: allow for choosing credit2 runqueues configuration at boot

Previous version (v2) is here:
  http://lists.xen.org/archives/html/xen-devel/2016-04/msg00783.html

And there is a git branch for this version here:
  git://xenbits.xen.org/people/dariof/xen.git  
rel/sched/credit2/fix-runq-and-haff-v3
  
http://xenbits.xen.org/gitweb/?p=people/dariof/xen.git;a=shortlog;h=refs/heads/rel/sched/credit2/fix-runq-and-haff-v3

Thanks and Regards,
Dario
---
Dario Faggioli (9):
  xen: sched: make implementing .alloc_pdata optional
  xen: sched: implement .init_pdata in Credit, Credit2 and RTDS
  xen: sched: move pCPU initialization in an helper
  xen: sched: close potential races when switching scheduler to CPUs
  xen: sched: on Credit2, don't reprogram the timer if idle
  xen: sched: fix per-socket runqueue creation in credit2
  xen: sched: allow for choosing credit2 runqueues configuration at boot
  xen: sched: per-core runqueues as default in credit2
  xen: sched: privde some scratch space for not putting cpumasks on stack

Justin Weaver (1):
  xen: sched: implement vcpu hard affinity in Credit2

Uma Sharma (1):
  xen: sched: improve credit2 bootparams' scope, placement and signedness

 docs/misc/xen-command-line.markdown |   19 ++
 xen/common/sched_arinc653.c |   65 +++--
 xen/common/sched_credit.c   |  103 ++--
 xen/common/sched_credit2.c  |  443 ---
 xen/common/sched_rt.c   |  117 -
 xen/common/schedule.c   |   76 +-
 xen/include/xen/sched-if.h  |7 +
 7 files changed, 537 insertions(+), 293 deletions(-)
--
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 02/11] xen: sched: implement .init_pdata in Credit, Credit2 and RTDS

2016-04-07 Thread Dario Faggioli

In fact, if a scheduler needs per-pCPU information,
that needs to be initialized appropriately. So, we take
the code that is performing initializations from (right
now) .alloc_pdata, and use it for .init_pdata, leaving
only actualy allocations in the former, if any (which
is the case in RTDS and Credit1).

On the other hand, in Credit2, since we don't really
need any per-pCPU data allocation, everything that was
being done in .alloc_pdata, is now done in .init_pdata.
And the fact that now .alloc_pdata can be left undefined,
allows us to just get rid of it.

Still for Credit2, the fact that .init_pdata is called
during CPU_STARTING (rather than CPU_UP_PREPARE) kills
the need for the scheduler to setup a similar callback
itself, simplifying the code.

And thanks to such simplification, it is now also ok to
move some of the logic meant at double checking that a
cpu was (or was not) initialized, into ASSERTS (rather
than an if() and a BUG_ON).

Signed-off-by: Dario Faggioli 
Reviewed-by: Meng Xu 
Reviewed-by: George Dunlap 
---
Cc: Juergen Gross 
---
Changes from v2:
* make the ASSERT() in credit more linear, as suggested
  during review;
* minor adjustements to the changelog, as suggested during
  review.
---
 xen/common/sched_credit.c  |   20 +---
 xen/common/sched_credit2.c |   72 +++-
 xen/common/sched_rt.c  |   11 ++-
 3 files changed, 28 insertions(+), 75 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 63a4a63..f503e73 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -527,8 +527,6 @@ static void *
 csched_alloc_pdata(const struct scheduler *ops, int cpu)
 {
 struct csched_pcpu *spc;
-struct csched_private *prv = CSCHED_PRIV(ops);
-unsigned long flags;
 
 /* Allocate per-PCPU info */
 spc = xzalloc(struct csched_pcpu);
@@ -541,6 +539,19 @@ csched_alloc_pdata(const struct scheduler *ops, int cpu)
 return ERR_PTR(-ENOMEM);
 }
 
+return spc;
+}
+
+static void
+csched_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
+{
+struct csched_private *prv = CSCHED_PRIV(ops);
+struct csched_pcpu * const spc = pdata;
+unsigned long flags;
+
+/* cpu data needs to be allocated, but STILL uninitialized */
+ASSERT(spc && spc->runq.next == NULL && spc->runq.prev == NULL);
+
 spin_lock_irqsave(>lock, flags);
 
 /* Initialize/update system-wide config */
@@ -561,16 +572,12 @@ csched_alloc_pdata(const struct scheduler *ops, int cpu)
 INIT_LIST_HEAD(>runq);
 spc->runq_sort_last = prv->runq_sort;
 spc->idle_bias = nr_cpu_ids - 1;
-if ( per_cpu(schedule_data, cpu).sched_priv == NULL )
-per_cpu(schedule_data, cpu).sched_priv = spc;
 
 /* Start off idling... */
 BUG_ON(!is_idle_vcpu(curr_on_cpu(cpu)));
 cpumask_set_cpu(cpu, prv->idlers);
 
 spin_unlock_irqrestore(>lock, flags);
-
-return spc;
 }
 
 #ifndef NDEBUG
@@ -2054,6 +2061,7 @@ static const struct scheduler sched_credit_def = {
 .alloc_vdata= csched_alloc_vdata,
 .free_vdata = csched_free_vdata,
 .alloc_pdata= csched_alloc_pdata,
+.init_pdata = csched_init_pdata,
 .free_pdata = csched_free_pdata,
 .alloc_domdata  = csched_alloc_domdata,
 .free_domdata   = csched_free_domdata,
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index e97d8be..8a56953 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1971,7 +1971,8 @@ static void deactivate_runqueue(struct csched2_private 
*prv, int rqi)
 cpumask_clear_cpu(rqi, >active_queues);
 }
 
-static void init_pcpu(const struct scheduler *ops, int cpu)
+static void
+csched2_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
 {
 unsigned rqi;
 unsigned long flags;
@@ -1981,12 +1982,7 @@ static void init_pcpu(const struct scheduler *ops, int 
cpu)
 
 spin_lock_irqsave(>lock, flags);
 
-if ( cpumask_test_cpu(cpu, >initialized) )
-{
-printk("%s: Strange, cpu %d already initialized!\n", __func__, cpu);
-spin_unlock_irqrestore(>lock, flags);
-return;
-}
+ASSERT(!cpumask_test_cpu(cpu, >initialized));
 
 /* Figure out which runqueue to put it in */
 rqi = 0;
@@ -2036,20 +2032,6 @@ static void init_pcpu(const struct scheduler *ops, int 
cpu)
 return;
 }
 
-static void *
-csched2_alloc_pdata(const struct scheduler *ops, int cpu)
-{
-/* Check to see if the cpu is online yet */
-/* Note: cpu 0 doesn't get a STARTING callback */
-if ( cpu == 0 || cpu_to_socket(cpu) != XEN_INVALID_SOCKET_ID )
-init_pcpu(ops, cpu);
-else
-printk("%s: cpu %d not online yet, deferring initializatgion\n",
-   __func__, cpu);
-
-return NULL;
-}
-
 static void
 csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)

Re: [Xen-devel] [PATCH v4 00/14] x86: remove paravirt_enabled

2016-04-07 Thread Luis R. Rodriguez

On Wed, Apr 06, 2016 at 05:06:20PM -0700, Luis R. Rodriguez wrote:
> Now that Andy's ASM paravirt_enabled() use is merged 

Sorry I should have provided more context, I meant that now
that Andy's ASM paravirt_enabled() removal is merged:

This is the ASM hack that Andy removed:
https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/fsgsbase=58a5aac5331388a175a42b6ed2154f0559cefb21

This puts a nail on coffin for the ASM hack:
https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/fsgsbase=0dd0036f6e07f741a1356b424b84a3164b6e59cf

  Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [xen-unstable-smoke test] 89421: tolerable FAIL - PUSHED

2016-04-07 Thread osstest service owner

flight 89421 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/89421/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 build-amd64-libvirt   5 libvirt-buildfail   like 89383

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  a35dc6ccbb5c77f83dd03f6de0e58e71f805debf
baseline version:
 xen  c63a6ee79cbe45b8f8092543a110361a66d50ca9

Last test of basis89383  2016-04-07 17:02:00 Z0 days
Testing same since89421  2016-04-07 23:05:13 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Dario Faggioli 
  George Dunlap 
  Jan Beulich 
  Shuai Ruan 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  fail
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=a35dc6ccbb5c77f83dd03f6de0e58e71f805debf
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
a35dc6ccbb5c77f83dd03f6de0e58e71f805debf
+ branch=xen-unstable-smoke
+ revision=a35dc6ccbb5c77f83dd03f6de0e58e71f805debf
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.6-testing
+ '[' xa35dc6ccbb5c77f83dd03f6de0e58e71f805debf = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local

[Xen-devel] [qemu-mainline test] 89252: regressions - FAIL

2016-04-07 Thread osstest service owner

flight 89252 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/89252/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-i386 10 guest-start fail REGR. vs. 86454
 test-amd64-i386-freebsd10-amd64 10 guest-startfail REGR. vs. 86454
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. 
vs. 86454
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 9 debian-hvm-install fail REGR. vs. 
86454
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. 
vs. 86454
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail 
REGR. vs. 86454
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 86454
 test-amd64-i386-xl-qemuu-debianhvm-amd64 9 debian-hvm-install fail REGR. vs. 
86454
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail 
REGR. vs. 86454
 test-amd64-i386-qemuu-rhel6hvm-intel  9 redhat-installfail REGR. vs. 86454
 test-amd64-amd64-qemuu-nested-intel  9 debian-hvm-install fail REGR. vs. 86454
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail REGR. vs. 86454
 test-amd64-i386-libvirt-pair 21 guest-migrate/src_host/dst_host fail REGR. vs. 
86454
 test-amd64-i386-qemuu-rhel6hvm-amd  9 redhat-install  fail REGR. vs. 86454
 test-amd64-amd64-qemuu-nested-amd  9 debian-hvm-install   fail REGR. vs. 86454
 test-amd64-amd64-libvirt-pair 21 guest-migrate/src_host/dst_host fail REGR. 
vs. 86454

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 86454
 test-armhf-armhf-xl-rtds 11 guest-start  fail   like 86454
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 86454

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass

version targeted for testing:
 qemuu7acbff99c6c285b3070bf0e768d56f511e2bf346
baseline version:
 qemuud1f8764099022bc1173f2413331b26d4ff609a0c

Last test of basis86454  2016-03-17 06:01:30 Z   21 days
Failing since 86547  2016-03-18 07:12:41 Z   20 days   22 attempts
Testing same since89252  2016-04-06 22:19:02 Z1 days1 attempts


People who touched revisions under test:
  Alberto Garcia 
  Alex BennÃ©e 
  Alex Williamson 
  Alexey Kardashevskiy 
  Andrew Baumann 
  Andrey Smetanin 
  Anthony Perard

[Xen-devel] Does __KERNEL_DS serve a purpose?

2016-04-07 Thread Andy Lutomirski

I can't see any reason that we need the __KERNEL_DS segment at all --
I think that everything that uses __KERNEL_DS could use __USER_DS
instead.  Am I missing anything?  This has been bugging me for a
while.

I mulled over this a bit when trying to understand the sysret_ss_attrs
bug and then forgot about it.

--Andy

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 19/28] build_id: Provide ld-embedded build-ids

2016-04-07 Thread Konrad Rzeszutek Wilk

> > +build_id.o: $(TARGET)-syms
> > +   $(OBJCOPY) -O binary --only-section=.note $(BASEDIR)/xen-syms $@.bin
> 
> Considering your xen.lds.S changes, won't this potentially copy quite
> a bit more than just the build ID (i.e. all notes)? This may be okay, and
> may be even intended, but then the generate file's name should
> reflect this to not misguide people.

I changed it notes.o

> 
> > +   $(OBJCOPY) -I binary -O elf64-x86-64 -B i386:x86-64 \
> > +   --rename-section=.data=.note.gnu.build-id -S $@.bin $@
> 
> Since you put the notes into .rodata anyway, why name the
> section .note.*? Just name it .rodata.*, avoiding to mislead
> others who also think that a section's name has much of a
> meaning.

Way back last year:
http://lists.xenproject.org/archives/html/xen-devel/2016-01/msg01264.html

which is where the .note came about. I can put it all in .rodata
and not have it for normal ELF builds if you would like.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 04/11] xen: sched: close potential races when switching scheduler to CPUs

2016-04-07 Thread Dario Faggioli

On Thu, 2016-04-07 at 15:54 +0100, George Dunlap wrote:
> On 06/04/16 18:23, Dario Faggioli wrote:
> > 
> > This patch, therefore, introduces a new hook in the scheduler
> > interface, called switch_sched, meant at being used when
> > switching scheduler on a CPU, and implements it for the
> > various schedulers (that needs it: i.e., all except ARINC653),
> > so that things are done in the proper order and under the
> > protection of the best suited (set of) lock(s). It is
> > necessary to add the hook (as compared to keep doing things
> > in generic code), because different schedulers may have
> >  different locking schemes.
> > 
> > Signed-off-by: Dario Faggioli 
> Hey Dario! Everything here looks good, except for one thing: the
> scheduler lock for arinc653 scheduler. :-)  
>
Bah! :-/

> What happens now if you
> assign a cpu to credit2, and then assign it to arinc653?  Since arinc
> doesn't implement the switch_sched() functionality, the per-cpu
> scheduler lock will still point to the credit2 lock, won't it?
> 
So, to assign a cpu to a pool you have to first remove it from the pool
it's currently in. And removing a cpu from its pool, means re-assigning 
it to the default pool (pool0), which uses as scheduler whatever we
chose at boot. (Then, of course, vcpus of domains in the default pool
still won't run there, because its corresponding bit in
pool0->cpu_valid mask is 0.)

So, if you move a cpu from pool0 to an ARINC pool, indeed the lock
would keep pointing to pool0's runqueue lock (which will be one of the
Credit2's runqueue locks, if the default scheduler is Credit2).

And that is the case even if you move a cpu from some non-default pool
to an ARINC pool, because you always have to remove the cpu from where
it is first, which automatically puts it back in the default pool.

This is relevant because...

> Which will *work*, although it will add unnecessary contention to the
> credit2 lock;  until the lock goes away, at which point
> vcpu_schedule_lock*() will essentially be using a wild pointer.
> 
...since we're talking about the lock of the default scheduler, it'll
never go away.

But indeed I agree that this just adds unnecessary contention which, in
case of Credit2 or RTDS, where the lock is shared between pcpus, is
something we certainly don't want (good catch!).

I'll add a very simple implementation of switch_sched to ARINC, which
basically puts back sd->schedule_lock to sd->_lock, and also does the
basic operations for actually switching scheduler, which before this
patch were done in schedule_cpu_switch(), and hence are also needed in
there.

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK)

signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 03/21] xen/x86: Generate deep dependencies of features

2016-04-07 Thread Andrew Cooper

On 08/04/2016 00:18, Jan Beulich wrote:
 On 07.04.16 at 13:57,  wrote:
>> +deps = {
>> +# FPU is taken to mean support for the x87 regisers as well as the
>> +# instructions.  MMX is documented to alias the %MM registers over 
>> the
>> +# x87 %ST registers in hardware.
>> +FPU: [MMX],
>> +
>> +# The PSE36 feature indicates that reserved bits in a PSE superpage
>> +# may be used as extra physical address bits.
>> +PSE: [PSE36],
>> +
>> +# Entering Long Mode requires that %CR4.PAE is set.  The NX and PKU
>> +# pagetable bits are only representable in the 64bit PTE format
>> +# offered by PAE.
>> +PAE: [LM, NX, PKU],
> PKU is explicitly documented to be available only in IA-32e paging,
> so I think it should be listed as a dependent of LM.

I am not so sure.  Unlike PCID, it is not declared that setting CR4.PKE
will result in a #GP fault out of Long Mode, and the description does
say "If CR4.PKE = 0, or if IA-32e paging is not active, the processor
does not associate linear addresses with protection keys".

This suggests that CR4.PKE can be set out of Long Mode, but all
pagewalks will act as if key0 is in use.  PCID on the other hand goes to
lengths declaring that you can't set CR4.PCIDE out of Long Mode, nor may
you attempt to exit Long Mode if it is set.

I do have some Skylake hardware with PKU available in the office.  I
will investigate what really happens in this case.  (I am also mindful
of the fact that 32bit PV guests could in principle use PKU, at which
point the dependency on LM would be counter-productive.)

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 03/21] xen/x86: Generate deep dependencies of features

2016-04-07 Thread Jan Beulich

>>> On 07.04.16 at 13:57,  wrote:
> +deps = {
> +# FPU is taken to mean support for the x87 regisers as well as the
> +# instructions.  MMX is documented to alias the %MM registers over 
> the
> +# x87 %ST registers in hardware.
> +FPU: [MMX],
> +
> +# The PSE36 feature indicates that reserved bits in a PSE superpage
> +# may be used as extra physical address bits.
> +PSE: [PSE36],
> +
> +# Entering Long Mode requires that %CR4.PAE is set.  The NX and PKU
> +# pagetable bits are only representable in the 64bit PTE format
> +# offered by PAE.
> +PAE: [LM, NX, PKU],

PKU is explicitly documented to be available only in IA-32e paging,
so I think it should be listed as a dependent of LM.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 02/21] xen/x86: Calculate maximum host and guest featuresets

2016-04-07 Thread Jan Beulich

>>> On 07.04.16 at 13:57,  wrote:
> All of this information will be used by the toolstack to make informed
> levelling decisions for VMs, and by Xen to sanity check toolstack-provided
> information.
> 
> The split between the shadow and hap HVM masks is necessary due to the lack of
> a "get cpuid policy" hypercall.  Multi-host toolstacks (i.e. not libxl)
> dealing with hap and non-hap capable hosts need to be able to calculate that
> migrating a shadow guest is safe.
> 
> Future planned development work will implement proper cpuid policy handing in
> Xen, including a "get policy" hypercall, but until then, the difference is
> made available for toolstack use via a non-stable interface.
> 
> Signed-off-by: Andrew Cooper 
> Reviewed-by: Konrad Rzeszutek Wilk 

Reviewed-by: Jan Beulich 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 01/21] xen/x86: Annotate VM applicability in featureset

2016-04-07 Thread Jan Beulich

>>> On 07.04.16 at 13:57,  wrote:
> Use attributes to specify whether a feature is applicable to be exposed to:
>  1) All guests
>  2) HVM guests
>  3) HVM HAP guests
> and, via absence of an attribute, to no guests.
> 
> There is no current need for other categories (e.g. PV-only features), and
> such categories should not be introduced if possible.  These categories 
> follow
> from the fact that, with increased hardware support, a guest gets more
> features to use.
> 
> These settings are derived from the existing code in {pv,hvm}_cpuid(), and
> xc_cpuid_x86.c.  One notable exception is EXTAPIC which was previously
> erroneously exposed to guests.  PV guests don't get to use the APIC and the
> HVM APIC emulation doesn't support extended space.
> 
> Signed-off-by: Andrew Cooper 
> Reviewed-by: Konrad Rzeszutek Wilk 

Reviewed-by: Jan Beulich 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH for-4.7 v2 2/2] xen/x86: Introduce a new VMASSIST for architectural behaviour of iopl

2016-04-07 Thread Jan Beulich

>>> On 08.04.16 at 00:30,  wrote:
> On 07/04/2016 22:55, Jan Beulich wrote:
> On 07.04.16 at 23:39,  wrote:
>>> @@ -1763,7 +1765,8 @@ static void load_segments(struct vcpu *n)
>>>  vcpu_info(n, evtchn_upcall_mask) = 1;
>>>  
>>>  regs->entry_vector |= TRAP_syscall;
>>> -regs->_eflags  &= 0xFFFCBEFFUL;
>>> +regs->_eflags  &= 
>>> ~(X86_EFLAGS_AC|X86_EFLAGS_VM|X86_EFLAGS_RF|
>>> +
>>> X86_EFLAGS_NT|X86_EFLAGS_IOPL|X86_EFLAGS_TF);
>> Why AC, which didn't get cleared before? Did you just copy
>> the 64-bit variant from below?
> 
> Yes,
> 
>> Assuming so, with it removed Reviewed-by: Jan Beulich 
> 
> Why keep the disparity?

Because there's no reason to clear AC for 32-bit guests.

> Looking this up again, architecturally speaking, its wrong.  AC does not
> get cleared on a 32 or 64bit task switch; It only gets cleared on a real
> mode task switch.

Not sure what meaning of "task switch" you're implying here -
we're talking about code dealing with certain failures in the
PV context switch path, which has nothing to do with hardware
task switching.

> I presume you are refering to c/s eb97b7dc2b "[XEN] Fix x86/64 bug where
> a guest application can crash the guest OS by setting AC flag in
> RFLAGS.", from 2006?  Such a PV VM is already vulnerable from other
> means.  I suppose this explains why 32bit PV kernels end up leaking AC
> back into userspace.

Nor do I understand your reference to leaking whatever state
into user space: We're injecting a failsafe callback here, i.e.
guest execution is guaranteed to resume in kernel space.

The difference here mirrors the difference between
compat_create_bounce_frame and create_bounce_frame in
regard to what parts of EFLAGS they clear.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 08/11] xen: sched: allow for choosing credit2 runqueues configuration at boot

2016-04-07 Thread Dario Faggioli

On Thu, 2016-04-07 at 16:04 +0100, George Dunlap wrote:
> On 07/04/16 06:04, Juergen Gross wrote:
> > On 06/04/16 19:23, Dario Faggioli wrote:
> > > @@ -2170,6 +2234,8 @@ csched2_init(struct scheduler *ops)
> > >  printk(" load_window_shift: %d\n", opt_load_window_shift);
> > >  printk(" underload_balance_tolerance: %d\n",
> > > opt_underload_balance_tolerance);
> > >  printk(" overload_balance_tolerance: %d\n",
> > > opt_overload_balance_tolerance);
> > > +printk(" runqueues arrangement: per-%s\n",
> > > +   opt_runqueue == OPT_RUNQUEUE_CORE ? "core" :
> > > "socket");
> > I asked this before: shouldn't the optiones "node" and "all" be
> > respected here, too?
> Dario, would it make sense to put the string names ("core", "socket",
> ) in an array, then have both parse_credit2_runqueue() iterate over
> the array to find the appropriate numeric value, and have this use
> the
> array to convert from the numeric value to a string?
> 
Ok, I'll do that.

Even if I do, though, I can't get rid of the OPT_RUNQUEUE_CORE, etc.,
symbols, as I need to figure out what the numeric value found during
parsing actually means in cpu_to_runqueue().

I know you're not mentioning this, but I felt like I better make this
clear, in case one would expect for those to go away too.

In any case, you'll see this in the patch.

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] abstract model of IOMMU unmaping/mapping failures

2016-04-07 Thread Jan Beulich

>>> On 06.04.16 at 09:38,  wrote:
> On April 01, 2016 7:57pm,  wrote:
>> >>> On 31.03.16 at 11:06,  wrote:
>> > 4. __gnttab_unmap_common():rollback (no change)
>> >
>> > (Existing code)
>> >>>...
>> >  if ( !kind )
>> > err = iommu_unmap_page(ld, op->frame);
>> > else if ( !(kind & MAPKIND_WRITE) )
>> > err = iommu_map_page(ld, op->frame, op->frame,
>> > IOMMUF_readable);
>> >
>> > double_gt_unlock(lgt, rgt);
>> >
>> > if ( err )
>> > rc = GNTST_general_error;
>> ><<...
>> >
>> > Similarly no change required, as error has been correctly handled.
>> 
>> I wouldn't call this "correctly handled", but for the hardware domain it 
>> should
>> be good enough, and by crashing DomU-s simply reporting the error up the call
>> tree is sufficient. One question though is whether the loops higher up the 
>> call
>> tree in grant_table.c shouldn't be exited when noticing the domain has 
>> crashed,
>> both to avoid unnecessary work and to reduce the risk of secondary problems.
>> 
> 
> For this point, I suppose that the domain structure is not destructed (we 
> are safe to call domain's element, e.g. ld->grant_table) in do_grant_table_op 
> hypercall cycle,
> even the domain is crashed down. I am not quite sure, whether it is correct 
> or not, could you explain more?

Explain what more? Sure, struct domain stays around until the
domain gets actually cleaned up, so accesses to its grant table
(and alike) remain valid while execution didn't leave the context
of that guest (vCPU) yet.

>> > 7. set_identity_p2m_entry():rollback (no change).
>> >
>> > (Existing code)
>> >   >>...
>> > if ( !paging_mode_translate(p2m->domain) )
>> > {
>> > if ( !need_iommu(d) )
>> > return 0;
>> > return iommu_map_page(d, gfn, gfn,
>> IOMMUF_readable|IOMMUF_writable);
>> > }
>> >   <<...
>> > if ( is_hardware_domain(d) && !iommu_use_hap_pt(d) )
>> > ret = iommu_map_page(d, gfn, gfn,
>> IOMMUF_readable|IOMMUF_writable);
>> >>> ...
>> >
>> > error handling and rollback already in-place.
>> 
>> For the first portion of the quoted code you mean. I don't see any rollback 
>> in
>> the paging_mode_translate() case.
>> 
> 
> Does it refer to as the below call trees:
>  set_identity_p2m_entry()--rmrr_identity_mapping()--intel_iommu_add_device() 
> --...
>  
> set_identity_p2m_entry()--rmrr_identity_mapping()--intel_iommu_remove_device()--...
>  

No, my comment solely referred to set_identity_p2m_entry(),
but it looks like I mis-read the code and there's indeed nothing
to roll back inside that function.

>> > 8. p2m_remove_page():rollback one level(change required).
>> >
>> > (Existing code)
>> >   >>...
>> > if ( !paging_mode_translate(p2m->domain) )
>> > {
>> > if ( need_iommu(p2m->domain) )
>> > for ( i = 0; i < (1 << page_order); i++ )
>> > iommu_unmap_page(p2m->domain, mfn + i);
>> > return 0;
>> > }
>> >   <<...
>> >
>> > There is no error checking of iommu_unmap_page. We need to add.
>> > However, there are several callers of this function which don't handle
>> > error at all. I'm not sure whether hardware domain will use this function.
>> > Since it is in core p2m logic (which I don't have much knowledge), I
>> > hope we can print a warning and simply return error here (given the
>> > fact that non-hardware domains are all crashed immediately within
>> > unmap call)
>> 
>> Yes, at least error propagation needs to be added here. I don't think more is
>> required. (Be careful, btw, with adding warnings - you must not spam the log
>> with such.)
>> 
> 
> 
>  agreed. A quick question about error propagation:
> ...
> for ( i = 0; i < (1UL << page_order); i++ )
> iommu_unmap_page(p2m->domain, gfn + i);
> ...
> 
> As you mentioned, as a special case, unmapping should perhaps continue 
> despite an error, in an attempt to do best effort cleanup.
> Then i could modify the code as:
> 
> ...
> for ( i = 0; i < (1UL << page_order); i++ )
> {
> rc = iommu_unmap_page(p2m->domain, gfn + i);
> if ( rc )
>   ret = rc;
> }
> ..
>return ret;
> ...
> 
> It looks cumbersome to me, any suggestion?

What's cumbersome here?

>> > 9. p2m_pt_set_entry(): open (propose no-rollback).
>> >
>> > (Existing code)
>> >   >>...
>> >for ( i = 0; i < (1UL << page_order); i++ )
>> > iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
>> >iommu_pte_flags);
>> > else
>> > for ( i = 0; i < (1UL << page_order); i++ )
>> > iommu_unmap_page(p2m->domain, gfn + i);
>> >   <<...
>> >
>> > Above is in the end of the func.
>> >
>> > I'm not sure whether it will be called for hardware domain ( PVH mode
>> > is one potential concern. as we know, PVH has been introduced on Xen
>> > 4.4 as a DomU, and on Xen 4.5 as a Dom0.).

Re: [Xen-devel] [PATCH v4 01/14] x86/boot: enumerate documentation for the x86 hardware_subarch

2016-04-07 Thread Luis R. Rodriguez

On Thu, Apr 07, 2016 at 02:25:38PM +0300, Andy Shevchenko wrote:
> On Wed, 2016-04-06 at 17:06 -0700, Luis R. Rodriguez wrote:
> > Although hardware_subarch has been in place since the x86 boot
> > protocol 2.07 it hasn't been used much. Enumerate current possible
> > values to avoid misuses and help with semantics later at boot
> > time should this be used further.
> > 
> > These enums should only ever be used by architecture x86 code,
> > and all that code should be well contained and compartamentalized,
> > clarify that as well.
> 
> Nitpick:
> 
> > + * @X86_SUBARCH_PC: Should be used if the hardware is enumerable
> > using standard
> > + * PC mechanisms (PCI, ACPI) and doesn't need a special boot
> > flow.
> > + * @X86_SUBARCH_LGUEST: Used for x86 hypervisor demo, lguest
> > + * @X86_SUBARCH_XEN: Used for Xen guest types which follow the PV
> > boot path,
> > + * which start at asm startup_xen() entry point and later
> > jump to the C
> > + * xen_start_kernel() entry point.
> > + * @X86_SUBARCH_INTEL_MID: Used for Intel MID (Mobile Internet
> > Device) platform
> > + * systems which do not have the PCI legacy interfaces.
> > + * @X86_SUBARCH_CE4100: Used for Intel CE media processor (CE4100)
> > SOC for
> 
> I think 'SoC' (without quotes) will be better.

Amended, since I think I'll need a re-spin and since we may need to take
care of the dom0 Vs domU semantics I'll also make some changes to include
X86_SUBARCH_XEN documentation to annotate that PV guests can be of domU
or dom0 type...

  Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 08/14] apm32: remove paravirt_enabled() use

2016-04-07 Thread Luis R. Rodriguez

On Thu, Apr 07, 2016 at 09:08:36AM -0400, Boris Ostrovsky wrote:
> On 04/06/2016 08:06 PM, Luis R. Rodriguez wrote:
> >There is already a check for apm_info.bios == 0, the
> >apm_info.bios is set from the boot_params.apm_bios_info.
> >Both Xen and lguest, which are also the only ones that set
> >paravirt_enabled to true, never set the apm_bios.info. The
> >
> >Xen folks are sure force disable to 0 is not needed,
> 
> Because apm_info lives in .bss (which we recently made sure is
> cleared on Xen PV). May be worth mentioning in the commit message so
> that we don't forget why this is not needed.

Thanks, I'll change that last paragraph with:

Xen folks are sure force disable to 0 is not needed because
apm_info lives in .bss, we recently forced disabled this on
lguest, and on the Xen side just to be sure Boris zeroed out
the .bss for PV guests through commit 04b6b4a56884327c1648
("xen/x86: Zero out .bss for PV guests"). With this care taken
into consideration the paravirt_enabled() check is simply not
needed anymore.

> I think you also have this statement in other patches.

Indeed, I'll highlight this on the tboot commit log as well.

  Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH for-4.7 v2 2/2] xen/x86: Introduce a new VMASSIST for architectural behaviour of iopl

2016-04-07 Thread Andrew Cooper

On 07/04/2016 22:55, Jan Beulich wrote:
 On 07.04.16 at 23:39,  wrote:
>> @@ -1763,7 +1765,8 @@ static void load_segments(struct vcpu *n)
>>  vcpu_info(n, evtchn_upcall_mask) = 1;
>>  
>>  regs->entry_vector |= TRAP_syscall;
>> -regs->_eflags  &= 0xFFFCBEFFUL;
>> +regs->_eflags  &= 
>> ~(X86_EFLAGS_AC|X86_EFLAGS_VM|X86_EFLAGS_RF|
>> +
>> X86_EFLAGS_NT|X86_EFLAGS_IOPL|X86_EFLAGS_TF);
> Why AC, which didn't get cleared before? Did you just copy
> the 64-bit variant from below?

Yes,

> Assuming so, with it removed Reviewed-by: Jan Beulich 

Why keep the disparity?

Looking this up again, architecturally speaking, its wrong.  AC does not
get cleared on a 32 or 64bit task switch; It only gets cleared on a real
mode task switch.

I presume you are refering to c/s eb97b7dc2b "[XEN] Fix x86/64 bug where
a guest application can crash the guest OS by setting AC flag in
RFLAGS.", from 2006?  Such a PV VM is already vulnerable from other
means.  I suppose this explains why 32bit PV kernels end up leaking AC
back into userspace.

Yuck - yet more non-architectural and non-documented PV ABI caused by
Xen trying to bugfix its way around broken PV kernels.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [xen-unstable test] 89249: regressions - FAIL

2016-04-07 Thread osstest service owner

flight 89249 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/89249/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt   5 libvirt-build fail REGR. vs. 86491
 build-i386-libvirt5 libvirt-build fail REGR. vs. 86491
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 13 guest-localmigrate 
fail REGR. vs. 86491
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 13 guest-localmigrate 
fail REGR. vs. 86491
 build-armhf-libvirt   5 libvirt-build fail REGR. vs. 86491

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  9 debian-installfail REGR. vs. 86491
 test-armhf-armhf-xl-rtds15 guest-start/debian.repeat fail blocked in 86491
 build-i386-rumpuserxen6 xen-buildfail   like 86491
 build-amd64-rumpuserxen   6 xen-buildfail   like 86491
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 86491
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 86491
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 86491
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 86491

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  50b5df7dccd052b685084abe31a4b6be43c976ae
baseline version:
 xen  a6f2cdb633bf519244a16674031b8034b581ba7f

Last test of basis86491  2016-03-17 15:24:59 Z   21 days
Failing since 86560  2016-03-18 10:56:34 Z   20 days   24 attempts
Testing same since89249  2016-04-06 22:18:40 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Andrew Cooper  [for the Ocaml stubs]
  Anthony PERARD 
  Boris Ostrovsky 
  Changlong Xie 
  Chong Li 
  Chunyan Liu 
  Dagaen Golomb 
  Daniel De Graaf 
  Daniel De Graaf  [XSM bits]
  Dario

Re: [Xen-devel] [PATCH for-4.7 v2 2/2] xen/x86: Introduce a new VMASSIST for architectural behaviour of iopl

2016-04-07 Thread Jan Beulich

>>> On 07.04.16 at 23:39,  wrote:
> @@ -1763,7 +1765,8 @@ static void load_segments(struct vcpu *n)
>  vcpu_info(n, evtchn_upcall_mask) = 1;
>  
>  regs->entry_vector |= TRAP_syscall;
> -regs->_eflags  &= 0xFFFCBEFFUL;
> +regs->_eflags  &= 
> ~(X86_EFLAGS_AC|X86_EFLAGS_VM|X86_EFLAGS_RF|
> +
> X86_EFLAGS_NT|X86_EFLAGS_IOPL|X86_EFLAGS_TF);

Why AC, which didn't get cleared before? Did you just copy
the 64-bit variant from below? Assuming so, with it removed
Reviewed-by: Jan Beulich 

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 11/14] pnpbios: replace paravirt_enabled() check with legacy device check

2016-04-07 Thread Luis R. Rodriguez

On Thu, Apr 07, 2016 at 10:46:11AM +0100, David Vrabel wrote:
> On 07/04/16 01:06, Luis R. Rodriguez wrote:
> > Since we are removing paravirt_enabled() replace it with a
> > logical equivalent. Even though PNPBIOS is x86 specific we
> > add an arch-specific type call, which can be implemented by
> > any architecture to show how other legacy attribute devices
> > can later be also checked for with other ACPI legacy attribute
> > flags.
> > 
> > This implicates the first ACPI 5.2.9.3 IA-PC Boot Architecture
> > ACPI_FADT_LEGACY_DEVICES flag device, and shows how to add more.
> [...]
> > +struct x86_legacy_devices {
> > +   int pnpbios;
> > +};
> 
> It's not clear why pnpbios needs a new structure

I'm glad you asked. Dealing with placing pnpbios quirk in a more useful generic
fashion was perhaps the most difficult challenge in this series.  As I reviewed
possibilities to remove paravirt_enabled() the best prospect I found was to see
if Xen could instead use ACPI 5.2.9.3 IA-PC Boot Architecture flags to annotate
some quirks. It turns out that it is possible, but there are only so many flags,
and also, we didn't want to have a solution that incurred respective upstream
Xen hypervisor change, that would be silly.

To make this quirk more useful then this folds the pnpbios quirk as a sub
quirk under the more borad ACPI_FADT_LEGACY_DEVICES ACPI flag. What this
does, as can be seen by also looking at the next patch, "x86, ACPI: parse
ACPI_FADT_LEGACY_DEVICES" is it explicitly folds pnpbios as one of the
ACPI_FADT_LEGACY_DEVICES devices, but also paves the way for further known
main legacy components to added to the list.

> and why this structure of devices does not have the bit for the rtc device.

That's because ACPI has its own dedicated flag for it, so there already
is a one-to-one mapping available. All we needed to do to replace the
RTC hack was to provide a mechanism to unify both the paravirt RTC hack
with the ACPI RTC flag.

  Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH for-4.7 v2 1/2] xen/x86: Remove the use of vm86_mode()

2016-04-07 Thread Andrew Cooper

Xen, being 64bit only, cannot run PV guests in vm86 mode.  HVM guests however
can be running in vm86 mode, and common codepaths need to be able to cope.

The definition of vm86_mode() in x86_64/regs.h is incorrect, as the predicate
is used by non-PV codepaths.

One buggy use is in hvm/emulate.c.  An HVM guest can be in vm86 mode, and
vm86_mode() sliently omits the check.  Luckily, due to the VEX prefix decoding
logic in x86_emulate(), there is no path to the erronious use with EFLAGS_VM
set.

Another potentially problematic use is in show_guest_stack().  In principle,
show_guest_stack() is common code called for both PV and HVM vcpus.  HVM vcpus
exit early (with no reasonable way of making the code generic), making this
part a PV-only codepath.

Open-code its use in emulate.c, matching the surrounding code.  This causes
all other uses to be in PV-only codepaths, making the code to be logically
dead.  Drop it completely, to avoid future misuse.

Part of resulting cleanup removes vm86attr from read_descriptor(), although
retaining one relevant piece of information; i.e. whether we are reading a
selector for an instruction fetch, or a data fetch.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v2:
 * New
---
 xen/arch/x86/hvm/emulate.c|  2 +-
 xen/arch/x86/traps.c  | 55 ---
 xen/include/asm-x86/regs.h|  2 +-
 xen/include/asm-x86/x86_64/regs.h |  1 -
 4 files changed, 19 insertions(+), 41 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index f5ab5bc..cc0b841 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1559,7 +1559,7 @@ static int hvmemul_get_fpu(
 break;
 case X86EMUL_FPU_ymm:
 if ( !(curr->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) ||
- vm86_mode(ctxt->regs) ||
+ (ctxt->regs->eflags & X86_EFLAGS_VM) ||
  !(curr->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) ||
  !(curr->arch.xcr0 & XSTATE_SSE) ||
  !(curr->arch.xcr0 & XSTATE_YMM) )
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 6fbb1cf..8125c53 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -198,17 +198,8 @@ static void show_guest_stack(struct vcpu *v, const struct 
cpu_user_regs *regs)
 return;
 }
 
-if ( vm86_mode(regs) )
-{
-stack = (unsigned long *)((regs->ss << 4) + (regs->esp & 0x));
-printk("Guest stack trace from ss:sp = %04x:%04x (VM86)\n  ",
-   regs->ss, (uint16_t)(regs->esp & 0x));
-}
-else
-{
-stack = (unsigned long *)regs->esp;
-printk("Guest stack trace from "__OP"sp=%p:\n  ", stack);
-}
+stack = (unsigned long *)regs->esp;
+printk("Guest stack trace from "__OP"sp=%p:\n  ", stack);
 
 if ( !access_ok(stack, sizeof(*stack)) )
 {
@@ -1683,28 +1674,20 @@ static int read_descriptor(unsigned int sel,
unsigned long *base,
unsigned long *limit,
unsigned int *ar,
-   unsigned int vm86attr)
+   bool_t insn_fetch)
 {
 struct desc_struct desc;
 
-if ( !vm86_mode(regs) )
-{
-if ( sel < 4)
-desc.b = desc.a = 0;
-else if ( __get_user(desc,
-(const struct desc_struct *)(!(sel & 4)
- ? GDT_VIRT_START(v)
- : LDT_VIRT_START(v))
-+ (sel >> 3)) )
-return 0;
-if ( !(vm86attr & _SEGMENT_CODE) )
-desc.b &= ~_SEGMENT_L;
-}
-else
-{
-desc.a = (sel << 20) | 0x;
-desc.b = vm86attr | (sel >> 12);
-}
+if ( sel < 4)
+desc.b = desc.a = 0;
+else if ( __get_user(desc,
+ (const struct desc_struct *)(!(sel & 4)
+  ? GDT_VIRT_START(v)
+  : LDT_VIRT_START(v))
+ + (sel >> 3)) )
+return 0;
+if ( !insn_fetch )
+desc.b &= ~_SEGMENT_L;
 
 *ar = desc.b & 0x00f0ff00;
 if ( !(desc.b & _SEGMENT_L) )
@@ -1715,7 +1698,7 @@ static int read_descriptor(unsigned int sel,
 if ( desc.b & _SEGMENT_G )
 *limit = ((*limit + 1) << 12) - 1;
 #ifndef NDEBUG
-if ( !vm86_mode(regs) && (sel > 3) )
+if ( sel > 3 )
 {
 unsigned int a, l;
 unsigned char valid;
@@ -1805,8 +1788,7 @@ static int guest_io_okay(
 int user_mode = !(v->arch.flags & TF_kernel_mode);
 #define TOGGLE_MODE() if ( user_mode ) toggle_guest_mode(v)
 
-if ( !vm86_mode(regs) &&
- (v->arch.pv_vcpu.iopl >= (guest_kernel_mode(v, regs) ? 1 : 3)) )
+if ( v->arch.pv_vcpu.iopl >=

[Xen-devel] [PATCH for-4.7 v2 2/2] xen/x86: Introduce a new VMASSIST for architectural behaviour of iopl

2016-04-07 Thread Andrew Cooper

The existing vIOPL interface is hard to use, and need not be.

Introduce a VMASSIST with which a guest can opt-in to having vIOPL behaviour
consistenly with native hardware.

Specifically:
 - virtual iopl updated from do_iret() hypercalls.
 - virtual iopl reported in bounce frames.
 - guest kernels assumed to be level 0 for the purpose of iopl checks.

v->arch.pv_vcpu.iopl is altered to store IOPL shifted as it would exist
eflags, for the benefit of the assembly code.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v2:
 * Shift vcpu.iopl to match eflags.
 * Factor out iopl_ok().
 * Use CMOVcc in assembly code.

Along with this, I have functional tests for both vIOPL interfaces in PV
guests.
---
 xen/arch/x86/domain.c  | 15 ++-
 xen/arch/x86/physdev.c |  2 +-
 xen/arch/x86/traps.c   | 15 +--
 xen/arch/x86/x86_64/asm-offsets.c  |  3 +++
 xen/arch/x86/x86_64/compat/entry.S |  7 ++-
 xen/arch/x86/x86_64/compat/traps.c |  4 
 xen/arch/x86/x86_64/entry.S|  7 ++-
 xen/arch/x86/x86_64/traps.c|  3 +++
 xen/include/asm-x86/config.h   |  1 +
 xen/include/asm-x86/domain.h   |  4 +++-
 xen/include/public/xen.h   |  8 
 11 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a4f6db2..8c590f3 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -994,7 +994,7 @@ int arch_set_info_guest(
 init_int80_direct_trap(v);
 
 /* IOPL privileges are virtualised. */
-v->arch.pv_vcpu.iopl = (v->arch.user_regs.eflags >> 12) & 3;
+v->arch.pv_vcpu.iopl = v->arch.user_regs.eflags & X86_EFLAGS_IOPL;
 v->arch.user_regs.eflags &= ~X86_EFLAGS_IOPL;
 
 /* Ensure real hardware interrupts are enabled. */
@@ -1735,8 +1735,10 @@ static void load_segments(struct vcpu *n)
 cs_and_mask = (unsigned short)regs->cs |
 ((unsigned int)vcpu_info(n, evtchn_upcall_mask) << 16);
 /* Fold upcall mask into RFLAGS.IF. */
-eflags  = regs->_eflags & ~X86_EFLAGS_IF;
+eflags  = regs->_eflags & ~(X86_EFLAGS_IF|X86_EFLAGS_IOPL);
 eflags |= !vcpu_info(n, evtchn_upcall_mask) << 9;
+if ( VM_ASSIST(n->domain, architectural_iopl) )
+eflags |= n->arch.pv_vcpu.iopl;
 
 if ( !ring_1(regs) )
 {
@@ -1763,7 +1765,8 @@ static void load_segments(struct vcpu *n)
 vcpu_info(n, evtchn_upcall_mask) = 1;
 
 regs->entry_vector |= TRAP_syscall;
-regs->_eflags  &= 0xFFFCBEFFUL;
+regs->_eflags  &= ~(X86_EFLAGS_AC|X86_EFLAGS_VM|X86_EFLAGS_RF|
+
X86_EFLAGS_NT|X86_EFLAGS_IOPL|X86_EFLAGS_TF);
 regs->ss= FLAT_COMPAT_KERNEL_SS;
 regs->_esp  = (unsigned long)(esp-7);
 regs->cs= FLAT_COMPAT_KERNEL_CS;
@@ -1781,8 +1784,10 @@ static void load_segments(struct vcpu *n)
 ((unsigned long)vcpu_info(n, evtchn_upcall_mask) << 32);
 
 /* Fold upcall mask into RFLAGS.IF. */
-rflags  = regs->rflags & ~X86_EFLAGS_IF;
+rflags  = regs->rflags & ~(X86_EFLAGS_IF|X86_EFLAGS_IOPL);
 rflags |= !vcpu_info(n, evtchn_upcall_mask) << 9;
+if ( VM_ASSIST(n->domain, architectural_iopl) )
+rflags |= n->arch.pv_vcpu.iopl;
 
 if ( put_user(regs->ss,rsp- 1) |
  put_user(regs->rsp,   rsp- 2) |
@@ -1806,7 +1811,7 @@ static void load_segments(struct vcpu *n)
 
 regs->entry_vector |= TRAP_syscall;
 regs->rflags   &= ~(X86_EFLAGS_AC|X86_EFLAGS_VM|X86_EFLAGS_RF|
-X86_EFLAGS_NT|X86_EFLAGS_TF);
+X86_EFLAGS_NT|X86_EFLAGS_IOPL|X86_EFLAGS_TF);
 regs->ss= FLAT_KERNEL_SS;
 regs->rsp   = (unsigned long)(rsp-11);
 regs->cs= FLAT_KERNEL_CS;
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 1cb9b58..5a49796 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -529,7 +529,7 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) 
arg)
 if ( set_iopl.iopl > 3 )
 break;
 ret = 0;
-curr->arch.pv_vcpu.iopl = set_iopl.iopl;
+curr->arch.pv_vcpu.iopl = MASK_INSR(set_iopl.iopl, X86_EFLAGS_IOPL);
 break;
 }
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 8125c53..7861a3c 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1779,6 +1779,17 @@ static int read_gate_descriptor(unsigned int gate_sel,
 return 1;
 }
 
+/* Perform IOPL check between the vcpu's shadowed IOPL, and the assumed cpl. */
+static bool_t iopl_ok(const struct vcpu *v, const struct cpu_user_regs *regs)
+{
+unsigned int cpl = guest_kernel_mode(v, regs) ?
+

Re: [Xen-devel] [PATCH v2 6/6] x86/time: implement PVCLOCK_TSC_STABLE_BIT

2016-04-07 Thread Jan Beulich

>>> On 07.04.16 at 23:17,  wrote:
>> > The main
>>> difference I see between both would be the base system time: 
>>> read_platform_stime
>>> uses stime_platform_stamp as base, and computes a difference from the
>>> read_counter (i.e. rdtsc() ) with previously saved platform-wide stamp
>>> (platform_timer_stamp). get_s_time uses the stime_local_stamp (updated from
>>> stime_master_stamp on local_time_calibration) as base plus delta from 
>>> rdtsc()
>>> with local_tsc_stamp. And since this is now all TSC, and TSC monotonically
>>> increase and is synchronized across CPUs, both calls would end up returning 
>>> the
>>> same or a always up-to-date value, whether cpu_time have a larger gap or not
>>> from stime_platform_stamp. Unless the concern you are raising comes from the
>>> fact CPU 0 calibrates much sooner than the last calibrated CPU, as opposed 
>>> to
>>> roughly at the same time with std_rendezvous?
>> 
>> In a way, yes. I'm concerned by the two time stamps no longer
>> being obtained at (almost) the same time. If that's not having
>> any bad consequences, the better.
> 
> I don't think there would be bad consequences as both timestamps correspond 
> to the same time reference - thus returning always the latest system time
> irrespective of the gap between both stamps.
> 
> If you prefer I can go back with my initial approach (v1, with std_rendezvous)
> to have both timestamps closely updated. And later (post-release?) revisit 
> the introduction of nop_rendezvous. Perhaps this way is more reasonable?

Since the new mode need to be actively asked for, I don't think
that's necessary.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v4 13/14] x86/init: rename ebda code file

2016-04-07 Thread Luis R. Rodriguez

This makes it clearer what this is.

Signed-off-by: Luis R. Rodriguez 
---
 arch/x86/Makefile  | 2 +-
 arch/x86/kernel/Makefile   | 2 +-
 arch/x86/kernel/{head.c => ebda.c} | 0
 3 files changed, 2 insertions(+), 2 deletions(-)
 rename arch/x86/kernel/{head.c => ebda.c} (100%)

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index f9ed8a7ce2b6..6fce7f096b88 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -208,7 +208,7 @@ endif
 
 head-y := arch/x86/kernel/head_$(BITS).o
 head-y += arch/x86/kernel/head$(BITS).o
-head-y += arch/x86/kernel/head.o
+head-y += arch/x86/kernel/ebda.o
 head-y += arch/x86/kernel/platform-quirks.o
 
 libs-y  += arch/x86/lib/
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 7a9e44d935de..0503f5bfb18d 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -4,7 +4,7 @@
 
 extra-y:= head_$(BITS).o
 extra-y+= head$(BITS).o
-extra-y+= head.o
+extra-y+= ebda.o
 extra-y+= platform-quirks.o
 extra-y+= vmlinux.lds
 
diff --git a/arch/x86/kernel/head.c b/arch/x86/kernel/ebda.c
similarity index 100%
rename from arch/x86/kernel/head.c
rename to arch/x86/kernel/ebda.c
-- 
2.7.2


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86/emulate: Check current->arch.vm_event in hvmemul_virtual_to_linear()

2016-04-07 Thread Jan Beulich

>>> On 07.04.16 at 19:54,  wrote:
> On 04/07/16 20:27, Jan Beulich wrote:
> On 07.04.16 at 10:39,  wrote:
>>> Theoretically it is possible for mem_access_emulate_each_rep to be
>>> true even when current->arch.vm_event == NULL, so add an extra
>>> check to hvmemul_virtual_to_linear().
>> 
>> Mind saying what those theoretical conditions are when this might
>> happen?
> 
> This could happen if someone were to call xc_monitor_emulate_each_rep(),
> but not xc_monitor_enable() (when current->arch.vm_event gets
> allocated), or after someone called both, but afterwards called
> xc_monitor_disable() (when current->arch.vm_event gets freed).

Then wouldn't the correct action be to fail
xc_monitor_emulate_each_rep() (i.e. whatever hypercall this
resolves to) when the monitor is not enabled? (You did already
clarify that the other variant isn't applicable)?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 6/6] x86/time: implement PVCLOCK_TSC_STABLE_BIT

2016-04-07 Thread Joao Martins

>> The main
>> difference I see between both would be the base system time: 
>> read_platform_stime
>> uses stime_platform_stamp as base, and computes a difference from the
>> read_counter (i.e. rdtsc() ) with previously saved platform-wide stamp
>> (platform_timer_stamp). get_s_time uses the stime_local_stamp (updated from
>> stime_master_stamp on local_time_calibration) as base plus delta from 
>> rdtsc()
>> with local_tsc_stamp. And since this is now all TSC, and TSC monotonically
>> increase and is synchronized across CPUs, both calls would end up returning 
>> the
>> same or a always up-to-date value, whether cpu_time have a larger gap or not
>> from stime_platform_stamp. Unless the concern you are raising comes from the
>> fact CPU 0 calibrates much sooner than the last calibrated CPU, as opposed 
>> to
>> roughly at the same time with std_rendezvous?
> 
> In a way, yes. I'm concerned by the two time stamps no longer
> being obtained at (almost) the same time. If that's not having
> any bad consequences, the better.

I don't think there would be bad consequences as both timestamps correspond to
the same time reference - thus returning always the latest system time
irrespective of the gap between both stamps.

If you prefer I can go back with my initial approach (v1, with std_rendezvous)
to have both timestamps closely updated. And later (post-release?) revisit the
introduction of nop_rendezvous. Perhaps this way is more reasonable?

Joao

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4 06/14] x86/init: use a platform legacy quirk for ebda

2016-04-07 Thread Luis R. Rodriguez

On Thu, Apr 07, 2016 at 10:44:50AM +0100, David Vrabel wrote:
> On 07/04/16 01:06, Luis R. Rodriguez wrote:
> > 
> > --- a/arch/x86/kernel/platform-quirks.c
> > +++ b/arch/x86/kernel/platform-quirks.c
> > @@ -7,8 +7,12 @@
> >  void __init x86_early_init_platform_quirks(void)
> >  {
> > x86_platform.legacy.rtc = 1;
> > +   x86_platform.legacy.ebda_search = 0;
> 
> You should make the default the setting for regular PC hardware, as you
> have done for the .rtc bit.

I went with with Ingo's suggested template [0]. Ingo, let me know what
you prefer.

[0] http://lkml.kernel.org/r/20160407205902.gs1...@wotan.suse.de

  Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 01/11] x86/boot: enumerate documentation for the x86 hardware_subarch

2016-04-07 Thread Luis R. Rodriguez

David, please note below the highlighted code.

On Wed, Feb 24, 2016 at 09:32:59AM +0100, Ingo Molnar wrote:
> 
> * Luis R. Rodriguez  wrote:
> For hard coded platform quirks I'd suggest we add x86_platform.quirks flags. 
> For 
> example the F00F hack for Xen could be done via:
> 
>   x86_platform.quirks.idt_remap = 0;
> 
> and would be set like this during early init:
> 
> void early_init_platform_quirks(void)
> {
>   x86_platform.legacy.ebda_search = 0;
>   x86_platform.quirks.idt_remap = 1;
> 
>   switch (boot_params.hdr.hardware_subarch) {
>   case X86_SUBARCH_PC:
>   x86_platform.legacy.ebda_search = 1;
>   break;
>   case X86_SUBARCH_XEN:
>   x86_platform.quirks.idt_remap = 0;
>   break;
>   case X86_SUBARCH_LGUEST:
>   x86_platform.quirks.idt_remap = 0;
>   break;
>   }
> }
> 
> And if also add the legacy RTC flag, it becomes:
> 
> void early_init_hardcoded_platform_quirks(void)
> {
>   x86_platform.legacy.ebda_search = 0;
>   x86_platform.quirks.idt_remap = 1;
>   x86_platform.legacy.rtc = 1;
> 
>   switch (boot_params.hdr.hardware_subarch) {
>   case X86_SUBARCH_PC:
>   x86_platform.legacy.ebda_search = 1;
>   break;
>   case X86_SUBARCH_XEN:
>   x86_platform.quirks.idt_remap = 0;
>   x86_platform.legacy.rtc = 0;
>   break;
>   case X86_SUBARCH_LGUEST:
>   x86_platform.quirks.idt_remap = 0;
>   x86_platform.legacy.rtc = 0;
>   break;
>   }
> }
> 
> Note that both opt-in and opt-out quirks/legacies are possible this way, and 
> note 
> how cleanly and consistently it's all organized: setup of all hard coded 
> legacies/quirks is concentrated in a single function, and the actual usage 
> sites 
> don't know anything about subarchitectures.

<-- snip -- > 

So.. I went with Ingo's template.

> Furthermore we should probably move a few other existing legacies to this 
> flag 
> space as well, to make all this more consistent.

And this suggestion should explain a bit of the effort I put into making
room for other legacy things, which I'll elaborate in the other thread.

  Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH for-4.7] xen/build: Fix build with Clang

2016-04-07 Thread Doug Goldstein

On 4/7/16 2:21 PM, Jan Beulich wrote:
 On 07.04.16 at 21:16,  wrote:
>> On 07/04/16 20:12, Jan Beulich wrote:
>> On 07.04.16 at 20:46,  wrote:
 --- a/xen/Rules.mk
 +++ b/xen/Rules.mk
 @@ -50,9 +50,15 @@ ALL_OBJS-$(CONFIG_X86)   += $(BASEDIR)/crypto/built_in.o
  CFLAGS += -nostdinc -fno-builtin -fno-common
  CFLAGS += -Werror -Wredundant-decls -Wno-pointer-arith
  CFLAGS += -pipe -g -D__XEN__ -include $(BASEDIR)/include/xen/config.h
 -CFLAGS += -Wa,--strip-local-absolute
  CFLAGS += '-D__OBJECT_FILE__="$@"'
  
 +ifneq ($(clang),y)
 +# Clang doesn't understand this command line argument, and doesn't appear 
 to
 +# have an suitable alternative.  The resulting compiled binary does 
 function,
 +# but has an excessively large symbol table.
 +CFLAGS += -Wa,--strip-local-absolute
 +endif
>>> Well, that's the brute force undo-it-altogether-for-clang approach
>>> that I think Doug had also considered. You may have seen the
>>> discussion (on irc iirc) - I'd really like to see the option still getting
>>> passed to gas (for all the .S files) even when using clang. Would
>>> that really be hard to arrange for?
>>
>> That won't fix the fact that all the .c files which include
>> cpufeatureset.h also gets the absolute symbols, to allow the
>> alternatives() blocks to compile.
> 
> That's understood.
> 
>> It will complicate the clang build quite a bit, and won't make much of a
>> dent on the symbol table bloat.
> 
> While this I'm unclear about: Istr Doug mentioning that simply
> adding the option in suitable for to AFLAGS would do.
> 
> Jan
> 

I was trying to do exactly what you mentioned where we still passed it
to gas and didn't pass it to llvm but unfortunately at some point the
flags get combined together and passed to llvm and fails.

-- 
Doug Goldstein



signature.asc
Description: OpenPGP digital signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 08/24] xsplice: Add helper elf routines

2016-04-07 Thread Konrad Rzeszutek Wilk

On Thu, Apr 07, 2016 at 05:19:37PM +0100, Ian Jackson wrote:
> Konrad Rzeszutek Wilk writes ("[PATCH v6 08/24] xsplice: Add helper elf 
> routines"):
> > From: Ross Lagerwall 
> > 
> > Add Elf routines and data structures in preparation for loading an
> > xSplice payload.
> > 
> > We make an assumption that the max number of sections an ELF payload
> > can have is 64. We can in future make this be dependent on the
> > names of the sections and verifying against a list, but for right now
> > this suffices.
> > 
> > Also we a whole lot of checks to make sure that the ELF payload
> > file is not corrupted nor that the offsets point past the file.
> 
> This is good, but: ideally I would like to avoid conducting a detailed
> security review of this code.
> 
> My understanding of this is that the purpose of this machinery is to
> supply binary runtime patches to the hypervisor.  So I think someone
> who can inject malicious xsplice payloads can already control the
> host.  Is that right ?

The payload could be just fine from an ELF perspective and
insert an patch that immediately calls BUG_ON().

> 
> If so then bugs in this loader cannot be any security impact.

Yes.
> 
> It might be worth mentioning somewhere that this loader must not be
> used for xsplice payloads for guest kernels.

How "fun" would that be! Also I do want signature checking on
the payloads so at least we would only load ones that are trusted
from a vendor. But that is v2 goal.

> 
> Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 08/24] xsplice: Add helper elf routines

2016-04-07 Thread Andrew Cooper

On 07/04/16 17:19, Ian Jackson wrote:
> Konrad Rzeszutek Wilk writes ("[PATCH v6 08/24] xsplice: Add helper elf 
> routines"):
>> From: Ross Lagerwall 
>>
>> Add Elf routines and data structures in preparation for loading an
>> xSplice payload.
>>
>> We make an assumption that the max number of sections an ELF payload
>> can have is 64. We can in future make this be dependent on the
>> names of the sections and verifying against a list, but for right now
>> this suffices.
>>
>> Also we a whole lot of checks to make sure that the ELF payload
>> file is not corrupted nor that the offsets point past the file.
> This is good, but: ideally I would like to avoid conducting a detailed
> security review of this code.
>
> My understanding of this is that the purpose of this machinery is to
> supply binary runtime patches to the hypervisor.  So I think someone
> who can inject malicious xsplice payloads can already control the
> host.  Is that right ?

Correct.

>
> If so then bugs in this loader cannot be any security impact.

I agree.

The reason for the checks is so Xen doesn't accidentally fall over a
malformed ELF.  Earlier versions of this patch were a bit too lax in
trusting the integrity of the ELF image for my liking, which is why I
specifically asked for better verification.

> It might be worth mentioning somewhere that this loader must not be
> used for xsplice payloads for guest kernels.

I don't see how this is related.  If the host admin wanted to patch
guest kernels without using the kernels internal self-patching
mechanism, it would be infinitely easier to do the patching from dom0,
using toolstack mapping powers.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 06/24] x86: Alter nmi_callback_t typedef

2016-04-07 Thread Andrew Cooper

On 07/04/16 04:49, Konrad Rzeszutek Wilk wrote:
> Drop paranthesis and function pointer on nmi_callback_t typedef.
>
> Make it more inline with how x86 maintainers want function
> typedefs to be.
>
> Signed-off-by: Konrad Rzeszutek Wilk 

I don't see the point, but this doesn't introduce any problems.

Acked-by: Andrew Cooper 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 05/24] arm/x86: Use struct virtual_region to do bug, symbol, and (x86) exception tables lookup.

2016-04-07 Thread Andrew Cooper

On 07/04/16 04:49, Konrad Rzeszutek Wilk wrote:
> @@ -80,8 +82,12 @@ search_one_table(const struct exception_table_entry *first,
>  unsigned long
>  search_exception_table(unsigned long addr)
>  {
> -return search_one_table(
> -__start___ex_table, __stop___ex_table-1, addr);
> +const struct virtual_region *region = find_text_region(addr);
> +
> +if ( region && region->ex )
> +return search_one_table(region->ex, region->ex_end-1, addr);

As you are moving the line, style for "end - 1".

> +void unregister_virtual_region(struct virtual_region *r)
> +{
> +/* Expected to be called from xSplice - which has IRQs disabled. */
> +ASSERT(!local_irq_is_enabled());
> +
> +remove_virtual_region(r);
> +}
> +
> +void unregister_init_virtual_region(void)

__init

> diff --git a/xen/include/xen/virtual_region.h 
> b/xen/include/xen/virtual_region.h
> new file mode 100644
> index 000..71ab4bf
> --- /dev/null
> +++ b/xen/include/xen/virtual_region.h
> @@ -0,0 +1,47 @@
> +/*
> + * Copyright (c) 2016 Oracle and/or its affiliates. All rights reserved.
> + *
> + */
> +
> +#ifndef __XEN_VIRTUAL_REGION__
> +#define __XEN_VIRTUAL_REGION__
> +
> +#include 
> +#include 
> +
> +struct virtual_region
> +{
> +struct list_head list;
> +unsigned long start;/* Virtual address start. */
> +unsigned long end;  /* Virtual address start. */

Both start eh?

No major issues.  Once fixed, Reviewed-by: Andrew Cooper


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 03/24] libxc: Implementation of XEN_XSPLICE_op in libxc

2016-04-07 Thread Andrew Cooper

On 07/04/16 04:49, Konrad Rzeszutek Wilk wrote:
> diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
> index 124537b..e09ac90 100644
> --- a/tools/libxc/xc_misc.c
> +++ b/tools/libxc/xc_misc.c
> @@ -693,6 +693,343 @@ int xc_hvm_inject_trap(
>  return rc;
>  }
>  
> +int xc_xsplice_upload(xc_interface *xch,
> +  char *name,
> +  unsigned char *payload,
> +  uint32_t size)
> +{
> +int rc;
> +DECLARE_SYSCTL;
> +DECLARE_HYPERCALL_BUFFER(char, local);
> +DECLARE_HYPERCALL_BOUNCE(name, 0 /* later */, 
> XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +xen_xsplice_name_t def_name = { .pad = { 0, 0, 0 } };
> +
> +if ( !name || !payload )
> +return -1;

Its reasonable to trust (or assert()) that these pointers are not NULL.

Please however don't return -1 without setting errno.  We already have
too many "failed: Success" cases in libxc.  If you do want to keep this,
then EINVAL.

> +
> +def_name.size = strlen(name) + 1;
> +if ( def_name.size > XEN_XSPLICE_NAME_SIZE )

errno = EINVAL

> +return -1;
> +
> +HYPERCALL_BOUNCE_SET_SIZE(name, def_name.size);
> +
> +if ( xc_hypercall_bounce_pre(xch, name) )
> +return -1;
> +
> +local = xc_hypercall_buffer_alloc(xch, local, size);
> +if ( !local )
> +{
> +xc_hypercall_bounce_post(xch, name);
> +return -1;
> +}
> +memcpy(local, payload, size);
> +
> +sysctl.cmd = XEN_SYSCTL_xsplice_op;
> +sysctl.u.xsplice.cmd = XEN_SYSCTL_XSPLICE_UPLOAD;
> +sysctl.u.xsplice.pad = 0;
> +sysctl.u.xsplice.u.upload.size = size;
> +set_xen_guest_handle(sysctl.u.xsplice.u.upload.payload, local);
> +
> +sysctl.u.xsplice.u.upload.name = def_name;
> +set_xen_guest_handle(sysctl.u.xsplice.u.upload.name.name, name);
> +
> +rc = do_sysctl(xch, );
> +
> +xc_hypercall_buffer_free(xch, local);
> +xc_hypercall_bounce_post(xch, name);
> +
> +return rc;
> +}
> +
> +int xc_xsplice_get(xc_interface *xch,
> +   char *name,
> +   xen_xsplice_status_t *status)
> +{
> +int rc;
> +DECLARE_SYSCTL;
> +DECLARE_HYPERCALL_BOUNCE(name, 0 /*adjust later */, 
> XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +xen_xsplice_name_t def_name = { .pad = { 0, 0, 0 } };
> +
> +if ( !name )

EINVAL

> +return -1;
> +
> +def_name.size = strlen(name) + 1;
> +if ( def_name.size > XEN_XSPLICE_NAME_SIZE )

EINVAL

> +return -1;
> +
> +HYPERCALL_BOUNCE_SET_SIZE(name, def_name.size);
> +
> +if ( xc_hypercall_bounce_pre(xch, name) )
> +return -1;
> +
> +sysctl.cmd = XEN_SYSCTL_xsplice_op;
> +sysctl.u.xsplice.cmd = XEN_SYSCTL_XSPLICE_GET;
> +sysctl.u.xsplice.pad = 0;
> +
> +sysctl.u.xsplice.u.get.status.state = 0;
> +sysctl.u.xsplice.u.get.status.rc = 0;
> +
> +sysctl.u.xsplice.u.get.name = def_name;
> +set_xen_guest_handle(sysctl.u.xsplice.u.get.name.name, name);
> +
> +rc = do_sysctl(xch, );
> +
> +xc_hypercall_bounce_post(xch, name);
> +
> +memcpy(status, , sizeof(*status));
> +
> +return rc;
> +}
> +
> +/*
> + * The heart of this function is to get an array of xen_xsplice_status_t.
> + *
> + * However it is complex because it has to deal with the hypervisor
> + * returning some of the requested data or data being stale
> + * (another hypercall might alter the list).
> + *
> + * The parameters that the function expects to contain data from
> + * the hypervisor are: 'info', 'name', and 'len'. The 'done' and
> + * 'left' are also updated with the number of entries filled out
> + * and respectively the number of entries left to get from hypervisor.
> + *
> + * It is expected that the caller of this function will take the
> + * 'left' and use the value for 'start'. This way we have an
> + * cursor in the array. Note that the 'info','name', and 'len' will
> + * be updated at the subsequent calls.
> + *
> + * The 'max' is to be provided by the caller with the maximum
> + * number of entries that 'info', 'name', and 'len' arrays can
> + * be filled up with.
> + *
> + * Each entry in the 'name' array is expected to be of XEN_XSPLICE_NAME_SIZE
> + * length.
> + *
> + * Each entry in the 'info' array is expected to be of xen_xsplice_status_t
> + * structure size.
> + *
> + * Each entry in the 'len' array is expected to be of uint32_t size.
> + *
> + * The return value is zero if the hypercall completed successfully.
> + * Note that the return value is _not_ the amount of entries filled
> + * out - that is saved in 'done'.
> + *
> + * If there was an error performing the operation, the return value
> + * will contain an negative -EXX type value. The 'done' and 'left'
> + * will contain the number of entries that had been succesfully
> + * retrieved (if any).
> + */
> +int xc_xsplice_list(xc_interface *xch, unsigned int max, unsigned int start,
> +xen_xsplice_status_t *info,
> +char *name,

Re: [Xen-devel] [PATCH for-4.7] xen/build: Fix build with Clang

2016-04-07 Thread Andrew Cooper

On 07/04/16 20:21, Jan Beulich wrote:
 On 07.04.16 at 21:16,  wrote:
>> On 07/04/16 20:12, Jan Beulich wrote:
>> On 07.04.16 at 20:46,  wrote:
 --- a/xen/Rules.mk
 +++ b/xen/Rules.mk
 @@ -50,9 +50,15 @@ ALL_OBJS-$(CONFIG_X86)   += $(BASEDIR)/crypto/built_in.o
  CFLAGS += -nostdinc -fno-builtin -fno-common
  CFLAGS += -Werror -Wredundant-decls -Wno-pointer-arith
  CFLAGS += -pipe -g -D__XEN__ -include $(BASEDIR)/include/xen/config.h
 -CFLAGS += -Wa,--strip-local-absolute
  CFLAGS += '-D__OBJECT_FILE__="$@"'
  
 +ifneq ($(clang),y)
 +# Clang doesn't understand this command line argument, and doesn't appear 
 to
 +# have an suitable alternative.  The resulting compiled binary does 
 function,
 +# but has an excessively large symbol table.
 +CFLAGS += -Wa,--strip-local-absolute
 +endif
>>> Well, that's the brute force undo-it-altogether-for-clang approach
>>> that I think Doug had also considered. You may have seen the
>>> discussion (on irc iirc) - I'd really like to see the option still getting
>>> passed to gas (for all the .S files) even when using clang. Would
>>> that really be hard to arrange for?
>> That won't fix the fact that all the .c files which include
>> cpufeatureset.h also gets the absolute symbols, to allow the
>> alternatives() blocks to compile.
> That's understood.
>
>> It will complicate the clang build quite a bit, and won't make much of a
>> dent on the symbol table bloat.
> While this I'm unclear about: Istr Doug mentioning that simply
> adding the option in suitable for to AFLAGS would do.

In which case I clearly missed that bit on IRC.

I am happy to defer to Doug's solution if it actually resolves the
underlying issue.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] live migration from legacy to staging fails

2016-04-07 Thread Andrew Cooper

On 07/04/16 16:40, Olaf Hering wrote:
> convert-legacy-stream.py fails to receive a HVM domU coming from xen-4.5.
> It runs into legacy.CHUNK_enable_verify_mode "Unable to convert a debug
> stream", but only if the '--debug' knob was passed to 'xl migrate'.
> The comment there is clearly wrong.
>
> How is this supposed to be handled? I think --debug is supposed to
> enable just all the DPRINTKs, nothing else.

Sorry - I have been having email problems today.  I see have already
found and addressed the issue.

To answer this question, --debug does more than just increase the
verbosity.  Once the live phase is complete, it sends the verify_mode
marker, then resends the entire memory contents.  (Once the legacy
sending side was complete, it reliably fell over a SIGSEGV for me, and
ended up terminating the migration due to an early eof on the fd.)

The receiving side, on receipt of verify_mode, switches from memcpy() to
memcmp() to verify that the memory it received during the live phase
matches up.

The problematic issue is that any domain with PV rings will expect to
see memory corruption according to verify mode, and this is expected
behaviour due to {net,blk}backends continuing to process requests after
the domain has paused.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH for-4.7] xen/build: Fix build with Clang

2016-04-07 Thread Jan Beulich

>>> On 07.04.16 at 21:16,  wrote:
> On 07/04/16 20:12, Jan Beulich wrote:
> On 07.04.16 at 20:46,  wrote:
>>> --- a/xen/Rules.mk
>>> +++ b/xen/Rules.mk
>>> @@ -50,9 +50,15 @@ ALL_OBJS-$(CONFIG_X86)   += $(BASEDIR)/crypto/built_in.o
>>>  CFLAGS += -nostdinc -fno-builtin -fno-common
>>>  CFLAGS += -Werror -Wredundant-decls -Wno-pointer-arith
>>>  CFLAGS += -pipe -g -D__XEN__ -include $(BASEDIR)/include/xen/config.h
>>> -CFLAGS += -Wa,--strip-local-absolute
>>>  CFLAGS += '-D__OBJECT_FILE__="$@"'
>>>  
>>> +ifneq ($(clang),y)
>>> +# Clang doesn't understand this command line argument, and doesn't appear 
>>> to
>>> +# have an suitable alternative.  The resulting compiled binary does 
>>> function,
>>> +# but has an excessively large symbol table.
>>> +CFLAGS += -Wa,--strip-local-absolute
>>> +endif
>> Well, that's the brute force undo-it-altogether-for-clang approach
>> that I think Doug had also considered. You may have seen the
>> discussion (on irc iirc) - I'd really like to see the option still getting
>> passed to gas (for all the .S files) even when using clang. Would
>> that really be hard to arrange for?
> 
> That won't fix the fact that all the .c files which include
> cpufeatureset.h also gets the absolute symbols, to allow the
> alternatives() blocks to compile.

That's understood.

> It will complicate the clang build quite a bit, and won't make much of a
> dent on the symbol table bloat.

While this I'm unclear about: Istr Doug mentioning that simply
adding the option in suitable for to AFLAGS would do.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [xen-unstable-smoke test] 89383: tolerable FAIL - PUSHED

2016-04-07 Thread osstest service owner

flight 89383 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/89383/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 build-amd64-libvirt   5 libvirt-buildfail   like 89344

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  c63a6ee79cbe45b8f8092543a110361a66d50ca9
baseline version:
 xen  b5836f71e186c891231def53b415b3f340306613

Last test of basis89344  2016-04-07 11:12:33 Z0 days
Testing same since89383  2016-04-07 17:02:00 Z0 days1 attempts


People who touched revisions under test:
  Chunyan Liu 
  Ian Jackson 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  fail
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=c63a6ee79cbe45b8f8092543a110361a66d50ca9
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
c63a6ee79cbe45b8f8092543a110361a66d50ca9
+ branch=xen-unstable-smoke
+ revision=c63a6ee79cbe45b8f8092543a110361a66d50ca9
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.6-testing
+ '[' xc63a6ee79cbe45b8f8092543a110361a66d50ca9 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;

Re: [Xen-devel] [PATCH for-4.7] xen/build: Fix build with Clang

2016-04-07 Thread Andrew Cooper

On 07/04/16 20:12, Jan Beulich wrote:
 On 07.04.16 at 20:46,  wrote:
>> The clang build already has many duplicate symbols for some reason I have yet
>> to identify, e.g.
>>
>>   Duplicate symbol 'asid.c#get_cpu_info' (82d0801e6840 != 
>> 82d0801c8190)
>>   Duplicate symbol 'ats.c#__list_add' (82d08015b900 != 82d0801546a0)
>>   Duplicate symbol 'common.c#clear_bit' (82d080213560 != 
>> 82d0801baf10)
>>   Duplicate symbol 'common.c#constant_test_bit' (82d080213550 != 
>> 82d0801ba750)
>>   Duplicate symbol 'common.c#cpumask_check' (82d080218c50 != 
>> 82d0801baf20)
>>   Duplicate symbol 'common.c#cpumask_clear_cpu' (82d080214990 != 
>> 82d0801bae40)
>>   Duplicate symbol 'common.c#get_cpu_info' (82d080212210 != 
>> 82d0801bad20)
>>
>> The resulting binary does function.  Someone with more time can investigate
>> making symbol handling work better with Clang
> I'd guess that's because they don't inline functions as aggressively
> as gcc does.
>
>> --- a/xen/Rules.mk
>> +++ b/xen/Rules.mk
>> @@ -50,9 +50,15 @@ ALL_OBJS-$(CONFIG_X86)   += $(BASEDIR)/crypto/built_in.o
>>  CFLAGS += -nostdinc -fno-builtin -fno-common
>>  CFLAGS += -Werror -Wredundant-decls -Wno-pointer-arith
>>  CFLAGS += -pipe -g -D__XEN__ -include $(BASEDIR)/include/xen/config.h
>> -CFLAGS += -Wa,--strip-local-absolute
>>  CFLAGS += '-D__OBJECT_FILE__="$@"'
>>  
>> +ifneq ($(clang),y)
>> +# Clang doesn't understand this command line argument, and doesn't appear to
>> +# have an suitable alternative.  The resulting compiled binary does 
>> function,
>> +# but has an excessively large symbol table.
>> +CFLAGS += -Wa,--strip-local-absolute
>> +endif
> Well, that's the brute force undo-it-altogether-for-clang approach
> that I think Doug had also considered. You may have seen the
> discussion (on irc iirc) - I'd really like to see the option still getting
> passed to gas (for all the .S files) even when using clang. Would
> that really be hard to arrange for?

That won't fix the fact that all the .c files which include
cpufeatureset.h also gets the absolute symbols, to allow the
alternatives() blocks to compile.

It will complicate the clang build quite a bit, and won't make much of a
dent on the symbol table bloat.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] tools: handle xl migrate --debug in legacy stream

2016-04-07 Thread Andrew Cooper

On 07/04/16 17:31, Olaf Hering wrote:
> Doing a 'xl migrate --debug domU host' on xen-4.5 adds a
> XC_SAVE_ID_ENABLE_VERIFY_MODE marker, which is not handled.
> Since using --debug is valid usage, handle it by logging the fact
> instead of aborting the migration.
>
> Signed-off-by: Olaf Hering 
> Cc: Ian Jackson 
> Cc: Wei Liu 

Hmm - I had never considered someone doing that.  Debug mode for legacy
migration was sufficiently broken (in a falling over wild pointers kind
of way) that I assumed noone ever used it.

But yes - not needlessly killing the migration is an improvement.

Reviewed-by: Andrew Cooper 

Ian: This is also a backport candidate for 4.6

> ---
>  tools/python/scripts/convert-legacy-stream | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/tools/python/scripts/convert-legacy-stream 
> b/tools/python/scripts/convert-legacy-stream
> index 41fee10..5f80f13 100755
> --- a/tools/python/scripts/convert-legacy-stream
> +++ b/tools/python/scripts/convert-legacy-stream
> @@ -389,8 +389,7 @@ def read_chunks(vm):
>  write_page_data(pfns, pages)
>  
>  elif marker == legacy.CHUNK_enable_verify_mode:
> -# For debugging purposes only.  Will not be seen in real 
> migration
> -raise RuntimeError("Unable to convert a debug stream")
> +info("This is a debug stream")
>  
>  elif marker == legacy.CHUNK_vcpu_info:
>  max_id, = unpack_exact("i")


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] HVMLite / PVHv2 - using x86 EFI boot entry

2016-04-07 Thread Luis R. Rodriguez

On Wed, Apr 06, 2016 at 01:11:30PM +0200, Daniel Kiper wrote:
> On Wed, Apr 06, 2016 at 04:40:27AM +0200, Luis R. Rodriguez wrote:
> > Boris sent out the first HVMLite series of patches to add a new Xen guest 
> > type
> > February 1, 2016 [0]. We've been talking off list with a few folks now over
> > the prospect of instead of adding yet-another-boot-entry we instead fixate
> > HVMLite to use the x86 EFI boot entry. There's a series of reasons to 
> > consider
> > this, likewise there are reasons to question the effort required and if its
> > really needed. We'd like some more public review of this proposal, and see 
> > if
> > others can come up with other ideas, both in favor or against this proposal.
> >
> > This in particular is also a good time to get x86 Linux folks to chime on on
> > the general design proposal of HVMLite design, given that outside of the 
> > boot
> > entry discussion it would seem including myself that we didn't get the memo
> > over the proposed architecture review [1]. At least on my behalf perhaps the
> > only sticking thorns of the design was the new boot entry, which came to me
> > as a surprise, and this thread addresses and the lack of addressing 
> > semantics
> > for early boot (which we may seem to need to address; some of this is being
> > addressing in parallels through other work). The HVMLite document talks 
> > about
> > using ACPI_FADT_NO_VGA -- we don't use this yet upstream but I have some 
> > pending
> > changes which should make it easy to integrate its use on HVMLite. Perhaps
> > there are others that may have some other points they may want to raise 
> > now...
> >
> > A huge summary of the discussion over EFI boot option for HVMLite is now on 
> > a
> > wiki [2], below I'll just provide the outline of the discussion. Consider 
> > this a
> > request for more public review, feel free to take any of the items below and
> > elaborate on it as you see fit.
> >
> > Worth mentioning also is that this topic will be discussed at the 2016 Xen
> > Hackathon April 18-19 [3] at the ARM Cambridge, UK Headquarters so if you 
> > can
> > attend and this topic interests you, consider attending.
> 
> I hope that you will be there as one of the biggest proponents of EFI entry 
> point.

It would be a last minute trip to prepare for...

> If you does not it will be difficult or impossible to discuss this issue 
> without you.
> In the worst case I can raise this topic on behalf of you and then we should 
> organize
> phone call if possible (and accepted by others). However, to do that I must 
> know your
> plans in advance.

I understand, I'd like to make it clear I am taking simply a neutral position
on this topic, even though it may seem I'm a die-hard on this idea, this was
simply an architectural question that came up, and I have been just
dissatisfied with the answers against the architectural questions I had over
this.

To help better evaluate how neutral really a discussion like this can be
can someone please help chime in on the question of if there are pressures to
just complete HVMLite design already ? How strong are those ? Are we really
able to have a very neutral technical discussion on this ?

  Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH for-4.7] xen/build: Fix build with Clang

2016-04-07 Thread Jan Beulich

>>> On 07.04.16 at 20:46,  wrote:
> The clang build already has many duplicate symbols for some reason I have yet
> to identify, e.g.
> 
>   Duplicate symbol 'asid.c#get_cpu_info' (82d0801e6840 != 
> 82d0801c8190)
>   Duplicate symbol 'ats.c#__list_add' (82d08015b900 != 82d0801546a0)
>   Duplicate symbol 'common.c#clear_bit' (82d080213560 != 82d0801baf10)
>   Duplicate symbol 'common.c#constant_test_bit' (82d080213550 != 
> 82d0801ba750)
>   Duplicate symbol 'common.c#cpumask_check' (82d080218c50 != 
> 82d0801baf20)
>   Duplicate symbol 'common.c#cpumask_clear_cpu' (82d080214990 != 
> 82d0801bae40)
>   Duplicate symbol 'common.c#get_cpu_info' (82d080212210 != 
> 82d0801bad20)
> 
> The resulting binary does function.  Someone with more time can investigate
> making symbol handling work better with Clang

I'd guess that's because they don't inline functions as aggressively
as gcc does.

> --- a/xen/Rules.mk
> +++ b/xen/Rules.mk
> @@ -50,9 +50,15 @@ ALL_OBJS-$(CONFIG_X86)   += $(BASEDIR)/crypto/built_in.o
>  CFLAGS += -nostdinc -fno-builtin -fno-common
>  CFLAGS += -Werror -Wredundant-decls -Wno-pointer-arith
>  CFLAGS += -pipe -g -D__XEN__ -include $(BASEDIR)/include/xen/config.h
> -CFLAGS += -Wa,--strip-local-absolute
>  CFLAGS += '-D__OBJECT_FILE__="$@"'
>  
> +ifneq ($(clang),y)
> +# Clang doesn't understand this command line argument, and doesn't appear to
> +# have an suitable alternative.  The resulting compiled binary does function,
> +# but has an excessively large symbol table.
> +CFLAGS += -Wa,--strip-local-absolute
> +endif

Well, that's the brute force undo-it-altogether-for-clang approach
that I think Doug had also considered. You may have seen the
discussion (on irc iirc) - I'd really like to see the option still getting
passed to gas (for all the .S files) even when using clang. Would
that really be hard to arrange for?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] HVMLite / PVHv2 - using x86 EFI boot entry

2016-04-07 Thread Luis R. Rodriguez

On Wed, Apr 06, 2016 at 12:07:36PM +0100, George Dunlap wrote:
> On Wed, Apr 6, 2016 at 3:40 AM, Luis R. Rodriguez  wrote:
> > A huge summary of the discussion over EFI boot option for HVMLite is now on 
> > a
> > wiki [2], below I'll just provide the outline of the discussion. Consider 
> > this a
> > request for more public review, feel free to take any of the items below and
> > elaborate on it as you see fit.
> [snip]
> >   * Issues with boot x86 boot entries
> > * Small x86 zero page stubs
> [snip]
> >   * Points against using EFI
> > * Nulling the claimed boot loader effect
> 
> I'm a bit confused about this. You list exactly two arguments against
> the proposed stub in the "con" section:
> 1. Bootloaders may not be able to use the extra entry point
> 2. It's an extra entry point
> 
> And then later, in another section, you actually list the reason #1 is
> irrelevant: bootloaders don't matter because the stub is there to boot
> from the Xen hypervisor.

Forgive me, the private thread was ongoing and I really wanted to capture
both sides of the expressed arguments and move to the list any extensions
to the discussion, this meant annotating both positions and letting
others fill in the gaps to determine if in fact one position was really
nullified by the other.

First I should state that it is only natural for anyone sensible to have
any type of knee-jerk reaction to kick and scream about the idea of
adding yet a new x86 entry point for Linux... IMO one should not expect
it to be sensible to simply accept yet-another-entry-point to Linux,
rather is should be the expected behaviour to have people really dig
and ensure they did their homework to ensure that if they are going to
add yet-another-entry-point they really validate and have exhausted
review of all possible avenues.

It was Andrew Coopers's position that boot loaders would not need to be
involved, and that would seem to nullify Matt's original position on this.

While Andrew's position is right in that perhaps only Xen tools have to deal
with the HVMLite specific entry, it would also still mean diverging from ARM's
own EFI entry only position, which I'd like to clarify that ARM has no custom
Xen entry, we should strive to match that. Anything far from that to me really
deserves an explanation, specially if we are going to argue that HVMLite is
the best that x86 Xen can do.

Ultimately unifying entry approaches for Xen in a streamlined fashion seems
like a sensible thing to strive for. Anything we push in the other direction,
as small as it can be, should deserve at least a 'hey, wait a minute'...

> So the only actual argument you have against the proposed PVH stub in
> the linked document is that it's an extra entry point.

Then you have not really read the document well, more to the point,
EFI's entry already does what the small HVMLite stub does, already
provides an existing entry and path to the kernel, so why should we
add yet another small stub?

So more to it, if the EFI entry already provides a way into Linux
in a more streamlined fashion bringing it closer to the bare metal
boot entry, why *would* we add another boot entry to x86, even if
its small and self contained ?

Another position against small stubs which I listed myself is that we may need
more semantics for early boot even if the new HVMLite small stub is added. This
remains to be seen. If we are going to add new semantics, it would seem best to
use something more standard like EFI configuration tables rather than hack on
to x86 further custom semantics. Custom sloppy semantics have proven to be
misused, and were ultimately a sloppy mess. To take this further,
virtualization semantics are being abused even outside of Xen -- drivers
developers may think that just because some semantics are available they can
use them to customize drivers to fine tune them for virtualized environments.
Even the best of our folks have taken positions to claim certain hacks are
*impossible* to change [0], when in fact only 4 days later a completely sensible
replacement was found [1], and this as even outside of Xen's situation, so its
not only Xen I am careful over here with regards to semantics. If we need early
boot code semantics or general kernel semantics for virtualization I want to
address that now and I want to be very careful with that given the abuse.
I'm doing my part to ensure that we clarify sloppy old semantics on Xen [2],
and this effort is actually proving to even pave the path for HVMLite, for
instance consider the gains of leveraging use of the legacy devices struct
in the future for ACPI_FADT_NO_VGA now, which HVMLite's specification seems
to annotate it will use. Clearing out the paravirt_enabled() hack for
pnpbios helped push for a right architectural solution to pave the path
for this in generic fashion.

[0] http://lkml.kernel.org/r/s5hvb4151v1.wl-ti...@suse.de
[1] https://www.spinics.net/lists/alsa-devel/msg48627.html
[2]

Re: [Xen-devel] [for-4.7] xen/arm64: correctly emulate the {w, x}zr registers

2016-04-07 Thread Julien Grall


Hi Jan,

On 07/04/2016 18:18, Jan Beulich wrote:

On 07.04.16 at 12:53,  wrote:

---

Stefano, let me know the new helper corresponds to change you requested
(see [1])

This patch is a bug fix for Xen 4.7. Without it, a MMIO register and
sysreg will be set to a random value (actually PC) when the zero
register is used.

I'm not sure if we should consider this patch to be backported to Xen
4.6 and Xen 4.5. It depends on other patches and it would require some
rework to backport it alone.

[1]
http://lists.xenproject.org/archives/html/xen-devel/2016-02/msg03100.html


So the tags and alike would suggest this is ready to be committed,
but the lack of a version number or version history don't really
help support this. Could you please clarify the state of this patch?


Sorry I forgot to mention the changes I made. This is a resend with the 
modification suggested by Stefano.


I've kept Stefano's reviewed-by as he was fine with and without the 
helpers. Although, I would like to give Stefano a chance to object. Can 
you wait a bit before committing?


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH for-4.7] xen/build: Fix build with Clang

2016-04-07 Thread Andrew Cooper

c/s 607044bf9 "build: avoid putting local absolute symbols in symbol tables"
breaks the build with Clang, as the command line argument isn't understood.

Clang does not appear to have any equivielent option, and already has
outstanding issues with duplicate symbols.  Excluding this option makes the
problem no worse.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

The clang build already has many duplicate symbols for some reason I have yet
to identify, e.g.

  Duplicate symbol 'asid.c#get_cpu_info' (82d0801e6840 != 82d0801c8190)
  Duplicate symbol 'ats.c#__list_add' (82d08015b900 != 82d0801546a0)
  Duplicate symbol 'common.c#clear_bit' (82d080213560 != 82d0801baf10)
  Duplicate symbol 'common.c#constant_test_bit' (82d080213550 != 
82d0801ba750)
  Duplicate symbol 'common.c#cpumask_check' (82d080218c50 != 
82d0801baf20)
  Duplicate symbol 'common.c#cpumask_clear_cpu' (82d080214990 != 
82d0801bae40)
  Duplicate symbol 'common.c#get_cpu_info' (82d080212210 != 
82d0801bad20)

The resulting binary does function.  Someone with more time can investigate
making symbol handling work better with Clang
---
 xen/Rules.mk | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/xen/Rules.mk b/xen/Rules.mk
index d4dffde..7183d69 100644
--- a/xen/Rules.mk
+++ b/xen/Rules.mk
@@ -50,9 +50,15 @@ ALL_OBJS-$(CONFIG_X86)   += $(BASEDIR)/crypto/built_in.o
 CFLAGS += -nostdinc -fno-builtin -fno-common
 CFLAGS += -Werror -Wredundant-decls -Wno-pointer-arith
 CFLAGS += -pipe -g -D__XEN__ -include $(BASEDIR)/include/xen/config.h
-CFLAGS += -Wa,--strip-local-absolute
 CFLAGS += '-D__OBJECT_FILE__="$@"'
 
+ifneq ($(clang),y)
+# Clang doesn't understand this command line argument, and doesn't appear to
+# have an suitable alternative.  The resulting compiled binary does function,
+# but has an excessively large symbol table.
+CFLAGS += -Wa,--strip-local-absolute
+endif
+
 CFLAGS-$(verbose)   += -DVERBOSE
 CFLAGS-$(crash_debug)   += -DCRASH_DEBUG
 CFLAGS-$(perfc) += -DPERF_COUNTERS
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86/emulate: Check current->arch.vm_event in hvmemul_virtual_to_linear()

2016-04-07 Thread Razvan Cojocaru

On 04/07/16 20:54, Razvan Cojocaru wrote:
> On 04/07/16 20:27, Jan Beulich wrote:
> On 07.04.16 at 10:39,  wrote:
>>> Theoretically it is possible for mem_access_emulate_each_rep to be
>>> true even when current->arch.vm_event == NULL, so add an extra
>>> check to hvmemul_virtual_to_linear().
>>
>> Mind saying what those theoretical conditions are when this might
>> happen?
> 
> This could happen if someone were to call xc_monitor_emulate_each_rep(),
> but not xc_monitor_enable() (when current->arch.vm_event gets
> allocated), or after someone called both, but afterwards called
> xc_monitor_disable() (when current->arch.vm_event gets freed).

Actually, I need to correct myself here: only the first case applies. I
was looking at older Xen source code, but there's a
"d->arch.mem_access_emulate_each_rep = 0;" statement in
vm_event_cleanup_domain() now, so calling xc_monitor_disable() would not
cause a problem.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Regarding Outreachy project on Improving CR Dashboard

2016-04-07 Thread Jesus M. Gonzalez-Barahona

On Thu, 2016-04-07 at 17:57 +0530, Priya wrote:
> Hello all,
> 
> Thanks for the suggestions. I have updated the changes as u had
> mentioned. I am sorry, but I could not find any errors while running 
> 
> $ python3 createjson.py --mbox xen-devel-2016-03 --output new.json 
> 
> command. I am wondering what is wrong with this and might be a
> problem with python3 or my perceval version. I have added licensing
> and python logging. You can see it in my github repo [1]. I will try
> upgrading perceval and adding in the tests in the coming days, and
> will update.

Priya, I guess I didn't explain the source for the error clearly
enough, sorry about that. It is very likely that you're not running the
latest version of Perceval, which changed the format for the items it
produces some days ago. That's why, if you're running a version of
Perceval of more than some days ago, it works, but if you run the
latest one, it doesn't work (at least for me). I'm not completely sure,
but that's why I asked you to upgrade to the latest version (either
master HEAD or version 0.1.0). If you don't see the error once you do
this, please let me know.

Thanks,

Jesus.

> [1]: https://github.com/priya299/Dashboard
> 
> Priya V
> Amrita University
> LinkedIn | GitHub | Bitbucket
> 
> 
> On Thu, Apr 7, 2016 at 3:29 AM, Jesus M. Gonzalez-Barahona  gia.com> wrote:
> > On Wed, 2016-04-06 at 17:30 +0530, Priya wrote:
> > > Hello,
> > >
> > > Thanks for your suggestions.
> > > I have made the appropriate changes as you had mentioned.
> > > It took a little time to change from python3 to python3.4 as
> > perceval
> > > supports python3.4. I have updated the changes in my github. You
> > can
> > > see my git repo [1]
> > >
> > > [1]:https://github.com/priya299/Dashboard
> > 
> > Thanks a lot, Priya. Good work. Some preliminary comments, below.
> > 
> > * When runing the script on the xen-devel-2016-03 mbox, I seen an
> > exception raised:
> > 
> > 
> > (perceval)jgb@expisito:~/src/outreachy/Dashboard/dashboard$ python3
> > createjson.py --mbox xen-devel-2016-03 --output new.json
> > Traceback (most recent call last):
> >   File "createjson.py", line 61, in 
> > create_json(args.mbox,args.output)
> >   File "createjson.py", line 43, in create_json
> > if key == k['Message-ID'].strip('<>'):
> > KeyError: 'Message-ID'
> > 
> > 
> > Maybe some message is not having a Message-ID field? I suggest that
> > you
> > capture this exception, print out the offending message, and go on
> > with
> > the next one. You can use the Python logging package for printing
> > out
> > this kind of information (you can see how to use it in the Perceval
> > package itself). But see below.
> > 
> > * Minor typo in the README:
> > 
> > Instead of 
> > 
> > eg: python3.4 createjson --mbox xen-devel-2016-03 --output new.json
> > 
> > it should be
> > 
> > eg: python3.4 createjson.py --mbox xen-devel-2016-03 --output
> > new.json
> > 
> > * The files have no licensing info. If you agree, it could be
> > GPLv3, as
> > is Perceval itself. For that, it would be enough that you mimic the
> > header in Perceval files in your Python files (of course,
> > indicating
> > your authorship information).
> > 
> > * Which version of Perceval are you using? Some weeks ago, the
> > format
> > of the dictionary produced by Perceval for each message changed.
> > Now
> > the  actual fields of the message are in a data subdictionary.
> > Please,
> > check that: the above exception with respect to the Message-ID key
> > could be because of this... Please, try to make it work with master
> > HEAD for Perceval (I don't expect any new major change in the next
> > days/weeks, and I'll try to warn you in case some happens).
> > 
> > * Could you please write at least one unit test for your code? You
> > can
> > see examples of the testing schema we use in the tests directory in
> > Perceval, but we use vanilla unittest (the Python package for
> > tests).
> > At this stage I don't need that you produce a whole set of tests,
> > only
> > one or two to show that you know how to write unit tests, please.
> > 
> > Saludos,
> > 
> >         Jesus.
> > 
> > > Priya V
> > > Amrita University
> > > LinkedIn | GitHub | Bitbucket
> > >
> > > ___
> > > Xen-devel mailing list
> > > Xen-devel@lists.xen.org
> > > http://lists.xen.org/xen-devel
> > --
> > Bitergia: http://bitergia.com
> > /me at Twitter: https://twitter.com/jgbarah
> > 
> > 
-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86/emulate: Check current->arch.vm_event in hvmemul_virtual_to_linear()

2016-04-07 Thread Razvan Cojocaru

On 04/07/16 20:27, Jan Beulich wrote:
 On 07.04.16 at 10:39,  wrote:
>> Theoretically it is possible for mem_access_emulate_each_rep to be
>> true even when current->arch.vm_event == NULL, so add an extra
>> check to hvmemul_virtual_to_linear().
> 
> Mind saying what those theoretical conditions are when this might
> happen?

This could happen if someone were to call xc_monitor_emulate_each_rep(),
but not xc_monitor_enable() (when current->arch.vm_event gets
allocated), or after someone called both, but afterwards called
xc_monitor_disable() (when current->arch.vm_event gets freed).

>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -514,7 +514,7 @@ static int hvmemul_virtual_to_linear(
>>   * vm_event being triggered for repeated writes to a whole page.
>>   */
>>  if ( unlikely(current->domain->arch.mem_access_emulate_each_rep) &&
>> - current->arch.vm_event->emulate_flags != 0 )
>> + current->arch.vm_event && current->arch.vm_event->emulate_flags != 
>> 0 )
> 
> That's then the third instance of "current" here - this needs
> latching into a local variable.

No problem.


Thanks,
Razvan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 2/4] libxl: set the backend type to Qdisk for CDROM devices on DM HVM guests

2016-04-07 Thread Roger Pau Monne

This is needed because the cd-{insert/eject} functions are not prepared to
deal with blkback, which would be used by default if no backend was
specified.

Signed-off-by: Roger Pau Monné 
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libxl/libxl.c  | 24 +++-
 tools/libxl/libxl_create.c   |  2 +-
 tools/libxl/libxl_internal.h |  3 ++-
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index d35fc33..9d785a4 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2301,7 +2301,8 @@ int libxl_devid_to_device_vtpm(libxl_ctx *ctx,
 
 
/**/
 
-int libxl__device_disk_setdefault(libxl__gc *gc, libxl_device_disk *disk)
+int libxl__device_disk_setdefault(libxl__gc *gc, libxl_device_disk *disk,
+  uint32_t domid)
 {
 int rc;
 
@@ -2312,6 +2313,19 @@ int libxl__device_disk_setdefault(libxl__gc *gc, 
libxl_device_disk *disk)
 rc = libxl__resolve_domid(gc, disk->backend_domname, >backend_domid);
 if (rc < 0) return rc;
 
+/* Force Qdisk backend for CDROM devices of guests with a device model. */
+if (disk->is_cdrom != 0 &&
+libxl__domain_type(gc, domid) == LIBXL_DOMAIN_TYPE_HVM &&
+libxl__device_model_version_running(gc, domid) !=
+LIBXL_DEVICE_MODEL_VERSION_NONE) {
+if (!(disk->backend == LIBXL_DISK_BACKEND_QDISK ||
+  disk->backend == LIBXL_DISK_BACKEND_UNKNOWN)) {
+LOG(ERROR, "Backend for CD devices on HVM guests must be Qdisk");
+return ERROR_FAIL;
+}
+disk->backend = LIBXL_DISK_BACKEND_QDISK;
+}
+
 rc = libxl__device_disk_set_backend(gc, disk);
 return rc;
 }
@@ -2427,7 +2441,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t 
domid,
 }
 }
 
-rc = libxl__device_disk_setdefault(gc, disk);
+rc = libxl__device_disk_setdefault(gc, disk, domid);
 if (rc) goto out;
 
 front = flexarray_make(gc, 16, 1);
@@ -2869,7 +2883,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, 
libxl_device_disk *disk,
 disk_empty.vdev = libxl__strdup(NOGC, disk->vdev);
 disk_empty.pdev_path = libxl__strdup(NOGC, "");
 disk_empty.is_cdrom = 1;
-libxl__device_disk_setdefault(gc, _empty);
+libxl__device_disk_setdefault(gc, _empty, domid);
 
 libxl_domain_type type = libxl__domain_type(gc, domid);
 if (type == LIBXL_DOMAIN_TYPE_INVALID) {
@@ -2910,7 +2924,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, 
libxl_device_disk *disk,
 goto out;
 }
 
-rc = libxl__device_disk_setdefault(gc, disk);
+rc = libxl__device_disk_setdefault(gc, disk, domid);
 if (rc) goto out;
 
 if (!disk->pdev_path) {
@@ -3173,7 +3187,7 @@ void libxl__device_disk_local_initiate_attach(libxl__egc 
*egc,
 disk->script = libxl__strdup(gc, in_disk->script);
 disk->vdev = NULL;
 
-rc = libxl__device_disk_setdefault(gc, disk);
+rc = libxl__device_disk_setdefault(gc, disk, LIBXL_TOOLSTACK_DOMID);
 if (rc) goto out;
 
 libxl__prepare_ao_device(ao, >aodev);
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 24f168b..1a57907 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -923,7 +923,7 @@ static void initiate_domain_create(libxl__egc *egc,
 store_libxl_entry(gc, domid, _config->b_info);
 
 for (i = 0; i < d_config->num_disks; i++) {
-ret = libxl__device_disk_setdefault(gc, _config->disks[i]);
+ret = libxl__device_disk_setdefault(gc, _config->disks[i], domid);
 if (ret) {
 LOG(ERROR, "Unable to set disk defaults for disk %d", i);
 goto error_out;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 55896f8..47aca76 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1204,7 +1204,8 @@ _hidden int 
libxl__domain_create_info_setdefault(libxl__gc *gc,
 _hidden int libxl__domain_build_info_setdefault(libxl__gc *gc,
 libxl_domain_build_info *b_info);
 _hidden int libxl__device_disk_setdefault(libxl__gc *gc,
-  libxl_device_disk *disk);
+  libxl_device_disk *disk,
+  uint32_t domid);
 _hidden int libxl__device_nic_setdefault(libxl__gc *gc, libxl_device_nic *nic,
  uint32_t domid);
 _hidden int libxl__device_vtpm_setdefault(libxl__gc *gc, libxl_device_vtpm 
*vtpm);
-- 
2.6.4 (Apple Git-63)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 0/4] libxl: fix CDROM issues

2016-04-07 Thread Roger Pau Monne

Due to recent changes (and intended fixes) CDROM has become quite unstable 
recently. This series aims to fix the issues reported with having empty 
CDROM devices in the HVM guest configuration files. It also contains some 
cleanups of more generic code in order to ease the implementation.

After some talk on IRC, the easiest solution at this point in the release is 
to force PV CDROM backends of HVM guests with a device model to use Qdisk, 
since the cd-{insert/eject} implementation is only able to deal with this 
kind of PV backend.

Roger.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 4/4] libxl: remove code added to use the 'phy' backend with CDROM devices

2016-04-07 Thread Roger Pau Monne

Signed-off-by: Roger Pau Monné 
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libxl/libxl_device.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index ce53520..82405a7 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -191,20 +191,13 @@ static int disk_try_backend(disk_try_backend_args *a,
 
 switch (backend) {
 case LIBXL_DISK_BACKEND_PHY:
-if (!(a->disk->format == LIBXL_DISK_FORMAT_RAW ||
-  a->disk->format == LIBXL_DISK_FORMAT_EMPTY)) {
+if (a->disk->format != LIBXL_DISK_FORMAT_RAW) {
 goto bad_format;
 }
 
 if (libxl_defbool_val(a->disk->colo_enable))
 goto bad_colo;
 
-if (a->disk->format == LIBXL_DISK_FORMAT_EMPTY) {
-LOG(DEBUG, "Disk vdev=%s is empty, skipping physical device check",
-a->disk->vdev);
-return backend;
-}
-
 if (a->disk->backend_domid != LIBXL_TOOLSTACK_DOMID) {
 LOG(DEBUG, "Disk vdev=%s, is using a storage driver domain, "
"skipping physical device check", a->disk->vdev);
-- 
2.6.4 (Apple Git-63)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 3/4] libxl: only allow guests with a device model to use cd-{eject/insert}

2016-04-07 Thread Roger Pau Monne

Signed-off-by: Roger Pau Monné 
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libxl/libxl.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 9d785a4..232e2c1 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2909,6 +2909,12 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, 
libxl_device_disk *disk,
 goto out;
 }
 
+if (dm_ver == LIBXL_DEVICE_MODEL_VERSION_NONE) {
+LOG(ERROR, "HVM guests without a device model cannot use cd-insert");
+rc = ERROR_FAIL;
+goto out;
+}
+
 disks = libxl_device_disk_list(ctx, domid, );
 for (i = 0; i < num; i++) {
 if (disks[i].is_cdrom && !strcmp(disk->vdev, disks[i].vdev))
-- 
2.6.4 (Apple Git-63)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH 1/4] libxl: set the device model version earlier in xenstore

2016-04-07 Thread Roger Pau Monne

So libxl doesn't have to pass the build info around just to get the device
model used by the guest. This allows to simplify
libxl__device_nic_setdefault.

Signed-off-by: Roger Pau Monné 
Cc: Ian Jackson 
Cc: Wei Liu 
---
 tools/libxl/libxl.c  | 24 +++-
 tools/libxl/libxl_create.c   |  9 +++--
 tools/libxl/libxl_dm.c   |  3 +--
 tools/libxl/libxl_internal.h |  3 +--
 4 files changed, 16 insertions(+), 23 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 5cdc09e..d35fc33 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -3292,7 +3292,7 @@ out:
 
/**/
 
 int libxl__device_nic_setdefault(libxl__gc *gc, libxl_device_nic *nic,
- uint32_t domid, libxl_domain_build_info *info)
+ uint32_t domid)
 {
 int rc;
 
@@ -3330,21 +3330,11 @@ int libxl__device_nic_setdefault(libxl__gc *gc, 
libxl_device_nic *nic,
 switch (libxl__domain_type(gc, domid)) {
 case LIBXL_DOMAIN_TYPE_HVM:
 if (!nic->nictype) {
-if (info != NULL) {
-/* Path taken at creation time. */
-if (info->device_model_version ==
-LIBXL_DEVICE_MODEL_VERSION_NONE)
-nic->nictype = LIBXL_NIC_TYPE_VIF;
-else
-nic->nictype = LIBXL_NIC_TYPE_VIF_IOEMU;
-} else {
-/* Path taken when hot-adding a nic. */
-if (libxl__device_model_version_running(gc, domid) ==
-LIBXL_DEVICE_MODEL_VERSION_NONE)
-nic->nictype = LIBXL_NIC_TYPE_VIF;
-else
-nic->nictype = LIBXL_NIC_TYPE_VIF_IOEMU;
-}
+if (libxl__device_model_version_running(gc, domid) ==
+LIBXL_DEVICE_MODEL_VERSION_NONE)
+nic->nictype = LIBXL_NIC_TYPE_VIF;
+else
+nic->nictype = LIBXL_NIC_TYPE_VIF_IOEMU;
 }
 break;
 case LIBXL_DOMAIN_TYPE_PV:
@@ -3394,7 +3384,7 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t 
domid,
 libxl_device_nic_init(_saved);
 libxl_device_nic_copy(CTX, _saved, nic);
 
-rc = libxl__device_nic_setdefault(gc, nic, domid, NULL);
+rc = libxl__device_nic_setdefault(gc, nic, domid);
 if (rc) goto out;
 
 front = flexarray_make(gc, 16, 1);
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 4b02de9..24f168b 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -916,6 +916,12 @@ static void initiate_domain_create(libxl__egc *egc,
 goto error_out;
 }
 
+/*
+ * Set the dm version quite early so that libxl doesn't have to pass the
+ * build info around just to know if the domain has a device model or not.
+ */
+store_libxl_entry(gc, domid, _config->b_info);
+
 for (i = 0; i < d_config->num_disks; i++) {
 ret = libxl__device_disk_setdefault(gc, _config->disks[i]);
 if (ret) {
@@ -940,8 +946,7 @@ static void initiate_domain_create(libxl__egc *egc,
  * called libxl_device_nic_add when domcreate_launch_dm gets called,
  * but qemu needs the nic information to be complete.
  */
-ret = libxl__device_nic_setdefault(gc, _config->nics[i], domid,
-   _config->b_info);
+ret = libxl__device_nic_setdefault(gc, _config->nics[i], domid);
 if (ret) {
 LOG(ERROR, "Unable to set nic defaults for nic %d", i);
 goto error_out;
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index eac5501..a12c90d 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -1810,8 +1810,7 @@ static void spawn_stub_launch_dm(libxl__egc *egc,
  * called libxl_device_nic_add at this point, but qemu needs
  * the nic information to be complete.
  */
-ret = libxl__device_nic_setdefault(gc, _config->nics[i], dm_domid,
-   _config->b_info);
+ret = libxl__device_nic_setdefault(gc, _config->nics[i], dm_domid);
 if (ret)
 goto out;
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 653c152..55896f8 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1206,8 +1206,7 @@ _hidden int libxl__domain_build_info_setdefault(libxl__gc 
*gc,
 _hidden int libxl__device_disk_setdefault(libxl__gc *gc,
   libxl_device_disk *disk);
 _hidden int libxl__device_nic_setdefault(libxl__gc *gc, libxl_device_nic *nic,
- uint32_t domid,
- libxl_domain_build_info *info);
+

Re: [Xen-devel] [libvirt] [PATCH] libxl: libxl_domain_create_restore has an extra argument

2016-04-07 Thread Wei Liu

On Thu, Apr 07, 2016 at 05:35:33PM +0100, Daniel P. Berrange wrote:
> On Wed, Apr 06, 2016 at 04:43:07PM -0500, Doug Goldstein wrote:
> > On 4/5/16 9:20 AM, Wei Liu wrote:
> > > In the latest libxenlight code, libxl_domain_create_restore accepts a
> > > new argument. Update libvirt's libxl driver for that. Use the macro
> > > provided by libxenlight to detect which version should be used.
> > > 
> > > The new parameter (send_back_fd) is set to -1 because libvirt provides
> > > no such fd.
> > > 
> > > Signed-off-by: Wei Liu 
> > > ---
> > > Build test with Xen 4.6.1 (old API) and Xen unstable (new API).
> > > ---
> > >  src/libxl/libxl_domain.c | 7 ++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
> > > index 04962a0..aed904b 100644
> > > --- a/src/libxl/libxl_domain.c
> > > +++ b/src/libxl/libxl_domain.c
> > > @@ -1070,7 +1070,12 @@ libxlDomainStart(libxlDriverPrivatePtr driver, 
> > > virDomainObjPtr vm,
> > >  ret = libxl_domain_create_new(cfg->ctx, _config,
> > >, NULL, _console_how);
> > >  } else {
> > > -#ifdef LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS
> > > +#if defined(LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_BACK_FD)
> > > +params.checkpointed_stream = 0;
> > > +ret = libxl_domain_create_restore(cfg->ctx, _config, ,
> > > +  restore_fd, -1, , NULL,
> > > +  _console_how);
> > > +#elif defined(LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS)
> > >  params.checkpointed_stream = 0;
> > >  ret = libxl_domain_create_restore(cfg->ctx, _config, ,
> > >restore_fd, , NULL,
> > > 
> > 
> > ACK
> > 
> > This fixes integration testing that Xen Project does with libvirt and
> > Xen so it would be nice to get in for 1.3.3.
> 
> It missed the boat for 1.3.3, but I've pushed it now
> 

Thank you very much for pushing this!

Wei.

> 
> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org  -o- http://virt-manager.org :|
> |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2] libxl: replace the usage of uuid_t with a char array

2016-04-07 Thread Ian Jackson

Wei Liu writes ("Re: [PATCH v2] libxl: replace the usage of uuid_t with a char 
array"):
> A thought: maybe it is worth to have a #define LIBXL_HAVE_UNIFIED_UUID
> in libxl.h?

This is a good idea.

> /* If this is defined, libxl_uuid is big endian 16-octet stream on all
>  * platform. The libxl_uuid API family will handle transformation
>  * between libxl_uuid format and OS specific format.
>  */
> #define LBIXL_HAVE_UNIFIED_UUID 1

But the wording here isn't, quite.  The "transformation between
libxl_uuid format and OS specific format" is entirely hidden within
libxl and exists only to implement the functions provided by libxl.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86/emulate: Check current->arch.vm_event in hvmemul_virtual_to_linear()

2016-04-07 Thread Jan Beulich

>>> On 07.04.16 at 10:39,  wrote:
> Theoretically it is possible for mem_access_emulate_each_rep to be
> true even when current->arch.vm_event == NULL, so add an extra
> check to hvmemul_virtual_to_linear().

Mind saying what those theoretical conditions are when this might
happen?

> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -514,7 +514,7 @@ static int hvmemul_virtual_to_linear(
>   * vm_event being triggered for repeated writes to a whole page.
>   */
>  if ( unlikely(current->domain->arch.mem_access_emulate_each_rep) &&
> - current->arch.vm_event->emulate_flags != 0 )
> + current->arch.vm_event && current->arch.vm_event->emulate_flags != 
> 0 )

That's then the third instance of "current" here - this needs
latching into a local variable.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v6 08/24] xsplice: Add helper elf routines

2016-04-07 Thread Jan Beulich

>>> On 07.04.16 at 18:19,  wrote:
> Konrad Rzeszutek Wilk writes ("[PATCH v6 08/24] xsplice: Add helper elf 
> routines"):
>> From: Ross Lagerwall 
>> 
>> Add Elf routines and data structures in preparation for loading an
>> xSplice payload.
>> 
>> We make an assumption that the max number of sections an ELF payload
>> can have is 64. We can in future make this be dependent on the
>> names of the sections and verifying against a list, but for right now
>> this suffices.
>> 
>> Also we a whole lot of checks to make sure that the ELF payload
>> file is not corrupted nor that the offsets point past the file.
> 
> This is good, but: ideally I would like to avoid conducting a detailed
> security review of this code.
> 
> My understanding of this is that the purpose of this machinery is to
> supply binary runtime patches to the hypervisor.  So I think someone
> who can inject malicious xsplice payloads can already control the
> host.  Is that right ?
> 
> If so then bugs in this loader cannot be any security impact.

Well, in a way this depends on re-visiting the position we take
related to heavy disaggregation, which I mean to put up as a
subject on the hackathon.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [for-4.7] xen/arm64: correctly emulate the {w, x}zr registers

2016-04-07 Thread Jan Beulich

>>> On 07.04.16 at 12:53,  wrote:
> On AArch64, encoding 31 for an R in the HSR is used to represent
> either {w,x}sp or {w,x}zr (See C1.2.4 in ARM DDI 0486A.d) depending on
> how the register field is interpreted by the instruction.
> 
> All the instructions trapped by Xen (either via a sysreg access or
> data abort) interpret encoding 31 as {w,x}zr. Therefore we don't have
> to worry about the possibility that a trap could refer to sp or about
> decoding the instruction.
> 
> For example AArch64 LDR and STR can have zr in the source/target
> register , but never sp. sp can be present in the destination
> pointer( i.e.  "[sp]"), but that would be represented by the value of
> FAR_EL2, not in the HSR.
> 
> For AArch32 it is possible for a LDR to target the PC, but this would
> not result in a valid ISS in the HSR register. However this could only
> occur if loading or storing the PC to MMIO, which we simply choose not
> to support for now.
> 
> Finally, features such as xenaccess can lead to us trapping on
> arbitrary instructions accessing RAM and not just for MMIO. However in
> many such cases HSR.ISS is not valid and in general features such as
> xenaccess do not rely on the nature of the specific instruction, they
> resolve the fault (via information found elsewhere e.g. FAR_EL2)
> without needing to know anything about the instruction which triggered
> the trap.
> 
> The register zr represents the zero register, i.e it will always
> return 0 and write to it is ignored. To properly handle this property,
> 2 new helpers have been introduced {get,set}_user_reg to read/write a
> value from/to a register. All the calls to select_user_reg have been
> replaced by these 2 helpers.
> 
> Furthermore, the code to emulate encoding 31 in select_user_reg has been
> dropped because it was invalid. For Aarch64 context, the encoding is
> used for sp or zr. For AArch32 context, the ISS won't be valid for data
> abort from AArch32 using r15 (i.e pc) as source/destination (See D7-1881
> ARM DDI 0487A.d, note the validity is more restrictive than on ARMv7).
> It's also not possible to use r15 in co-processor instructions.
> 
> This patch fixes setting MMIO register and sysreg to a random value
> (actually PC) instead of zero by something like:
> 
> *((volatile int*)reg) = 0;
> 
> compilers tend to generate "str wzr, [xx]" here.
> 
> [ian: added BUG_ON to select_user_reg and clarified bits of the commit 
> message]
> Reported-by: Marc Zyngier 
> Signed-off-by: Julien Grall 
> Signed-off-by: Ian Campbell 
> Reviewed-by: Stefano Stabellini 
> 
> ---
> 
> Stefano, let me know the new helper corresponds to change you requested
> (see [1])
> 
> This patch is a bug fix for Xen 4.7. Without it, a MMIO register and
> sysreg will be set to a random value (actually PC) when the zero
> register is used.
> 
> I'm not sure if we should consider this patch to be backported to Xen
> 4.6 and Xen 4.5. It depends on other patches and it would require some
> rework to backport it alone.
> 
> [1] 
> http://lists.xenproject.org/archives/html/xen-devel/2016-02/msg03100.html 

So the tags and alike would suggest this is ready to be committed,
but the lack of a version number or version history don't really
help support this. Could you please clarify the state of this patch?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/2] libxl: Set rc on failure of usbdev_busaddr_to_busid

2016-04-07 Thread Ian Jackson

Chun Yan Liu writes ("Re: [PATCH 1/2] libxl: Set rc on failure of 
usbdev_busaddr_to_busid"):
> Thanks, Ian!

Should I take that as a Reviewed-by ?

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/2] libxl: Do not leak data on error path from libxl__read_sysfs_file_contents

2016-04-07 Thread Ian Jackson

Chun Yan Liu writes ("Re: [PATCH 2/2] libxl: Do not leak data on error path 
from libxl__read_sysfs_file_contents"):
> <1459782600-16073-2-git-send-email-ian.jack...@eu.citrix.com>, Ian Jackson
>  wrote: 
> > +free(data); 
> 
> 'data' is malloced with 'gc', it'll be freed by GC_FREE. Do we need to free
> it here?

Oh, you are quite right.  My patch was wrong.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/2] libfsimage: Add support for btrfs

2016-04-07 Thread Ian Jackson

Ross Lagerwall writes ("[PATCH 2/2] libfsimage: Add support for btrfs"):
> Add support for booting from btrfs filesystems. All the code except
> fsys_btrfs.c is imported from btrfs-progs 4.5.1. btrfs-progs is licensed
> as GPLv2. The files have been imported with minimal modification.
> Further work could split and prune the support code to only what is
> required for reading.
> 
> The driver supports following subvolumes and symlinks, and reading files
> with compression (zlib and lzo). This driver is useful for booting
> SLES 12 in its default configuration, since it uses btrfs.

I think this is fine in principle.  But can you split the code import
from the code you have written ?  I guess I don't mean to ask you to
_rearrange_ the code, but can you split it into separate commits ?
I think that will make archeaology etc. much easier.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/2] pygrub: Ignore GRUB2 if statements

2016-04-07 Thread Ian Jackson

Ross Lagerwall writes ("[PATCH 1/2] pygrub: Ignore GRUB2 if statements"):
> SLES 12's default GRUB config has the following code before any entries:
> if [ -n "$extra_cmdline" ]; then
>   submenu "Bootable snapshot #$snapshot_num" {
> menuentry "If OK, run 'snapper rollback' and reboot." { true; }
>   }
> fi
> 
> This prevents pygrub from booting using the default entry. Since I'm not
> aware of any distro GRUB config which puts useful entries within
> conditionals, ignore them.

I think this has a risk of breaking things, doesn't it ?  I mean,
existing setups might contain `if's and the relevant parts would no
longer be processed.

So I think this isn't a backport candidate.  And really I would like
an alternative, less risky fix, which would be a backport candidate.
Can you think of an alternative approach ?

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] bind_usbintf: do not reuse 'path'

2016-04-07 Thread Ian Jackson

Chunyan Liu writes ("[PATCH] bind_usbintf: do not reuse 'path'"):
> To avoid confusion, add a new variable "intf_path" to indicate
> driver/interface path, let "path" indicate driver/bind path only.

I think it would be better to rename both variables, as I suggested in
my mail yesterday.  That makes it much more obvious that each place
where `path' was used has been checked and the reference to the
proper variable substituted.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 14/21] xen+tools: Export maximum host and guest cpu featuresets via SYSCTL

2016-04-07 Thread Daniel De Graaf


On 04/07/2016 07:57 AM, Andrew Cooper wrote:

And provide stubs for toolstack use.

Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
Acked-by: David Scott 
Acked-by: Jan Beulich 


Acked-by: Daniel De Graaf 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v5 08/21] x86/cpu: Sysctl and common infrastructure for levelling context switching

2016-04-07 Thread Daniel De Graaf


On 04/07/2016 07:57 AM, Andrew Cooper wrote:

A toolstack needs to know how much control Xen has over the visible cpuid
values in PV guests.  Provide an explicit mechanism to query what Xen is
capable of.

This interface will currently report no capabilities.  This change is
scaffolding for future patches, which will introduce detection and switching
logic, after which the interface will report hardware capabilities correctly.

Signed-off-by: Andrew Cooper 
Acked-by: Jan Beulich 


Acked-by: Daniel De Graaf 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Stub domain crash on Xen v4.6.1

2016-04-07 Thread Fanny Dwargee

I'm sorry but this is my first attempt to use stubdom.

This is a test machine so feel free to propose xen recompilation adding
printk's/obscured options or settings modification.

With regards,

Fanny

2016-04-07 18:13 GMT+02:00 Wei Liu :

> On Wed, Apr 06, 2016 at 02:49:19PM +0200, Fanny Dwargee wrote:
> > Hi Wei,
> >
> > first: thanks for your effort. :)
> >
> > Find attached the 'xl create' output and the
> > /var/log/xen/console/hypervisor.log
> > files after creating the same domain without the
> > 'device_model_stubdomain_override' configuration option.
> >
> > 'xl list' command output...
> > Name ID   Mem VCPUs  State   Time(s)
> > Domain-0  0  1023 2 r-   11608.1
> > win7-sp1-x64-2   28  2048 1 --   0.4
> >
>
> I was hoping that you somehow had a running setup with stubdom enabled.
> Have you tried earlier version of Xen or is this your first attempt to
> use stubdom?
>
> Wei.
>
> > Regards,
> >
> > Fanny
> >
> >
> >
> >
> > 2016-04-06 13:12 GMT+02:00 Wei Liu :
> >
> > > On Tue, Apr 05, 2016 at 05:17:00PM +0200, Fanny Dwargee wrote:
> > > > Hi,
> > > >
> > > > after adding the 'device_model_stubdomain_override = 1' to an
> otherwise
> > > > fine configuration the domain crashes on start.
> > > >
> > > > Xen is v4.6.1 compiled from source on Debian Jessie 64bits this way:
> > > >
> > > >- ./configure --enable-stubdom --enable-githttp
> > > >- make dist-xen
> > > >- make dist-tools
> > > >- make dist-stubdom
> > > >- make install-xen
> > > >- make install-tools
> > > >- make install-stubdom
> > > >
> > > > As pointed out before the same configuration file without the '
> > > > device_model_stubdomain_override' works flawlessly.
> > > >
> > > > This is the 'xl list' command output while the domain is starting:
> > > >
> > > > NameID   Mem VCPUs  State   Time(s)
> > > > Domain-0 0  1022 2 r- 318.3
> > > > win7-sp1-x64-2  20  2048 1 r-   5.5
> > > > win7-sp1-x64-2-dm   2144 1 r-   6.0
> > > >
> > > >
> > > > As you can see both domains are started (the stub and the original
> > > domain)
> > > >
> > > > Find attached the domain configuration file, 'xl info' output, 'xl
> > > create'
> > > > output and the /var/log/xen/console/hypervisor.log file, notice the
> grant
> > > > table error on console-hypervisor.log
> > > >
> > > > I'd very grateful for any help finding the cause of this problem.
> > > >
> > >
> > >
> > > The hypervsior log show some error, but unfortunately it is not
> > > immediately clear what went wrong.
> > >
> > > Do you have a working base line setup so that I can compare the
> > > difference?
> > >
> > > Wei.
> > >
> > > > Best regards,
> > > >
> > > > Fanny
> > >
> > >
>
>
>
>
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/4] a fix in libxl_device_usbdev_list

2016-04-07 Thread Ian Jackson

Chunyan Liu writes ("[PATCH 1/4] a fix in libxl_device_usbdev_list"):
> In testing with libvirt pvusb functionality, found a rc check
> error in libxl_device_usbdev_list. Correct it. This function
> is not used by xl.

Thanks for these.  I have applied 2,3, and 4.  But not 1, as I had
comments on it.

Regards,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

1 2 3 >

1 - 100 of 227 matches

Mail list logo