Re: [Xen-devel] [PATCH V3 2/6] libxl_read_file_contents: fix reading sysfs file
On Mon, 2015-05-18 at 15:23 +0100, Ian Jackson wrote:
> Perhaps the bulk should be made into libxl__read_file_contents_core
> which takes a boolean instructing whether to tolerate magically
> shrinking files ?  Setting that boolean probably ought to arrange to
> insist that the function gets eof, in case the file is actually bigger
> rather than smaller than the size.
>
> Ian, Wei ?

Sounds ok to me.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Re: [Xen-devel] xen scheduler
[Adding George. In future, if you are interested in getting feedback on
a particular subsystem, look for it in the MAINTAINERS file, and Cc the
address(es) you find there]

On Mon, 2015-05-18 at 13:24 +0530, Rajendra Bele wrote:
> As per my knowledge, Credit scheduler sorts its queue of VCPUs with
> priority based on credit value.

Yes and no. :-)

This is probably formally correct, as:
 1. when sorting it, we do rearrange the runq in priority order
 2. the priority of a vCPU is _based_ on credits, as being in UNDER or
    in OVER state does depend on credits

However, as stated here:

  /*
   * This is a O(n) optimized sort of the runq.
   *
   * Time-share VCPUs can only be one of two priorities, UNDER or OVER. We walk
   * through the runq and move up any UNDERs that are preceded by OVERS. We
   * remember the last UNDER to make the move up operation O(1).
   */
  static void
  csched_runq_sort(struct csched_private *prv, unsigned int cpu)

there are only two priorities, so, for Credit, "sorts its queue of VCPUs
with priority based on credit value" means all the UNDER vCPUs come
before any OVER vCPU... was that what you meant?

BTW, this is one of the differences between Credit and Credit2, as in
Credit2, the runqueues are kept sorted by credit order...

> It follows the FCFS technique for equal priority. If we apply SJF for
> equal priority, it will help to reduce the waiting time spent in the
> queue, basically for the UNDER priority (credits > 0) VCPUs.

Yes, I think that treating the various vCPUs in UNDER differently, based
on some parameter/state/etc. of them, would be good... actually, that's
why I like Credit2, and why we're trying to make it usable in
production.

Doing the same in Credit is of course possible, but I fear it would turn
out to be really complex. Then, again, we already have Credit2 doing
something like that... So I think that anyone wanting a scheduler with a
similar property should invest time in Credit2, rather than trying to
tweak Credit1 into that.

But then, of course, I may be wrong, and you'll come up with a 15-line
patch that does the trick! ;-P

Anyway, you're mentioning SJF, which would indeed be great, if it
weren't impossible to implement:

"Another disadvantage of using shortest job next is that the total
execution time of a job must be known before execution"
(http://en.wikipedia.org/wiki/Shortest_job_next ) :-(

How were you thinking to approximate the execution time of the upcoming
execution instance of a vCPU? I'm asking because, in my experience, the
method chosen for that purpose has quite a bit of influence on the
effectiveness of a particular SJF implementation.

> Obviously the situation is rare, but it will make sense when a large
> number of VMs are active.

I'm not sure I'm getting what you mean here. What's rare, that there are
many vCPUs in UNDER? I don't think it is. Or, in any case, it certainly
is the typical situation in which a scheduler is important (if there is
less work than CPUs, the scheduler does not count that much!), so it's a
good scenario to consider and try to improve...

Or were you referring to something else?

> If anybody working on this wants his/her comments on this idea

I don't think there is anyone working on this particular item, but
scheduling is certainly receiving some attention, and we're always happy
to discuss potential new features, improvements, and the like! :-)

Regards,
Dario
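To make the "all UNDERs before any OVER" ordering concrete, here is a minimal sketch in C. It is not Xen's csched_runq_sort(): it uses a hypothetical singly linked runqueue (struct vcpu_ent, runq_sort and the PRI_* names are all made up for illustration) and splits the queue into two sub-lists rather than moving entries up in place. The result is the same O(n) pass that keeps the relative (FCFS) order inside each priority class.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, for illustration only (not Xen's data structures). */
enum prio { PRI_UNDER, PRI_OVER };

struct vcpu_ent {
    int id;
    enum prio pri;
    struct vcpu_ent *next;
};

/*
 * One pass over the queue: append each entry to the tail of the UNDER
 * or the OVER sub-list, then chain OVER after UNDER. Order within each
 * class is preserved. Returns the new head of the queue.
 */
static struct vcpu_ent *runq_sort(struct vcpu_ent *head)
{
    struct vcpu_ent *under_head = NULL, *under_tail = NULL;
    struct vcpu_ent *over_head = NULL, *over_tail = NULL;

    for (struct vcpu_ent *e = head, *next; e != NULL; e = next) {
        next = e->next;
        e->next = NULL;
        if (e->pri == PRI_UNDER) {
            if (under_tail)
                under_tail->next = e;
            else
                under_head = e;
            under_tail = e;
        } else {
            if (over_tail)
                over_tail->next = e;
            else
                over_head = e;
            over_tail = e;
        }
    }

    if (under_head == NULL)
        return over_head;
    under_tail->next = over_head;
    return under_head;
}
```

Any finer-grained policy among the UNDER vCPUs (such as the SJF idea discussed above) would have to replace the plain tail append in the UNDER branch.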
Re: [Xen-devel] [PATCH v21 11/14] x86/VPMU: Handle PMU interrupts for PV(H) guests
On 05/18/2015 05:43 AM, Dietmar Hahn wrote:
> On Friday, 08 May 2015, 17:06:11, Boris Ostrovsky wrote:
>> +    if ( !is_hvm_vcpu(sampling) )
>> +    {
>> +        /* PV(H) guest */
>> +        const struct cpu_user_regs *cur_regs;
>> +        uint64_t *flags = &vpmu->xenpmu_data->pmu.pmu_flags;
>> +        domid_t domid = DOMID_SELF;
>> +
>> +        if ( !vpmu->xenpmu_data )
>> +            return;
>> +
>> +        if ( is_pvh_vcpu(sampling) &&
>> +             !vpmu->arch_vpmu_ops->do_interrupt(regs) )
> Here you expect vpmu->arch_vpmu_ops != NULL but ...
>
>> +            return;
>> +
>> +        if ( *flags & PMU_CACHED )
>> +            return;
>> +
> ...
>> +
>> +        return;
>> +    }
>>
>>      if ( vpmu->arch_vpmu_ops )
> ... here is a check.
> Maybe this check here is unnecessary because you would never get this
> interrupt without having arch_vpmu_ops != NULL to switch on the
> machinery?
> There are some other locations too with checks before calling
> vpmu->arch_vpmu_ops->... and some without. Maybe it would make sense
> to always force a complete set of arch_vpmu_ops functions to avoid
> this?

I was actually thinking about (eventually) dropping ops tests and
checking that all of them exist during VPMU initialization.

As for this particular test, it may be worth moving it to the beginning
of the routine, mostly to guard against spurious interrupts (but also to
avoid performing it more than once).

>>  }
>>
>> -void vpmu_load(struct vcpu *v)
>> +int vpmu_load(struct vcpu *v, bool_t verify)
> vpmu_load uses verify but within the arch_vpmu_load functions
> (core2_vpmu_load() and amd_vpmu_load()) you use from_guest for the
> same meaning. This is a little bit confusing.
> Always using verify would be clearer I think.

Then this will not be consistent with the save part (which doesn't use
the flag to verify the context but rather to only state that the routine
should copy it). So I think renaming 'verify' to 'from_guest' and
keeping arch ops as they are now would be more consistent.

Thanks.
-boris
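The idea of checking the ops table once at initialization, instead of testing individual pointers at each call site, could look roughly like the sketch below. All names here (struct pmu_ops, pmu_init, pmu_handle_interrupt and the stub_* functions) are hypothetical stand-ins, not the actual VPMU code; the stubs exist only so the sketch is self-contained.

```c
#include <assert.h>
#include <stddef.h>

struct vcpu; /* opaque in this sketch */

/* Hypothetical ops table, loosely modelled on an arch-specific ops struct. */
struct pmu_ops {
    int (*do_interrupt)(struct vcpu *v);
    int (*load)(struct vcpu *v, int from_guest);
    int (*save)(struct vcpu *v, int to_guest);
    void (*destroy)(struct vcpu *v);
};

struct pmu {
    const struct pmu_ops *ops;
};

/* Returns non-zero only if every callback is provided. */
static int pmu_ops_complete(const struct pmu_ops *ops)
{
    return ops != NULL && ops->do_interrupt != NULL && ops->load != NULL &&
           ops->save != NULL && ops->destroy != NULL;
}

/* Reject incomplete tables up front, so later calls need no per-op test. */
static int pmu_init(struct pmu *pmu, const struct pmu_ops *ops)
{
    assert(pmu != NULL);
    if (!pmu_ops_complete(ops)) {
        pmu->ops = NULL;
        return -1;
    }
    pmu->ops = ops;
    return 0;
}

/* A single guard at the top also covers spurious interrupts. */
static int pmu_handle_interrupt(struct pmu *pmu, struct vcpu *v)
{
    if (pmu->ops == NULL)
        return 0;
    return pmu->ops->do_interrupt(v);
}

/* Placeholder callbacks, for illustration only. */
static int stub_interrupt(struct vcpu *v) { (void)v; return 1; }
static int stub_load(struct vcpu *v, int from_guest) { (void)v; (void)from_guest; return 0; }
static int stub_save(struct vcpu *v, int to_guest) { (void)v; (void)to_guest; return 0; }
static void stub_destroy(struct vcpu *v) { (void)v; }

static const struct pmu_ops example_ops = {
    .do_interrupt = stub_interrupt,
    .load = stub_load,
    .save = stub_save,
    .destroy = stub_destroy,
};
```

With this shape, the only remaining NULL test is on the table pointer itself, which is the check suggested above for the start of the interrupt routine.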
Re: [Xen-devel] [PATCH v21 02/14] x86/VPMU: Add public xenpmu.h
On 08.05.15 at 23:06, boris.ostrov...@oracle.com wrote:
> --- /dev/null
> +++ b/xen/include/public/arch-x86/pmu.h
> @@ -0,0 +1,122 @@
> +#ifndef __XEN_PUBLIC_ARCH_X86_PMU_H__
> +#define __XEN_PUBLIC_ARCH_X86_PMU_H__
> +
> +/* x86-specific PMU definitions */
> +
> +/* AMD PMU registers and structures */
> +struct xen_pmu_amd_ctxt {
> +    /* Offsets to counter and control MSRs (relative to xen_pmu_arch.c.amd) */
> +    uint32_t counters;
> +    uint32_t ctrls;
> +};
> +typedef struct xen_pmu_amd_ctxt xen_pmu_amd_ctxt_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_pmu_amd_ctxt_t);
> +
> +/* Intel PMU registers and structures */
> +struct xen_pmu_cntr_pair {
> +    uint64_t counter;
> +    uint64_t control;
> +};
> +typedef struct xen_pmu_cntr_pair xen_pmu_cntr_pair_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_pmu_cntr_pair_t);
> +
> +struct xen_pmu_intel_ctxt {
> +    uint64_t global_ctrl;
> +    uint64_t global_ovf_ctrl;
> +    uint64_t global_status;
> +    uint64_t fixed_ctrl;
> +    uint64_t ds_area;
> +    uint64_t pebs_enable;
> +    uint64_t debugctl;
> +    /*
> +     * Offsets to fixed and architectural counter MSRs (relative to
> +     * xen_pmu_arch.c.intel)
> +     */
> +    uint32_t fixed_counters;
> +    uint32_t arch_counters;
> +};
> +typedef struct xen_pmu_intel_ctxt xen_pmu_intel_ctxt_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_pmu_intel_ctxt_t);
> +
> +/* Sampled domain's registers */
> +struct xen_pmu_regs {
> +    uint64_t ip;
> +    uint64_t sp;
> +    uint64_t flags;
> +    uint16_t cs;
> +    uint16_t ss;
> +    uint8_t cpl;
> +    uint8_t pad[3];
> +};
> +typedef struct xen_pmu_regs xen_pmu_regs_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_pmu_regs_t);
> +
> +/* PMU flags */
> +#define PMU_CACHED (1<<0) /* PMU MSRs are cached in the context */
> +
> +/*
> + * Architecture-specific information describing state of the processor at
> + * the time of PMU interrupt.
> + * Fields of this structure marked as RW for guest can only be written by the
> + * guest when PMU_CACHED bit in pmu_flags is set (which is done by the
> + * hypervisor during PMU interrupt). Hypervisor will read updated data in
> + * XENPMU_flush hypercall and clear PMU_CACHED bit.
> + */
> +struct xen_pmu_arch {
> +    union {
> +        /*
> +         * Processor's registers at the time of interrupt.
> +         * RW for hypervisor, RO for guests.
> +         */
> +        struct xen_pmu_regs regs;
> +        /* Padding for adding new registers to xen_pmu_regs in the future */
> +#define XENPMU_REGS_PAD_SZ 64
> +        uint8_t pad[XENPMU_REGS_PAD_SZ];
> +    } r;
> +
> +    /* RW for hypervisor, RO for guest */
> +    uint64_t pmu_flags;
> +
> +    /*
> +     * APIC LVTPC register.
> +     * RW for both hypervisor and guest.
> +     * Only APIC_LVT_MASKED bit is loaded by the hypervisor into hardware
> +     * during XENPMU_flush.
> +     */
> +    union {
> +        uint32_t lapic_lvtpc;
> +        uint64_t pad;
> +    } l;
> +
> +    /*
> +     * Vendor-specific PMU registers.
> +     * RW for both hypervisor and guest.
> +     * Guest's updates to this field are verified and then loaded by the
> +     * hypervisor into hardware during XENPMU_flush
> +     */
> +    union {
> +        struct xen_pmu_amd_ctxt amd;
> +        struct xen_pmu_intel_ctxt intel;
> +
> +        /*
> +         * Padding for contexts (fixed parts only, does not include MSR banks
> +         * that are specified by offsets)
> +         */
> +#define XENPMU_CTXT_PAD_SZ 128
> +        uint8_t pad[XENPMU_CTXT_PAD_SZ];
> +    } c;
> +};

Marking all the fields RW for the hypervisor is certainly correct from a
permissions pov, but requires close auditing that the hypervisor doesn't
ever read a field twice, potentially getting different results and hence
inconsistent internal state. Therefore - do all of the fields _need_ to
be RW for the hypervisor? If not, marking the ones where this isn't
needed as WO would be much preferred, to limit the scope of what needs
to be audited.

Same of course applies to all the arch-independent bits further down
(which didn't get annotated so far).

Jan
Re: [Xen-devel] [PATCH V3 2/6] libxl_read_file_contents: fix reading sysfs file
On Mon, May 18, 2015 at 03:23:38PM +0100, Ian Jackson wrote:
> Chunyan Liu writes ("[PATCH V3 2/6] libxl_read_file_contents: fix reading sysfs file"):
> > Sysfs file has size=4096 but actual file content is less than that.
>
> Wow.  Is there any danger that the actual size might be 4096 ?
>
> > Current libxl_read_file_contents will treat it as error when file size
> > and actual file content differs, so reading sysfs file content with
> > this function always fails. Fix it so that we can reuse this function
> > to get sysfs file content in later pvusb work.
>
> I'm uncomfortable with removing an error check from this function for
> all its call sites.
>
> I think, sadly, that we are going to need a new function - at least, a
> new entrypoint.  We don't want to repeat the whole of
> libxl__read_file_contents.
>
> Perhaps the bulk should be made into libxl__read_file_contents_core
> which takes a boolean instructing whether to tolerate magically
> shrinking files ?  Setting that boolean probably ought to arrange to
> insist that the function gets eof, in case the file is actually bigger
> rather than smaller than the size.
>
> Ian, Wei ?

Yes, we need a new entry point.

Wei.

> Ian.
Re: [Xen-devel] [PATCH V3 2/6] libxl_read_file_contents: fix reading sysfs file
Chunyan Liu writes ("[PATCH V3 2/6] libxl_read_file_contents: fix reading sysfs file"):
> Sysfs file has size=4096 but actual file content is less than that.

Wow.  Is there any danger that the actual size might be 4096 ?

> Current libxl_read_file_contents will treat it as error when file size
> and actual file content differs, so reading sysfs file content with
> this function always fails. Fix it so that we can reuse this function
> to get sysfs file content in later pvusb work.

I'm uncomfortable with removing an error check from this function for
all its call sites.

I think, sadly, that we are going to need a new function - at least, a
new entrypoint.  We don't want to repeat the whole of
libxl__read_file_contents.

Perhaps the bulk should be made into libxl__read_file_contents_core
which takes a boolean instructing whether to tolerate magically
shrinking files ?  Setting that boolean probably ought to arrange to
insist that the function gets eof, in case the file is actually bigger
rather than smaller than the size.

Ian, Wei ?

Ian.
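As a rough illustration of the proposal, the sketch below shows one way a "core" reader with a tolerate-shrinking boolean might behave. It is not the libxl implementation: read_file_contents_core, its signature and the errno values it returns are assumptions made for this example. It allocates one byte more than the reported size so that it can insist on seeing EOF, which catches a file that turns out bigger than st_size.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the libxl function.
 * Reads the whole file at 'path' into a malloc'd buffer.
 *  - tolerate_shrink == false: fewer bytes than st_size is an error.
 *  - tolerate_shrink == true:  fewer bytes is accepted (sysfs case).
 * In both modes more bytes than st_size is an error, because we keep
 * reading until EOF. Returns 0 on success, an errno value on failure.
 */
static int read_file_contents_core(const char *path, bool tolerate_shrink,
                                   void **data_r, size_t *len_r)
{
    struct stat st;
    char *buf = NULL;
    size_t size = 0, got = 0;
    int rc = 0;
    int fd;

    assert(path != NULL && data_r != NULL && len_r != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    if (fstat(fd, &st) < 0) {
        rc = errno;
        goto out;
    }
    size = (size_t)st.st_size;

    /* One spare byte lets us notice data beyond the reported size. */
    buf = malloc(size + 1);
    if (buf == NULL) {
        rc = ENOMEM;
        goto out;
    }

    while (got < size + 1) {
        ssize_t n = read(fd, buf + got, size + 1 - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            rc = errno;
            goto out;
        }
        if (n == 0)
            break; /* EOF reached */
        got += (size_t)n;
    }

    if (got > size)
        rc = EFBIG;   /* file is bigger than fstat() claimed */
    else if (got < size && !tolerate_shrink)
        rc = EPROTO;  /* short file and the caller did not allow that */

out:
    close(fd);
    if (rc != 0) {
        free(buf);
        return rc;
    }
    *data_r = buf;
    *len_r = got;
    return 0;
}
```

The existing strict entry point would then call this with the boolean false, and a new sysfs-oriented entry point would pass true, so no current call site loses its error check.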
Re: [Xen-devel] [PATCH] tools: Fix wild memory allocations from c/s 250f0b4 and 85d78b4
On 18/05/15 15:34, Boris Ostrovsky wrote:
> On 05/18/2015 10:09 AM, Andrew Cooper wrote:
>> On 18/05/15 15:00, Boris Ostrovsky wrote:
>>> On 05/18/2015 08:57 AM, Andrew Cooper wrote:
>>>> These changesets cause the respective libxc functions to
>>>> unconditionally dereference their max_cpus/nodes parameters as part
>>>> of initial memory allocations. It will fail at obtaining the correct
>>>> number of cpus/nodes from Xen, as the guest handles will not be NULL.
>>>>
>>>> Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
>>>> CC: Ian Campbell ian.campb...@citrix.com
>>>> CC: Ian Jackson ian.jack...@eu.citrix.com
>>>> CC: Wei Liu wei.l...@citrix.com
>>>> CC: Boris Ostrovsky boris.ostrov...@oracle.com
>>>>
>>>> ---
>>>> Spotted by XenServer's Coverity run.
>>>> ---
>>>>  tools/libxl/libxl.c               |    4 ++--
>>>>  tools/misc/xenpm.c                |    4 ++--
>>>>  tools/python/xen/lowlevel/xc/xc.c |    4 ++--
>>>>  3 files changed, 6 insertions(+), 6 deletions(-)
>>>
>>> xenpm bug is already fixed (commit
>>> b315cd9cce5b6da7ca89b2d7bad3fb01e7716044 in the staging tree).
>>>
>>> I am not sure I understand why Coverity complains about other spots.
>>> For example, in libxl_get_cpu_topology() num_cpus can be left
>>> uninitialized only if xc_cputopoinfo(ctx->xch, &num_cpus, NULL)
>>> fails, in which case we go to 'GC_FREE; return ret;', so it's not
>>> ever used.
>>
>> xc_cputopoinfo(ctx->xch, &num_cpus, NULL) unconditionally
>> dereferences and reads num_cpus, and performs a memory allocation
>> based on the result.
>
> Ah, OK. xc_cputopoinfo() (or, rather, the hypervisor) actually doesn't
> use the value of dereferenced num_cpus in this case but obviously
> Coverity can't know about this.
>
> So Coverity cross-checks routines to see how callers use the
> arguments?

xc_cputopoinfo(ctx->xch, &num_cpus, NULL) dereferences num_cpus as part
of its DECLARE_HYPERCALL_BUFFER()s. All of this happens before getting
anywhere near the hypervisor.

~Andrew
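The failure mode discussed here can be shown with a self-contained sketch. The functions below (query_cpus and get_topology) are hypothetical stand-ins, not libxc or libxl code: query_cpus() mimics a "size query, then fetch" interface that reads its in/out count on entry to size a bounce buffer, even when the caller passes NULL just to learn the count. The caller therefore has to initialise the count; that the actual patch does exactly this is an assumption, since the diff itself is not quoted in this message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOTAL_CPUS 4u /* pretend the system has four CPUs */

/*
 * Hypothetical query function. '*max_cpus' is read unconditionally to
 * size a temporary buffer, so it must hold a defined value even when
 * 'ids' is NULL and the caller only wants the count back.
 */
static int query_cpus(unsigned *max_cpus, unsigned *ids)
{
    unsigned want = *max_cpus; /* read happens before any NULL check */
    unsigned *bounce = NULL;
    unsigned n, i;

    if (want != 0) {
        bounce = calloc(want, sizeof(*bounce));
        if (bounce == NULL)
            return -1;
    }

    if (ids != NULL) {
        n = want < TOTAL_CPUS ? want : TOTAL_CPUS;
        for (i = 0; i < n; i++)
            bounce[i] = i;
        if (n != 0)
            memcpy(ids, bounce, n * sizeof(*ids));
    }

    free(bounce);
    *max_cpus = TOTAL_CPUS;
    return 0;
}

/* Caller side: the count starts at 0 instead of being left indeterminate. */
static int get_topology(unsigned **ids_r, unsigned *count_r)
{
    unsigned count = 0; /* initialised because query_cpus() reads it */
    unsigned *ids;

    assert(ids_r != NULL && count_r != NULL);

    if (query_cpus(&count, NULL) != 0)
        return -1;

    ids = calloc(count, sizeof(*ids));
    if (ids == NULL)
        return -1;

    if (query_cpus(&count, ids) != 0) {
        free(ids);
        return -1;
    }

    *ids_r = ids;
    *count_r = count;
    return 0;
}
```

Had 'count' been left uninitialised, the first call would size its allocation from whatever happened to be on the stack, which is the "wild memory allocation" in the subject line.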
Re: [Xen-devel] [PATCH v2 10/41] arm/acpi : Print GIC information when MADT is parsed
Hi Parth,

On 17/05/15 21:03, Parth Dixit wrote:
> From: Naresh Bhat naresh.b...@linaro.org
>
> When MADT is parsed, print GIC information to make the boot
> log look pretty.
>
> Signed-off-by: Hanjun Guo hanjun@linaro.org
> Signed-off-by: Tomasz Nowicki tomasz.nowi...@linaro.org
> Signed-off-by: Naresh Bhat naresh.b...@linaro.org
> ---
>  xen/drivers/acpi/tables.c | 39 +++
>  1 file changed, 39 insertions(+)
>
> diff --git a/xen/drivers/acpi/tables.c b/xen/drivers/acpi/tables.c
> index 1beca79..684d8c9 100644
> --- a/xen/drivers/acpi/tables.c
> +++ b/xen/drivers/acpi/tables.c
> @@ -190,6 +190,45 @@ void __init acpi_table_print_madt_entry(struct acpi_subtable_header *header)
>          }
>          break;
>
> +    case ACPI_MADT_TYPE_GENERIC_INTERRUPT:
> +        {
> +            struct acpi_madt_generic_interrupt *p =
> +                (struct acpi_madt_generic_interrupt *)header;
> +            printk(KERN_INFO PREFIX
> +                "GIC (acpi_id[0x%04x] gic_id[0x%04x] %s)\n",
> +                p->uid, p->gic_id,
> +                (p->flags & ACPI_MADT_ENABLED) ? "enabled" : "disabled");

Printk indentation:

printk(KERN_INFO PREFIX
       "GIC ..."
       ...);

Also, it seems that the indentation doesn't match the rest of the switch
case.

> +        }
> +        break;
> +
> +    case ACPI_MADT_TYPE_GENERIC_DISTRIBUTOR:
> +        {
> +            struct acpi_madt_generic_distributor *p =
> +                (struct acpi_madt_generic_distributor *)header;
> +            printk(KERN_INFO PREFIX
> +                "GIC Distributor (id[0x%04x] address[0x%08llx] gsi_base[%d])\n",
> +                p->gic_id, (long long unsigned int)p->base_address, p->global_irq_base);

Ditto

> +        }
> +        break;
> +
> +    case ACPI_MADT_TYPE_GIC_MSI_FRAME:
> +        {
> +            struct acpi_madt_gic_msi_frame *p =
> +                (struct acpi_madt_gic_msi_frame *)header;
> +            printk("GIC MSI Frame (address[0x%08llx] msi_fame_id[%d])\n",
> +                (long long unsigned int)p->base_address, p->gic_msi_frame_id);

Ditto. Also missing KERN_INFO PREFIX.

> +        }
> +        break;
> +
> +    case ACPI_MADT_TYPE_GIC_REDISTRIBUTOR:
> +        {
> +            struct acpi_madt_gic_redistributor *p =
> +                (struct acpi_madt_gic_redistributor *)header;
> +            printk("GIC Redistributor (address[0x%08llx] region_size[0x%x])\n",
> +                (long long unsigned int)p->base_address, p->region_size);

Ditto. Also missing KERN_INFO PREFIX.

> +        }
> +        break;
> +
>      default:
>          printk(KERN_WARNING PREFIX
>                 "Found unsupported MADT entry (type = %#x)\n",

Regards,

-- 
Julien Grall
Re: [Xen-devel] [PATCH v2 11/41] arm/acpi : add GTDT support updated by ACPI 5.1
Hi Parth,

On 17/05/15 21:03, Parth Dixit wrote:
> With ACPI 5.0, we got per-processor timer support in GTDT,
> and ACPI 5.1 introduced the support for platform (memory-mapped)
> timers: GT Block and SBSA watchdog timer, add the code needed
> for the spec change.
>
> Signed-off-by: Hanjun Guo hanjun@linaro.org
> Signed-off-by: Naresh Bhat naresh.b...@linaro.org
> Signed-off-by: Parth Dixit parth.di...@linaro.org
> ---
>  xen/include/acpi/actbl3.h  | 92 +++---
>  xen/include/asm-arm/acpi.h |  2 +
>  2 files changed, 80 insertions(+), 14 deletions(-)
>
> diff --git a/xen/include/acpi/actbl3.h b/xen/include/acpi/actbl3.h
> index 8c61b5f..7664f9d 100644
> --- a/xen/include/acpi/actbl3.h
> +++ b/xen/include/acpi/actbl3.h
> @@ -241,33 +241,97 @@ struct acpi_s3pt_suspend {
>
>  /***
>   *
> - * GTDT - Generic Timer Description Table (ACPI 5.0)
> + * GTDT - Generic Timer Description Table (ACPI 5.1)
>   *        Version 1
>   *
>   **/
>
>  struct acpi_table_gtdt {
>  	struct acpi_table_header header;	/* Common ACPI table header */
> -	u64 address;
> -	u32 flags;
> -	u32 secure_pl1_interrupt;
> -	u32 secure_pl1_flags;
> -	u32 non_secure_pl1_interrupt;
> -	u32 non_secure_pl1_flags;
> +	u64 cnt_control_base_address;

This patch is out-of-sync compared to the Linux one (different naming,
comments...). Can you update it?

Regards,

-- 
Julien Grall
[Xen-devel] RFC/Proposal: Partial `libxenctrl` API/ABI stabilisation
% Partial `libxenctrl` API/ABI stabilisation
% Ian Campbell ian.campb...@citrix.com
% Draft A

# Introduction

The low-level `libxenctrl` library currently has an unstable API and ABI
and some of the hypervisor interfaces which it exposes are similarly
unstable.

However several external projects use some of these interfaces (at
least: qemu and kexec-tools), which presents problems for distros and
other consumers. In particular the need for spurious rebuilds of those
components against newer versions of Xen and difficulty supporting
parallel installation of different versions of Xen (useful during
upgrade).

This document considers whether parts of `libxenctrl` can be split out
into new libraries with more useful API and ABI guarantees.

XXX: I haven't yet done a full pass over the list of symbols in
`libxenctrl` to categorise them and decide where they belong. I thought
I would get some early feedback first and just picked a few
representative examples for each library.

# ABI/API Compatibility Classes

Compatibility opportunities:

* `LAPI` -- Library API
* `LABI` -- Library ABI
* `HABI` -- Hypervisor ABI (includes ioctls).

Each can either be Stable (`S`), Unstable (`U`) or don't care (`x`,
because made moot by higher level interface class, e.g. no real point
from an application PoV to a Stable HABI behind an unstable LABI).

Stable vs. Unstable is across major hypervisor version bump, always aim
to be stable across point releases.

For libraries Stable means `SONAME` major component, but forward
compatible only (i.e. old app on new library works, new app on old
library may not link due to e.g. new symbols. This is the normal SONAME
expectation). XXX find a link to the sort of scheme I mean.

`HABI` may include ioctls used to access those ABIs, typically these are
already required to be stable by the relevant OS maintainers.

A library interface may fall into one of these categories (I expect
there are others and we may not want any library to use some of even
these):

* Unstable `LAPI` (`Uxx`)
    * The Wild West
    * Current Examples: `libxenctrl`
* Stable `LAPI`, Unstable `LABI` (`SUx`)
    * Requires application rebuild for a new Xen version, but not
      application source code changes.
    * Current Examples: `libxenlight`
* Stable `LAPI`, Stable `LABI`, Unstable `HABI` (`SSU`)
    * Library can be switched out via dynamic linking across hypervisor
      upgrade (mechanism TBD, possibly distro specific, e.g. symlink
      switched on boot). Requires application/daemon restart but not
      rebuild (but changing hypervisor version involves a reboot
      anyway).
    * Current Examples: None??
* Stable `LAPI`, Stable `LABI`, Stable `HABI` (`SSS`)
    * Applications linked against a library will function against any
      hypervisor version.
    * Current Examples: None??

# Goal

Provide `SSU` or `SSS` interfaces for major external consumers of
current libxenctrl functionality.

Out of scope (for now): `SSU` or `SSS` interfaces for consumers of
`libxenlight`. Rationale: Let's focus on fixing external consumers of
libxenctrl first.

# Major External Consumers of `libxenctrl`

* qemu
* kexec tools
* in guest tools e.g. users of `libxenstore`, `libvchan`, and by
  extension `grant table` and `event channel` functionality.

NB: `libxenstore` is already `SSU` or `SSS` (XXX?)

# `libxenctrl` symbols

Gathered by:

    nm tools/libxc/libxenctrl.so | grep ' [Tt] ' | cut -f 3 -d \ | sort -u

`libxenctrl` today exposes many symbols which look to be internal. We
should consider also reducing that set by using
`__attribute__((visibility("hidden")))`.

The following proposes some functional groupings via some proposed split
library names.

In some cases we may also wish to consider replacing an API with one
which can be properly maintained going forwards. e.g.:

- perhaps replacing domctl's used by qemu with new stable hypercall
  ABIs and reflecting that in new library APIs.
- perhaps exposing more constrained versions of some broad interfaces
  for external users.

XXX: Change `xc_*` namespacing as well as library names?

## `libxenhypercall`

Core open/close interface, make a hypercall functionality, hypercall
buffers.

All other libraries likely depend on this. Applications do as well in
order to access open/close interface at least.

- xc_interface_close
- xc_interface_is_fake (???)
- xc_interface_open
- xc_hypercall_buffer_array_create
- xc_hypercall_buffer_array_destroy

## `libxenevtchn`

Interacting with `/dev/xen/evtchn`

- xc_evtchn_alloc_unbound
- xc_evtchn_bind_interdomain
- xc_evtchn_bind_unbound_port
- xc_evtchn_bind_virq
- xc_evtchn_close
- xc_evtchn_fd
- xc_evtchn_notify
- xc_evtchn_open
- xc_evtchn_pending
- xc_evtchn_reset
- xc_evtchn_status
- xc_evtchn_unbind
- xc_evtchn_unmask

## `libxengnttab`

Interacting with `/dev/xen/gnt{shr,alloc}`

XXX two libs or one?

- xc_gntshr_close
- xc_gntshr_munmap
-
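For readers unfamiliar with the symbol-visibility mechanism mentioned under "`libxenctrl` symbols", here is a minimal sketch of how a library can keep a helper out of its exported symbol table with GCC or Clang. The macro and function names (LIB_HIDDEN, lib_internal_scale, lib_public_entry) are invented for this example and do not correspond to anything in `libxenctrl`.

```c
/* Hypothetical names, for illustration only. */
#define LIB_HIDDEN __attribute__((visibility("hidden")))

/*
 * Callable from every object file inside the shared library, but not
 * exported from the resulting .so, so applications cannot link to it.
 */
LIB_HIDDEN int lib_internal_scale(int x)
{
    return x * 2;
}

/* Default visibility: this symbol is part of the library's ABI. */
int lib_public_entry(int x)
{
    return lib_internal_scale(x) + 1;
}
```

When built as a shared object, an `nm -D` listing would be expected to show `lib_public_entry` but not `lib_internal_scale`, which is the kind of reduction in exposed symbols the text above is suggesting.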
[Xen-devel] [PATCH 4/4] libxl: fix HVM vNUMA
This patch does two things:

The original code erroneously fills in xc_hvm_build_args before
generating vmemranges. The effect is that guest memory is populated
without vNUMA information. Move the hunk to the right place to fix this.

Move the subtraction of video ram to libxl__vnuma_build_vmemrange_hvm
because it's the central place for generating vmemranges.

Reported-by: Boris Ostrovsky boris.ostrov...@oracle.com
Signed-off-by: Wei Liu wei.l...@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Dario Faggioli dario.faggi...@citrix.com
---
 tools/libxl/libxl_dom.c   | 32 ++--
 tools/libxl/libxl_vnuma.c | 15 ++-
 2 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 608e574..e3d1338 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -961,6 +961,16 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
     if (info->num_vnuma_nodes != 0) {
         int i;
 
+        ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, &args);
+        if (ret) {
+            LOGEV(ERROR, ret, "hvm build vmemranges failed");
+            goto out;
+        }
+        ret = libxl__vnuma_config_check(gc, info, state);
+        if (ret) goto out;
+        ret = set_vnuma_info(gc, domid, info, state);
+        if (ret) goto out;
+
         args.nr_vmemranges = state->num_vmemranges;
         args.vmemranges = libxl__malloc(gc, sizeof(*args.vmemranges) *
                                         args.nr_vmemranges);
@@ -972,17 +982,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
             args.vmemranges[i].nid = state->vmemranges[i].nid;
         }
 
-        /* Consider video ram belongs to vmemrange 0 -- just shrink it
-         * by the size of video ram.
-         */
-        if (((args.vmemranges[0].end - args.vmemranges[0].start) >> 10)
-            < info->video_memkb) {
-            LOG(ERROR, "vmemrange 0 too small to contain video ram");
-            goto out;
-        }
-
-        args.vmemranges[0].end -= (info->video_memkb << 10);
-
         args.nr_vnodes = info->num_vnuma_nodes;
         args.vnode_to_pnode = libxl__malloc(gc, sizeof(*args.vnode_to_pnode) *
                                             args.nr_vnodes);
@@ -996,17 +995,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
         goto out;
     }
 
-    if (info->num_vnuma_nodes != 0) {
-        ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, &args);
-        if (ret) {
-            LOGEV(ERROR, ret, "hvm build vmemranges failed");
-            goto out;
-        }
-        ret = libxl__vnuma_config_check(gc, info, state);
-        if (ret) goto out;
-        ret = set_vnuma_info(gc, domid, info, state);
-        if (ret) goto out;
-    }
     ret = hvm_build_set_params(ctx->xch, domid, info, state->store_port,
                                &state->store_mfn, state->console_port,
                                &state->console_mfn, state->store_domid,
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
index cac78d7..56856d2 100644
--- a/tools/libxl/libxl_vnuma.c
+++ b/tools/libxl/libxl_vnuma.c
@@ -257,6 +257,7 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
     uint64_t hole_start, hole_end, next;
     int nid, nr_vmemrange;
     xen_vmemrange_t *vmemranges;
+    int rc;
 
     /* Derive vmemranges from vnode size and memory hole.
      *
@@ -277,6 +278,16 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
         libxl_vnode_info *p = &b_info->vnuma_nodes[nid];
         uint64_t remaining_bytes = p->memkb << 10;
 
+        /* Consider video ram belongs to vnode 0 */
+        if (nid == 0) {
+            if (p->memkb < b_info->video_memkb) {
+                LOG(ERROR, "vnode 0 too small to contain video ram");
+                rc = ERROR_INVAL;
+                goto out;
+            }
+            remaining_bytes -= (b_info->video_memkb << 10);
+        }
+
         while (remaining_bytes > 0) {
             uint64_t count = remaining_bytes;
 
@@ -300,7 +311,9 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
     state->vmemranges = vmemranges;
     state->num_vmemranges = nr_vmemrange;
 
-    return 0;
+    rc = 0;
+out:
+    return rc;
 }
 
 /*
-- 
1.9.1
[Xen-devel] [PATCH 1/4] libxc/libxl: fill xc_hvm_build_args in libxl
When building HVM guests, originally some fields of xc_hvm_build_args
are filled in xc_hvm_build (and buried in the wrong function), some are
set in libxl__build_hvm before passing xc_hvm_build_args to
xc_hvm_build. This is fragile.

After examining the code in xc_hvm_build that sets those fields, we can
in fact move setting of mmio_start etc into libxl. This way we
consolidate memory layout setting in libxl.

The setting of firmware data related fields is left in xc_hvm_build
because it depends on parsing ELF image. Those fields only point to
scratch data that doesn't affect memory layout.

There should be no change in the generated guest memory layout.

Signed-off-by: Wei Liu wei.l...@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
---
Cc: Chen, Tiejun tiejun.c...@intel.com

This might affect your RMRR patch series. I once said xc_hvm_build would
touch various xc_hvm_build_args fields that would affect guest memory
layout. It won't be that case anymore with this patch.
---
 tools/libxc/xc_hvm_build_x86.c | 37 +++--
 tools/libxl/libxl_dom.c        | 16 
 2 files changed, 23 insertions(+), 30 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index e45ae4a..92422bf 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -88,22 +88,14 @@ static int modules_init(struct xc_hvm_build_args *args,
     return 0;
 }
 
-static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
-                           uint64_t mmio_start, uint64_t mmio_size,
+static void build_hvm_info(void *hvm_info_page,
                            struct xc_hvm_build_args *args)
 {
     struct hvm_info_table *hvm_info = (struct hvm_info_table *)
         (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
-    uint64_t lowmem_end = mem_size, highmem_end = 0;
     uint8_t sum;
     int i;
 
-    if ( lowmem_end > mmio_start )
-    {
-        highmem_end = (1ull<<32) + (lowmem_end - mmio_start);
-        lowmem_end = mmio_start;
-    }
-
     memset(hvm_info_page, 0, PAGE_SIZE);
 
     /* Fill in the header. */
@@ -116,14 +108,10 @@ static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
     memset(hvm_info->vcpu_online, 0xff, sizeof(hvm_info->vcpu_online));
 
     /* Memory parameters. */
-    hvm_info->low_mem_pgend = lowmem_end >> PAGE_SHIFT;
-    hvm_info->high_mem_pgend = highmem_end >> PAGE_SHIFT;
+    hvm_info->low_mem_pgend = args->lowmem_end >> PAGE_SHIFT;
+    hvm_info->high_mem_pgend = args->highmem_end >> PAGE_SHIFT;
     hvm_info->reserved_mem_pgstart = ioreq_server_pfn(0);
 
-    args->lowmem_end = lowmem_end;
-    args->highmem_end = highmem_end;
-    args->mmio_start = mmio_start;
-
     /* Finish with the checksum. */
     for ( i = 0, sum = 0; i < hvm_info->length; i++ )
         sum += ((uint8_t *)hvm_info)[i];
@@ -251,8 +239,6 @@ static int setup_guest(xc_interface *xch,
     xen_pfn_t *page_array = NULL;
     unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT;
     unsigned long target_pages = args->mem_target >> PAGE_SHIFT;
-    uint64_t mmio_start = (1ull << 32) - args->mmio_size;
-    uint64_t mmio_size = args->mmio_size;
     unsigned long entry_eip, cur_pages, cur_pfn;
     void *hvm_info_page;
     uint32_t *ident_pt;
@@ -344,8 +330,8 @@ static int setup_guest(xc_interface *xch,
 
     for ( i = 0; i < nr_pages; i++ )
         page_array[i] = i;
-    for ( i = mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
-        page_array[i] += mmio_size >> PAGE_SHIFT;
+    for ( i = args->mmio_start >> PAGE_SHIFT; i < nr_pages; i++ )
+        page_array[i] += args->mmio_size >> PAGE_SHIFT;
 
     /*
      * Try to claim pages for early warning of insufficient memory available.
@@ -446,7 +432,7 @@ static int setup_guest(xc_interface *xch,
              * range */
             !check_mmio_hole(cur_pfn << PAGE_SHIFT,
                              SUPERPAGE_1GB_NR_PFNS << PAGE_SHIFT,
-                             mmio_start, mmio_size) )
+                             args->mmio_start, args->mmio_size) )
         {
             long done;
             unsigned long nr_extents = count >> SUPERPAGE_1GB_SHIFT;
@@ -545,7 +531,7 @@ static int setup_guest(xc_interface *xch,
               xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
               HVM_INFO_PFN)) == NULL )
         goto error_out;
-    build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size, args);
+    build_hvm_info(hvm_info_page, args);
     munmap(hvm_info_page, PAGE_SIZE);
 
     /* Allocate and clear special pages. */
@@ -661,12 +647,6 @@ int xc_hvm_build(xc_interface *xch, uint32_t domid,
     if ( args.image_file_name == NULL )
         return -1;
 
-    if ( args.mem_target == 0 )
-        args.mem_target = args.mem_size;
-
-    if ( args.mmio_size == 0 )
-        args.mmio_size = HVM_BELOW_4G_MMIO_LENGTH;
-
     /* An HVM guest must be initialised with at least 2MB memory. */
     if
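The memory layout values that this patch stops computing inside libxc follow a simple rule, which the sketch below restates as a standalone C function. The struct and function names (struct mem_layout, compute_layout) are made up for this example; the arithmetic is the one removed from build_hvm_info() and setup_guest() above: the MMIO hole sits just below 4 GiB, and any RAM that would overlap it is relocated above 4 GiB.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three derived values. */
struct mem_layout {
    uint64_t mmio_start;
    uint64_t lowmem_end;
    uint64_t highmem_end;
};

/*
 * mem_size and mmio_size are in bytes. The hole occupies
 * [4 GiB - mmio_size, 4 GiB); memory beyond its start is moved to
 * start at 4 GiB. highmem_end stays 0 when nothing is relocated.
 */
static struct mem_layout compute_layout(uint64_t mem_size, uint64_t mmio_size)
{
    struct mem_layout l;

    assert(mmio_size <= (1ull << 32));

    l.mmio_start = (1ull << 32) - mmio_size;
    l.lowmem_end = mem_size;
    l.highmem_end = 0;

    if (l.lowmem_end > l.mmio_start) {
        l.highmem_end = (1ull << 32) + (l.lowmem_end - l.mmio_start);
        l.lowmem_end = l.mmio_start;
    }

    return l;
}
```

As a worked case, 6 GiB of guest memory with a 1 GiB hole gives low memory ending at 3 GiB and high memory ending at 7 GiB, while a 1 GiB guest fits entirely below the hole.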
[Xen-devel] [PATCH 0/4] Fix HVM vNUMA
Boris discovered that HVM vNUMA didn't actually work. This patch series
fixes that.

The first patch is a prerequisite patch for the actual fixes. The second
patch is to help debugging. The fixes are in the last two patches, which
can be squashed into one if necessary.

Wei.

Wei Liu (4):
  libxc/libxl: fill xc_hvm_build_args in libxl
  libxc: print more error messages when failed
  libxc: rework vnuma bits in setup_guest
  libxl: fix HVM vNUMA

 tools/libxc/xc_hvm_build_x86.c | 129 ++---
 tools/libxl/libxl_dom.c        |  48 ---
 tools/libxl/libxl_vnuma.c      |  15 -
 3 files changed, 122 insertions(+), 70 deletions(-)

-- 
1.9.1
Re: [Xen-devel] [OSSTEST Nested PATCH v10 3/9] Refactor installation of overlays
On Mon, 2015-05-18 at 10:52 +0100, Ian Campbell wrote: On Thu, 2015-05-14 at 11:59 +0100, Ian Campbell wrote: On Wed, 2015-05-13 at 11:36 +0800, longtao.pang wrote: Based on Ian Campbell's v6_patch [04,05,06], I create this patch to refactor installation of overlays for guest as well as host used. Link of Ian Campbell's patch: http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg00467.html http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg00452.html http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg00459.html FYI I've just pushed these to osstest's pretest branch where they will be tested and appear in production hopefully tomorrow if all goes well. These have passed into the production osstest branch now. Great! Ian.
[Xen-devel] [PATCH 4/4] x86: switch default mapping attributes to non-executable
Only a very limited subset of mappings need to be done as executable
ones; in particular the direct mapping should not be executable to
limit the damage attackers can cause by exploiting security relevant
bugs.

The EFI change at once includes an adjustment to set NX only when
supported by the hardware.

Signed-off-by: Jan Beulich jbeul...@suse.com

--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -293,7 +293,7 @@ struct vcpu_guest_context *alloc_vcpu_gu
             free_vcpu_guest_context(NULL);
             return NULL;
         }
-        __set_fixmap(idx - i, page_to_mfn(pg), __PAGE_HYPERVISOR);
+        __set_fixmap(idx - i, page_to_mfn(pg), __PAGE_HYPERVISOR_RW);
         per_cpu(vgc_pages[i], cpu) = pg;
     }
     return (void *)fix_to_virt(idx);
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -160,7 +160,7 @@ void *map_domain_page(unsigned long mfn)

     spin_unlock(&dcache->lock);

-    l1e_write(&MAPCACHE_L1ENT(idx), l1e_from_pfn(mfn, __PAGE_HYPERVISOR));
+    l1e_write(&MAPCACHE_L1ENT(idx), l1e_from_pfn(mfn, __PAGE_HYPERVISOR_RW));

  out:
     local_irq_restore(flags);
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4416,7 +4416,7 @@ long set_gdt(struct vcpu *v,
     for ( i = 0; i < nr_pages; i++ )
     {
         v->arch.pv_vcpu.gdt_frames[i] = frames[i];
-        l1e_write(&pl1e[i], l1e_from_pfn(frames[i], __PAGE_HYPERVISOR));
+        l1e_write(&pl1e[i], l1e_from_pfn(frames[i], __PAGE_HYPERVISOR_RW));
     }

     xfree(pfns);
@@ -6004,7 +6004,7 @@ int create_perdomain_mapping(struct doma
                 if ( !IS_NIL(ppg) )
                     *ppg++ = pg;
                 l1tab[l1_table_offset(va)] =
-                    l1e_from_page(pg, __PAGE_HYPERVISOR | _PAGE_AVAIL0);
+                    l1e_from_page(pg, __PAGE_HYPERVISOR_RW | _PAGE_AVAIL0);
                 l2e_add_flags(*pl2e, _PAGE_AVAIL0);
             }
             else
@@ -6133,7 +6133,7 @@ void memguard_init(void)
         (unsigned long)__va(start),
         start >> PAGE_SHIFT,
         (__pa(&_end) + PAGE_SIZE - 1 - start) >> PAGE_SHIFT,
-        __PAGE_HYPERVISOR|MAP_SMALL_PAGES);
+        __PAGE_HYPERVISOR_RW|MAP_SMALL_PAGES);
     BUG_ON(start != xen_phys_start);
     map_pages_to_xen(
         XEN_VIRT_START,
@@ -6146,7 +6146,7 @@ static void __memguard_change_range(void
 {
     unsigned long _p = (unsigned long)p;
     unsigned long _l = (unsigned long)l;
-    unsigned int flags = __PAGE_HYPERVISOR | MAP_SMALL_PAGES;
+    unsigned int flags = __PAGE_HYPERVISOR_RW | MAP_SMALL_PAGES;

     /* Ensure we are dealing with a page-aligned whole number of pages. */
     ASSERT((_p&~PAGE_MASK) == 0);
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -900,7 +900,7 @@ void __init noreturn __start_xen(unsigne
             /* The only data mappings to be relocated are in the Xen area. */
             pl2e = __va(__pa(l2_xenmap));
             *pl2e++ = l2e_from_pfn(xen_phys_start >> PAGE_SHIFT,
-                                   PAGE_HYPERVISOR | _PAGE_PSE);
+                                   PAGE_HYPERVISOR_RWX | _PAGE_PSE);
             for ( i = 1; i < L2_PAGETABLE_ENTRIES; i++, pl2e++ )
             {
                 if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
@@ -1087,7 +1087,7 @@ void __init noreturn __start_xen(unsigne
             /* This range must not be passed to the boot allocator and
              * must also not be mapped with _PAGE_GLOBAL. */
             map_pages_to_xen((unsigned long)__va(map_e), PFN_DOWN(map_e),
-                             PFN_DOWN(e - map_e), __PAGE_HYPERVISOR);
+                             PFN_DOWN(e - map_e), __PAGE_HYPERVISOR_RW);
         }
         if ( s < map_s )
         {
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -895,6 +895,33 @@ void __init subarch_init_memory(void)
             share_xen_page_with_privileged_guests(page, XENSHARE_readonly);
         }
     }
+
+    /* Mark low 16Mb of direct map NX if hardware supports it. */
+    if ( !cpu_has_nx )
+        return;
+
+    v = DIRECTMAP_VIRT_START + (1UL << 20);
+    l3e = l4e_to_l3e(idle_pg_table[l4_table_offset(v)])[l3_table_offset(v)];
+    ASSERT(l3e_get_flags(l3e) & _PAGE_PRESENT);
+    do {
+        l2e = l3e_to_l2e(l3e)[l2_table_offset(v)];
+        ASSERT(l2e_get_flags(l2e) & _PAGE_PRESENT);
+        if ( l2e_get_flags(l2e) & _PAGE_PSE )
+        {
+            l2e_add_flags(l2e, _PAGE_NX_BIT);
+            l3e_to_l2e(l3e)[l2_table_offset(v)] = l2e;
+            v += 1 << L2_PAGETABLE_SHIFT;
+        }
+        else
+        {
+            l1_pgentry_t l1e = l2e_to_l1e(l2e)[l1_table_offset(v)];
+
+            ASSERT(l1e_get_flags(l1e) & _PAGE_PRESENT);
+            l1e_add_flags(l1e, _PAGE_NX_BIT);
+            l2e_to_l1e(l2e)[l1_table_offset(v)] = l1e;
+            v += 1 << L1_PAGETABLE_SHIFT;
+        }
+    } while ( v < DIRECTMAP_VIRT_START + (16UL << 20) );
 }

 long subarch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
@@ -1359,7 +1386,7 @@ int
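For readers following the new loop in subarch_init_memory(), here is a small standalone sketch of its stepping logic only. It is not Xen code: the callback standing in for the _PAGE_PSE test, the function names and the use of plain offsets instead of DIRECTMAP_VIRT_START-based addresses are all assumptions made for illustration. It shows that the walk advances by 2 MiB when the covering L2 entry is a superpage and by 4 KiB otherwise, touching one page-table entry per step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define L1_SHIFT 12 /* 4 KiB page on x86-64 */
#define L2_SHIFT 21 /* 2 MiB superpage on x86-64 */

/* Hypothetical stand-in for "is the L2 entry covering v a superpage?". */
typedef bool (*is_superpage_fn)(uint64_t v);

/*
 * Walk [start, end) like the loop in the patch: one entry is updated per
 * iteration, then v advances by the size that entry maps.
 * Returns the number of entries that would receive the NX bit.
 */
static unsigned long count_nx_updates(uint64_t start, uint64_t end,
                                      is_superpage_fn is_superpage)
{
    unsigned long updates = 0;
    uint64_t v = start;

    assert(start < end);
    do {
        v += is_superpage(v) ? (UINT64_C(1) << L2_SHIFT)
                             : (UINT64_C(1) << L1_SHIFT);
        updates++;
    } while (v < end);

    return updates;
}

/* Three illustrative layouts (assumptions, not taken from the patch). */
static bool all_small(uint64_t v) { (void)v; return false; }
static bool all_super(uint64_t v) { (void)v; return true; }
/* First 2 MiB mapped with 4 KiB pages, superpages above that. */
static bool low_2m_small(uint64_t v) { return v >= (UINT64_C(1) << L2_SHIFT); }
```

Calling count_nx_updates(1UL << 20, 16UL << 20, ...) with each layout gives a feel for how many entries are rewritten; note that with superpages the starting address (1 MiB) is not 2 MiB aligned, so the whole superpage containing it is affected.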
[Xen-devel] [PATCH 3/4] x86: move I/O emulation stubs off the stack
This is needed as stacks are going to become non-executable.

Signed-off-by: Jan Beulich jbeul...@suse.com

--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -2006,7 +2006,7 @@ static int emulate_privileged_op(struct
            ? (*(u32 *)&regs->reg = (val))               \
            : (*(u16 *)&regs->reg = (val)))
     unsigned long code_base, code_limit;
-    char io_emul_stub[32];
+    char *io_emul_stub = NULL;
     void (*io_emul)(struct cpu_user_regs *) __attribute__((__regparm__(1)));
     uint64_t val;

@@ -2195,6 +2195,9 @@ static int emulate_privileged_op(struct
      * GPR context. This is needed for some systems which (ab)use IN/OUT
      * to communicate with BIOS code in system-management mode.
      */
+    io_emul_stub = map_domain_page(this_cpu(stubs.mfn)) +
+                   (this_cpu(stubs.addr) & (PAGE_SIZE - 1)) +
+                   STUB_BUF_SIZE / 2;
     /* movq $host_to_guest_gpr_switch,%rcx */
     io_emul_stub[0] = 0x48;
     io_emul_stub[1] = 0xb9;
@@ -2212,7 +2215,7 @@ static int emulate_privileged_op(struct
     io_emul_stub[15] = 0xc3;

     /* Handy function-typed pointer to the stub. */
-    io_emul = (void *)io_emul_stub;
+    io_emul = (void *)(this_cpu(stubs.addr) + STUB_BUF_SIZE / 2);

     if ( ioemul_handle_quirk )
         ioemul_handle_quirk(opcode, &io_emul_stub[12], regs);
@@ -2777,9 +2780,13 @@ static int emulate_privileged_op(struct
  done:
     instruction_done(regs, eip, bpmatch);
  skip:
+    if ( io_emul_stub )
+        unmap_domain_page(io_emul_stub);
     return EXCRET_fault_fixed;

  fail:
+    if ( io_emul_stub )
+        unmap_domain_page(io_emul_stub);
     return 0;
 }
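The patch writes the stub bytes through one mapping and calls them through another. A rough standalone sketch of that address arithmetic follows; it is not Xen code, the helper names are made up, and the STUB_BUF_SIZE value is an assumption chosen only so the example compiles. The point is that both aliases refer to the same offset inside the stub page: the second half of the per-CPU stub buffer.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE     4096u
#define STUB_BUF_SIZE 128u  /* assumed value, for illustration only */

/* Offset of the I/O emulation stub within its page. */
static uintptr_t stub_page_offset(uintptr_t stub_addr)
{
    return (stub_addr & (PAGE_SIZE - 1)) + STUB_BUF_SIZE / 2;
}

/*
 * Writable alias: where the instruction bytes get stored, given a
 * temporary writable mapping of the page that holds the stub.
 */
static unsigned char *stub_write_ptr(unsigned char *mapped_page,
                                     uintptr_t stub_addr)
{
    assert(stub_page_offset(stub_addr) < PAGE_SIZE);
    return mapped_page + stub_page_offset(stub_addr);
}

/* Executable alias: the address that is eventually called. */
static uintptr_t stub_exec_addr(uintptr_t stub_addr)
{
    return stub_addr + STUB_BUF_SIZE / 2;
}
```

Keeping the writable and executable views separate is what allows the executable mapping to stay read-only, which is the direction the series is heading with non-executable stacks.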
Re: [Xen-devel] [PATCH 1/3] x86: support additional Broadwell model
The title (of course) was meant to be

[PATCH 1/3] mwait-idle: support additional Broadwell model

Jan

On 18.05.15 at 14:54, jbeul...@suse.com wrote:
> Signed-off-by: Len Brown len.br...@intel.com
> [Linux commit bea57077e44ec9c1e6d3a3c142c8a3c0289e290d]
> Signed-off-by: Jan Beulich jbeul...@suse.com
>
> --- a/xen/arch/x86/cpu/mwait-idle.c
> +++ b/xen/arch/x86/cpu/mwait-idle.c
> @@ -683,6 +683,7 @@ static struct intel_idle_id {
>      ICPU(0x46, hsw),
>      ICPU(0x4d, avn),
>      ICPU(0x3d, bdw),
> +    ICPU(0x47, bdw),
>      ICPU(0x4f, bdw),
>      ICPU(0x56, bdw),
>      {}
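As a side note on what the one-line change does, the table is a model-number lookup terminated by an empty entry. The sketch below mimics that with a plain array; the struct, the string labels and the function name are illustrative assumptions and do not reflect how the ICPU() macro is really defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative only: maps a CPU model number to a label for its C-state table. */
struct idle_model {
    unsigned int model;
    const char *table;
};

/* Model numbers copied from the quoted hunk, including the new 0x47 entry. */
static const struct idle_model idle_models[] = {
    { 0x46, "hsw" },
    { 0x4d, "avn" },
    { 0x3d, "bdw" },
    { 0x47, "bdw" },
    { 0x4f, "bdw" },
    { 0x56, "bdw" },
    { 0, NULL } /* terminator, playing the role of the {} entry */
};

/* Return the table label for a model, or NULL if the model is not listed. */
static const char *idle_table_for_model(unsigned int model)
{
    const struct idle_model *m;

    assert(model <= 0xff); /* x86 model numbers fit in 8 bits */
    for (m = idle_models; m->table != NULL; m++)
        if (m->model == model)
            return m->table;

    return NULL;
}
```

A model that is missing from the table simply gets no match, which is why each new CPU model needs its own entry even when it reuses an existing C-state table.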
Re: [Xen-devel] [PATCH v2 05/41] acpi : add helper function for mapping memory
Hi Parth,

On 17/05/15 21:03, Parth Dixit wrote:
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 935999e..096e9ef 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -2,6 +2,7 @@ subdir-$(arm32) += arm32
>  subdir-$(arm64) += arm64
>  subdir-y += platforms
>  subdir-$(arm64) += efi
> +subdir-$(HAS_ACPI) += acpi
>
>  obj-$(EARLY_PRINTK) += early_printk.o
>  obj-y += cpu.o
> diff --git a/xen/arch/arm/acpi/Makefile b/xen/arch/arm/acpi/Makefile
> new file mode 100644
> index 000..b5be22d
> --- /dev/null
> +++ b/xen/arch/arm/acpi/Makefile
> @@ -0,0 +1 @@
> +obj-y += lib.o
> diff --git a/xen/arch/arm/acpi/lib.c b/xen/arch/arm/acpi/lib.c
> new file mode 100644
> index 000..650beed
> --- /dev/null
> +++ b/xen/arch/arm/acpi/lib.c
> @@ -0,0 +1,8 @@
> +#include <xen/acpi.h>
> +#include <asm/mm.h>
> +
> +void __iomem *
> +acpi_os_map_iomem(acpi_physical_address phys, acpi_size size)
> +{
> +    return __va(phys);
> +}

I would have preferred two distinct patches: one for the refactoring of
acpi_os_map_memory and the other for implementing the ARM part, explaining
why only __va is used.

__va should only be used when the memory is direct-mapped to Xen (i.e.
accessible directly). On ARM64, this is only the case for the RAM. Can you
confirm that ACPI will always reside in RAM?

I already asked the same question on the previous version but got no
answer from you...

>  /*
>   * Important Safety Note:  The fixed ACPI page numbers are *subtracted*
>   * from the fixed base.  That's why we start at FIX_ACPI_END and
> diff --git a/xen/drivers/acpi/osl.c b/xen/drivers/acpi/osl.c
> index 93c983c..958caae 100644
> --- a/xen/drivers/acpi/osl.c
> +++ b/xen/drivers/acpi/osl.c
> @@ -87,16 +87,7 @@ acpi_physical_address __init acpi_os_get_root_pointer(void)
>  void __iomem *
>  acpi_os_map_memory(acpi_physical_address phys, acpi_size size)
>  {
> -    if (system_state >= SYS_STATE_active) {
> -        unsigned long pfn = PFN_DOWN(phys);
> -        unsigned int offs = phys & (PAGE_SIZE - 1);
> -
> -        /* The low first Mb is always mapped. */
> -        if ( !((phys + size - 1) >> 20) )
> -            return __va(phys);
> -        return __vmap(&pfn, PFN_UP(offs + size), 1, 1, PAGE_HYPERVISOR_NOCACHE) + offs;
> -    }
> -    return __acpi_map_table(phys, size);
> +    return acpi_os_map_iomem(phys, size);

The naming is wrong. It's really hard to differentiate acpi_os_map_memory
from acpi_os_map_iomem. I would rename it to something more meaningful, such
as arch_acpi_os_map_memory.

Although, given that acpi_os_map_memory only calls acpi_os_map_iomem, I
would make acpi_os_map_memory per-architecture. FWIW, it's what Linux does.

--
Julien Grall
Re: [Xen-devel] [PATCH v2 07/41] arm/acpi : Introduce ARM Boot Architecture Flags in FADT
Hi Parth,

On 17/05/15 21:03, Parth Dixit wrote:
> The Power State Coordination Interface (PSCI) defines an API that
> can be used to coordinate power control amongst the various supervisory
> systems concurrently running on a device. ACPI support for this
> technology would require the addition of two flags: PSCI_COMPLIANT and
> PSCI_USE_HVC. When set, the former signals to the OS that the hardware
> is PSCI compliant. The latter selects the appropriate conduit for PSCI
> calls by toggling between Hypervisor Calls (HVC) and Secure Monitor
> Calls (SMC).
>
> An ARM Boot Architecture Flags structure to support new ARM hardware
> was introduced in FADT in ACPI 5.1, add the code accordingly to
> implement that in ACPICA core.
>
> Since ACPI 5.1 doesn't support self-defined PSCI function IDs,
> only PSCI 0.2+ is supported in ACPI.
>
> Signed-off-by: Hanjun Guo hanjun@linaro.org
> Signed-off-by: Naresh Bhat naresh.b...@linaro.org
> ---
>  xen/include/acpi/actbl.h | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/xen/include/acpi/actbl.h b/xen/include/acpi/actbl.h
> index 856945d..96fd1d8 100644
> --- a/xen/include/acpi/actbl.h
> +++ b/xen/include/acpi/actbl.h
> @@ -244,7 +244,8 @@ struct acpi_table_fadt {
>          u32 flags;              /* Miscellaneous flag bits (see below for individual flags) */
>          struct acpi_generic_address reset_register;     /* 64-bit address of the Reset register */
>          u8 reset_value;         /* Value to write to the reset_register port to reset the system */
> -        u8 reserved4[3];        /* Reserved, must be zero */
> +    u16 arm_boot_flags; /* ARM Boot Architecture Flags (see below for individual flags) */
> +    u8 minor_revision;  /* Minor version of this FADT structure */

Wrong indentation. The file is using hard tabs.

>          u64 Xfacs;              /* 64-bit physical address of FACS */
>          u64 Xdsdt;              /* 64-bit physical address of DSDT */
>          struct acpi_generic_address xpm1a_event_block;  /* 64-bit Extended Power Mgt 1a Event Reg Blk address */
> @@ -270,6 +271,11 @@ struct acpi_table_fadt {
>
>  #define FADT2_REVISION_ID               3
>
> +/* Masks for FADT ARM Boot Architecture Flags (arm_boot_flags) */
> +
> +#define ACPI_FADT_PSCI_COMPLIANT    (1)         /* 00: PSCI 0.2+ is implemented */
> +#define ACPI_FADT_PSCI_USE_HVC      (1<<1)      /* 01: HVC must be used instead of SMC as the PSCI conduit */
> +
>  /* Masks for FADT flags */
>
>  #define ACPI_FADT_WBINVD            (1)         /* 00: [V1] The wbinvd instruction works properly */
> @@ -345,7 +351,7 @@ enum acpi_prefered_pm_profiles {
>   *     FADT V5  size: 0x10C
>   */
>  #define ACPI_FADT_V1_SIZE       (u32) (ACPI_FADT_OFFSET (flags) + 4)
> -#define ACPI_FADT_V2_SIZE       (u32) (ACPI_FADT_OFFSET (reserved4[0]) + 3)
> +#define ACPI_FADT_V2_SIZE       (u32) (ACPI_FADT_OFFSET (arm_boot_flags) + 3)

Linux is using ACPI_FADT_OFFSET(minor_revision) + 1. Can you use the same
here?

Also, I've noticed that the patch (see 9eb1105) is slightly different,
mostly in documenting the ACPI version for the fields. Can you update the
patch based on the Linux code, so that it uses the latest (i.e. upstreamed)
version?

Regards,

--
Julien Grall
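To make the meaning of the two flags concrete, here is a minimal standalone sketch of how a consumer could decode them. The mask values follow the quoted hunk (bit 0 and bit 1); the enum and the function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Mask values as in the quoted patch: bit 0 and bit 1 of arm_boot_flags. */
#define ACPI_FADT_PSCI_COMPLIANT (1u)
#define ACPI_FADT_PSCI_USE_HVC   (1u << 1)

/* Hypothetical result type, for illustration only. */
enum psci_conduit {
    PSCI_CONDUIT_NONE, /* platform is not PSCI compliant */
    PSCI_CONDUIT_SMC,  /* PSCI calls go through Secure Monitor Calls */
    PSCI_CONDUIT_HVC   /* PSCI calls go through Hypervisor Calls */
};

/* Decode the 16-bit ARM Boot Architecture Flags field. */
static enum psci_conduit psci_conduit_from_flags(uint16_t arm_boot_flags)
{
    if (!(arm_boot_flags & ACPI_FADT_PSCI_COMPLIANT))
        return PSCI_CONDUIT_NONE;

    return (arm_boot_flags & ACPI_FADT_PSCI_USE_HVC) ? PSCI_CONDUIT_HVC
                                                     : PSCI_CONDUIT_SMC;
}

/* Usage example: USE_HVC is only meaningful when PSCI_COMPLIANT is set. */
static void psci_flags_example(void)
{
    assert(psci_conduit_from_flags(0) == PSCI_CONDUIT_NONE);
    assert(psci_conduit_from_flags(ACPI_FADT_PSCI_COMPLIANT) == PSCI_CONDUIT_SMC);
}
```

Note that the new u16 plus u8 pair occupies the same three bytes as the old reserved4[3], so the offsets of the fields that follow are unchanged.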
Re: [Xen-devel] [PATCH] tools: Fix wild memory allocations from c/s 250f0b4 and 85d78b4
On 05/18/2015 10:09 AM, Andrew Cooper wrote:
> On 18/05/15 15:00, Boris Ostrovsky wrote:
>> On 05/18/2015 08:57 AM, Andrew Cooper wrote:
>>> These changesets cause the respective libxc functions to unconditionally
>>> dereference their max_cpus/nodes parameters as part of initial memory
>>> allocations. It will fail at obtaining the correct number of cpus/nodes
>>> from Xen, as the guest handles will not be NULL.
>>>
>>> Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
>>> CC: Ian Campbell ian.campb...@citrix.com
>>> CC: Ian Jackson ian.jack...@eu.citrix.com
>>> CC: Wei Liu wei.l...@citrix.com
>>> CC: Boris Ostrovsky boris.ostrov...@oracle.com
>>>
>>> ---
>>> Spotted by XenServers Coverity run.
>>> ---
>>>  tools/libxl/libxl.c               |    4 ++--
>>>  tools/misc/xenpm.c                |    4 ++--
>>>  tools/python/xen/lowlevel/xc/xc.c |    4 ++--
>>>  3 files changed, 6 insertions(+), 6 deletions(-)
>>
>> xenpm bug is already fixed (commit b315cd9cce5b6da7ca89b2d7bad3fb01e7716044
>> in the staging tree).
>>
>> I am not sure I understand why Coverity complains about other spots. For
>> example, in libxl_get_cpu_topology() num_cpus can be left uninitialized
>> only if xc_cputopoinfo(ctx->xch, &num_cpus, NULL) fails, in which case we
>> go to 'GC_FREE; return ret;', so it's not ever used.
>
> xc_cputopoinfo(ctx->xch, &num_cpus, NULL) unconditionally dereferences and
> reads num_cpus, and performs a memory allocation based on the result.

Ah, OK. xc_cputopoinfo() (or, rather, the hypervisor) actually doesn't use
the value of the dereferenced num_cpus in this case, but obviously Coverity
can't know about this.

So Coverity cross-checks routines to see how callers use the arguments?

-boris
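The pattern under discussion is the usual two-call query: call once with a NULL buffer to learn the count, allocate, then call again. The sketch below is a self-contained imitation under stated assumptions: query_items() is a made-up stand-in, not the real xc_cputopoinfo(), and its data is invented. It reads the count argument unconditionally, as the commit message says the libxc functions do, which is why the caller must initialise the variable before the sizing call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a libxc-style query function. */
static int query_items(unsigned *max_items, int *out)
{
    static const int data[] = { 10, 20, 30 };
    const unsigned avail = sizeof(data) / sizeof(data[0]);
    unsigned requested, i, n;

    assert(max_items != NULL);
    requested = *max_items; /* read unconditionally, even for the sizing call */

    if (out == NULL) {      /* sizing call: report how many items exist */
        *max_items = avail;
        return 0;
    }

    n = requested < avail ? requested : avail;
    for (i = 0; i < n; i++)
        out[i] = data[i];
    *max_items = n;
    return 0;
}

/* Caller side: note the initialisation of n before the first call. */
static int *fetch_all(unsigned *count)
{
    unsigned n = 0; /* without this, the callee would read garbage */
    int *buf;

    if (query_items(&n, NULL) != 0 || n == 0)
        return NULL;

    buf = calloc(n, sizeof(*buf));
    if (buf == NULL)
        return NULL;

    if (query_items(&n, buf) != 0) {
        free(buf);
        return NULL;
    }

    *count = n;
    return buf;
}
```

Even when the callee ends up ignoring the value on the sizing call, reading an uninitialised variable is what a static analyser flags, so initialising it at the declaration is the cheap fix.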
Re: [Xen-devel] [PATCH v2 08/41] arm/acpi : Parse FADT table and get PSCI flags
Hi Parth,

On 17/05/15 21:03, Parth Dixit wrote:
> There are two flags: PSCI_COMPLIANT and PSCI_USE_HVC. When set,
> the former signals to the OS that the hardware is PSCI compliant.
> The latter selects the appropriate conduit for PSCI calls by
> toggling between Hypervisor Calls (HVC) and Secure Monitor Calls
> (SMC). FADT table contains such information, parse FADT to get
> the flags for future usage.
>
> At the same time, only ACPI 5.1 or higher version supports PSCI,
> and FADT Major.Minor version was introduced in ACPI 5.1, so we
> will check the version and only parse FADT table with version >= 5.1.
>
> If firmware provides ACPI tables with ACPI version less than 5.1,
> OS will be messed up with those information, so disable ACPI if we
> get an FADT table with version less than 5.1.
>
> Modify FADT table before passing it to Dom0.
> Set PSCI_COMPLIANT and PSCI_USE_HVC.
>
> Signed-off-by: Hanjun Guo hanjun@linaro.org
> Signed-off-by: Naresh Bhat naresh.b...@linaro.org
> Signed-off-by: Parth Dixit parth.di...@linaro.org
> ---
>  xen/arch/arm/acpi/boot.c   | 38 ++
>  xen/arch/arm/acpi/lib.c    | 11 +++
>  xen/include/asm-arm/acpi.h | 11 +++
>  3 files changed, 60 insertions(+)
>
> diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c
> index 8dc69d5..57eb33c 100644
> --- a/xen/arch/arm/acpi/boot.c
> +++ b/xen/arch/arm/acpi/boot.c
> @@ -24,9 +24,40 @@
>  #include <xen/init.h>
>  #include <xen/acpi.h>
> +#include <xen/errno.h>
> +#include <acpi/actables.h>
> +#include <xen/mm.h>
>
>  #include <asm/acpi.h>
>
> +static int __init acpi_parse_fadt(struct acpi_table_header *table)
> +{
> +    struct acpi_table_fadt *fadt = (struct acpi_table_fadt *)table;
> +    u8 checksum;
> +
> +    /*
> +     * Revision in table header is the FADT Major revision, and there
> +     * is a minor revision of FADT which was introduced by ACPI 5.1,
> +     * we only deal with ACPI 5.1 or newer revision to get GIC and SMP
> +     * boot protocol configuration data, or we will disable ACPI.
> +     */
> +    if ( table->revision > 5 ||
> +    ( table->revision == 5 && fadt->minor_revision >= 1 ) )

The indentation looks wrong here.

> +    {
> +        fadt->arm_boot_flags |= ( ACPI_FADT_PSCI_COMPLIANT | ACPI_FADT_PSCI_USE_HVC );
> +        checksum = acpi_tb_checksum(ACPI_CAST_PTR(u8, fadt), fadt->header.length);
> +        fadt->header.checksum -= checksum;
> +        clean_dcache_va_range(fadt, sizeof(struct acpi_table_fadt));

Most of this patch deals with setting up the DOM0 FADT correctly, although
the title doesn't mention it and there are only 2 lines about it in the
commit message. This would also need a comment in the code explaining what
this code does.

Furthermore, I don't think this code should live here. The function is
called by acpi_boot_table_init, which should initialize ACPI and not try to
modify the ACPI table.

We should have a specific dom0 ACPI function to modify/add ACPI tables when
necessary.

> +        return 0;
> +    }
> +
> +    printk("Unsupported FADT revision %d.%d, should be 5.1+, will disable ACPI\n",
> +           table->revision, fadt->minor_revision);
> +    disable_acpi();
> +
> +    return -EINVAL;
> +}
> +
>  /*
>   * acpi_boot_table_init() called from setup_arch(), always.
>   *      1. find RSDP and get its address, and then find XSDT
> @@ -51,6 +82,13 @@ int __init acpi_boot_table_init(void)
>          return error;
>      }
>
> +    if ( acpi_table_parse(ACPI_SIG_FADT, acpi_parse_fadt) )
> +    {
> +        /* disable ACPI if no FADT is found */
> +        disable_acpi();
> +        printk("Can't find FADT\n");
> +    }
> +

I think the code readability would be improved if we introduced
acpi_get_table_with_size. However, this is not implemented by ACPICA but
only by Linux, so Jan may not agree to import it.

>      return 0;
>  }
> diff --git a/xen/arch/arm/acpi/lib.c b/xen/arch/arm/acpi/lib.c
> index 650beed..fd9bfa4 100644
> --- a/xen/arch/arm/acpi/lib.c
> +++ b/xen/arch/arm/acpi/lib.c
> @@ -6,3 +6,14 @@ acpi_os_map_iomem(acpi_physical_address phys, acpi_size size)
>  {
>      return __va(phys);
>  }

Missing blank line.

> +/* 1 to indicate PSCI 0.2+ is implemented */
> +inline bool_t acpi_psci_present(void)

inline is not necessary. Although, I would move the function to the header
because it's very simple.

> +{
> +    return acpi_gbl_FADT.arm_boot_flags & ACPI_FADT_PSCI_COMPLIANT;
> +}
> +
> +/* 1 to indicate HVC is present instead of SMC as the PSCI conduit */
> +inline bool_t acpi_psci_hvc_present(void)

Ditto.

> +{
> +    return acpi_gbl_FADT.arm_boot_flags & ACPI_FADT_PSCI_USE_HVC;
> +}

Regards,

--
Julien Grall
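Two pieces of the quoted acpi_parse_fadt() can be tried in isolation: the revision test and the checksum fix-up after modifying the table. The sketch below is standalone and uses made-up function names; it relies only on the general ACPI rule that all bytes of a table, checksum byte included, sum to zero modulo 256, and on the checksum byte sitting at offset 9 of the standard table header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True for FADT revision 5.1 or newer, mirroring the check in the patch. */
static bool fadt_revision_supported(uint8_t major, uint8_t minor)
{
    return major > 5 || (major == 5 && minor >= 1);
}

/* Sum of all table bytes modulo 256; a valid ACPI table sums to zero. */
static uint8_t table_sum(const uint8_t *table, size_t len)
{
    uint8_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum = (uint8_t)(sum + table[i]);

    return sum;
}

/*
 * After editing a table, subtract the current sum from the checksum byte
 * so that the total goes back to zero. This is the same idea as the
 * "header.checksum -= checksum" line in the quoted code.
 */
static void fix_checksum(uint8_t *table, size_t len, size_t checksum_off)
{
    assert(checksum_off < len);
    table[checksum_off] = (uint8_t)(table[checksum_off] - table_sum(table, len));
}
```

The subtraction works whether or not the old checksum was valid, because the sum is computed over the table as it currently stands, checksum byte included.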
Re: [Xen-devel] [PATCH v2 05/41] acpi : add helper function for mapping memory
On 18/05/15 15:01, Jan Beulich wrote:
> On 18.05.15 at 15:26, julien.gr...@citrix.com wrote:
>> On 17/05/15 21:03, Parth Dixit wrote:
>>> --- /dev/null
>>> +++ b/xen/arch/arm/acpi/lib.c
>>> @@ -0,0 +1,8 @@
>>> +#include <xen/acpi.h>
>>> +#include <asm/mm.h>
>>> +
>>> +void __iomem *
>>> +acpi_os_map_iomem(acpi_physical_address phys, acpi_size size)
>>> +{
>>> +    return __va(phys);
>>> +}
>>
>> I would have preferred two distinct patches: one for the refactoring of
>> acpi_os_map_memory and the other for implementing the ARM part, explaining
>> why only __va is used.
>
> +1
>
>> Although, given that acpi_os_map_memory only calls acpi_os_map_iomem, I
>> would make acpi_os_map_memory per-architecture. FWIW, it's what Linux does.
>
> In which Linux version did you see this to be the case? Certainly
> not 4.1-rc or anything recent... What I'm trying to get at is that
> we should please be very careful with deviating from the ACPI CA
> derived naming as well as with what is to live in os.c.

Sorry, I had in mind an older version of the ACPI series for Linux. I
forgot to check what was really done.

Although, acpi_os_map_iomem is also part of the ACPI CA. Would a function
arch_acpi_os_map_memory suit you?

Regards,

--
Julien Grall
[Xen-devel] 4.5.1-rc1 has been tagged
All, aiming at a release with presumably (i.e. as usual) one more RC, please test! Thanks, Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 05/41] acpi : add helper function for mapping memory
On 18/05/15 15:32, Jan Beulich wrote:
> On 18.05.15 at 16:20, julien.gr...@citrix.com wrote:
>> Although, acpi_os_map_iomem is also part of the ACPI CA. Would a function
>> arch_acpi_os_map_memory suit you?
>
> Only if we - other than Linux - really need this to be arch dependent.

The current implementation of acpi_os_map_iomem is not valid on ARM: the
first MB is not always mapped, and we have already mapped the table if it
lives in RAM.

Regards,

--
Julien Grall
Re: [Xen-devel] [PATCH V3 2/6] libxl_read_file_contents: fix reading sysfs file
On Sun, Apr 19, 2015 at 11:50:48AM +0800, Chunyan Liu wrote:
> Sysfs file has size=4096 but actual file content is less than that.
> Current libxl_read_file_contents will treat it as error when file size
> and actual file content differs, so reading sysfs file content with
> this function always fails.
>
> Fix it so that we can reuse this function to get sysfs file content in
> later pvusb work.

I'm not sure if I should classify this as a bug in Linux's sysfs interface.
In any case, we would still like to detect the error case where the file
size changes under our feet.

I have a dumb idea of having a dedicated function that is used to read
sysfs, but I'm not sure if it is too dumb. I will wait for Ian and Ian's
input on this.

Wei.

> Signed-off-by: Chunyan Liu cy...@suse.com
> ---
>  tools/libxl/libxl_utils.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
> index 9053b27..18ad2b8 100644
> --- a/tools/libxl/libxl_utils.c
> +++ b/tools/libxl/libxl_utils.c
> @@ -363,12 +363,9 @@ int libxl_read_file_contents(libxl_ctx *ctx, const char *filename,
>          if (!data) goto xe;
>
>          rs = fread(data, 1, datalen, f);
> -        if (rs != datalen) {
> +        if (rs != datalen && !feof(f)) {
>              if (ferror(f))
>                  LOGE(ERROR, "failed to read %s", filename);
> -            else if (feof(f))
> -                LOG(ERROR, "%s changed size while we were reading it",
> -                    filename);
>              else
>                  abort();
>              goto xe;
> --
> 1.8.5.2
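To illustrate the trade-off being discussed, here is a minimal sketch of a reader that accepts a file shorter than its reported size (the sysfs case) but still rejects one that turns out to be longer, along the lines of the "insist on EOF" suggestion made elsewhere in this thread. It is not the libxl implementation; the function name and its interface are assumptions for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read at most size_hint bytes from f into a freshly allocated,
 * NUL-terminated buffer. A short read is accepted only if it ended at EOF;
 * data beyond size_hint is treated as an error.
 * Returns NULL on failure, otherwise the buffer (caller frees).
 */
static char *read_up_to(FILE *f, size_t size_hint, size_t *len_out)
{
    char *data;
    size_t rs;

    assert(f != NULL && len_out != NULL);

    data = malloc(size_hint + 1); /* +1 for the terminating NUL */
    if (data == NULL)
        return NULL;

    rs = fread(data, 1, size_hint, f);
    if (rs != size_hint && !feof(f)) {
        /* Short read that is not EOF: a real I/O error. */
        free(data);
        return NULL;
    }

    if (rs == size_hint && (fgetc(f) != EOF || ferror(f))) {
        /* The file is larger than expected, or the probe failed. */
        free(data);
        return NULL;
    }

    data[rs] = '\0';
    *len_out = rs;
    return data;
}
```

With a sysfs attribute, size_hint would be the 4096 reported by stat and the returned length the few bytes actually present.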
Re: [Xen-devel] [PATCH v2 05/41] acpi : add helper function for mapping memory
On 18.05.15 at 16:20, julien.gr...@citrix.com wrote:
> Although, acpi_os_map_iomem is also part of the ACPI CA. Would a function
> arch_acpi_os_map_memory suit you?

Only if we - other than Linux - really need this to be arch dependent.

Jan
Re: [Xen-devel] [PATCH v21 06/14] x86/VPMU: Initialize PMU for PV(H) guests
On 08.05.15 at 23:06, boris.ostrov...@oracle.com wrote:
> Code for initializing/tearing down PMU for PV guests
>
> Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com
> Acked-by: Daniel De Graaf dgde...@tycho.nsa.gov

Acked-by: Jan Beulich jbeul...@suse.com
Re: [Xen-devel] [RFC v2 00/15] Add VT-d Posted-Interrupts support
> -----Original Message-----
> From: Tian, Kevin
> Sent: Monday, May 18, 2015 1:33 PM
> To: Wu, Feng; xen-devel@lists.xen.org
> Cc: k...@xen.org; jbeul...@suse.com; andrew.coop...@citrix.com; Zhang, Yang Z;
> george.dun...@eu.citrix.com
> Subject: RE: [RFC v2 00/15] Add VT-d Posted-Interrupts support
>
> > From: Wu, Feng
> > Sent: Friday, May 08, 2015 5:07 PM
> >
> > VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
> > With VT-d Posted-Interrupts enabled, external interrupts from
> > direct-assigned devices can be delivered to guests without VMM
> > intervention when guest is running in non-root mode.
> >
> > You can find the VT-d Posted-Interrupts Spec. in the following URL:
> > http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
> >
> > This patch set follows the following design:
> > http://article.gmane.org/gmane.comp.emulators.xen.devel/236476
> >
> > v1 -> v2
> > 1. Add the design doc.
> > 2. Coding style fix.
> > 3. Add some comments for struct pi_desc.
> > 4. Extend 'struct iremap_entry' to a more common format.
> > 5. Delete the atomic helper functions for pi descriptor manipulation.
> > 6. Add the new command line in docs/misc/xen-command-line.markdown.
> > 7. Use macros to replace some magic numbers.
>
> Though generally this version looks good to me, it'd be clearer if you
> could give v1->v2 information per patch to help review. :-)

Good suggestion, Kevin! And here it is:

[RFC v2 01/15] Vt-d Posted-intterrupt (PI) design
    Add the design doc.

[RFC v2 04/15] vmx: Extend struct pi_desc to support VT-d Posted-Interrupts
    Add some comments for struct pi_desc.

[RFC v2 05/15] vmx: Initialize VT-d Posted-Interrupts Descriptor
    1. if ( iommu_intpost == 1 ) --> if ( iommu_intpost )
    2. No need to clear the SN and NDM fields in pi_desc_init(), since
       they are already cleared to zero at initialization
    3. Use macro MASK_INSR for (dest << 8) & 0xFF00 in pi_desc_init()

[RFC v2 06/15] vt-d: Extend struct iremap_entry to support VT-d Posted-Interrupts
    Extend 'struct iremap_entry' to a more common format.
    1. Hide bit manipulation of IRTE inside a static inline function.
    2. Define a new macro PDA_MASK to manipulate IRTE
    3. Make the error message more informative.

[RFC v2 08/15] Update IRTE according to guest interrupt config changes
    1. Check the result when dest_vcpu_array is allocated.
    2. Use interrupt remapping when we encounter failures during
       interrupts setup/update for PI.

[RFC v2 10/15] vmx: Define two per-cpu variables
    1. block_vcpu_on_cpu --> blocked_vcpu.
    2. blocked_vcpu_lock_on_cpu --> blocked_vcpu_lock.

[RFC v2 11/15] vmx: Add a global wake-up vector for VT-d Posted-Interrupts
    Adjust the initialization of vmx_function_table.pi_desc_update

[RFC v2 12/15] vmx: Properly handle notification event when vCPU is running
    Add a detailed description of the scenario in which the changes in
    this patch are used.

[RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling
    Properly remove the vCPU from the blocked list when the 'ON' field is
    set while the vCPU is being blocked.

[RFC v2 15/15] Add a command line parameter for VT-d posted-interrupts
    Add the new command line in docs/misc/xen-command-line.markdown.

Thanks,
Feng

> Thanks
> Kevin
Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
Hi Wei,

On 15/05/15 16:31, Wei Liu wrote:
> On Fri, May 15, 2015 at 01:35:42PM +0100, Julien Grall wrote:
>> On 15/05/15 03:35, Wei Liu wrote:
>>> On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote:
>>>> The PV network protocol is using 4KB page granularity. The goal of this
>>>> patch is to allow a Linux using 64KB page granularity working as a
>>>> network backend on a non-modified Xen.
>>>>
>>>> It's only necessary to adapt the ring size and break skb data in small
>>>> chunk of 4KB. The rest of the code is relying on the grant table code.
>>>>
>>>> Although only simple workload is working (dhcp request, ping). If I try
>>>> to use wget in the guest, it will stall until a tcpdump is started on
>>>> the vif interface in DOM0. I wasn't able to find why.
>>>
>>> I think in wget workload you're more likely to break down 64K pages to
>>> 4K pages. Some of your calculation of mfn, offset might be wrong.
>>
>> If so, why tcpdump on the vif interface would make wget suddenly
>> working? Does it make netback use a different path?
>
> No, but it might make core network component behave differently, this is
> only my suspicion.
>
> Do you see malformed packets with tcpdump?

I don't see any malformed packets with tcpdump. The connection is stalling
until tcpdump is started on the vif in dom0.

>>>> I have not modified XEN_NETBK_RX_SLOTS_MAX because I wasn't sure what
>>>> it's used for (I have limited knowledge on the network driver).
>>>
>>> This is the maximum slots a guest packet can use. AIUI the protocol
>>> still works on 4K granularity (you break 64K page to a bunch of 4K
>>> pages), you don't need to change this.
>>
>> 1 slot = 1 grant right? If so, XEN_NETBK_RX_SLOTS_MAX is based on the
>> number of Linux page. So we would have to get the number for Xen page.
>
> Yes, 1 slot = 1 grant. I see what you're up to now. Yes, you need to
> change this constant to match underlying HV page.
>
>> Although, I gave a try to multiply by XEN_PFN_PER_PAGE (64KB/4KB = 16)
>> but it gets stuck in the loop.
>
> I don't follow. What is the new #define? Which loop does it get stuck?

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 0eda6e9..c2a5402 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -204,7 +204,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
 /* Maximum number of Rx slots a to-guest packet may use, including the
  * slot needed for GSO meta-data.
  */
-#define XEN_NETBK_RX_SLOTS_MAX (MAX_SKB_FRAGS + 1)
+#define XEN_NETBK_RX_SLOTS_MAX ((MAX_SKB_FRAGS + 1) * XEN_PFN_PER_PAGE)

 enum state_bit_shift {
        /* This bit marks that the vif is connected */

The function xenvif_wait_for_rx_work never returns. I guess it's because
there are not enough slots available.

For 64KB page granularity we ask for 16 times more slots than 4KB page
granularity. Although, it's very unlikely that all the slots will be used.

FWIW I pointed out the same problem on blkfront.

>>>>  			queue->tx_copy_ops[*copy_ops].dest.domid = DOMID_SELF;
>>>>  			queue->tx_copy_ops[*copy_ops].dest.offset =
>>>> -				offset_in_page(skb->data);
>>>> +				offset_in_page(skb->data) & ~XEN_PAGE_MASK;
>>>>
>>>>  			queue->tx_copy_ops[*copy_ops].len = data_len;
>>>>  			queue->tx_copy_ops[*copy_ops].flags = GNTCOPY_source_gref;
>>>> @@ -1366,8 +1367,8 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
>>>
>>> This function is to coalesce frag_list to a new SKB. It's completely
>>> fine to use the natural granularity of backend domain. The way you
>>> modified it can lead to waste of memory, i.e. you only use first 4K of
>>> a 64K page.
>>
>> Thanks for explaining. I wasn't sure how the function works so I change
>> it for safety. I will redo the change.
>>
>> FWIW, I'm sure there is other place in netback where we waste memory
>> with 64KB page granularity (such as grant table). I need to track them.
>>
>> Let me know if you have some place in mind where the memory usage can be
>> improved.
>
> I was about to say the mmap_pages array is an array of pages. But that
> probably belongs to grant table driver.

Yes, there is a lot of rework in the grant table driver in order to avoid
wasting memory.

Regards,

--
Julien Grall
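Since much of this thread is about breaking 64KB kernel pages into 4KB Xen-sized pieces, here is a small standalone sketch of that arithmetic. The XEN_PAGE_* names are borrowed from the quoted patch, but their definitions here, the struct and the function are assumptions made for illustration and are not the netback code. Each chunk stays inside one 4 KiB Xen page, which corresponds to one grant and hence one slot.

```c
#include <assert.h>
#include <stddef.h>

#define XEN_PAGE_SHIFT 12
#define XEN_PAGE_SIZE  (1ul << XEN_PAGE_SHIFT)
#define XEN_PAGE_MASK  (~(XEN_PAGE_SIZE - 1))

/* One piece of a buffer that fits inside a single 4 KiB Xen page. */
struct chunk {
    unsigned long xen_page_index; /* which 4 KiB page within the kernel page */
    unsigned long offset;         /* offset inside that 4 KiB page */
    unsigned long len;            /* number of bytes in this piece */
};

/*
 * Split len bytes starting at byte `offset` of a (possibly 64 KiB) kernel
 * page into pieces that never cross a 4 KiB boundary.
 * Returns the number of chunks written to out (at most max).
 */
static size_t split_into_xen_pages(unsigned long offset, unsigned long len,
                                   struct chunk *out, size_t max)
{
    size_t n = 0;

    while (len > 0) {
        unsigned long off  = offset & ~XEN_PAGE_MASK; /* as in the patch */
        unsigned long room = XEN_PAGE_SIZE - off;
        unsigned long take = len < room ? len : room;

        assert(n < max);
        out[n].xen_page_index = offset >> XEN_PAGE_SHIFT;
        out[n].offset = off;
        out[n].len = take;
        n++;

        offset += take;
        len -= take;
    }

    return n;
}
```

A full 64 KiB page splits into 16 chunks, which is where the factor of 16 (XEN_PFN_PER_PAGE on a 64KB kernel) in the slot count comes from.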
[Xen-devel] [rumpuserxen test] 56663: regressions - FAIL
flight 56663 rumpuserxen real [real]
http://logs.test-lab.xenproject.org/osstest/logs/56663/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-rumpuserxen       5 rumpuserxen-build         fail REGR. vs. 33866
 build-i386-rumpuserxen        5 rumpuserxen-build         fail REGR. vs. 33866

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386    1 build-check(1)      blocked n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)      blocked n/a

version targeted for testing:
 rumpuserxen          3b91e44996ea6ae1276bce1cc44f38701c53ee6f
baseline version:
 rumpuserxen          30d72f3fc5e35cd53afd82c8179cc0e0b11146ad

People who touched revisions under test:
  Antti Kantee po...@iki.fi
  Ian Jackson ian.jack...@eu.citrix.com
  Martin Lucina mar...@lucina.net
  Wei Liu l...@liuw.name

jobs:
 build-amd64-xsm                                              pass
 build-i386-xsm                                               pass
 build-amd64                                                  pass
 build-i386                                                   pass
 build-amd64-pvops                                            pass
 build-i386-pvops                                             pass
 build-amd64-rumpuserxen                                      fail
 build-i386-rumpuserxen                                       fail
 test-amd64-amd64-rumpuserxen-amd64                           blocked
 test-amd64-i386-rumpuserxen-i386                             blocked

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/osstest/pub/logs
images: /home/osstest/pub/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Not pushing.

(No revision log; it would be 535 lines long.)
[Xen-devel] [PATCH 0/3] x86: mwait-idle sync with recent Linux
1: support additional Broadwell model
2: update support for Silvermont Core in Baytrail SOC
3: add support for the Airmont Core in the Cherrytrail and Braswell SOCs

Signed-off-by: Jan Beulich jbeul...@suse.com
Re: [Xen-devel] [RFC v2] xSplice design
On Mon, May 18, 2015 at 08:54:22PM +0800, Liuqiming (John) wrote:
Hi Konrad, Will this design include a hotpatch build tool chain? For example, how are these .xplice_ sections created? How are Xen symbols handled when creating the hotpatch ELF file?

No, this is not a main goal of this project. However, it was agreed that there will be a simple tool showing how to build patches. Additionally, it will give us a chance to verify the correctness of the design and of the code in the hypervisor itself.

Daniel
Re: [Xen-devel] [PATCH v4] OSSTEST: introduce a raisin build test
On 05/18/2015 02:14 PM, Ian Campbell wrote:
That solves the most general case; but it sounds like you care mostly about the very specific case of dealing with components that depend on the current output of xen.git. Starting simple may be fine.

Currently we only have ts-*-build things which depend on the output of ts-xen-build (in fact, we only have ts-libvirt-build). I'm not sure if there will be others in the future, I suppose ts-rump{qemu,xenstore,foo}-build -> ts-rumpkernel-build -> ts-xen-build might eventually be a possibility...

I guess I was assuming that at some point you might have the following builds and dependencies (not sure these are all correct):

 ts-seabios-build: [none]
 ts-qemut-build: [none]
 ts-qemuu-build: ts-seabios-build
 ts-xen-build: ts-qemut-build ts-qemuu-build
 ts-libvirt-build: ts-xen-build
 &c

I'm not arguing for this, I'm just trying to explain the problem I was initially trying to solve. :-)

But as we don't have any tests for seabios and qemu* in isolation, I guess it doesn't really make sense to treat them separately.

-George
Re: [Xen-devel] [OSSTEST Nested PATCH v10 3/9] Refactor installation of overlays
On Thu, 2015-05-14 at 11:59 +0100, Ian Campbell wrote: On Wed, 2015-05-13 at 11:36 +0800, longtao.pang wrote: Based on Ian Campbell's v6_patch [04,05,06], I create this patch to refactor installation of overlays for guest as well as host used. Link of Ian Campbell's patch: http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg00467.html http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg00452.html http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg00459.html FYI I've just pushed these to osstest's pretest branch where they will be tested and appear in production hopefully tomorrow if all goes well. These have passed into the production osstest branch now. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4] OSSTEST: introduce a raisin build test
On Mon, 2015-05-18 at 14:23 +0100, George Dunlap wrote:
On 05/18/2015 02:14 PM, Ian Campbell wrote:
That solves the most general case; but it sounds like you care mostly about the very specific case of dealing with components that depend on the current output of xen.git. Starting simple may be fine.

Currently we only have ts-*-build things which depend on the output of ts-xen-build (in fact, we only have ts-libvirt-build). I'm not sure if there will be others in the future, I suppose ts-rump{qemu,xenstore,foo}-build -> ts-rumpkernel-build -> ts-xen-build might eventually be a possibility...

I guess I was assuming that at some point you might have the following builds and dependencies (not sure these are all correct):

 ts-seabios-build: [none]
 ts-qemut-build: [none]
 ts-qemuu-build: ts-seabios-build
 ts-xen-build: ts-qemut-build ts-qemuu-build
 ts-libvirt-build: ts-xen-build
 &c

NB: ts-xen-build depends on ts-seabios-build directly, not via ts-qemu?-build, since the BIOS becomes part of hvmloader, not qemu.

I think ts-xen-build -> ts-qemu?-build is a little circular today in xen.git (qemu uses libxenctrl), but building ts-xen-build first (i.e. reversing the deps you have there) is how I would solve that (and I expect raisin did so).

I'm not arguing for this, I'm just trying to explain the problem I was initially trying to solve. :-)

But as we don't have any tests for seabios and qemu* in isolation, I guess it doesn't really make sense to treat them separately.

It could very well eventually make sense to split up things which used to be part of the monolithic xen.git build into separate builds. qemuu would be an ideal candidate, for example.

Ian.
Re: [Xen-devel] [PATCH] [RFC] x86/domctl: Fix getpageframeinfo* handling.
On 18.05.15 at 15:24, andrew.coop...@citrix.com wrote:
On 18/05/15 12:43, Jan Beulich wrote:
On 18.05.15 at 12:59, andrew.coop...@citrix.com wrote:

+        if ( unlikely(num > 1024) ||
+             unlikely(num != domctl->u.getpageframeinfo3.num) )
+        {
+            ret = -E2BIG;
+            break;
+        }
+
+        for ( i = 0; i < num; ++i )
+        {
+            unsigned long gfn = 0, type = 0;

gfn's initializer looks pointless (and if anything it should be INVALID_MFN or some such).

It must absolutely be 0 for when we read a 32-bit value into it,

Ah, of course!

although I realise I do need to extend ~0U to ~0UL for compat guests.

Why is ~0 special here? It's not even valid input afaict, and hence there's no point in massaging it in any way.

+            struct page_info *page;
+            p2m_type_t t;
+
+            if ( raw_copy_from_guest(&gfn, guest_handle + (i * width), width) )
+            {
+                ret = -EFAULT;
+                break;
+            }
+
+            page = get_page_from_gfn(d, gfn, &t, P2M_ALLOC);
+
+            if ( unlikely(!page) ||
+                 unlikely(is_xen_heap_page(page)) )
+            {
+                if ( p2m_is_broken(t) )
+                    type = XEN_DOMCTL_PFINFO_BROKEN;
+                else
+                    type = XEN_DOMCTL_PFINFO_XTAB;

Realizing that this was this way in the old code too, would you nevertheless mind adding unlikely() and/or flipping the if and else branches?

Certainly. Does this mean that you are happy in principle with the raw_* use?

I'm not particularly happy about it, but for the purpose of code consolidation I think it is acceptable here.

Jan
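The point about gfn's initializer can be shown with a small stand-alone sketch. It is not the Xen code: read_gfn() is a hypothetical helper, memcpy() stands in for raw_copy_from_guest(), uint64_t stands in for unsigned long on a 64-bit build, and a little-endian host (as on x86) is assumed.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read entry 'index' from an array of frame numbers
 * that are 'width' bytes wide (4 for a compat guest, 8 for a native one).
 */
uint64_t read_gfn(const void *guest_buf, size_t index, size_t width)
{
    /*
     * Must start at zero: a 4-byte copy only overwrites the low half
     * (on little-endian), so the upper half keeps its previous value.
     */
    uint64_t gfn = 0;

    memcpy(&gfn, (const unsigned char *)guest_buf + index * width, width);
    return gfn;
}
```

Note that a compat guest passing ~0U ends up as 0x00000000ffffffff in the 64-bit variable, not as ~0UL, which is the widening question raised in the thread.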
[Xen-devel] Xen dom0 crash
Hi,

Roger asked me to send this bug report to xen-devel. I'm trying to bring up a Xen dom0 on a Fujitsu RX308, but it crashes when trying to start the kernel. Any ideas?

Lars

[FreeBSD loader screen: ASCII-art logo and "Welcome to FreeBSD" menu, garbled in the capture. The loader messages below are partly overwritten by it.]
/boot/kernel/cc_vegas.ko size 0x30d0 at 0x1068000
Booting...
[...]el/cc_hd.ko size 0x2c00 at 0x106200
[...]tcp.ko size 0x2f90 at 0x1065000
 Xen 4.6-unstable
(XEN) Xen version 4.6-unstable (r...@netapp.com) (gcc47 (FreeBSD Ports Collection) 4.7.4) debug=y Mon May 18 14:50:17 CEST 2015
(XEN) Latest ChangeSet:
(XEN) Bootloader: FreeBSD Loader
(XEN) Command line: dom0_mem=4096M dom0pvh=1 com1=115200,8n1 console=com1.
(XEN) Video information:
(XEN) VGA is text mode 80x25, font 8x16
(XEN) VBE/DDC methods: none; EDID transfer time: 0 seconds
(XEN) EDID info not retrieved because no DDC retrieval method detected
(XEN) Disc information:
(XEN) Found 0 MBR signatures
(XEN) Found 0 EDD information structures
(XEN) Xen-e820 RAM map:
[loader residue: +0x50db0]
(XEN) - 0008b800 (usable)
(XEN) 0008b800 - 000a (reserved)
(XEN) 000e - 0010 (reserved)
(XEN) 0010 - 7cb09000 (usable)
(XEN) 7cb09000 - 7cb39000 (reserved)
(XEN) 7cb39000 - 7cc52000 (ACPI data)
(XEN) 7cc52000 - 7d4b (ACPI NVS)
(XEN) 7d4b - 7eafb000 (reserved)
(XEN) 7eafb000 - 7eafc000 (usable)
(XEN) 7eafc000 - 7eb82000 (ACPI NVS)
(XEN) 7eb82000 - 7f00 (usable)
(XEN) 8000 - 9000 (reserved)
(XEN) fed1c000 - fed2 (reserved)
(XEN) ff00 - 0001 (reserved)
(XEN) 0001 - 00208000 (usable)
(XEN) ACPI: RSDP 000F04A0, 0024 (r2 FTS )
(XEN) ACPI: XSDT 7CB71090, 00A4 (r1 FTSD2939-B1 1072009 AMI 10013)
(XEN) ACPI: FACP 7CB7FAA8, 010C (r5 FTSD2939-B1 1072009 AMI 10013)
(XEN) ACPI: DSDT 7CB711C8, E8DD (r2 FTSD2939-B1 114 INTL 20091112)
(XEN) ACPI: FACS 7D4A7080, 0040
(XEN) ACPI: APIC 7CB7FBB8, 0294 (r3 FTSD2939-B1 1072009 AMI 10013)
(XEN) ACPI: FPDT 7CB7FE50, 0044 (r1 FTSD2939-B1 1072009 AMI 10013)
(XEN) ACPI: MCFG 7CB7FE98, 003C (r1 FTSOEMMCFG. 1072009 MSFT 97)
(XEN) ACPI: SRAT 7CB7FED8, 0530 (r1 A M I AMI SRAT1 AMI.0)
(XEN) ACPI: SLIT 7CB80408, 0030 (r1 A M I AMI SLIT0 AMI.0)
(XEN) ACPI: HPET 7CB80438, 0038 (r1 FTSD2939-B1 1072009 AMI.5)
(XEN) ACPI: PRAD 7CB80470, 00BE (r2 PRADID PRADTID1 MSFT 301)
(XEN) ACPI: SPMI 7CB80530, 0040 (r5 A M I OEMSPMI0 AMI.0)
(XEN) ACPI: SSDT 7CB80570, D0CB0 (r2 INTELCpuPm 4000 INTL 20051117)
(XEN) ACPI: SPCR 7CC51220, 0050 (r1 A M I APTIO4 1072009 AMI.5)
(XEN) ACPI: EINJ 7CC51270, 0130 (r1AMI AMI EINJ0 0)
(XEN) ACPI: ERST 7CC513A0, 0230 (r1 AMIER AMI ERST0 0)
(XEN) ACPI: HEST 7CC515D0, 00A8 (r1AMI AMI HEST0 0)
(XEN) ACPI: BERT 7CC51678, 0030 (r1AMI AMI BERT0 0)
(XEN) ACPI: DMAR 7CC516A8, 0110 (r1 A M I OEMDMAR1 INTL1)
(XEN) System RAM: 131023MB (134167628kB)
(XEN) SRAT: PXM 0 -> APIC 0 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 1 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 2 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 3 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 4 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 5 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 6 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 7 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 8 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 9 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 16 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 17 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 18 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 19 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 20 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 21 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 22 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 23 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 24 -> Node 0
(XEN) SRAT: PXM 0 -> APIC 25 -> Node 0
(XEN) SRAT: PXM 1 -> APIC 32 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 33 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 34 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 35 -> Node 1
(XEN) SRAT: PXM 1 -> APIC 36 -> Node 1
(XEN)
Re: [Xen-devel] [PATCH v2 05/41] acpi : add helper function for mapping memory
On 18.05.15 at 15:26, julien.gr...@citrix.com wrote:
On 17/05/15 21:03, Parth Dixit wrote:

--- /dev/null
+++ b/xen/arch/arm/acpi/lib.c
@@ -0,0 +1,8 @@
+#include <xen/acpi.h>
+#include <asm/mm.h>
+
+void __iomem *
+acpi_os_map_iomem(acpi_physical_address phys, acpi_size size)
+{
+    return __va(phys);
+}

I would have preferred two distinct patches: one for the refactoring of acpi_os_map_memory and the other for implementing the ARM part, explaining why only __va is used.

+1

Although, given that acpi_os_map_memory only calls acpi_os_map_iomem, I would move acpi_os_map_memory per-architecture. FWIW, it's what Linux does.

In which Linux version did you see this to be the case? Certainly not 4.1-rc or anything recent... What I'm trying to get at is that we should please be very careful with deviating from the ACPI CA derived naming as well as with what is to live in os.c.

Jan
Re: [Xen-devel] [PATCH V3 1/6] libxl: export some functions for pvusb use
On Sun, Apr 19, 2015 at 11:50:47AM +0800, Chunyan Liu wrote:
Signed-off-by: Chunyan Liu cy...@suse.com
Signed-off-by: Simon Cao caobosi...@gmail.com

On the basis that this can help reduce the length of libxl.c and improve maintainability by moving stuff out of libxl.c to dedicated files.

Acked-by: Wei Liu wei.l...@citrix.com

---
 tools/libxl/libxl.c | 6 +++---
 tools/libxl/libxl_internal.h | 5 +
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 511eef1..b05d18b 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1963,7 +1963,7 @@ out:
 /**/

 /* generic callback for devices that only need to set ao_complete */
-static void device_addrm_aocomplete(libxl__egc *egc, libxl__ao_device *aodev)
+void device_addrm_aocomplete(libxl__egc *egc, libxl__ao_device *aodev)
 {
     STATE_AO_GC(aodev->ao);

@@ -1986,7 +1986,7 @@ out:
 }

 /* common function to get next device id */
-static int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device)
+int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device)
 {
     char *dompath, **l;
     unsigned int nb;
@@ -2005,7 +2005,7 @@ static int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device)
     return nextid;
 }

-static int libxl__resolve_domid(libxl__gc *gc, const char *name,
+int libxl__resolve_domid(libxl__gc *gc, const char *name,
                          uint32_t *domid)
 {
     if (!name)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 9c22309..42eb1b9 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1080,6 +1080,9 @@ _hidden int libxl__init_console_from_channel(libxl__gc *gc,
                                      libxl__device_console *console,
                                      int dev_num,
                                      libxl_device_channel *channel);
+_hidden int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device);
+_hidden int libxl__resolve_domid(libxl__gc *gc, const char *name,
+                                 uint32_t *domid);

 /*
  * For each aggregate type which can be used as an input we provide:
@@ -2208,6 +2211,8 @@ struct libxl__ao_device {
 /* Starts preparing to add/remove a bunch of devices. */
 _hidden void libxl__multidev_begin(libxl__ao *ao, libxl__multidev*);

+/* generic callback for devices that only need to set ao_complete */
+_hidden void device_addrm_aocomplete(libxl__egc *egc, libxl__ao_device *aodev);
 /* Prepares to add/remove one of many devices.
  * Calls libxl__prepare_ao_device on libxl__ao_device argument provided and
--
1.8.5.2
Re: [Xen-devel] [RFC v2 00/15] Add VT-d Posted-Interrupts support
On 18.05.15 at 12:22, feng...@intel.com wrote:
-----Original Message-----
From: Tian, Kevin
Sent: Monday, May 18, 2015 1:33 PM
To: Wu, Feng; xen-devel@lists.xen.org
Cc: k...@xen.org; jbeul...@suse.com; andrew.coop...@citrix.com; Zhang, Yang Z; george.dun...@eu.citrix.com
Subject: RE: [RFC v2 00/15] Add VT-d Posted-Interrupts support

From: Wu, Feng
Sent: Friday, May 08, 2015 5:07 PM

VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt. With VT-d Posted-Interrupts enabled, external interrupts from direct-assigned devices can be delivered to guests without VMM intervention when the guest is running in non-root mode.

You can find the VT-d Posted-Interrupts Spec. at the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

This patch set follows this design:
http://article.gmane.org/gmane.comp.emulators.xen.devel/236476

v1 -> v2
1. Add the design doc.
2. Coding style fix.
3. Add some comments for struct pi_desc.
4. Extend 'struct iremap_entry' to a more common format.
5. Delete the atomic helper functions for pi descriptor manipulation.
6. Add the new command line in docs/misc/xen-command-line.markdown.
7. Use macros to replace some magic numbers.

Though generally this version looks good to me, it'd be clearer if you could give v1->v2 information per patch to help review. :-)

Good suggestion, Kevin! And here it is:

That's better, but (for the future) still not the format we'd like it to be in: These notes should go into the individual patches, after the first delimiting --- marker.

Jan

[RFC v2 01/15] Vt-d Posted-intterrupt (PI) design
 Add the design doc.

[RFC v2 04/15] vmx: Extend struct pi_desc to support VT-d Posted-Interrupts
 Add some comments for struct pi_desc.

[RFC v2 05/15] vmx: Initialize VT-d Posted-Interrupts Descriptor
 1. if ( iommu_intpost == 1 ) --> if ( iommu_intpost )
 2. No need to clear the SN and NDM fields in pi_desc_init(), since they are already zero when initialized.
 3. Use macro MASK_INSR for (dest << 8) & 0xFF00 in pi_desc_init()

[RFC v2 06/15] vt-d: Extend struct iremap_entry to support VT-d Posted-Interrupts
 Extend 'struct iremap_entry' to a more common format.
 1. Hide bit manipulation of IRTE inside a static inline function.
 2. Define a new macro PDA_MASK to manipulate IRTE
 3. Make the error message more informative.

[RFC v2 08/15] Update IRTE according to guest interrupt config changes
 1. Check the result when dest_vcpu_array is allocated.
 2. Use interrupt remapping when we encounter failures during interrupts setup/update for PI.

[RFC v2 10/15] vmx: Define two per-cpu variables
 1. block_vcpu_on_cpu --> blocked_vcpu.
 2. blocked_vcpu_lock _on_cpu --> blocked_vcpu_lock.

[RFC v2 11/15] vmx: Add a global wake-up vector for VT-d Posted-Interrupts
 Adjust the initialization of vmx_function_table.pi_desc_update

[RFC v2 12/15] vmx: Properly handle notification event when vCPU is running
 Add a detailed description of the scenario in which the changes in this patch are used.

[RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling
 Properly remove the vcpu from the blocked list when the 'ON' field is set while the vCPU is being blocked.

[RFC v2 15/15] Add a command line parameter for VT-d posted-interrupts
 Add the new command line in docs/misc/xen-command-line.markdown.

Thanks,
Feng

Thanks
Kevin
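To illustrate the MASK_INSR item in the notes for patch 05/15, here is a small stand-alone sketch. The macro is written from memory of how such a helper is commonly defined in Xen, so treat its exact form as an assumption; the two wrapper functions are hypothetical and exist only to compare the open-coded expression with the macro.

```c
#include <stdint.h>

/*
 * Insert value v into the bit field selected by mask m.
 * (m & -m) isolates the lowest set bit of the mask, so multiplying by it
 * moves v up to the field's position; the final AND trims any overflow.
 * Assumed definition, reproduced here only for illustration.
 */
#define MASK_INSR(v, m) (((v) * ((m) & -(m))) & (m))

/* Hypothetical wrapper: the open-coded form mentioned in the notes. */
uint32_t dest_field_open_coded(uint32_t dest)
{
    return (dest << 8) & 0xFF00u;
}

/* Hypothetical wrapper: the same field computed through the macro. */
uint32_t dest_field_with_macro(uint32_t dest)
{
    return MASK_INSR(dest, 0xFF00u);
}
```

The advantage of the macro form is that the shift amount is derived from the mask, so the two magic numbers cannot drift apart.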
Re: [Xen-devel] [PATCH] libxl: Initialize valid_devs in output_topologyinfo()
On Fri, May 15, 2015 at 11:06:57AM -0400, Boris Ostrovsky wrote:
Commit e78e8b9bb649 (libxl: Add interface for querying hypervisor about PCI topology) neglected to initialize valid_devs. This may result in not printing a message to console if no IO topology information is available and, more importantly, may break non-debug builds on some versions of gcc.

Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com
Reported-by: Olaf Hering o...@aepfle.de

Acked-by: Wei Liu wei.l...@citrix.com

---
 tools/libxl/xl_cmdimpl.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 373aa37..6d60ce4 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -5423,7 +5423,7 @@ static void output_topologyinfo(void)
     libxl_cputopology *cpuinfo;
     int i, nr;
     libxl_pcitopology *pciinfo;
-    int valid_devs;
+    int valid_devs = 0;

     cpuinfo = libxl_get_cpu_topology(ctx, &nr);

--
1.7.1
Re: [Xen-devel] [PATCH] xen/arm: gic-hip04: Resync the driver with the GICv2
On 15/05/15 22:08, Julien Grall wrote:
Hi Zoltan,

On 07/05/2015 13:37, Zoltan Kiss wrote:
On 07/05/15 10:32, Ian Campbell wrote:
On Thu, 2015-05-07 at 09:52 +0100, Zoltan Kiss wrote:
Looks good at first glance, let me try it on a board.

On 06/05/15 19:52, Julien Grall wrote:
[...]
I'm concerned to see a new driver (pushed last March) already orphaned. Does Huawei still plan to maintain this driver?

I share Julien's concerns here. It would be good if those listed in the MAINTAINERS file for this device would respond reasonably promptly to mails such as [1] and try to keep on top of things or to find a {replacement /co-}maintainer who can do so.

As I said, I've missed that thread entirely (not that hard given the traffic of the list), but now I've improved my mail filters to make sure I don't miss a mail where my Huawei address is on the Cc. We are also looking to add new co-maintainers, because nowadays I'm working on other projects. I need a few days to test the patch as my Xen test environment is not available right now.

Ping? Were you able to test it?

One of my colleagues is doing that, to spread experience. It's obviously going slower for him to set up an environment, but we are working on that.

Zoli

Regards,
Re: [Xen-devel] [PATCH v7 04/14] x86: maintain COS to CBM mapping for each socket
On 08.05.15 at 10:56, chao.p.p...@linux.intel.com wrote:

@@ -237,6 +243,14 @@ static void cat_cpu_init(void)
     info->cbm_len = (eax & 0x1f) + 1;
     info->cos_max = min(opt_cos_max, edx & 0x);

+    info->cos_to_cbm = xzalloc_array(struct psr_cat_cbm,
+                                     info->cos_max + 1UL);

How does the array dimension here match the "for each socket" in the title?

Jan
Re: [Xen-devel] [PATCH V3 3/6] libxl: add pvusb API
On Sun, Apr 19, Chunyan Liu wrote:

+static int libxl__usbctrl_add_xenstore(libxl__gc *gc, uint32_t domid,
+                                       libxl_device_usbctrl *usbctrl)
+{
+    flexarray_append_pair(back, state, 1);
+    flexarray_append_pair(front, state, 1);

This (and perhaps other places) should be converted to xenbus_state, see commit 25519c75b7e05fd82d7f2959aaa85518b5564cc3 (libxl: convert strings and ints to xenbus_state).

Olaf
[Xen-devel] [PATCH 0/4] x86: don't default to executable mappings
Particularly for the 1:1 mapping it was pointed out that in order to limit the damage from security issues we should avoid mapping things executable when they don't need to be.

1: move syscall trampolines off the stack
2: emul: move stubs off the stack
3: move I/O emulation stubs off the stack
4: switch default mapping attributes to non-executable

Signed-off-by: Jan Beulich jbeul...@suse.com
Re: [Xen-devel] [RFC 2/7] linux-stubdomain: Compile Linux
On Thu, May 14, 2015 at 10:28 AM, Ian Campbell ian.campb...@citrix.com wrote:
On Thu, 2015-05-14 at 10:08 +0100, George Dunlap wrote:
On Fri, Feb 6, 2015 at 5:45 PM, Stefano Stabellini stefano.stabell...@eu.citrix.com wrote:

+LINUX_URL=ftp://ftp.kernel.org/pub/linux/kernel/v3.x/$(LINUX_V).tar.xz
+
+all: $(VMLINUZ)

I think it is best if we git clone it.

Is that still true if unpatched 3.18.6 works? I don't know if there is a desire to reduce load on kernel.org, for example.

That's a good point. I think git clone would be more in line with any other external project that we use. However I'll let the other maintainers decide on this.

It takes a *lng* time to download a full Linux git tree, and it takes up a huge amount of disk space. It would be a lot more convenient to be able to just download a tarball.

git clone --depth=<some small N> to create a shallow clone?

So my unscientific poll (more details below):

1. Download tarball: 80MB, 111s (2m)
2. git clone --depth 1: 138MiB, 392s. (~6m30s)

I'd just go for the tarball, but I'll leave that decision up to you guys.

-George

$ time wget ftp://ftp.kernel.org/pub/linux/kernel/v3.x/linux-3.18.8.tar.xz
--2015-05-18 11:12:05-- ftp://ftp.kernel.org/pub/linux/kernel/v3.x/linux-3.18.8.tar.xz
 => ‘linux-3.18.8.tar.xz’
Resolving ftp.kernel.org (ftp.kernel.org)... 199.204.44.194, 149.20.4.69, 198.145.20.140
Connecting to ftp.kernel.org (ftp.kernel.org)|199.204.44.194|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /pub/linux/kernel/v3.x ... done.
==> SIZE linux-3.18.8.tar.xz ... 80954240
==> PASV ... done.    ==> RETR linux-3.18.8.tar.xz ... done.
Length: 80954240 (77M) (unauthoritative)

100%[=] 80,954,240 723KB/s in 1m 51s

2015-05-18 11:13:57 (711 KB/s) - ‘linux-3.18.8.tar.xz’ saved [80954240]

real 1m52.548s
user 0m0.318s
sys 0m1.894s

$ time git clone --depth 1 --branch v3.18 git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Cloning into 'linux'...
remote: Counting objects: 50728, done.
remote: Compressing objects: 100% (48908/48908), done.
remote: Total 50728 (delta 3388), reused 21833 (delta 1202)
Receiving objects: 100% (50728/50728), 138.44 MiB | 269.00 KiB/s, done.
Resolving deltas: 100% (3388/3388), done.
Checking connectivity... done.
Note: checking out 'b2776bf7149bddd1f4161f14f79520f17fc1d71d'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

Checking out files: 100% (47986/47986), done.

real 6m32.968s
user 0m11.522s
sys 0m4.815s

Ian.
Re: [Xen-devel] [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest
Hi David,

On 15/05/15 16:45, David Vrabel wrote:
On 14/05/15 18:00, Julien Grall wrote:
Hi all,

ARM64 Linux supports both 4KB and 64KB page granularity. However, the Xen hypercall interface and PV protocol are always based on 4KB page granularity. Any attempt to boot a Linux guest with 64KB pages enabled will result in a guest crash.

This series is a first attempt to allow such Linux guests to run with the current hypercall interface and PV protocol.

This solution has been chosen because we want to run Linux with 64KB pages on released Xen ARM versions and/or on platforms using an old version of Linux as DOM0.

The key problem I see with this approach is the confusion between guest page size and Xen page size. This is going to be particularly problematic since the majority of development/usage will remain on x86 where PAGE_SIZE == XEN_PAGE_SIZE.

I think it would be nice to keep XEN_PAGE_SIZE etc out of front and backend drivers. Perhaps with a suitable set of helper functions?

Even with the helpers, we are not protected from any change in the frontend/backend that will impact 64K. It won't be possible to remove all the XEN_PAGE_* usage (there are lots of places where adding helpers would not be possible) and we would still have to carefully review any changes.

I think it may be possible to move the grant table splitting into helpers, which would be helpful to support different grant sizes. However, it would require a large amount of work, at least in blkfront.

Regards,
--
Julien Grall
Re: [Xen-devel] [PATCH v4] OSSTEST: introduce a raisin build test
On 05/18/2015 12:21 PM, Ian Campbell wrote: On Mon, 2015-05-18 at 11:54 +0100, George Dunlap wrote: On 05/18/2015 11:33 AM, Ian Campbell wrote: On Mon, 2015-05-18 at 11:08 +0100, George Dunlap wrote: On Wed, May 13, 2015 at 12:48 PM, Stefano Stabellini stefano.stabell...@eu.citrix.com wrote: On Wed, 13 May 2015, Ian Campbell wrote: On Tue, 2015-05-12 at 12:46 +0100, Stefano Stabellini wrote: Would a separate clone of the same raisin version with some sort of dist directory transported over be sufficient and supportable? Or are raisin's outputs not in one place and easily transportable? i.e. today build-$ARCH-libvirt picks up the dist.tar.gz files from the corresponding build-$ARCH, unpacks them and asks libvirt to build against that tree. Moving the dist directory over should work, although I have never tested this configuration. Would you be willing to support this as a requirement going forward? Yeah, I think it is OK I assume that it is not also necessary to reclone all the trees for the preexisting components, just the new ones? Only if the user asks for a components to be built, the corresponding tree is cloned. Won't the problem here be disentangling the stuff installed in dist/ (or whatever it's called) from the things we want to rebuild vs the things we want to change? From the osstest PoV at least the proposal here only involves building additional things, not rebuilding anything which came from a previous build. e.g. given a build of xen.git now do a build of libvirt.git using those previously built Xen libs. Sure; but what I'm saying is if you do xen-full-build, you'll have a dist/ which contains: * qemut * qemuu * seabios * xen * libvirt * (c) But when you re-build just libvirt, what you want is a dist/ that contains: * qemut * qemuu * seabios * xen Specifically, you want it *not* to contain anything from the previous libvirt builds. That's what I'm talking about. That's not what I was talking about ;-). 
WRT the osstest usage the first build wouldn't be a full build, and in particular it would exclude the libvirt. I appreciate there may be reasons to care about the scenario you presented, but right now I'm trying to figure out how we can best integrate raisin into osstest and whether it can some how be made suitable for building actual build artefacts for osstest to use for further testing, as opposed to just existing as a test case for the sole purpose of testing that raisin works. That's what I'm trying to solve too. :-) So it sounds like we have different ideas about what osstest needs. I had assumed that since osstest separately tests the various qemu trees, seabios, c, that what we had envisioned for raisin was something similar: that the xen.git raisin test would use pre-built qemuu, qemut, and seabios components and only build what was in xen.git. It sounds like you're saying just to have a base Xen build that would build what currently comes out of xen.git, and then base our other components on top of that. In which case what you described would probably work just fine. Per component dist dirs is similarly surely possible but perhaps not something raisin wants. You could in theory have per-component output directories, and then a global input directory which was blown away at the beginning of every raisin build and re-constructed as needed. That would be the sort of equivalent of the mock-style RPM build (where the chroot represents the global input). Not sure how well that would work, though. In essence everything builds into dist.$component and then at the end of each component raisin automatically takes that and overlays whatever it contains over some central dist.all which subsequent components actually build against? Perhaps with a mode to seed dist.all from dist.* iff dist.all doesn't exist. Yeah, the basic idea would be when you run build, you rm -rf dist.all. 
Then if $dependency is in $COMPONENTS, then you clone (if necessary) build; if not, you copy from dist.$dependency into dist.all. If $dependency is neither in $COMPONENTS nor in dist.$dependency, you throw an error. That solves the most general case; but it sounds like you care mostly about the very specific case of dealing with components that depend on the current output of xen.git. Starting simple may be fine. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
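The dist.all rule George describes at the end of this message can be written down as a small decision function. This is only a sketch of the rule as stated in the mail, not raisin code (raisin is a set of shell scripts); the enum, the function name and the two boolean inputs are hypothetical.

```c
#include <stdbool.h>

/* What to do for one dependency after dist.all has been wiped. */
enum dep_action {
    DEP_BUILD,  /* in $COMPONENTS: clone (if necessary) and build it   */
    DEP_COPY,   /* not requested, but dist.$dependency exists: copy it */
    DEP_ERROR   /* neither requested nor previously built              */
};

/*
 * in_components: the dependency is listed in $COMPONENTS.
 * have_dist:     a dist.$dependency directory from an earlier build exists.
 */
enum dep_action resolve_dependency(bool in_components, bool have_dist)
{
    if (in_components)
        return DEP_BUILD;
    if (have_dist)
        return DEP_COPY;
    return DEP_ERROR;
}
```

For the osstest case discussed in the thread, building only libvirt against an earlier Xen build would correspond to resolve_dependency(false, true) for xen, i.e. copy dist.xen into dist.all.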
Re: [Xen-devel] [PATCH v21 11/14] x86/VPMU: Handle PMU interrupts for PV(H) guests
On Friday, 08 May 2015, 17:06:11, Boris Ostrovsky wrote:
Add support for handling PMU interrupts for PV(H) guests.

I have only some minor nits below.

Reviewed-by: Dietmar Hahn dietmar.h...@ts.fujitsu.com

VPMU for the interrupted VCPU is unloaded until the guest issues XENPMU_flush hypercall. This allows the guest to access PMU MSR values that are stored in VPMU context which is shared between hypervisor and domain, thus avoiding traps to hypervisor.

Since the interrupt handler may now force VPMU context save (i.e. set VPMU_CONTEXT_SAVE flag) we need to make changes to amd_vpmu_save() which until now expected this flag to be set only when the counters were stopped.

Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com
Acked-by: Daniel De Graaf dgde...@tycho.nsa.gov
---
Changes in v21:
* Copy hypervisor-private VPMU context to shared page during interrupt and copy it back during XENPMU_flush (see also changes to patch 6). Verify user-provided VPMU context before loading it into hypervisor-private one (and then to HW). Specifically, the changes are:
  - Change definitions of save/load ops to take a flag that specifies whether a copy and verification is required and, for the load op, to return error if verification fails.
  - Both load ops: update VPMU_RUNNING flag based on user-provided context, copy VPMU context
  - Both save ops: copy VPMU context
  - core2_vpmu_load(): add core2_vpmu_verify() call to do context verification
  - precompute VPMU context size into ctxt_sz to use in memcpy
  - Return an error in XENPMU_flush (vpmu.c) if verification fails.
* Non-privileged domains should not be provided with physical CPUID in vpmu_do_interrupt(), set it to vcpu_id instead.

 xen/arch/x86/hvm/svm/vpmu.c | 63 +++---
 xen/arch/x86/hvm/vmx/vpmu_core2.c | 87 --
 xen/arch/x86/hvm/vpmu.c | 237 +++---
 xen/include/asm-x86/hvm/vpmu.h | 8 +-
 xen/include/public/arch-x86/pmu.h | 3 +
 xen/include/public/pmu.h | 2 +
 xen/include/xsm/dummy.h | 4 +-
 xen/xsm/flask/hooks.c | 2 +
 8 files changed, 359 insertions(+), 47 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c
index 74d03a5..efe5573 100644
--- a/xen/arch/x86/hvm/svm/vpmu.c
+++ b/xen/arch/x86/hvm/svm/vpmu.c
@@ -45,6 +45,7 @@ static unsigned int __read_mostly num_counters;
 static const u32 __read_mostly *counters;
 static const u32 __read_mostly *ctrls;
 static bool_t __read_mostly k7_counters_mirrored;
+static unsigned long __read_mostly ctxt_sz;

 #define F10H_NUM_COUNTERS 4
 #define F15H_NUM_COUNTERS 6
@@ -188,27 +189,52 @@ static inline void context_load(struct vcpu *v)
     }
 }

-static void amd_vpmu_load(struct vcpu *v)
+static int amd_vpmu_load(struct vcpu *v, bool_t from_guest)
 {
     struct vpmu_struct *vpmu = vcpu_vpmu(v);
-    struct xen_pmu_amd_ctxt *ctxt = vpmu->context;
-    uint64_t *ctrl_regs = vpmu_reg_pointer(ctxt, ctrls);
+    struct xen_pmu_amd_ctxt *ctxt;
+    uint64_t *ctrl_regs;
+    unsigned int i;

     vpmu_reset(vpmu, VPMU_FROZEN);

-    if ( vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) )
+    if ( !from_guest && vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED) )
     {
-        unsigned int i;
+        ctxt = vpmu->context;
+        ctrl_regs = vpmu_reg_pointer(ctxt, ctrls);

         for ( i = 0; i < num_counters; i++ )
             wrmsrl(ctrls[i], ctrl_regs[i]);

-        return;
+        return 0;
+    }
+
+    if ( from_guest )
+    {
+        ASSERT(!is_hvm_vcpu(v));
+
+        ctxt = &vpmu->xenpmu_data->pmu.c.amd;
+        ctrl_regs = vpmu_reg_pointer(ctxt, ctrls);
+        for ( i = 0; i < num_counters; i++ )
+        {
+            if ( is_pmu_enabled(ctrl_regs[i]) )
+            {
+                vpmu_set(vpmu, VPMU_RUNNING);
+                break;
+            }
+        }
+
+        if ( i == num_counters )
+            vpmu_reset(vpmu, VPMU_RUNNING);
+
+        memcpy(vpmu->context, &vpmu->xenpmu_data->pmu.c.amd, ctxt_sz);
     }

     vpmu_set(vpmu, VPMU_CONTEXT_LOADED);

     context_load(v);
+
+    return 0;
 }

 static inline void context_save(struct
vcpu *v) @@ -223,22 +249,17 @@ static inline void context_save(struct vcpu *v) rdmsrl(counters[i], counter_regs[i]); } -static int amd_vpmu_save(struct vcpu *v) +static int amd_vpmu_save(struct vcpu *v, bool_t to_guest) { struct vpmu_struct *vpmu = vcpu_vpmu(v); unsigned int i; -/* - * Stop the counters. If we came here via vpmu_save_force (i.e. - * when VPMU_CONTEXT_SAVE is set) counters are already stopped. - */ +for ( i = 0; i num_counters; i++ ) +wrmsrl(ctrls[i], 0); + if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_SAVE) ) { vpmu_set(vpmu, VPMU_FROZEN); - -for ( i = 0; i num_counters; i++ ) -
Re: [Xen-devel] [PATCH v2 04/41] arm/acpi : add arm specific acpi header file
Hi Parth, On 17/05/15 21:03, Parth Dixit wrote: +#ifndef _ASM_ARM64_ACPI_H +#define _ASM_ARM64_ACPI_H s/_ASM_ARM64_ACPI_H/_ASM_ARM_ACPI_H/ + +#include <xen/init.h> + +#define COMPILER_DEPENDENT_INT64 long long +#define COMPILER_DEPENDENT_UINT64 unsigned long long + +extern bool_t acpi_disabled; +/* Basic configuration for ACPI */ +static inline void disable_acpi(void) +{ +acpi_disabled = 1; +} It makes little sense to add the declaration of acpi_disabled without actually defining it. Also, the code is very similar to the x86 version. It would make sense to factor it out (disable_acpi, acpi parameters...) into a common place. +#endif /*_ASM_ARM_ACPI_H*/ Regards, -- Julien Grall
Re: [Xen-devel] [PATCH] [RFC] x86/domctl: Fix getpageframeinfo* handling.
On 18/05/15 12:43, Jan Beulich wrote: On 18.05.15 at 12:59, andrew.coop...@citrix.com wrote: In tree, there is one single caller of XEN_DOMCTL_getpageframeinfo3 (xc_get_pfn_type_batch()), and no callers of the older variants. getpageframeinfo3 and getpageframeinfo2 are compatible if the parameter contents are considered to be unsigned long; a compat guest calling getpageframeinfo3 falls through into the getpageframeinfo2 handler. However, getpageframeinfo3 and getpageframeinfo2 have different algorithms for calculating the eventual frame type, which means that a toolstack will get different answers depending on whether it is compat or not, which is a problem for all possible uses. Is there any other difference besides the former being capable of returning XEN_DOMCTL_PFINFO_BROKEN (which I would suppose to have been a later addition that didn't get properly sync-ed to the older handlers)? That was the only difference I could spot, but any difference is a problem. @@ -378,6 +147,81 @@ long arch_do_domctl( break; } +case XEN_DOMCTL_getpageframeinfo3: +{ +unsigned int num = domctl-u.getpageframeinfo3.num; + +/* Games to allow this code block to handle a compat guest. */ +void * __user guest_handle = domctl-u.getpageframeinfo3.array.p; The __used belongs between void and *. Also the blank line above looks somewhat misplaced (I guess you added it to kind of emphasize the comment). I think it ended up like this more by accident, but the emphasis is important. I will shuffle the width up. +unsigned int width = has_32bit_shinfo(currd) ? 32 : 64; These are bit counts, yet where you use the value you want byte granularity. So they are. I am surprised that this didn't blow up. 32bit unsigned long into it +if ( unlikely(num 1024) || + unlikely(num != domctl-u.getpageframeinfo3.num) ) +{ +ret = -E2BIG; +break; +} + +for ( i = 0; i num; ++i ) +{ +unsigned long gfn = 0, type = 0; gfn's initializer looks pointless (and if anything it should be INVALID_MFN or some such). 
It must absolutely be 0 for when we read a 32bit values into it, although I realise I do need to extend ~0U to ~0UL for compat guests. +struct page_info *page; +p2m_type_t t; + +if ( raw_copy_from_guest(gfn, guest_handle + (i * width), width) ) +{ +ret = -EFAULT; +break; +} + +page = get_page_from_gfn(d, gfn, t, P2M_ALLOC); + +if ( unlikely(!page) || + unlikely(is_xen_heap_page(page)) ) +{ +if ( p2m_is_broken(t) ) +type = XEN_DOMCTL_PFINFO_BROKEN; +else +type = XEN_DOMCTL_PFINFO_XTAB; Realizing that this was this way in the old code too, would you nevertheless mind adding unlikely() and/or flip the if and else branches? Certainly. Does this mean that you are happy in principle with the raw_* use? ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
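Two points from this exchange, the element width being a byte count rather than a bit count and the need to start from 0 so that a 32-bit value is zero-extended, can be sketched as follows. This is a standalone illustration, not the hypervisor code: read_guest_gfn() is a hypothetical helper, plain memcpy() stands in for the guest copy routine, and a little-endian host (as on x86) is assumed.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fetch element i from an array of frame numbers whose
 * elements are width_bytes wide (4 for a compat caller, 8 otherwise).
 * Assumes a little-endian host.
 */
uint64_t read_guest_gfn(const void *array, unsigned int i,
                        unsigned int width_bytes)
{
    uint64_t gfn = 0; /* must start at 0 so a 4-byte copy is zero-extended */

    /* The offset is in bytes: i * 4 or i * 8, never i * 32 or i * 64. */
    memcpy(&gfn, (const unsigned char *)array + (size_t)i * width_bytes,
           width_bytes);

    /* Widen the compat "all ones" value (~0U) to the native one (~0UL). */
    if (width_bytes == 4 && gfn == UINT32_MAX)
        gfn = UINT64_MAX;

    return gfn;
}
```

With width given in bits (32 or 64), both the offset and the copy length would be eight times too large, which is the bug pointed out in the review.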
Re: [Xen-devel] [PATCH v4] OSSTEST: introduce a raisin build test
On Mon, 2015-05-18 at 14:33 +0100, Ian Jackson wrote: Ian Campbell writes (Re: [Xen-devel] [PATCH v4] OSSTEST: introduce a raisin build test): On Mon, 2015-05-18 at 14:05 +0100, George Dunlap wrote: That solves the most general case; but it sounds like you care mostly about the very specific case of dealing with components that depend on the current output of xen.git. Starting simple may be fine. Currently we only have ts-*-build things which depend on the output of ts-xen-build (in fact, we only have ts-libvirt-build). That's not true. We have:
  job                      step               script
  build-amd64              xen-build          ts-xen-build
  build-amd64-rumpuserxen  rumpuserxen-build  ts-rumpuserxen-build
  build-amd64-rumpuserxen  xen-build          ts-xen-build
where each of the lines in my table, above, uses output from the previous line. Indeed, I even seem to have partially remembered that in the next para (partially because I had it as in the future...) Ian.
Re: [Xen-devel] [PATCH] tools: Fix wild memory allocations from c/s 250f0b4 and 85d78b4
On 18/05/15 15:00, Boris Ostrovsky wrote: On 05/18/2015 08:57 AM, Andrew Cooper wrote: These changesets cause the respective libxc functions to unconditionally dereference their max_cpus/nodes parameters as part of initial memory allocations. It will fail at obtaining the correct number of cpus/nodes from Xen, as the guest handles will not be NULL. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Boris Ostrovsky boris.ostrov...@oracle.com --- Spotted by XenServer's Coverity run. --- tools/libxl/libxl.c |4 ++-- tools/misc/xenpm.c|4 ++-- tools/python/xen/lowlevel/xc/xc.c |4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) xenpm bug is already fixed (commit b315cd9cce5b6da7ca89b2d7bad3fb01e7716044 in the staging tree). I am not sure I understand why Coverity complains about other spots. For example, in libxl_get_cpu_topology() num_cpus can be left uninitialized only if xc_cputopoinfo(ctx->xch, &num_cpus, NULL) fails, in which case we go to 'GC_FREE; return ret;', so it's not ever used. xc_cputopoinfo(ctx->xch, &num_cpus, NULL) unconditionally dereferences and reads num_cpus, and performs a memory allocation based on the result. ~Andrew
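The hazard described here is typical of functions that use one pointer argument both as input (capacity) and as output (count). A minimal sketch, assuming a made-up demo_query_cpus() that is not the libxc implementation and a made-up CPU count:

```c
#include <stddef.h>

#define DEMO_NR_CPUS 4u /* made-up value for the demo */

/*
 * Hypothetical query function: *max_cpus is read on entry (capacity of buf)
 * and written on return (number of CPUs present). buf == NULL asks for the
 * count only, but *max_cpus is still read first.
 */
int demo_query_cpus(unsigned int *max_cpus, int *buf)
{
    unsigned int i, capacity = *max_cpus; /* dereferenced unconditionally */

    if (buf != NULL)
        for (i = 0; i < capacity && i < DEMO_NR_CPUS; i++)
            buf[i] = (int)i;

    *max_cpus = DEMO_NR_CPUS;

    return 0;
}
```

A caller that only wants the count still has to initialise the variable, for example `unsigned int n = 0; demo_query_cpus(&n, NULL);`, because the callee reads it before deciding what to do. That is the reason the patch initialises num_cpus and num_nodes to 0.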
Re: [Xen-devel] [PATCH v2 06/41] arm/acpi : Add basic ACPI initialization
Hi Parth, On 17/05/15 21:03, Parth Dixit wrote: acpi_boot_table_init() will be called in start_xen to get the RSDP and all the table pointers. with this patch, we can get ACPI boot-time tables from firmware on ARM64. Signed-off-by: Naresh Bhat naresh.b...@linaro.org Signed-off-by: Parth Dixit parth.di...@linaro.org --- xen/arch/arm/acpi/Makefile | 1 + xen/arch/arm/acpi/boot.c | 56 ++ xen/arch/arm/setup.c | 13 +-- 3 files changed, 68 insertions(+), 2 deletions(-) create mode 100644 xen/arch/arm/acpi/boot.c diff --git a/xen/arch/arm/acpi/Makefile b/xen/arch/arm/acpi/Makefile index b5be22d..196c40a 100644 --- a/xen/arch/arm/acpi/Makefile +++ b/xen/arch/arm/acpi/Makefile @@ -1 +1,2 @@ obj-y += lib.o +obj-y += boot.o diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c new file mode 100644 index 000..8dc69d5 --- /dev/null +++ b/xen/arch/arm/acpi/boot.c @@ -0,0 +1,56 @@ +/* + * ARM64 Specific Low-Level ACPI Boot Support This code is not ARM64 specific: s/ARM64/ARM/ + * + * Copyright (C) 2014, Naresh Bhat naresh.b...@linaro.org The code within this file is a copy of arch/x86/acpi/boot.c. Please retain the copyright and add yours if necessary. + * + * ~~ + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * ~~ + */ + +#include xen/init.h +#include xen/acpi.h + +#include asm/acpi.h + +/* + * acpi_boot_table_init() called from setup_arch(), always. + * 1. find RSDP and get its address, and then find XSDT + * 2. extract all tables and checksums them all + * + * We can parse ACPI boot-time tables such as FADT, MADT after + * this function is called. It's worth to expand the commit message as done in the x86 version to explain the return value. + */ +int __init acpi_boot_table_init(void) +{ +int error; + +/* If acpi_disabled, bail out */ +if ( acpi_disabled ) +return 1; + +/* Initialize the ACPI boot-time table parser. */ +error = acpi_table_init(); I didn't find a better place for this comment. Though it's related to the ACPI initialization... You need to change_acpi_os_get_root_pointer, the current behavior is to fallback on the legacy method (i.e scanning the first MB of memory) when efi_enabled 0. 
+if ( error ) +{ +disable_acpi(); +return error; +} + +return 0; +} + diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c index 06f8e54..5711077 100644 --- a/xen/arch/arm/setup.c +++ b/xen/arch/arm/setup.c @@ -36,6 +36,7 @@ #include xen/pfn.h #include xen/vmap.h #include xen/libfdt/libfdt.h +#include xen/acpi.h #include asm/page.h #include asm/current.h #include asm/setup.h @@ -45,10 +46,12 @@ #include asm/procinfo.h #include asm/setup.h #include xsm/xsm.h +#include asm/acpi.h struct bootinfo __initdata bootinfo; struct cpuinfo_arm __read_mostly boot_cpu_data; +bool_t acpi_disabled; #ifdef CONFIG_ARM_32 static unsigned long opt_xenheap_megabytes __initdata; @@ -610,7 +613,6 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size) init_xenheap_pages(pfn_to_paddr(xenheap_mfn_start), pfn_to_paddr(boot_mfn_start)); -end_boot_allocator(); } #else /* CONFIG_ARM_64 */ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size) @@ -680,7 +682,6 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size) setup_frametable_mappings(ram_start, ram_end); max_page = PFN_DOWN(ram_end); -end_boot_allocator(); You need to explain in the commit message why this is necessary (see [1]). Anybody who didn't follow/remember the thread will think this is wrong... } #endif @@ -751,6 +752,14 @@ void __init start_xen(unsigned long boot_phys_offset, setup_mm(fdt_paddr, fdt_size); +/* +* Parse the ACPI tables for possible boot-time configuration +*/ Coding style. /* * Foo * bar */ Although the comment is fitting in a single line so /* foo bar */ Bu +if( !acpi_disabled )
Re: [Xen-devel] [PATCHv6 1/3] xen: use ticket locks for spin locks
On 14.05.15 at 13:21, david.vra...@citrix.com wrote: void _spin_lock(spinlock_t *lock) { +spinlock_tickets_t tickets = { .tail = 1, }; This breaks the build on gcc 4.3.x (due to tail being a member of an unnamed structure member of a union). Jan
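One possible way around this kind of build failure is to avoid naming the nested field in the initializer at all. The sketch below assumes a union layout modelled on the quoted hunk (the real type is not shown in this message) and uses hypothetical demo_ names throughout; it is not necessarily the fix that was adopted.

```c
#include <stdint.h>

/* Assumed layout, modelled on the quoted hunk; not copied from the patch. */
typedef union {
    uint32_t head_tail;
    struct {
        uint16_t head;
        uint16_t tail;
    };                  /* unnamed structure member (C11 / GNU extension) */
} demo_tickets_t;

/*
 * Older compilers may reject "{ .tail = 1, }" because tail lives inside the
 * unnamed member. Zero-initialise through the first member and assign.
 */
demo_tickets_t demo_one_ticket(void)
{
    demo_tickets_t tickets = { 0 };

    tickets.tail = 1;

    return tickets;
}
```

Another option would be to initialise the named head_tail member with the equivalent combined value, at the cost of encoding the field layout in a constant.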
Re: [Xen-devel] [PATCH v2 01/41] arm/acpi: Build numa for x86 only
Hi Parth, On 17/05/15 21:03, Parth Dixit wrote: From: Naresh Bhat naresh.b...@linaro.org Numa is currently not supported for arm in xen. Configure and build numa for x86 architecture only. Signed-off-by: Naresh Bhat naresh.b...@linaro.org Signed-off-by: Parth Dixit parth.di...@linaro.org Reviewed-by: Julien Grall julien.gr...@citrix.com Regards, -- Julien Grall
Re: [Xen-devel] [PATCH] tools: Fix wild memory allocations from c/s 250f0b4 and 85d78b4
On Mon, May 18, 2015 at 01:57:24PM +0100, Andrew Cooper wrote: These changesets cause the respective libxc functions to unconditionally dereference their max_cpus/nodes parameters as part of initial memory allocations. It will fail at obtaining the correct number of cpus/nodes from Xen, as the guest handles will not be NULL. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Boris Ostrovsky boris.ostrov...@oracle.com Acked-by: Wei Liu wei.l...@citrix.com
Re: [Xen-devel] [PATCH v7 01/14] x86: add socket_to_cpumask
On 08.05.15 at 10:56, chao.p.p...@linux.intel.com wrote: --- a/xen/arch/x86/mpparse.c +++ b/xen/arch/x86/mpparse.c @@ -64,6 +64,9 @@ unsigned int __read_mostly boot_cpu_physical_apicid = BAD_APICID; static unsigned int __devinitdata num_processors; static unsigned int __initdata disabled_cpus; +/* Total detected cpus (may exceed NR_CPUS) */ +unsigned int total_cpus; + /* Bitmask of physically existing CPUs */ physid_mask_t phys_cpu_present_map; @@ -112,6 +115,8 @@ static int __devinit MP_processor_info_x(struct mpc_config_processor *m, { int ver, apicid, cpu = 0; + total_cpus++; + if (!(m-mpc_cpuflag CPU_ENABLED)) { if (!hotplug) ++disabled_cpus; Is there a reason you can't use disabled_cpus and avoid adding yet another variable? --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -59,6 +59,9 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask); cpumask_t cpu_online_map __read_mostly; EXPORT_SYMBOL(cpu_online_map); +unsigned int nr_sockets __read_mostly; +cpumask_var_t *socket_to_cpumask __read_mostly; I'd really like to see the to dropped from the name. It has been confusing me not for the first time. I'd also prefer the section annotations to be at their mandated place, between type and variable name. @@ -239,11 +242,14 @@ static void link_thread_siblings(int cpu1, int cpu2) static void set_cpu_sibling_map(int cpu) { -int i; +int i, socket = cpu_to_socket(cpu); unsigned int struct cpuinfo_x86 *c = cpu_data; cpumask_set_cpu(cpu, cpu_sibling_setup_map); +if ( socket nr_sockets ) +cpumask_set_cpu(cpu, socket_to_cpumask[socket]); + if ( c[cpu].x86_num_siblings 1 ) { for_each_cpu ( i, cpu_sibling_setup_map ) @@ -301,6 +307,7 @@ static void set_cpu_sibling_map(int cpu) } } } + } ??? 
@@ -704,6 +711,8 @@ static struct notifier_block cpu_smpboot_nfb = { void __init smp_prepare_cpus(unsigned int max_cpus) { +int socket; unsigned int @@ -717,6 +726,15 @@ void __init smp_prepare_cpus(unsigned int max_cpus) stack_base[0] = stack_start; +nr_sockets = DIV_ROUND_UP(total_cpus, boot_cpu_data.x86_max_cores * + boot_cpu_data.x86_num_siblings); +socket_to_cpumask = xzalloc_array(cpumask_var_t, nr_sockets); +if ( !socket_to_cpumask ) +panic(No memory for socket CPU siblings map); +for ( socket = 0; socket nr_sockets; socket++ ) +if ( !zalloc_cpumask_var(socket_to_cpumask + socket) ) +panic(No memory for socket CPU siblings cpumask); You might be allocating quite a bit too much memory now that you overestimate nr_sockets. Hence at least this second part of the change here would better be switched to an on demand allocation model. @@ -779,9 +797,12 @@ void __init smp_prepare_boot_cpu(void) static void remove_siblinginfo(int cpu) { -int sibling; +int sibling, socket = cpu_to_socket(cpu); unsigned int --- a/xen/include/asm-x86/smp.h +++ b/xen/include/asm-x86/smp.h @@ -58,6 +58,22 @@ int hard_smp_processor_id(void); void __stop_this_cpu(void); +/* Total number of cpus in this system (may exceed NR_CPUS) */ +extern unsigned int total_cpus; + +/* + * This value is calculated by total_cpus/cpus_per_socket with the assumption + * that APIC IDs from MP table are continuous. It's possible that this value + * is less than the real socket number in the system if the APIC IDs from MP + * table are too sparse. Also the value is considered not to change from the + * initial startup. Violation of any of these assumptions may result in errors + * and requires retrofitting all the relevant places. + */ This all reads pretty frightening. How about using a better estimate of core and thread count (i.e. ones matching actually observed values instead of the nearest larger powers of two) in the nr_sockets calculation? 
Overestimating nr_sockets is surely better than using too small a value, as the alternative of remembering to always bounds check socket values before use (not only in your series, but also in future changes) is going to be pretty fragile. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
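The effect of the per-socket divisor on the nr_sockets estimate can be checked with a little arithmetic. The sketch below reimplements the round-up division locally; demo_nr_sockets() and the machine sizes are made up for illustration and do not come from the patch.

```c
/* Round-up integer division, as DIV_ROUND_UP is conventionally defined. */
#define DEMO_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Estimate the socket count from a CPU total and a per-socket CPU count. */
unsigned int demo_nr_sockets(unsigned int total_cpus,
                             unsigned int cores_per_socket,
                             unsigned int threads_per_core)
{
    return DEMO_DIV_ROUND_UP(total_cpus, cores_per_socket * threads_per_core);
}
```

Take a hypothetical machine with 4 sockets of 6 cores and 2 threads each, 48 CPUs in total. If the core count is reported as the next larger power of two (8), the estimate is DIV_ROUND_UP(48, 16) = 3, one socket too few. With the observed value (6) it is DIV_ROUND_UP(48, 12) = 4. This is the kind of underestimate that the quoted comment warns about and that using observed core and thread counts would avoid.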
Re: [Xen-devel] [PATCH v7 07/14] x86: dynamically get/set CBM for a domain
On 08.05.15 at 10:56, chao.p.p...@linux.intel.com wrote: +int psr_get_l3_cbm(struct domain *d, unsigned int socket, uint64_t *cbm) +{ +unsigned int cos; +struct psr_cat_socket_info *info; +int ret = get_cat_socket_info(socket, &info); + +if ( ret ) +return ret; + +cos = d->arch.psr_cos_ids[socket]; +*cbm = info->cos_to_cbm[cos].cbm; +return 0; +} Blank line before a function's final return statement please. With that and Dario's comment addressed Acked-by: Jan Beulich jbeul...@suse.com
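For reference, the requested layout (a blank line separating the final return from the preceding statements) looks like this in a standalone sketch. All demo_ types, the table contents and the stub lookup are hypothetical stand-ins so that the example compiles on its own; only the shape of demo_get_l3_cbm() mirrors the quoted function.

```c
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for the structures in the quoted hunk. */
struct demo_cbm { uint64_t cbm; };
struct demo_socket_info { struct demo_cbm cos_to_cbm[4]; };
struct demo_domain { unsigned int cos_ids[2]; };

static struct demo_socket_info demo_sockets[2] = {
    { { { 0xff }, { 0x0f }, { 0 }, { 0 } } },
    { { { 0xff }, { 0 }, { 0 }, { 0 } } },
};

/* Stub lookup: returns 0 and fills *info, or -1 for an unknown socket. */
int demo_get_socket_info(unsigned int socket, struct demo_socket_info **info)
{
    if (socket >= 2)
        return -1;

    *info = &demo_sockets[socket];

    return 0;
}

int demo_get_l3_cbm(const struct demo_domain *d, unsigned int socket,
                    uint64_t *cbm)
{
    unsigned int cos;
    struct demo_socket_info *info;
    int ret = demo_get_socket_info(socket, &info);

    if (ret)
        return ret;

    cos = d->cos_ids[socket];
    *cbm = info->cos_to_cbm[cos].cbm;

    return 0;
}
```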
Re: [Xen-devel] [PATCH] tools: Fix wild memory allocations from c/s 250f0b4 and 85d78b4
On 05/18/2015 08:57 AM, Andrew Cooper wrote: These changesets cause the respective libxc functions to unconditionally dereference their max_cpus/nodes parameters as part of initial memory allocations. It will fail at obtaining the correct number of cpus/nodes from Xen, as the guest handles will not be NULL. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Boris Ostrovsky boris.ostrov...@oracle.com --- Spotted by XenServer's Coverity run. --- tools/libxl/libxl.c |4 ++-- tools/misc/xenpm.c|4 ++-- tools/python/xen/lowlevel/xc/xc.c |4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) xenpm bug is already fixed (commit b315cd9cce5b6da7ca89b2d7bad3fb01e7716044 in the staging tree). I am not sure I understand why Coverity complains about other spots. For example, in libxl_get_cpu_topology() num_cpus can be left uninitialized only if xc_cputopoinfo(ctx->xch, &num_cpus, NULL) fails, in which case we go to 'GC_FREE; return ret;', so it's not ever used. 
-boris diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index a6eb2df..295877b 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -5105,7 +5105,7 @@ libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out) xc_cputopo_t *cputopo; libxl_cputopology *ret = NULL; int i; -unsigned num_cpus; +unsigned num_cpus = 0; /* Setting buffer to NULL makes the call return number of CPUs */ if (xc_cputopoinfo(ctx-xch, num_cpus, NULL)) @@ -5191,7 +5191,7 @@ libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr) uint32_t *distance; libxl_numainfo *ret = NULL; int i, j; -unsigned num_nodes; +unsigned num_nodes = 0; if (xc_numainfo(ctx-xch, num_nodes, NULL, NULL)) { LOGE(ERROR, Unable to determine number of nodes); diff --git a/tools/misc/xenpm.c b/tools/misc/xenpm.c index fe2c001..2f9bd8e 100644 --- a/tools/misc/xenpm.c +++ b/tools/misc/xenpm.c @@ -356,7 +356,7 @@ static void signal_int_handler(int signo) struct timeval tv; int cx_cap = 0, px_cap = 0; xc_cputopo_t *cputopo = NULL; -unsigned max_cpus; +unsigned max_cpus = 0; if ( xc_cputopoinfo(xc_handle, max_cpus, NULL) != 0 ) { @@ -961,7 +961,7 @@ void scaling_governor_func(int argc, char *argv[]) void cpu_topology_func(int argc, char *argv[]) { xc_cputopo_t *cputopo = NULL; -unsigned max_cpus; +unsigned max_cpus = 0; int i, rc; if ( xc_cputopoinfo(xc_handle, max_cpus, NULL) != 0 ) diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c index fbd93db..c77e15b 100644 --- a/tools/python/xen/lowlevel/xc/xc.c +++ b/tools/python/xen/lowlevel/xc/xc.c @@ -1221,7 +1221,7 @@ static PyObject *pyxc_getcpuinfo(XcObject *self, PyObject *args, PyObject *kwds) static PyObject *pyxc_topologyinfo(XcObject *self) { xc_cputopo_t *cputopo = NULL; -unsigned i, num_cpus; +unsigned i, num_cpus = 0; PyObject *ret_obj = NULL; PyObject *cpu_to_core_obj, *cpu_to_socket_obj, *cpu_to_node_obj; @@ -1293,7 +1293,7 @@ static PyObject *pyxc_topologyinfo(XcObject *self) static PyObject 
*pyxc_numainfo(XcObject *self) { -unsigned i, j, num_nodes; +unsigned i, j, num_nodes = 0; uint64_t free_heap; PyObject *ret_obj = NULL, *node_to_node_dist_list_obj; PyObject *node_to_memsize_obj, *node_to_memfree_obj; ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] x86/EFI: keep EFI runtime services top level page tables up-to-date
Updates to idle_pg_table[] need to be mirrored into the page tables used for invoking EFI runtime services. Signed-off-by: Jan Beulich jbeul...@suse.com --- This in particular is a prereq for the patch at http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg02125.html to be correct. --- a/xen/arch/x86/efi/runtime.h +++ b/xen/arch/x86/efi/runtime.h @@ -2,4 +2,10 @@ #ifndef COMPAT l4_pgentry_t *__read_mostly efi_l4_pgtable; + +void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t l4e) +{ +if ( efi_l4_pgtable ) +l4e_write(efi_l4_pgtable + l4idx, l4e); +} #endif --- a/xen/arch/x86/efi/stub.c +++ b/xen/arch/x86/efi/stub.c @@ -2,6 +2,7 @@ #include xen/errno.h #include xen/init.h #include xen/lib.h +#include asm/page.h #ifndef efi_enabled const bool_t efi_enabled = 0; @@ -9,6 +10,8 @@ const bool_t efi_enabled = 0; void __init efi_init_memory(void) { } +void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t l4e) { } + paddr_t efi_rs_page_table(void) { BUG(); --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -5311,7 +5311,10 @@ static l3_pgentry_t *virt_to_xen_l3e(uns spin_lock(map_pgdir_lock); if ( !(l4e_get_flags(*pl4e) _PAGE_PRESENT) ) { -l4e_write(pl4e, l4e_from_paddr(__pa(pl3e), __PAGE_HYPERVISOR)); +l4_pgentry_t l4e = l4e_from_paddr(__pa(pl3e), __PAGE_HYPERVISOR); + +l4e_write(pl4e, l4e); +efi_update_l4_pgtable(l4_table_offset(v), l4e); pl3e = NULL; } if ( locking ) --- a/xen/include/asm-x86/page.h +++ b/xen/include/asm-x86/page.h @@ -288,6 +288,7 @@ extern l2_pgentry_t l2_identmap[4*L2_PAG extern l1_pgentry_t l1_identmap[L1_PAGETABLE_ENTRIES], l1_fixmap[L1_PAGETABLE_ENTRIES]; void paging_init(void); +void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t); #endif /* !defined(__ASSEMBLY__) */ #define _PAGE_NONE _AC(0x000,U)
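The idea of the patch, writing every new top-level entry to a second table as well when that table exists, can be sketched outside the hypervisor as follows. The demo_ names, the plain arrays and the 512-entry size are assumptions made for this illustration; the real code uses l4_pgentry_t, l4e_write() and efi_l4_pgtable as shown in the diff.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_L4_ENTRIES 512

typedef uint64_t demo_l4e_t;

demo_l4e_t demo_idle_l4[DEMO_L4_ENTRIES]; /* primary top-level table */
demo_l4e_t *demo_efi_l4;                  /* secondary table, NULL until set up */

/* Mirror one slot into the secondary table, if that table exists. */
void demo_efi_update_l4(unsigned int idx, demo_l4e_t l4e)
{
    if (demo_efi_l4 != NULL)
        demo_efi_l4[idx] = l4e;
}

/* Every write to the primary table goes through here so the copy follows. */
void demo_write_idle_l4(unsigned int idx, demo_l4e_t l4e)
{
    demo_idle_l4[idx] = l4e;
    demo_efi_update_l4(idx, l4e);
}
```

The NULL check matters because entries may be written before the secondary table has been allocated; in this sketch such early entries are not replayed later, so the secondary table would have to be populated from the primary one when it is created.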
Re: [Xen-devel] [xen-unstable test] 56456: regressions - FAIL
On 16.05.15 at 13:45, roger@citrix.com wrote: El 16/05/15 a les 10.51, osstest service user ha escrit: flight 56456 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/56456/ Regressions :-( This is my fault, paging_gva_to_gfn cannot be used to translate a PV guest VA to a GFN. The patch above restores the previous path for PV callers. While Tim would have the final say, I certainly would prefer to revert the offending patch and then apply a correct new version in its stead in this case (where the fix is not a simple, few lines change). Jan --- commit 4cfaf46cbb116ce8dd0124bbbea319489db4b068 Author: Roger Pau Monne roger@citrix.com Date: Sat May 16 13:42:39 2015 +0200 x86: restore PV path on paging_log_dirty_op Restore previous path for PV domains calling paging_log_dirty_op. This fixes the fallout from a809eeea06d20b115d78f12e473502bcb6209844. Signed-off-by: Roger Pau Monné roger@citrix.com diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c index 5eee88c..2f51126 100644 --- a/xen/arch/x86/mm/paging.c +++ b/xen/arch/x86/mm/paging.c @@ -416,6 +416,7 @@ static inline void *map_dirty_bitmap(XEN_GUEST_HANDLE_64(uint8) dirty_bitmap, unsigned long gfn; p2m_type_t p2mt; +ASSERT(paging_mode_enabled(current-domain)); gfn = paging_gva_to_gfn(current, (unsigned long)(dirty_bitmap.p + (pages 3)), pfec); @@ -446,6 +447,7 @@ static inline void *map_dirty_bitmap(XEN_GUEST_HANDLE_64(uint8) dirty_bitmap, static inline void unmap_dirty_bitmap(void *addr, struct page_info *page) { +ASSERT(paging_mode_enabled(current-domain)); if ( addr != NULL ) { unmap_domain_page(addr); @@ -465,9 +467,9 @@ static int paging_log_dirty_op(struct domain *d, mfn_t *l4 = NULL, *l3 = NULL, *l2 = NULL; unsigned long *l1 = NULL; int i4, i3, i2; -uint8_t *dirty_bitmap; -struct page_info *page; -unsigned long index_mapped; +uint8_t *dirty_bitmap = NULL; +struct page_info *page = NULL; +unsigned long index_mapped = 0; again: if ( !resuming ) @@ -482,12 +484,15 @@ static int 
paging_log_dirty_op(struct domain *d, p2m_flush_hardware_cached_dirty(d); } -index_mapped = resuming ? d-arch.paging.preempt.log_dirty.done : 0; -dirty_bitmap = map_dirty_bitmap(sc-dirty_bitmap, index_mapped, page); -if ( dirty_bitmap == NULL ) +if ( paging_mode_enabled(current-domain) ) { -domain_unpause(d); -return -EFAULT; +index_mapped = resuming ? d-arch.paging.preempt.log_dirty.done : 0; +dirty_bitmap = map_dirty_bitmap(sc-dirty_bitmap, index_mapped, page); +if ( dirty_bitmap == NULL ) +{ +domain_unpause(d); +return -EFAULT; +} } paging_lock(d); @@ -549,7 +554,8 @@ static int paging_log_dirty_op(struct domain *d, bytes = (unsigned int)((sc-pages - pages + 7) 3); if ( likely(peek) ) { -if ( pages (3 + PAGE_SHIFT) != +if ( paging_mode_enabled(current-domain) + pages (3 + PAGE_SHIFT) != index_mapped (3 + PAGE_SHIFT) ) { /* We need to map next page */ @@ -564,13 +570,30 @@ static int paging_log_dirty_op(struct domain *d, unmap_dirty_bitmap(dirty_bitmap, page); goto again; } -ASSERT(((pages 3) % PAGE_SIZE) + bytes = PAGE_SIZE); -if ( l1 ) -memcpy(dirty_bitmap + ((pages 3) % PAGE_SIZE), l1, - bytes); + +if ( paging_mode_enabled(current-domain) ) +{ +ASSERT(((pages 3) % PAGE_SIZE) + bytes = PAGE_SIZE); +if ( l1 ) +memcpy(dirty_bitmap + ((pages 3) % PAGE_SIZE), + l1, bytes); +else +memset(dirty_bitmap + ((pages 3) % PAGE_SIZE), + 0, bytes); +} else -memset(dirty_bitmap + ((pages 3) % PAGE_SIZE), 0, - bytes); +{ +if ( (l1 ? copy_to_guest_offset(sc-dirty_bitmap, +pages 3, +(uint8_t *)l1, +bytes) + : clear_guest_offset(sc-dirty_bitmap, +
Re: [Xen-devel] [xen-unstable test] 56456: regressions - FAIL
At 09:34 +0100 on 18 May (1431941676), Jan Beulich wrote: On 16.05.15 at 13:45, roger@citrix.com wrote: El 16/05/15 a les 10.51, osstest service user ha escrit: flight 56456 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/56456/ Regressions :-( This is my fault, paging_gva_to_gfn cannot be used to translate a PV guest VA to a GFN. The patch above restores the previous path for PV callers. While Tim would have the final say, I certainly would prefer to revert the offending patch and then apply a correct new version in its stead in this case (where the fix is not a simple, few lines change). I would be OK with a follow-up fix here, but I'm not convinced that this is it. In particular, paging_mode_enabled() should be true for any PV domain that's in log-dirty mode, so presumably the failure is only for lgd ops on VMs that don't have lgd enabled. So maybe we can either: - return an error for that case (but we'd want to understand how we got there first); or - have map_dirty_bitmap() DTRT, with something like access_ok() + a linear-pagetable lookup to find the frame. In any case, this is a regression so we should revert while we figure it out. Tim. --- commit 4cfaf46cbb116ce8dd0124bbbea319489db4b068 Author: Roger Pau Monne roger@citrix.com Date: Sat May 16 13:42:39 2015 +0200 x86: restore PV path on paging_log_dirty_op Restore previous path for PV domains calling paging_log_dirty_op. This fixes the fallout from a809eeea06d20b115d78f12e473502bcb6209844. 
Signed-off-by: Roger Pau Monné roger@citrix.com diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c index 5eee88c..2f51126 100644 --- a/xen/arch/x86/mm/paging.c +++ b/xen/arch/x86/mm/paging.c @@ -416,6 +416,7 @@ static inline void *map_dirty_bitmap(XEN_GUEST_HANDLE_64(uint8) dirty_bitmap, unsigned long gfn; p2m_type_t p2mt; +ASSERT(paging_mode_enabled(current-domain)); gfn = paging_gva_to_gfn(current, (unsigned long)(dirty_bitmap.p + (pages 3)), pfec); @@ -446,6 +447,7 @@ static inline void *map_dirty_bitmap(XEN_GUEST_HANDLE_64(uint8) dirty_bitmap, static inline void unmap_dirty_bitmap(void *addr, struct page_info *page) { +ASSERT(paging_mode_enabled(current-domain)); if ( addr != NULL ) { unmap_domain_page(addr); @@ -465,9 +467,9 @@ static int paging_log_dirty_op(struct domain *d, mfn_t *l4 = NULL, *l3 = NULL, *l2 = NULL; unsigned long *l1 = NULL; int i4, i3, i2; -uint8_t *dirty_bitmap; -struct page_info *page; -unsigned long index_mapped; +uint8_t *dirty_bitmap = NULL; +struct page_info *page = NULL; +unsigned long index_mapped = 0; again: if ( !resuming ) @@ -482,12 +484,15 @@ static int paging_log_dirty_op(struct domain *d, p2m_flush_hardware_cached_dirty(d); } -index_mapped = resuming ? d-arch.paging.preempt.log_dirty.done : 0; -dirty_bitmap = map_dirty_bitmap(sc-dirty_bitmap, index_mapped, page); -if ( dirty_bitmap == NULL ) +if ( paging_mode_enabled(current-domain) ) { -domain_unpause(d); -return -EFAULT; +index_mapped = resuming ? 
d-arch.paging.preempt.log_dirty.done : 0; +dirty_bitmap = map_dirty_bitmap(sc-dirty_bitmap, index_mapped, page); +if ( dirty_bitmap == NULL ) +{ +domain_unpause(d); +return -EFAULT; +} } paging_lock(d); @@ -549,7 +554,8 @@ static int paging_log_dirty_op(struct domain *d, bytes = (unsigned int)((sc-pages - pages + 7) 3); if ( likely(peek) ) { -if ( pages (3 + PAGE_SHIFT) != +if ( paging_mode_enabled(current-domain) + pages (3 + PAGE_SHIFT) != index_mapped (3 + PAGE_SHIFT) ) { /* We need to map next page */ @@ -564,13 +570,30 @@ static int paging_log_dirty_op(struct domain *d, unmap_dirty_bitmap(dirty_bitmap, page); goto again; } -ASSERT(((pages 3) % PAGE_SIZE) + bytes = PAGE_SIZE); -if ( l1 ) -memcpy(dirty_bitmap + ((pages 3) % PAGE_SIZE), l1, - bytes); + +if ( paging_mode_enabled(current-domain) ) +{ +ASSERT(((pages 3) % PAGE_SIZE) + bytes = PAGE_SIZE); +if ( l1 ) +memcpy(dirty_bitmap + ((pages 3) % PAGE_SIZE), + l1, bytes); +
Re: [Xen-devel] [PATCH v2 00/41] Add ACPI support for arm64 on Xen
Hi Parth (You dropped xen-devel, re-cc it) On 18/05/15 10:59, Parth Dixit wrote: yes, I tested with linux-next of 15th May and it is working fine with it, except for mounting the root partition, because the kernel is not able to detect the partition. I did not look further into it, but it's presumably a matter of configuring the right driver. Linux mainline will need 3 additional patches 1. Adding STAO and XENV 2. Hiding of UART 3. Enabling xen compilation with ACPI which was disabled by Hanjun. I'll wait for comments on this series and if there are no major changes I'll clean up the linux patches and send them for review as well. I think you can send an RFC now. The STAO and XENV tables are well-defined and the rest of the patches don't seem to be tied to your Xen series. A few workarounds have been made to get it working, these are as follows 1. In Xen interrupts are routed at boot time with edge/trigger level set to 0 because this information is not available at the time of booting. edge/trigger is only a bit. Can't you just avoid setting it rather than using a potentially incorrect value? That would require disabling the checks for edge/trigger at multiple places for acpi, if you are ok with the approach I'll go ahead and send the patch. IIRC, during the last Linaro connect we agreed to defer the edge/trigger setting for IRQs assigned to a Domain for both ACPI and DT. 2. EFI runtime services are disabled in linux but the proper solution has to come from the linux side. Can you detail a bit more? What is missing? We only create an EFI stub for passing RSDP and memory data in xen, we are not implementing full-fledged runtime service support for EFI. At present linux expects that if the EFI interface is present, runtime services are also available. Thanks for the explanation. Please add it in the next cover letter. It's useful for people who weren't present at the meeting or forgot what was said. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 2/4] x86emul: move stubs off the stack
This is needed as stacks are going to become non-executable. Signed-off-by: Jan Beulich jbeul...@suse.com --- a/tools/tests/x86_emulator/x86_emulate.c +++ b/tools/tests/x86_emulator/x86_emulate.c @@ -17,4 +17,8 @@ typedef bool bool_t; #define __packed __attribute__((packed)) #include x86_emulate/x86_emulate.h + +#define get_stub(stb) ((void *)((stb).addr = (uintptr_t)(stb).buf)) +#define put_stub(stb) + #include x86_emulate/x86_emulate.c --- a/xen/arch/x86/x86_emulate.c +++ b/xen/arch/x86/x86_emulate.c @@ -9,6 +9,7 @@ *Keir Fraser k...@xen.org */ +#include xen/domain_page.h #include asm/x86_emulate.h #include asm/asm_defns.h /* mark_regs_dirty() */ #include asm/processor.h /* current_cpu_info */ @@ -17,8 +18,22 @@ /* Avoid namespace pollution. */ #undef cmpxchg #undef cpuid +#undef wbinvd #define cpu_has_amd_erratum(nr) \ cpu_has_amd_erratum(current_cpu_data, AMD_ERRATUM_##nr) +#define get_stub(stb) ({ \ +(stb).addr = this_cpu(stubs.addr) + STUB_BUF_SIZE / 2; \ +((stb).ptr = map_domain_page(this_cpu(stubs.mfn))) + \ +((stb).addr (PAGE_SIZE - 1));\ +}) +#define put_stub(stb) ({ \ +if ( (stb).ptr ) \ +{ \ +unmap_domain_page((stb).ptr); \ +(stb).ptr = NULL; \ +} \ +}) + #include x86_emulate/x86_emulate.c --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -717,11 +717,14 @@ do{ struct fpu_insn_ctxt fic; } while (0) #define emulate_fpu_insn_stub(_bytes...)\ -do{ uint8_t stub[] = { _bytes, 0xc3 }; \ -struct fpu_insn_ctxt fic = { .insn_bytes = sizeof(stub)-1 };\ +do{ uint8_t *buf = get_stub(stub); \ +unsigned int _nr = sizeof((uint8_t[]){ _bytes }); \ +struct fpu_insn_ctxt fic = { .insn_bytes = _nr };\ +memcpy(buf, ((uint8_t[]){ _bytes, 0xc3 }), _nr + 1);\ get_fpu(X86EMUL_FPU_fpu, fic); \ -(*(void(*)(void))stub)(); \ +stub.func();\ put_fpu(fic); \ +put_stub(stub); \ } while (0) static unsigned long _get_rep_prefix( @@ -1458,6 +1461,7 @@ x86_emulate( struct operand src = { .reg = REG_POISON }; struct operand dst = { .reg = 
REG_POISON }; enum x86_swint_type swint_type; +struct x86_emulate_stub stub = {}; DECLARE_ALIGNED(mmval_t, mmval); /* * Data operand effective address (usually computed from ModRM). @@ -3792,6 +3796,7 @@ x86_emulate( done: _put_fpu(); +put_stub(stub); return rc; twobyte_insn: @@ -4007,9 +4012,15 @@ x86_emulate( /* {,v}movss xmm,xmm/m32 */ /* {,v}movsd xmm,xmm/m64 */ { -uint8_t stub[] = { 0x3e, 0x3e, 0x0f, b, modrm, 0xc3 }; -struct fpu_insn_ctxt fic = { .insn_bytes = sizeof(stub)-1 }; +uint8_t *buf = get_stub(stub); +struct fpu_insn_ctxt fic = { .insn_bytes = 5 }; +buf[0] = 0x3e; +buf[1] = 0x3e; +buf[2] = 0x0f; +buf[3] = b; +buf[4] = modrm; +buf[5] = 0xc3; if ( vex.opcx == vex_none ) { if ( vex.pfx VEX_PREFIX_DOUBLE_MASK ) @@ -4017,7 +4028,7 @@ x86_emulate( else vcpu_must_have_sse(); ea.bytes = 16; -SET_SSE_PREFIX(stub[0], vex.pfx); +SET_SSE_PREFIX(buf[0], vex.pfx); get_fpu(X86EMUL_FPU_xmm, fic); } else @@ -4044,15 +4055,16 @@ x86_emulate( /* convert memory operand to (%rAX) */ rex_prefix = ~REX_B; vex.b = 1; -stub[4] = 0x38; +buf[4] = 0x38; } if ( !rc ) { - copy_REX_VEX(stub, rex_prefix, vex); - asm volatile ( call *%0 : : r (stub), a (mmvalp) + copy_REX_VEX(buf, rex_prefix, vex); + asm volatile ( call *%0 : : r (stub.func), a (mmvalp) : memory ); } put_fpu(fic); +put_stub(stub); if ( !rc (b 1) (ea.type == OP_MEM) ) rc = ops-write(ea.mem.seg, ea.mem.off, mmvalp, ea.bytes, ctxt); @@ -4242,9 +4254,15 @@ x86_emulate( /* {,v}movdq{a,u} xmm,xmm/m128 */ /* vmovdq{a,u} ymm,ymm/m256 */ { -uint8_t stub[] = { 0x3e, 0x3e, 0x0f, b, modrm, 0xc3 }; -struct
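As a rough illustration of the byte-assembly step in the patched emulate_fpu_insn_stub() above (copy the instruction bytes into a buffer obtained elsewhere, then terminate with 0xc3), here is a minimal sketch in C. It assumes a plain array in place of the per-CPU stub page, never executes the bytes, and the names struct sketch_stub, build_stub and SKETCH_STUB_BUF_SIZE are hypothetical, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical size; the real STUB_BUF_SIZE is defined in Xen, not here. */
#define SKETCH_STUB_BUF_SIZE 32

/* Stand-in for the patch's stub descriptor: just a buffer, no page mapping. */
struct sketch_stub {
    uint8_t buf[SKETCH_STUB_BUF_SIZE];
};

/*
 * Copy nr instruction bytes into the stub buffer and terminate them with a
 * near RET (0xc3), mirroring the memcpy() of "_bytes, 0xc3" in the patch.
 * Returns the number of instruction bytes (excluding the RET), or 0 if the
 * bytes plus the RET would not fit.
 */
static size_t build_stub(struct sketch_stub *stb, const uint8_t *insn,
                         size_t nr)
{
    assert(stb != NULL && insn != NULL);

    if (nr + 1 > sizeof(stb->buf))
        return 0;

    memcpy(stb->buf, insn, nr);
    stb->buf[nr] = 0xc3; /* ret */
    return nr;
}
```

The return value plays the role of .insn_bytes in the patch, which counts the instruction without its trailing RET.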
[Xen-devel] [OSSTEST PATCH] ts-host-install: die on unknown options
Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com --- ts-host-install |2 ++ 1 file changed, 2 insertions(+) diff --git a/ts-host-install b/ts-host-install index b13f293..9d6a73c 100755 --- a/ts-host-install +++ b/ts-host-install @@ -37,6 +37,8 @@ while (@ARGV and $ARGV[0] =~ m/^-/) { $xopts{DebconfPriority}= defined($1) ? $1 : 'low'; } elsif (m/^--rescue$/) { $xopts{RescueMode}= 1; +} else { + die $_ $!; } } -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 2/3] mwait-idle: update support for Silvermont Core in Baytrail SOC
On some Silvermont-Core/Baytrail-SOC systems, C1E latency is higher than original specifications. Although C1E is still enumerated in CPUID.MWAIT.EDX, we delete the state from intel_idle to avoid latency impact. Under some conditions, the latency of the C6N-BYT and C6S-BYT states may exceed the specified values of 40 and 140 usec, respectively. Increase those values to 300 and 500 usec; to assure that the hardware does not violate constraints that may be set by the Linux PM_QOS sub-system. Also increase the C7-BYT target residency to 4.0 ms from 1.5 ms. Signed-off-by: Len Brown len.br...@intel.com [Linux commit d7ef76717322c8e2df7d4360b33faa9466cb1a0d] Signed-off-by: Jan Beulich jbeul...@suse.com --- a/xen/arch/x86/cpu/mwait-idle.c +++ b/xen/arch/x86/cpu/mwait-idle.c @@ -196,28 +196,22 @@ static const struct cpuidle_state byt_cs .target_residency = 1, }, { - .name = C1E-BYT, - .flags = MWAIT2flg(0x01), - .exit_latency = 15, - .target_residency = 30, - }, - { .name = C6N-BYT, .flags = MWAIT2flg(0x58) | CPUIDLE_FLAG_TLB_FLUSHED, - .exit_latency = 40, + .exit_latency = 300, .target_residency = 275, }, { .name = C6S-BYT, .flags = MWAIT2flg(0x52) | CPUIDLE_FLAG_TLB_FLUSHED, - .exit_latency = 140, + .exit_latency = 500, .target_residency = 560, }, { .name = C7-BYT, .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED, .exit_latency = 1200, - .target_residency = 1500, + .target_residency = 4000, }, { .name = C7S-BYT, ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
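To see why the larger exit_latency values matter for latency constraints, the following sketch in C holds only the three Baytrail states whose post-patch numbers are visible in the diff and picks the deepest one that fits a latency limit and an expected idle period. The selection rule is a generic assumption for illustration, not Xen's or Linux's actual governor, and struct sketch_state and pick_state are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_state {
    const char *name;
    unsigned int exit_latency;     /* usec */
    unsigned int target_residency; /* usec */
};

/* Post-patch values for the states shown in the diff (C1E-BYT is removed). */
static const struct sketch_state byt_states[] = {
    { "C6N-BYT",  300,  275 },
    { "C6S-BYT",  500,  560 },
    { "C7-BYT",  1200, 4000 },
};

/*
 * Return the deepest state whose exit latency does not exceed max_latency
 * and whose target residency does not exceed expected_idle, or NULL if no
 * listed state qualifies. States are ordered shallowest to deepest.
 */
static const struct sketch_state *pick_state(unsigned int max_latency,
                                             unsigned int expected_idle)
{
    const struct sketch_state *best = NULL;
    size_t i;

    for (i = 0; i < sizeof(byt_states) / sizeof(byt_states[0]); i++) {
        if (byt_states[i].exit_latency <= max_latency &&
            byt_states[i].target_residency <= expected_idle)
            best = &byt_states[i];
    }

    assert(best == NULL || best->exit_latency <= max_latency);
    return best;
}
```

Under this rule a 100 usec latency limit, which the old 40 usec figure for C6N-BYT would have satisfied, now excludes every state in the table.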
Re: [Xen-devel] [PATCH v4] OSSTEST: introduce a raisin build test
On Mon, 2015-05-18 at 14:05 +0100, George Dunlap wrote: It sounds like you're saying just to have a base Xen build that would build what currently comes out of xen.git, and then base our other components on top of that. Correct, otherwise you get one big job which can fail due to any component, which makes reporting harder. It is nice to spot ts-libvirt-build vs ts-xen-build failing in the reports rather than just ts-raisin-build for everything. More critically one big job makes the bisector less effective since it has to rebuild things which haven't changed just because something which shares the monolithic build job has changed. e.g. today a libvirt bisect won't rebuild Xen if it doesn't have to. In which case what you described would probably work just fine. Per component dist dirs is similarly surely possible but perhaps not something raisin wants. You could in theory have per-component output directories, and then a global input directory which was blown away at the beginning of every raisin build and re-constructed as needed. That would be the sort of equivalent of the mock-style RPM build (where the chroot represents the global input). Not sure how well that would work, though. In essence everything builds into dist.$component and then at the end of each component raisin automatically takes that and overlays whatever it contains over some central dist.all which subsequent components actually build against? Perhaps with a mode to seed dist.all from dist.* iff dist.all doesn't exist. Yeah, the basic idea would be when you run build, you rm -rf dist.all. Then if $dependency is in $COMPONENTS, then you clone (if necessary) and build; if not, you copy from dist.$dependency into dist.all. If $dependency is neither in $COMPONENTS nor in dist.$dependency, you throw an error. That solves the most general case; but it sounds like you care mostly about the very specific case of dealing with components that depend on the current output of xen.git.
Starting simple may be fine. Currently we only have ts-*-build things which depend on the output of ts-xen-build (in fact, we only have ts-libvirt-build). I'm not sure if there will be others in the future, I suppose ts-rump{qemu,xenstore,foo}-build - ts-rumpkernel-build - ts-xen-build might eventually be a possibility... Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
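The three-way dependency rule George describes in this thread (build it if it is in $COMPONENTS, otherwise copy it from dist.$dependency, otherwise fail) can be written down as a small decision function. This is only a sketch in C of that rule as stated in the mail; raisin itself is not written this way, and enum dep_action, name_in_list and resolve_dependency are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum dep_action { DEP_BUILD, DEP_COPY_FROM_DIST, DEP_ERROR };

/* Linear search for name in a list of n strings. */
static bool name_in_list(const char *name, const char *const *list, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(name, list[i]) == 0)
            return true;
    return false;
}

/*
 * Decide what to do with one dependency:
 *  - listed in components: clone (if necessary) and build it;
 *  - else a dist.<dep> directory exists: copy it into dist.all;
 *  - else: error.
 * dist_dirs holds the names for which a dist.<name> directory exists.
 */
static enum dep_action resolve_dependency(const char *dep,
                                          const char *const *components,
                                          size_t nr_components,
                                          const char *const *dist_dirs,
                                          size_t nr_dist_dirs)
{
    assert(dep != NULL);

    if (name_in_list(dep, components, nr_components))
        return DEP_BUILD;
    if (name_in_list(dep, dist_dirs, nr_dist_dirs))
        return DEP_COPY_FROM_DIST;
    return DEP_ERROR;
}
```

In the osstest case discussed here, a libvirt build job would list only libvirt as a component and find xen among the existing dist directories, so xen would be copied rather than rebuilt.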
Re: [Xen-devel] qemu device model question
On Wed, May 13, 2015 at 10:30 AM, Ian Campbell ian.campb...@citrix.com wrote: On Wed, 2015-05-13 at 10:54 +0200, Juergen Gross wrote: Hi, while trying to build a pvusb backend in qemu I think I've found a general issue in xl: qemu for pv-domains is started only at domain creation and only if there is at least one backend in qemu required. If there is no qemu process started for the domain at creation time it will be impossible to successfully add such a device later while the domain is running. Are there any plans to remove that restriction? Or have I missed some mechanism in xl to start qemu at a later time? I think it would be reasonable to have some way to indicate that pvusb support is desired even if there are no such devices on boot, and for libxl to start the necessary backend in that case. s/pvusb/whatever/ In the pvusb case, I think it's the plan to have an option to specify a pvusb *bus*, even with no devices attached. But we should probably solve the more general problem. :-) We should have a way to plug in disks / other devices supplied by qemu after the VM is already running. Specifying an I might want something later option is suboptimal from a UI point of view: if the admin forgets to add it when it's needed, she'll have to reboot the VM; if she forgets to remove it when it's not needed, she'll have a useless process lying around, potentially increasing the surface of attack unnecessarily. Starting a qdisk (qemu-pv?) process on-demand would automatically give you the most efficient option without additional UI baggage. Is there really no way to start up a qdisk process after the domain is created? The qdisk process doesn't actually need to do any emulation, after all -- it's just acting as a backend, right? -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 03/14] x86: detect and initialize Intel CAT feature
On 08.05.15 at 10:56, chao.p.p...@linux.intel.com wrote: --- a/xen/arch/x86/psr.c +++ b/xen/arch/x86/psr.c @@ -19,14 +19,26 @@ #include asm/psr.h #define PSR_CMT(10) +#define PSR_CAT(11) + +struct psr_cat_socket_info { +unsigned int cbm_len; +unsigned int cos_max; +}; struct psr_assoc { uint64_t val; }; struct psr_cmt *__read_mostly psr_cmt; + +static unsigned long *__read_mostly cat_socket_init_bitmap; +static unsigned long *__read_mostly cat_socket_enable_bitmap; Didn't we agree to fold these two into one? Apart from that the _bitmap name tag doesn't seem very useful... @@ -194,16 +210,100 @@ void psr_ctxt_switch_to(struct domain *d) } } +static void cat_cpu_init(void) +{ +unsigned int eax, ebx, ecx, edx; +struct psr_cat_socket_info *info; +unsigned int socket; +unsigned int cpu = smp_processor_id(); +const struct cpuinfo_x86 *c = cpu_data + cpu; + +if ( !cpu_has(c, X86_FEATURE_CAT) ) +return; + +socket = cpu_to_socket(cpu); +if ( socket = nr_sockets ) +return; + +/* Avoid initializing more than one times for the same socket. */ +if ( test_and_set_bit(socket, cat_socket_init_bitmap) ) +return; + +cpuid_count(PSR_CPUID_LEVEL_CAT, 0, eax, ebx, ecx, edx); +if ( ebx PSR_RESOURCE_TYPE_L3 ) +{ +cpuid_count(PSR_CPUID_LEVEL_CAT, 1, eax, ebx, ecx, edx); +info = cat_socket_info + socket; +info-cbm_len = (eax 0x1f) + 1; +info-cos_max = min(opt_cos_max, edx 0x); Is opt_cos_max being zero (or even one) going to result in a useful / working environment? I.e. shouldn't you rather disable CAT in that case? +static void cat_cpu_fini(void) +{ +unsigned int cpu = smp_processor_id(); +unsigned int socket = cpu_to_socket(cpu); This is the only use of cpu, i.e. the variable is pretty pointless. static int cpu_callback( struct notifier_block *nfb, unsigned long action, void *hcpu) { if ( action == CPU_STARTING ) psr_cpu_init(); +else if ( action == CPU_DYING ) +psr_cpu_fini(); Are these the right notifiers for doing things involving memory allocation / freeing? 
@@ -217,9 +317,12 @@ static int __init psr_presmp_init(void) if ( (opt_psr PSR_CMT) opt_rmid_max ) init_psr_cmt(opt_rmid_max); +if ( opt_psr PSR_CAT ) +init_psr_cat(); + psr_cpu_init(); -if ( psr_cmt_enabled() ) -register_cpu_notifier(cpu_nfb); +if ( psr_cmt_enabled() || cat_socket_info ) + register_cpu_notifier(cpu_nfb); Please don't corrupt indentation here. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen dom0 crash
Forgot to CC the Intel folks... On 2015-5-18, at 15:55, Lars Eggert l...@netapp.com wrote: Hi, Roger asked me to send this bug report to xen-devel. I'm trying to bring up a Xen dom0 on a Fujitsu RX308, but it crashes when trying to start the kernel. Any ideas? Lars [FreeBSD boot loader banner and menu, garbled in the archive; surviving fragments: /boot/kernel/cc_vegas.ko size 0x30d0 at 0x1068000, el/cc_hd.ko size 0x2c00 at 0x106200, tcp.ko size 0x2f90 at 0x1065000] Booting... Xen 4.6-unstable (XEN) Xen version 4.6-unstable (r...@netapp.com) (gcc47 (FreeBSD Ports Collection) 4.7.4) debug=y Mon May 18 14:50:17 CEST 2015 (XEN) Latest ChangeSet: (XEN) Bootloader: FreeBSD Loader (XEN) Command line: dom0_mem=4096M dom0pvh=1 com1=115200,8n1 console=com1 (XEN) Video information: (XEN) VGA is text mode 80x25, font 8x16 (XEN) VBE/DDC methods: none; EDID transfer time: 0 seconds
(XEN) EDID info not retrieved because no DDC retrieval method detected=(XEN) Disc information: (XEN) Found 0 MBR signatures (XEN) Found 0 EDD information structures (XEN) Xen-e820 RAM map:+0x50db0 - (XEN) - 0008b800 (usable) (XEN) 0008b800 - 000a (reserved) (XEN) 000e - 0010 (reserved) (XEN) 0010 - 7cb09000 (usable) (XEN) 7cb09000 - 7cb39000 (reserved) (XEN) 7cb39000 - 7cc52000 (ACPI data) (XEN) 7cc52000 - 7d4b (ACPI NVS) (XEN) 7d4b - 7eafb000 (reserved) (XEN) 7eafb000 - 7eafc000 (usable) (XEN) 7eafc000 - 7eb82000 (ACPI NVS) (XEN) 7eb82000 - 7f00 (usable) (XEN) 8000 - 9000 (reserved) (XEN) fed1c000 - fed2 (reserved) (XEN) ff00 - 0001 (reserved) (XEN) 0001 - 00208000 (usable) (XEN) ACPI: RSDP 000F04A0, 0024 (r2 FTS ) (XEN) ACPI: XSDT 7CB71090, 00A4 (r1 FTSD2939-B1 1072009 AMI 10013) (XEN) ACPI: FACP 7CB7FAA8, 010C (r5 FTSD2939-B1 1072009 AMI 10013) (XEN) ACPI: DSDT 7CB711C8, E8DD (r2 FTSD2939-B1 114 INTL 20091112) (XEN) ACPI: FACS 7D4A7080, 0040 (XEN) ACPI: APIC 7CB7FBB8, 0294 (r3 FTSD2939-B1 1072009 AMI 10013) (XEN) ACPI: FPDT 7CB7FE50, 0044 (r1 FTSD2939-B1 1072009 AMI 10013) (XEN) ACPI: MCFG 7CB7FE98, 003C (r1 FTSOEMMCFG. 
1072009 MSFT 97) (XEN) ACPI: SRAT 7CB7FED8, 0530 (r1 A M I AMI SRAT1 AMI.0) (XEN) ACPI: SLIT 7CB80408, 0030 (r1 A M I AMI SLIT0 AMI.0) (XEN) ACPI: HPET 7CB80438, 0038 (r1 FTSD2939-B1 1072009 AMI.5) (XEN) ACPI: PRAD 7CB80470, 00BE (r2 PRADID PRADTID1 MSFT 301) (XEN) ACPI: SPMI 7CB80530, 0040 (r5 A M I OEMSPMI0 AMI.0) (XEN) ACPI: SSDT 7CB80570, D0CB0 (r2 INTELCpuPm 4000 INTL 20051117) (XEN) ACPI: SPCR 7CC51220, 0050 (r1 A M I APTIO4 1072009 AMI.5) (XEN) ACPI: EINJ 7CC51270, 0130 (r1AMI AMI EINJ0 0) (XEN) ACPI: ERST 7CC513A0, 0230 (r1 AMIER AMI ERST0 0) (XEN) ACPI: HEST 7CC515D0, 00A8 (r1AMI AMI HEST0 0) (XEN) ACPI: BERT 7CC51678, 0030 (r1AMI AMI BERT0 0) (XEN) ACPI: DMAR 7CC516A8, 0110 (r1 A M I OEMDMAR1 INTL1) (XEN) System RAM: 131023MB (134167628kB) (XEN) SRAT: PXM 0 - APIC 0 - Node 0 (XEN) SRAT: PXM 0 - APIC 1 - Node 0 (XEN) SRAT: PXM 0 - APIC 2 - Node 0 (XEN) SRAT: PXM 0 - APIC 3 - Node 0 (XEN) SRAT: PXM 0 - APIC 4 - Node 0 (XEN) SRAT: PXM 0 - APIC 5 - Node 0 (XEN) SRAT: PXM 0 - APIC 6 - Node 0 (XEN) SRAT: PXM 0 - APIC 7 - Node 0 (XEN) SRAT: PXM 0 - APIC 8 - Node 0 (XEN) SRAT: PXM 0 - APIC 9 - Node 0 (XEN) SRAT: PXM 0 - APIC 16 - Node 0 (XEN) SRAT: PXM 0 - APIC 17 - Node 0 (XEN) SRAT: PXM 0 - APIC 18 - Node 0 (XEN) SRAT: PXM 0 - APIC 19 - Node 0 (XEN) SRAT: PXM 0 - APIC 20 - Node 0 (XEN) SRAT: PXM 0 - APIC 21 - Node 0 (XEN) SRAT: PXM 0 - APIC 22 - Node 0 (XEN) SRAT: PXM 0 - APIC 23 - Node 0 (XEN) SRAT: PXM 0 - APIC 24 - Node 0 (XEN) SRAT: PXM 0 - APIC 25 - Node 0 (XEN) SRAT:
Re: [Xen-devel] [PATCH] x86/EFI: keep EFI runtime services top level page tables up-to-date
On 18/05/15 11:22, Jan Beulich wrote: Updates to idle_pg_table[] need to be mirrored into the page tables used for invoking EFI runtime services. Signed-off-by: Jan Beulich jbeul...@suse.com Reviewed-by: Andrew Cooper andrew.coop...@citrix.com --- This in particular is a prereq for the patch at http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg02125.html to be correct. --- a/xen/arch/x86/efi/runtime.h +++ b/xen/arch/x86/efi/runtime.h @@ -2,4 +2,10 @@ #ifndef COMPAT l4_pgentry_t *__read_mostly efi_l4_pgtable; + +void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t l4e) +{ +if ( efi_l4_pgtable ) +l4e_write(efi_l4_pgtable + l4idx, l4e); +} #endif --- a/xen/arch/x86/efi/stub.c +++ b/xen/arch/x86/efi/stub.c @@ -2,6 +2,7 @@ #include xen/errno.h #include xen/init.h #include xen/lib.h +#include asm/page.h #ifndef efi_enabled const bool_t efi_enabled = 0; @@ -9,6 +10,8 @@ const bool_t efi_enabled = 0; void __init efi_init_memory(void) { } +void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t l4e) { } + paddr_t efi_rs_page_table(void) { BUG(); --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -5311,7 +5311,10 @@ static l3_pgentry_t *virt_to_xen_l3e(uns spin_lock(map_pgdir_lock); if ( !(l4e_get_flags(*pl4e) _PAGE_PRESENT) ) { -l4e_write(pl4e, l4e_from_paddr(__pa(pl3e), __PAGE_HYPERVISOR)); +l4_pgentry_t l4e = l4e_from_paddr(__pa(pl3e), __PAGE_HYPERVISOR); + +l4e_write(pl4e, l4e); +efi_update_l4_pgtable(l4_table_offset(v), l4e); pl3e = NULL; } if ( locking ) --- a/xen/include/asm-x86/page.h +++ b/xen/include/asm-x86/page.h @@ -288,6 +288,7 @@ extern l2_pgentry_t l2_identmap[4*L2_PAG extern l1_pgentry_t l1_identmap[L1_PAGETABLE_ENTRIES], l1_fixmap[L1_PAGETABLE_ENTRIES]; void paging_init(void); +void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t); #endif /* !defined(__ASSEMBLY__) */ #define _PAGE_NONE _AC(0x000,U) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
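The idea of the patch, that every write to a top-level slot of the idle page table is repeated into the EFI runtime services table once that table exists, can be sketched in C with plain arrays. This is a simplified model under stated assumptions: entries are bare 64-bit integers, there is no locking, and idle_l4, efi_l4, sketch_efi_update_l4 and sketch_set_l4 are hypothetical stand-ins for idle_pg_table, efi_l4_pgtable, efi_update_l4_pgtable() and the call site in virt_to_xen_l3e().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_L4_ENTRIES 512

typedef uint64_t sketch_l4e_t;

/* Model of the idle top-level table and the optional EFI copy. */
static sketch_l4e_t idle_l4[SKETCH_L4_ENTRIES];
static sketch_l4e_t *efi_l4; /* NULL until the EFI runtime table exists */

/* Mirror one slot into the EFI table, if there is one. */
static void sketch_efi_update_l4(unsigned int idx, sketch_l4e_t e)
{
    assert(idx < SKETCH_L4_ENTRIES);
    if (efi_l4)
        efi_l4[idx] = e;
}

/* Write a slot in the idle table and keep the EFI copy in step. */
static void sketch_set_l4(unsigned int idx, sketch_l4e_t e)
{
    assert(idx < SKETCH_L4_ENTRIES);
    idle_l4[idx] = e;
    sketch_efi_update_l4(idx, e);
}
```

As in the patch, writes made before the EFI table is set up are not replayed by the update helper; in this model they would have to be copied when the EFI table is created.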
Re: [Xen-devel] [libvirt] [PATCH 0/4] xenconfig: fix SPICE parsing/formatting
On 09.05.2015 00:00, Jim Fehlig wrote: This series fixes several bugs related to SPICE parsing and formatting code in xenconfig. The bugs are mostly due to misinterpretation of the Xen documentation, which I failed to notice when reviewing the initial submission. Jim Fehlig (4): xenconfig: use local variable for graphics def xenconfig: format spice listenAddr when formating ports xenconfig: fix spicepasswd handling xenconfig: fix spice mousemode and copypaste src/xenconfig/xen_xl.c | 95 -- tests/xlconfigdata/test-spice-features.cfg | 32 ++ tests/xlconfigdata/test-spice-features.xml | 48 +++ tests/xlconfigdata/test-spice.cfg | 4 +- tests/xlconfigdata/test-spice.xml | 2 + tests/xlconfigtest.c | 1 + 6 files changed, 148 insertions(+), 34 deletions(-) create mode 100644 tests/xlconfigdata/test-spice-features.cfg create mode 100644 tests/xlconfigdata/test-spice-features.xml ACK series Michal ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-unstable test] 56456: regressions - FAIL
On 18.05.15 at 12:17, t...@xen.org wrote: At 09:34 +0100 on 18 May (1431941676), Jan Beulich wrote: On 16.05.15 at 13:45, roger@citrix.com wrote: El 16/05/15 a les 10.51, osstest service user ha escrit: flight 56456 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/56456/ Regressions :-( This is my fault, paging_gva_to_gfn cannot be used to translate a PV guest VA to a GFN. The patch above restores the previous path for PV callers. While Tim would have the final say, I certainly would prefer to revert the offending patch and then apply a correct new version in its stead in this case (where the fix is not a simple, few lines change). I would be OK with a follow-up fix here, but I'm not convinced that this is it. In particular, paging_mode_enabled() should be true for any PV domain that's in log-dirty mode, so presumably the failure is only for lgd ops on VMs that don't have lgd enabled. So maybe we can either: - return an error for that case (but we'd want to understand how we got there first); or - have map_dirty_bitmap() DTRT, with something like access_ok() + a linear-pagetable lookup to find the frame. In any case, this is a regression so we should revert while we figure it out. Done. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4] OSSTEST: introduce a raisin build test
On Mon, 2015-05-18 at 11:08 +0100, George Dunlap wrote: On Wed, May 13, 2015 at 12:48 PM, Stefano Stabellini stefano.stabell...@eu.citrix.com wrote: On Wed, 13 May 2015, Ian Campbell wrote: On Tue, 2015-05-12 at 12:46 +0100, Stefano Stabellini wrote: Would a separate clone of the same raisin version with some sort of dist directory transported over be sufficient and supportable? Or are raisin's outputs not in one place and easily transportable? i.e. today build-$ARCH-libvirt picks up the dist.tar.gz files from the corresponding build-$ARCH, unpacks them and asks libvirt to build against that tree. Moving the dist directory over should work, although I have never tested this configuration. Would you be willing to support this as a requirement going forward? Yeah, I think it is OK I assume that it is not also necessary to reclone all the trees for the preexisting components, just the new ones? Only if the user asks for a components to be built, the corresponding tree is cloned. Won't the problem here be disentangling the stuff installed in dist/ (or whatever it's called) from the things we want to rebuild vs the things we want to change? From the osstest PoV at least the proposal here only involves building additional things, not rebuilding anything which came from a previous build. e.g. given a build of xen.git now do a build of libvirt.git using those previously built Xen libs. But there is still the issue of separating stuff built in Pass-A from the stuff in Pass-B. Raisin could presumably have a concept of two dist dirs, dist.base and dist with the former being r/o. But that sounds to me like the sort of thing you wouldn't want in Raisin. Per component dist dirs is similarly surely possible but perhaps not something raisin wants. 
I.e., ideally if you want to build just xen.git, you want dist/ to contain the output of the previous build of seabios, qemut, qemuu, &c, but *not* the output of previous xen.git builds (or, ideally, the output of previous libvirt, pvgrub, or stubdom builds). Just tarring and untarring dist/ after a full build won't accomplish that. Would it make sense to do some sort of save snapshot functionality that would tar up the dist/ before building a particular component, such that it could be used later? Sort of a stage 2* for raisin. :-) -George * Referring to Gentoo. Not sure the comparison is 100% accurate. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] MAINTAINERS: Remove Frediano Ziglio from HISILICON HIP04 Support
On Fri, 2015-05-15 at 22:09 +0100, Julien Grall wrote: Hi, On 07/05/2015 15:39, Zoltan Kiss wrote: Now the question of course is - is that really what we want to do? I.e. is it known that he no longer wants to be a maintainer of this code (or cannot be)? The mail address having become stale doesn't necessarily mean that. Does one of you know where he went? Should we - in the absence of a valid mail address - keep his name here without mail address (I admit this is of limited use)? He left for another company, to work on quite different things I guess, and without a board it would be quite hard for him to do this job. But I've added his private email address, so he can speak out for himself. Ping? Unless someone objects I'm going to pick this up next time I do a pass over my queue committing things, which will likely be either later today or tomorrow. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] tools: Fix wild memory allocations from c/s 250f0b4 and 85d78b4
These changesets cause the respective libxc functions to unconditionally dereference their max_cpus/nodes parameters as part of initial memory allocations. It will fail at obtaining the correct number of cpus/nodes from Xen, as the guest handles will not be NULL. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Boris Ostrovsky boris.ostrov...@oracle.com --- Spotted by XenServers Coverity run. --- tools/libxl/libxl.c |4 ++-- tools/misc/xenpm.c|4 ++-- tools/python/xen/lowlevel/xc/xc.c |4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index a6eb2df..295877b 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -5105,7 +5105,7 @@ libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out) xc_cputopo_t *cputopo; libxl_cputopology *ret = NULL; int i; -unsigned num_cpus; +unsigned num_cpus = 0; /* Setting buffer to NULL makes the call return number of CPUs */ if (xc_cputopoinfo(ctx-xch, num_cpus, NULL)) @@ -5191,7 +5191,7 @@ libxl_numainfo *libxl_get_numainfo(libxl_ctx *ctx, int *nr) uint32_t *distance; libxl_numainfo *ret = NULL; int i, j; -unsigned num_nodes; +unsigned num_nodes = 0; if (xc_numainfo(ctx-xch, num_nodes, NULL, NULL)) { LOGE(ERROR, Unable to determine number of nodes); diff --git a/tools/misc/xenpm.c b/tools/misc/xenpm.c index fe2c001..2f9bd8e 100644 --- a/tools/misc/xenpm.c +++ b/tools/misc/xenpm.c @@ -356,7 +356,7 @@ static void signal_int_handler(int signo) struct timeval tv; int cx_cap = 0, px_cap = 0; xc_cputopo_t *cputopo = NULL; -unsigned max_cpus; +unsigned max_cpus = 0; if ( xc_cputopoinfo(xc_handle, max_cpus, NULL) != 0 ) { @@ -961,7 +961,7 @@ void scaling_governor_func(int argc, char *argv[]) void cpu_topology_func(int argc, char *argv[]) { xc_cputopo_t *cputopo = NULL; -unsigned max_cpus; +unsigned max_cpus = 0; int i, rc; if (
xc_cputopoinfo(xc_handle, max_cpus, NULL) != 0 ) diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c index fbd93db..c77e15b 100644 --- a/tools/python/xen/lowlevel/xc/xc.c +++ b/tools/python/xen/lowlevel/xc/xc.c @@ -1221,7 +1221,7 @@ static PyObject *pyxc_getcpuinfo(XcObject *self, PyObject *args, PyObject *kwds) static PyObject *pyxc_topologyinfo(XcObject *self) { xc_cputopo_t *cputopo = NULL; -unsigned i, num_cpus; +unsigned i, num_cpus = 0; PyObject *ret_obj = NULL; PyObject *cpu_to_core_obj, *cpu_to_socket_obj, *cpu_to_node_obj; @@ -1293,7 +1293,7 @@ static PyObject *pyxc_topologyinfo(XcObject *self) static PyObject *pyxc_numainfo(XcObject *self) { -unsigned i, j, num_nodes; +unsigned i, j, num_nodes = 0; uint64_t free_heap; PyObject *ret_obj = NULL, *node_to_node_dist_list_obj; PyObject *node_to_memsize_obj, *node_to_memfree_obj; -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
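To make the failure mode in the patch above concrete, here is a minimal sketch of a "query the count, then fetch" call whose count parameter is read unconditionally. It is a toy model, not libxc code: toy_cputopoinfo(), toy_get_topology() and TOY_NR_CPUS are hypothetical names, and the behaviour is only an assumption based on the description in the commit message.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_NR_CPUS 4u   /* stand-in for the count the hypervisor reports */

/*
 * Hypothetical "query or fill" call. *max is both an input (capacity) and
 * an output (count), and it is read even when out is NULL.
 */
static int toy_cputopoinfo(unsigned *max, int *out)
{
    int *bounce = NULL;
    unsigned i, n;

    if (*max != 0) {
        bounce = calloc(*max, sizeof(*bounce));
        if (bounce == NULL)
            return -1;            /* a garbage *max can fail right here */
    }

    if (bounce == NULL) {         /* NULL handle: plain "how many?" query */
        *max = TOY_NR_CPUS;
        return 0;
    }

    /* Non-NULL handle: fill at most *max entries, report how many. */
    n = *max < TOY_NR_CPUS ? *max : TOY_NR_CPUS;
    for (i = 0; i < n; i++) {
        bounce[i] = (int)i;       /* toy payload: CPU i sits on core i */
        if (out != NULL)
            out[i] = bounce[i];
    }
    *max = n;
    free(bounce);
    return 0;
}

/* Caller following the fixed pattern: initialise the count to zero. */
static int *toy_get_topology(unsigned *nr_out)
{
    unsigned nr = 0;              /* the fix: start from a defined value */
    int *topo;

    if (toy_cputopoinfo(&nr, NULL) != 0)
        return NULL;

    topo = calloc(nr, sizeof(*topo));
    if (topo == NULL)
        return NULL;

    if (toy_cputopoinfo(&nr, topo) != 0) {
        free(topo);
        return NULL;
    }
    *nr_out = nr;
    return topo;
}
```

In this model, a caller that passes a stale non-zero count with a NULL buffer gets back a clamped value instead of the real CPU count, which mirrors the "will fail at obtaining the correct number" wording above.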
Re: [Xen-devel] [PATCH v4] OSSTEST: introduce a raisin build test
Ian Campbell writes ("Re: [Xen-devel] [PATCH v4] OSSTEST: introduce a raisin build test"): On Mon, 2015-05-18 at 14:05 +0100, George Dunlap wrote: That solves the most general case; but it sounds like you care mostly about the very specific case of dealing with components that depend on the current output of xen.git. Starting simple may be fine. Currently we only have ts-*-build things which depend on the output of ts-xen-build (in fact, we only have ts-libvirt-build).

That's not true. We have:

  job                       step                script
  build-amd64               xen-build           ts-xen-build
  build-amd64-rumpuserxen   rumpuserxen-build   ts-rumpuserxen-build
  build-amd64-rumpuserxen   xen-build           ts-xen-build

where each of the lines in my table, above, uses output from the previous line.

Ian.
Re: [Xen-devel] [PATCH V3 1/6] libxl: export some functions for pvusb use
On 04/20/2015 05:25 PM, Olaf Hering wrote: On Sun, Apr 19, Chunyan Liu wrote:

+++ b/tools/libxl/libxl_internal.h
+_hidden int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device);
+_hidden int libxl__resolve_domid(libxl__gc *gc, const char *name,
+                                 uint32_t *domid);
+/* generic callback for devices that only need to set ao_complete */
+_hidden void device_addrm_aocomplete(libxl__egc *egc, libxl__ao_device *aodev);

If that goes in I may move some or all of the vscsi code in libxl.c into libxl_vscsi.c.

It sounds like this would be useful independent of the pvusb stuff.

-George
[Xen-devel] [linux-3.4 test] 56631: regressions - FAIL
flight 56631 linux-3.4 real [real] http://logs.test-lab.xenproject.org/osstest/logs/56631/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl 9 debian-install fail REGR. vs. 52209-bisect test-amd64-amd64-pair 15 debian-install/dst_host fail REGR. vs. 52715-bisect test-amd64-i386-pair15 debian-install/dst_host fail REGR. vs. 56366-bisect Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-sedf 9 debian-installfail blocked in 56366-bisect test-amd64-amd64-libvirt 9 debian-installfail blocked in 56366-bisect test-amd64-amd64-xl-multivcpu 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-libvirt-xsm 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-qemut-rhel6hvm-intel 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-libvirt 9 debian-installfail blocked in 56366-bisect test-amd64-i386-freebsd10-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-xl-sedf-pin 9 debian-installfail blocked in 56366-bisect test-amd64-i386-xl-qemuu-winxpsp3 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-freebsd10-i386 13 guest-localmigrate fail blocked in 56366-bisect test-amd64-i386-xl9 debian-installfail blocked in 56366-bisect test-amd64-i386-xl-qemut-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-xl-qemuu-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl-qemuu-ovmf-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail blocked in 56366-bisect test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail blocked in 56366-bisect test-amd64-i386-rumpuserxen-i386 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail blocked in 56366-bisect test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl-qemut-winxpsp3-vcpus1 6 xen-boot fail blocked in 56366-bisect 
test-amd64-amd64-xl-qemut-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail never pass test-amd64-amd64-xl-credit2 9 debian-install fail never pass test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 9 debian-hvm-install fail never pass test-amd64-amd64-xl-pvh-amd 9 debian-install fail never pass test-amd64-i386-xl-xsm9 debian-install fail never pass test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail never pass test-amd64-amd64-xl-pvh-intel 9 debian-install fail never pass test-amd64-amd64-libvirt-xsm 9 debian-install fail never pass test-amd64-amd64-xl-xsm 9 debian-install fail never pass test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 9 debian-hvm-install fail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass version targeted for testing: linux56b48fcda5076d4070ab00df32ff5ff834e0be86 baseline version: linuxbb4a05a0400ed6d2f1e13d1f82f289ff74300a70 370 people touched revisions under test, not listing them all jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl fail test-amd64-i386-xl fail test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmfail test-amd64-i386-xl-qemut-debianhvm-amd64-xsm fail test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmfail test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm fail test-amd64-amd64-libvirt-xsm fail test-amd64-i386-libvirt-xsm fail test-amd64-amd64-xl-xsm fail test-amd64-i386-xl-xsm fail test-amd64-amd64-xl-pvh-amd
Re: [Xen-devel] [PATCH 5/5] xen: Write CR0, CR3 and CR4 in arch_set_info_guest()
On 05/18/2015 11:05 AM, Jan Beulich wrote: On 18.05.15 at 09:58, rcojoc...@bitdefender.com wrote: On 05/18/2015 10:27 AM, Jan Beulich wrote: On 15.05.15 at 22:45, rcojoc...@bitdefender.com wrote: On 05/15/2015 06:57 PM, Jan Beulich wrote: On 06.05.15 at 19:12, rcojoc...@bitdefender.com wrote: Arch_set_info_guest() doesn't set CR0, CR3 or CR4. Added code that does that. But you should also say a word on why this is needed, since things worked fine so far without, and enabling the functions to run outside of their own vCPU context is not immediately obviously correct. This is a way to undo malicious CR writes. This is achieved for MSR writes with the deny vm_event response flag patch in this series, but the CR events are being send after the actual write. In such cases, while the VCPU is paused before I put a vm_response in the ring, I can simply write the old value back. I've brought up the issue in the past, and the consensus, IIRC, was that I should not alter existing behaviour (post-write events) - so the alternatives were either to add a new pre-write CR event (which seemed messy), or this (which seemed less intrusive). Of course, if it has now become acceptable to reconsider having the CR vm_events consistently pre-write, the deny patch could be extended to them. Considering that in the reply to Andrew's response you already pointed at where a suggestion towards consistent pre events was made, it would have helped if you identified where this was advised against before. It certainly seems sensible to treat MSR and CR writes in a consistent fashion. Sorry for the omission. This is the original reply: http://lists.xen.org/archives/html/xen-devel/2014-10/msg00240.html where you've pointed out that we should not break existing behaviour. As said (mainly by Andrew) numerous times recently, breaking of existing behavior is - with all the incompatible changes already done to the interface after 4.5 - not currently an issue. 2. 
Change the control register events to pre-write and prevent post-write events in the future.

This one, as long as amongst the ones that care for the events agreement can be reached that this is the way to go.

Ack, unless otherwise requested I'll add this modification to the next iteration of "xen/vm_event: Clean up control-register-write vm_events".

Thanks,
Razvan
Re: [Xen-devel] [PATCH 0/6] x86: reduce paravirtualized spinlock overhead
On 05/17/2015 07:30 AM, Ingo Molnar wrote: * Juergen Gross jgr...@suse.com wrote: On 05/05/2015 07:21 PM, Jeremy Fitzhardinge wrote: On 05/03/2015 10:55 PM, Juergen Gross wrote: I did a small measurement of the pure locking functions on bare metal without and with my patches. spin_lock() for the first time (lock and code not in cache) dropped from about 600 to 500 cycles. spin_unlock() for first time dropped from 145 to 87 cycles. spin_lock() in a loop dropped from 48 to 45 cycles. spin_unlock() in the same loop dropped from 24 to 22 cycles. Did you isolate icache hot/cold from dcache hot/cold? It seems to me the main difference will be whether the branch predictor is warmed up rather than if the lock itself is in dcache, but its much more likely that the lock code is icache if the code is lock intensive, making the cold case moot. But that's pure speculation. Could you see any differences in workloads beyond microbenchmarks? Not that its my call at all, but I think we'd need to see some concrete improvements in real workloads before adding the complexity of more pvops. I did another test on a larger machine: 25 kernel builds (time make -j 32) on a 32 core machine. Before each build make clean was called, the first result after boot was omitted to avoid disk cache warmup effects. System time without my patches: 861.5664 +/- 3.3665 s with my patches: 852.2269 +/- 3.6629 s So how does the profile look like in the guest, before/after the PV spinlock patches? I'm a bit surprised to see so much spinlock overhead. I did another test in Xen dom0: System time without my patches: 2903 +/- 2 s with my patches: 2904 +/- 2 s BTW, this was what I expected: There should be no significant change in system time, as the only real difference between both variants in a guest is an additional 2-byte nop in the inlined unlock function call, another one in the lock call and one jmp instruction less in the lock call. 
What I didn't expect was the huge performance difference between native and guest. The used configuration (32 cores with hyperthreads enabled) surely is one reason for the difference, but still this seems to be too much. I double checked the results on bare metal, they are still more or less the same (did only one kernel build resulting in 862 seconds system time). There seems to be a lot of room for improvement, but this is another story. Regarding spinlock overhead: I think the reason I saw about 1% less system time with my patches was mainly due to less cache misses. Inlining of the unlock function avoided an additional instruction cache miss for the unlock function. KT Raghavendra did some benchmarks with only small user programs and high kernel load which showed nearly no effect at all. Additionally I've compared the two kernels using bloat-o-meter: add/remove: 11/13 grow/shrink: 654/603 up/down: 6046/-31754 (-25708) with some hot path functions going down in size quite nice, e.g.: __raw_spin_unlock_irq336 90-246 Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 00/41] Add ACPI support for arm64 on Xen
On 18 May 2015 at 13:55, Jan Beulich jbeul...@suse.com wrote: On 17.05.15 at 22:03, parth.di...@linaro.org wrote: Naresh Bhat (3): arm/acpi: Build numa for x86 only arm/acpi : Print GIC information when MADT is parsed xen: arm64: Add ACPI support Parth Dixit (38): arm/acpi: Build pmstat for x86 only arm/acpi : emulate io ports for arm arm/acpi : add arm specific acpi header file acpi : add helper function for mapping memory arm/acpi : Add basic ACPI initialization arm/acpi : Introduce ARM Boot Architecture Flags in FADT arm/acpi : Parse FADT table and get PSCI flags arm/acpi : Add Generic Interrupt and Distributor struct arm/acpi : add GTDT support updated by ACPI 5.1 arm : move dt specific code in smp to seperate functions arm/acpi : parse MADT to map logical cpu to MPIDR and get cpu_possible_map arm : acpi add helper function for setting interrupt type arm : acpi parse GTDT to initialize timer acpi : Introduce acpi_parse_entries arm : refactor gic into generic and dt specific parts arm: Introduce a generic way to use a device from acpi arm : acpi Add GIC specific ACPI boot support arm : create generic uart initialization function arm : acpi Initialize serial port from ACPI SPCR table arm : acpi create min DT stub for DOM0 arm : acpi create chosen node for DOM0 arm : acpi create efi node for DOM0 arm : acpi add status override table arm : acpi add xen environment table arm : add helper functions to map memory regions arm : acpi add efi structures to common efi header arm : acpi read acpi memory info from uefi arm : acpi add placeholder for acpi load address arm : acpi estimate memory required for acpi/efi tables arm : acpi dynamically map mmio regions arm : acpi prepare acpi tables for dom0 arm : acpi create and map acpi tables arm : acpi add helper function to calculate crc32 arm : acpi pass rsdp and memory via efi table arm : acpi add acpi parameter to enable/disable acpi arm : acpi enable efi for acpi arm : acpi configure interrupts dynamically arm : acpi route 
irq's at time of boot

Please trim your Cc list on the individual patches, i.e. avoid Cc-ing every maintainer involved in some of the patches on every one of them. In the case here, only 16 of the 41 patches really need me looking at them.

Jan

Sure, sorry for the trouble, I'll take care in future.

Regards,
Parth
Re: [Xen-devel] [PATCH v5 1/3] xen/pvh: use a custom IO bitmap for PVH hardware domains
From: Roger Pau Monne [mailto:roger@citrix.com] Sent: Thursday, May 07, 2015 10:54 PM Since a PVH hardware domain has access to the physical hardware create a custom more permissive IO bitmap. The permissions set on the bitmap are populated based on the contents of the ioports rangeset. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com Cc: Boris Ostrovsky boris.ostrov...@oracle.com Cc: Suravee Suthikulpanit suravee.suthikulpa...@amd.com Cc: Aravind Gopalakrishnan aravind.gopalakrish...@amd.com Cc: Jun Nakajima jun.nakaj...@intel.com Cc: Eddie Dong eddie.d...@intel.com Cc: Kevin Tian kevin.t...@intel.com Acked-by: Kevin Tian kevin.t...@intel.com Thanks Kevin --- Changes since v4: - Split changes also affecting PV to a separate patch. - Use int with __clear_bit. - Drop pointless cast in vmcb.c. - Make HVM_IOBITMAP_SIZE contain the size of the io bitmap pages in bytes. - Make setup_io_bitmap a hardware domain specific function, and allow it to work with late hw domain init. Changes since v3: - Add the RTC IO ports to the list of blocked ports. - Remove admin_io_okay since it's just a wrapper around ioports_access_permitted. Changes since v2: - Add 0xcf8-0xcfb to the range of blocked (trapped) IO ports. - Use rangeset_report_ranges in order to iterate over the range of not trapped IO ports. - Allocate the Dom0 PVH IO bitmap with _xmalloc_array, which allows setting the alignment to PAGE_SIZE. - Tested with Linux PV/PVH using 3.18 and FreeBSD PVH HEAD. Changes since v1: - Dynamically allocate PVH Dom0 IO bitmap if needed. - Drop cast from construct_vmcs when writing the IO bitmap. - Create a new function that sets up the bitmap before launching Dom0. This is needed because ns16550_endboot is called after construct_dom0. 
--- xen/arch/x86/hvm/hvm.c | 24 ++-- xen/arch/x86/hvm/svm/vmcb.c | 2 +- xen/arch/x86/hvm/vmx/vmcs.c | 5 +++-- xen/arch/x86/setup.c | 29 + xen/common/domain.c | 2 ++ xen/include/asm-x86/hvm/domain.h | 2 ++ xen/include/asm-x86/setup.h | 1 + 7 files changed, 60 insertions(+), 5 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 3a09439..ea052c4 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -77,9 +77,13 @@ integer_param(hvm_debug, opt_hvm_debug_level); struct hvm_function_table hvm_funcs __read_mostly; -/* I/O permission bitmap is globally shared by all HVM guests. */ +#define HVM_IOBITMAP_SIZE (3*PAGE_SIZE) +/* + * The I/O permission bitmap is globally shared by all HVM guests except + * the hardware domain that has a more permissive IO bitmap. + */ unsigned long __attribute__ ((__section__ (.bss.page_aligned))) -hvm_io_bitmap[3*PAGE_SIZE/BYTES_PER_LONG]; +hvm_io_bitmap[HVM_IOBITMAP_SIZE/BYTES_PER_LONG]; /* Xen command-line option to enable HAP */ static bool_t __initdata opt_hap_enabled = 1; @@ -1461,6 +1465,20 @@ int hvm_domain_initialise(struct domain *d) goto fail1; d-arch.hvm_domain.io_handler-num_slot = 0; +/* Set the default IO Bitmap */ +if ( is_hardware_domain(d) ) +{ +d-arch.hvm_domain.io_bitmap = _xmalloc(HVM_IOBITMAP_SIZE, PAGE_SIZE); +if ( d-arch.hvm_domain.io_bitmap == NULL ) +{ +rc = -ENOMEM; +goto fail1; +} +memset(d-arch.hvm_domain.io_bitmap, ~0, HVM_IOBITMAP_SIZE); +} +else +d-arch.hvm_domain.io_bitmap = hvm_io_bitmap; + if ( is_pvh_domain(d) ) { register_portio_handler(d, 0, 0x10003, handle_pvh_io); @@ -1496,6 +1514,8 @@ int hvm_domain_initialise(struct domain *d) stdvga_deinit(d); vioapic_deinit(d); fail1: +if ( is_hardware_domain(d) ) +xfree(d-arch.hvm_domain.io_bitmap); xfree(d-arch.hvm_domain.io_handler); xfree(d-arch.hvm_domain.params); fail0: diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c index 21292bb..10afd44 100644 --- a/xen/arch/x86/hvm/svm/vmcb.c +++ 
b/xen/arch/x86/hvm/svm/vmcb.c @@ -118,7 +118,7 @@ static int construct_vmcb(struct vcpu *v) svm_disable_intercept_for_msr(v, MSR_AMD64_LWP_CBADDR); vmcb-_msrpm_base_pa = (u64)virt_to_maddr(arch_svm-msrpm); -vmcb-_iopm_base_pa = (u64)virt_to_maddr(hvm_io_bitmap); +vmcb-_iopm_base_pa = virt_to_maddr(v-domain-arch.hvm_domain.io_bitmap); /* Virtualise EFLAGS.IF and LAPIC TPR (CR8). */ vmcb-_vintr.fields.intr_masking = 1; diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c index 3123706..f62aa90 100644 --- a/xen/arch/x86/hvm/vmx/vmcs.c +++ b/xen/arch/x86/hvm/vmx/vmcs.c @@ -1032,8 +1032,9 @@ static int construct_vmcs(struct vcpu *v) } /* I/O access bitmap. */
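As a rough illustration of the bitmap handling this patch describes (start with every port trapped, then open up the permitted ranges), here is a small standalone sketch. It is not the Xen code: the toy_* names are hypothetical, the bitmap covers only the 65536 x86 I/O ports rather than the 3-page HVM_IOBITMAP_SIZE, and the range passed in stands in for one entry of the ioports rangeset.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define TOY_NR_PORTS   65536u
#define TOY_BITMAP_LEN (TOY_NR_PORTS / CHAR_BIT)

/* Default: every bit set, i.e. every port access is trapped. */
static void toy_iobitmap_init(unsigned char *bm)
{
    memset(bm, ~0, TOY_BITMAP_LEN);
}

/* Clear the bits for [first, last] so accesses to them are not trapped. */
static void toy_iobitmap_permit(unsigned char *bm, unsigned first,
                                unsigned last)
{
    unsigned port;

    for (port = first; port <= last && port < TOY_NR_PORTS; port++)
        bm[port / CHAR_BIT] &= (unsigned char)~(1u << (port % CHAR_BIT));
}

/* Returns 1 if an access to the port would be trapped, 0 otherwise. */
static int toy_iobitmap_traps(const unsigned char *bm, unsigned port)
{
    return (bm[port / CHAR_BIT] >> (port % CHAR_BIT)) & 1;
}
```

Ports that must stay blocked, such as 0xcf8-0xcfb from the v2 changelog, are simply never passed to toy_iobitmap_permit() in this model.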
[Xen-devel] [xen-unstable test] 56630: regressions - FAIL
flight 56630 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/56630/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-multivcpu 14 guest-localmigrate fail REGR. vs. 56375 test-amd64-amd64-xl 14 guest-localmigratefail REGR. vs. 56375 test-amd64-amd64-xl-credit2 14 guest-localmigratefail REGR. vs. 56375 test-amd64-amd64-xl-qemut-win7-amd64 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-i386-xl 14 guest-localmigratefail REGR. vs. 56375 test-amd64-i386-xl-qemut-win7-amd64 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-i386-pair 21 guest-migrate/src_host/dst_host fail REGR. vs. 56375 test-amd64-amd64-xl-qemut-debianhvm-amd64 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-amd64-pair 21 guest-migrate/src_host/dst_host fail REGR. vs. 56375 test-amd64-i386-xl-qemut-debianhvm-amd64 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-amd64-xl-qemut-winxpsp3 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 12 guest-localmigrate fail REGR. vs. 56375 test-armhf-armhf-xl-multivcpu 17 leak-check/check fail REGR. vs. 56375 test-amd64-i386-xl-qemut-winxpsp3 12 guest-localmigrate fail REGR. vs. 56375 Tests which are failing intermittently (not blocking): test-amd64-i386-rumpuserxen-i386 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail pass in 56576 Regressions which are regarded as allowable (not blocking): test-amd64-i386-freebsd10-amd64 13 guest-localmigrate fail REGR. vs. 56375 test-amd64-amd64-xl-sedf 14 guest-localmigratefail REGR. vs. 56375 test-amd64-amd64-libvirt 11 guest-start fail REGR. vs. 56375 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-i386-xl-qemuu-winxpsp3 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-amd64-xl-sedf-pin 14 guest-localmigratefail REGR. vs. 56375 test-amd64-amd64-xl-qemuu-debianhvm-amd64 12 guest-localmigrate fail REGR. vs. 
56375 test-amd64-i386-xl-qemuu-win7-amd64 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-amd64-xl-qemuu-winxpsp3 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-i386-xl-qemuu-debianhvm-amd64 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-amd64-xl-qemuu-ovmf-amd64 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-i386-xl-qemuu-ovmf-amd64 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-amd64-xl-qemuu-win7-amd64 12 guest-localmigrate fail REGR. vs. 56375 test-amd64-i386-freebsd10-i386 13 guest-localmigrate fail like 56375 test-amd64-i386-libvirt 11 guest-start fail like 56375 test-armhf-armhf-libvirt 11 guest-start fail like 56375 Tests which did not succeed, but are not blocking: test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 9 debian-hvm-install fail never pass test-amd64-amd64-xl-xsm 11 guest-start fail never pass test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 9 debian-hvm-install fail never pass test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail never pass test-amd64-amd64-libvirt-xsm 11 guest-start fail never pass test-amd64-i386-xl-xsm 11 guest-start fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-i386-libvirt-xsm 11 guest-start fail never pass test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail never pass test-armhf-armhf-libvirt-xsm 6 xen-boot fail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-armhf-armhf-xl-xsm 6 xen-boot fail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-sedf-pin 12 migrate-support-checkfail never pass test-armhf-armhf-xl-sedf 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass version targeted for 
testing: xen e4ad2836842ac114e7791963d56ebd02dd4c384f baseline version: xen e13013dbf1d5997915548a3b5f1c39594d8c1d7b People who touched revisions under test: David Vrabel david.vra...@citrix.com George Dunlap george.dun...@eu.citrix.com Ian Campbell ian.campb...@citrix.com Jan Beulich jbeul...@suse.com Julien Grall julien.gr...@citrix.com (ARM) Konrad Rzeszutek Wilk konrad.w...@oracle.com Roger Pau Monné
[Xen-devel] [PATCH Remus v7 1/3] libxc/save: refactor of send_domain_memory_live()
Split the send_domain_memory_live() into three helper function: - send_memory_live() do the actually live send - suspend_and_send_dirty() suspend the guest and send dirty pages - send_memory_verify() The motivation of this is that when we send checkpointed stream, we will skip the actually live part. Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Andrew Cooper andrew.coop...@citrix.com Reviewed-by: Andrew Cooper andrew.coop...@citrix.com --- tools/libxc/xc_sr_save.c | 137 +-- 1 file changed, 98 insertions(+), 39 deletions(-) diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c index 1d0a46d..c08a49e 100644 --- a/tools/libxc/xc_sr_save.c +++ b/tools/libxc/xc_sr_save.c @@ -455,21 +455,15 @@ static int update_progress_string(struct xc_sr_context *ctx, } /* - * Send all domain memory. This is the heart of the live migration loop. + * Send memory while guest is running. 
*/ -static int send_domain_memory_live(struct xc_sr_context *ctx) +static int send_memory_live(struct xc_sr_context *ctx) { xc_interface *xch = ctx-xch; xc_shadow_op_stats_t stats = { 0, ctx-save.p2m_size }; char *progress_str = NULL; unsigned x; int rc = -1; -DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap, -ctx-save.dirty_bitmap_hbuf); - -rc = enable_logdirty(ctx); -if ( rc ) -goto out; rc = update_progress_string(ctx, progress_str, 0); if ( rc ) @@ -485,7 +479,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx) { if ( xc_shadow_control( xch, ctx-domid, XEN_DOMCTL_SHADOW_OP_CLEAN, - HYPERCALL_BUFFER(dirty_bitmap), ctx-save.p2m_size, + ctx-save.dirty_bitmap_hbuf, ctx-save.p2m_size, NULL, 0, stats) != ctx-save.p2m_size ) { PERROR(Failed to retrieve logdirty bitmap); @@ -505,6 +499,26 @@ static int send_domain_memory_live(struct xc_sr_context *ctx) goto out; } + out: +xc_set_progress_prefix(xch, NULL); +free(progress_str); +return rc; +} + +/* + * Suspend the domain and send dirty memory. + * This is the last iteration of the live migration and the + * heart of the checkpointed stream. 
+ */ +static int suspend_and_send_dirty(struct xc_sr_context *ctx) +{ +xc_interface *xch = ctx-xch; +xc_shadow_op_stats_t stats = { 0, ctx-save.p2m_size }; +char *progress_str = NULL; +int rc = -1; +DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap, +ctx-save.dirty_bitmap_hbuf); + rc = suspend_domain(ctx); if ( rc ) goto out; @@ -519,9 +533,15 @@ static int send_domain_memory_live(struct xc_sr_context *ctx) goto out; } -rc = update_progress_string(ctx, progress_str, ctx-save.max_iterations); -if ( rc ) -goto out; +if ( ctx-save.live ) +{ +rc = update_progress_string(ctx, progress_str, +ctx-save.max_iterations); +if ( rc ) +goto out; +} +else +xc_set_progress_prefix(xch, Checkpointed save); bitmap_or(dirty_bitmap, ctx-save.deferred_pages, ctx-save.p2m_size); @@ -529,42 +549,81 @@ static int send_domain_memory_live(struct xc_sr_context *ctx) if ( rc ) goto out; -if ( ctx-save.debug ) +bitmap_clear(ctx-save.deferred_pages, ctx-save.p2m_size); +ctx-save.nr_deferred_pages = 0; + + out: +xc_set_progress_prefix(xch, NULL); +free(progress_str); +return rc; +} + +static int send_memory_verify(struct xc_sr_context *ctx) +{ +xc_interface *xch = ctx-xch; +xc_shadow_op_stats_t stats = { 0, ctx-save.p2m_size }; +int rc = -1; +struct xc_sr_record rec = { -struct xc_sr_record rec = -{ -.type = REC_TYPE_VERIFY, -.length = 0, -}; +.type = REC_TYPE_VERIFY, +.length = 0, +}; -DPRINTF(Enabling verify mode); +DPRINTF(Enabling verify mode); -rc = write_record(ctx, rec); -if ( rc ) -goto out; +rc = write_record(ctx, rec); +if ( rc ) +goto out; -xc_set_progress_prefix(xch, Memory verify); -rc = send_all_pages(ctx); -if ( rc ) -goto out; +xc_set_progress_prefix(xch, Memory verify); +rc = send_all_pages(ctx); +if ( rc ) +goto out; -if ( xc_shadow_control( - xch, ctx-domid, XEN_DOMCTL_SHADOW_OP_PEEK, - HYPERCALL_BUFFER(dirty_bitmap), ctx-save.p2m_size, - NULL, 0, stats) != ctx-save.p2m_size ) -{ -PERROR(Failed to retrieve logdirty bitmap); -rc = -1; -goto out; -} +if (
Re: [Xen-devel] [PATCH 5/5] xen: Write CR0, CR3 and CR4 in arch_set_info_guest()
On 05/17/2015 09:32 PM, Tamas K Lengyel wrote:

I took the suggestion to mean at the time that we should have something like EVENT_CR3_PRE and EVENT_CR3_POST, where basically all we needed was for all events for which this is applicable to be pre-write events. IMHO that's simpler and sufficient: just send out an event when you know that, unless you deny it, simply responding to it / unblocking the VCPU will perform the write. So you know that the write is about to happen, and by not denying it, that it will happen (or at least, that it's extremely likely to happen - since some HV check can still fail somewhere). So, again, just IMHO, the simpler modification would just be to turn all events where this is applicable into pre-write events.

Isn't the event from the guest's perspective guaranteed to be pre-write already? IMHO there is not much point in having two distinct

The CR events are not pre-write:

3289 int hvm_set_cr3(unsigned long value)
3290 {
3291     struct vcpu *v = current;
3292     struct page_info *page;
3293     unsigned long old;
3294
3295     if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) &&
3296          (value != v->arch.hvm_vcpu.guest_cr[3]) )
3297     {
3298         /* Shadow-mode CR3 change. Check PDBR and update refcounts. */
3299         HVM_DBG_LOG(DBG_LEVEL_VMMU, "CR3 value = %lx", value);
3300         page = get_page_from_gfn(v->domain, value >> PAGE_SHIFT,
3301                                  NULL, P2M_ALLOC);
3302         if ( !page )
3303             goto bad_cr3;
3304
3305         put_page(pagetable_get_page(v->arch.guest_table));
3306         v->arch.guest_table = pagetable_from_page(page);
3307
3308         HVM_DBG_LOG(DBG_LEVEL_VMMU, "Update CR3 value = %lx", value);
3309     }
3310
3311     old=v->arch.hvm_vcpu.guest_cr[3];
3312     v->arch.hvm_vcpu.guest_cr[3] = value;
3313     paging_update_cr3(v);
3314     hvm_event_cr(VM_EVENT_X86_CR3, value, old);
3315     return X86EMUL_OKAY;
3316
3317  bad_cr3:
3318     gdprintk(XENLOG_ERR, "Invalid CR3\n");
3319     domain_crash(v->domain);
3320     return X86EMUL_UNHANDLEABLE;
3321 }

The line numbers assume my patch has been applied, but the only relevant change here is that hvm_event_cr(VM_EVENT_X86_CR3, value, old); replaced the old hvm_event_cr3(value, old); at line 3314. As you can see, first CR3 is being updated, and the event is being sent afterwards. This applies to CR4, etc. also.

event types (PRE/POST). I'm not particularly happy with the deny change flag idea either. Why not let the user specify what value he wants to set the register to in such a case? It could be the one that was already there (old_value) in deny change style, but it might as well be something else. I'm thinking we could keep the existing vm_event hooks in place and simply let the vm_event response specify the value the register should be set to (in the new_value field).

You mean not have a distinct DENY vm_event response flag, but if rsp's new_value != req's new value set that one? Sure, that'll work, but it's less explicit, and thus, IMHO, more error-prone (it's easy for a vm_event consumer to just create the response on the stack, forget (or not know) that this might happen, and have the guest just write garbage to some register).

Thanks,
Razvan
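The difference between the post-write ordering shown in hvm_set_cr3() above and the pre-write ordering under discussion can be sketched with a toy model. Nothing here is Xen code: struct toy_vcpu, the toy_cr_monitor callback and both setters are hypothetical, and a boolean return value stands in for a DENY-style vm_event response.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_vcpu {
    unsigned long cr3;
};

/* Monitor callback: return true to allow the write, false to deny it. */
typedef bool (*toy_cr_monitor)(unsigned long new_val, unsigned long old_val);

/* Post-write: the register has already changed when the monitor runs. */
static void toy_set_cr3_post(struct toy_vcpu *v, unsigned long value,
                             toy_cr_monitor mon)
{
    unsigned long old = v->cr3;

    v->cr3 = value;
    if (mon != NULL)
        (void)mon(value, old);    /* the verdict comes too late to matter */
}

/* Pre-write: the monitor is asked first and may veto the write. */
static bool toy_set_cr3_pre(struct toy_vcpu *v, unsigned long value,
                            toy_cr_monitor mon)
{
    unsigned long old = v->cr3;

    if (mon != NULL && !mon(value, old))
        return false;             /* denied: register left untouched */

    v->cr3 = value;
    return true;
}

/* Example monitor that denies every write. */
static bool toy_deny_all(unsigned long new_val, unsigned long old_val)
{
    (void)new_val;
    (void)old_val;
    return false;
}
```

With toy_deny_all installed, the post-write setter still changes the register, so a monitor would have to write the old value back afterwards, while the pre-write setter leaves it alone.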
[Xen-devel] xen scheduler
Dear Developers,

As per my knowledge, the Credit scheduler sorts its queue of VCPUs by priority, based on credit value. It follows the FCFS technique for equal priority. Applying SJF for equal priority would help to reduce the waiting time spent in the queue, basically for the UNDER priority (credits > 0) VCPUs. Obviously the situation is rare, but it will make sense when a large number of VMs are active.

If anybody is working on this, I would welcome his/her comments on this idea.

Thanks and regards,
Rajendra
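The idea in the message above can be sketched as a comparison function: UNDER before OVER, shortest estimated job first among UNDER vCPUs, and FCFS otherwise. This is only an illustration, not the Credit scheduler's code. The struct, the burst_est field (some estimate of the next run length, which the proposal does not specify) and the use of qsort() are all assumptions made for brevity.

```c
#include <assert.h>
#include <stdlib.h>

enum toy_pri { TOY_UNDER = 0, TOY_OVER = 1 };   /* UNDER runs first */

struct toy_vcpu {
    int id;
    enum toy_pri pri;
    unsigned burst_est;   /* hypothetical estimate of the next run length */
    unsigned arrival;     /* queue insertion order, used for FCFS */
};

static int toy_runq_cmp(const void *a, const void *b)
{
    const struct toy_vcpu *x = a, *y = b;

    /* Priority first: every UNDER vCPU sorts before any OVER vCPU. */
    if (x->pri != y->pri)
        return x->pri < y->pri ? -1 : 1;

    /* SJF only among UNDER vCPUs, as in the proposal. */
    if (x->pri == TOY_UNDER && x->burst_est != y->burst_est)
        return x->burst_est < y->burst_est ? -1 : 1;

    /* Otherwise fall back to FCFS. */
    if (x->arrival != y->arrival)
        return x->arrival < y->arrival ? -1 : 1;
    return 0;
}

static void toy_runq_sort(struct toy_vcpu *runq, size_t n)
{
    qsort(runq, n, sizeof(*runq), toy_runq_cmp);
}
```

The arrival field makes the ordering deterministic, since qsort() is not guaranteed to be stable.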
Re: [Xen-devel] [PATCH 5/5] xen: Write CR0, CR3 and CR4 in arch_set_info_guest()
On 05/18/2015 10:27 AM, Jan Beulich wrote: On 15.05.15 at 22:45, rcojoc...@bitdefender.com wrote: On 05/15/2015 06:57 PM, Jan Beulich wrote: On 06.05.15 at 19:12, rcojoc...@bitdefender.com wrote: Arch_set_info_guest() doesn't set CR0, CR3 or CR4. Added code that does that. But you should also say a word on why this is needed, since things worked fine so far without, and enabling the functions to run outside of their own vCPU context is not immediately obviously correct. This is a way to undo malicious CR writes. This is achieved for MSR writes with the deny vm_event response flag patch in this series, but the CR events are being send after the actual write. In such cases, while the VCPU is paused before I put a vm_response in the ring, I can simply write the old value back. I've brought up the issue in the past, and the consensus, IIRC, was that I should not alter existing behaviour (post-write events) - so the alternatives were either to add a new pre-write CR event (which seemed messy), or this (which seemed less intrusive). Of course, if it has now become acceptable to reconsider having the CR vm_events consistently pre-write, the deny patch could be extended to them. Considering that in the reply to Andrew's response you already pointed at where a suggestion towards consistent pre events was made, it would have helped if you identified where this was advised against before. It certainly seems sensible to treat MSR and CR writes in a consistent fashion. Sorry for the omission. This is the original reply: http://lists.xen.org/archives/html/xen-devel/2014-10/msg00240.html where you've pointed out that we should not break existing behaviour. After that, Tamas suggested that I could simply write the previous value back in the vm_event userspace handler, which is how this patch was born, and Andrew suggested pre-write hooks. My suggestions at this point are to either: 1. 
Modify this patch as suggested and resubmit it in the next series iteration (after we finish with the CR events cleanup).
2. Change the control register events to pre-write and prevent post-write events in the future.
3. Add new event types for pre-write control register events (a new index, CR3_PRE_WRITE or something similar).
4. Keep the existing types of control register events but add a new per-register flag, similar to how the sync flag works, that specifies whether the event should be pre or post-write. So the libxc monitor_write_ctrlreg() functions would take an additional bool parameter (bool pre_write, for example).

In cases 2-4 this patch could probably be dropped, and control register write preemption could be added to the vm_event reply DENY flag patch.

-int hvm_set_cr0(unsigned long value)
+int hvm_set_cr0(struct vcpu *v, unsigned long value, bool_t with_vm_event)
 {
-    struct vcpu *v = current;

This change is covered by neither the title nor the description, but considering it's you who sends this likely is the meat of the change. However, considering that the three calls you add to arch_set_info_guest() pass this in as zero, I even more wonder why what the title says is needed in the first place. I further wonder whether you wouldn't want an event if and only if v == current (in which case the flag parameter could be dropped).

It just seemed useless to send out a vm_event in the case you mention, since presumably the application setting them is very likely the same one receiving the events (though, granted, it doesn't need to be). So in that case, it would be pointless to notify itself that it has done what it knows it's done.

If the setting is being done by a monitor VM, I would suppose v != current, and hence - along the lines above - no need for an event. Whereas when v == current, the setting would be done by the VM itself, and hence an event should always be delivered.

Ack.
Thanks, Razvan
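The rule agreed on at the end of this exchange (deliver an event if and only if the write is done by the vCPU itself, so no extra flag parameter is needed) can be sketched roughly as below. This is only an illustration under stated assumptions: `toy_vcpu` and `toy_set_cr0()` are hypothetical, the running vCPU is passed in explicitly instead of using Xen's `current`, and the return value merely reports whether an event would be sent.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vCPU; not Xen's struct vcpu. */
struct toy_vcpu {
    int id;
    unsigned long cr0;
};

/*
 * Write CR0 on behalf of vCPU v. 'running' is the vCPU that is executing
 * the write (what Xen would call 'current').
 *
 * Returns true when a vm_event would be delivered: only when the guest
 * itself performs the write (v == running). A monitor restoring an old
 * value from outside (v != running) gets no event.
 */
static bool toy_set_cr0(struct toy_vcpu *v, const struct toy_vcpu *running,
                        unsigned long value)
{
    v->cr0 = value;
    return v == running;
}
```

With this shape, a monitor that writes the previous value back while the target vCPU is paused does not notify itself about its own write.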
Re: [Xen-devel] [PATCH 5/5] xen: Write CR0, CR3 and CR4 in arch_set_info_guest()
On 18.05.15 at 09:58, rcojoc...@bitdefender.com wrote: On 05/18/2015 10:27 AM, Jan Beulich wrote: On 15.05.15 at 22:45, rcojoc...@bitdefender.com wrote: On 05/15/2015 06:57 PM, Jan Beulich wrote: On 06.05.15 at 19:12, rcojoc...@bitdefender.com wrote: Arch_set_info_guest() doesn't set CR0, CR3 or CR4. Added code that does that. But you should also say a word on why this is needed, since things worked fine so far without, and enabling the functions to run outside of their own vCPU context is not immediately obviously correct. This is a way to undo malicious CR writes. This is achieved for MSR writes with the deny vm_event response flag patch in this series, but the CR events are being send after the actual write. In such cases, while the VCPU is paused before I put a vm_response in the ring, I can simply write the old value back. I've brought up the issue in the past, and the consensus, IIRC, was that I should not alter existing behaviour (post-write events) - so the alternatives were either to add a new pre-write CR event (which seemed messy), or this (which seemed less intrusive). Of course, if it has now become acceptable to reconsider having the CR vm_events consistently pre-write, the deny patch could be extended to them. Considering that in the reply to Andrew's response you already pointed at where a suggestion towards consistent pre events was made, it would have helped if you identified where this was advised against before. It certainly seems sensible to treat MSR and CR writes in a consistent fashion. Sorry for the omission. This is the original reply: http://lists.xen.org/archives/html/xen-devel/2014-10/msg00240.html where you've pointed out that we should not break existing behaviour. As said (mainly by Andrew) numerous times recently, breaking of existing behavior is - with all the incompatible changes already done to the interface after 4.5 - not currently an issue. 2. 
Change the control register events to pre-write and prevent post-write events in the future.

This one, as long as agreement can be reached among those who care about the events that this is the way to go.

Jan
Re: [Xen-devel] [PATCH v2 20/41] arm : create generic uart initialization function
On 17.05.15 at 22:03, parth.di...@linaro.org wrote:

--- a/xen/drivers/char/Makefile
+++ b/xen/drivers/char/Makefile
@@ -6,5 +6,5 @@
 obj-$(HAS_EXYNOS4210) += exynos4210-uart.o
 obj-$(HAS_OMAP) += omap-uart.o
 obj-$(HAS_SCIF) += scif-uart.o
 obj-$(HAS_EHCI) += ehci-dbgp.o
-obj-$(CONFIG_ARM) += dt-uart.o
+obj-$(CONFIG_ARM) += arm-uart.o

The patch is missing the corresponding source file. Also, uart_init() being (presumably) implemented in that file is in no way generic - it's still ARM specific, and hence it should be named that way.

Jan
[Xen-devel] [PATCH 2/4] libxc: print more error messages when failed
No functional changes introduced.

Signed-off-by: Wei Liu wei.l...@citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Ian Jackson ian.jack...@eu.citrix.com
---
 tools/libxc/xc_hvm_build_x86.c | 30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index 92422bf..df4b7ed 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -259,7 +259,10 @@ static int setup_guest(xc_interface *xch,
     memset(&elf, 0, sizeof(elf));
     if ( elf_init(&elf, image, image_size) != 0 )
+    {
+        PERROR("Could not initialise ELF image");
         goto error_out;
+    }

     xc_elf_set_logfile(xch, &elf, 1);

@@ -522,15 +525,24 @@ static int setup_guest(xc_interface *xch,
     DPRINTF("  1GB PAGES: 0x%016lx\n", stat_1gb_pages);

     if ( loadelfimage(xch, &elf, dom, page_array) != 0 )
+    {
+        PERROR("Could not load ELF image");
         goto error_out;
+    }

     if ( loadmodules(xch, args, m_start, m_end, dom, page_array) != 0 )
-        goto error_out;
+    {
+        PERROR("Could not load ACPI modules");
+        goto error_out;
+    }

     if ( (hvm_info_page = xc_map_foreign_range(
               xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
               HVM_INFO_PFN)) == NULL )
+    {
+        PERROR("Could not map hvm info page");
         goto error_out;
+    }
     build_hvm_info(hvm_info_page, args);
     munmap(hvm_info_page, PAGE_SIZE);

@@ -547,7 +559,10 @@ static int setup_guest(xc_interface *xch,
     }

     if ( xc_clear_domain_pages(xch, dom, special_pfn(0), NR_SPECIAL_PAGES) )
-            goto error_out;
+    {
+        PERROR("Could not clear special pages");
+        goto error_out;
+    }

     xc_hvm_param_set(xch, dom, HVM_PARAM_STORE_PFN,
                      special_pfn(SPECIALPAGE_XENSTORE));
@@ -580,7 +595,10 @@ static int setup_guest(xc_interface *xch,
     }

     if ( xc_clear_domain_pages(xch, dom, ioreq_server_pfn(0), NR_IOREQ_SERVER_PAGES) )
-            goto error_out;
+    {
+        PERROR("Could not clear ioreq page");
+        goto error_out;
+    }

     /* Tell the domain where the pages are and how many there are */
     xc_hvm_param_set(xch, dom, HVM_PARAM_IOREQ_SERVER_PFN,
@@ -595,7 +613,10 @@ static int setup_guest(xc_interface *xch,
     if ( (ident_pt = xc_map_foreign_range(
               xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
               special_pfn(SPECIALPAGE_IDENT_PT))) == NULL )
+    {
+        PERROR("Could not map special page ident_pt");
         goto error_out;
+    }
     for ( i = 0; i < PAGE_SIZE / sizeof(*ident_pt); i++ )
         ident_pt[i] = ((i << 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
                        _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
@@ -610,7 +631,10 @@ static int setup_guest(xc_interface *xch,
         char *page0 = xc_map_foreign_range(
             xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, 0);
         if ( page0 == NULL )
+        {
+            PERROR("Could not map page0");
             goto error_out;
+        }
         page0[0] = 0xe9;
         *(uint32_t *)&page0[1] = entry_eip - 5;
         munmap(page0, PAGE_SIZE);
--
1.9.1
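For readers unfamiliar with the two context snippets at the end of this patch, the following standalone sketch shows what they compute: an identity-mapped page directory made of 4MB (PSE) entries, and a 5-byte `jmp rel32` stub at guest physical address 0. The helper names and `TOY_`/`PG_` macros are hypothetical; the flag values are the standard x86 page-table bits, and memcpy is used for the displacement instead of the pointer cast in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PT_ENTRIES 1024u   /* one 4KB page of 32-bit entries */

/* Standard x86 page-table flag bits. */
#define PG_PRESENT  0x001u
#define PG_RW       0x002u
#define PG_USER     0x004u
#define PG_ACCESSED 0x020u
#define PG_DIRTY    0x040u
#define PG_PSE      0x080u     /* entry maps a 4MB page directly */

/* Entry i maps the 4MB region starting at physical address i << 22. */
static void toy_fill_ident_pt(uint32_t *pt)
{
    for (uint32_t i = 0; i < TOY_PT_ENTRIES; i++)
        pt[i] = (i << 22) | PG_PRESENT | PG_RW | PG_USER |
                PG_ACCESSED | PG_DIRTY | PG_PSE;
}

/*
 * Write "jmp rel32" (opcode 0xe9) at offset 0 of page0. The displacement
 * is relative to the end of the 5-byte instruction, and the stub lives at
 * address 0, hence entry_eip - 5. The displacement is stored in host byte
 * order, which matches x86 on a little-endian build host.
 */
static void toy_write_jmp_stub(uint8_t *page0, uint32_t entry_eip)
{
    uint32_t rel = entry_eip - 5;

    page0[0] = 0xe9;
    memcpy(&page0[1], &rel, sizeof(rel));
}
```

With all six flags set, the low byte of every entry is 0xE7, so entry 1 is 0x004000E7 and the last entry is 0xFFC000E7.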
Re: [Xen-devel] [PATCH][RFC] libxl: use new qemu parameters for emulated qemuu disks
On Fri, May 15, 2015 at 01:54:32PM +0200, Fabio Fantoni wrote: NOTES: This patch is a only a fast draft for testing. Some tests result: At xl create cdrom empty or not are both working, xl cd-insert is working, xl cd-eject seems working but on xl command in linux hvm domU return qmp error of Device 'ide-N' is locked, in windows 7 instead don't show the errror. xl block-attach seems working correctly and xl block-detach works correctly with linux hvm but not with windows 7 (seems block the disk remove, I don't know if do the same without this patch) Scsi disk case not tested for now. Any comment is appreciated. I presume you're trying to use AHCI? Do you notice improvement using AHCI when booting a guest? Signed-off-by: Fabio Fantoni fabio.fant...@m2r.biz --- tools/libxl/libxl_dm.c | 35 ++- 1 file changed, 18 insertions(+), 17 deletions(-) diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index 4bec5ba..6d00e38 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -811,7 +811,6 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc, int dev_number = libxl__device_disk_dev_number(disks[i].vdev, disk, part); const char *format = qemu_disk_format_string(disks[i].format); -char *drive; const char *pdev_path; if (dev_number == -1) { @@ -822,13 +821,14 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc, if (disks[i].is_cdrom) { if (disks[i].format == LIBXL_DISK_FORMAT_EMPTY) -drive = libxl__sprintf -(gc, if=ide,index=%d,media=cdrom,cache=writeback,id=ide-%i, - disk, dev_number); +flexarray_vappend(dm_args, -drive, +GCSPRINTF(if=none,id=ide-%i,cache=writeback, dev_number), -device, +GCSPRINTF(ide-cd,drive=ide-%i, dev_number), NULL); else -drive = libxl__sprintf -(gc, file=%s,if=ide,index=%d,media=cdrom,format=%s,cache=writeback,id=ide-%i, - disks[i].pdev_path, disk, format, dev_number); +flexarray_vappend(dm_args, -drive, + GCSPRINTF(file=%s,if=none,id=ide-%i,format=%s,cache=writeback, +disks[i].pdev_path, dev_number, 
format), -device, +GCSPRINTF(ide-cd,drive=ide-%i, dev_number), NULL); } else { if (disks[i].format == LIBXL_DISK_FORMAT_EMPTY) { LIBXL__LOG(ctx, LIBXL__LOG_WARNING, cannot support @@ -857,25 +857,26 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc, * hd[a-d] and ignore the rest. */ if (strncmp(disks[i].vdev, sd, 2) == 0) -drive = libxl__sprintf -(gc, file=%s,if=scsi,bus=0,unit=%d,format=%s,cache=writeback, - pdev_path, disk, format); +flexarray_vappend(dm_args, -drive, + GCSPRINTF(file=%s,if=none,id=scsidisk-%d,format=%s,cache=writeback, +pdev_path, disk, format), -device, +GCSPRINTF(scsi-hd,drive=scsidisk-%d,scsi-id=%d, +disk, disk), NULL); else if (disk 6 libxl_defbool_val(b_info-u.hvm.ahci)){ I don't see a ahci field in libxl idl. Did you forget to commit that part? Another question is that do we always want to enable AHCI? Do we want to make it user tunable? This affects if you actually need a new field in idl. Wei. flexarray_vappend(dm_args, -drive, GCSPRINTF(file=%s,if=none,id=ahcidisk-%d,format=%s,cache=writeback, -pdev_path, disk, format), -device, GCSPRINTF(ide-hd,bus=ahci0.%d,unit=0,drive=ahcidisk-%d, +pdev_path, disk, format), -device, + GCSPRINTF(ide-hd,bus=ahci0.%d,unit=0,drive=ahcidisk-%d, disk, disk), NULL); -continue; }else if (disk 4) -drive = libxl__sprintf -(gc, file=%s,if=ide,index=%d,media=disk,format=%s,cache=writeback, - pdev_path, disk, format); +flexarray_vappend(dm_args, -drive, + GCSPRINTF(file=%s,if=none,id=idedisk-%d,format=%s,cache=writeback, +pdev_path, disk, format), -device, GCSPRINTF(ide-hd,drive=idedisk-%d, +disk), NULL); else continue; /* Do not emulate this disk */ } -flexarray_append(dm_args, -drive); -
Re: [Xen-devel] [RFC PATCH 12/13] xen-netfront: implement TX persistent grants
On 12/05/15 18:18, Joao Martins wrote:

Instead of grant/revoking the buffer related to the skb, it will use an already granted page and memcpy to it. The grants will be mapped by xen-netback and reused over time, but only unmapped when the vif disconnects, as opposed to every packet. This only happens if the backend supports persistent grants since it would, otherwise, introduce the overhead of a memcpy on top of the grant map.

[...]

--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
[...]
@@ -1610,7 +1622,10 @@ static int xennet_init_queue(struct netfront_queue *queue)
 	for (i = 0; i < NET_TX_RING_SIZE; i++) {
 		skb_entry_set_link(&queue->tx_skbs[i], i+1);
 		queue->grant_tx[i].ref = GRANT_INVALID_REF;
-		queue->grant_tx[i].page = NULL;
+		if (queue->info->feature_persistent)
+			queue->grant_tx[i].page = alloc_page(GFP_NOIO);

Need to check for alloc failure here and unwind correctly? Why NOIO?

+		else
+			queue->grant_tx[i].page = NULL;
 	}

 	/* Clear out rx_skbs */
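The "check for alloc failure and unwind" pattern asked for in the review could look roughly like the userspace sketch below. It is an analogue only: malloc()/free() stand in for alloc_page()/__free_page(), `toy_grant` and `toy_init_grants()` are hypothetical names, and the `fail_at` parameter exists purely so the failure path can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE 4096

/* Hypothetical analogue of the per-slot grant bookkeeping. */
struct toy_grant {
    int ref;
    void *page;
};

/*
 * Allocate one buffer per slot. If any allocation fails, free everything
 * allocated so far and leave every page pointer NULL.
 * 'fail_at' simulates an allocation failure at that index (pass n to
 * disable the simulation).
 * Returns 0 on success, -1 on failure.
 */
static int toy_init_grants(struct toy_grant *g, size_t n, size_t fail_at)
{
    size_t i;

    for (i = 0; i < n; i++) {
        g[i].ref = -1;
        g[i].page = (i == fail_at) ? NULL : malloc(TOY_PAGE_SIZE);
        if (g[i].page == NULL)
            goto unwind;
    }
    return 0;

unwind:
    /* Slot i itself holds NULL; release slots i-1 down to 0. */
    while (i-- > 0) {
        free(g[i].page);
        g[i].page = NULL;
    }
    return -1;
}
```

In the driver the caller would also have to propagate the error out of the queue setup path, which the sketch does not attempt to model.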
Re: [Xen-devel] [PATCH][RFC] libxl: use new qemu parameters for emulated qemuu disks
On 18/05/2015 13:24, George Dunlap wrote:
On Fri, May 15, 2015 at 12:54 PM, Fabio Fantoni fabio.fant...@m2r.biz wrote:

NOTES: This patch is only a fast draft for testing. Some test results: with xl create, cdrom empty or not both work; xl cd-insert works; xl cd-eject seems to work, but the xl command on a linux hvm domU returns the qmp error "Device 'ide-N' is locked", while on windows 7 the error is not shown. xl block-attach seems to work correctly, and xl block-detach works correctly with linux hvm but not with windows 7 (it seems to block the disk removal; I don't know if it does the same without this patch). Scsi disk case not tested for now. Any comment is appreciated.

I think what's missing in this changelog is why we want this patch -- what's the advantage? Was some of the functionality above not working before, for example?

-George

Adding ahci support requires the new -device parameters, so I tried to change all the other cases as well, to keep them consistent. There is also a rare case of possible problems when using the old parameters for only some of the devices (disks, network, audio and older usb, if I remember correctly), but I don't remember the example case; it was found by another user. With this patch I'm trying to maintain full compatibility, but there is probably something I don't know about that must be changed or may change with this.
Re: [Xen-devel] [PATCH v2][RFC] libxl: Add AHCI support for upstream qemu
Il 18/05/2015 17:53, Wei Liu ha scritto: On Thu, May 14, 2015 at 01:11:13PM +0200, Fabio Fantoni wrote: Usage: ahci=0|1 (default=0) If enabled adds ich9 disk controller in ahci mode and uses it with upstream qemu to emulate disks instead of ide. Is ICH9 available in our default setup? Why do we not always enable AHCI? ahci seems require ich9 controller (default in q35 but not in older used by xen), I think that change it ide-ahci automatically for all cases may causes problems, probably in save/restore and more probably in windows =7 without pv where change ide-ahci require a registry change for not have blue screen. It doesn't support cdroms which still use ide (cdroms will use -device ide-cd as new qemu parameters) Ahci requires new qemu parameters but for now other emulated disks cases remains with old ones because automatic bus selection seems bugged in qemu using new parameters. (I'll retry) Buggy as in? Have you reported to QEMU upstream? Already reported long time ago in xen-devel and qemu-devel, the only reply from qemu-devel was to use fixed bus and is what I did in v2 of this patch. Can someone tell me if can be a problem fixed bus? NOTES: This patch is a only a fast draft for testing. Tested with 1 and 6 disks on ubuntu 15.04 hvm, windows 7 and windows 8 domUs. Doc entry and libxl.h define should be added, I'll do. Other emulated disks cases should be converted to use new qemu parameters but probably a fix in qemu is needed. Any comment is appreciated. Signed-off-by: Fabio Fantoni fabio.fant...@m2r.biz Changes in v2: - libxl_dm.c: manual bus and unit selection (as workaround to qemu bug) to have multiple disks working. What's the relation between this patch the other patch you posted later? 
--- tools/libxl/libxl_create.c | 1 + tools/libxl/libxl_dm.c | 10 +- tools/libxl/libxl_types.idl | 1 + tools/libxl/xl_cmdimpl.c| 1 + 4 files changed, 12 insertions(+), 1 deletion(-) diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index f0da7dc..fcfe24a 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -322,6 +322,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc, libxl_defbool_setdefault(b_info-u.hvm.nested_hvm, false); libxl_defbool_setdefault(b_info-u.hvm.usb,false); libxl_defbool_setdefault(b_info-u.hvm.xen_platform_pci, true); +libxl_defbool_setdefault(b_info-u.hvm.ahci, false); libxl_defbool_setdefault(b_info-u.hvm.spice.enable, false); if (!libxl_defbool_val(b_info-u.hvm.spice.enable) diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index 0c6408d..4bec5ba 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -804,6 +804,8 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc, flexarray_append(dm_args, libxl__sprintf(gc, %PRId64, ram_size)); if (b_info-type == LIBXL_DOMAIN_TYPE_HVM) { +if (libxl_defbool_val(b_info-u.hvm.ahci)) +flexarray_append_pair(dm_args, -device, ahci,id=ahci0); for (i = 0; i num_disks; i++) { int disk, part; int dev_number = @@ -858,7 +860,13 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc, drive = libxl__sprintf (gc, file=%s,if=scsi,bus=0,unit=%d,format=%s,cache=writeback, pdev_path, disk, format); -else if (disk 4) +else if (disk 6 libxl_defbool_val(b_info-u.hvm.ahci)){ And you choose 6 because? Because the ich9 ahci controller have 6 channel, so is the max, ide instead have 2 channel with master/slave. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
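The selection logic discussed here (up to 6 disks on the ich9 AHCI controller when ahci is enabled, otherwise up to 4 on IDE, anything beyond that not emulated) can be summarized in a small sketch. The enum and function names are hypothetical and this is not the libxl code; only the -device string format is taken from the patch, and the separate sd/SCSI branch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result type for the sketch. */
enum toy_bus { TOY_BUS_NONE, TOY_BUS_IDE, TOY_BUS_AHCI };

/*
 * AHCI has 6 ports on the ich9 controller; IDE has 2 channels with
 * master/slave, i.e. 4 devices. Disks beyond the limit are not emulated.
 */
static enum toy_bus toy_pick_bus(int disk, bool ahci)
{
    if (ahci && disk < 6)
        return TOY_BUS_AHCI;
    if (disk < 4)
        return TOY_BUS_IDE;
    return TOY_BUS_NONE;
}

/*
 * Build the -device argument for an AHCI disk with a fixed bus and unit,
 * in the format used by the patch. Returns what snprintf() returns.
 */
static int toy_ahci_device_arg(char *buf, size_t len, int disk)
{
    return snprintf(buf, len,
                    "ide-hd,bus=ahci0.%d,unit=0,drive=ahcidisk-%d",
                    disk, disk);
}
```

Note that with ahci enabled, disks 4 and 5 become emulated where they previously were not, which may matter for the save/restore compatibility concern raised above.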
Re: [Xen-devel] [PATCH v2][RFC] libxl: Add AHCI support for upstream qemu
On 18/05/2015 18:16, Ian Jackson wrote:
Fabio Fantoni writes ("[PATCH v2][RFC] libxl: Add AHCI support for upstream qemu"):

If enabled adds ich9 disk controller in ahci mode and uses it with upstream qemu to emulate disks instead of ide.

I'm sorry for perhaps querying the obvious, but why is this a good idea? I think the underlying motivation should be explained in the commit message. That will also help us understand whether we ought to be changing the default.

Ian.

From other mails, see this:
http://lists.xen.org/archives/html/xen-devel/2015-05/msg01277.html
and...

Do you notice improvement using AHCI when booting a guest?

The most significant improvement is with lubuntu 15.04 hvm, where the boot time to login is only about a fifth (20%) of the time without it. With windows 7 pro 64 bit it varies: in the best case the boot time is about a third (33%), in the worst it seems to gain only 10-20% of the total time. With windows 8.1 the gain is only 5-10% of the total, but with win8 the boot time is very long in any case; I don't know whether that is caused by an unexpected case in xen or whether windows 8 is simply bad or heavy.

Another detail that is probably useful:
http://lists.xen.org/archives/html/xen-devel/2015-05/msg02327.html
[Xen-devel] [PATCH 3/4] libxc: rework vnuma bits in setup_guest
Make the setup process similar to PV counterpart. That is, to allocate a P2M array that covers the whole memory range and start from there. This is clearer than using an array with no holes in it. Also the dummy layout should take MMIO hole into consideration. We might end up having two vmemranges in the dummy layout. Signed-off-by: Wei Liu wei.l...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com --- tools/libxc/xc_hvm_build_x86.c | 66 -- 1 file changed, 50 insertions(+), 16 deletions(-) diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c index df4b7ed..77678f1 100644 --- a/tools/libxc/xc_hvm_build_x86.c +++ b/tools/libxc/xc_hvm_build_x86.c @@ -238,6 +238,7 @@ static int setup_guest(xc_interface *xch, { xen_pfn_t *page_array = NULL; unsigned long i, vmemid, nr_pages = args-mem_size PAGE_SHIFT; +unsigned long p2m_size; unsigned long target_pages = args-mem_target PAGE_SHIFT; unsigned long entry_eip, cur_pages, cur_pfn; void *hvm_info_page; @@ -254,8 +255,8 @@ static int setup_guest(xc_interface *xch, xen_pfn_t special_array[NR_SPECIAL_PAGES]; xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES]; uint64_t total_pages; -xen_vmemrange_t dummy_vmemrange; -unsigned int dummy_vnode_to_pnode; +xen_vmemrange_t dummy_vmemrange[2]; +unsigned int dummy_vnode_to_pnode[2]; memset(elf, 0, sizeof(elf)); if ( elf_init(elf, image, image_size) != 0 ) @@ -275,17 +276,37 @@ static int setup_guest(xc_interface *xch, if ( args-nr_vmemranges == 0 ) { -/* Build dummy vnode information */ -dummy_vmemrange.start = 0; -dummy_vmemrange.end = args-mem_size; -dummy_vmemrange.flags = 0; -dummy_vmemrange.nid = 0; -args-nr_vmemranges = 1; -args-vmemranges = dummy_vmemrange; +/* Build dummy vnode information + * + * Guest physical address space layout: + * [0, hole_start) [hole_start, 4G) [4G, highmem_end) + * + * Of course if there is no high memory, the second vmemrange + * has no effect on the actual result. 
+ */ -dummy_vnode_to_pnode = XC_NUMA_NO_NODE; +dummy_vmemrange[0].start = 0; +dummy_vmemrange[0].end = args-lowmem_end; +dummy_vmemrange[0].flags = 0; +dummy_vmemrange[0].nid = 0; +dummy_vnode_to_pnode[0] = XC_NUMA_NO_NODE; +args-nr_vmemranges = 1; args-nr_vnodes = 1; -args-vnode_to_pnode = dummy_vnode_to_pnode; + +if ( args-highmem_end (1ULL 32) ) +{ +dummy_vmemrange[1].start = 1ULL 32; +dummy_vmemrange[1].end = args-highmem_end; +dummy_vmemrange[1].flags = 0; +dummy_vmemrange[1].nid = 0; +dummy_vnode_to_pnode[1] = XC_NUMA_NO_NODE; + +args-nr_vmemranges++; +args-nr_vnodes++; +} + +args-vmemranges = dummy_vmemrange; +args-vnode_to_pnode = dummy_vnode_to_pnode; } else { @@ -297,9 +318,15 @@ static int setup_guest(xc_interface *xch, } total_pages = 0; +p2m_size = 0; for ( i = 0; i args-nr_vmemranges; i++ ) +{ total_pages += ((args-vmemranges[i].end - args-vmemranges[i].start) PAGE_SHIFT); +p2m_size = p2m_size (args-vmemranges[i].end PAGE_SHIFT) ? +p2m_size : (args-vmemranges[i].end PAGE_SHIFT); +} + if ( total_pages != (args-mem_size PAGE_SHIFT) ) { PERROR(vNUMA memory pages mismatch (0x%PRIx64 != 0x%PRIx64), @@ -325,16 +352,23 @@ static int setup_guest(xc_interface *xch, DPRINTF( TOTAL:%016PRIx64-%016PRIx64\n, v_start, v_end); DPRINTF( ENTRY:%016PRIx64\n, elf_uval(elf, elf.ehdr, e_entry)); -if ( (page_array = malloc(nr_pages * sizeof(xen_pfn_t))) == NULL ) +if ( (page_array = malloc(p2m_size * sizeof(xen_pfn_t))) == NULL ) { PERROR(Could not allocate memory.); goto error_out; } -for ( i = 0; i nr_pages; i++ ) -page_array[i] = i; -for ( i = args-mmio_start PAGE_SHIFT; i nr_pages; i++ ) -page_array[i] += args-mmio_size PAGE_SHIFT; +for ( i = 0; i p2m_size; i++ ) +page_array[i] = ((xen_pfn_t)-1); +for ( vmemid = 0; vmemid args-nr_vmemranges; vmemid++ ) +{ +uint64_t pfn; + +for ( pfn = args-vmemranges[vmemid].start PAGE_SHIFT; + pfn args-vmemranges[vmemid].end PAGE_SHIFT; + pfn++ ) +page_array[pfn] = pfn; +} /* * Try to claim pages for early warning of insufficient 
memory available. -- 1.9.1
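As a rough illustration of what this patch computes, the sketch below builds the dummy layout (one range below the MMIO hole, plus a second range starting at 4G when there is high memory) and derives both the total page count and the P2M size, i.e. the highest end pfn over all ranges. The struct and function names are hypothetical and only mirror the shape of the patch; 4KB pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12

/* Hypothetical stand-in for xen_vmemrange_t (only start/end are used). */
struct toy_vmemrange {
    uint64_t start;
    uint64_t end;
};

/*
 * Dummy layout: [0, lowmem_end) and, if there is memory above 4G,
 * [4G, highmem_end). Returns the number of ranges filled in (1 or 2).
 */
static unsigned toy_dummy_layout(struct toy_vmemrange r[2],
                                 uint64_t lowmem_end, uint64_t highmem_end)
{
    unsigned n = 1;

    r[0].start = 0;
    r[0].end = lowmem_end;
    if (highmem_end > (1ULL << 32)) {
        r[1].start = 1ULL << 32;
        r[1].end = highmem_end;
        n++;
    }
    return n;
}

/*
 * Sum the pages in all ranges into *total_pages and return the P2M size:
 * the largest end pfn, so the array also covers the hole between ranges.
 */
static uint64_t toy_p2m_size(const struct toy_vmemrange *r, unsigned n,
                             uint64_t *total_pages)
{
    uint64_t p2m_size = 0;

    *total_pages = 0;
    for (unsigned i = 0; i < n; i++) {
        uint64_t end_pfn = r[i].end >> TOY_PAGE_SHIFT;

        *total_pages += (r[i].end - r[i].start) >> TOY_PAGE_SHIFT;
        if (end_pfn > p2m_size)
            p2m_size = end_pfn;
    }
    return p2m_size;
}
```

For example, with lowmem_end at 3.75G and highmem_end at 5G, the guest has 0x130000 pages but the P2M array needs 0x140000 entries, the difference being the hole below 4G.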
Re: [Xen-devel] [RFC PATCH 09/13] xen-netfront: move grant_{ref, page} to struct grant
On 12/05/15 18:18, Joao Martins wrote:

Refactors a little bit how grants are stored by moving grant_rx_ref/grant_tx_ref and grant_tx_page to its own structure, namely struct grant.

Reviewed-by: David Vrabel david.vra...@citrix.com

Although...

--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -87,6 +87,11 @@ struct netfront_cb {
 /* IRQ name is queue name with -tx or -rx appended */
 #define IRQ_NAME_SIZE (QUEUE_NAME_SIZE + 3)

+struct grant {
+	grant_ref_t ref;
+	struct page *page;
+};

Is this sort of structure (and the following patch) useful for other frontends?

David
[Xen-devel] [linux-linus test] 56660: regressions - FAIL
flight 56660 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/56660/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-rumpuserxen-i386  15 rumpuserxen-demo-xenstorels/xenstorels.repeat  fail REGR. vs. 50329

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt  11 guest-start  fail REGR. vs. 50329
 test-amd64-i386-libvirt  11 guest-start  fail REGR. vs. 50329
 test-amd64-amd64-libvirt  11 guest-start  fail REGR. vs. 50329
 test-amd64-i386-freebsd10-amd64  9 freebsd-install  fail like 50329
 test-amd64-i386-freebsd10-i386  9 freebsd-install  fail like 50329
 test-amd64-amd64-xl-qemuu-win7-amd64  16 guest-stop  fail like 50329
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-stop  fail like 50329

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  9 debian-hvm-install  fail never pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  9 debian-hvm-install  fail never pass
 test-amd64-i386-libvirt-xsm  11 guest-start  fail never pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  9 debian-hvm-install  fail never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail never pass
 test-amd64-amd64-xl-xsm  11 guest-start  fail never pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  9 debian-hvm-install  fail never pass
 test-amd64-amd64-xl-pvh-intel  11 guest-start  fail never pass
 test-amd64-i386-xl-xsm  11 guest-start  fail never pass
 test-amd64-amd64-libvirt-xsm  11 guest-start  fail never pass
 test-armhf-armhf-xl-xsm  6 xen-boot  fail never pass
 test-armhf-armhf-libvirt-xsm  6 xen-boot  fail never pass
 test-armhf-armhf-xl-sedf  12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-sedf-pin  12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu  12 migrate-support-check  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64  16 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop  fail never pass
 test-armhf-armhf-xl  12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-cubietruck  12 migrate-support-check  fail never pass

version targeted for testing:
 linux  c0655fe9b0901a968800f56687be3c62b4cce5d2
baseline version:
 linux  1cced5015b171415169d938fb179c44fe060dc15

2034 people touched revisions under test, not listing them all

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-i386  pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt  pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops  pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen  pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl  pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  fail
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  fail
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  fail
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  fail
 test-amd64-amd64-libvirt-xsm  fail
 test-armhf-armhf-libvirt-xsm  fail
 test-amd64-i386-libvirt-xsm  fail
 test-amd64-amd64-xl-xsm  fail
 test-armhf-armhf-xl-xsm  fail
 test-amd64-i386-xl-xsm  fail
 test-amd64-amd64-xl-pvh-amd