Re: [Xen-devel] [PATCH 4/4] iommu: add rmrr Xen command line option for extra rmrrs
From: elena.ufimts...@oracle.com [mailto:elena.ufimts...@oracle.com] Sent: Saturday, May 30, 2015 5:39 AM From: Elena Ufimtseva elena.ufimts...@oracle.com On some platforms RMRR regions may be not specified in ACPI and thus will not be mapped 1:1 in dom0. This causes IO Page Faults and prevents dom0 from booting in PVH mode. New Xen command line option rmrr allows to specify such devices and memory regions. These regions are added to the list of RMRR defined in ACPI if the device is present in system. As a result, additional RMRRs will be mapped 1:1 in dom0 with correct permissions. Mentioned above problems were discovered during PVH work with ThinkCentre M and Dell 5600T. No official documentation was found so far in regards to what devices and why cause this. Experiments show that ThinkCentre M USB devices with enabled debug port generate DMA read transactions to the regions of memory marked reserved in host e820 map. For Dell 5600T the device and faulting addresses are not found yet. For detailed history of the discussion please check following threads: http://lists.Xen.org/archives/html/xen-devel/2015-02/msg01724.html http://lists.Xen.org/archives/html/xen-devel/2015-01/msg02513.html Format for rmrr Xen command line option: rmrr=start-end=[s1]bdf1[,[s1]bdf2[,...]];start-end=[s2]bdf1[,[s2]bdf2[,...]] If grub2 used and multiple ranges are specified, ';' should be quoted/escaped, refer to grub2 manual for more information. Signed-off-by: Elena Ufimtseva elena.ufimts...@oracle.com --- docs/misc/xen-command-line.markdown | 13 +++ xen/drivers/passthrough/vtd/dmar.c | 164 +++- 2 files changed, 176 insertions(+), 1 deletion(-) diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index 4889e27..26e2a5e 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -1185,6 +1185,19 @@ Specify the host reboot method. 'efi' instructs Xen to reboot using the EFI reboot call (in EFI mode by default it will use that method first). +### rmrr + '= start-end=[s1]bdf1[,[s1]bdf2[,...]];start-end=[s2]bdf1[,[s2]bdf2[,...]] + +Define RMRRs units that are missing from ACPI table along with device +they belong to and use them for 1:1 mapping. End addresses can be omitted +and one page will be mapped. The ranges are inclusive when start and end +are specified.If segement of the first device is not specified, segment zero will be used. +If other segments are not specified, first device segment will be used. +If segments are specified for every device and not equal, an error will be reported. Since you only allow devices under same segment for a given rmrr range, would it be simpler to enforce that explicitly in the format? e.g.: = start-end=[s1]bdf1[,bdf2[,...]]; Then you don't need to verify whether segment in later bdfs is specified and different from 1st bdf. +Note: grub2 requires to escape or use quotations if special +characters are used, namely ';', refer to the grub2 documentation if multiple +ranges are specified. + ### ro-hpet `= boolean` diff --git a/xen/drivers/passthrough/vtd/dmar.c b/xen/drivers/passthrough/vtd/dmar.c index 89a2f79..d675940 100644 --- a/xen/drivers/passthrough/vtd/dmar.c +++ b/xen/drivers/passthrough/vtd/dmar.c @@ -866,6 +866,106 @@ out: return ret; } +#define MAX_EXTRA_RMRR_PAGES 16 +#define MAX_EXTRA_RMRR 10 + +/* RMRR units derived from command line rmrr option */ +#define MAX_EXTRA_RMRR_DEV 20 +struct extra_rmrr_unit { +struct list_head list; +unsigned long base_pfn, end_pfn; +u16dev_count; +u32sbdf[MAX_EXTRA_RMRR_DEV]; +}; +static __initdata unsigned int nr_rmrr; +static struct __initdata extra_rmrr_unit rmrru[MAX_EXTRA_RMRR]; rmrru is confusing especially when used in multiple functions. Better to use a more descriptive name similar to acpi_rmrr_units. + +static void __init add_extra_rmrr(void) +{ +struct acpi_rmrr_unit *rmrrn; 'rmrrn' - 'acpi_rmrr'? +unsigned int dev, seg, i, j; +unsigned long pfn; + +for ( i = 0; i nr_rmrr; i++ ) +{ +if ( rmrru[i].base_pfn rmrru[i].end_pfn ) +{ +printk(XENLOG_ERR VTDPREFIX + Start pfn end pfn for RMRR range [%PRIx64 - %PRIx64]\n, + rmrru[i].base_pfn, rmrru[i].end_pfn); +return; return or continue? other rmrr entry might be good... +} + +if ( rmrru[i].end_pfn - rmrru[i].base_pfn = MAX_EXTRA_RMRR_PAGES ) +{ +printk(XENLOG_ERR VTDPREFIX + RMRR range exceeds 16 pages [%PRIx64 - %PRIx64]\n, + rmrru[i].base_pfn, rmrru[i].end_pfn); +return; +} + +for ( j = 0; j nr_rmrr; j++ ) +{ +if ( i != j rmrru[i].base_pfn = rmrru[j].end_pfn + rmrru[j].base_pfn =
Re: [Xen-devel] [PATCH V8] xen/vm_event: Clean up control-register-write vm_events and add XCR0 event
From: Razvan Cojocaru [mailto:rcojoc...@bitdefender.com] Sent: Friday, May 29, 2015 9:46 PM As suggested by Andrew Cooper, this patch attempts to remove some redundancy and allow for an easier time when adding vm_events for new control registers in the future, by having a single VM_EVENT_REASON_WRITE_CTRLREG vm_event type, meant to serve CR0, CR3, CR4 and (newly introduced) XCR0. The actual control register will be deduced by the new .index field in vm_event_write_ctrlreg (renamed from vm_event_mov_to_cr). The patch has also modified the xen-access.c test - it is now able to log CR3 events. Signed-off-by: Razvan Cojocaru rcojoc...@bitdefender.com Acked-by: Jan Beulich jbeul...@suse.com Acked-by: Kevin Tian kevin.t...@intel.com --- Changes since V7: - Explicitly #include asm/hvm/event.h. --- tools/libxc/include/xenctrl.h |9 ++--- tools/libxc/xc_monitor.c| 40 +++--- tools/tests/xen-access/xen-access.c | 30 +++-- xen/arch/x86/hvm/event.c| 55 +- xen/arch/x86/hvm/hvm.c | 11 +++--- xen/arch/x86/hvm/vmx/vmx.c |6 ++-- xen/arch/x86/monitor.c | 63 +-- xen/include/asm-x86/domain.h| 12 ++- xen/include/asm-x86/hvm/event.h |5 ++- xen/include/asm-x86/monitor.h |2 ++ xen/include/public/domctl.h | 12 +++ xen/include/public/vm_event.h | 29 12 files changed, 114 insertions(+), 160 deletions(-) diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h index 09a7450..83fd289 100644 --- a/tools/libxc/include/xenctrl.h +++ b/tools/libxc/include/xenctrl.h @@ -2375,12 +2375,9 @@ int xc_mem_access_disable_emulate(xc_interface *xch, domid_t domain_id); void *xc_monitor_enable(xc_interface *xch, domid_t domain_id, uint32_t *port); int xc_monitor_disable(xc_interface *xch, domid_t domain_id); int xc_monitor_resume(xc_interface *xch, domid_t domain_id); -int xc_monitor_mov_to_cr0(xc_interface *xch, domid_t domain_id, bool enable, - bool sync, bool onchangeonly); -int xc_monitor_mov_to_cr3(xc_interface *xch, domid_t domain_id, bool enable, - bool sync, bool onchangeonly); -int xc_monitor_mov_to_cr4(xc_interface *xch, domid_t domain_id, bool enable, - bool sync, bool onchangeonly); +int xc_monitor_write_ctrlreg(xc_interface *xch, domid_t domain_id, + uint16_t index, bool enable, bool sync, + bool onchangeonly); int xc_monitor_mov_to_msr(xc_interface *xch, domid_t domain_id, bool enable, bool extended_capture); int xc_monitor_singlestep(xc_interface *xch, domid_t domain_id, bool enable); diff --git a/tools/libxc/xc_monitor.c b/tools/libxc/xc_monitor.c index 87ad968..63013de 100644 --- a/tools/libxc/xc_monitor.c +++ b/tools/libxc/xc_monitor.c @@ -45,8 +45,9 @@ int xc_monitor_resume(xc_interface *xch, domid_t domain_id) NULL); } -int xc_monitor_mov_to_cr0(xc_interface *xch, domid_t domain_id, bool enable, - bool sync, bool onchangeonly) +int xc_monitor_write_ctrlreg(xc_interface *xch, domid_t domain_id, + uint16_t index, bool enable, bool sync, + bool onchangeonly) { DECLARE_DOMCTL; @@ -54,39 +55,8 @@ int xc_monitor_mov_to_cr0(xc_interface *xch, domid_t domain_id, bool enable, domctl.domain = domain_id; domctl.u.monitor_op.op = enable ? XEN_DOMCTL_MONITOR_OP_ENABLE : XEN_DOMCTL_MONITOR_OP_DISABLE; -domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_MOV_TO_CR0; -domctl.u.monitor_op.u.mov_to_cr.sync = sync; -domctl.u.monitor_op.u.mov_to_cr.onchangeonly = onchangeonly; - -return do_domctl(xch, domctl); -} - -int xc_monitor_mov_to_cr3(xc_interface *xch, domid_t domain_id, bool enable, - bool sync, bool onchangeonly) -{ -DECLARE_DOMCTL; - -domctl.cmd = XEN_DOMCTL_monitor_op; -domctl.domain = domain_id; -domctl.u.monitor_op.op = enable ? XEN_DOMCTL_MONITOR_OP_ENABLE -: XEN_DOMCTL_MONITOR_OP_DISABLE; -domctl.u.monitor_op.event = XEN_DOMCTL_MONITOR_EVENT_MOV_TO_CR3; -domctl.u.monitor_op.u.mov_to_cr.sync = sync; -domctl.u.monitor_op.u.mov_to_cr.onchangeonly = onchangeonly; - -return do_domctl(xch, domctl); -} - -int xc_monitor_mov_to_cr4(xc_interface *xch, domid_t domain_id, bool enable, - bool sync, bool onchangeonly) -{ -DECLARE_DOMCTL; - -domctl.cmd = XEN_DOMCTL_monitor_op; -domctl.domain = domain_id; -domctl.u.monitor_op.op = enable ? XEN_DOMCTL_MONITOR_OP_ENABLE -
[Xen-devel] Ping: [PATCH] xen/pass-through: fold host PCI command register writes
Ping? On 15.05.15 at 14:46, jbeul...@suse.com wrote: The code introduced to address XSA-126 allows simplification of other code in xen_pt_initfn(): All we need to do is update cmd suitably, as it'll be written back to the host register near the end of the function anyway. Signed-off-by: Jan Beulich jbeul...@suse.com --- a/hw/xen/xen_pt.c +++ b/hw/xen/xen_pt.c @@ -698,10 +698,7 @@ static int xen_pt_initfn(PCIDevice *d) machine_irq, pirq, rc); /* Disable PCI intx assertion (turn on bit10 of devctl) */ -xen_host_pci_set_word(s-real_device, - PCI_COMMAND, - pci_get_word(s-dev.config + PCI_COMMAND) - | PCI_COMMAND_INTX_DISABLE); +cmd |= PCI_COMMAND_INTX_DISABLE; machine_irq = 0; s-machine_irq = 0; } else { @@ -723,9 +720,7 @@ static int xen_pt_initfn(PCIDevice *d) e_intx, rc); /* Disable PCI intx assertion (turn on bit10 of devctl) */ -xen_host_pci_set_word(s-real_device, PCI_COMMAND, - *(uint16_t *)(s-dev.config[PCI_COMMAND]) - | PCI_COMMAND_INTX_DISABLE); +cmd |= PCI_COMMAND_INTX_DISABLE; xen_pt_mapped_machine_irq[machine_irq]--; if (xen_pt_mapped_machine_irq[machine_irq] == 0) { ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Ping: [PATCH] xen/pass-through: ROM BAR handling adjustments
Ping? On 15.05.15 at 14:41, jbeul...@suse.com wrote: Expecting the ROM BAR to be written with an all ones value when sizing the region is wrong - the low bit has another meaning (enable/disable) and bits 1..10 are reserved. The PCI spec also mandates writing all ones to just the address portion of the register. Use suitable constants also for initializing the ROM BAR register field description. Signed-off-by: Jan Beulich jbeul...@suse.com --- a/hw/xen/xen_pt.c +++ b/hw/xen/xen_pt.c @@ -248,7 +248,9 @@ static void xen_pt_pci_write_config(PCID /* check unused BAR register */ index = xen_pt_bar_offset_to_index(addr); -if ((index = 0) (val 0 val XEN_PT_BAR_ALLF) +if ((index = 0) (val != 0) +(((index != PCI_ROM_SLOT) ? + val : (val | (uint32_t)~PCI_ROM_ADDRESS_MASK)) != XEN_PT_BAR_ALLF) (s-bases[index].bar_flag == XEN_PT_BAR_FLAG_UNUSED)) { XEN_PT_WARN(d, Guest attempt to set address to unused Base Address Register. (addr: 0x%02x, len: %d)\n, addr, len); --- a/hw/xen/xen_pt_config_init.c +++ b/hw/xen/xen_pt_config_init.c @@ -726,8 +726,8 @@ static XenPTRegInfo xen_pt_emu_reg_heade .offset = PCI_ROM_ADDRESS, .size = 4, .init_val = 0x, -.ro_mask= 0x07FE, -.emu_mask = 0xF800, +.ro_mask= ~PCI_ROM_ADDRESS_MASK ~PCI_ROM_ADDRESS_ENABLE, +.emu_mask = (uint32_t)PCI_ROM_ADDRESS_MASK, .init = xen_pt_bar_reg_init, .u.dw.read = xen_pt_long_reg_read, .u.dw.write = xen_pt_exp_rom_bar_reg_write, ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 1/2] efi: Fix allocation problems if ExitBootServices() fails
On 01.06.15 at 12:17, ross.lagerw...@citrix.com wrote: If calling ExitBootServices() fails, the required memory map size may have increased. When initially allocating the memory map, allocate a slightly larger buffer (by an arbitrary 8 entries) to fix this. The ARM code path was already allocating a larger buffer than required, so this moves the code to be common for all architectures. This was seen on the following machine when using the iscsidxe UEFI driver. The machine would consistently fail the first call to ExitBootServices(). System Information Manufacturer: Supermicro Product Name: X10SLE-F/HF BIOS Information Vendor: American Megatrends Inc. Version: 2.00 Release Date: 04/24/2014 Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com Provided ARM folks are happy with the reduced increase, Acked-by: Jan Beulich jbeul...@suse.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On 01/06/15 13:11, Ian Campbell wrote: ### Device ID (`ID`) This parameter is used by commands which manage a specific device and the interrupts associated with that device. Checking if a device is present and retrieving the data structure must be fast. The device identifiers may not be assigned contiguously and the maximum number is very high (2^32). XXX In the context of virtualised device ids this may not be the case, e.g. we can arrange for (mostly) contiguous device ids and we know the bound is significantly lower than 2^32 Possible efficient data structures would be: 1. List: The lookup/deletion is in O(n) and the insertion will depend if the device should be sorted following their identifier. The memory overhead is 18 bytes per element. 2. Red-black tree: All the operations are O(log(n)). The memory overhead is 24 bytes per element. A Red-black tree seems the more suitable for having fast deviceID validation even though the memory overhead is a bit higher compare to the list. When PHYSDEVOP_pci_device_add is called, memory for its_device structure and other needed structure for this device is allocated added to RB-tree with all necessary information Sounds like a reasonable time to do it. I added something based on your words. Hmmm... The RB-tree suggested is per domain not the host and indexed with the vDevID. This is the only way to know quickly if the domain is able to use the device and retrieving a device. Indeed, the vDevID won't be equal to the pDevID as the vBDF will be different to the pBDF. PHYSDEVOP_pci_device_add is to ask Xen managing the PCI device. At that time we don't know to which domain the device will be passthrough. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V6 02/10] xen/arm: Add functions of mapping between vCPUID and virtual affinity
From: Chen Baozi baoz...@gmail.com GICv3 restricts that the maximum number of CPUs in affinity 0 (one cluster) is 16. That is to say the upper 4 bits of affinity 0 is unused. Current implementation considers that AFF0 is equal to vCPUID, which makes all vCPUs in one cluster, limiting its number to 16. If we would like to support more than 16 number of vCPU in one guest, we need to make use of AFF1. Considering the unused upper 4 bits, we need to create a pair of functions mapping the vCPUID and virtual affinity. Signed-off-by: Chen Baozi baoz...@gmail.com Reviewed-by: Julien Grall julien.gr...@citrix.com --- xen/include/asm-arm/domain.h | 41 + 1 file changed, 41 insertions(+) diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h index 75b17af..b7b5cd2 100644 --- a/xen/include/asm-arm/domain.h +++ b/xen/include/asm-arm/domain.h @@ -266,6 +266,47 @@ static inline unsigned int domain_max_vcpus(const struct domain *d) return MAX_VIRT_CPUS; } +/* + * Due to the restriction of GICv3, the number of vCPUs in AFF0 is + * limited to 16, thus only the first 4 bits of AFF0 are legal. We will + * use the first 2 affinity levels here, expanding the number of vCPU up + * to 4096 (16*256), which is more than 128 PEs that GIC-500 supports. + * + * Since we don't save information of vCPU's topology (affinity) in + * vMPIDR at the moment, we map the vcpuid to the vMPIDR linearly. + * + * XXX: We may have multi-threading or virtual cluster information in + * the future. + */ +static inline unsigned int vaffinity_to_vcpuid(register_t vaff) +{ +unsigned int vcpuid; + +vaff = MPIDR_HWID_MASK; + +vcpuid = MPIDR_AFFINITY_LEVEL(vaff, 0); +vcpuid |= MPIDR_AFFINITY_LEVEL(vaff, 1) 4; + +return vcpuid; +} + +static inline register_t vcpuid_to_vaffinity(unsigned int vcpuid) +{ +register_t vaff; + +/* + * Right now only AFF0 and AFF1 are supported in virtual affinity. + * Since only the first 4 bits in AFF0 are used in GICv3, the + * available bits are 12 (4+8). + */ +BUILD_BUG_ON(!(MAX_VIRT_CPUS ((1 12; + +vaff = (vcpuid 0x0f) MPIDR_LEVEL_SHIFT(0); +vaff |= ((vcpuid 4) MPIDR_LEVEL_MASK) MPIDR_LEVEL_SHIFT(1); + +return vaff; +} + #endif /* __ASM_DOMAIN_H__ */ /* -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V6 01/10] xen/arm: gic-v3: Increase the size of GICR in address space for guest
From: Chen Baozi baoz...@gmail.com Currently it only supports up to 8 vCPUs. Increase the region to hold up to 128 vCPUs, which is the maximum number that GIC-500 supports. Signed-off-by: Chen Baozi baoz...@gmail.com Reviewed-by: Julien Grall julien.gr...@citrix.com --- xen/include/public/arch-arm.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h index c029e0f..ec0c261 100644 --- a/xen/include/public/arch-arm.h +++ b/xen/include/public/arch-arm.h @@ -388,8 +388,8 @@ struct xen_arch_domainconfig { #define GUEST_GICV3_RDIST_STRIDE 0x2ULL #define GUEST_GICV3_RDIST_REGIONS 1 -#define GUEST_GICV3_GICR0_BASE 0x0302ULL/* vCPU0 - vCPU7 */ -#define GUEST_GICV3_GICR0_SIZE 0x0010ULL +#define GUEST_GICV3_GICR0_BASE 0x0302ULL/* vCPU0 - vCPU127 */ +#define GUEST_GICV3_GICR0_SIZE 0x0100ULL /* * 16MB == 4096 pages reserved for guest to use as a region to map its -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V6 03/10] xen/arm: Use the new functions for vCPUID/vaffinity transformation
From: Chen Baozi baoz...@gmail.com There are 3 places to change: * Initialise vMPIDR value in vcpu_initialise() * Find the vCPU from vMPIDR affinity information when accessing GICD registers in vGIC * Find the vCPU from vMPIDR affinity information when booting with vPSCI in vGIC - Also make the code for PSCI 0.1 use MPIDR-like value as the cpuid. Signed-off-by: Chen Baozi baoz...@gmail.com Reviewed-by: Julien Grall julien.gr...@citrix.com --- xen/arch/arm/domain.c | 6 +- xen/arch/arm/vgic-v3.c | 2 +- xen/arch/arm/vpsci.c | 5 + 3 files changed, 3 insertions(+), 10 deletions(-) diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index 2bde26e..0cf147c 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -501,11 +501,7 @@ int vcpu_initialise(struct vcpu *v) v-arch.sctlr = SCTLR_GUEST_INIT; -/* - * By default exposes an SMP system with AFF0 set to the VCPU ID - * TODO: Handle multi-threading processor and cluster - */ -v-arch.vmpidr = MPIDR_SMP | (v-vcpu_id MPIDR_AFF0_SHIFT); +v-arch.vmpidr = MPIDR_SMP | vcpuid_to_vaffinity(v-vcpu_id); v-arch.actlr = READ_SYSREG32(ACTLR_EL1); diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index 540f85f..ef9a71a 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -61,7 +61,7 @@ static struct vcpu *vgic_v3_irouter_to_vcpu(struct domain *d, uint64_t irouter) if ( irouter GICD_IROUTER_SPI_MODE_ANY ) return d-vcpu[0]; -vcpu_id = irouter MPIDR_AFF0_MASK; +vcpu_id = vaffinity_to_vcpuid(irouter); if ( vcpu_id = d-max_vcpus ) return NULL; diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c index 5d899be..aebe1e2 100644 --- a/xen/arch/arm/vpsci.c +++ b/xen/arch/arm/vpsci.c @@ -32,10 +32,7 @@ static int do_common_cpu_on(register_t target_cpu, register_t entry_point, int is_thumb = entry_point 1; register_t vcpuid; -if( ver == XEN_PSCI_V_0_2 ) -vcpuid = (target_cpu MPIDR_HWID_MASK); -else -vcpuid = target_cpu; +vcpuid = vaffinity_to_vcpuid(target_cpu); if ( vcpuid = d-max_vcpus || (v = d-vcpu[vcpuid]) == NULL ) return PSCI_INVALID_PARAMETERS; -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] xen: netback: fix printf format string warning
On Mon, Jun 01, 2015 at 11:30:04AM +0100, Ian Campbell wrote: drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’: drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=] (txreq.offset~PAGE_MASK) + txreq.size); ^ PAGE_MASK's type can vary by arch, so a cast is needed. Signed-off-by: Ian Campbell ian.campb...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com v2: Cast to unsigned long, since PAGE_MASK can vary by arch. --- drivers/net/xen-netback/netback.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 4de46aa..0d25943 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -1250,7 +1250,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue, netdev_err(queue-vif-dev, txreq.offset: %x, size: %u, end: %lu\n, txreq.offset, txreq.size, -(txreq.offset~PAGE_MASK) + txreq.size); +(unsigned long)(txreq.offset~PAGE_MASK) + txreq.size); xenvif_fatal_tx_err(queue-vif); break; } -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/arm: gic-hip04: Resync the driver with the GICv2
On Mon, 2015-06-01 at 12:13 +0100, Julien Grall wrote: Hi, On 18/05/15 14:36, Zoltan Kiss wrote: On 15/05/15 22:08, Julien Grall wrote: Hi Zoltan, On 07/05/2015 13:37, Zoltan Kiss wrote: On 07/05/15 10:32, Ian Campbell wrote: On Thu, 2015-05-07 at 09:52 +0100, Zoltan Kiss wrote: Looks good at first glance, let me try it on a board. On 06/05/15 19:52, Julien Grall wrote: [...] I'm concerned to see a newly driver (pushed last march) already orphan. Does Huawei still plan to maintain this driver? I share Julien's concerns here. It would be good if those listed in the MAINTAINERS file for this device would respond reasonably promptly to mails such as [1] and try to keep on top of things or to find a {replacement /co-}maintainer who can do so. As I said, I've missed that thread entirely (not that hard given the traffic of the list), but now I've improved my mail filters to make sure I don't miss a mail where my Huawei address is on the Cc. We are also looking to add new co-maintainers, because nowadays I'm working on other projects. I need a few days to test the patch as my Xen test environment is not available right now. Ping? Were you able to test it? One of my colleague is doing that, to spread experience. It's going slower for him obviously to set up an enviroment, but we are working on that. Ping? Was your colleague able to setup Xen? Aside gic-hip04 is not booting on Xen, this is also a blocker from few patch series based on this rebase (GICv2 on GICv3, 128 VCPUs support...). FYI I won't be blocking anything due to this patch, if I trip over some other patch not applying/building because of it then I'll either apply it regardless of whether it has been tested/acked or patch the Makefile to disable it in the build, or perhaps both. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [rumpuserxen test] 57717: regressions - FAIL
flight 57717 rumpuserxen real [real] http://logs.test-lab.xenproject.org/osstest/logs/57717/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64-rumpuserxen 5 rumpuserxen-build fail REGR. vs. 33866 build-i386-rumpuserxen5 rumpuserxen-build fail REGR. vs. 33866 Tests which did not succeed, but are not blocking: test-amd64-i386-rumpuserxen-i386 1 build-check(1) blocked n/a test-amd64-amd64-rumpuserxen-amd64 1 build-check(1) blocked n/a version targeted for testing: rumpuserxen 3b91e44996ea6ae1276bce1cc44f38701c53ee6f baseline version: rumpuserxen 30d72f3fc5e35cd53afd82c8179cc0e0b11146ad People who touched revisions under test: Antti Kantee po...@iki.fi Ian Jackson ian.jack...@eu.citrix.com Martin Lucina mar...@lucina.net Wei Liu l...@liuw.name jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-pvopspass build-i386-pvops pass build-amd64-rumpuserxen fail build-i386-rumpuserxen fail test-amd64-amd64-rumpuserxen-amd64 blocked test-amd64-i386-rumpuserxen-i386 blocked sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. (No revision log; it would be 535 lines long.) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V6 08/10] xen: Add arch_domain_preinit to initialise vGIC before evtchn_init
From: Chen Baozi baoz...@gmail.com evtchn_init will call domain_max_vcpus to allocate poll_mask. On arm/arm64 platform, this number is determined by the vGIC the guest is going to use, which won't be initialised until arch_domain_create is called in current implementation. However, moving arch_domain_create means that we will allocate memory before checking the XSM policy, which seems not to be acceptable because if the domain is not allowed to boot by XSM policy the expensive execution of arch_domain_create is wasteful. Thus, we create the arch_domain_preinit to make vgic_ops initialisation be done earlier. Signed-off-by: Chen Baozi baoz...@gmail.com Cc: Julien Grall julien.gr...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Stefano Stabellini stefano.stabell...@citrix.com Cc: Tim Deegan t...@xen.org Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/arm/domain.c| 73 +--- xen/arch/arm/vgic.c | 14 -- xen/arch/x86/domain.c| 6 xen/common/domain.c | 3 ++ xen/include/xen/domain.h | 2 ++ 5 files changed, 61 insertions(+), 37 deletions(-) diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index 0cf147c..63c34fd 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -527,37 +527,16 @@ void vcpu_destroy(struct vcpu *v) free_xenheap_pages(v-arch.stack, STACK_ORDER); } -int arch_domain_create(struct domain *d, unsigned int domcr_flags, - struct xen_arch_domainconfig *config) +int arch_domain_preinit(struct domain *d, +struct xen_arch_domainconfig *config) { int rc; -d-arch.relmem = RELMEM_not_started; - /* Idle domains do not need this setup */ if ( is_idle_domain(d) ) return 0; ASSERT(config != NULL); -if ( (rc = p2m_init(d)) != 0 ) -goto fail; - -rc = -ENOMEM; -if ( (d-shared_info = alloc_xenheap_pages(0, 0)) == NULL ) -goto fail; - -/* Default the virtual ID to match the physical */ -d-arch.vpidr = boot_cpu_data.midr.bits; - -clear_page(d-shared_info); -share_xen_page_with_guest( -virt_to_page(d-shared_info), d, XENSHARE_writable); - -if ( (rc = domain_io_init(d)) != 0 ) -goto fail; - -if ( (rc = p2m_alloc_table(d)) != 0 ) -goto fail; switch ( config-gic_version ) { @@ -567,12 +546,16 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags, case GIC_V2: config-gic_version = XEN_DOMCTL_CONFIG_GIC_V2; d-arch.vgic.version = GIC_V2; +d-arch.vgic.handler = vgic_v2_ops; break; +#ifdef CONFIG_ARM_64 case GIC_V3: config-gic_version = XEN_DOMCTL_CONFIG_GIC_V3; d-arch.vgic.version = GIC_V3; +d-arch.vgic.handler = vgic_v3_ops; break; +#endif default: BUG(); @@ -581,17 +564,61 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags, case XEN_DOMCTL_CONFIG_GIC_V2: d-arch.vgic.version = GIC_V2; +d-arch.vgic.handler = vgic_v2_ops; break; +#ifdef CONFIG_ARM_64 case XEN_DOMCTL_CONFIG_GIC_V3: d-arch.vgic.version = GIC_V3; +d-arch.vgic.handler = vgic_v3_ops; break; +#endif default: rc = -EOPNOTSUPP; goto fail; } +return 0; + +fail: +d-is_dying = DOMDYING_dead; + +return rc; +} + +int arch_domain_create(struct domain *d, unsigned int domcr_flags, + struct xen_arch_domainconfig *config) +{ +int rc; + +d-arch.relmem = RELMEM_not_started; + +/* Idle domains do not need this setup */ +if ( is_idle_domain(d) ) +return 0; + +ASSERT(config != NULL); +if ( (rc = p2m_init(d)) != 0 ) +goto fail; + +rc = -ENOMEM; +if ( (d-shared_info = alloc_xenheap_pages(0, 0)) == NULL ) +goto fail; + +/* Default the virtual ID to match the physical */ +d-arch.vpidr = boot_cpu_data.midr.bits; + +clear_page(d-shared_info); +share_xen_page_with_guest( +virt_to_page(d-shared_info), d, XENSHARE_writable); + +if ( (rc = domain_io_init(d)) != 0 ) +goto fail; + +if ( (rc = p2m_alloc_table(d)) != 0 ) +goto fail; + if ( (rc = domain_vgic_init(d, config-nr_spis)) != 0 ) goto fail; diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index 1bd86f8..08b487b 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -88,20 +88,6 @@ int domain_vgic_init(struct domain *d, unsigned int nr_spis) return -ENODEV; } -switch ( d-arch.vgic.version ) -{ -#ifdef CONFIG_ARM_64 -case GIC_V3: -d-arch.vgic.handler = vgic_v3_ops; -break; -#endif -case GIC_V2: -d-arch.vgic.handler = vgic_v2_ops; -break; -default: -return -ENODEV; -
[Xen-devel] [PATCH V6 04/10] xen/arm: Use cpumask_t type for vcpu_mask in vgic_to_sgi
From: Chen Baozi baoz...@gmail.com Use cpumask_t instead of unsigned long which can only express 64 cpus at the most. Add the {gicv2|gicv3}_sgir_to_cpumask in corresponding vGICs to translate GICD_SGIR/ICC_SGI1R_EL1 to vcpu_mask for vgic_to_sgi. Signed-off-by: Chen Baozi baoz...@gmail.com --- xen/arch/arm/vgic-v2.c| 16 +--- xen/arch/arm/vgic-v3.c| 18 ++ xen/arch/arm/vgic.c | 31 --- xen/include/asm-arm/gic.h | 1 + xen/include/asm-arm/gic_v3_defs.h | 2 ++ xen/include/asm-arm/vgic.h| 2 +- 6 files changed, 51 insertions(+), 19 deletions(-) diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c index 3be1a51..17a3c9f 100644 --- a/xen/arch/arm/vgic-v2.c +++ b/xen/arch/arm/vgic-v2.c @@ -33,6 +33,15 @@ #include asm/gic.h #include asm/vgic.h +static inline void gicv2_sgir_to_cpumask(cpumask_t *cpumask, + const register_t sgir) +{ +unsigned long target_list; + +target_list = ((sgir GICD_SGI_TARGET_MASK) GICD_SGI_TARGET_SHIFT); +bitmap_copy(cpumask_bits(cpumask), target_list, GICD_SGI_TARGET_BITS); +} + static int vgic_v2_distr_mmio_read(struct vcpu *v, mmio_info_t *info) { struct hsr_dabt dabt = info-dabt; @@ -201,16 +210,17 @@ static int vgic_v2_to_sgi(struct vcpu *v, register_t sgir) int virq; int irqmode; enum gic_sgi_mode sgi_mode; -unsigned long vcpu_mask = 0; +cpumask_t vcpu_mask; +cpumask_clear(vcpu_mask); irqmode = (sgir GICD_SGI_TARGET_LIST_MASK) GICD_SGI_TARGET_LIST_SHIFT; virq = (sgir GICD_SGI_INTID_MASK); -vcpu_mask = (sgir GICD_SGI_TARGET_MASK) GICD_SGI_TARGET_SHIFT; /* Map GIC sgi value to enum value */ switch ( irqmode ) { case GICD_SGI_TARGET_LIST_VAL: +gicv2_sgir_to_cpumask(vcpu_mask, sgir); sgi_mode = SGI_TARGET_LIST; break; case GICD_SGI_TARGET_OTHERS_VAL: @@ -226,7 +236,7 @@ static int vgic_v2_to_sgi(struct vcpu *v, register_t sgir) return 0; } -return vgic_to_sgi(v, sgir, sgi_mode, virq, vcpu_mask); +return vgic_to_sgi(v, sgir, sgi_mode, virq, vcpu_mask); } static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info) diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index ef9a71a..2bf5294 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -972,22 +972,32 @@ write_ignore: return 1; } +static inline void gicv3_sgir_to_cpumask(cpumask_t *cpumask, + const register_t sgir) +{ +unsigned long target_list; + +target_list = sgir ICH_SGI_TARGETLIST_MASK; +bitmap_copy(cpumask_bits(cpumask), target_list, ICH_SGI_TARGET_BITS); +} + static int vgic_v3_to_sgi(struct vcpu *v, register_t sgir) { int virq; int irqmode; enum gic_sgi_mode sgi_mode; -unsigned long vcpu_mask = 0; +cpumask_t vcpu_mask; +cpumask_clear(vcpu_mask); irqmode = (sgir ICH_SGI_IRQMODE_SHIFT) ICH_SGI_IRQMODE_MASK; virq = (sgir ICH_SGI_IRQ_SHIFT ) ICH_SGI_IRQ_MASK; -/* SGI's are injected at Rdist level 0. ignoring affinity 1, 2, 3 */ -vcpu_mask = sgir ICH_SGI_TARGETLIST_MASK; /* Map GIC sgi value to enum value */ switch ( irqmode ) { case ICH_SGI_TARGET_LIST: +/* SGI's are injected at Rdist level 0. ignoring affinity 1, 2, 3 */ +gicv3_sgir_to_cpumask(vcpu_mask, sgir); sgi_mode = SGI_TARGET_LIST; break; case ICH_SGI_TARGET_OTHERS: @@ -998,7 +1008,7 @@ static int vgic_v3_to_sgi(struct vcpu *v, register_t sgir) return 0; } -return vgic_to_sgi(v, sgir, sgi_mode, virq, vcpu_mask); +return vgic_to_sgi(v, sgir, sgi_mode, virq, vcpu_mask); } static int vgic_v3_emulate_sysreg(struct cpu_user_regs *regs, union hsr hsr) diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index 7b387b7..1bd86f8 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -318,15 +318,20 @@ void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n) } } -/* TODO: unsigned long is used to fit vcpu_mask.*/ int vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode, int virq, -unsigned long vcpu_mask) +cpumask_t *vcpu_mask) { struct domain *d = v-domain; int vcpuid; int i; -ASSERT(d-max_vcpus 8*sizeof(vcpu_mask)); +/* + * cpumask_t is based on NR_CPUS and there is no relation between + * NR_CPUS and MAX_VIRT_CPUS. Furthermore, NR_CPUS can be configured + * at build time by the user. So we add a BUILD_BUG_ON here in order + * to avoid insecure hypervisor. + */ +BUILD_BUG_ON(sizeof(cpumask_t)*8 MAX_VIRT_CPUS); ASSERT( virq 16 ); @@ -338,25 +343,27 @@ int vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode, int case SGI_TARGET_OTHERS: /* * We expect vcpu_mask to
[Xen-devel] [PATCH V6 06/10] tools/libxl: Set 'reg' of cpu node equal to MPIDR affinity for domU
From: Chen Baozi baoz...@gmail.com According to ARM CPUs bindings, the reg field should match the MPIDR's affinity bits. We will use AFF0 and AFF1 when constructing the reg value of the guest at the moment, for it is enough for the current max vcpu number. Signed-off-by: Chen Baozi baoz...@gmail.com Reviewed-by: Julien Grall julien.gr...@citrix.com --- tools/libxl/libxl_arm.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c index c5088c4..16f4158 100644 --- a/tools/libxl/libxl_arm.c +++ b/tools/libxl/libxl_arm.c @@ -272,6 +272,7 @@ static int make_cpus_node(libxl__gc *gc, void *fdt, int nr_cpus, const struct arch_info *ainfo) { int res, i; +uint64_t mpidr_aff; res = fdt_begin_node(fdt, cpus); if (res) return res; @@ -283,7 +284,16 @@ static int make_cpus_node(libxl__gc *gc, void *fdt, int nr_cpus, if (res) return res; for (i = 0; i nr_cpus; i++) { -const char *name = GCSPRINTF(cpu@%d, i); +const char *name; + +/* + * According to ARM CPUs bindings, the reg field should match + * the MPIDR's affinity bits. We will use AFF0 and AFF1 when + * constructing the reg value of the guest at the moment, for it + * is enough for the current max vcpu number. + */ +mpidr_aff = (i 0x0f) | (((i 4) 0xff) 8); +name = GCSPRINTF(cpu@%lx, mpidr_aff); res = fdt_begin_node(fdt, name); if (res) return res; @@ -297,7 +307,7 @@ static int make_cpus_node(libxl__gc *gc, void *fdt, int nr_cpus, res = fdt_property_string(fdt, enable-method, psci); if (res) return res; -res = fdt_property_regs(gc, fdt, 1, 0, 1, (uint64_t)i); +res = fdt_property_regs(gc, fdt, 1, 0, 1, mpidr_aff); if (res) return res; res = fdt_end_node(fdt); -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V6 05/10] xen/arm64: gicv3: Use AFF1 when translating ICC_SGI1R_EL1 to cpumask
From: Chen Baozi baoz...@gmail.com To support more than 16 vCPUs, we have to calculate cpumask with AFF1 field value in ICC_SGI1R_EL1. Signed-off-by: Chen Baozi baoz...@gmail.com --- xen/arch/arm/vgic-v3.c| 30 ++ xen/include/asm-arm/gic_v3_defs.h | 2 ++ 2 files changed, 28 insertions(+), 4 deletions(-) diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index 2bf5294..f2b78a4 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -972,13 +972,28 @@ write_ignore: return 1; } -static inline void gicv3_sgir_to_cpumask(cpumask_t *cpumask, +static inline int gicv3_sgir_to_cpumask(cpumask_t *cpumask, const register_t sgir) { unsigned long target_list; +uint16_t *target_bitmap; +unsigned int aff1; target_list = sgir ICH_SGI_TARGETLIST_MASK; -bitmap_copy(cpumask_bits(cpumask), target_list, ICH_SGI_TARGET_BITS); +/* We assume that only AFF1 is used in ICC_SGI1R_EL1. */ +aff1 = (sgir ICH_SGI_AFFINITY_LEVEL(1)) ICH_SGI_AFFx_MASK; + +/* There might be up to 4096 vCPUs with all bits in affinity 1 + * are used, so we have to check whether it will overflow the + * bitmap array of cpumask_t. + */ +if ( ((aff1 + 1) * ICH_SGI_TARGET_BITS) NR_CPUS ) +return 1; + +target_bitmap = (uint16_t *)cpumask_bits(cpumask); +target_bitmap[aff1] = target_list; + +return 0; } static int vgic_v3_to_sgi(struct vcpu *v, register_t sgir) @@ -996,8 +1011,15 @@ static int vgic_v3_to_sgi(struct vcpu *v, register_t sgir) switch ( irqmode ) { case ICH_SGI_TARGET_LIST: -/* SGI's are injected at Rdist level 0. ignoring affinity 1, 2, 3 */ -gicv3_sgir_to_cpumask(vcpu_mask, sgir); +/* + * Currenty we assume only affinity level-1 is used in SGI's + * injection, ignoring level 2 3. + */ +if ( gicv3_sgir_to_cpumask(vcpu_mask, sgir) ) +{ +gprintk(XENLOG_WARNING, Wrong affinity in SGI1R_EL register\n); +return 0; +} sgi_mode = SGI_TARGET_LIST; break; case ICH_SGI_TARGET_OTHERS: diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h index e106e67..333aa56 100644 --- a/xen/include/asm-arm/gic_v3_defs.h +++ b/xen/include/asm-arm/gic_v3_defs.h @@ -153,6 +153,8 @@ #define ICH_SGI_IRQ_MASK 0xf #define ICH_SGI_TARGETLIST_MASK 0x #define ICH_SGI_TARGET_BITS 16 +#define ICH_SGI_AFFx_MASK0xff +#define ICH_SGI_AFFINITY_LEVEL(x)(16 * (x)) #endif /* __ASM_ARM_GIC_V3_DEFS_H__ */ -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V6 07/10] xen/arm: Set 'reg' of cpu node for dom0 to match MPIDR's affinity
From: Chen Baozi baoz...@gmail.com According to ARM CPUs bindings, the reg field should match the MPIDR's affinity bits. We will use AFF0 and AFF1 when constructing the reg value of the guest at the moment, for it is enough for the current max vcpu number. Signed-off-by: Chen Baozi baoz...@gmail.com --- xen/arch/arm/domain_build.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index a156de9..154c9d6 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -712,6 +712,7 @@ static int make_cpus_node(const struct domain *d, void *fdt, char buf[15]; u32 clock_frequency; bool_t clock_valid; +uint64_t mpidr_aff; DPRINT(Create cpus node\n); @@ -761,9 +762,16 @@ static int make_cpus_node(const struct domain *d, void *fdt, for ( cpu = 0; cpu d-max_vcpus; cpu++ ) { -DPRINT(Create cpu@%u node\n, cpu); +/* + * According to ARM CPUs bindings, the reg field should match + * the MPIDR's affinity bits. We will use AFF0 and AFF1 when + * constructing the reg value of the guest at the moment, for it + * is enough for the current max vcpu number. + */ +mpidr_aff = vcpuid_to_vaffinity(cpu); +DPRINT(Create cpu@%lx (logical CPUID: %d) node\n, mpidr_aff, cpu); -snprintf(buf, sizeof(buf), cpu@%u, cpu); +snprintf(buf, sizeof(buf), cpu@%lx, mpidr_aff); res = fdt_begin_node(fdt, buf); if ( res ) return res; @@ -776,7 +784,7 @@ static int make_cpus_node(const struct domain *d, void *fdt, if ( res ) return res; -res = fdt_property_cell(fdt, reg, cpu); +res = fdt_property_cell(fdt, reg, mpidr_aff); if ( res ) return res; -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/arm: gic-hip04: Resync the driver with the GICv2
Hi, On 18/05/15 14:36, Zoltan Kiss wrote: On 15/05/15 22:08, Julien Grall wrote: Hi Zoltan, On 07/05/2015 13:37, Zoltan Kiss wrote: On 07/05/15 10:32, Ian Campbell wrote: On Thu, 2015-05-07 at 09:52 +0100, Zoltan Kiss wrote: Looks good at first glance, let me try it on a board. On 06/05/15 19:52, Julien Grall wrote: [...] I'm concerned to see a newly driver (pushed last march) already orphan. Does Huawei still plan to maintain this driver? I share Julien's concerns here. It would be good if those listed in the MAINTAINERS file for this device would respond reasonably promptly to mails such as [1] and try to keep on top of things or to find a {replacement /co-}maintainer who can do so. As I said, I've missed that thread entirely (not that hard given the traffic of the list), but now I've improved my mail filters to make sure I don't miss a mail where my Huawei address is on the Cc. We are also looking to add new co-maintainers, because nowadays I'm working on other projects. I need a few days to test the patch as my Xen test environment is not available right now. Ping? Were you able to test it? One of my colleague is doing that, to spread experience. It's going slower for him obviously to set up an enviroment, but we are working on that. Ping? Was your colleague able to setup Xen? Aside gic-hip04 is not booting on Xen, this is also a blocker from few patch series based on this rebase (GICv2 on GICv3, 128 VCPUs support...). Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/arm: gic-hip04: Resync the driver with the GICv2
Hi, Yes, we managed to test it, and it works. Then only thing I've found is this bit: +/* Only 1020 interrupts are supported */ +gicv2_info.nr_lines = min(1020U, nr_lines); This interrupt controller only supports 511, so 1020 should be replaced. We had such checking in the code in the early versions, and I looked everywhere in the archives to figure out why it was dropped before upstreaming, but I couldn't find it. Other than this bit: Reviewed-by: Zoltan Kiss zoltan.k...@huawei.com Tested-by: Shameerali Kolothum Thodi shameerali.kolothum.th...@huawei.com And sorry for the loong delay! Regards, Zoli On 01/06/15 12:13, Julien Grall wrote: Hi, On 18/05/15 14:36, Zoltan Kiss wrote: On 15/05/15 22:08, Julien Grall wrote: Hi Zoltan, On 07/05/2015 13:37, Zoltan Kiss wrote: On 07/05/15 10:32, Ian Campbell wrote: On Thu, 2015-05-07 at 09:52 +0100, Zoltan Kiss wrote: Looks good at first glance, let me try it on a board. On 06/05/15 19:52, Julien Grall wrote: [...] I'm concerned to see a newly driver (pushed last march) already orphan. Does Huawei still plan to maintain this driver? I share Julien's concerns here. It would be good if those listed in the MAINTAINERS file for this device would respond reasonably promptly to mails such as [1] and try to keep on top of things or to find a {replacement /co-}maintainer who can do so. As I said, I've missed that thread entirely (not that hard given the traffic of the list), but now I've improved my mail filters to make sure I don't miss a mail where my Huawei address is on the Cc. We are also looking to add new co-maintainers, because nowadays I'm working on other projects. I need a few days to test the patch as my Xen test environment is not available right now. Ping? Were you able to test it? One of my colleague is doing that, to spread experience. It's going slower for him obviously to set up an enviroment, but we are working on that. Ping? Was your colleague able to setup Xen? Aside gic-hip04 is not booting on Xen, this is also a blocker from few patch series based on this rebase (GICv2 on GICv3, 128 VCPUs support...). Regards, ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On Wed, 2015-05-27 at 22:14 +0530, Vijay Kilari wrote: ## pITS Scheduling A pITS scheduling pass is attempted: * On write to any virtual `CWRITER` iff that write results in there being new outstanding requests for that vits; You mean, scheduling pass (softirq trigger) is triggered iff there is no ongoing requests from that vits? Yes, this has changed with the switch to only a single outstanding batch. I went with: * On write to any virtual `CWRITER` iff that write results in there being new outstanding requests for that vits which could be consumed by the pits (i.e. subject to only a single batch only being permitted by the scheduler); Although implementationwise it may be OK to defer that decision to the scheduler, rather than try to figure it out in the mmio trap. * On read from a virtual `CREADR` iff there are commands outstanding on that vits; * On receipt of an interrupt notification arising from Xen's own use of `INT`; (see discussion under Completion) * On any interrupt injection arising from a guests use of the `INT` command; (XXX perhaps, see discussion under Completion) This may result in lots of contention on the scheduler locking. Therefore we consider that in each case all which happens is triggering of a softirq which will be processed on return to guest, and just once even for multiple events. Is it required to have all the cases to trigger scheduling pass? Just on CWRITER if no ongoing request and on Xen's own completion INT is not sufficient? I think CREADR is needed too, so the guest sees up to date info. And on injection arising from the guest use of INT is marked as optional here and considered later on. Whether it is needed depends on the decision there. [...] The second option is likely to be preferable if the issue of selecting a device ID can be addressed. A secondary question is when these `INT` commands should be inserted into the command stream: (Nb, this is a list of options, not a list of places where it must be done) * After each batch taken from a single `vits_cq`; Is this not enough? because Scheduling pass just sends a one batch of command with Xen's INT command It is almost certainly _sufficient_, the question is more whether it is _necessary_ or whether we can reduce the number of interrupts which are required for correct emulation of a vits, iow can we get away with one of the other two options. The following text argues that only one Xen INT is needed in the stream at any given moment. ### Device ID (`ID`) This parameter is used by commands which manage a specific device and the interrupts associated with that device. Checking if a device is present and retrieving the data structure must be fast. The device identifiers may not be assigned contiguously and the maximum number is very high (2^32). XXX In the context of virtualised device ids this may not be the case, e.g. we can arrange for (mostly) contiguous device ids and we know the bound is significantly lower than 2^32 Possible efficient data structures would be: 1. List: The lookup/deletion is in O(n) and the insertion will depend if the device should be sorted following their identifier. The memory overhead is 18 bytes per element. 2. Red-black tree: All the operations are O(log(n)). The memory overhead is 24 bytes per element. A Red-black tree seems the more suitable for having fast deviceID validation even though the memory overhead is a bit higher compare to the list. When PHYSDEVOP_pci_device_add is called, memory for its_device structure and other needed structure for this device is allocated added to RB-tree with all necessary information Sounds like a reasonable time to do it. I added something based on your words. [...] Format: `MAPC vCID, vTA` - The GITS_TYPER.PAtype is emulated as 0. ITYM `GITS_TYPER.PTA`? I've updated various introductory section to reflect the decision to emulate as 0. - `MAPC pCID, pTA` physical ITS command is generated ### `MAPD` Command translation Format: `MAPD device, Valid, ITT IPA, ITT Size` `MAPD` is sent with `Valid` bit set if device needs to be added and reset when device is removed. If `Valid` bit is set: - Allocate memory for `its_device` struct - Validate ITT IPA ITT size and update its_device struct - Find number of vectors(nrvecs) for this device by querying PCI helper function - Allocate nrvecs number of LPI XXX nrvecs is a function of `ITT Size`? - Allocate memory for `struct vlpi_map` for this device. This `vlpi_map` holds mapping of Virtual LPI to Physical LPI and ID. - Find physical ITS node with which this device is associated - Call `p2m_lookup` on ITT IPA addr and get physical ITT address - Validate ITT Size - Generate/format physical ITS command: `MAPD, ITT PA, ITT Size` Here the overhead
[Xen-devel] [PATCH V6 09/10] xen/arm: make domain_max_vcpus return value from vgic_ops
From: Chen Baozi baoz...@gmail.com When a guest uses vGICv2, the maximum number of vCPU it can support should not be as many as MAX_VIRT_CPUS, which will be more than 8 when GICv3 is used on arm64. So the domain_max_vcpus should return the value according to the vGIC the domain uses. We didn't keep it as the old static inline form because it will break compilation when access the member of struct domain: In file included from xen/include/xen/domain.h:6:0, from xen/include/xen/sched.h:10, from arm64/asm-offsets.c:10: xen/include/asm/domain.h: In function ‘domain_max_vcpus’: xen/include/asm/domain.h:266:10: error: dereferencing pointer to incomplete type if (d-arch.vgic.version == GIC_V2) ^ Signed-off-by: Chen Baozi baoz...@gmail.com --- xen/arch/arm/domain.c| 6 ++ xen/arch/arm/vgic-v2.c | 3 +++ xen/arch/arm/vgic-v3.c | 7 +++ xen/include/asm-arm/domain.h | 5 + xen/include/asm-arm/vgic.h | 2 ++ 5 files changed, 19 insertions(+), 4 deletions(-) diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c index 63c34fd..1992717 100644 --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -917,6 +917,12 @@ void vcpu_block_unless_event_pending(struct vcpu *v) vcpu_unblock(current); } +unsigned int domain_max_vcpus(const struct domain *d) +{ +return min_t(unsigned int, MAX_VIRT_CPUS, + d-arch.vgic.handler-max_vcpus); +} + /* * Local variables: * mode: C diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c index 17a3c9f..09e6b5a 100644 --- a/xen/arch/arm/vgic-v2.c +++ b/xen/arch/arm/vgic-v2.c @@ -33,6 +33,8 @@ #include asm/gic.h #include asm/vgic.h +#define GICV2_MAX_CPUS 8 + static inline void gicv2_sgir_to_cpumask(cpumask_t *cpumask, const register_t sgir) { @@ -594,6 +596,7 @@ const struct vgic_ops vgic_v2_ops = { .domain_init = vgic_v2_domain_init, .get_irq_priority = vgic_v2_get_irq_priority, .get_target_vcpu = vgic_v2_get_target_vcpu, +.max_vcpus = GICV2_MAX_CPUS, }; /* diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index f2b78a4..50dcfc9 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -32,6 +32,12 @@ #include asm/gic.h #include asm/vgic.h +/* + * We will use both AFF1 and AFF0 in (v)MPIDR. Thus, the max number of CPU + * that can be supported is up to 4096(256*16) in theory. + */ +#define GICV3_MAX_CPUS 4096 + /* GICD_PIDRn register values for ARM implementations */ #define GICV3_GICD_PIDR0 0x92 #define GICV3_GICD_PIDR1 0xb4 @@ -1234,6 +1240,7 @@ const struct vgic_ops vgic_v3_ops = { .get_irq_priority = vgic_v3_get_irq_priority, .get_target_vcpu = vgic_v3_get_target_vcpu, .emulate_sysreg = vgic_v3_emulate_sysreg, +.max_vcpus = GICV3_MAX_CPUS, }; /* diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h index b7b5cd2..b525bec 100644 --- a/xen/include/asm-arm/domain.h +++ b/xen/include/asm-arm/domain.h @@ -261,10 +261,7 @@ struct arch_vcpu void vcpu_show_execution_state(struct vcpu *); void vcpu_show_registers(const struct vcpu *); -static inline unsigned int domain_max_vcpus(const struct domain *d) -{ -return MAX_VIRT_CPUS; -} +unsigned int domain_max_vcpus(const struct domain *); /* * Due to the restriction of GICv3, the number of vCPUs in AFF0 is diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h index 2f413e1..60c6cfd 100644 --- a/xen/include/asm-arm/vgic.h +++ b/xen/include/asm-arm/vgic.h @@ -110,6 +110,8 @@ struct vgic_ops { struct vcpu *(*get_target_vcpu)(struct vcpu *v, unsigned int irq); /* vGIC sysreg emulation */ int (*emulate_sysreg)(struct cpu_user_regs *regs, union hsr hsr); +/* Maximum number of vCPU supported */ +const unsigned int max_vcpus; }; /* Number of ranks of interrupt registers for a domain */ -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V6 10/10] xen/arm64: increase MAX_VIRT_CPUS to 128 on arm64
From: Chen Baozi baoz...@gmail.com After we have increased the size of GICR in address space for guest and made use of both AFF0 and AFF1 in (v)MPIDR, we are now able to support up to 4096 vCPUs in theory. However, it will cost 512M address space for GICR region, which is not necessary big at the moment. Considering the max CPU number that GIC-500 can support and the old value of MAX_VIRT_CPUS before commit aa25a61, we increase its value to 128. Since the domain_max_vcpus has been changed to depends on vgic_ops, we could have done more work in order to drop the definition of MAX_VIRT_CPUS. However, because it is still used for some conditional compilation in common code, I think that would be better done in a seperate cleanup patch series. Signed-off-by: Chen Baozi baoz...@gmail.com --- xen/arch/arm/vgic-v3.c | 1 - xen/include/asm-arm/config.h | 4 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index 50dcfc9..2be9f81 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -895,7 +895,6 @@ static int vgic_v3_distr_mmio_write(struct vcpu *v, mmio_info_t *info) rank = vgic_rank_offset(v, 64, gicd_reg - GICD_IROUTER, DABT_DOUBLE_WORD); if ( rank == NULL ) goto write_ignore; -BUG_ON(v-domain-max_vcpus 8); new_irouter = *r; vgic_lock_rank(v, rank, flags); diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h index 3b23e05..817c216 100644 --- a/xen/include/asm-arm/config.h +++ b/xen/include/asm-arm/config.h @@ -47,7 +47,11 @@ #define NR_CPUS 128 #endif +#ifdef CONFIG_ARM_64 +#define MAX_VIRT_CPUS 128 +#else #define MAX_VIRT_CPUS 8 +#endif #define asmlinkage /* Nothing needed */ -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] tools: link executables with libtinfo explicitly
On Sat, May 30, 2015 at 12:07:28AM +0200, Daniel Kiper wrote: binutils 2.22 changed ld default from --copy-dt-needed-entries to -no-copy-dt-needed-entries. This revealed that some objects are linked implicitly with libtinfo and newer ld fails to build relevant executables. Below is short explanation why we should not do that... http://fedoraproject.org/wiki/UnderstandingDSOLinkChange says: The default behaviour for ld (my note: before version 2.22) allows users to 'indirectly' link to required objects/libraries through intermediate objects/libraries. While this is convenient, it can also be dangerous because it makes your program's dependencies tied to the dependencies of other objects. If those objects ever change their linkages, they can break your program without any changes to your own code! Signed-off-by: Daniel Kiper daniel.ki...@oracle.com --- config/Tools.mk.in|1 + tools/configure | 46 + Since your autoconf version might be different from the one used by committers, it might be better you don't include the changes to configure in your patch and add a line to prompt committer to rerun autogen.sh. tools/configure.ac|4 tools/misc/Makefile |2 +- tools/xenstat/xentop/Makefile |2 +- 5 files changed, 53 insertions(+), 2 deletions(-) diff --git a/config/Tools.mk.in b/config/Tools.mk.in index d67352e..aef9ed6 100644 --- a/config/Tools.mk.in +++ b/config/Tools.mk.in @@ -77,5 +77,6 @@ CONFIG_LIBICONV := @libiconv@ CONFIG_GCRYPT := @libgcrypt@ EXTFS_LIBS := @EXTFS_LIBS@ CURSES_LIBS := @CURSES_LIBS@ +TINFO_LIBS := @TINFO_LIBS@ FILE_OFFSET_BITS:= @FILE_OFFSET_BITS@ [...] diff --git a/tools/configure.ac b/tools/configure.ac index 9bf6450..1a06ddf 100644 --- a/tools/configure.ac +++ b/tools/configure.ac @@ -318,6 +318,10 @@ i[[3456]]86|x86_64) esac AX_CHECK_UUID AX_CHECK_CURSES +AS_IF([test $ncurses = y], [ +AC_CHECK_LIB([tinfo], [define_key], [TINFO_LIBS=-ltinfo]) define_key? +]) +AC_SUBST(TINFO_LIBS) This doesn't look right. Tinfo is needed by libxenstat. It should not depend on presence of curses library. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 net] xen: netback: read hotplug script once at start of day.
On Mon, Jun 01, 2015 at 11:30:24AM +0100, Ian Campbell wrote: When we come to tear things down in netback_remove() and generate the uevent it is possible that the xenstore directory has already been removed (details below). In such cases netback_uevent() won't be able to read the hotplug script and will write a xenstore error node. A recent change to the hypervisor exposed this race such that we now sometimes lose it (where apparently we didn't ever before). Instead read the hotplug script configuration during setup and use it for the lifetime of the backend device. The apparently more obvious fix of moving the transition to state=Closed in netback_remove() to after the uevent does not work because it is possible that we are already in state=Closed (in reaction to the guest having disconnected as it shutdown). Being already in Closed means the toolstack is at liberty to start tearing down the xenstore directories. In principal it might be possible to arrange to unregister the device sooner (e.g on transition to Closing) such that xenstore would still be there but this state machine is fragile and prone to anger... A modern Xen system only relies on the hotplug uevent for driver domains, when the backend is in the same domain as the toolstack it will run the necessary setup/teardown directly in the correct sequence wrt xenstore changes. Signed-off-by: Ian Campbell ian.campb...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 1/2] efi: Fix allocation problems if ExitBootServices() fails
On Mon, 2015-06-01 at 12:10 +0100, Jan Beulich wrote: On 01.06.15 at 12:17, ross.lagerw...@citrix.com wrote: If calling ExitBootServices() fails, the required memory map size may have increased. When initially allocating the memory map, allocate a slightly larger buffer (by an arbitrary 8 entries) to fix this. The ARM code path was already allocating a larger buffer than required, so this moves the code to be common for all architectures. This was seen on the following machine when using the iscsidxe UEFI driver. The machine would consistently fail the first call to ExitBootServices(). System Information Manufacturer: Supermicro Product Name: X10SLE-F/HF BIOS Information Vendor: American Megatrends Inc. Version: 2.00 Release Date: 04/24/2014 Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com Provided ARM folks are happy with the reduced increase, Hi Roy, This patch[0] turns a +PAGE_SIZE in efi_arch_allocate_mmap_buffer into a 8 * efi_mdesc_size in the common code. The +PAGE_SIZE came from [1] so I think it is as arbitrary as the +8*sizeof here. IOW this change looks ok to me, what do you think? Thanks, Ian. [0] http://lists.xen.org/archives/html/xen-devel/2015-06/msg00067.html [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=932881d953476444ed934c94dbc098c0fefb4d77 Acked-by: Jan Beulich jbeul...@suse.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-4.2-testing test] 57630: regressions - FAIL
On 01.06.15 at 00:57, osst...@xenbits.xen.org wrote: flight 57630 xen-4.2-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/57630/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-xend-winxpsp3 16 guest-stop fail REGR. vs. 53018 test-amd64-i386-xend-qemut-winxpsp3 16 guest-stop fail REGR. vs. 53018 This seems to be due to osstest using an unrecognized option: 2015-05-31 20:32:59 Z executing ssh ... root@172.16.144.32 xm shutdown -wF win.guest.osstest Error: option -F not recognized Usage: xm shutdown Domain [-waRH] Shutdown a domain. [options] [DOM] Shutdown one or more domains gracefully. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [Patch v3 27/36] x86, irq: Use access helper irq_data_get_affinity_mask()
Use access helper irq_data_get_affinity_mask() to hide implementation details of struct irq_desc. Signed-off-by: Jiang Liu jiang@linux.intel.com --- arch/x86/kernel/apic/io_apic.c |2 +- arch/x86/kernel/apic/vector.c|5 ++--- arch/x86/kernel/irq.c|5 +++-- drivers/xen/events/events_base.c |4 ++-- 4 files changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index 845dc0df2002..09921de4210f 100644 --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -2541,7 +2541,7 @@ void __init setup_ioapic_dest(void) * Honour affinities which have been set in early boot */ if (!irqd_can_balance(idata) || irqd_affinity_was_set(idata)) - mask = idata-affinity; + mask = irq_data_get_affinity_mask(idata); else mask = apic-target_cpus(); diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 9b62f690b0ff..dfa3a5f5b3d3 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -494,9 +494,8 @@ static int apic_set_affinity(struct irq_data *irq_data, err = assign_irq_vector(irq, data, dest); if (err) { - struct irq_data *top = irq_get_irq_data(irq); - - if (assign_irq_vector(irq, data, top-affinity)) + if (assign_irq_vector(irq, data, + irq_data_get_affinity_mask(irq_data))) pr_err(Failed to recover vector for irq %d\n, irq); return err; } diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index 7e10c8b4b318..37685e37550c 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -342,7 +342,8 @@ int check_irq_vectors_for_cpu_disable(void) continue; data = irq_desc_get_irq_data(desc); - cpumask_copy(affinity_new, data-affinity); + cpumask_copy(affinity_new, +irq_data_get_affinity_mask(data)); cpumask_clear_cpu(this_cpu, affinity_new); /* Do not count inactive or per-cpu irqs. */ @@ -420,7 +421,7 @@ void fixup_irqs(void) raw_spin_lock(desc-lock); data = irq_desc_get_irq_data(desc); - affinity = data-affinity; + affinity = irq_data_get_affinity_mask(data); if (!irq_has_action(irq) || irqd_is_per_cpu(data) || cpumask_subset(affinity, cpu_online_mask)) { raw_spin_unlock(desc-lock); diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 2b8553bd8715..d00e0be8e9ea 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -336,7 +336,7 @@ static void bind_evtchn_to_cpu(unsigned int chn, unsigned int cpu) BUG_ON(irq == -1); #ifdef CONFIG_SMP - cpumask_copy(irq_get_irq_data(irq)-affinity, cpumask_of(cpu)); + cpumask_copy(irq_get_affinity_mask(irq), cpumask_of(cpu)); #endif xen_evtchn_port_bind_to_cpu(info, cpu); @@ -373,7 +373,7 @@ static void xen_irq_init(unsigned irq) struct irq_info *info; #ifdef CONFIG_SMP /* By default all event channels notify CPU#0. */ - cpumask_copy(irq_get_irq_data(irq)-affinity, cpumask_of(0)); + cpumask_copy(irq_get_affinity_mask(irq), cpumask_of(0)); #endif info = kzalloc(sizeof(*info), GFP_KERNEL); -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V5 10/10] xen/arm64: increase MAX_VIRT_CPUS to 128 on arm64
Hi Chen, On 01/06/2015 01:56, Chen Baozi wrote: For instance, with your series a cluster can use up to 16 cores but the GIC-500 is only supporting up to 8 cores... Well, if 4096 is a acceptable value, it will cost 512M address space for GICR_* (We need to chagne patch #1 too). Although we do have enough space for GICR_* before it reaches the GNTTAB region, I don't think it is a good idea to increase MAX_VIRT_CPUS to such a big one. 4096 is the limitation of using only AFF1 and AFF0. That would be inevitable if one day we decide to support 4096 CPUs :). And I've also check the old value of MAX_VIRT_CPUS, it used to be 128 before aa25a61. Or do you have a better suggestion? I'm not against to limit to 128 vCPUs. It's just the commit message is not true. You are limiting the number of VCPUs for practical reason and not because we are implementing GIC-500. Please update the commit message according to it. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v6 2/4] iommu VT-d: separate rmrr addition function
On 29.05.15 at 23:38, elena.ufimts...@oracle.com wrote: In preparation for auxiliary RMRR data provided on Xen command line, make RMRR adding a separate function. Also free memery for rmrr device scope in error path. No changes since v5. Certainly there is. (And the statement wouldn't belong here anyway, but below the first --- separator.) Signed-off-by: Elena Ufimtseva elena.ufimts...@oracle.com Reviewed-by: Jan Beulich jbeul...@suse.com And certainly I didn't approve it in this shape: +static int register_one_rmrr(struct acpi_rmrr_unit *rmrru) +{ +bool_t ignore = 0; +unsigned int i = 0; +int ret = 0; + +/* Skip checking if segment is not accessible yet. */ +if ( !pci_known_segment(rmrru-segment) ) +{ +dprintk(XENLOG_WARNING VTDPREFIX, UNKNOWN Prefix! %04x, rmrru-segment); +i = UINT_MAX; +} + +for ( ; i rmrru-scope.devices_cnt; i++ ) +{ +u8 b = PCI_BUS(rmrru-scope.devices[i]); +u8 d = PCI_SLOT(rmrru-scope.devices[i]); +u8 f = PCI_FUNC(rmrru-scope.devices[i]); + +if ( pci_device_detect(rmrru-segment, b, d, f) == 0 ) +{ +dprintk(XENLOG_WARNING VTDPREFIX, + Non-existent device (%04x:%02x:%02x.%u) is reported + in RMRR (%PRIx64, %PRIx64)'s scope!\n, +rmrru-segment, b, d, f, +rmrru-base_address, rmrru-end_address); +ignore = 1; +} +else +{ +ignore = 0; +break; +} +} + +if ( ignore ) +{ +dprintk(XENLOG_WARNING VTDPREFIX, + Ignore the RMRR (%PRIx64, %PRIx64) due to +devices under its scope are not PCI discoverable!\n, +rmrru-base_address, rmrru-end_address); +xfree(rmrru-scope.devices); +xfree(rmrru); +ret = -EFAULT; You _again_ made this an error, which it wasn't before. A little more care please. Also you folded the leak fix into here without saying so. As said on the solitary leak fix patch - that change belongs there (not the least because we will want to backport that but not this one). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH net] xen: netback: read hotplug script once at start of day.
On Fri, 2015-05-29 at 18:38 +0100, Wei Liu wrote: On Fri, May 29, 2015 at 05:24:53PM +0100, Ian Campbell wrote: [...] if (be-vif != NULL) return 0; @@ -417,12 +409,23 @@ static int backend_create_xenvif(struct backend_info *be) return (err 0) ? err : -EINVAL; } + script = xenbus_read(XBT_NIL, dev-nodename, script, NULL); + if (IS_ERR(script)) { + int err = PTR_ERR(script); + xenbus_dev_fatal(dev, err, reading script); + return err; + } + vif = xenvif_alloc(dev-dev, dev-otherend_id, handle); if (IS_ERR(vif)) { err = PTR_ERR(vif); xenbus_dev_fatal(dev, err, creating interface); + kfree(script); return err; } + + vif-hotplug_script = script; + IMO it's better we make xenvif_alloc accept a new parameter called script then allocate vif-hotplug_script there. Then free vif-hotplug_script in xenvif_free. This way it's less error prone because all memory allocated for vif is managed in proper place - xenvif_alloc and xenvif_free. Well, except the allocation is still in xenbus_read via backend_create_xenvif, but yes I think that refactoring would be an improvement. What about storing it in struct backend_info and setting/restoring in netback_{probe,remove}? That might be best of all? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 2/2] efi: Avoid calling boot services after ExitBootServices()
On 01/06/15 11:17, Ross Lagerwall wrote: After the first call to ExitBootServices(), avoid calling any boot services by setting setting efi_bs to NULL and entering an infinite loop in blexit(). Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com --- xen/common/efi/boot.c | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c index 60c1b8d..5c5c158 100644 --- a/xen/common/efi/boot.c +++ b/xen/common/efi/boot.c @@ -216,6 +216,12 @@ static void __init noreturn blexit(const CHAR16 *str) PrintStr((CHAR16 *)str); PrintStr(newline); +if ( !efi_bs ) +{ +for ( ; ; ) +; At the very least this should be halt() to avoid spinning in a busy loop, and probably with a local_irq_disable() ahead of the for. ~Andrew +} + if ( cfg.addr ) efi_bs-FreePages(cfg.addr, PFN_UP(cfg.size)); if ( kernel.addr ) @@ -1063,8 +1069,10 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable) for ( retry = 0; ; retry = 1 ) { efi_memmap_size = map_alloc_size; -status = efi_bs-GetMemoryMap(efi_memmap_size, efi_memmap, map_key, - efi_mdesc_size, mdesc_ver); +status = SystemTable-BootServices-GetMemoryMap(efi_memmap_size, + efi_memmap, map_key, + efi_mdesc_size, + mdesc_ver); if ( EFI_ERROR(status) ) PrintErrMesg(LCannot obtain memory map, status); @@ -1073,7 +1081,9 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable) efi_arch_pre_exit_boot(); -status = efi_bs-ExitBootServices(ImageHandle, map_key); +status = SystemTable-BootServices-ExitBootServices(ImageHandle, + map_key); +efi_bs = NULL; if ( status != EFI_INVALID_PARAMETER || retry ) break; } ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 2/2] efi: Avoid calling boot services after ExitBootServices()
On 01.06.15 at 12:26, andrew.coop...@citrix.com wrote: On 01/06/15 11:17, Ross Lagerwall wrote: --- a/xen/common/efi/boot.c +++ b/xen/common/efi/boot.c @@ -216,6 +216,12 @@ static void __init noreturn blexit(const CHAR16 *str) PrintStr((CHAR16 *)str); PrintStr(newline); +if ( !efi_bs ) +{ +for ( ; ; ) +; At the very least this should be halt() to avoid spinning in a busy loop, and probably with a local_irq_disable() ahead of the for. Suitably abstracted, yes: ARM has no halt(), and I don't think we should assume local_irq_disable() can be used here in a completely arch-independent fashion. I.e. perhaps the whole body of the if() should become a new arch hook. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v23 10/15] x86/VPMU: Use pre-computed masks when checking validity of MSRs
On 29.05.15 at 20:42, boris.ostrov...@oracle.com wrote: No need to compute those masks on every MSR access. Also, when checking MSR_P6_EVNTSELx registers make sure that bit 21 is not set. I had hoped you would save me and other readers to look up what this bit means by giving a reason why this change is wanted/needed. What you say above is at best marginally more useful (it clarifies that this isn't a stray change) than having the adjustment only in the code. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [linux-linus test] 57671: trouble: blocked/broken/fail/pass
flight 57671 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/57671/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-armhf 3 host-install(3) broken REGR. vs. 57442 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-libvirt 11 guest-start fail like 57442 test-amd64-i386-libvirt 11 guest-start fail like 57442 test-amd64-i386-freebsd10-amd64 9 freebsd-install fail like 57442 test-amd64-i386-freebsd10-i386 9 freebsd-install fail like 57442 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 57442 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 57442 Tests which did not succeed, but are not blocking: build-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-xl 1 build-check(1) blocked n/a test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked n/a test-armhf-armhf-xl-sedf 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit2 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-xsm 1 build-check(1) blocked n/a test-armhf-armhf-xl-sedf-pin 1 build-check(1) blocked n/a test-armhf-armhf-xl-arndale 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-xl-cubietruck 1 build-check(1) blocked n/a test-amd64-amd64-xl-xsm 14 guest-localmigrate fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass test-amd64-i386-xl-xsm 14 guest-localmigrate fail never pass test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail never pass version targeted for testing: linuxaaa20fc23341be3df7b17810e330f12244abcf29 baseline version: linuxde182468d1bb726198abaab315820542425270b7 People who touched revisions under test: Adrien Schildknecht adrien+...@schischi.me Alex Deucher alexander.deuc...@amd.com Amir Vadai am...@mellanox.com Andrew Morton a...@linux-foundation.org Anthoine Bourgeois anthoine.bourge...@gmail.com Arnd Bergmann a...@arndb.de Ben Skeggs bske...@redhat.com Bjorn Helgaas bhelg...@google.com Bjørn Mork bj...@mork.no Bob Copeland m...@bobcopeland.com Brian Foster bfos...@redhat.com Casey Schaufler ca...@schaufler-ca.com Chris Zankel ch...@zankel.net Christian König christian.koe...@amd.com Dan Carpenter dan.carpen...@oracle.com Daniel Vetter daniel.vet...@ffwll.ch Daniel Vetter daniel.vet...@intel.com Darren Hart dvh...@linux.intel.com Dave Airlie airl...@redhat.com Dave Chinner da...@fromorbit.com Dave Chinner dchin...@redhat.com David Henningsson david.hennings...@canonical.com David S. Miller da...@davemloft.net Geert Uytterhoeven geert+rene...@glider.be George Wang xuw2...@gmail.com Guenter Roeck li...@roeck-us.net Henrique de Moraes Holschuh h...@hmh.eng.br Inki Dae inki@samsung.com Jani Nikula jani.nik...@intel.com Joe Perches j...@perches.com Joe Thornber e...@redhat.com Jun'ichi Nomura j-nom...@ce.jp.nec.com Junichi Nomura j-nom...@ce.jp.nec.com Kailang Yang kail...@realtek.com Krzysztof Kozlowski k.kozlow...@samsung.com Kukjin Kim kg...@kernel.org Lars Seipel l...@slrz.net Linus Torvalds torva...@linux-foundation.org Lucas Stach l.st...@pengutronix.de Matthijs van Duin matthijsvand...@gmail.com Max Filippov jcmvb...@gmail.com Mike Snitzer snit...@redhat.com NeilBrown ne...@suse.de Philippe Reynes trem...@gmail.com Rafael J. Wysocki rafael.j.wyso...@intel.com Rob Clark robdcl...@gmail.com Robert Jarzmik robert.jarz...@free.fr Robert Nelson robertcnel...@gmail.com Romain Izard romain.izard@gmail.com Rusty Russell ru...@rustcorp.com.au Rusty Russell ru...@rustcorp.com.au (then rebased) Sasha Levin sasha.le...@oracle.com Shawn Guo shawn@linaro.org Shreyas
Re: [Xen-devel] [PATCH] drivers: xen-blkfront: blkif_recover: recheck feature-persistent
El 26/05/15 a les 2.11, Bob Liu ha escrit: When migrate from !feature-persistent host to feature-persistent host, domU still think new host/backend don't support persistent. Dmesg like: backed has not unmapped grant: 839 backed has not unmapped grant: 773 backed has not unmapped grant: 773 backed has not unmapped grant: 773 backed has not unmapped grant: 839 We should recheck whether the new backend support feature-persistent during blkif_recover(). Right, we recheck for indirect-descriptors but not persistent grants. Do you think it makes sense to split the part of blkfront_connect that checks for optional features, like persistent grants, indirect descriptors and flush/barrier features to a separate function and call it from both blkfront_connect and blkif_recover? Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket
On Fri, May 29, 2015 at 04:38:31PM +0800, Chao Peng wrote: On Fri, May 29, 2015 at 09:06:46AM +0100, Jan Beulich wrote: On 29.05.15 at 04:43, chao.p.p...@linux.intel.com wrote: On Thu, May 28, 2015 at 02:17:54PM +0100, Jan Beulich wrote: On 21.05.15 at 10:41, chao.p.p...@linux.intel.com wrote: +static int cat_cpu_init(unsigned int cpu) +{ +int rc; +const struct cpuinfo_x86 *c = cpu_data + cpu; + +if ( !cpu_has(c, X86_FEATURE_CAT) ) +return 0; + +if ( test_bit(cpu_to_socket(cpu), cat_socket_enable) ) +return 0; + +if ( cpu == smp_processor_id() ) +do_cat_cpu_init(rc); +else +on_selected_cpus(cpumask_of(cpu), do_cat_cpu_init, rc, 1); This now being called in the context of CPU_UP_PREPARE, I can't see how this works at all: Neither would the CPU's cpu_data[] instance be initialized by that time, nor would you be able to IPI that CPU, nor can I see how the if() branch could ever get entered. Was this tested at all? Ah, yes! So it sounds really a little difficult to move the memory allocation from CPU_STARTING to CPU_PREPARA for this case. Not sure why you talk about memory allocation again. That should be done in CPU_UP_PREPARE. But stuff that needs to happen on the CPU should happen in CPU_STARTING. The memory allocation's size depending on a CPU characteristic of course makes this a little problematic, but (I think I said so before) since we're assuming symmetry in many other places, I don't see anything wrong with you assuming symmetry here too, and hence use e.g. the boot CPU's value to determine the allocation size. No problem, then I can just forget the support for asymmetry in XEN. As this is quite different with our original assumption and what I described in the design doc, I'd like to have a clear summary here before submitting the new version. Basically speaking, the initial design tries to support systems have different SKUs for each socket. One example would be when plugging one HSX (Haswell server) and BDX (Broadwell server) processor into each socket of a Grantley platform. However, there are difficulties to support this than just to support systems that always have the same SKUs, AFAICS: 1) Not able to detect nr_sockets correctly at booting time, especially when taking cpu hotplug into account. This is also why I added a boot option for this at the beginning of this patch serial, while I agreed it's really not a good interface for user. 2) Unfeasible to allocate memory first and do initialization later in cpu hotplug notifications. My former approach is performing both the allocation and initialization in the CPU_STARTING, which is not a good idea indicated by Jan. For me, it's better to have it supported. But if that's difficult (just as described above), I feel comfortable to drop it. Let's see what changes will be made once that support is dropped: 1) Current per-socket psr_cat_socket_info will be dropped, instead psr_cat will be introduced which holds cos_max/cbm_len for all the sockets, and it will be initialized only once. 2) cat_socket_enable will be dropped as well, instead check !!psr_cat, just as CMT did. 3) No special hotplug consideration, as the CAT hardware information on booting cpu is applied to others. 4) We still support per socket cos configuration, so per socket 'struct psr_cat_cbm *cos_to_cbm' becomes something like 'struct psr_cat_cbm *socket_cbms' which holds all the cos_to_cbm for all the sockets, and it will be initialized at booting time. 5) The 'target' in xen_sysctl_psr_cat_op will be removed as now all the sockets have the same hardware info. Due to this, I'd like to retrofit xen_sysctl_psr_cmt_op but not introduce a new sub-op for this. XEN_DOMCTL_psr_cat_op however will still be there. 6) Related changes in tools side/documentation also should be done. Though the list looks long but generally it becomes simple (of course). Suggestions are welcome. Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen: netback: fix error printf format string.
On Sun, 2015-05-31 at 21:26 -0700, David Miller wrote: From: Ian Campbell ian.campb...@citrix.com Date: Fri, 29 May 2015 17:22:04 +0100 drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’: drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=] (txreq.offset~PAGE_MASK) + txreq.size); ^ txreq.offset and .size are uint16_t fields. Signed-off-by: Ian Campbell ian.campb...@citrix.com This may get rid of the compiler warning on your machine, but it creates one on mine: drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’: drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 5 has type ‘long unsigned int’ [-Wformat=] (txreq.offset~PAGE_MASK) + txreq.size); ^ There is a type involved in this calculation which is arch dependent, so you'll need to add a cast or something to make this warning go away in all cases. Ah, I only considered the types txreq.{offset,size} and missed thinking about PAGE_MASK. I'll resend with a cast. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 for Xen 4.6 0/4] Enabling XL to set per-VCPU parameters of a domain for RTDS scheduler
On 05/29/2015 03:40 PM, Meng Xu wrote: 2015-05-29 4:15 GMT-07:00 Dario Faggioli dario.faggi...@citrix.com: On Tue, 2015-05-26 at 09:59 +0100, Ian Campbell wrote: On Mon, 2015-05-25 at 18:59 -0500, Chong Li wrote: This series arrived in my mailbox as 5 distinct mails. Please use git send-email such that the mails arrive as a single email thread (i.e. each mail as a reply to the previous or to the 0th mail) or arrange for the same thing by hand (I highly recommend using git send-email though) Indeed. BTW, Chong, v1 was threaded ok, so maybe did something different this time when sending the patches. If yes, just don't! :-) I think that's because Chong wanted to send different patches in this patch set to different maintainers. (For example, we don't want to spam Jan's folder with xl, libxl patch. :-) ) Is it ok to use git send-email --reply-to to attach all four patches to the cover letter (that is this email thread) of the patch set? So two git features: * If you write CC: Name addr...@blah.blah in your commit message, git send-email will CC that person *just for that commit* * Anything in the description under a line starting with --- will be discarded So you can write your message like this: 8-- libxl: A one-line example. This is just an example commit. Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Wei Liu wei@citrix.com 8-- And when you do git-send-email, it will automatically CC Ian and Wei (and only them) for this patch. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH net] xen: netback: read hotplug script once at start of day.
On Mon, Jun 01, 2015 at 09:52:45AM +0100, Ian Campbell wrote: On Fri, 2015-05-29 at 18:38 +0100, Wei Liu wrote: On Fri, May 29, 2015 at 05:24:53PM +0100, Ian Campbell wrote: [...] if (be-vif != NULL) return 0; @@ -417,12 +409,23 @@ static int backend_create_xenvif(struct backend_info *be) return (err 0) ? err : -EINVAL; } + script = xenbus_read(XBT_NIL, dev-nodename, script, NULL); + if (IS_ERR(script)) { + int err = PTR_ERR(script); + xenbus_dev_fatal(dev, err, reading script); + return err; + } + vif = xenvif_alloc(dev-dev, dev-otherend_id, handle); if (IS_ERR(vif)) { err = PTR_ERR(vif); xenbus_dev_fatal(dev, err, creating interface); + kfree(script); return err; } + + vif-hotplug_script = script; + IMO it's better we make xenvif_alloc accept a new parameter called script then allocate vif-hotplug_script there. Then free vif-hotplug_script in xenvif_free. This way it's less error prone because all memory allocated for vif is managed in proper place - xenvif_alloc and xenvif_free. Well, except the allocation is still in xenbus_read via backend_create_xenvif, but yes I think that refactoring would be an improvement. What about storing it in struct backend_info and setting/restoring in netback_{probe,remove}? That might be best of all? Yes, that would be best. Wei. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 for Xen 4.6 0/4] Enabling XL to set per-VCPU parameters of a domain for RTDS scheduler
On 01/06/15 09:58, Jan Beulich wrote: On 01.06.15 at 10:50, george.dun...@eu.citrix.com wrote: On 06/01/2015 09:48 AM, Ian Campbell wrote: On Mon, 2015-06-01 at 09:36 +0100, George Dunlap wrote: Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Wei Liu wei@citrix.com Most people put the Cc about the cut (---) which is fine too. It means the history ends up recording who was copied on the patch, which isn't necessarily a bad thing. Right -- I would have thought that was useless information cluttering up the history, but I can see how it might actually be useful. Should I start putting my CC's above the ---? :-) And should I stop dropping them even when above the --- for commit, which so far I've been doing as I don't consider this particularly useful information (other then e.g. who might have commented on a change without it being recorded in an Acked-by or Reviewed-by tag)? Is the CC list useful to keep in history? It ends up being the list of people who didn't respond to it before it got committed (or ignored/missed the email entirely). It is the $FOO'd-by tags which are important when it comes to judging the acceptability of a patch. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] The size of memory is wrong inside of virtual machine(VM) when using OVMF
Hi all: I encountered a troublesome problem about OVMF. I used OVMF.fd as a BIOS of virtual machine(VM). 1、my environment: xen_version: 4.6-unstable I git clone xen from git://xenbits.xen.org/xen.git. configure it and parameters as below: ./configure --prefix=/usr/ --libdir=/usr/lib/ --enable-ovmf then make make install, after that I reboot my host OS(suse11sp3). There is a same problem in KVM too. 2、problem: I started a VM whose memory is 64G in the host. The size of memory inside of the VM is wrong while it is OK when I check it in the host using xl list. But, when I changed the memory to 63G,it was both OK. (1)64G memory inside of VM using free to check: total used free sharedbuffers cached Mem: 3715640 4959163219724 0 22060 175136 -/+ buffers/cache: 2987203416920 Swap: 4046840 04046840 It is only 3.5G. in the host using xl list to check: NameID Mem VCPUs State Time(s) Domain-0 0 305516 r- 861.2 redhat 4 65536 2 r- 5.9 (2)63G memory inside of VM using free to check: total used free sharedbuffers cached Mem: 649182241083912 63834312 0 22188 175344 -/+ buffers/cache: 886380 64031844 Swap: 4046840 04046840 It is OK. in the host using xl list to check: NameID Mem VCPUs State Time(s) Domain-0 0 305516 r- 821.2 redhat 3 64512 2 -b 48.5 3、my cfg file: builder = hvm name = redhat memory = 65536 maxmem = 65536 vcpus = 2 bios = ovmf boot = dc sdl=0 disk = [ 'file:/test/image/redhat6_3.img,xvda,w' ] vnc = 1 vnclisten = '9.61.1.31' vncdisplay = 0 Any thought on this? Any help will be appreciated. If you need some other information, please let me know. :) Thanks! -mao ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-unstable test] 57644: regressions - trouble: blocked/broken/fail/pass
On Mon, 2015-06-01 at 03:40 +, osstest service user wrote: flight 57644 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/57644/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl-cubietruck 3 host-install(3) broken REGR. vs. 57419 build-armhf 3 host-install(3) broken in 57567 REGR. vs. 57419 On cubietruck-gleizes: May 31 09:18:19.472504 !! ERROR: No root file system May 31 09:18:20.219914 May 31 09:18:20.219980 No root file system is defined. May 31 09:18:20.222861 May 31 09:18:20.222988 Please correct this from the partitioning menu. cubietruck-{metzinger,picasso} suffered the same thing last week and were suddenly and inexplicably unable to see their disks. I unblessed those two and I've now unblessed gleizes as well. That leaves us with just cubietruck-braque and the 4 arndale boards running in production for arm32 testing. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [qemu-mainline test] 57666: regressions - trouble: broken/fail/pass
flight 57666 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/57666/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl-multivcpu 17 leak-check/check fail REGR. vs. 56831 Tests which are failing intermittently (not blocking): test-armhf-armhf-xl-cubietruck 3 host-install(3) broken pass in 57591 test-armhf-armhf-xl-sedf 3 host-install(3) broken pass in 57591 test-amd64-i386-freebsd10-amd64 16 guest-localmigrate/x10 fail in 57591 pass in 57666 test-amd64-i386-xl-qemuu-win7-amd64 9 windows-install fail in 57591 pass in 57666 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-qemuu-win7-amd64 9 windows-install fail in 57591 like 56784 test-armhf-armhf-libvirt 11 guest-start fail like 56784 test-amd64-i386-libvirt 11 guest-start fail like 56831 test-amd64-amd64-libvirt 11 guest-start fail like 56831 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail in 57591 never pass test-armhf-armhf-xl-sedf 12 migrate-support-check fail in 57591 never pass test-amd64-i386-xl-xsm 14 guest-localmigrate fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-xl-xsm 14 guest-localmigrate fail never pass test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass test-amd64-amd64-libvirt-xsm 11 guest-start fail never pass test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 11 guest-start fail never pass test-armhf-armhf-xl-sedf-pin 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass version targeted for testing: qemuu97af820f539efe80b87615a04f9de11ea585f725 baseline version: qemuueba05e922e8e7f307bc5d4104a78797e55124e97 People who touched revisions under test: Alberto Garcia be...@igalia.com Bastian Koppelmann kbast...@mail.uni-paderborn.de Christoph Hellwig h...@lst.de Cole Robinson crobi...@redhat.com Cornelia Huck cornelia.h...@de.ibm.com Daniel P. Berrange berra...@redhat.com Denis V. Lunev d...@openvz.org Edgar E. Iglesias edgar.igles...@xilinx.com Eric Blake ebl...@redhat.com Eric Farman far...@linux.vnet.ibm.com Fabien Chouteau chout...@adacore.com Fam Zheng f...@redhat.com Gerd Hoffmann kra...@redhat.com Greg Bellows greg.bell...@linaro.org Jan Kiszka jan.kis...@siemens.com John Snow js...@redhat.com Ján Tomko jto...@redhat.com Keith Busch keith.bu...@intel.com Kevin Wolf kw...@redhat.com Marc-André Lureau marcandre.lur...@gmail.com Mark Cave-Ayland mark.cave-ayl...@ailande.co.uk Mark Cave-Ayland mark.cave-ayl...@ilande.co.uk Markus Armbruster arm...@redhat.com Max Reitz mre...@redhat.com Paolo Bonzini pbonz...@redhat.com Peter Maydell peter.mayd...@linaro.org Richard Henderson r...@twiddle.net Richard W.M. Jones rjo...@redhat.com Roman Kagan rka...@parallels.com Shannon Zhao shannon.z...@linaro.org Shannon Zhao zhaoshengl...@huawei.com Stefan Hajnoczi stefa...@redhat.com Thomas Huth th...@redhat.com jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops
Re: [Xen-devel] [xen-4.4-testing test] 57474: tolerable trouble: blocked/broken/fail/pass - PUSHED
On 29.05.15 at 20:51, osst...@xenbits.xen.org wrote: flight 57474 xen-4.4-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/57474/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-armhf-armhf-xl-multivcpu 3 host-install(3) broken pass in 57411 test-armhf-armhf-xl-credit2 3 host-install(3) broken pass in 57411 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-localmigrate.2 fail in 57411 pass in 57474 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail pass in 57411 So this issue is visible on 4.4 too (the earlier flight 57411 also suggests so). Interestingly the two 4.5 flights that completed over the weekend do not display it (but there's no reason to believe it's gone). Is this perhaps tied to a particular host? If so, would it be possible to set up a custom bisection on just that host? If not, would setting up a custom bisection taking into account the randomness of the failure be an option? Otoh hand the issue displaying even on 4.3 (which receives only security updates, and whose QEMU_UPSTREAM_REVISION is still at qemu-xen-4.3.4) should limit the set of possibly offending commits quite drastically (unless of course this is an old issue that for some reason started to surface now, or there was a qemuu specific adjustment in osstest itself). Flight 50315 (Apr 4) testing 46ed0083a7 appears to be the first one (while earlier flights in March have random test-amd64-i386-pair guest-migrate/src_host/dst_host failures, aiui this is a PV migration test and hence qemuu unrelated). Yet that commit to me doesn't look at all like it would have the potential of introducing (random) migration failures. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 for Xen 4.6 0/4] Enabling XL to set per-VCPU parameters of a domain for RTDS scheduler
On 01.06.15 at 10:50, george.dun...@eu.citrix.com wrote: On 06/01/2015 09:48 AM, Ian Campbell wrote: On Mon, 2015-06-01 at 09:36 +0100, George Dunlap wrote: Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Wei Liu wei@citrix.com Most people put the Cc about the cut (---) which is fine too. It means the history ends up recording who was copied on the patch, which isn't necessarily a bad thing. Right -- I would have thought that was useless information cluttering up the history, but I can see how it might actually be useful. Should I start putting my CC's above the ---? :-) And should I stop dropping them even when above the --- for commit, which so far I've been doing as I don't consider this particularly useful information (other then e.g. who might have commented on a change without it being recorded in an Acked-by or Reviewed-by tag)? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket
On Mon, Jun 01, 2015 at 09:36:15AM +0100, Jan Beulich wrote: On 01.06.15 at 10:05, chao.p.p...@linux.intel.com wrote: 2) Unfeasible to allocate memory first and do initialization later in cpu hotplug notifications. My former approach is performing both the allocation and initialization in the CPU_STARTING, which is not a good idea indicated by Jan. Considering info-cos_max = min(opt_cos_max, edx 0x); I don't see why you couldn't do an allocation using opt_cos_max in CPU_UP_PREPARE, potentially not using all of the slots later on. If memory waste is not a problem (especially in most cases opt_cos_max is its default 255), then this would be a solution. As you have suggested this, then I think I can go on this path and still have asymmetry systems been supported. Thanks, Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 for Xen 4.6 0/4] Enabling XL to set per-VCPU parameters of a domain for RTDS scheduler
On Mon, 2015-06-01 at 09:50 +0100, George Dunlap wrote: On 06/01/2015 09:48 AM, Ian Campbell wrote: On Mon, 2015-06-01 at 09:36 +0100, George Dunlap wrote: Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Wei Liu wei@citrix.com Most people put the Cc about the cut (---) which is fine too. It means the history ends up recording who was copied on the patch, which isn't necessarily a bad thing. Right -- I would have thought that was useless information cluttering up the history, but I can see how it might actually be useful. Should I start putting my CC's above the ---? :-) FWIW, pretty much everyone else does. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST] Toolstack: Do not pass -F to xm shutdown (Was: Re: [xen-4.2-testing test] 57630: regressions - FAIL)
On Mon, 2015-06-01 at 09:34 +0100, Ian Campbell wrote: From aa79bd958fe42da2876fd76e61468c183a7fabdc Mon Sep 17 00:00:00 2001 From: Ian Campbell ian.campb...@citrix.com Date: Mon, 1 Jun 2015 09:32:37 +0100 Subject: [PATCH] Toolstack: Do not pass -F to xm shutdown This is a feature of xl only. Signed-off-by: Ian Campbell ian.campb...@citrix.com With Ian being away for a bit and this blocking testing of older branches which contain xend I've pushed this to osstest's pretest branch. Ian. --- Osstest/Toolstack/xl.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Osstest/Toolstack/xl.pm b/Osstest/Toolstack/xl.pm index dd12ae1..23328d3 100644 --- a/Osstest/Toolstack/xl.pm +++ b/Osstest/Toolstack/xl.pm @@ -56,7 +56,7 @@ sub shutdown_wait ($$$) { my $ho = $self-{Host}; my $gn = $gho-{Name}; my $acpi_fallback = guest_var($gho,'acpi_shutdown','false') eq 'true' - ? F : ; + $self-{Name} eq 'xl' ? F : ; target_cmd_root($ho,$self-{_Command} shutdown -w${acpi_fallback} $gn, $timeout); } ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 for Xen 4.6 0/4] Enabling XL to set per-VCPU parameters of a domain for RTDS scheduler
On Mon, 2015-06-01 at 09:36 +0100, George Dunlap wrote: Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Wei Liu wei@citrix.com Most people put the Cc about the cut (---) which is fine too. It means the history ends up recording who was copied on the patch, which isn't necessarily a bad thing. 8-- And when you do git-send-email, it will automatically CC Ian and Wei (and only them) for this patch. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [linux-3.4 test] 57661: regressions - FAIL
flight 57661 linux-3.4 real [real] http://logs.test-lab.xenproject.org/osstest/logs/57661/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl 9 debian-install fail REGR. vs. 52209-bisect test-amd64-amd64-pair 15 debian-install/dst_host fail REGR. vs. 52715-bisect test-amd64-i386-pair15 debian-install/dst_host fail REGR. vs. 56366-bisect Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-sedf 9 debian-installfail blocked in 56366-bisect test-amd64-amd64-libvirt 9 debian-installfail blocked in 56366-bisect test-amd64-amd64-xl-qemut-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-libvirt 9 debian-installfail blocked in 56366-bisect test-amd64-i386-xl-qemut-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-qemuu-rhel6hvm-intel 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-qemut-rhel6hvm-intel 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl-qemuu-ovmf-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-freebsd10-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-xl-multivcpu 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-libvirt-xsm 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl9 debian-installfail blocked in 56366-bisect test-amd64-i386-xl-qemuu-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-xl-qemut-winxpsp3-vcpus1 6 xen-boot fail blocked in 56366-bisect test-amd64-i386-rumpuserxen-i386 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-xl-qemuu-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-rumpuserxen-amd64 6 xen-bootfail blocked in 56366-bisect test-amd64-i386-xl-qemuu-winxpsp3 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-xl-sedf-pin 9 debian-installfail blocked in 56366-bisect test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail blocked in 56366-bisect test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail blocked in 56366-bisect test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail blocked in 56366-bisect test-amd64-i386-freebsd10-i386 6 xen-bootfail blocked in 56366-bisect test-amd64-amd64-xl-qemuu-winxpsp3 6 xen-bootfail blocked in 56366-bisect test-amd64-i386-xl-qemut-winxpsp3 6 xen-boot fail blocked in 56366-bisect test-amd64-amd64-xl-qemut-winxpsp3 6 xen-bootfail blocked in 56366-bisect test-amd64-amd64-xl-qemuu-ovmf-amd64 6 xen-bootfail like 53709-bisect Tests which did not succeed, but are not blocking: test-amd64-i386-xl-xsm9 debian-install fail never pass test-amd64-amd64-xl-pvh-amd 9 debian-install fail never pass test-amd64-amd64-libvirt-xsm 9 debian-install fail never pass test-amd64-amd64-xl-credit2 9 debian-install fail never pass test-amd64-amd64-xl-pvh-intel 9 debian-install fail never pass test-amd64-amd64-xl-xsm 9 debian-install fail never pass test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass version targeted for testing: linux56b48fcda5076d4070ab00df32ff5ff834e0be86 baseline version: linuxbb4a05a0400ed6d2f1e13d1f82f289ff74300a70 370 people touched revisions under test, not listing them all jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl fail test-amd64-i386-xl fail test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmfail
Re: [Xen-devel] [PATCH 4/4] iommu: add rmrr Xen command line option for extra rmrrs
On 01.06.15 at 08:30, kevin.t...@intel.com wrote: From: elena.ufimts...@oracle.com [mailto:elena.ufimts...@oracle.com] +rmrrn = xzalloc(struct acpi_rmrr_unit); +if ( !rmrrn ) +return; + +rmrrn-scope.devices = xmalloc_array(u16, + rmrru[i].dev_count); +if ( !rmrrn-scope.devices ) +{ +xfree(rmrrn); +return; +} + +seg = 0; +for ( dev = 0; dev rmrru-dev_count; dev++ ) +{ +rmrrn-scope.devices[dev] = rmrru-sbdf[dev]; +seg = seg | PCI_SEG(rmrru-sbdf[dev] 16); +} +if ( seg != PCI_SEG(rmrru-sbdf[0]) ) +{ +printk(XENLOG_ERR VTDPREFIX + Segments are not equal for RMRR range [%PRIx64 - %PRIx64]\n, + rmrru-base_pfn, rmrru-end_pfn); +xfree(rmrrn-scope.devices); +xfree(rmrrn); +continue; +} This check should be moved earlier along with other checks. I think that would make code worse, as it would mean breaking up the loop above (right now it depends on the memory allocations left in above for context). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [Formal Vote] Changes to Xen Project Security Vulnerability Process - Open until June 8th, 2015
Hi, in accordance with the project's governance, I would like to put the following text changes to a committer vote (committers are on the TO list). The discussion leading to the changes can be found at http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg02881.html http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg02881.html Please vote +1, 0, -1 with explanation as usual. You can reply publicly or in private and I will collate results on the 9th. Regards Lars Old text in http://www.xenproject.org/security-policy.html http://www.xenproject.org/security-policy.html --- Specific process ... 4. Advisory pre-release: This occurs only if the advisory is embargoed (ie, the problem is not already public): As soon as our advisory is available, we will send it, including patches, to members of the Xen security pre-disclosure list. For more information about this list, see below. At this stage the advisory will be clearly marked with the embargo date. --- Proposed text (this adds an additional paragraph, while leaving the existing text as-is): --- Specific process ... 4. Advisory pre-release: This occurs only if the advisory is embargoed (ie, the problem is not already public): As soon as our advisory is available, we will send it, including patches, to members of the Xen security pre-disclosure list. In the event that we do not have a patch available two working weeks before the disclosure date, we aim to send an advisory that reflects the current state of knowledge to the Xen security pre-disclosure list. An updated advisory will be published as soon as available. For more information about this list, see below. At this stage the advisory will be clearly marked with the embargo date. ---___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 6/9] x86/intel_pstate: the main boby of the intel_pstate driver
On 01.06.15 at 11:12, wei.w.w...@intel.com wrote: On 29/05/2015 16:46, Jan Beulich wrote On 29.05.15 at 10:19, wei.w.w...@intel.com wrote: On 26/05/2015 21:58, Jan Beulich wrote On 13.05.16 at 09:50, wei.w.w...@intel.com wrote: +static int intel_pstate_verify_policy(struct cpufreq_policy +*policy) { +cpufreq_verify_within_limits(policy, policy-cpuinfo.min_freq, + policy-cpuinfo.max_freq); + +if ( policy-policy != CPUFREQ_POLICY_POWERSAVE + policy-policy != CPUFREQ_POLICY_PERFORMANCE + policy-policy != CPUFREQ_POLICY_USERSPACE + policy-policy != CPUFREQ_POLICY_ONDEMAND ) switch() How would we use switch() here? switch ( policy-policy ) { case CPUFREQ_POLICY_POWERSAVE: etc. But I thought that to be obvious, so I'm not sure I understand what you don't understand. I thought there would be a special usage of switch() here, but no. So, using switch, we will have switch ( policy-policy ) { case CPUFREQ_POLICY_POWERSAVE: case CPUFREQ_POLICY_PERFORMANCE: case CPUFREQ_POLICY_USERSPACE: case CPUFREQ_POLICY_ONDEMAND: return 0; case default: return -EINVAL } Is there a particular reason why we need to change to this style? I think using if() looks more straightforward, since this is just a condition check. Well, a good compiler will avoid fetching policy-policy anyway at least when optimizing, but I consider it good practice to guide the compiler in the right direction (including non-optimized builds). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2] xen: netback: fix printf format string warning
drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’: drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=] (txreq.offset~PAGE_MASK) + txreq.size); ^ PAGE_MASK's type can vary by arch, so a cast is needed. Signed-off-by: Ian Campbell ian.campb...@citrix.com v2: Cast to unsigned long, since PAGE_MASK can vary by arch. --- drivers/net/xen-netback/netback.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 4de46aa..0d25943 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -1250,7 +1250,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue, netdev_err(queue-vif-dev, txreq.offset: %x, size: %u, end: %lu\n, txreq.offset, txreq.size, - (txreq.offset~PAGE_MASK) + txreq.size); + (unsigned long)(txreq.offset~PAGE_MASK) + txreq.size); xenvif_fatal_tx_err(queue-vif); break; } -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] drivers: xen-blkback: delay pending_req allocation to connect_ring
On 06/01/2015 04:36 PM, Roger Pau Monné wrote: El 26/05/15 a les 2.06, Bob Liu ha escrit: In connect_ring, we can know exactly how many pages are used for the shared ring and also whether feature-persistent is enabled, delay pending_req allocation here so that we won't waste too much memory. I would very much prefer for this to be a pre-patch for your multipage ring series. Do you think you can include it in the next iteration? I think it's unnecessary to send an new iteration if there isn't any new comments about multi-page ring series. This patch is meaningful only after multi-page ring series, else there is no difference(no memory can be saved). So I think it's fine for this patch coming after multi-page ring series. Signed-off-by: Bob Liu bob@oracle.com --- drivers/block/xen-blkback/common.h | 3 +- drivers/block/xen-blkback/xenbus.c | 95 -- 2 files changed, 51 insertions(+), 47 deletions(-) diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h index 919a1ab..e1d605d 100644 --- a/drivers/block/xen-blkback/common.h +++ b/drivers/block/xen-blkback/common.h @@ -249,7 +249,7 @@ struct backend_info; #define PERSISTENT_GNT_WAS_ACTIVE 1 /* Number of requests that we can fit in a ring */ -#define XEN_MAX_BLKIF_REQS (32 * XENBUS_MAX_RING_PAGES) +#define XEN_BLKIF_REQS 32 This should be XEN_BLKIF_REQS_PER_PAGE (or a similar name of your choice that reflects that those are the number of requests per ring page). struct persistent_gnt { struct page *page; @@ -321,6 +321,7 @@ struct xen_blkif { struct work_struct free_work; /* Thread shutdown wait queue. */ wait_queue_head_t shutdown_wq; +unsigned int nr_ring_pages; }; struct seg_buf { diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index bc33888..48336a3 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -125,8 +125,6 @@ static void xen_update_blkif_status(struct xen_blkif *blkif) static struct xen_blkif *xen_blkif_alloc(domid_t domid) { struct xen_blkif *blkif; -struct pending_req *req, *n; -int i, j; BUILD_BUG_ON(MAX_INDIRECT_PAGES BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST); @@ -153,50 +151,11 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid) INIT_LIST_HEAD(blkif-pending_free); INIT_WORK(blkif-free_work, xen_blkif_deferred_free); -for (i = 0; i XEN_MAX_BLKIF_REQS; i++) { -req = kzalloc(sizeof(*req), GFP_KERNEL); -if (!req) -goto fail; -list_add_tail(req-free_list, - blkif-pending_free); -for (j = 0; j MAX_INDIRECT_SEGMENTS; j++) { -req-segments[j] = kzalloc(sizeof(*req-segments[0]), - GFP_KERNEL); -if (!req-segments[j]) -goto fail; -} -for (j = 0; j MAX_INDIRECT_PAGES; j++) { -req-indirect_pages[j] = kzalloc(sizeof(*req-indirect_pages[0]), - GFP_KERNEL); -if (!req-indirect_pages[j]) -goto fail; -} -} spin_lock_init(blkif-pending_free_lock); init_waitqueue_head(blkif-pending_free_wq); init_waitqueue_head(blkif-shutdown_wq); return blkif; - -fail: -list_for_each_entry_safe(req, n, blkif-pending_free, free_list) { -list_del(req-free_list); -for (j = 0; j MAX_INDIRECT_SEGMENTS; j++) { -if (!req-segments[j]) -break; -kfree(req-segments[j]); -} -for (j = 0; j MAX_INDIRECT_PAGES; j++) { -if (!req-indirect_pages[j]) -break; -kfree(req-indirect_pages[j]); -} -kfree(req); -} - -kmem_cache_free(xen_blkif_cachep, blkif); - -return ERR_PTR(-ENOMEM); } static int xen_blkif_map(struct xen_blkif *blkif, grant_ref_t *gref, @@ -313,7 +272,7 @@ static void xen_blkif_free(struct xen_blkif *blkif) i++; } -WARN_ON(i != XEN_MAX_BLKIF_REQS); +WARN_ON(i != XEN_BLKIF_REQS * blkif-nr_ring_pages); kmem_cache_free(xen_blkif_cachep, blkif); } @@ -868,9 +827,10 @@ static int connect_ring(struct backend_info *be) struct xenbus_device *dev = be-dev; unsigned int ring_ref[XENBUS_MAX_RING_PAGES]; unsigned int evtchn, nr_grefs, ring_page_order; -unsigned int pers_grants; +unsigned int pers_grants, i, j; +struct pending_req *req, *n; char protocol[64] = ; -int err; +int err, nr_indiret_pages, nr_segs;
Re: [Xen-devel] [PATCH v2 6/9] x86/intel_pstate: the main boby of the intel_pstate driver
On 29/05/2015 16:46, Jan Beulich wrote On 29.05.15 at 10:19, wei.w.w...@intel.com wrote: On 26/05/2015 21:58, Jan Beulich wrote On 13.05.16 at 09:50, wei.w.w...@intel.com wrote: +static int intel_pstate_verify_policy(struct cpufreq_policy +*policy) { +cpufreq_verify_within_limits(policy, policy-cpuinfo.min_freq, + policy-cpuinfo.max_freq); + +if ( policy-policy != CPUFREQ_POLICY_POWERSAVE + policy-policy != CPUFREQ_POLICY_PERFORMANCE + policy-policy != CPUFREQ_POLICY_USERSPACE + policy-policy != CPUFREQ_POLICY_ONDEMAND ) switch() How would we use switch() here? switch ( policy-policy ) { case CPUFREQ_POLICY_POWERSAVE: etc. But I thought that to be obvious, so I'm not sure I understand what you don't understand. I thought there would be a special usage of switch() here, but no. So, using switch, we will have switch ( policy-policy ) { case CPUFREQ_POLICY_POWERSAVE: case CPUFREQ_POLICY_PERFORMANCE: case CPUFREQ_POLICY_USERSPACE: case CPUFREQ_POLICY_ONDEMAND: return 0; case default: return -EINVAL } Is there a particular reason why we need to change to this style? I think using if() looks more straightforward, since this is just a condition check. +static int intel_pstate_msrs_not_valid(void) { +/* Check that all the msr's we are using are valid. */ +u64 aperf, mperf, tmp; + +rdmsrl(MSR_IA32_APERF, aperf); +rdmsrl(MSR_IA32_MPERF, mperf); + +if (!pstate_funcs.get_max() || +!pstate_funcs.get_min() || +!pstate_funcs.get_turbo()) +return -ENODEV; + +rdmsrl(MSR_IA32_APERF, tmp); +if (!(tmp - aperf)) Why not if(tmp == aperf)? And - is it guaranteed that APERF (and MPERF below) incremented in the meantime? Will change it to if (tmp == aperf). According to the SDM, IA32_MPERF MSR (E7H) increments in proportion to a fixed frequency, and IA32_APERF MSR increments in proportion to actual performance. So, as long as the two MSRs are valid, their values will be increased. The in proportion is what makes me nervous: What if the proportion is 1 in every 1000 cycles? Right, this may cause a problem. I will remove that part. Then, it will be: static int intel_pstate_msrs_not_valid(void) { if ( !pstate_funcs.get_max() || !pstate_funcs.get_min() || !pstate_funcs.get_turbo()) return -ENODEV; return 0; } +int __init intel_pstate_init(void) { +int cpu, rc = 0; +const struct x86_cpu_id *id; +struct cpu_defaults *cpu_info; + +if (cpuid_ecx(6) 0x1) +set_bit(X86_FEATURE_APERFMPERF, + boot_cpu_data.x86_capability); This should be consolidated out of the other cpufreq drivers into cpu/common.c. How about moving it to the bottom of init_intel() in cpu/intel.c ? It's not Intel specific, so it belongs in cpu/common.c. Ok, I will put it into the cpu_init() function there. --- a/xen/include/asm-x86/acpi.h +++ b/xen/include/asm-x86/acpi.h @@ -32,6 +32,8 @@ #define COMPILER_DEPENDENT_INT64 long long #define COMPILER_DEPENDENT_UINT64 unsigned long long +extern int intel_pstate_init(void); Why in acpi.h? I was thinking that p-state is part of ACPI. Do you prefer to create a new .h called cpufreq.h, just like the cpuidle.h there? ACPI is a way to convey information about P- (and C- and T-) states, but the latter are imo not tied to ACPI. In fact your patch series here proves the opposite: You add code dealing with P-states without using information coming from ACPI. I think this should go into what currently is acpi/cpufreq/cpufreq.h, which is expected to be moved out of acpi/ anyway (I seem to even recall an ARM side series aiming at doing that). Ok, I will move it back to acpi/cpufreq/cpufreq.h. Best, Wei ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxl: Don't insert PCI device into xenstore for HVM guests
On Fri, May 29, 2015 at 8:59 AM, Ross Lagerwall ross.lagerw...@citrix.com wrote: When doing passthrough of a PCI device for an HVM guest, don't insert the device into xenstore, otherwise pciback attempts to use it which conflicts with QEMU. This manifests itself such that the first time a device is passed to a domain, it succeeds. Subsequent attempts fail unless the device is unbound from pciback or the machine rebooted. Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com --- tools/libxl/libxl_pci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c index e0743f8..2552889 100644 --- a/tools/libxl/libxl_pci.c +++ b/tools/libxl/libxl_pci.c @@ -994,7 +994,7 @@ out: } } -if (!starting) +if (!starting type == LIBXL_DOMAIN_TYPE_PV) rc = libxl__device_pci_add_xenstore(gc, domid, pcidev, starting); It looks like device_pci_list reads backend/pci/ for the list of assigned devices. Does libxl__device_pci_list() still work after this change? -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen BUG at page_alloc.c:1738 (Xen 4.5)
On Mon, 1 Jun 2015, Jan Beulich wrote: On 31.05.15 at 00:43, andrew.coop...@citrix.com wrote: On 30/05/2015 23:07, M A Young wrote: On Fri, 29 May 2015, Andrew Cooper wrote: FC22 is miscompiling the C to: struct page_info *page = mfn_to_page(mfn); struct domain *owner = page_get_owner_and_reference(page); if ( owner ) put_page(mfn_to_page(0)); which is wrong, and why free_domheap_pages() does legitimately complain about the wonky refcount. With a bit of experimentation I have found that compiling with the -fno-caller-saves flag gets this code segment back to the Fedora 21 version, thus avoiding the bug. After sending this email, I wondered whether the optimiser as assuming that %rdi was preserved. Indeed, it turns out that the generated code for page_get_owner_and_reference leaves %rdi unmodified, and safe for reuse after return. If the 'mov %r8,%rdi' were simply omitted, the code would work, as %rdi still contains the correct result of the original calculation. And %r8 is known to be preserved too? Therefore, I suspect that the bug is in the -fcaller-saves optimisation code. I suppose together with us allowing it to do such for global functions by marking everything hidden (i.e. something possibly not seeing much testing). Questions now are: 1) Was a bug against gcc opened already? 2) What do we do about it? Working around the issue by setting -fno-caller-saves seems awkward, as we'd likely have nothing but the gcc version to tie this to. And considering distros carry their own patch sets, the version alone may not even be enough. (I didn't see any reports against our tip facing a similar issue despite it being built with gcc 5 now too afaik.) There is a Fedora bug on this https://bugzilla.redhat.com/show_bug.cgi?id=1219197 which I updated and reassigned to gcc yesterday. Michael Young ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-4.3-testing test] 57600: regressions - trouble: blocked/broken/fail/pass
On 31.05.15 at 19:18, osst...@xenbits.xen.org wrote: flight 57600 xen-4.3-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/57600/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-xend-qemut-winxpsp3 16 guest-stop fail REGR. vs. 53768 Same as 4.2 here - option -F used but not recognized. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] drivers: xen-blkfront: blkif_recover: recheck feature-persistent
On 06/01/2015 03:50 PM, Roger Pau Monné wrote: El 26/05/15 a les 2.11, Bob Liu ha escrit: When migrate from !feature-persistent host to feature-persistent host, domU still think new host/backend don't support persistent. Dmesg like: backed has not unmapped grant: 839 backed has not unmapped grant: 773 backed has not unmapped grant: 773 backed has not unmapped grant: 773 backed has not unmapped grant: 839 We should recheck whether the new backend support feature-persistent during blkif_recover(). Right, we recheck for indirect-descriptors but not persistent grants. Do you think it makes sense to split the part of blkfront_connect that checks for optional features, like persistent grants, indirect descriptors and flush/barrier features to a separate function and call it from both blkfront_connect and blkif_recover? Yep, that would be better. Thanks, -Bob ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 for Xen 4.6 0/4] Enabling XL to set per-VCPU parameters of a domain for RTDS scheduler
On 06/01/2015 09:48 AM, Ian Campbell wrote: On Mon, 2015-06-01 at 09:36 +0100, George Dunlap wrote: Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Wei Liu wei@citrix.com Most people put the Cc about the cut (---) which is fine too. It means the history ends up recording who was copied on the patch, which isn't necessarily a bad thing. Right -- I would have thought that was useless information cluttering up the history, but I can see how it might actually be useful. Should I start putting my CC's above the ---? :-) -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH OSSTEST] Toolstack: Do not pass -F to xm shutdown (Was: Re: [xen-4.2-testing test] 57630: regressions - FAIL)
On 01.06.15 at 10:34, ian.campb...@citrix.com wrote: Subject: [PATCH] Toolstack: Do not pass -F to xm shutdown This is a feature of xl only. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- Osstest/Toolstack/xl.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Osstest/Toolstack/xl.pm b/Osstest/Toolstack/xl.pm index dd12ae1..23328d3 100644 --- a/Osstest/Toolstack/xl.pm +++ b/Osstest/Toolstack/xl.pm @@ -56,7 +56,7 @@ sub shutdown_wait ($$$) { my $ho = $self-{Host}; my $gn = $gho-{Name}; my $acpi_fallback = guest_var($gho,'acpi_shutdown','false') eq 'true' - ? F : ; + $self-{Name} eq 'xl' ? F : ; target_cmd_root($ho,$self-{_Command} shutdown -w${acpi_fallback} $gn, $timeout); Well - to me, not knowing much about osstest's structure, catering for xm in a file called xl.pm seems odd. And then, taking into consideration a hypothetical 3rd toolstack, I wonder whether assuming only xl supports -F (instead of only xm does not support it) is the more suitable check. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 for Xen 4.6 0/4] Enabling XL to set per-VCPU parameters of a domain for RTDS scheduler
On 06/01/2015 10:06 AM, Andrew Cooper wrote: On 01/06/15 09:58, Jan Beulich wrote: On 01.06.15 at 10:50, george.dun...@eu.citrix.com wrote: On 06/01/2015 09:48 AM, Ian Campbell wrote: On Mon, 2015-06-01 at 09:36 +0100, George Dunlap wrote: Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Wei Liu wei@citrix.com Most people put the Cc about the cut (---) which is fine too. It means the history ends up recording who was copied on the patch, which isn't necessarily a bad thing. Right -- I would have thought that was useless information cluttering up the history, but I can see how it might actually be useful. Should I start putting my CC's above the ---? :-) And should I stop dropping them even when above the --- for commit, which so far I've been doing as I don't consider this particularly useful information (other then e.g. who might have commented on a change without it being recorded in an Acked-by or Reviewed-by tag)? Is the CC list useful to keep in history? It ends up being the list of people who didn't respond to it before it got committed (or ignored/missed the email entirely). It is the $FOO'd-by tags which are important when it comes to judging the acceptability of a patch. OK, well that's 3 of us so far who think it's sort of useless, including someone who things it's useless enough to spend the effort deleting them. Maybe we can leave the bike shed the color it is at the moment. ;-) -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST] Toolstack: Do not pass -F to xm shutdown (Was: Re: [xen-4.2-testing test] 57630: regressions - FAIL)
On 01.06.15 at 11:42, ian.campb...@citrix.com wrote: On Mon, 2015-06-01 at 10:21 +0100, Jan Beulich wrote: On 01.06.15 at 10:34, ian.campb...@citrix.com wrote: Subject: [PATCH] Toolstack: Do not pass -F to xm shutdown This is a feature of xl only. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- Osstest/Toolstack/xl.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Osstest/Toolstack/xl.pm b/Osstest/Toolstack/xl.pm index dd12ae1..23328d3 100644 --- a/Osstest/Toolstack/xl.pm +++ b/Osstest/Toolstack/xl.pm @@ -56,7 +56,7 @@ sub shutdown_wait ($$$) { my $ho = $self-{Host}; my $gn = $gho-{Name}; my $acpi_fallback = guest_var($gho,'acpi_shutdown','false') eq 'true' - ? F : ; + $self-{Name} eq 'xl' ? F : ; target_cmd_root($ho,$self-{_Command} shutdown -w${acpi_fallback} $gn, $timeout); Well - to me, not knowing much about osstest's structure, catering for xm in a file called xl.pm seems odd. And then, taking into consideration a hypothetical 3rd toolstack, I wonder whether assuming only xl supports -F (instead of only xm does not support it) is the more suitable check. xend.pm inherits from xl.pm in a oo-ish way. I did consider adding shutdown_wait to xend.pm to overload this one, but that meant duplicating 95% unaltered. A third toolstack is not hypothetical, libvirt's virsh is supported, but virsh.pm sharing no historical legacy with xl/xm inherets from the toplevel toolstack parent, not the xl one. Makes sense, thanks for explaining. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Running 3.9 kernel in domU with 3.11 kernel in dom0
On 30/05/15 14:43, Gautam Malu wrote: Hi, Hello, It's not necessary to send the same mail twice at one day of interval. The mail will be answered as soon as we can... The xen-arm mailing list has been archived few months ago and this question should be asked on xen-users given it's more a configuration issue than an hypervisor issue for now. I am running kernel 3.11-rc3 on arndale exynos 5250 board with xen 4.5 stack. It works fine with domU with kernel 3.17+ with ubutnu. I am trying to run android as domU, so I am using kernel 3.9 in domU. With kernel 3.9 in domU, xen doesn't even boot the kernel at all. Here is the strace log http://pastebin.com/JMtg9mSY I'm afraid that strace xl can't give any useful output for your problem. It would be more useful to get the log of Xen and xl (xl -vvv create ...). Is this any ABI compatibility issue, as I was reading http://wiki.xenproject.org/wiki/Xen_ARM_with_Virtualization_Extensions#Hypervisor_ABI_Compatibility But both dom0 (3.11-rc3)and domU kernel(3.9) are not patched with this given patch http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=380108d891acf8db5cf0d477176c7ed2b62b7928 . So they should be able to interpolate between themselves. Somehow kernel from arndale-domU-3.9 branch of http://xenbits.xen.org/git-http/people/julieng/linux-arm.git/ boots with same setup. What do you mean by same setup? Same DOM0? Same Xen? Same kernel config file? Are you using a android kernel or upstream? Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v8 03/13] x86: maintain COS to CBM mapping for each socket
On 01.06.15 at 10:05, chao.p.p...@linux.intel.com wrote: 2) Unfeasible to allocate memory first and do initialization later in cpu hotplug notifications. My former approach is performing both the allocation and initialization in the CPU_STARTING, which is not a good idea indicated by Jan. Considering info-cos_max = min(opt_cos_max, edx 0x); I don't see why you couldn't do an allocation using opt_cos_max in CPU_UP_PREPARE, potentially not using all of the slots later on. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] drivers: xen-blkback: delay pending_req allocation to connect_ring
El 26/05/15 a les 2.06, Bob Liu ha escrit: In connect_ring, we can know exactly how many pages are used for the shared ring and also whether feature-persistent is enabled, delay pending_req allocation here so that we won't waste too much memory. I would very much prefer for this to be a pre-patch for your multipage ring series. Do you think you can include it in the next iteration? Signed-off-by: Bob Liu bob@oracle.com --- drivers/block/xen-blkback/common.h | 3 +- drivers/block/xen-blkback/xenbus.c | 95 -- 2 files changed, 51 insertions(+), 47 deletions(-) diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h index 919a1ab..e1d605d 100644 --- a/drivers/block/xen-blkback/common.h +++ b/drivers/block/xen-blkback/common.h @@ -249,7 +249,7 @@ struct backend_info; #define PERSISTENT_GNT_WAS_ACTIVE1 /* Number of requests that we can fit in a ring */ -#define XEN_MAX_BLKIF_REQS (32 * XENBUS_MAX_RING_PAGES) +#define XEN_BLKIF_REQS 32 This should be XEN_BLKIF_REQS_PER_PAGE (or a similar name of your choice that reflects that those are the number of requests per ring page). struct persistent_gnt { struct page *page; @@ -321,6 +321,7 @@ struct xen_blkif { struct work_struct free_work; /* Thread shutdown wait queue. */ wait_queue_head_t shutdown_wq; + unsigned int nr_ring_pages; }; struct seg_buf { diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index bc33888..48336a3 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -125,8 +125,6 @@ static void xen_update_blkif_status(struct xen_blkif *blkif) static struct xen_blkif *xen_blkif_alloc(domid_t domid) { struct xen_blkif *blkif; - struct pending_req *req, *n; - int i, j; BUILD_BUG_ON(MAX_INDIRECT_PAGES BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST); @@ -153,50 +151,11 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid) INIT_LIST_HEAD(blkif-pending_free); INIT_WORK(blkif-free_work, xen_blkif_deferred_free); - for (i = 0; i XEN_MAX_BLKIF_REQS; i++) { - req = kzalloc(sizeof(*req), GFP_KERNEL); - if (!req) - goto fail; - list_add_tail(req-free_list, - blkif-pending_free); - for (j = 0; j MAX_INDIRECT_SEGMENTS; j++) { - req-segments[j] = kzalloc(sizeof(*req-segments[0]), -GFP_KERNEL); - if (!req-segments[j]) - goto fail; - } - for (j = 0; j MAX_INDIRECT_PAGES; j++) { - req-indirect_pages[j] = kzalloc(sizeof(*req-indirect_pages[0]), - GFP_KERNEL); - if (!req-indirect_pages[j]) - goto fail; - } - } spin_lock_init(blkif-pending_free_lock); init_waitqueue_head(blkif-pending_free_wq); init_waitqueue_head(blkif-shutdown_wq); return blkif; - -fail: - list_for_each_entry_safe(req, n, blkif-pending_free, free_list) { - list_del(req-free_list); - for (j = 0; j MAX_INDIRECT_SEGMENTS; j++) { - if (!req-segments[j]) - break; - kfree(req-segments[j]); - } - for (j = 0; j MAX_INDIRECT_PAGES; j++) { - if (!req-indirect_pages[j]) - break; - kfree(req-indirect_pages[j]); - } - kfree(req); - } - - kmem_cache_free(xen_blkif_cachep, blkif); - - return ERR_PTR(-ENOMEM); } static int xen_blkif_map(struct xen_blkif *blkif, grant_ref_t *gref, @@ -313,7 +272,7 @@ static void xen_blkif_free(struct xen_blkif *blkif) i++; } - WARN_ON(i != XEN_MAX_BLKIF_REQS); + WARN_ON(i != XEN_BLKIF_REQS * blkif-nr_ring_pages); kmem_cache_free(xen_blkif_cachep, blkif); } @@ -868,9 +827,10 @@ static int connect_ring(struct backend_info *be) struct xenbus_device *dev = be-dev; unsigned int ring_ref[XENBUS_MAX_RING_PAGES]; unsigned int evtchn, nr_grefs, ring_page_order; - unsigned int pers_grants; + unsigned int pers_grants, i, j; + struct pending_req *req, *n; char protocol[64] = ; - int err; + int err, nr_indiret_pages, nr_segs; ^ nr_indirect_pages (notice the missing 'c'). pr_debug(%s %s\n, __func__, dev-otherend); @@ -899,8 +859,6 @@ static int connect_ring(struct backend_info *be) pr_info(%s:using single page: ring-ref %d\n, dev-otherend,
Re: [Xen-devel] [PATCH v2] dmar: device scope mem leak fix
On 01.06.15 at 06:47, kevin.t...@intel.com wrote: From: Tian, Kevin Sent: Monday, June 01, 2015 12:43 PM and looks you dropped earlier changes to acpi_parse_one_rmrr. any elaboration why it's not required in this version? Never mind this one. Seems you have it in RMRR patch set. No - it belongs here, not there. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST] Toolstack: Do not pass -F to xm shutdown (Was: Re: [xen-4.2-testing test] 57630: regressions - FAIL)
On Mon, 2015-06-01 at 10:21 +0100, Jan Beulich wrote: On 01.06.15 at 10:34, ian.campb...@citrix.com wrote: Subject: [PATCH] Toolstack: Do not pass -F to xm shutdown This is a feature of xl only. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- Osstest/Toolstack/xl.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Osstest/Toolstack/xl.pm b/Osstest/Toolstack/xl.pm index dd12ae1..23328d3 100644 --- a/Osstest/Toolstack/xl.pm +++ b/Osstest/Toolstack/xl.pm @@ -56,7 +56,7 @@ sub shutdown_wait ($$$) { my $ho = $self-{Host}; my $gn = $gho-{Name}; my $acpi_fallback = guest_var($gho,'acpi_shutdown','false') eq 'true' - ? F : ; +$self-{Name} eq 'xl' ? F : ; target_cmd_root($ho,$self-{_Command} shutdown -w${acpi_fallback} $gn, $timeout); Well - to me, not knowing much about osstest's structure, catering for xm in a file called xl.pm seems odd. And then, taking into consideration a hypothetical 3rd toolstack, I wonder whether assuming only xl supports -F (instead of only xm does not support it) is the more suitable check. xend.pm inherits from xl.pm in a oo-ish way. I did consider adding shutdown_wait to xend.pm to overload this one, but that meant duplicating 95% unaltered. A third toolstack is not hypothetical, libvirt's virsh is supported, but virsh.pm sharing no historical legacy with xl/xm inherets from the toplevel toolstack parent, not the xl one. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-unstable test] 57488: regressions - trouble: broken/fail/pass
On 30.05.15 at 01:24, osst...@xenbits.xen.org wrote: flight 57488 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/57488/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-freebsd10-amd64 14 guest-saverestore.2fail REGR. vs. 57419 test-amd64-amd64-xl-qemuu-win7-amd64 9 windows-install fail REGR. vs. 57419 test-amd64-i386-xl-qemuu-win7-amd64 9 windows-installfail REGR. vs. 57419 I wonder, btw, whether the latter two - also recurring, and also random - may have something in common with the random migration failures seen on the stable branches. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2 net] xen: netback: read hotplug script once at start of day.
When we come to tear things down in netback_remove() and generate the uevent it is possible that the xenstore directory has already been removed (details below). In such cases netback_uevent() won't be able to read the hotplug script and will write a xenstore error node. A recent change to the hypervisor exposed this race such that we now sometimes lose it (where apparently we didn't ever before). Instead read the hotplug script configuration during setup and use it for the lifetime of the backend device. The apparently more obvious fix of moving the transition to state=Closed in netback_remove() to after the uevent does not work because it is possible that we are already in state=Closed (in reaction to the guest having disconnected as it shutdown). Being already in Closed means the toolstack is at liberty to start tearing down the xenstore directories. In principal it might be possible to arrange to unregister the device sooner (e.g on transition to Closing) such that xenstore would still be there but this state machine is fragile and prone to anger... A modern Xen system only relies on the hotplug uevent for driver domains, when the backend is in the same domain as the toolstack it will run the necessary setup/teardown directly in the correct sequence wrt xenstore changes. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- v2: Move script to backend_info and read/free it in netback_{probe,remove}. DaveM, could this go to all stable trees please. --- drivers/net/xen-netback/xenbus.c | 33 +++-- 1 file changed, 19 insertions(+), 14 deletions(-) diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c index 3d8dbf5..6380b28 100644 --- a/drivers/net/xen-netback/xenbus.c +++ b/drivers/net/xen-netback/xenbus.c @@ -34,6 +34,8 @@ struct backend_info { enum xenbus_state frontend_state; struct xenbus_watch hotplug_status_watch; u8 have_hotplug_status_watch:1; + + const char *hotplug_script; }; static int connect_rings(struct backend_info *be, struct xenvif_queue *queue); @@ -238,6 +240,7 @@ static int netback_remove(struct xenbus_device *dev) xenvif_free(be-vif); be-vif = NULL; } + kfree(be-hotplug_script); kfree(be); dev_set_drvdata(dev-dev, NULL); return 0; @@ -255,6 +258,7 @@ static int netback_probe(struct xenbus_device *dev, struct xenbus_transaction xbt; int err; int sg; + const char *script; struct backend_info *be = kzalloc(sizeof(struct backend_info), GFP_KERNEL); if (!be) { @@ -347,6 +351,15 @@ static int netback_probe(struct xenbus_device *dev, if (err) pr_debug(Error writing multi-queue-max-queues\n); + script = xenbus_read(XBT_NIL, dev-nodename, script, NULL); + if (IS_ERR(script)) { + err = PTR_ERR(script); + xenbus_dev_fatal(dev, err, reading script); + goto fail; + } + + be-hotplug_script = script; + err = xenbus_switch_state(dev, XenbusStateInitWait); if (err) goto fail; @@ -379,22 +392,14 @@ static int netback_uevent(struct xenbus_device *xdev, struct kobj_uevent_env *env) { struct backend_info *be = dev_get_drvdata(xdev-dev); - char *val; - val = xenbus_read(XBT_NIL, xdev-nodename, script, NULL); - if (IS_ERR(val)) { - int err = PTR_ERR(val); - xenbus_dev_fatal(xdev, err, reading script); - return err; - } else { - if (add_uevent_var(env, script=%s, val)) { - kfree(val); - return -ENOMEM; - } - kfree(val); - } + if (!be) + return 0; + + if (add_uevent_var(env, script=%s, be-hotplug_script)) + return -ENOMEM; - if (!be || !be-vif) + if (!be-vif) return 0; return add_uevent_var(env, vif=%s, be-vif-dev-name); -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen BUG at page_alloc.c:1738 (Xen 4.5)
On 31.05.15 at 00:43, andrew.coop...@citrix.com wrote: On 30/05/2015 23:07, M A Young wrote: On Fri, 29 May 2015, Andrew Cooper wrote: FC22 is miscompiling the C to: struct page_info *page = mfn_to_page(mfn); struct domain *owner = page_get_owner_and_reference(page); if ( owner ) put_page(mfn_to_page(0)); which is wrong, and why free_domheap_pages() does legitimately complain about the wonky refcount. With a bit of experimentation I have found that compiling with the -fno-caller-saves flag gets this code segment back to the Fedora 21 version, thus avoiding the bug. After sending this email, I wondered whether the optimiser as assuming that %rdi was preserved. Indeed, it turns out that the generated code for page_get_owner_and_reference leaves %rdi unmodified, and safe for reuse after return. If the 'mov %r8,%rdi' were simply omitted, the code would work, as %rdi still contains the correct result of the original calculation. And %r8 is known to be preserved too? Therefore, I suspect that the bug is in the -fcaller-saves optimisation code. I suppose together with us allowing it to do such for global functions by marking everything hidden (i.e. something possibly not seeing much testing). Questions now are: 1) Was a bug against gcc opened already? 2) What do we do about it? Working around the issue by setting -fno-caller-saves seems awkward, as we'd likely have nothing but the gcc version to tie this to. And considering distros carry their own patch sets, the version alone may not even be enough. (I didn't see any reports against our tip facing a similar issue despite it being built with gcc 5 now too afaik.) Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH OSSTEST] Toolstack: Do not pass -F to xm shutdown (Was: Re: [xen-4.2-testing test] 57630: regressions - FAIL)
On Mon, 2015-06-01 at 08:56 +0100, Jan Beulich wrote: On 01.06.15 at 00:57, osst...@xenbits.xen.org wrote: flight 57630 xen-4.2-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/57630/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-xend-winxpsp3 16 guest-stop fail REGR. vs. 53018 test-amd64-i386-xend-qemut-winxpsp3 16 guest-stop fail REGR. vs. 53018 This seems to be due to osstest using an unrecognized option: 2015-05-31 20:32:59 Z executing ssh ... root@172.16.144.32 xm shutdown -wF win.guest.osstest Error: option -F not recognized I think this osstest patch ought to fix it. I've only tested with perl -c though. 8 From aa79bd958fe42da2876fd76e61468c183a7fabdc Mon Sep 17 00:00:00 2001 From: Ian Campbell ian.campb...@citrix.com Date: Mon, 1 Jun 2015 09:32:37 +0100 Subject: [PATCH] Toolstack: Do not pass -F to xm shutdown This is a feature of xl only. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- Osstest/Toolstack/xl.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Osstest/Toolstack/xl.pm b/Osstest/Toolstack/xl.pm index dd12ae1..23328d3 100644 --- a/Osstest/Toolstack/xl.pm +++ b/Osstest/Toolstack/xl.pm @@ -56,7 +56,7 @@ sub shutdown_wait ($$$) { my $ho = $self-{Host}; my $gn = $gho-{Name}; my $acpi_fallback = guest_var($gho,'acpi_shutdown','false') eq 'true' - ? F : ; +$self-{Name} eq 'xl' ? F : ; target_cmd_root($ho,$self-{_Command} shutdown -w${acpi_fallback} $gn, $timeout); } -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v6 3/4] pci: add wrapper for parse_pci
On 29.05.15 at 23:38, elena.ufimts...@oracle.com wrote: For sbdf'si parsing in rmrr command line add __parse_pci with addtional parameter def_seg. __parse_pci will help to identify if segment was found in string being parsed or default segment was used. Make a wrapper parse_pci so the rest of the callers are not affected. Signed-off-by: Elena Ufimtseva elena.ufimts...@oracle.com Acked-by: Jan Beulich jbeul...@suse.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] iommu: add rmrr Xen command line option for extra rmrrs
On 01.06.15 at 08:30, kevin.t...@intel.com wrote: From: elena.ufimts...@oracle.com [mailto:elena.ufimts...@oracle.com] --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -1185,6 +1185,19 @@ Specify the host reboot method. 'efi' instructs Xen to reboot using the EFI reboot call (in EFI mode by default it will use that method first). +### rmrr + '= start-end=[s1]bdf1[,[s1]bdf2[,...]];start-end=[s2]bdf1[,[s2]bdf2[,...]] + +Define RMRRs units that are missing from ACPI table along with device +they belong to and use them for 1:1 mapping. End addresses can be omitted +and one page will be mapped. The ranges are inclusive when start and end +are specified.If segement of the first device is not specified, segment zero will be used. +If other segments are not specified, first device segment will be used. +If segments are specified for every device and not equal, an error will be reported. Since you only allow devices under same segment for a given rmrr range, would it be simpler to enforce that explicitly in the format? e.g.: = start-end=[s1]bdf1[,bdf2[,...]]; While that might simplify the code, I think it's better to allow the user to specify canonical device coordinates, which would include a segment number. Plus ... Then you don't need to verify whether segment in later bdfs is specified and different from 1st bdf. ... verification could not be dropped, unless we altered parse_pci() to have a way to not accept a segment number at all. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 1/2] efi: Fix allocation problems if ExitBootServices() fails
If calling ExitBootServices() fails, the required memory map size may have increased. When initially allocating the memory map, allocate a slightly larger buffer (by an arbitrary 8 entries) to fix this. The ARM code path was already allocating a larger buffer than required, so this moves the code to be common for all architectures. This was seen on the following machine when using the iscsidxe UEFI driver. The machine would consistently fail the first call to ExitBootServices(). System Information Manufacturer: Supermicro Product Name: X10SLE-F/HF BIOS Information Vendor: American Megatrends Inc. Version: 2.00 Release Date: 04/24/2014 Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com --- xen/arch/arm/efi/efi-boot.h | 6 ++ xen/arch/x86/efi/efi-boot.h | 4 ++-- xen/common/efi/boot.c | 8 +--- 3 files changed, 9 insertions(+), 9 deletions(-) diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h index b02cc02..3297f27 100644 --- a/xen/arch/arm/efi/efi-boot.h +++ b/xen/arch/arm/efi/efi-boot.h @@ -370,16 +370,14 @@ static void __init efi_arch_cfg_file_late(EFI_FILE_HANDLE dir_handle, char *sect { } -static void *__init efi_arch_allocate_mmap_buffer(UINTN *map_size) +static void *__init efi_arch_allocate_mmap_buffer(UINTN map_size) { void *ptr; EFI_STATUS status; -UINTN map_size_alloc = *map_size + EFI_PAGE_SIZE; -status = efi_bs-AllocatePool(EfiLoaderData, map_size_alloc, ptr); +status = efi_bs-AllocatePool(EfiLoaderData, map_size, ptr); if ( status != EFI_SUCCESS ) return NULL; -*map_size = map_size_alloc; return ptr; } diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h index 3a3b4fe..cd14c19 100644 --- a/xen/arch/x86/efi/efi-boot.h +++ b/xen/arch/x86/efi/efi-boot.h @@ -190,10 +190,10 @@ static void __init efi_arch_process_memory_map(EFI_SYSTEM_TABLE *SystemTable, } -static void *__init efi_arch_allocate_mmap_buffer(UINTN *map_size) +static void *__init efi_arch_allocate_mmap_buffer(UINTN map_size) { place_string(mbi.mem_upper, NULL); -mbi.mem_upper -= *map_size; +mbi.mem_upper -= map_size; mbi.mem_upper = -__alignof__(EFI_MEMORY_DESCRIPTOR); if ( mbi.mem_upper xen_phys_start ) return NULL; diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c index ef8476c..60c1b8d 100644 --- a/xen/common/efi/boot.c +++ b/xen/common/efi/boot.c @@ -700,7 +700,7 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable) EFI_STATUS status; unsigned int i, argc; CHAR16 **argv, *file_name, *cfg_file_name = NULL, *options = NULL; -UINTN map_key, info_size, gop_mode = ~0; +UINTN map_alloc_size, map_key, info_size, gop_mode = ~0; EFI_HANDLE *handles = NULL; EFI_SHIM_LOCK_PROTOCOL *shim_lock; EFI_GRAPHICS_OUTPUT_PROTOCOL *gop = NULL; @@ -1053,14 +1053,16 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable) efi_arch_video_init(gop, info_size, mode_info); } -efi_bs-GetMemoryMap(efi_memmap_size, NULL, map_key, +efi_bs-GetMemoryMap(map_alloc_size, NULL, map_key, efi_mdesc_size, mdesc_ver); -efi_memmap = efi_arch_allocate_mmap_buffer(efi_memmap_size); +map_alloc_size += 8 * efi_mdesc_size; +efi_memmap = efi_arch_allocate_mmap_buffer(map_alloc_size); if ( !efi_memmap ) blexit(LUnable to allocate memory for EFI memory map); for ( retry = 0; ; retry = 1 ) { +efi_memmap_size = map_alloc_size; status = efi_bs-GetMemoryMap(efi_memmap_size, efi_memmap, map_key, efi_mdesc_size, mdesc_ver); if ( EFI_ERROR(status) ) -- 2.1.0 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 2/4] libxc: print more error messages when failed
No functional changes introduced. Signed-off-by: Wei Liu wei.l...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com --- tools/libxc/xc_hvm_build_x86.c | 30 +++--- 1 file changed, 27 insertions(+), 3 deletions(-) diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c index 92422bf..df4b7ed 100644 --- a/tools/libxc/xc_hvm_build_x86.c +++ b/tools/libxc/xc_hvm_build_x86.c @@ -259,7 +259,10 @@ static int setup_guest(xc_interface *xch, memset(elf, 0, sizeof(elf)); if ( elf_init(elf, image, image_size) != 0 ) +{ +PERROR(Could not initialise ELF image); goto error_out; +} xc_elf_set_logfile(xch, elf, 1); @@ -522,15 +525,24 @@ static int setup_guest(xc_interface *xch, DPRINTF( 1GB PAGES: 0x%016lx\n, stat_1gb_pages); if ( loadelfimage(xch, elf, dom, page_array) != 0 ) +{ +PERROR(Could not load ELF image); goto error_out; +} if ( loadmodules(xch, args, m_start, m_end, dom, page_array) != 0 ) -goto error_out; +{ +PERROR(Could not load ACPI modules); +goto error_out; +} if ( (hvm_info_page = xc_map_foreign_range( xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, HVM_INFO_PFN)) == NULL ) +{ +PERROR(Could not map hvm info page); goto error_out; +} build_hvm_info(hvm_info_page, args); munmap(hvm_info_page, PAGE_SIZE); @@ -547,7 +559,10 @@ static int setup_guest(xc_interface *xch, } if ( xc_clear_domain_pages(xch, dom, special_pfn(0), NR_SPECIAL_PAGES) ) -goto error_out; +{ +PERROR(Could not clear special pages); +goto error_out; +} xc_hvm_param_set(xch, dom, HVM_PARAM_STORE_PFN, special_pfn(SPECIALPAGE_XENSTORE)); @@ -580,7 +595,10 @@ static int setup_guest(xc_interface *xch, } if ( xc_clear_domain_pages(xch, dom, ioreq_server_pfn(0), NR_IOREQ_SERVER_PAGES) ) -goto error_out; +{ +PERROR(Could not clear ioreq page); +goto error_out; +} /* Tell the domain where the pages are and how many there are */ xc_hvm_param_set(xch, dom, HVM_PARAM_IOREQ_SERVER_PFN, @@ -595,7 +613,10 @@ static int setup_guest(xc_interface *xch, if ( (ident_pt = xc_map_foreign_range( xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, special_pfn(SPECIALPAGE_IDENT_PT))) == NULL ) +{ +PERROR(Could not map special page ident_pt); goto error_out; +} for ( i = 0; i PAGE_SIZE / sizeof(*ident_pt); i++ ) ident_pt[i] = ((i 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE); @@ -610,7 +631,10 @@ static int setup_guest(xc_interface *xch, char *page0 = xc_map_foreign_range( xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, 0); if ( page0 == NULL ) +{ +PERROR(Could not map page0); goto error_out; +} page0[0] = 0xe9; *(uint32_t *)page0[1] = entry_eip - 5; munmap(page0, PAGE_SIZE); -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 3/4] libxc: rework vnuma bits in setup_guest
Make the setup process similar to PV counterpart. That is, to allocate a P2M array that covers the whole memory range and start from there. This is clearer than using an array with no holes in it. Also the dummy layout should take MMIO hole into consideration. We might end up having two vmemranges in the dummy layout. Signed-off-by: Wei Liu wei.l...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com --- v2: only generate 1 vnode in dummy layout v3: reset fields that point to dummy layout in exit path --- tools/libxc/xc_hvm_build_x86.c | 70 +- 1 file changed, 55 insertions(+), 15 deletions(-) diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c index df4b7ed..0e98c84 100644 --- a/tools/libxc/xc_hvm_build_x86.c +++ b/tools/libxc/xc_hvm_build_x86.c @@ -238,6 +238,7 @@ static int setup_guest(xc_interface *xch, { xen_pfn_t *page_array = NULL; unsigned long i, vmemid, nr_pages = args-mem_size PAGE_SHIFT; +unsigned long p2m_size; unsigned long target_pages = args-mem_target PAGE_SHIFT; unsigned long entry_eip, cur_pages, cur_pfn; void *hvm_info_page; @@ -254,8 +255,9 @@ static int setup_guest(xc_interface *xch, xen_pfn_t special_array[NR_SPECIAL_PAGES]; xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES]; uint64_t total_pages; -xen_vmemrange_t dummy_vmemrange; -unsigned int dummy_vnode_to_pnode; +xen_vmemrange_t dummy_vmemrange[2]; +unsigned int dummy_vnode_to_pnode[1]; +bool use_dummy = false; memset(elf, 0, sizeof(elf)); if ( elf_init(elf, image, image_size) != 0 ) @@ -275,17 +277,36 @@ static int setup_guest(xc_interface *xch, if ( args-nr_vmemranges == 0 ) { -/* Build dummy vnode information */ -dummy_vmemrange.start = 0; -dummy_vmemrange.end = args-mem_size; -dummy_vmemrange.flags = 0; -dummy_vmemrange.nid = 0; +/* Build dummy vnode information + * + * Guest physical address space layout: + * [0, hole_start) [hole_start, 4G) [4G, highmem_end) + * + * Of course if there is no high memory, the second vmemrange + * has no effect on the actual result. + */ + +dummy_vmemrange[0].start = 0; +dummy_vmemrange[0].end = args-lowmem_end; +dummy_vmemrange[0].flags = 0; +dummy_vmemrange[0].nid = 0; args-nr_vmemranges = 1; -args-vmemranges = dummy_vmemrange; -dummy_vnode_to_pnode = XC_NUMA_NO_NODE; +if ( args-highmem_end (1ULL 32) ) +{ +dummy_vmemrange[1].start = 1ULL 32; +dummy_vmemrange[1].end = args-highmem_end; +dummy_vmemrange[1].flags = 0; +dummy_vmemrange[1].nid = 0; + +args-nr_vmemranges++; +} + +dummy_vnode_to_pnode[0] = XC_NUMA_NO_NODE; args-nr_vnodes = 1; -args-vnode_to_pnode = dummy_vnode_to_pnode; +args-vmemranges = dummy_vmemrange; +args-vnode_to_pnode = dummy_vnode_to_pnode; +use_dummy = true; } else { @@ -297,9 +318,15 @@ static int setup_guest(xc_interface *xch, } total_pages = 0; +p2m_size = 0; for ( i = 0; i args-nr_vmemranges; i++ ) +{ total_pages += ((args-vmemranges[i].end - args-vmemranges[i].start) PAGE_SHIFT); +p2m_size = p2m_size (args-vmemranges[i].end PAGE_SHIFT) ? +p2m_size : (args-vmemranges[i].end PAGE_SHIFT); +} + if ( total_pages != (args-mem_size PAGE_SHIFT) ) { PERROR(vNUMA memory pages mismatch (0x%PRIx64 != 0x%PRIx64), @@ -325,16 +352,23 @@ static int setup_guest(xc_interface *xch, DPRINTF( TOTAL:%016PRIx64-%016PRIx64\n, v_start, v_end); DPRINTF( ENTRY:%016PRIx64\n, elf_uval(elf, elf.ehdr, e_entry)); -if ( (page_array = malloc(nr_pages * sizeof(xen_pfn_t))) == NULL ) +if ( (page_array = malloc(p2m_size * sizeof(xen_pfn_t))) == NULL ) { PERROR(Could not allocate memory.); goto error_out; } -for ( i = 0; i nr_pages; i++ ) -page_array[i] = i; -for ( i = args-mmio_start PAGE_SHIFT; i nr_pages; i++ ) -page_array[i] += args-mmio_size PAGE_SHIFT; +for ( i = 0; i p2m_size; i++ ) +page_array[i] = ((xen_pfn_t)-1); +for ( vmemid = 0; vmemid args-nr_vmemranges; vmemid++ ) +{ +uint64_t pfn; + +for ( pfn = args-vmemranges[vmemid].start PAGE_SHIFT; + pfn args-vmemranges[vmemid].end PAGE_SHIFT; + pfn++ ) +page_array[pfn] = pfn; +} /* * Try to claim pages for early warning of insufficient memory available. @@ -645,6 +679,12 @@ static int setup_guest(xc_interface *xch, error_out: rc = -1; out: +if ( use_dummy ) +{ +args-nr_vnodes = 0; +args-vmemranges = NULL; +
[Xen-devel] [PATCH v3 0/4] Fix HVM vNUMA
Boris discovered that HVM vNUMA didn't actually work. This patch series fixes that. The first patch is a prerequisite patch for the actual fixes. The second patch is to help debugging. The fixes are in the last two patches, which can be squashed into one if necessary. Patch 1 and 3 are updated in this series. Other patches remain the same as last version. Wei. Wei Liu (4): libxc/libxl: fill xc_hvm_build_args in libxl libxc: print more error messages when failed libxc: rework vnuma bits in setup_guest libxl: fix HVM vNUMA tools/libxc/xc_hvm_build_x86.c | 133 +++-- tools/libxl/libxl_dom.c| 48 --- tools/libxl/libxl_vnuma.c | 15 - 3 files changed, 127 insertions(+), 69 deletions(-) -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 4/4] libxl: fix HVM vNUMA
This patch does two thing: The original code erroneously fills in xc_hvm_build_args before generating vmemranges. The effect is that guest memory is populated without vNUMA information. Move the hunk to right place to fix this. Move the subtraction of video ram to libxl__vnuma_build_vmemrange_hvm because it's the central place for generating vmemranges. Reported-by: Boris Ostrovsky boris.ostrov...@oracle.com Signed-off-by: Wei Liu wei.l...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Dario Faggioli dario.faggi...@citrix.com Reviewed-by: Boris Ostrovsky boris.ostrov...@oracle.com --- tools/libxl/libxl_dom.c | 32 ++-- tools/libxl/libxl_vnuma.c | 15 ++- 2 files changed, 24 insertions(+), 23 deletions(-) diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index dccc9ac..867172a 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -961,6 +961,16 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, if (info-num_vnuma_nodes != 0) { int i; +ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, args); +if (ret) { +LOGEV(ERROR, ret, hvm build vmemranges failed); +goto out; +} +ret = libxl__vnuma_config_check(gc, info, state); +if (ret) goto out; +ret = set_vnuma_info(gc, domid, info, state); +if (ret) goto out; + args.nr_vmemranges = state-num_vmemranges; args.vmemranges = libxl__malloc(gc, sizeof(*args.vmemranges) * args.nr_vmemranges); @@ -972,17 +982,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, args.vmemranges[i].nid = state-vmemranges[i].nid; } -/* Consider video ram belongs to vmemrange 0 -- just shrink it - * by the size of video ram. - */ -if (((args.vmemranges[0].end - args.vmemranges[0].start) 10) - info-video_memkb) { -LOG(ERROR, vmemrange 0 too small to contain video ram); -goto out; -} - -args.vmemranges[0].end -= (info-video_memkb 10); - args.nr_vnodes = info-num_vnuma_nodes; args.vnode_to_pnode = libxl__malloc(gc, sizeof(*args.vnode_to_pnode) * args.nr_vnodes); @@ -996,17 +995,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, goto out; } -if (info-num_vnuma_nodes != 0) { -ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, args); -if (ret) { -LOGEV(ERROR, ret, hvm build vmemranges failed); -goto out; -} -ret = libxl__vnuma_config_check(gc, info, state); -if (ret) goto out; -ret = set_vnuma_info(gc, domid, info, state); -if (ret) goto out; -} ret = hvm_build_set_params(ctx-xch, domid, info, state-store_port, state-store_mfn, state-console_port, state-console_mfn, state-store_domid, diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c index cac78d7..56856d2 100644 --- a/tools/libxl/libxl_vnuma.c +++ b/tools/libxl/libxl_vnuma.c @@ -257,6 +257,7 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc, uint64_t hole_start, hole_end, next; int nid, nr_vmemrange; xen_vmemrange_t *vmemranges; +int rc; /* Derive vmemranges from vnode size and memory hole. * @@ -277,6 +278,16 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc, libxl_vnode_info *p = b_info-vnuma_nodes[nid]; uint64_t remaining_bytes = p-memkb 10; +/* Consider video ram belongs to vnode 0 */ +if (nid == 0) { +if (p-memkb b_info-video_memkb) { +LOG(ERROR, vnode 0 too small to contain video ram); +rc = ERROR_INVAL; +goto out; +} +remaining_bytes -= (b_info-video_memkb 10); +} + while (remaining_bytes 0) { uint64_t count = remaining_bytes; @@ -300,7 +311,9 @@ int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc, state-vmemranges = vmemranges; state-num_vmemranges = nr_vmemrange; -return 0; +rc = 0; +out: +return rc; } /* -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 1/4] libxc/libxl: fill xc_hvm_build_args in libxl
When building HVM guests, originally some fields of xc_hvm_build_args are filled in xc_hvm_build (and buried in the wrong function), some are set in libxl__build_hvm before passing xc_hvm_build_args to xc_hvm_build. This is fragile. After examining the code in xc_hvm_build that sets those fields, we can in fact move setting of mmio_start etc in libxl. This way we consolidate memory layout setting in libxl. The setting of firmware data related fields is left in xc_hvm_build because it depends on parsing ELF image. Those fields only point to scratch data that doesn't affect memory layout. There should be no change in the generated guest memory layout. But the semantic is changed for xc_hvm_build. Toolstack that built directly on top of libxc need to adjust to this change. Signed-off-by: Wei Liu wei.l...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Chen, Tiejun tiejun.c...@intel.com Cc: Andrew Cooper andrew.coop...@citrix.com --- v3: mention semantic change in commit message --- tools/libxc/xc_hvm_build_x86.c | 37 +++-- tools/libxl/libxl_dom.c| 16 2 files changed, 23 insertions(+), 30 deletions(-) diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c index e45ae4a..92422bf 100644 --- a/tools/libxc/xc_hvm_build_x86.c +++ b/tools/libxc/xc_hvm_build_x86.c @@ -88,22 +88,14 @@ static int modules_init(struct xc_hvm_build_args *args, return 0; } -static void build_hvm_info(void *hvm_info_page, uint64_t mem_size, - uint64_t mmio_start, uint64_t mmio_size, +static void build_hvm_info(void *hvm_info_page, struct xc_hvm_build_args *args) { struct hvm_info_table *hvm_info = (struct hvm_info_table *) (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET); -uint64_t lowmem_end = mem_size, highmem_end = 0; uint8_t sum; int i; -if ( lowmem_end mmio_start ) -{ -highmem_end = (1ull32) + (lowmem_end - mmio_start); -lowmem_end = mmio_start; -} - memset(hvm_info_page, 0, PAGE_SIZE); /* Fill in the header. */ @@ -116,14 +108,10 @@ static void build_hvm_info(void *hvm_info_page, uint64_t mem_size, memset(hvm_info-vcpu_online, 0xff, sizeof(hvm_info-vcpu_online)); /* Memory parameters. */ -hvm_info-low_mem_pgend = lowmem_end PAGE_SHIFT; -hvm_info-high_mem_pgend = highmem_end PAGE_SHIFT; +hvm_info-low_mem_pgend = args-lowmem_end PAGE_SHIFT; +hvm_info-high_mem_pgend = args-highmem_end PAGE_SHIFT; hvm_info-reserved_mem_pgstart = ioreq_server_pfn(0); -args-lowmem_end = lowmem_end; -args-highmem_end = highmem_end; -args-mmio_start = mmio_start; - /* Finish with the checksum. */ for ( i = 0, sum = 0; i hvm_info-length; i++ ) sum += ((uint8_t *)hvm_info)[i]; @@ -251,8 +239,6 @@ static int setup_guest(xc_interface *xch, xen_pfn_t *page_array = NULL; unsigned long i, vmemid, nr_pages = args-mem_size PAGE_SHIFT; unsigned long target_pages = args-mem_target PAGE_SHIFT; -uint64_t mmio_start = (1ull 32) - args-mmio_size; -uint64_t mmio_size = args-mmio_size; unsigned long entry_eip, cur_pages, cur_pfn; void *hvm_info_page; uint32_t *ident_pt; @@ -344,8 +330,8 @@ static int setup_guest(xc_interface *xch, for ( i = 0; i nr_pages; i++ ) page_array[i] = i; -for ( i = mmio_start PAGE_SHIFT; i nr_pages; i++ ) -page_array[i] += mmio_size PAGE_SHIFT; +for ( i = args-mmio_start PAGE_SHIFT; i nr_pages; i++ ) +page_array[i] += args-mmio_size PAGE_SHIFT; /* * Try to claim pages for early warning of insufficient memory available. @@ -446,7 +432,7 @@ static int setup_guest(xc_interface *xch, * range */ !check_mmio_hole(cur_pfn PAGE_SHIFT, SUPERPAGE_1GB_NR_PFNS PAGE_SHIFT, - mmio_start, mmio_size) ) + args-mmio_start, args-mmio_size) ) { long done; unsigned long nr_extents = count SUPERPAGE_1GB_SHIFT; @@ -545,7 +531,7 @@ static int setup_guest(xc_interface *xch, xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE, HVM_INFO_PFN)) == NULL ) goto error_out; -build_hvm_info(hvm_info_page, v_end, mmio_start, mmio_size, args); +build_hvm_info(hvm_info_page, args); munmap(hvm_info_page, PAGE_SIZE); /* Allocate and clear special pages. */ @@ -661,12 +647,6 @@ int xc_hvm_build(xc_interface *xch, uint32_t domid, if ( args.image_file_name == NULL ) return -1; -if ( args.mem_target == 0 ) -args.mem_target = args.mem_size; - -if ( args.mmio_size == 0 ) -args.mmio_size = HVM_BELOW_4G_MMIO_LENGTH; - /* An HVM guest must be initialised with at least 2MB
[Xen-devel] [PATCH v3 2/2] efi: Avoid calling boot services after ExitBootServices()
After the first call to ExitBootServices(), avoid calling any boot services by setting setting efi_bs to NULL and entering an infinite loop in blexit(). Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com --- xen/common/efi/boot.c | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c index 60c1b8d..5c5c158 100644 --- a/xen/common/efi/boot.c +++ b/xen/common/efi/boot.c @@ -216,6 +216,12 @@ static void __init noreturn blexit(const CHAR16 *str) PrintStr((CHAR16 *)str); PrintStr(newline); +if ( !efi_bs ) +{ +for ( ; ; ) +; +} + if ( cfg.addr ) efi_bs-FreePages(cfg.addr, PFN_UP(cfg.size)); if ( kernel.addr ) @@ -1063,8 +1069,10 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable) for ( retry = 0; ; retry = 1 ) { efi_memmap_size = map_alloc_size; -status = efi_bs-GetMemoryMap(efi_memmap_size, efi_memmap, map_key, - efi_mdesc_size, mdesc_ver); +status = SystemTable-BootServices-GetMemoryMap(efi_memmap_size, + efi_memmap, map_key, + efi_mdesc_size, + mdesc_ver); if ( EFI_ERROR(status) ) PrintErrMesg(LCannot obtain memory map, status); @@ -1073,7 +1081,9 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable) efi_arch_pre_exit_boot(); -status = efi_bs-ExitBootServices(ImageHandle, map_key); +status = SystemTable-BootServices-ExitBootServices(ImageHandle, + map_key); +efi_bs = NULL; if ( status != EFI_INVALID_PARAMETER || retry ) break; } -- 2.1.0 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] iommu: add rmrr Xen command line option for extra rmrrs
On 29.05.15 at 23:38, elena.ufimts...@oracle.com wrote: --- a/xen/drivers/passthrough/vtd/dmar.c +++ b/xen/drivers/passthrough/vtd/dmar.c @@ -866,6 +866,106 @@ out: return ret; } +#define MAX_EXTRA_RMRR_PAGES 16 +#define MAX_EXTRA_RMRR 10 + +/* RMRR units derived from command line rmrr option */ +#define MAX_EXTRA_RMRR_DEV 20 +struct extra_rmrr_unit { +struct list_head list; +unsigned long base_pfn, end_pfn; +u16dev_count; +u32sbdf[MAX_EXTRA_RMRR_DEV]; +}; +static __initdata unsigned int nr_rmrr; +static struct __initdata extra_rmrr_unit rmrru[MAX_EXTRA_RMRR]; + +static void __init add_extra_rmrr(void) +{ +struct acpi_rmrr_unit *rmrrn; +unsigned int dev, seg, i, j; +unsigned long pfn; + +for ( i = 0; i nr_rmrr; i++ ) +{ +if ( rmrru[i].base_pfn rmrru[i].end_pfn ) +{ +printk(XENLOG_ERR VTDPREFIX + Start pfn end pfn for RMRR range [%PRIx64 - %PRIx64]\n, + rmrru[i].base_pfn, rmrru[i].end_pfn); PRIx64 doesn't match unsigned long (more below). +return; +} + +if ( rmrru[i].end_pfn - rmrru[i].base_pfn = MAX_EXTRA_RMRR_PAGES ) +{ +printk(XENLOG_ERR VTDPREFIX + RMRR range exceeds 16 pages [%PRIx64 - %PRIx64]\n, ITYM __stringify(MAX_EXTRA_RMRR_PAGES) instead of 16. + rmrru[i].base_pfn, rmrru[i].end_pfn); +return; +} + +for ( j = 0; j nr_rmrr; j++ ) +{ +if ( i != j rmrru[i].base_pfn = rmrru[j].end_pfn + rmrru[j].base_pfn = rmrru[i].end_pfn ) +{ +printk(XENLOG_ERR VTDPREFIX + Overlapping RMRRs [%PRIx64,%PRIx64] and [%PRIx64,%PRIx64]\n, + rmrru[i].base_pfn, rmrru[i].end_pfn, + rmrru[j].base_pfn, rmrru[j].end_pfn); + return; +} +} + +for ( pfn = rmrru[i].base_pfn; pfn = rmrru[i].end_pfn; pfn++ ) +{ +if ( !mfn_valid(pfn) ) +if ( iommu_verbose ) +printk(XENLOG_ERR VTDPREFIX + Invalid mfn in RMRR range [%PRIx64 - %PRIx64]\n, + rmrru[i].base_pfn, rmrru[i].end_pfn); +return; +} The lack of braces around the outer if()'s body, with the return needing to be inside, suggests that this version of the patch wasn't tested at all. And as Kevin also already suggested on an earlier code block - please consider whether return is really the right thing if any of these checks fails. @@ -885,7 +986,9 @@ int __init acpi_dmar_init(void) dmar_table = __va(dmar_addr); } -return parse_dmar_table(acpi_parse_dmar); +ret = parse_dmar_table(acpi_parse_dmar); +add_extra_rmrr(); +return ret; } Blank line before a function's final return statement please. @@ -916,3 +1019,62 @@ int platform_supports_x2apic(void) unsigned int mask = ACPI_DMAR_INTR_REMAP | ACPI_DMAR_X2APIC_OPT_OUT; return cpu_has_x2apic ((dmar_flags mask) == ACPI_DMAR_INTR_REMAP); } + +/* + * Parse rmrr Xen command line options and add parsed + * device and region into apci_rmrr_unit list to mapped + * as RMRRs parsed from ACPI. + * Format rmrr=start-end=[s1]bdf1[,[s1]bdf2[,...]];start-end=[s2]bdf1[,[s2]bdf2[,...]] + * If the segement of the first device is not specified, the segment zero will be used. The typo is still there (and also in the command line doc). Please be sure to address comments on earlier versions before sending out new ones. +static void __init parse_rmrr_param(const char *str) +{ +const char *s = str, *cur, *stmp; +unsigned int seg, bus, dev, func; +unsigned long start, end; + +do { +bool_t default_segment = 0; +start = simple_strtoul(cur = s, s, 0); Blank line between declarations and statements please. And this is, afaict, still not correct - it needs to go in the inner loop, as it needs to be initialized in each of that one's iterations. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH OSSTEST] Stubdom test case
Currently only QEMU traditional supports stubdom, so we only create test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm Note that stubdom only supports serial='pty'. Piping serial to stderr causes stubdom to exit abnormally. Signed-off-by: Wei Liu wei.l...@citrix.com --- Diff of xen-unstable flight runvar dump before and after this change: diff -ub ../master-runvars ../stubdom-runvars | sed 's/[ \t]*$//' | egrep '^[\+|-]' --- ../master-runvars 2015-06-01 12:36:22.759594425 +0100 +++ ../stubdom-runvars 2015-06-01 12:37:49.584329616 +0100 +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64 all_hostflags arch-amd64,arch-xen-amd64,suite-wheezy,purpose-test,hvm +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm all_hostflags arch-amd64,arch-xen-amd64,suite-wheezy,purpose-test,hvm +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64 all_hostflags arch-i386,arch-xen-amd64,suite-wheezy,purpose-test,hvm +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm all_hostflags arch-i386,arch-xen-amd64,suite-wheezy,purpose-test,hvm +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64 archamd64 +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm arch amd64 +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64 archi386 +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm arch i386 +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64 biosrombios +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm bios rombios +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64 biosrombios +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm bios rombios +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64 buildjobbuild-amd64 +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm buildjob build-amd64-xsm +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64 buildjobbuild-i386 +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm buildjob build-i386-xsm +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64 debianhvm_image debian-7.2.0-amd64-CD-1.iso +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm debianhvm_image debian-7.2.0-amd64-CD-1.iso +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64 debianhvm_image debian-7.2.0-amd64-CD-1.iso +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm debianhvm_image debian-7.2.0-amd64-CD-1.iso +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64 device_model_versionqemu-xen-traditional +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm device_model_version qemu-xen-traditional +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64 device_model_versionqemu-xen-traditional +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm device_model_version qemu-xen-traditional +xen-unstable test-amd64-amd64-xl-qemut-debianhvm-amd64 enable_stubdom false +xen-unstable test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm enable_stubdom false +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64 enable_stubdom true +xen-unstable test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm enable_stubdom true +xen-unstable test-amd64-amd64-xl-qemuu-debianhvm-amd64 enable_stubdom false +xen-unstable test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm enable_stubdom false +xen-unstable test-amd64-amd64-xl-qemuu-ovmf-amd64 enable_stubdom false +xen-unstable test-amd64-i386-xl-qemut-debianhvm-amd64 enable_stubdom false +xen-unstable test-amd64-i386-xl-qemut-debianhvm-amd64-xsm enable_stubdom false +xen-unstable test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64 enable_stubdom true +xen-unstable
Re: [Xen-devel] [PATCH] libxl: Don't insert PCI device into xenstore for HVM guests
On Fri, May 29, 2015 at 08:59:45AM +0100, Ross Lagerwall wrote: When doing passthrough of a PCI device for an HVM guest, don't insert the device into xenstore, otherwise pciback attempts to use it which conflicts with QEMU. How does it conflict? This manifests itself such that the first time a device is passed to a domain, it succeeds. Subsequent attempts fail unless the device is unbound from pciback or the machine rebooted. Can you be more specific please? What are the issues? Why does it fail? There are certain things that pciback does to prepare an PCI device which QEMU also does. Some of them - such as saving the configuration registers (And then restoring them after the device has been detached) - is something that QEMU does not do. Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com --- tools/libxl/libxl_pci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c index e0743f8..2552889 100644 --- a/tools/libxl/libxl_pci.c +++ b/tools/libxl/libxl_pci.c @@ -994,7 +994,7 @@ out: } } -if (!starting) +if (!starting type == LIBXL_DOMAIN_TYPE_PV) rc = libxl__device_pci_add_xenstore(gc, domid, pcidev, starting); else rc = 0; -- 2.1.0 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxl: Don't insert PCI device into xenstore for HVM guests
Monday, June 1, 2015, 5:43:53 PM, you wrote: On 06/01/2015 04:26 PM, Konrad Rzeszutek Wilk wrote: On Fri, May 29, 2015 at 08:59:45AM +0100, Ross Lagerwall wrote: When doing passthrough of a PCI device for an HVM guest, don't insert the device into xenstore, otherwise pciback attempts to use it which conflicts with QEMU. How does it conflict? It doesn't work with repeated use. See below. This manifests itself such that the first time a device is passed to a domain, it succeeds. Subsequent attempts fail unless the device is unbound from pciback or the machine rebooted. Can you be more specific please? What are the issues? Why does it fail? Without this patch, if a device (e.g. a GPU) is bound to pciback and then passed through to a guest using xl pci-attach, it appears in the guest and works fine. If the guest is rebooted, and the device is again passed through with xl pci-attach, it appears in the guest as before but does not work. In Windows, it gets something like Error Code 43 and on Linux, the Nouveau driver fails to initialize the device (with error -22 or something). The only way to get the device to work again is to reboot the host or unbind and rebind it to pciback. With this patch, it works as expected. The device is bound to pciback and works after being passed through, even after the VM is rebooted. There are certain things that pciback does to prepare an PCI device which QEMU also does. Some of them - such as saving the configuration registers (And then restoring them after the device has been detached) - is something that QEMU does not do. I really have no idea what the correct thing to do is, but the current code with qemu-trad doesn't seem to work (for me). Regards The Konrad's do_flr patch that David NACKed works fine for me. It makes it possible to do VGA passthrough and reboot endlessly. But is has to be refactored into something that is acceptable upstream. (and possibly be self contained and not rely on the do_flr sysfs trigger to work around some locking issue, that would make most part of the NACK (the naming of the option) also go away. ) -- Sander ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC] hotplug/Linux: Add --wait to iptables calls.
On Mon, 2015-06-01 at 17:09 +0100, Anthony PERARD wrote: On Mon, Jun 01, 2015 at 04:17:51PM +0100, Ian Campbell wrote: On Mon, 2015-06-01 at 16:08 +0100, Jan Beulich wrote: On 01.06.15 at 16:59, anthony.per...@citrix.com wrote: --- a/tools/hotplug/Linux/vif-common.sh +++ b/tools/hotplug/Linux/vif-common.sh @@ -130,9 +130,9 @@ frob_iptable() local c=-D fi - iptables $c FORWARD -m physdev --physdev-is-bridged --physdev-in $dev \ + iptables --wait $c FORWARD -m physdev --physdev-is-bridged --physdev-in $dev \ $@ -j ACCEPT 2/dev/null - iptables $c FORWARD -m physdev --physdev-is-bridged --physdev-out $dev \ + iptables --wait $c FORWARD -m physdev --physdev-is-bridged --physdev-out $dev \ -j ACCEPT 2/dev/null if [ \( $command == online -o $command == add \) -a $? -ne 0 ] Looking at my oldest system's iptables --help I can't spot such an option (which doesn't necessarily mean it's not supported). Did you make sure all (older) distros we care about actually support this? --wait does not appear work on a debian weezy (Debian 7.8). It's not really clear if/why --wait is the solution to the problem of another party using iptables-{save,restore} to do their own network management in the first place. If OpenStack is doing save/modify/restore then what stops us rewriting things in the middle and then getting those changes clobbered on restore? Surely iptables-save can't exit holding the lock... That could be an issue. And if nova-network is managing networking do we really need to do it too? I will investigate in that, check what there are doing, and what we are doing. The solution might just be a script=none. Or script=nova-network even. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v6 2/4] iommu VT-d: separate rmrr addition function
On Mon, Jun 01, 2015 at 04:51:55AM +, Tian, Kevin wrote: From: elena.ufimts...@oracle.com [mailto:elena.ufimts...@oracle.com] Sent: Saturday, May 30, 2015 5:39 AM From: Elena Ufimtseva elena.ufimts...@oracle.com In preparation for auxiliary RMRR data provided on Xen command line, make RMRR adding a separate function. Also free memery for rmrr device scope in error path. No changes since v5. Signed-off-by: Elena Ufimtseva elena.ufimts...@oracle.com Reviewed-by: Jan Beulich jbeul...@suse.com --- xen/drivers/passthrough/vtd/dmar.c | 129 - 1 file changed, 70 insertions(+), 59 deletions(-) diff --git a/xen/drivers/passthrough/vtd/dmar.c b/xen/drivers/passthrough/vtd/dmar.c index 0985150..89a2f79 100644 --- a/xen/drivers/passthrough/vtd/dmar.c +++ b/xen/drivers/passthrough/vtd/dmar.c @@ -576,6 +576,73 @@ out: return ret; } +static int register_one_rmrr(struct acpi_rmrr_unit *rmrru) +{ +bool_t ignore = 0; +unsigned int i = 0; +int ret = 0; + +/* Skip checking if segment is not accessible yet. */ +if ( !pci_known_segment(rmrru-segment) ) +{ +dprintk(XENLOG_WARNING VTDPREFIX, UNKNOWN Prefix! %04x, rmrru-segment); +i = UINT_MAX; +} + +for ( ; i rmrru-scope.devices_cnt; i++ ) +{ +u8 b = PCI_BUS(rmrru-scope.devices[i]); +u8 d = PCI_SLOT(rmrru-scope.devices[i]); +u8 f = PCI_FUNC(rmrru-scope.devices[i]); + +if ( pci_device_detect(rmrru-segment, b, d, f) == 0 ) +{ +dprintk(XENLOG_WARNING VTDPREFIX, + Non-existent device (%04x:%02x:%02x.%u) is reported + in RMRR (%PRIx64, %PRIx64)'s scope!\n, +rmrru-segment, b, d, f, +rmrru-base_address, rmrru-end_address); +ignore = 1; +} +else +{ +ignore = 0; +break; +} +} + +if ( ignore ) +{ +dprintk(XENLOG_WARNING VTDPREFIX, + Ignore the RMRR (%PRIx64, %PRIx64) due to +devices under its scope are not PCI discoverable!\n, +rmrru-base_address, rmrru-end_address); +xfree(rmrru-scope.devices); +xfree(rmrru); +ret = -EFAULT; +} +else if ( rmrru-base_address rmrru-end_address ) +{ +dprintk(XENLOG_WARNING VTDPREFIX, + The RMRR (%PRIx64, %PRIx64) is incorrect!\n, +rmrru-base_address, rmrru-end_address); +xfree(rmrru-scope.devices); +xfree(rmrru); +ret = -EFAULT; +} above two error handling can be combined into one at the end of the func like in other places. Thanks Kevin Hi Kevin Thank you for review. I think in this case I cannot combine these two as the ret should not be set in first (ignore) branch. Looks like I placed it there by mistake. Elena ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v6 2/4] iommu VT-d: separate rmrr addition function
On Mon, Jun 01, 2015 at 09:53:55AM +0100, Jan Beulich wrote: On 29.05.15 at 23:38, elena.ufimts...@oracle.com wrote: In preparation for auxiliary RMRR data provided on Xen command line, make RMRR adding a separate function. Also free memery for rmrr device scope in error path. No changes since v5. Certainly there is. (And the statement wouldn't belong here anyway, but below the first --- separator.) Signed-off-by: Elena Ufimtseva elena.ufimts...@oracle.com Reviewed-by: Jan Beulich jbeul...@suse.com And certainly I didn't approve it in this shape: +static int register_one_rmrr(struct acpi_rmrr_unit *rmrru) +{ +bool_t ignore = 0; +unsigned int i = 0; +int ret = 0; + +/* Skip checking if segment is not accessible yet. */ +if ( !pci_known_segment(rmrru-segment) ) +{ +dprintk(XENLOG_WARNING VTDPREFIX, UNKNOWN Prefix! %04x, rmrru-segment); +i = UINT_MAX; +} + +for ( ; i rmrru-scope.devices_cnt; i++ ) +{ +u8 b = PCI_BUS(rmrru-scope.devices[i]); +u8 d = PCI_SLOT(rmrru-scope.devices[i]); +u8 f = PCI_FUNC(rmrru-scope.devices[i]); + +if ( pci_device_detect(rmrru-segment, b, d, f) == 0 ) +{ +dprintk(XENLOG_WARNING VTDPREFIX, + Non-existent device (%04x:%02x:%02x.%u) is reported + in RMRR (%PRIx64, %PRIx64)'s scope!\n, +rmrru-segment, b, d, f, +rmrru-base_address, rmrru-end_address); +ignore = 1; +} +else +{ +ignore = 0; +break; +} +} + +if ( ignore ) +{ +dprintk(XENLOG_WARNING VTDPREFIX, + Ignore the RMRR (%PRIx64, %PRIx64) due to +devices under its scope are not PCI discoverable!\n, +rmrru-base_address, rmrru-end_address); +xfree(rmrru-scope.devices); +xfree(rmrru); +ret = -EFAULT; You _again_ made this an error, which it wasn't before. A little more care please. Yes, and I agreed that it did not make sense to set ret here, wishful typing I guess ) Also you folded the leak fix into here without saying so. As said on the solitary leak fix patch - that change belongs there (not the least because we will want to backport that but not this one). yes, changing this. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] run QEMU as non-root
On Fri, 29 May 2015, Ian Campbell wrote: On Fri, 2015-05-29 at 14:47 +0100, Stefano Stabellini wrote: Try to use xen-qemudepriv-$domname first, then xen-qemudepriv-base + domid, finally xen-qemudepriv-shared and root if everything else fails. The uids need to be manually created by the user or, more likely, by the xen package maintainer. To actually secure QEMU when running in Dom0, we need at least to deprivilege the privcmd and xenstore interfaces, this is just the first step in that direction. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- Changes in v3: - clarify doc - handle errno == ERANGE --- docs/misc/qemu-deprivilege | 36 + tools/libxl/libxl_dm.c | 60 ++ tools/libxl/libxl_internal.h |4 +++ 3 files changed, 100 insertions(+) create mode 100644 docs/misc/qemu-deprivilege diff --git a/docs/misc/qemu-deprivilege b/docs/misc/qemu-deprivilege new file mode 100644 index 000..3a61867 --- /dev/null +++ b/docs/misc/qemu-deprivilege Could you name this with a .txt or even a .markdown or .pandoc please. I think you should also add a reference to this to the toplevel INSTALL file, so there is some hope of someone seeing it. Good idea And I presume you will update the relevant wikipages (e.g. the installing from source one) once this change lands? Yes, I will. +2) a user named xen-qemudepriv-base, adding domid to its uid +This requires the reservation of 65536 uids from the uid of +xen-qemudepriv-base to uid+65535. For example, if xen-qemudepriv-base +has uid 6000, and the domid is 25, libxl will try to use uid 6025. To +use this mechanism, you might want to create a large number of users at +installation time. For example: + +adduser --system xen-qemudepriv-base +for i in '' $(seq 1 65335) +do +adduser --system xen-qemudepriv-base$i +done I'm not sure that adduser is necessarily guaranteed to create users sequentially, even if it might do so most of the time. Did you check this? Perhaps rather than doing base + domid we should just lookup the user xen-qemudepriv-domidNNN and document creating a user for every domid? The advantage with that is we don't need to figure out how to document the creation of 65K _consecutive_ users. That might be better actually @@ -439,6 +441,9 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc, int i, connection, devid; uint64_t ram_size; const char *path, *chardev; +struct passwd pwd, *user = NULL; +char *buf = NULL; +long buf_size; dm_args = flexarray_make(gc, 16, 1); @@ -878,6 +883,61 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc, default: break; } + +buf_size = sysconf(_SC_GETPW_R_SIZE_MAX); +if (buf_size 0) { +LOGE(ERROR, sysconf(_SC_GETPW_R_SIZE_MAX) returned error %ld, buf_size); +goto end_search; +} +errno = 0; + +retry: +if (errno == ERANGE) +buf_size += 512; +buf = libxl__realloc(gc, buf, buf_size); +if (c_info-name) { +errno = 0; +getpwnam_r(libxl__sprintf(gc, %s-%s, +LIBXL_QEMU_USER_PREFIX, c_info-name), +pwd, buf, buf_size, user); +if (user != NULL) +goto end_search; +if (errno == ERANGE) +goto retry; +} + +errno = 0; +getpwnam_r(LIBXL_QEMU_USER_BASE, pwd, buf, buf_size, user); +if (user != NULL) { +errno = 0; +getpwuid_r(user-pw_uid + guest_domid, pwd, buf, buf_size, user); +if (user != NULL) +goto end_search; +if (errno == ERANGE) +goto retry; +} +if (errno == ERANGE) +goto retry; + +errno = 0; +getpwnam_r(LIBXL_QEMU_USER_SHARED, pwd, buf, buf_size, user); +if (user != NULL) { +LOG(WARN, Could not find user %s-%s or user %s (+domid %d), falling back to %s, +LIBXL_QEMU_USER_PREFIX, c_info-name, LIBXL_QEMU_USER_BASE, +guest_domid, LIBXL_QEMU_USER_SHARED); +goto end_search; +} +if (errno == ERANGE) +goto retry; This is all rather repetitive, please add a helper which takes care of allocating the buffer, retrying and logging on fail. I don't think you need to worry about making all three attempts use the same buffer, so the helper doesn't need to have the complexity of doing that. That would also let you avoid repeating all three searches when the last one returns ERANGE. OK, I'll do + + +LOG(WARN,
Re: [Xen-devel] [PATCH RFC] hotplug/Linux: Add --wait to iptables calls.
On 01.06.15 at 16:59, anthony.per...@citrix.com wrote: --- a/tools/hotplug/Linux/vif-common.sh +++ b/tools/hotplug/Linux/vif-common.sh @@ -130,9 +130,9 @@ frob_iptable() local c=-D fi - iptables $c FORWARD -m physdev --physdev-is-bridged --physdev-in $dev \ + iptables --wait $c FORWARD -m physdev --physdev-is-bridged --physdev-in $dev \ $@ -j ACCEPT 2/dev/null - iptables $c FORWARD -m physdev --physdev-is-bridged --physdev-out $dev \ + iptables --wait $c FORWARD -m physdev --physdev-is-bridged --physdev-out $dev \ -j ACCEPT 2/dev/null if [ \( $command == online -o $command == add \) -a $? -ne 0 ] Looking at my oldest system's iptables --help I can't spot such an option (which doesn't necessarily mean it's not supported). Did you make sure all (older) distros we care about actually support this? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC] hotplug/Linux: Add --wait to iptables calls.
On Mon, 2015-06-01 at 16:08 +0100, Jan Beulich wrote: On 01.06.15 at 16:59, anthony.per...@citrix.com wrote: --- a/tools/hotplug/Linux/vif-common.sh +++ b/tools/hotplug/Linux/vif-common.sh @@ -130,9 +130,9 @@ frob_iptable() local c=-D fi - iptables $c FORWARD -m physdev --physdev-is-bridged --physdev-in $dev \ + iptables --wait $c FORWARD -m physdev --physdev-is-bridged --physdev-in $dev \ $@ -j ACCEPT 2/dev/null - iptables $c FORWARD -m physdev --physdev-is-bridged --physdev-out $dev \ + iptables --wait $c FORWARD -m physdev --physdev-is-bridged --physdev-out $dev \ -j ACCEPT 2/dev/null if [ \( $command == online -o $command == add \) -a $? -ne 0 ] Looking at my oldest system's iptables --help I can't spot such an option (which doesn't necessarily mean it's not supported). Did you make sure all (older) distros we care about actually support this? It's not really clear if/why --wait is the solution to the problem of another party using iptables-{save,restore} to do their own network management in the first place. If OpenStack is doing save/modify/restore then what stops us rewriting things in the middle and then getting those changes clobbered on restore? Surely iptables-save can't exit holding the lock... And if nova-network is managing networking do we really need to do it too? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/HVM: Avoid cache flush operations during hvm_load
On 01.06.15 at 17:26, ross.lagerw...@citrix.com wrote: --- a/xen/common/hvm/save.c +++ b/xen/common/hvm/save.c This being an x86-specific problem I would think this would better be addressed in x86-specific code. If it was to stay here, a proper arch hook should be introduced (even if ARM isn't using this code right now) and ... @@ -230,13 +233,14 @@ int hvm_load(struct domain *d, hvm_domain_context_t *h) printk(XENLOG_G_ERR HVM%d restore: save did not end with a null entry\n, d-domain_id); -return -1; +ret = -1; +goto out; } /* Read the typecode of the next entry and check for the end-marker */ desc = (struct hvm_save_descriptor *)(h-data[h-cur]); if ( desc-typecode == 0 ) -return 0; +goto out; /* Find the handler for this entry */ if ( (desc-typecode HVM_SAVE_CODE_MAX) || @@ -244,7 +248,8 @@ int hvm_load(struct domain *d, hvm_domain_context_t *h) { printk(XENLOG_G_ERR HVM%d restore: unknown entry typecode %u\n, d-domain_id, desc-typecode); -return -1; +ret = -1; +goto out; } /* Load the entry */ @@ -254,11 +259,17 @@ int hvm_load(struct domain *d, hvm_domain_context_t *h) { printk(XENLOG_G_ERR HVM%d restore: failed to load entry %u/%u\n, d-domain_id, desc-typecode, desc-instance); -return -1; +ret = -1; +goto out; } } ... all of the goto-s should become break-s, ... -/* Not reached */ +ASSERT_UNREACHABLE(); + +out: ... eliminating the need for both of these. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC] hotplug/Linux: Add --wait to iptables calls.
On Mon, Jun 01, 2015 at 04:17:51PM +0100, Ian Campbell wrote: On Mon, 2015-06-01 at 16:08 +0100, Jan Beulich wrote: On 01.06.15 at 16:59, anthony.per...@citrix.com wrote: --- a/tools/hotplug/Linux/vif-common.sh +++ b/tools/hotplug/Linux/vif-common.sh @@ -130,9 +130,9 @@ frob_iptable() local c=-D fi - iptables $c FORWARD -m physdev --physdev-is-bridged --physdev-in $dev \ + iptables --wait $c FORWARD -m physdev --physdev-is-bridged --physdev-in $dev \ $@ -j ACCEPT 2/dev/null - iptables $c FORWARD -m physdev --physdev-is-bridged --physdev-out $dev \ + iptables --wait $c FORWARD -m physdev --physdev-is-bridged --physdev-out $dev \ -j ACCEPT 2/dev/null if [ \( $command == online -o $command == add \) -a $? -ne 0 ] Looking at my oldest system's iptables --help I can't spot such an option (which doesn't necessarily mean it's not supported). Did you make sure all (older) distros we care about actually support this? --wait does not appear work on a debian weezy (Debian 7.8). It's not really clear if/why --wait is the solution to the problem of another party using iptables-{save,restore} to do their own network management in the first place. If OpenStack is doing save/modify/restore then what stops us rewriting things in the middle and then getting those changes clobbered on restore? Surely iptables-save can't exit holding the lock... That could be an issue. And if nova-network is managing networking do we really need to do it too? I will investigate in that, check what there are doing, and what we are doing. The solution might just be a script=none. -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 1/2] xen: separate the xenstore_record_dm_state calls for pv and hvm machines
The following patch will introduce a new option to restrict the privilege of the xenstore connection. In that case, we do not want to use multiple xenstore connections, but just the one, with lower privileges. For this reason, split the xenstore_record_dm_state calls for pv and hvm machines, so that in the hvm case QEMU will reuse the same xenstore connection. (At the moment it opens two and uses the second one for the xenstore_record_dm_state call.) Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- hw/xenpv/xen_machine_pv.c | 11 +++ include/hw/xen/xen.h |2 ++ xen-common-stub.c |4 xen-common.c | 15 +-- xen-hvm.c |1 + 5 files changed, 19 insertions(+), 14 deletions(-) diff --git a/hw/xenpv/xen_machine_pv.c b/hw/xenpv/xen_machine_pv.c index 2e545d2..5ad22e3 100644 --- a/hw/xenpv/xen_machine_pv.c +++ b/hw/xenpv/xen_machine_pv.c @@ -28,6 +28,15 @@ #include xen_domainbuild.h #include sysemu/block-backend.h +static void xen_change_state_handler(void *opaque, int running, + RunState state) +{ +if (running) { +/* record state running */ +xenstore_record_dm_state(xenstore, running); +} +} + static void xen_init_pv(MachineState *machine) { const char *kernel_filename = machine-kernel_filename; @@ -91,6 +100,8 @@ static void xen_init_pv(MachineState *machine) /* setup framebuffer */ xen_init_display(xen_domid); + +qemu_add_vm_change_state_handler(xen_change_state_handler, NULL); } static QEMUMachine xenpv_machine = { diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h index b0ed04c..d118b56 100644 --- a/include/hw/xen/xen.h +++ b/include/hw/xen/xen.h @@ -37,6 +37,8 @@ void xen_cmos_set_s3_resume(void *opaque, int irq, int level); qemu_irq *xen_interrupt_controller_init(void); void xenstore_store_pv_console_info(int i, struct CharDriverState *chr); +extern struct xs_handle *xs; +void xenstore_record_dm_state(struct xs_handle *xs, const char *state); #if defined(NEED_CPU_H) !defined(CONFIG_USER_ONLY) int xen_hvm_init(ram_addr_t *below_4g_mem_size, ram_addr_t *above_4g_mem_size, diff --git a/xen-common-stub.c b/xen-common-stub.c index 906f991..6fcfc96 100644 --- a/xen-common-stub.c +++ b/xen-common-stub.c @@ -11,3 +11,7 @@ void xenstore_store_pv_console_info(int i, CharDriverState *chr) { } + +void xenstore_record_dm_state(struct xs_handle *xs, const char *state) +{ +} diff --git a/xen-common.c b/xen-common.c index 56359ca..97fc312 100644 --- a/xen-common.c +++ b/xen-common.c @@ -83,8 +83,7 @@ void xenstore_store_pv_console_info(int i, CharDriverState *chr) } } - -static void xenstore_record_dm_state(struct xs_handle *xs, const char *state) +void xenstore_record_dm_state(struct xs_handle *xs, const char *state) { char path[50]; @@ -100,16 +99,6 @@ static void xenstore_record_dm_state(struct xs_handle *xs, const char *state) } } - -static void xen_change_state_handler(void *opaque, int running, - RunState state) -{ -if (running) { -/* record state running */ -xenstore_record_dm_state(xenstore, running); -} -} - static int xen_init(MachineState *ms) { xen_xc = xen_xc_interface_open(0, 0, 0); @@ -117,8 +106,6 @@ static int xen_init(MachineState *ms) xen_be_printf(NULL, 0, can't open xen interface\n); return -1; } -qemu_add_vm_change_state_handler(xen_change_state_handler, NULL); - return 0; } diff --git a/xen-hvm.c b/xen-hvm.c index 315864c..1ea567d 100644 --- a/xen-hvm.c +++ b/xen-hvm.c @@ -1108,6 +1108,7 @@ static void xen_hvm_change_state_handler(void *opaque, int running, if (running) { xen_main_loop_prepare(state); +xenstore_record_dm_state(state-xenstore, running); } xen_set_ioreq_server_state(xen_xc, xen_domid, -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxl: Don't insert PCI device into xenstore for HVM guests
On 06/01/2015 04:26 PM, Konrad Rzeszutek Wilk wrote: On Fri, May 29, 2015 at 08:59:45AM +0100, Ross Lagerwall wrote: When doing passthrough of a PCI device for an HVM guest, don't insert the device into xenstore, otherwise pciback attempts to use it which conflicts with QEMU. How does it conflict? It doesn't work with repeated use. See below. This manifests itself such that the first time a device is passed to a domain, it succeeds. Subsequent attempts fail unless the device is unbound from pciback or the machine rebooted. Can you be more specific please? What are the issues? Why does it fail? Without this patch, if a device (e.g. a GPU) is bound to pciback and then passed through to a guest using xl pci-attach, it appears in the guest and works fine. If the guest is rebooted, and the device is again passed through with xl pci-attach, it appears in the guest as before but does not work. In Windows, it gets something like Error Code 43 and on Linux, the Nouveau driver fails to initialize the device (with error -22 or something). The only way to get the device to work again is to reboot the host or unbind and rebind it to pciback. With this patch, it works as expected. The device is bound to pciback and works after being passed through, even after the VM is rebooted. There are certain things that pciback does to prepare an PCI device which QEMU also does. Some of them - such as saving the configuration registers (And then restoring them after the device has been detached) - is something that QEMU does not do. I really have no idea what the correct thing to do is, but the current code with qemu-trad doesn't seem to work (for me). Regards -- Ross Lagerwall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] x86/HVM: Avoid cache flush operations during hvm_load
An MTRR record is processed for each vCPU during hvm_load. Each MTRR record sets several mtrrs, each of which flushes the cache on all pCPUs. This can take some time and trip the watchdog for HVM guests with many CPUs. To fix this, introduce a flag which prevents flushing the cache when doing MTRR operations and instead do a flush at the end of hvm_load. This reduces the time to restore an HVM guest with 32 vCPUs by about 5 seconds. Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com --- xen/arch/x86/hvm/mtrr.c| 5 + xen/common/hvm/save.c | 21 - xen/include/asm-x86/mtrr.h | 9 + 3 files changed, 30 insertions(+), 5 deletions(-) diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c index a69ee62..f21b367 100644 --- a/xen/arch/x86/hvm/mtrr.c +++ b/xen/arch/x86/hvm/mtrr.c @@ -65,6 +65,8 @@ static const uint8_t mm_type_tbl[MTRR_NUM_TYPES][PAT_TYPE_NUMS] = { #undef RS }; +DEFINE_PER_CPU(bool_t, memory_type_changed_ignore); + /* * Reverse lookup table, to find a pat type according to MTRR and effective * memory type. This table is dynamically generated. @@ -789,6 +791,9 @@ HVM_REGISTER_SAVE_RESTORE(MTRR, hvm_save_mtrr_msr, hvm_load_mtrr_msr, void memory_type_changed(struct domain *d) { +if ( this_cpu(memory_type_changed_ignore) ) +return; + if ( need_iommu(d) d-vcpu d-vcpu[0] ) { p2m_memory_type_changed(d); diff --git a/xen/common/hvm/save.c b/xen/common/hvm/save.c index da6e668..c6c27e4 100644 --- a/xen/common/hvm/save.c +++ b/xen/common/hvm/save.c @@ -206,6 +206,7 @@ int hvm_load(struct domain *d, hvm_domain_context_t *h) struct hvm_save_descriptor *desc; hvm_load_handler handler; struct vcpu *v; +int ret = 0; if ( d-is_dying ) return -EINVAL; @@ -222,6 +223,8 @@ int hvm_load(struct domain *d, hvm_domain_context_t *h) if ( test_and_set_bit(_VPF_down, v-pause_flags) ) vcpu_sleep_nosync(v); +this_cpu(memory_type_changed_ignore) = 1; + for ( ; ; ) { if ( h-size - h-cur sizeof(struct hvm_save_descriptor) ) @@ -230,13 +233,14 @@ int hvm_load(struct domain *d, hvm_domain_context_t *h) printk(XENLOG_G_ERR HVM%d restore: save did not end with a null entry\n, d-domain_id); -return -1; +ret = -1; +goto out; } /* Read the typecode of the next entry and check for the end-marker */ desc = (struct hvm_save_descriptor *)(h-data[h-cur]); if ( desc-typecode == 0 ) -return 0; +goto out; /* Find the handler for this entry */ if ( (desc-typecode HVM_SAVE_CODE_MAX) || @@ -244,7 +248,8 @@ int hvm_load(struct domain *d, hvm_domain_context_t *h) { printk(XENLOG_G_ERR HVM%d restore: unknown entry typecode %u\n, d-domain_id, desc-typecode); -return -1; +ret = -1; +goto out; } /* Load the entry */ @@ -254,11 +259,17 @@ int hvm_load(struct domain *d, hvm_domain_context_t *h) { printk(XENLOG_G_ERR HVM%d restore: failed to load entry %u/%u\n, d-domain_id, desc-typecode, desc-instance); -return -1; +ret = -1; +goto out; } } -/* Not reached */ +ASSERT_UNREACHABLE(); + +out: +this_cpu(memory_type_changed_ignore) = 0; +memory_type_changed(d); +return ret; } int _hvm_init_entry(struct hvm_domain_context *h, diff --git a/xen/include/asm-x86/mtrr.h b/xen/include/asm-x86/mtrr.h index 0569db6..9df87cd 100644 --- a/xen/include/asm-x86/mtrr.h +++ b/xen/include/asm-x86/mtrr.h @@ -60,6 +60,15 @@ struct mtrr_state { }; extern struct mtrr_state mtrr_state; +/* + * The purpose of the memory_type_changed_ignore cpu flag is to + * avoid unecessary cache flushes when doing multiple memory type + * operations that may flush the cache. Code can set this flag, do + * several memory type operations, clear the flag and then call + * memory_type_changed() to flush the cache at the end. + */ +DECLARE_PER_CPU(bool_t, memory_type_changed_ignore); + extern void mtrr_save_fixed_ranges(void *); extern void mtrr_save_state(void); extern int mtrr_add(unsigned long base, unsigned long size, -- 2.1.0 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On 01/06/15 14:12, Ian Campbell wrote: On Fri, 2015-05-29 at 14:40 +0100, Julien Grall wrote: Hi Ian, Hi Ian, NIT: You used my Linaro email which I think is de-activated now :). I keep finding new address books with that address in them! ## ITS Translation Table Message signalled interrupts are translated into an LPI via an ITS translation table which must be configured for each device which can generate an MSI. I'm not sure what is the ITS Table Table. Did you mean Interrupt Translation Table? I don't think I wrote Table Table anywhere. Sorry I meant ITS translation table I'm referring to the tables which are established by e.g. the MAPD command and friends, e.g. the thing shown in 4.9.12 Notional ITS Table Structure. On previous paragraph you are referring particularly to Interrupt Translation Table. This is the only table that is configured per device. [..] XXX there are other aspects to virtualising the ITS (LPI collection management, assignment of LPI ranges to guests, device management). However these are not currently considered here. XXX Should they be/do they need to be? I think we began to cover these aspect with the section command emulation. Some aspects, yes. I went with: There are other aspects to virtualising the ITS (LPI collection management, assignment of LPI ranges to guests, device management). However these are only considered here to the extent needed for describing the vITS emulation. XXX In the context of virtualised device ids this may not be the case, e.g. we can arrange for (mostly) contiguous device ids and we know the bound is significantly lower than 2^32 Well, the deviceID is computed from the BDF and some DMA alias. As the algorithm can't be tweaked, it's very likely that we will have non-contiguous Device ID. See pci_for_each_dma_alias in Linux (drivers/pci/search.c). The implication here is that deviceID is fixed in hardware and is used by driver domain software in contexts where we do not get the opportunity to translate is that right? What contexts are those? No, the driver domain software will always use a virtual DeviceID (based on the vBDF and other things). The problem I wanted to raise is how to translate back the vDeviceID to a physical deviceID/BDF. Note that the BDF is also something which we could in principal virtualise (we already do for domU). Perhaps that is infeasible for dom0 though? For DOM0 the virtual BDF is equal to the physical BDF. So the both deviceID (physical and virtual) will be the same. We may decide to do vBDF == pBDF for guest too in order to simplify the code. That gives me two thoughts. The first is that although device identifiers are not necessarily contiguous, they are generally at least grouped and not allocated at random through the 2^32 options. For example a PCI Host bridge typically has a range of device ids associated with it and each device has a device id derived from that. Usually it's one per (device, function). I'm not sure if we can leverage that into a more useful data structure than an R-B tree, or for example to arrange for the R-B to allow for the translation of a device within a span into the parent span and from there do the lookup. Specifically when looking up a device ID corresponding to a PCI device we could arrange to find the PCI host bridge and find the actual device from there. This would keep the RB tree much smaller and therefore perhaps quicker? Of course that depends on what the lookup from PCI host bridge to a device looked like. I'm not sure why you are speaking about PCI host bridge. AFAIK, the guest doesn't have a physical host bridge. Although, this is an optimization that we can think about it later. The R-B will already be fast enough for a first implementation. My main point was about the translation vDeviceID = pDeviceID. The second is that perhaps we can do something simpler for the domU case, if we were willing to tolerate it being different from dom0. Possible efficient data structures would be: 1. List: The lookup/deletion is in O(n) and the insertion will depend if the device should be sorted following their identifier. The memory overhead is 18 bytes per element. 2. Red-black tree: All the operations are O(log(n)). The memory overhead is 24 bytes per element. A Red-black tree seems the more suitable for having fast deviceID validation even though the memory overhead is a bit higher compare to the list. ### Event ID (`vID`) This is the per-device Interrupt identifier (i.e. the MSI index). It is configured by the device driver software. It is not necessary to translate a `vID`, however they may need to be represented in various data structures given to the pITS. XXX is any of this true? Right, the vID will always be equal to the pID. Although you will need to associate a physical LPI for every pair (vID, DevID). I think in the
[Xen-devel] [PATCH 0/2] restrict the privilege of the xenstore connection
Hi all, this patch series introduces a new command line option to restrict the privilege of the xenstore connection. Used together with -runas, can help secure the execution of QEMU in Dom0. Stefano Stabellini (2): xen: separate the xenstore_record_dm_state calls for pv and hvm machines xen: introduce xsrestrict hw/xenpv/xen_machine_pv.c | 11 +++ include/hw/xen/xen.h |4 qemu-options.hx | 15 +++ vl.c |8 xen-common-stub.c |6 ++ xen-common.c | 15 +-- xen-hvm.c | 38 ++ 7 files changed, 75 insertions(+), 22 deletions(-) Cheers, Stefano ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/arm: gic-hip04: Resync the driver with the GICv2
On 01/06/15 14:18, Zoltan Kiss wrote: Sure. Btw. do you have them in a public repo somewhere? It would make it a little bit easier to apply and test. It's based on the latest staging: http://xenbits.xen.org/gitweb/?p=people/julieng/xen-unstable.git;a=shortlog;h=refs/heads/for-huawei/gicv2-on-gicv3 Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel