Re: [Xen-devel] Regression, host crash with 4.5rc1
On 20.11.14 at 02:23, sfl...@ihonk.com wrote: On 11/17/2014 23:54, Jan Beulich wrote: Another thing - now that serial logging appears to be working for you, did you try whether the host, once hung, still reacts to serial input (perhaps force input to go to Xen right at boot via the conswitch= option)? If so, 'd' debug-key output would likely be the piece of most interest. Here you go. Performed with a checkout of 9a727a81 (because it was handy), let me know if you'd rather see the results from 4.5-rc2 or any other Xen debugging info: Interesting - all CPUs are executing stuff from Dom1, which be itself is not indicating a problem, but may (considering the host hang) hint at guest vCPU-s not getting de-scheduled anymore on one or more of the CPUs. The 'a' debug key output would hopefully give us enough information to know whether that's the case. Ideally do 'd' and 'a' a couple of times each (alternating, and with a few seconds in between). Also, for double checking purposes, could you make the xen-syms of a build you observed the problem with available somewhere? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 for-xen-4.5 2/2] dpci: Replace tasklet with an softirq
On 19.11.14 at 17:44, konrad.w...@oracle.com wrote: On Fri, Nov 14, 2014 at 11:11:46AM -0500, Konrad Rzeszutek Wilk wrote: On Fri, Nov 14, 2014 at 03:13:42PM +, Jan Beulich wrote: On 12.11.14 at 03:23, konrad.w...@oracle.com wrote: +static void pt_pirq_softirq_reset(struct hvm_pirq_dpci *pirq_dpci) +{ +struct domain *d = pirq_dpci-dom; + +ASSERT(spin_is_locked(d-event_lock)); + +switch ( cmpxchg(pirq_dpci-state, 1 STATE_SCHED, 0) ) +{ +case (1 STATE_SCHED): +/* + * We are going to try to de-schedule the softirq before it goes in + * STATE_RUN. Whoever clears STATE_SCHED MUST refcount the 'dom'. + */ +put_domain(d); +/* fallthrough. */ Considering Sander's report, the only suspicious place I find is this one: When the STATE_SCHED flag is set, pirq_dpci is on some CPU's list. What guarantees it to get removed from that list before getting inserted on another one? None. The moment that STATE_SCHED is cleared, 'raise_softirq_for' is free to manipulate the list. I was too quick to say this. A bit more inspection shows that while 'raise_softirq_for' is free to manipulate the list - it won't be called. The reason is that the pt_pirq_softirq_reset is called _after_ the IRQ action handler are removed for this IRQ. That means we will not receive any interrupts for it and call 'raise_softirq_for'. At least until 'pt_irq_create_bind' is called. And said function has a check for this too: 42 * A crude 'while' loop with us dropping the spinlock and giving 243 * the softirq_dpci a chance to run. 244 * We MUST check for this condition as the softirq could be scheduled 245 * and hasn't run yet. Note that this code replaced tasklet_kill which 246 * would have spun forever and would do the same thing (wait to flush out 247 * outstanding hvm_dirq_assist calls. 248 */ 249 if ( pt_pirq_softirq_active(pirq_dpci) ) With that you correct your earlier reply, but I don't see how this answers my original question. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] dom0 kenrel crashes for openstack + libvirt + libxl
Hi Ian, Both of your two points are valid. There is no need to install virt-manager. And the patch to start a qemu process in /etc/init.d/xen seems to be enough for launching instances from horizon. I have updated the wiki page. Please review it at http://wiki.xenproject.org/wiki/Xen_via_libvirt_for_OpenStack_juno Thanks, Ian! -Xing On Mon, Nov 17, 2014 at 5:39 AM, Ian Campbell ian.campb...@citrix.com wrote: On Fri, 2014-11-14 at 21:10 -0700, Xing Lin wrote: Hi, The wiki page is ready. I am not sure whether I am using the correct format or not. Please let me know if any changes are need. Thanks, http://wiki.xenproject.org/wiki/Xen_via_libvirt_for_OpenStack_juno Thanks for this. WRT the need to install virt manager to avoid the cannot open shared object file issue I expect just running ldconfig would have worked instead. It would also be good to understand why it is necessary to install from source. Was it just the lack of the xencommons initscript? Debian and Ubuntu have their own initscripts and don't reuse the xencommons script. However it should be fairly easy to add the necessary commands to start qemu to /etc/init.d/xen instead of rebuilding from source. I'd expect just adding to the end of the start) section of the script to work. e.g. *) log_end_msg 1; exit ;; esac log_end_msg 0 + /usr/local/bin/qemu-system-i386 -xen-domid 0 -xen-attach -name dom0 -nographic -M xenpv -daemonize \ + -monitor /dev/null -serial /dev/null -parallel /dev/null \ + -pidfile /var/run/qemu-xen-dom0.pid ;; stop) capability_check case $? in 0) ;; (nb, that's not a real patch, I just typed it into my mail client as is) If you can confirm that this works then I can try and get this fixed in Debian at least. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][RFC][PATCH 06/13] hvmloader/ram: check if guest memory is out of reserved device memory maps
From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Thursday, November 20, 2014 4:04 PM On 20.11.14 at 08:45, kevin.t...@intel.com wrote: Yang and I did some discussion here. We understand your point to avoid introducing new interface if we can leverage existing code. However it's not a trivial effort to move device assignment before populating p2m, and there is no other benefit of doing so except for this purpose. So we'd not suggest this way. It's not a trivial effort is pretty vague: What specifically is it that makes this difficult? I wouldn't expect there to be any strong dependencies on the ordering of these two operations... I'll leave to Yang to answer this part, who did a detail investigation on that, e.g. on IOMMU page table setup, etc. But what really matters here is not only about complexity, but also flexibility. Doing so will tie the policy to assigned device only, which removes the option to support hotpluggable device. Current option sounds a reasonable one, i.e. passing a list of BDFs assigned to this VM before populating p2m, and then having hypervisor to filter out reserved regions associated with those BDFs. This way libxc teaches Xen to create reserved regions once, and then later the filtered info is returned upon query. Reasonable, but partly redundant. The positive aspect being that it permits this list and the list of actually being assigned devices to be different, i.e. allowing holes to be set up for devices that only _may_ get assigned at some point. redundant if we think the list only exactly matching the statically assigned devices. but that's just current point. reasonable if we think there may be other policies impacting the list (e.g. if hotplugable device may have a config option and then we have a potential list larger than static assigned devices). From this angle I think the new interface actually makes sense for this very purpose. The limitation of wasted memory due to confliction can be mitigated, and we considered further enhancement can be made later in libxc that when populating p2m, the reserved regions can be skipped explicitly at initial p2m creation phase and then there would be no waste at all. But this optimization takes some time and can be built incrementally on current patch and interface, post 4.5 release. For now let's focus on the very correctness first. I agree, as long as the optimization part doesn't get dropped after the correctness part went in. definitely. after putting so much effort in last months from Tiejun, you can see our willingness to make it correct and continuously improved. Thanks Kevin ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Problems accessing passthrough PCI device
On 19.11.14 at 16:12, furryfutt...@gmail.com wrote: This is getting more interesting. It seems that something is overwriting the pci-back configuration data. Starting from a fresh reboot I checked the Dom0 pci configuration and got this: root@smartin-xen:~# lspci -s 00:19.0 -x 00:19.0 Ethernet controller: Intel Corporation Device 1559 (rev 04) 00: 86 80 59 15 00 00 10 00 04 00 00 02 00 00 00 00 10: 00 00 d0 f7 00 c0 d3 f7 81 f0 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 54 20 30: 00 00 00 00 c8 00 00 00 00 00 00 00 05 01 00 00 I then start/stop my DomU and checked the Dom0 pci configuration again and got this: root@smartin-xen:~# lspci -s 00:19.0 -x 00:19.0 Ethernet controller: Intel Corporation Device 1559 (rev 04) 00: 86 80 59 15 00 00 10 00 04 00 00 02 00 00 00 00 10: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 54 20 30: 00 00 00 00 c8 00 00 00 00 00 00 00 05 01 00 00 I.e. the BARs got zapped. Not something that should happen. And certainly explaining why you would not be able to use the device a second time. Inside my DomU I added code to print the PCI configuration registers and what I get after restarting the DomU is: (d18) 14:57:04.042 src/e1000e.c@00150: 00: 86 80 59 15 00 00 10 00 04 00 00 02 00 00 00 00 (d18) 14:57:04.042 src/e1000e.c@00150: 10: 00 00 d0 f7 00 c0 d3 f7 81 f0 00 00 00 00 00 00 (d18) 14:57:04.042 src/e1000e.c@00150: 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 54 20 (d18) 14:57:04.043 src/e1000e.c@00150: 30: 00 00 00 00 c8 00 00 00 00 00 00 00 14 01 00 00 (d18) 14:57:04.043 src/e1000e.c@00324: Enable PCI Memory Access (d18) 14:57:05.043 src/e1000e.c@00150: 00: 86 80 59 15 03 00 10 00 04 00 00 02 00 00 00 00 (d18) 14:57:05.044 src/e1000e.c@00150: 10: 00 00 d0 f7 00 c0 d3 f7 81 f0 00 00 00 00 00 00 (d18) 14:57:05.044 src/e1000e.c@00150: 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 54 20 (d18) 14:57:05.045 src/e1000e.c@00150: 30: 00 00 00 00 c8 00 00 00 00 00 00 00 14 01 00 00 As you can see the pci configuration read from the pci-back driver by my DomU is different to the data in the Dom0 pci configuration! The only byte that is different afaics is the one at 0x3c, and that's fine. Plus comparing with what Dom0 sees before the guest was started or after the guest was stopped isn't really meaningful - instead you'd want to compare against what Dom0 sees while the guest is running. But in the end your problem may have to do with the various warnings issued regarding unrecognized DMAR table data, all relating to the use of namespaces that our code doesn't have support for. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V3] Decouple SnadyBridge quirk form VTd timeout
On Wed, 2014-11-19 at 12:46 -0700, Donald D. Dugger wrote: Currently the quirk code for SandyBridge uses the VTd timeout value when You've got a typo in the subject (SnadyBridge). ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 0/4 for-4.5] xen: arm: xgene bug fixes + support for McDivitt
On Wed, 2014-11-19 at 16:18 -0500, Konrad Rzeszutek Wilk wrote: On Tue, Nov 18, 2014 at 04:51:42PM +, Ian Campbell wrote: On Tue, 2014-11-18 at 16:44 +, Ian Campbell wrote: These patches: ... which are also at git://xenbits.xen.org/people/ianc/xen.git mcdivitt-v1 I presume you are going to post v2 with Julian's feedback rolled in? Yes, see 1416410868.29243.39.ca...@citrix.com. I took a look at the code and it looks Xen 4.5 material so I am OK with it rolling in, but would appreciate another posting just to make sure that nothing is amiss. Thank you! Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] INSTALL: correct EXTRA_CFLAGS handling
The already documented configure patch was not applied. Adjust documentation to describe existing behaviour. Signed-off-by: Olaf Hering o...@aepfle.de --- INSTALL | 19 +-- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/INSTALL b/INSTALL index 6bb9d23..656c90a 100644 --- a/INSTALL +++ b/INSTALL @@ -128,13 +128,6 @@ original xenstored will be used. Valid names are xenstored and oxenstored. --with-xenstored=name -Using additional CFLAGS to build tools running in dom0 is required when -building distro packages. This is the option to pass things like -RPM_OPT_FLAGS. - --with-extra-cflags-tools=EXTRA_CFLAGS - --with-extra-cflags-qemu-traditional=EXTRA_CFLAGS - --with-extra-cflags-qemu-upstream=EXTRA_CFLAGS - Instead of starting the tools in dom0 with sysv runlevel scripts they can also be started by systemd. If this option is enabled xenstored will receive the communication socked directly from systemd. So starting it @@ -241,6 +234,12 @@ QEMU_UPSTREAM_URL= QEMU_TRADITIONAL_URL= SEABIOS_UPSTREAM_URL= +Using additional CFLAGS to build tools running in dom0 is required when +building distro packages. This can be used to pass RPM_OPT_FLAGS. +EXTRA_CFLAGS_XEN_TOOLS= +EXTRA_CFLAGS_QEMU_TRADITIONAL= +EXTRA_CFLAGS_QEMU_XEN= + This variable can be used to use DIR/include and DIR/lib during build. This is the same as PREPEND_LIB and PREPEND_INCLUDES. APPEND_LIB and APPEND_INCLUDES= will be appended to the CFLAGS/LDFLAGS variable. @@ -310,10 +309,10 @@ sudo make install BOOT_DIR=/ood/path/boot EFI_DIR=/odd/path/efi %build export WGET=$(type -P false) export GIT=$(type -P false) +export EXTRA_CFLAGS_XEN_TOOLS=$RPM_OPT_FLAGS +export EXTRA_CFLAGS_QEMU_TRADITIONAL=$RPM_OPT_FLAGS +export EXTRA_CFLAGS_QEMU_XEN=$RPM_OPT_FLAGS %configure \ ---with-extra-cflags-tools=$RPM_OPT_FLAGS \ ---with-extra-cflags-qemu-traditional=$RPM_OPT_FLAGS \ ---with-extra-cflags-qemu-upstream=$RPM_OPT_FLAGS \ --with-initddir=%{_initddir} unset CFLAGS CXXFLAGS FFLAGS LDFLAGS make ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Number of NICs per VM with qemu-upstream (Was: Re: Re: [Xen-users] libvirt emulator /usr/local/lib/xen/bin/qemu-dm emulator did not work on xen-4.4)
On Mon, 2014-11-17 at 13:00 +, Stefano Stabellini wrote: On Mon, 17 Nov 2014, Ian Campbell wrote: On Sat, 2014-11-15 at 10:16 +0800, hanyandong wrote: By the way, how many NICs can I apply to a VM? On xen-4.4.0, Using qemu-dm, I can apply 8 NIC to a VM, but using qemu-system-i386, I can only apply 4 NICs to an VM? is it normal? I've no idea, CCing the qemu maintainers. I'd have expected the number of PV nics to be completely independent of the device mode, so I suppose you mean emulated NICs? I can pass 4 emulated NICs maximum, but you can easily reach 8 if you use PV NICs instead. Just pass 'type=pv' like this: vif=['', '', '', '', 'type=pv', 'type=pv', 'type=pv', 'type=pv'] it is going to create 4 emulated nics and 4 pv nics. The 4 emulated nics also have 4 corresponding pv nics. The emulated nics get disconnected soon after boot by the guest operating system (if it has pv drivers installed, such as Linux). So overall once the boot sequence is fully completed you'll end up with the 8 pv nics that you want. I wonder if we should do something in (lib)xl such that by default the first 4 NICs have type LIBXL_NIC_TYPE_VIF_IOEMU (i.e. emulated+PV path) and the rest have LIBXL_NIC_TYPE_VIF (i.e. PV only). BTW the reason for the failure seems to be that QEMU runs out of ram (xen: failed to populate ram at 8011, so xc_domain_populate_physmap_exact failed) allocating roms for the rtl8139 (4 bytes each). Maybe qemu-trad wasn't loading any roms for rtl8139. Interestingly e1000 doesn't need any roms either, so another way around this would be to set 'type=e1000' for all the vifs. Or to use the new option in 4.5 to increase the MMIO space (or is that not where ROMs end up?) Do we need to plumb through qemu's optionrom parameter to allow a) limiting the number of NICs which will try to do PXE and b) allow custom roms etc? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] dom0 kenrel crashes for openstack + libvirt + libxl
On Wed, 2014-11-19 at 11:57 -0700, Xing Lin wrote: Hi Ian, Both of your two points are valid. There is no need to install virt-manager. And the patch to start a qemu process in /etc/init.d/xen seems to be enough for launching instances from horizon. I have updated the wiki page. Please review it at http://wiki.xenproject.org/wiki/Xen_via_libvirt_for_OpenStack_juno Looks much better, thanks! I think the need to symlink pygrub to /usr/bin should be considered an openstack bug. I presume nova specifies the absolute path, when it should just say pygrub by default and let the toolstack find the default one. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 0/3] x86: XSA-109/110 follow-up (to be considered for 4.5)
1: tighten page table owner checking in do_mmu_update() 2: don't ignore foreigndom input on various MMUEXT ops 3: HVM: don't crash guest upon problems occurring in user mode Reason to request considering this for 4.5: Tightened argument checking (as done in the first two patches) reduces the chances of them getting abused. Not unduly crashing the guest (as done in the third one) may avoid future security issues of guest user mode affecting the guest kernel. Signed-off-by: Jan Beulich jbeul...@suse.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 1/3] x86: tighten page table owner checking in do_mmu_update()
MMU_MACHPHYS_UPDATE, not manipulating page tables, shouldn't ignore a bad page table domain being specified. Also pt_owner can't be NULL when reaching the out label, so the respective check can be dropped. Signed-off-by: Jan Beulich jbeul...@suse.com Acked-by: Tim Deegan t...@xen.org --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -3618,6 +3618,11 @@ long do_mmu_update( break; case MMU_MACHPHYS_UPDATE: +if ( unlikely(d != pt_owner) ) +{ +rc = -EPERM; +break; +} mfn = req.ptr PAGE_SHIFT; gpfn = req.val; @@ -3694,7 +3699,7 @@ long do_mmu_update( perfc_add(num_page_updates, i); out: -if ( pt_owner (pt_owner != d) ) +if ( pt_owner != d ) rcu_unlock_domain(pt_owner); /* Add incremental work we have done to the @done output parameter. */ x86: tighten page table owner checking in do_mmu_update() MMU_MACHPHYS_UPDATE, not manipulating page tables, shouldn't ignore a bad page table domain being specified. Also pt_owner can't be NULL when reaching the out label, so the respective check can be dropped. Signed-off-by: Jan Beulich jbeul...@suse.com Acked-by: Tim Deegan t...@xen.org --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -3618,6 +3618,11 @@ long do_mmu_update( break; case MMU_MACHPHYS_UPDATE: +if ( unlikely(d != pt_owner) ) +{ +rc = -EPERM; +break; +} mfn = req.ptr PAGE_SHIFT; gpfn = req.val; @@ -3694,7 +3699,7 @@ long do_mmu_update( perfc_add(num_page_updates, i); out: -if ( pt_owner (pt_owner != d) ) +if ( pt_owner != d ) rcu_unlock_domain(pt_owner); /* Add incremental work we have done to the @done output parameter. */ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 2/3] x86: don't ignore foreigndom input on various MMUEXT ops
Instead properly fail requests that shouldn't be issued on foreign domains or - for MMUEXT_{CLEAR,COPY}_PAGE - extend the existing operation to work that way. In the course of doing this the need to always clear okay even when wanting an error code other than -EINVAL became unwieldy, so the respective logic is being adjusted at once, together with a little other related cleanup. Signed-off-by: Jan Beulich jbeul...@suse.com --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -3062,23 +3062,23 @@ long do_mmuext_op( } case MMUEXT_NEW_BASEPTR: -if ( paging_mode_translate(d) ) -okay = 0; +if ( unlikely(d != pg_owner) ) +rc = -EPERM; +else if ( unlikely(paging_mode_translate(d)) ) +rc = -EINVAL; else -{ rc = new_guest_cr3(op.arg1.mfn); -okay = !rc; -} break; case MMUEXT_NEW_USER_BASEPTR: { unsigned long old_mfn; -if ( paging_mode_translate(current-domain) ) -{ -okay = 0; +if ( unlikely(d != pg_owner) ) +rc = -EPERM; +else if ( unlikely(paging_mode_translate(d)) ) +rc = -EINVAL; +if ( unlikely(rc) ) break; -} old_mfn = pagetable_get_pfn(curr-arch.guest_table_user); /* @@ -3136,12 +3136,17 @@ long do_mmuext_op( } case MMUEXT_TLB_FLUSH_LOCAL: -flush_tlb_local(); +if ( likely(d == pg_owner) ) +flush_tlb_local(); +else +rc = -EPERM; break; case MMUEXT_INVLPG_LOCAL: -if ( !paging_mode_enabled(d) - || paging_invlpg(curr, op.arg1.linear_addr) != 0 ) +if ( unlikely(d != pg_owner) ) +rc = -EPERM; +else if ( !paging_mode_enabled(d) || + paging_invlpg(curr, op.arg1.linear_addr) != 0 ) flush_tlb_one_local(op.arg1.linear_addr); break; @@ -3150,13 +3155,16 @@ long do_mmuext_op( { cpumask_t pmask; -if ( unlikely(vcpumask_to_pcpumask(d, -guest_handle_to_param(op.arg2.vcpumask, const_void), -pmask)) ) -{ -okay = 0; +if ( unlikely(d != pg_owner) ) +rc = -EPERM; +else if ( unlikely(vcpumask_to_pcpumask(d, + guest_handle_to_param(op.arg2.vcpumask, + const_void), + pmask)) ) +rc = -EINVAL; +if ( unlikely(rc) ) break; -} + if ( op.cmd == MMUEXT_TLB_FLUSH_MULTI ) flush_tlb_mask(pmask); else @@ -3165,18 +3173,26 @@ long do_mmuext_op( } case MMUEXT_TLB_FLUSH_ALL: -flush_tlb_mask(d-domain_dirty_cpumask); +if ( likely(d == pg_owner) ) +flush_tlb_mask(d-domain_dirty_cpumask); +else +rc = -EPERM; break; case MMUEXT_INVLPG_ALL: -flush_tlb_one_mask(d-domain_dirty_cpumask, op.arg1.linear_addr); +if ( likely(d == pg_owner) ) +flush_tlb_one_mask(d-domain_dirty_cpumask, op.arg1.linear_addr); +else +rc = -EPERM; break; case MMUEXT_FLUSH_CACHE: -if ( unlikely(!cache_flush_permitted(d)) ) +if ( unlikely(d != pg_owner) ) +rc = -EPERM; +else if ( unlikely(!cache_flush_permitted(d)) ) { MEM_LOG(Non-physdev domain tried to FLUSH_CACHE.); -okay = 0; +rc = -EACCES; } else { @@ -3185,8 +3201,8 @@ long do_mmuext_op( break; case MMUEXT_FLUSH_CACHE_GLOBAL: -if ( unlikely(foreigndom != DOMID_SELF) ) -okay = 0; +if ( unlikely(d != pg_owner) ) +rc = -EPERM; else if ( likely(cache_flush_permitted(d)) ) { unsigned int cpu; @@ -3211,7 +3227,9 @@ long do_mmuext_op( unsigned long ptr = op.arg1.linear_addr; unsigned long ents = op.arg2.nr_ents; -if ( paging_mode_external(d) ) +if ( unlikely(d != pg_owner) ) +rc = -EPERM; +else if ( paging_mode_external(d) ) { MEM_LOG(ignoring SET_LDT hypercall from external domain); okay = 0; @@ -3237,7 +3255,7 @@ long do_mmuext_op( case MMUEXT_CLEAR_PAGE: { struct page_info *page; -page =
Re: [Xen-devel] Xen-4.5 Testing - Cache Monitoring Technology
On Thu, Nov 20, 2014 at 09:58:37AM +, Andrew Cooper wrote: Hi, I have found myself a PSR-capable server and have been having a play with Xen-4.5 At a first pass, I can get some numbers out: [root@blob ~]# xl psr-cmt-attach 0 [root@blob ~]# xl psr-cmt-show cache_occupancy Total RMID: 71 NameIDSocket 0 Socket 1 Total L3 Cache Size 46080 KB 46080 KB Domain-0 045072 KB0 KB [root@blob ~]# However, I am unable to get any occupancy information for HVM guests. Given nothing obvious in the code which would make CMT PV-guest specific, I presume this is unexpected? Thank your for trying... But I just tested the HVM on my box and it runs correct. Have you missed to run psr-cmt-attach for your HVM domain? Otherwise I think there may be something wrong. The cmt itself should not be PV-specific. Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][RFC][PATCH 06/13] hvmloader/ram: check if guest memory is out of reserved device memory maps
Have a second struct acpi_rmrr_unit pointer, starting out as NULL and getting set to the current one when the callback returns a positive value. Skip further iterations as long as both pointers match. Great! int intel_iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt) { struct acpi_rmrr_unit *rmrr, *rmrr_cur = NULL; int rc = 0; unsigned int i; u16 bdf; for_each_rmrr_device ( rmrr, bdf, i ) { if ( rmrr != rmrr_cur ) { rc = func(PFN_DOWN(rmrr-base_address), PFN_UP(rmrr-end_address) - PFN_DOWN(rmrr-base_address), PCI_SBDF(rmrr-segment, bdf), ctxt); if ( unlikely(rc 0) ) return rc; /* Hit this entry so just go next. */ if ( rc 0 ) rmrr_cur = rmrr; } } return 0; } Thanks Tiejun ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen 4.5 random freeze question
On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote: 19 лист. 2014 20:32, користувач Stefano Stabellini stefano.stabell...@eu.citrix.com написав: On Wed, 19 Nov 2014, Julien Grall wrote: On 11/19/2014 06:14 PM, Stefano Stabellini wrote: That's right, the maintenance interrupt handler is not called, but it doesn't do anything so we are fine. The important thing is that an interrupt is sent and git_clear_lrs gets called on hypervisor entry. It would be worth to write down this somewhere. Just in case someone decide to add code in maintenance interrupt later. Yes, I could add a comment in the handler Maybe it wouldn't take a lot of effort to fix it? I am just worrying that we may hide some issue - typically spurious interrupt this not what is expected. My guess is that by clearing UIE before reading GICC_IAR, we clear the maintenance interrupt too, as a consequence the following read to GICC_IAR would return 1023 (nothing to be read). As bit as if the maintenance interrupt was a level interrupt and we just disabled it. So I think that if we cleared UIE after reading GICC_IAR, GICC_IAR would return the correct value. However with the current structure of the code, the first thing that we do upon entering the hypervisor is clearing LRs and given what happened on your platform I think is a good idea to do it with UIE disabled. This is way I would rather read spurious interrupts but read/write LRs with UIE disabled than reading maintenance interrupts but risking strange behaviours on some platforms.___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/3] x86: tighten page table owner checking in do_mmu_update()
On 20/11/14 10:11, Jan Beulich wrote: MMU_MACHPHYS_UPDATE, not manipulating page tables, shouldn't ignore a bad page table domain being specified. Also pt_owner can't be NULL when reaching the out label, so the respective check can be dropped. Yes it can. Failing if ( (pg_owner = get_pg_owner((uint16_t)foreigndom)) == NULL ) { rc = -ESRCH; goto out; } around line 3462 will cause pt_owner to be NULL at the out label. ~Andrew Signed-off-by: Jan Beulich jbeul...@suse.com Acked-by: Tim Deegan t...@xen.org --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -3618,6 +3618,11 @@ long do_mmu_update( break; case MMU_MACHPHYS_UPDATE: +if ( unlikely(d != pt_owner) ) +{ +rc = -EPERM; +break; +} mfn = req.ptr PAGE_SHIFT; gpfn = req.val; @@ -3694,7 +3699,7 @@ long do_mmu_update( perfc_add(num_page_updates, i); out: -if ( pt_owner (pt_owner != d) ) +if ( pt_owner != d ) rcu_unlock_domain(pt_owner); /* Add incremental work we have done to the @done output parameter. */ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/3] x86: tighten page table owner checking in do_mmu_update()
On 20/11/14 10:29, Andrew Cooper wrote: On 20/11/14 10:11, Jan Beulich wrote: MMU_MACHPHYS_UPDATE, not manipulating page tables, shouldn't ignore a bad page table domain being specified. Also pt_owner can't be NULL when reaching the out label, so the respective check can be dropped. Yes it can. Failing if ( (pg_owner = get_pg_owner((uint16_t)foreigndom)) == NULL ) { rc = -ESRCH; goto out; } around line 3462 will cause pt_owner to be NULL at the out label. ~Andrew ...And I should really double check my reply before I send. Apologies for the noise. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [for-xen-4.5 PATCH v2 1/2] dpci: Fix list corruption if INTx device is used and an IRQ timeout is invoked.
On 19.11.14 at 23:21, konrad.w...@oracle.com wrote: @@ -97,13 +97,15 @@ bool_t pt_pirq_softirq_active(struct hvm_pirq_dpci *pirq_dpci) } /* - * Reset the pirq_dpci-dom parameter to NULL. + * Cancels an outstanding pirq_dpci (if scheduled). Also if clear is set, + * reset pirq_dpci-dom parameter to NULL (used for teardown). * * This function checks the different states to make sure it can do it * at the right time. If it unschedules the 'hvm_dirq_assist' from running * it also refcounts (which is what the softirq would have done) properly. */ -static void pt_pirq_softirq_reset(struct hvm_pirq_dpci *pirq_dpci) +static void pt_pirq_softirq_cancel(struct hvm_pirq_dpci *pirq_dpci, + unsigned int clear) Looks reasonable apart from this wanting to be bool_t, and apart from the fact that iiuc it still doesn't eliminate the observed crashes (in which case the fix for that may better be folded into here). I'm not really convinced that this fix is what you currently have as patch 2 (as the previously mentioned problem of _reset() [now _cancel()] clearing STATE_SCHED without removing the list entry from the list is now becoming even more obvious), but I'll also comment there. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/3] x86: don't ignore foreigndom input on various MMUEXT ops
At 11:12 +0100 on 20 Nov (1416478351), Jan Beulich wrote: Instead properly fail requests that shouldn't be issued on foreign domains or - for MMUEXT_{CLEAR,COPY}_PAGE - extend the existing operation to work that way. In the course of doing this the need to always clear okay even when wanting an error code other than -EINVAL became unwieldy, so the respective logic is being adjusted at once, together with a little other related cleanup. Signed-off-by: Jan Beulich jbeul...@suse.com Reviewed-by: Tim Deegan t...@xen.org though I would prefer this as two patches, one to remove 'okay' altogether in favour of appropriate rc settings, and then a simpler version of this. Tim. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [for-xen-4.5 PATCH v2 2/2] dpci: Add ZOMBIE state to allow the softirq to finish with the dpci_pirq.
On 19.11.14 at 23:21, konrad.w...@oracle.com wrote: Leaving aside the question of whether this is the right approach, in case it is a couple of comments: @@ -85,7 +91,7 @@ static void raise_softirq_for(struct hvm_pirq_dpci *pirq_dpci) */ bool_t pt_pirq_softirq_active(struct hvm_pirq_dpci *pirq_dpci) { -if ( pirq_dpci-state ((1 STATE_RUN) | (1 STATE_SCHED)) ) +if ( pirq_dpci-state ((1 STATE_RUN) | (1 STATE_SCHED) | (1 STATE_ZOMBIE) ) ) This is becoming unwieldy - perhaps better just if ( pirq_dpci-state )? @@ -122,6 +128,7 @@ static void pt_pirq_softirq_cancel(struct hvm_pirq_dpci *pirq_dpci, /* fallthrough. */ case (1 STATE_RUN): case (1 STATE_RUN) | (1 STATE_SCHED): +case (1 STATE_RUN) | (1 STATE_SCHED) | (1 STATE_ZOMBIE): How would we get into this state? Afaict STATE_ZOMBIE can't be set at the same time as STATE_SCHED. @@ -786,6 +793,7 @@ unlock: static void dpci_softirq(void) { unsigned int cpu = smp_processor_id(); +unsigned int reset = 0; bool_t (and would better be inside the loop, avoiding the pointless re-init on the last iteration). But taking it as a whole - aren't we now at risk of losing an interrupt instance the guest expects, due to raise_softirq_for() bailing on zombie entries, and dpci_softirq() being the only place where the zombie state gets cleared? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen 4.5 random freeze question
On 11/20/2014 10:28 AM, Stefano Stabellini wrote: On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote: 19 лист. 2014 20:32, користувач Stefano Stabellini stefano.stabell...@eu.citrix.com написав: On Wed, 19 Nov 2014, Julien Grall wrote: On 11/19/2014 06:14 PM, Stefano Stabellini wrote: That's right, the maintenance interrupt handler is not called, but it doesn't do anything so we are fine. The important thing is that an interrupt is sent and git_clear_lrs gets called on hypervisor entry. It would be worth to write down this somewhere. Just in case someone decide to add code in maintenance interrupt later. Yes, I could add a comment in the handler Maybe it wouldn't take a lot of effort to fix it? I am just worrying that we may hide some issue - typically spurious interrupt this not what is expected. My guess is that by clearing UIE before reading GICC_IAR, we clear the maintenance interrupt too, as a consequence the following read to GICC_IAR would return 1023 (nothing to be read). As bit as if the maintenance interrupt was a level interrupt and we just disabled it. So I think that if we cleared UIE after reading GICC_IAR, GICC_IAR would return the correct value. However with the current structure of the code, the first thing that we do upon entering the hypervisor is clearing LRs and given what happened on your platform I think is a good idea to do it with UIE disabled. Agreed. UIE should be disabled to avoid another maintenance interrupt as soon as we EOI the IRQ. This is way I would rather read spurious interrupts but read/write LRs with UIE disabled than reading maintenance interrupts but risking strange behaviours on some platforms. Reading the GIC-v2 documentation, the spurious interrupt things should happen on any platform every time the UIE is disabled while we receive a maintenance interrupt. The read returns a spurious interrupt ID of 1023 if any of the following apply: - no pending interrupt on the CPU interface has sufficient priority for the interface to signal it to the processor -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 for-4.5 1/5] xen: arm: Add earlyprintk for McDivitt.
Hi Ian, On 11/19/2014 03:28 PM, Ian Campbell wrote: Signed-off-by: Ian Campbell ian.campb...@citrix.com --- v2: Remove pointless/unused baud rate setting. A bunch of other entries have these, but cleaning them up is out of scope here I think. Agreed. Reviewed-by: Julien Grall julien.gr...@linaro.org Regards, --- xen/arch/arm/Rules.mk |5 + 1 file changed, 5 insertions(+) diff --git a/xen/arch/arm/Rules.mk b/xen/arch/arm/Rules.mk index 572d854..30c7823 100644 --- a/xen/arch/arm/Rules.mk +++ b/xen/arch/arm/Rules.mk @@ -95,6 +95,11 @@ EARLY_PRINTK_BAUD := 115200 EARLY_UART_BASE_ADDRESS := 0x1c02 EARLY_UART_REG_SHIFT := 2 endif +ifeq ($(CONFIG_EARLY_PRINTK), xgene-mcdivitt) +EARLY_PRINTK_INC := 8250 +EARLY_UART_BASE_ADDRESS := 0x1c021000 +EARLY_UART_REG_SHIFT := 2 +endif ifeq ($(CONFIG_EARLY_PRINTK), juno) EARLY_PRINTK_INC := pl011 EARLY_PRINTK_BAUD := 115200 -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] x86/xen: use the maximum MFN to calculate the required DMA mask
On 19.11.14 at 17:02, david.vra...@citrix.com wrote: On a Xen PV guest the DMA addresses and physical addresses are not 1:1 (such as Xen PV guests) and the generic dma_get_required_mask() does not return the correct mask (since it uses max_pfn). Some device drivers (such as mptsas, mpt2sas) use dma_get_required_mask() to set the device's DMA mask to allow them to use only 32-bit DMA addresses in hardware structures. This results in unnecessary use of the SWIOTLB if DMA addresses are more than 32-bits, impacting performance significantly. Provide a get_required_mask op that uses the maximum MFN to calculate the DMA mask. Signed-off-by: David Vrabel david.vra...@citrix.com Reviewed-by: Jan Beulich jbeul...@suse.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 for-4.5 2/5] xen: arm: Drop EARLY_PRINTK_BAUD from entries which don't set ..._INIT_UART
Hi Ian, On 11/19/2014 03:28 PM, Ian Campbell wrote: EARLY_PRINTK_BAUD doesn't do anything unless EARLY_PRINTK_INIT_UART is set. Furthermore only the pl011 driver implements the init routine at all, so the entries which use 8250 and specified a BAUD were doubly wrong. NIT: and exynos4210 Maybe use 8250 should be replaced by other UARTs drivers? Signed-off-by: Ian Campbell ian.campb...@citrix.com Reviewed-by: Julien Grall julien.gr...@linaro.org Regards, --- v2: New patch. --- xen/arch/arm/Rules.mk |7 --- 1 file changed, 7 deletions(-) diff --git a/xen/arch/arm/Rules.mk b/xen/arch/arm/Rules.mk index 30c7823..4ee51a9 100644 --- a/xen/arch/arm/Rules.mk +++ b/xen/arch/arm/Rules.mk @@ -45,7 +45,6 @@ ifeq ($(debug),y) # Early printk for versatile express ifeq ($(CONFIG_EARLY_PRINTK), vexpress) EARLY_PRINTK_INC := pl011 -EARLY_PRINTK_BAUD := 38400 EARLY_UART_BASE_ADDRESS := 0x1c09 endif ifeq ($(CONFIG_EARLY_PRINTK), fastmodel) @@ -56,12 +55,10 @@ EARLY_UART_BASE_ADDRESS := 0x1c09 endif ifeq ($(CONFIG_EARLY_PRINTK), exynos5250) EARLY_PRINTK_INC := exynos4210 -EARLY_PRINTK_BAUD := 115200 EARLY_UART_BASE_ADDRESS := 0x12c2 endif ifeq ($(CONFIG_EARLY_PRINTK), midway) EARLY_PRINTK_INC := pl011 -EARLY_PRINTK_BAUD := 115200 EARLY_UART_BASE_ADDRESS := 0xfff36000 endif ifeq ($(CONFIG_EARLY_PRINTK), omap5432) @@ -91,7 +88,6 @@ EARLY_UART_REG_SHIFT := 2 endif ifeq ($(CONFIG_EARLY_PRINTK), xgene-storm) EARLY_PRINTK_INC := 8250 -EARLY_PRINTK_BAUD := 115200 EARLY_UART_BASE_ADDRESS := 0x1c02 EARLY_UART_REG_SHIFT := 2 endif @@ -102,18 +98,15 @@ EARLY_UART_REG_SHIFT := 2 endif ifeq ($(CONFIG_EARLY_PRINTK), juno) EARLY_PRINTK_INC := pl011 -EARLY_PRINTK_BAUD := 115200 EARLY_UART_BASE_ADDRESS := 0x7ff8 endif ifeq ($(CONFIG_EARLY_PRINTK), hip04-d01) EARLY_PRINTK_INC := 8250 -EARLY_PRINTK_BAUD := 115200 EARLY_UART_BASE_ADDRESS := 0xE4007000 EARLY_UART_REG_SHIFT := 2 endif ifeq ($(CONFIG_EARLY_PRINTK), seattle) EARLY_PRINTK_INC := pl011 -EARLY_PRINTK_BAUD := 115200 EARLY_UART_BASE_ADDRESS := 0xe101 endif -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Spice-devel] screen freezed for 2-3 minutes on spice connect on xen windows 7 domU's with qxl after save/restore
Il 13/11/2014 13:22, Fabio Fantoni ha scritto: Il 13/11/2014 11:14, Fabio Fantoni ha scritto: Il 19/09/2014 15:18, Fabio Fantoni ha scritto: Il 12/09/2014 16:46, Fabio Fantoni ha scritto: Il 08/07/2014 12:34, Fabio Fantoni ha scritto: Il 08/07/2014 12:06, Fabio Fantoni ha scritto: Il 08/07/2014 10:53, David Jaša ha scritto: Hi, On Út, 2014-07-08 at 10:13 +0200, Fabio Fantoni wrote: On xen 4.5 (tried with qemu 2.0.0/2.1-rc0, spice 0.12.5 and client with spice-gtk 0.23/0.25) windows 7 domUs with qxl vga works good as kvm except for one problem after xl save/restore, when after restore on spice client connect the domU's screen freezed for 2-3 minutes (and seems also windows), after this time seems that all return to works correctly. This problem happen also if spice client connect long time after restore. With stdvga not have this problem but stdvga has many missed resolutions and bad refresh performance. If you need more tests/informations tell me and I'll post them. Client and server logs would certainly help. Please run: * virt-viewer with --spice-debug option * spice-server with SPICE_DEBUG_LEVEL environment variable set to 4 or 5 (if you use qemu+libvirt, use qemu:env element: http://libvirt.org/drvqemu.html#qemucommand ) and note the location in the logs where the freeze takes place. Regards, David Thanks for your reply, in attachments: - domU's xl cfg: W7.cfg - xl -vvv create/save/restore: xen logs.txt - remote-viewer with --spice-debug after domU's start until xl save: spicelog-1.txt (zipped) - remote-viewer with --spice-debug after domU's xl restore: spicelog-2.txt Sorry for my forgetfulness, here also qemu's log: - after domU's start until xl save: qemu-dm-W7.log.1 - after domU's xl restore: qemu-dm-W7.log If you need more tests/informations tell me and I'll post them. Thanks for any reply and sorry for my bad english. ___ Spice-devel mailing list spice-de...@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/spice-devel The problem persist, this time I saw these in xl dmesg after restore: (XEN) HVM2 restore: CPU 0 (XEN) HVM2 restore: CPU 1 (XEN) HVM2 restore: PIC 0 (XEN) HVM2 restore: PIC 1 (XEN) HVM2 restore: IOAPIC 0 (XEN) HVM2 restore: LAPIC 0 (XEN) HVM2 restore: LAPIC 1 (XEN) HVM2 restore: LAPIC_REGS 0 (XEN) HVM2 restore: LAPIC_REGS 1 (XEN) HVM2 restore: PCI_IRQ 0 (XEN) HVM2 restore: ISA_IRQ 0 (XEN) HVM2 restore: PCI_LINK 0 (XEN) HVM2 restore: PIT 0 (XEN) HVM2 restore: RTC 0 (XEN) HVM2 restore: HPET 0 (XEN) HVM2 restore: PMTIMER 0 (XEN) HVM2 restore: MTRR 0 (XEN) HVM2 restore: MTRR 1 (XEN) HVM2 restore: VIRIDIAN_DOMAIN 0 (XEN) HVM2 restore: VIRIDIAN_VCPU 0 (XEN) HVM2 restore: VIRIDIAN_VCPU 1 (XEN) HVM2 restore: VMCE_VCPU 0 (XEN) HVM2 restore: VMCE_VCPU 1 (XEN) HVM2 restore: TSC_ADJUST 0 (XEN) HVM2 restore: TSC_ADJUST 1 (XEN) memory.c:216:d2v0 Domain 2 page number 77579 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 7757a invalid (XEN) memory.c:216:d2v0 Domain 2 page number 7757b invalid (XEN) memory.c:216:d2v0 Domain 2 page number 7757c invalid (XEN) memory.c:216:d2v0 Domain 2 page number 7757d invalid (XEN) memory.c:216:d2v0 Domain 2 page number 7757e invalid (XEN) memory.c:216:d2v0 Domain 2 page number 7757f invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77580 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77581 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77582 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77583 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77584 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77585 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77586 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77587 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77588 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77589 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 7758a invalid (XEN) memory.c:216:d2v0 Domain 2 page number 7758b invalid (XEN) memory.c:216:d2v0 Domain 2 page number 7758c invalid (XEN) memory.c:216:d2v0 Domain 2 page number 7758d invalid (XEN) memory.c:216:d2v0 Domain 2 page number 7758e invalid (XEN) memory.c:216:d2v0 Domain 2 page number 7758f invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77590 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77591 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77592 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77593 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77594 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77595 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77596 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77597 invalid (XEN) memory.c:216:d2v0 Domain 2 page number 77598 invalid (XEN) grant_table.c:1272:d2v0 Expanding dom (2) grant table from (4) to (32) frames. (XEN) irq.c:380: Dom2 callback via changed to GSI 24 Tested on latest staging (commit 7d203b337fb2dcd148d2df850e25b67c792d4d0b) plus the
Re: [Xen-devel] [PATCH v2 for-4.5 3/5] xen: arm: correct off by one in xgene-storm's map_one_mmio
Hi Ian, On 11/19/2014 03:28 PM, Ian Campbell wrote: The callers pass the end as the pfn immediately *after* the last page to be mapped, therefore adding one is incorrect and causes an additional page to be mapped. At the same time correct the printing of the mfn values, zero-padding them to 16 digits as for a paddr when they are frame numbers is just confusing. Signed-off-by: Ian Campbell ian.campb...@citrix.com Reviewed-by: Julien Grall julien.gr...@linaro.org Regards, --- v2: Fix the other printk format string too. --- xen/arch/arm/platforms/xgene-storm.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/xen/arch/arm/platforms/xgene-storm.c b/xen/arch/arm/platforms/xgene-storm.c index 29c4752..8685c93 100644 --- a/xen/arch/arm/platforms/xgene-storm.c +++ b/xen/arch/arm/platforms/xgene-storm.c @@ -45,11 +45,11 @@ static int map_one_mmio(struct domain *d, const char *what, { int ret; -printk(Additional MMIO %PRIpaddr-%PRIpaddr (%s)\n, +printk(Additional MMIO %lx-%lx (%s)\n, start, end, what); -ret = map_mmio_regions(d, start, end - start + 1, start); +ret = map_mmio_regions(d, start, end - start, start); if ( ret ) -printk(Failed to map %s @ %PRIpaddr to dom%d\n, +printk(Failed to map %s @ %lx to dom%d\n, what, start, d-domain_id); return ret; } -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2] xen: use more fixed strings to build the hypervisor
It should be possible to repeatedly build identical sources and get identical binaries, even on different hosts at different build times. This fails for xen.gz and xen.efi because current time and buildhost get included in the binaries. Provide variables XEN_BUILD_DATE, XEN_BUILD_TIME and XEN_BUILD_HOST which the build environment can set to fixed strings to get a reproducible build. Signed-off-by: Olaf Hering o...@aepfle.de Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Jan Beulich jbeul...@suse.com Cc: Keir Fraser k...@xen.org Cc: Tim Deegan t...@xen.org --- v2: reword commit message. xen/Makefile | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/xen/Makefile b/xen/Makefile index 72c1313..47f003c 100644 --- a/xen/Makefile +++ b/xen/Makefile @@ -8,6 +8,9 @@ export XEN_FULLVERSION = $(XEN_VERSION).$(XEN_SUBVERSION)$(XEN_EXTRAVERSION) export XEN_WHOAMI ?= $(USER) export XEN_DOMAIN ?= $(shell ([ -x /bin/dnsdomainname ] /bin/dnsdomainname) || ([ -x /bin/domainname ] /bin/domainname || echo [unknown])) +export XEN_BUILD_DATE ?= $(shell LC_ALL=C date) +export XEN_BUILD_TIME ?= $(shell LC_ALL=C date +%T) +export XEN_BUILD_HOST ?= $(shell hostname) export BASEDIR := $(CURDIR) export XEN_ROOT := $(BASEDIR)/.. @@ -126,11 +129,11 @@ delete-unfresh-files: # compile.h contains dynamic build info. Rebuilt on every 'make' invocation. include/xen/compile.h: include/xen/compile.h.in .banner - @sed -e 's/@@date@@/$(shell LC_ALL=C date)/g' \ - -e 's/@@time@@/$(shell LC_ALL=C date +%T)/g' \ + @sed -e 's/@@date@@/$(XEN_BUILD_DATE)/g' \ + -e 's/@@time@@/$(XEN_BUILD_TIME)/g' \ -e 's/@@whoami@@/$(XEN_WHOAMI)/g' \ -e 's/@@domain@@/$(XEN_DOMAIN)/g' \ - -e 's/@@hostname@@/$(shell hostname)/g' \ + -e 's/@@hostname@@/$(XEN_BUILD_HOST)/g' \ -e 's!@@compiler@@!$(shell $(CC) $(CFLAGS) --version 21 | head -1)!g' \ -e 's/@@version@@/$(XEN_VERSION)/g' \ -e 's/@@subversion@@/$(XEN_SUBVERSION)/g' \ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] x86/xen: use the maximum MFN to calculate the required DMA mask
On 19/11/14 17:51, Stefano Stabellini wrote: On Wed, 19 Nov 2014, David Vrabel wrote: +u64 +xen_swiotlb_get_required_mask(struct device *dev) +{ +unsigned long max_mfn; + +max_mfn = HYPERVISOR_memory_op(XENMEM_maximum_ram_page, NULL); As Jan pointed out, I think you need to change the prototype of HYPERVISOR_memory_op to return long. Please do consistently across all relevant archs. This doesn't help since 32-bit guests will still truncate. A new hypercall op that returns the result in a uint64_t parameter is required. There is another reason why max_mfn isn't suitable -- IOMMU usage so I think we should assume a 64-bit DMA mask is required (this is actually the change I put into XenServer's kernel). David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Number of NICs per VM with qemu-upstream (Was: Re: Re: [Xen-users] libvirt emulator /usr/local/lib/xen/bin/qemu-dm emulator did not work on xen-4.4)
On Thu, 2014-11-20 at 11:39 +, Stefano Stabellini wrote: On Thu, 20 Nov 2014, Ian Campbell wrote: On Mon, 2014-11-17 at 13:00 +, Stefano Stabellini wrote: On Mon, 17 Nov 2014, Ian Campbell wrote: On Sat, 2014-11-15 at 10:16 +0800, hanyandong wrote: By the way, how many NICs can I apply to a VM? On xen-4.4.0, Using qemu-dm, I can apply 8 NIC to a VM, but using qemu-system-i386, I can only apply 4 NICs to an VM? is it normal? I've no idea, CCing the qemu maintainers. I'd have expected the number of PV nics to be completely independent of the device mode, so I suppose you mean emulated NICs? I can pass 4 emulated NICs maximum, but you can easily reach 8 if you use PV NICs instead. Just pass 'type=pv' like this: vif=['', '', '', '', 'type=pv', 'type=pv', 'type=pv', 'type=pv'] it is going to create 4 emulated nics and 4 pv nics. The 4 emulated nics also have 4 corresponding pv nics. The emulated nics get disconnected soon after boot by the guest operating system (if it has pv drivers installed, such as Linux). So overall once the boot sequence is fully completed you'll end up with the 8 pv nics that you want. I wonder if we should do something in (lib)xl such that by default the first 4 NICs have type LIBXL_NIC_TYPE_VIF_IOEMU (i.e. emulated+PV path) and the rest have LIBXL_NIC_TYPE_VIF (i.e. PV only). That looks like a simple and reasonable idea. BTW the reason for the failure seems to be that QEMU runs out of ram (xen: failed to populate ram at 8011, so xc_domain_populate_physmap_exact failed) allocating roms for the rtl8139 (4 bytes each). Maybe qemu-trad wasn't loading any roms for rtl8139. Interestingly e1000 doesn't need any roms either, so another way around this would be to set 'type=e1000' for all the vifs. Or to use the new option in 4.5 to increase the MMIO space (or is that not where ROMs end up?) Do we need to plumb through qemu's optionrom parameter to allow a) limiting the number of NICs which will try to do PXE and b) allow custom roms etc? The libxl solution is the best one for simplicity, besides I don't think there is such an option for QEMU. There is, it's the romfile option to -device e.g. -device $NICMODEL,vlan=0,romfile=$ROMFILE where NICMODEL is e100, rtl8139, virtio-blah and ROMFILE is e.g. an ipxe binary. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCHv4 0/4]: dma, x86, xen: reduce SWIOTLB usage in Xen guests
On systems where DMA addresses and physical addresses are not 1:1 (such as Xen PV guests), the generic dma_get_required_mask() will not return the correct mask (since it uses max_pfn). Some device drivers (such as mptsas, mpt2sas) use dma_get_required_mask() to set the device's DMA mask to allow them to use only 32-bit DMA addresses in hardware structures. This results in unnecessary use of the SWIOTLB if DMA addresses are more than 32-bits, impacting performance significantly. This series allows Xen PV guests to override the default dma_get_required_mask() with one that calculates the DMA mask from the maximum MFN (and not the PFN). Changes in v4: - Assume 64-bit mask is required. Changes in v3: - fix off-by-one in xen_dma_get_required_mask() - split ia64 changes into separate patch. Changes in v2: - split x86 and xen changes into separate patches David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 3/4] x86: allow dma_get_required_mask() to be overridden
Use dma_ops-get_required_mask() if provided, defaulting to dma_get_requried_mask_from_max_pfn(). This is needed on systems (such as Xen PV guests) where the DMA address and the physical address are not equal. ARCH_HAS_DMA_GET_REQUIRED_MASK is defined in asm/device.h instead of asm/dma-mapping.h because linux/dma-mapping.h uses the define before including asm/dma-mapping.h Signed-off-by: David Vrabel david.vra...@citrix.com Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- arch/x86/include/asm/device.h |2 ++ arch/x86/kernel/pci-dma.c |8 2 files changed, 10 insertions(+) diff --git a/arch/x86/include/asm/device.h b/arch/x86/include/asm/device.h index 03dd729..10bc628 100644 --- a/arch/x86/include/asm/device.h +++ b/arch/x86/include/asm/device.h @@ -13,4 +13,6 @@ struct dev_archdata { struct pdev_archdata { }; +#define ARCH_HAS_DMA_GET_REQUIRED_MASK + #endif /* _ASM_X86_DEVICE_H */ diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c index a25e202..5154400 100644 --- a/arch/x86/kernel/pci-dma.c +++ b/arch/x86/kernel/pci-dma.c @@ -140,6 +140,14 @@ void dma_generic_free_coherent(struct device *dev, size_t size, void *vaddr, free_pages((unsigned long)vaddr, get_order(size)); } +u64 dma_get_required_mask(struct device *dev) +{ + if (dma_ops-get_required_mask) + return dma_ops-get_required_mask(dev); + return dma_get_required_mask_from_max_pfn(dev); +} +EXPORT_SYMBOL_GPL(dma_get_required_mask); + /* * See Documentation/x86/x86_64/boot-options.txt for the iommu kernel * parameter documentation. -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 1/4] dma: add dma_get_required_mask_from_max_pfn()
A generic dma_get_required_mask() is useful even for architectures (such as ia64) that define ARCH_HAS_GET_REQUIRED_MASK. Signed-off-by: David Vrabel david.vra...@citrix.com Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- drivers/base/platform.c | 10 -- include/linux/dma-mapping.h |1 + 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/base/platform.c b/drivers/base/platform.c index b2afc29..f9f3930 100644 --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -1009,8 +1009,7 @@ int __init platform_bus_init(void) return error; } -#ifndef ARCH_HAS_DMA_GET_REQUIRED_MASK -u64 dma_get_required_mask(struct device *dev) +u64 dma_get_required_mask_from_max_pfn(struct device *dev) { u32 low_totalram = ((max_pfn - 1) PAGE_SHIFT); u32 high_totalram = ((max_pfn - 1) (32 - PAGE_SHIFT)); @@ -1028,6 +1027,13 @@ u64 dma_get_required_mask(struct device *dev) } return mask; } +EXPORT_SYMBOL_GPL(dma_get_required_mask_from_max_pfn); + +#ifndef ARCH_HAS_DMA_GET_REQUIRED_MASK +u64 dma_get_required_mask(struct device *dev) +{ + return dma_get_required_mask_from_max_pfn(dev); +} EXPORT_SYMBOL_GPL(dma_get_required_mask); #endif diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index d5d3881..6e2fdfc 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -127,6 +127,7 @@ static inline int dma_coerce_mask_and_coherent(struct device *dev, u64 mask) return dma_set_mask_and_coherent(dev, mask); } +extern u64 dma_get_required_mask_from_max_pfn(struct device *dev); extern u64 dma_get_required_mask(struct device *dev); #ifndef set_arch_dma_coherent_ops -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 4/4] x86/xen: assume a 64-bit DMA mask is required
On a Xen PV guest the DMA addresses and physical addresses are not 1:1 (such as Xen PV guests) and the generic dma_get_required_mask() does not return the correct mask (since it uses max_pfn). Some device drivers (such as mptsas, mpt2sas) use dma_get_required_mask() to set the device's DMA mask to allow them to use only 32-bit DMA addresses in hardware structures. This results in unnecessary use of the SWIOTLB if DMA addresses are more than 32-bits, impacting performance significantly. We could base the DMA mask on the maximum MFN but: a) The hypercall op to get the maximum MFN (XENMEM_maximum_ram_page) will truncate the result to an int in 32-bit guests. b) Future uses of the IOMMU in Xen may map frames at bus addresses above the end of RAM. So, just assume a 64-bit DMA mask is always required. Signed-off-by: David Vrabel david.vra...@citrix.com --- arch/x86/xen/pci-swiotlb-xen.c |1 + drivers/xen/swiotlb-xen.c |9 + include/xen/swiotlb-xen.h |4 3 files changed, 14 insertions(+) diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c index 0e98e5d..a5d180a 100644 --- a/arch/x86/xen/pci-swiotlb-xen.c +++ b/arch/x86/xen/pci-swiotlb-xen.c @@ -31,6 +31,7 @@ static struct dma_map_ops xen_swiotlb_dma_ops = { .map_page = xen_swiotlb_map_page, .unmap_page = xen_swiotlb_unmap_page, .dma_supported = xen_swiotlb_dma_supported, + .get_required_mask = xen_swiotlb_get_required_mask, }; /* diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index ebd8f21..767d048 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -42,9 +42,11 @@ #include xen/page.h #include xen/xen-ops.h #include xen/hvc-console.h +#include xen/interface/memory.h #include asm/dma-mapping.h #include asm/xen/page-coherent.h +#include asm/xen/hypercall.h #include trace/events/swiotlb.h /* @@ -683,3 +685,10 @@ xen_swiotlb_set_dma_mask(struct device *dev, u64 dma_mask) return 0; } EXPORT_SYMBOL_GPL(xen_swiotlb_set_dma_mask); + +u64 +xen_swiotlb_get_required_mask(struct device *dev) +{ + return DMA_BIT_MASK(64); +} +EXPORT_SYMBOL_GPL(xen_swiotlb_get_required_mask); diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h index 8b2eb93..640 100644 --- a/include/xen/swiotlb-xen.h +++ b/include/xen/swiotlb-xen.h @@ -58,4 +58,8 @@ xen_swiotlb_dma_supported(struct device *hwdev, u64 mask); extern int xen_swiotlb_set_dma_mask(struct device *dev, u64 dma_mask); + +extern u64 +xen_swiotlb_get_required_mask(struct device *dev); + #endif /* __LINUX_SWIOTLB_XEN_H */ -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Vote] Confirm Konrad Rzeszutek Wilk as Xen project Hypervisor Committer (please vote by Nov 30th)
Lars Kurth writes ([Xen-devel] [Vote] Confirm Konrad Rzeszutek Wilk as Xen project Hypervisor Committer (please vote by Nov 30th)): The voting form is at https://docs.google.com/forms/d/ 1Hpoda2VjdMMGDsz1zh01tkHPR1vUsVeUVAR0DWhlgik/viewform but if you want to vote in public feel free to just reply to this thread. I have (obviously) voted yes, with your form. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Vote] Confirm Konrad Rzeszutek Wilk as Xen project Hypervisor Committer (please vote by Nov 30th)
At 16:26 + on 17 Nov (1416237976), Lars Kurth wrote: last week Ian Jackson nominated Konrad as Xen Project Hypervisor committer. Our governance process requires a formal vote by existing committers to confirm Konrad and for me to set up a voting form. Existing committers are: Keir Fraser, Ian Campbell, Ian Jackson, Jan Beulich and Tim Deegan. Others are welcome to voice support. +1. An excellent idea. Tim. The voting form is at https://docs.google.com/forms/d/1Hpoda2VjdMMGDsz1zh01tkHPR1vUsVeUVAR0DWhlgik/viewform but if you want to vote in public feel free to just reply to this thread. Best Regards Lars ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] x86/HVM: don't crash guest upon problems occurring in user mode
On 20.11.14 at 12:34, t...@xen.org wrote: At 11:13 +0100 on 20 Nov (1416478386), Jan Beulich wrote: This extends commit 5283b310 (x86/HVM: only kill guest when unknown VM exit occurred in guest kernel mode) to further cases, including the failed VM entry one that XSA-110 was needed to be issued for. Signed-off-by: Jan Beulich jbeul...@suse.com This seems like a good idea in general, but I'm not sure it's appropriate for _all_ of these. Unhandled exit types and overlong instruction decode seem obviously good. hvm_hap_nested_page_fault() returns 0: seems only to happen for pvh guests that write to read-only memory (?). That seems like a different class of failure. I don't think our response should be different based on the privilege level here, although domain_crash() does seem harsh. (I presume this is to avoid emulating an instruction in PVH mode?) If we're changing this, I think it should be to #GP rather than #UD. I dropped those for now. We should re-evaluate this once we have #VE support on VMX (i.e. we may want to inject that one there instead). p2m_pt_handle_deferred_changes() returns 0: AFAICS this is basically ENOMEM when trying to update p2m tables. It's so unlikely to be caused by userspace activity that disguising it with #UD is probably just unhelpful. It turns a clean failure into an undebuggable intermittent glitch. Dropped too. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen-4.5 Testing - Cache Monitoring Technology
On 20/11/14 10:23, Andrew Cooper wrote: On 20/11/14 10:15, Chao Peng wrote: On Thu, Nov 20, 2014 at 09:58:37AM +, Andrew Cooper wrote: Hi, I have found myself a PSR-capable server and have been having a play with Xen-4.5 At a first pass, I can get some numbers out: [root@blob ~]# xl psr-cmt-attach 0 [root@blob ~]# xl psr-cmt-show cache_occupancy Total RMID: 71 NameIDSocket 0 Socket 1 Total L3 Cache Size 46080 KB 46080 KB Domain-0 045072 KB0 KB [root@blob ~]# However, I am unable to get any occupancy information for HVM guests. Given nothing obvious in the code which would make CMT PV-guest specific, I presume this is unexpected? Thank your for trying... But I just tested the HVM on my box and it runs correct. Have you missed to run psr-cmt-attach for your HVM domain? Otherwise I think there may be something wrong. No - I get the HVM domain (a memtest domain) listed in the output. It is clear from Dom0's occupancy dropping off significantly that competition is underway on Socket 0 (as expected), but the domain always has 0 occupancy reported. I shall investigate. On further investigation, it would appear to be a behavioural property of memtest (5.01, 8vcpu guest, 1GB RAM). In SMP mode with all cpus streaming data, the occupancy skyrockets. However, when only cpu0 is running (with all other cpus waiting until the end of the round), the occupancy drops to 0. On this system, it would appear that the minimum granularity is 72KB, and 72KB is occasionally reported rather than 0. I am guessing that when only a single cpu is running, the occupancy is less than the granularity, and the 0 being reported is as a result of rounding. It is interesting to note that with dom0 only, dom0's occupancy is almost all of the LLC, but with this single memtest domU, the occupancies of dom0 + domU is never more than 1/4 of the LLC. I presume that this means that copious cache flushing is going on. Whatever the reason, it has been interesting having a bit of a play. It does appear that PSR and CMT JustWorked(tm) for me on Xen-4.5 which is very nice. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC for 4.6] xen: Extend DOMCTL createdomain to support arch configuration
On 18.11.14 at 20:20, julien.gr...@linaro.org wrote: --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -540,6 +540,7 @@ void __init noreturn __start_xen(unsigned long mbi_p) int i, j, e820_warn = 0, bytes = 0; bool_t acpi_boot_table_init_done = 0; struct domain *dom0; +struct arch_domainconfig config; struct ns16550_defaults ns16550 = { .data_bits = 8, .parity= 'n', @@ -1347,7 +1348,8 @@ void __init noreturn __start_xen(unsigned long mbi_p) domcr_flags |= DOMCRF_pvh | DOMCRF_hap; /* Create initial domain 0. */ -dom0 = domain_create(0, domcr_flags, 0); +memset(config, 0, sizeof(config)); +dom0 = domain_create(0, domcr_flags, 0, config); Why not NULL like almost everywhere else? --- a/xen/include/public/arch-x86/xen.h +++ b/xen/include/public/arch-x86/xen.h @@ -228,6 +228,11 @@ struct arch_shared_info { }; typedef struct arch_shared_info arch_shared_info_t; +struct arch_domainconfig { +char dummy; +}; +typedef struct arch_domainconfig arch_domainconfig_t; I think we decided that, regardless of all bad precedents, we'd stop adding things without xen_ prefix to the public interface. I'm also rather uncertain about all these typedef-s - I think we could very well get away without adding any new ones (if internally to the hypervisor or tools they turn out to make the code much better readable, each such component could still introduce one on its own). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][RFC][PATCH 06/13] hvmloader/ram: check if guest memory is out of reserved device memory maps
From: Tian, Kevin Sent: Thursday, November 20, 2014 4:51 PM From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Thursday, November 20, 2014 4:04 PM On 20.11.14 at 08:45, kevin.t...@intel.com wrote: Yang and I did some discussion here. We understand your point to avoid introducing new interface if we can leverage existing code. However it's not a trivial effort to move device assignment before populating p2m, and there is no other benefit of doing so except for this purpose. So we'd not suggest this way. It's not a trivial effort is pretty vague: What specifically is it that makes this difficult? I wouldn't expect there to be any strong dependencies on the ordering of these two operations... I'll leave to Yang to answer this part, who did a detail investigation on that, e.g. on IOMMU page table setup, etc. But what really matters here is not only about complexity, but also flexibility. Doing so will tie the policy to assigned device only, which removes the option to support hotpluggable device. Current option sounds a reasonable one, i.e. passing a list of BDFs assigned to this VM before populating p2m, and then having hypervisor to filter out reserved regions associated with those BDFs. This way libxc teaches Xen to create reserved regions once, and then later the filtered info is returned upon query. Reasonable, but partly redundant. The positive aspect being that it permits this list and the list of actually being assigned devices to be different, i.e. allowing holes to be set up for devices that only _may_ get assigned at some point. redundant if we think the list only exactly matching the statically assigned devices. but that's just current point. reasonable if we think there may be other policies impacting the list (e.g. if hotplugable device may have a config option and then we have a potential list larger than static assigned devices). From this angle I think the new interface actually makes sense for this very purpose. Jan are you OK with this? In previous approach we reserved all the RMRR regions so hotplug scenario is covered automatically. Now since we want to do BDF specific filtering, this new interface is actually necessary for hotplug support. If OK, Tiejun will send out a new series to see whether it's ready for 4.5 check-in. Thanks Kevin ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][RFC][PATCH 06/13] hvmloader/ram: check if guest memory is out of reserved device memory maps
On 20.11.14 at 15:40, kevin.t...@intel.com wrote: From: Tian, Kevin Sent: Thursday, November 20, 2014 4:51 PM From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Thursday, November 20, 2014 4:04 PM On 20.11.14 at 08:45, kevin.t...@intel.com wrote: Current option sounds a reasonable one, i.e. passing a list of BDFs assigned to this VM before populating p2m, and then having hypervisor to filter out reserved regions associated with those BDFs. This way libxc teaches Xen to create reserved regions once, and then later the filtered info is returned upon query. Reasonable, but partly redundant. The positive aspect being that it permits this list and the list of actually being assigned devices to be different, i.e. allowing holes to be set up for devices that only _may_ get assigned at some point. redundant if we think the list only exactly matching the statically assigned devices. but that's just current point. reasonable if we think there may be other policies impacting the list (e.g. if hotplugable device may have a config option and then we have a potential list larger than static assigned devices). From this angle I think the new interface actually makes sense for this very purpose. Jan are you OK with this? I can live with it, yes. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Number of NICs per VM with qemu-upstream (Was: Re: Re: [Xen-users] libvirt emulator /usr/local/lib/xen/bin/qemu-dm emulator did not work on xen-4.4)
On Thu, 2014-11-20 at 14:46 +, Stefano Stabellini wrote: On Thu, 20 Nov 2014, Ian Campbell wrote: There is, it's the romfile option to -device e.g. -device $NICMODEL,vlan=0,romfile=$ROMFILE where NICMODEL is e100, rtl8139, virtio-blah and ROMFILE is e.g. an ipxe binary. I think that option was intended to point QEMU to a different file, not to disable the romfile. romfile= does that, I think. Or use /dev/null etc etc. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Number of NICs per VM with qemu-upstream (Was: Re: Re: [Xen-users] libvirt emulator /usr/local/lib/xen/bin/qemu-dm emulator did not work on xen-4.4)
create ^ title it qemu-upstream: limitation on 4 emulated NICs prevents guest from starting unless PV override is used. thanks ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC for 4.6] xen: Extend DOMCTL createdomain to support arch configuration
Hi Jan, On 11/20/2014 02:37 PM, Jan Beulich wrote: On 18.11.14 at 20:20, julien.gr...@linaro.org wrote: --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -540,6 +540,7 @@ void __init noreturn __start_xen(unsigned long mbi_p) int i, j, e820_warn = 0, bytes = 0; bool_t acpi_boot_table_init_done = 0; struct domain *dom0; +struct arch_domainconfig config; struct ns16550_defaults ns16550 = { .data_bits = 8, .parity= 'n', @@ -1347,7 +1348,8 @@ void __init noreturn __start_xen(unsigned long mbi_p) domcr_flags |= DOMCRF_pvh | DOMCRF_hap; /* Create initial domain 0. */ -dom0 = domain_create(0, domcr_flags, 0); +memset(config, 0, sizeof(config)); +dom0 = domain_create(0, domcr_flags, 0, config); Why not NULL like almost everywhere else? It was because DOM0 is not a dummy domain. As it's not use on x86, I could replace by NULL. --- a/xen/include/public/arch-x86/xen.h +++ b/xen/include/public/arch-x86/xen.h @@ -228,6 +228,11 @@ struct arch_shared_info { }; typedef struct arch_shared_info arch_shared_info_t; +struct arch_domainconfig { +char dummy; +}; +typedef struct arch_domainconfig arch_domainconfig_t; I think we decided that, regardless of all bad precedents, we'd stop adding things without xen_ prefix to the public interface. I'm also rather uncertain about all these typedef-s - I think we could very well get away without adding any new ones (if internally to the hypervisor or tools they turn out to make the code much better readable, each such component could still introduce one on its own). Ok. I will drop them and rename the structure to xen_arch_domainconfig; Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Processed: Re: Number of NICs per VM with qemu-upstream (Was: Re: Re: [Xen-users] libvirt emulator /usr/local/lib/xen/bin/qemu-dm emulator did not work on xen-4.4)
Processing commands for x...@bugs.xenproject.org: create ^ Created new bug #46 rooted at `1416474814.29243.59.ca...@citrix.com' Title: `Re: [Xen-devel] Number of NICs per VM with qemu-upstream (Was: Re: Re: [Xen-users] libvirt emulator /usr/local/lib/xen/bin/qemu-dm emulator did not work on xen-4.4)' title it qemu-upstream: limitation on 4 emulated NICs prevents guest from starting unless PV override is used. Set title for #46 to `qemu-upstream: limitation on 4 emulated NICs prevents guest from starting unless PV override is used.' thanks Finished processing. Modified/created Bugs: - 46: http://bugs.xenproject.org/xen/bug/46 (new) --- Xen Hypervisor Bug Tracker See http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen for information on reporting bugs Contact xen-bugs-ow...@bugs.xenproject.org with any infrastructure issues ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH for-4.5 v2] docs/commandline: Fix formatting issues
For 'dom0_max_vcpus' and 'hvm_debug', markdown was interpreting the text as regular text, and reflowing it as a regular paragraph, leading to a single line as output. Reformat them as code blocks inside blockquote blocks, which causes them to take their precise whitespace layout. For 'psr', the bullet point was incorrectly delineated from paragraph text, causing it to be reflowed. Alter the formatting to include the CMT-specific options as sub-bullets of the overall CMT resource. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com Release-acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- v2: Fix 'psr' formatting errors as well --- docs/misc/xen-command-line.markdown | 47 +-- 1 file changed, 23 insertions(+), 24 deletions(-) diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index f054d4b..66d021b 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -475,13 +475,13 @@ defaults of 1 and unlimited respectively are used instead. For example, with `dom0_max_vcpus=4-8`: - Number of - PCPUs | Dom0 VCPUs - 2| 4 - 4| 4 - 6| 6 - 8| 8 - 10| 8 +Number of + PCPUs | Dom0 VCPUs + 2| 4 + 4| 4 + 6| 6 + 8| 8 + 10| 8 ### dom0\_mem `= List of ( min:size | max:size | size )` @@ -684,18 +684,18 @@ supported only when compiled with XSM\_ENABLE=y on x86. The specified value is a bit mask with the individual bits having the following meaning: -Bit 0 - debug level 0 (unused at present) -Bit 1 - debug level 1 (Control Register logging) -Bit 2 - debug level 2 (VMX logging of MSR restores when context switching) -Bit 3 - debug level 3 (unused at present) -Bit 4 - I/O operation logging -Bit 5 - vMMU logging -Bit 6 - vLAPIC general logging -Bit 7 - vLAPIC timer logging -Bit 8 - vLAPIC interrupt logging -Bit 9 - vIOAPIC logging -Bit 10 - hypercall logging -Bit 11 - MSR operation logging + Bit 0 - debug level 0 (unused at present) + Bit 1 - debug level 1 (Control Register logging) + Bit 2 - debug level 2 (VMX logging of MSR restores when context switching) + Bit 3 - debug level 3 (unused at present) + Bit 4 - I/O operation logging + Bit 5 - vMMU logging + Bit 6 - vLAPIC general logging + Bit 7 - vLAPIC timer logging + Bit 8 - vLAPIC interrupt logging + Bit 9 - vIOAPIC logging + Bit 10 - hypercall logging + Bit 11 - MSR operation logging Recognized in debug builds of the hypervisor only. @@ -1047,12 +1047,11 @@ resource. RMID is a hardware-provided layer of abstraction between software and logical processors. The following resources are available: -* Cache Monitoring Technology (Haswell and later). Information -regarding the L3 cache occupancy. -`cmt` instructs Xen to enable/disable Cache Monitoring Technology. - -`rmid_max` indicates the max value for rmid. +* Cache Monitoring Technology (Haswell and later). Information regarding the + L3 cache occupancy. + * `cmt` instructs Xen to enable/disable Cache Monitoring Technology. + * `rmid_max` indicates the max value for rmid. ### reboot `= t[riple] | k[bd] | a[cpi] | p[ci] | n[o] [, [w]arm | [c]old]` -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 0/2 V3] fix rename: xenstore not fully updated
On Wed, 2014-11-19 at 16:25 -0500, Konrad Rzeszutek Wilk wrote: On Wed, Nov 19, 2014 at 11:26:32AM +, Ian Jackson wrote: Hi Konrad, I have another release ack request: Chunyan Liu writes ([PATCH 0/2 V3] fix rename: xenstore not fully updated): Currently libxl__domain_rename only update /local/domain/domid/name, still some places in xenstore are not updated, including: /vm/uuid/name and /local/domain/0/backend/device/domid/.../domain. This patch series updates /vm/uuid/name in xenstore, This ([PATCH 2/2 V3] fix rename: xenstore not fully updated) is a bugfix which I think should go into Xen 4.5. The risk WITHOUT this patch is that there are out-of-tree tools which look here for the domain name and will get confused after it is renamed. When was this introduced? Did it exist with Xend? Based on: git grep domain\ RELEASE-4.4.0 tools/python/ git grep domain\' RELEASE-4.4.0 tools/python/ it doesn't appear so, but someone with a xend install would be needed to confirm for sure. Given that this has always been wrong for a libxl domain after migration it seems likely to me that noone is looking at this field. The risk WITH this patch is that the implementation could be wrong somehow, in which case the code would need to be updated again. But it's a very small patch and has been fully reviewed. I checked QEMU and didn't find anything in there. Great. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] mkdeb: correctly map package architectures for x86 and ARM
On Wed, 2014-11-19 at 14:45 -0500, Konrad Rzeszutek Wilk wrote: On Fri, Nov 14, 2014 at 10:10:58AM +, Ian Campbell wrote: (CCing some more maintainers and the release manager) On Wed, 2014-11-12 at 15:43 +, Ian Campbell wrote: On Wed, 2014-11-12 at 09:38 -0600, Clark Laughlin wrote: mkdeb previously set the package architecture to be 'amd64' for anything other than XEN_TARGET_ARCH=x86_32. This patch attempts to correctly map the architecture from GNU names to debian names for x86 and ARM architectures, or otherwise, defaults it to the value in XEN_TARGET_ARCH. Signed-off-by: Clark Laughlin clark.laugh...@linaro.org Acked-by: Ian Campbell ian.campb...@citrix.com Actually thinking about it some more I'd be happier arguing for a freeze exception for something like the below which only handles the actual valid values of XEN_TARGET_ARCH and not the GNU names (which cannot happen) and prints an error for unknown architectures (so new ports aren't bitten in the future, etc). Konrad, wrt the freeze I think this is low risk for breaking x86 platforms and makes things work for arm, so is worth it. Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Ian J acked on IRC, so I've applied, thanks. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] libxl: Document device parameter of libxl_device_type_add functions
On Tue, 2014-11-18 at 13:19 -0500, Konrad Rzeszutek Wilk wrote: On Tue, Nov 18, 2014 at 05:44:20PM +, Ian Jackson wrote: Euan Harris writes ([PATCH] libxl: Document device parameter of libxl_device_type_add functions): The device parameter of libxl_device_type_add is an in/out parameter. Unspecified fields are filled in with appropriate values for the created device when the function returns. Document this behaviour. Thanks. Acked-by: Ian Jackson ian.jack...@eu.citrix.com Konrad, this should go into 4.5 IMO because it's a doc comment describing existing behaviour which we want everyone to rely on. Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com and applied, thanks all. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5] libxl: remove existence check for PCI device hotplug
On Wed, 2014-11-19 at 16:28 -0500, Konrad Rzeszutek Wilk wrote: On Wed, Nov 19, 2014 at 09:21:23PM +, Wei Liu wrote: On Wed, Nov 19, 2014 at 04:01:54PM -0500, Konrad Rzeszutek Wilk wrote: On Mon, Nov 17, 2014 at 12:10:34PM +, Wei Liu wrote: The existence check is to make sure a device is not added to a guest multiple times. PCI device backend path has different rules from vif, disk etc. For example: /local/domain/0/backend/pci/9/0/dev-1/:03:10.1 /local/domain/0/backend/pci/9/0/key-1/:03:10.1 /local/domain/0/backend/pci/9/0/dev-2/:03:10.2 /local/domain/0/backend/pci/9/0/key-2/:03:10.2 The devid for PCI devices is hardcoded 0. libxl__device_exists only checks up to /local/.../9/0 so it always returns true even the device is assignable. Remove invocation of libxl__device_exists. We're sure at this point that the PCI device is assignable (hence no xenstore entry or JSON entry). The check is done before hand. For HVM guest it's done by calling xc_test_assign_device and for PV guest it's done by calling pciback_dev_is_assigned. Reported-by: Li, Liang Z liang.z...@intel.com Signed-off-by: Wei Liu wei.l...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Konrad Wilk konrad.w...@oracle.com --- This patch fixes a regression in 4.5. Ouch! That needs then to be fixed. Is the version you would want to commit? I did test it - and it Yes. Then Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Applied. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 0/5 v2 for-4.5] xen: arm: xgene bug fixes + support for McDivitt
On Wed, 2014-11-19 at 16:27 -0500, Konrad Rzeszutek Wilk wrote: On Wed, Nov 19, 2014 at 03:27:48PM +, Ian Campbell wrote: These patches: * fix up an off by one bug in the xgene mapping of additional PCI bus resources, which would cause an additional extra page to be mapped * correct the size of the mapped regions to match the docs * adds support for the other 4 PCI buses on the chip, which enables mcdivitt and presumably most other Xgene based platforms which uses PCI buses other than pcie0. * adds earlyprintk for the mcdivitt platform They can also be found at: git://xenbits.xen.org/people/ianc/xen.git mcdivitt-v2 #1-#4 can go in - aka Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Applied, with a tweak to one of the ocmmit message suggested by Julien. For #5 I would appreciate an ARM knowledgeble person to review it. Acked by Julien now, are you ok for it to go in? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5 v2] docs/commandline: Fix formatting issues
On Thu, 2014-11-20 at 15:22 +, Andrew Cooper wrote: For 'dom0_max_vcpus' and 'hvm_debug', markdown was interpreting the text as regular text, and reflowing it as a regular paragraph, leading to a single line as output. Reformat them as code blocks inside blockquote blocks, which causes them to take their precise whitespace layout. For 'psr', the bullet point was incorrectly delineated from paragraph text, causing it to be reflowed. Alter the formatting to include the CMT-specific options as sub-bullets of the overall CMT resource. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com Release-acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com Applied, thanks. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5 RFC] pygrub: Fix regression from c/s d1b93ea, attempt 2
On Mon, 2014-11-17 at 15:19 +, Andrew Cooper wrote: c/s d1b93ea causes substantial functional regressions in pygrub's ability to parse bootloader configuration files. c/s d1b93ea itself changed an an interface which previously used exclusively integers, to using strings in the case of a grub configuration with explicit default set, along with changing the code calling the interface to require a string. The default value for default remained as an integer. As a result, any Extlinux or Lilo configuration (which drives this interface exclusively with integers), or Grub configuration which doesn't explicitly declare a default will die with an AttributeError when attempting to call self.cf.default.isdigit() where default is an integer. Sadly, this AttributeError gets swallowed by the blanket ignore in the loop which searches partitions for valid bootloader configurations, causing the issue to be reported as Unable to find partition containing kernel This patch attempts to fix the issue by altering all parts of this interface to use strings, as opposed to integers. Would it be less invasive at this point in the release to have the place where this stuff is used do isinstance(s, str) and isinstance(s, int)? Also, in run_grub sel can be set to g.cf.default and then: if sel == -1: print No kernel image selected! sys.exit(1) I can't see where the -1 comes from though, and you aren't changing any -1 into -1, so maybe something more subtle is going on there. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen 4.5 random freeze question
I think I'll debug this a bit later - unfortunately, now don't have time for this. But I want to get rid of spurious interrupt here. BTW - Stefano are you going to post the patch that we created yesterday ? Will Ian accept it? Regards, Andrii On Thu, Nov 20, 2014 at 1:15 PM, Julien Grall julien.gr...@linaro.org wrote: On 11/20/2014 10:28 AM, Stefano Stabellini wrote: On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote: 19 лист. 2014 20:32, користувач Stefano Stabellini stefano.stabell...@eu.citrix.com написав: On Wed, 19 Nov 2014, Julien Grall wrote: On 11/19/2014 06:14 PM, Stefano Stabellini wrote: That's right, the maintenance interrupt handler is not called, but it doesn't do anything so we are fine. The important thing is that an interrupt is sent and git_clear_lrs gets called on hypervisor entry. It would be worth to write down this somewhere. Just in case someone decide to add code in maintenance interrupt later. Yes, I could add a comment in the handler Maybe it wouldn't take a lot of effort to fix it? I am just worrying that we may hide some issue - typically spurious interrupt this not what is expected. My guess is that by clearing UIE before reading GICC_IAR, we clear the maintenance interrupt too, as a consequence the following read to GICC_IAR would return 1023 (nothing to be read). As bit as if the maintenance interrupt was a level interrupt and we just disabled it. So I think that if we cleared UIE after reading GICC_IAR, GICC_IAR would return the correct value. However with the current structure of the code, the first thing that we do upon entering the hypervisor is clearing LRs and given what happened on your platform I think is a good idea to do it with UIE disabled. Agreed. UIE should be disabled to avoid another maintenance interrupt as soon as we EOI the IRQ. This is way I would rather read spurious interrupts but read/write LRs with UIE disabled than reading maintenance interrupts but risking strange behaviours on some platforms. Reading the GIC-v2 documentation, the spurious interrupt things should happen on any platform every time the UIE is disabled while we receive a maintenance interrupt. The read returns a spurious interrupt ID of 1023 if any of the following apply: - no pending interrupt on the CPU interface has sufficient priority for the interface to signal it to the processor -- Julien Grall -- Andrii Tseglytskyi | Embedded Dev GlobalLogic www.globallogic.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for 4.5] xl: Return proper error codes for block-attach and block-detach
On Wed, Nov 12, 2014 at 5:31 PM, George Dunlap george.dun...@eu.citrix.com wrote: Return proper error codes on failure so that scripts can tell whether the command completed properly or not. How about changing this to something like: --- Return proper error codes on failure so that scripts can tell whether the command completed properly or not. This is not a proper fix, since it fails to call libxl_device_disk_dispose() on the error path. But a proper fix requires some refactoring, so given where we are in the release process, it's better to have a fix that is simple and obvious, and do the refactoring once the next development window opens up. Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 for 4.5] xl: Return proper error codes for block-attach and block-detach
-BEGIN PGP SIGNED MESSAGE- Hash: SHA512 On 11/20/2014 03:56 PM, Ian Campbell wrote: On Thu, 2014-11-20 at 15:51 +, George Dunlap wrote: On 11/20/2014 03:47 PM, Ian Campbell wrote: On Mon, 2014-11-17 at 12:36 +, George Dunlap wrote: On 11/14/2014 11:12 AM, Ian Campbell wrote: On Thu, 2014-11-13 at 19:04 +, George Dunlap wrote: Return proper error codes on failure so that scripts can tell whether the command completed properly or not. Also while we're at it, normalize init and dispose of libxl_device_disk structures. This means calling init and dispose in the actual functions where they are declared. This in turn means changing parse_disk_config_multistring() to not call libxl_device_disk_init. And that in turn means changes to callers of parse_disk_config(). It also means removing libxl_device_disk_init() from libxl.c:libxl_vdev_to_device_disk(). This is only called from xl_cmdimpl.c. Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@citrix.com CC: Wei Liu wei.l...@citrix.com CC: Konrad Wilk konrad.w...@oracle.com Release justification: This is a bug fix. It's a fairly minor one, but it's also a very simple one. v2: - Restructure functions to make sure init and dispose are properly called. Sadly this bit has somewhat reduced the truth of the second half of your release justification, since the patch is a fair bit more subtle though. Although IMHO it hasn't invalidated the case for taking the patch for 4.5 (modulo comments below). Well, I think we can take the hacky short-term fix I posted before, which is simple, and do a proper fix after the 4.6 dev window opens up; or we can do a more complete fix now. Specifically the hacky short-term fix is 1415813493-25330-1-git-send-email-george.dun...@eu.citrix.com ? Yes -- the one you found somewhat wanting. :-) I could live with that, perhaps with the commit log explaining that a little. Do you want to add that, or shall I add it and re-submit? If you provide the text I'll just fold it in, unless having written it you find invoking send-email to be so trivial it's actually easier. Unfortunately I think I clobbered v1 in my git tree while developing v2. It probably hasn't been garbage collected yet, but I'll just reply to v1 with an updated changelog. -George -BEGIN PGP SIGNATURE- Version: GnuPG v1 iQEcBAEBCgAGBQJUbg/SAAoJELIVx6fHhBvt7CUIAI75XUObbYy0/Zkinx2eVcLE d+fXSkVFWtwPPI2bfw3DyLtbMzAGxeNLhwwuLZ47b1ROam7fDcMzRp9t36NpetkY E+NoBk7chO8sD8lveGukipNiX0qTSKVQAc721CHwgO3L3yIw7t4iSsylR0Ntzew1 twiL68v1vwC/N70PJYSzW0v1dFQODX7YV5RreFZ+Hon6og8dNvKmlRojPC77Qid0 kAEiL2JouKrQ48ONr1cKKPWHJqNFL3+pZHbZCi9d+OF0pmOhlaVXZFccLlr26xq+ SMQx0rLyTQF6rJRUaQ0zVgMK82BxjVWXO5rIPQggnwwY0ILaYY9YmmPWw86kD0M= =04qs -END PGP SIGNATURE- ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5] scripts/get_maintainer.pl: Correctly CC the maintainers
On Tue, 2014-11-18 at 20:03 +, Julien Grall wrote: By default, the script get_maintainer.pl will remove duplicates email as soon as it appends the list of maintainers of a new file, and therefore override the role of the developper. On complex patch (see [1]), this will result to ommitting randomly some maintainers. This could be fixed Are you proposing an alternative/better fix here? or describing what this patch does? by not removing the duplicate email in the list. Once the list is created, when it's necessary, the script will drop the REST people and remove duplicata. Example: Patch: https://patches.linaro.org/41083/ Before: Daniel De Graaf dgde...@tycho.nsa.gov Ian Jackson ian.jack...@eu.citrix.com Stefano Stabellini stefano.stabell...@eu.citrix.com Ian Campbell ian.campb...@citrix.com Wei Liu wei.l...@citrix.com George Dunlap george.dun...@eu.citrix.com xen-devel@lists.xen.org After: Daniel De Graaf dgde...@tycho.nsa.gov Ian Jackson ian.jack...@eu.citrix.com Stefano Stabellini stefano.stabell...@eu.citrix.com Ian Campbell ian.campb...@citrix.com Wei Liu wei.l...@citrix.com Stefano Stabellini stefano.stabell...@citrix.com Tim Deegan t...@xen.org Keir Fraser k...@xen.org Jan Beulich jbeul...@suse.com George Dunlap george.dun...@eu.citrix.com xen-devel@lists.xen.org [1] http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg00060.html Signed-off-by: Julien Grall julien.gr...@linaro.org CC: Don Slutz dsl...@verizon.com --- I would like to see this patch in Xen 4.5 and backported to Xen 4.4 (first time the script has been introduced). Developpers using this script won't ommitted to cc some maintainers, and it will avoid maintainers complaining about miss CC. The only drawbacks I can see is there is too much people CCed (the patch d67738db was intended to avoid CCing Keir too often). My tree doesn't have in it d67738db but from the example you give above it seems like this patch will regress that? As someone who already gets too much mail and is listed in THE REST these days I am very much in favour of not mailing THE REST when other maintainers have been found. Also, if the maintainers is referenced twice in the file MAINTAINERS with different email, the script won't notice it's duplicated and list 2 times. Though, for this one it could be fixed by modifying the MAINTAINERS file. Is it worth for Xen 4.5? For know, it seems to only happen with Stefano. That's fine IMHO. The script shouldn't be expected to be smart enough to reconcile two distinct strings which happen to refer to the same person into a single string. If someone cares they should patch MAINTAINERS to refer to themselves in a consistent way. --- scripts/get_maintainer.pl |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl index df920e2..cc445cd 100755 --- a/scripts/get_maintainer.pl +++ b/scripts/get_maintainer.pl @@ -35,7 +35,7 @@ my $email_git_min_percent = 5; my $email_git_since = 1-year-ago; my $email_hg_since = -365; my $interactive = 0; -my $email_remove_duplicates = 1; +my $email_remove_duplicates = 0; my $email_use_mailmap = 1; my $email_drop_the_rest_supporter_if_supporter_found = 1; my $output_multiline = 1; ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5 RFC] pygrub: Fix regression from c/s d1b93ea, attempt 2
On 20/11/14 16:00, Ian Campbell wrote: On Mon, 2014-11-17 at 15:19 +, Andrew Cooper wrote: c/s d1b93ea causes substantial functional regressions in pygrub's ability to parse bootloader configuration files. c/s d1b93ea itself changed an an interface which previously used exclusively integers, to using strings in the case of a grub configuration with explicit default set, along with changing the code calling the interface to require a string. The default value for default remained as an integer. As a result, any Extlinux or Lilo configuration (which drives this interface exclusively with integers), or Grub configuration which doesn't explicitly declare a default will die with an AttributeError when attempting to call self.cf.default.isdigit() where default is an integer. Sadly, this AttributeError gets swallowed by the blanket ignore in the loop which searches partitions for valid bootloader configurations, causing the issue to be reported as Unable to find partition containing kernel This patch attempts to fix the issue by altering all parts of this interface to use strings, as opposed to integers. Would it be less invasive at this point in the release to have the place where this stuff is used do isinstance(s, str) and isinstance(s, int)? It would be BaseString not str, but I am fairly sure the classes have altered through Py2's history. I would not be any more confident with that as a solution as trying to correctly to start with. By far the least risky option at this point would be to revert the two identified commits in the comments of the original patch. Also, in run_grub sel can be set to g.cf.default and then: if sel == -1: print No kernel image selected! sys.exit(1) I can't see where the -1 comes from though, and you aren't changing any -1 into -1, so maybe something more subtle is going on there. Ian. sel comes either from g.image_index() which strictly is an integer, or pulled out of the loop immediately preceding and strictly an integer. I can't however find anything which could cause it to have the value -1. All error paths I can spot use 0 instead and load the first entry. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Number of NICs per VM with qemu-upstream (Was: Re: Re: [Xen-users] libvirt emulator /usr/local/lib/xen/bin/qemu-dm emulator did not work on xen-4.4)
On Thu, 20 Nov 2014, Ian Campbell wrote: On Thu, 2014-11-20 at 14:46 +, Stefano Stabellini wrote: On Thu, 20 Nov 2014, Ian Campbell wrote: There is, it's the romfile option to -device e.g. -device $NICMODEL,vlan=0,romfile=$ROMFILE where NICMODEL is e100, rtl8139, virtio-blah and ROMFILE is e.g. an ipxe binary. I think that option was intended to point QEMU to a different file, not to disable the romfile. romfile= does that, I think. Or use /dev/null etc etc. Confirmed working. --- libxl: do not load roms for any NICs except the first to avoid wasting memory Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index 3e191c3..f907ca9 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -674,9 +674,10 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc, LIBXL_NIC_TYPE_VIF_IOEMU); flexarray_append(dm_args, -device); flexarray_append(dm_args, - libxl__sprintf(gc, %s,id=nic%d,netdev=net%d,mac=%s, + libxl__sprintf(gc, %s,id=nic%d,netdev=net%d,mac=%s%s, nics[i].model, nics[i].devid, -nics[i].devid, smac)); +nics[i].devid, smac, +i ? ,romfile=\\ : )); flexarray_append(dm_args, -netdev); flexarray_append(dm_args, GCSPRINTF( type=tap,id=net%d,ifname=%s, ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Xen Security Advisory 113 - Guest effectable page reference leak in MMU_MACHPHYS_UPDATE handling
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Xen Security Advisory XSA-113 Guest effectable page reference leak in MMU_MACHPHYS_UPDATE handling ISSUE DESCRIPTION = An error handling path in the processing of MMU_MACHPHYS_UPDATE failed to drop a page reference which was acquired in an earlier processing step. IMPACT == Malicious or buggy stub domain kernels or tool stacks otherwise living outside of Domain0 can mount a denial of service attack which, if successful, can affect the whole system. Only domains controlling HVM guests can exploit this vulnerability. (This includes domains providing hardware emulation services to HVM guests.) VULNERABLE SYSTEMS == Xen versions from at least 3.2.x onwards are vulnerable on x86 systems. Older versions have not been inspected. ARM systems are not vulnerable. This vulnerability is only applicable to Xen systems using stub domains or other forms of disaggregation of control domains for HVM guests. MITIGATION == Running only PV guests will avoid this issue. (The security of a Xen system using stub domains is still better than with a qemu-dm running as an unrestricted dom0 process. Therefore users with these configurations should not switch to an unrestricted dom0 qemu-dm.) NOTE REGARDING LACK OF EMBARGO == A draft of this advisory was mistakenly sent to xen-devel. The Xen Project Security Team apologises for this error. We are working to share best working practices amongst the team to reduce the risks of recurrance. CREDITS === This issue was discovered by Andrew Cooper of Citrix. RESOLUTION == Applying the attached patch resolves this issue. xsa113.patchxen-unstable, Xen 4.4.x, Xen 4.3.x, Xen 4.2.x $ sha256sum xsa113*.patch a0f2b792a6b4648151f85fe13961b0bf309a568ed03e1b1d4ea01e4eabf1b18e xsa113.patch $ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAEBAgAGBQJUbhNoAAoJEIP+FMlX6CvZ5v8H/0cwnDOmSUZQ5Wm6ULUQH0w+ Jbsf6JPBRyDch1nCv/d8X27vSfmB8JH0m+LclEH0F1XSUiu5p4y46ZKk7Zfm4+gD xq6/eKyXKwCXinAwEcLtvfONrajQQvzk2y4XZpE+g9U00AwvsBXM3AdqPup8cyQl OLQO9Oq+xiqusCXIQeCb/KnoVUGS9PqlG/RT3rKKorYzuQjG7VURU3uKA1Vju7oD ITzbNCjTjnA7cFVSk6g9ZG6k40nGkVKIv+pPFfZAE6/UqiCF91oNzVAYVnA0X0oL YoAFxvVFOHp78192jW/7S8uacG+bskJNAr+NYIuaBlykka6Vbef6esWOW3UZEhA= =LDjw -END PGP SIGNATURE- xsa113.patch Description: Binary data ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5] scripts/get_maintainer.pl: Correctly CC the maintainers
On Thu, 2014-11-20 at 16:21 +, Julien Grall wrote: On 11/20/2014 04:15 PM, Julien Grall wrote: Hi Ian, On 11/20/2014 04:08 PM, Ian Campbell wrote: On Tue, 2014-11-18 at 20:03 +, Julien Grall wrote: By default, the script get_maintainer.pl will remove duplicates email as soon as it appends the list of maintainers of a new file, and therefore override the role of the developper. On complex patch (see [1]), this will result to ommitting randomly some maintainers. This could be fixed Are you proposing an alternative/better fix here? or describing what this patch does? Describing what the patch does. by not removing the duplicate email in the list. Once the list is created, when it's necessary, the script will drop the REST people and remove duplicata. Example: Patch: https://patches.linaro.org/41083/ Before: Daniel De Graaf dgde...@tycho.nsa.gov Ian Jackson ian.jack...@eu.citrix.com Stefano Stabellini stefano.stabell...@eu.citrix.com Ian Campbell ian.campb...@citrix.com Wei Liu wei.l...@citrix.com George Dunlap george.dun...@eu.citrix.com xen-devel@lists.xen.org After: Daniel De Graaf dgde...@tycho.nsa.gov Ian Jackson ian.jack...@eu.citrix.com Stefano Stabellini stefano.stabell...@eu.citrix.com Ian Campbell ian.campb...@citrix.com Wei Liu wei.l...@citrix.com Stefano Stabellini stefano.stabell...@citrix.com Tim Deegan t...@xen.org Keir Fraser k...@xen.org Jan Beulich jbeul...@suse.com George Dunlap george.dun...@eu.citrix.com xen-devel@lists.xen.org [1] http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg00060.html Signed-off-by: Julien Grall julien.gr...@linaro.org CC: Don Slutz dsl...@verizon.com --- I would like to see this patch in Xen 4.5 and backported to Xen 4.4 (first time the script has been introduced). Developpers using this script won't ommitted to cc some maintainers, and it will avoid maintainers complaining about miss CC. The only drawbacks I can see is there is too much people CCed (the patch d67738db was intended to avoid CCing Keir too often). My tree doesn't have in it d67738db but from the example you give above it seems like this patch will regress that? As someone who already gets too much mail and is listed in THE REST these days I am very much in favour of not mailing THE REST when other maintainers have been found. Forgot to add, the example above show the difference without and with the patch. The list is correct because both ARM and x86 maintainers should be CC. Because of this all THE REST maintainers are added. Just to be clear, you mean that everyone under THE REST is added solely because they also happen to be maintainers of some other relevant bit of code, not that THE REST is explicitly added in this case, right? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5] scripts/get_maintainer.pl: Correctly CC the maintainers
On Thu, 2014-11-20 at 16:15 +, Julien Grall wrote: Hi Ian, On 11/20/2014 04:08 PM, Ian Campbell wrote: On Tue, 2014-11-18 at 20:03 +, Julien Grall wrote: By default, the script get_maintainer.pl will remove duplicates email as soon as it appends the list of maintainers of a new file, and therefore override the role of the developper. On complex patch (see [1]), this will result to ommitting randomly some maintainers. This could be fixed Are you proposing an alternative/better fix here? or describing what this patch does? Describing what the patch does. Then I think you meant This is fixed or This patches fixes this by By drawbacks I meant, if there is another bug in the script then we may end up to cc too many people. Honestly I don't believe it's the case. Sure, lets assume not. Ian ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5 RFC] pygrub: Fix regression from c/s d1b93ea, attempt 2
On 11/20/2014 11:15 AM, Ian Campbell wrote: On Thu, 2014-11-20 at 16:08 +, Andrew Cooper wrote: On 20/11/14 16:00, Ian Campbell wrote: On Mon, 2014-11-17 at 15:19 +, Andrew Cooper wrote: c/s d1b93ea causes substantial functional regressions in pygrub's ability to parse bootloader configuration files. c/s d1b93ea itself changed an an interface which previously used exclusively integers, to using strings in the case of a grub configuration with explicit default set, along with changing the code calling the interface to require a string. The default value for default remained as an integer. As a result, any Extlinux or Lilo configuration (which drives this interface exclusively with integers), or Grub configuration which doesn't explicitly declare a default will die with an AttributeError when attempting to call self.cf.default.isdigit() where default is an integer. Sadly, this AttributeError gets swallowed by the blanket ignore in the loop which searches partitions for valid bootloader configurations, causing the issue to be reported as Unable to find partition containing kernel This patch attempts to fix the issue by altering all parts of this interface to use strings, as opposed to integers. Would it be less invasive at this point in the release to have the place where this stuff is used do isinstance(s, str) and isinstance(s, int)? It would be BaseString not str, but I am fairly sure the classes have altered through Py2's history. I would not be any more confident with that as a solution as trying to correctly to start with. Actually isinstance(s, basestring) is what the webpage I was looking at said, but I cut-n-pasted the wrong thing. But regardless could it not do something like: if !isinstance(foo.cf.default, int): I think it would be the other way around, e.g. (not tested): diff --git a/tools/pygrub/src/pygrub b/tools/pygrub/src/pygrub index aa7e562..7250f45 100644 --- a/tools/pygrub/src/pygrub +++ b/tools/pygrub/src/pygrub @@ -457,7 +457,9 @@ class Grub: self.cf.parse(buf) def image_index(self): -if self.cf.default.isdigit(): +if isinstance(self.cf.default, int) +sel = self.cf.default +elif if self.cf.default.isdigit(): sel = int(self.cf.default) else: # We don't fully support submenus. Look for the leaf value in but yes, this looks less intrusive (assuming this the only place where we'd hit this error). -boris blah = int(foo.cf.default) elif foo.cf.default.isdigit(): blah = whatever and avoid the confusion about what is/isn't a string class while still fixing the issue? sel comes either from g.image_index() which strictly is an integer, or pulled out of the loop immediately preceding and strictly an integer. Ah, good. I can't however find anything which could cause it to have the value -1. All error paths I can spot use 0 instead and load the first entry. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen 4.5 random freeze question
OK - I see. thanks a lot. On Thu, Nov 20, 2014 at 6:15 PM, Stefano Stabellini stefano.stabell...@eu.citrix.com wrote: Already posted: http://marc.info/?l=xen-develm=141648092100568 Ian hasn't provided any feedback yet. On Thu, 20 Nov 2014, Andrii Tseglytskyi wrote: I think I'll debug this a bit later - unfortunately, now don't have time for this. But I want to get rid of spurious interrupt here. BTW - Stefano are you going to post the patch that we created yesterday ? Will Ian accept it? Regards, Andrii On Thu, Nov 20, 2014 at 1:15 PM, Julien Grall julien.gr...@linaro.org wrote: On 11/20/2014 10:28 AM, Stefano Stabellini wrote: On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote: 19 лист. 2014 20:32, користувач Stefano Stabellini stefano.stabell...@eu.citrix.com написав: On Wed, 19 Nov 2014, Julien Grall wrote: On 11/19/2014 06:14 PM, Stefano Stabellini wrote: That's right, the maintenance interrupt handler is not called, but it doesn't do anything so we are fine. The important thing is that an interrupt is sent and git_clear_lrs gets called on hypervisor entry. It would be worth to write down this somewhere. Just in case someone decide to add code in maintenance interrupt later. Yes, I could add a comment in the handler Maybe it wouldn't take a lot of effort to fix it? I am just worrying that we may hide some issue - typically spurious interrupt this not what is expected. My guess is that by clearing UIE before reading GICC_IAR, we clear the maintenance interrupt too, as a consequence the following read to GICC_IAR would return 1023 (nothing to be read). As bit as if the maintenance interrupt was a level interrupt and we just disabled it. So I think that if we cleared UIE after reading GICC_IAR, GICC_IAR would return the correct value. However with the current structure of the code, the first thing that we do upon entering the hypervisor is clearing LRs and given what happened on your platform I think is a good idea to do it with UIE disabled. Agreed. UIE should be disabled to avoid another maintenance interrupt as soon as we EOI the IRQ. This is way I would rather read spurious interrupts but read/write LRs with UIE disabled than reading maintenance interrupts but risking strange behaviours on some platforms. Reading the GIC-v2 documentation, the spurious interrupt things should happen on any platform every time the UIE is disabled while we receive a maintenance interrupt. The read returns a spurious interrupt ID of 1023 if any of the following apply: - no pending interrupt on the CPU interface has sufficient priority for the interface to signal it to the processor -- Julien Grall -- Andrii Tseglytskyi | Embedded Dev GlobalLogic www.globallogic.com -- Andrii Tseglytskyi | Embedded Dev GlobalLogic www.globallogic.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 for-4.5] xen/arm: clear UIE on hypervisor entry
Hi Stefano, On 11/20/2014 03:54 PM, Stefano Stabellini wrote: On Thu, 20 Nov 2014, Julien Grall wrote: On 11/20/2014 11:02 AM, Julien Grall wrote: Hi Stefano, On 11/20/2014 10:53 AM, Stefano Stabellini wrote: UIE being set can cause maintenance interrupts to occur when Xen writes to one or more LR registers. The effect is a busy loop around the interrupt handler in Xen (http://marc.info/?l=xen-develm=141597517132682): everything gets stuck. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Reported-and-Tested-by: Andrii Tseglytskyi andrii.tseglyts...@globallogic.com CC: konrad.w...@oracle.com --- Konrad, this fixes an actual bug, at least on OMAP5. It should have no bad side effects on any other platforms as far as I can tell. It should go in 4.5. From Andrii's mail, there is a bad side effect. We can receive spurious interrupt. On V1, you said that you tried this patch on midway. I would prefer if you give a try on platform that would really be used (such as xgene or seattle). As we know Midway won't be used in prod. And maybe give a try to gicv3 too as it's common code. I don't have a gicv3 test environment ready but it works on xgene too Ok. I will give a quick try on the model today or tomorrow. Aside from that, and after reading the spec. This patch looks good to me: Reviewed-by: Julien Grall julien.gr...@linaro.org Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 for-4.5] xen/arm: clear UIE on hypervisor entry
On Thu, 2014-11-20 at 10:53 +, Stefano Stabellini wrote: UIE being set can cause maintenance interrupts to occur when Xen writes to one or more LR registers. The effect is a busy loop around the interrupt handler in Xen (http://marc.info/?l=xen-develm=141597517132682): everything gets stuck. I think it would be useful to explain somewhere why/how a spurious interrupt can occur, if not here then in the comment you've already added or in another one in gic_clear_lrs where you clear UIE. The bit about clearing the LRs on entry causing UIE to become deasserted before we get as far as reading the IAR is a bit subtle. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Reported-and-Tested-by: Andrii Tseglytskyi andrii.tseglyts...@globallogic.com CC: konrad.w...@oracle.com With the expanded commentary: Acked-by: Ian Campbell ian.campb...@citrix.com --- Konrad, this fixes an actual bug, at least on OMAP5. It should have no bad side effects on any other platforms as far as I can tell. It should go in 4.5. Changes in v2: - add an in-code comment about maintenance_interrupt not being called. diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c index 70d10d6..c6c11d3 100644 --- a/xen/arch/arm/gic.c +++ b/xen/arch/arm/gic.c @@ -403,6 +403,8 @@ void gic_clear_lrs(struct vcpu *v) if ( is_idle_vcpu(v) ) return; +gic_hw_ops-update_hcr_status(GICH_HCR_UIE, 0); + spin_lock_irqsave(v-arch.vgic.lock, flags); while ((i = find_next_bit((const unsigned long *) this_cpu(lr_mask), @@ -527,8 +529,6 @@ void gic_inject(void) if ( !list_empty(current-arch.vgic.lr_pending) lr_all_full() ) gic_hw_ops-update_hcr_status(GICH_HCR_UIE, 1); -else -gic_hw_ops-update_hcr_status(GICH_HCR_UIE, 0); } static void do_sgi(struct cpu_user_regs *regs, enum gic_sgi sgi) @@ -598,6 +598,10 @@ static void maintenance_interrupt(int irq, void *dev_id, struct cpu_user_regs *r * Receiving the interrupt is going to cause gic_inject to be called * on return to guest that is going to clear the old LRs and inject * new interrupts. + * + * Do not add code here: maintenance interrupts caused by setting + * GICH_HCR_UIE, might read as spurious interrupts (1023). As a + * consequence sometimes this handler might not be called. */ } ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5] scripts/get_maintainer.pl: Correctly CC the maintainers
On Thu, 2014-11-20 at 16:43 +, Julien Grall wrote: On 11/20/2014 04:29 PM, Ian Campbell wrote: Forgot to add, the example above show the difference without and with the patch. The list is correct because both ARM and x86 maintainers should be CC. Because of this all THE REST maintainers are added. Just to be clear, you mean that everyone under THE REST is added solely because they also happen to be maintainers of some other relevant bit of code, not that THE REST is explicitly added in this case, right? Yes, my description was confusing. With setting $email_remove_duplicates to 0, the script will: 1) Append the list of maintainers for every file 2) Filter the list to remove the entry with THE REST role 3) Remove duplicated address The previous behavior was: 1) Get the list of maintainers of the file (incidentally all the maintainers in THE REST role are added). If the email address already exists in the global list, skip it. 2) Filter the list to remove the entry with THE REST role So if a maintainers is marked on the THE REST on the first file and actually be an x86 maintainers on the second file, the scripts will only retain the THE REST role. If it's more clear, I can add the explanation above in the commit message. It is, please do. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5] scripts/get_maintainer.pl: Correctly CC the maintainers
On Thu, 2014-11-20 at 16:52 +, Ian Campbell wrote: On Thu, 2014-11-20 at 16:43 +, Julien Grall wrote: On 11/20/2014 04:29 PM, Ian Campbell wrote: Forgot to add, the example above show the difference without and with the patch. The list is correct because both ARM and x86 maintainers should be CC. Because of this all THE REST maintainers are added. Just to be clear, you mean that everyone under THE REST is added solely because they also happen to be maintainers of some other relevant bit of code, not that THE REST is explicitly added in this case, right? Just a small clarification... Yes, my description was confusing. With setting $email_remove_duplicates to 0, the script will: 1) Append the list of maintainers for every file At this point each maintainer remains associated with the role/reason they are added, right? 2) Filter the list to remove the entry with THE REST role And this only happens if there are roles other than THE REST in the list? 3) Remove duplicated address The previous behavior was: 1) Get the list of maintainers of the file (incidentally all the maintainers in THE REST role are added). If the email address already exists in the global list, skip it. 2) Filter the list to remove the entry with THE REST role Whereas here the link from maintainer to the role is lost, hence everyone in THE REST is blindly removed? So if a maintainers is marked on the THE REST on the first file and actually be an x86 maintainers on the second file, the scripts will only retain the THE REST role. If it's more clear, I can add the explanation above in the commit message. It is, please do. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5 v2] libxc: don't leak buffer containing the uncompressed PV kernel
2014-11-20 15:48 GMT+00:00 Ian Campbell ian.campb...@citrix.com: The libxc xc_dom_* infrastructure uses a very simple malloc memory pool which is freed by xc_dom_release. However the various xc_try_*_decode routines (other than the gzip one) just use plain malloc/realloc and therefore the buffer ends up leaked. The memory pool currently supports mmap'd buffers as well as a directly allocated buffers, however the try decode routines make use of realloc and do not fit well into this model. Introduce a concept of an external memory block to the memory pool and provide an interface to register such memory. The mmap_ptr and mmap_len fields of the memblock tracking struct lose their mmap_ prefix since they are now also used for external memory blocks. We are only seeing this now because the gzip decoder doesn't leak and it's only relatively recently that kernels in the wild have switched to better compression. This is https://bugs.debian.org/767295 Reported by: Gedalya geda...@gedalya.net Signed-off-by: Ian Campbell ian.campb...@citrix.com --- v2: Correct handling in xc_try_lzo1x_decode. This is a bug fix and should go into 4.5. I have a sneaking suspicion this is going to need to backport a very long way... --- tools/libxc/include/xc_dom.h| 10 -- tools/libxc/xc_dom_bzimageloader.c | 20 tools/libxc/xc_dom_core.c | 61 +++ tools/libxc/xc_dom_decompress_lz4.c |5 +++ 4 files changed, 80 insertions(+), 16 deletions(-) diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h index 6ae6a9f..07d7224 100644 --- a/tools/libxc/include/xc_dom.h +++ b/tools/libxc/include/xc_dom.h @@ -33,8 +33,13 @@ struct xc_dom_seg { struct xc_dom_mem { struct xc_dom_mem *next; -void *mmap_ptr; -size_t mmap_len; +void *ptr; +enum { +XC_DOM_MEM_TYPE_MALLOC_INTERNAL, +XC_DOM_MEM_TYPE_MALLOC_EXTERNAL, +XC_DOM_MEM_TYPE_MMAP, +} type; +size_t len; unsigned char memory[0]; }; Just a small optimization, if you move type after len you can reduce structure size. Unless you want memory field aligned to 8 bytes on 64 bit but in this case I would force member alignment. Frediano @@ -298,6 +303,7 @@ void xc_dom_log_memory_footprint(struct xc_dom_image *dom); /* --- simple memory pool -- */ void *xc_dom_malloc(struct xc_dom_image *dom, size_t size); +int xc_dom_register_external(struct xc_dom_image *dom, void *ptr, size_t size); void *xc_dom_malloc_page_aligned(struct xc_dom_image *dom, size_t size); void *xc_dom_malloc_filemap(struct xc_dom_image *dom, const char *filename, size_t * size, diff --git a/tools/libxc/xc_dom_bzimageloader.c b/tools/libxc/xc_dom_bzimageloader.c index 2225699..964ebdc 100644 --- a/tools/libxc/xc_dom_bzimageloader.c +++ b/tools/libxc/xc_dom_bzimageloader.c @@ -161,6 +161,13 @@ static int xc_try_bzip2_decode( total = (((uint64_t)stream.total_out_hi32) 32) | stream.total_out_lo32; +if ( xc_dom_register_external(dom, out_buf, total) ) +{ +DOMPRINTF(BZIP2: Error registering stream output); +free(out_buf); +goto bzip2_cleanup; +} + DOMPRINTF(%s: BZIP2 decompress OK, 0x%zx - 0x%lx, __FUNCTION__, *size, (long unsigned int) total); @@ -305,6 +312,13 @@ static int _xc_try_lzma_decode( } } +if ( xc_dom_register_external(dom, out_buf, stream-total_out) ) +{ +DOMPRINTF(%s: Error registering stream output, what); +free(out_buf); +goto lzma_cleanup; +} + DOMPRINTF(%s: %s decompress OK, 0x%zx - 0x%zx, __FUNCTION__, what, *size, (size_t)stream-total_out); @@ -464,7 +478,13 @@ static int xc_try_lzo1x_decode( dst_len = lzo_read_32(cur); if ( !dst_len ) +{ +msg = Error registering stream output; +if ( xc_dom_register_external(dom, out_buf, out_len) ) +break; + return 0; +} if ( dst_len LZOP_MAX_BLOCK_SIZE ) { diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c index baa62a1..ecbf981 100644 --- a/tools/libxc/xc_dom_core.c +++ b/tools/libxc/xc_dom_core.c @@ -132,6 +132,7 @@ void *xc_dom_malloc(struct xc_dom_image *dom, size_t size) return NULL; } memset(block, 0, sizeof(*block) + size); +block-type = XC_DOM_MEM_TYPE_MALLOC_INTERNAL; block-next = dom-memblocks; dom-memblocks = block; dom-alloc_malloc += sizeof(*block) + size; @@ -151,23 +152,45 @@ void *xc_dom_malloc_page_aligned(struct xc_dom_image *dom, size_t size) return NULL; } memset(block, 0, sizeof(*block)); -block-mmap_len = size; -block-mmap_ptr = mmap(NULL, block-mmap_len, -
Re: [Xen-devel] [OSSTEST PATCH] linux-next tests: Use correct branch for baseline
On Thu, 2014-11-20 at 13:51 +, Ian Jackson wrote: Make cr-daily-branch honour an environment or setting variable EXTRA_SGR_ARGS. In branch-settings.linux-next set it appropriately to arrange that the linux-next test reports consider linux-linus tests as interesting as well as just linux-next ones. (We already use a flight from linux-linus for selecting the baseline linux version.) Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com --- branch-settings.linux-next |1 + cr-daily-branch|2 ++ 2 files changed, 3 insertions(+) diff --git a/branch-settings.linux-next b/branch-settings.linux-next index e9bf926..1c5660b 100644 --- a/branch-settings.linux-next +++ b/branch-settings.linux-next @@ -2,3 +2,4 @@ OSSTEST_NO_BASELINE=y OSSTEST_PUSH=false OLD_REVISION=determine-late GITFORCEFLAG=--fail +EXTRA_SGR_ARGS=--branches-also=linux-linus diff --git a/cr-daily-branch b/cr-daily-branch index d00ecbb..17bb2c9 100755 --- a/cr-daily-branch +++ b/cr-daily-branch @@ -301,6 +301,8 @@ END ;; esac +sgr_args+= $EXTRA_SGR_ARGS + : $flight $branch $OSSTEST_BLESSING $sgr_args $DAILY_BRANCH_PREEXEC_HOOK execute_flight $flight $OSSTEST_BLESSING ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [OSSTEST PATCH] README.planner: Document the resource planning system
Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com --- README.planner | 181 +++- 1 file changed, 180 insertions(+), 1 deletion(-) diff --git a/README.planner b/README.planner index de8b962..ec4dce8 100644 --- a/README.planner +++ b/README.planner @@ -1,4 +1,183 @@ -Resource planner / scheduler +RESOURCE PLANNER (IE SCHEDULER) +=== + +Overall architecture + + +Resources (eg hosts) are owned by `tasks'. As resources are allocated +and deallocated, their `owntaskid' in the database is updated. + +When a process wishes to allocate resources, it does as follows: + + - Select an appropriate task. For command-line use, the user@host + static task usually used (as specified by the OSSTEST_TASK env var) + and things fail if it doesn't actually exist. + + Automatic runs create a new ownd task for each job (in become-task + in JobDB-Executive.tcl, in sg-run-job. + + - Connect to the queue daemon and participate in the planning + process. + + +Planning + + +The queue daemon sequences the planning of resource use and the +allocation of resources. This is done in a periodic planning cycle. +Planning cycles are prompted by newly available resources, new +requests for participation, and periodically. + +During each planning cycle we construct, from scratch, a complete plan +for which resources are to be used, when, by which tasks. Resources +which are free and suitable for allocation right away are planned and +allocated for immediate use. + +But, the plan extends far enough into the future to cover all +currently-foreseeable requirements for resources. This provides the +planning algorithms the most complete information about available +tradeoffs, and also provides useful output (the resource plan) for +administrators and users. + +Each planning cycle starts with the existing allocated resources. The +planning daemon records (on disk, not in the database) what expected +duration was declared with each of those allocations. (A task that +has allocated the resources it needs does not any longer participate +in the planning process, although it will retain a liveness connection +to the ms-ownerdaemon.) + +Then each interested client of ms-queuedaemon is asked - one by one, +in turn - to fill into the plan-under-construction, what resources it +intends to uses when. Clients specify the expected duration of their +use (but there is no mechanism for enforcing accuracy of these +estimates). ms-queuedaemon collates and records the provided +information and passes it on to the next client. + +If there are resources which are available right now which a client +wants to use, the client will allocate it there and then during its +planning slot. + +The queueing order is determined by the job priority value. Each +client declares its own priority. The usual basis for the priority is +is client's starting time_t. So by and large jobs execute in order. + +The main client in the planning process is +ts-hosts-allocate-Executive. That program contains the heuristics for +choosing good tests hosts under various conditions. + +Command-line users can use mg-allocate -U to obtain resources through +the planning process. mg-allocate participates with a high queue +priority so that command-line allocations will take precedence over +automatic test runs. (mg-allocate without -U bypasses the planner and +can be used to `grab' resources which happen currently to be free.) + +The distinction between `idle' and `allocatable' resources exists so +that newly-freed resources are properly offered first to the tasks at +the front of the queue. ms-ownerdaemon sets all idle resources to +allocatable at the start of each planning cycle. + + +ms-ownerdaemon and `ownd' tasks +--- + +ms-ownerdaemon helps with cleanup and does nothing else. Test runs +connect to it and obtain ephemeral `task' ids. All of the processes +which are part of the the test run retain a descriptor onto the +socket connection to ms-ownerdaemon. When the last holder of a copy +of the socket connection fd dies, ms-ownerdaemon sees the connection +close. It then sets the task to `not live' in the database. + +This means that there is no need for any explicit cleanup: tasks +which just crash have their resources freed automatically. + +If the ms-ownerdaemon fails and is restarted, the tasks which were +clients of the previous ms-owerdaemon cannot be automatically cleaned +up. The new ms-ownerdaemon will annotate them with `previous'. The +administrator can then clean them up manually, if she knows that all +the corresponding actual processes are no longer running. + + +Types of task +- + + * static tasks. Usual for command-line use. They are manually + created (with ./mg-hosts manual-task-create) and not normally ever + destroyed. + + * `ownd' tasks. These are used for production runs from cron and +
[Xen-devel] Buggy interaction of live migration and p2m updates
Hello, Tim, David and I were discussing this over lunch. This email is a (hopefully accurate) account of our findings, and potential solutions. (If I have messed up, please shout.) Currently, correct live migration of PV domains relies on the toolstack (which has a live mapping of the guests p2m) not observing stale values when the guest updates its p2m, and the race condition between a p2m update and an m2p update. Realistically, this means no updates to the p2m at all, due to several potential race conditions. Should any race conditions happen (e.g. ballooning while live migrating), the effects could be anything from an aborted migration to VM memory corruption. It should be noted that migrationv2 does not fix any of this. It alters the way in which some race conditions might be observed. During development of migrationv2, there was an explicit non-requirement of fixing the existing Ballooning+LiveMigration issues we were aware, although at the time, we were not aware of this specific set of issues. Our goal was to simply make migrationv2 work in the same circumstances as previously, but with a bitness-agnostic wire format and forward-extensible protocol. As far as these issues are concerned, there are two distinct p2m modifications which we care about: 1) p2m structure changes (rearranging the layout of the p2m) 2) p2m content changes (altering entries in the p2m) There is no possible way for the toolstack to prevent a domain from altering its p2m. At the moment, ballooning typically only occurs when requested by the toolstack, but the underlying operations (increase/decrease_reservation, mem_exchange, etc) can be used by the guest at any point. This includes Wei's guest memory fragmentation changes. Changes to the content of the p2m also occur for grant map and unmap operations. Currently in PV guests, the p2m is implemented using a 3-level tree, with its root in the guests shared_info page. It provides a hard VM memory limit of 4TB for 32bit PV guests (which is far higher than the 128GB limit from the compat p2m mappings), or 512GB for 64bit PV guests. Juergen has a proposed new p2m interface using a virtual linear mapping. This is conceptually similar to the previous implementation (which is fine from the toolstacks point of view), but far less complicated from the guests point of view, and removes the memory limits imposed by the p2m structure. The new virtual linear mapping suffers from the same interaction issues as the old 3-level tree did, but the introduction of the new interface affords us an opportunity to make all API modifications at once to reduce churn. During live migration, the toolstack maps the guests p2m into a linear mapping in the toolstacks virtual address space. This is done once at the start of migration, and never subsequently altered. During live migration, the p2m is cross-verified with the m2p, and frames are sent using pfns as a reference, as they will be located in different frames on the receiving side. Should the guest change the p2m structure during live migration, the toolstack ends up with a stale p2m with a non-p2m frame in the middle, resulting in bogus cross-referencing. Should the guest change an entry in the p2m, the p2m frame itself will be resent as it would be marked as dirty in the logdirty bitmap, but the target pfn will remain unsent and probably stale on the receiving side. Another factor which needs to be taken into account is Remus/COLO, which run the domains under live migration conditions for the duration of their lifetime. During the live part of migration, the toolstack already has to be able to tolerate failures to normalise the pagetables, which result as a consequent of the pagetables being in active. These failures are fatal on the final iteration after the guest has been paused, but the same logic could be extended to p2m/m2p issues, if needed. There are several potential solutions to these problems. 1) Freeze the guests p2m during live migrate This is the simplest sounding option, but is quite problematic from the point of view of the guest. It is essentially a shared spinlock between the toolstack and the guest kernel. It would prevent any grant map/unmap operations from occurring, and might interact badly with certain p2m updated in the guest which would previously be expected to unconditionally succeed. Pros) (Can't think of any) Cons) Not easy to implement (even conceptually), requires invasive guest changes, will cripple Remus/COLO 2) Deep p2m dirty tracking In the case that a p2m frame is discovered dirty in the logdirty bitmap, we can be certain that a write has occurred to it, and in the common case, means that the mapping has changed. The toolstack could maintain a non-live copy of the p2m which is updated as new frames are sent. When a dirty p2m frame is found, the live and non-live copies can be consulted to find which pfn mappings have changed, and locally mark all the altered pfns for
Re: [Xen-devel] [PATCH v2 for 4.5] xl: Return proper error codes for block-attach and block-detach
On Thu, Nov 20, 2014 at 03:47:04PM +, Ian Campbell wrote: On Mon, 2014-11-17 at 12:36 +, George Dunlap wrote: On 11/14/2014 11:12 AM, Ian Campbell wrote: On Thu, 2014-11-13 at 19:04 +, George Dunlap wrote: Return proper error codes on failure so that scripts can tell whether the command completed properly or not. Also while we're at it, normalize init and dispose of libxl_device_disk structures. This means calling init and dispose in the actual functions where they are declared. This in turn means changing parse_disk_config_multistring() to not call libxl_device_disk_init. And that in turn means changes to callers of parse_disk_config(). It also means removing libxl_device_disk_init() from libxl.c:libxl_vdev_to_device_disk(). This is only called from xl_cmdimpl.c. Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@citrix.com CC: Wei Liu wei.l...@citrix.com CC: Konrad Wilk konrad.w...@oracle.com Release justification: This is a bug fix. It's a fairly minor one, but it's also a very simple one. v2: - Restructure functions to make sure init and dispose are properly called. Sadly this bit has somewhat reduced the truth of the second half of your release justification, since the patch is a fair bit more subtle though. Although IMHO it hasn't invalidated the case for taking the patch for 4.5 (modulo comments below). Well, I think we can take the hacky short-term fix I posted before, which is simple, and do a proper fix after the 4.6 dev window opens up; or we can do a more complete fix now. Specifically the hacky short-term fix is 1415813493-25330-1-git-send-email-george.dun...@eu.citrix.com ? I could live with that, perhaps with the commit log explaining that a little. Konrad? hands George the band-aid tape Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Or, if the valgrind thing is really important, I could use the change you suggested, and do return rc ? 1 : 0; but I really think that's going in the wrong direction... -George --- tools/libxl/libxl.c | 2 -- tools/libxl/xl_cmdimpl.c | 28 +--- 2 files changed, 21 insertions(+), 9 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index f7961f6..d9c4ce7 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -2611,8 +2611,6 @@ int libxl_vdev_to_device_disk(libxl_ctx *ctx, uint32_t domid, if (devid 0) return ERROR_INVAL; -libxl_device_disk_init(disk); _init functions are idempotent, so this is harmless, I think. Existing callers (including things which aren't xl) might be relying on it. Outside of a freeze period that might be acceptable (those callers are buggy) but since you are asking for a freeze exception I think this may be going a step too far in cleaning things up. In terms of other callers you've missed tools/ocaml/libs/xl/xenlight_stubs.c (which appears buggy to me) in tree, plus whatever use e.g. libvirt makes of it. diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c index 3c9f146..25af715 100644 --- a/tools/libxl/xl_cmdimpl.c +++ b/tools/libxl/xl_cmdimpl.c @@ -485,8 +485,6 @@ static void parse_disk_config_multistring(XLU_Config **config, { int e; -libxl_device_disk_init(disk); Likewise here, although to a lesser extent since this is xl not libxl. @@ -6378,13 +6382,17 @@ int main_blockattach(int argc, char **argv) printf(disk: %s\n, json); free(json); if (ferror(stdout) || fflush(stdout)) { perror(stdout); exit(-1); } -return 0; You should set rc = 0 here rather than initing it at declaration to catch error paths which don't set the result. (we aren't very consistent about this in the code today so I'm only mentioning it because other changes seem to be needed, if that turns out not to be the case and there's no need for v3 then this shouldn't block acceptance) +goto out; I'm not 100% convinced by the use of the goto out error handling style for a success path, but it's probably better than duplicating the exit path or adding !dryrun checks to all the following operations. Ian, since you recently posted updated coding style for things around this, what do you think? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [for-xen-4.5 PATCH v2 2/2] dpci: Add ZOMBIE state to allow the softirq to finish with the dpci_pirq.
On Thu, Nov 20, 2014 at 11:55:43AM +0100, Jan Beulich wrote: On 19.11.14 at 23:21, konrad.w...@oracle.com wrote: Leaving aside the question of whether this is the right approach, in case it is a couple of comments: @@ -85,7 +91,7 @@ static void raise_softirq_for(struct hvm_pirq_dpci *pirq_dpci) */ bool_t pt_pirq_softirq_active(struct hvm_pirq_dpci *pirq_dpci) { -if ( pirq_dpci-state ((1 STATE_RUN) | (1 STATE_SCHED)) ) +if ( pirq_dpci-state ((1 STATE_RUN) | (1 STATE_SCHED) | (1 STATE_ZOMBIE) ) ) This is becoming unwieldy - perhaps better just if ( pirq_dpci-state )? @@ -122,6 +128,7 @@ static void pt_pirq_softirq_cancel(struct hvm_pirq_dpci *pirq_dpci, /* fallthrough. */ case (1 STATE_RUN): case (1 STATE_RUN) | (1 STATE_SCHED): +case (1 STATE_RUN) | (1 STATE_SCHED) | (1 STATE_ZOMBIE): How would we get into this state? Afaict STATE_ZOMBIE can't be set at the same time as STATE_SCHED. @@ -786,6 +793,7 @@ unlock: static void dpci_softirq(void) { unsigned int cpu = smp_processor_id(); +unsigned int reset = 0; bool_t (and would better be inside the loop, avoiding the pointless re-init on the last iteration). But taking it as a whole - aren't we now at risk of losing an interrupt instance the guest expects, due to raise_softirq_for() bailing on zombie entries, and dpci_softirq() being the only place where the zombie state gets cleared? Ah crud. So a simple fix could be to seperate the 'state' to only deal with the raise_softirq and softirq_dpci. And then add a new (old) 'masked' to deal between hvm_dirq_assist, pt_irq_guest_eoi and hvm_do_IRQ_dpci. From 94a98e20a8ab721a58788919f92e3524a6c6e25c Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk konrad.w...@oracle.com Date: Thu, 20 Nov 2014 14:28:11 -0500 Subject: [PATCH] dpci: Add 'masked' as an gate for hvm_dirq_assist to process. commit f6dd295381f4b6a66acddacf46bca8940586c8d8 dpci: replace tasklet with softirq used the 'masked' as an two-bit state mechanism (STATE_SCHED, STATE_RUN) to communicate between 'raise_softirq_for' and 'dpci_softirq' to determine whether the 'struct hvm_pirq_dpci' can be re-scheduled. However it ignored the 'pt_irq_guest_eoi' was not adhering to the proper dialogue and was not using locked cmpxchg or test_bit operations and ended setting 'state' set to zero. That meant 'raise_softirq_for' was free to schedule it while the 'struct hvm_pirq_dpci'' was still on an per-cpu list causing an list corruption. The code would trigger the following path causing list corruption: \-timer_softirq_action pt_irq_time_out calls pt_pirq_softirq_cancel sets state to 0. pirq_dpci is still on dpci_list. \- dpci_sofitrq while (!list_emptry(our_list)) list_del, but has not yet done 'entry-next = LIST_POISON1;' [interrupt happens] raise_softirq checks state which is zero. Adds pirq_dpci to the dpci_list. [interrupt is done, back to dpci_softirq] finishes the entry-next = LIST_POISON1; .. test STATE_SCHED returns true, so executes the hvm_dirq_assist. ends the loop, exits. \- dpci_softirq while (!list_emtpry) list_del, but -next already has LIST_POISON1 and we blow up. An alternative solution was proposed (adding STATE_ZOMBIE and making pt_irq_time_out use the cmpxchg protocol on 'state'), which fixed the above issue but had an fatal bug. It would miss interrupts that are to be scheduled! This patch brings back the 'masked' boolean which is used as an communication channel between 'hvm_do_IRQ_dpci', 'hvm_dirq_assist' and 'pt_irq_guest_eoi'. When we have an interrupt we set 'masked'. Anytime 'hvm_dirq_assist' or 'pt_irq_guest_eoi' executes - it clears it. The 'state' is left as a seperate mechanism to provide an mechanism between 'raise_sofitrq' and 'softirq_dpci' to communicate the state of the 'struct hvm_dirq_pirq'. Reported-by: Sander Eikelenboom li...@eikelenboom.it Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- xen/drivers/passthrough/io.c | 5 +++-- xen/include/xen/hvm/irq.h| 1 + 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c index efc66dc..4d8c02f 100644 --- a/xen/drivers/passthrough/io.c +++ b/xen/drivers/passthrough/io.c @@ -142,7 +142,7 @@ static int pt_irq_guest_eoi(struct domain *d, struct hvm_pirq_dpci *pirq_dpci, if ( __test_and_clear_bit(_HVM_IRQ_DPCI_EOI_LATCH_SHIFT, pirq_dpci-flags) ) { -pirq_dpci-state = 0; +pirq_dpci-masked = 0; pirq_dpci-pending = 0; pirq_guest_eoi(dpci_pirq(pirq_dpci)); } @@ -610,6 +610,7 @@ int hvm_do_IRQ_dpci(struct domain *d, struct pirq *pirq) !(pirq_dpci-flags HVM_IRQ_DPCI_MAPPED) ) return 0; +pirq_dpci-masked = 1;
Re: [Xen-devel] Regression, host crash with 4.5rc1
Hi Jan, Thanks for all your help so far! Here's my latest update. On 11/17/2014 23:54, Jan Beulich wrote: Plus, without said adjustment, first just disable the MWAIT CPU idle driver (mwait-idle=0) and then, if that didn't make a difference, use of C states altogether (cpuidle=0). If any of this does make a difference, limiting use of C states without fully excluding their use may need to be the next step. Running with mwait-idle=0 solves (hides?) the problem. Next step is to fiddle with the C states? On 11/19/2014 23:59, Jan Beulich wrote: Also, for double checking purposes, could you make the xen-syms of a build you observed the problem with available somewhere? The xen-syms (from my build of 9a727a81) can be found here: http://steve.freitas.org/xen-syms-4.5-unstable.gz Interesting - all CPUs are executing stuff from Dom1, which be itself is not indicating a problem, but may (considering the host hang) hint at guest vCPU-s not getting de-scheduled anymore on one or more of the CPUs. The 'a' debug key output would hopefully give us enough information to know whether that's the case. Ideally do 'd' and 'a' a couple of times each (alternating, and with a few seconds in between). Here ya go (as before, from 9a727a81). I had to pick and choose what parts to pull from the log because it was getting spammed with kernel complaints about the disk subsystem being locked up, so the hypervisor debugging info was hard to read amidst the kernel errors. As ever, let me know if you need more. Thanks again! Steve (XEN) CPU00: (XEN) ex=1445us timer=8300bf526060 cb=vcpu_singleshot_timer_fn(8300bf526000) (XEN) ex=9918us timer=830c3dc4b1e0 cb=csched_acct(830c3dc4b1c0) (XEN) ex=8390us timer=830c3dcc2d08 cb=csched_tick() (XEN) ex=70409499us timer=82d08031d1e0 cb=plt_overflow() (XEN) ex=12265483us timer=82d08031f4e0 cb=mce_work_fn() (XEN) ex= 94364us timer=82d08031d280 cb=time_calibration() (XEN) ex= 18390us timer=82d080321560 cb=do_dbs_timer(82d0803215a0) (XEN) CPU01: (XEN) ex= 390us timer=830c17ceb460 cb=pt_timer_fn(830c17ceb420) (XEN) ex=14101194us timer=830c17ceb4e0 cb=pt_timer_fn(830c17ceb4a0) (XEN) ex= 153445us timer=8300bf524060 cb=vcpu_singleshot_timer_fn(8300bf524000) (XEN) ex= 44171681527us timer=830c17ceb290 cb=rtc_alarm_cb(830c17ceb1f0) (XEN) CPU02: (XEN) ex=1445us timer=8300bf798060 cb=vcpu_singleshot_timer_fn(8300bf798000) (XEN) ex=8390us timer=830c3dc797c8 cb=csched_tick(0002) (XEN) ex= 18390us timer=830c3dcb8360 cb=do_dbs_timer(830c3dcb83a0) (XEN) ex= 29570us timer=830c3dcb80a0 cb=s_timer_fn() (XEN) CPU03: (XEN) ex= 25445us timer=8300bf2fb060 cb=vcpu_singleshot_timer_fn(8300bf2fb000) (XEN) CPU04: (XEN) ex= 634us timer=8300bf525060 cb=vcpu_singleshot_timer_fn(8300bf525000) (XEN) CPU05: (XEN) ex=1445us timer=8300bf527060 cb=vcpu_singleshot_timer_fn(8300bf527000) (XEN) ex= 388096702us timer=830c17ceb5d0 cb=pmt_timer_callback(830c17ceb5a8) (XEN) 'd' pressed - dumping registers (XEN) (XEN) *** Dumping CPU0 guest state (d1v4): *** (XEN) [ Xen-4.5-unstable x86_64 debug=y Not tainted ] (XEN) CPU:0 (XEN) RIP:0010:[f8000298a20e] (XEN) RFLAGS: 0046 CONTEXT: hvm guest (XEN) rax: 0002 rbx: f88002fa2180 rcx: f88002fa21c0 (XEN) rdx: 0400 rsi: rdi: f88002faceb0 (XEN) rbp: 000f rsp: f88002facc20 r8: f88002fa22a0 (XEN) r9: 03295cdc3c57 r10: f88002fa22a0 r11: f88002facd70 (XEN) r12: f88002facd70 r13: 03940027203c r14: 000f (XEN) r15: 0001 cr0: 80050031 cr4: 06f8 (XEN) cr3: 00187000 cr2: 01cb8300 (XEN) ds: 002b es: 002b fs: 0053 gs: 002b ss: 0018 cs: 0010 (XEN) (XEN) *** Dumping CPU1 guest state (d1v1): *** (XEN) [ Xen-4.5-unstable x86_64 debug=y Not tainted ] (XEN) CPU:1 (XEN) RIP:0010:[f8000298a20e] (XEN) RFLAGS: 0046 CONTEXT: hvm guest (XEN) rax: 0002 rbx: f88002e40180 rcx: f88002e401c0 (XEN) rdx: 0400 rsi: rdi: f88002e4aeb0 (XEN) rbp: 000f rsp: f88002e4ac20 r8: f88002e402a0 (XEN) r9: 0256dc1c8f33 r10: f88002e402a0 r11: f88002e4ad70 (XEN) r12: f88002e4ad70 r13: 0394002722bb r14: 000f (XEN) r15: 0001 cr0: 80050031 cr4: 06f8 (XEN) cr3: 00187000 cr2: 002f7d38 (XEN) ds: 002b es: 002b fs: 0053 gs: 002b ss: 0018 cs: 0010 (XEN) (XEN) *** Dumping CPU2 guest state (d1v0):
Re: [Xen-devel] [PATCH v2 for-4.5] xen/arm: clear UIE on hypervisor entry
On Thu, Nov 20, 2014 at 04:47:42PM +, Ian Campbell wrote: On Thu, 2014-11-20 at 10:53 +, Stefano Stabellini wrote: UIE being set can cause maintenance interrupts to occur when Xen writes to one or more LR registers. The effect is a busy loop around the interrupt handler in Xen (http://marc.info/?l=xen-develm=141597517132682): everything gets stuck. I think it would be useful to explain somewhere why/how a spurious interrupt can occur, if not here then in the comment you've already added or in another one in gic_clear_lrs where you clear UIE. The bit about clearing the LRs on entry causing UIE to become deasserted before we get as far as reading the IAR is a bit subtle. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Reported-and-Tested-by: Andrii Tseglytskyi andrii.tseglyts...@globallogic.com CC: konrad.w...@oracle.com With the expanded commentary: Acked-by: Ian Campbell ian.campb...@citrix.com --- Konrad, this fixes an actual bug, at least on OMAP5. It should have no bad side effects on any other platforms as far as I can tell. It should go in 4.5. As long as the testing that Julian has asked for does not demonstrate any issues with this patch I am OK with it going in Xen 4.5. Changes in v2: - add an in-code comment about maintenance_interrupt not being called. diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c index 70d10d6..c6c11d3 100644 --- a/xen/arch/arm/gic.c +++ b/xen/arch/arm/gic.c @@ -403,6 +403,8 @@ void gic_clear_lrs(struct vcpu *v) if ( is_idle_vcpu(v) ) return; +gic_hw_ops-update_hcr_status(GICH_HCR_UIE, 0); + spin_lock_irqsave(v-arch.vgic.lock, flags); while ((i = find_next_bit((const unsigned long *) this_cpu(lr_mask), @@ -527,8 +529,6 @@ void gic_inject(void) if ( !list_empty(current-arch.vgic.lr_pending) lr_all_full() ) gic_hw_ops-update_hcr_status(GICH_HCR_UIE, 1); -else -gic_hw_ops-update_hcr_status(GICH_HCR_UIE, 0); } static void do_sgi(struct cpu_user_regs *regs, enum gic_sgi sgi) @@ -598,6 +598,10 @@ static void maintenance_interrupt(int irq, void *dev_id, struct cpu_user_regs *r * Receiving the interrupt is going to cause gic_inject to be called * on return to guest that is going to clear the old LRs and inject * new interrupts. + * + * Do not add code here: maintenance interrupts caused by setting + * GICH_HCR_UIE, might read as spurious interrupts (1023). As a + * consequence sometimes this handler might not be called. */ } ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [for-xen-4.5 PATCH v2 2/2] dpci: Add ZOMBIE state to allow the softirq to finish with the dpci_pirq.
Thursday, November 20, 2014, 8:51:33 PM, you wrote: Ah crud. So a simple fix could be to seperate the 'state' to only deal with the raise_softirq and softirq_dpci. And then add a new (old) 'masked' to deal between hvm_dirq_assist, pt_irq_guest_eoi and hvm_do_IRQ_dpci. From 94a98e20a8ab721a58788919f92e3524a6c6e25c Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk konrad.w...@oracle.com Date: Thu, 20 Nov 2014 14:28:11 -0500 Subject: [PATCH] dpci: Add 'masked' as an gate for hvm_dirq_assist to process. commit f6dd295381f4b6a66acddacf46bca8940586c8d8 dpci: replace tasklet with softirq used the 'masked' as an two-bit state mechanism (STATE_SCHED, STATE_RUN) to communicate between 'raise_softirq_for' and 'dpci_softirq' to determine whether the 'struct hvm_pirq_dpci' can be re-scheduled. SNIP Hi Konrad, Is this patch supposed to replace both the previous ones ? -- Sander ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH for-4.5 v2] libxc: don't leak buffer containing the uncompressed PV kernel
On Thu, Nov 20, 2014 at 03:48:47PM +, Ian Campbell wrote: The libxc xc_dom_* infrastructure uses a very simple malloc memory pool which is freed by xc_dom_release. However the various xc_try_*_decode routines (other than the gzip one) just use plain malloc/realloc and therefore the buffer ends up leaked. The memory pool currently supports mmap'd buffers as well as a directly allocated buffers, however the try decode routines make use of realloc and do not fit well into this model. Introduce a concept of an external memory block to the memory pool and provide an interface to register such memory. The mmap_ptr and mmap_len fields of the memblock tracking struct lose their mmap_ prefix since they are now also used for external memory blocks. We are only seeing this now because the gzip decoder doesn't leak and it's only relatively recently that kernels in the wild have switched to better compression. This is https://bugs.debian.org/767295 Reported by: Gedalya geda...@gedalya.net Gedelya, Could you also test this patch to make sure it does fix the reported issue please? Signed-off-by: Ian Campbell ian.campb...@citrix.com --- v2: Correct handling in xc_try_lzo1x_decode. This is a bug fix and should go into 4.5. I have a sneaking suspicion this is going to need to backport a very long way... --- tools/libxc/include/xc_dom.h| 10 -- tools/libxc/xc_dom_bzimageloader.c | 20 tools/libxc/xc_dom_core.c | 61 +++ tools/libxc/xc_dom_decompress_lz4.c |5 +++ 4 files changed, 80 insertions(+), 16 deletions(-) diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h index 6ae6a9f..07d7224 100644 --- a/tools/libxc/include/xc_dom.h +++ b/tools/libxc/include/xc_dom.h @@ -33,8 +33,13 @@ struct xc_dom_seg { struct xc_dom_mem { struct xc_dom_mem *next; -void *mmap_ptr; -size_t mmap_len; +void *ptr; +enum { +XC_DOM_MEM_TYPE_MALLOC_INTERNAL, +XC_DOM_MEM_TYPE_MALLOC_EXTERNAL, +XC_DOM_MEM_TYPE_MMAP, +} type; +size_t len; unsigned char memory[0]; }; @@ -298,6 +303,7 @@ void xc_dom_log_memory_footprint(struct xc_dom_image *dom); /* --- simple memory pool -- */ void *xc_dom_malloc(struct xc_dom_image *dom, size_t size); +int xc_dom_register_external(struct xc_dom_image *dom, void *ptr, size_t size); void *xc_dom_malloc_page_aligned(struct xc_dom_image *dom, size_t size); void *xc_dom_malloc_filemap(struct xc_dom_image *dom, const char *filename, size_t * size, diff --git a/tools/libxc/xc_dom_bzimageloader.c b/tools/libxc/xc_dom_bzimageloader.c index 2225699..964ebdc 100644 --- a/tools/libxc/xc_dom_bzimageloader.c +++ b/tools/libxc/xc_dom_bzimageloader.c @@ -161,6 +161,13 @@ static int xc_try_bzip2_decode( total = (((uint64_t)stream.total_out_hi32) 32) | stream.total_out_lo32; +if ( xc_dom_register_external(dom, out_buf, total) ) +{ +DOMPRINTF(BZIP2: Error registering stream output); +free(out_buf); +goto bzip2_cleanup; +} + DOMPRINTF(%s: BZIP2 decompress OK, 0x%zx - 0x%lx, __FUNCTION__, *size, (long unsigned int) total); @@ -305,6 +312,13 @@ static int _xc_try_lzma_decode( } } +if ( xc_dom_register_external(dom, out_buf, stream-total_out) ) +{ +DOMPRINTF(%s: Error registering stream output, what); +free(out_buf); +goto lzma_cleanup; +} + DOMPRINTF(%s: %s decompress OK, 0x%zx - 0x%zx, __FUNCTION__, what, *size, (size_t)stream-total_out); @@ -464,7 +478,13 @@ static int xc_try_lzo1x_decode( dst_len = lzo_read_32(cur); if ( !dst_len ) +{ +msg = Error registering stream output; +if ( xc_dom_register_external(dom, out_buf, out_len) ) +break; + return 0; +} if ( dst_len LZOP_MAX_BLOCK_SIZE ) { diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c index baa62a1..ecbf981 100644 --- a/tools/libxc/xc_dom_core.c +++ b/tools/libxc/xc_dom_core.c @@ -132,6 +132,7 @@ void *xc_dom_malloc(struct xc_dom_image *dom, size_t size) return NULL; } memset(block, 0, sizeof(*block) + size); +block-type = XC_DOM_MEM_TYPE_MALLOC_INTERNAL; block-next = dom-memblocks; dom-memblocks = block; dom-alloc_malloc += sizeof(*block) + size; @@ -151,23 +152,45 @@ void *xc_dom_malloc_page_aligned(struct xc_dom_image *dom, size_t size) return NULL; } memset(block, 0, sizeof(*block)); -block-mmap_len = size; -block-mmap_ptr = mmap(NULL, block-mmap_len, - PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, - -1, 0);
Re: [Xen-devel] [PATCH v2 for 4.5] scripts/get_maintainer.pl: Correctly CC the maintainers
On Thu, Nov 20, 2014 at 05:36:03PM +, Julien Grall wrote: The current script is setting $email_remove_duplicates to 1 by default, on complex patch (see [1]), this will result to ommitting randomly some maintainers. One could see that as feature - the emails about bugs or patches to review don't land in your inbox! This is because, the script will: 1) Get the list of maintainers of the file (incidentally all the maintainers in THE REST role are added). If the email address already exists in the global list, skip it. = The role will be lost 2) Filter the list to remove the entry with THE REST role So if a maintainers is marked with THE REST role on the first file and actually be an x86 maintainers on the script, the script will only retain the THE REST role. During the filtering step, this maintainers will therefore be dropped. This patch fixes this by setting $email_remove_duplicates to 0 by default. The new behavior of the script will be: 1) Append the list of maintainers for every file 2) Filter the list to remove the entry with THE REST role 3) Remove duplicated email address Example: Patch: https://patches.linaro.org/41083/ Before the patch: Daniel De Graaf dgde...@tycho.nsa.gov Ian Jackson ian.jack...@eu.citrix.com Stefano Stabellini stefano.stabell...@eu.citrix.com Ian Campbell ian.campb...@citrix.com Wei Liu wei.l...@citrix.com George Dunlap george.dun...@eu.citrix.com xen-devel@lists.xen.org After the patch: Daniel De Graaf dgde...@tycho.nsa.gov Ian Jackson ian.jack...@eu.citrix.com Stefano Stabellini stefano.stabell...@eu.citrix.com Ian Campbell ian.campb...@citrix.com Wei Liu wei.l...@citrix.com Stefano Stabellini stefano.stabell...@citrix.com Tim Deegan t...@xen.org Keir Fraser k...@xen.org Jan Beulich jbeul...@suse.com George Dunlap george.dun...@eu.citrix.com xen-devel@lists.xen.org [1] http://lists.xenproject.org/archives/html/xen-devel/2014-11/msg00060.html Signed-off-by: Julien Grall julien.gr...@linaro.org CC: Don Slutz dsl...@verizon.com --- Changes in v2: - Rework the commit message to explain the problem and the solution more clearly I would like to see this patch in Xen 4.5 and backported to Xen 4.4 (first time the script has been introduced). Developpers using this script won't ommitted to cc some maintainers, and it will avoid maintainers complaining about miss CC. The only drawbacks I can see is if the maintainers is referenced twice in the file MAINTAINERS with different email, the script won't notice it's duplicated and list 2 times. Though, for this one it could be fixed by modifying the MAINTAINERS file. Is it worth for Xen 4.5? For know, it seems to only happen with Stefano. I am OK with this going in Xen 4.5. Thank you. --- scripts/get_maintainer.pl |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl index df920e2..cc445cd 100755 --- a/scripts/get_maintainer.pl +++ b/scripts/get_maintainer.pl @@ -35,7 +35,7 @@ my $email_git_min_percent = 5; my $email_git_since = 1-year-ago; my $email_hg_since = -365; my $interactive = 0; -my $email_remove_duplicates = 1; +my $email_remove_duplicates = 0; my $email_use_mailmap = 1; my $email_drop_the_rest_supporter_if_supporter_found = 1; my $output_multiline = 1; -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v9 12/13] swiotlb-xen: pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpu
On Thu, Nov 20, 2014 at 10:47:51AM +, Stefano Stabellini wrote: On Thu, 20 Nov 2014, Stefano Stabellini wrote: On Wed, 19 Nov 2014, Konrad Rzeszutek Wilk wrote: On Wed, Nov 12, 2014 at 11:40:53AM +, Stefano Stabellini wrote: xen_dma_unmap_page and xen_dma_sync_single_for_cpu take a dma_addr_t handle as argument, not a physical address. Ouch. Should this also go on stable tree? Good idea Also can I take that as an Acked-by for this patch? Yes. There is one last bit of common and x86 changes in this series: patch #7 adds a paramter to xen_dma_map_page, even though the x86 implementation is empty: http://marc.info/?l=xen-develm=141579253829808w=2 is that OK for you? Yes. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Reviewed-by: Catalin Marinas catalin.mari...@arm.com --- drivers/xen/swiotlb-xen.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index 3725ee4..498b654 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -449,7 +449,7 @@ static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr, BUG_ON(dir == DMA_NONE); - xen_dma_unmap_page(hwdev, paddr, size, dir, attrs); + xen_dma_unmap_page(hwdev, dev_addr, size, dir, attrs); /* NOTE: We use dev_addr here, not paddr! */ if (is_xen_swiotlb_buffer(dev_addr)) { @@ -497,14 +497,14 @@ xen_swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, BUG_ON(dir == DMA_NONE); if (target == SYNC_FOR_CPU) - xen_dma_sync_single_for_cpu(hwdev, paddr, size, dir); + xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir); /* NOTE: We use dev_addr here, not paddr! */ if (is_xen_swiotlb_buffer(dev_addr)) swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target); if (target == SYNC_FOR_DEVICE) - xen_dma_sync_single_for_cpu(hwdev, paddr, size, dir); + xen_dma_sync_single_for_cpu(hwdev, dev_addr, size, dir); if (dir != DMA_FROM_DEVICE) return; -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH V4] Decouple SandyBridge quirk from VTd timeout
Currently the quirk code for SandyBridge uses the VTd timeout value when writing to an IGD register. This is the wrong timeout to use and, at 1000 msec., is also much too large. This patch changes the quirk code to use a timeout that is specific to the IGD device and allows the user control of the timeout. Boolean settings for the boot parameter `snb_igd_quirk' keep their current meaning, enabling or disabling the quirk code with a timeout of 1000 msec. In addition specifying `snb_igd_quirk=default' will enable the code and set the timeout to the theoretical maximum of 670 msec. For finer control, specifying `snb_igd_quirk=n', where `n' is a decimal number, will enable the code and set the timeout to `n' msec. Signed-off-by: Don Dugger donald.d.dug...@intel.com -- diff -r 9d485e2c8339 xen/drivers/passthrough/vtd/quirks.c --- a/xen/drivers/passthrough/vtd/quirks.c Mon Nov 10 12:03:36 2014 + +++ b/xen/drivers/passthrough/vtd/quirks.c Thu Nov 20 22:00:51 2014 -0700 @@ -50,6 +50,11 @@ #define IS_ILK(id)(id == 0x00408086 || id == 0x00448086 || id== 0x00628086 || id == 0x006A8086) #define IS_CPT(id)(id == 0x01008086 || id == 0x01048086) +/* SandyBridge IGD timeouts in milliseconds */ +#define SNB_IGD_TIMEOUT_LEGACY1000 +#define SNB_IGD_TIMEOUT670 +static u32 snb_igd_timeout = 0; + static u32 __read_mostly ioh_id; static u32 __initdata igd_id; bool_t __read_mostly rwbf_quirk; @@ -158,6 +163,16 @@ * Workaround is to prevent graphics get into RC6 * state when doing VT-d IOTLB operations, do the VT-d * IOTLB operation, and then re-enable RC6 state. + * + * This quirk is enabled with the snb_igd_quirk command + * line parameter. Specifying snb_igd_quirk with no value + * (or any of the standard boolean values) enables this + * quirk and sets the timeout to the legacy timeout of + * 1000 msec. Setting this parameter to the string + * default enables this quirk and sets the timeout to + * the theoretical maximum of 670 msec. Setting this + * parameter to a numerical value enables the quirk and + * sets the timeout to that numerical number of msecs. */ static void snb_vtd_ops_preamble(struct iommu* iommu) { @@ -177,7 +192,7 @@ start_time = NOW(); while ( (*(volatile u32 *)(igd_reg_va + 0x22AC) 0xF) != 0 ) { -if ( NOW() start_time + DMAR_OPERATION_TIMEOUT ) +if ( NOW() start_time + snb_igd_timeout ) { dprintk(XENLOG_INFO VTDPREFIX, snb_vtd_ops_preamble: failed to disable idle handshake\n); @@ -208,13 +223,10 @@ * call before VT-d translation enable and IOTLB flush operations. */ -static int snb_igd_quirk; -boolean_param(snb_igd_quirk, snb_igd_quirk); - void vtd_ops_preamble_quirk(struct iommu* iommu) { cantiga_vtd_ops_preamble(iommu); -if ( snb_igd_quirk ) +if ( snb_igd_timeout != 0 ) { spin_lock(igd_lock); @@ -228,7 +240,7 @@ */ void vtd_ops_postamble_quirk(struct iommu* iommu) { -if ( snb_igd_quirk ) +if ( snb_igd_timeout != 0 ) { snb_vtd_ops_postamble(iommu); @@ -237,6 +249,36 @@ } } +static void __init parse_snb_timeout(const char *s) +{ +int t; + +t = parse_bool(s); +if ( t 0 ) +{ +if ( *s == '\0' ) +{ +t = SNB_IGD_TIMEOUT_LEGACY; +} +else if ( strcmp(s, default) == 0 ) +{ +t = SNB_IGD_TIMEOUT; +} +else +{ +t = strtoul(s, NULL, 0); +} +} +else +{ +t = t ? SNB_IGD_TIMEOUT_LEGACY : 0; +} +snb_igd_timeout = MILLISECS(t); + +return; +} +custom_param(snb_igd_quirk, parse_snb_timeout); + /* 5500/5520/X58 Chipset Interrupt remapping errata, for stepping B-3. * Fixed in stepping C-2. */ static void __init tylersburg_intremap_quirk(void) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][RFC][PATCH 04/13] hvmloader/util: get reserved device memory maps
On 2014/11/3 18:02, Jan Beulich wrote: On 03.11.14 at 10:55, tiejun.c...@intel.com wrote: On 2014/11/3 17:45, Jan Beulich wrote: On 03.11.14 at 10:32, tiejun.c...@intel.com wrote: On 2014/11/3 16:53, Jan Beulich wrote: On 03.11.14 at 03:22, tiejun.c...@intel.com wrote: On 2014/10/31 16:14, Jan Beulich wrote: On 31.10.14 at 03:20, tiejun.c...@intel.com wrote: On 2014/10/30 17:13, Jan Beulich wrote: On 30.10.14 at 06:55, tiejun.c...@intel.com wrote: On 2014/10/29 17:05, Jan Beulich wrote: On 29.10.14 at 07:54, tiejun.c...@intel.com wrote: Looks I can remove those stuff from util.h and just add 'extern' to them when we really need them. Please stop thinking this way. Declarations for things defined in .c files are to be present in headers, and the defining .c file has to include that header (making sure declaration and definition are and remain in sync). I hate having to again repeat my remark that you shouldn't forget it's not application code that you're modifying. Robust and maintainable code are a requirement in the hypervisor (and, as said it being an extension of it, hvmloader). Which - just to avoid any misunderstanding - isn't to say that this shouldn't also apply to application code. It's just that in the hypervisor and kernel (and certain other code system components) the consequences of being lax are much more severe. Okay. But currently, the pci.c file already include 'util.h' and 'xen/memory.h, #include util.h ... #include xen/memory.h We can't redefine struct xen_reserved_device_memory in util.h. Redefine? I said forward declare. Seems we just need to declare hvm_get_reserved_device_memory_map() in the head file, tools/firmware/hvmloader/util.h, unsigned int hvm_get_reserved_device_memory_map(void); To me this looks very much like poor programming style, even if in the context of hvmloader communicating information via global variables rather than function arguments and return values is Do you mean you don't like a global variable? But it can be use to get RDM without more hypercall or function call in the context of hvmloader. This argument which you brought up before, and which we commented on before, is pretty pointless. We don't really care much about doing one or two more hypercalls from hvmloader, unless these would be long-running ones. Another benefit to use a global variable is that we wouldn't allocate xen_reserved_device_memory * N each time, and reduce some duplicated codes, unless you mean I should define that as static inside in local. Now this reason is indeed worth a consideration. How many times is the data being needed/retrieved? Currently there are two cases in tools/hvmloader, setup pci and build e820 table. Each time, as you know we don't know how may entries we should require, so we always allocate one instance then according to the return value to allocate the proper instances to get that. Hmm, two uses isn't really that bad, i.e. I'd then still be in favor of a more normal interface. Just go back here since I realize we always use mem_alloc(), which is pick from RESERVED_MEMORY, to allocate all buffer inside this hypercall caller in hvmloader, but unfortunately we have no any associated free function implementation in hvmloader, so if we call this multiple times this means it really waster more memory in RESERVED_MEMORY. So I still think one global variable should be fine. Thanks Tiejun ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [for-xen-4.5 PATCH v2 2/2] dpci: Add ZOMBIE state to allow the softirq to finish with the dpci_pirq.
On 20.11.14 at 20:51, konrad.w...@oracle.com wrote: @@ -669,7 +670,7 @@ static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci) ASSERT(d-arch.hvm_domain.irq.dpci); spin_lock(d-event_lock); -if ( pirq_dpci-state ) +if ( test_and_clear_bool(pirq_dpci-masked) ) { struct pirq *pirq = dpci_pirq(pirq_dpci); const struct dev_intx_gsi_link *digl; So this now guards solely against the timeout enforced EOI? Why do you no longer need to guard against cancellation (i.e. why isn't this looking at both -state and -masked)? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][RFC][PATCH 04/13] hvmloader/util: get reserved device memory maps
On 21.11.14 at 08:43, kevin.t...@intel.com wrote: From: Chen, Tiejun Sent: Friday, November 21, 2014 2:26 PM On 2014/11/3 18:02, Jan Beulich wrote: On 03.11.14 at 10:55, tiejun.c...@intel.com wrote: On 2014/11/3 17:45, Jan Beulich wrote: On 03.11.14 at 10:32, tiejun.c...@intel.com wrote: On 2014/11/3 16:53, Jan Beulich wrote: On 03.11.14 at 03:22, tiejun.c...@intel.com wrote: On 2014/10/31 16:14, Jan Beulich wrote: On 31.10.14 at 03:20, tiejun.c...@intel.com wrote: On 2014/10/30 17:13, Jan Beulich wrote: On 30.10.14 at 06:55, tiejun.c...@intel.com wrote: On 2014/10/29 17:05, Jan Beulich wrote: On 29.10.14 at 07:54, tiejun.c...@intel.com wrote: Looks I can remove those stuff from util.h and just add 'extern' to them when we really need them. Please stop thinking this way. Declarations for things defined in .c files are to be present in headers, and the defining .c file has to include that header (making sure declaration and definition are and remain in sync). I hate having to again repeat my remark that you shouldn't forget it's not application code that you're modifying. Robust and maintainable code are a requirement in the hypervisor (and, as said it being an extension of it, hvmloader). Which - just to avoid any misunderstanding - isn't to say that this shouldn't also apply to application code. It's just that in the hypervisor and kernel (and certain other code system components) the consequences of being lax are much more severe. Okay. But currently, the pci.c file already include 'util.h' and 'xen/memory.h, #include util.h ... #include xen/memory.h We can't redefine struct xen_reserved_device_memory in util.h. Redefine? I said forward declare. Seems we just need to declare hvm_get_reserved_device_memory_map() in the head file, tools/firmware/hvmloader/util.h, unsigned int hvm_get_reserved_device_memory_map(void); To me this looks very much like poor programming style, even if in the context of hvmloader communicating information via global variables rather than function arguments and return values is Do you mean you don't like a global variable? But it can be use to get RDM without more hypercall or function call in the context of hvmloader. This argument which you brought up before, and which we commented on before, is pretty pointless. We don't really care much about doing one or two more hypercalls from hvmloader, unless these would be long-running ones. Another benefit to use a global variable is that we wouldn't allocate xen_reserved_device_memory * N each time, and reduce some duplicated codes, unless you mean I should define that as static inside in local. Now this reason is indeed worth a consideration. How many times is the data being needed/retrieved? Currently there are two cases in tools/hvmloader, setup pci and build e820 table. Each time, as you know we don't know how may entries we should require, so we always allocate one instance then according to the return value to allocate the proper instances to get that. Hmm, two uses isn't really that bad, i.e. I'd then still be in favor of a more normal interface. Just go back here since I realize we always use mem_alloc(), which is pick from RESERVED_MEMORY, to allocate all buffer inside this hypercall caller in hvmloader, but unfortunately we have no any associated free function implementation in hvmloader, so if we call this multiple times this means it really waster more memory in RESERVED_MEMORY. So I still think one global variable should be fine. it's natural to get reserved information once, and then saved for either one-time or multiple-time references. Not really natural, but possible. Yet if done this way, then the interface shouldn't give the appearance of retrieving it every time, i.e. the global should be set up separately and the users of the data should access the data rather than calling a (fake) function. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel