Re: [Xen-devel] [v3 01/15] Vt-d Posted-intterrupt (PI) design
> -----Original Message-----
> From: Meng Xu [mailto:xumengpa...@gmail.com]
> Sent: Wednesday, June 24, 2015 2:16 PM
> To: Wu, Feng
> Cc: xen-devel@lists.xen.org; Tian, Kevin; Keir Fraser; George Dunlap; Andrew Cooper; Jan Beulich; Zhang, Yang Z
> Subject: Re: [Xen-devel] [v3 01/15] Vt-d Posted-intterrupt (PI) design
>
> Hi Feng,
>
> One minor thing:
>
> > +Important Definitions
> > +==
> > +There are some changes to IRTE and posted-interrupt descriptor after
> > +VT-d PI is introduced:
> > +IRTE:
>
> It seems that you forgot to define IRTE. :-)
> I guess it stands for Interrupt Remapping Table Entry? (Probably I'm wrong. :-))

Yes, you're right. I will add this in the next version. Thanks for pointing it out!

Thanks,
Feng

> Thanks,
>
> Meng
>
> ---
> Meng Xu
> PhD Student in Computer and Information Science
> University of Pennsylvania
> http://www.cis.upenn.edu/~mengxu/

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Re: [Xen-devel] [v4][PATCH 03/19] xen/vtd: create RMRR mapping
> > Note actually we just need p2m_remove_page() to unmap these mapping on both ept and vt-d sides, and guest_physmap_remove_page is really a wrapper of p2m_remove_page().
>
> And I agree with Tim regarding the desire to avoid code duplication. Yet that's no reason to do it asymmetrically: clear_identity_p2m_entry() could still be an inline (or macro) wrapper around guest_physmap_remove_page(). That way, apart from making

I can define that as a macro close to set_identity_p2m_entry() in p2m.h.

> the code above look nicer, if the latter needs extending in the future for some reason, simply converting the wrapper to a real function is possible without needing to touch the call site(s). This would need to go into patch 2; I wonder whether folding that

Yes.

> and this one wouldn't be warranted, avoiding the former adding

Are you saying to fold patch #2 and patch #3? But shouldn't we always define a new and then use that in practice subsequently? Even with two patches, respectively.

Thanks
Tiejun

> (at that point) dead code.
>
> Jan
Re: [Xen-devel] [PATCH v6 COLO 01/15] docs: add colo readme
On 06/16/2015 06:56 PM, Ian Campbell wrote:
> On Mon, 2015-06-08 at 11:45 +0800, Yang Hongyang wrote:
> > add colo readme, refer to
> > http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
> >
> > Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
>
> This is fine as far as it goes but I wonder if perhaps docs/README.{remus,colo} ought to be moved into docs/misc, perhaps converted to markdown (which should be trivial) and perhaps merged into a single document about checkpointing?

Agreed that we can add a checkpointing.txt to docs/misc, and describe remus/COLO in that file. But can we do this later, when the COLO feature is merged? At that time we can do this within one patch.

> The reason for the move is twofold, first it is a bit atypical for docs to live in the top-level docs dir and secondly moving it into misc will cause it to appear automatically at http://xenbits.xen.org/docs/unstable/ etc.
>
> Ian.
>
> > ---
> >  docs/README.colo | 9 +
> >  1 file changed, 9 insertions(+)
> >  create mode 100644 docs/README.colo
> >
> > diff --git a/docs/README.colo b/docs/README.colo
> > new file mode 100644
> > index 000..466eb72
> > --- /dev/null
> > +++ b/docs/README.colo
> > @@ -0,0 +1,9 @@
> > +COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop Service)
> > +project is a high availability solution. Both primary VM (PVM) and secondary VM
> > +(SVM) run in parallel. They receive the same request from client, and generate
> > +response in parallel too. If the response packets from PVM and SVM are
> > +identical, they are released immediately. Otherwise, a VM checkpoint (on demand)
> > +is conducted.
> > +
> > +See the website at http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
> > +for details.
> .

--
Thanks,
Yang.
Re: [Xen-devel] [v3 01/15] Vt-d Posted-intterrupt (PI) design
Hi Feng,

One minor thing:

> +Important Definitions
> +==
> +There are some changes to IRTE and posted-interrupt descriptor after
> +VT-d PI is introduced:
> +IRTE:

It seems that you forgot to define IRTE. :-)
I guess it stands for Interrupt Remapping Table Entry? (Probably I'm wrong. :-))

Thanks,

Meng

---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/
Re: [Xen-devel] [PATCH v2 01/12] VMX: VMFUNC and #VE definitions and detection.
On 22/06/15 19:56, Ed White wrote:
> Currently, neither is enabled globally but may be enabled on a per-VCPU basis by the altp2m code.
>
> Remove the check for EPTE bit 63 == zero in ept_split_super_page(), as that bit is now hardware-defined.
>
> Signed-off-by: Ed White edmund.h.wh...@intel.com

Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
Re: [Xen-devel] (xen 4.6 unstable) triple fault when execute fxsave during the procedure of guest iso install
On 06/24/2015 12:14 PM, Fanhenglong wrote:
> I want to debug the procedure of windows os install with windbg, windbg executes instruction(fxsave) after the blank vm is started and before guest iso start to install, fxsave trigger the following code path:
>
> vmx_vmexit_handler(EXIT_REASON_EPT_VIOLATION)
> ->ept_handle_violation
> ->hvm_hap_nested_page_fault
> ->handle_mmio_with_translation
> ->handle_mmio
> ->hvm_emulate_one
> ->x86_emulate
>
> *X86_emulate return X86EMUL_UNHANDLEABLE*

How are you using Xen in this case? Are you by any chance using the vm_event system in a way that sends back an emulate vm_event response from userspace?

You might want to look at x86_emulate() in xen/arch/x86/x86_emulate/x86_emulate.c and see if (and how) fxsave is being handled.

HTH,
Razvan
Re: [Xen-devel] [v4][PATCH 03/19] xen/vtd: create RMRR mapping
On 24.06.15 at 03:11, tiejun.c...@intel.com wrote:
> On 2015/6/23 18:12, Jan Beulich wrote:
> > On 23.06.15 at 11:57, tiejun.c...@intel.com wrote:
> > > --- a/xen/drivers/passthrough/vtd/iommu.c
> > > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > > @@ -1839,7 +1839,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
> > >      while ( base_pfn < end_pfn )
> > >      {
> > > -        if ( intel_iommu_unmap_page(d, base_pfn) )
> > > +        if ( guest_physmap_remove_page(d, base_pfn, base_pfn, 0) )
>
> Yeah, I also thought this may bring some confusions in this context.
>
> > >              ret = -ENXIO;
> > >          base_pfn++;
> > >      }
> > > @@ -1855,8 +1855,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
> > >      while ( base_pfn < end_pfn )
> > >      {
> > > -        int err = intel_iommu_map_page(d, base_pfn, base_pfn,
> > > -                                       IOMMUF_readable|IOMMUF_writable);
> > > +        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw);
> >
> > Shouldn't the two continue to be the inverse of one another?
>
> Initially, instead of using guest_physmap_remove_page, I was trying to introduce a new, clear_identity_p2m_entry, which can wrapper p2m_remove_page(). But Tim just thought we'd better avoid duplicating code,
>
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg02970.html
>
> Maybe guest_physmap_remove_page() does what you want,
>
> Note actually we just need p2m_remove_page() to unmap these mapping on both ept and vt-d sides, and guest_physmap_remove_page is really a wrapper of p2m_remove_page().

And I agree with Tim regarding the desire to avoid code duplication. Yet that's no reason to do it asymmetrically: clear_identity_p2m_entry() could still be an inline (or macro) wrapper around guest_physmap_remove_page(). That way, apart from making the code above look nicer, if the latter needs extending in the future for some reason, simply converting the wrapper to a real function is possible without needing to touch the call site(s).

This would need to go into patch 2; I wonder whether folding that and this one wouldn't be warranted, avoiding the former adding (at that point) dead code.
Jan
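
The wrapper idea discussed in this message can be pictured with a small stand-alone sketch. Everything below is a stub for illustration only: `struct domain`, the body and `int` return type of `guest_physmap_remove_page()`, and the helper `unmap_identity_range()` are assumptions made so the example compiles on its own, not Xen's real definitions. Only the notion of a thin inline `clear_identity_p2m_entry()` wrapper and the four-argument call shape come from the thread.

```c
#include <errno.h>

/* Hypothetical stand-in for Xen's struct domain (the real one is much larger). */
struct domain {
    int domain_id;
    unsigned long removed_count; /* bookkeeping for this sketch only */
};

/*
 * Stub with the argument shape used in the quoted hunk:
 * guest_physmap_remove_page(d, gfn, mfn, page_order).
 * The int return value and the body are assumptions for illustration.
 */
int guest_physmap_remove_page(struct domain *d, unsigned long gfn,
                              unsigned long mfn, unsigned int page_order)
{
    (void)gfn;
    (void)mfn;
    (void)page_order;
    d->removed_count++; /* pretend one mapping was torn down */
    return 0;
}

/*
 * Thin inline wrapper: the inverse of set_identity_p2m_entry() by name.
 * For an identity mapping gfn == mfn, and a single page has order 0.
 * If it ever needs more logic, it can become a real function without
 * touching any call site.
 */
static inline int clear_identity_p2m_entry(struct domain *d, unsigned long gfn)
{
    return guest_physmap_remove_page(d, gfn, gfn, 0);
}

/* Hypothetical helper mirroring the unmap loop in the quoted hunk. */
int unmap_identity_range(struct domain *d, unsigned long base_pfn,
                         unsigned long end_pfn)
{
    int ret = 0;

    while ( base_pfn < end_pfn )
    {
        if ( clear_identity_p2m_entry(d, base_pfn) )
            ret = -ENXIO;
        base_pfn++;
    }

    return ret;
}
```

With a wrapper like this, the map and unmap loops read symmetrically (set_identity_p2m_entry() on one side, clear_identity_p2m_entry() on the other) while the removal logic itself is not duplicated.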
Re: [Xen-devel] [linux-arm-xen test] 58849: regressions - FAIL
On Wed, 2015-06-24 at 06:03 +, osstest service user wrote: flight 58849 linux-arm-xen real [real] http://logs.test-lab.xenproject.org/osstest/logs/58849/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl-cubietruck 11 guest-start fail REGR. vs. 58830 This was: http://logs.test-lab.xenproject.org/osstest/logs/58849/test-armhf-armhf-xl-cubietruck/cubietruck-braque---var-log-kern.log Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637687] [ cut here ] Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637756] kernel BUG at drivers/xen/grant-table.c:923! Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637784] Internal error: Oops - BUG: 0 [#1] SMP ARM Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637810] Modules linked in: xen_gntalloc bridge stp ipv6 llc brcmfmac brcmutil cfg80211 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637899] CPU: 0 PID: 16206 Comm: vif1.0-q0-guest Not tainted 3.16.7-ckt12+ #1 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637936] task: c12fc480 ti: d2d3c000 task.ti: d2d3c000 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637977] PC is at gnttab_batch_copy+0xd4/0xe0 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638004] LR is at gnttab_batch_copy+0x1c/0xe0 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638030] pc : [c04abf7c] lr : [c04abec4]psr: a013 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638030] sp : d2d3deb0 ip : deadbeef fp : d2d3df3c Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638091] r10: 0001 r9 : r8 : 0008 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638124] r7 : 0001 r6 : 0001 r5 : r4 : e1e38d30 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638159] r3 : 0001 r2 : deadbeef r1 : deadbeef r0 : fff2 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638193] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638227] Control: 10c5387d Table: 7b50406a DAC: 0015 Jun 24 04:09:13 
cubietruck-braque kernel: [ 807.638257] Process vif1.0-q0-guest (pid: 16206, stack limit = 0xd2d3c248) Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638287] Stack: (0xd2d3deb0 to 0xd2d3e000) Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638316] dea0: 0001 e1e3 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638353] dec0: 0001 c05d7c44 003e 0ec2 d2d3df3c 0001 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638391] dee0: dbbb7a80 0008 d2d3df20 e1e38cfc e1e38d30 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638430] df00: 0001 0001 e1e38d30 e1e63530 003e 0208 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638468] df20: d9f0f480 d9f0f480 0001 d2d3df2c d2d3df34 d2d3df34 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638505] df40: db34e380 e1e3 c05d776c Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638543] df60: c0264138 e1e3 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638581] df80: d2d3df80 d2d3df80 d2d3df90 d2d3df90 d2d3dfac db34e380 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638638] dfa0: c026406c c020f038 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638686] dfc0: Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638723] dfe0: 0013 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638785] [c04abf7c] (gnttab_batch_copy) from [c05d7c44] (xenvif_kthread_guest_rx+0x4d8/0xbc0) Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638841] [c05d7c44] (xenvif_kthread_guest_rx) from [c0264138] (kthread+0xcc/0xe8) Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638887] [c0264138] (kthread) from [c020f038] (ret_from_fork+0x14/0x3c) Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638929] Code: 0ae5 eaed e8bd80f8 e7f001f2 (e7f001f2) Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638978] ---[ end trace 98c74482d9a5771d ]--- Which looks familiar, although I can't seem to find it, does anyone remember it? Are we missing a backport perhaps? 
This is the 3.16.y based linux-arm-xen tree, which was recently updated from a baseline of v3.16.4-ckt7 to v3.16.7-ckt12 (flight 58830) in both cases plus xen_arch_need_swiotlb for swiotlb stuff. This here was the next flight which only added the xen: netback: read hotplug script once at start of day., which I don't think is related to the failure, which I suspect is intermittent. Ian. ___
Re: [Xen-devel] [PATCH] tools: libxl: Take the userdata lock around maxmem changes
On Tue, 2015-06-23 at 19:50 +0100, Wei Liu wrote:
> On Tue, Jun 23, 2015 at 05:38:17PM +0100, Ian Campbell wrote:
> > On Tue, 2015-06-23 at 17:32 +0100, Wei Liu wrote:
> > > On Tue, Jun 23, 2015 at 03:58:32PM +0100, Ian Campbell wrote:
> > > > There is an issue in libxl_set_memory_target whereby the target and the max mem can get out of sync. This is because the call to xc_domain_setmaxmem is not tied in any way to the xenstore transaction which controls updates to the xenstore side of things.
> > > >
> > > > Consider a domain with 1M of RAM (==target and maxmem for the sake of argument) and two simultaneous calls to libxl_set_memory_target, both with relative=0 and enforce=1, one with target=3 and the other with target=5.
> > > >
> > > >   target=5 call                  target=3 call
> > > >
> > > >   transaction start              transaction start
> > > >   write target=5 to xenstore     write target=3 to xenstore
> > > >   setmaxmem(5)                   setmaxmem(3)
> > > >   transaction commit = success   transaction commit = EAGAIN
> > > >
> > > > At this point maxmem=3 while target=5.
> > > >
> > > > In reality the target=3 case will then retry and eventually (hopefully) succeed with target=maxmem=3, however the bad state will persist for some window which is undesirable.
> > > >
> > > > (On failure other than EAGAIN all bets are off anyway, but in that case we will likely stick in the bad state until someone else sets the memory).
> > > >
> > > > To fix this we slightly abuse the userdata lock which is used to protect updates to the domain's json configuration. Abused because maxmem is not actually stored in there, but is kept by Xen. However the lock protects some semantically similar things and is convenient to use here too.
> > > >
> > > > libxl_domain_setmaxmem also takes the lock, since it reads memory/target from xenstore before calling xc_domain_setmaxmem there is a small (but perhaps not very interesting) race there too.
> > > >
> > > > There is one more use of xc_domain_setmaxmem in libxl__build_pre. However taking a lock around this would be tricky since the xenstore parts are not done until libxl__build_post. I think this one could be argued to be OK since the domid is not public yet, that is it has not been returned to the application yet (as the result of the create operation). Toolstacks which go round fiddling with random domid's which they find lying on the floor should be taught to do better.
> > > >
> > > > Add a doc note that taking the userdata lock requires the CTX_LOCK to be held.
> > > >
> > > > Signed-off-by: Ian Campbell ian.campb...@citrix.com
> > > > ---
> > > > This applies on top of Wei's revert of "libxl_set_memory_target: retain the same maxmem offset on top of the current target"
> > > >
> > > > I couldn't quite rule out some race (for transaction=EAGAIN, for !EAGAIN there are obvious ones) which resulted in the incorrect state being in place after both entities exit, but couldn't construct an actual case.
> > >
> > > Not sure I follow. With this patch you lock out other contenders to even start a transaction so the EAGAIN vs !EAGAIN should be moot. You can safely rollback in !EAGAIN case, right?
> >
> > Sorry, I meant a preexisting race which was fixed by this patch, rather than one that continues to exist with this fix.
>
> Is there inconsistency after this fix?

That post-changelog note was regarding the original code before the patch, because I felt the example race given in the code was a relatively minor one (since EAGAIN will cause it to be fixed up pretty quickly in the real world) and I was hoping to include an example of a much more serious race in the commit message but hadn't been able to construct one.

> I think not, because you can roll back maxmem and pod target to previous values -- but the rollback was not implemented here though.

It's not necessary, I think, because with EAGAIN failures we will always try again and (eventually) succeed. Other kinds of transaction failure are of the "xenstore is completely b0rked" kind and all bets are pretty much off in that case.

What this patch prevents is getting into some weird state which has aspects of two separate calls to set_memory_target, which could be much worse than a transient state as part of a single call, especially if we could exit with success on both cases but in some hybrid state.

There will always be a small window between the xc_domain_setmaxmem and the transaction commit.

Ian.
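
The interleaving described in the quoted commit message can be replayed deterministically with a toy model. This is only a sketch under loud assumptions: none of the names below (`struct model`, `struct txn`, `toy_setmaxmem()`, and so on) exist in libxl, libxc or xenstore; the "transaction" is modelled as a generation check that fails with EAGAIN if anyone else committed first, and `toy_setmaxmem()` stands in for a call that takes effect immediately, outside the transaction.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy state: target is transactional, maxmem is not. All names are hypothetical. */
struct model {
    unsigned int target;     /* committed memory target in the toy store */
    unsigned int maxmem;     /* toy stand-in for the hypervisor-side maxmem */
    unsigned int generation; /* bumped on every successful commit */
};

struct txn {
    unsigned int start_generation; /* store generation seen at txn start */
    unsigned int pending_target;   /* buffered write, applied on commit */
    bool has_write;
};

void txn_start(const struct model *m, struct txn *t)
{
    t->start_generation = m->generation;
    t->pending_target = 0;
    t->has_write = false;
}

void txn_write_target(struct txn *t, unsigned int target)
{
    t->pending_target = target;
    t->has_write = true;
}

/* Takes effect at once; it is not covered by any transaction. */
void toy_setmaxmem(struct model *m, unsigned int maxmem)
{
    m->maxmem = maxmem;
}

/* Commit fails with EAGAIN if another commit happened since txn_start(). */
int txn_commit(struct model *m, const struct txn *t)
{
    if (t->start_generation != m->generation)
        return EAGAIN;
    if (t->has_write)
        m->target = t->pending_target;
    m->generation++;
    return 0;
}

/* Replays the unlocked interleaving; returns the target=3 caller's commit result. */
int replay_unlocked_race(struct model *m)
{
    struct txn a, b; /* a: target=5 call, b: target=3 call */

    txn_start(m, &a);
    txn_start(m, &b);
    txn_write_target(&a, 5);
    txn_write_target(&b, 3);
    toy_setmaxmem(m, 5);
    toy_setmaxmem(m, 3);
    (void)txn_commit(m, &a);  /* first committer wins */
    return txn_commit(m, &b); /* second one sees a newer generation */
}

/* One whole call run without interleaving, as if serialised by a lock. */
int set_target_serialised(struct model *m, unsigned int target)
{
    struct txn t;
    int rc;

    do {
        txn_start(m, &t);
        txn_write_target(&t, target);
        toy_setmaxmem(m, target);
        rc = txn_commit(m, &t);
    } while (rc == EAGAIN);

    return rc;
}
```

After `replay_unlocked_race()` the model holds target=5 with maxmem=3, which is the transient hybrid state; running the retry through `set_target_serialised()` brings both back to 3. Serialising the whole call means the immediate maxmem update can no longer slip in between another caller's write and commit.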
Re: [Xen-devel] stable trees (was: [xen-4.2-testing test] 58584: regressions)
On Fri, 2015-06-19 at 12:07 +0100, Ian Campbell wrote: On Fri, 2015-06-19 at 10:51 +0100, Jan Beulich wrote: On 18.06.15 at 16:22, ian.campb...@citrix.com wrote: On Thu, 2015-06-18 at 12:37 +0100, Jan Beulich wrote: On 17.06.15 at 12:26, ian.jack...@eu.citrix.com wrote: Jan Beulich writes (stable trees (was: [xen-4.2-testing test] 58584: regressions)): Which leaves several options: - the problem was always there, but hidden by some factor in the old osstest instance, I think this is most likely. The old system had much older hosts. I think this is a race that we now happen to lose most of the time. For verification purposes, would it be possible to set up a couple of flights on the old instance for one of the stable trees? I can try and run something adhoc on the old system if you can let me know exactly which jobs (test-*-*-*) and branches you are interested in. Any or all of test-amd64-*-xl-qemuu-win* (not sure whether you can specify wildcards), and I guess stable-4.5 (or staging-4.5) would be the most natural branch choice. I think the tools can do wildcards, yes. I've kicked off a full adhoc xen-4.5-testing flight so I have a local template to copy the jobs from for some repeated runs with just the problem flights (it's just easier to do that than to invent a cut-down flight from scratch...). After that baseline I ran a few tests of just the windows + qemuu stuff: http://xenbits.xen.org/people/ianc/tmp/adhoc/37619/ was allowing free reign on the machines and was mostly successful, apart from the windows-install failure on lake-frog. Looking at the test history this seems to have always been a problem on the old infra. *-frog are AMD Opteron(tm) Processor 6168 which is as close as the old infra has to the new colos merlot[01] which is AMD Opteron(tm) Processor 6376. With that in mind I reran with things limited to the two frog-* boxes and got http://xenbits.xen.org/people/ianc/tmp/adhoc/37624/. 
The windows-install of winxpsp3 persisted but there was no migration failure elsewhere. It's not a lot of data, but in comparison with the results in the colo: http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-xl-qemuu-win7-amd64/xen-4.5-testing.html it looks like it's the newer system which is exposing the issue. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 8/9] x86/pvh: Don't try to get l4 table for PVH guests in vcpu_destroy_pagetables()
On 24.06.15 at 05:05, boris.ostrov...@oracle.com wrote:
> On 06/23/2015 09:38 AM, Jan Beulich wrote:
> > On 20.06.15 at 05:09, boris.ostrov...@oracle.com wrote:
> > > --- a/xen/arch/x86/mm.c
> > > +++ b/xen/arch/x86/mm.c
> > > @@ -2652,7 +2652,7 @@ int vcpu_destroy_pagetables(struct vcpu *v)
> > >      if ( rc )
> > >          return rc;
> > >
> > > -    if ( is_pv_32on64_vcpu(v) )
> > > +    if ( is_pv_32on64_vcpu(v) && !is_pvh_vcpu(v) )
> >
> > This looks wrong - is_pv_32on64_vcpu() should imply is_pv_vcpu() and hence !is_pvh_vcpu().
>
> That's because I kept
>
>     d->arch.is_32bit_pv = d->arch.has_32bit_shinfo = 1;
>
> in switch_compat() for both PV and PVH. I should probably only set has_32bit_shinfo for PVH.

Right.

Jan
Re: [Xen-devel] [PATCH 5/9] x86/pvh: Set PVH guest's mode in XEN_DOMCTL_set_address_size
On 24.06.15 at 04:53, boris.ostrov...@oracle.com wrote:
> On 06/23/2015 09:22 AM, Jan Beulich wrote:
> > > --- a/xen/arch/x86/hvm/hvm.c
> > > +++ b/xen/arch/x86/hvm/hvm.c
> > > @@ -2320,12 +2320,7 @@ int hvm_vcpu_initialise(struct vcpu *v)
> > >      v->arch.hvm_vcpu.inject_trap.vector = -1;
> > >
> > >      if ( is_pvh_domain(d) )
> > > -    {
> > > -        v->arch.hvm_vcpu.hcall_64bit = 1;    /* PVH 32bitfixme. */
> > > -        /* This is for hvm_long_mode_enabled(v). */
> > > -        v->arch.hvm_vcpu.guest_efer = EFER_LMA | EFER_LME;
> > >          return 0;
> > > -    }
> >
> > With this removed, is there any guarantee that hvm_set_mode() will be called for each vCPU?
>
> IIUIC, toolstack is required to call XEN_DOMCTL_set_address_size which results in a call to switch_compat/native(), which loop over all VCPUs, calling set_mode.

I don't recall this being a strict requirement. I think a PV 64-bit guest would start fine without.

Jan
Re: [Xen-devel] [PATCH v3 04/18] x86/hvm: make sure translated MMIO reads or writes fall within a page
On 23.06.15 at 18:32, paul.durr...@citrix.com wrote:
> Ok. If you believe it's the right thing to do, I'm happy to drop this patch out of the series. I'll send v4 tomorrow.

Perhaps worth waiting a little for further review comments (fwiw I didn't get to look at 5 and onwards so far)? But of course if you don't mind doing an extra round...

Jan
[Xen-devel] [linux-arm-xen test] 58849: regressions - FAIL
flight 58849 linux-arm-xen real [real] http://logs.test-lab.xenproject.org/osstest/logs/58849/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl-cubietruck 11 guest-start fail REGR. vs. 58830 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-sedf-pin 12 migrate-support-checkfail never pass test-armhf-armhf-xl-sedf 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass version targeted for testing: linux64972ceb0b0cafc91a09764bc731e1b7f0503b5c baseline version: linux9f51b5de8c3fdd01a9d692da5633449cc6936688 People who touched revisions under test: David S. Miller da...@davemloft.net Ian Campbell ian.campb...@citrix.com Luis Henriques luis.henriq...@canonical.com Wei Liu wei.l...@citrix.com jobs: build-armhf-xsm pass build-armhf pass build-armhf-libvirt pass build-armhf-pvopspass test-armhf-armhf-xl pass test-armhf-armhf-libvirt-xsm pass test-armhf-armhf-xl-xsm pass test-armhf-armhf-xl-arndale pass test-armhf-armhf-xl-credit2 pass test-armhf-armhf-xl-cubietruck fail test-armhf-armhf-libvirt pass test-armhf-armhf-xl-multivcpupass test-armhf-armhf-xl-sedf-pin pass test-armhf-armhf-xl-sedf pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. 
commit 64972ceb0b0cafc91a09764bc731e1b7f0503b5c Author: Ian Campbell ian.campb...@citrix.com Date: Mon Jun 1 11:30:24 2015 +0100 xen: netback: read hotplug script once at start of day. commit 31a418986a5852034d520a5bab546821ff1ccf3d upstream. When we come to tear things down in netback_remove() and generate the uevent it is possible that the xenstore directory has already been removed (details below). In such cases netback_uevent() won't be able to read the hotplug script and will write a xenstore error node. A recent change to the hypervisor exposed this race such that we now sometimes lose it (where apparently we didn't ever before). Instead read the hotplug script configuration during setup and use it for the lifetime of the backend device. The apparently more obvious fix of moving the transition to state=Closed in netback_remove() to after the uevent does not work because it is possible that we are already in state=Closed (in reaction to the guest having disconnected as it shutdown). Being already in Closed means the toolstack is at liberty to start tearing down the xenstore directories. In principal it might be possible to arrange to unregister the device sooner (e.g on transition to Closing) such that xenstore would still be there but this state machine is fragile and prone to anger... A modern Xen system only relies on the hotplug uevent for driver domains, when the backend is in the same domain as the toolstack it will run the necessary setup/teardown directly in the correct sequence wrt xenstore changes. Signed-off-by: Ian Campbell ian.campb...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com Signed-off-by: David S. Miller da...@davemloft.net Signed-off-by: Luis Henriques luis.henriq...@canonical.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 1/2] NetBSDRump: provide evtchn.h and privcmd.h
Xen's build system has a target for rump kernel called NetBSDRump. We want to build libxc against rump kernel, so we need to copy NetBSD's evtchn.h and privcmd.h to NetBSDRump. These copies is not very likely to diverge from NetBSD's copies, but we don't preclude such possibility. Signed-off-by: Wei Liu wei.l...@citrix.com --- tools/include/xen-sys/NetBSDRump/evtchn.h | 86 ++ tools/include/xen-sys/NetBSDRump/privcmd.h | 81 ++-- 2 files changed, 164 insertions(+), 3 deletions(-) create mode 100644 tools/include/xen-sys/NetBSDRump/evtchn.h diff --git a/tools/include/xen-sys/NetBSDRump/evtchn.h b/tools/include/xen-sys/NetBSDRump/evtchn.h new file mode 100644 index 000..2d8a1f9 --- /dev/null +++ b/tools/include/xen-sys/NetBSDRump/evtchn.h @@ -0,0 +1,86 @@ +/* $NetBSD: evtchn.h,v 1.1.1.1 2007/06/14 19:39:45 bouyer Exp $ */ +/** + * evtchn.h + * + * Interface to /dev/xen/evtchn. + * + * Copyright (c) 2003-2005, K A Fraser + * + * This file may be distributed separately from the Linux kernel, or + * incorporated into other software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the Software), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#ifndef __NetBSD_EVTCHN_H__ +#define __NetBSD_EVTCHN_H__ + +/* + * Bind a fresh port to VIRQ @virq. + */ +#define IOCTL_EVTCHN_BIND_VIRQ \ + _IOWR('E', 4, struct ioctl_evtchn_bind_virq) +struct ioctl_evtchn_bind_virq { + unsigned int virq; + unsigned int port; +}; + +/* + * Bind a fresh port to remote @remote_domain, @remote_port. + */ +#define IOCTL_EVTCHN_BIND_INTERDOMAIN \ + _IOWR('E', 5, struct ioctl_evtchn_bind_interdomain) +struct ioctl_evtchn_bind_interdomain { + unsigned int remote_domain, remote_port; + unsigned int port; +}; + +/* + * Allocate a fresh port for binding to @remote_domain. + */ +#define IOCTL_EVTCHN_BIND_UNBOUND_PORT \ + _IOWR('E', 6, struct ioctl_evtchn_bind_unbound_port) +struct ioctl_evtchn_bind_unbound_port { + unsigned int remote_domain; + unsigned int port; +}; + +/* + * Unbind previously allocated @port. + */ +#define IOCTL_EVTCHN_UNBIND\ + _IOW('E', 7, struct ioctl_evtchn_unbind) +struct ioctl_evtchn_unbind { + unsigned int port; +}; + +/* + * Send event to previously allocated @port. + */ +#define IOCTL_EVTCHN_NOTIFY\ + _IOW('E', 8, struct ioctl_evtchn_notify) +struct ioctl_evtchn_notify { + unsigned int port; +}; + +/* Clear and reinitialise the event buffer. Clear error condition. 
*/ +#define IOCTL_EVTCHN_RESET \ + _IO('E', 9) + +#endif /* __NetBSD_EVTCHN_H__ */ diff --git a/tools/include/xen-sys/NetBSDRump/privcmd.h b/tools/include/xen-sys/NetBSDRump/privcmd.h index efdcae9..1296b30 100644 --- a/tools/include/xen-sys/NetBSDRump/privcmd.h +++ b/tools/include/xen-sys/NetBSDRump/privcmd.h @@ -1,6 +1,36 @@ +/* NetBSD: xenio.h,v 1.3 2005/05/24 12:07:12 yamt Exp $*/ -#ifndef __NetBSDRump_PRIVCMD_H__ -#define __NetBSDRump_PRIVCMD_H__ +/** + * privcmd.h + * + * Copyright (c) 2003-2004, K A Fraser + * + * This file may be distributed separately from the Linux kernel, or + * incorporated into other software packages, subject to the following license: + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this source file (the Software), to deal in the Software without + * restriction, including without limitation the rights to use, copy, modify, + * merge, publish, distribute, sublicense, and/or sell copies of the Software, + * and to permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this
[Xen-devel] [PATCH 0/2] Build libxc on rump kernel
I have upstreamed a privcmd driver for rump kernel. That driver has the same semantics as the NetBSD one so we can just use xc_netbsd for rump kernel.

Wei.

Wei Liu (2):
  NetBSDRump: provide evtchn.h and privcmd.h
  libxc: use xc_netbsd.c for rump kernel

 tools/include/xen-sys/NetBSDRump/evtchn.h  | 86 ++
 tools/include/xen-sys/NetBSDRump/privcmd.h | 81 ++--
 tools/libxc/Makefile                       | 1 +
 3 files changed, 165 insertions(+), 3 deletions(-)
 create mode 100644 tools/include/xen-sys/NetBSDRump/evtchn.h

--
1.9.1
[Xen-devel] [PATCH 2/2] libxc: use xc_netbsd.c for rump kernel
Signed-off-by: Wei Liu wei.l...@citrix.com
---
 tools/libxc/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 55782c8..153b79e 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -48,6 +48,7 @@ CTRL_SRCS-$(CONFIG_Linux) += xc_linux.c xc_linux_osdep.c
 CTRL_SRCS-$(CONFIG_FreeBSD) += xc_freebsd.c xc_freebsd_osdep.c
 CTRL_SRCS-$(CONFIG_SunOS) += xc_solaris.c
 CTRL_SRCS-$(CONFIG_NetBSD) += xc_netbsd.c
+CTRL_SRCS-$(CONFIG_NetBSDRump) += xc_netbsd.c
 CTRL_SRCS-$(CONFIG_MiniOS) += xc_minios.c

 GUEST_SRCS-y :=
--
1.9.1
Re: [Xen-devel] (xen 4.6 unstable) triple fault when execute fxsave during the procedure of guest iso install
On Wed, Jun 24, 2015 at 10:31:57AM +0100, Andrew Cooper wrote:
> On 24/06/15 10:25, Razvan Cojocaru wrote:
> > On 06/24/2015 12:14 PM, Fanhenglong wrote:
> > > I want to debug the procedure of windows os install with windbg, windbg executes instruction(fxsave) after the blank vm is started and before guest iso start to install, fxsave trigger the following code path:
> > >
> > > vmx_vmexit_handler(EXIT_REASON_EPT_VIOLATION)
> > > ->ept_handle_violation
> > > ->hvm_hap_nested_page_fault
> > > ->handle_mmio_with_translation
> > > ->handle_mmio
> > > ->hvm_emulate_one
> > > ->x86_emulate
> > >
> > > *X86_emulate return X86EMUL_UNHANDLEABLE*
> >
> > How are you using Xen in this case? Are you by any chance using the vm_event system in a way that sends back an emulate vm_event response from userspace?
> >
> > You might want to look at x86_emulate() in xen/arch/x86/x86_emulate/x86_emulate.c and see if (and how) fxsave is being handled.
>
> The fxsave instruction has no emulation implementation.
>
> 0f ae 07 is fxsave (%rdi) which means that either introspection is active, or %rdi is a pointer into an MMIO region.

So I think this is not a regression?

(I'm now trying to identify possible blockers for the release)

Wei.

> ~Andrew
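
To make the reading of the bytes `0f ae 07` concrete, here is a minimal sketch that splits the ModRM byte following the two-byte opcode `0f ae`. The struct and function names are made up for this illustration and are unrelated to Xen's emulator. The field layout (mod in bits 7:6, reg in bits 5:3, rm in bits 2:0) is the standard x86 encoding, and `0f ae /0` with a memory operand is FXSAVE. Prefixes are deliberately ignored, so this is not a general decoder.

```c
#include <stdint.h>

/* Hypothetical helper type: the three ModRM bit fields. */
struct modrm_fields {
    uint8_t mod; /* bits 7:6, addressing mode (3 = register operand) */
    uint8_t reg; /* bits 5:3, opcode extension for group opcodes such as 0f ae */
    uint8_t rm;  /* bits 2:0, base register or addressing form */
};

struct modrm_fields split_modrm(uint8_t modrm)
{
    struct modrm_fields f;

    f.mod = (modrm >> 6) & 0x3;
    f.reg = (modrm >> 3) & 0x7;
    f.rm  = modrm & 0x7;
    return f;
}

/*
 * Returns 1 if the three bytes are 0f ae /0 with a memory operand,
 * i.e. FXSAVE. Prefix bytes (REX, 66, F3, ...) are not handled here.
 */
int is_fxsave_mem(const uint8_t insn[3])
{
    struct modrm_fields f;

    if (insn[0] != 0x0f || insn[1] != 0xae)
        return 0;

    f = split_modrm(insn[2]);
    return f.reg == 0 && f.mod != 3;
}
```

For the ModRM byte 0x07 the fields come out as mod=0, reg=0, rm=7. In 64-bit mode without a REX.B prefix, rm=7 with mod=0 means a plain memory operand addressed through %rdi, which matches the `fxsave (%rdi)` reading in the quoted reply.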
[Xen-devel] [PATCH v4 10/17] x86/hvm: revert 82ed8716b fix direct PCI port I/O emulation retry...
...and error handling NOTE: A straight reversion was not possible because of subsequent changes in the code so this at least partially a manual reversion. By limiting hvmemul_do_io_addr() to reps falling within the pages on which a reference has already been taken, we can guarantee that calls to hvm_copy_to/from_guest_phys() will not hit the HVMCOPY_gfn_paged_out or HVMCOPY_gfn_shared cases. Thus we can remove the retry logic from the intercept code and simplify it significantly. Normally hvmemul_do_io_addr() will only reference single page at a time. It will, however, take an extra page reference for I/O spanning a page boundary. It is still important to know, upon returning from x86_emulate(), whether the number of reps was reduced so the mmio_retry flag is retained for that purpose. Signed-off-by: Paul Durrant paul.durr...@citrix.com Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/hvm/emulate.c | 86 +++- xen/arch/x86/hvm/hvm.c |4 ++ xen/arch/x86/hvm/intercept.c | 52 +--- xen/include/asm-x86/hvm/vcpu.h |2 +- 4 files changed, 74 insertions(+), 70 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index 4e2fdf1..eefe860 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -52,7 +52,7 @@ static void hvmtrace_io_assist(ioreq_t *p) } static int hvmemul_do_io( -bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size, +bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size, uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data) { struct vcpu *curr = current; @@ -61,6 +61,7 @@ static int hvmemul_do_io( .type = is_mmio ? 
IOREQ_TYPE_COPY : IOREQ_TYPE_PIO, .addr = addr, .size = size, +.count = reps, .dir = dir, .df = df, .data = data, @@ -126,15 +127,6 @@ static int hvmemul_do_io( HVMIO_dispatched : HVMIO_awaiting_completion; vio-io_size = size; -/* - * When retrying a repeated string instruction, force exit to guest after - * completion of the retried iteration to allow handling of interrupts. - */ -if ( vio-mmio_retrying ) -*reps = 1; - -p.count = *reps; - if ( dir == IOREQ_WRITE ) { if ( !data_is_addr ) @@ -148,17 +140,9 @@ static int hvmemul_do_io( switch ( rc ) { case X86EMUL_OKAY: -case X86EMUL_RETRY: -*reps = p.count; p.state = STATE_IORESP_READY; -if ( !vio-mmio_retry ) -{ -hvm_io_assist(p); -vio-io_state = HVMIO_none; -} -else -/* Defer hvm_io_assist() invocation to hvm_do_resume(). */ -vio-io_state = HVMIO_handle_mmio_awaiting_completion; +hvm_io_assist(p); +vio-io_state = HVMIO_none; break; case X86EMUL_UNHANDLEABLE: { @@ -236,7 +220,7 @@ static int hvmemul_do_io_buffer( BUG_ON(buffer == NULL); -rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 0, +rc = hvmemul_do_io(is_mmio, addr, *reps, size, dir, df, 0, (uintptr_t)buffer); if ( rc == X86EMUL_UNHANDLEABLE dir == IOREQ_READ ) memset(buffer, 0xff, size); @@ -288,17 +272,66 @@ static int hvmemul_do_io_addr( bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa) { -struct page_info *ram_page; +struct vcpu *v = current; +unsigned long ram_gmfn = paddr_to_pfn(ram_gpa); +struct page_info *ram_page[2]; +int nr_pages = 0; +unsigned long count; int rc; -rc = hvmemul_acquire_page(paddr_to_pfn(ram_gpa), ram_page); +rc = hvmemul_acquire_page(ram_gmfn, ram_page[nr_pages]); if ( rc != X86EMUL_OKAY ) -return rc; +goto out; -rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1, +nr_pages++; + +/* Detemine how many reps will fit within this page */ +for ( count = 0; count *reps; count++ ) +{ +paddr_t start, end; + +if ( df ) +{ +start = ram_gpa - count * size; +end = ram_gpa 
+ size - 1; +} +else +{ +start = ram_gpa; +end = ram_gpa + (count + 1) * size - 1; +} + +if ( paddr_to_pfn(start) != ram_gmfn || + paddr_to_pfn(end) != ram_gmfn ) +break; +} + +if ( count == 0 ) +{ +/* + * This access must span two pages, so grab a reference to + * the next page and do a single rep. + */ +rc = hvmemul_acquire_page(df ? ram_gmfn - 1 : ram_gmfn + 1, + ram_page[nr_pages]); +if ( rc != X86EMUL_OKAY ) +goto out; + +nr_pages++; +count = 1; +} + +rc =
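The rep-counting loop in the (truncated) hunk above can be read in isolation. The following is a stand-alone sketch of the same calculation, assuming 4 KiB pages and using plain integer types instead of paddr_t; the function name is hypothetical and the code is not the patch itself.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Count how many repetitions of a 'size'-byte access starting at guest
 * physical address 'gpa' stay within the page containing 'gpa'.
 * 'df' non-zero means the direction flag is set (addresses decrease).
 * A result of 0 means the very first access already crosses a page
 * boundary.
 */
static unsigned long reps_within_page(uint64_t gpa, unsigned int size,
                                      unsigned long reps, int df)
{
    const uint64_t pfn = gpa >> SKETCH_PAGE_SHIFT;
    unsigned long count;

    for (count = 0; count < reps; count++) {
        uint64_t start, end;

        if (df) {
            start = gpa - (uint64_t)count * size;
            end = gpa + size - 1;
        } else {
            start = gpa;
            end = gpa + (uint64_t)(count + 1) * size - 1;
        }

        if ((start >> SKETCH_PAGE_SHIFT) != pfn ||
            (end >> SKETCH_PAGE_SHIFT) != pfn)
            break;
    }

    return count;
}
```

With a page-aligned address and 4-byte accesses, at most 1024 reps fit. An access that starts 2 bytes before a page boundary yields 0, which is the case where the patch takes a reference on the neighbouring page and performs a single rep.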
[Xen-devel] [PATCH v4 13/17] x86/hvm: remove HVMIO_dispatched I/O state
By removing the HVMIO_dispatched state and making all pending emulations (i.e. all those not handled by the hypervisor) use HVMIO_awating_completion, various code-paths can be simplified. Signed-off-by: Paul Durrant paul.durr...@citrix.com Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/hvm/emulate.c | 12 +++- xen/arch/x86/hvm/hvm.c | 12 +++- xen/arch/x86/hvm/io.c | 12 xen/arch/x86/hvm/vmx/realmode.c |2 +- xen/include/asm-x86/hvm/vcpu.h |8 +++- 5 files changed, 18 insertions(+), 28 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index 111987c..c10adad 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -138,20 +138,14 @@ static int hvmemul_do_io( if ( data_is_addr || dir == IOREQ_WRITE ) return X86EMUL_UNHANDLEABLE; goto finish_access; -case HVMIO_dispatched: -/* May have to wait for previous cycle of a multi-write to complete. */ -if ( is_mmio !data_is_addr (dir == IOREQ_WRITE) - (addr == (vio-mmio_large_write_pa + - vio-mmio_large_write_bytes)) ) -return X86EMUL_RETRY; -/* fallthrough */ default: return X86EMUL_UNHANDLEABLE; } -vio-io_state = (data_is_addr || dir == IOREQ_WRITE) ? 
-HVMIO_dispatched : HVMIO_awaiting_completion; +vio-io_state = HVMIO_awaiting_completion; vio-io_size = size; +vio-io_dir = dir; +vio-io_data_is_addr = data_is_addr; if ( dir == IOREQ_WRITE ) { diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 39f40ad..4458fa4 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -416,22 +416,16 @@ static void hvm_io_assist(ioreq_t *p) { struct vcpu *curr = current; struct hvm_vcpu_io *vio = curr-arch.hvm_vcpu.hvm_io; -enum hvm_io_state io_state; p-state = STATE_IOREQ_NONE; -io_state = vio-io_state; -vio-io_state = HVMIO_none; - -switch ( io_state ) +if ( HVMIO_NEED_COMPLETION(vio) ) { -case HVMIO_awaiting_completion: vio-io_state = HVMIO_completed; vio-io_data = p-data; -break; -default: -break; } +else +vio-io_state = HVMIO_none; msix_write_completion(curr); vcpu_end_shutdown_deferral(curr); diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c index 27150e9..12fc923 100644 --- a/xen/arch/x86/hvm/io.c +++ b/xen/arch/x86/hvm/io.c @@ -90,9 +90,7 @@ int handle_mmio(void) rc = hvm_emulate_one(ctxt); -if ( rc != X86EMUL_RETRY ) -vio-io_state = HVMIO_none; -if ( vio-io_state == HVMIO_awaiting_completion || vio-mmio_retry ) +if ( HVMIO_NEED_COMPLETION(vio) || vio-mmio_retry ) vio-io_completion = HVMIO_mmio_completion; else vio-mmio_access = (struct npfec){}; @@ -142,6 +140,9 @@ int handle_pio(uint16_t port, unsigned int size, int dir) rc = hvmemul_do_pio_buffer(port, size, dir, data); +if ( HVMIO_NEED_COMPLETION(vio) ) +vio-io_completion = HVMIO_pio_completion; + switch ( rc ) { case X86EMUL_OKAY: @@ -154,11 +155,6 @@ int handle_pio(uint16_t port, unsigned int size, int dir) } break; case X86EMUL_RETRY: -if ( vio-io_state != HVMIO_awaiting_completion ) -return 0; -/* Completion in hvm_io_assist() with no re-emulation required. 
*/ -ASSERT(dir == IOREQ_READ); -vio-io_completion = HVMIO_pio_completion; break; default: gdprintk(XENLOG_ERR, Weird HVM ioemulation status %d.\n, rc); diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c index 76ff9a5..5e56a1f 100644 --- a/xen/arch/x86/hvm/vmx/realmode.c +++ b/xen/arch/x86/hvm/vmx/realmode.c @@ -111,7 +111,7 @@ void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt) rc = hvm_emulate_one(hvmemul_ctxt); -if ( vio-io_state == HVMIO_awaiting_completion || vio-mmio_retry ) +if ( HVMIO_NEED_COMPLETION(vio) || vio-mmio_retry ) vio-io_completion = HVMIO_realmode_completion; if ( rc == X86EMUL_UNHANDLEABLE ) diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h index c4d96a8..2830057 100644 --- a/xen/include/asm-x86/hvm/vcpu.h +++ b/xen/include/asm-x86/hvm/vcpu.h @@ -32,7 +32,6 @@ enum hvm_io_state { HVMIO_none = 0, -HVMIO_dispatched, HVMIO_awaiting_completion, HVMIO_completed }; @@ -55,6 +54,13 @@ struct hvm_vcpu_io { unsigned long io_data; intio_size; enum hvm_io_completion io_completion; +uint8_tio_dir; +uint8_tio_data_is_addr; + +#define HVMIO_NEED_COMPLETION(_vio) \ +( ((_vio)-io_state == HVMIO_awaiting_completion) \ + !(_vio)-io_data_is_addr \ +
Re: [Xen-devel] [RFC PATCH v3 02/18] xen: Add log2 functionality
On Mon, Jun 22, 2015 at 6:47 PM, Jan Beulich jbeul...@suse.com wrote:

On 22.06.15 at 14:01, vijay.kil...@gmail.com wrote:

First of all, please Cc _all_ relevant maintainers.

--- a/xen/include/xen/bitops.h
+++ b/xen/include/xen/bitops.h
@@ -117,6 +117,14 @@ static inline int generic_fls64(__u64 x)
 # endif
 #endif
 
+static inline unsigned fls_long(unsigned long l)
+{
+    if (sizeof(l) == 4)
+        return fls(l);
+
+    return fls64(l);
+}

I'm not really opposed to this, but did you really verify that there's no suitable functionality in tree already (even if named differently)? I can't, e.g., see why flsl() wouldn't fit your needs.

#define fls64 flsl

So flsl should also be fine.

--- /dev/null
+++ b/xen/include/xen/log2.h
@@ -0,0 +1,205 @@
+/*
+ * Copyright (C) 2006 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowe...@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _LINUX_LOG2_H
+#define _LINUX_LOG2_H

LINUX?

Most of the include/xen/*.h files have kept LINUX. Anyway, I will remove it.

+/*
+ * deal with unrepresentable constant logarithms
+ */
+extern __attribute__((const))
+int ilog2_NaN(void);

-ETOOMANYUNDERSCORES

I have taken this from Linux. I have not introduced it.

+#if 1

??? (at least one more below) Also, are you really needing all of what you add here?

I have taken the complete log2 file from Linux.

Regards
Vijay
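To illustrate the point about flsl(): a find-last-set that takes unsigned long directly needs no sizeof() dispatch, and ilog2() for non-zero input is simply that result minus one. The sketch below assumes a GCC or Clang compiler (for __builtin_clzl) and uses hypothetical sketch_ names; it is not the Xen implementation of flsl().

```c
#include <limits.h>

/*
 * Find last set: 1-based index of the most significant set bit,
 * or 0 when x == 0. Works for whatever width unsigned long has.
 * __builtin_clzl is a GCC/Clang builtin (assumption of this sketch).
 */
static unsigned int sketch_flsl(unsigned long x)
{
    if (x == 0)
        return 0;

    return (unsigned int)(sizeof(x) * CHAR_BIT) -
           (unsigned int)__builtin_clzl(x);
}

/* Floor of log2(x); only meaningful for x != 0. */
static unsigned int sketch_ilog2(unsigned long x)
{
    return sketch_flsl(x) - 1;
}
```

Because the helper is written against unsigned long, the same source works on 32-bit and 64-bit builds, which is what the proposed fls_long() wrapper was trying to achieve.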
Re: [Xen-devel] (xen 4.6 unstable) triple fault when execute fxsave during the procedure of guest iso install
On 06/24/2015 12:31 PM, Andrew Cooper wrote:

On 24/06/15 10:25, Razvan Cojocaru wrote:

On 06/24/2015 12:14 PM, Fanhenglong wrote:

I want to debug the procedure of windows os install with windbg, windbg executes instruction(fxsave) after the blank vm is started and before guest iso start to install, fxsave trigger the following code path:

vmx_vmexit_handler(EXIT_REASON_EPT_VIOLATION)
->ept_handle_violation
->hvm_hap_nested_page_fault
->handle_mmio_with_translation
->handle_mmio
->hvm_emulate_one
->x86_emulate

*X86_emulate return X86EMUL_UNHANDLEABLE*

How are you using Xen in this case? Are you by any chance using the vm_event system in a way that sends back an emulate vm_event response from userspace? You might want to look at x86_emulate() in xen/arch/x86/x86_emulate/x86_emulate.c and see if (and how) fxsave is being handled.

The fxsave instruction has no emulation implementation. 0f ae 07 is fxsave (%rdi) which means that either introspection is active, or %rdi is a pointer into an MMIO region.

I see, these are the cases we wanted to treat with the old patch (I think it was called "xen: Handle resumed instruction based on previous mem_event reply" - the early versions, with RFC) that sometimes bypassed the emulator in the introspection case. Without that, there's always going to be a potential current or future instruction that is not emulated, and then something like this happens.

Cheers,
Razvan
Re: [Xen-devel] [RFC PATCH v3 10/18] xen/arm: ITS: Add APIs to add and assign device
Hi Vijay, On 22/06/15 13:01, vijay.kil...@gmail.com wrote: From: Vijaya Kumar K vijaya.ku...@caviumnetworks.com Add APIs to add devices to RB-tree, assign and remove devices to domain. Signed-off-by: Vijaya Kumar K vijaya.ku...@caviumnetworks.com --- xen/arch/arm/gic-v3-its.c | 246 - xen/include/asm-arm/gic-its.h |4 +- 2 files changed, 246 insertions(+), 4 deletions(-) diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c index 2a4fa97..4471669 100644 --- a/xen/arch/arm/gic-v3-its.c +++ b/xen/arch/arm/gic-v3-its.c @@ -94,6 +94,7 @@ static LIST_HEAD(its_nodes); static DEFINE_SPINLOCK(its_lock); static struct rdist_prop *gic_rdists; static struct rb_root rb_its_dev; +static DEFINE_SPINLOCK(rb_its_dev_lock); Can't you use the its_lock to handle locking for the rb_its_dev? #define gic_data_rdist()(per_cpu(rdist, smp_processor_id())) #define gic_data_rdist_rd_base()(per_cpu(rdist, smp_processor_id()).rbase) @@ -152,6 +153,21 @@ u32 its_get_nr_events(void) return (1 id_bits); } +static struct its_node * its_get_phys_node(u32 dev_id) +{ +struct its_node *its; + +/* TODO: For now return ITS0 node. + * Need Query PCI helper function to get on which + * ITS node the device is attached + */ +list_for_each_entry(its, its_nodes, entry) +{ +return its; +} + +return NULL; +} /* RB-tree helpers for its_device */ struct its_device * find_its_device(struct rb_root *root, u32 devid) { @@ -459,7 +475,7 @@ void its_send_inv(struct its_device *dev, struct its_collection *col, its_send_single_command(dev-its, its_build_inv_cmd, desc); } -void its_send_mapd(struct its_device *dev, int valid) +static void its_send_mapd(struct its_device *dev, int valid) I would prefer to see this static where the function has been introduced and delay the compilation until the end. 
{ struct its_cmd_desc desc; @@ -493,7 +509,7 @@ void its_send_mapvi(struct its_device *dev, struct its_collection *col, its_send_single_command(dev-its, its_build_mapvi_cmd, desc); } -void its_send_movi(struct its_device *dev, struct its_collection *col, +static void its_send_movi(struct its_device *dev, struct its_collection *col, u32 event) Ditto { struct its_cmd_desc desc; @@ -596,7 +612,7 @@ int its_lpi_init(u32 id_bits) return 0; } -unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids) +static unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids) Ditto { unsigned long *bitmap = NULL; int chunk_id; @@ -636,6 +652,230 @@ out: return bitmap; } +static void its_lpi_free(unsigned long *bitmap, int base, int nr_ids) +{ +int lpi; + +spin_lock(lpi_lock); + +for ( lpi = base; lpi (base + nr_ids); lpi += IRQS_PER_CHUNK ) +{ +int chunk = its_lpi_to_chunk(lpi); + +BUG_ON(chunk lpi_chunks); +if ( test_bit(chunk, lpi_bitmap) ) +clear_bit(chunk, lpi_bitmap); +else +its_err(Bad LPI chunk %d\n, chunk); +} + +spin_unlock(lpi_lock); + +xfree(bitmap); +} + +static int its_alloc_device_irq(struct its_device *dev, u32 *hwirq) +{ +int idx; + +idx = find_first_zero_bit(dev-lpi_map, dev-nr_lpis); +if ( idx == dev-nr_lpis ) +return -ENOSPC; + +*hwirq = dev-lpi_base + idx; You could use its_get_plpi here. +set_bit(idx, dev-lpi_map); + +return 0; +} + +static u32 its_get_plpi(struct its_device *dev, u32 event) +{ +ASSERT(event dev-nr_lpis); +return dev-lpi_base + event; +} + +/* Device assignment. Should be called from pci_device_add */ I guess you mean PHYSDEVOP_pci_device_add? +int its_add_device(struct domain *d, u32 devid) Why do you pass a domain in parameter? This function only adds a device into the ITS, there is nothing related to a specific domain. For more abstraction, I would create an helper which take a struct device in parameter and retrieving all the necessary informations (devid, number of MSI...) and then call this function. 
+{ +struct its_device *dev; +unsigned long *lpi_map; +void *itt; +int lpi_base, nr_lpis, sz; +u32 i, nr_ites, plpi, nr_cpus; +struct its_collection *col; + +spin_lock(rb_its_dev_lock); +dev = find_its_device(rb_its_dev, devid); +if ( dev ) This check is not useful. As you release the lock before adding the device someone else can have time to add the device in the RB-tree. +{ +spin_unlock(rb_its_dev_lock); +dprintk(XENLOG_G_ERR, ITS:d%dv%d Device already exists dev 0x%x\n, +d-domain_id, current-vcpu_id, dev-device_id); +return -EEXIST; +} +spin_unlock(rb_its_dev_lock); + +DPRINTK(ITS:d%dv%d Add device devid 0x%x\n,
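The review comment on its_alloc_device_irq() (reuse its_get_plpi() rather than open-coding the base + index arithmetic) can be sketched as follows. This is a simplified stand-alone model: the bitmap is a single 32-bit word, and all sketch_ names and the struct layout are hypothetical, not the types from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified stand-in for struct its_device (hypothetical layout). */
struct sketch_its_device {
    uint32_t lpi_map;  /* bit i set => event i is allocated */
    uint32_t lpi_base; /* first physical LPI owned by the device */
    uint32_t nr_lpis;  /* number of events, at most 32 in this sketch */
};

/* Translate a per-device event number into a physical LPI. */
static uint32_t sketch_get_plpi(const struct sketch_its_device *dev,
                                uint32_t event)
{
    assert(event < dev->nr_lpis);
    return dev->lpi_base + event;
}

/*
 * Allocate the first free event of the device and return its physical
 * LPI through *hwirq. Returns 0 on success, -ENOSPC when all events
 * are in use.
 */
static int sketch_alloc_device_irq(struct sketch_its_device *dev,
                                   uint32_t *hwirq)
{
    uint32_t idx;

    for (idx = 0; idx < dev->nr_lpis; idx++) {
        if (!(dev->lpi_map & (UINT32_C(1) << idx))) {
            dev->lpi_map |= UINT32_C(1) << idx;
            *hwirq = sketch_get_plpi(dev, idx); /* single place for the mapping */
            return 0;
        }
    }

    return -ENOSPC;
}
```

Keeping the event-to-LPI translation in one helper means a later change to the mapping only has to be made once. Locking is left out here on purpose; the review's separate point about holding the lock across lookup and insertion still applies to the real code.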
Re: [Xen-devel] Hyper and Xen Project
Hi Wang, I don't know the answer, so I CCed xen-devel (the Xen development list) and a few people that I think will be able to help. Cheers, Stefano On Wed, 24 Jun 2015, Wang Xu wrote: A problem about channel, where do I found the channel name in the guest, In the document, it says I could found it in sysfs, but looks there isn't a name property: | root@test-container-create-ubuntu:/sys/bus/xen/devices# udevadm info --attribute-walk --path=/devices/console-1 | [...] | | looking at device '/devices/console-1': | KERNEL==console-1 | SUBSYSTEM==xen | DRIVER==xenconsole | ATTR{devtype}==console | ATTR{nodename}==device/console/1 and I directly test `/dev/hvc1`, and it could communicate with the outside socket. Is there some mistake in my channel name configuration? | static void hyper_config_channel(libxl_device_channel* ch, const char* name, const char* sock, int devid) { | libxl_device_channel_init(ch); | ch-backend_domid = 0; | ch-name = strdup(name); | ch-devid = devid; | ch-connection = LIBXL_CHANNEL_CONNECTION_SOCKET; | ch-u.socket.path = strdup(sock); | } I tried to look at the oVirt code as it is mentioned in the dock, but I did not find xen console in its guest agent code. So the issue is that the name you assign here to the channel, doesn't come up anywhere in the guest. Is that correct? Thank you! On Tue, Jun 23, 2015 at 7:30 PM, Stefano Stabellini stefano.stabell...@eu.citrix.com wrote: On Tue, 23 Jun 2015, Wang Xu wrote: On Sat, Jun 20, 2015 at 1:10 AM Stefano Stabellini stefano.stabell...@eu.citrix.com wrote: Integrating hyper with Xen using libxl was the right decision and it looks like you did a good job. I think that you can go ahead with the PR! But I did have a few issues building hyper. I am getting: hyperd.go:11:2: cannot find package hyper/daemon in any of: [...] 
I tried with a clean 0.2-dev branch ./autogen.sh ./configure make It looks ok, are you work on the 0.2-dev branch, I did not write the branch name in the instruction of Readme, sorry for that. No worries, the most important part at this stage is the code, and that looks OK :-) Yes, I was using 0.2-dev and followed those steps. As I usually don't program in go, it is likely that my go working environment is missing something, or my go paths are wrong. This is the full error message: CGO_LDFLAGS=-Lhypervisor/xen -lxenlight -lxenctrl -lhyperxl godep go build hyperd.go hyperd.go:11:2: cannot find package hyper/daemon in any of: /local/scratch/sstabellini/go/src/hyper/daemon (from $GOROOT) /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/daemon (from $GOPATH) hyperd.go:10:2: cannot find package hyper/engine in any of: /local/scratch/sstabellini/go/src/hyper/engine (from $GOROOT) /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/engine (from $GOPATH) hyperd.go:12:2: cannot find package hyper/lib/glog in any of: /local/scratch/sstabellini/go/src/hyper/lib/glog (from $GOROOT) /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/lib/glog (from $GOPATH) hyperd.go:13:2: cannot find package hyper/utils in any of: /local/scratch/sstabellini/go/src/hyper/utils (from $GOROOT) /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/utils (from $GOPATH) godep: go exit status 1 Looking through the code, it seems that you are adding a virtio-serial-pci device, why do you need it? It is not used very much on Xen; the regular Xen uart is specified by setting b_info-u.hvm.serial to pty, and it looks like you are already doing that. If you need more than one console, you can have a list setting b_info-u.hvm.serial_list. What the difference between u.hvm.serial_list and channels in domain_config. The channel looks having more features. Actually I think that you are right: channels are better tested and more flexible. 
virtio-9p-pci is also not used very much with Xen, but as we don't have an alternative yet, I think it is good for now. Thanks again, Stefano On Fri, 19 Jun 2015, Sarah Conway wrote: Hi Xu, I'd be happy to work with you on some PR to promote this work. I'll be in touch with some next steps next week and look forward to Stefano's feedback. Sarah
Re: [Xen-devel] [PATCH v4 00/17] x86/hvm: I/O emulation cleanup and fix
-----Original Message-----
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: 24 June 2015 13:16
To: Paul Durrant
Cc: xen-de...@lists.xenproject.org
Subject: Re: [Xen-devel] [PATCH v4 00/17] x86/hvm: I/O emulation cleanup and fix

On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:

This patch series re-works much of the code involved in emulation of port and memory mapped I/O for HVM guests. The code has become very convoluted and, at least by inspection, certain emulations will apparently malfunction.

The series is broken down into 17 patches (which are also available in my xenbits repo: http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git on the emulation27 branch) as follows:

0001-x86-hvm-simplify-hvmemul_do_io.patch
0002-x86-hvm-remove-hvm_io_pending-check-in-hvmemul_do_io.patch
0003-x86-hvm-remove-extraneous-parameter-from-hvmtrace_io.patch

In the event there were any, however minor, changes to these three - they went in yesterday evening.

Ok. Thanks. Having not seen your acked-by on list, I didn't check. I'll drop these from any subsequent post.

Paul

Jan
[Xen-devel] [PATCH] x86: fix build with old gas
.string8 is only supported by gas 2.19 and newer.

Signed-off-by: Jan Beulich jbeul...@suse.com

--- a/xen/include/asm-x86/bug.h
+++ b/xen/include/asm-x86/bug.h
@@ -79,7 +79,7 @@ extern const struct bug_frame __start_bu
 .L\@ud: ud2a
 .pushsection .rodata.str1, aMS, @progbits, 1
- .L\@s1: .string8 \file_str
+ .L\@s1: .asciz \file_str
 .popsection
 .pushsection .bug_frames.\type, a, @progbits
@@ -91,7 +91,7 @@ extern const struct bug_frame __start_bu
 .if \second_frame
 .pushsection .rodata.str1, aMS, @progbits, 1
-.L\@s2: .string8 \msg
+.L\@s2: .asciz \msg
 .popsection
 .long 0, (.L\@s2 - .L\@bf)
 .endif
Re: [Xen-devel] [PATCH 1/2] xen{trace/analyze}: don't use 64bit versions of libc functions
On 24/06/15 at 13.11, Roger Pau Monné wrote:

On 22/06/15 at 16.48, Roger Pau Monné wrote:

On 22/06/15 at 12.09, George Dunlap wrote:

On 06/22/2015 10:59 AM, Roger Pau Monné wrote:

On 22/06/15 at 11.08, George Dunlap wrote:

On 06/19/2015 09:58 AM, Roger Pau Monne wrote:

This is not needed, neither encouraged. Configure already checks _FILE_OFFSET_BITS and appends it when needed, so that the right functions are used. Also remove the usage of loff_t and O_LARGEFILE for the same reason.

Just so I understand -- are you saying that configure at the tools directory level will notice that Linux can handle 64-bit file operations and use them automatically?

Yes, according to the man page [1]:

Over time, increases in the size of the stat structure have led to three successive versions of stat(): sys_stat() (slot __NR_oldstat), sys_newstat() (slot __NR_stat), and sys_stat64() (new in kernel 2.4; slot __NR_stat64). The glibc stat() wrapper function hides these details from applications, invoking the most recent version of the system call provided by the kernel, and repacking the returned information if required for old binaries. Similar remarks apply for fstat() and lstat().

OK, if you can confirm that you've actually tested this on a file larger than 4GiB, then:

No, I have only build tested it since I was trying to unbreak the build. I don't think I will have time to test this until tomorrow, sorry for the delay.

I've now tested this with a ~5GB file and it seems to work fine, I haven't seen any error and the output looks reasonable. This was on a 64bit Dom0, if someone has a 32bit Dom0 it would be good to test it there also.

I've also tested on a 32bit Dom0, with and without the patches in this series, and I always end up getting the same strange output from xenalyze:

# xenalyze trace.file
No output defined, using summary.
Using VMX hardware-assisted virtualization.
scan_for_new_pcpu: Activating pcpu 0 at offset 0
Creating vcpu 0 for dom 32768
scan_for_new_pcpu: Activating pcpu 1 at offset 10376
Creating vcpu 1 for dom 32768
scan_for_new_pcpu: Activating pcpu 4 at offset 10848
Creating vcpu 4 for dom 32768
scan_for_new_pcpu: Activating pcpu 6 at offset 11176
Creating vcpu 6 for dom 32768
init_pcpus: through first trace write, done for now.
Creating domain 0
Creating vcpu 0 for dom 0
Using first_tsc for d0v0 (8109 cycles)
Creating domain 32767
Creating vcpu 1 for dom 32767
Creating vcpu 1 for dom 0
Creating vcpu 2 for dom 0
Creating vcpu 4 for dom 32767
Using first_tsc for d32767v4 (9407 cycles)
Creating vcpu 6 for dom 32767
Using first_tsc for d32767v6 (8755 cycles)
process_cpu_change: Activating pcpu 5 at offset 16664
Creating vcpu 5 for dom 32768
scan_for_new_pcpu: Activating pcpu 7 at offset 17812
Creating vcpu 7 for dom 32768
Creating vcpu 3 for dom 0
Using first_tsc for d0v3 (3369172 cycles)
Creating vcpu 0 for dom 32767
Creating vcpu 6 for dom 0
Creating vcpu 5 for dom 32767
Using first_tsc for d32767v5 (7868 cycles)
Creating vcpu 7 for dom 0
Creating vcpu 7 for dom 32767
Using first_tsc for d32767v7 (7693 cycles)
process_cpu_change: Activating pcpu 2 at offset 61284
Creating vcpu 2 for dom 32768
process_cpu_change: Activating pcpu 3 at offset 62128
Creating vcpu 3 for dom 32768
Creating vcpu 5 for dom 0
Creating vcpu 3 for dom 32767
Using first_tsc for d32767v3 (24609 cycles)
Creating vcpu 4 for dom 0
Creating vcpu 2 for dom 32767
Using first_tsc for d32767v2 (2575 cycles)
WARNING: Unexpected vcpu data type for d0v0 on proc 1! Expected 1 got 2. Not processing
] 84007(8:4:7) 0 [ ]
WARNING: Unexpected vcpu data type for d0v0 on proc 1! Expected 1 got 2. Not processing
] 84006(8:4:6) 0 [ ]
WARNING: Unexpected vcpu data type for d0v2 on proc 6! Expected 1 got 2. Not processing
] 84008(8:4:8) 0 [ ]
WARNING: Unexpected vcpu data type for d0v2 on proc 6! Expected 1 got 2. Not processing
] 84008(8:4:8) 0 [ ]
WARNING: Unexpected vcpu data type for d0v3 on proc 0! Expected 1 got 2. Not processing
] 84006(8:4:6) 0 [ ]
Creating domain 90
Creating vcpu 0 for dom 90
Creating domain 89
Creating vcpu 0 for dom 89
Unknown hvm event: 84011
h->exit_reason 7b exit_reason_max 38!
] 81002(8:1:2) 2 [ 7b 100d9e ]

And that's all. Since this does not seem to be related to these fixes, I think they should be applied.

Roger.
Re: [Xen-devel] Hyper and Xen Project
On 24 Jun 2015, at 12:48, Stefano Stabellini stefano.stabell...@eu.citrix.com wrote: Hi Wang, I don't know the answer, so I CCed xen-devel (the Xen development list) and a few people that I think will be able to help. Cheers, Stefano On Wed, 24 Jun 2015, Wang Xu wrote: A problem about channel, where do I found the channel name in the guest, In the document, it says I could found it in sysfs, but looks there isn't a name property: | root@test-container-create-ubuntu:/sys/bus/xen/devices# udevadm info --attribute-walk --path=/devices/console-1 | [...] | | looking at device '/devices/console-1': | KERNEL==console-1 | SUBSYSTEM==xen | DRIVER==xenconsole | ATTR{devtype}==console | ATTR{nodename}==device/console/1” I don’t think the frontend driver in Linux knows about the name key. In my testing I wrote a udev script which looks up the ‘name’ key directly in xenstore and created a named device node using that. For reference my script is here: https://github.com/mirage/mirage-console/blob/master/udev/xenconsole-setup-tty Cheers, Dave and I directly test `/dev/hvc1`, and it could communicate with the outside socket. Is there some mistake in my channel name configuration? | static void hyper_config_channel(libxl_device_channel* ch, const char* name, const char* sock, int devid) { | libxl_device_channel_init(ch); | ch-backend_domid = 0; | ch-name = strdup(name); | ch-devid = devid; | ch-connection = LIBXL_CHANNEL_CONNECTION_SOCKET; | ch-u.socket.path = strdup(sock); | } I tried to look at the oVirt code as it is mentioned in the dock, but I did not find xen console in its guest agent code. So the issue is that the name you assign here to the channel, doesn't come up anywhere in the guest. Is that correct? Thank you! 
On Tue, Jun 23, 2015 at 7:30 PM, Stefano Stabellini stefano.stabell...@eu.citrix.com wrote: On Tue, 23 Jun 2015, Wang Xu wrote: On Sat, Jun 20, 2015 at 1:10 AM Stefano Stabellini stefano.stabell...@eu.citrix.com wrote: Integrating hyper with Xen using libxl was the right decision and it looks like you did a good job. I think that you can go ahead with the PR! But I did have a few issues building hyper. I am getting: hyperd.go:11:2: cannot find package hyper/daemon in any of: [...] I tried with a clean 0.2-dev branch ./autogen.sh ./configure make It looks ok, are you work on the 0.2-dev branch, I did not write the branch name in the instruction of Readme, sorry for that. No worries, the most important part at this stage is the code, and that looks OK :-) Yes, I was using 0.2-dev and followed those steps. As I usually don't program in go, it is likely that my go working environment is missing something, or my go paths are wrong. This is the full error message: CGO_LDFLAGS=-Lhypervisor/xen -lxenlight -lxenctrl -lhyperxl godep go build hyperd.go hyperd.go:11:2: cannot find package hyper/daemon in any of: /local/scratch/sstabellini/go/src/hyper/daemon (from $GOROOT) /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/daemon (from $GOPATH) hyperd.go:10:2: cannot find package hyper/engine in any of: /local/scratch/sstabellini/go/src/hyper/engine (from $GOROOT) /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/engine (from $GOPATH) hyperd.go:12:2: cannot find package hyper/lib/glog in any of: /local/scratch/sstabellini/go/src/hyper/lib/glog (from $GOROOT) /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/lib/glog (from $GOPATH) hyperd.go:13:2: cannot find package hyper/utils in any of: /local/scratch/sstabellini/go/src/hyper/utils (from $GOROOT) /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/utils (from $GOPATH) godep: go exit status 1 Looking through the code, it seems that you are adding a virtio-serial-pci device, why do you 
need it? It is not used very much on Xen; the regular Xen uart is specified by setting b_info-u.hvm.serial to pty, and it looks like you are already doing that. If you need more than one console, you can have a list setting b_info-u.hvm.serial_list. What the difference between u.hvm.serial_list and channels in domain_config. The channel looks having more features. Actually I think that you are right: channels are better tested and more flexible. virtio-9p-pci is also not used very much with Xen, but as we don't have an alternative yet, I think it is good for now. Thanks again, Stefano On Fri, 19 Jun 2015, Sarah Conway wrote: Hi Xu, I'd be happy to work with you on some PR to
Re: [Xen-devel] [PATCH 1/2] NetBSDRump: provide evtchn.h and privcmd.h
On 24/06/15 at 12:10, Wei Liu wrote:
> +#define IOCTL_PRIVCMD_MMAP \
> +    _IOW('P', 2, privcmd_mmap_t)
> +#define IOCTL_PRIVCMD_MMAPBATCH \
> +    _IOW('P', 3, privcmd_mmapbatch_t)

FWIW, you could have gotten away with implementing only
IOCTL_PRIVCMD_MMAPBATCH; this is what I did on FreeBSD.

Roger.
Re: [Xen-devel] [PATCH v2 04/12] x86/altp2m: basic data structures and support routines.
On 22/06/15 19:56, Ed White wrote:
> diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
> index 3d8f4dc..a1529c0 100644
> --- a/xen/include/asm-x86/hvm/vcpu.h
> +++ b/xen/include/asm-x86/hvm/vcpu.h
> @@ -118,6 +118,13 @@ struct nestedvcpu {
>
>  #define vcpu_nestedhvm(v) ((v)->arch.hvm_vcpu.nvcpu)
>
> +struct altp2mvcpu {
> +    uint16_t    p2midx;         /* alternate p2m index */
> +    uint64_t    veinfo_gfn;     /* #VE information page guest pfn */

Please use the recently-introduced pfn_t here. pfn is a more appropriate
term than gfn in this case.

~Andrew
Re: [Xen-devel] [PATCH 1/2] xen{trace/analyze}: don't use 64bit versions of libc functions
On 22/06/15 at 16:48, Roger Pau Monné wrote:
> On 22/06/15 at 12:09, George Dunlap wrote:
> > On 06/22/2015 10:59 AM, Roger Pau Monné wrote:
> > > On 22/06/15 at 11:08, George Dunlap wrote:
> > > > On 06/19/2015 09:58 AM, Roger Pau Monne wrote:
> > > > > This is not needed, neither encouraged. Configure already checks
> > > > > _FILE_OFFSET_BITS and appends it when needed, so that the right
> > > > > functions are used.
> > > > >
> > > > > Also remove the usage of loff_t and O_LARGEFILE for the same reason.
> > > >
> > > > Just so I understand -- are you saying that configure at the tools
> > > > directory level will notice that Linux can handle 64-bit file
> > > > operations and use them automatically?
> > >
> > > Yes, according to the man page [1]:
> > >
> > > Over time, increases in the size of the stat structure have led to
> > > three successive versions of stat(): sys_stat() (slot __NR_oldstat),
> > > sys_newstat() (slot __NR_stat), and sys_stat64() (new in kernel 2.4;
> > > slot __NR_stat64). The glibc stat() wrapper function hides these
> > > details from applications, invoking the most recent version of the
> > > system call provided by the kernel, and repacking the returned
> > > information if required for old binaries. Similar remarks apply for
> > > fstat() and lstat().
> >
> > OK, if you can confirm that you've actually tested this on a file
> > larger than 4GiB, then:
>
> No, I have only build tested it since I was trying to unbreak the
> build. I don't think I will have time to test this until tomorrow,
> sorry for the delay.

I've now tested this with a ~5GB file and it seems to work fine: I haven't
seen any errors and the output looks reasonable. This was on a 64-bit Dom0;
if someone has a 32-bit Dom0, it would be good to test it there as well.

Roger.
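As an illustration of the _FILE_OFFSET_BITS point discussed above, here is a minimal sketch that is not part of the patch. It assumes glibc on Linux, where defining _FILE_OFFSET_BITS as 64 before any system header makes off_t 64 bits wide, so the plain open()/lseek()/stat() names can be used instead of the explicit *64 variants, loff_t and O_LARGEFILE. The helper names off_t_bits() and offset_fits() are hypothetical and exist only for this example.

```c
/*
 * Normally the build system passes -D_FILE_OFFSET_BITS=64; defining it here
 * (before any system header) has the same effect. Assumption: glibc/Linux.
 */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper: width of off_t, in bits, for this build. */
static unsigned int off_t_bits(void)
{
    return (unsigned int)(sizeof(off_t) * CHAR_BIT);
}

/* Hypothetical helper: can 'offset' be represented as a (signed) off_t? */
static int offset_fits(uint64_t offset)
{
    assert(off_t_bits() >= 32);

    if (off_t_bits() >= 64)
        return offset <= (uint64_t)INT64_MAX;

    return offset <= (uint64_t)INT32_MAX;
}
```

With a 64-bit off_t, an offset inside a ~5GB trace file is representable without any of the explicit 64-bit interfaces.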
[Xen-devel] [PATCH v4 04/17] x86/hvm: remove multiple open coded 'chunking' loops
...in hvmemul_read/write()

Add hvmemul_phys_mmio_access() and hvmemul_linear_mmio_access() functions
to reduce code duplication.

NOTE: This patch also introduces a change in 'chunking' around a page
      boundary. Previously (for example) an 8 byte access at the last
      byte of a page would get carried out as 8 single-byte accesses.
      It will now be carried out as a single-byte access, followed by
      a 4-byte access, a 2-byte access and then another single-byte
      access.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Keir Fraser <k...@xen.org>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
---
 xen/arch/x86/hvm/emulate.c | 225 +++-
 1 file changed, 118 insertions(+), 107 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 935eab3..4d11c6c 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -540,6 +540,119 @@ static int hvmemul_virtual_to_linear(
     return X86EMUL_EXCEPTION;
 }
 
+static int hvmemul_phys_mmio_access(
+    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t **buffer)
+{
+    unsigned long one_rep = 1;
+    unsigned int chunk;
+    int rc;
+
+    /* Accesses must fall within a page */
+    BUG_ON((gpa & (PAGE_SIZE - 1)) + size > PAGE_SIZE);
+
+    /*
+     * hvmemul_do_io() cannot handle non-power-of-2 accesses or
+     * accesses larger than sizeof(long), so choose the highest power
+     * of 2 not exceeding sizeof(long) as the 'chunk' size.
+     */
+    chunk = 1 << (fls(size) - 1);
+    if ( chunk > sizeof (long) )
+        chunk = sizeof (long);
+
+    for ( ;; )
+    {
+        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
+                                    *buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        /* Advance to the next chunk */
+        gpa += chunk;
+        *buffer += chunk;
+        size -= chunk;
+
+        if ( size == 0 )
+            break;
+
+        /*
+         * If the chunk now exceeds the remaining size, choose the next
+         * lowest power of 2 that will fit.
+         */
+        while ( chunk > size )
+            chunk >>= 1;
+    }
+
+    return rc;
+}
+
+static int hvmemul_linear_mmio_access(
+    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t translate)
+{
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    unsigned long page_off = gla & (PAGE_SIZE - 1);
+    unsigned int chunk;
+    paddr_t gpa;
+    unsigned long one_rep = 1;
+    int rc;
+
+    chunk = min_t(unsigned int, size, PAGE_SIZE - page_off);
+
+    if ( translate )
+        gpa = pfn_to_paddr(vio->mmio_gpfn) | page_off;
+    else
+    {
+        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+                                    hvmemul_ctxt);
+        if ( rc != X86EMUL_OKAY )
+            return rc;
+    }
+
+    for ( ;; )
+    {
+        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, &buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        gla += chunk;
+        gpa += chunk;
+        size -= chunk;
+
+        if ( size == 0 )
+            break;
+
+        ASSERT((gla & (PAGE_SIZE - 1)) == 0);
+        chunk = min_t(unsigned int, size, PAGE_SIZE);
+        if ( !translate )
+        {
+            rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+                                        hvmemul_ctxt);
+            if ( rc != X86EMUL_OKAY )
+                return rc;
+        }
+    }
+
+    return rc;
+}
+
+static inline int hvmemul_linear_mmio_read(
+    unsigned long gla, unsigned int size, void *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt,
+    bool_t translate)
+{
+    return hvmemul_linear_mmio_access(gla, size, IOREQ_READ, buffer,
+                                      pfec, hvmemul_ctxt, translate);
+}
+
+static inline int hvmemul_linear_mmio_write(
+    unsigned long gla, unsigned int size, void *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt,
+    bool_t translate)
+{
+    return hvmemul_linear_mmio_access(gla, size, IOREQ_WRITE, buffer,
+                                      pfec, hvmemul_ctxt, translate);
+}
+
 static int __hvmemul_read(
     enum x86_segment seg,
     unsigned long offset,
@@ -550,51 +663,19 @@ static int __hvmemul_read(
 {
     struct vcpu *curr = current;
     unsigned long addr, reps = 1;
-    unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER);
     uint32_t pfec = PFEC_page_present;
     struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
-    paddr_t gpa;
     int rc;
 
     rc = hvmemul_virtual_to_linear(
         seg, offset, bytes, &reps, access_type, hvmemul_ctxt, &addr);
     if ( rc != X86EMUL_OKAY )
         return rc;
-    off = addr & (PAGE_SIZE - 1);
-    /*
-     * We only need to handle sizes actual instruction operands can have. All
-     * such sizes are either powers
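The chunking change described in the commit message above can be sketched outside the hypervisor as follows. This is a minimal illustration under stated assumptions, not the patch's code: it folds the page split and the power-of-two split into one hypothetical function, split_access(), assumes a 4096-byte page (SKETCH_PAGE_SIZE), and records chunk sizes instead of issuing I/O. pow2_floor() is likewise a stand-in for the `1 << (fls(size) - 1)` computation.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumed page size for the sketch */

/* Largest power of two that is <= n; n must be non-zero. */
static unsigned int pow2_floor(unsigned int n)
{
    unsigned int p = 1;

    assert(n != 0);
    while (n >>= 1)
        p <<= 1;
    return p;
}

/*
 * Split an access of 'size' bytes starting at 'addr' into chunks that
 * never cross a page boundary, are powers of two, and do not exceed
 * 'max_chunk' (itself a power of two, e.g. sizeof(long)).
 * Chunk sizes are stored in out[]; returns the number of chunks, or 0
 * if out[] (capacity 'cap') is too small.
 */
static size_t split_access(unsigned long addr, unsigned int size,
                           unsigned int max_chunk,
                           unsigned int *out, size_t cap)
{
    size_t n = 0;

    assert(max_chunk != 0 && (max_chunk & (max_chunk - 1)) == 0);

    while (size != 0) {
        /* Portion of the access that lies within the current page. */
        unsigned long page_off = addr & (SKETCH_PAGE_SIZE - 1);
        unsigned long room = SKETCH_PAGE_SIZE - page_off;
        unsigned int in_page = size < room ? size : (unsigned int)room;
        unsigned int chunk = pow2_floor(in_page);

        if (chunk > max_chunk)
            chunk = max_chunk;

        while (in_page != 0) {
            /* Shrink to the next lowest power of two that still fits. */
            while (chunk > in_page)
                chunk >>= 1;
            if (n == cap)
                return 0;
            out[n++] = chunk;
            addr += chunk;
            in_page -= chunk;
            size -= chunk;
        }
    }

    return n;
}
```

For the example in the commit message (an 8-byte access starting at the last byte of a page, with an 8-byte maximum chunk), this yields chunks of 1, 4, 2 and 1 bytes.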
Re: [Xen-devel] [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI
On 06/24/2015 06:14 AM, Roger Pau Monné wrote: El 24/06/15 a les 12.05, Jan Beulich ha escrit: On 24.06.15 at 11:47, roger@citrix.com wrote: What needs to be done (ordered by priority): - Clean up the patches, this patch series was done in less than a week. - Finish the boot ABI (this would also be needed for PVH anyway). - Convert the rest of xc_dom_*loaders in order to use the physical entry point when present, right now xc_dom_elfloader is the only one usable with HVMlite. This is quite trivial (see patch 10, it's a 4 LOC change). - Dom0 support. - Migration. - PCI pass-through. IMHO this is what we agreed to do with PVH, make it an HVM guest without a device model and without the emulated devices inside of Xen. Sooner or later we would need to make that change anyway in order to properly integrate PVH into Xen, and we get a bunch of new features for free as compared to PVH. I don't think of this as throw PVH out of the window and start something completely new from scratch, we are going to reuse some of the code paths used by PVH inside of Xen. From a guest POV the changes needed to move from PVH into HVMlite are regarding the boot ABI only, which we already agreed that should be changed anyway. I have to admit that I'm having a hard time making myself a clear picture of what the intention now is, namely with feature freeze being in about 2.5 weeks: If we assume that this series gets ready in time, should we drop Boris' 32-bit support patches? Would then be unfortunate if the series here didn't get ready. TBH I'm not going to make any promises of this being ready before the 4.6 feature freeze, not until I get some feedback from the tools maintainers regarding the libxc changes to unify the PV and HVM domain creation paths. FWIW, I gave this a quick spin on Monday and crashed the hypervisor on a NULL pointer right away in vapic code. Which, I assume, is not surprising since we are not supposed to be there in the first place. 
I'll try it again later today (I was out yesterday), maybe I messed something up. Otoh I don't think this and Boris' code conflict, and what we got in the tree PVH-wise is kind of a mess right now anyway, so adding to it just a few more bits (actually getting rid of some fixme-s, i.e. reducing messiness), so I'd be inclined to take the rest of Boris' series once ready, and if the series here gets ready too it could then also go in. Which would then mean for someone (perhaps after 4.6 was branched) to clean up any no longer necessary PVH special cases, unifying things towards what we seem to now call HVMlite. I'm not against merging the 32bit support series for PVH, but I'm certainly not going to invest time in adding 32bit PVH entry points to any OSes. What about Tim's proposal (http://lists.xen.org/archives/html/xen-devel/2014-12/msg00596.html)? Can this work be made part of it? At least, make it extendable to that? -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/2] xen{trace/analyze}: don't use 64bit versions of libc functions
On 06/24/2015 02:02 PM, Roger Pau Monné wrote: El 24/06/15 a les 13.11, Roger Pau Monné ha escrit: El 22/06/15 a les 16.48, Roger Pau Monné ha escrit: El 22/06/15 a les 12.09, George Dunlap ha escrit: On 06/22/2015 10:59 AM, Roger Pau Monné wrote: El 22/06/15 a les 11.08, George Dunlap ha escrit: On 06/19/2015 09:58 AM, Roger Pau Monne wrote: This is not needed, neither encouraged. Configure already checks _FILE_OFFSET_BITS and appends it when needed, so that the right functions are used. Also remove the usage of loff_t and O_LARGEFILE for the same reason. Just so I understand -- are you saying that configure at the tools directory level will notice that Linux can handle 64-bit file operations and use them automatically? Yes, according to the man page [1]: Over time, increases in the size of the stat structure have led to three successive versions of stat(): sys_stat() (slot __NR_oldstat), sys_newstat() (slot __NR_stat), and sys_stat64() (new in kernel 2.4; slot __NR_stat64). The glibc stat() wrapper function hides these details from applications, invoking the most recent version of the system call provided by the kernel, and repacking the returned information if required for old binaries. Similar remarks apply for fstat() and lstat(). OK, if you can confirm that you've actually tested this on a file larger than 4GiB, then: No, I have only build tested it since I was trying to unbreak the build. I don't think I will have time to test this until tomorrow, sorry for the delay. I've now tested this with a ~5GB file and it seems to work fine, I haven't seen any error and the output looks reasonable. This was on a 64bit Dom0, if someone has a 32bit Dom0 it would be good to test it there also. I've also tested on a 32bit Dom0, with and without the patches in this series and I always end up getting the same strange output from xenalyze: # xenalyze trace.file No output defined, using summary. Using VMX hardware-assisted virtualization. 
scan_for_new_pcpu: Activating pcpu 0 at offset 0 Creating vcpu 0 for dom 32768 scan_for_new_pcpu: Activating pcpu 1 at offset 10376 Creating vcpu 1 for dom 32768 scan_for_new_pcpu: Activating pcpu 4 at offset 10848 Creating vcpu 4 for dom 32768 scan_for_new_pcpu: Activating pcpu 6 at offset 11176 Creating vcpu 6 for dom 32768 init_pcpus: through first trace write, done for now. Creating domain 0 Creating vcpu 0 for dom 0 Using first_tsc for d0v0 (8109 cycles) Creating domain 32767 Creating vcpu 1 for dom 32767 Creating vcpu 1 for dom 0 Creating vcpu 2 for dom 0 Creating vcpu 4 for dom 32767 Using first_tsc for d32767v4 (9407 cycles) Creating vcpu 6 for dom 32767 Using first_tsc for d32767v6 (8755 cycles) process_cpu_change: Activating pcpu 5 at offset 16664 Creating vcpu 5 for dom 32768 scan_for_new_pcpu: Activating pcpu 7 at offset 17812 Creating vcpu 7 for dom 32768 Creating vcpu 3 for dom 0 Using first_tsc for d0v3 (3369172 cycles) Creating vcpu 0 for dom 32767 Creating vcpu 6 for dom 0 Creating vcpu 5 for dom 32767 Using first_tsc for d32767v5 (7868 cycles) Creating vcpu 7 for dom 0 Creating vcpu 7 for dom 32767 Using first_tsc for d32767v7 (7693 cycles) process_cpu_change: Activating pcpu 2 at offset 61284 Creating vcpu 2 for dom 32768 process_cpu_change: Activating pcpu 3 at offset 62128 Creating vcpu 3 for dom 32768 Creating vcpu 5 for dom 0 Creating vcpu 3 for dom 32767 Using first_tsc for d32767v3 (24609 cycles) Creating vcpu 4 for dom 0 Creating vcpu 2 for dom 32767 Using first_tsc for d32767v2 (2575 cycles) WARNING: Unexpected vcpu data type for d0v0 on proc 1! Expected 1 got 2. Not processing ] 84007(8:4:7) 0 [ ] WARNING: Unexpected vcpu data type for d0v0 on proc 1! Expected 1 got 2. Not processing ] 84006(8:4:6) 0 [ ] WARNING: Unexpected vcpu data type for d0v2 on proc 6! Expected 1 got 2. Not processing ] 84008(8:4:8) 0 [ ] WARNING: Unexpected vcpu data type for d0v2 on proc 6! Expected 1 got 2. 
Not processing ] 84008(8:4:8) 0 [ ] WARNING: Unexpected vcpu data type for d0v3 on proc 0! Expected 1 got 2. Not processing ] 84006(8:4:6) 0 [ ] Creating domain 90 Creating vcpu 0 for dom 90 Creating domain 89 Creating vcpu 0 for dom 89 Unknown hvm event: 84011 h-exit_reason 7b exit_reason_max 38! ] 81002(8:1:2) 2 [ 7b 100d9e ] And that's all. Since this seems to not be related to this fixes I think they should be applied. +1. (Ack is already there.) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 02/12] VMX: implement suppress #VE.
On 22/06/15 19:56, Ed White wrote:
> In preparation for selectively enabling #VE in a later patch, set
> suppress #VE on all EPTE's.
>
> Suppress #VE should always be the default condition for two reasons:
> it is generally not safe to deliver #VE into a guest unless that guest
> has been modified to receive it; and even then for most EPT violations only
> the hypervisor is able to handle the violation.
>
> Signed-off-by: Ed White <edmund.h.wh...@intel.com>
> ---
>  xen/arch/x86/mm/p2m-ept.c | 25 -
>  1 file changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index a6c9adf..5de3387 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -41,7 +41,7 @@
>  #define is_epte_superpage(ept_entry)    ((ept_entry)->sp)
>  static inline bool_t is_epte_valid(ept_entry_t *e)
>  {
> -    return (e->epte != 0 && e->sa_p2mt != p2m_invalid);
> +    return ((e->epte & ~(1ul << 63)) != 0 && e->sa_p2mt != p2m_invalid);

It might be nice to leave a comment explaining that epte.suppress_ve is
not considered as part of validity. This avoids a rather opaque mask
against a magic number.

Otherwise, Reviewed-by: Andrew Cooper <andrew.coop...@citrix.com>

>  }
>
>  /* returns : 0 for success, -errno otherwise */
> @@ -219,6 +219,8 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
>  static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
>  {
>      struct page_info *pg;
> +    ept_entry_t *table;
> +    unsigned int i;
>
>      pg = p2m_alloc_ptp(p2m, 0);
>      if ( pg == NULL )
> @@ -232,6 +234,15 @@ static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
>      /* Manually set A bit to avoid overhead of MMU having to write it later. */
>      ept_entry->a = 1;
>
> +    ept_entry->suppress_ve = 1;
> +
> +    table = __map_domain_page(pg);
> +
> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
> +        table[i].suppress_ve = 1;
> +
> +    unmap_domain_page(table);
> +
>      return 1;
>  }
>
> @@ -281,6 +292,7 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
>          epte->sp = (level > 1);
>          epte->mfn += i * trunk;
>          epte->snp = (iommu_enabled && iommu_snoop);
> +        epte->suppress_ve = 1;
>
>          ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
>
> @@ -790,6 +802,8 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>          ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
>      }
>
> +    new_entry.suppress_ve = 1;
> +
>      rc = atomic_write_ept_entry(ept_entry, new_entry, target);
>      if ( unlikely(rc) )
>          old_entry.epte = 0;
> @@ -,6 +1125,8 @@ static void ept_flush_pml_buffers(struct p2m_domain *p2m)
>  int ept_p2m_init(struct p2m_domain *p2m)
>  {
>      struct ept_data *ept = &p2m->ept;
> +    ept_entry_t *table;
> +    unsigned int i;
>
>      p2m->set_entry = ept_set_entry;
>      p2m->get_entry = ept_get_entry;
> @@ -1134,6 +1150,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
>          p2m->flush_hardware_cached_dirty = ept_flush_pml_buffers;
>      }
>
> +    table = map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
> +
> +    for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
> +        table[i].suppress_ve = 1;
> +
> +    unmap_domain_page(table);
> +
>      if ( !zalloc_cpumask_var(&ept->synced_mask) )
>          return -ENOMEM;
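One way to address the review comment above, replacing the bare `1ul << 63` with a named mask plus an explanatory comment, might look like the following standalone sketch. It is an assumption-laden illustration rather than the code that was eventually committed: EPTE_SUPPRESS_VE, sketch_type and epte_is_valid() are hypothetical names, and the p2m type is passed as a separate argument here, whereas in the patch it is the sa_p2mt bit-field of the entry itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of an EPT entry is the "suppress #VE" control. */
#define EPTE_SUPPRESS_VE (UINT64_C(1) << 63)

/* Hypothetical stand-ins for p2m types; the values are arbitrary. */
enum sketch_type { SKETCH_TYPE_INVALID, SKETCH_TYPE_RAM };

/*
 * suppress_ve is deliberately not considered part of validity: it is set
 * on every entry, including otherwise empty ones, so it is masked out
 * before testing whether the entry holds anything.
 */
static bool epte_is_valid(uint64_t epte, enum sketch_type type)
{
    return (epte & ~EPTE_SUPPRESS_VE) != 0 && type != SKETCH_TYPE_INVALID;
}
```

An entry whose only set bit is suppress_ve is therefore still reported as not valid, which matches the intent of the masked comparison in the hunk under review.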
[Xen-devel] [rumpuserxen test] 58871: regressions - FAIL
flight 58871 rumpuserxen real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58871/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-rumpuserxen       5 rumpuserxen-build    fail REGR. vs. 33866
 build-i386-rumpuserxen        5 rumpuserxen-build    fail REGR. vs. 33866

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386     1 build-check(1)    blocked n/a
 test-amd64-amd64-rumpuserxen-amd64   1 build-check(1)    blocked n/a

version targeted for testing:
 rumpuserxen          3b91e44996ea6ae1276bce1cc44f38701c53ee6f
baseline version:
 rumpuserxen          30d72f3fc5e35cd53afd82c8179cc0e0b11146ad

People who touched revisions under test:
  Antti Kantee <po...@iki.fi>
  Ian Jackson <ian.jack...@eu.citrix.com>
  Martin Lucina <mar...@lucina.net>
  Wei Liu <l...@liuw.name>

jobs:
 build-amd64-xsm                          pass
 build-i386-xsm                           pass
 build-amd64                              pass
 build-i386                               pass
 build-amd64-pvops                        pass
 build-i386-pvops                         pass
 build-amd64-rumpuserxen                  fail
 build-i386-rumpuserxen                   fail
 test-amd64-amd64-rumpuserxen-amd64       blocked
 test-amd64-i386-rumpuserxen-i386         blocked

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

Not pushing.

(No revision log; it would be 535 lines long.)
[Xen-devel] [PATCH v4 15/17] x86/hvm: use ioreq_t to track in-flight state
Use an ioreq_t rather than open coded state, size, dir and data fields in struct hvm_vcpu_io. This also allows PIO completion to be handled similarly to MMIO completion by re-issuing the handle_pio() call. Signed-off-by: Paul Durrant paul.durr...@citrix.com Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/hvm/emulate.c | 35 +-- xen/arch/x86/hvm/hvm.c | 15 --- xen/arch/x86/hvm/svm/nestedsvm.c |2 +- xen/arch/x86/hvm/vmx/realmode.c |4 ++-- xen/include/asm-x86/hvm/vcpu.h | 12 5 files changed, 36 insertions(+), 32 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index 6f538bf..6c50ef5 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -92,6 +92,7 @@ static int hvmemul_do_io( .df = df, .data = data, .data_is_ptr = data_is_addr, /* ioreq_t field name is misleading */ +.state = STATE_IOREQ_READY, }; void *p_data = (void *)data; int rc; @@ -129,12 +130,24 @@ static int hvmemul_do_io( } } -switch ( vio-io_state ) +switch ( vio-io_req.state ) { case STATE_IOREQ_NONE: break; case STATE_IORESP_READY: -vio-io_state = STATE_IOREQ_NONE; +vio-io_req.state = STATE_IOREQ_NONE; +p = vio-io_req; + +/* Verify the emulation request has been correctly re-issued */ +if ( (p.type != is_mmio ? 
IOREQ_TYPE_COPY : IOREQ_TYPE_PIO) || + (p.addr != addr) || + (p.size != size) || + (p.count != reps) || + (p.dir != dir) || + (p.df != df) || + (p.data_is_ptr != data_is_addr) ) +domain_crash(curr-domain); + if ( data_is_addr || dir == IOREQ_WRITE ) return X86EMUL_UNHANDLEABLE; goto finish_access; @@ -142,11 +155,6 @@ static int hvmemul_do_io( return X86EMUL_UNHANDLEABLE; } -vio-io_state = STATE_IOREQ_READY; -vio-io_size = size; -vio-io_dir = dir; -vio-io_data_is_addr = data_is_addr; - if ( dir == IOREQ_WRITE ) { if ( !data_is_addr ) @@ -155,13 +163,14 @@ static int hvmemul_do_io( hvmtrace_io_assist(p); } +vio-io_req = p; + rc = hvm_io_intercept(p); switch ( rc ) { case X86EMUL_OKAY: -vio-io_data = p.data; -vio-io_state = STATE_IOREQ_NONE; +vio-io_req.state = STATE_IOREQ_NONE; break; case X86EMUL_UNHANDLEABLE: { @@ -172,15 +181,13 @@ static int hvmemul_do_io( if ( !s ) { rc = hvm_process_io_intercept(null_handler, p); -if ( rc == X86EMUL_OKAY ) -vio-io_data = p.data; -vio-io_state = STATE_IOREQ_NONE; +vio-io_req.state = STATE_IOREQ_NONE; } else { rc = hvm_send_assist_req(s, p); if ( rc != X86EMUL_RETRY ) -vio-io_state = STATE_IOREQ_NONE; +vio-io_req.state = STATE_IOREQ_NONE; else if ( data_is_addr || dir == IOREQ_WRITE ) rc = X86EMUL_OKAY; } @@ -199,7 +206,7 @@ static int hvmemul_do_io( hvmtrace_io_assist(p); if ( !data_is_addr ) -memcpy(p_data, vio-io_data, size); +memcpy(p_data, p.data, size); } if ( is_mmio !data_is_addr ) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 7411287..8abf29b 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -421,11 +421,11 @@ static void hvm_io_assist(ioreq_t *p) if ( HVMIO_NEED_COMPLETION(vio) ) { -vio-io_state = STATE_IORESP_READY; -vio-io_data = p-data; +vio-io_req.state = STATE_IORESP_READY; +vio-io_req.data = p-data; } else -vio-io_state = STATE_IOREQ_NONE; +vio-io_req.state = STATE_IOREQ_NONE; msix_write_completion(curr); vcpu_end_shutdown_deferral(curr); @@ -501,11 +501,12 @@ void 
hvm_do_resume(struct vcpu *v) (void)handle_mmio(); break; case HVMIO_pio_completion: -if ( vio-io_size == 4 ) /* Needs zero extension. */ -guest_cpu_user_regs()-rax = (uint32_t)vio-io_data; +if ( vio-io_req.size == 4 ) /* Needs zero extension. */ +guest_cpu_user_regs()-rax = (uint32_t)vio-io_req.data; else -memcpy(guest_cpu_user_regs()-rax, vio-io_data, vio-io_size); -vio-io_state = STATE_IOREQ_NONE; +memcpy(guest_cpu_user_regs()-rax, vio-io_req.data, + vio-io_req.size); +vio-io_req.state = STATE_IOREQ_NONE; break; case HVMIO_realmode_completion: { diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c index 8b165c6..78667a2 100644 --- a/xen/arch/x86/hvm/svm/nestedsvm.c
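The HVMIO_pio_completion case above treats 4-byte port reads differently from 1- and 2-byte ones: a 32-bit result zero-extends into the full 64-bit register, while narrower results replace only the low bytes. A minimal standalone sketch of that merge follows. The function name merge_pio_result() is hypothetical, and it uses masking rather than the patch's memcpy(); the two should be equivalent on little-endian x86, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the new 64-bit accumulator value after a
 * port read of 'size' bytes returned 'data'.
 */
static uint64_t merge_pio_result(uint64_t rax, uint64_t data,
                                 unsigned int size)
{
    uint64_t mask;

    assert(size == 1 || size == 2 || size == 4);

    /* A 32-bit write zero-extends into the upper half of the register. */
    if (size == 4)
        return (uint32_t)data;

    /* 8- and 16-bit writes leave the remaining upper bits untouched. */
    mask = (UINT64_C(1) << (size * 8)) - 1;
    return (rax & ~mask) | (data & mask);
}
```

For instance, a 2-byte read returning 0xAABB into a register holding 0x1122334455667788 leaves 0x112233445566AABB, whereas a 4-byte read returning 0xAABBCCDD leaves 0x00000000AABBCCDD.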
[Xen-devel] [PATCH v4 14/17] x86/hvm: remove hvm_io_state enumeration
Emulation request status is already covered by STATE_IOREQ_XXX values so just use those. The mapping is: HVMIO_none- STATE_IOREQ_NONE HVMIO_awaiting_completion - STATE_IOREQ_READY HVMIO_completed - STATE_IORESP_READY Signed-off-by: Paul Durrant paul.durr...@citrix.com Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/hvm/emulate.c | 14 +++--- xen/arch/x86/hvm/hvm.c |6 +++--- xen/arch/x86/hvm/svm/nestedsvm.c |2 +- xen/arch/x86/hvm/vmx/realmode.c |4 ++-- xen/include/asm-x86/hvm/vcpu.h | 10 ++ 5 files changed, 15 insertions(+), 21 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index c10adad..6f538bf 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -131,10 +131,10 @@ static int hvmemul_do_io( switch ( vio-io_state ) { -case HVMIO_none: +case STATE_IOREQ_NONE: break; -case HVMIO_completed: -vio-io_state = HVMIO_none; +case STATE_IORESP_READY: +vio-io_state = STATE_IOREQ_NONE; if ( data_is_addr || dir == IOREQ_WRITE ) return X86EMUL_UNHANDLEABLE; goto finish_access; @@ -142,7 +142,7 @@ static int hvmemul_do_io( return X86EMUL_UNHANDLEABLE; } -vio-io_state = HVMIO_awaiting_completion; +vio-io_state = STATE_IOREQ_READY; vio-io_size = size; vio-io_dir = dir; vio-io_data_is_addr = data_is_addr; @@ -161,7 +161,7 @@ static int hvmemul_do_io( { case X86EMUL_OKAY: vio-io_data = p.data; -vio-io_state = HVMIO_none; +vio-io_state = STATE_IOREQ_NONE; break; case X86EMUL_UNHANDLEABLE: { @@ -174,13 +174,13 @@ static int hvmemul_do_io( rc = hvm_process_io_intercept(null_handler, p); if ( rc == X86EMUL_OKAY ) vio-io_data = p.data; -vio-io_state = HVMIO_none; +vio-io_state = STATE_IOREQ_NONE; } else { rc = hvm_send_assist_req(s, p); if ( rc != X86EMUL_RETRY ) -vio-io_state = HVMIO_none; +vio-io_state = STATE_IOREQ_NONE; else if ( data_is_addr || dir == IOREQ_WRITE ) rc = X86EMUL_OKAY; } diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 
4458fa4..7411287 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -421,11 +421,11 @@ static void hvm_io_assist(ioreq_t *p) if ( HVMIO_NEED_COMPLETION(vio) ) { -vio-io_state = HVMIO_completed; +vio-io_state = STATE_IORESP_READY; vio-io_data = p-data; } else -vio-io_state = HVMIO_none; +vio-io_state = STATE_IOREQ_NONE; msix_write_completion(curr); vcpu_end_shutdown_deferral(curr); @@ -505,7 +505,7 @@ void hvm_do_resume(struct vcpu *v) guest_cpu_user_regs()-rax = (uint32_t)vio-io_data; else memcpy(guest_cpu_user_regs()-rax, vio-io_data, vio-io_size); -vio-io_state = HVMIO_none; +vio-io_state = STATE_IOREQ_NONE; break; case HVMIO_realmode_completion: { diff --git a/xen/arch/x86/hvm/svm/nestedsvm.c b/xen/arch/x86/hvm/svm/nestedsvm.c index be5797a..8b165c6 100644 --- a/xen/arch/x86/hvm/svm/nestedsvm.c +++ b/xen/arch/x86/hvm/svm/nestedsvm.c @@ -1231,7 +1231,7 @@ enum hvm_intblk nsvm_intr_blocked(struct vcpu *v) * Delay the injection because this would result in delivering * an interrupt *within* the execution of an instruction. 
*/ -if ( v-arch.hvm_vcpu.hvm_io.io_state != HVMIO_none ) +if ( v-arch.hvm_vcpu.hvm_io.io_state != STATE_IOREQ_NONE ) return hvm_intblk_shadow; if ( !nv-nv_vmexit_pending n2vmcb-exitintinfo.bytes != 0 ) { diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c index 5e56a1f..4135ad4 100644 --- a/xen/arch/x86/hvm/vmx/realmode.c +++ b/xen/arch/x86/hvm/vmx/realmode.c @@ -205,7 +205,7 @@ void vmx_realmode(struct cpu_user_regs *regs) vmx_realmode_emulate_one(hvmemul_ctxt); -if ( vio-io_state != HVMIO_none || vio-mmio_retry ) +if ( vio-io_state != STATE_IOREQ_NONE || vio-mmio_retry ) break; /* Stop emulating unless our segment state is not safe */ @@ -219,7 +219,7 @@ void vmx_realmode(struct cpu_user_regs *regs) } /* Need to emulate next time if we've started an IO operation */ -if ( vio-io_state != HVMIO_none ) +if ( vio-io_state != STATE_IOREQ_NONE ) curr-arch.hvm_vmx.vmx_emulate = 1; if ( !curr-arch.hvm_vmx.vmx_emulate !curr-arch.hvm_vmx.vmx_realmode ) diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h index 2830057..f797518 100644 --- a/xen/include/asm-x86/hvm/vcpu.h +++ b/xen/include/asm-x86/hvm/vcpu.h @@ -30,12 +30,6
[Xen-devel] [PATCH v4 11/17] x86/hvm: only call hvm_io_assist() from hvm_wait_for_io()
By removing the calls in hvmemul_do_io() (which is replaced by a single assignment) and hvm_complete_assist_request() (which is replaced by a call to process_portio_intercept() with a suitable set of ops) then hvm_io_assist() can be moved into hvm.c and made static (and hence be a candidate for inlining). This patch also fixes the I/O state test at the end of hvm_io_assist() to check the correct value. Since the ioreq server patch series was integrated the current ioreq state is no longer an indicator of in-flight I/O state, since an I/O sheduled by re-emulation may be targetted at a different ioreq server. Signed-off-by: Paul Durrant paul.durr...@citrix.com Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/hvm/emulate.c | 34 +--- xen/arch/x86/hvm/hvm.c | 70 +++--- xen/arch/x86/hvm/intercept.c |4 +-- xen/arch/x86/hvm/io.c| 39 --- xen/include/asm-x86/hvm/io.h |3 +- 5 files changed, 73 insertions(+), 77 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index eefe860..111987c 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -51,6 +51,32 @@ static void hvmtrace_io_assist(ioreq_t *p) trace_var(event, 0/*!cycles*/, size, buffer); } +static int null_read(struct hvm_io_handler *io_handler, + uint64_t addr, + uint64_t size, + uint64_t *data) +{ +*data = ~0ul; +return X86EMUL_OKAY; +} + +static int null_write(struct hvm_io_handler *handler, + uint64_t addr, + uint64_t size, + uint64_t data) +{ +return X86EMUL_OKAY; +} + +static const struct hvm_io_ops null_ops = { +.read = null_read, +.write = null_write +}; + +static struct hvm_io_handler null_handler = { +.ops = null_ops +}; + static int hvmemul_do_io( bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size, uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data) @@ -140,8 +166,7 @@ static int hvmemul_do_io( switch ( rc ) { case X86EMUL_OKAY: -p.state = STATE_IORESP_READY; 
-hvm_io_assist(p); +vio-io_data = p.data; vio-io_state = HVMIO_none; break; case X86EMUL_UNHANDLEABLE: @@ -152,8 +177,9 @@ static int hvmemul_do_io( /* If there is no suitable backing DM, just ignore accesses */ if ( !s ) { -hvm_complete_assist_req(p); -rc = X86EMUL_OKAY; +rc = hvm_process_io_intercept(null_handler, p); +if ( rc == X86EMUL_OKAY ) +vio-io_data = p.data; vio-io_state = HVMIO_none; } else diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 626c431..3365abb 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -411,6 +411,45 @@ bool_t hvm_io_pending(struct vcpu *v) return 0; } +static void hvm_io_assist(ioreq_t *p) +{ +struct vcpu *curr = current; +struct hvm_vcpu_io *vio = curr-arch.hvm_vcpu.hvm_io; +enum hvm_io_state io_state; + +p-state = STATE_IOREQ_NONE; + +io_state = vio-io_state; +vio-io_state = HVMIO_none; + +switch ( io_state ) +{ +case HVMIO_awaiting_completion: +vio-io_state = HVMIO_completed; +vio-io_data = p-data; +break; +case HVMIO_handle_mmio_awaiting_completion: +vio-io_state = HVMIO_completed; +vio-io_data = p-data; +(void)handle_mmio(); +break; +case HVMIO_handle_pio_awaiting_completion: +if ( vio-io_size == 4 ) /* Needs zero extension. */ +guest_cpu_user_regs()-rax = (uint32_t)p-data; +else +memcpy(guest_cpu_user_regs()-rax, p-data, vio-io_size); +break; +default: +break; +} + +if ( p-state == STATE_IOREQ_NONE ) +{ +msix_write_completion(curr); +vcpu_end_shutdown_deferral(curr); +} +} + static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p) { /* NB. Optimised for common case (p-state == STATE_IOREQ_NONE). 
*/ @@ -2667,37 +2706,6 @@ int hvm_send_assist_req(struct hvm_ioreq_server *s, ioreq_t *proto_p) return X86EMUL_UNHANDLEABLE; } -void hvm_complete_assist_req(ioreq_t *p) -{ -switch ( p-type ) -{ -case IOREQ_TYPE_PCI_CONFIG: -ASSERT_UNREACHABLE(); -break; -case IOREQ_TYPE_COPY: -case IOREQ_TYPE_PIO: -if ( p-dir == IOREQ_READ ) -{ -if ( !p-data_is_ptr ) -p-data = ~0ul; -else -{ -int i, step = p-df ? -p-size : p-size; -uint32_t data = ~0; - -for ( i = 0; i p-count; i++ ) -hvm_copy_to_guest_phys(p-data + step * i, data, -
[Xen-devel] [PATCH v4 12/17] x86/hvm: split I/O completion handling from state model
The state of in-flight I/O and how its completion will be handled are
logically separate and conflating the two makes the code unnecessarily
confusing.

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/hvm.c            | 50
 xen/arch/x86/hvm/io.c             |  6
 xen/arch/x86/hvm/vmx/realmode.c   | 27
 xen/include/asm-x86/hvm/vcpu.h    | 16
 xen/include/asm-x86/hvm/vmx/vmx.h |  1
 5 files changed, 68 insertions(+), 32 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 3365abb..39f40ad 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -59,6 +59,7 @@
 #include <asm/hvm/trace.h>
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/event.h>
+#include <asm/hvm/vmx/vmx.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
 #include <public/sched.h>
@@ -428,26 +429,12 @@ static void hvm_io_assist(ioreq_t *p)
         vio->io_state = HVMIO_completed;
         vio->io_data = p->data;
         break;
-    case HVMIO_handle_mmio_awaiting_completion:
-        vio->io_state = HVMIO_completed;
-        vio->io_data = p->data;
-        (void)handle_mmio();
-        break;
-    case HVMIO_handle_pio_awaiting_completion:
-        if ( vio->io_size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)p->data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &p->data, vio->io_size);
-        break;
     default:
         break;
     }

-    if ( p->state == STATE_IOREQ_NONE )
-    {
-        msix_write_completion(curr);
-        vcpu_end_shutdown_deferral(curr);
-    }
+    msix_write_completion(curr);
+    vcpu_end_shutdown_deferral(curr);
 }

 static bool_t hvm_wait_for_io(struct hvm_ioreq_vcpu *sv, ioreq_t *p)
@@ -482,6 +469,7 @@ void hvm_do_resume(struct vcpu *v)
     struct hvm_vcpu_io *vio = &v->arch.hvm_vcpu.hvm_io;
     struct domain *d = v->domain;
     struct hvm_ioreq_server *s;
+    enum hvm_io_completion io_completion;

     check_wakeup_from_wait();

@@ -508,8 +496,36 @@ void hvm_do_resume(struct vcpu *v)
         }
     }

-    if ( vio->mmio_retry )
+    io_completion = vio->io_completion;
+    vio->io_completion = HVMIO_no_completion;
+
+    switch ( io_completion )
+    {
+    case HVMIO_no_completion:
+        break;
+    case HVMIO_mmio_completion:
         (void)handle_mmio();
+        break;
+    case HVMIO_pio_completion:
+        if ( vio->io_size == 4 ) /* Needs zero extension. */
+            guest_cpu_user_regs()->rax = (uint32_t)vio->io_data;
+        else
+            memcpy(&guest_cpu_user_regs()->rax, &vio->io_data, vio->io_size);
+        vio->io_state = HVMIO_none;
+        break;
+    case HVMIO_realmode_completion:
+    {
+        struct hvm_emulate_ctxt ctxt;
+
+        hvm_emulate_prepare(&ctxt, guest_cpu_user_regs());
+        vmx_realmode_emulate_one(&ctxt);
+        hvm_emulate_writeback(&ctxt);
+
+        break;
+    }
+    default:
+        BUG();
+    }

     /* Inject pending hw/sw trap */
     if ( v->arch.hvm_vcpu.inject_trap.vector != -1 )
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 61df6dd..27150e9 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -92,8 +92,8 @@ int handle_mmio(void)

     if ( rc != X86EMUL_RETRY )
         vio->io_state = HVMIO_none;
-    if ( vio->io_state == HVMIO_awaiting_completion )
-        vio->io_state = HVMIO_handle_mmio_awaiting_completion;
+    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+        vio->io_completion = HVMIO_mmio_completion;
     else
         vio->mmio_access = (struct npfec){};

@@ -158,7 +158,7 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
             return 0;
         /* Completion in hvm_io_assist() with no re-emulation required. */
         ASSERT(dir == IOREQ_READ);
-        vio->io_state = HVMIO_handle_pio_awaiting_completion;
+        vio->io_completion = HVMIO_pio_completion;
         break;
     default:
         gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
diff --git a/xen/arch/x86/hvm/vmx/realmode.c b/xen/arch/x86/hvm/vmx/realmode.c
index fe8b4a0..76ff9a5 100644
--- a/xen/arch/x86/hvm/vmx/realmode.c
+++ b/xen/arch/x86/hvm/vmx/realmode.c
@@ -101,15 +101,19 @@ static void realmode_deliver_exception(
     }
 }

-static void realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
+void vmx_realmode_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
     struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     int rc;

     perfc_incr(realmode_emulations);

     rc = hvm_emulate_one(hvmemul_ctxt);

+    if ( vio->io_state == HVMIO_awaiting_completion || vio->mmio_retry )
+        vio->io_completion = HVMIO_realmode_completion;
+
     if ( rc == X86EMUL_UNHANDLEABLE )
     {
         gdprintk(XENLOG_ERR, "Failed to emulate
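For readers skimming the series, the split that the commit message of 12/17 describes can be modelled outside Xen roughly as below. This is only a sketch under assumptions: the enum, struct and function names are simplified stand-ins invented for illustration (they are not the identifiers in vcpu.h), and the MMIO case just returns a marker where the real code would re-run handle_mmio().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the two concepts the patch separates. */
enum io_state { IO_none, IO_awaiting_completion, IO_completed };
enum io_completion { IO_no_completion, IO_mmio_completion, IO_pio_completion };

struct vcpu_io {
    enum io_state state;            /* where the in-flight request is */
    enum io_completion completion;  /* what to do once it has finished */
    uint64_t data;
    unsigned int size;
};

/* Emulator has answered: only the state model is touched here. */
static void io_assist(struct vcpu_io *vio, uint64_t data)
{
    if ( vio->state == IO_awaiting_completion )
    {
        vio->state = IO_completed;
        vio->data = data;
    }
}

/* vCPU resumes: the recorded completion is consumed exactly once. */
static int do_resume(struct vcpu_io *vio, uint64_t *rax)
{
    enum io_completion c = vio->completion;

    vio->completion = IO_no_completion;

    switch ( c )
    {
    case IO_no_completion:
        return 0;
    case IO_mmio_completion:
        return 1;                   /* real code would re-emulate here */
    case IO_pio_completion:
        assert(vio->size <= sizeof(vio->data));
        if ( vio->size == 4 )       /* needs zero extension */
            *rax = (uint32_t)vio->data;
        else
            memcpy(rax, &vio->data, vio->size);
        vio->state = IO_none;
        return 2;
    }
    return -1;
}
```

The point of the shape is that io_assist() no longer needs to know which kind of follow-up work is pending, so adding a new completion type (such as the realmode one in the patch) only touches the resume path.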
[Xen-devel] [PATCH v4 17/17] x86/hvm: track large memory mapped accesses by buffer offset
The code in hvmemul_do_io() that tracks large reads or writes, to avoid
re-issue of component I/O, is defeated by accesses across a page boundary
because it uses physical address. The code is also only relevant to memory
mapped I/O to or from a buffer.

This patch re-factors the code and moves it into hvmemul_phys_mmio_access()
where it is relevant and tracks using buffer offset rather than address.

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/emulate.c     | 98
 xen/include/asm-x86/hvm/vcpu.h | 16
 2 files changed, 48 insertions(+), 66 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index aa68787..4424dfc 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -107,29 +107,6 @@ static int hvmemul_do_io(
         return X86EMUL_UNHANDLEABLE;
     }

-    if ( is_mmio && !data_is_addr )
-    {
-        /* Part of a multi-cycle read or write? */
-        if ( dir == IOREQ_WRITE )
-        {
-            paddr_t pa = vio->mmio_large_write_pa;
-            unsigned int bytes = vio->mmio_large_write_bytes;
-            if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
-                return X86EMUL_OKAY;
-        }
-        else
-        {
-            paddr_t pa = vio->mmio_large_read_pa;
-            unsigned int bytes = vio->mmio_large_read_bytes;
-            if ( (addr >= pa) && ((addr + size) <= (pa + bytes)) )
-            {
-                memcpy(p_data, &vio->mmio_large_read[addr - pa],
-                       size);
-                return X86EMUL_OKAY;
-            }
-        }
-    }
-
     switch ( vio->io_req.state )
     {
     case STATE_IOREQ_NONE:
@@ -209,33 +186,6 @@ static int hvmemul_do_io(
             memcpy(p_data, &p.data, size);
     }

-    if ( is_mmio && !data_is_addr )
-    {
-        /* Part of a multi-cycle read or write? */
-        if ( dir == IOREQ_WRITE )
-        {
-            paddr_t pa = vio->mmio_large_write_pa;
-            unsigned int bytes = vio->mmio_large_write_bytes;
-            if ( bytes == 0 )
-                pa = vio->mmio_large_write_pa = addr;
-            if ( addr == (pa + bytes) )
-                vio->mmio_large_write_bytes += size;
-        }
-        else
-        {
-            paddr_t pa = vio->mmio_large_read_pa;
-            unsigned int bytes = vio->mmio_large_read_bytes;
-            if ( bytes == 0 )
-                pa = vio->mmio_large_read_pa = addr;
-            if ( (addr == (pa + bytes)) &&
-                 ((bytes + size) <= sizeof(vio->mmio_large_read)) )
-            {
-                memcpy(&vio->mmio_large_read[bytes], p_data, size);
-                vio->mmio_large_read_bytes += size;
-            }
-        }
-    }
-
     return X86EMUL_OKAY;
 }

@@ -601,8 +551,11 @@ static int hvmemul_virtual_to_linear(
 }

 static int hvmemul_phys_mmio_access(
-    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t **buffer)
+    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer,
+    unsigned int *off)
 {
+    struct vcpu *curr = current;
+    struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
     unsigned long one_rep = 1;
     unsigned int chunk;
     int rc;
@@ -621,14 +574,41 @@ static int hvmemul_phys_mmio_access(

     for ( ;; )
     {
-        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
-                                    *buffer);
-        if ( rc != X86EMUL_OKAY )
-            break;
+        /* Have we already done this chunk? */
+        if ( (*off + chunk) <= vio->mmio_cache[dir].size )
+        {
+            ASSERT(*off + chunk <= vio->mmio_cache[dir].size);
+
+            if ( dir == IOREQ_READ )
+                memcpy(&buffer[*off],
+                       &vio->mmio_cache[IOREQ_READ].buffer[*off],
+                       chunk);
+            else
+            {
+                if ( memcmp(&buffer[*off],
+                            &vio->mmio_cache[IOREQ_WRITE].buffer[*off],
+                            chunk) != 0 )
+                    domain_crash(curr->domain);
+            }
+        }
+        else
+        {
+            ASSERT(*off == vio->mmio_cache[dir].size);
+
+            rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
+                                        &buffer[*off]);
+            if ( rc != X86EMUL_OKAY )
+                break;
+
+            /* Note that we have now done this chunk */
+            memcpy(&vio->mmio_cache[dir].buffer[*off],
+                   &buffer[*off], chunk);
+            vio->mmio_cache[dir].size += chunk;
+        }

         /* Advance to the next chunk */
         gpa += chunk;
-        *buffer += chunk;
+        *off += chunk;
         size -= chunk;

         if ( size == 0 )
@@ -651,7 +631,7 @@ static int hvmemul_linear_mmio_access(
 {
     struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
     unsigned long page_off =
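To illustrate why 17/17 keys the cache on buffer offset rather than physical address, here is a small standalone model. It is a sketch, not the patch: the cache capacity, the fake device and every name below are assumptions made up for the example. The two chunks deliberately sit at unrelated physical addresses, as happens when a linear access crosses a page boundary, yet replay still works because only the offset into the emulation buffer is compared.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CACHE_BYTES 64   /* assumed capacity, not taken from the patch */

struct mmio_cache {
    unsigned int size;            /* bytes already issued, from offset 0 */
    uint8_t buffer[CACHE_BYTES];
};

static unsigned int device_reads; /* counts accesses to the fake device */

/* Hypothetical device: every byte reads back as the low byte of its address. */
static void device_read(uint64_t gpa, uint8_t *dst, unsigned int len)
{
    for ( unsigned int i = 0; i < len; i++ )
        dst[i] = (uint8_t)(gpa + i);
    device_reads++;
}

/*
 * Read one chunk into buffer[off..off+chunk). If an earlier emulation pass
 * already issued this chunk, replay it from the cache instead of touching
 * the device again.
 */
static void read_chunk(struct mmio_cache *c, uint64_t gpa, uint8_t *buffer,
                       unsigned int off, unsigned int chunk)
{
    assert(off + chunk <= CACHE_BYTES);

    if ( off + chunk <= c->size )
        memcpy(&buffer[off], &c->buffer[off], chunk);
    else
    {
        assert(off == c->size);   /* chunks are issued strictly in order */
        device_read(gpa, &buffer[off], chunk);
        memcpy(&c->buffer[off], &buffer[off], chunk);
        c->size += chunk;
    }
}
```

Calling read_chunk() for offsets 0 and 4 twice in a row (first pass, then re-emulation) should hit the device only on the first pass; an address-based check would have failed to recognise the second chunk as part of the same access.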
[Xen-devel] [PATCH v4 16/17] x86/hvm: always re-emulate I/O from a buffer
If memory mapped I/O is 'chunked' then the I/O must be re-emulated,
otherwise only the first chunk will be processed. This patch makes sure all
I/O from a buffer is re-emulated regardless of whether it is a read or a
write.

Signed-off-by: Paul Durrant paul.durr...@citrix.com
Cc: Keir Fraser k...@xen.org
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/hvm/emulate.c     |  4
 xen/arch/x86/hvm/hvm.c         | 13
 xen/include/asm-x86/hvm/vcpu.h |  3
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 6c50ef5..aa68787 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -148,7 +148,7 @@ static int hvmemul_do_io(
              (p.data_is_ptr != data_is_addr) )
             domain_crash(curr->domain);

-        if ( data_is_addr || dir == IOREQ_WRITE )
+        if ( data_is_addr )
             return X86EMUL_UNHANDLEABLE;
         goto finish_access;
     default:
@@ -188,7 +188,7 @@ static int hvmemul_do_io(
             rc = hvm_send_assist_req(s, &p);
             if ( rc != X86EMUL_RETRY )
                 vio->io_req.state = STATE_IOREQ_NONE;
-            else if ( data_is_addr || dir == IOREQ_WRITE )
+            else if ( data_is_addr )
                 rc = X86EMUL_OKAY;
         }
         break;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 8abf29b..c062c9f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -501,11 +501,14 @@ void hvm_do_resume(struct vcpu *v)
         (void)handle_mmio();
         break;
     case HVMIO_pio_completion:
-        if ( vio->io_req.size == 4 ) /* Needs zero extension. */
-            guest_cpu_user_regs()->rax = (uint32_t)vio->io_req.data;
-        else
-            memcpy(&guest_cpu_user_regs()->rax, &vio->io_req.data,
-                   vio->io_req.size);
+        if ( vio->io_req.dir == IOREQ_READ )
+        {
+            if ( vio->io_req.size == 4 ) /* Needs zero extension. */
+                guest_cpu_user_regs()->rax = (uint32_t)vio->io_req.data;
+            else
+                memcpy(&guest_cpu_user_regs()->rax, &vio->io_req.data,
+                       vio->io_req.size);
+        }
         vio->io_req.state = STATE_IOREQ_NONE;
         break;
     case HVMIO_realmode_completion:
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 7338638..008c8fa 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -49,8 +49,7 @@ struct hvm_vcpu_io {

 #define HVMIO_NEED_COMPLETION(_vio) \
     ( ((_vio)->io_req.state == STATE_IOREQ_READY) && \
-      !(_vio)->io_req.data_is_ptr && \
-      ((_vio)->io_req.dir == IOREQ_READ) )
+      !(_vio)->io_req.data_is_ptr )

 /*
  * HVM emulation:
--
1.7.10.4

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Re: [Xen-devel] [libvirt test] 58119: regressions - FAIL
On Tue, 2015-06-23 at 14:38 +0100, Ian Campbell wrote:
On Tue, 2015-06-23 at 12:15 +0100, Anthony PERARD wrote:
On Mon, Jun 08, 2015 at 10:22:28AM +0100, Ian Campbell wrote:
On Mon, 2015-06-08 at 04:37 +, osstest service user wrote:

flight 58119 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58119/

Regressions :-(

Tests which did not succeed and are blocking, including tests which could not be run:

This has been failing for a while now, sorry for not bringing it to your attention sooner.

libxl: debug: libxl_event.c:638:libxl__ev_xswatch_deregister: watch w=0x7f805c25b248 wpath=/local/domain/0/device-model/1/state token=3/0: deregister slotnum=3
libxl: error: libxl_exec.c:393:spawn_watch_event: domain 1 device model: startup timed out
libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch w=0x7f805c25b248: deregister unregistered
libxl: debug: libxl_event.c:652:libxl__ev_xswatch_deregister: watch w=0x7f805c25b248: deregister unregistered
libxl: error: libxl_dm.c:1564:device_model_spawn_outcome: domain 1 device model: spawn failed (rc=-3)
libxl: error: libxl_create.c:1373:domcreate_devmodel_started: device model did not start: -3

Hi, I've tried to debug this "device model: startup timed out" issue that I'm seeing on OpenStack. What I've done is strace every single QEMU. It appears that QEMU takes more than 10s to load...

Looking through http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-libvirt/ALL.html when it passes the collected var-log-libvirt-libxl-libxl-driver.log.gz seems to indicate that the device model is successfully spawned in 2-4s. The same is true of the tests run on the Cambridge instance.

So, can we take Anthony's code/instrumentation for stracing QEMU and do the same in the ad-hoc run of the test on merlot? The goal would be to have something like what he attached to his email (the strace output) for our failing case on merlot.

That's assuming that what Anthony has done to get the traces could be put in a patch to libxl and/or libvirt, applied to some branch, and that the ad-hoc test could be made to pick code for the proper components from such a branch... which, I think, should all be doable, or am I talking nonsense?

Regards,
Dario

--
This happens because I choose it to happen! (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
Re: [Xen-devel] [PATCH v4 04/17] x86/hvm: remove multiple open coded 'chunking' loops
-----Original Message-----
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: 24 June 2015 13:34
To: Paul Durrant
Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
Subject: Re: [PATCH v4 04/17] x86/hvm: remove multiple open coded 'chunking' loops

On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:

+static int hvmemul_phys_mmio_access(
+    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t **buffer)

As much as the earlier offset you returned via indirection to the caller was unnecessary, the indirection here seems pointless too. All callers know how (or don't care) to update the buffer pointer.

Ok. Personally I'd prefer one thing to be in charge of updating the pointer though.

+static int hvmemul_linear_mmio_access(
+    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
+    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t translate)
+{
+    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
+    unsigned long page_off = gla & (PAGE_SIZE - 1);
+    unsigned int chunk;
+    paddr_t gpa;
+    unsigned long one_rep = 1;
+    int rc;
+
+    chunk = min_t(unsigned int, size, PAGE_SIZE - page_off);
+
+    if ( translate )
+        gpa = pfn_to_paddr(vio->mmio_gpfn) | page_off;

"translate" as name for the parameter signaling that the translation is known is kind of odd - "translated" or "known_gpfn" or some such? Or invert the meaning?

Ok. I think I'll go with the latter.

+    else
+    {
+        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+                                    hvmemul_ctxt);
+        if ( rc != X86EMUL_OKAY )
+            return rc;
+    }
+
+    for ( ;; )
+    {
+        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, &buffer);
+        if ( rc != X86EMUL_OKAY )
+            break;
+
+        gla += chunk;
+        gpa += chunk;
+        size -= chunk;
+
+        if ( size == 0 )
+            break;
+
+        ASSERT((gla & (PAGE_SIZE - 1)) == 0);
+        chunk = min_t(unsigned int, size, PAGE_SIZE);

I think you could just assert that size is now less than PAGE_SIZE.

True.

+        if ( !translate )
+        {
+            rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
+                                        hvmemul_ctxt);
+            if ( rc != X86EMUL_OKAY )
+                return rc;
+        }

This must be done unconditionally (and gpa doesn't need updating above then), as the known translation is only for the first byte (and whatever falls on the same page).

Yes indeed, I somehow missed that before.

Paul

Jan
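Jan's last point (the cached translation only covers the first page, so every subsequent page needs its own linear-to-physical lookup) can be shown with a toy model. This is a sketch under loud assumptions: linear_to_phys() below is an invented mapping that scatters pages on purpose, and split_access() and struct access are hypothetical names, not anything in emulate.c.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical translation: linear page N lives in physical page N * 7 + 3. */
static uint64_t linear_to_phys(uint64_t gla)
{
    uint64_t page = gla / PAGE_SIZE;

    return (page * 7 + 3) * PAGE_SIZE + (gla % PAGE_SIZE);
}

struct access { uint64_t gpa; unsigned int len; };

/*
 * Split [gla, gla + size) at page boundaries and translate each page on its
 * own. Returns the number of chunks stored in 'out' (at most 'max').
 */
static unsigned int split_access(uint64_t gla, unsigned int size,
                                 struct access *out, unsigned int max)
{
    unsigned int n = 0;
    unsigned int page_off = gla & (PAGE_SIZE - 1);
    unsigned int chunk = size < PAGE_SIZE - page_off ? size
                                                     : PAGE_SIZE - page_off;

    while ( size != 0 && n < max )
    {
        out[n].gpa = linear_to_phys(gla);   /* translated for every page */
        out[n].len = chunk;
        n++;

        gla += chunk;
        size -= chunk;
        if ( size == 0 )
            break;

        /* After the first chunk we always sit on a page boundary. */
        assert((gla & (PAGE_SIZE - 1)) == 0);
        chunk = size < PAGE_SIZE ? size : PAGE_SIZE;
    }

    return n;
}
```

With this mapping an 8-byte access at linear 0x1ffc yields two chunks whose physical addresses are not adjacent, which is exactly the case where simply doing gpa += chunk across the boundary would go wrong.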
Re: [Xen-devel] [PATCH RFC v1 00/13] Introduce HMV without dm and new boot ABI
On 24/06/15 at 12.05, Jan Beulich wrote:
On 24.06.15 at 11:47, roger@citrix.com wrote:

What needs to be done (ordered by priority):

 - Clean up the patches, this patch series was done in less than a week.
 - Finish the boot ABI (this would also be needed for PVH anyway).
 - Convert the rest of xc_dom_*loaders in order to use the physical entry point when present, right now xc_dom_elfloader is the only one usable with HVMlite. This is quite trivial (see patch 10, it's a 4 LOC change).
 - Dom0 support.
 - Migration.
 - PCI pass-through.

IMHO this is what we agreed to do with PVH, make it an HVM guest without a device model and without the emulated devices inside of Xen. Sooner or later we would need to make that change anyway in order to properly integrate PVH into Xen, and we get a bunch of new features for free as compared to PVH.

I don't think of this as "throw PVH out of the window and start something completely new from scratch", we are going to reuse some of the code paths used by PVH inside of Xen. From a guest POV the changes needed to move from PVH into HVMlite are regarding the boot ABI only, which we already agreed that should be changed anyway.

I have to admit that I'm having a hard time making myself a clear picture of what the intention now is, namely with feature freeze being in about 2.5 weeks: If we assume that this series gets ready in time, should we drop Boris' 32-bit support patches? Would then be unfortunate if the series here didn't get ready.

TBH I'm not going to make any promises of this being ready before the 4.6 feature freeze, not until I get some feedback from the tools maintainers regarding the libxc changes to unify the PV and HVM domain creation paths.

Otoh I don't think this and Boris' code conflict, and what we got in the tree PVH-wise is kind of a mess right now anyway, so adding to it just a few more bits (actually getting rid of some fixme-s, i.e. reducing messiness), so I'd be inclined to take the rest of Boris' series once ready, and if the series here gets ready too it could then also go in. Which would then mean for someone (perhaps after 4.6 was branched) to clean up any no longer necessary PVH special cases, unifying things towards what we seem to now call HVMlite.

I'm not against merging the 32bit support series for PVH, but I'm certainly not going to invest time in adding 32bit PVH entry points to any OSes.

Roger.
Re: [Xen-devel] [RFC PATCH v3 09/18] xen/arm: ITS: Add virtual ITS commands support
Hi Vijay,

On 22/06/15 13:01, vijay.kil...@gmail.com wrote:

From: Vijaya Kumar K vijaya.ku...@caviumnetworks.com

Add Virtual ITS command processing support to Virtual ITS driver

Signed-off-by: Vijaya Kumar K vijaya.ku...@caviumnetworks.com
---
 xen/arch/arm/gic-v3-its.c  |   7 +
 xen/arch/arm/vgic-v3-its.c | 393
 2 files changed, 400 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 535fc53..2a4fa97 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -89,6 +89,7 @@ struct its_node {

 #define ITS_ITT_ALIGN    SZ_256

+static u32 id_bits;
 static LIST_HEAD(its_nodes);
 static DEFINE_SPINLOCK(its_lock);
 static struct rdist_prop *gic_rdists;
@@ -146,6 +147,11 @@ void dump_cmd(its_cmd_block *cmd)
 }
 #endif

+u32 its_get_nr_events(void)
+{
+    return (1 << id_bits);
+}
+
 /* RB-tree helpers for its_device */
 struct its_device * find_its_device(struct rb_root *root, u32 devid)
 {
@@ -1044,6 +1050,7 @@ static int its_probe(struct dt_device_node *node)
     its->phys_size = its_size;
     typer = readl_relaxed(its_base + GITS_TYPER);
     its->ite_size = ((typer >> 4) & 0xf) + 1;
+    id_bits = GITS_TYPER_IDBITS(typer);

     its->cmd_base = xzalloc_bytes(ITS_CMD_QUEUE_SZ);
     if ( !its->cmd_base )
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index ea52a87..0671434 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -256,6 +256,399 @@ int remove_vits_device(struct rb_root *root, struct vits_device *dev)
     return 0;
 }

+static int vgic_its_process_sync(struct vcpu *v, struct vgic_its *vits,
+                                 its_cmd_block *virt_cmd)

virt_cmd is not modified, please use const.

+{
+    /* XXX: Ignored */

IMHO, XXX means TODO, which is not the case here.

+    DPRINTK("vITS:d%dv%d SYNC: ta 0x%x \n",
+            v->domain->domain_id, v->vcpu_id, virt_cmd->sync.ta);

You can use %pv rather than d%dv%d and directly pass the vcpu.

+
+    return 0;
+}
+
+static int vgic_its_process_mapvi(struct vcpu *v, struct vgic_its *vits,
+                                  its_cmd_block *virt_cmd)

Please use const for the virt_cmd.

+{
+    struct vitt entry;
+    struct vits_device *vdev;
+    uint8_t vcol_id, cmd;
+    uint32_t vid, dev_id, event;

struct domain *d = v->domain for a better abstraction in the code.

+
+    vcol_id = virt_cmd->mapvi.col;
+    vid = virt_cmd->mapvi.phy_id;
+    dev_id = its_decode_devid(v->domain, virt_cmd);

AFAIU, its_decode_devid will return a physical devID... although your function find_vits_device is working on virtual devIDs. This will also not work on a fake device. Did I miss something?

+    cmd = virt_cmd->mapvi.cmd;
+
+    DPRINTK("vITS:d%dv%d MAPVI: dev_id 0x%x vcol_id %d vid %d \n",
+            v->domain->domain_id, v->vcpu_id, dev_id, vcol_id, vid);
+
+    if ( vcol_id > (v->domain->max_vcpus + 1) || vid > its_get_nr_events() )

Checking the validity is pointless as a malicious guest can rewrite the ITT. We only need to check it when the LPI is effectively injected.

+        return -EINVAL;
+
+    /* XXX: Enable validation later */

What do you mean?

+    vdev = find_vits_device(&v->domain->arch.vits->dev_root, dev_id);
+    if ( !vdev && !vdev->its_dev )
+        return -EINVAL;

You deny the possibility of having a fake device in the domain. Anyway, this check is not necessary either.

+
+    entry.valid = true;
+    entry.vcollection = vcol_id;
+    entry.vlpi = vid;
+
+    if ( cmd == GITS_CMD_MAPI )
+        vits_set_vitt_entry(v->domain, dev_id, vid, &entry);
+    else
+    {
+        event = virt_cmd->mapvi.event;
+        if ( event > its_get_nr_events() )

You have hardcoded the number of events in the vGIC but you are using the physical ITS to check the value. IMHO, we should introduce a new field in the vITS to specify the number of events supported by the domain. For DOM0 it will be equal to the physical ITS. But this check is also unnecessary.

+            return -EINVAL;
+
+        vits_set_vitt_entry(v->domain, dev_id, event, &entry);
+    }
+
+    return 0;
+}
+
+static int vgic_its_process_movi(struct vcpu *v, struct vgic_its *vits,
+                                 its_cmd_block *virt_cmd)

virt_cmd is not modified, please use const.

+{
+    struct vitt entry;
+    struct vits_device *vdev;
+    uint32_t dev_id, event;
+    uint8_t vcol_id;

struct domain *d = v->domain for a better abstraction in the code.

+
+    dev_id = its_decode_devid(v->domain, virt_cmd);
+    vcol_id = virt_cmd->movi.col;
+    event = virt_cmd->movi.event;
+
+    DPRINTK("vITS:d%dv%d MOVI: dev_id 0x%x vcol_id %d event %d\n",
+            v->domain->domain_id, v->vcpu_id, dev_id, vcol_id, event);
+    if ( vcol_id > (v->domain->max_vcpus + 1) || event > its_get_nr_events() )
+        return -EINVAL;
+
+    /* Enable validation later
Re: [Xen-devel] [PATCH 1/2] NetBSDRump: provide evtchn.h and privcmd.h
On Wed, Jun 24, 2015 at 12:22:46PM +0200, Roger Pau Monné wrote:
On 24/06/15 at 12.10, Wei Liu wrote:

+#define IOCTL_PRIVCMD_MMAP \
+    _IOW('P', 2, privcmd_mmap_t)
+#define IOCTL_PRIVCMD_MMAPBATCH \
+    _IOW('P', 3, privcmd_mmapbatch_t)

FWIW you could have gotten away with just implementing IOCTL_PRIVCMD_MMAPBATCH, this is what I did on FreeBSD.

I was too lazy to change the libxc code so I implemented both in the rump kernel. It's just plumbing through some minios functions.

Wei.

Roger.
[Xen-devel] [PATCH v4 06/17] x86/hvm: unify internal portio and mmio intercepts
The implementation of mmio and portio intercepts is unnecessarily different. This leads to much code duplication. This patch unifies much of the intercept handling, leaving only distinct handlers for stdvga mmio and dpci portio. Subsequent patches will unify those handlers. Signed-off-by: Paul Durrant paul.durr...@citrix.com Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/hvm/emulate.c| 11 +- xen/arch/x86/hvm/hpet.c |4 +- xen/arch/x86/hvm/hvm.c|7 +- xen/arch/x86/hvm/intercept.c | 502 + xen/arch/x86/hvm/stdvga.c | 30 +- xen/arch/x86/hvm/vioapic.c|4 +- xen/arch/x86/hvm/vlapic.c |5 +- xen/arch/x86/hvm/vmsi.c |7 +- xen/drivers/passthrough/amd/iommu_guest.c | 30 +- xen/include/asm-x86/hvm/domain.h |1 + xen/include/asm-x86/hvm/io.h | 119 +++ 11 files changed, 350 insertions(+), 370 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index 4d11c6c..9ced81b 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -143,16 +143,7 @@ static int hvmemul_do_io( hvmtrace_io_assist(p); } -if ( is_mmio ) -{ -rc = hvm_mmio_intercept(p); -if ( rc == X86EMUL_UNHANDLEABLE ) -rc = hvm_buffered_io_intercept(p); -} -else -{ -rc = hvm_portio_intercept(p); -} +rc = hvm_io_intercept(p); switch ( rc ) { diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c index 9585ca8..8958873 100644 --- a/xen/arch/x86/hvm/hpet.c +++ b/xen/arch/x86/hvm/hpet.c @@ -504,7 +504,7 @@ static int hpet_range(struct vcpu *v, unsigned long addr) (addr (HPET_BASE_ADDRESS + HPET_MMAP_SIZE)) ); } -const struct hvm_mmio_ops hpet_mmio_ops = { +static const struct hvm_mmio_ops hpet_mmio_ops = { .check = hpet_range, .read = hpet_read, .write = hpet_write @@ -659,6 +659,8 @@ void hpet_init(struct domain *d) h-hpet.comparator64[i] = ~0ULL; h-pt[i].source = PTSRC_isa; } + +register_mmio_handler(d, hpet_mmio_ops); } void hpet_deinit(struct domain *d) diff --git a/xen/arch/x86/hvm/hvm.c 
b/xen/arch/x86/hvm/hvm.c index 535d622..c10db78 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1465,11 +1465,12 @@ int hvm_domain_initialise(struct domain *d) goto fail0; d-arch.hvm_domain.params = xzalloc_array(uint64_t, HVM_NR_PARAMS); -d-arch.hvm_domain.io_handler = xmalloc(struct hvm_io_handler); +d-arch.hvm_domain.io_handler = xzalloc_array(struct hvm_io_handler, + NR_IO_HANDLERS); rc = -ENOMEM; if ( !d-arch.hvm_domain.params || !d-arch.hvm_domain.io_handler ) goto fail1; -d-arch.hvm_domain.io_handler-num_slot = 0; +d-arch.hvm_domain.io_handler_count = 0; /* Set the default IO Bitmap. */ if ( is_hardware_domain(d) ) @@ -1506,6 +1507,8 @@ int hvm_domain_initialise(struct domain *d) rtc_init(d); +msixtbl_init(d); + register_portio_handler(d, 0xe9, 1, hvm_print_line); register_portio_handler(d, 0xcf8, 4, hvm_access_cf8); diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c index cc44733..4db024e 100644 --- a/xen/arch/x86/hvm/intercept.c +++ b/xen/arch/x86/hvm/intercept.c @@ -32,205 +32,97 @@ #include xen/event.h #include xen/iommu.h -static const struct hvm_mmio_ops *const -hvm_mmio_handlers[HVM_MMIO_HANDLER_NR] = +static bool_t hvm_mmio_accept(struct hvm_io_handler *handler, + uint64_t addr, + uint64_t size) { -hpet_mmio_ops, -vlapic_mmio_ops, -vioapic_mmio_ops, -msixtbl_mmio_ops, -iommu_mmio_ops -}; +BUG_ON(handler-type != IOREQ_TYPE_COPY); + +return handler-u.mmio.ops-check(current, addr); +} -static int hvm_mmio_access(struct vcpu *v, - ioreq_t *p, - hvm_mmio_read_t read, - hvm_mmio_write_t write) +static int hvm_mmio_read(struct hvm_io_handler *handler, + uint64_t addr, + uint64_t size, + uint64_t *data) { -struct hvm_vcpu_io *vio = v-arch.hvm_vcpu.hvm_io; -unsigned long data; -int rc = X86EMUL_OKAY, i, step = p-df ? 
-p-size : p-size; +BUG_ON(handler-type != IOREQ_TYPE_COPY); -if ( !p-data_is_ptr ) -{ -if ( p-dir == IOREQ_READ ) -{ -if ( vio-mmio_retrying ) -{ -if ( vio-mmio_large_read_bytes != p-size ) -return X86EMUL_UNHANDLEABLE; -memcpy(data, vio-mmio_large_read, p-size); -vio-mmio_large_read_bytes = 0; -
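As a rough picture of what "unifying" the portio and mmio intercepts in 06/17 amounts to, the sketch below keeps one per-domain-style table of typed handlers and routes every access through a single lookup. It is an illustration only: the struct layout, the range-based matching (the real MMIO handlers use a check() hook), the -1 "unhandled" return and both fake devices are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum io_type { TYPE_PIO, TYPE_MMIO };

typedef int (*io_read_fn)(uint64_t addr, uint64_t size, uint64_t *data);

struct io_handler {
    enum io_type type;
    uint64_t base, len;   /* hypothetical: simple range match for both types */
    io_read_fn read;
};

#define NR_HANDLERS 8
static struct io_handler handlers[NR_HANDLERS];
static unsigned int handler_count;

static bool register_handler(enum io_type type, uint64_t base, uint64_t len,
                             io_read_fn read)
{
    if ( handler_count == NR_HANDLERS )
        return false;

    handlers[handler_count++] = (struct io_handler){ type, base, len, read };
    return true;
}

/* Single entry point for both port I/O and MMIO reads. */
static int io_intercept(enum io_type type, uint64_t addr, uint64_t size,
                        uint64_t *data)
{
    for ( unsigned int i = 0; i < handler_count; i++ )
    {
        const struct io_handler *h = &handlers[i];

        if ( h->type == type && addr >= h->base &&
             addr + size <= h->base + h->len )
            return h->read(addr, size, data);
    }

    return -1;   /* nobody claimed it: would be forwarded elsewhere */
}

/* Two made-up devices used only to exercise the table. */
static int debug_port_read(uint64_t addr, uint64_t size, uint64_t *data)
{
    (void)size;
    *data = addr;            /* echoes the port number */
    return 0;
}

static int timer_read(uint64_t addr, uint64_t size, uint64_t *data)
{
    (void)size;
    *data = addr & 0x3ff;    /* returns the register offset */
    return 0;
}
```

The design benefit the commit message is after is visible even at this scale: the dispatch loop is written once, and the only per-type knowledge left is carried by the handler entries themselves.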
[Xen-devel] [PATCH v4 01/17] x86/hvm: simplify hvmemul_do_io()
Currently hvmemul_do_io() handles paging for I/O to/from a guest address inline. This causes every exit point to have to execute: if ( ram_page ) put_page(ram_page); This patch introduces wrapper hvmemul_do_io_addr() and hvmemul_do_io_buffer() functions. The latter is used for I/O to/from a Xen buffer and thus the complexity of paging can be restricted only to the former, making the common hvmemul_do_io() function less convoluted. This patch also tightens up some types and introduces pio/mmio wrappers for the above functions with comments to document their semantics. Signed-off-by: Paul Durrant paul.durr...@citrix.com Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/hvm/emulate.c| 278 - xen/arch/x86/hvm/io.c |4 +- xen/include/asm-x86/hvm/emulate.h | 17 ++- 3 files changed, 198 insertions(+), 101 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index ac9c9d6..9d7af0c 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -51,41 +51,23 @@ static void hvmtrace_io_assist(int is_mmio, ioreq_t *p) } static int hvmemul_do_io( -int is_mmio, paddr_t addr, unsigned long *reps, int size, -paddr_t ram_gpa, int dir, int df, void *p_data) +bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size, +uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data) { struct vcpu *curr = current; -struct hvm_vcpu_io *vio; +struct hvm_vcpu_io *vio = curr-arch.hvm_vcpu.hvm_io; ioreq_t p = { .type = is_mmio ? 
IOREQ_TYPE_COPY : IOREQ_TYPE_PIO, .addr = addr, .size = size, .dir = dir, .df = df, -.data = ram_gpa, -.data_is_ptr = (p_data == NULL), +.data = data, +.data_is_ptr = data_is_addr, /* ioreq_t field name is misleading */ }; -unsigned long ram_gfn = paddr_to_pfn(ram_gpa); -p2m_type_t p2mt; -struct page_info *ram_page; +void *p_data = (void *)data; int rc; -/* Check for paged out page */ -ram_page = get_page_from_gfn(curr-domain, ram_gfn, p2mt, P2M_UNSHARE); -if ( p2m_is_paging(p2mt) ) -{ -if ( ram_page ) -put_page(ram_page); -p2m_mem_paging_populate(curr-domain, ram_gfn); -return X86EMUL_RETRY; -} -if ( p2m_is_shared(p2mt) ) -{ -if ( ram_page ) -put_page(ram_page); -return X86EMUL_RETRY; -} - /* * Weird-sized accesses have undefined behaviour: we discard writes * and read all-ones. @@ -93,23 +75,10 @@ static int hvmemul_do_io( if ( unlikely((size sizeof(long)) || (size (size - 1))) ) { gdprintk(XENLOG_WARNING, bad mmio size %d\n, size); -ASSERT(p_data != NULL); /* cannot happen with a REP prefix */ -if ( dir == IOREQ_READ ) -memset(p_data, ~0, size); -if ( ram_page ) -put_page(ram_page); return X86EMUL_UNHANDLEABLE; } -if ( !p.data_is_ptr (dir == IOREQ_WRITE) ) -{ -memcpy(p.data, p_data, size); -p_data = NULL; -} - -vio = curr-arch.hvm_vcpu.hvm_io; - -if ( is_mmio !p.data_is_ptr ) +if ( is_mmio !data_is_addr ) { /* Part of a multi-cycle read or write? 
*/ if ( dir == IOREQ_WRITE ) @@ -117,11 +86,7 @@ static int hvmemul_do_io( paddr_t pa = vio-mmio_large_write_pa; unsigned int bytes = vio-mmio_large_write_bytes; if ( (addr = pa) ((addr + size) = (pa + bytes)) ) -{ -if ( ram_page ) -put_page(ram_page); return X86EMUL_OKAY; -} } else { @@ -131,8 +96,6 @@ static int hvmemul_do_io( { memcpy(p_data, vio-mmio_large_read[addr - pa], size); -if ( ram_page ) -put_page(ram_page); return X86EMUL_OKAY; } } @@ -144,40 +107,28 @@ static int hvmemul_do_io( break; case HVMIO_completed: vio-io_state = HVMIO_none; -if ( p_data == NULL ) -{ -if ( ram_page ) -put_page(ram_page); +if ( data_is_addr || dir == IOREQ_WRITE ) return X86EMUL_UNHANDLEABLE; -} goto finish_access; case HVMIO_dispatched: /* May have to wait for previous cycle of a multi-write to complete. */ -if ( is_mmio !p.data_is_ptr (dir == IOREQ_WRITE) +if ( is_mmio !data_is_addr (dir == IOREQ_WRITE) (addr == (vio-mmio_large_write_pa + vio-mmio_large_write_bytes)) ) -{ -if ( ram_page ) -put_page(ram_page); return X86EMUL_RETRY; -} /* fallthrough */ default: -
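The motivation given in 01/17 (every exit path having to remember put_page()) is a general pattern, and a stripped-down model may help readers who have not looked at emulate.c. This is a sketch with invented names: struct page, the return codes and the three functions are stand-ins for illustration, not the hvmemul_do_io*() signatures from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted guest page. */
struct page { int refs; bool paged_out; };

enum { OKAY, RETRY, UNHANDLEABLE };

/* Core routine: several exit points, none of which touch the page. */
static int do_io(unsigned int size, bool completed)
{
    if ( size == 0 || (size & (size - 1)) )
        return UNHANDLEABLE;          /* weird-sized access */
    if ( !completed )
        return RETRY;                 /* still waiting on the emulator */
    return OKAY;
}

/*
 * Wrapper for I/O to/from a guest address: take the reference, call the
 * core routine, and drop the reference on the one and only way out.
 */
static int do_io_addr(struct page *pg, unsigned int size, bool completed)
{
    int rc;

    if ( pg->paged_out )
        return RETRY;                 /* nothing acquired yet */

    pg->refs++;
    rc = do_io(size, completed);
    pg->refs--;
    assert(pg->refs >= 0);

    return rc;
}

/* Wrapper for I/O to/from a hypervisor buffer: no paging concerns at all. */
static int do_io_buffer(unsigned int size, bool completed)
{
    return do_io(size, completed);
}
```

Whichever way do_io() returns, the reference count ends up balanced, because the acquire/release pair lives in exactly one place instead of being repeated at each return statement.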
[Xen-devel] [PATCH v4 09/17] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
It's clear from the following check in hvmemul_rep_movs: if ( sp2mt == p2m_mmio_direct || dp2mt == p2m_mmio_direct || (sp2mt == p2m_mmio_dm dp2mt == p2m_mmio_dm) ) return X86EMUL_UNHANDLEABLE; that mmio - mmio copy is not handled. This means the code in the stdvga mmio intercept that explicitly handles mmio - mmio copy when hvm_copy_to/from_guest_phys() fails is never going to be executed. This patch therefore adds a check in hvmemul_do_io_addr() to make sure mmio - mmio is disallowed and then registers standard mmio intercept ops in stdvga_init(). With this patch all mmio and portio handled within Xen now goes through process_io_intercept(). Signed-off-by: Paul Durrant paul.durr...@citrix.com Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/hvm/emulate.c |9 +++ xen/arch/x86/hvm/intercept.c |7 -- xen/arch/x86/hvm/stdvga.c| 173 +- xen/include/asm-x86/hvm/io.h |1 - 4 files changed, 44 insertions(+), 146 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index 9ced81b..4e2fdf1 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -267,6 +267,15 @@ static int hvmemul_acquire_page(unsigned long gmfn, struct page_info **page) return X86EMUL_RETRY; } +/* This code should not be reached if the gmfn is not RAM */ +if ( p2m_is_mmio(p2mt) ) +{ +domain_crash(curr_d); + +put_page(*page); +return X86EMUL_UNHANDLEABLE; +} + return X86EMUL_OKAY; } diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c index 5633959..625e585 100644 --- a/xen/arch/x86/hvm/intercept.c +++ b/xen/arch/x86/hvm/intercept.c @@ -279,13 +279,6 @@ int hvm_io_intercept(ioreq_t *p) { struct hvm_io_handler *handler; -if ( p-type == IOREQ_TYPE_COPY ) -{ -int rc = stdvga_intercept_mmio(p); -if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) ) -return rc; -} - handler = hvm_find_io_handler(p); if ( handler == NULL ) diff --git a/xen/arch/x86/hvm/stdvga.c 
b/xen/arch/x86/hvm/stdvga.c index dcd532a..639da6a 100644 --- a/xen/arch/x86/hvm/stdvga.c +++ b/xen/arch/x86/hvm/stdvga.c @@ -275,9 +275,10 @@ static uint8_t stdvga_mem_readb(uint64_t addr) return ret; } -static uint64_t stdvga_mem_read(uint64_t addr, uint64_t size) +static int stdvga_mem_read(struct vcpu *v, unsigned long addr, + unsigned long size, unsigned long *p_data) { -uint64_t data = 0; +unsigned long data = 0; switch ( size ) { @@ -313,7 +314,9 @@ static uint64_t stdvga_mem_read(uint64_t addr, uint64_t size) break; } -return data; +*p_data = data; + +return X86EMUL_OKAY; } static void stdvga_mem_writeb(uint64_t addr, uint32_t val) @@ -424,8 +427,17 @@ static void stdvga_mem_writeb(uint64_t addr, uint32_t val) } } -static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size) +static int stdvga_mem_write(struct vcpu *v, unsigned long addr, +unsigned long size, unsigned long data) { +ioreq_t p = { .type = IOREQ_TYPE_COPY, + .addr = addr, + .size = size, + .count = 1, + .dir = IOREQ_WRITE, + .data = data, +}; + /* Intercept mmio write */ switch ( size ) { @@ -460,153 +472,36 @@ static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size) gdprintk(XENLOG_WARNING, invalid io size: %PRId64\n, size); break; } -} - -static uint32_t read_data; - -static int mmio_move(struct hvm_hw_stdvga *s, ioreq_t *p) -{ -int i; -uint64_t addr = p-addr; -p2m_type_t p2mt; -struct domain *d = current-domain; -int step = p-df ? -p-size : p-size; -if ( p-data_is_ptr ) -{ -uint64_t data = p-data, tmp; - -if ( p-dir == IOREQ_READ ) -{ -for ( i = 0; i p-count; i++ ) -{ -tmp = stdvga_mem_read(addr, p-size); -if ( hvm_copy_to_guest_phys(data, tmp, p-size) != - HVMCOPY_okay ) -{ -struct page_info *dp = get_page_from_gfn( -d, data PAGE_SHIFT, p2mt, P2M_ALLOC); -/* - * The only case we handle is vga_mem - vga_mem. - * Anything else disables caching and leaves it to qemu-dm. 
- */ -if ( (p2mt != p2m_mmio_dm) || (data VGA_MEM_BASE) || - ((data + p-size) (VGA_MEM_BASE + VGA_MEM_SIZE)) ) -{ -if ( dp ) -put_page(dp); -return 0; -} -ASSERT(!dp); -
[Xen-devel] [PATCH v4 03/17] x86/hvm: remove extraneous parameter from hvmtrace_io_assist()
The is_mmio parameter can be inferred from the ioreq type.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Keir Fraser <k...@xen.org>
Cc: Jan Beulich <jbeul...@suse.com>
Cc: Andrew Cooper <andrew.coop...@citrix.com>
---
 xen/arch/x86/hvm/emulate.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index b412302..935eab3 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -23,8 +23,9 @@
 #include <asm/hvm/support.h>
 #include <asm/hvm/svm/svm.h>
 
-static void hvmtrace_io_assist(int is_mmio, ioreq_t *p)
+static void hvmtrace_io_assist(ioreq_t *p)
 {
+    bool_t is_mmio = (p->type == IOREQ_TYPE_COPY);
     unsigned int size, event;
     unsigned char buffer[12];
 
@@ -139,7 +140,7 @@ static int hvmemul_do_io(
             if ( !data_is_addr )
                 memcpy(&p.data, p_data, size);
 
-            hvmtrace_io_assist(is_mmio, &p);
+            hvmtrace_io_assist(&p);
         }
 
         if ( is_mmio )
@@ -200,7 +201,7 @@ static int hvmemul_do_io(
  finish_access:
     if ( dir == IOREQ_READ )
     {
-        hvmtrace_io_assist(is_mmio, &p);
+        hvmtrace_io_assist(&p);
 
         if ( !data_is_addr )
             memcpy(p_data, &vio->io_data, size);
-- 
1.7.10.4
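A minimal standalone sketch of the idea behind this patch, assuming simplified stand-in types: `struct ioreq_sketch`, `SKETCH_TYPE_PIO` and `SKETCH_TYPE_COPY` are hypothetical names used for illustration only, not Xen's real `ioreq_t` or `IOREQ_TYPE_*` definitions. It only shows that a separate MMIO flag is redundant once the request carries its own type:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the request type tags. */
enum ioreq_type_sketch {
    SKETCH_TYPE_PIO  = 0,   /* port I/O request */
    SKETCH_TYPE_COPY = 1    /* memory-mapped I/O request */
};

/* Hypothetical, heavily reduced request structure. */
struct ioreq_sketch {
    uint64_t addr;
    uint8_t  type;
};

/* Derive the "is this MMIO?" answer from the request itself. */
static bool ioreq_is_mmio(const struct ioreq_sketch *p)
{
    return p->type == SKETCH_TYPE_COPY;
}
```

A caller then passes only the request, so the flag and the request type can no longer disagree.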
[Xen-devel] [PATCH v4 08/17] x86/hvm: unify dpci portio intercept with standard portio intercept
This patch re-works the dpci portio intercepts so that they can be unified with standard portio handling thereby removing a substantial amount of code duplication. Signed-off-by: Paul Durrant paul.durr...@citrix.com Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/hvm/hvm.c |2 + xen/arch/x86/hvm/intercept.c | 22 ++-- xen/arch/x86/hvm/io.c | 225 +--- xen/include/asm-x86/hvm/io.h |8 ++ xen/include/asm-x86/hvm/vcpu.h |2 + xen/include/xen/iommu.h|1 - 6 files changed, 88 insertions(+), 172 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index c10db78..f8486f4 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1486,6 +1486,8 @@ int hvm_domain_initialise(struct domain *d) else d-arch.hvm_domain.io_bitmap = hvm_io_bitmap; +register_dpci_portio_handler(d); + if ( is_pvh_domain(d) ) { register_portio_handler(d, 0, 0x10003, handle_pvh_io); diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c index 5e8d8b2..5633959 100644 --- a/xen/arch/x86/hvm/intercept.c +++ b/xen/arch/x86/hvm/intercept.c @@ -116,10 +116,7 @@ static int hvm_process_io_intercept(struct hvm_io_handler *handler, ioreq_t *p) { struct hvm_vcpu_io *vio = current-arch.hvm_vcpu.hvm_io; -const struct hvm_io_ops *ops = -(p-type == IOREQ_TYPE_COPY) ? -mmio_ops : -portio_ops; +const struct hvm_io_ops *ops = handler-ops; int rc = X86EMUL_OKAY, i, step = p-df ? -p-size : p-size; uint64_t data; uint64_t addr; @@ -237,16 +234,13 @@ static struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p) { struct vcpu *curr = current; struct domain *curr_d = curr-domain; -const struct hvm_io_ops *ops = -(p-type == IOREQ_TYPE_COPY) ? 
-mmio_ops : -portio_ops; unsigned int i; for ( i = 0; i curr_d-arch.hvm_domain.io_handler_count; i++ ) { struct hvm_io_handler *handler = curr_d-arch.hvm_domain.io_handler[i]; +const struct hvm_io_ops *ops = handler-ops; uint64_t start, end, count = p-count, size = p-size; if ( handler-type != p-type ) @@ -285,13 +279,7 @@ int hvm_io_intercept(ioreq_t *p) { struct hvm_io_handler *handler; -if ( p-type == IOREQ_TYPE_PIO ) -{ -int rc = dpci_ioport_intercept(p); -if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) ) -return rc; -} -else if ( p-type == IOREQ_TYPE_COPY ) +if ( p-type == IOREQ_TYPE_COPY ) { int rc = stdvga_intercept_mmio(p); if ( (rc == X86EMUL_OKAY) || (rc == X86EMUL_RETRY) ) @@ -306,7 +294,7 @@ int hvm_io_intercept(ioreq_t *p) return hvm_process_io_intercept(handler, p); } -static struct hvm_io_handler *hvm_next_io_handler(struct domain *d) +struct hvm_io_handler *hvm_next_io_handler(struct domain *d) { unsigned int i = d-arch.hvm_domain.io_handler_count++; @@ -321,6 +309,7 @@ void register_mmio_handler(struct domain *d, const struct hvm_mmio_ops *ops) struct hvm_io_handler *handler = hvm_next_io_handler(d); handler-type = IOREQ_TYPE_COPY; +handler-ops = mmio_ops; handler-u.mmio.ops = ops; } @@ -330,6 +319,7 @@ void register_portio_handler(struct domain *d, uint32_t addr, struct hvm_io_handler *handler = hvm_next_io_handler(d); handler-type = IOREQ_TYPE_PIO; +handler-ops = portio_ops; handler-u.portio.start = addr; handler-u.portio.end = addr + size; handler-u.portio.action = action; diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c index c0964ec..51ef19a 100644 --- a/xen/arch/x86/hvm/io.c +++ b/xen/arch/x86/hvm/io.c @@ -208,185 +208,100 @@ void hvm_io_assist(ioreq_t *p) } } -static int dpci_ioport_read(uint32_t mport, ioreq_t *p) +static bool_t dpci_portio_accept(struct hvm_io_handler *handler, + uint64_t addr, + uint64_t size) { -struct hvm_vcpu_io *vio = current-arch.hvm_vcpu.hvm_io; -int rc = X86EMUL_OKAY, i, step = p-df ? 
-p-size : p-size; -uint32_t data = 0; +struct vcpu *curr = current; +struct hvm_iommu *hd = domain_hvm_iommu(curr-domain); +struct hvm_vcpu_io *vio = curr-arch.hvm_vcpu.hvm_io; +struct g2m_ioport *g2m_ioport; +uint32_t start, end; +uint32_t gport = addr, mport; -for ( i = 0; i p-count; i++ ) +list_for_each_entry( g2m_ioport, hd-arch.g2m_ioport_list, list ) { -if ( vio-mmio_retrying ) -{ -if ( vio-mmio_large_read_bytes != p-size ) -return X86EMUL_UNHANDLEABLE; -memcpy(data, vio-mmio_large_read, p-size); -vio-mmio_large_read_bytes = 0; -vio-mmio_retrying = 0; -} -else switch (
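The registration-time binding of an ops table to each handler, which this patch introduces via handler->ops, might be sketched in isolation as follows. All names here (`struct io_ops`, `struct io_handler`, `range_accept`, `find_handler`, the integer type tags) are hypothetical simplifications, not the Xen definitions:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct io_handler;

/* Per-handler operations, chosen once when the handler is registered. */
struct io_ops {
    bool (*accept)(const struct io_handler *h, uint64_t addr, uint64_t size);
};

struct io_handler {
    int type;                   /* hypothetical request type tag */
    const struct io_ops *ops;   /* bound at registration time */
    uint64_t start, end;        /* handler claims [start, end) */
};

/* Accept only accesses that lie entirely inside the claimed range. */
static bool range_accept(const struct io_handler *h, uint64_t addr,
                         uint64_t size)
{
    return addr >= h->start && addr <= h->end && size <= h->end - addr;
}

static const struct io_ops range_ops = { .accept = range_accept };

/*
 * Walk the table and let each handler's own ops decide, instead of
 * selecting an ops table from the request type up front.
 */
static const struct io_handler *find_handler(const struct io_handler *tbl,
                                             size_t count, int type,
                                             uint64_t addr, uint64_t size)
{
    for ( size_t i = 0; i < count; i++ )
    {
        const struct io_handler *h = &tbl[i];

        if ( h->type != type )
            continue;

        if ( h->ops->accept(h, addr, size) )
            return h;
    }

    return NULL;
}
```

With this shape, a new kind of handler only needs its own ops table; the lookup loop itself does not change.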
[Xen-devel] [PATCH v4 00/17] x86/hvm: I/O emulation cleanup and fix
This patch series re-works much of the code involved in emulation of port and memory mapped I/O for HVM guests.

The code has become very convoluted and, at least by inspection, certain emulations will apparently malfunction.

The series is broken down into 17 patches (which are also available in my xenbits repo: http://xenbits.xen.org/gitweb/?p=people/pauldu/xen.git on the emulation27 branch) as follows:

0001-x86-hvm-simplify-hvmemul_do_io.patch
0002-x86-hvm-remove-hvm_io_pending-check-in-hvmemul_do_io.patch
0003-x86-hvm-remove-extraneous-parameter-from-hvmtrace_io.patch
0004-x86-hvm-remove-multiple-open-coded-chunking-loops.patch
0005-x86-hvm-re-name-struct-hvm_mmio_handler-to-hvm_mmio_.patch
0006-x86-hvm-unify-internal-portio-and-mmio-intercepts.patch
0007-x86-hvm-add-length-to-mmio-check-op.patch
0008-x86-hvm-unify-dpci-portio-intercept-with-standard-po.patch
0009-x86-hvm-unify-stdvga-mmio-intercept-with-standard-mm.patch
0010-x86-hvm-revert-82ed8716b-fix-direct-PCI-port-I-O-emu.patch
0011-x86-hvm-only-call-hvm_io_assist-from-hvm_wait_for_io.patch
0012-x86-hvm-split-I-O-completion-handling-from-state-mod.patch
0013-x86-hvm-remove-HVMIO_dispatched-I-O-state.patch
0014-x86-hvm-remove-hvm_io_state-enumeration.patch
0015-x86-hvm-use-ioreq_t-to-track-in-flight-state.patch
0016-x86-hvm-always-re-emulate-I-O-from-a-buffer.patch
0017-x86-hvm-track-large-memory-mapped-accesses-by-buffer.patch

v2:
 - Removed bogus assertion from patch 16
 - Re-worked patch #17 after basic testing of back-port onto XenServer

v3:
 - Addressed comments from Jan
 - Re-ordered series to bring a couple of more trivial patches to the front
 - Backport to XenServer (4.5) now passing automated tests
 - Tested on unstable with QEMU upstream and trad, with and without HAP (to force shadow emulation)

v4:
 - Removed previous patch #4 (make sure translated MMIO reads or writes fall within a page) and rebased rest of series.
 - Addressed Jan's comments on previous patch #5 (now patch #4)
[Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op
When memory mapped I/O is range checked by internal handlers, the length of the access should be taken into account. Signed-off-by: Paul Durrant paul.durr...@citrix.com Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/hvm/hpet.c |7 --- xen/arch/x86/hvm/intercept.c |2 +- xen/arch/x86/hvm/vioapic.c| 17 ++--- xen/arch/x86/hvm/vlapic.c |8 +--- xen/arch/x86/hvm/vmsi.c | 27 --- xen/drivers/passthrough/amd/iommu_guest.c | 18 +++--- xen/include/asm-x86/hvm/io.h |4 +++- 7 files changed, 62 insertions(+), 21 deletions(-) diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c index 8958873..1a1f239 100644 --- a/xen/arch/x86/hvm/hpet.c +++ b/xen/arch/x86/hvm/hpet.c @@ -498,10 +498,11 @@ static int hpet_write( return X86EMUL_OKAY; } -static int hpet_range(struct vcpu *v, unsigned long addr) +static int hpet_range(struct vcpu *v, unsigned long addr, + unsigned long length) { -return ( (addr = HPET_BASE_ADDRESS) - (addr (HPET_BASE_ADDRESS + HPET_MMAP_SIZE)) ); +return (addr = HPET_BASE_ADDRESS) + ((addr + length) (HPET_BASE_ADDRESS + HPET_MMAP_SIZE)); } static const struct hvm_mmio_ops hpet_mmio_ops = { diff --git a/xen/arch/x86/hvm/intercept.c b/xen/arch/x86/hvm/intercept.c index 4db024e..5e8d8b2 100644 --- a/xen/arch/x86/hvm/intercept.c +++ b/xen/arch/x86/hvm/intercept.c @@ -38,7 +38,7 @@ static bool_t hvm_mmio_accept(struct hvm_io_handler *handler, { BUG_ON(handler-type != IOREQ_TYPE_COPY); -return handler-u.mmio.ops-check(current, addr); +return handler-u.mmio.ops-check(current, addr, size); } static int hvm_mmio_read(struct hvm_io_handler *handler, diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c index 9ad909b..4a9b33e 100644 --- a/xen/arch/x86/hvm/vioapic.c +++ b/xen/arch/x86/hvm/vioapic.c @@ -242,12 +242,13 @@ static int vioapic_write( return X86EMUL_OKAY; } -static int vioapic_range(struct vcpu *v, unsigned long addr) +static int vioapic_range(struct vcpu *v, unsigned 
long addr, +unsigned long length) { struct hvm_hw_vioapic *vioapic = domain_vioapic(v-domain); -return ((addr = vioapic-base_address - (addr vioapic-base_address + VIOAPIC_MEM_LENGTH))); +return (addr = vioapic-base_address) + ((addr + length) = (vioapic-base_address + VIOAPIC_MEM_LENGTH)); } static const struct hvm_mmio_ops vioapic_mmio_ops = { @@ -466,3 +467,13 @@ void vioapic_deinit(struct domain *d) xfree(d-arch.hvm_domain.vioapic); d-arch.hvm_domain.vioapic = NULL; } + +/* + * Local variables: + * mode: C + * c-file-style: BSD + * c-basic-offset: 4 + * tab-width: 4 + * indent-tabs-mode: nil + * End: + */ diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c index f2052cf..7421fc5 100644 --- a/xen/arch/x86/hvm/vlapic.c +++ b/xen/arch/x86/hvm/vlapic.c @@ -986,14 +986,16 @@ int hvm_x2apic_msr_write(struct vcpu *v, unsigned int msr, uint64_t msr_content) return vlapic_reg_write(v, offset, (uint32_t)msr_content); } -static int vlapic_range(struct vcpu *v, unsigned long addr) +static int vlapic_range(struct vcpu *v, unsigned long address, +unsigned long len) { struct vlapic *vlapic = vcpu_vlapic(v); -unsigned long offset = addr - vlapic_base_address(vlapic); +unsigned long offset = address - vlapic_base_address(vlapic); return !vlapic_hw_disabled(vlapic) !vlapic_x2apic_mode(vlapic) - (offset PAGE_SIZE); + (address = vlapic_base_address(vlapic)) + ((offset + len) = PAGE_SIZE); } static const struct hvm_mmio_ops vlapic_mmio_ops = { diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c index 09ea301..61fe391 100644 --- a/xen/arch/x86/hvm/vmsi.c +++ b/xen/arch/x86/hvm/vmsi.c @@ -168,14 +168,14 @@ struct msixtbl_entry static DEFINE_RCU_READ_LOCK(msixtbl_rcu_lock); static struct msixtbl_entry *msixtbl_find_entry( -struct vcpu *v, unsigned long addr) +struct vcpu *v, unsigned long address, unsigned long len) { struct msixtbl_entry *entry; struct domain *d = v-domain; list_for_each_entry( entry, d-arch.hvm_domain.msixtbl_list, list ) -if ( addr = 
entry-gtable - addr entry-gtable + entry-table_len ) +if ( (address = entry-gtable) + ((address + len) = (entry-gtable + entry-table_len)) ) return entry; return NULL; @@ -214,7 +214,7 @@ static int msixtbl_read( rcu_read_lock(msixtbl_rcu_lock); -entry = msixtbl_find_entry(v, address); +entry = msixtbl_find_entry(v, address, len); if ( !entry ) goto out; offset = address (PCI_MSIX_ENTRY_SIZE - 1); @@ -273,7
Re: [Xen-devel] [PATCH 5/9] x86/pvh: Set PVH guest's mode in XEN_DOMCTL_set_address_size
On 06/24/2015 03:57 AM, Jan Beulich wrote:
> On 24.06.15 at 04:53, <boris.ostrov...@oracle.com> wrote:
>> On 06/23/2015 09:22 AM, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>> @@ -2320,12 +2320,7 @@ int hvm_vcpu_initialise(struct vcpu *v)
>>>>      v->arch.hvm_vcpu.inject_trap.vector = -1;
>>>>
>>>>      if ( is_pvh_domain(d) )
>>>> -    {
>>>> -        v->arch.hvm_vcpu.hcall_64bit = 1;    /* PVH 32bitfixme. */
>>>> -        /* This is for hvm_long_mode_enabled(v). */
>>>> -        v->arch.hvm_vcpu.guest_efer = EFER_LMA | EFER_LME;
>>>>          return 0;
>>>> -    }
>>>
>>> With this removed, is there any guarantee that hvm_set_mode() will be
>>> called for each vCPU?
>>
>> IIUIC, toolstack is required to call XEN_DOMCTL_set_address_size which
>> results in a call to switch_compat/native(), which loops over all VCPUs,
>> calling set_mode.
>
> I don't recall this being a strict requirement. I think a PV 64-bit
> guest would start fine without.

We do call it via libxl__build_pv() -> xc_dom_boot_mem_init() -> arch_setup_mem_init() -> x86_compat().

-boris
Re: [Xen-devel] [PATCH v4 04/17] x86/hvm: remove multiple open coded 'chunking' loops
On 24.06.15 at 13:24, <paul.durr...@citrix.com> wrote:
> +static int hvmemul_phys_mmio_access(
> +    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t **buffer)

As much as the earlier offset you returned via indirection to the caller was unnecessary, the indirection here seems pointless too. All callers know how (or don't care) to update the buffer pointer.

> +static int hvmemul_linear_mmio_access(
> +    unsigned long gla, unsigned int size, uint8_t dir, uint8_t *buffer,
> +    uint32_t pfec, struct hvm_emulate_ctxt *hvmemul_ctxt, bool_t translate)
> +{
> +    struct hvm_vcpu_io *vio = &current->arch.hvm_vcpu.hvm_io;
> +    unsigned long page_off = gla & (PAGE_SIZE - 1);
> +    unsigned int chunk;
> +    paddr_t gpa;
> +    unsigned long one_rep = 1;
> +    int rc;
> +
> +    chunk = min_t(unsigned int, size, PAGE_SIZE - page_off);
> +
> +    if ( translate )
> +        gpa = pfn_to_paddr(vio->mmio_gpfn) | page_off;

"translate" as name for the parameter signaling that the translation is known is kind of odd - "translated" or "known_gpfn" or some such? Or invert the meaning?

> +    else
> +    {
> +        rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> +                                    hvmemul_ctxt);
> +        if ( rc != X86EMUL_OKAY )
> +            return rc;
> +    }
> +
> +    for ( ;; )
> +    {
> +        rc = hvmemul_phys_mmio_access(gpa, chunk, dir, &buffer);
> +        if ( rc != X86EMUL_OKAY )
> +            break;
> +
> +        gla += chunk;
> +        gpa += chunk;
> +        size -= chunk;
> +
> +        if ( size == 0 )
> +            break;
> +
> +        ASSERT((gla & (PAGE_SIZE - 1)) == 0);
> +        chunk = min_t(unsigned int, size, PAGE_SIZE);

I think you could just assert that size is now less than PAGE_SIZE.

> +        if ( !translate )
> +        {
> +            rc = hvmemul_linear_to_phys(gla, &gpa, chunk, &one_rep, pfec,
> +                                        hvmemul_ctxt);
> +            if ( rc != X86EMUL_OKAY )
> +                return rc;
> +        }

This must be done unconditionally (and gpa doesn't need updating above then), as the known translation is only for the first byte (and whatever falls on the same page).

Jan
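A standalone sketch of a page-chunked access loop that re-translates on every page, which is one way to read the final review comment above. This is not the patch's code: `chunked_access`, `SK_PAGE_SIZE` and the demo callbacks are hypothetical, and plain `int` return codes (0 for success) stand in for the X86EMUL_* values:

```c
#include <stdint.h>

#define SK_PAGE_SIZE 4096u

/* Hypothetical callback types: translate one address, access one chunk. */
typedef int (*translate_fn)(uint64_t linear, uint64_t *phys);
typedef int (*access_fn)(uint64_t phys, unsigned int len, uint8_t *buf);

/*
 * Split [linear, linear + size) at page boundaries. Each chunk gets its
 * own translation, because contiguous linear pages need not map to
 * contiguous physical pages.
 */
static int chunked_access(uint64_t linear, unsigned int size, uint8_t *buf,
                          translate_fn translate, access_fn access)
{
    while ( size != 0 )
    {
        unsigned int off = linear & (SK_PAGE_SIZE - 1);
        unsigned int chunk = SK_PAGE_SIZE - off;
        uint64_t phys;
        int rc;

        if ( chunk > size )
            chunk = size;

        rc = translate(linear, &phys);
        if ( rc != 0 )
            return rc;

        rc = access(phys, chunk, buf);
        if ( rc != 0 )
            return rc;

        linear += chunk;
        buf += chunk;
        size -= chunk;
    }

    return 0;
}

/* Demo callbacks: scatter pages and record what was accessed. */
static unsigned int demo_calls;
static unsigned int demo_bytes;

static int demo_translate(uint64_t linear, uint64_t *phys)
{
    uint64_t page = linear / SK_PAGE_SIZE;

    /* Map neighbouring linear pages to non-neighbouring frames. */
    *phys = ((page ^ 0x5) * SK_PAGE_SIZE) | (linear & (SK_PAGE_SIZE - 1));
    return 0;
}

static int demo_access(uint64_t phys, unsigned int len, uint8_t *buf)
{
    (void)buf;

    /* Fail if a chunk would cross a physical page boundary. */
    if ( (phys & (SK_PAGE_SIZE - 1)) + len > SK_PAGE_SIZE )
        return -1;

    demo_calls++;
    demo_bytes += len;
    return 0;
}
```

Carrying `gpa += chunk` across a page boundary, as the quoted loop does in the known-translation case, is only safe if the next linear page happens to map to the next physical page; translating per chunk avoids relying on that.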
Re: [Xen-devel] [PATCH v2 06/12] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
On 22/06/15 19:56, Ed White wrote: From: Ravi Sahita ravi.sah...@intel.com Signed-off-by: Ravi Sahita ravi.sah...@intel.com --- xen/arch/x86/hvm/emulate.c | 13 +++-- xen/arch/x86/hvm/vmx/vmx.c | 30 ++ xen/arch/x86/x86_emulate/x86_emulate.c | 8 xen/arch/x86/x86_emulate/x86_emulate.h | 4 xen/include/asm-x86/hvm/hvm.h | 2 ++ 5 files changed, 55 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index ac9c9d6..e38a2fe 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -1356,6 +1356,13 @@ static int hvmemul_invlpg( return rc; } +static int hvmemul_vmfunc( +struct x86_emulate_ctxt *ctxt) +{ +hvm_funcs.ahvm_vcpu_emulate_vmfunc(ctxt-regs); +return X86EMUL_OKAY; +} ahvm_vcpu_emulate_vmfunc() should return an X86EMUL code. + static const struct x86_emulate_ops hvm_emulate_ops = { .read = hvmemul_read, .insn_fetch= hvmemul_insn_fetch, @@ -1379,7 +1386,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = { .inject_sw_interrupt = hvmemul_inject_sw_interrupt, .get_fpu = hvmemul_get_fpu, .put_fpu = hvmemul_put_fpu, -.invlpg= hvmemul_invlpg +.invlpg= hvmemul_invlpg, +.vmfunc= hvmemul_vmfunc, }; static const struct x86_emulate_ops hvm_emulate_ops_no_write = { @@ -1405,7 +1413,8 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = { .inject_sw_interrupt = hvmemul_inject_sw_interrupt, .get_fpu = hvmemul_get_fpu, .put_fpu = hvmemul_put_fpu, -.invlpg= hvmemul_invlpg +.invlpg= hvmemul_invlpg, +.vmfunc= hvmemul_vmfunc, }; static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt, diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c index e8d9c82..ad9e9e4 100644 --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -82,6 +82,7 @@ static void vmx_fpu_dirty_intercept(void); static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content); static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content); static void 
vmx_invlpg_intercept(unsigned long vaddr); +static int vmx_vmfunc_intercept(struct cpu_user_regs* regs); s/* / */ uint8_t __read_mostly posted_intr_vector; @@ -1826,6 +1827,20 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v) vmx_vmcs_exit(v); } +static bool_t vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs) +{ +bool_t rc = 0; + +if ( !cpu_has_vmx_vmfunc altp2mhvm_active(current-domain) + regs-eax == 0 + p2m_switch_vcpu_altp2m_by_id(current, (uint16_t)regs-ecx) ) Please latch current at the top of the function. It is inefficient to access like this. +{ +regs-eip += 3; +rc = 1; +} +return rc; +} + static bool_t vmx_vcpu_emulate_ve(struct vcpu *v) { bool_t rc = 0; @@ -1894,6 +1909,7 @@ static struct hvm_function_table __initdata vmx_function_table = { .msr_read_intercept = vmx_msr_read_intercept, .msr_write_intercept = vmx_msr_write_intercept, .invlpg_intercept = vmx_invlpg_intercept, +.vmfunc_intercept = vmx_vmfunc_intercept, .handle_cd= vmx_handle_cd, .set_info_guest = vmx_set_info_guest, .set_rdtsc_exiting= vmx_set_rdtsc_exiting, @@ -1920,6 +1936,7 @@ static struct hvm_function_table __initdata vmx_function_table = { .ahvm_vcpu_update_eptp = vmx_vcpu_update_eptp, .ahvm_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve, .ahvm_vcpu_emulate_ve = vmx_vcpu_emulate_ve, +.ahvm_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc, }; const struct hvm_function_table * __init start_vmx(void) @@ -2091,6 +2108,13 @@ static void vmx_invlpg_intercept(unsigned long vaddr) vpid_sync_vcpu_gva(curr, vaddr); } +static int vmx_vmfunc_intercept(struct cpu_user_regs *regs) +{ +gdprintk(XENLOG_ERR, Failed guest VMFUNC execution\n); +domain_crash(current-domain); +return X86EMUL_OKAY; +} + static int vmx_cr_access(unsigned long exit_qualification) { struct vcpu *curr = current; @@ -2675,6 +2699,7 @@ void vmx_enter_realmode(struct cpu_user_regs *regs) regs-eflags |= (X86_EFLAGS_VM | X86_EFLAGS_IOPL); } + Spurious whitespace change. 
static void vmx_vmexit_ud_intercept(struct cpu_user_regs *regs) { struct hvm_emulate_ctxt ctxt; @@ -3239,6 +3264,11 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs) update_guest_eip(); break; +case EXIT_REASON_VMFUNC: +if ( vmx_vmfunc_intercept(regs) == X86EMUL_OKAY ) This is currently an unconditional failure, and I don't see subsequent patches which alter vmx_vmfunc_intercept(). Shouldn't
[Xen-devel] Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions))
Adding Boris+Suravee+Aravind (AMD/SVM maintainers), Dario (NUMA) and Jim+Anthony (libvirt) to the CC.

TL;DR osstest is exposing issues running on AMD Opteron(tm) Processor 6376 in at least a couple of test cases. It would be good if someone from AMD could have a look.

The systems here are merlot[01], which seem to be having problems with win7 live migration tests as well as libvirt when starting PV guests. They each contain AMD Opteron(tm) Processor 6376 processors with 32 threads in 4 nodes and seem to have a strange NUMA layout with no RAM on nodes 1 or 3.

The test history on these machines:
http://logs.test-lab.xenproject.org/osstest/results/host/merlot0.html
http://logs.test-lab.xenproject.org/osstest/results/host/merlot1.html

I just posted some analysis of the windows cases (including experiments on the old Cambridge test infra with AMD Opteron(tm) Processor 6168 processors) in:
http://lists.xen.org/archives/html/xen-devel/2015-06/msg03713.html

I've also been investigating the libvirt guest-start failures. The symptom is a 10s timeout starting qemu. Anthony is seeing this with openstack too and did some analysis in http://thread.gmane.org/gmane.comp.emulators.xen.devel/246473/focus=249172 onwards, but it may be that this is unrelated to the osstest failures and that for Anthony's scenario the 10s timeout could be explained by the openstack tempest tests starting lots of VMs in parallel.

However for the osstests we are starting a single PV domain on an otherwise idle host. There should be no reason for qemu to take as long as 10s to come up in that case, even with pessimal NUMA layout (IMHO at least). By comparison on other hosts starting qemu seems to take 2-4s, so merlot is at least 2.5-5 times worse.
I tried running some adhoc tests on the old infra tied to the *-frog machines (which are the Opteron 6168 ones):
http://xenbits.xen.org/people/ianc/tmp/adhoc/37623/
http://xenbits.xen.org/people/ianc/tmp/adhoc/37625/

The -xsm failures are because I botched the flight configuration; the interesting information is that the other ones passed both times (migrate-support is expected to fail at the moment).

Supposing that the NUMA oddities might be what is exposing this issue I tried an adhoc run on the merlot machines where I specified dom0_max_vcpus=8 dom0_nodes=0 on the hypervisor command line:
http://logs.test-lab.xenproject.org/osstest/logs/58853/

Again, I messed up the config for the -xsm case, so ignore. The interesting thing is that the extra NUMA settings were apparently _not_ helpful.

From http://logs.test-lab.xenproject.org/osstest/logs/58853/test-amd64-amd64-libvirt/serial-merlot0.log I can see they were applied:

Jun 23 15:50:34.205057 (XEN) Command line: placeholder conswitch=x watchdog com1=115200,8n1 console=com1,vga gdb=com1 dom0_mem=512M,max:512M ucode=scan dom0_max_vcpus=8 dom0_nodes=0
[...]
Jun 23 15:50:38.309057 (XEN) Dom0 has maximum 8 VCPUs

The memory info

Jun 23 15:56:27.749008 (XEN) Memory location of each domain:
Jun 23 15:56:27.756965 (XEN) Domain 0 (total: 131072):
Jun 23 15:56:27.756983 (XEN) Node 0: 126905
Jun 23 15:56:27.756998 (XEN) Node 1: 0
Jun 23 15:56:27.764952 (XEN) Node 2: 4167
Jun 23 15:56:27.764969 (XEN) Node 3: 0

suggests at least a small amount of cross-node memory allocation (16M out of dom0's 512M total). That's probably small enough to be OK.
And it seems as if the 8 dom0 vcpus are correctly pinned to the first 8 cpus (the ones in node 0):

Jun 23 15:56:43.797055 (XEN) VCPU information and callbacks for domain 0:
Jun 23 15:56:43.797110 (XEN) VCPU0: CPU4 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={4}
Jun 23 15:56:43.805078 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.813121 (XEN) pause_count=0 pause_flags=1
Jun 23 15:56:43.813157 (XEN) No periodic timer
Jun 23 15:56:43.821050 (XEN) VCPU1: CPU3 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={3}
Jun 23 15:56:43.829044 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.829082 (XEN) pause_count=0 pause_flags=1
Jun 23 15:56:43.837051 (XEN) No periodic timer
Jun 23 15:56:43.837084 (XEN) VCPU2: CPU5 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={5}
Jun 23 15:56:43.845102 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.853035 (XEN) pause_count=0 pause_flags=1
Jun 23 15:56:43.853071 (XEN) No periodic timer
Jun 23 15:56:43.853099 (XEN) VCPU3: CPU7 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={7}
Jun 23 15:56:43.861102 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.869110 (XEN) pause_count=0 pause_flags=1
Jun 23 15:56:43.869145 (XEN) No periodic timer
Jun 23 15:56:43.877014 (XEN) VCPU4: CPU0 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={}
Jun 23 15:56:43.877038 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7}
Jun 23 15:56:43.885053 (XEN) pause_count=0 pause_flags=1
Jun 23
Re: [Xen-devel] stable trees (was: [xen-4.2-testing test] 58584: regressions)
On 24.06.15 at 11:06, <ian.campb...@citrix.com> wrote:
> After that baseline I ran a few tests of just the windows + qemuu stuff:
> http://xenbits.xen.org/people/ianc/tmp/adhoc/37619/ was allowing free
> rein on the machines and was mostly successful, apart from the
> windows-install failure on lake-frog. Looking at the test history this
> seems to have always been a problem on the old infra.
>
> *-frog are AMD Opteron(tm) Processor 6168 which is as close as the old
> infra has to the new colo's merlot[01] which is AMD Opteron(tm) Processor
> 6376.
>
> With that in mind I reran with things limited to the two frog-* boxes
> and got http://xenbits.xen.org/people/ianc/tmp/adhoc/37624/. The
> windows-install of winxpsp3 persisted but there was no migration failure
> elsewhere.
>
> It's not a lot of data, but in comparison with the results in the colo:
> http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-xl-qemuu-win7-amd64/xen-4.5-testing.html
> it looks like it's the newer system which is exposing the issue.

Thanks for doing all of this! While not pointing towards a solution on the side of the newer systems, it at least reassures us that we didn't release regressing software with 4.5.1.

Jan
[Xen-devel] [linux-3.4 test] 58845: regressions - FAIL
flight 58845 linux-3.4 real [real] http://logs.test-lab.xenproject.org/osstest/logs/58845/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-qemut-win7-amd64 6 xen-boot fail REGR. vs. 30511 Tests which are failing intermittently (not blocking): test-amd64-i386-xl-qemuu-win7-amd64 9 windows-install fail in 58831 pass in 58845 test-amd64-amd64-pair10 xen-boot/dst_host fail pass in 58798 test-amd64-amd64-pair 9 xen-boot/src_host fail pass in 58798 test-amd64-amd64-xl-sedf-pin 6 xen-bootfail pass in 58798 test-amd64-i386-pair 10 xen-boot/dst_host fail pass in 58831 test-amd64-i386-pair 9 xen-boot/src_host fail pass in 58831 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail baseline untested test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail baseline untested test-amd64-i386-libvirt-xsm 6 xen-bootfail baseline untested test-amd64-amd64-xl-multivcpu 6 xen-boot fail baseline untested test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail baseline untested test-amd64-amd64-libvirt-xsm 6 xen-bootfail baseline untested test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 6 xen-boot fail baseline untested test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail baseline untested test-amd64-amd64-xl-sedf 6 xen-boot fail like 30406 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 30511 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 30511 test-amd64-amd64-xl-qemuu-ovmf-amd64 6 xen-bootfail like 53709-bisect test-amd64-i386-freebsd10-amd64 6 xen-boot fail like 58780-bisect test-amd64-i386-xl-qemuu-winxpsp3 6 xen-boot fail like 58786-bisect test-amd64-i386-qemut-rhel6hvm-intel 6 xen-bootfail like 58788-bisect test-amd64-i386-rumpuserxen-i386 6 xen-bootfail like 58799-bisect test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 6 xen-bootfail like 58801-bisect test-amd64-amd64-xl-qemuu-debianhvm-amd64 6 
xen-boot fail like 58803-bisect test-amd64-amd64-xl-qemut-winxpsp3 6 xen-boot fail like 58804-bisect test-amd64-i386-freebsd10-i386 6 xen-boot fail like 58805-bisect test-amd64-i386-xl-qemuu-ovmf-amd64 6 xen-boot fail like 58806-bisect test-amd64-amd64-xl-qemuu-winxpsp3 6 xen-boot fail like 58807-bisect test-amd64-i386-xl-qemut-winxpsp3 6 xen-boot fail like 58808-bisect test-amd64-i386-xl-qemut-winxpsp3-vcpus1 6 xen-bootfail like 58809-bisect test-amd64-amd64-rumpuserxen-amd64 6 xen-boot fail like 58810-bisect test-amd64-i386-xl-qemuu-debianhvm-amd64 6 xen-bootfail like 58811-bisect test-amd64-amd64-xl-qemut-debianhvm-amd64 6 xen-boot fail like 58813-bisect test-amd64-i386-qemuu-rhel6hvm-intel 6 xen-bootfail like 58814-bisect test-amd64-i386-xl-qemut-debianhvm-amd64 6 xen-bootfail like 58815-bisect Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail in 58831 never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass version targeted for testing: linuxcf1b3dad6c5699b977273276bada8597636ef3e2 baseline version: linuxbb4a05a0400ed6d2f1e13d1f82f289ff74300a70 500 people touched revisions under test, not listing them all jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl pass test-amd64-i386-xl pass
Re: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op
On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
> --- a/xen/arch/x86/hvm/hpet.c
> +++ b/xen/arch/x86/hvm/hpet.c
> @@ -498,10 +498,11 @@ static int hpet_write(
>      return X86EMUL_OKAY;
>  }
>
> -static int hpet_range(struct vcpu *v, unsigned long addr)
> +static int hpet_range(struct vcpu *v, unsigned long addr,
> +                      unsigned long length)
>  {
> -    return ( (addr >= HPET_BASE_ADDRESS) &&
> -             (addr < (HPET_BASE_ADDRESS + HPET_MMAP_SIZE)) );
> +    return (addr >= HPET_BASE_ADDRESS) &&
> +           ((addr + length) < (HPET_BASE_ADDRESS + HPET_MMAP_SIZE));

<=

> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -986,14 +986,16 @@ int hvm_x2apic_msr_write(struct vcpu *v, unsigned int msr, uint64_t msr_content)
>      return vlapic_reg_write(v, offset, (uint32_t)msr_content);
>  }
>
> -static int vlapic_range(struct vcpu *v, unsigned long addr)
> +static int vlapic_range(struct vcpu *v, unsigned long address,
> +                        unsigned long len)
>  {
>      struct vlapic *vlapic = vcpu_vlapic(v);
> -    unsigned long offset = addr - vlapic_base_address(vlapic);
> +    unsigned long offset = address - vlapic_base_address(vlapic);
>
>      return !vlapic_hw_disabled(vlapic) &&
>             !vlapic_x2apic_mode(vlapic) &&
> -           (offset < PAGE_SIZE);
> +           (address >= vlapic_base_address(vlapic)) &&
> +           ((offset + len) <= PAGE_SIZE);

I'd prefer to stay with checking just offset here, unless you see anything wrong with that.

> @@ -333,12 +333,15 @@ out:
>      return r;
>  }
>
> -static int msixtbl_range(struct vcpu *v, unsigned long addr)
> +static int msixtbl_range(struct vcpu *v, unsigned long address,
> +                         unsigned long len)
>  {
> +    struct msixtbl_entry *entry;
>      const struct msi_desc *desc;
>
>      rcu_read_lock(&msixtbl_rcu_lock);
> -    desc = msixtbl_addr_to_desc(msixtbl_find_entry(v, addr), addr);
> +    entry = msixtbl_find_entry(v, address, len);
> +    desc = msixtbl_addr_to_desc(entry, address);

Again I don't see the need to do more adjustments here than necessary for your purpose.

> --- a/xen/include/asm-x86/hvm/io.h
> +++ b/xen/include/asm-x86/hvm/io.h
> @@ -35,7 +35,9 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
>                                  unsigned long addr,
>                                  unsigned long length,
>                                  unsigned long val);
>
> -typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
> +typedef int (*hvm_mmio_check_t)(struct vcpu *v,
> +                                unsigned long addr,
> +                                unsigned long length);

I don't think this really needs to be long?

Jan
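The hpet_range() comment above turns on whether the end of the access is compared with "<" or "<=". A minimal sketch of a length-aware range check follows. It is only an illustration: mmio_in_range(), DEMO_BASE and DEMO_SIZE are hypothetical names and values, not taken from the patch or from Xen.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical window; stand-ins for a device's MMIO base and size. */
#define DEMO_BASE 0xfed00000UL
#define DEMO_SIZE 1024UL

/*
 * Return true if the whole access [addr, addr + length) lies inside the
 * window [base, base + size).  The last byte of the access may be the last
 * byte of the window, which is why an end-address comparison would need
 * "<=" rather than "<".
 */
static bool mmio_in_range(unsigned long addr, unsigned long length,
                          unsigned long base, unsigned long size)
{
    assert(length > 0);

    if ( addr < base )
        return false;

    /* Compare offsets instead of end addresses to avoid wrap-around. */
    return (addr - base) < size && length <= size - (addr - base);
}
```

With these made-up constants, a 4-byte access starting 4 bytes before the end of the window is accepted, while one starting 3 bytes before the end is rejected.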
Re: [Xen-devel] [PATCH v3 05/18] x86/hvm: remove multiple open coded 'chunking' loops
> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 24 June 2015 10:38
> To: Paul Durrant
> Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
> Subject: Re: [PATCH v3 05/18] x86/hvm: remove multiple open coded 'chunking' loops
>
> On 23.06.15 at 12:39, paul.durr...@citrix.com wrote:
> > --- a/xen/arch/x86/hvm/emulate.c
> > +++ b/xen/arch/x86/hvm/emulate.c
> > @@ -540,6 +540,115 @@ static int hvmemul_virtual_to_linear(
> >      return X86EMUL_EXCEPTION;
> >  }
> >
> > +static int hvmemul_phys_mmio_access(
> > +    paddr_t gpa, unsigned int size, uint8_t dir, uint8_t *buffer,
> > +    unsigned int *off)
>
> Why this (buffer, off) pair? The caller can easily adjust buffer as necessary, avoiding the other parameter altogether. And buffer itself can be void * just like it is in some of the callers (and the others should follow suit).

It actually becomes necessary in a later patch, but I'll make the change there instead. As for incrementing a void *, I know that MSVC disallows this. I believe it is a gcc-ism which I guess clang must tolerate, but I don't think it is standard C.

> > +{
> > +    unsigned long one_rep = 1;
> > +    unsigned int chunk;
> > +    int rc = 0;
> > +
> > +    /* Accesses must fall within a page */
> > +    if ( (gpa & (PAGE_SIZE - 1)) + size > PAGE_SIZE )
> > +        return X86EMUL_UNHANDLEABLE;
>
> As for patch 4 - this imposes a restriction that real hardware doesn't have, and hence this needs to be replaced by adjusting the one caller not currently guaranteeing this such that it caps the size.

Ok.

> > +    /*
> > +     * hvmemul_do_io() cannot handle non-power-of-2 accesses or
> > +     * accesses larger than sizeof(long), so choose the highest power
> > +     * of 2 not exceeding sizeof(long) as the 'chunk' size.
> > +     */
> > +    chunk = 1 << (fls(size) - 1);
> > +    if ( chunk > sizeof (long) )
> > +        chunk = sizeof (long);
>
> I suppose you intentionally generalize this; if so this should be mentioned in the commit message. This is particularly because it results in changed behavior (which isn't to say that I'm sure the previous way was any better in the sense of being closer to what real hardware does): Right now, an 8 byte access at the last byte of a page would get carried out as 8 1-byte accesses. Your change makes it a 1-, 4-, 2-, and 1-byte access in that order.

...which is certainly going to be quicker since it's only 4 round-trips to an emulator rather than 8.

> Also, considering instruction characteristics (as explained in the original comment) I think the old way of determining the chunk size may have been cheaper than yours using fls().

I thought fls() was generally implemented using inline assembler and was pretty fast. I didn't actually check how Xen implements it; I just assumed it would be optimal.

> > +
> > +    while ( size != 0 )
> > +    {
> > +        rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
> > +                                    &buffer[*off]);
> > +        if ( rc != X86EMUL_OKAY )
> > +            break;
> > +
> > +        /* Advance to the next chunk */
> > +        gpa += chunk;
> > +        *off += chunk;
> > +        size -= chunk;
> > +
> > +        /*
> > +         * If the chunk now exceeds the remaining size, choose the next
> > +         * lowest power of 2 that will fit.
> > +         */
> > +        while ( chunk > size )
> > +            chunk >>= 1;
>
> Please avoid this loop when size == 0. Since the function won't be called with size being zero, I think the loop should be a for ( ; ; ) one with the loop exit condition put in the middle.

Sure.

> > @@ -549,52 +658,26 @@ static int __hvmemul_read(
> >      struct hvm_emulate_ctxt *hvmemul_ctxt)
> >  {
> >      struct vcpu *curr = current;
> > -    unsigned long addr, reps = 1;
> > -    unsigned int off, chunk = min(bytes, 1U << LONG_BYTEORDER);
> > +    unsigned long addr, one_rep = 1;
> >      uint32_t pfec = PFEC_page_present;
> >      struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
> > -    paddr_t gpa;
> >      int rc;
> >
> >      rc = hvmemul_virtual_to_linear(
> > -        seg, offset, bytes, &reps, access_type, hvmemul_ctxt, &addr);
> > +        seg, offset, bytes, &one_rep, access_type, hvmemul_ctxt, &addr);
> >      if ( rc != X86EMUL_OKAY )
> >          return rc;
> > -    off = addr & (PAGE_SIZE - 1);
> > -    /*
> > -     * We only need to handle sizes actual instruction operands can have. All
> > -     * such sizes are either powers of 2 or the sum of two powers of 2. Thus
> > -     * picking as initial chunk size the largest power of 2 not greater than
> > -     * the total size will always result in only power-of-2 size requests
> > -     * issued to hvmemul_do_mmio() (hvmemul_do_io() rejects non-powers-of-2).
> > -     */
> > -    while ( chunk & (chunk - 1) )
> > -        chunk &= chunk - 1;
> > -    if ( off + bytes > PAGE_SIZE )
> > -        while ( off & (chunk - 1) )
> > -            chunk >>= 1;
> >
> >      if ( ((access_type != hvm_access_insn_fetch ? vio->mmio_access.read_access :
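To make the chunking discussion above concrete, here is a small stand-alone sketch that splits an access into power-of-2 chunks no larger than sizeof(long). It is not the code from the patch: split_access(), pow2_floor() and DEMO_PAGE_SIZE are hypothetical, and the page-boundary split (which the thread leaves to the caller) is folded into the same loop purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL   /* assumed page size for the example */

/* Largest power of 2 that is <= n (n must be non-zero). */
static unsigned int pow2_floor(unsigned int n)
{
    unsigned int p = 1;

    assert(n != 0);
    while ( (p << 1) != 0 && (p << 1) <= n )
        p <<= 1;

    return p;
}

/*
 * Split an access of 'size' bytes at 'gpa' into power-of-2 chunks that are
 * no larger than sizeof(long) and never cross a page boundary.  The chunk
 * sizes are stored in out[] (capacity 'max'); the chunk count is returned.
 */
static size_t split_access(unsigned long gpa, unsigned int size,
                           unsigned int *out, size_t max)
{
    size_t n = 0;

    while ( size != 0 )
    {
        /* Bytes left in the current page. */
        unsigned long room = DEMO_PAGE_SIZE - (gpa & (DEMO_PAGE_SIZE - 1));
        unsigned int part = size < room ? size : (unsigned int)room;
        unsigned int chunk = pow2_floor(part);

        if ( chunk > sizeof(long) )
            chunk = sizeof(long);

        assert(n < max);
        out[n++] = chunk;
        gpa += chunk;
        size -= chunk;
    }

    return n;
}
```

For an 8-byte access starting at the last byte of a page, this sketch yields chunks of 1, 4, 2 and 1 bytes, which is the sequence described in the review.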
Re: [Xen-devel] (xen 4.6 unstable) triple fault when execute fxsave during the procedure of guest iso install
On 24.06.15 at 11:14, fanhengl...@huawei.com wrote:
> I want to debug the procedure of windows os install with windbg, windbg executes instruction(fxsave) after the blank vm is started and before guest iso start to install, fxsave trigger the following code path:
>
> vmx_vmexit_handler(EXIT_REASON_EPT_VIOLATION)
>   ->ept_handle_violation
>   ->hvm_hap_nested_page_fault
>   ->handle_mmio_with_translation
>   ->handle_mmio
>   ->hvm_emulate_one
>   ->x86_emulate
>
> X86_emulate return X86EMUL_UNHANDLEABLE
>
> The xl dmesg log;
>
> (d5) Writing SMBIOS tables ...
> (d5) Loading OVMF ...
> (XEN) d5v0 Over-allocation for domain 5: 2097409 > 2097408
> (XEN) memory.c:155:d5v0 Could not allocate order=0 extent: id=5 memflags=0 (0 of 1)
> (d5) Loading ACPI ...
> (d5) vm86 TSS at fc012d00
> (d5) BIOS map:
> (d5) ffe0-: Main BIOS
> (d5) E820 table:
> (d5) [00]: : - :000a: RAM
> (d5) HOLE: :000a - :000f
> (d5) [01]: :000f - :0010: RESERVED
> (d5) [02]: :0010 - :f000: RAM
> (d5) HOLE: :f000 - :fc00
> (d5) [03]: :fc00 - 0001:: RESERVED
> (d5) [04]: 0001: - 0002:0f6ed000: RAM
> (d5) Invoking OVMF ...
> (XEN) stdvga.c:147:d5v0 entering stdvga and caching modes
> (XEN) stdvga.c:151:d5v0 leaving stdvga
> (XEN) irq.c:276: Dom5 PCI link 0 changed 5 -> 11
> (XEN) irq.c:276: Dom5 PCI link 1 changed 10 -> 11
> (XEN) irq.c:276: Dom5 PCI link 2 changed 11 -> 10
> (XEN) irq.c:276: Dom5 PCI link 3 changed 5 -> 10
> (XEN) MMIO emulation failed: d5v0 64bit @ 0028:efe54dab - 0f ae 07 fc ff 75 10 48 8b 4d 08 48 89 e2 48 83
> (XEN) MMIO emulation failed: d5v0 64bit @ 0028:efe54dab - 0f ae 07 fc ff 75 10 48 8b 4d 08 48 89 e2 48 83
> (XEN) MMIO emulation failed: d5v0 64bit @ 0028:efe54dab - 0f ae 07 fc ff 75 10 48 8b 4d 08 48 89 e2 48 83
> (XEN) d5v0 Triple fault - invoking HVM shutdown action 1

Considering the address (below 4Gb) I'd view it equally possible that it's OVMF that is running into this (and Windows may not have got control at all by that time).

But as others have said - unless you're using VM events, it first of all would need to be understood why fxsave would be issued on MMIO space, which as a very minimum requires register state to be made visible.

Jan
Re: [Xen-devel] [PATCH v2 05/12] VMX/altp2m: add code to support EPTP switching and #VE.
On 22/06/15 19:56, Ed White wrote:
> Implement and hook up the code to enable VMX support of VMFUNC and #VE.
>
> VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.
>
> Signed-off-by: Ed White edmund.h.wh...@intel.com
> ---
>  xen/arch/x86/hvm/vmx/vmx.c | 132 +
>  1 file changed, 132 insertions(+)
>
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 2d3ad63..e8d9c82 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -56,6 +56,7 @@
>  #include <asm/debugger.h>
>  #include <asm/apic.h>
>  #include <asm/hvm/nestedhvm.h>
> +#include <asm/hvm/altp2mhvm.h>
>  #include <asm/event.h>
>  #include <asm/monitor.h>
>  #include <public/arch-x86/cpuid.h>
> @@ -1763,6 +1764,100 @@ static void vmx_enable_msr_exit_interception(struct domain *d)
>                                           MSR_TYPE_W);
>  }
>
> +static void vmx_vcpu_update_eptp(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    struct p2m_domain *p2m = NULL;
> +    struct ept_data *ept;
> +
> +    if ( altp2mhvm_active(d) )
> +        p2m = p2m_get_altp2m(v);
> +    if ( !p2m )
> +        p2m = p2m_get_hostp2m(d);
> +
> +    ept = &p2m->ept;
> +    ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
> +
> +    vmx_vmcs_enter(v);
> +
> +    __vmwrite(EPT_POINTER, ept_get_eptp(ept));
> +
> +    if ( v->arch.hvm_vmx.secondary_exec_control &
> +         SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
> +        __vmwrite(EPTP_INDEX, vcpu_altp2mhvm(v).p2midx);
> +
> +    vmx_vmcs_exit(v);
> +}
> +
> +static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
> +
> +    if ( !cpu_has_vmx_vmfunc )
> +        return;
> +
> +    if ( cpu_has_vmx_virt_exceptions )
> +        mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
> +
> +    vmx_vmcs_enter(v);
> +
> +    if ( !d->is_dying && altp2mhvm_active(d) )
> +    {
> +        v->arch.hvm_vmx.secondary_exec_control |= mask;
> +        __vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
> +        __vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
> +
> +        if ( cpu_has_vmx_virt_exceptions )
> +        {
> +            p2m_type_t t;
> +            mfn_t mfn;
> +
> +            mfn = get_gfn_query_unlocked(d, vcpu_altp2mhvm(v).veinfo_gfn, &t);

get_gfn_query_unlocked() returns _mfn(INVALID_MFN) in the failure case, which you must not blindly write back.

> +            __vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);

pfn_to_paddr() please, rather than opencoding it. (This is a helper which needs cleaning up, name-wise).

> +        }
> +    }
> +    else
> +        v->arch.hvm_vmx.secondary_exec_control &= ~mask;
> +
> +    __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> +        v->arch.hvm_vmx.secondary_exec_control);
> +
> +    vmx_vmcs_exit(v);
> +}
> +
> +static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
> +{
> +    bool_t rc = 0;
> +    ve_info_t *veinfo = vcpu_altp2mhvm(v).veinfo_gfn ?
> +        hvm_map_guest_frame_rw(vcpu_altp2mhvm(v).veinfo_gfn, 0) : NULL;

gfn 0 is a valid (albeit unlikely) location to request the veinfo page. Use GFN_INVALID as the sentinel.

> +
> +    if ( !veinfo )
> +        return 0;
> +
> +    if ( veinfo->semaphore != 0 )
> +        goto out;

The semantics of this semaphore are not clearly spelled out in the manual. The only information I can locate concerning this field is in note in 25.5.6.1 which says:

"Delivery of virtualization exceptions writes the value FFFFFFFFH to offset 4 in the virtualization-exception information area (see Section 25.5.6.2). Thus, once a virtualization exception occurs, another can occur only if software clears this field."

I presume this should be taken to mean "software writes 0 to this field", but some clarification would be nice.

> +
> +    rc = 1;
> +
> +    veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
> +    veinfo->semaphore = ~0l;

semaphore is declared as an unsigned field, so should use ~0u.

> +    veinfo->eptp_index = vcpu_altp2mhvm(v).p2midx;
> +
> +    vmx_vmcs_enter(v);
> +    __vmread(EXIT_QUALIFICATION, &veinfo->exit_qualification);
> +    __vmread(GUEST_LINEAR_ADDRESS, &veinfo->gla);
> +    __vmread(GUEST_PHYSICAL_ADDRESS, &veinfo->gpa);
> +    vmx_vmcs_exit(v);
> +
> +    hvm_inject_hw_exception(TRAP_virtualisation,
> +        HVM_DELIVER_NO_ERROR_CODE);
> +
> +out:
> +    hvm_unmap_guest_frame(veinfo, 0);
> +    return rc;
> +}
> +
>  static struct hvm_function_table __initdata vmx_function_table = {
>      .name = "VMX",
>      .cpu_up_prepare = vmx_cpu_up_prepare,
> @@ -1822,6 +1917,9 @@ static struct hvm_function_table __initdata vmx_function_table = {
>      .nhvm_hap_walk_L1_p2m = nvmx_hap_walk_L1_p2m,
>      .hypervisor_cpuid_leaf = vmx_hypervisor_cpuid_leaf,
>      .enable_msr_exit_interception = vmx_enable_msr_exit_interception,
> +    .ahvm_vcpu_update_eptp = vmx_vcpu_update_eptp,
> +    .ahvm_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
> +    .ahvm_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
>  };
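Two of the review points above (gfn 0 being a valid location, and the semaphore gating further #VE delivery) can be illustrated with a small mock. This is a sketch under stated assumptions, not Xen code: every demo_* name and DEMO_INVALID_GFN is hypothetical, and the struct is a simplified stand-in for the virtualization-exception information area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel: an all-ones gfn means "no #VE info page set". */
#define DEMO_INVALID_GFN (~0UL)

/* Simplified mock of the start of the #VE information area. */
struct demo_ve_info {
    uint32_t exit_reason;
    uint32_t semaphore;   /* non-zero: a #VE is already outstanding */
};

struct demo_vcpu {
    unsigned long veinfo_gfn;      /* DEMO_INVALID_GFN when unset */
    struct demo_ve_info *veinfo;   /* stands in for mapping the guest frame */
};

/*
 * Try to deliver a #VE.  Returns false if no info page is configured or if
 * the guest has not yet cleared the semaphore from a previous delivery.
 */
static bool demo_deliver_ve(struct demo_vcpu *v, uint32_t reason)
{
    assert(v != NULL);

    /* Test against the sentinel, not against 0: gfn 0 is a valid frame. */
    if ( v->veinfo_gfn == DEMO_INVALID_GFN || v->veinfo == NULL )
        return false;

    if ( v->veinfo->semaphore != 0 )
        return false;

    v->veinfo->exit_reason = reason;
    v->veinfo->semaphore = ~0u;   /* unsigned all-ones, as the field is unsigned */

    return true;
}
```

In this mock a second delivery is refused until software writes 0 back to the semaphore field, which is the reading of the manual that the review presumes.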
Re: [Xen-devel] [PATCH v3 07/18] x86/hvm: unify internal portio and mmio intercepts
On 23.06.15 at 12:39, paul.durr...@citrix.com wrote:
> The implementation of mmio and portio intercepts is unnecessarily different. This leads to much code duplication. This patch unifies much of the intercept handling, leaving only distinct handlers for stdvga mmio and dpci portio. Subsequent patches will unify those handlers.
>
> Signed-off-by: Paul Durrant paul.durr...@citrix.com
> Cc: Keir Fraser k...@xen.org
> Cc: Jan Beulich jbeul...@suse.com
> Cc: Andrew Cooper andrew.coop...@citrix.com
> ---

Please help reviewers of previous versions of your patches by summarizing changes in the current version here.

> +static int hvm_portio_read(struct hvm_io_handler *handler,
> +                           uint64_t addr,
> +                           uint64_t size,
> +                           uint64_t *data)
>  {
> -    struct vcpu *curr = current;
> -    unsigned int i;
> +    uint32_t val = 0;
> +    int rc;
>
> -    for ( i = 0; i < HVM_MMIO_HANDLER_NR; ++i )
> -        if ( hvm_mmio_handlers[i]->check(curr, gpa) )
> -            return 1;
> +    BUG_ON(handler->type != IOREQ_TYPE_PIO);
>
> -    return 0;
> +    rc = handler->u.portio.action(IOREQ_READ, addr, size, &val);
> +    if ( rc == X86EMUL_OKAY )
> +        *data = val;

I think there would be no harm doing this unconditionally, and it would eliminate the potential of the caller using uninitialized data.

> @@ -284,29 +185,37 @@ static int process_portio_intercept(portio_action_t action, ioreq_t *p)
>      {
>          for ( i = 0; i < p->count; i++ )
>          {
> -            data = 0;
> -            switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
> -                                              p->size) )
> +            if ( p->data_is_ptr )
>              {
> -            case HVMCOPY_okay:
> -                break;
> -            case HVMCOPY_gfn_paged_out:
> -            case HVMCOPY_gfn_shared:
> -                rc = X86EMUL_RETRY;
> -                break;
> -            case HVMCOPY_bad_gfn_to_mfn:
> -                data = ~0;
> -                break;
> -            case HVMCOPY_bad_gva_to_gfn:
> -                ASSERT(0);
> -                /* fall through */
> -            default:
> -                rc = X86EMUL_UNHANDLEABLE;
> -                break;
> +                switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
> +                                                  p->size) )
> +                {
> +                case HVMCOPY_okay:
> +                    break;
> +                case HVMCOPY_gfn_paged_out:
> +                case HVMCOPY_gfn_shared:
> +                    rc = X86EMUL_RETRY;
> +                    break;
> +                case HVMCOPY_bad_gfn_to_mfn:
> +                    data = ~0;
> +                    break;
> +                case HVMCOPY_bad_gva_to_gfn:
> +                    ASSERT_UNREACHABLE();
> +                    /* fall through */
> +                default:
> +                    rc = X86EMUL_UNHANDLEABLE;
> +                    break;
> +                }
> +                if ( rc != X86EMUL_OKAY )
> +                    break;
>              }
> -            if ( rc != X86EMUL_OKAY )
> -                break;
> -            rc = action(IOREQ_WRITE, p->addr, p->size, data);
> +            else
> +                data = p->data;
> +
> +            addr = (p->type == IOREQ_TYPE_COPY) ?
> +                p->addr + step * i :
> +                p->addr;

Indentation.

> @@ -324,78 +233,133 @@ static int process_portio_intercept(portio_action_t action, ioreq_t *p)
>      return rc;
>  }
>
> -/*
> - * Check if the request is handled inside xen
> - * return value: 0 --not handled; 1 --handled
> - */
> -int hvm_io_intercept(ioreq_t *p, int type)
> +static struct hvm_io_handler *hvm_find_io_handler(ioreq_t *p)
> +{
> +    struct vcpu *curr = current;
> +    struct domain *curr_d = curr->domain;

curr is used only once (here) and hence pointless as a local variable.

> +    const struct hvm_io_ops *ops =
> +        (p->type == IOREQ_TYPE_COPY) ?
> +        &mmio_ops :
> +        &portio_ops;
> +    unsigned int i;
> +
> +    for ( i = 0; i < curr_d->arch.hvm_domain.io_handler_count; i++ )
> +    {
> +        struct hvm_io_handler *handler =
> +            &curr_d->arch.hvm_domain.io_handler[i];
> +        uint64_t start, end, count = p->count, size = p->size;

I'm not really happy with all these 64-bit local variables, but I guess they are the result of you not wanting to do things the way they're currently done everywhere... From a logical pov, start and end would better be paddr_t, count and size unsigned long.

> +        if ( handler->type != p->type )
> +            continue;
> +
> +        switch ( handler->type )
> +        {
> +        case IOREQ_TYPE_PIO:
> +            start = p->addr;
> +            end = p->addr + size;
> +            break;
> +        case IOREQ_TYPE_COPY:
> +            if ( p->df )
> +            {
> +                start = (p->addr - (count - 1) * size);
> +                end = p->addr + size;
> +            }
> +            else
> +            {
> +                start = p->addr;
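The quoted hvm_find_io_handler() hunk computes the address range covered by a repeated access, taking the direction flag into account, and is cut off in the forward (df clear) branch. The sketch below shows one way such a range could be computed. The forward-case end address (addr + count * size) is an assumption made for illustration, since that part of the hunk is not shown above, and demo_range / demo_rep_range() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, end). */
struct demo_range {
    uint64_t start;
    uint64_t end;
};

/*
 * Range touched by 'count' repetitions of a 'size'-byte access whose first
 * element is at 'addr'.  With the direction flag set, the repetitions walk
 * downwards from 'addr'; otherwise they walk upwards.
 */
static struct demo_range demo_rep_range(uint64_t addr, uint64_t size,
                                        uint64_t count, bool df)
{
    struct demo_range r;

    assert(size > 0 && count > 0);

    if ( df )
    {
        /* As in the quoted hunk: the lowest element is (count - 1) back. */
        r.start = addr - (count - 1) * size;
        r.end = addr + size;
    }
    else
    {
        /* Assumed: the forward case ends after 'count' elements. */
        r.start = addr;
        r.end = addr + count * size;
    }

    return r;
}
```

Three 4-byte repetitions starting at 0x1000 going up and three starting at 0x1008 going down cover the same 12 bytes in this sketch.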
Re: [Xen-devel] Hyper and Xen Project
On Wed, 24 Jun 2015, Dave Scott wrote:
On 24 Jun 2015, at 12:48, Stefano Stabellini stefano.stabell...@eu.citrix.com wrote:

Hi Wang,
I don't know the answer, so I CCed xen-devel (the Xen development list) and a few people that I think will be able to help.

Cheers,
Stefano

On Wed, 24 Jun 2015, Wang Xu wrote:

A problem about channel, where do I found the channel name in the guest, In the document, it says I could found it in sysfs, but looks there isn't a name property:

| root@test-container-create-ubuntu:/sys/bus/xen/devices# udevadm info --attribute-walk --path=/devices/console-1
| [...]
|
| looking at device '/devices/console-1':
|   KERNEL=="console-1"
|   SUBSYSTEM=="xen"
|   DRIVER=="xenconsole"
|   ATTR{devtype}=="console"
|   ATTR{nodename}=="device/console/1"

I don't think the frontend driver in Linux knows about the name key. In my testing I wrote a udev script which looks up the 'name' key directly in xenstore and created a named device node using that. For reference my script is here:

https://github.com/mirage/mirage-console/blob/master/udev/xenconsole-setup-tty

That's a great workaround. However I think it would be best to make the Linux hvc_xen driver aware of the name key going forward.

and I directly test `/dev/hvc1`, and it could communicate with the outside socket. Is there some mistake in my channel name configuration?

| static void hyper_config_channel(libxl_device_channel* ch, const char* name, const char* sock, int devid) {
|     libxl_device_channel_init(ch);
|     ch->backend_domid = 0;
|     ch->name = strdup(name);
|     ch->devid = devid;
|     ch->connection = LIBXL_CHANNEL_CONNECTION_SOCKET;
|     ch->u.socket.path = strdup(sock);
| }

I tried to look at the oVirt code as it is mentioned in the doc, but I did not find xen console in its guest agent code.

So the issue is that the name you assign here to the channel, doesn't come up anywhere in the guest. Is that correct?

Thank you!

On Tue, Jun 23, 2015 at 7:30 PM, Stefano Stabellini stefano.stabell...@eu.citrix.com wrote:
On Tue, 23 Jun 2015, Wang Xu wrote:
On Sat, Jun 20, 2015 at 1:10 AM Stefano Stabellini stefano.stabell...@eu.citrix.com wrote:

Integrating hyper with Xen using libxl was the right decision and it looks like you did a good job. I think that you can go ahead with the PR!

But I did have a few issues building hyper. I am getting:

hyperd.go:11:2: cannot find package hyper/daemon in any of: [...]

I tried with a clean 0.2-dev branch

./autogen.sh
./configure
make

It looks ok, are you work on the 0.2-dev branch, I did not write the branch name in the instruction of Readme, sorry for that.

No worries, the most important part at this stage is the code, and that looks OK :-)

Yes, I was using 0.2-dev and followed those steps. As I usually don't program in go, it is likely that my go working environment is missing something, or my go paths are wrong. This is the full error message:

CGO_LDFLAGS=-Lhypervisor/xen -lxenlight -lxenctrl -lhyperxl godep go build hyperd.go
hyperd.go:11:2: cannot find package hyper/daemon in any of:
    /local/scratch/sstabellini/go/src/hyper/daemon (from $GOROOT)
    /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/daemon (from $GOPATH)
hyperd.go:10:2: cannot find package hyper/engine in any of:
    /local/scratch/sstabellini/go/src/hyper/engine (from $GOROOT)
    /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/engine (from $GOPATH)
hyperd.go:12:2: cannot find package hyper/lib/glog in any of:
    /local/scratch/sstabellini/go/src/hyper/lib/glog (from $GOROOT)
    /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/lib/glog (from $GOPATH)
hyperd.go:13:2: cannot find package hyper/utils in any of:
    /local/scratch/sstabellini/go/src/hyper/utils (from $GOROOT)
    /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/utils (from $GOPATH)
godep: go exit status 1

Looking through the code, it seems that you are adding a virtio-serial-pci device, why do you need it? It is not used very much on Xen; the regular Xen uart is specified by setting b_info->u.hvm.serial to pty, and it looks like you are already doing that. If you need more than one console, you can have a list setting b_info->u.hvm.serial_list.

What the difference between u.hvm.serial_list and channels in domain_config. The channel looks having more features.

Actually I think that you are right: channels are better tested and more flexible.

virtio-9p-pci is also
[Xen-devel] vTPM issues
Hello everyone,

I would like to try the vTPM feature, but I'm having some issues. Basically, I followed the steps explained in
https://mhsamsal.wordpress.com/2013/12/05/configuring-virtual-tpm-vtpm-for-xen-4-3-guest-virtual-machines/

I'm running Ubuntu 14.04 as Dom0 on a Dell optiplex-9020. I compiled Xen 4.5.0 from source.

After creating the vtpmmgr and vtpm stubdoms, and the DomU, I can invoke tpm_version from DomU:

root@DomU:/home/xen# tpm_version
  TPM 1.2 Version Info:
  Chip Version:        1.2.0.7
  Spec Level:          2
  Errata Revision:     1
  TPM Vendor ID:       ETHZ
  TPM Version:         0101
  Manufacturer Info:   4554485a

I can also see the PCR status by invoking cat /sys/class/misc/tpm0/device/pcrs. However, most of the commands return an error. When I invoke takeownership I get the following error:

root@DomU:/home/xen# tpm_takeownership -y -z -l debug
Tspi_Context_Create success
Tspi_Context_Connect success
Tspi_Context_GetTpmObject success
Tspi_GetPolicyObject success
Tspi_Policy_SetSecret success
Tspi_Context_CreateObject success
Tspi_GetPolicyObject success
Tspi_Policy_SetSecret success
Tspi_TPM_TakeOwnership failed: 0x2004 - layer=tcs, code=0004 (4), Internal software error
Tspi_Context_CloseObject success
Tspi_Context_FreeMemory success
Tspi_Context_Close success

The same error is given when invoking tpm_getpubkey.

I have already tried after clearing the TPM from the BIOS, both after having taken ownership and with ownership not taken, with the same result when using the vTPM. I have also installed Xen 4.3.4, with the same result too.

In the end, I would like to use the vTPM to generate and use RSA keys for TLS session establishment (using the API provided with GnuTLS). Since I cannot take ownership of the vTPM, GnuTLS' tpmtool complains that it doesn't find any SRK.

I would really appreciate any help you can provide.

Best regards,
Marcos
Re: [Xen-devel] [PATCH 6/6] AMD-PVH: enable pvh if requirements met
On Wed, 24 Jun 2015 16:26:44 -0400 Elena Ufimtseva elena.ufimts...@oracle.com wrote:
> On Wed, Jun 24, 2015 at 07:24:18PM +0100, Andrew Cooper wrote:
> > On 24/06/15 08:49, Jan Beulich wrote:
> > > On 24.06.15 at 04:34, boris.ostrov...@oracle.com wrote:
> > > > On 06/23/2015 08:30 AM, Jan Beulich wrote:
> > > > > On 22.06.15 at 18:37, elena.ufimts...@oracle.com wrote:
> > > > > > --- a/xen/arch/x86/hvm/svm/svm.c
> > > > > > +++ b/xen/arch/x86/hvm/svm/svm.c
> > > > > > @@ -1444,6 +1444,9 @@ const struct hvm_function_table * __init start_svm(void)
> > > > > >      svm_function_table.hap_capabilities = HVM_HAP_SUPERPAGE_2MB |
> > > > > >          ((cpuid_edx(0x80000001) & 0x04000000) ? HVM_HAP_SUPERPAGE_1GB : 0);
> > > > > >
> > > > > > +    if ( cpu_has_svm_npt && cpu_has_svm_decode )
> > > > > > +        svm_function_table.pvh_supported = 1;
> > > > >
> > > > > If svm_decode indeed is a prereq, then the earlier patch dealing with the handle_mmio() invocations doesn't need to fiddle with VMEXIT_INVLPG other than to maybe add a documenting ASSERT().
> > > >
> > > > I am not sure we should require decode feature to be required for PVH support. I can't remember exactly but I think this feature was first introduced in family 15h so requiring it will leave at least family 10h processors as not supporting PVH.
> > >
> > > The question was why the dependency was added in the first place. Indeed only fam 12, 15, and 16 have the field documented. Otoh PVH isn't being supported universally on all VMX variants either...
> >
> > Right, but this is a bug (feature?) of the current implementation and need fixing. There are no technical reasons to prevent PVH guests running in any case where an HVM guest currently runs. The only technical restriction I can think of is that a PVH hardware domain needs IOMMU support, but that is it.
>
> CCing Mukesh, maybe he will reply to as why that restriction is here.

Hi Elena,

Basically, the restriction was to allow AMD to come on par with intel and get phase I working on it. Then, I could just focus on handle_mmio for INS/OUTS for both intel and amd, and if supporting !svm_decode family of CPUs was important, then extend handle_mmio further...

http://xen-devel.narkive.com/liQjEoV2/rfh-amd-cr-intercept-for-lmsw-clts

[In the absence of svm_decode, mov cr would need to go thru handle_mmio..]

thanks,
Mukesh
Re: [Xen-devel] Interested in taking up a project
Thanks for the help and support guys.

I'll need some time to get a proper understanding of how it is incorporated in Linux kernel and what all interfaces are built on top of it. Once I'm comfortable with that and xen's credit_scheduler, for starting, I'll come up with a design doc and share with you all.

I'll keep reporting the progress of the work and ask related doubts in this thread.

Thanks,
Abhinav

On Mon, Jun 22, 2015, 3:15 PM George Dunlap george.dun...@eu.citrix.com wrote:
> On 06/21/2015 07:37 AM, Abhinav Gupta wrote:
> > Hii,
> >
> > I'm still waiting for the confirmation. Have started looking into the code though.
>
> Hey Abhinav,
>
> Thanks for your interest! As others have said, it's a free world, so of course you can work on and attempt to contribute whatever you want. :-)
>
> There's nobody else working on this yet, and it's probably still a good idea, so in that sense, the project is something that you should feel free to start working on.
>
> I don't have time at the moment to commit to the level of mentorship I would if you were a GSoC intern; but as a community, we're generally pretty good about helping people who try to get involved -- as you've already found out. :-)
>
> One heads-up: A thing we've started doing in our community, before submitting a large new feature, is to post a design document describing the purpose of the new feature, and a technical overview of the changes that you want to make and why.
>
> This is *not required*; you are free to just submit patches with your changes, and many people do. However, it's not uncommon for maintainers to request significant changes to the architecture or approach on major features, which require a major re-write. This can be frustrating both for you and for us. Done properly, a design document can make things easier for all of us.
>
> Looking forward to seeing your work!
>
> -George
Re: [Xen-devel] [PATCH v7 5/9] PCI: Add pci_iomap_wc() variants
On Wed, 2015-06-24 at 18:38 +0200, Luis R. Rodriguez wrote:
> On Wed, Jun 24, 2015 at 08:42:23AM +1000, Benjamin Herrenschmidt wrote:
> > On Fri, 2015-06-19 at 15:08 -0700, Luis R. Rodriguez wrote:
> > > From: Luis R. Rodriguez mcg...@suse.com
> > >
> > > PCI BARs tell us whether prefetching is safe, but they don't say anything about write combining (WC). WC changes ordering rules and allows writes to be collapsed, so it's not safe in general to use it on a prefetchable region.
> >
> > Well, the PCIe spec at least specifies that a prefetchable BAR also tolerates write merging...
>
> How can that be determined and can that be used as a full bullet proof hint to enable wc ? And are you sure? :)

Well, I'm sure the spec says that ;-) But it could be new to PCIe, I haven't checked legacy PCI.

> Reason all this was stated was to be apologetic over why we can't automate this behind the scenes. Otherwise we could amend what you stated into the commit log to elaborate on our technical apology. Let me know!

At least on powerpc, for mmap of resource to userspace, we take off the guarded bit in the PTE for prefetchable BARs. This has the effect architecturally of enabling both prefetch and write combine (ie. side effect) though afaik, the implementations probably don't actually prefetch. We've done that for years.

In fact we don't have a way to split the notions, it's either G or no G, which carries both meanings.

Do you have an example/case of a device having problems ?

Cheers,
Ben.
[Xen-devel] [xen-unstable test] 58851: regressions - FAIL
flight 58851 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/58851/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-rumpuserxen-amd64 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail REGR. vs. 58821 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. vs. 58821 Regressions which are regarded as allowable (not blocking): test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail like 58773 test-amd64-i386-libvirt-xsm 11 guest-start fail like 58821 test-amd64-i386-libvirt 11 guest-start fail like 58821 test-amd64-amd64-libvirt-xsm 11 guest-start fail like 58821 test-amd64-amd64-libvirt 11 guest-start fail like 58821 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58821 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 58821 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-sedf 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-sedf-pin 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass version targeted for testing: xen 6b444c8e1c19fac08c82f18011ad00ca185e4e80 baseline version: xen 
e76ff6c156906b515c2a4300a81c95886ece5d5f People who touched revisions under test: Boris Ostrovsky boris.ostrov...@oracle.com David Vrabel david.vra...@citrix.com Don Slutz dsl...@verizon.com Ian Jackson ian.jack...@eu.citrix.com Jan Beulich jbeul...@suse.com Juergen Gross jgr...@suse.com jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-oldkern pass build-i386-oldkern pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl pass test-armhf-armhf-xl pass test-amd64-i386-xl pass test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemut-debianhvm-amd64-xsm fail test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm fail test-amd64-amd64-libvirt-xsm fail test-armhf-armhf-libvirt-xsm pass test-amd64-i386-libvirt-xsm fail test-amd64-amd64-xl-xsm pass test-armhf-armhf-xl-xsm pass test-amd64-i386-xl-xsm pass test-amd64-amd64-xl-pvh-amd
Re: [Xen-devel] [PATCH 6/6] AMD-PVH: enable pvh if requirements met
On Wed, Jun 24, 2015 at 07:24:18PM +0100, Andrew Cooper wrote:
On 24/06/15 08:49, Jan Beulich wrote:
On 24.06.15 at 04:34, boris.ostrov...@oracle.com wrote:
On 06/23/2015 08:30 AM, Jan Beulich wrote:
On 22.06.15 at 18:37, elena.ufimts...@oracle.com wrote:

--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1444,6 +1444,9 @@ const struct hvm_function_table * __init start_svm(void)
     svm_function_table.hap_capabilities = HVM_HAP_SUPERPAGE_2MB |
         ((cpuid_edx(0x80000001) & 0x04000000) ? HVM_HAP_SUPERPAGE_1GB : 0);

+    if ( cpu_has_svm_npt && cpu_has_svm_decode )
+        svm_function_table.pvh_supported = 1;

If svm_decode indeed is a prereq, then the earlier patch dealing with the handle_mmio() invocations doesn't need to fiddle with VMEXIT_INVLPG other than to maybe add a documenting ASSERT().

I am not sure we should require the decode feature for PVH support. I can't remember exactly, but I think this feature was first introduced in family 15h, so requiring it will leave at least family 10h processors as not supporting PVH.

The question was why the dependency was added in the first place. Indeed only fam 12, 15, and 16 have the field documented. Otoh PVH isn't being supported universally on all VMX variants either...

Right, but this is a bug (feature?) of the current implementation and needs fixing. There are no technical reasons to prevent PVH guests running in any case where an HVM guest currently runs.

The only technical restriction I can think of is that a PVH hardware domain needs IOMMU support, but that is it.

CCing Mukesh, maybe he will reply as to why that restriction is here.

~Andrew
Re: [Xen-devel] [PATCH v2 00/12] Alternate p2m: support multiple copies of host p2m
On Wed, Jun 24, 2015 at 12:43 PM, Ed White edmund.h.wh...@intel.com wrote:
On 06/24/2015 06:37 AM, Razvan Cojocaru wrote:
On 06/24/2015 04:32 PM, Lengyel, Tamas wrote:
On Wed, Jun 24, 2015 at 1:39 AM, Razvan Cojocaru rcojoc...@bitdefender.com wrote:
On 06/24/2015 12:27 AM, Lengyel, Tamas wrote:

I've extended xen-access to exercise this new feature taking into account some of the current limitations. Using the altp2m_write|exec options we create a duplicate view of the default hostp2m, and instead of relaxing the mem_access permissions when we encounter a violation, we swap the view on the violating vCPU while also enabling MTF singlestepping. When the singlestep event fires, we use the response to that event to swap the view back to the restricted altp2m view.

That's certainly very interesting. I wonder what the benefits are in this case over emulating the fault-causing instruction (other than obviously not going through the emulator)? The altp2m method would certainly be slower, since you need more round-trips from userspace to the hypervisor (the EPT vm_event handling + the singlestep event, whereas with emulation you just reply to the original vm_event).

Regards,
Razvan

Certainly, this is pretty slow right now, especially for the altp2m_exec case. However, sometimes you simply cannot emulate. For example if you write breakpoints into target locations, the original instruction has been overwritten with 0xCC. If you have a duplicate of the page without the breakpoint, this is an easy way to make the guest fetch the original instruction. Of course, if you extend the emulation routine where you can provide the instruction to emulate, instead of it being fetched from guest memory, that would be equally useful ;)

Makes sense, thanks for the explanation! Sure, sending back the instruction to emulate could be something to consider for the future.
Thanks, Razvan One thing I'd add is that what Tamas has done provides a valuable test that the cross-domain functionality works, even if it might not be a recommended design pattern. Our primary use case at Intel is intra-domain, and there the advantages of avoiding many exits are clear. Also, even cross-domain usage allows for different views of, and levels of access to, memory concurrently on different vcpus. Ed Hi Ed, I tried the system using memsharing and I collected the following crash log. In this test I ran memsharing on all pages of the domain before activating altp2m and creating the view. Afterwards I used my updated xen-access to create a copy of this p2m with only R/X permissions. The idea would be that the altp2m view remains completely shared, while the hostp2m would be able to do its CoW propagation as the domain is executing. (XEN) mm locking order violation: 278 239 (XEN) Xen BUG at mm-locks.h:68 (XEN) [ Xen-4.6-unstable x86_64 debug=y Tainted:C ] (XEN) CPU:2 (XEN) RIP:e008:[82d0801f8768] p2m_altp2m_propagate_change+0x85/0x4a9 (XEN) RFLAGS: 00010282 CONTEXT: hypervisor (d6v0) (XEN) rax: rbx: rcx: (XEN) rdx: 8302163a8000 rsi: 000a rdi: 82d0802a069c (XEN) rbp: 8302163afa68 rsp: 8302163af9e8 r8: 83021c00 (XEN) r9: 0003 r10: 00ef r11: 0003 (XEN) r12: 83010cc51820 r13: r14: 830158d9 (XEN) r15: 00025697 cr0: 80050033 cr4: 001526f0 (XEN) cr3: dbba3000 cr2: 778c9714 (XEN) ds: es: fs: gs: ss: cs: e008 (XEN) Xen stack trace from rsp=8302163af9e8: (XEN)8302163af9f8 803180f8 000c 82d0801892ee (XEN)82d0801fb4d1 83010cc51de0 0008ff49 82d08012f86a (XEN)83010cc51820 83010cc51820 (XEN)83010cc51820 8300dbb334b8 8302163afa00 (XEN)8302163afb18 82d0801fd549 00050009 83020001 (XEN)0001 830158d9 0002 0008ff49 (XEN)00025697 000c 8302163afae8 80c08ff49175 (XEN)80c0d0a97175 01ff83010cc51820 0097 8300dbb33000 (XEN)8302163afb78 0008ff49 0001 (XEN)00025697 83010cc51820 8302163afb38 82d0801fd644 (XEN) 000d0a97 8302163afb98 82d0801f23c5 (XEN)830158d9 0cc51820 830158d9 000c 
(XEN)0008ff49 83010cc51820 00025697 000d0a97 (XEN)0008ff49 830158d9 8302163afbd8 82d0801f45c8 (XEN)83010cc51820 000c 83008fd41170
Re: [Xen-devel] [PATCH v2 00/12] Alternate p2m: support multiple copies of host p2m
On 06/24/2015 03:45 PM, Lengyel, Tamas wrote: On Wed, Jun 24, 2015 at 6:02 PM, Ed White edmund.h.wh...@intel.com wrote: On 06/24/2015 02:34 PM, Lengyel, Tamas wrote: Hi Ed, I tried the system using memsharing and I collected the following crash log. In this test I ran memsharing on all pages of the domain before activating altp2m and creating the view. Afterwards I used my updated xen-access to create a copy of this p2m with only R/X permissions. The idea would be that the altp2m view remains completely shared, while the hostp2m would be able to do its CoW propagation as the domain is executing. (XEN) mm locking order violation: 278 239 (XEN) Xen BUG at mm-locks.h:68 (XEN) [ Xen-4.6-unstable x86_64 debug=y Tainted:C ] (XEN) CPU:2 (XEN) RIP:e008:[82d0801f8768] p2m_altp2m_propagate_change+0x85/0x4a9 (XEN) RFLAGS: 00010282 CONTEXT: hypervisor (d6v0) (XEN) rax: rbx: rcx: (XEN) rdx: 8302163a8000 rsi: 000a rdi: 82d0802a069c (XEN) rbp: 8302163afa68 rsp: 8302163af9e8 r8: 83021c00 (XEN) r9: 0003 r10: 00ef r11: 0003 (XEN) r12: 83010cc51820 r13: r14: 830158d9 (XEN) r15: 00025697 cr0: 80050033 cr4: 001526f0 (XEN) cr3: dbba3000 cr2: 778c9714 (XEN) ds: es: fs: gs: ss: cs: e008 (XEN) Xen stack trace from rsp=8302163af9e8: (XEN)8302163af9f8 803180f8 000c 82d0801892ee (XEN)82d0801fb4d1 83010cc51de0 0008ff49 82d08012f86a (XEN)83010cc51820 83010cc51820 (XEN)83010cc51820 8300dbb334b8 8302163afa00 (XEN)8302163afb18 82d0801fd549 00050009 83020001 (XEN)0001 830158d9 0002 0008ff49 (XEN)00025697 000c 8302163afae8 80c08ff49175 (XEN)80c0d0a97175 01ff83010cc51820 0097 8300dbb33000 (XEN)8302163afb78 0008ff49 0001 (XEN)00025697 83010cc51820 8302163afb38 82d0801fd644 (XEN) 000d0a97 8302163afb98 82d0801f23c5 (XEN)830158d9 0cc51820 830158d9 000c (XEN)0008ff49 83010cc51820 00025697 000d0a97 (XEN)0008ff49 830158d9 8302163afbd8 82d0801f45c8 (XEN)83010cc51820 000c 83008fd41170 0008ff49 (XEN)00025697 82e001a152e0 8302163afc58 82d080205b51 (XEN)0009 0008ff49 8300d0a97000 83008fd41160 (XEN)82e001a152f0 
82e0011fe920 83010cc51820 000c (XEN)00025697 0003 83010cc51820 8302163afd34 (XEN)00025697 8302163afca8 82d0801f1f7d (XEN) Xen call trace: (XEN)[82d0801f8768] p2m_altp2m_propagate_change+0x85/0x4a9 (XEN)[82d0801fd549] ept_set_entry_sve+0x5fa/0x6e6 (XEN)[82d0801fd644] ept_set_entry+0xf/0x11 (XEN)[82d0801f23c5] p2m_set_entry+0xd4/0x112 (XEN)[82d0801f45c8] set_shared_p2m_entry+0x2d0/0x39b (XEN)[82d080205b51] __mem_sharing_unshare_page+0x83f/0xbd6 (XEN)[82d0801f1f7d] __get_gfn_type_access+0x224/0x2b0 (XEN)[82d0801c6df5] hvm_hap_nested_page_fault+0x21f/0x795 (XEN)[82d0801e86ae] vmx_vmexit_handler+0x1764/0x1af3 (XEN)[82d0801ee891] vmx_asm_vmexit_handler+0x41/0xc0 The crash here is because I haven't successfully forced all the shared pages in the host p2m to become unshared before copying, which is the intended behaviour. I think I know how that has happened and how to fix it, but what you're trying to do won't work by design. By the time a copy from host p2m to altp2m occurs, the sharing is supposed to be broken. Hm. If the sharing gets broken before the hostp2m-altp2m copy, maybe doing sharing after the view has been created is a better route? I guess the sharing code would need to be adapted to check if altp2m is enabled for that to work.. You're coming up with some ways of attempting to use altp2m that we hadn't thought of. That's a good thing, and just what we want, but there are limits to what we can support without more far-reaching changes to existing parts of Xen. This isn't going to be do-able for 4.6. My main concern is just getting it to work, hitting 4.6 is not a priority. I understand that my stuff is highly experimental ;) While the gfn remapping feature is intriguing, in my setup I already have a copy of the page I would want to present during a singlestep-altp2mswitch - in the origin domains memory. AFAIU the
Re: [Xen-devel] [PATCH 1/3] x86: drop is_pv_32on64_vcpu()
On 06/23/2015 11:18 AM, Jan Beulich wrote:
... as being identical to is_pv_32bit_vcpu() after the x86-32 removal.

In a few cases this includes an additional is_pv_32bit_vcpu() -> is_pv_32bit_domain() conversion.

Signed-off-by: Jan Beulich jbeul...@suse.com

We have

struct arch_domain
{
    ...
    /* Is a 32-bit PV (non-HVM) guest? */
    bool_t is_32bit_pv;
    /* Is shared-info page in 32-bit format? */
    bool_t has_32bit_shinfo;
    ...
}

and currently both of these fields are set/unset together (except for one HVM case --- hvm_latch_shinfo_size()).

Why not have a single 'bool is_32bit' and then replace the macros at the top of include/asm-x86/domain.h with is_32bit_vcpu/domain()?

I think in the majority of places where we test for is_pv_32bit_vcpu/domain() we already know that we are PV, so it shouldn't add any additional tests.

-boris
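The single-flag suggestion can be pictured roughly as below. This is only a sketch under stated assumptions: the `is_32bit` field and the `is_32bit_domain()` / `is_32bit_vcpu()` helpers are the hypothetical names from the proposal, the structures are cut-down stand-ins rather than Xen's real definitions, and the one HVM exception mentioned above (hvm_latch_shinfo_size()) is not modelled.

```c
#include <stdbool.h>

/* Cut-down stand-ins for Xen's structures; only the fields needed for
 * this illustration are present (assumption, not the real layout). */
struct arch_domain {
    bool is_32bit;              /* hypothetical merged flag */
};

struct domain {
    struct arch_domain arch;
};

struct vcpu {
    struct domain *domain;
};

/* Hypothetical replacements for the is_pv_32bit_*() macros. */
static inline bool is_32bit_domain(const struct domain *d)
{
    return d->arch.is_32bit;
}

static inline bool is_32bit_vcpu(const struct vcpu *v)
{
    return is_32bit_domain(v->domain);
}
```

Whether one flag is enough depends on how the HVM shared-info case is handled, which this sketch deliberately leaves open.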
Re: [Xen-devel] [PATCH v7 5/9] PCI: Add pci_iomap_wc() variants
On Wed, Jun 24, 2015 at 3:05 PM, Benjamin Herrenschmidt b...@kernel.crashing.org wrote:
On Wed, 2015-06-24 at 18:38 +0200, Luis R. Rodriguez wrote:
On Wed, Jun 24, 2015 at 08:42:23AM +1000, Benjamin Herrenschmidt wrote:
On Fri, 2015-06-19 at 15:08 -0700, Luis R. Rodriguez wrote:
From: Luis R. Rodriguez mcg...@suse.com

PCI BARs tell us whether prefetching is safe, but they don't say anything about write combining (WC). WC changes ordering rules and allows writes to be collapsed, so it's not safe in general to use it on a prefetchable region.

Well, the PCIe spec at least specifies that a prefetchable BAR also tolerates write merging...

How can that be determined and can that be used as a full bullet proof hint to enable wc? And are you sure? :)

Well, I'm sure the spec says that ;-) But it could be new to PCIe, I haven't checked legacy PCI.

OK cool, so to be clear, from what I gather you are suggesting (or not, and letting me make it) that we might be able to enforce write-merging on prefetchable areas, and if we can *ensure* we do this then automatically enable write-combining behind the scenes? The reason all this was stated was to be apologetic over why we can't automate this behind the scenes. Otherwise we could amend what you stated into the commit log to elaborate on our technical apology. Let me know!

At least on powerpc, for mmap of resource to userspace, we take off the guarded bit in the PTE for prefetchable BARs. This has the effect architecturally of enabling both prefetch and write combine (ie. side effect)

That's pretty darn sexy.

though afaik, the implementations probably don't actually prefetch. We've done that for years.

Neat!

In fact we don't have a way to split the notions, it's either G or no G, which carries both meanings.

Interesting.

Do you have an example/case of a device having problems?

Nope, but at least what made me squint at this being a possible feature was that in practice, when reviewing all of the kernel's pending device drivers using MTRR (potential write-combine candidates), I encountered a slew of them which had the architecturally unfortunate practice of combining PCI BARs for MMIO and their respective write-combined desirable area (framebuffer for video, PIO buffers for infiniband, etc). Now, to me that read more as a practice for old school devices when such things were likely still being evaluated; more modern devices seem to adhere to sticking a full PCI BAR with write-combining or not. Did you not encounter such mismatch splits on powerpc? Was such a possibility addressed?

If what you are implying here is applicable to the x86 world I'm all for enabling this, as we'd have less code to maintain, but I'll note that getting a clarification alone on that prefetchable != write-combining was in and of itself hard; I'd be surprised if we could get full architectural buy-in to this as an immediate automatic feature. Because of this, and because PAT did have some errata as well, I would not be surprised if some PCI bridges / devices would end up finding corner cases. As such, if we can really do what you're saying, and unless we can get some super sane certainty over it across the board, I'd be inclined to leave such things as a part of a new API. Maybe have some folks test using the new API for all calls and, after some sanity of testing / releases, consider a full switch. That is, unless of course you're sure all this is sane and would wager all-in on it from the get-go.

Luis
Re: [Xen-devel] [PATCH v2 06/12] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
On 06/24/2015 05:47 AM, Andrew Cooper wrote:
+    case EXIT_REASON_VMFUNC:
+        if ( vmx_vmfunc_intercept(regs) == X86EMUL_OKAY )

This is currently an unconditional failure, and I don't see subsequent patches which alter vmx_vmfunc_intercept(). Shouldn't vmx_vmfunc_intercept() switch on eax and optionally call p2m_switch_vcpu_altp2m_by_id()?

If the VMFUNC instruction was valid, the hardware would have executed it. The only time a VMFUNC exit occurs is if the hardware supports VMFUNC and the hypervisor has enabled it, but the VMFUNC instruction is invalid in some way and can't be executed (because EAX != 0, for example).

There are only two choices: crash the domain or inject #UD (which is the closest analogue to what happens in the absence of a hypervisor and will probably crash the OS in the domain). I chose the latter in the code I originally wrote; Ravi chose the former in his patch. I don't have a strong opinion either way, but I think these are the only two choices.

I hope this answers Jan's question in another email on the same subject.

Ed
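The two options described above can be sketched as a tiny policy function. This is an illustration only, under the assumption stated in the mail that a VMFUNC exit is only ever seen for an invocation the hardware could not execute; the enum and function names are hypothetical and do not correspond to anything in the Xen tree.

```c
/* Hypothetical names, for illustration only. */
enum vmfunc_exit_policy {
    VMFUNC_POLICY_INJECT_UD,     /* the choice in Ed's original code */
    VMFUNC_POLICY_CRASH_DOMAIN   /* the choice in Ravi's patch */
};

enum vmfunc_exit_action {
    VMFUNC_ACTION_INJECT_UD,
    VMFUNC_ACTION_CRASH_DOMAIN
};

/*
 * Called on a VMFUNC VM exit. Since valid invocations are executed by
 * the hardware without exiting, there is no "perform the EPTP switch"
 * branch here: the handler only decides how to fail.
 */
static enum vmfunc_exit_action
handle_vmfunc_exit(enum vmfunc_exit_policy policy)
{
    switch (policy) {
    case VMFUNC_POLICY_CRASH_DOMAIN:
        return VMFUNC_ACTION_CRASH_DOMAIN;
    case VMFUNC_POLICY_INJECT_UD:
    default:
        return VMFUNC_ACTION_INJECT_UD;
    }
}
```

Emulating VMFUNC on hardware without native support would be a separate path and is outside the scope of this sketch.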
Re: [Xen-devel] [PATCH v2 00/12] Alternate p2m: support multiple copies of host p2m
On 06/24/2015 02:34 PM, Lengyel, Tamas wrote:
Hi Ed,

I tried the system using memsharing and I collected the following crash log. In this test I ran memsharing on all pages of the domain before activating altp2m and creating the view. Afterwards I used my updated xen-access to create a copy of this p2m with only R/X permissions. The idea would be that the altp2m view remains completely shared, while the hostp2m would be able to do its CoW propagation as the domain is executing.

(XEN) mm locking order violation: 278 > 239
(XEN) Xen BUG at mm-locks.h:68
(XEN) [ Xen-4.6-unstable x86_64 debug=y Tainted:C ]
(XEN) CPU:2
(XEN) RIP:e008:[82d0801f8768] p2m_altp2m_propagate_change+0x85/0x4a9
(XEN) RFLAGS: 00010282 CONTEXT: hypervisor (d6v0)
(XEN) rax: rbx: rcx:
(XEN) rdx: 8302163a8000 rsi: 000a rdi: 82d0802a069c
(XEN) rbp: 8302163afa68 rsp: 8302163af9e8 r8: 83021c00
(XEN) r9: 0003 r10: 00ef r11: 0003
(XEN) r12: 83010cc51820 r13: r14: 830158d9
(XEN) r15: 00025697 cr0: 80050033 cr4: 001526f0
(XEN) cr3: dbba3000 cr2: 778c9714
(XEN) ds: es: fs: gs: ss: cs: e008
(XEN) Xen stack trace from rsp=8302163af9e8:
(XEN)    8302163af9f8 803180f8 000c 82d0801892ee
(XEN)    82d0801fb4d1 83010cc51de0 0008ff49 82d08012f86a
(XEN)    83010cc51820 83010cc51820
(XEN)    83010cc51820 8300dbb334b8 8302163afa00
(XEN)    8302163afb18 82d0801fd549 00050009 83020001
(XEN)    0001 830158d9 0002 0008ff49
(XEN)    00025697 000c 8302163afae8 80c08ff49175
(XEN)    80c0d0a97175 01ff83010cc51820 0097 8300dbb33000
(XEN)    8302163afb78 0008ff49 0001
(XEN)    00025697 83010cc51820 8302163afb38 82d0801fd644
(XEN)    000d0a97 8302163afb98 82d0801f23c5
(XEN)    830158d9 0cc51820 830158d9 000c
(XEN)    0008ff49 83010cc51820 00025697 000d0a97
(XEN)    0008ff49 830158d9 8302163afbd8 82d0801f45c8
(XEN)    83010cc51820 000c 83008fd41170 0008ff49
(XEN)    00025697 82e001a152e0 8302163afc58 82d080205b51
(XEN)    0009 0008ff49 8300d0a97000 83008fd41160
(XEN)    82e001a152f0 82e0011fe920 83010cc51820 000c
(XEN)    00025697 0003 83010cc51820 8302163afd34
(XEN)    00025697 8302163afca8 82d0801f1f7d
(XEN) Xen call trace:
(XEN)    [82d0801f8768] p2m_altp2m_propagate_change+0x85/0x4a9
(XEN)    [82d0801fd549] ept_set_entry_sve+0x5fa/0x6e6
(XEN)    [82d0801fd644] ept_set_entry+0xf/0x11
(XEN)    [82d0801f23c5] p2m_set_entry+0xd4/0x112
(XEN)    [82d0801f45c8] set_shared_p2m_entry+0x2d0/0x39b
(XEN)    [82d080205b51] __mem_sharing_unshare_page+0x83f/0xbd6
(XEN)    [82d0801f1f7d] __get_gfn_type_access+0x224/0x2b0
(XEN)    [82d0801c6df5] hvm_hap_nested_page_fault+0x21f/0x795
(XEN)    [82d0801e86ae] vmx_vmexit_handler+0x1764/0x1af3
(XEN)    [82d0801ee891] vmx_asm_vmexit_handler+0x41/0xc0

The crash here is because I haven't successfully forced all the shared pages in the host p2m to become unshared before copying, which is the intended behaviour. I think I know how that has happened and how to fix it, but what you're trying to do won't work by design. By the time a copy from host p2m to altp2m occurs, the sharing is supposed to be broken.

You're coming up with some ways of attempting to use altp2m that we hadn't thought of. That's a good thing, and just what we want, but there are limits to what we can support without more far-reaching changes to existing parts of Xen. This isn't going to be do-able for 4.6.

Ed
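For readers unfamiliar with the first line of the log, the numbers 278 and 239 can be read as lock levels. The following is a minimal sketch of a lock-ordering check of that general kind, assuming (this is not confirmed by the thread) that every lock has a fixed level, that locks must be taken in non-decreasing level order, and that the first number is the level already held while the second is the level being requested. All names are hypothetical, and the real check calls BUG() instead of returning.

```c
#include <stdbool.h>
#include <stdio.h>

/* Level of the most recently taken lock. A hypervisor would keep this
 * per CPU; a single global is enough for the illustration. */
static int current_lock_level;

/* Returns true if a lock at `level` may be taken now. */
static bool check_lock_level(int level)
{
    if (current_lock_level > level) {
        printf("mm locking order violation: %d > %d\n",
               current_lock_level, level);
        return false;           /* the real code would BUG() here */
    }
    return true;
}

/* Take a lock at `level`, recording it as the new current level. */
static bool take_lock(int level)
{
    if (!check_lock_level(level))
        return false;
    current_lock_level = level;
    return true;
}
```

Under that reading, taking a level-239 lock and then a level-278 lock is fine, whereas requesting a level-239 lock while a level-278 lock is held trips the check, which matches the shape of the message in the log.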
Re: [Xen-devel] [PATCH v2 00/12] Alternate p2m: support multiple copies of host p2m
On Wed, Jun 24, 2015 at 6:02 PM, Ed White edmund.h.wh...@intel.com wrote: On 06/24/2015 02:34 PM, Lengyel, Tamas wrote: Hi Ed, I tried the system using memsharing and I collected the following crash log. In this test I ran memsharing on all pages of the domain before activating altp2m and creating the view. Afterwards I used my updated xen-access to create a copy of this p2m with only R/X permissions. The idea would be that the altp2m view remains completely shared, while the hostp2m would be able to do its CoW propagation as the domain is executing. (XEN) mm locking order violation: 278 239 (XEN) Xen BUG at mm-locks.h:68 (XEN) [ Xen-4.6-unstable x86_64 debug=y Tainted:C ] (XEN) CPU:2 (XEN) RIP:e008:[82d0801f8768] p2m_altp2m_propagate_change+0x85/0x4a9 (XEN) RFLAGS: 00010282 CONTEXT: hypervisor (d6v0) (XEN) rax: rbx: rcx: (XEN) rdx: 8302163a8000 rsi: 000a rdi: 82d0802a069c (XEN) rbp: 8302163afa68 rsp: 8302163af9e8 r8: 83021c00 (XEN) r9: 0003 r10: 00ef r11: 0003 (XEN) r12: 83010cc51820 r13: r14: 830158d9 (XEN) r15: 00025697 cr0: 80050033 cr4: 001526f0 (XEN) cr3: dbba3000 cr2: 778c9714 (XEN) ds: es: fs: gs: ss: cs: e008 (XEN) Xen stack trace from rsp=8302163af9e8: (XEN)8302163af9f8 803180f8 000c 82d0801892ee (XEN)82d0801fb4d1 83010cc51de0 0008ff49 82d08012f86a (XEN)83010cc51820 83010cc51820 (XEN)83010cc51820 8300dbb334b8 8302163afa00 (XEN)8302163afb18 82d0801fd549 00050009 83020001 (XEN)0001 830158d9 0002 0008ff49 (XEN)00025697 000c 8302163afae8 80c08ff49175 (XEN)80c0d0a97175 01ff83010cc51820 0097 8300dbb33000 (XEN)8302163afb78 0008ff49 0001 (XEN)00025697 83010cc51820 8302163afb38 82d0801fd644 (XEN) 000d0a97 8302163afb98 82d0801f23c5 (XEN)830158d9 0cc51820 830158d9 000c (XEN)0008ff49 83010cc51820 00025697 000d0a97 (XEN)0008ff49 830158d9 8302163afbd8 82d0801f45c8 (XEN)83010cc51820 000c 83008fd41170 0008ff49 (XEN)00025697 82e001a152e0 8302163afc58 82d080205b51 (XEN)0009 0008ff49 8300d0a97000 83008fd41160 (XEN)82e001a152f0 82e0011fe920 83010cc51820 000c (XEN)00025697 
0003 83010cc51820 8302163afd34 (XEN)00025697 8302163afca8 82d0801f1f7d (XEN) Xen call trace: (XEN)[82d0801f8768] p2m_altp2m_propagate_change+0x85/0x4a9 (XEN)[82d0801fd549] ept_set_entry_sve+0x5fa/0x6e6 (XEN)[82d0801fd644] ept_set_entry+0xf/0x11 (XEN)[82d0801f23c5] p2m_set_entry+0xd4/0x112 (XEN)[82d0801f45c8] set_shared_p2m_entry+0x2d0/0x39b (XEN)[82d080205b51] __mem_sharing_unshare_page+0x83f/0xbd6 (XEN)[82d0801f1f7d] __get_gfn_type_access+0x224/0x2b0 (XEN)[82d0801c6df5] hvm_hap_nested_page_fault+0x21f/0x795 (XEN)[82d0801e86ae] vmx_vmexit_handler+0x1764/0x1af3 (XEN)[82d0801ee891] vmx_asm_vmexit_handler+0x41/0xc0 The crash here is because I haven't successfully forced all the shared pages in the host p2m to become unshared before copying, which is the intended behaviour. I think I know how that has happened and how to fix it, but what you're trying to do won't work by design. By the time a copy from host p2m to altp2m occurs, the sharing is supposed to be broken. Hm. If the sharing gets broken before the hostp2m-altp2m copy, maybe doing sharing after the view has been created is a better route? I guess the sharing code would need to be adapted to check if altp2m is enabled for that to work.. You're coming up with some ways of attempting to use altp2m that we hadn't thought of. That's a good thing, and just what we want, but there are limits to what we can support without more far-reaching changes to existing parts of Xen. This isn't going to be do-able for 4.6. My main concern is just getting it to work, hitting 4.6 is not a priority. I understand that my stuff is highly experimental ;) While the gfn remapping feature is intriguing, in my setup I already have a copy of the page I would want to present during a singlestep-altp2mswitch - in the origin domains memory. AFAIU the gfn
Re: [Xen-devel] Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions))
[Moving most people to Bcc, as this is indeed unrelated to the original topic]

On Wed, 2015-06-24 at 13:41 +0100, Jan Beulich wrote:
On 24.06.15 at 14:29, dario.faggi...@citrix.com wrote:
On Wed, 2015-06-24 at 10:38 +0100, Ian Campbell wrote:

The memory info

Jun 23 15:56:27.749008 (XEN) Memory location of each domain:
Jun 23 15:56:27.756965 (XEN) Domain 0 (total: 131072):
Jun 23 15:56:27.756983 (XEN) Node 0: 126905
Jun 23 15:56:27.756998 (XEN) Node 1: 0
Jun 23 15:56:27.764952 (XEN) Node 2: 4167
Jun 23 15:56:27.764969 (XEN) Node 3: 0

suggests at least a small amount of cross-node memory allocation (16M out of dom0's 512M total). That's probably small enough to be OK.

Yeah, that is in line with what you usually get with dom0_nodes. Most of the memory, as you noted, comes from the proper node. We're just not (yet?) at the point where _all_ of it can come from there.

Actually as long as there is enough memory on the requested node (minus any amount set aside for the DMA pool), this shouldn't happen (and I had seen this to be clean in my own testing).

ISTR some allocation not being 'converted'. Perhaps I'm misremembering.

There being 8Gb per node, I see no immediate reason why memory from node 2 would be handed out. Still I wouldn't suspect this to matter here.

On my 2 nodes test box with the following configuration:

(XEN) SRAT: Node 1 PXM 1 0-dc00
(XEN) SRAT: Node 1 PXM 1 1-1a400
(XEN) SRAT: Node 0 PXM 0 1a400-32400

with 'dom0_nodes=0', I see this:

(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 131072):
(XEN) Node 0: 114664
(XEN) Node 1: 16408

while with 'dom0_nodes=1', this:

(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 131072):
(XEN) Node 0: 7749
(XEN) Node 1: 123323

Dario

--
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
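The "16M out of 512M" figure quoted above follows from the page counts in the log if the counts are in 4 KiB pages, which is an assumption here (it is the usual x86 page size, but the log itself does not state the unit). A small sketch of the arithmetic, with a hypothetical helper name:

```c
#include <stdint.h>

/* Assumed page size: 4 KiB. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Convert a page count to whole MiB, rounding down. */
static uint64_t pages_to_mib(uint64_t pages)
{
    return pages * SKETCH_PAGE_SIZE / (1024ULL * 1024ULL);
}
```

With that assumption, the dom0 total of 131072 pages is 512 MiB, the 4167 pages on node 2 come to about 16 MiB, and the 126905 pages on node 0 to about 495 MiB.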
Re: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op
On 24.06.15 at 15:34, paul.durr...@citrix.com wrote:
-Original Message-
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: 24 June 2015 14:25
To: Paul Durrant
Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
Subject: RE: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op

On 24.06.15 at 15:14, paul.durr...@citrix.com wrote:
From: xen-devel-boun...@lists.xen.org [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Jan Beulich
Sent: 24 June 2015 14:08
On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:

--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -35,7 +35,9 @@ typedef int (*hvm_mmio_write_t)(struct vcpu *v,
                                 unsigned long addr,
                                 unsigned long length,
                                 unsigned long val);
-typedef int (*hvm_mmio_check_t)(struct vcpu *v, unsigned long addr);
+typedef int (*hvm_mmio_check_t)(struct vcpu *v,
+                                unsigned long addr,
+                                unsigned long length);

I don't think this really needs to be long?

For consistency with the mmio read and write function types I went with 'long'. Is there any harm in that?

Generally generates worse code (due to the need for the REX64 prefix on all involved instructions). Perhaps the other ones don't need sizes/lengths passed as longs either?

I'm happy to do it that way round if you don't mind the extra diffs. I'll do it as a separate patch just before this one, to ease review.

Thanks. There's actually a good example of why they shouldn't be unsigned long in patch 9 I just looked at:

+static int stdvga_mem_write(struct vcpu *v, unsigned long addr,
+                            unsigned long size, unsigned long data)
{
+    ioreq_t p = { .type = IOREQ_TYPE_COPY,
+                  .addr = addr,
+                  .size = size,

This clearly would truncate size if it ever exceeded a uint32_t (and hence an unsigned int on x86).

Jan
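The truncation Jan points out can be reproduced in isolation. The sketch below uses a hypothetical one-field structure standing in for ioreq_t (only the 32-bit width of `size` is taken from the mail) and a hypothetical helper that stores an `unsigned long` into it.

```c
#include <stdint.h>

/* Stand-in for the relevant part of ioreq_t: a 32-bit size field. */
struct ioreq_sketch {
    uint32_t size;
};

/*
 * Store an unsigned long size into the 32-bit field and return what was
 * actually stored. Where unsigned long is 64 bits wide, any bits above
 * bit 31 are silently dropped by the implicit conversion.
 */
static uint32_t store_size(unsigned long size)
{
    struct ioreq_sketch p = { .size = size };
    return p.size;
}
```

On an LP64 target a size of 0x100000004 would be stored as 4, which is why passing the size as an unsigned int in the first place makes the limit explicit in the function type.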
Re: [Xen-devel] [PATCH v2 00/12] Alternate p2m: support multiple copies of host p2m
On 22/06/15 19:56, Ed White wrote:
This set of patches adds support to hvm domains for EPTP switching by creating multiple copies of the host p2m (currently limited to 10 copies).

The primary use of this capability is expected to be in scenarios where access to memory needs to be monitored and/or restricted below the level at which the guest OS page tables operate. Two examples that were discussed at the 2014 Xen developer summit are:

VM introspection:
http://www.slideshare.net/xen_com_mgr/zero-footprint-guest-memory-introspection-from-xen

Secure inter-VM communication:
http://www.slideshare.net/xen_com_mgr/nakajima-nvf

A more detailed design specification can be found at:
http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01319.html

Each p2m copy is populated lazily on EPT violations. Permissions for pages in alternate p2m's can be changed in a similar way to the existing memory access interface, and gfn->mfn mappings can be changed.

All this is done through extra HVMOP types.

The cross-domain HVMOP code has been compile-tested only. Also, the cross-domain code is hypervisor-only, the toolstack has not been modified.

The intra-domain code has been tested. Violation notifications can only be received for pages that have been modified (access permissions and/or gfn->mfn mapping) intra-domain, and only on VCPU's that have enabled notification.

VMFUNC and #VE will both be emulated on hardware without native support.

This code is not compatible with nested hvm functionality and will refuse to work with nested hvm active. It is also not compatible with migration. It should be considered experimental.

Overall, this patch series is looking very good, and it would seem that 3rd party testing agrees!

~Andrew
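The statement that each p2m copy is populated lazily on EPT violations can be illustrated with a toy model. This is only a sketch under heavy simplification: a p2m is modelled as a flat array indexed by gfn, permissions and superpages are ignored, and every name below is hypothetical rather than taken from the series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_GFNS     8
#define SKETCH_INVALID_MFN UINT64_MAX

/* Toy p2m: one mfn per gfn, SKETCH_INVALID_MFN meaning "not mapped". */
struct p2m_sketch {
    uint64_t mfn[SKETCH_NR_GFNS];
};

/* A fresh table starts out empty. */
static void p2m_sketch_init(struct p2m_sketch *p2m)
{
    for (size_t i = 0; i < SKETCH_NR_GFNS; i++)
        p2m->mfn[i] = SKETCH_INVALID_MFN;
}

/*
 * Simulated EPT-violation hook for an alternate view. Returns true if
 * the missing entry was copied from the host p2m (the access can simply
 * be retried), false if the fault cannot be resolved this way (out of
 * range, not mapped in the host, or already present in the view).
 */
static bool altp2m_sketch_lazy_copy(const struct p2m_sketch *host,
                                    struct p2m_sketch *alt, size_t gfn)
{
    if (gfn >= SKETCH_NR_GFNS || host->mfn[gfn] == SKETCH_INVALID_MFN)
        return false;
    if (alt->mfn[gfn] != SKETCH_INVALID_MFN)
        return false;
    alt->mfn[gfn] = host->mfn[gfn];
    return true;
}
```

In this model a second fault on an already populated gfn is reported as unresolved, which loosely corresponds to a violation that would need to be handled as an access-permission event rather than by copying.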
Re: [Xen-devel] [PATCH v2 04/12] x86/altp2m: basic data structures and support routines.
On 22.06.15 at 20:56, edmund.h.wh...@intel.com wrote:
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -210,6 +210,14 @@ struct hvm_function_table {
                                 uint32_t *ecx, uint32_t *edx);

     void (*enable_msr_exit_interception)(struct domain *d);
+
+    /* Alternate p2m */
+    int (*ahvm_vcpu_initialise)(struct vcpu *v);
+    void (*ahvm_vcpu_destroy)(struct vcpu *v);
+    int (*ahvm_vcpu_reset)(struct vcpu *v);
+    void (*ahvm_vcpu_update_eptp)(struct vcpu *v);
+    void (*ahvm_vcpu_update_vmfunc_ve)(struct vcpu *v);
+    bool_t (*ahvm_vcpu_emulate_ve)(struct vcpu *v);
 };

These ahvm_ prefixes are pretty strange - this isn't about alternate HVM after all.

Jan
[Xen-devel] [PATCH RFC 8/9] libxl: introduce specific error codes in libxl_device_cdrom_insert
Signed-off-by: Rob Hoes rob.h...@citrix.com
---
 tools/libxl/libxl.c | 12 ++--
 tools/libxl/libxl_device.c | 6 +++---
 tools/libxl/libxl_qmp.c | 4 +++-
 tools/libxl/libxl_types.idl | 22 +-
 4 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 2f56c6e..f41f291 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2848,25 +2848,25 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk,
     libxl_domain_type type = libxl__domain_type(gc, domid);
     if (type == LIBXL_DOMAIN_TYPE_INVALID) {
-        rc = ERROR_FAIL;
+        rc = ERROR_INVAL_DOMAIN_TYPE;
         goto out;
     }
     if (type != LIBXL_DOMAIN_TYPE_HVM) {
         LOG(ERROR, "cdrom-insert requires an HVM domain");
-        rc = ERROR_INVAL;
+        rc = ERROR_NOHVM;
         goto out;
     }

     if (libxl_get_stubdom_id(ctx, domid) != 0) {
         LOG(ERROR, "cdrom-insert doesn't work for stub domains");
-        rc = ERROR_INVAL;
+        rc = ERROR_STUBDOM;
         goto out;
     }

     dm_ver = libxl__device_model_version_running(gc, domid);
     if (dm_ver == -1) {
         LOG(ERROR, "cannot determine device model version");
-        rc = ERROR_FAIL;
+        rc = ERROR_DM_VERSION_UNDETERMINED;
         goto out;
     }

@@ -2881,7 +2881,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk,
     }
     if (i == num) {
         LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Virtual device not found");
-        rc = ERROR_FAIL;
+        rc = ERROR_DISK_VDEV_NOT_FOUND;
         goto out;
     }

@@ -2941,7 +2941,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk,
     {
         LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Internal error: %s does not exist",
                    libxl__sprintf(gc, "%s/frontend", path));
-        rc = ERROR_FAIL;
+        rc = ERROR_INTERNAL;
         goto out;
     }

diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 56c6e2e..1c5f659 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -271,7 +271,7 @@ int libxl__device_disk_set_backend(libxl__gc *gc, libxl_device_disk *disk) {
     if (disk->format == LIBXL_DISK_FORMAT_EMPTY) {
         if (!disk->is_cdrom) {
             LOG(ERROR, "Disk vdev=%s is empty but not cdrom", disk->vdev);
-            return ERROR_INVAL;
+            return ERROR_INVAL_DISK_FORMAT;
         }
         memset(&a.stab, 0, sizeof(a.stab));
     } else if ((disk->backend == LIBXL_DISK_BACKEND_UNKNOWN ||
@@ -281,7 +281,7 @@ int libxl__device_disk_set_backend(libxl__gc *gc, libxl_device_disk *disk) {
         if (stat(disk->pdev_path, &a.stab)) {
             LOGE(ERROR, "Disk vdev=%s failed to stat: %s",
                  disk->vdev, disk->pdev_path);
-            return ERROR_INVAL;
+            return ERROR_DISK_PDEV_NOT_FOUND;
         }
     }

@@ -299,7 +299,7 @@ int libxl__device_disk_set_backend(libxl__gc *gc, libxl_device_disk *disk) {
     }
     if (!ok) {
         LOG(ERROR, "no suitable backend for disk %s", disk->vdev);
-        return ERROR_INVAL;
+        return ERROR_DISK_BACKEND_UNDETERMINED;
     }
     disk->backend = ok;
     return 0;
diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 9aa7e2e..c687e86 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -817,9 +817,11 @@ static int qmp_run_command(libxl__gc *gc, int domid,

     qmp = libxl__qmp_initialize(gc, domid);
     if (!qmp)
-        return ERROR_FAIL;
+        return ERROR_QMP_INIT;

     rc = qmp_synchronous_send(qmp, cmd, args, callback, opaque, qmp->timeout);
+    if (rc < 0)
+        rc = ERROR_QMP_SEND;

     libxl__qmp_close(qmp);
     return rc;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 15e4af2..88262ca 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -107,6 +107,10 @@ libxl_error = Enumeration("error", [
     # Requested domain was not found
     (-21, "DOMAIN_NOTFOUND"),

+    # Internal error; not actionable by the caller other than by doing something like
+    # a retry/reboot (perhaps a libxl bug)
+    (ENUM_PREV, "INTERNAL"),
+
     # Xenstore errors
     (ENUM_PREV, "XS_CONNECT"),
     (ENUM_PREV, "XS_READ"),
@@ -132,12 +136,28 @@ libxl_error = Enumeration("error", [
     # Disk parameters invalid
     (ENUM_PREV, "INVAL_DISK_VDEV"),
     (ENUM_PREV, "INVAL_DISK_BACKEND"),
+    (ENUM_PREV, "INVAL_DISK_FORMAT"),

     # Disk parameters could not be determined
     (ENUM_PREV, "DISK_VDEV_UNDETERMINED"),
+    (ENUM_PREV, "DISK_BACKEND_UNDETERMINED"),

-    # Physical disk device could not be found
+    # Physical/virtual disk device could not be found
     (ENUM_PREV, "DISK_PDEV_NOT_FOUND"),
+    (ENUM_PREV, "DISK_VDEV_NOT_FOUND"),
+
+    # Operation requires an HVM domain
+    (ENUM_PREV, "NOHVM"),
+
+    # Operation is not compatible with a stub
[Xen-devel] [PATCH RFC 6/9] libxl: introduce specific error code for libxl__wait_device_connection
Signed-off-by: Rob Hoes rob.h...@citrix.com
---
 tools/libxl/libxl_device.c  | 1 +
 tools/libxl/libxl_types.idl | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 93bb41e..56c6e2e 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -768,6 +768,7 @@ void libxl__wait_device_connection(libxl__egc *egc, libxl__ao_device *aodev)
                                  LIBXL_INIT_TIMEOUT * 1000);
     if (rc) {
         LOG(ERROR, "unable to initialize device %s", be_path);
+        rc = ERROR_DEVICE_WAIT_INIT;
         goto out;
     }
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index b905353..3c44b41 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -122,6 +122,9 @@ libxl_error = Enumeration("error", [
     (ENUM_PREV, "JSON_GET_CONFIG"),
     (ENUM_PREV, "JSON_SET_CONFIG"),
     (ENUM_PREV, "JSON_PARSE_CONFIG"),
+
+    # Unable to initialise device connection watch
+    (ENUM_PREV, "DEVICE_WAIT_INIT"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
-- 
2.4.1
[Xen-devel] [PATCH RFC 1/9] libxl idl: add comments to error enum
Signed-off-by: Rob Hoes rob.h...@citrix.com
---
 tools/libxl/libxl_types.idl | 41 +
 1 file changed, 41 insertions(+)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 65d479f..6dc18fa 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -44,26 +44,67 @@ MemKB = UInt(64, init_val = LIBXL_MEMKB_DEFAULT, json_gen_fn = libxl__uint64_
 #
 
 libxl_error = Enumeration("error", [
+    # Generic failure; code should be avoided (often seen as rc = -1)
     (-1, "NONSPECIFIC"),
+
+    # Libxl version mismatch
     (-2, "VERSION"),
+
+    # General failure; code should be avoided
     (-3, "FAIL"),
+
+    # Not implemented
     (-4, "NI"),
+
+    # Out of memory (malloc or similar failed)
     (-5, "NOMEM"),
+
+    # General failure; code should be avoided
     (-6, "INVAL"),
+
+    # General failure; code should be avoided (used only in xl)
     (-7, "BADFAIL"),
+
+    # Domain responded to suspend request
     (-8, "GUEST_TIMEDOUT"),
+
+    # A xenstore watch has timed out
     (-9, "TIMEDOUT"),
+
+    # The operation requires PV control, but the domain does not offer it
     (-10, "NOPARAVIRT"),
+
+    # Event has not happened (libxl_event_check)
     (-11, "NOT_READY"),
+
+    # osevent registration or modification hook failed
     (-12, "OSEVENT_REG_FAIL"),
+
+    # fd buffer full (libxl_osevent_beforepoll)
     (-13, "BUFFERFULL"),
+
+    # Process is not a child of the current libxl instance (libxl_childproc_reaped)
     (-14, "UNKNOWN_CHILD"),
+
+    # Could not acquire lock
     (-15, "LOCK_FAIL"),
+
+    # Unable to find JSON domain config
     (-16, "JSON_CONFIG_EMPTY"),
+
+    # The requested device already exists
     (-17, "DEVICE_EXISTS"),
+
+    # Remus ops do not match device
     (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
+
+    # Remus device not supported
     (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
+
+    # vNUMA config not valid
     (-20, "VNUMA_CONFIG_INVALID"),
+
+    # Requested domain was not found
     (-21, "DOMAIN_NOTFOUND"),
     ], value_namespace = "")
 
-- 
2.4.1
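For illustration only, the comments added by this patch can be read as a lookup from return code to meaning. The following is a minimal sketch in C, assuming a hypothetical local enum `sketch_error` that mirrors a handful of the numeric values listed in the patch. It is not libxl's generated code, and `sketch_error_describe` is an invented helper name.

```c
#include <stddef.h>

/* Hypothetical mirror of a few libxl_error values from the patch;
 * the names with the sketch_ prefix do not exist in libxl. */
enum sketch_error {
    SKETCH_ERROR_NONSPECIFIC     = -1,
    SKETCH_ERROR_NOMEM           = -5,
    SKETCH_ERROR_TIMEDOUT        = -9,
    SKETCH_ERROR_LOCK_FAIL       = -15,
    SKETCH_ERROR_DOMAIN_NOTFOUND = -21
};

/* Return the comment text the patch attaches to each code. */
const char *sketch_error_describe(int rc)
{
    switch (rc) {
    case 0:
        return "success";
    case SKETCH_ERROR_NONSPECIFIC:
        return "generic failure; code should be avoided";
    case SKETCH_ERROR_NOMEM:
        return "out of memory";
    case SKETCH_ERROR_TIMEDOUT:
        return "a xenstore watch has timed out";
    case SKETCH_ERROR_LOCK_FAIL:
        return "could not acquire lock";
    case SKETCH_ERROR_DOMAIN_NOTFOUND:
        return "requested domain was not found";
    default:
        return "unknown error code";
    }
}
```

A caller could log `sketch_error_describe(rc)` next to the raw value, which is roughly the kind of use the more specific codes in the rest of this series would make possible.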
[Xen-devel] [PATCH RFC 3/9] libxl: introduce specific xenstore error codes
Signed-off-by: Rob Hoes rob.h...@citrix.com
---
 tools/libxl/libxl_types.idl | 8
 tools/libxl/libxl_xshelp.c  | 14 +++---
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 6dc18fa..e9b3477 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -106,6 +106,14 @@ libxl_error = Enumeration("error", [
 
     # Requested domain was not found
     (-21, "DOMAIN_NOTFOUND"),
+
+    # Xenstore errors
+    (ENUM_PREV, "XS_CONNECT"),
+    (ENUM_PREV, "XS_READ"),
+    (ENUM_PREV, "XS_WRITE"),
+    (ENUM_PREV, "XS_TRANS_START"),
+    (ENUM_PREV, "XS_TRANS_COMMIT"),
+    (ENUM_PREV, "XS_REMOVE"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
diff --git a/tools/libxl/libxl_xshelp.c b/tools/libxl/libxl_xshelp.c
index d7eaa66..e634ee5 100644
--- a/tools/libxl/libxl_xshelp.c
+++ b/tools/libxl/libxl_xshelp.c
@@ -174,7 +174,7 @@ int libxl__xs_read_checked(libxl__gc *gc, xs_transaction_t t,
     if (!result) {
         if (errno != ENOENT) {
             LOGE(ERROR, "xenstore read failed: `%s'", path);
-            return ERROR_FAIL;
+            return ERROR_XS_READ;
         }
     }
     *result_out = result;
@@ -187,7 +187,7 @@ int libxl__xs_write_checked(libxl__gc *gc, xs_transaction_t t,
     size_t length = strlen(string);
     if (!xs_write(CTX->xsh, t, path, string, length)) {
         LOGE(ERROR, "xenstore write failed: `%s' = `%s'", path, string);
-        return ERROR_FAIL;
+        return ERROR_XS_WRITE;
     }
     return 0;
 }
@@ -199,7 +199,7 @@ int libxl__xs_rm_checked(libxl__gc *gc, xs_transaction_t t, const char *path)
             return 0;
 
         LOGE(ERROR, "xenstore rm failed: `%s'", path);
-        return ERROR_FAIL;
+        return ERROR_XS_REMOVE;
     }
     return 0;
 }
@@ -210,7 +210,7 @@ int libxl__xs_transaction_start(libxl__gc *gc, xs_transaction_t *t)
     *t = xs_transaction_start(CTX->xsh);
     if (!*t) {
         LOGE(ERROR, "could not create xenstore transaction");
-        return ERROR_FAIL;
+        return ERROR_XS_TRANS_START;
     }
     return 0;
 }
@@ -225,7 +225,7 @@ int libxl__xs_transaction_commit(libxl__gc *gc, xs_transaction_t *t)
             return +1;
 
         LOGE(ERROR, "could not commit xenstore transaction");
-        return ERROR_FAIL;
+        return ERROR_XS_TRANS_COMMIT;
     }
 
     *t = 0;
@@ -257,7 +257,7 @@ int libxl__xs_path_cleanup(libxl__gc *gc, xs_transaction_t t,
     if (!xs_rm(CTX->xsh, t, path)) {
         if (errno != ENOENT)
             LOGE(DEBUG, "unable to remove path %s", path);
-        rc = ERROR_FAIL;
+        rc = ERROR_XS_REMOVE;
         goto out;
     }
 
@@ -274,7 +274,7 @@ int libxl__xs_path_cleanup(libxl__gc *gc, xs_transaction_t t,
         if (!xs_rm(CTX->xsh, t, path)) {
             if (errno != ENOENT)
                 LOGE(DEBUG, "unable to remove path %s", path);
-            rc = ERROR_FAIL;
+            rc = ERROR_XS_REMOVE;
             goto out;
         }
     }
-- 
2.4.1
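The read path in this patch treats a missing xenstore node (ENOENT) as a non-error and only maps other failures to the new specific code. A minimal, self-contained sketch of that pattern follows. Everything prefixed `sketch_` is hypothetical: the reader callback stands in for the real xenstore read, and the numeric value of `SKETCH_ERROR_XS_READ` is arbitrary, since the real value comes from the generated libxl headers.

```c
#include <errno.h>
#include <stddef.h>

/* Arbitrary stand-in value; not the real ERROR_XS_READ. */
#define SKETCH_ERROR_XS_READ (-1000)

/* Hypothetical reader: returns NULL and sets errno on failure. */
typedef const char *(*sketch_read_fn)(const char *path);

/* Mirrors the shape of the checked read: ENOENT yields rc 0 with a
 * NULL result, any other failure yields the specific error code. */
int sketch_read_checked(sketch_read_fn read_fn, const char *path,
                        const char **result_out)
{
    const char *result;

    errno = 0;
    result = read_fn(path);
    if (!result && errno != ENOENT)
        return SKETCH_ERROR_XS_READ;   /* genuine read failure */

    *result_out = result;              /* NULL means the path is absent */
    return 0;
}

/* Fake readers used only to exercise the three outcomes. */
const char *sketch_reader_ok(const char *path)
{
    (void)path;
    return "value";
}

const char *sketch_reader_missing(const char *path)
{
    (void)path;
    errno = ENOENT;
    return NULL;
}

const char *sketch_reader_broken(const char *path)
{
    (void)path;
    errno = EIO;
    return NULL;
}
```

With a distinct code per operation, a caller can tell a failed read from a failed write or commit without parsing log output, which appears to be the motivation for this part of the series.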
Re: [Xen-devel] [PATCH v2 10/12] x86/altp2m: define and implement alternate p2m HVMOP types.
On 22/06/15 19:56, Ed White wrote: Signed-off-by: Ed White edmund.h.wh...@intel.com --- xen/arch/x86/hvm/hvm.c | 216 xen/include/public/hvm/hvm_op.h | 69 + 2 files changed, 285 insertions(+) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index b758ee1..b3e74ce 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -6424,6 +6424,222 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg) break; } +case HVMOP_altp2m_get_domain_state: +{ +struct xen_hvm_altp2m_domain_state a; +struct domain *d; + +if ( copy_from_guest(a, arg, 1) ) +return -EFAULT; + +d = rcu_lock_domain_by_any_id(a.domid); +if ( d == NULL ) +return -ESRCH; + +rc = -EINVAL; +if ( !is_hvm_domain(d) || !hvm_altp2m_supported() ) +goto param_fail9; + +a.state = altp2mhvm_active(d); +rc = copy_to_guest(arg, a, 1) ? -EFAULT : 0; + +param_fail9: +rcu_unlock_domain(d); +break; +} + +case HVMOP_altp2m_set_domain_state: +{ +struct xen_hvm_altp2m_domain_state a; +struct domain *d; +struct vcpu *v; +bool_t ostate; + +if ( copy_from_guest(a, arg, 1) ) +return -EFAULT; + +d = rcu_lock_domain_by_any_id(a.domid); +if ( d == NULL ) +return -ESRCH; + +rc = -EINVAL; +if ( !is_hvm_domain(d) || !hvm_altp2m_supported() || + nestedhvm_enabled(d) ) +goto param_fail10; + +ostate = d-arch.altp2m_active; +d-arch.altp2m_active = !!a.state; + +/* If the alternate p2m state has changed, handle appropriately */ +if ( d-arch.altp2m_active != ostate ) +{ +if ( !ostate !p2m_init_altp2m_by_id(d, 0) ) +goto param_fail10; Indentation. + +for_each_vcpu( d, v ) +if (!ostate) +altp2mhvm_vcpu_initialise(v); +else +altp2mhvm_vcpu_destroy(v); Although strictly speaking this is (almost) ok by the style guidelines, it would probably be better to have braces for the for_each_vcpu() loop. Also, spaces for the brackets for !ostate. 
+ +if ( ostate ) +p2m_flush_altp2m(d); +} + +rc = 0; + +param_fail10: +rcu_unlock_domain(d); +break; +} + +case HVMOP_altp2m_vcpu_enable_notify: +{ +struct domain *curr_d = current-domain; +struct vcpu *curr = current; +struct xen_hvm_altp2m_vcpu_enable_notify a; + +if ( copy_from_guest(a, arg, 1) ) +return -EFAULT; + +if ( !is_hvm_domain(curr_d) || !hvm_altp2m_supported() || + !curr_d-arch.altp2m_active || vcpu_altp2mhvm(curr).veinfo_gfn ) +return -EINVAL; + +vcpu_altp2mhvm(curr).veinfo_gfn = a.pfn; +ahvm_vcpu_update_vmfunc_ve(curr); You need a gfn bounds check against the host p2m here. +rc = 0; + +break; +} + +case HVMOP_altp2m_create_p2m: +{ +struct xen_hvm_altp2m_view a; +struct domain *d; + +if ( copy_from_guest(a, arg, 1) ) +return -EFAULT; + +d = rcu_lock_domain_by_any_id(a.domid); +if ( d == NULL ) +return -ESRCH; + +rc = -EINVAL; +if ( !is_hvm_domain(d) || !hvm_altp2m_supported() || + !d-arch.altp2m_active ) +goto param_fail11; + +if ( !p2m_init_next_altp2m(d, a.view) ) +goto param_fail11; + +rc = copy_to_guest(arg, a, 1) ? -EFAULT : 0; + +param_fail11: +rcu_unlock_domain(d); +break; +} + +case HVMOP_altp2m_destroy_p2m: +{ +struct xen_hvm_altp2m_view a; +struct domain *d; + +if ( copy_from_guest(a, arg, 1) ) +return -EFAULT; + +d = rcu_lock_domain_by_any_id(a.domid); +if ( d == NULL ) +return -ESRCH; + +rc = -EINVAL; +if ( !is_hvm_domain(d) || !hvm_altp2m_supported() || + !d-arch.altp2m_active ) +goto param_fail12; + +if ( p2m_destroy_altp2m_by_id(d, a.view) ) +rc = 0; + +param_fail12: +rcu_unlock_domain(d); +break; +} + +case HVMOP_altp2m_switch_p2m: +{ +struct xen_hvm_altp2m_view a; +struct domain *d; + +if ( copy_from_guest(a, arg, 1) ) +return -EFAULT; + +d = rcu_lock_domain_by_any_id(a.domid); +if ( d == NULL ) +return -ESRCH; + +rc = -EINVAL; +if ( !is_hvm_domain(d) || !hvm_altp2m_supported() || + !d-arch.altp2m_active ) +
Re: [Xen-devel] [PATCH v4 09/17] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
> @@ -424,8 +427,17 @@ static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
>      }
>  }
>
> -static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
> +static int stdvga_mem_write(struct vcpu *v, unsigned long addr,
> +                            unsigned long size, unsigned long data)
>  {
> +    ioreq_t p = { .type = IOREQ_TYPE_COPY,
> +          .addr = addr,
> +          .size = size,
> +          .count = 1,
> +          .dir = IOREQ_WRITE,
> +          .data = data,

Indentation.

> -    if ( s->stdvga && s->cache )
> -    {
> -        switch ( p->type )
> -        {
> -        case IOREQ_TYPE_COPY:
> -            buf = mmio_move(s, p);
> -            if ( !buf )
> -                s->cache = 0;
> -            break;
> -        default:
> -            gdprintk(XENLOG_WARNING, "unsupported mmio request type:%d "
> -                     "addr:0x%04x data:0x%04x size:%d count:%d state:%d "
> -                     "isptr:%d dir:%d df:%d\n",
> -                     p->type, (int)p->addr, (int)p->data, (int)p->size,
> -                     (int)p->count, p->state,
> -                     p->data_is_ptr, p->dir, p->df);
> -            s->cache = 0;
> -        }

I can't see where these cases of clearing s->cache move to.

> -    }
> -    else
> -    {
> -        buf = (p->dir == IOREQ_WRITE);
> -    }
> -
> -    rc = (buf && hvm_buffered_io_send(p));
> +    rc = s->stdvga && s->cache &&
> +        (addr >= VGA_MEM_BASE) &&
> +        ((addr + length) < (VGA_MEM_BASE + VGA_MEM_SIZE));

Note how the old code also calls hvm_buffered_io_send() when
!s->stdvga || !s->cache but p->dir == IOREQ_WRITE. Do you really
mean to drop that?

Jan
Re: [Xen-devel] [PATCH v4 09/17] x86/hvm: unify stdvga mmio intercept with standard mmio intercept
-----Original Message-----
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: 24 June 2015 14:59
To: Paul Durrant
Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
Subject: Re: [PATCH v4 09/17] x86/hvm: unify stdvga mmio intercept with standard mmio intercept

> On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
> > @@ -424,8 +427,17 @@ static void stdvga_mem_writeb(uint64_t addr, uint32_t val)
> >      }
> >  }
> >
> > -static void stdvga_mem_write(uint64_t addr, uint64_t data, uint64_t size)
> > +static int stdvga_mem_write(struct vcpu *v, unsigned long addr,
> > +                            unsigned long size, unsigned long data)
> >  {
> > +    ioreq_t p = { .type = IOREQ_TYPE_COPY,
> > +          .addr = addr,
> > +          .size = size,
> > +          .count = 1,
> > +          .dir = IOREQ_WRITE,
> > +          .data = data,
>
> Indentation.

Damn emacs.

> > -    if ( s->stdvga && s->cache )
> > -    {
> > -        switch ( p->type )
> > -        {
> > -        case IOREQ_TYPE_COPY:
> > -            buf = mmio_move(s, p);
> > -            if ( !buf )
> > -                s->cache = 0;
> > -            break;
> > -        default:
> > -            gdprintk(XENLOG_WARNING, "unsupported mmio request type:%d "
> > -                     "addr:0x%04x data:0x%04x size:%d count:%d state:%d "
> > -                     "isptr:%d dir:%d df:%d\n",
> > -                     p->type, (int)p->addr, (int)p->data, (int)p->size,
> > -                     (int)p->count, p->state,
> > -                     p->data_is_ptr, p->dir, p->df);
> > -            s->cache = 0;
> > -        }
>
> I can't see where these cases of clearing s->cache move to.

There's only one case AFAICT, which is if the domain goes through a
save/restore then s->cache is cleared.

> > -    }
> > -    else
> > -    {
> > -        buf = (p->dir == IOREQ_WRITE);
> > -    }
> > -
> > -    rc = (buf && hvm_buffered_io_send(p));
> > +    rc = s->stdvga && s->cache &&
> > +        (addr >= VGA_MEM_BASE) &&
> > +        ((addr + length) < (VGA_MEM_BASE + VGA_MEM_SIZE));
>
> Note how the old code also calls hvm_buffered_io_send() when
> !s->stdvga || !s->cache but p->dir == IOREQ_WRITE. Do you really
> mean to drop that?

Hmmm. I'm not sure why it would have done that. It seems wrong. I'll check.

  Paul

> Jan
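As an aside on the address test in the quoted hunk: it can be read as a simple window check. Below is a minimal sketch in C, assuming the stripped comparison operators were `>=` and `<` and the joins were `&&`, and assuming a window base of 0xa0000 with size 0x20000. Both constants and the `sketch_` names are assumptions made for illustration, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define SKETCH_VGA_MEM_BASE 0xa0000u
#define SKETCH_VGA_MEM_SIZE 0x20000u

/* True when [addr, addr + length) passes the check as written in the
 * quoted hunk: start at or above the base, end strictly below the
 * end of the window. */
bool sketch_vga_range_ok(uint64_t addr, uint64_t length)
{
    return addr >= SKETCH_VGA_MEM_BASE &&
           (addr + length) < (SKETCH_VGA_MEM_BASE + SKETCH_VGA_MEM_SIZE);
}
```

Under these assumptions the strict upper comparison also rejects an access whose last byte is the final byte of the window (for example one byte at 0xbffff). Whether that is intended is a question for the patch author, not something this sketch settles.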
Re: [Xen-devel] [linux-arm-xen test] 58849: regressions - FAIL
On Wed, 24 Jun 2015, Ian Campbell wrote: On Wed, 2015-06-24 at 06:03 +, osstest service user wrote: flight 58849 linux-arm-xen real [real] http://logs.test-lab.xenproject.org/osstest/logs/58849/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl-cubietruck 11 guest-start fail REGR. vs. 58830 This was: http://logs.test-lab.xenproject.org/osstest/logs/58849/test-armhf-armhf-xl-cubietruck/cubietruck-braque---var-log-kern.log Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637687] [ cut here ] Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637756] kernel BUG at drivers/xen/grant-table.c:923! Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637784] Internal error: Oops - BUG: 0 [#1] SMP ARM Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637810] Modules linked in: xen_gntalloc bridge stp ipv6 llc brcmfmac brcmutil cfg80211 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637899] CPU: 0 PID: 16206 Comm: vif1.0-q0-guest Not tainted 3.16.7-ckt12+ #1 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637936] task: c12fc480 ti: d2d3c000 task.ti: d2d3c000 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.637977] PC is at gnttab_batch_copy+0xd4/0xe0 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638004] LR is at gnttab_batch_copy+0x1c/0xe0 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638030] pc : [c04abf7c] lr : [c04abec4]psr: a013 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638030] sp : d2d3deb0 ip : deadbeef fp : d2d3df3c Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638091] r10: 0001 r9 : r8 : 0008 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638124] r7 : 0001 r6 : 0001 r5 : r4 : e1e38d30 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638159] r3 : 0001 r2 : deadbeef r1 : deadbeef r0 : fff2 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638193] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638227] Control: 10c5387d Table: 7b50406a 
DAC: 0015 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638257] Process vif1.0-q0-guest (pid: 16206, stack limit = 0xd2d3c248) Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638287] Stack: (0xd2d3deb0 to 0xd2d3e000) Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638316] dea0: 0001 e1e3 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638353] dec0: 0001 c05d7c44 003e 0ec2 d2d3df3c 0001 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638391] dee0: dbbb7a80 0008 d2d3df20 e1e38cfc e1e38d30 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638430] df00: 0001 0001 e1e38d30 e1e63530 003e 0208 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638468] df20: d9f0f480 d9f0f480 0001 d2d3df2c d2d3df34 d2d3df34 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638505] df40: db34e380 e1e3 c05d776c Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638543] df60: c0264138 e1e3 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638581] df80: d2d3df80 d2d3df80 d2d3df90 d2d3df90 d2d3dfac db34e380 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638638] dfa0: c026406c c020f038 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638686] dfc0: Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638723] dfe0: 0013 Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638785] [c04abf7c] (gnttab_batch_copy) from [c05d7c44] (xenvif_kthread_guest_rx+0x4d8/0xbc0) Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638841] [c05d7c44] (xenvif_kthread_guest_rx) from [c0264138] (kthread+0xcc/0xe8) Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638887] [c0264138] (kthread) from [c020f038] (ret_from_fork+0x14/0x3c) Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638929] Code: 0ae5 eaed e8bd80f8 e7f001f2 (e7f001f2) Jun 24 04:09:13 cubietruck-braque kernel: [ 807.638978] ---[ end trace 98c74482d9a5771d ]--- Which looks familiar, although I can't seem to find it, does anyone remember it? Are we missing a backport perhaps? 
This is the 3.16.y based linux-arm-xen tree, which was recently updated from a baseline of v3.16.4-ckt7 to v3.16.7-ckt12 (flight 58830) in both cases plus xen_arch_need_swiotlb for swiotlb stuff. This here was the next flight which only added the xen: netback: read hotplug script once at start of day., which I
[Xen-devel] [xen-4.1-testing test] 58847: regressions - FAIL
flight 58847 xen-4.1-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/58847/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-pair 21 guest-migrate/src_host/dst_host fail REGR. vs. 27396 Regressions which are regarded as allowable (not blocking): test-i386-i386-pair 21 guest-migrate/src_host/dst_host fail like 27420 test-amd64-amd64-pair 21 guest-migrate/src_host/dst_host fail like 27420 Tests which did not succeed, but are not blocking: test-amd64-amd64-rumpuserxen-amd64 1 build-check(1) blocked n/a test-i386-i386-rumpuserxen-i386 1 build-check(1) blocked n/a test-amd64-i386-rumpuserxen-i386 1 build-check(1) blocked n/a build-i386-rumpuserxen5 rumpuserxen-buildfail never pass build-amd64-rumpuserxen 5 rumpuserxen-buildfail never pass build-amd64-libvirt 5 libvirt-buildfail never pass test-i386-i386-libvirt5 xen-install fail never pass test-amd64-amd64-libvirt 5 xen-install fail never pass test-amd64-i386-libvirt 5 xen-install fail never pass test-i386-i386-xl-sedf-pin 15 guest-saverestore.2 fail never pass build-i386-libvirt5 libvirt-buildfail never pass test-amd64-amd64-xl-win7-amd64 16 guest-stop fail never pass test-amd64-amd64-xl-qemut-debianhvm-amd64 20 leak-check/check fail never pass test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail never pass test-amd64-amd64-xl-qemut-winxpsp3 16 guest-stop fail never pass test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass test-amd64-i386-qemuu-freebsd10-i386 21 leak-check/check fail never pass test-amd64-i386-xl-win7-amd64 16 guest-stop fail never pass test-i386-i386-xl-qemut-winxpsp3 16 guest-stop fail never pass test-amd64-i386-xl-qemut-debianhvm-amd64 20 leak-check/check fail never pass test-amd64-i386-xend-qemut-winxpsp3 20 leak-check/checkfail never pass test-amd64-i386-xl-winxpsp3-vcpus1 16 guest-stop fail never pass test-amd64-i386-xl-qemut-winxpsp3-vcpus1 16 guest-stop fail never pass 
test-amd64-i386-xend-winxpsp3 20 leak-check/check fail never pass test-amd64-amd64-xl-winxpsp3 16 guest-stop fail never pass test-amd64-i386-qemuu-freebsd10-amd64 21 leak-check/check fail never pass test-i386-i386-xl-winxpsp3 16 guest-stop fail never pass version targeted for testing: xen 40feff8733e2ac27561a27e7c009a61ba3b320fe baseline version: xen 8995a94f8f88b174dabd1289d1d54c1dcfe7c78d People who touched revisions under test: Gonglei arei.gong...@huawei.com Ian Jackson ian.jack...@eu.citrix.com Paolo Bonzini pbonz...@redhat.com Petr Matousek pmato...@redhat.com Stefan Hajnoczi stefa...@redhat.com jobs: build-amd64 pass build-i386 pass build-amd64-libvirt fail build-i386-libvirt fail build-amd64-pvopspass build-i386-pvops pass build-amd64-rumpuserxen fail build-i386-rumpuserxen fail test-amd64-amd64-xl pass test-amd64-i386-xl pass test-i386-i386-xlpass test-amd64-i386-rhel6hvm-amd pass test-amd64-i386-qemut-rhel6hvm-amd pass test-amd64-amd64-xl-qemut-debianhvm-amd64fail test-amd64-i386-xl-qemut-debianhvm-amd64 fail test-amd64-i386-qemuu-freebsd10-amd64fail test-amd64-amd64-rumpuserxen-amd64 blocked test-amd64-amd64-xl-qemut-win7-amd64 fail test-amd64-i386-xl-qemut-win7-amd64 fail test-amd64-amd64-xl-win7-amd64 fail test-amd64-i386-xl-win7-amd64fail test-amd64-amd64-xl-credit2 pass test-i386-i386-xl-credit2pass test-amd64-i386-qemuu-freebsd10-i386 fail test-amd64-i386-rumpuserxen-i386
Re: [Xen-devel] [PATCH v2 06/12] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.
On 22.06.15 at 20:56, edmund.h.wh...@intel.com wrote:
> @@ -1826,6 +1827,20 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
>      vmx_vmcs_exit(v);
>  }
>
> +static bool_t vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
> +{
> +    bool_t rc = 0;
> +
> +    if ( !cpu_has_vmx_vmfunc && altp2mhvm_active(current->domain) &&
> +         regs->eax == 0 &&
> +         p2m_switch_vcpu_altp2m_by_id(current, (uint16_t)regs->ecx) )
> +    {
> +        regs->eip += 3;

What if the instruction has some (bogus but not invalid) opcode prefix?

> @@ -2091,6 +2108,13 @@ static void vmx_invlpg_intercept(unsigned long vaddr)
>      vpid_sync_vcpu_gva(curr, vaddr);
>  }
>
> +static int vmx_vmfunc_intercept(struct cpu_user_regs *regs)
> +{
> +    gdprintk(XENLOG_ERR, "Failed guest VMFUNC execution\n");
> +    domain_crash(current->domain);
> +    return X86EMUL_OKAY;
> +}

What is this unconditional crashing of the guest good for?

> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -3837,6 +3837,14 @@ x86_emulate(
>              goto rdtsc;
>          }
>
> +        if (modrm == 0xd4) /* vmfunc */
> +        {
> +            fail_if(ops->vmfunc == NULL);
> +            if ( (rc = ops->vmfunc(ctxt) != 0) )
> +                goto done;
> +            break;
> +        }

Together with the two preceding if()-s this is now finally the point
where switch() should be used instead.

Jan
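To make the last review comment concrete, here is a minimal sketch of dispatching on the ModRM byte of a 0F 01 opcode with a switch() rather than a chain of if()-s. The quoted hunk only shows the 0xd4 (vmfunc) test, so the other two cases are picked as examples from the architectural encodings (0F 01 DF is INVLPGA, 0F 01 F9 is RDTSCP) and are an assumption about which tests precede it. The enum and function names are invented for this sketch.

```c
/* Hypothetical classification of a few 0F 01 /modrm encodings. */
enum sketch_0f01_op {
    SKETCH_OP_UNKNOWN,
    SKETCH_OP_VMFUNC,
    SKETCH_OP_INVLPGA,
    SKETCH_OP_RDTSCP
};

/* One switch keeps the special ModRM values in a single place and
 * leaves a natural default for everything else. */
enum sketch_0f01_op sketch_decode_0f01(unsigned char modrm)
{
    switch (modrm) {
    case 0xd4:
        return SKETCH_OP_VMFUNC;
    case 0xdf:
        return SKETCH_OP_INVLPGA;
    case 0xf9:
        return SKETCH_OP_RDTSCP;
    default:
        return SKETCH_OP_UNKNOWN;
    }
}
```

In the emulator each case would run its handler instead of returning a tag; the point of the sketch is only the control-flow shape the review asks for.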
[Xen-devel] Re: (xen 4.6 unstable) triple fault when execute fxsave during the procedure of guest iso install
The Windows 8.0 ISO can't be installed in UEFI mode, and I have to use windbg to get more debug information, so I am using Xen in this case.

From: Razvan Cojocaru [mailto:rcojoc...@bitdefender.com]
Sent: 24 June 2015 17:26
To: Fanhenglong; xen-devel@lists.xen.org
Cc: Liuqiming (John); Yanqiangjun; Huangpeng (Peter); Hanweidong (Randy)
Subject: Re: [Xen-devel] (xen 4.6 unstable) triple fault when execute fxsave during the procedure of guest iso install

On 06/24/2015 12:14 PM, Fanhenglong wrote:
> I want to debug the Windows OS install procedure with windbg.
> windbg executes an fxsave instruction after the blank VM is started and
> before the guest ISO starts to install. fxsave triggers the following
> code path:
>
> vmx_vmexit_handler(EXIT_REASON_EPT_VIOLATION)
>   -> ept_handle_violation
>   -> hvm_hap_nested_page_fault
>   -> handle_mmio_with_translation
>   -> handle_mmio
>   -> hvm_emulate_one
>   -> x86_emulate
>
> *x86_emulate returns X86EMUL_UNHANDLEABLE*

How are you using Xen in this case? Are you by any chance using the
vm_event system in a way that sends back an emulate vm_event response
from userspace?

You might want to look at x86_emulate() in
xen/arch/x86/x86_emulate/x86_emulate.c and see if (and how) fxsave is
being handled.

HTH,
Razvan
[Xen-devel] [PATCH v8 8/9] video: fbdev: s3fb: use arch_phys_wc_add() and pci_iomap_wc()
From: Luis R. Rodriguez mcg...@suse.com This driver uses the same area for MTRR as for the ioremap(). Convert the driver from using the x86 specific MTRR code to the architecture agnostic arch_phys_wc_add(). arch_phys_wc_add() will avoid MTRR if write-combining is available, in order to take advantage of that also ensure the ioremap'd area is requested as write-combining. There are a few motivations for this: a) Take advantage of PAT when available b) Help bury MTRR code away, MTRR is architecture specific and on x86 its replaced by PAT c) Help with the goal of eventually using _PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit de33c442e titled x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()) The conversion done is expressed by the following Coccinelle SmPL patch, it additionally required manual intervention to address all the #ifdery and removal of redundant things which arch_phys_wc_add() already addresses such as verbose message about when MTRR fails and doing nothing when we didn't get an MTRR. 
@ mtrr_found @ expression index, base, size; @@ -index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1); +index = arch_phys_wc_add(base, size); @ mtrr_rm depends on mtrr_found @ expression mtrr_found.index, mtrr_found.base, mtrr_found.size; @@ -mtrr_del(index, base, size); +arch_phys_wc_del(index); @ mtrr_rm_zero_arg depends on mtrr_found @ expression mtrr_found.index; @@ -mtrr_del(index, 0, 0); +arch_phys_wc_del(index); @ mtrr_rm_fb_info depends on mtrr_found @ struct fb_info *info; expression mtrr_found.index; @@ -mtrr_del(index, info-fix.smem_start, info-fix.smem_len); +arch_phys_wc_del(index); @ ioremap_replace_nocache depends on mtrr_found @ struct fb_info *info; expression base, size; @@ -info-screen_base = ioremap_nocache(base, size); +info-screen_base = ioremap_wc(base, size); @ ioremap_replace_default depends on mtrr_found @ struct fb_info *info; expression base, size; @@ -info-screen_base = ioremap(base, size); +info-screen_base = ioremap_wc(base, size); Generated-by: Coccinelle SmPL Cc: Jean-Christophe Plagniol-Villard plagn...@jcrosoft.com Cc: Tomi Valkeinen tomi.valkei...@ti.com Cc: Jingoo Han jg1@samsung.com Cc: Geert Uytterhoeven ge...@linux-m68k.org Cc: Daniel Vetter daniel.vet...@ffwll.ch Cc: Lad, Prabhakar prabhakar.cse...@gmail.com Cc: Rickard Strandqvist rickard_strandqv...@spectrumdigital.se Cc: Suresh Siddha sbsid...@gmail.com Cc: Ingo Molnar mi...@elte.hu Cc: Thomas Gleixner t...@linutronix.de Cc: Juergen Gross jgr...@suse.com Cc: Andy Lutomirski l...@amacapital.net Cc: Dave Airlie airl...@redhat.com Cc: Antonino Daplas adap...@gmail.com Cc: linux-fb...@vger.kernel.org Cc: linux-ker...@vger.kernel.org Acked-by: Tomi Valkeinen tomi.valkei...@ti.com Signed-off-by: Luis R. 
Rodriguez mcg...@suse.com --- drivers/video/fbdev/s3fb.c | 35 ++- 1 file changed, 6 insertions(+), 29 deletions(-) diff --git a/drivers/video/fbdev/s3fb.c b/drivers/video/fbdev/s3fb.c index f0ae61a..13b1090 100644 --- a/drivers/video/fbdev/s3fb.c +++ b/drivers/video/fbdev/s3fb.c @@ -28,13 +28,9 @@ #include linux/i2c.h #include linux/i2c-algo-bit.h -#ifdef CONFIG_MTRR -#include asm/mtrr.h -#endif - struct s3fb_info { int chip, rev, mclk_freq; - int mtrr_reg; + int wc_cookie; struct vgastate state; struct mutex open_lock; unsigned int ref_count; @@ -154,11 +150,7 @@ static const struct svga_timing_regs s3_timing_regs = { static char *mode_option; - -#ifdef CONFIG_MTRR static int mtrr = 1; -#endif - static int fasttext = 1; @@ -170,11 +162,8 @@ module_param(mode_option, charp, 0444); MODULE_PARM_DESC(mode_option, Default video mode ('640x480-8@60', etc)); module_param_named(mode, mode_option, charp, 0444); MODULE_PARM_DESC(mode, Default video mode ('640x480-8@60', etc) (deprecated)); - -#ifdef CONFIG_MTRR module_param(mtrr, int, 0444); MODULE_PARM_DESC(mtrr, Enable write-combining with MTRR (1=enable, 0=disable, default=1)); -#endif module_param(fasttext, int, 0644); MODULE_PARM_DESC(fasttext, Enable S3 fast text mode (1=enable, 0=disable, default=1)); @@ -1168,7 +1157,7 @@ static int s3_pci_probe(struct pci_dev *dev, const struct pci_device_id *id) info-fix.smem_len = pci_resource_len(dev, 0); /* Map physical IO memory address into kernel space */ - info-screen_base = pci_iomap(dev, 0, 0); + info-screen_base = pci_iomap_wc(dev, 0, 0); if (! info-screen_base) { rc = -ENOMEM; dev_err(info-device, iomap for framebuffer failed\n); @@ -1365,12 +1354,9 @@ static int s3_pci_probe(struct pci_dev *dev, const struct pci_device_id *id) /* Record a reference to the driver data */ pci_set_drvdata(dev, info); -#ifdef CONFIG_MTRR - if (mtrr) { - par-mtrr_reg = -1; - par-mtrr_reg =
Re: [Xen-devel] [PATCH v7 5/9] PCI: Add pci_iomap_wc() variants
On Thu, Jun 25, 2015 at 09:38:01AM +1000, Benjamin Herrenschmidt wrote: On Wed, 2015-06-24 at 15:29 -0700, Luis R. Rodriguez wrote:

Nope but at least what made me squint at this being a possible feature was that in practice when reviewing all of the kernel's pending device drivers using MTRR (potential write-combine candidates) I encountered a slew of them which had the architecturally unfortunate practice of combining PCI bars for MMIO and their respective write-combined desirable area (framebuffer for video, PIO buffers for infiniband, etc). Now, to me that read more as a practice for old school devices when such things were likely still being evaluated, more modern devices seem to adhere to sticking a full PCI bar with write-combining or not. Did you not encounter such mismatch splits on powerpc? Was such possibility addressed?

Yes, I remember we dealt with some networking (or maybe IB) stuff back in the day. We dealt with it by using a WC mapping and explicit barriers to prevent combine when not wanted. It is to be noted that on powerpc at least, writel() and co will never combine due to the memory barriers in them. Only normal stores (or __raw_writel) will. On Intel things are different I assume... And the people who really know seem to be eaten by volcanoes or not have time.

The problem I see is that architectures can provide widely different mechanisms and semantics in those areas and it's hard to define a generic driver interface.

Provided asm generic helpers are defined this should work though. The question is just if there is enough motivation.

Doesn't sound like it or as you note maybe for userspace there might be. My position is that if it was too late for PCIE or if this was too ambiguous for PCIE perhaps the next generation bus architecture or amendments (I have no clue if this would be possible) will make this part of future device negotiation clear and fully expected, not a wonderful side effect.

If what you are implying here is applicable to the x86 world I'm all for enabling this as we'd have less code to maintain but I'll note that getting a clarification alone on that prefetchable != write-combining was in and of itself hard, I'd be surprised if we could get full architectural buy-in to this as an immediate automatic feature.

I'm happy not to make it automatic for kernel space.

OK thanks I'll proceed with these patches then.

As for user mappings,

Which APIs were you considering in this regard BTW?

maybe the right thing to do is to let us do what we do by default with a quirk that can set a flag in pci_dev to disable that behaviour (maybe on a per BAR basis ?).

That might mean it could restrict userspace WC to require devices to have WC parts on a full PCI BAR. Although this is restrictive having reviewed most WC uses in the kernel I'd think this would be a fair compromise to make, but again, if things are still murky perhaps best we kiss this idea good bye for now and hope for it to come in on future buses or amendments (if that's even possible?).

I think the common case is that WC works.

If WC does not I will note one hack which might be worth mentioning -- just for the record, this was devised as a shortcoming of a device where they failed to split things properly and that *without* WC performance suffered quite a bit so they made one full PCI BAR WC and as a work around this:

http://lkml.kernel.org/r/20150416041837.GA5712@hykim-PC

That is for registers that needed it:

write; wmb;

Then if they wanted to wait till the NIC has seen the write, they did:

write; wmb; read;

Luis
Re: [Xen-devel] [PATCH v7 5/9] PCI: Add pci_iomap_wc() variants
On Wed, 2015-06-24 at 17:58 -0700, Luis R. Rodriguez wrote:
On Wed, Jun 24, 2015 at 5:52 PM, Benjamin Herrenschmidt <b...@kernel.crashing.org> wrote:
On Thu, 2015-06-25 at 02:08 +0200, Luis R. Rodriguez wrote:

OK thanks I'll proceed with these patches then. As for user mappings, Which APIs were you considering in this regard BTW?

mmap of the generic /sys/bus/pci/.../resource*

Like? Got a demo patch in mind? :)

Nope. I was just thinking out loud. Today I have yet to see a problem with what we do so ...

Cheers,
Ben.
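For readers unfamiliar with the userspace interface mentioned above, here is a minimal sketch of mapping a PCI BAR through its sysfs resource file. The helper names `map_pci_resource()` and `unmap_pci_resource()` are hypothetical, and the path is left to the caller because the thread elides it. On kernels and architectures that expose it, a `resourceN_wc` file next to `resourceN` requests a write-combined mapping; whether it exists for a given BAR is an assumption to check on the target system.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `len` bytes of a PCI BAR exposed through sysfs.
 * `path` is a caller-supplied resource file under /sys/bus/pci/.
 * Returns NULL on any failure.
 */
static void *map_pci_resource(const char *path, size_t len)
{
	void *p;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDWR | O_SYNC);
	if (fd < 0)
		return NULL;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after the fd is closed */

	return p == MAP_FAILED ? NULL : p;
}

/* Release a mapping obtained from map_pci_resource(); NULL is ignored. */
static void unmap_pci_resource(void *p, size_t len)
{
	if (p)
		munmap(p, len);
}
```

Mapping a BAR this way normally needs root privileges, and `len` should not exceed the BAR size reported for the device.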
[Xen-devel] [PATCH v8 2/9] video: fbdev: i740fb: use arch_phys_wc_add() and pci_ioremap_wc_bar()
From: Luis R. Rodriguez <mcg...@suse.com>

Convert the driver from using the x86 specific MTRR code to the architecture agnostic arch_phys_wc_add(). arch_phys_wc_add() will avoid MTRR if write-combining is available; in order to take advantage of that, also ensure the ioremap'd area is requested as write-combining.

There are a few motivations for this:

a) Take advantage of PAT when available

b) Help bury MTRR code away, MTRR is architecture specific and on x86 it's replaced by PAT

c) Help with the goal of eventually using _PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit de33c442e titled "x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()")

The conversion done is expressed by the following Coccinelle SmPL patch. It additionally required manual intervention to address all the #ifdery and to remove redundant things which arch_phys_wc_add() already addresses, such as the verbose message about when MTRR fails and doing nothing when we didn't get an MTRR.

@ mtrr_found @
expression index, base, size;
@@

-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);

@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@

-mtrr_del(index, base, size);
+arch_phys_wc_del(index);

@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@

-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);

@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@

-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);

@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);

@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);

Generated-by: Coccinelle SmPL
Cc: Jingoo Han <jg1@samsung.com>
Cc: Bjorn Helgaas <bhelg...@google.com>
Cc: Geert Uytterhoeven <ge...@linux-m68k.org>
Cc: Rob Clark <robdcl...@gmail.com>
Cc: Benoit Taine <benoit.ta...@lip6.fr>
Cc: Suresh Siddha <sbsid...@gmail.com>
Cc: Ingo Molnar <mi...@elte.hu>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
Cc: Andy Lutomirski <l...@amacapital.net>
Cc: Dave Airlie <airl...@redhat.com>
Cc: Antonino Daplas <adap...@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagn...@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkei...@ti.com>
Cc: linux-fb...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Acked-by: Tomi Valkeinen <tomi.valkei...@ti.com>
Signed-off-by: Luis R. Rodriguez <mcg...@suse.com>
---
 drivers/video/fbdev/i740fb.c | 35 ++++++-----------------------------
 1 file changed, 6 insertions(+), 29 deletions(-)

diff --git a/drivers/video/fbdev/i740fb.c b/drivers/video/fbdev/i740fb.c
index a2b4204..452e116 100644
--- a/drivers/video/fbdev/i740fb.c
+++ b/drivers/video/fbdev/i740fb.c
@@ -27,24 +27,15 @@
 #include <linux/console.h>
 #include <video/vga.h>
 
-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
-#endif
-
 #include "i740_reg.h"
 
 static char *mode_option;
-
-#ifdef CONFIG_MTRR
 static int mtrr = 1;
-#endif
 
 struct i740fb_par {
 	unsigned char __iomem *regs;
 	bool has_sgram;
-#ifdef CONFIG_MTRR
-	int mtrr_reg;
-#endif
+	int wc_cookie;
 	bool ddc_registered;
 	struct i2c_adapter ddc_adapter;
 	struct i2c_algo_bit_data ddc_algo;
@@ -1040,7 +1031,7 @@ static int i740fb_probe(struct pci_dev *dev, const struct pci_device_id *ent)
 		goto err_request_regions;
 	}
 
-	info->screen_base = pci_ioremap_bar(dev, 0);
+	info->screen_base = pci_ioremap_wc_bar(dev, 0);
 	if (!info->screen_base) {
 		dev_err(info->device, "error remapping base\n");
 		ret = -ENOMEM;
@@ -1144,13 +1135,9 @@ static int i740fb_probe(struct pci_dev *dev, const struct pci_device_id *ent)
 	fb_info(info, "%s frame buffer device\n", info->fix.id);
 	pci_set_drvdata(dev, info);
-#ifdef CONFIG_MTRR
-	if (mtrr) {
-		par->mtrr_reg = -1;
-		par->mtrr_reg = mtrr_add(info->fix.smem_start,
-				info->fix.smem_len, MTRR_TYPE_WRCOMB, 1);
-	}
-#endif
+	if (mtrr)
+		par->wc_cookie = arch_phys_wc_add(info->fix.smem_start,
+						  info->fix.smem_len);
 	return 0;
 
 err_reg_framebuffer:
@@ -1177,13 +1164,7 @@ static void i740fb_remove(struct pci_dev *dev)
 	if (info) {
 		struct i740fb_par *par = info->par;
-
-#ifdef CONFIG_MTRR
-		if (par->mtrr_reg >= 0) {
-			mtrr_del(par->mtrr_reg, 0, 0);
-			par->mtrr_reg = -1;
-		}
-#endif
+
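The commit message above says arch_phys_wc_add() already covers the "MTRR failed" and "nothing to do" cases, which is why the driver can drop its #ifdef blocks and its own mtrr_reg >= 0 check. The following is a small userspace model of that cookie convention, not the kernel implementation: every `model_*` name and the offset value are assumptions made for illustration. It returns 0 when PAT makes an MTRR unnecessary, a positive cookie when an MTRR slot is taken, and a negative value on failure, and the delete side ignores anything that is not a positive cookie.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical offset that keeps every real cookie strictly positive. */
#define MODEL_COOKIE_OFFSET 1000

static bool model_pat_enabled;	/* assumed platform state */
static int model_mtrr_in_use;	/* number of modelled MTRR slots taken */

/*
 * Model of the add side: 0 means "nothing to do" (PAT handles WC),
 * > 0 is a cookie for a reserved MTRR slot, < 0 is a failure.
 */
static int model_phys_wc_add(unsigned long base, unsigned long size)
{
	(void)base;

	if (model_pat_enabled)
		return 0;
	if (size == 0)
		return -1;

	return MODEL_COOKIE_OFFSET + model_mtrr_in_use++;
}

/* Model of the delete side: only positive cookies release anything. */
static void model_phys_wc_del(int cookie)
{
	if (cookie > 0) {
		assert(cookie >= MODEL_COOKIE_OFFSET);
		model_mtrr_in_use--;
	}
}
```

Under this convention a caller can store whatever the add call returns and pass it unconditionally to the delete call on teardown, which is the simplification the patch makes in i740fb_remove().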
[Xen-devel] [PATCH v8 4/9] video: fbdev: gxt4500: use pci_ioremap_wc_bar() for framebuffer
From: Luis R. Rodriguez <mcg...@suse.com>

The driver doesn't use mtrr_add() or arch_phys_wc_add() but since we know the framebuffer is isolated already on an ioremap() we can take advantage of write combining for performance where possible.

In this case there are a few motivations for this:

a) Take advantage of PAT when available

b) Help with the goal of eventually using _PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit de33c442e titled "x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()")

Cc: Laurent Pinchart <laurent.pinch...@ideasonboard.com>
Cc: Rob Clark <robdcl...@gmail.com>
Cc: Geert Uytterhoeven <ge...@linux-m68k.org>
Cc: Suresh Siddha <sbsid...@gmail.com>
Cc: Ingo Molnar <mi...@elte.hu>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
Cc: Andy Lutomirski <l...@amacapital.net>
Cc: Dave Airlie <airl...@redhat.com>
Cc: Antonino Daplas <adap...@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagn...@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkei...@ti.com>
Cc: linux-fb...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Acked-by: Tomi Valkeinen <tomi.valkei...@ti.com>
Signed-off-by: Luis R. Rodriguez <mcg...@suse.com>
---
 drivers/video/fbdev/gxt4500.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/gxt4500.c b/drivers/video/fbdev/gxt4500.c
index 135d78a..f19133a 100644
--- a/drivers/video/fbdev/gxt4500.c
+++ b/drivers/video/fbdev/gxt4500.c
@@ -662,7 +662,7 @@ static int gxt4500_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	info->fix.smem_start = fb_phys;
 	info->fix.smem_len = pci_resource_len(pdev, 1);
-	info->screen_base = pci_ioremap_bar(pdev, 1);
+	info->screen_base = pci_ioremap_wc_bar(pdev, 1);
 	if (!info->screen_base) {
 		dev_err(&pdev->dev, "gxt4500: cannot map framebuffer\n");
 		goto err_unmap_regs;
-- 
2.3.2.209.gd67f9d5.dirty
[Xen-devel] [PATCH v8 9/9] video: fbdev: vt8623fb: use arch_phys_wc_add() and pci_iomap_wc()
From: Luis R. Rodriguez <mcg...@suse.com>

This driver uses the same area for MTRR as for the ioremap(). Convert the driver from using the x86 specific MTRR code to the architecture agnostic arch_phys_wc_add(). arch_phys_wc_add() will avoid MTRR if write-combining is available; in order to take advantage of that, also ensure the ioremap'd area is requested as write-combining.

There are a few motivations for this:

a) Take advantage of PAT when available

b) Help bury MTRR code away, MTRR is architecture specific and on x86 it's replaced by PAT

c) Help with the goal of eventually using _PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (see commit de33c442e titled "x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()")

The conversion done is expressed by the following Coccinelle SmPL patch. It additionally required manual intervention to address all the #ifdery and to remove redundant things which arch_phys_wc_add() already addresses, such as the verbose message about when MTRR fails and doing nothing when we didn't get an MTRR.

@ mtrr_found @
expression index, base, size;
@@

-index = mtrr_add(base, size, MTRR_TYPE_WRCOMB, 1);
+index = arch_phys_wc_add(base, size);

@ mtrr_rm depends on mtrr_found @
expression mtrr_found.index, mtrr_found.base, mtrr_found.size;
@@

-mtrr_del(index, base, size);
+arch_phys_wc_del(index);

@ mtrr_rm_zero_arg depends on mtrr_found @
expression mtrr_found.index;
@@

-mtrr_del(index, 0, 0);
+arch_phys_wc_del(index);

@ mtrr_rm_fb_info depends on mtrr_found @
struct fb_info *info;
expression mtrr_found.index;
@@

-mtrr_del(index, info->fix.smem_start, info->fix.smem_len);
+arch_phys_wc_del(index);

@ ioremap_replace_nocache depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap_nocache(base, size);
+info->screen_base = ioremap_wc(base, size);

@ ioremap_replace_default depends on mtrr_found @
struct fb_info *info;
expression base, size;
@@

-info->screen_base = ioremap(base, size);
+info->screen_base = ioremap_wc(base, size);

Generated-by: Coccinelle SmPL
Cc: Rob Clark <robdcl...@gmail.com>
Cc: Laurent Pinchart <laurent.pinch...@ideasonboard.com>
Cc: Jingoo Han <jg1@samsung.com>
Cc: Lad, Prabhakar <prabhakar.cse...@gmail.com>
Cc: Suresh Siddha <sbsid...@gmail.com>
Cc: Ingo Molnar <mi...@elte.hu>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
Cc: Andy Lutomirski <l...@amacapital.net>
Cc: Dave Airlie <airl...@redhat.com>
Cc: Antonino Daplas <adap...@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagn...@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkei...@ti.com>
Cc: linux-fb...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Acked-by: Tomi Valkeinen <tomi.valkei...@ti.com>
Signed-off-by: Luis R. Rodriguez <mcg...@suse.com>
---
 drivers/video/fbdev/vt8623fb.c | 31 ++++++-------------------------
 1 file changed, 6 insertions(+), 25 deletions(-)

diff --git a/drivers/video/fbdev/vt8623fb.c b/drivers/video/fbdev/vt8623fb.c
index ea7f056..60f24828 100644
--- a/drivers/video/fbdev/vt8623fb.c
+++ b/drivers/video/fbdev/vt8623fb.c
@@ -26,13 +26,9 @@
 #include <linux/console.h> /* Why should fb driver call console functions? because console_lock() */
 #include <video/vga.h>
 
-#ifdef CONFIG_MTRR
-#include <asm/mtrr.h>
-#endif
-
 struct vt8623fb_info {
 	char __iomem *mmio_base;
-	int mtrr_reg;
+	int wc_cookie;
 	struct vgastate state;
 	struct mutex open_lock;
 	unsigned int ref_count;
@@ -99,10 +95,7 @@ static struct svga_timing_regs vt8623_timing_regs = {
 /* Module parameters */
 
 static char *mode_option = "640x480-8@60";
-
-#ifdef CONFIG_MTRR
 static int mtrr = 1;
-#endif
 
 MODULE_AUTHOR("(c) 2006 Ondrej Zajicek <santi...@crfreenet.org>");
 MODULE_LICENSE("GPL");
@@ -112,11 +105,8 @@ module_param(mode_option, charp, 0644);
 MODULE_PARM_DESC(mode_option, "Default video mode ('640x480-8@60', etc)");
 module_param_named(mode, mode_option, charp, 0);
 MODULE_PARM_DESC(mode, "Default video mode e.g. '648x480-8@60' (deprecated)");
-
-#ifdef CONFIG_MTRR
 module_param(mtrr, int, 0444);
 MODULE_PARM_DESC(mtrr, "Enable write-combining with MTRR (1=enable, 0=disable, default=1)");
-#endif
 
 /* - */
 
@@ -710,7 +700,7 @@ static int vt8623_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	info->fix.mmio_len = pci_resource_len(dev, 1);
 
 	/* Map physical IO memory address into kernel space */
-	info->screen_base = pci_iomap(dev, 0, 0);
+	info->screen_base = pci_iomap_wc(dev, 0, 0);
 	if (! info->screen_base) {
 		rc = -ENOMEM;
 		dev_err(info->device, "iomap for framebuffer failed\n");
@@ -781,12 +771,9 @@ static int vt8623_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	/* Record a reference to the driver data */
 	pci_set_drvdata(dev,