[Xen-devel] [PATCH] oxenstored: fix del_watches and del_transactions
The statement to reset nb_watches should be in del_watches, not
del_transactions.

Signed-off-by: Wei Liu wei.l...@citrix.com
Cc: David Scott dave.sc...@citrix.com
---
It was only after Ian applied the previous patch that I discovered this
copy-and-paste error. Sorry about this.
---
 tools/ocaml/xenstored/connection.ml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
index 9de4978..935939a 100644
--- a/tools/ocaml/xenstored/connection.ml
+++ b/tools/ocaml/xenstored/connection.ml
@@ -188,10 +188,10 @@ let del_watch con path token =
 let del_watches con =
 	Hashtbl.clear con.watches;
+	con.nb_watches <- 0

 let del_transactions con =
 	Hashtbl.clear con.transactions;
-	con.nb_watches <- 0

 let list_watches con =
 	let ll = Hashtbl.fold
--
1.9.1

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
[Xen-devel] race condition in xen-gntdev (was: Re: gntdev/gntalloc and fork? - crash in gntdev)
On Thu, May 28, 2015 at 01:45:08AM +0200, Marek Marczykowski-Górecki wrote:
> On Thu, Apr 30, 2015 at 04:47:44PM +0200, Marek Marczykowski-Górecki wrote:
> > Hi,
> >
> > What is the proper way to handle shared pages (on either side - using
> > gntdev or gntalloc) with respect to fork and a possible later exec? The
> > child process does not need to access those pages in any way, but will
> > map different one(s), using a newly opened FD to the gntdev/gntalloc
> > device. Should it unmap them and close the FD to the device manually
> > just after the fork? Or should a process using gntdev or gntalloc avoid
> > fork altogether?
> >
> > I'm asking because I get a kernel oops[1] in the context of such a
> > process. This process uses both gntdev and gntalloc. The PID reported
> > there is a child, which maps additional pages (using a newly opened FD
> > to /dev/xen/gnt*), but I'm not sure whether the crash happens before,
> > after or at this second mapping (actually a vchan connection), or maybe
> > even at cleanup of this second mapping. The parent process keeps its
> > mappings for the whole lifetime of its child.
> >
> > I don't have a 100% reliable way to reproduce this problem, but it
> > happens quite often when I run such operations in a loop. Any ideas?
>
> I've done some further debugging; below is what I get. The kernel is
> vanilla 3.19.3, running on Xen 4.4.2.
The kernel message:

[74376.073464] general protection fault: [#1] SMP
[74376.073475] Modules linked in: fuse xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip6table_filter ip6_tables intel_rapl iosf_mbi x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel pcspkr xen_netfront ghash_clmulni_intel nfsd auth_rpcgss nfs_acl lockd grace xenfs xen_privcmd dummy_hcd udc_core xen_gntdev xen_gntalloc xen_blkback sunrpc u2mfn(O) xen_evtchn xen_blkfront
[74376.073522] CPU: 1 PID: 9377 Comm: qrexec-agent Tainted: G O 3.19.3-4.pvops.qubes.x86_64 #1
[74376.073528] task: 880002442e40 ti: 8832c000 task.ti: 8832c000
[74376.073532] RIP: e030:[a00952c5] [a00952c5] unmap_if_in_range+0x15/0xd0 [xen_gntdev]

static void unmap_if_in_range(struct grant_map *map,
			      unsigned long start, unsigned long end)
{
	unsigned long mstart, mend;
	int err;

	if (!map->vma)
		return;

The above crash is at the first access to map...

[74376.073543] RSP: e02b:8832fc08 EFLAGS: 00010292
[74376.073546] RAX:  RBX: dead00100100 RCX: 7fd8616ea000
[74376.073550] RDX: 7fd8616ea000 RSI: 7fd8616e9000 RDI: dead00100100

... which is 0xdead00100100 (LIST_POISON1).
[74376.073554] RBP: 8832fc48 R08:  R09:
[74376.073557] R10: ea21bb00 R11:  R12: 7fd8616e9000
[74376.073561] R13: 7fd8616ea000 R14: 880012702e40 R15: 880012702e70
[74376.073569] FS: 7fd8616ca700() GS:880013c8() knlGS:
[74376.073574] CS: e033 DS:  ES:  CR0: 80050033
[74376.073577] CR2: 7fd8616e9458 CR3: e7af5000 CR4: 00042660
[74376.073582] Stack:
[74376.073584] 8800188356c0 00d0 8832fc68 c64ef797
[74376.073590] 0220 dead00100100 7fd8616e9000 7fd8616ea000
[74376.073596] 8832fc88 a00953c6 8832fcc8 880012702e70
[74376.073603] Call Trace:
[74376.073610] [a00953c6] mn_invl_range_start+0x46/0x90 [xen_gntdev]
[74376.073620] [811e88fb] __mmu_notifier_invalidate_range_start+0x5b/0x90
[74376.073627] [811c2a59] do_wp_page+0x769/0x820
[74376.074031] [811c4f5c] handle_mm_fault+0x7fc/0x10c0
[74376.074031] [813864cd] ? radix_tree_lookup+0xd/0x10
[74376.074031] [81061e1c] __do_page_fault+0x1dc/0x5a0
[74376.074031] [817560a6] ? mutex_lock+0x16/0x37
[74376.074031] [a0008928] ? evtchn_ioctl+0x118/0x3c0 [xen_evtchn]
[74376.074031] [812209d8] ? do_vfs_ioctl+0x2f8/0x4f0
[74376.074031] [811cafdf] ? do_munmap+0x29f/0x3b0
[74376.074031] [81062211] do_page_fault+0x31/0x70
[74376.074031] [81759e28] page_fault+0x28/0x30

static void mn_invl_range_start(struct mmu_notifier *mn,
				struct mm_struct *mm,
				unsigned long start, unsigned long end)
{
	struct gntdev_priv *priv = container_of(mn, struct gntdev_priv, mn);
	struct grant_map *map;

	spin_lock(&priv->lock);
	list_for_each_entry(map, &priv->maps, next) {
		unmap_if_in_range(map, start, end);
	}

mn_invl_range_start+0x46 is the first call to unmap_if_in_range, so something is wrong with
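The RDI value is the kernel's list poison: list_del() writes LIST_POISON1/LIST_POISON2 into a removed entry so that any traversal still holding a stale pointer faults loudly, which is what mn_invl_range_start() hit while walking priv->maps. A userspace C sketch of that poisoning behaviour (poison constants assumed to match include/linux/poison.h; the list implementation is a hand-rolled stand-in, not the kernel's):

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to match the kernel's include/linux/poison.h values. */
#define LIST_POISON1 ((void *)0xdead00100100UL)
#define LIST_POISON2 ((void *)0xdead00200200UL)

struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *h)
{
    h->next = h->prev = h;
}

static void list_add_tail(struct list_node *n, struct list_node *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void list_del(struct list_node *n)
{
    /* Unlink the node from its neighbours... */
    n->prev->next = n->next;
    n->next->prev = n->prev;
    /* ...then poison it: any iterator that races with this removal and
     * follows the stale pointer dereferences 0xdead00100100 and takes a
     * general protection fault, exactly as in the oops above. */
    n->next = LIST_POISON1;
    n->prev = LIST_POISON2;
}
```

The fault therefore signals a locking bug, not memory corruption: some path removed a grant_map from the list without holding the same lock the mmu-notifier iterator takes.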
Re: [Xen-devel] [PATCH] oxenstored: fix del_watches and del_transactions
On 17 Jun 2015, at 20:39, Wei Liu wei.l...@citrix.com wrote:

> The statement to reset nb_watches should be in del_watches, not
> del_transactions.
>
> Signed-off-by: Wei Liu wei.l...@citrix.com
> Cc: David Scott dave.sc...@citrix.com
> ---
> It was only after Ian applied the previous patch that I discovered this
> copy-and-paste error. Sorry about this.

Oh yes, I see what you mean.

Acked-by: David Scott dave.sc...@citrix.com

Cheers,
Dave

> ---
>  tools/ocaml/xenstored/connection.ml | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/ocaml/xenstored/connection.ml b/tools/ocaml/xenstored/connection.ml
> index 9de4978..935939a 100644
> --- a/tools/ocaml/xenstored/connection.ml
> +++ b/tools/ocaml/xenstored/connection.ml
> @@ -188,10 +188,10 @@ let del_watch con path token =
>  let del_watches con =
>  	Hashtbl.clear con.watches;
> +	con.nb_watches <- 0
>
>  let del_transactions con =
>  	Hashtbl.clear con.transactions;
> -	con.nb_watches <- 0
>
>  let list_watches con =
>  	let ll = Hashtbl.fold
> --
> 1.9.1
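The bug class here is language-independent: a cached counter must be reset in the same cleanup path that clears the container it shadows, or the two drift apart. A minimal C sketch of the invariant the patch restores (the struct and helpers are hypothetical stand-ins for the OCaml connection record, not oxenstored API):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of oxenstored's connection record: nb_watches
 * caches the number of entries in the watches table, so it must be
 * reset exactly where that table is cleared. */
struct connection {
    int nb_watches;       /* cached count of entries in "watches" */
    int nb_transactions;  /* cached count of entries in "transactions" */
};

/* Correct: clearing the watches table also resets the watch counter. */
static void del_watches(struct connection *con)
{
    /* hashtbl_clear(con->watches) would go here */
    con->nb_watches = 0;
}

/* Correct: clearing transactions must not touch nb_watches; resetting
 * it here was the copy-and-paste bug the patch above fixes. */
static void del_transactions(struct connection *con)
{
    /* hashtbl_clear(con->transactions) would go here */
    con->nb_transactions = 0;
}
```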
[Xen-devel] [xen-unstable test] 58663: tolerable FAIL - PUSHED
flight 58663 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58663/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-credit2   6 xen-boot          fail in 58618 pass in 58663
 test-amd64-i386-xl-qemuu-win7-amd64 9 windows-install  fail pass in 58618

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt-xsm 11 guest-start       fail REGR. vs. 58392
 test-armhf-armhf-libvirt     11 guest-start       fail blocked in 58392
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail in 58618 like 58392
 test-amd64-i386-libvirt-xsm  11 guest-start       fail like 58392
 test-amd64-i386-libvirt      11 guest-start       fail like 58392
 test-amd64-amd64-libvirt-xsm 11 guest-start       fail like 58392
 test-amd64-amd64-libvirt     11 guest-start       fail like 58392
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58392

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 11 guest-start           fail never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start            fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-xsm      12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check  fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop      fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl          12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-sedf-pin 12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-sedf     12 migrate-support-check  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop     fail never pass

version targeted for testing:
 xen                  fcbfaf9d260adbdb9352d6300b9f63c4ed443d49
baseline version:
 xen                  12e817e281034f5881f46e0e4f1d127820101a78

People who touched revisions under test:
  Andrew Cooper andrew.coop...@citrix.com
  Dario Faggioli dario.faggi...@citrix.com
  David Vrabel david.vra...@citrix.com
  Don Dugger donald.d.dug...@intel.com
  Don Slutz dsl...@verizon.com
  George Dunlap george.dun...@eu.citrix.com
  Ian Campbell ian.campb...@citrix.com
  Ian Jackson ian.jack...@eu.citrix.com
  Jan Beulich jbeul...@suse.com
  Juergen Gross jgr...@suse.com
  Kevin Tian kevin.t...@intel.com
  Olaf Hering o...@aepfle.de
  Roger Pau Monné roger@citrix.com
  Ross Lagerwall ross.lagerw...@citrix.com
  Wei Liu wei.l...@citrix.com

jobs:
 build-amd64-xsm                                  pass
 build-armhf-xsm                                  pass
 build-i386-xsm                                   pass
 build-amd64                                      pass
 build-armhf                                      pass
 build-i386                                       pass
 build-amd64-libvirt                              pass
 build-armhf-libvirt                              pass
 build-i386-libvirt                               pass
 build-amd64-oldkern                              pass
 build-i386-oldkern                               pass
 build-amd64-pvops                                pass
 build-armhf-pvops                                pass
 build-i386-pvops                                 pass
 build-amd64-rumpuserxen                          pass
 build-i386-rumpuserxen                           pass
 test-amd64-amd64-xl                              pass
 test-armhf-armhf-xl                              pass
 test-amd64-i386-xl                               pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm    pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm     pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm    pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm     pass
 test-amd64-amd64-libvirt-xsm                     fail
 test-armhf-armhf-libvirt-xsm                     fail
 test-amd64-i386-libvirt-xsm                      fail
 test-amd64-amd64-xl-xsm                          pass
 test-armhf-armhf-xl-xsm                          pass
 test-amd64-i386-xl-xsm                           pass
 test-amd64-amd64-xl-pvh-amd                      fail
[Xen-devel] [linux-3.4 test] 58676: regressions - FAIL
flight 58676 linux-3.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58676/

Regressions :-(

Tests which did not succeed and are blocking, including tests which could not be run:
 test-amd64-amd64-xl           9 debian-install          fail REGR. vs. 52209-bisect
 test-amd64-amd64-pair        15 debian-install/dst_host fail REGR. vs. 52715-bisect
 test-amd64-i386-pair         15 debian-install/dst_host fail REGR. vs. 56366-bisect

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt       9 debian-install      fail blocked in 56366-bisect
 test-amd64-i386-rumpuserxen-i386 6 xen-boot         fail blocked in 56366-bisect
 test-amd64-amd64-libvirt      9 debian-install      fail blocked in 56366-bisect
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 6 xen-boot fail blocked in 56366-bisect
 test-amd64-i386-freebsd10-amd64 6 xen-boot          fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect
 test-amd64-i386-xl-qemuu-winxpsp3 6 xen-boot        fail blocked in 56366-bisect
 test-amd64-amd64-rumpuserxen-amd64 6 xen-boot       fail blocked in 56366-bisect
 test-amd64-i386-qemut-rhel6hvm-intel 6 xen-boot     fail blocked in 56366-bisect
 test-amd64-amd64-xl-multivcpu 6 xen-boot            fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemut-winxpsp3 6 xen-boot       fail blocked in 56366-bisect
 test-amd64-i386-xl-qemuu-ovmf-amd64 6 xen-boot      fail blocked in 56366-bisect
 test-amd64-i386-xl            9 debian-install      fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemuu-winxpsp3 6 xen-boot       fail blocked in 56366-bisect
 test-amd64-i386-libvirt-xsm   6 xen-boot            fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-winxpsp3 6 xen-boot        fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 6 xen-boot fail blocked in 56366-bisect
 test-amd64-amd64-xl-sedf      9 debian-install      fail blocked in 56366-bisect
 test-amd64-i386-xl-qemuu-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect
 test-amd64-i386-freebsd10-i386 6 xen-boot           fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemut-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect
 test-amd64-amd64-xl-sedf-pin  9 debian-install      fail blocked in 56366-bisect
 test-amd64-i386-qemuu-rhel6hvm-intel 6 xen-boot     fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-debianhvm-amd64 6 xen-boot fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail blocked in 56366-bisect
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop  fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemut-win7-amd64 6 xen-boot     fail blocked in 56366-bisect
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop   fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail blocked in 56366-bisect
 test-amd64-i386-xl-qemuu-win7-amd64 9 windows-install fail blocked in 56366-bisect
 test-amd64-amd64-xl-qemuu-ovmf-amd64 6 xen-boot     fail like 53709-bisect

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-xsm       9 debian-install      fail never pass
 test-amd64-amd64-libvirt-xsm  9 debian-install      fail never pass
 test-amd64-amd64-xl-pvh-amd   9 debian-install      fail never pass
 test-amd64-i386-xl-xsm        9 debian-install      fail never pass
 test-amd64-amd64-xl-credit2   9 debian-install      fail never pass
 test-amd64-amd64-xl-pvh-intel 9 debian-install      fail never pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 20 leak-check/check fail never pass
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 20 leak-check/check fail never pass

version targeted for testing:
 linux                56b48fcda5076d4070ab00df32ff5ff834e0be86
baseline version:
 linux                bb4a05a0400ed6d2f1e13d1f82f289ff74300a70

370 people touched revisions under test, not listing them all

jobs:
 build-amd64-xsm                                  pass
 build-i386-xsm                                   pass
 build-amd64                                      pass
 build-i386                                       pass
 build-amd64-libvirt                              pass
 build-i386-libvirt                               pass
 build-amd64-pvops                                pass
 build-i386-pvops                                 pass
 build-amd64-rumpuserxen                          pass
 build-i386-rumpuserxen                           pass
 test-amd64-amd64-xl                              fail
 test-amd64-i386-xl
[Xen-devel] [qemu-mainline test] 58672: regressions - trouble: blocked/broken/fail/pass
flight 58672 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58672/

Regressions :-(

Tests which did not succeed and are blocking, including tests which could not be run:
 build-armhf-libvirt           4 host-build-prep     fail REGR. vs. 58551
 test-amd64-i386-xl-qemuu-win7-amd64 9 windows-install fail REGR. vs. 58551

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt      11 guest-start         fail like 58551
 test-amd64-i386-libvirt-xsm  11 guest-start         fail like 58551
 test-amd64-amd64-libvirt     11 guest-start         fail like 58551

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt      1 build-check(1)         blocked n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)         blocked n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start            fail never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start           fail never pass
 test-amd64-amd64-libvirt-xsm 11 guest-start            fail never pass
 test-armhf-armhf-xl          12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-sedf     12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-sedf-pin 12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-xsm      12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop     fail never pass

version targeted for testing:
 qemuu                a09f4a9d19c500ea5cbcdc0bd7f0d540cf54f9f5
baseline version:
 qemuu                0a2df857a7038c75379cc575de5d4be4c0ac629e

People who touched revisions under test:
  Alberto Garcia be...@igalia.com
  Alex Bennée alex.ben...@linaro.org
  Alexander Graf ag...@suse.de
  Andrew Jones drjo...@redhat.com
  Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
  Aurelien Jarno aurel...@aurel32.net
  Aurelio C. Remonda aurelioremo...@gmail.com
  Benoit Canet benoit.ca...@nodalink.com
  Benoît Canet benoit.ca...@nodalink.com
  Christian Borntraeger borntrae...@de.ibm.com
  Cole Robinson crobi...@redhat.com
  Dave Airlie airl...@redhat.com
  Eric Auger eric.au...@linaro.org
  Fam Zheng f...@redhat.com
  Gerd Hoffmann kra...@redhat.com
  Johan Karlsson johan.karls...@enea.com
  John Snow js...@redhat.com
  Kevin Wolf kw...@redhat.com
  Kővágó, Zoltán dirty.ice...@gmail.com
  Laurent Vivier laur...@vivier.eu
  Mao Chuan Li maoch...@linux.vnet.ibm.com
  Max Reitz mre...@redhat.com
  Michael S. Tsirkin m...@redhat.com
  Pavel Fedin p.fe...@samsung.com
  Peter Crosthwaite peter.crosthwa...@xilinx.com
  Peter Maydell peter.mayd...@linaro.org
  Riku Voipio riku.voi...@linaro.org
  Sergey Fedorov serge.f...@gmail.com
  Shannon Zhao shannon.z...@linaro.org
  Shannon Zhao zhaoshengl...@huawei.com
  Shlomo Pongratz shlomo.pongr...@huawei.com
  Stefan Hajnoczi stefa...@redhat.com
  Thierry Bultel thierry.bul...@basystemes.fr
  Xu Wang gesa...@linux.vnet.ibm.com
  Yongbok Kim yongbok@imgtec.com

jobs:
 build-amd64-xsm                                  pass
 build-armhf-xsm                                  pass
 build-i386-xsm                                   pass
 build-amd64                                      pass
 build-armhf                                      pass
 build-i386                                       pass
 build-amd64-libvirt                              pass
 build-armhf-libvirt                              broken
 build-i386-libvirt                               pass
 build-amd64-pvops                                pass
 build-armhf-pvops                                pass
 build-i386-pvops                                 pass
 test-amd64-amd64-xl                              pass
 test-armhf-armhf-xl                              pass
 test-amd64-i386-xl                               pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm    pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm     pass
 test-amd64-amd64-libvirt-xsm                     fail
 test-armhf-armhf-libvirt-xsm                     blocked
 test-amd64-i386-libvirt-xsm                      fail
 test-amd64-amd64-xl-xsm                          pass
 test-armhf-armhf-xl-xsm                          pass
[Xen-devel] [linux-next test] 58666: regressions - FAIL
flight 58666 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58666/

Regressions :-(

Tests which did not succeed and are blocking, including tests which could not be run:
 test-armhf-armhf-xl-credit2  15 guest-start/debian.repeat fail REGR. vs. 58620

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-freebsd10-amd64 9 freebsd-install   fail like 58620
 test-amd64-i386-freebsd10-i386 9 freebsd-install    fail like 58620
 test-amd64-i386-libvirt      11 guest-start         fail like 58620
 test-amd64-amd64-libvirt     11 guest-start         fail like 58620
 test-amd64-i386-libvirt-xsm  11 guest-start         fail like 58620
 test-armhf-armhf-libvirt-xsm 11 guest-start         fail like 58620
 test-armhf-armhf-libvirt     11 guest-start         fail like 58620
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop  fail like 58620
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop   fail like 58620
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop   fail like 58620

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start            fail never pass
 test-amd64-amd64-xl-pvh-intel 13 guest-saverestore     fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-sedf-pin 12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-xsm      12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail never pass
 test-armhf-armhf-xl          12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop     fail never pass
 test-armhf-armhf-xl-sedf     12 migrate-support-check  fail never pass

version targeted for testing:
 linux                ca3cfaa9e7db530b14e4a2d98a9310130152fdb8
baseline version:
 linux                0f57d86787d8b1076ea8f9cba2a46d534a27

jobs:
 build-amd64-xsm                                  pass
 build-armhf-xsm                                  pass
 build-i386-xsm                                   pass
 build-amd64                                      pass
 build-armhf                                      pass
 build-i386                                       pass
 build-amd64-libvirt                              pass
 build-armhf-libvirt                              pass
 build-i386-libvirt                               pass
 build-amd64-pvops                                pass
 build-armhf-pvops                                pass
 build-i386-pvops                                 pass
 build-amd64-rumpuserxen                          pass
 build-i386-rumpuserxen                           pass
 test-amd64-amd64-xl                              pass
 test-armhf-armhf-xl                              pass
 test-amd64-i386-xl                               pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm    pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm     pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm    pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm     pass
 test-amd64-amd64-libvirt-xsm                     pass
 test-armhf-armhf-libvirt-xsm                     fail
 test-amd64-i386-libvirt-xsm                      fail
 test-amd64-amd64-xl-xsm                          pass
 test-armhf-armhf-xl-xsm                          pass
 test-amd64-i386-xl-xsm                           pass
 test-amd64-amd64-xl-pvh-amd                      fail
 test-amd64-i386-qemut-rhel6hvm-amd               pass
 test-amd64-i386-qemuu-rhel6hvm-amd               pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64        pass
 test-amd64-i386-xl-qemut-debianhvm-amd64         pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64        pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64         pass
 test-amd64-i386-freebsd10-amd64                  fail
 test-amd64-amd64-xl-qemuu-ovmf-amd64             pass
 test-amd64-i386-xl-qemuu-ovmf-amd64              pass
 test-amd64-amd64-rumpuserxen-amd64
Re: [Xen-devel] [PATCH 1/2 linux-next] Revert "ufs: fix deadlocks introduced by sb mutex merge"
On Wed, Jun 17, 2015 at 10:57:15AM +0200, Jan Kara wrote:
> Joy...
> > Folks, is anybody actively maintaining fs/ufs these days?
> Looking into the changelog, there wasn't anyone seriously looking into UFS
> for at least 5-6 years... Fabian did some cleanups recently but they were
> mostly cosmetic. So I don't think it's really maintained. OTOH clearly
> there are some users, since occasionally someone comes with a bug report.

Welcome to the world where we have ~50 on-disk / network filesystems :-|

FWIW, I have a partial rewrite of that area (completely untested, etc.) in
vfs.git#ufs; it's still in progress, but hopefully it'll be done by
tomorrow. Basically, it switches the sucker to a locking scheme similar to
ext2's (modulo a seqlock instead of an rwlock for protection of block
pointers, having to be more careful since 64bit assignments are not atomic,
and being unable to fall back to ->truncate_mutex in case of a race on the
read side) and starts cleaning the hell out of the truncate side (and
eventually the create side of get_block as well).

As far as I can see, the root of the problems with that sucker is the
misplaced handling of tail-packing logics. That had ballooned into a lot of
convoluted codepaths ;-/ I'm not blaming Evgeniy - it *is* painful to
massage into saner shape, and the damn thing kept missing cleanups that
were done to similar filesystems because of that...

One thing I'm somewhat unhappy about in what Evgeniy had done is dealing
with UFS vs. UFS2 differences at such a low level. Sure, the way we did
endianness handling in there provoked exactly that, but it might have been
better to go the other way round; see e.g. fs/minix/itree*.c for the
opposite approach.

All this stuff is currently completely untested; I'll be using the generic
parts of xfstests for beating it up, but any assistance would be welcome.
Note: all the testing I'll realistically be able to do will be for FreeBSD
UFS variants - I don't have Solaris left on any boxen, to start with...
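The seqlock scheme mentioned above can be sketched in userspace: a 64-bit block pointer is stored in two 32-bit halves (mimicking a 32-bit host where the assignment is not atomic), writers bracket the update with an odd/even sequence counter, and readers retry whenever they observe an odd counter or a counter change. This is a single-writer illustration with memory barriers omitted (the kernel's seqlock_t adds smp_wmb()/smp_rmb()); all names are made up:

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value published under a hand-rolled sequence counter. */
struct seq_u64 {
    volatile unsigned seq;  /* odd while a write is in progress */
    uint32_t lo, hi;        /* the value, stored in two 32-bit halves */
};

static void seq_write(struct seq_u64 *s, uint64_t v)
{
    s->seq++;                      /* becomes odd: readers will retry */
    s->lo = (uint32_t)v;
    s->hi = (uint32_t)(v >> 32);   /* torn state is hidden behind seq */
    s->seq++;                      /* even again: value is consistent */
}

static uint64_t seq_read(const struct seq_u64 *s)
{
    unsigned start;
    uint64_t v;

    do {
        /* Wait out any in-progress write. */
        while ((start = s->seq) & 1)
            ;
        v = ((uint64_t)s->hi << 32) | s->lo;
    } while (s->seq != start);     /* counter moved: possible torn read, retry */
    return v;
}
```

The appeal over an rwlock is that readers never block the writer and pay nothing when no write is in flight; the cost is that readers must be able to retry, which is why it cannot simply fall back to a mutex on the read side.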
[Xen-devel] [ovmf test] 58681: tolerable FAIL - PUSHED
flight 58681 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58681/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop  fail like 58640
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop   fail like 58640

version targeted for testing:
 ovmf                 cc63add853d4b5c16b74b4d4ceb59c015a2adcc4
baseline version:
 ovmf                 ffbb1b25402c38d25cbbfb059139e76900bdc843

People who touched revisions under test:
  Dandan Bi dandan...@intel.com
  Qiu Shumin shumin@intel.com
  Star Zeng star.z...@intel.com

jobs:
 build-amd64-xsm                                  pass
 build-i386-xsm                                   pass
 build-amd64                                      pass
 build-i386                                       pass
 build-amd64-libvirt                              pass
 build-i386-libvirt                               pass
 build-amd64-pvops                                pass
 build-i386-pvops                                 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm    pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm     pass
 test-amd64-i386-qemuu-rhel6hvm-amd               pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64        pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64         pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64             pass
 test-amd64-i386-xl-qemuu-ovmf-amd64              pass
 test-amd64-amd64-xl-qemuu-win7-amd64             fail
 test-amd64-i386-xl-qemuu-win7-amd64              fail
 test-amd64-i386-qemuu-rhel6hvm-intel             pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1         pass
 test-amd64-amd64-xl-qemuu-winxpsp3               pass
 test-amd64-i386-xl-qemuu-winxpsp3                pass

sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
 http://logs.test-lab.xenproject.org/osstest/logs

Test harness code can be found at
 http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=cc63add853d4b5c16b74b4d4ceb59c015a2adcc4
+ . cri-lock-repos
++ . cri-common
+++ . cri-getconfig
+++ umask 002
+++ getconfig Repos
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{Repos} or die $!;
'
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf cc63add853d4b5c16b74b4d4ceb59c015a2adcc4
+ branch=ovmf
+ revision=cc63add853d4b5c16b74b4d4ceb59c015a2adcc4
+ . cri-lock-repos
++ . cri-common
+++ . cri-getconfig
+++ umask 002
+++ getconfig Repos
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{Repos} or die $!;
'
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . cri-common
++ . cri-getconfig
++ umask 002
+ select_xenbranch
+ case $branch in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ : tested/2.6.39.x
+ . ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{OsstestUpstream} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/staging/qemu-xen-unstable.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src '[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
+++ getconfig GitCacheProxy
+++ perl -e '
use Osstest;
Re: [Xen-devel] [PATCH RFC v1] libxl: Introduce a template for devices with a controller
On 6/17/2015 at 07:06 PM, in message 1434539195.13744.321.ca...@citrix.com, Ian Campbell ian.campb...@citrix.com wrote:

> On Thu, 2015-05-21 at 18:07 +0100, George Dunlap wrote:
> > We have several outstanding patch series which add devices that have
> > two levels: a controller and individual devices attached to that
> > controller. In the interest of consistency, this patch introduces a
> > section that sketches out a template for interfaces for such devices.
>
> Chun Yan and Juergen: I was hoping we could come to some sort of
> agreement on this, so that it can be used as the basis for both the
> pvusb and pvscsi interfaces. As such, your feedback here is important...
>
> > Signed-off-by: George Dunlap george.dun...@eu.citrix.com
> > ---
> > CC: Ian Campbell ian.campb...@citrix.com
> > CC: Ian Jackson ian.jack...@citrix.com
> > CC: Wei Liu wei.l...@citrix.com
> > CC: Juergen Gross jgr...@suse.com
> > CC: Chun Yan Liu cy...@suse.com
> > CC: Olaf Hering o...@aepfle.de
> >
> > So, this is definitely RFC -- I tried to spec things out in a way that
> > made sense, but I often just chose something that I thought would be a
> > sensible starting point for discussion. This spec looks a lot more
> > like the PVUSB spec than the PVSCSI spec, in part because I think the
> > PVUSB spec has already had a lot more thought go into it.
> >
> > A couple of random points to discuss:
> >
> > * Calling things "controllers", using <type>ctrl for the device name,
> >   and using "ctrl" as the field name for the devid of the controller
> >   in the individual devices.
> >
> > * I've said that having an index (port, lun, whatever) is optional.
> >   Do we want to make that required? Do we want it to have a consistent
> >   name? In the case of emulated USB, we can't really specify to qemu
> >   what port the device gets attached to, so I'm tempted to say it's
> >   not required; but even there we could always give it a port number
> >   just for name's sake.
> >
> > * Naming sub-devices. We need to have a way to uniquely name both
> >   controllers and subdevices.
> >   Here I've said that we will have both <type>ctrl and <type> devid
> >   namespaces, mainly because in the previous point I opted not to
> >   require an index. Another option would be not to have another devid
> >   namespace, but to use <ctrl, index> as the unique identifier. (This
> >   would mean requiring the index/port/lun specification above.)
> >
> > * libxl_device_<type>_list listing devices across all controllers. I
> >   think this is the most practical thing to do, but one could imagine
> >   wanting to query by controller ID instead.
> >
> > Feedback welcome.
> > ---
> >  tools/libxl/libxl.h | 46 ++
> >  1 file changed, 46 insertions(+)
> >
> > diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> > index 2ed7194..d757845 100644
> > --- a/tools/libxl/libxl.h
> > +++ b/tools/libxl/libxl.h
> > @@ -1234,6 +1234,52 @@ void libxl_vtpminfo_list_free(libxl_vtpminfo *, int nr_vtpms);
> >   *
> >   * This function does not interact with the guest and therefore
> >   * cannot block on the guest.
> > + *
> > + * Controllers
> > + * -----------
> > + *
> > + * Most devices are treated individually. Some devices however, like
> > + * USB or SCSI, inherently have the need to have busses or
> > + * controllers to which individual devices can be attached.
> > + *
> > + * In that case, for each type, there will be two sets of the
> > + * functions, types, and devid namespaces outlined above: one based
> > + * on '<type>', and one based on '<type>ctrl'.
> > + *
> > + * In those cases, libxl_device_<type>ctrl_<function> will act more
> > + * or less like top-level non-bus devices: they will either create or
> > + * accept a libxl_devid which will be unique within the <type>ctrl
> > + * libxl_devid namespace.
> > + *
> > + * Second-level devices which will be attached to a controller will
> > + * include in their libxl_device_<type> a field called "ctrl", which
> > + * will be the libxl_devid of the corresponding controller. It may
> > + * also include an index onto that bus, which represents (for
> > + * example) a USB port or a SCSI LUN.
> > + *
> > + * These second-level devices will also have their own devid which
> > + * will be unique within the <type> devid namespace, and will be used
> > + * for queries or removals.

All other description is agreed except here: For pvusb, we currently use
<ctrl, port> instead of a devid. It seems more straightforward. To add a
USB device, we only write the USB busid to <controller, port> in xenstore;
to remove a USB device, we just remove the USB busid from <controller,
port> in xenstore.

- Chunyan

> > + *
> > + * In the case where there are multiple different ways to implement a
> > + * given device -- for instance, one which is fully PV and one which
> > + * uses an emulator -- the controller will contain a field which
> > + * specifies what type of implementation is used. The implementations
> > + * of individual devices will be known by
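Chunyan's alternative - addressing a sub-device by the pair (ctrl, port) rather than giving it a second devid - can be sketched as a simple lookup keyed on that pair. Everything below is illustrative (hypothetical names and a fixed-size table), not the libxl API:

```c
#include <assert.h>
#include <stddef.h>

/* Two-level naming: a controller devid plus a per-controller port index.
 * The pair (ctrl, port) uniquely identifies a device; no separate
 * device devid namespace is needed. Sizes are arbitrary for the sketch. */
#define MAX_CTRL 4
#define MAX_PORT 8

struct usbdev {
    int ctrl;    /* devid of the controller this device is attached to */
    int port;    /* index on that controller's bus */
    int in_use;
};

static struct usbdev devs[MAX_CTRL * MAX_PORT];

/* Attach a device at (ctrl, port); fails if the port is occupied,
 * which is what makes the pair a valid unique identifier. */
static struct usbdev *dev_attach(int ctrl, int port)
{
    struct usbdev *d = &devs[ctrl * MAX_PORT + port];
    if (d->in_use)
        return NULL;
    d->ctrl = ctrl;
    d->port = port;
    d->in_use = 1;
    return d;
}

/* Queries and removals address the device by the same pair. */
static struct usbdev *dev_lookup(int ctrl, int port)
{
    struct usbdev *d = &devs[ctrl * MAX_PORT + port];
    return d->in_use ? d : NULL;
}
```

The trade-off matches the thread: (ctrl, port) keeps xenstore writes trivial, but it forces every device to have a stable index, which the RFC deliberately left optional for emulated USB.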
Re: [Xen-devel] [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream
On 06/15/2015 09:44 PM, Andrew Cooper wrote: when signalled to do so by libxl__remus_domain_checkpoint_callback() Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxl/libxl_dom.c | 16 +++--- tools/libxl/libxl_internal.h |7 +++ tools/libxl/libxl_stream_write.c | 111 -- 3 files changed, 121 insertions(+), 13 deletions(-) diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index 06bfaab..3597a91 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -1867,8 +1867,8 @@ static void remus_devices_preresume_cb(libxl__egc *egc, /*- remus asynchronous checkpoint callback -*/ -static void remus_checkpoint_dm_saved(libxl__egc *egc, - libxl__domain_suspend_state *dss, int rc); +static void remus_checkpoint_stream_written( +libxl__egc *egc, libxl__domain_suspend_state *dss, int rc); static void remus_devices_commit_cb(libxl__egc *egc, libxl__remus_devices_state *rds, int rc); @@ -1882,16 +1882,11 @@ static void libxl__remus_domain_checkpoint_callback(void *data) libxl__egc *egc = dss-shs.egc; STATE_AO_GC(dss-ao); -/* This would go into tailbuf. 
*/ -if (dss-hvm) { -libxl__domain_save_device_model(egc, dss, remus_checkpoint_dm_saved); -} else { -remus_checkpoint_dm_saved(egc, dss, 0); -} +libxl__stream_write_start_checkpoint(egc, dss-sws); } -static void remus_checkpoint_dm_saved(libxl__egc *egc, - libxl__domain_suspend_state *dss, int rc) +static void remus_checkpoint_stream_written( +libxl__egc *egc, libxl__domain_suspend_state *dss, int rc) { /* Convenience aliases */ libxl__remus_devices_state *const rds = dss-rds; @@ -2036,6 +2031,7 @@ void libxl__domain_suspend(libxl__egc *egc, libxl__domain_suspend_state *dss) callbacks-suspend = libxl__remus_domain_suspend_callback; callbacks-postcopy = libxl__remus_domain_resume_callback; callbacks-checkpoint = libxl__remus_domain_checkpoint_callback; +dss-sws.checkpoint_callback = remus_checkpoint_stream_written; } else callbacks-suspend = libxl__domain_suspend_callback; diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 82cd792..bf1c377 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -2879,17 +2879,24 @@ struct libxl__stream_write_state { void (*completion_callback)(libxl__egc *egc, libxl__domain_suspend_state *dss, int rc); +void (*checkpoint_callback)(libxl__egc *egc, +libxl__domain_suspend_state *dss, +int rc); /* Private */ int rc; int joined_rc; size_t padding; bool running; +bool in_checkpoint; libxl__datacopier_state dc; }; _hidden void libxl__stream_write_start(libxl__egc *egc, libxl__stream_write_state *stream); +_hidden void libxl__stream_write_start_checkpoint( +libxl__egc *egc, libxl__stream_write_state *stream); + _hidden void libxl__stream_write_abort(libxl__egc *egc, libxl__stream_write_state *stream, int rc); diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c index d28a8a5..40f2cb7 100644 --- a/tools/libxl/libxl_stream_write.c +++ b/tools/libxl/libxl_stream_write.c @@ -23,6 +23,9 @@ * - libxl__stream_write_start() * - Start writing a stream from the start. 
* + * - libxl__stream_write_start_checkpoint() + * - Write the records which form a checkpoint into a stream. + * * In normal operation, there are two tasks running at once; this stream * processing, and the libxl-save-helper. check_stream_finished() is used * to join all the tasks in both success and error cases. @@ -39,6 +42,12 @@ * - Toolstack record * - if (hvm), Qemu record * - End record + * + * For a checkpointed stream, there is a second loop which is triggered by a + * save-helper checkpoint callback. It writes: + * - Toolstack record + * - if (hvm), Qemu record + * - Checkpoint end record */ static const uint8_t zero_padding[1U << REC_ALIGN_ORDER] = { 0 }; @@ -81,6 +90,16 @@ static void end_record_done(libxl__egc *egc, libxl__datacopier_state *dc, int onwrite, int errnoval); +/* Event callbacks unique to checkpointed streams. */ +static void checkpoint_done(libxl__egc *egc, +
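The record sequence described in the quoted comment can be modelled as data. A toy sketch (not libxl code) of the order in which one checkpoint iteration emits records, with the Qemu record conditional on the guest being HVM:

```c
#include <assert.h>
#include <stddef.h>

enum rec { REC_TOOLSTACK, REC_QEMU, REC_CHECKPOINT_END };

/* Fill 'out' with the records one checkpoint iteration writes, in
 * order, and return how many there are.  Mirrors the quoted comment:
 * Toolstack record, Qemu record iff hvm, then Checkpoint end. */
static size_t checkpoint_records(int hvm, enum rec out[3])
{
    size_t n = 0;
    out[n++] = REC_TOOLSTACK;
    if (hvm)
        out[n++] = REC_QEMU;
    out[n++] = REC_CHECKPOINT_END;
    return n;
}
```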
Re: [Xen-devel] [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state
On 06/15/2015 09:44 PM, Andrew Cooper wrote: Shortly more parameters will appear, and this saves unboxing each one. No functional change. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com Reviewed-by: Yang Hongyang yan...@cn.fujitsu.com --- tools/libxl/libxl_create.c | 12 ++-- tools/libxl/libxl_internal.h |2 +- tools/libxl/libxl_save_callout.c |2 +- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 86384d2..385891c 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc, int rc, uint32_t domid); static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config, -uint32_t *domid, -int restore_fd, int checkpointed_stream, +uint32_t *domid, int restore_fd, +const libxl_domain_restore_params *params, const libxl_asyncop_how *ao_how, const libxl_asyncprogress_how *aop_console_how) { @@ -1591,8 +1591,8 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config, libxl_domain_config_init(cdcs-dcs.guest_config_saved); libxl_domain_config_copy(ctx, cdcs-dcs.guest_config_saved, d_config); cdcs-dcs.restore_fd = restore_fd; +if (params) cdcs-dcs.restore_params = *params; cdcs-dcs.callback = domain_create_cb; -cdcs-dcs.checkpointed_stream = checkpointed_stream; libxl__ao_progress_gethow(cdcs-dcs.aop_console_how, aop_console_how); cdcs-domid_out = domid; @@ -1619,7 +1619,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config, const libxl_asyncop_how *ao_how, const libxl_asyncprogress_how *aop_console_how) { -return do_domain_create(ctx, d_config, domid, -1, 0, +return do_domain_create(ctx, d_config, domid, -1, NULL, ao_how, aop_console_how); } @@ -1629,8 +1629,8 @@ int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config, const 
libxl_asyncop_how *ao_how, const libxl_asyncprogress_how *aop_console_how) { -return do_domain_create(ctx, d_config, domid, restore_fd, -params-checkpointed_stream, ao_how, aop_console_how); +return do_domain_create(ctx, d_config, domid, restore_fd, params, +ao_how, aop_console_how); } /* diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 6226c18..796bd21 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -3122,11 +3122,11 @@ struct libxl__domain_create_state { libxl_domain_config *guest_config; libxl_domain_config guest_config_saved; /* vanilla config */ int restore_fd; +libxl_domain_restore_params restore_params; libxl__domain_create_cb *callback; libxl_asyncprogress_how aop_console_how; /* private to domain_create */ int guest_domid; -int checkpointed_stream; libxl__domain_build_state build_state; libxl__bootloader_state bl; libxl__stub_dm_spawn_state dmss; diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c index 40b25e4..3585a84 100644 --- a/tools/libxl/libxl_save_callout.c +++ b/tools/libxl/libxl_save_callout.c @@ -59,7 +59,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs, state-store_domid, state-console_port, state-console_domid, hvm, pae, superpages, -cbflags, dcs-checkpointed_stream, +cbflags, dcs-restore_params.checkpointed_stream, }; dcs-shs.ao = ao; -- Thanks, Yang. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
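The idiom the patch introduces -- stash the whole parameter struct by value, treating a NULL pointer as "use defaults" -- generalises well and avoids threading each field through every call. A small self-contained sketch (field and type names are invented, not the libxl ones):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct restore_params { int checkpointed_stream; int stream_version; };

struct create_state { struct restore_params restore_params; };

/* Stash the caller's parameters by value; a NULL 'params' leaves the
 * zeroed defaults in place, matching the patch's
 * "if (params) cdcs->dcs.restore_params = *params;" idiom. */
static void stash_params(struct create_state *cs,
                         const struct restore_params *params)
{
    memset(&cs->restore_params, 0, sizeof(cs->restore_params));
    if (params)
        cs->restore_params = *params;
}
```

When new parameters appear later, only the struct definition changes; the call sites keep passing the one pointer.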
[Xen-devel] [xen-4.2-testing test] 58678: FAIL
flight 58678 xen-4.2-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/58678/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64-libvirt 3 host-install(3) broken in 58584 REGR. vs. 58411 Tests which are failing intermittently (not blocking): test-i386-i386-libvirt3 host-install(3) broken in 58584 pass in 58678 test-amd64-i386-pair 4 host-install/dst_host(4) broken in 58584 pass in 58678 test-amd64-i386-qemuu-freebsd10-amd64 3 host-install(3) broken in 58584 pass in 58678 test-amd64-amd64-pair 3 host-install/src_host(3) broken in 58584 pass in 58678 test-amd64-i386-xl-win7-amd64 3 host-install(3) broken in 58584 pass in 58678 test-amd64-i386-xl-winxpsp3-vcpus1 3 host-install(3) broken in 58584 pass in 58678 test-amd64-amd64-xl-qemut-winxpsp3 3 host-install(3) broken in 58584 pass in 58678 test-i386-i386-xl-qemut-winxpsp3 3 host-install(3) broken in 58616 pass in 58678 test-amd64-amd64-xl-qemuu-ovmf-amd64 6 xen-boot fail in 58584 pass in 58678 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-localmigrate.2 fail in 58584 pass in 58678 test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail in 58631 pass in 58616 test-i386-i386-libvirt9 debian-install fail in 58631 pass in 58678 test-amd64-amd64-xl-win7-amd64 16 guest-stopfail pass in 58584 test-amd64-i386-xl-qemuu-win7-amd64 12 guest-localmigrate fail pass in 58631 Regressions which are regarded as allowable (not blocking): test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 58411 test-amd64-i386-xl-win7-amd64 16 guest-stop fail like 58411 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58411 Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 1 build-check(1)blocked in 58584 n/a test-amd64-amd64-rumpuserxen-amd64 1 build-check(1) blocked n/a test-i386-i386-rumpuserxen-i386 1 build-check(1) blocked n/a test-amd64-i386-rumpuserxen-i386 1 build-check(1) blocked n/a 
test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail in 58616 never pass test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail never pass build-amd64-rumpuserxen 5 rumpuserxen-buildfail never pass build-i386-rumpuserxen5 rumpuserxen-buildfail never pass test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-i386-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass test-amd64-i386-xend-winxpsp3 20 leak-check/check fail never pass test-amd64-i386-xend-qemut-winxpsp3 20 leak-check/checkfail never pass version targeted for testing: xen 97134c441d6d81ba0d7cdcfdc4d8315115b99dce baseline version: xen 21a8344ca38a2797a13b4bf57031b6f49ae12ccb People who touched revisions under test: Ian Campbell ian.campb...@citrix.com Ian Jackson ian.jack...@eu.citrix.com Jan Beulich jbeul...@suse.com Stefano Stabellini stefano.stabell...@eu.citrix.com jobs: build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass build-amd64-rumpuserxen fail build-i386-rumpuserxen fail test-amd64-amd64-xl pass test-amd64-i386-xl pass test-i386-i386-xlpass test-amd64-i386-rhel6hvm-amd pass test-amd64-i386-qemut-rhel6hvm-amd pass test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemut-debianhvm-amd64pass test-amd64-i386-xl-qemut-debianhvm-amd64 pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass test-amd64-i386-qemuu-freebsd10-amd64pass test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
Re: [Xen-devel] [PATCH 01/44] kernel: Add support for poweroff handler call chain
On 10/06/2014 10:28 PM, Guenter Roeck wrote: Various drivers implement architecture and/or device specific means to remove power from the system. For the most part, those drivers set the global variable pm_power_off to point to a function within the driver. This mechanism has a number of drawbacks. Typically only one scheme to remove power is supported (at least if pm_power_off is used). At least in theory there can be multiple means to remove power, some of which may be less desirable. For example, some mechanisms may only power off the CPU or the CPU card, while another may power off the entire system. Others may really just execute a restart sequence or drop into the ROM monitor. Using pm_power_off can also be racy if the function pointer is set from a driver built as a module, as the driver may be in the process of being unloaded when pm_power_off is called. If there are multiple poweroff handlers in the system, removing a module with such a handler may inadvertently reset the pointer to pm_power_off to NULL, leaving the system with no means to remove power. Introduce a system poweroff handler call chain to solve the described problems. This call chain is expected to be executed from the architecture specific machine_power_off() function. Drivers providing system poweroff functionality are expected to register with this call chain. By using the priority field in the notifier block, callers can control poweroff handler execution sequence and thus ensure that the poweroff handler with the optimal capabilities to remove power for a given system is called first. What happened to this series? I want to add shutdown support to my platform and I need to write a register on the PMIC in one driver to configure it for shutdown instead of restart and then write an MMIO register to tell the PMIC to actually do the shutdown in another driver. It seems that the notifier solves this case for me, albeit with the slight complication that I need to order the two with some priority.
I'm also considering putting the PMIC configuration part into the reboot notifier chain, because it only does things to change the configuration and not actually any shutdown/restart itself. That removes any requirement to get the priority of notifiers right. This series will still be useful for the MMIO register that needs to be toggled though. Right now I have to assign pm_power_off or hook the reboot notifier with a different priority to make this work. -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
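The priority-ordered call chain being discussed is easy to model in user space. The sketch below is an illustrative model of the idea (a list kept sorted by descending priority, as the kernel's notifier chains are), not the kernel implementation; the two handlers stand in for the PMIC-configure and MMIO-poweroff steps from the example above:

```c
#include <assert.h>
#include <stddef.h>

struct notifier {
    int priority;                 /* higher priority runs first */
    int (*call)(void *data);
    struct notifier *next;
};

/* Insert keeping the chain sorted by descending priority. */
static void notifier_register(struct notifier **head, struct notifier *n)
{
    while (*head && (*head)->priority >= n->priority)
        head = &(*head)->next;
    n->next = *head;
    *head = n;
}

/* Walk the chain in priority order; each handler gets its turn. */
static void notifier_call_chain(struct notifier *head, void *data)
{
    for (; head; head = head->next)
        head->call(data);
}

/* Demo handlers recording their invocation order. */
static int call_order[2], call_count;
static int pmic_prepare(void *d)  { (void)d; call_order[call_count++] = 1; return 0; }
static int mmio_poweroff(void *d) { (void)d; call_order[call_count++] = 2; return 0; }
```

Registering the MMIO handler with the higher priority makes it run first, which is exactly the ordering guarantee that assigning pm_power_off directly cannot give.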
Re: [Xen-devel] [PATCH 3.4 000/172] 3.4.108-rc1 review
On 2015/6/16 23:13, Ian Campbell wrote: On Tue, 2015-06-16 at 16:33 +0800, l...@kernel.org wrote: From: Zefan Li lize...@huawei.com This is the start of the stable review cycle for the 3.4.108 release. There are 172 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Thu Jun 18 08:30:58 UTC 2015. Anything received after that time might be too late. Would it be possible to also include 31a418986a58 xen: netback: read hotplug script once at start of day. which has started trickling into other stable branches already, please. If not now then for 109. Queued up for 3.4.108. :)
Re: [Xen-devel] [PATCH RFC v1] libxl: Introduce a template for devices with a controller
On Wed, Jun 17, Ian Campbell wrote: So, Olaf, ping... I will return to pvscsi work next week. Olaf
Re: [Xen-devel] [PATCH v3] tools/libxc: Batch memory allocations for PV guests
On Wed, 2015-06-17 at 12:22 +0100, Ross Lagerwall wrote: On 06/17/2015 12:14 PM, Ian Campbell wrote: On Mon, 2015-06-15 at 17:08 +0100, Wei Liu wrote: On Mon, Jun 15, 2015 at 11:12:07AM +0100, Ross Lagerwall wrote: The current code for allocating memory for PV guests batches the hypercalls to allocate memory by allocating 1024*1024 extents of order 0 at a time. To make this faster, first try allocating extents of order 9 (2 MiB) before falling back to the order 0 allocation if the order 9 allocation fails. On my test machine this reduced the time to start a 128 GiB PV guest by about 60 seconds. From 61s or from 600s? It reduced the total time _by_ 60 seconds. I know, my question was whether 60s was a significant or insignificant fraction of the time taken to start a guest. Seems like it was pretty significant. I think it takes about 10 seconds to start the guest now, but it is not related to the size of guest memory. Regards
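The fallback strategy in the patch is easy to sketch. Below is a toy model (not libxc code): a fake populate call stands in for the populate-physmap hypercall, and the allocator prefers order-9 (2 MiB) extents until one fails, then finishes the remainder with order-0 pages:

```c
#include <assert.h>
#include <stddef.h>

#define SUPERPAGE_ORDER 9
#define SUPERPAGE_PAGES (1UL << SUPERPAGE_ORDER)  /* 512 x 4 KiB = 2 MiB */

/* Toy stand-in for the populate hypercall: pretend order-9 requests
 * fail once 'superpages_left' is exhausted; order-0 always succeeds. */
static unsigned long superpages_left;
static int populate(unsigned order)
{
    if (order == SUPERPAGE_ORDER) {
        if (!superpages_left)
            return -1;
        superpages_left--;
    }
    return 0;
}

/* Allocate 'pages' pages, preferring order-9 extents and falling back
 * to order 0 when the big allocation fails, as the patch does.
 * Returns pages allocated; reports order-0 calls via 'order0_calls'. */
static unsigned long alloc_guest_memory(unsigned long pages,
                                        unsigned long *order0_calls)
{
    unsigned long done = 0;
    *order0_calls = 0;
    while (done + SUPERPAGE_PAGES <= pages && populate(SUPERPAGE_ORDER) == 0)
        done += SUPERPAGE_PAGES;
    while (done < pages && populate(0) == 0) {
        done++;
        (*order0_calls)++;
    }
    return done;
}
```

Each successful order-9 extent replaces 512 order-0 requests, which is where the reported 60-second saving comes from.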
[Xen-devel] [PATCHv3 0/6] evtchn: Improve scalability
The per-domain event channel lock limits scalability when many VCPUs are trying to send interdomain events. A per-channel lock is introduced, eliminating any lock contention when sending an event. See this graph for the performance improvements: http://xenbits.xen.org/people/dvrabel/evtchn-scalability.png A different test (using Linux's evtchn device, which masks/unmasks event channels) showed the following lock profile improvements: Per-domain lock: (XEN) lock:69267976(0004:19830041), block: 2407(0002:3C7C5C96) Per-event channel lock: (XEN) lock: 686530(:076AF5F6), block: 1787(:000B4D22) Locking removed from evtchn_unmask(): (XEN) lock: 10769(:00512999), block: 99(:9491)
v3:
- Clear xen_consumer when clearing state.
- Defer freeing struct evtchn's until evtchn_destroy_final().
- Remove redundant d->evtchn test from port_is_valid().
- Use port_is_valid() again.
- Drop event lock from notify_via_xen_event_channel().
v2:
- Use unsigned int for d->valid_evtchns.
- Compare channel pointers in double_evtchn_lock().
David
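The v2 note "Compare channel pointers in double_evtchn_lock()" is the classic deadlock-avoidance rule: when two locks must be held together, take them in one global order, e.g. by address, so two CPUs sending events in opposite directions cannot each hold one lock while waiting for the other. A minimal illustrative sketch (not the Xen implementation), with the lock replaced by a flag so the acquisition order can be observed:

```c
#include <assert.h>
#include <stdint.h>

struct channel { int locked; };

/* Record of acquisition order, for the sketch only. */
static struct channel *acquired[2];
static int nacquired;

static void chn_lock(struct channel *c)
{
    c->locked = 1;
    acquired[nacquired++] = c;
}

/* Take both channel locks in a single global order (lowest address
 * first) regardless of which end is local and which is remote. */
static void double_channel_lock(struct channel *a, struct channel *b)
{
    if ((uintptr_t)a < (uintptr_t)b) {
        chn_lock(a);
        chn_lock(b);
    } else {
        chn_lock(b);
        chn_lock(a);
    }
}
```

Because both senders compute the same order, the lock graph stays acyclic and no interdomain send can deadlock against the reverse-direction send.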
Re: [Xen-devel] [PATCH v2 for Xen 4.6 3/4] libxl: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler
On Mon, 2015-06-15 at 12:12 +0200, Dario Faggioli wrote: A parameter is either per-domain or per-vcpu, no matter how the user tries to set or get it. In RTDS, all parameters are per-domain now and, with your work, all of them are becoming per-vcpu, and that's ok. But then, per-dom parameters should just no longer exist. Are you saying there is going to be no domain wide default for a given per-vcpu parameter? If that is the case then what happens if I hotplug a new VCPU without setting its per-vcpu properties? I expected something like:
domain_params_set(weight=10)
add vcpu0 (weight will be 10)
add vcpu1 (weight will be 10)
vcpu_params_set(0, weight=20)
add vcpu2 (weight will be 10, from domain wide default)
If not that then what weight should vcpu2 have at this point? Ian.
Re: [Xen-devel] [PATCH v2 for Xen 4.6 3/4] libxl: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler
On Wed, 2015-06-17 at 13:14 +0100, Ian Campbell wrote: On Mon, 2015-06-15 at 12:12 +0200, Dario Faggioli wrote: A parameter is either per-domain or per-vcpu, no matter how the user tries to set or get it. In RTDS, all parameters are per-domain now and, with your work, all of them are becoming per-vcpu, and that's ok. But then, per-dom parameters should just no longer exist. Are you saying there is going to be no domain wide default for a given per-vcpu parameter? No, no, there sure is a default, and I certainly would make it domain wide. If that is the case then what happens if I hotplug a new VCPU without setting its per-vcpu properties? Exactly. I expected something like:
domain_params_set(weight=10)
add vcpu0 (weight will be 10)
add vcpu1 (weight will be 10)
vcpu_params_set(0, weight=20)
add vcpu2 (weight will be 10, from domain wide default)
If not that then what weight should vcpu2 have at this point? Absolutely. What I'm aiming at here is for the per-domain API to have a well defined and valuable behavior, and the one you suggest in your other email (i.e., let's return the default) is exactly that! :-) Thanks and Regards, Dario -- This happens because I choose it to happen! (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
Re: [Xen-devel] [PATCH v2 for Xen 4.6 3/4] libxl: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler
On Wed, 2015-06-17 at 13:08 +0100, Ian Campbell wrote: On Tue, 2015-06-09 at 18:18 +0200, Dario Faggioli wrote: On Mon, 2015-06-08 at 15:55 -0500, Chong Li wrote: On Mon, Jun 8, 2015 at 10:56 AM, Dario Faggioli So, Thoughts? What do you think the best way forward could be? I like option 2 more. But I think we may also need a 'vcpuid' field in libxl_sched_params. For sparse array support, yes. At which point, I would flip the names as well, i.e., something like this: libxl_vcpu_sched_params = Struct(vcpu_sched_params,[ (vcpuid, integer, { xxx some init val xxx}), (weight, integer, {'init_val': 'LIBXL_PARAM_WEIGHT_DEFAULT'}), (cap, integer, {'init_val': 'LIBXL_PARAM_CAP_DEFAULT'}), (period, integer, {'init_val': 'LIBXL_PARAM_PERIOD_DEFAULT'}), (slice,integer, {'init_val': 'LIBXL_PARAM_SLICE_DEFAULT'}), (latency, integer, {'init_val': 'LIBXL_PARAM_LATENCY_DEFAULT'}), (extratime,integer, {'init_val': 'LIBXL_PARAM_EXTRATIME_DEFAULT'}), (budget, integer, {'init_val': 'LIBXL_PARAM_BUDGET_DEFAULT'}), Is a full new set of defaults really necessary? No, it's not... I was 'sketching' the various alternatives and just went too far. :-) I don't think it would be semantically all that strange to say that an unspecified per-vcpu value will default to the domain default, at which point having LIBXL_DOMAIN_SCHED_PARAM_FOO_DEFAULT mentioned doesn't seem all that strange. Indeed. libxl_sched_params = Struct(sched_params,[ (sched,libxl_scheduler), (vcpus,Array(libxl_sched_params, num_vcpus)), ]) With the possibility of naming the latter 'libxl_vcpus_sched_params', which is more descriptive, but perhaps is too similar to libxl_vcpu_sched_params. I think I'd go the other way and name the thing with all the values in it libxl_sched_params and the thing with the per-vcpu array in it libxl_vcpu_sched_params. Fine with me. While we're here, another thing we would appreciate some feedback on is what should happen to libxl_domain_sched_params_get(). 
This occurred to my mind while reviewing patch 4 of this series. Actually, I think we've discussed this before, but can't find the reference now. Anyway, my view is that, for a scheduler that uses per-vcpu parameters, libxl_domain_sched_params_set() should set the same parameters for all the vcpus. When it comes to _get(), however, I'm not sure. To match the _set() case, we'd need to return the parameters of all the vcpus, but we can't, because the function takes a libxl_domain_sched_params argument, which just holds 1 tuple. Wouldn't domain_get/set be manipulating the domain's default values for things, i.e. what a vcpu gets if nothing more specific is specified? Or is it too late to say that? I actually like this. A lot. For set overall I don't really think it matters too much if it sets everything (i.e. all vcpus and the defaults) so long as it is a documented effect of the API -- anyone who adds per-vcpu support would then be aware of this and can adjust their code accordingly. Exactly. Personally, I think I'd make per-domain set behave consistently, and hence use it to set the default. For get I think saying it returns the defaults used for vcpus which don't have something more explicit set is perfectly fine and doesn't pose an upgrade wrinkle, since only those who are aware of the vcpu settings would care about the distinction. As said, I like this, great idea. Chong, what do you think? Can you make this happen? Regards, Dario -- This happens because I choose it to happen! (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
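The semantics converged on in this thread -- a vcpu without an explicit setting inherits the domain-wide default, per-domain set updates that default, and per-domain get returns it -- can be sketched in a few lines. The names and the "unset" sentinel below are invented for illustration, not the libxl API:

```c
#include <assert.h>

#define PARAM_UNSET (-1)   /* hypothetical sentinel: no per-vcpu value */

struct dom_sched {
    int default_weight;    /* domain-wide default */
    int vcpu_weight[8];    /* PARAM_UNSET means inherit the default */
};

/* What Ian's hotplug example requires: a vcpu never explicitly
 * configured reports the domain-wide default. */
static int vcpu_weight_get(const struct dom_sched *d, unsigned vcpu)
{
    int w = d->vcpu_weight[vcpu];
    return w == PARAM_UNSET ? d->default_weight : w;
}

/* Per-domain set manipulates only the default... */
static void domain_params_set(struct dom_sched *d, int weight)
{
    d->default_weight = weight;
}

/* ...while per-vcpu set overrides it for one vcpu. */
static void vcpu_params_set(struct dom_sched *d, unsigned vcpu, int weight)
{
    d->vcpu_weight[vcpu] = weight;
}
```

Replaying Ian's sequence against this model gives vcpu0 weight 20 (explicit) and vcpu2 weight 10 (inherited), matching the behavior everyone agreed on.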
Re: [Xen-devel] [OSSTEST Nested PATCH v11 6/7] Compose the main recipe of nested test job
Ian Campbell writes (Re: [OSSTEST Nested PATCH v11 6/7] Compose the main recipe of nested test job): I'll leave it to Ian to suggest a solution since it will no doubt involve some tcl plumbing (I'd be inclined to record 'hosts which are actually guests' somewhere and have the infra clean them up automatically after doing leak check and log collection). Thinking aloud: The things we want to happen at the end of the job are:
* leak check on the l1
!* logs capture on the l1 (might involve power cycle)
* destruction of the l1 on l0
* leak check on l0
!* logs capture on l0
The items marked ! should happen even in case of failure. This is going to be quite fiddly to do in a generic way. But on the other hand, doing it in a generic way would avoid writing an explicit error handler in the l1 test case code, and replicating the per-host operations. I was thinking more along the lines of creating Osstest/PDU/guest.pm with the appropriate methods calling out to toolstack($l0)->foo, setting $ho->{Power} = 'guest $l1guestname' somewhere and allowing power_cycle_host_setup to do its thing. Yes, that's what I meant. Ian.
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wed, 2015-06-17 at 13:11 +0100, Julien Grall wrote: Hi Ian, On 17/06/15 11:08, Ian Campbell wrote: On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, which should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup, we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. By default, the OS doesn't set up the BARs, it expects the firmware to do it. _By_default_, yes. But it can decide to rewrite the BARs if it wants; Linux for example has command line options to do so. Maybe we have the option to exclude this possibility. It would certainly simplify things. (Which makes most of the rest of your reply moot). Ian.
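Whichever component ends up emulating config space has to implement the standard PCI BAR probing protocol: the guest writes all-ones to a BAR, reads back a mask with the low size-alignment bits forced to zero, and from that computes the region size. The size computation itself is mechanical; a sketch of the standard rule for a 32-bit memory BAR:

```c
#include <assert.h>
#include <stdint.h>

/* After the guest writes 0xFFFFFFFF to a 32-bit memory BAR, the value
 * read back has the low log2(size) address bits clear (the bottom 4
 * bits are type/prefetch flags, not address bits).  The region size is
 * therefore ~(readback & ~0xF) + 1. */
static uint32_t bar_size_from_mask(uint32_t readback)
{
    uint32_t mask = readback & ~0xFu;   /* drop the flag bits */
    return ~mask + 1;
}
```

An emulator would return such a mask on the read following the all-ones write while remembering the real base, and, on a genuine base write, update the guest p2m mapping accordingly; that p2m update step is exactly what has no obvious home yet on ARM and x86 PVH.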
Re: [Xen-devel] [PATCHv2 3/5] evtchn: use a per-event channel lock for sending events
On 16.06.15 at 18:39, david.vra...@citrix.com wrote: On 16/06/15 17:19, Jan Beulich wrote: On 16.06.15 at 17:58, david.vra...@citrix.com wrote: On 16/06/15 16:19, David Vrabel wrote: @@ -1221,6 +1277,8 @@ void notify_via_xen_event_channel(struct domain *ld, int lport) evtchn_port_set_pending(rd, rchn->notify_vcpu_id, rchn); } +spin_unlock(&lchn->lock); + spin_unlock(&ld->event_lock); } Again I think the event lock can be dropped earlier now. Ditto. Uh, no. This is notify. I've kept the locking like this because of the ld->is_dying check. I think we need the ld->event_lock in case d->is_dying is set and evtchn_destroy(ld) is called. Right, but if evtchn_destroy() was a concern, then this wouldn't apply just here, but also in the sending path you are relaxing. Afaict due to the channel lock being taken in __evtchn_close() you can drop the event lock here as the latest after you acquired the channel one (I haven't been able to convince myself yet that dropping it even before that would be okay). But in the evtchn_send() case, we're in a hypercall so we know ld->is_dying is false and thus cannot be racing with evtchn_destroy(ld). That's only the common code path; there's an evtchn_send() call from __domain_finalise_shutdown() which afaict doesn't make such guarantees (in particular there clearly are d != current->domain cases here). And then - how is being in a hypercall a guard against is_dying not getting set? It would be good to remove the event_lock from notify_via_xen_event_channel() as well since this is heavily used for ioreqs and vm events. Let me have a more careful look. Indeed, thanks. Jan
Re: [Xen-devel] [PATCH 25/27] tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream
On 06/15/2015 09:44 PM, Andrew Cooper wrote: This is the final bit of untangling for Remus. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxl/libxl_create.c | 25 tools/libxl/libxl_internal.h|6 tools/libxl/libxl_stream_read.c | 62 +++ 3 files changed, 93 insertions(+) diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 7dd7130..ac918bd 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -747,6 +747,27 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t domid, libxl_device_model_version_to_string(b_info-device_model_version)); } +/*- remus asynchronous checkpoint callback -*/ + +static void remus_checkpoint_stream_done( +libxl__egc *egc, libxl__domain_create_state *dcs, int rc); + +static void libxl__remus_domain_checkpoint_callback(void *data) +{ +libxl__save_helper_state *shs = data; +libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs); +libxl__egc *egc = dcs-shs.egc; +STATE_AO_GC(dcs-ao); + +libxl__stream_read_start_checkpoint(egc, dcs-srs); +} + +static void remus_checkpoint_stream_done( +libxl__egc *egc, libxl__domain_create_state *dcs, int rc) +{ +libxl__xc_domain_saverestore_async_callback_done(egc, dcs-shs, rc); +} + /*- main domain creation -*/ /* We have a linear control flow; only one event callback is @@ -1008,6 +1029,8 @@ static void domcreate_bootloader_done(libxl__egc *egc, libxl_domain_config *const d_config = dcs-guest_config; const int restore_fd = dcs-restore_fd; libxl__domain_build_state *const state = dcs-build_state; +libxl__srm_restore_autogen_callbacks *const callbacks = +dcs-shs.callbacks.restore.a; if (rc) { domcreate_rebuild_done(egc, dcs, rc); @@ -1035,6 +1058,7 @@ static void domcreate_bootloader_done(libxl__egc *egc, } /* Restore */ +callbacks-checkpoint = libxl__remus_domain_checkpoint_callback; rc = libxl__build_pre(gc, domid, d_config, 
state); if (rc) @@ -1044,6 +1068,7 @@ static void domcreate_bootloader_done(libxl__egc *egc, dcs-srs.fd = restore_fd; dcs-srs.legacy = (dcs-restore_params.stream_version == 1); dcs-srs.completion_callback = domcreate_stream_done; +dcs-srs.checkpoint_callback = remus_checkpoint_stream_done; libxl__stream_read_start(egc, dcs-srs); return; diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index bf1c377..e271a0b 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -3205,11 +3205,15 @@ struct libxl__stream_read_state { void (*completion_callback)(libxl__egc *egc, libxl__domain_create_state *dcs, int rc); +void (*checkpoint_callback)(libxl__egc *egc, +libxl__domain_create_state *dcs, +int rc); /* Private */ libxl__carefd *v2_carefd; int rc; int joined_rc; bool running; +bool in_checkpoint; libxl__datacopier_state dc; size_t expected_len; libxl_sr_hdr hdr; @@ -3222,6 +3226,8 @@ _hidden void libxl__stream_read_start(libxl__egc *egc, _hidden void libxl__stream_read_continue(libxl__egc *egc, libxl__stream_read_state *stream); +_hidden void libxl__stream_read_start_checkpoint( +libxl__egc *egc, libxl__stream_read_state *stream); _hidden void libxl__stream_read_abort(libxl__egc *egc, libxl__stream_read_state *stream, int rc); diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c index a8cd2c3..09ef0aa 100644 --- a/tools/libxl/libxl_stream_read.c +++ b/tools/libxl/libxl_stream_read.c @@ -80,6 +80,10 @@ static void emulator_padding_done(libxl__egc *egc, libxl__datacopier_state *dc, int onwrite, int errnoval); +/* Error handling for checkpoint mini-loop. 
*/ +static void checkpoint_done(libxl__egc *egc, +libxl__stream_read_state *stream, int rc); + void libxl__stream_read_start(libxl__egc *egc, libxl__stream_read_state *stream) { @@ -162,6 +166,35 @@ void libxl__stream_read_continue(libxl__egc *egc, stream_failed(egc, stream, ret); } +void libxl__stream_read_start_checkpoint(libxl__egc *egc, + libxl__stream_read_state *stream) +{ +libxl__datacopier_state *dc = stream-dc; +int ret = 0; + +assert(stream-running); +assert(!stream-in_checkpoint); +
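The read side mirrors the write side: on each checkpoint callback it consumes records until it processes the checkpoint-end record, then reports back to the save helper. A toy model of that mini-loop (record handling elided; not libxl code):

```c
#include <assert.h>
#include <stddef.h>

enum rec { REC_TOOLSTACK, REC_QEMU, REC_CHECKPOINT_END };

/* Consume records from 'recs' until the checkpoint-end record has been
 * processed.  Returns how many records were consumed, or 0 if the
 * stream ran out without a checkpoint-end, which a real stream reader
 * would treat as an error. */
static size_t read_checkpoint(const enum rec *recs, size_t nrecs)
{
    for (size_t i = 0; i < nrecs; i++)
        if (recs[i] == REC_CHECKPOINT_END)
            return i + 1;
    return 0;
}
```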
Re: [Xen-devel] [PATCH v3] tools/libxc: Batch memory allocations for PV guests
On 06/17/2015 12:14 PM, Ian Campbell wrote:
> On Mon, 2015-06-15 at 17:08 +0100, Wei Liu wrote:
> > On Mon, Jun 15, 2015 at 11:12:07AM +0100, Ross Lagerwall wrote:
> > > The current code for allocating memory for PV guests batches the
> > > hypercalls to allocate memory by allocating 1024*1024 extents of
> > > order 0 at a time. To make this faster, first try allocating extents
> > > of order 9 (2 MiB) before falling back to the order-0 allocation if
> > > the order-9 allocation fails.
> > >
> > > On my test machine this reduced the time to start a 128 GiB PV guest
> > > by about 60 seconds.
> >
> > From 61s or from 600s?

It reduced the total time _by_ 60 seconds. I think it takes about 10 seconds to start the guest now, but it is not related to the size of guest memory.

Regards
--
Ross Lagerwall
Re: [Xen-devel] [PATCH RFC v1] libxl: Introduce a template for devices with a controller
On 06/17/2015 01:06 PM, Ian Campbell wrote: On Thu, 2015-05-21 at 18:07 +0100, George Dunlap wrote: We have several outstanding patch series which add devices that have two levels: a controller and individual devices attached to that controller. In the interest of consistency, this patch introduces a section that sketches out a template for interfaces for such devices. Chun Yan and Jeurgen: I was hoping we could come to some sort of agreement on this such that it can be used as the basis for both the pvusb and pvscsi interfaces. As such your feedback here is important... I already gave some feedback. As everything regarding this feedback has been discussed: Acked-by: Juergen Gross jgr...@suse.com Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@citrix.com CC: Wei Liu wei.l...@citrix.com CC: Juergen Gross jgr...@suse.com CC: Chun Yan Liu cy...@suse.com CC: Olaf Hering o...@aepfle.de So, this is definitely RFC -- I tried to spec things out in a way that made sense, but I often just chose something that I thought would be a sensible starting point for discussion. This spec looks a lot more like the PVUSB spec than the PVSCSI spec, in part because I think the PVUSB spec has already had a lot more thought that's gone into it. A couple of random points to discuss: * Calling things controllers, using typectrl for the device name, and using ctrl as the field name for the devid of the controller in the individual devices. * I've said that having an index (port, lun, whatever) is optional. Do we want to make that requred? Do we want it to have a consistent name? In the case of emulated USB, we can't really specify to qemu what port the device gets attached to, so I'm tempted to say it's not required; but even there we could always give it a port number just for name's sake. * Naming sub-devices. We need to have a way to uniquely name both controllers and subdevices. 
Here I've said that we will have both typectrl and type devid namespaces, mainly because in the previous point I opted not to require an index. Another option would be not to have another devid namespace, but to use ctrl,index as the unique identifier. (This would mean requiring the index/port/lun specification above.) * libxl_device_type_list listing devices across all controllers. I think this is the most practical thing to do, but one could imagine wanting to query by controller ID instead. Feedback welcome. --- tools/libxl/libxl.h | 46 ++ 1 file changed, 46 insertions(+) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 2ed7194..d757845 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -1234,6 +1234,52 @@ void libxl_vtpminfo_list_free(libxl_vtpminfo *, int nr_vtpms); * * This function does not interact with the guest and therefore * cannot block on the guest. + * + * Controllers + * --- + * + * Most devices are treated individually. Some devices however, like + * USB or SCSI, inherently have the need to have busses or + * controllers to which individual devices can be attached. + * + * In that case, for each type, there will be two sets of the + * functions, types, and devid namespaces outlined above: one based on + * 'type', and one based on 'typectrl'. + * + * In those cases, libxl_device_typectrl_function will act more or + * less like top-level non-bus devices: they will either create or + * accept a libxl_devid which will be unique within the + * typectrl libxl_devid namespace. + * + * Second-level devices which will be attached to a controller will + * include in their libxl_device_type a field called ctrl, which + * will be the libxl_devid of the corresponding controller. It may also + * include an index onto that bus, that represents (for example) a USB + * port or a SCSI LUN. 
+ * + * These second-level devices will also have their own devid which + * will be unique within the type devid namespace, and will be used + * for queries or removals. + * + * In the case where there are multiple different ways to implement a + * given device -- for instance, one which is fully PV and one which + * uses an emulator -- the controller will contain a field which + * specifies what type of implementation is used. The implementations + * of individual devices will be known by the controller to which they are + * attached. + * + * If libxl_device_type_add receives an uninitialized ctrl devid, it + * may return an error. Or it may (but is not required to) choose to + * automatically choose a suitable controller to which to attach the + * new device. It may also (but is not required to) automatically + * create a new controller if no suitable controllers exist. + * Individual devices should document their behavior. + * + * libxl_device_typectrl_list will list all controllers for the domain. + * + *
[Xen-devel] [qemu-mainline test] 58664: regressions - FAIL
flight 58664 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/58664/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64 5 xen-build fail REGR. vs. 58551 build-i3865 xen-build fail REGR. vs. 58551 build-amd64-xsm 5 xen-build fail REGR. vs. 58551 build-i386-xsm5 xen-build fail REGR. vs. 58551 build-armhf 5 xen-build fail REGR. vs. 58551 build-armhf-xsm 5 xen-build fail REGR. vs. 58551 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pvh-amd 1 build-check(1) blocked n/a build-amd64-libvirt 1 build-check(1) blocked n/a build-i386-libvirt1 build-check(1) blocked n/a test-amd64-amd64-xl-pvh-intel 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-credit2 1 build-check(1) blocked n/a test-amd64-amd64-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-multivcpu 1 build-check(1) blocked n/a test-amd64-amd64-pair 1 build-check(1) blocked n/a test-amd64-i386-xl-xsm1 build-check(1) blocked n/a test-armhf-armhf-xl-arndale 1 build-check(1) blocked n/a test-amd64-amd64-xl 1 build-check(1) blocked n/a test-amd64-amd64-xl-sedf 1 build-check(1) blocked n/a test-amd64-amd64-xl-sedf-pin 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-amd64-i386-xl1 build-check(1) blocked n/a test-armhf-armhf-xl 1 build-check(1) blocked n/a test-armhf-armhf-xl-sedf 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1) blocked n/a test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1)blocked n/a test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1) blocked n/a build-armhf-libvirt 1 build-check(1) blocked n/a 
test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1)blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-i386 1 build-check(1) blocked n/a test-armhf-armhf-xl-cubietruck 1 build-check(1) blocked n/a test-amd64-i386-pair 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-amd64 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-xsm 1 build-check(1) blocked n/a test-armhf-armhf-xl-sedf-pin 1 build-check(1) blocked n/a test-armhf-armhf-xl-xsm 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit2 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-winxpsp3 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-winxpsp3 1 build-check(1) blocked n/a version targeted for testing: qemuu93f6d1c16036aaf34055d16f54ea770fb8d6d280 baseline version: qemuu0a2df857a7038c75379cc575de5d4be4c0ac629e People who touched revisions under test: Alberto Garcia be...@igalia.com Alex Bennée alex.ben...@linaro.org Alexander Graf ag...@suse.de Andrew Jones drjo...@redhat.com Aurelien Jarno aurel...@aurel32.net Aurelio C. Remonda aurelioremo...@gmail.com Benoit Canet benoit.ca...@nodalink.com Benoît Canet benoit.ca...@nodalink.com Christian Borntraeger borntrae...@de.ibm.com Cole Robinson crobi...@redhat.com Dave Airlie airl...@redhat.com Eric Auger eric.au...@linaro.org Fam Zheng f...@redhat.com Gerd Hoffmann kra...@redhat.com Johan Karlsson johan.karls...@enea.com John Snow js...@redhat.com Kevin Wolf
Re: [Xen-devel] [PATCH RFC v1] libxl: Introduce a template for devices with a controller
On 05/27/2015 03:57 PM, George Dunlap wrote: On 05/27/2015 11:09 AM, Juergen Gross wrote: George, I'm on vacation this and the next week with only limited email access. So please don't expect fast reaction on any further questions during this time. :-) Then quit reading your work e-mail and get back to the important stuff! :-) OK, so I looked it up[1] and the full address seems to be: * adapter number / host * channel number / bus * id number / target * LUN In which case, controller would correspond to adapter / host, right? In the vscsi world, what levels of what can you make? I know you mentioned before that some devices have multiple LUNs, and those need to be grouped together, with the same LUNs as they do on real hardware, to work properly -- is that right? Not all of the devices have this requirement, but some. The USB case actually has something somewhat similar: * USB controller * USB bus * USB device * USB function So far, there's not really a controller/bus distinction: each controller has exactly one bus. When we assign a USB device to a bus, we automatically go through and assign each function fo that device individually. Would it make sense to treat vscsi the same way -- i.e., to make a bus, and then attach targetss to it, and have the LUNs for any given target automatically assigned when the target is assigned? As long as it is still possible to assign individual LUNs as well. If dom0 is controlling e.g. a RAID system you might want to assign one LUN of a target to domU A and one LUN of the same target to domU B. OK, so it sounds like in the vscsi case, it would be useful to assign either an entire target, or an individual LUN. In the case of assigning a target, you'll want to assign all the LUNs as well, such that the virtual LUNs mirror the real LUNs. In the case of assigning a LUN, I assume you'll still need a virtual target. Will you be wanting an interface for creating virtual targets, so that you can assign several real LUNs to the same target? 
> Or will you just want one virtual target per LUN if you're not assigning
> an entire target?

Nearly missed this question. Hmm, I think the first option would be better. Otherwise it could be difficult to assign a just-created LUN of a target to the same virtual target while the system is running.

Juergen
[Xen-devel] [PATCHv3 1/6] evtchn: clear xen_consumer when clearing state
Freeing a xen event channel would clear xen_consumer before clearing the channel state, leaving a window where the channel is in a funny state (still bound but no consumer).

Move the clear of xen_consumer into free_evtchn() where the state is also cleared.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
 xen/common/event_channel.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 947880f..90e3121 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -200,6 +200,7 @@ static void free_evtchn(struct domain *d, struct evtchn *chn)
     /* Reset binding to vcpu0 when the channel is freed. */
     chn->state          = ECS_FREE;
     chn->notify_vcpu_id = 0;
+    chn->xen_consumer   = 0;

     xsm_evtchn_close_post(chn);
 }
@@ -1187,7 +1188,6 @@ void free_xen_event_channel(struct domain *d, int port)
     BUG_ON(!port_is_valid(d, port));
     chn = evtchn_from_port(d, port);
     BUG_ON(!consumer_is_xen(chn));
-    chn->xen_consumer = 0;

     spin_unlock(&d->event_lock);

@@ -1287,10 +1287,7 @@ void evtchn_destroy(struct domain *d)

     /* Close all existing event channels. */
     for ( i = 0; port_is_valid(d, i); i++ )
-    {
-        evtchn_from_port(d, i)->xen_consumer = 0;
         (void)__evtchn_close(d, i);
-    }

     /* Free all event-channel buckets. */
     spin_lock(&d->event_lock);
--
1.7.10.4
[Xen-devel] [PATCHv3 5/6] evtchn: remove the locking when unmasking an event channel
The event channel lock is no longer required to check if the port is valid.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
v3:
- Use port_is_valid() again.
---
 xen/common/event_channel.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index e7c4445..eabb4c8 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -978,8 +978,6 @@ int evtchn_unmask(unsigned int port)
     struct domain *d = current->domain;
     struct evtchn *evtchn;

-    ASSERT(spin_is_locked(&d->event_lock));
-
     if ( unlikely(!port_is_valid(d, port)) )
         return -EINVAL;

@@ -1146,9 +1144,7 @@ long do_event_channel_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         struct evtchn_unmask unmask;
         if ( copy_from_guest(&unmask, arg, 1) != 0 )
             return -EFAULT;
-        spin_lock(&current->domain->event_lock);
         rc = evtchn_unmask(unmask.port);
-        spin_unlock(&current->domain->event_lock);
         break;
     }
--
1.7.10.4
[Xen-devel] [PATCHv3 2/6] evtchn: defer freeing struct evtchn's until evtchn_destroy_final()
notify_via_xen_event_channel() and free_xen_event_channel() had to check if the domain was dying because they may be called while the domain is being destroyed and the struct evtchn's are being freed.

By deferring the freeing of the struct evtchn's until all references to the domain are dropped, these functions can rely on the channel state being present and valid.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
 xen/common/event_channel.c | 48 +++-
 1 file changed, 12 insertions(+), 36 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 90e3121..ab3b48e 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1175,22 +1175,6 @@ int alloc_unbound_xen_event_channel(

 void free_xen_event_channel(struct domain *d, int port)
 {
-    struct evtchn *chn;
-
-    spin_lock(&d->event_lock);
-
-    if ( unlikely(d->is_dying) )
-    {
-        spin_unlock(&d->event_lock);
-        return;
-    }
-
-    BUG_ON(!port_is_valid(d, port));
-    chn = evtchn_from_port(d, port);
-    BUG_ON(!consumer_is_xen(chn));
-
-    spin_unlock(&d->event_lock);
-
     (void)__evtchn_close(d, port);
 }
@@ -1202,18 +1186,12 @@ void notify_via_xen_event_channel(struct domain *ld, int lport)

     spin_lock(&ld->event_lock);

-    if ( unlikely(ld->is_dying) )
-    {
-        spin_unlock(&ld->event_lock);
-        return;
-    }
-
     ASSERT(port_is_valid(ld, lport));
     lchn = evtchn_from_port(ld, lport);
-    ASSERT(consumer_is_xen(lchn));

     if ( likely(lchn->state == ECS_INTERDOMAIN) )
     {
+        ASSERT(consumer_is_xen(lchn));
         rd   = lchn->u.interdomain.remote_dom;
         rchn = evtchn_from_port(rd, lchn->u.interdomain.remote_port);
         evtchn_port_set_pending(rd, rchn->notify_vcpu_id, rchn);
@@ -1279,7 +1257,7 @@ int evtchn_init(struct domain *d)

 void evtchn_destroy(struct domain *d)
 {
-    unsigned int i, j;
+    unsigned int i;

     /* After this barrier no new event-channel allocations can occur. */
     BUG_ON(!d->is_dying);
@@ -1289,8 +1267,17 @@ void evtchn_destroy(struct domain *d)
     for ( i = 0; port_is_valid(d, i); i++ )
         (void)__evtchn_close(d, i);

+    clear_global_virq_handlers(d);
+
+    evtchn_fifo_destroy(d);
+}
+
+
+void evtchn_destroy_final(struct domain *d)
+{
+    unsigned int i, j;
+
     /* Free all event-channel buckets. */
-    spin_lock(&d->event_lock);
     for ( i = 0; i < NR_EVTCHN_GROUPS; i++ )
     {
         if ( !d->evtchn_group[i] )
@@ -1298,20 +1285,9 @@ void evtchn_destroy(struct domain *d)
         for ( j = 0; j < BUCKETS_PER_GROUP; j++ )
             free_evtchn_bucket(d, d->evtchn_group[i][j]);
         xfree(d->evtchn_group[i]);
-        d->evtchn_group[i] = NULL;
     }
     free_evtchn_bucket(d, d->evtchn);
-    d->evtchn = NULL;
-    spin_unlock(&d->event_lock);
-
-    clear_global_virq_handlers(d);
-
-    evtchn_fifo_destroy(d);
-}
-
-void evtchn_destroy_final(struct domain *d)
-{
 #if MAX_VIRT_CPUS > BITS_PER_LONG
     xfree(d->poll_mask);
     d->poll_mask = NULL;
--
1.7.10.4
[Xen-devel] [PATCHv3 3/6] evtchn: simplify port_is_valid()
By keeping a count of the number of currently valid event channels, port_is_valid() can be simplified.

d->valid_evtchns is only increased (while holding d->event_lock), so port_is_valid() may be safely called without taking the lock (this will be useful later).

Signed-off-by: David Vrabel david.vra...@citrix.com
---
v3:
- Remove redundant d->evtchn test.
v2:
- Used unsigned int for d->valid_evtchns.
---
 xen/common/event_channel.c | 3 +++
 xen/include/xen/event.h    | 6 +-
 xen/include/xen/sched.h    | 5 +++--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index ab3b48e..4924796 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -189,6 +189,8 @@ static int get_free_port(struct domain *d)
         return -ENOMEM;
     bucket_from_port(d, port) = chn;

+    write_atomic(&d->valid_evtchns, d->valid_evtchns + EVTCHNS_PER_BUCKET);
+
     return port;
 }
@@ -1232,6 +1234,7 @@ int evtchn_init(struct domain *d)
     d->evtchn = alloc_evtchn_bucket(d, 0);
     if ( !d->evtchn )
         return -ENOMEM;
+    d->valid_evtchns = EVTCHNS_PER_BUCKET;

     spin_lock_init_prof(d, event_lock);
     if ( get_free_port(d) != 0 )
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
index 690f865..af923d1 100644
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -89,11 +89,7 @@ static inline bool_t port_is_valid(struct domain *d, unsigned int p)
 {
     if ( p >= d->max_evtchns )
         return 0;
-    if ( !d->evtchn )
-        return 0;
-    if ( p < EVTCHNS_PER_BUCKET )
-        return 1;
-    return group_from_port(d, p) != NULL && bucket_from_port(d, p) != NULL;
+    return p < read_atomic(&d->valid_evtchns);
 }

 static inline struct evtchn *evtchn_from_port(struct domain *d, unsigned int p)
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 80c6f62..604d047 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -336,8 +336,9 @@ struct domain
     /* Event channel information. */
     struct evtchn   *evtchn;                         /* first bucket only */
     struct evtchn  **evtchn_group[NR_EVTCHN_GROUPS]; /* all other buckets */
-    unsigned int     max_evtchns;
-    unsigned int     max_evtchn_port;
+    unsigned int     max_evtchns;     /* number supported by ABI */
+    unsigned int     max_evtchn_port; /* max permitted port number */
+    unsigned int     valid_evtchns;   /* number of allocated event channels */
     spinlock_t       event_lock;
     const struct evtchn_port_ops *evtchn_port_ops;
     struct evtchn_fifo_domain *evtchn_fifo;
--
1.7.10.4
[Xen-devel] [PATCHv3 6/6] evtchn: pad struct evtchn to 64 bytes
The number of struct evtchn in a page must be a power of two. Under some workloads performance is improved slightly by padding struct evtchn to 64 bytes (a typical cache line size), thus putting fewer of the per-channel locks into each cache line. This does not decrease the number of struct evtchn's per-page.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
v2:
- Use __attribute__((aligned(64))) for the padding.
---
 xen/include/xen/sched.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 44ea92d..a0ff9d2 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -129,7 +129,7 @@ struct evtchn
 #endif
     } ssid;
 #endif
-};
+} __attribute__((aligned(64)));

 int evtchn_init(struct domain *d);     /* from domain_create */
 void evtchn_destroy(struct domain *d); /* from domain_kill */
--
1.7.10.4
[Xen-devel] [PATCHv3 4/6] evtchn: use a per-event channel lock for sending events
When sending an event, use a new per-event channel lock to safely validate the event channel state. This new lock must be held when changing event channel state. Note that the event channel lock must also be held when changing state from ECS_FREE or it will race with a concurrent get_free_port() call. To avoid having to take the remote event channel locks when sending to an interdomain event channel, the local and remote channel locks are both held when binding or closing an interdomain event channel. This significantly increases the number of events that can be sent from multiple VCPUs. But struct evtchn increases in size, reducing the number that fit into a single page to 64 (instead of 128). Signed-off-by: David Vrabel david.vra...@citrix.com --- v3: - Use port_is_valid in evtchn_send(). - Drop event lock from notify_via_xen_event_channel(). v2: - Compare channel pointers in double_evtchn_lock(). --- xen/common/event_channel.c | 82 xen/include/xen/sched.h|1 + 2 files changed, 69 insertions(+), 14 deletions(-) diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c index 4924796..e7c4445 100644 --- a/xen/common/event_channel.c +++ b/xen/common/event_channel.c @@ -139,6 +139,7 @@ static struct evtchn *alloc_evtchn_bucket(struct domain *d, unsigned int port) return NULL; } chn[i].port = port + i; +spin_lock_init(chn[i].lock); } return chn; } @@ -229,11 +230,15 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) if ( rc ) goto out; +spin_lock(chn-lock); + chn-state = ECS_UNBOUND; if ( (chn-u.unbound.remote_domid = alloc-remote_dom) == DOMID_SELF ) chn-u.unbound.remote_domid = current-domain-domain_id; evtchn_port_init(d, chn); +spin_unlock(chn-lock); + alloc-port = port; out: @@ -244,6 +249,28 @@ static long evtchn_alloc_unbound(evtchn_alloc_unbound_t *alloc) } +static void double_evtchn_lock(struct evtchn *lchn, struct evtchn *rchn) +{ +if ( lchn rchn ) +{ +spin_lock(lchn-lock); +spin_lock(rchn-lock); +} +else +{ +if ( lchn != rchn ) 
+spin_lock(rchn-lock); +spin_lock(lchn-lock); +} +} + +static void double_evtchn_unlock(struct evtchn *lchn, struct evtchn *rchn) +{ +spin_unlock(lchn-lock); +if ( lchn != rchn ) +spin_unlock(rchn-lock); +} + static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind) { struct evtchn *lchn, *rchn; @@ -286,6 +313,8 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind) if ( rc ) goto out; +double_evtchn_lock(lchn, rchn); + lchn-u.interdomain.remote_dom = rd; lchn-u.interdomain.remote_port = rport; lchn-state = ECS_INTERDOMAIN; @@ -301,6 +330,8 @@ static long evtchn_bind_interdomain(evtchn_bind_interdomain_t *bind) */ evtchn_port_set_pending(ld, lchn-notify_vcpu_id, lchn); +double_evtchn_unlock(lchn, rchn); + bind-local_port = lport; out: @@ -341,11 +372,16 @@ static long evtchn_bind_virq(evtchn_bind_virq_t *bind) ERROR_EXIT(port); chn = evtchn_from_port(d, port); + +spin_lock(chn-lock); + chn-state = ECS_VIRQ; chn-notify_vcpu_id = vcpu; chn-u.virq = virq; evtchn_port_init(d, chn); +spin_unlock(chn-lock); + v-virq_to_evtchn[virq] = bind-port = port; out: @@ -372,10 +408,15 @@ static long evtchn_bind_ipi(evtchn_bind_ipi_t *bind) ERROR_EXIT(port); chn = evtchn_from_port(d, port); + +spin_lock(chn-lock); + chn-state = ECS_IPI; chn-notify_vcpu_id = vcpu; evtchn_port_init(d, chn); +spin_unlock(chn-lock); + bind-port = port; out: @@ -450,11 +491,15 @@ static long evtchn_bind_pirq(evtchn_bind_pirq_t *bind) goto out; } +spin_lock(chn-lock); + chn-state = ECS_PIRQ; chn-u.pirq.irq = pirq; link_pirq_port(port, chn, v); evtchn_port_init(d, chn); +spin_unlock(chn-lock); + bind-port = port; #ifdef CONFIG_X86 @@ -468,7 +513,6 @@ static long evtchn_bind_pirq(evtchn_bind_pirq_t *bind) return rc; } - static long __evtchn_close(struct domain *d1, int port1) { struct domain *d2 = NULL; @@ -575,15 +619,24 @@ static long __evtchn_close(struct domain *d1, int port1) BUG_ON(chn2-state != ECS_INTERDOMAIN); BUG_ON(chn2-u.interdomain.remote_dom != d1); 
+double_evtchn_lock(chn1, chn2); + +free_evtchn(d1, chn1); + chn2-state = ECS_UNBOUND; chn2-u.unbound.remote_domid = d1-domain_id; -break; + +double_evtchn_unlock(chn1, chn2); + +goto out; default: BUG(); } +spin_lock(chn1-lock); free_evtchn(d1, chn1); +spin_unlock(chn1-lock); out: if ( d2 != NULL ) @@ -610,21 +663,18 @@ int
Re: [Xen-devel] [OSSTEST Nested PATCH v11 6/7] Compose the main recipe of nested test job
On Wed, 2015-06-17 at 11:00 +, Pang, LongtaoX wrote:
> > 2015-06-17 08:10:26 Z executing ssh ... root@192.168.199.25 find /tmp /var/run /var/tmp /var/lib/xen /var/core ! -type d -print0 -ls -printf '\0'
> > find: `/var/core': No such file or directory
> >
> > /var/core is created by ts-host-install. I think the tail end of the
> > function sub in there which does that, and which populates
> > /etc/sysctl.conf and /etc/security/limits.d/coredumps.conf, should be
> > refactored, probably to be alongside the osstest-confirm-booted thing
> > which IIRC you are already going to refactor in the next version.
>
> I reviewed the related code in 'ts-host-install'. But I am not very clear:
> will the below code be executed in 'ts-debian-hvm-install' too? Or should
> the refactoring of 'osstest-confirm-booted' and the below action be done
> inside this shell script? Since the '/var/core' directory is needed in L1,
> please correct me.

I think you've already arranged for the osstest-confirm-booted stuff to happen for the L1 host too, but Ian J has requested it be done differently such that it is not duplicated in two places.

The code to set up /var/core and the associated sysctl.conf and limits.d changes should be done in the same manner as osstest-confirm-booted eventually is, based on Ian's suggestion, whatever that was.

Ian.
Re: [Xen-devel] [PATCH v2 for Xen 4.6 3/4] libxl: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler
On Tue, 2015-06-09 at 18:18 +0200, Dario Faggioli wrote: On Mon, 2015-06-08 at 15:55 -0500, Chong Li wrote: On Mon, Jun 8, 2015 at 10:56 AM, Dario Faggioli So, Thoughts? What do you think the best way forward could be? I like option 2 more. But I think we may also need a 'vcpuid' field in libxl_sched_params. For sparse array support, yes. At which point, I would flip the names as well, i.e., something like this: libxl_vcpu_sched_params = Struct(vcpu_sched_params,[ (vcpuid, integer, { xxx some init val xxx}), (weight, integer, {'init_val': 'LIBXL_PARAM_WEIGHT_DEFAULT'}), (cap, integer, {'init_val': 'LIBXL_PARAM_CAP_DEFAULT'}), (period, integer, {'init_val': 'LIBXL_PARAM_PERIOD_DEFAULT'}), (slice,integer, {'init_val': 'LIBXL_PARAM_SLICE_DEFAULT'}), (latency, integer, {'init_val': 'LIBXL_PARAM_LATENCY_DEFAULT'}), (extratime,integer, {'init_val': 'LIBXL_PARAM_EXTRATIME_DEFAULT'}), (budget, integer, {'init_val': 'LIBXL_PARAM_BUDGET_DEFAULT'}), Is a full new set of defaults really necessary? I don't think it would be semantically all that strange to say that an unspecified per-vcpu value will default to the domain default, at which point having LIBXL_DOMAIN_SCHED_PARAM_FOO_DEFAULT mentioned doesn't seem all that strange. ]) libxl_sched_params = Struct(sched_params,[ (sched,libxl_scheduler), (vcpus,Array(libxl_sched_params, num_vcpus)), ]) With the possibility of naming the latter 'libxl_vcpus_sched_params', which is more descriptive, but perhaps is too similar to libxl_vcpu_sched_params. I think I'd go the other way and name the thing with all the values in it libxl_sched_params and the thing with the per-vcpu array in it libxl_vcpu_sched_params. Ian, George, what do you think? While we're here, another thing we would appreciate some feedback on is what should happen to libxl_domain_sched_params_get(). This occurred to my mind while reviewing patch 4 of this series. Actually, I think we've discussed this before, but can't find the reference now. 
Anyway, my view is that, for a scheduler that uses per-vcpu parameters, libxl_domain_sched_params_set() should set the same parameters for all the vcpus. When it comes to _get(), however, I'm not sure. To match the _set() case, we'd need to return the parameters of all the vcpus, but we can't, because the function takes a libxl_domain_sched_params argument, which just holds 1 tuple. Wouldn't domain_get/set be manipulating the domain's default values for things, i.e. what a vcpu gets if nothing more specific is specified? Or is it too late to say that? For set overall I don't really think it matters too much if it sets everything (i..e all vcpus and the defaults) so long as is a documented effect of the API -- anyone who adds per-vcpu support would then be aware of this and can adjust their code accordingly. For get I think saying it returns the defaults used for vcpus which don't have something more explicit set is perfectly fine and doesn't pose an upgrade wrinkle, since only those who are aware of the vcpu settings would care about the distinction. Should we just WARN and ask, when on that specific scheduler, to use the per-vcpu variant being introduced in this patch (libxl_vcpu_sched_params_get())? This does not look ideal, but without changing the prototype of libxl_domain_sched_params_get(), I don't see what else sensible we could do... :-/ Should we change it, and do the LIBXL_API_VERSION trick? So, again, thoughts? Regards, Dario ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
Hi Ian, On 17/06/15 11:08, Ian Campbell wrote: On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. By default, the OS doesn't setup the BARs, it expects the firmware to do it. If I'm not mistaken the PV guest is using the same memory layout as the host, so the BAR is mapped 1:1 in the guest. The interesting part is bar_* in drivers/xen/xenpci-back/config_space_header.c So it seems that neither of the existing solutions are a good match for either ARM or x86 PVH. _If_ it were permissible to implement all BAR registers as r/o then this might simplify things, however I suspect this is not something we can get away with in the general case (i.e. a driver for a given PCI device _knows_ that BARN is writeable...). 
The guest has to write in the BAR in order to get the size of the BAR [1]. So I think we do need to allow guest writes to the BAR registers. I think it would be a bit strange (or even open potentially subtle (security?) issues) to require the guest to write the BAR register and to update the p2m to match as well. I don't see any possible security issue in letting the guest map the BAR itself. If the guest doesn't have the right to some physical region it won't be able to map it (see iomem permission). Actually, we already do similar things for interrupts. So I think updates to the BAR need to be reflected in the p2m too by whichever entity is emulating those reads/writes. QEMU (or another daemon) is not really an option IMHO. Well, by default the OS doesn't set up the BAR and expects the firmware to initialize it. In our case, there is no firmware and Xen will act in its place. It would be OK to do it in the toolstack at boot time. Which leaves us with either Xen doing trap and emulate or pciback dealing with this via the pciif.h cfg space access stuff. I've a slight preference for the latter since I think both ARM and x86 PVH are planning to use pcifront/pciback already. pcifront/pciback already emulates the access to the config space for us... It would be mad to use Xen for that as we would have to reinvent our own config space access path (ARM doesn't have ioports, hence cf8 is not present). Which leaves us with a requirement for the backend to be able to update the p2m, which in turn leads to a need for an interface available to the dom0 kernel (which the existing domctl is not). There will also need to be a mechanism to expose a suitable MMIO hole to pcifront. On x86 this would be via the machine memory map, I think, but on ARM we may need something else, perhaps a xenstore key associated with the pci bus entry in xenstore?
That's for guests; for dom0, Xen also needs to be aware of changes to the BAR registers of the real physical devices since it needs to know the real hardware values in order to point guest p2m mappings at them. I don't understand why dom0 Xen needs to be aware of changes in the physical BARs. pciback is already able to deal with the physical BAR values, you have nothing else to do (see drivers/xen/xenpci-back/config_space_header.c). Regards, [1] http://wiki.osdev.org/PCI#Base_Address_Registers Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
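The BAR-sizing handshake referenced via [1] is why BAR writes cannot simply be made read-only: software writes all-ones to the BAR, reads back the mask of writable address bits, and derives the region size from it. A minimal sketch of the decode step — generic PCI behaviour, not Xen code, and the helper name is made up:

```c
#include <stdint.h>

/* Decode a memory-BAR size probe: after writing 0xFFFFFFFF to the BAR
 * and reading it back, the writable address bits read as 1 and the low
 * four bits carry type/prefetch information.  Masking those off and
 * taking the two's complement yields the region size, per the sequence
 * described on the osdev wiki page cited above. */
static uint32_t mem_bar_size(uint32_t readback)
{
    uint32_t mask = readback & ~0xFu;   /* strip the type/prefetch bits */
    return ~mask + 1;                   /* two's complement gives the size */
}
```

A firmware (or an entity emulating config space) would save the original BAR value, write all-ones, read back, compute the size as above, and restore the original value — which is exactly why the emulator must trap and replay these writes rather than reject them.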
Re: [Xen-devel] [PATCH v3] tools/libxc: Batch memory allocations for PV guests
On Mon, 2015-06-15 at 17:08 +0100, Wei Liu wrote: On Mon, Jun 15, 2015 at 11:12:07AM +0100, Ross Lagerwall wrote: The current code for allocating memory for PV guests batches the hypercalls to allocate memory by allocating 1024*1024 extents of order 0 at a time. To make this faster, first try allocating extents of order 9 (2 MiB) before falling back to the order 0 allocation if the order 9 allocation fails. On my test machine this reduced the time to start a 128 GiB PV guest by about 60 seconds. From 61s or from 600s? Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com Applied. There were some rejects due to "libxc: unify handling of vNUMA layout" having gone in first, I think they were just contextual and they seemed trivial to resolve, but please do (both) check. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
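The order-9-then-order-0 strategy described in the commit message can be sketched as follows. This is an illustration only, not the libxc implementation: the `alloc` callback stands in for the real populate-physmap hypercall wrapper, whose signature differs.

```c
#include <stdint.h>

#define SUPERPAGE_ORDER 9                       /* 2 MiB with 4 KiB pages */
#define SUPERPAGE_NR_PFNS (1UL << SUPERPAGE_ORDER)

/* Hypothetical allocator callback: returns 0 on success, -1 on failure. */
typedef int (*alloc_fn)(unsigned long order, unsigned long count);

/* Allocate `total` order-0 pages for the guest, preferring order-9
 * extents and falling back to order-0 once a large allocation fails. */
static long allocate_batched(unsigned long total, alloc_fn alloc)
{
    unsigned long done = 0;

    while (done + SUPERPAGE_NR_PFNS <= total) {
        if (alloc(SUPERPAGE_ORDER, 1) != 0)
            break;                              /* fall back to order 0 */
        done += SUPERPAGE_NR_PFNS;
    }
    while (done < total) {
        if (alloc(0, 1) != 0)
            return -1;
        done += 1;
    }
    return (long)done;
}
```

The speed-up comes from issuing one hypercall per 512 pages instead of one per page whenever the heap can still satisfy 2 MiB extents.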
Re: [Xen-devel] [PATCH] xen/arm: Propagate clock-frequency to DOMU if present in the DT timer node
On Wed, 2015-06-03 at 15:48 +0100, Julien Grall wrote: When the property clock-frequency is present in the DT timer node, it means that the bootloader/firmware didn't correctly configure CNTFRQ/CNTFRQ_EL0 on each processor. The best solution would be to fix the offending firmware/bootloader, although it may not always be possible to modify and re-flash it. As it's not possible to trap the register CNTFRQ/CNTFRQ_EL0, we have to extend xen_arch_domainconfig to provide the timer frequency to the toolstack when the property clock-frequency is present in the host DT timer node. Then, a property clock-frequency will be created in the guest DT timer node if the value is not 0. We could have set the property in the guest DT regardless of whether the property is present in the host DT. Although, we still want to let the guest use CNTFRQ in the normal case. After all, the property clock-frequency is just a workaround for buggy firmware. Signed-off-by: Julien Grall julien.gr...@citrix.com Cc: Chris Brand chris.br...@broadcom.com I was about to apply but: libxl_arm.c: In function ‘make_timer_node’: libxl_arm.c:468:9: error: implicit declaration of function ‘fdt_property_u32’ [-Werror=implicit-function-declaration] cc1: all warnings being treated as errors That happens on both my 32-bit build (Debian Wheezy) and my 64-bit one (Ubuntu Saucy). Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v2] libxl: refactor toolstack save restore code
This patch does the following things:

1. Document the v1 format.
2. Factor out a function to handle QEMU restore data and a function to handle the v1 blob on the restore path.
3. Refactor the save function to generate different blobs in the order specified in the format specification.
4. Change functions to use the "goto out" idiom.

No functional changes introduced.

Signed-off-by: Wei Liu wei.l...@citrix.com
---
Changes in v2:
1. Use ret instead of rc.
2. Use ptr instead of buf to reduce patch size.
---
 tools/libxl/libxl_dom.c | 232 +---
 1 file changed, 142 insertions(+), 90 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 867172a..600393d 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1032,6 +1032,15 @@ struct libxl__physmap_info {
     char name[];
 };

+/* Bump version every time when toolstack saved data changes.
+ * Different types of data are arranged in the specified order.
+ *
+ * Version 1:
+ *   uint32_t version
+ *   QEMU physmap data:
+ *     uint32_t count
+ *     libxl__physmap_info * count
+ */
 #define TOOLSTACK_SAVE_VERSION 1

 static inline char *restore_helper(libxl__gc *gc, uint32_t dm_domid,
@@ -1043,41 +1052,28 @@ static inline char *restore_helper(libxl__gc *gc, uint32_t dm_domid,
                               phys_offset, node);
 }

-int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
-                             uint32_t size, void *user)
+static int libxl__toolstack_restore_qemu(libxl__gc *gc, uint32_t domid,
+                                         const uint8_t *ptr, uint32_t size)
 {
-    libxl__save_helper_state *shs = user;
-    libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
-    STATE_AO_GC(dcs->ao);
-    int i, ret;
-    const uint8_t *ptr = buf;
-    uint32_t count = 0, version = 0;
-    struct libxl__physmap_info* pi;
+    int ret, i;
+    uint32_t count;
     char *xs_path;
     uint32_t dm_domid;
+    struct libxl__physmap_info *pi;

-    LOG(DEBUG,"domain=%"PRIu32" toolstack data size=%"PRIu32, domid, size);
-
-    if (size < sizeof(version) + sizeof(count)) {
+    if (size < sizeof(count)) {
         LOG(ERROR, "wrong size");
-        return -1;
-    }
-
-    memcpy(&version, ptr, sizeof(version));
-    ptr += sizeof(version);
-
-    if (version != TOOLSTACK_SAVE_VERSION) {
-        LOG(ERROR, "wrong version");
-        return -1;
+        ret = -1;
+        goto out;
     }

     memcpy(&count, ptr, sizeof(count));
     ptr += sizeof(count);

-    if (size < sizeof(version) + sizeof(count) +
-        count * (sizeof(struct libxl__physmap_info))) {
+    if (size < sizeof(count) + count*(sizeof(struct libxl__physmap_info))) {
         LOG(ERROR, "wrong size");
-        return -1;
+        ret = -1;
+        goto out;
     }

     dm_domid = libxl_get_stubdom_id(CTX, domid);
@@ -1088,21 +1084,64 @@ int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
         xs_path = restore_helper(gc, dm_domid, domid, pi->phys_offset,
                                  "start_addr");
         ret = libxl__xs_write(gc, 0, xs_path, "%"PRIx64, pi->start_addr);
-        if (ret)
-            return -1;
+        if (ret) goto out;
+
         xs_path = restore_helper(gc, dm_domid, domid, pi->phys_offset,
                                  "size");
         ret = libxl__xs_write(gc, 0, xs_path, "%"PRIx64, pi->size);
-        if (ret)
-            return -1;
+        if (ret) goto out;
+
         if (pi->namelen > 0) {
             xs_path = restore_helper(gc, dm_domid, domid, pi->phys_offset,
                                      "name");
             ret = libxl__xs_write(gc, 0, xs_path, "%s", pi->name);
-            if (ret)
-                return -1;
+            if (ret) goto out;
         }
     }
-    return 0;
+
+    ret = 0;
+out:
+    return ret;
+
+}
+
+static int libxl__toolstack_restore_v1(libxl__gc *gc, uint32_t domid,
+                                       const uint8_t *ptr, uint32_t size)
+{
+    return libxl__toolstack_restore_qemu(gc, domid, ptr, size);
+}
+
+int libxl__toolstack_restore(uint32_t domid, const uint8_t *ptr,
+                             uint32_t size, void *user)
+{
+    libxl__save_helper_state *shs = user;
+    libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
+    STATE_AO_GC(dcs->ao);
+    int ret;
+    uint32_t version = 0, bufsize;
+
+    LOG(DEBUG,"domain=%"PRIu32" toolstack data size=%"PRIu32, domid, size);
+
+    if (size < sizeof(version)) {
+        LOG(ERROR, "wrong size");
+        ret = -1;
+        goto out;
+    }
+
+    memcpy(&version, ptr, sizeof(version));
+    ptr += sizeof(version);
+    bufsize = size - sizeof(version);
+
+    switch (version) {
+    case 1:
+        ret = libxl__toolstack_restore_v1(gc, domid, ptr, bufsize);
+        break;
+    default:
+        LOG(ERROR, "wrong version");
+        ret = -1;
+    }
+
+out:
+    return ret;
 }

 /* Domain suspend (save) */
@@ -1685,81 +1724,94 @@ int
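The v1 blob layout documented in the patch (a uint32_t version, a uint32_t count, then `count` physmap records) can be validated in the same style as the patch does. A rough sketch — note the real libxl__physmap_info records are variable-length (they end in a namelen-sized name array), so the fixed-size record below is a simplification for illustration:

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for libxl__physmap_info (no trailing name[]). */
struct physmap_rec {
    uint64_t phys_offset;
    uint64_t start_addr;
    uint64_t size;
};

/* Validate a v1 toolstack-save blob the way the patch does: check there
 * is room for the version, then for the count, then for `count` records. */
static int v1_blob_ok(const uint8_t *ptr, uint32_t size)
{
    uint32_t version, count;

    if (size < sizeof(version))
        return 0;
    memcpy(&version, ptr, sizeof(version));
    if (version != 1)
        return 0;
    ptr += sizeof(version);
    size -= sizeof(version);

    if (size < sizeof(count))
        return 0;
    memcpy(&count, ptr, sizeof(count));
    size -= sizeof(count);

    return size >= count * sizeof(struct physmap_rec);
}
```

Doing the version check in the outer dispatcher and the count/record checks in the per-version handler, as the refactored code does, keeps each format version self-contained.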
Re: [Xen-devel] [PATCH] libxc: unify handling of vNUMA layout
On Thu, 2015-06-04 at 12:52 -0400, Boris Ostrovsky wrote: On 06/04/2015 06:23 AM, Wei Liu wrote: This patch does the following: 1. Use local variables for dummy vNUMA layout in PV case. 2. Avoid leaking dummy layout back to caller in PV case. 3. Use local variables to reference vNUMA layout (whether it is dummy or provided by caller) for both PV and HVM. Signed-off-by: Wei Liu wei.l...@citrix.com Reviewed-by: Boris Ostrovsky boris.ostrov...@oracle.com Acked + applied, thanks. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC v1] libxl: Introduce a template for devices with a controller
On Thu, 2015-05-21 at 18:07 +0100, George Dunlap wrote: We have several outstanding patch series which add devices that have two levels: a controller and individual devices attached to that controller. In the interest of consistency, this patch introduces a section that sketches out a template for interfaces for such devices. Chun Yan and Juergen: I was hoping we could come to some sort of agreement on this such that it can be used as the basis for both the pvusb and pvscsi interfaces. As such your feedback here is important... Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@citrix.com CC: Wei Liu wei.l...@citrix.com CC: Juergen Gross jgr...@suse.com CC: Chun Yan Liu cy...@suse.com CC: Olaf Hering o...@aepfle.de So, this is definitely RFC -- I tried to spec things out in a way that made sense, but I often just chose something that I thought would be a sensible starting point for discussion. This spec looks a lot more like the PVUSB spec than the PVSCSI spec, in part because I think the PVUSB spec has already had a lot more thought that's gone into it. A couple of random points to discuss: * Calling things controllers, using typectrl for the device name, and using ctrl as the field name for the devid of the controller in the individual devices. * I've said that having an index (port, lun, whatever) is optional. Do we want to make that required? Do we want it to have a consistent name? In the case of emulated USB, we can't really specify to qemu what port the device gets attached to, so I'm tempted to say it's not required; but even there we could always give it a port number just for name's sake. * Naming sub-devices. We need to have a way to uniquely name both controllers and subdevices. Here I've said that we will have both typectrl and type devid namespaces, mainly because in the previous point I opted not to require an index.
Another option would be not to have another devid namespace, but to use ctrl,index as the unique identifier. (This would mean requiring the index/port/lun specification above.) * libxl_device_type_list listing devices across all controllers. I think this is the most practical thing to do, but one could imagine wanting to query by controller ID instead. Feedback welcome. --- tools/libxl/libxl.h | 46 ++ 1 file changed, 46 insertions(+) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 2ed7194..d757845 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -1234,6 +1234,52 @@ void libxl_vtpminfo_list_free(libxl_vtpminfo *, int nr_vtpms); * * This function does not interact with the guest and therefore * cannot block on the guest. + * + * Controllers + * --- + * + * Most devices are treated individually. Some devices however, like + * USB or SCSI, inherently have the need to have busses or + * controllers to which individual devices can be attached. + * + * In that case, for each type, there will be two sets of the + * functions, types, and devid namespaces outlined above: one based on + * 'type', and one based on 'typectrl'. + * + * In those cases, libxl_device_typectrl_function will act more or + * less like top-level non-bus devices: they will either create or + * accept a libxl_devid which will be unique within the + * typectrl libxl_devid namespace. + * + * Second-level devices which will be attached to a controller will + * include in their libxl_device_type a field called ctrl, which + * will be the libxl_devid of the corresponding controller. It may also + * include an index onto that bus, that represents (for example) a USB + * port or a SCSI LUN. + * + * These second-level devices will also have their own devid which + * will be unique within the type devid namespace, and will be used + * for queries or removals. 
+ * + * In the case where there are multiple different ways to implement a + * given device -- for instance, one which is fully PV and one which + * uses an emulator -- the controller will contain a field which + * specifies what type of implementation is used. The implementations + * of individual devices will be known by the controller to which they are + * attached. + * + * If libxl_device_type_add receives an uninitialized ctrl devid, it + * may return an error. Or it may (but is not required to) choose to + * automatically choose a suitable controller to which to attach the + * new device. It may also (but is not required to) automatically + * create a new controller if no suitable controllers exist. + * Individual devices should document their behavior. + * + * libxl_device_typectrl_list will list all controllers for the domain. + * + * libxl_device_type_list will list all devices for all controllers + * for the domain. The individual
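The two-level naming scheme sketched in the comment above might look like this in type form. These structs are purely hypothetical — the real pvusb/pvscsi types were still under discussion at the time — and exist only to show how a second-level device names its controller:

```c
#include <stdint.h>

/* Hypothetical controller type: unique within its own devid namespace. */
typedef struct {
    uint32_t devid;          /* unique within the "usbctrl" namespace */
    int backend_type;        /* e.g. fully-PV vs. emulator-backed */
} libxl_device_usbctrl_sketch;

/* Hypothetical second-level device: its own devid namespace, plus a
 * "ctrl" field holding the controller's devid, as the template describes. */
typedef struct {
    uint32_t devid;          /* unique within the "usb" device namespace */
    uint32_t ctrl;           /* devid of the controller we attach to */
    int port;                /* optional index on that controller */
} libxl_device_usb_sketch;

/* A device belongs to a controller iff its ctrl field matches the
 * controller's devid. */
static int attached_to(const libxl_device_usb_sketch *dev,
                       const libxl_device_usbctrl_sketch *ctrl)
{
    return dev->ctrl == ctrl->devid;
}
```

The alternative raised in the RFC — using (ctrl, index) as the unique key instead of a separate device devid namespace — would simply drop the second `devid` field and make `port` mandatory.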
Re: [Xen-devel] [PATCH] oxenstored: implement XS_RESET_WATCHES
On Tue, 2015-06-09 at 11:08 +0100, Wei Liu wrote: Signed-off-by: Wei Liu wei.l...@citrix.com Acked-by: David Scott dave.sc...@citrix.com Applied, thanks. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen-netback:Make the function xenvif_schedulable have a return type of bool
On Tue, Jun 16, 2015 at 11:03:30PM -0400, Nicholas Krause wrote: This makes the function xenvif_schedulable have a return type of bool now due to this particular function's return statement only ever evaluating to have a value of one or zero. Signed-off-by: Nicholas Krause xerofo...@gmail.com Acked-by: Wei Liu wei.l...@citrix.com
---
 drivers/net/xen-netback/common.h    | 2 +-
 drivers/net/xen-netback/interface.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 8a495b3..c02cafb 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -280,7 +280,7 @@ void xenvif_free(struct xenvif *vif);
 int xenvif_xenbus_init(void);
 void xenvif_xenbus_fini(void);

-int xenvif_schedulable(struct xenvif *vif);
+bool xenvif_schedulable(struct xenvif *vif);

 int xenvif_queue_stopped(struct xenvif_queue *queue);
 void xenvif_wake_queue(struct xenvif_queue *queue);
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 1a83e19..b5fcb52 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -63,7 +63,7 @@ void xenvif_skb_zerocopy_complete(struct xenvif_queue *queue)
 	atomic_dec(&queue->inflight_packets);
 }

-int xenvif_schedulable(struct xenvif *vif)
+bool xenvif_schedulable(struct xenvif *vif)
 {
 	return netif_running(vif->dev) &&
 		test_bit(VIF_STATUS_CONNECTED, &vif->status)
-- 
2.1.4
___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] libxl: clean up qemu-save and qemu-resume files
On Wed, 2015-06-03 at 11:44 +0100, Wei Liu wrote: These files are leaked when using qemu-trad stubdom. They are intermediate files created by libxc. Unfortunately they don't fit well in our userdata scheme. Clean them up after we destroy all userdata, we're sure they are not useful anymore at that point. Signed-off-by: Wei Liu wei.l...@citrix.com Acked + applied. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/2] libxc: fix xc_dom_load_elf_symtab
On Wed, 2015-06-17 at 10:56 +0100, Wei Liu wrote: On Thu, Jun 11, 2015 at 06:05:20PM +0200, Roger Pau Monne wrote: xc_dom_load_elf_symtab was incorrectly trying to perform the same calculations already done in elf_parse_bsdsyms when load == 0 is used. Instead of trying to repeat the calculations, just trust what elf_parse_bsdsyms has already accounted for. This also simplifies the code by allowing the non-load case to return earlier. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com Applied, thanks (#1 was committed by Jan already). ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On Tue, 2015-06-16 at 12:44 +0100, Ian Jackson wrote: George Dunlap writes (Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API): On Tue, Jun 16, 2015 at 12:19 PM, Juergen Gross jgr...@suse.com wrote: I'm pretty sure this is just a matter of timing during boot: the busses are all (or at least some of them) queried at the same time and the first answering gets number 1, the next 2, and so on. And timing seems to be different enough to result in unstable bus numbers. Right -- I meant stable in topology within one boot, but at least two of you have now understood me to mean across reboots by default, so that's obviously a detail that needs to be specified. :-) I think stable in topology within one boot is nearly vacuous (or are we expecting buses or devices to spontaneously renumber themselves for no reason) and also nearly useless. We definitely need _some_ way for a user to identify a specific device uniquely in a way that is reliable from one boot to the next. vid:pid is one way to do this, but not always sufficient. We should perhaps separate out the end user from the libxl user (i.e. the toolstack application) in our deliberations here. We have the option of settling on a single way of describing a device to libxl in the libxl API but allowing the toolstack to choose to present the user with multiple different ways to specify things by providing a suitable set of helper functions (in either libxlu or libxl proper) from a variety of schemes to the one true way of describing a device. That sounds better to me than having the libxl API consist of the union of all potentially useful ways to refer to a device. I'm not sure whether that also gives us the freedom to use something which is only stable for a given boot at the libxl API layer though. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] tools/libxc: Batch memory allocations for PV guests
On Wed, Jun 17, 2015 at 12:14:03PM +0100, Ian Campbell wrote: On Mon, 2015-06-15 at 17:08 +0100, Wei Liu wrote: On Mon, Jun 15, 2015 at 11:12:07AM +0100, Ross Lagerwall wrote: The current code for allocating memory for PV guests batches the hypercalls to allocate memory by allocating 1024*1024 extents of order 0 at a time. To make this faster, first try allocating extents of order 9 (2 MiB) before falling back to the order 0 allocation if the order 9 allocation fails. On my test machine this reduced the time to start a 128 GiB PV guest by about 60 seconds. From 61s or from 600s? Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com Applied. There were some rejects due to "libxc: unify handling of vNUMA layout" having gone in first, I think they were just contextual and they seemed trivial to resolve, but please do (both) check. The patch looks good to me. Wei. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On Tue, 2015-06-16 at 16:20 +0200, Juergen Gross wrote: My point was to avoid the sysfs accesses in libxl in order to support BSD as well and to reduce the complexity. As a slight aside to this, can't libxl use libusb for a lot of this stuff and therefore avoid being Linux specific? http://libusb.info/ claims to support Linux, OS X, Windows, Windows CE, Android, OpenBSD/NetBSD, Haiku.. Interestingly FreeBSD is missing there but I don't think that need to be a blocker. I don't see a problem with adding libusb to our set of dependencies, and it's certainly got to be better than (re)implementing a bunch of sysfs stuff (which I presume is what libusb does under the hood anyway). Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V4 3/7] libxl: add pvusb API
On 06/17/2015 01:34 PM, Ian Campbell wrote: On Tue, 2015-06-16 at 16:20 +0200, Juergen Gross wrote: My point was to avoid the sysfs accesses in libxl in order to support BSD as well and to reduce the complexity. As a slight aside to this, can't libxl use libusb for a lot of this stuff and therefore avoid being Linux specific? http://libusb.info/ claims to support Linux, OS X, Windows, Windows CE, Android, OpenBSD/NetBSD, Haiku.. Interestingly FreeBSD is missing there but I don't think that need to be a blocker. I don't see a problem with adding libusb to our set of dependencies, and it's certainly got to be better than (re)implementing a bunch of sysfs stuff (which I presume is what libusb does under the hood anyway). +1 Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
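One candidate for a boot-stable identifier of the kind discussed in this thread is the bus/port topology path, which Linux renders as e.g. "2-1.4"; with libusb that data comes from libusb_get_bus_number() and libusb_get_port_numbers(). A self-contained sketch of just the formatting step, with the port chain passed in directly so it runs without libusb:

```c
#include <stdint.h>
#include <stdio.h>

/* Format a topology-based device path like "busnum-port.port...".
 * The bus number and port chain would normally come from libusb
 * (libusb_get_bus_number() / libusb_get_port_numbers()); here they are
 * plain parameters so the sketch stays self-contained. */
static int usb_topology_path(char *buf, size_t len, uint8_t bus,
                             const uint8_t *ports, int nports)
{
    int n = snprintf(buf, len, "%u", bus);
    for (int i = 0; i < nports; i++)
        n += snprintf(buf + n, len > (size_t)n ? len - (size_t)n : 0,
                      "%c%u", i == 0 ? '-' : '.', ports[i]);
    return n;   /* length of the full path, snprintf-style */
}
```

Unlike bare bus numbers, which this thread notes can change between boots depending on probe timing, the port chain follows the physical topology, so the same socket on the same hub yields the same string across reboots.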
[Xen-devel] [PATCH v5 0/3] net/xen: Clean up
Hi, The first 2 patches were originally part of the Xen 64KB series [1]. Although I think they can go in without waiting for the rest of the 64KB series. The third patch has been added in v4. Regards, Cc: Wei Liu wei.l...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: David Vrabel david.vra...@citrix.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: Boris Ostrovsky boris.ostrov...@oracle.com Cc: net...@vger.kernel.org [1] http://lkml.org/lkml/2015/5/14/533 Julien Grall (3): net/xen-netfront: Correct printf format in xennet_get_responses net/xen-netback: Remove unused code in xenvif_rx_action net/xen-netback: Don't mix hexa and decimal with 0x in the printf format drivers/net/xen-netback/netback.c | 19 +++ drivers/net/xen-netfront.c | 2 +- 2 files changed, 8 insertions(+), 13 deletions(-) -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v5 3/3] net/xen-netback: Don't mix hexa and decimal with 0x in the printf format
Prepend 0x to all %x in order to avoid confusion while reading when there are other decimal values in the log. Also replace some of the hexadecimal prints with decimal to uniformize the format with netfront. Signed-off-by: Julien Grall julien.gr...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com Cc: net...@vger.kernel.org
---
Changes in v5: - Fix commit message - Add Ian's ack.
Changes in v4: - Patch added
---
 drivers/net/xen-netback/netback.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index ba3ae30..11bd9d8 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -748,7 +748,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
 		slots++;

 		if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) {
-			netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %x, size: %u\n",
+			netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %u, size: %u\n",
 				   txp->offset, txp->size);
 			xenvif_fatal_tx_err(queue->vif);
 			return -EINVAL;
@@ -874,7 +874,7 @@ static inline void xenvif_grant_handle_set(struct xenvif_queue *queue,
 	if (unlikely(queue->grant_tx_handle[pending_idx] !=
 		     NETBACK_INVALID_HANDLE)) {
 		netdev_err(queue->vif->dev,
-			   "Trying to overwrite active handle! pending_idx: %x\n",
+			   "Trying to overwrite active handle! pending_idx: 0x%x\n",
 			   pending_idx);
 		BUG();
 	}
@@ -887,7 +887,7 @@ static inline void xenvif_grant_handle_reset(struct xenvif_queue *queue,
 	if (unlikely(queue->grant_tx_handle[pending_idx] ==
 		     NETBACK_INVALID_HANDLE)) {
 		netdev_err(queue->vif->dev,
-			   "Trying to unmap invalid handle! pending_idx: %x\n",
+			   "Trying to unmap invalid handle! pending_idx: 0x%x\n",
 			   pending_idx);
 		BUG();
 	}
@@ -1243,7 +1243,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
 		/* No crossing a page as the payload mustn't fragment. */
 		if (unlikely((txreq.offset + txreq.size) > PAGE_SIZE)) {
 			netdev_err(queue->vif->dev,
-				   "txreq.offset: %x, size: %u, end: %lu\n",
+				   "txreq.offset: %u, size: %u, end: %lu\n",
 				   txreq.offset, txreq.size,
 				   (unsigned long)(txreq.offset&~PAGE_MASK) + txreq.size);
 			xenvif_fatal_tx_err(queue->vif);
@@ -1593,12 +1593,12 @@ static inline void xenvif_tx_dealloc_action(struct xenvif_queue *queue)
 					queue->pages_to_unmap,
 					gop - queue->tx_unmap_ops);
 		if (ret) {
-			netdev_err(queue->vif->dev, "Unmap fail: nr_ops %tx ret %d\n",
+			netdev_err(queue->vif->dev, "Unmap fail: nr_ops %tu ret %d\n",
 				   gop - queue->tx_unmap_ops, ret);
 			for (i = 0; i < gop - queue->tx_unmap_ops; ++i) {
 				if (gop[i].status != GNTST_okay)
 					netdev_err(queue->vif->dev,
-						   " host_addr: %llx handle: %x status: %d\n",
+						   " host_addr: 0x%llx handle: 0x%x status: %d\n",
 						   gop[i].host_addr,
 						   gop[i].handle,
 						   gop[i].status);
@@ -1731,7 +1731,7 @@ void xenvif_idx_unmap(struct xenvif_queue *queue, u16 pending_idx)
 				queue->mmap_pages[pending_idx], 1);
 	if (ret) {
 		netdev_err(queue->vif->dev,
-			   "Unmap fail: ret: %d pending_idx: %d host_addr: %llx handle: %x status: %d\n",
+			   "Unmap fail: ret: %d pending_idx: %d host_addr: %llx handle: 0x%x status: %d\n",
 			   ret,
 			   pending_idx,
 			   tx_unmap_op.host_addr,
-- 
2.1.4
___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v5 1/3] net/xen-netfront: Correct printf format in xennet_get_responses
rx->status is an int16_t, print it using %d rather than %u in order to have a meaningful value when the field is negative. Also use %u rather than %x for rx->offset. Signed-off-by: Julien Grall julien.gr...@citrix.com Reviewed-by: David Vrabel david.vra...@citrix.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: Boris Ostrovsky boris.ostrov...@oracle.com Cc: net...@vger.kernel.org
---
Changes in v4: - Use %u for the rx->offset because offset is unsigned
Changes in v3: - Use %d for the rx->offset too.
Changes in v2: - Add David's Reviewed-by
---
 drivers/net/xen-netfront.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index e031c94..281720f 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -733,7 +733,7 @@ static int xennet_get_responses(struct netfront_queue *queue,
 		if (unlikely(rx->status < 0 ||
 			     rx->offset + rx->status > PAGE_SIZE)) {
 			if (net_ratelimit())
-				dev_warn(dev, "rx->offset: %x, size: %u\n",
+				dev_warn(dev, "rx->offset: %u, size: %d\n",
 					 rx->offset, rx->status);
 			xennet_move_rx_slot(queue, skb, ref);
 			err = -EINVAL;
-- 
2.1.4
___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
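The effect of the specifier change is easy to demonstrate: an int16_t status of, say, -22 printed with %u comes out as a huge unsigned number, while %d keeps the sign. A small stand-alone illustration (plain userspace C, not kernel code):

```c
#include <stdint.h>
#include <stdio.h>

/* Render an int16_t status both ways.  With %u a negative status shows
 * up as a near-UINT_MAX value in the log; %d keeps it readable.  The
 * explicit cast for %u reproduces what passing the promoted value to a
 * %u specifier displays. */
static void format_status(int16_t status, char *as_d, char *as_u, size_t len)
{
    snprintf(as_d, len, "%d", status);                /* promoted to int */
    snprintf(as_u, len, "%u", (unsigned int)status);  /* the old output */
}
```

This is exactly why a negative rx->status printed with %u produced a meaningless log line before the patch.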
[Xen-devel] [PATCH v5 2/3] net/xen-netback: Remove unused code in xenvif_rx_action
The variables old_req_cons and ring_slots_used are assigned but never used since commit 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b ("xen-netback: always fully coalesce guest Rx packets"). Signed-off-by: Julien Grall julien.gr...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: net...@vger.kernel.org
---
Changes in v2: - Add Wei's Acked-by
---
 drivers/net/xen-netback/netback.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 0d25943..ba3ae30 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -515,14 +515,9 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 	while (xenvif_rx_ring_slots_available(queue, XEN_NETBK_RX_SLOTS_MAX) &&
 	       (skb = xenvif_rx_dequeue(queue)) != NULL) {
-		RING_IDX old_req_cons;
-		RING_IDX ring_slots_used;
-
 		queue->last_rx_time = jiffies;

-		old_req_cons = queue->rx.req_cons;
 		XENVIF_RX_CB(skb)->meta_slots_used = xenvif_gop_skb(skb, &npo, queue);
-		ring_slots_used = queue->rx.req_cons - old_req_cons;

 		__skb_queue_tail(&rxq, skb);
 	}
-- 
2.1.4
___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] Modified RTDS scheduler to use an event-driven model instead of polling.
Hi Dagaen, First, please try not to top-post on the ML. :-) As with the mail you just sent, it is better to reply inline to the comment instead of top-posting. :-) 2015-06-16 23:03 GMT-07:00 Dagaen Golomb dgol...@seas.upenn.edu: Thanks for the reply, budget enforcement in the scheduler timer makes sense. I think I have an idea of what he wants done now. Great! It is better to summarize your understanding of the design and send it to the ML to check whether your understanding is correct, as Chong did for the improved toolstack of the RTDS scheduler. This will save rounds of patches. Thanks, Meng --- Meng Xu PhD Student in Computer and Information Science University of Pennsylvania http://www.cis.upenn.edu/~mengxu/ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling
-Original Message- From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Wednesday, June 10, 2015 2:44 PM To: Wu, Feng Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin; Zhang, Yang Z; xen-devel@lists.xen.org; k...@xen.org Subject: Re: [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling On 08.05.15 at 11:07, feng...@intel.com wrote: --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -1711,6 +1711,131 @@ static void vmx_handle_eoi(u8 vector) __vmwrite(GUEST_INTR_STATUS, status); } +static void vmx_pi_desc_update(struct vcpu *v, int old_state) +{ +struct pi_desc *pi_desc = v-arch.hvm_vmx.pi_desc; +struct pi_desc old, new; +unsigned long flags; + +if ( !iommu_intpost ) +return; Considering how the adjustment to start_vmx(), this at best should be an ASSERT() (if a check is needed at all). +switch ( v-runstate.state ) +{ +case RUNSTATE_runnable: +case RUNSTATE_offline: +/* + * We don't need to send notification event to a non-running + * vcpu, the interrupt information will be delivered to it before + * VM-ENTRY when the vcpu is scheduled to run next time. + */ +pi_desc-sn = 1; + +/* + * If the state is transferred from RUNSTATE_blocked, + * we should set 'NV' feild back to posted_intr_vector, + * so the Posted-Interrupts can be delivered to the vCPU + * by VT-d HW after it is scheduled to run. + */ +if ( old_state == RUNSTATE_blocked ) +{ +do +{ +old.control = new.control = pi_desc-control; +new.nv = posted_intr_vector; +} +while ( cmpxchg(pi_desc-control, old.control, new.control) +!= old.control ); So why is it okay to do the SN update non-atomically when the vector update is done atomically? Also the curly braces do _not_ go on separate lines for do/while constructs. And then do you really need to atomically update the whole 64 bits here, rather than just the nv field? By not making it a bit field you could perhaps just write_atomic() it? 
Thinking more about this, maybe write_atomic() is not good for this, we need use the LOCK prefix when updating the posted-interrupt descriptor. Thanks, Feng + + /* +* Delete the vCPU from the related block list +* if we are resuming from blocked state +*/ + spin_lock_irqsave(per_cpu(blocked_vcpu_lock, + v-pre_pcpu), flags); + list_del(v-blocked_vcpu_list); + spin_unlock_irqrestore(per_cpu(blocked_vcpu_lock, + v-pre_pcpu), flags); +} +break; + +case RUNSTATE_blocked: +/* + * The vCPU is blocked on the block list. + * Add the blocked vCPU on the list of the + * vcpu-pre_pcpu, which is the destination + * of the wake-up notification event. + */ +v-pre_pcpu = v-processor; Is latching this upon runstate change really enough? I.e. what about the v-processor changes that sched_move_domain() or individual schedulers do? Or if it really just matters on which CPU's blocked list the vCPU is (while its -processor changing subsequently doesn't matter) I'd like to see the field named more after its purpose (e.g. pi_block_cpu; list and lock should btw also have a connection to PI in their names). In the end, if the placement on a list followed v-processor, you would likely get away without the extra new field. Are there synchronization constraints speaking against such a model? +spin_lock_irqsave(per_cpu(blocked_vcpu_lock, + v-pre_pcpu), flags); +list_add_tail(v-blocked_vcpu_list, + per_cpu(blocked_vcpu, v-pre_pcpu)); +spin_unlock_irqrestore(per_cpu(blocked_vcpu_lock, + v-pre_pcpu), flags); + +do +{ +old.control = new.control = pi_desc-control; + +/* + * We should not block the vCPU if + * an interrupt was posted for it. + */ + +if ( old.on == 1 ) I think I said so elsewhere, but just in case I didn't: Please don't compare boolean fields with 1 - e.g. in the case here if ( old.on ) suffices. +{ +/* + * The vCPU will be removed from the block list + * during its state transferring from RUNSTATE_blocked + * to RUNSTATE_runnable after the following tasklet + * is scheduled to run. 
+ */ Precise comments please - I don't think the scheduling of the tasklet makes any difference; I suppose it's the execution of it that does.
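The atomicity question raised above (why update the whole 64-bit control word when only the NV field changes, and whether SN can be flipped non-atomically alongside it) can be illustrated with a small userspace model. The field layout and names below are simplified stand-ins, not the exact VT-d descriptor layout:

```c
#include <stdint.h>

/* Simplified model of the posted-interrupt descriptor control word:
 * bit 0 = ON, bit 1 = SN, bits 8-15 = NV.  Illustrative layout only. */
#define PI_ON       (1ULL << 0)
#define PI_SN       (1ULL << 1)
#define PI_NV_SHIFT 8
#define PI_NV_MASK  (0xffULL << PI_NV_SHIFT)

struct pi_desc {
    volatile uint64_t control;
};

/* Atomically replace the NV field, leaving ON/SN untouched, using the
 * same cmpxchg-retry pattern as the hunk under review. */
static void pi_set_nv(struct pi_desc *pi, uint8_t vector)
{
    uint64_t old, new;

    do {
        old = pi->control;
        new = (old & ~PI_NV_MASK) | ((uint64_t)vector << PI_NV_SHIFT);
    } while (__sync_val_compare_and_swap(&pi->control, old, new) != old);
}
```

The point of the review comment is that a plain `pi_desc->sn = 1` store on the same word races with concurrent updates of other fields, whereas the cmpxchg loop preserves them; either both updates go through such a loop (or a locked bit operation), or neither needs to.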
Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling
-Original Message- From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Tuesday, June 16, 2015 5:25 PM To: Wu, Feng Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin; Zhang, Yang Z; xen-devel@lists.xen.org; k...@xen.org Subject: RE: [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling On 16.06.15 at 10:56, feng...@intel.com wrote: -Original Message- From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Tuesday, June 16, 2015 4:29 PM To: Wu, Feng Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin; Zhang, Yang Z; xen-devel@lists.xen.org; k...@xen.org Subject: RE: [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling On 16.06.15 at 10:13, feng...@intel.com wrote: From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Tuesday, June 16, 2015 3:53 PM On 16.06.15 at 02:17, feng...@intel.com wrote: From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Wednesday, June 10, 2015 2:44 PM On 08.05.15 at 11:07, feng...@intel.com wrote: + + /* +* Delete the vCPU from the related block list +* if we are resuming from blocked state +*/ + spin_lock_irqsave(per_cpu(blocked_vcpu_lock, + v-pre_pcpu), flags); + list_del(v-blocked_vcpu_list); + spin_unlock_irqrestore(per_cpu(blocked_vcpu_lock, + v-pre_pcpu), flags); +} +break; + +case RUNSTATE_blocked: +/* + * The vCPU is blocked on the block list. + * Add the blocked vCPU on the list of the + * vcpu-pre_pcpu, which is the destination + * of the wake-up notification event. + */ +v-pre_pcpu = v-processor; Is latching this upon runstate change really enough? I.e. what about the v-processor changes that sched_move_domain() or individual schedulers do? Or if it really just matters on which CPU's blocked list the vCPU is (while its -processor changing subsequently doesn't matter) I'd like to see the field named more after its purpose (e.g. pi_block_cpu; list and lock should btw also have a connection to PI in their names). Yes, It doesn't matter if vCPU changes. 
The key point is that we put the vCPU on a pCPU list and we change the NDST field to this pCPU, then the wakeup notification event will get there. You are right, I need to rename them to reflect the real purpose of it. In the end, if the placement on a list followed v-processor, you would likely get away without the extra new field. Are there synchronization constraints speaking against such a model? I don't understand this quit well. Do you mean using 'v-processor' as the pCPU list for the blocked vCPUs? Then what about 'v-processor' changes, seems we cannot handle this case. That was the question - does anything speak against such a model? Do you mean still using v-processor as the pCPU to store the blocked vCPU? Not to store, but to indicate, but yes, the question still is regarding the need for the new field. I'm fine with adding the field if there is a proper explanation for it being needed; I'm not going to agree to add new fields when existing ones can serve the purpose. Fair enough. The basic reason for adding this new field is 'v-processor' can be changed behind, so we lost control of it. This is straightforward way I can think of right now. This is the obvious part you state. What needs to be explained is why it would be impossible (or a t least a bad idea) to move the vCPU between blocked lists when its v-processor changes. The key point here is that we put the blocked vCPU in the pCPU list and update the 'NDST' field in posted-interrupt descriptor for the wakeup notification event. Here are some reasons why using 'v-processor' as the pCPU to store the blocked vCPUs is not a good idea: There may be many places where 'v-processor' gets changed, we need to find all of them and in each place, see if the vCPU is currently blocked, if it is, remove it from the old pCPU list and insert to the new pCPU list (need spinlock operations), we need also update the 'NDST' filed for the wakeup notification event. 
Besides that, the individual scheduler can change 'v-processor', which means we may need to add some code specific to PI purpose to the scheduler code, I don't think this is a good way. On the other hand, if you use the current solution, we can get the following benefit: - The logic is clear and simple. - We only need to update the posted-interrupt descriptor before the vCPU is blocked, i.e. once the 'NDST' filed is updated before blocking the vCPU, we don't need to change it even 'v-processor' is changed. - We have a central
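The argument above for a dedicated field (named `pi_block_cpu` per Jan's suggestion) is that the vCPU must later be removed from the *same* per-pCPU list it was put on, even if v->processor has changed in between. A minimal userspace sketch of that bookkeeping, with invented names and a plain mutex standing in for the per-pCPU spinlock taken with interrupts disabled:

```c
#include <pthread.h>
#include <stddef.h>

/* Each "pCPU" has a lock-protected list of blocked vCPUs; a vCPU
 * records which pCPU's list it is on so unblocking finds the right
 * list regardless of later v->processor changes.  Illustrative only. */
struct vcpu {
    int pi_block_cpu;            /* pCPU whose list we are on, -1 if none */
    struct vcpu *next;
};

struct pcpu_blocked {
    pthread_mutex_t lock;
    struct vcpu *head;
};

static void pi_block(struct pcpu_blocked *cpus, int cpu, struct vcpu *v)
{
    pthread_mutex_lock(&cpus[cpu].lock);
    v->pi_block_cpu = cpu;       /* latch the list we joined */
    v->next = cpus[cpu].head;
    cpus[cpu].head = v;
    pthread_mutex_unlock(&cpus[cpu].lock);
}

static void pi_unblock(struct pcpu_blocked *cpus, struct vcpu *v)
{
    struct pcpu_blocked *b = &cpus[v->pi_block_cpu];

    pthread_mutex_lock(&b->lock);
    for (struct vcpu **p = &b->head; *p; p = &(*p)->next) {
        if (*p == v) {
            *p = v->next;        /* unlink from the latched list */
            break;
        }
    }
    v->pi_block_cpu = -1;
    pthread_mutex_unlock(&b->lock);
}
```

The alternative model Jan asks about would instead migrate the vCPU between lists whenever v->processor changes, which is what the "many places where v->processor gets changed" objection is weighing against.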
Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges
On 2015/6/17 16:05, Jan Beulich wrote: On 17.06.15 at 09:54, tiejun.c...@intel.com wrote: On 2015/6/17 15:19, Jan Beulich wrote: On 17.06.15 at 09:10, tiejun.c...@intel.com wrote: Yeah, this may waste some spaces in this worst case but I this think this can guarantee our change don't impact on the original expectation, right? Some space may be multiple Gb (e.g. the frame buffer of a graphics Sure. card), which is totally unacceptable. But then I don't understand what's your way. How can we fit all pci devices just with the smallest power-of-2 region enclosing the reserved device memory? For example, the whole pci memory is sitting at [0xa000, 0xa200]. And there are two PCI devices, A and B. Note each device needs to be allocated with 0x100. So if without concerning RMRR, A. [0xa000,0xa100] B. [0xa100,0xa200] But if one RMRR resides at [0xa0f0, 0xa1f0] which obviously generate its own alignment with 0x100. So the pci memory is expended as [0xa000, 0xa300], right? Then actually the whole pci memory can be separated three segments like, #1. [0xa000, 0xa0f0] #2. [0xa0f0, 0xa1f0] - RMRR would occupy #3. [0xa1f0, 0xa300] So just #3 can suffice to allocate but just for one device, right? Right, i.e. this isn't even sufficient - you need [a000,a3ff] to fit everything (but of course you can put smaller BARs into the unused ranges [a000,a0ef] and [a1f0,a1ff]). Yes, I knew there's this sort of hole that we should use efficiently as you said. And I also thought about this way previously but current pci allocation framework isn't friend to implement this easily, /* Assign iomem and ioport resources in descending order of size. */ for ( i = 0; i nr_bars; i++ ) { I mean it isn't easy to calculate what's the most sufficient size in advance, and its also difficult to find to fit a appropriate pci device into those holes, so see below, That's why I said it's not going to be tricky to get all corner cases right _and_ not use up more space than needed. 
ought to work out the smallest power-of-2 region enclosing the Okay. I remember the smallest size of a given PCI I/O space is 8 bytes, and the smallest size of a PCI memory space is 16 bytes. So /* At least 16 bytes to align a PCI BAR size. */ uint64_t align = 16; reserved_start = memory_map.map[j].addr; reserved_size = memory_map.map[j].size; reserved_start = (reserved_star + align) ~(align - 1); reserved_size = (reserved_size + align) ~(align - 1); Is this correct? Simply aligning the region doesn't help afaict. You need to fit it with the other MMIO allocations. I guess you're saying just those mmio allocations conflicting with RMRR? But we don't know these exact addresses until we finalize to allocate them, right? That's the point - you need to allocate them _around_ the reserved regions. Something hits me to generate another idea, #1. Still allocate all devices as before. #2. Lookup all actual bars to check if they're conflicting RMRR We can skip these bars to keep zero. Then later it would make lookup easily. #3. Need to reallocate these conflicting bars. #3.1 Trying to reallocate them with the remaining resources #3.2 If the remaining resources aren't enough, we need to allocate them from high_mem_resource. I just feel this way may be easy and better. And even, this way also can help terminate the preexisting allocation failures, right? Thanks Tiejun ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [qemu-mainline test] 58648: regressions - FAIL
flight 58648 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58648/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64                   5 xen-build      fail REGR. vs. 58551
 build-i386                    5 xen-build      fail REGR. vs. 58551
 build-amd64-xsm               5 xen-build      fail REGR. vs. 58551
 build-i386-xsm                5 xen-build      fail REGR. vs. 58551
 build-armhf                   5 xen-build      fail REGR. vs. 58551
 build-armhf-xsm               5 xen-build      fail REGR. vs. 58551

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)  blocked n/a
 build-amd64-libvirt           1 build-check(1)  blocked n/a
 build-i386-libvirt            1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-pvh-intel 1 build-check(1)  blocked n/a
 test-amd64-i386-libvirt       1 build-check(1)  blocked n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)  blocked n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-xsm       1 build-check(1)  blocked n/a
 test-amd64-amd64-libvirt      1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-multivcpu 1 build-check(1)  blocked n/a
 test-amd64-amd64-pair         1 build-check(1)  blocked n/a
 test-amd64-i386-xl-xsm        1 build-check(1)  blocked n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)  blocked n/a
 test-amd64-amd64-xl           1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-sedf      1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-sedf-pin  1 build-check(1)  blocked n/a
 test-armhf-armhf-libvirt      1 build-check(1)  blocked n/a
 test-amd64-i386-xl            1 build-check(1)  blocked n/a
 test-armhf-armhf-xl           1 build-check(1)  blocked n/a
 test-armhf-armhf-xl-sedf      1 build-check(1)  blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1)  blocked n/a
 test-armhf-armhf-xl-multivcpu 1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1)  blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1)  blocked n/a
 build-armhf-libvirt           1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1)  blocked n/a
 test-amd64-i386-freebsd10-i386 1 build-check(1)  blocked n/a
 test-armhf-armhf-xl-cubietruck 1 build-check(1)  blocked n/a
 test-amd64-i386-pair          1 build-check(1)  blocked n/a
 test-amd64-i386-freebsd10-amd64 1 build-check(1)  blocked n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)  blocked n/a
 test-armhf-armhf-xl-sedf-pin  1 build-check(1)  blocked n/a
 test-armhf-armhf-xl-xsm       1 build-check(1)  blocked n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-winxpsp3 1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3 1 build-check(1)  blocked n/a

version targeted for testing:
 qemuu 93f6d1c16036aaf34055d16f54ea770fb8d6d280
baseline version:
 qemuu 0a2df857a7038c75379cc575de5d4be4c0ac629e

People who touched revisions under test:
 Alberto Garcia be...@igalia.com
 Alex Bennée alex.ben...@linaro.org
 Alexander Graf ag...@suse.de
 Andrew Jones drjo...@redhat.com
 Aurelien Jarno aurel...@aurel32.net
 Aurelio C. Remonda aurelioremo...@gmail.com
 Benoit Canet benoit.ca...@nodalink.com
 Benoît Canet benoit.ca...@nodalink.com
 Christian Borntraeger borntrae...@de.ibm.com
 Cole Robinson crobi...@redhat.com
 Dave Airlie airl...@redhat.com
 Eric Auger eric.au...@linaro.org
 Fam Zheng f...@redhat.com
 Gerd Hoffmann kra...@redhat.com
 Johan Karlsson johan.karls...@enea.com
 John Snow js...@redhat.com
 Kevin Wolf
Re: [Xen-devel] [PATCH v2] x86: synchronize PCI config space access decoding
On 16.06.15 at 20:26, andrew.coop...@citrix.com wrote: On 16/06/15 09:09, Jan Beulich wrote: On 15.06.15 at 17:32, andrew.coop...@citrix.com wrote: On 15/06/15 15:30, Jan Beulich wrote: @@ -2439,9 +2434,19 @@ struct hvm_ioreq_server *hvm_select_iore type = IOREQ_TYPE_PCI_CONFIG; addr = ((uint64_t)sbdf 32) | - CF8_ADDR_HI(cf8) | CF8_ADDR_LO(cf8) | (p-addr 3); +/* AMD extended configuration space access? */ +if ( CF8_ADDR_HI(cf8) + d-arch.x86_vendor == X86_VENDOR_AMD + d-arch.x86 = 0x10 d-arch.x86 = 0x17 ) +{ +uint64_t msr_val; + +if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) We now have several common paths which read this MSR looking for CF8_EXT. I think it would make sense to probe this on boot and have a cpu_has_amd_cf8_ext rather than repeatedly sampling an off-cpu MSR, although this would require synchronising it across all northbridges in emulate privileged op. Alternatively, it might just be better to unconditionally enable it during startup (as Linux does) and prevent dom0 from playing, which would avoid the need to synchronise updates to it. You just repeat what you said for v1, without taking into consideration my reply thereto: Us not using this method ourselves, we should honor and play by what Dom0 does. Sorry - I had completely forgotten that this was a v2, and had already asked this question. However, hvm_select_ioreq_server() it not a rare function to call, and I am still concerned with the overhead. And I can only repeat that the MSR isn't being accessed unless an apparent extended access is being seen. It turns out that MSR_AMD64_NB_CFG is unconditionally RAZ and has all writes discarded, so no HVM guest will ever be in a position to legitimately use AMD extended configuration access. Where have you found that? The register (named NB_CFG1 in newer families' BKGDs) is clearly r/w. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
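The decoding being discussed above (extracting the register offset from a port-CF8 value, including AMD's extended configuration space bits 27:24 that map to bits 11:8 of the offset) can be sketched as follows. The macro shapes mirror the ones in the quoted hunk, but this is an illustration, not Xen's exact code:

```c
#include <stdint.h>

#define CF8_BDF(cf8)     (((cf8) & 0x00ffff00u) >> 8)   /* bus/dev/func */
#define CF8_ADDR_LO(cf8) ((cf8) & 0x000000fcu)          /* reg bits 7:2 */
#define CF8_ADDR_HI(cf8) (((cf8) & 0x0f000000u) >> 16)  /* reg bits 11:8 */

/* Register offset within the device's config space; the high bits are
 * only honoured when the guest may use extended accesses (i.e. when
 * ENABLE_CF8_EXT_CFG is set in the AMD NB_CFG MSR). */
static uint32_t cf8_to_reg(uint32_t cf8, int cf8_ext_enabled)
{
    uint32_t reg = CF8_ADDR_LO(cf8);

    if (cf8_ext_enabled)
        reg |= CF8_ADDR_HI(cf8);
    return reg;
}
```

This is why the patch only reads MSR_AMD64_NB_CFG when CF8_ADDR_HI(cf8) is non-zero: for ordinary accesses the extended bits are clear and the MSR read is skipped entirely.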
Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling
-Original Message- From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Wednesday, June 10, 2015 2:44 PM To: Wu, Feng Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin; Zhang, Yang Z; xen-devel@lists.xen.org; k...@xen.org Subject: Re: [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling On 08.05.15 at 11:07, feng...@intel.com wrote: --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -1711,6 +1711,131 @@ static void vmx_handle_eoi(u8 vector) __vmwrite(GUEST_INTR_STATUS, status); } +static void vmx_pi_desc_update(struct vcpu *v, int old_state) +{ +struct pi_desc *pi_desc = v-arch.hvm_vmx.pi_desc; +struct pi_desc old, new; +unsigned long flags; + +if ( !iommu_intpost ) +return; Considering how the adjustment to start_vmx(), this at best should be an ASSERT() (if a check is needed at all). +switch ( v-runstate.state ) +{ +case RUNSTATE_runnable: +case RUNSTATE_offline: +/* + * We don't need to send notification event to a non-running + * vcpu, the interrupt information will be delivered to it before + * VM-ENTRY when the vcpu is scheduled to run next time. + */ +pi_desc-sn = 1; + +/* + * If the state is transferred from RUNSTATE_blocked, + * we should set 'NV' feild back to posted_intr_vector, + * so the Posted-Interrupts can be delivered to the vCPU + * by VT-d HW after it is scheduled to run. + */ +if ( old_state == RUNSTATE_blocked ) +{ +do +{ +old.control = new.control = pi_desc-control; +new.nv = posted_intr_vector; +} +while ( cmpxchg(pi_desc-control, old.control, new.control) +!= old.control ); So why is it okay to do the SN update non-atomically when the vector update is done atomically? Also the curly braces do _not_ go on separate lines for do/while constructs. And then do you really need to atomically update the whole 64 bits here, rather than just the nv field? By not making it a bit field you could perhaps just write_atomic() it? 
+ + /* +* Delete the vCPU from the related block list +* if we are resuming from blocked state +*/ + spin_lock_irqsave(per_cpu(blocked_vcpu_lock, + v-pre_pcpu), flags); + list_del(v-blocked_vcpu_list); + spin_unlock_irqrestore(per_cpu(blocked_vcpu_lock, + v-pre_pcpu), flags); +} +break; + +case RUNSTATE_blocked: +/* + * The vCPU is blocked on the block list. + * Add the blocked vCPU on the list of the + * vcpu-pre_pcpu, which is the destination + * of the wake-up notification event. + */ +v-pre_pcpu = v-processor; Is latching this upon runstate change really enough? I.e. what about the v-processor changes that sched_move_domain() or individual schedulers do? Or if it really just matters on which CPU's blocked list the vCPU is (while its -processor changing subsequently doesn't matter) I'd like to see the field named more after its purpose (e.g. pi_block_cpu; list and lock should btw also have a connection to PI in their names). In the end, if the placement on a list followed v-processor, you would likely get away without the extra new field. Are there synchronization constraints speaking against such a model? +spin_lock_irqsave(per_cpu(blocked_vcpu_lock, + v-pre_pcpu), flags); +list_add_tail(v-blocked_vcpu_list, + per_cpu(blocked_vcpu, v-pre_pcpu)); +spin_unlock_irqrestore(per_cpu(blocked_vcpu_lock, + v-pre_pcpu), flags); + +do +{ +old.control = new.control = pi_desc-control; + +/* + * We should not block the vCPU if + * an interrupt was posted for it. + */ + +if ( old.on == 1 ) I think I said so elsewhere, but just in case I didn't: Please don't compare boolean fields with 1 - e.g. in the case here if ( old.on ) suffices. +{ +/* + * The vCPU will be removed from the block list + * during its state transferring from RUNSTATE_blocked + * to RUNSTATE_runnable after the following tasklet + * is scheduled to run. + */ Precise comments please - I don't think the scheduling of the tasklet makes any difference; I suppose it's the execution of it that does. 
+tasklet_schedule(&v->vcpu_wakeup_tasklet);
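The "do not block if an interrupt was already posted" check reviewed above can be modelled in userspace like this. The bit layout is illustrative (bit 0 = ON, bits 8-15 = NV), not the real descriptor format:

```c
#include <stdint.h>

#define ON_BIT   (1ULL << 0)
#define NV_SHIFT 8
#define NV_MASK  (0xffULL << NV_SHIFT)

/* Returns 1 if the vCPU may block (NV atomically switched to the
 * wakeup vector), or 0 if a posted interrupt arrived in the meantime
 * and blocking must be aborted - the race the cmpxchg loop closes. */
static int pi_try_block(volatile uint64_t *control, uint8_t wakeup_vec)
{
    uint64_t old, new;

    do {
        old = *control;
        if (old & ON_BIT)
            return 0;       /* interrupt already posted: don't block */
        new = (old & ~NV_MASK) | ((uint64_t)wakeup_vec << NV_SHIFT);
    } while (__sync_val_compare_and_swap(control, old, new) != old);

    return 1;
}
```

Note the ON check and the NV update happen under the same cmpxchg: re-reading `control` after a failed cmpxchg re-evaluates ON, so an interrupt posted between the read and the exchange cannot be lost - which is also why "if ( old.on )" suffices, per the review comment about not comparing booleans with 1.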
Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges
On 17.06.15 at 09:54, tiejun.c...@intel.com wrote: On 2015/6/17 15:19, Jan Beulich wrote: On 17.06.15 at 09:10, tiejun.c...@intel.com wrote: Yeah, this may waste some spaces in this worst case but I this think this can guarantee our change don't impact on the original expectation, right? Some space may be multiple Gb (e.g. the frame buffer of a graphics Sure. card), which is totally unacceptable. But then I don't understand what's your way. How can we fit all pci devices just with the smallest power-of-2 region enclosing the reserved device memory? For example, the whole pci memory is sitting at [0xa000, 0xa200]. And there are two PCI devices, A and B. Note each device needs to be allocated with 0x100. So if without concerning RMRR, A. [0xa000,0xa100] B. [0xa100,0xa200] But if one RMRR resides at [0xa0f0, 0xa1f0] which obviously generate its own alignment with 0x100. So the pci memory is expended as [0xa000, 0xa300], right? Then actually the whole pci memory can be separated three segments like, #1. [0xa000, 0xa0f0] #2. [0xa0f0, 0xa1f0] - RMRR would occupy #3. [0xa1f0, 0xa300] So just #3 can suffice to allocate but just for one device, right? Right, i.e. this isn't even sufficient - you need [a000,a3ff] to fit everything (but of course you can put smaller BARs into the unused ranges [a000,a0ef] and [a1f0,a1ff]). That's why I said it's not going to be tricky to get all corner cases right _and_ not use up more space than needed. ought to work out the smallest power-of-2 region enclosing the Okay. I remember the smallest size of a given PCI I/O space is 8 bytes, and the smallest size of a PCI memory space is 16 bytes. So /* At least 16 bytes to align a PCI BAR size. */ uint64_t align = 16; reserved_start = memory_map.map[j].addr; reserved_size = memory_map.map[j].size; reserved_start = (reserved_star + align) ~(align - 1); reserved_size = (reserved_size + align) ~(align - 1); Is this correct? Simply aligning the region doesn't help afaict. 
You need to fit it with the other MMIO allocations. I guess you're saying just those mmio allocations conflicting with RMRR? But we don't know these exact addresses until we finalize to allocate them, right? That's the point - you need to allocate them _around_ the reserved regions. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
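Jan's "allocate them _around_ the reserved regions" can be made concrete by splitting the MMIO window into the free segments left once the reserved ranges are carved out, then running the BAR allocator over each segment. The interface is invented for the example; reserved ranges are assumed sorted and non-overlapping:

```c
#include <stdint.h>

struct seg { uint64_t start, end; };   /* half-open [start, end) */

/* Write the free segments of [win_start, win_end) that remain after
 * removing the reserved ranges; returns the number of segments. */
static int free_segments(uint64_t win_start, uint64_t win_end,
                         const struct seg *resv, int nr_resv,
                         struct seg *out)
{
    int n = 0;
    uint64_t pos = win_start;

    for (int i = 0; i < nr_resv; i++) {
        uint64_t s = resv[i].start, e = resv[i].end;
        if (e <= pos || s >= win_end)
            continue;                    /* outside the window */
        if (s > pos)
            out[n++] = (struct seg){ pos, s };
        pos = e > pos ? e : pos;
    }
    if (pos < win_end)
        out[n++] = (struct seg){ pos, win_end };
    return n;
}
```

With the example from the thread - window [0xa0000000, 0xa3000000) and an RMRR at [0xa0f00000, 0xa1f00000) - this yields exactly the two usable segments discussed: one below and one above the reserved range, into which larger and smaller BARs must then be fitted.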
Re: [Xen-devel] [PATCH] Modified RTDS scheduler to use an event-driven model instead of polling.
[Let me just reply to this... Then I'll jump to the end of the thread] On Wed, 2015-06-17 at 01:24 -0400, Dagaen Golomb wrote: If replenishments were done by themsevles, then the scheduler may decide to wait for some period of time, and in this same time period a replenishment could change what should be executing. Well, the scheduler then would better *not* decide to wait for any period of time. It better act like this: as soon as a replenishment happens, and a new deadline is assigned to a vCPU, put the vCPU in the runqueue, in the proper position; and if such a newly replenished vCPU should now preempt any other, currently running, vCPU, 'tickle' the pCPU in question and let it reschedule, and pick up the newly replenished vCPU. Yes, this is an option. However, I thought this would actually be an option you would not like. How so... I've been arguing for this the whole time?!?! :-O I'm sure I've put down a sketch of what I think the replenishment function should do in my first or second email in the thread, and I'm sure that involved a call to runq_tickle() For the replenishment routine to know when to call the scheduler and when not to, it basically has to perform the the scheduler's logic, at which point if there is a change... why doesn't it just do it (i.e., be the scheduler) instead of adding logic and possibly overhead to call it. It has to call runq_tickle(), which is not the scheduler's logic, is the function that decides whether a pCPU should reschedule. runq_tickle(), right now, is called from within rt_vcpu_wake(), which makes perfect sense, and from within rt_context_saved(), which makes no sense. It should be called from within rt_vcpu_wake() and from the replenishment handling routine. Scheduling, to me, is __runq_pick() + the if() in rt_schedule() that checks whether scurr-cur_deadline=snext-cur_deadline. Anyway, I've zero interest in turning this into a fight over terminology... 
If you want to call runq_tickle() the scheduler, go ahead, it would just make communication a bit more difficult, but I'm up for the challenge! :-) Oh, BTW, while we're here, __runq_pick() being called from a bunch of places, outside of rt_schedule(), is also something I never liked (you can go check that in the archives), and I also blame the confusion between scheduling and replenishmets, for the fact that such a thing is necessary. I seriously hope this can be fixed too. Also, I'd like to point out that when I said things like make more intelligent decision handling, I meant exactly things like the above: using as much information as possible to prevent scheduler invocations. The current patch doesn't reschedule for any replenishment. How does it not? you're arming s_timer to fire at either next replenishment or budget enforcement point in time... Firstly, it doesn't even look at the run queue: replenishments here can only happen on miss. ... Oh, wait, I know! I know! I know! You mean it doesn't tickle at any replenishment, or it doesn't context switch at any replenishment, don't you? (See, I'm good at adapting to the new terminology! :-D) So, see? Transient violation of EDF is just there, no matter the approach! Of course, violation during the scheduler is possible for any implementation :( Yeah, well, it was you that said that which should stick to your design to prevent violations, which would have crept in if going with my solution. Bha, anyway, it's probably best to let go... I really don't see what you mean here. There won't be anything like that to take care of. Xen offers timers as an abstraction, and deals with them itself. You only need to take care of properly serializing rt_schedule() and the timer handling routine, for the code sections that require that. This is good to know, thanks. 
The question would then change to how does Xen handle such an occurrence, but I think this is irrelevant now that I know which side you had in mind - that is, that the replenishment handler would decide when the scheduler should be invoked. It should call runq_tickle(), yes. It would be a hard pitch for a gEDF scheduler with a similar accounting period where the gEDF policy could be violated. There's no accounting period of any sort. It's called overhead! This part was referring to if we had two timers without extensive communication, cooperation, or information/logic sharing. I'm in the dark about what this extensive communication, cooperation, or information/logic sharing could be, I have to admit. If it means that the replenishment handler should call runq_tickle(), then yes, let me state it once more: replenishment handling routine should call runq_tickle() (other places, with the only other exception of rt_vcpu_wake(), shouldn't). You also may know that there is an EDF implementation in mainline Linux, and you may would like to go have a look there too. I'm not including any example from there
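The model argued for in this thread - a replenishment handler that assigns the new deadline, re-queues the vCPU, and calls runq_tickle() only when preemption is actually needed - can be sketched as a toy. Everything here (structures, the flat `running` array standing in for per-pCPU current vCPUs) is invented for illustration and far simpler than the real RTDS code:

```c
#include <stdint.h>

struct tvcpu {
    uint64_t deadline;   /* absolute EDF deadline */
    int64_t  budget;     /* remaining budget */
};

/* Replenish v at time 'now': new deadline and full budget, then decide
 * whether to "tickle" a pCPU.  'running' holds the deadline of each
 * pCPU's current vCPU.  Returns the pCPU index to tickle, or -1. */
static int replenish(struct tvcpu *v, uint64_t now, uint64_t period,
                     uint64_t budget, const uint64_t *running, int nr_cpus)
{
    int worst = -1;

    v->deadline = now + period;
    v->budget = (int64_t)budget;

    /* Find the pCPU running the latest-deadline vCPU... */
    for (int i = 0; i < nr_cpus; i++)
        if (worst < 0 || running[i] > running[worst])
            worst = i;

    /* ...and preempt it only if the replenished vCPU is more urgent. */
    return running[worst] > v->deadline ? worst : -1;
}
```

This is the division of labour Dario describes: the handler performs the replenishment and the tickle decision, while picking the next vCPU and context switching remain in the schedule function proper.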
Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
On 06/15/2015 09:44 PM, Andrew Cooper wrote: From: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxl/Makefile|1 + tools/libxl/libxl_internal.h| 39 tools/libxl/libxl_stream_read.c | 485 +++ 3 files changed, 525 insertions(+) create mode 100644 tools/libxl/libxl_stream_read.c diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile index cc9c152..c71c5fe 100644 --- a/tools/libxl/Makefile +++ b/tools/libxl/Makefile @@ -94,6 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \ libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \ libxl_internal.o libxl_utils.o libxl_uuid.o \ libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \ + libxl_stream_read.o \ libxl_save_callout.o _libxl_save_msgs_callout.o \ libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y) LIBXL_OBJS += libxl_genid.o diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 101994f..4f33cb8 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -19,6 +19,8 @@ #include libxl_osdeps.h /* must come before any other headers */ +#include libxl_sr_stream_format.h + #include assert.h #include dirent.h #include errno.h @@ -3121,6 +3123,42 @@ typedef void libxl__domain_create_cb(libxl__egc *egc, libxl__domain_create_state*, int rc, uint32_t domid); +/* State for manipulating a libxl migration v2 stream */ +typedef struct libxl__stream_read_state libxl__stream_read_state; + +struct libxl__stream_read_state { +/* filled by the user */ +libxl__ao *ao; +int fd; +void (*completion_callback)(libxl__egc *egc, +libxl__domain_create_state *dcs, +int rc); +/* Private */ +int rc; +bool running; +libxl__datacopier_state dc; +size_t expected_len; +libxl_sr_hdr hdr; +libxl_sr_rec_hdr rec_hdr; +void *rec_body; +}; + 
+_hidden void libxl__stream_read_start(libxl__egc *egc, + libxl__stream_read_state *stream); + +_hidden void libxl__stream_read_continue(libxl__egc *egc, + libxl__stream_read_state *stream); + +_hidden void libxl__stream_read_abort(libxl__egc *egc, + libxl__stream_read_state *stream, int rc); + +static inline bool libxl__stream_read_inuse( +const libxl__stream_read_state *stream) +{ +return stream-running; +} + + struct libxl__domain_create_state { /* filled in by user */ libxl__ao *ao; @@ -3137,6 +3175,7 @@ struct libxl__domain_create_state { libxl__stub_dm_spawn_state dmss; /* If we're not doing stubdom, we use only dmss.dm, * for the non-stubdom device model. */ +libxl__stream_read_state srs; libxl__save_helper_state shs; /* necessary if the domain creation failed and we have to destroy it */ libxl__domain_destroy_state dds; diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c new file mode 100644 index 000..9cdaadf --- /dev/null +++ b/tools/libxl/libxl_stream_read.c @@ -0,0 +1,485 @@ +/* + * Copyright (C) 2015 Citrix Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as published + * by the Free Software Foundation; version 2.1 only. with the special + * exception on linking described in file LICENSE. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + */ + +#include libxl_osdeps.h /* must come before any other headers */ + +#include libxl_internal.h + +/* + * Infrastructure for reading and acting on the contents of a libxl migration + * stream. There are a lot of moving parts here. + * + * Entry points from outside: + * - libxl__stream_read_start() + * - Set up reading a stream from the start. 
+ * + * - libxl__stream_read_continue() + * - Set up reading the next record from a started stream. + * + * The principle loop functionality involves reading the stream header, then + * reading a record at time and acting upon it. It follows the callbacks: + * + * - stream_header_done() + * -
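The record loop described in the comment block above (read the stream header, then repeatedly read a record header, its body, and act on it) can be collapsed into a synchronous sketch. The real code drives this through datacopier callbacks; the record layout and type values here are invented for the example, not the actual migration v2 wire format:

```c
#include <stdint.h>
#include <string.h>

struct rec_hdr { uint32_t type; uint32_t length; };
#define REC_END 0u

/* Parse (type, length, body) records from buf until an END record;
 * returns the number of non-END records, or -1 on a truncated stream. */
static int read_stream(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int nrecs = 0;
    struct rec_hdr hdr;

    for (;;) {
        if (off + sizeof(hdr) > len)
            return -1;                   /* truncated record header */
        memcpy(&hdr, buf + off, sizeof(hdr));
        off += sizeof(hdr);
        if (hdr.type == REC_END)
            return nrecs;
        if (off + hdr.length > len)
            return -1;                   /* truncated record body */
        /* ... dispatch on hdr.type and act on the body here ... */
        off += hdr.length;
        nrecs++;
    }
}
```

In the asynchronous version, each "read" above becomes a datacopier request whose completion callback performs the next step - which is exactly the stream_header_done() / record_header_done() / record_body_done() chain the file comment enumerates.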
Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling
On 17.06.15 at 09:00, feng...@intel.com wrote: From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Wednesday, June 10, 2015 2:44 PM On 08.05.15 at 11:07, feng...@intel.com wrote: --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -142,6 +142,8 @@ struct vcpu int processor; +int pre_pcpu; This again doesn't belong into the common structure (and should be accompanied by a comment, and should be unsigned int). I added three new member to 'struct vcpu' in this series: - struct tasklet vcpu_wakeup_tasklet; - struct list_head blocked_vcpu_list; - int pre_pcpu; I am trying to find a place specific to vmx to put them in, but the only one I find is ' struct arch_vmx_struct ', however, I don't think this is a good place, since it contains all vt-x stuff defined in the SDM. Do you think 'struct hvm_vcpu' is the right place? No, arch_vmx_struct is. Taking active_list and active_cpu as example, the structure clearly isn't limited to fields defined by the SDM. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
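Following Jan's answer, the three new members would live in arch_vmx_struct rather than struct vcpu. A sketch of what that placement could look like - the surrounding struct, the list_head definition, and the comments are simplified stand-ins, not Xen's real declarations:

```c
#include <stdint.h>

struct list_head { struct list_head *next, *prev; };

struct arch_vmx_struct {
    /* ... existing VMX state (vmcs, msr state, etc.) ... */

    /* Posted-interrupt blocking support: */
    struct list_head pi_blocked_vcpu_list; /* entry on a per-pCPU PI list */
    unsigned int     pi_block_cpu;         /* pCPU whose list we are on */
    /* the wakeup tasklet from the proposed patch would sit here too */
};
```

This also addresses the earlier review point that the field should be unsigned int and carry a comment, and mirrors how active_list/active_cpu already show the struct is not limited to SDM-defined state.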
Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges
On 2015/6/16 17:40, Jan Beulich wrote: On 16.06.15 at 11:29, tiejun.c...@intel.com wrote: I'm trying to walk into this direction: /* * We'll skip all space overlapping with reserved memory later, * so we need to increase mmio_total to compensate them. */ for ( j = 0; j memory_map.nr_map ; j++ ) { uint64_t conflict_size = 0; if ( memory_map.map[j].type != E820_RAM ) { reserved_start = memory_map.map[j].addr; reserved_size = memory_map.map[j].size; reserved_end = reserved_start + reserved_size; if ( check_overlap(pci_mem_start, pci_mem_end - pci_mem_start, reserved_start, reserved_size) ) { /* * Calculate how much mmio range conflict with * reserved device memory. */ conflict_size += reserved_size; /* * But we may need to subtract those sizes beyond the * pci memory, [pci_mem_start, pci_mem_end]. */ if ( reserved_start pci_mem_start ) conflict_size -= (pci_mem_start - reserved_start); if ( reserved_end pci_mem_end ) conflict_size -= (reserved_end - pci_mem_end); } } if ( conflict_size ) { uint64_t conflict_size = max_t( uint64_t, conflict_size, max_bar_sz); conflict_size = ~(conflict_size - 1); mmio_total += conflict_size; } } This last thing goes in the right direction, but is complete overkill when you have a small reserved region and a huge BAR. You Yeah, this may waste some spaces in this worst case but I this think this can guarantee our change don't impact on the original expectation, right? ought to work out the smallest power-of-2 region enclosing the Okay. I remember the smallest size of a given PCI I/O space is 8 bytes, and the smallest size of a PCI memory space is 16 bytes. So /* At least 16 bytes to align a PCI BAR size. */ uint64_t align = 16; reserved_start = memory_map.map[j].addr; reserved_size = memory_map.map[j].size; reserved_start = (reserved_star + align) ~(align - 1); reserved_size = (reserved_size + align) ~(align - 1); Is this correct? reserved range (albeit there are tricky corner cases to consider). 
Yeah, its a little tricky since RMRR always owns a fixed start address, so we can't reorder them with all pci bars. I just think at least we should provide a correct solution now, then further look into what can be optimized. So I think we'd better get conflict_size with max(conflict_size, max_bar_sz), right? Thanks Tiejun ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
On 06/15/2015 09:44 PM, Andrew Cooper wrote: From: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxl/Makefile|1 + tools/libxl/libxl_internal.h| 39 tools/libxl/libxl_stream_read.c | 485 +++ 3 files changed, 525 insertions(+) create mode 100644 tools/libxl/libxl_stream_read.c diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile index cc9c152..c71c5fe 100644 --- a/tools/libxl/Makefile +++ b/tools/libxl/Makefile @@ -94,6 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \ libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \ libxl_internal.o libxl_utils.o libxl_uuid.o \ libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \ + libxl_stream_read.o \ libxl_save_callout.o _libxl_save_msgs_callout.o \ libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y) LIBXL_OBJS += libxl_genid.o diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 101994f..4f33cb8 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -19,6 +19,8 @@ #include libxl_osdeps.h /* must come before any other headers */ +#include libxl_sr_stream_format.h + #include assert.h #include dirent.h #include errno.h @@ -3121,6 +3123,42 @@ typedef void libxl__domain_create_cb(libxl__egc *egc, libxl__domain_create_state*, int rc, uint32_t domid); +/* State for manipulating a libxl migration v2 stream */ +typedef struct libxl__stream_read_state libxl__stream_read_state; + +struct libxl__stream_read_state { +/* filled by the user */ +libxl__ao *ao; +int fd; +void (*completion_callback)(libxl__egc *egc, +libxl__domain_create_state *dcs, +int rc); +/* Private */ +int rc; +bool running; +libxl__datacopier_state dc; +size_t expected_len; +libxl_sr_hdr hdr; +libxl_sr_rec_hdr rec_hdr; +void *rec_body; +}; + 
+_hidden void libxl__stream_read_start(libxl__egc *egc, + libxl__stream_read_state *stream); + +_hidden void libxl__stream_read_continue(libxl__egc *egc, + libxl__stream_read_state *stream); + +_hidden void libxl__stream_read_abort(libxl__egc *egc, + libxl__stream_read_state *stream, int rc); + +static inline bool libxl__stream_read_inuse( +const libxl__stream_read_state *stream) +{ +return stream-running; +} + + struct libxl__domain_create_state { /* filled in by user */ libxl__ao *ao; @@ -3137,6 +3175,7 @@ struct libxl__domain_create_state { libxl__stub_dm_spawn_state dmss; /* If we're not doing stubdom, we use only dmss.dm, * for the non-stubdom device model. */ +libxl__stream_read_state srs; libxl__save_helper_state shs; /* necessary if the domain creation failed and we have to destroy it */ libxl__domain_destroy_state dds; diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c new file mode 100644 index 000..9cdaadf --- /dev/null +++ b/tools/libxl/libxl_stream_read.c @@ -0,0 +1,485 @@ +/* + * Copyright (C) 2015 Citrix Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as published + * by the Free Software Foundation; version 2.1 only. with the special + * exception on linking described in file LICENSE. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + */ + +#include libxl_osdeps.h /* must come before any other headers */ + +#include libxl_internal.h + +/* + * Infrastructure for reading and acting on the contents of a libxl migration + * stream. There are a lot of moving parts here. + * + * Entry points from outside: + * - libxl__stream_read_start() + * - Set up reading a stream from the start. 
+ * + * - libxl__stream_read_continue() + * - Set up reading the next record from a started stream. + * + * The principle loop functionality involves reading the stream header, then + * reading a record at time and acting upon it. It follows the callbacks: + * + * - stream_header_done() + * -
Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges
On 2015/6/17 15:19, Jan Beulich wrote: On 17.06.15 at 09:10, tiejun.c...@intel.com wrote: On 2015/6/16 17:40, Jan Beulich wrote: On 16.06.15 at 11:29, tiejun.c...@intel.com wrote: I'm trying to walk into this direction: /* * We'll skip all space overlapping with reserved memory later, * so we need to increase mmio_total to compensate them. */ for ( j = 0; j memory_map.nr_map ; j++ ) { uint64_t conflict_size = 0; if ( memory_map.map[j].type != E820_RAM ) { reserved_start = memory_map.map[j].addr; reserved_size = memory_map.map[j].size; reserved_end = reserved_start + reserved_size; if ( check_overlap(pci_mem_start, pci_mem_end - pci_mem_start, reserved_start, reserved_size) ) { /* * Calculate how much mmio range conflict with * reserved device memory. */ conflict_size += reserved_size; /* * But we may need to subtract those sizes beyond the * pci memory, [pci_mem_start, pci_mem_end]. */ if ( reserved_start pci_mem_start ) conflict_size -= (pci_mem_start - reserved_start); if ( reserved_end pci_mem_end ) conflict_size -= (reserved_end - pci_mem_end); } } if ( conflict_size ) { uint64_t conflict_size = max_t( uint64_t, conflict_size, max_bar_sz); conflict_size = ~(conflict_size - 1); mmio_total += conflict_size; } } This last thing goes in the right direction, but is complete overkill when you have a small reserved region and a huge BAR. You Yeah, this may waste some spaces in this worst case but I this think this can guarantee our change don't impact on the original expectation, right? Some space may be multiple Gb (e.g. the frame buffer of a graphics Sure. card), which is totally unacceptable. But then I don't understand what's your way. How can we fit all pci devices just with the smallest power-of-2 region enclosing the reserved device memory? For example, the whole pci memory is sitting at [0xa000, 0xa200]. And there are two PCI devices, A and B. Note each device needs to be allocated with 0x100. So if without concerning RMRR, A. [0xa000,0xa100] B. 
[0xa100,0xa200] But if one RMRR resides at [0xa0f0, 0xa1f0] which obviously generate its own alignment with 0x100. So the pci memory is expended as [0xa000, 0xa300], right? Then actually the whole pci memory can be separated three segments like, #1. [0xa000, 0xa0f0] #2. [0xa0f0, 0xa1f0] - RMRR would occupy #3. [0xa1f0, 0xa300] So just #3 can suffice to allocate but just for one device, right? If I'm wrong please correct me. ought to work out the smallest power-of-2 region enclosing the Okay. I remember the smallest size of a given PCI I/O space is 8 bytes, and the smallest size of a PCI memory space is 16 bytes. So /* At least 16 bytes to align a PCI BAR size. */ uint64_t align = 16; reserved_start = memory_map.map[j].addr; reserved_size = memory_map.map[j].size; reserved_start = (reserved_star + align) ~(align - 1); reserved_size = (reserved_size + align) ~(align - 1); Is this correct? Simply aligning the region doesn't help afaict. You need to fit it with the other MMIO allocations. I guess you're saying just those mmio allocations conflicting with RMRR? But we don't know these exact addresses until we finalize to allocate them, right? Thanks Tiejun reserved range (albeit there are tricky corner cases to consider). Yeah, its a little tricky since RMRR always owns a fixed start address, so we can't reorder them with all pci bars. I just think at least we should provide a correct solution now, then further look into what can be optimized. So I think we'd better get conflict_size with max(conflict_size, max_bar_sz), right? As per above - no, this is not an option. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-4.2-testing test] 58616: trouble: blocked/broken/fail/pass
On 17.06.15 at 03:13, osst...@xenbits.xen.org wrote: flight 58616 xen-4.2-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/58616/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64-libvirt 3 host-install(3) broken in 58584 REGR. vs. 58411 Tests which are failing intermittently (not blocking): test-i386-i386-libvirt3 host-install(3) broken in 58584 pass in 58616 test-amd64-amd64-pair 3 host-install/src_host(3) broken in 58584 pass in 58616 test-amd64-i386-pair 4 host-install/dst_host(4) broken in 58584 pass in 58616 test-amd64-i386-qemuu-freebsd10-amd64 3 host-install(3) broken in 58584 pass in 58616 test-amd64-i386-xl-win7-amd64 3 host-install(3) broken in 58584 pass in 58616 test-amd64-i386-xl-winxpsp3-vcpus1 3 host-install(3) broken in 58584 pass in 58616 test-amd64-amd64-xl-qemut-winxpsp3 3 host-install(3) broken in 58584 pass in 58616 test-amd64-amd64-xl-multivcpu 3 host-install(3) broken in 58604 pass in 58616 test-i386-i386-xl 3 host-install(3) broken in 58604 pass in 58616 test-i386-i386-xl-sedf3 host-install(3) broken in 58604 pass in 58616 test-amd64-i386-xl-qemut-debianhvm-amd64 3 host-install(3) broken in 58604 pass in 58616 test-amd64-amd64-xl-qemut-debianhvm-amd64 3 host-install(3) broken in 58604 pass in 58616 test-i386-i386-xl-qemuu-winxpsp3 3 host-install(3) broken in 58604 pass in 58616 test-i386-i386-xl-qemut-winxpsp3 3 host-install(3) broken pass in 58604 Is there an explanation for all these recurring host install failures? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges
On 17.06.15 at 09:10, tiejun.c...@intel.com wrote: On 2015/6/16 17:40, Jan Beulich wrote: On 16.06.15 at 11:29, tiejun.c...@intel.com wrote: I'm trying to walk into this direction: /* * We'll skip all space overlapping with reserved memory later, * so we need to increase mmio_total to compensate them. */ for ( j = 0; j memory_map.nr_map ; j++ ) { uint64_t conflict_size = 0; if ( memory_map.map[j].type != E820_RAM ) { reserved_start = memory_map.map[j].addr; reserved_size = memory_map.map[j].size; reserved_end = reserved_start + reserved_size; if ( check_overlap(pci_mem_start, pci_mem_end - pci_mem_start, reserved_start, reserved_size) ) { /* * Calculate how much mmio range conflict with * reserved device memory. */ conflict_size += reserved_size; /* * But we may need to subtract those sizes beyond the * pci memory, [pci_mem_start, pci_mem_end]. */ if ( reserved_start pci_mem_start ) conflict_size -= (pci_mem_start - reserved_start); if ( reserved_end pci_mem_end ) conflict_size -= (reserved_end - pci_mem_end); } } if ( conflict_size ) { uint64_t conflict_size = max_t( uint64_t, conflict_size, max_bar_sz); conflict_size = ~(conflict_size - 1); mmio_total += conflict_size; } } This last thing goes in the right direction, but is complete overkill when you have a small reserved region and a huge BAR. You Yeah, this may waste some spaces in this worst case but I this think this can guarantee our change don't impact on the original expectation, right? Some space may be multiple Gb (e.g. the frame buffer of a graphics card), which is totally unacceptable. ought to work out the smallest power-of-2 region enclosing the Okay. I remember the smallest size of a given PCI I/O space is 8 bytes, and the smallest size of a PCI memory space is 16 bytes. So /* At least 16 bytes to align a PCI BAR size. 
*/ uint64_t align = 16; reserved_start = memory_map.map[j].addr; reserved_size = memory_map.map[j].size; reserved_start = (reserved_star + align) ~(align - 1); reserved_size = (reserved_size + align) ~(align - 1); Is this correct? Simply aligning the region doesn't help afaict. You need to fit it with the other MMIO allocations. reserved range (albeit there are tricky corner cases to consider). Yeah, its a little tricky since RMRR always owns a fixed start address, so we can't reorder them with all pci bars. I just think at least we should provide a correct solution now, then further look into what can be optimized. So I think we'd better get conflict_size with max(conflict_size, max_bar_sz), right? As per above - no, this is not an option. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream
On Tue, 2015-06-16 at 16:53 +0100, Andrew Cooper wrote: On 16/06/15 16:03, Ian Campbell wrote: On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: when signalled to do so by libxl__remus_domain_checkpoint_callback() I think I saw that Remus wasn't currently working, so I'll let you and Hongyang thrash something out before I spend too much effort reviewing these last few RFC bits. Unless you think it is worth my having a look now? Remus was broken by patch 19 in the series, and this patch forms part of fixing it again. I can't find a way of fixing the layering violation in both plain migration and Remus in a readable, bisectable way. Remus requires identical source and destination toolstacks, and the Remus maintainers are happy enough with the 'break it and fix it up in the same series' approach. Now that the series is complete, there is some shuffling room to reduce the window of breakage, but short of folding patches 19, 21, 23-25 together, Remus will break. The report I was referring to, which I thought I'd seen, said that Remus was still broken even after the complete series was applied, i.e. there was still more to be done. I'm happy with the transient breakage in this series on this occasion, but I was proposing not to review until Remus was thought to be working OK at the end. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
On 06/15/2015 09:44 PM, Andrew Cooper wrote: [...] + +static void stream_success(libxl__egc *egc, + libxl__stream_write_state *stream); +static void stream_failed(libxl__egc *egc, + libxl__stream_write_state *stream, int ret); +static void stream_done(libxl__egc *egc, +libxl__stream_write_state *stream); + +static void check_stream_finished(libxl__egc *egc, + libxl__domain_suspend_state *dcs, s/dcs/dss/ + int rc, const char *what); + +/* Event callbacks for plain VM. */ +static void stream_header_done(libxl__egc *egc, + libxl__datacopier_state *dc, + int onwrite, int errnoval); +static void libxc_header_done(libxl__egc *egc, + libxl__datacopier_state *dc, + int onwrite, int errnoval); +/* libxl__xc_domain_save_done() lives here, event-order wise. */ +static void write_toolstack_record(libxl__egc *egc, + libxl__stream_write_state *stream); +static void toolstack_record_done(libxl__egc *egc, + libxl__datacopier_state *dc, + int onwrite, int errnoval); +static void write_emulator_record(libxl__egc *egc, + libxl__stream_write_state *stream); +static void emulator_body_done(libxl__egc *egc, + libxl__datacopier_state *dc, + int onwrite, int errnoval); +static void emulator_padding_done(libxl__egc *egc, + libxl__datacopier_state *dc, + int onwrite, int errnoval); +static void write_end_record(libxl__egc *egc, + libxl__stream_write_state *stream); +static void end_record_done(libxl__egc *egc, +libxl__datacopier_state *dc, +int onwrite, int errnoval); + +void libxl__stream_write_start(libxl__egc *egc, + libxl__stream_write_state *stream) +{ +libxl__datacopier_state *dc = stream-dc; +STATE_AO_GC(stream-ao); +struct libxl_sr_hdr hdr = { 0 }; +int ret = 0; + +assert(!stream-running); +stream-running = true; + +memset(dc, 0, sizeof(*dc)); +dc-readwhat = ; +dc-copywhat = suspend header; +dc-writewhat = save/migration stream; +dc-ao = ao; +dc-readfd = -1; +dc-writefd = stream-fd; +dc-maxsz = INT_MAX; +dc-bytes_to_read = INT_MAX; +dc-callback = stream_header_done; + +ret = 
libxl__datacopier_start(dc); +if (ret) +goto err; + +hdr.ident = htobe64(RESTORE_STREAM_IDENT); +hdr.version = htobe32(RESTORE_STREAM_VERSION); +hdr.options = htobe32(0); + +libxl__datacopier_prefixdata(egc, dc, hdr, sizeof(hdr)); +return; + + err: +assert(ret); +stream_failed(egc, stream, ret); +} + +void libxl__stream_write_abort(libxl__egc *egc, + libxl__stream_write_state *stream, int rc) +{ +stream_failed(egc, stream, rc); +} + +static void stream_success(libxl__egc *egc, libxl__stream_write_state *stream) +{ +stream-rc = 0; +stream-running = false; + +stream_done(egc, stream); +} + +static void stream_failed(libxl__egc *egc, + libxl__stream_write_state *stream, int rc) +{ +assert(rc); +stream-rc = rc; + +if (stream-running) { +stream-running = false; +stream_done(egc, stream); +} +} + +static void stream_done(libxl__egc *egc, +libxl__stream_write_state *stream) +{ +libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws); + +assert(!stream-running); + +check_stream_finished(egc, dss, stream-rc, stream); +} + +static void check_stream_finished(libxl__egc *egc, + libxl__domain_suspend_state *dss, + int rc, const char *what) +{ +libxl__stream_write_state *stream = dss-sws; +STATE_AO_GC(dss-ao); + +LOG(INFO, Task '%s' joining (rc %d), what, rc); + +if (rc !stream-joined_rc) { +bool skip = false; +/* First reported failure from joining tasks. Tear everything down */ +stream-joined_rc = rc; + +if (libxl__stream_write_inuse(dss-sws)) { +skip = true; +libxl__stream_write_abort(egc, dss-sws, rc); +} + +if (libxl__save_helper_inuse(dss-shs)) { +skip = true; +libxl__save_helper_abort(egc, dss-shs); +} + +/* There is at least one more active task to join - wait for its + callback */ +if ( skip ) +return; +} + +if (libxl__stream_write_inuse(dss-sws)) +LOG(DEBUG, stream still in use); +else if (libxl__save_helper_inuse(dss-shs)) +
Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling
On 17.06.15 at 08:54, feng...@intel.com wrote: From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Wednesday, June 10, 2015 2:44 PM On 08.05.15 at 11:07, feng...@intel.com wrote: +switch ( v->runstate.state ) +{ +case RUNSTATE_runnable: +case RUNSTATE_offline: +/* + * We don't need to send notification event to a non-running + * vcpu, the interrupt information will be delivered to it before + * VM-ENTRY when the vcpu is scheduled to run next time. + */ +pi_desc->sn = 1; + +/* + * If the state is transferred from RUNSTATE_blocked, + * we should set 'NV' feild back to posted_intr_vector, + * so the Posted-Interrupts can be delivered to the vCPU + * by VT-d HW after it is scheduled to run. + */ +if ( old_state == RUNSTATE_blocked ) +{ +do +{ +old.control = new.control = pi_desc->control; +new.nv = posted_intr_vector; +} +while ( cmpxchg(&pi_desc->control, old.control, new.control) +!= old.control ); So why is it okay to do the SN update non-atomically when the vector update is done atomically? Also the curly braces do _not_ go on separate lines for do/while constructs. And then do you really need to atomically update the whole 64 bits here, rather than just the nv field? By not making it a bit field you could perhaps just write_atomic() it? Thinking more about this, maybe write_atomic() is not good for this; we need to use the LOCK prefix when updating the posted-interrupt descriptor. The LOCK prefix can't be applied to simple loads and stores; they're implicitly atomic. Also - please remember to trim your replies. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 24/27] tools/libx{c, l}: [RFC] Introduce restore_callbacks.checkpoint()
On 06/15/2015 09:44 PM, Andrew Cooper wrote: And call it when a checkpoint record is found in the libxc stream. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- tools/libxc/include/xenguest.h |3 +++ tools/libxc/xc_sr_restore.c| 15 ++- tools/libxl/libxl_save_msgs_gen.pl |2 +- 3 files changed, 18 insertions(+), 2 deletions(-) diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h index 7581263..b0d27ed 100644 --- a/tools/libxc/include/xenguest.h +++ b/tools/libxc/include/xenguest.h @@ -102,6 +102,9 @@ struct restore_callbacks { int (*toolstack_restore)(uint32_t domid, const uint8_t *buf, uint32_t size, void* data); +/* A checkpoint record has been found in the stream */ +int (*checkpoint)(void* data); + /* to be provided as the last argument to each callback function */ void* data; }; diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c index 9e27dba..5e0f817 100644 --- a/tools/libxc/xc_sr_restore.c +++ b/tools/libxc/xc_sr_restore.c @@ -1,5 +1,7 @@ #include arpa/inet.h +#include assert.h + #include xc_sr_common.h /* @@ -472,7 +474,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec) static int handle_checkpoint(struct xc_sr_context *ctx) { xc_interface *xch = ctx-xch; -int rc = 0; +int rc = 0, ret; unsigned i; if ( !ctx-restore.checkpointed ) @@ -482,6 +484,13 @@ static int handle_checkpoint(struct xc_sr_context *ctx) goto err; } +ret = ctx-restore.callbacks-checkpoint(ctx-restore.callbacks-data); +if ( ret ) +{ +rc = -1; +goto err; +} + if ( ctx-restore.buffer_all_records ) { IPRINTF(All records buffered); @@ -735,6 +744,10 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom, ctx.restore.checkpointed = checkpointed_stream; ctx.restore.callbacks = callbacks; +/* Sanity checks for callbacks. 
*/ +if (checkpointed_stream) coding style +assert(callbacks-checkpoint); + IPRINTF(In experimental %s, __func__); DPRINTF(fd %d, dom %u, hvm %u, pae %u, superpages %d , checkpointed_stream %d, io_fd, dom, hvm, pae, diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl index 6b4b65e..36b279e 100755 --- a/tools/libxl/libxl_save_msgs_gen.pl +++ b/tools/libxl/libxl_save_msgs_gen.pl @@ -25,7 +25,7 @@ our @msgs = ( 'unsigned long', 'total'] ], [ 3, 'scxA', suspend, [] ], [ 4, 'scxA', postcopy, [] ], -[ 5, 'scxA', checkpoint, [] ], +[ 5, 'srcxA', checkpoint, [] ], [ 6, 'scxA', switch_qemu_logdirty, [qw(int domid unsigned enable)] ], #toolstack_save done entirely `by hand' -- Thanks, Yang. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v3][PATCH 07/16] hvmloader/pci: skip reserved ranges
On 17.06.15 at 10:26, tiejun.c...@intel.com wrote: Something hits me to generate another idea, #1. Still allocate all devices as before. #2. Lookup all actual bars to check if they're conflicting RMRR We can skip these bars to keep zero. Then later it would make lookup easily. #3. Need to reallocate these conflicting bars. #3.1 Trying to reallocate them with the remaining resources #3.2 If the remaining resources aren't enough, we need to allocate them from high_mem_resource. That's possible onyl for 64-bit BARs. I just feel this way may be easy and better. And even, this way also can help terminate the preexisting allocation failures, right? I think this complicates things rather than simplifying them: The more passes (and adjustments to previous settings) you do, the more error prone the whole logic will become. It may well be that you need to basically re-write what is there right now... Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 COLOPre 07/13] tools/libxl: Update libxl__domain_unpause() to support qemu-xen
On 06/16/2015 12:22 AM, Wei Liu wrote: On Mon, Jun 15, 2015 at 09:29:55AM +0800, Yang Hongyang wrote: On 06/12/2015 08:33 PM, Wei Liu wrote: On Mon, Jun 08, 2015 at 11:43:11AM +0800, Yang Hongyang wrote: Currently, libxl__domain_unpause() only supports qemu-xen-traditional. Update it to support qemu-xen. Signed-off-by: Wen Congyang we...@cn.fujitsu.com Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com This looks very similar to an existing function called libxl__domain_resume_device_model. Maybe you don't need to invent a new function. --- tools/libxl/libxl.c | 42 +- 1 file changed, 33 insertions(+), 9 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index d5691dc..5c843c2 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -933,10 +933,37 @@ out: return AO_INPROGRESS; } -int libxl__domain_unpause(libxl__gc *gc, uint32_t domid) +static int libxl__domain_unpause_device_model(libxl__gc *gc, uint32_t domid) { char *path; char *state; + +switch (libxl__device_model_version_running(gc, domid)) { +case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: { +uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid); + +path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state"); +state = libxl__xs_read(gc, XBT_NULL, path); +if (state != NULL && !strcmp(state, "paused")) { The only difference between your function and libxl__domain_unpause_device_model is the check for the state node. I think you can just add the check to libxl__domain_resume_device_model and use that function. I'm not sure whether changing the existing function's behavior will affect the existing callers; if there's no problem with doing so, I will do as you said in the next version. Qemu-dm currently has several states. libxl__domain_resume_device_model doesn't check the state and writes unconditionally. I think checking before writing would be an improvement. Fixed, thank you! Wei. -- Thanks, Yang.
___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [OSSTEST Nested PATCH v11 6/7] Compose the main recipe of nested test job
Thinking about this leads me to ask another question. Suppose that a bug causes the l1 to lock up completely. ts-logs-capture will attempt to hard reboot a locked-up host. If it can't fetch any logs, it calls target_reboot_hard($ho); What will that do if $ho refers to the l1? It relies on the power method. Does your nested l1 host have a power method? I'm afraid the l1 won't, like normal hosts, have power cycle operations. Maybe we need to simulate it? Perhaps arrange for an appropriate PowerMethod for hosts which are actually guests? Update. I think maybe we need to refactor the 'power_cycle' function in TestSupport.pm. I have not tried it; something like below? sub power_cycle ($) { my ($ho) = @_; +if (guest_var($ho,"enable_nestedhvm",'') =~ m/true/) { +guest_destroy($ho); +guest_create($ho); +guest_await_dhcp_tcp($ho,300); +guest_check_up($ho); +} else { $mjobdb->host_check_allocated($ho); die "refusing to set power state for host $ho->{Name}. possibly shared with other jobs\n" if $ho->{SharedMaybeOthers}; power_state($ho, 0); power_cycle_sleep($ho); power_state($ho, 1); +} } Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 3/3] net/xen-netback: Don't mix hexa and decimal with 0x in the printf format
On 16.06.15 at 21:10, julien.gr...@citrix.com wrote: Append 0x to all %x in order to avoid while reading when there is other decimal value in the log. To save on the space taken by the format strings you should prefer %#x over 0x%x (like we do in the hypervisor). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] Modified RTDS scheduler to use an event-driven model instead of polling.
On Tue, 2015-06-16 at 22:45 -0700, Meng Xu wrote: So here's my proposal, let's see if it fits what you want: I will leave this to Dario to answer if it fits what he wants. :-P 1.) The scheduler does not do any timing, Not really. rt_schedule does the budget enforcement. When the current VCPU runs out of budget, rt_schedule will be invoked by a timer (you can refer to the schedule function in xen/sched/schedule.c to see how the timer is armed/disarmed). When rt_schedule is invoked, it needs to: a) update the budget of the currently running VCPU and move it from the runq to the depleted queue; b) pick the current highest-priority VCPU from the runq and return it as the snext VCPU. So scheduling logic is still involved in the rt_schedule function. Correct, thanks Meng. 2.) replenishments are scheduled via timer at each [next] replenishment period. Each replenishment resorts the replenished vCPUs, and if any of the first #CPUs in the runq change, we tickle a pCPU for each change This is right. Yeah, sort of. I mean, I'm not sure what 'Each replenishment resorts the replenished vCPUs' means. If it means that a replenished vCPU is given a new deadline, and hence needs to move from the depleted queue to the runqueue, in a position within the runqueue that reflects its new deadline... then I agree. This requires holding the runqueue lock, but we need it anyway, for now. This may or may not require a pCPU to be tickled, and it's runq_tickle()'s job to decide whether that is the case and, if yes, which one. Also, about 'we tickle a pCPU for each change': first of all, as I just said, that's not _always_ the case, and we should avoid tickling when it's not necessary. Also, when it's necessary in principle, it makes little sense to tickle a pCPU twice in a very tight loop (i.e., without giving it the chance to reschedule).
So if, for instance, two different replenishments (no matter whether being serviced together, or in different invocations of the replenishment timer handler) both determine that pCPU Y is the one that should be tickled (because it's the one running the vCPU with the latest deadline, and both replenished vCPUs have more imminent deadlines than what's running on Y, but later than what's running everywhere else), there's no point in tickling twice. Not that doing it would be harmful; it just won't have any effect. What I mean, rather, is that you can use this information to keep runq_tickle() simple/quick. Note that this is pretty much what's happening already in runq_tickle(), thanks to the use of the prv->tickled bitmap, so the point here is making sure that 'we tickle a pCPU for each change' doesn't mean removing this, because I think that would be wrong. It's probably unlikely that this was what Dagaen meant, but I thought I had better state things clearly. :-)
Perhaps the fact that I hadn't stated that clearly enough until now was what was making it a bit harder for you to see what I meant with 'making things simple', etc. I hope I explained myself better now. "In order to achieve budget replenishment, we need another timer to invoke another function (budget_replenish function), that is dedicated to that." Yep, indeed. "I have to ask because I thought you wouldn't like the idea," I guess Dario won't like this idea. :-P (I'm kidding, but it should be the case.) You're right, it's not. :-) "and it's not really *that* far off from what we have now. It's a little restructuring so that replenishments occur before any scheduling activity and the handler checks if switching is needed (basically acting as the scheduler) and then calls tickle. Sounds like what you had in mind?" This need for a strict ordering between replenishments and scheduling is something that you're (Dagaen) insisting on a lot, while I really don't see why it would be a good thing. I've stated countless times that they're independent, which also means there's no
Re: [Xen-devel] [PATCH v4 3/3] net/xen-netback: Don't mix hexa and decimal with 0x in the printf format
On Wed, 2015-06-17 at 10:30 +0100, Julien Grall wrote: "I see different opinion on whether using 0x% or %#. As I plan to resend a version with the commit message update, shall I use %#?" I think it's mostly pointless bike-shedding over a saving measured in single-digit bytes; use whichever you like. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/2] libxc: fix xc_dom_load_elf_symtab
On Thu, Jun 11, 2015 at 06:05:20PM +0200, Roger Pau Monne wrote: xc_dom_load_elf_symtab was incorrectly trying to perform the same calculations already done in elf_parse_bsdsyms when load == 0 is used. Instead of trying to repeat the calculations, just trust what elf_parse_bsdsyms has already accounted for. This also simplifies the code by allowing the non-load case to return earlier. Signed-off-by: Roger Pau Monné roger@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com Acked-by: Wei Liu wei.l...@citrix.com
[Xen-devel] [linux-linus test] 58620: tolerable FAIL - PUSHED
flight 58620 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/58620/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-amd64-amd64-libvirt 3 host-install(3) broken in 58568 pass in 58620 test-amd64-i386-xl-qemuu-winxpsp3 3 host-install(3) broken in 58568 pass in 58620 test-amd64-i386-rumpuserxen-i386 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail pass in 58568 Regressions which are regarded as allowable (not blocking): test-amd64-i386-libvirt 11 guest-start fail like 58428 test-amd64-amd64-libvirt 11 guest-start fail like 58428 test-amd64-i386-freebsd10-amd64 9 freebsd-install fail like 58428 test-amd64-i386-freebsd10-i386 9 freebsd-install fail like 58428 test-amd64-i386-libvirt-xsm 11 guest-start fail like 58428 test-armhf-armhf-libvirt-xsm 11 guest-start fail like 58428 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 58428 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 58428 test-armhf-armhf-libvirt 11 guest-start fail like 58428 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pvh-intel 13 guest-saverestorefail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl-sedf-pin 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-sedf 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass version targeted for testing: 
linux0f57d86787d8b1076ea8f9cba2a46d534a27 baseline version: linuxdf5f4158415b6fc4a2d683c6fdc806be6da176bc People who touched revisions under test: Axel Lin axel@ingics.com Dan Williams dan.j.willi...@intel.com Daniel Verkamp daniel.verk...@intel.com Dave Airlie airl...@redhat.com Dave Jiang dave.ji...@intel.com David S. Miller da...@davemloft.net David Woodhouse david.woodho...@intel.com Eric Dumazet eduma...@google.com Erik Hugne erik.hu...@ericsson.com Govindarajulu Varadarajan _gov...@gmx.com Hauke Mehrtens ha...@hauke-m.de Herbert Xu herb...@gondor.apana.org.au Ingo Molnar mi...@kernel.org Jaedon Shin jaedon.s...@gmail.com Jeff Kirsher jeffrey.t.kirs...@intel.com Jeff Layton jeff.lay...@primarydata.com Jens Axboe ax...@fb.com Jerome Marchand jmarc...@redhat.com Johannes Berg johannes.b...@intel.com Jon Mason jdma...@kudzu.us Jurgen Kramer gtmkra...@xs4all.nl Kan Liang kan.li...@intel.com Krzysztof Kozlowski k.kozlow...@samsung.com Linus Torvalds torva...@linux-foundation.org Ludovic Desroches ludovic.desroc...@atmel.com Marcelo Ricardo Leitner marcelo.leit...@gmail.com Markos Chandras markos.chand...@imgtec.com Masanari Iida standby2...@gmail.com Maxime Ripard maxime.rip...@free-electrons.com Mel Gorman mgor...@suse.de Michael Buesch m...@bues.ch Ming Lei tom.leim...@gmail.com Neil Horman nhor...@tuxdriver.com NeilBrown ne...@suse.de NeilBrown ne...@suse.de Nikolay Aleksandrov ra...@blackwall.org Peter Zijlstra (Intel) pet...@infradead.org Peter Zijlstra pet...@infradead.org Rabin Vincent rabin.vinc...@axis.com Ralf Baechle r...@linux-mips.org Richard Cochran richardcoch...@gmail.com Richard Weinberger rich...@nod.at Robert Shearman rshea...@brocade.com Shaohua Li s...@fb.com Takashi Iwai ti...@suse.de Thomas Gleixner t...@linutronix.de Vinod Koul vinod.k...@intel.com Vlad Yasevich vyasev...@gmail.com jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt
Re: [Xen-devel] [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks
On 17/06/15 11:14, Ian Campbell wrote: On Tue, 2015-06-16 at 16:06 +0100, Andrew Cooper wrote: On 16/06/15 16:04, Ian Campbell wrote: On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: Libxl has now been fully adjusted not to need them. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com /me looks mournfully at the #28 shaped hole in this series which would nuke all the migration v1 code from libxc :-) I was going to slip that into v2. I didn't want to delay posting v1 for review, given the proximity of the 4.6 freeze. I think I will transcribe the description of the legacy protocol from xg_save_restore.h That would be good, thanks. and code up the legacy protocol in python. That would be above and beyond, but don't let me stop you ;-) Well - it is currently coded up with magic numbers in the conversion script. All I was planning to do was make some numbers less magic. ~Andrew
Re: [Xen-devel] [PATCH 1/2 linux-next] Revert ufs: fix deadlocks introduced by sb mutex merge
On Fri 05-06-15 23:03:48, Al Viro wrote: On Fri, Jun 05, 2015 at 07:50:18PM +0100, Al Viro wrote: Basically, we have i_mutex: file size changes, contents-affecting syscalls. Per-inode. truncate_mutex: block pointer changes. Per-inode. s_lock: block and inode bitmap changes. Per-filesystem. For UFS it's slightly more complicated due to the tail packing they are doing for short files, but most of that complexity is due to having that stuff handled way too deep in the call chain. Oh, lovely... commit 10e5dc Author: Evgeniy Dushistov dushis...@mail.ru Date: Sat Jul 1 04:36:24 2006 -0700 [PATCH] ufs: truncate should allocate block for last byte had removed the ->truncate() method and missed the fact that vmtruncate() had callers outside of ->setattr(), such as handling of ->prepare_write() partial failures and short copies on write(2) in general. Then we had a long and convoluted series of conversions that ended up with vmtruncate() lifted into ufs_write_failed() and replaced with truncate_pagecache() in there. Through all that, everybody (me included) had not noticed that we *still* do not free blocks allocated by ufs_write_begin() failing halfway through. While we are at it, ufs_write_end() ought to call ufs_write_failed() in case we'd been called after a short copy (and do the same block freeing). Joy... Folks, is anybody actively maintaining fs/ufs these days? Looking into the changelog, there wasn't anyone seriously looking into UFS for at least 5-6 years... Fabian did some cleanups recently but they were mostly cosmetic. So I don't think it's really maintained. OTOH clearly there are some users, since occasionally someone comes along with a bug report. Welcome to the world where we have ~50 on-disk / network filesystems :-| Honza -- Jan Kara j...@suse.cz SUSE Labs, CR
[Xen-devel] building error
When I try to build xen on fc21, I meet the following errors: drivers/net/ath/ath9k/ath9k_ar9003_phy.c: In function ‘ar9003_hw_ani_control’: drivers/net/ath/ath9k/ath9k_ar9003_phy.c:862:11: error: logical not is only applied to the left hand side of comparison [-Werror=logical-not-parentheses] if (!on != aniState->ofdmWeakSigDetectOff) { ^ drivers/net/ath/ath9k/ath9k_ar9003_phy.c:1016:14: error: logical not is only applied to the left hand side of comparison [-Werror=logical-not-parentheses] if (!is_on != aniState->mrcCCKOff) { ^ cc1: all warnings being treated as errors bin/rules/drivers/net/ath/ath9k/ath9k_ar9003_phy.c.r:3: recipe for target 'bin/ath9k_ar9003_phy.o' failed make[7]: *** [bin/ath9k_ar9003_phy.o] Error 1 make[7]: *** Waiting for unfinished jobs [BUILD] bin/nvsvpd.o drivers/net/ath/ath9k/ath9k_ar5008_phy.c: In function ‘ar5008_hw_ani_control_old’: drivers/net/ath/ath9k/ath9k_ar5008_phy.c:1144:11: error: logical not is only applied to the left hand side of comparison [-Werror=logical-not-parentheses] if (!on != aniState->ofdmWeakSigDetectOff) { ^ drivers/net/ath/ath9k/ath9k_ar5008_phy.c: In function ‘ar5008_hw_ani_control_new’: drivers/net/ath/ath9k/ath9k_ar5008_phy.c:1310:11: error: logical not is only applied to the left hand side of comparison [-Werror=logical-not-parentheses] if (!on != aniState->ofdmWeakSigDetectOff) { ^ cc1: all warnings being treated as errors bin/rules/drivers/net/ath/ath9k/ath9k_ar5008_phy.c.r:3: recipe for target 'bin/ath9k_ar5008_phy.o' failed make[7]: *** [bin/ath9k_ar5008_phy.o] Error 1 make[7]: Leaving directory '/work/src/xen/tools/firmware/etherboot/ipxe/src' Makefile:28: recipe for target 'ipxe/src/bin/rtl8139.rom' failed make[6]: *** [ipxe/src/bin/rtl8139.rom] Error 2 I know my gcc has some problems, and will generate a wrong bin file (we have discussed it earlier). But this building error is a new problem. Do we have any plan to use a newer ipxe?
I updated ipxe to avoid the building error, and I met the following error: hw/virtio/virtio-rng.c: In function ‘virtio_rng_device_realize’: hw/virtio/virtio-rng.c:152:31: error: logical not is only applied to the left hand side of comparison [-Werror=logical-not-parentheses] if (!vrng->conf.period_ms > 0) { ^ cc1: all warnings being treated as errors It seems that qemu has some problems. Thanks Wen Congyang
Re: [Xen-devel] [PATCH v4 3/3] net/xen-netback: Don't mix hexa and decimal with 0x in the printf format
Hi Ian, On 17/06/2015 10:25, Ian Campbell wrote: On Tue, 2015-06-16 at 20:10 +0100, Julien Grall wrote: Append 0x to all %x in order to avoid while reading when there is other decimal value in the log. Also replace some of the hexadecimal print to decimal to uniformize the format with netfront. Signed-off-by: Julien Grall julien.gr...@citrix.com Cc: Wei Liu wei.l...@citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: net...@vger.kernel.org You meant s/Append/Prepend/, nonetheless: I noticed a missing word after avoid in the commit message too. I will update to: Prepend 0x to all %x in order to avoid confusion while reading when there is other decimal value in the log. [...]. Acked-by: Ian Campbell ian.campb...@citrix.com I see different opinion on whether using 0x% or %#. As I plan to resend a version with the commit message update, shall I use %#? Regards, -- Julien Grall
Re: [Xen-devel] [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
On 17/06/15 02:31, Yang Hongyang wrote: +default: +ret = ERROR_FAIL; +goto err; +} + +ret = libxl__domain_suspend_device_model(gc, dss); This is no longer needed, the suspend callback already called this function and the emulator context already saved to a file. This call will cause Primary's emulator stop under Remus. postcopy callback will resume primary. then in checkpoint callback, we shouldn't suspend device model. It is the result of copying how everything was done previously. I will drop it. ~Andrew
Re: [Xen-devel] [PATCH v2] x86: synchronize PCI config space access decoding
On 17.06.15 at 11:36, andrew.coop...@citrix.com wrote: On 17/06/15 07:29, Jan Beulich wrote: On 16.06.15 at 20:26, andrew.coop...@citrix.com wrote: It turns out that MSR_AMD64_NB_CFG is unconditionally RAZ and has all writes discarded, so no HVM guest will ever be in a position to legitimately use AMD extended configuration access. Where have you found that? The register (named NB_CFG1 in newer families' BKGDs) is clearly r/w. It is implemented as RAZ/write discard in the hvm msr intercept code, and appears to exist only to prevent the guest blowing up in a cross-vendor case. Ah, that's something _we_ do. But the MSR write being discarded doesn't mean a guest can't legitimately use extended accesses: I don't think you expect OSes to read back what they wrote? I.e. an OS enabling this functionality would read zero, write back the value modified to have the bit set, and go on assuming extended accesses are okay now. Jan
Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
On 06/17/2015 05:50 PM, Andrew Cooper wrote: On 17/06/15 08:57, Wen Congyang wrote: +/* Queue up reading the body. */ +size_t bytes_to_read; + +switch (rec_hdr->type) { +/* + * Emulator records want to retain the blob in the pipe, for a further + * datacopier call to move elsewhere. Just read the emulator header. + */ In this case, we should not call ROUNDUP(). +case REC_TYPE_EMULATOR_CONTEXT: +bytes_to_read = sizeof(struct libxl_sr_emulator_hdr); +break; + +default: +bytes_to_read = rec_hdr->length; +break; +} + +bytes_to_read = ROUNDUP(bytes_to_read, REC_ALIGN_ORDER); So, I think it is better to move ROUNDUP to the default case. Thanks Wen Congyang sizeof(struct libxl_sr_emulator_hdr) is cunningly of the appropriate order already. Yes, I suppose it is probably better to move the roundup into the default case and assert() appropriate alignment after the switch(). Do you mean the sub-header must be aligned? Thanks Wen Congyang ~Andrew
Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
On Wed, 2015-06-17 at 18:49 +0800, Wen Congyang wrote: On 06/17/2015 06:15 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 11:09 +0800, Wen Congyang wrote: +if (hdr->options & RESTORE_OPT_BIG_ENDIAN) { +ret = ERROR_FAIL; +LOG(ERROR, "Unable to handle big endian streams"); +goto err; I think it is better to check if the host is big endian or not. It's not, there are no big endian ports of Xen today. I think encoding that here is fine. IIRC, arm supports big endian. Do we only use arm+little endian? Currently, yes. There are plans afoot to support big endian _guests_ but not a big endian hypervisor (and by extension IMHO tools, or at least the migration stream, should remain LE). That would be a lot of faff for very little benefit IMHO. Ian.
Re: [Xen-devel] [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks
On Wed, 2015-06-17 at 11:43 +0100, Andrew Cooper wrote: On 17/06/15 11:14, Ian Campbell wrote: On Tue, 2015-06-16 at 16:06 +0100, Andrew Cooper wrote: On 16/06/15 16:04, Ian Campbell wrote: On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote: Libxl has now been fully adjusted not to need them. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com /me looks mournfully at the #28 shaped hole in this series which would nuke all the migration v1 code from libxc :-) I was going to slip that into v2. I didn't want to delay posting v1 for review, given the proximity of the 4.6 freeze. I think I will transcribe the description of the legacy protocol from xg_save_restore.h That would be good, thanks. and code up the legacy protocol in python. That would be above and beyond, but don't let me stop you ;-) Well - it is currently coded up with magic numbers in the conversion script. All I was planning to do was make some numbers less magic. That does sound nice, thanks. Ian.
Re: [Xen-devel] [PATCH v2 06/17] x86/hvm: revert 82ed8716b fix direct PCI port I/O emulation retry...
-Original Message- From: Paul Durrant [mailto:paul.durr...@citrix.com] Sent: 11 June 2015 16:43 To: xen-de...@lists.xenproject.org Cc: Paul Durrant; Keir (Xen.org); Jan Beulich; Andrew Cooper Subject: [PATCH v2 06/17] x86/hvm: revert 82ed8716b fix direct PCI port I/O emulation retry... ...and error handling NOTE: A straight reversion was not possible because of subsequent changes in the code so this at least partially a manual reversion. By limiting hvmemul_do_io_addr() to reps falling within the pages on which a reference has already been taken, we can guarantee that calls to hvm_copy_to/from_guest_phys() will not hit the HVMCOPY_gfn_paged_out or HVMCOPY_gfn_shared cases. Thus we can remove the retry logic from the intercept code and simplify it significantly. Normally hvmemul_do_io_addr() will only reference single page at a time. It will, however, take an extra page reference for I/O spanning a page boundary. It is still important to know, upon returning from x86_emulate(), whether the number of reps was reduced so the mmio_retry flag is retained for that purpose. Signed-off-by: Paul Durrant paul.durr...@citrix.com Unfortunately, during further testing with XenServer, I found an issue with this patch and also subsequent patches in the series. I will therefore be posting a v3 of the series. Paul Cc: Keir Fraser k...@xen.org Cc: Jan Beulich jbeul...@suse.com Cc: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/hvm/emulate.c | 83 -- -- xen/arch/x86/hvm/intercept.c | 52 + xen/include/asm-x86/hvm/vcpu.h |2 +- 3 files changed, 69 insertions(+), 68 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index 4af70b0..38b8956 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -60,6 +60,7 @@ static int hvmemul_do_io( .type = is_mmio ? 
IOREQ_TYPE_COPY : IOREQ_TYPE_PIO, .addr = addr, .size = size, +.count = *reps, .dir = dir, .df = df, .data = data, @@ -131,15 +132,6 @@ static int hvmemul_do_io( HVMIO_dispatched : HVMIO_awaiting_completion; vio->io_size = size; -/* - * When retrying a repeated string instruction, force exit to guest after - * completion of the retried iteration to allow handling of interrupts. - */ -if ( vio->mmio_retrying ) -*reps = 1; - -p.count = *reps; - if ( dir == IOREQ_WRITE ) { if ( !data_is_addr ) @@ -153,17 +145,9 @@ static int hvmemul_do_io( switch ( rc ) { case X86EMUL_OKAY: -case X86EMUL_RETRY: -*reps = p.count; p.state = STATE_IORESP_READY; -if ( !vio->mmio_retry ) -{ -hvm_io_assist(&p); -vio->io_state = HVMIO_none; -} -else -/* Defer hvm_io_assist() invocation to hvm_do_resume(). */ -vio->io_state = HVMIO_handle_mmio_awaiting_completion; +hvm_io_assist(&p); +vio->io_state = HVMIO_none; break; case X86EMUL_UNHANDLEABLE: { @@ -294,17 +278,67 @@ int hvmemul_do_io_addr( bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size, uint8_t dir, bool_t df, paddr_t ram_gpa) { -struct page_info *ram_page; +struct vcpu *v = current; +unsigned long ram_gmfn = paddr_to_pfn(ram_gpa); +struct page_info *ram_page[2]; +int nr_pages = 0; +unsigned long count; int rc; -rc = hvmemul_acquire_page(paddr_to_pfn(ram_gpa), &ram_page); + +rc = hvmemul_acquire_page(ram_gmfn, &ram_page[nr_pages]); if ( rc != X86EMUL_OKAY ) -return rc; +goto out; + +nr_pages++; + +/* Determine how many reps will fit within this page */ +for ( count = 0; count < *reps; count++ ) +{ +paddr_t start, end; + +if ( df ) +{ +start = ram_gpa - count * size; +end = ram_gpa + size - 1; +} +else +{ +start = ram_gpa; +end = ram_gpa + (count + 1) * size - 1; +} + +if ( paddr_to_pfn(start) != ram_gmfn || + paddr_to_pfn(end) != ram_gmfn ) +break; +} + +if ( count == 0 ) +{ +/* + * This access must span two pages, so grab a reference to + * the next page and do a single rep. + */ +rc = hvmemul_acquire_page(df ?
ram_gmfn - 1 : ram_gmfn + 1, + &ram_page[nr_pages]); +if ( rc != X86EMUL_OKAY ) +goto out; -rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 1, +nr_pages++; +count = 1; +} + +rc = hvmemul_do_io(is_mmio, addr, &count, size, dir, df, 1, (uint64_t)ram_gpa); +if ( rc ==
Re: [Xen-devel] [xen-4.2-testing test] 58616: trouble: blocked/broken/fail/pass
On Wed, 2015-06-17 at 09:17 +0100, Jan Beulich wrote: On 17.06.15 at 03:13, osst...@xenbits.xen.org wrote: flight 58616 xen-4.2-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/58616/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64-libvirt 3 host-install(3) broken in 58584 REGR. vs. 58411 Tests which are failing intermittently (not blocking): test-i386-i386-libvirt 3 host-install(3) broken in 58584 pass in 58616 test-amd64-amd64-pair 3 host-install/src_host(3) broken in 58584 pass in 58616 test-amd64-i386-pair 4 host-install/dst_host(4) broken in 58584 pass in 58616 test-amd64-i386-qemuu-freebsd10-amd64 3 host-install(3) broken in 58584 pass in 58616 test-amd64-i386-xl-win7-amd64 3 host-install(3) broken in 58584 pass in 58616 test-amd64-i386-xl-winxpsp3-vcpus1 3 host-install(3) broken in 58584 pass in 58616 test-amd64-amd64-xl-qemut-winxpsp3 3 host-install(3) broken in 58584 pass in 58616 test-amd64-amd64-xl-multivcpu 3 host-install(3) broken in 58604 pass in 58616 test-i386-i386-xl 3 host-install(3) broken in 58604 pass in 58616 test-i386-i386-xl-sedf 3 host-install(3) broken in 58604 pass in 58616 test-amd64-i386-xl-qemut-debianhvm-amd64 3 host-install(3) broken in 58604 pass in 58616 test-amd64-amd64-xl-qemut-debianhvm-amd64 3 host-install(3) broken in 58604 pass in 58616 test-i386-i386-xl-qemuu-winxpsp3 3 host-install(3) broken in 58604 pass in 58616 test-i386-i386-xl-qemut-winxpsp3 3 host-install(3) broken pass in 58604 Is there an explanation for all these recurring host install failures? merlot0 had forgotten its BIOS settings, which is now fixed, but seems to have taken out most flights in the last 24 hours. Ian.
Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling
-Original Message- From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Wednesday, June 17, 2015 3:56 PM To: Wu, Feng Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin; Zhang, Yang Z; xen-devel@lists.xen.org; k...@xen.org Subject: RE: [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling On 17.06.15 at 08:54, feng...@intel.com wrote: From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Wednesday, June 10, 2015 2:44 PM On 08.05.15 at 11:07, feng...@intel.com wrote: +switch ( v->runstate.state ) +{ +case RUNSTATE_runnable: +case RUNSTATE_offline: +/* + * We don't need to send notification event to a non-running + * vcpu, the interrupt information will be delivered to it before + * VM-ENTRY when the vcpu is scheduled to run next time. + */ +pi_desc->sn = 1; + +/* + * If the state is transferred from RUNSTATE_blocked, + * we should set 'NV' field back to posted_intr_vector, + * so the Posted-Interrupts can be delivered to the vCPU + * by VT-d HW after it is scheduled to run. + */ +if ( old_state == RUNSTATE_blocked ) +{ +do +{ +old.control = new.control = pi_desc->control; +new.nv = posted_intr_vector; +} +while ( cmpxchg(&pi_desc->control, old.control, new.control) +!= old.control ); So why is it okay to do the SN update non-atomically when the vector update is done atomically? Also the curly braces do _not_ go on separate lines for do/while constructs. And then do you really need to atomically update the whole 64 bits here, rather than just the nv field? By not making it a bit field you could perhaps just write_atomic() it? Thinking more about this, maybe write_atomic() is not good for this, we need to use the LOCK prefix when updating the posted-interrupt descriptor. The LOCK prefix can't be applied to simple loads and stores; they're implicitly atomic.
Yes, I know the LOCK prefix cannot be used together with 'mov', but my concern is: without LOCK, does it work well when this write_atomic() operation and hardware setting 'ON' occur at the same time? Hardware updates the 'ON' field like this (as listed in the Spec.): Hardware performs a coherent atomic read-modify-write operation of the posted-interrupt descriptor as follows: - Read contents of the Posted Interrupt Descriptor, claiming exclusive ownership of its hosting cache-line. If invalid programming of the Posted Interrupt Descriptor is detected, release ownership of the cache-line, and block the interrupt request. - If the above checks succeed, retrieve current values of Posted Interrupt Requests (PIR bits 255:0), Outstanding Notification (ON), Suppress Notification (SN), Notification Vector (NV), and Notification Destination (NDST) fields in the Posted Interrupt Descriptor. - Modify the following descriptor field values atomically: * Set bit in PIR corresponding to the Vector field value from the IRTE * Compute X = ((ON == 0) & (URG | (SN == 0))) * If (X == 1), Set ON field. - Promote the cache-line to be globally observable, so that the modifications are visible to other caching agents. Hardware may write back the cache-line any time after this step. If we don't use the LOCK prefix, the 'NV' field may be changed between hardware reading these fields and writing them back, right? But for 'NV', maybe this is not an issue, since hardware doesn't change it; it only checks 'ON', 'SN' and updates 'ON' if needed. And that's why we need to use the LOCK prefix to update 'SN', right? What are your thoughts on this? Sorry for the long description above; since the hardware guys advise me to use the LOCK prefix when updating the posted-interrupt descriptor, I just don't want to make any mistake here. Thanks, Feng Also - please remember to trim your replies. Jan
Re: [Xen-devel] [qemu-mainline test] 58648: regressions - FAIL
On Wed, 2015-06-17 at 07:42 +, osstest service user wrote: flight 58648 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/58648/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64 5 xen-build fail REGR. vs. 58551 build-i386 5 xen-build fail REGR. vs. 58551 build-amd64-xsm 5 xen-build fail REGR. vs. 58551 build-i386-xsm 5 xen-build fail REGR. vs. 58551 build-armhf 5 xen-build fail REGR. vs. 58551 build-armhf-xsm 5 xen-build fail REGR. vs. 58551 Seems to be a real upstream issue: install: cannot stat `/home/osstest/build.58648.build-amd64-xsm/xen/tools/qemu-xen-dir/pc-bios/vgabios-virtio.bin': No such file or directory
[Xen-devel] [xen-unstable test] 58618: regressions - FAIL
flight 58618 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/58618/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl-credit2 6 xen-boot fail REGR. vs. 58392 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-libvirt-xsm 11 guest-start fail REGR. vs. 58392 test-armhf-armhf-libvirt 11 guest-start fail blocked in 58392 test-amd64-i386-libvirt-xsm 11 guest-start fail like 58392 test-amd64-i386-libvirt 11 guest-start fail like 58392 test-amd64-amd64-libvirt-xsm 11 guest-start fail like 58392 test-amd64-amd64-libvirt 11 guest-start fail like 58392 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58392 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 58392 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl-sedf-pin 12 migrate-support-checkfail never pass test-armhf-armhf-xl-sedf 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass version targeted for testing: xen fcbfaf9d260adbdb9352d6300b9f63c4ed443d49 baseline version: xen 12e817e281034f5881f46e0e4f1d127820101a78 People who touched revisions under test: Andrew Cooper andrew.coop...@citrix.com Dario Faggioli dario.faggi...@citrix.com David Vrabel david.vra...@citrix.com Don Dugger donald.d.dug...@intel.com Don Slutz dsl...@verizon.com George Dunlap george.dun...@eu.citrix.com Ian Campbell 
ian.campb...@citrix.com Ian Jackson ian.jack...@eu.citrix.com Jan Beulich jbeul...@suse.com Juergen Gross jgr...@suse.com Kevin Tian kevin.t...@intel.com Olaf Hering o...@aepfle.de Roger Pau Monné roger@citrix.com Ross Lagerwall ross.lagerw...@citrix.com Wei Liu wei.l...@citrix.com jobs: build-amd64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-oldkern pass build-i386-oldkern pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl pass test-armhf-armhf-xl pass test-amd64-i386-xl pass test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass test-amd64-amd64-libvirt-xsm fail test-armhf-armhf-libvirt-xsm fail test-amd64-i386-libvirt-xsm fail test-amd64-amd64-xl-xsm pass test-armhf-armhf-xl-xsm pass test-amd64-i386-xl-xsm pass test-amd64-amd64-xl-pvh-amd fail test-amd64-i386-qemut-rhel6hvm-amd pass test-amd64-i386-qemuu-rhel6hvm-amd pass
Re: [Xen-devel] [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling
-----Original Message-----
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Wednesday, June 17, 2015 4:59 PM
To: Wu, Feng
Cc: andrew.coop...@citrix.com; george.dun...@eu.citrix.com; Tian, Kevin; Zhang, Yang Z; xen-devel@lists.xen.org; k...@xen.org
Subject: RE: [RFC v2 13/15] Update Posted-Interrupts Descriptor during vCPU scheduling

On 17.06.15 at 10:46, feng...@intel.com wrote:
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Wednesday, June 17, 2015 3:56 PM
On 17.06.15 at 08:54, feng...@intel.com wrote:
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Wednesday, June 10, 2015 2:44 PM
On 08.05.15 at 11:07, feng...@intel.com wrote:

+switch ( v->runstate.state )
+{
+case RUNSTATE_runnable:
+case RUNSTATE_offline:
+    /*
+     * We don't need to send notification event to a non-running
+     * vcpu, the interrupt information will be delivered to it before
+     * VM-ENTRY when the vcpu is scheduled to run next time.
+     */
+    pi_desc->sn = 1;
+
+    /*
+     * If the state is transferred from RUNSTATE_blocked,
+     * we should set 'NV' field back to posted_intr_vector,
+     * so the Posted-Interrupts can be delivered to the vCPU
+     * by VT-d HW after it is scheduled to run.
+     */
+    if ( old_state == RUNSTATE_blocked )
+    {
+        do
+        {
+            old.control = new.control = pi_desc->control;
+            new.nv = posted_intr_vector;
+        }
+        while ( cmpxchg(&pi_desc->control, old.control, new.control)
+                != old.control );

So why is it okay to do the SN update non-atomically when the vector update is done atomically? Also the curly braces do _not_ go on separate lines for do/while constructs. And then do you really need to atomically update the whole 64 bits here, rather than just the nv field? By not making it a bit field you could perhaps just write_atomic() it?

Thinking more about this, maybe write_atomic() is not good for this; we need to use the LOCK prefix when updating the posted-interrupt descriptor.

The LOCK prefix can't be applied to simple loads and stores; they're implicitly atomic.
Yes, I know the LOCK prefix cannot be used together with 'mov', but my concern is: without LOCK, does it work well when this write_atomic() operation and hardware setting 'ON' occur at the same time? Hardware updates the 'ON' field like this (which is listed in the Spec.):

Hardware performs a coherent atomic read-modify-write operation of the posted-interrupt descriptor as follows:
- Read contents of the Posted Interrupt Descriptor, claiming exclusive ownership of its hosting cache-line.

The cache line part here is the really important aspect. The CPU will do the same for LOCKed transactions as well as simple stores. But of course - I repeat this just to avoid any misunderstanding - this works only if the store covers _only_ nv (recall that I brought this up in context of trying to avoid using bitfields where possible).

Your clarification makes things clear. Thank you very much!

Thanks,
Feng

Sorry for the long description above; since the hardware guys advise me to use the LOCK prefix when updating the posted-interrupt descriptor, I just don't want to make any mistake here.

And that advice is correct if you fiddle with the descriptor in quantities covering more than the precise byte(s) you intend to modify.

Jan

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [libvirt test] 58119: regressions - FAIL
Ian Campbell writes (Re: [Xen-devel] [libvirt test] 58119: regressions - FAIL): merlot0 has just lost its BIOS settings and failed to reboot -- which might also imply some other interesting misconfiguration, Ian, do you think that is plausible? Not particularly plausible, I'm afraid, no. .oO{ I should get that host history report patch series finished. } Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Tue, 2015-06-16 at 18:16 +0100, Stefano Stabellini wrote: I wrote this before reading the rest of the thread with your alternative suggestion. Both approaches can work, maybe it is true that having the guest requesting mappings is better. And we should certainly do the same thing for PVH and ARM guests. If we have the guest requesting the mapping, obviously the hypercall implementation should check whether mapping the memory region has been previously allowed by the toolstack, that should still call xc_domain_iomem_permission. Whoever makes the initial decision/setup we still need to decide what to do about reads and writes to the device's BAR registers. Who deals with these and how, and if writes are allowed and if so how is this then reflected in the guest p2m. I would very much prefer ARM and x86 PVH to use the same approach to this. For x86 HVM I believe QEMU takes care of this, by emulating the reads/writes to CFG space and making hypercalls (domctls?) to update the p2m. There is no QEMU to do this in the ARM or x86 PVH cases. For x86 PV I believe pciback deals with the register reads/writes but it doesn't need to do anything wrt the p2m because that is a guest internal construct. This obviously doesn't apply to ARM or x86 PVH either since the p2m is managed by Xen in both those cases. So it seems that neither of the existing solutions are a good match for either ARM or x86 PVH. _If_ it were permissible to implement all BAR registers as r/o then this might simplify things, however I suspect this is not something we can get away with in the general case (i.e. a driver for a given PCI device _knows_ that BARN is writeable...). So I think we do need to allow guest writes to the BAR registers. I think it would be a bit strange (or even open potentially subtle (security?) issues) to require the guest to write the BAR register and to update the p2m to match as well. 
So I think updates to the BAR need to be reflected in the p2m too, by whichever entity is emulating those reads/writes. QEMU (or another daemon) is not really an option IMHO. Which leaves us with either Xen doing trap and emulate, or pciback dealing with this via the pciif.h cfg space access stuff. I've a slight preference for the latter since I think both ARM and x86 PVH are planning to use pcifront/pciback already.

Which leaves us with a requirement for the backend to be able to update the p2m, which in turn leads to a need for an interface available to the dom0 kernel (which the existing domctl is not).

There will also need to be a mechanism to expose a suitable MMIO hole to pcifront. On x86 this would be via the machine memory map, I think, but on ARM we may need something else, perhaps a xenstore key associated with the pci bus entry in xenstore?

That's for guests; for dom0, Xen also needs to be aware of changes to the BAR registers of the real physical devices, since it needs to know the real hardware values in order to point guest p2m mappings at them.

Wow, this is far more complicated than I imagined :-(

Ian.

___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel