Re: [Xen-devel] ARM: SMMUv3 support
On 6/13/2017 10:19 AM, Manish Jaggi wrote: On 3/29/2017 5:30 AM, Goel, Sameer wrote: Sure, I will try to post something soon. Hi Sameer, Are you still working on SMMU v3, can you please post patches. Hi Sameer, Could you please post RFC patches for SMMUv3, can provide feedback by testing on thunderX platform. Thanks manish Thanks Manish Thanks, Sameer On 3/27/2017 11:03 PM, Vijay Kilari wrote: On Mon, Mar 27, 2017 at 10:00 PM, Goel, Sameerwrote: Hi, I am working on adding this support. The work is in initial stages and will target ACPI systems to start with. Do you have a specific requirement? Or even better: want to help with DT testing ? :) Thanks Sameer. I don't have any specific requirement. I am also looking with ACPI support. Please share your RFC patches so that I can test on our platform. Thanks, Sameer On 3/20/2017 11:58 PM, Vijay Kilari wrote: Hi, Is there any effort put by anyone to get SMMUv3 support in Xen for ARM64?. Would be glad to know. Regards Vijay ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [linux-4.9 test] 112397: trouble: blocked/broken/fail/pass
flight 112397 linux-4.9 real [real] http://logs.test-lab.xenproject.org/osstest/logs/112397/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-arm64-xsm 2 hosts-allocate broken REGR. vs. 112193 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 112193 build-arm64 2 hosts-allocate broken REGR. vs. 112193 Tests which are failing intermittently (not blocking): test-armhf-armhf-xl-credit2 16 guest-start/debian.repeat fail in 112388 pass in 112397 test-armhf-armhf-xl-multivcpu 19 leak-check/check fail in 112388 pass in 112397 test-amd64-amd64-qemuu-nested-intel 19 leak-check/check/l1 fail in 112388 pass in 112397 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail pass in 112388 Tests which did not succeed, but are not blocking: test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-xl 1 build-check(1) blocked n/a build-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-examine 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit2 1 build-check(1) blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a build-arm64-pvops 3 capture-logs broken blocked in 112193 build-arm64 3 capture-logs broken blocked in 112193 build-arm64-xsm 3 capture-logs broken never pass test-amd64-amd64-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail blocked in 112193 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail in 112388 like 112086 test-amd64-amd64-xl-qemut-win7-amd64 18 guest-start/win.repeat fail like 112193 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail like 112193 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 112193 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 112193 test-amd64-amd64-xl-rtds 10 debian-install fail like 112193 test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore fail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail
[Xen-devel] [xtf test] 112399: all pass - PUSHED
flight 112399 xtf real [real] http://logs.test-lab.xenproject.org/osstest/logs/112399/ Perfect :-) All tests in this flight passed as required version targeted for testing: xtf c5be5f2f71d5bebb89c79bbcc5469445ee327b38 baseline version: xtf 2b5adea4636ae5b6c2b5f3eb391cd4aeb7a997a4 Last test of basis 112287 2017-07-25 12:47:57 Z6 days Testing same since 112399 2017-07-31 17:50:54 Z0 days1 attempts People who touched revisions under test: Wei Liujobs: build-amd64-xtf pass build-amd64 pass build-amd64-pvopspass test-xtf-amd64-amd64-1 pass test-xtf-amd64-amd64-2 pass test-xtf-amd64-amd64-3 pass test-xtf-amd64-amd64-4 pass test-xtf-amd64-amd64-5 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xtf + revision=c5be5f2f71d5bebb89c79bbcc5469445ee327b38 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xtf c5be5f2f71d5bebb89c79bbcc5469445ee327b38 + branch=xtf + revision=c5be5f2f71d5bebb89c79bbcc5469445ee327b38 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xtf + xenbranch=xen-unstable + '[' xxtf = xlinux ']' + linuxbranch= + '[' x = x ']' + qemuubranch=qemu-upstream-unstable + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable + prevxenbranch=xen-4.9-testing + '[' xc5be5f2f71d5bebb89c79bbcc5469445ee327b38 = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/linux-firmware.git ++ : osst...@xenbits.xen.org:/home/osstest/ext/linux-firmware.git ++ : git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git ++ : osst...@xenbits.xen.org:/home/xen/git/linux-pvops.git ++ : git://xenbits.xen.org/linux-pvops.git ++ : tested/linux-4.9 ++ : tested/linux-arm-xen ++ '['
Re: [Xen-devel] Xen on Intel Atom E3815: crash, no output
> From: Stefano Stabellini [mailto:sstabell...@kernel.org] > Sent: Tuesday, August 1, 2017 7:40 AM > > Hi all, > > I noticed that Xen does not boot on Intel Atom E3815. The system is a > Dell Edge Gateway 3003: > > http://i.dell.com/sites/doccontent/shared-content/data- > sheets/en/Documents/Dell_Edge_Gateway_3000_Series_spec_sheet.pdf?ne > wtab=true > > Grub2 loads Xen and Dom0, but no output comes out of Xen. After the > "Loading" messages from Grub2, Xen doesn't manage to print even a single > character and the system obviously crashes, but I don't know why because > there is no output. Before you ask, no I don't have a serial on the > system. > > I tried to pass console=vga vga=text-80x25 and console=vga vga=ask, but > I still got nothing. > > Do you have any ideas how to get some output on the screen? Do you know > how to get Xen to boot successfully? > No much idea except finding a serial for such early boot issue... Thanks Kevin ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 00/13] "Non-shared" IOMMU support on ARM
> From: Oleksandr Tyshchenko [mailto:olekst...@gmail.com] > Sent: Monday, July 31, 2017 7:58 PM > > Hi, Kevin > > On Mon, Jul 31, 2017 at 8:57 AM, Tian, Kevinwrote: > >> From: Oleksandr Tyshchenko > >> Sent: Wednesday, July 26, 2017 1:27 AM > >> > >> From: Oleksandr Tyshchenko > >> > >> Hi, all. > >> > >> The purpose of this patch series is to create a base for porting > >> any "Non-shared" IOMMUs to Xen on ARM. Saying "Non-shared" IOMMU > I > >> mean > >> the IOMMU that can't share the page table with the CPU. > > > > Is "non-shared" IOMMU a standard terminology in ARM side? I quickly > > searched to find it mostly used in this thread... > I don't think that it is a standard terminology. > > > > > On the other hand, all IOMMUs support a basic DMA remapping > > mechanism with page table not shared with CPU. Then some IOMMUs > > may optional support Shared Virtual Memory (SVM) through page > > sharing with CPU. Then I'm not sure why need to highlight the > > "non-shared" manner in this thread, instead of just saying > > IPMMU-VMSA support... > I wouldn't use "IPMMU-VMSA support" in this thread since it may be any > other IOMMUs which can't share page table > with CPU because of format incompatibilities. As I commented you can assume all IOMMUs cannot share page table with CPU as the starting point. It's not good to name an IOMMU driver based on such fact. > I needed something short to describe such IOMMUs, but, If title > "non-shared" IOMMU sounds confusing > I won't use it anymore. Do you have something in mind? IOMMU driver needs to be vendor specific. Is your driver working for all IPMMU-VMSA compatible IOMMUs or only for Renesas? If the latter, you may make the name explicit for such purpose. btw since you're porting Linux driver to Xen. What 's the name used in Linux side? that should be a good reference to you. > > > > >> Primarily, we are interested in IPMMU-VMSA and I hope that it will be the > >> first candidate. > >> It is VMSA-compatible IOMMU that integrated in the newest Renesas R- > Car > >> Gen3 SoCs (ARM). > >> I am about to push IPMMU-VMSA support in a while. > >> > >> With regard to the patch series, it was rebased on Xen 4.9.0 release and > >> tested on Renesas R-Car Gen3 > >> H3/M3 based boards with applied IPMMU-VMSA support: > >> - Patches 1 and 3 have Julien's Rb. > >> - Patch 2 has Jan's Rb but only for x86 and generic parts. > >> - Patch 4 has Julien's Ab. > >> - Patches 5,6,9,10 were slightly reworked. > >> - Patch 7 was significantly reworked. The previous patch -> iommu: Split > >> iommu_hwdom_init() into arch specific parts > >> - Patches 8,11,12,13 are new. > >> > >> Not really sure about x86-related changes since I had no possibility to > check. > >> So, compile-tested on x86. > >> > >> You can find current patch series here: > >> repo: https://github.com/otyshchenko1/xen.git branch: > >> non_shared_iommu_v2 > >> > >> Previous patch series here: > >> [PATCH v1 00/10] "Non-shared" IOMMU support on ARM > >> https://www.mail-archive.com/xen-devel@lists.xen.org/msg107532.html > >> > >> [RFC PATCH 0/9] "Non-shared" IOMMU support on ARM > >> https://www.mail-archive.com/xen-devel@lists.xen.org/msg100468.html > >> > >> Thank you. > >> > >> Oleksandr Tyshchenko (13): > >> xen/device-tree: Add dt_count_phandle_with_args helper > >> iommu: Add extra order argument to the IOMMU APIs and platform > >> callbacks > >> xen/arm: p2m: Add helper to convert p2m type to IOMMU flags > >> xen/arm: p2m: Update IOMMU mapping whenever possible if page table > is > >> not shared > >> iommu/arm: Re-define iommu_use_hap_pt(d) as iommu_hap_pt_share > >> iommu: Add extra use_iommu argument to iommu_domain_init() > >> iommu: Make decision about needing IOMMU for hardware domains in > >> advance > >> iommu/arm: Misc fixes for arch specific part > >> xen/arm: Add use_iommu flag to xen_arch_domainconfig > >> xen/arm: domain_build: Don't expose IOMMU specific properties to the > >> guest > >> iommu/arm: smmu: Squash map_pages/unmap_pages with > >> map_page/unmap_page > >> [RFC] iommu: VT-d: Squash map_pages/unmap_pages with > >> map_page/unmap_page > >> [RFC] iommu: AMD-Vi: Squash map_pages/unmap_pages with > >> map_page/unmap_page > >> > >> tools/libxl/libxl_arm.c | 8 + > >> xen/arch/arm/domain.c | 2 +- > >> xen/arch/arm/domain_build.c | 10 ++ > >> xen/arch/arm/p2m.c| 10 +- > >> xen/arch/x86/domain.c | 2 +- > >> xen/arch/x86/mm.c | 11 +- > >> xen/arch/x86/mm/p2m-ept.c | 21 +-- > >> xen/arch/x86/mm/p2m-pt.c | 26 +--- > >> xen/arch/x86/mm/p2m.c | 38 + > >> xen/arch/x86/x86_64/mm.c | 5 +- > >> xen/common/device_tree.c | 7
Re: [Xen-devel] [PATCH v2] VT-d: don't panic/warn on iommu=no-igfx
> From: Rusty Bird [mailto:rustyb...@openmailbox.org] > Sent: Monday, July 31, 2017 5:04 PM > > When operating on an Intel graphics device, iommu_enable_translation() > panicked (force_iommu==1) or warned (force_iommu==0) about the BIOS if > is_igd_vt_enabled_quirk() returned 0. That's good if the actual BIOS > problem has been detected. But since commit 1463411, returning 0 could > also happen if the user simply passed "iommu=no-igfx", in which case > bailing out with an info message (instead of a panic/warning) would be > more appropriate. > > The panic broke the combination "iommu=force,no-igfx", and also the case > where "iommu=no-igfx" is passed but force_iommu=1 is set automatically > by x2apic_bsp_setup(). > > Move the iommu_igfx check from is_igd_vt_enabled_quirk() into its only > caller iommu_enable_translation(), and tweak the logic. > > Signed-off-by: Rusty BirdAcked-by: Kevin Tian ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [linux-next test] 112395: regressions - trouble: blocked/broken/fail/pass
flight 112395 linux-next real [real] http://logs.test-lab.xenproject.org/osstest/logs/112395/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-xl-qemuu-debianhvm-amd64 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-qemuu-rhel6hvm-intel 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-xl-qemuu-ws16-amd64 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-xl-qemuu-win10-i386 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-libvirt 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-qemut-rhel6hvm-intel 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-xl-xsm7 xen-boot fail REGR. vs. 112382 test-amd64-i386-pair 10 xen-boot/src_hostfail REGR. vs. 112382 test-amd64-i386-pair 11 xen-boot/dst_hostfail REGR. vs. 112382 test-amd64-i386-xl-qemut-win10-i386 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-xl7 xen-boot fail REGR. vs. 112382 test-amd64-i386-freebsd10-i386 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail REGR. vs. 112382 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail REGR. vs. 112382 test-amd64-i386-xl-qemut-debianhvm-amd64 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-qemuu-rhel6hvm-amd 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-xl-qemuu-ovmf-amd64 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-xl-qemut-ws16-amd64 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-freebsd10-amd64 7 xen-boot fail REGR. vs. 112382 test-amd64-amd64-pair10 xen-boot/src_hostfail REGR. vs. 112382 test-amd64-amd64-pair11 xen-boot/dst_hostfail REGR. vs. 112382 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-xl-qemut-win7-amd64 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-xl-raw7 xen-boot fail REGR. vs. 112382 test-amd64-i386-rumprun-i386 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-libvirt-xsm 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-qemut-rhel6hvm-amd 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-xl-qemuu-win7-amd64 7 xen-boot fail REGR. vs. 112382 test-amd64-i386-examine 7 reboot fail REGR. vs. 112382 Tests which did not succeed, but are not blocking: build-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-examine 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-xl 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit2 1 build-check(1) blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a build-arm64-pvops 2 hosts-allocate broken like 112382 build-arm64-xsm 2 hosts-allocate broken like 112382 build-arm64 2 hosts-allocate broken like 112382 build-arm64 3 capture-logsbroken like 112382 build-arm64-xsm 3 capture-logsbroken like 112382 build-arm64-pvops 3 capture-logsbroken like 112382 test-amd64-amd64-xl-qemut-win7-amd64 18 guest-start/win.repeat fail blocked in 112382 test-amd64-amd64-examine 7 reboot fail like 112382 test-amd64-amd64-amd64-pvgrub 7 xen-boot fail like 112382 test-amd64-amd64-rumprun-amd64 7 xen-bootfail like 112382 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 112382 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 112382 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 112382 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 112382 test-amd64-amd64-xl-rtds 10 debian-install fail like 112382 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass
Re: [Xen-devel] [RFC v2 10/12] tools: implemet new get value interface suitable for all psr allocation features.
On 17-07-31 15:30:16, Wei Liu wrote: > On Thu, Jul 20, 2017 at 04:49:11PM +0800, Yi Sun wrote: > > This patch implements a new get value interface in tools suitable for all > > psr > > allocation features and the whole flow. It also enables MBA support in tools > > to get MBA value. > > This suggests this patch can be at least broken into two? > > > > > Signed-off-by: Yi Sun> > --- > > tools/libxc/include/xenctrl.h | 13 +- > > tools/libxc/xc_psr.c | 11 +- > > tools/libxl/libxl_psr.c | 61 ++ > > tools/xl/xl.h | 3 + > > tools/xl/xl_cmdtable.c| 9 +- > > tools/xl/xl_psr.c | 275 > > ++ > > 6 files changed, 236 insertions(+), 136 deletions(-) > > > > diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h > > index 0b0ec31..def18f5 100644 > > --- a/tools/libxc/include/xenctrl.h > > +++ b/tools/libxc/include/xenctrl.h > > @@ -2452,13 +2452,14 @@ enum xc_psr_cmt_type { > > }; > > typedef enum xc_psr_cmt_type xc_psr_cmt_type; > > > > -enum xc_psr_cat_type { > > +enum xc_psr_val_type { > > XC_PSR_CAT_L3_CBM = 1, > > XC_PSR_CAT_L3_CBM_CODE = 2, > > XC_PSR_CAT_L3_CBM_DATA = 3, > > XC_PSR_CAT_L2_CBM = 4, > > +XC_PSR_MBA_THRTL = 5, > > }; > > -typedef enum xc_psr_cat_type xc_psr_cat_type; > > +typedef enum xc_psr_val_type xc_psr_val_type; > > Changing the name of the type should be done in a separate patch. > > The rest of this patch mixes renaming and functional change which is > rather difficult to review I'm afraid. > Thanks! I will split the patch to two, one for type renaming, the other for functionality. BRs, Sun Yi ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2 08/12] tools: create general interfaces to support psr allocation features.
On 17-07-31 15:30:08, Wei Liu wrote: > On Thu, Jul 20, 2017 at 04:49:09PM +0800, Yi Sun wrote: > [...] > > + > > +#ifdef LIBXL_HAVE_PSR_MBA > > +/* > > + * Function to set a domain's value. It operates on a single or multiple > > + * target(s) defined in 'target_map'. 'target_map' specifies all the > > sockets > > + * to be operated on. > > + */ > > +int libxl_psr_set_val(libxl_ctx *ctx, uint32_t domid, > > + libxl_psr_cbm_type type, libxl_bitmap *target_map, > > + uint64_t val); > > +/* > > + * Function to get a domain's cbm. It operates on a single 'target'. > > + * 'target' specifies which socket to be operated on. > > + */ > > +int libxl_psr_get_val(libxl_ctx *ctx, uint32_t domid, > > + libxl_psr_cbm_type type, uint32_t target, > > There is no need for target to be uint32_t right? Unsigned int should > work too? > > > + uint64_t *val); > > +/* > > + * On success, the function returns an array of elements in 'info', > > + * and the length in 'nr'. > > + */ > > +int libxl_psr_get_hw_info(libxl_ctx *ctx, libxl_psr_hw_info **info, > > + int *nr, libxl_psr_feat_type type, int lvl); > > +void libxl_psr_hw_info_list_free(libxl_psr_hw_info *list, int nr); > > nr should be unsigned int. > > > +#endif /* LIBXL_HAVE_PSR_MBA */ > > +#endif /* LIBXL_HAVE_PSR_CAT */ > > > > /* misc */ > > > > diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c > > index f55ba1e..8319301 100644 > > --- a/tools/libxl/libxl_psr.c > > +++ b/tools/libxl/libxl_psr.c > > @@ -425,6 +425,30 @@ void libxl_psr_cat_info_list_free(libxl_psr_cat_info > > *list, int nr) > > free(list); > > } > > > > +int libxl_psr_set_val(libxl_ctx *ctx, uint32_t domid, > > + libxl_psr_cbm_type type, libxl_bitmap *target_map, > > + uint64_t val) > > +{ > > +return EXIT_FAILURE; > > ERROR_FAIL here. > Thanks! Will fix above points. > > + > > +libxl_psr_hw_info = Struct("psr_hw_info", [ > > +("id", uint32), > > +("u", KeyedUnion(None, libxl_psr_feat_type, "type", > > + [("cat_info", Struct(None, [ > > + ("cos_max", uint32), > > + ("cbm_len", uint32), > > + ("cdp_enabled", bool), > > + ])), > > + ("mba_info", Struct(None, [ > > + ("cos_max", uint32), > > + ("thrtl_max", uint32), > > + ("linear", bool), > > + ])), > > + ])) > > If this is output only please mark it as dir=DIR_OUT. > Sorry, I do not understand this clearly. DYM if these values are all output values for xl application? BRs, Sun Yi ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2 04/12] x86: implement data structure and CPU init flow for MBA.
On 17-07-31 15:30:11, Wei Liu wrote: > On Thu, Jul 20, 2017 at 04:49:05PM +0800, Yi Sun wrote: > > #define PSR_CMT(1<<0) > > #define PSR_CAT(1<<1) > > #define PSR_CDP(1<<2) > > +#define PSR_MBA(1<<3) > > These should really be (1u << X) -- please use unsigned value and add > spaces around "<<". > > Can you please submit a patch to fix them first? Thanks for the suggestion! I will fix it in next version because a patch need be split to two per your suggestion. BRs, Sun Yi ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] xen: RCU: avoid busy waiting until the end of grace period.
On Mon, 2017-07-31 at 16:58 -0700, Stefano Stabellini wrote: > On Tue, 1 Aug 2017, Dario Faggioli wrote: > > On Mon, 2017-07-31 at 14:20 -0700, Stefano Stabellini wrote: > > > On Thu, 27 Jul 2017, Dario Faggioli wrote: > > > > > > > > diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c > > > > index f0fdc87..4586f2a 100644 > > > > --- a/xen/common/rcupdate.c > > > > +++ b/xen/common/rcupdate.c > > > > @@ -84,8 +84,14 @@ struct rcu_data { > > > > int cpu; > > > > struct rcu_head barrier; > > > > longlast_rs_qlen; /* qlen during the last > > > > resched */ > > > > + > > > > +/* 3) idle CPUs handling */ > > > > +struct timer idle_timer; > > > > +bool idle_timer_active; > > > > }; > > > > > > > > +#define RCU_IDLE_TIMER_PERIOD MILLISECS(10) > > > > > > Isn't this a bit too short? How is it chosen? > > > > > > > What makes you think it's short? > > In terms of power saving and CPU sleep states, 10ms is not much to > sleep > for. I wonder if there are any power saving benefits in sleeping for > only 10ms (especially on x86 where entering and exiting CPU sleep > states > takes longer, to be confirmed). > I *think* we should be fine with, say, 100ms. But that's again, guess/rule of thumb, nothing more than that. And, TBH, I'm not even sure what a good experiment/benchmark would be, to assess whether a particular value is good or bad. :-/ > We might as well do the thing we need > to do immediately? I guess we cannot do that? > You're guessing correct, we can't. It's indeed a bit tricky, and it took it a little bit to me as well to figure all of it out properly. Basically, let's say that, at time t1, on CPU1, someone calls call_rcu(). The situation on the other CPUs is: CPU0 busy; CPU2 idle (no timer pending); CPU3 busy. So, a new grace period starts, and its exact end will be when CPU0, CPU1 and CPU3 have quiesced once (in Xen, quiescence means: "going through do_softirq()"). But RCU it's a passive mechanism, i.e., we rely on each CPU coming to the RCU core logic, and tell <>. Let's say that CPU0 quiesces at time t2 > t1. CPU1 quiesces at time t3 > t2, and goes fully idle (no timer pending). CPU3 quiesces at time t4 > t3. Now, at time t4, CPU1 can actually invoke the callbeck, queued at time t1, from within call_rcu(). This patch series solves two problems, of our current RCU implementation: 1) right now, we don't only wait for CPU0, CPU1 and CPU3, we also wait for CPU2. But since CPU2 is fully idle, it just won't bother telling RCU that it has quiesced (well, on x86, that would actually happen at some point, while on ARM, it is really possible that this never happens!). This is solved in patch 3, by introducing the cpumask; 2) now (after patch 3) we know that we just can avoid waiting for CPU2. Good. But at time t4, when CPU3 quiesces, marking the end of the grace period, how would CPU1 --which is fully idle-- know that it can now safely invoke the callback? Again, RCU is passive, so it relies on CPU1 to figure that out on its own, next time it wakes up, e.g., because of the periodic tick timer. But we don't have a periodic tick timer! Well, this again means that, on x86, CPU1 will actually figure that out at some (unpredictable) point in future. On ARM, not so much. The purpose of the timer in this patch is to make sure it always will. In fact, with patches 4 and 5 applied, at time t3, we let CPU1 go idle, but we program the timer. It will fire at t3+T (with T=10ms, right now). When this happens, if t3+T > t4, CPU1 invokes the callback, and we're done. If not (and CPU1 is still idle) we retry in another T. So, when you say "immediately", *when* do you actually mean? We can't invoke the callback at t3 (i.e., immediately after CPU1 quiesces), because we need to wait for CPU3 to do the same. We can't invoke the callback when CPU3 quiesces, at t4 (i.e., immediately after the grace period ends) either, because the callback it's on CPU1, not on CPU3. Linux's solution is not to stop the tick for CPU1 at time t3. It will be stopped only after the first firing of the tick itself at time t > t4 (if CPU1 is still idle, of course). This timer thing is, IMO, pretty similar. The only difference is that we don't have a tick to not stop... So I had to make up a very special one. :-D TL;DR, yes, I also would have loved the world (or even just this problem) to be simple. It's not! ;-P Thanks and Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v14 13/23] x86: refactor psr: CDP: implement CPU init flow.
On 17-07-31 08:20:44, Jan Beulich wrote: > >>> Yi Sun07/15/17 2:48 AM >>> > >@@ -272,7 +312,8 @@ static int cat_init_feature(const struct cpuid_leaf > >*regs, > >if ( !opt_cpu_info ) > >return 0; > > > >-printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, > >cbm_len:%u\n", > >+printk(XENLOG_INFO "%s: enabled on socket %u, cos_max:%u, cbm_len:%u\n", > >+ ((type == FEAT_TYPE_L3_CDP) ? "CDP" : "L3 CAT"), > > Why is this not "L3 CDP" when the enumerator includes L3? > Sure, I will change it. Thanks! > >@@ -1283,10 +1344,22 @@ static void psr_cpu_init(void) > >feat = feat_l3; > >feat_l3 = NULL; > > > >-if ( !cat_init_feature(®s, feat, info, FEAT_TYPE_L3_CAT) ) > >-feat_props[FEAT_TYPE_L3_CAT] = _cat_props; > >+if ( (regs.c & PSR_CAT_CDP_CAPABILITY) && (opt_psr & PSR_CDP) ) > >+{ > >+if ( !cat_init_feature(®s, feat, info, FEAT_TYPE_L3_CDP) ) > >+feat_props[FEAT_TYPE_L3_CDP] = _cdp_props; > >+else > >+/* If CDP init fails, try to work as L3 CAT. */ > >+goto l3_cat_init; > >+} > >else > >-feat_l3 = feat; > >+{ > >+ l3_cat_init: > > I'd really like to ask to re-structure this slightly so that you won't > need goto and a label here. As said before, goto-s are sort of okay > for making complicated error paths readable, but they should be > avoided in almost all other cases. > Got it, I will try to remove the goto. Thanks for the review! BRs, Sun Yi > Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 0/1] xen/arm: zynqmp: Disable PCIe
On Mon, Jul 31, 2017 at 11:11:39PM +0100, Julien Grall wrote: > > > On 31/07/2017 20:37, Edgar E. Iglesias wrote: > >From: "Edgar E. Iglesias"> > > >Hi, > > Hi Edgar, > > > >We're seeing panics in dom0 with PCIe enabled due to what seems > >to be wrongly created mappings by Xen. With older kernels we > >didn't see the panics but PCIe wasn't functional in dom0. > > > >This disables the PCIe nodes on the ZynqMP until Xen/ARM gets > >more PCIe support. > > I feel a bit sad to ack a patch disabling PCIe in the ZynqMP. > > Before doing that. Can you describe what is the exact problem with Xen? It > might be possible that we don't parse correctly the device-tree. Yes, it was related to the DTB since we had a workaround with a modified DTB. Anyway, I was to quick with sending out the patch. We've been carrying this for a while and your questions prompted me to have a look again and dig out the details. Turns out I can't reproduce it with our latest kernel and the latest Xen. I don't know what fixed it yet but PCIe works ATM both on QEMU models and on real HW. I don't think it ever has before without modified DTBs. So please ignore this patch for now :-) Cheers, Edgar ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] xen: RCU: avoid busy waiting until the end of grace period.
On Tue, 1 Aug 2017, Dario Faggioli wrote: > On Mon, 2017-07-31 at 14:20 -0700, Stefano Stabellini wrote: > > On Thu, 27 Jul 2017, Dario Faggioli wrote: > > > > > > diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c > > > index f0fdc87..4586f2a 100644 > > > --- a/xen/common/rcupdate.c > > > +++ b/xen/common/rcupdate.c > > > @@ -84,8 +84,14 @@ struct rcu_data { > > > int cpu; > > > struct rcu_head barrier; > > > longlast_rs_qlen; /* qlen during the last > > > resched */ > > > + > > > +/* 3) idle CPUs handling */ > > > +struct timer idle_timer; > > > +bool idle_timer_active; > > > }; > > > > > > +#define RCU_IDLE_TIMER_PERIOD MILLISECS(10) > > > > Isn't this a bit too short? How is it chosen? > > > It's totally arbitrary (and that would be the case for whatever value > we choose). > > Basically, it's how long, at worst, after the actual end of a grace > period, a (batch of) callback(s) will be invoked. Currently, on Credit1 > on ARM (without my patch, from this series, that suspends the tick) > that's (by chance) 30 ms (or whatever value is chosen for Credit1 > timeslice). On Credit2 (on both ARM and x86), it's never, but on x86 it > (apparently) is 'however frequent time sync rendezvouses happs' (which > I don't recall, but it's longer), while on ARM is (potentially) never. > > I accept suggestions about alternatives values, and I'm certainly fine > with adding a comment, containing something along the lines of the > explanation above, but I fear it's going to be hard to figure out what > value is actually the "absolute best". > > In Linux (which is where the same 'callback book-keeping' happens for > them), a tick with a frequency of 1000Hz (== 1ms) is considered 'low- > latency/Deskop/real-time'. For us, as said above, tick --when it's > there-- would be 30ms by default. > > I just went with something in the middle. > > Also, it's not that we'll have a 10ms periodic timer going on for > significant amount of time. In fact we expect it to actually fire just > once (for each grace period). It's not 100% guaranteed that it won't be > reprogrammed and fire a couple of times, but it should not, in the vast > majority of cases. > > What makes you think it's short? In terms of power saving and CPU sleep states, 10ms is not much to sleep for. I wonder if there are any power saving benefits in sleeping for only 10ms (especially on x86 where entering and exiting CPU sleep states takes longer, to be confirmed). We might as well do the thing we need to do immediately? I guess we cannot do that?___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file
On Mon, 31 Jul 2017, Edgar E. Iglesias wrote: > > > > @role Can only be 'master' or 'slave', it defaults to > > > > 'slave'. > > > > > > > > @prot When @role = master, this means the largest set > > > > of > > > > stage-2 permission flags that can be granted to > > > > the > > > > slave domains. > > > > When @role = slave, this means the stage-2 permission > > > > flags of the shared memory area. > > > > Currently only 'rw' is supported. If not set. it > > > > defaults to 'rw'. > > > > > > > > @cache_policy The stage-2 cacheability/shareability > > > > attributes of the > > > > shared memory area. Currently, only two > > > > policies are > > > > supported: > > > > * ARM_normal: Only applicable to ARM guests. > > > > This > > > > would mean Inner and Outer > > > > Write-Back > > > > Cacheable, and Inner Shareable. > > > > > > > > > Is there a reason not to set this to Outer Shareable? > > > Again, mainly useful when these pages get shared with devs as well. > > > > > > The guest can always lower it to Inner Shareable via S1 tables if needed. > > > > I don't think we can support memory sharing with devices in this version > > of the document (see above about GSoC timelines). Normal memory is inner > > shareable in Xen today, it makes sense to default to that. > > I thought we mapped RAM as Outer shareable to guests but you seem to be right. > I think we should be mapping all RAM as Outer Shareable and then let the > guest decide what is Inner and what is Outer via it's S1 tables. > Right now it would be impossible to be Coherent with a DMA device outside > of the Inner domain... > > Perhaps we should fix that and then ARM_normal would by itself become Outer. > If there's agreement I can test it and send a patch. Today, only device memory is mapped Outer Shareable, while normal memory is mapped Inner Shareable. I am OK with changing the default in mfn_to_p2m_entry to Outer Shareable for normal RAM if the change would make it possible to do coherent DMA with more devices on the platform. Julien? ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] Xen on Intel Atom E3815: crash, no output
Hi all, I noticed that Xen does not boot on Intel Atom E3815. The system is a Dell Edge Gateway 3003: http://i.dell.com/sites/doccontent/shared-content/data-sheets/en/Documents/Dell_Edge_Gateway_3000_Series_spec_sheet.pdf?newtab=true Grub2 loads Xen and Dom0, but no output comes out of Xen. After the "Loading" messages from Grub2, Xen doesn't manage to print even a single character and the system obviously crashes, but I don't know why because there is no output. Before you ask, no I don't have a serial on the system. I tried to pass console=vga vga=text-80x25 and console=vga vga=ask, but I still got nothing. Do you have any ideas how to get some output on the screen? Do you know how to get Xen to boot successfully? Thanks, Stefano ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable-smoke test] 112402: tolerable trouble: broken/pass - PUSHED
flight 112402 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/112402/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass version targeted for testing: xen b8029db62eb2a06a204a8e2b69437d0927bd1ac4 baseline version: xen aa4eb460bcf77ea87b9209bb136efc8142a1a512 Last test of basis 112365 2017-07-28 18:01:42 Z3 days Testing same since 112402 2017-07-31 21:02:08 Z0 days1 attempts People who touched revisions under test: Andrii Anisovjobs: build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm broken test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xen-unstable-smoke + revision=b8029db62eb2a06a204a8e2b69437d0927bd1ac4 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke b8029db62eb2a06a204a8e2b69437d0927bd1ac4 + branch=xen-unstable-smoke + revision=b8029db62eb2a06a204a8e2b69437d0927bd1ac4 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-smoke + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-smoke + prevxenbranch=xen-4.9-testing + '[' xb8029db62eb2a06a204a8e2b69437d0927bd1ac4 = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ :
[Xen-devel] [PATCH v3 11/13] xen/pvcalls: implement poll command
For active sockets, check the indexes and use the inflight_conn_req waitqueue to wait. For passive sockets if an accept is outstanding (PVCALLS_FLAG_ACCEPT_INFLIGHT), check if it has been answered by looking at bedata->rsp[req_id]. If so, return POLLIN. Otherwise use the inflight_accept_req waitqueue. If no accepts are inflight, send PVCALLS_POLL to the backend. If we have outstanding POLL requests awaiting for a response use the inflight_req waitqueue: inflight_req is awaken when a new response is received; on wakeup we check whether the POLL response is arrived by looking at the PVCALLS_FLAG_POLL_RET flag. We set the flag from pvcalls_front_event_handler, if the response was for a POLL command. In pvcalls_front_event_handler, get the struct sock_mapping from the poll id (we previously converted struct sock_mapping* to uint64_t and used it as id). Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/pvcalls-front.c | 135 +--- drivers/xen/pvcalls-front.h | 3 + 2 files changed, 129 insertions(+), 9 deletions(-) diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c index 635a83a..1c975d6 100644 --- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -72,6 +72,8 @@ struct sock_mapping { * Only one poll operation can be inflight for a given socket. */ #define PVCALLS_FLAG_ACCEPT_INFLIGHT 0 +#define PVCALLS_FLAG_POLL_INFLIGHT 1 +#define PVCALLS_FLAG_POLL_RET2 uint8_t flags; uint32_t inflight_req_id; struct sock_mapping *accept_map; @@ -139,15 +141,32 @@ static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id) rsp = RING_GET_RESPONSE(>ring, bedata->ring.rsp_cons); req_id = rsp->req_id; - dst = (uint8_t *)>rsp[req_id] + sizeof(rsp->req_id); - src = (uint8_t *)rsp + sizeof(rsp->req_id); - memcpy(dst, src, sizeof(*rsp) - sizeof(rsp->req_id)); - /* -* First copy the rest of the data, then req_id. It is -* paired with the barrier when accessing bedata->rsp. -*/ - smp_wmb(); - WRITE_ONCE(bedata->rsp[req_id].req_id, rsp->req_id); + if (rsp->cmd == PVCALLS_POLL) { + struct sock_mapping *map = (struct sock_mapping *) + rsp->u.poll.id; + + set_bit(PVCALLS_FLAG_POLL_RET, + (void *)>passive.flags); + /* +* Set RET, then clear INFLIGHT. It pairs with +* the checks at the beginning of +* pvcalls_front_poll_passive. +*/ + smp_wmb(); + clear_bit(PVCALLS_FLAG_POLL_INFLIGHT, + (void *)>passive.flags); + } else { + dst = (uint8_t *)>rsp[req_id] + + sizeof(rsp->req_id); + src = (uint8_t *)rsp + sizeof(rsp->req_id); + memcpy(dst, src, sizeof(*rsp) - sizeof(rsp->req_id)); + /* +* First copy the rest of the data, then req_id. It is +* paired with the barrier when accessing bedata->rsp. +*/ + smp_wmb(); + WRITE_ONCE(bedata->rsp[req_id].req_id, rsp->req_id); + } done = 1; bedata->ring.rsp_cons++; @@ -736,6 +755,104 @@ int pvcalls_front_accept(struct socket *sock, struct socket *newsock, int flags) return ret; } +static unsigned int pvcalls_front_poll_passive(struct file *file, + struct pvcalls_bedata *bedata, + struct sock_mapping *map, + poll_table *wait) +{ + int notify, req_id, ret; + struct xen_pvcalls_request *req; + + if (test_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT, +(void *)>passive.flags)) { + uint32_t req_id = READ_ONCE(map->passive.inflight_req_id); + if (req_id != PVCALLS_INVALID_ID && + READ_ONCE(bedata->rsp[req_id].req_id) == req_id) + return POLLIN; + + poll_wait(file, >passive.inflight_accept_req, wait); + return 0; + } + + if (test_and_clear_bit(PVCALLS_FLAG_POLL_RET, + (void *)>passive.flags)) + return POLLIN; + + /* +* First check RET, then INFLIGHT. No barriers necessary to +* ensure execution ordering because of the
[Xen-devel] [PATCH v3 02/13] xen/pvcalls: implement frontend disconnect
Introduce a data structure named pvcalls_bedata. It contains pointers to the command ring, the event channel, a list of active sockets and a list of passive sockets. Lists accesses are protected by a spin_lock. Introduce a waitqueue to allow waiting for a response on commands sent to the backend. Introduce an array of struct xen_pvcalls_response to store commands responses. Implement pvcalls frontend removal function. Go through the list of active and passive sockets and free them all, one at a time. Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/pvcalls-front.c | 51 + 1 file changed, 51 insertions(+) diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c index a8d38c2..a126195 100644 --- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -20,6 +20,29 @@ #include #include +#define PVCALLS_INVALID_ID UINT_MAX +#define PVCALLS_RING_ORDER XENBUS_MAX_RING_GRANT_ORDER +#define PVCALLS_NR_REQ_PER_RING __CONST_RING_SIZE(xen_pvcalls, XEN_PAGE_SIZE) + +struct pvcalls_bedata { + struct xen_pvcalls_front_ring ring; + grant_ref_t ref; + int irq; + + struct list_head socket_mappings; + struct list_head socketpass_mappings; + spinlock_t pvcallss_lock; + + wait_queue_head_t inflight_req; + struct xen_pvcalls_response rsp[PVCALLS_NR_REQ_PER_RING]; +}; +static struct xenbus_device *pvcalls_front_dev; + +static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id) +{ + return IRQ_HANDLED; +} + static const struct xenbus_device_id pvcalls_front_ids[] = { { "pvcalls" }, { "" } @@ -27,6 +50,34 @@ static int pvcalls_front_remove(struct xenbus_device *dev) { + struct pvcalls_bedata *bedata; + struct sock_mapping *map = NULL, *n; + + bedata = dev_get_drvdata(_front_dev->dev); + + list_for_each_entry_safe(map, n, >socket_mappings, list) { + mutex_lock(>active.in_mutex); + mutex_lock(>active.out_mutex); + pvcalls_front_free_map(bedata, map); + mutex_unlock(>active.out_mutex); + mutex_unlock(>active.in_mutex); + kfree(map); + } + list_for_each_entry_safe(map, n, >socketpass_mappings, list) { + spin_lock(>pvcallss_lock); + list_del_init(>list); + spin_unlock(>pvcallss_lock); + kfree(map); + } + if (bedata->irq > 0) + unbind_from_irqhandler(bedata->irq, dev); + if (bedata->ref >= 0) + gnttab_end_foreign_access(bedata->ref, 0, 0); + kfree(bedata->ring.sring); + kfree(bedata); + dev_set_drvdata(>dev, NULL); + xenbus_switch_state(dev, XenbusStateClosed); + pvcalls_front_dev = NULL; return 0; } -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 12/13] xen/pvcalls: implement release command
Send PVCALLS_RELEASE to the backend and wait for a reply. Take both in_mutex and out_mutex to avoid concurrent accesses. Then, free the socket. For passive sockets, check whether we have already pre-allocated an active socket for the purpose of being accepted. If so, free that as well. Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/pvcalls-front.c | 88 + drivers/xen/pvcalls-front.h | 1 + 2 files changed, 89 insertions(+) diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c index 1c975d6..775a6d2 100644 --- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -192,6 +192,23 @@ static irqreturn_t pvcalls_front_conn_handler(int irq, void *sock_map) return IRQ_HANDLED; } +static void pvcalls_front_free_map(struct pvcalls_bedata *bedata, + struct sock_mapping *map) +{ + int i; + + spin_lock(>pvcallss_lock); + if (!list_empty(>list)) + list_del_init(>list); + spin_unlock(>pvcallss_lock); + + for (i = 0; i < (1 << map->active.ring->ring_order); i++) + gnttab_end_foreign_access(map->active.ring->ref[i], 0, 0); + gnttab_end_foreign_access(map->active.ref, 0, 0); + free_page((unsigned long)map->active.ring); + unbind_from_irqhandler(map->active.irq, map); +} + int pvcalls_front_socket(struct socket *sock) { struct pvcalls_bedata *bedata; @@ -853,6 +870,77 @@ unsigned int pvcalls_front_poll(struct file *file, struct socket *sock, return pvcalls_front_poll_passive(file, bedata, map, wait); } +int pvcalls_front_release(struct socket *sock) +{ + struct pvcalls_bedata *bedata; + struct sock_mapping *map; + int req_id, notify, ret; + struct xen_pvcalls_request *req; + + if (!pvcalls_front_dev) + return -EIO; + bedata = dev_get_drvdata(_front_dev->dev); + + if (sock->sk == NULL) + return 0; + + map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head); + if (map == NULL) + return 0; + + spin_lock(>pvcallss_lock); + ret = get_request(bedata, _id); + if (ret < 0) { + spin_unlock(>pvcallss_lock); + return ret; + } + WRITE_ONCE(sock->sk->sk_send_head, NULL); + + req = RING_GET_REQUEST(>ring, req_id); + req->req_id = req_id; + req->cmd = PVCALLS_RELEASE; + req->u.release.id = (uint64_t)map; + + bedata->ring.req_prod_pvt++; + RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(>ring, notify); + spin_unlock(>pvcallss_lock); + if (notify) + notify_remote_via_irq(bedata->irq); + + wait_event(bedata->inflight_req, + READ_ONCE(bedata->rsp[req_id].req_id) == req_id); + + if (map->active_socket) { + /* +* Set in_error and wake up inflight_conn_req to force +* recvmsg waiters to exit. +*/ + map->active.ring->in_error = -EBADF; + wake_up_interruptible(>active.inflight_conn_req); + + mutex_lock(>active.in_mutex); + mutex_lock(>active.out_mutex); + pvcalls_front_free_map(bedata, map); + mutex_unlock(>active.out_mutex); + mutex_unlock(>active.in_mutex); + kfree(map); + } else { + spin_lock(>pvcallss_lock); + if (READ_ONCE(map->passive.inflight_req_id) != + PVCALLS_INVALID_ID) { + pvcalls_front_free_map(bedata, + map->passive.accept_map); + kfree(map->passive.accept_map); + } + list_del_init(>list); + kfree(map); + spin_unlock(>pvcallss_lock); + } + WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID); + + return 0; +} + static const struct xenbus_device_id pvcalls_front_ids[] = { { "pvcalls" }, { "" } diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h index 25e05b8..3332978 100644 --- a/drivers/xen/pvcalls-front.h +++ b/drivers/xen/pvcalls-front.h @@ -23,5 +23,6 @@ int pvcalls_front_recvmsg(struct socket *sock, unsigned int pvcalls_front_poll(struct file *file, struct socket *sock, poll_table *wait); +int pvcalls_front_release(struct socket *sock); #endif -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 10/13] xen/pvcalls: implement recvmsg
Implement recvmsg by copying data from the "in" ring. If not enough data is available and the recvmsg call is blocking, then wait on the inflight_conn_req waitqueue. Take the active socket in_mutex so that only one function can access the ring at any given time. If no data is available on the ring, rather than returning immediately or sleep-waiting, spin for up to 5000 cycles. This small optimization turns out to improve performance and latency significantly. Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/pvcalls-front.c | 102 drivers/xen/pvcalls-front.h | 4 ++ 2 files changed, 106 insertions(+) diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c index 369acde..635a83a 100644 --- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -105,6 +105,20 @@ static int pvcalls_front_write_todo(struct sock_mapping *map) return size - pvcalls_queued(prod, cons, size); } +static bool pvcalls_front_read_todo(struct sock_mapping *map) +{ + struct pvcalls_data_intf *intf = map->active.ring; + RING_IDX cons, prod; + int32_t error; + + cons = intf->in_cons; + prod = intf->in_prod; + error = intf->in_error; + return (error != 0 || + pvcalls_queued(prod, cons, + XEN_FLEX_RING_SIZE(intf->ring_order)) != 0); +} + static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id) { struct xenbus_device *dev = dev_id; @@ -434,6 +448,94 @@ int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg, return tot_sent; } +static int __read_ring(struct pvcalls_data_intf *intf, + struct pvcalls_data *data, + struct iov_iter *msg_iter, + size_t len, int flags) +{ + RING_IDX cons, prod, size, masked_prod, masked_cons; + RING_IDX array_size = XEN_FLEX_RING_SIZE(intf->ring_order); + int32_t error; + + cons = intf->in_cons; + prod = intf->in_prod; + error = intf->in_error; + /* get pointers before reading from the ring */ + virt_rmb(); + if (error < 0) + return error; + + size = pvcalls_queued(prod, cons, array_size); + masked_prod = pvcalls_mask(prod, array_size); + masked_cons = pvcalls_mask(cons, array_size); + + if (size == 0) + return 0; + + if (len > size) + len = size; + + if (masked_prod > masked_cons) { + copy_to_iter(data->in + masked_cons, len, msg_iter); + } else { + if (len > (array_size - masked_cons)) { + copy_to_iter(data->in + masked_cons, +array_size - masked_cons, msg_iter); + copy_to_iter(data->in, +len - (array_size - masked_cons), +msg_iter); + } else { + copy_to_iter(data->in + masked_cons, len, msg_iter); + } + } + /* read data from the ring before increasing the index */ + virt_mb(); + if (!(flags & MSG_PEEK)) + intf->in_cons += len; + + return len; +} + +int pvcalls_front_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, +int flags) +{ + struct pvcalls_bedata *bedata; + int ret = -EAGAIN; + struct sock_mapping *map; + + if (!pvcalls_front_dev) + return -ENOTCONN; + bedata = dev_get_drvdata(_front_dev->dev); + + map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head); + if (!map) + return -ENOTSOCK; + + if (flags & (MSG_CMSG_CLOEXEC|MSG_ERRQUEUE|MSG_OOB|MSG_TRUNC)) + return -EOPNOTSUPP; + + mutex_lock(>active.in_mutex); + if (len > XEN_FLEX_RING_SIZE(map->active.ring->ring_order)) + len = XEN_FLEX_RING_SIZE(map->active.ring->ring_order); + + while (!(flags & MSG_DONTWAIT) && !pvcalls_front_read_todo(map)) { + wait_event_interruptible(map->active.inflight_conn_req, +pvcalls_front_read_todo(map)); + } + ret = __read_ring(map->active.ring, >active.data, + >msg_iter, len, flags); + + if (ret > 0) + notify_remote_via_irq(map->active.irq); + if (ret == 0) + ret = -EAGAIN; + if (ret == -ENOTCONN) + ret = 0; + + mutex_unlock(>active.in_mutex); + return ret; +} + int pvcalls_front_bind(struct socket *sock, struct sockaddr *addr, int addr_len) { struct pvcalls_bedata *bedata; diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h index d937c24..de24041 100644 --- a/drivers/xen/pvcalls-front.h +++ b/drivers/xen/pvcalls-front.h @@
[Xen-devel] [PATCH v3 06/13] xen/pvcalls: implement bind command
Send PVCALLS_BIND to the backend. Introduce a new structure, part of struct sock_mapping, to store information specific to passive sockets. Introduce a status field to keep track of the status of the passive socket. Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/pvcalls-front.c | 59 + drivers/xen/pvcalls-front.h | 3 +++ 2 files changed, 62 insertions(+) diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c index 379b8fb..5ccef34 100644 --- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -59,6 +59,13 @@ struct sock_mapping { wait_queue_head_t inflight_conn_req; } active; + struct { + /* Socket status */ +#define PVCALLS_STATUS_UNINITALIZED 0 +#define PVCALLS_STATUS_BIND 1 +#define PVCALLS_STATUS_LISTEN2 + uint8_t status; + } passive; }; }; @@ -308,6 +315,58 @@ int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr, return ret; } +int pvcalls_front_bind(struct socket *sock, struct sockaddr *addr, int addr_len) +{ + struct pvcalls_bedata *bedata; + struct sock_mapping *map = NULL; + struct xen_pvcalls_request *req; + int notify, req_id, ret; + + if (!pvcalls_front_dev) + return -ENOTCONN; + if (addr->sa_family != AF_INET || sock->type != SOCK_STREAM) + return -ENOTSUPP; + bedata = dev_get_drvdata(_front_dev->dev); + + map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head); + if (map == NULL) + return -EINVAL; + + spin_lock(>pvcallss_lock); + ret = get_request(bedata, _id); + if (ret < 0) { + spin_unlock(>pvcallss_lock); + return ret; + } + req = RING_GET_REQUEST(>ring, req_id); + req->req_id = req_id; + map->sock = sock; + req->cmd = PVCALLS_BIND; + req->u.bind.id = (uint64_t) map; + memcpy(req->u.bind.addr, addr, sizeof(*addr)); + req->u.bind.len = addr_len; + + init_waitqueue_head(>passive.inflight_accept_req); + + map->active_socket = false; + + bedata->ring.req_prod_pvt++; + RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(>ring, notify); + spin_unlock(>pvcallss_lock); + if (notify) + notify_remote_via_irq(bedata->irq); + + wait_event(bedata->inflight_req, + READ_ONCE(bedata->rsp[req_id].req_id) == req_id); + + map->passive.status = PVCALLS_STATUS_BIND; + ret = bedata->rsp[req_id].ret; + /* read ret, then set this rsp slot to be reused */ + smp_mb(); + WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID); + return 0; +} + static const struct xenbus_device_id pvcalls_front_ids[] = { { "pvcalls" }, { "" } diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h index 63b0417..8b0a274 100644 --- a/drivers/xen/pvcalls-front.h +++ b/drivers/xen/pvcalls-front.h @@ -6,5 +6,8 @@ int pvcalls_front_socket(struct socket *sock); int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr, int addr_len, int flags); +int pvcalls_front_bind(struct socket *sock, + struct sockaddr *addr, + int addr_len); #endif -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 08/13] xen/pvcalls: implement accept command
Introduce a waitqueue to allow only one outstanding accept command at any given time and to implement polling on the passive socket. Introduce a flags field to keep track of in-flight accept and poll commands. Send PVCALLS_ACCEPT to the backend. Allocate a new active socket. Make sure that only one accept command is executed at any given time by setting PVCALLS_FLAG_ACCEPT_INFLIGHT and waiting on the inflight_accept_req waitqueue. Convert the new struct sock_mapping pointer into an uint64_t and use it as id for the new socket to pass to the backend. Check if the accept call is non-blocking: in that case after sending the ACCEPT command to the backend store the sock_mapping pointer of the new struct and the inflight req_id then return -EAGAIN (which will respond only when there is something to accept). Next time accept is called, we'll check if the ACCEPT command has been answered, if so we'll pick up where we left off, otherwise we return -EAGAIN again. Note that, differently from the other commands, we can use wait_event_interruptible (instead of wait_event) in the case of accept as we are able to track the req_id of the ACCEPT response that we are waiting. Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/pvcalls-front.c | 111 drivers/xen/pvcalls-front.h | 3 ++ 2 files changed, 114 insertions(+) diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c index b2757f5..f83b910 100644 --- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -65,6 +65,16 @@ struct sock_mapping { #define PVCALLS_STATUS_BIND 1 #define PVCALLS_STATUS_LISTEN2 uint8_t status; + /* +* Internal state-machine flags. +* Only one accept operation can be inflight for a socket. +* Only one poll operation can be inflight for a given socket. +*/ +#define PVCALLS_FLAG_ACCEPT_INFLIGHT 0 + uint8_t flags; + uint32_t inflight_req_id; + struct sock_mapping *accept_map; + wait_queue_head_t inflight_accept_req; } passive; }; }; @@ -414,6 +424,107 @@ int pvcalls_front_listen(struct socket *sock, int backlog) return ret; } +int pvcalls_front_accept(struct socket *sock, struct socket *newsock, int flags) +{ + struct pvcalls_bedata *bedata; + struct sock_mapping *map; + struct sock_mapping *map2 = NULL; + struct xen_pvcalls_request *req; + int notify, req_id, ret, evtchn, nonblock; + + if (!pvcalls_front_dev) + return -ENOTCONN; + bedata = dev_get_drvdata(_front_dev->dev); + + map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head); + if (!map) + return -ENOTSOCK; + + if (map->passive.status != PVCALLS_STATUS_LISTEN) + return -EINVAL; + + nonblock = flags & SOCK_NONBLOCK; + /* +* Backend only supports 1 inflight accept request, will return +* errors for the others +*/ + if (test_and_set_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT, +(void *)>passive.flags)) { + req_id = READ_ONCE(map->passive.inflight_req_id); + if (req_id != PVCALLS_INVALID_ID && + READ_ONCE(bedata->rsp[req_id].req_id) == req_id) + goto received; + if (nonblock) + return -EAGAIN; + if (wait_event_interruptible(map->passive.inflight_accept_req, + !test_and_set_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT, + (void *)>passive.flags))) + return -EINTR; + } + + spin_lock(>pvcallss_lock); + ret = get_request(bedata, _id); + if (ret < 0) { + spin_unlock(>pvcallss_lock); + return ret; + } + map2 = kzalloc(sizeof(*map2), GFP_KERNEL); + if (map2 == NULL) + return -ENOMEM; + ret = create_active(map2, ); + if (ret < 0) { + kfree(map2); + return -ENOMEM; + } + list_add_tail(>list, >socket_mappings); + + req = RING_GET_REQUEST(>ring, req_id); + req->req_id = req_id; + req->cmd = PVCALLS_ACCEPT; + req->u.accept.id = (uint64_t) map; + req->u.accept.ref = map2->active.ref; + req->u.accept.id_new = (uint64_t) map2; + req->u.accept.evtchn = evtchn; + map->passive.accept_map = map2; + + bedata->ring.req_prod_pvt++; + RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(>ring, notify); + spin_unlock(>pvcallss_lock); + if (notify) + notify_remote_via_irq(bedata->irq); + if (nonblock) { + WRITE_ONCE(map->passive.inflight_req_id, req_id); +
[Xen-devel] [PATCH v3 00/13] introduce the Xen PV Calls frontend
Hi all, this series introduces the frontend for the newly introduced PV Calls procotol. PV Calls is a paravirtualized protocol that allows the implementation of a set of POSIX functions in a different domain. The PV Calls frontend sends POSIX function calls to the backend, which implements them and returns a value to the frontend and acts on the function call. For more information about PV Calls, please read: https://xenbits.xen.org/docs/unstable/misc/pvcalls.html This patch series only implements the frontend driver. It doesn't attempt to redirect POSIX calls to it. The functions exported in pvcalls-front.h are meant to be used for that. A separate patch series will be sent to use them and hook them into the system. Changes in v3: In addition to addressing all comments, in this version of the series I also made one other significant change: I implemented non-blocking accept, which I didn't realize was missing in the last version of the series (non-blocking accept is usually called only after a successful poll, when there are already outstanding connections to accept). To do that, I also changed the id for sockets sent to the backend from the value of the struct socket pointer to the value of the struct sock_mapping pointer. It's all documented in the patch descriptions. This is the full list of changes: - use struct sock_mapping* instead of struct socket* as socket id to share with the backend - allocate struct sock_mapping for a socket in pvcalls_front_socket - in pvcalls_front_accept check if the request is non-blocking and return -EAGAIN in that case - store req_id and struct sock_mapping * of an inflight accept request, so that we can resume when it's ready, in case the accept is non-blocking - remove unnecessary parenthesis - rename RING_ORDER to PVCALLS_RING_ORDER - make pvcalls_front_dev static - remove ref local variable from pvcalls_front_probe - move patch #12 before patch #2 - move dev_set_drvdata before any got error in pvcalls_front_probe - combine src and dst calculation lines in pvcalls_front_event_handler - remove unnecessary != 0 in the wait_event and wait_event_interruptible tests - add an in-code comment about sk_send_head - In v2 I applied some changes to the wrong patch. Move changes to the right patches. - refactor the code to get a req_id into a function - use wait_event (instead of wait_event_interruptible) to wait for a response from the backend because we cannot cope with missing responses - remove the usage of PVCALLS_FRONT_MAX_SPIN from recvmsg - remove unnecessary !bedata check in pvcalls_front_release Changes in v2: - use xenbus_read_unsigned when possible - call dev_set_drvdata earlier in pvcalls_front_probe not to dereference a NULL pointer in the error path - set ret appropriately in pvcalls_front_probe - include pvcalls-front.h in pvcalls-front.c - call wake_up only once after the consuming loop in pvcalls_front_event_handler - don't leak "bytes" in case of errors in create_active - call spin_unlock appropriately in case of errors in create_active - remove all BUG_ONs - don't leak newsock->sk in pvcalls_front_accept in case of errors - rename PVCALLS_FRON_MAX_SPIN to PVCALLS_FRONT_MAX_SPIN - return bool from pvcalls_front_read_todo - add a barrier after setting PVCALLS_FLAG_POLL_RET in pvcalls_front_event_handler - remove outdated comment in pvcalls_front_free_map - clear sock->sk->sk_send_head later in pvcalls_front_release - make XEN_PVCALLS_FRONTEND tristate - don't add an empty resume function Stefano Stabellini (13): xen/pvcalls: introduce the pvcalls xenbus frontend xen/pvcalls: implement frontend disconnect xen/pvcalls: connect to the backend xen/pvcalls: implement socket command and handle events xen/pvcalls: implement connect command xen/pvcalls: implement bind command xen/pvcalls: implement listen command xen/pvcalls: implement accept command xen/pvcalls: implement sendmsg xen/pvcalls: implement recvmsg xen/pvcalls: implement poll command xen/pvcalls: implement release command xen: introduce a Kconfig option to enable the pvcalls frontend drivers/xen/Kconfig |9 + drivers/xen/Makefile|1 + drivers/xen/pvcalls-front.c | 1140 +++ drivers/xen/pvcalls-front.h | 28 ++ 4 files changed, 1178 insertions(+) create mode 100644 drivers/xen/pvcalls-front.c create mode 100644 drivers/xen/pvcalls-front.h ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 05/13] xen/pvcalls: implement connect command
Send PVCALLS_CONNECT to the backend. Allocate a new ring and evtchn for the active socket. Introduce fields in struct sock_mapping to keep track of active sockets. Introduce a waitqueue to allow the frontend to wait on data coming from the backend on the active socket (recvmsg command). Two mutexes (one of reads and one for writes) will be used to protect the active socket in and out rings from concurrent accesses. Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/pvcalls-front.c | 146 drivers/xen/pvcalls-front.h | 2 + 2 files changed, 148 insertions(+) diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c index 7c4a7cb..379b8fb 100644 --- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -13,6 +13,10 @@ */ #include +#include +#include + +#include #include #include @@ -44,6 +48,18 @@ struct sock_mapping { bool active_socket; struct list_head list; struct socket *sock; + union { + struct { + int irq; + grant_ref_t ref; + struct pvcalls_data_intf *ring; + struct pvcalls_data data; + struct mutex in_mutex; + struct mutex out_mutex; + + wait_queue_head_t inflight_conn_req; + } active; + }; }; static inline int get_request(struct pvcalls_bedata *bedata, int *req_id) @@ -97,6 +113,18 @@ static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id) return IRQ_HANDLED; } +static irqreturn_t pvcalls_front_conn_handler(int irq, void *sock_map) +{ + struct sock_mapping *map = sock_map; + + if (map == NULL) + return IRQ_HANDLED; + + wake_up_interruptible(>active.inflight_conn_req); + + return IRQ_HANDLED; +} + int pvcalls_front_socket(struct socket *sock) { struct pvcalls_bedata *bedata; @@ -162,6 +190,124 @@ int pvcalls_front_socket(struct socket *sock) return ret; } +static int create_active(struct sock_mapping *map, int *evtchn) +{ + void *bytes; + int ret = -ENOMEM, irq = -1, i; + + init_waitqueue_head(>active.inflight_conn_req); + + map->active.ring = (struct pvcalls_data_intf *) + __get_free_page(GFP_KERNEL | __GFP_ZERO); + if (map->active.ring == NULL) + goto out_error; + memset(map->active.ring, 0, XEN_PAGE_SIZE); + map->active.ring->ring_order = PVCALLS_RING_ORDER; + bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, + map->active.ring->ring_order); + if (bytes == NULL) + goto out_error; + for (i = 0; i < (1 << map->active.ring->ring_order); i++) + map->active.ring->ref[i] = gnttab_grant_foreign_access( + pvcalls_front_dev->otherend_id, + pfn_to_gfn(virt_to_pfn(bytes) + i), 0); + + map->active.ref = gnttab_grant_foreign_access( + pvcalls_front_dev->otherend_id, + pfn_to_gfn(virt_to_pfn((void *)map->active.ring)), 0); + + map->active.data.in = bytes; + map->active.data.out = bytes + + XEN_FLEX_RING_SIZE(map->active.ring->ring_order); + + ret = xenbus_alloc_evtchn(pvcalls_front_dev, evtchn); + if (ret) + goto out_error; + irq = bind_evtchn_to_irqhandler(*evtchn, pvcalls_front_conn_handler, + 0, "pvcalls-frontend", map); + if (irq < 0) { + ret = irq; + goto out_error; + } + + map->active.irq = irq; + map->active_socket = true; + mutex_init(>active.in_mutex); + mutex_init(>active.out_mutex); + + return 0; + +out_error: + if (irq >= 0) + unbind_from_irqhandler(irq, map); + else if (*evtchn >= 0) + xenbus_free_evtchn(pvcalls_front_dev, *evtchn); + kfree(map->active.data.in); + kfree(map->active.ring); + kfree(map); + return ret; +} + +int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr, + int addr_len, int flags) +{ + struct pvcalls_bedata *bedata; + struct sock_mapping *map = NULL; + struct xen_pvcalls_request *req; + int notify, req_id, ret, evtchn; + + if (!pvcalls_front_dev) + return -ENETUNREACH; + if (addr->sa_family != AF_INET || sock->type != SOCK_STREAM) + return -ENOTSUPP; + + bedata = dev_get_drvdata(_front_dev->dev); + + map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head); + if (!map) + return -EINVAL; + + spin_lock(>pvcallss_lock); + ret = get_request(bedata, _id); + if (ret < 0) { +
[Xen-devel] [PATCH v3 13/13] xen: introduce a Kconfig option to enable the pvcalls frontend
Also add pvcalls-front to the Makefile. Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/Kconfig | 9 + drivers/xen/Makefile | 1 + 2 files changed, 10 insertions(+) diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig index 4545561..0b2c828 100644 --- a/drivers/xen/Kconfig +++ b/drivers/xen/Kconfig @@ -196,6 +196,15 @@ config XEN_PCIDEV_BACKEND If in doubt, say m. +config XEN_PVCALLS_FRONTEND + tristate "XEN PV Calls frontend driver" + depends on INET && XEN + help + Experimental frontend for the Xen PV Calls protocol + (https://xenbits.xen.org/docs/unstable/misc/pvcalls.html). It + sends a small set of POSIX calls to the backend, which + implements them. + config XEN_PVCALLS_BACKEND bool "XEN PV Calls backend driver" depends on INET && XEN && XEN_BACKEND diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile index 480b928..afb9e03 100644 --- a/drivers/xen/Makefile +++ b/drivers/xen/Makefile @@ -39,6 +39,7 @@ obj-$(CONFIG_XEN_EFI) += efi.o obj-$(CONFIG_XEN_SCSI_BACKEND) += xen-scsiback.o obj-$(CONFIG_XEN_AUTO_XLATE) += xlate_mmu.o obj-$(CONFIG_XEN_PVCALLS_BACKEND) += pvcalls-back.o +obj-$(CONFIG_XEN_PVCALLS_FRONTEND) += pvcalls-front.o xen-evtchn-y := evtchn.o xen-gntdev-y := gntdev.o xen-gntalloc-y := gntalloc.o -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 01/13] xen/pvcalls: introduce the pvcalls xenbus frontend
Introduce a xenbus frontend for the pvcalls protocol, as defined by https://xenbits.xen.org/docs/unstable/misc/pvcalls.html. This patch only adds the stubs, the code will be added by the following patches. Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/pvcalls-front.c | 61 + 1 file changed, 61 insertions(+) create mode 100644 drivers/xen/pvcalls-front.c diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c new file mode 100644 index 000..a8d38c2 --- /dev/null +++ b/drivers/xen/pvcalls-front.c @@ -0,0 +1,61 @@ +/* + * (c) 2017 Stefano Stabellini + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include + +#include +#include +#include +#include +#include + +static const struct xenbus_device_id pvcalls_front_ids[] = { + { "pvcalls" }, + { "" } +}; + +static int pvcalls_front_remove(struct xenbus_device *dev) +{ + return 0; +} + +static int pvcalls_front_probe(struct xenbus_device *dev, + const struct xenbus_device_id *id) +{ + return 0; +} + +static void pvcalls_front_changed(struct xenbus_device *dev, + enum xenbus_state backend_state) +{ +} + +static struct xenbus_driver pvcalls_front_driver = { + .ids = pvcalls_front_ids, + .probe = pvcalls_front_probe, + .remove = pvcalls_front_remove, + .otherend_changed = pvcalls_front_changed, +}; + +static int __init pvcalls_frontend_init(void) +{ + if (!xen_domain()) + return -ENODEV; + + pr_info("Initialising Xen pvcalls frontend driver\n"); + + return xenbus_register_frontend(_front_driver); +} + +module_init(pvcalls_frontend_init); -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 04/13] xen/pvcalls: implement socket command and handle events
Send a PVCALLS_SOCKET command to the backend, use the masked req_prod_pvt as req_id. This way, req_id is guaranteed to be between 0 and PVCALLS_NR_REQ_PER_RING. We already have a slot in the rsp array ready for the response, and there cannot be two outstanding responses with the same req_id. Wait for the response by waiting on the inflight_req waitqueue and check for the req_id field in rsp[req_id]. Use atomic accesses and barriers to read the field. Note that the barriers are simple smp barriers (as opposed to virt barriers) because they are for internal frontend synchronization, not frontend<->backend communication. Once a response is received, clear the corresponding rsp slot by setting req_id to PVCALLS_INVALID_ID. Note that PVCALLS_INVALID_ID is invalid only from the frontend point of view. It is not part of the PVCalls protocol. pvcalls_front_event_handler is in charge of copying responses from the ring to the appropriate rsp slot. It is done by copying the body of the response first, then by copying req_id atomically. After the copies, wake up anybody waiting on waitqueue. pvcallss_lock protects accesses to the ring. Create a new struct sock_mapping and convert the pointer into an uint64_t and use it as id for the new socket to pass to the backend. The struct will be fully initialized later on connect or bind. In this patch the struct sock_mapping is empty, the fields will be added by the next patch. sock->sk->sk_send_head is not used for ip sockets: reuse the field to store a pointer to the struct sock_mapping corresponding to the socket. This way, we can easily get the struct sock_mapping from the struct socket. Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/pvcalls-front.c | 119 drivers/xen/pvcalls-front.h | 8 +++ 2 files changed, 127 insertions(+) create mode 100644 drivers/xen/pvcalls-front.h diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c index 2afe36d..7c4a7cb 100644 --- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -20,6 +20,8 @@ #include #include +#include "pvcalls-front.h" + #define PVCALLS_INVALID_ID UINT_MAX #define PVCALLS_RING_ORDER XENBUS_MAX_RING_GRANT_ORDER #define PVCALLS_NR_REQ_PER_RING __CONST_RING_SIZE(xen_pvcalls, XEN_PAGE_SIZE) @@ -38,11 +40,128 @@ struct pvcalls_bedata { }; static struct xenbus_device *pvcalls_front_dev; +struct sock_mapping { + bool active_socket; + struct list_head list; + struct socket *sock; +}; + +static inline int get_request(struct pvcalls_bedata *bedata, int *req_id) +{ + *req_id = bedata->ring.req_prod_pvt & (RING_SIZE(>ring) - 1); + if (RING_FULL(>ring) || + READ_ONCE(bedata->rsp[*req_id].req_id) != PVCALLS_INVALID_ID) + return -EAGAIN; + return 0; +} + static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id) { + struct xenbus_device *dev = dev_id; + struct pvcalls_bedata *bedata; + struct xen_pvcalls_response *rsp; + uint8_t *src, *dst; + int req_id = 0, more = 0, done = 0; + + if (dev == NULL) + return IRQ_HANDLED; + + bedata = dev_get_drvdata(>dev); + if (bedata == NULL) + return IRQ_HANDLED; + +again: + while (RING_HAS_UNCONSUMED_RESPONSES(>ring)) { + rsp = RING_GET_RESPONSE(>ring, bedata->ring.rsp_cons); + + req_id = rsp->req_id; + dst = (uint8_t *)>rsp[req_id] + sizeof(rsp->req_id); + src = (uint8_t *)rsp + sizeof(rsp->req_id); + memcpy(dst, src, sizeof(*rsp) - sizeof(rsp->req_id)); + /* +* First copy the rest of the data, then req_id. It is +* paired with the barrier when accessing bedata->rsp. +*/ + smp_wmb(); + WRITE_ONCE(bedata->rsp[req_id].req_id, rsp->req_id); + + done = 1; + bedata->ring.rsp_cons++; + } + + RING_FINAL_CHECK_FOR_RESPONSES(>ring, more); + if (more) + goto again; + if (done) + wake_up(>inflight_req); return IRQ_HANDLED; } +int pvcalls_front_socket(struct socket *sock) +{ + struct pvcalls_bedata *bedata; + struct sock_mapping *map = NULL; + struct xen_pvcalls_request *req; + int notify, req_id, ret; + + if (!pvcalls_front_dev) + return -EACCES; + /* +* PVCalls only supports domain AF_INET, +* type SOCK_STREAM and protocol 0 sockets for now. +* +* Check socket type here, AF_INET and protocol checks are done +* by the caller. +*/ + if (sock->type != SOCK_STREAM) + return -ENOTSUPP; + + bedata = dev_get_drvdata(_front_dev->dev); + + map = kzalloc(sizeof(*map), GFP_KERNEL); + if (map == NULL) +
[Xen-devel] [PATCH v3 07/13] xen/pvcalls: implement listen command
Send PVCALLS_LISTEN to the backend. Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/pvcalls-front.c | 47 + drivers/xen/pvcalls-front.h | 1 + 2 files changed, 48 insertions(+) diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c index 5ccef34..b2757f5 100644 --- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -367,6 +367,53 @@ int pvcalls_front_bind(struct socket *sock, struct sockaddr *addr, int addr_len) return 0; } +int pvcalls_front_listen(struct socket *sock, int backlog) +{ + struct pvcalls_bedata *bedata; + struct sock_mapping *map; + struct xen_pvcalls_request *req; + int notify, req_id, ret; + + if (!pvcalls_front_dev) + return -ENOTCONN; + bedata = dev_get_drvdata(_front_dev->dev); + + map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head); + if (!map) + return -ENOTSOCK; + + if (map->passive.status != PVCALLS_STATUS_BIND) + return -EOPNOTSUPP; + + spin_lock(>pvcallss_lock); + ret = get_request(bedata, _id); + if (ret < 0) { + spin_unlock(>pvcallss_lock); + return ret; + } + req = RING_GET_REQUEST(>ring, req_id); + req->req_id = req_id; + req->cmd = PVCALLS_LISTEN; + req->u.listen.id = (uint64_t) map; + req->u.listen.backlog = backlog; + + bedata->ring.req_prod_pvt++; + RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(>ring, notify); + spin_unlock(>pvcallss_lock); + if (notify) + notify_remote_via_irq(bedata->irq); + + wait_event(bedata->inflight_req, + READ_ONCE(bedata->rsp[req_id].req_id) == req_id); + + map->passive.status = PVCALLS_STATUS_LISTEN; + ret = bedata->rsp[req_id].ret; + /* read ret, then set this rsp slot to be reused */ + smp_mb(); + WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID); + return ret; +} + static const struct xenbus_device_id pvcalls_front_ids[] = { { "pvcalls" }, { "" } diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h index 8b0a274..aa8fe10 100644 --- a/drivers/xen/pvcalls-front.h +++ b/drivers/xen/pvcalls-front.h @@ -9,5 +9,6 @@ int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr, int pvcalls_front_bind(struct socket *sock, struct sockaddr *addr, int addr_len); +int pvcalls_front_listen(struct socket *sock, int backlog); #endif -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 09/13] xen/pvcalls: implement sendmsg
Send data to an active socket by copying data to the "out" ring. Take the active socket out_mutex so that only one function can access the ring at any given time. If not enough room is available on the ring, rather than returning immediately or sleep-waiting, spin for up to 5000 cycles. This small optimization turns out to improve performance significantly. Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/pvcalls-front.c | 109 drivers/xen/pvcalls-front.h | 3 ++ 2 files changed, 112 insertions(+) diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c index f83b910..369acde 100644 --- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -29,6 +29,7 @@ #define PVCALLS_INVALID_ID UINT_MAX #define PVCALLS_RING_ORDER XENBUS_MAX_RING_GRANT_ORDER #define PVCALLS_NR_REQ_PER_RING __CONST_RING_SIZE(xen_pvcalls, XEN_PAGE_SIZE) +#define PVCALLS_FRONT_MAX_SPIN 5000 struct pvcalls_bedata { struct xen_pvcalls_front_ring ring; @@ -88,6 +89,22 @@ static inline int get_request(struct pvcalls_bedata *bedata, int *req_id) return 0; } +static int pvcalls_front_write_todo(struct sock_mapping *map) +{ + struct pvcalls_data_intf *intf = map->active.ring; + RING_IDX cons, prod, size = XEN_FLEX_RING_SIZE(intf->ring_order); + int32_t error; + + cons = intf->out_cons; + prod = intf->out_prod; + error = intf->out_error; + if (error == -ENOTCONN) + return 0; + if (error != 0) + return error; + return size - pvcalls_queued(prod, cons, size); +} + static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id) { struct xenbus_device *dev = dev_id; @@ -325,6 +342,98 @@ int pvcalls_front_connect(struct socket *sock, struct sockaddr *addr, return ret; } +static int __write_ring(struct pvcalls_data_intf *intf, + struct pvcalls_data *data, + struct iov_iter *msg_iter, + size_t len) +{ + RING_IDX cons, prod, size, masked_prod, masked_cons; + RING_IDX array_size = XEN_FLEX_RING_SIZE(intf->ring_order); + int32_t error; + + cons = intf->out_cons; + prod = intf->out_prod; + error = intf->out_error; + /* read indexes before continuing */ + virt_mb(); + + if (error < 0) + return error; + + size = pvcalls_queued(prod, cons, array_size); + if (size >= array_size) + return 0; + if (len > array_size - size) + len = array_size - size; + + masked_prod = pvcalls_mask(prod, array_size); + masked_cons = pvcalls_mask(cons, array_size); + + if (masked_prod < masked_cons) { + copy_from_iter(data->out + masked_prod, len, msg_iter); + } else { + if (len > array_size - masked_prod) { + copy_from_iter(data->out + masked_prod, + array_size - masked_prod, msg_iter); + copy_from_iter(data->out, + len - (array_size - masked_prod), + msg_iter); + } else { + copy_from_iter(data->out + masked_prod, len, msg_iter); + } + } + /* write to ring before updating pointer */ + virt_wmb(); + intf->out_prod += len; + + return len; +} + +int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg, + size_t len) +{ + struct pvcalls_bedata *bedata; + struct sock_mapping *map; + int sent = 0, tot_sent = 0; + int count = 0, flags; + + if (!pvcalls_front_dev) + return -ENOTCONN; + bedata = dev_get_drvdata(_front_dev->dev); + + map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head); + if (!map) + return -ENOTSOCK; + + flags = msg->msg_flags; + if (flags & (MSG_CONFIRM|MSG_DONTROUTE|MSG_EOR|MSG_OOB)) + return -EOPNOTSUPP; + + mutex_lock(>active.out_mutex); + if ((flags & MSG_DONTWAIT) && !pvcalls_front_write_todo(map)) { + mutex_unlock(>active.out_mutex); + return -EAGAIN; + } + +again: + count++; + sent = __write_ring(map->active.ring, + >active.data, >msg_iter, + len); + if (sent > 0) { + len -= sent; + tot_sent += sent; + notify_remote_via_irq(map->active.irq); + } + if (sent >= 0 && len > 0 && count < PVCALLS_FRONT_MAX_SPIN) + goto again; + if (sent < 0) + tot_sent = sent; + + mutex_unlock(>active.out_mutex); + return tot_sent; +} + int pvcalls_front_bind(struct socket *sock, struct
[Xen-devel] [PATCH v3 03/13] xen/pvcalls: connect to the backend
Implement the probe function for the pvcalls frontend. Read the supported versions, max-page-order and function-calls nodes from xenstore. Only one frontend<->backend connection is supported at any given time for a guest. Store the active frontend device to a static pointer. Introduce a stub functions for the event handler. Signed-off-by: Stefano StabelliniCC: boris.ostrov...@oracle.com CC: jgr...@suse.com --- drivers/xen/pvcalls-front.c | 130 1 file changed, 130 insertions(+) diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c index a126195..2afe36d 100644 --- a/drivers/xen/pvcalls-front.c +++ b/drivers/xen/pvcalls-front.c @@ -84,12 +84,142 @@ static int pvcalls_front_remove(struct xenbus_device *dev) static int pvcalls_front_probe(struct xenbus_device *dev, const struct xenbus_device_id *id) { + int ret = -ENOMEM, evtchn, i; + unsigned int max_page_order, function_calls, len; + char *versions; + grant_ref_t gref_head = 0; + struct xenbus_transaction xbt; + struct pvcalls_bedata *bedata = NULL; + struct xen_pvcalls_sring *sring; + + if (pvcalls_front_dev != NULL) { + dev_err(>dev, "only one PV Calls connection supported\n"); + return -EINVAL; + } + + versions = xenbus_read(XBT_NIL, dev->otherend, "versions", ); + if (!len) + return -EINVAL; + if (strcmp(versions, "1")) { + kfree(versions); + return -EINVAL; + } + kfree(versions); + max_page_order = xenbus_read_unsigned(dev->otherend, + "max-page-order", 0); + if (max_page_order < PVCALLS_RING_ORDER) + return -ENODEV; + function_calls = xenbus_read_unsigned(dev->otherend, + "function-calls", 0); + if (function_calls != 1) + return -ENODEV; + pr_info("%s max-page-order is %u\n", __func__, max_page_order); + + bedata = kzalloc(sizeof(struct pvcalls_bedata), GFP_KERNEL); + if (!bedata) + return -ENOMEM; + + dev_set_drvdata(>dev, bedata); + pvcalls_front_dev = dev; + init_waitqueue_head(>inflight_req); + for (i = 0; i < PVCALLS_NR_REQ_PER_RING; i++) + bedata->rsp[i].req_id = PVCALLS_INVALID_ID; + + sring = (struct xen_pvcalls_sring *) __get_free_page(GFP_KERNEL | +__GFP_ZERO); + if (!sring) + goto error; + SHARED_RING_INIT(sring); + FRONT_RING_INIT(>ring, sring, XEN_PAGE_SIZE); + + ret = xenbus_alloc_evtchn(dev, ); + if (ret) + goto error; + + bedata->irq = bind_evtchn_to_irqhandler(evtchn, + pvcalls_front_event_handler, + 0, "pvcalls-frontend", dev); + if (bedata->irq < 0) { + ret = bedata->irq; + goto error; + } + + ret = gnttab_alloc_grant_references(1, _head); + if (ret < 0) + goto error; + bedata->ref = gnttab_claim_grant_reference(_head); + if (bedata->ref < 0) { + ret = bedata->ref; + goto error; + } + gnttab_grant_foreign_access_ref(bedata->ref, dev->otherend_id, + virt_to_gfn((void *)sring), 0); + + again: + ret = xenbus_transaction_start(); + if (ret) { + xenbus_dev_fatal(dev, ret, "starting transaction"); + goto error; + } + ret = xenbus_printf(xbt, dev->nodename, "version", "%u", 1); + if (ret) + goto error_xenbus; + ret = xenbus_printf(xbt, dev->nodename, "ring-ref", "%d", bedata->ref); + if (ret) + goto error_xenbus; + ret = xenbus_printf(xbt, dev->nodename, "port", "%u", + evtchn); + if (ret) + goto error_xenbus; + ret = xenbus_transaction_end(xbt, 0); + if (ret) { + if (ret == -EAGAIN) + goto again; + xenbus_dev_fatal(dev, ret, "completing transaction"); + goto error; + } + + INIT_LIST_HEAD(>socket_mappings); + INIT_LIST_HEAD(>socketpass_mappings); + spin_lock_init(>pvcallss_lock); + xenbus_switch_state(dev, XenbusStateInitialised); + return 0; + + error_xenbus: + xenbus_transaction_end(xbt, 1); + xenbus_dev_fatal(dev, ret, "writing xenstore"); + error: + pvcalls_front_remove(dev); + return ret; } static void pvcalls_front_changed(struct xenbus_device *dev, enum xenbus_state backend_state) { + switch (backend_state) { + case XenbusStateReconfiguring: + case
[Xen-devel] [linux-3.18 test] 112394: trouble: blocked/broken/fail/pass
flight 112394 linux-3.18 real [real] http://logs.test-lab.xenproject.org/osstest/logs/112394/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-arm64-xsm 2 hosts-allocate broken REGR. vs. 112102 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 112102 build-arm64 2 hosts-allocate broken REGR. vs. 112102 build-arm64-pvops 3 capture-logs broken REGR. vs. 112102 Tests which are failing intermittently (not blocking): test-amd64-i386-qemut-rhel6hvm-intel 7 xen-boot fail in 112387 pass in 112394 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail pass in 112387 Tests which did not succeed, but are not blocking: test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-xl 1 build-check(1) blocked n/a build-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-examine 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit2 1 build-check(1) blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a build-arm64-xsm 3 capture-logs broken blocked in 112102 build-arm64 3 capture-logs broken blocked in 112102 test-amd64-amd64-xl-qemuu-win7-amd64 18 guest-start/win.repeat fail blocked in 112102 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail in 112387 blocked in 112102 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail in 112387 blocked in 112102 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail in 112387 like 112102 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 112085 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 112085 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 112102 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 112102 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 112102 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 112102 test-amd64-amd64-xl-rtds 10 debian-install fail like 112102 test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore fail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail
Re: [Xen-devel] [PATCH v2 11/13] xen/pvcalls: implement release command
On Thu, 27 Jul 2017, Boris Ostrovsky wrote: > > +int pvcalls_front_release(struct socket *sock) > > +{ > > + struct pvcalls_bedata *bedata; > > + struct sock_mapping *map; > > + int req_id, notify; > > + struct xen_pvcalls_request *req; > > + > > + if (!pvcalls_front_dev) > > + return -EIO; > > + bedata = dev_get_drvdata(_front_dev->dev); > > + if (!bedata) > > + return -EIO; > > Some (all?) other ops don't check bedata validity. Should they all do? No, I don't think they should: dev_set_drvdata is called in the probe function (pvcalls_front_probe). I'll remove it. > > + > > + if (sock->sk == NULL) > > + return 0; > > + > > + map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head); > > + if (map == NULL) > > + return 0; > > + > > + spin_lock(>pvcallss_lock); > > + req_id = bedata->ring.req_prod_pvt & (RING_SIZE(>ring) - 1); > > + if (RING_FULL(>ring) || > > + READ_ONCE(bedata->rsp[req_id].req_id) != PVCALLS_INVALID_ID) { > > + spin_unlock(>pvcallss_lock); > > + return -EAGAIN; > > + } > > + WRITE_ONCE(sock->sk->sk_send_head, NULL); > > + > > + req = RING_GET_REQUEST(>ring, req_id); > > + req->req_id = req_id; > > + req->cmd = PVCALLS_RELEASE; > > + req->u.release.id = (uint64_t)sock; > > + > > + bedata->ring.req_prod_pvt++; > > + RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(>ring, notify); > > + spin_unlock(>pvcallss_lock); > > + if (notify) > > + notify_remote_via_irq(bedata->irq); > > + > > + wait_event(bedata->inflight_req, > > + READ_ONCE(bedata->rsp[req_id].req_id) == req_id); > > + > > + if (map->active_socket) { > > + /* > > +* Set in_error and wake up inflight_conn_req to force > > +* recvmsg waiters to exit. > > +*/ > > + map->active.ring->in_error = -EBADF; > > + wake_up_interruptible(>active.inflight_conn_req); > > + > > + mutex_lock(>active.in_mutex); > > + mutex_lock(>active.out_mutex); > > + pvcalls_front_free_map(bedata, map); > > + mutex_unlock(>active.out_mutex); > > + mutex_unlock(>active.in_mutex); > > + kfree(map); > > Since you are locking here I assume you expect that someone else might > also be trying to lock the map. But you are freeing it immediately after > unlocking. Wouldn't that mean that whoever is trying to grab the lock > might then dereference freed memory? The lock is to make sure there are no recvmsg or sendmsg in progress. We are sure that no newer sendmsg or recvmsg are waiting for pvcalls_front_release to release the lock because before send a message to the backend we set sk_send_head to NULL. > > + } else { > > + spin_lock(>pvcallss_lock); > > + list_del_init(>list); > > + kfree(map); > > + spin_unlock(>pvcallss_lock); > > + } > > + WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID); > > + > > + return 0; > > +} > > + > > static const struct xenbus_device_id pvcalls_front_ids[] = { > > { "pvcalls" }, > > { "" } > > diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h > > index 25e05b8..3332978 100644 > > --- a/drivers/xen/pvcalls-front.h > > +++ b/drivers/xen/pvcalls-front.h > > @@ -23,5 +23,6 @@ int pvcalls_front_recvmsg(struct socket *sock, > > unsigned int pvcalls_front_poll(struct file *file, > > struct socket *sock, > > poll_table *wait); > > +int pvcalls_front_release(struct socket *sock); > > > > #endif ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 09/13] xen/pvcalls: implement recvmsg
On Thu, 27 Jul 2017, Boris Ostrovsky wrote: > On 07/26/2017 08:08 PM, Stefano Stabellini wrote: > > On Wed, 26 Jul 2017, Boris Ostrovsky wrote: > +count++; > +else > + > wait_event_interruptible(map->active.inflight_conn_req, > + > pvcalls_front_read_todo(map)); > +} > >>> Should we be using PVCALLS_FRONT_MAX_SPIN here? In sendmsg it is > >>> counting non-sleeping iterations but here we are sleeping so > >>> PVCALLS_FRONT_MAX_SPIN (5000) may take a while. > >>> > >>> In fact, what shouldn't this waiting be a function of MSG_DONTWAIT > >> err, which it already is. But the question still stands (except for > >> MSG_DONTWAIT). > > The code (admittedly unintuitive) is busy-looping (non-sleeping) for > > 5000 iterations *before* attempting to sleep. So in that regard, recvmsg > > and sendmsg use PVCALLS_FRONT_MAX_SPIN in the same way: only for > > non-sleeping iterations. > > > > OK. > > Why not go directly into wait_event_interruptible()? I see you write in > the commit message > > If not enough data is available on the ring, rather than returning > immediately or sleep-waiting, spin for up to 5000 cycles. This small > optimization turns out to improve performance and latency significantly. > > > Is this because of scheduling latency? I think this should be mentioned not > just in the commit message but also as a comment in the code. It tries to mitigate scheduling latencies on both ends (dom0 and domU) when the ring buffer is the bottleneck (high bandwidth connections). But to be honest with you, it's mostly beneficial in the sendmsg case, because for recvmsg we also introduce a busy-wait in regular circumstances, when no data is actually available. I confirmed this statement with a quick iperf test. I'll remove the spin from recvmsg and keep it in sendmsg. > > (I also think it's not "not enough data" but rather "no data"?) you are right ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2 3/6] xen/arm: Allow platform_hvc to handle guest SMC calls
On Thu, Feb 09, 2017 at 12:32:09PM -0700, Tamas K Lengyel wrote: > On Thu, Feb 9, 2017 at 11:43 AM, Stefano Stabellini >wrote: > > On Thu, 9 Feb 2017, Tamas K Lengyel wrote: > >> On Thu, Feb 9, 2017 at 11:22 AM, Stefano Stabellini > >> wrote: > >> > On Thu, 9 Feb 2017, Edgar E. Iglesias wrote: > >> >> On Thu, Feb 09, 2017 at 10:12:41AM +0100, Edgar E. Iglesias wrote: > >> >> > On Wed, Feb 08, 2017 at 05:20:44PM -0800, Stefano Stabellini wrote: > >> >> > > On Thu, 9 Feb 2017, Julien Grall wrote: > >> >> > > > On 08/02/2017 23:28, Tamas K Lengyel wrote: > >> >> > > > > On Wed, Feb 8, 2017 at 3:04 PM, Julien Grall > >> >> > > > > wrote: > >> >> > > > > > Hi Tamas, .. > >> In principle I have nothing against a command line option, but I don't > >> really follow how that would help. The monitor system is disabled by > >> default for all domains, so there is no problem with dom0 booting or > >> any other domain needing to access the firmware. You specifically have > >> to enable the monitoring for domains. Why is it a problem to have it > >> be exclusive for just those domains where it is enabled? > > > > I am suggesting this solution because I expect many use-cases for memory > > introspection that don't actually require any platform_hvc events to be > > monitored at all. On the other end, I expect that on platforms where > > platform_hvc is implemented, such as the ZynqMP, those calls are > > important and should be handled in Xen in most cases. > > > > Looking at the code, does monitor.privileged_call_enabled only cover > > SMC? Is monitor.privileged_call_enabled disabled by default? > > If so, monitor.privileged_call_enabled could be the tunable I was > > talking about. As long as enabling memory introspection doesn't > > automatically forward platform_hvc events to the monitor, I am fine with > > it. > > Yes, monitor.privileged_call_enabled only covers SMCs right now and it > is disabled by default. It has to be enabled specifically for a > domain. Memory introspection is separate from this, that is handled > by the mem_access system and it can be enabled separately from SMC > monitoring. > > As for hypercalls that get handled by Xen, I don't really need to > monitor those. If Xen would on the other hand go and call some > firmware as a result of the hypercall, I would need to be able to deny > that. So as long as XSM can be used to control HVC calls, that works > for me just fine too. Hi again! This was quite a while ago but I think we kind of ended up with monitor.privileged_call_enabled being a possible flag to conditionalize the forwarding of firmware calls or not. There are at least 3 cases to consider at the moment: 1. Firmware calls over SMC (PSCI or other platform calls like EEMI) 2. Firmware calls over HVC Handled by Xen (PSCI and XEN Hypercalls) 3. Firmware calls over HVC Handled by platform specific code (e.g EEMI) For #1 Firmware calls over SMC: I've conditionalized all of it on monitor.privileged_call_enabled. It's either the monitor or the firmware call handling, they are mutually exclusive. Guests can still do PSCI over HVC. For #2, things work like today. This is PSCI and the Xen Hypercallsi over HVC. For #3, only platform code knows if the specific call will be handled in Xen completely or if it will result in some kind of SMC to lower layers. If monitor.privileged_call_enabled is on, I've made the ZynqMP implementation gracefully NACK any call that would result in an SMC issued by Xen. Are there any concerns around this? I'll also send out code for review, it may be easier to follow :-) Best regards, Edgar ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 05/13] xen/pvcalls: implement bind command
On Thu, 27 Jul 2017, Boris Ostrovsky wrote: > >> This all looks very similar to previous patches. Can it be factored out? > > You are right that the pattern is the same for all commands: > > - get a request > > - fill the request > > - possibly do something else > > - wait > > however each request is different, the struct and fields are different. > > There are spin_lock and spin_unlock calls intermingled. I am not sure I > > can factor out much of this. Maybe I could create a static inline or > > macro as a syntactic sugar to replace the wait call, but that's pretty > > much it I think. > > Maybe you could factor out common fragments, not necessarily the whole > thing at once? > > For example, > > static inline int get_request(*bedata, int *req_id) > { > > *req_id = bedata->ring.req_prod_pvt & (RING_SIZE(>ring) - 1); > if (RING_FULL(>ring) || > READ_ONCE(bedata->rsp[*req_id].req_id) != PVCALLS_INVALID_ID) { > return -EAGAIN; > return 0; > } > > (or some such) You are right, the code looks better this way. I'll add it. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 0/1] xen/arm: zynqmp: Disable PCIe
On 31/07/2017 20:37, Edgar E. Iglesias wrote: From: "Edgar E. Iglesias"Hi, Hi Edgar, We're seeing panics in dom0 with PCIe enabled due to what seems to be wrongly created mappings by Xen. With older kernels we didn't see the panics but PCIe wasn't functional in dom0. This disables the PCIe nodes on the ZynqMP until Xen/ARM gets more PCIe support. I feel a bit sad to ack a patch disabling PCIe in the ZynqMP. Before doing that. Can you describe what is the exact problem with Xen? It might be possible that we don't parse correctly the device-tree. Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] xen: RCU: avoid busy waiting until the end of grace period.
On Mon, 2017-07-31 at 14:20 -0700, Stefano Stabellini wrote: > On Thu, 27 Jul 2017, Dario Faggioli wrote: > > > > diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c > > index f0fdc87..4586f2a 100644 > > --- a/xen/common/rcupdate.c > > +++ b/xen/common/rcupdate.c > > @@ -84,8 +84,14 @@ struct rcu_data { > > int cpu; > > struct rcu_head barrier; > > longlast_rs_qlen; /* qlen during the last > > resched */ > > + > > +/* 3) idle CPUs handling */ > > +struct timer idle_timer; > > +bool idle_timer_active; > > }; > > > > +#define RCU_IDLE_TIMER_PERIOD MILLISECS(10) > > Isn't this a bit too short? How is it chosen? > It's totally arbitrary (and that would be the case for whatever value we choose). Basically, it's how long, at worst, after the actual end of a grace period, a (batch of) callback(s) will be invoked. Currently, on Credit1 on ARM (without my patch, from this series, that suspends the tick) that's (by chance) 30 ms (or whatever value is chosen for Credit1 timeslice). On Credit2 (on both ARM and x86), it's never, but on x86 it (apparently) is 'however frequent time sync rendezvouses happs' (which I don't recall, but it's longer), while on ARM is (potentially) never. I accept suggestions about alternatives values, and I'm certainly fine with adding a comment, containing something along the lines of the explanation above, but I fear it's going to be hard to figure out what value is actually the "absolute best". In Linux (which is where the same 'callback book-keeping' happens for them), a tick with a frequency of 1000Hz (== 1ms) is considered 'low- latency/Deskop/real-time'. For us, as said above, tick --when it's there-- would be 30ms by default. I just went with something in the middle. Also, it's not that we'll have a 10ms periodic timer going on for significant amount of time. In fact we expect it to actually fire just once (for each grace period). It's not 100% guaranteed that it won't be reprogrammed and fire a couple of times, but it should not, in the vast majority of cases. What makes you think it's short? Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 02/13] xen/pvcalls: connect to the backend
On Thu, 27 Jul 2017, Boris Ostrovsky wrote: > >>> static int pvcalls_front_probe(struct xenbus_device *dev, > >>> const struct xenbus_device_id *id) > >>> { > >>> + int ret = -EFAULT, evtchn, ref = -1, i; > >>> + unsigned int max_page_order, function_calls, len; > >>> + char *versions; > >>> + grant_ref_t gref_head = 0; > >>> + struct xenbus_transaction xbt; > >>> + struct pvcalls_bedata *bedata = NULL; > >>> + struct xen_pvcalls_sring *sring; > >>> + > >>> + if (pvcalls_front_dev != NULL) { > >>> + dev_err(>dev, "only one PV Calls connection > >>> supported\n"); > >>> + return -EINVAL; > >>> + } > >>> + > >>> + versions = xenbus_read(XBT_NIL, dev->otherend, "versions", ); > >>> + if (!len) > >>> + return -EINVAL; > >>> + if (strcmp(versions, "1")) { > >>> + kfree(versions); > >>> + return -EINVAL; > >>> + } > >>> + kfree(versions); > >>> + ret = xenbus_scanf(XBT_NIL, dev->otherend, > >>> +"max-page-order", "%u", _page_order); > >>> + if (ret <= 0) > >>> + return -ENODEV; > >>> + if (max_page_order < RING_ORDER) > >>> + return -ENODEV; > >>> + ret = xenbus_scanf(XBT_NIL, dev->otherend, > >>> +"function-calls", "%u", _calls); > >>> + if (ret <= 0 || function_calls != 1) > >>> + return -ENODEV; > >>> + pr_info("%s max-page-order is %u\n", __func__, max_page_order); > >>> + > >>> + bedata = kzalloc(sizeof(struct pvcalls_bedata), GFP_KERNEL); > >>> + if (!bedata) > >>> + return -ENOMEM; > >>> + > >>> + init_waitqueue_head(>inflight_req); > >>> + for (i = 0; i < PVCALLS_NR_REQ_PER_RING; i++) > >>> + bedata->rsp[i].req_id = PVCALLS_INVALID_ID; > >>> + > >>> + sring = (struct xen_pvcalls_sring *) __get_free_page(GFP_KERNEL | > >>> + __GFP_ZERO); > >>> + if (!sring) > >>> + goto error; > >>> + SHARED_RING_INIT(sring); > >>> + FRONT_RING_INIT(>ring, sring, XEN_PAGE_SIZE); > >>> + > >>> + ret = xenbus_alloc_evtchn(dev, ); > >>> + if (ret) > >>> + goto error; > >>> + > >>> + bedata->irq = bind_evtchn_to_irqhandler(evtchn, > >>> + pvcalls_front_event_handler, > >>> + 0, "pvcalls-frontend", dev); > >>> + if (bedata->irq < 0) { > >>> + ret = bedata->irq; > >>> + goto error; > >>> + } > >>> + > >>> + ret = gnttab_alloc_grant_references(1, _head); > >>> + if (ret < 0) > >>> + goto error; > >>> + bedata->ref = ref = gnttab_claim_grant_reference(_head); > >> Is ref really needed? > > No, I'll remove it > > > > > >>> + if (ref < 0) > >>> + goto error; > >>> + gnttab_grant_foreign_access_ref(ref, dev->otherend_id, > >>> + virt_to_gfn((void *)sring), 0); > >>> + > >>> + again: > >>> + ret = xenbus_transaction_start(); > >>> + if (ret) { > >>> + xenbus_dev_fatal(dev, ret, "starting transaction"); > >>> + goto error; > >>> + } > >>> + ret = xenbus_printf(xbt, dev->nodename, "version", "%u", 1); > >>> + if (ret) > >>> + goto error_xenbus; > >>> + ret = xenbus_printf(xbt, dev->nodename, "ring-ref", "%d", ref); > >>> + if (ret) > >>> + goto error_xenbus; > >>> + ret = xenbus_printf(xbt, dev->nodename, "port", "%u", > >>> + evtchn); > >>> + if (ret) > >>> + goto error_xenbus; > >>> + ret = xenbus_transaction_end(xbt, 0); > >>> + if (ret) { > >>> + if (ret == -EAGAIN) > >>> + goto again; > >>> + xenbus_dev_fatal(dev, ret, "completing transaction"); > >>> + goto error; > >>> + } > >>> + > >>> + INIT_LIST_HEAD(>socket_mappings); > >>> + INIT_LIST_HEAD(>socketpass_mappings); > >>> + spin_lock_init(>pvcallss_lock); > >>> + dev_set_drvdata(>dev, bedata); > >>> + pvcalls_front_dev = dev; > >>> + xenbus_switch_state(dev, XenbusStateInitialised); > >>> + > >>> return 0; > >>> + > >>> + error_xenbus: > >>> + xenbus_transaction_end(xbt, 1); > >>> + xenbus_dev_fatal(dev, ret, "writing xenstore"); > >>> + error: > >>> + pvcalls_front_remove(dev); > >> I think patch 12 (where you implement cleanup) could be moved before this > >> one. > > I'll move the patch > > > > > >> I also think you are leaking bedata on error paths. > > bedata is freed by pvcalls_front_remove (kfree(bedata)), why do you say > > so? > > bedata there is read from dev_get_drvdata() and here you assign drvdata > at the very end. > > Come think of it, pvcalls_front_remove() should probably first check > whether bedata is valid. Or drvdata should be assigned right away in > this routine, before any 'got error/error_xenbus'. Yes, I'll do that ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file
On Mon, Jul 31, 2017 at 02:30:47PM -0700, Stefano Stabellini wrote: > On Mon, 31 Jul 2017, Edgar E. Iglesias wrote: > > On Fri, Jul 28, 2017 at 09:03:15PM +0800, Zhongze Liu wrote: > > > > > > 1. Motivation and Description > > > > Hi, > > > > I think this looks quite useful. I have a few comments inline. > > Hi Edgar, thanks for giving it a look! > > > > > > > > Virtual machines use grant table hypercalls to setup a share page for > > > inter-VMs communications. These hypercalls are used by all PV > > > protocols today. However, very simple guests, such as baremetal > > > applications, might not have the infrastructure to handle the grant table. > > > This project is about setting up several shared memory areas for inter-VMs > > > communications directly from the VM config file. > > > So that the guest kernel doesn't have to have grant table support (in the > > > embedded space, this is not unusual) to be able to communicate with > > > other guests. > > > > > > > > > 2. Implementation Plan: > > > > > > > > > == > > > 2.1 Introduce a new VM config option in xl: > > > == > > > > > > 2.1.1 Design Goals > > > ~~~ > > > > > > The shared areas should be shareable among several (>=2) VMs, so every > > > shared > > > physical memory area is assigned to a set of VMs. Therefore, a “token” or > > > “identifier” should be used here to uniquely identify a backing memory > > > area. > > > A string no longer than 128 bytes is used here to serve the purpose. > > > > > > The backing area would be taken from one domain, which we will regard > > > as the "master domain", and this domain should be created prior to any > > > other "slave domain"s. Again, we have to use some kind of tag to tell who > > > is the "master domain". > > > > > > And the ability to specify the permissions and cacheability (and > > > shareability > > > for ARM guest's) of the pages to be shared should also be given to the > > > user. > > > > > > 2.2.2 Syntax and Behavior > > > ~ > > > The following example illustrates the syntax of the proposed config entry > > > (suppose that we're on x86): > > > > > > In xl config file of vm1: > > > static_shm = [ 'id=ID1, begin=0x10, end=0x20, role=master, > > > cache_policy=x86_normal, prot=r2', > > > > > > 'id=ID2, begin=0x30, end=0x40, role=master' ] > > > > > > In xl config file of vm2: > > > static_shm = [ 'id=ID1, offset = 0, begin=0x50, end=0x60, > > > role=slave, prot=ro' ] > > > > > > In xl config file of vm3: > > > static_shm = [ 'id=ID2, offset = 1, begin=0x69, > > > end=0x80, role=slave, prot=ro' ] > > > > > > where: > > > @id The identifier of the backing memory area. > > > Can be any string that matches the regexp "[^ > > > \t\n,]+" > > > and no longer than 128 characters > > > > > > @offset Can only appear when @role = slave. The sharing > > > will > > > start from the beginning of backing memory area > > > plus > > > this offset. If not set, it defaults to zero. > > > Can be decimals or hexadecimals of the form > > > "0x2", > > > and should be the multiple of the hypervisor page > > > granularity (currently 4K on both ARM and x86). > > > > > > @begin/endThe boundaries of the shared memory area. The > > > format > > > requirements are the same with @offset. > > > > I'm assuming this is all specified in GFN and also not MFN contigous? > > Would it be possible to allow the specification of MFN mappings > > that are contigous? > > That could be done with the iomem= parameter? The missing part from the iomem paramater is the attributes, cacheability, shared etc. But we could perhaps add that to iomem somehow. > > > > This would be useful to map specific kinds of memory (e.g On Chip RAMs). > > > > Other use-cases are when there are not only guests sharing > > the pages but also devices. In some cases these devs may be locked in > > with low-latency access to specific memory regions. > > > > Perhaps something like the following? > > addr=gfn@ > > size=0x1000 > > > > with mfn being optional? > > I can see that it might be useful, but we are trying to keep the scope > small to be able to complete the project within the limited timeframe > allowed by GSoC. We only have one month left! We risk not getting the > feature completed. Once this set of features is done and committed, we > can expand on it. > > For the sake of this document, we should make clear that addresses
Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file
On Mon, 31 Jul 2017, Edgar E. Iglesias wrote: > On Fri, Jul 28, 2017 at 09:03:15PM +0800, Zhongze Liu wrote: > > > > 1. Motivation and Description > > Hi, > > I think this looks quite useful. I have a few comments inline. Hi Edgar, thanks for giving it a look! > > > > Virtual machines use grant table hypercalls to setup a share page for > > inter-VMs communications. These hypercalls are used by all PV > > protocols today. However, very simple guests, such as baremetal > > applications, might not have the infrastructure to handle the grant table. > > This project is about setting up several shared memory areas for inter-VMs > > communications directly from the VM config file. > > So that the guest kernel doesn't have to have grant table support (in the > > embedded space, this is not unusual) to be able to communicate with > > other guests. > > > > > > 2. Implementation Plan: > > > > > > == > > 2.1 Introduce a new VM config option in xl: > > == > > > > 2.1.1 Design Goals > > ~~~ > > > > The shared areas should be shareable among several (>=2) VMs, so every > > shared > > physical memory area is assigned to a set of VMs. Therefore, a “token” or > > “identifier” should be used here to uniquely identify a backing memory area. > > A string no longer than 128 bytes is used here to serve the purpose. > > > > The backing area would be taken from one domain, which we will regard > > as the "master domain", and this domain should be created prior to any > > other "slave domain"s. Again, we have to use some kind of tag to tell who > > is the "master domain". > > > > And the ability to specify the permissions and cacheability (and > > shareability > > for ARM guest's) of the pages to be shared should also be given to the user. > > > > 2.2.2 Syntax and Behavior > > ~ > > The following example illustrates the syntax of the proposed config entry > > (suppose that we're on x86): > > > > In xl config file of vm1: > > static_shm = [ 'id=ID1, begin=0x10, end=0x20, role=master, > > cache_policy=x86_normal, prot=r2', > > > > 'id=ID2, begin=0x30, end=0x40, role=master' ] > > > > In xl config file of vm2: > > static_shm = [ 'id=ID1, offset = 0, begin=0x50, end=0x60, > > role=slave, prot=ro' ] > > > > In xl config file of vm3: > > static_shm = [ 'id=ID2, offset = 1, begin=0x69, > > end=0x80, role=slave, prot=ro' ] > > > > where: > > @id The identifier of the backing memory area. > > Can be any string that matches the regexp "[^ > > \t\n,]+" > > and no longer than 128 characters > > > > @offset Can only appear when @role = slave. The sharing will > > start from the beginning of backing memory area plus > > this offset. If not set, it defaults to zero. > > Can be decimals or hexadecimals of the form > > "0x2", > > and should be the multiple of the hypervisor page > > granularity (currently 4K on both ARM and x86). > > > > @begin/endThe boundaries of the shared memory area. The format > > requirements are the same with @offset. > > I'm assuming this is all specified in GFN and also not MFN contigous? > Would it be possible to allow the specification of MFN mappings > that are contigous? That could be done with the iomem= parameter? > This would be useful to map specific kinds of memory (e.g On Chip RAMs). > > Other use-cases are when there are not only guests sharing > the pages but also devices. In some cases these devs may be locked in > with low-latency access to specific memory regions. > > Perhaps something like the following? > addr=gfn@ > size=0x1000 > > with mfn being optional? I can see that it might be useful, but we are trying to keep the scope small to be able to complete the project within the limited timeframe allowed by GSoC. We only have one month left! We risk not getting the feature completed. Once this set of features is done and committed, we can expand on it. For the sake of this document, we should make clear that addresses are in the gfn space and memory is allocated to the master domain. > > @role Can only be 'master' or 'slave', it defaults to > > 'slave'. > > > > @prot When @role = master, this means the largest set of > > stage-2 permission flags that can be granted to the > > slave domains. > > When @role = slave, this means the stage-2 permission > > flags of
Re: [Xen-devel] [RFC v2 2/6] xen/arm: Introduce platform_hvc
On Mon, Feb 13, 2017 at 02:08:43PM -0800, Stefano Stabellini wrote: > On Tue, 7 Feb 2017, Edgar E. Iglesias wrote: > > From: "Edgar E. Iglesias"> > > > Introduce platform_hvc as a way to handle hypercalls that > > Xen does not know about in a platform specific way. This > > is particularly useful for implementing the SiP (SoC > > implementation specific) service calls. > > > > Signed-off-by: Edgar E. Iglesias > > --- > > xen/arch/arm/platform.c| 8 > > xen/arch/arm/traps.c | 3 +++ > > xen/include/asm-arm/platform.h | 5 + > > 3 files changed, 16 insertions(+) > > > > diff --git a/xen/arch/arm/platform.c b/xen/arch/arm/platform.c > > index 0af6d57..90ea6b8 100644 > > --- a/xen/arch/arm/platform.c > > +++ b/xen/arch/arm/platform.c > > @@ -127,6 +127,14 @@ void platform_poweroff(void) > > platform->poweroff(); > > } > > > > +bool platform_hvc(struct cpu_user_regs *regs) > > This is fine, but we need a different name for it, as it can be used to > handle both HVC and SMC calls. Maybe "firmware_call"? Hi, Sorry for the super long delay.. I'm looking at this again now. Yes, you're right. I went with platform_firmware_call() and platform->firmware_call(). I kept the platform_ prefix in the wrapper function to be consistent with the other hooks. Thanks, Edgar > > > > +{ > > +if ( platform && platform->hvc ) > > +return platform->hvc(regs); > > + > > +return false; > > +} > > + > > bool_t platform_has_quirk(uint32_t quirk) > > { > > uint32_t quirks = 0; > > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c > > index c5a4d41..33950d9 100644 > > --- a/xen/arch/arm/traps.c > > +++ b/xen/arch/arm/traps.c > > @@ -44,6 +44,7 @@ > > #include > > #include > > #include > > +#include > > > > #include "decode.h" > > #include "vtimer.h" > > @@ -1430,6 +1431,8 @@ static void do_trap_psci(struct cpu_user_regs *regs) > > } > > break; > > default: > > +if ( platform_hvc(regs) ) > > +return; > > domain_crash_synchronous(); > > return; > > } > > diff --git a/xen/include/asm-arm/platform.h b/xen/include/asm-arm/platform.h > > index 08010ba..4d51f0a 100644 > > --- a/xen/include/asm-arm/platform.h > > +++ b/xen/include/asm-arm/platform.h > > @@ -26,6 +26,10 @@ struct platform_desc { > > void (*reset)(void); > > /* Platform power-off */ > > void (*poweroff)(void); > > +/* Platform specific HVC handler. > > + * Returns true if the call was handled and false if not. > > + */ > > +bool (*hvc)(struct cpu_user_regs *regs); > > /* > > * Platform quirks > > * Defined has a function because a platform can support multiple > > @@ -55,6 +59,7 @@ int platform_cpu_up(int cpu); > > #endif > > void platform_reset(void); > > void platform_poweroff(void); > > +bool platform_hvc(struct cpu_user_regs *regs); > > bool_t platform_has_quirk(uint32_t quirk); > > bool_t platform_device_is_blacklisted(const struct dt_device_node *node); ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] xen: RCU: avoid busy waiting until the end of grace period.
On Thu, 27 Jul 2017, Dario Faggioli wrote: > Instead of having the CPU where a callback is queued, busy > looping on rcu_pending(), use a timer. > > In fact, we let the CPU go idla,e but we program a timer ^ idle, > that will periodically wake it up, for checking whether the > grace period has actually ended. > > It is kind of similar to introducing a periodic tick, but > with a much more limited scope, and a lot less overhead. In > fact, this timer is: > - only active for the CPU(s) that have callbacks queued, > waiting for the end of a grace period; > - only active when those CPU(s) are idle (and stopped as > soon as they resume execution). > > Signed-off-by: Dario Faggioli> --- > Cc: Stefano Stabellini > Cc: Julien Grall > Cc: Jan Beulich > Cc: Andrew Cooper > --- > xen/arch/arm/domain.c |4 ++- > xen/arch/x86/acpi/cpu_idle.c |6 +++-- > xen/arch/x86/cpu/mwait-idle.c |6 +++-- > xen/common/rcupdate.c | 52 > - > xen/include/xen/rcupdate.h|3 ++ > 5 files changed, 65 insertions(+), 6 deletions(-) > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > index 666b7ef..01da96e 100644 > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -43,8 +43,9 @@ static void do_idle(void) > { > unsigned int cpu = smp_processor_id(); > > +rcu_idle_timer_start(); > sched_tick_suspend(); > -/* sched_tick_suspend() can raise TIMER_SOFTIRQ. Process it now. */ > +/* Timer related operations can raise TIMER_SOFTIRQ. Process it now. */ > process_pending_softirqs(); > > local_irq_disable(); > @@ -58,6 +59,7 @@ static void do_idle(void) > local_irq_enable(); > > sched_tick_resume(); > +rcu_idle_timer_stop(); > } > > void idle_loop(void) > diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c > index 04c52e8..b97986f 100644 > --- a/xen/arch/x86/acpi/cpu_idle.c > +++ b/xen/arch/x86/acpi/cpu_idle.c > @@ -576,10 +576,10 @@ static void acpi_processor_idle(void) > return; > } > > +rcu_idle_timer_start(); > cpufreq_dbs_timer_suspend(); > - > sched_tick_suspend(); > -/* sched_tick_suspend() can raise TIMER_SOFTIRQ. Process it now. */ > +/* Timer related operations can raise TIMER_SOFTIRQ. Process it now. */ > process_pending_softirqs(); > > /* > @@ -593,6 +593,7 @@ static void acpi_processor_idle(void) > local_irq_enable(); > sched_tick_resume(); > cpufreq_dbs_timer_resume(); > +rcu_idle_timer_stop(); > return; > } > > @@ -726,6 +727,7 @@ static void acpi_processor_idle(void) > > sched_tick_resume(); > cpufreq_dbs_timer_resume(); > +rcu_idle_timer_stop(); > > if ( cpuidle_current_governor->reflect ) > cpuidle_current_governor->reflect(power); > diff --git a/xen/arch/x86/cpu/mwait-idle.c b/xen/arch/x86/cpu/mwait-idle.c > index ae9e92b..c426e41 100644 > --- a/xen/arch/x86/cpu/mwait-idle.c > +++ b/xen/arch/x86/cpu/mwait-idle.c > @@ -743,10 +743,10 @@ static void mwait_idle(void) > return; > } > > + rcu_idle_timer_start(); > cpufreq_dbs_timer_suspend(); > - > sched_tick_suspend(); > - /* sched_tick_suspend() can raise TIMER_SOFTIRQ. Process it now. */ > + /* Timer related operations can raise TIMER_SOFTIRQ. Process it now. */ > process_pending_softirqs(); > > /* Interrupts must be disabled for C2 and higher transitions. */ > @@ -756,6 +756,7 @@ static void mwait_idle(void) > local_irq_enable(); > sched_tick_resume(); > cpufreq_dbs_timer_resume(); > +rcu_idle_timer_stop(); > return; > } > > @@ -802,6 +803,7 @@ static void mwait_idle(void) > > sched_tick_resume(); > cpufreq_dbs_timer_resume(); > + rcu_idle_timer_stop(); > > if ( cpuidle_current_governor->reflect ) > cpuidle_current_governor->reflect(power); > diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c > index f0fdc87..4586f2a 100644 > --- a/xen/common/rcupdate.c > +++ b/xen/common/rcupdate.c > @@ -84,8 +84,14 @@ struct rcu_data { > int cpu; > struct rcu_head barrier; > longlast_rs_qlen; /* qlen during the last resched */ > + > +/* 3) idle CPUs handling */ > +struct timer idle_timer; > +bool idle_timer_active; > }; > > +#define RCU_IDLE_TIMER_PERIOD MILLISECS(10) Isn't this a bit too short? How is it chosen? > static DEFINE_PER_CPU(struct rcu_data, rcu_data); > > static int blimit = 10; > @@ -402,7 +408,48 @@ int rcu_needs_cpu(int cpu) > { > struct rcu_data *rdp = _cpu(rcu_data, cpu); > > -return (!!rdp->curlist || rcu_pending(cpu)); > +return (!!rdp->curlist ||
Re: [Xen-devel] [PATCH 3/5] xen: RCU/x86/ARM: discount CPUs that were idle when grace period started.
On Thu, 27 Jul 2017, Dario Faggioli wrote: > Xen is a tickless (micro-)kernel. This means that, when a CPU > becomes idle, we stop all the activity on it, including any > periodic tick or timer. > > When we imported RCU from Linux, Linux (x86) was a ticking > kernel, i.e., there was a periodic timer tick always running, > even on totally idle CPUs. This was bad from a power efficiency > perspective, but it's what maked it possible to monitor the > quiescent states of all the CPUs, and hence tell when an RCU > grace period ends. > > In Xen, that is impossible, and that's particularly problematic > when system is idle (or lightly loaded) systems, as CPUs that > are idle may never have the chance to tell RCU about their > quiescence, and grace periods could extend indefinitely! > > This has led, on x86, to long (an unpredictable) delays between > RCU callbacks queueing and invokation. On ARM, we actually see > infinite grace periods (e.g., complate_domain_destroy() may > never be actually invoked on an idle system). See here: > > https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg02454.html > > The first step for fixing this situation is for RCU to record, > at the beginning of a grace period, which CPUs are already idle. > In fact, being idle, they can't be in the middle of any read-side > critical section, and we don't have to wait for them to declare > a grace period finished. > > This is tracked in a cpumask, in a way that is very similar to > how Linux also was achieving the same on s390 --which indeed was > tickless already, even back then-- and to what it started to do > for x86, from 2.6.21 on (see commit 79bf2bb3 "tick-management: > dyntick / highres functionality"). > > While there, also adopt the memory barrier introduced by Linux > commit commit c3f59023 ("Fix RCU race in access of nohz_cpu_mask"). > > Signed-off-by: Dario Faggioli> --- > Cc: Stefano Stabellini > Cc: Julien Grall > Cc: Jan Beulich > Cc: Andrew Cooper > --- > xen/arch/arm/domain.c |2 ++ > xen/arch/x86/acpi/cpu_idle.c | 25 + > xen/arch/x86/cpu/mwait-idle.c |9 - > xen/arch/x86/domain.c |8 +++- > xen/common/rcupdate.c | 28 ++-- > xen/include/xen/rcupdate.h|3 +++ > 6 files changed, 63 insertions(+), 12 deletions(-) > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > index fce29cb..666b7ef 100644 > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -50,8 +50,10 @@ static void do_idle(void) > local_irq_disable(); > if ( cpu_is_haltable(cpu) ) > { > +rcu_idle_enter(cpu); > dsb(sy); > wfi(); > +rcu_idle_exit(cpu); > } > local_irq_enable(); > > diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c > index 8cc5a82..f0fdc87 100644 > --- a/xen/common/rcupdate.c > +++ b/xen/common/rcupdate.c > @@ -52,7 +52,8 @@ static struct rcu_ctrlblk { > int next_pending; /* Is the next batch already waiting? */ > > spinlock_t lock __cacheline_aligned; > -cpumask_t cpumask; /* CPUs that need to switch in order*/ > +cpumask_t cpumask; /* CPUs that need to switch in order ... */ > +cpumask_t idle_cpumask; /* ... unless they are already idle */ > /* for current batch to proceed.*/ > } __cacheline_aligned rcu_ctrlblk = { > .cur = -300, > @@ -248,7 +249,14 @@ static void rcu_start_batch(struct rcu_ctrlblk *rcp) > smp_wmb(); > rcp->cur++; > > -cpumask_copy(>cpumask, _online_map); > + /* > +* Accessing idle_cpumask before incrementing rcp->cur needs a > +* Barrier Otherwise it can cause tickless idle CPUs to be ^ otherwise > +* included in rcp->cpumask, which will extend graceperiods > +* unnecessarily. > +*/ It doesn't look like this comment applies to this code: we are accessing idle_cpumask after rcp->cur here. Unless you meant "Accessing idle_cpumask *after* incrementing rcp->cur." Also could you please add a pointer to the other barrier in the pair (barriers always go in pair, for example I think the smp_wmb() above in rcu_start_batch is matched by the smp_rmb() in __rcu_process_callbacks.) > +smp_mb(); > +cpumask_andnot(>cpumask, _online_map, >idle_cpumask); > } > } > > @@ -474,7 +482,23 @@ static struct notifier_block cpu_nfb = { > void __init rcu_init(void) > { > void *cpu = (void *)(long)smp_processor_id(); > + > +cpumask_setall(_ctrlblk.idle_cpumask); > cpu_callback(_nfb, CPU_UP_PREPARE, cpu); > register_cpu_notifier(_nfb); > open_softirq(RCU_SOFTIRQ, rcu_process_callbacks); > } > + > +/* > + * The CPU is becoming idle, so no more read side critical > + * sections, and one more step toward
Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file
On Fri, Jul 28, 2017 at 09:03:15PM +0800, Zhongze Liu wrote: > > 1. Motivation and Description Hi, I think this looks quite useful. I have a few comments inline. > > Virtual machines use grant table hypercalls to setup a share page for > inter-VMs communications. These hypercalls are used by all PV > protocols today. However, very simple guests, such as baremetal > applications, might not have the infrastructure to handle the grant table. > This project is about setting up several shared memory areas for inter-VMs > communications directly from the VM config file. > So that the guest kernel doesn't have to have grant table support (in the > embedded space, this is not unusual) to be able to communicate with > other guests. > > > 2. Implementation Plan: > > > == > 2.1 Introduce a new VM config option in xl: > == > > 2.1.1 Design Goals > ~~~ > > The shared areas should be shareable among several (>=2) VMs, so every shared > physical memory area is assigned to a set of VMs. Therefore, a “token” or > “identifier” should be used here to uniquely identify a backing memory area. > A string no longer than 128 bytes is used here to serve the purpose. > > The backing area would be taken from one domain, which we will regard > as the "master domain", and this domain should be created prior to any > other "slave domain"s. Again, we have to use some kind of tag to tell who > is the "master domain". > > And the ability to specify the permissions and cacheability (and shareability > for ARM guest's) of the pages to be shared should also be given to the user. > > 2.2.2 Syntax and Behavior > ~ > The following example illustrates the syntax of the proposed config entry > (suppose that we're on x86): > > In xl config file of vm1: > static_shm = [ 'id=ID1, begin=0x10, end=0x20, role=master, > cache_policy=x86_normal, prot=r2', > > 'id=ID2, begin=0x30, end=0x40, role=master' ] > > In xl config file of vm2: > static_shm = [ 'id=ID1, offset = 0, begin=0x50, end=0x60, > role=slave, prot=ro' ] > > In xl config file of vm3: > static_shm = [ 'id=ID2, offset = 1, begin=0x69, > end=0x80, role=slave, prot=ro' ] > > where: > @id The identifier of the backing memory area. > Can be any string that matches the regexp "[^ \t\n,]+" > and no longer than 128 characters > > @offset Can only appear when @role = slave. The sharing will > start from the beginning of backing memory area plus > this offset. If not set, it defaults to zero. > Can be decimals or hexadecimals of the form "0x2", > and should be the multiple of the hypervisor page > granularity (currently 4K on both ARM and x86). > > @begin/endThe boundaries of the shared memory area. The format > requirements are the same with @offset. I'm assuming this is all specified in GFN and also not MFN contigous? Would it be possible to allow the specification of MFN mappings that are contigous? This would be useful to map specific kinds of memory (e.g On Chip RAMs). Other use-cases are when there are not only guests sharing the pages but also devices. In some cases these devs may be locked in with low-latency access to specific memory regions. Perhaps something like the following? addr=gfn@ size=0x1000 with mfn being optional? > @role Can only be 'master' or 'slave', it defaults to > 'slave'. > > @prot When @role = master, this means the largest set of > stage-2 permission flags that can be granted to the > slave domains. > When @role = slave, this means the stage-2 permission > flags of the shared memory area. > Currently only 'rw' is supported. If not set. it > defaults to 'rw'. > > @cache_policy The stage-2 cacheability/shareability attributes of > the > shared memory area. Currently, only two policies are > supported: > * ARM_normal: Only applicable to ARM guests. This > would mean Inner and Outer Write-Back > Cacheable, and Inner Shareable. Is there a reason not to set this to Outer Shareable? Again, mainly useful when these pages get shared with devs as well. The guest can always lower it to Inner Shareable via S1 tables if needed. Best
Re: [Xen-devel] [PATCH 2/5] xen: ARM: suspend the tick (if in use) when going idle.
On Thu, 27 Jul 2017, Dario Faggioli wrote: > Since commit 964fae8ac ("cpuidle: suspend/resume scheduler > tick timer during cpu idle state entry/exit"), if a scheduler > has a periodic tick timer, we stop it when going idle. > > This, however, is only true for x86. Make it true for ARM as > well. > > Signed-off-by: Dario FaggioliReviewed-by: Stefano Stabellini > --- > Cc: Stefano Stabellini > Cc: Julien Grall > --- > xen/arch/arm/domain.c | 29 - > 1 file changed, 20 insertions(+), 9 deletions(-) > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > index 2dc8b0a..fce29cb 100644 > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -39,6 +39,25 @@ > > DEFINE_PER_CPU(struct vcpu *, curr_vcpu); > > +static void do_idle(void) > +{ > +unsigned int cpu = smp_processor_id(); > + > +sched_tick_suspend(); > +/* sched_tick_suspend() can raise TIMER_SOFTIRQ. Process it now. */ > +process_pending_softirqs(); > + > +local_irq_disable(); > +if ( cpu_is_haltable(cpu) ) > +{ > +dsb(sy); > +wfi(); > +} > +local_irq_enable(); > + > +sched_tick_resume(); > +} > + > void idle_loop(void) > { > unsigned int cpu = smp_processor_id(); > @@ -52,15 +71,7 @@ void idle_loop(void) > if ( unlikely(tasklet_work_to_do(cpu)) ) > do_tasklet(); > else > -{ > -local_irq_disable(); > -if ( cpu_is_haltable(cpu) ) > -{ > -dsb(sy); > -wfi(); > -} > -local_irq_enable(); > -} > +do_idle(); > > do_softirq(); > /* > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file
On Mon, 31 Jul 2017, Zhongze Liu wrote: > I'm extremely sorry that I mistakenly copied and pasted an immediate > version of the proposal here. As you might have already noticed, some > of the content obviously conflicts with itself. Please see the new one below. > And some typo's and indentation issues are also fixed. Sorry again for my > mistake. > > The revisited proposal: > > > 1. Motivation and Description > > Virtual machines use grant table hypercalls to setup a share page for > inter-VMs communications. These hypercalls are used by all PV > protocols today. However, very simple guests, such as baremetal > applications, might not have the infrastructure to handle the grant table. > This project is about setting up several shared memory areas for inter-VMs > communications directly from the VM config file. > So that the guest kernel doesn't have to have grant table support (in the > embedded space, this is not unusual) to be able to communicate with > other guests. > > > 2. Implementation Plan: > > > == > 2.1 Introduce a new VM config option in xl: > == > > 2.1.1 Design Goals > ~~~ > > The shared areas should be shareable among several (>=2) VMs, so every shared > physical memory area is assigned to a set of VMs. Therefore, a “token” or > “identifier” should be used here to uniquely identify a backing memory area. > A string no longer than 128 bytes is used here to serve the purpose. > > The backing area would be taken from one domain, which we will regard > as the "master domain", and this domain should be created prior to any > other "slave domain"s. Again, we have to use some kind of tag to tell who > is the "master domain". > > And the ability to specify the permissions and cacheability (and shareability > for ARM guest's) of the pages to be shared should also be given to the user. > > 2.2.2 Syntax and Behavior > ~ > The following example illustrates the syntax of the proposed config entry > (suppose that we're on x86): > > In xl config file of vm1: > static_shm = [ 'id=ID1, begin=0x10, end=0x20, role=master,\ > cache_policy=x86_normal, prot=rw',\ > \ > 'id=ID2, begin=0x30, end=0x40, role=master' ] > > In xl config file of vm2: > static_shm = [ 'id=ID1, offset = 0, begin=0x50, end=0x60, \ > role=slave, prot=rw' ] > > In xl config file of vm3: > static_shm = [ 'id=ID2, offset = 0x1, begin=0x69, \ > end=0x80, role=slave' ] > > where: > @id The identifier of the backing memory area. >Can be any string that matches the regexp "[_a-zA-Z0-9]+" >and no longer than 128 characters > > @offset Can only appear when @role = slave. The sharing will >start from the beginning of backing memory area plus >this offset. If not set, it defaults to zero. >Can be decimals or hexadecimals of the form "0x2", >and should be the multiple of the hypervisor page >granularity (currently 4K on both ARM and x86). > > @begin/end The boundaries of the shared memory area. The format >requirements are the same with @offset. > > @roleCan only be 'master' or 'slave', it defaults to 'slave'. > > @protWhen @role = master, this means the largest set of >stage-2 permission flags that can be granted to the >slave domains. >When @role = slave, this means the stage-2 permission >flags of the shared memory area. >Currently only 'rw' is supported. If not set. it >defaults to 'rw'. > > @cache_policyCan only appear when @role = master. >The stage-2 cacheability/shareability attributes of the >shared memory area. Currently, only two policies are >supported: > * ARM_normal: Only applicable to ARM guests. This >would mean Inner and Outer Write-Back >Cacheable, and Inner Shareable. > * x86_normal: Only applicable to x86 HVM guests. This >would mean Write-Back Cacheable. >If not set, it defaults to the *_normal policy for the >corresponding platform. > > Note: > The sizes of the areas
Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file
On Mon, 31 Jul 2017, Wei Liu wrote: > On Mon, Jul 31, 2017 at 01:09:04AM +0800, Zhongze Liu wrote: > > > > @cache_policyCan only appear when @role = master. > >The stage-2 cacheability/shareability attributes of the > >shared memory area. Currently, only two policies are > >supported: > > * ARM_normal: Only applicable to ARM guests. This > >would mean Inner and Outer Write-Back > >Cacheable, and Inner Shareable. > > * x86_normal: Only applicable to x86 HVM guests. This > >would mean Write-Back Cacheable. > > This question might have been asked before, I'm sorry for not being > able to follow closely previous discussions... > > Why are these opaque names chosen instead of just writing "wb" "wc" or > whatever? Because "wb" and "wc" mean different things on ARM and x86. Also the meaning of "wb" is not clear enough on ARM unless you also specify shareability. It's simpler this way, and it gives us plenty of room to improve the syntax in the future if we want to add more cacheability and shareability modes while remaining backward compatible. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [XenSummit 2017] Shared coprocessor framework followup
On Tue, Jul 18, 2017 at 08:10:15PM +0300, Andrii Anisov wrote: > **Dear All, > > During the developers summit a Shared Coprocessor Framework (SCF) concept > was presented. Noticeable interest from community was discovered during > discussions. So this is a call for all interested parties to collect a > feedback and setup a collaboration. Hi Andrii! > > There are several topics I would like to collect responses from the > community: > - Who are interested in SCF design, discussions, development, usage, > etc? Personalities or organizations. Yes I'm interested in this. I'm not sure how much time I'll be able to contribute but at least I can review proposals and hopefully look at implementing a driver/backend that may be useful for our FPGA platforms. Cheers, Edgar > - What devices (type of devices) are intended to be shared using SCF? > - What are expected coprocessor sharing use-cases (i.e. DSP running > different FW for different domains, etc). > - If someone is willing to take a part in SCF design and development > (core, API)? > - If someone is willing to implement their coprocessor support (driver) > for SCF? > > I look forward to hearing from you. > > -- > > *Andrii Anisov* > > > > ___ > Xen-devel mailing list > Xen-devel@lists.xen.org > https://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/arm: Fix comments coding style in assembler files
On Thu, 27 Jul 2017, Andrii Anisov wrote: > From: Andrii Anisov> > Signed-off-by: Andrii Anisov Reviewed-by: Stefano Stabellini > --- > xen/arch/arm/arm32/debug-8250.inc | 12 +++-- > xen/arch/arm/arm32/debug-exynos4210.inc | 12 +++-- > xen/arch/arm/arm32/debug-pl011.inc | 18 +--- > xen/arch/arm/arm32/debug-scif.inc | 6 ++- > xen/arch/arm/arm32/debug.S | 6 ++- > xen/arch/arm/arm32/head.S | 81 > ++--- > xen/arch/arm/arm64/debug-8250.inc | 12 +++-- > xen/arch/arm/arm64/debug-cadence.inc| 12 +++-- > xen/arch/arm/arm64/debug-pl011.inc | 18 +--- > xen/arch/arm/arm64/debug.S | 6 ++- > xen/arch/arm/arm64/entry.S | 50 ++-- > 11 files changed, 147 insertions(+), 86 deletions(-) > > diff --git a/xen/arch/arm/arm32/debug-8250.inc > b/xen/arch/arm/arm32/debug-8250.inc > index 757ffd8..0759a27 100644 > --- a/xen/arch/arm/arm32/debug-8250.inc > +++ b/xen/arch/arm/arm32/debug-8250.inc > @@ -16,9 +16,11 @@ > > #include > > -/* 8250 UART wait UART to be ready to transmit > +/* > + * 8250 UART wait UART to be ready to transmit > * rb: register which contains the UART base address > - * rc: scratch register */ > + * rc: scratch register > + */ > .macro early_uart_ready rb rc > 1: > ldr \rc, [\rb, #(UART_LSR << EARLY_UART_REG_SHIFT)] /* Read LSR > */ > @@ -26,9 +28,11 @@ > beq 1b /* Wait for the UART to be ready > */ > .endm > > -/* 8250 UART transmit character > +/* > + * 8250 UART transmit character > * rb: register which contains the UART base address > - * rt: register which contains the character to transmit */ > + * rt: register which contains the character to transmit > + */ > .macro early_uart_transmit rb rt > str \rt, [\rb, #UART_THR] /* Write Transmit buffer */ > .endm > diff --git a/xen/arch/arm/arm32/debug-exynos4210.inc > b/xen/arch/arm/arm32/debug-exynos4210.inc > index 752942d..4e80a21 100644 > --- a/xen/arch/arm/arm32/debug-exynos4210.inc > +++ b/xen/arch/arm/arm32/debug-exynos4210.inc > @@ -18,9 +18,11 @@ > > #include > > -/* Exynos 5 UART wait UART to be ready to transmit > +/* > + * Exynos 5 UART wait UART to be ready to transmit > * rb: register which contains the UART base address > - * rc: scratch register */ > + * rc: scratch register > + */ > .macro early_uart_ready rb rc > 1: > ldr \rc, [\rb, #UTRSTAT] /* <- UTRSTAT (Flag register) */ > @@ -28,9 +30,11 @@ > beq 1b /* Wait for the UART to be ready */ > .endm > > -/* Exynos 5 UART transmit character > +/* > + * Exynos 5 UART transmit character > * rb: register which contains the UART base address > - * rt: register which contains the character to transmit */ > + * rt: register which contains the character to transmit > + */ > .macro early_uart_transmit rb rt > str \rt, [\rb, #UTXH] /* -> UTXH (Data Register) */ > .endm > diff --git a/xen/arch/arm/arm32/debug-pl011.inc > b/xen/arch/arm/arm32/debug-pl011.inc > index 6a64dbf..ec462ea 100644 > --- a/xen/arch/arm/arm32/debug-pl011.inc > +++ b/xen/arch/arm/arm32/debug-pl011.inc > @@ -18,10 +18,12 @@ > > #include > > -/* PL011 UART initialization > +/* > + * PL011 UART initialization > * rb: register which contains the UART base address > * rc: scratch register 1 > - * rd: scratch register 2 (unused here) */ > + * rd: scratch register 2 (unused here) > + */ > .macro early_uart_init rb, rc, rd > mov \rc, #(7372800 / EARLY_PRINTK_BAUD % 16) > str \rc, [\rb, #FBRD] /* -> UARTFBRD (Baud divisor fraction) */ > @@ -33,9 +35,11 @@ > str \rc, [\rb, #CR] /* -> UARTCR (Control Register) */ > .endm > > -/* PL011 UART wait UART to be ready to transmit > +/* > + * PL011 UART wait UART to be ready to transmit > * rb: register which contains the UART base address > - * rc: scratch register */ > + * rc: scratch register > + */ > .macro early_uart_ready rb, rc > 1: > ldr \rc, [\rb, #FR] /* <- UARTFR (Flag register) */ > @@ -43,9 +47,11 @@ > bne 1b/* Wait for the UART to be ready */ > .endm > > -/* PL011 UART transmit character > +/* > + * PL011 UART transmit character > * rb: register which contains the UART base address > - * rt: register which contains the character to transmit */ > + * rt: register which contains the character to transmit > + */ > .macro early_uart_transmit rb, rt > str \rt, [\rb, #DR]/* -> UARTDR (Data Register) */ > .endm > diff --git a/xen/arch/arm/arm32/debug-scif.inc > b/xen/arch/arm/arm32/debug-scif.inc > index ce85752..143f05d 100644 > --- a/xen/arch/arm/arm32/debug-scif.inc > +++ b/xen/arch/arm/arm32/debug-scif.inc > @@ -19,7 +19,8 @@ > > #include > > -/* SCIF UART
Re: [Xen-devel] PV drivers and zero copying
On Mon, 31 Jul 2017, Oleksandr Andrushchenko wrote: > 3 Sharing with page exchange (XENMEM_exchange) > == > > This API was pointed to me by Stefano Stabellini as one of the possible ways > to > achieve zero copying and share physically contiguous buffers. It is used by > x86 > SWIOTLB code (xen_create_contiguous_region, [5]), but as per my understanding > this API cannot be used on ARM as of now [6]. Conclusion: not an option for > ARM > at the moment Let me elaborate on this. The purpose of XENMEM_exchange is to exchange a number of memory pages with an equal number of contiguous memory pages, possibly even under 4G. The original purpose of the hypercall was to get DMA-able memory. So far, it has only been used by Dom0 on x86. Dom0 on ARM doesn't need it because it is mapped 1:1 by default and device assignment is not allowed without an IOMMU. However it should work on ARM too, as the implementation is all common code in Xen. Also, looking at the implementation (xen/common/memory.c:memory_exchange) it would seem that it can be called from a DomU too (but I have never tried). Thus, if you have a platform without IOMMU and you disabled the IOMMU checks in Xen to assign a device to a DomU anyway, then you could use this hypercall from DomU to get memory under 4G to be used for DMA with this device. As far as I can tell XENMEM_exchange could help in the design of zero-copy PV protocols only to address this specific use case: - you have a frontend in DomU and a backend in Dom0 - pages shared by DomU get mapped in Dom0 and potentially used for DMA - the device has under 4G DMA restrictions Normally Dom0 maps a DomU page, then at the time of using the mapped page for DMA it checks whether it is suitable for DMA (under 4G if the device requires so). If it is not, Dom0 uses a bounce buffer borrowed from the swiotlb. Obviously this introduces one or two memcpys. Instead, if DomU calls XENMEM_exchange to get memory under 4G, and shares one of the pages with Dom0 via PV frontends, then Dom0 wouldn't have to use a bounce buffer to do DMA to this page. Does it make sense? ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable test] 112391: regressions - trouble: blocked/broken/fail/pass
flight 112391 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/112391/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-arm64-xsm 2 hosts-allocate broken REGR. vs. 112286 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 112286 build-arm64 2 hosts-allocate broken REGR. vs. 112286 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 112286 Tests which are failing intermittently (not blocking): test-amd64-amd64-rumprun-amd64 17 rumprun-demo-xenstorels/xenstorels.repeat fail pass in 112384 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail pass in 112384 test-armhf-armhf-xl-rtds 12 guest-startfail pass in 112384 Tests which did not succeed, but are not blocking: test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-xl 1 build-check(1) blocked n/a build-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-examine 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit2 1 build-check(1) blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a build-arm64-xsm 3 capture-logs broken blocked in 112286 build-arm64-pvops 3 capture-logs broken blocked in 112286 build-arm64 3 capture-logs broken blocked in 112286 test-amd64-amd64-xl-qemut-win7-amd64 18 guest-start/win.repeat fail in 112384 blocked in 112286 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail in 112384 blocked in 112286 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail in 112384 like 112274 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail in 112384 like 112286 test-armhf-armhf-xl-rtds13 migrate-support-check fail in 112384 never pass test-armhf-armhf-xl-rtds 14 saverestore-support-check fail in 112384 never pass test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 112286 test-amd64-i386-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail like 112286 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 112286 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 112286 test-amd64-amd64-xl-rtds 10 debian-install fail like 112286 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 112286 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore fail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never
[Xen-devel] [PATCH] x86/apic/x2apic: Share IRQ vector between cluster members only when no cpumask is specified
We have limited number (slightly under NR_DYNAMIC_VECTORS=192) of IRQ vectors that are available to each processor. Currently, when x2apic cluster mode is used (which is default), each vector is shared among all processors in the cluster. With many IRQs (as is the case on systems with multiple SR-IOV cards) and few clusters (e.g. single socket) there is a good chance that we will run out of vectors. This patch tries to decrease vector sharing between processors by assigning vector to a single processor if the assignment request (via __assign_irq_vector()) comes without explicitly specifying which processors are expected to share the interrupt. This typically happens during boot time (or possibly PCI hotplug) when create_irq(NUMA_NO_NODE) is called. When __assign_irq_vector() is called from set_desc_affinity() which provides sharing mask, vector sharing will continue to be performed, as before. This patch to some extent mirrors Linux commit d872818dbbee ("x86/apic/x2apic: Use multiple cluster members for the irq destination only with the explicit affinity"). Note that this change still does not guarantee that we never run out of vectors. For example, on a single core system we will be effectively back to the single cluster/socket case of original code. Signed-off-by: Boris Ostrovsky--- xen/arch/x86/acpi/boot.c | 2 +- xen/arch/x86/apic.c | 2 +- xen/arch/x86/genapic/delivery.c | 4 ++-- xen/arch/x86/genapic/x2apic.c| 9 +++-- xen/arch/x86/io_apic.c | 8 xen/arch/x86/irq.c | 4 ++-- xen/arch/x86/mpparse.c | 2 +- xen/arch/x86/msi.c | 2 +- xen/include/asm-x86/genapic.h| 9 ++--- xen/include/asm-x86/mach-generic/mach_apic.h | 9 + 10 files changed, 30 insertions(+), 21 deletions(-) diff --git a/xen/arch/x86/acpi/boot.c b/xen/arch/x86/acpi/boot.c index 8e6c96d..c16b14a 100644 --- a/xen/arch/x86/acpi/boot.c +++ b/xen/arch/x86/acpi/boot.c @@ -645,7 +645,7 @@ static void __init acpi_process_madt(void) acpi_ioapic = true; smp_found_config = true; - clustered_apic_check(); + CLUSTERED_APIC_CHECK(); } } if (error == -EINVAL) { diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c index 851a6cc..a0e1798 100644 --- a/xen/arch/x86/apic.c +++ b/xen/arch/x86/apic.c @@ -544,7 +544,7 @@ void setup_local_APIC(void) * an APIC. See e.g. "AP-388 82489DX User's Manual" (Intel * document number 292116). So here it goes... */ -init_apic_ldr(); +INIT_APIC_LDR(); /* * Set Task Priority to reject any interrupts below FIRST_DYNAMIC_VECTOR. diff --git a/xen/arch/x86/genapic/delivery.c b/xen/arch/x86/genapic/delivery.c index ced92a1..d71c01c 100644 --- a/xen/arch/x86/genapic/delivery.c +++ b/xen/arch/x86/genapic/delivery.c @@ -30,7 +30,7 @@ void __init clustered_apic_check_flat(void) printk("Enabling APIC mode: Flat. Using %d I/O APICs\n", nr_ioapics); } -const cpumask_t *vector_allocation_cpumask_flat(int cpu) +const cpumask_t *vector_allocation_cpumask_flat(int cpu, const cpumask_t *cpumask) { return _online_map; } @@ -58,7 +58,7 @@ void __init clustered_apic_check_phys(void) printk("Enabling APIC mode: Phys. Using %d I/O APICs\n", nr_ioapics); } -const cpumask_t *vector_allocation_cpumask_phys(int cpu) +const cpumask_t *vector_allocation_cpumask_phys(int cpu, const cpumask_t *cpumask) { return cpumask_of(cpu); } diff --git a/xen/arch/x86/genapic/x2apic.c b/xen/arch/x86/genapic/x2apic.c index 5fffb31..c0b97c9 100644 --- a/xen/arch/x86/genapic/x2apic.c +++ b/xen/arch/x86/genapic/x2apic.c @@ -27,6 +27,7 @@ #include #include #include +#include static DEFINE_PER_CPU_READ_MOSTLY(u32, cpu_2_logical_apicid); static DEFINE_PER_CPU_READ_MOSTLY(cpumask_t *, cluster_cpus); @@ -72,9 +73,13 @@ static void __init clustered_apic_check_x2apic(void) { } -static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu) +static const cpumask_t *vector_allocation_cpumask_x2apic_cluster(int cpu, +const cpumask_t *cpumask) { -return per_cpu(cluster_cpus, cpu); +if ( cpumask != TARGET_CPUS ) +return per_cpu(cluster_cpus, cpu); +else +return cpumask_of(cpu); } static unsigned int cpu_mask_to_apicid_x2apic_cluster(const cpumask_t *cpumask) diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c index 2838f6b..3eefcfc 100644 --- a/xen/arch/x86/io_apic.c +++ b/xen/arch/x86/io_apic.c @@ -1038,7 +1038,7 @@ static void __init setup_IO_APIC_irqs(void) disable_8259A_irq(irq_to_desc(irq)); desc = irq_to_desc(irq); -SET_DEST(entry, logical,
Re: [Xen-devel] DESIGN v2: CPUID part 3
On Wed, Jul 05, 2017 at 02:22:00PM +0100, Joao Martins wrote: > On 07/05/2017 12:16 PM, Andrew Cooper wrote: > > On 05/07/17 10:46, Joao Martins wrote: > >> Hey Andrew, > >> > >> On 07/04/2017 03:55 PM, Andrew Cooper wrote: > >>> Presented herewith is the a plan for the final part of CPUID work, which > >>> primarily covers better Xen/Toolstack interaction for configuring the > >>> guests > >>> CPUID policy. > >>> > >> Really nice write up, a few comments below. > >> > >>> A PDF version of this document is available from: > >>> > >>> http://xenbits.xen.org/people/andrewcoop/cpuid-part-3-rev2.pdf > >>> > >>> Changes from v1: > >>> * Clarification of the interaction of emulated features > >>> * More information about the difference between max and default > >>> featuresets. > >>> > >>> ~Andrew > >>> > >>> -8<- > >>> % CPUID Handling (part 3) > >>> % Revision 2 > >>> > > [snip] > > >>> # Proposal > >>> > >>> First and foremost, split the current **max\_policy** notion into separate > >>> **max** and **default** policies. This allows for the provision of > >>> features > >>> which are unused by default, but may be opted in to, both at the > >>> hypervisor > >>> level and the toolstack level. > >>> > >>> At the hypervisor level, **max** constitutes all the features Xen can use > >>> on > >>> the current hardware, while **default** is the subset thereof which are > >>> supported features, the features which the user has explicitly opted in > >>> to, > >>> and excluding any features the user has explicitly opted out of. > >>> > >>> A new `cpuid=` command line option shall be introduced, whose internals > >>> are > >>> generated automatically from the featureset ABI. This means that all > >>> features > >>> added to `include/public/arch-x86/cpufeatureset.h` automatically gain > >>> command > >>> line control. (RFC: The same top level option can probably be used for > >>> non-feature CPUID data control, although I can't currently think of any > >>> cases > >>> where this would be used Also find a sensible way to express 'available > >>> but > >>> not to be used by Xen', as per the current `smep` and `smap` options.) > >>> > >>> > >>> At the guest level, the **max** policy is conceptually unchanged. It > >>> constitutes all the features Xen is willing to offer to each type of > >>> guest on > >>> the current hardware (including emulated features). However, it shall > >>> instead > >>> be derived from Xen's **default** host policy. This is to ensure that > >>> experimental hypervisor features must be opted in to at the Xen level > >>> before > >>> they can be opted in to at the toolstack level. > >>> > >>> The guests **default** policy is then derived from its **max**. This is > >>> because there are some features which should always be explicitly opted > >>> in to > >>> by the toolstack, such as emulated features which come with a security > >>> trade-off, or for non-architectural features which may differ in > >>> implementation in heterogeneous environments. > >>> > >>> All global policies (Xen and guest, max and default) shall be made > >>> available > >>> to the toolstack, in a manner similar to the existing > >>> _XEN\_SYSCTL\_get\_cpu\_featureset_ mechanism. This allows decisions to > >>> be > >>> taken which include all CPUID data, not just the feature bitmaps. > >>> > >>> New _XEN\_DOMCTL\_{get,set}\_cpuid\_policy_ hypercalls will be introduced, > >>> which allows the toolstack to query and set the cpuid policy for a > >>> specific > >>> domain. It shall supersede _XEN\_DOMCTL\_set\_cpuid_, and shall fail if > >>> Xen > >>> is unhappy with any aspect of the policy during auditing. This provides > >>> feedback to the user that a chosen combination will not work, rather than > >>> the > >>> guest booting in an unexpected state. > >>> > >>> When a domain is initially created, the appropriate guests **default** > >>> policy > >>> is duplicated for use. When auditing, Xen shall audit the toolstacks > >>> requested policy against the guests **max** policy. This allows > >>> experimental > >>> features or non-migration-safe features to be opted in to, without those > >>> features being imposed upon all guests automatically. > >>> > >>> A guests CPUID policy shall be immutable after construction. This better > >>> matches real hardware, and simplifies the logic in Xen to translate policy > >>> alterations into configuration changes. > >>> > >> This appears to be a suitable abstraction even for higher level toolstacks > >> (libxl). At least I can imagine libvirt fetching the PV/HVM max policy, and > >> compare them between different servers when user computes the guest cpu > >> config > >> (the normalized one) and use the common denominator as the guest policy. > >> Probably higher level toolstack could even use these said policies > >> constructs > >> and built the idea of models such that the user could easily choose one > >> for a > >> pool of hosts with different
[Xen-devel] [PATCH v1 0/1] xen/arm: zynqmp: Disable PCIe
From: "Edgar E. Iglesias"Hi, We're seeing panics in dom0 with PCIe enabled due to what seems to be wrongly created mappings by Xen. With older kernels we didn't see the panics but PCIe wasn't functional in dom0. This disables the PCIe nodes on the ZynqMP until Xen/ARM gets more PCIe support. Can we get this into stable-4.8 and 4.9? Cheers, Edgar Edgar E. Iglesias (1): xen/arm: Disable PCIe on the ZynqMP xen/arch/arm/platforms/xilinx-zynqmp.c | 1 + 1 file changed, 1 insertion(+) -- 2.7.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH v1 1/1] xen/arm: Disable PCIe on the ZynqMP
From: "Edgar E. Iglesias"Disable PCIe on the ZynqMP. Xen does not yet know how to map the controller and dom0 fails to boot with the node enabled. Signed-off-by: Edgar E. Iglesias --- xen/arch/arm/platforms/xilinx-zynqmp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/xen/arch/arm/platforms/xilinx-zynqmp.c b/xen/arch/arm/platforms/xilinx-zynqmp.c index 2adee91..822287b 100644 --- a/xen/arch/arm/platforms/xilinx-zynqmp.c +++ b/xen/arch/arm/platforms/xilinx-zynqmp.c @@ -29,6 +29,7 @@ static const struct dt_device_match zynqmp_blacklist_dev[] __initconst = { /* Power management is not yet supported. */ DT_MATCH_COMPATIBLE("xlnx,zynqmp-pm"), +DT_MATCH_COMPATIBLE("xlnx,nwl-pcie-2.11"), { /* sentinel */ }, }; -- 2.7.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [linux-linus test] 112390: regressions - trouble: blocked/broken/fail/pass
flight 112390 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/112390/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-arm64 2 hosts-allocate broken REGR. vs. 110515 build-arm64-pvops 2 hosts-allocate broken REGR. vs. 110515 build-arm64-xsm 2 hosts-allocate broken REGR. vs. 110515 test-amd64-amd64-examine 7 reboot fail REGR. vs. 110515 test-amd64-amd64-libvirt-pair 10 xen-boot/src_host fail REGR. vs. 110515 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 110515 test-amd64-amd64-rumprun-amd64 7 xen-boot fail REGR. vs. 110515 test-amd64-amd64-libvirt-pair 11 xen-boot/dst_host fail REGR. vs. 110515 test-amd64-amd64-qemuu-nested-intel 7 xen-boot fail REGR. vs. 110515 test-amd64-amd64-xl-qemut-debianhvm-amd64 7 xen-bootfail REGR. vs. 110515 test-amd64-amd64-xl-pvh-intel 7 xen-bootfail REGR. vs. 110515 test-amd64-amd64-pygrub 7 xen-boot fail REGR. vs. 110515 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 110515 test-amd64-amd64-libvirt 7 xen-boot fail REGR. vs. 110515 test-amd64-amd64-xl-qcow2 7 xen-boot fail REGR. vs. 110515 test-amd64-amd64-i386-pvgrub 7 xen-boot fail REGR. vs. 110515 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 110515 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 110515 test-amd64-amd64-amd64-pvgrub 7 xen-bootfail REGR. vs. 110515 test-amd64-amd64-libvirt-vhd 7 xen-boot fail REGR. vs. 110515 Tests which did not succeed, but are not blocking: build-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-examine 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-xl 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit2 1 build-check(1) blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a build-arm64-xsm 3 capture-logs broken blocked in 110515 build-arm64-pvops 3 capture-logs broken blocked in 110515 build-arm64 3 capture-logs broken blocked in 110515 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 110515 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail like 110515 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 110515 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 110515 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 110515 test-amd64-amd64-xl-rtds 10 debian-install fail like 110515 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 110515 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-xl-rtds 13 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-installfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass test-amd64-i386-xl-qemut-ws16-amd64 13 guest-saverestore fail never pass test-armhf-armhf-xl 13 migrate-support-checkfail never pass test-armhf-armhf-xl 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass
Re: [Xen-devel] [XTF PATCH] build: append -fno-pic to CFLAGS
On 31/07/17 18:20, Wei Liu wrote: > It appears that Stretch's gcc has this on by default, which causes the > generating of several get_pc_thunk's, which breaks xsa-192 test. > > Signed-off-by: Wei LiuReviewed and committed. Thanks. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [XTF PATCH] build: append -fno-pic to CFLAGS
It appears that Stretch's gcc has this on by default, which causes the generating of several get_pc_thunk's, which breaks xsa-192 test. Signed-off-by: Wei Liu--- build/common.mk | 1 + 1 file changed, 1 insertion(+) diff --git a/build/common.mk b/build/common.mk index f1de800..4ce6abf 100644 --- a/build/common.mk +++ b/build/common.mk @@ -28,6 +28,7 @@ COMMON_CFLAGS += -fno-common -fno-asynchronous-unwind-tables -fno-strict-aliasin COMMON_CFLAGS += -fno-stack-protector -ffreestanding COMMON_CFLAGS += -mno-red-zone -mno-sse COMMON_CFLAGS += -Wno-unused-parameter -Winline +COMMON_CFLAGS += -fno-pic COMMON_AFLAGS-x86_32 := -m32 COMMON_AFLAGS-x86_64 := -m64 -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen:arm earlyprintk configuration for Hikey 960 boards
On 31/07/17 17:11, Konrad Rzeszutek Wilk wrote: On Thu, Jul 27, 2017 at 10:52:40AM +0100, Julien Grall wrote: Hi Konrad, On 27/07/17 02:18, Konrad Rzeszutek Wilk wrote: On Wed, Jul 26, 2017 at 05:59:15PM +0100, Julien Grall wrote: Hi Konrad, On 26/07/17 17:54, Konrad Rzeszutek Wilk wrote: Introduce an earlyprintk configuration of Hikey 960 boards. ..snip.. Would it be possible to update the wiki page on the hikey [1] with your latest finding? I added a whole new web-page: https://wiki.xenproject.org/wiki/HiKey960 As the HiKey != HiKey960 Oh :). Fine then. Would you be up to do testing on this platform during RC? Yes. Thank you! If you can you add your name in [1]. This would help user to know the board has been tested with latest release. [Also updated the landing page https://wiki.xenproject.org/wiki/Xen_ARM_with_Virtualization_Extensions to point to this new one] Cheers, [1] https://wiki.xenproject.org/wiki/Xen_ARM_Manual_Smoke_Test/Results -- Julien Grall -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/2] libxl: do not start dom0 qemu for stubdomain when not needed
Wei Liu writes ("Re: [PATCH v2 2/2] libxl: do not start dom0 qemu for stubdomain when not needed"): > Hmm... I don't think there is requirement in CODING_STYLE for > multiple-line comment, so there are quite a few styles in use. But > looking at libxl code the prevailing style seems to be: > >/* > * > * > */ > > I will make the adjustment while committing. Personally I think this is well into the kind of territory that is not worth arguing over. Even a mix of styles here is fine. (I think we should mandate the initial `*' on continuation lines.) Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [libvirt test] 112392: trouble: blocked/broken/pass
flight 112392 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/112392/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-arm64-pvops 2 hosts-allocate broken REGR. vs. 112276 build-arm64-xsm 2 hosts-allocate broken REGR. vs. 112276 build-arm64 2 hosts-allocate broken REGR. vs. 112276 Tests which did not succeed, but are not blocking: test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a build-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-qcow2 1 build-check(1) blocked n/a test-arm64-arm64-libvirt 1 build-check(1) blocked n/a build-arm64 3 capture-logs broken blocked in 112276 build-arm64-xsm 3 capture-logs broken blocked in 112276 build-arm64-pvops 3 capture-logs broken blocked in 112276 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 112276 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 112276 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 112276 test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass version targeted for testing: libvirt 772a6e36a5cbed4992315c4a8325947020f7ec70 baseline version: libvirt f7237d63e8f02f3689f9b63b413fae7d4221faa9 Last test of basis 112276 2017-07-25 04:21:09 Z6 days Failing since112310 2017-07-26 04:21:38 Z5 days6 attempts Testing same since 112370 2017-07-29 04:23:27 Z2 days3 attempts People who touched revisions under test: Andrea BolognaniDaniel P. Berrange Erik Skultety John Ferlan Ján Tomko Martin Kletzander Michal Privoznik Nitesh Konkar Nitesh Konkar Pavel Hrdina Peter Krempa Scott Garfinkle jobs: build-amd64-xsm pass build-arm64-xsm broken build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-arm64 broken build-armhf pass build-i386 pass build-amd64-libvirt pass build-arm64-libvirt blocked build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-arm64-pvopsbroken build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass test-amd64-amd64-libvirt-xsm pass test-arm64-arm64-libvirt-xsm blocked test-armhf-armhf-libvirt-xsm pass test-amd64-i386-libvirt-xsm pass test-amd64-amd64-libvirt pass test-arm64-arm64-libvirt blocked test-armhf-armhf-libvirt pass test-amd64-i386-libvirt pass test-amd64-amd64-libvirt-pair
Re: [Xen-devel] [PATCH v2 2/2] libxl: do not start dom0 qemu for stubdomain when not needed
On Fri, Jul 28, 2017 at 06:42:14PM +0200, Marek Marczykowski-Górecki wrote: > Do not setup vfb+vkb when no access method was configured. Then check if > qemu is really needed. > > The only not configurable thing forcing qemu running in dom0 after this > change are consoles used to save/restore. But even in that case, there > is much smaller part of qemu exposed. > > Signed-off-by: Marek Marczykowski-GóreckiAcked-by: Wei Liu > +/* will be changed back to LIBXL__CONSOLE_BACKEND_IOEMU if > qemu > + * will be in use */ Hmm... I don't think there is requirement in CODING_STYLE for multiple-line comment, so there are quite a few styles in use. But looking at libxl code the prevailing style seems to be: /* * * */ I will make the adjustment while committing. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] PVH VCPU hotplug support v7?
On 07/31/2017 11:36 AM, Ross Lagerwall wrote: > On 07/31/2017 03:29 PM, Boris Ostrovsky wrote: >> On 07/31/2017 10:12 AM, Andrew Cooper wrote: >>> On 31/07/17 14:55, Boris Ostrovsky wrote: On 07/31/2017 09:20 AM, Ross Lagerwall wrote: > Hi Boris, > > I've modified your PVH VCPU hotplug support v6 patch series [1] to > support HVM guests running _with_ a device model for XenServer's > purposes. This is useful because it moves the vCPU hotplug handling > out of QEMU and allows it to mostly be shared with PVH. It will also > allow unplugging vCPUs (libxl currently only does cpu-add for > upstream > qemu). > > Are you still planning on continuing with that patch series since > your > commit to Linux [2]? This series has been put on hold until we figure out what to do with hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI CPU namespace --- on hotplug event dom0 has to somehow figure out whether the event was due to (dis)appearance of a physical or virtual CPU). I don't think this has been dealt with yet (copying Roger). >>> From the point of view of unblocking several pieces of work, it >>> would be >>> fine for this logic to be behind an emulation flag, just like >>> LAPIC/etc. >> >> The (I think) last message discussing this series was >> >> https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00143.html >> >> >> Are you suggesting extracting pieces that would move hotplug support for >> HVM guests from qemu to hypervisor/toolstack but leave all PVH-specific >> code out? (The feature flag is part of this series --- >> https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00059.html) >> >> > > I think(?) Andrew was suggesting to have an emulation flag such that > hotplug support is moved into the hypervisor for HVM guests _and_ PVH > guests except for PVH dom0. That (different handling for PVH dom0 vs. domU) was exactly what Jan was objecting to. (I'll add him too). -boris > > I don't know what work this unblocks that he was referring to. > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 4/8] mm: Scrub memory from idle loop
On 07/31/2017 11:20 AM, Jan Beulich wrote: Boris Ostrovsky07/23/17 4:14 AM >>> @@ -1050,17 +1120,42 @@ static void scrub_free_pages(unsigned int node) -if ( node_need_scrub[node] == 0 ) -return; +if ( preempt || (node_need_scrub[node] == 0) ) +goto out; } } while ( order-- != 0 ); } + + out: +spin_unlock(_lock); +node_clear(node, node_scrubbing); +return softirq_pending(cpu) || (node_to_scrub(false) != NUMA_NO_NODE); >>> While I can see why you use it here, the softirq_pending() looks sort of >>> misplaced: While invoking it twice in the caller will look a little odd too, >>> I still think that's where the check belongs. >> >> scrub_free_pages is called from idle loop as >> >> else if ( !softirq_pending(cpu) && !scrub_free_pages() ) >> pm_idle(); >> >> so softirq_pending() is unnecessary here. >> >> (Not sure why you are saying it would be invoked twice) > That was sort of implicit - the caller would want to become > > > else if ( !softirq_pending(cpu) && !scrub_free_pages() && > !softirq_pending(cpu) ) > pm_idle(); > > to account for the fact that a softirq may become pending while scrubbing. That would look really odd IMO. Would else if ( !softirq_pending(cpu) ) if ( !scrub_free_pages() && !softirq_pending(cpu) ) pm_idle(); or else if ( !softirq_pending(cpu) && !scrub_free_pages() ) if ( !softirq_pending(cpu) ) pm_idle(); be better? (I'd prefer the first) -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen:arm earlyprintk configuration for Hikey 960 boards
On Thu, Jul 27, 2017 at 10:52:40AM +0100, Julien Grall wrote: > Hi Konrad, > > On 27/07/17 02:18, Konrad Rzeszutek Wilk wrote: > > On Wed, Jul 26, 2017 at 05:59:15PM +0100, Julien Grall wrote: > > > Hi Konrad, > > > > > > On 26/07/17 17:54, Konrad Rzeszutek Wilk wrote: > > > > Introduce an earlyprintk configuration of Hikey 960 boards. > > > > ..snip.. > > > > > > Would it be possible to update the wiki page on the hikey [1] with your > > > latest finding? > > > > I added a whole new web-page: > > https://wiki.xenproject.org/wiki/HiKey960 > > > > As the HiKey != HiKey960 > > Oh :). Fine then. Would you be up to do testing on this platform during RC? Yes. > > If you can you add your name in [1]. This would help user to know the board > has been tested with latest release. > > > > > [Also updated the landing page > > https://wiki.xenproject.org/wiki/Xen_ARM_with_Virtualization_Extensions > > to point to this new one] > > > > Cheers, > > [1] https://wiki.xenproject.org/wiki/Xen_ARM_Manual_Smoke_Test/Results > > -- > Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 3/8] mm: Scrub pages in alloc_heap_pages() if needed
On 07/31/2017 11:16 AM, Jan Beulich wrote: Boris Ostrovsky07/23/17 4:07 AM >>> >> On 06/27/2017 02:00 PM, Jan Beulich wrote: >> Boris Ostrovsky 06/22/17 8:55 PM >>> @@ -862,10 +879,19 @@ static struct page_info *alloc_heap_pages( if ( d != NULL ) d->last_alloc_node = node; +need_scrub = !!first_dirty_pg && !(memflags & MEMF_no_scrub); >>> No need for !! here. But I wonder whether that part of the check is really >>> useful anyway, considering the sole use ... >>> for ( i = 0; i < (1 << order); i++ ) { /* Reference count must continuously be zero for free pages. */ -BUG_ON(pg[i].count_info != PGC_state_free); +BUG_ON((pg[i].count_info & ~PGC_need_scrub) != PGC_state_free); + +if ( test_bit(_PGC_need_scrub, [i].count_info) ) +{ +if ( need_scrub ) +scrub_one_page([i]); >>> ... here. If it isn't, I think the local variable isn't warranted either. >>> If you agree, the thus adjusted patch can have >>> Reviewed-by: Jan Beulich >>> (otherwise I'll wait with it to understand the reason first). >> first_dirty_pg is indeed unnecessary but I think local variable is >> useful to avoid ANDing memflags inside the loop on each iteration >> (unless you think compiler is smart enough to realize that memflags is >> not changing). > I don't understand: At least on x86 I'd expect the compiler to use a > single TEST if you used memflags inside the loop, whereas the local > variable would likely be a single CMP inside the loop plus setup code > outside of it. OK, I haven't considered that you don't actually need to AND and then CMP. Then yes, local variable is unnecessary. -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 3/5] xen/livepatch/ARM32: Don't load and crash on livepatches loaded with wrong alignment.
On Mon, Jul 31, 2017 at 07:55:34AM -0600, Jan Beulich wrote: > >>> Konrad Rzeszutek Wilk07/26/17 9:50 PM >>> > >--- a/docs/misc/livepatch.markdown > >+++ b/docs/misc/livepatch.markdown > >@@ -279,6 +279,10 @@ It may also have some architecture-specific sections. > >For example: > >* Exception tables. > >* Relocations for each of these sections. > > > >+Note that on ARM 32 the sections SHOULD be four byte aligned. Otherwise > >+we risk hitting Data Abort exception as un-aligned manipulation of data is > >+prohibited on ARM 32. > > This (and hence the rest of the patch) is not in line with the outcome of the > earlier discussion we had. Nothing is wrong with a section having smaller > alignment, as long as there are no 32-bit (or wider, but I don't think there > are any such) relocations against such a section. And even if there were, I > think it should rather be the code doing the relocations needing to cope, as > I don't think the ARM ELF ABI imposes any such restriction. The idea behind this patch is to give advance warnings. Akin to what 2ff229643b739e2fd0cd0536ee9fca506cfa92f8 "xen/livepatch: Don't crash on encountering STN_UNDEF relocations" did. The other patches in this series fix the alignment issues. The ARM ELF ABI (http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044f/IHI0044F_aaelf.pdf) says: 4.3.5 Section Alignment There is no minimum alignment required for a section. However, sections containing thumb code must be at least 16-bit aligned and sections containing ARM code must be at least 32-bit aligned. Platform standards may set a limit on the maximum alignment that they can guarantee (normally the page size). ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/8] mm: Place unscrubbed pages at the end of pagelist
On 07/31/2017 10:45 AM, Jan Beulich wrote: Boris Ostrovsky07/23/17 4:01 AM >>> >> On 06/27/2017 01:06 PM, Jan Beulich wrote: >> Boris Ostrovsky 06/22/17 8:55 PM >>> +{ +if ( pg < first_dirty_pg ) +first_dirty = (first_dirty_pg - pg) / sizeof(*pg); >>> Pointer subtraction already includes the involved division. >> >> Yes, this was a mistake. >> >>> Otoh I wonder >>> if you couldn't get away without pointer comparison/subtraction here >>> altogether. >> >> Without comparison I can only assume that first_dirty is zero (i.e. the >> whole buddy is potentially dirty). Is there something else I could do? > I was thinking of tracking indexes instead of pointers. But maybe that > would more hamper readability of the overall result than help it. I'll try to see how it looks. > +else +i = 0; + +for ( ; i < (1 << cur_order); i++ ) +if ( test_bit(_PGC_need_scrub, + _head[i].count_info) ) +{ +first_dirty = i; +break; +} >>> Perhaps worth having ASSERT(first_dirty != INVALID_DIRTY_IDX) here? Or are >>> there cases where ->u.free.first_dirty of a page may be wrong? >> >> When we merge in free_heap_pages we don't clear first_dirty of the >> successor buddy (at some point I did have this done but you questioned >> whether it was needed and I dropped it). > Hmm, this indeed answers my question, but doesn't help (me) understanding > whether the suggested ASSERT() could be wrong. Oh, I see what you were asking --- ASSERT() *after* the loop, to make sure we indeed found the first dirty page. Yes, I will add it. > --- a/xen/include/asm-x86/mm.h +++ b/xen/include/asm-x86/mm.h @@ -88,7 +88,15 @@ struct page_info /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */ struct { /* Do TLBs need flushing for safety before next page use? */ -bool_t need_tlbflush; +unsigned long need_tlbflush:1; + +/* + * Index of the first *possibly* unscrubbed page in the buddy. + * One more than maximum possible order (MAX_ORDER+1) to >>> Why +1 here and hence ... >> Don't we have MAX_ORDER+1 orders? > So here there might be a simple misunderstanding: I understand the > parenthesized MAX_ORDER+1 to represent "maximum possible > order", i.e. excluding the "one more than", not the least because of > the ... > >>> + * accommodate INVALID_DIRTY_IDX. >>> + */ >>> +#define INVALID_DIRTY_IDX (-1UL & (((1UL< >> +unsigned long first_dirty:MAX_ORDER + 2; > +2 here. > >>> ... why +2 instead of +1? And isn't the expression INVALID_DIRTY_IDX wrongly >>> parenthesized (apart from lacking blanks around the shift operator)? I'd >>> expect you want a value with MAX_ORDER+1 set bits, i.e. >>> (1UL << (MAX_ORDER + 1)) - 1. ANDing with -1UL seems quite pointless too. >> Yes to parentheses and AND. Should be (1UL << (MAX_ORDER + 2)) - 1 > I.e. I would still expect it to be (1UL << (MAX_ORDER + 1)) - 1 > here. Sorry, I still don't get it. Say, MAX_ORDER is 1. Since this implies that indexes 0, 1, 2 and 3 are all valid (because we can have up to 2^(MAX_ORDER+1) pages), don't we need 3 bits to indicate an invalid index? -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/2] libxl: use xen-blkback for 'vbd' disk types by default
On Mon, Jul 31, 2017 at 04:56:04PM +0100, Wei Liu wrote: > On Fri, Jul 28, 2017 at 06:42:13PM +0200, Marek Marczykowski-Górecki wrote: > > This will allow later to make HVM domain without qemu in dom0 (in > > addition to the one in stubdomain). > > > > Signed-off-by: Marek Marczykowski-Górecki> > > > --- > > This is extracted from v1 of "libxl: do not start dom0 qemu for > > stubdomain when not needed". > > > > Signed-off-by: Marek Marczykowski-Górecki > > --- > > tools/libxl/libxl_disk.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/tools/libxl/libxl_disk.c b/tools/libxl/libxl_disk.c > > index 63de75c..7842d9b 100644 > > --- a/tools/libxl/libxl_disk.c > > +++ b/tools/libxl/libxl_disk.c > > @@ -56,10 +56,12 @@ static void disk_eject_xswatch_callback(libxl__egc > > *egc, libxl__ev_xswatch *w, > > "/local/domain/%d/backend/%" TOSTRING(BACKEND_STRING_SIZE) > > "[a-z]/%*d/%*d", > > >backend_domid, backend_type); > > -if (!strcmp(backend_type, "tap") || !strcmp(backend_type, "vbd")) { > > +if (!strcmp(backend_type, "tap")) { > > disk->backend = LIBXL_DISK_BACKEND_TAP; > > } else if (!strcmp(backend_type, "qdisk")) { > > disk->backend = LIBXL_DISK_BACKEND_QDISK; > > +} else if (!strcmp(backend_type, "vbd")) { > > +disk->backend = LIBXL_DISK_BACKEND_PHY; > > Wait, it only occurred to me until now this patch is changing > disk_eject_xswatch_callback. > > Is this a bug fix? How is it possible for the backend_type to be "vbd" > when there isn't such thing in libxl_types.idl? Oh, I'm an idiot. That's read from xenstore path. I think this patch is correct. But I still tend to think this is a bug fix. How do you discover this problem? Has disk eject ever worked for you? ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] xen/link: Move .data.rel.ro sections into .rodata for final link
On Mon, 2017-07-31 at 07:15 -0600, Jan Beulich wrote: > > > > David Woodhouse07/31/17 1:02 PM >>> > > On Sun, 2017-07-30 at 00:16 -0600, Jan Beulich wrote: > > > > > > David Woodhouse 07/20/17 5:22 PM >>> > > > > This includes stuff lke the hypercall tables which we really want > > > > to be read-only. And they were going into .data.read-mostly. > > > > > > Yes, we'd like them to be read-only, but what if EFI properly assigned r/o > > > permissions to the .rodata section when loading xen.efi? We'd then be > > > unable to apply relocations when switching from 1:1 to virtual mappings > > > (see efi_arch_relocate_image()). > > > > FWIW it does look like TianoCore has gained the ability to mark > > sections as read-only, in January of this year: > > https://github.com/tianocore/edk2/commit/d0e92aad46 > > > > It doesn't actually seem to be complete — even with subsequent fixes > > since that commit, it doesn't look like it catches the case of data > > sections without EFI_IMAGE_SCN_MEM_WRITE, such as .rodata. > > > > And even if/when that gets fixed you'll note that the protection is > > deliberately torn down in ExitBootServices(), specifically for the case > > you're concerned about below — because you'll need to do the > > relocations. > > As said in an earlier reply, a first pass over relocations is being done > long before the call to ExitBootServices(). A minimal adjustment to > efi_arch_relocate_image() will be needed anyway, afaict. Ah, right. I think more "implied" than "said" but I understand now. :) At least, I understand what you're saying... I have no bloody clue what's actually going on though. There is a first call to efi_arch_relocate_image(0) before the ExitBootServices() call. However I'm missing something because I can't see how that call achieves anything *other* than to trigger the fault we're concerned about. There are three types of relocations — PE_BASE_RELOC_ABS, PE_BASE_RELOC_HIGHLOW, and PE_BASE_RELOC_DIR64. The first (ABS) doesn't seem to do anything, ever. And is never emitted by mkrelocs.c. The second (HIGHLOW) does nothing if (!delta). The third (DIR64) simply adds 'delta' to the target address. We could potentially stop it faulting on that pointless '*addr += 0' by doing this... --- a/xen/arch/x86/efi/efi-boot.h +++ b/xen/arch/x86/efi/efi-boot.h @@ -87,7 +87,8 @@ static void __init efi_arch_relocate_image(unsigned long delta) case PE_BASE_RELOC_DIR64: if ( in_page_tables(addr) ) blexit(L"Unexpected relocation type"); -*(u64 *)addr += delta; +if ( delta ) +*(u64 *)addr += delta; break; default: blexit(L"Unsupported relocation type"); ... but then again, if the whole function is really doing nothing at all when invoked with delta==0 then perhaps it would just be easier to remove the first pass altogether. I feel sure I'm missing something, but what? Is it still supposed to be adding xen_phys_start in the PE_BASE_RELOC_HIGHLOW case even when delta==0? Because it isn't... Either way, this is still broken before my patch though, right? Especially if UEFI learns to do it for non-executable sections, but AFAICT even before that. These are the sections I see the PE section headers of a local build: Name Characteristics Relocations .text 0xe0d00020 (WRX) ✓ .rodata 0x40600040 ( R ) ✓ .buildid 0x40300040 ( R ) .init.te 0x60500020 ( RX) ✓ .init.da 0xc0d00040 (WR ) ✓ .data.re 0xc0800040 (WR ) ✓ .data 0xc0d00040 (WR ) ✓ .bss 0xc180 (WR ) .reloc 0x42300040 ( R ) .pad 0xc0300080 (WR ) So there are (again, before my patch) relocations in .init.da(ta) and .rodata sections which UEFI *might* start marking read-only, and also in .init.te(xt) which is R+X and could be marked read-only today. And the .init.te(xt) relocations include PE_BASE_RELOC_DIR64 relocations, which *would* cause a fault in the !delta case. (All the relocations in .rodata both before and after my patch are also PE_BASE_RELOC_DIR64, FWIW) smime.p7s Description: S/MIME cryptographic signature ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/2] libxl: use xen-blkback for 'vbd' disk types by default
On Fri, Jul 28, 2017 at 06:42:13PM +0200, Marek Marczykowski-Górecki wrote: > This will allow later to make HVM domain without qemu in dom0 (in > addition to the one in stubdomain). > > Signed-off-by: Marek Marczykowski-Górecki> > --- > This is extracted from v1 of "libxl: do not start dom0 qemu for > stubdomain when not needed". > > Signed-off-by: Marek Marczykowski-Górecki > --- > tools/libxl/libxl_disk.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/tools/libxl/libxl_disk.c b/tools/libxl/libxl_disk.c > index 63de75c..7842d9b 100644 > --- a/tools/libxl/libxl_disk.c > +++ b/tools/libxl/libxl_disk.c > @@ -56,10 +56,12 @@ static void disk_eject_xswatch_callback(libxl__egc *egc, > libxl__ev_xswatch *w, > "/local/domain/%d/backend/%" TOSTRING(BACKEND_STRING_SIZE) > "[a-z]/%*d/%*d", > >backend_domid, backend_type); > -if (!strcmp(backend_type, "tap") || !strcmp(backend_type, "vbd")) { > +if (!strcmp(backend_type, "tap")) { > disk->backend = LIBXL_DISK_BACKEND_TAP; > } else if (!strcmp(backend_type, "qdisk")) { > disk->backend = LIBXL_DISK_BACKEND_QDISK; > +} else if (!strcmp(backend_type, "vbd")) { > +disk->backend = LIBXL_DISK_BACKEND_PHY; Wait, it only occurred to me until now this patch is changing disk_eject_xswatch_callback. Is this a bug fix? How is it possible for the backend_type to be "vbd" when there isn't such thing in libxl_types.idl? ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] PVH VCPU hotplug support v7?
On 07/31/2017 03:29 PM, Boris Ostrovsky wrote: On 07/31/2017 10:12 AM, Andrew Cooper wrote: On 31/07/17 14:55, Boris Ostrovsky wrote: On 07/31/2017 09:20 AM, Ross Lagerwall wrote: Hi Boris, I've modified your PVH VCPU hotplug support v6 patch series [1] to support HVM guests running _with_ a device model for XenServer's purposes. This is useful because it moves the vCPU hotplug handling out of QEMU and allows it to mostly be shared with PVH. It will also allow unplugging vCPUs (libxl currently only does cpu-add for upstream qemu). Are you still planning on continuing with that patch series since your commit to Linux [2]? This series has been put on hold until we figure out what to do with hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI CPU namespace --- on hotplug event dom0 has to somehow figure out whether the event was due to (dis)appearance of a physical or virtual CPU). I don't think this has been dealt with yet (copying Roger). From the point of view of unblocking several pieces of work, it would be fine for this logic to be behind an emulation flag, just like LAPIC/etc. The (I think) last message discussing this series was https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00143.html Are you suggesting extracting pieces that would move hotplug support for HVM guests from qemu to hypervisor/toolstack but leave all PVH-specific code out? (The feature flag is part of this series --- https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00059.html) I think(?) Andrew was suggesting to have an emulation flag such that hotplug support is moved into the hypervisor for HVM guests _and_ PVH guests except for PVH dom0. I don't know what work this unblocks that he was referring to. -- Ross Lagerwall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 0/3] Docs: consolidate release related documents
On Mon, Jul 31, 2017 at 02:51:21PM +0100, Ian Jackson wrote: > Wei Liu writes ("[PATCH 0/3] Docs: consolidate release related documents"): > > Wei Liu (3): > > docs: consolidate release related documents > > docs: add xen-release-management.pandoc > > docs: hook up process/ to build system > > FWIW, > > Acked-by: Ian Jackson> > However, AFAICT the mean reason this doesn't process > release-checklist.txt and branching-checklist.txt into files on the > website is that you only have it process pandocs. > Correct, that's deliberate. > But I don't think those files are ever going to want to be made into > web pages. Whereas it is possible that we will grow other process > documents in .txt form which do, and no provision is made for them. > > I wonder if it would be better to rename those to files to a different > directory. The reason I invented misc/ was precisely to avoid people > (users and developers) tripping over them thinking they might be > useful... > Lars (?) said there should be a "process" directory, hence I put everything that's relevant here. I'm open to suggestions about directory structure. Maybe having a technician-checklist directory under process? ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 4/8] mm: Scrub memory from idle loop
>>> Boris Ostrovsky07/23/17 4:14 AM >>> >>> @@ -1050,17 +1120,42 @@ static void scrub_free_pages(unsigned int node) >>> -if ( node_need_scrub[node] == 0 ) >>> -return; >>> +if ( preempt || (node_need_scrub[node] == 0) ) >>> +goto out; >>> } >>> } while ( order-- != 0 ); >>> } >>> + >>> + out: >>> +spin_unlock(_lock); >>> +node_clear(node, node_scrubbing); >>> +return softirq_pending(cpu) || (node_to_scrub(false) != NUMA_NO_NODE); >> >> While I can see why you use it here, the softirq_pending() looks sort of >> misplaced: While invoking it twice in the caller will look a little odd too, >> I still think that's where the check belongs. > > >scrub_free_pages is called from idle loop as > >else if ( !softirq_pending(cpu) && !scrub_free_pages() ) >pm_idle(); > >so softirq_pending() is unnecessary here. > >(Not sure why you are saying it would be invoked twice) That was sort of implicit - the caller would want to become else if ( !softirq_pending(cpu) && !scrub_free_pages() && !softirq_pending(cpu) ) pm_idle(); to account for the fact that a softirq may become pending while scrubbing. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v4]Proposal to allow setting up shared memory areas between VMs from xl config file
On Mon, Jul 31, 2017 at 01:09:04AM +0800, Zhongze Liu wrote: > > @cache_policyCan only appear when @role = master. >The stage-2 cacheability/shareability attributes of the >shared memory area. Currently, only two policies are >supported: > * ARM_normal: Only applicable to ARM guests. This >would mean Inner and Outer Write-Back >Cacheable, and Inner Shareable. > * x86_normal: Only applicable to x86 HVM guests. This >would mean Write-Back Cacheable. This question might have been asked before, I'm sorry for not being able to follow closely previous discussions... Why are these opaque names chosen instead of just writing "wb" "wc" or whatever? ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 3/8] mm: Scrub pages in alloc_heap_pages() if needed
>>> Boris Ostrovsky07/23/17 4:07 AM >>> >On 06/27/2017 02:00 PM, Jan Beulich wrote: > Boris Ostrovsky 06/22/17 8:55 PM >>> >>> @@ -862,10 +879,19 @@ static struct page_info *alloc_heap_pages( >>> if ( d != NULL ) >>> d->last_alloc_node = node; >>> >>> +need_scrub = !!first_dirty_pg && !(memflags & MEMF_no_scrub); >> >> No need for !! here. But I wonder whether that part of the check is really >> useful anyway, considering the sole use ... >> >>> for ( i = 0; i < (1 << order); i++ ) >>> { >>> /* Reference count must continuously be zero for free pages. */ >>> -BUG_ON(pg[i].count_info != PGC_state_free); >>> +BUG_ON((pg[i].count_info & ~PGC_need_scrub) != PGC_state_free); >>> + >>> +if ( test_bit(_PGC_need_scrub, [i].count_info) ) >>> +{ >>> +if ( need_scrub ) >>> +scrub_one_page([i]); >> >> ... here. If it isn't, I think the local variable isn't warranted either. >> If you agree, the thus adjusted patch can have >> Reviewed-by: Jan Beulich >> (otherwise I'll wait with it to understand the reason first). > >first_dirty_pg is indeed unnecessary but I think local variable is >useful to avoid ANDing memflags inside the loop on each iteration >(unless you think compiler is smart enough to realize that memflags is >not changing). I don't understand: At least on x86 I'd expect the compiler to use a single TEST if you used memflags inside the loop, whereas the local variable would likely be a single CMP inside the loop plus setup code outside of it. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 1/2] libxl: Implement the handler to handle unrecoverable AER errors.
On Fri, Jul 28, 2017 at 04:56:56PM -0700, Venu Busireddy wrote: > On 2017-07-28 17:39:52 +0100, Ian Jackson wrote: > > Venu Busireddy writes ("[PATCH v2 1/2] libxl: Implement the handler to > > handle unrecoverable AER errors."): > > > Implement the callback function to handle unrecoverable AER errors, and > > > also the public APIs that can be used to register/unregister the handler. > > > When an AER error occurs, the handler will forcibly remove the erring > > > PCIe device from the guest. > > > > Why is this only sometimes the right thing to do ? On what basis > > might a user choose ? > > This is not an "only sometimes" thing. User doesn't choose it. We always > want to watch for AER errors. > > > If this is always the right thing to do then maybe we need to recast > > this as a general "please run monitoring for this domain" call ? > > What does "recast" imply? Which "call" are you referring to? > > > > +int libxl_reg_aer_events_handler(libxl_ctx *, uint32_t) > > > LIBXL_EXTERNAL_CALLERS_ONLY; > > > +void libxl_unreg_aer_events_handler(libxl_ctx *, uint32_t) > > > LIBXL_EXTERNAL_CALLERS_ONLY; > > > > I think these names are very unintuitive. They describe the > > implementation, not the effect. > > These names can be changed to anything you want. Please suggest any names > of your choice, and I will change them. That ensures that we don't spend > more review revisions in fine tuning those names. > > > The API seems awkward. Inside libxl, events are only processed while > > the application is inside libxl. So for these functions to be > > effective, the calling application must arrange to be running the > > libxl event loop. This should be documented, at least. > > I am afraid I don't follow you. This scheme is tested and it works. So, I > do not follow you when you say, "...for these functions to be effective,..." Libxl has an internal event loop. The code as-is registers a watch which runs on the internal event loop. The event loop's event is only processed when the process enters libxl (calls libxl functions). Imagine a toolstack which doesn't use libxl's internal event loop -- for example libvirt has its own event loop, or a toolstack which only calls the register function then nothing else. Your API would not work for those cases. Ian, please correct me if I'm wrong. > > > What happens if more than one process calls this at once ? > > Each call is handled within a separate 'xl' process's context. I don't > see a problem there. Do you have anything specific in mind? > It is possible that both xl processes see its watch fires and try to write to the same node at the same time. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v6 05/12] x86/domctl: Handle ACPI access from domctl
On 07/31/2017 10:14 AM, Ross Lagerwall wrote: > On 01/03/2017 02:04 PM, Boris Ostrovsky wrote: >> Signed-off-by: Boris Ostrovsky>> --- >> Changes in v6: >> * Adjustments to to patch 4 changes. >> * Added a spinlock for VCPU map access >> * Return an error on guest trying to write VCPU map >> > snip >> -static int acpi_cpumap_access_common(struct domain *d, bool is_write, >> - unsigned int port, >> +static int acpi_cpumap_access_common(struct domain *d, bool >> is_guest_access, >> + bool is_write, unsigned int port, >>unsigned int bytes, uint32_t >> *val) >> { >> unsigned int first_byte = port - XEN_ACPI_CPU_MAP; >> +int rc = X86EMUL_OKAY; >> >> BUILD_BUG_ON(XEN_ACPI_CPU_MAP + XEN_ACPI_CPU_MAP_LEN >>> ACPI_GPE0_BLK_ADDRESS_V1); >> >> +spin_lock(>arch.hvm_domain.acpi_lock); >> + >> if ( !is_write ) >> { >> uint32_t mask = (bytes < 4) ? ~0U << (bytes * 8) : 0; >> @@ -32,23 +37,61 @@ static int acpi_cpumap_access_common(struct >> domain *d, bool is_write, >> memcpy(val, (uint8_t *)d->avail_vcpus + first_byte, >> min(bytes, ((d->max_vcpus + 7) / 8) - first_byte)); >> } >> +else if ( !is_guest_access ) >> +memcpy((uint8_t *)d->avail_vcpus + first_byte, val, >> + min(bytes, ((d->max_vcpus + 7) / 8) - first_byte)); >> else >> /* Guests do not write CPU map */ >> -return X86EMUL_UNHANDLEABLE; >> +rc = X86EMUL_UNHANDLEABLE; >> >> -return X86EMUL_OKAY; >> +spin_unlock(>arch.hvm_domain.acpi_lock); >> + >> +return rc; >> } >> >> int hvm_acpi_domctl_access(struct domain *d, >> const struct xen_domctl_acpi_access >> *access) >> { >> -return -ENOSYS; >> +unsigned int bytes, i; >> +uint32_t val = 0; >> +uint8_t *ptr = (uint8_t *) >> +int rc; >> +bool is_write = (access->rw == XEN_DOMCTL_ACPI_WRITE) ? true : >> false; >> + >> +if ( has_acpi_dm_ff(d) ) >> +return -EOPNOTSUPP; >> + >> +if ( access->space_id != XEN_ACPI_SYSTEM_IO ) >> +return -EINVAL; >> + >> +if ( !((access->address >= XEN_ACPI_CPU_MAP) && >> + (access->address < XEN_ACPI_CPU_MAP + >> XEN_ACPI_CPU_MAP_LEN)) ) >> +return -ENODEV; >> + >> +for ( i = 0; i < access->width; i += sizeof(val) ) >> +{ >> +bytes = (access->width - i > sizeof(val)) ? >> +sizeof(val) : access->width - i; >> + >> +if ( is_write && copy_from_guest_offset(ptr, access->val, i, >> bytes) ) >> +return -EFAULT; >> + >> +rc = acpi_cpumap_access_common(d, false, is_write, >> + access->address, bytes, ); > > While I'm looking at this code... > This doesn't work if access->width > sizeof(val) (4 bytes). The same > value (access->address) is always passed into > acpi_cpumap_access_common for 'port' and this is used as an offset > into the avail_cpus array. So the offset is unchanged and only the > first 4 bytes of avail_cpus ever gets changed. I'd have to go back to the series (haven't looked at it since it was posted back in January) but I think I enforce somewhere size of the access to fit into 4 bytes. And if not then you are right. -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 1/8] mm: Place unscrubbed pages at the end of pagelist
>>> Boris Ostrovsky07/23/17 4:01 AM >>> >On 06/27/2017 01:06 PM, Jan Beulich wrote: > Boris Ostrovsky 06/22/17 8:55 PM >>> >>> +{ >>> +if ( pg < first_dirty_pg ) >>> +first_dirty = (first_dirty_pg - pg) / sizeof(*pg); >> >> Pointer subtraction already includes the involved division. > > >Yes, this was a mistake. > >> Otoh I wonder >> if you couldn't get away without pointer comparison/subtraction here >> altogether. > > >Without comparison I can only assume that first_dirty is zero (i.e. the >whole buddy is potentially dirty). Is there something else I could do? I was thinking of tracking indexes instead of pointers. But maybe that would more hamper readability of the overall result than help it. >>> @@ -892,8 +934,25 @@ static int reserve_offlined_page(struct page_info >>> *head) >>> { >>> merge: >>> /* We don't consider merging outside the head_order. */ >>> -page_list_add_tail(cur_head, (node, zone, cur_order)); >>> -PFN_ORDER(cur_head) = cur_order; >>> + >>> +/* See if any of the pages indeed need scrubbing. */ >>> +if ( first_dirty_pg && (cur_head + (1 << cur_order) > >>> first_dirty_pg) ) >>> +{ >>> +if ( cur_head < first_dirty_pg ) >>> +i = (first_dirty_pg - cur_head) / >>> sizeof(*cur_head); > >I assume the same comment as above applies here. Of course. I usually avoid repeating the same comment, except maybe when reviewing patches of first time contributors. >>> +else >>> +i = 0; >>> + >>> +for ( ; i < (1 << cur_order); i++ ) >>> +if ( test_bit(_PGC_need_scrub, >>> + _head[i].count_info) ) >>> +{ >>> +first_dirty = i; >>> +break; >>> +} >> >> Perhaps worth having ASSERT(first_dirty != INVALID_DIRTY_IDX) here? Or are >> there cases where ->u.free.first_dirty of a page may be wrong? > > >When we merge in free_heap_pages we don't clear first_dirty of the >successor buddy (at some point I did have this done but you questioned >whether it was needed and I dropped it). Hmm, this indeed answers my question, but doesn't help (me) understanding whether the suggested ASSERT() could be wrong. >>> --- a/xen/include/asm-x86/mm.h >>> +++ b/xen/include/asm-x86/mm.h >>> @@ -88,7 +88,15 @@ struct page_info >>> /* Page is on a free list: ((count_info & PGC_count_mask) == 0). >>> */ >>> struct { >>> /* Do TLBs need flushing for safety before next page use? */ >>> -bool_t need_tlbflush; >>> +unsigned long need_tlbflush:1; >>> + >>> +/* >>> + * Index of the first *possibly* unscrubbed page in the buddy. >>> + * One more than maximum possible order (MAX_ORDER+1) to >> >> Why +1 here and hence ... > >Don't we have MAX_ORDER+1 orders? So here there might be a simple misunderstanding: I understand the parenthesized MAX_ORDER+1 to represent "maximum possible order", i.e. excluding the "one more than", not the least because of the ... >> + * accommodate INVALID_DIRTY_IDX. >> + */ >> +#define INVALID_DIRTY_IDX (-1UL & (((1UL< > +unsigned long first_dirty:MAX_ORDER + 2; +2 here. >> ... why +2 instead of +1? And isn't the expression INVALID_DIRTY_IDX wrongly >> parenthesized (apart from lacking blanks around the shift operator)? I'd >> expect you want a value with MAX_ORDER+1 set bits, i.e. >> (1UL << (MAX_ORDER + 1)) - 1. ANDing with -1UL seems quite pointless too. > >Yes to parentheses and AND. Should be (1UL << (MAX_ORDER + 2)) - 1 I.e. I would still expect it to be (1UL << (MAX_ORDER + 1)) - 1 here. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 00/14] arm/mem_access: Walk guest page tables in SW if mem_access is active
On 18/07/17 13:24, Sergej Proskurin wrote: > Hi all, Hi, > > The function p2m_mem_access_check_and_get_page is called from the function > get_page_from_gva if mem_access is active and the hardware-aided translation > of > the given guest virtual address (gva) into machine address fails. That is, if > the stage-2 translation tables constrain access to the guests's page tables, > hardware-assisted translation will fail. The idea of the function > p2m_mem_access_check_and_get_page is thus to translate the given gva and check > the requested access rights in software. However, as the current > implementation > of p2m_mem_access_check_and_get_page makes use of the hardware-aided gva to > ipa > translation, the translation might also fail because of reasons stated above > and will become equally relevant for the altp2m implementation on ARM. As > such, we provide a software guest translation table walk to address the above > mentioned issue. > > The current version of the implementation supports translation of both the > short-descriptor as well as the long-descriptor translation table format on > ARMv7 and ARMv8 (AArch32/AArch64). > > This revised version incorporates the comments of the previous patch series. > In > this patch version we refine the definition of PAGE_SIZE_GRAN and > PAGE_MASK_GRAN. In particular, we use PAGE_SIZE_GRAN to define PAGE_MASK_GRAN > and thus avoid these defines to have a differing type. We also changed the > previously introduced macro BITS_PER_LONG_LONG to BITS_PER_LLONG. Further > changes comprise minor adjustments in comments and renaming of macros and > function parameters. Some additional changes comprising code readability and > correct type usage have been made and stated in the individual commits. > > The following patch series can be found on Github[0]. I tried this series today with the change [1] in Xen to check the translation is valid. However, I got a failure when booting non-LPAE arm32 Dom0: (XEN) Loading kernel from boot module @ 80008000 (XEN) Allocating 1:1 mappings totalling 512MB for dom0: (XEN) BANK[0] 0x00a000-0x00c000 (512MB) (XEN) Grant table range: 0x00ffe0-0x00ffe6a000 (XEN) Loading zImage from 80008000 to a780-a7f50e28 (XEN) Allocating PPI 16 for event channel interrupt (XEN) Loading dom0 DTB to 0xa800-0xa8001f8e (XEN) Std. Loglevel: All (XEN) Guest Loglevel: All (XEN) guest_walk_tables: gva 0xffeff018 pipa 0x1c090018 (XEN) access_guest_memory_by_ipa: gpa 0xa0207ff8 (XEN) access_guest_memory_by_ipa: gpa 0xa13aebfc (XEN) d0: guestcopy: failed to get table entry. (XEN) Xen BUG at traps.c:2737 (XEN) [ Xen-4.10-unstable arm32 debug=y Not tainted ] (XEN) CPU:0 (XEN) PC: 00264dc0 do_trap_guest_sync+0x161c/0x1804 (XEN) CPSR: a05a MODE:Hypervisor (XEN) R0: ffea R1: R2: R3: 004a (XEN) R4: 93830007 R5: 47fcff58 R6: 93830007 R7: 0007 (XEN) R8: 1c09 R9: R10: R11:47fcff54 R12:ffea (XEN) HYP: SP: 47fcfee4 LR: 00258dec (XEN) (XEN) VTCR_EL2: 80003558 (XEN) VTTBR_EL2: 00010008f3ffc000 (XEN) (XEN) SCTLR_EL2: 30cd187f (XEN)HCR_EL2: 0038663f (XEN) TTBR0_EL2: fff02000 (XEN) (XEN)ESR_EL2: (XEN) HPFAR_EL2: 001c0900 (XEN) HDFAR: ffeff018 (XEN) HIFAR: (XEN) (XEN) Xen stack trace from sp=47fcfee4: (XEN) 47fcff34 00256008 47fcfefc 47fcfefc 20da 0004 47fd48f4 (XEN)002d5ef0 0004 002d1f00 0004 002d1f00 c163f740 93830007 (XEN)ffeff018 1c090018 47fcff44 c15e70ac 005b c15e70ac c074400c (XEN)0031 c0743ff8 47fcff58 00268ce0 c15e70ac 005b 0031 (XEN)ffeff000 c15e70ac 005b c15e70ac c074400c 0031 c0743ff8 (XEN) 001f c074401c 21d3 93830007 (XEN)c161cac0 c161cac0 c1501de0 c0735640 c161cacc c161cacc c161cad8 c161cad8 (XEN) c161cae4 c161cae4 41d3 (XEN) dfdfdfcf cfdfdfdf (XEN) Xen call trace: (XEN)[<00264dc0>] do_trap_guest_sync+0x161c/0x1804 (PC) (XEN)[<00258dec>] access_guest_memory_by_ipa+0x25c/0x284 (LR) (XEN)[<00268ce0>] entry.o#return_from_trap+0/0x4 (XEN) (XEN) (XEN) (XEN) Panic on CPU 0: (XEN) Xen BUG at traps.c:2737 (XEN) (XEN) (XEN) Reboot in five seconds... The IPA 0xa13aebfc is not valid for the domain. Cheers, [1] diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c index 4ee07fcea3..89c5ebf3cf 100644 --- a/xen/arch/arm/guestcopy.c +++ b/xen/arch/arm/guestcopy.c @@ -139,6 +139,8 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t gpa, void *buf, return -EINVAL; } +printk("%s: gpa 0x%llx\n", __FUNCTION__, gpa); +
Re: [Xen-devel] [PATCH v2 1/2] libxl: Implement the handler to handle unrecoverable AER errors.
On Fri, Jul 28, 2017 at 10:15:40AM -0700, Venu Busireddy wrote: > On 2017-07-28 16:58:13 +0100, Wei Liu wrote: > > On Wed, Jul 26, 2017 at 07:16:38PM -0500, Venu Busireddy wrote: > > > Implement the callback function to handle unrecoverable AER errors, and > > > also the public APIs that can be used to register/unregister the handler. > > > When an AER error occurs, the handler will forcibly remove the erring > > > PCIe device from the guest. > > > > > > Signed-off-by: Venu Busireddy> > > --- > > > tools/libxl/libxl_event.h | 2 ++ > > > tools/libxl/libxl_pci.c | 85 > > > +++ > > > 2 files changed, 87 insertions(+) > > > > > > > Please also add a LIBXL_HAVE macro to libxl.h. There are plenty of > > examples there. > > I assume you meant, for example, something like: > > /* LIBXL_HAVE_UNREG_AER_EVENTS_HANDLER > * > * If it is defined, libxl has a library function called > * libxl_unreg_aer_events_handler. > */ > #define LIBXL_HAVE_UNREG_AER_EVENTS_HANDLER 1 > > If so, I will add them in the next revision. > > > > + > > > +int libxl_reg_aer_events_handler(libxl_ctx *ctx, uint32_t domid) > > > +{ > > > +int rc = 0; > > > +char *be_path; > > > +GC_INIT(ctx); > > > + > > > +aer_watch.domid = domid; > > > +be_path = > > > GCSPRINTF("/local/domain/0/backend/pci/%u/0/aerFailedSBDF", domid); > > > > I think the best thing to do is you get the domid using > > libxl__get_domid. Try not to hard-code 0. > > > > Same for your callback function. And there are quite a few 0's that I'm > > not sure what they stand for. > > All those 0's are saying dom0's domid. Isn't dom0's domid always 0? > If that is not the case, I can use libxl__get_domid(). Please let me > know. Don't assume the code will always run in dom0, so use get_domid is best. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 06/13] libxl: change p9 to use generec add function
On Sun, Jul 30, 2017 at 09:42:09PM +0300, Oleksandr Grytsov wrote: > On Fri, Jul 28, 2017 at 7:23 PM, Wei Liuwrote: > > On Fri, Jul 28, 2017 at 03:11:34PM +0100, Wei Liu wrote: > >> On Tue, Jul 18, 2017 at 05:25:23PM +0300, Oleksandr Grytsov wrote: > >> [...] > >> > /* Waits for the passed device to reach state XenbusStateInitWait. > >> > * This is not really useful by itself, but is important when executing > >> > * hotplug scripts, since we need to be sure the device is in the > >> > correct > >> > @@ -3565,6 +3559,7 @@ extern const struct libxl_device_type > >> > libxl__usbctrl_devtype; > >> > extern const struct libxl_device_type libxl__usbdev_devtype; > >> > extern const struct libxl_device_type libxl__pcidev_devtype; > >> > extern const struct libxl_device_type libxl__vdispl_devtype; > >> > +extern const struct libxl_device_type libxl__p9_devtype; > >> > > >> > extern const struct libxl_device_type *device_type_tbl[]; > >> > > >> > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl > >> > index 25563cf..96dbaed 100644 > >> > --- a/tools/libxl/libxl_types.idl > >> > +++ b/tools/libxl/libxl_types.idl > >> > @@ -804,7 +804,7 @@ libxl_domain_config = Struct("domain_config", [ > >> > ("vfbs", Array(libxl_device_vfb, "num_vfbs")), > >> > ("vkbs", Array(libxl_device_vkb, "num_vkbs")), > >> > ("vtpms", Array(libxl_device_vtpm, "num_vtpms")), > >> > -("p9", Array(libxl_device_p9, "num_p9s")), > >> > +("p9s", Array(libxl_device_p9, "num_p9s")), > >> > >> Oh, no, please don't do this. We can't change the name of the fields. > >> > >> There is already on irregular device type -- the PCI device. I suppose > >> you probably need another hook somewhere. And please convert PCI devices > >> if you can. > > > > OK, going through the code I think we need to come to a conclusion if we > > want an extra callback to handle the irregular device names first > > because that's likely to affect the code of the framework in previous > > patch. > > Actually creating new callback to handle irregular device name looks > not so good. > There is the pattern which all namings should follow. May be it has to > be documented The normal pattern is DEVTYPEs. > somewhere. p9 was added recently we can ask the author to review this rename. Once it is released we can't change it, of course unless we deem it unstable. I'm two minded here. P9 was released in 4.9, which was only a few months old. But we definitely can't change the PCI type. It has been around since forever. And there is provision in code to deal with that. > From other side this rename touches only internals changes: no changes > in config file > or CLI interface. > As said, the framework need to be ready to deal with PCI anyway. What sort of issues do you foresee here? ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2 10/12] tools: implemet new get value interface suitable for all psr allocation features.
On Thu, Jul 20, 2017 at 04:49:11PM +0800, Yi Sun wrote: > This patch implements a new get value interface in tools suitable for all psr > allocation features and the whole flow. It also enables MBA support in tools > to get MBA value. This suggests this patch can be at least broken into two? > > Signed-off-by: Yi Sun> --- > tools/libxc/include/xenctrl.h | 13 +- > tools/libxc/xc_psr.c | 11 +- > tools/libxl/libxl_psr.c | 61 ++ > tools/xl/xl.h | 3 + > tools/xl/xl_cmdtable.c| 9 +- > tools/xl/xl_psr.c | 275 > ++ > 6 files changed, 236 insertions(+), 136 deletions(-) > > diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h > index 0b0ec31..def18f5 100644 > --- a/tools/libxc/include/xenctrl.h > +++ b/tools/libxc/include/xenctrl.h > @@ -2452,13 +2452,14 @@ enum xc_psr_cmt_type { > }; > typedef enum xc_psr_cmt_type xc_psr_cmt_type; > > -enum xc_psr_cat_type { > +enum xc_psr_val_type { > XC_PSR_CAT_L3_CBM = 1, > XC_PSR_CAT_L3_CBM_CODE = 2, > XC_PSR_CAT_L3_CBM_DATA = 3, > XC_PSR_CAT_L2_CBM = 4, > +XC_PSR_MBA_THRTL = 5, > }; > -typedef enum xc_psr_cat_type xc_psr_cat_type; > +typedef enum xc_psr_val_type xc_psr_val_type; Changing the name of the type should be done in a separate patch. The rest of this patch mixes renaming and functional change which is rather difficult to review I'm afraid. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2 03/12] x86: rename 'cbm_type' to 'psr_val_type' to make it general.
On Thu, Jul 20, 2017 at 04:49:04PM +0800, Yi Sun wrote: > This patch renames 'cbm_type' to 'psr_val_type' to make it be general. > Then, we can reuse this for all psr allocation features. > > Signed-off-by: Yi SunThe code LGTM. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2 08/12] tools: create general interfaces to support psr allocation features.
On Thu, Jul 20, 2017 at 04:49:09PM +0800, Yi Sun wrote: [...] > + > +#ifdef LIBXL_HAVE_PSR_MBA > +/* > + * Function to set a domain's value. It operates on a single or multiple > + * target(s) defined in 'target_map'. 'target_map' specifies all the sockets > + * to be operated on. > + */ > +int libxl_psr_set_val(libxl_ctx *ctx, uint32_t domid, > + libxl_psr_cbm_type type, libxl_bitmap *target_map, > + uint64_t val); > +/* > + * Function to get a domain's cbm. It operates on a single 'target'. > + * 'target' specifies which socket to be operated on. > + */ > +int libxl_psr_get_val(libxl_ctx *ctx, uint32_t domid, > + libxl_psr_cbm_type type, uint32_t target, There is no need for target to be uint32_t right? Unsigned int should work too? > + uint64_t *val); > +/* > + * On success, the function returns an array of elements in 'info', > + * and the length in 'nr'. > + */ > +int libxl_psr_get_hw_info(libxl_ctx *ctx, libxl_psr_hw_info **info, > + int *nr, libxl_psr_feat_type type, int lvl); > +void libxl_psr_hw_info_list_free(libxl_psr_hw_info *list, int nr); nr should be unsigned int. > +#endif /* LIBXL_HAVE_PSR_MBA */ > +#endif /* LIBXL_HAVE_PSR_CAT */ > > /* misc */ > > diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c > index f55ba1e..8319301 100644 > --- a/tools/libxl/libxl_psr.c > +++ b/tools/libxl/libxl_psr.c > @@ -425,6 +425,30 @@ void libxl_psr_cat_info_list_free(libxl_psr_cat_info > *list, int nr) > free(list); > } > > +int libxl_psr_set_val(libxl_ctx *ctx, uint32_t domid, > + libxl_psr_cbm_type type, libxl_bitmap *target_map, > + uint64_t val) > +{ > +return EXIT_FAILURE; ERROR_FAIL here. > + > +libxl_psr_hw_info = Struct("psr_hw_info", [ > +("id", uint32), > +("u", KeyedUnion(None, libxl_psr_feat_type, "type", > + [("cat_info", Struct(None, [ > + ("cos_max", uint32), > + ("cbm_len", uint32), > + ("cdp_enabled", bool), > + ])), > + ("mba_info", Struct(None, [ > + ("cos_max", uint32), > + ("thrtl_max", uint32), > + ("linear", bool), > + ])), > + ])) If this is output only please mark it as dir=DIR_OUT. > +]) > -- > 1.9.1 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2 09/12] tools: implement the new get hw info interface suitable to all psr allocation features.
On Thu, Jul 20, 2017 at 04:49:10PM +0800, Yi Sun wrote: > This patch implements a new get hw info interface suitable for all psr > allocation > features and the whole flow. It also enables MBA support in tools to get MBA > HW info. > > Signed-off-by: Yi Sun> --- > tools/libxc/include/xenctrl.h | 30 +++- > tools/libxc/xc_psr.c | 46 + > tools/libxl/libxl_psr.c | 155 > +++--- > tools/xl/xl_cmdtable.c| 3 + > tools/xl/xl_psr.c | 45 +++- > 5 files changed, 238 insertions(+), 41 deletions(-) > > diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h > index 2248900..0b0ec31 100644 > --- a/tools/libxc/include/xenctrl.h > +++ b/tools/libxc/include/xenctrl.h > @@ -2460,6 +2460,31 @@ enum xc_psr_cat_type { > }; > typedef enum xc_psr_cat_type xc_psr_cat_type; > > +enum xc_psr_feat_type { > +XC_PSR_FEAT_UNKNOWN= 0, > +XC_PSR_FEAT_CAT_L3 = 1, > +XC_PSR_FEAT_CAT_L2 = 2, > +XC_PSR_FEAT_MBA= 3, Pointless initialisers. > +}; > +typedef enum xc_psr_feat_type xc_psr_feat_type; > + > +struct xc_psr_hw_info { > +union { > +struct { > +uint32_t cos_max; > +uint32_t cbm_len; > +bool cdp_enabled; > +} xc_cat_info; > + > +struct { > +uint32_t cos_max; > +uint32_t thrtl_max; > +bool linear; > +} xc_mba_info; > +} u; > +}; > +typedef struct xc_psr_hw_info xc_psr_hw_info; [...] > index 8319301..43b84b6 100644 > --- a/tools/libxl/libxl_psr.c > +++ b/tools/libxl/libxl_psr.c > @@ -361,47 +361,49 @@ int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t > domid, > return rc; > } > > +static inline int libxl_psr_hw_info_to_libxl_psr_cat_info( libxl__ -> two underscores for internal functions. > + libxl_psr_feat_type type, libxl_psr_hw_info *hw_info, > + libxl_psr_cat_info *cat_info) > +{ > +if (type != LIBXL_PSR_FEAT_TYPE_CAT_INFO) > +return -1; ERROR_INVAL; > + > +static inline int libxc__psr_hw_info_to_libxl_psr_hw_info( > + libxl_psr_feat_type type, xc_psr_hw_info *xc_hw_info, > + libxl_psr_hw_info *xl_hw_info) > +{ > +switch (type) { > +case LIBXL_PSR_FEAT_TYPE_CAT_INFO: > +xl_hw_info->u.cat_info.cos_max = xc_hw_info->u.xc_cat_info.cos_max; > +xl_hw_info->u.cat_info.cbm_len = xc_hw_info->u.xc_cat_info.cbm_len; > +xl_hw_info->u.cat_info.cdp_enabled = > + > xc_hw_info->u.xc_cat_info.cdp_enabled; > +break; > +case LIBXL_PSR_FEAT_TYPE_MBA_INFO: > +xl_hw_info->u.mba_info.cos_max = xc_hw_info->u.xc_mba_info.cos_max; > +xl_hw_info->u.mba_info.thrtl_max = > xc_hw_info->u.xc_mba_info.thrtl_max; > +xl_hw_info->u.mba_info.linear = xc_hw_info->u.xc_mba_info.linear; > +break; > +default: > +return -1; ERROR_INVAL > +} > + > +return 0; > +} > + > int libxl_psr_get_hw_info(libxl_ctx *ctx, libxl_psr_hw_info **info, >int *nr, libxl_psr_feat_type type, int lvl) > { > -return EXIT_FAILURE; > +GC_INIT(ctx); > +int rc; > +int i = 0, socketid, nr_sockets; > +libxl_bitmap socketmap; > +libxl_psr_hw_info *ptr; > +xc_psr_feat_type xc_type; > +xc_psr_hw_info hw_info; > + > +libxl_bitmap_init(); > + > +if ( type == LIBXL_PSR_FEAT_TYPE_CAT_INFO && lvl != 3 && lvl != 2) { Extraneous space. > > void libxl_psr_hw_info_list_free(libxl_psr_hw_info *list, int nr) > { > +int i; unsigned int > + > +for (i = 0; i < nr; i++) > +libxl_psr_hw_info_dispose([i]); > +free(list); > } > > /* > diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c > index 2c71a9f..14a02d4 100644 > --- a/tools/xl/xl_cmdtable.c > +++ b/tools/xl/xl_cmdtable.c > @@ -524,6 +524,9 @@ struct cmd_spec cmd_table[] = { >"[options]", >"-m, --cmt Show Cache Monitoring Technology (CMT) hardware > info\n" >"-a, --cat Show Cache Allocation Technology (CAT) hardware > info\n" > +#ifdef LIBXL_HAVE_PSR_MBA > + "-b, --mba Show Memory Bandwidth Allocation (MBA) hardware > info\n" > +#endif You don't need to test the macro. xl always has all features available. Same comment applies to the rest of this patch. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2 04/12] x86: implement data structure and CPU init flow for MBA.
On Thu, Jul 20, 2017 at 04:49:05PM +0800, Yi Sun wrote: > This patch implements main data structures of MBA. > > Like CAT features, MBA HW info has cos_max which means the max cos > registers number, and thrtl_max which means the max throttle value > (delay value). It also has a flag to represent if the throttle > value is linear or not. > > One COS register of MBA stores a throttle value for one or more > domains. The throttle value means the transaction time between L2 > cache and next level memory to be delayed. > > This patch also implements init flow for MBA and register stub > callback functions. > > Signed-off-by: Yi Sun> --- > xen/arch/x86/psr.c | 130 > > xen/include/asm-x86/msr-index.h | 1 + > xen/include/asm-x86/psr.h | 2 + > 3 files changed, 109 insertions(+), 24 deletions(-) > > diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c > index d1d854f..d1ea5a4 100644 > --- a/xen/arch/x86/psr.c > +++ b/xen/arch/x86/psr.c > @@ -27,13 +27,16 @@ > * - CMT Cache Monitoring Technology > * - COS/CLOSClass of Service. Also mean COS registers. > * - COS_MAX Max number of COS for the feature (minus 1) > + * - MBA Memory Bandwidth Allocation > * - MSRsMachine Specific Registers > * - PSR Intel Platform Shared Resource > + * - THRTL_MAX Max throttle value (delay value) of MBA > */ > > #define PSR_CMT(1<<0) > #define PSR_CAT(1<<1) > #define PSR_CDP(1<<2) > +#define PSR_MBA(1<<3) These should really be (1u << X) -- please use unsigned value and add spaces around "<<". Can you please submit a patch to fix them first? ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2 02/12] Rename PSR sysctl/domctl interfaces and xsm policy to make them be general.
Normally there is no need to have period at the end of the subject line. On Thu, Jul 20, 2017 at 04:49:03PM +0800, Yi Sun wrote: > This patch renames PSR sysctl/domctl interfaces and related xsm policy to > make them be general for all resource allocation features but not only > for CAT. Then, we can resuse the interfaces for all allocation features. > It would be useful to list what is changed to what. Afaict "cat" is changed to "alloc". This patch mostly looks fine cod-wise. The only thing I want to point out is that you should bump the version number for both domctl and sysctl. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] PVH VCPU hotplug support v7?
On 07/31/2017 10:12 AM, Andrew Cooper wrote: > On 31/07/17 14:55, Boris Ostrovsky wrote: >> On 07/31/2017 09:20 AM, Ross Lagerwall wrote: >>> Hi Boris, >>> >>> I've modified your PVH VCPU hotplug support v6 patch series [1] to >>> support HVM guests running _with_ a device model for XenServer's >>> purposes. This is useful because it moves the vCPU hotplug handling >>> out of QEMU and allows it to mostly be shared with PVH. It will also >>> allow unplugging vCPUs (libxl currently only does cpu-add for upstream >>> qemu). >>> >>> Are you still planning on continuing with that patch series since your >>> commit to Linux [2]? >> This series has been put on hold until we figure out what to do with >> hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI >> CPU namespace --- on hotplug event dom0 has to somehow figure out >> whether the event was due to (dis)appearance of a physical or virtual CPU). >> >> I don't think this has been dealt with yet (copying Roger). > From the point of view of unblocking several pieces of work, it would be > fine for this logic to be behind an emulation flag, just like LAPIC/etc. The (I think) last message discussing this series was https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00143.html Are you suggesting extracting pieces that would move hotplug support for HVM guests from qemu to hypervisor/toolstack but leave all PVH-specific code out? (The feature flag is part of this series --- https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg00059.html) -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v14 16/23] x86: L2 CAT: implement CPU init flow.
>>> Yi Sun07/15/17 2:49 AM >>> >@@ -273,6 +275,12 @@ static int cat_init_feature(const struct cpuid_leaf *regs, >struct psr_socket_info *info, >enum psr_feat_type type) >{ >+const char * const cat_feat_name[FEAT_TYPE_NUM] = { Strictly speaking the blank after the star is wrong here. >+"L3 CAT", >+"CDP", >+"L2 CAT", >+}; Please use designated initializers here. With at least this second aspect taken care of Reviewed-by: Jan Beulich Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v14 13/23] x86: refactor psr: CDP: implement CPU init flow.
>>> Yi Sun07/15/17 2:48 AM >>> >@@ -272,7 +312,8 @@ static int cat_init_feature(const struct cpuid_leaf *regs, >if ( !opt_cpu_info ) >return 0; > >-printk(XENLOG_INFO "CAT: enabled on socket %u, cos_max:%u, cbm_len:%u\n", >+printk(XENLOG_INFO "%s: enabled on socket %u, cos_max:%u, cbm_len:%u\n", >+ ((type == FEAT_TYPE_L3_CDP) ? "CDP" : "L3 CAT"), Why is this not "L3 CDP" when the enumerator includes L3? >@@ -1283,10 +1344,22 @@ static void psr_cpu_init(void) >feat = feat_l3; >feat_l3 = NULL; > >-if ( !cat_init_feature(®s, feat, info, FEAT_TYPE_L3_CAT) ) >-feat_props[FEAT_TYPE_L3_CAT] = _cat_props; >+if ( (regs.c & PSR_CAT_CDP_CAPABILITY) && (opt_psr & PSR_CDP) ) >+{ >+if ( !cat_init_feature(®s, feat, info, FEAT_TYPE_L3_CDP) ) >+feat_props[FEAT_TYPE_L3_CDP] = _cdp_props; >+else >+/* If CDP init fails, try to work as L3 CAT. */ >+goto l3_cat_init; >+} >else >-feat_l3 = feat; >+{ >+ l3_cat_init: I'd really like to ask to re-structure this slightly so that you won't need goto and a label here. As said before, goto-s are sort of okay for making complicated error paths readable, but they should be avoided in almost all other cases. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [qemu-mainline test] 112389: regressions - trouble: blocked/broken/fail/pass
flight 112389 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/112389/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-arm64-pvops 2 hosts-allocate broken REGR. vs. 111765 build-arm64-xsm 2 hosts-allocate broken REGR. vs. 111765 build-arm64 2 hosts-allocate broken REGR. vs. 111765 test-amd64-i386-xl-qemuu-ws16-amd64 10 windows-install fail REGR. vs. 111765 test-amd64-i386-xl-qemuu-win7-amd64 10 windows-install fail REGR. vs. 111765 test-amd64-amd64-xl-qemuu-win7-amd64 10 windows-install fail REGR. vs. 111765 Tests which are failing intermittently (not blocking): test-armhf-armhf-xl 10 debian-install fail pass in 112380 test-armhf-armhf-xl-rtds 12 guest-startfail pass in 112380 Tests which did not succeed, but are not blocking: test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-xl 1 build-check(1) blocked n/a build-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-xl-credit2 1 build-check(1) blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a build-arm64 3 capture-logs broken blocked in 111765 build-arm64-pvops 3 capture-logs broken blocked in 111765 build-arm64-xsm 3 capture-logs broken blocked in 111765 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail in 112380 like 111765 test-armhf-armhf-xl 13 migrate-support-check fail in 112380 never pass test-armhf-armhf-xl 14 saverestore-support-check fail in 112380 never pass test-armhf-armhf-xl-rtds13 migrate-support-check fail in 112380 never pass test-armhf-armhf-xl-rtds 14 saverestore-support-check fail in 112380 never pass test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail like 111765 test-armhf-armhf-libvirt 14 saverestore-support-checkfail like 111765 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail like 111765 test-amd64-amd64-xl-rtds 10 debian-install fail like 111765 test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-installfail never pass test-amd64-i386-libvirt 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-xl-credit2 13 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass test-armhf-armhf-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 13 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 13 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 14 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail never pass test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass version targeted for testing: qemuua588c4985eff363154d65aee8607d0a4601655f7 baseline version: qemuu31fe1c414501047cbb91b695bdccc0068496dcf6 Last test of basis 111765 2017-07-13 10:20:16 Z 18 days Failing since111790 2017-07-14 04:20:46 Z 17 days 25 attempts Testing same since 112366 2017-07-28 18:48:13 Z2 days4 attempts People who touched revisions under test: Alex BennéeAlex Williamson Alexander Graf
Re: [Xen-devel] PVH VCPU hotplug support v7?
On 31/07/17 14:55, Boris Ostrovsky wrote: > On 07/31/2017 09:20 AM, Ross Lagerwall wrote: >> Hi Boris, >> >> I've modified your PVH VCPU hotplug support v6 patch series [1] to >> support HVM guests running _with_ a device model for XenServer's >> purposes. This is useful because it moves the vCPU hotplug handling >> out of QEMU and allows it to mostly be shared with PVH. It will also >> allow unplugging vCPUs (libxl currently only does cpu-add for upstream >> qemu). >> >> Are you still planning on continuing with that patch series since your >> commit to Linux [2]? > This series has been put on hold until we figure out what to do with > hotplug for PVH dom0. (The problem was the "dual" view by dom0 of APCI > CPU namespace --- on hotplug event dom0 has to somehow figure out > whether the event was due to (dis)appearance of a physical or virtual CPU). > > I don't think this has been dealt with yet (copying Roger). From the point of view of unblocking several pieces of work, it would be fine for this logic to be behind an emulation flag, just like LAPIC/etc. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v6 05/12] x86/domctl: Handle ACPI access from domctl
On 01/03/2017 02:04 PM, Boris Ostrovsky wrote: Signed-off-by: Boris Ostrovsky--- Changes in v6: * Adjustments to to patch 4 changes. * Added a spinlock for VCPU map access * Return an error on guest trying to write VCPU map snip -static int acpi_cpumap_access_common(struct domain *d, bool is_write, - unsigned int port, +static int acpi_cpumap_access_common(struct domain *d, bool is_guest_access, + bool is_write, unsigned int port, unsigned int bytes, uint32_t *val) { unsigned int first_byte = port - XEN_ACPI_CPU_MAP; +int rc = X86EMUL_OKAY; BUILD_BUG_ON(XEN_ACPI_CPU_MAP + XEN_ACPI_CPU_MAP_LEN > ACPI_GPE0_BLK_ADDRESS_V1); +spin_lock(>arch.hvm_domain.acpi_lock); + if ( !is_write ) { uint32_t mask = (bytes < 4) ? ~0U << (bytes * 8) : 0; @@ -32,23 +37,61 @@ static int acpi_cpumap_access_common(struct domain *d, bool is_write, memcpy(val, (uint8_t *)d->avail_vcpus + first_byte, min(bytes, ((d->max_vcpus + 7) / 8) - first_byte)); } +else if ( !is_guest_access ) +memcpy((uint8_t *)d->avail_vcpus + first_byte, val, + min(bytes, ((d->max_vcpus + 7) / 8) - first_byte)); else /* Guests do not write CPU map */ -return X86EMUL_UNHANDLEABLE; +rc = X86EMUL_UNHANDLEABLE; -return X86EMUL_OKAY; +spin_unlock(>arch.hvm_domain.acpi_lock); + +return rc; } int hvm_acpi_domctl_access(struct domain *d, const struct xen_domctl_acpi_access *access) { -return -ENOSYS; +unsigned int bytes, i; +uint32_t val = 0; +uint8_t *ptr = (uint8_t *) +int rc; +bool is_write = (access->rw == XEN_DOMCTL_ACPI_WRITE) ? true : false; + +if ( has_acpi_dm_ff(d) ) +return -EOPNOTSUPP; + +if ( access->space_id != XEN_ACPI_SYSTEM_IO ) +return -EINVAL; + +if ( !((access->address >= XEN_ACPI_CPU_MAP) && + (access->address < XEN_ACPI_CPU_MAP + XEN_ACPI_CPU_MAP_LEN)) ) +return -ENODEV; + +for ( i = 0; i < access->width; i += sizeof(val) ) +{ +bytes = (access->width - i > sizeof(val)) ? +sizeof(val) : access->width - i; + +if ( is_write && copy_from_guest_offset(ptr, access->val, i, bytes) ) +return -EFAULT; + +rc = acpi_cpumap_access_common(d, false, is_write, + access->address, bytes, ); While I'm looking at this code... This doesn't work if access->width > sizeof(val) (4 bytes). The same value (access->address) is always passed into acpi_cpumap_access_common for 'port' and this is used as an offset into the avail_cpus array. So the offset is unchanged and only the first 4 bytes of avail_cpus ever gets changed. -- Ross Lagerwall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v14 12/23] x86: refactor psr: L3 CAT: set value: implement write msr flow.
>>> Yi Sun07/15/17 2:48 AM >>> >static int write_psr_msrs(unsigned int socket, unsigned int cos, >const uint32_t val[], unsigned int array_len, >enum psr_feat_type feat_type) >{ >-return -ENOENT; >+int ret; >+struct psr_socket_info *info = get_socket_info(socket); >+struct cos_write_info data = >+{ >+.cos = cos, >+.feature = info->features[feat_type], >+.props = feat_props[feat_type], >+}; >+ >+if ( cos > info->features[feat_type]->cos_max ) >+return -EINVAL; >+ >+/* Skip to the feature's value head. */ >+ret = skip_prior_features(_len, feat_type); >+if ( ret < 0 ) >+return ret; >+else >+val += ret; With this (again pointless) else removed, Reviewed-by: Jan Beulich Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v14 09/23] x86: refactor psr: L3 CAT: set value: assemble features value array.
>>> Yi Sun07/15/17 2:47 AM >>> >@@ -619,6 +710,46 @@ static int insert_val_into_array(uint32_t val[], >enum cbm_type type, >uint32_t new_val) >{ >+const struct feat_node *feat; >+const struct feat_props *props; >+unsigned int i; >+int ret; >+ >+ASSERT(feat_type < FEAT_TYPE_NUM); >+ >+ret = skip_prior_features(_len, feat_type); >+if ( ret < 0 ) >+return ret; >+else >+val += ret; Please avoid such pointless "else". Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 5/5] livepatch: Declare live patching as a supported feature
>>> Konrad Rzeszutek Wilk07/26/17 9:48 PM >>> >From: Ross Lagerwall > >See docs/features/livepatch.pandoc for the details. > >Signed-off-by: Ross Lagerwall >Signed-off-by: Konrad Rzeszutek Wilk Reviewed-by: Jan Beulich ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 4/5] alternative/x86/arm32: Align altinstructions (and altinstr_replacement) sections.
>>> Konrad Rzeszutek Wilk07/26/17 9:50 PM >>> >On x86 the bloat-o-meter detects that with this change the file shrinks: >add/remove: 1/0 grow/shrink: 0/2 up/down: 156/-367 (-211) >function old new delta >get_page_from_gfn - 156+156 >do_mmu_update 45784569 -9 >do_mmuext_op56045246-358 >Total: Before=3170439, After=3170228, chg -0.01% This looks unexpected, and hence I'd like to ask for an explanation. If anything I'd expect the image size to grow (slightly). >--- a/xen/include/asm-x86/alternative.h >+++ b/xen/include/asm-x86/alternative.h >@@ -56,10 +56,12 @@ extern void alternative_instructions(void); > >#define ALTERNATIVE_N(newinstr, feature, number) \ >".pushsection .altinstructions,\"a\"\n"\ >+ ".p2align 2\n" \ Can't this then be accompanied by dropping the (over-)alignment done in xen.lds.S? >ALTINSTR_ENTRY(feature, number)\ >".section .discard,\"a\",@progbits\n" \ >DISCARD_ENTRY(number) \ >".section .altinstr_replacement, \"ax\"\n" \ >+ ".p2align 2\n" \ This surely isn't needed on x86. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/events: Fix interrupt lost during irq_disable and irq_enable
On 07/29/2017 12:59 PM, Liu Shuo wrote: > Here is a device has xen-pirq-MSI interrupt. Dom0 might lost interrupt > during driver irq_disable/irq_enable. Here is the scenario, > 1. irq_disable -> disable_dynirq -> mask_evtchn(irq channel) > 2. dev interrupt raised by HW and Xen mark its evtchn as pending > 3. irq_enable -> startup_pirq -> eoi_pirq -> > clear_evtchn(channel of irq) -> clear pending status > 4. consume_one_event process the irq event without pending bit assert > which result in interrupt lost once > 5. No HW interrupt raising anymore. > > Now use enable_dynirq for enable_pirq of xen_pirq_chip to remove > eoi_pirq when irq_enable. > > Signed-off-by: Liu ShuoReviewed-by: Boris Ostrovsky ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel